Using Python list in a C extension

Question

I've come to a point where I need to rewrite a couple of methods of my class into a C-extension to increase performance (intersection_len, union_len). My C background is very limited and I have never written an extension for Python. In the extension I need to use a list that is an attribute of the class (self.table; it is a list of lists and None-objects). The extension should return an integer. So, the question is, how can I pass a list of lists and None-objects to a C-extension and read it there? Thank you in advance.

P.S. I need a custom hash table, because I use rolling hash algorithm, that is not present in any Python module Google is aware of.

class HashTable:
    def __init__(self, table_length, base):
        self.table_length = table_length
        self.base = base
        self.items = 0
        self.table = [None for _ in xrange(self.table_length)]

    def __iter__(self):
        return iter(self.table)

    def __getitem__(self, hash_value):
        return self.table[hash_value]

    def __len__(self):
        return self.items

    def __str__(self):
        return str(self.table)

    def insert(self, item, hash_value):
        if self.table[hash_value] is None:
            self.table[hash_value] = [item]
            self.items += 1
        else:
            if item not in self.table[hash_value]:
                self.table[hash_value].append(item)
                self.items += 1

    def intersection_len(self, table):
        if table.table_length != self.table_length or table.base != self.base:
            raise ValueError('Tables must have equal length and hash function base')
        result = 0
        for i, items in enumerate(table):
            if items is None or self.table[i] is None:
                continue
            for item in items:
                if item in self.table[i]:
                    result += 1
        return result

    def union_len(self, table):
        if table.table_length != self.table_length or table.base != self.base:
            raise ValueError('Tables must have equal length and hash function base')
        result = 0
        for i in xrange(self.table_length):
            if table[i] is None and self.table[i] is None:
                continue
            elif table[i] is None:
                result += len(self.table[i])
            elif self.table[i] is None:
                result += len(table[i])
            else:
                result += len(table[i])
                for item in table[i]:
                    if item not in self.table[i]:
                        result += 1
        return result

    def dump(self):
        for i in xrange(self.table_length):
            if self.table[i] is not None:
                self.table[i] = None`

I haven't looked in detail at the methods you'd like to speed up, but can't numpy help here (with, for example, a masked array to cope with the None values)? — user707650
– user707650, Commented Dec 25, 2014 at 22:59
I assume your actual question here is that you'd like to speed-up those two methods; have you considered Cython? — user707650
– user707650, Commented Dec 25, 2014 at 23:01
@Evert I've never really used Cython. If I could keep this class in Cython and the rest of my code in Python that would be great. Rewriting all the code into Cython will take a hell lot of time. — Eli Korvigo
– Eli Korvigo, Commented Dec 25, 2014 at 23:12
@GrayFall9, cython is incredibly easy to use. I found this series of video from scipy 2013 very helpful youtube.com/watch?v=JKCjsRDffXo — Padraic Cunningham
– Padraic Cunningham, Commented Dec 26, 2014 at 0:51
After playing about with your code for a while it seems like the only improvement I found you could do in python is to change union_len. It seems to be much slower than intersection_len (6 times slower on my machine). However, you could significantly improve the execution time of union_len by simply rewriting it as return len(self) + len(other) - self.intersection_len(other) — Dunes
– Dunes, Commented Dec 26, 2014 at 4:45

Alex Martelli · Accepted Answer · 2014-12-26 00:00:29Z

4

Simplest way to build a C extension for python is via a setup.py file such as:

from distutils.core import setup, Extension

lolan = Extension('lolan', sources = ['lolanmodule.c'])

setup (name = 'Example',
        version = '1.0',
        description = 'Just an exapmle',
        ext_modules = [lolan])

The C coding proper could be something like:

#include <Python.h>

static PyObject*
count_nones(PyObject* self, PyObject* args)
{
    PyObject* lol;

    if (!PyArg_ParseTuple(args, "O", &lol))
        return NULL;

    Py_ssize_t len = PySequence_Length(lol);
    int result = 0;
    for(Py_ssize_t i=0; i<len; ++i) {
        PyObject* item = PySequence_GetItem(lol, i);
        if(item==Py_None) ++result;
        Py_DECREF(item);
    }
    return Py_BuildValue("i", result);
}

static PyMethodDef LolanMethods[] =
{
     {"count_nones", count_nones, METH_VARARGS, "Count Nones."},
     {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
initlolan(void)
{
     (void) Py_InitModule("lolan", LolanMethods);
}

in lolanmodule.c in the same directory. To try it, cd to that directory and run:

$ python setup.py build_ext --inplace
running build_ext
building 'lolan' extension
cc -fno-strict-aliasing -fno-common -dynamic -arch x86_64 -arch i386 -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -arch i386 -pipe -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c lolanmodule.c -o build/temp.macosx-10.9-intel-2.7/lolanmodule.o
cc -bundle -undefined dynamic_lookup -arch x86_64 -arch i386 -Wl,-F. build/temp.macosx-10.9-intel-2.7/lolanmodule.o -o /Users/aleax/lolan.so

(output will differ depending on your platform -- this is in MacOSX 10.9).

Then, try it out interactively:

$ python
Python 2.7.5 (default, Mar  9 2014, 22:15:05) 
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import lolan
>>> l=[None]*3 + [[] for _ in range(4)]
>>> lolan.count_nones(l)
3
>>> lolan.count_nones(l*7)
21
>>>

Of course you need much better unit tests than this, but this starts to show that function count_nones in module lolan does accept a "list of lists and Nones" (as the module name acronymizes:-) and counts how many of the argument's items are None.

Here I've focused on the "abstract level" functions presented at https://docs.python.org/2/c-api/abstract.html ; with some type-constraints, you may getter slightly better performance with "concrete level" functions as per https://docs.python.org/2/c-api/concrete.html , but it doesn't tend to make a huge difference.

answered Dec 26, 2014 at 0:00

Alex Martelli

888k175 gold badges1.3k silver badges1.4k bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

martineau Over a year ago

You seemed to have glossed-over the accessing of any nested lists -- perhaps because you think it's obvious from what you have provided.

Alex Martelli Over a year ago

Good to hear, @GrayFall9 -- so maybe you can accept the answer?

Alex Martelli Over a year ago

@martineau, the code I've posted shows that, given a PyObject* x that's a list, PySequence_Length(o) gives you its length, and PySequence_GetItem(o, i) gives you the PyObject* that is its ith item (which you must Py_DECREF when you're done with it). This applies no mater how you've obtained o in the first place -- so, for example, it applies if it's a list obtained as an item of another list. Given all this, it doesn't seem there's any mystery left about "nested lists", does there?-)

martineau Over a year ago

I meant simply you didn't show what would have to be done if PySequence_GetItem(lol, i) returned a list object rather than Py_None, and one then wanted to iterate though it's members. As I said, it could be considered fairly obvious...

Alex Martelli Over a year ago

Depends on the specific data structures and needs, and on whether C or C++ is actually used. C++'s std::vector is typically excellent compared with Python's list, for example; but Python's dict and sort are best-in-class. For a list-of-lists-and-Nones I have no specific experience, so I would rather try several variants and measure performance on the specific platforms and workloads of interest, rather than guess. "You can't manage what you don't measure"...!-)

|

thebjorn · Accepted Answer · 2014-12-25 23:33:33Z

4

Why do you believe rewriting this algorithm in C will give you the performance you require? (What is the performance you require compared to this version?)

I would probably start by writing a performance test suite, then improving the Python algorithm using Python primitives like zip instead of nested looping (for i in xrange(..)), pulling CSEs out of loops, etc...

Before starting on an extension I would run the improved Python version under pypy to get an idea of what you could expect from creating a C extension. Finally, if an extension was called for I would investigate Cython and Boost before using C (since a Python extension is a rather hard learning curve for C -- and not likely to give you something that runs fast).

answered Dec 25, 2014 at 23:33

thebjorn

27.6k12 gold badges107 silver badges152 bronze badges

2 Comments

Eli Korvigo Over a year ago

I tried izip and it resulted in a minor improvement. The thing is I need to compare half a million of hash tables. That's why I need something dramatically faster. Seems like I should go with Cython. Thanks a lot for your help.

martineau Over a year ago

I tend to agree with thebjorn...I wouldn't give up on a pure-Python solution without looking for a better algorithm using built-in first. Many of them are already written in C. If you can change algorithms, another possibility might be multiprocessing depending on your hardware and exactly what you're doing.

casevh · Accepted Answer · 2014-12-26 00:00:38Z

2

As the size of a list increases, using x in some_list gets slower. The in operation has to sequentially read each item in the list until a match is found. I would try changing your data structure to use a list of sets.

Untested, but I think the only code that needs to change is insert().

def insert(self, item, hash_value):
    if self.table[hash_value] is None:
        self.table[hash_value] = set()
    self.table[hash_value].add(item)
    self.items += 1

answered Dec 26, 2014 at 0:00

casevh

11.5k1 gold badge30 silver badges37 bronze badges

4 Comments

Dunes Over a year ago

Sets should also make the intersection and union methods much quicker too. The for loop to establish the number of items in those methods is currently O(n^2), becomes a much quicker set operation eg. result += len(table(i) & self.table(i)).

Eli Korvigo Over a year ago

This made things slightly worse (because of the overhead of additional hash calculation). Collisions, though present, are quite rare, so comparing two sublists takes mostly O(1). Nevertheless, I agree that your method would be of greater use if my hash function made more collisions.

Eli Korvigo Over a year ago

@Dunes this leads to a significant overhead of building and removing a hell lot of new sets in memory.

Dunes Over a year ago

Building those sets is done in C, and should be very efficient. Sounds like you're expecting a sparse data set, which does make things quite different.

Collectives™ on Stack Overflow

Using Python list in a C extension

3 Answers 3

8 Comments

2 Comments

4 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

8 Comments

2 Comments

4 Comments

Your Answer

Sign up or log in

Post as a guest

Related