I've come to a point where I need to rewrite a couple of methods of my class into a C-extension to increase performance (intersection_len, union_len). My C background is very limited and I have never written an extension for Python. In the extension I need to use a list that is an attribute of the class (self.table; it is a list of lists and None-objects). The extension should return an integer. So, the question is, how can I pass a list of lists and None-objects to a C-extension and read it there? Thank you in advance.
P.S. I need a custom hash table, because I use rolling hash algorithm, that is not present in any Python module Google is aware of.
class HashTable:
def __init__(self, table_length, base):
self.table_length = table_length
self.base = base
self.items = 0
self.table = [None for _ in xrange(self.table_length)]
def __iter__(self):
return iter(self.table)
def __getitem__(self, hash_value):
return self.table[hash_value]
def __len__(self):
return self.items
def __str__(self):
return str(self.table)
def insert(self, item, hash_value):
if self.table[hash_value] is None:
self.table[hash_value] = [item]
self.items += 1
else:
if item not in self.table[hash_value]:
self.table[hash_value].append(item)
self.items += 1
def intersection_len(self, table):
if table.table_length != self.table_length or table.base != self.base:
raise ValueError('Tables must have equal length and hash function base')
result = 0
for i, items in enumerate(table):
if items is None or self.table[i] is None:
continue
for item in items:
if item in self.table[i]:
result += 1
return result
def union_len(self, table):
if table.table_length != self.table_length or table.base != self.base:
raise ValueError('Tables must have equal length and hash function base')
result = 0
for i in xrange(self.table_length):
if table[i] is None and self.table[i] is None:
continue
elif table[i] is None:
result += len(self.table[i])
elif self.table[i] is None:
result += len(table[i])
else:
result += len(table[i])
for item in table[i]:
if item not in self.table[i]:
result += 1
return result
def dump(self):
for i in xrange(self.table_length):
if self.table[i] is not None:
self.table[i] = None`
union_len. It seems to be much slower thanintersection_len(6 times slower on my machine). However, you could significantly improve the execution time ofunion_lenby simply rewriting it asreturn len(self) + len(other) - self.intersection_len(other)