Python ctypes - Setting c_char array when string has embedded null?

Question

I'm using ctypes bit fields to dissect tightly packed binary data. I stuff a record's worth of data into a union as a string, then pull out key fields as integers.

This works great when there are no nulls in the buffer, but any embedded null causes ctypes to truncate the string.

Example:

from ctypes import *

class H(BigEndianStructure):
    _fields_ = [ ('f1', c_int, 8),
                 ('f2', c_int, 8),
                 ('f3', c_int, 8),
                 ('f4', c_int, 2)
                 # ...
                 ]

class U(Union):
    _fields_ = [ ('fld', H),
                 ('buf', c_char * 6)
                 ]

# With no nulls, works as expected...
u1 = U()
u1.buf='abcabc'
print '{} {} {} (expect: 97 98 99)'.format(u1.fld.f1, u1.fld.f2, u1.fld.f3)

# Embedded null breaks it...  This prints '97 0 0', NOT '97 0 99'
u2 = U()
u2.buf='a\x00cabc'
print '{} {} {} (expect: 97 0 99)'.format(u2.fld.f1, u2.fld.f2, u2.fld.f3)

Browsing the ctypes source, I see two methods to set a char array, CharArray_set_value() and CharArray_set_raw(). It appears that CharArray_set_raw() will handle nulls properly whereas CharArray_set_value() will not.

But I can't figure out how to invoke the raw version... It looks like a property, so I'd expect something like:

u1.buf.raw = 'abcabc'

but that yields:

AttributeError: 'str' object has no attribute 'raw'

Any guidance appreciated. (Including a completely different approach!)

(Note: I need to process thousands of records per second, so efficiency is critical. Using an array comprehension to stuff a byte array in the structure works, but it's 100x slower.)

pixelbrei · Accepted Answer · 2015-12-14 14:57:25Z

1

You can also create the raw-string array outside of your struct/union:

mystring = (c_char * 6).from_buffer(u2)
print mystring.raw

This way you don't have any overhead for conversion. I wonder why a (c_char * 6) behaves differently when used alone vs. used in a Structure/Union...
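The snippet above only reads through the view, but the same view should also work for writing, which is what the question asks for. A minimal sketch, assuming the H and U definitions from the question (repeated here so it runs on its own) and a bytes literal so it behaves the same on Python 2.7 and 3:

```python
from __future__ import print_function  # keeps print() working on Python 2 too
from ctypes import BigEndianStructure, Union, c_char, c_int

class H(BigEndianStructure):
    _fields_ = [('f1', c_int, 8),
                ('f2', c_int, 8),
                ('f3', c_int, 8),
                ('f4', c_int, 2)]

class U(Union):
    _fields_ = [('fld', H),
                ('buf', c_char * 6)]

u2 = U()
# The view shares u2's memory and holds a reference to it.
view = (c_char * 6).from_buffer(u2)
# Assigning to .raw copies every byte, embedded nulls included.
view.raw = b'a\x00cabc'
print('{} {} {} (expect: 97 0 99)'.format(u2.fld.f1, u2.fld.f2, u2.fld.f3))
```

Nothing is converted byte by byte in Python here, but I have not benchmarked it against plain field assignment.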

edited Dec 14, 2015 at 14:57

answered Dec 11, 2015 at 13:43


3 Comments

Eryk Sun Over a year ago

For convenience (usually), the CField descriptors for c_char and c_wchar arrays are special-cased in PyCField_FromDesc (in Modules/_ctypes/cfield.c) to convert to and from native Python strings using s_get / s_set and U_get / U_set.
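The difference described in this comment can be seen with a small experiment. This is only an illustration: Rec and its field n are made-up names, not part of the question's code.

```python
from ctypes import Union, c_char, c_int

# Hypothetical record type, used only to show the field behaviour.
class Rec(Union):
    _fields_ = [('n', c_int),
                ('buf', c_char * 6)]

# A standalone array is a real ctypes array object with .raw and .value.
arr = (c_char * 6)()
arr.raw = b'a\x00cabc'
standalone_raw = arr.raw      # all six bytes
standalone_value = arr.value  # stops at the first null

# A c_char array field converts to and from a native string instead.
rec = Rec()
rec.buf = b'a\x00cabc'        # copying stops at the first null
field_value = rec.buf         # plain bytes (str on Python 2), so no .raw
```

That is why u1.buf.raw fails in the question: by the time .raw is looked up, the field access has already returned an ordinary string.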

Eryk Sun Over a year ago

Don't use from_address for this since the resulting array doesn't own a reference on the source buffer, u2. This is a recipe for segfault disaster. Use (c_char * 6).from_buffer(u2).

Eryk Sun Over a year ago

In cases such as this I prefer to make the field name private (e.g. _buf) and use a public property.
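One way the private-field-plus-property idea might look, as a sketch rather than the commenter's actual code. The helper name _raw_view is made up for this example; H is the structure from the question.

```python
from __future__ import print_function
from ctypes import BigEndianStructure, Union, c_char, c_int

class H(BigEndianStructure):
    _fields_ = [('f1', c_int, 8),
                ('f2', c_int, 8),
                ('f3', c_int, 8),
                ('f4', c_int, 2)]

class U(Union):
    _fields_ = [('fld', H),
                ('_buf', c_char * 6)]  # private: direct access truncates at nulls

    def _raw_view(self):
        # Hypothetical helper: a c_char array over the _buf bytes.
        # from_buffer keeps a reference to self, unlike from_address.
        field = type(self)._buf
        return (c_char * field.size).from_buffer(self, field.offset)

    @property
    def buf(self):
        return self._raw_view().raw

    @buf.setter
    def buf(self, data):
        self._raw_view().raw = data

u2 = U()
u2.buf = b'a\x00cabc'
print('{} {} {} (expect: 97 0 99)'.format(u2.fld.f1, u2.fld.f2, u2.fld.f3))
```

Callers keep writing u.buf = data as in the question, and the null handling stays hidden inside the class.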

Mark Tolonen · Accepted Answer · 2014-10-11 05:08:17Z

0

Unfortunately, a c_char*6 field is handled as a nul-terminated string. Switch to c_byte*6 instead, at the cost of losing the convenience of initializing with strings:

from ctypes import *

class H(BigEndianStructure):
    _fields_ = [ ('f1', c_int, 8),
                 ('f2', c_int, 8),
                 ('f3', c_int, 8),
                 ('f4', c_int, 2)
                 # ...
                 ]

class U(Union):
    _fields_ = [ ('fld', H),
                 ('buf', c_byte * 6)
                 ]

u1 = U()
u1.buf=(c_byte*6)(97,98,99,97,98,99)
print '{} {} {} (expect: 97 98 99)'.format(u1.fld.f1, u1.fld.f2, u1.fld.f3)

u2 = U()
u2.buf=(c_byte*6)(97,0,99,97,98,99)
print '{} {} {} (expect: 97 0 99)'.format(u2.fld.f1, u2.fld.f2, u2.fld.f3)

Output:

97 98 99 (expect: 97 98 99)
97 0 99 (expect: 97 0 99)
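If the record already arrives as a string, the c_byte array does not have to be built element by element. A sketch using from_buffer_copy, which copies the bytes in one call; the name record is just a placeholder for the incoming data, and I have not measured whether this is fast enough for the use case in the question:

```python
from __future__ import print_function
from ctypes import BigEndianStructure, Union, c_byte, c_int

class H(BigEndianStructure):
    _fields_ = [('f1', c_int, 8),
                ('f2', c_int, 8),
                ('f3', c_int, 8),
                ('f4', c_int, 2)]

class U(Union):
    _fields_ = [('fld', H),
                ('buf', c_byte * 6)]

record = b'a\x00cabc'  # placeholder for one incoming record

u2 = U()
# Build the byte array straight from the string's buffer; the source
# must be at least 6 bytes long or from_buffer_copy raises ValueError.
u2.buf = (c_byte * 6).from_buffer_copy(record)
print('{} {} {} (expect: 97 0 99)'.format(u2.fld.f1, u2.fld.f2, u2.fld.f3))
```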

answered Oct 11, 2014 at 5:08


1 Comment

Simian Over a year ago

Thanks, Mark. This works, but the CPU overhead of marshaling the bytes from the string into the byte array makes the approach too slow for my application. (In my trials it was about 100x slower than ctypes' memcpy().)
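For a copy that stays at the memcpy level, ctypes.memmove can write the string directly into the union's memory. This is a sketch of a different approach, not something proposed in the answers, and it is not benchmarked here; record is a placeholder name for the incoming data.

```python
from __future__ import print_function
from ctypes import (BigEndianStructure, Union, addressof, c_char, c_int,
                    memmove, sizeof)

class H(BigEndianStructure):
    _fields_ = [('f1', c_int, 8),
                ('f2', c_int, 8),
                ('f3', c_int, 8),
                ('f4', c_int, 2)]

class U(Union):
    _fields_ = [('fld', H),
                ('buf', c_char * 6)]

record = b'a\x00cabc'  # placeholder for one incoming record

u2 = U()
# memmove does no bounds checking, so guard the length ourselves.
if len(record) > sizeof(U):
    raise ValueError('record is larger than the union')
memmove(addressof(u2), record, len(record))
print('{} {} {} (expect: 97 0 99)'.format(u2.fld.f1, u2.fld.f2, u2.fld.f3))
```

Since memmove trusts the length it is given, the explicit size check is the only thing standing between a long record and memory corruption.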
