Python/Numpy - Extracting Bits of Bytes

Question

I have 8 bytes of data in the form of a numpy.frombuffer() array. I need to get bits 10-19 into a variable and bits 20-29 into a variable. How do I use Python to extract bits that cross bytes? I've read about bit shifting but it isn't clear to me if that is the way to do it.

NumPy won't help you here with bit manipulation (though it is a great library). Better to just stick with Python bytes objects (which is probably what the buffer you used already was). See my answer for an example using bytes (a byte string).
– Aaron, Commented Oct 13, 2021 at 18:56
@Aaron I would advise against making such strong comments unless one is absolutely sure about it.
– Ehsan, Commented Oct 13, 2021 at 19:22
@Ehsan I don't think it's that strong a statement. NumPy may be capable of doing this, but it won't make it easier or more understandable. Particularly for a new programmer, I tend to recommend sticking to the standard library and built-in functions until you're more comfortable with the language.
– Aaron, Commented Oct 13, 2021 at 19:26
The data is an integer. I'll see if I can pull example data.
– Kevin, Commented Oct 13, 2021 at 19:31

Ehsan · Accepted Answer · 2021-10-13 19:40:14Z

1

Depending on your datatype you might need to slightly modify this numpy solution:

import numpy as np

a = np.frombuffer(b'\x01\x02\x03\x04\x05\x06\x07\x08', dtype=np.uint8)
#array([1, 2, 3, 4, 5, 6, 7, 8], dtype=uint8)

Unpacking the bits:

first = np.unpackbits(a)[10:20]
#array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0], dtype=uint8)

And if you need to repack the bits:

first_packed = np.packbits(first)
#array([8, 0], dtype=uint8)

Please note that Python uses 0-based indexing, so if you want the 10th through 19th elements, adjust the indexing above to np.unpackbits(a)[9:19].

Similarly for the other case:

second = np.unpackbits(a)[20:30]
#array([0, 0, 1, 1, 0, 0, 0, 0, 0, 1], dtype=uint8)
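
If the goal is a plain integer per slice rather than an array of bits, one option is to fold the unpacked bits into a number. Note that np.packbits pads the last byte with zeros on the right, so array([8, 0]) above is not the integer value of the 10-bit slice. The following is a minimal sketch that assumes the first bit of each slice is the most significant one; bits_to_int is a hypothetical helper, not a NumPy function:

```python
import numpy as np

a = np.frombuffer(b'\x01\x02\x03\x04\x05\x06\x07\x08', dtype=np.uint8)
bits = np.unpackbits(a)  # most significant bit of each byte comes first (the default)

def bits_to_int(bit_array):
    """Hypothetical helper: read a 1-D array of 0/1 values as an unsigned int,
    treating the first element as the most significant bit."""
    value = 0
    for b in bit_array:
        value = (value << 1) | int(b)  # shift left, then append the next bit
    return value

first_value = bits_to_int(bits[10:20])
second_value = bits_to_int(bits[20:30])
print(first_value, second_value)  # 32 193
```

Whether "bit 10" should be counted from the most or least significant end depends on how the data was produced, so that assumption should be checked against the data source.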

edited Oct 13, 2021 at 19:40

answered Oct 13, 2021 at 19:34

Ehsan


2 Comments

Kevin Over a year ago

thanks for this. It is easy and understandable. My initial data is 8 bytes. I need to understand from management whether all data will be 8 bytes.

Ehsan Over a year ago

@Kevin you are welcome. Please feel free to edit your question when you have more information about it and comment here for us to relook at it.

Aaron · Accepted Answer · 2021-10-13 19:43:49Z

1

Get each bit individually by indexing the correct byte, then masking off the correct bit. Then you can shift and add to build your new number from the bits.

data = b'abcdefgh'  # 8 bytes of data

def bit_slice(data, start, stop):
    out = 0
    for i in range(start, stop):
        byte_n = i // 8            # which byte holds bit i
        byte_bit = i % 8           # position within that byte (0 = least significant bit)
        byte_mask = 1 << byte_bit  # mask with a single 1 at that position
        bit = bool(data[byte_n] & byte_mask)
        out = out * 2 + bit  # multiplying by 2 is equivalent to a left shift; then add the new bit
    return out
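
As a usage sketch, the function can be called once per field. One thing to be aware of: the function above numbers bits within each byte starting from the least significant end, while np.unpackbits (by default) starts from the most significant end, so the two answers can return different values for the same range. The variant bit_slice_msb_first below is a hypothetical adaptation, not part of the original answer, that follows the most-significant-first convention:

```python
def bit_slice(data, start, stop):
    """The function from the answer: bit i % 8 is counted from the
    least significant end of each byte."""
    out = 0
    for i in range(start, stop):
        byte_n = i // 8
        byte_bit = i % 8
        byte_mask = 1 << byte_bit
        bit = bool(data[byte_n] & byte_mask)
        out = out * 2 + bit
    return out

def bit_slice_msb_first(data, start, stop):
    """Hypothetical variant: bit 0 is the most significant bit of byte 0."""
    out = 0
    for i in range(start, stop):
        byte_n, byte_bit = divmod(i, 8)
        bit = (data[byte_n] >> (7 - byte_bit)) & 1  # count from the high end
        out = (out << 1) | bit
    return out

data = b'abcdefgh'
first = bit_slice(data, 10, 20)
second = bit_slice(data, 20, 30)

sample = b'\x01\x02\x03\x04\x05\x06\x07\x08'
first_msb = bit_slice_msb_first(sample, 10, 20)
second_msb = bit_slice_msb_first(sample, 20, 30)
print(first, second, first_msb, second_msb)  # 108 393 32 193
```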

re:comments

Each time we want to add a new bit to our number like so:

10110
101101

We have to shift the first five bits over and then add either 1 or 0, depending on the value of the next bit. Shifting to the left moves each digit one place higher, which in binary means multiplying by 2 (just as shifting a decimal number over one place means multiplying by 10). When adding the new bit to the number we're accumulating, I simply multiply by 2 instead of using the left shift operator, just to show it's another option. When creating the byte mask, I did use the left shift operator (<<). It works by shifting a 1 several places over, so I end up with a byte that has a 1 in just the right place; when I "and" it with the byte in question, I get just the single bit I want to index:

1<<3 = 00001000
1<<5 = 00100000
1<<0 = 00000001
1<<7 = 10000000
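
These claims are easy to check in an interpreter. A small sketch using only built-in operators; the names appended and masks are chosen here for illustration:

```python
# Shifting left by one place doubles the value.
assert 0b10110 << 1 == 0b10110 * 2 == 0b101100

# Appending a bit: shift the accumulated number, then add (or OR in) the new bit.
appended = (0b10110 << 1) | 1
assert appended == 0b101101

# The masks listed above, formatted as 8-digit binary strings.
masks = {n: format(1 << n, '08b') for n in (3, 5, 0, 7)}
for n, text in masks.items():
    print(f"1<<{n} = {text}")
```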

Then apply the mask to get the bit we want:

10011011 #a byte of data
00100000 #bit mask for the 32's place
_________&
00000000
#bit in the 32's place is 0

10011011 #a byte of data
00010000 #bit mask for the 16's place
_________&
00010000
#bit in the 16's place is 1

After applying the mask, if the selected bit is 0 then the entire number will always be 0. If the selected bit is 1, the number will always be greater than 0. Calling bool on that result is equivalent to:

if data[byte_n] & byte_mask > 0:
    bit = 1
else:
    bit = 0

... because a boolean interpreted as an integer is simply a 1 or a 0.
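
To make the last step concrete, here is a short sketch using the example byte from above. It also shows one alternative (not used in the answer) that avoids bool by shifting the wanted bit down to position 0 before masking:

```python
byte = 0b10011011
mask = 1 << 4                     # 00010000, the 16's place

masked = byte & mask              # nonzero when the bit is set, but not necessarily 1
bit_via_bool = bool(masked)       # collapses any nonzero value to True
bit_via_shift = (byte >> 4) & 1   # alternative: move the bit to position 0, then mask

total = 0 * 2 + bit_via_bool      # True behaves as 1 in arithmetic
print(masked, bit_via_bool, bit_via_shift, total)  # 16 True 1 1
```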

edited Oct 13, 2021 at 19:43

answered Oct 13, 2021 at 18:51

Aaron


5 Comments

Kevin Over a year ago

Can you please explain "Multiply by 2 is equivalent to shift"?

Kevin Over a year ago

Also, I don't get the calculation of the byte_mask. Can you explain more?

Aaron Over a year ago

@Kevin see edit

Kevin Over a year ago

Thanks, @Aaron, for the extra explanation. This makes sense now. Your solution is what I'm experimenting with for now.

Aaron Over a year ago

@Kevin Ehsan's solution is also correct, and is certainly shorter, but I hope mine gives a bit more understanding on how it works under the hood...
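
Since the data is described in the comments as an integer, a third option, not taken from either answer, is to convert all the bytes to one Python int and extract each field with a single shift and mask. This is only a sketch: bit_field is a hypothetical helper, and it assumes big-endian byte order with bit 0 being the most significant bit of the first byte:

```python
def bit_field(data, start, stop):
    """Hypothetical helper: return bits start..stop-1 of data as an int.

    Assumes big-endian byte order, with bit 0 being the most
    significant bit of the first byte.
    """
    total_bits = len(data) * 8
    value = int.from_bytes(data, byteorder='big')
    width = stop - start
    # Shift the field down to the low end, then keep only `width` bits.
    return (value >> (total_bits - stop)) & ((1 << width) - 1)

data = b'\x01\x02\x03\x04\x05\x06\x07\x08'
first = bit_field(data, 10, 20)
second = bit_field(data, 20, 30)
print(first, second)  # 32 193
```

Under those assumptions the results match the NumPy slices [10:20] and [20:30] read as most-significant-first integers.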
