
I have a long one-dimensional list of integer 1s and 0s representing 8-bit bytes. What is a neat way to create a new list from it containing the integer value of each byte?

Being familiar with C, but new to Python, I've coded it the way I'd do it in C: an elaborate structure that loops through each bit. However, I'm aware that the whole point of Python over C is that such things can usually be done compactly and elegantly, and that I should learn how to do that. Maybe using a list comprehension?

This works, but suggestions for a more "Pythonic" way would be appreciated:

#!/usr/bin/env python2
bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]
bytes = []
byt = ""
for bit in bits:
  byt += str(bit)
  if len(byt) == 8:
    bytes += [int(byt, 2)]
    byt = ""
print bytes

$ bits-to-bytes.py
[149, 107, 231]

3 Answers


You can slice the list into chunks of 8 elements and map the subelements to str:

[int("".join(map(str, bits[i:i+8])), 2) for i in range(0, len(bits), 8)]

You could also split it into two steps, mapping and joining only once:

mapped = "".join(map(str, bits))
[int(mapped[i:i+8], 2) for i in range(0, len(mapped), 8)]

Or using iter and borrowing from the grouper recipe in itertools:

it = iter(map(str, bits))
[int("".join(sli), 2) for sli in zip(*iter([it] * 8))]

iter(map(str, bits)) converts each element of bits to str and creates an iterator; zip(*iter([it] * 8)) then collects the elements into groups of 8.
Each tuple that zip builds consumes eight elements from the same iterator, so the groups are always sequential. It is the same logic as the slicing in the first snippet, without the need to slice.
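
To make the shared-iterator trick concrete, here is a minimal Python 3 sketch on a shorter input. It drops the outer iter() call, which is harmless but not required, and the variable name groups is only illustrative:

```python
# Python 3 sketch: why zip(*[it] * 8) produces consecutive groups.
bits = [1, 0, 0, 1, 0, 1, 0, 1,
        0, 1, 1, 0, 1, 0, 1, 1]

it = iter(bits)

# [it] * 8 is a list of eight references to the *same* iterator object,
# so zip pulls eight consecutive items from it to build each tuple.
groups = list(zip(*[it] * 8))

print(groups)
# [(1, 0, 0, 1, 0, 1, 0, 1), (0, 1, 1, 0, 1, 0, 1, 1)]
```

Once the list is built the iterator is exhausted, so the grouping can only be consumed once.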

As Sven commented, for lists whose length is not divisible by 8 you will lose the trailing bits when using zip, just as in your original code. You can adapt the grouper recipe to handle those cases:

from itertools import zip_longest  # izip_longest in Python 2

bits = [1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0]
it = iter(map(str, bits))

print([int("".join(sli), 2) for sli in zip_longest(*iter([it] * 8), fillvalue="")])
# [149, 107, 231, 2]; using plain zip would give [149, 107, 231]

The fillvalue="" pads the short final group with empty strings, so int("".join(sli), 2) still works: the two bits 1,0 left over after taking three chunks of 8 become int("10", 2), which is 2.
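
The choice of fill value decides how a short final group is interpreted. As a rough Python 3 sketch (pack_bits is a hypothetical helper name, not something from itertools), compare padding with "" against padding with "0":

```python
from itertools import zip_longest

bits = [1, 0, 0, 1, 0, 1, 0, 1,
        0, 1, 1, 0, 1, 0, 1, 1,
        1, 1, 1, 0, 0, 1, 1, 1,
        1, 0]  # two leftover bits


def pack_bits(bits, fillvalue):
    """Hypothetical helper: group bits in eights and parse each group as base 2."""
    it = iter(map(str, bits))
    return [int("".join(group), 2)
            for group in zip_longest(*[it] * 8, fillvalue=fillvalue)]


# "" leaves the short group as "10", read as a two-digit binary number.
print(pack_bits(bits, ""))   # [149, 107, 231, 2]

# "0" pads on the right to "10000000", so the leftover bits are the high bits.
print(pack_bits(bits, "0"))  # [149, 107, 231, 128]
```

Which of the two is right depends on what the trailing bits are supposed to mean.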

In your own code bytes += [int(byt, 2)] could simply become bytes.append(int(byt, 2))
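
For reference, a sketch of the original loop with that one change, written for Python 3. Renaming the list to byte_values is my own addition, so that the built-in bytes type is not shadowed:

```python
bits = [1, 0, 0, 1, 0, 1, 0, 1,
        0, 1, 1, 0, 1, 0, 1, 1,
        1, 1, 1, 0, 0, 1, 1, 1]

byte_values = []  # renamed from "bytes" to avoid shadowing the built-in type
byt = ""
for bit in bits:
    byt += str(bit)
    if len(byt) == 8:
        # append adds one item in place; no temporary one-element list needed
        byte_values.append(int(byt, 2))
        byt = ""

print(byte_values)  # [149, 107, 231]
```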


7 Comments

Ah, I see you thought of grouper() as well :).
@Cyphase, yep, if the OP wanted to keep odd-length slices it would be the way to go.
@PadraicCunningham: Your version drops excess bits if the length of the list is not divisible by 8, since zip() stops on the shortest sequence.
@SvenMarnach, so does the OP's. I added a link to the grouper recipe, which will work for odd lengths; it is not totally clear what the OP wants to do in that case.
Thanks. A most comprehensive answer to my question.

Padraic's solution is good; here's another way to do it:

from itertools import izip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # Taken from itertools recipes
    # https://docs.python.org/2/library/itertools.html#recipes
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

bits = [1, 0, 0, 1, 0, 1, 0, 1,
        0, 1, 1, 0, 1, 0, 1, 1,
        1, 1, 1, 0, 0, 1, 1, 1]

byte_strings = (''.join(bit_group) for bit_group in grouper(map(str, bits), 8))
bytes = [int(byte_string, 2) for byte_string in byte_strings]

print bytes  # [149, 107, 231]
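
The code above targets Python 2. A possible Python 3 port is sketched below, assuming zip_longest as the replacement for izip_longest. The bits_to_bytes wrapper and its default fillvalue of "0" are my own choices for illustration; "0" right-pads a short final group with zero bits:

```python
from itertools import zip_longest  # Python 3 name for izip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks or blocks (itertools recipe)."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def bits_to_bytes(bits, fillvalue="0"):
    """Hypothetical wrapper: join each group of 8 bit characters and parse it."""
    return [int("".join(group), 2)
            for group in grouper(map(str, bits), 8, fillvalue)]


bits = [1, 0, 0, 1, 0, 1, 0, 1,
        0, 1, 1, 0, 1, 0, 1, 1,
        1, 1, 1, 0, 0, 1, 1, 1]

print(bits_to_bytes(bits))           # [149, 107, 231]
print(bits_to_bytes(bits + [1, 0]))  # [149, 107, 231, 128]
```

With the recipe's default fillvalue of None, a short final group would contain None entries and "".join() would raise a TypeError, so a string fill value is needed whenever the length may not be a multiple of 8.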

1 Comment

You should pass a value other than None for fillvalue if you want to handle a list whose length is not divisible by 8; otherwise the last group is padded with None, and ''.join() will raise a TypeError on it.

Since you start from a numeric list, you might want to avoid string manipulation. Here are a couple of methods:

  • dividing the original list into 8-bit chunks and computing the integer value of each byte (assuming the number of bits is a multiple of 8); thanks to Padraic Cunningham for the nice way of dividing a sequence into groups of 8 subelements:

    bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]
    [sum(b*2**x for b,x in zip(byte[::-1],range(8))) for byte in zip(*([iter(bits)]*8))]
    
  • using bitwise operators (probably more efficient); if the number of bits is not a multiple of 8, the code works as if the bit sequence were padded with 0s on the left (padding on the left often makes more sense than padding on the right, because it preserves the numerical value of the original sequence of binary digits):

    bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]
    n = sum(b*2**x for b,x in zip(bits[::-1],range(len(bits)))) # value of the binary number represented by 'bits'
    # n = int(''.join(map(str,bits)),2) # another way of finding n by means of string manipulation
    [(n>>(8*p))&255 for p in range(len(bits)//8-(len(bits)%8==0),-1,-1)]
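
Both methods can also be written without exponentiation. The following Python 3 sketch is a variation on the code above, not the same code: it accumulates bits with shift-and-or via functools.reduce, and uses the built-in int.to_bytes to split the whole number into bytes. The names per_byte, n and via_to_bytes are illustrative only:

```python
from functools import reduce

bits = [1, 0, 0, 1, 0, 1, 0, 1,
        0, 1, 1, 0, 1, 0, 1, 1,
        1, 1, 1, 0, 0, 1, 1, 1]

# Variant 1: build each byte with shift-and-or.
# Like plain zip grouping, this assumes len(bits) is a multiple of 8.
per_byte = [reduce(lambda acc, b: (acc << 1) | b, group, 0)
            for group in zip(*[iter(bits)] * 8)]

# Variant 2: build the whole number, then let int.to_bytes split it.
# The length is rounded up, so a short input is zero-padded on the left.
n = reduce(lambda acc, b: (acc << 1) | b, bits, 0)
via_to_bytes = list(n.to_bytes((len(bits) + 7) // 8, "big"))

print(per_byte)      # [149, 107, 231]
print(via_to_bytes)  # [149, 107, 231]
```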
    
