I have a stream of bits in Python that looks like this:

bitStream = "001011000011011111000011100111000101001111100011001"

This stream is dynamic, meaning that it changes depending on the input received. I want to write it to a file in Python, and I'm currently doing this:

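f = open("file.txt", "rb+")  # binary read/write; the file must already exist
s = f.read()  # existing contents of the file

bitStream  = "001011000011011111000011100111000101001111100011001"
byteStream = int(bitStream,2).to_bytes(len(bitStream)//8, 'little')  # byte count is rounded down
f.write(byteStream)

f.close() #close handle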

This works, but the bit stream can have a length that is not a multiple of 8, which results either in a file write of n-1 bytes or in an error of the type "int too big to convert".
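
Both failure modes can be reproduced with a small sketch. The helper name naive_to_bytes and the 17-bit sample string are made up for illustration; the conversion itself is the same one used above.

```python
def naive_to_bytes(bit_string: str) -> bytes:
    """Convert as in the snippet above: the byte count is rounded down."""
    return int(bit_string, 2).to_bytes(len(bit_string) // 8, "little")


# 17 bits whose value happens to fit in 2 bytes:
# the call succeeds but yields 2 bytes instead of the 3 needed for 17 bits.
short = naive_to_bytes("00000000011111111")

# 51 bits whose value needs 7 bytes while only 51 // 8 == 6 are requested:
# int.to_bytes raises OverflowError ("int too big to convert").
try:
    naive_to_bytes("001011000011011111000011100111000101001111100011001")
    overflowed = False
except OverflowError:
    overflowed = True
```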

Normally I would pad the bits so that their count is divisible by 8 (which is the usual approach), but in this case I really cannot add bits, because when I feed this file back to my program it will misinterpret the alignment bits as data.

Would you guys have any idea?

Thanks in advance

2 Answers

1

An easy fix is to make sure the byte count is always rounded up:

(len(bitStream)+7)//8

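This works because //8 always rounds down. Adding 7 first pushes any length that is not already a multiple of 8 up to or past the next multiple, so rounding down lands on that next multiple. A length that is already a multiple of 8 gains less than a full 8 and therefore stays where it is.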

Alternatively:

math.ceil(len(bitStream)/8)

Either way, the byte count is always large enough to hold every bit.
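
As a minimal sketch of the two formulas in use (the function name bits_to_bytes is hypothetical), the following checks that both give the same byte count and then converts the 51-bit stream from the question. Note that this only avoids the error: the resulting bytes do not record how many of their bits were meaningful.

```python
import math


def bits_to_bytes(bit_string: str) -> bytes:
    """Convert a non-empty string of '0'/'1' to bytes, rounding the byte count up."""
    n_bytes = (len(bit_string) + 7) // 8
    return int(bit_string, 2).to_bytes(n_bytes, "little")


# The integer formula and math.ceil agree for every length checked here.
lengths_agree = all((n + 7) // 8 == math.ceil(n / 8) for n in range(0, 1025))

bit_stream = "001011000011011111000011100111000101001111100011001"  # 51 bits
byte_stream = bits_to_bytes(bit_stream)  # 7 bytes, no OverflowError
```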

1 Comment

Thank you for your answer, I was actually wondering why adding 7 rounds it to a multiple of 8?
1

I really cannot add bits, because when I feed this file back to my program it will misinterpret the alignment bits as data

So, you need to write a number of bits that is not a multiple of 8, but computer memory is normally byte-addressable, and files are likewise sequences of whole bytes, meaning that you can't read or write anything that is smaller than 1 byte.

However, there is only a very small number of possibilities for how far the length of your input is from a multiple of 8: len(bitStream) % 8 may be 0, 1, 2, 3, 4, 5, 6 or 7, so you never need more than 7 bits of padding. Thus, you can align your data to a multiple of 8 bits (if needed) and use one additional byte to indicate the number of bits that are used for padding (possibly zero), like this:

     01110011101 # initial data
1111101110011101 # align it with 1's (or 0's, or whatever)
^^^^^ alignment of 5 bits
000001011111101110011101
^^^^^^^^----------------- the number 5 (size of alignment)
        |||||
        ^^^^^------------ the alignment itself

When you read the file, you know that the first byte holds the size of the alignment (n), so you read it, then read the remaining data and disregard the n leading bits.
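
The scheme above can be sketched as follows. The function names pack_bits and unpack_bits are hypothetical, and the sketch assumes big-endian byte order so that the bytes come out in the same left-to-right order as the diagram (the question's code uses 'little'); padding is done with 1's at the front, as in the diagram.

```python
def pack_bits(bit_string: str) -> bytes:
    """Prefix one byte holding the padding size, then the left-padded bits."""
    pad = (-len(bit_string)) % 8          # 0..7 padding bits needed
    padded = "1" * pad + bit_string       # pad at the front, as in the diagram
    if padded:
        body = int(padded, 2).to_bytes(len(padded) // 8, "big")
    else:
        body = b""                        # empty input: only the header byte
    return bytes([pad]) + body


def unpack_bits(data: bytes) -> str:
    """Read the padding size from the first byte and drop that many leading bits."""
    pad = data[0]
    bits = "".join(f"{byte:08b}" for byte in data[1:])
    return bits[pad:]


packed = pack_bits("01110011101")   # the 11-bit example from the diagram
restored = unpack_bits(packed)
```

Writing the result is then an ordinary binary write, for example f.write(pack_bits(bitStream)) on a file opened in a binary mode. Because the bits are rebuilt from fixed-width bytes, leading zeros in the original stream survive the round trip.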
