python how to convert bytes to binary

Question

I'm trying to read a file's contents and convert them into the binary that is actually stored in memory. If I write:

file = open("filename","br")
binary = "0b"
for i in file.read():
    binary += bin(i)[2:]

Will binary equal the actual value stored in memory? If so, how can I convert this back into a string?

EDIT: I tried

file = open("filename.txt","br")
binary = ""
for i in file.read():
    binary += bin(i)[2:]
stored = ""
for bit in binary:
    stored += bit
    if len(stored) == 7:
        print(chr(eval("0b"+stored)), end="")
        stored = ""

and it worked fine until it reached a space; after that it printed weird signs and mixed-up letters.

It's not really clear what you're trying to do. file.read() is literally the bytes that are in the file. Could you give an example of what you think is in the file and what you want the result to look like?
– Frank Yellin, Commented Sep 12, 2020 at 21:22
I'm trying to do this for any text file in general. Also, I want the result to reproduce what's in the file, to prove to myself that I actually have the binary version, which I need for various purposes.
– forever, Commented Sep 12, 2020 at 21:24
Also, you may not know that when you loop over a bytes object, each item is the integer value of that byte, like what ord returns for a character.
– forever, Commented Sep 12, 2020 at 21:30

Mike67 · Accepted Answer · 2020-09-12 21:30:02Z


To get a (somewhat) accurate representation of the string as it is stored in memory, you need to convert each character into binary.

Assuming basic ASCII encoding (1 byte per character):

s = "python"
binlst = [bin(ord(c))[2:].rjust(8,'0') for c in s]  # remove '0b' from string, fill 8 bits
binstr = ''.join(binlst)

print(s)
print(binlst)
print(binstr)

Output

python
['01110000', '01111001', '01110100', '01101000', '01101111', '01101110']
011100000111100101110100011010000110111101101110
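
Going the other way (the second part of the question) can be sketched as follows. This assumes the same one-byte-per-character ASCII text and a bit string built with exactly 8 digits per character, as above; the helper name bits_to_text is made up for this illustration.

```python
def bits_to_text(bitstr):
    """Decode a string of 8-bit groups back into text (ASCII assumed)."""
    if len(bitstr) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    # Slice into 8-digit chunks and parse each chunk as a base-2 integer.
    chunks = [bitstr[i:i + 8] for i in range(0, len(bitstr), 8)]
    return ''.join(chr(int(chunk, 2)) for chunk in chunks)

binstr = "011100000111100101110100011010000110111101101110"
print(bits_to_text(binstr))  # python
```

Using int(chunk, 2) avoids the eval("0b" + ...) call from the question and only works because every character was padded to a fixed width.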

For Unicode text encoded as UTF-8, each character can take 1 to 4 bytes, so the one-byte-per-character assumption no longer holds and the binary representation is harder to work out character by character. As @Frank Yellin mentioned, it may be easier to just convert the file's bytes to binary.
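
Working from the bytes directly could look like the sketch below. It assumes the file holds UTF-8 text; bytes_to_bits and bits_to_bytes are hypothetical helper names, and the sample bytes stand in for what open("filename.txt", "rb").read() would return.

```python
def bytes_to_bits(data):
    """Render each byte as exactly 8 binary digits."""
    return ''.join(format(byte, '08b') for byte in data)

def bits_to_bytes(bitstr):
    """Inverse of bytes_to_bits: parse 8 digits at a time."""
    return bytes(int(bitstr[i:i + 8], 2) for i in range(0, len(bitstr), 8))

# Stand-in for: with open("filename.txt", "rb") as f: data = f.read()
text = "hi \u00e9"              # U+00E9 (e with acute) takes two bytes in UTF-8
data = text.encode("utf-8")

bits = bytes_to_bits(data)
restored = bits_to_bytes(bits).decode("utf-8")

print(len(text), len(data), len(bits))  # 4 5 40
print(restored == text)                 # True
```

Because the conversion is done per byte rather than per character, the multi-byte character needs no special handling; decoding happens once, after the bytes are rebuilt.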

edited Sep 12, 2020 at 21:30

answered Sep 12, 2020 at 21:24


3 Comments

luthervespers Over a year ago

I found an interesting article describing how to determine how many bytes need to be read for each UTF-8 encoded character: johndcook.com/blog/2019/09/09/how-utf-8-works

forever Over a year ago

@Mike67 so the problem was that bin deletes trailing zeros so you need to add them back?

Mike67 Over a year ago

It deletes leading zeroes, so 00001101 becomes 1101. Need to add back zeros to fill 8 bits.
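
A small illustration of that point, and of why the 7-bit grouping in the question broke at the first space: lowercase ASCII letters happen to need 7 binary digits, but a space (code 32) needs only 6, so every group after it is shifted by one bit. The variable names here are illustrative only.

```python
n = 13
raw = bin(n)[2:]            # leading zeros are dropped
padded = format(n, '08b')   # fixed width of 8, zero-filled
print(raw)                  # 1101
print(padded)               # 00001101
print(raw.zfill(8))         # 00001101 (equivalent padding)

letter_bits = bin(ord('a'))[2:]
space_bits = bin(ord(' '))[2:]
print(letter_bits, len(letter_bits))  # 1100001 7
print(space_bits, len(space_bits))    # 100000 6
```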
