CSV to binary using Python

Question

I have a lot of CSV files that I want to convert into binary files, so I want to write a Python script that automates the task. Every value in my CSV files is either 0 or 255 (each file has 80 rows and 320 columns).

I wrote this code :

import numpy as np
import csv

csv_filename = '320x80_ImageTest_1_255.csv'
filename = "output.bin"

with open(csv_filename) as f:
    reader = csv.reader(f, delimiter =';')
    lst = list(reader)

array = np.array(lst)

with open ('new_binary.bin','wb') as FileToWrite:
    for i in range(len(array)):
        for j in range(len(array[0])):
            FileToWrite.write(''.join(chr(int(array[i][j]))).encode())

The problem is that the output file looks like this: [screenshot of the output file]

But instead of this character I want ff, which corresponds to 255 in hex. What am I doing wrong? Can someone help me?

I'm still wondering what you are trying to achieve there. This looks like you want to figure out the most complex way to make a copy of a file. What output do you expect from what input?
– Klaus D., Commented Jun 6, 2023 at 12:53
Let's say, for example, my CSV file is something like this: [['0;0;0;0;255;255;0;0;255], ['255;255;0;0;0;255;0;0;255], ['0;255;0;0;255;0;255;0;0], ['0;0;255;0;0;255;0;0;0]] I want to get a binary file like this: 0 0 0 0 ff ff 0 0 ff ff ff 0 0 0 ff 0 0 ff 0 ff 0 0 ff 0 ff 0 0 0 0 ff 0 0 ff 0 0 0
– mohamed AFASSI, Commented Jun 6, 2023 at 13:01
If you want the file to actually contain pairs of f characters, that's not binary at all: it would be exactly as much of a text file as your original CSV, just with numbers represented in a different base. The code you posted achieves exactly what you said you wanted; that's just apparently not what you actually wanted.
– jasonharper, Commented Jun 6, 2023 at 13:19
I think you're right. Looking at the hex view of the output file, I see C3BF where I want FF; that is my goal. I'm very new to the field, so I may have misunderstood something, and I'm still not fully used to working with binary.
– mohamed AFASSI, Commented Jun 6, 2023 at 13:43

Zach Young · Accepted Answer · 2023-06-07 20:22:01Z

3

Do you want something like the following:

rows = [
    ["0", "0", "0", "0", "255", "255", "0", "0", "255"],
    ["255", "255", "0", "0", "0", "255", "0", "0", "255"],
    ["0", "255", "0", "0", "255", "0", "255", "0", "0"],
    ["0", "0", "255", "0", "0", "255", "0", "0", "0"],
]

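with open("output.bin", "wb") as f_out:
    for row in rows:
        for field in row:
            # One byte per field; the no-argument form needs Python 3.11+,
            # older versions need to_bytes(1, "big")
            f_out.write(int(field).to_bytes())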

Then, inspecting output.bin:

with open("output.bin", "rb") as f_in:
    while True:
        x = f_in.read(9)
        if len(x) == 0:
            break
        print(x)

b'\x00\x00\x00\x00\xff\xff\x00\x00\xff'
b'\xff\xff\x00\x00\x00\xff\x00\x00\xff'
b'\x00\xff\x00\x00\xff\x00\xff\x00\x00'
b'\x00\x00\xff\x00\x00\xff\x00\x00\x00'

Thanks to "Writing integers in binary to file in python" for showing me the to_bytes(...) method, and to MyICQ for pointing out the defaults.
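For readers wondering where the C3BF in the question came from, here is a minimal sketch (not part of the original answer) comparing the question's chr(...).encode() approach with two ways of producing a single raw byte. The variable names are illustrative only.

```python
# Why 255 became the two bytes C3 BF, and two ways to get a single FF byte.
value = 255

# chr(255) is the text character U+00FF; str.encode() defaults to UTF-8,
# which stores that character as two bytes.
as_text = chr(value).encode()

# bytes([n]) builds a one-byte object directly from an integer in 0..255.
as_byte = bytes([value])

# int.to_bytes with explicit arguments also works before Python 3.11.
also_byte = value.to_bytes(1, "big")

print(as_text.hex(), as_byte.hex(), also_byte.hex())
```

The first value is text that has been encoded, while the other two are the raw byte itself, which is what a binary file of 0/255 pixel values needs.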

edited Jun 7, 2023 at 20:22

answered Jun 6, 2023 at 20:21

Zach Young


2 Comments

MyICQ Over a year ago

Note that to_bytes() changed in Python 3.11. Earlier versions have no default arguments, so both must be passed: to_bytes(1, 'big') (although byte order makes no difference for a single byte). From Python 3.11 on, both arguments are optional: length defaults to 1 and byteorder to 'big', so a plain to_bytes() works.

mohamed AFASSI Over a year ago

I use Python 3.9 (and used the full arguments), but I still get the same characters when I view the output file. I literally copied and pasted your code, but we get different results. I'll try upgrading Python to the latest version and keep you updated.
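One possible source of differing results is how the file is viewed: a text editor will try to decode the bytes as characters, whereas a hex view shows them as written. The following is a small sketch for checking a file's contents in hex from Python. The helper name hex_dump and the demo file are assumptions for illustration, and bytes.hex() with a separator needs Python 3.8 or later.

```python
import os
import tempfile

def hex_dump(path, width=9):
    """Return the file's bytes as lines of space-separated hex pairs."""
    with open(path, "rb") as f:
        data = f.read()
    return [data[i:i + width].hex(" ") for i in range(0, len(data), width)]

# Demo on a small throwaway file
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(demo_path, "wb") as f:
    f.write(bytes([0, 0, 255, 255]))

dump = hex_dump(demo_path, width=2)
print(dump)
```

If the dump of output.bin shows ff where 255 was written, the file is correct and only the viewer's rendering differs.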

MyICQ · Accepted Answer · 2023-06-08 22:40:38Z

1

This does pretty much what is described.

I left out the reading of the input to a variable, it should be trivial. Since the input contains the ' character it can't be read as json. Instead I see it as a series of numbers, separated by something. Then a regular expression is applied to turn the numbers into an array.

# Regular expression support
import re

# the input, should be read from file
dirtyinput = "[['0;0;0;0;255;255;0;0;255], ['255;255;0;0;0;255;0;0;255], ['0;255;0;0;255;0;255;0;0], ['0;0;255;0;0;255;0;0;0]]"

# extract numbers
numbers = re.findall(r'\d+', dirtyinput)

# Using function from answer by Zach Young
with open("output.bin", "wb") as f_out:
    for n in numbers:
        f_out.write(int(n).to_bytes(1, 'big'))

# --------- another method, iterating the data (efficient if the data is large)
#
with open("output2.bin", "wb") as f:
    for x in re.finditer(r'\d+', dirtyinput):
        f.write(int(x.group()).to_bytes(1,'big'))

# -------- testing result
# 
with open("output.bin", "rb") as f_in:
    while True:
        x = f_in.read(9)
        if len(x) == 0:
            break
        print(x)

b'\x00\x00\x00\x00\xff\xff\x00\x00\xff'
b'\xff\xff\x00\x00\x00\xff\x00\x00\xff'
b'\x00\xff\x00\x00\xff\x00\xff\x00\x00'
b'\x00\x00\xff\x00\x00\xff\x00\x00\x00'

I get the same result as the answer above.
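Since the reading step was left out above, here is a hedged end-to-end sketch that reads a semicolon-delimited CSV file with the csv module and writes one byte per field. The function name csv_to_bin and the sample file are hypothetical, and it assumes every field is an integer in the range 0 to 255.

```python
import csv
import os
import tempfile

def csv_to_bin(csv_path, bin_path, delimiter=";"):
    """Write each CSV field (an integer 0..255) as one raw byte."""
    with open(csv_path, newline="") as f_in, open(bin_path, "wb") as f_out:
        for row in csv.reader(f_in, delimiter=delimiter):
            # bytes() raises ValueError for values outside 0..255
            f_out.write(bytes(int(field) for field in row))

# Demo with a tiny two-row file
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "sample.csv")
bin_path = os.path.join(workdir, "sample.bin")
with open(csv_path, "w", newline="") as f:
    f.write("0;0;255\n255;0;0\n")

csv_to_bin(csv_path, bin_path)
with open(bin_path, "rb") as f:
    result = f.read()
print(result)
```

Building each row with bytes(...) avoids the to_bytes() version difference altogether, and for an 80 x 320 file it should yield 25600 bytes.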

edited Jun 8, 2023 at 22:40

answered Jun 7, 2023 at 7:36

MyICQ


2 Comments

mohamed AFASSI Over a year ago

What version of python do you use?

MyICQ Over a year ago

This was tested using Python 3.9. This is why I had to use to_bytes(1,'big') instead of just to_bytes(). See my comment above.
