I have a lot of CSV files that I want to convert into binary files, so I want to write a Python script that automates the task. Each CSV file contains only the values 0 and 255 (every file has 80 rows and 320 columns).

I wrote this code:

import numpy as np
import csv

csv_filename = '320x80_ImageTest_1_255.csv'
filename = "output.bin"

with open(csv_filename) as f:
    reader = csv.reader(f, delimiter =';')
    lst = list(reader)

array = np.array(lst)

with open ('new_binary.bin','wb') as FileToWrite:
    for i in range(len(array)):
        for j in range(len(array[0])):
            FileToWrite.write(''.join(chr(int(array[i][j]))).encode())

The problem is that the output file looks like this: (screenshot of the output file)

Instead of this character I want ff, which corresponds to 255 in hex. What am I doing wrong? Can someone help me?

  • docs.python.org/3/library/functions.html#ord Commented Jun 6, 2023 at 12:52
  • I'm still wondering what you are trying to achieve there. It looks like you are trying to find the most complex way to make a copy of a file. What output do you expect from what input? Commented Jun 6, 2023 at 12:53
  • Let's say, for example, my CSV file is something like this: [['0;0;0;0;255;255;0;0;255], ['255;255;0;0;0;255;0;0;255], ['0;255;0;0;255;0;255;0;0], ['0;0;255;0;0;255;0;0;0]] I want to get a binary file like this: 0 0 0 0 ff ff 0 0 ff ff ff 0 0 0 ff 0 0 ff 0 ff 0 0 ff 0 ff 0 0 0 0 ff 0 0 ff 0 0 0 Commented Jun 6, 2023 at 13:01
  • If you want the file to actually contain pairs of f characters, that's not binary at all - it would be exactly as much of a text file as your original CSV, just with numbers represented in a different base. The code you posted achieves exactly what you said you wanted, that's just apparently not actually what you wanted. Commented Jun 6, 2023 at 13:19
  • I think you're right. Looking at the hex view of the output file, I see this. So instead of C3BF I want FF; that is my goal. I'm very new to the field, so I might be misunderstanding something, and I'm still not fully used to working with binary. Commented Jun 6, 2023 at 13:43
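The C3BF seen in the hex view can be reproduced in isolation. The short sketch below is not part of the original post; it only illustrates that chr(255).encode() encodes the character U+00FF as UTF-8 (the default for str.encode()), which takes two bytes, whereas building a bytes object from the integer gives the single byte FF. The variable names are made up for the illustration.

```python
# chr(255) is the character U+00FF; str.encode() defaults to UTF-8,
# which needs two bytes for any code point above 127.
as_text = chr(255).encode()
print(as_text.hex())  # c3bf

# Building a bytes object directly from the integer gives one raw byte.
as_raw = bytes([255])
print(as_raw.hex())  # ff

# Code points up to 127 encode to the same single byte either way,
# which is why the 0 values in the file looked correct.
zero_text = chr(0).encode()
print(zero_text.hex())  # 00
```

This suggests the loop in the question is fine as far as reading goes; only the chr(...).encode() step turns each 255 into two bytes.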

2 Answers

Do you want something like the following:

rows = [
    ["0", "0", "0", "0", "255", "255", "0", "0", "255"],
    ["255", "255", "0", "0", "0", "255", "0", "0", "255"],
    ["0", "255", "0", "0", "255", "0", "255", "0", "0"],
    ["0", "0", "255", "0", "0", "255", "0", "0", "0"],
]

with open("output.bin", "wb") as f_out:
    for row in rows:
        for field in row:
            # to_bytes() without arguments needs Python 3.11+;
            # on older versions use to_bytes(1, "big")
            f_out.write(int(field).to_bytes())

Then, inspecting output.bin:

with open("output.bin", "rb") as f_in:
    while True:
        x = f_in.read(9)
        if len(x) == 0:
            break
        print(x)
b'\x00\x00\x00\x00\xff\xff\x00\x00\xff'
b'\xff\xff\x00\x00\x00\xff\x00\x00\xff'
b'\x00\xff\x00\x00\xff\x00\xff\x00\x00'
b'\x00\x00\xff\x00\x00\xff\x00\x00\x00'

Thanks to "Writing integers in binary to file in python" for showing me the to_bytes(...) method, and to MyICQ for pointing out the defaults.
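To connect this with the files from the question, here is a minimal sketch that reads a semicolon-delimited CSV and writes one raw byte per field. It is an illustration under assumptions, not the author's code: the helper name csv_to_bin and the sample file names are hypothetical, and it uses bytes(...) rather than to_bytes(...) so that it does not depend on the Python version.

```python
import csv
import os
import tempfile


def csv_to_bin(csv_path, bin_path, delimiter=";"):
    """Write every field of the CSV as one raw byte (values must be 0..255)."""
    with open(csv_path, newline="") as f_in, open(bin_path, "wb") as f_out:
        for row in csv.reader(f_in, delimiter=delimiter):
            # bytes(iterable of ints) raises ValueError outside 0..255;
            # empty fields (e.g. from a trailing delimiter) are skipped.
            f_out.write(bytes(int(field) for field in row if field.strip()))


# Small self-contained demo with a throwaway 2x3 CSV file.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "sample.csv")
    bin_path = os.path.join(tmp, "sample.bin")
    with open(csv_path, "w", newline="") as f:
        f.write("0;0;255\n255;0;0\n")
    csv_to_bin(csv_path, bin_path)
    with open(bin_path, "rb") as f:
        data = f.read()

print(data.hex())  # 0000ffff0000
```

Each row is written as a single bytes object, so there is one write call per row rather than one per field.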

2 Comments

Note that to_bytes() changed in Python 3.11. Earlier versions have no default arguments, so both must be given: to_bytes(1, 'big') (although the byte order makes no difference for a single byte). In Python 3.11 both arguments are optional, with length defaulting to 1 and byteorder to 'big', so you can simply call to_bytes().
I use Python 3.9 (so I used the full arguments), but I still get the same characters when I view the output file. I literally copied and pasted your code, but we get different results. I'll try upgrading Python to the latest version and keep you updated.
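The version difference discussed in these comments can be sidestepped by always passing both arguments. The sketch below is only an illustration; the helper name to_single_byte is hypothetical. It shows that the explicit call, the little-endian call, and bytes([n]) all produce the same single byte, and that a value above 255 is rejected instead of being silently truncated.

```python
def to_single_byte(value):
    """Return value (0..255) as one byte, passing to_bytes arguments explicitly."""
    return int(value).to_bytes(1, "big")


one_byte = to_single_byte("255")      # explicit arguments, no reliance on defaults
little = (255).to_bytes(1, "little")  # byte order is irrelevant for one byte
same_byte = bytes([255])              # equivalent without to_bytes at all

print(one_byte, little, same_byte)

# A value that does not fit into one byte raises OverflowError.
try:
    to_single_byte(256)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)  # True
```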

This does pretty much what is described.

I left out reading the input into a variable; that should be trivial. Since the input contains the ' character, it can't be read as JSON. Instead I treat it as a series of numbers separated by something, and apply a regular expression to extract the numbers into a list.

# Regular expression support
import re

# the input, should be read from file
dirtyinput = "[['0;0;0;0;255;255;0;0;255], ['255;255;0;0;0;255;0;0;255], ['0;255;0;0;255;0;255;0;0], ['0;0;255;0;0;255;0;0;0]]"

# extract numbers
numbers = re.findall(r'\d+', dirtyinput)

# Using function from answer by Zach Young
with open("output.bin", "wb") as f_out:
    for n in numbers:
        f_out.write(int(n).to_bytes(1, 'big'))

# --------- another method, iterating the data (efficient if the data is large)
#
with open("output2.bin", "wb") as f:
    for x in re.finditer(r'\d+', dirtyinput):
        f.write(int(x.group()).to_bytes(1,'big'))

# -------- testing result
# 
with open("output.bin", "rb") as f_in:
    while True:
        x = f_in.read(9)
        if len(x) == 0:
            break
        print(x)
b'\x00\x00\x00\x00\xff\xff\x00\x00\xff'
b'\xff\xff\x00\x00\x00\xff\x00\x00\xff'
b'\x00\xff\x00\x00\xff\x00\xff\x00\x00'
b'\x00\x00\xff\x00\x00\xff\x00\x00\x00'

I get the same result as the answer above.
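Since the question mentions many CSV files, the same idea could be applied to a whole folder. This is a hedged sketch rather than a tested production script: the function name convert_folder, the ".bin" suffix, and the sample file names are assumptions made for the example, and it reads real semicolon-delimited files with the csv module instead of the regular expression used above.

```python
import csv
import tempfile
from pathlib import Path


def convert_folder(folder, delimiter=";"):
    """Convert every *.csv in folder to a .bin file of raw bytes next to it."""
    written = []
    for csv_path in sorted(Path(folder).glob("*.csv")):
        with open(csv_path, newline="") as f_in:
            values = [
                int(field)
                for row in csv.reader(f_in, delimiter=delimiter)
                for field in row
                if field.strip()  # skip empty fields from trailing delimiters
            ]
        bin_path = csv_path.with_suffix(".bin")
        bin_path.write_bytes(bytes(values))  # one byte per value, 0..255 only
        written.append(bin_path)
    return written


# Demo on two throwaway 2x2 files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv"):
        Path(tmp, name).write_text("0;255\n255;0\n")
    outputs = convert_folder(tmp)
    names = [p.name for p in outputs]
    sizes = [p.stat().st_size for p in outputs]
    first = outputs[0].read_bytes()

print(names, sizes, first)
```

For the 80 x 320 files described in the question, each output file should be exactly 80 * 320 = 25600 bytes, which makes for a quick sanity check.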

2 Comments

What version of Python do you use?
This was tested using Python 3.9. This is why I had to use to_bytes(1,'big') instead of just to_bytes(). See my comment above.
