2

I have a NumPy array of shape (192, 192, 4000) that I would like to write to disk quickly. I don't care about the format; I can convert it afterwards.

What I do right now is save it in CSV format, which takes a long time:

for i in range(0, 192):
    np.savetxt(foder + "/{}_{}.csv".format(filename, i), data[i], "%i", delimiter=", ")

This takes 20-25 seconds. I tried the pandas DataFrame and Panel approaches found in existing Stack Overflow questions, as well as numpy.save. All of them seem to run without error, but the folder is empty when I open it.

Any idea how to improve the speed?

Why does the code run without error but nothing is saved, for example with numpy.save?

1 Comment

There are two different questions in here. The first is about the best method for saving NumPy arrays to disk, which is trivial. The second, why the code runs without error but nothing is saved (for example with numpy.save), cannot possibly be answered with so little information. Commented Jan 29, 2020 at 18:09

2 Answers

9

Usually the fastest way to save a large array like the one you have is to save it as a binary file, which can be done with NumPy's save function. For example, the following creates a 3D array filled with zeros, writes the array to a file, and then loads it back:

import numpy

a = numpy.zeros((192, 192, 4000))  # 3D array of zeros
numpy.save("mydata.npy", a)        # write it in binary .npy format
b = numpy.load("mydata.npy")       # read it back

After the save call, the file "mydata.npy" should be in the current working directory.
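If the folder looks empty afterwards, one thing worth checking is where the file actually went and what it was named. The following is only a sketch under assumptions: the temporary directory and the name "mydata" are hypothetical stand-ins for your own folder and filename, and the array is smaller than yours so it runs quickly. It relies on the documented behaviour that numpy.save appends ".npy" when the filename does not already end with it.

```python
import os
import tempfile

import numpy as np

# Hypothetical output folder: a temporary directory stands in for the real one.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "mydata")  # extension left off on purpose

# Smaller than (192, 192, 4000) so the sketch runs quickly.
data = np.zeros((192, 192, 40), dtype=np.int32)

# numpy.save appends ".npy" if the given name does not already end with it.
np.save(out_path, data)
saved_path = out_path + ".npy"

# Print the absolute path and confirm that the file is really there.
print(os.path.abspath(saved_path), os.path.exists(saved_path))

# Load it back to confirm that shape and contents survived.
restored = np.load(saved_path)
print(restored.shape, restored.dtype)
```

Printing the absolute path makes it obvious when a relative path resolved to a different working directory than the one you were looking in.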


1 Comment

I didn't specify the .npy extension when saving. It works now. Thanks!
2

You can also reshape your array from 3D to 2D before saving. See the following code for an example.

import numpy as gfg 


arr = gfg.random.rand(5, 4, 3) 

# reshaping the array from a 3D
# matrix to a 2D matrix.
arr_reshaped = arr.reshape(arr.shape[0], -1) 

# saving reshaped array to file. 
gfg.savetxt("geekfile.txt", arr_reshaped) 

# retrieving data from file. 
loaded_arr = gfg.loadtxt("geekfile.txt") 

# loaded_arr is a 2D array, so we
# need to reshape it to recover
# the original 3D array with its
# original shape.
load_original_arr = loaded_arr.reshape( 
    loaded_arr.shape[0], loaded_arr.shape[1] // arr.shape[2], arr.shape[2]) 

# check the shapes: 
print("shape of arr: ", arr.shape) 
print("shape of load_original_arr: ", load_original_arr.shape) 

# check if both arrays are same or not: 
if (load_original_arr == arr).all(): 
    print("Yes, both the arrays are same") 
else: 
    print("No, both the arrays are not same") 
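Note that the reshape is only needed because savetxt accepts 1D or 2D input; numpy.save writes the 3D array as it is. As a rough sketch for comparing the two routes on your own machine (the array size, the file names, and the integer data are assumptions chosen for illustration, and timings will vary with disk and array size):

```python
import os
import tempfile
import time

import numpy as np

# Small integer stand-in array; the real one would be (192, 192, 4000).
rng = np.random.default_rng(0)
arr = rng.integers(0, 1000, size=(20, 30, 40))

tmp_dir = tempfile.mkdtemp()
txt_path = os.path.join(tmp_dir, "arr.txt")
npy_path = os.path.join(tmp_dir, "arr.npy")

# Text route: savetxt needs 2D input, so flatten the last two axes first.
start = time.perf_counter()
np.savetxt(txt_path, arr.reshape(arr.shape[0], -1), fmt="%d")
t_txt = time.perf_counter() - start

# Binary route: numpy.save takes the 3D array directly.
start = time.perf_counter()
np.save(npy_path, arr)
t_npy = time.perf_counter() - start

# Read both files back and restore the original shape where needed.
from_txt = np.loadtxt(txt_path, dtype=arr.dtype).reshape(arr.shape)
from_npy = np.load(npy_path)

print("savetxt seconds:", t_txt)
print("save seconds:   ", t_npy)
print("round trips match:",
      np.array_equal(from_txt, arr), np.array_equal(from_npy, arr))
```

The .npy file also stores the shape and dtype, so no reshape is needed after loading.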

1 Comment

Is saving a 2d array faster than a 3d array?
