I'm trying to concatenate several hundred arrays totaling almost 25 GB of data. I am testing on a 56 GB machine, but I receive a memory error. I reckon the way I do my process is inefficient and is using a lot of memory. This is my code:
import os
import numpy

for dirname, dirnames, filenames in os.walk('/home/extra/AllData'):
    filenames.sort()
    BigArray = numpy.zeros((1, 200))
    for filename in filenames:
        # load each saved array and append it to the running result
        newArray = numpy.load(os.path.join(dirname, filename))
        BigArray = numpy.concatenate((BigArray, newArray))
Any ideas, thoughts or solutions?
Thanks
When you do a = np.concatenate((a, b)), numpy creates an intermediate array with the concatenation of a and b, then points the variable a to it; if there are no other references to the old a, it then gets garbage collected. But this requires, even if only for a split second, twice as much memory as the final array.
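One way to avoid that repeated doubling is to allocate the result once and copy each file into its slice, so no intermediate concatenation copy is ever made. The sketch below assumes every file is a .npy array saved with numpy.save and has 200 columns, as in your code; the data_dir name and the two-pass layout are illustrative, not taken from your post.

import os
import numpy

# Assumption: each file is a 2-D .npy array with 200 columns.
data_dir = '/home/extra/AllData'
files = sorted(os.path.join(dirname, f)
               for dirname, _, filenames in os.walk(data_dir)
               for f in filenames)

# First pass: count the total rows without keeping any data in memory
# (mmap_mode='r' reads only the header and maps the file lazily).
total_rows = sum(numpy.load(f, mmap_mode='r').shape[0] for f in files)

# Allocate the final array once, then fill it file by file.
BigArray = numpy.empty((total_rows, 200))
row = 0
for f in files:
    chunk = numpy.load(f)
    BigArray[row:row + chunk.shape[0]] = chunk
    row += chunk.shape[0]

With this layout the peak usage is roughly the ~25 GB result plus one file's worth of data at a time, instead of twice the size of the final array.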