Say I have a large NumPy array of dtype int32
import numpy as np
N = 1000 # (large) number of elements
a = np.random.randint(0, 100, N, dtype=np.int32)
but now I want the data to be uint32. I could do
b = a.astype(np.uint32)
or even
b = a.astype(np.uint32, copy=False)
but in both cases b is a copy of a, whereas I want to simply reinterpret the data in a as being uint32, so as not to duplicate the memory. Similarly, using np.asarray() does not help.
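For reference, the copying can be confirmed with np.shares_memory. This is only a minimal sketch; the variable names b1, b2, b3 and c are illustrative and not part of the snippets above.

```python
import numpy as np

a = np.random.randint(0, 100, 1000, dtype=np.int32)

# Converting to a different dtype allocates a new buffer,
# with or without copy=False.
b1 = a.astype(np.uint32)
b2 = a.astype(np.uint32, copy=False)
b3 = np.asarray(a, dtype=np.uint32)
print(np.shares_memory(a, b1))  # False
print(np.shares_memory(a, b2))  # False
print(np.shares_memory(a, b3))  # False

# copy=False only avoids the copy when no conversion is needed at all.
c = a.astype(np.int32, copy=False)
print(c is a)  # True
```

So copy=False does not mean "never copy"; it only skips the copy when the requested dtype already matches.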
What does work is
a.dtype = np.uint32
which simply changes the dtype without altering the data at all. Here's a striking example:
import numpy as np
a = np.array([-1, 0, 1, 2], dtype=np.int32)
print(a)
a.dtype = np.uint32
print(a) # shows "overflow", which is what I want
My questions are about the solution of simply overwriting the dtype of the array:
- Is this legitimate? Can you point me to where this feature is documented?
- Does it in fact leave the data of the array untouched, i.e. no duplication of the data?
- What if I want two arrays a and b sharing the same data, but view it as different dtypes? I've found the following to work, but again I'm concerned if this is really OK to do:
  import numpy as np
  a = np.array([0, 1, 2, 3], dtype=np.int32)
  b = a.view(np.uint32)
  print(a) # [0 1 2 3]
  print(b) # [0 1 2 3]
  a[0] = -1
  print(a) # [-1 1 2 3]
  print(b) # [4294967295 1 2 3]
  Though this seems to work, I find it weird that the underlying data of the two arrays does not seem to be located the same place in memory:
  print(a.data)
  print(b.data)
  Actually, it seems that the above gives different results each time it is run, so I don't understand what's going on there at all.
- This can be extended to other dtypes, the most extreme of which is probably mixing 32 and 64 bit floats:
  import numpy as np
  a = np.array([0, 1, 2, np.pi], dtype=np.float32)
  b = a.view(np.float64)
  print(a) # [0. 1. 2. 3.1415927]
  print(b) # [0.0078125 50.12387848]
  b[0] = 8
  print(a) # [0. 2.5 2. 3.1415927]
  print(b) # [8. 50.12387848]
  Again, is this condoned, if the obtained behaviour is really what I'm after?
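To make the float32/float64 case a bit more concrete, here is a minimal sketch that looks at sizes and raw bytes rather than at the printed values. The array named odd is a hypothetical extra, added only to show what happens when the byte counts do not line up.

```python
import numpy as np

a = np.array([0, 1, 2, np.pi], dtype=np.float32)
b = a.view(np.float64)

# Four 4-byte elements become two 8-byte elements over the same 16 bytes.
print(a.nbytes, b.nbytes)  # 16 16
print(b.shape)             # (2,)

# Each float64 is built from the raw bytes of two adjacent float32 values.
print(b[:1].tobytes() == a[:2].tobytes())  # True

# The total size in bytes must be divisible by the new item size.
odd = np.zeros(3, dtype=np.float32)  # 12 bytes, not a multiple of 8
try:
    odd.view(np.float64)
except ValueError as exc:
    print("cannot view:", exc)
```

The values shown by b look arbitrary because the sign, exponent and significand bits of a float64 are read from bytes that were written as two separate float32 numbers.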
You can check with np.shares_memory(a, b) whether the memory is shared. In all of your examples it is shared.
[...] float32 concatenated to float64, mixing up significand and exponent bits).
Here is more information about the data attribute.
The data_buffer of an array is a 1d C array of bytes. As long as the dtype is compatible in size (multiples of 2, 4 or whatever), you can view it in different ways. Just looking at the view doesn't harm anything, though the display might not make sense. Using it to change values may produce unexpected results, such as the negative int32 viewed as uint32.
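As a rough sketch of how the sharing could be verified for the int32/uint32 example: the names addr_a and addr_b are illustrative, and reading the buffer address from `__array_interface__` is just one way to get at it.

```python
import numpy as np

a = np.array([0, 1, 2, 3], dtype=np.int32)
b = a.view(np.uint32)

print(np.shares_memory(a, b))  # True
print(b.base is a)             # True: b is a view onto a's buffer

# Address of the first byte of each array's buffer.
addr_a = a.__array_interface__["data"][0]
addr_b = b.__array_interface__["data"][0]
print(addr_a == addr_b)        # True

# The raw bytes are identical; only their interpretation differs.
a[0] = -1
print(a.tobytes() == b.tobytes())  # True
```

As far as I understand it, print(a.data) shows the repr of a memoryview object, and the address in that repr belongs to the Python wrapper object rather than to the array's buffer. That would explain why it differs between a and b and between runs, even though the underlying memory is the same.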