I'm saving numpy arrays while trying to use as little disk space as possible. Along the way I realized that saving a boolean numpy array does not improve disk usage compared to a uint8 array. Is there a reason for that or am I doing something wrong here?

Here is a minimal example:

import sys
import numpy as np

rand_array = np.random.randint(0, 2, size=(100, 100), dtype=np.uint8)  # random array containing only 0s and 1s

array_uint8 = rand_array * 255  # array, type uint8

array_bool = np.array(rand_array, dtype=bool)  # array, type bool

print(f"size array uint8 {sys.getsizeof(array_uint8)}")
# ==> size array uint8 10120
print(f"size array bool {sys.getsizeof(array_bool)}")
# ==> size array bool 10120

np.save("array_uint8", array_uint8, allow_pickle=False, fix_imports=False)
# size in fs: 10128
np.save("array_bool", array_bool, allow_pickle=False, fix_imports=False)
# size in fs: 10128
1
  • Nope. Numpy doesn't store boolean arrays as bitmaps. Just because. Commented Mar 17, 2022 at 16:17

1 Answer

The uint8 and bool data types both occupy one byte per element, so arrays of equal shape always occupy the same amount of memory, both in RAM and when written with np.save. If you want to reduce your footprint, you can pack the boolean values as bits into a uint8 array using numpy.packbits. This stores eight boolean values per byte, so the packed array is roughly one eighth the size of the original (see the numpy.packbits documentation).
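As a rough sketch of that approach, the snippet below packs a 100 x 100 boolean array and restores it again. The variable names and the seeded generator are only illustrative, and it assumes NumPy 1.17 or newer for the `count` argument of `numpy.unpackbits`.

```python
import numpy as np

# Illustrative input: a random 100 x 100 boolean array (seed chosen arbitrarily).
rng = np.random.default_rng(0)
array_bool = rng.integers(0, 2, size=(100, 100)).astype(bool)

# Pack 8 boolean values into each byte. With axis=None the input is flattened first.
packed = np.packbits(array_bool, axis=None)

print(f"bytes as bool:   {array_bool.nbytes}")  # 10000
print(f"bytes as packed: {packed.nbytes}")      # 1250

# Unpack again. count trims any padding bits added when the number of
# elements is not a multiple of 8; the original shape must be kept separately.
restored = (
    np.unpackbits(packed, count=array_bool.size)
    .reshape(array_bool.shape)
    .astype(bool)
)

assert np.array_equal(restored, array_bool)
```

The packed array can be written with np.save like any other uint8 array. Note that packbits discards the original shape, so you need to store it alongside the data (or know it in advance) to reconstruct the array.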
