
I want to find the number of bits necessary to represent an unsigned NumPy integer (or each element in an array of integers) in binary, the same way Python's int.bit_length() does, but it seems that NumPy has no equivalent function.

For example:

>>> int(0b1000).bit_length()
4
>>> np.uint8(0b1000).bit_length()
AttributeError: 'numpy.uint8' object has no attribute 'bit_length'

Can anyone help me find the correct function? My current approach is to convert each array element to a Python int to find the bit length, which seems like an awful option for both speed and clarity:

np.vectorize(lambda np_int: int(np_int).bit_length())
  • From the Python-Reference docs it seems that bit_length() determines a unique value k such that 2**(k-1) <= abs(x) < 2**k. You could implement your own function to apply this to a Numpy integer as opposed to your conversion. The documentation link also offers a simple way to calculate k but it has a limitation. Hope this helps! Commented Jul 22, 2020 at 20:12
  • What's the use of such a value, or of such values for an array of numbers? You can't change the dtype to match. Commented Jul 22, 2020 at 20:39
  • @hpaulj It's part of an algorithm classifying numbers by contiguity of the bits, e.g. all the bits in 6 (0b110) are contiguous, but 5 (0b101) is discontiguous because there's a 0 between 1s (a vectorizable test for this is sketched below). Commented Jul 22, 2020 at 21:16
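As an aside, the contiguity test described in that comment can be vectorized without bit_length at all, using the classic lowest-set-bit trick. A minimal sketch (the helper name is mine, not from any library):

import numpy as np

def has_contiguous_bits(arr):
    # isolate the lowest set bit (two's-complement trick, written as
    # ~x + 1 rather than -x so it also works on unsigned dtypes)
    lowest = arr & (~arr + 1)
    # adding it to a contiguous run of 1s carries into a single power of
    # two, e.g. 0b0110 + 0b0010 == 0b1000; for 0b0101 it does not
    summed = arr + lowest
    # power-of-two (or zero) test; note this reports True for 0
    return (summed & (summed - 1)) == 0

>>> has_contiguous_bits(np.array([6, 5, 12, 10], dtype=np.uint64)).tolist()
[True, False, True, False]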

5 Answers


Using np.frexp(arr)[1] comes in 4 to 6× faster than np.ceil(np.log2(arr)).astype(int).

Note that, as pointed out by @GregoryMorse above, some additional work is needed to guarantee correct results for 64-bit inputs (bit_length3 below).

import numpy as np


def bit_length1(arr):
    # assert arr.max() < (1 << 53)
    # note: off by one for exact powers of two; see the comment below
    return np.ceil(np.log2(arr)).astype(int)

    
def bit_length2(arr):
    # assert arr.max() < (1<<53)
    return np.frexp(arr)[1]


def bit_length3(arr):  # 64-bit safe
    _, high_exp = np.frexp(arr >> 32)
    _, low_exp = np.frexp(arr & 0xFFFFFFFF)
    return np.where(high_exp, high_exp + 32, low_exp)

Performance results: 10 iterations on a 100,000-element array, via https://perfpy.com/868.
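As a quick sanity check (my own snippet, not part of the perf run), bit_length3 agrees with int.bit_length even beyond 2**53:

>>> vals = [1, 8, (1 << 53) + 1, (1 << 63) + 1]
>>> a = np.array(vals, dtype=np.uint64)
>>> bit_length3(a).tolist()
[1, 4, 54, 64]
>>> [v.bit_length() for v in vals]
[1, 4, 54, 64]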


1 Comment

Setting aside the rounding errors, bit_length1 will still fail for powers of 2; e.g. bit_length1(4) would yield 2 instead of 3. You can correct this by shifting all values up by 1, i.e. bit_length1 = lambda arr: np.ceil(np.log2(arr + 1)).astype(int)

You can take the ceiling of the log2 of your array.

import numpy as np

x = np.random.randint(0, 100, size=30)
x
# returns:
array([92,  7, 53, 24, 85, 53, 78, 52, 99, 91, 79, 40, 82, 34, 18, 26, 20,
        7, 47, 38, 78, 50, 15, 12, 54,  3, 91, 82, 22, 90])

np.ceil(np.log2(x)).astype(int)
# returns:
array([7, 3, 6, 5, 7, 6, 7, 6, 7, 7, 7, 6, 7, 6, 5, 5, 5, 3, 6, 6, 7, 6,
       4, 4, 6, 2, 7, 7, 5, 7])

4 Comments

Thanks, this looks great! I was a little concerned about the speed impact of a log function, but it actually improves performance over my vectorized function attempt by about 16x!
NumPy's built-in ufuncs will generally outperform an np.vectorize wrapper.
This method will not work for powers of two (Example: 16 -> 4). To fix that issue and the one mentioned by @Mikhail, use np.ceil(np.log2(x+1)).
This will only work for numbers up to 2^53, since np.log2 converts to double precision by default, which has a 53-bit mantissa. However, it should be reliable for unsigned numbers in that range. (A one-liner consolidating both fixes follows below.)
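Putting the two fixes from the comments together (my own consolidation; still limited to values below 2**53):

bit_length = lambda x: np.ceil(np.log2(x + 1)).astype(int)  # 0 -> 0, powers of two handled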

Try this:

np.uint8(0b1000).nbytes*8

where the *8 at the end is simply the number of bits per byte.

1 Comment

Thanks, but I'm looking for a function of the number itself, not of the minimum bytes used to store the number. E.g. 2 (0b10) is a 2-bit number, even though it has to take at least a byte in memory (0b00000010)

The classic bit-twiddling hack algorithm should be easily adaptable to numpy:

https://graphics.stanford.edu/~seander/bithacks.html#IntegerLog

Here is the variant there for 32-bit unsigned integers, adjusted by one so it returns the bit length rather than floor(log2):

def bit_length(v):
  # binary-search the highest set bit (this part computes floor(log2(v)))
  r =     (v > 0xFFFF) << 4; v >>= r
  shift = (v > 0xFF  ) << 3; v >>= shift; r |= shift
  shift = (v > 0xF   ) << 2; v >>= shift; r |= shift
  shift = (v > 0x3   ) << 1; v >>= shift; r |= shift
  # +1 for nonzero v, so bit_length(0b1000) == 4 and bit_length(0) == 0
  return (r | (v >> 1)) + (v > 0)
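For whole arrays, the main adaptation needed is casting the boolean masks to an integer dtype before shifting; a sketch under that assumption (my adaptation, for uint32 arrays):

import numpy as np

def bit_length_u32(v):
    v = v.astype(np.uint32)                  # work on a copy
    r = (v > 0xFFFF).astype(np.uint32) << 4; v >>= r
    s = (v > 0xFF).astype(np.uint32) << 3;   v >>= s; r |= s
    s = (v > 0xF).astype(np.uint32) << 2;    v >>= s; r |= s
    s = (v > 0x3).astype(np.uint32) << 1;    v >>= s; r |= s
    return (r | (v >> 1)) + (v > 0)          # +1 for nonzero inputs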

Given vectorization, for very large arrays and numbers up to the 2^53 mantissa limit of doubles, the floating-point log2 method may well perform better; it's not really slow. Otherwise NumPy would need to offer CLZ or CTZ functions, since count-leading-zeros (or count-trailing-zeros) is all that is happening here. Incidentally, floating-point log2 does little more than these same CPU-level operations, so it's probably not as inefficient as it might seem.



While NumPy still doesn't support bit_length, it did add bitwise_count in version 2.0.0 (https://github.com/numpy/numpy/blob/4816e4d31b2ce4ce5ca689cf9538e0c1aa41b3f2/doc/source/release/2.0.0-notes.rst#a-new-bitwise_count-function), analogous to the int.bit_count() that Python 3.10 introduced.

Both modern x86 and ARM architectures support a similar operation, through popcnt (SSE4.2) and cnt (ARMv8) respectively, with the former taking around 3 cycles; I'm not sure about the latter.

Whether either software version leverages this hardware acceleration will probably depend on how each is compiled.

Note that unlike the hardware counterparts, which simply interpret their arguments as unsigned ints, both the Python and NumPy variants take the absolute value of their arguments, something to keep in mind.
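A quick illustration of that absolute-value behavior (my own REPL snippet):

>>> (-6).bit_count()                        # Python counts the bits of abs(-6) == 0b110
2
>>> import numpy as np
>>> int(np.bitwise_count(np.int64(-6)))     # NumPy >= 2.0 likewise uses the absolute value
2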

Since count and length yield the same result for any int composed entirely of trailing set bits, we can implement length through count by first setting every bit below the highest set bit, i.e.:

from timeit import timeit

setup = """
import numpy as np

bit_length = lambda arr: np.frexp(arr)[1].astype(np.uint8)

def bit_length32(arr):
    # smear the highest set bit into every position below it...
    bits = arr >> 1
    bits |= arr
    bits |= bits >> 2
    bits |= bits >> 4
    bits |= bits >> 8
    bits |= bits >> 16
    # ...so the popcount of the result equals the bit length
    return np.bitwise_count(bits)


def bit_length64(arr):
    bits = arr >> 1
    bits |= arr
    bits |= bits >> 2
    bits |= bits >> 4
    bits |= bits >> 8
    bits |= bits >> 16
    bits |= bits >> 32  # one extra smear step for 64-bit inputs
    return np.bitwise_count(bits)

highs = np.random.randint(0, 64, 1024, dtype=np.uint64)
data_64 = np.random.randint(0, np.bitwise_left_shift(1, highs, out=highs), dtype=np.uint64)

exp = np.asarray(list(map(int.bit_length, data_64.tolist())), dtype=np.uint8)

# the frexp-based bit_length can be off by one for values >= 2**53 whose
# nearest double rounds up to a power of two; it happens to pass here
assert all(np.array_equal(exp, func(data_64))
    for func in (bit_length, bit_length64))

highs = np.random.randint(0, 32, 1024, dtype=np.uint32)
data_32 = np.random.randint(0, np.bitwise_left_shift(1, highs, out=highs), dtype=np.uint32)

exp = np.asarray(list(map(int.bit_length, data_32.tolist())), dtype=np.uint8)

assert all(np.array_equal(exp, func(data_32))
    for func in (bit_length, bit_length32))

"""

>>> timeit('bit_length(data_32)', setup)
6.119301603874192
>>> timeit('bit_length32(data_32)', setup)
10.614300672896206
>>> timeit('bit_length(data_64)', setup)
6.567314823856577
>>> timeit('bit_length64(data_64)', setup)
13.415966181084514
>>> import sys
>>> sys.version
'3.11.11 (main, Mar 29 2025, 01:50:58) [Clang 13.0.0 (clang-1300.0.29.30)]'
>>> import numpy as np
>>> np.__version__
'2.2.4'
>>> np.show_config()
...
SIMD Extensions:
  baseline:
  - SSE
  - SSE2
  - SSE3
  found:
  - SSSE3
  - SSE41
  - POPCNT
  - SSE42
  - AVX
  - F16C
  - FMA3
  - AVX2
...

So this approach takes roughly twice as long as the next fastest approach, but unlike np.frexp it is free of rounding errors; e.g. np.frexp(2**64 - 1)[1] yields 65 instead of the correct 64.

