Short answer:
- You can unwrap 0-d results into scalars, while keeping n-d results (n > 0), by indexing with an empty tuple ().
- Better yet, I would try to avoid using @np.vectorize altogether – in general, but in particular with your given example, where vectorization is not necessary.
Long answer:
Following these answers to related questions, by indexing with an empty tuple (), you can systematically unwrap 0-d arrays into scalars while keeping other arrays.
So, using the rescale() function from your question (decorated with @np.vectorize), you can post-process its results accordingly, for example:
with_scalar_input = rescale(5, (0, 10))[()]
with_vector_input = rescale([5], (0, 10))[()]
print(type(with_scalar_input)) # <class 'numpy.float64'>
print(type(with_vector_input)) # <class 'numpy.ndarray'>
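To see the mechanism in isolation: indexing with an empty tuple unwraps a 0-d array into its scalar element, while it is essentially a no-op (a view) on arrays of one or more dimensions. A minimal demonstration, independent of rescale():

```python
import numpy as np

# Empty-tuple indexing unwraps 0-d arrays into NumPy scalars ...
print(type(np.array(0.5)[()]))    # <class 'numpy.float64'>

# ... but leaves arrays of dimension >= 1 as arrays:
print(type(np.array([0.5])[()]))  # <class 'numpy.ndarray'>
```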
I am not aware of any built-in NumPy mechanism that solves this edge case of @np.vectorize for you, so providing your own decorator is probably a viable way to go.
Custom scalar-unwrapping @vectorize decorator
Writing your own custom decorator that (a) accepts all arguments of @np.vectorize and behaves exactly like it, but (b) appends the scalar-unwrapping step, could look as follows:
from functools import wraps
import numpy as np

def vectorize(*wa, **wkw):
    def decorator(f):
        @wraps(f)
        def wrap(*fa, **fkw):
            return np.vectorize(f, *wa, **wkw)(*fa, **fkw)[()]
        return wrap
    return decorator
@vectorize(excluded=(1, 2))
def rescale(value, srcRange, dstRange=(0, 1)):
    srcMin, srcMax = srcRange
    dstMin, dstMax = dstRange
    t = (value - srcMin) / (srcMax - srcMin)
    return dstMin + t * (dstMax - dstMin)

with_scalar_input = rescale(5, (0, 10))
with_vector_input = rescale([5], (0, 10))
print(type(with_scalar_input))  # <class 'numpy.float64'>
print(type(with_vector_input))  # <class 'numpy.ndarray'>
If you don't care about docstring propagation (of which @functools.wraps takes care), the @vectorize decorator can be shortened to:
import numpy as np

vectorize = lambda *wa, **wkw: lambda f: lambda *fa, **fkw: \
    np.vectorize(f, *wa, **wkw)(*fa, **fkw)[()]

@vectorize(excluded=(1, 2))
def rescale(value, srcRange, dstRange=(0, 1)):
    srcMin, srcMax = srcRange
    dstMin, dstMax = dstRange
    t = (value - srcMin) / (srcMax - srcMin)
    return dstMin + t * (dstMax - dstMin)

with_scalar_input = rescale(5, (0, 10))
with_vector_input = rescale([5], (0, 10))
print(type(with_scalar_input))  # <class 'numpy.float64'>
print(type(with_vector_input))  # <class 'numpy.ndarray'>
Caution: All approaches using (), as proposed above, introduce a new edge case: if the input is provided as a 0-d NumPy array, such as np.array(5), the result will also be unwrapped into a scalar. Likewise, you might have noticed that the scalar results are NumPy scalars, <class 'numpy.float64'>, rather than native Python scalars, <class 'float'>. If either of these is not acceptable to you, more elaborate type checking or post-processing will be necessary.
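If you do need such post-processing, one conceivable approach is to inspect the original inputs before unwrapping. The helper below is a sketch of my own (unwrap() is a hypothetical name, not a NumPy function): it converts 0-d results to native Python scalars via .item(), but keeps the result as an array if any input was already a NumPy array:

```python
import numpy as np

def unwrap(result, *inputs):
    """Hypothetical helper: unwrap a 0-d result into a native Python scalar,
    unless one of the original inputs was already a NumPy array (in which
    case the 0-d array result is kept as-is)."""
    if np.ndim(result) == 0 and not any(isinstance(x, np.ndarray) for x in inputs):
        return np.asarray(result).item()  # native Python scalar
    return result

vectorized = np.vectorize(lambda v: v / 10)

print(type(unwrap(vectorized(5), 5)))                      # <class 'float'>
print(type(unwrap(vectorized(np.array(5)), np.array(5))))  # <class 'numpy.ndarray'> (0-d kept)
print(type(unwrap(vectorized([5]), [5])))                  # <class 'numpy.ndarray'>
```

Whether keeping 0-d array inputs intact like this is the right behavior depends on your use case.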
Try to avoid @np.vectorize altogether
As a final note: maybe avoid using @np.vectorize in the first place, and instead write your code such that it works with both NumPy arrays and scalars.
As to avoiding @np.vectorize: Its documentation states:
The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.
As to adjusting your code accordingly: your given function rescale() is a good example of code that works correctly with both NumPy arrays and scalars; in fact, it already does so, without any adjustments! You just have to ensure that vector-valued input is given as a NumPy array (rather than, say, a plain Python list or tuple):
import numpy as np

def rescale(value, srcRange, dstRange=(0, 1)):
    srcMin, srcMax = srcRange
    dstMin, dstMax = dstRange
    t = (value - srcMin) / (srcMax - srcMin)
    return dstMin + t * (dstMax - dstMin)

with_scalar_input = rescale(5, (0, 10))
with_vector_input = rescale(np.asarray([5]), (0, 10))
print(type(with_scalar_input))  # <class 'float'>
print(type(with_vector_input))  # <class 'numpy.ndarray'>
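If you would rather accept plain Python lists and tuples as well, a small variant (a sketch of my own, not part of the original code) could convert non-scalar input up front via np.asarray(), while leaving scalar input untouched:

```python
import numpy as np

def rescale(value, srcRange, dstRange=(0, 1)):
    # Sketch: convert sequence input to a NumPy array, leave scalars as-is,
    # so callers may pass plain lists/tuples as well.
    if np.ndim(value) > 0:
        value = np.asarray(value)
    srcMin, srcMax = srcRange
    dstMin, dstMax = dstRange
    t = (value - srcMin) / (srcMax - srcMin)
    return dstMin + t * (dstMax - dstMin)

print(type(rescale(5, (0, 10))))    # <class 'float'>
print(type(rescale([5], (0, 10))))  # <class 'numpy.ndarray'>
```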
Moreover, while producing exactly the same output for vector-type input¹, the @np.vectorize-decorated version is orders of magnitude slower:
import numpy as np
from timeit import Timer

def rescale(value, srcRange, dstRange=(0, 1)):
    srcMin, srcMax = srcRange
    dstMin, dstMax = dstRange
    t = (value - srcMin) / (srcMax - srcMin)
    return dstMin + t * (dstMax - dstMin)

vectorized = np.vectorize(rescale, excluded=(1, 2))

a = np.random.normal(size=10000)
assert (rescale(a, (0, 10)) == vectorized(a, (0, 10))).all()  # Same result?
print("Unvectorized:", Timer(lambda: rescale(a, (0, 10))).timeit(100))
print("Vectorized:", Timer(lambda: vectorized(a, (0, 10))).timeit(100))
On my machine, this produces about 0.003 seconds for the unvectorized version and about 0.8 seconds for the vectorized version.
In other words: we get a more than 250× speedup with the given, unvectorized function for a 10,000-element array. And, if used carefully (i.e. by providing NumPy arrays rather than plain Python sequences for vector-type inputs), the function already produces scalar outputs for scalar inputs and vector outputs for vector inputs!
The code above might not be the code that you are actually trying to vectorize; but in many cases, a similar approach is possible.
¹) Again, the case of a 0-d vector input is special here, but you might want to check that for yourself.