
I want to interpolate along one axis of data inside a 3-dimensional array. The given x-values differ slightly between the individual rows, but they should all be mapped onto the same target x-values.

Since the given x-values are not identical, I currently do the following:

import numpy as np
from scipy import interpolate

# Build per-row x-grids: a common 0..100 grid shifted by a small per-row offset.
axes_have = np.ones((2, 72, 2001))
axes_have *= np.linspace(0, 100, 2001)[None, None, :]
axes_have += np.linspace(-0.3, 0.3, 144).reshape((2, 72))[:, :, None]

# Sample data on those slightly shifted grids.
arr = np.sin(axes_have)
arr *= np.random.random((2, 72))[:, :, None]

# Common target grid that every row should be mapped onto.
axis_want = np.linspace(0, 100, 201)

arr_ip = np.zeros((2, 72, 201))
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        ip_func = interpolate.PchipInterpolator(axes_have[i, j, :], arr[i, j, :], extrapolate=True)
        arr_ip[i, j, :] = ip_func(axis_want)

Using two nested for-loops is unsurprisingly very slow.

Is there a way to improve the speed, perhaps with some NumPy array magic or parallelization?

  • can you add a sample of arr? Commented Sep 6, 2017 at 21:08
  • There was an error in my example; that is fixed now. arr should now be given. Commented Sep 7, 2017 at 5:35

1 Answer


I've done some initial testing, and it seems that the vast bulk of your time is spent constructing the actual interpolation functions rather than evaluating them. Vectorization alone doesn't look like it will speed things up much, but it does make the work easy to parallelize, which does yield a real speedup.
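To see where the time goes yourself, a micro-benchmark along these lines (just a sketch, timing a single representative row; not part of the original answer) compares the cost of building a PchipInterpolator against the cost of evaluating one:

import timeit

import numpy as np
from scipy import interpolate

x = np.linspace(0, 100, 2001)
y = np.sin(x)
xq = np.linspace(0, 100, 201)

# Cost of constructing the interpolant (derivative estimation over 2001 points).
build = timeit.timeit(lambda: interpolate.PchipInterpolator(x, y, extrapolate=True), number=100)

# Cost of evaluating an already-built interpolant at 201 target points.
f = interpolate.PchipInterpolator(x, y, extrapolate=True)
evaluate = timeit.timeit(lambda: f(xq), number=100)

print("build: {:.4f}s   evaluate: {:.4f}s".format(build, evaluate))

The build step should dominate by a wide margin, which is why parallelizing the construction is what helps. Here's an example: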

import numpy as np
from scipy import interpolate
import timeit
import multiprocessing


def myfunc(arg):
    # Build one PCHIP interpolant from an (x, y) pair of 1D arrays.
    x, y = arg
    return interpolate.PchipInterpolator(x, y, extrapolate=True)

# Note: on platforms that spawn worker processes instead of forking (e.g. Windows),
# the Pool should be created under an if __name__ == "__main__": guard.
p = multiprocessing.Pool(processes=8)
# Same setup as in the question.
axes_have = np.ones((2, 72, 2001))
axes_have *= np.linspace(0, 100, 2001)[None, None, :]
axes_have += np.linspace(-0.3, 0.3, 144).reshape((2, 72))[:, :, None]

arr = np.sin(axes_have)
arr *= np.random.random((2, 72))[:, :, None]

axis_want = np.linspace(0, 100, 201)

arr_ip = np.zeros((2, 72, 201))
s_time1 = timeit.default_timer()
# Original approach: build and evaluate one interpolant per (i, j) row in Python loops.
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        ip_func = interpolate.PchipInterpolator(axes_have[i, j, :],
                                                arr[i, j, :],
                                                extrapolate=True)
        arr_ip[i, j, :] = ip_func(axis_want)
elapsed1 = timeit.default_timer() - s_time1

s_time2 = timeit.default_timer()
# Flatten the first two axes into a list of 144 (x, y) row pairs.
flatten_have = [y for x in axes_have for y in x]
flatten_arr = [y for x in arr for y in x]
interp_time = timeit.default_timer()
# Build all interpolants in parallel; this is the expensive part.
funcs = p.map(myfunc, zip(flatten_have, flatten_arr))
interp_elapsed = timeit.default_timer() - interp_time
# Evaluate every interpolant on the common grid and restore the original shape.
arr_ip = np.asarray([func(axis_want) for func in funcs]).reshape(2, 72, 201)
elapsed2 = timeit.default_timer() - s_time2

print("Elapsed 1: {}".format(elapsed1))
print("Elapsed 2: {}".format(elapsed2))
print("Elapsed interp: {}".format(interp_elapsed))

Typical results follow. Note that the vectorized implementation on its own (without the parallelization) is almost exactly the same speed as the original, and that I have 2 cores, so your runtime should be roughly (original time) / (number of cores):

Elapsed 1: 10.4025919437
Elapsed 2: 5.11409401894
Elapsed interp: 5.10987687111

Don't get me wrong, there may be an algorithmic way to do this more quickly, but parallelization seems like the easiest route to an immediate speedup, since the problem is embarrassingly parallel.
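As one sketch of such an algorithmic shortcut (my own assumption, not part of the original answer): if every row's grid is uniformly spaced, as it is in the example data, the whole interpolation can be vectorized with plain NumPy indexing. This swaps PCHIP's shape-preserving cubic for simple linear interpolation, so it only fits if linear accuracy is acceptable:

import numpy as np

axes_have = np.ones((2, 72, 2001))
axes_have *= np.linspace(0, 100, 2001)[None, None, :]
axes_have += np.linspace(-0.3, 0.3, 144).reshape((2, 72))[:, :, None]
arr = np.sin(axes_have) * np.random.random((2, 72))[:, :, None]
axis_want = np.linspace(0, 100, 201)

x0 = axes_have[:, :, :1]            # per-row grid origin, shape (2, 72, 1)
dx = axes_have[:, :, 1:2] - x0      # per-row grid spacing, shape (2, 72, 1)

# Fractional index of every target point on every row's grid, shape (2, 72, 201).
t = (axis_want[None, None, :] - x0) / dx
idx = np.clip(np.floor(t).astype(int), 0, arr.shape[-1] - 2)
frac = t - idx

# Gather the two bracketing samples per target point and blend them linearly.
# Points outside a row's grid are extrapolated linearly from the edge segment.
lo = np.take_along_axis(arr, idx, axis=-1)
hi = np.take_along_axis(arr, idx + 1, axis=-1)
arr_ip_lin = lo + frac * (hi - lo)

Because no Python-level loop or interpolant construction is involved, this should be orders of magnitude faster on data of this size; note that np.take_along_axis requires NumPy 1.15 or later.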
