This is not easy to vectorize further (as far as I can see), unless id has some structure. Otherwise a bottleneck might be doing id == dummy over and over; the only alternative I can think of uses sorting and, due to the lack of a reduce functionality for np.max(), still requires quite a bit of Python code (Edit: there actually is a reduce function available through np.fmax). This is about 3x faster for x being 1000x1000 and id/ids lying in 0..100, but as it is rather complex, it is only worth it for larger problems with many ids:
    import numpy as np

    def max_at_ids(x, id, ids):
        # create 1D views of x and id:
        r_x = x.ravel()
        r_id = id.ravel()
        sorter = np.argsort(r_id)
        # create new sorted arrays:
        r_id = r_id[sorter]; r_x = r_x[sorter]
        # unfortunately there is no reduce functionality for np.max...
        ids = np.unique(ids)  # create a sorted, unique copy, just in case
        # w gives the places where the sorted r_id changes:
        w = np.where(r_id[:-1] != r_id[1:])[0] + 1
I originally offered this solution, which does a pure Python loop over the slices, but a shorter (and faster) version follows in the edit below:
        # the result array:
        max_x = np.empty(len(ids), dtype=r_x.dtype)
        start_idx = 0
        end_idx = w[0] if len(w) else None  # None covers the single-slice case
        i_ids = 0
        i_w = 0
        while i_ids < len(ids) and i_w < len(w) + 1:
            if ids[i_ids] == r_id[start_idx]:
                max_x[i_ids] = r_x[start_idx:end_idx].max()
                i_ids += 1
                i_w += 1
            elif ids[i_ids] > r_id[start_idx]:
                i_w += 1
            else:
                i_ids += 1
                continue  # skip updating start_idx/end_idx
            start_idx = end_idx
            # set end_idx to None for the last slice (might be faster to do differently)
            end_idx = w[i_w] if i_w < len(w) else None
        return ids, max_x
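For reference, a quick usage sketch with made-up data (not from the original; it just sanity-checks the result against the naive per-id maximum):

    x = np.random.rand(100, 100)
    id = np.random.randint(0, 10, size=x.shape)
    ids, max_x = max_at_ids(x, id, np.arange(10))
    # compare against computing x[id == i].max() for every i directly:
    assert all(max_x[i] == x[id == i].max() for i in ids)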
EDIT: improved version for the calculation of the maximum of each slice:

There is a way to remove the Python loop by using np.fmax.reduceat, which might outperform the previous version a lot if the slices are small (and is actually quite elegant):
        # prepend 0 to w so that the first slice is included
        # (or calculate the first slice by hand and use the out=... keyword
        # argument to avoid even this copy)
        w = np.concatenate(([0], w))
        max_x = np.fmax.reduceat(r_x, w)
        return ids, max_x
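To illustrate the reduceat semantics with a small standalone example (made-up numbers): np.fmax.reduceat(a, w) reduces each slice a[w[i]:w[i+1]], with the last slice running to the end of a.

    a = np.array([3., 1., 4., 1., 5., 9., 2., 6.])
    w = np.array([0, 3, 5])
    np.fmax.reduceat(a, w)  # array([4., 5., 9.]) = max of a[0:3], a[3:5], a[5:]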
There are probably some small things that could make this a little faster still. If id/ids has some structure, it should be possible to simplify the code and maybe use a different approach to achieve a much larger speedup (see the sketch below). Otherwise the speedup of this code should be large, as long as there are many (unique) ids (and the x/id arrays are not very small). Note that the code enforces np.unique(ids), which is probably a good assumption anyway.
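As a hedged sketch of the structured case (not part of the solution above): if id is known to contain only integers in 0..n-1, ufunc.at (NumPy >= 1.8) makes the whole thing a one-liner, though ufunc.at is often slow in practice, so this buys simplicity more than speed:

    def max_at_small_ids(x, id, n):
        # assumes id contains only integers in 0..n-1
        max_x = np.full(n, -np.inf)
        np.fmax.at(max_x, id.ravel(), x.ravel())
        return np.arange(n), max_x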