9

I'd like to 'shear' a numpy array. I'm not sure I'm using the term 'shear' correctly; by shear, I mean something like:

Shift the first column by 0 places
Shift the second column by 1 place
Shift the third colum by 2 places
etc...

So this array:

array([[11, 12, 13],
       [17, 18, 19],
       [35, 36, 37]])

would turn into either this array:

array([[11, 36, 19],
       [17, 12, 37],
       [35, 18, 13]])

or something like this array:

array([[11,  0,  0],
       [17, 12,  0],
       [35, 18, 13]])

depending on how we handle the edges. I'm not too particular about edge behavior.

Here's my attempt at a function that does this:

import numpy

def shear(a, strength=1, shift_axis=0, increase_axis=1, edges='clip'):
    strength = int(strength)
    shift_axis = int(shift_axis)
    increase_axis = int(increase_axis)
    if shift_axis == increase_axis:
        raise UserWarning("Shear can't shift in the direction it increases")
    temp = numpy.zeros(a.shape, dtype=int)
    indices = []
    for d, num in enumerate(a.shape):
        coords = numpy.arange(num)
        shape = [1] * len(a.shape)
        shape[d] = num
        coords = coords.reshape(shape) + temp
        indices.append(coords)
    indices[shift_axis] -= strength * indices[increase_axis]
    if edges == 'clip':
        indices[shift_axis][indices[shift_axis] < 0] = -1
        indices[shift_axis][indices[shift_axis] >= a.shape[shift_axis]] = -1
        res = a[indices]
        res[indices[shift_axis] == -1] = 0
    elif edges == 'roll':
        indices[shift_axis] %= a.shape[shift_axis]
        res = a[indices]
    return res

if __name__ == '__main__':
    a = numpy.random.random((3,4))
    print a
    print shear(a)

It seems to work. Please tell me if it doesn't!

It also seems clunky and inelegant. Am I overlooking a builtin numpy/scipy function that does this? Is there a cleaner/better/more efficient way to do this in numpy? Am I reinventing the wheel?

EDIT:
Bonus points if this works on an N-dimensional array, instead of just the 2D case.

This function will be at the very center of a loop I'll repeat many times in our data processing, so I suspect it's actually worth optimizing.

SECOND EDIT: I finally did some benchmarking. It looks like numpy.roll is the way to go, despite the loop. Thanks, tom10 and Sven Marnach!

Benchmarking code: (run on Windows, don't use time.clock on Linux I think)

import time, numpy

def shear_1(a, strength=1, shift_axis=0, increase_axis=1, edges='roll'):
    strength = int(strength)
    shift_axis = int(shift_axis)
    increase_axis = int(increase_axis)
    if shift_axis == increase_axis:
        raise UserWarning("Shear can't shift in the direction it increases")
    temp = numpy.zeros(a.shape, dtype=int)
    indices = []
    for d, num in enumerate(a.shape):
        coords = numpy.arange(num)
        shape = [1] * len(a.shape)
        shape[d] = num
        coords = coords.reshape(shape) + temp
        indices.append(coords)
    indices[shift_axis] -= strength * indices[increase_axis]
    if edges == 'clip':
        indices[shift_axis][indices[shift_axis] < 0] = -1
        indices[shift_axis][indices[shift_axis] >= a.shape[shift_axis]] = -1
        res = a[indices]
        res[indices[shift_axis] == -1] = 0
    elif edges == 'roll':
        indices[shift_axis] %= a.shape[shift_axis]
        res = a[indices]
    return res

def shear_2(a, strength=1, shift_axis=0, increase_axis=1, edges='roll'):
    indices = numpy.indices(a.shape)
    indices[shift_axis] -= strength * indices[increase_axis]
    indices[shift_axis] %= a.shape[shift_axis]
    res = a[tuple(indices)]
    if edges == 'clip':
        res[indices[shift_axis] < 0] = 0
        res[indices[shift_axis] >= a.shape[shift_axis]] = 0
    return res

def shear_3(a, strength=1, shift_axis=0, increase_axis=1):
    if shift_axis > increase_axis:
        shift_axis -= 1
    res = numpy.empty_like(a)
    index = numpy.index_exp[:] * increase_axis
    roll = numpy.roll
    for i in range(0, a.shape[increase_axis]):
        index_i = index + (i,)
        res[index_i] = roll(a[index_i], i * strength, shift_axis)
    return res

numpy.random.seed(0)
for a in (
    numpy.random.random((3, 3, 3, 3)),
    numpy.random.random((50, 50, 50, 50)),
    numpy.random.random((300, 300, 10, 10)),
    ):
    print 'Array dimensions:', a.shape
    for sa, ia in ((0, 1), (1, 0), (2, 3), (0, 3)):
        print 'Shift axis:', sa
        print 'Increase axis:', ia
        ref = shear_1(a, shift_axis=sa, increase_axis=ia)
        for shear, label in ((shear_1, '1'), (shear_2, '2'), (shear_3, '3')):
            start = time.clock()
            b = shear(a, shift_axis=sa, increase_axis=ia)
            end = time.clock()
            print label + ': %0.6f seconds'%(end-start)
            if (b - ref).max() > 1e-9:
                print "Something's wrong."
        print

5 Answers 5

9

numpy roll does this. For example, if you original array is x then

for i in range(x.shape[1]):
    x[:,i] = np.roll(x[:,i], i)

produces

[[11 36 19]
 [17 12 37]
 [35 18 13]]
Sign up to request clarification or add additional context in comments.

2 Comments

Yeah, roll is good stuff; this is certainly elegant. I suspect the loop will be slower than Sven's suggestion, but I shouldn't jump to conclusions without benchmarking. I'd have to think about how to extend this to the N-dimensional case.
@Andrew: Note that this solution uses less memory than mine.
8

The approach in tom10's answer can be extended to arbitrary dimensions:

def shear3(a, strength=1, shift_axis=0, increase_axis=1):
    if shift_axis > increase_axis:
        shift_axis -= 1
    res = numpy.empty_like(a)
    index = numpy.index_exp[:] * increase_axis
    roll = numpy.roll
    for i in range(0, a.shape[increase_axis]):
        index_i = index + (i,)
        res[index_i] = roll(a[index_i], -i * strength, shift_axis)
    return res

Comments

8

This can be done using a trick described in this answer by Joe Kington:

from numpy.lib.stride_tricks import as_strided
a = numpy.array([[11, 12, 13],
                 [17, 18, 19],
                 [35, 36, 37]])
shift_axis = 0
increase_axis = 1
b = numpy.vstack((a, a))
strides = list(b.strides)
strides[increase_axis] -= strides[shift_axis]
strides = (b.strides[0], b.strides[1] - b.strides[0])
as_strided(b, shape=b.shape, strides=strides)[a.shape[0]:]
# array([[11, 36, 19],
#        [17, 12, 37],
#        [35, 18, 13]])

To get "clip" instead of "roll", use

b = numpy.vstack((numpy.zeros(a.shape, int), a))

This is probably the most efficient way of doing it, since it does not use any Python loop at all.

7 Comments

Very nice! I'll try this out. Not to push my luck, but is this easy to extend to an N-dimensional array? I'll edit my question, too.
@Andrew: edited my answer to include shift_axis and increase_axis. Hope I got it right...
@Andrew: My bad -- the vstack() call needs to be replaced by a concatenate() along the correct axis -- unfortunately I have no time to look into this at the moment...
I'll keep poking at it. I really like the as_strided approach, I'm just having trouble with the edge conditions. The buffer works as long as the 'shear strength' is 1, but for larger shears I think I end up stepping into bad places in memory.
@Andrew: Added two more answers. If you do any benchmarking, please let me know the results!
|
2

Here is a cleaned-up version of your own approach:

def shear2(a, strength=1, shift_axis=0, increase_axis=1, edges='clip'):
    indices = numpy.indices(a.shape)
    indices[shift_axis] -= strength * indices[increase_axis]
    indices[shift_axis] %= a.shape[shift_axis]
    res = a[tuple(indices)]
    if edges == 'clip':
        res[indices[shift_axis] < 0] = 0
        res[indices[shift_axis] >= a.shape[shift_axis]] = 0
    return res

The main difference is that it uses numpy.indices() instead of rolling your own version of this.

1 Comment

Wow, I'm glad to know about numpy.indices(), that's much cleaner.
0
r = lambda l, n: l[n:]+l[:n]

transpose(map(r, transpose(a), range(0, len(a)))

I think. You should probably consider this psuedocode more than actual Python. Basically transpose the array, map a general rotate function over it to do the rotation, then transpose it back.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.