Check for invariant columns in a numpy array

Question

I have a large two-dimensional NumPy array a of characters (dtype='a1') and want to find the invariant columns, i.e. the columns that contain the same character in every row. The following code works but is quite slow.

import numpy as np

var_col = np.zeros(a.shape[1], dtype=bool)  # True marks a column that varies
for c in range(a.shape[1]):  # xrange on Python 2
    if not all(a[:, c] == a[0, c]):
        var_col[c] = True

Is there a faster solution to this problem? Thanks!

Warren Weckesser · Accepted Answer · 2014-03-26 12:46:43Z

Here's one way, using broadcasting with the == operator.

First create a test array.

In [27]: np.random.seed(1)

In [28]: a = np.random.choice(list("AABC"), size=(3,9))

In [29]: a
Out[29]: 
array([['A', 'C', 'A', 'A', 'C', 'A', 'C', 'A', 'C'],
       ['A', 'A', 'A', 'A', 'C', 'A', 'A', 'B', 'A'],
       ['B', 'A', 'B', 'A', 'B', 'A', 'C', 'A', 'B']], 
      dtype='|S1')

Compare each element to the element at the top of its column. a[0] is the first row, a 1-D array with shape (9,). When == is applied to a and a[0], NumPy "broadcasts" a[0] so that it behaves like an array of shape (3,9) filled with copies of the first row.

In [30]: a == a[0]
Out[30]: 
array([[ True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True, False,  True,  True,  True,  True, False, False, False],
       [False, False, False,  True, False,  True,  True,  True, False]], dtype=bool)
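To make the broadcasting step explicit, here is a minimal sketch that builds the expanded first row with np.broadcast_to and checks that it gives the same comparison as the implicit form. The small array and the variable names (first_row, expanded, implicit, explicit) are illustrative assumptions, not part of the answer above.

```python
import numpy as np

# Small hand-written example array (illustrative data, not the array from the answer).
a = np.array([list("ACA"),
              list("ABA"),
              list("ACA")], dtype="S1")

first_row = a[0]                                # 1-D, shape (3,)
expanded = np.broadcast_to(first_row, a.shape)  # read-only view with shape (3, 3)

implicit = (a == first_row)  # NumPy broadcasts first_row automatically
explicit = (a == expanded)   # same comparison with the broadcast spelled out

assert expanded.shape == a.shape
assert np.array_equal(implicit, explicit)
```

np.broadcast_to returns a view rather than a copy, so spelling out the broadcast this way should not allocate a full-size duplicate of the first row.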

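Now apply np.all along the first axis (axis=0) of the comparison result, so that each column is reduced to a single value that is True only when every entry matches the top of its column.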

In [31]: np.all(a == a[0], axis=0)
Out[31]: array([False, False, False,  True, False,  True, False, False, False], dtype=bool)
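Note that this result is True for invariant columns, whereas var_col in the question is True for columns that vary, so the two are logical negations of each other. The following sketch wraps the one-liner in a function and checks it against the loop from the question. The function names invariant_columns and variable_columns_loop and the test array are hypothetical, chosen only for illustration.

```python
import numpy as np

def invariant_columns(arr):
    """Return a boolean mask that is True for columns whose entries are all equal."""
    arr = np.asarray(arr)
    # Compare every row to the first row (broadcast), then reduce down each column.
    return np.all(arr == arr[0], axis=0)

def variable_columns_loop(arr):
    """Loop-based version from the question; True marks a column that varies."""
    var_col = np.zeros(arr.shape[1], dtype=bool)
    for c in range(arr.shape[1]):
        if not np.all(arr[:, c] == arr[0, c]):
            var_col[c] = True
    return var_col

# Illustrative data: the last two columns are constant, the first three are not.
a = np.array([list("ACAAC"),
              list("AAAAC"),
              list("BABAC")], dtype="S1")

invariant = invariant_columns(a)
variable = ~invariant  # same meaning as var_col in the question

assert np.array_equal(variable, variable_columns_loop(a))
```

The sketch assumes the array has at least one row, since arr[0] raises an IndexError on an array with zero rows.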

edited Mar 26, 2014 at 12:46

answered Mar 26, 2014 at 12:41


1 Comment

huckleg Over a year ago

great 'vectorized' solution! That's what I was looking for, thanks!
