0

I need to generate a 3xn matrix with random columns, ensuring that no column contains the same number more than once. I am currently using the code below:

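import numpy as np

n = 10
values = np.arange(0, 10)  # renamed from `set` to avoid shadowing the built-in
# Note: one initial column plus n appended columns gives n + 1 columns in total.
matrix = np.random.choice(values, size=3, replace=False)[:, None]
for i in range(n):
    column = np.random.choice(values, size=3, replace=False)[:, None]
    matrix = np.concatenate((matrix, column), axis=1)
print(matrix)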

which gives the output I expected:

[[2 1 7 2 1 9 7 4 5 2 7]
 [4 6 3 5 9 8 1 3 8 4 0]
 [3 5 0 0 4 5 4 0 2 5 3]]

However, the code is not fast enough. I am aware that implementing the for loop in Cython might help, but I would like to know whether there is a more performant way to write this code purely in Python.

2
  • How large will n be in your actual application? Commented Aug 26, 2016 at 14:08
  • Don't do matrix = np.concatenate((matrix, column),axis=1) inside the loop. Build a python list of columns, and then convert the list of columns to an array after the loop. Appending to a python list is much more efficient than repeatedly concatenating numpy arrays. Commented Aug 26, 2016 at 14:10
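
The suggestion in the second comment can be sketched roughly as follows. This is only a minimal illustration, assuming NumPy is imported as np; the helper name random_columns_via_list and its parameters high and k are hypothetical and do not appear in the original post.

```python
import numpy as np


def random_columns_via_list(n, high=10, k=3):
    """Build a k x n matrix whose columns each hold k distinct values from range(high).

    Hypothetical helper: columns are collected in a Python list and
    converted to an array once, instead of concatenating inside the loop.
    """
    columns = []
    for _ in range(n):
        # Each draw is without replacement, so a column never repeats a value.
        columns.append(np.random.choice(high, size=k, replace=False))
    # Stack the 1-D columns side by side into a (k, n) array.
    return np.column_stack(columns)


matrix = random_columns_via_list(10)
print(matrix.shape)  # (3, 10)
```

The list append is cheap, and the array is allocated only once at the end, which is the point the comment makes.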

2 Answers

1

You can speed it up further with Python's random module (probably due to this issue):

import random
import numpy as np

np.array([random.sample(range(10), 3) for _ in range(n)]).T

n = 10**6

%timeit t = np.array([random.sample(range(10), 3) for _ in range(n)]).T
1 loop, best of 3: 6.25 s per loop

%%timeit
matrix = np.empty((3, n), dtype=int)  # np.int was removed in NumPy 1.24; the built-in int is equivalent
for i in range(n):
    matrix[:, i] = np.random.choice(10, size=3, replace=False)
1 loop, best of 3: 19.3 s per loop
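
The timings above use IPython's %timeit magic. Outside IPython, a similar comparison could be run with the standard timeit module. The sketch below is only an assumption about how one might set that up: the function names are hypothetical, n is reduced so it finishes quickly, and the numbers it prints will differ from machine to machine.

```python
import random
import timeit

import numpy as np

n = 10**4  # smaller than the 10**6 used above so the sketch finishes quickly


def with_random_sample():
    # random.sample draws 3 distinct values per column; transpose to get 3 x n.
    return np.array([random.sample(range(10), 3) for _ in range(n)]).T


def with_numpy_choice():
    # Preallocate the result and fill one column per iteration.
    matrix = np.empty((3, n), dtype=int)
    for i in range(n):
        matrix[:, i] = np.random.choice(10, size=3, replace=False)
    return matrix


for func in (with_random_sample, with_numpy_choice):
    # Best of 3 runs, one call per run; absolute numbers depend on the machine.
    best = min(timeit.repeat(func, number=1, repeat=3))
    print(f"{func.__name__}: {best:.3f} s")
```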

0

As already mentioned in the comments, repeatedly concatenating to a NumPy array is a bad idea, because every call allocates a new array and copies the old data into it. Since you already know the final size of the result array, you can simply allocate it at the beginning and then fill in the columns one by one:

matrix = np.empty((3, n), dtype=int)  # np.int was removed in NumPy 1.24; the built-in int is equivalent
for i in range(n):
    matrix[:, i] = np.random.choice(10, size=3, replace=False)

At least on my machine, this is already 6 times faster than your version.
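
If the Python-level loop itself becomes the bottleneck, one loop-free alternative is to draw a matrix of random keys and rank them per column. This technique is not from the answers above and is offered only as a sketch: argsorting independent uniform keys gives a random permutation of 0..high-1 in each column, so the first k rows are k distinct values. The function name and the parameters high and k are hypothetical, and the sketch assumes the values to draw from are exactly 0..high-1.

```python
import numpy as np


def random_columns_vectorized(n, high=10, k=3):
    """Return a k x n integer array; each column holds k distinct values from range(high)."""
    # One random key per (value, column) pair.
    keys = np.random.rand(high, n)
    # Sorting each column's keys yields a permutation of 0..high-1 for that column;
    # the first k rows of that permutation are k distinct values.
    return np.argsort(keys, axis=0)[:k]


matrix = random_columns_vectorized(10)
print(matrix.shape)  # (3, 10)
```

The trade-off is memory: the intermediate key matrix has high x n entries, so this is attractive mainly when high is small, as in the question.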
