Suppose you have the following section of MATLAB code, looping through and modifying index values within the matrix as it iterates:

x = zeros(parts,2);
for i = 1:parts
x(i,1) = (i-1)*L + 1;
x(i,2) = i*L;
end

Now suppose you are a Python noob, and have gotten this far:

v = np.zeros((parts,2))
for x in xrange(0,N1/L):

where parts and N1/L are predefined integer values. I've done some searching on indexing and for loops in Python, but I'm having difficulty understanding how to reference specific indices and modify them within the for loop. If someone could point me in the right direction for the next section of the code, that would be much appreciated.

1 Answer

A literal translation of the MATLAB code would be

import numpy as np
x = np.zeros((parts, 2))
for i in range(parts):
    x[i,0] = i*L + 1
    x[i,1] = (i+1)*L

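Note that MATLAB uses 1-based indexing while Python uses 0-based indexing. This accounts for the differences in where the 1's show up: the loop variable i now starts at 0 instead of 1, so (i-1)*L + 1 becomes i*L + 1 and i*L becomes (i+1)*L, and the column indices 1 and 2 become 0 and 1.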
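To make the index shift concrete, here is a small runnable sketch of the loop above. The values parts = 4 and L = 3 are picked only for illustration and do not come from the question.

```python
import numpy as np

parts, L = 4, 3  # illustrative values (assumption, not from the question)

x = np.zeros((parts, 2))
for i in range(parts):        # i runs 0..parts-1; in MATLAB it ran 1..parts
    x[i, 0] = i * L + 1       # MATLAB: x(i,1) = (i-1)*L + 1
    x[i, 1] = (i + 1) * L     # MATLAB: x(i,2) = i*L

print(x)  # each row holds the first and last 1-based position of one block of length L
```

Each row should come out as a (start, end) pair, with consecutive rows covering consecutive blocks of length L.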

However, when using NumPy you'll get much better performance if you avoid element-by-element modification of an array. Instead, you should seek to express the calculation in as few NumPy operators or function calls as you can, each of which acts on a whole array at once. By doing this, you off-load as much work as possible to NumPy's underlying fast C/Fortran-compiled function calls and reduce the amount of time spent executing slower Python code.

This usually means you want to avoid Python for-loops, since a loop implies there will be lots of Python statements to be executed.

So, for example, a better way to express the above calculation would be

x = np.zeros((parts, 2))
x[:, 0] = np.arange(1, parts*L, L)
x[:, 1] = x[:, 0] + L - 1

Notice that the values in x are filled in using just 2 assignments. Each assignment affects a whole column of x "all at once".
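One way to convince yourself that the two versions agree is to run both and compare the results. The sketch below does that; the function names loop_version and vectorized_version and the values parts = 5, L = 3 are hypothetical, chosen only for this check.

```python
import numpy as np

def loop_version(parts, L):
    # Element-by-element fill, as in the literal translation.
    x = np.zeros((parts, 2))
    for i in range(parts):
        x[i, 0] = i * L + 1
        x[i, 1] = (i + 1) * L
    return x

def vectorized_version(parts, L):
    # Whole-column assignments, as in the array-based version (assumes L > 1).
    x = np.zeros((parts, 2))
    x[:, 0] = np.arange(1, parts * L, L)
    x[:, 1] = x[:, 0] + L - 1
    return x

parts, L = 5, 3  # illustrative values (assumption)
loop_result = loop_version(parts, L)
vec_result = vectorized_version(parts, L)
print(np.array_equal(loop_result, vec_result))
```

np.array_equal returns True only if both arrays have the same shape and the same elements, so it is a convenient check when replacing a loop with array operations.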


To give a sense of how much difference array-based operations make, here is an (IPython) timeit test using parts = 10000, L = 3:

In [16]: %%timeit
   ....: x = np.zeros((parts, 2))
         x[:, 0] = np.arange(1, parts*L, L)
         x[:, 1] = x[:, 0] + L - 1
10000 loops, best of 3: 51.9 µs per loop

In [17]: %%timeit
   ....: x = np.zeros((parts, 2))
         for i in range(parts):
             x[i,0] = i*L + 1
             x[i,1] = (i+1)*L
100 loops, best of 3: 3.58 ms per loop

3 Comments

That kind of looping was also poor form in earlier versions of MATLAB.
This is very helpful and I thank you a lot. The only part I'm confused about is the parameters of np.arange. According to documentation, the second parameter is the 'stop' marker. Why parts*L and not just parts?
It's best to just try out np.arange(1, parts, L) versus np.arange(1, parts*L, L) in a Python interactive session, for concrete values of parts and L. I'm sure it will quickly become clear to you then. Like range, np.arange does not include the stop value, so the stop must be strictly greater than the last value you want, and no greater than the value one step after it. Since we want the last value to be (parts-1)*L + 1, a stop value of parts*L will do as long as L > 1, since we're incrementing with step size L. (For L = 1, parts*L equals the last value and would drop it; parts*L + 1 works for any L >= 1.) parts*L - L + 2 would also work, but that's unnecessarily complicated.
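For anyone who wants to run the experiment suggested in the comment, here is a minimal sketch. The values parts = 4 and L = 3 and the variable names are assumptions made for illustration.

```python
import numpy as np

parts, L = 4, 3  # illustrative values (assumption)

stop_too_small = np.arange(1, parts, L)            # stops long before the last start value
stop_parts_L = np.arange(1, parts * L, L)          # the stop value used in the answer
stop_minimal = np.arange(1, parts * L - L + 2, L)  # smallest stop that still works

print(stop_too_small)  # [1]
print(stop_parts_L)    # [ 1  4  7 10]
print(stop_minimal)    # [ 1  4  7 10]

# Edge case: with L = 1, parts*L equals the last wanted value, which is excluded.
one_short = np.arange(1, parts * 1, 1)
full_length = np.arange(1, parts * 1 + 1, 1)
print(len(one_short), len(full_length))  # 3 4
```

The first column of x needs exactly parts values, so a stop that yields fewer values makes the column assignment fail with a shape mismatch.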
