
Both code snippets below check whether an element exists in the array, but the first approach takes < 100 ms while the second takes ~6 seconds.

Does anyone know why?

```
import numpy as np
import time

xs = np.random.randint(90000000, size=8000000)

# Approach 1: membership test with the `in` operator
start = time.monotonic()
is_present = -4 in xs
end = time.monotonic()

print('exec time:', round(end - start, 3), 'sec')  # ~100 milliseconds

# Approach 2: explicit Python loop over the array
start = time.monotonic()
for x in xs:
    if x == -4:
        break
end = time.monotonic()

print('exec time:', round(end - start, 3), 'sec')  # ~6000 milliseconds
```


Comments

  • Related: stackoverflow.com/questions/8385602/… and medium.com/@gough.cory/… Commented May 2, 2021 at 9:28
  • Try this with PyPy rather than CPython: it is much faster and the gap narrows. The reason is that CPython is a (slow) interpreter. The first snippet executes an optimized native C call, while the second uses the interpreter to iterate over the array, which is extremely slow compared to doing the same thing in native compiled code. Commented May 2, 2021 at 11:57

1 Answer

NumPy is built specifically to accelerate this kind of code. It is written in C, with almost all of the Python overhead removed. By comparison, your second attempt is a pure-Python loop: every element has to pass through the interpreter, so it takes much longer to go through all of them.
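
To see the difference side by side, here is a minimal sketch. It assumes a smaller array than the one in the question so that it finishes quickly, and the helper names `contains_vectorized` and `contains_loop` are made up for illustration. Absolute timings will vary by machine and NumPy version.

```python
import time

import numpy as np


def contains_vectorized(arr, value):
    """Membership test done entirely inside NumPy's compiled code."""
    return bool((arr == value).any())


def contains_loop(arr, value):
    """Same test, but each element is handled by the Python interpreter."""
    for x in arr:  # every x is wrapped in a NumPy scalar object
        if x == value:
            return True
    return False


rng = np.random.default_rng(0)
# Non-negative values only, so -4 is never present and both scans are full.
xs = rng.integers(0, 90_000_000, size=200_000)

for fn in (contains_vectorized, contains_loop):
    start = time.perf_counter()
    found = fn(xs, -4)
    elapsed = time.perf_counter() - start
    print(f"{fn.__name__}: found={found}, {elapsed:.4f} sec")
```

Both functions return the same answer; only the place where the per-element work happens differs. If you really need a Python-level loop, iterating over `xs.tolist()` is usually cheaper than iterating over the array itself, because the elements are converted to plain Python ints once, up front.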
