This question is similar to "How to use multiprocessing in a for loop - python" and "How to use multiprocessing in a for loop - python", but neither of these solves my problem. The function stateRecognizer() checks whether any of a series of images exists on the current screen, using a function getCoord(imgDir), and returns the corresponding state.

getCoord(key) returns a list of 4 integers, or None if the image wasn't found.

My for loop implementation

checks = {"loadingblack.png": 'loading',
          "loading.png": 'loading',
          "gear.png": 'home',
          "factory.png": 'factory',
          "bathtub.png": 'bathtub',
          "refit.png": 'refit',
          "supply.png": 'supply',
          "dock.png": 'dock',
          "spepage.png": 'spepage',
          "oquest.png": 'quest',
          "quest.png": 'quest'}

def stateRecognizer(hint=None):
    for key in checks:
        if getCoord(key) is not None:
            return checks[key]

When I attempt to write another function and call it, it does not return the expected variable:

def stateChecker(key, value):
    if (getCoord(key) is not None):
        return value

def stateRecognizer():
    with Pool(multiprocessing.cpu_count()) as pool:
        result = pool.map(stateChecker, checks)

Outputs:

stateChecker() missing 1 required positional argument: 'value'

How do I pass in a dict to the function stateChecker?

Update 2: Thank you both @tdelaney and @Nathaniel Ford.

def stateChecker(key, value):
    if (getCoord(key) is not None):
        return value
def stateRecognizer():
    with Pool(multiprocessing.cpu_count()) as mp_pool:
        return mp_pool.starmap(stateChecker, checks.items())

The function now returns [None, None, None, None, 'bathtub', None, None, None, None, None, None], but it runs around 12 times slower. I am assuming each subprocess processes the entire dict. Also, the function sometimes fails to read the JPEG image properly:

Premature end of JPEG file
Premature end of JPEG file
[None, None, None, None, None, None, None, None, None, None, None]
Elapsed time: 7.7098618000000005
Premature end of JPEG file
Premature end of JPEG file
[None, None, None, None, 'bathtub', None, None, None, None, None, None]
Elapsed time: 7.169349200000001

When I add * before checks.items() or checks:

    with Pool(multiprocessing.cpu_count()) as mp_pool:
        return mp_pool.starmap(stateChecker, *checks)

Exception raised:

Exception has occurred: TypeError
    starmap() takes from 3 to 4 positional arguments but 13 were given

  • Did you guard the main module with if __name__ == '__main__' like the error suggests? Commented Jul 6, 2021 at 21:08
  • Thank you, that solved part 2 @flakes. Commented Jul 6, 2021 at 21:10
  • You should probably break out the second problem you're getting into its own question. There are a couple of things that may be going on, but you should isolate them. Also, you may be running into a GIL issue depending on what exactly you're doing. Commented Jul 6, 2021 at 23:07
  • Thank you for pointing out the concept of GIL. Commented Jul 7, 2021 at 0:03

2 Answers

map calls the target function with a single parameter. Use starmap to unpack an iterated tuple into parameters for the target function. Since your function is written to process key/value pairs, you can use the dictionary's item iterator to do the job.

def stateChecker(key, value):
    if (getCoord(key) is not None):
        return value
def stateRecognizer():
    with Pool(multiprocessing.cpu_count()) as mp_pool:
        return mp_pool.starmap(stateChecker, checks.items())
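
For anyone who wants to try this without the screen-matching code, here is a minimal runnable sketch. Note that get_coord below is a hypothetical stand-in for the asker's getCoord (it simply pretends that only "bathtub.png" is on screen), the checks dict is shortened, and the snake_case names are mine, not the original's. The `if __name__ == "__main__":` guard is the one mentioned in the question's comments.

```python
import multiprocessing
from multiprocessing import Pool

# Hypothetical stand-in for the asker's getCoord(): pretends that only
# "bathtub.png" is currently visible on screen.
def get_coord(key):
    return [10, 20, 30, 40] if key == "bathtub.png" else None

# Shortened version of the question's dict, for illustration only.
checks = {"gear.png": "home", "bathtub.png": "bathtub", "dock.png": "dock"}

def state_checker(key, value):
    # Return the state name if its image was found, otherwise None.
    if get_coord(key) is not None:
        return value
    return None

def state_recognizer():
    with Pool(multiprocessing.cpu_count()) as mp_pool:
        # items() yields (key, value) tuples; starmap unpacks each tuple
        # into the two positional parameters of state_checker.
        return mp_pool.starmap(state_checker, checks.items())

if __name__ == "__main__":
    # One result per key, in the dict's insertion order.
    print(state_recognizer())
```

With this stub the result list has one entry per key, and only the "bathtub.png" slot is not None.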

5 Comments

First of all thank you for your answer. The function now returns [None, None, None, None, 'bathtub', None, None, None, None, None, None] with slower processing speed (around 16 times slower). I am assuming each subprocess processes the entire dict per subprocess.
The return values are entirely because getCoord is returning None for each of those keys, and only returning a non-None value for bathtub.png. Your problem almost certainly exists in getCoord. Are you trying to do something like actually_appears_on_the_screen()?
getCoord(key) returns None if the image wasn't found. actually_appears_on_the_screen() is what I am trying to achieve (in this case it may return True or False), but I found getCoord(key) doing the same thing where if it returns a coordinate, it is a True in the case of the first function, and vice versa.
No, the keys to be checked are fanned out to the subprocesses. Each will get len(checks)/cpu_count() chunks to work on. getCoord(key) must be returning None a lot for all of those None values to be in the result list. One difference: your original, nonparallel code returns the first non-None match instead of the list you are returning with the pool. Multiprocessing has its own overhead. Whether a task is good to parallelize depends on what sort of work it is doing. If this is a fast or disk-bound operation, multiprocessing will be slower.
Thank you for the clarification
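
If the goal is to keep the original loop's "return the first match" behaviour while still using a pool, one possible sketch uses Pool.imap, which yields results lazily and in input order. This is only an illustration under assumptions: get_coord is a hypothetical stub for getCoord, check_item is a helper name I made up, and the approach does not remove the process start-up and pickling overhead discussed above.

```python
from multiprocessing import Pool

def get_coord(key):
    # Hypothetical stub for getCoord(): only "bathtub.png" is "on screen".
    return [10, 20, 30, 40] if key == "bathtub.png" else None

def check_item(item):
    # imap passes ONE argument per task, so unpack the (key, value) pair here.
    key, value = item
    return value if get_coord(key) is not None else None

def state_recognizer(checks):
    with Pool() as pool:
        # Results arrive in the same order as checks.items().
        for state in pool.imap(check_item, checks.items()):
            if state is not None:
                # Leaving the with-block terminates the pool's workers.
                return state
    return None

if __name__ == "__main__":
    checks = {"gear.png": "home", "bathtub.png": "bathtub", "dock.png": "dock"}
    print(state_recognizer(checks))
```

Because imap preserves input order, the first non-None result corresponds to the same key the sequential for loop would have matched first.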

There is a slightly uncommon behavior in Python:

>>> dx = {"a": 1, "b": 2}
>>> for i in dx: print(i)
...
a
b

Essentially, only the keys are part of the iteration here. Whereas, if we use items() we see:

>>> dx = {"a": 1, "b": 2}
>>> for i in dx.items(): print(i)
...
('a', 1)
('b', 2)

When you call map on your pool, it is effectively using that first version. That means, rather than passing a key-value pair into stateChecker you are passing only the key. Thus your error 'missing 1 required positional argument'. The second value is missing.

By using starmap and items() we can get around this. As shown above, items will give an iterator of tuples (each a key-value pair from your dictionary).

def stateRecognizer():
    with Pool(multiprocessing.cpu_count()) as mp_pool:
        return mp_pool.starmap(stateChecker, checks.items())

The "star" in starmap refers to the * unpacking operator:

>>> def f(a, b):
...   print(f"{a} is the key for {b}")
... 
>>> my_tuple = ("a", 1)
>>> f(*my_tuple)
a is the key for 1
>>> f(my_tuple)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() missing 1 required positional argument: 'b'

As you can see here, when * is used to pass values to a function, it 'unpacks' those values, slotting each value from the tuple (or list) into an argument. You can see that when we don't use the * operator, we get an error very similar to the one you originally received.
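
The same contrast can be seen without any processes at all. The sketch below uses the built-in map and itertools.starmap as sequential stand-ins for Pool.map and Pool.starmap, which pass arguments in the same way; describe is just an illustrative function name.

```python
from itertools import starmap

def describe(key, value):
    return f"{key} is the key for {value}"

pairs = {"a": 1, "b": 2}.items()

# starmap(f, pairs) calls f(*pair) for each pair.
unpacked_by_hand = [describe(*pair) for pair in pairs]
unpacked_by_starmap = list(starmap(describe, pairs))
assert unpacked_by_hand == unpacked_by_starmap

# Plain map() passes each tuple as ONE argument, so the second
# parameter is never filled and a TypeError is raised.
try:
    list(map(describe, pairs))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```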

A few more notes:

When writing Python, it really is best to stick to standard naming formats (PEP 8). For functions, use snake case (state_checker) and for classes use CapWords, also known as upper camel case (StateChecker). This helps you reason faster, amongst more esoteric reasons.

This function is probably misbehaving:

 def stateChecker(key, value):
     if (getCoord(key) is not None):
         return value

Assuming that getCoord returns four integers in a tuple (it's unclear in the original), or None when the image isn't found, its type signature is:

def getCoord(key: Any) -> Optional[Tuple[int, int, int, int]]:
    ....

That means, in turn, that stateChecker returns either value or None. Given that checks maps strings to strings, its type signature is:

def stateChecker(key: str, value: str) -> Optional[str]:
    ....

The None case arises because, when the if condition evaluates to false, the function falls off the end and implicitly returns None. It's likely getCoord can be short-circuited in these cases, but without knowing more it's hard to say how. Regardless, you aren't really handling a None return value.
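
One way to handle the None entries, sketched under the assumption that the pool returns a list of state names and None values like the one shown in the question, is to reduce that list to the first real match. The helper name first_state is hypothetical.

```python
from typing import List, Optional

def first_state(results: List[Optional[str]]) -> Optional[str]:
    """Return the first non-None entry, or None when nothing matched."""
    return next((state for state in results if state is not None), None)

# Example list in the shape the question reports from starmap.
results = [None, None, None, None, "bathtub", None]

state = first_state(results)
if state is None:
    print("no known screen detected")
else:
    print(f"current state: {state}")
```

The caller still has to decide what "no match" means for the program, which is why the None branch is kept explicit here.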
