
I have a Python script job.py which accepts command-line arguments. The script uses the Python package subprocess to run some external programs. Both the script and the external programs are sequential (i.e. no MPI, OpenMP, etc.). I want to run this script 4 times, each time with different command-line arguments. My processor has 4 cores, so I would like to run all 4 instances simultaneously. If I open 4 terminals and run one instance in each terminal, it works perfectly and I get exactly what I want.

Now I want to make it easier for myself to launch the 4 instances such that I can do all of this with a single command from a single terminal. For this I use a bash script batch.sh:

python job.py 4 0 &
python job.py 4 1 &
python job.py 4 2 &
python job.py 4 3 &

This does not work. It turns out that subprocess is the culprit here. All the Python code runs fine until it hits subprocess.call, after which I get:

[1]+  Stopped                 python job.py 4 0

The way I see it, I am trying to run job.py in the background, and job.py itself tries to run something else in the background via subprocess. This apparently does not work, for reasons I do not understand.

Is there a way to run job.py multiple times without requiring multiple terminals?

EDIT #1

On recommendation, I tried the multiprocessing, thread, and threading packages. In the best case only one instance ran properly. I then tried an ugly workaround which does work: a bash script which launches each instance in a new terminal:

konsole -e python job.py 4 0
konsole -e python job.py 4 1
konsole -e python job.py 4 2
konsole -e python job.py 4 3

EDIT #2

Here is the actual function that uses subprocess.call (note: subprocess is imported as sp).

def run_case(path):
    case = path['case']
    os.chdir(case)
    cmd = '{foam}; {solver} >log.{solver} 2>&1'.format(foam=CONFIG['FOAM'],
                                                       solver=CONFIG['SOLVER'])
    sp.call(['/bin/bash', '-i', '-c', cmd])

Let me fill in the blanks (a rough sketch of CONFIG follows the list):

  • CONFIG is a globally defined dictionary.
  • CONFIG['FOAM'] = 'of40' and this is an alias in my .bashrc used to source a file belonging to the binary I'm running.
  • CONFIG['SOLVER'] = 'simpleFoam' and this is the binary I'm running.
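
For reference, a stripped-down sketch of CONFIG with just these keys; the case path is a placeholder, not the real one, and the 'PATH' entry is the one used later in EDIT #3:

# Stripped-down sketch of CONFIG; the case path is a placeholder.
CONFIG = {
    'FOAM': 'of40',          # alias in my .bashrc that sources the OpenFOAM environment
    'SOLVER': 'simpleFoam',  # the sequential solver binary to run
    'PATH': {'case': '/path/to/case'},  # placeholder; used later in EDIT #3
}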

EDIT #3

I finally got it to work with this:

def run_case():
    case = CONFIG['PATH']['case']
    os.chdir(case)
    cmd = 'source {foam}; {solver} >log.simpleFoam 2>&1'.format(foam=CONFIG['FOAM'],
                                                                solver=CONFIG['SOLVER'])
    sp.call([cmd], shell=True, executable='/bin/bash')

The solution was to set both shell=True and executable='/bin/bash' instead of including /bin/bash in the actual command line passed to the shell. NOTE: CONFIG['FOAM'] is now a path to a file to be sourced instead of an alias.

4 Comments
  • Stopped may just mean that the instance of job.py finished. Commented Mar 23, 2017 at 14:43
  • I don't think this has anything to do with the subprocess module. I think the program you are running wants to write to stdout, and receives a SIGTTOU because you've placed it in the background so it doesn't have access to your controlling terminal. This is standard behavior. Capturing the output from the program (setting stdout=subprocess.PIPE and stderr=subprocess.PIPE) might work, although then you will need to handle program output correctly (see subprocess.Popen and the communicate method; a sketch of this approach follows these comments). Commented Mar 23, 2017 at 14:44
  • you should use threading to access your different cores. tutorialspoint.com/python/python_multithreading.htm Commented Mar 23, 2017 at 14:46
  • sp.call([cmd], shell=True) is wrong, it coincidentally works but you really mean sp.call(cmd, shell=True) without the [] Commented Jun 21, 2022 at 16:28
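
A minimal sketch of the capture-the-output suggestion from the comment above; the command string is only a stand-in, not the actual solver call:

import subprocess

# Capture stdout/stderr in Python so the backgrounded job never writes to the
# controlling terminal. The command below is only a placeholder.
cmd = 'echo hello'   # placeholder for the real command

proc = subprocess.Popen(cmd, shell=True, executable='/bin/bash',
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()           # wait for the program and collect its output
with open('log.placeholder', 'wb') as log:
    log.write(out)                      # write the captured output to a log file
    log.write(err)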

1 Answer


You can parallelize from within Python:

import multiprocessing
import subprocess

def run_job(spec):
    ...
    if spec ...:
        subprocess.call(...)

def run_all_jobs(specs):
    pool = multiprocessing.Pool()
    pool.map(run_job, specs)

This has the advantage of letting you monitor, log, and debug the parallelization from within Python.
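
Applied to the four invocations from the question, this could look roughly like the sketch below (job.py and its arguments come from the question; everything else is illustrative):

import multiprocessing
import subprocess

def run_job(args):
    # Each worker launches one sequential instance of job.py and waits for it.
    return subprocess.call(['python', 'job.py'] + [str(a) for a in args])

def run_all_jobs(specs):
    pool = multiprocessing.Pool(processes=4)   # one worker per core
    results = pool.map(run_job, specs)         # blocks until all four jobs finish
    pool.close()
    pool.join()
    return results

if __name__ == '__main__':
    print(run_all_jobs([(4, 0), (4, 1), (4, 2), (4, 3)]))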


5 Comments

I tried this, but it will only run one of the jobs. To run the others in the pool I have to type fg in the terminal. Somehow subprocess.call is having a locking behaviour, because everything before subprocess.call runs as expected. If I comment out every subprocess.call then all jobs run until completion (naturally without doing anything useful).
What are you calling with subprocess.call()? Are you using shell=True?
I tried it both with shell set to True and False. See EDIT #2 in OP to see the function that uses subprocess.call.
You are calling bash from the call(). You can instead use separate call() statements over the programs of interest, capture the output in Python, and write to the log file from within Python (a sketch of this follows the comments).
I finally got it to work. See OP for the solution. Thanks for all the help!
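
A rough sketch of that last suggestion, assuming simpleFoam (the solver named in the question) is already on the PATH of the environment Python was started from:

import subprocess

# Call the solver directly (no interactive bash in between) and let Python own
# the log file. Assumes the OpenFOAM environment is already set up.
with open('log.simpleFoam', 'wb') as log:
    returncode = subprocess.call(['simpleFoam'], stdout=log,
                                  stderr=subprocess.STDOUT)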
