
I created a bash script to automate multiple tools. This bash script takes a string as input and performs multiple tasks on it. The structure of the script looks like this:

#!/bin/bash
tool1 $1
tool2 $1
tool3 $1
tool4 $1
tool5 $1

I want to use Python 3 multiprocessing to run the n tools concurrently/in parallel to speed up the process. How can this be done in Python?

  • Take a look at gnu.org/software/parallel for executing jobs in parallel. Commented Apr 23, 2020 at 1:42
  • You don't need any tool other than bash to do this anyway, you can just tool1 $1 & tool2 $1 & tool3 $1 & tool4 $1 & tool5 $1 (basically add & at the end of each line). Commented Apr 23, 2020 at 2:26

2 Answers


Use multiprocessing together with subprocess. If your custom shell script is not in the PATH, use the full path to the script. If the script is in the same folder as the Python script, use ./script.sh as the command.

Also ensure that the script you are running has execute permission.

from multiprocessing import Pool
import subprocess

def run_script(job):
    (command, arg_str) = job
    print("Starting command :{} with argument {}".format(command, arg_str))
    # shell=True lets the command and its argument be passed as a single string
    result = subprocess.call(command + " " + arg_str, shell=True)
    print("Completed command :{} with argument {}".format(command, arg_str))
    return result

if __name__ == '__main__':  # guard so worker processes can safely re-import this module
    with Pool(5) as p:  # choose an appropriate level of parallelism
        # choose appropriate commands and arguments; they can be fetched from sys.argv if needed
        exit_codes = p.map(run_script, [('echo', 'hello1'), ('echo', 'hello2')])
        print("Exit codes : {}".format(exit_codes))

You could use the exit codes to verify the completion status. Sample output:

Starting command :echo with argument hello1
Starting command :echo with argument hello2
hello1
hello2
Completed command :echo with argument hello1
Completed command :echo with argument hello2
Exit codes : [0, 0]
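
Applied to the tools from your question, a minimal sketch (assuming tool1 through tool5 are on the PATH and all take the same string argument) could look like this, checking the exit codes afterwards:

from multiprocessing import Pool
import subprocess
import sys

def run_tool(job):
    command, arg_str = job
    # run "toolN <arg>" through the shell and return its exit code
    return subprocess.call(command + " " + arg_str, shell=True)

if __name__ == '__main__':
    arg = sys.argv[1]  # the string your original bash script receives
    tools = ['tool1', 'tool2', 'tool3', 'tool4', 'tool5']
    with Pool(5) as p:
        exit_codes = p.map(run_tool, [(t, arg) for t in tools])
    # report any tool that did not finish successfully
    failed = [t for t, code in zip(tools, exit_codes) if code != 0]
    if failed:
        print("These tools failed: {}".format(failed))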

Another way to do this (without Python) would be to use GNU Parallel. The command below does the same thing as the Python script above.

parallel -k echo ::: 'hello1' 'hello2'



You could use multiprocessing.pool.Pool along with os.system like this:

import sys
import os
import multiprocessing

tools = ['tool1', 'tool2', 'tool3', 'tool4', 'tool5']

if __name__ == '__main__':
    arg1 = sys.argv[1]
    # one worker per tool; each worker runs os.system('toolN <arg>')
    with multiprocessing.Pool(len(tools)) as p:
        p.map(os.system, [t + ' ' + arg1 for t in tools])

This will start len(tools) parallel processes, each of which executes os.system('toolN arg').

In general, though, you don't want Pool(len(tools)): launching more processes than the number N of cores available on your machine does not scale well. Use Pool() instead; it will still execute every tool, but at most N of them will run concurrently.
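
For example, a minimal sketch of the same approach with the default pool size (assuming the same tool1 through tool5 commands from the question):

import sys
import os
import multiprocessing

tools = ['tool1', 'tool2', 'tool3', 'tool4', 'tool5']

if __name__ == '__main__':
    arg1 = sys.argv[1]
    # Pool() with no argument uses os.cpu_count() workers,
    # so at most that many tools run at the same time
    with multiprocessing.Pool() as p:
        statuses = p.map(os.system, [t + ' ' + arg1 for t in tools])
    print(statuses)  # a non-zero status means the corresponding tool failed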

