
I am running a Python script that builds a list of commands to be executed by a compiled (proprietary) program.

The program can split some of the calculations so that they run independently; the data is then collected afterwards.

I would like to run these calculations in parallel, as each one is a very time-consuming single-threaded task and I have 16 cores available.

I am using subprocess to execute the commands (inside a class):

from subprocess import Popen, PIPE

def run_local(self):
    p = Popen(["someExecutable"], stdout=PIPE, stdin=PIPE)
    p.stdin.write(self.exec_string)
    p.stdin.flush()
    # poll() returns None for as long as the process is still running
    while p.poll() is None:
        line = p.stdout.readline()
        self.log(line)

Here self.exec_string is a string containing all the commands.

This string can be split into an initial part, the part I want parallelised, and a finishing part.

How should I go about this?

Also, the executable seems to "hang" (waiting for a command, e.g. "exit", which releases the memory) if I naively copy-paste the current method for each part.

Bonus: the executable also has the option to run a bash script of commands. Would it be easier, or even possible, to parallelise this in bash instead?


2 Answers


For bash, it could be very simple. Assuming your file looks like this:

## init part ##
ls
cd ..
ls
cat some_file.txt

## parallel ##
heavycalc &
heavycalc &
heavycalc &

## finish ##
wait
cat results.txt

With & after a command, you tell bash to run that command in the background as a separate process. wait then blocks until all background jobs have finished, so you can be sure all calculations are done.

I've assumed that your input text file contains plain bash commands.
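If you would rather stay in Python, the same start-in-background-then-wait pattern can be sketched with subprocess and a thread pool. This is only a rough sketch under some assumptions: run_part and run_parallel are hypothetical helper names, each parallel part is assumed to be a self-contained command string that ends with whatever makes the program quit (e.g. "exit"), and every part gets its own fresh instance of the program, so nothing is shared in memory between instances. subprocess.run() closes stdin after writing the input, which should also avoid the "hang" described in the question.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_part(argv, commands):
    """Start one instance of the program, feed it `commands`, return its stdout.

    subprocess.run() writes `commands` to stdin and then closes the pipe,
    so the program sees end-of-input instead of waiting for more commands.
    """
    result = subprocess.run(
        argv, input=commands, capture_output=True, text=True, check=True
    )
    return result.stdout


def run_parallel(argv, parts, max_workers=16):
    """Run every command string in `parts` in its own process.

    Threads are sufficient here because each one only blocks on its child
    process. Results come back in the same order as `parts`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda part: run_part(argv, part), parts))


if __name__ == "__main__":
    # Stand-in for the proprietary executable (an assumption for the demo):
    # a child Python process that upper-cases whatever arrives on stdin.
    demo_argv = [sys.executable, "-c",
                 "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    outputs = run_parallel(demo_argv, ["calc a\nexit\n", "calc b\nexit\n"],
                           max_workers=2)
    print(outputs)
```

In the real script you would call run_part(["someExecutable"], init_commands) first, then run_parallel(["someExecutable"], parallel_parts), and finally run_part again for the finishing part. The with block only exits once every worker has finished, so it plays the role of wait.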


5 Comments

With a bit of luck this might actually work with some piping of commands to the executable. I like the simplicity, so I might accept this as the answer, because the remaining problems might be too specific.
Turns out piping doesn't really work: echo "some command" | someExecutable & The piped commands can't seem to access the pointers of the regular commands; they spawn a new instance of the executable instead. Do you have a good idea of how to pipe into some kind of shared-memory instance?
You could try making your command a subcommand and running the parent in the background: eval "echo \"some command\" | someExecutable" &
"spawn a new instance of the executable": what do you mean? With every call of someExecutable & the executable is called and, unless the executable itself defines otherwise, spawned as a new process. What behaviour did you expect?
I expected (assuming the program was disk-based enough) that every new thread would find the data from the first commands. However, this was not the case. I just figured out a way to write and read little enough data that multiprocessing is still a major advantage. echo "some command" | someExecutable & was the way to go! I appreciate the note about wait; I think it works perfectly now!

Using GNU Parallel:

## init ##
cd foo
cp bar baz

## parallel ##
parallel heavycalc ::: file1 file2 file3 > results.txt

## finish ##
cat results.txt

GNU Parallel is a general parallelizer that makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to. It can often replace a for loop.

If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:

[Figure: simple scheduling]

GNU Parallel instead spawns a new process whenever one finishes, keeping the CPUs active and thus saving time:

[Figure: GNU Parallel scheduling]
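To make the difference concrete, here is a small simulation of the two strategies. It is a simplified model with made-up job durations, not GNU Parallel's actual implementation; static_makespan and dynamic_makespan are hypothetical names, and the job list is assumed to be non-empty.

```python
import heapq


def static_makespan(durations, n_cpus):
    """Pre-assign equal-sized consecutive batches of jobs to each CPU."""
    per_cpu = -(-len(durations) // n_cpus)  # ceiling division
    batches = [durations[i:i + per_cpu]
               for i in range(0, len(durations), per_cpu)]
    # The run is over when the slowest batch is done.
    return max(sum(batch) for batch in batches)


def dynamic_makespan(durations, n_cpus):
    """Hand the next job to whichever CPU becomes free first."""
    free_at = [0] * n_cpus  # min-heap of the times at which each CPU is idle
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + duration)
    return max(free_at)


if __name__ == "__main__":
    # 32 jobs on 4 CPUs: 8 slow jobs (8 time units) followed by 24 fast ones.
    durations = [8] * 8 + [1] * 24
    print("static :", static_makespan(durations, 4))   # 64
    print("dynamic:", dynamic_makespan(durations, 4))  # 22
```

With these durations the fixed 8-jobs-per-CPU split puts all the slow jobs on one CPU, while the other three sit idle for most of the run. Starting a new job whenever a CPU frees up spreads the work evenly.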

Installation

If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README

Learn more

See more examples: http://www.gnu.org/software/parallel/man.html

Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html

Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel
