
I wrote a shell script which:

  1. gets a list of all image files from a directory
  2. creates a new folder for each image if needed
  3. optimizes the image to save storage space

I've tried putting parallel -j "$(nproc)" in front of mogrify, but that turned out to be wrong, because DIR and mkdir are used before mogrify. What I need instead is something like an & at the end of the mogrify call, but limited to n processes at a time.
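For illustration, the kind of thing I mean is roughly this (just a sketch, assuming bash 4.3+ for wait -n; the mogrify options and find patterns are shortened here):

#!/bin/bash
# Sketch only: run mogrify in the background, keeping at most $(nproc) jobs alive.
MAX_JOBS=$(nproc)

while IFS= read -r IMAGE
do
    DIR="$2"/$(dirname "$IMAGE")
    mkdir -p "$DIR"
    mogrify -path "$DIR" -quality 82 "$IMAGE" &    # run in the background
    # Throttle: if MAX_JOBS jobs are already running, wait for one to finish.
    while (( $(jobs -rp | wc -l) >= MAX_JOBS )); do
        wait -n
    done
done < <(find "$1" -type f -iname "*.jpg")

wait    # wait for the remaining background jobs

But this feels fragile, so I'd rather use a proper tool.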

The current code looks like this:

#!/bin/bash

find "$1" -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.gif" \) | while IFS= read -r IMAGE
do
    DIR="$2"/$(dirname "$IMAGE")
    echo "$IMAGE > $DIR"
    mkdir -p "$DIR"
    mogrify -path "$DIR" -resize "6000000@>" -filter Triangle -define filter:support=2 -unsharp 0.25x0.08+8.3+0.045 -dither None -posterize 136 -quality 82 -define jpeg:fancy-upsampling=off -define png:compression-filter=5 -define png:compression-level=9 -define png:compression-strategy=1 -define png:exclude-chunk=all -interlace none -colorspace sRGB "$IMAGE"
done

exit 0

Can someone suggest the right way to run such a script in parallel? Each run takes about 15 seconds.

  • Check out GNU parallel. (Site briefly offline at the moment.) Commented Jul 1, 2019 at 16:51

2 Answers


When you have a shell loop that does some setup and invokes an expensive command, the way to parallelize it is to use sem from GNU parallel:

for i in {1..10}
do
  echo "Doing some stuff"
  sem -j +0 sleep 2
done
sem --wait

This lets the loop run and do its setup as normal, while the expensive commands are scheduled to run in parallel (-j +0 runs one job per CPU core).
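Applied to your loop, that would look roughly like this (a sketch; the mogrify options are shortened for readability):

#!/bin/bash

find "$1" -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.gif" \) | while IFS= read -r IMAGE
do
    DIR="$2"/$(dirname "$IMAGE")    # setup still runs sequentially
    echo "$IMAGE > $DIR"
    mkdir -p "$DIR"
    # Only the expensive call is handed to sem, which runs up to one
    # job per CPU core in the background.
    sem -j +0 mogrify -path "$DIR" -quality 82 "$IMAGE"
done
sem --wait    # block until every queued mogrify job has finished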



Make a bash function that deals correctly with one file and call that in parallel:

#!/bin/bash

doit() {
  IMAGE="$1"
  DIR="$2"/$(dirname "$IMAGE")
  echo "$IMAGE > $DIR"
  mkdir -p "$DIR"
  mogrify -path "$DIR" -resize "6000000@>" -filter Triangle -define filter:support=2 -unsharp 0.25x0.08+8.3+0.045 -dither None -posterize 136 -quality 82 -define jpeg:fancy-upsampling=off -define png:compression-filter=5 -define png:compression-level=9 -define png:compression-strategy=1 -define png:exclude-chunk=all -interlace none -colorspace sRGB "$IMAGE"
}
export -f doit

find "$1" -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.gif" \) |
    parallel doit {} "$2"    # pass the image ({}) and the destination directory to doit

By default GNU Parallel runs one job per CPU thread, so nproc is not needed.
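If you do want to cap the number of simultaneous jobs, you can still pass -j explicitly (the limit of 4 below is just an example):

find "$1" -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.gif" \) |
    parallel -j 4 doit {} "$2"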

This has less overhead than starting sem for each file (sem = 0.2 sec per call, parallel = 7 ms per call).

