eXTReMe Tracker
May 112011

Running scripts in serial order on a multi-core machine will take quite a bit of time. If the tasks are repetitive, the command can be run under GNU Parallel.

You can install it on RPM or Debian based distribution from the GNU Parallel repository. Install the binary that fits your flavour of Linux. Please note that even though the latest release of GNU Parallel has a flavour of Linux associated with it, it runs on most of the older AND newer distributions. For instance I was able to install the April 21st 2011 release of amd_64 RPM on a RHEL5 machine, and it ran just fine for that 8 core machine.

After installation you can try it out on a sample directory by running:

ls -d */ | sed ‘s/\///g’ | parallel zip -r -q {}.zip {}

This will recursively zip all the directories within a given folder, in parallel. By default, it will use up the maximum number of cores, but you can check the man pages of parallel to check how to employ N number of processors to run the task in parallel.

ls *.jpg | parallel convert {} -resize 75% -quality 80% {}

If you have imagemagick installed, you could save a whole bunch of space by converting some of the high-res images to a slightly lower resolution. The above example will resize the images to 75% of their original size with 80% quality. The original images will be overwritten by the resize and downsampled ones. While this is running, you can fire up htop in another terminal window to watch all the processors working in parallel.