shell commands in parallel
Sat, 04 Jun 2011 19:58 categories: code

I needed a way to execute a list of commands in parallel. Existing tools like parallel from the moreutils Debian package and pexec only allow passing the arguments on the command line. This becomes a problem when there are more commands than exec() can handle. You can find out that limit with getconf NCARGS.
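For illustration, this is the kind of failure you run into once the argument list exceeds that limit (the exact count needed and the error wording vary by shell and system):

    $ /bin/echo $(seq 1 2000000)
    bash: /bin/echo: Argument list too long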
Another issue with them is that they only accept a list of arguments to append to a given command, not a list of commands to be run in parallel, and the number of arguments they can pass to that command is limited to one. They can also only execute a single command and not a chain of commands separated by semicolons.
What I needed was a program that sequentially reads commands, or chains of commands separated by semicolons, from a file with one command or chain per line, and executes each of them as soon as the number of currently running processes drops below a threshold.
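For illustration, such an input file (the file name command_list and the commands in it are made-up examples) could look like this:

    ./encode.sh video1.avi; rm video1.avi
    ./encode.sh video2.avi; rm video2.avi
    ./encode.sh video3.avi; rm video3.avi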
The following script reads such a list from stdin and does exactly what I want:
#!/bin/sh

NUM=0
QUEUE=""
MAX_NPROC=6

# read one command (or chain of commands) per line from stdin
while read CMD; do
    # start the command in the background and remember its PID
    sh -c "$CMD" &
    PID=$!
    QUEUE="$QUEUE $PID"
    NUM=$(($NUM+1))

    # if enough processes were created, wait until one of them finishes
    while [ $NUM -ge $MAX_NPROC ]; do
        # check whether any process finished
        for PID in $QUEUE; do
            if [ ! -d /proc/$PID ]; then
                TMPQUEUE=$QUEUE
                QUEUE=""
                NUM=0
                # rebuild new queue from processes
                # that are still alive
                for PID in $TMPQUEUE; do
                    if [ -d /proc/$PID ]; then
                        QUEUE="$QUEUE $PID"
                        NUM=$(($NUM+1))
                    fi
                done
                break
            fi
        done
        sleep 0.5
    done
done

# wait for the remaining processes to finish
wait
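Assuming the script is saved as parallel.sh and made executable (the script name and the file name are just examples), it is used like this:

    ./parallel.sh < command_list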
EDIT: way too late, I figured out that what I wanted to do is much easier to achieve by just using xargs like this:
cat command_list | xargs --delimiter='\n' --max-args=1 --max-procs=4 sh -c
where --max-procs (short option -P) controls how many sh processes run in parallel.
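For reference, the same invocation with the short options (assuming GNU xargs; command_list is the example file from above):

    xargs -d '\n' -n 1 -P 4 sh -c < command_list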