How to write a process pool in a bash shell script


I have more than 10 tasks to execute, and the system restricts me to at most 4 tasks running at the same time.

My task can be started like: myprog taskname

How can I write a bash shell script to run these tasks? The most important thing is that when one task finishes, the script starts another immediately, keeping the count of running tasks at 4 the whole time.


I chanced upon this thread while looking into writing my own process pool and particularly liked Brandon Horsley's solution, though I couldn't get the signals working right, so I took inspiration from Apache and decided to try a pre-fork model with a fifo as my job queue.

The following is the function that the worker processes run when forked.

# \brief the worker function that is called when we fork off worker processes
# \param[in] id  the worker ID
# \param[in] job_queue  the fifo to read jobs from
# \param[in] result_log  the temporary log file to write exit codes to
function _job_pool_worker()
{
    local id=$1
    local job_queue=$2
    local result_log=$3
    local line=

    exec 7<> "${job_queue}"
    while [[ "${line}" != "${job_pool_end_of_jobs}" && -e "${job_queue}" ]]; do
        # workers block on the exclusive lock to read the job queue
        flock --exclusive 7
        IFS= read -r line <"${job_queue}"
        flock --unlock 7
        # the worker should exit if it sees the end-of-jobs marker, or run the
        # job otherwise and save its exit code to the result log.
        if [[ "${line}" == "${job_pool_end_of_jobs}" ]]; then
            # write it one more time for the next sibling so that everyone
            # will know we are exiting.
            echo "${line}" >&7
        else
            _job_pool_echo "### _job_pool_worker-${id}: ${line}"
            # run the job (word splitting of ${line} is intentional here)
            { ${line} ; }
            # now check the exit code and prepend "ERROR" to the result log
            # entry, which we will use to count errors and then strip out later.
            local result=$?
            local status=
            if [[ "${result}" != "0" ]]; then
                status=ERROR
            fi
            # now write the result to the log, making sure multiple processes
            # don't trample over each other.
            exec 8<> "${result_log}"
            flock --exclusive 8
            echo "${status}job_pool: exited ${result}: ${line}" >> "${result_log}"
            flock --unlock 8
            exec 8>&-
            _job_pool_echo "### _job_pool_worker-${id}: exited ${result}: ${line}"
        fi
    done
    exec 7>&-
}
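For context, here is a minimal, self-contained sketch of the pre-fork wiring this worker assumes: a FIFO as the job queue, N forked workers, and an end-of-jobs marker. All names and the setup code here are illustrative assumptions on my part, not the actual job_pool.sh API.

```shell
#!/bin/bash
# Minimal pre-fork pool sketch: FIFO job queue + N workers + shutdown marker.
# Illustrative only -- not the real job_pool.sh interface.

job_pool_end_of_jobs="NO_JOB_LEFT"
pool_size=4

job_queue=$(mktemp -u)      # pick an unused path for the FIFO
mkfifo "${job_queue}"

worker() {
    local line=
    exec 7<> "${job_queue}"
    while true; do
        # serialize reads so exactly one worker takes each job
        flock --exclusive 7
        IFS= read -r line <&7
        flock --unlock 7
        if [[ "${line}" == "${job_pool_end_of_jobs}" ]]; then
            echo "${line}" >&7      # pass the marker on to the next sibling
            break
        fi
        ${line}                     # run the job; word splitting is intended
    done
    exec 7>&-
}

for ((i = 0; i < pool_size; i++)); do
    worker &
done

# enqueue some jobs, then the shutdown marker
exec 9<> "${job_queue}"
echo "touch pool_job1.done" >&9
echo "touch pool_job2.done" >&9
echo "${job_pool_end_of_jobs}" >&9
exec 9>&-
wait                                # all workers exit after seeing the marker
rm -f "${job_queue}"
```

Because every worker re-writes the marker before exiting, a single end-of-jobs line is enough to shut down the whole pool.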

You can get a copy of my solution on GitHub. Here's a sample program using my implementation.

#!/bin/bash

. job_pool.sh

function foobar()
{
    # do something
    true
}   

# initialize the job pool to allow 3 parallel jobs; the second argument
# toggles echoing of commands (0 = off)
job_pool_init 3 0

# run jobs
job_pool_run sleep 1
job_pool_run sleep 2
job_pool_run sleep 3
job_pool_run foobar
job_pool_run foobar
job_pool_run /bin/false

# wait until all jobs complete before continuing
job_pool_wait

# more jobs
job_pool_run /bin/false
job_pool_run sleep 1
job_pool_run sleep 2
job_pool_run foobar

# don't forget to shut down the job pool
job_pool_shutdown

# check ${job_pool_nerrors} for the number of jobs that exited non-zero
echo "job_pool_nerrors: ${job_pool_nerrors}"

Hope this helps!



Use xargs:

xargs -P <maximum-number-of-processes-at-a-time> -n <arguments-per-process> <command>

Details here.
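A concrete, runnable version of this, assuming one task name per line on stdin. "myprog" is the asker's program; echo stands in for it here so the sketch runs anywhere.

```shell
# Run at most 4 processes at a time, passing one task name to each.
# echo stands in for the asker's "myprog" so this is runnable as-is.
printf '%s\n' task1 task2 task3 task4 task5 | xargs -P 4 -n 1 echo
```

Note that with -P the completion order (and hence output order) is not deterministic.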



Using GNU Parallel you can do:

cat tasks | parallel -j4 myprog

If you have 4 cores, you can even just do:

cat tasks | parallel myprog
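The tasks file here is assumed to be a plain file with one task name per line; a quick way to build a throwaway one for testing (file name and task names are illustrative):

```shell
# Build a throwaway tasks file, one task name per line.
printf '%s\n' task1 task2 task3 task4 > tasks
# Then: cat tasks | parallel -j4 myprog
```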

From http://git.savannah.gnu.org/cgit/parallel.git/tree/README:

Full installation

Full installation of GNU Parallel is as simple as:

./configure && make && make install

Personal installation

If you are not root you can add ~/bin to your path and install in ~/bin and ~/share:

./configure --prefix=$HOME && make && make install

Or if your system lacks 'make' you can simply copy src/parallel src/sem src/niceload src/sql to a dir in your path.

Minimal installation

If you just need parallel and do not have 'make' installed (maybe the system is old or Microsoft Windows):

wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
mv parallel sem dir-in-your-$PATH/bin/

Test the installation

After this you should be able to do:

parallel -j0 ping -nc 3 ::: foss.org.my gnu.org freenetproject.org

This will send 3 ping packets to 3 different hosts in parallel and print the output when they complete.

Watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1



I would suggest writing four scripts, each of which executes a certain number of tasks in series. Then write another script that starts the four scripts in parallel. For instance, if you have scripts script1.sh, script2.sh, script3.sh, and script4.sh, you could have a script called headscript.sh like so:

#!/bin/sh
./script1.sh &
./script2.sh &
./script3.sh &
./script4.sh &
wait    # don't exit until all four scripts have finished
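Each scriptN.sh would just run its share of the tasks in series, something like the sketch below. "myprog" is the asker's program, stubbed with a function here so the sketch is runnable.

```shell
#!/bin/sh
# Sketch of script1.sh: runs its quarter of the tasks, one after another.
# "myprog" is the asker's program, stubbed here so the sketch runs.
myprog() { echo "ran $1"; }

myprog task1
myprog task2
myprog task3
```

Note the drawback of this approach: the split is static, so if one script's tasks finish early its slot sits idle, unlike the FIFO- or xargs-based pools above, which keep 4 tasks running until the queue is empty.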

Look at my implementation of a job pool in bash: https://github.com/spektom/shell-utils/blob/master/jp.sh


Following @Parag Sardas' answer and the documentation linked, here's a quick script you might want to add to your .bash_aliases.

Relinking the doc link because it's worth a read

#!/bin/bash
# https://stackoverflow.com/a/19618159
# https://stackoverflow.com/a/51861820
#
# Example file contents:
# touch /tmp/a.txt
# touch /tmp/b.txt

if [ "$#" -eq 0 ];  then
  echo "$0 <file> [max-procs=0]"
  exit 1
fi

FILE=${1}
MAX_PROCS=${2:-0}
while IFS= read -r line; do printf "%q\n" "$line"; done < "${FILE}" | xargs --max-procs="${MAX_PROCS}" -I CMD bash -c CMD

E.g. ./xargs-parallel.sh jobs.txt 4 runs a maximum of 4 processes, reading commands from jobs.txt.
