
A heavy-duty job queue in Bash

Good day, Habr dwellers and Habr readers!
I recently faced the following task: monitor a directory for new files, and whenever a file appears, move it to a safer place and run a rather long processing job on it. Sounds simple, but there is a catch: several files cannot be processed at the same time (the processing pulls data from foreign servers that do not let you download much from a single IP).
A task queue (FIFO) immediately came to mind, and I wanted to build it in bash (which is perfectly capable of it). If you would like a ready-made solution, welcome under the cut.

The article is aimed at beginners who are hearing the word FIFO in connection with bash for the first time.


In short, here is what we will build: a queue of commands that must be executed one at a time. The script that watches the queue checks for the lock file jobq.lock. If it is absent, nobody is running a task, and the next one can safely be taken. If it is present, there is nothing to read from the queue, and the script can leave with a sense of accomplishment.
First, create the queue and location of our scripts:
 umask 077
 mkdir -p ~/jobs/var
 mkfifo ~/jobs/var/jobq
 mkdir -p ~/jobs/bin

The scripts themselves will live in bin, and everything related to the queue goes in var (the queue jobq itself, as well as the lock file jobq.lock).
We also need working, input, and output directories. In my case these are ~/jobs/Input, ~/jobs/Work and ~/jobs/Output.
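The setup can be sanity-checked with a couple of commands. The snippet below repeats it in a throwaway directory (a stand-in for ~/jobs, so it can be run anywhere): `test -p` confirms that jobq really is a named pipe, and thanks to `umask 077` nothing we created is readable by other users.

```shell
umask 077
tmp=$(mktemp -d)                # stand-in for ~/jobs
mkdir -p "$tmp/var"
mkfifo "$tmp/var/jobq"

# -p tests specifically for a named pipe
test -p "$tmp/var/jobq" && echo "jobq is a named pipe"

# umask 077 stripped the group/other bits from everything we created
stat -c '%a' "$tmp/var/jobq"    # 600 (GNU coreutils stat)
# (remove the scratch tree with: rm -rf "$tmp")
```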

Next we write the scripts themselves. There turned out to be three of them:
  1. One that watches for new data and moves it away
  2. One that submits new data to the queue (this script is kept separate on purpose; the reasons are discussed in its comments)
  3. One that actually checks the queue and starts jobs from it.


Let's go in order of numbering ($HOME/jobs/bin/mover.sh):
 #!/bin/bash
 # mover.sh: pick up new files and hand them over for processing

 # List the files currently sitting in the Input directory
 FILES_LIST=( $(ls $HOME/jobs/Input) )

 for raw_file in ${FILES_LIST[@]}; do
     # Move the file into the working directory right away,
     # so the next cron run does not pick it up a second time
     mv $HOME/jobs/Input/$raw_file $HOME/jobs/Work/

     # Strip the extension to get a name for this job
     filename=$(basename $raw_file)
     name=${filename%.*}

     # Each job gets its own output directory
     mkdir -p $HOME/jobs/Output/$name

     # Hand the command over to script #2, which puts it into the queue.
     # Running the processing directly would look like this:
     # $HOME/complicated_task.sh -i $HOME/jobs/Work/$raw_file -o $HOME/jobs/Output/$name >> $HOME/jobs/Output/$name/task.log
     $HOME/jobs/bin/submit.sh "$HOME/complicated_task.sh -i $HOME/jobs/Work/$raw_file -o $HOME/jobs/Output/$name >> $HOME/jobs/Output/$name/task.log"
 done

Everything in this script is quite simple, and (I hope!) the comments describe each step well.
All that remains is to hand the job over to crontab. We will run this script every minute:
 crontab -e
 * * * * * $HOME/jobs/bin/mover.sh
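One thing to note before moving on: the `ls`-in-an-array loop in mover.sh breaks on filenames that contain spaces. Below is a sketch of a space-safe variant of the same move-and-prepare logic, demonstrated on a throwaway directory tree so it can be tried standalone (the submit step is shown as a comment, since complicated_task.sh and the queue only exist in the real setup):

```shell
tmp=$(mktemp -d)                      # stand-in for ~/jobs
mkdir -p "$tmp/Input" "$tmp/Work" "$tmp/Output"
touch "$tmp/Input/my data.csv"        # a filename with a space in it

# -print0 / read -d '' keeps each filename intact, whatever it contains
find "$tmp/Input" -maxdepth 1 -type f -print0 |
while IFS= read -r -d '' src; do
    filename=$(basename "$src")
    name=${filename%.*}
    mv "$src" "$tmp/Work/"
    mkdir -p "$tmp/Output/$name"
    # In the real script the job would be queued here, e.g.:
    # $HOME/jobs/bin/submit.sh "$HOME/complicated_task.sh -i \"$tmp/Work/$filename\" -o \"$tmp/Output/$name\""
done

ls "$tmp/Work"                        # my data.csv
# (remove the scratch tree with: rm -rf "$tmp")
```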


Moving on to the second script, the one that puts our jobs into the queue ($HOME/jobs/bin/submit.sh):
 #!/bin/bash
 # submit.sh: put a command into the queue
 #
 # This script is kept separate so that anything, not just mover.sh,
 # can submit a job to the queue.
 # The write is sent to the background (note the & at the end) -
 # the reason is explained right below.
 echo "$*" > $HOME/jobs/var/jobq &


Indeed, if you do not put & at the end, the script will hang and wait until all previously queued tasks finish. Why put up with that? Send it to the background.
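This blocking behaviour is easy to see on a throwaway FIFO (the paths here are illustrative): opening a FIFO for writing stalls until some reader opens the other end, which is exactly what the & works around.

```shell
tmp=$(mktemp -d)
mkfifo "$tmp/q"

# Without the &, this echo would hang right here: there is no reader yet.
echo "job-1" > "$tmp/q" &

# The writer is released the moment the line is consumed:
read job < "$tmp/q"
echo "got: $job"        # got: job-1

wait                    # reap the background writer
rm -rf "$tmp"
```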

And finally, the hero of the occasion: the script that reads the queue and runs jobs from it ($HOME/jobs/bin/execute.sh):
 #!/bin/sh
 # execute.sh: check the queue and run the next job
 #
 # jobq.lock means some job is already running,
 # so this run should quietly step aside
 test -f $HOME/jobs/var/jobq.lock && exit 0
 # Nobody is working - take the lock ourselves
 touch $HOME/jobs/var/jobq.lock || exit 2
 # Read one job from the queue
 read job < $HOME/jobs/var/jobq
 # Log what is about to run, and when:
 date >> $HOME/jobs/jobs.log
 echo " RUN: $job" >> $HOME/jobs/jobs.log
 echo "" >> $HOME/jobs/jobs.log
 eval $job
 # Remember the job's exit status
 status=$?
 # Done - release the lock
 rm -f $HOME/jobs/var/jobq.lock || exit 3
 # Exit with the status of the job itself
 exit $status


And again we create a new task for our friend crontab:
 crontab -e
 * * * * * $HOME/jobs/bin/execute.sh
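One caveat about execute.sh: the `test -f` / `touch` pair is not atomic, so two copies started close together could both see no lock file and both grab a job. On POSIX filesystems `mkdir` either creates the directory or fails in a single operation, which makes it a common shell-locking idiom. A sketch (the lock path is illustrative, demonstrated in a throwaway directory):

```shell
tmp=$(mktemp -d)
LOCK=$tmp/jobq.lock

# mkdir either creates the lock directory (we won) or fails (someone
# else holds it) - there is no window between "check" and "take".
if mkdir "$LOCK" 2>/dev/null; then
    echo "lock acquired"
    # ... read the queue and run the job here ...
    rmdir "$LOCK"                       # release
fi

# While the lock is held, a second attempt fails:
mkdir "$LOCK"                           # take it again for the demo
mkdir "$LOCK" 2>/dev/null || echo "lock is busy"
rmdir "$LOCK"
# (remove the scratch tree with: rm -rf "$tmp")
```

Where it is available, `flock(1)` from util-linux is another common way to get the same guarantee.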


This system has been running stably for a couple of weeks now, and I decided to write it up here in case someone else finds it useful.

Source: https://habr.com/ru/post/151684/

