Bash scripting techniques. # 2

My last article Bash scripting techniques caused a heated debate in the comments. Its main message was in using the library of functions. In addition, I described the method of parsing parameters in Bash. I thank everyone for the constructive comments. I draw your attention that the article is intended for a wide range of readers, and is not addressed exclusively to system administrators.

We will continue what we have begun and, with a real example, we will complement the approach to parsing the parameters and unifying the functionality of the scripts.
So, let's write a script that synchronizes a certain directory to another: dir-sync. In essence, this is cloning, and has the following differences from copying:

Files with the same date / time that already exist are not copied (this can be achieved with the cp command with the -u option)
The destination directory deletes all files and directories that are not in the source.
The script can synchronize data not only locally, but also to a remote computer

')
In other words, in the receiver, we get exactly what is in the source, and this happens in the most optimal way. This is an extremely useful approach, for example, when periodically saving a large amount of data to an external disk. Only the new is copied, and what has changed, while the deleted files in the source are also deleted in the receiver. In addition, access rights, Selinux attributes, and extended file attributes are also copied.

Actually, as many have probably already guessed, we will not reinvent the wheel, but use the rsync program, which is intended for this. Here the task is to wrap rsync with our script so that it is convenient to use. Well, who wants to write something like that ?:

 rsync -rlptgoDvEAH --delete --delete-excluded --super --force
 --progress --log-file = / var / log / rs-total.txt --log-file-format =% o% i% f% b
 / data / src / proj / perl / my / web / company / roga-i-kopyta /
 / data / save / proj / perl / my / web / company / roga-i-kopyta /

Obviously, our script has at least 2 parameters - the source and destination directories. In the last article, this situation was not considered, namely, how, along with the keys, to handle the parameters of a fixed position. And it is very desirable that the keys could be inserted anywhere, flowing around them fixed parameters. For example:

dir-sync -key1 src-sri -key2 dest-dir key3

The algorithm for parsing the parameters I described earlier allows you to process keys in any order and in two forms. It remains only to make the following changes to it:

 #     fixPrmCnt=0 pSrcDir= # - pDstDir= # - ... while [ 1 ] ; do if [ "$1" = "--yes" ] ; then pYes=1 ... else #     . #    ( )    . # ,    « » - #       (( fixPrmCnt++ )) #      #  ( )  —     #    if [ 1 -eq $fixPrmCnt ] ; then pSrcDir="$1" elif [ 2 -eq $fixPrmCnt ] ; then pDstDir="$1" #    ,   #   -  else errMess ":  " exit 1 fi fi shift done

As can be seen from the example, the processing of fixed parameters perfectly fits into the previously proposed scheme. By the way, I added several functions to the library of functions that do not need to be considered, errMess is one of them. In this article I do not focus on the implementation of library functions, since they are still very simple and obvious. You can read them in the library file (at the end of the article). For me, the main thing is to show how simple functions can significantly improve the clarity, readability and simplicity of the script code.

Now we define the functionality of our cloning script. He must:

When run without parameters, display a brief help.
Require confirmation if nothing is specified except for the two specified parameters. Yes! The script is super-destructive, and this thing will not be superfluous.
As described earlier, the --yes key is used to suppress confirmation. This will allow to use the script in other scripts.

And here is another highlight:

When specifying the -i option, the script should become interactive.

For this script, this feature is probably unnecessary. I describe it as a demonstration of what is possible, and sometimes convenient. In fact, if the script has many options that it is important not to forget, it is better to provide such an opportunity, but of course, not for every reason. The general rule is that interactivity should be optional. However, there are also scripts that should be extremely interactive — for example, to perform specific actions for inexperienced users.

Well, actually functional:

Mode selection: replica / update
Enable background mode (via nohup)
Set log file name
An opportunity to look at the resulting rsync command (also as an example, no more)

For the first time, the functionality is enough. If necessary, we will develop it in the future. I won't describe rsync and nohup — it's enough just to read their man.

We describe all the keys:

--yes: Suppress confirmation request
-i | --interactive: enable interactive mode
-lf | --log-file =: set log file name
-u | --update: update mode (default is an exact copy)
-sc | --show-command: show final rsync command
-n | --dry-run: “idle mode” - rsync starts and informs about actions, but does not actually do anything
-bg | --background: run in background

These keys require corresponding variables. So the header of the script will be something like this:

 #   fixPrmCnt=0 #    pInter= #   pLogFile= #  - pUpdate= #   pShowCmd= #   rsync pDryRun= #   pBackgr= #    pSrcDir= #   pDstDir= #   RSCmd= #  rsync RSPrm= #   rsync

We do not declare the parameter pYes, as it is in our library. Now consider the main blocks of the program.

Here is what the parameter handling looks like:

 if [ -z "$1" ] ; then usage exit fi while [ 1 ] ; do if [ "$1" = "--yes" ] ; then pYes=1 elif [ "$1" = "-i" ] ; then pInter=1 elif [ "$1" = "--interactive" ] ; then pInter=1 elif procParmS "-lf" "$1" "$2" ; then pLogFile="$cRes" ; shift elif procParmL "--log-file" "$1" ; then pLogFile="$cRes" elif [ "$1" = "-u" ] ; then pUpdate=1 elif [ "$1" = "--update" ] ; then pUpdate=1 elif [ "$1" = "-sc" ] ; then pShowCmd=1 elif [ "$1" = "--show-command" ] ; then pShowCmd=1 elif [ "$1" = "-n" ] ; then pDryRun=1 elif [ "$1" = "--dry-run" ] ; then pDryRun=1 elif [ "$1" = "-bg" ] ; then pBackgr=1 elif [ "$1" = "--background" ] ; then pBackgr=1 elif [ -z "$1" ] ; then break #   else (( fixPrmCnt++ )) if [ 1 -eq $fixPrmCnt ] ; then pSrcDir="$1" elif [ 2 -eq $fixPrmCnt ] ; then pDstDir="$1" else errMess ":  " exit 1 fi fi shift done

In the absence of parameters, a brief reference is displayed. Further processing of the parameters, followed by their verification and, if necessary, a change (biting off the final slash).

 checkParm "$pSrcDir" "  -" checkParm "$pDstDir" "  -" if [ "$pInter" = "1" ] && [ "$pYes" = "1" ] ; then errMess " : --yes  -i" exit 1 fi #   ,    pSrcDir="${pSrcDir%/}" pDstDir="${pDstDir%/}" checkDir "$pSrcDir" checkDir "$pDstDir"

The non-interactive part of the script looks very simple:

 #    if [ "$pInter" != "1" ] ; then #   if [ "$pYes" != "1" ] ; then echo " ${curScript##*/}  !" showInfo myAskYesNo "    !  ?" || exit fi createCmd

We will look at the showInfo and createCmd functions — this is actually the display of information about the parameters and the generation of the rsync command.

And the next block is the interactive part. Once again I draw your attention to how simple and readable the code becomes if you use the functions of the library. Even the interactive part does not take up much space and is also understandable and simple.

  cat <<EOF  ${curScript##*/}  !        .   : ------------------------ c) clone ( ) u) update ( ) .)  EOF input1 " : " "cu." [ "$cRes" = "." ] && exit pBackgr= #    ,     input1 "       ? (y/n): " "yn." [ "$cRes" = "." ] && exit [ "$cRes" = "y" ] && pBackgr=1 #       ,      read -p "  - ( : $pLogFile): " a1 [ -n "$a1" ] && pLogFile="$a1" [ "$a1" = "." ] && exit pShowCmd= #    ,     input1 "    ? (y/n): " "yn." [ "$cRes" = "." ] && exit [ "$cRes" = "y" ] && pShowCmd=1 createCmd echo #     showInfo if [ "$pShowCmd" = "1" ] ; then echo " rsync:" echo " $RSCmd" "${RSPrm[@]}" fi myAskYesNo "!  ?" || exit

As we can see, here parameters are sequentially polled, but not all. It is assumed that the directories are still provided from the command line and do not participate in the survey - it is more convenient to set them there, although nothing prevents those who wish to add their processing here.

There are also showInfo and createCmd functions.

And now we slightly modify the function input1 (see in the library) so that it accepts a parameter that says that if you press a dot, you need to exit the script - “dot-exit”. We exclude one line to process each parameter! Now the part of the code responsible for this looks like this:

  input1 " : " "cu." "dot-exit" pBackgr= #    ,     input1 "       ? (y/n): " "yn." "dot-exit" [ "$cRes" = "y" ] && pBackgr=1 #       ,      read -p "  - ( : $pLogFile): " a1 [ -n "$a1" ] && pLogFile="$a1" pShowCmd= #    ,     input1 "    ? (y/n): " "yn." "dot-exit" [ "$cRes" = "y" ] && pShowCmd=1

You can go further and enter several functions for entering parameters. But it will leave the next time.

The ending is obvious:

 if [ "$pBackgr" = "1" ] ; then nohup $RSCmd "${RSPrm[@]}" & else $RSCmd "${RSPrm[@]}" fi

We will look at using the array a little later, and now we will notice that if rsync is launched at the very end, the result of its execution will be the result of the execution of our script. By this we strive for the implementation of the rule that any script should return a result.

And now we will consider functions which are also simple and clear.

 showInfo() { local a1 if [ "$pUpdate" = "1" ] ; then a1="" else a1="" fi padMid 80 "" "$a1" ; echo $cRes padMid 80 "" "$pSrcDir" ; echo $cRes padMid 80 "" "$pDstDir" ; echo $cRes padMid 80 "-" "$pLogFile" ; echo $cRes transYesNoRu $pBackgr padMid 80 "  " "$cRes" ; echo $cRes transYesNoRu $pDryRun padMid 80 "   " "$cRes" ; echo $cRes }

Here we use the library functions padMid to display the parameter values nicely and smoothly (the parameter “80” is the width of the string). The transYesNoRu function of 1 makes “yes”, from the rest “no”.

The conclusion is approximately as follows:

 .....................................  .......................... /data/src/proj/fed16 ........................... /data/src/proj/sync -......................... /var/log/dir-sync.log   ...................................     ........................

Finally, the heart of the script is the generation of the rsync command, where keys are sequentially added in accordance with the specified parameters.

 createCmd() { RSCmd="$rsync" if [ "$pUpdate" = "1" ] ; then RSCmd="$RSCmd -urlptgoDvEAH" else RSCmd="$RSCmd -rlptgoDvEAH --delete" fi #    -    ,   if [ "$pBackgr" = "1" ] ; then RSCmd="$RSCmd -q" else RSCmd="$RSCmd --progress -v" fi if [ "$pDryRun" = "1" ] ; then RSCmd="$RSCmd -n" fi RSCmd="$RSCmd --super --force" #   -   n=-1 ((n++)) ; RSPrm[n]="--log-file=$pLogFile" ((n++)) ; RSPrm[n]="$pSrcDir/" ((n++)) ; RSPrm[n]="$pDstDir/" }

That is, createCmd generates the RSCmd variable, which is then run at the end of the script.

Note especially the use of the RSPrm array. The fact is that if there are spaces in the file names (and we are writing a more or less universal script, which should be taken into account this moment), then the assembly of a single RSCmd line will not work. Remember the ending: $ RSCmd "$ {RSPrm [@]}"? If everything were typed only in the $ RSCmd line and the ending would look like $ RSCmd, then the name of the directory or log file with spaces would be broken by the bash interpreter. For example, if you specified the source directory “my dir”, instead of copying “my dir” where indicated, there would be an attempt to copy my into dir, and then another into this “somewhere”.

Attempts to build a string like

 RSCmd="$RSCmd \"$pSrcDir/\" \"$pDstDir/\" "

, that is, to add escaped quotes to this line, they will also not succeed. We will get the names of files like “my dir” along with quotes .

Using an array solves this problem. The array is also initialized as a normal variable (RSPrm =), more precisely, it is a normal variable until it is used as an array. And that's exactly what we do when we execute ((n ++)); RSPrm [n] = "- log-file = $ pLogFile". Array indices in bash start at zero. For universality and readability, we initialize n = -1, then simply increment it and get a new valid index.

The use of the array occurs at the end:

 $RSCmd "${RSPrm[@]}"

This construction does the following - the elements of the array are inserted into the string separately, and they are an indivisible parameter, whatever they are (whitespace characters.) If you replace the @ symbol with * we get an effect similar to using a regular string, that is, each element the array will be parsed by whitespace characters and it is precisely the tokens broken in this way that will appear as parameters. This is exactly what we avoided, therefore we need only @ here.

In general, using arrays in this way is extremely useful when the parameters are strings containing spaces. For example, the same rsync can accept file filtering options, such as: '-f- * .tmp', meaning that synchronization ignores * .tmp files. So, '-f- * .tmp' is a single parameter that contains a space. If you collect strings once, you can specify these parameters in quotes or apostrophes of the type:

 rsync ... '-f- *.tmp' '-f- *.log' ...

But if you try to assemble such a line in advance, and then execute it - there will be a guard! For example:

 param="-f- *.tmp" param="$param -f- *.log"

likewise does not work and

 param="'-f- *.tmp'" param="$param '-f- *.log'"

And in such cases we are forced to use an array as described above.

Summary

We added the processing of fixed parameters to the parsing algorithm.
We have seen that interactivity can be provided with simple and understandable means.
We have seen that library functions clarify and make writing easier.
We have seen how arrays can help with parameter assembly.
We got a really working script within its capabilities.

What haven't we got? Of course the perfect sync script. This option is far from perfect, and there are more complaints about it than lines of code in it. But he did not claim to be the ideal, but just a clear example. But he still has, besides clarity, one more advantage - it works. And performs its narrow function.

I draw the attention of readers who are not familiar with rsync - one of the directories can be set on a remote machine in the form

 [user@][host:]dir-from-root

, i.e

 vova@mycomp:/save/work mycomp:/save/work

The script call can be, for example, like this:

 dir-sync -u /work mycomp:/save/work

If there are readers who would like to continue developing this script - write in the comments.

Library files and the script itself can be found here .

Source: https://habr.com/ru/post/163691/

All Articles

Bash scripting techniques. # 2

Summary

More articles: