Preventing a second instance of a Perl program from starting

From time to time you need to guarantee that only one copy of a program is running. For example, a script may generate a certain file; if two instances of the script run at the same time, the file will be corrupted.

The process being started therefore has to check whether it is the only running instance of the program or whether another instance is already running.
There are several ways to perform this check, and they differ in reliability.

Basic methods

1) Verifying the existence of a pid file

The script runs and checks for the presence of the pid file. If the pid file already exists, it means that another instance of the script is already running and should not be run a second time. If the pid file does not exist, the script creates the pid file and starts working.

The problem is that the first instance may crash without deleting the pid file. From then on the script cannot be started at all: every new instance finds the pid file, considers itself the second instance and refuses to run until the pid file is deleted by hand. There is also a race condition, because checking that the file exists and then creating it are two separate operations rather than a single atomic one.
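The first method can be sketched roughly as follows, purely to make the two weaknesses visible. The function name naive_check and the file path are hypothetical and do not come from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper illustrating method 1 (NOT a reliable check).
# Returns true if the pid file already exists; otherwise creates it
# and returns false.
sub naive_check {
    my ($file) = @_;

    # Step 1: test for existence.
    return 1 if -e $file;

    # Another process may create the file right here, between the test
    # above and the creation below: this gap is the race condition.

    # Step 2: create the pid file and record our PID.
    open my $fh, '>', $file or die "Cannot create $file: $!";
    print {$fh} "$$\n";
    close $fh or die "Cannot close $file: $!";

    return 0;
}

# Assumed path, for illustration only.
my $pid_file = "/tmp/naive_check_demo.$$.pid";

print "first call:  ", ( naive_check($pid_file) ? "refused" : "started" ), "\n";
print "second call: ", ( naive_check($pid_file) ? "refused" : "started" ), "\n";

# If the script dies before this line, the stale file blocks every later run.
unlink $pid_file;
```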

2) Checking for the PID in the process list

The script starts, reads the pid file and then checks if there is a process with the read pid in the process table. If such a process exists, it means that another instance of the script is already running and should not be started a second time. If such a process does not exist, the script writes its PID to the PID file and starts working.

The problem is that the first instance may crash, and the PID it ran under may later be assigned to another process. After that, as with the first method, the script can no longer be started. This is somewhat less likely than in the first case, because a freed PID is not reissued immediately, and the chance that an unrelated process receives exactly the same PID is small. But it is not zero: the number of PIDs is finite and they are handed out cyclically. Race conditions remain as well, since this method involves even more separate operations than the first.
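A rough sketch of the second method, assuming the process table is probed with kill and signal 0 (the article does not say how the lookup is done). The name check_by_pid and the path are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper illustrating method 2 (still not reliable).
# Returns the PID stored in the pid file if a process with that PID is
# alive; otherwise writes our own PID to the file and returns undef.
sub check_by_pid {
    my ($file) = @_;

    if ( open my $in, '<', $file ) {
        my $pid = <$in>;
        close $in;

        if ( defined $pid and $pid =~ /^(\d+)/ and $1 > 0 ) {
            $pid = $1;

            # Signal 0 delivers nothing; it only tests whether the process
            # exists. EPERM means it exists but belongs to another user.
            # A live process is not necessarily OUR script: the PID may
            # have been reused.
            return $pid if kill( 0, $pid ) or $!{EPERM};
        }
    }

    # No live owner found: record our own PID. The read above and this
    # write are separate steps, so two instances can still race here.
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} "$$\n";
    close $out or die "Cannot close $file: $!";

    return;
}

# Assumed path, for illustration only.
my $pid_file = "/tmp/check_by_pid_demo.$$.pid";

my $first  = check_by_pid($pid_file);    # no pid file yet
my $second = check_by_pid($pid_file);    # finds our own, still living, PID

print "first call:  ", ( defined $first  ? $first  : 'not running' ), "\n";
print "second call: ", ( defined $second ? $second : 'not running' ), "\n";

unlink $pid_file;
```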

3) PID file lock

The script starts and tries to lock the pid file. If the lock cannot be obtained, another instance of the script is already running and a second one should not be started. If the lock succeeds, the script continues to work.

This method has none of the problems of the previous two:

If the first instance of the script crashes, the lock on the pid file is released automatically, so the next instance can be started without any problems.
There are no race conditions, since taking the lock is an atomic operation.

Thus, this method is guaranteed to block the launch of the second copy of the program.

Pid locking method

Let us look at the implementation of this method in detail.

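    #!/usr/bin/perl

    use Carp;
    use Fcntl qw(:DEFAULT :flock);

    check_proc('/tmp/testscript.pid') and die "The script is already running, exiting!\n";

    # Here goes the code
    # that must run
    # in a single instance
    sleep 15;

    # Check for an already running instance
    sub check_proc {
        my ($file) = @_;
        my $result;

        # LOCK is a global (bareword) handle: it stays open, and the lock
        # stays held, until the program exits
        sysopen LOCK, $file, O_RDWR|O_CREAT or croak "Cannot open file $file: $!";

        if ( flock LOCK, LOCK_EX|LOCK_NB ) {
            truncate LOCK, 0 or croak "Cannot truncate file $file: $!";

            my $old_fh = select LOCK;
            $| = 1;
            select $old_fh;

            print LOCK $$;
        }
        else {
            $result = <LOCK>;

            if (defined $result) {
                chomp $result;
            }
            else {
                carp "No PID in pid file $file";
                $result = '0 but true';
            }
        }

        return $result;
    }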

First of all, the script calls the check_proc function, which checks whether another instance is already running. If it finds one, the script stops with an appropriate message.

Notice that in this line the calls to check_proc and die are joined with the logical operator and. Such combinations are usually written with the or operator, but here the logic is different: we are in effect telling the script, “Realize the meaninglessness of your existence and die!”.

The check_proc function returns the PID of the already running instance if there is one, and undef otherwise. Accordingly, a true result means that one copy of the program is already running and a second one must not be started.

The check_proc function

Now let's go through the check_proc function line by line.

1) The sysopen function opens a file for reading and writing.

It is important that the file is opened in non-destructive mode, otherwise its contents will be destroyed. For this reason the ordinary read-write modes of the plain open function do not fit: '+<' does not create the file if it is missing, and '+>' truncates it.

The sysopen function with the O_RDWR | O_CREAT flags opens the file in nondestructive mode. The O_RDWR flag means opening simultaneously for reading and writing, the O_CREAT flag creates a file if it does not exist at the time of opening. Flags are imported from the Fcntl module (you can do without Fcntl if you use the numerical values of the flags).
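The difference between a non-destructive and a destructive open can be checked with a small experiment. This is only an illustrative sketch; the file path and variable names are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT);

my $file = "/tmp/sysopen_demo.$$.txt";    # assumed path, illustration only

# Put some content into the file first.
open my $out, '>', $file or die "Cannot create $file: $!";
print {$out} "12345\n";
close $out or die "Cannot close $file: $!";

# O_RDWR|O_CREAT: read-write, created if missing, existing content kept.
sysopen( my $fh, $file, O_RDWR | O_CREAT ) or die "Cannot open $file: $!";
my $kept = <$fh>;
close $fh;

# By contrast, open with '+>' is also read-write but truncates the file.
open my $trunc, '+>', $file or die "Cannot open $file: $!";
my $lost = <$trunc>;
close $trunc;

print "after sysopen: ", ( defined $kept ? $kept : "(empty)\n" );
print "after '+>':    ", ( defined $lost ? $lost : "(empty)\n" );

unlink $file;
```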

2) The flock function locks the file.

Since we need to make sure that only one process holds the lock, we request an exclusive lock, which is specified by the LOCK_EX flag. Once a process holds an exclusive lock, no other process can obtain one on the same file at the same time. This is the key call: the whole mechanism for blocking a second copy of the program rests on it.

By default, if flock finds that someone else has already locked the file, it waits until the lock is released. This behavior does not suit our check. We do not want to wait for the file to be released; we want check_proc to return a positive result immediately when it detects a lock. The LOCK_NB flag does exactly that.

Further behavior depends on whether the lock succeeded (3) or failed (4).
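The effect of LOCK_NB can be observed with a child process that plays the role of the second instance. This is a sketch under assumptions: try_lock and the path are hypothetical names, and the exit status convention (1 means "refused") is chosen only for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Hypothetical helper: returns an open, locked handle on success,
# or undef at once if another process already holds the lock.
sub try_lock {
    my ($file) = @_;
    sysopen( my $fh, $file, O_RDWR | O_CREAT ) or die "Cannot open $file: $!";
    return flock( $fh, LOCK_EX | LOCK_NB ) ? $fh : undef;
}

my $file = "/tmp/flock_demo.$$.lock";    # assumed path, illustration only

# The parent plays the first instance and keeps the handle open.
my $held = try_lock($file) or die "Unexpected: lock already taken";

# The child plays the second instance.
my $child = fork();
die "fork failed: $!" unless defined $child;

if ( $child == 0 ) {
    # With LOCK_NB, flock returns false at once instead of waiting.
    my $got = try_lock($file);
    exit( $got ? 0 : 1 );    # exit status 1 means "refused"
}

waitpid $child, 0;
my $refused = ( $? >> 8 ) == 1;

my $message = $refused ? "second instance refused" : "second instance got the lock";
print "$message\n";

close $held;
unlink $file;
```

Without LOCK_NB the child would block in flock until the parent released the lock or exited.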

3a) The truncate function clears the file

Since we opened the file in non-destructive mode, its old contents are still there. We do not need them, and they may even get in the way, so the file must be emptied.

3b) The combination of select and the $| variable disables buffering

We need to write the PID of the current process to the pid file. But output to a file is buffered in blocks, so the write of the PID will appear to have completed while the file is in fact still empty. Any other process that tries to read our PID from the pid file will then find nothing there. Our own check relies on the lock on the pid file, not on the PID stored in it, so for our purposes a missing PID is not a disaster. But for processes that need the PID itself, it is a problem.

To disable output buffering, the $| variable must be set to a true value for the output handle in question. $| applies to the currently selected output handle, so the first select makes our pid file handle the current one (and returns the previous one), then $| is set to a true value, and the second select restores the previous handle (normally STDOUT). After that, every write to the file is flushed immediately, without waiting for the buffer to fill.
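The effect of the select and $| idiom can be checked by reading the file back while the writing handle is still open. A minimal sketch, with an assumed file path and a lexical handle instead of the LOCK handle used in check_proc:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $file = "/tmp/autoflush_demo.$$.pid";    # assumed path, illustration only

open my $fh, '>', $file or die "Cannot open $file: $!";

# Same idiom as in check_proc: select the handle, set $|, restore.
my $old_fh = select $fh;
$| = 1;
select $old_fh;

print {$fh} $$;

# The writing handle is still open, yet the PID has already been
# written out, so a separate reader sees it straight away.
open my $reader, '<', $file or die "Cannot read $file: $!";
my $seen = <$reader>;
close $reader;

print "reader sees: ", ( defined $seen ? $seen : '(nothing yet)' ), "\n";

close $fh;
unlink $file;
```

The same result can be had with the autoflush method from the core IO::Handle module, i.e. $fh->autoflush(1), which some readers find easier to follow than the select idiom.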

4a) Read the pid from the pid file

Reading from the file is trivial in itself, but keep in mind that the file may not contain a PID. That would mean that an instance of the program is already running (after all, the lock could not be obtained), but for some reason its PID was not recorded. This is not a problem for our check, because it does not rely on the PID. But check_proc must return a true value whenever it detects a running instance, so in place of the missing PID it has to return something else that is nevertheless true.

A suitable value in this case is “true zero”. This is a magic value (of which Perl has many) that is zero in numeric context and true in boolean context. There are several ways to write true zero; I use the string "0 but true".
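The two faces of true zero are easy to see in a short experiment. The variable names below are illustrative; '0E0' is mentioned only as another commonly used spelling.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $true_zero = '0 but true';

# Numeric context: the string counts as 0, and Perl exempts this exact
# string from the usual "isn't numeric" warning.
my $as_number = $true_zero + 0;

# Boolean context: any non-empty string other than '0' is true.
my $as_bool = $true_zero ? 'true' : 'false';

# '0E0' is another spelling that is numerically zero and logically true.
my $alt        = '0E0';
my $alt_number = $alt + 0;
my $alt_bool   = $alt ? 'true' : 'false';

printf "%s -> number %d, boolean %s\n", $true_zero, $as_number,  $as_bool;
printf "%s -> number %d, boolean %s\n", $alt,       $alt_number, $alt_bool;
```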

Conclusion

The method of blocking a pid-file is the most reliable way to ensure that the program runs in a single copy.

The check_proc function, together with the loading of the Fcntl module, can be moved into a separate module (named, for example, MacLeod.pm). Making the program run in a single copy then takes just two lines:

    use MacLeod;
    check_proc('/tmp/testscript.pid') and die "The script is already running, exiting!\n";

Or, the check can be made a bit more detailed:

    use MacLeod;

    my $pid = check_proc('/tmp/testscript.pid');

    if ($pid) {
        die "Process $pid is already running, exiting!\n";
    }
    else {
        print "Started!\n";
    }

In this case, the PID of the running process returned by check_proc is stored in the $pid variable and can be shown in the message.
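The module itself is not shown in the article. A minimal sketch of what MacLeod.pm might look like, assuming the core Exporter module is used to make check_proc available to the calling script; the error messages are placeholders.

```perl
package MacLeod;

use strict;
use warnings;
use Carp;
use Fcntl qw(:DEFAULT :flock);
use Exporter 'import';

# Exported by default, so a plain "use MacLeod;" is enough for the
# caller to use check_proc (an assumption of this sketch).
our @EXPORT = qw(check_proc);

# Returns the PID of an already running instance (or '0 but true' if
# the PID is missing), and undef if we obtained the lock ourselves.
sub check_proc {
    my ($file) = @_;
    my $result;

    # LOCK is a package-global handle: it stays open, and the lock stays
    # held, until the program exits.
    sysopen( LOCK, $file, O_RDWR | O_CREAT )
        or croak "Cannot open file $file: $!";

    if ( flock( LOCK, LOCK_EX | LOCK_NB ) ) {
        truncate( LOCK, 0 ) or croak "Cannot truncate file $file: $!";

        # Disable buffering on LOCK so the PID is written immediately.
        my $old_fh = select LOCK;
        $| = 1;
        select $old_fh;

        print LOCK $$;
    }
    else {
        $result = <LOCK>;

        if ( defined $result ) {
            chomp $result;
        }
        else {
            carp "No PID in pid file $file";
            $result = '0 but true';
        }
    }

    return $result;
}

1;
```

The file has to be saved as MacLeod.pm in a directory listed in @INC (for example, next to the script, with "use lib '.';" or the PERL5LIB variable set accordingly).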

Source: https://habr.com/ru/post/235279/
