The book "UNIX. Professional Programming. 3rd ed."

Hello! We have reprinted, in hardcover, the classic work by W. Richard Stevens and Stephen Rago, with the misprints in the translation corrected.

This book is deservedly popular with serious programmers around the world because it contains the most important practical information about programming against the UNIX and Linux kernels. Without this knowledge, it is impossible to write efficient and reliable code. Starting with the basics (files, directories, and processes), you will gradually move on to more complex topics, such as signal handling and terminal I/O, multithreading, and interprocess communication using sockets. In total, the book covers more than 70 interfaces, including POSIX asynchronous I/O functions, spin locks, barriers, and POSIX semaphores.

Below is an excerpt from the chapter “Daemon Processes”.

13.2. Daemon characteristics

Let us look at some of the most common system daemons and how they relate to the process groups, controlling terminals, and sessions described in Chapter 9. The ps(1) command prints the status of processes in the system. It has many options; consult your system's manual for details. Run the command

ps -axj

on a BSD-based system; we will use the information it prints in the discussion that follows. The -a option shows the status of processes owned by other users, the -x option shows processes that do not have a controlling terminal, and the -j option displays job-related information: the session ID, process group ID, controlling terminal, and terminal process group ID.

For systems based on System V, a similar command looks like ps -efj. (For security reasons, some versions of UNIX do not allow processes belonging to other users to be viewed by the ps command.) The output of the ps command looks like this:

We have removed several columns from this example that are of little interest here, such as accumulated CPU time. The columns shown, from left to right, are the user ID (UID), process ID (PID), parent process ID (PPID), process group ID (PGID), session ID (SID), terminal name (TTY), and command string (CMD).

The system on which this command was run (Linux 3.2.0) supports the notion of a session identifier, which we mentioned when discussing the setsid function in section 9.5. The session ID is just the session leader process ID. However, in BSD-based systems, the address of the session structure corresponding to the process group to which this process belongs (section 9.11) will be displayed.
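As a rough illustration of the identifiers that ps reports, the sketch below collects the same four IDs for the calling process using getpid, getppid, getpgrp, and getsid. The struct job_ids type and both helper names are made up for this example. The is_session_leader check relies on the convention described above, where the session ID is the process ID of the session leader.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the job-related IDs of one process. */
struct job_ids {
    pid_t pid;   /* process ID */
    pid_t ppid;  /* parent process ID */
    pid_t pgid;  /* process group ID */
    pid_t sid;   /* session ID */
};

/* Fill in the IDs of the calling process; returns 0 on success, -1 on error. */
int get_job_ids(struct job_ids *ids)
{
    ids->pid  = getpid();
    ids->ppid = getppid();
    ids->pgid = getpgrp();
    ids->sid  = getsid(0);          /* 0 means "the calling process" */
    return (ids->sid == (pid_t)-1) ? -1 : 0;
}

/* A process is a session leader when its session ID equals its own PID. */
int is_session_leader(const struct job_ids *ids)
{
    return ids->sid == ids->pid;
}
```

A process started from an interactive shell will normally report that it is not a session leader, because the session ID it inherits is that of the login shell.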

The system processes you will see depend largely on the operating system implementation. Typically, processes with a parent process ID of 0 are kernel processes started during the boot procedure. (The exception is init, which is a user-level command that the kernel starts at boot time.) Kernel processes are special and generally exist for the entire lifetime of the system. They run with superuser privileges and have neither a controlling terminal nor a command line.

In this sample ps output, kernel daemons can be recognized by the names in square brackets. This version of Linux uses a special kernel process, kthreadd, to create other kernel processes, so kthreadd appears as the parent of the other kernel daemons. Each kernel component that needs to perform work in a process context, but is not invoked from a user-level process, usually has its own kernel daemon. For example, on Linux:

The kswapd daemon is also known as the pageout daemon. It supports the virtual memory subsystem by gradually writing modified pages to disk so that they can be reclaimed.
The flush daemon writes modified pages to disk when available memory drops to a configured minimum. It also writes modified pages back to disk at regular intervals to reduce data loss in the event of a system crash. Several flush daemons can run at once, one for each device. The example above shows a single flush daemon named flush-8:0, where 8 is the major device number and 0 is the minor device number.
The sync_supers daemon periodically flushes file system metadata to disk.
The jbd daemon provides ext4 journal support.

The process with ID 1 is usually init (launchd on Mac OS X), as discussed in Section 8.2. It is a system daemon responsible for, among other things, starting the system services specific to the various run levels. As a rule, these services are also implemented as daemons.

The rpcbind daemon maps RPC (Remote Procedure Call) program numbers to network port numbers. The rsyslogd daemon is available to any program that wants to write messages to the system log for an administrator to read later. Messages can be printed on the console and also written to a file. (The syslog facility is discussed in more detail in Section 13.4.)
We talked about the inetd daemon in Section 9.3. It listens on the system's network interfaces for incoming requests to the various network servers. The nfsd, nfsiod, lockd, rpciod, rpc.idmapd, rpc.statd, and rpc.mountd daemons provide support for the Network File System (NFS). Note that the first four of these are kernel daemons, and the last three are user-level daemons.

The cron daemon executes commands at scheduled dates and times. Many system administration tasks are handled by having cron run programs at regular intervals. The atd daemon is similar to cron: it lets users run jobs at specified times, but each job runs only once. The cupsd daemon is a print spooler; it handles print requests on the system. The sshd daemon provides secure remote login and execution facilities.

Note that most daemons run with superuser (root) privileges. None of the daemons has a controlling terminal: a question mark appears in place of the terminal name. Kernel daemons are started without a controlling terminal. The lack of a controlling terminal in the user-level daemons is probably the result of a call to setsid. With the exception of rsyslogd, all user-level daemons are process group leaders and session leaders, and are the only processes in their process groups and sessions. Finally, note that the parent of most daemons is the init process.

13.3. Daemon programming rules

To avoid unwanted interactions when programming daemons, you should follow certain rules. First we list these rules, and then we will demonstrate the daemonize function that implements them.

1. Call umask to set the file mode creation mask to a known value, usually 0. The mask inherited from the parent process may deny certain permission bits. If the daemon creates files, it may need to set specific permissions. For example, if it creates files with group-read and group-write enabled, a file mode creation mask that turns off either of these bits would undo its efforts. On the other hand, if the daemon calls library functions that create files, it may make sense to set a more restrictive mask (such as 007), since the library functions might not let the caller specify the permissions through an explicit argument.

2. Call fork and have the parent exit. This does two things. First, if the daemon was started as a simple shell command, terminating the parent makes the shell think that the command is done. Second, the child inherits the process group ID of the parent but gets a new process ID, so the child is guaranteed not to be a process group leader. This is a prerequisite for the call to setsid that comes next.

3. Call setsid to create a new session. Three things happen (recall Section 9.5): the process (a) becomes the leader of a new session, (b) becomes the leader of a new process group, and (c) loses its controlling terminal.

On System V-based systems, some people recommend calling fork again at this point and terminating the parent, so that the second child continues as the daemon. This guarantees that the daemon is not a session leader, which prevents it from acquiring a controlling terminal under the System V rules (Section 9.6). Alternatively, to avoid acquiring a controlling terminal, specify the O_NOCTTY flag whenever you open a terminal device.

4. Change the current working directory to the root directory. The current working directory inherited from the parent may be on a mounted file system. Since daemons normally exist until the system is rebooted, a daemon whose working directory stays on a mounted file system prevents that file system from being unmounted. Alternatively, some daemons change the current working directory to a specific location where they do all their work. For example, print spooling daemons often change to their spool directory, where print jobs are placed.

5. Close all unnecessary file descriptors. This prevents certain descriptors that are inherited from the parent process (the shell or another process) from being kept open. Using our open_max function (Listing 2.4) or using the getrlimit function (section 7.11), you can determine the maximum possible descriptor number and close all descriptors up to this number.

6. Some daemons open file descriptors 0, 1, and 2 on /dev/null so that any library routines that try to read from standard input or write to standard output or standard error have no effect. Since the daemon is not associated with a terminal device, it cannot interact with a user. Even if the daemon was started from an interactive session, it runs in the background, and the login session can end without affecting the daemon. If other users log in on the same terminal, the daemon's output should not show up on the terminal, and the users would not expect their input to be read by the daemon.
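To make rule 1 more concrete, here is a small sketch showing how the file mode creation mask changes the permissions of a newly created file. It is only an illustration: the helper name mode_after_create is invented for this example, and it temporarily changes the process-wide mask, so it is not suitable for multithreaded code.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` requesting the permission bits in
 * `requested` while `mask` is the file mode creation mask. Returns the
 * permission bits the new file actually received, or -1 on error.
 */
long mode_after_create(const char *path, mode_t mask, mode_t requested)
{
    struct stat st;
    long        result = -1;
    mode_t      old = umask(mask);   /* install the mask under test */
    int         fd;

    unlink(path);                    /* the mode applies only to a new file */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    if (fd >= 0) {
        if (fstat(fd, &st) == 0)
            result = (long)(st.st_mode & 0777);
        close(fd);
        unlink(path);                /* clean up the scratch file */
    }
    umask(old);                      /* restore the caller's mask */
    return result;
}
```

With a mask of 0, a request for 0660 yields 0660. With an inherited mask of 027, the same request yields 0640, silently dropping the group-write bit the daemon asked for.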

Example
Listing 13.1 shows the function that an application that wants to become a daemon can call.

Listing 13.1. Initializing the daemon process

    #include "apue.h"
    #include <syslog.h>
    #include <fcntl.h>
    #include <sys/resource.h>

    void
    daemonize(const char *cmd)
    {
        int                 i, fd0, fd1, fd2;
        pid_t               pid;
        struct rlimit       rl;
        struct sigaction    sa;

        /*
         * Clear file creation mask.
         */
        umask(0);

        /*
         * Get maximum number of file descriptors.
         */
        if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
            err_quit("%s: can't get file limit", cmd);

        /*
         * Become a session leader to lose controlling TTY.
         */
        if ((pid = fork()) < 0)
            err_quit("%s: can't fork", cmd);
        else if (pid != 0) /* parent */
            exit(0);
        setsid();

        /*
         * Ensure future opens won't allocate controlling TTYs.
         */
        sa.sa_handler = SIG_IGN;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        if (sigaction(SIGHUP, &sa, NULL) < 0)
            err_quit("%s: can't ignore SIGHUP", cmd);
        if ((pid = fork()) < 0)
            err_quit("%s: can't fork", cmd);
        else if (pid != 0) /* parent */
            exit(0);

        /*
         * Change the current working directory to the root so
         * we won't prevent file systems from being unmounted.
         */
        if (chdir("/") < 0)
            err_quit("%s: can't change directory to /", cmd);

        /*
         * Close all open file descriptors.
         */
        if (rl.rlim_max == RLIM_INFINITY)
            rl.rlim_max = 1024;
        for (i = 0; i < rl.rlim_max; i++)
            close(i);

        /*
         * Attach file descriptors 0, 1, and 2 to /dev/null.
         */
        fd0 = open("/dev/null", O_RDWR);
        fd1 = dup(0);
        fd2 = dup(0);

        /*
         * Initialize the log file.
         */
        openlog(cmd, LOG_CONS, LOG_DAEMON);
        if (fd0 != 0 || fd1 != 1 || fd2 != 2) {
            syslog(LOG_ERR, "unexpected file descriptors %d %d %d",
                   fd0, fd1, fd2);
            exit(1);
        }
    }

If the daemonize function is called from a program that then pauses, we can check the status of the daemon using the ps command:

    $ ./a.out
    $ ps -efj
    UID     PID   PPID   PGID   SID   TTY  CMD
    sar   13800      1  13799 13799   ?    ./a.out
    $ ps -efj | grep 13799
    sar   13800      1  13799 13799   ?    ./a.out

Using the ps command, you can also make sure that the system does not have an active process with the identifier 13799. This means that our daemon belongs to an orphaned process group (section 9.10) and is not the session leader, and therefore cannot acquire a controlling terminal. This is the result of the second call to the fork function in the daemonize function. As you can see, our daemon is initialized correctly.
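The same conclusion can be checked programmatically. The sketch below is not one of the book's listings: it repeats the fork, setsid, fork sequence and has the final process report through a pipe whether its session ID equals its own process ID. The function name is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical check: returns 1 if the process left after
 * fork + setsid + fork is a session leader, 0 if it is not,
 * and -1 on error.
 */
int grandchild_is_session_leader(void)
{
    int     pfd[2];
    pid_t   pid;
    char    answer;
    ssize_t n;

    if (pipe(pfd) < 0)
        return -1;
    if ((pid = fork()) < 0) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }
    if (pid == 0) {                     /* first child */
        close(pfd[0]);
        if (setsid() < 0)               /* becomes a session leader */
            _exit(1);
        if ((pid = fork()) < 0)
            _exit(1);
        if (pid != 0)                   /* session leader exits */
            _exit(0);
        /* second child: same session, but a different process ID */
        answer = (getsid(0) == getpid()) ? '1' : '0';
        if (write(pfd[1], &answer, 1) != 1)
            _exit(1);
        _exit(0);
    }
    close(pfd[1]);                      /* original process only reads */
    waitpid(pid, NULL, 0);              /* reap the first child */
    n = read(pfd[0], &answer, 1);
    close(pfd[0]);
    if (n != 1)
        return -1;
    return answer == '1';
}
```

The expected result is 0: the second child inherits the session created by the first child, whose process ID it cannot share.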

13.4. Error logging

One problem a daemon has is how to handle error messages. It cannot simply write to standard error, since it has no controlling terminal. We cannot have daemons write to the console either, because on most workstations the console runs a windowing system. Nor do we want each daemon to write its messages to a separate file. That would be a constant headache for the system administrator, who would have to remember which daemon writes to which file. A central error-logging facility is required.

The syslog mechanism for BSD systems was developed at Berkeley and has been widely distributed since 4.2BSD. Most BSD-derived systems support syslog. Before the advent of SVR4, System V did not have a centralized mechanism for logging error messages. The syslog function was included in the Single UNIX Specification standard as an XSI extension.

Most daemons use this facility. Figure 13.1 shows its structure. There are three ways to generate log messages.

1. Kernel routines can call the log function. These messages can be read by any user process that opens and reads the /dev/klog device. We will not describe this function further, since we are not going to write kernel routines.

2. Most user processes (daemons) call the syslog(3) function to generate log messages. We describe how to use it later. This function sends messages through the UNIX domain socket /dev/log.

3. A user process on this host, or on another host connected to it by a TCP/IP network, can send log messages to UDP port 514. Note that the syslog function never generates UDP datagrams: they require explicit network programming by the process that generates the message.

For more information about UNIX domain sockets, see [Stevens, Fenner, and Rudoff, 2004]. Usually, the syslogd daemon understands all three ways to log messages. At startup, this daemon reads a configuration file (usually /etc/syslog.conf), which defines where messages of various classes should be sent. For example, urgent messages can be displayed in the system administrator's console (if it is in the system), while warnings can be written to a file.

In our case, interaction with this mechanism is carried out through the syslog function.

    #include <syslog.h>

    void openlog(const char *ident, int option, int facility);
    void syslog(int priority, const char *format, ...);
    void closelog(void);
    int setlogmask(int maskpri);

The setlogmask function returns the previous log priority mask value.

The openlog function is optional. If the openlog function was not called before the first call to the syslog function, it will be called automatically. Calling the closelog function is also optional — it simply closes the file descriptor that was used to interact with the syslogd daemon.

The openlog function lets us specify an ident string that is added to each log message. This is normally the name of the program (for example, cron or inetd). The option argument is a bitmask that specifies various options. Table 13.1 lists the values that can be included in the mask. The XSI column marks those that the Single UNIX Specification includes in the definition of openlog.

Possible values for the facility argument are listed in Table 13.2. Note that the Single UNIX Specification defines only a subset of the facility codes typically available on a given system. The facility argument allows messages from different sources to be handled differently. If the program does not call openlog, or calls it with a facility of 0, the facility can still be specified as part of the priority argument to syslog.

The syslog function is called to generate a log message. The priority argument is a combination of the facility (Table 13.2) and a severity level (Table 13.3). The levels are listed in Table 13.3 in descending order, from highest to lowest.

Table 13.1. Possible values that can be included in the option argument of the openlog function

The format argument and all subsequent arguments are passed to the vsprintf function to build the message string. Any occurrence of the characters %m in the format string is replaced with the error message string (strerror) that corresponds to the value of errno.

The setlogmask function can be used to set the log priority mask for the process. This function returns the previous mask. When the log priority mask is set, messages whose priority is not in the mask are not logged. Note that an attempt to set the mask to 0 has no effect: the current mask is left unchanged.
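A short sketch of how the priority mask might be used follows. It assumes the LOG_UPTO macro, which glibc and the BSDs provide in <syslog.h> but which is not required by the Single UNIX Specification (LOG_MASK is). The function names are invented for this example.

```c
#include <syslog.h>

/*
 * Hypothetical helper: log only messages of severity LOG_WARNING or
 * higher (LOG_EMERG through LOG_WARNING). Returns the previous mask
 * so the caller can restore it later.
 */
int log_warnings_and_above(void)
{
    return setlogmask(LOG_UPTO(LOG_WARNING));
}

/*
 * Hypothetical helper: query the current mask without changing it.
 * This relies on a mask argument of 0 leaving the mask unmodified.
 */
int current_log_mask(void)
{
    return setlogmask(0);
}
```

After log_warnings_and_above is called, a syslog call at LOG_ERR is still logged, while one at LOG_DEBUG is discarded before it reaches the syslogd daemon.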

Many systems also provide the logger(1) program, which can send messages to the syslog facility. Some implementations accept optional arguments that specify the facility, the level, and the ident, although the Single UNIX Specification does not define any options. The logger command is intended for shell scripts that run noninteractively and need to generate log messages.

Table 13.2. Possible values for the facility argument of the openlog function

Table 13.3. Message severity levels (in descending order)

Example
In the (hypothetical) print daemon you can find the following lines:

    openlog("lpd", LOG_PID, LOG_LPR);
    syslog(LOG_ERR, "open error for %s: %m", filename);

A call to the openlog function establishes an identification string with the name of the program, indicates that the process identifier must be added to the message, and stipulates that the source of messages is the printing system daemon. The syslog function call indicates the level of importance of the message and the message itself. If you omit the openlog function call, the syslog call might look like this:

 syslog(LOG_ERR | LOG_LPR, "open error for %s: %m", filename);

Here, the priority argument combines the facility and the severity level of the message.
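To see why the two values can be combined with a bitwise OR, the following sketch splits a priority back into its parts. It assumes the LOG_PRIMASK and LOG_FACMASK macros, which the <syslog.h> headers of glibc and the BSDs define but which the Single UNIX Specification does not require. Both function names are hypothetical.

```c
#include <syslog.h>

/* Hypothetical helper: extract the severity level from a priority value. */
int severity_of(int priority)
{
    return priority & LOG_PRIMASK;   /* low-order bits hold the level */
}

/* Hypothetical helper: extract the facility code from a priority value. */
int facility_of(int priority)
{
    return priority & LOG_FACMASK;   /* remaining bits hold the facility */
}
```

For the call above, severity_of(LOG_ERR | LOG_LPR) gives back LOG_ERR and facility_of(LOG_ERR | LOG_LPR) gives back LOG_LPR, because the level and the facility occupy separate bit ranges.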

In addition to syslog, many platforms provide a variant that takes its additional arguments as a variable argument list.

    #include <syslog.h>
    #include <stdarg.h>

    void vsyslog(int priority, const char *format, va_list arg);

All four platforms discussed in this book support the vsyslog function, but it is not part of the Single UNIX Specification. Note that to make its declaration visible in your application, you may need to define an additional symbol, such as __BSD_VISIBLE on FreeBSD or __USE_BSD on Linux.

To reduce the time spent processing requests from applications, most syslogd implementations queue incoming messages for a short time. If a duplicate message arrives during this period, only one copy is logged, followed by a line such as “last message repeated N times”.

13.5. Single-instance daemons

Some daemons are implemented so that only one copy can run at a time. The daemon might need exclusive access to a resource, for example. If the cron daemon allowed several copies of itself to run at once, each of them would try to start the same operation at the scheduled time, which would certainly lead to an error.

If the daemon needs to access a device, the device driver will sometimes prevent the device from being opened by more than one program. This limits the number of concurrent daemon instances to one. If the daemon does not use such a device, however, we have to do the work ourselves.

One of the main mechanisms for ensuring that only one copy of a daemon is running is file and record locking. (We discuss file and record locking in Section 14.3.) If each copy of the daemon creates a file and tries to place a write lock on the entire file, the system allows only one such lock to be set. All subsequent attempts to obtain the write lock fail, telling the other copies that the daemon is already running.
File and record locking is a convenient mutual-exclusion mechanism. If the daemon obtains a write lock on an entire file, the lock is released automatically when the daemon exits. This simplifies recovery after errors, since there is no need to clean up a lock left over from the previous copy of the daemon.

Example
The function in Listing 13.2 illustrates the use of file and record locking to ensure that only one copy of a daemon is running.

Listing 13.2. The function that guarantees the launch of only one copy of the daemon

    #include <unistd.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <syslog.h>
    #include <string.h>
    #include <errno.h>
    #include <stdio.h>
    #include <sys/stat.h>

    #define LOCKFILE "/var/run/daemon.pid"
    #define LOCKMODE (S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH)

    extern int lockfile(int);

    int
    already_running(void)
    {
        int     fd;
        char    buf[16];

        fd = open(LOCKFILE, O_RDWR|O_CREAT, LOCKMODE);
        if (fd < 0) {
            syslog(LOG_ERR, "can't open %s: %s",
                   LOCKFILE, strerror(errno));
            exit(1);
        }
        if (lockfile(fd) < 0) {
            if (errno == EACCES || errno == EAGAIN) {
                close(fd);
                return(1);
            }
            syslog(LOG_ERR, "can't lock %s: %s",
                   LOCKFILE, strerror(errno));
            exit(1);
        }
        ftruncate(fd, 0);
        sprintf(buf, "%ld", (long)getpid());
        write(fd, buf, strlen(buf)+1);
        return(0);
    }

Each copy of the daemon tries to create the file and write its process ID to it. This helps the system administrator identify the process. If the file is already locked, the lockfile function fails with errno set to EACCES or EAGAIN, and already_running returns 1 to indicate that the daemon is already running. Otherwise, the function truncates the file, writes the process ID to it, and returns 0.

We need to truncate the file because the process ID of the previous copy of the daemon, represented as a string, might have been longer. Suppose, for example, that the previous copy had process ID 12345 and the current copy has process ID 9999. Without truncation, the file would contain the string 99995 after the new daemon writes its ID. Truncating the file removes the information that relates to the previous copy of the daemon.

13.6. Daemon conventions

On UNIX systems, daemons follow these conventions.

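If the daemon uses a lock file, the file is usually stored in /var/run. Note, however, that the daemon might need superuser permissions to create a file there. The file is usually named name.pid, where name is the name of the daemon or the service. For example, the lock file of the cron daemon is /var/run/crond.pid.
If the daemon supports configuration options, they are usually stored in /etc. The configuration file is usually named name.conf, where name is the name of the daemon or the service. For example, the configuration file for the syslogd daemon is usually /etc/syslog.conf.
Daemons can be started from the command line, but they are usually started from one of the system initialization scripts (/etc/rc* or /etc/init.d/*). If the daemon should be restarted automatically when it exits, we can arrange for init to restart it by including a respawn entry for it in /etc/inittab (assuming the system uses a System V style init).
If a daemon has a configuration file, the daemon reads it at startup but usually does not look at it again. If an administrator changes the configuration, the daemon would have to be stopped and restarted for the changes to take effect. To avoid this, some daemons catch SIGHUP and reread their configuration files when they receive the signal. Since daemons are not associated with terminals and are either session leaders without controlling terminals or members of orphaned process groups, they have no reason to expect SIGHUP. They can therefore safely reuse it.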

Example
The program in Listing 13.3 shows one way for a daemon to reread its configuration file. The program uses the sigwait function and a separate thread to handle signals, as described in Section 12.8.

Listing 13.3. An example of a daemon that re-reads a configuration file by signal

    #include "apue.h"
    #include <pthread.h>
    #include <syslog.h>

    sigset_t mask;

    extern int already_running(void);

    void
    reread(void)
    {
        /* ... */
    }

    void *
    thr_fn(void *arg)
    {
        int err, signo;

        for (;;) {
            err = sigwait(&mask, &signo);
            if (err != 0) {
                syslog(LOG_ERR, "sigwait failed");
                exit(1);
            }

            switch (signo) {
            case SIGHUP:
                syslog(LOG_INFO, "Re-reading configuration file");
                reread();
                break;

            case SIGTERM:
                syslog(LOG_INFO, "got SIGTERM; exiting");
                exit(0);

            default:
                syslog(LOG_INFO, "unexpected signal %d\n", signo);
            }
        }
        return(0);
    }

    int
    main(int argc, char *argv[])
    {
        int              err;
        pthread_t        tid;
        char             *cmd;
        struct sigaction sa;

        if ((cmd = strrchr(argv[0], '/')) == NULL)
            cmd = argv[0];
        else
            cmd++;

        /*
         * Become a daemon.
         */
        daemonize(cmd);

        /*
         * Make sure only one copy of the daemon is running.
         */
        if (already_running()) {
            syslog(LOG_ERR, "daemon already running");
            exit(1);
        }

        /*
         * Restore SIGHUP default and block all signals.
         */
        sa.sa_handler = SIG_DFL;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        if (sigaction(SIGHUP, &sa, NULL) < 0)
            err_quit("%s: can't restore SIGHUP default", cmd);
        sigfillset(&mask);
        if ((err = pthread_sigmask(SIG_BLOCK, &mask, NULL)) != 0)
            err_exit(err, "SIG_BLOCK error");

        /*
         * Create a thread to handle SIGHUP and SIGTERM.
         */
        err = pthread_create(&tid, NULL, thr_fn, 0);
        if (err != 0)
            err_exit(err, "can't create thread");

        /*
         * Proceed with the rest of the daemon.
         */
        /* ... */
        exit(0);
    }

To switch to daemon mode, the program uses the daemonize function from Listing 13.1. After returning from it, the function already_running from Listing 13.2 is called, which checks for other running copies of the daemon. At this point, the SIGHUP signal is still ignored, so we must reset its disposition to its default value, otherwise the sigwait function will never be able to get it.

Next, we block all signals, as is recommended for multithreaded programs, and create a thread to handle signals. The thread's only job is to wait for SIGHUP and SIGTERM. When it receives SIGHUP, the thread calls reread to reread the configuration file; when it receives SIGTERM, the thread writes a message to the log and terminates the process.

Recall from Table 10.1 that the default action for SIGHUP and SIGTERM is to terminate the process. Because these signals are blocked, the daemon does not die when one of them is sent to it. Instead, the thread calling sigwait returns with the number of the signal that was delivered.

Example
The program in Listing 13.4 shows how the daemon can intercept the SIGHUP signal and re-read the configuration file without using a separate thread.

Listing 13.4. Alternative implementation of a daemon that re-reads the configuration file by signal

    #include "apue.h"
    #include <syslog.h>
    #include <errno.h>

    extern int lockfile(int);
    extern int already_running(void);

    void
    reread(void)
    {
        /* ... */
    }

    void
    sigterm(int signo)
    {
        syslog(LOG_INFO, "got SIGTERM; exiting");
        exit(0);
    }

    void
    sighup(int signo)
    {
        syslog(LOG_INFO, "Re-reading configuration file");
        reread();
    }

    int
    main(int argc, char *argv[])
    {
        char             *cmd;
        struct sigaction sa;

        if ((cmd = strrchr(argv[0], '/')) == NULL)
            cmd = argv[0];
        else
            cmd++;

        /*
         * Become a daemon.
         */
        daemonize(cmd);

        /*
         * Make sure only one copy of the daemon is running.
         */
        if (already_running()) {
            syslog(LOG_ERR, "daemon already running");
            exit(1);
        }

        /*
         * Handle signals of interest.
         */
        sa.sa_handler = sigterm;
        sigemptyset(&sa.sa_mask);
        sigaddset(&sa.sa_mask, SIGHUP);
        sa.sa_flags = 0;
        if (sigaction(SIGTERM, &sa, NULL) < 0) {
            syslog(LOG_ERR, "can't catch SIGTERM: %s",
                   strerror(errno));
            exit(1);
        }
        sa.sa_handler = sighup;
        sigemptyset(&sa.sa_mask);
        sigaddset(&sa.sa_mask, SIGTERM);
        sa.sa_flags = 0;
        if (sigaction(SIGHUP, &sa, NULL) < 0) {
            syslog(LOG_ERR, "can't catch SIGHUP: %s",
                   strerror(errno));
            exit(1);
        }

        /*
         * Proceed with the rest of the daemon.
         */
        /* ... */
        exit(0);
    }

13.7. Client-server model

Daemon processes are most often used as server processes. Figure 13.1 shows an example: the syslogd server receives messages from applications (clients) through a UNIX domain socket.

In general, a server is a process that waits for clients to contact it and request some type of service. In Figure 13.1, the service provided by the syslogd server is error logging.

The communication between the client and the server shown in Figure 13.1 is one-way. The client sends messages to the server but receives nothing in return. In later chapters, we will see many examples of two-way communication, in which the client sends a request to the server and the server sends a reply back to the client.

Servers often provide service to their clients by running other programs with fork and exec. Such servers typically have many file descriptors open: communication endpoints, configuration files, log files, and so on. At best, leaving these descriptors open in the child process is careless, because they will probably not be used by the program the child executes, especially if that program is unrelated to the server. At worst, it can lead to security problems: the executed program could do something malicious, such as change the server's configuration file or trick the client into revealing sensitive information.

The simplest solution to this problem is to set the close-on-exec flag on all file descriptors that the executed program does not need. Listing 13.5 shows a function that a server process can use for this purpose.

Listing 13.5. Setting the close-on-exec flag

    #include "apue.h"
    #include <fcntl.h>

    int
    set_cloexec(int fd)
    {
        int     val;

        if ((val = fcntl(fd, F_GETFD, 0)) < 0)
            return(-1);

        val |= FD_CLOEXEC;      /* enable close-on-exec */

        return(fcntl(fd, F_SETFD, val));
    }

13.8. Summary

Daemon processes usually run for as long as the system itself is running. Writing programs that run as daemons requires understanding and taking into account the process relationships described in Chapter 9. In this chapter, we developed a function that a process can call to become a daemon correctly.

We also discussed how a daemon can log error messages, since it normally has no controlling terminal. We looked at several conventions that daemons follow on most versions of UNIX and showed examples of how these conventions are implemented.


Source: https://habr.com/ru/post/349464/
