Writing your own Linux daemon with automatic recovery

Dear readers, I would like to share my experience of writing server daemons. There are many articles on this topic in Runet, but most of them do not answer such important questions as:

How do you add a daemon to startup?
What should happen if an error occurs at run time and the daemon crashes?
How do you update the daemon's configuration without interrupting its work?

In this part, we consider the following points:

How a daemon works.
Fundamentals of monitoring the daemon's state.
Run-time error handling, with a detailed report in the log.
Some issues related to system resources.

For clarity, the source code of the following parts will be shown:

The main program template.
Daemon monitoring function template.
Error handling function template.
A number of auxiliary functions.

How a daemon works.
In essence, a daemon is a regular program running in the background. But since our daemon will be launched from init.d, certain requirements are imposed on it:

The daemon must save its PID to a file so that it can be correctly stopped later.
You need to perform a number of preparatory operations to start working in the background.

In our model, the daemon will function according to the following algorithm:

Separation from the controlling terminal and transition to the background.
Splitting into two parts: the parent (monitoring) and the child (the daemon's functionality).
Monitoring the status of the daemon process.
Processing the command to update the config.
Error processing.

The main program template.
This code performs all the actions required to launch the daemon successfully.

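    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int status;
        int pid;

        // Exactly one argument is expected: the config file name
        if (argc != 2)
        {
            printf("Usage: ./my_daemon filename.cfg\n");
            return -1;
        }

        // Load the configuration file
        status = LoadConfig(argv[1]);

        if (!status) // LoadConfig returns 0 on failure
        {
            printf("Error: Load config failed\n");
            return -1;
        }

        // Create a child process
        pid = fork();

        if (pid == -1) // fork failed
        {
            printf("Error: Start Daemon failed (%s)\n", strerror(errno));
            return -1;
        }
        else if (!pid) // we are the child
        {
            // Allow all permission bits on files the daemon creates
            umask(0);

            // Create a new session to detach from the controlling terminal
            setsid();

            // Change to the root directory so that no mounted disk is kept busy
            chdir("/");

            // Close the standard I/O descriptors; they are no longer needed
            close(STDIN_FILENO);
            close(STDOUT_FILENO);
            close(STDERR_FILENO);

            // Run the monitoring function
            status = MonitorProc();

            return status;
        }
        else // we are the parent
        {
            // The parent simply exits; the child keeps running in the background
            return 0;
        }
    }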

The logic is simple and should not be hard to follow. Only a few points need clarification:

LoadConfig loads the configuration from the specified file. Its code depends on the config format you use and is not covered in this article.
Closing the standard descriptors is reasonable because the daemon will not use printf, scanf, or other console I/O functions. This step is optional and only saves resources.
Changing to the root directory avoids problems with unmounting disks later. If the daemon's current directory is on a disk that needs to be unmounted, the system will not allow it until the daemon is stopped.
MonitorProc performs the main work of monitoring the program's state.
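
The template above closes the three standard descriptors. A common alternative, shown here only as a sketch and not as part of the original template, is to point them at /dev/null instead. After a plain close(), the next open() or socket() call reuses descriptor 0, 1 or 2, so a stray printf in some library could end up writing into an unrelated file. The helper name RedirectStdToDevNull is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not in the original article): make stdin, stdout
 * and stderr refer to /dev/null instead of closing them.
 * Returns 0 on success, -1 on failure. */
int RedirectStdToDevNull(void)
{
    int fd = open("/dev/null", O_RDWR);
    int failed;

    if (fd == -1)
        return -1;

    /* dup2 closes the target descriptor (if open) and reuses its number. */
    failed = dup2(fd, STDIN_FILENO) == -1 ||
             dup2(fd, STDOUT_FILENO) == -1 ||
             dup2(fd, STDERR_FILENO) == -1;

    /* The extra descriptor is only needed if it is not already 0, 1 or 2. */
    if (fd > STDERR_FILENO)
        close(fd);

    return failed ? -1 : 0;
}
```

In the child branch of main, such a helper would take the place of the three close() calls.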

Fundamentals of monitoring the daemon's state
The main purpose of monitoring is to keep track of the state of the daemon process. Only two things matter to us:

Being notified when the daemon process terminates.
Obtaining the daemon's exit code.

All monitoring of the daemon is contained in the MonitorProc function. The idea is to start a child process, watch it, and, depending on its exit code, either restart it or finish.
Source code of the monitoring function:
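
Before relying on an exit code, it helps to know whether the child exited normally at all: a child killed by a signal has no exit code, and WEXITSTATUS is only meaningful when WIFEXITED is true. The following is a minimal sketch of such a check; the names ChildOutcome, ClassifyChildStatus and RunChildAndWait are hypothetical and exist only for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical outcome codes, used only in this sketch. */
enum ChildOutcome { CHILD_EXITED, CHILD_KILLED_BY_SIGNAL, CHILD_OTHER };

/* Classify a status value filled in by wait() or waitpid().
 * *detail receives the exit code or the terminating signal number. */
enum ChildOutcome ClassifyChildStatus(int wstatus, int *detail)
{
    if (WIFEXITED(wstatus)) {
        *detail = WEXITSTATUS(wstatus);
        return CHILD_EXITED;
    }
    if (WIFSIGNALED(wstatus)) {
        *detail = WTERMSIG(wstatus);
        return CHILD_KILLED_BY_SIGNAL;
    }
    *detail = 0;
    return CHILD_OTHER;
}

/* Demonstration helper: fork a child that exits with `code` and store
 * the raw wait status in *wstatus. Returns 0 on success, -1 on error. */
int RunChildAndWait(int code, int *wstatus)
{
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    return waitpid(pid, wstatus, 0) == pid ? 0 : -1;
}
```

A monitor built this way could, for example, treat CHILD_KILLED_BY_SIGNAL the same as a restart request.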

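    int MonitorProc()
    {
        int       pid;
        int       status;
        int       need_start = 1;
        sigset_t  sigset;
        siginfo_t siginfo;

        // Set up the set of signals we are going to handle
        sigemptyset(&sigset);

        // Quit request from the user
        sigaddset(&sigset, SIGQUIT);

        // Interrupt from the terminal
        sigaddset(&sigset, SIGINT);

        // Termination request
        sigaddset(&sigset, SIGTERM);

        // The state of a child process has changed
        sigaddset(&sigset, SIGCHLD);

        // User signal, used here to request a config reload
        sigaddset(&sigset, SIGUSR1);
        sigprocmask(SIG_BLOCK, &sigset, NULL);

        // Create the file holding our PID
        SetPidFile(PID_FILE);

        // Main monitoring loop
        for (;;)
        {
            // Create a child if one needs to be started
            if (need_start)
            {
                pid = fork();
            }

            need_start = 1;

            if (pid == -1) // fork failed
            {
                // Log the failure; the loop will try again
                WriteLog("[MONITOR] Fork failed (%s)\n", strerror(errno));
            }
            else if (!pid) // we are the child
            {
                // Run the daemon's actual functionality
                status = WorkProc();

                // Terminate the child with the resulting code
                exit(status);
            }
            else // we are the parent
            {
                // Wait for a signal to arrive
                sigwaitinfo(&sigset, &siginfo);

                if (siginfo.si_signo == SIGCHLD) // the child has terminated
                {
                    // Collect the child's termination status
                    wait(&status);

                    // Convert it to the exit code
                    status = WEXITSTATUS(status);

                    if (status == CHILD_NEED_TERMINATE) // the child asked us to stop
                    {
                        WriteLog("[MONITOR] Child stopped\n");

                        // Leave the loop
                        break;
                    }
                    else if (status == CHILD_NEED_WORK) // the child must be restarted
                    {
                        WriteLog("[MONITOR] Child restart\n");
                    }
                }
                else if (siginfo.si_signo == SIGUSR1) // config reload requested
                {
                    kill(pid, SIGUSR1); // forward the signal to the child
                    need_start = 0;     // do not start a new child on the next pass
                }
                else // any other expected signal
                {
                    WriteLog("[MONITOR] Signal %s\n", strsignal(siginfo.si_signo));

                    // Stop the child
                    kill(pid, SIGTERM);
                    status = 0;
                    break;
                }
            }
        }

        WriteLog("[MONITOR] Stop\n");

        // Remove the PID file
        unlink(PID_FILE);

        return status;
    }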

A few notes on the code:

PID_FILE is a constant holding the name of the file in which the PID is saved. In our case it is /var/run/my_daemon.pid
WriteLog is a function that writes to the log. It can do whatever you like: write the log anywhere or even send it somewhere.
WorkProc is the function that implements the daemon's actual functionality.
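
The article leaves WriteLog up to the reader. As one possible shape, here is a minimal sketch that appends timestamped, printf-style records to a file. The log path, the int return value and the timestamp format are assumptions made for this example, not the author's implementation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <time.h>

/* Assumed log location for this sketch; the article does not specify one. */
static const char *g_log_path = "/tmp/my_daemon.log";

/* printf-style logger: appends one timestamped record to g_log_path.
 * Returns 0 on success, -1 if the log file cannot be opened. */
int WriteLog(const char *fmt, ...)
{
    FILE     *f = fopen(g_log_path, "a");
    va_list   args;
    time_t    now = time(NULL);
    struct tm tm_now;
    char      stamp[32];

    if (!f)
        return -1;

    /* Prefix every record with "YYYY-MM-DD HH:MM:SS " */
    localtime_r(&now, &tm_now);
    strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", &tm_now);
    fprintf(f, "%s ", stamp);

    /* Write the caller's formatted message */
    va_start(args, fmt);
    vfprintf(f, fmt, args);
    va_end(args);

    fclose(f);
    return 0;
}
```

Opening and closing the file on every call is slow but keeps the sketch simple and safe across fork(). Note that stdio functions are not async-signal-safe, so calling such a logger from a signal handler is a compromise rather than a guarantee.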

An auxiliary function is needed to create the PID file.
Code:

    void SetPidFile(char *Filename)
    {
        FILE *f;

        // Create (or truncate) the PID file
        f = fopen(Filename, "w+");

        if (f)
        {
            // pid_t is a signed type, so print it as a signed integer
            fprintf(f, "%d", (int)getpid());
            fclose(f);
        }
    }
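
The PID file is written so that the daemon can be stopped later. As a sketch of the reading side, which the article does not show, the following hypothetical helpers read the stored PID and check whether such a process still exists. The names ReadPidFile and IsProcessAlive are assumptions for this example.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical counterpart to SetPidFile: returns the PID stored in
 * `filename`, or -1 if the file is missing or malformed. */
pid_t ReadPidFile(const char *filename)
{
    FILE *f = fopen(filename, "r");
    long  value;

    if (!f)
        return -1;

    if (fscanf(f, "%ld", &value) != 1 || value <= 0) {
        fclose(f);
        return -1;
    }

    fclose(f);
    return (pid_t)value;
}

/* Returns 1 if a process with this PID exists, 0 otherwise.
 * Signal 0 performs the existence and permission checks only;
 * EPERM means the process exists but belongs to another user. */
int IsProcessAlive(pid_t pid)
{
    if (pid <= 0)
        return 0;
    if (kill(pid, 0) == 0)
        return 1;
    return errno == EPERM;
}
```

A stop script could combine the two: read the PID, check that the process is alive, and only then send SIGTERM.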

At this point our daemon can already start, monitor its child (which performs the main work) and, if necessary, restart it or send it a signal about a configuration change. Next, consider the template of the child's code:

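    int WorkProc()
    {
        struct sigaction sigact;
        sigset_t         sigset;
        int              signo;
        int              status;

        // Error signals are handled in the signal_error function,
        // which receives extended information about the fault
        sigact.sa_flags = SA_SIGINFO;
        sigact.sa_sigaction = signal_error;

        sigemptyset(&sigact.sa_mask);

        // Install the handler for the error signals
        sigaction(SIGFPE, &sigact, 0);  // arithmetic (FPU) error
        sigaction(SIGILL, &sigact, 0);  // illegal instruction
        sigaction(SIGSEGV, &sigact, 0); // invalid memory access
        sigaction(SIGBUS, &sigact, 0);  // bus error

        sigemptyset(&sigset);

        // Block the signals we are going to wait for
        sigaddset(&sigset, SIGQUIT); // quit request from the user
        sigaddset(&sigset, SIGINT);  // interrupt from the terminal
        sigaddset(&sigset, SIGTERM); // termination request
        sigaddset(&sigset, SIGUSR1); // user signal: reload the config
        sigprocmask(SIG_BLOCK, &sigset, NULL);

        // Set the maximum number of descriptors that can be opened
        SetFdLimit(FD_LIMIT);

        WriteLog("[DAEMON] Started\n");

        // Start the worker threads (0 means success)
        status = InitWorkThread();
        if (!status)
        {
            // Wait for incoming signals
            for (;;)
            {
                // Block until one of the signals arrives
                sigwait(&sigset, &signo);

                if (signo == SIGUSR1) // config reload requested
                {
                    // ReloadConfig returns 0 on failure
                    status = ReloadConfig();
                    if (status == 0)
                    {
                        WriteLog("[DAEMON] Reload config failed\n");
                    }
                    else
                    {
                        WriteLog("[DAEMON] Reload config OK\n");
                    }
                }
                else // any other signal: leave the loop
                {
                    break;
                }
            }

            // Stop all worker threads and release resources
            DestroyWorkThread();
        }
        else
        {
            WriteLog("[DAEMON] Create work thread failed\n");
        }

        WriteLog("[DAEMON] Stopped\n");

        // Tell the monitor that no restart is needed
        return CHILD_NEED_TERMINATE;
    }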

A few notes on the code:

InitWorkThread is a function that creates all of the daemon's worker threads and initializes all work.
DestroyWorkThread is a function that stops the daemon's worker threads and releases resources correctly.
ReloadConfig is a function that reloads the configuration (re-reads the file and applies the necessary changes to the daemon's work). The file name can again be taken from the command line parameters.

The contents of these functions depend on your daemon's implementation.
The principle of operation is as follows: we install our handler for the error signals, then start all the worker threads and wait for a termination signal or a config reload signal.

Error handling at work, with a detailed report in the log.

Of course, daemons should work perfectly and never fail, but everyone makes mistakes, and some errors are quite hard to detect during testing. This is especially true of errors that occur only under heavy load. That is why proper error handling, which extracts as much information as possible from the failure, is an important part of daemon development. Here we consider error handling that preserves the call stack (backtrace). This tells us exactly where the error occurred (in which function) and who called that function.

Error handler function code:

    // Note: REG_RIP and REG_EIP are only visible when _GNU_SOURCE is
    // defined before <ucontext.h> is included
    static void signal_error(int sig, siginfo_t *si, void *ptr)
    {
        void  *ErrorAddr;
        void  *Trace[16];
        int    x;
        int    TraceSize;
        char **Messages;

        // Log the signal that occurred and the faulting address
        WriteLog("[DAEMON] Signal: %s, Addr: 0x%016lX\n",
                 strsignal(sig), (unsigned long)si->si_addr);

    #if __WORDSIZE == 64 // 64-bit system
        // Address of the instruction that caused the error
        ErrorAddr = (void *)((ucontext_t *)ptr)->uc_mcontext.gregs[REG_RIP];
    #else
        // Address of the instruction that caused the error
        ErrorAddr = (void *)((ucontext_t *)ptr)->uc_mcontext.gregs[REG_EIP];
    #endif

        // Collect the backtrace
        TraceSize = backtrace(Trace, 16);
        Trace[1] = ErrorAddr;

        // Translate the addresses into readable strings
        Messages = backtrace_symbols(Trace, TraceSize);
        if (Messages)
        {
            WriteLog("== Backtrace ==\n");

            // Frame 0 is this handler itself, so start from frame 1
            for (x = 1; x < TraceSize; x++)
            {
                WriteLog("%s\n", Messages[x]);
            }

            WriteLog("== End Backtrace ==\n");
            free(Messages);
        }

        WriteLog("[DAEMON] Stopped\n");

        // Stop all worker threads and release resources
        DestroyWorkThread();

        // Exit with the code that asks the monitor for a restart
        exit(CHILD_NEED_WORK);
    }

With backtrace you get output that looks roughly like this:

[DAEMON] Signal: Segmentation fault, Addr: 0x0000000000000000
== Backtrace ==
/usr/sbin/my_daemon(GetParamStr+0x34) [0x8049e44]
/usr/sbin/my_daemon(GetParamInt+0x3a) [0x8049efa]
/usr/sbin/my_daemon(main+0x140) [0x804b170]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x126bd6]
/usr/sbin/my_daemon() [0x8049ba1]
== End Backtrace ==

From this data you can see that main called GetParamInt, which in turn called GetParamStr. In GetParamStr, at offset 0x34, a memory access at address zero occurred.

In addition to the call stack, you can save the register values (the uc_mcontext.gregs array).
Note that backtrace gives the most informative output only when the program is compiled without stripping debugging information and with the -rdynamic option.
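
To see what -rdynamic changes before wiring anything into a signal handler, a backtrace can be printed from ordinary code. This is a minimal sketch assuming glibc, which provides <execinfo.h>; the helper name PrintBacktrace is hypothetical.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 16

/* Hypothetical helper: print the current call stack to `out`.
 * Returns the number of frames printed, or 0 if the symbol
 * strings could not be allocated. */
int PrintBacktrace(FILE *out)
{
    void  *frames[MAX_FRAMES];
    char **symbols;
    int    count;
    int    i;

    /* Collect return addresses of the active function calls */
    count = backtrace(frames, MAX_FRAMES);

    /* Translate addresses into "module(function+offset) [address]" strings */
    symbols = backtrace_symbols(frames, count);
    if (!symbols)
        return 0;

    for (i = 0; i < count; i++)
        fprintf(out, "%s\n", symbols[i]);

    /* backtrace_symbols allocates one block; a single free releases it */
    free(symbols);
    return count;
}
```

Building once with and once without the option (for example `gcc -g -rdynamic my_daemon.c -o my_daemon`, where the file name is just an illustration) shows the difference: without -rdynamic, functions of the executable itself typically appear as bare addresses.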

As you can see, the code uses the constants CHILD_NEED_WORK and CHILD_NEED_TERMINATE. You can choose their values yourself; the only requirement is that they differ.

Some issues related to system resources.

An important point is setting the maximum number of descriptors. Every open file, socket, pipe, and so on consumes a descriptor; once they are exhausted, the daemon cannot open a file, create a socket, or accept an incoming connection. This can affect the daemon's ability to work. By default, the maximum number of open descriptors is 1024, which is very small for heavily loaded network daemons. Therefore we raise this value to match the daemon's requirements, using the following function:

    int SetFdLimit(int MaxFd)
    {
        struct rlimit lim;
        int           status;

        // Soft (current) limit on the number of open descriptors
        lim.rlim_cur = MaxFd;

        // Hard (maximum) limit on the number of open descriptors
        lim.rlim_max = MaxFd;

        // Apply the new limits
        status = setrlimit(RLIMIT_NOFILE, &lim);

        return status;
    }
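
Setting both limits at once can fail when the daemon runs without privileges, because an unprivileged process may not raise its hard limit. As a more conservative variant, offered only as a sketch, the hypothetical helper below reads the current limits first and raises just the soft limit, never going above the hard one.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft descriptor limit toward `wanted`
 * without touching the hard limit. If `wanted` exceeds the hard limit,
 * it is clamped to it. Returns 0 on success, -1 on failure. */
int RaiseFdSoftLimit(rlim_t wanted)
{
    struct rlimit lim;

    /* Read the limits currently in effect */
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    /* Never ask for more than the hard limit allows */
    if (lim.rlim_max != RLIM_INFINITY && wanted > lim.rlim_max)
        wanted = lim.rlim_max;

    /* Nothing to do if the soft limit is already high enough */
    if (lim.rlim_cur == RLIM_INFINITY || lim.rlim_cur >= wanted)
        return 0;

    lim.rlim_cur = wanted;
    return setrlimit(RLIMIT_NOFILE, &lim);
}
```

Checking the return value and logging a failure lets the daemon report a limit that could not be applied instead of silently running with the default.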

Instead of a conclusion.
So, we have looked at how to build the foundation of a daemon. The code does not claim to be perfect, but it does its job well.
The next article will cover installing and uninstalling the daemon, managing it, writing startup scripts for init.d, and adding the daemon to startup.

Link to source code: http://pastebin.com/jdX5wn0E
The source code contains all the functions used, in a single file. In a real project it is better to split them into separate files according to their purpose.

Source: https://habr.com/ru/post/129207/
