
Writing your own Linux daemon with automatic recovery

Dear readers, I would like to share my experience of writing server daemons. There are many articles on this topic on the Russian-language internet, but most of them do not answer such important questions as:

In this part, we consider the following points:

For clarity, the source code of the following parts will be shown:


How the daemon works.
In essence, a daemon is a regular program running in the background. But since our daemon will be launched from init.d, certain restrictions are imposed on it:

In our model, the daemon will work according to the following algorithm:


Program template.
This code performs all the actions needed to launch the daemon successfully.
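#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    int status;
    int pid;

    // If the number of command-line arguments is wrong, show usage and exit
    if (argc != 2)
    {
        printf("Usage: ./my_daemon filename.cfg\n");
        return -1;
    }

    // Load the configuration file
    status = LoadConfig(argv[1]);

    if (!status) // The configuration could not be loaded
    {
        printf("Error: Load config failed\n");
        return -1;
    }

    // Create a child process
    pid = fork();

    if (pid == -1) // fork() failed
    {
        // Report the error together with its description
        printf("Error: Start Daemon failed (%s)\n", strerror(errno));

        return -1;
    }
    else if (!pid) // Child process
    {
        // This code runs in the child.
        // Clear the file mode creation mask so that files we create
        // get exactly the permissions we request
        umask(0);

        // Start a new session so that we no longer depend on the parent
        setsid();

        // Change to the root directory; otherwise the daemon could keep
        // a mounted file system busy
        chdir("/");

        // Close the standard input, output and error descriptors,
        // since the daemon no longer uses them
        close(STDIN_FILENO);
        close(STDOUT_FILENO);
        close(STDERR_FILENO);

        // Run the monitoring function
        status = MonitorProc();

        return status;
    }
    else // Parent process
    {
        // The child has been started, so the parent simply exits
        return 0;
    }
}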
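The template above simply closes descriptors 0, 1 and 2. One known consequence is that the next files or sockets the daemon opens will reuse those numbers, so a stray printf could end up writing into one of them. A common alternative is to point the three descriptors at /dev/null instead. The following is only a minimal sketch of that idea; the function name redirect_std_to_devnull is hypothetical and is not part of the original source.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not in the original article): re-point stdin,
 * stdout and stderr at /dev/null instead of closing them.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int redirect_std_to_devnull(void)
{
    int fd = open("/dev/null", O_RDWR);
    if (fd == -1)
        return -1;

    /* dup2() closes the target descriptor if it is open and makes it
     * refer to the same open file as fd. */
    if (dup2(fd, STDIN_FILENO) == -1 ||
        dup2(fd, STDOUT_FILENO) == -1 ||
        dup2(fd, STDERR_FILENO) == -1)
    {
        close(fd);
        return -1;
    }

    /* The temporary descriptor is no longer needed unless it is one of 0..2. */
    if (fd > STDERR_FILENO)
        close(fd);

    return 0;
}
```

In the template, a call to this helper could take the place of the three close() calls in the child branch.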



The logic is simple and should not be hard to follow. Only the following needs clarification:

Basics of monitoring the daemon's state
The main purpose of monitoring is to track the state of the daemon process. Only two things matter to us:
  1. Being notified when the daemon process terminates.
  2. Obtaining the daemon's exit code.

All monitoring is contained in the MonitorProc function. The idea is to start a child process and watch it; depending on its exit code, the monitor either restarts the child or finishes its own work.
Source code of the monitoring function:
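// Requires <signal.h>, <sys/wait.h>, <string.h>, <errno.h>, <stdlib.h>, <unistd.h>
int MonitorProc()
{
    int pid;
    int status;
    int need_start = 1;
    sigset_t sigset;
    siginfo_t siginfo;

    // Build the set of signals the monitor will handle
    sigemptyset(&sigset);

    // Quit request from the keyboard
    sigaddset(&sigset, SIGQUIT);

    // Interrupt from the terminal
    sigaddset(&sigset, SIGINT);

    // Termination request
    sigaddset(&sigset, SIGTERM);

    // Sent when the state of a child process changes
    sigaddset(&sigset, SIGCHLD);

    // User-defined signal, used here to request a configuration reload
    sigaddset(&sigset, SIGUSR1);
    sigprocmask(SIG_BLOCK, &sigset, NULL);

    // Create the file containing our PID
    SetPidFile(PID_FILE);

    // Main monitoring loop
    for (;;)
    {
        // Start the child process if required
        if (need_start)
        {
            pid = fork();
        }

        need_start = 1;

        if (pid == -1) // fork() failed
        {
            // Log the failure; the loop will then try to fork again
            WriteLog("[MONITOR] Fork failed (%s)\n", strerror(errno));
        }
        else if (!pid) // Child process
        {
            // This code runs in the child

            // Run the function that does the daemon's actual work
            status = WorkProc();

            // Terminate the child with the resulting code
            exit(status);
        }
        else // Parent (monitor) process
        {
            // Wait for one of the blocked signals to arrive
            sigwaitinfo(&sigset, &siginfo);

            // The child's state has changed
            if (siginfo.si_signo == SIGCHLD)
            {
                // Collect the child's termination status
                wait(&status);

                // Extract the exit code
                status = WEXITSTATUS(status);

                // The child reported that the daemon should stop
                if (status == CHILD_NEED_TERMINATE)
                {
                    WriteLog("[MONITOR] Child stopped\n");

                    // Leave the monitoring loop
                    break;
                }
                else if (status == CHILD_NEED_WORK) // The child must be restarted
                {
                    WriteLog("[MONITOR] Child restart\n");
                }
            }
            else if (siginfo.si_signo == SIGUSR1) // Configuration reload requested
            {
                kill(pid, SIGUSR1); // Forward the signal to the child
                need_start = 0;     // Do not start a new child on the next iteration
            }
            else // Any other signal means the daemon must stop
            {
                // Log which signal arrived
                WriteLog("[MONITOR] Signal %s\n", strsignal(siginfo.si_signo));

                // Terminate the child
                kill(pid, SIGTERM);
                status = 0;
                break;
            }
        }
    }

    // Log that the monitor has stopped
    WriteLog("[MONITOR] Stop\n");

    // Remove the PID file
    unlink(PID_FILE);

    return status;
}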


The following points about this code need clarification:

An auxiliary function is needed to create the PID file.
Code:
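// Requires <stdio.h> and <unistd.h>
void SetPidFile(char *Filename)
{
    FILE *f;

    // Create the file, or truncate it if it already exists
    f = fopen(Filename, "w+");
    if (f)
    {
        // pid_t is a signed type, so print it as a signed integer
        fprintf(f, "%d", (int)getpid());
        fclose(f);
    }
}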
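SetPidFile silently ignores a failure to create the file. A variant that reports the result, together with a helper that reads the PID back (as a control script or another process might do before sending a signal), could look roughly like this. Both function names, WritePidFile and ReadPidFile, are hypothetical and do not appear in the original source.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical variant of SetPidFile that reports failure to the caller.
 * Returns 0 on success, -1 if the file could not be written. */
int WritePidFile(const char *filename)
{
    FILE *f = fopen(filename, "w");
    if (!f)
        return -1;

    /* pid_t is a signed integer type; cast so the format matches. */
    if (fprintf(f, "%ld\n", (long)getpid()) < 0)
    {
        fclose(f);
        return -1;
    }

    /* fclose() flushes the buffer, so a write error can surface here too. */
    return (fclose(f) == 0) ? 0 : -1;
}

/* Hypothetical counterpart: read a PID back from the file.
 * Returns the PID, or -1 if the file is missing or malformed. */
pid_t ReadPidFile(const char *filename)
{
    FILE *f = fopen(filename, "r");
    long pid = -1;

    if (!f)
        return -1;

    if (fscanf(f, "%ld", &pid) != 1 || pid <= 0)
        pid = -1;

    fclose(f);
    return (pid_t)pid;
}
```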



At this point our daemon can start, monitor its child (which performs the main work) and, if necessary, restart it or signal it that the configuration has changed. Next, consider the template for the child's code:
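// Requires <signal.h>
int WorkProc()
{
    struct sigaction sigact;
    sigset_t sigset;
    int signo;
    int status;

    // Error signals are handled by a dedicated handler.
    // SA_SIGINFO requests the extended handler signature (siginfo_t and context)
    sigact.sa_flags = SA_SIGINFO;
    // The handler function
    sigact.sa_sigaction = signal_error;

    sigemptyset(&sigact.sa_mask);

    // Install the handler for the error signals

    sigaction(SIGFPE, &sigact, 0);  // Arithmetic (FPU) error
    sigaction(SIGILL, &sigact, 0);  // Illegal instruction
    sigaction(SIGSEGV, &sigact, 0); // Invalid memory access
    sigaction(SIGBUS, &sigact, 0);  // Bus error, for example a misaligned access

    sigemptyset(&sigset);

    // Block the signals that we are going to wait for.
    // Quit request from the keyboard
    sigaddset(&sigset, SIGQUIT);

    // Interrupt from the terminal
    sigaddset(&sigset, SIGINT);

    // Termination request
    sigaddset(&sigset, SIGTERM);

    // User-defined signal, used here to request a configuration reload
    sigaddset(&sigset, SIGUSR1);
    sigprocmask(SIG_BLOCK, &sigset, NULL);

    // Set the maximum number of descriptors that may be open
    SetFdLimit(FD_LIMIT);

    // Log that the daemon has started
    WriteLog("[DAEMON] Started\n");

    // Start the worker threads
    status = InitWorkThread();
    if (!status)
    {
        // Wait for incoming signals
        for (;;)
        {
            // Block until one of the signals in the set arrives
            sigwait(&sigset, &signo);

            // A configuration reload was requested
            if (signo == SIGUSR1)
            {
                // Reload the configuration
                status = ReloadConfig();
                if (status == 0)
                {
                    WriteLog("[DAEMON] Reload config failed\n");
                }
                else
                {
                    WriteLog("[DAEMON] Reload config OK\n");
                }
            }
            else // Any other signal: leave the loop and shut down
            {
                break;
            }
        }

        // Stop all worker threads and release their resources
        DestroyWorkThread();
    }
    else
    {
        WriteLog("[DAEMON] Create work thread failed\n");
    }

    WriteLog("[DAEMON] Stopped\n");

    // Tell the monitor that the daemon does not need to be restarted
    return CHILD_NEED_TERMINATE;
}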




The following should be said about this code:

These functions depend on your daemon's implementation.
The principle of operation is as follows: we install our handler for the error signals, then start all the worker threads and wait for a termination signal or a configuration-reload signal.

Handling runtime errors, with a detailed report in the log.

Of course, daemons should work flawlessly and never fail, but everyone makes mistakes, and some errors are quite difficult to detect during testing. This is especially true of errors that occur only under heavy load. That is why correct error handling, which extracts as much information from the error as possible, is an important part of daemon development. Here we consider error handling that preserves the call stack (backtrace). This tells us exactly where the error occurred (in which function) and also who called that function.

Error handler function code:
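// Requires <execinfo.h> and <ucontext.h>; define _GNU_SOURCE before the
// includes so that REG_RIP / REG_EIP are available
static void signal_error(int sig, siginfo_t *si, void *ptr)
{
    void *ErrorAddr;
    void *Trace[16];
    int x;
    int TraceSize;
    char **Messages;

    // Log the signal and the memory address associated with it
    WriteLog("[DAEMON] Signal: %s, Addr: 0x%016lX\n", strsignal(sig), (unsigned long)si->si_addr);

#if __WORDSIZE == 64 // 64-bit system
    // Address of the instruction that caused the error
    ErrorAddr = (void *)((ucontext_t *)ptr)->uc_mcontext.gregs[REG_RIP];
#else
    // Address of the instruction that caused the error
    ErrorAddr = (void *)((ucontext_t *)ptr)->uc_mcontext.gregs[REG_EIP];
#endif

    // Capture the backtrace.
    // Entry 0 is this handler; entry 1 is overwritten with the faulting
    // address so that the failing function appears in the trace
    TraceSize = backtrace(Trace, 16);
    Trace[1] = ErrorAddr;

    // Translate the addresses into readable strings
    Messages = backtrace_symbols(Trace, TraceSize);
    if (Messages)
    {
        WriteLog("== Backtrace ==\n");

        // Write the trace to the log, skipping entry 0 (this handler)
        for (x = 1; x < TraceSize; x++)
        {
            WriteLog("%s\n", Messages[x]);
        }

        WriteLog("== End Backtrace ==\n");
        free(Messages);
    }

    WriteLog("[DAEMON] Stopped\n");

    // Stop all worker threads and release their resources
    DestroyWorkThread();

    // Exit with the code that asks the monitor to restart the daemon
    exit(CHILD_NEED_WORK);
}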



With backtrace, the log contains data of approximately the following form:
[DAEMON] Signal: Segmentation fault, Addr: 0x0000000000000000
== Backtrace ==
/usr/sbin/my_daemon(GetParamStr+0x34) [0x8049e44]
/usr/sbin/my_daemon(GetParamInt+0x3a) [0x8049efa]
/usr/sbin/my_daemon(main+0x140) [0x804b170]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x126bd6]
/usr/sbin/my_daemon() [0x8049ba1]
== End Backtrace ==

From this data you can see that main called GetParamInt, and GetParamInt called GetParamStr. In GetParamStr, at offset 0x34, a memory access at address zero occurred.

In addition to the call stack, you can save the register values (the uc_mcontext.gregs array).
Note that backtrace is most informative only when the binary is built without stripping debugging information and is linked with the -rdynamic option.
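For experimenting with the output format outside a signal handler, a stand-alone sketch such as the following may be useful. It assumes glibc, which provides backtrace() and backtrace_symbols_fd() in <execinfo.h>; the helper name dump_backtrace and the MAX_FRAMES constant are hypothetical. Unlike backtrace_symbols(), the _fd variant writes straight to a descriptor without allocating memory.

```c
#include <execinfo.h>

#define MAX_FRAMES 16

/* Hypothetical helper: write the current call stack to a file descriptor.
 * Returns the number of frames captured. */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    /* Unlike backtrace_symbols(), this variant does not call malloc(). */
    backtrace_symbols_fd(frames, count, fd);

    return count;
}
```

Calling dump_backtrace(2) from any function prints the stack to standard error. Building once with and once without -rdynamic shows the difference in how many function names are resolved.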

As you can see, the code uses the CHILD_NEED_WORK and CHILD_NEED_TERMINATE constants. You can choose their values yourself; the main thing is that they differ and that each fits in an exit status (0 to 255).

Some issues related to system resources.

An important point is setting the maximum number of descriptors. Every open file, socket, pipe and so on consumes a descriptor; once they are exhausted, the process cannot open a file, create a socket or accept an incoming connection. This can disrupt the daemon's operation. On many Linux systems the default limit is 1024 open descriptors, which is very small for heavily loaded network daemons. Therefore we raise this value to match the daemon's requirements. To do this, use the following function:
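// Requires <sys/resource.h>
int SetFdLimit(int MaxFd)
{
    struct rlimit lim;
    int status;

    // Soft limit: the value currently enforced
    lim.rlim_cur = MaxFd;
    // Hard limit: the ceiling for the soft limit
    lim.rlim_max = MaxFd;

    // Apply the limits; returns 0 on success, -1 on error
    // (raising the hard limit requires privileges)
    status = setrlimit(RLIMIT_NOFILE, &lim);

    return status;
}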
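SetFdLimit sets both the soft and the hard limit, and raising the hard limit normally succeeds only for a privileged process. If the daemon has already dropped privileges, a more cautious approach is to change only the soft limit and cap it at the existing hard limit. The following is a sketch of that idea; the function name SetFdSoftLimit is hypothetical and is not part of the original source.

```c
#include <sys/resource.h>

/* Hypothetical variant of SetFdLimit for unprivileged processes: change only
 * the soft limit, capped at the current hard limit.
 * Returns the soft limit that was set, or -1 on error. */
long SetFdSoftLimit(long wanted)
{
    struct rlimit lim;

    if (wanted <= 0)
        return -1;

    /* Read the current limits so that the hard limit is left untouched. */
    if (getrlimit(RLIMIT_NOFILE, &lim) == -1)
        return -1;

    /* An unprivileged process may not exceed the hard limit. */
    if (lim.rlim_max != RLIM_INFINITY && (rlim_t)wanted > lim.rlim_max)
        wanted = (long)lim.rlim_max;

    lim.rlim_cur = (rlim_t)wanted;

    if (setrlimit(RLIMIT_NOFILE, &lim) == -1)
        return -1;

    return wanted;
}
```

The return value lets the caller log the limit that is actually in effect, which may be lower than the one requested.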



Instead of a conclusion.
So we have looked at how to create the basis for a daemon. Of course, the code does not claim to be perfect, but it does its job well.
The next article will cover installing and uninstalling the daemon, managing it, writing startup scripts for init.d and adding the daemon to system startup.

Link to source code: http://pastebin.com/jdX5wn0E
The source code contains all the functions in a single file. When developing a real project, it is better to split them into separate files according to their purpose.

Source: https://habr.com/ru/post/129207/

