Handling recurring SIGSEGV-like errors

This topic is well-worn, and many lances have been broken over it. One way or another, people keep asking whether an application written in C/C++ can survive, for example, dereferencing a null pointer. The short answer is yes; there are even articles on Habr on this subject.

One of the most frequent replies to this question is "Why? This simply should not happen!". The real reasons people stay interested in the topic vary, and one of them may be laziness. When checking absolutely everything is tedious or expensive, and exceptional situations are very rare, one can, without complicating the code, wrap potentially crashing fragments in a kind of try/catch that lets the application shut down gracefully, or even recover and continue working as if nothing had happened. What may seem most unusual is the desire to catch, again and again, errors that normally crash the application, handle them, and keep working.

So let's try to build something that solves the problem of handling SIGSEGV-like errors. The solution should be as cross-platform as possible and work on the most common desktop and mobile platforms, in both single-threaded and multi-threaded environments. Nested try/catch sections should also be supported. We will handle the following kinds of exceptional situations: memory access at invalid addresses, execution of invalid instructions, and division by zero. The crowning touch is that hardware exceptions will be turned into ordinary C++ exceptions.

For tasks like this, the usual recommendation is to use POSIX signals on non-Windows systems and Structured Exception Handling (SEH) on Windows. We will do roughly that, but instead of SEH we will use Vectored Exception Handling (VEH), which is often overlooked. According to Microsoft, VEH is an extension of SEH, i.e. something more capable and modern. VEH resembles POSIX signals in that a handler must be registered before any events can be caught. Unlike signals, however, VEH allows several handlers to be registered; they are called in turn until one of them handles the event.

In addition to the signal handlers, we will use the setjmp/longjmp pair, which lets us jump back to a chosen point after a fault occurs and handle the exceptional situation in some way. To make our creation work in multi-threaded environments we also need good old thread-local storage (TLS), which is available in all the environments we care about.
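As a minimal sketch of how setjmp/longjmp and TLS fit together, the following example keeps one saved context per thread and jumps back to it from a nested call. The names resume_point, failure_code, bail_out, risky_operation and run_guarded are hypothetical and are not taken from the library described here; bail_out merely stands in for a signal handler.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical names for illustration only.
thread_local std::jmp_buf resume_point;   // one saved context per thread
thread_local int failure_code = 0;        // side channel for the error code

// Stands in for a signal handler: abandons the current call chain.
[[noreturn]] void bail_out(int code) {
    assert(code != 0);                    // 0 is reserved for "no failure"
    failure_code = code;
    std::longjmp(resume_point, 1);        // makes setjmp return 1
}

void risky_operation(bool fail) {
    if (fail)
        bail_out(42);
}

// Returns 0 on normal completion, or the code stored by bail_out.
int run_guarded(bool fail) {
    if (setjmp(resume_point) != 0)        // non-zero: we arrived via longjmp
        return failure_code;
    risky_operation(fail);
    return 0;
}
```

Calling run_guarded(true) returns 42 without risky_operation ever returning normally. The error code travels through a separate thread-local variable because the value of setjmp may portably be used only in simple tests such as the if condition above.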

The simplest way to avoid crashing outright when a fault occurs is to write your own handler and register it. In most cases it is enough to collect the necessary information and shut the application down gracefully. The signal handler is registered in the well-known way. For POSIX-compatible systems it looks like this:

    stack_t ss;
    ss.ss_sp = exception_handler_stack;
    ss.ss_flags = 0;
    ss.ss_size = SIGSTKSZ;
    sigaltstack(&ss, 0);

    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;
    sa.sa_handler = signalHandler;

    for (int signum : handled_signals)
        sigaction(signum, &sa, &prev_handlers[signum - MIN_SIGNUM]);

The snippet above registers a handler for the following signals: SIGBUS, SIGFPE, SIGILL, SIGSEGV. In addition, the sigaltstack call specifies that the signal handler should run on its own, alternate stack. This lets the application survive even a stack overflow, which can easily happen with infinite recursion. Without an alternate stack this kind of error cannot be handled: the application simply crashes, because there is no stack left on which to call and run the handler, and nothing can be done about it. Pointers to the previously registered handlers are also saved, so that they can be called if our handler decides it has nothing to do.
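For readers who want to try the registration step in isolation, here is a self-contained sketch under a few assumptions: the handler crash_handler only reports and exits, the alternate stack is a fixed 64 KiB static buffer rather than SIGSTKSZ bytes, and previously installed handlers are not saved. None of these names come from the library itself.

```cpp
#include <cassert>
#include <cstddef>
#include <signal.h>
#include <unistd.h>

// Placeholder handler for this sketch: report and terminate.
// Only async-signal-safe calls (write, _exit) are used.
extern "C" void crash_handler(int signum) {
    static const char msg[] = "fatal signal caught\n";
    const ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    _exit(128 + signum);
}

// Installs crash_handler for the four signals, running on an alternate stack.
// Returns true if every call succeeded.
bool install_crash_handler() {
    static char alt_stack[64 * 1024];     // assumed size for this sketch
    assert(sizeof alt_stack >= static_cast<std::size_t>(MINSIGSTKSZ));

    stack_t ss = {};
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, nullptr) != 0)
        return false;

    struct sigaction sa = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;             // run the handler on alt_stack
    sa.sa_handler = crash_handler;

    const int signals[] = {SIGBUS, SIGFPE, SIGILL, SIGSEGV};
    for (int signum : signals)
        if (sigaction(signum, &sa, nullptr) != 0)
            return false;
    return true;
}
```

Note that the alternate stack set by sigaltstack applies to the calling thread only, so in a multi-threaded program each thread that should survive a stack overflow needs its own call with its own buffer.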

For Windows, the code is much shorter:

 exception_handler_handle = AddVectoredExceptionHandler(1, vectoredExceptionHandler);

There is a single handler, and it catches all events at once (not only hardware exceptions, it must be said); there is no way to do anything with the stack as on Linux, for example. The value 1 passed as the first argument to AddVectoredExceptionHandler indicates that our handler must be called first, before any other registered handlers. This gives us the chance to be first and take the actions we need.

The handler for POSIX systems looks like this:

    static void signalHandler(int signum)
    {
        if (execution_context) {
            sigset_t signals;
            sigemptyset(&signals);
            sigaddset(&signals, signum);
            sigprocmask(SIG_UNBLOCK, &signals, NULL);
            reinterpret_cast<ExecutionContextStruct *>(static_cast<ExecutionContext *>(execution_context))->exception_type = signum;
            longjmp(execution_context->environment, 0);
        }
        else if (prev_handlers[signum - MIN_SIGNUM].sa_handler) {
            prev_handlers[signum - MIN_SIGNUM].sa_handler(signum);
        }
        else {
            signal(signum, SIG_DFL);
            raise(signum);
        }
    }

For the signal handler to be reusable, i.e. callable again and again when new errors occur, we must unblock the delivered signal on every invocation. This is done when the handler knows that the fault occurred in a section of code wrapped in the kind of try/catch discussed later. If the fault occurred somewhere we did not expect it at all, control is passed to the previously registered signal handler; if there is none, the default handler is invoked, which terminates the troubled application.
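The effect of unblocking can be seen in a small experiment. The sketch below is an illustration rather than the library's code: it uses hypothetical names (retry_point, last_signal, recovering_handler, count_recoveries) and raise(SIGSEGV) as a deterministic stand-in for a real faulting access.

```cpp
#include <cassert>
#include <csetjmp>
#include <signal.h>

// Hypothetical names for illustration only.
thread_local std::jmp_buf retry_point;
thread_local volatile sig_atomic_t last_signal = 0;

extern "C" void recovering_handler(int signum) {
    // The signal is blocked while its handler runs. longjmp leaves the
    // handler without returning, so the signal is unblocked by hand first.
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signum);
    sigprocmask(SIG_UNBLOCK, &set, nullptr);

    last_signal = signum;
    std::longjmp(retry_point, 1);
}

// Raises SIGSEGV `attempts` times and returns how many were recovered from.
int count_recoveries(int attempts) {
    assert(attempts >= 0);

    struct sigaction sa = {};
    struct sigaction old = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = recovering_handler;
    sigaction(SIGSEGV, &sa, &old);

    // volatile is a defensive choice for locals that live across longjmp.
    volatile int recovered = 0;
    for (volatile int i = 0; i < attempts; i = i + 1) {
        if (setjmp(retry_point) == 0)
            raise(SIGSEGV);               // stand-in for a real fault
        else
            recovered = recovered + 1;    // reached via longjmp
    }

    sigaction(SIGSEGV, &old, nullptr);    // restore the previous disposition
    return recovered;
}
```

On systems where setjmp does not save the signal mask, removing the sigprocmask call would leave SIGSEGV blocked after the first recovery, and later signals would no longer reach the handler.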

The Windows handler looks like this:

    static LONG WINAPI vectoredExceptionHandler(struct _EXCEPTION_POINTERS *_exception_info)
    {
        if (!execution_context ||
            _exception_info->ExceptionRecord->ExceptionCode == DBG_PRINTEXCEPTION_C ||
            _exception_info->ExceptionRecord->ExceptionCode == 0xE06D7363L /* C++ exception */
            )
            return EXCEPTION_CONTINUE_SEARCH;

        reinterpret_cast<ExecutionContextStruct *>(static_cast<ExecutionContext *>(execution_context))->dirty = true;
        reinterpret_cast<ExecutionContextStruct *>(static_cast<ExecutionContext *>(execution_context))->exception_type = _exception_info->ExceptionRecord->ExceptionCode;
        longjmp(execution_context->environment, 0);
    }

As mentioned above, the VEH handler on Windows catches a lot more than hardware exceptions. For example, calling OutputDebugString raises an exception with the code DBG_PRINTEXCEPTION_C. We do not process such events and simply return EXCEPTION_CONTINUE_SEARCH, which makes the OS look for the next handler that will handle the event. We also do not want to handle C++ exceptions, which correspond to the magic code 0xE06D7363L, a value that has no proper named constant.

On both POSIX-compatible systems and Windows, longjmp is called at the end of the handler. It lets us go back up the stack to the very beginning of the try section and bypass it, landing in the catch branch, where all the actions needed to recover can be performed before continuing as if nothing terrible had happened.

For an ordinary C++ try to start catching such exceptional situations, it is enough to place a small macro, HW_TO_SW_CONVERTER, at its very beginning:

    #define HW_TO_SW_CONVERTER_UNIQUE_NAME(NAME, LINE) NAME ## LINE
    #define HW_TO_SW_CONVERTER_INTERNAL(NAME, LINE) \
        ExecutionContext HW_TO_SW_CONVERTER_UNIQUE_NAME(NAME, LINE); \
        if (setjmp(HW_TO_SW_CONVERTER_UNIQUE_NAME(NAME, LINE).environment)) \
            throw HwException(HW_TO_SW_CONVERTER_UNIQUE_NAME(NAME, LINE))
    #define HW_TO_SW_CONVERTER() HW_TO_SW_CONVERTER_INTERNAL(execution_context, __LINE__)

It looks rather convoluted, but what happens here is actually very simple:

setjmp is called, which remembers the place where we started and where we need to return in case of a fault.
If a hardware exception occurs along the execution path, setjmp returns a non-zero value after longjmp has been called somewhere along that path. This causes a C++ exception of type HwException to be thrown, carrying information about what kind of error occurred. The thrown exception is then caught by a standard catch.

The macro above boils down to the following pseudocode:

 if (setjmp(environment)) throw HwException();
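To make the conversion concrete, here is a self-contained POSIX sketch of the same idea without the macro. It is only an approximation under stated assumptions: HwError is a hypothetical stand-in for HwException, the names throw_point, pending_signal, jump_back_handler and catch_as_exception are invented for the example, and raise() replaces a real hardware fault so that the behavior is deterministic.

```cpp
#include <cassert>
#include <csetjmp>
#include <signal.h>
#include <stdexcept>

// Hypothetical stand-in for the article's HwException.
struct HwError : std::runtime_error {
    explicit HwError(int sig)
        : std::runtime_error("hardware exception"), signum(sig) {}
    int signum;
};

thread_local std::jmp_buf throw_point;
thread_local volatile sig_atomic_t pending_signal = 0;

extern "C" void jump_back_handler(int sig) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, sig);
    sigprocmask(SIG_UNBLOCK, &set, nullptr);  // keep the handler reusable
    pending_signal = sig;
    std::longjmp(throw_point, 1);
}

// Raises `sig` inside a try block and returns the signal number carried by
// the resulting C++ exception, or 0 if no exception was thrown.
int catch_as_exception(int sig) {
    assert(sig > 0);

    struct sigaction sa = {};
    struct sigaction old = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = jump_back_handler;
    sigaction(sig, &sa, &old);

    int result = 0;
    try {
        if (setjmp(throw_point) != 0)
            throw HwError(pending_signal);    // thrown from ordinary code
        raise(sig);                           // stand-in for a real fault
    } catch (const HwError &e) {
        result = e.signum;
    }

    sigaction(sig, &old, nullptr);            // restore the previous handler
    return result;
}
```

The exception is thrown only after longjmp has returned control to ordinary code, never from inside the signal handler itself.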

The setjmp/longjmp approach has one major drawback. With ordinary C++ exceptions the stack is unwound, and the destructors of all objects created along the way are called. With longjmp we jump straight back to the starting point, and no stack unwinding takes place. This restricts the code inside such try sections: no resources may be allocated there, because they may be lost forever, which leads to leaks.
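One way to live with this restriction, shown here only as a sketch with hypothetical names (guard_point, simulate_fault, fill_guarded, fill_with_owner), is to keep resource ownership outside the guarded region so that the jump skips nothing that needs cleanup. The function simulate_fault stands in for a signal handler calling longjmp.

```cpp
#include <cassert>
#include <csetjmp>
#include <vector>

thread_local std::jmp_buf guard_point;    // hypothetical name

// Stands in for a signal handler that jumps out of the faulting code.
[[noreturn]] void simulate_fault() {
    std::longjmp(guard_point, 1);
}

// Guarded region: works only on memory owned by the caller and creates no
// objects with destructors, so longjmp skips nothing that needs cleanup.
bool fill_guarded(int *buffer, int count, bool fail) {
    assert(count >= 0);
    if (setjmp(guard_point) != 0)
        return false;                     // recovered; caller still owns buffer
    for (int i = 0; i < count; ++i)
        buffer[i] = i;
    if (fail)
        simulate_fault();
    return true;
}

// The vector lives outside the guarded function, so its destructor runs
// normally whether or not the fault path was taken.
bool fill_with_owner(bool fail) {
    std::vector<int> storage(4, 0);
    return fill_guarded(storage.data(), 4, fail);
}
```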

Another limitation is that setjmp cannot be used in functions or methods declared inline. This is a limitation of setjmp itself. At best the compiler simply refuses to build such code; at worst it builds it, but the resulting binary crashes.

The strangest thing that has to be done after handling a hardware exception on Windows is calling RemoveVectoredExceptionHandler. If this is not done, then after each entry into our VEH handler followed by longjmp, the situation is as if our handler had been registered one more time. As a result, with every subsequent fault the handler is called more and more times in a row, with disastrous consequences. This solution was found purely through a great deal of trial and error and is not documented anywhere.

For the solution to work in multi-threaded environments, each thread needs its own place to store the execution context saved by setjmp. TLS is used for this purpose, and there is nothing tricky about using it.

The execution context itself is designed as a simple class with the following constructor and destructor:

    ExecutionContext::ExecutionContext() : prev_context(execution_context)
    {
    #if defined(PLATFORM_OS_WINDOWS)
        dirty = false;
    #endif
        execution_context = this;
    }

    ExecutionContext::~ExecutionContext()
    {
    #if defined(PLATFORM_OS_WINDOWS)
        if (execution_context->dirty)
            RemoveVectoredExceptionHandler(exception_handler_handle);
    #endif
        execution_context = execution_context->prev_context;
    }

This class has a prev_context field that makes it possible to chain nested try/catch sections.
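The chaining can be illustrated with a stripped-down sketch. ScopedContext, current_context and depth_now are hypothetical names; the jmp_buf and the Windows-specific dirty flag of the real class are left out so that only the linked-list behavior remains.

```cpp
#include <cassert>

class ScopedContext;                      // simplified, hypothetical class

// Innermost active context of the current thread.
thread_local ScopedContext *current_context = nullptr;

class ScopedContext {
public:
    // Push: remember the enclosing context and become the innermost one.
    ScopedContext() : prev_(current_context) { current_context = this; }

    // Pop: make the enclosing context current again.
    ~ScopedContext() {
        assert(current_context == this);  // contexts must nest strictly
        current_context = prev_;
    }

    ScopedContext(const ScopedContext &) = delete;
    ScopedContext &operator=(const ScopedContext &) = delete;

    // Number of contexts active on this thread, counting this one.
    int depth() const {
        int n = 0;
        for (const ScopedContext *c = this; c != nullptr; c = c->prev_)
            ++n;
        return n;
    }

private:
    ScopedContext *prev_;
};

// Nesting depth of guarded sections on the calling thread.
int depth_now() {
    return current_context ? current_context->depth() : 0;
}
```

Because a handler always consults the innermost context, a fault inside a nested section is routed to the nearest enclosing try, and leaving that section restores the outer one.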

The full source of the library described above is available on GitHub:
https://github.com/kutelev/hwtrycatch

To show that everything works as described, there are automated builds and tests for Windows, Linux, Mac OS X and Android:

https://ci.appveyor.com/project/kutelev/hwtrycatch
https://travis-ci.org/kutelev/hwtrycatch

This also works on iOS, but since no device was available for testing, there are no automated tests for it.

In conclusion, this approach can also be used in plain C. All that is needed is a few macros that mimic C++ try/catch.

It is also worth saying that using the methods described here is, in most cases, a very bad idea, especially considering that at the signal level it is impossible to find out what caused SIGSEGV or SIGBUS. It is equally likely to be a read from or a write to an invalid address. While reading from arbitrary addresses is not a destructive operation, writing can lead to disastrous results such as corruption of the stack, the heap, or even the code itself.

Source: https://habr.com/ru/post/332626/
