What is so difficult about handling C ++ exceptions?

Exceptions and the associated stack promotion is one of the most enjoyable techniques in C ++. Exception handling is intuitively consistent with the block structure of the program. Externally, exception handling seems very logical and natural.

Careful use of stack objects allows you to create very efficient and safe code, where, unlike systems with garbage collection, reference locality is preserved, which makes it possible to reduce the number of calls to the system behind the memory, reduce its fragmentation, more efficiently use the cache memory.

However, in C ++, exceptions are traditionally considered literally as exceptions to error recovery. It is difficult to say whether this is a cause or a consequence of the fact that the implementation of exception handling by compilers is extremely expensive. Let's try to figure out why.
')
Continued here .
How are things.

There are two approaches to implementing exception handling:

The first can be characterized by the words - “let the loser pay”. The compiler tries to minimize costs in cases where exceptions do not occur. Ideally, the program does not bear any additional load, all the necessary information is located aside from the code in a form suitable for promotion.
The second strategy is “gradually pay everything and always.” In other words, the program incurs certain costs in the process of maintaining the relevance of the information necessary for the correct promotion of the stack. In this case, in case of an exception, promotion of the stack is cheaper.

A few examples:

GCC / SJLJ . SJLJ is short for setjmp / longjmp. It refers rather to the first approach, but thanks to the control transfer scheme, it also has a fixed fee for each try . In essence, this implementation has absorbed the worst that is inherent in both approaches. Up until the fourth version, this was the main exception handling option.
As the name implies, control is transferred via a call to longjmp, each try generates a call to setjmp. The corresponding buffer is allocated in the stack for each try block.
At the beginning of each function, a prolog is created that registers the current frame in the context stack. Similarly, an epilogue is created that removes the current context from the top of the context stack. Next to each function, an auxiliary code is created to clear the resources.
More precisely, the compiler each return from a function, potentially capable of completing an exception, enters the tree as a key, the value is a pointer to the cleanup code that must be taken at this point to clear the function context. The linker assembles the pieces of these trees for each link module into a single project tree (very roughly). This miracle is called LSDA (language specific data area) and it is located in the ".gcc_except_table" section.
When an exception occurs, based on the type_info of the exception being thrown, a block is searched for that can handle the exception. Starting from the current context and up to the handler context (using navigation through call frames), the addresses of the code that (depending on the scope of local variables) must be executed in this place are retrieved and executed. Then control is transferred.

There is a prejudice that this method is very expensive. Already due to the fact that for each try-block setjmp is called, which is not cheap. In fact, you need to fully save the state of the processor, where there may be dozens of registers. Whereas at the time of the exception, the contents of most of these registers are already useless. In fact, the compiler does quite rationally. He deploys setjmp, and saves only useful registers (he already has this information). The author doubts that the cost of setjmp is so high.

But what really catches your eye is a voluminous auxiliary code, especially in non-trivial cases. The compiler, like YACC, lists all the states of the stack automaton. And, though, the optimizer of opportunities cleans redundancy and the trivial code, what remains is more than enough.
GCC / DW2 . This is just an example of the first approach to exception handling. DW2 means DWARF2 (now already 3) - the storage format of auxiliary, including debugging information in the executable file. After all, debugging information is also needed so that at any time you can find out the value of any variable, including in the frames of the previous (upper) calls. Therefore, the compiler in the process of generating code postpones information about what it allocates in the stack, in which registers it allocates variables when it stores them ... In fact, this format is not identical to DWARF, although it is very close to it. Standard version for the fourth version of GCC.

Conceptually, at each address of the program code information is stored on how to get into the higher-level call frame. In practice, due to the bulk of this information, it is compressed, in fact, calculated using the interpretation of the byte code. This bytecode is executed when an exception occurs. All this is located in the sections ".eh_frame" and ".eh_frame_hdr".
Yes, among other things, the DWARF interpreter is a great backdoor, with which, by changing the byte code, you can catch the exception and send it for processing to your heart’s content.
GCC / DW2 uses almost the same LSDA section as GCC / SJLJ.

As we see, the costs associated with the promotion of the stack (in the absence of exceptions) are practically absent. However, the cost of initiating an exception is high. In addition, it is impossible not to note the strong integration of the architecture-dependent part of the compiler and its rather high-level layers.
MS VC ++ . This compiler implements a second processing strategy.
- For each function that can potentially throw exceptions, the compiler creates as a stack variable a structure from a pointer to the previous similar structure, addresses of the handler function and auxiliary data. The address of this structure is entered in FS: [0], which is the top of the stack of these structures. The FS register in Win32 is used as the Thread Information Block ( TIB ) (GS in Win64). A handler function is also created (with its own data set) and an epilogue that restores FS: [0] in the event of a successful completion.
- The compiler creates a table of structures — an element for each try block in a function. Each try block has an index of the beginning and end in this table (the nested block has a nested interval) corresponding to a certain state, the compiler itself keeps track of the relevance of this index. In this way, the compiler implements a stack of try-blocks.
- A catch-table table is created for each try-block. For each type of exception, the type_info table of all base classes in the hierarchy of this type of exception is created.
- For each function, an unwind table is created, each element of which contains a pointer to the function, which releases some resource and the number of the previous element. There can be several chains in the table, depending on the scopes of objects with destructors. At the time of the exception, by the index of the current state, which was mentioned above, you can find the necessary chain and call all the necessary destructors.
- For the x64 version, the auxiliary stack structures were transferred to .pdata whenever possible, probably in MS they consider the first strategy more promising.
- When an exception is triggered, the main work is carried out by the operating system, where control falls through SYSENTER.
This method has the same drawbacks as SJLJ - extensive auxiliary code and low portability.
The process of raising an exception and selecting the appropriate catch block looks about the same everywhere:
- When an exception is raised, its handle is created, which contains a copy of the object, its type_info, a pointer to the destructor
- Going up the stack of try blocks and clearing all registered stack objects, (navigating this stack is different everywhere, but the essence is the same), look through the lists of catch blocks and look for suitable ones.
- If a suitable catch block is found, the exception object becomes a local variable, and we call this block. If the catch block accepts an exception by value, not by reference, a copy is created.
- If there was no exception call, we kill the object - the exception
- "Mistress on the note":
```
some_exception exc("oioi"); throw exc; 
```
  spawns an extra copy / destructor constructor
```
 throw *new some_exception("oioi"); 
```
  leaks memory
```
 catch(some_exception exc) ... 
```
  again an extra call to the constructor and destructor
```
 catch(const some_exception *exc) ... 
```
  the exception will fly by, if you do not throw a pointer
```
 throw some_exception("oioi"); ... catch (const some_exception &exc).... 
```
  low cost

Details can be found here , here and here .

What if ...
And, it would seem, all that matters is to call in the right order destructors whose bodies already exist. How did it happen that a simple, in general, problem has such viscous, heavy and, moreover, independently developed solutions? It is difficult to say, it happened so historically.
Let's try to sketch a solution, trying to keep it simple and, whenever possible, architecturally independent.

First of all, we choose a strategy - this will be the second option.
Transfer control - setjmp / longjmp

We create a structure, all descendants of which have the ability to self-register for possible promotion.

 struct unw_item_t { unw_item_t (); virtual ~unw_item_t (); void unreg(); unw_item_t *prev_; };

And another one whose scope is the try block

 struct jmp_buf_splice { jmp_buf_splice (); ~jmp_buf_splice (); jmp_buf buf_; jmp_buf_splice *prev_; unw_item_t objs_; };

For simplicity, we will only throw const char * exceptions using
```
  extern int throw_slice (const char *str); 
```

Several macros to simulate a try block

 //   #define TRY_BLOCK { \ jmp_buf_splice __sl; \ const char *__exc = (const char *)setjmp (__sl.buf_); \ if (NULL == __exc) { ... // -  catch(…) ..    const char* #define CATCH_BLOCK_FIN \ } else { ... //   #define FIN_BLOCK \ } \ } ... //   #define THROW_IN_BLOCK(exc) \ throw_slice (exc); ... //   , __exc   TRY_BLOCK #define RETHROW_IN_BLOCK \ throw_slice (__exc);

Now we will show the bodies of the jmp_buf_splice class members:
```
 static jmp_buf_splice *root_slice_ = NULL; jmp_buf_splice::jmp_buf_splice () { objs_ = NULL; prev_ = root_slice_; root_slice_ = this; } jmp_buf_splice::~jmp_buf_splice () { root_slice_ = prev_; } 
```
Here is an option for single-threaded implementation. If there are multiple threads, instead of root_slice_, we will need to use TLS, similarly to how GCC does, for example.

It's time for members of the unw_item_t class:

 unw_item_t::unw_item_t () { if (NULL != root_slice_) { prev_ = root_slice_->objs_; root_slice_->objs_ = this; } } unw_item_t::~unw_item_t () { unreg(); } unw_item_t::unreg () { if (NULL != root_slice_ && (prev_ != reinterpret_cast<unw_item_t *>(~0))) { root_slice_->objs_ = prev_; prev_ = reinterpret_cast<unw_item_t *>(~0); } }

Now consider the process of exclusion exception and promotion of the stack:

 static int pop_slice () { jmp_buf_splice *sl = root_slice_; assert (NULL != sl); root_slice_ = sl->prev_; return 0; } int throw_slice (const char *str, bool popstate) { if (NULL == str) return -1; jmp_buf_splice *sl = root_slice_; unw_item_t *obj = root_slice_->objs_; while (NULL != obj) { unw_item_t *tmp = obj; obj = obj->prev_; tmp->~unw_item_t (); } if (popstate) pop_slice (); longjmp (sl->buf_, int(str)); return 0; }

Service class - analog std :: auto_ptr:

  template<typename cl> class deleter_t : public unw_item_t { public: deleter_t (cl *obj){ptr_ = obj;}; virtual ~deleter_t () {delete ptr_;}; private: cl *ptr_; deleter_t (); deleter_t (const deleter_t &); deleter_t &operator= (const deleter_t &); };

Service class - array:

 template<typename cl> class vec_deleter_t : public unw_item_t { public: vec_deleter_t (cl *obj){ptr_ = obj;}; virtual ~ vec_deleter_t () {delete [] ptr_;}; private: cl *ptr_; vec_deleter_t (); vec_deleter_t (const vec_deleter_t &); vec_deleter_t &operator= (const vec_deleter_t &); };

Examples

Test class

 class _A { public: _A():val_(++cnt_){printf ("A::A(%d)\n",val_);} _A(int i):val_(i){printf ("A::A(%d)\n",val_);} virtual ~_A(){printf ("A::~A(%d)\n",val_);} static int cnt_; }; int _A::cnt_ = 0; class A : public unw_item_t, _A {};

Example 1

 A a(1); TRY_BLOCK { A b(2); THROW_IN_BLOCK("error\n"); std::cerr << "notreached\n"; } CATCH_BLOCK_FIN { std::cerr << __exc; } FIN_BLOCK;

A :: A (1)
A :: A (2)
A :: ~ A (2)
error
A :: ~ A (1)

Example 2

 A a(1); TRY_BLOCK { A b(2); TRY_BLOCK { A c(3); THROW_IN_BLOCK("error\n"); std::cerr << "notreached\n"; } CATCH_BLOCK_FIN { std::cerr << "." << __exc; RETHROW_IN_BLOCK; } FIN_BLOCK; std::cerr << "notreached\n"; } CATCH_BLOCK_FIN { std::cerr << ".." << __exc; } FIN_BLOCK;

A :: A (1)
A :: A (2)
A :: A (3)
A :: ~ A (3)
.error
A :: ~ A (2)
..error
A :: ~ A (1)

Example 3

  TRY_BLOCK { vec_deleter_t<_A> da(new _A[3]); TRY_BLOCK { THROW_IN_BLOCK("error\n"); std::cerr << "notreached\n"; } CATCH_BLOCK_FIN { std::cerr << "." << __exc; RETHROW_IN_BLOCK; } FIN_BLOCK; std::cerr << "notreached\n"; } CATCH_BLOCK_FIN { std::cerr << ".." << __exc; } FIN_BLOCK;

A :: A (1)
A :: A (2)
A :: A (3)
.error
A :: ~ A (3)
A :: ~ A (2)
A :: ~ A (1)
..error

Restrictions
This solution has many disadvantages:

You can not throw exceptions in the destructor. The unw_item_t destructor has not yet deleted the link to this instance, as a result, the destructor will be called again.
Creating an object inherited from the unw_item_t class using the new operator is very dangerous. Even if the memory itself is taken care of, such a pointer can fall into another's context or even into someone else's stream, the object on which it looks can suddenly cause a destructor, which will end in a metabolic catastrophe.
A class inherited from unw_item_t cannot be aggregated as a member of another class, otherwise its destructor will be called twice.
The described method cannot be integrated with hardware exceptions.
Restrictions on exception types. Above, we used only a string pointer. If you pass as an exception primitive types, then there can be only one option. If we use the pointer to the object as an exception, we can use RTTI. Can offer something like
```
 #define CATCH_BLOCK_TYPED(t) \ } else if (NULL != dynamic_cast<t>(__exc)) { 
```
And this will give us the opportunity to use exceptions of different types. But then it is impossible to throw exceptions of primitive types.
Delete the object thrown exception must be the user.

But still.
Despite the described limitations, the described method has inherent advantages:

Simplicity. A few dozen lines of code - and everything works.
Transparency concept.
Easy portability. No dependence on architecture.

Is it possible to eliminate the disadvantages of this method, while retaining its advantages? Yes and no. Using only C ++ tools, it is impossible to do this.

What the author tends.
In the order of technical nonsense, we will think about how to modify the compiler in order to correctly implement the above scheme?
What was missing from the above solution? Knowledge of how the object was spawned.
For example, if an object is built on memory allocated from a common heap and can migrate between threads, it can in no case be registered in a thread-dependent stack. It is not necessary to register an object aggregated into another object anywhere.
And with an object of the same type, but on a stack memory, it is necessary to do this. Of course, it is possible to give a pointer to this stack object to another thread, but it is difficult to imagine in what situation this could be useful.
So:

For stack objects of type T, the compiler actually creates a wrapping class of type

 template<class T> class __st_wrapper : public unw_item_t { public: virtual ~__st_wrapper() { unreg(); ((T*)data_)->T::~T(); }; private: char data_[sizeof(T)]; };

as well as calling the desired constructor T.

The static member of the jmp_buf_splice :: root_slice_ class is implemented either via TLS or the corresponding register, if there is
The programmer still sees only an object of type T located in data_
On stack objects without virtual destructors, it appears in a wrapper.
Throw exceptions in destructors can now be Before calling the destructor itself, we unregistered.
We do not support hardware exceptions (kernel exceptions), so at the time of the excitation of an exception, the compiler knows which registers need to be “ground” and must do so
For normal destruction of stack objects, the compiler creates calls for __st_wrapper 's destructors.
The mechanism for selecting the appropriate catch block is left as is. Those. we still need auxiliary table information with descriptors of these blocks outside the code.
We will transfer control using the setjmp analog. It is proposed to implement an intermediate (with respect to the two described above) version of the transfer of control. Setjmp has a significant drawback - the buffer size is quite large, while its small part is actually used.
On the other hand, the execution of a bytecode in the spirit of DWARF is very wasteful.
Therefore, instead of the setjmp buffer, we will store a list of registers that need to be restored and shifted relative to the stack pointer, where the actual values lie. In the case of a computed value, the value is stored in the register itself. For this, additional memory is allocated in the stack and a shift is given to it. In fact, a temporary variable is started.
Before raising the exception, the compiler unloads all relevant data from the registers, in this case, you can recover without loss.

Nevertheless, it is worth noting that the use of a try block is a conscious act, there is nothing wrong with that it incurs certain costs. IMHO these (moderate) costs are even useful because stimulate a responsible attitude to the tools of the language.
Interception of exceptions when calling the operator new and new [] is left as is. Those. we protect each iteration with an internal try block and destroy everything created in the previous iterations if an exception has occurred, which we then relocate. And, of course, we give back the memory allocated for the object [s.]
To implement an array of stack objects and do nothing. But you can save some memory by implementing a special stack object - a vector similar to that used when calling the operator new [] .

By the way.

An object can know that it is a stack. For this, this must be within the stack stack of the current thread.
So you can take an object off the hook? I can not imagine why this may be necessary, but this possibility exists.
If you can remove, then you can and plant. Allocate memory in the stack through alloca , forcibly call the constructor and connect it to the stack promotion mechanism.
For architectures with separate data and control stacks, exception handling can be implemented very effectively using a control stack instead of a list.

PS: Special thanks to Alexander Artyushin for informative discussion.

Source: https://habr.com/ru/post/208006/

All Articles

What is so difficult about handling C ++ exceptions?

More articles: