Choosing the right error handling strategy (parts 3 and 4)

Parts 1 and 2: link

In the first part, we talked about different error handling strategies and when each of them is recommended. In particular, I said that function preconditions should be checked with debug assertions, i.e. only in debug mode.

To check a condition, the C library provides the assert() macro, which is active only if NDEBUG is not defined. However, as with many other things in C, this is a simple but sometimes insufficient solution. The main problem I ran into is that it is global: you have assertions either everywhere or nowhere. This is bad because you cannot disable the assertions of a library while keeping them in your own code. That is why many library authors write their own assertion macros, time after time.

Let's create our own, better solution instead, so that part of it can be reused.

Problem with `assert()`

Although assert() does its job well, it has a number of problems:

1. It is not possible to attach an additional message with more information about the failed condition; only the stringified expression is displayed. This leads to hacks like assert(cond && "my message"). An additional message is useful when the condition alone does not carry enough information, as in assert(false). Moreover, sometimes you want to pass additional parameters.
2. It is global: either all assertions are active or none are. Assertions cannot be controlled per module.
3. The content of the message and the way it is output are implementation-defined. But you may want to control them, or even integrate the output with your logging code.
4. Assertion levels are not supported. Some assertions are more expensive than others, so finer control is sometimes needed.
5. It is a macro, and a lower-case one at that! Macros are not nice, and their use is best kept to a minimum.

Let's write a universal advanced assert() .

First approach

Here is a first attempt. You probably write your own assertion macros in much the same way:

    #include <cstdlib> // std::abort

    struct source_location
    {
        const char* file_name;
        unsigned    line_number;
        const char* function_name;
    };

    #define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

    void do_assert(bool expr, const source_location& loc, const char* expression)
    {
        if (!expr)
        {
            // handle failed assertion
            std::abort();
        }
    }

    #if DEBUG_ASSERT_ENABLED
        #define DEBUG_ASSERT(Expr) \
            do_assert(Expr, CUR_SOURCE_LOCATION, #Expr)
    #else
        #define DEBUG_ASSERT(Expr)
    #endif

I have defined an auxiliary struct that contains the source location information. In this case, the work itself is performed by the do_assert() function, and the macro simply redirects.

This avoids tricks with do ... while(0). The macro itself should be as small as possible.

We now have one macro that merely captures the current source location, and it is used by the assertion macro. Assertions can be enabled and disabled through the DEBUG_ASSERT_ENABLED macro.

Possible problem: warning about unused variable

If you have ever compiled a release build with warnings enabled, you know that any variable used only in an assertion triggers an "unused variable" warning.

You might try to prevent this by defining the disabled assertion like this:

 #define DEBUG_ASSERT(Expr) (void)Expr

Do not do this!

I made exactly this terrible mistake. With this definition the expression is evaluated even when assertions are disabled, and if it is expensive, performance suffers badly. Take a look at this code:

    iterator binary_search(iterator begin, iterator end, int value)
    {
        assert(is_sorted(begin, end));
        // binary search
    }

is_sorted() is a linear operation, while binary_search() runs in O(log n). Even with assertions disabled, is_sorted() may still be evaluated, because the compiler cannot prove that it has no side effects!

When I made this mistake, I ended up in a very similar situation, and performance dropped dramatically.
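One common way to silence the warning without evaluating the expression is to name it inside an unevaluated operand such as sizeof. The sketch below is not from the article; MY_ASSERT_BAD, MY_ASSERT_UNEVALUATED and expensive_check() are hypothetical names chosen to show the difference between the two definitions.

```cpp
#include <cassert>

// Hypothetical counter that records how often the check actually ran.
static int check_calls = 0;

inline bool expensive_check(int x)
{
    ++check_calls;
    return x >= 0;
}

// Disabled assertion that still evaluates its argument (the pitfall).
#define MY_ASSERT_BAD(Expr) ((void)(Expr))

// Disabled assertion that only names the expression inside sizeof, an
// unevaluated operand: the variables are referenced, but nothing runs.
#define MY_ASSERT_UNEVALUATED(Expr) ((void)sizeof(Expr))

inline int calls_with_bad_macro()
{
    check_calls = 0;
    int value = 42;
    MY_ASSERT_BAD(expensive_check(value));
    return check_calls;
}

inline int calls_with_unevaluated_macro()
{
    check_calls = 0;
    int value = 42;
    MY_ASSERT_UNEVALUATED(expensive_check(value));
    return check_calls;
}
```

The sizeof variant has limits of its own: before C++20 the expression must not contain a lambda, and it must have a complete object type.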

In any case, this DEBUG_ASSERT() is not much better than assert(), so let's not stop here.

Implement customizability and modularity

Problems 2 and 3 can be solved with a policy: an additional template parameter that controls whether the assertion is active and how the message is output. Each module that needs separate control over its assertions defines its own Handler:

    template <class Handler>
    void do_assert(bool expr, const source_location& loc, const char* expression) noexcept
    {
        if (Handler::value && !expr)
        {
            // handle failed assertion
            Handler::handle(loc, expression);
            std::abort();
        }
    }

    #define DEBUG_ASSERT(Expr, Handler) \
        do_assert<Handler>(Expr, CUR_SOURCE_LOCATION, #Expr)

Instead of hard-coding how a failure is handled, we call the static handle() function of the given Handler.

To prevent the handler from throwing an exception out of the function, I made do_assert() noexcept, and in case the handler returns, std::abort() is called afterwards.

The handler also controls whether the expression is checked at all, through the member constant value (as in std::true_type/std::false_type). The assertion macro now forwards to do_assert() unconditionally.

However, this code has the same drawback as described above: the expression is always evaluated. On top of that, there is a run-time branch on Handler::value!
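To make the per-module idea concrete, here is a minimal sketch of two handlers for this version of do_assert(). The handler names parser_assert and lexer_assert and the functions checked_half() and unchecked_half() are hypothetical; the rest repeats the definitions from the listings above.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <type_traits>

struct source_location
{
    const char* file_name;
    unsigned    line_number;
    const char* function_name;
};

#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

template <class Handler>
void do_assert(bool expr, const source_location& loc, const char* expression) noexcept
{
    if (Handler::value && !expr)
    {
        Handler::handle(loc, expression);
        std::abort();
    }
}

#define DEBUG_ASSERT(Expr, Handler) \
    do_assert<Handler>(Expr, CUR_SOURCE_LOCATION, #Expr)

// Hypothetical handler for a module whose assertions are enabled.
struct parser_assert : std::true_type
{
    static void handle(const source_location& loc, const char* expression) noexcept
    {
        std::fprintf(stderr, "[parser] %s:%u (%s): assertion '%s' failed\n",
                     loc.file_name, loc.line_number, loc.function_name, expression);
    }
};

// Hypothetical handler for a module whose assertions are disabled.
struct lexer_assert : std::false_type
{
    static void handle(const source_location&, const char*) noexcept {}
};

inline int checked_half(int value)
{
    DEBUG_ASSERT(value % 2 == 0, parser_assert); // active: aborts on odd input
    return value / 2;
}

inline int unchecked_half(int value)
{
    DEBUG_ASSERT(value % 2 == 0, lexer_assert); // inactive: never aborts
    return value / 2;
}
```

Note that even with lexer_assert the condition value % 2 == 0 is still computed before do_assert() is called.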

The second problem is easily solved: Handler::value is a constant, so we can emulate constexpr if. But how do we prevent evaluation of the expression? With a trick: we use a lambda.

    template <class Handler, class Expr>
    void do_assert(std::true_type, const Expr& e, const source_location& loc,
                   const char* expression) noexcept
    {
        if (!e())
        {
            Handler::handle(loc, expression);
            std::abort();
        }
    }

    template <class Handler, class Expr>
    void do_assert(std::false_type, const Expr&, const source_location&,
                   const char*) noexcept {}

    template <class Handler, class Expr>
    void do_assert(const Expr& e, const source_location& loc, const char* expression)
    {
        do_assert<Handler>(Handler{}, e, loc, expression);
    }

    #define DEBUG_ASSERT(Expr, Handler) \
        do_assert<Handler>([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr)

This code now assumes that Handler inherits from std::true_type or std::false_type.

For the static dispatch we use classic tag dispatching. More importantly, we changed how the expression is passed: instead of passing a bool directly (which means evaluating the expression), the macro creates a lambda that returns the expression. It is evaluated only when the lambda is called.

And that happens only when assertions are enabled.

The trick of wrapping an expression in a lambda to defer its evaluation is useful wherever you have optional checks and do not want to use a macro. For example, in my memory library I use it for the double deallocation checks.
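The effect of the lambda can be observed with a counter. The following is a reduced sketch rather than the article's do_assert(): run_check(), LAZY_CHECK, counted_check() and evaluations_when() are hypothetical names, and only the tag dispatch and the lambda wrapping are taken from the listing above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical counter: how many times the condition was really evaluated.
static int evaluations = 0;

inline bool counted_check()
{
    ++evaluations;
    return true;
}

// Enabled variant: calls the lambda, which evaluates the condition.
template <class Check>
bool run_check(std::true_type, const Check& check)
{
    return check();
}

// Disabled variant: the lambda is never called, so nothing is evaluated.
template <class Check>
bool run_check(std::false_type, const Check&)
{
    return true;
}

// Wrapping the condition in a lambda defers its evaluation.
#define LAZY_CHECK(Enabled, Expr) run_check(Enabled{}, [&] { return Expr; })

inline int evaluations_when(bool enabled)
{
    evaluations = 0;
    if (enabled)
        LAZY_CHECK(std::true_type, counted_check());
    else
        LAZY_CHECK(std::false_type, counted_check());
    return evaluations;
}
```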

Are there any costs here?

The macro is always active, so it always calls do_assert(). With conditional compilation, by contrast, the disabled macro expands to nothing. So is there any overhead?

I took a careful look at several compilers. Without optimizations, there is only a call to do_assert(), which forwards to the no-op version. The expression is not touched, and already at the first optimization level the call is eliminated completely.

I wanted to improve the code generated without optimizations, so I switched from tag dispatching to SFINAE for selecting the overload. This removes the need for the trampoline function that inserts the tag; the macro now calls the no-op version directly. I also marked it force-inline, so the compiler inlines it even without optimizations. All that remains is the creation of the source_location object.

And as before, with any optimizations enabled it is as if the macro expanded to nothing.

Adding assertion levels

With this approach it is very easy to add assertion levels:

    template <class Handler, unsigned Level, class Expr>
    auto do_assert(const Expr& expr, const source_location& loc,
                   const char* expression) noexcept
        -> typename std::enable_if<Level <= Handler::level>::type
    {
        static_assert(Level > 0, "level of an assertion must not be 0");
        if (!expr())
        {
            Handler::handle(loc, expression);
            std::abort();
        }
    }

    template <class Handler, unsigned Level, class Expr>
    auto do_assert(const Expr&, const source_location&, const char*) noexcept
        -> typename std::enable_if<(Level > Handler::level)>::type {}

    #define DEBUG_ASSERT(Expr, Handler, Level) \
        do_assert<Handler, Level>([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr)

This version uses SFINAE instead of tag dispatching.

Whether an assertion is active is now decided by the condition Level <= Handler::level instead of Handler::value. The higher the handler's level, the more assertions are active. Level 0 means that no assertions are active.

Note: this also means that the minimum level of an individual assertion is 1.
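As a usage sketch for the levelled version, consider a handler whose level is 2. The names container_assert, counted() and evaluations_at_levels() are hypothetical; do_assert() and DEBUG_ASSERT repeat the listing above.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <type_traits>

struct source_location
{
    const char* file_name;
    unsigned    line_number;
    const char* function_name;
};
#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

template <class Handler, unsigned Level, class Expr>
auto do_assert(const Expr& expr, const source_location& loc,
               const char* expression) noexcept
    -> typename std::enable_if<Level <= Handler::level>::type
{
    static_assert(Level > 0, "level of an assertion must not be 0");
    if (!expr())
    {
        Handler::handle(loc, expression);
        std::abort();
    }
}

template <class Handler, unsigned Level, class Expr>
auto do_assert(const Expr&, const source_location&, const char*) noexcept
    -> typename std::enable_if<(Level > Handler::level)>::type
{}

#define DEBUG_ASSERT(Expr, Handler, Level) \
    do_assert<Handler, Level>([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr)

// Hypothetical module handler: assertions up to level 2 are active.
struct container_assert
{
    static constexpr unsigned level = 2;

    static void handle(const source_location& loc, const char* expression) noexcept
    {
        std::fprintf(stderr, "%s:%u: assertion '%s' failed\n",
                     loc.file_name, loc.line_number, expression);
    }
};

// Hypothetical helper that counts how often a condition is evaluated.
static int evaluations = 0;
inline bool counted(bool result) { ++evaluations; return result; }

inline int evaluations_at_levels()
{
    evaluations = 0;
    DEBUG_ASSERT(counted(true), container_assert, 1);  // 1 <= 2: evaluated
    DEBUG_ASSERT(counted(true), container_assert, 2);  // 2 <= 2: evaluated
    DEBUG_ASSERT(counted(false), container_assert, 3); // 3 > 2: skipped entirely
    return evaluations;
}
```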

Last step: add a message

This is very simple: we add an additional parameter that is passed to the handler. But sometimes an assertion does not need a message, because the condition already says enough. It would be nice to overload the macro, but alas, that is not possible. The same goes for the level: you may not want to specify it every time. Moreover, since the handler is generic, it could take additional arguments as well.

So we need an assertion macro that can handle any number of arguments, that is, a variadic macro:

    template <unsigned Level>
    using level = std::integral_constant<unsigned, Level>;

    // overload 1, with level, enabled
    template <class Expr, class Handler, unsigned Level, typename... Args>
    auto do_assert(const Expr& expr, const source_location& loc, const char* expression,
                   Handler, level<Level>, Args&&... args) noexcept
        -> typename std::enable_if<Level <= Handler::level>::type
    {
        static_assert(Level > 0, "level of an assertion must not be 0");
        if (!expr())
        {
            Handler::handle(loc, expression, std::forward<Args>(args)...);
            std::abort();
        }
    }

    // overload 1, with level, disabled
    template <class Expr, class Handler, unsigned Level, typename... Args>
    auto do_assert(const Expr&, const source_location&, const char*,
                   Handler, level<Level>, Args&&...) noexcept
        -> typename std::enable_if<(Level > Handler::level)>::type {}

    // overload 2, without level, enabled
    template <class Expr, class Handler, typename... Args>
    auto do_assert(const Expr& expr, const source_location& loc, const char* expression,
                   Handler, Args&&... args) noexcept
        -> typename std::enable_if<Handler::level != 0>::type
    {
        if (!expr())
        {
            Handler::handle(loc, expression, std::forward<Args>(args)...);
            std::abort();
        }
    }

    // overload 2, without level, disabled
    template <class Expr, class Handler, typename... Args>
    auto do_assert(const Expr&, const source_location&, const char*,
                   Handler, Args&&...) noexcept
        -> typename std::enable_if<Handler::level == 0>::type {}

    #define DEBUG_ASSERT(Expr, ...) \
        do_assert([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr, __VA_ARGS__)

Two parameters are mandatory: the expression and the handler. Since the variadic part of a macro cannot be empty, only the first mandatory parameter is named. All variadic arguments are passed on as arguments of the function call.

This changes the usage somewhat: previously Handler could be a type name and Level a constant, but now they have to be adjusted, because they are ordinary function parameters. Handler must be an object of the handler type, and Level an object of type level<N>. This lets argument deduction work out the appropriate template parameters.

The code above also supports any number of additional arguments, which are simply forwarded to the handler function. I want to allow the following call variants:

1. DEBUG_ASSERT(expr, handler{}) - no level, no additional arguments.
2. DEBUG_ASSERT(expr, handler{}, level<4>{}) - with a level, but without additional arguments.
3. DEBUG_ASSERT(expr, handler{}, msg) - without a level, but with an additional argument (a message).
4. DEBUG_ASSERT(expr, handler{}, level<4>{}, msg) - with a level and an additional argument (a message).

To do this, we need two overloads of do_assert(). The first handles the calls with a level (2 and 4), the second the calls without one (1 and 3).
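The four call variants can be tried out with a handler that accepts an optional message. This is a sketch under assumptions: io_assert, counted() and evaluated_assertions() are hypothetical names, while level, do_assert() and DEBUG_ASSERT repeat the listing above.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <type_traits>
#include <utility>

struct source_location { const char* file_name; unsigned line_number; const char* function_name; };
#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

template <unsigned Level>
using level = std::integral_constant<unsigned, Level>;

// With level, enabled.
template <class Expr, class Handler, unsigned Level, typename... Args>
auto do_assert(const Expr& expr, const source_location& loc, const char* expression,
               Handler, level<Level>, Args&&... args) noexcept
    -> typename std::enable_if<Level <= Handler::level>::type
{
    static_assert(Level > 0, "level of an assertion must not be 0");
    if (!expr())
    {
        Handler::handle(loc, expression, std::forward<Args>(args)...);
        std::abort();
    }
}

// With level, disabled.
template <class Expr, class Handler, unsigned Level, typename... Args>
auto do_assert(const Expr&, const source_location&, const char*,
               Handler, level<Level>, Args&&...) noexcept
    -> typename std::enable_if<(Level > Handler::level)>::type
{}

// Without level, enabled.
template <class Expr, class Handler, typename... Args>
auto do_assert(const Expr& expr, const source_location& loc, const char* expression,
               Handler, Args&&... args) noexcept
    -> typename std::enable_if<Handler::level != 0>::type
{
    if (!expr())
    {
        Handler::handle(loc, expression, std::forward<Args>(args)...);
        std::abort();
    }
}

// Without level, disabled.
template <class Expr, class Handler, typename... Args>
auto do_assert(const Expr&, const source_location&, const char*,
               Handler, Args&&...) noexcept
    -> typename std::enable_if<Handler::level == 0>::type
{}

#define DEBUG_ASSERT(Expr, ...) \
    do_assert([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr, __VA_ARGS__)

// Hypothetical handler: level 2, message is optional.
struct io_assert
{
    static constexpr unsigned level = 2;

    static void handle(const source_location& loc, const char* expression,
                       const char* message = "") noexcept
    {
        std::fprintf(stderr, "%s:%u: assertion '%s' failed. %s\n",
                     loc.file_name, loc.line_number, expression, message);
    }
};

// Hypothetical helper that counts how often a condition is evaluated.
static int evaluations = 0;
inline bool counted(bool result) { ++evaluations; return result; }

inline int evaluated_assertions()
{
    evaluations = 0;
    DEBUG_ASSERT(counted(true), io_assert{});                         // variant 1
    DEBUG_ASSERT(counted(true), io_assert{}, level<1>{});             // variant 2
    DEBUG_ASSERT(counted(true), io_assert{}, "file must be open");    // variant 3
    DEBUG_ASSERT(counted(false), io_assert{}, level<3>{}, "skipped"); // variant 4, 3 > 2
    return evaluations;
}
```

When a level is given, both the levelled and the level-free overloads are viable; the levelled one is selected because it is the more specialized template.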

But it is still a macro!

One of the problems with assert() is that it is a macro. Yes, still a macro!

But there is one major improvement: we no longer need a macro to turn assertions off. It is now needed for only three things:

1. Getting the current source location.
2. Converting the expression to a string.
3. Converting the expression to a lambda to enable deferred evaluation.

As for 1, the Library Fundamentals TS v2 has std::experimental::source_location. This class represents a location in the source code, just like the struct I wrote. But the location is obtained at compile time by its static member function current() rather than by a macro. Moreover, if you use this class like this:

 void foo(std::experimental::source_location loc = std::experimental::source_location::current());

then loc gets the location of the caller, not the location of the parameter declaration! This is exactly what an assertion macro needs.

Unfortunately, for 2 and 3 the macro cannot be replaced by anything; that has to be done manually at the call site. So we cannot get rid of the macro without losing flexibility.
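The default-argument behaviour can be tried without the experimental header. The sketch below assumes GCC or Clang, whose __builtin_FILE(), __builtin_FUNCTION() and __builtin_LINE() extensions report the call site when used in a default argument; call_site, capture() and reports_callers_line() are hypothetical names. In C++20, std::source_location::current() offers the same thing in standard form.

```cpp
#include <cassert>

// Hypothetical stand-in for a source_location class.
struct call_site
{
    const char* file_name;
    const char* function_name;
    unsigned    line_number;
};

// Default arguments are evaluated at the call site, so the builtins report
// the position of the caller rather than the position of this declaration.
// Assumption: compiler extensions of GCC and Clang, not standard C++.
inline call_site capture(const char* file     = __builtin_FILE(),
                         const char* function = __builtin_FUNCTION(),
                         unsigned    line     = __builtin_LINE())
{
    return call_site{file, function, line};
}

inline bool reports_callers_line()
{
    call_site first  = capture();
    call_site second = capture(); // exactly one line below the first call
    return second.line_number == first.line_number + 1;
}
```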

Interim conclusion

We have created a simple assertion utility that is flexible, generic and supports separate assertion levels for each module. While writing this post, I decided to publish the code as a header-only library: debug-assert.

It contains some additional code, for example for easily creating module handlers:

    struct my_module
    : debug_assert::set_level<2>,    // set the level, normally done via buildsystem macro
      debug_assert::default_handler  // use the default handler
    {};

Just copy the header into your project and start using the new, improved assertion macro. I hope it saves you from writing an assertion macro for every project in which assertions need to be controlled separately. At the moment it is a very small, quickly written library, so if you have ideas for improving it, let me know!

Assertions are a useful tool for checking function preconditions. But proper type design can prevent situations in which assertions are needed in the first place. C++ has a great type system, so let's use it to our advantage.

Motivation

I am working on standardese, a C++ documentation generator, where I have to deal with a lot of strings. In particular, I constantly remove whitespace at the end of a string. Since this is a very simple task, and the definition of whitespace varies from situation to situation, I did not bother to write a separate function for it.

Looking back, I should have.

I use code like this:

 while (is_whitespace(str.back())) str.pop_back();

I write the two lines, commit, push and, after the usual wait for CI, get an email saying that the Windows build has failed. I am puzzled: everything worked on my machine and in all Linux and macOS builds! I look at the log: the test execution timed out.

I boot into Windows and build the project there. When I run the tests, I am greeted by the wonderfully designed debug assertion failure dialog.

The one in which Retry means Debug.

I look at the error message. Facepalm. I commit the fix:

 while (!str.empty() && is_whitespace(str.back())) str.pop_back();

Sometimes the string is empty. libstdc++ does not enable assertions for this by default, with the expected consequences. But MSVC does have them enabled, and it notices such cases.

I made this mistake three times. I really should have written a function.
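Such a function could look like the following sketch. The name erase_trailing_whitespace(), the helper trimmed() and this particular definition of is_whitespace() are assumptions; the article only says that the definition of whitespace depends on the situation.

```cpp
#include <cassert>
#include <string>

// Hypothetical whitespace definition; adjust it to the situation at hand.
inline bool is_whitespace(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Removes trailing whitespace. The empty() check keeps every call to
// back() and pop_back() within its contract.
inline void erase_trailing_whitespace(std::string& str)
{
    while (!str.empty() && is_whitespace(str.back()))
        str.pop_back();
}

// Convenience wrapper that returns the trimmed copy.
inline std::string trimmed(std::string str)
{
    erase_trailing_whitespace(str);
    return str;
}
```

With the check written once in one place, the empty-string case cannot be forgotten at the individual call sites.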

There were several other problems as well: I did not follow the DRY principle, libstdc++ does not check preconditions by default, AppVeyor does not like graphical assertion dialogs, and MSVC does not exist on Linux.

But I think the design of std::string::back() played the major role. If it were designed properly, the code would not have compiled, and the compiler would have reminded me that the string may be empty. That would have saved me 15 minutes of my life and one reboot into Windows.

How could this be avoided? Using the type system.

Solution

The function in question has the following simplified signature:

 char& back();

It returns the last character of the string. If the string is empty, there is no last character, so calling it in that case is undefined behavior. How do we know? If you think about it, it is obvious: which char should be returned for an empty string? There is no "invalid" char, so there is nothing it could return.

Strictly speaking there is \0, but that can also be the last character of a std::string, so the two cases cannot be told apart.

But I did not think about it. My head was busy with a complex comment parsing algorithm and with the problem that some developers leave whitespace at the end of comments, which breaks the subsequent markup parsing!

back() has a narrow contract, i.e. a precondition. Functions with a narrow contract are undoubtedly harder to work with than functions with a wide contract. So one worthwhile goal is to have as few narrow contracts as possible.

One problem with back() is that it has no valid character to return for an empty string. But C++17 has an addition that can help here, std::optional:

 std::optional<char> back();

std::optional may or may not contain a value. If the string is not empty, back() returns an optional containing the last character. If the string is empty, it returns an empty optional. In other words, we have modeled the function so that it no longer needs a precondition.

Note that we lose the ability to use back() as an lvalue, because std::optional<T&> is not allowed. So std::optional is not the best solution here, but more on that below.

Suppose std::string::back() had this signature. I am again focused on my comment parsing code and write the two quick lines that erase trailing whitespace:

 while (is_whitespace(str.back())) str.pop_back();

is_whitespace() takes a char, but back() returns a std::optional<char>, so I immediately get a compile error on my own machine. The compiler has caught a possible bug for me, statically, using only the type system! I am automatically reminded that the string may be empty and that I have to do extra work to get at the character.
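Since std::string::back() cannot be changed, the idea can be tried with a free function. The following C++17 sketch uses hypothetical names: a non-member back() with a wide contract, a simple is_whitespace(), and erase_trailing_whitespace().

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical free function: back() with a wide contract.
inline std::optional<char> back(const std::string& str)
{
    if (str.empty())
        return std::nullopt;
    return str.back();
}

// Hypothetical whitespace definition.
inline bool is_whitespace(char c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

inline void erase_trailing_whitespace(std::string& str)
{
    // is_whitespace(back(str)) does not compile here, because
    // std::optional<char> does not convert to char. The empty case
    // has to be handled explicitly before the character is used.
    while (auto last = back(str))
    {
        if (!is_whitespace(*last))
            break;
        str.pop_back();
    }
}
```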

Of course, I can still get it wrong, because std::optional is not really designed for this purpose:

 while (is_whitespace(*str.back()))

This code has exactly the same behavior, and will probably trigger a debug assertion on MSVC. std::optional<T>::operator* must not be called on an empty optional; it simply returns the contained value. Slightly better is:

 while (is_whitespace(str.back().value()))

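std::optional<T>::value() is at least specified to throw an exception on an empty optional, so it fails reliably at run time. But neither solution has any advantage over the code with the original signature. These member functions punch holes in an otherwise nice abstraction and should not exist in the first place! Instead, there should be higher-level functions that make it unnecessary to query the value directly. And where that really is needed, it should be done by a non-member function with a long name that stands out and makes you aware that you are doing something risky, not by a single star!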

This is a weakness in the interface of std::optional. Like std::unique_ptr<T>, it is modeled as a pointer type rather than as a "Maybe" type. A better solution would be:

 while (is_whitespace(str.back().value_or('\0')))

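std::optional<T>::value_or() returns either the contained value or the given alternative. Here an empty optional yields the null character, which happens to be a perfect value for ending the loop. Of course, there is not always a suitable invalid value. So the best solution would be to change the signature of is_whitespace() so that it accepts a std::optional<char>.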
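Both variants, the value_or() call from the line above and an is_whitespace() that accepts a std::optional<char>, can be sketched in C++17 as follows. The free function back() and the two trim functions are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical free function standing in for an optional-returning back().
inline std::optional<char> back(const std::string& str)
{
    if (str.empty())
        return std::nullopt;
    return str.back();
}

inline bool is_whitespace(char c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

// Overload for optional characters: "no character" is not whitespace.
inline bool is_whitespace(std::optional<char> c)
{
    return c.has_value() && is_whitespace(*c);
}

// Variant 1: substitute '\0' for the missing character.
inline std::string trim_with_value_or(std::string str)
{
    while (is_whitespace(back(str).value_or('\0')))
        str.pop_back();
    return str;
}

// Variant 2: let is_whitespace() deal with the empty optional itself.
inline std::string trim_with_optional_overload(std::string str)
{
    while (is_whitespace(back(str)))
        str.pop_back();
    return str;
}
```

In both variants the loop ends on an empty string without any explicit empty() check at the call site.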

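Guideline 1: use a proper return type

There are many functions that either return something or must not be called at all; back()/front() are examples. Consider designing them so that they return an optional type such as std::optional<T>. Then no precondition check is needed, and the type system itself helps to prevent errors.

Of course, std::optional<T> cannot be used everywhere an error can occur. Some errors are not precondition errors. In such situations, throw an exception or use something like the proposed std::expected<T, E>, which holds either a valid value or an error type.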

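Parameter preconditions

So far we have dealt with preconditions on invalid states, but most preconditions concern parameters. By changing the parameter type, such a precondition can often be removed as well. Consider this function:

 void foo(T* ptr) { assert(ptr); … }

It can be given this signature instead:

 void foo(T& ref);

Now a null pointer value can no longer be passed. And if one is passed anyway, the undefined behavior of dereferencing it is the fault of the callers.

This works for more than pointers:

 void foo(int value) { assert(value >= 0); … }

 void foo(unsigned value);

Now a negative value cannot be passed without an underflow. Unfortunately, C++ inherited the implicit conversion from signed to unsigned types from C, so the solution is not perfect, but it documents the intent.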
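Replacing int with unsigned only goes so far, because a signed argument is converted implicitly. A short sketch of what actually arrives in the function; received_value() and pass_minus_one() are hypothetical names.

```cpp
#include <cassert>
#include <limits>

// Hypothetical function with an unsigned parameter: returns what it received.
inline unsigned received_value(unsigned value)
{
    return value;
}

inline unsigned pass_minus_one()
{
    int negative = -1;
    // Compiles without error: int converts implicitly to unsigned,
    // and -1 wraps around to the largest unsigned value.
    return received_value(negative);
}
```

Compiler warnings about sign conversion, where available, can help to catch such calls.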

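Guideline 2: use proper parameter types

Choose parameter types so that preconditions disappear and are expressed directly in the code instead. Do you have a pointer that must not be null? Pass a reference. An integer that must not be negative? Make it unsigned. An integer that can only take a certain named set of values? Make it an enumeration.

You can even go as far as writing a general wrapper type whose (explicit!) constructor asserts that the "raw" value satisfies the condition, for example: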

    class non_empty_string
    {
    public:
        explicit non_empty_string(std::string str)
        : str_(std::move(str))
        {
            assert(!str_.empty());
        }

        std::string get() const
        {
            return str_;
        }

        … // other functions you might want

    private:
        std::string str_;
    };

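Such a small wrapper is very easy to generalize. Using it expresses intent and creates one central place where validity is checked. It also distinguishes values that have already been checked from possibly invalid ones, and makes preconditions obvious without documentation.

Of course, this technique is not always possible. Sometimes a particular type is required by convention. Besides, using it everywhere can be overkill: if there is only one place with a given precondition, why write all the boilerplate?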
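The non_empty_string wrapper above can be generalized by moving the condition into a separate type. The following is one possible sketch, not the article's code: constrained, not_empty and last_char() are hypothetical names, and the plain assert() could be replaced by any assertion facility.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical generalization of non_empty_string: Constraint is a type
// with a static bool check(const T&) that states the precondition.
template <class T, class Constraint>
class constrained
{
public:
    explicit constrained(T value) : value_(std::move(value))
    {
        assert(Constraint::check(value_)); // the one central validity check
    }

    const T& get() const noexcept
    {
        return value_;
    }

private:
    T value_;
};

// Hypothetical constraint: the string must not be empty.
struct not_empty
{
    static bool check(const std::string& str)
    {
        return !str.empty();
    }
};

using non_empty_string = constrained<std::string, not_empty>;

// No precondition check needed here: the parameter type already
// guarantees that there is a last character.
inline char last_char(const non_empty_string& str)
{
    return str.get().back();
}
```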

Conclusion

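The C++ type system is powerful enough to help you catch errors.

Proper function design can remove many preconditions from the function itself and put them in one central place instead. Choose semantic parameter types that express the preconditions naturally, and optional return types where a function cannot always return a valid value.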

Source: https://habr.com/ru/post/322804/
