Thinking about error handling

Error handling is a complex and ambiguous topic. There is still no optimal approach, or set of approaches, to the problem; each one has its shortcomings. In this article I would like to share my thoughts on the topic and, not least, learn something new from the comments.

The code in this article is in Scala, but the approach can be implemented in many other languages (C++ using macros, Java using JetBrains MPS, etc.). The closest analogue of this approach is Haskell's method of error handling.

When designing a function, the way it deals with errors depends on how its callers are expected to react to an error. There are two main cases:

The error puts the program into an incorrect state, and we must report it to the user as soon as possible and in as much detail as possible. Systems with this type of error handling have the fail-fast (FF) property.
The error is a routine situation and is handled automatically without user intervention. Such systems are fault-tolerant (FT).

From here on I will use two abbreviations: a function designed on the assumption that the calling code works with it in FT mode is an FT function, and an FF function is defined analogously.

Why distinguish several ways of handling errors instead of using one for all cases? Because functions designed for FF but used in FT mode will, as a rule, make the system too slow, while functions designed for FT but used in FF mode will not give the user enough information to find errors quickly.
The standard JDK function Integer.parseInt, which was clearly designed for FF, provides many examples of a mismatch between design goal and actual use. Suppose you read lines from a stream in a loop, process only those that are numbers, and skip the rest. In that case try {Integer.parseInt(...)} catch {...} can slow your code down several times over. A more detailed explanation and benchmarks can be found here: nadeausoftware.com/articles/2009/08/java_tip_how_parse_integers_quickly .
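As a rough illustration of what an FT-style alternative to Integer.parseInt could look like, here is a minimal sketch. The names FtParse, parseIntFT and sumNumbers are made up for this example, and the parser accepts only ASCII digits with an optional sign, which is narrower than the JDK function.

```scala
object FtParse {
  // Hypothetical FT-style parser: reports failure as None and never throws.
  def parseIntFT(s: String): Option[Int] = {
    if (s.isEmpty) return None
    val negative = s.charAt(0) == '-'
    val start = if (negative || s.charAt(0) == '+') 1 else 0
    if (start == s.length) return None
    var acc = 0L // accumulate in a Long so that Int overflow is easy to detect
    var i = start
    while (i < s.length) {
      val c = s.charAt(i)
      if (c < '0' || c > '9') return None
      acc = acc * 10 + (c - '0')
      if (acc > Int.MaxValue.toLong + 1) return None
      i += 1
    }
    val signed = if (negative) -acc else acc
    if (signed > Int.MaxValue) None else Some(signed.toInt)
  }

  // Sum only the numeric lines of a stream, silently skipping the rest.
  def sumNumbers(lines: Iterator[String]): Long =
    lines.flatMap(l => parseIntFT(l.trim)).foldLeft(0L)(_ + _)
}
```

A non-numeric line costs a few character comparisons here instead of the construction of an exception with its stack trace.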

FT functions should return error information as concisely as possible so as not to waste resources. Most often this is either a boolean value or an int error code. If a fatal error does occur, the information about it should be passed to the user or to the log by an FF function higher up the call stack. FT functions should be as small as possible, so that the error code alone makes it clear where the error occurred.

If you are writing an ordinary business or web application, i.e. one where an external user plays the main role (and the vast majority of applications today are like that), rather than, say, a rover control system, then 99% of your code will be FF. That is, the user is asked to enter correct data, and if some incorrect data still slips through, the application fails with an error and the developer has to add more checks. All error handling code will take up less than 1% of CPU time, even if there is a lot of it, unless you call FF functions from FT code. Therefore, in what follows I would like to look more closely at how to design FF functions and reduce the amount of code needed for convenient error handling.

With the FF approach our task is to wind down all current activity as quickly as possible and notify the user of the error. The error message must therefore be as informative as possible, and it does not matter what resources this requires, since the bottleneck will in any case be the user, or the programmer who will eventually eliminate the error altogether. In other words, the error is an exceptional situation, and the system is not designed to process such errors in a continuous cycle at maximum throughput.

Here we should dwell on the requirements we can place on an error handling method:

The error should be described in enough detail for the user to understand why it occurred. For example: division by zero.
The user must understand where to look for the cause. That is, not only that a division by zero happened, but also in which field the zero was entered, which form the field belongs to, and on which tab of our multi-page application the form is located.
The error description shown to the user should not contain informational garbage such as a stack trace, at least not until the "more" button is clicked.
There should also be information that lets the programmer easily find where the error occurred; most likely this is simply the call stack.

Thus, our object describing the error should contain two principal parts: information for the user and information for the programmer.
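A minimal sketch of such an object, under the assumption that the user part is a list of context messages and the programmer part is an optional Throwable. AppError, forUser and forLog are hypothetical names invented for this illustration.

```scala
object ErrorParts {
  // Hypothetical error description with a user-facing and a developer-facing part.
  final case class AppError(userMessages: List[String], cause: Option[Throwable]) {
    // Text shown to the user: outermost context first, no stack trace.
    def forUser: String = userMessages.mkString(": ")

    // Text for the log: the user text plus the stack trace, if one was captured.
    def forLog: String = {
      val trace = cause.map { t =>
        val sw = new java.io.StringWriter
        t.printStackTrace(new java.io.PrintWriter(sw, true))
        sw.toString
      }.getOrElse("")
      forUser + "\n" + trace
    }
  }
}
```

The user sees only forUser, for example the tab, the field and the reason, while forLog keeps the call stack for whoever has to fix the problem.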

If we are talking about, say, checking the format of a phone number in a browser, it is easy to produce a relevant error message, and even the programmer can usually work out where to look without a call stack if the function starts rejecting a correct number. This is because such a function is extremely small and is tied almost directly to the form field it checks.

However, consider the following example. There is a set of text files in one format, and the user wants to convert them in batch mode into a set of text files in another format. The formats are not fully compatible, and some data requires manual correction. Incorrect data turns up extremely rarely (say, once every six months), so for now the application developer has not bothered to build a replacement wizard into the application and has limited the handling to displaying an error message.

The implementation of this functionality can be represented by the function

def convertFiles(files: List[File]): Unit

Since a full implementation may take several hundred or even thousands of lines, depending on the complexity of the formats, the code should be broken down into a number of nested functions with minimal responsibility and maximum clarity (I mean pure functions). Suppose we have the following functions.

    def convertFiles(files: List[File]): Unit
    def convertFile(in: File): Unit
    def convertStream(is: InputStream, os: OutputStream): Unit
    def convertCity(el: InCity): OutCity
    def convertPersons(pl: List[InPerson]): List[OutPerson]
    def convertManager(el: InManager): OutManager
    def convertProgrammer(el: InProgrammer): OutProgrammer
    def convertPosition(el: InPosition): OutPosition
    def isOutdated(f: PosType): Boolean

The call tree might look like this:

    def convertFiles(files: List[File]): Unit
      def convertFile(in: File): Unit
        def convertStream(is: InputStream, os: OutputStream): Unit
          def convertCity(c: InCity): OutCity
          def convertPersons(pl: List[InPerson]): List[OutPerson]
            def convertManager(el: InManager): OutManager
              def convertPosition(el: InPosition): OutPosition
                def isOutdated(f: PosType): Boolean
            def convertProgrammer(el: InProgrammer): OutProgrammer
              def convertPosition(el: InPosition): OutPosition
                def isOutdated(f: PosType): Boolean

In our fictional example, some positions (job titles) may become outdated, and then a manager has to replace the outdated position by hand, based on personal judgment, with one of the 10 new ones that superseded it. But as already mentioned, such positions are extremely rare.

One way, especially common in the Java community, is to use exceptions to handle errors. Suppose isOutdated returns true. In this case convertPosition cannot continue and throws an exception. The code that calls convertFiles catches it, notifies the user of the incident, and also writes the call stack to the log (so that if the position is in fact not outdated, the programmer can quickly find the source of the error).

So what is wrong with this method (at least as currently implemented on the JVM)? The convertPosition function knows nothing about the context in which it is executed, so all it can say about the error is that the position is outdated. It does not know which person held this position, or in which file and on which line to look for the error. From such an error message the user will have no idea what to do next.
We can try to fix the situation in two ways.

The first is to pass context (i.e. an object containing the person, the file, etc.) to all inner functions. This complicates the function signatures, the call sites and the readability of the program as a whole, and it greatly increases the coupling between functions. In other words: don't do this.

The second is for convertPosition to throw an exception, while convertManager and convertProgrammer contain code like this:

    def convertManager(el: InManager): OutManager = {
      ...
      try {
        ...
        val position = convertPosition(el.position)
        ...
      } catch {
        // wrap the low-level error, adding the person as context
        case e: PositionException =>
          throw new PersonException(s"... ${el.name}", el, e)
      }
      ...
    }

That is, we catch the exception thrown in convertPosition and throw a new one in its place, with a modified message, additional information in the form of the person, and the previous exception passed as the cause of the new one. This procedure can be repeated at higher levels, so that at the very top we have a chain of exceptions. We can show the user a message composed of the messages of all the exceptions in the chain, so it will be clear in which file and on which line the error occurred and what happened (i.e. the position is outdated). The programmer gets a stack trace in the log.
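Composing the user message from such a chain can be sketched as follows. The exception classes here are simplified stand-ins for the ones named in the text (they carry only a message and a cause), FileException is an additional hypothetical level, and the helper names messages and userMessage are assumptions made for this example.

```scala
object CauseChain {
  // Simplified, hypothetical versions of the exception types used in the text.
  class PositionException(msg: String) extends Exception(msg)
  class PersonException(msg: String, cause: Throwable) extends Exception(msg, cause)
  class FileException(msg: String, cause: Throwable) extends Exception(msg, cause)

  // Walk the cause chain from the outermost exception down to the root cause.
  def messages(t: Throwable): List[String] =
    Iterator.iterate(t)(_.getCause).takeWhile(_ != null).map(_.getMessage).toList

  // One line for the user: file, then person, then what actually went wrong.
  def userMessage(t: Throwable): String = messages(t).mkString(": ")
}
```

The stack trace of the innermost exception still goes to the log, while userMessage gives the user the path from the file down to the outdated position.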

What are the disadvantages of this approach? First, we have to throw a lot of exceptions. Resources are not much of a concern in this scenario, but a certain overhead is already noticeable. Second, the syntax is such that as soon as the example gets more complicated you can drown in try ... catch. For instance, a later step may pass to a function the data obtained in an earlier step, and then we already have a set of nested try ... catch blocks. Handling a person might look like this:

    case class InPerson(name: String, tel: String, addr: Address, age: Age)
    case class OutPerson(id: Int, name: String, homeAddr: Address, workAddr: Address, distance: Double)

    def convertPerson(p: InPerson): OutPerson = {
      try {
        val name = convertName(p.name) // ..., ... 0
        val homeAddr = convertAddr(p.addr)
        val age = convertAge(p.age)
        try {
          val (id, wAddr) = db.query("select id, work_addr from persons where tel = ?", p.tel)
          val workAddr = addrFromStr(wAddr)
          val distance = calcDistance(homeAddr, workAddr)
          if (distance == 0)
            // there is no underlying exception at this point, so no cause is passed
            throw new PersonException(s"..., ... ${p.name} ...", p, null)
          OutPerson(id, name, homeAddr, workAddr, distance)
        } catch {
          case e: NotFoundException =>
            throw new PersonException(s"... ${p.tel}", p, e)
        }
      } catch {
        case e @ (_: AddressException | _: AgeException) =>
          throw new PersonException(s"... ${p.name}", p, e)
      }
    }

Nested try ... catch blocks significantly reduce the readability of the code. In addition, the error message sometimes ends up quite far from the place where the error actually occurs. All this is reason enough to look for an alternative.

I believe that exceptions in their current form should be used for only two purposes:
1. To leave the current stack frame. Ideally there would be a more lightweight alternative for this, but there is none at present.
2. In one-in-a-million-lines situations, where there is no other way to report an error. For example, the standard Scala library has the Option class with a get method, which returns a value only if the Option is actually a Some. If it is a None, an exception is thrown. Here the result of get cannot be replaced with Option[X], since that would make calling get pointless, so we simply have to throw. In the rest of the code, however, we can replace a result of type X with Option[X] in a similar situation and do without exceptions.
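The second point can be illustrated with a small sketch. The lookup table and the names positions, positionId and describe are hypothetical and exist only for this example.

```scala
object OptionDemo {
  // Hypothetical lookup table used only for illustration.
  val positions: Map[String, Int] = Map("manager" -> 1, "programmer" -> 2)

  // Absence is part of the return type, so nothing has to be thrown.
  def positionId(name: String): Option[Int] = positions.get(name)

  // The caller decides what a missing value means at the boundary.
  def describe(name: String): String =
    positionId(name).map(id => s"$name has id $id").getOrElse(s"$name is unknown")
}
```

Only a call such as positionId(name).get would bring the exception back, which is exactly the rare case described above.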

So what do the standard Scala library, the Scalaz library and Haskell offer us? There are many articles on the Internet about this, and the topic itself is not that new, so I do not think it is worth retelling them. In essence they come down to using the existing monads.

For example, this article on the capabilities of the standard library
tersesystems.com/2012/12/27/error-handling-in-scala
suggests using Option, Either and Try. But all of that works well only on fairly simple examples. If, when rewriting convertPerson, we need a condition in the middle, such as if (distance == 0) ..., then we have to split one common for into two, which immediately produces no less code than try ... catch. And if a function returning Either or Try is called from a foreach, map or fold loop, the only way to interrupt the loop is to use scala.util.control.Breaks, plus a variable in an enclosing scope if some value has to be passed up.
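The loop problem just described might look roughly like this in practice. This is only a sketch: convertAll and parse are hypothetical names, and Either[String, Int] stands in for whatever result type the real conversion functions return.

```scala
import scala.util.control.Breaks.{break, breakable}

object BreakLoop {
  // Hypothetical element conversion that reports failure through Either.
  def parse(s: String): Either[String, Int] =
    if (s.nonEmpty && s.forall(c => c >= '0' && c <= '9') && s.length < 10) Right(s.toInt)
    else Left(s"not a number: $s")

  // Stop at the first failure; the error has to travel through an outer variable.
  def convertAll(xs: List[String])(convert: String => Either[String, Int]): Either[String, List[Int]] = {
    var failure: Option[String] = None
    val out = List.newBuilder[Int]
    breakable {
      xs.foreach { x =>
        convert(x) match {
          case Right(v)  => out += v
          case Left(msg) => failure = Some(msg); break()
        }
      }
    }
    failure.toLeft(out.result())
  }
}
```

The mutable failure variable and the breakable block are exactly the kind of ceremony that the rest of the article tries to avoid.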

Scalaz, with its \/, -\/ and \/-, in fact changes almost nothing here
typelevel.org/blog/2014/02/21/error-handling.html
Or perhaps I am missing something, in which case I will be glad to hear about it in the comments.

In addition, the structures described either do not track the call stack, or do not let you specify error messages, or both.

Haskell uses the Error monad, which is very similar to Try and \/, -\/, \/-.
book.realworldhaskell.org/read/error-handling.html
But many of the problems that arise in Scala do not arise there, because the code itself and the standard library are organized differently. So in Haskell this approach can be considered quite workable, although it also has a number of drawbacks.

My reinvented wheel

I tried to take the best of both approaches (exceptions and monads) and, where possible, eliminate their flaws. The result can be seen here.
github.com/cs0ip/habr-error-handling
To read on, you will need to look at the code, so that I do not have to describe all of it here. It is better to treat this code not as a ready-made library, although I do use it in my own development, but as an implementation of an idea that can be developed further. Class names are abbreviated, first for brevity, and second to reduce the chance of clashing with existing names, which as a rule are not abbreviated (at least in my code).

I used four entities. Res is a class similar to Try. Its descendants Ok and Err correspond to a successful result and an error. There is also the Exit class, which serves to interrupt loops and pass values up the stack; Exit is similar to scala.util.control.Breaks. On top of this there are methods that make life bearable when working with code that does not support Res. For example, Res can be created from Option, Either, Try and even Boolean, which makes it possible to put into a for expression a condition that can interrupt execution and is unrelated to the previous expression. In addition, code that throws exceptions can easily be handled with the safe function, without having to wrap it in Try.

Suppose all the functions we have written now support Res; then convertPerson can be rewritten as follows:

    def convertPerson(p: InPerson): Res[OutPerson] = {
      val res = for {
        name <- convertName(p.name)
        homeAddr <- convertAddr(p.addr)
        age <- convertAge(p.age)
        (id, wAddr) <- Res.safe(db.query("select id, work_addr from persons where tel = ?", p.tel)) mapErr { e =>
          e.ex.get match {
            case _: NotFoundException => e.replace(s"... ${p.tel}")
            case _ => e // leave any other error unchanged
          }
        }
        workAddr <- addrFromStr(wAddr)
        distance <- calcDistance(homeAddr, workAddr)
        _ <- Res(distance != 0, s"..., ... ${p.name} ...")
      } yield OutPerson(id, name, homeAddr, workAddr, distance)
      // add the person as context to whatever error occurred above
      res.mapErr(e => e.push(s"... ${p.name}"))
    }

Err stores a value of type Option[Throwable], which first of all lets you track the call stack, and second lets you skip creating an exception altogether where performance matters. In addition, Err stores a list of error messages and lets you easily replace the last message or add a new one. It should also be noted that Err is fully immutable (persistent), which rules out accidental modification of the data.
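To make the idea concrete, here is a heavily simplified sketch of such a structure. It is not the code from the repository: the field name messages, the helper Res.check and the example function ratio are assumptions made for this illustration, and only the behaviour described in the text (an optional Throwable, an immutable message list, push, replace, mapErr, safe) is modelled.

```scala
object ResSketch {
  sealed trait Res[+A] {
    def map[B](f: A => B): Res[B] = this match {
      case Ok(a)  => Ok(f(a))
      case e: Err => e
    }
    def flatMap[B](f: A => Res[B]): Res[B] = this match {
      case Ok(a)  => f(a)
      case e: Err => e
    }
    def mapErr(f: Err => Err): Res[A] = this match {
      case e: Err => f(e)
      case ok     => ok
    }
  }
  final case class Ok[+A](value: A) extends Res[A]

  // Immutable: push and replace return new values. The newest message comes first.
  final case class Err(messages: List[String], ex: Option[Throwable]) extends Res[Nothing] {
    def push(msg: String): Err = copy(messages = msg :: messages)
    def replace(msg: String): Err = copy(messages = msg :: messages.drop(1))
  }

  object Res {
    // Turn a Boolean check into a step of a for expression (no exception is created).
    def check(cond: Boolean, msg: => String): Res[Unit] =
      if (cond) Ok(()) else Err(List(msg), None)

    // Run code that may throw and keep the exception together with its stack trace.
    def safe[A](body: => A): Res[A] =
      try Ok(body)
      catch {
        case scala.util.control.NonFatal(t) => Err(List(String.valueOf(t.getMessage)), Some(t))
      }
  }

  // Usage: a condition in the middle of a single for, context added at the end.
  def ratio(a: String, b: String): Res[Int] = {
    val res = for {
      x <- Res.safe(a.toInt)
      y <- Res.safe(b.toInt)
      _ <- Res.check(y != 0, "divisor must not be zero")
    } yield x / y
    res.mapErr(_.push(s"cannot compute $a / $b"))
  }
}
```

In this sketch ratio("6", "0") yields an Err with two messages and no Throwable, while ratio("x", "1") yields an Err that also carries the NumberFormatException for the log.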

An example of using Exit might look like this:

    def convertPersons(pl: List[InPerson]): Res[List[OutPerson]] =
      exitRes[List[OutPerson]] { ex =>
        Ok(pl.foldLeft(List.empty[OutPerson]) { case (z, p) =>
          (p match {
            case p: InManager => convertManager(p)
            case p: InProgrammer => convertProgrammer(p)
          }) match {
            case Ok(out) => out :: z
            case e: Err => ex.go(e) // leave the fold and return the error
          }
        })
      }

exitRes[X] expects to receive a value of type Res[X]. Unfortunately, when using exitRes (and exit) you always have to specify the type of the resulting value, because the compiler cannot infer it. In theory, type inference could be achieved with macros, but as long as they are unstable I do not want to use them.
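One possible way to build such an exit mechanism is sketched below. This is an assumption about the general technique, not the repository's code: the names ExitSketch, ExitSignal, exit and firstNegative are hypothetical, and the value is carried up the stack by a scala.util.control.ControlThrowable, which does not record a stack trace.

```scala
import scala.util.control.ControlThrowable

object ExitSketch {
  // Carries a value up the stack without the cost of filling in a stack trace.
  private final class ExitSignal(val owner: AnyRef, val value: Any) extends ControlThrowable

  final class Exit[A] {
    // Abort the enclosing exit block and make it return the given value.
    def go(value: A): Nothing = throw new ExitSignal(this, value)
  }

  def exit[A](body: Exit[A] => A): A = {
    val ex = new Exit[A]
    try body(ex)
    catch {
      // Only handle signals raised through our own Exit instance.
      case s: ExitSignal if s.owner eq ex => s.value.asInstanceOf[A]
    }
  }

  // Usage: leave a foreach early and return a value from the middle of it.
  def firstNegative(xs: List[Int]): Option[Int] =
    exit[Option[Int]] { ex =>
      xs.foreach(x => if (x < 0) ex.go(Some(x)))
      None
    }
}
```

As in the article's version, the result type is written explicitly at the call site, here exit[Option[Int]].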

I would also like to mention a couple of points. If you are writing a library for public use, you may want to define your own type for each error, as can be done with exceptions (InputException, NullPointerException, etc.). Err lets you indicate the error type both through the Throwable it contains and through each element of the lst message list, i.e. you can create your own subclasses of Err.Er. I would also note that, thanks to the implicit conversion from String to Err.Er, strings can be used wherever a signature expects Err.Er.

If you are writing your own application rather than a public API, then in fail-fast mode you usually do not care about the type of the error; what matters is the message shown to the user and the location, so that the problem can be fixed. In that case you can omit all checks of the form "case _: NotFoundException" in the examples above, and the code becomes even more compact.

Java and other languages

In most languages it is possible to implement some monadic entity like the Res described above. The problem is that the full power of Res shows mainly inside the for expression in Scala or do in Haskell, i.e. a similar construct has to be implemented as well. In principle, I think this is possible with the tools mentioned at the beginning of the article. A description of for ... yield itself can be found here: docs.scala-lang.org/tutorials/FAQ/yield.html

Pros / Cons

Advantages of this approach compared with pure exception handling:

better control over errors;
the ability to edit and extend messages in the calling functions;
fewer resources spent on exceptions: only one exception is used to record the position, and a lightweight exception without a stack trace is used to travel up the stack;
sometimes a significant reduction in the amount of code.

Advantages compared with the monad-based methods from the linked articles (Try, \/, ...):

this approach works where the methods described in those articles need further work or are simply not applicable.

Disadvantages compared with pure exception handling:

Much of the code has to live inside for, which makes it look somewhat unusual, although you can get used to it.
Top-down design can be harder, i.e. the style where only signatures without implementations are written for the functions lower in the call stack, and the main effort goes into the current function without being distracted by details. The problem is that some functions never return errors and therefore should not return Res, so the right signature is not always clear up front.

Summary

What would I like to say in conclusion? This approach still has drawbacks that probably cannot be fixed without support from the language. Some day, in a distant and beautiful future, I would like to see a language with support for lightweight constructs for passing values up the stack, with the ability to track the current line of code without spending resources on obtaining the call stack until the very moment it has to be written to the log. Most importantly, I would like to be able to describe the correct result and the error independently and always explicitly, without drowning in constructs like try ... catch during processing, but also without having to reshape the code around monads. I believe this is possible, and I even have some thoughts on how, but as always there is no time to implement such things. Still, it is a good thing that ideas usually occur to different people independently and in parallel, so one can hope that error handling will some day reach a new level.

Source: https://habr.com/ru/post/262971/
