The art of programming under Unix (and beyond). Part Seven: the Rule of Transparency

I continue the series of articles on some simple rules for developing under Unix "according to Eric Raymond", which, I am firmly convinced, can be extended to any other operating system. In the earlier parts I covered the rules of modularity, clarity, composition, separation, simplicity, and cost-saving. Today we come to the seventh rule:

Rule of Transparency: design software with debugging aids built in from the start

The meaning of this rule comes down to two postulates:

the application should not merely work well, but also show that it works well;
it should be possible to debug the application at a "high level".

Take modern mail clients, for example. Receiving and sending mail is a very complicated process, because the client has to "talk" to a mail service that can behave completely unpredictably at any moment. Of course these exceptional situations have to be handled, but, as the saying goes, you cannot foresee everything. Errors can occur at literally every step: in the mail database, in the communication channel, on the mail server, in the message format. To understand what exactly caused a problem with receiving mail, you need a detailed log of the conversation with the mail server, and the place where the program "stumbled" will tell you a lot. But does an ordinary user need this? No, and that is what a "Details" button is for.
Or take a typical Windows FTP client. It seems to have long been part of the average user's toolkit. Well, all right, the advanced user's, but in any case not only that of a programmer who has long understood the intricacies of the protocols. You install a program with a built-in FTP client, one you chose almost for that very feature, and it will not connect to the server. Why? There is no way to tell. It gives nothing away. "Error", and that is all.

Let's go down another level. Take Skype and ICQ under Windows. Both have proxy server settings. And every time there is a connection problem, you cannot tell whether something is wrong with the proxy server, whether the problem is on "their" side, or whether it is on mine.

There is a sea of such examples where a log of actions is needed. And these are large mass-market products, for which the loss of one user is a drop in the ocean. What if you are dealing with a system with high availability requirements, where every second counts?

In the Unix world there is a very good practice: standard console applications that work as self-contained tools include a debug mode and / or keep a journal, the so-called "logs". You set the debug level and watch how the application works: it opened a connection, sent a command, got a reply, the reply was unexpected, and the program reported an error. Besides the fact that some "advanced" users start solving problems on their own, you also save your own time on finding and fixing problems.
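The debug-level idea can be sketched roughly as follows. This is only an illustration in Python: the logger name "fetcher", the HELLO command, and the functions `parse_verbosity`, `build_logger` and `talk_to_server` are all invented for the example and do not come from any real tool.

```python
import argparse
import logging


def parse_verbosity(argv):
    """Count how many times -v was given (hypothetical CLI convention)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser.parse_args(argv).verbose


def build_logger(verbosity):
    """Map the -v count to a level: 0 -> WARNING, 1 -> INFO, 2+ -> DEBUG."""
    level = {0: logging.WARNING, 1: logging.INFO}.get(verbosity, logging.DEBUG)
    logger = logging.getLogger("fetcher")
    logger.setLevel(level)
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.StreamHandler()  # writes to stderr by default
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def talk_to_server(logger, send):
    """Run one made-up exchange, tracing every step at DEBUG level."""
    logger.debug("opening connection")
    logger.debug("sending command: HELLO")
    reply = send("HELLO")
    logger.debug("got reply: %r", reply)
    if not reply.startswith("OK"):
        # The error is reported at any verbosity; the trace only with -vv.
        logger.error("unexpected reply to HELLO: %r", reply)
        return False
    return True
```

Called as `talk_to_server(build_logger(parse_verbosity(["-vv"])), send)`, the sketch prints the whole exchange; without `-v` only the error line appears.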

Logging is convenient because the program leaves traces that may prove useful only much later. For example, you can scan the log for errors that the user hardly notices. What if they turn out to occur regularly? You can also pull performance data from the logs, plot it, and analyze trends. Logging can always be switched off later, but if it was never built into the program, adding it afterwards is much harder.
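As a small illustration of mining a log after the fact, here is a sketch that counts errors per hour. The line format (timestamp, level, message) is an assumption made for the example, and `errors_per_hour` is a made-up helper name.

```python
import re
from collections import Counter

# Assumed line format: "YYYY-MM-DD HH:MM:SS[,mmm] LEVEL message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}(?:[.,]\d+)? (\w+) (.*)$")


def errors_per_hour(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of ERROR lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(2) == "ERROR":
            counts[match.group(1)] += 1
    return counts


sample = [
    "2024-01-05 10:01:12 INFO connected",
    "2024-01-05 10:07:45 ERROR timeout talking to mail server",
    "2024-01-05 10:31:02 ERROR timeout talking to mail server",
    "2024-01-05 11:02:10 ERROR bad reply",
    "a line that does not follow the format",
]
hourly = errors_per_hour(sample)
```

Lines that do not match the assumed format are simply skipped, so the same function can be pointed at a real log file opened for reading.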

For example, the debugging console in ArtPublishing showed the hierarchy of calls to template functions. For one of the forum templates that I mentioned in the previous article, a developer could study how the software worked internally by adding a debugging flag to the URL.

This example can be read as both a positive and a negative one. On the plus side, it is convenient that the journal is presented as a structure. On the minus side, it is rather hard to collect and process numerical data from such a journal, for example to detect that some mandatory block has disappeared from the structure. It resembles XML, but it is not "valid" at all.

It is important to remember that high-quality logs are like an author's handwriting, by which colleagues judge a programmer. This matters no less than comments in the source code or well-chosen names for functions, files, folders and variables. As a rule, log entries are divided into four main types: error, warning, notification and debug message. Each entry should carry a timestamp. When writing data to a log, do not forget about maximum length, encoding, line breaks (in text logs) and, more generally, log rotation (archiving and / or deleting old logs).
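The points above can be put together in a short sketch built on Python's standard `logging` module. The 500-character cap, the 1 MB file size, the five backups and the names `sanitize`, `SanitizingFilter` and `make_logger` are arbitrary choices for the illustration. Python's INFO level stands in for "notification" here.

```python
import logging
from logging.handlers import RotatingFileHandler

MAX_MESSAGE_CHARS = 500  # assumed cap, pick whatever suits your logs


def sanitize(message, limit=MAX_MESSAGE_CHARS):
    """Keep one entry on one line and cap its length."""
    flat = message.replace("\r", "\\r").replace("\n", "\\n")
    if len(flat) > limit:
        flat = flat[:limit] + "...[truncated]"
    return flat


class SanitizingFilter(logging.Filter):
    """Rewrite each record's message through sanitize() before it is written."""

    def filter(self, record):
        record.msg = sanitize(record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True


def make_logger(path, name="app"):
    """Timestamped, UTF-8 encoded, size-rotated log file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # DEBUG, INFO, WARNING and ERROR all pass
    handler = RotatingFileHandler(
        path, maxBytes=1_000_000, backupCount=5, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    handler.addFilter(SanitizingFilter())
    logger.addHandler(handler)
    return logger
```

With `backupCount=5` the handler keeps `app.log.1` through `app.log.5` and drops anything older. Tracebacks logged with `logger.exception` are appended by the formatter after the filter runs, so this sketch does not flatten them.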

When logging is in place, it is usually easy to monitor several applications running on different workstations or servers at once: it is enough to access the repository of these logs and analyze them on the fly.

Monitoring the operation of a module or application is a somewhat different thing. While logs show what is happening inside the program, monitoring presents a few indicators that matter for control in a form that is easy to take in. When an indicator goes beyond its established limits, the monitoring system must notify the administrator, either passively (by writing to the log) or actively (e-mail, SMS). For example, fatal error reports that programs send to the developer by e-mail are the same kind of monitoring, only internal rather than external.
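A threshold check of this kind might look like the sketch below. The metric names, the limits and the `notify` callback are placeholders: in a real system `notify` would send an e-mail or SMS, which is deliberately left out here.

```python
import logging
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    name: str                     # metric this limit applies to
    low: Optional[float] = None   # None means "no lower bound"
    high: Optional[float] = None  # None means "no upper bound"

    def violated_by(self, value):
        if self.low is not None and value < self.low:
            return True
        if self.high is not None and value > self.high:
            return True
        return False


def check(metrics, thresholds, notify, logger=None):
    """Compare metrics with their limits and report every violation twice:
    passively (a log entry) and actively (the notify callback)."""
    logger = logger or logging.getLogger("monitor")
    alerts = []
    for threshold in thresholds:
        if threshold.name not in metrics:
            continue  # a missing metric is ignored in this sketch
        value = metrics[threshold.name]
        if threshold.violated_by(value):
            message = (f"{threshold.name}={value} outside "
                       f"[{threshold.low}, {threshold.high}]")
            logger.warning(message)  # passive notification
            notify(message)          # active notification
            alerts.append(message)
    return alerts
```

For instance, `check({"queue_length": 120}, [Threshold("queue_length", high=100)], print)` logs a warning and passes the same message to `print`.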

Monitoring that is not continuous is built on an automated testing system. For such a system to be possible at all, the software architecture must allow for "batch" operation: a set of ready-made tests goes in, finished results come out. This kind of monitoring can be run after every change to the software logic.
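The "tests in, results out" scheme can be illustrated with a tiny batch runner. The case format (a name, keyword arguments, an expected value) and the `word_count` function under test are made up for the example.

```python
def run_batch(func, cases):
    """Run func on every case and record the outcome; never stop on a failure."""
    results = []
    for case in cases:
        try:
            actual = func(**case["input"])
            ok = actual == case["expected"]
            error = None
        except Exception as exc:  # a crash is a failed case, not a failed run
            actual, ok, error = None, False, repr(exc)
        results.append({"name": case["name"], "ok": ok, "actual": actual,
                        "expected": case["expected"], "error": error})
    return results


def summarize(results):
    """Collapse the detailed results into a total and a list of failed names."""
    failed = [r["name"] for r in results if not r["ok"]]
    return {"total": len(results), "failed": failed}


def word_count(text):
    """Stand-in for the real logic under test."""
    return len(text.split())


cases = [
    {"name": "two words", "input": {"text": "hello world"}, "expected": 2},
    {"name": "empty", "input": {"text": ""}, "expected": 0},
    {"name": "wrong expectation", "input": {"text": "a b c"}, "expected": 2},
    {"name": "bad input", "input": {"text": None}, "expected": 0},
]
report = summarize(run_batch(word_count, cases))
```

Because the runner needs only a callable and a list of cases, it can be rerun unchanged after each change to the logic, and the cases can live in a data file.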

I recently worked on a system for calculating delivery times. It uses a rather tricky algorithm that takes the specifics of logistics into account. You cannot judge at a glance whether a calculation is correct; you have to arm yourself with a calculator and tables. So, to be able to answer quickly whether the mechanism works correctly, "transparency" had to be added to it: as it runs, the log has to record the entire logic of the calculation in a form a non-specialist can follow. Of course, this did not need to be shown on the site, but questions like "how did delivery end up taking as long as 15 days?!" became very easy to answer: the system itself literally gives an explanation a person can understand. An automated testing system was also developed; it calculates delivery times for dozens of situations at once and compares the results with those calculated by hand.
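The real algorithm is not shown here, so the following is only a toy sketch of the idea of a calculation that explains itself. The zones, the transit days, the 15:00 cutoff and the restocking delay are all invented numbers.

```python
from dataclasses import dataclass, field

# Invented figures, for illustration only.
TRANSIT_DAYS = {"local": 1, "regional": 3, "remote": 7}


@dataclass
class Estimate:
    days: int = 0
    steps: list = field(default_factory=list)  # human-readable trace

    def add(self, days, reason):
        """Add days to the total and record why, in plain language."""
        self.days += days
        self.steps.append(
            f"+{days} d: {reason} (running total {self.days} d)")


def estimate_delivery(zone, order_hour, in_stock,
                      restock_days=5, cutoff_hour=15):
    """Return the delivery time together with the reasoning behind it."""
    est = Estimate()
    est.add(TRANSIT_DAYS[zone], f"transit to {zone} zone")
    if order_hour >= cutoff_hour:
        est.add(1, f"ordered at {order_hour}:00, "
                   f"after the {cutoff_hour}:00 cutoff")
    if not in_stock:
        est.add(restock_days, "item not in stock, waiting for supplier")
    return est


example = estimate_delivery("remote", order_hour=17, in_stock=False)
# example.days is 13; example.steps holds one line per rule that fired
```

Writing `example.steps` to the log gives the explanation of where each day came from, and the same function can be fed dozens of prepared situations and compared with hand-calculated results.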

Previously: the code-saving rule

Source: https://habr.com/ru/post/104191/
