Development through suffering

From the translator:
There is endless debate about when KISS matters more and when DRY does, when it is better to solve a problem as quickly and simply as possible by any means, and when it is worth building beautiful, general abstractions. Nathan Marz, author of the popular Storm framework used at Twitter, offers his own answer. To avoid writing tons of useless code for the sake of abstract generality, while also keeping the system from turning into a pile of hacks on top of hacks, he practices “suffering-oriented programming”, rendered below as “development through suffering”.

I was once asked: “How did you decide to take such a terrible risk, writing Storm while launching a startup?” (Storm is a framework for distributed real-time computation.) Yes, from the outside, building such a large project at a startup probably looks extremely risky. From my point of view, though, it was not risky at all. Difficult, but not risky.

I follow a development style that greatly reduces the risk of large projects like Storm. I call this style "development through suffering." In a nutshell: do not build technology whose absence is not causing you pain. This advice applies to large architectural decisions as well as to small day-to-day tasks. Development through suffering significantly reduces risk by ensuring that you are always working on something important and that you understand the problem domain well before you invest a lot of effort in a solution.

I have come up with a mantra for this style of development: “First, make it happen. Then make it beautiful. Then make it fast.”

Make it happen

When you take on a new problem domain, you should not try to build a “general” or “extensible” solution right from the start. You simply do not understand the problem well enough to predict what you will need in the future. You will generalize where it is not needed, adding complexity and wasting time.

It is much better to attack each problem head-on, perhaps with hacks and dirty workarounds, caring only about today's needs. This gets you results quickly and avoids wasted effort. Along the way, you learn more and more about the intricacies of the problem domain.

The “make it happen” phase for Storm lasted about a year. Through trial and error, we built a stream processing system out of queues and worker processes. We learned to use an acknowledgement mechanism to guarantee data processing. We learned how to scale real-time computation across clusters of queues and worker processes. We learned that different cases call for partitioning the message stream in different ways: sometimes randomly, and sometimes by hash, so that the same entity is always handled by the same process.
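The two ways of splitting a message stream mentioned above, random and by hash, can be illustrated with a minimal sketch. This is not Storm code: the function names, the worker count, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import random
import zlib


def random_partition(num_workers: int, rng: random.Random) -> int:
    """Pick a worker at random: spreads load evenly, ignores message content."""
    return rng.randrange(num_workers)


def hash_partition(key: str, num_workers: int) -> int:
    """Pick a worker from a stable hash of the key, so equal keys share a worker."""
    # zlib.crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_workers


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    for url in ["a.com", "b.com", "a.com", "c.com", "a.com"]:
        # The random worker may differ between the three "a.com" messages;
        # the hash worker is always the same for them.
        print(url, random_partition(4, rng), hash_partition(url, 4))
```

Random partitioning is enough when messages are independent. Hash partitioning matters when a worker keeps per-entity state, such as a running count per URL.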

We did not even realize that we were in the middle of the “make it happen” phase. We were just building our product. Still, the pain that queues and worker processes caused us quickly became acute. Scaling was terribly tedious, and reliability was nowhere near the level we wanted. It became obvious that our queues-and-processes paradigm was wrong and sat at the wrong level of abstraction. Most of our code dealt with routing and serializing messages, not with business logic.

Meanwhile, development kept uncovering new problems in the domain. We needed a function that computed the reach of a URL on Twitter, that is, the number of unique users who have seen that URL. This is a heavy computation that can require hundreds of database calls and millions of operations. The initial implementation, running on a single machine, could take more than a minute to process one URL. It became clear that the computation had to be parallelized across a distributed system to be fast.
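To make the shape of this computation concrete, here is a minimal single-machine sketch of counting the unique users who saw a URL. It assumes that "saw" means "follows someone who tweeted the URL". The two dictionaries are hypothetical in-memory stand-ins for the database lookups, and all names and data are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical stand-ins for two database lookups:
# who tweeted a URL, and who follows a given user.
TWEETERS_BY_URL: Dict[str, List[str]] = {
    "http://example.com/a": ["alice", "bob"],
}
FOLLOWERS_BY_USER: Dict[str, List[str]] = {
    "alice": ["carol", "dave"],
    "bob": ["dave", "erin"],
}


def reach(url: str) -> int:
    """Count unique followers of everyone who tweeted the URL."""
    seen: Set[str] = set()
    for tweeter in TWEETERS_BY_URL.get(url, []):
        # One lookup per tweeter; with real data this is where the
        # hundreds of database calls would come from.
        seen.update(FOLLOWERS_BY_USER.get(tweeter, []))
    return len(seen)


if __name__ == "__main__":
    print(reach("http://example.com/a"))  # dave is counted once, so this prints 3
```

Each follower lookup is independent of the others, which is what makes the work a candidate for spreading across machines. Only the final deduplication needs the partial results brought together.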

One of the key ideas behind Storm was that the “reach problem” and the “stream processing” problem could be unified under one simple abstraction.

Make it beautiful

Trial by fire in a problem domain lets you build a map of it. Over time you gain a deeper understanding of its nuances and of the real use cases. That understanding can lead to a beautiful solution that replaces the existing hacks, relieves the suffering, and makes it possible to build new features or systems that were previously out of reach.

The key to finding a beautiful design is to identify the simplest set of abstractions that is sufficient to solve all the concrete problems you have actually dealt with. Trying to anticipate hypothetical cases you have not encountered yourself is a mistake that leads to over-engineering. As a rule, the larger the system, the deeper your understanding of the problem domain must be and the more diverse your set of use cases should be. Otherwise you risk falling victim to the second-system effect.

The “make it beautiful” phase is where you get to put your abstraction and design skills to work and distill a set of simple abstractions that compose well. It is like fitting a set of points on a graph (your use cases) with the simplest possible mathematical function (your set of abstractions).

The more points on the graph, the better your chances of finding the right curve. With too few points, there is a high risk that your curve will either fit the real data poorly or have a needlessly complex formula. Either way, you waste time and energy.

A very important requirement for a beautiful solution is that it accounts for performance and resource needs. The knowledge required for that comes precisely from the first phase.

While working on Storm, I distilled a small set of key abstractions: streams, spouts, bolts, and topologies. I was able to devise a new algorithm for guaranteed data processing that eliminated the intermediate message brokers, the part of the system that had caused the most suffering. The fact that two problems as seemingly different as reach and stream processing were solved so elegantly in the same way told me that I was onto something big.
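As a rough illustration of how these four abstractions relate, here is a toy, single-process sketch. It is not Storm's API. It assumes that a stream is a sequence of tuples, a spout is a source of a stream, a bolt consumes one stream and emits another, and a topology wires them together. Every name in it is hypothetical, and the "topology" is simplified to a linear chain rather than a general graph.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Stream = Iterable[tuple]  # a stream is modelled as an iterable of tuples


def sentence_spout() -> Iterator[tuple]:
    """Spout: the source of a stream (a fixed list here, for the example)."""
    for sentence in ["the cow", "the dog"]:
        yield (sentence,)


def split_bolt(stream: Stream) -> Iterator[tuple]:
    """Bolt: consumes sentences, emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Stream) -> Iterator[tuple]:
    """Bolt: keeps a running count per word and emits (word, count)."""
    counts: Dict[str, int] = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])


def run_topology(spout: Callable[[], Stream],
                 bolts: List[Callable[[Stream], Stream]]) -> List[tuple]:
    """Toy 'topology': feed the spout's stream through the bolts in order."""
    stream = spout()
    for bolt in bolts:
        stream = bolt(stream)
    return list(stream)


if __name__ == "__main__":
    print(run_topology(sentence_spout, [split_bolt, count_bolt]))
```

In this sketch everything runs in one process. It does not model distribution, partitioning between components, or guaranteed processing.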

I looked for a few more use cases to test the design against. I asked programmers I knew and tweeted that I was working on a new real-time system and was interested in real use cases. This led to many interesting discussions; I got a lot of valuable information and confirmed that my idea worked.

Make it fast

Once you have managed to create a beautiful solution, you can proceed to profiling and optimization.

The “make it fast” phase is not about deep architectural performance problems. Those should have been discovered during the first phase and dealt with during the second. Here we are talking about micro-optimizations and polishing the code. During the first two phases you reduce the asymptotic complexity of your algorithms; during the third you reduce the constant factors.
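The difference between lowering asymptotic complexity and lowering constants can be shown on a small, invented example: counting distinct items. The three functions below are hypothetical and return the same result. The step from the first to the second changes the complexity class, while the step from the second to the third only trims constant overhead.

```python
from typing import Hashable, Iterable, List, Set


def unique_count_quadratic(items: Iterable[Hashable]) -> int:
    """O(n^2): every membership test scans a list."""
    seen: List[Hashable] = []
    for item in items:
        if item not in seen:  # linear scan of the list
            seen.append(item)
    return len(seen)


def unique_count_linear(items: Iterable[Hashable]) -> int:
    """O(n) on average: set insertion is constant time on average."""
    seen: Set[Hashable] = set()
    for item in items:
        seen.add(item)
    return len(seen)


def unique_count_tuned(items: Iterable[Hashable]) -> int:
    """Still O(n), but the loop runs inside the set constructor
    instead of in interpreted Python, so the constant is smaller."""
    return len(set(items))


if __name__ == "__main__":
    data = ["a", "b", "a", "c"]
    print(unique_count_quadratic(data),
          unique_count_linear(data),
          unique_count_tuned(data))
```

The first kind of change belongs to the earlier phases. The second kind is worth making only once the design has settled and a profiler points at that spot.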

“Rinse and repeat”

Development through suffering is a continuous process. A beautiful and efficient system opens up new possibilities and poses new problems, which means you once again have to “make it happen” in new areas and then revise the design based on what you learn, so that it fits the new points on the graph.

Storm has gone through many such iterations. When we started using it, we found we needed to emit several independent streams from a single component. We discovered that adding a special kind of stream, the “direct stream”, would let Storm process batches of records as a single unit. Recently I developed “transactional topologies”, which make it possible to guarantee exactly-once processing of messages for almost arbitrary computations.

Trial and error in a problem domain you do not understand well leads, by its very nature, to messy code. That is why the most important trait of development through suffering is a relentless focus on refactoring. This is vital to keep accidental complexity from flooding the code.

Conclusion

In development through suffering, use cases are everything. They are worth their weight in gold. And the only way to get them is to start writing code, no matter how ugly it is at first.

All programmers go through several stages of growth. At first, we just get the program to work somehow. The code has no structure and is full of copy-paste. Over time we come to appreciate a more structured approach, mastering encapsulation and increasingly abstract and general constructs. And then we become obsessed with writing the most general, extensible code possible, hedging against the future.

Development through suffering rejects attempts to anticipate problems you have not yet encountered. It recognizes that generalizing without a deep understanding of the problem domain leads to unnecessary complexity and wasted effort. The architecture must always answer to real requirements, not imagined ones.

Source: https://habr.com/ru/post/155959/
