Haskell in Product: Project Manager Report

I have long promised to write an article about how Haskell showed itself in real-life problems in the product.

For those who didn’t keep up - at the beginning of 2012 he was lobbied and enthusiastically began to be implemented by programmers in Selector. Then I promised to publish a report on how “all this” can be used.

A product in a commercial project is not in a small sandbox “for itself”, not an academic experiment in the field of Computer Science. This is an endless struggle for the "party line", when hell, horror and doom are around, but it should still work. Int64 in XML-RPC is encoded as a string (because ints in XML-RPC are signed int32), openssl, when reading several certificates from a file, reads only the first one, you must write either “1” or “0” in bool, but sometimes - “2”, because it was the only way to invent the third mode - and so on. etc. Under these conditions, the requirements for the language gradually develop into requirements for its ecosystem, infrastructure, and readiness to adapt to the real world.
')
I will write about Haskell from the perspective of the product owner, project manager, system administrator, but not the programmer. So don’t expect me to sincerely delight in how gracefully a semigroupoid can be made through monads and how cool it is to infer types through types using types.

From the point of view of the project manager, the programming language is evaluated on several metrics that are completely different from the programmer ones. For a programmer, language and its features are perhaps the most important thing, since it is with him that he spends most of his time. For the rest of the team, what is more important is what happens outside the source code. At first it is a search for libraries and suitable technologies, then the tasks of maintenance, monitoring, implementation and debugging.

Let's start with consumer properties.

Speed of program execution

Haskell programs are faster than python, php, ruby (and other interpreted languages). Faster Erlang / Java (and other vm-based languages). Usually slower than C, although I have seen several cases where the Haskell compiler produced a result that exceeds the default.

For any practical application of Haskell performance, behind the eyes and behind the ears.

The main advantage in comparison with the python (from which we gradually migrated) is the excellent parallelism of execution. No GILs, no “external balancers between the workers”, no hell with hevent debugging.

Haskell has regular greenlets and native thread usage of the operating system.

Executable file size

Most of the time this doesn't matter, but in our configuration in some places it was cramped - and the minimum executable size of 22 MB was annoying. When the "tight places" decided, the size ceased to play any tangible role. Our largest server occupies 44MB and dynamically links up with three dozen so's.

Memory usage

(In this section we are talking about 'resources', that is, the memory in which data is stored, and not the code, in the top it corresponds to the RES column).

In computer algorithms, the memory used is usually calculated in O-notation, but there is an important factor - if there are many processes and each of them is O (1), how much memory will be eaten on the server? Those "plebeian constants" suddenly begin to play a role.

Haskell uses memory comparable to programs on python. Demons (the part of them that does not store a significant amount of data) take us from 9 to 20 megabytes. Python demons - about the same.

I must say that Haskell is somewhat inferior to OCaml in this parameter (in addition, combat services can live with 1-2 megabytes of memory), and, of course, C (for example, modd eats only 0.15 MB), but much better than the Java / Erlang situation .

Real Executables

Most cozy software environments (jvm, python, beam.smp, php, perl, .net, etc) require quite a lot of infrastructure (running interpreter / virtual machine, a lot of files in the right places, etc). When you write a program that “receives two numbers from the user, writes them to the database and shows the project administrator their amount”, everything is ok.

But sometimes it turns out that you need to write a program that runs in single mode. Or instead of init. Or from init itself. Or with suid. Or something else so that a cozy environment of execution has nowhere to deploy.

Haskell generates an executable file. This ELF. Which can be static or dynamic. And this is great.

The second important factor is the launch speed. In many cases, the program starts and ends. In Python (and many other interpreted languages), 100,500 different files are scanned at startup, especially with a bunch of imports, which leads to delays of 100–200 ms at the start. In Haskell, this value is much smaller, because ld performs multiple times faster than Python or PHP.

The same applies to the output of ps / top - the program on Haskell - these are regular executable files that appear in the process list as “just processes”, and not as a python that starts files.

This has a minus: 32/64 bits, all of a sudden - these are different executable files, and libffi5 or libffi6 is a big difference, which prevents the “cross-compatibility” of applications for a particular distribution, or even different versions of the same distribution kit.

Monitoring

Since the Haskell program is native to the operating system, there are no special features in monitoring (for comparison, the Java machine has its own indicators, which you should follow, Erlang has its own indicators).

Quality code

When operating an already written program, exactly one thing interests: how often it falls, beeps and spoils everything. So, in comparison with python - incomparably less. Yes, with proper file processing, you can catch the leaked in toplevel exception, but the probability of this is extremely small (I saw it once among all the programs among all programs).

The probability of stupid mistakes is much less. And when I say “significantly”, this is not in theory, but in practice, that is, by observing the same product, written mostly by the same people on Haskell and Python.

Python, like any other dynamically typed language, is a solid time bomb. It is necessary to think clearly about all bad situations, plus no one insures against minor local errors or negligence. Errors either appear in runtime, or they can either be hidden in the implicit except: pass (which is even worse). Object of type 'NoneType' has no method - our everything. And if this error turns out to be in a rare branch, then the mine turns out to be completely slowed down and works when the code has long been “stable and well-shown,” and in general, 300 days of uptime.

Tests that "cover the whole code", unfortunately, do not at all cover "all possible types of input data" (which, suddenly, are dynamic) and do not at all save from typing errors.

On Haskell such errors, errors of the level “oh, in this branch forgot to check” or “confused return type” does not appear in the programs. Programmers argue with a convenient type system that allows you to catch most of these errors at the compilation stage, plus a language that allows you to write important things, without being distracted by the indices of arrays and temporary variables. They know better.

From the experience of parsing found errors, I can say that most of the errors we encountered are either an error in the TK (that is, an error of your humble servant), or incorrectly understood by the TZ programmers. But not a local error or forgetfulness.

This leads to a paradoxical conclusion: the bugs in the program on Haskell are more difficult to fix than in languages with dynamic typing, because in a language with dynamic typing another place where NoneType suddenly got out, corrected the key, and on Haskele it’s necessary to deal with the algorithm yes To raise the ambiguity of TK with other people swear.

Library Maturity

Spherical programs in vacuum work only in conditions of sphericity and in vacuum. All the rest work in the real world, where millions of complex formats, specifications and protocols. That is, we need libraries. Thousands of them.

And, surprisingly, there are quite a lot of them on Haskele. That is, from the point of view of “they took and began to write the combat code” - yes, because it will not be necessary to invent logging itself, ssl, ready orm, regexps, localization support, time, http-server, etc. Almost everything is ready. Although there were some unpleasant moments. For example, we had to independently maintain the bson / mongdb implementation for Haskell, since the venerable Tengen stopped supporting it.

... At the same time, the Haskell program is equally unprotected from segfolts, because most of the programs are linked with libraries that are written in C, and this is either an error in the library, or the programmer who did not cause this library to blame (and the compiler himself no longer protects). In a couple of places, this led to the rewriting of the library on pure Haskell (for example, for this reason we wrote Hen , which implements the subset of requests for working with Xen that we need, commits for full support are welcome).

Compilation speed

I never thought that this could be a problem, but a fact: half an hour to build a project. On a very non-sickly gland with a bunch of cores and ultra-fast storage at the bottom. Personally, after the first moments of pride, “wow, our program is already compiling for half an hour,” it began to annoy me, because a minor bugfix, and hello scene:

Maintenance difficulty

Maintenance is the introduction of topical minor changes, local clarification of “what is wrong”, in short, the turnover of a living “own” project.

So, with the accompaniment turned out pretty unpleasant. It is clear that the sysadmin can be taught monadic calculations. But ... Well, you understand. If the sysadmin could find and correct the Python code when needed, now the code is Shaitan-Arba, to which special people are attached, to change it, and you have to talk about the problem only in the diagnostic style (“it doesn’t work here”, “it doesn’t "). Firstly, it spoils the devops synergy somewhat (if someone from the team does not understand the essence of what the second part of the team did, this is bad), and secondly, it makes a lot of demands on people who sit down at the code.

The search for programmers is a problem, however Haskel’s apologists would not say the opposite. Even the option to “retrain” is a problem, because under functional programming, especially with “komonads” and template haskell, you need to turn your brains hard. In other words, language is some additional obstacle that raises the threshold of entry.

Given the well-known shortage of programmers in the labor market, this is an obstacle. On the other hand, the presence of such things attracts programmers who are tired of “php + js, from here to lunchtime”.

Development speed

Surely I will hear a lot of outrage from the apologists of the language, including the programmers with whom I worked. But, an objective reality: projects on Haskele are written more slowly, than on the Python. In the counter-arguments, I will be given a changed programming style, more attention to trifles, etc., but still, my current conviction based on practice is the final speed of implementation of the new functionality on Haskell is noticeably lower. Alas.

This is partially compensated by the time for post-debugging and capturing any stupid bugs that in Python constituted a decent loop after writing the program, and which is almost absent with Haskell, but even with that in mind, it still turns out slower.

A similar problem with prototyping. If the base prototype on a python appears almost as a copywriter of what he did in an interactive environment in the laboratory, but in Haskell this is usually a kind of sacred act, which for some time goes into itself (types, etc.), and only after a while leads to the result. If it turns out that the result is “not exactly what they dreamed of,” then it becomes clear already closer to the final, and not at the beginning. Thus, the cost of iteration in finding a solution increases, making the whole process less flexible.

Source: https://habr.com/ru/post/193722/

All Articles