Linux kernel development is run by email

How would you run the development of the largest open source project, in which about 15 thousand developers and 222 companies make more than 12 thousand changes between releases, or 7 to 8 changes every hour? What do kernel creator Linus Torvalds, stable maintainer Greg Kroah-Hartman (GKH) and their comrades use to coordinate the project successfully and ship each new version on time?

That wonderful tool is a text-based email client: Mutt for GKH and Alpine for Linus Torvalds. Andrew Morton, the third most prominent kernel developer, also uses email to manage the mm branch.

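With so many "modern" development tools available, such as GitHub and Gerrit, why are the Linux kernel developers still stuck in the 1990s with their strict rules about patches in the body of a plain text email? This talk covers how kernel development works, why we rely on these "ancient" tools, and how they still work a lot better than anything else. ^[1]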

Let's try to figure out how this came about. What roles do technology and the human factor play? Could plain text email really be an ideal means of coordinating extremely complex projects?

The long road to Git

Recall how it all began. On August 25, 1991, an unknown Finnish student posted a message to the comp.os.minix newsgroup announcing that he was working on a free operating system and that it would soon be ready. In October of the same year he uploaded version 0.02 to an FTP server in a directory named Linux, let his fellow hackers know, and off it went ...

From the very beginning, development was carried out by a distributed team of enthusiasts. At first Linus Torvalds checked every patch, rewrote it and applied it himself. The team grew, the patches multiplied, and the kernel became larger and more complex. He had to apply patches in a hurry, often without any changes. In this situation Linus made the only right decision: to delegate the work and trust those who were good at it. This process is well illustrated by a message known as "Linus doesn't scale", which complains about his overload and proposes assigning maintainers to the various kernel subsystems.

Linus doesn't scale, and patches end up dropped on the floor. Patches that fix compile errors get dropped. Code from subsystem maintainers that Linus himself designated gets dropped. The tree was once warning-free. Code gets repeatedly resynced and re-diffed against new trees. This is extremely frustrating to maintainers. It is a huge source of unnecessary work. The situation needs to be resolved. Fast.

In fact, the whole history of the project fits this pattern. Linus coded less and less and coordinated more and more, gradually turning from a hacker into an architect. This did not happen without internal friction and rough patches, but the movement continued. Meanwhile, a core of senior developers formed and a style of collaboration took shape. Since email had been the main tool for joint work from the very beginning, everyone on the project got used to sending patches by email and even wrote down in the documentation how to do it properly.

 -rw-r--r-- 1 root root 36604  11 2016 /usr/src/linux/Documentation/SubmittingPatches
 -rw-r--r-- 1 root root 11186  11 2016 /usr/src/linux/Documentation/email-clients.txt

Nevertheless, by 2000 it had become clear that a version control system was needed. People began to grumble that development was being held back by the manual way Linus worked. That was true, and the team then adopted BitKeeper, which at the time was a closed-source product provided free of charge to open source developers. The pragmatic Linus could not have cared less about that, but the team included men of principle who heeded Richard Stallman's voice and were ready to tirelessly condemn anything that contained closed code. Alan Cox, the programmer who wrote the first decent network stack for the kernel, was one of them.

For a while everything seemed to go well, and BitKeeper made life much easier for developers. They no longer had to worry about who had rights to which changes, each of them could work in their own copy of the source tree, and distributed merges ^[2] saved everyone a significant amount of effort. Beneath the surface, however, a crisis was brewing, and it led to the creation of Git.

Then Tridge (Andrew Tridgell), the author of Samba, decided to do what he did better than anyone: reverse engineer ^[3] BitKeeper, which its license flatly prohibited. A conflict broke out; Linus tried to defuse it but failed. So he sat down and wrote Git in a matter of days. We all know what happened next: in a segment as delicate and conservative as version control, Git swept aside its competitors in just a few years.

The main advantage is the widest availability

Recently Greg Kroah-Hartman, the maintainer of the stable Linux kernel branch, gave a conference talk laying out his arguments for plain text email and against collaborative development and code review systems such as GitHub or Gerrit.

After listing the obvious advantages that made GitHub hugely popular with developers, the speaker stressed several times that it works fine for small projects but does not suit large ones because it scales poorly. As evidence he cited Kubernetes, with 4,600 open issues and about 600 pull requests. In the presentation itself these figures were 10% lower.

GKH's other complaints about GitHub:

It requires constant Internet access, which not all developers have.
Pull requests and mailing-list discussion live separately.
Internal checks are hard to implement.
The UI makes it painfully hard to keep track of issues and requests.

However, the speaker noted several times that some of these problems are gradually being solved and that the service keeps changing for the better. Greg himself hosts usbutils on GitHub.

The other collaborative development and code review systems took a beating: the only argument for using Gerrit was that project managers like it. Unsurprisingly, plain text email collected the most praise. The arguments for email were as follows.

Simplicity
Widest audience
Scalability
Promotes community growth
Internal checks are easy
Localization and translation of text
Quick review of patches
Ability to review remotely
No project manager

The first three points are really different facets of the same thing: email is simple and accessible to everyone. For blind programmers, and according to GKH the project has a number of excellent ones, email is accessible while the web is not. For those who sit behind a corporate firewall, git is unreachable, but email works without problems. Maintainers travel a lot, speak at conferences ~~change passwords and safe houses~~ and find it easier to wait for an Internet connection and then download and send their mail than to find a time and place to check the status of requests in Gerrit. Someone even managed to spend a long time cycling across Africa and still send patches on schedule.

The fourth point is also very important. Every week or two a newcomer appears on GKH's radar. They are usually advised to join the mailing list and simply spend a week in read-only mode to get a feel for what is going on and pick up the details. This is a very good way to tell important topics from minor ones and to appreciate the depth and detail of the problems under discussion. By contrast, if you send a newcomer off to read forums somewhere, they may respond with confusion and resentment.

Let us now dwell in more detail on some of the disadvantages of using plaintext email as the main tool for managing and developing Linux.

I get a lot of email

That was the title of a blog post by Greg in which he describes his working days with email.

Overall, in 24 hours I received 18,799,115 bytes (18Mb) of email in 2067 individual messages:

Of these, at least 237 were real messages to mailing lists, and Greg reads all of them.

237 emails to mailing lists that I read everything that is posted. This includes a number of openSUSE mailing lists, linux-usb, linux-pci.
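To put these figures in perspective, a quick back-of-the-envelope calculation helps. The sketch below uses only the numbers quoted above (18,799,115 bytes, 2067 messages, 237 list messages read in full); the variable names are purely illustrative.

```python
# Figures quoted from Greg's blog post.
total_bytes = 18_799_115
total_messages = 2067
read_in_full = 237

# Derived values: average size, hourly rate over 24 hours, share read in full.
avg_message_bytes = total_bytes / total_messages
messages_per_hour = total_messages / 24
share_read_in_full = read_in_full / total_messages

print(f"average message size: {avg_message_bytes:.0f} bytes")
print(f"messages per hour:    {messages_per_hour:.1f}")
print(f"read in full:         {share_read_in_full:.1%}")
```

That works out to roughly 9 KB per message and about 86 messages an hour around the clock, of which a little over one in ten is read in full.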

Of all the maintainers, he carries the heaviest load and has the largest number of signed-off commits.

Moreover, these numbers are constantly growing! Apparently GKH scales well.

What a model patch should look like is described in Documentation/SubmittingPatches. No MIME or other nonsense, only plain text. The patch must be contained in the body of the message, not attached to it.

 6) No MIME, no links, no compression, no attachments. Just plain text.
 -----------------------------------------------------------------------

 Linus and other kernel developers need to be able to read and comment
 on the changes you are submitting. It is important for a kernel
 developer to be able to "quote" your changes, using standard e-mail
 tools, so that they may comment on specific portions of your code.

 For this reason, all patches should be submitted by e-mail "inline".
 WARNING: Be wary of your editor's word-wrap corrupting your patch,
 if you choose to cut-n-paste your patch.

 Do not attach the patch as a MIME attachment, compressed or not.
 Many popular e-mail applications will not always transmit a MIME
 attachment as plain text, making it impossible to comment on your
 code. A MIME attachment also takes Linus a bit more time to process,
 decreasing the likelihood of your MIME-attached change being accepted.

 Exception: If your mailer is mangling patches then someone may ask
 you to re-send them using MIME.

 See Documentation/email-clients.txt for hints about configuring
 your e-mail client so that it sends your patches untouched.
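To illustrate what "inline" and "no MIME attachments" mean in practice, here is a minimal Python sketch that builds a single-part, plain text message with the diff in the body. The helper name, the addresses and the sample diff are made up for the example; in real kernel work people typically let `git send-email` produce such messages, so treat this as a sketch of the idea rather than a recommended tool.

```python
from email.message import EmailMessage


def build_inline_patch_mail(sender, recipient, subject, changelog, diff_text):
    """Build a single-part text/plain mail with the patch in the body (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Changelog, then the "---" marker line, then the diff itself, all inline.
    body = f"{changelog}\n---\n{diff_text}"
    # Plain ASCII with 7bit transfer encoding: nothing is base64- or
    # quoted-printable-encoded, so reviewers can quote the patch directly.
    msg.set_content(body, charset="us-ascii", cte="7bit")
    return msg


# Made-up sample data for the sketch.
sample_diff = (
    "diff --git a/hello.c b/hello.c\n"
    "--- a/hello.c\n"
    "+++ b/hello.c\n"
    "@@ -1 +1 @@\n"
    "-int x;\n"
    "+int x = 0;\n"
)
mail = build_inline_patch_mail(
    "A Developer <dev@example.org>",
    "maintainer@example.org",
    "[PATCH] hello: initialize x",
    "Initialize x explicitly.\n\nSigned-off-by: A Developer <dev@example.org>",
    sample_diff,
)
print(mail.as_string())
```

Because the message is never made multipart, the receiving side sees exactly one text/plain part, which is the property the rule above asks for.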

This sounds easy, but in practice not every mail program copes. The main offenders are MS Outlook, Gmail (web UI) and Lotus Notes.

 (5:534)$ grep -A2 Lotus /usr/src/linux/Documentation/email-clients.txt
 Lotus Notes (GUI)
 Run away from it.

Companies that use them always have a dedicated Linux workstation to send patches.

In addition, the 16 rules clearly define the format of the message subject and body.

 The canonical patch subject line is:

     Subject: [PATCH 001/123] subsystem: summary phrase

 The canonical patch message body contains the following:

   - A "from" line specifying the patch author (only needed if the person
     sending the patch is not the author).
   - An empty line.
   - The body of the explanation, line wrapped at 75 columns, which will
     be copied to the permanent changelog to describe this patch.
   - The "Signed-off-by:" lines, described above, which will also go in
     the changelog.
   - A marker line containing simply "---".
   - Any additional comments not suitable for the changelog.
   - The actual patch (diff output).
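The canonical format quoted above is regular enough to check mechanically. The following Python sketch shows one way that could be done; `parse_subject` and `check_body` are hypothetical helpers written for this article, not part of the kernel's own tooling, and the regular expression assumes only the two subject forms `[PATCH] ...` and `[PATCH n/m] ...`.

```python
import re

# Matches "[PATCH] subsystem: summary" and "[PATCH 001/123] subsystem: summary".
# Assumption: version tags such as "[PATCH v2 1/3]" are deliberately not handled.
SUBJECT_RE = re.compile(
    r"^\[PATCH(?: (?P<index>\d+)/(?P<total>\d+))?\] "
    r"(?P<subsystem>[^:\s][^:]*): (?P<summary>\S.*)$"
)


def parse_subject(subject):
    """Return the parts of a canonical subject line, or None if it does not match."""
    match = SUBJECT_RE.match(subject)
    return match.groupdict() if match else None


def check_body(body, max_width=75):
    """Return a list of problems found in the changelog part of a patch mail."""
    problems = []
    # Everything before the "---" marker line is the changelog.
    changelog, marker, _rest = body.partition("\n---\n")
    if not marker:
        problems.append('missing "---" marker line')
    lines = changelog.splitlines()
    if not any(line.startswith("Signed-off-by:") for line in lines):
        problems.append("missing Signed-off-by line")
    for number, line in enumerate(lines, start=1):
        if len(line) > max_width:
            problems.append(
                f"changelog line {number} is longer than {max_width} columns"
            )
    return problems


print(parse_subject("[PATCH 001/123] subsystem: summary phrase"))
```

A real submission would still be judged by a human maintainer; a check like this can only catch formatting slips before the message is sent.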

Strange things often happen to emails. Linus complains that Dmitry Torokhov's KMail should be burned to hell because it corrupts spaces and tabs, so copy-paste produces garbage.

I can’t just cut out the whole line, because they aren’t tabs and spaces, they are some horrible abomination.

It turns out that they are '\302\240 ' ('\xC2\xA0\x20'), ie, some utf-8 abomination.
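The bytes in that message, C2 A0, are the UTF-8 encoding of a non-breaking space (U+00A0), which looks like an ordinary space but is not one. A minimal Python sketch for spotting it in a patch before sending might look like this; the function name `find_nbsp_lines` is hypothetical and not part of any kernel tooling.

```python
NBSP = "\u00a0"  # non-breaking space; encodes to b"\xc2\xa0" in UTF-8


def find_nbsp_lines(patch_bytes):
    """Return the 1-based numbers of lines that contain a non-breaking space."""
    text = patch_bytes.decode("utf-8", errors="replace")
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if NBSP in line
    ]


# Made-up sample: the second line starts with NBSP + space instead of a tab.
sample = b"\tint x;\n\xc2\xa0 int y;\n"
print(find_nbsp_lines(sample))
```

The octal form `\302\240` and the hex form `\xC2\xA0` describe the same two bytes, so either notation identifies the same character.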

And here Linus scolds Gmail support because their spam filter gets it wrong in 20% of cases.

Misunderstandings can also arise between a programmer and a maintainer. Here is a comment from LWN, in translation.

Reviewing patches by email sucks compared to push --force. Will the maintainer grumble if I send a reworked v2 of one of the patches as a reply, or will they grumble if I send the whole fresh v2 as one batch instead? Will they notice the v2 and apply the right one, given that I cannot change the status of my pull request? As a maintainer, would I apply the fix correctly in the same situation?

Overall balance and conclusions

It is hard to shake the impression that Linus and his comrades have canonized their old habits from the 1990s. That is a subjective factor, but it rests on an objective one: the hackers who created the Linux kernel 25 years ago, and thanks to whom the open source community gained material strength and moral leadership, are at their most productive working this way. If you took talented graduates of [MSU, Bauman, MIT, Berkeley, your_favorite_university] and replaced every maintainer with them, within a year or two they would move to GitHub or write something of their own, and then show on slides how much their productivity had grown. The evidence is that projects of comparable scale get along fine without patches in the body of plain text emails. Google, for example, uses Gerrit for Android development.

By way of summary, here is a comment from LWN.

My thesis is that in the 90s the open source community had the best software tools for collaboration, and that this is why we succeeded. Then the proprietary vendors caught up and overtook us. We must invest in our tooling again and update it, or at least adopt the best of theirs, if we want to keep succeeding.

And what does Habr think about this?

References

[1] Why, with such a rich choice of "modern" development tools such as GitHub, Gerrit and others, are the Linux kernel programmers still stuck in the 1990s with their strict rules for creating patches in the body of a plain text email? You will learn how kernel development is carried out, why we rely on these "ancient" tools, and how they are so superior to all the others. From the announcement page of GKH's talk.
[2] Distributed merge.
[3] Reverse engineering.

Source: https://habr.com/ru/post/314084/
