
Rethinking PID 1. Part 2



Parallelizing Socket Services


This kind of start-up synchronization results in the serialization of a significant part of the boot process: services are started one after another. Wouldn't it be great if we could get rid of the cost of that synchronization and serialization? Well, we actually can. To do so, we need to understand what daemons really require from each other, and why their start-up is delayed. For traditional Unix daemons there is one answer: they wait until the socket on which another daemon offers its services is ready to accept connections. Usually that is an AF_UNIX socket in the file system, but it can also be an AF_INET socket. For example: D-Bus clients wait until /var/run/dbus/system_bus_socket can be connected to, syslog clients wait for /dev/log, CUPS clients wait for /var/run/cups/cups.sock, and NFS mounts wait for /var/run/rpcbind.sock and the portmapper IP port, and so on. And now think about it: that socket is actually the only thing they wait for!
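To make this concrete, here is a minimal sketch of the client side of that wait; the socket path is hypothetical and stands in for paths like /var/run/cups/cups.sock above. All the client really needs is for connect() to succeed, which it does as soon as the socket is bound and listening, no matter how far along the daemon behind it is:

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void) {
        /* Hypothetical service socket, for illustration only. */
        const char *path = "/var/run/example/service.sock";

        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        struct sockaddr_un addr;
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

        /* This is the only thing the client actually waits for. */
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }

        write(fd, "hello\n", 6);
        close(fd);
        return 0;
    }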

Since this is the basis for everything that follows, let me say it again in different words: if you start syslog and various syslog clients at the same time, what happens in the scheme above is that the clients' messages are added to the /dev/log socket buffer. As long as that buffer does not fill up, the clients do not have to wait at all and can proceed with their own start-up; as soon as syslog finishes loading, it pulls all the messages out of the queue and processes them. Another example: we start D-Bus and several of its clients at the same time. If a synchronous request is sent on the bus, and hence a reply is expected, the client will block, but only that single client (the one that sent the synchronous request), and only until D-Bus catches up and processes the request.
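A minimal sketch of the syslog case, using the real /dev/log datagram socket: sendto() copies the message into the kernel socket buffer and returns immediately, whether or not syslog has started reading yet (assuming the socket itself already exists):

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void) {
        int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        struct sockaddr_un addr;
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        strcpy(addr.sun_path, "/dev/log");

        /* <13> is the syslog priority: facility "user", severity
           "notice". The call returns as soon as the kernel has
           buffered the datagram; syslogd dequeues it whenever it
           is ready. */
        const char *msg = "<13>example: sent before syslogd is up";
        if (sendto(fd, msg, strlen(msg), 0,
                   (struct sockaddr *)&addr, sizeof(addr)) < 0)
            perror("sendto");

        close(fd);
        return 0;
    }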

In summary: the kernel socket buffers help us maximize parallelization, and the ordering and synchronization are done by the kernel, without any user-space intervention! And if all the sockets are available before the daemons actually start, dependency management also becomes redundant (or at least secondary): if a daemon needs another daemon, it simply connects to it. If the other daemon is already running, this immediately succeeds. If it is not yet started but in the process of starting, the first daemon does not even have to wait for it, unless it issues a synchronous request. And even if the other daemon is not running at all, it can be auto-spawned. From the first daemon's point of view there is no difference, so dependency management becomes mostly unnecessary or at least secondary, with everything optimally parallelized and, optionally, loaded on demand. On top of this, it is also more robust, because the sockets stay available even when the daemon itself may become temporarily unavailable (perhaps due to a crash). In fact, a daemon can start, then stop (or crash), start again and stop again (and so on), all without the clients noticing or losing any requests.
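How can the sockets exist before their daemons do? The init system creates and binds them itself, and hands the listening file descriptors to the daemons when it starts them. Here is a rough sketch of that idea, loosely following the convention systemd uses (inherited sockets start at file descriptor 3 and are announced via the LISTEN_FDS and LISTEN_PID environment variables); the socket path and daemon binary are hypothetical:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void) {
        /* Hypothetical socket and daemon, for illustration only. */
        const char *path = "/var/run/example/service.sock";
        const char *daemon_bin = "/usr/sbin/example-daemon";

        /* 1. Create and bind the socket *before* the daemon runs.
           From this moment on, clients can connect() successfully. */
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un addr;
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
        unlink(path);
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(fd, 128) < 0) {
            perror("bind/listen");
            return 1;
        }

        /* 2. Later (or on the first connection), spawn the daemon
           and pass the listening socket across exec() as fd 3. */
        if (fork() == 0) {
            char pid[32];
            snprintf(pid, sizeof(pid), "%ld", (long)getpid());
            dup2(fd, 3);
            if (fd != 3)
                close(fd);
            setenv("LISTEN_FDS", "1", 1);  /* one inherited socket */
            setenv("LISTEN_PID", pid, 1);
            execl(daemon_bin, daemon_bin, (char *)NULL);
            perror("execl");
            _exit(1);
        }
        close(fd);
        return 0;
    }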
This is a good moment to pause, go pour yourself some more coffee, and rest assured that the material gets even more interesting from here.

But first, let's clarify a few things: is this kind of logic new? No, of course not. The most prominent system that works like this is Apple's launchd: on MacOS, launchd listens on the sockets and starts all the daemons. Therefore, all services can start in parallel and without the need for configured dependencies. This is actually an ingenious design and the main reason why MacOS provides its fantastic boot times. I highly recommend this video, where the launchd folks explain what they are doing and how. Unfortunately, this idea never really caught on outside of the Apple camp.

The idea is actually older than launchd. Prior to launchd, the venerable inetd worked in a similar style: sockets were created centrally by a daemon that would then start the actual service daemons, passing the socket file descriptor across exec(). However, the focus of inetd was mainly not local services but Internet services (although later implementations also supported AF_UNIX sockets). Nor was inetd a tool for parallelizing the boot process or resolving implicit dependencies.
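In inetd's classic per-connection mode, the handoff looked roughly like this sketch (the port and service binary are hypothetical): the accepted connection becomes the child's stdin/stdout/stderr before exec(), so the service simply reads and writes its standard streams:

    #include <netinet/in.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        signal(SIGCHLD, SIG_IGN);  /* don't accumulate zombies */

        /* Listen on a TCP port on behalf of the service. */
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(7777);
        if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(lfd, 16) < 0) {
            perror("bind/listen");
            return 1;
        }

        for (;;) {
            int cfd = accept(lfd, NULL, NULL);
            if (cfd < 0)
                continue;
            if (fork() == 0) {
                /* The connection becomes stdin/stdout/stderr of
                   the service: inetd's per-connection mode. */
                dup2(cfd, 0);
                dup2(cfd, 1);
                dup2(cfd, 2);
                close(cfd);
                close(lfd);
                execl("/usr/sbin/example-service",
                      "example-service", (char *)NULL);
                _exit(1);
            }
            close(cfd);
        }
    }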

For TCP sockets, inetd was mainly used as follows: a new daemon instance was spawned for each incoming connection. This means that for each connection a new process was forked and initialized, which is not a recipe for high-performance servers. However, from the very beginning inetd also supported another mode of operation, where a single daemon was spawned on the first connection, and that single instance would then accept the follow-up connections as well (this is what the wait/nowait options in inetd.conf were for, an unfortunately very poorly documented option). Starting a daemon per connection gave inetd a bad reputation for being slow. But that is not entirely fair.
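For reference, that distinction lives in a single field of inetd.conf; a sketch with hypothetical entries (the second service and both paths are examples, not from the original article):

    # service   type    proto  wait/nowait  user  program             args
    ftp         stream  tcp    nowait       root  /usr/sbin/ftpd      ftpd
    exampled    stream  tcp    wait         root  /usr/sbin/exampled  exampled

With nowait, inetd accepts each connection itself and spawns a fresh process per client; with wait, it hands the listening socket to a single daemon instance, which then accepts the follow-up connections on its own.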

Parallelizing Bus Services


Modern Linux services tend to provide their services via D-Bus instead of plain AF_UNIX sockets. Now, the question is: can we apply the same parallelizing boot logic to them as to traditional socket services? Yes, we can: D-Bus already has all the necessary hooks for it. Using bus activation, a service can be started the first time it is accessed. Bus activation also gives us the minimal per-request synchronization needed to start the providers and consumers of D-Bus services at the same time: if we want to run Avahi simultaneously with CUPS (side note: CUPS uses Avahi to browse for mDNS/DNS-SD printers), we can simply run them at the same time, and if CUPS is quicker than Avahi, the bus activation logic makes D-Bus queue the request until Avahi manages to establish its service name on the bus.
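Bus activation is driven by small declarative files; here is a sketch of one, with a hypothetical service name and binary (for session services such files live under /usr/share/dbus-1/services/):

    # org.example.Foo.service (hypothetical)
    [D-BUS Service]
    Name=org.example.Foo
    Exec=/usr/bin/example-foo-daemon

When a client sends a request to org.example.Foo and nothing owns that name yet, the bus daemon starts the Exec binary and queues the request until the service claims the name.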

So, in summary: socket-based service activation and bus-based service activation together allow us to start all daemons in parallel, without any further synchronization. Activation also allows us to do lazy loading of services: if a service is rarely used, we can simply start it on the first access to its socket or its name on the bus, instead of starting it at boot time.

And if that isn't great, then I don't know what is!

To be continued…

Source: https://habr.com/ru/post/335488/

