
Privileged ports cause global warming

I am 37 years old, which is 99 in programmer years. I am old enough to remember the first days of the public Internet and the first Internet service providers. I first went online through an Internet service provider called Internet Access Cincinnati (IAC). It provided dial-up access to a Sun SparcStation 10 server, where users could run venerable terminal applications such as elm (an email client), emacs, lynx (a text-based web browser), and of course IRC.

Later, IAC added the ability to dial in to a CSLIP terminal server (a predecessor of PPP) and connect your own Linux or Windows computer (with Trumpet WinSock) directly to the Internet with a real IP address.

But back to that SparcStation. The machine had two CPUs running at a blistering 33 MHz each, and it could hold up to 512 MB of memory, though I doubt its slots were fully populated; RAM was very expensive back then. A server with those modest resources served 50-100 concurrent active users, handled mail for tens of thousands, hosted IRC chat, served early HTTP 1.0 sites via NCSA HTTPd, and even acted as an FTP mirror for Slackware Linux. It handled the load well and routinely showed uptimes of one or two months.

You can probably sense a rant coming about how bloated our software has become. If so, you are right. But what sets this rant apart from similar ones is that it offers a hypothesis that can explain one of the main causes of that bloat. In my view, it is the consequence of a very simple design road not taken in the past.
I bring up the SparcStation because I want to start with a very dumb question: why do we need virtualization? Or containers? How did we end up with this explosion of nested complexity (OS -> VM -> containers -> ...) instead of the simplicity of multi-user operating systems?

Virtualization is expensive


My startup ZeroTier (apologies for the plug) runs cloud infrastructure spread across many data centers, providers, and continents. Most of the nodes that make up this cloud presence act as stable anchor points on the Internet and as relays of last resort. They accept and forward packets. They need bandwidth and a bit of CPU, but very little memory or disk. Take one of our TCP fallback relays: it typically pushes 5-10 Mbit/s of traffic, yet the software needs only about 10 MB of memory and less than one megabyte (!!!) of disk space. It also uses less than 1% of the CPU.

However, its virtual machine occupies 8 GB of disk and at least 768 MB of memory. All of that goes to a full base installation of CentOS Linux, a complete set of standard applications and tools, system services like systemd and cron, and an OpenSSH server for remote access. The large RAM footprint is a direct consequence of virtualization, which is essentially a hack to fool a kernel into thinking it is running on its own hardware. All of that memory must appear available to the VM, because the guest kernel believes it is on a separate machine with its own local memory, so the hypervisor has to oblige. (Modern hypervisors can overcommit memory to some extent, but overcommitting too aggressively risks serious performance degradation when load spikes.)

Containers are somewhat more efficient than hypervisors. They do not run their own Linux kernel and do not spend extra memory on a hypervisor, but they still drag in at least part of a full Linux distribution just to run one or a few programs. A container for a 10-megabyte program can weigh several gigabytes, and managing containers requires its own hefty stack of software.

Ask anyone why we need all this, and they will tell you: security, isolation, and scalability. And they will be right.

In the old days, before virtualization and containers, companies like ZeroTier had to colocate their own hardware in data centers, which is inconvenient and not very cost-effective. Virtualization lets you serve hundreds of users from one very powerful server (my 768 MB virtual machine probably runs on a 16-24-core Xeon monster with 256+ GB of memory).

Hundreds of users... that sounds a lot like... hmm... that old 33 MHz SparcStation?

Today's software is orders of magnitude heavier and more complex than the programs that ran on the old IAC server, and while part of that is layers of abstraction and plain bloat, there has also been a real increase in functionality and a huge increase in the amount of data being processed. I am not saying we should cram the workload of hundreds of typical modern users into a coffee maker. But I do believe we should be able to fit at least several Wordpress sites (a typical example of a workload that usually gets its own virtual machine) on a Raspberry Pi, a computer with roughly 100 times the CPU power of that old server, several times the RAM, and 10-20 times the persistent storage.

A Raspberry Pi costs $30 and draws less than 15 watts. Do I even need to say what that monster VM host costs and how much power it consumes?

Time for Pi!


Let's run a little thought experiment in which we set up a Raspberry Pi as a terminal server for a group of users who want to run Wordpress blogs. A web server and a small database together take about 10-20 MB of memory, and our RPi has 1024 MB, so it should be able to host at least 50 small or medium-sized sites. (In reality most of that RAM is redundant or idle, so with swap or KSM our RPi could probably handle several hundred sites, but let's be conservative.)

First, install Linux. It is Unix's heir, a multi-user operating system (right?), so let's create 50 user accounts. Each of those users can now log in. An incoming SSH session and shell take only a megabyte or two of memory, and our RPi has more than a thousand, so far everything should go smoothly.

Yes, yes, we are ignoring security for now; we will get to that later. Assuming none of our users misbehave, they all download and unpack Wordpress. So far so good. The problems begin when they start the installation.

Step one: install the MySQL database.

Oh, it's so easy! Type "sudo apt-get install ..."

Wait... sudo? Granting everyone sudo rights means you might as well not have separate user accounts at all.

It turns out you can install MySQL in your own home directory. If you are willing to compile it, or manually unpack the Debian package and edit a few configuration files, you can do it without any special privileges. You will have to point it at directories under /home/yourname/..., but in the end you get your own local MySQL server running entirely in your own user space.

Step two: configure a web server to run PHP.

Let's not grumble... once again, instead of "sudo apt-get install" we download all the necessary components ourselves, build them, and wire them together. It turns out this can be done too, and after some yak shaving [seemingly pointless work that is actually necessary: you solve a problem that solves another problem which, several levels of recursion later, solves the real problem you are working on; MIT slang - translator's note] our own web server with PHP is ready to go.

Then you hit something like "bind: permission denied." Hmm. After a bit of digging, you discover that port 80, the default web port, is a privileged port. You need root privileges to bind to it.
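
A minimal sketch of that failure, assuming an unprivileged user on a stock Linux box (the port numbers are just examples):

```c
/* bind_demo.c - try to bind TCP port 80 as an unprivileged user and
 * report the error, then try an unprivileged port for comparison.
 * Compile with: cc -o bind_demo bind_demo.c
 */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

static void try_bind(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        printf("port %u: bind failed: %s\n", port, strerror(errno));
    else
        printf("port %u: bind succeeded\n", port);

    close(fd);
}

int main(void)
{
    try_bind(80);    /* privileged: EACCES unless root or CAP_NET_BIND_SERVICE */
    try_bind(8080);  /* unprivileged: works for a normal user */
    return 0;
}
```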

Never mind. Just change the port. But now everyone who visits your site has to append a silly :### to the hostname in the URL to reach the non-standard port.

Balls!


At this point you realize that if it weren't for privileged ports, nothing meaningful on a Unix system would have to run as root, and you could actually secure system services much better. But that is not why I called this section "balls." I called it that because, to understand the broader implications of the privileged port restriction, we need to step away for a minute and talk about balls.

Testicles are a vital organ. Putting them in a little bag outside the male body was a dumb move, not to mention an odd look. The females of our species are designed more sensibly: their ovaries are tucked deep inside the abdomen, where they are far better protected from punches, kicks, swords, fences, and old aunts.

Path dependence, a concept from economics and evolutionary biology, tries to explain where such convoluted designs come from. The gist is that decisions made in the past constrain decisions that could otherwise easily be made in the present.

Male testicles sit outside the body because certain enzymes in sperm work better at temperatures slightly below body temperature. This "decision" was made a very long time ago, probably (by the simplest and therefore most likely hypothesis) when we were either not yet warm-blooded or had a lower body temperature. Reproductive organs are what biologists call a highly conserved system: they are not inclined to change often, because most mutations to the reproductive system are likely to erase an individual from the gene pool. As a result, it is "easier" (in the sense of traversing an evolutionary probability landscape) to hang the testicles in a saucy-looking bag between the legs than to go back and reengineer sperm to work at a higher temperature.

Even if mutations letting sperm work at a higher temperature were just as likely as the mutations that gave rise to the slang term "hang," there is no guarantee they would have happened. Those particular mutations might simply never have occurred by chance, leaving the evolutionary learning system no options except... er... unfortunate ones. Okay, okay, I'm done.

Technological evolution is driven by intelligent (or so we like to think) agents, but in many ways it resembles biological evolution. Once a decision is made, it sets the path for future decisions, and the system tends to keep moving along the beaten track. Sometimes changing the old decision is more expensive than living with it, and sometimes it is just inertia.

Take those odd round 12-volt plugs in cars. They were originally designed as cigarette lighters. When portable electronics became popular, engineers wondered how to power them in a car. Getting automakers to install a proper outlet was a non-starter, and so was convincing users to wire in sockets themselves, but since every car already had a cigarette lighter... well then... the answer was obvious: make a plug that fits the cigarette lighter socket. Today cars often ship without an actual cigarette lighter, but they still have an outlet shaped like a cigarette lighter socket, because where else would you plug in your smartphone?

(USB ports are gradually replacing cigarette lighter sockets, but most cars still have them.)

The road not taken


By now you can see where I am going. Unix was originally designed as a multi-user time-sharing system, and many features were added over time to let system resources be shared. There are file permissions with users and groups, and on newer systems fine-grained access control lists. There are per-user quotas for memory, disk, and CPU so that no single user can hog all of the system's resources. There is process isolation, memory protection, and so on. But for some reason nobody ever got around to adapting the network layer for multi-tenancy of network services. Maybe privileged ports still made sense and seemed appropriate at the time. Or maybe nobody thought about it.

Privileged ports were originally a security feature. Back then all computers belonged to organizations and were managed by full-time system administrators, and networks were closed to outside devices (to the extent such devices existed at all). Reserving a range of ports that only root could bind let a system service know that an incoming connection originated from a program sanctioned by the remote machine's administrator.

In the early 1990s, when the Internet was becoming a public network, many systems and networks still worked this way and relied on privileged ports as a critical authentication mechanism. So when websites proliferated and everyone wanted their own web server on their own IP address, it seemed conceptually simpler and less disruptive to virtualize the operating system and leave the rest of the stack untouched.

OS virtualization is useful for other things too, such as testing and debugging code and keeping ancient software running on old (or different) operating systems and kernel versions, and that only added fuel to the fire. In the end, instead of adapting the Unix network stack to multi-tenancy, we simply stuffed boxes inside boxes and moved on.

All of this comes at the cost of wasted hardware and energy. I would venture that the same 16-24-core Xeon with roughly 256 GB of RAM, which probably hosts fewer than a hundred 768-megabyte VMs, could host thousands of user workloads if they ran directly on a single shared kernel instead of being wrapped in hypervisors and bloated containers. How much less CO2 would go into the atmosphere if every data center could shrink tenfold?

Containers like Docker partially solve the problem. You could argue that containerization is exactly the multi-tenancy I am asking for, except that it still carries the legacy of virtualization by treating system images as gigantic statically linked binaries. It is also a step backward in usability. I still need root to run a container. I cannot (easily) log in to one container and launch another the way I launch processes on a plain multi-user Unix machine. Instead we have to build massive central management consoles and orchestration tools.

So what could have been done? What would the road not taken have looked like?

Multi-tenant networking


Sometimes path dependence takes hold because so many new decisions have been built on top of the old one that going back and changing it is too expensive. In this case, though, it seems to me that a few simple changes could have rewound 20 years of DevOps history and sent us down a different path, toward a much simpler and radically more efficient architecture. It would not solve every problem, but it would eliminate a lot of the complexity in most common scenarios.

Step one: get rid of privileged ports. I think this is a one- or two-line change; probably just deleting an if. Anything that relies on something non-cryptographic like a port number for security will break, but such things are trivial to spoof anyway, so it deserves to break.
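
For illustration, the check in question looks roughly like the fragment below. This is a simplified paraphrase of the kind of test in the Linux bind() path, not a literal quote of kernel source; modern kernels gate it behind the CAP_NET_BIND_SERVICE capability and a tunable threshold, and "removing the if" means dropping or relaxing exactly this kind of test:

```c
/* Simplified paraphrase (not literal kernel code): reject a bind to a
 * low port unless the caller holds CAP_NET_BIND_SERVICE. Removing this
 * check is the one- or two-line change the text refers to. */
if (port != 0 && port < 1024 /* the historical privileged threshold */ &&
    !capable(CAP_NET_BIND_SERVICE))
        return -EACCES;   /* userspace sees "bind: permission denied" */
```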

Step two: extend user and group permissions and ownership to network resources by introducing read/write/bind UID/GID permission masks (bind taking the place of execute) for devices and IP addresses. A default bind() to the unspecified address (0.0.0.0 or ::0) would listen for packets or connections on every interface and address for which the current user has bind permission, and outgoing connections would by default leave through the first interface the user owns.
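
Here is a purely hypothetical sketch of what such a permission check could look like, modeled in ordinary userspace C. None of these structures or functions exist in any real kernel; names like net_acl_entry and net_bind_allowed are made up for illustration of the rwx-style idea:

```c
/* Hypothetical model of per-user network-resource permissions,
 * analogous to rwx bits on files. Illustration only. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NET_PERM_READ  0x4   /* may receive on the address          */
#define NET_PERM_WRITE 0x2   /* may send from the address           */
#define NET_PERM_BIND  0x1   /* may bind()/listen() on the address  */

struct net_acl_entry {
    uint32_t uid;            /* owning user                         */
    uint32_t gid;            /* owning group                        */
    const char *addr;        /* IP address the entry applies to     */
    unsigned user_perm;      /* permissions for the owner           */
    unsigned group_perm;     /* permissions for the group           */
    unsigned other_perm;     /* permissions for everyone else       */
};

/* Example table: user 1001 "owns" 10.0.0.5 and may bind it; everyone
 * else may only send and receive through it. */
static const struct net_acl_entry acl[] = {
    { 1001, 100, "10.0.0.5",
      NET_PERM_READ | NET_PERM_WRITE | NET_PERM_BIND,
      NET_PERM_READ | NET_PERM_WRITE,
      NET_PERM_READ | NET_PERM_WRITE },
};

static bool net_bind_allowed(uint32_t uid, uint32_t gid, const char *addr)
{
    for (size_t i = 0; i < sizeof(acl) / sizeof(acl[0]); i++) {
        const struct net_acl_entry *e = &acl[i];
        if (strcmp(e->addr, addr) != 0)
            continue;
        unsigned perm = (uid == e->uid) ? e->user_perm
                      : (gid == e->gid) ? e->group_perm
                                        : e->other_perm;
        return (perm & NET_PERM_BIND) != 0;
    }
    return false;  /* no entry: deny by default */
}

int main(void)
{
    printf("uid 1001 bind 10.0.0.5: %s\n",
           net_bind_allowed(1001, 100, "10.0.0.5") ? "allowed" : "denied");
    printf("uid 1002 bind 10.0.0.5: %s\n",
           net_bind_allowed(1002, 200, "10.0.0.5") ? "allowed" : "denied");
    return 0;
}
```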

Step three: it would probably also be a good idea to allow unprivileged creation of virtual network devices (tun/tap), for the same reason users can create their own processes and files. This is not critical, but it would be nice. Permissions for tools like tcpdump/pcap would also have to follow the new permission model for network resources.
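
Today, creating a tun device is exactly the kind of thing that needs elevated privileges (CAP_NET_ADMIN). A minimal sketch of the standard Linux tun API involved, which under the proposed model an ordinary user could call for interfaces they own:

```c
/* tun_demo.c - create a tun interface via /dev/net/tun and the standard
 * TUNSETIFF ioctl. On a stock system this fails with EPERM without
 * CAP_NET_ADMIN; the proposal is that a normal user could do this.
 * Compile with: cc -o tun_demo tun_demo.c
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>

int main(void)
{
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0) {
        perror("open /dev/net/tun");
        return 1;
    }

    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TUN | IFF_NO_PI;           /* plain IP tun device       */
    strncpy(ifr.ifr_name, "demo%d", IFNAMSIZ - 1); /* kernel picks the number   */

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        perror("ioctl(TUNSETIFF)");                /* EPERM without CAP_NET_ADMIN */
        close(fd);
        return 1;
    }

    printf("created interface %s\n", ifr.ifr_name);
    close(fd);
    return 0;
}
```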

Now any service can run as a regular user. Root access is needed only for kernel updates, hardware or driver changes, and user management.

The road not taken is already overgrown


Because we went down the path of virtualization and containerization, we have let Unix's multi-tenancy capabilities atrophy. It will take some work to get them back into shape. But it seems to me it might be worth it, for the sake of simplicity and to eliminate needless resource consumption.

We would need to go back and harden user-mode isolation. I do not think that is too hard. Virtualization does not provide as much protection as many people think, and attacks like Rowhammer have shown that VMs are not a panacea. We have spent an incredible number of person-hours making virtualization robust and secure and building toolchains around it, not to mention what went into the container ecosystem. A fraction of that effort would likely be enough to protect user space on a minimal Linux host against privilege escalation and information leaks.

We would need to tighten other aspects of isolation and add options to limit what a user can see with commands like ps and netstat: the output should be restricted to the user's own resources. Package managers would need to support installing packages into a subdirectory of the user's home directory when not run as root, and so on. Changes would probably also be needed in system components like the dynamic linker, so that user binaries can more easily prefer shared libraries in their own local directories over system libraries when both exist. It would also be nice if the init system supported user-defined services, so that users did not have to resort to watcher scripts, cron jobs, and other hacks.

The end result would look a lot like containerization, but without the clumsiness, the bloat, the wheel-reinvention, and the inconvenience. You could deploy applications with git, running git checkout and git pull over ssh; orchestration could be done locally or peer-to-peer; and you could simply log in to a machine and run something without wading through a complex container management infrastructure. Applications would get lighter, too, because most programs could share the common standard libraries (libc, libstdc++, etc.) and standard tools already on the system, barring insurmountable obstacles such as library compatibility problems.

Conclusion


This is probably all water under the bridge. Most likely the real path forward will be making container multi-tenancy truly secure, so containers will end up solving a problem that could have been solved by extending the Unix permission model to the network subsystem and user space.

The point of this article is to show how small, unexamined decisions can have dramatic consequences for the future evolution of technology (and society). The 1970s decision to use port numbers as a built-in signal of trust between systems may turn out to be a trillion-dollar mistake that steered the evolution of the Unix platform toward far greater complexity, cost, and resource consumption.

But wait, maybe it is not all settled. There are more than a dozen Linux distributions, and most of them work more or less the same way. Adopting a new paradigm would be an interesting way for one of them to stand out. The first step would be to implement network permissions along the lines described above and propose a kernel patch. For backward compatibility, the new permissions could, for example, be activated through a sysctl setting, or shipped as a module (if modules are even capable of such deep changes).
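
As it happens, the port half of this already exists on reasonably recent Linux kernels as a per-network-namespace sysctl, net.ipv4.ip_unprivileged_port_start; setting it to 0 disables the privileged-port check entirely. A tiny sketch that just reads the current threshold:

```c
/* port_threshold.c - read the kernel's unprivileged-port threshold.
 * A value of 0 means any user may bind any port; older kernels do not
 * expose this sysctl at all.
 * Compile with: cc -o port_threshold port_threshold.c
 */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/net/ipv4/ip_unprivileged_port_start", "r");
    if (!f) {
        perror("fopen");       /* sysctl not present on this kernel */
        return 1;
    }

    int threshold = 1024;      /* historical default */
    if (fscanf(f, "%d", &threshold) == 1)
        printf("ports below %d require CAP_NET_BIND_SERVICE\n", threshold);
    fclose(f);
    return 0;
}
```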

Source: https://habr.com/ru/post/332896/

