
Myths about /dev/urandom


Many of you have surely run into the myths surrounding /dev/urandom and /dev/random. Maybe you even believe some of them. In this post we tear the covers off these myths and examine the real strengths and weaknesses of these random number generators.

Myth 1: /dev/urandom is not safe. Always use /dev/random for cryptography.


This is usually claimed about fairly recent Linux-based systems, not about UNIX-like systems in general. In fact, /dev/urandom is the preferred source of randomness for cryptographic tasks on UNIX-like systems.

Myth 2: /dev/urandom is a pseudo-random number generator (PRNG), while /dev/random is a "true" random number generator.


In fact, both are cryptographically secure pseudo-random number generators (CSPRNGs). The differences between them are very small and have nothing to do with the degree of randomness.

Myth 3: for cryptographic tasks it is definitely better to use /dev/random. There would be no point in using /dev/urandom even if it were comparatively safe.


In fact, /dev/random has one very unpleasant problem: it blocks.

But that's good! The more entropy in /dev/random's pool, the more random its output. /dev/urandom, on the other hand, keeps handing out unsafe random numbers even after the entropy is exhausted.


No. "Entropy exhaustion" is a bogeyman, even before you take into account the availability problems and the user workarounds they provoke. Some 256 bits of entropy are enough to generate computationally secure numbers for a VERY long time.

The more amusing question is this: how would /dev/random even know how much entropy it has left?

But cryptography experts keep talking about constantly updating the initial state (re-seeding). Doesn't that contradict your last statement?


Partly true. Yes, the random number generator constantly refreshes its internal state using whatever sources of entropy are available to it. But the reasons for this lie (partly) elsewhere. I am not claiming that consuming entropy is bad. Not at all. I am only talking about the harm of blocking when the entropy estimate falls.

That's all fine, but even the man page for /dev/(u)random contradicts your statements. Does anyone at all share your point of view?


I am not disputing the man pages at all. My guess is that you don't quite grasp all the cryptographic jargon yet, so you read the man page as confirming that /dev/urandom is insecure for cryptographic tasks. But the man page only advises against /dev/urandom in certain cases, and in my view those are not critical. At the same time, it recommends /dev/urandom for "normal" cryptographic tasks.

Appealing to authority is nothing to be proud of, but in cryptography it pays to approach questions carefully and to listen to the experts in a specific area.

And yes, quite a few experts share my view that, in the context of cryptography on UNIX-like systems, /dev/urandom is the best random number generator. As you might guess, their collective opinion influenced me, not the other way around.

* * *
It's probably hard for many of you to believe all this. "He must be mistaken!" Well, let's go through everything said in detail, and you can decide for yourself whether I'm right. But before we start the analysis, answer for yourself: what is randomness? More precisely, what kind of randomness are we talking about here?

I am not trying to be condescending. This text was written so that I can point to it in the future whenever the debate about random number generators flares up again; here I hone my arguments. Besides, I am interested in other opinions. It is not enough to simply declare that "/dev/urandom is bad": you have to pin down exactly what you disagree with and deal with that.

"He's an idiot!"


I strongly disagree! I myself once believed that /dev/urandom is insecure. And we can hardly be blamed for believing it, because so many respected programmers and developers keep telling us so on forums and in social networks. Even the man pages seem to say the same. And who are we to argue with their convincing argument about "entropy exhaustion"?

This deeply mistaken opinion has taken root not because people are stupid, but because very few people seriously understand cryptography (in particular, the vague notion of "entropy"). So the authorities convince us easily. Even our intuition agrees with them. Unfortunately, intuition knows as little about cryptography as most of us do.

True randomness


What are the criteria for the "true randomness" of a number? We won't wander into the weeds here, because the discussion would quickly drift into philosophy. In such discussions it very quickly becomes clear that everyone is talking about their own favorite model of randomness, not listening to others and not even caring to be understood.

I believe that quantum effects are the gold standard of "true randomness". They occur, for example, when photons pass through a semi-transparent mirror, or when alpha particles are emitted by radioactive material. That is, ideal randomness exists in certain physical phenomena. Someone may object that these are not truly random either, or that nothing in the world can be random at all. In short, down that road lies a philosophical dead end.

Cryptographers usually stay out of such philosophical debates, because they do not operate with the notion of "truth"; they operate with the notion of "unpredictability". If no one has any information about the next random number, then everything is fine. And that is exactly what you should focus on when using random numbers in cryptography.

In general, I don't care much about all those "philosophically secure" random numbers, as I privately call "truly" random numbers.

Of the two kinds of security, only one matters


But let's assume you have managed to obtain "true" random numbers. What will you do with them? Print them out and hang them on your bedroom wall, admiring the beauty of the quantum universe? Why not; I can understand that attitude.

But you are probably going to use them, and for cryptographic purposes at that? And here things look less rosy. You see, your truly random numbers, blessed by the grace of quantum effects, are fed into the mundane algorithms of the real world. And the problem is that almost none of the cryptographic algorithms we use provide information-theoretic security. They provide "only" computational security. Only two exceptions come to mind: Shamir's secret sharing scheme and the Vernam cipher (the one-time pad). And if the first can serve as a counterexample (assuming you actually intend to use it), the second is extremely impractical. All the other algorithms, AES, RSA, Diffie-Hellman, elliptic curves, and crypto packages such as OpenSSL, GnuTLS, Keyczar, and the various cryptographic APIs, are only computationally secure.

What is the difference? Information-theoretically secure algorithms are secure, period, while all other algorithms cannot guarantee security against an attacker with unlimited computing power who tries all possible key values. We still use these algorithms because, even with all the computers in the world combined, such a search would take longer than the universe has existed. That is the level of "insecurity" we are talking about.

But this holds only until some very clever person breaks the algorithm with far more modest computing power. Every cryptanalyst dreams of such success: fame through breaking AES, RSA, and so on. And once the "ideal" hash algorithm or the "ideal" block cipher is broken, it no longer matters that you have your "philosophically secure" random numbers: you simply have nowhere to use them safely. So it is better to use computationally secure random numbers in computationally secure algorithms. In other words, use /dev/urandom.

The structure of Linux's random number generator: the wrong picture


Most likely, this is how you imagine the random number generator built into the kernel:

[Image: the mistaken model of the kernel RNG]

"True" randomness, possibly somewhat distorted, enters the system. Its entropy is estimated and immediately added to the internal entropy counter. After whitening, the result goes into the kernel pool, from which both /dev/random and /dev/urandom draw their numbers. /dev/random reads from the pool directly, provided the entropy counter covers the requested amount; the counter decreases accordingly. Otherwise, /dev/random blocks until a fresh portion of entropy enters the system.

The important point is that, in this picture, the data handed out by /dev/random is always whitened. The same goes for /dev/urandom, except for the moments when the system lacks entropy: then, instead of blocking, it hands out "low-quality" random numbers from a CSPRNG sitting outside the scheme just described. That CSPRNG is seeded only once (or re-seeded occasionally, it doesn't matter) from the data in the pool, and it cannot be considered safe.

For many, this is a compelling reason to avoid /dev/urandom on Linux: when entropy is sufficient, you get the same data as from /dev/random, and when it is not, an external CSPRNG kicks in that almost never sees high-entropy input. Awful, isn't it? Unfortunately, everything described above is fundamentally wrong. In reality, the internal structure of the random number generator looks different:

[Image: the actual architecture of the kernel RNG]

This simplification is still rather crude: in reality, not one but three entropy pools are used: the primary (input) pool, a pool for /dev/random, and a pool for /dev/urandom.

The last two pools are fed from the primary one. Each pool has its own entropy counter, but the counters of the two secondary pools stay close to zero: "fresh" entropy flows in from the primary pool as needed, decreasing its counter. There is also a lot of mixing and feeding of data back into the system, but those details do not matter for our discussion.

Notice the difference? The CSPRNG is not bolted on as a fallback to the main generator; it is an integral part of the random number generation process. There is no "clean and good" whitened randomness handed straight to /dev/random. The input from every source is thoroughly mixed and hashed inside the CSPRNG, and only then emitted as random numbers for /dev/random or /dev/urandom.

Another important difference is that entropy here is not counted but estimated. The amount of entropy delivered by a source is not self-evident, unlike the data itself. Remember that your beloved /dev/random hands out only as many random numbers as its entropy estimate allows. Unfortunately, estimating entropy is genuinely hard. The Linux kernel takes the arrival times of events and computes, from their successive differences, "how unexpected", according to a certain model, each event was. Whether this is an effective way to estimate entropy is open to question. Hardware-induced delays in the event times cannot be discounted either, and the polling rate of hardware components also plays a role, since it directly affects the values and the granularity of the event timestamps.
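To make the idea concrete, here is a toy sketch in C, loosely modeled on the delta-based estimator the kernel has historically used; the names and the 11-bit cap are illustrative, not the actual kernel code:

```c
#include <stdlib.h>

/* Toy model of a per-event entropy estimate: keep the first, second and
 * third differences of event timestamps and credit roughly log2 of the
 * smallest one, capped at 11 bits per event. */
struct timer_state {
    long last_time, last_delta, last_delta2;
};

static int ilog2_floor(long v)
{
    int bits = 0;
    while (v > 1) { v >>= 1; bits++; }
    return bits;
}

/* Returns the number of entropy bits credited for one event. */
int estimate_entropy_bits(struct timer_state *s, long event_time)
{
    long delta  = event_time - s->last_time;
    long delta2 = delta  - s->last_delta;
    long delta3 = delta2 - s->last_delta2;

    s->last_time   = event_time;
    s->last_delta  = delta;
    s->last_delta2 = delta2;

    delta  = labs(delta);
    delta2 = labs(delta2);
    delta3 = labs(delta3);
    if (delta2 < delta) delta = delta2;
    if (delta3 < delta) delta = delta3;

    /* A perfectly periodic source has delta2 == 0: zero credit.
     * A jittery source gets roughly log2(jitter) bits, capped. */
    int bits = delta ? ilog2_floor(delta) : 0;
    return bits > 11 ? 11 : bits;
}
```

Note how conservative this is: anything regular earns no credit at all, which is why the counter drains so easily.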

All we really know is that the kernel's entropy estimation is implemented reasonably well, which is to say conservatively. One can argue about how good it actually is, but that is beyond the scope of this conversation. You may be nervous about the entropy available for generating random numbers, but I personally like the current estimation mechanism.

To summarize: /dev/random and /dev/urandom are fed by the same CSPRNG. They differ only in their behavior when the entropy estimate runs out: /dev/random blocks, while /dev/urandom does not.
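In practice, that means getting key material is as simple as reading the device. A minimal sketch (any system with /dev/urandom):

```c
#include <stdio.h>
#include <stdlib.h>

/* Fetch a 256-bit key from /dev/urandom. The read never blocks, and on
 * a normally booted system the output is exactly as good as
 * /dev/random's. */
int get_key(unsigned char key[32])
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (!f)
        return -1;
    size_t n = fread(key, 1, 32, f);
    fclose(f);
    return n == 32 ? 0 : -1;
}

int main(void)
{
    unsigned char key[32];
    if (get_key(key) != 0) {
        perror("urandom");
        return EXIT_FAILURE;
    }
    for (int i = 0; i < 32; i++)
        printf("%02x", key[i]);
    putchar('\n');
    return EXIT_SUCCESS;
}
```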

What is wrong with blocking?


Have you ever had to wait for /dev/random to produce random numbers? Generating PGP keys inside a virtual machine, say? Or connecting to a web server that needs a portion of random numbers for a dynamic session key? That is the problem. Blocking is, in essence, a denial of availability: your system is temporarily out of order. It isn't doing what it is supposed to do.

On top of that, blocking has a bad psychological effect: people don't like being obstructed. I work on the security of industrial automation systems, for example. Why do you think security breaches most often occur there? Because of deliberate actions by the users themselves. Some measure intended to provide protection simply takes too long, in the employee's opinion, or is too inconvenient. And when people need to find "informal solutions", they show miracles of resourcefulness: they look for workarounds and invent elaborate schemes to make the system work. People who do not understand cryptography. Ordinary people.

Why not patch the random() call? Why not ask on a forum how some weird ioctl can be used to bump the entropy counter? Why not disable SSL altogether? In the end, you simply teach your users to do idiotic things that undermine your security system without their even knowing it. You can despise availability, usability, and other such things all you want. Security is paramount, right? Better to be inconvenient, unavailable, or useless than to merely pretend to provide security.

But this is a false dichotomy. Security can be provided without blocking, because /dev/urandom gives you exactly the same random numbers as /dev/random.

There is nothing wrong with a CSPRNG


Now the picture looks grim: if even the high-quality numbers from /dev/random come out of a CSPRNG, how can we use them in tasks that demand a high level of security? It turns out that "looking random" is the main requirement for most of our cryptographic building blocks. The output of a cryptographic hash must be indistinguishable from a random string for cryptographers to accept it. And the output of a block cipher, without knowledge of the key, must be indistinguishable from random data.

Do not fear that someone will exploit some weakness in the CSPRNG and thereby crack your cryptographic modules. You have no choice but to live with this anyway, since block ciphers, hashes, and everything else rest on the same mathematical foundation as the CSPRNG. So relax.

What about exhaustion of entropy?


It doesn't matter. The basic cryptographic primitives are designed so that an attacker cannot predict the output, provided there was enough randomness (entropy) at the start. The usual lower bound of "enough" is 256 bits, no more. So forget about your entropy counter. As we discussed above, the random number generator built into the kernel cannot even count the entropy entering the system precisely. It can only estimate it, and it is not even clear exactly how good that estimate is.
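To see why 256 bits suffice, here is a sketch of seed expansion: ChaCha20, used here merely as one possible CSPRNG core (not as the kernel's actual algorithm), turns a single 32-byte key into an effectively endless keystream. Predicting any of it means breaking the cipher itself:

```c
#include <openssl/evp.h>   /* link with -lcrypto */

/* Expand one 256-bit seed into `len` bytes of keystream by running
 * ChaCha20 over zeros. Illustrative only: a real CSPRNG also handles
 * nonce management and re-seeding. */
int expand_seed(const unsigned char key[32], unsigned char *out, int len)
{
    unsigned char iv[16] = {0};      /* 4-byte counter + 12-byte nonce */
    unsigned char zeros[256] = {0};
    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    int n, produced = 0;

    if (!ctx || EVP_EncryptInit_ex(ctx, EVP_chacha20(), NULL, key, iv) != 1)
        return -1;
    while (produced < len) {
        int chunk = len - produced < 256 ? len - produced : 256;
        if (EVP_EncryptUpdate(ctx, out + produced, &n, zeros, chunk) != 1)
            return -1;
        produced += n;
    }
    EVP_CIPHER_CTX_free(ctx);
    return 0;
}
```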

Updating the initial state (re-seeding)


But if entropy matters so little, why is fresh entropy constantly fed into the generator? Admittedly, an excess of entropy can even be harmful. But there is another, more important reason to update the generator's initial state. Imagine that an attacker has learned the internal state of your generator. This is the most dreadful security situation imaginable: the attacker has full access, and from that moment on can compute all future output.

But as time passes, new portions of fresh entropy flow in and are mixed into the internal state, and its randomness grows back. This is a kind of self-healing mechanism built into the generator's design. Note, however, that entropy here is added to the internal state; it has nothing to do with blocking.
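Conceptually, re-seeding is just hashing fresh entropy into the old state. A sketch of that recovery property, using OpenSSL's SHA-256 as a stand-in for the kernel's actual mixing function:

```c
#include <string.h>
#include <openssl/sha.h>   /* link with -lcrypto */

/* Illustrative re-seed step, not the kernel's real mixing code: the new
 * state is a hash of the old state plus the fresh entropy. Once entropy
 * unknown to the attacker has been mixed in, knowledge of the old state
 * no longer predicts future output. */
void reseed(unsigned char state[SHA256_DIGEST_LENGTH],
            const unsigned char *fresh, size_t fresh_len)
{
    SHA256_CTX ctx;
    unsigned char next[SHA256_DIGEST_LENGTH];

    SHA256_Init(&ctx);
    SHA256_Update(&ctx, state, SHA256_DIGEST_LENGTH);
    SHA256_Update(&ctx, fresh, fresh_len);
    SHA256_Final(next, &ctx);

    memcpy(state, next, SHA256_DIGEST_LENGTH);
}
```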

The urandom man page


The man page has no equal when it comes to sowing fear in programmers' minds:

A read from /dev/urandom will not block waiting for more entropy. As a result, if there is not enough entropy in the pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current unclassified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead.

No such attack is described anywhere, but the NSA / FSB surely have something up their sleeve, right? And if this worries you (it should worry you!), then /dev/random solves all your problems. Even if a way to carry out such an attack were known to the intelligence agencies, to script kiddies, or to the boogeyman, simply flaunting it would be irrational. I'll go further: the open literature does not describe practical attacks on AES, SHA-3, or any other comparable ciphers and hashes either. Are you going to give those up too? Of course not. The advice to "use /dev/random instead" is especially touching in light of what we now know about its source shared with /dev/urandom. If you need information-theoretically secure random numbers (you don't!) and for that reason cannot use a CSPRNG, then /dev/random is useless to you anyway! The man page is talking nonsense, that's all. But its authors at least try to make amends:

If you are unsure about whether you should use /dev/random or /dev/urandom, then probably you want to use the latter. As a general rule, /dev/urandom should be used for everything except long-lived GPG/SSL/SSH keys.

Wonderful. If you want to use /dev/random for your long-lived keys, go right ahead, though I don't consider it necessary; you will merely have to wait a few seconds and tap on the keyboard a bit. But I beg you: do not make everyone endlessly reconnect to your mail server just because you "want to be safe".

Dedicated to the orthodox


Below are a few interesting statements I found on the Internet. If you would rather take someone's word on /dev/random versus /dev/urandom, take the word of real cryptographers.

Daniel Bernstein aka djb:

Cryptographers are not responsible for this superstition. Think about it: the person who wrote the /dev/random man page seriously believes that
  1. we don't know how to deterministically expand one 256-bit /dev/random output into an endless stream of unpredictable keys (which is what we need from urandom), but
  2. we do know how to use a single key to safely encrypt many messages (which is what we need from SSL, PGP, and so on).

For a cryptographer, this doesn't even pass the laugh test.

Thomas Pornin, one of the most helpful users I have ever encountered on Stack Exchange:

The short answer is yes. The long answer is also yes. /dev/urandom yields data that is indistinguishable from true randomness, given existing technology. There is no point in chasing "better" randomness than what /dev/urandom provides, unless you are using one of the few "information-theoretic" cryptographic algorithms, which you are not, because otherwise you would know it. The urandom man page is misleading when it suggests preferring /dev/random because of "entropy exhaustion".

Thomas Ptacek is not a real cryptographer in the sense of designing algorithms or cryptosystems, but he founded a security consulting firm with a solid reputation built on numerous penetration tests and on breaking bad cryptography:

Use urandom. Use urandom. Use urandom. Use urandom. Use urandom. Use urandom.

Nothing is perfect


/dev/urandom is not perfect, for two reasons. On Linux, unlike FreeBSD, it never blocks. Remember that the entire security guarantee rests on some initial supply of randomness, a seed. On Linux, /dev/urandom will cheerfully hand you not-so-random numbers even before the kernel has had any chance to gather entropy. When does that happen? At machine startup, while the computer is booting. FreeBSD does the right thing: there is no difference between /dev/urandom and /dev/random; they are the same device. At startup, /dev/random blocks once, until enough entropy has accumulated, and then it never blocks again.

On Linux, things are not as bad as they look at first glance either. In every distribution, a certain number of random numbers are written during boot to a seed file that is read back at the next boot. The write happens only after enough entropy has accumulated, since the script is not run immediately after you press the power button; so you inherit entropy from the previous session. This is not quite as good as letting the shutdown scripts write the seed, since entropy would then have had much longer to accumulate, but it means you do not depend on the system shutting down cleanly (resets and crashes do no harm). It also does not help on the machine's very first boot, but distribution installers write a seed file too, so on the whole it works out.
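The mechanics are simple: at boot, write the saved seed into /dev/urandom (which mixes it into the pool without crediting the entropy counter), and save a fresh seed once entropy has accumulated. A sketch, with a hypothetical seed-file path:

```c
#include <stdio.h>

/* Sketch of what distro init scripts do with a seed file; the path and
 * size are illustrative. Writing to /dev/urandom mixes the data into
 * the pool without crediting the entropy counter, which is what we
 * want: the bytes still help even if an attacker has read the file. */
#define SEED_FILE "/var/lib/random-seed"   /* hypothetical path */
#define SEED_SIZE 512

static int copy(const char *from, const char *to)
{
    unsigned char buf[SEED_SIZE];
    FILE *in = fopen(from, "rb"), *out = fopen(to, "wb");
    int ok = in && out
          && fread(buf, 1, SEED_SIZE, in) == SEED_SIZE
          && fwrite(buf, 1, SEED_SIZE, out) == SEED_SIZE;
    if (in) fclose(in);
    if (out) fclose(out);
    return ok ? 0 : -1;
}

int load_seed(void) { return copy(SEED_FILE, "/dev/urandom"); }  /* at boot  */
int save_seed(void) { return copy("/dev/urandom", SEED_FILE); }  /* later on */
```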

More recently, Linux gained the getrandom(2) system call, which first appeared in OpenBSD as getentropy(2). It blocks until a sufficient initial amount of entropy has accumulated and then never blocks again. True, it is a system call rather than a character device, so it is not as easy to reach from the shell or from scripting languages.
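From C, though, it is nearly a one-liner. A minimal sketch (Linux 3.17+ with glibc 2.25+):

```c
#include <stdio.h>
#include <sys/random.h>

/* getrandom(2) with flags == 0 blocks only until the pool has been
 * initialized once after boot, then never again: exactly the FreeBSD
 * behavior praised above. */
int main(void)
{
    unsigned char key[32];
    ssize_t n = getrandom(key, sizeof key, 0);
    if (n != (ssize_t)sizeof key) {
        perror("getrandom");
        return 1;
    }
    for (int i = 0; i < 32; i++)
        printf("%02x", key[i]);
    putchar('\n');
    return 0;
}
```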

Another problem is virtual machines. People love cloning them or rolling them back to earlier snapshots, and in those cases a seed file will not help you. But the answer is not to use /dev/random everywhere; it is to carefully re-seed each virtual machine after every clone, rollback, and so on.

TL;DR


Just use / dev / urandom.

Source: https://habr.com/ru/post/273147/

