Educational program for pseudo-random generators

Reflecting on the need to generate pseudo-random passwords was pushed by a rather bleak statistic of hacking passwords created using BRAIN v1.0; However, it is reckless to take some first-time software password generator and use it to change all passwords. I did not conduct a detailed analysis of ready-made generators, but I will tell you some fairly simple but informative facts related to the mathematics of generating good quality pseudo-random numbers that will allow you to choose the desired program yourself.

The basis of any program of this class is a pseudo-random generator , that is, a function G: {0,1} ⁿ -> {0,1} ^{n + 1} . Indeed, the task of a pseudo-random generator is to obtain at least one bit of “randomness” based on some initial data. Then, such a generator can be started several times to get the required number of bits and translate them into a character range, having received a new password. Actually, there are two problems:

where to get the initial data (generator initialization);
what functions can be used.

Initial data

A method of initialization of a random number generator that is familiar to every C programmer is setting the initial value according to the current time value:

srand ( time(NULL) );

This method is very bad, it's just a disaster! We read the description of the time function - and we see that it gives the number of whole seconds, starting at 00:00 hours, Jan 1, 1970 UTC.

Thus, knowing that a person used a certain generator that runs on the time function, it’s enough just to find out the date of its registration on the resource (or the last password change, for example, during the last visit to the site) and iterate through the 86400 initial values (so many seconds per day) that in minutes. And it does not matter that at the same time his password consists of hundreds of characters of different registers, numbers, special signs, because the only chance is the time of starting the generator.

The situation is somewhat better with the RDTSC function, which returns the number of cycles since the processor was reset, however, its values lie in the range from 0 to 2 ⁶⁴ -1 and are discrete for different processors with different steps; to satisfy one's paranoia is no good. In reality, this randomness margin will suffice only for 6-8 character password generated (depending on the character set used).
')
These are the simplest ways (and I hope not the most frequently used ones, otherwise the sense from such programs is zero). What can give a good enough initial value?

First, the analysis of the environment: the program, at startup, that considers the user's hash of documents. No one can predict what they will be, and we will get the initial value for the generator of almost any length and good quality. The exception is a practical “bare” computer, where there is nothing.

Secondly, it’s interactive with the user: making it necessary to press a dozen buttons is probably meaningless, because there are almost obvious patterns like qwerty123456 (this is not a password, but a random number generator initializer, which of course will produce a complex password later, like A6 @ sIp! V0 ] x, but it will be very easy to break it, knowing that you used the generator program), but now offering to pull the mouse is quite a good option, there will be a very large amount of random numbers, without general patterns.

Even better, turn on the camera or record from the microphone - a huge stream of “noisy” data, which, passing through a hash function, will give an excellent initializer of arbitrary length.

Hence some moral: it is not so important how the program generates the passwords, how much - where it gets the initial data. If there is no interaction at all, you should doubt it.

Generation functions

In fact, there is not much point in inventing something. The simplest object that can give a good chance is hash functions .

If the program uses something else, then there is every chance of not trusting it. No remarkable magic sequences like X _{i + 1} = (a * X _i + c) mod m will give any reliable (cryptographic randomness).

For hash functions, there is a fairly simple substantiation of the legality of their use. So.

A function F: {0,1} ⁿ -> {0,1} ^m is called a one-way function , if
1) it is computable in polynomial time;
2) there is no polynomial algorithm that correctly computes F ^-1 with a good probability (greater than 2 ^-m ).
If m = n, such a function is called a one-sided permutation .

In general, any pseudo-random generator is a one-way function.

And any one-way permutation allows constructing a pseudo-random generator:
G ({a ₁ , a ₂ , ... an}) = {a ₁ , a ₂ , ... an, F ({{a ₁ , a ₂ , ... a _n })}.

The general scheme of construction is as follows:

Where G is a call to a one-way function, for example, MD5, and y ', y, ... are bits of the random output password.

So, it is stated [*] that the existence of pseudo-random generators follows from the existence of one-way functions, which in turn exist if P! = NP.

For specific cases of hash functions (MD5, SHA1, GOST), there is simply no such algorithm for calculating F ^-1 to cast doubt on the randomness of the output values.

Thus, it will be possible to be afraid of using hash functions in pseudo-random generation only if it suddenly turns out that P = NP.

But it doesn’t bother us either: because if we manage to easily build the inverse of the hash function, the password will be easier to obtain by reversing the hash value from the database, rather than building a chain of pseudo-random bytes that lead to a theoretical password, which still needs to be checked for compliance from the base.

A simple thesis: even MD5 can be used for pseudo-random generators (for passwords)!

Block ciphers are often used to obtain pseudo-random generators, as well as elliptic curves and solutions of differential equations — and for good reason, but I will not touch them (for now?).

[*] Handbook of Applied Cryptography, A. Menezes, P. van Oorschot, S. Vanstone.

UDP: Author of a neat image - bad_guy

Source: https://habr.com/ru/post/122711/

All Articles

Educational program for pseudo-random generators

Initial data

Generation functions

More articles: