Brief history of random numbers

“When I set out to get a really random number, I didn’t find anything better than a regular dice,” wrote Francis Galton in 1890 in the journal Nature . “After the bones are shaken and thrown into the basket, they hit each other and against the walls of the basket in such an unpredictable way that even after a light throw it becomes completely impossible to predetermine its result.”

(Dice from the time of the Roman Empire)

How can we generate a uniform sequence of random numbers? Accident, so beautiful in its diversity, is often found in living nature, but it was not always easy to extract it by artificial means. The most ancient of the dice were found in the Middle East, and they date back to about 24th century BC. Another example would be China, where as early as the 11th century BC, a blow-up of a turtle shell was used, with further interpretation of the size of the random parts obtained. Centuries later, people chopped the stalks of plants several times and compared the sizes of the parts obtained.

')
But by the mid-40s of the twentieth century, humanity needed significantly more random numbers than could be obtained from throwing bones or cutting stems. The American analytical center RAND created a machine that was able to generate random numbers using a random pulse generator (something like electronic roulette). They launched it and after some time received a sufficient number of random numbers, which were published in the form of the book " One Million Random Numbers and One Hundred Thousand Normal Deviations ."

You can read the book online or buy a 2001 paper reprint on Amazon . What today may seem an absurd art project, at that time was a serious breakthrough. For the first time, a really large sequence of truly random numbers has become available to scientists.

Another similar machine, ERNIE, built in the famous Bletchley Park today in the 40s of the twentieth century, was used to generate random numbers in the British lottery Premium Bond . In order to dispel fears about the randomness and honesty of the results, the documentary “The Importance of Being ERNIE” was even filmed. Today you can watch it on Youtube:

In 1951, randomness was finally formalized and embodied in the real computer Ferranti Mark 1 , which came with a built-in random data generator based on shot noise and instructions for obtaining 20 random bits. This functionality was developed by Alan Turing. His colleague Christopher Strachey used it to create a "generator of love letters." Here is an example of the text that was generated by this program:

JEWEL LOVE MY LIKING HUNGERS FOR YOUR ADORABLE INFATUATION. YOU ARE MY EROTIC ARDOUR.: MY FOND RAPTURE. MY THIRST SIGHS FOR YOUR INFATUATION. MY HEART SEDUCTIVELY WISHES YOUR BREATHLESS LONGING. YOURS CURIOUSLY MUC

But the Turing random number generator did not make the programmers of those years happy, because it introduced a factor of even greater instability into the already new and unstable computer systems. Desiring to achieve stable operation of a program — without debuggers and with random data — one could never achieve repeatability of results. Then the following thought arose: “And what if the random number generator can be represented as some deterministic function?” Then it would turn out that, on the one hand, the generated numbers would have signs of random, but on the other hand, with the same initialization of the generator the sequences would be the same each time. So pseudo-random number generators (PRNG) appeared.

John von Neumann developed the PRNG in 1946. His idea was to start with a certain number, take his square, discard the numbers from the middle of the result, again take the square and discard the middle, and so on. The resulting sequence, in his opinion, had the same properties as random numbers. The von Neumann theory did not stand the test of time, because whatever initial number you choose, the sequence thus generated degenerates into a short cycle of repetitive values, like 8100, 6100, 4100, 8100, 6100, 4100, ...

It is impossible to completely avoid cycles when we work with a deterministic function, whose subsequent values are based on the previous ones. But what if the cycle period is so long that in practice it will no longer matter?

The mathematician Derrick Henry Lemer developed this idea in 1949 with the invention of the linear congruential method . Here is a pseudo-random number generator based on the Lehmer approach and written in Javascript:

 // The Central Randomizer 1.3 (C) 1997 by Paul Houle (paul@honeylocust.com) // See: http://www.honeylocust.com/javascript/randomizer.html rnd.today=new Date(); rnd.seed=rnd.today.getTime(); function rnd() { rnd.seed = (rnd.seed*9301+49297) % 233280; return rnd.seed/(233280.0); }; function rand(number) { return Math.ceil(rnd()*number); };

You can see in the code a number of "magic constants". These numbers (usually simple) were chosen in such a way as to maximize the cycle period before repeating the sequence of results. This generator uses the current time as its initial value and has a period of about 2 ^ 31. It became popular with the release of JavaScript 1.0, since it did not yet have the built-in function Math.random () , and everyone wanted their banners to be replaced randomly. "I would not use this algorithm for something related to security, but for many applications it is enough," wrote the author of the above code Paul Houle.

But the Internet just required something related to security. SSL appeared in 1995 and its implementation required a high-quality pseudo-random number generator. This led to a series of fairly wild experiments in this area. If you look at the patents related to the generation of random numbers issued in those times, you will get the feeling that you are looking at the modernized attempts to build the first aircraft.

Most popular processors in the 90s did not have special instructions for generating random numbers, so choosing a good initial value for a pseudo-random number generator was very important. This resulted in security problems when Phillip Hallam-Baker discovered that in the Netscape browser (dominant at the time), the SSL implementation used a combination of the current time and its process ID to initialize the PRNG. He showed how a hacker can easily pick up this value and decrypt SSL traffic. Guessing the initial value of the algorithm for generating pseudo-random numbers is still a popular attack, although over the years it has become slightly more difficult. Here is an example of a successful attack published on Hacker News in 2009.

In 1997, programmers were limited in ways to get truly random numbers, so the SGI team created LavaRand, which consisted of a webcam aimed at a pair of lava-lamps standing side by side on the table. The picture from this camera was a great source of entropy - a Real Random Number Generator, just like Turing. But this time it was possible to generate 165 KB of random numbers per second. The invention was immediately patented .

Most of the experiments in this area did not stand the test of time, but one PRTG, named Mersenn's Whirlwind and developed in 1997 by Makoto Matsumoto and Takuji Nishimura, was able to keep the palm. It combined performance and quality results, which allowed it to quickly spread to many languages and frameworks. The Mersenne Vortex is a turn-around shift register with generalized recoil and generates a deterministic sequence with a huge cycle period. In the current implementation, the period is 2¹⁹⁹³⁷− 1, and it is this implementation that is included today in most programming languages.

In 1999, Intel added a hardware random number generator to the i810 chipset. It was good, because this implementation gave truly random numbers (based on thermal noise), but it didn’t work as fast as software PRNG, so most cryptographic applications still use PRNG, which at least can now be initialized value from the hardware random number generator.

This leads us to the notion of a cryptographically-safe pseudo-random number generator (KBGPSCH). (Oh, those abbreviations! Not surprisingly, computer science seems boring to some people.) KBGPCh became very popular in the SSL era. What requirements should be met by CBHRC? Well, here's a 131-page document with these requirements, have fun. As you already understand, the requirements are not few. For example, the same Mersenna Vortex does not satisfy them, since, having a sufficiently large sequence of numbers generated by it, one can guess the next number.

In 2012, INTEL added RDRAND and RDSEED instructions to its chips, allowing to get real random numbers based on the same temperature fluctuations, but now at speeds up to 500 Mb / s, which would make the use of software GPSNC unnecessary. But in society, rumors and doubts about the honesty of these instructions roam. Are there any specially made bookmarks for the NSA? No one can say that for sure. Rather, someone probably can, but he definitely will not write about it on Twitter.

(Random data from Redoubler hardware random number generator )

In recent years, hardware random number generators from third-party manufacturers and even completely open (both in terms of software and hardware) have begun to appear. The strength of these products is precisely in openness: you can study them yourself and even build houses from the most widely available components. Examples include REDOUBLER and Infinite Noise TRNG .

Today, people are still leading discussions about what kind of random number generator should be used in a particular system, OS kernel, programming language, cryptographic library, etc. There are many options for algorithms, sharpened on speed, memory savings, security. And each of them is constantly under attack by hackers and security experts. But for non-security daily tasks, you may well rely on data from / dev / random or the rand () function available on most platforms. This will give you a fairly large and random sequence of data that would make Alan Turing happy at the time.

Source: https://habr.com/ru/post/329778/

All Articles

Brief history of random numbers

More articles: