Risks and problems of password hashing

Security has always been a contentious topic that provokes heated debate, largely because there are so many points of view and “ideal solutions” that suit some people and are completely inappropriate for others. I believe that breaking an application's security system is just a matter of time. Because computing power keeps growing rapidly and systems keep getting more complex, applications that are secure today will no longer be secure tomorrow.

Translator's note: for a more complete picture, a translation of “Hashing Passwords with the PHP 5.5 Password Hashing API”, which the author refers to in the article, is included below as well.

If you have not studied hashing algorithms, you most likely think of them as a one-way function that converts variable-length data into fixed-length data. Let's analyze this definition:

One-way function: no efficient algorithm can recover the original data from a hash.
Converting variable-length data into fixed-length data: the input may be arbitrarily long, but the output has a fixed length. This implies that two or more inputs can have the same hash. The shorter the hash, the higher the probability of a collision.

The MD5 and SHA-1 algorithms no longer provide sufficient collision resistance (see the birthday paradox). It is therefore recommended to use algorithms that generate longer hashes (SHA-256, SHA-512, Whirlpool, etc.), which make the probability of a collision negligible. Such algorithms are also called “pseudo-random functions”, i.e. their output is indistinguishable from the output of a true random number generator (TRNG).
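As a rough illustration of the "fixed-length output" property, the sketch below hashes one input with several algorithms through PHP's hash() function and compares the digest sizes. The input strings are arbitrary placeholders, not taken from the article.

```php
<?php
// Hash the same input with several algorithms and compare digest lengths.
$input = 'correct horse battery staple';

foreach (['md5', 'sha1', 'sha256', 'sha512', 'whirlpool'] as $algorithm) {
    $digest = hash($algorithm, $input);   // lowercase hex string
    $bits   = strlen($digest) * 4;        // 4 bits per hex character
    printf("%-10s %3d bits  %s...\n", $algorithm, $bits, substr($digest, 0, 16));
}

// The digest length does not depend on the input length.
$short = hash('sha256', 'a');
$long  = hash('sha256', str_repeat('a', 100000));
var_dump(strlen($short) === strlen($long));
```

A longer digest means a larger output space, which is what pushes the collision probability down.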

Disadvantages of simple hashing

The fact that no efficient algorithm can reverse a hash and restore the original data does not mean that you cannot be hacked. With a little searching you can find databases of hashes of common words and short phrases. In addition, simple passwords can be cracked quickly and easily by brute force or with a dictionary attack.
Here is a quick demonstration of how the sqlmap tool, using SQL injection, cracks passwords by brute-forcing hashes generated with the MD5 algorithm.

Attackers can make it even easier by simply googling specific hashes in online databases:

You also need to understand that identical passwords produce identical hashes, so cracking one hash gives access to every account that uses the same password. For example, suppose we have several thousand users; several of them will surely use the password 123456 (unless the site enforces a password complexity policy). The MD5 hash of this password is e10adc3949ba59abbe56e057f20f883e. So if you obtain this hash and search the database for this value, you will find all users with that password.
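The effect can be sketched in a few lines. The $users array below is a hypothetical stand-in for a database table of unsalted MD5 hashes; the logins and the second password are invented for illustration.

```php
<?php
// Hypothetical user table: login => unsalted MD5 hash of the password.
$users = [
    'alice' => md5('123456'),
    'bob'   => md5('qwerty'),
    'carol' => md5('123456'),
];

// Once a single hash has been cracked, every matching row falls with it.
$crackedHash = 'e10adc3949ba59abbe56e057f20f883e'; // MD5 of "123456"
$affected = array_keys($users, $crackedHash, true);

print_r($affected); // logins that share the cracked password
```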

Why are salted hashes unsafe?

To make attacks of this kind harder, a so-called salt is applied. This is a standard tool, but with modern computing power it is no longer enough, especially if the salt is short.

In general, the function using salt can be represented as follows:

f (password, salt) = hash (password + salt)

To make a brute-force attack difficult, the salt should be at least 64 characters long. The problem is that the salt must be stored in the database in plain text so that the user can be authenticated later.

if (hash ([password entered] + [salt]) == [hash]) then the user is authenticated

Because the salt is unique for each user, identical passwords no longer produce identical hashes: now all the hashes are different. Looking hashes up on Google or in precomputed hash databases also stops working. But if an attacker gains access to the salts or to the database through SQL injection, they can still mount a successful brute-force or dictionary attack, especially if users choose common passwords (such as 123456).

Nevertheless, cracking one password no longer automatically reveals the other users who have the same password, because ALL the hashes are different.
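A minimal sketch of the salted scheme described above, assuming PHP 7 or later for random_bytes() and PHP 5.6 or later for hash_equals(). The function names createSaltedRecord() and checkSaltedRecord() are hypothetical, and SHA-256 is used only to keep the example short; it is not a recommendation for real password storage.

```php
<?php
// Create a record with a fresh random salt (random_bytes() needs PHP 7+).
function createSaltedRecord(string $password): array
{
    $salt = bin2hex(random_bytes(32)); // 64 hex characters, stored in plain text
    return ['salt' => $salt, 'hash' => hash('sha256', $password . $salt)];
}

// Recompute the hash from the entered password and the stored salt.
function checkSaltedRecord(string $entered, array $record): bool
{
    $candidate = hash('sha256', $entered . $record['salt']);
    return hash_equals($record['hash'], $candidate); // timing-safe comparison
}

$alice = createSaltedRecord('123456');
$bob   = createSaltedRecord('123456');

var_dump($alice['hash'] === $bob['hash']);     // same password, different salts
var_dump(checkSaltedRecord('123456', $alice)); // correct password
var_dump(checkSaltedRecord('654321', $alice)); // wrong password
```

Both users chose the same password, yet their stored hashes differ, so a lookup by hash value no longer groups them together.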

A matter of randomness

To generate a suitable salt, we need a good random number generator. Forget about the rand() function right away.

There is a wonderful article dedicated to this issue. In short: a computer does not generate random data by itself, because it is a deterministic machine. That is, every algorithm it executes will produce the same output each time it is given the same input.

When a program asks the computer for a random number, the computer usually takes data from several sources (for example, environment variables: date, time, number of bytes written/read, etc.) and then performs calculations on them to obtain "random" data. That is why such data is called pseudo-random. So if an attacker can somehow recreate the set of initial states at the moment the pseudo-random function was executed, they will be able to generate the same number.
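The determinism is easy to observe with mt_srand(), which sets the generator's initial state explicitly. The sketch below is only an illustration; drawSequence() is a hypothetical helper and the seed 42 is arbitrary.

```php
<?php
// Draw a few values from mt_rand() after seeding it with a known value.
function drawSequence(int $seed, int $count): array
{
    mt_srand($seed);                 // recreate the generator's initial state
    $values = [];
    for ($i = 0; $i < $count; $i++) {
        $values[] = mt_rand();
    }
    return $values;
}

$first  = drawSequence(42, 5);
$second = drawSequence(42, 5);

var_dump($first === $second); // the same seed reproduces the same "random" numbers
```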

If the pseudo-random generator is also poorly implemented, patterns can be detected in the data it generates, and those patterns can be used to predict its output. Take a look at this picture, which shows the output of the PHP function rand():

Now compare with the data generated by a full-fledged random number generator:

Unfortunately, neither rand() nor mt_rand() can be considered suitable for providing a high level of security.

If you need random data, use the openssl_random_pseudo_bytes() function, available since PHP 5.3.0. It even has a crypto_strong flag that reports whether the result is cryptographically strong enough.

Usage example:

<?php
function getRandomBytes($byteLength)
{
    /*
     * Check whether openssl_random_pseudo_bytes is available
     */
    if (function_exists('openssl_random_pseudo_bytes')) {
        $randomBytes = openssl_random_pseudo_bytes($byteLength, $cryptoStrong);
        if ($cryptoStrong) {
            return $randomBytes;
        }
    }

    /*
     * openssl_random_pseudo_bytes is unavailable or its result is not
     * strong enough, so fall back to a less secure generator
     */
    $hash = '';
    $randomBytes = '';

    /*
     * On Linux/UNIX systems /dev/urandom is a good source of entropy,
     * so use it to obtain the initial value of $hash
     */
    if (file_exists('/dev/urandom')) {
        $fp = fopen('/dev/urandom', 'rb');
        if ($fp) {
            if (function_exists('stream_set_read_buffer')) {
                stream_set_read_buffer($fp, 0);
            }
            $hash = fread($fp, $byteLength);
            fclose($fp);
        }
    }

    /*
     * Use the less secure mt_rand(), but never rand()!
     */
    for ($i = 0; $i < $byteLength; $i++) {
        $hash = hash('sha256', $hash . mt_rand());
        $char = mt_rand(0, 62);
        $randomBytes .= chr(hexdec($hash[$char] . $hash[$char + 1]));
    }

    return $randomBytes;
}

Password stretching

You can also add password stretching, which makes brute-force attacks more difficult. Stretching is an iterative, or recursive, algorithm that computes the hash of the hash over and over again, tens of thousands of times (or even more).

The number of iterations should be chosen so that the total computation takes at least one second. The longer the hashing takes, the more time the attacker has to spend on cracking.

To crack a stretched password, the attacker needs to:

know the exact number of iterations, since any deviation produces a different hash;
wait at least one second between attempts.

This makes the attack very unlikely... but not impossible. To get around the one-second delay, the attacker must use a more powerful computer than the one the hashing algorithm was tuned for. Consequently, cracking may require additional expense.
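The idea can be sketched with a deliberately naive loop. This is an illustration of stretching only, not a vetted construction: the stretch() function, the salt value, and the iteration counts are all assumptions made for the example, and a standard algorithm should be preferred in practice.

```php
<?php
// Naive key stretching: feed the hash back into itself $iterations times.
function stretch(string $password, string $salt, int $iterations): string
{
    $hash = hash('sha256', $salt . $password, true);
    for ($i = 1; $i < $iterations; $i++) {
        // Mixing the password back in keeps every round dependent on it.
        $hash = hash('sha256', $hash . $salt . $password, true);
    }
    return bin2hex($hash);
}

$salt = 'illustrative-salt';           // use a random per-user salt in practice
$a = stretch('123456', $salt, 20000);
$b = stretch('123456', $salt, 20001);  // one extra round

var_dump($a === $b); // a different iteration count gives a different hash
```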

To stretch a password, you can use standard algorithms such as PBKDF2, which is a key derivation function:

<?php
/*
 * PBKDF2 key derivation function. The number of rounds can be increased
 * to compensate for growing CPU/GPU performance. More details:
 * - http://ru.wikipedia.org/wiki/PBKDF2
 * - http://www.ietf.org/rfc/rfc2898.txt
 */
function pbkdf2($password, $salt, $rounds = 15000, $keyLength = 32, $hashAlgorithm = 'sha256', $start = 0)
{
    // Number of hash-sized blocks needed to cover $start + $keyLength bytes
    $hashLength = strlen(hash($hashAlgorithm, '', true));
    $keyBlocks = (int) ceil(($start + $keyLength) / $hashLength);

    // Derived key
    $derivedKey = '';

    // Create key
    for ($block = 1; $block <= $keyBlocks; $block++) {
        // Initial hash for this block
        $iteratedBlock = $hash = hash_hmac($hashAlgorithm, $salt . pack('N', $block), $password, true);

        // Perform block iterations
        for ($i = 1; $i < $rounds; $i++) {
            // XOR each iteration
            $iteratedBlock ^= ($hash = hash_hmac($hashAlgorithm, $hash, $password, true));
        }

        // Append iterated block
        $derivedKey .= $iteratedBlock;
    }

    // Return derived key of correct length
    return base64_encode(substr($derivedKey, $start, $keyLength));
}
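Since PHP 5.5 the hash extension also ships a built-in hash_pbkdf2() function, so a hand-written loop is not strictly necessary. The sketch below assumes that function is available; the password, salt, and iteration count are placeholder values.

```php
<?php
// Built-in PBKDF2 (PHP 5.5+). With raw output disabled, the length is counted
// in hex digits, so 64 digits correspond to a 32-byte key.
$password   = 'correct horse battery staple'; // illustrative value
$salt       = 'illustrative-salt';            // use a random per-user salt in practice
$iterations = 15000;

$derivedKey = hash_pbkdf2('sha256', $password, $salt, $iterations, 64);

echo $derivedKey, "\n";
```

Note that this call returns a hex string, whereas the hand-written function above returns base64, so the two outputs are encoded differently.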

There are also more time-consuming and memory-intensive algorithms, for example bcrypt (discussed below) or scrypt:

<?php
// bcrypt is available through crypt()
$hash = crypt($password, '$2a$' . $cost . '$' . $salt);

$cost is the work factor;
$salt is a random string. It can be generated, for example, using the getRandomBytes() function described above (the result must then be encoded into the 22-character ./0-9A-Za-z alphabet that bcrypt expects).

The work factor depends entirely on the machine on which the hashing is performed. You can start from 09 and gradually increase it until the operation takes about one second. Starting with PHP 5.5 you can use the password_hash() function, which is discussed later.
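One way to find a suitable work factor is to time the hashing at increasing costs on the target machine. The sketch below assumes PHP 7 or later (for scalar type declarations) and uses password_hash() with PASSWORD_BCRYPT; findBcryptCost() is a hypothetical helper, not part of PHP.

```php
<?php
// Return the lowest bcrypt cost whose hashing time reaches $targetSeconds.
function findBcryptCost(float $targetSeconds, int $startCost = 9, int $maxCost = 31): int
{
    for ($cost = $startCost; $cost < $maxCost; $cost++) {
        $startedAt = microtime(true);
        password_hash('benchmark-password', PASSWORD_BCRYPT, ['cost' => $cost]);
        if (microtime(true) - $startedAt >= $targetSeconds) {
            break;
        }
    }
    return $cost;
}

// A 0.1 s budget keeps this demo quick; the text above aims for about 1.0 s.
$cost = findBcryptCost(0.1);
echo 'Suggested cost: ', $cost, "\n";
```

Each step up in cost roughly doubles the hashing time, so the loop finishes after only a handful of measurements.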

Currently PHP does not support scrypt out of the box, but you can use the implementation from DomBlack.

The use of encryption technology

Many people confuse the terms "hashing" and "encryption". As mentioned above, a hash is the result of a pseudo-random function, while encryption is a pseudo-random transformation: the input data is split into blocks and processed so that the result is indistinguishable from the output of a true random number generator. In this case, however, the transformation can be inverted to restore the original data. The transformation is performed with a cryptographic key, without which the inverse transformation is impossible.

There is one more important difference between encryption and hashing: the size of the output space is not fixed and depends on the size of the input data in a 1:1 ratio. Therefore, there is no risk of collisions.

Encryption must be used with great care. Do not assume that encrypting important data with some algorithm is enough to protect it. There are many ways to steal data. The main rule is never to roll your own solution and always to use ready-made, well-tested implementations.

Some time ago, Adobe suffered a massive leak of its user database because of improperly implemented encryption. Let's see what happened.

Suppose that the following data is stored in a table in plain text:

Someone at Adobe decided to encrypt the passwords but made two big mistakes:

used the same cryptographic key for every password;
left the passwordHint fields unencrypted.

Suppose, after encryption, the table began to look like this:

We do not know which cryptographic key was used. But if you analyze the data, you can see that the same password is used in lines 2 and 7, as well as in lines 3 and 6.

Now let's turn to the password hints. In line 6 the hint is “I'm one!”, which is completely uninformative. But thanks to line 3 we can guess that the password is queen. Taken separately, lines 2 and 7 do not reveal the password, but if we analyze them together, we can guess that it is halloween.

To reduce the risk of data leaks, it is better to combine different hashing techniques. And if you do need to encrypt passwords, take a look at tweakable encryption:

Suppose we have thousands of users and want to encrypt all their passwords. As shown above, it is better to avoid using a single cryptographic key. But we cannot create a unique key for each user either, since storing the keys would itself become a problem. In this case it is enough to use one common key for everyone and add a "tweak" that is unique to each user. The combination of the key and the tweak then serves as a unique key for each user.

The simplest tweak is the so-called primary key, which is unique for each row in the table. Using it in real life is not recommended; it is shown here only as an example:

f (key, primaryKey) = key + primaryKey

Here the key and the primary key are simply concatenated. To be secure, however, you should apply a hashing algorithm or a key derivation function to them. Also, instead of the primary key, you can use a one-time value (a nonce, analogous to a salt) for each record.
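A minimal sketch of combining a shared key with a per-record value, assuming HMAC-SHA256 as the combining function and PHP 7 or later for random_bytes(). The deriveRecordKey() helper and all variable names are hypothetical; the article does not prescribe a particular function.

```php
<?php
// Derive a per-record key from a shared master key and a per-record tweak.
function deriveRecordKey(string $masterKey, string $tweak): string
{
    // HMAC-SHA256 keyed with the master key; returns 32 raw bytes.
    return hash_hmac('sha256', $tweak, $masterKey, true);
}

$masterKey = random_bytes(32);  // hypothetical shared key, kept outside the database
$nonceA    = random_bytes(16);  // per-record nonce, stored next to the record
$nonceB    = random_bytes(16);

$keyA = deriveRecordKey($masterKey, $nonceA);
$keyB = deriveRecordKey($masterKey, $nonceB);

var_dump($keyA === $keyB); // different tweaks yield different keys
```

Only the master key has to be protected; the nonces can sit in the table, much like a salt.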

If we apply tweakable encryption to our table, it will look like this:

Of course, something still has to be done about the password hints, but at least we now have something reasonable.

Please note that encryption is not an ideal solution for storing passwords. Because of the threat of code injection, it is better to avoid this method of protection. The safest way to store passwords is the bcrypt algorithm. But we must not forget that even the best and most proven solutions have vulnerabilities.

PHP 5.5

Today, bcrypt is considered the best way to hash passwords. Yet many developers still prefer older and weaker algorithms such as MD5 and SHA-1, and some do not even use a salt when hashing. PHP 5.5 introduced a new hashing API that not only encourages the use of bcrypt but also makes it much easier to work with. Let's take a look at the basics of this new API.

It consists of four simple functions:

password_hash() - hashes a password;
password_verify() - checks a password against a hash;
password_needs_rehash() - checks whether a hash needs to be regenerated;
password_get_info() - returns the name of the hashing algorithm and the options used for hashing.

password_hash()

Although the crypt() function provides a high level of security, many find it too complicated, which is why programmers often make mistakes. Instead, some developers use a combination of weak algorithms and weak salts to generate hashes:

<?php
$hash = md5($password . $salt); // works, but dangerous

The password_hash() function makes the developer's life easier and makes the code more secure. To hash a password, simply pass it to the function, and it will return a hash that can be stored in the database:

<?php
$hash = password_hash($password, PASSWORD_DEFAULT);

And that's it! The first argument is the password as a string, and the second argument is the hashing algorithm. The default is bcrypt, but a stronger algorithm that produces longer strings may be added in the future. If you use PASSWORD_DEFAULT in your project, make sure the column that stores the hashes is at least 60 characters wide; it is better to make it 255 characters right away. You can also pass PASSWORD_BCRYPT as the second argument, in which case the hash will always be 60 characters long.
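The shape of a bcrypt hash string is easy to inspect. The sketch below uses PASSWORD_BCRYPT explicitly so that the result does not depend on what PASSWORD_DEFAULT maps to; the password is a placeholder.

```php
<?php
$password = 'correct horse battery staple'; // illustrative value

$first  = password_hash($password, PASSWORD_BCRYPT);
$second = password_hash($password, PASSWORD_BCRYPT);

echo strlen($first), "\n";          // length of a bcrypt hash string
echo substr($first, 0, 7), "\n";    // algorithm identifier and cost prefix
var_dump($first === $second);       // a fresh random salt is used on every call
```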

Note that you do not need to provide a salt or a cost parameter. The new API does everything for you. Since the salt is part of the hash, you do not have to store it separately. If you still need to set your own salt (or cost), you can do so through the third argument (note that the salt option was deprecated in PHP 7.0 and is ignored as of PHP 8.0):

<?php
$options = [
    'salt' => custom_function_for_salt(), // write your own salt-generating code
    'cost' => 12 // the default cost is 10 in PHP 5.5
];
$hash = password_hash($password, PASSWORD_DEFAULT, $options);

All this lets you use the latest security features. If a stronger hashing algorithm later becomes the default in PHP, your code will use it automatically.

password_verify()

Now consider the function that checks a password against a hash. The password is entered by the user, and the hash is taken from the database. Both are passed as arguments to password_verify(). If the hash matches the password, the function returns true.

<?php
if (password_verify($password, $hash)) {
    // Correct!
} else {
    // Wrong password
}

Remember that the salt is part of the hash, so it is not specified separately here.
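A self-contained round trip might look like the sketch below. The passphrase and the list of attempts are invented for the example; in a real application the stored hash would come from the database.

```php
<?php
// At registration: hash the password and store only the hash.
$storedHash = password_hash('s3cret-passphrase', PASSWORD_DEFAULT);

// At login: compare what the user typed with the stored hash.
$attempts = ['s3cret-passphrase', 'S3cret-passphrase', ''];
foreach ($attempts as $attempt) {
    $result = password_verify($attempt, $storedHash) ? 'accepted' : 'rejected';
    printf("%-20s %s\n", var_export($attempt, true), $result);
}
```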

password_needs_rehash()

If you want to raise the security level by using a stronger salt or increasing the cost parameter, or if the default hashing algorithm changes, you will probably want to rehash all existing passwords. This function checks which algorithm and parameters were used to create a given hash:

<?php
if (password_needs_rehash($hash, PASSWORD_DEFAULT, ['cost' => 12])) {
    // the password must be rehashed because the algorithm used
    // or the cost parameter does not match (cost is not 12)
    $hash = password_hash($password, PASSWORD_DEFAULT, ['cost' => 12]);
    // do not forget to save the new hash!
}

Do not forget that this has to be done when the user logs in, since that is the only moment you have access to the password in plain text.
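Putting verification and rehashing together at login time could look like the sketch below. It assumes PHP 7.1 or later for the nullable return type; loginAndUpgrade() is a hypothetical helper, and the very low costs (4 and 5) are chosen only so the example runs quickly.

```php
<?php
// Returns the hash that should be stored after a successful login,
// or null when the password is wrong.
function loginAndUpgrade(string $password, string $storedHash, int $cost): ?string
{
    if (!password_verify($password, $storedHash)) {
        return null;
    }
    $options = ['cost' => $cost];
    if (password_needs_rehash($storedHash, PASSWORD_BCRYPT, $options)) {
        // The plain-text password is available only here, so rehash now.
        return password_hash($password, PASSWORD_BCRYPT, $options);
    }
    return $storedHash;
}

$oldHash = password_hash('s3cret', PASSWORD_BCRYPT, ['cost' => 4]); // deliberately weak
$newHash = loginAndUpgrade('s3cret', $oldHash, 5);

var_dump($newHash !== $oldHash);                 // the hash was upgraded
var_dump(loginAndUpgrade('wrong', $oldHash, 5)); // failed login
```

The caller is expected to write the returned hash back to the database whenever it differs from the stored one.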

password_get_info()

This function takes a hash and returns an associative array of three elements:

algo - a constant that identifies the algorithm;
algoName - the name of the algorithm used;
options - the values of the options used for hashing.
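For instance, the three elements can be read back from a freshly created bcrypt hash, as in the sketch below. The cost of 11 is an arbitrary example value.

```php
<?php
$hash = password_hash('s3cret', PASSWORD_BCRYPT, ['cost' => 11]);
$info = password_get_info($hash);

echo $info['algoName'], "\n";          // name of the algorithm
echo $info['options']['cost'], "\n";   // cost recovered from the hash string
var_dump($info['algo']);               // algorithm identifier (its type varies by PHP version)
```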

Earlier versions of PHP

As you can see, the new API is easier to work with than the awkward crypt() function. If you are using an earlier version of PHP, I recommend taking a look at the password_compat library. It emulates this API and disables itself automatically when you upgrade to version 5.5.

Conclusion

Unfortunately, there is still no perfect solution for protecting data, and there is always a risk that your security system will be broken. However, the contest between shell and armor never stops. For example, our arsenal of defenses has recently been extended with so-called sponge functions.

Source: https://habr.com/ru/post/271245/
