⬆️ ⬇️

Rainbow tables at home





From the point of view of information security, the past week was exceptionally “successful”: either the LinkedIn hash base leaked into the network, then last.fm hashes . And in all discussions, one way or another, they mention rainbow tables .

Almost everyone has heard about them, but very few have made them with their own hands.



I think it is not reasonable to re-tell what a hash is and why, in principle, rainbow tables or some other predictions are needed. To eliminate white spots is invited to read this topic .



Intellectual breakthrough in the field of rainbow tables is not planned today, but there is a desire to tell that rainbow tables are not difficult, therefore we will write on something simple, namely: PHP. Store table in mysql.

')

All code is available on GoogleCode , but I will describe the main points that I had to think about and need to be implemented.



First you need to talk about the input alphabet. Not all ASCII table characters are involved in the password set, but only those that can be typed on the keyboard of a PC or mobile device without unnecessary tricks. The smaller the input alphabet, the faster the rainbow table will be generated, but there will also be fewer passwords for given hashes. In our case, we will use the input alphabet of numbers and letters of the Latin alphabet of upper and lower case.



$ALPHABET = array_merge(range(0, 9), range('A', 'Z'), range('a', 'z')); $LAST_SYMBOL = count($ALPHABET) - 1; //     




To create a rainbow table, chains are used, the beginning of which is some random password of a fixed length. Obviously, the function of generating random passwords from the characters of the input alphabet is needed:



 define('WORD_LENGTH', 6); //  ,      function getWord($newRandom = false) { global $ALPHABET, $LAST_SYMBOL; if($newRandom) { mt_srand(); } $word = $ALPHABET[mt_rand(0, $LAST_SYMBOL)]; for($i = 1; $i < WORD_LENGTH; ++$i) { $word .= $ALPHABET[mt_rand(0, $LAST_SYMBOL)]; } return $word; } 




Inside the chains, the hash function and the reduction function are alternately applied. With the hash function, everything is clear - this is MD5, SHA1 or any other (in our case, we will use MD5). With the function of reducing clarity less. First, the reduction function, having received a hash at the input, should produce some password from the characters of the input alphabet. Secondly, the reduction function requires not one single, but an ordered set of reduction functions, and the power of this set is equal to the length of the chain.



Of course, one could write two or three reduction functions independently, but not in the case when the chain length is 100 or 1000. Moreover, I would like the chain length to be stored in a constant that can be replaced with a flick of the wrist.



A rather obvious solution comes to mind: you need to use a pseudo-random number generator (PRNG). For each particular reduction function, initialize the PRNG with a certain set of bits from the hash entering the input, and then obtain the password by calling getWord ().



In principle, act at the level of individual bits and is not required. You need to initialize the PRNG with an int number, for my platform it is 32 bits or 4 bytes. MD5 consists of 16 bytes (look at the second parameter of the md5 function in PHP), then the number of possible allocations is 16! / (16 - 4)! = 43680 - even for the length of the chain of 1000 is enough with a margin.



No sooner said than done:



 define('CHAIN_LENGTH', 1000); //       define('HASH_LAST_BYTE', 15); //    ,   0 ( MD5 – 15) $reductions = array(); // ,         mt_srand(CHAIN_LENGTH); //    ,                 $i = 0; while($i < CHAIN_LENGTH) { $positions = array(); $positions[] = mt_rand(0, HASH_LAST_BYTE); for($j = 1; $j < 4; ++$j) { do { $ind = mt_rand(0, HASH_LAST_BYTE); if(!in_array($ind, $positions)) { $positions[] = $ind; break; } } while(true); } if(!in_array($positions, $reductions)) { //    $reductions[] = $positions; ++$i; } } 




Then the actual reduction function, which takes the hash as input and the current step number in the chain, will look like:



 function reduction($hash, $step) { global $reductions; $pos = $reductions[$step % CHAIN_LENGTH]; mt_srand(ord($hash[$pos[0]]) | ord($hash[$pos[1]]) << 8 | ord($hash[$pos[2]]) << 16 | ord($hash[$pos[3]]) << 24); return getWord(); } 




Given all the above, the function of calculating the end of the chain at its beginning is trivial:



 function getEndOfChain($word, $startStep = 0, $length = CHAIN_LENGTH) { for($i = $startStep; $i < $length; ++$i) { $hash = md5($word, true); $word = reduction($hash, $i); } return $word; } 




Congratulations, we have done a lot of work, and there is only one aspect left of finding a password for a given hash that needs to be told.



In the classic version, the last n-th reduction function from the hash is taken and the resulting password is searched in the rainbow table, if nothing is found, the n-1 reduction is taken, then the hash is calculated, then the n-th reduction is searched in the table and so on until there is a password. When using MySQL, this could result in n SELECTs of the same type (at worst) - even a novice web programmer knows what you can get for it! Of course, one SELECT is enough to search for a single password, but for this you need to generate all the passwords for the search at once:



 function getWordsInChain($hash) { $words = array(); //        100, 99, 98   for($i = 0, $n = CHAIN_LENGTH; $i < $n; ++$i) { $wordStart = reduction($hash, $i); $wordEnd = getEndOfChain($wordStart, $i + 1); $words[] = $wordEnd; } return $words; } 




All other MySQL manipulations have no direct relation to rainbow tables, and other parts of the source code, in my opinion, are understandable without explanation.



And finally, a fly in the ointment. PHP and MySQL do a fine job of quickly creating prototypes, but PHP is really not the fastest language and storing a rainbow table in a general-purpose relational database is not the most efficient solution. The rainbow table for MD5 for 6-character passwords with a chain length of 1000 generated more than 8 hours from 2 million records of a laptop based on i3-330UM. Ideally, the resulting table can convert 2 * 10 ^ 9 hashes, but this number is not commensurate with the total number of 6-character passwords, which are 56.8 * 10 ^ 9 on the selected input alphabet.



This once again shows how important the choice of a suitable tool is for solving a specific task.

I think that the task clearly demonstrates the principle of implementation of rainbow tables for me, along with PHP, I managed to solve.



Thanks for attention.

Source: https://habr.com/ru/post/145820/



All Articles