📜 ⬆️ ⬇️

Simple cryptography: frequency and differential cryptanalysis in the task NeoQUEST-2015

On the eve of the NeoQUEST-2015 in-person tour, we continue to sort out the tasks of the online stage. The article deals with the task "Abracadabra", consisting of two parts. Both parts are for cryptanalysis: the first is for frequency, the second is for differential.

Participants were given two files. The first file had the format .docx, on its first page was the very “abracadabra” - the text, consisting entirely of incomprehensible squiggles and symbols, and on the second page - not quite clear list. Looking at it, one thing was for sure: a certain encryption algorithm is described here.

The second file was in .txt format, it contained 2 columns, titled as plaintext and cyphertext. With all this it was necessary to do something ...

Incomprehensible text:
')
RћC ‰C “Ћ ‡ Њ„ ” 'ЋЊ„ Ћ ‹„ Oc' Ќ ‹І ‰ і ± 'Ћ „ ”Ќѕ Џ † … „ Ћ ÑЊЃЌo ѕ Џ † … „Ћ ‡ ЊcЌo ± »'Ћ ‰ †. ± † ЋЊ „Ћ‹ „ѕЃ єЏ‹ ‰ † ± ЉcїЊ ЂЋ „pЏ‹ Њ „» ј o „Њ‹ Џ ‹ ‡ ± † „ † Љ ‹ ‰ oЅЅ‹ Џ ‹„ • ѕ † Љ‚ „» ј єЏЋЅ † јЉ o ' † Џ † ЂЊ ‹Џѕ ‡ Њop † Џ † _„ †. ѕЅЅ ‹Џ‹ „ • o † Љ‚„ »ј єЏЋЅ † јЉ єЋЂ † … » ± † ‹Њ ± ‹ ЏЋЃЊ "Ћ Њ" Ћ ‹ЋЊ „ѕ‹ Ќ ‹І ‰ і ± 'Ћ „ ”Ќo Џ † … „ Ћ ‡ ЊcЌѕ o Џ † … „Ћ ‡ ЊЃЌѕ ± » 'Ћ ‰ † ґЉЋp . RљSЋS SЋRґS ‰ "P" C <RєSЏSЋRЅS † RјSЉR "SЌSЋSѓRіSЊ RґR" SЊS C ‡ ‰ SЋS ... With C † C "P" With SЉc SЂS ‰ ‰ † RІS SЋSѓSЋ ° C ... P ‡ ± SЋS S,SЌRѕ RґSЉSЋpSЋR ± R ± DP “S. RЃC † CЏC † C Њ C Њ C P »P¶o ‡ ЉЃ‹ Њ ‡ c ‰ ЉЃ • ‹ЉЋѓЋ Џ † і„ ‰ †. RS “ † єЋЂ † … ” ± † ‹Њ ± ‹ ЏЋЊ „Ћ ‚, pЋЊЋЏЋј Ћ ‰ „ † ± 'Ћ „ † Ѓ Џ † … „Ћ ‡ Њ‚ ‡ Ћ… ‰ † Љ † ґ »Џ † … „ Ћ ‡ Њ ‚ЋєЏ‹ ‰ ‹Љa“ “ЋѓЋ ± ” 'Ћ †. RRґCЏC † CЊR * CЊC ‹P ± C„ oЌ † C „ѕ‹, ¶ЊЋ C' † CЏC † C Њ C Њ C Џ C C Њ C Њ C o C Њ C † RIC ‹C C Љ C Ђ C C C C † o C Њ C † RIC † RіS "C ‰ † C, RєSЋSЊSЋSЌRі R¶SЊSЋ SЉRїRґSЋS <SЋSЊS" SЋRoS <C "RѕS <pSЋSЊSЋSЏSЋS <P ± SЂSЉRїR¶S † C <SЊ SЏS † C ... C" SЋS ‡ SЊo, C "C <... C † P ± P * C ‡ oCЊ CЋCЊ pЉї¶ ‹ј Џ † і„ ‰ †. ћ ‡ єЋЉ‚… іј ‰ ѕЅЅ ¶ † ‡ Њo … † ‰ † „ѕЃ, Љѕ І‹ o ‰ „ † єЏЋЉЋЌ. ¦ pЉї¶ Ђ№ЊЋј ¶ † ‡ Њo … † ‰ † „ѕc њ ¶ † ‡ ЊЋЊ„ »ј † „ † Љo….

List:

  1. Substitution (15, 1, 8, 14, 6, 11, 3, 4, 9, 7, 2, 13, 12, 0, 5, 10)
  2. Permutation P (i) = 9 * i + 4 (mod 32)
  3. 3 cycles: 2 cycles with [XOR with key + Substitution + Permutation], the last [XOR with key + Substitution + XOR with key]
  4. Input: S-box №1 (10; 1), second: S-box №3 (8; 6).


Considering the list from the Word file, it was necessary to restore the key in pairs of open and closed texts in the second .txt file.

"Abracadabra" №1

The analysis of the first part of the assignment, with incomprehensible text, should have been started from the definition of the encoding, many online decoders (in particular, this one ) successfully coped with this.



After using the decoder, a more meaningful text was obtained, with punctuation marks, suggesting the idea of ​​encryption by replacing a character with a character. However, the simplest option — Caesar’s cipher — did not help here. Further directions of thought were reduced to frequency cryptanalysis. Here, as is the case with the definition of encoding, there are services that allow frequency analysis of the text.

Result of frequency analysis of the entered text
Text analysis performed
Number of characters in the text 910
Number of spaces 114
Number of digits 0
Number of points and commas 16
Number of english letters 53
Russian letters 715

Character-based statistics and frequency analysis
The symbol occurs 114 times. Frequency 12.53%
The symbol u occurs 86 times. Frequency 9.45%
The symbol q occurs 80 times. Frequency 8.79%
The symbol f occurs 63 times. Frequency 6.92%
The symbol b is found 52 times. Frequency 5.71%
The symbol s is found 50 times. Frequency 5.49%
The symbol I occurs 43 times. Frequency 4.73%
The symbol n is found 38 times. Frequency 4.18%
The symbol occurs 31 times. Frequency 3.41%
The symbol o occurs 30 times. Frequency 3.30%
The symbol o occurs 28 times. Frequency 3.08%
The symbol h occurs 27 times. Frequency 2.97%
Symbol b occurs 22 times. Frequency 2.42%
The symbol x occurs 21 times. Frequency 2.31%
The symbol l occurs 19 times. Frequency 2.09%
The symbol k occurs 16 times. Frequency 1.76%
The symbol m occurs 16 times. Frequency 1.76%
The symbol e occurs 14 times. Frequency 1.54%
The symbol n occurs 14 times. Frequency 1.54%
The symbol r occurs 13 times. Frequency 1.43%
The symbol e occurs 12 times. Frequency 1.32%
The symbol with occurs 11 times. Frequency 1.21%
The symbol p occurs 11 times. Frequency 1.21%
The symbol c occurs 11 times. Frequency 1.21%
The symbol t occurs 11 times. Frequency 1.21%
The symbol p occurs 11 times. Frequency 1.21%
Symbol. occurs 9 times. Frequency 0.99%
The symbol is found 9 times. Frequency 0.99%
The symbol d occurs 9 times. Frequency 0.99%
The symbol is found 7 times. Frequency 0.77%
The symbol occurs 7 times. Frequency 0.77%
The symbol e occurs 6 times. Frequency 0.66%
The symbol is found 6 times. Frequency 0.66%
The symbol n occurs 5 times. Frequency 0.55%
The symbol and occurs 4 times. Frequency 0.44%
The symbol d occurs 1 time. Frequency 0.11%
Symbol a occurs 1 time. Frequency 0.11%
The symbol d occurs 1 time. Frequency 0.11%
The symbol s occurs 1 time. Frequency 0.11%

From the results of the analysis it is clear that the text is not two English letters (note the mysterious DS, which can be DES, DOS, DNS, and so on), but as many as 53! It was possible to work and write a program that sorts out letters that look the same in both Russian and English (for example, obvious o, e, p), but you could google and find a program that will highlight English letters:



Attentive participants may have noticed repeated words that are most likely different forms of the same word:

schonnyyyfEyottstfyuyu
SchonnyyayfEotstsflm
schonnyyayfEotstfueg

It is logical to assume that the most concise form of a word from these three is the nominative case, a word of 16 letters in which the letters 3 and 4 are the same. There are not so many words of 16 letters, if you believe the dictionary: 759. And such that the third and fourth letters coincide, all the more. It was possible to implement a program that selects words that match the mask to the encrypted one, but it was possible to simply check words from 16 letters with doubled 3 and 4 positions. Even if checked manually, the choice is small:

shamelessness
ballot
gellenophobia
collaborationism
collaborator
collectors
cellophane
commercialization
pessimism
reasonableness
differentiate

But the letters in positions 2 and 10 must also match! According to this parameter, only the word “differentiate” is suitable, and if you try to implement such a replacement of symbols, the text will become more readable, though not completely, from which it becomes clear that the search word is not “differentiate”, but “differential”. Having connected with the second part of the assignment, DS turns into DES, simplifying the task a little more, and the final text looks like this:

“The idea of ​​differential cryptanalysis is based on probabilistic relationships between input differences and output differences. Two relationships are of particular interest in the analysis: the differential profile and the characteristics of the round. The differential profile shows the probability relationship between the input differences and the differences of the output of the block. Similar profiles can be created for each of the eight blocks in the DES. The characteristic of the round is similar to the differential profile, but it is calculated for the whole round. It shows the probability with which one input difference would create a difference of a certain output. Note that the characteristic is the same for each round, because any relation that includes differences does not depend on the keys of the round. Use differential cryptanalysis to pass the second part of the task, or go ahead. And the key to this part of the task is frequency analysis.

So we got the key to the first part of the task, but if you enter "frequency analysis" in the input field, a message pops up saying that the key is incorrect. What to do? It's simple: you had to take an MD5-hash from this phrase. Profit! By the way, writeup for this assignment has already been published by one of our participants here , having passed it a little differently, but, nevertheless, having achieved success!

"Abracadabra" №2

As it was already written in the decrypted text, the second part of the task could be completed in two ways:


Most of the participants chose the second method of transmission (and they can be understood, because it is still more laborious to implement cryptanalysis), so we will analyze this method in more detail.

Differential Cryptanalysis

It is based on the uneven distribution of bit-wise differences modulo 2 pairs of open and corresponding encrypted texts. To attack using the differential method requires the availability of selected open and corresponding encrypted texts, this condition was met.
Participants who are inexperienced in cryptanalysis would have to work hard enough to complete the task in this way, namely:

  1. Understand what an encryption cycle differential is.
  2. Examine the concept of differential characteristic.
  3. Understand how substitution and permutation work.

The differential of the i-th encryption cycle is a pair of vectors a and b, such that a pair of open texts (x 1 and x 2 ) with a difference a can go after the i-th cycle into a pair of output texts (y 1 and y 2 ) with a difference b. The differential characteristic is a sequence of single-cycle differentials, while the output text difference for the previous cycle coincides with the input text difference of the subsequent cycle.

The block cipher has a block length and a key of 32 bits each (you can guess this by looking at condition 2 in the list). Encryption, as already indicated in the list of the .docx document, is performed on 3 cycles, each of which contains a key XOR, replacement of 4-bit words (substitution), and two cycles contain a permutation (Permutation). When performing a permutation, the bit from position i moves to position 9 * i + 4 (mod 32).

By the way, the rearrangement caused considerable difficulties for the participants due to their nontriviality: at the input, the numbering of bits ranged from 1 to 32, and at the output - from 0 to 31 (example: Permutation (0x12345678) = 0xB3E29180). However, after the publication of tips on Twitter, the participants began to actively pass the task!

General algorithm for passing the task by the method of differential cryptanalysis:
  1. Using a pair of open texts, we look at the difference for this pair at the output of 2 cycles
  2. Using a pair of output ciphertexts after 3 cycles, we perform their inverse transformations, calculating the inverse substitution and making XOR with the key.
  3. Calculate the difference for the converted ciphertexts.
  4. We compare the differences for the converted ciphertexts and for a pair of open texts.
  5. We are looking for matching bits, if any, increment the count of the number of matches.
  6. After going through all the pieces of the key (2 to 4 bits, total - 8 bits), we look for which (or for which) the greatest number of matches. This will be the desired part of the key.
  7. We repeat, selecting other active blocks and build a new characteristic, to determine the already different part of the key. We do, until either all 32 bits are opened, or until most of the key is opened — the rest can be turned on.


Brutfors

Let's proceed to the passage of the job by the method of brute force! It was possible to implement a program that performs brute force in parallel on all processor cores. In addition, of course, it was necessary to be careful when implementing the substitution and permutation.

using System; using System.Collections; using System.Collections.Generic; using System.IO; using System.Linq; using System.Text; using System.Threading; using System.Threading.Tasks; namespace ConsoleApplication2 { static class Program { static long progress = 0; static long prev_progress = 0; static DateTime prev_date; static bool complete = false; static byte reverse_bits( this byte b) { byte r = 0; for (int i = 0; i < 8; i++) { byte mask = (byte)(1 << i); byte mask_neg = (byte)(1 << (7 - i)); if ((b & mask) == mask) { r |= mask_neg; } } return r; } static byte[] reverse_bits(this byte[] array) { byte[] r = new byte[array.Length]; for (int i = 0; i < array.Length; i++) { r[i] = array[i].reverse_bits(); } return r; } static void Main(string[] args) { DateTime start_date = DateTime.Now; progress = 0; prev_progress = 0; prev_date = DateTime.Now; Thread monitor = new Thread(a => { while (progress != 0xFFFFFFFF && complete == false) { Thread.Sleep(5000); long long_percent = ((progress * 100 * 1000) / 0xFFFFFFFF); long sec = (long)(DateTime.Now - prev_date).TotalSeconds; long speed = (long)((progress - prev_progress) / sec); Console.WriteLine("{0}.{1}% ... speed={2} left={3}", long_percent / 1000, long_percent % 1000, speed, 0x100000000 - progress); prev_date = DateTime.Now; prev_progress = progress; } }); monitor.Start(0); Parallel.For(0x00000000, 0xFFFFFFFF, i => { byte[] text = BitConverter.GetBytes((UInt32)0x5ABF054B); Array.Reverse(text); byte[] key = BitConverter.GetBytes((UInt32)i); Array.Reverse(key); Crypt(ref text, key); Array.Reverse(text); if (BitConverter.ToUInt32(text, 0) == (UInt32)0x4CD3CA17) { complete = true; Console.WriteLine("KEY {0}", i.ToString("X")); File.WriteAllText(i.ToString("X") + ".txt", i.ToString()); Console.WriteLine("Complete! time = " + (DateTime.Now - start_date).ToString()); Console.ReadKey(); } Interlocked.Increment(ref progress); } ); Console.WriteLine("Complete! time = " + (DateTime.Now - start_date).ToString()); Console.ReadKey(); } static int[] substitution_table = { 15, 1, 8, 14, 6, 11, 3, 4, 9, 7, 2, 13, 12, 0, 5, 10}; static void Crypt(ref byte[] text, byte[] key) { Xor(ref text, key); Substitution(ref text); Permutation(ref text); Xor(ref text, key); Substitution(ref text); Permutation(ref text); Xor(ref text, key); Substitution(ref text); Xor(ref text, key); } static void Substitution(ref byte[] text) { for (int i = 0; i < 8; i += 2) { int h = (text[i / 2] >> 4) & 0xF; int l = (text[i / 2] >> 0) & 0xF; h = substitution_table[h]; l = substitution_table[l]; text[i / 2] = (byte)((h << 4) | l); } } static void Permutation(ref byte[] text) { BitArray bits = new BitArray(text.reverse_bits()); text = new byte[text.Length]; for (int i = 0; i < 32; i++) { int idx = ((i + 1) * 9 + 4) % 32; text[idx / 8] |= (byte)((bits[i] ? 1 : 0) << (idx % 8)); idx = idx; } text = text.reverse_bits(); } static void Xor(ref byte[] text, byte[] key) { for (int i = 0; i < text.Length; i++) { text[i] ^= key[i]; } } } } 


It is curious that for some pairs of open and corresponding encrypted texts the program could find more than one key (in particular, for the very first pair). From the found value, as in the first part of the task, it was necessary to find the MD5 hash.

Source: https://habr.com/ru/post/258893/


All Articles