Recently, at the institute, I encountered a curious cryptographic task that I would like to share with the Community. In one sentence, I can designate a task as " Attack on the LSX cipher, which does not contain a non- linear component, on the basis of open texts ". Since there are few Russian-language examples of solving such training βpuzzlesβ, and the task itself is recommended for specialists starting their own path, I believe that such an article may be of interest to the young cryptanalyst. Come under cat.
Condition
Byte substitution pi8 AES-256 cipher is determined by some sequence of operations in the field GF(28) and can also be specified by an array of 256 bytes:
63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76 ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0 b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15 04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75 09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84 53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8 51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2 cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73 60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79 e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08 ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a 70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16
We define a modified version of the cipher - "AES-256-M", which will differ from the original only by substitution, namely: ')
2b c4 4d a2 76 99 10 ff 56 b9 30 df 0b e4 6d 82 db 34 bd 52 86 69 e0 0f a6 49 c0 2f fb 14 9d 72 95 7a f3 1c c8 27 ae 41 e8 07 8e 61 b5 5a d3 3c 65 8a 03 ec 38 d7 5e b1 18 f7 7e 91 45 aa 23 cc cb 24 ad 42 96 79 f0 1f b6 59 d0 3f eb 04 8d 62 3b d4 5d b2 66 89 00 ef 46 a9 20 cf 1b f4 7d 92 75 9a 13 fc 28 c7 4e a1 08 e7 6e 81 55 ba 33 dc 85 6a e3 0c d8 37 be 51 f8 17 9e 71 a5 4a c3 2c 6f 80 09 e6 32 dd 54 bb 12 fd 74 9b 4f a0 29 c6 9f 70 f9 16 c2 2d a4 4b e2 0d 84 6b bf 50 d9 36 d1 3e b7 58 8c 63 ea 05 ac 43 ca 25 f1 1e 97 78 21 ce 47 a8 7c 93 1a f5 5c b3 3a d5 01 ee 67 88 8f 60 e9 06 d2 3d b4 5b f2 1d 94 7b af 40 c9 26 7f 90 19 f6 22 cd 44 ab 02 ed 64 8b 5f b0 39 d6 31 de 57 b8 6c 83 0a e5 4c a3 2a c5 11 fe 77 98 c1 2e a7 48 9c 73 fa 15 bc 53 da 35 e1 0e 87 68
Task. There is a ciphertext with a length of 5120 bits (40 blocks), which was obtained by encrypting plaintext with the cipher "AES-256-M" in the simple replacement mode using an unknown key. The first plaintext block is known. To propose a feasible in practice method of opening the unknown part of the ciphertext.
Theoretical introduction
Consider the general structure of LSX ciphers, one of whose representatives is AES Rijndael .
LSX Cipher Structure
The cipher of the LSX architecture is based on the iterative application of the round function to the a block of the converted text (the block length a for most modern ciphers is 128 bits). The round function consists of three transformations:
X - addition modulo 2 input block with an iterative key Ki ( i=overline1,r where r - the number of cipher rounds); S - application of fixed substitution pi to each block byte; L - linear transformation.
Schematically, the LSX transform can be represented as
Each round of LSX cipher uses a separate round key Ki somehow formed from the primary key K . Primary Key Bit Length K usually equal to or greater than the length of the iteration key. The key scanning procedures of the iterative keys from the primary one are significantly different for different ciphers.
General Encryption Transform Formula Enc(k,a) for LSX cipher with number of rounds r , can be written in the following form:
b=Enc(K,a)=X[Kr]LSX[Krβ1]...LSX[K2]LSX[K1](a).
Schematically, the general structure of the LSX cipher can be represented as
The decryption process is performed using inverse transforms:
X - addition modulo 2 input block with an iterative key; Sβ1 - application back to S substitutions; Lβ1 - application of the reverse L -conversion ( SSβ1=I - single substitution). Formula to decrypt Dec(k,b)r round LSX cipher:
In the ciphers of the LSX architecture, a huge role is played by substitutions - bijective displays of the form pin:Vn2rightarrowV2n where V_2 = \ {0, 1 \}, V ^ n_2 = \ {0, 1 \} ^ nV_2 = \ {0, 1 \}, V ^ n_2 = \ {0, 1 \} ^ n . An unsuccessfully selected substitution can lead to a decrease in the strength of the entire cipher, and in the worst case to a practical attack on the cipher.
I will not step by step analyze the AES algorithm itself, this has been done more than once on this resource: for example, you can read such wonderful articles as this one , this one , and, of course, the official documentation of the cipher.
Decision
I will try to reproduce the course of solving this problem from the point of view of an inexperienced cryptographer (myself) in order to show the logic of thinking in the case when you absolutely do not know what to take in the first place.
Intelligence service
Obviously, the key feature of this task is a vulnerable substitution (S-box), so my investigation of the problem began with the construction of a table of differential characteristics. Although the classical differential method is impractical with respect to the AES-like ciphers (the number of rounds is too large), some useful information about the modified S-box can be tried.
So, we build the table:
#!/usr/bin/env python3 # -*- coding: utf-8 -*- # Usage: python3 dc.py from itertools import product def dc(sbox): length = len(sbox) diff_table = [[0] * length for _ in range(length)] for c, d in product( *([list(range(length))]*2) ): diff_table[c ^ d][sbox[c] ^ sbox[d]] += 1 count_prob = 0 for c, d in product( *([list(range(length))]*2) ): if diff_table[c][d] == length: count_prob += 1 print('{} : {} -> {}'.format(diff_table[c][d], c, d)) return count_prob == length if __name__ == '__main__': # sboxm = [ <_> ] print(dc(sboxm))
And we are not surprised at the result: it turns out that the table has a degenerate appearance - in each of its rows there is a record with a probability of 256/256 (the remaining entries are respectively 0).
Next comes the assumption that this kind of diff. the table of the modified substitution is due to the construction features, i.e., it is artificially generated. In this case, it would be nice to find a possible way to generate such substitutions.
From the basic course of cryptography, it is known that the main role of an S-box in a block cipher is to introduce non-linearity into a cryptogram (in other words, to maximally hide the statistical relationship between plaintext and ciphertext), and since this S-box is a weak point of AES -256-M, it can be assumed that it does not fulfill its objective function. So, most likely, this S-box is the result of applying a linear function to a range of numbers overline0,255 .
By an experimental method we find out that a simple affine function of the form f(x)=Mxoplusv where
v = \ begin {pmatrix} 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \ end {pmatrix} ^ Tv = \ begin {pmatrix} 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \ end {pmatrix} ^ T - binary 8-bit column vector.
Thus, our assumption turned out to be true: if we represent the bytes of the substitution in the form of binary vectors, then the modified substitution turns into the above affine transformation with a linear relationship between the input and output bit set.
Now you can even write a small script to generate such substitutions.
Generate degenerate permutations of length 2 ^ n
#!/usr/bin/env python3 # -*- coding: utf-8 -*- # Usage: python3 generate_affine_sbox.py <sbox_length> import numpy as np import random import math import sys from itertools import product # ---------------------------------------------------------- # ---------------------- AFFINE SBOX ----------------------- # ---------------------------------------------------------- """ Affine function: y(x) = M*x + v, where M is 8x8 boolean matrix, v is 8-bit constant vector column, * is bitwise AND (&), + is bitwise XOR (^). """ def affine_function(x, M, v): Mx = (np.dot(M, x) % 2).T y = Mx ^ v return y def S(x, M, v, bin_length): raw_value = list(affine_function(to_bin(x, bin_length), M, v).flat) return int(''.join([ str(b) for b in raw_value ]), 2) def generate_affine_sbox(length): bin_length = len(bin(length-1)[2:]) np.random.seed() while True: M = np.random.randint(0, 2, size=(bin_length, bin_length)) if np.linalg.det(M).astype(int) % 2: break v = np.random.randint(0, 2, size=(1, bin_length)) sbox = [ S(i, M, v, bin_length) for i in range(length) ] if not any_duplicates(sbox) and is_sbox_degenerate(sbox): return (sbox, M, v) return (None, None, None) # ---------------------------------------------------------- # ----------------------- UTILITIES ------------------------ # ---------------------------------------------------------- def to_bin(number, bin_length): return np.array([ [int(b)] for b in bin(number)[2:].zfill(bin_length) ]) def print_sbox(name, sbox, length): dim = math.sqrt(length) print('{} = {{'.format(name), end='') if dim.is_integer(): print('\n\t', end='') for i in range(length): if not (i % dim) and i: print('\n\t', end='') print('{: >5}'.format(sbox[i]), end=', ') print('.\n}') else: for item in sbox: print(item, end=', ') print('.}') def any_duplicates(sbox): seen = set() for item in sbox: if item not in seen: seen.add(item) else: return True return False def is_sbox_degenerate(sbox): length = len(sbox) diff_table = [[0] * length for _ in range(length)] for c, d in product( *([list(range(length))]*2) ): diff_table[c ^ d][sbox[c] ^ sbox[d]] += 1 count_prob = 0 for c, d in product( *([list(range(length))]*2) ): if diff_table[c][d] == length: count_prob += 1 #print('{} : {} -> {}'.format(diff_table[c][d], c, d)) return count_prob == length # ---------------------------------------------------------- # -------------------------- MAIN -------------------------- # ---------------------------------------------------------- if __name__ == '__main__': if len(sys.argv) != 2: print('Usage: python3 {} <sbox_length>'.format(sys.argv[0])) sys.exit(1) try: length = int(sys.argv[1]) except ValueError: print("Invalid input type") sys.exit(1) if length & (length-1) != 0 or length < 1: print("Sbox length must be a power of two and > 1") sys.exit(1) while True: sbox, M, v = generate_affine_sbox(length) if sbox: print('M = \n{}\n'.format(repr(M))) print('v = \n{}\n'.format(repr(v))) print_sbox('Affine Sbox', sbox, length) break
Preparing to attack
So, the only component of the cipher, which was supposed to be non- linear, turned out to be very linear, so it is not difficult to guess in which direction to go further.
Let us find an alternative bit-oriented representation of all linear transformations of the AES Rijndael cipher such that it is possible to βinlineβ the linear now operation of applying the substitution to each block byte into one large linear transformation.
Subbytes
SubBytes operation, just representing the transformation S in LSX cipher, now looks like amapstoMaoplusv where a - encrypted block M and v described above.
Therefore, the bit-oriented representation of the SubBytes operation is an example of the form:
AmapstoSBβAoplusV where A - 128-bit block representation a ;
SB = \ begin {pmatrix} M & 0 & 0 & ... & 0 \\ 0 & M & 0 & ... & 0 \\ 0 & 0 & M & ... & 0 \\ ... & ... & ... & ... & ... \\ 0 & 0 & 0 & ... & M \ end {pmatrix}SB = \ begin {pmatrix} M & 0 & 0 & ... & 0 \\ 0 & M & 0 & ... & 0 \\ 0 & 0 & M & ... & 0 \\ ... & ... & ... & ... & ... \\ 0 & 0 & 0 & ... & M \ end {pmatrix} ; V=beginpmatrixvvv...vendpmatrix .
ShiftRows
Referring to [1] to determine the matrix responsible for the ShiftRows operation. This matrix has the following form:
Here it is more difficult: you need the original MDS-matrix above the field GF(28) lead to equivalent form over GF(2) . To do this, recall that the algebraic structure of the field GF(28) follows from consideration of the polynomial ring (variable x ) with coefficients from GF(2) modulo some polynomial f(x) degree 8 ( aka ring factor GF(2)[x]/f(x) ). For AES f(x)=x8+x4+x3+x+1 - irreducible over GF(2) polynomial of the 8th degree.
Next: any item from GF(28) Imagine as the sum of basic \ {1, x, x ^ 2, \ dots, x ^ 7 \}\ {1, x, x ^ 2, \ dots, x ^ 7 \} . Then remember the results of multiplying the element. 02=x on all baseline:
Since distributivity in addition and multiplication is fulfilled, then in combination with the arguments above, the element 02 can be represented as a matrix with coefficients from GF(2) :
Then it is obvious that the element 01 goes to the identity matrix C1=I8x8 and the item 03 will go into C3=C2oplusC1 .
To test our reasoning, you can look in [2] , where a similar matrix is ββproposed. C2 for bit-oriented multiplication by an element 02 of GF(28) .
As a result, taking into account that during the original transformation of MixColumns, the row of the MDS matrix is ββmultiplied by the state column , we compose the matrix MC for the bit-oriented representation of this transformation (under the spoiler).
Where mkey=L14K0oplusL13K1oplusL12K2oplusdotsoplusLK13oplusK14 - conditional "master key", which can be used to decrypt other blocks (because encryption is carried out in ECB mode):
And here is the very "magic" formula for decrypting the remaining ciphertext blocks, because of which we have gathered today:
Pi=P0oplusLβ14(C0oplusCi).
Note
In the original implementation of AES Rijndael, the MixColumns operation is omitted in the last round of the cipher because of its uselessness at this stage. This solution does not take this into account (that is, MixColumns is used after addition with the last round key) to simplify the demonstration of the result.
Conclusion, code, literature
Lβ14
First I would like to note that the matrix Lβ14 can be easily obtained using the Sage algebraic apparatus:
A small software demo that will show "live" how the hacking goes is laid out on Github .
Conclusion
What can be said about tasks of this kind? Such tasks are an excellent example of how such a seemingly insignificant modification of the original algorithm as a βcustomβ substitution (which seems to even look pseudo-random, even more so - passes some of the NIST statistical tests, see the spoiler below ) can degenerate international standard encryption in a trivial affine transformation with a well-known matrix.
NIST Statistical Test Suite
The changed substitution has passed:
Monobit Frequency Test;
Block Frequency Test;
Runs Test;
Serial Test;
Approximate Entropy Test;
Cumulative Sums Test.
Thanks for attention!
Bibliography
Algebraic Aspects of the Advanced Encryption Standard by Carlos Cid, Sean Murphy, Matthew Robshaw
Advances in Cryptology - CRYPTO 2015 - 35th Annual Cryptology Conference, Santa Barbara, CA, August 16-20, 2015, Proceedings, Part 1