📜 ⬆️ ⬇️

The method of recovering data from a disk encrypted with NotPetya using the Salsa20 algorithm



Attacks using encryption viruses have become a real trend of 2017. There were a lot of similar attacks, but the loudest of them were WannaCry and NotPetya (also known by many other names - Petya, Petya.A, ExPetr and others). Having mastered the experience of the previous epidemic, experts around the world promptly reacted to the new challenge and, in a matter of hours after the first computers were infected, they began to study instances of encrypted disks. Already on June 27, the first descriptions of methods of infection and distribution of NotPetya appeared, moreover, a vaccine against infection appeared .

When you run NotPetya, the AES algorithm encrypts user files with certain extensions, but the operating system continues to work. Limited time is allowed for encryption (the default is 1 hour). And if during this time the encryption process has time to complete, the README.TXT file with the ransom request will be placed in the root of the disk. Unfortunately, to recover files encrypted in this way requires knowledge of the RSA secret key (which, it seems, is offered to be bought on the Darknet for 100 bitcoins). If encryption was not completed, or it was interrupted, or there was no right to write to the disk root, the README.TXT file (containing the encrypted key) will not be created, and the AES-encrypted files will not be able to recover even if they receive the RSA secret key.
')
The following data recovery method is applicable in cases where the NotPetya virus had administrative privileges and encrypted the entire hard disk using the Salsa20 algorithm.

This is the second layer of encryption. However, decoding Salsa20 is advisable for several reasons:


And Salsa20 encrypts all data, regardless of type, time and access rights.

As it turned out , the authors of the virus made an error in the implementation of the Salsa20 algorithm, due to which half of the encryption key bytes were not used at all. Reducing the key length from 256 to 128 bits, unfortunately, leaves no hope of finding it within a reasonable time.

However, due to some features of using the Salsa20 algorithm, it is possible to recover data without knowing the key.

How Salsa20 works


Recall that Salsa20 is a synchronous stream cipher in which when encrypting a gamma is generated, depending on the key, and the bytes of this gamma are added using XOR operation with plaintext bytes. To decrypt, repeat the procedure.

So that the gamma can be calculated for any displacement in the stream, the gamma generator s20_expand32 () forms a 64-byte keystream array, in which the following are mixed:


This illustration, taken from the Check Point report , clearly shows how the data is laid out:



The 64 bytes of the keystream are passed through the shuffle function, and the resulting 64 bytes are used as a fragment of the gamma.

It should be noted that the generated gamma fragments are always aligned with a boundary multiple of 64 bytes. And in order to encrypt, say, 7 bytes starting at offset 100, you need to find the block number that contains the first byte (100/64 == 1), calculate the gamma for this block and use 7 bytes from it starting at offset (100% 64 == 36). If there are not enough bytes in the block, then a gamma is generated for the next block, etc.

In the process of encrypting one stream (and the disk, from the point of view of NotPetya, is one stream), no key or nonce change occurs. Therefore, for each encrypted disk, the only variable value in the keystream is the block number.

As planned by the authors of the Salsa20 cipher, 2 ^ 64 blocks of 64 bytes each make it possible to generate a gamma with a period of 2 ^ 70 ~ 10 ^ 21 bytes. This is a long enough period for almost any practical application, and hard drives of this size will not appear for a very long time.

However, the implementation is not so smooth.

Real gamma period in NotPetya


Let's look at the prototype of the function s20_crypt32 (), through which calls the disk sectors are encrypted.

enum s20_status_t s20_crypt32(uint8_t *key, uint8_t nonce[static 8], uint32_t si, uint8_t *buf, uint32_t buflen) 

Through the argument si (probably the Stream Index), the byte offset in the stream is passed. And by the type of argument it is clear that there is not 64, but only 32 bits. This value falls into the keystream after dividing by 64, that is, a maximum of 26 bits remains.

 // Set the second-to-highest 4 bytes of n to the block number s20_rev_littleendian(n+8, si / 64); 

And now we will look at one more picture taken from the same report .



On it, those bytes are highlighted in gray, which do not affect gamma formation due to the implementation error of the function s20_rev_littleendian (). Thus, of the 26 bits of the block number, only 16 bits (bytes at offset 0x20-0x21) will affect the keystream. Consequently, the maximum gamma period will be 2 ^ 16 = 65536 blocks of 64 bytes, or 4 megabytes.

The volume of encrypted data is likely to significantly exceed 4 megabytes, so many different pieces of data will be encrypted on the same gamma fragments. This allows you to implement a trivial attack based on the well-known plaintext.

And one more mistake


At this developer flaws do not end there. When you call the function s20_crypt32 (), instead of the offset value in bytes, they transfer ... the number of the 512-byte sector!

Sectors are usually encrypted in pairs (1024 bytes in one access), which means that the gamma used to encrypt two adjacent pairs of sectors will match in 1022 bytes (with an offset of 2 bytes).

Known Plaintext Attack Heuristics


Modern versions of Windows use the NTFS file system, which uses quite a lot of different structures, the lion's share of fields in which can be easily predicted.
In addition, the disc contains many files whose contents are easy to predict (in part or in full).

Introductory 512 bytes gamma



To verify the correctness of the encryption key, NotPetya encrypts sector 0x21 containing the predefined values ​​(all bytes 0x07). This gives us 512 bytes of gamma.

MFT gamma recovery


NotPetya does not encrypt the first 16 entries in the MFT (32 sectors), but encrypts all the rest.

It is known that each file entry begins with the sequence “FILE”, then bytes 30 00 03 00 usually go (UpdateSequenceArrayOffset = 0x30, UpdateSequenceArrayLength = 3). Theoretically, these 4 bytes may have other values, but they are almost always the same for all file entries within the same NTFS logical volume.

Thus, from one file record (occupying two sectors) we can get 8 bytes of gamma, and each adjacent record will give us two additional bytes (and the ability to check the 6 bytes received earlier). The last entries are almost entirely zeros, which can give up to 1024 bytes of gamma.

And by restoring the gamma fragments used to encrypt the MFT, we can completely restore the file system structure.

Recover gamma from known files


The extorter encrypts the first two sectors of each file that is longer than 1024 bytes. In this case, the cluster size is usually larger than two sectors (for example, 8). In this case, finding the encrypted beginning of any file and skipping 1024 bytes, we can easily get the next 3 kilobytes in the clear. And if we have a file in which at the offset of 1024 bytes from the beginning there are exactly the same 3 kilobytes - with high probability and the beginning of the file will coincide. And we will get up to 1024 bytes of gamma.

If you put a “clean” Windows XP and look into the Windows folder, you can find 8315 files there. In actively used Windows 8.1 files will be more than 200 thousand. The chances that many of them will match the files on the encrypted disk are more than enough.

So, if you index DLL- and EXE-files from the available Windows installations (preferably the same version and with a close set of updates), it may be enough to restore the gamma completely.

And having received gamma fragments, you can try to recover and unique files.

Perspectives and pitfalls


The complexity of the "manual" recovery of an encrypted disk is that this process takes considerable time (hours) and requires large amounts of free disk space. Few of the users have an empty disk, the volume of which is not less than the volume of the encrypted one. And to experiment on the affected original is akin to suicide.

So it is unlikely that in the near future there will be a utility that will allow everything to be restored "quickly and without SMS." Rather, we can expect the emergence of services for a more complete data recovery - in comparison with what is offered now.

Companies that specialize in data recovery are more likely to handle the task of developing related software. They should have a decent experience in solving such problems.

However, we must not forget that the algorithm for selecting the sectors to be encrypted (and therefore need to be decrypted) also contains errors (for example, when parsing NTFS structures), and this may affect the result.

Data recovery from the hard disk by this method requires the use of heuristics. The degree of recovery depends on many factors (disk size, degree of its filling and fragmentation) and can reach 100% for large disks containing many “public” files (components of the operating system and software products that are identical on many machines).

We repeat that the method proposed in the article, unfortunately, will not allow decrypting files that were encrypted with the AES algorithm used by NotPetya, if it could not receive administrative privileges at startup.

I would like to express my gratitude to Alexander Peslyak (Solar Designer) for the suggestive clues that allowed me to come up with the method described above.

Author: Dmitry Sklyarov, Head of Application Analysis, Positive Technologies.

Source: https://habr.com/ru/post/332618/


All Articles