Trivial encryption in malicious files

Hi. This is Alexey Malanov from Kaspersky Lab again. Last time I talked about our experience of hiring virus analysts; today I'll talk about what virus writers do to keep their work from being noticed, and what we do to make their efforts futile.

Generally speaking, an attacker writes a malicious program knowing almost for certain that sooner or later it will land on a virus analyst's "operating table", and that any information in the malware can be used against its author. What does he need to hide? First, his identity: sometimes you come across strings in malware such as C:\Users\Vasiliy Ivanov\Documents\Visual Studio 2005\earnings\trojan\Release\trojan.pdb. Second, a good deal of information that makes the malware easier to analyze. Let's look at some of the virus writers' tricks and find out why they are useless.

What they hide:

Internet links

Trojan-Downloader samples mainly download or update other malware from one or several links. Obviously, if the links are stored in plain form, any anti-virus company's automatic processing system will easily extract them, add them to its list for periodic downloading, and block access to those resources. The Trojan-Downloader then loses its purpose entirely. Other classes of malware also tend to download new versions of their modules.

Components

Malware is now complex and multi-component. One module is responsible for updates, a second is a rootkit driver, a third collects passwords, a fourth receives commands from the control center and sends back the results. Very often all of this is packed into a single "installer". It may be a special third-party Trojan-Dropper or simply another component by the same author. Just as with URLs, if the embedded PE EXE modules are not hidden well, automatic tools will easily pull them out, and all the files will be sent for automatic analysis and detection. Moreover, if even one of the modules is recognized as malicious, the whole bundle will certainly not need manual analysis, and everything will be detected successfully.

Passwords

Believe it or not, authors even embed their own passwords in malicious files. Not often, of course, but it happens. For example, if a Trojan-Spy takes a lot of screenshots (or even records from the victim's camera), the usual way to deliver the material to the author is to upload it to an FTP server. The virus writer creates a single account with read and write access and, so that no outsider wanders onto his FTP server, sets a password, which ends up in the malware code. As a result, it is not just anyone who wanders in, but one specific independent researcher, who erases everything stolen and leaves the hapless author a little note. I will note that this is "unlawful access to computer information", but the criminal is not going to run to the police to file a complaint.

Another example is a mailbox password. The author needs to send out the passwords collected from the victim, but the mail service where he registered the mailbox does not let just anyone send mail and requires authentication when sending. Of course, "serious" malware authors do not make such mistakes, especially those who are after banking information. Others, on the contrary, sometimes deliberately count on their code being read by an analyst. Nowadays almost everything goes through automated systems, but 10 years ago, when I analyzed files by hand, I came across this appeal to the virus analyst inside a sample: "This is Pinch, which nobody detects, and I would like it to stay that way. You do not need to do anything. Thanks."

Trivial encryption:

Hence the conclusion: virus writers have something to hide inside their creations, and they try to hide it. If you know perfectly well what the XOR and ROL bit operations are and understand how data can be transformed with them, you can skip this entire section.

Suppose you have the string http://secret.url and you want to hide it in the code from prying eyes. The decryption algorithm and the key will then have to sit in the same code, because you will need this string yourself. Perhaps the simplest way is the following.

NOT

Let's represent the string as a chain of bits, 0110100001110100..., and replace each bit with its opposite: 1001011110001011... We get the string "ЧЛЛП┼╨╨МЪЬНЪЛ╤КНУ" (as rendered in DOS code page 866). The naked eye no longer recognizes the string, but a program still can. Apply the NOT operation to the whole file: whatever was encrypted is decrypted, and whatever was not is encrypted. In the resulting file the URL is visible both to automated tools and to the eye.

XOR

Let's complicate things slightly. When encrypting, you can invert not every bit but, for example, every fourth one (so that nobody guesses). Such an inversion of selected bits is called the XOR operation. It is written a := a XOR key, where a is the byte being transformed and key specifies which bits to flip.

In these terms, the NOT operation is equivalent to a := a XOR FF. Hexadecimal FF equals binary 11111111, so every bit is inverted.

To decrypt, you XOR the data again with the same key. I leave it to the curious reader to verify that this yields exactly the original. Do not forget that, unlike the previous method, here you have to store the key in the program if you want to decrypt your data at run time. The analyst, in turn, has to either find the key or brute-force it in order to decrypt the string. There is another way, however, but more on that below.
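
For readers who prefer to check the round trip by running it, here is a minimal Python sketch. The helper name xor_bytes and the sample key 0x11 (binary 00010001, every fourth bit) are illustrative assumptions, not something taken from a real sample.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key (0..255)."""
    return bytes(b ^ key for b in data)


url = b"http://secret.url"

# NOT is the special case key = 0xFF: every bit of every byte is flipped.
hidden = xor_bytes(url, 0xFF)
# Applying the same operation again restores the original.
restored = xor_bytes(hidden, 0xFF)

# Key 0x11 flips only every fourth bit; decryption is again the same call.
partly_flipped = xor_bytes(url, 0x11)

print(hidden.hex())
print(restored.decode("ascii"))
print(xor_bytes(partly_flipped, 0x11) == url)
```

The same function serves as both encryptor and decryptor, which is exactly why the key has to travel inside the program.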

Other methods

What else can be done:

Apply the ADD operation to each byte, that is, add a fixed number (the key) to it.
Change the key from byte to byte. For example, XOR the first byte with key 71, the second with 73, the third with 75, and so on. The key can then no longer be brute-forced on its own: two values have to be brute-forced (the initial key and the step), which takes 256 times longer.
Increase the key length. For example, XOR the first DWORD with a four-byte key, then the second DWORD with the same key, and so on. Decrypting without the key then means trying 4 billion variants. And if you (or the brute-forcing program) do not know exactly where in the file the information you are looking for is, you have to do this at every byte offset (malware is usually several hundred kilobytes in size).
Apply the cyclic shift ROL to each byte. But only rotations by 1 to 7 bits are of any use for byte-wise encryption: rotating a byte by 8 bits gives back the same byte.
XOR twice. One malware author encrypted a string. That did not seem enough to him, so he XORed it once more with the same key. Guess what came of it.
Write the string backwards.
Write the string in every other byte. For ASCII characters this gives Unicode.
Shift the string by half the alphabet, that is, a->n, b->o, ..., z->m, 0->5, 1->6, ...
Write the characters in hex, that is, 687474703A2F2F
Build the string on the stack:
...
push 'w//:'
push 'ptth'
And so on.

Obviously, all of these methods can be combined for extra secrecy :)
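
Several of the transformations listed above fit in a line or two of Python each. This is only a sketch: the function names are made up for illustration, and the key 71 with step 2 is read as decimal, which is an assumption about the example in the list.

```python
def rol8(b: int, n: int) -> int:
    """Rotate an 8-bit value left by n bits."""
    n %= 8
    return ((b << n) | (b >> (8 - n))) & 0xFF


def add_key(data: bytes, key: int) -> bytes:
    """Add a fixed key to every byte, modulo 256."""
    return bytes((b + key) & 0xFF for b in data)


def xor_rolling(data: bytes, key: int, step: int) -> bytes:
    """XOR byte number i with (key + i * step) modulo 256."""
    return bytes(b ^ ((key + i * step) & 0xFF) for i, b in enumerate(data))


def widen(data: bytes) -> bytes:
    """Put a zero after every byte: ASCII text becomes UTF-16LE."""
    return data.decode("ascii").encode("utf-16-le")


s = b"http://"
print(add_key(s, 3).hex())                  # ADD with key 3
print(xor_rolling(s, 71, 2).hex())          # keys 71, 73, 75, ...
print(bytes(rol8(b, 3) for b in s).hex())   # ROL by 3 bits
print(s[::-1])                              # written backwards
print(widen(s))                             # every other byte
print(s.hex().upper())                      # 687474703A2F2F
```

Note that calling xor_rolling twice with the same key and step returns the input unchanged, which answers the "XOR twice" riddle.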

Decryption:

Let's define the task. There is a 1 MB file. Somewhere inside it, at an unknown position, a string beginning with http:// is encrypted with one of the methods described above. The key is unknown. You need to write a program that processes the file and extracts this string without an epic brute-force search. Something else may be encrypted instead, for example the MZ/PE header of an embedded file, but on one condition: you know exactly what you are looking for. Otherwise it is a rather thankless task to look for unknown garbage encrypted by an unknown method in an unknown place.

A small digression. Kaspersky Lab now uses automatic analysis methods that do not care what is encrypted or how, even if the algorithm is a very strong one (unlike those described here). If the data is used by the program itself, we will extract it anyway. Below I will merely demonstrate that the encryption methods described do not protect the virus writer and pose no difficulty even for an amateur analyst. No revelations. Watch closely.

XOR

Remember the URL to which the NOT operation was applied? "ЧЛЛП┼╨╨". Once you have seen these letters, you will always guess that this is a URL. And here is the same prefix XORed with the key F0: "ШДДА╩▀▀". Or with 0F: "g{{⌂5  ". Notice that the second and third characters are equal, as are the last two. A "useful" property of the byte transformations described is that they map equal bytes to equal bytes. But XOR has another useful property: the operation is reversible. That is, from 'Ш' = 'h' XOR key it follows that key = 'Ш' XOR 'h'. This means that if the file contains some sequence ЙЦУКЕНГ, we take key := 'Й' XOR 'h' and then check that
'Ц' == 't' XOR key
'У' == 't' XOR key
'К' == 'p' XOR key
'Е' == ':' XOR key ...
If so, we have found the URL and know the key.
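
The check described above takes a single pass over the file, with no loop over the 256 possible keys. A minimal sketch in Python follows; the function name find_xor_url and the test blob are invented for illustration.

```python
from typing import Optional, Tuple

KNOWN = b"http://"


def find_xor_url(blob: bytes, known: bytes = KNOWN) -> Optional[Tuple[int, int]]:
    """Return (offset, key) of the first place where blob XOR key == known."""
    for off in range(len(blob) - len(known) + 1):
        # Hypothesis: the first byte here is 'h' XOR key.
        key = blob[off] ^ known[0]
        # Validate the hypothesis on the remaining known bytes.
        if all(blob[off + i] ^ key == known[i] for i in range(1, len(known))):
            return off, key
    return None


# Hypothetical test data: 9 bytes of junk, then the URL XORed with 0xF0.
secret = bytes(b ^ 0xF0 for b in b"http://secret.url")
blob = b"\x00\x13garbage" + secret + b"tail"

print(find_xor_url(blob))  # (9, 240)
```

A match with key 0 simply means the URL was stored unencrypted, so the same scan also covers the plain case.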

ADD

Similarly, if 'Й' = 'h' + key, then key = 'Й' - 'h'. Then we check that the rest of the URL really is a URL.

But what if the attacker changed the key linearly from byte to byte?
'Й' = 'h' + key
'Ц' = 't' + (key + d)
'У' = 't' + (key + 2d)
'К' = 'p' + (key + 3d) ...

Same thing:
key = 'Й' - 'h'
key + d = 'Ц' - 't'
The first two bytes give key and d, and the remaining known bytes confirm the guess.
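
The same idea as a Python sketch, covering both the plain ADD case (d = 0) and the linearly changing key. The names find_add_linear and add_linear, the key 0x47 and the step 2 are assumptions chosen for the demonstration.

```python
from typing import Optional, Tuple

KNOWN = b"http://"


def add_linear(data: bytes, key: int, d: int) -> bytes:
    """Encrypt: byte i becomes data[i] + key + i * d (mod 256)."""
    return bytes((b + key + i * d) & 0xFF for i, b in enumerate(data))


def find_add_linear(blob: bytes, known: bytes = KNOWN) -> Optional[Tuple[int, int, int]]:
    """Return (offset, key, d) for the first window matching known."""
    for off in range(len(blob) - len(known) + 1):
        key = (blob[off] - known[0]) & 0xFF            # key     = c0 - 'h'
        d = (blob[off + 1] - known[1] - key) & 0xFF    # key + d = c1 - 't'
        # The remaining five known bytes validate the pair (key, d).
        if all((known[i] + key + i * d) & 0xFF == blob[off + i]
               for i in range(2, len(known))):
            return off, key, d
    return None


blob = b"MZ\x90\x00junk" + add_linear(b"http://secret.url", 0x47, 2) + b"\x00\x00"
print(find_add_linear(blob))  # (8, 71, 2)
```

Two ciphertext bytes fix both unknowns, so the "256 times longer" brute force never has to happen.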

If the author used a four-byte key:
'ЙЦУК' = 'http' XOR key32

we can still compute the key and even validate it on the three remaining known bytes ("://"). But if the key is as long as the string being sought, the approach described does not work. For some reason malware authors do all sorts of things, but not the one thing that would actually help them.
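
A sketch of the four-byte case in Python. The names find_xor32 and xor32 and the key DEADBEEF are illustrative assumptions. With only three bytes left for validation, an occasional false positive in a large file is possible, so a real tool would also check that the decoded tail looks like a URL.

```python
from typing import Optional, Tuple

KNOWN = b"http://"


def xor32(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating four-byte key."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(data))


def find_xor32(blob: bytes, known: bytes = KNOWN) -> Optional[Tuple[int, bytes]]:
    """Return (offset, key32) for the first window matching known."""
    for off in range(len(blob) - len(known) + 1):
        # 'http' XOR ciphertext gives the four key bytes at this offset.
        key = bytes(blob[off + i] ^ known[i] for i in range(4))
        # '://' is the check: the key must repeat on bytes 4..6.
        if all(blob[off + i] ^ key[i % 4] == known[i]
               for i in range(4, len(known))):
            return off, key
    return None


secret = xor32(b"http://secret.url", bytes.fromhex("deadbeef"))
blob = b"\x01\x02\x03" + secret
off, key = find_xor32(blob)
print(off, key.hex())
print(xor32(blob[off:], key))
```

If the string does not start on a key boundary, the recovered key is simply a rotation of the real one, and decryption from that offset still works.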

Combining

So, for example, they combine methods: write the URL backwards, in every other byte, and then XOR it with a varying key on top. Yes, this can confuse some automated tools. In practice it does not prevent such files from being detected successfully. And as soon as a virus analyst has taken the malware apart, he adds the method Reverse(Unicode(LinearXor(stream))) to the decoder, after which all old and new versions are decoded automatically.
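
How such a composed decoder might look is sketched below in Python. This is not the actual decoder used in any product: the helper names (linear_xor, narrow, reverse, chain) are hypothetical, and the key 0x47 with step 2 is assumed to have been recovered already.

```python
from functools import reduce
from typing import Callable

Decoder = Callable[[bytes], bytes]


def linear_xor(key: int, step: int) -> Decoder:
    """XOR byte i with (key + i * step) mod 256; it is its own inverse."""
    return lambda data: bytes(
        b ^ ((key + i * step) & 0xFF) for i, b in enumerate(data))


def narrow(data: bytes) -> bytes:
    """Undo the ASCII to UTF-16LE widening by keeping every other byte."""
    return data[::2]


def reverse(data: bytes) -> bytes:
    return data[::-1]


def chain(*decoders: Decoder) -> Decoder:
    """Compose decoders; they are applied from left to right."""
    return lambda data: reduce(lambda acc, f: f(acc), decoders, data)


# What the malware author did: reverse, widen, then XOR with a linear key.
plain = b"http://secret.url"
wide = plain[::-1].decode("ascii").encode("utf-16-le")
stream = linear_xor(0x47, 2)(wide)

# Reverse(Unicode(LinearXor(stream))): the innermost step runs first.
decode = chain(linear_xor(0x47, 2), narrow, reverse)
print(decode(stream))
```

Once each primitive exists, a new combination costs the analyst one line, while the malware author had to write and debug the whole encoder.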

Watermarks

Another application of the encryption methods described is watermarks in a program. Suppose you are an honest programmer and you give a copy of your program to the customers Vasily and Georgy. No, better to Edward and Gregory. And you want to know whether one of them will make the program publicly available. In the source code you can write:
#pragma data_seg(".text")
extern char WATERMARK[] = "uQPW►▀}_▲WTW►"; // "Copy of Edward" XOR FF

After compilation the encrypted string ends up in the executable file, but it is not easy to find: nobody knows what to look for or where.

However, as we have just seen, if Edward's copy does become publicly available, Gregory will have two versions of the program, and he will easily find the difference between them and break the cipher. It gets worse for you: Gregory will then be able to forge other people's copies.
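
Gregory's side of the story can be sketched in a few lines of Python. Everything here is hypothetical test data: the 16-byte zero-padded mark, the stand-in program bytes, and the assumption that Gregory guesses his own name sits where the copies start to differ.

```python
from typing import List, Tuple


def diff_ranges(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where a and b differ."""
    ranges, start = [], None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = i
        elif x == y and start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, min(len(a), len(b))))
    return ranges


def mark(owner: str, key: int = 0xFF, size: int = 16) -> bytes:
    """Build a fixed-size watermark 'Copy of <owner>' XORed with key."""
    text = f"Copy of {owner}".encode("ascii").ljust(size, b"\x00")
    return bytes(c ^ key for c in text)


program = b"\x55\x8b\xec" * 10   # stand-in for identical code bytes
copy_e = program + mark("Edward") + program
copy_g = program + mark("Gregory") + program

(start, end), = diff_ranges(copy_e, copy_g)
key = copy_g[start] ^ ord("G")   # known plaintext: his own name
print(start, end, hex(key))
print(bytes(c ^ key for c in copy_e[start:end]))
```

The common prefix "Copy of " encrypts identically in both copies, so the diff points straight at the names.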

Conclusion:

That is about all. We have looked at some trivial encryption methods and found out why they are useless. If you have questions or comments, great, I am looking forward to them. If the material seemed too simple, here is a real-life example on the same topic from the field of virus analysis.

A virus (a file infector) embeds its body in the infected file. The body is constant, but the virus XORs it DWORD by DWORD with pseudo-random numbers, like this:
srand(key);
crypted[0] = body[0] XOR rand();
crypted[1] = body[1] XOR rand();
...

The body's decryptor, together with the key, is "smeared" across the victim's code (entry-point obscuring), so you will have to detect the encrypted body itself. For example, you can assume that the body of the virus is the code section of notepad.exe.

It is known that the pseudo-random number generator is the algorithm most commonly found in compilers' runtime libraries:

seed = seed * 1103515245 + 12345;
return (seed % ((unsigned int)RAND_MAX + 1));


/*
 * This file is part of the libpayload project.
 *
 * It was originally taken from the OpenBSD project.
 *
 * Copyright (c) 1990 The Regents of the University of California.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

#include <libpayload.h>

static unsigned int next = 1;

int rand_r(unsigned int *seed)
{
	*seed = *seed * 1103515245 + 12345;
	return (*seed % ((unsigned int)RAND_MAX + 1));
}

int rand(void)
{
	return (rand_r(&next));
}

void srand(unsigned int seed)
{
	next = seed;
}
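
To experiment with the puzzle, the scheme can be modeled in Python. This is a sketch under two stated assumptions that the listing itself does not confirm: unsigned int is 32 bits wide, and RAND_MAX is 0x7FFFFFFF. The class name Rand, the function crypt and the sample DWORDs are hypothetical.

```python
from typing import List

RAND_MAX = 0x7FFFFFFF  # assumption: the value is not shown in the listing


class Rand:
    """Model of srand()/rand() from the listing, with 32-bit unsigned state."""

    def __init__(self, seed: int) -> None:
        self.next = seed & 0xFFFFFFFF

    def rand(self) -> int:
        self.next = (self.next * 1103515245 + 12345) & 0xFFFFFFFF
        return self.next % (RAND_MAX + 1)


def crypt(body: List[int], key: int) -> List[int]:
    """XOR each DWORD of body with the next rand() value after srand(key)."""
    rng = Rand(key)
    return [dword ^ rng.rand() for dword in body]


body = [0x90909090, 0xCCCCCCCC, 0x12345678]   # hypothetical body DWORDs
crypted = crypt(body, key=1)

print([hex(d) for d in crypted])
print(crypt(crypted, key=1) == body)
```

Encrypting the same body with a few different keys and comparing the results is a reasonable first step toward answering the question below.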

Can this virus be detected quickly by a program (or even in your head)?

Source: https://habr.com/ru/post/207654/
