Viruses. Viruses? Viruses! Part 2

As promised in the previous part, we continue our review of virus engines. This time we will talk about polymorphism of executable code. For computer viruses, polymorphism means that each newly infected file contains a new variant of the virus code. In theory this should be a real nightmare for an antivirus: if a virus changed 100% of its code in every generation, and did so in a truly random way, no signature analysis could detect it.

Maybe somewhere there is a super programmer who actually wrote such code, and that is exactly why we know nothing about it. I do not really believe this. It even seems likely that mathematicians working on the theory of computation could prove that no specific polymorphism algorithm exists whose output cannot be detected 100% of the time by some other specific algorithm. But we are simple people: we are just interested in the idea of code that changes itself, and in the light of "algorithm versus algorithm", the contest between methods of hiding executable code and methods of detecting it should be very interesting to a programmer.

Recalling and slightly extending the first article

Let us recall our heroes from the first article, the virus writer (the "virmaker") and the developer at the anti-virus company (the "AVer"), and pair them with their karmic twins: the developer of attached protection (a protector wrapped around a finished executable) and the cracker. The former try to hide executable code and information about it; the latter try to get at the characteristic code and its internal algorithms. Automatic methods prevail in the virus world (the virus engine and the antivirus detector), while manual methods prevail in the protection world: the parameters of an attached protection are set by hand, and cracking software is also manual work, despite the abundance of auxiliary tools.
At the end of the first article we had a virus that can correctly infect an executable file (that is, when the file is launched it runs its own code and then correctly executes the code of the file itself) and an anti-virus detector that knows the virus sits in strictly defined places in the file, and that at a certain distance from a characteristic point (the entry point, the beginning of a section, the end of the header) there is a fixed set of bytes that characterizes the virus. Also, so that the article does not slide into a discussion of "what is wrong with viruses", let us agree that the payload of the virus does nothing. That way we can set aside all debate about what the code in question does and focus on the methods of detection and concealment.
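The detector's side of this starting point can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Signature` record, the `matches` helper and the toy byte values are invented for the example and do not come from any real anti-virus database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    name: str       # label reported on a match (hypothetical)
    offset: int     # distance in bytes from the characteristic point
    pattern: bytes  # fixed set of bytes that characterizes the virus


def matches(data: bytes, anchor: int, sig: Signature) -> bool:
    """Compare the pattern at a fixed distance from the anchor (e.g. the entry point)."""
    start = anchor + sig.offset
    if start < 0 or start + len(sig.pattern) > len(data):
        return False
    return data[start:start + len(sig.pattern)] == sig.pattern


# Toy "file": 16 filler bytes, a 4-byte marker, 8 more filler bytes.
image = bytes(16) + b"\xde\xad\xbe\xef" + bytes(8)
sig = Signature("Toy.A", offset=6, pattern=b"\xde\xad\xbe\xef")

print(matches(image, anchor=10, sig=sig))  # the marker sits exactly at 10 + 6
print(matches(image, anchor=11, sig=sig))  # shifted by one byte: no match
```

The second call shows the weakness that the rest of the article revolves around: shifting the code by a single byte is enough to defeat a comparison at a fixed offset.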

We continue to recall and extend the first article. The confrontation between a virus and an antivirus can be viewed by analogy with cracking a commercial program. In place of the infector there is the program that "hangs" protection on an executable file. It "damages" the code of the program, hides the information needed to restore it inside its own body and, in the same tricky way as a virus, conceals the algorithm of its work: it first executes the protection code, which checks the validity of the serial number or the trial period, and then, having "repaired" the main program, transfers control to it. The cracker, in turn, plays the role of the antivirus, trying to reach the internal protection code and learn its algorithm. It turns out that he does the same thing as the AV developer: he tries to find (and save) the characteristic part of the code. The only difference is that the cracker takes all of the original code and builds a working file from it, while the AV developer finds a characteristic piece in it and creates a signature.

The most popular way to remove such protection is to dump the program's in-memory image to disk. The cracker looks for the moment when the protection has done its job, decoded and "repaired" the main program, and a healthy code image of that program is in memory. In effect, he is looking for a chance to stop the program at the so-called OEP (Original Entry Point), the "old" entry point of the protected program. At this point the image in memory can be saved to disk. Of course, the saved image will not run as is, but it can be repaired by changing the Entry Point of the executable file so that it points to the OEP; if the program was fully functional at that moment, such an image will run and simply skip the protection (there are many more manipulations involved, such as restoring calls to external functions or taking multiple dumps when the program is never fully decoded; this is a topic for a dozen articles, but the main principle is as described). Another popular way is to find the piece of code that generates the serial number, rip it out if possible, and build a small executable that generates valid serial numbers (a keygen). As we will see below, a similar course of action is not alien to the antivirus detector.

I also like to draw analogies with biological systems; I will try not to burden you with them too much. I would really like to see artificial mind and artificial life soon.

Disassembler and debugger

Understanding the basic principles of how they work is important when discussing the protection of executable code. You probably know something about this already, since you are reading this article; if so, either bear with me for a while or just skip this section.

The disassembler accepts either an executable file or an abstract buffer with code and, which is quite important, the first address from which to start disassembling. In the case of an executable file this is, for example, the entry point. Having placed a pointer on the first instruction, the disassembler determines what the instruction is, its length in bytes, whether it is a jump instruction, which registers it uses, which memory addresses it refers to, and so on. If the instruction is not a branch instruction, the disassembler proceeds to the next one, moving the pointer forward by the length of the instruction. If it is an unconditional JMP or a CALL, the disassembler moves the pointer to the target address (for a CALL, the address of the following instruction also has to be queued, since execution normally returns there). If it is a conditional jump (JZ, JNA, etc.), the disassembler marks two addresses for analysis at once: the address of the next instruction and the address of the possible jump. If a byte combination is not recognized, disassembly of that branch stops. It is also worth mentioning that the disassembler records which instructions refer to each address (!), which makes it possible to identify function calls and, most importantly, who makes them.
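The traversal described above can be illustrated on a deliberately tiny instruction set. Everything in this sketch is an assumption made for the example: the five opcodes, their lengths and the one-byte absolute jump targets form a hypothetical toy encoding, not real x86, where the length and the target would come from a proper instruction decoder.

```python
from collections import defaultdict

# Hypothetical toy encoding: opcode -> instruction length in bytes.
# JMP and JZ carry a one-byte absolute target; OP carries one operand byte.
NOP, OP, JZ, RET, JMP = 0x90, 0x01, 0x74, 0xC3, 0xE9
LENGTH = {NOP: 1, OP: 2, JZ: 2, RET: 1, JMP: 2}


def traverse(code: bytes, entry: int):
    """Follow the control flow from `entry`, recording instructions and cross-references."""
    visited = {}              # address -> opcode of the instruction found there
    xrefs = defaultdict(set)  # jump target -> addresses of instructions that refer to it
    worklist = [entry]
    while worklist:
        pc = worklist.pop()
        while 0 <= pc < len(code) and pc not in visited:
            op = code[pc]
            size = LENGTH.get(op)
            if size is None or pc + size > len(code):
                break         # unrecognized bytes: this branch stops here
            visited[pc] = op
            if op == RET:
                break
            if op in (JMP, JZ):
                target = code[pc + 1]
                xrefs[target].add(pc)
                if op == JZ:
                    worklist.append(pc + size)  # conditional: also queue the fall-through
                pc = target
            else:
                pc += size
    return visited, xrefs


code = bytes([
    0x01, 0x05,  # 0: OP
    0x74, 0x07,  # 2: JZ 7
    0x01, 0x06,  # 4: OP
    0xC3,        # 6: RET
    0x90,        # 7: NOP
    0xE9, 0x04,  # 8: JMP 4
    0xFF,        # 10: junk that control flow never reaches
])
visited, xrefs = traverse(code, entry=0)
print(sorted(visited))
print({target: sorted(sources) for target, sources in xrefs.items()})
```

The junk byte at offset 10 is never decoded because no path leads to it, which is exactly why the start address matters so much for this kind of disassembly.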

A disassembler turns a sequence of bytes into a set of interlinked structures that store information about each byte: whether it is part of an opcode (operation code), data, an address that is jumped to from somewhere, and so on. Each structure can contain references to one or two following structures and at the same time be referenced by an arbitrary number of other structures (for example, the first instruction of a function that is called many times). Smart disassemblers can also track the stack pointer, or recognize and correctly mark for disassembly such constructs as: mov eax, 0x20056789; call eax; In addition, they can recognize typical functions by their instruction patterns, let you set starting points for manual disassembly, comment on individual instructions and save the result to disk, because building the call graph and marking up the structures is very costly, and you can tinker with a single file for days. But, as we discussed earlier, a jump in the file on disk may lead into an encrypted buffer, and then the disassembler will produce a mess of instructions or stop. In that case you need to capture the encrypted buffer at runtime, when it sits in memory in the clear, and this requires a debugger.

The main task of the debugger is to stop the program at the most interesting place. There are several ways to do this. You can open the process memory and write int 3 in place of one of the instructions; when the processor reaches it, it raises an exception, and the debugger handles it, opens its window, restores the original instruction and shows what is in this memory area. You can turn on the trace flag in the processor, and then the processor will raise this exception on every instruction. Finally, the processor has debug registers: you can put an address in them, and the processor will stop when that address is accessed. So, for example, by setting a breakpoint on access to the start address of the encrypted buffer, we stop the first time when the decryptor starts decrypting and reads the first byte of the buffer, and the second time when it passes control to the buffer. At that point the contents of the buffer can be written to disk, fed to a disassembler and stripped of all their secrets. In advanced protections there is no moment at which the complete working code is in memory: parts of the code are decrypted piece by piece as needed. In such cases the reverser has to assemble the dump from pieces.

Protecting code from analysis

Protecting executable code from analysis deserves a dozen articles of its own, so within the scope of this one we will focus on only a few points. Static protection covers the various methods of obfuscating executable code and encrypting the buffers that hold important parts of it, with decryption at runtime. Obfuscation can be implemented with special obfuscating compilers, and encryption with the polymorphic engines discussed below (a category that, as far as the code is concerned, includes commercial protections).

Dynamic protection means that the program can determine at runtime whether it is being debugged and take some action in response. For example, after reading the buffer with its own code, the program can compare its checksum with a reference value and, if the debugger has inserted int 3 into the code (see above) or modified the code in some other way, understand that it is being debugged. But perhaps the most reliable and portable way to find out that you are being debugged is to measure the execution time of characteristic code segments. The idea is simple: time is measured (in seconds, arbitrary units or processor cycles) between instructions in some buffer, and if it exceeds a certain threshold, then the program was stopped somewhere in the middle. A protection that has realized it is being debugged may, for example, skip the branches in which the reverser could stop and simply not trigger, while a virus may remove itself from the system. To cope with such situations, reversers work in controlled environments that can be easily reproduced, for example in virtual machines, where everything can be replayed, down to the BIOS settings. So, when examining the code of a virus or a protection, keep in mind that the program may well detect that it is being studied and behave differently.
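The checksum idea is easy to show on a plain byte buffer. The sketch below is an illustration only: it assumes CRC32 from Python's `zlib` as the checksum and patches a copy of the buffer by hand to imitate what a debugger does when it sets a software breakpoint; nothing is actually debugged or executed.

```python
import zlib

INT3 = 0xCC  # opcode of the one-byte software breakpoint instruction


def checksum(code: bytes) -> int:
    """Checksum of a code buffer; CRC32 is an arbitrary choice for the example."""
    return zlib.crc32(code)


# A few bytes of ordinary x86 code:
# push ebp; mov ebp, esp; xor eax, eax; pop ebp; ret
code = bytes([0x55, 0x8B, 0xEC, 0x33, 0xC0, 0x5D, 0xC3])
reference = checksum(code)

# Imitate a debugger placing a breakpoint on the "xor eax, eax" instruction.
patched = bytearray(code)
patched[3] = INT3

print(checksum(code) == reference)            # untouched buffer still matches
print(checksum(bytes(patched)) == reference)  # patched buffer does not
```

This also explains why hardware breakpoints in the debug registers are attractive to a reverser: they do not modify the code, so a check of this kind does not notice them.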

Polymorphic engines or “code has become smaller, engine is stronger”

Let us go back to virus engines. At a certain point in the history of DOS, after a heap of then-ultra-modern archivers had appeared, programmers began to pack not only ordinary files but everything that could be packed. ".exe" files take up a lot of space, and a fairly large part of such a file is executable code with a stable frequency distribution of byte groups, which probably compresses well with the right algorithm. So packers became the first step toward polymorphic engines.

The principle of a packer is quite simple:

take a buffer with executable code (a code section, for example);
pack it;
take the position-independent unpacker code and fill in the correct start and end addresses of the buffer with the packed code;
append a jump to the OEP (the first instruction of the unpacked code) to the end of the unpacker;
place the unpacker and the packed code buffer in the executable file (adjusting section sizes and/or the EP).

The resulting file is much smaller than the original. Once the new, ultra-cool 100 MB hard drives appeared, this stopped being so important, but packing opened up many new possibilities for virus writers and protection developers:

the size of the virus (despite our ultra-cool 100 MB hard drive) still matters. If the payload code is fat and multifunctional, the whole virus is harder to cram into a file, especially if you use something more cunning than adding a new section to the end of the file. Packing lets almost all of a large and complex virus fit into a buffer several times smaller than its original size.
the buffer with the packed code does not have to be placed in a section with the execute flag. For an advanced infector this is a very important factor, because the main body of the virus can safely be put anywhere. After unpacking, the unpacker must make sure that the memory into which the code was unpacked is allowed to execute. That is why the Windows API functions that work with memory access attributes (VirtualProtect, VirtualProtectEx, VirtualQuery, VirtualQueryEx and the like) invariably attract the attention of heuristics.
and for dessert, the most important thing: instead of packing, or after it, the buffer with the code can be encrypted, and the key can be put in the unpacker. Now it is no longer an unpacker but a decryptor. With each new infection (or each time protection is hung on an executable file) the buffer with the code can be encrypted with a new key, and then its contents will be completely new (provided, of course, that a good encryption algorithm is used). From here on I will not write "packed" separately; I assume that packing may be included in the encryption step.

This is, in fact, the first polymorphic engine. Let us write out the approximate infection algorithm in more detail:

Generate a new encryption key.
Take the decryptor code (where and how is discussed later; in the simplest case, just take the ready-made code from our own body).
Insert the new encryption key into the decryptor code.
Insert into the virus code the transfer of control from the victim file and back (while the code is not yet encrypted).
Encrypt the large buffer with the code using the new key.
Place the encrypted buffer in the victim file without much ceremony (it differs significantly from the previous one, so there is no particular need to hide it).
Append a jump to the beginning of the encrypted buffer to the end of the decryptor.
Place the decryptor in the victim file as cunningly as possible.

Let us see what we have got: most of the virus (the encrypted buffer) changes completely from file to file, and only a small decryptor remains unchanged. This decryptor contains several addresses (varying from file to file), a decryption key (also changing), and the decryption code itself. The antivirus now has to strain: the patterns typical of this virus are hidden inside the encrypted buffer, so the piece of code for the signature has to be found in the decryptor, which is small and contains far fewer characteristic stretches of code and data.
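From the detector's point of view the effect looks like this. The sketch assumes a repeating-key XOR as a toy stand-in for "encryption" (it is not a good cipher) and invents a marker string to play the role of a signature taken from the plain body; the names `xor_bytes`, `body` and `marker` are hypothetical.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR, used here only as a toy stand-in for encryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Stand-in for the plain body and for a signature extracted from it.
body = b"...MARKER-BYTES-OF-THE-BODY..."
marker = b"MARKER-BYTES"

# Two "generations" of the same body under different keys.
copy_a = xor_bytes(body, b"\x13\x37")
copy_b = xor_bytes(body, b"\xa5\x5a\xc3")

print(marker in body)    # the signature matches the plain body
print(marker in copy_a)  # but is absent from the first encrypted copy
print(marker in copy_b)  # and from the second one
print(copy_a == copy_b)  # and the two copies do not resemble each other
```

A signature built on the body is therefore useless against both copies, which is why the detector's attention moves to the small constant decryptor.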

This simplification of the task gave rise to more advanced polymorphic engines, which change only the decryptor code on each infection; after all, dealing with a small piece of code is much easier than with the entire payload. Joyful virus writers and protection developers rub their hands and study ways to hide the little decryptor more cunningly, while AV developers and crackers patch up disassemblers that go haywire when a JMP in the code leads them into strings of random-looking bytes.

Evolution of viral engines

Now the AV developer spends a little more time creating signatures, since he has to work with a small amount of code that contains fewer characteristic stretches. The virus writer, for his part, only has to worry about mutating a rather small decryptor with a fairly simple internal algorithm, and hiding it from the detector now looks more realistic. Given that the antivirus compares the signature at a fixed offset, the virus writer first tries to shift the decryptor code in various ways and thereby invalidate the characteristic signature inside it.

NOP zone or "maybe it will blow over"

The first simple technique, which came to viruses from vulnerability exploitation, is the NOP zone (also known as a NOP sled). When an attacker has managed to exploit a vulnerability and force the processor to jump to a given address, but the exact address of the shellcode in memory is unknown, the attacker can fill a large area in front of the actual exploit code with NOPs:

addr1:  nop
        nop
        ; ... NOP zone ...
        nop
addr2:  jmp addr3       ; shellcode
        pop esi         ; shellcode
        xor edx, edx    ; shellcode
        ; ...

Now the jump can be made to "somewhere around there", into the NOP zone. Even if only the approximate memory location is known, this technique allows the shellcode to be executed successfully.

The same can be done with the decryptor: on each infection, just place it at a different position within a long run of NOPs. In some places (where it does not break the jumps) these NOPs can be crammed directly into the code. Everything will still work correctly, but the offset of the characteristic signature will be different every time. Of course, the offsets in the jump instructions will have to be recalculated.
Such a cheap trick did not trouble the AV developer much: he simply added the rule "skip all NOPs when computing the signature". Still, this small step is quite remarkable, because for the first time the detector began to look at instructions rather than bytes. But more about that later.
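The "skip all NOPs" rule, and the reason it has to work on instructions rather than bytes, can be sketched as follows. The length table is an assumption that covers only the few x86 opcodes used in this toy example (register forms only); a real detector would use a full length disassembler.

```python
NOP = 0x90

# Instruction lengths for the handful of opcodes in this example only:
# nop, pop esi, xor r32, r32 (register form), mov eax, imm32, ret.
LENGTHS = {0x90: 1, 0x5E: 1, 0x31: 2, 0xB8: 5, 0xC3: 1}


def normalize(code: bytes, start: int, want: int) -> bytes:
    """Collect `want` bytes of instructions from `start`, skipping NOP instructions."""
    out = bytearray()
    pc = start
    while pc < len(code) and len(out) < want:
        size = LENGTHS.get(code[pc])
        if size is None or pc + size > len(code):
            break                       # unknown opcode: stop collecting
        if code[pc] != NOP:
            out += code[pc:pc + size]   # keep the whole instruction, operands included
        pc += size
    return bytes(out[:want])


# pop esi; xor edx, edx; mov eax, 0x00009090; ret
# (the immediate deliberately contains two 0x90 bytes)
core = b"\x5e\x31\xd2\xb8\x90\x90\x00\x00\xc3"
signature = core

variant1 = b"\x90" * 3 + core
variant2 = (b"\x90" * 7 + b"\x5e" + b"\x90" * 2 + b"\x31\xd2"
            + b"\x90" + b"\xb8\x90\x90\x00\x00" + b"\xc3")

print(normalize(variant1, 0, len(signature)) == signature)
print(normalize(variant2, 0, len(signature)) == signature)

# Naive byte-wise stripping destroys the operand bytes that happen to equal 0x90.
naive = bytes(b for b in variant2 if b != NOP)
print(naive == signature)
```

The last comparison fails because the two 0x90 bytes inside the `mov` immediate are data, not NOP instructions, and this is precisely why the detector has to know where instructions begin and end.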

Permutation or “add something”

Reflecting on how to defeat signature comparison without breaking the decryptor code, the virus writer comes up with the idea of permutation. Permutation is the rearrangement of code blocks in each new generation. The code consists of a certain number of blocks; in each new generation of the virus these blocks are shuffled and connected by JMPs. As always, everything is simple on paper, and the problems start in the implementation. Inside the blocks there are conditional and unconditional jumps and function calls, so such logical blocks must remain intact. At the same time, the larger the blocks, the lower the variability of the resulting decryptor; the smaller the blocks, the more jumps have to be added, inflating the decryptor code, and the harder it is to keep everything intact. To align the blocks in length you can, for example, use NOP zones.

Here is an example of the algorithm: the body of the virus stores a ready-made set of blocks with markup (the block number and its length). We take a random block, write it to the buffer, and fix up the JMP at the end of the previous block. We prepend a JMP to the first block, and the buffer with randomly rearranged blocks is ready. Unlike the previous games, this is already a fairly serious approach: each new generation, albeit at the cost of extra unconditional jumps, produces code that is completely different in terms of offsets. The virus writer falls asleep with a contented smile.

Before:  [block 1] [block 2] [block 3] [...] [block N]
After:   [jmp block 1] [block 2][jmp block 3] [block 1][jmp block 2] [block 3][jmp block 4] [...] [block 4][jmp block 5]

The AV developer wakes up. Tracing the code of several generations of the virus, he realizes that the decryptor rearranges its blocks and that the detector has to be reworked, if possible without ruining its performance. He decides to write a fast automatic disassembler that runs through the instructions, stops only at jump instructions, computes the jump address and continues the analysis at that address.

Now the anti-virus database contains the following recipe: starting from the entry point, walk through the instructions, follow every JMP encountered and, after passing N instructions, compare the signature. If the signature is in the tenth block, ten jumps have to be followed; if conditional jumps (JZ) are possible inside, each of them can be treated as two jumps, one to the next instruction and one to the jump target, and the walk branches accordingly. Of course, simpler detection has not been abolished either: for example, if the virus consists of N blocks of fixed length L, you can simply make N signature comparisons at offsets [0, (1 * L), (2 * L), ..., ((N-1) * L)].
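The simpler variant with N comparisons at offsets k * L fits in a few lines. The block contents, the block length and the helper name `find_block` below are made up for the illustration.

```python
from typing import Optional


def find_block(region: bytes, block_len: int, n_blocks: int,
               pattern: bytes, inner_offset: int = 0) -> Optional[int]:
    """Try the pattern at offsets 0, L, 2L, ..., (N-1)L; return the matching slot index."""
    for k in range(n_blocks):
        start = k * block_len + inner_offset
        if region[start:start + len(pattern)] == pattern:
            return k
    return None


# Four placeholder blocks of fixed length L = 8; the third one carries the signature.
blocks = [b"A" * 8, b"B" * 8, b"\xde\xad" + b"C" * 6, b"D" * 8]
pattern = b"\xde\xad"

generation1 = b"".join(blocks[i] for i in (0, 1, 2, 3))
generation2 = b"".join(blocks[i] for i in (2, 0, 3, 1))

print(find_block(generation1, 8, 4, pattern))  # slot of the signature block in generation 1
print(find_block(generation2, 8, 4, pattern))  # and after the blocks were shuffled
```

This costs N cheap comparisons and needs no disassembler at all, but it only works while the blocks keep a fixed length and the signature keeps its position inside its block.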

Let us estimate the cost of searching with a disassembler. At a minimum, the disassembler must determine instruction lengths and convert a VA (Virtual Address) through an RVA (Relative Virtual Address) into a file offset (that is, turn the address specified in a JMP into a position in the file). Determining the length of an instruction is basically a fast algorithm (look up an array element and compute the next step from the flags stored in it), and address translation is a couple of elementary additions based on knowing which section the address belongs to. On top of that, a bit of extra logic is needed to recognize cheap tricks that replace the banal JMP next_block_address, such as:

  XOR eax, eax
  JZ next_block_address

  PUSH next_block_address
  RET

  MOV eax, next_block_address
  CALL eax

None of these algorithms is very scary in terms of performance, but it is nothing like computing a CRC32 of a short string at a given offset, and an angry tester is already complaining that the detector has been chewing through the test database for half the night and eating the entire processor.
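The address translation mentioned above really is just a couple of additions once the section table is known. In this sketch the `Section` record is a simplified, hypothetical stand-in for a PE section header, and the addresses are invented; clamping to the smaller of the virtual and raw sizes is a conservative choice made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Section:
    name: str
    virtual_address: int  # RVA at which the section is mapped
    virtual_size: int     # size of the section in memory
    raw_offset: int       # offset of the section data in the file
    raw_size: int         # size of the section data in the file


def va_to_file_offset(va: int, image_base: int,
                      sections: List[Section]) -> Optional[int]:
    """Translate VA -> RVA -> file offset; None if the address is not backed by file data."""
    rva = va - image_base
    for s in sections:
        delta = rva - s.virtual_address
        if 0 <= delta < min(s.virtual_size, s.raw_size):
            return s.raw_offset + delta
    return None


IMAGE_BASE = 0x400000
SECTIONS = [
    Section(".text", 0x1000, 0x2000, 0x400, 0x2000),
    Section(".data", 0x3000, 0x1000, 0x2400, 0x200),
]

print(hex(va_to_file_offset(0x401234, IMAGE_BASE, SECTIONS)))  # inside .text
print(hex(va_to_file_offset(0x403010, IMAGE_BASE, SECTIONS)))  # inside .data
print(va_to_file_offset(0x403300, IMAGE_BASE, SECTIONS))       # mapped, but no file data
```

A jump target that translates to `None` is itself worth noting for the detector: control is going somewhere that does not exist in the file on disk.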

As usual, if something is switched on and slows things down, you must either optimize it or try not to switch it on unnecessarily. The first option, alas, does not work: there is not much to optimize in a simple disassembler. So the AV developer turns to the favorite part of every antivirus, the heuristic analyzer.

Heuristic analyzer or “showdown”

In the first article we already touched on heuristic analysis: there really are signs that indicate, with varying degrees of reliability, that code has been injected into a file. Back then the AV developer did single out several signs that were suspicious but not strong enough to declare with 100% certainty that a file was infected. He simply commented them out, because he had spent a lot of time on them and was sorry to throw them away. Now they can be used to decide whether or not to run the heavier, disassembly-based analysis of the file.

There is one more problem: because the heuristics react to everything suspicious, commercial protections arouse their keen interest, so the AV developer had to add a couple of hundred "white" signatures for popular attached protections to the signature database; files carrying them must not be touched. Thanks to this, we can still run various commercial software normally. And if you write your own software that works with executable code in such ways, it is a good idea to run all the files of your program through all the popular antiviruses, for example on virustotal, before release. You need not worry much about the unpopular ones: a heuristic analyzer is much harder to copy than a signature database, and the analyzer of an unpopular antivirus is unlikely to be as good as one that has been developed for many years.
It is, of course, worth mentioning the virus writer's attempts to disguise his virus as a popular protection. For this he needs the "white" signature itself, so he starts picking apart the anti-virus database to understand where to put the necessary bytes so that the antivirus will take his virus for a protection. In any case, when making the next version of a virus, it helps to get acquainted with the code that detects the current one. So anti-virus databases are also objects of reverse engineering, and the detector code is also analyzed by virus writers.
But back to our heuristic analyzer. Here are several heuristic signs:

Entry point in a writable section (rwx). A writable, executable section that receives control immediately is likely to indicate self-modifying code; such sections are used, in the vast majority of cases, by viruses and software protections.
Jump instruction at the entry point. There is no particular reason to place a jump instruction at the entry point, so this sign also points to self-modifying code in the file.
Entry point in the second half of the section. Viruses that extend an existing section are, in most cases, located at its end. This is not typical of normal files, so the situation is suspicious.
Damage in the header. Some header modifications made during infection leave the file operable, but the header itself contains errors that a linker would never produce. This is also suspicious.
Non-standard format of some service sections. Executable files contain service sections such as .ctors, .dtors, .fini, etc. Features of these sections can be used by viruses to infect a file. A violation of the format of such a section is also suspicious.
... and a hundred more such signs

There can be many such signs. They carry different degrees of danger, and some are dangerous only in combination with others, but together they are the most powerful tool for deciding whether a more thorough analysis is needed and whether the file is infected. Bypassing the heuristics (I mean getting them not to issue even a warning) is not easy. It takes either platform-specific solutions that exploit features of particular compilers or frameworks (such as rewriting standard constructors or destructors), which quickly end up in the heuristic database, or really large and complex infectors that can embed code into a file with real quality.
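Weighted signs, including one that only counts in combination with another, can be sketched like this. The `FileFacts` fields, the sign names and all the weights are invented for the illustration; real analyzers have far more signs and carefully tuned values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FileFacts:
    """Facts a parser would extract from the file; hypothetical field names."""
    entry_section_writable: bool
    entry_section_executable: bool
    jump_at_entry: bool
    entry_offset_in_section: int
    entry_section_size: int
    header_has_errors: bool


# Made-up weights, including a bonus for a combination of two signs.
WEIGHTS = {
    "rwx_entry_section": 40,
    "jump_at_entry": 15,
    "entry_in_second_half": 20,
    "broken_header": 30,
    "jump_at_entry+entry_in_second_half": 10,
}


def evaluate(f: FileFacts) -> Tuple[int, List[str]]:
    """Return the total weight and the list of signs that fired."""
    signs = []
    if f.entry_section_writable and f.entry_section_executable:
        signs.append("rwx_entry_section")
    if f.jump_at_entry:
        signs.append("jump_at_entry")
    if f.entry_section_size > 0 and 2 * f.entry_offset_in_section >= f.entry_section_size:
        signs.append("entry_in_second_half")
    if f.header_has_errors:
        signs.append("broken_header")
    if "jump_at_entry" in signs and "entry_in_second_half" in signs:
        signs.append("jump_at_entry+entry_in_second_half")
    return sum(WEIGHTS[s] for s in signs), signs


clean = FileFacts(False, True, False, 0x10, 0x2000, False)
suspicious = FileFacts(True, True, True, 0x1800, 0x2000, False)

print(evaluate(clean))
print(evaluate(suspicious))
```

Returning the list of fired signs alongside the total is a deliberate choice: it lets the later stages, and the person maintaining the analyzer, see why a file was flagged.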

When the heuristic signs say "the file is 100% infected" but the heavy analysis has found nothing, the antivirus reports that the file is infected with a virus named something like "Generic Win32.Virus" or, in some products, "Some Win32 Virus". Such messages often appear for all sorts of keygens, loaders, etc. In the last article I already said that this is why installation instructions for pirated software say "disable the antivirus before installation". Once again I want to draw attention to one of the most important information assets of an antivirus company: a collection of executable files large enough that the analyzer can be tested on it without fear of releasing a version that pounces on the legitimate files in that collection. The offended keygens and loaders are surely outraged that they are not added there promptly, but who listens to them ...

So, after working on the heuristics, the AV developer arrives at the following general detection algorithm:

Check the file with a regular signature search.
If successful, treat the file as infected.
If a "white" protection is found, exit silently.
Check the file with the heuristic analyzer.
If no signs were found, exit.
If signs of sufficient weight are found, run an analysis using disassembly.

At the same time, if the heuristic signs are serious enough to indicate infection on their own, consider the file infected regardless of whether the disassembly-based analysis found anything.
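The whole decision procedure above can be written down as one function. This is a sketch under assumptions: the four checks are passed in as callables because their internals are outside the scope of the example, and the two thresholds are invented numbers.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Verdict(Enum):
    CLEAN = "clean"
    WHITELISTED = "white protection"
    INFECTED = "infected"


Result = Tuple[Verdict, Optional[str]]


def scan(data: bytes,
         signature_scan: Callable[[bytes], Optional[str]],
         is_whitelisted: Callable[[bytes], bool],
         heuristic_score: Callable[[bytes], int],
         deep_scan: Callable[[bytes], Optional[str]],
         analyze_threshold: int = 30,      # hypothetical weight that triggers disassembly
         infected_threshold: int = 80      # hypothetical weight that means "infected" by itself
         ) -> Result:
    name = signature_scan(data)            # 1. regular signature search
    if name is not None:
        return Verdict.INFECTED, name
    if is_whitelisted(data):               # 2. known protection: leave it alone
        return Verdict.WHITELISTED, None
    weight = heuristic_score(data)         # 3. heuristic analyzer
    if weight < analyze_threshold:
        return Verdict.CLEAN, None
    name = deep_scan(data)                 # 4. expensive analysis using disassembly
    if name is not None:
        return Verdict.INFECTED, name
    if weight >= infected_threshold:       # 5. serious signs count even without a name
        return Verdict.INFECTED, "Generic Win32.Virus"
    return Verdict.CLEAN, None


# Stub checks that find nothing, to exercise the heuristic-only path.
no_sig = lambda d: None
not_white = lambda d: False
no_deep = lambda d: None

print(scan(b"sample", no_sig, not_white, lambda d: 10, no_deep))
print(scan(b"sample", no_sig, not_white, lambda d: 90, no_deep))
```

The ordering matters for performance: the cheap signature and whitelist checks run on every file, while the expensive disassembly runs only on the minority that the heuristics consider suspicious.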

A lot of work has been done, and the antivirus now recognizes infection with a very high degree of reliability, albeit without identifying the specific threat. Maintaining the test database of executable files makes it safe to add new heuristic signs as soon as new infection algorithms appear, and at last the antivirus is able to respond to threats before a new infection has time to spread. It should be noted that while testing an antivirus on all the executable files in the world once seemed completely unrealistic, a database of all executable files available on the WWW no longer looks like fiction. An executable file takes serious time to create, and the world does not produce that many of them. In addition, testing on such a huge database parallelizes easily, so it is quite possible to train heuristics on huge arrays of data. The happy AV developer drinks his cocoa and goes to bed ...

"Mutants are coming." Metamorphism

This time the virus writer decides not to manipulate the existing code but to generate new decryptor code in each new generation. This is metamorphism: the generation of new code in every generation. Unlike permutation, here the code does not merely rearrange blocks inside itself but actually changes its content. In theory this should mean the unconditional victory of the virus writer over exact detection of his virus (heuristics, of course, have not gone anywhere). Now a signature made for one generation of the virus becomes irrelevant for the next, and even if it happens to keep detecting the virus, there is no guarantee it will work on the following generation.

What is a metamorphic generator? The basis for producing a new generation of the decryptor is a kind of "base code", and the language it is written in does not matter. It is stored inside the encrypted body of the virus, so it can stay constant. There, in the body of the virus, also lies the engine that takes each instruction of this "base" code and generates new executable code from it every time. This is very similar to a compiler: semantic constructs go in, and code ready for execution by the processor comes out. A similar generation of executable code from base code happens in virtual machines, at the moment when a virtual machine on a particular platform executes prepared bytecode. It is then that the "base" bytecode turns into specific executable code that the processor understands. And if each new platform is considered a new generation of code, then the set of virtual machines for different platforms is a metamorphic generator.

If we recall that we are generating decryptor code, which is as independent as possible of where and when it executes (it contains no system calls, does not access saved state, contains no complex objects) and works on data already prepared in memory at known offsets, the task seems quite solvable. The generator has three main input parameters: the address of the encrypted buffer, its length and the key. Let there also be a seed for the pseudo-random generation of all sorts of constants, future keys, etc. The decryptor does contain conditional jumps, but only within its own body, which also simplifies the task a little.

Garbage generation

The virus writer decides to approach the problem by generating a set of unnecessary instructions and "stirring" the true decryptor code into them. Even if the original instructions remain unchanged, it will be very hard to pick out of a heap of other instructions the ones needed for signature comparison. Despite its unimpressive name, the garbage generator is the most complex and interesting part of a metamorphic engine because, garbage or not, it has to generate executable code that will not break itself and will not spoil the main decryptor code. During the "mixing" it is necessary to:

- track the offsets of characteristic points (jump addresses, loop exits, etc.);
- make sure that the garbage code does not spoil the registers in use or the flags register.

MMX, SSE, and floating-point instructions are very attractive candidates for garbage: you can easily generate as many as you need. The main thing is not to touch the stack, not to write to general-purpose registers, and not to break the flags the decryptor needs. The first metamorphic code looks like this:

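  mov ecx, 100h            ; loop counter
lbl0:
  mov eax, [esi + ecx]     ; load encrypted data
  xor eax, edi             ; decrypt with the key in edi
  mov [ebx], eax           ; store the result
  add ebx, 4h              ; advance the output pointer
  movd mm0, edx            ; garbage
  movd mm1, eax            ; garbage
  psubw mm1, mm0           ; garbage
lbl1:
  jcxz lbl2                ; leave the loop when cx is zero
  psubw mm1, mm0           ; garbage
  movd mm3, ecx            ; garbage
  jmp lbl0                 ; next iteration
lbl2:
  sub ebx, 100h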

The antivirus developer is not too worried, because heuristics still flag the infected files (busy with the generator, the virus maker is reluctant to bother with a serious infector), but the specific virus can no longer be identified. So in the dark of night the antivirus developer dreams of an infector that heuristics cannot catch, and detecting the pest with 100% accuracy becomes an obsession. To identify the virus precisely, the detector has to be reworked: starting from the entry point, it must now step through the instructions, skip all the garbage, and add only the meaningful ones to the buffer under analysis, which means the disassembler inside the detector starts to grow. Recall the NOP zones from the paragraph on permutation: skipping NOPs while filling the buffer for signature comparison was, in effect, the first step in this direction, with the detector treating NOPs as garbage instructions. Now, instead of comparing against 0x90 (the NOP opcode), the antivirus developer uses a disassembler (the faster, the better) that:

- moves the pointer to the start of the next instruction (a length disassembler);
- tells whether the instruction is garbage (NOP, MMX, SSE, etc.);
- adds meaningful instructions to the buffer under analysis;
- on an unconditional jump, marks the jump target as the next address to analyze;
- on a conditional jump, marks both possible branches of the code for further analysis.
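The steps above can be sketched on the detector side roughly as follows. This is a minimal illustration, not a real disassembler: instructions are already decoded into a hypothetical `Insn` record, every instruction is given a size of 1, and the `GARBAGE` and `CONDITIONAL` sets are assumptions standing in for proper opcode classification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Assumption: a toy stand-in for "NOP, MMX, SSE, etc."; a real detector
# would classify instructions by opcode, not by mnemonic strings.
GARBAGE = {"nop", "movd", "psubw"}
CONDITIONAL = {"jcxz", "je", "jne"}


@dataclass
class Insn:
    addr: int                     # address of the instruction
    size: int                     # length, as a length disassembler would report it
    mnemonic: str
    operands: str = ""
    target: Optional[int] = None  # branch target, if any


def collect_meaningful(code: Dict[int, Insn], entry: int) -> List[str]:
    """Walk the code from `entry`, skip garbage, follow branches,
    and return the meaningful instructions in visiting order."""
    pending = [entry]             # addresses still to be analyzed
    seen: Set[int] = set()
    buffer: List[str] = []
    while pending:
        addr = pending.pop()
        while addr in code and addr not in seen:
            seen.add(addr)
            insn = code[addr]
            next_addr = addr + insn.size          # step to the next instruction
            if insn.mnemonic not in GARBAGE:      # keep only meaningful ones
                buffer.append(f"{insn.mnemonic} {insn.operands}".strip())
            if insn.mnemonic == "jmp":            # unconditional: continue at target
                addr = insn.target
            elif insn.mnemonic in CONDITIONAL:    # conditional: analyze both branches
                pending.append(insn.target)
                addr = next_addr
            else:
                addr = next_addr
    return buffer


def build(listing) -> Dict[int, Insn]:
    """Toy loader: one address per instruction, size 1 each."""
    return {i: Insn(i, 1, m, ops, tgt) for i, (m, ops, tgt) in enumerate(listing)}


# The first metamorphic listing from the text, already "decoded".
LISTING = [
    ("mov", "ecx, 100h", None),
    ("mov", "eax, [esi + ecx]", None),   # lbl0 (address 1)
    ("xor", "eax, edi", None),
    ("mov", "[ebx], eax", None),
    ("add", "ebx, 4h", None),
    ("movd", "mm0, edx", None),
    ("movd", "mm1, eax", None),
    ("psubw", "mm1, mm0", None),
    ("jcxz", "lbl2", 12),                # lbl1
    ("psubw", "mm1, mm0", None),
    ("movd", "mm3, ecx", None),
    ("jmp", "lbl0", 1),
    ("sub", "ebx, 100h", None),          # lbl2 (address 12)
]

signature_buffer = collect_meaningful(build(LISTING), entry=0)
```

Here `signature_buffer` ends up holding only the mov/xor/add/jump skeleton of the decryptor, which is what a signature would then be matched against.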

Thus the antivirus developer assembles a buffer of the instructions that make up the main decryptor code and can run a signature comparison on it. This is still a fairly fast procedure, but while programming it the antivirus developer worries more and more: "Will I always be able to tell garbage instructions from important ones?" Sensing this, the virus maker refines the garbage generator. Now he enlists the context-saving instructions: pushad / popad (save all general-purpose registers on the stack / restore them from it) and pushfd / popfd (the same for the flags register).

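<pre>
  mov ecx, 100h            ; loop counter
lbl0:
  mov eax, [esi + ecx]     ; load encrypted data
  xor eax, edi             ; decrypt with the key in edi
  mov [ebx], eax           ; store the result
  add ebx, 4h              ; advance the output pointer
  pushad                   ; save registers
  pushfd                   ; save flags
  mov eax, 12321h          ; garbage
  xor edx, edx             ; garbage
  sub eax, esi             ; garbage
  popfd                    ; restore flags
  popad                    ; restore registers
lbl1:
  jcxz lbl2                ; leave the loop when cx is zero
  pushad                   ; save registers
  pushfd                   ; save flags
  shr ebx, 4               ; garbage
  popfd                    ; restore flags
  popad                    ; restore registers
  jmp lbl0                 ; next iteration
lbl2:
  sub ebx, 100h
</pre>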
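From the detector's point of view, one naive response to the listing above is to treat everything between pushad and popad as garbage and drop it before comparing signatures. The sketch below does exactly that on a textual listing. The function name `strip_saved_context` is hypothetical, and the approach rests on a strong assumption that bracketed code has no lasting side effects (no memory writes, no tricks with esp), which a real engine could deliberately violate.

```python
def strip_saved_context(listing):
    """Drop every instruction bracketed by pushad ... popad.

    Assumption: code inside a saved-context block changes nothing that
    survives popad, so it can be ignored for signature purposes.
    """
    kept = []
    depth = 0                                   # nesting level of pushad blocks
    for line in listing:
        parts = line.split(":", 1)[-1].split()  # ignore a leading "label:"
        mnemonic = parts[0].lower() if parts else ""
        if mnemonic == "pushad":
            depth += 1
        elif mnemonic == "popad":
            depth = max(depth - 1, 0)
        elif depth == 0:
            kept.append(line)
    return kept


LISTING = [
    "mov ecx, 100h",
    "lbl0: mov eax, [esi + ecx]",
    "xor eax, edi",
    "mov [ebx], eax",
    "add ebx, 4h",
    "pushad", "pushfd",
    "mov eax, 12321h", "xor edx, edx", "sub eax, esi",
    "popfd", "popad",
    "lbl1: jcxz lbl2",
    "pushad", "pushfd",
    "shr ebx, 4",
    "popfd", "popad",
    "jmp lbl0",
    "lbl2: sub ebx, 100h",
]

core = strip_saved_context(LISTING)
```

The result is the same decryptor skeleton as before the garbage was added, which is why this kind of garbage only buys the virus maker a little time.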

, « ». , , , . . , - — « breakpoint!».

. :

Base code	Variant 1	Variant 2
virt_mov eax, 10h	mov eax, 20h; sub eax, 10h;	mov edx, 10h; mov eax, edx;
virt_mov ecx, 08h	xor ecx, ecx; add ecx, 08h;	mov ecx, 04h; add ecx, 04h;
virt_sub eax, ecx	neg ecx; add eax, ecx;	mov edx, ecx; sub eax, edx;
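To see that the two variants in each row of the table really are interchangeable, a detector (or a curious reader) can compare their effect instead of their bytes. The following is a minimal sketch of a toy evaluator, assuming only the five mnemonics used in the table and 32-bit wraparound arithmetic; `run` and `value` are hypothetical helper names, not part of any real tool.

```python
MASK = 0xFFFFFFFF
REGS = ("eax", "ebx", "ecx", "edx", "esi", "edi")


def value(token, state):
    """Resolve an operand: a register name, a hex literal like 10h, or a decimal."""
    token = token.strip().lower()
    if token in REGS:
        return state[token]
    if token.endswith("h"):
        return int(token[:-1], 16)
    return int(token)


def run(snippet, initial=None):
    """Evaluate a ';'-separated snippet and return the final register state."""
    state = {r: 0 for r in REGS}
    state.update(initial or {})
    for stmt in snippet.split(";"):
        stmt = stmt.strip()
        if not stmt:
            continue
        mnemonic, _, rest = stmt.partition(" ")
        ops = [o.strip().lower() for o in rest.split(",")]
        dst = ops[0]
        if mnemonic == "neg":
            state[dst] = (-state[dst]) & MASK
        elif mnemonic == "mov":
            state[dst] = value(ops[1], state)
        elif mnemonic == "add":
            state[dst] = (state[dst] + value(ops[1], state)) & MASK
        elif mnemonic == "sub":
            state[dst] = (state[dst] - value(ops[1], state)) & MASK
        elif mnemonic == "xor":
            state[dst] = state[dst] ^ value(ops[1], state)
        else:
            raise ValueError(f"unsupported instruction: {stmt}")
    return state


# virt_sub eax, ecx, starting from eax = 10h and ecx = 08h
start = {"eax": 0x10, "ecx": 0x08}
variant1 = run("neg ecx; add eax, ecx;", start)
variant2 = run("mov edx, ecx; sub eax, edx;", start)
```

Both variants leave the same value in eax, but they differ in side effects: the first changes ecx, the second changes edx. Any comparison of this kind therefore has to know which registers actually matter at that point in the code.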

, : « » «virt_mov edx, 10h» «virt_mov ecx, 100h». , , , «50h», , «virt_mov edx, 10h» «mov edx, 50h; sub edx, 40h;», a «virt_mov ecx, 100h» «mov ecx, 50h; add ecx, B0h». -, , wildcards , - «mov eax, <wildcard->; <skip >; mov ecx, <wildcard->». , , …

, , , , . , — , — . , . , eax, edx esi. ebx,ecx, edi . , .

  . . .
  mov eax, 10h
  mov ebx, 20h             ; ebx holds 20h only until the xchg
  xchg edx, ebx            ; after the swap, 20h is in edx
  xor ecx, ecx
  inc ebx
  add ecx, ebx
  add eax, edx
  mov edx, [esi]
  xchg edi, ebx
  cmp edx, 0
  . . .

, -, «» , . «» , , «».

- , , , :

Base code	Variant 1	Variant 2
virt_push eax	sub esp, 04h; mov [esp], eax;	mov edx, esp; sub edx, 04h; mov [edx], eax;
virt_mov eax, ebx	lea eax, [ebx];	push ebx; xchg eax, ebx; pop ebx;

— , . , .. , , .. .
, , , , , :

;
- ;
, ;
;
- .

« ».

, 42 - , «- » , , . , , . , , , , .. , , , . , , , , .., .. . , — .

, , , ( Original Entry Point). , , payload , OEP. , , .. , , , .. , , , , esp , . , esp breakpoint, , esp , . OEP ( ). ( , ) , , , , cur_esp , , esp.

  . . .                    ; base_esp = cur_esp;
  push eax                 ; cur_esp -= 4;
  mov eax, 1h
  push edx                 ; cur_esp -= 4;
  . . .
  pop edx                  ; cur_esp += 4;
  pop eax                  ; cur_esp += 4; (cur_esp == base_esp) !!!
  . . .
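The bookkeeping in the comments above (base_esp, cur_esp) can be sketched as a small tracker. This is an illustration under loud assumptions: it only knows the fixed stack effect of a handful of mnemonics (the `STACK_EFFECT` table is hypothetical and incomplete) and it ignores code that modifies esp directly, such as `sub esp, 04h`.

```python
# Change of esp, in bytes, caused by each instruction in 32-bit mode.
STACK_EFFECT = {
    "push": -4, "pop": +4,
    "pushfd": -4, "popfd": +4,
    "pushad": -32, "popad": +32,   # eight 32-bit registers
}


def trace_esp(listing, base_esp=0x1000):
    """Return (instruction, cur_esp) pairs, with cur_esp taken after each step."""
    cur_esp = base_esp
    trace = []
    for line in listing:
        mnemonic = line.split()[0].lower()
        cur_esp += STACK_EFFECT.get(mnemonic, 0)   # unknown mnemonics: no effect
        trace.append((line, cur_esp))
    return trace


def returns_to_base(listing, base_esp=0x1000):
    """True if the stack pointer ends where it started."""
    trace = trace_esp(listing, base_esp)
    return bool(trace) and trace[-1][1] == base_esp


SAMPLE = ["push eax", "mov eax, 1h", "push edx", "pop edx", "pop eax"]
trace = trace_esp(SAMPLE)
balanced = returns_to_base(SAMPLE)
```

For the sample, cur_esp goes 0xFFC, 0xFFC, 0xFF8, 0xFFC, 0x1000, so the point where cur_esp equals base_esp again is exactly the spot marked with "!!!" in the listing.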

, , , . , , , ( ). , , , , - . , . — API, ..

— . , — , . -, , - , , . MMX, SSE , ( ), (.. , ). , - «BANANAS», .
, , , .. pushad/popad, , , , .. — . ( ESI «BANANAS\0»).

  mov ecx, 0h              ; ecx_var = 0;
lbl0:
  mov eax, [esi + ecx]     ; eax_var = *(esi_var + ecx_var); esi points at the "BANANAS" buffer
  xor eax, edi             ; eax_var = eax_var XOR edi_var;
  push eax                 ; esp_var -= 4; *esp_var = eax_var;
  pushad                   ; context saved: start skipping
  pushfd                   ; skip
  mov eax, 12321h          ; skip
  xor edx, edx             ; skip
  sub eax, esi             ; skip
  popfd                    ; skip
  popad                    ; context restored: stop skipping
  mov edx, 23h             ; edx_var = 23h;
  or edx, eax              ; edx_var = edx_var OR eax_var;
lbl1:
  inc ecx                  ; ecx_var++;
  cmp ecx, 8h              ; if (ecx_var == 8) { goto lbl2; }
  je lbl2                  ; the jump implied by the comment above
  pushad                   ; context saved: start skipping
  pushfd                   ; skip
  shr ebx, 4               ; skip
  popfd                    ; skip
  popad                    ; context restored: stop skipping
  jmp lbl0                 ; goto lbl0
lbl2:
  sub ebx, 100h            ; "BANANAS" found!

, , , , . , , , , .
, - , . , , , . ?

. , , . , ?

— .. , . , . , . , , , , , — . — .

, . , , . « eax» , , (xor eax,eax sub eax, eax) - — . , , .. , , , . , - «» . , , , , , , .

, , , , .. , . - , , , , , - .. , . , , .. . , , , .., , , . , - , , , , , ( ). , , , , ( ). , . , , , , .

. , , , , . , , … .

, ? , , , , ? .

, — . NT- , NTFS Linux PC — , . . , , , — , , online- — «» . , , , , ? , , .

, , — . , , , crackme, , … . , — , , . , , . — , , , . , , , , , .

Epilogue

, , , , . , . , , , .

Source: https://habr.com/ru/post/240655/
