Self-modifying code

This article describes self-modifying code (SMC) in detail and shows how to use it in your programs. The examples are written in C++ with inline assembly. I will also explain how to execute code on the stack, which is an essential trump card when writing and running self-modifying code.

1. Introduction

Well, let's go. The article promises to be long, because I want to write it so that you are left with no questions. There are already a million articles on self-modifying code, but here is my view of the problem, after hundreds of hours of writing such code... I will try to cram all of that work in here. So grab some tomato juice (or whatever you prefer to drink), turn up the music and get ready to learn how to protect your application from novice crackers! Along the way, I will tell you about Windows memory and a few other things you may not even suspect.

2. A brief history of self-modifying code

Not so long ago, programmers had the luxury of using self-modifying code wherever they pleased. 10-20 years ago, every more or less serious attempt to protect a program used self-modifying code. Even some compilers used it, working with code in memory.

Then, in the mid-90s, something happened. That something was called Windows 95/NT. Suddenly we programmers were given to understand that everything we had done before was bullshit and that we had to master a new platform. All the tricks invented earlier could be forgotten, because we could no longer play freely with memory, hardware and the operating system. Most people thought that writing self-modifying code would no longer be possible without VxDs, for which, as is typical for Windows, there was no more or less competent documentation. After a while it turned out that we still CAN use self-modifying code in our programs. One way is to use the WriteProcessMemory function exported by the Kernel32 library; the other is to place the code on the stack and modify it there.

The rest of the article deals mainly with Microsoft Visual C++ and the 32-bit subsystem.

3. Windows memory as it is

Creating self-modifying code in Windows is not as easy as we would like. You will run into some pitfalls carefully laid out by the creators of Windows. Why? Because it is Microsoft.

As you know, Windows gives each process 4 gigabytes of virtual address space. To address this memory, Windows uses two selectors. One is loaded into the CS segment register, and the other into the DS, SS and ES registers. Both use the same base address (equal to 0) and the same 4-gigabyte limit.

The program can have only ONE segment containing both code and data, as well as ONE process stack. You can use a NEAR procedure call or transfer control to code located on the stack. In the latter case there is no need to use the SS: prefix to access the stack. Although the value of the CS register does not match that of DS, SS and ES, the instructions MOV dest, CS:[src], MOV dest, DS:[src] and MOV dest, SS:[src] access the same memory location.

The memory areas (pages) holding data, code and the stack have access attributes. For example, code pages allow reading and execution, data pages allow reading and writing, and stack pages allow reading, writing and execution at the same time.

Also, these pages may have a number of security attributes. I will talk about them a little later when we need them.

4. WriteProcessMemory - new best friend

The easiest way (in my opinion) to change a few bytes in a process is to use the WriteProcessMemory function, provided the protection flags are not set.

The first thing to do is to obtain a handle to the process loaded in memory by calling OpenProcess with the PROCESS_VM_OPERATION and PROCESS_VM_WRITE access rights. Below is the simplest example of self-modifying code, which we will then discuss. In C++, implementing this mechanism takes some built-in features of the language (here, inline assembly). Of course, all this can be done in other languages, but we will talk about that some other time. Besides, in other languages it all looks much more complicated.

Listing 1. WriteProcessMemory in the service of self-modifying code

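#include <windows.h>
#include <stdio.h>
// Overwrite one byte at addr inside our own process with the low byte of wb.
int WriteMe(void *addr, int wb)
{
    HANDLE h = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE,
                           true, GetCurrentProcessId());
    return WriteProcessMemory(h, addr, &wb, 1, NULL);
}
int main(int argc, char* argv[])
{
    _asm {
        push 0x74            ; opcode of JZ: turns the JMP below into JZ
        push offset Here     ; address of the instruction to patch
        call WriteMe
        add esp, 8           ; remove both arguments from the stack
    Here:
        JMP short Here       ; infinite loop until it is patched
    }
    printf("Holy Sh^& OsIX, it worked! #JMP SHORT $-2 was changed to JZ $-2\n");
    return 0;
}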

As you can see, the program replaces an infinite loop with an ordinary conditional jump, JZ. This lets the program proceed to the next instruction, and we see a message confirming the replacement. Great, huh? I bet you are now thinking: hmm, interesting, but could I do something like that? Probably yes!

However, this method (using WriteProcessMemory) has a number of weaknesses. First of all, an experienced cracker WILL analyze the import table and spot the suspicious function. He will most likely set a few breakpoints on this call, analyze the code around it and find what he needs, because WriteProcessMemory is typical only of compilers that assemble code in memory and of executable unpackers. Still, a trick like this is quite enough to confuse a novice cracker. I often use this technique in my programs.

Another drawback of WriteProcessMemory is that it cannot create new pages in memory; the trick works only on pages that already exist. So, although there are several ways to polish this approach, we will turn our attention to executing code on the stack.

5. Placing code on the stack and executing it!

Placing code on the stack is not only permissible but sometimes even necessary. In particular, it makes life easier for compilers, allowing them to generate code on the fly. But don't such liberties with the stack put the security of the system at risk? Of course they can get you into trouble. Besides, this is not the best technique for your programs, because installing a patch that prohibits executing code on the stack would paralyze most of your creations. On the other hand, although such a patch exists, in particular for Linux and Solaris, and although it is very useful, I think it has been installed by just two people (the authors themselves, hee-hee).

Do you still remember the weaknesses of WriteProcessMemory mentioned above? Placing executable code on the stack gives us two pleasant ways of getting rid of them. First, the instructions that modify the code live in a region of memory that is not known in advance, so it is almost impossible for the cracker to track them. To analyze the protected code, he will have to chop our program down at the very root, so his work is unlikely to be crowned with success! The second argument in favor of executing code on the stack is that the program can allocate as much memory as it needs at any moment and release it at any moment. By default, the operating system reserves 1 MB of memory for the stack. If the task requires more, the program can request an additional quota.

However, there are several nuances you need to know before placing your code on the stack, so we will talk about them now.

6. Why relocatable code can be harmful to your health

You should be aware that in Windows 9x, Windows NT and Windows 2000 the stack is located at different addresses. Therefore, for your program to work on all of them, it is important to use relative addressing. Meeting this requirement is not that difficult; you only need to follow a few simple rules. Damn them, these rules!

To our great joy, in the 80x86 world all short jumps and near calls are relative. This means that they do not use linear addresses but the difference between the target address and the address of the next instruction. Such relative addressing greatly simplifies our lives, but it has its limitations too.

For example, what happens if the function void OSIXDemo() {printf("Hi from OSIX\n");} is copied to the stack and then called? Such a call is likely to fail, because the relative call inside it no longer points at printf once the code has moved.

In assembler we can easily fix this problem by calling through a register. A relocatable call to printf can be implemented very simply, for example with LEA EAX, printf followed by CALL EAX. Now the ABSOLUTE linear address, not a relative one, is placed in the EAX register. Therefore it does not matter where printf is called from: it will work correctly.

To reproduce such tricks, your compiler must support inline assembly. I know that if you are not interested in low-level programming this is a complete drag, but you can do exactly the same thing with nothing but the arsenal of a high-level language. Here is a simple example:

Listing 2. How to copy a function to the stack and run it there

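#include <stdio.h>
// Demo receives the address of printf as an argument, so it contains no
// relative call that would break when its code is moved.
void Demo(int (*_printf) (const char *,...))
{
    _printf("Hello, OSIX!\n");
    return;
}
int main(int argc, char* argv[])
{
    char buff[1000];
    int (*_printf) (const char *,...);
    int (*_main) (int, char **);
    void (*_Demo) (int (*) (const char *,...));
    _printf = printf;
    _main = main;
    _Demo = Demo;
    // Length of Demo = distance to the next function in the source (main).
    int func_len = (unsigned int) _main - (unsigned int) _Demo;
    // Copy the body of Demo into the buffer on the stack.
    for (int a = 0; a < func_len; a++)
        buff[a] = ((char *) _Demo)[a];
    // Point _Demo at the copy and call it there.
    _Demo = (void (*) (int (*) (const char *,...))) &buff[0];
    _Demo(_printf);
    return 0;
}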

So do not let anyone fool you into believing that high-level languages cannot execute code on the stack.

7. We start optimization right now!

If you plan to write self-modifying code or use code executed on the stack, you need to choose your compiler carefully and learn how it works. Most likely your code will crash the first time the program calls it, especially if your compiler is set to "optimization" mode.

Why does this happen? Because in purely high-level languages such as C or Pascal it is damn hard to copy the code of a function to the stack or anywhere else. The programmer can obtain a pointer to a function, but there are no rules standardizing what that pointer refers to. Programmers call this a "magic number" known only to the compiler.

Fortunately, almost all compilers follow the same logic when generating code. These are a kind of unwritten rules of compilation, and the programmer can rely on them too.

Let's take another look at Listing 2. We assume, reasonably, that the pointer to our Demo() function points to its beginning and that the body of the function follows immediately. Most compilers adhere to this "common-sense compilation", but do not expect all of them to. At least the big players (VC++, Borland, etc.) still follow the rule. So unless you use some unknown or brand-new compiler, do not worry about a lack of "common-sense compilation". One note about VC++: in debug mode the compiler inserts a kind of "adapter" (a jump thunk) and places the function itself somewhere else. Damn Microsoft. But do not worry: just make sure that the "Link incrementally" option is switched off in the project settings, which will make your compiler generate good code. If your compiler has no such option, you can either give up self-modifying code or use another compiler!

Another problem is determining the length of a function. There is a simple trick for this. In C++, sizeof cannot help: applied to a function pointer it returns the size of the pointer, not the size of the function itself. However, compilers usually lay out objects in memory in the order in which they appear in the source code. So... the size of a function is the difference between the pointer to the function and the pointer to the function that follows it. Very simple! Remember this trick, it will come in handy, but keep in mind that optimizing compilers do NOT have to follow these rules, and then the method I just described will not work. See why optimizing compilers are so bad for your health if you write self-modifying code?

Another thing optimizing compilers do is delete variables that, as they THINK, are not used. Returning to Listing 2, some value is written to the buffer buff, but nothing is ever READ from it. Most compilers cannot recognize that control is transferred to the buffer, so they remove the instructions that copy the code into it. Bastards! As a result, control is transferred to an uninitialized buffer, and then... boom. Crash. If you run into this problem, uncheck the "Global optimization" box and everything will be fine.

If your program still does not work, do not give up. The likely reason is that the compiler inserts calls to stack-checking routines at the end of each function. This is exactly what Microsoft VC++ does: it adds calls to the __chkesp function in debug builds. Do not bother looking for a description of this function; it is not in the documentation! The call is relative, and there is no way to get rid of it. In the release build, however, VC++ does not check the state of the stack on exit from a function, so your program will run like clockwork.

8. Self-modifying code in your own programs

So, the time has finally come for what you have all been waiting for. If you have made it all the way through the article to this point, I salute you. (loud applause)

Well, now you can ask yourself (or ask me) "What are the benefits of executing code (function) on the stack?" In response, the crowd says: Ahhhhhhhhhhhh.

Encrypted code is a huge pain in the ass for a cracker busy disassembling a program. A debugger makes his or her life a little easier, of course, but encrypted code still makes it incredibly difficult.

Take, for example, the simplest encryption algorithm: it applies an exclusive OR (XOR) operation to each byte of the code in turn, and applying it a second time restores the original code!

Here is an example that reads the contents of our Demo () function, encrypts it, and writes the result to a file.

Listing 3. How to encrypt the Demo function

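// Must be placed right after Demo() in the source, so that the distance
// between the two functions gives the length of Demo().
void _bild()
{
    FILE *f;
    void (*_Demo) (int (*) (const char *,...));
    void (*_Bild) ();
    _Demo = Demo;
    _Bild = _bild;
    int func_len = (unsigned int) _Bild - (unsigned int) _Demo;
    f = fopen("Demo32.bin", "wb");
    // Read the bytes of Demo(), XOR each one with 0x77 and write it out.
    for (int a = 0; a < func_len; a++)
        fputc(((unsigned char *) _Demo)[a] ^ 0x77, f);
    fclose(f);
}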

The result of the encryption is placed in a string variable. Now the Demo() function can be removed from the source code. Later, when we need it, it can be decrypted, copied to a local buffer and called. Pretty cool, huh?

Here is an example of the implementation of this algorithm:

Listing 4. Encrypted program

#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
    char buff[1000];
    int (*_printf) (const char *,...);
    void (*_Demo) (int (*) (const char *,...));
    // The body of Demo(), XORed byte by byte with 0x77.
    char code[] = "\x22\xFC\x9B\xF4\x9B\x67\xB1\x32\x87"
                  "\x3F\xB1\x32\x86\x12\xB1\x32\x85\x1B\xB1"
                  "\x32\x84\x1B\xB1\x32\x83\x18\xB1\x32\x82"
                  "\x5B\xB1\x32\x81\x57\xB1\x32\x80\x20\xB1"
                  "\x32\x8F\x18\xB1\x32\x8E\x05\xB1\x32\x8D"
                  "\x1B\xB1\x32\x8C\x13\xB1\x32\x8B\x56\xB1"
                  "\x32\x8A\x7D\xB1\x32\x89\x77\xFA\x32\x87"
                  "\x27\x88\x22\x7F\xF4\xB3\x73\xFC\x92\x2A"
                  "\xB4";
    _printf = printf;
    int code_size = strlen(&code[0]);
    strcpy(&buff[0], &code[0]);
    // Decrypt the copy in the stack buffer.
    for (int a = 0; a < code_size; a++)
        buff[a] = buff[a] ^ 0x77;
    // Call the decrypted code, passing it the address of printf.
    _Demo = (void (*) (int (*) (const char *,...))) &buff[0];
    _Demo(_printf);
    return 0;
}

Notice how the printf() function displays the greeting. At a quick glance you will not notice anything unusual, but look at where the string "Hello, OSIX!" is located. It has no place in the code segment (although Borland puts strings there for some reason); check the data segment and you will see that it is where it should be.

Now, even with the source code in front of his eyes, the cracker will still find our program a hellish puzzle. I use this method to hide "secret" information (serial numbers and keys for my programs, etc.).

If you are going to use this method to check a serial number, the check has to be organized in such a way that it remains a puzzle for the cracker even after decryption. I will show how to do this in the following listing.

Remember, when implementing self-modifying code, you need to know the EXACT location of the bytes that you are going to change. Therefore, instead of high-level languages, you should use assembler. Come on, stay with me, we are almost done!

When you implement the method above in assembler, there is one problem. To change a byte with the MOV instruction, you must pass the ABSOLUTE linear address as an operand (which, you guessed it, is UNKNOWN before compilation). BUT... we can obtain this information while the program is running. CALL $+5 / POP reg / MOV [reg + relative_address], xx is a sequence I am very fond of. It works as follows. When the CALL instruction executes, the return address (that is, the absolute address of the instruction following the CALL) is left on the stack. POP moves it into a register, and it then serves as the base for addressing the code of the function running on the stack.

And here is an example of verification of the serial number, which I promised you ...

Listing 5. Generating a serial number and executing on the stack

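MyFunc:
    push esi                    ; save ESI
    mov esi, [esp+8]            ; ESI = &username[0]
    push ebx                    ; save the other registers we use
    push ecx
    push edx
    xor eax, eax                ; clear the working registers
    xor edx, edx
RepeatString:                   ; loop over the characters of the string
    lodsb                       ; load the next character into AL
    test al, al                 ; end of the string?
    jz short Exit               ; yes: leave the loop
    mov ecx, 21h                ; number of passes per character (an odd count)
RepeatChar:
    xor edx, eax                ; the instruction that flips between XOR and ADC
    ror eax, 3
    rol edx, 5
    call $+5                    ; push the address of the next instruction
    pop ebx                     ; EBX = EIP
    xor byte ptr [ebx-0Dh], 26h ; patch the first byte of the XOR/ADC above
    loop RepeatChar
    jmp short RepeatString
Exit:
    xchg eax, edx               ; return the result in EAX
    pop edx                     ; restore the saved registers
    pop ecx
    pop ebx
    pop esi
    retn                        ; return to the caller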

This code looks rather strange, because repeated calls with the same arguments yield either the same output or completely different results! It depends on the length of the username. If it is odd, XOR has been replaced by ADC by the time the function exits. Otherwise nothing of the kind happens!

Well, that's all for now. I hope this article has been of some use to you. Typing it up took me two whole hours! Feedback is always appreciated.

English source: Giovanni Tropeano. Self modifying code // CodeBreakers Journal. Vol. 1, No. 2, 2006.

Source: https://habr.com/ru/post/272619/
