
"Hello World!" In C array int main []

I would like to talk about how I wrote an implementation of “Hello, World!” in C. To warm up, here is the code right away. If you are curious how I got there, read on.

#include <stdio.h>

const void *ptrprintf = printf;

#pragma section(".exre", execute, read)
__declspec(allocate(".exre")) int main[] = {
    0x646C6890, 0x20680021, 0x68726F57,
    0x2C6F6C6C, 0x48000068, 0x24448D65,
    0x15FF5002, &ptrprintf, 0xC314C483
};


Foreword


It all started when I found this article. Inspired by it, I began to think about how to do the same on Windows.

In that article, output to the screen was implemented with a syscall, but on Windows we can only use the printf function. Maybe I’m wrong, but I haven’t found anything else.

Plucking up my courage and picking up Visual Studio, I began to experiment. I don’t know why I spent so long fiddling with the compilation settings to substitute the entry point, because, as it turned out later, the Visual Studio compiler does not even issue a warning if main is an array rather than a function.

The main problems I had to face:

1) The array is placed in the data section and cannot be executed
2) Windows has no syscall, so the output has to be implemented with printf

Let me explain why a function call is a problem. Usually the compiler substitutes the address for the call from the symbol table, if I'm not mistaken. But here we have an ordinary array, into which we must write the address ourselves.

Solving the problem of "executable data"


The first problem I ran into, as expected, was that a plain array is stored in the data section and cannot be executed as code. But after a little digging on stackoverflow and msdn, I found a way out. The Visual Studio compiler supports the #pragma section directive, and a variable can be declared so that it is placed in a section with execute permission.

I checked that this actually works: main, now an array, quietly executes the ret opcode and does not cause an “Access violation” error.

#pragma section(".exre", execute, read)
__declspec(allocate(".exre")) char main[] = { 0xC3 };

Some assembly language


Now that I could execute an array, I needed to write the code to be executed.

I decided to store the message “Hello, World!” in the assembly code itself. I will say right away that my understanding of assembler is rather poor, so please go easy on me, though criticism is welcome. This answer on stackoverflow helped me understand what kind of assembly code can be inserted without calling unnecessary functions.
I took Notepad++ and used Plugins -> Converter -> “ASCII -> HEX” to get the character codes.

  Hello, World! 

  48656C6C6F2C20576F726C6421 
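The same conversion can be done without an editor plugin. Below is a minimal sketch in portable C; the function name ascii_to_hex is a hypothetical helper introduced here for illustration and is not part of the original project.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the article): writes the uppercase hex
 * codes of every character of `text` into `out` as a NUL-terminated string.
 * Returns the number of hex characters written, or 0 if `out` is too small.
 */
size_t ascii_to_hex(const char *text, char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t len;

    assert(text != NULL && out != NULL);
    len = strlen(text);
    if (out_size < 2 * len + 1)
        return 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char b = (unsigned char)text[i];
        out[2 * i]     = digits[b >> 4];    /* high nibble */
        out[2 * i + 1] = digits[b & 0x0F];  /* low nibble  */
    }
    out[2 * len] = '\0';
    return 2 * len;
}
```

For the string "Hello, World!" this produces the 26 hex characters shown above.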

Next, we need to split this sequence into 4-byte groups and push them onto the stack in reverse order, not forgetting to convert each group to little-endian.

Let's split and flip.
First, append a terminating zero to the end.

  48656C6C6F2C20576F726C642100 

Split it from the end into 4-byte hex numbers, padding the first group with zeros.

  00004865 6C6C6F2C 20576F72 6C642100 

Convert each group to little-endian and reverse their order.

  0x0021646C 0x726F5720 0x2C6F6C6C 0x65480000 
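The manual steps above (append the terminator, pad at the front, split into groups of four, flip to little-endian, reverse the order) can be sketched in portable C. The function make_push_words and its parameters are hypothetical names introduced here; this is an illustration of the procedure, not code from the original project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the article).
 * Fills `out` with the 32-bit immediates in the order they must be pushed.
 * *str_offset receives the offset of the first character from ESP after
 * the last push (the number of zero bytes padded in front of the text).
 * Returns the number of immediates, or 0 if `out` is too small.
 */
size_t make_push_words(const char *text, uint32_t *out, size_t max_words,
                       size_t *str_offset)
{
    size_t len, pad, words;

    assert(text != NULL && out != NULL && str_offset != NULL);
    len   = strlen(text) + 1;        /* include the terminating zero   */
    pad   = (4 - len % 4) % 4;       /* zero bytes in front of the text */
    words = (len + pad) / 4;
    if (words > max_words)
        return 0;

    for (size_t w = 0; w < words; w++) {
        size_t group = words - 1 - w;     /* the last group is pushed first */
        uint32_t value = 0;
        for (size_t k = 0; k < 4; k++) {
            size_t pos = group * 4 + k;   /* position in the padded buffer */
            uint8_t b = (pos < pad) ? 0 : (uint8_t)text[pos - pad];
            value |= (uint32_t)b << (8 * k);   /* little-endian packing */
        }
        out[w] = value;
    }
    *str_offset = pad;
    return words;
}
```

For "Hello, World!" the helper yields the four values listed above, in push order, and an offset of 2, which is where the +2 in the later lea eax, [esp+2] comes from.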


I will skip over how I first tried to call printf directly and then store that address in the array. It turned out that I could only store a pointer to printf. The reason will become clear later.

#include <stdio.h>

const void *ptrprintf = printf;

void main() {
    __asm {
        push 0x0021646C    ; "ld!\0"
        push 0x726F5720    ; " Wor"
        push 0x2C6F6C6C    ; "llo,"
        push 0x65480000    ; "\0\0He"
        lea  eax, [esp+2]  ; eax -> "Hello, World!"
        push eax           ; argument for printf
        call ptrprintf     ; printf
        add  esp, 20       ; drop the five pushed dwords from the stack
    }
}

Compile and look at the disassembly.

00A8B001 68 6C 64 21 00       push 21646Ch
00A8B006 68 20 57 6F 72       push 726F5720h
00A8B00B 68 6C 6C 6F 2C       push 2C6F6C6Ch
00A8B010 68 00 00 48 65       push 65480000h
00A8B015 8D 44 24 02          lea  eax,[esp+2]
00A8B019 50                   push eax
00A8B01A FF 15 00 90 A8 00    call dword ptr [ptrprintf (0A89000h)]
00A8B020 83 C4 14             add  esp,14h
00A8B023 C3                   ret

From here we need to take the code bytes.

To strip the assembly mnemonics by hand, you can use regular expressions in Notepad++.
Regular expression for everything that follows the code bytes (two spaces and the rest of the line):

  {2} *.*

The beginning of each line can be removed with the TextFX plugin for Notepad++:

TextFX -> “TextFX Tools” -> “Delete Line Numbers or First Word”, with all lines selected.

After that, we have an almost ready sequence of code bytes for the array.

 68 6C 64 21 00
 68 20 57 6F 72
 68 6C 6C 6F 2C
 68 00 00 48 65
 8D 44 24 02
 50
 FF 15 00 90 A8 00;  FF 15 is an indirect call: the next 4 bytes are the address of the variable that holds the function's address
 83 C4 14
 C3
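As an alternative to the editor-based cleanup, the code bytes can be pulled out of a listing line programmatically. The sketch below assumes the layout shown in the disassembly above: an address token, then two-digit hex bytes separated by whitespace, then the mnemonic. The name extract_code_bytes is hypothetical and not part of the original project.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Converts one hex digit (already validated with isxdigit) to its value. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/*
 * Hypothetical helper (not from the article): skips the address token of a
 * listing line, then collects every following token that is exactly two hex
 * digits. Stops at the first other token (the mnemonic) or when `out` is
 * full. Returns the number of bytes stored.
 */
size_t extract_code_bytes(const char *line, uint8_t *out, size_t max)
{
    const char *p = line;
    size_t n = 0;

    assert(line != NULL && out != NULL);

    while (*p && isspace((unsigned char)*p)) p++;    /* leading blanks */
    while (*p && !isspace((unsigned char)*p)) p++;   /* address token  */

    for (;;) {
        while (*p && isspace((unsigned char)*p)) p++;
        if (!isxdigit((unsigned char)p[0]) || !isxdigit((unsigned char)p[1]))
            break;                                   /* not a byte token */
        if (p[2] != '\0' && !isspace((unsigned char)p[2]))
            break;                                   /* longer token, e.g. "add" */
        if (n == max)
            break;
        out[n++] = (uint8_t)(hex_val(p[0]) * 16 + hex_val(p[1]));
        p += 2;
    }
    return n;
}
```

A mnemonic that happened to be exactly two hex digits long would be misread as a byte; none of the instructions in this listing has that shape.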


Calling a function at a “known in advance” address


For a long time I wondered how to leave an address from the function table in the finished sequence when only the compiler knows it. After asking a few programmer friends and experimenting, I realized that the address needed for the call can be obtained by taking the address of a pointer variable that points to the function. So that is what I did.

#include <stdio.h>

const void *ptrprintf = printf;

void main() {
    void *funccall = &ptrprintf;
    __asm {
        call ptrprintf
    }
}



As the debugger shows, funccall holds exactly the address used by the call instruction. That is exactly what is needed.
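The double indirection can also be expressed in plain C. The sketch below is only an illustration: it uses a typed function pointer instead of the void * from the project, and the names printf_fn and call_through_slot are hypothetical. It mimics what call dword ptr [address] does: read the function address stored at a memory location, then call it.

```c
#include <assert.h>
#include <stdio.h>

/* Typed stand-in for the project's `const void *ptrprintf` (an assumption
 * made here so that the call is valid portable C). */
typedef int (*printf_fn)(const char *, ...);

const printf_fn ptrprintf = printf;

/*
 * Hypothetical helper: `slot` plays the role of the 4 bytes after FF 15,
 * i.e. the address of the variable that holds the function's address.
 */
int call_through_slot(const printf_fn *slot, const char *message)
{
    assert(slot != NULL && *slot != NULL);
    return (*slot)("%s", message);   /* load the pointer, then call it */
}
```

Calling call_through_slot(&ptrprintf, "Hello, World!\n") prints the message. The expression &ptrprintf is the same one that ends up in the array, because it is an address the compiler and linker can fill in at build time.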

Putting it all together


So, we have a sequence of assembly code bytes, in which we need to leave an expression that the compiler will turn into the address needed to call printf. The address is 4 bytes long (because we are writing code for a 32-bit platform), so the array must consist of 4-byte values, arranged so that the element right after the FF 15 bytes is the one where we place our address.

Simple substitutions give us the desired sequence.
We take the byte sequence of our assembly code obtained earlier. Since the 4 bytes after FF 15 must form a single value, we regroup the bytes around them accordingly. The missing bytes are filled with the nop instruction, opcode 0x90.

 90 68 6C 64
 21 00 68 20
 57 6F 72 68
 6C 6C 6F 2C
 68 00 00 48
 65 8D 44 24 
 02 50 FF 15
 00 90 A8 00;  address to call printf
 83 C4 14 C3

Once again, we turn these into 4-byte little-endian values. For rearranging columns, multi-line (column) selection in Notepad++ with Alt + Shift is very handy:

 646C6890
 20680021
 68726F57
 2C6F6C6C
 48000068
 24448D65
 15FF5002
 00000000;  placeholder for the address used to call printf, to be replaced by the expression &ptrprintf
 C314C483
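The regrouping can be checked with a short program. The sketch below is an assumption-laden illustration, not the author's tooling: pack_code_words, hello_code and HELLO_ADDR_OFFSET are hypothetical names. It prepends as many nop bytes as needed so that the 4-byte address operand starts on an element boundary, then packs the bytes into little-endian 32-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOP 0x90

/* Code bytes from the listing; the operand bytes are the ones seen in the
 * debugger and only mark the slot that &ptrprintf will occupy. */
const uint8_t hello_code[] = {
    0x68, 0x6C, 0x64, 0x21, 0x00,   /* push 21646Ch   */
    0x68, 0x20, 0x57, 0x6F, 0x72,   /* push 726F5720h */
    0x68, 0x6C, 0x6C, 0x6F, 0x2C,   /* push 2C6F6C6Ch */
    0x68, 0x00, 0x00, 0x48, 0x65,   /* push 65480000h */
    0x8D, 0x44, 0x24, 0x02,         /* lea eax,[esp+2] */
    0x50,                           /* push eax       */
    0xFF, 0x15,                     /* call dword ptr [...] */
    0x00, 0x90, 0xA8, 0x00,         /* address operand */
    0x83, 0xC4, 0x14,               /* add esp,14h    */
    0xC3                            /* ret            */
};
#define HELLO_ADDR_OFFSET 27        /* offset of the address operand */

/*
 * Hypothetical helper: packs `code` into little-endian words, padding with
 * nop in front (and behind, if needed) so the operand at `addr_off` fills
 * exactly one word. *addr_index receives that word's index.
 * Returns the number of words, or 0 if `out` is too small.
 */
size_t pack_code_words(const uint8_t *code, size_t len, size_t addr_off,
                       uint32_t *out, size_t max_words, size_t *addr_index)
{
    size_t lead, words;

    assert(code != NULL && out != NULL && addr_index != NULL);
    lead  = (4 - addr_off % 4) % 4;      /* nops placed before the code */
    words = (lead + len + 3) / 4;
    if (words > max_words)
        return 0;

    for (size_t w = 0; w < words; w++) {
        uint32_t value = 0;
        for (size_t k = 0; k < 4; k++) {
            size_t pos = w * 4 + k;
            uint8_t b = NOP;
            if (pos >= lead && pos - lead < len)
                b = code[pos - lead];
            value |= (uint32_t)b << (8 * k);
        }
        out[w] = value;
    }
    *addr_index = (lead + addr_off) / 4;
    return words;
}
```

For the bytes above, one nop is prepended and the address lands in element 7 of nine, matching the table.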


Now we have a sequence of 4-byte numbers plus the address for calling printf, and we can finally fill in our main array.

#include <stdio.h>

const void *ptrprintf = printf;

#pragma section(".exre", execute, read)
__declspec(allocate(".exre")) int main[] = {
    0x646C6890, 0x20680021, 0x68726F57,
    0x2C6F6C6C, 0x48000068, 0x24448D65,
    0x15FF5002, &ptrprintf, 0xC314C483
};

To hit a breakpoint in the Visual Studio debugger, replace the first element of the array with 0x646C68CC: the leading nop (0x90) becomes int 3 (0xCC).
Run it and look at the result.



Done!

Conclusion


I apologize if the article seems “for the little ones” to some readers. I tried to describe the process in as much detail as possible while leaving out the obvious. I wanted to share my own experience of this small investigation. I will be glad if someone finds the article interesting, and perhaps even useful.

Here are all the links:

The article "main usually a function"
Description section on msdn
Some explanation of assembler code on stackoverflow

And just in case, here is a link to a 7z archive with the Visual Studio 2013 project.

I also do not rule out that the printf call could be shortened by using a different call encoding, but I did not have time to investigate this.

I would be glad to hear your feedback and comments.

Source: https://habr.com/ru/post/275861/

