
Some details about the main function

At some point I became curious about what the stack of a process's main function contains on Linux. I did some research, and here are the results.

Ways to declare the main function:
1. int main()
2. int main(int argc, char **argv)
3. int main(int argc, char **argv, char **env)
4. int main(int argc, char **argv, char **env, ElfW(auxv_t) auxv[])
5. int main(int argc, char **argv, char **env, char **apple)

argc - the number of parameters
argv - a NULL-terminated array of pointers to the command-line parameter strings
env - a NULL-terminated array of pointers to the environment variable strings. Each string has the form NAME=VALUE
auxv - an array of auxiliary values (available only on PowerPC [1])
apple - the path to the executable file (on MacOS and Darwin [2])
The auxiliary vector is an array of assorted extra information, such as the effective user identifier, the setuid flag, the memory page size, and so on.
Below I describe how to get the array of auxiliary values on i386 and x86_64, and what else the stack segment contains.


The size of the stack segment can be found in the maps file:
cat /proc/10918/maps
...
7ffffffa3000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
...

Before the loader transfers control to main, it initializes the arrays of command-line parameters and environment variables and the auxiliary vector.
After initialization, the upper part of the stack looks like this on a 64-bit system.
Higher addresses are at the top.

No.  Address         Name           Type          Size  Value
1.   0x7ffffffff000                                     Top of the stack segment; accessing it causes a segfault
     0x7ffffffff0f8  NULL           void *        8     0x00 *
2.                   filename[0]    char          1+    "/tmp/a.out"
                                    char          1     0x00
     ...
                     env[1][0]      char          1     0x00
     ...
                                    char          1     0x00
3.   0x7fffffffe5e0  env[0][0]      char          1     ..
                                    char          1     0x00
     ...
                     argv[1][0]     char          1     0x00
     ...
                                    char          1     0x00
4.   0x7fffffffe5be  argv[0][0]     char          1+    "/tmp/a.out"
5.                   Array of random length
6.                   data for auxv  void *[]      48 *
                     AT_NULL        Elf64_auxv_t  16    {0, 0}
     ...
                     auxv[1]        Elf64_auxv_t  16
7.                   auxv[0]        Elf64_auxv_t  16    e.g. {0x0e, 0x3e8}
                     NULL           void *        8     0x00
     ...
                     env[1]         char *        8
8.   0x7fffffffe308  env[0]         char *        8     0x7fffffffe5e0
                     NULL           void *        8     0x00
     ...
                     argv[1]        char *        8
9.   0x7fffffffe2f8  argv[0]        char *        8     0x7fffffffe5be
10.  0x7fffffffe2f0  argc           long int      8 *   number of arguments + 1
11.                  Local variables and arguments of functions called before main
12.                  Local variables of main
13.  0x7fffffffe1fc  argc           int           4     number of arguments + 1
     0x7fffffffe1f0  argv           char **       8     0x7fffffffe2f8
     0x7fffffffe1e8  env            char **       8     0x7fffffffe308
14.                  Local function variables

* I did not find descriptions of these fields in the documentation, but they are clearly visible in the dump.

I did not check the 32-bit case, but most likely it is enough to halve the sizes.

1. Accessing addresses above the top of the stack causes a segfault.
2. A string containing the path to the executable file.
3. The environment variable strings.
4. The command-line parameter strings.
5. An array of random length. It can be disabled with either of these commands:
sysctl -w kernel.randomize_va_space=0
echo 0 > /proc/sys/kernel/randomize_va_space
6. Data for the auxiliary vector (for example, the string "x86_64").
7. The auxiliary vector. More details below.
8. A NULL-terminated array of pointers to the environment variable strings.
9. A NULL-terminated array of pointers to the command-line parameter strings.
10. A machine word containing the number of command-line parameters (one of the arguments of the functions higher up the call chain, see item 11).
11. Local variables and arguments of the functions called before main (_start, __libc_start_main, ...).
12. Variables declared in main.
13. The arguments of main.
14. Variables and arguments of local functions.

Auxiliary vector
On i386 and x86_64, main is not given the address of the first element of the auxiliary vector, but the contents of the vector can be obtained in other ways. One of them is to read the memory area that immediately follows the array of pointers to the environment variable strings.
It should look something like this:
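#include <stdio.h>
#include <elf.h>

int main(int argc, char **argv, char **env)
{
    Elf64_auxv_t *auxv;     /* x86_64 */
    /* Elf32_auxv_t *auxv;     i386 */

    /* Skip past the NULL that terminates the array of environment pointers */
    while (*env++ != NULL)
        ;

    /* The auxiliary vector starts right after it and ends with AT_NULL */
    for (auxv = (Elf64_auxv_t *)env; auxv->a_type != AT_NULL; auxv++) {
        printf("addr: %p type: %lx is: 0x%lx\n", (void *)auxv,
               (unsigned long)auxv->a_type, (unsigned long)auxv->a_un.a_val);
    }

    /* Pointer differences are taken on char * (byte units) */
    printf("\n (void*)(*argv) - (void*)auxv= %p - %p = %td\n"
           " (void*)(argv)-(void*)(&auxv)=%p-%p = %td\n ",
           (void *)(*argv), (void *)auxv, (char *)(*argv) - (char *)auxv,
           (void *)argv, (void *)&auxv, (char *)argv - (char *)&auxv);

    /* The machine word just below argv[0] holds the original argc */
    printf("\n argc copy: %d\n", *((int *)(argv - 1)));
    return 0;
}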

The Elf{32,64}_auxv_t structures are described in /usr/include/elf.h. The functions that fill them in are in linux-kernel/fs/binfmt_elf.c.

The second way to get the contents of the vector:
hexdump /proc/self/auxv

The most readable representation is obtained by setting the environment variable LD_SHOW_AUXV.

LD_SHOW_AUXV=1 ls
AT_HWCAP:        bfebfbff         // processor capabilities
AT_PAGESZ:       4096             // memory page size
AT_CLKTCK:       100              // tick frequency of times()
AT_PHDR:         0x400040         // program header information
AT_PHENT:        56
AT_PHNUM:        9
AT_BASE:         0x7fd00b5bc000   // base address of the interpreter, i.e. ld.so
AT_FLAGS:        0x0
AT_ENTRY:        0x402490         // program entry point
AT_UID:          1000             // user and group identifiers,
AT_EUID:         1000             // real and effective
AT_GID:          1000
AT_EGID:         1000
AT_SECURE:       0                // whether the setuid flag is raised
AT_RANDOM:       0x7fff30bdc809   // address of 16 random bytes
                                  // generated at startup
AT_SYSINFO_EHDR: 0x7fff30bff000   // pointer to the page used for
                                  // system calls
AT_EXECFN:       /bin/ls
AT_PLATFORM:     x86_64
The variable name is on the left and its value is on the right. All possible variable names and their descriptions can be found in elf.h (the constants with the AT_ prefix).

Return from main()
After the process context is initialized, control is transferred not to main() but to the _start() function.
main() is then called from __libc_start_main. This function has an interesting feature: it is passed a pointer to a function that must be executed after main(). And this pointer is, naturally, passed through the stack.
According to glibc-2.11/sysdeps/ia64/elf/start.S, the arguments of __libc_start_main are, in general:
/*
 * Arguments for __libc_start_main:
 *   out0: main
 *   out1: argc
 *   out2: argv
 *   out3: init
 *   out4: fini       // function called after main
 *   out5: rtld_fini
 *   out6: stack_end
 */
To get the address of the fini pointer, you need to step two machine words up from the last local variable of main.
Here is what I ended up with (whether it works depends on the compiler version):
#include <stdio.h>

void **ret;
void *leave;

void foo(void)
{
    void (*boo)(void);              /* function pointer */

    printf("Stack rewrite!\n");
    boo = (void (*)(void))leave;
    boo();                          /* fini() */
}

int main(int argc, char *argv[], char *envp[])
{
    unsigned long int mark = 0xbfbfbfbfbfbfbfbf;   /* marker to count from */

    ret = (void **)(&mark + 2);     /* slot holding the address that runs after main (fini) */
    leave = *ret;                   /* remember the original value */
    *ret = (void *)foo;             /* overwrite it */
    return 0;                       /* control passes to foo() */
}


I hope it was interesting.
Good luck.

Thanks to Xeor for the useful tip.

1. www.gelato.unsw.edu.au/IA64wiki/AuxiliaryVector
2. unixjunkie.blogspot.com/2006/02/char-apple-argument-vector.html
3. articles.manugarg.com/aboutelfauxiliaryvectors.html
4. www.phrack.org/issues.html?issue=58&id=5#article
5. unixforum.org/index.php?showtopic=94993&st=30
6. sources.redhat.com/ml/libc-alpha/2007-06/msg00108.html
7. linux-kernel/fs/binfmt_elf.c
8. /usr/include/elf.h
9. glibc-2.11/sysdeps/ia64/elf/libc-start.c

Source: https://habr.com/ru/post/128111/

