    jesstess@kid-charlemagne:~/c$ cat hello.c
    #include <stdio.h>
    int main()
    {
      printf("Hello World\n");
      return 0;
    }
    jesstess@kid-charlemagne:~/c$ gcc -o hello hello.c
    jesstess@kid-charlemagne:~/c$ wc -c hello
    10931 hello
objdump -t hello shows 79 entries in the symbol table, and the standard library is responsible for most of them. So let's stop using it: we drop printf, which also lets us get rid of the include:

    jesstess@kid-charlemagne:~/c$ cat hello.c
    int main()
    {
      char *str = "Hello World";
      return 0;
    }
    jesstess@kid-charlemagne:~/c$ gcc -o hello hello.c
    jesstess@kid-charlemagne:~/c$ wc -c hello
    10892 hello
The size barely changed, because gcc still links in the standard startup files. To confirm this, compile with the -nostdlib option, after which (according to the documentation) gcc "will not use the system libraries and startup files when linking. Only files explicitly passed to the linker will be used."

    jesstess@kid-charlemagne:~/c$ gcc -nostdlib -o hello hello.c
    /usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 00000000004000e8
    jesstess@kid-charlemagne:~/c$ wc -c hello
    1329 hello

The executable is now an order of magnitude smaller, but it crashes as soon as it runs:

    jesstess@kid-charlemagne:~/c$ ./hello
    Segmentation fault
So what is this _start symbol that the program seems to need in order to run? Where is it usually defined when libc is used?

From the linker's point of view, _start, rather than main, is the real entry point of the program. Usually _start is defined in the ELF relocatable file crt1.o. We can verify this by linking hello against crt1.o and noting that _start is now found (although other problems appear instead, because other libc startup symbols are not defined):

    # compile the source without linking
    jesstess@kid-charlemagne:~/c$ gcc -Os -c hello.c
    # now try to link
    jesstess@kid-charlemagne:~/c$ ld /usr/lib/crt1.o -o hello hello.o
    /usr/lib/crt1.o: In function `_start':
    /build/buildd/glibc-2.9/csu/../sysdeps/x86_64/elf/start.S:106: undefined reference to `__libc_csu_fini'
    /build/buildd/glibc-2.9/csu/../sysdeps/x86_64/elf/start.S:107: undefined reference to `__libc_csu_init'
    /build/buildd/glibc-2.9/csu/../sysdeps/x86_64/elf/start.S:113: undefined reference to `__libc_start_main'
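To make the idea of an entry point more tangible, here is a minimal sketch in C that reads the entry-point address straight out of an ELF header. The function name elf_entry_point is made up for this illustration, it relies on the Linux <elf.h> header, and it only handles 64-bit ELF files like the ones in this article. Unlike the rest of the article, it is an ordinary libc-using program.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the ELF header of `path` and stores its
 * entry-point address (the e_entry field) in *entry.
 * Returns 0 on success and -1 on any failure.
 * Assumption: only 64-bit ELF files are handled. */
int elf_entry_point(const char *path, uint64_t *entry)
{
    Elf64_Ehdr header;
    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return -1;

    size_t items = fread(&header, sizeof header, 1, file);
    fclose(file);
    if (items != 1)
        return -1;

    /* The file must start with the ELF magic bytes. */
    if (memcmp(header.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    /* Reject 32-bit ELF files, whose header layout differs. */
    if (header.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    *entry = header.e_entry;
    return 0;
}
```

For a non-relocatable executable such as the hello binary built here, the value stored in e_entry should be the address of _start, not the address of main. For position-independent executables the field holds an offset relative to the load address instead. The same field is printed by readelf -h as "Entry point address".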
This check also conveniently shows where _start lives in the libc source: sysdeps/x86_64/elf/start.S. This delightfully well-commented file exports the _start symbol, initializes the stack and some registers, and calls __libc_start_main. At the very bottom of csu/libc-start.c you can see the call to our program's main:

    /* Nothing fancy, just call the function. */
    result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
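The shape of that final step can be sketched in plain C. This is a deliberately simplified stand-in, not glibc's code: the names sketch_start_main, main_fn and demo_main are invented for the illustration, all libc initialization is left out, and the result is returned to the caller so it can be inspected, whereas the real __libc_start_main passes it to exit().

```c
#include <stddef.h>

/* Signature of a main function that also receives the environment. */
typedef int (*main_fn)(int argc, char **argv, char **envp);

/* Hypothetical stand-in for the last step of __libc_start_main:
 * call the program's main and hand back its result.
 * Assumption: the real function calls exit(result) instead of returning. */
int sketch_start_main(main_fn program_main, int argc, char **argv, char **envp)
{
    /* libc-specific initialization would happen here. */
    int result = program_main(argc, argv, envp);
    return result;
}

/* Example "main" used only to exercise the sketch. */
int demo_main(int argc, char **argv, char **envp)
{
    (void)argv;
    (void)envp;
    return argc + 41; /* 42 when argc is 1 */
}
```

Calling sketch_start_main(demo_main, 1, argv, envp) returns 42, which is the value a real startup routine would turn into the process exit status.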
So that is what _start is for. Conveniently, everything that happens between _start and the call to main can be summarized as: initialize a bunch of things for libc, then call main. Since we don't need libc, let's export our own _start symbol, which knows only how to call main, and link against it:

    jesstess@kid-charlemagne:~/c$ cat stubstart.S
    .globl _start

    _start:
        call main
    jesstess@kid-charlemagne:~/c$ gcc -nostdlib stubstart.S -o hello hello.c
    jesstess@kid-charlemagne:~/c$ ./hello
    Segmentation fault
The program now links without complaints but still segfaults. Let's compile it with debugging information and look at it in gdb: we set a breakpoint at main and step through the program until it crashes:

    jesstess@kid-charlemagne:~/c$ gcc -g -nostdlib stubstart.S -o hello hello.c
    jesstess@kid-charlemagne:~/c$ gdb hello
    GNU gdb 6.8-debian
    Copyright (C) 2008 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu"...
    (gdb) break main
    Breakpoint 1 at 0x4000f4: file hello.c, line 3.
    (gdb) run
    Starting program: /home/jesstess/c/hello

    Breakpoint 1, main () at hello.c:5
    5         char *str = "Hello World";
    (gdb) step
    6         return 0;
    (gdb) step
    7       }
    (gdb) step
    0x00000000004000ed in _start ()
    (gdb) step
    Single stepping until exit from function _start,
    which has no line number information.
    main () at helloint.c:4
    4       {
    (gdb) step

    Breakpoint 1, main () at helloint.c:5
    5         char *str = "Hello World";
    (gdb) step
    6         return 0;
    (gdb) step
    7       }
    (gdb) step

    Program received signal SIGSEGV, Segmentation fault.
    0x0000000000000001 in ?? ()
    (gdb)
Wait, main is executed twice? It's time to look at the assembly:

    jesstess@kid-charlemagne:~/c$ objdump -d hello

    hello:     file format elf64-x86-64

    Disassembly of section .text:

    00000000004000e8 <_start>:
      4000e8:   e8 03 00 00 00          callq  4000f0 <main>
      4000ed:   90                      nop
      4000ee:   90                      nop
      4000ef:   90                      nop

    00000000004000f0 <main>:
      4000f0:   55                      push   %rbp
      4000f1:   48 89 e5                mov    %rsp,%rbp
      4000f4:   48 c7 45 f8 03 01 40    movq   $0x400103,-0x8(%rbp)
      4000fb:   00
      4000fc:   b8 00 00 00 00          mov    $0x0,%eax
      400101:   c9                      leaveq
      400102:   c3                      retq
In short: after returning from the callq to main, we execute a few nops and run straight back into main. Since this second entry into main happened without a return instruction pointer being pushed onto the stack (as the normal function-call sequence would do), the second retq tries to pop a bogus return instruction pointer off the stack and jump to it, and the program crashes. We need a way to terminate properly.
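The fall-through can be replayed with a toy model in C. This is only a sketch under loud assumptions: every instruction occupies one slot, the addresses (starting at an arbitrary TOY_BASE) are invented and do not match the real binary, and main's body is collapsed into a single step. The one realistic detail is the initial stack: on x86-64 Linux the word on top of the stack at process entry is argc, so the model starts with a single value there.

```c
#include <stddef.h>

enum op { OP_CALL_MAIN, OP_NOP, OP_MAIN_BODY, OP_RET };

/* Layout mirrors the disassembly: _start is one call followed by
 * padding nops, and main begins right after the padding. */
static const enum op toy_program[] = {
    OP_CALL_MAIN, /* slot 0: _start: callq main */
    OP_NOP,       /* slot 1: padding            */
    OP_NOP,       /* slot 2: padding            */
    OP_NOP,       /* slot 3: padding            */
    OP_MAIN_BODY, /* slot 4: main: body         */
    OP_RET,       /* slot 5: retq               */
};

#define TOY_LEN ((long)(sizeof toy_program / sizeof toy_program[0]))
#define TOY_BASE 1000L /* invented load address of slot 0 */
#define TOY_MAIN_SLOT 4L

struct toy_trace {
    int main_entries;   /* how many times main's body ran              */
    long crash_address; /* address jumped to outside the program, or -1 */
};

/* Runs the toy program with `initial_stack_top` as the only value on
 * the stack, the way the kernel leaves argc there at process entry. */
struct toy_trace run_toy_start(long initial_stack_top)
{
    long stack[8];
    size_t sp = 0;
    long slot = 0;
    struct toy_trace trace = { 0, -1 };

    stack[sp++] = initial_stack_top;

    while (slot < TOY_LEN) {
        switch (toy_program[slot]) {
        case OP_CALL_MAIN:
            stack[sp++] = TOY_BASE + slot + 1; /* push return address */
            slot = TOY_MAIN_SLOT;
            break;
        case OP_NOP:
            slot++;
            break;
        case OP_MAIN_BODY:
            trace.main_entries++;
            slot++;
            break;
        case OP_RET: {
            if (sp == 0)
                return trace; /* nothing left to pop */
            long target = stack[--sp]; /* pop return address */
            if (target < TOY_BASE || target >= TOY_BASE + TOY_LEN) {
                trace.crash_address = target; /* left the program */
                return trace;
            }
            slot = target - TOY_BASE;
            break;
        }
        }
    }
    return trace;
}
```

With an initial stack top of 1, the model enters main twice and then "returns" to address 1, which is consistent with the 0x1 crash address in the gdb session if the program was started with argc equal to 1.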
After the return from callq, we move 1, the system call number of SYS_exit, into %eax, and, since we want to report a clean exit, put 0, the only argument of SYS_exit, into %ebx. Then we enter the kernel with the int $0x80 interrupt.

    jesstess@kid-charlemagne:~/c$ cat stubstart.S
    .globl _start

    _start:
        call main
        movl $1, %eax
        xorl %ebx, %ebx
        int $0x80
    jesstess@kid-charlemagne:~/c$ gcc -nostdlib stubstart.S -o hello hello.c
    jesstess@kid-charlemagne:~/c$ ./hello
    jesstess@kid-charlemagne:~/c$
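To see that the single argument of SYS_exit really becomes the exit status, the same system call can be issued from C and observed from a parent process. This is a sketch, not part of the libc-free program: it leans on libc for fork, waitpid and the syscall wrapper, and the function name exit_status_via_raw_syscall is invented here. Note also that the SYS_exit constant from <sys/syscall.h> is the native number for the build target (60 on x86-64), while the 1 used above is the number from the 32-bit table that int $0x80 selects.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that terminates itself with a raw
 * SYS_exit system call and returns the exit status the parent observes,
 * or -1 on error. */
int exit_status_via_raw_syscall(int status)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: bypass libc's exit() and ask the kernel directly. */
        syscall(SYS_exit, status);
        /* Not reached in practice; fallback so the child never continues. */
        _exit(127);
    }

    /* Parent: collect the child's termination status. */
    int wait_status = 0;
    if (waitpid(child, &wait_status, 0) < 0)
        return -1;
    if (!WIFEXITED(wait_status))
        return -1;
    return WEXITSTATUS(wait_status);
}
```

Calling exit_status_via_raw_syscall(0) reports a clean exit, the same status the stub above hands to the kernel by zeroing %ebx.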
Source: https://habr.com/ru/post/88101/