x86_64 Linux and at the same time began to dive into the source code of the Linux kernel.

    IP          0xfff0
    CS selector 0xf000
    CS base     0xffff0000
0-0xFFFFF, or 1 MB. But it has only 16-bit registers, whose maximum value is 2^16 - 1, or 0xffff (64 KB).

65536 bytes (64 KB). Since 16-bit registers cannot address memory above 64 KB, an alternative method was developed.

The base address of a segment is the segment selector * 16. Thus, to get a physical address in memory, multiply the segment selector by 16 and add the offset to it:

    PhysicalAddress = SegmentSelector * 16 + Offset

For example, if CS:IP has the value 0x2000:0x0010, then the corresponding physical address is:

    >>> hex((0x2000 << 4) + 0x0010)
    '0x20010'

If we take the largest segment selector and offset, 0xffff:0xffff, we get the address:

    >>> hex((0xffff << 4) + 0xffff)
    '0x10ffef'

which is 65520 bytes past the first megabyte. Since only one megabyte is addressable in real mode, 0x10ffef becomes 0x00ffef when the A20 line is disabled.

The CS register consists of two parts: a visible segment selector and a hidden base address. The base address is usually formed by multiplying the segment selector by 16, but during a hardware reset the segment selector in CS is set to 0xf000 and the base address to 0xffff0000. The processor uses this special base address until CS changes.

    >>> hex(0xffff0000 + 0xfff0)
    '0xfffffff0'

We get 0xfffffff0, which is 16 bytes below 4 GB. This point is called the reset vector. It is the memory location where the CPU expects to find the first instruction to execute after a reset: a jump (jmp), which usually points to the BIOS entry point. For example, if we look at the coreboot source code (src/cpu/x86/16bit/reset16.inc), we see:

        .section ".reset", "ax", %progbits
        .code16
    .globl _start
    _start:
        .byte 0xe9
        .int _start16bit - ( . + 2 )
        ...

Here we see the opcode of the jmp instruction, namely 0xe9, and the destination address _start16bit - ( . + 2).

We also see that the .reset section is 16 bytes long and is linked to start at 0xfffffff0 (src/cpu/x86/16bit/reset16.ld):

    SECTIONS {
        /* Trigger an error if I have an unuseable start address */
        _bogus = ASSERT(_start16bit >= 0xffff0000, "_start16bit too low. Please report.");
        _ROMTOP = 0xfffffff0;
        . = _ROMTOP;
        .reset . : {
            *(.reset);
            . = 15;
            BYTE(0x00);
        }
    }

0x55 and 0xaa. They tell the BIOS that the device is bootable.

    ;
    ; Intel x86 assembly syntax
    ;
    [BITS 16]

    boot:
        mov al, '!'
        mov ah, 0x0e
        mov bh, 0x00
        mov bl, 0x07

        int 0x10
        jmp $

    times 510-($-$$) db 0

    db 0x55
    db 0xaa

Build and run it:

    nasm -f bin boot.nasm && qemu-system-x86_64 boot

This tells QEMU to use the boot file, which we have just created, as a disk image. Since the generated binary meets the requirements of a boot sector (it starts at 0x7c00 and ends with the magic sequence), QEMU treats it as the master boot record (MBR) of the disk image.
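The boot-sector layout described above (code, zero padding up to byte 510, then the bytes 0x55 and 0xaa) can also be reproduced and checked without an assembler. The following is only a sketch: the helper names make_boot_sector and is_boot_sector are hypothetical, and the opcode bytes are a hand encoding of the assembly listing, assumed to match what nasm emits in 16-bit mode.

```python
SECTOR_SIZE = 512
SIGNATURE = b"\x55\xaa"  # magic bytes at offsets 510 and 511


def make_boot_sector(code: bytes) -> bytes:
    """Pad code with zeros to 510 bytes, then append the signature."""
    room = SECTOR_SIZE - len(SIGNATURE)
    if len(code) > room:
        raise ValueError("boot code must fit in 510 bytes")
    return code + bytes(room - len(code)) + SIGNATURE


def is_boot_sector(sector: bytes) -> bool:
    """Check the size and the trailing magic bytes only."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == SIGNATURE


# Hand-encoded (assumed) machine code for the listing:
# mov al,'!' ; mov ah,0x0e ; mov bh,0x00 ; mov bl,0x07 ; int 0x10 ; jmp $
CODE = bytes([0xB0, 0x21, 0xB4, 0x0E, 0xB7, 0x00,
              0xB3, 0x07, 0xCD, 0x10, 0xEB, 0xFE])

sector = make_boot_sector(CODE)
print(len(sector), is_boot_sector(sector))
```

Writing sector to a file would give an image comparable to the one produced by the nasm command, but the byte-for-byte match is an assumption that is not verified here.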
The code runs in 16-bit real mode and starts at 0x7c00 in memory. After launching, it raises interrupt 0x10, which simply prints the character !. The rest of the first 510 bytes is filled with zeros, and the sector ends with the two magic bytes 0x55 and 0xaa.

You can view a binary dump of the result with the objdump utility:

    nasm -f bin boot.nasm
    objdump -D -b binary -mi386 -Maddr16,data16,intel boot

As explained above, in real mode the physical address is calculated as follows:

    PhysicalAddress = SegmentSelector * 16 + Offset

The maximum value of a 16-bit register is 0xffff, so for the largest values the result will be:

    >>> hex((0xffff * 16) + 0xffff)
    '0x10ffef'

0x10ffef is the last byte of a range that is 1 MB + 64 KB - 16 bytes long. The 8086 processor (the first processor with real mode) has a 20-bit address bus. Since 2^20 = 1048576, the memory actually available is 1 MB.

The real-mode memory map looks like this:

    0x00000000 - 0x000003FF - real-mode interrupt vector table
    0x00000400 - 0x000004FF - BIOS data area
    0x00000500 - 0x00007BFF - unused
    0x00007C00 - 0x00007DFF - our bootloader
    0x00007E00 - 0x0009FFFF - unused
    0x000A0000 - 0x000BFFFF - video RAM (VRAM)
    0x000B0000 - 0x000B7FFF - monochrome video memory
    0x000B8000 - 0x000BFFFF - color video memory
    0x000C0000 - 0x000C7FFF - video ROM BIOS
    0x000C8000 - 0x000EFFFF - BIOS shadow area
    0x000F0000 - 0x000FFFFF - system BIOS
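To make the address arithmetic and the memory map concrete, here is a small sketch. The function names physical_address and region_of are made up for illustration, the a20_enabled parameter models the A20 line as a simple 20-bit mask, and the table lists only the top-level regions from the map above.

```python
def physical_address(segment: int, offset: int, a20_enabled: bool = False) -> int:
    """Real-mode translation: segment * 16 + offset, wrapped to 20 bits
    when the A20 line is disabled."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    addr = (segment << 4) + offset
    if not a20_enabled:
        addr &= 0xFFFFF  # only 20 address lines are effective
    return addr


# Top-level regions of the real-mode memory map (inclusive bounds).
REGIONS = [
    (0x00000, 0x003FF, "real-mode interrupt vector table"),
    (0x00400, 0x004FF, "BIOS data area"),
    (0x00500, 0x07BFF, "unused"),
    (0x07C00, 0x07DFF, "bootloader"),
    (0x07E00, 0x9FFFF, "unused"),
    (0xA0000, 0xBFFFF, "video RAM"),
    (0xC0000, 0xC7FFF, "video ROM BIOS"),
    (0xC8000, 0xEFFFF, "BIOS shadow area"),
    (0xF0000, 0xFFFFF, "system BIOS"),
]


def region_of(addr: int) -> str:
    """Return the name of the region that contains addr."""
    for start, end, name in REGIONS:
        if start <= addr <= end:
            return name
    raise ValueError("address is outside the first megabyte")


print(hex(physical_address(0x2000, 0x0010)))
print(hex(physical_address(0xFFFF, 0xFFFF)))
print(hex(physical_address(0xFFFF, 0xFFFF, a20_enabled=True)))
print(region_of(0x7C00))
```

The first print reproduces the 0x2000:0x0010 example. The second and third show the same 0xffff:0xffff pair with the A20 line disabled and enabled.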
0xFFFFFFF0, which is much more than 0xFFFFF (1 MB). How does the CPU access this address in real mode? The answer is in the coreboot documentation:

    0xFFFE_0000 - 0xFFFF_FFFF: 128 KB ROM mapped into the address space

The grub_main function initializes the console, gets the base address for the modules, sets the root device, loads and parses the grub configuration file, loads the modules, and so on. At the end of its execution, it puts grub into normal mode. The grub_normal_execute function (from the grub-core/normal/main.c source file) completes the final preparations and shows the menu for selecting the operating system. When we select one of the grub menu items, the grub_menu_execute_entry function runs, executing the grub boot command and loading the selected OS.

0x01f1 from the kernel setup code. This offset is specified in the linker script. The kernel header in arch/x86/boot/header.S begins with:

        .globl hdr
    hdr:
        setup_sects: .byte 0
        root_flags:  .word ROOT_RDONLY
        syssize:     .long 0
        ram_size:    .word 0
        vid_mode:    .word SVGA_MODE
        root_dev:    .word 0
        boot_flag:   .word 0xAA55

write in the Linux boot protocol, as in this example) with values that were either received from the command line or calculated at boot time. We will not dwell on the descriptions of all the header fields now; we will discuss later how the kernel uses them. For a description of all the fields, see the boot protocol.

After the kernel is loaded, memory is laid out as follows:

              | Protected-mode kernel  |
    100000    +------------------------+
              | I/O memory hole        |
    0A0000    +------------------------+
              | Reserved for BIOS      | Leave as much as possible unused
              ~                        ~
              | Command line           | (Can also be below the X+10000 mark)
    X+10000   +------------------------+
              | Stack/heap             | For use by the kernel real-mode code
    X+08000   +------------------------+
              | Kernel setup           | The kernel real-mode code
              | Kernel boot sector     | The kernel legacy boot sector
    X         +------------------------+
              | Boot loader            | <- Boot sector entry point 0x7C00
    001000    +------------------------+
              | Reserved for MBR/BIOS  |
    000800    +------------------------+
              | Typically used by MBR  |
    000600    +------------------------+
              | BIOS use only          |
    000000    +------------------------+

When the bootloader transfers control to the kernel, execution starts at:

    X + sizeof(KernelBootSector) + 1

where X is the address of the kernel boot sector. In our case, X is 0x10000, as seen in the memory dump.
qemu-system-x86_64 vmlinuz-3.18-generic
header.S starts with the magic number MZ (see the dump above), the text of the error message, and the PE header:

    #ifdef CONFIG_EFI_STUB
    # "MZ", MS-DOS header
    .byte 0x4d
    .byte 0x5a
    #endif
    ...
    ...
    ...
    pe_header:
        .ascii "PE"
        .word 0

The kernel setup entry point is:

    // header.S line 292
    .globl _start
    _start:

The bootloader knows about this point (at offset 0x200 from MZ) and jumps straight to it, although header.S begins with the .bstext section, which holds the text of the error message:

    //
    // arch/x86/boot/setup.ld
    //
    . = 0;                    // current position
    .bstext : { *(.bstext) }  // put .bstext section to position 0
    .bsdata : { *(.bsdata) }

The entry point itself looks like this:

        .globl _start
    _start:
        .byte 0xeb
        .byte start_of_setup-1f
    1:
        //
        // rest of the header
        //

Here we see the opcode of a short jmp (0xeb), which jumps to start_of_setup-1f. In the Nf notation, 2f, for example, refers to the next local label 2:. In our case this is label 1, which comes immediately after the jump and contains the rest of the setup header. Immediately after the setup header we see the .entrytext section, which starts at the start_of_setup label.

The first jmp instruction is at offset 0x200 from the start of the kernel real-mode code, that is, after the first 512 bytes. This can be seen both in the Linux kernel boot protocol and in the grub2 source code:

    segment = grub_linux_real_target >> 4;
    state.gs = state.fs = state.es = state.ds = state.ss = segment;
    state.cs = segment + 0x20;

In our case, the kernel is loaded at physical address 0x10000. This means that once kernel setup starts, the segment registers have the following values (0x1000 and 0x1020 correspond to physical addresses 0x10000 and 0x10200):

    gs = fs = es = ds = ss = 0x1000
    cs = 0x1020

After the jump to start_of_setup, the kernel must make all segment registers equal, set up a correct stack if needed, set up bss, and jump to the C code in main.c.

First, the kernel makes sure that the ds and es segment registers point to the same address. Then it clears the direction flag with the cld instruction:

        movw %ds, %ax
        movw %ax, %es
        cld

As mentioned earlier, grub2 loads the kernel setup code at address 0x10000 and sets cs to 0x1020, because execution does not start from the beginning of the file but from the jump here:

    _start:
        .byte 0xeb
        .byte start_of_setup-1f

which is 512 bytes from 4d 5a. It is also necessary to align cs from 0x1020 to 0x1000, like all the other segment registers, before setting up the stack:

        pushw %ds
        pushw $6f
        lretw

This pushes the value of ds onto the stack, followed by the address of label 6, and executes lretw, which loads the address of label 6 into the instruction pointer and loads cs with the value of ds. After that, ds and cs have the same value.

The next step is to check the ss register and set up a correct stack if the ss value is wrong:

        movw %ss, %dx
        cmpw %ax, %dx
        movw %sp, %dx
        je 2f

This can lead to three different scenarios:

    1. ss has the valid value 0x1000 (like all the other segment registers except cs)
    2. ss is not valid, and the CAN_USE_HEAP flag is set (see below)
    3. ss is not valid, and the CAN_USE_HEAP flag is not set (see below)

Let us take them in turn. In the first scenario, ss has a valid value (0x1000). In this case, we go to label 2:

    2:  andw $~3, %dx
        jnz 3f
        movw $0xfffc, %dx
    3:  movw %ax, %ss
        movzwl %dx, %esp
        sti

Here we align dx (which contains the sp value set by the bootloader) to 4 bytes and check whether the result is zero. If it is zero, we put 0xfffc into dx (the last 4-byte-aligned address below the maximum segment size of 64 KB). If it is not zero, we keep using the sp value set by the bootloader (0xf7f4 in our case). Then we put the value of ax, which holds the correct segment address 0x1000, into ss and set the correct sp. Now we have a correct stack.
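As a rough check of the numbers in this part, the sketch below models two steps in Python: the segment values that the grub2 snippet derives from the load address, and the stack-pointer fix-up at label 2. The function names are hypothetical, and registers are modelled as plain 16-bit integers.

```python
def setup_segments(real_target: int) -> tuple:
    """Mirror the grub2 snippet: the data segments get real_target >> 4,
    cs gets that value + 0x20, so cs:0 is 0x200 bytes past the load address."""
    segment = real_target >> 4
    return segment, segment + 0x20


def fix_stack_pointer(sp: int) -> int:
    """Model label 2: align sp down to 4 bytes, fall back to 0xfffc on zero."""
    dx = sp & ~3 & 0xFFFF      # andw $~3, %dx
    if dx == 0:                # jnz 3f is not taken
        dx = 0xFFFC            # movw $0xfffc, %dx
    return dx


data_seg, code_seg = setup_segments(0x10000)
print(hex(data_seg), hex(code_seg), hex(code_seg << 4))
print(hex(fix_stack_pointer(0xF7F4)), hex(fix_stack_pointer(0)))
```

The first print shows the data segment, the code segment, and the physical address the code segment maps to. The second shows that an already aligned sp such as 0xf7f4 is kept, while a zero sp is replaced with 0xfffc.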
In the second scenario, ss != ds. First we put the value of _end (the end address of the setup code) into dx and test the loadflags header field with the testb instruction to see whether the heap can be used. loadflags is a bitmask header field, defined as follows:

    #define LOADED_HIGH     (1<<0)
    #define QUIET_FLAG      (1<<5)
    #define KEEP_SEGMENTS   (1<<6)
    #define CAN_USE_HEAP    (1<<7)

As the boot protocol says:

    Field name: loadflags

      This field is a bitmask.

      Bit 7 (write): CAN_USE_HEAP
        Set this bit to 1 to indicate that the value entered in the
        heap_end_ptr is valid. If this field is clear, some setup code
        functionality will be disabled.

If the CAN_USE_HEAP bit is set, we put the value of heap_end_ptr (which points to _end) into dx and add STACK_SIZE to it (the minimum stack size is 1024 bytes). After that, we go to label 2 (as in the previous case) and set up a correct stack.

In the third scenario, CAN_USE_HEAP is not set, and we simply use a minimal stack from _end to _end + STACK_SIZE.
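The choice between the second and third scenarios comes down to one bit test. The sketch below models it in Python with several assumptions: stack_top is a hypothetical name, registers are plain integers, STACK_SIZE is the 1024 bytes given in the text, and the wraparound guard is my reading of header.S, not something the text above states.

```python
LOADED_HIGH = 1 << 0
QUIET_FLAG = 1 << 5
KEEP_SEGMENTS = 1 << 6
CAN_USE_HEAP = 1 << 7

STACK_SIZE = 1024  # minimum stack size given in the text


def stack_top(end: int, loadflags: int, heap_end_ptr: int) -> int:
    """Pick the stack pointer when ss is not valid (scenarios 2 and 3)."""
    dx = end                          # start from _end
    if loadflags & CAN_USE_HEAP:      # testb $CAN_USE_HEAP, loadflags
        dx = heap_end_ptr             # heap is usable: stack goes above it
    dx += STACK_SIZE
    if dx > 0xFFFF:                   # assumed guard against 16-bit wraparound
        dx = 0
    dx &= ~3                          # label 2: align to 4 bytes
    if dx == 0:
        dx = 0xFFFC
    return dx


print(hex(stack_top(0x4000, CAN_USE_HEAP, 0x9000)))
print(hex(stack_top(0x4000, QUIET_FLAG, 0x9000)))
```

With CAN_USE_HEAP set, the stack top is derived from heap_end_ptr. With only an unrelated flag set, it falls back to _end + STACK_SIZE.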
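Before jumping to C code, the setup code compares setup_sig with the magic number 0x5a5aaa55 and jumps to setup_bad if they differ:

        cmpl $0x5a5aaa55, setup_sig
        jne setup_bad

It then zeroes the bss section:

        movw $__bss_start, %di
        movw $_end+3, %cx
        xorl %eax, %eax
        subw %di, %cx
        shrw $2, %cx
        rep; stosl

First, the address of __bss_start is moved into di. Then the address _end + 3 (+3 for alignment to 4 bytes) is moved into cx. The eax register is cleared (with the xor instruction), and the size of the bss section (cx - di) is calculated and placed in cx. Then cx is divided by four (the size of a "word"), and the stosl instruction is repeated: each time it stores the value of eax (zero) at the address in di, increases di by four, and decrements cx, until cx reaches zero. The net effect is that zeros are written to every word in memory from __bss_start to _end.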
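The counting in the bss-clearing loop is easy to get wrong by one word, so here is a small Python model of it. This is an illustrative sketch only: clear_bss is a hypothetical name, memory is a plain bytearray, and the addresses in the example are arbitrary.

```python
def clear_bss(memory: bytearray, bss_start: int, end: int) -> int:
    """Model of the rep stosl loop: zero 4-byte words from bss_start,
    rounding the length up to a multiple of 4. Returns the store count."""
    di = bss_start                      # movw $__bss_start, %di
    cx = ((end + 3) - di) >> 2          # movw $_end+3 ; subw ; shrw $2
    for _ in range(cx):                 # rep
        memory[di:di + 4] = bytes(4)    # stosl with eax == 0
        di += 4
    return cx


mem = bytearray(b"\xff" * 32)
stores = clear_bss(mem, bss_start=8, end=18)
print(stores, mem.hex())
```

In this example the section is 10 bytes long, so three 4-byte stores are needed and the last one also zeroes the two bytes just past end.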
Now we have a stack and bss, so we can jump to the main() C function:

        calll main

The main() function is in arch/x86/boot/main.c. We will talk about it in the next part.

Source: https://habr.com/ru/post/428664/