x86_64
Linux and at the same time began to dive into the source code of the Linux kernel.
IP 0xfff0
CS selector 0xf000
CS base 0xffff0000
0-0xFFFFF
or 1 MB
. But it has only 16-bit registers, and the largest value a 16-bit register can hold is 2^16-1
or 0xffff
(64 KB).
65536
bytes (64 KB). Since 16-bit registers alone cannot address memory above 64 KB, an alternative method was developed.
* 16
. Thus, to get a physical address in memory, multiply the segment selector by 16 and add the offset to it: PhysicalAddress = Segment Selector * 16 + Offset
CS:IP
register has the value 0x2000:0x0010
, then the corresponding physical address would be: >>> hex((0x2000 << 4) + 0x0010) '0x20010'
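The translation above can be wrapped in a small helper. This is a minimal sketch: the function name `physical_address` and the `a20_enabled` parameter are illustrative and do not come from any real API. With `a20_enabled=False` the result is truncated to 20 bits, which models an address bus that only has lines A0 to A19.

```python
def physical_address(segment: int, offset: int, a20_enabled: bool = True) -> int:
    """Translate a real-mode segment:offset pair into a physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 is the same as multiplying the selector by 16.
    address = (segment << 4) + offset
    if not a20_enabled:
        # Only 20 address lines are available, so bit 20 is dropped.
        address &= 0xFFFFF
    return address


print(hex(physical_address(0x2000, 0x0010)))
print(hex(physical_address(0xFFFF, 0xFFFF)))
print(hex(physical_address(0xFFFF, 0xFFFF, a20_enabled=False)))
```

The three calls cover an ordinary address, the largest address a segment:offset pair can express, and the same pair wrapped to 20 bits.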
0xffff:0xffff
, we get the address: >>> hex((0xffff << 4) + 0xffff) '0x10ffef'
65520
bytes after the first megabyte. Since only one megabyte is available in real mode, 0x10ffef
becomes 0x00ffef
with the A20 line disabled.
CS
register consists of two parts: a visible segment selector and a hidden base address. The base address is normally formed by multiplying the value of the segment selector by 16, but during a hardware reset the segment selector in the CS register is set to 0xf000
, and the base address is 0xffff0000
. The processor uses this special base address until CS changes. >>> hex(0xffff0000 + 0xfff0) '0xfffffff0'
0xfffffff0
, which is 16 bytes below 4 GB. This point is called the reset vector. It is the memory location where the CPU expects to find the first instruction to execute after a reset: a jump ( jmp ), which usually points to the BIOS entry point. For example, if we look at the coreboot source code ( src/cpu/x86/16bit/reset16.inc
), we will see:
    .section ".reset", "ax", %progbits
    .code16
.globl _start
_start:
    .byte 0xe9
    .int _start16bit - ( . + 2 )
    ...
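To make the encoding in the listing above concrete, here is a small sketch that decodes a 16-bit near jump. In 16-bit code, opcode 0xe9 is followed by a signed 16-bit displacement relative to the address of the next instruction. The helper name `decode_near_jmp` and the target offset 0xe000 in the example are hypothetical and are not taken from coreboot.

```python
import struct


def decode_near_jmp(code: bytes, ip: int) -> int:
    """Return the 16-bit IP that a `jmp rel16` (opcode 0xE9) located at `ip` jumps to."""
    if len(code) < 3 or code[0] != 0xE9:
        raise ValueError("expected a 3-byte jmp rel16 starting with 0xE9")
    # The displacement is a little-endian signed 16-bit value.
    (rel16,) = struct.unpack_from("<h", code, 1)
    next_ip = ip + 3  # opcode byte + two displacement bytes
    # IP is a 16-bit register, so the result wraps within the segment.
    return (next_ip + rel16) & 0xFFFF


# Hypothetical example: a jump placed at IP 0xfff0 that targets offset 0xe000.
instruction = struct.pack("<Bh", 0xE9, 0xE000 - (0xFFF0 + 3))
print(hex(decode_near_jmp(instruction, 0xFFF0)))
```

The displacement is computed the same way the assembler expression does it: destination minus the address just past the jump instruction.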
jmp
, namely 0xe9
, and the destination address _start16bit - ( . + 2)
.reset
section is 16 bytes, and it is compiled to run from 0xfffffff0
( src/cpu/x86/16bit/reset16.ld
):
SECTIONS {
    /* Trigger an error if I have an unuseable start address */
    _bogus = ASSERT(_start16bit >= 0xffff0000, "_start16bit too low. Please report.");
    _ROMTOP = 0xfffffff0;
    . = _ROMTOP;
    .reset . : {
        *(.reset);
        . = 15;
        BYTE(0x00);
    }
}
0x55
and 0xaa
. They show the BIOS that the device is bootable.
;
; Note: this example uses Intel x86 assembly syntax
;
[BITS 16]

boot:
    mov al, '!'
    mov ah, 0x0e
    mov bh, 0x00
    mov bl, 0x07

    int 0x10
    jmp $

times 510-($-$$) db 0

db 0x55
db 0xaa
nasm -f bin boot.nasm && qemu-system-x86_64 boot
boot
file, which we have just created as a disk image. Since the binary file generated above satisfies the requirements of the boot sector (start at 0x7c00
and ending with the magic sequence), QEMU will treat the binary file as the master boot record (MBR) of the disk image.
0x7c00
in memory. After launching, it raises interrupt 0x10, which simply prints the character !
; the code then pads the rest of the first 510 bytes with zeros and ends with the two magic bytes 0x55
and 0xaa
.objdump
utility:
nasm -f bin boot.nasm
objdump -D -b binary -mi386 -Maddr16,data16,intel boot
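The boot-sector layout can also be checked without an assembler. The sketch below pads a piece of machine code to 510 bytes and appends the 0x55 0xaa signature, mirroring the `times 510-($-$$) db 0` line of the example. The helper names are illustrative, and the two code bytes used here (0xeb 0xfe, a short jump to itself, i.e. `jmp $`) stand in for the full program.

```python
BOOT_SECTOR_SIZE = 512
BOOT_SIGNATURE = bytes([0x55, 0xAA])


def make_boot_sector(code: bytes) -> bytes:
    """Pad `code` with zeros up to byte 510 and append the magic signature."""
    room = BOOT_SECTOR_SIZE - len(BOOT_SIGNATURE)
    if len(code) > room:
        raise ValueError("code does not fit into the first 510 bytes")
    return code + bytes(room - len(code)) + BOOT_SIGNATURE


def is_bootable(sector: bytes) -> bool:
    """Check the two magic bytes at offsets 510 and 511."""
    return len(sector) >= BOOT_SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


sector = make_boot_sector(b"\xeb\xfe")  # `jmp $`: an endless loop
print(len(sector), is_bootable(sector))
```

Writing `sector` to a file would give an image with the same overall shape as the `boot` binary produced by nasm, although with different code bytes.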
PhysicalAddress = Segment Selector * 16 + Offset
0xffff
, so for the largest values the result will be: >>> hex((0xffff * 16) + 0xffff) '0x10ffef'
0x10ffef
is 1 MB + 64 KB - 16 bytes - 1
. The 8086 processor (the first processor with real mode) has a 20-bit address bus. Since 2^20 = 1048576
, the actual available memory is 1 MB.
0x00000000 - 0x000003FF - real-mode interrupt vector table
0x00000400 - 0x000004FF - BIOS data area
0x00000500 - 0x00007BFF - not used
0x00007C00 - 0x00007DFF - our bootloader
0x00007E00 - 0x0009FFFF - not used
0x000A0000 - 0x000BFFFF - video RAM (VRAM)
    0x000B0000 - 0x000B7FFF - monochrome video memory
    0x000B8000 - 0x000BFFFF - color mode video memory
0x000C0000 - 0x000C7FFF - video ROM BIOS
0x000C8000 - 0x000EFFFF - shadow area (BIOS Shadow)
0x000F0000 - 0x000FFFFF - system BIOS
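As a quick way to read the map above, the following sketch looks up which region a given physical address belongs to. The table repeats the top-level ranges from the list, and the names `REAL_MODE_MAP` and `region_of` are made up for this illustration.

```python
# (first address, last address, description), taken from the memory map above.
REAL_MODE_MAP = [
    (0x00000, 0x003FF, "real-mode interrupt vector table"),
    (0x00400, 0x004FF, "BIOS data area"),
    (0x00500, 0x07BFF, "not used"),
    (0x07C00, 0x07DFF, "bootloader"),
    (0x07E00, 0x9FFFF, "not used"),
    (0xA0000, 0xBFFFF, "video RAM"),
    (0xC0000, 0xC7FFF, "video ROM BIOS"),
    (0xC8000, 0xEFFFF, "BIOS shadow area"),
    (0xF0000, 0xFFFFF, "system BIOS"),
]


def region_of(address: int) -> str:
    """Return the description of the region that contains `address`."""
    for first, last, name in REAL_MODE_MAP:
        if first <= address <= last:
            return name
    raise ValueError("address is outside the first megabyte")


print(region_of(0x7C00))
print(region_of(0xB8000))
```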
0xFFFFFFF0
, which is much more than 0xFFFFF
(1 MB). How does the CPU access this address in real mode? The answer is in the coreboot documentation:
0xFFFE_0000 - 0xFFFF_FFFF: 128 KB ROM
grub_main
function initializes the console, gets the base address for the modules, sets the root device, loads and parses the grub configuration file, loads the modules, etc. At the end of the execution, it puts grub into normal mode. The grub_normal_execute
function (from the grub-core/normal/main.c
source file) completes the final preparations and shows the menu for selecting the operating system. When we select one of the grub menu items, the grub_menu_execute_entry
function runs
, which executes the grub boot
command and loads the selected OS.
0x01f1
from the start of the kernel setup code. This offset is specified in the linker script. The kernel header in arch/x86/boot/header.S begins with:
    .globl hdr
hdr:
    setup_sects: .byte 0
    root_flags:  .word ROOT_RDONLY
    syssize:     .long 0
    ram_size:    .word 0
    vid_mode:    .word SVGA_MODE
    root_dev:    .word 0
    boot_flag:   .word 0xAA55
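The fields in the listing above occupy 15 bytes, so a header starting at offset 0x01f1 ends exactly at 0x0200 and the `boot_flag` word lands in the last two bytes of the first sector. The sketch below reads these fields with Python's `struct` module. It is only an illustration: `parse_setup_header` is a made-up name, the image is synthetic, and the values packed into it are arbitrary.

```python
import struct

HEADER_OFFSET = 0x1F1
# byte, word, long, word, word, word, word: little-endian, no padding
HEADER_FORMAT = "<BHIHHHH"
FIELDS = ("setup_sects", "root_flags", "syssize", "ram_size",
          "vid_mode", "root_dev", "boot_flag")


def parse_setup_header(image: bytes) -> dict:
    """Decode the first fields of the setup header from a kernel image."""
    values = struct.unpack_from(HEADER_FORMAT, image, HEADER_OFFSET)
    header = dict(zip(FIELDS, values))
    if header["boot_flag"] != 0xAA55:
        raise ValueError("boot_flag is not 0xAA55")
    return header


# Synthetic 512-byte image with arbitrary field values, for demonstration only.
image = bytearray(512)
struct.pack_into(HEADER_FORMAT, image, HEADER_OFFSET, 4, 1, 0, 0, 0xFFFF, 0, 0xAA55)
print(parse_setup_header(bytes(image)))
```

Because `boot_flag` is stored little-endian, the bytes on disk are 0x55 followed by 0xaa, the same magic sequence that marks a boot sector.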
write
in the Linux boot protocol, as in this example) with values that were either received from the command line or calculated at boot time. We will not dwell on every header field now; later we will discuss how the kernel uses them. For a description of all the fields, see the boot protocol.
         | Protected-mode kernel  |
100000   +------------------------+
         | I/O memory hole        |
0A0000   +------------------------+
         | Reserved for BIOS      | Leave as much as possible unused
         ~                        ~
         | Command line           | (Can also be below the X+10000 mark)
X+10000  +------------------------+
         | Stack/heap             | For use by the kernel real-mode code.
X+08000  +------------------------+
         | Kernel setup           | The kernel real-mode code.
         | Kernel boot sector     | The kernel legacy boot sector.
       X +------------------------+
         | Boot loader            | <- Boot sector entry point 0x7C00
001000   +------------------------+
         | Reserved for MBR/BIOS  |
000800   +------------------------+
         | Typically used by MBR  |
000600   +------------------------+
         | BIOS use only          |
000000   +------------------------+
X + sizeof (KernelBootSector) + 1
X
is the address of the kernel boot sector. In our case, X
is 0x10000
, as seen in the memory dump:
qemu-system-x86_64 vmlinuz-3.18-generic
header.S
starts with the magic number MZ (see the dump above), the text of the error message, and the PE header:
#ifdef CONFIG_EFI_STUB
# "MZ", MS-DOS header
.byte 0x4d
.byte 0x5a
#endif
...
...
...
pe_header:
    .ascii "PE"
    .word 0
// header.S line 292
.globl _start
_start:
0x200
from MZ
) and goes straight to it, although header.S
begins with the .bstext
section, which holds the text of the error message:
//
// arch/x86/boot/setup.ld
//
. = 0;                    // current position
.bstext : { *(.bstext) }  // put .bstext section to position 0
.bsdata : { *(.bsdata) }

    .globl _start
_start:
    .byte 0xeb
    .byte start_of_setup-1f
1:
    //
    // rest of the header
    //
jmp
( 0xeb
), which jumps to start_of_setup-1f
. In the Nf
notation, for example, 2f
refers to the next local label 2:
. In our case, this is label 1
, which sits immediately after the jump and is followed by the rest of the setup header. Immediately after the setup header, we see the .entrytext
section, which starts at the start_of_setup
label.
jmp
instruction is at offset 0x200
from the beginning of the kernel real-mode code, that is, after the first 512 bytes. This can be seen both in the Linux kernel boot protocol and in the grub2 source code:
segment = grub_linux_real_target >> 4;
state.gs = state.fs = state.es = state.ds = state.ss = segment;
state.cs = segment + 0x20;
0x10000
. This means that after the kernel setup starts, the segment registers will have the following values:
gs = fs = es = ds = ss = 0x1000
cs = 0x1020
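The arithmetic behind these register values can be replayed in a few lines. The sketch below mirrors the grub2 snippet: the load address is shifted right by 4 to obtain a segment selector, and 0x20 is added for cs, which corresponds to 0x200 bytes. The function name `grub_segments` is illustrative, and 0x10000 is the load address used in this article.

```python
def grub_segments(real_target: int) -> dict:
    """Compute the segment register values for a given real-mode load address."""
    segment = real_target >> 4
    regs = {name: segment for name in ("gs", "fs", "es", "ds", "ss")}
    # 0x20 paragraphs of 16 bytes each: cs points 0x200 bytes past the load address.
    regs["cs"] = segment + 0x20
    return regs


regs = grub_segments(0x10000)
entry = regs["cs"] << 4          # physical address reached by cs:0
same_entry = (regs["ds"] << 4) + 0x200  # the same byte, addressed as ds:0x200
print(hex(regs["ds"]), hex(regs["cs"]), hex(entry), entry == same_entry)
```

The last comparison shows that cs:0 and ds:0x200 name the same physical byte, so the same code can be reached through either selector as long as the offset is adjusted to match.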
start_of_setup
kernel should do the following:
ds
and es
segment registers point to the same address. Then it clears the direction flag using the cld
instruction:
    movw %ds, %ax
    movw %ax, %es
    cld
0x10000
, and cs
at 0x1020
, because execution does not start from the beginning of the file, but from the jump here:
_start:
    .byte 0xeb
    .byte start_of_setup-1f
512
bytes from 4d 5a. It is also necessary to align cs
from 0x1020
to 0x1000
, as well as all the other segment registers. After that, we set up the stack:
    pushw %ds
    pushw $6f
    lretw
ds
onto the stack, followed by the address of label 6, and then executes the lretw
instruction, which loads the address of label 6
into the instruction pointer register and loads cs
with the value of ds
. After that, ds
and cs
will have the same values.
ss
register and create the correct stack if the ss
value is incorrect:
    movw %ss, %dx
    cmpw %ax, %dx
    movw %sp, %dx
    je 2f
ss has a valid value of 0x1000 (like all the other segment registers except cs)
ss does not have a valid value, and the CAN_USE_HEAP flag is set (see below)
ss does not have a valid value, and the CAN_USE_HEAP flag is not set (see below)
ss has a valid value ( 0x1000 ). In this case, we go to label 2:
2:  andw $~3, %dx
    jnz 3f
    movw $0xfffc, %dx
3:  movw %ax, %ss
    movzwl %dx, %esp
    sti
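The effect of the instructions at labels 2 and 3 on the stack pointer can be modelled in a few lines. This is a sketch of the arithmetic only, not of the segment register handling, and the function name `choose_stack_pointer` is made up for illustration.

```python
def choose_stack_pointer(sp: int) -> int:
    """Model `andw $~3, %dx; jnz 3f; movw $0xfffc, %dx` for a 16-bit sp value."""
    dx = sp & ~3 & 0xFFFF   # clear the two low bits: align down to 4 bytes
    if dx == 0:
        dx = 0xFFFC         # fall back to the top of the 64 KB segment
    return dx


print(hex(choose_stack_pointer(0xF7F4)))  # already aligned, kept as is
print(hex(choose_stack_pointer(0x0000)))  # zero, replaced by the fallback
```

An sp value that is already aligned and non-zero passes through unchanged, which is why the loader-provided value survives in the common case.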
Here the dx register (which contains the sp value specified by the loader) is aligned down to 4 bytes and checked for zero. If it is zero, we put the value 0xfffc into dx (the last 4-byte-aligned address below the maximum segment size of 64 KB). If it is not zero, we continue to use the sp value specified by the loader ( 0xf7f4 in our case). Then we put the value of ax into ss, which holds the correct segment address 0x1000, and set up the correct sp. Now we have the right stack:
ss != ds
. First, we put the value of _end (the end address of the setup code) into dx
and check the loadflags
header field using the testb
instruction to see whether the heap can be used. loadflags is a bitmask header field, which is defined as follows:
#define LOADED_HIGH     (1<<0)
#define QUIET_FLAG      (1<<5)
#define KEEP_SEGMENTS   (1<<6)
#define CAN_USE_HEAP    (1<<7)
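The bit test that `testb` performs on this field can be illustrated as follows. The constants repeat the definitions above; the helper names `decode_loadflags` and `can_use_heap` and the sample value 0x81 are assumptions made for this sketch.

```python
LOADED_HIGH = 1 << 0
QUIET_FLAG = 1 << 5
KEEP_SEGMENTS = 1 << 6
CAN_USE_HEAP = 1 << 7

FLAG_NAMES = {
    LOADED_HIGH: "LOADED_HIGH",
    QUIET_FLAG: "QUIET_FLAG",
    KEEP_SEGMENTS: "KEEP_SEGMENTS",
    CAN_USE_HEAP: "CAN_USE_HEAP",
}


def decode_loadflags(loadflags: int) -> list:
    """List the names of the known bits that are set in `loadflags`."""
    return [name for bit, name in FLAG_NAMES.items() if loadflags & bit]


def can_use_heap(loadflags: int) -> bool:
    """Equivalent of testing the CAN_USE_HEAP bit with `testb`."""
    return bool(loadflags & CAN_USE_HEAP)


sample = 0x81  # hypothetical value: LOADED_HIGH | CAN_USE_HEAP
print(decode_loadflags(sample), can_use_heap(sample))
```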
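Field name: loadflags
This field is a bitmask.
Bit 7 (write): CAN_USE_HEAP
Set this bit to 1 to indicate that the value entered in heap_end_ptr is valid. If this bit is clear, some setup code functionality will be disabled.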
If the CAN_USE_HEAP bit is set, we put the value of heap_end_ptr (which points to _end) into dx and add STACK_SIZE (the minimum stack size, 1024 bytes) to it. After that, we go to label 2 (as in the previous case) and set up the correct stack.
If CAN_USE_HEAP is not set, we simply use a minimal stack from _end to _end + STACK_SIZE.
Next, the setup code compares setup_sig with the magic number 0x5a5aaa55 and jumps to setup_bad if they differ:
    cmpl $0x5a5aaa55, setup_sig
    jne setup_bad
    movw $__bss_start, %di
    movw $_end+3, %cx
    xorl %eax, %eax
    subw %di, %cx
    shrw $2, %cx
    rep; stosl
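The loop above can be emulated step by step to see which bytes it touches. The sketch below models memory as a `bytearray` and follows the same register arithmetic. It assumes that `bss_start` is 4-byte aligned; the function name `clear_bss` and the addresses in the example are hypothetical.

```python
def clear_bss(memory: bytearray, bss_start: int, end: int) -> int:
    """Emulate the rep; stosl loop and return the number of 32-bit stores performed."""
    di = bss_start                           # movw $__bss_start, %di
    cx = (end + 3) & 0xFFFF                  # movw $_end+3, %cx
    eax = 0                                  # xorl %eax, %eax
    cx = (cx - di) & 0xFFFF                  # subw %di, %cx
    cx >>= 2                                 # shrw $2, %cx
    stores = cx
    while cx:                                # rep; stosl
        memory[di:di + 4] = eax.to_bytes(4, "little")
        di += 4
        cx -= 1
    return stores


memory = bytearray(b"\xff" * 64)
print(clear_bss(memory, 16, 30), memory[16:32] == bytes(16))
```

Adding 3 before the shift rounds the byte count up to a whole number of 32-bit stores, so a section whose size is not a multiple of four is still cleared completely.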
di
. Then the address _end + 3
(+3 for alignment by 4 bytes) is moved to cx
. The eax
register is cleared (using the xor
instruction), the size of the bss ( cx-di
) section is calculated and it is placed in cx
. Then cx
is divided by four (the size of a “word”), and the stosl
instruction is executed repeatedly, storing the value of eax
(zero) at the address pointed to by di
, automatically increasing di
by four, and repeating until cx
reaches zero. The net effect of this code is that zeros are written to all words in memory from __bss_start
to _end
.
main()
C function:
    calll main
main()
function is in arch/x86/boot/main.c. We will talk about it in the next part.
memset
, memcpy
, earlyprintk
, .
Source: https://habr.com/ru/post/428664/