OS1: a primitive kernel in Rust for x86. Part 2: VGA, GDT, IDT

The first article has barely had time to cool down, but I decided not to keep you in suspense and to write a sequel.

In the previous article we talked about linking, loading the kernel file, and early initialization. I gave several useful links and described how the loaded kernel is laid out in memory, how virtual and physical addresses relate at boot time, and how to enable paging. At the end, control passed to the kmain function of my kernel, written in Rust. It's time to move on and find out how deep the rabbit hole goes!

In this part of the notes I briefly describe my Rust configuration, cover VGA output in general terms, and go into detail on setting up segments and interrupts. If you are interested, read on.

Setting up Rust

In general there is nothing particularly difficult about this procedure; for details, see Philipp Oppermann's blog. I will still dwell on a few points.

Stable Rust still does not support some features required for low-level development, so to disable the standard library and build for bare metal we need nightly Rust. Be careful: once, after updating to the latest nightly, I got a completely broken compiler and had to roll back to the nearest working one. If you are sure your compiler worked yesterday but stopped working after an update, run the following command, substituting the date you need:

rustup override add nightly-YYYY-MM-DD

For details on how this mechanism works, see the rustup documentation.

Next, set up the target platform we will build for. I relied on Philipp Oppermann's blog, so many things in this section are taken from him, picked apart, and adapted to my needs. Philipp develops for x86-64 in his blog, whereas I chose 32-bit x86 from the start, so my target.json is somewhat different. Here it is in full:

    {
      "llvm-target": "i686-unknown-none",
      "data-layout": "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128",
      "arch": "x86",
      "target-endian": "little",
      "target-pointer-width": "32",
      "target-c-int-width": "32",
      "os": "none",
      "executables": true,
      "linker-flavor": "ld.lld",
      "linker": "rust-lld",
      "panic-strategy": "abort",
      "disable-redzone": true,
      "features": "-mmx,-sse,+soft-float"
    }

The hardest part here is the “data-layout” parameter. The LLVM documentation tells us that it is a list of data layout specifications separated by “-”. The first one, “e”, sets the endianness: in our case little-endian, as the platform requires. The second, “m”, is mangling; it controls how symbol names are formed at link time. Since our output format is ELF (see the build script), we choose “m:e”. The third is the pointer size and its ABI (application binary interface) alignment, both in bits. This one is simple: we have 32 bits, so we write “p:32:32”. Next come floating-point numbers. We declare 64-bit floats with an ABI alignment of 32 and a preferred alignment of 64, “f64:32:64”, and 80-bit floats with an ABI alignment of 32, “f80:32”. The next element lists the native integer widths of the CPU. We start with 8 bits and go up to the platform maximum of 32 bits: “n8:16:32”. The last one is the stack alignment. I want even 128-bit integers to be aligned, so let it be S128. In any case this is only our preference, and LLVM is free to ignore it.
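To make the structure of such a string easier to see, here is a small host-side sketch (ordinary std Rust, not part of the kernel) that splits a data-layout string on “-” and labels each component by its first letter. The helper name describe_layout and the wording of the labels are my own; the authoritative description is the LLVM documentation.

```rust
/// Splits an LLVM data-layout string into its "-"-separated components
/// and attaches a short label to each one, chosen by the leading letter.
/// `describe_layout` is a hypothetical helper used only for illustration.
fn describe_layout(layout: &str) -> Vec<(String, &'static str)> {
    layout
        .split('-')
        .map(|spec| {
            let label = match spec.chars().next() {
                Some('e') => "little-endian",
                Some('E') => "big-endian",
                Some('m') => "symbol name mangling",
                Some('p') => "pointer size and alignment",
                Some('i') => "integer alignment",
                Some('f') => "floating-point alignment",
                Some('n') => "native integer widths",
                Some('S') => "natural stack alignment",
                _ => "other",
            };
            (spec.to_string(), label)
        })
        .collect()
}
```

Feeding it a layout string of the kind used in target.json gives one labeled entry per component, which is a quick way to check that no separator was lost when copying the string.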

For the rest of the parameters, see Philipp's blog; he explains everything well.

We also need cargo-xbuild, a tool that cross-compiles the core library when building for a custom target platform.
Install it:

 cargo install cargo-xbuild

We will build like this:

 cargo xbuild -Z unstable-options --manifest-path=kernel/Cargo.toml --target kernel/targets/$(ARCH).json --out-dir=build/lib

I had to specify the manifest path for Make to work correctly, since it runs from the root directory while the kernel lives in the kernel directory.

Of the manifest's features, the only one worth highlighting is crate-type = ["staticlib"], which produces a linkable file as output. We will later feed it to LLD.

kmain and initial setup

By Rust convention, if we are creating a static library (or a “flat” binary file), the crate root must contain the file lib.rs, which is the entry point. It uses attributes to configure language features and also holds the cherished kmain.

So, as a first step, we need to disable the std library. This is done with an attribute:

 #![no_std]

With this simple step we immediately give up multithreading, dynamic memory, and the other amenities of the standard library. Moreover, we even deprive ourselves of the println! macro, so we will have to implement it ourselves. I will explain how to do that next time.

Many tutorials end somewhere around here, printing “Hello World” and not explaining how to go on. We will take a different path. First of all, we need to set up the code and data segments for protected mode, configure VGA, and configure interrupts, which is what we will do.

    #![no_std]

    use core::panic::PanicInfo;

    #[macro_use]
    pub mod debug;

    #[cfg(target_arch = "x86")]
    #[path = "arch/i686/mod.rs"]
    pub mod arch;

    #[no_mangle]
    extern "C" fn kmain(pd: usize, mb_pointer: usize, mb_magic: usize) {
        arch::arch_init(pd);
        ......
    }

    #[panic_handler]
    fn panic(info: &PanicInfo) -> ! {
        println!("{}", info);
        loop {}
    }

What is going on here? As I said, we disable the standard library. We also declare two very important modules: debug (which handles writing to the screen) and arch (where all the platform-specific magic lives). I use Rust's conditional compilation (cfg) to declare identical interfaces in different architecture implementations and use them everywhere. Here I stop at x86 only, and from now on we talk only about it.

I declared a thoroughly primitive panic handler, which Rust requires. It can be refined later.

kmain takes three arguments and is exported with the C calling convention and an unmangled name, so that the linker can correctly match the function with the call from _loader, which I described in the previous article. The first argument is the address of the page directory (PD), the second is the physical address of the GRUB structure from which we will retrieve the memory map, and the third is the magic number. In the future I would like to implement both Multiboot 2 support and my own bootloader, so I use the magic number to identify the boot method.

The first call in kmain is platform-specific initialization. Let's go inside. The arch_init function lives in the arch/i686/mod.rs file, is public, is specific to 32-bit x86, and looks like this:
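As an illustration of how the magic number could drive that decision, here is a minimal host-side sketch. The two constants are the values the Multiboot 1 and Multiboot 2 specifications say a compliant loader leaves in EAX; the enum BootMethod and the function detect_boot_method are hypothetical names of mine and are not taken from the kernel's source.

```rust
/// Values a Multiboot-compliant loader passes to the kernel in EAX.
const MULTIBOOT1_MAGIC: usize = 0x2BAD_B002;
const MULTIBOOT2_MAGIC: usize = 0x36D7_6289;

/// Hypothetical classification of how the kernel was started.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum BootMethod {
    Multiboot1,
    Multiboot2,
    /// Anything else, e.g. a custom bootloader with its own magic.
    Unknown(usize),
}

/// Maps the third kmain argument to a boot method.
fn detect_boot_method(mb_magic: usize) -> BootMethod {
    match mb_magic {
        MULTIBOOT1_MAGIC => BootMethod::Multiboot1,
        MULTIBOOT2_MAGIC => BootMethod::Multiboot2,
        other => BootMethod::Unknown(other),
    }
}
```

Keeping the unknown value inside the enum makes it easy to print it from the panic handler when the kernel is started by something unexpected.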

    pub fn arch_init(pd: usize) {
        unsafe {
            vga::VGA_WRITER.lock().init();
            gdt::setup_gdt();
            idt::init_idt();
            paging::setup_pd(pd);
        }
    }

As you can see, for x86 we initialize, in order, output, segmentation, interrupts, and paging. Let's start with VGA.

VGA Initialization

Every tutorial considers it its duty to print Hello World, so you can find out how to work with VGA anywhere. For this reason I will be as brief as possible and focus only on the features I added myself. For the use of lazy_static I refer you to Philipp's blog and will not explain it in detail. const fn is not yet in a stable release, so elegant static initialization is not possible yet. We also add a spinlock so that things do not turn into a mess.

    use lazy_static::lazy_static;
    use spin::Mutex;

    lazy_static! {
        pub static ref VGA_WRITER : Mutex<Writer> = Mutex::new(Writer {
            cursor_position: 0,
            vga_color: ColorCode::new(Color::LightGray, Color::Black),
            buffer: unsafe { &mut *(0xC00B8000 as *mut VgaBuffer) }
        });
    }

As you know, the screen buffer is located at physical address 0xB8000 and is 80x25x2 bytes in size (the screen width and height, times one byte for the character and one for the attributes: colors and blinking). Since we have already turned on virtual memory, accessing this physical address directly would cause a crash, so we add 3 GB. We also dereference a raw pointer, which is unsafe, but we know what we are doing.
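The arithmetic behind these numbers can be checked on the host with a short sketch. VGA_WIDTH and VGA_HEIGHT are the constants used in the kernel code; KERNEL_VIRTUAL_BASE, VGA_PHYS, and the helper functions are names I made up for illustration, and the attribute layout (background in the high nibble, foreground in the low one) is the standard VGA text-mode format rather than something taken from the kernel's ColorCode.

```rust
const KERNEL_VIRTUAL_BASE: u32 = 0xC000_0000; // the 3 GB offset
const VGA_PHYS: u32 = 0x000B_8000;            // physical address of the text buffer
const VGA_WIDTH: usize = 80;
const VGA_HEIGHT: usize = 25;

/// Virtual address of the text buffer once the kernel is mapped at 3 GB.
fn vga_virt() -> u32 {
    KERNEL_VIRTUAL_BASE + VGA_PHYS
}

/// Attribute byte: background color in the high nibble, foreground in the low one.
fn color_code(foreground: u8, background: u8) -> u8 {
    (background << 4) | (foreground & 0x0F)
}

/// One 16-bit screen cell: character in the low byte, attribute in the high byte.
fn screen_cell(character: u8, attribute: u8) -> u16 {
    ((attribute as u16) << 8) | (character as u16)
}

/// Byte offset of the cell at column x, row y from the start of the buffer.
fn cell_byte_offset(x: usize, y: usize) -> usize {
    (y * VGA_WIDTH + x) * 2
}
```

With light gray (7) on black (0), the letter A becomes the cell 0x0741, and the last cell of the screen starts at byte 3998 of the 4000-byte buffer.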
Perhaps the only interesting thing in this file is the implementation of the Writer structure, which allows not only printing characters one after another but also scrolling, moving to any position on the screen, and other pleasant trifles.

VGA Writer

    pub struct Writer {
        cursor_position: usize,
        vga_color: ColorCode,
        buffer: &'static mut VgaBuffer,
    }

    impl Writer {
        // Clear every row of the screen and move the cursor to the top-left corner.
        pub fn init(&mut self) {
            let vga_color = self.vga_color;
            for y in 0..VGA_HEIGHT {
                for x in 0..VGA_WIDTH {
                    self.buffer.chars[y * VGA_WIDTH + x] = ScreenChar {
                        ascii_character: b' ',
                        color_code: vga_color,
                    }
                }
            }
            self.set_cursor_abs(0);
        }

        // Program the hardware cursor: index goes to port 0x3D4, data to port 0x3D5.
        fn set_cursor_abs(&mut self, position: usize) {
            unsafe {
                outb(0x3D4, 0x0F);
                outb(0x3D5, (position & 0xFF) as u8);
                outb(0x3D4, 0x0E);
                outb(0x3D5, ((position >> 8) & 0xFF) as u8);
            }
            self.cursor_position = position;
        }

        pub fn set_cursor(&mut self, x: usize, y: usize) {
            self.set_cursor_abs(y * VGA_WIDTH + x);
        }

        pub fn move_cursor(&mut self, offset: usize) {
            self.cursor_position = self.cursor_position + offset;
            self.set_cursor_abs(self.cursor_position);
        }

        pub fn get_x(&self) -> u8 {
            (self.cursor_position % VGA_WIDTH) as u8
        }

        pub fn get_y(&self) -> u8 {
            (self.cursor_position / VGA_WIDTH) as u8
        }

        // Move every row one line up and blank the last row.
        pub fn scroll(&mut self) {
            for y in 0..(VGA_HEIGHT - 1) {
                for x in 0..VGA_WIDTH {
                    self.buffer.chars[y * VGA_WIDTH + x] =
                        self.buffer.chars[(y + 1) * VGA_WIDTH + x]
                }
            }
            for x in 0..VGA_WIDTH {
                let color_code = self.vga_color;
                self.buffer.chars[(VGA_HEIGHT - 1) * VGA_WIDTH + x] = ScreenChar {
                    ascii_character: b' ',
                    color_code
                }
            }
        }

        pub fn ln(&mut self) {
            let next_line = self.get_y() as usize + 1;
            if next_line >= VGA_HEIGHT {
                self.scroll();
                self.set_cursor(0, VGA_HEIGHT - 1);
            } else {
                self.set_cursor(0, next_line)
            }
        }

        pub fn write_byte_at_xy(&mut self, byte: u8, color: ColorCode, x: usize, y: usize) {
            self.buffer.chars[y * VGA_WIDTH + x] = ScreenChar {
                ascii_character: byte,
                color_code: color
            }
        }

        pub fn write_byte_at_pos(&mut self, byte: u8, color: ColorCode, position: usize) {
            self.buffer.chars[position] = ScreenChar {
                ascii_character: byte,
                color_code: color
            }
        }

        pub fn write_byte(&mut self, byte: u8) {
            if self.cursor_position >= VGA_WIDTH * VGA_HEIGHT {
                self.scroll();
                self.set_cursor(0, VGA_HEIGHT - 1);
            }
            self.write_byte_at_pos(byte, self.vga_color, self.cursor_position);
            self.move_cursor(1);
        }

        pub fn write_string(&mut self, s: &str) {
            for byte in s.bytes() {
                match byte {
                    0x20..=0xFF => self.write_byte(byte),
                    b'\n' => self.ln(),
                    _ => self.write_byte(0xfe),
                }
            }
        }
    }

Scrolling simply copies each screen-width chunk of memory one row back and fills the new line with spaces (this is how I clear it). The outb call is a bit more interesting: there is no way to move the cursor other than through the I/O ports. We will need port I/O elsewhere too, so it was moved to a separate module and wrapped in safe wrappers. The assembly code is under the spoiler below. For now, it is enough to know that:

The cursor is set by its absolute offset, not by coordinates.
The controller accepts one byte at a time.
Writing one byte takes two instructions: first we write a command to the controller, then the data.
The command port is 0x3D4, the data port is 0x3D5.
First we send the low byte of the position with command 0x0F, then the high byte with command 0x0E.
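The rules above boil down to four port writes per cursor move. The sketch below only computes that sequence as (port, value) pairs so that it can be checked on an ordinary machine without touching hardware; the function cursor_port_writes and the constant names are hypothetical and do not come from the kernel.

```rust
const CRTC_COMMAND_PORT: u16 = 0x3D4; // selects the controller register
const CRTC_DATA_PORT: u16 = 0x3D5;    // receives the value for that register
const CURSOR_LOW_BYTE: u8 = 0x0F;     // command: low byte of the cursor position
const CURSOR_HIGH_BYTE: u8 = 0x0E;    // command: high byte of the cursor position

/// Returns the (port, value) pairs that move the hardware cursor
/// to the given absolute position, in the order they must be written.
fn cursor_port_writes(position: u16) -> [(u16, u8); 4] {
    [
        (CRTC_COMMAND_PORT, CURSOR_LOW_BYTE),
        (CRTC_DATA_PORT, (position & 0xFF) as u8),
        (CRTC_COMMAND_PORT, CURSOR_HIGH_BYTE),
        (CRTC_DATA_PORT, (position >> 8) as u8),
    ]
}
```

For column 5 of row 10 the absolute position is 10 * 80 + 5 = 805 = 0x0325, so the data bytes are 0x25 and then 0x03.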

out.asm

Note how the passed arguments are accessed on the stack. The stack grows downward from the end of its region, and the stack pointer is decremented when a function is called, so to reach the parameters you add to the stack pointer (here, to its copy in EBP) the size of everything that lies in between: the saved EBP, the return address, and any earlier arguments, each rounded up to the stack alignment, which in our case is 4 bytes.

    global writeb
    global writew
    global writed

    section .text

    writeb:
        push ebp
        mov ebp, esp
        mov edx, [ebp + 8]      ; port: 8 = 4 (saved ebp) + 4 (return address)
        mov eax, [ebp + 8 + 4]  ; value: the extra 4 skips the port (2 bytes, but it occupies a 4-byte stack slot)
        out dx, al              ; write the byte in al to the port number in dx
        mov esp, ebp
        pop ebp
        ret

    writew:
        push ebp
        mov ebp, esp
        mov edx, [ebp + 8]      ; port: 8 = 4 (saved ebp) + 4 (return address)
        mov eax, [ebp + 8 + 4]  ; value: the extra 4 skips the port (2 bytes, but it occupies a 4-byte stack slot)
        out dx, ax              ; write the word in ax to the port number in dx
        mov esp, ebp
        pop ebp
        ret

    writed:
        push ebp
        mov ebp, esp
        mov edx, [ebp + 8]      ; port: 8 = 4 (saved ebp) + 4 (return address)
        mov eax, [ebp + 8 + 4]  ; value: the extra 4 skips the port (2 bytes, but it occupies a 4-byte stack slot)
        out dx, eax             ; write the double word in eax to the port number in dx
        mov esp, ebp
        pop ebp
        ret

Setting up segments

We have reached the most confusing and, at the same time, the simplest topic. As I said in the previous article, paging and segmentation got mixed up in my head; I loaded the address of the page table into GDTR and clutched my head. It took me a few months to read enough material, digest it, and be able to implement it. Perhaps I fell victim to Peter Abel's textbook “Assembler. Language and programming for the IBM PC” (a great book!), which describes segmentation for the Intel 8086. In those pleasant times we loaded the upper 16 bits of a twenty-bit address into the segment register, and that was the address in memory. It came as a cruel disappointment that, starting with the i286 in protected mode, everything works completely differently.

So, the bare theory says that x86 supports a segmented memory model because this was the only way older programs could break past the limit of 640 KB, and later 1 MB, of memory.

Programmers had to think about how to place executable code, how to place data, and how to keep both safe. The arrival of paging made segmentation unnecessary, but it remained for compatibility and protection (separating privileges into kernel space and user space), so there is simply no way around it. Some processor instructions are prohibited at any privilege level other than 0, and access between program and kernel segments causes a general protection fault.

Once again (hopefully for the last time) about address translation:
Logical address [0x08:0xFFFFFFFF] -> segment permission check for 0x08 -> linear (virtual) address [0xFFFFFFFF] -> page table + TLB -> physical address [0xAAAAFFFF]
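The first two stages of this chain can be modeled in a few lines of host-side Rust. This is a sketch under assumptions: the segment is flat (base 0), permission and limit checks are left out, and paging is classic 32-bit two-level paging with 4 KiB pages, which may differ from the page size the kernel actually uses. The names Segment, to_linear, and split_linear are mine.

```rust
/// A segment reduced to the only field this sketch needs.
struct Segment {
    base: u32,
}

/// Logical (segment:offset) to linear address.
/// Privilege and limit checks are deliberately omitted.
fn to_linear(segment: &Segment, offset: u32) -> u32 {
    segment.base.wrapping_add(offset)
}

/// Splits a linear address into (page directory index, page table index,
/// offset inside the page), assuming 4 KiB pages without PAE.
fn split_linear(linear: u32) -> (usize, usize, u32) {
    (
        (linear >> 22) as usize,           // top 10 bits
        ((linear >> 12) & 0x3FF) as usize, // middle 10 bits
        linear & 0xFFF,                    // low 12 bits
    )
}
```

For the VGA buffer at 0xC00B8000 this yields page directory entry 768, page table entry 0xB8, and offset 0; the page tables then supply the physical frame.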

The segment selector is used only inside the processor; it is stored in a dedicated segment register (CS, SS, DS, ES, FS, GS) and is used exclusively for checking permissions to execute code and transfer control. That is why you cannot simply call a kernel function from user space. The segment with selector 0x18 (that is my layout, yours may differ) has ring 3 privileges, and the segment with selector 0x08 has ring 0 privileges. By x86 convention, to protect against unauthorized access, a segment with lower privileges cannot directly call a segment with higher privileges via jmp 0x08:[EAX] and must use other mechanisms such as traps, gates, and interrupts.

Segments and their types (code, data, traps, gates) must be described in the global descriptor table (GDT), whose virtual address and size are loaded into the GDTR register. To switch between segments (for simplicity, I assume a direct jump is possible), you execute the jmp 0x08:[EAX] instruction, where 0x08 is the offset in bytes of the first valid descriptor from the beginning of the table, and EAX is the register containing the target address. The offset (selector) is loaded into the CS register, and the corresponding descriptor is loaded into the processor's shadow register. Each descriptor is an 8-byte structure. It is well documented, and its description can be found both on OSDev and in the Intel documentation (see the first article).
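A selector is slightly more than a plain byte offset: its three lowest bits carry the table indicator and the requested privilege level, which is why valid offsets are always multiples of 8. The host-side sketch below decodes these fields according to the Intel format; the struct Selector and both functions are illustrative names of mine.

```rust
/// Decoded fields of a 16-bit segment selector.
#[derive(Debug, PartialEq, Eq)]
struct Selector {
    /// Index of the descriptor in the table (bits 3..15).
    index: u16,
    /// false = GDT, true = LDT (bit 2).
    local_table: bool,
    /// Requested privilege level (bits 0..1).
    rpl: u8,
}

fn decode_selector(raw: u16) -> Selector {
    Selector {
        index: raw >> 3,
        local_table: (raw & 0b100) != 0,
        rpl: (raw & 0b11) as u8,
    }
}

/// Byte offset of the descriptor from the start of the table.
fn descriptor_offset(raw: u16) -> u16 {
    raw & !0b111
}
```

So 0x08 is descriptor 1 at ring 0, 0x10 is descriptor 2, and a user-mode selector such as 0x1B still points at byte offset 0x18 while requesting ring 3.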

To summarize: once we initialize the GDT and execute the jmp 0x08:[EAX] jump, the processor state is as follows:

GDTR contains the virtual address of the GDT.
CS contains the value 0x08.
The descriptor at address [GDTR + 0x08] has been copied from memory into the CS shadow register.
EIP contains the address from the EAX register.

The null descriptor must always be left empty, and accessing it is forbidden. I will dwell on the TSS descriptor and its purpose in more detail when we discuss multithreading. For now, my GDT looks like this:

    extern {
        fn load_gdt(base: *const GdtEntry, limit: u16);
    }

    pub unsafe fn setup_gdt() {
        GDT[5].set_offset((&super::tss::TSS) as *const _ as u32);
        GDT[5].set_limit(core::mem::size_of::<super::tss::Tss>() as u32);

        let gdt_ptr: *const GdtEntry = GDT.as_ptr();
        let limit = (GDT.len() * core::mem::size_of::<GdtEntry>() - 1) as u16;
        load_gdt(gdt_ptr, limit);
    }

    static mut GDT: [GdtEntry; 7] = [
        //null descriptor - cannot access
        GdtEntry::new(0, 0, 0, 0),
        //kernel code
        GdtEntry::new(0, 0xFFFFFFFF,
            GDT_A_PRESENT | GDT_A_RING_0 | GDT_A_SYSTEM | GDT_A_EXECUTABLE | GDT_A_PRIVILEGE,
            GDT_F_PAGE_SIZE | GDT_F_PROTECTED_MODE),
        //kernel data
        GdtEntry::new(0, 0xFFFFFFFF,
            GDT_A_PRESENT | GDT_A_RING_0 | GDT_A_SYSTEM | GDT_A_PRIVILEGE,
            GDT_F_PAGE_SIZE | GDT_F_PROTECTED_MODE),
        //user code
        GdtEntry::new(0, 0xFFFFFFFF,
            GDT_A_PRESENT | GDT_A_RING_3 | GDT_A_SYSTEM | GDT_A_EXECUTABLE | GDT_A_PRIVILEGE,
            GDT_F_PAGE_SIZE | GDT_F_PROTECTED_MODE),
        //user data
        GdtEntry::new(0, 0xFFFFFFFF,
            GDT_A_PRESENT | GDT_A_RING_3 | GDT_A_SYSTEM | GDT_A_PRIVILEGE,
            GDT_F_PAGE_SIZE | GDT_F_PROTECTED_MODE),
        //TSS - for interrupt handling in multithreading
        GdtEntry::new(0, 0, GDT_A_PRESENT | GDT_A_RING_3 | GDT_A_TSS_AVAIL, 0),
        GdtEntry::new(0, 0, 0, 0),
    ];
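The GdtEntry type itself is not shown in the article, so the following is only a sketch of how four arguments like these could be packed into the 8-byte descriptor format documented by Intel. The function encode_descriptor is hypothetical, and it assumes that the limit is truncated to the 20 bits the hardware provides; 0x9A and 0xC are the conventional access byte and flags for a flat ring 0 code segment and may not match the kernel's GDT_A_* and GDT_F_* constants bit for bit.

```rust
/// Packs base, limit, access byte, and flags into one 64-bit descriptor.
/// Only the low 20 bits of `limit` and the low 4 bits of `flags` are used.
fn encode_descriptor(base: u32, limit: u32, access: u8, flags: u8) -> u64 {
    let limit = (limit & 0x000F_FFFF) as u64;
    let base = base as u64;
    (limit & 0xFFFF)                        // limit bits 0..15
        | ((base & 0x00FF_FFFF) << 16)      // base bits 0..23
        | ((access as u64) << 40)           // access byte
        | ((limit >> 16) << 48)             // limit bits 16..19
        | (((flags & 0x0F) as u64) << 52)   // granularity and size flags
        | ((base >> 24) << 56)              // base bits 24..31
}

/// Value for the GDTR limit field: size of the table in bytes minus one.
fn gdt_limit(entries: usize) -> u16 {
    (entries * 8 - 1) as u16
}
```

A flat code segment encodes to the familiar 0x00CF9A000000FFFF, and a table of 7 entries gives a limit of 55, the same value setup_gdt computes from GDT.len().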

And this is what the initialization I talked so much about above looks like. The address and size of the GDT are loaded through a separate structure that contains only two fields. It is the address of this structure that is passed to the lgdt instruction. Into the data segment registers we load the next descriptor, at offset 0x10.

    global load_gdt

    section .text

    gdtr dw 0 ; For limit storage
         dd 0 ; For base storage

    load_gdt:
        mov eax, [esp + 4]
        mov [gdtr + 2], eax
        mov ax, [esp + 8]
        mov [gdtr], ax
        lgdt [gdtr]
        jmp 0x08:.reload_CS
    .reload_CS:
        mov ax, 0x10 ; 0x10 points at the new data selector
        mov ds, ax
        mov es, ax
        mov fs, ax
        mov gs, ax
        mov ss, ax
        mov ax, 0x28
        ltr ax
        ret

From here on everything gets a little easier, but no less interesting.

Interrupts

It is finally time to give ourselves a way to interact with our kernel (at the very least, to see what we press on the keyboard). To do this, we need to initialize the interrupt controller.

A lyrical digression about code style.

Thanks to the efforts of the community, and Philipp Oppermann specifically, the x86-interrupt calling convention was added to Rust, which allows writing interrupt handlers that return with iret. However, I deliberately decided not to go this way, because I chose to keep assembly and Rust in separate files, and therefore in separate functions. Yes, I use stack memory wastefully, and I am aware of that, but it is a matter of taste. My interrupt handlers are written in assembly and do exactly one thing: call the almost identically named interrupt handlers written in Rust. Please accept this fact and be lenient.

In general, initializing interrupts is similar to initializing the GDT but easier to understand. On the other hand, it requires a lot of repetitive code. The developers of Redox OS have a beautiful solution that uses all the delights of the language, but I went head-on and decided to allow code duplication.

By x86 convention there are interrupts and there are exceptions. For our purposes, setting them up is practically the same. The only difference is that when an exception fires, the stack may contain additional information. For example, I use it to handle a missing page when working with the heap (but all in good time). Both interrupts and exceptions are dispatched from a single table, which we need to fill in. We also need to program the PIC (Programmable Interrupt Controller). There is also the APIC, but I have not figured it out yet.

I will not comment much on working with the PIC, since there are many examples online. I'll start with the handlers in assembly. They are all alike, so I am putting the code under a spoiler.

IRQ

    global irq0
    global irq1
    ......
    global irq14
    global irq15

    extern kirq0
    extern kirq1
    ......
    extern kirq14
    extern kirq15

    section .text

    irq0:
        pusha
        call kirq0
        popa
        iret

    irq1:
        pusha
        call kirq1
        popa
        iret

    ......

    irq14:
        pusha
        call kirq14
        popa
        iret

    irq15:
        pusha
        call kirq15
        popa
        iret

As you can see, all calls to Rust functions begin with the prefix “k”, for distinction and convenience. Exception handling is completely analogous. For the assembly functions the prefix “e” is used, and for Rust, “k”. The Page Fault handler is different, but it is covered in the memory-management notes.

Exceptions

    global e0_zero_divide
    global e1_debug
    ......
    global eE_page_fault
    ......
    global e14_virtualization
    global e1E_security

    extern k0_zero_divide
    extern k1_debug
    ......
    extern kE_page_fault
    ......
    extern k14_virtualization
    extern k1E_security

    section .text

    e0_zero_divide:
        pushad
        call k0_zero_divide
        popad
        iret

    e1_debug:
        pushad
        call k1_debug
        popad
        iret

    ......

    eE_page_fault:
        pushad
        mov eax, [esp + 32]
        push eax
        mov eax, cr2
        push eax
        call kE_page_fault
        pop eax
        pop eax
        popad
        add esp, 4
        iret

    ......

    e14_virtualization:
        pushad
        call k14_virtualization
        popad
        iret

    e1E_security:
        pushad
        call k1E_security
        popad
        iret
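The value that eE_page_fault reads from [esp + 32] is the error code the CPU pushes for a page fault, and it is a bit field. The sketch below decodes its three lowest bits as documented by Intel and also lists which exception vectors push an error code at all. The type and function names are mine; the real signature of kE_page_fault is not shown in the article, although the push order in the assembly suggests that it receives the CR2 address first and the error code second.

```rust
/// The three lowest bits of a page-fault error code.
#[derive(Debug, PartialEq, Eq)]
struct PageFaultInfo {
    /// true: protection violation on a present page; false: page not present.
    protection_violation: bool,
    /// true: the access was a write; false: a read.
    write: bool,
    /// true: the access came from user mode (ring 3).
    user_mode: bool,
}

fn decode_page_fault(error_code: u32) -> PageFaultInfo {
    PageFaultInfo {
        protection_violation: (error_code & 0b001) != 0,
        write: (error_code & 0b010) != 0,
        user_mode: (error_code & 0b100) != 0,
    }
}

/// Exception vectors for which the CPU pushes an error code onto the stack.
fn pushes_error_code(vector: u8) -> bool {
    matches!(vector, 8 | 10..=14 | 17 | 21 | 29 | 30)
}
```

Handlers for vectors that push an error code have to drop it before iret, which is what the add esp, 4 in the page-fault handler does.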

We declare the assembly handlers:

    extern {
        fn load_idt(base: *const IdtEntry, limit: u16);

        fn e0_zero_divide();
        fn e1_debug();
        ......
        fn e14_virtualization();
        fn e1E_security();

        fn irq0();
        fn irq1();
        ......
        fn irq14();
        fn irq15();
    }

We define the Rust handlers that are called above. Note that for the keyboard interrupt I simply print the code I read from port 0x60; this is how the keyboard works in its simplest mode. In the future I hope this will grow into a full-fledged driver. After each interrupt you must send the end-of-interrupt signal 0x20 to the controller. This is important! Otherwise you will not receive any more interrupts.

    #[no_mangle]
    pub unsafe extern fn kirq0() {
        // println!("IRQ 0");
        outb(0x20, 0x20);
    }

    #[no_mangle]
    pub unsafe extern fn kirq1() {
        let ch: char = inb(0x60) as char;
        crate::arch::vga::VGA_WRITER.force_unlock();
        println!("IRQ 1 {}", ch);
        outb(0x20, 0x20);
    }

    #[no_mangle]
    pub unsafe extern fn kirq2() {
        println!("IRQ 2");
        outb(0x20, 0x20);
    }
    ...
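The byte read from port 0x60 is a scan code rather than a character, so casting it to char prints whatever symbol happens to share that number. As a first step toward a driver, a lookup like the one below could translate it. This is a sketch that assumes scan code set 1 (the default set of a PC keyboard controller in translation mode) and a US layout, and it covers only letters, digits, Enter, and space; scancode_to_char is a hypothetical name.

```rust
/// Translates a set 1 "make" scan code to a character.
/// Returns None for key releases (bit 7 set) and for unmapped keys.
fn scancode_to_char(code: u8) -> Option<char> {
    if (code & 0x80) != 0 {
        return None; // key release
    }
    const DIGITS: &[u8] = b"1234567890"; // codes 0x02..=0x0B
    const ROW_Q: &[u8] = b"qwertyuiop";  // codes 0x10..=0x19
    const ROW_A: &[u8] = b"asdfghjkl";   // codes 0x1E..=0x26
    const ROW_Z: &[u8] = b"zxcvbnm";     // codes 0x2C..=0x32
    let byte = match code {
        0x02..=0x0B => DIGITS[(code - 0x02) as usize],
        0x10..=0x19 => ROW_Q[(code - 0x10) as usize],
        0x1E..=0x26 => ROW_A[(code - 0x1E) as usize],
        0x2C..=0x32 => ROW_Z[(code - 0x2C) as usize],
        0x1C => b'\n', // Enter
        0x39 => b' ',  // Space
        _ => return None,
    };
    Some(byte as char)
}
```

Inside the keyboard handler this would replace the direct cast, printing a character only when the function returns Some.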

Now we initialize the IDT and the PIC. For the PIC and its remapping I found a large number of tutorials of varying depth, from OSDev to amateur sites. Since the programming procedure is a fixed sequence of operations with fixed commands, I give this code without further explanation. Note only that the hardware interrupt handlers occupy the index range 0x20-0x2F in the table, and that the arguments 0x20 and 0x28 passed to the function cover exactly those 16 interrupts in the IDT.

    unsafe fn setup_pic(pic1: u8, pic2: u8) {
        // Start initialization
        outb(PIC1, 0x11);
        outb(PIC2, 0x11);

        // Set offsets
        outb(PIC1 + 1, pic1); /* remap */
        outb(PIC2 + 1, pic2); /* pics */

        // Set up cascade
        outb(PIC1 + 1, 4); /* IRQ2 -> connection to slave */
        outb(PIC2 + 1, 2);

        // Set up interrupt mode (1 is 8086/88 mode, 2 is auto EOI)
        outb(PIC1 + 1, 1);
        outb(PIC2 + 1, 1);

        // Unmask interrupts
        outb(PIC1 + 1, 0);
        outb(PIC2 + 1, 0);

        // Ack waiting
        outb(PIC1, 0x20);
        outb(PIC2, 0x20);
    }

    pub unsafe fn init_idt() {
        IDT[0x0].set_func(e0_zero_divide);
        IDT[0x1].set_func(e1_debug);
        ......
        IDT[0x14].set_func(e14_virtualization);
        IDT[0x1E].set_func(e1E_security);

        IDT[0x20].set_func(irq0);
        IDT[0x21].set_func(irq1);
        ......
        IDT[0x2E].set_func(irq14);
        IDT[0x2F].set_func(irq15);

        setup_pic(0x20, 0x28);

        let idt_ptr: *const IdtEntry = IDT.as_ptr();
        let limit = (IDT.len() * core::mem::size_of::<IdtEntry>() - 1) as u16;
        load_idt(idt_ptr, limit);
    }
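The effect of the two offsets can be expressed as a small mapping from IRQ number to IDT vector. The sketch below also lists the end-of-interrupt writes for a given IRQ, because interrupts 8 to 15 arrive through the second controller, which conventionally expects its own EOI in addition to the first one. The command port numbers 0x20 and 0xA0 are the standard PC values and are my assumption for the kernel's PIC1 and PIC2 constants, whose definitions are not shown; the function names are hypothetical.

```rust
const PIC1_COMMAND: u16 = 0x20; // assumed value of PIC1
const PIC2_COMMAND: u16 = 0xA0; // assumed value of PIC2
const PIC_EOI: u8 = 0x20;       // end-of-interrupt command

/// IDT vector for a hardware IRQ after remapping, or None for an invalid
/// IRQ number or an offset that does not fit into a byte.
fn irq_to_vector(irq: u8, pic1_offset: u8, pic2_offset: u8) -> Option<u8> {
    match irq {
        0..=7 => pic1_offset.checked_add(irq),
        8..=15 => pic2_offset.checked_add(irq - 8),
        _ => None,
    }
}

/// (port, value) writes that acknowledge the given IRQ.
fn eoi_writes(irq: u8) -> Vec<(u16, u8)> {
    let mut writes = Vec::new();
    if irq >= 8 {
        writes.push((PIC2_COMMAND, PIC_EOI));
    }
    writes.push((PIC1_COMMAND, PIC_EOI));
    writes
}
```

With offsets 0x20 and 0x28 the keyboard (IRQ 1) lands on vector 0x21 and IRQ 15 on vector 0x2F, which matches the indices filled in by init_idt.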

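IDTR is loaded the same way as GDTR: through a separate structure. The STI instruction enables interrupts, and after that key presses show up on the screen as the ASCII characters that correspond to their scan codes.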

    global load_idt

    section .text

    idtr dw 0 ; For limit storage
         dd 0 ; For base storage

    load_idt:
        mov eax, [esp + 4]
        mov [idtr + 2], eax
        mov ax, [esp + 8]
        mov [idtr], ax
        lidt [idtr]
        sti
        ret
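The table that lidt points at consists of 8-byte gate descriptors. IdtEntry and its set_func method are not shown in the article, so here is only a sketch of the packing that such a method would have to perform, following the 32-bit interrupt gate format from the Intel documentation. The function encode_gate and the constant names are hypothetical; 0x08 is the kernel code selector from the GDT section, and 0x8E is the conventional type byte for a present ring 0 32-bit interrupt gate.

```rust
const KERNEL_CODE_SELECTOR: u16 = 0x08;
const INTERRUPT_GATE_32: u8 = 0x8E; // present, ring 0, 32-bit interrupt gate

/// Packs a handler address, code selector, and type byte into one
/// 64-bit IDT gate descriptor.
fn encode_gate(handler: u32, selector: u16, type_attr: u8) -> u64 {
    let handler = handler as u64;
    (handler & 0xFFFF)                // handler address bits 0..15
        | ((selector as u64) << 16)   // code segment selector
        // bits 32..39 are reserved and stay zero
        | ((type_attr as u64) << 40)  // gate type, DPL, present bit
        | ((handler >> 16) << 48)     // handler address bits 16..31
}
```

The handler address is split across the first and last 16 bits of the entry, which is why the descriptor cannot simply store a function pointer as is.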

Afterword

Of the calls in arch_init, only setup_pd remains uncovered.

The source code is available on GitLab.

Thanks for your attention!

UPD: 3

Source: https://habr.com/ru/post/445584/
