Operating systems from scratch; level 3 (upper half)

In this part we will add interrupt handling and take on the scheduler. Finally, we will have elements of a multitasking operating system! Of course this is only the beginning of the topic. One timer interrupt, one system call, the basic part of a simple thread scheduler. Nothing complicated. However, with this we will prepare a springboard for creating a full-fledged system that will deal with the most genuine processes without any “but”. Just like in these your linups and others. Until the end of this course, a little less than half is left

Null lab

First Laba: the younger half and the older half

The second laba: the younger half and the older half

Third Laba: the younger half

Subphase E: Returns from Exceptions

In this subphase, we will write code to return from the exception handler of any types, shapes and colors. The main work will be carried out in the file kernel/ext/init.S and the folder kernel/src/traps .

Overview

If you try to delete an infinite loop from handle_exception , then most likely Raspberry Pi will enter the exception loop. Those. improperly processed exceptions will occur again and again, and in some cases our debug shell will crash. This is all due to the fact that when the exception handler tries to return to the point where the code was executed, the state of the processor (especially the data in the registers) has changed without taking into account what happened in this code itself.

For example, consider the following code:

 1: mov x3, #127 2: mov x4, #127 3: brk 10 4: cmp x3, x4 5: beq safety 6: b oh_no

When a brk exception occurs, our exception vector is called, which ultimately causes a handle_exception . This same handle_exception function, which is compiled by Rust, will among other things use the x3 and x4 registers for its dirty deeds. When our exception handler returns to the place where the call is brk , the state x3 and x4 is not at all what we expect it to be. Accordingly, the correct condition is not guaranteed for the beq instruction in line 5. Maybe the code will jump to safety , and maybe not.

As a result, in order for our exception handler to use the entire state of the process at its discretion, we need to make sure that we save the entire processing context (registers, etc.) before this same processor starts its work. After the handler completes its sacred mission, we will need to restore the previously saved context. All in the name of the fact that the external code worked flawlessly. The process of saving / restoring context itself is called a context switch.

Why a context switch ?

It seems that the word switching here is not very appropriate. We kind of just go back to the same context, right?

In some cases it is. However, in reality we rarely want to return to the same execution context. More often we want to change this very context in order for the processor to do all sorts of different useful things. For example, when we need to implement switching between different processes , we will replace one context with another. In this way we will achieve multitasking. When we implement system calls, we will need to change the value of the registers in order to implement the returned values. Even in the case of breakpoints, we need to change the ELR register in order for the next command to be executed (otherwise the brk handler will be called again and again).

In this subphase we will be engaged in the preservation / restoration of the context. The structure that will contain our saved context will be called the trap frame. The undocumented TrapFrame structure can be found in the kernel/src/traps/trap_frame.rs . We will use this structure to access the saved registers from Rust. On the other hand, we will fill this structure in assembly code. It remains only to pass a pointer to this structure via the tf parameter to the handle_exception function.

Trap frame

')
Trap Frame is the name we give to a structure that contains the entire processor context. The name "trap frame" comes from the term "trap" (trap), which is a general term for describing the mechanism by which a processor triggers a higher level of privilege when a certain event occurs. I do not know about the good fit term for this all in Russian. I think in this case it will be more convenient to use only the English term.

There are various ways to create a trap frame, but their essence is the same. We need to save all the state, which is necessary for execution, into RAM. Most implementations put the entire state on the stack. After we fill the stack with the contents of the registers, the pointer to the top of the stack will become our pointer to the trap . It is this variation that we will use in the future.

At this point, we need to save the following portions of the state of the Cortex-A53 core:

x0 ... x30 - i.e. all 64-bit registers, of which as many as 31 pieces.
q0 ... q31 - all 128-bit registers SIMD / FP.
pc - program counter.
The register is responsible for this ELR_ELx . It may or may not be a PC. One way or another, this is the address where we should return after executing the exception handler. Usually ELR_ELx contains either PC directly or PC + 4 , i.e. address of the next command.
PSTATE - processor status flags.
Recall that the state of the probe is passed to us through the SPSR_ELx register at the previous ELx level.
sp is a pointer to the stack boundary.
Its contents can be accessed through SP_ELs for the exception level s .
TPIDR - 64-bit value of the current "process ID".
The value can be obtained from TPIDR_ELs for the exception level s .

This is all we need to save in our trap frame. We will save on the stack before calling the exception handler. After the handler returns control to the assembler code, we need to return this state as it was there. After we put everything we need on the stack, its contents should look something like this:

trap frame

Note the SP and TPIDR in this structure. They should be exactly the stack pointers and the source thread ID, and not part of the interrupt state. Since the only possible source we will have is EL0 , they can be obtained through reading SP_EL0 and TPIDR_EL0 . In this case, the current SP (which is used by the exception vector) will indicate the beginning of the trap frame. Immediately after we put the necessary values on this very stack, of course.

After we fill the stack with the required values, we will pass a pointer to the top of the stack as the third argument to handle_exception . Type this argument: &mut TrapFrame . As already mentioned, this very TrapFrame can be found in the file kernel/src/traps/trap_frame.rs . You need to add this structure.

What is the thread ID?

The TPIDR register (which is TPIDR_ELx ) allows the OS to store some information about what is being executed. Later we will implement the processes and we will store the process identifier in this register. Right now we will just save and restore this register.

Preferred return address from exclusion

When an exception occurs at the ELx level requiring processing, the CPU stores the preferred return address in ELR_ELx . Details can be found in the documentation ( ref : D1.10.1). Here's what from there:

For asynchronous exceptions, this is the address of the first command that has not yet been executed, or is not fully executed at the moment our exception occurred.
For synchronous exceptions (except system calls), this is the address of the instruction that generates this exception.
For instructions that generate exceptions, this is the address of the instruction that follows the instruction that generates the exception.

The brk instruction belongs to the second category. Thus, if we want to continue execution after the brk command, we need to make sure that the address of the next instruction is in ELR_ELx . Since all instructions in AArch64 are 32 bits in size, it will be enough for us to overwrite this value with ELR_ELx + 4 .

Implementation

Start by implementing context_save and context_restore from the os/kernel/ext/init.S . The context_save subroutine should put all the necessary registers on the stack, and then call handle_exception , passing all the necessary arguments to this function, including the trap frame as the third argument. After you context_restore this out, context_restore subroutine. This routine should restore the context back.

Pay attention to the instructions that are created by the macro HANDLER . There are already saving and restoring x0 and x30 . You should not touch these registers when saving / restoring in context_{save, restore} procedures. However, these registers must lie in the trap frame.

In order to minimize the loss of performance when switching context, you should put on the stack and remove the values from the stack like this:

 //      `x1`, `x5`, `x12`  `x13` sub SP, SP, #32 stp x1, x5, [SP] stp x12, x13, [SP, #16] //      `x1`, `x5`, `x12`  `x13` ldp x1, x5, [SP] ldp x12, x13, [SP, #16] add SP, SP, #32

Make sure the SP always 16 byte aligned. You will find that with this approach, reserved will be created in our trap frame. This most reserved should be filled with zeros.

Once you are done with these two routines, TrapFrame into the TrapFrame structure of kernel/src/traps/trap_frame.rs . Make sure that the order and size of the fields exactly match what you save in context_save and pass in as the tf parameter.

Finally, add an ELR increase of 4 to handle_exception before returning from the brk exception handler. Once you successfully implement context switching, your kernel should work fine after exiting the debug shell. When everything is ready - go to the next step.

The contents of your trap frame do not have to correspond exactly to the diagram, but must contain all the same data.

And don't forget that the qn registers are 128 bits!

Tips:

In order to call handle_exception you will need to save / restore registers that are not part of the trap frame.

Rust has types u128 and i128 for values of 128 bits.

Use the mrs and msr to read / write special registers.

Our context_save version takes about 45 instructions.

Our context_restore version takes about 41 instructions.

And our TrapFrame consists of 68 fields with a total size of 800 bytes.

How can I lazily handle registers for floating-point numbers? [lazy-float]

Saving and restoring all 128-bit SIMD / FP registers is quite expensive. They occupy as many as 512 bytes out of 800 in the TrapFrame structure! It would be ideal to process these registers only if they were actually used by the source of the exception or the purpose of the context switch.

The AArch64 architecture allows us to selectively enable / disable the use of these registers. How could we use this opportunity to lazily load these registers only when they are actually used? But at the same time be able to use these registers freely in your code. What code do you write for the exception handler? Do I need to either modify the structure of the TrapFrame in order to add any additional state and how it is add. condition should be maintained?

Phase 2: This is the process

In this part we will move on to the most delicious. We will implement user processes. Let's start with the implementation of the Process structure, which will work with the state of our process. Then we run the first process. After that, we implement a round-robin process scheduler. To do this, we will need to implement an interrupt controller driver and enable the timer interrupt. Next, we will run our scheduler when a timer interrupt occurs and deal with the context switch in order to make the transition to the next process. Finally, we implement the first system call: sleep .

At the end of this phase, we will already have a minimal, but quite fully-fledged multi-tasking operating system. Currently, processes will share physical memory with the kernel and other processes. However, in the next phase we will deal with this misunderstanding and implement virtual memory. All in order to isolate processes from each other and protect the core memory from playful writers of user-space programs.

Subphase A: Process

In this subphase we will implement everything necessary for the functioning of the Process type from the file kernel/src/process/process.rs . All this code is useful to us in the next subphase.

What is the process?

A process is a container for code and data that is executed, managed, and protected by the kernel. In fact, this is the only part of the code that applies to everything outside the kernel. Either the code is executed as part of a process, or the code is executed as part of the kernel. There are quite a few different operating system architectures (especially when it comes to purely research pieces), but almost all of them have a concept that can be considered user processes.

In most cases, processes are performed with a limited set of privileges (in our case it is EL0 ). All in the name of the fact that the kernel could provide the necessary level of stability and security of the entire system as a whole. If one of the processes breaks down, then we do not want the same fate to befall the rest of the processes. Moreover, we do not want the result of this to be a complete collapse of the entire system. In addition, we do not want processes to interfere with each other. If one process is frozen, then we want the remaining processes to still run. Thus processes imply isolation. They work to some extent independently of each other. Probably you see all these properties every day: when your browser hangs, the rest of the work continues or freezes too?

In any case, the implementation of processes consists in creating structures and algorithms for protecting, isolating, executing and managing unreliable code and data.

What is inside the process?

To implement the processes, we need to track the code and process data and all kinds of supporting information. All in order so that we can easily and freely manage the state of processes and isolate processes from each other. This all means that we need to track:

Stack
Each process requires its own unique stack. When implementing processes, we need to allocate a section of memory that is suitable for use as a process stack. And, of course, we will need to change the process stack pointer so that it points to this memory area.
Heap
In order to work with dynamic memory, each process will need to allocate its own heap. At the very beginning, the heap will be completely empty, but it can be expanded with special system calls. For the time being we will leave this topic and return to it in the future.
Code
The process is practically useless if it does not execute any code. Therefore, our kernel will need to somehow load the process code into memory and transfer control to this code when necessary.
Virtual address space
Since we do not want to give processes the ability to access the memory of the kernel and the memory of other processes, each process will be limited to its own address space using such a thing as virtual memory.
Scheduler Status
In most cases, we assume that there may be more processes than the processor cores. The kernel can execute only one command flow at a time. Consequently, we need CPU time multiplexing mechanisms (and therefore we will have several instruction flows) for simultaneous execution of processes. The task of the scheduler is to determine which process starts and at what point it will all happen. In order to do this correctly, the planner needs to know whether any process is ready for planning or not. The state of the scheduler that is stored in each process is exactly that.
Execution status
In order to properly multiplex the time of a sample between several processes, we will need to make sure that we save the process execution state when we stop the execution of this process. Well, do not forget about the correct restoration of the state at the moment when we turn this process back on. In fact, we have already done everything necessary to handle this state. For this we needed to create TrapFrame . Each process must properly store this state.

The stack, heap, and code constitute the entire physical state of the process. The rest of the state is necessary to insulate, control and protect the process.

The Process structure from kernel/src/process/process.rs will contain all this information. At the moment (in this phase) all processes will use shared memory and there will be no fields for code, heap or virtual address space. But we will add them later.

Should the process trust the kernel? [kernel-distrust]

On the whole, it is obvious that the kernel must relate to processes with obvious distrust. But should processes trust the kernel? If so, what should processes expect from the kernel?

What can go wrong if the two processes share stacks? [isolated-stacks]

Imagine two concurrent processes that share the same stack. First, what will concurrent stack usage mean? Secondly: why, with a high probability, these two processes will interfere with each other and quickly destroy each other? Third: determine the properties of the processes that, in the event of a split of one stack, will be necessary for the quiet coexistence of the two processes without untimely death. In other words, what rules should two such processes follow in order to use the same stack and not die at the same time?

Implementation

It is time to implement everything you need for Process from the file kernel/src/process/process.rs . , , Stack , kernel/src/process/stack.rs . , , . State , , . kernel/src/process/state.rs . , .

Process::new() . . ! — .

? [stack-drop]

Stack 1MiB . 16 . , , , ?

? [lazy-stacks]

Stack 1MiB . . , , ?

? [stack-size]

. 1MiB. , , ? , , ?

B:

( EL0 ). kernel/src/process/scheduler.rs kernel/src/kmain.rs .

, . :

trap frame trap_frame .
trap frame trap_frame .
, , .

. . , ?

, , . , . . . trap_frame . trap frame? ! 2 trap_frame , .

( ), . . . .

, , . trap frame, context_save , context_restore . 1 . , , .

. , . , . () , . . Rust , .

, : // (threads). , , .

. , . , :

"" trap frame .
context_restore .
EL0 .

, , .

.

, ( , ) , . , , . , , , .

Implementation

kmain.rs SCHEDULER GlobalScheduler , Scheduler . kernel/src/process/scheduler.rs . SCHEDULER .

, , start() GlobalScheduler . — start() . For this we need:

extern - , .
. , . , .
Process trap frame.
trap frame, context_restore . , extern -. . EL0 .
context_restore , eret EL0 .
trap frame . :
- context_restore .
  : . . , , context_restore , , .
- ( sp ) ( _start ). , EL1 . : ldr adr sp . , sp .
- 0 . .
- EL0 eret .

. , tf trap frame, x0 , x1 :

 unsafe { asm!("mov x0, $0 mov x1, x0" :: "r"(tf) :: "volatile"); }

— SCHEDULER.start() kmain . kmain . . , extern - EL0 .

, , . brk extern - :

 extern fn run_shell() { unsafe { asm!("brk 1" :::: "volatile"); } unsafe { asm!("brk 2" :::: "volatile"); } shell::shell("user0> "); unsafe { asm!("brk 3" :::: "volatile"); } loop { shell::shell("user1> "); } }

. LowerAArch64 , . , — .

Tips:

6 .

, T Box<T> &*box .

, unsafe -.

C:

BCM2837. , . , . os/pi/src/interrupt.rs , os/pi/src/timer.rs os/kernel/src/traps .

AArch64 — , . . .

, :

int-chain

. , , , .

?

— , , . . , .

/ . , .

. , . , .

, CPU. , .

, , , . , , , . , , . , , .

(unmasked) , . (masked) . . , , , , . , . .

EL0 , .

IRQ IRQ? [reentrant-irq]

IRQ IRQ . , ? IRQ?

. IRQ (). handle_exception kernel/src/traps/mod.rs , handle_irq kernel/src/traps/irq.rs . , , , , . handle_irq .

Implementation

pi/src/interrupt.rs . 7 BCM2873 . / IRQ, Interrupt . FIQ BasicIRQ .

tick_in() pi/src/timer.rs . 12 BCM2873 . tick_in() .

TICK . GlobalScheduler::start() kernel/src/process/scheduler.rs . TICK .

handle_exception kernel/src/traps/mod.rs , handle_irq kernel/src/traps/irq.rs . handle_irq TICK , , TICK .

, TICK . LowerAArch64 , (kind) Irq . . — .

TICK !

TICK . 2 . , , , . 1 10 . TICK 10 .

D:

round-robin . kernel/src/process/scheduler.rs , kernel/src/process/process.rs kernel/src/traps/irq.rs .

, . -, CPU. . . , . .

. round-robin . . ( TICK ), . , . round-robin .

Ready
, . , .
Running
, .
Waiting
, , . , , . , . Those. .

State kernel/src/process/state.rs . State , . , Waiting , , , .

round-robin . C , - 3 5 .

round-robin

: B , C , D , : A . C , , . A , .
B . .
C , , . . C D . D , .
. A , A .
B .
C . , . . C .

? [wait-queue]

round-robin : , . round-robin ? , ( / ) /?

Scheduler kernel/src/process/scheduler.rs , . Scheduler::add() . . TPIDR .

, Scheduler::switch() . new_state , trap frame , trap frame. , , , .

, , , process.is_ready() , kernel/src/process/process.rs . true , Ready , .

TICK . , . GlobalScheduler add() switch() Scheduler .

? [new-state]

scheduler.switch() , . , , . ?

Implementation

round-robin . :

Process::is_ready() kernel/src/process/process.rs
mem::replace() .
Scheduler kernel/src/process/scheduler.rs .
switch() , , , . . , , wfi (wait for interrupt). , , . aarch64.rs .
** GlobalScheduler::start() .
, . , .
.
SCHEDULER.switch() , .

, GlobalScheduler::start() . . ( extern -) , , . , .

, , TICK . . — .

!

unsafe !

mem::replace() state .

, ? [wfi]

wfi , , . wfi , . , ?
: , .

E: Sleep

sleep . kernel/src/shell.rs kernel/src/traps .

— , . svc #n , Svc(n) , n — , . , brk #n Brk(n) , , svc . — , , .

100 . sleep . . .

, , . , unix- . :

n svc #n .
7 x0 ... x6 .
7 x0 ... x6 .
x7 .
- x7 0 — .
- x7 1 — .
- x7 - — .
.

, 7 , u32 u64 , u64 , , :

 fn syscall_7(a: u32, b: u64) -> Result<(u64, u64), Error> { let error: u64; let result_one: u64; let result_two: u64; unsafe { asm!("mov w0, $3 mov x1, $4 svc 7 mov $0, x0 mov $1, x1 mov $2, x7" : "=r"(result_one), "=r"(result_two), "=r"(error) : "r"(a), "r"(b) : "x0", "x1", "x7") } if error != 0 { Err(Error::from(error)) } else { Ok((result_one, result_two)) } }

. , .

? [syscall-error]

unix- , Linux, ( x0 ) . . . , ? ?

Sleep

sleep 1 . u32 . , . u32 . , . :

 (1) sleep(u32) -> u32

? [sleep-elapsed]

( ) , ? , , ? , ?

Implementation

sleep . handle_exception kernel/src/traps/mod.rs . , handle_syscall kernel/src/traps/syscalls.rs . handle_syscall . sleep . Box<FnMut> , . :

 let boxed_fnmut = Box::new(move |p| { //  `p` });

Rust .

sleep <ms> . ms ( ).

, sleep . , , . . , , . — .

:

sleep .

, .

u32 FromStr .

, . . , . . . , .

Source: https://habr.com/ru/post/354784/

All Articles

Operating systems from scratch; level 3 (upper half)

Subphase E: Returns from Exceptions

Overview

Trap frame

Preferred return address from exclusion

Implementation

Phase 2: This is the process

Subphase A: Process

What is the process?

What is inside the process?

Implementation

B:

Implementation

C:

Implementation

D:

Implementation

E: Sleep

Sleep

Implementation

More articles: