Operating systems from scratch; level 2 (younger half)

KDPV In this part we will write a memory manager to unlock using Vec , String , HashMap and all that. Immediately after that, we will implement the FAT32 file system and connect the driver for EMMC (such a thing for communicating with SD cards). In the end, a couple of new commands will appear in our command shell: cd , pwd , cat , ls .

Null lab

First Laba: the younger half and the older half

Utility

The book is on Rust version 2. In Russian translation is not completed until the end. So just markdown.
Reference to the standard library Rust.
Manual peripherals processor BCM2837.
ATAGS Help
Brief description of FAT structures
An article about FAT32 from the English Wikipedia.

Phase 0: Getting Started

To begin with, we are convinced that from the very beginning nothing has changed in our environment:

Are there Linux, BSD or macOS
It is 64-bit
Do you have a USB port

In addition, this is installed: git , wget , tar , screen , make and all that was in the past series. We are convinced of stability and we go further.

Getting the necessary code

We clone into our daddy named cs140e (or cs140e you called it there) this repository:

 git clone https://web.stanford.edu/class/cs140e/assignments/2-fs/skeleton.git 2-fs

After all this, there should be something like this structure:

 cs140e ├── 0-blinky ├── 1-shell ├── 2-fs └── os

Now you need to merge the os repository, in which there is some of your code, with that, the code that will be used in this part:

 cd os git fetch git checkout 2-fs git merge master

You may have to resolve merge conflicts. You can see something similar to:

 Auto-merging kernel/src/kmain.rs CONFLICT (content): Merge conflict in kernel/src/kmain.rs Automatic merge failed; fix conflicts and then commit the result.

Automatically failed. Will have to hands. In order to resolve this, it will be necessary to manually change kmain.rs . Be sure to save any changes from the previous series. To resolve conflicts, you need to add fixed files with git add and git commit . Any useful information about resolving merge conflicts and generally about git can be found in the githowto.com guide. This tutorial and can pass. Anything will come in handy.

Firmware upgrade

You need to re-download the firmware using the command make fetch from turnip 2-fs . The team will download and unpack everything. It remains for us to put the files from the files/ subdirectory into the root of our MicroSD card. Namely files firmware/bootcode.bin , firmware/config.txt , firmware/fixup.dat and firmware/start.elf . Do not forget that if you use the bootloader from the last part as kernel8.img , then you need to add this line to the config.txt :

 kernel_address=0x4000000

Ttywrite installation

Now the Makefile from os/kernel has an additional install target. It collects the kernel and sends it directly to the ttywrite using ttywrite from the last part. Accordingly, if the kernel8.img on the kernel8.img contains a bootloader written in the previous series, this same bootloader will accept and load our file with the kernel. In order to run this, we need to do a make install in the folder with the kernel code.

At the same time, this target from the Makefile calls ttywrite directly, simply by name. Those. ttywrite must be in one of the places pointed to by the PATH environment variable. In order to ttywrite in one of these places, we can go into 1-shell/ttywrite and perform a cargo install . Our utility will be collected and copied to the right place. If everything is correct, then we will be able to call ttywrite --help and it will display help for us.

By default, make install uses /dev/tty.SLAB_USBtoUART for the device name. Edit the Makefile so that it points to what you need. Those. What is your device for your adapter? There is a variable named PI_TTY . So give it another value.

ALLOCATOR.initialize() causes panic!
Our shell should have been operational. However, if you try to test the make install target, you will find that the shell is not working. Most likely it is a call to ALLOCATOR.initialize() . There is no memory allocator. Only a stub, which simply panics when executed without any warning about it. A little later, we will correct this distressing fact. For now, you can just comment out this line, so that everything more or less, but it worked.

Phase 1: Memory Line

In this phase, we will implement two memory allocators. The simplest bump- blocker and a slightly more complex bin- blocker. This is almost all that we need in order to work memory allocation in the heap. In other words, Vec , Box and String will earn. In order to determine the amount of available memory, we will use ARM tags ( ATAGS ). In addition, we need to implement the insides of the panic_fmt function, which is ultimately called when panic! called panic! .

Subphase A: Panic!

Let's start with the implementation of panic_fmt . We will work with the file kernel/src/lang_items.rs .

Language Items

When the Rust compiler is configured to compile programs for purposes without an operating system (like our Raspberry Pi), the compiler asks us for a couple of functions. We need to write them with our own hands. Such things are called language items. These elements are functions whose calls the compiler substitutes under certain conditions. We can define such functions for any elemental language. To do this, we need to annotate such a function with the #[lang_item] attribute.

At the moment, Rust requires us only two such functions:

panic_fmt: (fmt: ::std::fmt::Arguments, file: &str, line: u32, col: u32) -> ! which is called when calling panic! . Arguments panic! are passed as the fmt parameter. In addition, the file name, line number and column of the place where panic! was called is panic! . This is appropriate for file , line and col .
eh_personality : this function depends on the specific OS and ABI. It is called during stack promotion or after abort, if necessary. Usually all this happens when panic! or exit from the stream. We will not implement this function.

Both of these functions are already declared in kernel/src/lang_items.rs . We need to add the function panic_fmt , so that it displays something useful.

Implementation `panic_fmt`

Now we will add this function. Our implementation should output the transmitted information to our console, and then go to an endless loop loop . By the way. fmt::Arguments implements the Display treit. Therefore, we can use kprintln!("{}", fmt) . Make a conclusion as you personally like it. For example, you can be inspired by the panic of the Linux kernel :

  ( ( ) ) ) ( ( ( ` .-""^"""^""^"""^""-. (//\\//\\//\\//\\//\\//) ~\^^^^^^^^^^^^^^^^^^/~ `================` The pi is overdone. ---------- PANIC ---------- FILE: src/kmain.rs LINE: 40 COL: 5 index out of bounds: the len is 3 but the index is 4

Then test the implementation of panic_fmt some kind of kernel panic. I remind you that you can use make install to compile and then run through the bootloader. And yes. ALLOCATOR.initialize() calls panic! somewhere inside. So when doing it, you should also see an error message.

In addition to this, try to cause panic in many ways. With panic!() , With other similar macros. And somehow. When you are sure that your implementation is working perfectly for yourself, go to the next phase.

Subphase B: ATAGS

In this subphase, we will iterate over the ARM (ATAGS) tags that the Malinka firmware provides. We will use an iterator to determine the amount of available memory. The main work will be carried out in the pi/src/atags and in the kernel/src/allocator/mod.rs .

ARM tags

ATAGS or ARM tags is the mechanism used by the ARM-based bootloader and firmware in order to transfer various information to the kernel of the operating system. Linux can also be configured to use ATAGS.

Malinka places an array of ATAG structures starting at address 0x100 . In addition, each tag starts with an 8-bit header:

 struct AtagHeader { dwords: u32, tag: u32, }

The dwords field contains the size of the entire tag, including the title itself. At the same time the size is in double words (double words, i.e. a 32-bit word). The minimum size is 2 (in the size of the header). The tag field contains the ATAG type. There are about a dozen of these in the ATAGS help . Malinka uses only four of them. They are interesting to us:

Name	Type ( `tag` )	The size	Description
CORE	0x54410001	5 or 2 (if empty list)	Used as a starter
None	0x00000000	2	Means the end
Mem	0x54410002	four	Describes a piece of physical memory
CMDLINE	0x54410009	different	Passing arguments to the kernel

The tag type determines how we will interpret the data after the header. Clicking on the links that are tied to the names, you can learn the details. For example, the MEM tag contains something like the following:

 struct Mem { size: u32, start: u32 }

Tags are stored in memory sequentially without any filling between them. This list begins with a CORE tag. And the last tag in this list is NONE . Everything else can be in any order. In order to find out where the next tag begins, you need to look at the contents of dwords . Graphically, it all looks like this:

ATAGS

Union and Security

Someone thinks that all these pictures have a special deep meaning?

The raw structures for ATAG tags are declared in the file pi/src/atags/raw.rs In this case, union used there. The associations in Rust are almost identical to the associations in Nyashny Xi. They define the structure in which some fields have a common memory.

 pub struct Atag { dwords: u32, tag: u32, kind: Kind } pub union Kind { core: Core, mem: Mem, cmd: Cmd }

In other words, associations allow storing arbitrary structures into memory without taking into account the correctness of the choice. It turns out that access to specific fields of association is unsafe in Rust.

There are already unsafe wrappers in many places. At least you don’t need to worry about how to apply to associations yourself. However, passing associations to library end users is a bad idea. For this reason, there is another Atag structure in the pi/src/atags/atag.rs . This structure is completely safe to access. We will transfer it to the external code. In the process of adding the atag module atag you will write a transformation between the internal representation and this structure.

Why is the transfer of associations to end users a bad idea? [enduser-unsafe]

We make a lot of efforts to create a secure interface for unsafe structures. You will see more than once in Rust. The standard library is also a good example of this. What is the use of creating a safe interlayer? Can we transfer a similar approach to languages such as Nyashny Xi?

Command line arguments

The CMDLINE tag deserves additional attention. This same tag is declared this way:

 struct Cmd { /// The first byte of the command line string. cmd: u8 }

According to the comment, the cmd field contains the first byte of the string. In other words, &cmd is a pointer to a C-like string, which ultimately ends with a zero byte. The safe version of this tag is Cmd(&'static str) . When you convert from a raw version to a safe version, you will need to determine the size of this string. Those. find a null terminator (character with code 0 ). You can then use the address and size to convert it to a slice with slice::from_raw_parts() . In turn, this slice can be converted to a string using str::from_utf8() or str::from_utf8_unchecked() . You have already used them in lab 1.

Implementation of `atags`

Well. Now we have everything we need to implement the atags module, which lies in pi/src/atags . Start by implementing raw::Atag::next() from atags/raw.rs This method determines the address of the next ATAG, after the current one. There will have to resort to unsafe code. Then implement helper methods and properties to convert from raw to secure structures, which can be found in atags/atag.rs When implementing From<&'a raw::Cmd> for Atag will From<&'a raw::Cmd> for Atag to use a little unsafe code. At the end, implement Atags' Iterator Atags , which can be found in atags/mod.rs Here, too, a bit of unsafe code may be required.

Tips:

You can (at least try!) To implement the Atags::next() method in three lines of code.
')
You can convert from x: &T to *const u32 with x as *const T as *const u32 .

The inverse transform from x: *const T to &T can be performed using &*x .

You can perform arithmetic operations on raw pointers with add () sub () or offset .

Testing `atags`

Test your own ATAGS implementation. Try to get all the values through an iterator and output the whole thing to the console. Straight in kernel/src/kmain.rs . You should see at least one of three tags other than NONE . In this case, make sure that the implementation of each ATAG meets your expectations. Once the implementation is completed, go to the next subphase.

Tip : Use `{: #?} For a more beautiful display of structures on the console.

What do CMDLINE tags contain? [atag-cmdline]

Is there any value in these tags? What arguments did the firmware give you? Where do you think this all came from and how can it be used?

How much memory is available according to the MEM tag? [atag-mem]

What is the exact starting address and size of the available memory that is reported in the MEM tag? How close is all this to 1 gigabyte of memory that Malinka has?

Subphase C: Warming Up (warming up)

In this subphase, we will prepare everything necessary for the subsequent writing of two memory allocators. We implement the align_up and align_down functions that align addresses to powers of two. In addition, we implement the memory_map function, which returns the starting and ending addresses from system memory. This function will be used by both allocators to determine how much memory is available for allocation.

Alignment

An address in memory is called N-byte-aligned when it is completely divided by N In other words, for the address k k % n == 0 is true. Usually we do not need to worry about leveling our memory. However, now we are system programmers. Our heightened attention to this topic is due to the fact that the hardware, protocols, and all that, prescribe quite certain properties from the point of view of alignment. For example, for a 32-bit ARM architecture, it is required that the stack pointer be aligned to 8 bytes. The AArch64 architecture (our case) requires alignment for a stack pointer of 16 bytes. x86-64 requires about the same. Memory page addresses must be aligned to 4 kilobytes. There are many more different examples, but we already see that without this we can not do.

In Nyashny C, the memory addresses that the allocator returns from libC will be guaranteed to be aligned by 8 bytes on a 32-bit system and 16 bytes on a 64-bit system. In addition, the caller cannot specifically control the alignment of the returned memory address and must take care of itself. For this there is, for example, the posix_memalign function from the POSIX standard.

Why were such alignment properties chosen for Nyashny S? [libc-align]

The choice of a guarantee of 8 or 16 bytes for alignment for the malloc function malloc incompetent. Why were such guarantees guaranteed in the standard C library?

In Nyashny C, the declarations malloc() and free() look like this:

 void *malloc(size_t size); void free(void *pointer);

In Rust, at a low unsafe level, alloc and dealloc , which look like this:

 // `layout.size()` is the requested size, `layout.align()` the requested alignment unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr>; // `layout` should be the same as was used for the call that returned `ptr` unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout);

Note that the calling code may indicate alignment. As a result, the allocator, rather than the calling code, must take care to return the aligned pointer. When you implement the memory allocators in the following sub-phases, you should make sure that the return address is properly aligned.

In addition, it is worth canceling that dealloc , unlike free from Nyashny, requires that the caller transfer the Layout exactly the same as for alloc . Thus, the external code must take care of storing the size and alignment for a particular allocated memory.

Why do you think Rust divides the responsibilities between the allocator and the calling code in this way? [onus]

In Nyashny, the allocator has fewer restrictions on the alignment of memory addresses that it can return. But at the same time, he himself must keep the size of the allocated space. Rust is exactly the opposite. Why do you think Rust chose the opposite path? What advantages does this provide for the allocator and for the calling code?

`align_up` : `align_up` and `align_down`

When you implement allocators in the following sub-phases, it will be useful for you to determine the next or previous aligned address. Those. for the address u following is >= or <= u , which will be aligned in powers of two. There already is (unrealized, of course) align_up and align_down . You can find them in the file kernel/src/allocator/util.rs They are announced something like this:

 /// Align `addr` downwards to the nearest multiple of `align`. /// Panics if `align` is not a power of 2. fn align_down(addr: usize, align: usize) -> usize; /// Align `addr` upwards to the nearest multiple of `align`. /// Panics if `align` is not a power of 2. fn align_up(addr: usize, align: usize) -> usize;

Release these features right now. You can verify the correctness of this process by calling make test or cargo test from the kernel directory. The tests themselves can be found in the file kernel/src/allocator/tests.rs . If everything is correct in this part, then all the tests from align_util must pass.

Attention : in the process of testing kprint{ln}! turn into print{ln}! calls print{ln}! so everything should work.

Tips:

The implementation will take one or two lines.

The align_up and align_down will be very similar to each other.

Thread safety

Memory allocators like malloc() from libC and our two, which we will implement, are global . Those. they can be called anywhere and anytime. Including in any stream. Thus, allocators must be thread-safe. Rust is very serious about thread safety. For this reason, it is difficult to implement an allocator, which will not be such, even if our system still does not have a mechanism for servicing parallelism (for example, threads).

The topic of thread safe memory allocators is quite extensive. On this topic, you can find a lot of research. At the moment, this topic, we will not touch. Just wrap our Mutex memory Mutex . This wrapper is already in kernel/src/allocator/mod.rs Read all this code now. Please note that there is an implementation of Alloc treit . It is precisely because of this that Rust will know that this is quite a valid allocator. The implementation of this trait is required for such things as #[global_allocator] (in kmain.rs ). #[global_allocator] — , , Rust- Vec , String Box . Those. alloc() dealloc() , .

Alloc Allocator kernel/src/allocator/mod.rs imp::Allocator , mutex . imp . . #[path = "bump.rs"] , Rust-, . #[path] . bump- bump.rs . bin- bin.rs .

: `memory_map`

kernel/src/allocator/mod.rs — memory_map . Allocator::initialize() , kmain() . initialize() imp::Allocator .

memory_map . , . Those. , ATAGS. , . binary_end . , _end ( layout.ld ).

memory_map , Atags B binary_end . , . - String::from("Hi!") . . , panic!() bump-. memory_map , bump- — . , .

D: Bump-

. Bump-. kernel/src/allocator/bump.rs .

Bump- . alloc current . . current ( ). , . dealloc .

, , , 1 , 512 . , .

BUMP

, kernel/src/allocator/bump.rs . Those. new() , alloc() dealloc() bump::Allocator . align_up align_down , . -. make test cargo test kernel . allocator::bump_* .

. , , !

saturating_add saturating_sub .

- , kmain() . :

 let mut v = vec![]; for i in 0..1000 { v.push(i); kprintln!("{:?}", v); }

— .

alloc ? [bump-chain]

bump::Allocator::alloc() . ? : , v.push(i) bump::Allocator::alloc() .

E: Bin-

: bin-. kernel/src/allocator/bin.rs .

Bin- , . ( ) , . . — . , . .

, , . k - 2 2^n n , 3 k ( 2^3 , 2^4 … 2^k ). , 2^3 , 2^3 . 2^3 2^4 2^4 :

bin 0 ( 2^3 ) (0, 2^3]
bin 1 ( 2^4 ) (2^3, 2^4]
bin 2 ( 2^5 ) (2^4, 2^5]
bin k ( 2^k ) (2^(k - 1), 2^k]
...

(intrusive) . kernel/src/allocator/linked_list.rs . kernel/src/allocator/bin.rs LinkedList , .

?

next ( previous ) . . .

LinkedList::new() . push() . ( ), pop() . peek() . :

 let mut list = LinkedList::new(); unsafe { list.push(address_1); list.push(address_2); } assert_eq!(list.peek(), Some(address_2)); assert_eq!(list.pop(), Some(address_2)); assert_eq!(list.pop(), Some(address_1)); assert_eq!(list.pop(), None);

LinkedList . iter() . . iter_mut() . Node , . value() pop() .

 let mut list = LinkedList::new(); unsafe { list.push(address_1); list.push(address_2); list.push(address_3); } for node in list.iter_mut() { if node.value() == address_2 { node.pop(); } } assert_eq!(list.pop(), Some(address_3)); assert_eq!(list.pop(), Some(address_1)); assert_eq!(list.pop(), None);

LinkedList . , push() . LinkedList .

? [ll-alloc]

— . , , ?

Fragmentation

, , . , . . , . . . .

. , ( ). bin- .
. , - . bin- , . .

, .

Implementation

bin- kernel/src/allocator/bin.rs . ( , bin- ) . . . - . make test cargo test . bump.rs bin.rs #[path] kernel/src/allocator/mod.rs . Well, that is bump-, bin, .

bin- — .

? [bin-about]

. :
() ?
?
?

? [bin-frag]

! , ? ( ) .

UPD :

Source: https://habr.com/ru/post/352414/

All Articles