πŸ“œ ⬆️ ⬇️

Operating systems from scratch; level 2 (upper half)

It's time to write the file system. The file system does not write itself. In this half of the labs, we still implement the FAT32 file system, attach the SD card driver to it and slightly interact with it through our interactive shell.


Null lab


First Laba: the younger half and the older half


The younger part . Continued under the cut.


Phase 2: 32-bit lipids


In this phase we will implement the FAT32 file system. Exceptionally read-only at the moment. The main work will be vestist in the directory 2-fs/fat32 .


Disks and File Systems


The data on the disk is controlled by one or more file systems. Similar to memory allocators, file systems are responsible for managing, allocating, and freeing memory. The only difference is that this is not a fast RAM, but a slow and non-volatile memory. In other words, all changes are saved at any time in the future. Including after restarting the computer. There are many different file systems. Linux has EXT4. On macOS there are HFS + and APFS. On Windows, there is NTFS. Some file systems are implemented for several operating systems at once. FAT32 is one of those. It is implemented for all major OSes including Linux, macOS and Windows. It was originally used in later versions of DOS and earlier versions of Windows. The main advantage of FAT32 is omnipresence. This is one of the most koross-platform file systems.


In order to allow more than one file system to be on the disk, this same disk can be divided into sections. Each partition can be independently formatted for different file systems. To split the disk into sections, the disk writes to a specific location where it starts, where it is downloaded and the type of file system that this section uses. One of the widely used systems is Master Boot Record (Master Boot Record) or simply MBR for brevity. The MBR contains a table of four entries describing the sections. However, some sections can not be declared as used. There are slightly more modern separation schemes like GPT, which among other things supports more than four sections.


In this task, we will implement the code for reading the MBR from disk, which in turn includes one FAT32 partition. This combination is used by our malinka: the MBR is also also FAT32.


Disk partitioning


This diagram shows the physical layout of the disk partition with MBR and FAT32:


MBR and FAT32


The PDF of the FAT structure contains all the necessary information about the sizes and contents of these very structures. Together with the minimum required description. We will use the document when implementing our file system. In addition, it will be useful to study the corresponding article from Wikipedia.


Master boot record


MBR is always in the zero sector of the disk. The MBR contains four partition entries. Each of these entries contains: the type of the partition, the offset of the partition in the sectors and different flags, such as whether this partition is bootable. All other fields like CHS (cylinder, head, sector) can be completely ignored. So does most modern implementations. It is worth noting that the partition type for FAT32 is 0xB or 0xC .


Extended Bios Parameter Block


The first sector of the FAT32 partition contains an extended BIOS parameter block. Abbreviated EBPB. This block itself starts with a block of BIOS or BPB parameters. Together they define all the necessary layout options for the FAT file system.


There is one area in EBPB that needs special attention. The one that determines the number of reserved sectors (number of reserved sectors). This offset is from the beginning of the FAT32 partition in the sectors where the FAT can be found. Immediately after the last FAT there will be an area containing data for the clusters. Now we will look at the FATs in more detail, the data area, the clusters and that's it.


Clusters


All data stored in the FAT file system is divided into clusters. In EBPB there is a field from which you can find how many sectors there are in each cluster (number of sectors per cluster). The numbering of the clusters begins with the number 2. As can be seen from the diagram, the data for cluster 2 is located at the beginning of the data area. The data for cluster 3 is located immediately after cluster 2 and so on.


File allocation table


FAT stands for file allocation table. File allocation table Based on the name FAT, this is a table (array) of FAT records. In FAT32, each such recording has a size of 32 bits. The size of the entire table is determined by the sectors per FAT and bytes per sectors from EPBP. For redundancy, there can be more than one FAT in the file system (in the name of the most holy backup!). The number of tables can also be found in EPBP. Watch the field number of FATs.


In addition to the entries behind the numbers 0 and 1, each of the FAT entries determines the status of the cluster. The note number 2 determines the status of cluster 2. Run down 3 determines the status of cluster 3. And further along the list. Each cluster has its own FAT entry.


Records 0 and 1 are most likely:



In addition to these two entries, all the rest correspond to a specific cluster from the data area. Although FAT entries have a full size of 32 bits, only 28 bits are used. The upper 4 bits are ignored. And the values ​​may be:



Chain of clusters


Clusters form chains of clusters. This is essentially a linked list of clusters. If a cluster is used for data, its value contains either a reference to the next cluster or is an EOC marker indicating the end of the chain.


As an example, consider a diagram with 8 FAT entries:


Chain of clusters


Clusters are colored by colors so that it can be easier to figure out which chain it belongs to. The first two entries are ID and EOC. Record 2 indicates that the corresponding cluster is a cluster of data and this chain (green) is one cluster in size. Record 3 indicates that cluster 3 contains data and the next in the chain (blue) will be cluster 5 with data that refers to cluster 6, which terminates this chain. Similarly, clusters 7 and 5 form a chain (red). Cluster number 8 is free and not used.


Catalogs and Records


A cluster chain is data for a file or directory. Directory - the essence of a special file, which contains the file names and all other metadata. Inside the gurney is an array of catalog entries. Each such record contains a name, a start cluster and whether this record is a directory or just a file.


There is one special directory that is not associated with entries in other directories. Root directory. The starting cluster for the root directory can be found in EBPB. Through all this, you can determine the location of all other files and directories.


For historical reasons, each physical directory entry can be interpreted in as many different ways. The attribute field also indicates one of these methods. Here are these two variations:



A long file name (LFN) has been added to FAT32 in order to use file names longer than 11 characters. If a record has a name longer than 11 characters, then it is preceded by LFN records. However, these records are not physically sorted. Instead, they contain a field to determine the sequence. Thus, it will not be possible to rely on the physical order of LFN records.


so


Before you continue, you need to understand the structures of FAT . After that, try to answer the following questions:


How to determine whether the first sector contains an MBR structure? [mbr-magic]

The first sector of the disk may not contain the MBR. How can I determine if the MBR is there or not?



What is the maximum number of FAT32 clusters? [max clusters]

FAT32 design implies a number of restrictions. What is the maximum number of clusters in FAT32 and where do these restrictions come from? And if you take FAT16, then there will be the same restrictions or other?



What is the maximum size of a single file? [max-file-size]
')
Are there any restrictions on the maximum file size? If so, what is the maximum file size and what determines this boundary?

Hint : Look at the entry structure in the directory.



How to determine if we have a LFN record or another? [lfn-identity]

If you look closely at the entries in the catalog, which ones exactly determine the LBN, the LFN in front of us or the usual entry? Specifically, what are the bytes and what should they be?



How can I find /a/b/c.txt [manual-lookup]

Not forgetting EBPB, describe all the steps you take to find the initial cluster for the file /a/b/c.txt .

Code structure


Writing a file system is quite a serious thing. FAT32 even though we will only read it, not an exception. The code provided in the 2-fs/fat32 provides mainly the basic structure, but many design solutions and most of the implementation is entirely yours.


Now we will deal with the description of what is already done. Read the code from the directory fat32/src .


File system traits


There you can find the module traits . The entry point is traits/mod.rs There you can find about seven traits and one structure. When implementing the file system, we will also implement it all.


There is one Dummy structure that provides a fictitious implementation of most traits. This type can be used as a stub. If you look at the code, then this cap is used in some places. Maybe you need it.


I advise you to read the code from traits/ in the following order:



Cache device



Direct disk access is quite an expensive operation. Therefore, all access will be performed on cached sectors. The structure of CachedDevice can be found in the file vfat/cache.rs . It provides transparent and explicit access to sector caches. In fact, it is a wrapper over BlockDevice , which internally uses a HashMap as storage. The key in the HashMap will be the sector number. Once you implement CachedDevice , you can transparently use it as a cached version of BlockDevice . In addition, get() and get_mut() methods are provided that allow you to directly refer to cached sectors.


In addition, the CachedDevice structure must monitor the correspondence between logical sectors and physical sectors, which are defined by EBPB . The virtual_to_physical() method is provided for virtual_to_physical() . This same method should be used to determine how many physical sectors need to be read for a given logical sector.


Utility


The util.rs file contains one useful trait and its implementation for slices ( &[T] ) and dynamic arrays ( Vec<T> ). This can be used to transfer one to another while maintaining certain conditions. For example, in order to build &[u32] in &[u8] you can use this:


 use util::SliceExt; let x: &[u32] = &[1, 2, 3, 4]; assert_eq!(x.len(), 4); let y: &[u8] = unsafe { x.cast() }; assert_eq!(y.len(), 16); 

MBR and EBPB


The MasterBootRecord structure can be found in the mbr.rs file. She is responsible for reading and analyzing the MBR from BlockDevice . Similarly, you can use the BiosParameterBlock structure. It can be found in the file vfat/ebpb.rs She is responsible for reading and analyzing the BPB and EBPB sections of FAT32.


Shared


The Shared<T> structure from vfat/shared.rs can be used for secure mutable access to type T Useful in the implementation of the file system. Especially when we need the ability to share FS from different parts of the code. Before proceeding, make sure you understand how and why using Shared<T> useful.


File system


The core file system itself can be found in the file vfat/vfat.rs Obviously this is a VFat structure. As you can see, the structure contains a CachedDevice . The implementation should wrap the supplied BlockDevice in CachedDevice .


What is the VFAT?

VFAT is another Microsoft file system that is the predecessor of FAT32. For various historical reasons, it has become synonymous with FAT32. We will continue this silly tradition with not always correct names.

A partial implementation of the FileSystem properties for the &Shared<VFat> already present. In addition, you can see that the from() method returns Shared<VFat> . The main task is to complete the implementation of the from() method and some required FileSystem properties for &Shared<VFat> . This pulls the implementation of the remaining structures that implement the necessary pieces of file system properties.


More vfat/ can be found:



When implementing a file system, you will need to add all this with the necessary code. Do not be afraid to supplement any of these structures with methods that you find necessary. However, do not change any of the provided definitions of traits or signatures of existing methods.


Read now all the code from vfat.rs and make sure you understand what is happening there.


Implementation


Now we have everything you need to implement the FAT32 file system. You can sell in the order in which you prefer.


Be sure to update all provided blanks!

Make sure all your repository copies are up to date. Get the latest versions of 2-fs and os using git pull and fix everything you need.

We provide a set of fairly rigorous tests to verify implementation. Before running the tests, run make clean && make fetch in the 2-fs directory. It will load several files into 2-fs/files/resources/ . These files are used by unit tests. In this directory, you will find images that contain MBR, EBPB and FAT32 inside, as well as hashes, which are used to test different parts of the implementation. Perhaps you will find it useful to analyze images using hex editors like * Bless on Linux or Hex Fiend on macOS.


Tests can be run using the cargo test . In order to see the debugging messages, you can run the cargo test -- --nocapture . This prevents stdout and stderr from intercepting. In addition, you can freely add your own tests in the number in which you find it necessary. To prevent merge conflicts, it is recommended to add tests to a file with a name other than tests.rs .


It is also recommended to follow these rules:



You can do everything in the order in which you want. But we recommend this procedure:


  1. Implement the MBR parsing in mbr.rs Probably for implementation will require the use of unsafe . But one line will suffice. Most likely slice :: from_raw_parts_mut () or mem :: transmute () . mem::transmute() an incredibly powerful tool. Avoid using it as much as it turns out. Otherwise, you must fully understand what you are doing. To implement Debug use debug_struct() from Formatter . CachedDevice Debug .
  2. EBPB ebpb.rs . MBR, unsafe .
  3. MBR EBPB. . BlockDevice Cursor<&mut [u8]> . :

     println!("{:#?}", x); 
  4. CachedDevice vfat/cached.rs .
  5. VFat::from() vfat/vfat.rs . MasterBootRecord , BiosParameterBlock CachedDevice . MBR EBPB.
  6. FatEntry vfat/fat.rs .
  7. VFat::fat_entry , VFat::read_cluster VFat::read_chain . Cluster . . . . VFat::fat_entry .
  8. vfat/metadata.rs . Date , Time Attributes , . FAT . Timestamp Metadata , Entry , File Dir .
  9. Dir vfat/dir.rs Entry vfat/entry.rs .
    Dir , Cluster Shared<VFat> . File vfat/file.rs . , Iterator<Item=Entry> entries() . entries() unsafe . VecExt SliceExt . FAT β€” , Dir .

    Entry
    LFN, , union . VFatDirEntry . Rust . .

    . , , . union . .

    . , LFN, , . . .

    UTF-16 LFN. decode_utf16() . UTF-16 Vec<u16> .
    Dir::find()

    Dir::find() traits::Dir Dir . , Dir::find() . . eq_ignore_ascii_case() .
  10. File vfat/file.rs . , Cluster Shared<VFat> . traits::File File . entries() Dir .
  11. VFat::open() vfat/vfat.rs . components() , Path . , , std , , . read_dir() , is_file() , is_dir() .

    Dir::find() . VFat::open() . 17 . Dir .

, - , , . ! , , .


SD-



SD- Raspbrerry Pi 3, Foreign function interface FFI . FFI Rust 19.1 Rust . . os/kernel/src/fs .


Foregin Function Interface


FFI Rust , . , Rust, extern :


 extern { static outside_global: u32; fn outside_function(param: i16) -> i32; } 

outside_function outside_global . :


 unsafe { let y = outside_function(10); let global = outside_global; } 

, unsafe . Rust , . . , , Rust , . outside_function outside_global . .


Rust , ( ) . Rust (mangles) , . . , , . #[no_mangle] :


 #[no_mangle] fn call_me_maybe(ptr: *mut u8) { .. } 

() :


 void call_me_maybe(unsigned char *); call_me_maybe(...); 

Rust ? [foreign-safety]

, Rust , . , Rust , Rust , , Rust.



Rust ? [mangling]

. C++ Rust . , ? , , Rust .

SD-


SD- os/kernel/ext/libsd.a . . Those. . os/kernel/src/sd.rs , .


wait_micros , . . . :


 /* * Sleep for `us` microseconds. */ void wait_micros(unsigned int us); 

β€” API Rust-. Sd , SD- new() . BlockDevice Sd . unsafe . , MBR kmain . , . , , .


: 64- ARM unsigned int u32 Rust.



? [foreign-sync]

SD- ( sd_err ) . . ? , Rust unsafe -. ? ?
: ! ( , ) ?


File system


. kernel/src/fs/mod.rs .


, . , . , , static FILE_SYSTEM: FileSystem kernel/src/kmain.rs . , .


. . FileSystem kernel/src/fs/mod.rs , FAT32 SD-. Sd ( BlockDevice ) initialize() . FileSystem , VFat . , kmain .


, ( / ) SD-. , β€” .


4: Mo'sh


cd , pwd , ls cat . os/kernel/src/shell.rs .




. ( cwd current working directory) β€” , . cwd /a , hello /a/hello . cwd /a/b/c , hello /a/b/c/hello . / , . . /hello hello .


cd <dir> . cd /hello/there , cwd /hello/there . cd you , cwd cd /hello/there/you .


. , cwd .



, : cd , pwd , ls cat . :



. . . β€” , . , .


Implementation


os/kernel/src/shell.rs . PathBuf , . PathBuf cd . . , .


, , β€” . Congratulations!


, bin-!

. , bin-. , .



: PathBuf Path .



: .. . cd .

So it goes.

Source: https://habr.com/ru/post/353024/


All Articles