
What virtual memory can actually do



At 1cloud we try to cover a range of technologies, for example containers, SSL, and flash memory.

Today we continue with the topic of memory. Developer Robert Elder published a post on his blog describing capabilities of virtual memory that not all engineers know about. Here we present the main ideas of that article.
Note: the source material contains many complex terms and technical descriptions. If you spot an error or inaccuracy, send us a private message so that we can correct it and improve the material.

When Elder started updating his own C compiler and writing a CPU specification, he realized that many aspects of virtual memory are not fully understood by novice developers. For this reason, he decided to write his own online guide.

Before moving on to Elder's article, you can watch a video in which Jason Pitt talks about virtual memory.



How it works


Elder created a table on his site showing the physical and virtual views of a 256-byte address space. Below is a screenshot of this table. An interactive version is available on the engineer's blog.



Legend for Elder's interactive table:

0x0: a pointer to the top-level page structure. On Intel machines this value is stored in the CR3 register. On ARM, things are a little more complicated.
First-level page structure. In a two-level scheme it is often called the page "directory". In our case, each directory entry occupies 8 bits (1 byte) and holds the location of a page table.
Second-level page structure, the page table. Each entry holds the location of a physical page.
The physical page currently being accessed.
The active entry in the page directory or page table.
The selected memory location.
Readable memory. Permissions are not checked in this example, but a real system would check the corresponding bit against the requested type of access.
Writable memory.
Executable memory.
Inaccessible virtual memory.
Uninitialized physical memory. It cannot be accessed through the virtual address space; attempting to do so causes a page fault.
Inaccessible physical memory: regions of memory that cannot be accessed.

One-to-One Address Mapping (Identity Mapping)


This is one of the simplest ways to map virtual memory: each virtual address is mapped to the identical physical address. This approach is not well suited to a full-featured OS, but it can be very useful for rapid development of some systems (one example is a microkernel that Elder is working on).

Recursive Mapping


To manage memory, you need to know where the page structures are located in physical memory. Once the memory management unit (MMU) is enabled, however, you can only work directly with virtual addresses. For this reason, keeping track of physical addresses can be very difficult.

One solution to this problem is recursive page tables. If you add a recursive entry (one that points back to the structure itself) to the top-level page structure, you can easily tell which virtual address gives access to any physical address within the page structures. You only need to work out which virtual address to generate in order to pass through the recursive entry.

Once you generate virtual addresses that pass through the recursive entry, page directory entries are interpreted as page table entries. This works only if page table entries and page directory entries have the same layout, so that the two are interchangeable during address translation.

As a result, any page structure can be reached through virtual memory. The downside of recursive mapping is that it consumes additional virtual address space.

Everything Mapped to the Same Page


An important feature of virtual memory is that it lets you map one physical page to many virtual addresses. This makes it possible, for example, to map pages from a read-only region of shared memory into several processes.

Page Faults Everywhere


A page fault occurs when we access a region whose page structure entry does not have the "initialized" bit set. A page fault is also raised when we attempt an access that violates the page's permissions (the example here does not check permissions, but a real system would).

Context Switching Between Two Processes


By changing the pointer to the top-level page structure, we switch to a different page directory. The set of available addresses stays the same, but their contents change. This explains why, in an OS with virtual memory, many processes can use the same pointer value.

Solving External Fragmentation


External fragmentation is a very unpleasant problem. Consider this situation: your computer has 4 GB of memory and no hard disk. After several allocations, the entire memory space is free except for one byte in the middle. If you now need to allocate a contiguous three-gigabyte block, you cannot, even though there is enough free memory in total.

There are two ways out of this situation:

  1. Move the one-byte allocation to the end of the memory space.
  2. Give the process two separate memory blocks and let it decide what to do with them.

The first option may hurt performance if the chunk that has to be copied is very large (say, 1 GB). That is not the only difficulty: after moving the data, we would also have to tell the process that owns this memory that its pointer has changed.

The second option does not work either, because the process expects the allocated memory to be contiguous. If it is not, the process needs a completely different set of instructions, plus stored information on how to compute the address of the second half.

Virtual memory solves this problem effectively. You can simply remap the virtual address space so that non-contiguous pieces of physical memory appear as one contiguous block. No data is moved; we only update page table entries.

Copy-On-Write


Virtual memory is extremely useful for improving the performance of fork. Making a full copy of every page of memory that the process uses would waste CPU cycles and RAM. The idea of copy-on-write is that we simply map the parent's memory image into the child's address space.

After that, the OS prevents both processes from writing to this memory. A real copy of a page is made only in the exceptional case, when one of the processes tries to write to it. In practice, most pages are never modified after the fork, which makes this approach all the more efficient.

Experimenting with Pages


Elder ran an experiment on his computer running Ubuntu 14.04. He declared several constants, variables, and functions in a row to see whether their addresses would also be next to each other.

#include <stdio.h>

const char a = 'a';
char b = 'b';
char c(void) { return 0; }
const char d = 'd';
char e = 'e';
char f(void) { return 0; }

int main() {
    printf("a: %p, b: %p, c: %p, d: %p, e: %p, f: %p\n",
           (void *)&a, (void *)&b, (void *)&c,
           (void *)&d, (void *)&e, (void *)&f);
    return 0;
}

Here is the output he got:

 a: 0x400618, b: 0x601040, c: 0x40052d, d: 0x400619, e: 0x601041, f: 0x400538 

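The addresses clearly do not follow the order of declaration: the constants a and d are adjacent, the variables b and e are adjacent, and the functions c and f sit close together in a third region. Elder went further and ran another experiment showing that, within each of these groups, constants, variables, and functions are stored in the order in which the programmer declared them. The code and explanation are linked from the original post.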

Calling a Function Made of Constants


The following program defines several arbitrary constants (which will be replaced later) and a function that takes an integer and adds 8 to it. In this example, main immediately follows func1 in memory. When run, the program prints the machine code of func1.

#include <stdio.h>

const unsigned int a = 0x12345678;
const unsigned int b = 0x90123456;
const unsigned int c = 0x78901234;
const unsigned int d = 0x56789012;

unsigned int func1(unsigned int i) {
    return i + 8;
}

int main(void) {
    unsigned int *i;
    unsigned int num;

    /* Print out the bytecode for 'func1' */
    for (i = (unsigned int *)func1; i < (unsigned int *)main; i++) {
        printf("%p: 0x%08X\n", (void *)i, *i);
    }

    num = func1(29);
    printf("%u\n", num); /* Prints 37 */
    return 0;
}

The output:

0x40052d: 0xE5894855
0x400531: 0x8BFC7D89
0x400535: 0xC083FC45
0x400539: 0x55C35D08
37

You can now copy these values into integer constants that are laid out in memory one after another (the example may not work if your system differs from Elder's). Because the constants sit on the same page, which on Elder's system is executable, you can treat them as executable data and call through a pointer to a instead of a function pointer.

#include <stdio.h>

const unsigned int a = 0xE5894855;
const unsigned int b = 0x8BFC7D89;
const unsigned int c = 0xC083FC45;
const unsigned int d = 0x55C35D08;

unsigned int func1(unsigned int i) {
    return i + 8;
}

int main(void) {
    unsigned int *i;
    unsigned int num;

    /* Print out the bytecode for 'func1' */
    for (i = (unsigned int *)func1; i < (unsigned int *)main; i++) {
        printf("%p: 0x%08X\n", (void *)i, *i);
    }

    /* Cast the address of 'a' to a function pointer and call it */
    num = ((unsigned int (*)(unsigned int))&a)(29);
    printf("%u\n", num); /* Prints 37 */
    return 0;
}

The result is still 37.

Conclusion


As you can see, virtual memory offers a rich set of capabilities. To name just a few: isolating processes, solving external fragmentation, and implementing a copy-on-write mechanism that makes creating processes cheaper.

Source: https://habr.com/ru/post/276217/

