Transcript of a talk about Phantom OS given by Dmitry Zavalishin at ADD-2010

Annotation

Dmitry Zavalishin spoke about the current state of his favorite brainchild, the original operating system Phantom OS, which is conceptually similar to Microsoft Singularity but open source (most of the operating system's source code has been published).

A microkernel operating system without files or processes, with only immortal objects and threads, attracted the curious even at the concept stage. Now it has come to life, boots, and is ready to turn into a real collaborative project.

Transcript

The video was transcribed by Stas Fomin.

How do we make it really good? I mean that all software that exists today is made on the principle "let's take what there was, tweak it somehow, improve it, move this here, fix that there," and in the end everything we have today is legacy upon legacy upon legacy, old stuff that kept developing and developing, and as a result it is all layers, very complicated and heavy...

I have been using computers for twenty-five years. The ones I started with had two 160 KB diskettes and 48 KB of RAM. And, you know, they booted faster than what I have now, while functionally I do about the same things: I write programs, edit some texts, and read email. Meanwhile the processor and memory have changed by, God, I cannot even count... five orders of magnitude, six orders. Where does all that go? It goes exactly there: all the software made today is made on the principle of historical development, something old, scary, and miserable.

Phantom was born from a principle: let's think about what we actually need and try to do it from scratch. Do not take the Linux kernel, do not take Java, with all my love for it, do not develop something existing. Let's start from scratch.

Several thoughts underlie this idea, for example about the tasks involved. I looked at the modern software that you and I all develop.

What problems does the developer face? How can things be made easier? Generally speaking, as I worked out for myself, the whole history of software development, from the very beginning, is about giving the programmer the ability to develop on a component basis.

That is, to take as much as possible from outside, to do as little as possible yourself, and still be able to assemble it all from building blocks. Even the system itself, the very idea of an operating system, was born from this.

How was the OS born? Once upon a time, people took over the whole computer, took one program of theirs, and ran it. Then it turned out that writing a printer driver in every program is a pain, and it is better to take a ready-made one from somewhere.

Libraries appeared. Then these libraries turned into a kernel that runs the application program. So everything grew gradually, and eventually Unix appeared. Unix today is the operating system that clearly won; even Windows is made in the image and likeness of Unix, and the concept it is based on is clearly dominant.

Interestingly, when Unix appeared, and I am old enough to remember those times, Unix was a very strange operating system and was much inferior to all the existing ones. We worked on a machine called the PDP-11 (SM-1600); it had a native operating system, and there was Unix. The native OS ran many times faster than Unix. Nevertheless, it is dead today, and Unix lives on!

Why? Because Unix took a very right step. The people who made it came up with a very simple idea: small programs that process text files in a fixed way, plus the ability to combine these programs on the fly into chains, through pipes or through files, it does not matter which.

It turned out that this ability to program on the fly without really programming is genuinely valuable. You can write shell scripts on the fly and put something together by hand without much difficulty.

Note that today's administrative tools, such as Perl, work in the same direction: the ability to quickly knock something together on the fly.

In principle, Phantom was based on a rather banal idea that began to emerge around the nineties. When the C++ language appeared, in 1987 or so, I think, everyone got excited about object-oriented programming, and a thought appeared: "Well, why not? The language is object-oriented, so why is the OS flat? It should be made object-oriented too." Moreover, this is, generally speaking, a rather practical idea that does not come from soul-searching or idealism; it stems from the fact that an object interface is really convenient to have. Look at today's situation and it is obviously true. The same C# and the same Java completely wrap the operating system in object wrappers, because it is convenient.

Good. Next step. Clearly, we want programs to communicate. And here is the interesting part... this is what I forgot to say when talking about Java and C#: in Windows there is OLE. It really is... for all, pardon me, the nastiness of this tool, there is unfortunately nothing better yet. And this tool solves a perfectly meaningful task.

That task is building a component environment, one that can really be assembled from a pile of objects. That is, you get assembled, compiled modules from different sources and make one thing out of them, build a working system. And I must say this is not an abstraction either; it is a concrete situation. In the field where we work, industrial control and monitoring systems, there are drivers for systems sold by many different vendors, and they are all made as COM objects. COM is a thoroughly winning technology there; it is profitable for them, they need it.

Or another example, a capability that, by the way, is not used very often but is really nice: embedding Word documents in an Excel spreadsheet, for example, or linking them in some other way, that kind of document-centric thing.

Great, huh?

Once, one of the first times I talked about Phantom, there was a big hall with about two hundred programmers, and I asked: "Who knows what OLE is?" About thirty hands went up. I asked: "And how many of you program with it? Who has actually built any OLE tool?" And there were three uncertain hands, and it was clear that two of them had compiled roughly "Hello World", and the third had really dug into something and built it ^{[ 1 ]} .

Why? Because it is very hard. Very, very, very hard.

Even if you set OLE aside and take other tools, whether the two processes are on the same machine or on different machines, the ability to really connect them is very... very poor.

And what is the trouble? The normal, natural representation of a program's innards is an object graph. That is, a program usually works with a graph of connected objects, while the only interprocess communication mechanism that really works is a pipe, that is, a hole you can push bytes through. An object graph does not fit through it. It can be serialized, but then it becomes detached from what you had; it can only be copied to the other side. As for a real tool that lets you work together on shared data... well, OLE allows it somehow, but again it is very difficult. And yet it is quite obvious that this is a valuable capability. Why is it difficult? Because many years ago some clever person said that an operating system is a kernel that runs processes, and that these processes run in separate address spaces. Ever since, this ingenious thought has been repeated after him. Whereas in fact, for managed languages, which both Java and C# are (and, in general, almost everything being done now is in managed languages), separate address spaces are, generally speaking, not needed, because managed languages manage memory well by themselves; they have no problems with stray pointers or with damaging other people's data.
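The difference between pushing an object graph through a pipe and sharing it inside one address space can be shown with a small sketch. This is only an illustration in Python, not Phantom code; the dictionary-based graph and the names `doc`, `received`, and `shared` are assumptions made up for the example, and JSON stands in for whatever serialization format a real pipe would carry.

```python
import json

# A tiny object graph: a document that refers to one page.
doc = {"name": "doc", "pages": [{"name": "page"}]}

# "Pushing it through a pipe": the graph is flattened to bytes
# and rebuilt on the other side as an independent copy.
wire_bytes = json.dumps(doc).encode("utf-8")
received = json.loads(wire_bytes.decode("utf-8"))

# Sharing inside one address space: just hand over the reference.
shared = doc

# The original owner edits the page after the hand-over.
doc["pages"][0]["name"] = "page (edited)"

copy_name = received["pages"][0]["name"]    # the copy is detached
shared_name = shared["pages"][0]["name"]    # the reference sees the edit
print(copy_name, "|", shared_name)
```

The copy that went through the byte stream no longer follows the original, while the shared reference does, which is the "cheap IPC" the talk goes on to describe.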

Therefore, if you take a step towards managed languages, you can abandon address spaces at the operating system level. And then the thought went on like this: well, fine, we took it and wrote it... (looking at the screen) and there is Phantom running...

Is it booting?

No, it has already booted; that is it running.

We took two programs, launched them, and they became friends. One sent the other some pointers; since we have a common address space, communicating is very easy. You pass a pointer, and through it you can make calls, pick something up, hand something over; other people's data literally becomes your own. Very cheap IPC. There is only one problem: you have stopped that program, so where does the pointer point? Apparently nowhere. You restart it, and even if your pointer still points at the old data, the program is running anew, the data is in a different place, and the connection is broken. Every time I talk about this I give one example. When you start Photoshop (everyone has started Photoshop), a window comes up and a line of text starts running in it: it loads this, then pulls in that, then finds something else... and it does this every time.

What the heck! Why can it not find all that once and remember the pointer? Because the operating systems that exist today do not let you remember what you once loaded. You are forced to close the application, and when you close the application, all its memory is lost. So you cannot connect programs in a normal way, only through files, only through strange, very complex mechanisms that are difficult and slow. Is it possible to make a program never stop? Yes, actually, it is. The program lives inside the operating system and may well pretend that the operating system lives forever. Even if you have rebooted the machine and started it again, there is no need to tell the program about it. This is generally not even necessary in Linux.

What can you get from this? Quite a lot. As soon as this thought occurred to me, I immediately thought: great, suppose I did this, that is, I start the program once, and then it runs for the rest of its life, sits in memory, and lives there.

It turns out that after this, files are not needed, because a significant part of what files are for is storing the state of a program between launches. I call it storing the soul of a dead program. If the program is never killed, its soul is immortal. Files are then unnecessary. That is, you can probably still use them to transfer data somewhere, write to a USB flash drive, send something over the network, or whatever, but first, it is not required, and second, it can be done quite differently.

Recently I came up with an example: there are programs that musicians use, called sequencers. All these programs work with two types of files. One is their own private type, in which they store their entire state; it is usually some proprietary format that nobody else knows. The second is something like .midi, a file you can write so that you can later transfer it to another program; it is an exchange format. So the first format is simply not needed in this case. The program started and built its state in memory. The machine was turned off, and when it was turned off, it saved everything as if it had hibernated; you turn it on, and all the programs wake up as if nothing had happened. That is, they can safely stay in this state, you will not lose your work, and you can link these programs however you like; the connection turns out to be very cheap and efficient.

Some other amusing things follow. For example, having all programs live in the same address space means that I/O is quite efficient, because there is no context switch into the operating system kernel as there is now. Because contexts are switched and the program's memory can be paged out, you cannot do I/O directly into it; it has to go through a separate buffer. In short, there are some troubles, and in this case you can avoid them. Why could this not be done for a long time? Because to do such things you need machines with a very large virtual address space. On 32 bits Phantom does not make much sense; that is, it makes some, but clearly 32 bits is the space the entire disk has to fit into, and that is not much, only four gigabytes, and even that does not all fit. So, by and large, Phantom as an idea targets 64-bit machines, and if it is 32 bits, then it is probably a phone or something simple and small.

Actually, this concept became the basis of the system, and the thought arose that for the operating system to take off easily, the programmers of existing programs must be able to get into it somehow. We considered two ways of doing this.

The first way is the native environment. Phantom has its own bytecode and an interpreter that builds an object model, similarly to Javascript and C#; it has a compiler for its own language, and translators to it from Java bytecode and from C# are being written. That is, in theory, Java and C# code, as well as any code written in languages that compile to the JVM and CLR, can be brought over to Phantom. Moreover, the environment will be very natural for it, and all this code can interact cheaply through the pointer exchange scheme described above.

The second environment, which was not originally planned but which we came to think is probably necessary, is a POSIX-compatible environment, a kind of Unix inside Phantom, which we plan to make in two variants. The first variant is simple and elementary: just run Unix applications compiled for Phantom. That is, a POSIX environment, a perfectly ordinary POSIX environment.

The second is more interesting, and I really hope it works out, because it is harder to build but has great appeal: it is a persistent POSIX environment. That is, the application is a regular Unix application as is, but it is taken in by the system and also runs forever; you never have to close it. It still works with files, but you can simply turn off the machine at any point, turn it on again, and everything resumes from the same place where it was left.

And of course there is some kind of interchange between them, which we also plan to do in a very interesting way. Clearly a modern operating system cannot help being microkernel-based; naturally, it is not serious to drag everything into the kernel. And clearly a microkernel design is first of all a means of communication between components, including the kernel, drivers, and applications. For this there are kernel message passing mechanisms, and as I was researching this question... If anyone knows, there was an operating system called BeOS. Long ago, some Frenchmen who had left Apple built an interesting platform and made an operating system for it. It did not go very well, although it was used in some places; the company collapsed, the operating system went open source, has since been reimplemented a couple of times, and now exists under the name Haiku.

It has quite a decent message passing mechanism, and we actually looked at it and brought it into Phantom. That is, it is seen, first, as a tool for extending the kernel with components and, second, as a means of interaction between C code and object code, because clearly message passing attaches very well from either side.

Here it is. I can see nothing here on the screen, and I think you cannot see anything either ^{[ 2 ]} ...

Is it all still loading?

No, the point is this, let me explain. What I brought you is a kernel pulled straight out of the development process; I did nothing special to it. A real, live development kernel has lots of checks, stubs, and checkpoints inserted into it, so that during development you can immediately see when something has broken.

But because of that, some things take significantly longer than they should.

Actually, how does the Phantom kernel work? Simply put, if you know what paging is in Unix, then Phantom is an operating system with persistent paging. That is, the pagefile in which the state of the virtual address space is saved is not lost on restart; it is structured in the right way, and on restart the system restores its entire state from it. While the system is simply running, it writes essentially nothing beyond ordinary paging. When it shuts down, a full snapshot is made that captures the state of the system, and then the system starts from it and restores its state almost completely.

There are two points. People often ask how this differs from hibernation. It differs in that hibernation must be done explicitly, otherwise you lose everything, whereas Phantom's memory is structured so that from time to time it takes snapshots by itself, transparently to you and quite painlessly. In principle, if you simply pull the plug, it will come back up with the old state when turned on, but with some lag, some minutes or seconds ago... it depends on many things. Now, the snapshot time in Phantom is, first of all, very different for the first launch of the system and for subsequent ones.

If you have just installed the system from scratch, it has to put a lot into the snapshot, because everything is new. If the system has been running... has taken at least one snapshot, then after that there are not many modifications, and subsequent snapshots happen very quickly. If you remove all our checks and assertions, it probably takes about five seconds. But the current kernel, the one I am showing you, first, has all those checks in it, and second, is running as if it had just been born after installation, so taking a snapshot takes quite a long time, because after taking a snapshot we also check it for consistency to make sure that the snapshot system... oh, it has gone quiet, fine... we will have an honest experiment: the virtual machine that Phantom runs in has decided to break for some reason.

So we will assume that it was unplugged completely unexpectedly. Now it is breaking; Windows is a very pensive thing and for some reason does not want to kill it right away. What else is valuable about such a structure? The fact that Phantom starts very quickly. What does bringing up a snapshot involve? The kernel is loaded, and the kernel is very small; then the kernel finds the memory layout map on disk, loads it into memory, and carries on. All other disk accesses happen as ordinary page faults, caused by accesses to memory that has not been loaded yet. Why is this important? Because modern computers are steadily migrating from desktops and laptops into all kinds of embedded applications. My TV runs Linux, for example. And that Linux for some reason takes eight seconds to start, which annoys me terribly.

Computers are used in modern cars, and even if it is Linux... if it is Windows, then of course it is a disaster, it takes a very long time to boot. Even if it is Linux, even a very compact one, it still takes quite a while to start. So there is a real need: we need a system that, first, starts quickly and, second, makes no demands about shutdown. You get out of the car, click, you turn off the ignition, put the keys in your pocket, and leave. The system has been killed. You turn it on, and the system should resume from the same place where it was, because you were doing something with it: looking at the map, say, or some operations were in progress. It is reasonable for every operation to be in the same state you left it in. If we talk about server-type systems, for medicine for example, my favorite example is an artificial respiration machine: the cleaning lady walks by, pulls the plug, plugs it back in, and the machine says, "just a moment, I am checking the disks, I will load something, then maybe I can do something else." And the patient, you see, did not wait that long. So we are trying to get into this area somehow; I think it is a pretty good task for us.

So, Windows has agreed that the application should be killed; now let's try. In theory, it should now come up in the same state it was in at the moment it crashed. How can you tell? There will be a white window with digits; if the digits there do not start from one... that means... oh, oh, oh...

Which virtual machine is that?

This is TPL. Clearly, this is a white window; it is actually a Phantom application, a very simple one. I have about three lines there, roughly:

i = i + 1
print i

But the point is that from its state you can see which state the system has restored. After reboots it is clear that it keeps its state and recovers... Which one? The virtual machine? I already answered.
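The counter demo and the periodic snapshots described earlier can be imitated in user space. The sketch below is only an analogy, not how Phantom works internally: Phantom snapshots the whole virtual address space in the kernel, while this example assumes a hypothetical `run` function that saves one small state dictionary to a JSON file every few steps. The file name and the snapshot interval are made up for the illustration.

```python
import json
import os
import tempfile

def run(state_path, steps, snapshot_every):
    """Count upward, snapshotting periodically; resume from the last snapshot."""
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)          # restore the last saved state
    else:
        state = {"i": 0}                  # first launch after "installation"
    for _ in range(steps):
        state["i"] += 1                   # the demo program: i = i + 1
        if state["i"] % snapshot_every == 0:
            tmp_path = state_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, state_path)   # swap in the new snapshot atomically
    return state["i"]

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "snapshot.json")
    # Reaches 7, but the last snapshot was taken at 5; then the "plug is pulled".
    before_crash = run(path, steps=7, snapshot_every=5)
    # After power-on the counter continues from the snapshot, not from one.
    after_restart = run(path, steps=1, snapshot_every=5)

print(before_crash, after_restart)
```

The restarted counter continues from the last snapshot rather than from the moment of the crash, which is the "some lag" mentioned in the talk.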

Here it is. What is the status of the project? We have been doing this for about two years; well, I am not counting the years of hesitation and soul-searching, during which pieces of experimental code were written that today have all been either rewritten or otherwise thrown out of the project. About two years. About a year went into implementing the proof of concept, in which I used external code very actively; roughly speaking, I gathered from the Internet everything suitable for the standard parts, such as processor management and memory management. Everything not specific to Phantom was taken off the shelf. And on top of this an operating system implementing the basic ideas was written.

When it became clear to me that it was written, that it exists and works, that snapshots are taken and restored, the task became to clean out all code that does not fit our license. The goal was to fit under the LGPL, and it took about half a year to rewrite and replace all the foreign code, GPL code and code under other odd licenses, with our own. Today, except for the TCP/IP stack, which is taken entirely from another system under an acceptable license, everything else in the kernel is written from scratch. The TCP/IP stack itself has been brought up to a reasonable level; there are graphics drivers, and there are basic drivers for the standard network cards that were most popular ten years ago, such as the NE2000 or the well-known RTLs.

Snapshots, kernel development, all of this has to be tested constantly, so we made a test suite that runs on almost every build, for regression.

A compiler has been made for our own language, which, coincidentally, is also called Phantom, and here is what is in progress now. The kernel is still being finished, although it is mostly ready, and work is underway on translating Java bytecode to Phantom. I am often asked why Phantom has its own bytecode and why Java's was not used. There are many reasons. One of them is that Phantom bytecode is, how to put it scientifically... Java has native types like int and float, primitive value types, so to speak, and Phantom does not have them; there, everything is an object, very, very much so.

In principle, you can inherit from int and... develop int's interfaces, although it is not clear why you would, but you can. This was done in particular because we wanted absolute reliability from the Phantom virtual machine. Why? Because a virtual machine, by and large, whether Java or C#, it does not matter, runs in a separate process, and whatever goes wrong is the private problem of that separate process.

Even if it suddenly does something wrong, crashes, makes a mistake, or corrupts data, it does not matter: it is your own data, and nobody else's gets touched. With the Phantom virtual machine, all virtual machines of all processes of all users run in one common address space. This imposes some rather specific requirements on the system.

For example, in a regular virtual machine the stack is just a stack on which all called functions operate. That is, you call the next function and it works on the same stack.

Phantom is different: in Phantom each function call creates a separate stack for itself and lives in it. Why? Because you may call a method of someone else's class, and if it makes a mistake and corrupts the stack, it corrupts it for you. That would be damage between different users. We did not want to allow this, so everything is very, very strictly separated, and this code in the virtual machine is less efficient than Java but very strict. In addition, and I do not know whether this decision was right, maybe we will give it up: how is the Java virtual machine built? It has a bytecode verification phase and an execution phase. The bytecode is loaded and the machine looks at it: "allowed, allowed, allowed... not allowed", and it will not run it. Or "allowed, allowed, allowed, allowed, all is well, I will run it." During execution nothing is checked. Phantom does it differently: there is no verification at all, but everything is checked while the bytecode runs. An attempt to go beyond an array, an attempt to go beyond the stack bounds, everything is checked on the fly. This obviously affects efficiency, and quite noticeably. However, it may not be necessary to do this for long, because in any case we will do JIT compilation, which is in the works now, and clearly many of the checks now done at run time will be done statically during JIT compilation, probably all of them.
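The "no verification phase, check everything at run time" approach can be sketched with a toy interpreter. This is not Phantom's bytecode: the opcodes `push`, `load`, and `add`, the `VMError` exception, and the `max_stack` limit are all hypothetical names invented for the illustration. The point is only that every stack and array access is checked at the moment it executes.

```python
class VMError(Exception):
    """Raised when a run-time check fails."""

def run(code, array, max_stack=4):
    """Execute (opcode, argument) pairs, checking every access on the fly."""
    stack = []
    for op, arg in code:
        if op == "push":
            if len(stack) >= max_stack:
                raise VMError("stack overflow")
            stack.append(arg)
        elif op == "load":                     # pop an index, push array[index]
            if not stack:
                raise VMError("stack underflow")
            index = stack.pop()
            if not (isinstance(index, int) and 0 <= index < len(array)):
                raise VMError("array index out of bounds")
            stack.append(array[index])
        elif op == "add":                      # pop two values, push their sum
            if len(stack) < 2:
                raise VMError("stack underflow")
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise VMError("unknown opcode")
    return stack

good = [("push", 0), ("load", None), ("push", 1), ("load", None), ("add", None)]
bad = [("push", 5), ("load", None)]            # index 5 is outside the array

ok_result = run(good, [10, 20])
try:
    run(bad, [10, 20])
    error = None
except VMError as exc:
    error = str(exc)

print(ok_result, error)
```

A verifying virtual machine would try to reject the `bad` program before running it; here it is loaded without inspection and stopped only when the faulty access is actually attempted.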

Another problem, a huge one, which I suffered from terribly because the system was largely written and the problem was only realized when the system was already working, is garbage collection.

For the first six months or even a year the system worked without garbage collection: it littered, littered, littered, and then everything crashed. And when I started on garbage collection, I realized an unpleasant thing.

A normal Phantom, the one that will run on 64-bit machines, will operate with an address space, an object space, the size of a disk. That is, a terabyte or two today, roughly ten terabytes tomorrow. And the garbage collection has to be carried out over that entire terabyte or ten terabytes.

And we must understand that, strictly speaking, no collection of garbage exists today. Every garbage collector implemented today collects the "good" objects and discards the rest. Clear, yes? That is, to collect garbage you have to traverse all the live objects on disk. And there is no way around it... even all the existing partial garbage collectors are either incomplete, so someone else cleans up after them, or they stop the whole machine, in which case you can collect everything, but that is a problem. It seemed to be an insoluble problem, because it is a terabyte, and what is more, this terabyte is not all in memory: the system loads only the objects that are really needed, and the rest stay on disk.

If we start doing garbage collection over this whole mess, it means we have to pull everything, generally speaking, from the hard drive into RAM. Well, not completely, because we mostly care about the object headers, but we need not only to read but also to write... A total nightmare! Simply, concretely, an absolute disaster.

I started estimating: well, maybe, I do not know, it could do it at night... somewhere on the quiet... Utter, terrible suffering; I seriously wondered whether to kill the project, because it was not clear what to do. Two things saved me. First, I read about Azul: these guys really have learned to do garbage collection on terabytes, as a matter of routine.

Actually, there are even three ways out of the situation.

The second way out is one we have not adopted yet, because all these algorithms turned out to be rather nastily encumbered with licenses by someone. What is the subtlety? There is a very old, good, easy approach to garbage collection called reference counting. We count the number of references to an object; when it drops to zero, the object is discarded. Solid, sturdy, it works quite simply, and you can even live with it, but again, we all know it does not help if you have cycles of objects. They form a ring, each has a reference count above zero, and they all lie in a corner somewhere with no external reference to them: memory lost forever that never gets collected. It turns out there are algorithms called cycle breakers. They analyze the objects in the system as they traverse them and keep an eye out: is this a cycle? If it is a cycle, they cut it, and then ordinary garbage collection finishes the job. A good thing, but the licensing is unclear, so we decided to set it aside as well.
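Reference counting and its blind spot for cycles can be shown with a toy heap. This is a minimal sketch, not Phantom's collector; the class `RefCountHeap` and its methods `new`, `link`, and `release` are hypothetical names made up for the example.

```python
class RefCountHeap:
    """Toy heap in which every object carries a count of incoming references."""

    def __init__(self):
        self.refcount = {}   # object id -> number of references to it
        self.fields = {}     # object id -> ids this object refers to
        self.next_id = 0

    def new(self):
        """Allocate an object; the caller holds one external reference."""
        oid = self.next_id
        self.next_id += 1
        self.refcount[oid] = 1
        self.fields[oid] = []
        return oid

    def link(self, src, dst):
        """Store a reference to dst inside src."""
        self.fields[src].append(dst)
        self.refcount[dst] += 1

    def release(self, oid):
        """Drop one reference; free the object when the count reaches zero."""
        self.refcount[oid] -= 1
        if self.refcount[oid] == 0:
            children = self.fields.pop(oid)
            del self.refcount[oid]
            for child in children:
                self.release(child)

    def live(self):
        return sorted(self.refcount)

heap = RefCountHeap()

# A chain a -> b is freed completely once the external references are dropped.
a = heap.new()
b = heap.new()
heap.link(a, b)
heap.release(b)
heap.release(a)
chain_left = heap.live()

# A ring x -> y -> x keeps both counts at one forever: a leak.
x = heap.new()
y = heap.new()
heap.link(x, y)
heap.link(y, x)
heap.release(x)
heap.release(y)
cycle_left = heap.live()

print(chain_left, cycle_left)
```

Each `release` does a small, bounded amount of work for an acyclic object, which is why this scheme handles short-lived objects well, while the ring stays allocated until some other collector finds it.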

And the third option actually came to mind when I went searching the Internet for everything about garbage collection. It is a big, hard area. I found an article in which the author reflected on the question: what is the biggest problem in garbage collection? If we can stop everything, quietly halt the program, and calmly traverse everything, then garbage collection is very easy. But if the program keeps running, garbage collection may go wrong, and wrong in the bad direction. It is fine if it fails to find some garbage, but it can also take perfectly good data for garbage. There is a very simple situation where this happens. The collector is traversing all the objects; imagine there is some object with a reference to it from over here. The collector walked and walked, went left, went right, had not yet reached that object, and at that moment the reference was taken and handed to an object on the left that had already been traversed. The reference was not there yet when the collector passed by, so it did not see it; the reference got moved over, the collector never marked the object at all, and yet the object is perfectly normal and alive.

To avoid stopping the program on the one hand, and still be able to collect all the objects on the other, many different techniques are used. For example, through the virtual memory system, all objects touched by the program during collection are specifically marked as live. It is a normal scheme. But while discussing this problem, the author of the article said one funny thing that I just jumped at: "if only we had a complete copy of the entire state of the entire program". And then it clicked: "My God, I am writing an operating system in which there is always a complete copy of the entire state of the entire program, right there." And this is the natural way of life of this computer itself, which does not need to be implemented additionally.

And the idea is very simple: what was garbage in yesterday's version is still garbage today. So if we have yesterday's version of the entire state, we can safely find the garbage in it. It stands still, it is not going anywhere, it does not change, it just sits on the disk; you can traverse it slowly, quietly, at night, whenever you like, find all the garbage, and kill it in the current version.
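The "yesterday's garbage is still garbage today" idea can be sketched as a mark phase over a frozen snapshot followed by frees in the live heap. This is an illustration under stated assumptions, not Phantom's implementation: heaps are modeled as dictionaries from object id to referenced ids, the function names `reachable` and `collect_via_snapshot` are hypothetical, and it is assumed that object ids are never reused and that an unreachable object can never become reachable again.

```python
def reachable(heap, roots):
    """Return the set of ids reachable from roots in heap {id: [referenced ids]}."""
    seen = set()
    pending = list(roots)
    while pending:
        oid = pending.pop()
        if oid in seen or oid not in heap:
            continue
        seen.add(oid)
        pending.extend(heap[oid])
    return seen

def collect_via_snapshot(snapshot, snapshot_roots, current):
    """Find garbage in the frozen snapshot and free it from the current heap."""
    garbage = set(snapshot) - reachable(snapshot, snapshot_roots)
    for oid in garbage:
        current.pop(oid, None)
    return garbage

# Yesterday's snapshot: 1 -> 2 is live, while 3 <-> 4 is an unreachable ring.
snapshot = {1: [2], 2: [], 3: [4], 4: [3]}

# Today the program has moved on: 1 now points to a new object 5,
# and object 2 has become garbage after the snapshot was taken.
current = {1: [5], 2: [], 3: [4], 4: [3], 5: []}

garbage = collect_via_snapshot(snapshot, [1], current)
print(sorted(garbage), sorted(current))
```

The traversal never touches the running heap, so it can take as long as it likes; object 2, which became garbage only after the snapshot, simply waits for the next snapshot.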

Hence the obvious solution. In Phantom there are two garbage collectors. For now there is one; the second is in development.

The first garbage collector is based on reference counts. It is very dumb, but it has two important properties. First, it always works briefly: a very small, predictable time is needed to check that the link count has become zero. It became zero, we killed the local object. Second, and I also talked about this, statistics tell us that most objects in the system are created for a short time and have one link. Therefore, the refcount garbage collector kills short-lived local objects very quickly, and so the main memory turnover in the system is handled well by the refcount garbage collector. A number of long-lived objects with cycles gradually accumulate; that is, they remain, this garbage collector does not collect them, and they get into the snapshot. After the snapshot has been taken, the long garbage collector is started, which slowly and calmly walks through the photograph of the system that has already been taken, finds the garbage in it, and destroys it.
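A minimal sketch of such a reference-counting collector, assuming a toy heap of named objects (the class `RefCountHeap` and its methods are invented for illustration and are not Phantom's API). It shows both properties from the paragraph above: a short-lived object with one link is freed the moment the link is dropped, while a cycle keeps its counts above zero and is left for the slow collector.

```python
class RefCountHeap:
    def __init__(self):
        self.count = {}   # object id -> number of incoming links
        self.refs = {}    # object id -> ids it points to

    def alloc(self, obj):
        self.count[obj] = 0
        self.refs[obj] = []

    def add_ref(self, src, dst):
        # src is None when the link is held by a root (e.g. a stack slot)
        if src is not None:
            self.refs[src].append(dst)
        self.count[dst] += 1

    def drop_ref(self, src, dst):
        if src is not None:
            self.refs[src].remove(dst)
        self._dec(dst)

    def _dec(self, obj):
        self.count[obj] -= 1
        if self.count[obj] == 0:
            # Free the object and release everything it pointed to.
            children = self.refs.pop(obj)
            del self.count[obj]
            for child in children:
                self._dec(child)

heap = RefCountHeap()

# A short-lived object with a single link: freed as soon as the link goes.
heap.alloc("tmp")
heap.add_ref(None, "tmp")
heap.drop_ref(None, "tmp")

# A cycle x <-> y: after the root link is dropped, both counts stay at 1.
heap.alloc("x")
heap.alloc("y")
heap.add_ref(None, "x")
heap.add_ref("x", "y")
heap.add_ref("y", "x")
heap.drop_ref(None, "x")
leaked = sorted(heap.count)   # the cycle survives until the snapshot collector
```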

Why am I actually not afraid of this? When I described this to people who work on garbage collectors, they asked me: are you not afraid that there will be a lot of uncollected objects and that they will fill all the memory? No, I am not afraid. Because they fill virtual memory. Well, firstly, we have disk space to burn, and secondly, well, we will spend a megabyte on it, or two, or even a hundred megabytes; on a modern disk, generally speaking, that is completely unnoticeable. So the problem is more or less solved.

There are also a number of problems that arise precisely in this type of system; they cannot occur in Linux. For example, how does a system call work in a normal operating system? The program runs, it wants to do a fileopen, it triggers an interrupt, gets into the kernel... a bad example. Let's say it reads from a socket. It gets into the kernel, starts reading from the socket, and there is nothing in the socket! What happens? It blocks! That is, the program went into the system call and fell asleep there. When the data appears, it wakes up, returns to its code, and goes on.

This cannot be done in Phantom. Why? Because of how Phantom's memory is organized. There is a non-swappable kernel, and beyond it, above, begins the part of memory that is persistent; all our objects live in it...

If a thread has gone into the kernel, then you cannot take a snapshot. Why? Because the kernel can change: at the next start we may boot another kernel, an upgraded one, and even the same old kernel, once started again, is in a different state. And if we try to bring up a snapshot in which the code is inside the kernel, then most likely it will simply blow up.

Therefore, the rule of development in Phantom states that no system call may block. You went in, you took what you needed, you came back.

Problem! After all, sometimes it still needs to block. Take a completely dumb system call like sleep: I want to sleep for a moment or two, and something needs to be done about it. It is resolved as follows. If a thread wants to do something that can block it, the operation is divided into two calls. In the first call it enters the kernel and returns immediately, but on returning it immediately falls asleep up there, on its own side, in a state that is normal and complete, one that can be photographed. And when the kernel has done something important for it and wants to tell it about this, the kernel wakes it up, and in the next system call the thread picks up the result. Such is this tricky model.
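The two-call scheme can be illustrated with a small simulation. Everything here is hypothetical (the names `Kernel`, `Thread`, `sys_sleep_start` and `sys_sleep_finish` are invented, and time is a simple tick counter); the point is only that neither call ever waits inside the kernel, so between calls the thread's state lives entirely on its own, snapshottable side. What happens to pending requests across a kernel restart is not modelled.

```python
class Thread:
    """Hypothetical persistent thread state; all of it can be snapshotted."""
    def __init__(self, name):
        self.name = name
        self.state = "running"   # "running" or "waiting"

class Kernel:
    def __init__(self):
        self.now = 0
        self.pending = {}   # thread name -> (wake time, thread)
        self.results = {}   # thread name -> value to hand over

    def sys_sleep_start(self, thread, ticks):
        # First call: register the request and return at once, never block.
        self.pending[thread.name] = (self.now + ticks, thread)

    def sys_sleep_finish(self, thread):
        # Second call: also non-blocking, it only picks up the result.
        return self.results.pop(thread.name)

    def tick(self):
        # Kernel-side work; wakes every thread whose time has come.
        self.now += 1
        for name, (wake_at, thread) in list(self.pending.items()):
            if wake_at <= self.now:
                del self.pending[name]
                self.results[name] = self.now
                thread.state = "running"

kernel = Kernel()
worker = Thread("worker")

kernel.sys_sleep_start(worker, 2)   # returns immediately
worker.state = "waiting"            # falls asleep outside the kernel

kernel.tick()
state_after_one_tick = worker.state   # still waiting; a snapshot is safe here
kernel.tick()                         # the kernel wakes the thread
woke_at = kernel.sys_sleep_finish(worker)
```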

What else is non-trivial? Well, in fact, by and large, all the non-triviality is on the other side, on the kernel side. For a programmer, Phantom is a very simple thing, a system like any other, only programs in it live forever. Actually, I put that incorrectly, because it evokes the wrong associations. Data lives forever! A program can be killed and restarted; it is an ordinary thread that behaves in the same way. People constantly ask me: if a program lives forever and it breaks, how can it be killed? Well, just kill it. Take it and kill it. The same kill command, or a click on the cross in the corner. In this respect, there is no difference. It is just that the data it used remains. When the process is stopped, all of that stays.

Among the more or less non-trivial ideas that arose during development was a desire to support interaction over the network more or less transparently, because everything I have described so far lives within one machine. Now, since we have built such an object environment, it would be nice to do something with it across the network: remote invocation of objects, and preferably migration as well. And very non-trivial situations emerge there too. If we hold a link to an object on another machine, then a very nasty thing can happen. Maybe it will ??? objects across different machines. From the point of view of each machine, the object seems to be needed, because it is linked to someone outside, but in total this whole bunch of objects only refer to each other, and nobody needs them. Because of this, for this kind of environment you need distributed garbage collection, which can collect garbage on more than one machine. And this scared me pretty badly: well, suppose Phantom conquers the whole world, Phantoms are everywhere, they are all connected to each other by machine-to-machine links all over the world, there is a crazy amount of network garbage that cannot be killed, the entire planet is flooded with it, and I am to blame.

It turned out that there are actually quite simple algorithms for network garbage collection. What is important is that they not only let you do network garbage collection, but let you do it incrementally, in simple steps that can be performed separately. That is, it is not necessary to collect all the garbage on the entire planet at once; it can be done in small pieces, on separate machines, and in a very simple way. If there is a group of objects on a machine that nothing on that machine refers to, and that only point outside with their links, then you take them and move them to the machine to which most of their links go. First, this is efficient: if they are alive and needed, then that is where they belong. And second, in the course of such iterations they will eventually all gather on the same machine, where an ordinary garbage collector will kill them.
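The migration step described above can be sketched as a toy model. This is an assumption-laden illustration, not the algorithm from any particular paper or from Phantom: the `Cluster` class, its method names, and the rule that a local collector treats remotely referenced objects as roots are all choices made for this example. Machines are processed one at a time, and a fixed number of rounds is used instead of a real termination condition.

```python
from collections import Counter

class Cluster:
    def __init__(self):
        self.refs = {}      # object id -> ids it links to
        self.home = {}      # object id -> machine name
        self.roots = set()  # objects held by a real local root

    def add(self, obj, machine, refs=(), root=False):
        self.refs[obj] = list(refs)
        self.home[obj] = machine
        if root:
            self.roots.add(obj)

    def _on(self, machine):
        return {o for o, m in self.home.items() if m == machine}

    def _reach(self, machine, start):
        """Objects on `machine` reachable from `start` via local links only."""
        seen, stack = set(), list(start)
        while stack:
            obj = stack.pop()
            if obj in seen or self.home.get(obj) != machine:
                continue
            seen.add(obj)
            stack.extend(self.refs[obj])
        return seen

    def _groups(self, objs):
        """Split objs into groups connected by links among themselves."""
        near = {o: set() for o in objs}
        for o in objs:
            for d in self.refs[o]:
                if d in objs:
                    near[o].add(d)
                    near[d].add(o)
        groups, seen = [], set()
        for o in objs:
            if o in seen:
                continue
            group, stack = set(), [o]
            while stack:
                x = stack.pop()
                if x not in group:
                    group.add(x)
                    stack.extend(near[x])
            seen |= group
            groups.append(group)
        return groups

    def migrate_step(self, machine):
        """Move each locally unrooted group to where most of its links go."""
        local = self._on(machine)
        unrooted = local - self._reach(machine, self.roots & local)
        for group in self._groups(unrooted):
            targets = Counter(self.home[d] for o in group
                              for d in self.refs[o] if self.home[d] != machine)
            if targets:
                dest = targets.most_common(1)[0][0]
                for obj in group:
                    self.home[obj] = dest

    def local_gc(self, machine):
        """Ordinary local pass; remotely referenced objects count as roots."""
        local = self._on(machine)
        held = {d for o, ds in self.refs.items() if self.home[o] != machine
                for d in ds if d in local}
        dead = local - self._reach(machine, (self.roots & local) | held)
        for obj in dead:
            del self.refs[obj], self.home[obj]
        return dead

cluster = Cluster()
cluster.add("app", "m1", refs=["cfg"], root=True)
cluster.add("cfg", "m2")              # alive: held remotely by app
cluster.add("a", "m1", refs=["b"])    # a -> b -> c -> a is a garbage cycle
cluster.add("b", "m2", refs=["c"])    # spread over three machines
cluster.add("c", "m3", refs=["a"])

collected = set()
for _ in range(3):
    for machine in ("m1", "m2", "m3"):
        cluster.migrate_step(machine)
        collected |= cluster.local_gc(machine)
```

In this run the cycle drifts from `m1` to `m2` to `m3`, ends up entirely on `m3` with no links from outside, and is removed there by the plain local pass, while `cfg` stays where it is because `app` still refers to it.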

Questions

There were many questions, so in order not to bloat the Habr post, we decided not to copy them here. All of them are available here .

Colleagues, please note that Dmitry Zavalishin will speak this coming Friday (April 29) at Application Developer Days in St. Petersburg. You have a great opportunity to talk to him in person. Join us!

Notes

The stenographer was one of those who raised their hands (that was at RIT-2010): I wrote both OLE objects and an OLE object that allowed VB scripts to run via AXHost, and those scripts used other OLE objects... however, that was a really long time ago, somewhere around 1997.
See our video screencast recording; everything is visible there.
Report page on the conference website .

Source: https://habr.com/ru/post/118088/
