The first versions of the L4 microkernel were so small that they could fit entirely in the cache of modern processors. This probably gave rise to the myth about the L4 microkernel: "It is fast because it is small." You can still hear this claim quite often. So can a microkernel be placed inside the processor itself, and if so, how?
To answer the question of how to put a microkernel on a chip, we first need to picture what the L4 microkernel is and what functions it performs. It is safe to say that L4 rests on three pillars:
- All interaction between tasks is message-based, including hardware interrupts and internal exceptions.
- Messages are synchronous, and only synchronous: both tasks take part in every message transfer.
- Universal virtual memory pages replace traditional virtual pages.
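To make the second pillar concrete, here is a minimal behavioral sketch of a synchronous rendezvous. It is not the L4 API: the `Task` class, the `State` values and the `send`/`receive` functions are hypothetical names chosen for illustration. The only point it shows is that a message moves only when both the sender and the receiver have arrived; whichever side arrives first blocks.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    RUNNING = auto()
    SENDING = auto()    # blocked until the receiver is ready
    RECEIVING = auto()  # blocked until the sender arrives


@dataclass
class Task:
    name: str
    state: State = State.RUNNING
    partner: Optional["Task"] = None  # task we are blocked on, if any
    outbox: Optional[str] = None      # message held while blocked in send
    inbox: Optional[str] = None       # last message delivered to this task


def send(src: Task, dst: Task, msg: str) -> bool:
    """Deliver msg if dst is already waiting for src; otherwise block src."""
    waiting_for_src = dst.partner is None or dst.partner is src
    if dst.state is State.RECEIVING and waiting_for_src:
        dst.inbox, dst.state, dst.partner = msg, State.RUNNING, None
        return True
    src.state, src.partner, src.outbox = State.SENDING, dst, msg
    return False


def receive(dst: Task, src: Task) -> bool:
    """Take a message if src is already blocked sending to dst; otherwise block dst."""
    if src.state is State.SENDING and src.partner is dst:
        dst.inbox, src.outbox = src.outbox, None
        src.state, src.partner = State.RUNNING, None
        return True
    dst.state, dst.partner = State.RECEIVING, src
    return False
```

Calling `send(a, b, "hi")` before `b` is receiving returns `False` and leaves `a` blocked; the later `receive(b, a)` completes the transfer and releases `a`. There is no queue anywhere in the model, which is what "only synchronous" means here.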
As a result, a document was born with a rather boring description of registers, algorithms and recommendations:
Formal description of the L4 hardware microkernel (L4_Hard_20130119.pdf, 1046 KB)
The document describes an extension of the microprocessor instruction set that provides hardware support for the L4 microkernel, revision X2, and compatible specifications. It relies on the following conventions:
- A task is a sequence of instructions processed by an execution unit.
- Each task has its own “Task register block” and is uniquely defined and described by it.
- A process is one or more tasks that share the same page table.
- A thread is a task that executes in the address space of a process.
- The scheduler is a functional unit of the microprocessor that extends the instruction set and lets tasks exchange synchronous messages.
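The relationship between tasks, processes and threads in these conventions can be sketched as a simple grouping. This is only an illustration: the field `page_table_base` is an assumed stand-in for whatever register in the task register block identifies the page table, and `group_into_processes` is a hypothetical helper, not something defined in the document.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class TaskRegisterBlock:
    task_id: int
    page_table_base: int  # assumed field: identifies the task's page table


def group_into_processes(
    blocks: Iterable[TaskRegisterBlock],
) -> Dict[int, List[int]]:
    """Group task ids by shared page table; each group is one process."""
    processes: Dict[int, List[int]] = defaultdict(list)
    for block in blocks:
        processes[block.page_table_base].append(block.task_id)
    return dict(processes)
```

Under this reading, a process is not a separate hardware object: it is just the set of tasks whose register blocks point at the same page table, and each such task is a thread of that process.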
This document describes the implementation of an L4-X2 compatible hardware scheduler. The proposed version has been revised in light of several discussions:
- Discussion Digest “Domestic Microprocessors (2) (Part 2)”
- Hardware microkernel. Final discussion
- Discussion Digest “Domestic Microprocessors (2) (Part 3)”

[Figure: sketch of the task registers]

It is proposed to use a large register file divided into two parts. The first part holds an array in which each element is a block of task registers. The second part holds an array in which each element is a message buffer. Several global scheduler registers are added as well. A task switch is performed by switching which block of registers is connected to the execution unit. A message is transferred by switching the message buffer over from the sending task to the receiving task.
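The layout described above can be modeled in a few lines of software. This is a rough sketch under stated assumptions, not the register layout from the document: the names `TaskBlock`, `RegisterFile`, `switch_to` and `transfer_message`, the eight registers per block, and the single `current_task` scheduler register are all invented for illustration. It also assumes the receiver holds no buffer at the moment of transfer and leaves free-buffer management out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaskBlock:
    # Per-task registers; the count of 8 is an arbitrary assumption.
    regs: List[int] = field(default_factory=lambda: [0] * 8)
    # Index of the message buffer this task currently owns, if any.
    msg_buffer: Optional[int] = None


@dataclass
class RegisterFile:
    task_blocks: List[TaskBlock]   # part 1: one register block per task
    msg_buffers: List[List[int]]   # part 2: array of message buffers
    current_task: int = 0          # assumed global scheduler register

    def switch_to(self, task_id: int) -> None:
        """Task switch: select another block; no register contents are copied."""
        if not 0 <= task_id < len(self.task_blocks):
            raise IndexError("no such task block")
        self.current_task = task_id

    def transfer_message(self, src: int, dst: int) -> None:
        """Message transfer: hand the sender's buffer index to the receiver."""
        buf = self.task_blocks[src].msg_buffer
        if buf is None:
            raise ValueError("source task owns no message buffer")
        self.task_blocks[dst].msg_buffer = buf
        self.task_blocks[src].msg_buffer = None
```

The design point the sketch tries to capture is that neither operation copies payload data: a task switch changes one index, and a message transfer moves ownership of a buffer rather than its contents.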
I hope the document will interest the community. Ideas that come out of the discussion will be included in the next version of the document. Enjoy reading!