
Writing a toy OS (about the scheduler)



In the last post I wrote about how to achieve the ability to implement interrupt handlers on C. Now, using previously written macros, you can implement a simple SMP scheduler. It will provide the minimum possible functionality, on the basis of which in the future it will be necessary to build various add-ins, in particular, synchronization primitives (for example, a mutex). Again, a beautiful modular structure does not contribute to high performance, but beauty, as is known, will save the world, so we give it preference.
In the last post I described how to make it possible to write interrupt handlers in C. Now, using the macros written earlier, we can implement a simple SMP scheduler. It will provide the minimum possible functionality, on top of which various extensions can later be built, in particular synchronization primitives (a mutex, for example). Once again, a clean modular structure does not help performance, but beauty, as we know, will save the world, so we give it preference.

Let's try to formulate the requirements for our scheduler. We need the ability to create a thread, specifying its stack, the mask of allowed logical processors (affinity), the base priority, and the function to execute. After creation, a thread can be started, paused, resumed, and finally terminated.
In addition, it would be great if the scheduler did not allocate memory itself, but instead accepted, and later returned, thread memory allocated by someone else. On the one hand, this gives the flexibility to reserve thread memory in arbitrary ways. On the other hand, it opens up a unique possibility: saving a thread to external storage (a hard disk, for example) and later loading it back and resuming it from where it was interrupted.

So, the above requirements resulted in the following scheduler interface:

// IN fields should be set before calling attach_thread
struct thread_data {
  INTERNAL uint64_t magic;
  INTERNAL struct thread_data *prev, *next, *all_prev, *all_next;
  IN struct int_stack_frame context;
  IN uint8_t *stack;
  IN size_t stack_size; // size available to thread: stack_size - 8
  IN uint64_t affinity[THREAD_AFFINITY_SIZE];
  OUT uint64_t output;
  OUT uint64_t run_time;
  IN uint8_t priority;
  INTERNAL uint8_t real_priority;
  INTERNAL uint16_t quantum;
  INTERNAL uint8_t cpu;
  INTERNAL uint8_t state : 2;
  IN uint8_t fixed_priority : 1;
};

typedef uint64_t thread_id;
typedef uint64_t (*thread_proc)(uint64_t input);

// set stack and stack_size fields before calling this function
err_code set_thread_context(struct thread_data *thread, thread_proc proc, uint64_t input);
// attached thread becomes paused; thread should not reside in stack!
err_code attach_thread(struct thread_data *thread, thread_id *id);
err_code detach_thread(thread_id id, struct thread_data **thread);
err_code resume_thread(thread_id id);
err_code pause_thread(thread_id id);
err_code stop_thread(thread_id id, uint64_t output);
thread_id get_thread(void);

Here is some sample code that starts a new thread:

static uint64_t my_thread_proc(uint64_t input) {
  uint64_t output = 0;
  // do some work
  return output;
}

#define STACK_SIZE 0x1000

static void start_thread(void) {
  thread_id id;
  struct thread_data *thrd = kmalloc(sizeof(*thrd) + STACK_SIZE);
  if (!thrd) {
    LOG_ERROR("Failed to allocate memory");
    return;
  }
  memset(thrd, 0, sizeof(*thrd));
  thrd->stack = (uint8_t*)(thrd + 1); // stack placed right after the control block
  thrd->stack_size = STACK_SIZE;
  thrd->affinity[0] = 1; // will run on first CPU only
  thrd->priority = 3;
  thrd->fixed_priority = true;
  set_thread_context(thrd, my_thread_proc, 0);
  if (attach_thread(thrd, &id) != ERR_NONE) {
    LOG_ERROR("Failed to attach thread");
    return;
  }
  if (resume_thread(id) != ERR_NONE) {
    LOG_ERROR("Failed to resume thread");
    return;
  }
}
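As a bonus, here is a rough sketch (not the post's actual code) of the serialization scenario mentioned earlier: pausing a thread, detaching it, and writing it out to external storage. The disk_write() helper is hypothetical, and we assume the control block and stack were allocated as one contiguous block, exactly as in start_thread() above:

static err_code freeze_thread(thread_id id, uint64_t disk_addr) {
  struct thread_data *thrd;
  err_code err = pause_thread(id);
  if (err != ERR_NONE)
    return err;
  err = detach_thread(id, &thrd); // the scheduler hands the memory back
  if (err != ERR_NONE)
    return err;
  // context, stack and priorities all travel in one contiguous block
  return disk_write(disk_addr, thrd, sizeof(*thrd) + thrd->stack_size);
}

When loading it back, the stack field would of course have to be re-pointed into the freshly read block before calling attach_thread and resume_thread again.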

Now is the time to look under the hood.

Our scheduler consists of the following parts:
1. A list of all attached threads. Protected by its own spinlock.
2. A list of inactive threads, i.e. paused and stopped ones. Protected by its own spinlock.
3. A per-CPU work area (one for each logical processor). It contains the priority queues, as well as a special task structure (more on it later). Each work area is also protected by its own spinlock. A rough sketch of these structures follows.
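Here is a minimal sketch of how this global state might be laid out. All names here (cpu_workspace, MAX_CPUS, PRIORITY_LEVELS and so on) are illustrative guesses, not identifiers from the actual source:

#define MAX_CPUS 64
#define PRIORITY_LEVELS 8                 // illustrative; the real count may differ

struct cpu_task {                         // request slot for cross-CPU operations
  volatile bool done;                     // set by the target CPU when it has answered
  err_code error;                         // the target CPU's answer
  struct thread_data *thread;             // request payload, e.g. a thread to enqueue
};

struct cpu_workspace {                    // one per logical processor
  spinlock_t lock;                        // acquiring it also disables local interrupts
  struct thread_data *queues[PRIORITY_LEVELS]; // the priority queues
  struct thread_data *current;            // thread currently running on this CPU
  size_t thread_count;                    // number of threads scheduled here
  struct cpu_task task;
};

static spinlock_t all_threads_lock;
static struct thread_data *all_threads;       // linked via all_prev/all_next
static spinlock_t inactive_threads_lock;
static struct thread_data *inactive_threads;  // linked via prev/next
static struct cpu_workspace workspaces[MAX_CPUS];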



Let's look at how the scheduler works, taking the launch of a new thread as an example. First of all, the spinlock of the inactive thread list is acquired (we are about to remove an element from it). This spinlock not only protects the list from inconsistent modification by different threads, it also disables interrupts, guaranteeing that the operation is atomic on this processor. Once the thread has been extracted, we look for the processor with the smallest number of scheduled threads among those allowed by the affinity mask (a simple form of load balancing).
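That load-balancing step could look roughly like this. The workspaces array, its thread_count field, and cpu_count() are hypothetical names carried over from the sketch above:

static int pick_target_cpu(const struct thread_data *thrd) {
  int best = -1;
  uint64_t best_load = UINT64_MAX;
  for (int cpu = 0; cpu < cpu_count(); cpu++) {
    // skip CPUs not allowed by the affinity bitmask (64 bits per word)
    if (!(thrd->affinity[cpu / 64] & (1ULL << (cpu % 64))))
      continue;
    uint64_t load = workspaces[cpu].thread_count;
    if (load < best_load) {
      best_load = load;
      best = cpu;
    }
  }
  return best; // -1 if the affinity mask excludes every CPU
}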

Next, the thread has to be added to the work area of the target processor. Doing that directly won't work (even under the work area's spinlock), because a timer interrupt can fire on the target at any moment and update its priority queues; the spinlock disables interrupts only on the processor holding it. It is therefore better to delegate the insertion to the target processor itself, via an interrupt reserved specifically for such requests.

In the general case, the task structure of the target processor is filled in (after first acquiring the spinlock of its work area), then an interrupt is raised on the target processor, and a response is awaited in the same task structure. When the target processor is the local one, no interrupt is needed (after all, the spinlock has already disabled local interrupts), so the interrupt handler is simply called as an ordinary function.
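A hedged sketch of this cross-CPU request path follows. All helper names here (acquire_spinlock, release_spinlock, send_ipi, get_cpu, cpu_pause, TASK_INT_VECTOR) are assumptions for illustration, not the post's actual API. Note that the target's handler works on the task slot directly, without taking the lock the requester is already holding:

static err_code run_on_cpu(int cpu, const struct cpu_task *request) {
  struct cpu_workspace *ws = &workspaces[cpu];
  acquire_spinlock(&ws->lock);      // also disables local interrupts
  ws->task = *request;
  ws->task.done = false;
  if (cpu == get_cpu()) {
    handle_task_interrupt();        // local case: plain function call
  } else {
    send_ipi(cpu, TASK_INT_VECTOR); // the dedicated interrupt
    while (!ws->task.done)          // wait for the target's response
      cpu_pause();                  // spin-wait hint to the CPU
  }
  err_code err = ws->task.error;
  release_spinlock(&ws->lock);
  return err;
}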

Once the thread has been added to the target processor's work area, both spinlocks (the work area's and the inactive list's) are released. Our thread now sits in a scheduling queue and will sooner or later be run from one of the timer interrupts (exactly when depends on its current priority, as well as on the priorities of the other threads on that processor).
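Purely for illustration, a timer tick in a design like this might look roughly as follows; the queue helpers and switch_to() are invented names, and the real handler lives in the post's source:

static void on_timer_tick(void) {
  struct cpu_workspace *ws = &workspaces[get_cpu()];
  struct thread_data *curr = ws->current;
  if (curr && curr->quantum && --curr->quantum)
    return;                          // quantum not exhausted, keep running
  if (curr)
    enqueue_by_priority(ws, curr);   // back into its priority queue
  struct thread_data *next = dequeue_highest_priority(ws);
  if (next && next != curr)
    switch_to(next);                 // swaps the saved int_stack_frame contexts
}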

The presentation got a bit muddled and many things remained behind the scenes, but that's enough for today. If anyone has questions, I'll go into more detail in the comments. And yes, you can always read the source.

Source: https://habr.com/ru/post/177403/

