📜 ⬆️ ⬇️

Light streams in Embox


Today, as promised, I will continue the theme of planning light entities, which I have already begun in my series of articles. In it, I talked about the internal structure of the tasklet , workqueue and protothread . Of course, the topic is not limited to these examples: there is also FreeRTOS with its coroutine , or GNU Portable threads ; or you can move away from the structures and libraries used in the OS, and recall the various green threads, which are becoming more and more.

This time I want to share how we implemented light streams in the Embox project. On the one hand, we tried to take into account the experience of previous developments, on the other - to bring something new.

How it all began


The problem came to us from different sides.

Threads require their stack, consuming a relatively large amount of RAM. Cooperative multitasking was carried out only by turning off the timer, which is responsible for calling the scheduler when the time slice expires. What did it lead to? On boards with limited resources, sometimes you have to completely abandon multitasking.
')
In addition, more and more tasks appeared for which it would be nice to have some kind of event handling mechanism. It is desirable with priorities and synchronization primitives, and that all this is processed quickly and does not require extra memory.

This made us think about lightweight, almost cooperative multitasking for some chainless entities. But at the same time no one was going to give up full-fledged multitasking.

Therefore, we decided not only to introduce lightweight planning units into the system, but also to entrust the work of dispatching these light streams to the main system scheduler on general rights with full flows. Potentially, we have seen the following benefits:

So, what we dreamed about:


To reduce the connectivity of the system, the scheduler will now manage the abstract planning units.

Schedee, or abstract planning unit


The idea is, in fact, quite simple, especially when viewed in terms of OOP. Take a look at the chart:

Schedee is the parent abstract class of various types of threads. Each stream type must provide some kind of interface for managing it. The scheduler knows nothing about specific implementations, and in its function it only determines who will be executed next, does some preparation and delegates processing to a specific schedee, for example, by calling the special function doSomethng ().

Next, I will talk about how we applied this idea in our planner.

In the C language, inheritance, unfortunately, has to be realized by hands, namely, on the basis of structure as a member of another, more specific structure. We will have an abstract schedee structure, from which the concrete thread and lthread structure types are inherited and provide the implementation of the necessary “abstract” functions.

We have already implemented the usual POSIX-compatible threads and the central scheduler that manages them. Therefore, the expansion of the functionality of the scheduler occurred with an eye on the already existing implementation.

The first thing we needed to do was to select from the structure of ordinary threads a part that the scheduler manages. Based on this, we construct the abstract structure of schedee.

It turned out to be quite simple to select common fields from the structure. But do you need something else? To answer this question, it is necessary, first, to understand how the types of flows can differ fundamentally, and, second, to deal with the work of the scheduler.

Flows can be:

Let's see now how the planner was arranged. The code is simplified for better understanding.
void schedule(void) { struct thread *prev, *next; ipl_t ipl; prev = thread_self(); /*    */ ipl = ipl_save(); if (!prev->waiting) runq_enqueue(&rq.queue, prev); next = runq_extract(&rq.queue); if (prev != next) //  ,  sched_switch(prev, next); //   . ipl_restore(ipl); } 

The scheduler is arranged quite simply: it determines which thread will be executed next, and if the thread has changed, it switches the context. After that, the new thread already naturally leaves the scheduler function and continues its work. That is, the scheduler is sharpened to work only with stack threads.

In order to abstract away from specific implementations of a schedee, specific processing needs to be put into a special function, let's call it process (). A pointer to this function will be in the schedee.

Now you need to understand how to handle threads without a stack. What do we know about them?
Such streams are not supplanted. That is, there should be no implicit rescheduling calls;
For chainless streams, there is no need to switch the context. In fact, there are different approaches, in Linux, for example, a separate stack is allocated for the sequential processing of softirq. We have so far stopped at the variant of processing such threads on the stack of the last executed full thread.

Thus, if for regular flows it is necessary to leave the scheduler function, then it will be expedient to process loopless flows in a loop directly in the scheduler, which, by the way, is a fairly common approach. The loop ends as soon as a thread arrives at the stack.
 void schedule(int preempt) { struct schedee *prev, *next, *last; prev = schedee_get_current(); /*   */ ipl_save(); if (!prev->waiting) runq_enqueue(&rq.queue, prev); while (1) { next = runq_extract(&rq.queue); /*     process() */ last = next->process(prev, next); /* process()  ,     . *    ,     , *  . */ if (last) break; /*   */ ipl_save(); } } 

Encapsulation is a good principle, even if we are dealing with procedural programming. So all the scheduler code should know only about the schedee, but not about the specific threads. Here is the complete code for the schedee structure:
 struct schedee { runq_item_t runq_link; /*    runq */ spinlock_t lock; /*     SMP */ /*  -,      *   schedee.    ,    *    ,    .. *   schedee,      * .  schedee ,    NULL. */ struct schedee *(*process)(struct schedee *prev, struct schedee *next); /*          . */ unsigned int active; unsigned int ready; unsigned int waiting; /*   */ struct affinity affinity; struct sched_timing sched_timing; /*    runq */ struct schedee_priority priority; /*    waitq.  ,   ,   . */ struct waitq_link waitq_link; }; 

And finally, what the process () function of ordinary threads looks like:
 struct schedee *thread_process(struct schedee *prev, struct schedee *next) { struct thread *next_t, *prev_t; next_t = mcast_out(next, struct thread, schedee); prev_t = mcast_out(prev, struct thread, schedee); if (prev != next) { thread_context_switch(prev_t, next_t); } ipl_enable(); if (!prev_t->siglock) { thread_signal_handle(); } return &thread_self()->schedee; } 

Light streams


Consider now the light streams of Embox, for which everything was started. Our light streams are chainless, cooperative and support multiple entry points.

In one of the tasks, it was necessary to process information constantly coming from an external source, moreover, with synchronization. Therefore, the question was originally about adequate support for synchronization, which in fact rests on the skill:
  1. first of all, for example, to capture a mutex; in the general case, wait for any condition (analog of conditional variables)
  2. in case of failure, return control to the scheduler in the middle of the function
  3. start from the right place when the thread is awakened, for example, when releasing the necessary mutex.

In the case of normal flows, everything is simple: when you sleep, a call to the scheduler is requested and rescheduled. The thread simply calls, for example, the sleep () function. Items 1 and 3 are implemented by keeping the stack and context safe. With chainless streams everything is more complicated:


To combat the last item, we decided to follow the path of coroutines, like Adam Dunkels in our protothread library. But, unlike Adam, we decided not to hide the logic behind cunning macros, but to make the interface as explicit as possible. Our solution is based on the extension of GCC Labels as Values . This extension allows you to get links to tags, as well as calculate the indentation from a particular tag, using the usual manipulations with pointers.
Light streams and their structure

To understand everything in more detail, consider the details of the implementation. First, the structure of lthread itself:
 struct lthread { struct schedee schedee; int (*run)(struct lthread *); int label_offset; struct schedee *joining; struct sched_wait_info info; }; 

The first thing to clarify is the signature of the run function. Since light streams do not have a stack, it is assumed that the structure of light streams will often be a member of another structure containing information that must be kept from the call to call. Therefore, it is wise to use the reference to the lightweight itself as the main argument, and the rest of the information to be addressed using type conversion.

The return value of the run function is the entry point information. If the input point of the function is assumed to be one or when the restart is used, the initial input point should be used, then the return value will be 0. Otherwise, the return value is specified by the lthread_yield () macro. I will describe this and other macros below.

The label_offset field is indented from the initial function label. This is a utility field for storing the input stream point. Working with different input points is done using the lthread_resume () and lthread_yield () macros.

The joining field is a reference to the schedee of a thread waiting for a light thread to complete, similar to normal threads. More about this will be below.

And finally, info is a service field with the necessary information for waiting by timeout.

Light Stream API

The basic light thread API is pretty simple. The initialization and launch functions hardly require explanation:
 void lthread_init(struct lthread *lt, int (*run)(struct lthread *)); void lthread_launch(struct lthread *lt); 

When the application has completed, you need to make sure that the light flows are over and no one will wake them up. If this is not the case, then it is not safe to terminate the application: the wake-up of a light stream can occur when its structure is irrelevant. Therefore, you must first wait for the flow to be in the desired state. This is similar to the thread_join () function. We need its analogue for light streams:
 extern void lthread_join(struct lthread *lt); 

Now this function has one drawback: it can only be called from a thread with a stack. In fact, it can be expanded to the general case, if necessary.

There is another important note regarding this feature. In order to understand it, let's digress a little and pay attention to the life cycle of the light flow. A light stream can only be in two states: ready and waiting. Ready light threads are in the queue of the runq scheduler, and while it is in this state, you cannot add it to the queue a second time. However, as soon as the ready flow comes from the queue, it enters a waiting state, and now it can be re-planned, including this it can do. For example, tasklets in Linux behave in the same way. If the light thread plans itself (in fact, never ends), then the thread that caused lthread_join () will simply hang. The user himself is responsible for ensuring that the easy flow does not plan for himself in this case.

Consider the part of the API that provides label management, that is, multiple entry points. Important note: working with labels is performed only in the run () function itself, but not in the functions it calls. Unfortunately, there is no way out of these limitations without a stack.

Macro that returns a previously saved label:
 /*  : goto lthread_resume(lt, start); */ #define lthread_resume(lt, initial_label) \ *((lt)->label_offset + (initial_label)) 

A macro that returns the indentation relative to the initial label:
 /*  : return lthread_yield(lt, start, resume_point); */ #define lthread_yield(initial_label, target_label) \ (target_label) - (initial_label); 

The lthread_resume () macro should always be paired with the goto statement, and lthread_yield () with the return statement.

All wait and sync functions for light streams work on the same principle. Consider the example of a simple wait timeout macro:
 SCHED_WAIT_TIMEOUT_LTHREAD(self, cond_expr, timeout); 

This function returns one of three possible codes:


Now consider the function of processing light threads, which is called from the scheduler:

 /* :    */ static struct schedee *lthread_process(struct schedee *prev, struct schedee *next) { struct lthread *lt = mcast_out(next, struct lthread, schedee); schedee_set_current(next); /*   ,         * . */ next->ready = false; next->waiting = true; /*   process()   . */ ipl_enable(); /*       .    * lthread_yield(). */ lt->label_offset =lt->run(lt); /*       ,  schedee,  *  . */ if (lt->joining && __lthread_is_disabled(lt)) sched_wakeup(lt->joining); return NULL; } 

Example


To demonstrate, consider a game Race, which works in two light streams: one is responsible for moving the road with obstacles, the other checks whether the control key is pressed.

All the code of the game is here , here I will give only the main function of the light flow, responsible for moving the road.
 static int move_road(struct lthread *self) { int to_wait; /*  lthread_resume()     . *    ,      , *    -   . */ goto lthread_resume(self, &&update); update: is_game_over |= is_obstacle(car_line_nr*RACE_ROAD_LEN + 1); if (is_game_over) return 0; race_update_road(); race_print_road(road); wait: to_wait = RACE_ROAD_UPD_MS - (step / RACE_LVL_STEP) * RACE_LVLUP_MS; if (SCHED_WAIT_TIMEOUT_LTHREAD(self, 0, to_wait) == -EAGAIN) { /*        0,   *   ,     * lthread_yield()  ,       *  . */ return lthread_yield(&&update, &&wait); } goto update; } 

This application was written to demonstrate the work of the scheduler and light streams on the STM32VLDISCOVERY board with an wh1602b LCD screen. In fact, the characteristics of STM32VL allow you to run Embox with this game and with preemptive streams: 8kb RAM and 128kb flash. Therefore, I will give a description of the Embox images for the board with the game on light streams and ordinary streams:
cooperativecombined
text Os / O228724/3150430152/33116
data436436
bss31683102
Main stack + thread stack1536 + 01078 + 1782 * 2

Finally, a video with the process of the game.


Conclusion


All the code I gave in the article can be found in the github repository of our project:

We still have work to do. Light flows are still young and, most likely, they will change and improve more than once. Already there are ideas and challenges:

Of course, there are a number of subsystems for which I want to introduce light flows. For example, tty driver, pnet subsystem.

Over time, all these problems will be solved. You can do it too :)

PS In St. Petersburg, every last Wednesday of the month pass Linux. I’ll tell you about light streams, how the workqueue and taklet of the Linux kernel are arranged, and maybe something else :)
There you can listen to other interesting performances. For information, see the SPbLUG mailing list . Meetings are like 19:00 at the address: 10th line of V.O. 33-35, Faculty of Geography, St. Petersburg State University.

Source: https://habr.com/ru/post/256565/


All Articles