Mybuild: a build system for modular applications

A recent article about a new build system for Qt reminded me of the situation our project was in several years ago: we, too, were looking for a suitable build system. The project is fairly complex and needs a flexible configuration system. As a result, we now use and develop our own build system, Mybuild.

If you would like to know what we built, and what kind of project needed its own build system, read on.

About the project

Our project is called Embox. It is a modular and configurable OS for embedded systems. Configurability is inherent in the very idea of the project, hence the desire for a flexible build system.

Initially the project was small (not that it is very large now), and hand-written makefiles were enough; all configuration options were set in them as well. As the project grew, we wanted to describe not just the source files to build but modules, and to be able to give them parameters, declare dependencies, and so on.

Another build system

As often happens, appetite comes with eating. The functionality (and the hacks that came with it) snowballed, maintaining the resulting build infrastructure became expensive and simply unpleasant, and at some point we decided to stop and look at ready-made solutions first.

A critique of Make and its derivatives can be found in mapron's article, which I mentioned at the beginning. I will add that in our case we also considered Kbuild, the build system used in the Linux kernel. Allow me a little criticism of it:

Build files and configuration files are separate, so a single feature has to be described in several places (Makefile + Kconfig).
Configuration parameters are exposed as #define directives, which sometimes leads to an "#ifdef nightmare" in the code.
There are no namespaces for options.

Of course, there are advantages:

Kbuild supports specifying dependencies between options.
There are several graphical (and pseudo-graphical) configuration tools.
Stable development and community support.

Still, at the time such a system seemed too complex for a relatively small project. Besides, we already had some groundwork of our own, so we decided to formulate requirements and try to implement our own build system.

So, we wanted the following:

Application modules are described in a simple, intuitive language, preferably together with their available configuration options.
A module description contains, where possible, all the information needed for its further use, including user documentation, available unit tests, and so on.
The build system does not pull in heavy dependencies such as a Python interpreter or a Java virtual machine.

Since we started with ordinary makefiles, the resulting build system is written in pure GNU Make.

A little about the implementation

When I said "in pure GNU Make", I was being a little sly. If you have ever tried to write something more complicated than the examples from the manual, you have probably noticed how poor the built-in language is. So the first thing we did was fight the limitations of the language. This topic really deserves a separate article in the "Abnormal Programming" hub; here I will touch only on the main points (perhaps they will come in handy in someone's projects).

Improving Make syntax

The Make language is line-based, so complex functions that span several lines have to use backslash continuations. Besides being inconvenient, this prevents comments inside a function, because Make has only single-line comments (starting with a hash sign and running to the end of the line).

Having removed this limitation, we can now write multi-line functions without backslashes, with comments inside functions and with code indentation (tabs or spaces). We also added language features such as lambda expressions, inlining of simple functions, and others. Here is how the list-reversal function, for example, can now be rewritten:

It was:

    reverse = \
    	$(call fold,,$1,__reverse_fold)

    # Called with the following args:
    #  1. An already reversed list part.
    #  2. Next element.
    __reverse_fold = \
    	$2 $1

It became:

    define reverse
    	# Start from the empty list.
    	$(fold ,$1,
    		# Prepend each new element ($2) to
    		# the result of previous computations.
    		$(lambda $2 $1))
    endef
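For readers less familiar with Make, the same fold-based reversal can be sketched in Python. This is only an analogy and not Mybuild code: `functools.reduce` stands in for the article's `fold`, and the name `reverse` simply mirrors the Make function.

```python
from functools import reduce


def reverse(items):
    """Reverse a list by folding over it, mirroring the Make version."""
    # Start from the empty list and prepend each new element
    # to the result of the previous computations.
    return reduce(lambda acc, item: [item] + acc, items, [])


print(reverse(["a", "b", "c"]))  # ['c', 'b', 'a']
```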

Adding OOP

Now that we can write more or less readable code, let's add another goodie. Make has no types; any data is represented as a string. However, every application needs to structure its data, so we implemented a set of macros for defining classes, along with functions for creating objects, calling methods, and so on. For example, calling the function greet in the following code prints "Privet, Habrahabr!".

    define class-Greeter
    	$(field greeting,
    		$(or $(value 1),Hello))

    	# Arg 1: who to greet.
    	$(method sayHello,
    		$(info $(get-field greeting), $1!))
    endef

    define greet
    	$(for greeter <- $(new Greeter,Privet),
    		$(invoke greeter->sayHello,Habrahabr)
    		$(greeter))# <- Return the instance.
    endef

After these two improvements, development went much faster, allowing us to tackle the logic of the build system itself.

Designing the syntax

First we had to choose a language for describing modules and configurations. Typically a new language is built as either an internal or an external DSL. An internal DSL is a subset of some general-purpose language, usually the one that will be used for interpretation. With GNU Make and its clumsy language this is not an option at all, which leaves only an external DSL, that is, an independent language for describing the build.

I will not beat around the bush: the resulting language strongly resembles Java. Personally, I like the Java syntax; it is verbose, but in many respects simple and easy to understand. As in Java, the Mybuild DSL has packages and imports, and a module description is similar to a class description. Files written in this language are called my-files (after their extension).

    /* Our first example */
    module HelloWorld {
    	source "hello.c"
    }

Building a language parser

Next we needed a parser for this language. Here, too, there are many options, from a hand-written parser (using, for example, recursive descent or a combinator library) to various parser generators. After several experiments we settled on the latter as the most general and therefore the most convenient approach, especially while the language is still being actively developed. As the generator we used GOLD Parser Builder (http://goldparser.org/): it has a simple grammar description language and a built-in debugger, and, most importantly, it allows flexible customization of the generated parser (in our case the parser is also implemented in Make).

The result of the parser is a parse tree.

Building an object model

We want to extract as much information from my-files as possible and to have easy access to it at every stage of the build. Clearly this requires some kind of internal representation, so the parse tree now has to be turned into a semantic model.

At about the same stage we started thinking about IDE support for the language. In our case the IDE is Eclipse, since more than half of the project's developers use it. To develop the plugin we used the Xtext framework, which can generate a full-fledged editor from a grammar, with syntax highlighting, autocompletion and the other comforts of a modern IDE. Xtext itself is based on EMF, a well-known modeling framework. This gave us the idea of using EMF for the build system itself.

Thus we get an EMF model that describes the structure of our DSL (Xtext kindly generated it for us). Now the model has to be turned into Make classes. This is where the Xpand project comes to the rescue (it is developed by the same company as Xtext); it lets us generate text from the model.

The final step is to write glue code that creates model objects for the relevant nodes of the parse tree.

Back to requirements

Dependencies

One of the first items in our requirements was the ability to declare dependencies between modules. This is needed primarily to simplify user configuration of the final application.

In a my-file, a dependency is declared as follows.

    module Foo {
    	depends Bar, Baz
    }

Now that we have a complete graph of all the modules described in the project, computing the closure of the subgraph of required modules is quite simple.
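As an illustration of what computing the closure means, here is a minimal Python sketch. It is not Mybuild's implementation (which is written in GNU Make); the function name `required_modules`, the dictionary representation of the graph and the `Unused` module are assumptions made for the example.

```python
def required_modules(depends, roots):
    """Return every module reachable from `roots` via `depends` edges."""
    closure = set()
    stack = list(roots)
    while stack:
        module = stack.pop()
        if module in closure:
            continue
        closure.add(module)
        # Modules without an entry are treated as having no dependencies.
        stack.extend(depends.get(module, ()))
    return closure


# Hypothetical graph; Foo matches the my-file example above.
depends = {
    "Foo": ["Bar", "Baz"],
    "Bar": ["Baz"],
    "Baz": [],
    "Unused": ["Foo"],
}
print(sorted(required_modules(depends, ["Foo"])))  # ['Bar', 'Baz', 'Foo']
```

Only modules reachable from the requested ones end up in the build; `Unused` depends on `Foo` but nothing requires it, so it is left out.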

For ease of development, Mybuild can visualize the module graph using Graphviz, for example for one of the simple Embox configurations.

Boot order at runtime

With complete knowledge of the system's modules and the dependencies between them, why not use it for something besides building the project? For example, this information can determine the order in which modules are loaded at runtime. After all, loading a module usually makes sense only after all of its dependencies have been loaded.

For this purpose, a specially generated C source file is included in the build; it statically defines the nodes and edges of the dependency graph. When the modules themselves are compiled, a module can be associated with an initialization function, which the module loader calls once all of the module's dependencies are resolved.
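The ordering idea can be sketched as a depth-first topological sort. In Embox this happens in C at runtime over the generated graph; the Python below is only an illustration, and the module names, `boot_order` and `boot` are hypothetical.

```python
def boot_order(depends):
    """Order modules so that each one comes after all of its dependencies."""
    order = []
    state = {}  # module -> "visiting" or "done"

    def visit(module):
        if state.get(module) == "done":
            return
        if state.get(module) == "visiting":
            raise ValueError(f"dependency cycle through {module}")
        state[module] = "visiting"
        for dep in depends.get(module, ()):
            visit(dep)
        state[module] = "done"
        order.append(module)

    for module in depends:
        visit(module)
    return order


def boot(depends, init_functions):
    """Call each module's init function (if it has one) in dependency order."""
    for module in boot_order(depends):
        init = init_functions.get(module)
        if init is not None:
            init()


# Hypothetical modules and init functions that just record the call order.
depends = {"Httpd": ["Net", "FileSystem"], "Net": ["Timer"],
           "FileSystem": [], "Timer": []}
started = []
init_functions = {name: (lambda name=name: started.append(name))
                  for name in depends}
boot(depends, init_functions)
print(started)  # ['Timer', 'Net', 'FileSystem', 'Httpd']
```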

Build Parameters and Options

The next task we solved was specifying parameters for specific modules. To describe a parameter, the option construct is used, and access to the parameter value can be obtained at compile time using a special macro.

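    module HelloWorld {
    	source "hello.c"
    	option string greeting = "Hello, world!"
    }

    #include <stdio.h>

    /* OPTION_STRING_GET is the macro supplied by the build system;
     * the header that declares it is omitted here. */
    int main(void) {
    	printf("%s\n", OPTION_STRING_GET(greeting));
    	return 0;
    }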

This particular option is not very exciting, but a module can now declare the parameters it needs and their default values, and these values can be overridden during configuration.

    configuration Main {
    	include HelloWorld(greeting = "Hello, Habrahabr!")
    }

Processing linker scripts

The next problem, somewhat specific to our project, was that we wanted to process not only source files (C and assembly) and headers but also other resources, in our case, for example, special linker scripts. In the previous version of the build system, the makefile for preprocessing an additional linker script and including it in the build looked like this (and similar code had to be written for every such linker script):

    $(IMAGE): $($_heap_lds)
    $($_heap_lds): $($_SELFDIR)/heap.lds.S $(AUTOCONF_DIR)/config.lds.h
    	@$(MKDIR) $(@D) \
    	&& $(CPP) -P -undef $(CPPFLAGS) \
    	-imacros $(AUTOCONF_DIR)/config.lds.h \
    	-MMD -MT $@ -MF $@.d -o $@ $<
    -include $($_heap_lds).d

Now it looks like this, and the build system itself decides what to do with the heap.lds.S file:

    module HeapAlloc {
    	source "heap.lds.S"
    }
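One way such a decision can be made is by matching the file name against known suffixes and letting the longest match win. The Python sketch below only illustrates the idea; the `HANDLERS` table, its action names and `classify` are hypothetical and not taken from Mybuild.

```python
# Hypothetical suffix -> action table. Longer suffixes must win,
# so "heap.lds.S" is a linker script rather than plain assembly.
HANDLERS = {
    ".c": "compile C",
    ".S": "assemble",
    ".lds.S": "preprocess linker script",
    ".h": "header",
}


def classify(filename):
    """Pick the action whose suffix is the longest match for `filename`."""
    matches = [suffix for suffix in HANDLERS if filename.endswith(suffix)]
    if not matches:
        raise ValueError(f"no handler for {filename}")
    return HANDLERS[max(matches, key=len)]


print(classify("heap.lds.S"))  # preprocess linker script
print(classify("hello.c"))     # compile C
```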

Processing other resources

In the previous example, the type of the file listed in source was determined by its extension (.lds.S). Sometimes, however, certain files need to be marked so that they are processed in a special way. In our project, for example, these are files whose contents must be available at runtime.

Here we used the annotation mechanism, again borrowed from Java. The first thing we implemented with annotations is the ability to mark a resource as one that must be copied into the folder holding the root file system, that is:

    module Httpd {
    	@InitFS
    	source "index.html"
    }

Annotations can be attached to any kind of object in our system: modules, their dependencies, source files, parameters, and even other annotations. We hope this gives the language good room for extension.

Inheritance and abstract modules

Finally, I will describe one feature that I find very interesting and that we have not met in other similar projects: the ability to declare interfaces and their implementations.

Since our OS is highly configurable, we need a simple way to swap system algorithms such as the scheduling strategy. This is not the scheduling policy that is set per process with SCHED_FIFO or SCHED_OTHER, but the algorithm by which the scheduler manages all threads (possibly taking the policy into account). For example, the project currently implements three scheduling strategies. For the simplest systems you can use a primitive scheduler that takes into account neither priorities nor any other thread attributes. There is also a strategy that uses priorities and takes into account how much CPU time a thread has already consumed.

That was a small digression. So, we needed at least simple inheritance, so that the system's requirements could be described with interfaces and its concrete properties with their implementations. Since the need was there, we solved it like this:

    @DefaultImpl(TrivialSchedStrategy)
    abstract module SchedStrategy {
    }

    module TrivialSchedStrategy extends SchedStrategy {
    	source "trivial.c", "trivial.h"
    }

    module PriorityBasedSchedStrategy extends SchedStrategy {
    	source "priority_based.c", "priority_based.h"
    }

As you can see, an annotation (@DefaultImpl) is involved here as well: if the configuration does not explicitly name a module that implements SchedStrategy, the TrivialSchedStrategy module is used by default.
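The selection rule can be sketched in a few lines of Python. This is a guess at the behaviour described above, not Mybuild's code: the dictionary layout, the `resolve_implementations` name, and the decision to treat several explicit implementations (or none with no default) as an error are all assumptions.

```python
def resolve_implementations(modules, included):
    """Map each abstract module to the implementation a configuration uses."""
    chosen = {}
    for name, info in modules.items():
        if not info.get("abstract"):
            continue
        # Implementations of this interface that the configuration names.
        explicit = [m for m in included
                    if modules.get(m, {}).get("extends") == name]
        if len(explicit) > 1:
            raise ValueError(f"several implementations of {name}: {explicit}")
        if explicit:
            chosen[name] = explicit[0]
        elif info.get("default_impl"):
            # Fall back to the module named in @DefaultImpl.
            chosen[name] = info["default_impl"]
        else:
            raise ValueError(f"no implementation of {name}")
    return chosen


# The module set from the my-file example above, as plain dictionaries.
modules = {
    "SchedStrategy": {"abstract": True,
                      "default_impl": "TrivialSchedStrategy"},
    "TrivialSchedStrategy": {"extends": "SchedStrategy"},
    "PriorityBasedSchedStrategy": {"extends": "SchedStrategy"},
}
print(resolve_implementations(modules, []))
# {'SchedStrategy': 'TrivialSchedStrategy'}
print(resolve_implementations(modules, ["PriorityBasedSchedStrategy"]))
# {'SchedStrategy': 'PriorityBasedSchedStrategy'}
```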

Conclusion

Of course, this does not cover every feature of our system: for example, you can also specify module-specific compilation flags, associate a set of unit tests with a module, and so on. But I am afraid the article is already overloaded, so anyone who wants to learn more about Mybuild, perhaps try it hands-on, or even look at the code, can find more information on the project wiki.

Naturally, much remains to be implemented and polished. At the very least, we have not yet tried to decouple Mybuild from its parent project, Embox, and some parts still lack documentation.

Thank you for reading to the end. I will be glad to respond to your questions and suggestions in the comments.

Source: https://habr.com/ru/post/144935/
