ARM for beginners: the subtleties of compilation and the linker

Continuing the series of articles on ARM development from scratch, today I will cover writing linker scripts for GNU ld. The topic may be useful not only to those who work with embedded systems, but also to anyone who wants to better understand how executable files are structured. Although the examples are based on the arm-none-eabi toolchain, the essence of linking is the same for, say, the Visual Studio linker.

Code examples from the article: https://github.com/farcaller/arm-demos

When we compile a source file, we get an object file, which typically contains several sections. The four most common are:

.text - compiled machine code;
.data - global and static variables with an initial value;
.rodata - the counterpart of .data for immutable data;
.bss - global and static variables that are zero at startup.

In the binaries we work with in this series, two more sections will come up often:

.comment - information about the version of the compiler;
.ARM.attributes - ARM-specific file attributes.

In addition to sections, an object file contains another important entity: the symbol table. It is a kind of map from a name to an address (plus optional attributes). The symbol table lists, for example, all exported functions and their addresses (which point somewhere into the .text section).
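
To make the "name to address" idea concrete, here is a toy model of a symbol table in C. It is only a sketch: the struct layout, the find_symbol helper and the three entries are invented for illustration and are not the real ELF format, which stores names as offsets into a string table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of one symbol-table entry (not the real ELF layout). */
struct symbol {
    const char *name;     /* symbol name */
    const char *section;  /* section the symbol lives in */
    uint32_t    value;    /* offset from the start of that section */
    uint32_t    size;     /* size of the function or object in bytes */
};

/* Hypothetical entries, made up for this example. */
static const struct symbol symtab[] = {
    { "main",    ".text", 0x00, 0x20 },
    { "helper",  ".text", 0x20, 0x10 },
    { "counter", ".bss",  0x00, 0x04 },
};

/* Linear search by name; returns NULL if the symbol is not defined here. */
static const struct symbol *find_symbol(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof symtab / sizeof symtab[0]; ++i) {
        if (strcmp(symtab[i].name, name) == 0)
            return &symtab[i];
    }
    return NULL;
}
```

A lookup that returns NULL corresponds to an undefined symbol, which the linker would have to find in some other object file.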

Once we have a few such files, the linker gets down to business: following a set of rules, it gathers all the sections, discards the unnecessary ones and produces the final executable. For a "standard" OS the rules about where everything goes are predefined, but with microcontrollers we usually have to distribute everything across flash and RAM by hand.

A look inside

As a first example, let's examine the following C code, module_a.c:

    static int local_function();

    int external_counter;
    static int counter;
    static int preset_counter = 5;
    const int constant = 10;

    int public_function()
    {
        volatile int i = 3 + constant;
        ++external_counter;
        return local_function() * i;
    }

    static int local_function()
    {
        ++counter;
        ++preset_counter;
        return counter + preset_counter;
    }

Compile it and see which sections we got:

    % rake 'show:sections[a]'
    arm-none-eabi-gcc -mthumb -O2 -mcpu=cortex-m0 -c module_a.c -o build/module_a.o
    arm-none-eabi-objdump build/module_a.o -h
    build/module_a.o:     file format elf32-littlearm
    Sections:
    Idx Name            Size      VMA       LMA       File off  Algn
      0 .text           00000034  00000000  00000000  00000034  2**2
                        CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data           00000004  00000000  00000000  00000068  2**2
                        CONTENTS, ALLOC, LOAD, DATA
      2 .bss            00000004  00000000  00000000  0000006c  2**2
                        ALLOC
      3 .rodata         00000004  00000000  00000000  0000006c  2**2
                        CONTENTS, ALLOC, LOAD, READONLY, DATA
      4 .comment        00000071  00000000  00000000  00000070  2**0
                        CONTENTS, READONLY
      5 .ARM.attributes 00000031  00000000  00000000  000000e1  2**0
                        CONTENTS, READONLY

As we can see, there are six sections, whose purpose is already more or less known to us. The second line of each entry lists the section's attributes; they will become more interesting later, during linking. Let's see which symbols are defined in these sections:

    % rake 'show:symbols:text[a]'
    arm-none-eabi-objdump build/module_a.o -j .text -t
    build/module_a.o:     file format elf32-littlearm
    SYMBOL TABLE:
    00000000 l    d  .text  00000000 .text
    00000000 g     F .text  00000034 public_function

Consult the objdump man page for the meaning of the flags. In this section we see two symbols: .text is a debugging symbol that marks the start of the section, and public_function is the symbol for our function. There is no symbol for local_function: the function is declared static, i.e. it is not exported from the object file (and at -O2 the compiler has most likely inlined it into public_function anyway).

    % rake 'show:symbols:data[a]'
    arm-none-eabi-objdump build/module_a.o -j .data -j .bss -t
    build/module_a.o:     file format elf32-littlearm
    SYMBOL TABLE:
    00000000 l    d  .data  00000000 .data
    00000000 l    d  .bss   00000000 .bss
    00000000 l     O .data  00000004 preset_counter
    00000000 l     O .bss   00000004 counter

The .data and .bss sections hold our two counters, preset_counter and counter. They are in different sections because preset_counter has an initial value, which is stored in .data:

    % rake 'show:contents[a,.data]'
    arm-none-eabi-objdump build/module_a.o -j .data -s
    build/module_a.o:     file format elf32-littlearm
    Contents of section .data:
     0000 05000000

counter has no initial value, so it is initialized to zero and goes into the .bss section. The .bss section itself is not physically present in the file, since its contents are always the same: zeros. Otherwise, if you declared char buffer[1024] in the code, the compiler would have to write a kilobyte of empty space into the object file, which makes no sense.
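
The difference can be sketched in a few lines of C. The names buffer, greeting and all_zero are made up for this example, and the section placement noted in the comments is what GCC typically does rather than something the C standard promises. What the standard does promise is that a static object without an initializer starts out as zeros.

```c
#include <assert.h>
#include <stddef.h>

/* No initializer: typically placed in .bss, takes no space in the file. */
static char buffer[1024];

/* Has initial values: typically placed in .data, 4 bytes stored in the file. */
static char greeting[4] = { 'A', 'R', 'M', '!' };

/* Returns 1 if every byte in [p, p + n) is zero, otherwise 0. */
static int all_zero(const char *p, size_t n)
{
    assert(p != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}
```

On a hosted system all_zero(buffer, sizeof buffer) holds at startup because the loader clears .bss; on a bare microcontroller someone has to do that clearing explicitly.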

At this point you may have a question - where did external_counter go?

    % rake 'show:symbols:all[a]'
    arm-none-eabi-objdump build/module_a.o -t
    build/module_a.o:     file format elf32-littlearm
    SYMBOL TABLE:
    00000000 l    df *ABS*            00000000 module_a.c
    00000000 l    d  .text            00000000 .text
    00000000 l    d  .data            00000000 .data
    00000000 l    d  .bss             00000000 .bss
    00000000 l    d  .rodata          00000000 .rodata
    00000000 l     O .data            00000004 preset_counter
    00000000 l     O .bss             00000004 counter
    00000000 l    d  .comment         00000000 .comment
    00000000 l    d  .ARM.attributes  00000000 .ARM.attributes
    00000000 g     F .text            00000034 public_function
    00000004       O *COM*            00000004 external_counter
    00000000 g     O .rodata          00000004 constant

external_counter went to the *COM* pseudo-section. Here this means that the symbol may be defined outside this object file. Only at link time will ld figure out whether the symbol is defined in another file or whether it has to create it itself, in this case in the .bss section. Also note that const int constant ended up in .rodata. The compiler guarantees that the code never needs to change the value at this address, so the linker can safely place it in flash memory.
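
The reason external_counter is treated specially is that, in C, a file-scope declaration with neither an initializer nor extern is a "tentative definition". A small sketch (the bump helper is made up for illustration) shows that C even accepts such a declaration twice in one file:

```c
#include <assert.h>

int external_counter;   /* tentative definition: no initializer, no extern */
int external_counter;   /* legal in C: refers to the same object */

/* Increments the shared counter and returns the new value. */
static int bump(void)
{
    assert(external_counter >= 0);
    return ++external_counter;
}
```

Note that this is C-only (C++ rejects the second line), and that the *COM* placement depends on the compiler: GCC 10 and later default to -fno-common, in which case such a variable goes straight into .bss, so you would need -fcommon to reproduce the listing above with a recent toolchain.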

We can also look inside .comment:

    % rake 'show:contents[a,.comment]'
    arm-none-eabi-objdump build/module_a.o -j .comment -s
    build/module_a.o:     file format elf32-littlearm
    Contents of section .comment:
     0000 00474343 3a202847 4e552054 6f6f6c73  .GCC: (GNU Tools
     0010 20666f72 2041524d 20456d62 65646465   for ARM Embedde
     0020 64205072 6f636573 736f7273 2920342e  d Processors) 4.
     0030 372e3320 32303133 30333132 20287265  7.3 20130312 (re
     0040 6c656173 6529205b 41524d2f 656d6265  lease) [ARM/embe
     0050 64646564 2d345f37 2d627261 6e636820  dded-4_7-branch
     0060 72657669 73696f6e 20313936 3631355d  revision 196615]
     0070 00                                   .

It really does contain the compiler version. We can also take a look at .ARM.attributes, though for that we need readelf rather than objdump:

    % rake 'show:attrs[a]'
    arm-none-eabi-readelf build/module_a.o -A
    Attribute Section: aeabi
    File Attributes
      Tag_CPU_name: "Cortex-M0"
      Tag_CPU_arch: v6S-M
      Tag_CPU_arch_profile: Microcontroller
      Tag_THUMB_ISA_use: Thumb-1
      Tag_ABI_PCS_wchar_t: 4
      Tag_ABI_FP_denormal: Needed
      Tag_ABI_FP_exceptions: Needed
      Tag_ABI_FP_number_model: IEEE 754
      Tag_ABI_align_needed: 8-byte
      Tag_ABI_align_preserved: 8-byte, except leaf SP
      Tag_ABI_enum_size: small
      Tag_ABI_optimization_goals: Aggressive Speed

Documentation on the public tags is available at the ARM Information Center.

Putting it all together

Now that we have looked inside object files, let's see how ld assembles them into a single working application.

The main work of ld revolves around the memory map, which we saw in the first part. To simplify greatly, linking is the process of pulling sections out of object files, laying them out at the specified addresses and fixing up cross-references. In a "standard" OS, the kernel reads the output file and loads its sections into memory at the expected virtual addresses. The dynamic linker does similar work, loading external libraries at certain memory locations and resolving cross-references to them.

With embedded systems things are simpler: the flashing tool takes your binary file and writes it to flash as is. It knows nothing about Mach-O or ELF; it works with raw binary dumps.

Let's take a simple linker script and go through it piece by piece. layout.ld:

    MEMORY
    {
        rom(RX)   : ORIGIN = 0x00000000, LENGTH = 0x8000
        ram(WAIL) : ORIGIN = 0x10000000, LENGTH = 0x2000
    }

    ENTRY(public_function)

    SECTIONS
    {
        .text : { *(.text) } > rom

        _data_start = .;
        .data : { *(.data) } > ram AT> rom
        _bss_start = .;
        .bss : { *(.bss) } > ram
        _bss_end = .;
    }

By default the linker is allowed to use the entire address space (about 0xFFFFFFFF bytes on 32-bit ARM). So first we define the memory regions that may be used: rom and ram. The letters in parentheses are attributes: R - read-only, W - read/write, X - executable, A - allocatable, I and L - initialized. Sections not explicitly mentioned in the script are distributed automatically among regions with matching attributes. If no suitable region exists for a section, the linker refuses to proceed and explains itself along these lines: error: no memory region specified for loadable section `.data'.

The two parameters, ORIGIN and LENGTH, specify the start and the length of the region respectively; you may also come across the abbreviations org, o, len and l, which are equivalent. The value is an expression, i.e. you can use arithmetic or the suffixes K, M, etc. LENGTH = 0x8000, for example, can also be written as l = 32K.
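
As a quick sanity check of the arithmetic (32K = 32 * 1024 = 0x8000), here is a toy C model of the two regions. The struct and the region_fits helper are invented for illustration; the real linker performs an equivalent check when it reports that a section does not fit into a region.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a MEMORY region: start address and size in bytes. */
struct region {
    uint32_t origin;
    uint32_t length;
};

static const struct region rom = { 0x00000000u, 32u * 1024u };  /* 32K == 0x8000 */
static const struct region ram = { 0x10000000u,  8u * 1024u };  /*  8K == 0x2000 */

/* Returns 1 if [addr, addr + size) lies entirely inside the region. */
static int region_fits(const struct region *r, uint32_t addr, uint32_t size)
{
    assert(r != 0);
    if (addr < r->origin)
        return 0;
    /* Compare offsets rather than end addresses to avoid 32-bit overflow. */
    uint32_t offset = addr - r->origin;
    return offset <= r->length && size <= r->length - offset;
}
```

For example, a 0x34-byte .text at address 0 fits into rom, while an 8-byte object starting at 0x7ffc would spill over its end.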

The second part of the file is the section configuration. In general, this boils down to copying ~~from one protobuf into another~~ the given input sections into output sections.

Input sections are specified in the form file(section); the * wildcard behaves in the usual way, so *(.text) means: the .text sections from all files.

A section has two addresses: the LMA (Load Memory Address), where it is loaded from, and the VMA (Virtual Memory Address), where it is accessible in virtual memory. Put more simply, the LMA is where the section ends up in the binary file, and the VMA is what symbols resolve to, i.e. a pointer to a symbol in the code will hold the VMA.

We are interested in three sections: code, data, and data that is zero by default. So we copy the code (.text) into flash, the data (.data) into flash but with the expectation that it will be accessed in RAM, and .bss into RAM.

Strictly speaking, .bss needs no initialization, since the microcontroller's RAM is probably zeroed at reset anyway (UPD: I have been told in private messages that it is needed: we must make sure there are zeros there and not garbage that got there for whatever reason). .data, however, needs separate attention because of its dual nature. On the one hand, it holds specific data (the initial value of preset_counter), so it must live in flash. On the other hand, it is a writable section, so it must live in RAM. The problem is solved by giving it different LMA and VMA, plus some extra C code that copies the contents from LMA to VMA at startup. Constant data, which usually resides in .rodata, needs no such procedure: we can safely read it straight from flash.

The linker has the concept of a cursor: the current output address (strictly speaking a VMA; for sections that live in rom the LMA and VMA coincide). At the start of the SECTIONS block the cursor is zero, and it grows as sections are added. Its current value is available through the special variable . (dot).
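
The way the cursor advances can be modelled with two small C functions. This is a sketch under simplifying assumptions: align_up and place are hypothetical helpers, and the real ld keeps track of far more state. The sizes and alignments in the usage note come from the objdump -h listing earlier (.text is 0x34 bytes, .rodata is 4 bytes, both aligned to 4).

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Place a section at the cursor: returns its start address and
   advances the cursor past its end. */
static uint32_t place(uint32_t *cursor, uint32_t size, uint32_t align)
{
    uint32_t start = align_up(*cursor, align);
    *cursor = start + size;
    return start;
}
```

Starting with the cursor at 0, placing .text (0x34 bytes) yields address 0 and moves the cursor to 0x34; placing .rodata (4 bytes) then yields 0x34 and moves the cursor to 0x38.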

Let's run the linker and see the result of its work:

    % rake 'show:map[a]'
    arm-none-eabi-ld -T layout.ld -M -o build/out.elf build/module_a.o
    Allocating common symbols
    Common symbol       size              file
    external_counter    0x4               build/module_a.o
    Memory Configuration
    Name             Origin             Length             Attributes
    rom              0x0000000000000000 0x0000000000008000 xr
    ram              0x0000000010000000 0x0000000000002000 awl
    *default*        0x0000000000000000 0xffffffffffffffff

First we see the linker put the "common" symbol external_counter into a separate category. Then we see that our memory configuration has been loaded and added to the default one (which spans the entire address space).

    Linker script and memory map
    .text           0x0000000000000000       0x34
     *(.text)
     .text          0x0000000000000000       0x34 build/module_a.o
                    0x0000000000000000                public_function
                    0x0000000000000034                _data_start = .

Next, the linker places the sections we specified into memory, starting with .text.

    .rodata         0x0000000000000034        0x4
     .rodata        0x0000000000000034        0x4 build/module_a.o
                    0x0000000000000034                constant
    .glue_7         0x0000000000000038        0x0
     .glue_7        0x0000000000000000        0x0 linker stubs
    .glue_7t        0x0000000000000038        0x0
     .glue_7t       0x0000000000000000        0x0 linker stubs
    .vfp11_veneer   0x0000000000000038        0x0
     .vfp11_veneer  0x0000000000000000        0x0 linker stubs
    .v4_bx          0x0000000000000038        0x0
     .v4_bx         0x0000000000000000        0x0 linker stubs
    .iplt           0x0000000000000038        0x0
     .iplt          0x0000000000000000        0x0 build/module_a.o
    .rel.dyn        0x0000000000000038        0x0
     .rel.iplt      0x0000000000000000        0x0 build/module_a.o

Next come sections that we did not mention explicitly: .rodata, .glue_7, .glue_7t, .vfp11_veneer, .v4_bx, .iplt, .rel.dyn. With .rodata everything is clear: it holds our constant, four bytes long. The remaining sections exist to support things like jumps between ARM and Thumb code. All of them are empty and do not make it into the final image.

    .data           0x0000000010000000        0x4 load address 0x0000000000000038
     *(.data)
     .data          0x0000000010000000        0x4 build/module_a.o
                    0x0000000010000004                _data_end = .

This is our .data section; as you can see, it is located at 0x10000000, although it is physically stored at 0x38 (i.e. right after .rodata). Here we also see the value of our variable _data_end, read from the cursor.

    .igot.plt       0x0000000010000004        0x0 load address 0x000000000000003c
     .igot.plt      0x0000000000000000        0x0 build/module_a.o
    .bss            0x0000000010000004        0x8 load address 0x000000000000003c
     *(.bss)
     .bss           0x0000000010000004        0x4 build/module_a.o
     COMMON         0x0000000010000008        0x4 build/module_a.o
                    0x0000000010000008                external_counter
                    0x000000001000000c                _bss_end = .

Another empty section, followed by .bss .

    LOAD build/module_a.o
    OUTPUT(build/out.elf elf32-littlearm)
    .comment        0x0000000000000000       0x70
     .comment       0x0000000000000000       0x70 build/module_a.o
                                             0x71 (size before relaxing)
    .ARM.attributes 0x0000000000000000       0x31
     .ARM.attributes 0x0000000000000000      0x31 build/module_a.o

Finally, ld generates the output file and discards the unneeded sections. Looks like that's all?

                    0x0000000000000034                _data_start = .
    ...
    .data           0x0000000010000000        0x4 load address 0x0000000000000038

The variable that should point to the beginning of .data actually points to the wrong place! And no wonder: after .text the cursor points to its end. To set the variable correctly, the assignment must be moved inside the description of the output section:

    .data : {
        _data_start = .;
        *(.data)
        _data_end = .;
    } > ram AT> rom

Link again and see what has changed:

    % rake 'show:map[a]' SCRIPT=layout2.ld
    arm-none-eabi-ld -T layout2.ld -M -o build/module_a.elf build/module_a.o
    ...
    .data           0x0000000010000000        0x4 load address 0x0000000000000038
                    0x0000000010000000                _data_start = .
     *(.data)
     .data          0x0000000010000000        0x4 build/module_a.o
                    0x0000000010000004                _data_end = .
    ...

Great, now everything is in place.

You may wonder: why do we care where .data is? As you remember, the data is physically stored in flash, but the code works with it in RAM. For this reason we will have to write startup code that copies .data into RAM, and these variables will tell it the exact addresses to copy to.
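
Such startup code could look roughly like the sketch below. It is not taken from the article's repository: copy_data and zero_bss are hypothetical helpers that take plain pointers so they can be tried on a PC, and the symbol _data_load (the LMA of .data) is an assumption, since the linker script shown so far does not define it.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the initial values of .data from its load address (flash)
   to its run address (RAM), one word at a time. */
static void copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    assert(start <= end);
    while (start < end)
        *start++ = *load++;
}

/* Fill .bss with zeros so that uninitialized globals start at 0. */
static void zero_bss(uint32_t *start, const uint32_t *end)
{
    assert(start <= end);
    while (start < end)
        *start++ = 0;
}

/*
 * On the target the arguments would come from linker-script symbols, e.g.
 *     extern uint32_t _data_start, _data_end, _bss_start, _bss_end;
 *     extern const uint32_t _data_load;   // hypothetical: LMA of .data
 *     copy_data(&_data_load, &_data_start, &_data_end);
 *     zero_bss(&_bss_start, &_bss_end);
 */
```

Note that only the addresses of the linker symbols matter; they have no storage of their own, which is why the sketch takes &_data_start rather than its value.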

Complicating the task

We have dealt with a single module. Let's add a second file and see what changes. It will contain the already familiar external_counter and some C++ code, module_b.cpp:

    int external_counter;

    extern "C" int public_function();

    void function_b()
    {
        external_counter += public_function();
    }

    void function_c()
    {
    }

    void function_d()
    {
    }

As you know, when C++ code is compiled, the names of functions and methods go through "mangling", which encodes argument types and the names of classes and namespaces into the symbol name:

    % rake 'show:symbols:text[b]'
    arm-none-eabi-gcc -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -mthumb -O2 -mcpu=cortex-m0 -c module_b.cpp -o build/module_b.o
    arm-none-eabi-objdump build/module_b.o -j .text -t
    build/module_b.o:     file format elf32-littlearm
    SYMBOL TABLE:
    00000000 l    d  .text  00000000 .text
    00000000 g     F .text  00000014 _Z10function_bv
    00000014 g     F .text  00000002 _Z10function_cv
    00000018 g     F .text  00000002 _Z10function_dv

We compile the code with the -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables flags to avoid extra sections related to exception handling. The function names have been mangled accordingly.

We cannot generate a map for this module alone, since it cannot be linked on its own: it depends on public_function from module a. So we link both modules at once:

    % rake 'show:map[a|b]' SCRIPT=layout2.ld
    arm-none-eabi-ld -T layout2.ld -M -o build/out.elf build/module_a.o build/module_b.o
    ...
     .text          0x0000000000000000       0x34 build/module_a.o
                    0x0000000000000000                public_function
     .text          0x0000000000000034       0x1c build/module_b.o
                    0x0000000000000034                function_b()
                    0x0000000000000048                function_c()
                    0x000000000000004c                function_d()
    ...
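
The names in the listing follow a simple pattern that can be reproduced in C for the narrowest possible case. The sketch below (mangle_void_function is a made-up helper) handles only free functions in the global namespace that take no arguments: "_Z", the decimal length of the name, the name itself, and "v" for an empty parameter list. Anything with arguments, namespaces or classes needs far more logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build the mangled name of a global function that takes no arguments:
   "_Z" + decimal length of the name + the name + "v" (void parameter list).
   Returns 0 on success, -1 if the output buffer is too small. */
static int mangle_void_function(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL);
    int n = snprintf(out, out_size, "_Z%zu%sv", strlen(name), name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For "function_b", which is 10 characters long, this produces _Z10function_bv, matching the symbol table above.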

The block of common symbols is gone: all symbols are now defined in their modules. The .text sections, like the others, are laid out one after another.

Collecting garbage!

In embedded applications the size of the output file matters more than ever, so you should make sure that as much unnecessary data and dead code as possible is removed. The linker can get rid of sections that are not referenced and are not explicitly marked as required in the linker script. This is done quite simply, with the --gc-sections flag:

    % rake 'show:map[a|b]' SCRIPT=layout2.ld GC=1
    arm-none-eabi-ld --gc-sections -T layout2.ld -M -o build/out.elf build/module_a.o build/module_b.o
    Discarded input sections
     .rodata        0x0000000000000000        0x4 build/module_a.o
     COMMON         0x0000000000000000        0x0 build/module_a.o
     .text          0x0000000000000000       0x1c build/module_b.o
     .data          0x0000000000000000        0x0 build/module_b.o
    ...
    .text           0x0000000000000000       0x34
     *(.text)
     .text          0x0000000000000000       0x34 build/module_a.o
                    0x0000000000000000                public_function
    ...

As you can see, the .text section of build/module_b.o was removed completely, as it contained only unused functions! Along the way the linker threw out the unused constant from the first module.

In fact, this optimization is far from complete, as a simple experiment shows. See module_c.cpp:

    void function_b();

    extern "C" int public_function()
    {
        function_b();
    }

We will replace module a with module c and see if the linker can delete the section.

    % rake 'show:map[b|c]' SCRIPT=layout2.ld GC=1
    arm-none-eabi-gcc -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -mthumb -O2 -mcpu=cortex-m0 -c module_c.cpp -o build/module_c.o
    arm-none-eabi-ld --gc-sections -T layout2.ld -M -o build/out.elf build/module_b.o build/module_c.o
    Discarded input sections
     .data          0x0000000000000000        0x0 build/module_b.o
     .data          0x0000000000000000        0x0 build/module_c.o
     .bss           0x0000000000000000        0x0 build/module_c.o
    ...
    .text           0x0000000000000000       0x24
     *(.text)
     .text          0x0000000000000000       0x1c build/module_b.o
                    0x0000000000000000                function_b()
                    0x0000000000000014                function_c()
                    0x0000000000000018                function_d()
     .text          0x000000000000001c        0x8 build/module_c.o
                    0x000000000000001c                public_function

Although some sections (empty ones, by the way) were thrown away, we still waste precious bytes on function_c() and function_d(), which ended up in the same section as the function_b() we need. Compiler flags that split functions and data into separate sections come to the rescue: -ffunction-sections and -fdata-sections:

    % rake clean && rake 'show:symbols:all[b]' SPLIT_SECTIONS=1
    arm-none-eabi-gcc -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -ffunction-sections -fdata-sections -mthumb -O2 -mcpu=cortex-m0 -c module_b.cpp -o build/module_b.o
    arm-none-eabi-objdump build/module_b.o -t
    build/module_b.o:     file format elf32-littlearm
    SYMBOL TABLE:
    00000000 l    df *ABS*                  00000000 module_b.cpp
    00000000 l    d  .text                  00000000 .text
    00000000 l    d  .data                  00000000 .data
    00000000 l    d  .bss                   00000000 .bss
    00000000 l    d  .text._Z10function_bv  00000000 .text._Z10function_bv
    00000000 l    d  .text._Z10function_cv  00000000 .text._Z10function_cv
    00000000 l    d  .text._Z10function_dv  00000000 .text._Z10function_dv
    00000000 l    d  .bss.external_counter  00000000 .bss.external_counter
    00000000 l    d  .comment               00000000 .comment
    00000000 l    d  .ARM.attributes        00000000 .ARM.attributes
    00000000 g     F .text._Z10function_bv  00000014 _Z10function_bv
    00000000         *UND*                  00000000 public_function
    00000000 g     F .text._Z10function_cv  00000002 _Z10function_cv
    00000000 g     F .text._Z10function_dv  00000002 _Z10function_dv
    00000000 g     O .bss.external_counter  00000004 external_counter

Now that every function and object is placed in its own section, the linker can get rid of the unused ones:

    % rake clean && rake 'show:map[b|c]' SCRIPT=layout2.ld GC=1 SPLIT_SECTIONS=1
    arm-none-eabi-gcc -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -ffunction-sections -fdata-sections -mthumb -O2 -mcpu=cortex-m0 -c module_b.cpp -o build/module_b.o
    arm-none-eabi-gcc -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -ffunction-sections -fdata-sections -mthumb -O2 -mcpu=cortex-m0 -c module_c.cpp -o build/module_c.o
    arm-none-eabi-ld --gc-sections -T layout2.ld -M -o build/out.elf build/module_b.o build/module_c.o
    Discarded input sections
     .text                  0x0000000000000000        0x0 build/module_b.o
     .data                  0x0000000000000000        0x0 build/module_b.o
     .bss                   0x0000000000000000        0x0 build/module_b.o
     .text._Z10function_cv  0x0000000000000000        0x4 build/module_b.o
     .text._Z10function_dv  0x0000000000000000        0x4 build/module_b.o
     .text                  0x0000000000000000        0x0 build/module_c.o
     .data                  0x0000000000000000        0x0 build/module_c.o
     .bss                   0x0000000000000000        0x0 build/module_c.o
    ...
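
What --gc-sections does here is essentially a reachability walk over sections, starting from the entry point. The C sketch below is a toy model, not ld's actual implementation: the enum, the refs matrix and mark are made up, and the references are the ones visible in module_b.cpp and module_c.cpp (public_function calls function_b, which in turn uses public_function and external_counter).

```c
#include <assert.h>

enum { TEXT_PUBLIC, TEXT_B, TEXT_C, TEXT_D, BSS_COUNTER, NSECTIONS };

/* refs[i][j] != 0 means section i refers to a symbol in section j. */
static const int refs[NSECTIONS][NSECTIONS] = {
    /* public_function calls function_b */
    [TEXT_PUBLIC] = { [TEXT_B] = 1 },
    /* function_b calls public_function and touches external_counter */
    [TEXT_B]      = { [TEXT_PUBLIC] = 1, [BSS_COUNTER] = 1 },
};

/* Mark every section reachable from root; keep[] must start zeroed. */
static void mark(int root, int keep[NSECTIONS])
{
    assert(root >= 0 && root < NSECTIONS);
    if (keep[root])
        return;          /* already visited: also stops on reference cycles */
    keep[root] = 1;
    for (int j = 0; j < NSECTIONS; ++j) {
        if (refs[root][j])
            mark(j, keep);
    }
}
```

Marking from TEXT_PUBLIC keeps function_b and external_counter and leaves function_c and function_d unmarked, which matches the discarded sections above. When all three functions share one .text section, the graph has a single node for them, so nothing can be dropped.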

In place of a conclusion

And again the article keeps growing; it is now twice the size of the first part. Unfortunately, linking is a complex topic that is hard to master in one go. In a week we will continue exploring the linker and write a full-fledged linker script for our embedded applications.

P.S. As always, many thanks to pfactum for proofreading the text.

This work is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported license. The program text of the examples is available under the Unlicense (unless otherwise indicated in the file headers). This work is written solely for educational purposes and is not affiliated in any way with the current or previous employers of the author.

Source: https://habr.com/ru/post/191058/
