UPD 09/21/16: BusyBox is now successfully launched.
Inspired by a series of articles on the Mars Rover project site, in which the author launches the open-source Amber SoC and Linux on the FPGA-based Mars Rover 2 board, I decided to repeat the experiment on my Terasic DE2-115 board. But instead of the ancient Linux 2.4.27, I will be running the latest version available at the time of writing: 4.8.0-rc5.
The Amber system on a chip
The Amber processor core is a 32-bit RISC processor, fully compatible with the ARMv2a architecture and instruction set, which makes it possible to compile programs for it with GCC. In addition to the processor itself, the Amber project provides several peripheral devices as part of a system on a chip, including a UART, a timer, and an Ethernet MAC. The processor core is available in two versions:
| | Amber 23 | Amber 25 |
| Pipeline | 3-stage | 5-stage |
| Cache | unified (code + data) | separate |
| Wishbone bus width | 32 bits | 128 bits |
| Performance | 0.75 DMIPS/MHz | 1.05 DMIPS/MHz |
As you can see, the performance of the processor core is comparable to that of cores based on later versions of the ARM architecture, such as ARMv4 and ARMv5. Amber implements the ARMv2a architecture because it is not covered by patents, so its implementations can be distributed freely. This comes with some problems, however: GCC considers the architecture obsolete and is gradually removing support for it, and support for it was removed from the Linux kernel long ago.
An important feature of the architecture is that, unlike newer versions of the ARM architecture, the processor does not support Thumb mode, has no CPSR / SPSR registers and no MSR / MRS instructions, and keeps the processor flags in bits of the PC register.
Because of this, the PC register can address at most 64 MB of memory (26 bits). The two least significant bits of an instruction address are always 0, because instructions are always word-aligned, so the two low bits of the register are used as flags that define the processor mode (user / privileged, interrupt handler). Through the other registers the processor can address up to 4 GB of memory. More details about the architecture of the processor core and the instruction set it implements can be found here and here.
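To make the packed layout concrete, here is a minimal Python sketch that splits a 32-bit R15 value into its fields. The bit positions (N, Z, C, V in bits 31 to 28, the I and F interrupt-disable bits in 27 and 26, the program counter in bits 25 to 2, the mode in bits 1 to 0) follow the classic 26-bit ARM layout as it is generally documented; they are an assumption, not taken from the Amber sources, and the names `decode_r15`, `MODES` and `PC_MASK` are made up for illustration.

```python
# Sketch of the classic 26-bit ARM R15 layout (assumed, not from the Amber sources).
MODES = {0b00: "USR", 0b01: "FIQ", 0b10: "IRQ", 0b11: "SVC"}
PC_MASK = 0x03FFFFFC  # bits 25..2 hold the word-aligned program counter


def decode_r15(r15):
    """Split a 32-bit R15 value into condition flags, interrupt masks, PC and mode."""
    return {
        "N": bool(r15 & (1 << 31)),  # negative
        "Z": bool(r15 & (1 << 30)),  # zero
        "C": bool(r15 & (1 << 29)),  # carry
        "V": bool(r15 & (1 << 28)),  # overflow
        "I": bool(r15 & (1 << 27)),  # IRQ disabled
        "F": bool(r15 & (1 << 26)),  # FIQ disabled
        "pc": r15 & PC_MASK,         # low two bits are implicitly zero
        "mode": MODES[r15 & 0b11],   # processor mode lives in bits 1..0
    }


if __name__ == "__main__":
    fields = decode_r15(0x6C008003)
    print(hex(fields["pc"]), fields["mode"])
```

The largest value the PC field can hold is 0x03FFFFFC, so the last addressable instruction word ends exactly at the 64 MB boundary mentioned above.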
Installing ARM Cross Compiler
Unfortunately, the Sourcery CodeBench Lite compiler, which the author of the Mars Rover porting articles used, is no longer available for download, but this is not a big problem. To install a compiler, you can use crosstool-NG, or crossdev on Gentoo Linux.
To install with crosstool-NG, it is enough to use the stock arm-unknown-eabi configuration:
$ ct-ng arm-unknown-eabi
$ ct-ng build
This compiler will be used to build the Linux kernel and bare-metal programs, such as the initial boot loader, and a simple application that prints Hello, World to the serial port.
Compile Hello World and run in the Verilog simulator Verilator
Download the project sources from GitHub and look inside. The project is divided into two parts: the hw folder contains the sources of the “hardware” part in Verilog, and the sw folder contains the source code of the programs that will run on the processor, plus some auxiliary utilities used during the build to convert ELF and BIN files into formats supported by the Xilinx tools and the Amber test bench scripts.
Go to the sw/hello-world folder and compile the hello-world.c program:
$ cd sw/hello-world
$ export AMBER_CROSSTOOL=arm-unknown-eabi
$ make
As a result, among other things, the file hello-world.mem will be generated: a text file with the contents of the compiled program, suitable for loading into the simulator and into the Boot ROM of our processor.
The author of the original articles used Icarus Verilog for simulation. It is a free and very popular simulator, but it is terribly slow: on my machine with a 2.6 GHz processor, the Amber core runs at about 16 kHz when simulated in Icarus Verilog, and each character of the string “Hello, World” from the example above takes about half a second to appear. This speed is enough for debugging a small program, such as a bootloader or the same hello-world, but it is unacceptable for debugging the boot of the entire Linux kernel, where you would have to wait for ages.
Therefore, we will use the Verilator simulator, which compiles Verilog into C++ and runs very fast: Hello World prints instantly without any visible delay, and the simulated clock frequency on my machine is about 1.5 MHz, which is roughly 100 times faster than Icarus Verilog! By the way, debugging the Linux kernel boot took me about a week, and the simulation helped a lot, because in simulation mode the test bench writes an assembler listing of every instruction executed by the processor to a text log file, including jumps, asynchronous and software interrupts, and so on. It is a sort of disassembler implemented in Verilog.
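The figures above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch in Python using the numbers quoted in the text (16 kHz, 1.5 MHz, half a second per character); the function names are made up for illustration.

```python
# Back-of-the-envelope check of the simulation speeds quoted in the text.
ICARUS_HZ = 16_000            # simulated Amber clock under Icarus Verilog
VERILATOR_HZ = 1_500_000      # simulated Amber clock under Verilator
ICARUS_SECONDS_PER_CHAR = 0.5  # observed time per printed character


def speedup(fast_hz, slow_hz):
    """Ratio of the two simulated clock frequencies."""
    return fast_hz / slow_hz


def cycles_per_char(clock_hz, seconds_per_char):
    """Simulated clock cycles spent on one printed character."""
    return clock_hz * seconds_per_char


def seconds_for_cycles(clock_hz, cycles):
    """Wall-clock time needed to simulate a given number of cycles."""
    return cycles / clock_hz


if __name__ == "__main__":
    cycles = cycles_per_char(ICARUS_HZ, ICARUS_SECONDS_PER_CHAR)
    print(speedup(VERILATOR_HZ, ICARUS_HZ))
    print(cycles, seconds_for_cycles(VERILATOR_HZ, cycles))
```

The ratio comes out at about 94, and the roughly 8000 cycles spent on each character shrink from half a second to about 5 ms, which is why the output looks instantaneous.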
We install Verilator following the instructions on the official site, go to the hw/de2_115/tb folder, where the modified testbench is located, and run make. A stream of Verilog compiler warnings will scroll by; the result is the obj_dir folder containing the Vtb executable, which we will run to simulate the system.
Next, execute the following commands:
$ cp ../../../sw/hello-world/hello-world.mem ./boot-loader.mem
$ ./obj_dir/Vtb
As a result, the simulation will be launched and we will see the long-awaited Hello, World:
Load boot memory from boot-loader.mem
Read in 961 lines
Hello, World!
This means that the processor successfully read and executed our program, compiled by GCC for ARM!
If you wish, you can add the --trace option to the list of verilator options; then another file, out.vcd, will be generated, which can be opened in GTKWave to view the waveforms of various signals inside the processor and other blocks.

Build initramfs with Buildroot
Before building the Linux kernel, we create an environment for compiling user programs for our system (based on uClibc-ng) and generate a file that will be added to the kernel as an initramfs during the build. To do this, use Buildroot, which can be downloaded from here.
$ make amber_defconfig
$ make
As a result, we will have the arm-buildroot-uclinux-uclibcgnueabi toolchain and a file system image in ./output/images/rootfs.cpio. The path to this image must be specified in the kernel configuration file, in the CONFIG_INITRAMFS_SOURCE parameter. BusyBox is included in the file system image; when this article was first written it did not start up completely (it does now, see the update at the top), but here we will restrict ourselves to a simple “Hello, World” as the /sbin/init process. To do this, in the directory where Buildroot was built, create a file hello.c with the content familiar to every programmer, and run the following commands:
$ ./output/host/usr/bin/arm-buildroot-uclinux-uclibcgnueabi-gcc -o hello hello.c
$ mv hello output/target/sbin/init
$ rm hello.gdb
$ make
After these commands complete successfully, ./output/images/rootfs.cpio will be rebuilt with our application in place of BusyBox. Replacing files this way is fine for a quick check; for properly adding and replacing files in rootfs during the build there is the BR2_ROOTFS_OVERLAY configuration option.
Unlike the example that we ran in the Verilator simulator, this new “Hello, World” runs not as a bare-metal application but as a Linux user application: the text is sent to the serial port through the write system call wrapper of the uClibc standard library, which passes control to the kernel via a software interrupt; the kernel hands control to the tty driver, then to the serial port driver, and finally the message is output.
Build the Linux kernel and boot loader
Naturally, to run the latest kernel, I had to make some changes to it. Most of them concern the interrupt handling code and the processor mode switching code, since this code is architecture dependent. Next, I adapted the support code for the Integrator platform (mach-integrator) and created a new Amber platform on its basis: the Amber author's original patch for the 2.4 kernel hints that this platform was the prototype of the Amber SoC architecture (in particular, peripherals such as the interrupt controller, the timer, and the serial port turned out to be compatible with the device drivers used on this platform).
Fortunately, the hours of debugging are over, and now a working kernel can be built with a flick of the wrist. Those who wish to repeat it can clone the source and execute the following commands:
$ make ARCH=arm CROSS_COMPILE=arm-none-eabi- amber_defconfig
$ make -j8 ARCH=arm CROSS_COMPILE=arm-none-eabi- Image
$ make ARCH=arm CROSS_COMPILE=arm-none-eabi- arch/arm/boot/dts/amber-de2115.dts
After the kernel is built, the arch/arm/boot/Image and arch/arm/boot/dts/amber-de2115.dtb files will be created, ready to be loaded into the board by the bootloader over the serial port using the XMODEM protocol.
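For readers unfamiliar with XMODEM, the following Python sketch shows how a file is cut into packets in the classic checksum variant of the protocol: a start byte (SOH), a block number, its complement, 128 data bytes, and a one-byte sum. Whether the Amber bootloader expects this variant or the CRC-16 one is not stated in the article, so treat the framing as an assumption; the function name is hypothetical and the ACK/NAK handshake is left out.

```python
# Sketch of classic XMODEM (checksum variant) packet framing.
# Assumption: the article does not say which XMODEM variant the bootloader uses.
SOH = 0x01         # start-of-header byte for 128-byte blocks
PAD = 0x1A         # conventional padding byte (SUB) for the last block
BLOCK_SIZE = 128


def xmodem_packets(data):
    """Split data into 132-byte XMODEM packets (header, payload, checksum)."""
    packets = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, bytes([PAD]))
        seq = (offset // BLOCK_SIZE + 1) & 0xFF  # block numbers start at 1 and wrap
        checksum = sum(block) & 0xFF             # 8-bit sum of the payload
        packets.append(bytes([SOH, seq, 0xFF - seq]) + block + bytes([checksum]))
    return packets


if __name__ == "__main__":
    for packet in xmodem_packets(b"A" * 130):
        print(len(packet), packet[:3].hex(), hex(packet[-1]))
```

A real transfer also waits for an acknowledgement after every packet and finishes with an end-of-transmission byte; terminal programs with XMODEM support handle that for you.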
To build the bootloader, go to the sw/boot-loader-serial folder, run make (don't forget about the AMBER_CROSSTOOL environment variable) and get the file boot-loader-serial.mem, which can be converted with the mem2mif utility to MIF, the format that Altera Quartus II accepts as a memory initialization file.
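As an illustration of what the target format looks like, here is a minimal Python sketch that writes a list of 32-bit words as an Altera MIF file. It is not the mem2mif utility mentioned above; the function name is made up, and the default depth of 2048 words of 32 bits is an assumption based on the name of the de2_115_sram_2048_32_byte_en memory block. Parsing of the .mem input is left out because the article does not describe that format.

```python
# Sketch of an Altera MIF writer (not the mem2mif utility from the article).
# Assumption: a 2048 x 32-bit boot memory, guessed from the memory block name.


def words_to_mif(words, depth=2048, width=32):
    """Render a list of integer words as the text of a MIF file."""
    if len(words) > depth:
        raise ValueError("program does not fit into the memory")
    digits = width // 4  # hex digits per word
    lines = [
        f"DEPTH = {depth};",
        f"WIDTH = {width};",
        "ADDRESS_RADIX = HEX;",
        "DATA_RADIX = HEX;",
        "CONTENT",
        "BEGIN",
    ]
    for addr, word in enumerate(words):
        lines.append(f"{addr:X} : {word:0{digits}X};")
    if len(words) < depth:
        # Fill the unused tail of the memory with zeros using a range entry.
        lines.append(f"[{len(words):X}..{depth - 1:X}] : {0:0{digits}X};")
    lines.append("END;")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(words_to_mif([0xEA000002, 0x00000001], depth=4), end="")
```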
Putting it all together
For those who have a Terasic DE2-115 board, it's time to open the de2_115.qpf project and synthesize it (note that my serial port is wired to the EXT_IO connector rather than to the on-board RS232, since my motherboard has no COM ports), specify the boot-loader-serial.mif file obtained in the previous step as the initialization file for the de2_115_sram_2048_32_byte_en memory, and load the bitstream into the board. Since, for reasons known only to its developer, the Amber processor does not implement reset logic, the only way to return the processor to its initial state is to reload the bitstream. If you hold down the KEY0 button while doing so, the processor will not start executing the program until the button is released. I used this button when debugging the Verilog code with SignalTap. Once you release it, though, only reloading the bitstream lets you start over.
After the bitstream is loaded, the Amber bootloader prompt appears immediately in a terminal configured for 921600 baud. Next, type the b 80000 command and send the previously generated Linux kernel file (arch/arm/boot/Image) via XMODEM, then type the b 78000 command and send the DTB file, which describes which devices to look for at which addresses, which drivers to load for them, how much RAM the system has, the kernel command line, and other information. I patched the bootloader so that it passes the address 0x78000 to the kernel as the location of the DTB, which is why we load it at this address.
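One consequence of these two load addresses is worth spelling out: the DTB sits directly below the kernel image, so it has to fit in the gap between them. The Python sketch below does the arithmetic; it assumes both numbers are hexadecimal physical addresses, as the 0x78000 above suggests, and the function names are made up for illustration.

```python
# Sketch: check that a DTB loaded at 0x78000 does not run into the kernel at 0x80000.
# Assumption: both bootloader arguments are hexadecimal physical addresses.
DTB_ADDR = 0x78000
KERNEL_ADDR = 0x80000


def max_dtb_size():
    """Bytes available for the DTB before the kernel image begins."""
    return KERNEL_ADDR - DTB_ADDR


def dtb_fits(dtb_size):
    """True if a DTB of the given size ends at or before the kernel load address."""
    return DTB_ADDR + dtb_size <= KERNEL_ADDR


if __name__ == "__main__":
    print(max_dtb_size(), dtb_fits(4096))
```

With these addresses the DTB may be at most 0x8000 bytes (32 KiB); a larger blob would be partly overwritten, or would overwrite the start of the kernel, depending on the order in which the files are sent.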

Finally, when both files are loaded into RAM (SDRAM), you can enter the j 80000 command in the bootloader console. Linux starts booting, and if everything was done correctly, the result will look something like this:

Our "Hello, World" was launched as the first user process (/sbin/init) and printed the cherished phrase to the screen through the standard library and the kernel. Cool, isn't it?
If you do not have a Terasic DE2-115 board or any other board with a sufficiently large FPGA, you can still run Linux in the Verilator simulator. To do this, add the -DAMBER_LOAD_MAIN_MEM=1 and -DAMBER_LOAD_DTB_MEM=1 options to hw/de2_115/tb/Makefile and rebuild the Vtb executable. Then, using the amber-bin2mem utility, we create the kernel and DTB files for the simulator:
$ amber-bin2mem arch/arm/boot/Image 80000 > vmlinux.mem
$ amber-bin2mem arch/arm/boot/dts/amber-de2115.dtb 78000 > dtb.mem
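To give an idea of what such a conversion involves, here is a rough Python sketch that turns a binary blob plus a base address into text lines, one 32-bit word per line. It is not the amber-bin2mem source: the "address data" line layout, the little-endian word order, and the function name are all assumptions made for illustration, and the real tool's output may look different.

```python
# Sketch of a binary-to-text memory image converter.
# Assumptions (not from the article): one "address data" pair of 8-digit hex
# numbers per line, and little-endian 32-bit words.
import struct


def bin_to_mem_lines(data, base_addr):
    """Return one text line per 32-bit word, addressed from base_addr upward."""
    padded = data + b"\x00" * (-len(data) % 4)  # pad to a whole number of words
    lines = []
    for offset in range(0, len(padded), 4):
        (word,) = struct.unpack_from("<I", padded, offset)
        lines.append(f"{base_addr + offset:08x} {word:08x}")
    return lines


if __name__ == "__main__":
    # The base address is given in hex, like the 80000 argument above.
    for line in bin_to_mem_lines(b"\x02\x00\x00\xea\x01", int("80000", 16)):
        print(line)
```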
In addition, you will need to adjust the bootloader code slightly for the simulation by commenting out the call to the main function, since in normal mode it waits for commands from the user. The loader will then transfer control to the Linux kernel immediately. Copy the *.mem files to the hw/de2_115/tb folder, run:
$ ./obj_dir/Vtb
and watch Linux boot.
Limitations, practical benefits
Of course, the Linux that we ended up booting is not quite like the one we are used to seeing on servers and workstations, because the Amber processor core has no MMU (Memory Management Unit). As a result there is no virtual memory (all memory is physical), no memory protection (any application can corrupt kernel memory or talk to devices directly over the Wishbone bus, bypassing the kernel), no copy-on-write, and so on. NOMMU Linux currently does not support ELF executables (although work on supporting the FDPIC ELF format is under way) or dynamic libraries; the bFLT (Binary Flat) format, a simple format based on a.out, is used instead. And if you run, say, N instances of an application on such a system, there will be exactly N copies of it in memory.
The work still has practical value: even such “trimmed-down” versions of Linux run in many devices based on resource-constrained microcontrollers. I hope that Habr readers who are keen on FPGA programming can learn something useful by experimenting with a full-fledged Linux on an FPGA-synthesized processor (which, incidentally, takes up only 8% of the DE2-115's capacity, or about 10,000 LEs). If you have another Altera or Xilinx board, porting to it should not be difficult, because most of the work has already been done. Of course, there are more practical solutions, such as Xilinx Zynq and Altera Cyclone V SoC, which combine a full-fledged ARM SoC with an FPGA on a single chip, but the solution presented in this article lets even owners of simple boards with modest FPGAs run Linux. The remaining free logic can be used to implement new custom peripherals, which can be “hung” on the Wishbone bus and made available to the OS through drivers.
Plans
The Terasic DE2-115 is truly one of the most powerful development boards, and interesting projects have already been built on it (this is the clearest example and another one). It has a wide range of peripherals on board:
- 128 MB SDRAM
- 8 MB SPI Flash
- LEDs and seven-segment indicators
- 16x2 liquid crystal display
- 24-bit audio codec
- SD card slot
- 2 Gigabit Ethernet ports
- VGA monitor output, PS/2 keyboard
- USB ports
Of all this wealth, I have used only the RAM in this project. In the future, if time permits, I want to compile U-Boot and place it in the on-board flash memory, then have the bootloader code in the FPGA load U-Boot, which would in turn load the Linux kernel and the root file system from the SD card. In addition, I would like to try to implement support for the peripherals available on the board, Ethernet for example.