
Multiclet: first practical tests and performance

A Multiclet debug board came into my hands, and I want to share the results of testing it. I will also describe a few pitfalls that can initially fray the nerves of anyone who wants to try the Multiclet in person.

I should note right away that I consider only development in C (not assembler), because nowadays programmers' time costs more than megahertz and memory. The Multiclet C compiler has had a hard life and is in its infancy at the moment (in particular, no optimizations have been implemented). The developers promise to fix this by the middle or end of the year.

Hardware


This is the HW1-MCp04 debug kit (the older and more expensive one). The processor here runs at 80 MHz. RS232 interfaces are wired out, and USB (1.1, Full-Speed, 12 Mbit/s) and LAN (10/100 Mbit/s) controllers are installed. There is no software support for USB or LAN at the moment.

"IDE"

The Multiclet IDE is PSPad with hotkeys for compiling and for loading a binary onto the debug board. Such an IDE seemed useless to me, but fortunately you can build a project and upload firmware to the board with scripts:

Compilation:
MultiClet\SDK\shell\MultiClet\build_project.cmd <  > 

The compilation script did not work out of the box: it did not specify the path to the platform-dependent includes. You can add it yourself:
 rem      set CPP_KEYS=%CPP_KEYS% -Wp-I.. -Wp-I"%INCDIR%" -Wp-I"[.. ..]\MultiClet\Projects\inc\c" 
Apparently the developers of the bundled C examples ran into this problem as well, since some of the examples either paste in the contents of the required includes or carry copies of the SDK header files.

Uploading firmware to the board:
 MultiClet\SDK\bin\mc-ploader <   > 
You need to hold down the reset button on the board while uploading.

For firmware upload to work, you need to install drivers for PicoTap. Unfortunately, the PicoTap drivers are not signed (!?!), so loading them is difficult. Even if you disable driver signature verification, Windows (8 x64) blocks them because of a known incompatibility.

The solution is to choose the FTDI driver for PicoTap and ignore the Windows warning that it is not sure this driver will work. This gives hope that building a homemade adapter around an FTDI chip is quite feasible.

There is no in-circuit debugging.

Writing Hello World

We will output messages over RS232. When connecting to a computer with a standard male-to-female cable, you need to swap the RX and TX pins (2 and 3). Take the standard uart example and reconfigure it for 115200 baud:

 void uart_init(UART_TypeDef *UART)
 {
     int port, bitrate, control;
     port = 0x00000300;    //alternative port function for uart0
     bitrate = 0x56;       //115200 bps
     control = 0x00000003; //rx, tx enable
     GPIOB->BPS = port;
     UART->BDR = bitrate;
     UART->CR = control;
 }
The bitrate value is calculated as 80 MHz / 115200 / 8 ≈ 86.8, so either 86 (0x56) or 87 can be used.
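The divisor arithmetic can be sketched in plain C. This is a minimal sketch that assumes the divisor is simply clock / (baud * 8), as in the formula above; the function names are hypothetical and are not part of the Multiclet SDK.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the Multiclet SDK.
   Assumption: divisor = clock / (baud * 8), as in the article's formula. */

/* Divisor rounded down. */
uint32_t uart_divisor_floor(uint32_t clock_hz, uint32_t baud)
{
    return clock_hz / (baud * 8u);
}

/* Divisor rounded to the nearest integer. */
uint32_t uart_divisor_nearest(uint32_t clock_hz, uint32_t baud)
{
    uint32_t step = baud * 8u;
    return (clock_hz + step / 2u) / step;
}

/* Baud rate actually produced by a divisor, under the same assumption
   (integer division, so the result is truncated). */
uint32_t uart_actual_baud(uint32_t clock_hz, uint32_t divisor)
{
    return clock_hz / (divisor * 8u);
}
```

Under this assumption, at 80 MHz the divisor 86 (0x56) gives roughly 116279 baud (about 0.9% fast) and 87 gives roughly 114942 baud (about 0.2% slow).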

We add functions for sending a character (waiting while the UART buffer is full) and for sending a string:
 void uart_send_with_delay(char byte, UART_TypeDef *UART)
 {
     while(uart_fifo_full(UART) == 1); //wait until there is room in the FIFO
     uart_send_byte(byte, UART);
 }

 void uart_puts(char *msg, UART_TypeDef *UART)
 {
     while(*msg)
         uart_send_with_delay(*msg++, UART);
 }

and finally:
 void main()
 {
     uart_init(UART0); //config uart0
     uart_puts("Hello world from Multiclet!!\r\n", UART0);
 }
Connect to the COM port with any convenient terminal and get the expected result.

Practical performance

Take the simplest test program:
 float i,j,result;
 for(i=0;i<1;i+=0.0002)
     for(j=0;j<1;j+=0.0002)
     {
         result+=i*j;
     }

The body of the inner loop executes 25 million times. On a Multiclet at 80 MHz this code runs for 20.3 seconds with the current compiler. Let us work out what performance of an abstract classical processor these figures correspond to: 25 million iterations * ~5 operations per iteration / 20.3 seconds = 6.1 million operations per second.
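The same estimate, written out as a small C sketch. The function names are hypothetical, and the figure of ~5 operations per iteration is the rough approximation used above, not a measured value.

```c
/* Rough throughput estimate: total operations divided by run time.
   ops_per_iteration is an approximation, not a measurement. */
double ops_per_second(double iterations, double ops_per_iteration, double seconds)
{
    return iterations * ops_per_iteration / seconds;
}

/* Average number of clock cycles spent per operation at a given clock. */
double cycles_per_op(double clock_hz, double ops_per_sec)
{
    return clock_hz / ops_per_sec;
}
```

With 25e6 iterations, 5 operations and 20.3 seconds this gives about 6.16 million operations per second, which at 80 MHz works out to roughly 13 clock cycles per operation.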

That is, performance is currently at the level of an abstract non-superscalar processor running at 5-10 MHz. Of course, performance will improve significantly as the compiler evolves.

If we help the compiler a little and unroll the loop by hand:
 for(i=0;i<1;i+=0.0002) //8 seconds
     for(j=0;j<1;j+=0.0008)
     {
         result+=i*j+i*(j+0.0002)+i*(j+0.0004)+i*(j+0.0006);
     }
then the test runs in 8 seconds; if the expression is simplified to result += i*(j*4 + 0.0012), it runs in 6.8 seconds.
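That the three variants compute the same sum can be checked on a desktop machine. The sketch below is a host-side check, not board code: it deliberately uses integer loop counters and double precision (unlike the float counters in the test above) so that the iteration count is exact and rounding differences stay small. All names are hypothetical.

```c
#define STEPS 5000      /* 1 / 0.0002 iterations per loop */
#define STEP  0.0002

/* Original loop: one multiply-accumulate per inner iteration. */
double sum_plain(void)
{
    double result = 0.0;
    for (int a = 0; a < STEPS; a++) {
        double i = a * STEP;
        for (int b = 0; b < STEPS; b++) {
            double j = b * STEP;
            result += i * j;
        }
    }
    return result;
}

/* Inner loop unrolled by hand four times. */
double sum_unrolled(void)
{
    double result = 0.0;
    for (int a = 0; a < STEPS; a++) {
        double i = a * STEP;
        for (int b = 0; b < STEPS; b += 4) {
            double j = b * STEP;
            result += i * j + i * (j + 0.0002) + i * (j + 0.0004) + i * (j + 0.0006);
        }
    }
    return result;
}

/* Unrolled body simplified algebraically to i * (4j + 0.0012). */
double sum_simplified(void)
{
    double result = 0.0;
    for (int a = 0; a < STEPS; a++) {
        double i = a * STEP;
        for (int b = 0; b < STEPS; b += 4) {
            double j = b * STEP;
            result += i * (j * 4 + 0.0012);
        }
    }
    return result;
}
```

All three functions should return approximately 6247500.25, differing only by floating-point rounding.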

Theoretical performance

It has finally become clear what the theoretically achievable performance of the Multiclet is with perfect parallelization at 100 MHz:

2.4 GFLOPS: if all 4 cells perform only complex multiplication and nothing else.
800 MFLOPS: if all 4 cells perform the other arithmetic operations in packed form (i.e., the same operation is applied to both 32-bit halves).
400 MFLOPS: if each operation is needed only once rather than in pairs (as is usually the case in non-computational code).

Finally, if the work cannot be spread across all 4 cells, only 150-300 MFLOPS can be expected.
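The three peak figures in the list above are consistent with a simple product of cells, clock frequency and floating-point operations per cell per cycle. The sketch below is an assumption about how those numbers are derived, not something the developers have confirmed: it counts a complex multiplication as 6 FLOPs (4 multiplications and 2 additions), a packed operation as 2 and a single operation as 1. The function name is hypothetical.

```c
#include <stdint.h>

/* Peak FLOPS = cells * clock * FLOPs issued per cell per cycle.
   Assumed per-cycle counts: 6 for a complex multiply
   (4 multiplications + 2 additions), 2 for a packed operation,
   1 for a single operation. */
uint64_t peak_flops(uint64_t cells, uint64_t clock_hz, uint64_t flops_per_cell_cycle)
{
    return cells * clock_hz * flops_per_cell_cycle;
}
```

With 4 cells at 100 MHz this gives 2.4 GFLOPS, 800 MFLOPS and 400 MFLOPS for 6, 2 and 1 FLOPs per cycle respectively.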

Hand-written, optimized Fourier transform code with almost perfect parallelization and with the save/load blocks cut out (the developers assure us that these can be optimized in the future) gives ~1.2 GFLOPS. This is less than 2.4 simply because not all operations are complex multiplications: additions, subtractions and other operations are needed as well.

Power consumption

At maximum load:
1.8V - 0.39A
3.3V - 7.2mA
Accordingly, power consumption is about 0.725 W at 80 MHz (it will be higher at 100 MHz).

With reset held down, consumption drops to 0.3 A on the 1.8 V rail and 0.8 mA on the 3.3 V rail.
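The power figures follow from P = U * I summed over the two rails. A minimal sketch of that arithmetic, with hypothetical function names:

```c
/* Power drawn on one supply rail, in watts: P = U * I. */
double rail_power_w(double volts, double amps)
{
    return volts * amps;
}

/* Total board power from the currents measured on the 1.8 V and 3.3 V rails. */
double total_power_w(double amps_1v8, double amps_3v3)
{
    return rail_power_w(1.8, amps_1v8) + rail_power_w(3.3, amps_3v3);
}
```

For the measurements above, total_power_w(0.39, 0.0072) is about 0.726 W at full load and total_power_w(0.3, 0.0008) is about 0.543 W with reset held, so holding reset saves only about a quarter of the power.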

Summary

With the C compiler in its current state, performance is fatally low (equivalent to a 5-10 MHz abstract non-superscalar processor) because the generated code is not optimized. All hope rests on the compiler developers finishing it during 2013; then the Multiclet will be able to compete with other domestic designs.

P.S. Please do not count me among the opponents of the Multiclet: I am all for it working perfectly and beating everyone, and for domestic microelectronics in general.

P.P.S. As for the question of how remote access should work, I am quite serious: so far I have no ideas other than flashing the binary and sending back everything that comes out of RS232.

Source: https://habr.com/ru/post/165043/

