Porting Qt to STM32

Good day! We at the Embox project have got Qt running on the STM32F7-Discovery and would like to tell you about it. Earlier we described how we managed to run OpenCV.

Qt is a cross-platform framework that includes not only graphical components but also things like QtNetwork, a set of classes for working with databases, Qt for Automation (including IoT) and much more. The Qt developers anticipated its use in embedded systems, so the libraries are quite configurable. However, until recently few people thought about porting Qt to microcontrollers, probably because the task looks hard: Qt is large and MCUs are small.

On the other hand, there are now microcontrollers designed for multimedia work that outperform the first Pentiums. About a year ago a post appeared on the Qt blog: the developers had ported Qt to RTEMS and run widget examples on several STM32F7-based boards. This caught our interest. It was noticeable, and the developers themselves say so, that Qt is slow on the STM32F7-Discovery. We wondered whether we could run Qt under Embox, and not just draw a widget but run an animation.

Qt 4.8 was ported to Embox long ago, so we decided to try with it. We chose the moveblocks application, an example of springy animation.

Qt moveblocks on QEMU

To begin with, we configure Qt with the minimum set of components required to support animation. For this there is the option “-qconfig minimal, small, medium ...”. It pulls in a Qt configuration file with many macros that say what to enable and what to disable. After this option we add further flags to the configuration if we want to disable something else. Here is an example of our configuration.
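To illustrate how such a configuration file trims the build, here is a minimal sketch of a feature switch. Real Qt switches follow the QT_NO_<FEATURE> naming pattern; the macro and function names below are made up for the illustration and are not taken from Qt or from our configuration.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for one line of a qconfig profile header. DEMO_NO_TOOLTIP is a
// hypothetical name that only mimics the QT_NO_<FEATURE> pattern.
#define DEMO_NO_TOOLTIP

// Code guarded by a disabled feature is removed by the preprocessor,
// so it takes no space in the final image.
const char *compiledFeatures() {
#ifdef DEMO_NO_TOOLTIP
    return "animation";
#else
    return "animation tooltip";
#endif
}
```

With the macro defined, compiledFeatures() reports only the animation feature; removing the #define brings the second feature back without touching the rest of the code.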

For Qt to work, you need to add an OS compatibility layer. One way is to implement QPA (Qt Platform Abstraction). We based ours on the existing fb_base plugin shipped with Qt, which also underlies the QPA for Linux. The result is a small plugin, emboxfb, which hands Qt the Embox framebuffer; Qt then draws into it without any further help.

Here is how the plugin is created:

```cpp
QEmboxFbIntegration::QEmboxFbIntegration()
    : fontDb(new QGenericUnixFontDatabase())
{
    struct fb_var_screeninfo vinfo;
    struct fb_fix_screeninfo finfo;
    const char *fbPath = "/dev/fb0";

    fbFd = open(fbPath, O_RDWR);
    if (fbFd < 0) { // open() returns a negative descriptor on failure
        qFatal("QEmboxFbIntegration: Error open framebuffer %s", fbPath);
    }

    // Query the fixed (line length) and variable (resolution, format) parameters
    if (ioctl(fbFd, FBIOGET_FSCREENINFO, &finfo) == -1) {
        qFatal("QEmboxFbIntegration: Error ioctl framebuffer %s", fbPath);
    }
    if (ioctl(fbFd, FBIOGET_VSCREENINFO, &vinfo) == -1) {
        qFatal("QEmboxFbIntegration: Error ioctl framebuffer %s", fbPath);
    }

    fbWidth = vinfo.xres;
    fbHeight = vinfo.yres;
    fbBytesPerLine = finfo.line_length;
    fbSize = fbBytesPerLine * fbHeight;
    fbFormat = vinfo.fmt;

    // Map the framebuffer memory so that Qt can draw into it directly
    fbData = (uint8_t *)mmap(0, fbSize, PROT_READ | PROT_WRITE, MAP_SHARED, fbFd, 0);
    if (fbData == MAP_FAILED) {
        qFatal("QEmboxFbIntegration: Error mmap framebuffer %s", fbPath);
    }
    if (!fbData || !fbSize) {
        qFatal("QEmboxFbIntegration: Wrong framebuffer: base = %p,"
               "size=%d", fbData, fbSize);
    }

    mPrimaryScreen = new QEmboxFbScreen(fbData, fbWidth, fbHeight, fbBytesPerLine,
                                        emboxFbFormatToQImageFormat(fbFormat));
    mPrimaryScreen->setPhysicalSize(QSize(fbWidth, fbHeight));
    mScreens.append(mPrimaryScreen);

    this->printFbInfo();
}
```

And this is what the redraw looks like:

```cpp
QRegion QEmboxFbScreen::doRedraw()
{
    QVector<QRect> rects;
    // Let the base class repaint its off-screen image and report what changed
    QRegion touched = QFbScreen::doRedraw();

    DPRINTF("QEmboxFbScreen::doRedraw\n");

    if (!compositePainter) {
        compositePainter = new QPainter(mFbScreenImage);
    }

    // Copy only the changed rectangles into the framebuffer image
    rects = touched.rects();
    for (int i = 0; i < rects.size(); i++) {
        compositePainter->drawImage(rects[i], *mScreenImage, rects[i]);
    }
    return touched;
}
```
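The point of the redraw above is that only the touched rectangles are copied to the framebuffer, not the whole screen. Stripped of Qt, the idea can be sketched roughly as follows. The Rect and Surface types and the copyRect function are hypothetical stand-ins for this illustration, and the sketch assumes both surfaces share the same size, line length and pixel format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for QRect and a raw image; these are not Qt types.
struct Rect { int x, y, width, height; };

struct Surface {
    std::uint8_t *data;
    int width, height;
    std::size_t bytesPerLine;   // may be larger than width * bytesPerPixel
    std::size_t bytesPerPixel;
};

// Copy one rectangle from src to dst row by row. Both surfaces are assumed
// to have identical geometry, and the rectangle must lie inside them.
void copyRect(const Surface &src, Surface &dst, const Rect &r) {
    assert(r.x >= 0 && r.y >= 0);
    assert(r.x + r.width <= src.width && r.y + r.height <= src.height);
    const std::size_t rowBytes = static_cast<std::size_t>(r.width) * src.bytesPerPixel;
    for (int row = 0; row < r.height; ++row) {
        const std::size_t offset =
            static_cast<std::size_t>(r.y + row) * src.bytesPerLine +
            static_cast<std::size_t>(r.x) * src.bytesPerPixel;
        std::memcpy(dst.data + offset, src.data + offset, rowBytes);
    }
}
```

The amount of memory traffic per frame is then proportional to the area that actually changed, which matters a lot when the framebuffer sits in slow memory.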

As a result, with the compiler optimizing for size (-Os), the library image came to 3.5 MB, which of course does not fit into the main memory of the STM32F746. As we wrote in our other article about OpenCV, this board has:

1 MB ROM
320 KB RAM
8 MB SDRAM
16 MB QSPI

Since support for executing code from QSPI had already been added for OpenCV, we decided to start by loading the whole Embox image with Qt into QSPI. And hooray, it started from QSPI almost immediately! But as with OpenCV, it turned out to run too slowly.

So we decided to do it this way: first write the image to QSPI, then load it into SDRAM and execute it from there. Running from SDRAM was a little faster, but still far from QEMU.

The next idea was to enable floating point, since Qt does some calculations on the coordinates of the squares in the animation. We tried it but got no visible speedup, although in their article the Qt developers claimed that the FPU gives a significant speed increase for “dragging animation” on the touchscreen. Perhaps moveblocks does significantly fewer floating-point calculations; this depends on the specific example.

The most effective idea was to move the framebuffer from SDRAM to internal memory. To do this, we made the screen 272x272 instead of 480x272. We also lowered the color depth from A8R8G8B8 to R5G6B5, reducing the size of one pixel from 4 bytes to 2. The resulting framebuffer size is 272 * 272 * 2 = 147968 bytes. This gave a significant speedup, perhaps the most noticeable one: the animation became almost smooth.
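The arithmetic behind this choice can be checked with a short sketch. It assumes a tightly packed framebuffer with no line padding; the function and constant names are ours for this illustration only. The second function shows one common way to pack an A8R8G8B8 pixel into R5G6B5 by dropping alpha and the low bits of each channel, which is not necessarily how Qt performs the conversion internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes needed for a tightly packed framebuffer (no line padding assumed).
constexpr std::size_t framebufferBytes(std::size_t width, std::size_t height,
                                       std::size_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

constexpr std::size_t kInternalRamBytes = 320 * 1024;  // 320 KB RAM from the list above

// Full screen in A8R8G8B8: larger than the whole internal RAM.
static_assert(framebufferBytes(480, 272, 4) == 522240, "480x272 at 4 bytes per pixel");
static_assert(framebufferBytes(480, 272, 4) > kInternalRamBytes, "does not fit");

// Reduced 272x272 screen in R5G6B5: the figure quoted in the text.
static_assert(framebufferBytes(272, 272, 2) == 147968, "272x272 at 2 bytes per pixel");
static_assert(framebufferBytes(272, 272, 2) < kInternalRamBytes, "fits");

// Pack one A8R8G8B8 pixel into R5G6B5.
constexpr std::uint16_t toRgb565(std::uint32_t argb) {
    return static_cast<std::uint16_t>(((argb >> 8) & 0xF800u) |  // top 5 bits of red
                                      ((argb >> 5) & 0x07E0u) |  // top 6 bits of green
                                      ((argb >> 3) & 0x001Fu));  // top 5 bits of blue
}
```

The full-screen 32-bit buffer needs 522240 bytes, more than all 320 KB of internal RAM, while the reduced buffer takes less than half of it and leaves room for everything else that lives there.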

The last optimization was the execution of Embox code from RAM, and Qt from SDRAM. To do this, we first link Embox statically with Qt, as usual, but place the library's text, rodata, data and bss segments in QSPI so that they can later be copied into SDRAM.

```
section (qt_text, SDRAM, QSPI)
phdr    (qt_text, PT_LOAD, FLAGS(5))

section (qt_rodata, SDRAM, QSPI)
phdr    (qt_rodata, PT_LOAD, FLAGS(5))

section (qt_data, SDRAM, QSPI)
phdr    (qt_data, PT_LOAD, FLAGS(6))

section (qt_bss, SDRAM, QSPI)
phdr    (qt_bss, PT_LOAD, FLAGS(6))
```
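The copy step itself is not shown in this article. As a rough sketch of the idea, assuming the load and run addresses of each section are known at startup, it could look like the code below. The Region type and relocateRegions function are hypothetical names for this illustration, not Embox API; on real hardware the addresses would come from linker-defined symbols, and both QSPI and SDRAM would have to be initialized before the copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical descriptor of one relocated region.
struct Region {
    const std::uint8_t *loadAddr;  // where the bytes are stored (QSPI)
    std::uint8_t *runAddr;         // where the code or data is used (SDRAM)
    std::size_t size;
    bool zeroFill;                 // true for bss: nothing to copy, just clear
};

// Copy or clear every region before any code from those regions runs.
void relocateRegions(const Region *regions, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        const Region &r = regions[i];
        if (r.zeroFill) {
            std::memset(r.runAddr, 0, r.size);
        } else {
            assert(r.loadAddr != nullptr);
            std::memcpy(r.runAddr, r.loadAddr, r.size);
        }
    }
}
```

In this scheme text, rodata and data would each get a copied region and bss a zero-filled one, matching the four sections listed above.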

Executing the Embox code from ROM also gave a noticeable speedup. As a result, the animation turned out quite smooth.

At the very end, while preparing this article and trying different Embox configurations, we found that Qt moveblocks works fine from QSPI with the framebuffer in SDRAM, and that the bottleneck was the size of the framebuffer! Apparently, a 2x speedup was enough to get past the initial “slideshow”, and simply shrinking the framebuffer provided it. Moving only the Embox code into various fast memories could not achieve this: the speedup was about 1.5x rather than 2x.

How to try it yourself

If you have an STM32F7-Discovery, you can run Qt under Embox yourself. Our wiki describes how to do it.

Conclusion

In the end, we managed to run Qt! The complexity of the task is, in our opinion, somewhat exaggerated. Naturally, you need to take the specifics of microcontrollers into account and have a general understanding of computer architecture. The optimization results point to the well-known fact that the bottleneck in a computing system is not the processor but memory.

This year we will take part in the TechTrain festival, where we will describe and show Qt and OpenCV on microcontrollers and our other achievements in more detail.

Source: https://habr.com/ru/post/459730/
