
“Why upgrade the GCC compiler?” or “GCC performance on Intel Atom from version to version”


Let's look at what is new in the GCC compiler for Intel Atom processors and how it affects the performance and code size of the well-known EEMBC CoreMark benchmark.
The graph above shows the performance of CoreMark compiled with the peak and base option sets by different versions of GCC, relative to the base option set on GCC 4.4.6 (higher is better).
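The normalization behind such a graph can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the numbers are assumptions made up for the example, not measurements from this article.

```python
def relative_to_baseline(scores, baseline_key):
    """Divide every score by the baseline score (higher is better)."""
    baseline = scores[baseline_key]
    return {key: value / baseline for key, value in scores.items()}


# Placeholder values, NOT results from the article.
example_scores = {
    ("4.4.6", "base"): 1000.0,
    ("4.8", "base"): 1200.0,
}

relative = relative_to_baseline(example_scores, ("4.4.6", "base"))
for (version, option_set), ratio in sorted(relative.items()):
    print(f"GCC {version} {option_set}: {ratio:.2f}x")
```

The baseline entry always comes out as 1.0, so every other bar reads directly as a speedup or slowdown against GCC 4.4.6 with the base options.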

The following compiler options were used for testing:
base option set (base): “-O2 -ffast-math -mfpmath=sse -m32 -march=atom”
base option set (base) + if-conversion: “-O2 -ffast-math -mfpmath=sse -ftree-loop-if-convert -m32 -march=atom”
peak option set (peak): “-Ofast -funroll-loops -mfpmath=sse -m32 -march=atom”; for versions 4.4 and 4.5, “-Ofast” was replaced by “-O3 -ffast-math”
The optimal GCC options for x86 were described in more detail here . Note that the “-flto” option does not improve CoreMark performance.
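As a rough sketch, the three option sets and the “-Ofast” substitution for older compilers could be generated like this. The function and variable names are hypothetical, and the order of flags inside each set is not significant here.

```python
# Flags shared by all three option sets listed above.
COMMON_FLAGS = ["-mfpmath=sse", "-m32", "-march=atom"]


def parse_version(version):
    """Turn '4.8' or '4.4.6' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def option_set(kind, gcc_version):
    """Return the flag list for 'base', 'base+ifconv' or 'peak'."""
    if kind == "base":
        flags = ["-O2", "-ffast-math"]
    elif kind == "base+ifconv":
        flags = ["-O2", "-ffast-math", "-ftree-loop-if-convert"]
    elif kind == "peak":
        # GCC 4.4 and 4.5 have no -Ofast, so the article's
        # replacement "-O3 -ffast-math" is used instead.
        if parse_version(gcc_version)[:2] < (4, 6):
            flags = ["-O3", "-ffast-math"]
        else:
            flags = ["-Ofast"]
        flags.append("-funroll-loops")
    else:
        raise ValueError(f"unknown option set: {kind}")
    return flags + COMMON_FLAGS


for version in ("4.4.6", "4.8"):
    print(version, " ".join(option_set("peak", version)))
```

Keeping the sets in one place makes it harder to compare compiler versions with accidentally different flags.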

From the graph it is clear that the base option set with “-ftree-loop-if-convert” reaches the performance of the peak option set on CoreMark.

The graph below shows how much larger the CoreMark executable is when compiled with the peak option set relative to the base option set, for different versions of GCC:


The graph below shows the growth in the size of the CoreMark executable compiled by different versions of GCC with the base option set, relative to the base option set on GCC 4.4.6:



To measure code size, “-ffunction-sections -Wl,--gc-sections -fno-asynchronous-unwind-tables -Wl,--strip-all” were added to both the base and peak option sets. These options do not affect CoreMark performance.
Options for optimal executable size were described in more detail here .
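A minimal sketch of how a size-measurement build command could be assembled from an option set plus the size flags above. The helper name and the file names are hypothetical and stand in for the real CoreMark sources.

```python
import shlex

# Size-measurement flags quoted in the text.
SIZE_FLAGS = [
    "-ffunction-sections",
    "-Wl,--gc-sections",
    "-fno-asynchronous-unwind-tables",
    "-Wl,--strip-all",
]


def size_build_command(option_set, sources, output, compiler="gcc"):
    """Return a shell-ready compile command with the size flags appended."""
    argv = [compiler] + list(option_set) + SIZE_FLAGS + ["-o", output] + list(sources)
    return shlex.join(argv)


# Hypothetical file names, for illustration only.
base_flags = ["-O2", "-ffast-math", "-mfpmath=sse", "-m32", "-march=atom"]
print(size_build_command(base_flags, ["bench.c"], "bench"))
```

The size of the resulting binary can then be compared across compiler versions in the same way as the performance scores.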

The graphs show that code size with the peak option set is twice as large as with the base set and keeps growing. The base option set, by contrast, gives a slight decrease in code size.

All measurements were made with 1 thread on a 2-core Intel Atom CPU D525 at 1.80 GHz with 4 GB of memory, running the Fedora 17 operating system.

GCC showed very good progress from version 4.4 to version 4.8 (mainly from version 4.6 to version 4.7, and from “-ftree-loop-if-convert” on the base option set in version 4.8). Code size with the base option set remains unchanged, while with the peak set it grows.

Below is a brief description of options and changes in GCC from version to version:


What if, in GCC version 4.8, “-march=atom” only implied “-march=i686 -mtune=generic -mssse3”? CoreMark performance would drop by 5%. “-ftree-loop-if-convert” adds another 13% to the performance of the base option set.
If both code size and performance matter for your Atom application, switch to GCC version 4.8 and try compiling with the options:
“-O2 -ffast-math -mfpmath=sse -ftree-loop-if-convert -fschedule-insns -fsched-pressure -m32 -march=atom”
If only performance matters, GCC 4.8 is optimal with the options:
“-Ofast -flto -funroll-loops -mfpmath=sse -fschedule-insns -fsched-pressure -m32 -march=atom”

Source: https://habr.com/ru/post/188386/

