Hi Habr! After reading the recent article "Will Intel have to remove a function from the compiler that deliberately delivers bad code for AMD processors?", I was surprised not to find in its comments the one thing I was looking for: tests of “live” applications before and after applying the patch that disables the CPU dispatcher in code produced by the Intel compiler.

Having a few days off and being the not-so-happy owner of an AMD processor, I decided to study the issue in more detail. Namely: does the patch make a significant difference to the performance of real applications built with the Intel compiler?
I started by reading the original article in English, as well as the discussions of the topic on various forums. Surprisingly, even there I found no answer to my question: what is the actual gain in real applications? I did find out, however, that the registers used in the CPU dispatcher code vary with the compiler version. They are not limited to eax, ebx, ecx, edx and ebp, as assumed for example in the patch intel_patch-ppro.pl; the esi and edi registers can be used as well. Not being a Perl connoisseur and having no Perl interpreter on my computer, I decided to write a new patcher in Pascal (Virtual Pascal, though it should compile easily with Free Pascal too). The patcher and its source code can be found here:
icc_patch.rar
The second task I faced was how to determine whether a given program was compiled with the Intel compiler. It would seem that tools like PEiD should come to the rescue. I don’t know what is wrong with their signatures, but the answers they give are about as meaningful as the average distance from a beer stall to the moon. PEiD assured me, for instance, that a Delphi program with an empty form and a stripped relocation table had been written with the Borland C++ compiler, and so on. Of course, this option had to be abandoned. The second obvious solution is to go through all the EXE files on the computer and look in them for that very comparison, cmp eax, 'Genu', and the ones that follow it. But such checks appear in many programs that have nothing to do with the Intel compiler. So I had to be patient and search the “heavyweight” programs on my computer for code that approximately matches the following fragment, which was already shown in the comments to the previous article:
mov eax,[ebp][-0008]
cmp eax,0756E6547 ;"uneG" ; "Genu"
jne not_intel ; not equal: go to not_intel
mov eax,[ebp][-0010]
cmp eax,049656E69 ;"Ieni" ; "ineI"
jne not_intel ; not equal: go to not_intel
mov eax,[ebp][-0014]
cmp eax,06C65746E ;"letn" ; "ntel"
jne not_intel ; not equal: go to not_intel
mov edx,000000001 ; edx = 1: Intel processor
jmps next
not_intel:
xor edx,edx ; edx = 0: not an Intel processor
next:
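To make the constants in the listing easier to read, here is a minimal Python sketch that decodes the three immediates as little-endian 32-bit values. The names `IMMEDIATES` and `decode_dword` are mine, chosen for illustration. The stack slots being compared presumably hold the registers returned by CPUID leaf 0, which reports the vendor string in ebx, edx and ecx.

```python
import struct

# The three immediates from the listing, in the order they are compared.
IMMEDIATES = [0x756E6547, 0x49656E69, 0x6C65746E]


def decode_dword(value: int) -> str:
    """Return the four ASCII characters stored in a little-endian 32-bit value."""
    return struct.pack("<I", value).decode("ascii")


# x86 stores the least significant byte first, so each hex constant reads
# "backwards" compared with the text it represents in memory.
vendor = "".join(decode_dword(v) for v in IMMEDIATES)
print(vendor)  # GenuineIntel
```

This is why the disassembler shows "uneG" next to a constant that actually matches the text "Genu".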
As a result, out of all the software installed on my computer, only one such program was found and, by some miracle, it turned out to be a benchmark: CineBench R10 from Maxon.
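The search described above can be sketched roughly as follows. This is not the author's patcher, only an illustration under several assumptions: the three compares use the register-immediate form `cmp r32, imm32` (opcode `3D` for eax, or `81` with ModRM byte `F8 | reg` for any register), they appear in the same order as in the listing, and they lie within a hypothetical `window` of 64 bytes of each other. All function names are made up for this example.

```python
# Vendor-string pieces as they appear in memory inside the imm32 operand.
VENDOR_PARTS = [b"Genu", b"ineI", b"ntel"]


def cmp_encodings(imm: bytes) -> list:
    """All encodings of `cmp r32, imm32` for the eight general registers."""
    encodings = [b"\x3D" + imm]          # short form: cmp eax, imm32
    for reg in range(8):                 # long form: 81 /7, ModRM = 0xF8 | reg
        encodings.append(bytes([0x81, 0xF8 | reg]) + imm)
    return encodings


def find_first(data: bytes, patterns: list, lo: int, hi: int):
    """Earliest (offset, length) of any pattern in data[lo:hi], or None."""
    best = None
    for pattern in patterns:
        i = data.find(pattern, lo, hi)
        if i != -1 and (best is None or i < best[0]):
            best = (i, len(pattern))
    return best


def find_vendor_checks(data: bytes, window: int = 64) -> list:
    """Offsets where the three compares occur in order, close together."""
    hits = []
    pos = 0
    while True:
        first = find_first(data, cmp_encodings(VENDOR_PARTS[0]), pos, len(data))
        if first is None:
            break
        offset, length = first
        cursor = offset + length
        matched = True
        for part in VENDOR_PARTS[1:]:
            nxt = find_first(data, cmp_encodings(part), cursor, cursor + window)
            if nxt is None:
                matched = False
                break
            cursor = nxt[0] + nxt[1]
        if matched:
            hits.append(offset)
        pos = offset + 1                 # keep scanning after this candidate
    return hits
```

To check a file, read it in binary mode and pass the bytes to `find_vendor_checks`. A hit only means the file contains a vendor-string check; as noted above, such checks also occur in programs that were not built with the Intel compiler, so every hit still needs to be inspected by hand.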

It contains exactly the code fragments shown above. Without further ado, we run the patcher on the program's main EXE, not forgetting to patch all of the program's libraries as well, which, by the way, have the non-standard .CDL extension. The test itself measures image rendering speed and reports the result in its own “parrots” (arbitrary benchmark points). Now it's time to look at the chart:

Three runs of the program were made with the patch and without it. As the chart shows, the results do differ. In reality, though, they differ very little; the scale of the chart exaggerates the gap somewhat. For example, in terms of render time, the patched program takes approximately 7 minutes 30-40 seconds, while the unpatched benchmark takes between 7 minutes 50 seconds and 8 minutes. The average difference is no more than 15-20 seconds.
What do we see in the end? Yes, there is some difference, but it is so small that it can safely be neglected. If the difference does not exceed 20 seconds on a large (long-running) task, then on simple (office) tasks it will not be noticeable at all.
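To put the render times quoted above in relative terms, here is a small Python calculation. It uses the midpoints of the approximate ranges given in the text, which is my assumption; the individual run times are not listed in the article.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


def midpoint(low: int, high: int) -> float:
    """Middle of an approximate time range, in seconds."""
    return (low + high) / 2


# Approximate render-time ranges quoted in the text.
patched = midpoint(to_seconds(7, 30), to_seconds(7, 40))     # 455.0 s
unpatched = midpoint(to_seconds(7, 50), to_seconds(8, 0))    # 475.0 s

saved = unpatched - patched              # seconds saved by the patch
relative = saved / unpatched * 100       # share of the unpatched render time

print(f"{saved:.0f} s saved, {relative:.1f}% of the unpatched time")
# prints "20 s saved, 4.2% of the unpatched time"
```

Under this assumption the gain is roughly 4% of the render time, which matches the impression that the difference is measurable but small.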
On the other hand, it is important not to fool ourselves: nothing prevented the developers of this program from writing all the critical loops in pure assembler, which would make them independent of the compiler. I suspect that this is exactly the case, so unfortunately this program does not show the full picture.
What did I learn for myself along the way? First, an ordinary user's computer has very few programs that are used every day and were compiled with the Intel compiler. To be more precise, vanishingly few. Second, it is very difficult to evaluate performance gains on other people's programs, because you do not know how they work inside. The trial version of the compiler is slowly downloading right now, so that I can run the experiment directly on my own programs. Only then will it be possible to draw more or less reliable conclusions.
The purpose of this article is not to incite holy wars about Intel vs AMD or the like. The goal is to find out and understand whether that notorious CPU dispatcher really affects program execution time as strongly as the original article claims. And I would like to do this with your help, %username%.