I would like to invite the community to take part in a trial experiment. Its essence is to run a program written in C ++ on your computer and share the result of measuring the time it gives by comparing the speed of the functions sign (x), abs (x), min (a, b) and max ( a, b) performed with and without branching. In the article I will explain my motivation, show the functions themselves, and at the end I will offer the conditions for participation in the experiment and its (alas) limitations.
Motivation
Among programmers, especially those who learned to program somewhere in the late 90s, there is still a vividly held belief that an algorithm without branching is better than an algorithm with branching. I often found extreme fanaticism in trying to implement a number of simple functions without branching only because it seems to the programmer that it would be more efficient. Among these simple functions, I have highlighted four so far: the sign of the number, the absolute value of the number, the minimum and the maximum (in two versions: for numbers with a sign and without a sign). If my experiment is successful, I will undertake many other algorithms.
Is it true that without IF is faster?
No that's not true. To be more precise, getting rid of branches can both increase speed and aggravate the situation several times. It all depends on the specific machine on which this code will be executed, and on the compiler. I will cite as an example my old Core 2 Duo E8400 @ 3 GHz computer - all the proposed functions work faster in the IF version, when the data is streamlined and the situation can be reversed if the feed is chaotic (see details below). I have a compiler for VC ++ 2015, a compilation option / Ox.
Let's take a closer look at the functions that I propose to test. Each of the four functions can be written in half a dozen variants, all of which you can explore yourself using the links [1-7] at the end of the article. I have already tested all these options and selected two for each function: classical and one without branches (the best in my measurements). If, again, my experiment proves successful, I will make a similar comparison in general of all the options known to me [4-7].
')
To begin with, I enter my pseudonyms for data types and a constant to shift:
typedef int32_t i32;
typedef uint32_t u32;
typedef int8_t sign_t;
const u32 SHIFT = 31;
– sign
:
sign_t sign0 (i32 a) {
if (a>0) return +1;
if (a<0) return -1;
return 0;
}
:
sign_t sign1 (i32 a) {
return (a >> SHIFT) | ((u32)(-a) >> SHIFT);
}
, , .
: .
. 2,87 , 3,97 13,02 vs 1,26 .
– abs
. IF, , , :
u32 abs0 (i32 a) {
if (a<0) return -a;
return a;
}
( ) :
u32 abs1 (i32 a) {
const i32 b = a >> SHIFT;
return (a+b) ^ b;
}
2,29, — 2,33 12,01 vs 0,81 . : , abs, [3]. , IF, , . .
— mini maxi
- , :
i32 mini0 (i32 a, i32 b) {
return a<b ? a:b;
}
i32 maxi0 (i32 a, i32 b) {
return a>b ? a:b;
}
( ) :
i32 mini0 (i32 a, i32 b) {
return a>b ? b:a;
}
i32 maxi0 (i32 a, i32 b) {
return a<b ? b:a;
}
, , . , , (a b, ). , . , , inc ecx , lea eax, [eax+1], , .
( ):
i32 mini1 (i32 a, i32 b) {
i32 d = a-b;
return a - (d&(~(d^((a^b)&(d^a))) >> SHIFT));
}
i32 maxi1 (i32 a, i32 b) {
i32 d = a-b;
return b + (d&(~(d^((a^b)&(d^a))) >> SHIFT));
}
. 3,5 , – 9 2 vs 6 . .
, x86 cmovl, . , , , . , , , , .
— minu maxu
, i32 u32:
u32 minu0 (u32 a, u32 b) {
return a>b ? b:a;
}
u32 maxu0 (u32 a, u32 b) {
return a<b ? b:a;
}
:
u32 minu1 (u32 a, u32 b) {
u32 d = a-b;
return a - (d&~(int((~a&b)|(~(a^b)&d)) >> SHIFT));
}
u32 maxu1 (u32 a, u32 b) {
u32 d = a-b;
return b + (d&~(int((~a&b)|(~(a^b)&d)) >> SHIFT));
}
: 3,5 , 9,5 2 6 .
, .
(GitHub) (
meduzik ivas,
— . , — ). . , , , . , .
, STDOUT ( )
sign: 2.87 vs 3.97
abs: 2.29 vs 2.33
mini: 3.46 vs 8.93
maxi: 3.45 vs 9.10
minu: 3.45 vs 9.46
maxu: 3.45 vs 9.81
IF, – . STDERR , , .
:
sign: 13.02 vs 1.26
abs: 12.01 vs 0.81
mini: 1.89 vs 5.97
maxi: 1.93 vs 6.31
minu: 1.89 vs 6.44
maxu: 1.92 vs 6.70
: ( , ) , . , .
, , , .
- . , std::chrono. , , - ++. , Visual C++ 2015 GCC 4.8.1 ( MinGW) OS Windows 7 (64). 32- . SO, , .
- — .
- , . . chrono . runexe, , .
, , .
- Hacker's Delight
- Bit Twiddling Hacks
- Optimized abs function
- sign(x) —
- abs(x) —
- min(a,b) max(a,b)
- min(a,b) max(a,b)
meduzik ivas, , , .