C/C++ code optimization

This article is a free translation of the article Optimizing C++/Code optimization/Faster operations. The original can be found at the link. The first part is here.

Part 2

Prefix or postfix operator

Prefer the prefix operator to the postfix operator. With primitive types, prefix and postfix arithmetic are likely to perform the same. With objects, however, a postfix operator has to do two things: apply the side effect of the operation and return the object's initial state as the result. To preserve that initial state, the object usually has to create a copy of itself. Consider the following example:

class IntegerIncreaser
{
    int m_Value = 0;

public:
    /* Postfix operator: copies the old state so that it can be returned. */
    IntegerIncreaser operator++ (int) {
        IntegerIncreaser tmp (*this);
        ++m_Value;
        return tmp;
    }

    /* Prefix operator: modifies in place and returns a reference, no copy. */
    IntegerIncreaser& operator++ () {
        ++m_Value;
        return *this;
    }
};

Since a postfix operator is required to return the unmodified version of the value being incremented (or decremented), regardless of whether the result is actually used, it will most likely make a copy. STL iterators, for example, are more efficient when modified with prefix operators.
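The difference can be made visible by counting copies. The sketch below is only an illustration: the type CountingCounter, the global g_copies and the two helper functions are hypothetical names that do not appear in the original article.

```cpp
#include <cassert>

// Hypothetical global that counts copy-constructor calls.
int g_copies = 0;

// Hypothetical counter type used only to observe copies.
struct CountingCounter {
    int value = 0;

    CountingCounter() = default;
    CountingCounter(const CountingCounter& other) : value(other.value) { ++g_copies; }

    // Prefix: modify in place and return a reference, no copy.
    CountingCounter& operator++() { ++value; return *this; }

    // Postfix: the old state has to be saved, so at least one copy is made.
    CountingCounter operator++(int) {
        CountingCounter old(*this);
        ++value;
        return old;
    }
};

// Returns how many copies n prefix increments made.
int copies_with_prefix(int n) {
    CountingCounter c;
    g_copies = 0;
    for (int i = 0; i < n; ++i) ++c;
    assert(c.value == n);
    return g_copies;
}

// Returns how many copies n postfix increments made.
int copies_with_postfix(int n) {
    CountingCounter c;
    g_copies = 0;
    for (int i = 0; i < n; ++i) c++;
    assert(c.value == n);
    return g_copies;
}
```

Here copies_with_prefix(n) reports no copies, while copies_with_postfix(n) reports at least n. The exact number depends on whether the compiler elides the copy of the returned temporary.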

Inline functions

If you do not use whole-program optimization options, which allow the compiler to inline any function, it makes sense to move some functions into header files and declare them inline.

According to the compiler manuals (for example, the GCC manual, section "5.34 An Inline Function is As Fast As a Macro"), an inline function is as fast as a macro and faster than an ordinary function, because the call overhead is eliminated. Keep in mind, however, that not every function becomes faster when inlined, and some functions declared inline can slow down the entire program.
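As a minimal sketch of this advice, a small and frequently called function can be defined in a header with the inline keyword, so that every source file including the header sees the body. The header name math_utils.h and the functions clamp_to_range and sum_clamped are hypothetical and serve only as an illustration.

```cpp
// math_utils.h (hypothetical header name)
#ifndef MATH_UTILS_H
#define MATH_UTILS_H

#include <cassert>

// Defined in the header and marked inline, so each translation unit
// that includes this file can see the body and inline the call.
inline int clamp_to_range(int value, int low, int high) {
    assert(low <= high);
    if (value < low) return low;
    if (value > high) return high;
    return value;
}

#endif  // MATH_UTILS_H

// A caller in some source file: the call in the loop body is a
// candidate for inlining even without whole-program optimization.
int sum_clamped(const int* data, int count, int low, int high) {
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        sum += clamp_to_range(data[i], low, high);
    }
    return sum;
}
```

The inline keyword is only a request: the compiler decides whether the call is actually expanded in place, so the effect should be measured.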

Integer division by constant

When you divide an integer that is known to be positive or zero by a constant, convert the integer to unsigned first.

For example, if s is a signed integer, u is an unsigned integer, and C is a constant integer expression (positive or negative), then s / C is slower than u / C, and s % C is slower than u % C. The difference is most significant when C is a power of two: the unsigned division reduces to a single shift, while the signed division needs extra instructions so that negative values round toward zero. In every case, though, signed division has to take the sign into account.

The conversion from signed to unsigned costs nothing, since it is just another interpretation of the same bits. Therefore, if s is a signed integer that is known to be positive or zero, you can speed up its division by using the following expressions: (unsigned)s / C and (unsigned)s % C.
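A minimal sketch of both variants for C = 8 is shown below. The function names are hypothetical, and whether the unsigned versions are actually faster depends on the compiler and the target.

```cpp
#include <cassert>

// Signed division by 8: the compiler must preserve rounding toward zero
// for negative s, so a plain arithmetic shift is not enough.
int quotient_signed(int s) { return s / 8; }
int remainder_signed(int s) { return s % 8; }

// The same operations after reinterpreting s as unsigned. Valid only when
// s is known to be non-negative; typically reduces to a shift and a mask.
unsigned quotient_unsigned(int s) {
    assert(s >= 0);
    return static_cast<unsigned>(s) / 8u;
}

unsigned remainder_unsigned(int s) {
    assert(s >= 0);
    return static_cast<unsigned>(s) % 8u;
}
```

For non-negative values both variants give the same results. For a negative value such as -9 the signed quotient is -1, while (unsigned)s / 8 would be a large positive number, which is why the cast is safe only when s is known to be positive or zero.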

Using multiple arrays instead of structure fields

Instead of processing one array of aggregate objects, process two or more arrays of the same length in parallel. For example, instead of the following code:

 const int n = 10000;
 struct { double a, b, c; } s[n];
 for (int i = 0; i < n; ++i) {
     s[i].a = s[i].b + s[i].c;
 }

The following code may be faster:

 const int n = 10000;
 double a[n], b[n], c[n];
 for (int i = 0; i < n; ++i) {
     a[i] = b[i] + c[i];
 }

With this rearrangement, a, b and c can be processed by vector (SIMD) instructions, which are much faster than scalar instructions. On some architectures this optimization may have no effect or even an adverse one.

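Another option is to interleave the values in a single array. Note that this gives the same memory layout as the array of structures above, so any gain should be confirmed by measurement: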

 const int n = 10000;
 double interleaved[n * 3];
 for (int i = 0; i < n; ++i) {
     const size_t idx = i * 3;
     interleaved[idx] = interleaved[idx + 1] + interleaved[idx + 2];
 }

P.S. Every case must be measured. Do not optimize prematurely.
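One way to check such a change is to time both variants on the target machine. The sketch below is a rough illustration under stated assumptions, not a rigorous benchmark: the names Abc, add_aos, add_soa and time_us are hypothetical, and an optimizing compiler may remove work whose results are never used, so the results should be read or printed after timing.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

struct Abc { double a, b, c; };

// Array-of-structures variant.
void add_aos(std::vector<Abc>& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        s[i].a = s[i].b + s[i].c;
    }
}

// Separate-arrays variant.
void add_soa(std::vector<double>& a,
             const std::vector<double>& b,
             const std::vector<double>& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = b[i] + c[i];
    }
}

// Calls f() `repeats` times and returns the elapsed time in microseconds.
template <typename F>
long long time_us(F f, int repeats) {
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        f();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A call such as time_us([&] { add_soa(a, b, c); }, 1000) can then be compared with the same call for add_aos, using the same element count and the same compiler options.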

Source: https://habr.com/ru/post/339492/
