C++ MythBusters. The Inline Functions Myth

Hello.

Thanks to this vote, it turned out that Habr lacks articles on C++, a powerful language that is used less and less. High-level professionals, gurus and wizards of C++, as well as those who have already managed to leave this language behind, need not read further. Today I want to start a series of articles for newcomers who started learning the language relatively recently, and for those who (God forbid) read few books and try to learn everything purely through practice.

I also hope to involve as many authors as possible in writing such articles, because my own experience will not be enough.

Lyrical digression

A few words about the title, which is meant to unite articles of this kind. It did not appear by chance, of course, although it does not quite fit the essence either.

My articles are aimed at those who are already more or less familiar with C++ but have little experience writing programs in it. I will not write tutorials or "introductory" guides in the spirit of "buckets", "cups", "ounces", etc. Instead, I will try to cover some of the "narrow" and not always obvious corners of the language. What do I mean? I have repeatedly seen, both in my own experience and when talking to other programmers, cases where a person was sure (sometimes for quite a long time) that he understood some feature of C++ correctly, and later it turned out that he was deeply and irrevocably mistaken, for reasons known to God alone.

In fact, the reasons are not supernatural at all. Often the human factor plays its part. For example, books for beginners leave many nuances unexplained, and sometimes unmentioned, in order to make the foundations of the language easier to absorb. After reading such a book, the reader fills in the missing pieces on his own, reasoning along the lines of "I think this is logical". This is where small misunderstandings come from. Sometimes they lead to rather serious mistakes, but in most cases they simply get in the way of doing well in all sorts of C++ competitions :)

So, the first myth

As you know, C++ lets you declare inline functions by using the inline keyword. Where such a function is called, the compiler does not generate a call instruction (with the arguments pushed onto the stack beforehand) but simply copies the function body to the call site, with the corresponding arguments substituted "in place" (for class methods, the compiler also substitutes the required this address wherever it is used). Naturally, inline is only a recommendation to the compiler, not an order. However, if the function is not too complicated (a rather subjective notion) and the code does not do things like take the address of the function, then the compiler will most likely do exactly what the programmer expects.

An inline function is declared quite simply:

inline void foo(int& _i)
{
    _i++;
}
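To illustrate the remark about taking a function's address, here is a minimal sketch. The names increment, call_directly and call_through_pointer are hypothetical and do not come from the article. Both calls are valid C++ and give the same result; the difference is only in what the compiler is likely to generate, and that depends on the compiler and its optimization settings.

```cpp
// Hypothetical helper (not from the article): a small function marked inline.
inline int increment(int value)
{
    return value + 1;
}

// A direct call: a natural candidate for substitution at the call site.
int call_directly(int value)
{
    return increment(value);
}

// A call through a pointer: the function now needs an address, so the
// compiler typically has to emit an ordinary out-of-line copy as well.
int call_through_pointer(int value)
{
    int (*fn)(int) = &increment;
    return fn(value);
}
```

In other words, taking the address does not make the code incorrect; it only makes substitution at that particular call site less likely.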

But that is not the point now. We will look at inline class methods, beginning with a small example that may well be where this myth comes from.

You all know that class methods can be defined both outside the class and inside it, and inline functions are no exception. Moreover, functions defined directly inside the class automatically become inline, so the inline keyword is redundant in that case. Consider an example (I use struct instead of class only to avoid writing public):

// InlineTest.cpp

#include <cstdlib>
#include <iostream>

struct A
{
    inline void foo() { std::cout << "A::foo()" << std::endl; }
};

struct B
{
    inline void foo();
};

void B::foo()
{
    std::cout << "B::foo()" << std::endl;
}

int main()
{
    A a; B b;
    a.foo();
    b.foo();
    return EXIT_SUCCESS;
}

In this example everything is fine, and we see the cherished lines on the screen:

A::foo()
B::foo()

And the compiler really did substitute the bodies of the methods at their call sites.

Now we finally get to the essence of today's article. The problems begin when we (following "good programming style") split the class into a .cpp file and a .h file:

// A.h

#ifndef _A_H_
#define _A_H_

class A
{
public:
    inline void foo();
};

#endif // _A_H_

// A.cpp

#include "A.h"

#include <iostream>

void A::foo()
{
    std::cout << "A::foo()" << std::endl;
}

// main.cpp

#include <cstdlib>
#include <iostream>
#include "A.h"

int main()
{
    A a;
    a.foo();

    return EXIT_SUCCESS;
}

At the linking stage we get an error like this (the exact text depends on the compiler; I use MSVC):

main.obj : error LNK2001: unresolved external symbol "public: void __thiscall A::foo(void)" (?foo@A@@QAEXXZ)

Why?! It is quite simple: the definition of the inline method and its call are in different translation units! I am not entirely sure how this works internally, but this is how I see the problem.

If this were an ordinary method, the compiler would put something like call XXXXX into the object file main.obj, and the linker would later replace XXXXX with the actual address of the A::foo() method from the object file A.obj (I am simplifying, of course, but the essence does not change).

In our case we are dealing with an inline method, so instead of a call the compiler has to substitute the body of the method itself. Since the definition lives in a different translation unit, the compiler leaves the situation to the linker. There are two problems here. First, how much space should the compiler leave for the body of the method? Second, the A::foo() method is not used anywhere in the A.cpp translation unit, and it is declared inline (which means the compiler was supposed to copy its body wherever it is needed), so a standalone compiled version of the method never makes it into the object file A.obj at all.

To confirm the second point, here is a slightly modified example:

// A.h

#ifndef _A_H_
#define _A_H_

class A
{
public:
    inline void foo();
    void bar();
};

#endif // _A_H_

// A.cpp

#include "A.h"

#include <iostream>

void A::foo()
{
    std::cout << "A::foo()" << std::endl;
}

void A::bar()
{
    std::cout << "A::bar()" << std::endl;
    foo();
}

// main.cpp

#include <cstdlib>
#include <iostream>
#include "A.h"

int main()
{
    A a;
    a.foo();

    return EXIT_SUCCESS;
}

Now everything works as it should, because the inline method A::foo() is called from the non-inline method A::bar(). If you look at the assembly code of the final binary, you can see that, as before, there is no separate compiled version of the foo() method (that is, the method has no address of its own), and the body of the method is copied directly to the call sites.

How do we get out of this situation? Very simply: inline methods must be defined directly in the header file (not necessarily inside the class definition). A redefinition error does not occur in this case, because the ODR (One Definition Rule) allows an inline function to be defined in several translation units as long as all the definitions are identical, and the linker, in turn, leaves only one of them in the resulting binary.
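The fix can be sketched as follows. This is a single-listing illustration under a few assumptions: the comments mark where each part would live in a real project, and the methods return std::string instead of printing (a change made here only to keep the sketch easy to check; it is not how the article's examples are written).

```cpp
#include <string>

// ---- would live in A.h ----
class A
{
public:
    std::string foo() const;   // declaration only; defined below in the same header
    std::string bar() const;   // ordinary method, defined in A.cpp
};

// Defined outside the class body but still in the header, so every
// translation unit that includes A.h sees the body. The inline keyword
// is what permits the same definition to appear in several units.
inline std::string A::foo() const
{
    return "A::foo()";
}

// ---- would live in A.cpp ----
std::string A::bar() const
{
    return "A::bar() -> " + foo();
}
```

With this layout, main.cpp can call a.foo() after including A.h, because the definition is visible in that translation unit as well.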

Conclusion

I hope my first article will be useful to at least someone and will help them understand a little better such a strange and sometimes contradictory, but certainly interesting, programming language as C++. Good luck :)

UPD. A discussion with gribozavr revealed an inaccuracy regarding ODR in my article. The corrected passage is italicized.

Source: https://habr.com/ru/post/50775/
