“Universal” references in C++11, or why T&& does not always mean “rvalue reference”

Not long ago Scott Meyers, a C++ expert and the author of many well-known books, published an article describing the details of using rvalue references in C++11.
This topic has not yet been raised on Habr, and it seems to me that the article will interest the community.
Original article: “Universal References in C++11” by Scott Meyers

“Universal” references in C++11

T&& does not always mean “rvalue reference”

By: Scott Meyers

Perhaps the most important innovation in C++11 is rvalue references. They are the foundation on which “move semantics” and “perfect forwarding” are built. (You can get acquainted with the basics of these mechanisms in Thomas Becker's overview.)
Syntactically, rvalue references are declared in the same way as “normal” references (now called lvalue references), except that you use two ampersands instead of one. Thus, this function takes a parameter of type rvalue-reference-to-Widget:

void f(Widget&& param);

Given that rvalue references are declared using “&&”, it would be reasonable to assume that the presence of “&&” in a type declaration indicates an rvalue reference. But this is not the case:

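 Widget&& var1 = someWidget;        // here, “&&” means rvalue reference

 auto&& var2 = var1;                // here, “&&” does not mean rvalue reference

 template<typename T>
 void f(std::vector<T>&& param);    // here, “&&” means rvalue reference

 template<typename T>
 void f(T&& param);                 // here, “&&” does not mean rvalue reference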

In this article I will describe the two meanings of “&&” in a type declaration, explain how to tell them apart, and introduce new terminology that makes it unambiguous which meaning of “&&” is intended. Distinguishing the two meanings is important, because if you think “rvalue reference” whenever you see “&&” in a type declaration, you will misread a lot of C++11 code.

The essence of the matter is that “&&” in a type declaration sometimes means an rvalue reference, but sometimes it means either an rvalue reference or an lvalue reference. Thus, in some cases “&&” in the source code actually has the meaning of “&”: it has the syntactic form of an rvalue reference (“&&”) but is in reality an lvalue reference (“&”).

References of this kind are more flexible than either lvalue references or rvalue references. Rvalue references can bind only to rvalues, and lvalue references, in addition to binding to lvalues, can bind to rvalues only under restricted conditions. (The restriction is that an lvalue reference can bind to an rvalue only when it is declared as a reference to const, that is, const T&.) References declared with “&&” that can be either lvalue references or rvalue references can bind to anything. Such unusually flexible references deserve their own name. I call them “universal” references.

The details of when “&&” means a universal reference (that is, when “&&” in the source code may really mean “&”) are rather intricate, so I will postpone describing them. For now, let's focus on the following rule, because it is what you need to remember in day-to-day programming:

If a variable or parameter is declared with type T&& for some deduced type T, that variable or parameter is a universal reference.

The requirement for type deduction limits the situations in which universal references can occur. In practice, almost all universal references are parameters of function templates. Because the type deduction rules for auto-declared variables are essentially the same as for templates, auto-declared universal references are also possible. They are uncommon in production code, but I use some in this article because they are less verbose than examples with templates. In the “Small Details” section of this article I will show that universal references can also arise in connection with typedef and decltype, but until we get to that section I will assume that universal references occur only as function template parameters and auto-declared variables.

The requirement that a universal reference have the form T&& is more significant than it may seem, but I will come back to it later. For now, just keep the requirement in mind.

Like all references, universal references must be initialized, and it is the initializer of a universal reference that determines whether it becomes an lvalue reference or an rvalue reference:

If the expression that initializes the universal reference is an lvalue, the universal reference becomes an lvalue reference.
If the expression that initializes the universal reference is an rvalue, the universal reference becomes an rvalue reference.

This information is useful only if you can distinguish lvalues from rvalues. A precise definition of these terms is hard to give (the C++11 standard generally specifies whether an expression is an lvalue or an rvalue case by case), but in practice the following is enough:

If you can take the address of an expression, the expression is an lvalue.
If the type of an expression is an lvalue reference (i.e. T& or const T&, etc.), the expression is an lvalue.
Otherwise, the expression is an rvalue. Conceptually (and typically also in fact), rvalues correspond to temporary objects, such as those returned from functions or created by implicit type conversions. Most literals (for example, 10 and 5.3) are also rvalues.
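These rules of thumb can be checked mechanically. The following is only a sketch, not code from the original article: the IS_LVALUE macro, makeString and checkValueCategories are hypothetical names introduced for illustration. It relies on the fact that decltype applied to a parenthesized expression yields an lvalue reference type exactly when the expression is an lvalue.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper macro (not from the article): decltype((expr)) is T&
// when expr is an lvalue, so this trait reports whether expr is an lvalue.
#define IS_LVALUE(expr) std::is_lvalue_reference<decltype((expr))>::value

// Hypothetical factory: returns by value, so a call expression is an rvalue.
std::string makeString() { return "temporary"; }

void checkValueCategories() {
    int x = 10;
    int& ref = x;

    static_assert(IS_LVALUE(x), "a named variable is an lvalue");
    static_assert(IS_LVALUE(ref), "an expression of lvalue reference type is an lvalue");
    static_assert(!IS_LVALUE(10), "an integer literal is an rvalue");
    static_assert(!IS_LVALUE(5.3), "a floating-point literal is an rvalue");
    static_assert(!IS_LVALUE(makeString()), "a by-value function result is an rvalue");
    // "Most literals" are rvalues; string literals are the exception.
    static_assert(IS_LVALUE("text"), "a string literal is an lvalue");

    assert(&x == &ref);  // both expressions name the same addressable object
}
```

Calling checkValueCategories() from main runs the single run-time check; everything else is verified at compile time.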

Let's look again at the code from the beginning of the article:

 Widget&& var1 = someWidget;
 auto&& var2 = var1;

You can take the address of var1, so var1 is an lvalue. Declaring var2 with type auto&& makes it a universal reference, and because it is initialized with var1 (an lvalue), var2 becomes an lvalue reference.

A careless reading of the source code might lead you to believe that var2 is an rvalue reference; the “&&” in its declaration certainly suggests it. But because var2 is a universal reference initialized with an lvalue, it is an lvalue reference. It is as if var2 had been declared as follows:

 Widget& var2 = var1;
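The claim about var2 can be verified with decltype. This is a minimal sketch under stated assumptions: the Widget struct and the function checkAutoUniversalReference are placeholders invented here, and var1 is initialized from a temporary (an assumption of this sketch) so that the rvalue reference has an rvalue to bind to.

```cpp
#include <cassert>
#include <type_traits>

struct Widget { int value = 0; };  // placeholder type, assumed for illustration

void checkAutoUniversalReference() {
    Widget&& var1 = Widget();  // rvalue reference bound to a temporary (assumption of this sketch)
    auto&& var2 = var1;        // var1 is a named variable, hence an lvalue
    auto&& var3 = Widget();    // initialized with an rvalue

    static_assert(std::is_same<decltype(var1), Widget&&>::value, "no deduction: rvalue reference");
    static_assert(std::is_same<decltype(var2), Widget&>::value, "lvalue initializer: lvalue reference");
    static_assert(std::is_same<decltype(var3), Widget&&>::value, "rvalue initializer: rvalue reference");

    var2.value = 42;
    assert(var1.value == 42);  // var2 aliases the object that var1 refers to
    (void)var3;
}
```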

As noted above, if an expression has lvalue reference type, it is an lvalue. Consider this example:

 std::vector<int> v;
 ...
 auto&& val = v[0];    // val becomes an lvalue reference (see below)

val is a universal reference, and it is initialized with v[0], i.e. with the result of a call to std::vector<int>::operator[]. That function returns an lvalue reference to an element of the vector (I am ignoring out-of-range indices, which lead to undefined behavior).

Because every expression of lvalue reference type is an lvalue, and because this lvalue is used to initialize val, val becomes an lvalue reference, even though the declaration of val looks like that of an rvalue reference.
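A short sketch of the same situation follows. The function name overwriteFirst is hypothetical, and the caller is assumed to pass a non-empty vector. Because val ends up as int&, writing through it changes the vector element.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper: overwrites the first element through a universal reference.
// Assumes v is not empty.
int overwriteFirst(std::vector<int>& v, int newValue) {
    auto&& val = v[0];  // operator[] returns int&, so the initializer is an lvalue
    static_assert(std::is_same<decltype(val), int&>::value,
                  "val is an lvalue reference despite the && in its declaration");

    auto&& count = v.size();  // size() returns by value, so the initializer is an rvalue
    static_assert(std::is_rvalue_reference<decltype(count)>::value,
                  "count is an rvalue reference");
    (void)count;

    val = newValue;  // writes into the vector itself
    return v[0];
}
```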

I noted that universal references are most common as parameters of function templates. Consider again the template from the beginning of this article:

 template<typename T>
 void f(T&& param);    // “&&” might mean rvalue reference

With this call f,

 f(10);    // 10 is an rvalue

param is initialized with the literal 10, which is an rvalue because you cannot take its address. This means that in this call to f the universal reference param is initialized with an rvalue and thus becomes an rvalue reference, specifically int&&.

On the other hand, if f is called something like this:

 int x = 10;
 f(x);    // x is an lvalue

param is initialized with the variable x, which is an lvalue because you can take its address. This means that in this call to f the universal reference param is initialized with an lvalue, and param therefore becomes an lvalue reference: int&, to be precise.

The comment next to the declaration of f should now be clear: whether param's type is an lvalue reference or an rvalue reference depends on what is passed to f in the call. Sometimes param becomes an lvalue reference, and sometimes an rvalue reference. That is, param really is a universal reference.
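To see both outcomes in one place, here is a small sketch. RefKind, kindOf and checkKinds are names invented for this illustration; the function template simply reports which kind of reference its parameter turned into for a given call.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class RefKind { Lvalue, Rvalue };  // hypothetical tag type for this sketch

// param is a universal reference: its actual type depends on the argument.
template <typename T>
RefKind kindOf(T&& param) {
    (void)param;
    return std::is_lvalue_reference<decltype(param)>::value ? RefKind::Lvalue
                                                             : RefKind::Rvalue;
}

void checkKinds() {
    int x = 10;
    const int cx = 20;
    assert(kindOf(10) == RefKind::Rvalue);            // literal: param is int&&
    assert(kindOf(x) == RefKind::Lvalue);             // variable: param is int&
    assert(kindOf(cx) == RefKind::Lvalue);            // const variable: param is const int&
    assert(kindOf(std::move(x)) == RefKind::Rvalue);  // explicitly cast to an rvalue
}
```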

Remember that “&&” denotes a universal reference only where type deduction takes place. Where there is no type deduction, there is no universal reference. In such cases, “&&” in a type declaration always means an rvalue reference. Consequently:

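 template<typename T>
 void f(T&& param);           // deduced parameter type ⇒ type deduction;
                              // && ≡ universal reference

 template<typename T>
 class Widget {
     ...
     Widget(Widget&& rhs);    // fully specified parameter type ⇒ no type deduction;
     ...                      // && ≡ rvalue reference
 };

 template<typename T1>
 class Gadget {
     ...
     template<typename T2>
     Gadget(T2&& rhs);        // deduced parameter type ⇒ type deduction;
     ...                      // && ≡ universal reference
 };

 void f(Widget&& param);      // fully specified parameter type ⇒ no type deduction;
                              // && ≡ rvalue reference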

There is nothing surprising in these examples. In each case, if you see T&& (where T is a template parameter), type deduction is present, so you are looking at a universal reference. And if you see “&&” after a specific type name (for example, Widget&&), you are looking at an rvalue reference.

I stated that the form of the reference declaration must be “T&&” for the reference to be universal. This is an important caveat. Look again at this declaration from the beginning of the article:

 template<typename T>
 void f(std::vector<T>&& param);    // “&&” means rvalue reference

Here we have both type deduction and a function parameter declared with “&&”, but the form of the parameter declaration is not “T&&”, it is “std::vector<T>&&”. As a result, the parameter is a normal rvalue reference, not a universal reference. A universal reference can only be declared in the form “T&&”! Even the simple addition of a const qualifier is enough to keep “&&” from being interpreted as a universal reference:

 template<typename T>
 void f(const T&& param);    // “&&” means rvalue reference

“T&&” is simply the required form for declaring universal references. It does not mean that you must use the name T for the template parameter:

 template<typename MyTemplateParamType>
 void f(MyTemplateParamType&& param);    // “&&” means universal reference
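The difference between the three parameter forms can be demonstrated at compile time. The sketch below is not from the original article: the function objects TakesUniversal, TakesConstRvalue and TakesVectorRvalue, and the C++11-style detection helper Accepts, are hypothetical names. Accepts<F, Arg> is true when an object of type F can be called with an argument of type Arg, where a reference type Arg& stands for an lvalue argument and a non-reference Arg stands for an rvalue.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical function objects, one per parameter form discussed above.
struct TakesUniversal    { template <typename T> void operator()(T&&) const {} };
struct TakesConstRvalue  { template <typename T> void operator()(const T&&) const {} };
struct TakesVectorRvalue { template <typename T> void operator()(std::vector<T>&&) const {} };

// C++11 detection: the first overload participates only if the call compiles.
template <typename F, typename Arg>
auto tryCall(int) -> decltype(std::declval<F>()(std::declval<Arg>()), std::true_type());
template <typename F, typename Arg>
std::false_type tryCall(...);

template <typename F, typename Arg>
struct Accepts : decltype(tryCall<F, Arg>(0)) {};

// T&& binds to lvalues and rvalues alike.
static_assert(Accepts<TakesUniversal, int&>::value, "lvalue accepted");
static_assert(Accepts<TakesUniversal, int>::value, "rvalue accepted");

// const T&& is an ordinary rvalue reference: lvalues are rejected.
static_assert(!Accepts<TakesConstRvalue, int&>::value, "lvalue rejected");
static_assert(Accepts<TakesConstRvalue, int>::value, "rvalue accepted");

// std::vector<T>&& is an ordinary rvalue reference as well.
static_assert(!Accepts<TakesVectorRvalue, std::vector<int>&>::value, "lvalue rejected");
static_assert(Accepts<TakesVectorRvalue, std::vector<int> >::value, "rvalue accepted");
```

The block has no run-time behavior; if it compiles, every claim in the static_assert messages holds.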

Sometimes you can see T&& in a function declaration inside a template, where T is a template parameter, yet there is still no type deduction. Consider the push_back function in std::vector (only the version of std::vector::push_back of interest is shown):

 template <class T, class Allocator = allocator<T> >
 class vector {
 public:
     ...
     void push_back(T&& x);    // fully specified parameter type ⇒ no type deduction;
     ...                       // && ≡ rvalue reference
 };

Here T is a template parameter, and push_back takes a T&&, yet the parameter is not a universal reference! How can that be?

The answer becomes apparent if we look at how push_back would be declared outside the class. I will pretend that the Allocator parameter does not exist so as not to clutter the code. With that in mind, here is the declaration of this version of std::vector::push_back:

 template <class T> void vector<T>::push_back(T&& x);

push_back cannot exist without the class std::vector<T> that contains it. But if we have a class std::vector<T>, we already know what T is, so there is no need to deduce it.

Let's see an example. If I write,

 Widget makeWidget();           // factory function for Widget
 std::vector<Widget> vw;
 ...
 Widget w;
 vw.push_back(makeWidget());    // create a Widget from the factory, add it to vw

then my use of push_back will cause the compiler to instantiate that function for the class std::vector<Widget>. Its declaration outside the class looks like this:

 void std::vector<Widget>::push_back(Widget&& x);

See? Once we know that the class is std::vector<Widget>, the type of push_back's parameter is fully determined. No type deduction takes place.
Compare this with std::vector's emplace_back method, which is declared as follows:

 template <class T, class Allocator = allocator<T> >
 class vector {
 public:
     ...
     template <class... Args>
     void emplace_back(Args&&... args);    // deduced parameter types ⇒ type deduction;
     ...                                   // && ≡ universal references
 };

Don't be distracted by the fact that emplace_back takes a variable number of arguments (as indicated by the ellipses in the declarations of Args and args). What matters here is that the type of each of those arguments must be deduced. The function template parameter Args is independent of the class template parameter T, so even if the class is fully known, say std::vector<Widget>, that tells us nothing about the type(s) of emplace_back's arguments. The out-of-class declaration of emplace_back for std::vector<Widget> makes this clear (I continue to ignore the existence of the Allocator parameter):

 template<class... Args> void std::vector<Widget>::emplace_back(Args&&... args);

Clearly, knowing that the class is std::vector<Widget> does not eliminate the need to deduce the type(s) passed to emplace_back. As a result, the parameters of std::vector::emplace_back are universal references, unlike the parameter of the version of std::vector::push_back we examined, which is an rvalue reference.
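The same contrast can be reproduced with a miniature container. This is only a sketch: the Stack class template and the fillStack function are hypothetical and merely mimic the shape of the two std::vector members; unlike the real push_back, Stack::push has no overload for lvalues.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical miniature container, assumed for illustration only.
template <typename T>
class Stack {
public:
    // T is fixed by the class, so x is an rvalue reference.
    void push(T&& x) { items_.push_back(std::move(x)); }

    // Args is deduced per call, so args are universal references.
    template <typename... Args>
    void emplace(Args&&... args) { items_.emplace_back(std::forward<Args>(args)...); }

    std::size_t size() const { return items_.size(); }
    const T& top() const { return items_.back(); }

private:
    std::vector<T> items_;
};

// Once the class is Stack<std::string>, the parameter type of push is fully known.
static_assert(std::is_same<decltype(&Stack<std::string>::push),
                           void (Stack<std::string>::*)(std::string&&)>::value,
              "push takes exactly std::string&&");

std::size_t fillStack(Stack<std::string>& s) {
    std::string a = "alpha";
    std::string b = "beta";
    // s.push(a);          // would not compile: an lvalue cannot bind to std::string&&
    s.push(std::move(a));  // rvalue argument
    s.emplace(b);          // Args deduced as std::string&; b is copied
    assert(b == "beta");   // b is untouched because it was passed as an lvalue
    s.emplace(3, 'x');     // Args deduced as int and char; constructs "xxx" in place
    return s.size();
}
```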

Keep in mind that whether an expression is an lvalue or an rvalue does not depend on its type. Consider the type int. There are lvalues of type int (for example, variables declared as int), and there are rvalues of type int (for example, literals such as 10). The same is true of user-defined types such as Widget. A Widget object can be an lvalue (for example, a Widget variable) or an rvalue (for example, the object returned by a factory function that creates Widgets). The type of an expression does not tell you whether it is an lvalue or an rvalue. It is therefore possible to have lvalues whose type is rvalue reference, as well as rvalues whose type is rvalue reference:

 Widget makeWidget();                          // factory function for Widget

 Widget&& var1 = makeWidget();                 // var1 is an lvalue, but
                                               // its type is rvalue reference (to Widget)

 Widget var2 = static_cast<Widget&&>(var1);    // the cast expression yields an rvalue, and
                                               // its type is rvalue reference (to Widget)

The conventional way to turn an lvalue (such as var1) into an rvalue is to use std::move, so var2 could be defined like this:

 Widget var2 = std::move(var1);    // equivalent to the above

I initially showed the code with static_cast only to make explicit that the type of the expression is an rvalue reference (Widget&&).
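The sketch below makes the distinction observable. The Origin enumeration and the instrumented Widget are assumptions made for this illustration (the article's Widget is left unspecified); each Widget records whether it was copy-constructed or move-constructed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class Origin { Default, Copied, Moved };  // hypothetical bookkeeping

// Instrumented stand-in for the article's Widget (an assumption of this sketch).
struct Widget {
    Widget() = default;
    Widget(const Widget&) : origin(Origin::Copied) {}
    Widget(Widget&&) noexcept : origin(Origin::Moved) {}
    Origin origin = Origin::Default;
};

Widget makeWidget() { return Widget(); }  // factory function for Widget

void checkNamedRvalueReference() {
    Widget&& var1 = makeWidget();

    static_assert(std::is_same<decltype(var1), Widget&&>::value,
                  "the declared type of var1 is rvalue reference");
    static_assert(std::is_lvalue_reference<decltype((var1))>::value,
                  "the expression var1 is nevertheless an lvalue");
    static_assert(std::is_same<decltype(std::move(var1)),
                               decltype(static_cast<Widget&&>(var1))>::value,
                  "std::move(var1) and the static_cast have the same type");

    Widget fromName = var1;                         // lvalue: copy constructor
    Widget fromCast = static_cast<Widget&&>(var1);  // rvalue: move constructor
    Widget fromMove = std::move(var1);              // rvalue: move constructor

    assert(fromName.origin == Origin::Copied);
    assert(fromCast.origin == Origin::Moved);
    assert(fromMove.origin == Origin::Moved);
}
```

Moving from var1 twice is harmless here only because this Widget's move constructor leaves its source untouched.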

Named variables and parameters of rvalue reference type are lvalues. (You can take their address.) Consider again the Widget and Gadget templates:

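 template<typename T>
 class Widget {
     ...
     Widget(Widget&& rhs);    // rhs's type is rvalue reference,
     ...                      // but rhs itself is an lvalue
 };

 template<typename T1>
 class Gadget {
     ...
     template <typename T2>
     Gadget(T2&& rhs);        // rhs is a universal reference whose type will
     ...                      // eventually become an rvalue reference or
 };                           // an lvalue reference, but rhs itself is an lvalue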

In Widget's constructor, rhs is an rvalue reference, so we know it is bound to an rvalue (i.e., an rvalue was passed to it), but rhs itself is an lvalue, so we have to convert it back to an rvalue if we want to take advantage of the rvalueness of what it is bound to. Our motivation for this is generally to use rhs as the source of a move operation, and that is why std::move is used to convert the lvalue to an rvalue. Similarly, rhs in Gadget's constructor is a universal reference, so it may be bound to an lvalue or to an rvalue, but in either case rhs itself is an lvalue. If it is bound to an rvalue and we want to take advantage of that, we need to convert rhs back to an rvalue. If it is bound to an lvalue, however, we certainly do not want to treat it as an rvalue. This dependence on what the universal reference is bound to is the reason for std::forward: it takes a universal reference and converts it to an rvalue only if the expression it is bound to is an rvalue. The name of the function (“forward”) reflects the expectation that we will pass the argument on to another function while always preserving its value category (lvalue or rvalue).
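A compact sketch of both constructors follows. It simplifies the article's classes: Widget and Gadget are non-template classes here, and the Payload member with its Origin bookkeeping is a hypothetical device for observing whether a copy or a move took place.

```cpp
#include <cassert>
#include <utility>

enum class Origin { Default, Copied, Moved };  // hypothetical bookkeeping

// Hypothetical member type that records how it was constructed.
struct Payload {
    Payload() = default;
    Payload(const Payload&) : origin(Origin::Copied) {}
    Payload(Payload&&) noexcept : origin(Origin::Moved) {}
    Origin origin = Origin::Default;
};

class Widget {
public:
    Widget() = default;
    // rhs is an rvalue reference, so it is always bound to an rvalue:
    // std::move unconditionally turns the lvalue rhs.data_ back into an rvalue.
    Widget(Widget&& rhs) noexcept : data_(std::move(rhs.data_)) {}
    Origin origin() const { return data_.origin; }
private:
    Payload data_;
};

class Gadget {
public:
    // rhs is a universal reference: std::forward yields an rvalue only
    // when rhs was initialized with an rvalue.
    template <typename T2>
    explicit Gadget(T2&& rhs) : data_(std::forward<T2>(rhs)) {}
    Origin origin() const { return data_.origin; }
private:
    Payload data_;
};

void checkMoveAndForward() {
    Widget w1;
    Widget w2(std::move(w1));
    assert(w2.origin() == Origin::Moved);

    Payload p;
    Gadget fromLvalue(p);             // T2 deduced as Payload&: p is copied
    Gadget fromRvalue(std::move(p));  // T2 deduced as Payload: p is moved from
    assert(fromLvalue.origin() == Origin::Copied);
    assert(fromRvalue.origin() == Origin::Moved);
}
```

Had Gadget's constructor used std::move instead of std::forward, the lvalue p in the first Gadget construction would have been moved from as well, which is exactly what a caller passing an lvalue does not expect.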

But std::move and std::forward are not the subject of this article. The article focuses on the fact that “&&” in a type declaration may or may not declare an rvalue reference. To avoid getting sidetracked, I refer you to the references in the “Additional Information” section for a detailed description of std::move and std::forward.

Small Details

The essence of the matter is that some constructs in C++11 give rise to references to references, and references to references are not allowed in C++. If source code explicitly contains a reference to a reference, the code is ill-formed:

 Widget w1;
 ...
 Widget& & w2 = w1;    // error! No such thing as “reference to reference”

However, there are cases where references to references arise as a result of type manipulations that take place during compilation, and in such cases rejecting the code would be problematic. We know this from experience with the original C++ standard, i.e., C++98/C++03.

During type deduction for a template parameter that is a universal reference, lvalues and rvalues of the same type are deduced to have slightly different types. In particular, lvalues of type T are deduced to be of type T& (i.e., lvalue reference to T), while rvalues of type T are deduced to be simply of type T. (Note that while lvalues are deduced to be lvalue references, rvalues are not deduced to be rvalue references!) Consider what happens when a function template taking a universal reference is called with an rvalue and with an lvalue:

 template<typename T>
 void f(T&& param);

 ...

 int x;

 ...

 f(10);    // invoke f on an rvalue
 f(x);     // invoke f on an lvalue

In the call to f with the rvalue 10, T is deduced to be int, and the instantiated f looks like this:

 void f(int&& param);    // f instantiated from an rvalue

That is fine. In the call to f with the lvalue x, however, T is deduced to be int&, and the instantiation of f contains a reference to a reference:

 void f(int& && param);    // initial instantiation of f with an lvalue

Because of the reference to a reference, this instantiated code looks invalid at first glance, yet the source code “f(x)” is entirely reasonable. To avoid rejecting it, C++11 performs “reference collapsing” when references to references arise in contexts such as template instantiation.

Because there are two kinds of references (lvalue references and rvalue references), there are four possible reference-to-reference combinations: lvalue reference to lvalue reference, lvalue reference to rvalue reference, rvalue reference to lvalue reference, and rvalue reference to rvalue reference. There are only two reference-collapsing rules:

An rvalue reference to an rvalue reference becomes (“collapses into”) an rvalue reference.
All other references to references (i.e., all combinations involving an lvalue reference) collapse into an lvalue reference.

Applying these rules to the instantiation of f with an lvalue yields the following valid code:

 void f(int& param);    // instantiation of f with an lvalue after reference collapsing

This demonstrates the precise mechanism by which a universal reference can, after type deduction and reference collapsing, become an lvalue reference. In truth, a universal reference is simply an rvalue reference in a reference-collapsing context.
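The four combinations can be spelled out with alias templates, which are one of the contexts where the compiler is allowed to form a reference to a reference and collapse it. The alias names AddLvalueRef and AddRvalueRef and the ParamTypeOf helper are hypothetical, introduced only for this sketch.

```cpp
#include <type_traits>

// Hypothetical aliases: each one appends a reference to whatever T is.
template <typename T> using AddLvalueRef = T&;
template <typename T> using AddRvalueRef = T&&;

// The two collapsing rules, covering all four combinations.
static_assert(std::is_same<AddRvalueRef<int&&>, int&&>::value, "&& applied to && gives &&");
static_assert(std::is_same<AddLvalueRef<int&&>, int&>::value, "&  applied to && gives &");
static_assert(std::is_same<AddRvalueRef<int&>, int&>::value, "&& applied to &  gives &");
static_assert(std::is_same<AddLvalueRef<int&>, int&>::value, "&  applied to &  gives &");

// The parameter type of f(T&& param) for a given deduced T.
template <typename T>
struct ParamTypeOf { typedef T&& type; };

static_assert(std::is_same<ParamTypeOf<int>::type, int&&>::value,
              "f(10): T = int, param is int&&");
static_assert(std::is_same<ParamTypeOf<int&>::type, int&>::value,
              "f(x): T = int&, param collapses to int&");
```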

A special situation arises when a type is deduced from a variable that is itself a reference. In that case, the reference part of the type is ignored. For example, given

 int x;

 ...

 int&& r1 = 10;    // r1's type is int&&

 int& r2 = x;      // r2's type is int&

then the type of both r1 and r2 is considered to be int in a call to the template f. This reference-stripping behavior is independent of the rule that, during type deduction for universal references, lvalues are deduced to be of type T& and rvalues of type T, so given these calls,

 f(r1); f(r2);

the deduced type for both r1 and r2 is int&. Why? First, the reference parts of r1's and r2's types are stripped off (yielding int in both cases); then, because each is an lvalue, each is treated as int& during type deduction for the universal reference parameter in the call to f.
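A sketch that exposes the deduced T directly is shown below. The helper templates deducedAsIntRef and deducedAsInt and the function checkDeductionFromReferences are hypothetical; each helper has the same parameter form as f and reports what T was deduced to be.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical probes with the same signature shape as f(T&& param).
template <typename T>
bool deducedAsIntRef(T&&) { return std::is_same<T, int&>::value; }

template <typename T>
bool deducedAsInt(T&&) { return std::is_same<T, int>::value; }

void checkDeductionFromReferences() {
    int x = 0;
    int&& r1 = 10;  // r1's type is int&&
    int& r2 = x;    // r2's type is int&

    // Both names are lvalues of type int once the reference part is stripped,
    // so T is deduced as int& in both calls.
    assert(deducedAsIntRef(r1));
    assert(deducedAsIntRef(r2));

    // Rvalues of type int are deduced as plain int.
    assert(deducedAsInt(10));
    assert(deducedAsInt(std::move(r1)));
}
```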

Reference collapsing occurs, as I have noted, in “contexts such as template instantiation”. A second such context is the definition of auto variables. Type deduction for auto variables that are universal references is essentially identical to type deduction for function template parameters that are universal references, so lvalues of type T are deduced to have type T&, and rvalues of type T are deduced to have type T. Consider again the example from the beginning of the article:

 Widget&& var1 = someWidget;    // var1 is of type Widget&& (no use of auto here)

 auto&& var2 = var1;            // var2 is of type Widget& (see below)

The type of var1 is Widget&&, but its reference-ness is ignored during type deduction when var2 is initialized; it is treated as being of type Widget. Because it is an lvalue being used to initialize a universal reference (var2), the deduced type is Widget&. Substituting Widget& for auto in the definition of var2 yields the following invalid code,

 Widget& && var2 = var1;    // note the reference to a reference

which, after reference collapsing, becomes

 Widget& var2 = var1;    // var2 is of type Widget&

The third reference-collapsing context is the formation and use of typedefs. Given this class template,

 template<typename T>
 class Widget {
 public:
     typedef T& LvalueRefType;
     ...
 };

and this use of the template,

 Widget<int&> w;

the instantiated class will contain the following (invalid) typedef:

 typedef int& & LvalueRefType;

Reference collapsing reduces it to the following valid code:

 typedef int& LvalueRefType;

If we then use this typedef in a context that applies a reference to it, for example,

 void f(Widget<int&>::LvalueRefType&& param);

the following invalid code is produced after expansion of the typedef,

 void f(int& && param);

but reference collapsing kicks in, so the final declaration of f is:

 void f(int& param);
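The typedef case can be checked the same way. In this sketch the typedef is made public so that it can be named outside the class, f is given a body that increments its argument, and checkTypedefCollapsing is a helper name; all three are assumptions added for illustration.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
class Widget {
public:  // public so that the typedef is accessible below (assumption of this sketch)
    typedef T& LvalueRefType;
};

static_assert(std::is_same<Widget<int&>::LvalueRefType, int&>::value,
              "int& & collapses to int&");
static_assert(std::is_same<Widget<int&>::LvalueRefType&&, int&>::value,
              "int& && collapses to int&");

// After collapsing, this is void f(int& param); the body is illustrative only.
void f(Widget<int&>::LvalueRefType&& param) { ++param; }

void checkTypedefCollapsing() {
    int x = 1;
    f(x);  // accepts an lvalue, because param is really an int&
    assert(x == 2);
}
```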

The final context in which reference collapsing takes place is the use of decltype. As with templates and auto, decltype performs type deduction on an expression, yielding a type that is either T or T&, and decltype then applies the C++11 reference-collapsing rules.

Unfortunately, the type deduction rules used by decltype are not the same as those used for template or auto type deduction. The details are too intricate to cover here (the “Additional Information” section provides references), but a notable difference is that decltype, given a named variable of non-reference type, deduces the type T (i.e., a non-reference type), where under the same conditions templates and auto deduce the type T&. Another important difference is that decltype's type deduction depends only on the decltype expression; the type of the initializing expression (if any) is ignored. Consequently:

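 Widget w1, w2;

 auto&& v1 = w1;            // v1 is an auto-based universal reference being
                            // initialized with an lvalue, so v1 becomes an
                            // lvalue reference referring to w1.

 decltype(w1)&& v2 = w2;    // v2 is a decltype-based universal reference, and
                            // decltype(w1) is Widget, so v2 becomes an rvalue reference.
                            // w2 is an lvalue, and it is not legal to initialize an
                            // rvalue reference with an lvalue, so
                            // this code does not compile.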
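The difference between the two declarations can be confirmed with a few static assertions. This is a sketch only: the Widget struct with its id member and the function checkDecltypeVersusAuto are placeholders, and the ill-formed declaration is kept as a comment while a compiling variant with std::move is used in its place.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Widget { int id = 0; };  // placeholder type, assumed for illustration

void checkDecltypeVersusAuto() {
    Widget w1, w2;

    auto&& v1 = w1;  // auto deduction looks at the initializer: an lvalue
    static_assert(std::is_same<decltype(v1), Widget&>::value,
                  "v1 is an lvalue reference");

    // decltype of a named non-reference variable is the plain type,
    // so decltype(w1)&& is Widget&& no matter what the initializer is.
    static_assert(std::is_same<decltype(w1), Widget>::value, "decltype(w1) is Widget");
    static_assert(std::is_same<decltype(w1)&&, Widget&&>::value,
                  "decltype(w1)&& is an rvalue reference");

    // decltype(w1)&& v2 = w2;          // would not compile: w2 is an lvalue
    decltype(w1)&& v2 = std::move(w2);  // compiles: the initializer is an rvalue

    v1.id = 1;
    v2.id = 2;
    assert(w1.id == 1);  // v1 refers to w1
    assert(w2.id == 2);  // v2 refers to w2
}
```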

Conclusion

In a type declaration, “&&” means either an rvalue reference or a universal reference, that is, a reference that may resolve to either an lvalue reference or an rvalue reference. Universal references always have the form T&& for some deduced type T.

Reference collapsing is the mechanism by which universal references (which are really just rvalue references in situations where reference collapsing takes place) sometimes resolve to lvalue references and sometimes to rvalue references. It occurs in specific contexts in which references to references may arise during compilation: template type deduction, auto type deduction, typedef formation and use, and decltype expressions.

Acknowledgments

The draft versions of this article were reviewed by Cassio Neri, Michal Mocny, Howard Hinnant, Andrei Alexandrescu, Stephan T. Lavavej, Roger Orr, Chris Oldwood, Jonathan Wakely and Anthony Williams. Their comments contributed to significant improvements in the article, as well as its presentation.

Additional Information

C++11, Wikipedia.

Overview of the New C++ (C++11), Scott Meyers, Artima Press, last updated January 2012.

C++ Rvalue References Explained, Thomas Becker, last updated September 2011.

decltype, Wikipedia.

“A Note About decltype,” Andrew Koenig, Dr. Dobb's, July 27, 2011.

Source: https://habr.com/ru/post/157961/
