C++ expression categories

Categories of expressions, such as lvalue and rvalue, relate more to the fundamental theoretical concepts of the C++ language than to the practical aspects of its use. For this reason, even many experienced programmers have only a vague idea of what they mean. In this article I will try to explain these terms as simply as possible, mixing theory with practical examples. One caveat up front: the article does not claim to be a complete and rigorous description of expression categories; for details I recommend going directly to the source, the C++ language standard.

The article contains quite a few English terms, because some of them are difficult to translate into Russian and others are translated differently in different sources. I will therefore often give the English terms in italics.

A bit of history

The terms lvalue and rvalue appeared in the C language. The terminology was confusing from the very beginning, because the terms refer to expressions, not to values. Historically, an lvalue is something that can appear on the left of an assignment operator, and an rvalue is something that can appear only on the right.

lvalue = rvalue;

However, this definition oversimplifies and distorts the essence. The C89 standard defined an lvalue as an object locator, i.e. an expression that designates an object with an identifiable memory location. Accordingly, everything that did not fit this definition fell into the rvalue category.

Bjarne rushes to the rescue

In C++, the terminology of expression categories has evolved considerably, especially after the adoption of the C++11 standard, which introduced rvalue references and move semantics. The history of the new terminology is described in an interesting way in Stroustrup's article "New" Value Terminology.

The new, more rigorous terminology is based on two properties:

has identity - that is, there is some parameter (for example, an address in memory) by which one can tell whether two expressions refer to the same entity;
can be moved from - supports move semantics.

Expressions that have identity are grouped under the term glvalue (generalized lvalue), and expressions that can be moved from are called rvalues. Combinations of these two properties define three primary categories of expressions:

	Has identity	No identity
Cannot be moved from	lvalue	-
Can be moved from	xvalue	prvalue

Strictly speaking, the C++17 standard introduced guaranteed copy elision: a formalization of the situations in which the compiler can and must avoid copying and moving objects. As a result, a prvalue is not necessarily moved from. Details and examples of this can be found here. However, this does not affect the understanding of the general pattern of expression categories.
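The effect of copy elision can be observed by counting constructor calls. The sketch below is only an illustration: Tracker, make_tracker and count_transfers_on_init are hypothetical names, and the zero count is guaranteed only under C++17 rules (older compilers usually elide the move as an optimization, but are not required to).

```cpp
#include <cassert>

// Hypothetical helper type that counts how often it is copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Returns a prvalue. Under C++17 rules it initializes the caller's
// object directly, without an intermediate temporary.
Tracker make_tracker() { return Tracker{}; }

// Returns the number of copy and move constructor calls that were observed
// while initializing a local object from the prvalue.
int count_transfers_on_init() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker t = make_tracker();
    (void)t;  // silence "unused variable" warnings
    return Tracker::copies + Tracker::moves;
}
```

With a C++17 compiler, count_transfers_on_init() is expected to return 0: the prvalue is never moved from, it is simply materialized in place.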

In the modern C++ standard, the category structure is given as a scheme: every expression belongs to exactly one of the three primary categories (lvalue, xvalue or prvalue), a glvalue is either an lvalue or an xvalue, and an rvalue is either an xvalue or a prvalue.

Let us look in general terms at the properties of the categories and at the language expressions that fall into each of them. Note that the lists of expressions given below for each category are not exhaustive; for more precise and detailed information, refer directly to the C++ standard.
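The category of a concrete expression can be checked by the compiler itself. The sketch below relies on the rule that decltype((e)) yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. The names Category, category_of, CATEGORY_OF, is_glvalue, is_rvalue and demo_categories are made up for this illustration and are not part of any library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper names, introduced only for this illustration.
enum class Category { lvalue, xvalue, prvalue };

// decltype((e)) is T for a prvalue, T& for an lvalue and T&& for an xvalue.
template <class T> struct category_of      { static constexpr Category value = Category::prvalue; };
template <class T> struct category_of<T&>  { static constexpr Category value = Category::lvalue; };
template <class T> struct category_of<T&&> { static constexpr Category value = Category::xvalue; };

#define CATEGORY_OF(expr) (category_of<decltype((expr))>::value)

// glvalue = lvalue or xvalue; rvalue = xvalue or prvalue.
constexpr bool is_glvalue(Category c) { return c != Category::prvalue; }
constexpr bool is_rvalue(Category c)  { return c != Category::lvalue; }

void demo_categories() {
    int x = 0;
    static_assert(CATEGORY_OF(x) == Category::lvalue, "a variable name is an lvalue");
    static_assert(CATEGORY_OF(std::move(x)) == Category::xvalue, "std::move(x) is an xvalue");
    static_assert(CATEGORY_OF(x + 1) == Category::prvalue, "built-in arithmetic yields a prvalue");
    // An xvalue belongs to both mixed categories at once.
    assert(is_glvalue(CATEGORY_OF(std::move(x))));
    assert(is_rvalue(CATEGORY_OF(std::move(x))));
    (void)x;  // x is otherwise used only in unevaluated operands
}
```

All the operands of decltype here are unevaluated, so the macro can be applied to any expression without side effects.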

glvalue

Glvalue expressions have the following properties:

can be implicitly converted to a prvalue;
can be polymorphic, i.e. the concepts of static and dynamic type make sense for them;
cannot have the void type - this follows directly from the identity property, because for expressions of type void there is no parameter that would allow one to distinguish them from one another;
may have an incomplete type, for example one introduced only by a forward declaration (if this is permitted for the specific expression).
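The first two properties can be illustrated with a small sketch. Shape, Circle, describe and read_value are hypothetical names chosen for the example; the point is only that a glvalue may denote an object whose dynamic type differs from its static type, and that reading a glvalue converts it to a prvalue.

```cpp
#include <cassert>
#include <string>

// Hypothetical class hierarchy used only for this illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "Shape"; }
};
struct Circle : Shape {
    std::string name() const override { return "Circle"; }
};

// The glvalue `shape` has the static type Shape,
// but the dynamic type of the object it denotes may be a derived class.
std::string describe(const Shape& shape) { return shape.name(); }

// Reading the glvalue `source` applies the implicit
// lvalue-to-rvalue conversion and produces a prvalue of type int.
int read_value(const int& source) {
    int copy = source;
    return copy;
}
```

For example, passing a Circle object to describe returns "Circle", although the parameter is declared as a reference to Shape.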

rvalue

Expressions in the rvalue category have the following properties:

you cannot take the address of an rvalue with the built-in & operator - for prvalues this follows directly from the lack of identity, and for xvalues the language simply forbids it;
cannot be the left operand of a built-in assignment or compound assignment operator;
can be used to initialize a const lvalue reference or an rvalue reference, in which case the lifetime of the temporary object is extended to the lifetime of the reference;
if used as an argument in a call to a function that has two overloads, one taking a const lvalue reference and the other an rvalue reference, the overload taking the rvalue reference is selected. This property is what the implementation of move semantics relies on:

    #include <iostream>
    #include <utility>

    class A {
    public:
        A() = default;
        A(const A&) { std::cout << "A::A(const A&)\n"; }
        A(A&&) { std::cout << "A::A(A&&)\n"; }
    };
    // ...
    A a;
    A b(a);             // calls A(const A&)
    A c(std::move(a));  // calls A(A&&)

Technically, the rvalue std::move(a) could initialize either a const lvalue reference or an rvalue reference. But thanks to this overload resolution rule there is no ambiguity: the constructor that accepts an rvalue reference is chosen.
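The lifetime extension mentioned in the list above can be sketched as follows. The function names make_greeting and extended_lifetime_demo are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical factory: the call expression make_greeting() is a prvalue.
std::string make_greeting() { return "hello"; }

std::size_t extended_lifetime_demo() {
    // Each temporary lives as long as the reference bound to it.
    const std::string& cref = make_greeting();
    std::string&& rref = make_greeting();

    // Through an rvalue reference the temporary can even be modified.
    rref += ", world";

    // Both temporaries are still alive here: 5 + 12 characters.
    return cref.size() + rref.size();
}
```

Without the references, each temporary would be destroyed at the end of the full expression that created it.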

lvalue

Properties:

all glvalue properties (see above);
you can take its address (using the built-in unary & operator);
modifiable lvalues can appear on the left side of an assignment or compound assignment operator;
can be used to initialize an lvalue reference (both const and non-const).

The lvalue category includes the following expressions:

the name of a variable, a function, or a data member of any type. Even if the variable is an rvalue reference, the name of that variable in an expression is an lvalue;

    void func() {}
    // ...
    auto* func_ptr = &func;  // OK: pointer to function
    auto& func_ref = func;   // OK: reference to function
    int&& rrn = int(123);
    auto* pn = &rrn;         // OK: pointer to the object
    auto& rn = rrn;          // OK: lvalue reference to the object

a call to a function or an overloaded operator that returns an lvalue reference, or a cast expression to an lvalue reference type;
built-in assignment and compound assignment operators (=, +=, /=, etc.), built-in pre-increment and pre-decrement (++a, --b), the built-in pointer dereference operator (*p);
the built-in subscript operator (a[n] or n[a]) when one of the operands is an lvalue array;
a call to a function or an overloaded operator that returns an rvalue reference to a function;
a string literal, for example "Hello, world!".

A string literal differs from all other literals in C++ in that it is an lvalue (albeit a non-modifiable one). For example, you can take its address:

    auto* p = &"Hello, world!";  // OK: pointer to an array of const char
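Several of the lvalue cases listed above can be verified at compile time. This is a sketch under the assumption that decltype((e)) is an lvalue reference type exactly when e is an lvalue; IS_LVALUE, global_counter, counter_ref and use_lvalues are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: true when decltype((expr)) is an lvalue reference.
#define IS_LVALUE(expr) (std::is_lvalue_reference<decltype((expr))>::value)

int global_counter = 0;
int& counter_ref() { return global_counter; }  // returns an lvalue reference

int use_lvalues() {
    int a = 0;
    int arr[3] = {1, 2, 3};
    int* p = arr;
    int&& rr = 5;
    (void)p;
    (void)rr;

    static_assert(IS_LVALUE(a), "variable name");
    static_assert(IS_LVALUE(rr), "name of an rvalue reference variable");
    static_assert(IS_LVALUE(counter_ref()), "call returning an lvalue reference");
    static_assert(IS_LVALUE(a = 1), "built-in assignment");
    static_assert(IS_LVALUE(++a), "built-in pre-increment");
    static_assert(IS_LVALUE(*p), "built-in dereference");
    static_assert(IS_LVALUE(arr[1]), "subscript on an lvalue array");
    static_assert(IS_LVALUE("Hello, world!"), "string literal");
    static_assert(!IS_LVALUE(a++), "post-increment is not an lvalue");

    (++a) = 10;         // the result of pre-increment can be assigned to
    (a += 1) += 1;      // so can the result of compound assignment: a is now 12
    counter_ref() = 3;  // and the result of a call returning int&
    return a + global_counter;
}
```

The expressions inside static_assert are unevaluated, so only the three statements at the end of use_lvalues change any state.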

prvalue

Properties:

all rvalue properties (see above);
cannot be polymorphic: the static and dynamic types of the expression are always the same;
cannot be of an incomplete type (except for the void type, which is discussed below);
cannot have an abstract type or be an array of elements of an abstract type.

The prvalue category includes the following expressions:

a literal (except a string literal), for example 42, true or nullptr;
a call to a function or an overloaded operator that returns a non-reference (str.substr(1, 2), str1 + str2, it++), or a cast expression to a non-reference type (for example, static_cast<double>(x), std::string{}, (int)42);
built-in post-increment and post-decrement (a++, b--), built-in arithmetic and bitwise operations (a + b, a % b, a & b, a << b, etc.), built-in logical operations (a && b, a || b, !a, etc.), comparison operations (a < b, a == b, a >= b, etc.), the built-in address-of operation (&a);
the this pointer;
an enumerator;
a non-type template parameter, if it is not of class type;
a lambda expression, for example [](int x){ return x * x; }.
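Most entries of this list can be checked with the same decltype technique: for a prvalue, decltype((e)) is not a reference type. The names IS_PRVALUE, Color, Widget, non_type_parameter_is_prvalue and prvalue_checks are hypothetical. Lambda expressions are left out of the sketch, because before C++20 they may not appear inside decltype.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper: decltype((expr)) is a non-reference type only for prvalues.
#define IS_PRVALUE(expr) (!std::is_reference<decltype((expr))>::value)

enum Color { Red, Green };

// A non-type template parameter of non-class type is a prvalue.
template <int N>
constexpr bool non_type_parameter_is_prvalue() { return IS_PRVALUE(N); }

struct Widget {
    // Inside a member function, `this` is a prvalue.
    bool this_is_prvalue() const { return IS_PRVALUE(this); }
};

bool prvalue_checks() {
    int a = 1, b = 2;
    std::string s = "abc";
    (void)a; (void)b; (void)s;  // used only in unevaluated operands below

    static_assert(IS_PRVALUE(42) && IS_PRVALUE(true) && IS_PRVALUE(nullptr), "literals");
    static_assert(IS_PRVALUE(s.substr(1, 2)), "call returning a non-reference");
    static_assert(IS_PRVALUE(s + s), "overloaded operator returning a non-reference");
    static_assert(IS_PRVALUE(static_cast<double>(a)), "cast to a non-reference type");
    static_assert(IS_PRVALUE(a++), "built-in post-increment");
    static_assert(IS_PRVALUE(a + b) && IS_PRVALUE(a < b), "arithmetic and comparison");
    static_assert(IS_PRVALUE(&a), "built-in address-of");
    static_assert(IS_PRVALUE(Red), "enumerator");

    return non_type_parameter_is_prvalue<3>() && Widget{}.this_is_prvalue();
}
```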

xvalue

Properties:

all rvalue properties (see above);
all glvalue properties (see above).

Examples of xvalue expressions:

a call to a function or an overloaded operator that returns an rvalue reference to an object, for example std::move(x);

indeed, you cannot take the address of the result of a call to std::move() or initialize a non-const lvalue reference with it, but at the same time this expression can be polymorphic:

    struct XA {
        virtual void f() { std::cout << "XA::f()\n"; }
    };
    struct XB : public XA {
        virtual void f() { std::cout << "XB::f()\n"; }
    };
    XA&& xa = XB();
    auto* p = &std::move(xa);  // error: cannot take the address of an xvalue
    auto& r = std::move(xa);   // error: cannot bind a non-const lvalue reference to an xvalue
    std::move(xa).f();         // prints "XB::f()"

the built-in subscript operator (a[n] or n[a]) when one of the operands is an rvalue array.
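That an xvalue combines rvalue behaviour with identity can be shown in a few lines. Point and modify_through_xvalue are hypothetical names used only in this sketch.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical aggregate used only for this illustration.
struct Point { int x; int y; };

int modify_through_xvalue() {
    Point p{1, 2};
    static_assert(std::is_same<decltype((std::move(p))), Point&&>::value,
                  "std::move(p) is an xvalue");

    // rvalue property: the xvalue binds to an rvalue reference.
    Point&& r = std::move(p);

    // glvalue property: the xvalue has identity, so r and p are the same object.
    r.x = 10;
    return p.x;
}
```

Note that std::move itself moves nothing: it only changes the category of the expression, which is why p.x reflects the write made through r.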

Some special cases

Comma operator

For the built-in comma operator, the category of the expression always matches the category of the second operand.

    int n = 0;
    auto* pn = &(1, n);         // OK: lvalue
    auto& rn = (1, n);          // OK: lvalue
    1, n = 2;                   // OK: lvalue
    auto* pt = &(1, int(123));  // error: rvalue
    auto& rt = (1, int(123));   // error: rvalue

Void type expressions

Calls to functions that return void, cast expressions to void, and throw expressions are considered prvalues, but they cannot be used to initialize references or as function arguments.

Ternary conditional operator

Determining the category of the expression a ? b : c is a non-trivial case; everything depends on the categories of the second and third operands (b and c):

if one of b and c has type void (which is allowed only when it is a throw expression), then the category and type of the whole expression are those of the other operand. If both operands have type void, the result is a prvalue of type void;
if b and c are glvalues of the same type and the same category, then the result is a glvalue of that type and category;
in other cases, the result is a prvalue.

The standard defines a number of rules for the ternary operator under which implicit conversions can be applied to the operands b and c, but this is somewhat beyond the scope of the article; I refer interested readers to the Conditional operator [expr.cond] section of the standard.

    int n = 1;
    int v = (1 > 2) ? throw 1 : n;  // lvalue: throw has type void, so the category of n is taken
    ((1 < 2) ? n : v) = 2;          // also an lvalue, so the assignment compiles
    ((1 < 2) ? n : int(123)) = 2;   // error: the expression is a prvalue
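The second and third rules can also be checked at compile time. The sketch assumes the decltype((e)) convention (T& for an lvalue, T&& for an xvalue, T for a prvalue); ternary_demo is a hypothetical function name.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

int ternary_demo() {
    int n = 1, v = 5;

    static_assert(std::is_same<decltype((true ? n : v)), int&>::value,
                  "two lvalues of the same type give an lvalue");
    static_assert(std::is_same<decltype((true ? std::move(n) : std::move(v))), int&&>::value,
                  "two xvalues of the same type give an xvalue");
    static_assert(std::is_same<decltype((true ? n : 123)), int>::value,
                  "an lvalue and a prvalue give a prvalue");

    // Because the result is an lvalue, it can be assigned to:
    // this overwrites the smaller of the two variables, here n.
    (n < v ? n : v) = 0;
    return n + v;
}
```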

Access to data members and member functions of classes and structures

For expressions of the form a.m and p->m (here we mean the built-in -> operator), the following rules apply:

if m is an enumerator or a non-static member function, then the whole expression is considered a prvalue (although a reference cannot be initialized with such an expression);
if a is an rvalue and m is a non-static data member of non-reference type, then the whole expression is an xvalue;
otherwise it is an lvalue.

For pointers to class members (a.*mp and p->*mp), the rules are similar:

if mp is a pointer to a member function, then the whole expression is considered a prvalue;
if a is an rvalue and mp is a pointer to a data member, then the whole expression is an xvalue;
otherwise it is an lvalue.
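These rules for data members can be sketched with a small structure. Holder and member_access_demo are hypothetical names; the checks again assume that decltype((e)) is T& for an lvalue and T&& for an xvalue.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type used only for this illustration.
struct Holder {
    int value;          // non-static data member of non-reference type
    int& ref;           // non-static data member of reference type
    static int shared;  // static data member
};
int Holder::shared = 0;

int member_access_demo() {
    int target = 1;
    Holder h{2, target};
    int Holder::* pm = &Holder::value;
    (void)pm;  // used only in unevaluated operands below

    static_assert(std::is_same<decltype((h.value)), int&>::value, "lvalue.m is an lvalue");
    static_assert(std::is_same<decltype((std::move(h).value)), int&&>::value, "rvalue.m is an xvalue");
    static_assert(std::is_same<decltype((std::move(h).ref)), int&>::value, "reference member: lvalue");
    static_assert(std::is_same<decltype((std::move(h).shared)), int&>::value, "static member: lvalue");
    static_assert(std::is_same<decltype((h.*pm)), int&>::value, "lvalue.*mp is an lvalue");
    static_assert(std::is_same<decltype((std::move(h).*pm)), int&&>::value, "rvalue.*mp is an xvalue");

    int&& member = std::move(h).value;  // the xvalue binds to an rvalue reference
    member = 7;                         // and still denotes the subobject h.value
    std::move(h).ref = 9;               // lvalue, so it can be assigned to: target becomes 9
    return h.value + target;
}
```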

Bit fields

Bit fields are a convenient tool for low-level programming, but their implementation falls somewhat outside the general structure of expression categories. For example, an expression that refers to a bit field looks like an lvalue, since it can appear on the left side of an assignment operator. At the same time, it is not possible to take the address of a bit field or to bind a non-const reference to it. You can initialize a const reference from a bit field, but a temporary copy of the object will be created:

Bit-fields [class.bit]
If the initializer for a reference of type const T& is an lvalue that refers to a bit-field, the reference is bound to a temporary initialized to hold the value of the bit-field; the reference is not bound to the bit-field directly.

    struct BF {
        int f : 3;
    };
    BF b;
    b.f = 1;          // OK
    auto* pb = &b.f;  // error: cannot take the address of a bit field
    auto& rb = b.f;   // error: cannot bind a non-const reference to a bit field
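The temporary copy created for a const reference can be observed directly. Flags and bit_field_reference_demo are hypothetical names used only in this sketch.

```cpp
#include <cassert>

// Hypothetical structure with a 3-bit signed field (values -4..3).
struct Flags {
    int level : 3;
};

int bit_field_reference_demo() {
    Flags flags{};
    flags.level = 1;

    // The reference binds to a temporary int holding the value 1,
    // not to the bit field itself.
    const int& snapshot = flags.level;

    // Changing the bit field afterwards does not affect the temporary.
    flags.level = 2;

    return snapshot * 10 + flags.level;  // 1 * 10 + 2
}
```

If the reference were bound to the bit field directly, snapshot would read 2 after the second assignment and the function would return 22 instead of 12.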

In place of a conclusion

As I mentioned in the introduction, this description does not claim to be complete; it only gives a general idea of the categories of expressions. That idea should make the paragraphs of the standard and compiler error messages a little easier to understand.

Source: https://habr.com/ru/post/441742/
