
C++0x (C++11). Lambda expressions

Just the other day I stumbled upon an article on Habr about lambda expressions from the new (upcoming) C++ standard. The article is good and makes the advantages of lambda expressions clear, but it seemed to me not complete enough, so I decided to present the material in more detail.







Recall the basics




Lambda expressions are one of the features of functional languages that have recently begun to be added to imperative languages such as C# and C++. A lambda expression is an anonymous local function that can be created directly inside an expression.



The previous article compared lambda expressions with function pointers and functors. So the first thing to understand is this: lambda expressions in C++ are a shorthand for writing anonymous functors. Consider an example:



    // Listing 1
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <vector>

    using namespace std;

    int main()
    {
        vector<int> srcVec;
        for (int val = 0; val < 10; val++) {
            srcVec.push_back(val);
        }

        for_each(srcVec.begin(), srcVec.end(), [](int _n) {
            cout << _n << " ";
        });
        cout << endl;

        return EXIT_SUCCESS;
    }




In fact, this code entirely corresponds to this:



    // Listing 2
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <vector>

    using namespace std;

    class MyLambda
    {
    public:
        void operator ()(int _x) const { cout << _x << " "; }
    };

    int main()
    {
        vector<int> srcVec;
        for (int val = 0; val < 10; val++) {
            srcVec.push_back(val);
        }

        for_each(srcVec.begin(), srcVec.end(), MyLambda());
        cout << endl;

        return EXIT_SUCCESS;
    }




The output will accordingly be as follows:

0 1 2 3 4 5 6 7 8 9





What is worth noting here? First, Listing 1 shows that a lambda expression always starts with [] (the brackets may be non-empty, more on this later), followed by an optional parameter list and then the function body. Second, we did not specify the return type, and by default the lambda returns void (we will see later how and why to specify the return type explicitly). Third, as Listing 2 shows, the generated method is const by default (we will return to this too).



I don't know about you, but I like the for_each written with a lambda expression much more. Let's try a slightly more complicated example:



    // Listing 3
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <vector>

    using namespace std;

    int main()
    {
        vector<int> srcVec;
        for (int val = 0; val < 10; val++) {
            srcVec.push_back(val);
        }

        int result = count_if(srcVec.begin(), srcVec.end(), [] (int _n) {
            return (_n % 2) == 0;
        });
        cout << result << endl;

        return EXIT_SUCCESS;
    }




In this case the lambda acts as a unary predicate, that is, its return type is bool, although we have not stated this anywhere. If the lambda body consists of a single return statement, the compiler deduces the return type on its own. If the body contains an if or switch (or other more complex constructs), as in the code below, then under the C++11 rules the compiler can no longer be relied upon:



    // Listing 4
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;

    int main()
    {
        vector<int> srcVec;
        for (int val = 0; val < 10; val++) {
            srcVec.push_back(val);
        }

        vector<double> destVec;
        transform(srcVec.begin(), srcVec.end(), back_inserter(destVec), [] (int _n) {
            if (_n < 5)
                return _n + 1.0;
            else if (_n % 2 == 0)
                return _n / 2.0;
            else
                return _n * _n;
        });

        ostream_iterator<double> outIt(cout, " ");
        copy(destVec.begin(), destVec.end(), outIt);
        cout << endl;

        return EXIT_SUCCESS;
    }




The code in Listing 4 does not compile; Visual Studio, for example, reports the following error for each return:

  "Error C3499: a lambda that has been specified to have a void return type cannot return a value"


The compiler cannot deduce the return type on its own, so we must specify it explicitly:



    // Listing 5
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;

    int main()
    {
        vector<int> srcVec;
        for (int val = 0; val < 10; val++) {
            srcVec.push_back(val);
        }

        vector<double> destVec;
        transform(srcVec.begin(), srcVec.end(), back_inserter(destVec), [] (int _n) -> double {
            if (_n < 5)
                return _n + 1.0;
            else if (_n % 2 == 0)
                return _n / 2.0;
            else
                return _n * _n;
        });

        ostream_iterator<double> outIt(cout, " ");
        copy(destVec.begin(), destVec.end(), outIt);
        cout << endl;

        return EXIT_SUCCESS;
    }




Now the compilation is successful, and the output, as expected, will be as follows:

1 2 3 4 5 25 3 49 4 81





The only thing we added in Listing 5 is the return type of the lambda expression, written as -> double. The syntax is a bit odd and looks more like Haskell than C++. But the return type could not be specified "on the left" (as with functions), because a lambda must begin with [ so that the compiler can recognize it.



Capturing variables from the enclosing context





All the lambda expressions above looked like anonymous functions because they stored no intermediate state. But lambda expressions in C++ are anonymous functors, which means they can store state! Using lambda expressions, let's write a program that prints how many numbers fall into a user-defined interval [lower; upper):



    // Listing 6
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <numeric>
    #include <vector>

    using namespace std;

    int main()
    {
        vector<int> srcVec;
        for (int val = 0; val < 10; val++) {
            srcVec.push_back(val);
        }

        int lowerBound = 0, upperBound = 0;
        cout << "Enter the value range: ";
        cin >> lowerBound >> upperBound;

        int result = count_if(srcVec.begin(), srcVec.end(),
            [lowerBound, upperBound] (int _n) {
                return lowerBound <= _n && _n < upperBound;
            });
        cout << result << endl;

        return EXIT_SUCCESS;
    }




Finally we have reached the point where the lambda expression does not begin with empty brackets. As Listing 6 shows, variables can appear inside the square brackets. This is called the "capture list". What is it for? At first glance it may seem that main() is the enclosing scope for the lambda expression and that we can freely use its variables inside the lambda body, but this is not so. Why? Because the lambda body is in fact the body of an overloaded operator()() (the function call operator) inside an anonymous functor. That is, for the code in Listing 6 the compiler implicitly generates something like this:



    // Listing 7
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;

    class MyLambda
    {
    public:
        MyLambda(int _lowerBound, int _upperBound)
            : m_lowerBound(_lowerBound)
            , m_upperBound(_upperBound)
        {}

        bool operator ()(int _n) const
        {
            return m_lowerBound <= _n && _n < m_upperBound;
        }

    private:
        int m_lowerBound, m_upperBound;
    };

    int main()
    {
        vector<int> srcVec;
        for (int val = 0; val < 10; val++) {
            srcVec.push_back(val);
        }

        int lowerBound = 0, upperBound = 0;
        cout << "Enter the value range: ";
        cin >> lowerBound >> upperBound;

        int result = count_if(srcVec.begin(), srcVec.end(), MyLambda(lowerBound, upperBound));
        cout << result << endl;

        return EXIT_SUCCESS;
    }




Listing 7 makes things clearer. Our lambda has become a functor, inside which we cannot directly use the variables declared in main(), since the two scopes do not overlap. To make lowerBound and upperBound accessible after all, these variables are stored inside the functor itself (this is the "capture"): the constructor initializes them, and operator()() uses them. I deliberately gave these members names with the "m_" prefix to emphasize the difference.



If we try to modify the captured variables inside the lambda, we will fail, because by default the generated operator()() is declared const. To get around this, we can add the mutable specifier, as in the following example:



    // Listing 8
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <numeric>
    #include <vector>

    using namespace std;

    int main()
    {
        vector<int> srcVec;
        int init = 0;
        generate_n(back_inserter(srcVec), 10, [init] () mutable {
            return init++;
        });

        ostream_iterator<int> outIt(cout, " ");
        copy(srcVec.begin(), srcVec.end(), outIt);
        cout << endl << "init: " << init << endl;

        return EXIT_SUCCESS;
    }




Earlier I mentioned that the parameter list of a lambda can be omitted when it is empty. However, for the compiler to parse mutable correctly, we must write the empty parameter list explicitly.

When we run the program from Listing 8, we get the following:

0 1 2 3 4 5 6 7 8 9

init: 0





As you can see, thanks to the mutable keyword we can change the value of the captured variable inside the lambda body but, as one would expect, these changes do not affect the local variable, since the capture is by value. C++ also lets us capture variables by reference and even specify a default capture mode. What does this mean? We do not have to list every variable in the capture list: instead we can simply specify the default capture mode, and the compiler will automatically capture all variables from the enclosing context that are used inside the lambda. The default capture mode has its own syntax: [=] for capture by value and [&] for capture by reference. An individual capture mode can be given for each variable, but the default mode is, of course, specified only once, at the very beginning of the capture list. Here are the possible forms:



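    []                // capture nothing from the enclosing scope
    [=]               // capture all variables by value
    [&]               // capture all variables by reference
    [x, y]            // capture x and y by value
    [&x, &y]          // capture x and y by reference
    [in, &out]        // capture in by value and out by reference
    [=, &out1, &out2] // capture all variables by value, except out1 and out2,
                      // which are captured by reference
    [&, x, &y]        // capture all variables by reference, except x…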




Note that syntax like &out here does not mean taking an address. It should be read rather as SomeType & out, that is, simply passing a parameter by reference. Consider an example:



    // Listing 9
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;

    int main()
    {
        vector<int> srcVec;
        int init = 0;
        generate_n(back_inserter(srcVec), 10, [&] () mutable {
            return init++;
        });

        ostream_iterator<int> outIt(cout, " ");
        copy(srcVec.begin(), srcVec.end(), outIt);
        cout << endl << "init: " << init << endl;

        return EXIT_SUCCESS;
    }




This time, instead of explicitly capturing the variable init, I specified the default capture mode: [&]. Now, when the compiler encounters a variable from the enclosing context inside the lambda body, it automatically captures it by reference. Here is code equivalent to Listing 9:



    // Listing 10
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;

    class MyLambda
    {
    public:
        explicit MyLambda(int & _init) : init(_init) { }

        int operator ()() { return init++; }

    private:
        int & init;
    };

    int main()
    {
        vector<int> srcVec;
        int init = 0;
        generate_n(back_inserter(srcVec), 10, MyLambda(init));

        ostream_iterator<int> outIt(cout, " ");
        copy(srcVec.begin(), srcVec.end(), outIt);
        cout << endl << "init: " << init << endl;

        return EXIT_SUCCESS;
    }




And accordingly, the output will be as follows:

0 1 2 3 4 5 6 7 8 9

init: 10





The main thing now is not to get confused about what is passed by reference, where, and when. In fact, if we specify [&] and omit mutable, we can still change the value of the captured variable, and the change will affect the local variable: a const operator()() only means that we cannot change what the reference refers to, and that is impossible anyway.



If the lambda expression has the form [=] (int & _val) mutable {...}, the variables are captured by value and only their internal copies change, while the parameter is passed by reference, so changes to it are reflected in the original. If it is [] (const SomeBigObject & _val) {...}, nothing is captured and the parameter is received by const reference, and so on.



As far as I understand, it is impossible to capture "by const reference". Well, perhaps we don't need it.



And what happens if we write a slightly contrived lambda expression like this inside a class method:



    // Listing 11
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;

    class MyMegaInitializer
    {
    public:
        MyMegaInitializer(int _base, int _power)
            : m_val(_base)
            , m_power(_power)
        {}

        void initializeVector(vector<int> & _vec)
        {
            for_each(_vec.begin(), _vec.end(), [m_val, m_power] (int & _val) mutable {
                _val = m_val;
                m_val *= m_power;
            });
        }

    private:
        int m_val, m_power;
    };

    int main()
    {
        vector<int> myVec(11);
        MyMegaInitializer initializer(1, 2);
        initializer.initializeVector(myVec);

        return EXIT_SUCCESS;
    }




Contrary to our expectations, the code does not compile, because the compiler cannot capture m_val and m_power: these are class members, not variables from the enclosing function scope. This is what Visual Studio says:

  "Error C3480: 'MyMegaInitializer::m_power': a lambda capture variable must be from an enclosing function scope"


What can we do? To gain access to class members, we need to put this in the capture list:



    // Listing 12
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;

    class MyMegaInitializer
    {
    public:
        MyMegaInitializer(int _base, int _power)
            : m_val(_base)
            , m_power(_power)
        {}

        void initializeVector(vector<int> & _vec)
        {
            for_each(_vec.begin(), _vec.end(), [this] (int & _val) mutable {
                _val = m_val;
                m_val *= m_power;
            });
        }

    private:
        int m_val, m_power;
    };

    int main()
    {
        vector<int> myVec(11);
        MyMegaInitializer initializer(1, 2);
        initializer.initializeVector(myVec);

        for_each(myVec.begin(), myVec.end(), [] (int _val) {
            cout << _val << " ";
        });
        cout << endl;

        return EXIT_SUCCESS;
    }




This program does exactly what we expected:

1 2 4 8 16 32 64 128 256 512 1024





Note that this can only be captured by value; an attempt to capture it by reference makes the compiler report an error. Even if you write [&] instead of [this] in Listing 12, this will still be captured by value.



Other





In addition to all of the above, the header of a lambda expression can contain a throw list, that is, a list of exceptions the lambda may throw. For example, this lambda cannot throw exceptions:

 [] (int _n) throw() { … } 


And this one throws only bad_alloc:

 [=] (const std::string & _str) mutable throw(std::bad_alloc) -> bool { … } 


Etc.



Naturally, if no list is specified, the lambda can throw any exception.



Fortunately, in the final version of the standard throw specifications were declared deprecated. Instead there is the noexcept keyword, which says that the function must not throw any exception at all.



Thus, the general form of a lambda expression is as follows (forgive the rather informal grammar):

     lambda-expression ::=
                   '[' [<capture_list>] ']'
                 ['(' <parameter_list> ')' ['mutable']]
                 ['noexcept']
                 ['->' <return_type>]
                   '{' [<lambda_body>] '}'




Reusing lambda expressions. Generating lambda expressions.





All of the above is quite convenient, but the main power of lambda expressions comes from the fact that we can store a lambda in a variable or pass it to a function as a parameter. Boost provides the Function class for this, and it became part of the new standard library as std::function, declared in <functional>. When this article was first written, these features were available only in the std::tr1 namespace.



The ability to save lambda expressions allows us not only to reuse lambdas, but also to write functions that generate lambdas, and even lambdas that generate lambdas.



Consider the following example:



    // Listing 13
    #include <algorithm>
    #include <cstdlib>
    #include <functional>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;
    // function is std::function from <functional>
    // (before C++11 it was available as std::tr1::function)

    int main()
    {
        vector<int> myVec;
        int init = 0;
        generate_n(back_inserter(myVec), 10, [&] {
            return init++;
        });

        function<void (int)> traceLambda = [] (int _val) -> void {
            cout << _val << " ";
        };

        for_each(myVec.begin(), myVec.end(), traceLambda);
        cout << endl;

        function<function<int (int)> (int)> lambdaGen =
            [] (int _val) -> function<int (int)> {
                return [_val] (int _n) -> int { return _n + _val; };
            };

        transform(myVec.begin(), myVec.end(), myVec.begin(), lambdaGen(2));

        for_each(myVec.begin(), myVec.end(), traceLambda);
        cout << endl;

        return EXIT_SUCCESS;
    }




This program displays:

0 1 2 3 4 5 6 7 8 9

2 3 4 5 6 7 8 9 10 11





Let's look at it in more detail. First we initialize the vector using generate_n(). That part is simple. Next we create a variable traceLambda of type function<void (int)> (that is, a function that takes an int and returns void) and assign it a lambda expression that prints a value followed by a space to the console. Then we use the lambda we just saved to print all the elements of the vector.



After this we see a rather convoluted declaration of lambdaGen, a lambda expression that takes one int parameter and returns another lambda, which in turn takes an int and returns an int.



Following this, we apply transform() to all elements of the vector, passing lambdaGen(2) as the transforming function. In fact, lambdaGen(2) returns another lambda, which adds 2 to its argument and returns the result. This code is, of course, a bit contrived, because the same thing could be written as

    transform(myVec.begin(), myVec.end(), myVec.begin(), bind2nd(plus<int>(), 2));


but as an example it is quite illustrative.



Then we again print the values of all the elements of the vector using the lambda saved earlier in traceLambda.



In fact, the code in Listing 13 could be written even more concisely. In the new C++ standard the meaning of the auto keyword changes. Previously auto meant that a variable is created on the stack and was implied whenever you did not specify something else (register, for example); now it is an analog of var in C# (that is, the type of a variable declared as auto is deduced by the compiler from the expression the variable is initialized with).

Note that an auto variable cannot hold values of different types during a single program run. C++ was and remains a statically typed language, and auto only tells the compiler to work out the type by itself: after initialization, the type of the variable can no longer change.



In addition, the auto keyword is very useful in loops like

    for (auto it = vec.begin(); it != vec.end(); ++it)
    {
        // ...
    }


and it is very convenient to use with lambda expressions. Now the code from Listing 13 can be rewritten as:



    // Listing 14
    #include <algorithm>
    #include <cstdlib>
    #include <functional>
    #include <iostream>
    #include <iterator>
    #include <vector>

    using namespace std;
    // function is std::function from <functional>
    // (before C++11 it was available as std::tr1::function)

    int main()
    {
        vector<int> myVec;
        int init = 0;
        generate_n(back_inserter(myVec), 10, [&] {
            return init++;
        });

        auto traceLambda = [] (int _val) -> void {
            cout << _val << " ";
        };

        for_each(myVec.begin(), myVec.end(), traceLambda);
        cout << endl;

        auto lambdaGen = [] (int _val) -> function<int (int)> {
            return [_val] (int _n) -> int { return _n + _val; };
        };

        transform(myVec.begin(), myVec.end(), myVec.begin(), lambdaGen(2));

        for_each(myVec.begin(), myVec.end(), traceLambda);
        cout << endl;

        return EXIT_SUCCESS;
    }




With that, I will conclude the description of lambda expressions. If you have questions, corrections, or comments, I will be happy to hear them.



PROFIT!






ETA (02.20.2012): It turned out that this article is still relevant for some people, so I fixed the syntax highlighting and corrected the information about throw lists in the lambda declaration. I decided not to add C++11 features other than lambda expressions themselves (for example, container initializer lists), so the article remains practically unchanged.

Source: https://habr.com/ru/post/66021/


