Copy semantics and resource management in C++

In C++, the programmer must decide how resources are released; there is no automatic mechanism such as a garbage collector. This article discusses possible solutions to this problem, examines the pitfalls in detail, and covers a number of related issues.

Table of contents

Introduction
1. Basic copy-ownership strategies
1.1. Copy prohibition strategy
1.2. Exclusive ownership strategy
1.3. Deep copy strategy
1.4. Joint ownership strategy
2. Deep copy strategy - problems and solutions
2.1. Copy on write
2.2. Definition of the state exchange function for a class
2.3. Elimination of intermediate copies by the compiler
2.4. Implementing move semantics
2.5. Emplacement vs. insertion
2.6. Summary
3. Possible options for implementing a joint ownership strategy
4. Exclusive ownership strategy and move semantics
5. Copy prohibition strategy: quick start
6. The life cycle of the resource and of the resource-owning object
6.1. Resource acquisition is initialization
6.2. Advanced options for managing the life cycle of the resource
6.2.1. Extended life cycle of the resource
6.2.2. Single acquisition of the resource
6.2.3. Increasing the level of indirection
6.3. Joint ownership
7. Summary
Appendices
Appendix A. Rvalue references
Appendix B. Move semantics
Bibliography

Introduction

Resource management is something a C++ programmer has to deal with all the time. Resources include memory blocks, OS kernel objects, thread synchronization locks, network connections, database connections, and simply any object created in dynamic memory. A resource is accessed through a handle (descriptor); the handle type is usually a pointer or one of its aliases (HANDLE, etc.), sometimes an integer (UNIX file descriptors). After use, a resource must be released; otherwise, sooner or later, an application that does not release resources (and possibly other applications as well) will run short of them. The problem is acute enough that one of the key features of .NET, Java, and several other platforms is a unified resource management system based on garbage collection.

The object-oriented features of C++ naturally suggest the following solution: the class that manages the resource holds the resource handle as a member, initializes the handle when the resource is acquired, and releases the resource in the destructor. After some reflection (or experience), however, it becomes clear that things are not that simple. The main problem is copy semantics. If the class that manages the resource uses the compiler-generated copy constructor, then after copying an object we have two copies of the handle to the same resource. If one object releases the resource, the other may then try to use or release an already released resource, which is incorrect in any case and can lead to undefined behavior: anything may happen, for example a program crash.

Fortunately, C++ lets the programmer take full control of copying by defining the copy constructor and the copy assignment operator, which makes it possible to solve the problem described above, usually in more than one way. The implementation of copying must be closely tied to the mechanism for releasing the resource; together we will call the two a copy-ownership strategy. The well-known Rule of Three (the "Big Three") states that if a programmer defines at least one of three operations (the copy constructor, the copy assignment operator, or the destructor), then all three should be defined. Copy-ownership strategies specify how to do this. There are four basic copy-ownership strategies.

1. Basic copy-ownership strategies

Before the resource is acquired, or after it has been released, the handle must hold a special value indicating that it is not associated with any resource. This is usually zero, sometimes -1, cast to the handle type. In either case we will call such a handle null. The class that manages the resource must recognize a null handle and must not try to use or release the resource in that case.

1.1. Copy prohibition strategy

This is the simplest strategy: copying and assigning class instances is simply forbidden. The destructor releases the acquired resource. Prohibiting copying in C++ is not difficult: the class declares, but does not define, a private copy constructor and a private copy assignment operator.

 class X
 {
 private:
     X(const X&);
     X& operator=(const X&);
 // ...
 };

Attempts to copy are rejected by the compiler (in code outside the class) or by the linker (in members and friends of the class).

The C++11 standard offers a special syntax for this case:

 class X
 {
 public:
     X(const X&) = delete;
     X& operator=(const X&) = delete;
 // ...
 };

This syntax is more intuitive and gives more understandable compiler messages when attempting to copy.

In the C++98 standard library, the copy prohibition strategy was used by the I/O stream classes (std::fstream, etc.), and on Windows by many MFC classes (CFile, CEvent, CMutex, etc.). In the C++11 standard library, this strategy is used by some of the classes that support multithreaded synchronization.
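The copy prohibition strategy can be checked at compile time. The following is a minimal sketch, not a definitive implementation: the class NonCopyable, the counter g_live_resources (which stands in for a real resource API), and the function ReleasedOnScopeExit() are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Simulated resource: a counter of live resources stands in for a real OS API.
static int g_live_resources = 0;

class NonCopyable {
public:
    NonCopyable() : m_handle(++g_live_resources) {}   // acquire: the handle is never 0
    ~NonCopyable() { --g_live_resources; }            // release
    NonCopyable(const NonCopyable&) = delete;
    NonCopyable& operator=(const NonCopyable&) = delete;
    int Handle() const { return m_handle; }
private:
    int m_handle;
};

// Any attempt to copy or assign is rejected at compile time.
static_assert(!std::is_copy_constructible<NonCopyable>::value, "copying must be rejected");
static_assert(!std::is_copy_assignable<NonCopyable>::value, "assignment must be rejected");

// Checks that the resource is held inside the scope and released on exit.
inline bool ReleasedOnScopeExit() {
    const int before = g_live_resources;
    {
        NonCopyable guard;
        if (g_live_resources != before + 1) return false;
    }
    return g_live_resources == before;
}
```

Because the copy operations are declared, even as deleted, the compiler does not generate move operations either, so such an object cannot be transferred at all.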

1.2. Exclusive ownership strategy

With this strategy, copying and assignment move the resource handle from the source object to the target object, so the handle always exists in a single copy. After copying or assignment, the source object holds a null handle and cannot use the resource. The destructor releases the acquired resource. [Josuttis] uses the terms exclusive ownership and strict ownership for this strategy; Andrei Alexandrescu [Alexandrescu] uses the term destructive copy. In C++11 this is done as follows: ordinary copying and copy assignment are prohibited in the manner described above, and move semantics is implemented, that is, a move constructor and a move assignment operator are defined. (Move semantics is discussed in more detail below.)

 class X
 {
 public:
     X(const X&) = delete;
     X& operator=(const X&) = delete;
     X(X&& src) noexcept;
     X& operator=(X&& src) noexcept;
 // ...
 };

Thus, the exclusive ownership strategy can be considered an extension of the copy prohibition strategy.

In the standard C++ library, this strategy is used by the smart pointer std::unique_ptr<> and by some other classes, for example std::thread and std::unique_lock<>, as well as by classes that previously used the copy prohibition strategy (std::fstream, etc.). On Windows, the MFC classes that previously used the copy prohibition strategy (CFile, CEvent, CMutex, etc.) also switched to the exclusive ownership strategy.
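A minimal sketch of the exclusive ownership strategy follows. It assumes an integer handle where 0 is the null handle; the class Owner, the counter g_released (a stand-in for a real release call), and TransferDemo() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <utility>

static int g_released = 0;   // counts simulated release calls

class Owner {
public:
    explicit Owner(int handle) noexcept : m_handle(handle) {}
    ~Owner() { Reset(); }
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;
    // Moving transfers the handle and leaves the source with the null handle.
    Owner(Owner&& src) noexcept : m_handle(src.m_handle) { src.m_handle = 0; }
    Owner& operator=(Owner&& src) noexcept {
        if (this != &src) {
            Reset();                 // release what we currently own
            m_handle = src.m_handle;
            src.m_handle = 0;
        }
        return *this;
    }
    bool Empty() const noexcept { return m_handle == 0; }
    int Handle() const noexcept { return m_handle; }
private:
    void Reset() noexcept {
        if (m_handle != 0) { ++g_released; m_handle = 0; }   // a null handle is never released
    }
    int m_handle;
};

// Returns the number of releases: one on move assignment, one at scope exit.
inline int TransferDemo() {
    g_released = 0;
    {
        Owner a(7);
        Owner b(std::move(a));       // a becomes null, b owns 7
        assert(a.Empty() && b.Handle() == 7);
        Owner c(9);
        c = std::move(b);            // releases 9, takes over 7
        assert(b.Empty() && c.Handle() == 7);
    }
    return g_released;
}
```

Only two releases are counted for two resources, although three objects were involved: the moved-from objects hold the null handle and release nothing.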

1.3. Deep copy strategy

With this strategy, class instances can be copied and assigned. The copy constructor and the copy assignment operator must be defined so that the target object makes its own copy of the resource held by the source object. After that, each object owns its own copy of the resource and can use, modify, and release it independently. The destructor releases the acquired resource. Objects that use the deep copy strategy are sometimes called value objects.

This strategy does not apply to all resources. It can be applied to resources associated with a memory buffer, such as strings, but it is not very clear how to apply it to OS kernel objects such as files, mutexes, etc.

The deep copy strategy is used by all kinds of string classes, by std::vector<>, and by the other containers of the standard library.
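As an illustration of the deep copy strategy, here is a minimal sketch of a class that owns a memory buffer. The class Buffer and the function CopiesAreIndependent() are hypothetical and serve only as an example; a real class would normally rely on std::vector<> instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

class Buffer {
public:
    explicit Buffer(std::size_t size) : m_size(size), m_data(new int[size]()) {}
    // Copy constructor: allocates a new resource and copies the contents.
    Buffer(const Buffer& src) : m_size(src.m_size), m_data(new int[src.m_size]) {
        std::copy(src.m_data, src.m_data + m_size, m_data);
    }
    // Copy assignment: allocate first, so *this is untouched if allocation throws.
    Buffer& operator=(const Buffer& src) {
        if (this != &src) {
            int* fresh = new int[src.m_size];
            std::copy(src.m_data, src.m_data + src.m_size, fresh);
            delete[] m_data;
            m_data = fresh;
            m_size = src.m_size;
        }
        return *this;
    }
    ~Buffer() { delete[] m_data; }
    int& operator[](std::size_t i) { return m_data[i]; }
    int operator[](std::size_t i) const { return m_data[i]; }
    std::size_t Size() const { return m_size; }
private:
    std::size_t m_size;
    int* m_data;
};

// After copying, each object can modify its own copy independently.
inline bool CopiesAreIndependent() {
    Buffer a(3);
    a[0] = 1;
    Buffer b(a);
    b[0] = 2;
    Buffer c(1);
    c = a;
    return a[0] == 1 && b[0] == 2 && c[0] == 1 && c.Size() == 3;
}
```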

1.4. Joint ownership strategy

With this strategy (also called shared ownership), class instances can be copied and assigned. The copy constructor and the copy assignment operator must be defined so that they copy the resource handle (along with other data) but not the resource itself. After that, each object holds its own copy of the handle and can use and modify the resource, but must not release it while at least one other object holds a copy of the handle. The resource is released when the last object holding a copy of the handle goes out of scope. Ways of implementing this are described below.

The joint ownership strategy is often used by smart pointers; it is also a natural fit for immutable resources. In the C++11 standard library, this strategy is implemented by the smart pointer std::shared_ptr<>.
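The behavior of joint ownership can be observed with std::shared_ptr<>. In this small sketch, the type Resource, the counter g_released, and the function LastOwnerReleases() are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <memory>

static int g_released = 0;   // counts how many times the resource was released

struct Resource {
    ~Resource() { ++g_released; }
};

inline bool LastOwnerReleases() {
    g_released = 0;
    std::shared_ptr<Resource> first = std::make_shared<Resource>();
    {
        std::shared_ptr<Resource> second = first;   // copies the handle, not the resource
        if (first.use_count() != 2) return false;
    }
    if (g_released != 0) return false;   // one owner is still alive
    first.reset();                       // the last owner gives up the resource
    return g_released == 1;
}
```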

2. Deep copy strategy - problems and solutions

Consider the function template that exchanges the states of two objects of type T, as defined in the C++98 standard library.

 template<typename T>
 void swap(T& a, T& b)
 {
     T tmp(a);
     a = b;
     b = tmp;
 }

If type T owns a resource and uses the deep copy strategy, this costs three allocations of a new resource, three copy operations, and three resource releases. Yet in most cases the operation can be carried out without allocating or copying anything at all: it is enough for the objects to exchange their internal data, including the resource handle. There are many similar situations in which temporary copies of a resource are created and released almost immediately. Such inefficiency in everyday operations stimulated the search for ways to optimize them. Let us consider the main options.

2.1. Copy on write

Copy on write (COW), also known as deferred (lazy) copying, can be seen as an attempt to combine the deep copy strategy with the joint ownership strategy. Initially, copying an object copies only the resource handle, not the resource itself; the resource becomes shared among its owners and is available read-only. As soon as some owner needs to modify the shared resource, the resource is copied, and that owner then works with its own copy. COW solves the state exchange problem: no additional resource allocation or copying takes place. COW is quite popular in string implementations; CString (MFC, ATL) is one example. Possible ways of implementing COW and the problems that arise are discussed in [Meyers1] and [Sutter]. [Guntheroth] proposes an implementation of COW based on std::shared_ptr<>. COW is problematic in a multithreaded environment, which is why the C++11 standard library does not allow COW for strings; see [Josuttis] and [Guntheroth].

Taking the COW idea further leads to the following resource management scheme: the resource is immutable and is managed by objects that use the joint ownership strategy; when a change is needed, a new, appropriately modified resource is created and a new owner object is returned. This scheme is used for strings and other immutable objects on the .NET and Java platforms. In functional programming it is also used for more complex data structures.
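The COW idea can be sketched on top of std::shared_ptr<>. This is only an illustration in the spirit of the approach mentioned above, not the implementation from [Guntheroth]; the class CowString and its members are hypothetical, and the use_count() check is valid only in single-threaded code.

```cpp
#include <cassert>
#include <memory>
#include <string>

class CowString {
public:
    explicit CowString(const std::string& text)
        : m_data(std::make_shared<std::string>(text)) {}
    // The compiler-generated copy operations copy only the shared_ptr (the handle).
    const std::string& Get() const { return *m_data; }
    void Append(const std::string& tail) {
        Detach();                // make a private copy before modifying
        m_data->append(tail);
    }
    bool SharesWith(const CowString& other) const { return m_data == other.m_data; }
private:
    void Detach() {
        if (m_data.use_count() > 1)   // shared: copy the resource (single-threaded assumption)
            m_data = std::make_shared<std::string>(*m_data);
    }
    std::shared_ptr<std::string> m_data;
};

inline bool CowDemo() {
    CowString a("abc");
    CowString b(a);                          // cheap: the buffer is shared
    const bool shared_before = a.SharesWith(b);
    b.Append("def");                         // first write triggers the real copy
    return shared_before && !a.SharesWith(b) && a.Get() == "abc" && b.Get() == "abcdef";
}
```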

2.2. Definition of the state exchange function for a class

It was shown above how inefficient a state exchange function can be when it is implemented in the straightforward way, through copying and assignment. Yet it is used quite widely; for example, many standard library algorithms rely on it. For the algorithms to use a function defined specifically for the class instead of std::swap(), two steps are required.

1. Define in the class a member function Swap() (the name is not important) that implements the exchange of states.

 class X
 {
 public:
     void Swap(X& other) noexcept;
 // ...
 };

This function must not throw exceptions; in C++11 such functions should be declared noexcept.

2. In the same namespace as class X (usually in the same header file), define a free (non-member) function swap() as follows (here the name and the signature do matter):

 inline void swap(X& a, X& b) noexcept { a.Swap(b); }

After that, the standard library algorithms will use this function instead of std::swap(). This is made possible by a mechanism called argument-dependent lookup (ADL). For more on ADL, see [Dewhurst1].

In the standard C++ library, all containers, smart pointers, and other classes implement the state exchange function in the manner described above.

The member function Swap() is usually easy to define: apply the state exchange operation in turn to the base classes and members if they support it, and std::swap() otherwise.

The description above is somewhat simplified; a more detailed one can be found in [Meyers2]. Problems related to the state exchange function are also discussed in [Sutter/Alexandrescu].
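The two steps can be put together in a small sketch that also shows how generic code picks up the class-specific function through ADL. The namespace lib, the counter g_custom_swaps, and the functions GenericExchange() and AdlDemo() are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <utility>

namespace lib {
static int g_custom_swaps = 0;   // counts calls of the class-specific exchange

class X {
public:
    explicit X(int handle) noexcept : m_handle(handle) {}
    // Step 1: member function that exchanges the states.
    void Swap(X& other) noexcept {
        using std::swap;
        swap(m_handle, other.m_handle);
        ++g_custom_swaps;
    }
    int Handle() const noexcept { return m_handle; }
private:
    int m_handle;
};

// Step 2: free function in the same namespace as X.
inline void swap(X& a, X& b) noexcept { a.Swap(b); }
} // namespace lib

// Generic code: make std::swap visible as a fallback, then call swap unqualified.
template <typename T>
void GenericExchange(T& a, T& b) {
    using std::swap;
    swap(a, b);      // for lib::X, ADL finds lib::swap, which is preferred over the template
}

inline bool AdlDemo() {
    lib::X a(1), b(2);
    const int before = lib::g_custom_swaps;
    GenericExchange(a, b);
    return a.Handle() == 2 && b.Handle() == 1 && lib::g_custom_swaps == before + 1;
}
```

A qualified call std::swap(a, b) in generic code would bypass the class-specific function, which is why the unqualified call with a using-declaration is the usual form.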

The state exchange function can be regarded as one of the basic operations of a class. Other operations can be defined elegantly in terms of it. For example, the copy assignment operator is defined through copying and Swap() as follows:

 X& X::operator=(const X& src)
 {
     X tmp(src);
     Swap(tmp);
     return *this;
 }

This pattern is called the copy-and-swap idiom, or Herb Sutter's idiom; see [Sutter], [Sutter/Alexandrescu], and [Meyers2] for details. A variation of it can be used to implement move semantics; see sections 2.4 and 6.2.1.

2.3. Elimination of intermediate copies by the compiler

Consider the class

 class X
 {
 public:
     X(/* parameters */);
 // ...
 };

and the function

 X Foo()
 {
 // ...
     return X(/* arguments */);
 }

With the straightforward approach, returning from Foo() involves copying an instance of X. Compilers, however, can remove the copy operation from the code and construct the object directly at the call site. This is called return value optimization (RVO). Compiler developers have used RVO for a long time, and it is sanctioned by the C++11 standard. Although the decision to apply RVO is made by the compiler, the programmer can write code that favors it. To this end, it is desirable that the function have a single return point and that the type of the returned expression match the return type of the function. In some cases it is worth defining a special private constructor, called a "computational constructor"; see [Dewhurst2] for details. RVO is also discussed in [Meyers3] and [Guntheroth].

Compilers can also remove intermediate copies in other situations.
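Whether copies are actually removed can be observed by counting constructor calls. The sketch below uses a hypothetical type Counted and hypothetical functions MakeCounted() and CopiesAfterReturn(). It checks only the number of copies: if the compiler does not elide the temporary, the move constructor is used rather than the copy constructor, so the number of moves depends on the compiler and language mode.

```cpp
#include <cassert>

struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Single return point; the type of the returned expression matches the return type.
inline Counted MakeCounted() {
    return Counted();
}

inline int CopiesAfterReturn() {
    Counted::copies = 0;
    Counted::moves = 0;
    Counted c = MakeCounted();   // with RVO the object is constructed directly in c
    (void)c;
    return Counted::copies;
}
```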

2.4. Implementing move semantics

Implementing move semantics means defining a move constructor, which takes an rvalue reference to the source object, and a move assignment operator with the same parameter.

In the C++11 standard library, the state exchange function template is defined as follows:

 template<typename T>
 void swap(T& a, T& b)
 {
     T tmp(std::move(a));
     a = std::move(b);
     b = std::move(tmp);
 }

According to the overload resolution rules for functions with rvalue-reference parameters (see Appendix A), if type T has a move constructor and a move assignment operator, they will be used, and no temporary resources are allocated or copied. Otherwise the copy constructor and the copy assignment operator are used.

Move semantics avoids the creation of temporary copies in a much wider context than the state exchange function described above. It applies to any rvalue, that is, to a temporary, unnamed value, and also to the return value of a function if that value was created locally (including a local lvalue) and RVO was not applied. In all these cases it is guaranteed that the source object cannot be used in any way after the move. Move semantics also applies to an lvalue to which std::move() has been applied. In that case, however, the programmer is responsible for how the source object is used after the move (std::swap() is an example).

The C++11 standard library was reworked to take move semantics into account. Many classes gained a move constructor and a move assignment operator, as well as other member functions with rvalue-reference parameters. For example, std::vector<T> has an overload void push_back(T&& src). All this makes it possible to avoid temporary copies in many cases.
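The difference between the two push_back() overloads can be made visible by counting copies and moves. In this sketch the type Tracked and the function FillVector() are hypothetical; reserve() is called first so that reallocation does not add extra operations to the counts.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

inline void FillVector() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    std::vector<Tracked> v;
    v.reserve(2);                 // no reallocation: only push_back itself is counted
    Tracked t;
    v.push_back(t);               // lvalue argument: push_back(const T&) copies
    v.push_back(std::move(t));    // rvalue argument: push_back(T&&) moves
}
```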

 class X { public:    X() noexcept {/*    */}    void Swap(X& other) noexcept {/*   */}    X(X&& src) noexcept : X()    {        Swap(src);    }    X& operator=(X&& src) noexcept    {        X tmp(std::move(src)); //         Swap(tmp);        return *this;    } // ... };

The relocation constructor and relocation assignment operator belong to those member functions for which it is highly desirable to ensure that they do not throw exceptions, and, accordingly, are declared as noexcept . This allows you to optimize some operations of the containers of the standard library without violating the strict security guarantee of exceptions, for more details, see [Meyers3] and [Guntheroth]. The proposed template provides such a guarantee, provided that the default constructor and the member state exchange function do not throw exceptions.
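The role of noexcept can be illustrated with std::move_if_noexcept(), the standard utility that selects between moving and copying based on whether the move constructor may throw. The types ThrowingMove and NoexceptMove below are hypothetical; the sketch only inspects types at compile time and is not a description of any particular container implementation.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct ThrowingMove {
    ThrowingMove() {}
    ThrowingMove(const ThrowingMove&) {}
    ThrowingMove(ThrowingMove&&) {}              // not declared noexcept
};

struct NoexceptMove {
    NoexceptMove() {}
    NoexceptMove(const NoexceptMove&) {}
    NoexceptMove(NoexceptMove&&) noexcept {}
};

static_assert(!std::is_nothrow_move_constructible<ThrowingMove>::value,
              "move may throw");
static_assert(std::is_nothrow_move_constructible<NoexceptMove>::value,
              "move does not throw");

// For a potentially throwing move, the result is a const lvalue reference (a copy follows);
// for a noexcept move, the result is an rvalue reference (a move follows).
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<ThrowingMove&>())),
                           const ThrowingMove&>::value,
              "falls back to copying");
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<NoexceptMove&>())),
                           NoexceptMove&&>::value,
              "moving is selected");
```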

The C++11 standard allows the compiler to generate the move constructor and the move assignment operator automatically; to request this, declare them with the "= default" construct.

 class X
 {
 public:
     X(X&&) = default;
     X& operator=(X&&) = default;
 // ...
 };

The generated operations apply the move operation in turn to the base classes and members of the class if they support moving, and the copy operation otherwise. Clearly, this is far from always acceptable: raw handles are not moved (for them a move is just a copy), yet they usually must not be copied. Under certain conditions the compiler generates such a move constructor and move assignment operator on its own initiative, but it is better not to rely on this: the conditions are rather complicated and can easily change as the class evolves. See [Meyers3] for details.
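The problem with defaulted move operations and raw handles can be shown with a short sketch. The types RawHolder and SafeHolder are hypothetical; the handle is modeled as an int where 0 is the null handle, and neither type has a destructor, so nothing is actually released twice here.

```cpp
#include <cassert>
#include <utility>

struct RawHolder {
    int handle;                                   // raw handle; 0 means "null"
    explicit RawHolder(int h) noexcept : handle(h) {}
    RawHolder(RawHolder&&) = default;             // memberwise move: for int, a plain copy
};

struct SafeHolder {
    int handle;
    explicit SafeHolder(int h) noexcept : handle(h) {}
    // Hand-written move: take the handle and null the source.
    SafeHolder(SafeHolder&& src) noexcept : handle(src.handle) { src.handle = 0; }
};

inline bool DefaultedMoveKeepsSourceHandle() {
    RawHolder a(5);
    RawHolder b(std::move(a));
    return a.handle == 5 && b.handle == 5;        // two objects refer to the same resource
}

inline bool HandWrittenMoveNullsSource() {
    SafeHolder a(5);
    SafeHolder b(std::move(a));
    return a.handle == 0 && b.handle == 5;        // exactly one owner remains
}
```

If RawHolder released its handle in a destructor, the defaulted move would lead to a double release, which is exactly the situation the copy-ownership strategies are meant to prevent.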

In general, implementing and using move semantics is a rather subtle matter. The compiler may copy where the programmer expects a move. Here are several rules that eliminate, or at least reduce, the likelihood of this.

1. If possible, prohibit copying.
2. Declare the move constructor and the move assignment operator noexcept.
3. Implement move semantics for base classes and members.
4. Apply std::move() to function parameters of rvalue-reference type.

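Rule 2 was discussed above. Rule 4 follows from the fact that a named rvalue reference is itself an lvalue (see Appendix A). Here is an example.

 class B
 {
 // ...
     B(B&& src) noexcept;
 };

 class D : public B
 {
 // ...
     D(D&& src) noexcept;
 };

 D::D(D&& src) noexcept
     : B(std::move(src)) // src is an lvalue here, so std::move() is required
 {/* ... */}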

For another example of this rule, see section 6.2.1.

2.5. Emplacement vs. insertion

, RVO (. 2.3), , . ( ), , . , . C++11 - emplace() , emplace_front() , emplace_back() , . , - — (variadic templates), . , C++11 — .

, , .
, , .

, .

 std::vector<std::string> vs;
 vs.push_back(std::string(3, 'X')); // insertion
 vs.emplace_back(3, '7');           // emplacement

std::string , . . , , . , [Meyers3].
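The difference between insertion and emplacement can be measured by counting constructor calls. The sketch below uses a hypothetical type Item and a hypothetical function InsertAndEmplace(); it mirrors the std::string example above, with counters added.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Item {
    static int from_args;   // calls of the (count, char) constructor
    static int copies;
    static int moves;
    std::string text;
    Item(std::size_t n, char c) : text(n, c) { ++from_args; }
    Item(const Item& src) : text(src.text) { ++copies; }
    Item(Item&& src) noexcept : text(std::move(src.text)) { ++moves; }
};
int Item::from_args = 0;
int Item::copies = 0;
int Item::moves = 0;

inline std::vector<Item> InsertAndEmplace() {
    Item::from_args = Item::copies = Item::moves = 0;
    std::vector<Item> v;
    v.reserve(2);                 // keep reallocation out of the counts
    v.push_back(Item(3, 'X'));    // insertion: a temporary is created, then moved into the vector
    v.emplace_back(3, '7');       // emplacement: the element is constructed in place from the arguments
    return v;
}
```

Insertion costs one construction plus one move, while emplacement costs a single construction; for types that are cheap to move the gain is small.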

2.6. Summary

, , . - . . — : , . , , , . : , , «» .

: , , .NET Java. , Clone() Duplicate() .

- - , :

.
.
- rvalue-.

.NET Java - , , .NET IClonable . , .

3. Possible options for implementing a joint ownership strategy

, . - , . , . Windows: , HANDLE , COM-. DuplicateHandle() , CloseHandle() . COM- - IUnknown::AddRef() IUnknown::Release() . ATL ComPtr<> , COM- . UNIX, C, _dup() , .

C++11 std::shared_ptr<> . , , , , , . , . std::shared_ptr<> [Josuttis], [Meyers3].

: - , ( ). ( ) , . std::shared_ptr<> std::weak_ptr<> . . [Josuttis], [Meyers3].
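The interplay of std::shared_ptr<> and std::weak_ptr<> can be sketched as follows. This assumes the common use of std::weak_ptr<> as a non-owning observer, for example to break a reference cycle between two objects; the type Node and the functions NodesLeftAfterScope() and WeakPtrExpires() are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Node {
    std::shared_ptr<Node> next;   // owning link
    std::weak_ptr<Node> prev;     // non-owning back link: does not keep the node alive
    static int alive;
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Two nodes refer to each other; because the back link is weak, both are destroyed.
inline int NodesLeftAfterScope() {
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->next = b;
        b->prev = a;   // with shared_ptr here, the nodes would keep each other alive
    }
    return Node::alive;
}

// A weak_ptr can tell whether the resource still exists.
inline bool WeakPtrExpires() {
    std::weak_ptr<int> observer;
    {
        std::shared_ptr<int> owner = std::make_shared<int>(1);
        observer = owner;
        if (observer.expired()) return false;
    }
    return observer.expired() && !observer.lock();
}
```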

- [Alexandrescu]. ( ) , [Schildt]. , .

( ) [Alger].

-. [Josuttis] [Alexandrescu].

- .NET Java. , , , .

4. Exclusive ownership strategy and move semantics

, C++ rvalue- . C++98 std::auto_ptr<> , , , . , , ( ). C++11 rvalue- , , . C++11 std::auto_ptr<> std::unique_ptr<> . , [Josuttis], [Meyers3].
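Exclusive ownership as implemented by std::unique_ptr<> can be demonstrated with a few lines. The function OwnershipMoves() is a hypothetical name; the sketch relies only on the documented behavior that a moved-from std::unique_ptr<> is null.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Copying is prohibited; moving is allowed.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr cannot be copied");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr can be moved");

inline bool OwnershipMoves() {
    std::unique_ptr<int> a(new int(5));
    std::unique_ptr<int> b(std::move(a));   // explicit transfer of ownership; a becomes null
    return !a && b && *b == 5;
}
```

Unlike std::auto_ptr<>, the transfer has to be requested explicitly with std::move() when the source is an lvalue, so ownership cannot be lost by an innocent-looking copy.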

: - ( std::fstream , etc.), ( std::thread , std::unique_lock<> , etc.). MFC , ( CFile , CEvent , CMutex , etc.).

5. Copy prohibition strategy: quick start

. , . , , , . , , , ( ) . , , , . ( ) , . , . — . 6.

, - -, « », - . - . , , , , - . «».

6. The life cycle of the resource and of the resource-owning object

, - . , -. .

6.1. Resource acquisition is initialization

- . , , :

. , .
.
.

, , , . C++11 .

« » (resource acquisition is initialization, RAII). RAII ( ), ., [Dewhurst1]. «» RAII. , , , (immutable) RAII.
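The RAII idiom can be sketched with a guard object whose constructor acquires a resource and whose destructor releases it. The class Guard, the counters g_acquired and g_released (stand-ins for a real resource API), and the functions Work() and ReleasedEvenOnException() are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <stdexcept>

static int g_acquired = 0;   // simulated acquisitions
static int g_released = 0;   // simulated releases

class Guard {
public:
    Guard() { ++g_acquired; }     // the resource is acquired in the constructor
    ~Guard() { ++g_released; }    // and released in the destructor
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
};

inline void Work(bool fail) {
    Guard guard;                  // lives exactly as long as the enclosing scope
    if (fail) throw std::runtime_error("simulated failure");
}

// The destructor runs on normal exit and during stack unwinding alike.
inline bool ReleasedEvenOnException() {
    g_acquired = g_released = 0;
    Work(false);
    try {
        Work(true);
    } catch (const std::runtime_error&) {
        // the guard has already been destroyed at this point
    }
    return g_acquired == 2 && g_released == 2;
}
```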

6.2. Advanced options for managing the life cycle of the resource

, RAII, , , . - , , - . , , , . .

6.2.1. Extended life cycle of the resource

, , , , :

C++11 , , , . , - clear() , , , . . , shrink_to_fit() , , (. ).

, RAII, , , . , .

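 class X
 {
 public:
 // RAII
     X(const X&) = delete;            // copying is prohibited
     X& operator=(const X&) = delete; // copy assignment is prohibited
     X(/* parameters */);             // acquires the resource
     ~X();                            // releases the resource
 // added for the extended life cycle
     X() noexcept;                    // creates a "null" object
     X(X&& src) noexcept;             // move constructor
     X& operator=(X&& src) noexcept;  // move assignment
 // ...
 };

 X x;                    // create a "null" object
 x = X(/* arguments */); // acquire a resource
 x = X(/* arguments */); // acquire a new resource, release the current one
 x = X();                // release the resource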

std::thread .

2.4, - . , - - . .

 class X
 {
 // RAII
 // ...
 public:
 // added for the extended life cycle
     X() noexcept;
     X(X&& src) noexcept;
     X& operator=(X&& src) noexcept;
     void Swap(X& other) noexcept;  // exchanges states
     void Create(/* parameters */); // acquires the resource
     void Close() noexcept;         // releases the resource
 // ...
 };

 X::X() noexcept {/* initialize the null handle */}

 X::X(X&& src) noexcept : X()
 {
     Swap(src);
 }

 X& X::operator=(X&& src) noexcept
 {
     X tmp(std::move(src)); // move
     Swap(tmp);
     return *this;
 }

- :

 void X::Create(/* parameters */)
 {
     X tmp(/* arguments */); // acquire the resource
     Swap(tmp);
 }

 void X::Close() noexcept
 {
     X tmp;
     Swap(tmp);
 }

, , , - . , , , . , .

- « », , . : , , ( ). : , . , : , , . , . [Sutter], [Sutter/Alexandrescu], [Meyers2].

, RAII .

6.2.2. Single acquisition of the resource

RAII . , , , , :

, .
.
. , .
.
.

«» RAII, — . , , . 3. . «», .

6.2.3. Increasing the level of indirection

— . RAII , . , . , , ( -). - ( -). 6.2.1, .

6.3. Joint ownership

, - RAII, : . , , .

7. Summary

, , , , . - -.

4 -:

. , - : , , - .

, . , , -, , .

- . . , (. 6.2.3). , (. 6.2.1). , . , , . , std::shared_ptr<> .

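Appendices

Appendix A. Rvalue references

Rvalue references are a kind of ordinary C++ reference; the difference lies in the initialization rules and in the overload resolution rules for functions with rvalue-reference parameters. The rvalue-reference type for a type T is written T&&.

 class Int
 {
     int m_Value;
 public:
     Int(int val) : m_Value(val) {}
     int Get() const { return m_Value; }
     void Set(int val) { m_Value = val; }
 };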

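Like ordinary references, rvalue references must be initialized.

 Int&& r0; // error C2530: 'r0' : references must be initialized

The first difference between rvalue references and ordinary C++ references is that they cannot be initialized with an lvalue. Example:

 Int i(7);
 Int&& r1 = i; // error C2440: 'initializing' : cannot convert from 'Int' to 'Int &&'

Correct initialization requires an rvalue:

 Int&& r2 = Int(42); // OK
 Int&& r3 = 5;       // OK

Alternatively, an lvalue can be explicitly cast to an rvalue-reference type:

 Int&& r4 = static_cast<Int&&>(i); // OK

Instead of the cast to an rvalue reference, one can use the standard function (more precisely, function template) std::move(), which does the same thing (header <utility>).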

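Rvalue references can be initialized with an rvalue of a built-in type; for ordinary references this is prohibited.

 int&& r5 = 2 * 2; // OK
 int& r6 = 2 * 2;  // error

After initialization, rvalue references can be used like ordinary references.

 Int&& r = 7;
 std::cout << r.Get() << '\n'; // prints: 7
 r.Set(19);
 std::cout << r.Get() << '\n'; // prints: 19

Rvalue references convert implicitly to ordinary references.

 Int&& r = 5;
 Int& x = r;           // OK
 const Int& cx = r;    // OK

Rvalue references are rarely used as variables; they are usually used as function parameters. In accordance with the initialization rules, a function whose parameter is an rvalue reference can be called only with rvalue arguments.

 void Foo(Int&&);
 Int i(7);
 Foo(i);            // error: i is an lvalue
 Foo(std::move(i)); // OK
 Foo(Int(4));       // OK
 Foo(5);            // OK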

, rvalue rvalue- , . rvalue-.

, , , rvalue-, (ambiguous) rvalue .

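 void Foo(Int&&);
 void Foo(const Int&);

 Int i(7);
 Foo(i);            // Foo(const Int&)
 Foo(std::move(i)); // Foo(Int&&)
 Foo(Int(6));       // Foo(Int&&)
 Foo(9);            // Foo(Int&&)

An important point: a named rvalue reference is itself an lvalue.

 Int&& r = 7;
 Foo(r);            // Foo(const Int&)
 Foo(std::move(r)); // Foo(Int&&)

This must be taken into account when defining functions with rvalue-reference parameters: such parameters are lvalues and may require std::move(). See the example in section 2.4.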
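The overload resolution rules above can be checked with a compilable variant of the example, in which each overload reports its own signature. The functions ViaNamedReference(), ViaMove(), and Forwarder() are hypothetical helpers added for this sketch; the class Int is a shortened copy of the one used in this appendix.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Int {
    int m_Value;
public:
    Int(int val) : m_Value(val) {}
    int Get() const { return m_Value; }
};

// Each overload returns its own signature so the choice can be observed.
inline std::string Foo(Int&&) { return "Foo(Int&&)"; }
inline std::string Foo(const Int&) { return "Foo(const Int&)"; }

inline std::string ViaNamedReference() {
    Int&& r = 7;                 // r has a name, so the expression r is an lvalue
    return Foo(r);
}

inline std::string ViaMove() {
    Int&& r = 7;
    return Foo(std::move(r));    // std::move() turns it back into an rvalue
}

// Inside a function, an rvalue-reference parameter is an lvalue as well.
inline std::string Forwarder(Int&& src) {
    return Foo(std::move(src));
}
```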

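Another C++11 feature related to rvalue references is reference qualifiers for non-static member functions. They make it possible to overload on the value category (lvalue/rvalue) of the object that this points to.

 class X
 {
 public:
     X();
     void DoIt() &;  // this points to an lvalue
     void DoIt() &&; // this points to an rvalue
 // ...
 };

 X x;
 x.DoIt();   // DoIt() &
 X().DoIt(); // DoIt() &&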

Appendix B. Move semantics

, ( std::string , std::vector<> , etc.) . — . , rvalue- . , , - , - , . , , , rvalue, lvalue. , rvalue. . , ( lvalue), RVO.

Bibliography

[Alexandrescu]
Alexandrescu, Andrei. Modern C++ Design. Trans. from English. Moscow: Williams Publishing House, 2002.

[Guntheroth]
, . C++. .: . from English — .: «-», 2017.

[Josuttis]
, . C++: , 2- .: . from English - M .: OOO “I.D. », 2014.

[Dewhurst1]
, . C++. , 2- .: . from English — .: -, 2013.

[Dewhurst2]
, . C++. .: . from English — .: , 2012.

[Meyers1]
, . C++. 35 .: . from English — .: , 2000.

[Meyers2]
, . C++. 55 .: . from English — .: , 2014.

[Meyers3]
, . C++: 42 C++11 C ++14.: . from English - M .: OOO “I.D. », 2016.

[Sutter]
, . C++.: . from English — : «.. », 2015.

[Sutter/Alexandrescu]
, . , . ++.: . from English - M .: OOO “I.D. », 2015.

[Schildt]
, . C++.: . from English — .: -, 2005.

[Alger]
, . C++: .: . from English — .: « «», 1999.

Source: https://habr.com/ru/post/425837/
