📜 ⬆️ ⬇️

Eggs.Variant - Part I

The publication of this translation encouraged me to comment the user @encyclopedist to the recent article "Factory Method Without Placing in Dynamic Memory" . The article interested me, but a brief googling did not reveal a translation. “Disorder.” - I thought - “Such an interesting article on C ++ and not translated into Russian. I should fix it. ”

Table of contents
  1. Introduction
  2. Design
  3. Implementation

  4. What is not yet said


Reflections on the development of Eggs.Variant - a generic type-safe tagging union in C ++ 11/14 .
')

Introduction


A union is a special type of class that at a time can store only one of its non-static members. It takes up as much space as is required to accommodate the largest of its members.
9 [class] / 5 A union is a class defined with the keyword union ; at the same time he can keep only one of his members (9.5). [...]
9.5 [class.union] / 1 Only one of the non-static members can be active in the union, that is, at the moment, the value of only one of its non-static members can be stored in the association. [...] The size of the union is sufficient to contain the largest of its non-static members. Each non-static member is located in memory as if it is the only member of the structure. All non-static members of the union object have the same address.

Original
9 [class] / 5 A union is a class defined with the class-key union; it holds at a time (9.5). [...]
9.5 [class.union] / In union data data data data data union at any time. [...] non-static data members. Each member of a struct. All non-static data is the same as address.





In C ++ 98 , union members are restricted to trivial object types. For these types, their life time begins when the storage is received and ends when it is reused or released.
3.8 [basic.life] / 1 [...] The lifetime of an object of type T begins when:
  • storage obtained with type T alignment and size and
  • object initialization, if it is non-trivial, completed.

The lifetime of an object of type T ends when:
  • if T is a type of a class with a non-trivial destructor (12.4), the call to the destructor begins, or
  • the repository that the object occupies has been reused or released.


Original
3.8 [basic.life] / 1 [...] T begins when:
  • size is obtained, and
  • if the object has a non-trivial initialization, its initialization is complete.

T ends when:
  • if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
  • The storage of the object occupies is reused or released.



This special guarantee allows you to change the active member of the union simply by assigning a new value to the association, allowing you to effectively reuse the repository — which is in good agreement with the spirit of the standard, if not with a letter.
In addition, the union does not know which member — if any — is active, so its special member functions must be implemented without knowledge of this information. Because union members are limited to trivial types, special member functions can be implemented in terms of the underlying byte union that is independent of the active member.
9.5 [class.union] / 1 [...] A class object with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4) or a non-trivial copy assignment operator (13.5.3, 12.8) cannot be a member union, or an element of the array lying in the union. [...]

Original
9.5 [class.union] / 1 [...] An object of a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non -trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can it be an array of such objects. [...]


In C ++ 11, this restriction was removed; union members can now be of any type. Switching between non-trivial members requires the explicit destruction of the current active member and the use of the host operator new to construct a new active member.
9.5 [class.union] / 4 [Note: in general, it is necessary to use one explicit call to the destructor and one explicit call to the placing operator new to change the active member of the enumeration. —End of note] [Example: Consider an object u , having the type of enumeration U with non-static members m type M and n type N If M has a non-trivial destructor, and N has a non-trivial constructor (for example, if they declare or inherit virtual functions), the active member u can be safely changed from m to n by using the following destructor and placing new operator:
 um~M(); new (&u.n) N; 

—The end of the example]

Original
9.5 [class.union] / 4 [note: In general, it should be noted . —End note] [Example: Consider a non-static data type and have a non-static data members and a n of type N If you have a non-trivial constructor (for example, if you declare or inherit virtual functions), follows:
 um~M(); new (&u.n) N; 

—End example]


If a special member function of any of the members of a union is nontrivial, then the special member function of the union itself will be implicitly defined as remote, unless it is provided by the user.
9.5 [class.union] / 2 [Note: if any non-static member of the union has a non-trivial default constructor (12.1), a copy constructor (12.8), a displacement constructor (12.8), a copy assignment operator (12.8), a move assignment operator (12.8) or destructor (12.4), the corresponding member function of the union must be provided by the user, or it will be implicitly removed (8.4.3) from the union. —End of note]
9.5 [class.union] / 3 [Example: Consider the following union:
 union U { int i; float f; std::string s; }; 

Since the type std::string (21.3) declares non-trivial versions of all special member functions, type U will implicitly remove the default constructor, the copy / move constructors, the copy / move assignment operators, and the destructor. To use type U some of these member functions must be provided by the user. —The end of the example]

Original
9.5 [class.union] / 2 (non-trivial default constructor (12.1), copy constructor (12.8), move constructor (12.8), copy assignment operator (12.8) , move assignment operator (12.8), or delete it (8.4.3) for the union. —End note]
9.5 [class.union] / 3 [Example: Consider the following union:
 union U { int i; float f; std::string s; }; 

Since he has been an implicitly defaulted constructor, he has been defined as a constructor To use it, it must be user-provided. —End example]


These nontrivial member functions can be provided — with their usual semantics — only if the active member of the enumeration is known so that it can be passed on. A tagged union is a union or a class similar to a union that has knowledge about itself , that is, it contains some identifier that lets you know which member — if any, is currently active. Marked union can provide all special member functions regardless of their triviality.
An instance of the class eggs::variant<Ts...> is a marked union of objects with types Ts . It provides a natural interface for switching the active member and provides all the special member functions with their usual semantics:
 eggs::variants<N, M> u; // u     u = M{}; //    u   M u = N{}; //    u   N,      //    - using U = eggs::variants<int, float, std::string>; 


Design


The ultimate goal of design is the synthesis and improvement of the design of the marked association without compromising its functionality. That is, he should not have a special member to select the current active member of the union, or he should occupy very little space:
 struct U { union { T0 m0; ...; TN mN; }; std::size_t which; } u; 

is replaced by this:
 using V = eggs::variant<T0, ..., TN>; V v; 

In particular:

The interface is mainly based on std::experimental::optional<T> , as defined in the Technical Specification of the main library . The conceptual model optional<T> consists of a tagged union of the types nullopt_t and T Design decisions made for optional<T> are easily transferred to the variant<Ts...> , whose conceptual model consists of the marked-up union of the nullvariant_t types and those that are hidden behind Ts . The semantics of all special member functions and related operators, as well as the interface for switching the active member — through design, assignment, or placement — is borrowed from optional<T> .
Access to the active member is done in the style of std::function , which returns a pointer to the target if it is requested with the correct target type — something like dynamic_cast for the poor. In addition, it also allows you to get a null pointer to the active member (if there is one), which has been very useful to simplify the implementation of auxiliary functions.
Finally, helper classes like std::tuple provided, as well as access to elements by index or type — albeit with unobvious semantics closer to the semantics of casting at run time.
Reference documentation can be found here .

Implementation


Direct implementation of variant<Ts> as the storage would use the underlying relaxed union :
 template <typename ...Ts> union storage { nullvariant_t nullvariant; Ts... members; }; 

However, the above code is not correct in terms of C ++ syntax, because the template parameter set cannot be disclosed in this context — it must contain the names of the members so that they can be referenced. Instead, a recursive approach should be used to build the underlying storage:
 template <typename ...Ts> union storage; template <typename T, typename ...Ts> union storage<T, Ts...> { nullvariant_t nullvariant; T head; storage<Ts...> tail; }; template <> union storage<> { nullvariant_t nullvariant; }; 

Unfortunately, this is not so easy, given that any non-trivial special member function for a type from Ts as a result, remove the corresponding special member function from the repository. To use this implementation, at least the default constructor and destructor must be implemented for the resulting list, although the destructor cannot do anything useful.
The simplest implementation, which was used before relaxed unions appeared in C ++ , would use bare storage, suitable for storing any of the types in Ts - attention, spoiler: in some cases it will not work. The standard even provides a special property to facilitate our work:
10.20.7.6 [meta.trans.other]
 template <std::size_t Len, class... Types> struct aligned_union; 

  • Condition: At least one type must be provided.
  • Comments: The typedef to the member type must be a POD type, applicable for use as an uninitialized repository for any object whose type is listed in the Types list; its size must be at least Len . The static member alignment_value must be an integer constant of type std::size_t , whose value determines the strictest alignment for all types listed in the Types list.


Original
10.20.7.6 [meta.trans.other]
 template <std::size_t Len, class... Types> struct aligned_union; 

  • Condition: At least one type is provided.
  • This is the case: its size shall be at least Len . This is the case for the quotation.



It should be noted that this property has already been removed from the working draft — along with the advent of relaxed associations — and is now a potential candidate for obsolescence. A possible replacement in C ++ 14 might be:
 template <std::size_t Len, typename ...Types> struct aligned_union { static constexpr std::size_t alignment_value = std::max({alignof(Types)...}); struct type { alignas(alignment_value) unsigned char _[std::max({Len, sizeof(Types)...})]; }; }; 

Using aligned_union as the storage type, a simplified version of variant<Ts> can be implemented as follows:
 template <typename ...Ts> class variant { template <typename T> struct _index_of { /*...*/ }; //  T  Ts...,   0 public: static constexpr std::size_t npos = std::size_t(-1); variant() noexcept : _which{npos} {} template <typename T> variant(T const& v) : _which{_index_of<T>::value} { new (target<T>()) T(v); //   T     //   new } /*...*/ std::size_t which() const noexcept { return _which; } template <typename T> T* target() noexcept { return _which == _index_of<T>::value ? static_cast<T*>(static_cast<void*>(&_storage)) : nullptr; } template <typename T> T const* target() const noexcept { return _which == _index_of<T>::value ? static_cast<T const*>(static_cast<void const*>(&_storage)) : nullptr; } private: std::size_t _which; typename std::aligned_union<0, Ts...>::type _storage; }; 

Special member functions must be redirected to the active member (if any). Again, we cannot simply use the switch to achieve this goal — although if we could do this, it would hardly have been much simpler — and the direct replacement would have a recursive implementation:
 struct _destructor { template <typename T> static void call(void* ptr) { static_cast<T*>(ptr)->~T(); } }; variant<Ts...>::~variant() { apply<_destructor, Ts...>(_which, &_storage); } template <typename F> void apply(std::size_t /*which*/, void* /*storage*/) {} template <typename F, typename T, typename ...Ts> void apply(std::size_t which, void* storage) { if (which == 0) { F::template call<T>(storage); } else { apply<F, Ts...>(which - 1, storage); } } 

A non-recursive implementation of the apply method can be constructed using a jump table, just as a switch does, with a subsequent transition to the appropriate entry:
 template <typename F, typename ...Ts> void apply(std::size_t which, void* storage) { using fun_ptr = void(*)(void*); static constexpr fun_ptr table[] = {&F::template call<Ts>...}; if (which < sizeof...(Ts)) { table[which](storage); } } 

A non-recursive implementation not only gives a faster generated code, but also significantly speeds up compilation from the case of a large number of types in the Ts list - in your case, the estimate may change.

Trivially Copied Types


A trivially copied type is one that can be copied by copying its constituent bits — that is, using std::memcpy .
3.9 [basic.types] / 2 For any object (except subobjects of the base class) of the trivially copied type T , whether or not they contain valid values ​​of type T , its constituent bytes (1.7) can be copied to the char array or unsigned char . If the contents of the char array or unsigned char copied back to an object, the object should receive its original value as a result. [...]
3.9 [basic.types] / 3 For any trivially copied type T , if two pointers to T point to different objects obj1 and obj2 type T , where either obj1 or obj2 are subobjects of the base class and the components obj1 bytes (1.7) are copied into obj2 , as a result, obj2 must contain the same value as obj1 . [...]

Original
3.9 basic basic basic 2 2 2 2 2 2 2 2 2 any any be copied into an array of char or unsigned char . It is a charcoal pattern that holds its original value. [...]
3.9 [basic.types] / 3, for any trivially copyable type T , obj1 and obj2 , what’s not a norm of the obj2 a ob (1.7) making up obj1 are copied into obj2 , obj2 . [...]


Combining trivially copied members is itself trivially replicable, which puts it into candidates for additional optimization, which cannot be done for a non-trivially replicable type. It follows that a variant with a trivially copied type must also strive to be trivially replicable.
9 [class] / 6 A trivially copied class is a class that:
  • does not have non-trivial copy constructors (12.8),
  • does not have non-trivial displacement constructors (12.8),
  • does not have non-trivial copy assignment operators (13.5.3, 12.8),
  • does not have nontrivial relocation assignment operators (13.5.3, 12.8) and
  • has a trivial destructor (12.4).

[...]

Original
9 [class] / 6 A trivially copyable class is a class that:
  • has no non-trivial copy constructors (12.8),
  • has no non-trivial move constructors (12.8),
  • has no non-trivial copy assignment operators (13.5.3, 12.8),
  • has no non-trivial move assignment operators (13.5.3, 12.8), and
  • has a trivial destructor (12.4).

[...]


To achieve this goal, a separate variant specialization must be chosen if all types in Ts are trivially replicable. The special member functions listed above should not be provided by the user for this specialization — they must either be provided implicitly or explicitly specified when they are first defined as generated by default. The implementation will provide implicit definitions for them, which will be trivial; copy and move operators will simply copy the storage bits containing them together with the discriminator, but the destructor will not do anything.
But there is one catch: a trivially copied class may be generally non-replicable, although a special member function that was removed when it was first defined is trivial. Let's look, for example, at how the class boost::noncopyable defined:
 class noncopyable { protected: constexpr noncopyable() = default; noncopyable(noncopyable const&) = delete; noncopyable& operator=(noncopyable const&) = delete; ~noncopyable() = default; }; 

It may come as a surprise that std::is_trivially_copyable prints true for the noncopyable class. Another big surprise is that the variant<noncopyable> instance variant<noncopyable> can be successfully copied, since the remote noncopyable member functions are not used at all. This is, in effect, a type security breach arising from the decision to use an untyped bare repository to store the active member.

Trivially destructible types


Another important category of types are trivially destructible types.
12.4 [class.dtor] / 5 [...] A destructor is trivial if it is not provided by the user and if:
  • it is not virtual,
  • all immediate base classes of this class have trivial destructors and
  • for all non-static members of a given class, having a class (or array of classes) of its type, each such class has a trivial destructor.

Otherwise, the destructor is nontrivial.

Original
12.4 [class.dtor] / 5 [...] A destructor is trivial if it is not user-provided and if:
  • the destructor is not virtual,
  • trivial destructors, and
  • There are a number of types of data for each of them.

Otherwise, the destructor is non-trivial.


This is especially important because one of the requirements for a literal type — whose instances can be used in the context of a constexpr expression — is its trivial destructibility.
3.9 [basic.types] / 10 A type is a literal type if it is:
  • [...]
  • ( 9), :
    • ,
    • (8.5.1) constexpr - ,
    • .



Original
3.9 [basic.types] / 10 A type is a literal type if it is:
  • [...]
  • a class type (Clause 9) that has all of the following properties:
    • it has a trivial destructor,
    • it is an aggregate type (8.5.1) or a constructor
    • all of its non-volatile literal types.




A union may be a literal type, provided that at least one of its members is a literal type, and the remaining members are trivially destructible. It follows that variantunder these conditions it must also strive to be a literal type. Another specialization variantshould be chosen if all types in Tsare trivially destructible. However, it is not so useful, since among the restrictions on constant expressions there is a restriction on casting a voidpointer to an object pointer:
5.19 [expr.const] / 2 The conditional expression eis the core of a constant expression only if the calculation efollowing the rules of the abstract machine (1.9) is not calculated in one of the following expressions:
  • [...]
  • conversion from type cv void*to type of pointer to object;
  • [...]


Original
5.19 [expr.const]/2 A conditional-expression e is a core constant expression unless the evaluation of e , following the rules of the abstract machine (1.9), would evaluate one of the following expressions:
  • [...]
  • a conversion from type cv void* to a pointer-to-object type;
  • [...]



And again, the decision to use an untyped raw storage limits the fullness of the implementation of a generalized tagged union to the same level that can be achieved by implementing it manually.

What is not yet said


The implementation on the basis of raw storage - although it justifies its price - does not hold water, it is worth going beyond the usual framework. The storage of values ​​in a generic type-safe tagged union must always be a regular union, otherwise it will not be possible to use all its functionality.

Source: https://habr.com/ru/post/244573/


All Articles