Reflection in C++14

This article is a transcript (with minor corrections) of the talk "A bit of magic for C++14" by Anton Polukhin (antoshkka).

I've been tinkering with C++ recently and stumbled on a couple of new metaprogramming techniques that make reflection possible in C++14. Let's start with a couple of motivating examples. Suppose you have a POD structure with some fields in it:

    struct complicated_struct {
        int i;
        short s;
        double d;
        unsigned u;
    };

The number of fields and their names do not matter. What matters is that, given this structure, we can write the following piece of code:

    #include <iostream>
    #include "magic_get.hpp"

    struct complicated_struct { /* … */ };

    int main() {
        using namespace pod_ops;
        complicated_struct s {1, 2, 3.0, 4};
        std::cout << "s == " << s << std::endl; // Compile time error?
    }

In main we create a variable of our structure type, initialize it via aggregate initialization, and then try to output it to std::cout. At this point we should, in theory, get a compilation error: we have not defined a stream output operator for our structure, so the compiler does not know how to print it. However, the code compiles and prints the contents of the structure:

    antoshkka@home:~$ ./test
    s == {1, 2, 3.0, 4}

We can go back to the code and change the field names, the structure name, the variable name, anything we like: the code will keep working and print the contents of the structure correctly. Let's see how it works.

The operator is defined in the header file magic_get.hpp and works with any data type:

    template <class Char, class Traits, class T>
    std::basic_ostream<Char, Traits>& operator<<(std::basic_ostream<Char, Traits>& out, const T& value) {
        flat_write(out, value);
        return out;
    }

This operator calls the flat_write function. flat_write prints the braces and has a single line in the middle:

    template <class Char, class Traits, class T>
    void flat_write(std::basic_ostream<Char, Traits>& out, const T& val) {
        out << '{';
        detail::flat_print_impl<0, flat_tuple_size<T>::value >::print(out, val);
        out << '}';
    }

In the middle of that line there is flat_tuple_size<T>::value. Note that the standard library has std::tuple_size, which gives the number of elements in a tuple. Here, however, T is not a std::tuple but a user-defined type, and flat_tuple_size gives the number of fields in that type.

Let's look further at what the print function does:

    template <std::size_t FieldIndex, std::size_t FieldsCount>
    struct flat_print_impl {
        template <class Stream, class T>
        static void print (Stream& out, const T& value) {
            if (!!FieldIndex) out << ", ";
            out << flat_get<FieldIndex>(value); // std::get<FieldIndex>(value)
            flat_print_impl<FieldIndex + 1, FieldsCount>::print(out, value);
        }
    };

The print function writes a comma or not, depending on the index of the field we are working with. Then comes the call to flat_get, with a comment saying that it works like std::get, that is, it returns a field of the structure by index. A natural question arises: how does this work?

It works as follows: the stream output operator determines the number of fields in your structure, iterates over the fields by index, and outputs each field by its index. That produces the output you saw at the beginning of the article.
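The index-driven recursion can be tried out on its own. The following is a minimal sketch, assuming a std::tuple as a stand-in for the user structure (so std::get plays the role of flat_get). The names print_impl, to_braced_string and check_print are hypothetical, and the terminating specialization is an addition that the listing above does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <tuple>

// Prints element FieldIndex, then recurses to FieldIndex + 1.
template <std::size_t FieldIndex, std::size_t FieldsCount>
struct print_impl {
    template <class Stream, class Tuple>
    static void print(Stream& out, const Tuple& value) {
        if (FieldIndex != 0) out << ", ";        // no comma before the first field
        out << std::get<FieldIndex>(value);      // stand-in for flat_get<FieldIndex>
        print_impl<FieldIndex + 1, FieldsCount>::print(out, value);
    }
};

// Terminating case: the index has reached the field count, nothing to print.
template <std::size_t FieldsCount>
struct print_impl<FieldsCount, FieldsCount> {
    template <class Stream, class Tuple>
    static void print(Stream&, const Tuple&) {}
};

// Hypothetical helper: wraps the fields in braces, like flat_write does.
template <class... Ts>
std::string to_braced_string(const std::tuple<Ts...>& value) {
    std::ostringstream out;
    out << '{';
    print_impl<0, sizeof...(Ts)>::print(out, value);
    out << '}';
    return out.str();
}

// Small self-check of the sketch.
inline void check_print() {
    assert(to_braced_string(std::make_tuple(1, 2, 4u)) == "{1, 2, 4}");
}
```

Without the terminating specialization the recursion would try to read a field past the end and fail to compile.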

Now let's figure out how to write flat_get and flat_tuple_size, which work with user-defined structures and let us determine the number of fields in a structure and output it field by field:

    /// Returns const reference to a field with index `I`
    /// Example usage: flat_get<0>(my_structure());
    template <std::size_t I, class T>
    decltype(auto) flat_get(const T& val) noexcept;

    /// `flat_tuple_size` has a member `value` that contains fields count
    /// Example usage: std::array<int, flat_tuple_size<my_structure>::value > a;
    template <class T>
    using flat_tuple_size = /* ... */;

Let's start with the simple part: counting the fields in a structure. We have a POD structure T:

    static_assert(std::is_pod<T>::value, "");

For this structure we can write the expression:

    T { args... }

This is aggregate initialization of the structure. The expression compiles if the number of arguments is less than or equal to the number of fields in the structure and the type of each argument matches the type of the corresponding field.

From this observation we will try to extract the number of fields in T. How? We take our structure T and try to initialize it with some huge number of arguments. This does not compile. We drop one of the arguments and try again. This does not compile either, but eventually we reach a number of arguments equal to the number of fields in the structure, and then it compiles. At that point we just need to remember the number of arguments, and we are done: we have the number of fields in the structure. That is the basic idea. Let's go into the details.

How many arguments do we need to start with? If T contains only char, unsigned char and other 1-byte types, the number of fields in T equals the size of the structure. If there are other fields, such as int or a pointer, the number of fields is less than the size of the structure.

So we have the number of arguments to start aggregate initialization with: we initialize T with sizeof(T) arguments. If that fails to compile, we discard one argument and try again; if it compiles, we have found the number of fields in the structure. One problem remains: even if we guess the number of arguments correctly, the code will still fail to compile, because we also need to know the type of each field.

Let's use a workaround: a structure with an implicit conversion operator to any type:

    struct ubiq {
        template <class Type>
        constexpr operator Type&() const;
    };

    int i = ubiq{};
    double d = ubiq{};
    char c = ubiq{};

This means that instances of this structure convert to any type: int, double, std::string, std::vector, any user-defined type, anything at all.

The complete recipe: we take structure T and try to aggregate-initialize it with sizeof(T) arguments, each of which is an instance of our ubiq structure. During aggregate initialization each ubiq instance converts to the type of the corresponding field of T, so all we have to do is find the right number of arguments. If the code does not compile with many arguments, we drop one and try again. Once it compiles, we count the arguments and have the result.
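The "does it compile with N arguments" question can be asked directly in code. Here is a minimal sketch, assuming a hand-written detection trait: is_brace_constructible, brace_constructible_impl, make_void and two_fields are hypothetical names introduced for illustration, not part of the library. The conversion operator is only declared, which is enough because it is used solely inside decltype.

```cpp
#include <type_traits>
#include <utility>

// Convertible to anything; never called at run time, so no definition is needed.
struct ubiq {
    template <class Type>
    constexpr operator Type&() const noexcept;
};

template <class... Ts> struct make_void { using type = void; };

// Primary template: T{Args...} is not a valid expression.
template <class Void, class T, class... Args>
struct brace_constructible_impl : std::false_type {};

// Selected only when T{Args...} is well-formed (SFINAE on the decltype).
template <class T, class... Args>
struct brace_constructible_impl<
    typename make_void<decltype(T{std::declval<Args>()...})>::type, T, Args...>
    : std::true_type {};

template <class T, class... Args>
using is_brace_constructible = brace_constructible_impl<void, T, Args...>;

struct two_fields { int i; double d; };

// Fewer arguments than fields is fine, more is a substitution failure.
static_assert(is_brace_constructible<two_fields, ubiq>::value, "1 argument");
static_assert(is_brace_constructible<two_fields, ubiq, ubiq>::value, "2 arguments");
static_assert(!is_brace_constructible<two_fields, ubiq, ubiq, ubiq>::value, "3 arguments");
```

The largest argument count for which the trait is true is the number of fields, which is exactly what the search described above looks for.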

Now for some code. We change ubiq slightly: we add a template parameter to make the structure easier to use with variadic templates. We also need std::make_index_sequence (a C++14 facility that expands to std::index_sequence, a long sequence of indices).

Ready to see the scary code? Here we go.

Only two functions:

    // #1
    template <class T, std::size_t I0, std::size_t... I>
    constexpr auto detect_fields_count(std::size_t& out, std::index_sequence<I0, I...>)
        -> decltype( T{ ubiq_constructor<I0>{}, ubiq_constructor<I>{}... } )
    {
        out = sizeof...(I) + 1;
        /*...*/
    }

    // #2
    template <class T, std::size_t... I>
    constexpr void detect_fields_count(std::size_t& out, std::index_sequence<I...>) {
        detect_fields_count<T>(out, std::make_index_sequence<sizeof...(I) - 1>{});
    }

Both functions are named detect_fields_count. The first function is a bit more specialized, so when the compiler sees a call to detect_fields_count<T>, it considers the first function the better match and tries to use it first.

The first function has a trailing return type: its return type is a decltype of T with aggregate initialization. If we have guessed the number of arguments, this expression compiles, we enter the body of the function and write the number of arguments to the output variable out. If not (we did not guess the number of arguments), the compiler treats this as a substitution failure rather than an error and looks for another function with the same name that is less specialized. It picks function #2. Function #2 discards one of the indices (that is, reduces the number of arguments by one) and calls detect_fields_count again. Again, either the first or the second function is called. In this way we step through the argument counts and find the number of fields in the structure. That was the easy part.
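Put together, the two overloads can be compiled as a complete sketch. The version below makes a few assumptions that are not in the listing above: ubiq_constructor is given a declaration-only conversion operator, overload #1 returns void (via the comma operator inside decltype) so that the elided part of its body is not needed, and a hypothetical wrapper fields_count starts the search at sizeof(T). It does not handle empty structures or array fields.

```cpp
#include <cstddef>
#include <utility>

// Indexed variant of ubiq; only used inside decltype, so no definition needed.
template <std::size_t I>
struct ubiq_constructor {
    template <class Type>
    constexpr operator Type&() const noexcept;
};

// #1: viable only if T can be brace-initialized with this many arguments.
template <class T, std::size_t I0, std::size_t... I>
constexpr auto detect_fields_count(std::size_t& out, std::index_sequence<I0, I...>)
    -> decltype(T{ ubiq_constructor<I0>{}, ubiq_constructor<I>{}... }, void())
{
    out = sizeof...(I) + 1;
}

// #2: fallback, retries with one argument fewer.
template <class T, std::size_t... I>
constexpr void detect_fields_count(std::size_t& out, std::index_sequence<I...>) {
    detect_fields_count<T>(out, std::make_index_sequence<sizeof...(I) - 1>{});
}

// Hypothetical wrapper: start from sizeof(T) arguments and search downwards.
template <class T>
constexpr std::size_t fields_count() {
    std::size_t count = 0;
    detect_fields_count<T>(count, std::make_index_sequence<sizeof(T)>{});
    return count;
}

struct complicated_struct { int i; short s; double d; unsigned u; };
struct two_chars { char x; char y; };

static_assert(fields_count<complicated_struct>() == 4, "four fields expected");
static_assert(fields_count<two_chars>() == 2, "two fields expected");
```

For two_chars the very first attempt succeeds, because the structure consists of 1-byte fields and sizeof(T) equals the field count.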

The hard part is ahead: how do we get the types of the fields inside structure T?

We already have our expression with aggregate initialization of T, and we pass ubiq instances into it. For each ubiq instance the implicit conversion operator is invoked, and inside that operator we know the type of the field. All we need now is to somehow capture this information and pull it into the outer scope, beyond the aggregate initialization of T, where we can work with it. Unfortunately, C++ has no mechanism for storing a type in a variable. More precisely, there are std::type_index and std::type_info, but they are useless at compile time: we cannot get the type back out of them.

Let's try to get around this limitation. To do that, recall what a POD is (very roughly: the standardization committee likes to change the definition every three years).

A POD structure is a structure whose fields are marked public, private or protected (we are only interested in public fields), and all of whose fields are either other POD structures or fundamental types: pointers, int, std::nullptr_t. If we forget about pointers for a couple of minutes, it turns out that there are quite few fundamental types, fewer than 32, which means we can assign an identifier (an integer) to each fundamental type. We can write this number to an output array, carry that array out of the implicit conversion operator, and then convert the number back to a type. That is the whole idea.

On to the implementation. For this we change our ubiq structure:

    template <std::size_t I>
    struct ubiq_val {
        std::size_t* ref_;

        template <class Type>
        constexpr operator Type() const noexcept {
            ref_[I] = typeid_conversions::type_to_id(identity<Type>{});
            return Type{};
        }
    };

It now holds a pointer to the output array. The output array has the terrible name ref_, but so be it. The implicit conversion operator has changed too: it now calls the type_to_id function, which converts the type to an identifier, and we write that identifier to the output array ref_. It remains to generate a bunch of type_to_id functions. We will do this with a macro:

    #define BOOST_MAGIC_GET_REGISTER_TYPE(Type, Index)                  \
        constexpr std::size_t type_to_id(identity<Type>) noexcept {     \
            return Index;                                               \
        }                                                               \
        constexpr Type id_to_type( size_t_<Index > ) noexcept {         \
            Type res{};                                                 \
            return res;                                                 \
        }                                                               \
        /**/

The macro generates a type_to_id function, which turns a type into an identifier, and an id_to_type function, which turns the identifier back into a type. The macro is not visible to the user: as soon as we have used it, we undefine it. We register the fundamental types (not all of them are listed here):

    BOOST_MAGIC_GET_REGISTER_TYPE(unsigned char      , 1)
    BOOST_MAGIC_GET_REGISTER_TYPE(unsigned short     , 2)
    BOOST_MAGIC_GET_REGISTER_TYPE(unsigned int       , 3)
    BOOST_MAGIC_GET_REGISTER_TYPE(unsigned long      , 4)
    BOOST_MAGIC_GET_REGISTER_TYPE(unsigned long long , 5)
    BOOST_MAGIC_GET_REGISTER_TYPE(signed char        , 6)
    BOOST_MAGIC_GET_REGISTER_TYPE(short              , 7)
    BOOST_MAGIC_GET_REGISTER_TYPE(int                , 8)
    BOOST_MAGIC_GET_REGISTER_TYPE(long               , 9)
    BOOST_MAGIC_GET_REGISTER_TYPE(long long          , 10)
    ...
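The round trip from type to identifier and back can be checked at compile time. This is a minimal sketch, assuming simple definitions of identity and size_t_ (the article only uses them) and a shortened, hypothetical macro name REGISTER_TYPE. Only three of the types are registered.

```cpp
#include <cstddef>
#include <type_traits>

// Assumed helpers: an empty tag carrying a type, and an integral-constant alias.
template <class T> struct identity {};
template <std::size_t I> using size_t_ = std::integral_constant<std::size_t, I>;

#define REGISTER_TYPE(Type, Index)                                  \
    constexpr std::size_t type_to_id(identity<Type>) noexcept {     \
        return Index;                                               \
    }                                                               \
    constexpr Type id_to_type(size_t_<Index>) noexcept {            \
        Type res{}; /* `Type{}` would not parse for `unsigned char` */ \
        return res;                                                 \
    }

REGISTER_TYPE(unsigned char, 1)
REGISTER_TYPE(short, 7)
REGISTER_TYPE(int, 8)

#undef REGISTER_TYPE  // hidden from users once the overloads exist

// Type -> identifier.
static_assert(type_to_id(identity<int>{}) == 8, "int is registered as 8");

// Identifier -> type: only the return type of id_to_type matters.
static_assert(std::is_same<decltype(id_to_type(size_t_<7>{})), short>::value,
              "7 maps back to short");
```

Overload resolution does the lookup in both directions: the tag type selects the identifier, and the integral constant selects the type.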

We do not use zero; I will explain why later. With all the fundamental types registered, we now write a function that turns type T into an array of identifiers of the fields inside T. The most interesting part is the body of this function:

    template <class T, std::size_t N, std::size_t... I>
    constexpr auto type_to_array_of_type_ids(std::size_t* types) noexcept
        -> decltype(T{ ubiq_constructor<I>{}... })
    {
        T tmp{ ubiq_val< I >{types}... };
        return tmp;
    }

Here a temporary variable is aggregate-initialized, and we pass ubiq instances to it. This time they hold a pointer to the output array: types is the output array into which we write the identifiers of the field types. After this line (after the temporary variable is initialized), the output array types holds the type identifier of each field. The type_to_array_of_type_ids function is constexpr, so all of this can be used at compile time. Beautiful! What remains is to turn the identifiers back into types. This is done like this:
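Recording the identifiers during aggregate initialization can be reproduced in a small self-contained sketch. It assumes three hand-registered types, a hypothetical id_array wrapper in place of the library's constexpr array, and a hypothetical helper field_ids that takes the field count as an index sequence instead of detecting it.

```cpp
#include <cstddef>
#include <utility>

template <class T> struct identity {};

// A few registrations, using the identifiers from the list above.
constexpr std::size_t type_to_id(identity<unsigned char>) noexcept { return 1; }
constexpr std::size_t type_to_id(identity<short>) noexcept { return 7; }
constexpr std::size_t type_to_id(identity<int>) noexcept { return 8; }

// Converts to any registered type and records that type's id at position I.
template <std::size_t I>
struct ubiq_val {
    std::size_t* ref_;

    template <class Type>
    constexpr operator Type() const noexcept {
        ref_[I] = type_to_id(identity<Type>{});
        return Type{};
    }
};

// Hypothetical constexpr-friendly array wrapper.
template <std::size_t N>
struct id_array { std::size_t data[N]; };

// Aggregate-initializes a temporary T; each conversion writes one id.
template <class T, std::size_t... I>
constexpr id_array<sizeof...(I)> field_ids(std::index_sequence<I...>) noexcept {
    id_array<sizeof...(I)> ids{};
    T tmp{ ubiq_val<I>{ids.data}... };
    (void)tmp;  // only the side effect on `ids` is of interest
    return ids;
}

struct sample { int i; short s; unsigned char c; };

constexpr auto sample_ids = field_ids<sample>(std::make_index_sequence<3>{});
static_assert(sample_ids.data[0] == 8, "first field is int");
static_assert(sample_ids.data[1] == 7, "second field is short");
static_assert(sample_ids.data[2] == 1, "third field is unsigned char");
```

The writes through ref_ are allowed in a constant expression because the array they modify is created inside the same constexpr evaluation.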

    template <class T, std::size_t... I>
    constexpr auto as_tuple_impl(std::index_sequence<I...>) noexcept {
        constexpr auto a = array_of_type_ids<T>();          // #0
        return std::tuple<                                  // #3
            decltype(typeid_conversions::id_to_type(        // #2
                size_t_<a[I]>{}                             // #1
            ))...
        >{};
    }

Line #0: here we get the array of identifiers. The type of the variable a is something similar to std::array, but heavily modified so that it can be used in constexpr expressions (because we are on C++14, not C++17, where most of the constexpr problems of std::array are fixed).

In line #1 we create an integral constant from an element of the array. size_t_ is an alias (a using declaration) for std::integral_constant whose first parameter is std::size_t and whose second parameter is our a[I]. In line #2 we convert the identifier back to a type, and in line #3 we create a std::tuple whose elements correspond exactly to the types of the fields of the structure T that we looked inside. Now we can do something very crude, for example reinterpret_cast the user structure to the tuple, and then work with the user structure as with a tuple. Well, yes, reinterpret_cast.
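The step from an array of identifiers to a tuple type can be isolated as follows. This is a minimal sketch, assuming a hard-coded identifier array {8, 7} in place of array_of_type_ids<T>() and two hand-written id_to_type overloads; ids_holder and ids_to_tuple are hypothetical names.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

template <std::size_t I> using size_t_ = std::integral_constant<std::size_t, I>;

// Identifier -> type, for the two ids used below.
constexpr short id_to_type(size_t_<7>) noexcept { return {}; }
constexpr int id_to_type(size_t_<8>) noexcept { return {}; }

// Hypothetical constexpr-friendly array with a constexpr operator[].
struct ids_holder {
    std::size_t data[2];
    constexpr std::size_t operator[](std::size_t i) const noexcept { return data[i]; }
};

template <std::size_t... I>
constexpr auto ids_to_tuple(std::index_sequence<I...>) noexcept {
    constexpr ids_holder a{{8, 7}};            // stand-in for array_of_type_ids<T>()
    return std::tuple<
        decltype(id_to_type(size_t_<a[I]>{}))...   // id -> integral constant -> type
    >{};
}

using recovered = decltype(ids_to_tuple(std::make_index_sequence<2>{}));
static_assert(std::is_same<recovered, std::tuple<int, short>>::value,
              "ids {8, 7} map back to int and short");
```

Each a[I] has to be a constant expression because it is used as a template argument, which is why the array is declared constexpr inside the function.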

A warning: do not try to copy and run this, because the code is somewhat simplified. For example, std::tuple does not specify the order in which its elements are constructed and destroyed: some implementations initialize the elements from last to first and store them in reverse order, which is why std::tuple does not work here. You need to write your own.

Let's go further. What do we do with pointers: pointers to const, pointers to int, and so on?

We have the type_to_id function. It returns std::size_t, and most of the bits of that std::size_t are unused: we only needed enough of them for 32 fundamental types. So the remaining bits can be used to encode information about pointers. For example, if the user structure has a field of type unsigned char, its identifier looks like this in binary:

unsigned char c0; // 0b00000000 00000000 00000000 000 00001

The least significant bits hold the identifier of unsigned char. It is one, as we assigned it in the macro. If we have a pointer to unsigned char, the most significant bits now record that this is a pointer:

unsigned char* c1; // 0b 001 00000 00000000 00000000 000 00001

If we have a pointer to const, the most significant bits record that this is a pointer to const:

const unsigned char* c2; // 0b 010 00000 00000000 00000000 000 00001

If we add another level of indirection (another pointer), the next most significant bits change to record that we have one more pointer:

const unsigned char** c3; // 0b 010001 00 00000000 00000000 000 00001

If we change the underlying type, the most significant bits stay the same, and the least significant bits now hold the identifier seven, which means we are working with short:

const short** s0; // 0b 010001 00 00000000 00000000 000 00111

We add functions that convert a type to an identifier (and set these bits accordingly):

    template<class Type> constexpr std::size_t type_to_id(identity<Type*>);
    template<class Type> constexpr std::size_t type_to_id(identity<const Type*>);
    template<class Type> constexpr std::size_t type_to_id(identity<const volatile Type*>);
    template<class Type> constexpr std::size_t type_to_id(identity<volatile Type*>);

And we add inverse functions that convert the identifier back to the type:

    template<std::size_t Index>
    constexpr auto id_to_type(size_t_<Index>, if_extension<Index, native_const_ptr_type> = 0) noexcept;

    template<std::size_t Index>
    constexpr auto id_to_type(size_t_<Index>, if_extension<Index, native_ptr_type> = 0) noexcept;

    template<std::size_t Index>
    constexpr auto id_to_type(size_t_<Index>, if_extension<Index, native_const_volatile_ptr_type> = 0) noexcept;

    template<std::size_t Index>
    constexpr auto id_to_type(size_t_<Index>, if_extension<Index, native_volatile_ptr_type> = 0) noexcept;

Here if_extension is std::enable_if with aliases and a lot of magic. The magic is that, depending on the identifier, it enables exactly one of the functions shown.
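The bit patterns shown for c0 to c3 and s0 can be reproduced with a short sketch of the encoding direction. It rests on assumptions read off those examples rather than taken from the library: a 32-bit identifier, 3-bit tags written from the top bits downwards (001 for a pointer, 010 for a pointer to const), and the low 5 bits reserved for the fundamental type. add_level, ptr_tag and const_ptr_tag are hypothetical names, and volatile is left out.

```cpp
#include <cstdint>

template <class T> struct identity {};

constexpr std::uint32_t type_to_id(identity<unsigned char>) noexcept { return 1; }
constexpr std::uint32_t type_to_id(identity<short>) noexcept { return 7; }

constexpr std::uint32_t ptr_tag = 1;        // 0b001: pointer
constexpr std::uint32_t const_ptr_tag = 2;  // 0b010: pointer to const

// Writes `tag` into the highest 3-bit slot that is still empty.
constexpr std::uint32_t add_level(std::uint32_t id, std::uint32_t tag) noexcept {
    for (unsigned shift = 29; shift >= 5; shift -= 3) {
        if (((id >> shift) & 7u) == 0) return id | (tag << shift);
    }
    return 0;  // no free slot left; zero is never a valid identifier
}

// Declared up front so that each overload can call the other recursively.
template <class Type> constexpr std::uint32_t type_to_id(identity<Type*>) noexcept;
template <class Type> constexpr std::uint32_t type_to_id(identity<const Type*>) noexcept;

template <class Type>
constexpr std::uint32_t type_to_id(identity<Type*>) noexcept {
    return add_level(type_to_id(identity<Type>{}), ptr_tag);
}

template <class Type>
constexpr std::uint32_t type_to_id(identity<const Type*>) noexcept {
    return add_level(type_to_id(identity<Type>{}), const_ptr_tag);
}

static_assert(type_to_id(identity<unsigned char*>{}) == 0x20000001u, "c1");
static_assert(type_to_id(identity<const unsigned char*>{}) == 0x40000001u, "c2");
static_assert(type_to_id(identity<const unsigned char**>{}) == 0x44000001u, "c3");
static_assert(type_to_id(identity<const short**>{}) == 0x44000007u, "s0");
```

Decoding would walk the same 3-bit slots from the top and rebuild the pointer type one level at a time, which is the job of the id_to_type overloads guarded by if_extension.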

I do not know what to do with enums. The only thing I could come up with is to use std::underlying_type. That is, we lose the information about which enum this is: we cannot register every user-defined enum in our list of fundamental types, that is simply impossible. Instead, we encode only how the enum is stored. If it is stored as int, we record it as int; if the user declared enum class : char, we get char and encode only char. The information about the enum type is lost.
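The mapping from an enum to its storage type is easy to show in isolation. The following is a minimal sketch using std::underlying_type; storage_type, color and level are hypothetical names used only for this illustration.

```cpp
#include <type_traits>

// Non-enum types are kept as they are.
template <class T, bool IsEnum = std::is_enum<T>::value>
struct storage_type {
    using type = T;
};

// Enums are replaced by the integer type they are stored as.
template <class T>
struct storage_type<T, true> {
    using type = typename std::underlying_type<T>::type;
};

enum class color : char { red, green };
enum class level { low, high };  // scoped enums default to int

static_assert(std::is_same<storage_type<color>::type, char>::value, "stored as char");
static_assert(std::is_same<storage_type<level>::type, int>::value, "stored as int");
static_assert(std::is_same<storage_type<double>::type, double>::value, "unchanged");
```

After this step color and a plain char field are indistinguishable, which is the loss of information described above.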

Nested structures and classes pose the same problem: we cannot register them all in our list of fundamental types. So we simply look inside the nested class once more and encode all of its fields as if they belonged to our top-level class.

Suppose we have a structure a with a field whose type is structure b. We look inside b and pull all of b's fields into a. I am simplifying: there is also a lot of logic for alignment so that it does not break.
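The flattening itself can be illustrated on types alone. This is only an analogy, assuming nested std::tuple types as stand-ins for nested structures and ignoring alignment entirely; flatten is a hypothetical name and not how the library stores its identifiers.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// A non-tuple type flattens to a one-element tuple.
template <class T>
struct flatten {
    using type = std::tuple<T>;
};

// A tuple flattens to the concatenation of its flattened elements.
template <class... Ts>
struct flatten<std::tuple<Ts...>> {
    using type = decltype(std::tuple_cat(
        std::declval<typename flatten<Ts>::type>()...));
};

// "Structure b" with two fields, nested inside "structure a".
using b_fields = std::tuple<short, char>;
using a_fields = std::tuple<int, b_fields, double>;

static_assert(std::is_same<flatten<a_fields>::type,
                           std::tuple<int, short, char, double>>::value,
              "fields of b are pulled up into a");
```

The result is the "flat" view: a single list of fundamental field types with no trace of the nesting.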

This is done by adding one more type_to_id function:

    template <class Type>
    constexpr auto type_to_id(identity<Type>,
        typename std::enable_if<
            !std::is_enum<Type>::value && !std::is_empty<Type>::value
        >::type*) noexcept
    {
        return array_of_type_ids<Type>(); // Returns array!
    }

This one can return an array (all the previous ones returned std::size_t). We need to change our ubiq structure so that it can work with arrays, and add logic for computing offsets, for where to write them, and for tracking which substructure we are working with. This is all long and not very interesting; there are a couple of examples of how it turns out, but those are technical details too.

What does this give us, and where can it be used? You do not need to write all this yourself: there is a ready-made library that implements it. Here is what the library gives you.

First, comparisons: you no longer need to write comparisons for POD structures by hand. There are three ways to get them. For instance, you can write nothing at all: just include one header file, and every POD structure gets comparisons out of the box.

There are heterogeneous comparisons: two structures with the same fields but different data types can be compared with each other.

There is a universal hash function: you pass it any user structure, and it computes a hash for it.

I/O operators: what we saw in the introduction also exists and works.

When I first talked about this metaprogramming magic, developers working on some hardware (I do not remember exactly which) were very happy. They said they had 1000 different flat structures representing different protocols, that is, each structure maps one-to-one onto a protocol. For each of these structures they had three serializers (depending on which hardware and which wires are used downstream), so 3000 serializers in total. They were very unhappy about that. With this library they were able to reduce 3000 serializers to 3. They were extremely happy.

These metaprogramming tricks make basic reflection possible: you can write new type traits, for example is_continuous_layout<T>, is_padded<T>, has_unique_object_representations<T> (as in C++17).

You can write a great punch_hole<T, Index> function (which is not in the library) that finds unused bits and bytes in a user structure and returns a reference to them, so that other code can use them.

: boost::spirit, , , boost::fusion boost::spirit. , boost::spirit : “- ! , ”. .


    namespace foo {
        struct comparable_struct {
            int i;
            short s;
            char data[50];
            bool bl;
            int a,b,c,d,e,f;
        };
    } // namespace foo

    std::set<foo::comparable_struct> s;

. , - . std::set. . std::tie , , , . . . : , std::set, . :

    std::set<foo::comparable_struct> s = { /* ... */ };
    std::ofstream ofs("dump.txt");
    for (auto& a: s)
        ofs << a << '\n';

ofstream .


    std::set<foo::comparable_struct> s;
    std::ifstream ifs("dump.txt");
    foo::comparable_struct cs;
    while (ifs >> cs) {
        char ignore = {};
        ifs >> ignore;
        s.insert(cs);
    }


, , . flat_tie: std::tuple.

    template <class T>
    auto flat_tie(T& val) noexcept;

    struct my_struct { int i; short s; };
    my_struct s;
    flat_tie(s) = std::tuple<int, short>{10, 11};

This writes 10 to my_struct::i and 11 to my_struct::s of the variable s.
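For comparison, the same assignment can be written by hand with std::tie, which is the behaviour the flat_tie call above mirrors. This is a minimal sketch, not the library's implementation: the fields are listed explicitly, and assign_from_tuple and check_assign are hypothetical names.

```cpp
#include <cassert>
#include <tuple>

struct my_struct { int i; short s; };

// Builds a tuple of references to the fields and assigns through it.
inline my_struct assign_from_tuple() {
    my_struct s{};
    std::tie(s.i, s.s) = std::tuple<int, short>{10, 11};
    return s;
}

// Small self-check of the sketch.
inline void check_assign() {
    const my_struct s = assign_from_tuple();
    assert(s.i == 10);
    assert(s.s == 11);
}
```

The difference is that std::tie needs every field named at the call site, while flat_tie takes only the structure.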

, , -, C++14, .

reinterpret_cast. reinterpret_cast' : constexpr, .

. : . C++17. : C++17 structure binding. , , . , , enable_if structure binding. . tag dispatching:

    template <class T>
    constexpr auto as_tuple(T& val) noexcept {
        typedef size_t_<fields_count<T>()> fields_count_tag;
        return detail::as_tuple_impl(val, fields_count_tag{});
    }

as_tuple_impl:

    template <class T>
    constexpr auto as_tuple_impl(T&& val, size_t_<1>) noexcept {
        auto& [a] = std::forward<T>(val);
        return detail::make_tuple_of_references(a);
    }

    template <class T>
    constexpr auto as_tuple_impl(T&& val, size_t_<2>) noexcept {
        auto& [a,b] = std::forward<T>(val);
        return detail::make_tuple_of_references(a,b);
    }
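The tag-dispatched overloads can be exercised in a small C++17 sketch. It makes two substitutions that are assumptions, not the library's code: std::tie stands in for detail::make_tuple_of_references, and the field count is passed in by hand instead of being computed by fields_count<T>(). write_second_field is a hypothetical helper.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

template <std::size_t N> using size_t_ = std::integral_constant<std::size_t, N>;

// One overload per field count; the tag selects the matching binding.
template <class T>
constexpr auto as_tuple_impl(T& val, size_t_<1>) noexcept {
    auto& [a] = val;
    return std::tie(a);
}

template <class T>
constexpr auto as_tuple_impl(T& val, size_t_<2>) noexcept {
    auto& [a, b] = val;
    return std::tie(a, b);
}

struct my_struct { int i; short s; };

// Writes through the tuple of references and reads the field back.
constexpr int write_second_field() {
    my_struct v{10, 11};
    auto refs = as_tuple_impl(v, size_t_<2>{});
    std::get<1>(refs) = 42;  // modifies v.s through the reference
    return v.s;
}

static_assert(write_second_field() == 42, "the tuple refers to the original fields");
```

Because everything here is constexpr and no reinterpret_cast is involved, the whole round trip can be checked in a static_assert.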

as_tuple_impl T. , T , as_tuple_impl. structure binding, , , . , structure binding . , b, , . Beauty!

, constexpr, std::get — . . , structure binding : structure binding. , .

antoshkka — https://youtu.be/jDI5CHKFKd0
Precise and Flat Reflection (magic_get)

Source: https://habr.com/ru/post/344206/
