
Adding reflection for enums in C++

Recently, our project needed a way to obtain information about enumerations (enum) programmatically: for example, the names of the constants as strings, and a list of all the constants defined in an enum.

    enum CardSuit { Spades, Hearts, Diamonds, Clubs };

Usually the solution to this problem is based on the duplication of values, for example, inside a switch:

    switch (value) {
        case Spades:   return "Spades";
        case Hearts:   return "Hearts";
        case Diamonds: return "Diamonds";
        case Clubs:    return "Clubs";
        default:       return "";
    }

For small enumerations such a solution may well be acceptable, but if there are many values, and especially if they change from time to time, sooner or later a developer will forget to add or update the corresponding lines in the switch. There are other obvious drawbacks as well; for one, the very need to duplicate the values already bothers me.

Therefore, I tried to find an approach that requires no duplication at all and still fully solves the problem. I think I succeeded.

The rest of the article describes a way to organize reflection for enums. If you are interested, read on.

Why do you need it


There are many possible uses. One of them is serializing values, for example to JSON.
It can also be useful when C++ code interacts with scripting languages (for example, Lua).

Requirements


Since we want to avoid duplicating the constants in code, we need to capture information about all the values right where the enumeration is defined. As you may have guessed, this requires a macro. Given that, there are some additional requirements:

  1. The macro syntax for describing the enum must be compatible with a normal enum.
  2. The enumeration itself (as a type) must not differ from an ordinary enum (in particular, it must still be possible to use typedef on it later).
  3. When describing values, the same possibilities must be available as in an ordinary enum.

In other words, we should be able to wrap an existing enumeration in our macro with minimal effort and immediately gain programmatic access to information about it.

Full portability is also a requirement.

Result


First, a brief description of the result. Implementation details follow later in the article.

To add reflection, declare the enumeration with the Z_ENUM macro instead of the enum keyword. For example, for the CardSuit enum from the beginning of the article, it looks like this:

 Z_ENUM( CardSuit, Spades, Hearts, Diamonds, Clubs ) 

After that, a reference to the EnumReflector object that stores information about the enum can be obtained anywhere:

 auto& reflector = EnumReflector::For< CardSuit >(); 

Then everything is simple:

    reflector.EnumName();                 // == "CardSuit"
    reflector.Find("Diamonds").Value();   // == 2
    reflector.Count();                    // == 4
    reflector[1].Name();                  // == "Hearts"


The following example shows a more complex enumeration:

    class SomeClass
    {
    public:
        static const int Constant = 100;
        Z_ENUM( TasteFlags,
            None      = 0,
            Salted    = 1 << 0,
            Sour      = 1 << 1,
            Sweet     = 1 << 2,
            SourSweet = (Sour | Sweet),
            Other     = Constant,
            Last
        )
    };

This time we will get all the available information:

    auto& reflector = EnumReflector::For< SomeClass::TasteFlags >();

    cout << "Enum " << reflector.EnumName() << endl;
    for (auto& val : reflector)
    {
        cout << "Value " << val.Name() << " = " << val.Value() << endl;
    }

Output:

    Enum TasteFlags
    Value None = 0
    Value Salted = 1
    Value Sour = 2
    Value Sweet = 4
    Value SourSweet = 6
    Value Other = 100
    Value Last = 101

Special features


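  1. Do not put a comma after the last value in the enumeration.
  2. To declare an enumeration outside a class, use the Z_ENUM_NS macro instead of Z_ENUM.

The reasons for these two points are discussed in the next section.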

Implementation details


Now for the most interesting part.

Note: The code shown here is simplified to improve readability. The full version is on GitHub; the link is at the end of the article.

Macro Z_ENUM:

    #define Z_ENUM(enumName, ...)\
        enum enumName : int \
        { \
            __VA_ARGS__ \
        }; \
        friend const ::EnumReflector& _detail_reflector_(enumName) \
        { \
            static const ::EnumReflector reflector( []{ \
                static int sval; \
                sval = 0; \
                struct val_t \
                { \
                    val_t(const val_t& rhs) : _val(rhs) { sval = _val + 1; } \
                    val_t(int val) : _val(val) { sval = _val + 1; } \
                    val_t() : _val(sval){ sval = _val + 1; } \
                    \
                    val_t& operator=(const val_t&) { return *this; } \
                    val_t& operator=(int) { return *this; } \
                    operator int() const { return _val; } \
                    int _val; \
                } __VA_ARGS__; \
                const int vals[] = { __VA_ARGS__ }; \
                return ::EnumReflector( vals, sizeof(vals)/sizeof(int), \
                        #enumName, Z_ENUM_DETAIL_STR((__VA_ARGS__)) ); \
            }() ); \
            return reflector; \
        }

    #define Z_ENUM_DETAIL_STR(x) #x

An example of what it expands to:

    enum TasteFlags:int
    {
        None = 0, Salted = 1 << 0, Sour = 1 << 1, Sweet = 1 << 2,
        SourSweet = (Sour | Sweet), Other = Constant, Last
    };
    friend const ::EnumReflector& _detail_reflector_(TasteFlags)
    {
        static const ::EnumReflector reflector( []
        {
            static int sval;
            sval = 0;
            struct val_t
            {
                val_t(const val_t& rhs) : _val(rhs) { sval = _val + 1; }
                val_t(int val) : _val(val) { sval = _val + 1; }
                val_t() : _val(sval){ sval = _val + 1; }

                val_t& operator=(const val_t&) { return *this; }
                val_t& operator=(int) { return *this; }
                operator int() const { return _val; }
                int _val;
            } None = 0, Salted = 1 << 0, Sour = 1 << 1, Sweet = 1 << 2,
              SourSweet = (Sour | Sweet), Other = Constant, Last;
            const int vals[] = { None = 0, Salted = 1 << 0, Sour = 1 << 1, Sweet = 1 << 2,
                                 SourSweet = (Sour | Sweet), Other = Constant, Last };
            return ::EnumReflector( vals, sizeof(vals)/sizeof(int), "TasteFlags",
                "( None = 0, Salted = 1 << 0, Sour = 1 << 1, Sweet = 1 << 2, SourSweet = (Sour | Sweet), Other = Constant, Last)" );
        }());
        return reflector;
    }

Consider it in parts:

First, Z_ENUM expands into a regular enum. Note that the underlying type is specified explicitly as int. This is only because EnumReflector currently stores values as int. If necessary, int can be replaced by a larger type.

Next, a friend function _detail_reflector_ is declared. It takes a value of our enumeration type and returns a reference to an EnumReflector object, which is a static object declared inside the function.

Looking ahead a bit, here is the EnumReflector::For function, which serves as the external interface for obtaining the EnumReflector object:

    template<typename EnumType>
    inline const EnumReflector& EnumReflector::For(EnumType val)
    {
        return _detail_reflector_(val);
    }

The trick here is that ADL (argument-dependent lookup) is used to find the _detail_reflector_ function by the type of its argument. Thanks to ADL, we can get information about an enumeration regardless of which class or namespace it is declared in.
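
As a rough illustration of this lookup, here is a minimal sketch. MiniReflector, Deck, reflector_of and ReflectorFor are hypothetical stand-ins rather than the article's real classes, and the default argument on ReflectorFor is an assumption made so that the function can be called without passing a value.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for EnumReflector: it only carries the enum's name.
struct MiniReflector {
    std::string enumName;
};

struct Deck {
    enum Suit : int { Spades, Hearts };

    // A friend function defined inside the class is invisible to ordinary
    // lookup; it is found only through ADL on an argument of type Deck::Suit.
    friend const MiniReflector& reflector_of(Suit) {
        static const MiniReflector r{"Suit"};
        return r;
    }
};

// Generic entry point: the unqualified call lets ADL pick the overload that
// belongs to the class (or namespace) associated with EnumType.
// The default argument is an assumption of this sketch.
template <typename EnumType>
const MiniReflector& ReflectorFor(EnumType val = EnumType()) {
    return reflector_of(val);
}
```

With these definitions, both ReflectorFor<Deck::Suit>() and ReflectorFor(Deck::Hearts) resolve to the friend declared inside Deck, even though reflector_of is never declared at namespace scope.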

But back to the _detail_reflector_ function.

To ensure atomicity, the entire initialization of the static EnumReflector object happens inside an unnamed lambda. Let's look at it in more detail.

First, it declares a static counter variable, sval. It is static because we need to access it from the local class val_t defined below: without additional state, a local class can only access static variables of the enclosing block. The sval variable holds the value that the next constant will receive. On the next line we set it to 0.

Why?
At first glance this is pointless: a static variable is already zero-initialized, and this code runs only once. However, after some tests I noticed that compilers optimize this code much better if we explicitly reset the value before use. This is probably because the compiler then does not have to account for possible previous values of sval.

Next, the val_t type is defined. Right after the type definition, __VA_ARGS__ (the values of our enumeration) is expanded once more. That is, we define local variables of type val_t: their number matches the number of values in the enumeration, and their names match the constants themselves (they shadow the real enum constants defined earlier). For the initialization of these variables to work correctly, val_t has three constructors. Each of them also sets sval to the value following its own, in case the next constant has no explicitly specified value.

This is the place where a comma after the last value causes a syntax error.

After that, we need to transfer the values from the variables into an array of int. Thanks to val_t's conversion operator to int, this is easy: we can use our val_t variables directly as array initializers by expanding __VA_ARGS__ once more. Since the enumerator list may contain assignments, we add two assignment operators to val_t that do nothing, so the assignments are ignored entirely.
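
To make the shadowing trick more concrete, here is a small sketch with the two __VA_ARGS__ expansions written out by hand for the list A, B = 10, C, D = (B | 1), E. The function name ShadowedValues and the enumerator list are made up for illustration; val_t is copied from the macro above.

```cpp
#include <cassert>
#include <vector>

// Computes the values of the enumerator list "A, B = 10, C, D = (B | 1), E"
// the way Z_ENUM does, with both expansions written out by hand.
inline std::vector<int> ShadowedValues() {
    static int sval;   // value the next variable gets if it has no initializer
    sval = 0;
    struct val_t {
        val_t(const val_t& rhs) : _val(rhs) { sval = _val + 1; }
        val_t(int val) : _val(val) { sval = _val + 1; }
        val_t() : _val(sval) { sval = _val + 1; }
        val_t& operator=(const val_t&) { return *this; }   // deliberately a no-op
        val_t& operator=(int) { return *this; }            // deliberately a no-op
        operator int() const { return _val; }
        int _val;
    } A, B = 10, C, D = (B | 1), E;   // first expansion: declares the variables

    // Second expansion: each "assignment" is a no-op that yields the variable,
    // which then converts to int through operator int().
    const int vals[] = { A, B = 10, C, D = (B | 1), E };
    return std::vector<int>(vals, vals + sizeof(vals) / sizeof(int));
}
```

Here the variables receive 0, 10, 11, 11 and 12, the same values a real enum with this enumerator list would have.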

Now that we have an array of all the values and know how many there are, we need to get the names of the constants as strings. To do this, all the values are turned into a single string of the form "(__VA_ARGS__)". This string, together with a pointer to the array and the number of elements, is passed to the EnumReflector constructor. All it has to do is parse the string, pick out the names of the constants, and store all the values.
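
The stringizing step can be tried in isolation. In the sketch below, DEMO_STR, DEMO_NAMES and DemoBody are hypothetical names (the article's own helper is Z_ENUM_DETAIL_STR). The extra pair of parentheses turns the comma-separated list into a single macro argument, so the # operator can convert all of it at once.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helpers mirroring Z_ENUM_DETAIL_STR((__VA_ARGS__)).
#define DEMO_STR(x) #x
#define DEMO_NAMES(...) DEMO_STR((__VA_ARGS__))

// Returns the enumerator list as one string literal, wrapped in the outer
// parentheses that the parser later relies on to find the end of the list.
inline const char* DemoBody() {
    return DEMO_NAMES(None = 0, Sour = 1 << 1, Both = (None | Sour), Last);
}
```

The resulting string starts with '(' and ends with ')', and the initializer expressions appear in it verbatim, which is why the parser has to skip everything between a name and the next top-level comma.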

For speed, the parser itself is organized as a simple finite state machine.

Parser code in EnumReflector:

    struct EnumReflector::Private
    {
        struct Enumerator
        {
            std::string name;
            int value;
        };
        std::vector<Enumerator> values;
        std::string enumName;
    };

    static bool IsIdentChar(char c)
    {
        return (c >= 'A' && c <= 'Z') ||
               (c >= 'a' && c <= 'z') ||
               (c >= '0' && c <= '9') ||
               (c == '_');
    }

    EnumReflector::EnumReflector(const int* vals, int count, const char* name, const char* body)
        : _data(new Private)
    {
        _data->enumName = name;
        _data->values.resize(count);
        enum states
        {
            state_start, // Before identifier
            state_ident, // In identifier
            state_skip,  // Looking for separator comma
        } state = state_start;
        assert(*body == '(');
        ++body;
        const char* ident_start = nullptr;
        int value_index = 0;
        int level = 0;
        for (;;)
        {
            assert(*body);
            switch (state)
            {
            case state_start:
                if (IsIdentChar(*body))
                {
                    state = state_ident;
                    ident_start = body;
                }
                ++body;
                break;
            case state_ident:
                if (!IsIdentChar(*body))
                {
                    state = state_skip;
                    assert(value_index < count);
                    _data->values[value_index].name = std::string(ident_start, body - ident_start);
                    _data->values[value_index].value = vals[value_index];
                    ++value_index;
                }
                else
                {
                    ++body;
                }
                break;
            case state_skip:
                if (*body == '(')
                {
                    ++level;
                }
                else if (*body == ')')
                {
                    if (level == 0)
                    {
                        assert(value_index == count);
                        return;
                    }
                    --level;
                }
                else if (level == 0 && *body == ',')
                {
                    state = state_start;
                }
                ++body;
            }
        }
    }

We simply walk along the string, saving the identifiers (the names of the constants). After each identifier, we skip ahead to the start of the next one, and so on. In the end we have a ready-made data structure containing all the information about the enumeration.

The rest of the EnumReflector implementation just exposes this information and, in my opinion, is not of particular interest for this article. As a reminder, a link to the full version is at the end.
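
For experimenting with the same state machine outside the class, here is a standalone sketch that only collects the names. ExtractNames is a hypothetical helper, not part of the article's code; unlike the constructor above, it takes a std::string, does no assertions, and simply stops at the end of the input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

static bool IsIdentChar(char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || (c == '_');
}

// Collects enumerator names from a string such as "(A = 0, B = (A | 1), C)".
inline std::vector<std::string> ExtractNames(const std::string& body) {
    enum { Start, Ident, Skip } state = Start;
    std::vector<std::string> names;
    std::size_t identStart = 0;
    int level = 0;   // nesting depth of parentheses inside initializers
    // Index 0 is the opening parenthesis added by the stringizing step.
    for (std::size_t i = 1; i < body.size(); ) {
        const char c = body[i];
        switch (state) {
        case Start:   // before an identifier
            if (IsIdentChar(c)) { state = Ident; identStart = i; }
            ++i;
            break;
        case Ident:   // inside an identifier
            if (!IsIdentChar(c)) {
                names.push_back(body.substr(identStart, i - identStart));
                state = Skip;   // do not advance: c is re-examined in Skip
            } else {
                ++i;
            }
            break;
        case Skip:    // skipping the initializer up to the next top-level comma
            if (c == '(') {
                ++level;
            } else if (c == ')') {
                if (level == 0) return names;   // closing parenthesis of the list
                --level;
            } else if (c == ',' && level == 0) {
                state = Start;
            }
            ++i;
            break;
        }
    }
    return names;
}
```

Calling ExtractNames on the string from the expansion example should give the seven names None, Salted, Sour, Sweet, SourSweet, Other and Last; the identifiers inside (Sour | Sweet) are skipped because they appear after a name has already been recorded for that entry.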

When an enumeration is declared outside a class, the _detail_reflector_ function must be declared inline rather than friend. Hence the need for a separate macro, Z_ENUM_NS. To prevent Z_ENUM_NS from being used accidentally in a class body, it also contains an empty extern "C" {} block (the standard does not allow such a block inside a class body, so misuse produces a compilation error).

Also, to avoid the occurrence of name collisions with constants, in the full version all identifiers inside the _detail_reflector_ function have the prefix _detail_ .

What can be improved


One could try to do the parsing at compile time, so that the names are available without any runtime work, using user-defined literals for strings and constexpr functions from C++14.
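
As a very small step in that direction, the sketch below counts the enumerators of a stringized list at compile time with a C++14 constexpr function. CountEnumerators is a hypothetical helper; it only counts top-level commas and does not extract the names, which would need a compile-time string type on top of this.

```cpp
#include <cassert>
#include <cstddef>

// Counts enumerators in a stringized list such as "(A, B = (A | 1), C)" by
// counting the commas that are not nested inside inner parentheses.
// Requires C++14 or later (loops inside a constexpr function).
constexpr std::size_t CountEnumerators(const char* body) {
    std::size_t commas = 0;
    int level = 0;           // 1 while inside the outer parentheses
    bool sawToken = false;   // distinguishes "()" from a one-element list
    for (const char* p = body; *p; ++p) {
        if (*p == '(') {
            ++level;
        } else if (*p == ')') {
            --level;
        } else if (*p == ',' && level == 1) {
            ++commas;
        } else if (*p != ' ') {
            sawToken = true;
        }
    }
    return sawToken ? commas + 1 : 0;
}

// Evaluated entirely by the compiler.
static_assert(CountEnumerators("(Spades, Hearts, Diamonds, Clubs)") == 4, "");
static_assert(CountEnumerators("(A = 0, B = (A | 1), C)") == 3, "");
```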

It would also be nice to get rid of the need for two different macros for defining enumerations inside and outside a class, but so far I have not found a way to do this without breaking ADL.

Links


Full version of the code from the article: github.com
Argument-Dependent Lookup: cppreference.com

That's all. I hope the article turned out to be interesting.

P.S. Suggestions for improving this method are welcome.

Source: https://habr.com/ru/post/276763/

