Escape from the dungeon of types. Working with data whose type is determined dynamically.

When the result of an SQL query drags in endless conversions to every possible field type. When the code is cluttered with obscure logic and a huge enumeration of overloads over the types of a boost::variant. When you do not know how to accept an argument of arbitrary type over an RPC protocol. That is when you need a mechanism that emulates dynamic typing in C++: one that is extensible and easy to use, gives a clear API, requires no predefined list of types and does not force you to work with pointers to a base class. Such a mechanism exists, and double dispatch will help us!

To understand what double dispatch is and how to cook it correctly, efficiently and clearly in C++, you first need to understand what it is for and walk the whole evolutionary path that leads to this solution. Without that explanation a novice developer will go mad by the end of the article, and an experienced one will most likely drown in his own associations and draw the wrong conclusions. So we start from the very beginning: the basics of dynamic typing and why it is needed in C++.

Dynamic typing in C++

Typing in C++ is static, which lets the compiler catch errors in ordinary operations at compile time. As a rule, in 90% of cases we know in advance either the type of the result of an operation or the base class of all its possible values. There is, however, a class of tasks where the type of the resulting values is not known in advance and is computed at runtime. The classic example is the result of a database query: executing an SQL query yields a set of serialized values that have to be unpacked into the appropriate types after the query has run.
Another example is the result of a remote function call over an RPC protocol. The result becomes known only at runtime, and at design time we cannot always predict the set of return values, especially when we are solving the general problem of providing an intermediate API for an RPC protocol. The same holds for any potentially extensible functionality that works with a type system better computed at runtime, for example function arguments or SQL query parameters, which are more convenient to treat generically but still have to be stored and passed around somehow.

We begin with the classic solution: a base class and its derived classes.


Base interface and inheritance

The classic solution with a base interface relies directly on one of the OOP principles, polymorphism. A common class is chosen, usually an abstract one. It declares a number of methods that the derived classes override, and all work is done through a reference or a pointer to the common ancestor of the derived values.
Consider a small example. Suppose we have to store different kinds of goods in a warehouse. Let every product have a name, an identifier of its category in the warehouse, and a price. With this approach the base interface class looks like this:

    class IGoods
    {
    public:
        // Objects are destroyed through a pointer to the base class
        virtual ~IGoods() = default;

        virtual std::string Name() const = 0;
        virtual int TypeID() const = 0;
        virtual float Price() const = 0;
    };

If we need to describe a category of goods such as candy, we need a class derived from the base interface that defines Name, TypeID and Price, for example:

    class Candies : public IGoods
    {
    public:
        static const int TYPE_ID = 9001;

        Candies(std::string const& name, float price);

        virtual std::string Name() const override { return m_name; }
        virtual int TypeID() const override { return TYPE_ID; }
        virtual float Price() const override { return m_price; }

    private:
        std::string m_name;
        float m_price;
    };

As a result, the warehouse can be filled with all sorts of goods, candy included, while we operate only on references to the base class. As a rule we do not need to know which derived class is really behind a reference: the warehouse does not care what is stored in it, as long as it can read the product's name, price and category identifier.
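The idea can be tried out with a short, self-contained sketch. The classes repeat the ones above; the constructor body, the virtual destructor and the helper function TotalPrice are assumptions added for illustration and are not part of the original design.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Base interface from the example above, with a virtual destructor added
// so that objects can be destroyed safely through a base pointer.
class IGoods {
public:
    virtual ~IGoods() = default;
    virtual std::string Name() const = 0;
    virtual int TypeID() const = 0;
    virtual float Price() const = 0;
};

class Candies : public IGoods {
public:
    static const int TYPE_ID = 9001;

    Candies(std::string const& name, float price)
        : m_name(name), m_price(price) { }

    std::string Name() const override { return m_name; }
    int TypeID() const override { return TYPE_ID; }
    float Price() const override { return m_price; }

private:
    std::string m_name;
    float m_price;
};

// Hypothetical helper: the warehouse sees only the base interface
// and never needs to know the concrete class of an item.
float TotalPrice(std::vector<std::unique_ptr<IGoods>> const& warehouse) {
    float total = 0.0f;
    for (auto const& item : warehouse) {
        assert(item != nullptr);
        total += item->Price();
    }
    return total;
}
```

Filling the warehouse then comes down to calls such as warehouse.push_back(std::make_unique<Candies>("toffee", 1.5f)), which already hints at the storage and copying problems of this approach.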

We get the following advantages:

extensibility - the main advantage: derived classes can be created in any library and used on equal terms; this is not the case with, say, the endless switch approach, where at some point the system chokes on the excess of case branches scattered across different parts of similar code;
dynamic typing - the type can in fact be chosen at runtime by creating an instance of one derived class or another depending on the logic of the task, so that you can, for example, fill in the result of parsing a JSON object or an SQL query;
clarity - it is very easy to draw a diagram of the inheritance tree that everyone understands, and the description of the base class alone makes the behavior of a derived class obvious.

There are, however, drawbacks as well. There are only three of them, but ignoring them guarantees a constant headache, because we lose all the advantages of C++ classes by reducing the work to pointers to the base interface class:

hard to create - whatever you say, filling the warehouse, that is, creating objects of derived classes not known in advance, has to be done through factories;
hard to store - the language offers only two built-in options, a reference and a pointer, and only a pointer can be stored in a container. Storing a container full of raw pointers is, of course, bad for the health of the application, so smart pointers such as std::shared_ptr and std::unique_ptr come to the rescue. The first is rather heavy, and the behavior of the second causes a sharp headache on any copying, explicit or implicit;
hard to copy - with std::unique_ptr you have to provide a Clone method in the base class, and the same goes for std::shared_ptr if we do not want different containers to refer to the same data. That is, we either deceive the user, because copying the container does not copy the data in the usual C++ sense, or we further complicate the base class and all its descendants by adding a primitive cloning operation.

In practice, with this classic approach the code in one place looks like a horror movie, and instead of an ordinary constructor a monster like this appears:

    std::deque<std::unique_ptr<IGoods>> goods;
    std::unique_ptr<IGoods> result = GoodsFactory::Create<Candies>();
    goods.push_back(std::move(result));

Elsewhere in the code the real horror begins when the elements of the collection are accessed through "smart" pointers:

    std::deque<std::unique_ptr<IGoods>> another(goods.size());
    std::transform(goods.begin(), goods.end(), another.begin(),
        [](std::unique_ptr<IGoods>& element)
        {
            return element->Clone();
        }
    );

All of this is C++ in its worst traditions, so it is no surprise that most developers regard such constructs as normal, even when they are exposed in an interface or published in an open source project.

Is it really that bad in C++? Can't we get by with ordinary classes that have generated copy and move constructors, assignment operators and the other pleasures of life? What prevents us from encapsulating all the logic of working with the pointer to the base class inside an object of a container class? Nothing, really.

Data class inheritance

It is time to rebuild the logic of the base class. We pack all the logic of working with the base interface into an ordinary C++ class. The base interface class stops being abstract, and its objects get the usual logic of constructors and destructors and can be copied and assigned. Most importantly, we keep all the advantages of the previous approach while getting rid of its drawbacks!

In other words, the base class holds its data in the form of a class whose behavior is determined by the derived classes, whose data classes in turn inherit from the data class of the base class. Does that sound confusing? Let's look at an example, and everything will become clear.

The base class holds its data as a class that resembles the interface class, and the behavior of that data is determined by the data classes of the derived classes. The inheritance is thus doubled: the data classes are inherited as well.

    // Base class of the API objects
    class object
    {
    public:
        object();
        virtual ~object();

        virtual bool is_null() const;
        virtual std::string to_string() const;

    protected:
        // Protected rather than private: derived data classes inherit from it!
        class data;

    private:
        std::shared_ptr<data> m_data;
    };

    // Data class of the base API object.
    // An implementation detail: keep it out of the headers that users #include
    class object::data
    {
    public:
        data() { }
        virtual ~data() { }

        virtual bool is_null() const { return true; }
        virtual std::string to_string() const { return "null"; }
    };

    // A derived class of the API:
    // a flower that can be used wherever an object is expected
    class flower : public object
    {
    public:
        flower(std::string const& name);

        virtual bool is_null() const override;
        virtual std::string to_string() const override;

        virtual std::string name() const;
        virtual void rename(std::string const& name);

    protected:
        // Protected rather than private: derived data classes inherit from it!
        class data;
    };

    // Data class of the derived API object.
    // An implementation detail: keep it out of the headers that users #include
    class flower::data : public object::data
    {
    public:
        static const std::string FLOWER_UNKNOWN;

        data() : m_name(FLOWER_UNKNOWN) { }
        data(std::string const& name) : m_name(name) { }

        virtual bool is_null() const override { return false; }
        virtual std::string to_string() const override { return "flower: " + m_name; }

        virtual std::string name() const { return m_name; }
        virtual void rename(std::string const& name) { m_name = name; }

    private:
        std::string m_name;
    };

In practice there are usually more derived classes, and as a rule they appear in dependent libraries. Now it is time to see what this amusing design allows.

    object rose = flower("rose");
    object none;

    std::vector<object> garden;
    garden.push_back(std::move(rose));
    garden.push_back(std::move(none));
    garden[1] = flower("gladiolus");

    std::for_each(garden.begin(), garden.end(),
        [](object const& element)
        {
            std::cout << element.to_string() << std::endl;
        }
    );

The implementation of the API class methods is straightforward: they forward the calls to the methods that manipulate the data. The constructor of the base class creates no data and leaves a null pointer, while the constructors of the derived classes initialize the pointer of the base class with a data object of the required derived type.

Now nothing prevents you from creating any new class derived from object and giving it its own logic for conversion to a string and for checking whether a value is present. For example, we can introduce a shoes class:
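A minimal sketch of this forwarding, under a few assumptions that the article does not spell out: the proxy methods are non-virtual here, the data classes are defined inline, and the protected constructor that lets a derived class install its data is a hypothetical helper. The function describe exists only to demonstrate the result.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class object {
public:
    object() = default;              // no data: a null object
    virtual ~object() = default;

    // Proxy methods: forward to the data, treating "no data" as null.
    bool is_null() const { return !m_data || m_data->is_null(); }
    std::string to_string() const { return m_data ? m_data->to_string() : "null"; }

protected:
    class data {
    public:
        virtual ~data() = default;
        virtual bool is_null() const { return true; }
        virtual std::string to_string() const { return "null"; }
    };

    // Assumed helper: a derived class passes in its own data object.
    explicit object(std::shared_ptr<data> d) : m_data(std::move(d)) {
        assert(m_data != nullptr);
    }

private:
    std::shared_ptr<data> m_data;
};

class flower : public object {
public:
    explicit flower(std::string const& name)
        : object(std::make_shared<data>(name)) { }

protected:
    class data : public object::data {
    public:
        explicit data(std::string const& name) : m_name(name) { }
        bool is_null() const override { return false; }
        std::string to_string() const override { return "flower: " + m_name; }
    private:
        std::string m_name;
    };
};

// Demonstration only: the garden stores plain object values, yet each
// element still answers with the behaviour of its data class.
std::string describe(std::vector<object> const& garden) {
    std::string out;
    for (object const& item : garden)
        out += item.to_string() + "\n";
    return out;
}
```

Note that a flower stored in a std::vector<object> is sliced down to object, and that is exactly why the behaviour has to live in the data class: the shared pointer to flower::data survives the slicing.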

    class shoes : public object
    {
    public:
        shoes(long long price);

        virtual bool is_null() const override;
        virtual std::string to_string() const override;

        virtual long long price() const;
        virtual void discount(long long price);

    protected:
        class data;
    };

The class shoes::data is described by analogy with flower::data. Now, however, we can get a funny result when working with our flower garden from the previous example:

 garden.push_back(shoes(100000000000LL));

So you can leave shoes worth 100 billion Belarusian rubles in the garden. Someone picking flowers will also stumble over these shoes by accident, but we would have had the same problem in the original approach with the interface of the base class. If we had meant the garden to hold only flowers, we would have used std::vector<flower>. Apparently the author of the code decided to keep anything at all in his garden, from flowers and shoes to rubbish not known in advance, including a nuclear reactor or the Egyptian pyramids, because nothing now prevents you from deriving new classes from object.

Welcome to the world of dynamic typing with ordinary C++ classes and their usual logic. Or not quite! Copying an object copies only the reference to its data. It is time to fix this last inconsistency with the logic of C++ classes.

Copying the object on modification

It is time for our base object to learn what the original interface did with the Clone method, that is, to copy the contents of the derived class. The copying should be as sparing as possible and happen as late as possible. This matters more the larger the object and the more often it is copied, explicitly or implicitly. Here we can use the principle of copying the data only when the object is modified.

Copying on modification, or copy-on-write (COW), is relatively simple to implement in C++. One example is the Qt library, where COW is used everywhere, including strings (QString), which reduces the cost of copying such objects to the necessary minimum.

The essence of the approach is as follows:

the object refers to its data through an auxiliary pointer-like type;
object methods are either const or non-const, and it is important to keep const-correctness strict because of the points that follow;
when an object method is called, the call is forwarded to the method of the data class through the same auxiliary pointer-like type from the first item, for which operator -> is overloaded twice, as const and as non-const.

The const overload of operator -> simply forwards the call from the outer class to the required method of the data class;
The non-const overload of operator -> is a bit more interesting, since it implies that the call modifies the data. We therefore have to make sure that we refer to our own data, which may be changed. If the reference to the data is not unique, that is, we have postponed the copying and still refer to someone else's data, we must first make our own copy of the data and then work with it by calling the required method.

Copy-on-write is relatively easy to implement in C++ by overloading operator -> of an encapsulated helper class. It is important to provide both the const and the non-const overload of the operator.

To make this clear, here is the most simplified version of such an intermediate reference type:

    template <class data_type>
    class copy_on_write
    {
    public:
        copy_on_write(data_type* data)
            : m_data(data)
        {
        }

        // Read-only access never copies the data
        data_type const* operator -> () const
        {
            return m_data.get();
        }

        // Mutable access: detach from shared data before returning it
        data_type* operator -> ()
        {
            // use_count() is used instead of unique(),
            // which is deprecated since C++17 and removed in C++20
            if (m_data.use_count() > 1)
                m_data.reset(new data_type(*m_data));
            return m_data.get();
        }

    private:
        std::shared_ptr<data_type> m_data;
    };

Strictly speaking, this class should be made safe for multithreaded access and for exceptions thrown while copying, but it is simple enough to convey the basic idea of implementing COW in C++. Keep in mind as well that for polymorphic data the copy should go through a virtual method of the data class that clones the data, because a plain copy constructor call would lose the derived type.
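A sketch of the cloning variant mentioned above, assuming the data class exposes a virtual clone() method. That method, the helper shares_with and the two sample data classes are hypothetical names introduced for illustration. Thread safety is still left out.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Assumption: data_type provides a virtual clone() that returns a new,
// heap-allocated copy of the object with its dynamic type preserved.
template <class data_type>
class copy_on_write {
public:
    explicit copy_on_write(data_type* data) : m_data(data) {
        assert(m_data != nullptr);
    }

    // Const access: share the data, never copy.
    data_type const* operator->() const { return m_data.get(); }

    // Non-const access: detach first if somebody else shares the data.
    data_type* operator->() {
        if (m_data.use_count() > 1)
            m_data.reset(m_data->clone());
        return m_data.get();
    }

    // Illustration helper: true while both handles point at the same data.
    bool shares_with(copy_on_write const& other) const {
        return m_data == other.m_data;
    }

private:
    std::shared_ptr<data_type> m_data;
};

// Sample polymorphic data classes (hypothetical).
struct text_data {
    explicit text_data(std::string v) : value(std::move(v)) { }
    virtual ~text_data() = default;
    virtual text_data* clone() const { return new text_data(*this); }
    virtual std::string kind() const { return "text"; }
    std::string value;
};

struct rich_text_data : text_data {
    using text_data::text_data;
    text_data* clone() const override { return new rich_text_data(*this); }
    std::string kind() const override { return "rich"; }
};
```

Copying a copy_on_write handle only bumps the reference count; the first call through the non-const operator -> on a shared handle is what pays for the clone. Calling a method on a non-const handle always selects the non-const overload, even for a read, so read-only code should work through const references.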

Now all we have to do is change the data storage in the base object class:

    class object
    {
        ...
    protected:
        class data;

    private:
        copy_on_write<data> m_data;
    };

Thus the derived classes are initialized in a way that is compatible with the base class, which in effect gives us dynamic typing. We also avoid pointers to an abstract class: we have ordinary C++ classes with constructors, destructors, copying and assignment, and deriving from them is as simple as it can be. The only complication, the proxy methods that boil down to m_data->method(arguments), turns out to be an advantage: besides the call itself, we get a place to save diagnostic information such as a stack trace, which simplifies error tracking and lets us throw exceptions that preserve the call sequence up to the method that raised them.

In essence, we have obtained a hybrid of the Pimpl and double dispatch approaches for dynamically typed data whose type becomes known only at runtime.


Should the data class implement the interface?

When implementing a data class it is not necessary to duplicate every method of the outer class, as is done with the Pimpl pattern. The data class has two main jobs: it hides the details of encapsulation in the implementation, and it gives the implementation of the outer class's methods access to the data. It is quite enough to provide get_ and set_ methods plus some auxiliary functionality and to do the actual data processing in the methods of the outer class. This separates the implementation of the class from the details of encapsulation.
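A minimal sketch of this split, assuming a standalone shoes class (the object base class and copy-on-write are left out for brevity). The accessor names get_price and set_price and the rule that a price never drops below zero are illustrative assumptions.

```cpp
#include <cassert>
#include <memory>
#include <string>

class shoes {
public:
    explicit shoes(long long price);

    long long price() const;
    void discount(long long amount);   // the processing logic lives here
    std::string to_string() const;

private:
    class data;                        // storage and accessors only
    std::shared_ptr<data> m_data;      // note: copies of shoes share this data
};

// The data class knows how to store the value, nothing more.
class shoes::data {
public:
    explicit data(long long price) : m_price(price) { }
    long long get_price() const { return m_price; }
    void set_price(long long price) { m_price = price; }
private:
    long long m_price;
};

shoes::shoes(long long price) : m_data(std::make_shared<data>(price)) { }

long long shoes::price() const { return m_data->get_price(); }

// The outer class decides what a discount means (assumed rule:
// the price is clamped at zero); the data class only stores the result.
void shoes::discount(long long amount) {
    assert(amount >= 0);
    long long reduced = m_data->get_price() - amount;
    m_data->set_price(reduced < 0 ? 0 : reduced);
}

std::string shoes::to_string() const {
    return "shoes: " + std::to_string(m_data->get_price());
}
```

With this layout a change in the business rule touches only the outer class, while a change in how the value is stored touches only the data class.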

Using dynamic typing

Suppose we have a protocol for remote function calls or, as a variation, parameterized SQL queries to a database. If we are building a general mechanism and providing an API to the end user, the types of the arguments and of the result are computed at runtime, because we do not know in advance what the user will pass as arguments or which result types will come back from the remote side. Sometimes even the developer who writes on top of this API does not know, because in a chain of calls the arguments of the next call are often based on the results of the previous one.

In such cases, when the base class is not only an interface for the derived classes but also a container for their data, we can describe any functionality that requires dynamic typing in terms of C++ classes and objects.

Consider an example with an SQL query. The argument list for executing the query can be generated, for instance with Boost.Preprocessor, as a function of an arbitrary number of arguments of type object.

    // An SQL query, an object of class db::SqlQuery
    db::SqlQuery query("select * from users as u where u.type = $(usertype) and u.registered >= $(datetime) limit 10");

    // Execute the query through operator ()
    db::SqlQueryResult result = query("admin", datetime::today());

    // Walk over the rows of the result
    std::for_each(result.begin(), result.end(),
        [](db::SqlQueryRow const& row)
        {
            // Read a field as an object
            object login = row["login"];
            if (login.is_null())
                std::cout << "not specified";
            else
                std::cout << row["login"];

            // Compare a field with a value
            if (row["status"] == "deleted")
                std::cout << " (deleted)";
            std::cout << std::endl;
        }
    );

The arguments of db::SqlQuery::operator () can be an arbitrary set of object values. For this the general type object needs a template constructor that performs the implicit conversion:

    class object
    {
    public:
        template <typename value_type>
        object(value_type const& value);
        ...
    };

We will then need classes derived from object, such as integer, boolean, floating, text, datetime and others, whose data is placed into the object when it is initialized with a value of the corresponding type. Initialization of an object from an arbitrary type stays extensible: all it takes to support a new type is an appropriate specialization, like this one for bool:

    class boolean : public object
    {
    public:
        boolean(bool value)
            : object(value)
        {
        }
        ...
    protected:
        class data;
        friend class object;
    };

    // The specialization has to match the template signature: value_type const&
    template <>
    object::object(bool const& value)
        : m_data(new boolean::data(value))
    {
    }

The most important point here is a different one: the result of the query is a table of data computed on the remote side, in the database, at runtime. Nevertheless we can safely walk over every row of the result and get an object of a perfectly definite type rather than some half-deserialized data. You can work with such an object: it can overload comparison operations, it can offer, by analogy with the constructor, a template method for obtaining a value of a given type, and it can be converted to a string or written to a stream. An instance of object is a full-fledged container that can be handled like an object of any ordinary class.

Moreover, if you wish, you can add container logic to object and use a single type for any value returned by a query, by giving it the methods begin(), end() and size() as well as operator []:
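The template accessor and the comparison mentioned above might look like the following sketch. It deliberately simplifies the design: instead of one derived class per type it keeps the value in a private typed_data template, and it formats values through overloaded render functions. The names as, typed_data and render are assumptions made for this illustration and are not part of the article's API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <typeinfo>

// Hypothetical formatting hooks: one overload per supported type.
inline std::string render(bool value) { return value ? "true" : "false"; }
inline std::string render(long long value) { return std::to_string(value); }
inline std::string render(std::string const& value) { return value; }

class object {
public:
    object() = default;   // a null object

    // Implicit conversion from any supported value type.
    template <typename value_type>
    object(value_type const& value)
        : m_data(std::make_shared<typed_data<value_type>>(value)) { }

    bool is_null() const { return !m_data; }
    std::string to_string() const { return m_data ? m_data->to_string() : "null"; }

    // Returns the stored value, or throws std::bad_cast if the object
    // is null or holds a value of a different type.
    template <typename value_type>
    value_type as() const {
        auto typed = dynamic_cast<typed_data<value_type> const*>(m_data.get());
        if (!typed) throw std::bad_cast();
        return typed->value;
    }

    // Equal when both are null, or both hold the same type and the same text.
    friend bool operator==(object const& left, object const& right) {
        if (left.is_null() || right.is_null())
            return left.is_null() && right.is_null();
        data const& l = *left.m_data;
        data const& r = *right.m_data;
        return typeid(l) == typeid(r) && l.to_string() == r.to_string();
    }

private:
    struct data {
        virtual ~data() = default;
        virtual std::string to_string() const = 0;
    };

    template <typename value_type>
    struct typed_data : data {
        explicit typed_data(value_type const& v) : value(v) { }
        std::string to_string() const override { return render(value); }
        value_type value;
    };

    std::shared_ptr<data> m_data;
};
```

In this sketch the value types have to match exactly: object(42LL) holds a long long, so as<long long>() succeeds while as<bool>() throws. A production version would also want conversions between compatible types and a constructor for string literals.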

    object result = query("admin", datetime::today());
    std::for_each(result.begin(), result.end(),
        [](object const& row)
        {
            std::for_each(row.begin(), row.end(),
                [](object const& cell)
                {
                    std::cout << cell.to_string() << ' ';
                }
            );
            std::cout << std::endl;
        }
    );

In principle, the idea can be stretched to the point where absolutely everything goes through the container and base class object, but common sense should not be forgotten here. The idea of static typing, which reveals errors at compile time, is a very good one, and it is extremely unwise to abandon it everywhere just because dynamic typing is available!


Dynamic typing is, however, extremely useful in exactly the places it is intended for: values whose type is obtained dynamically, usually as a result of parsing a data stream. With the interface for working with data of various types encapsulated in a base class, we can work with ordinary C++ objects, creating and copying them with ordinary constructors and the assignment operator, and the copying can be deferred as long as possible (ideally forever) with the copy-on-write technique.

The base class is then both an interface class for working with various data and a container. For convenience you can define all the necessary operations for the base class: comparison, indexing, arithmetic and logical operations, and stream operations. On the whole you can write code that is as readable and logical as possible and as well protected as possible from the errors that come with storing a pointer to the base class, copying, and access from different threads. This is especially useful when the API is developed for a wide range of tasks and works with a set of types that is not known in advance and may grow.

Dynamic typing is a responsibility!

You need to be extremely prudent when introducing dynamic typing. Remember that developers in scripting languages often envy the ability of C++, C# and Java to check types at compile time, before the algorithm ever runs. Use the power of static typing, and emulate giving it up only where that is justified! As a rule, dynamic typing is needed to execute a generic API request to a remote server that returns serialized data (a database query included).

After deserialization, a whole range of types can appear at runtime. It is usually unjustified to discard the dynamically obtained types and work with data serialized to text or to a byte stream, since the data normally still has to be processed. The convenience of parsing the data into familiar C++ types and working not with interface pointers but with well-designed classes of ordinary objects is priceless.


#189.
Author: Vladimir Qualab Kerimov, Lead C++ Developer, Parallels


Source: https://habr.com/ru/post/257891/
