Convenient data serialization with Variadic Templates

Foreword

While working on my project I needed to write the contents of various data structures to binary files. Since they often contained strings, vectors, and other data whose size changes at run time, each structure needed its own hand-written translation into a byte sequence that could later be read back. Using Boost seemed cumbersome to me (and I don't have it installed), and I wanted to solve the problem myself. So I decided to make the process as routine-free as possible, and to do it with templates.

The following data types are supported:
- All fundamental C++ types
- std::string
- std::vector<T>, where T is any type from this list
- Any enumeration type

Implementation

I use Visual Studio 2013 as my development environment, but the code itself is cross-platform. The class responsible for all the functionality is called AbstractSaveData, and it is used through inheritance. I decided not to make the class itself a template, since that would make it rather inconvenient to use, even though templates do appear in the title. Instead, only its methods are templates, so using the class never requires explicitly instantiating a template method.

The class interface consists of the following methods:

    virtual void const* Serialize(int& size) = 0;
    virtual void Deserialize(const void* buf, size_t size) = 0;
    virtual int SerializedSize()const = 0;
    void CleanSerializedBuffer();

The implementation of the first three methods should be in a descendant class.
The CleanSerializedBuffer method is used to clear the local buffer with serialized data. There is nothing special about the implementation:

    void CleanSerializedBuffer()
    {
        delete[] serializedBuf;
        serializedBuf = nullptr;
        m_size = 0;
    }

That covers the public methods. A derived class whose data is to be serialized works with the following protected methods:

    template<class ...Ts>
    int Serialization(const Ts&... objects);

    template<class ...Ts>
    void Deserialization(const void* buf, size_t size, Ts&... objects);

    template<class T, class ...Ts>
    inline int CalculateSize(const T& obj, const Ts&... objects)const;

    const char* SerializedBuf()const;

As the name suggests, the Serialization method performs the serialization. Its implementation:

    template<class ...Ts>
    int Serialization(const Ts&... objects)
    {
        if (serializedBuf)
            delete[] serializedBuf;
        m_size = CalculateSize(objects...);
        serializedBuf = new char[m_size];
        ProcessSerialization(0, objects...);
        return m_size;
    }

First, the buffer size needed for the data is calculated and the corresponding amount of memory is allocated. Then the data is immediately serialized.
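
The measure-then-write idea can be shown outside the class. The sketch below is only an illustration under simplifying assumptions, not the article's code: the names `sizeOf`, `writeOne`, `writeAll`, and `serializeAll` are hypothetical, a `std::vector<char>` stands in for the raw buffer, and only arithmetic types, enumerations, and `std::string` are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Fixed-size value: the size follows from the type.
template <class T>
typename std::enable_if<std::is_arithmetic<T>::value || std::is_enum<T>::value,
                        std::size_t>::type
sizeOf(const T&) { return sizeof(T); }

// String: 4-byte header plus the characters.
inline std::size_t sizeOf(const std::string& s) {
    return sizeof(std::int32_t) + s.size();
}

// Pass 1: add up the sizes of all arguments.
inline std::size_t totalSize() { return 0; }
template <class T, class... Ts>
std::size_t totalSize(const T& first, const Ts&... rest) {
    return sizeOf(first) + totalSize(rest...);
}

// Write one fixed-size value as raw bytes.
template <class T>
typename std::enable_if<std::is_arithmetic<T>::value || std::is_enum<T>::value,
                        std::size_t>::type
writeOne(char* dst, const T& v) {
    std::memcpy(dst, &v, sizeof(T));
    return sizeof(T);
}

// Write one string: the header holds the total size, header included.
inline std::size_t writeOne(char* dst, const std::string& s) {
    const std::int32_t total = static_cast<std::int32_t>(sizeOf(s));
    std::memcpy(dst, &total, sizeof(total));
    std::memcpy(dst + sizeof(total), s.data(), s.size());
    return static_cast<std::size_t>(total);
}

// Pass 2: write the arguments one after another, returning the end position.
inline char* writeAll(char* dst) { return dst; }
template <class T, class... Ts>
char* writeAll(char* dst, const T& first, const Ts&... rest) {
    return writeAll(dst + writeOne(dst, first), rest...);
}

template <class... Ts>
std::vector<char> serializeAll(const Ts&... objects) {
    std::vector<char> buf(totalSize(objects...));  // measure and allocate
    char* end = writeAll(buf.data(), objects...);  // fill the buffer
    assert(end == buf.data() + buf.size());        // both passes must agree
    (void)end;
    return buf;
}
```

With 4-byte headers, a call such as `serializeAll(std::int32_t(7), std::string("abc"), 'x')` should produce 12 bytes: 4 for the integer, 7 for the string with its header, and 1 for the character.
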

The implementation of the Deserialization method is similar in structure to the previous one:

    template<class ...Ts>
    void Deserialization(const void* buf, size_t size, Ts&... objects)
    {
        if (size)
        {
            int read = CalculateSize(buf, objects...);
            if (read != size)
                throw ApproxException(L" ");
            ProcessDeserialization(static_cast<const char*>(buf), objects...);
        }
    }

Here, however, the size is calculated only for error checking.
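
The size check can be sketched on its own. This is a rough illustration, assuming the layout described in this article (fixed-size values stored raw, strings prefixed with a 4-byte total size). The names `storedSize`, `expectedSize`, and `checkSize` are hypothetical, and `std::runtime_error` stands in for the project's own exception type. Like the original, the sketch trusts the headers it reads and does not guard against running past the end of a corrupted buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <type_traits>

// Fixed-size value: nothing has to be read, the type gives the size.
template <class T>
typename std::enable_if<std::is_arithmetic<T>::value || std::is_enum<T>::value,
                        std::size_t>::type
storedSize(const char*, const T&) { return sizeof(T); }

// String: the 4-byte header holds the total size, header included.
inline std::size_t storedSize(const char* p, const std::string&) {
    std::int32_t total = 0;
    std::memcpy(&total, p, sizeof(total));
    return static_cast<std::size_t>(total);
}

// Walk the buffer, adding up what each object claims to occupy.
inline std::size_t expectedSize(const char*) { return 0; }
template <class T, class... Ts>
std::size_t expectedSize(const char* p, const T& first, const Ts&... rest) {
    const std::size_t n = storedSize(p, first);
    return n + expectedSize(p + n, rest...);
}

// Throw if the buffer length disagrees with the sum of the object sizes.
template <class... Ts>
void checkSize(const void* buf, std::size_t size, const Ts&... objects) {
    if (expectedSize(static_cast<const char*>(buf), objects...) != size)
        throw std::runtime_error("serialized size mismatch");
}
```

The objects are passed only so that their types select the right `storedSize` overload; their values are never read.
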

Finally, the CalculateSize method, which calculates the space occupied by the objects, has two overloads:

    template<class T>
    inline int CalculateSize(const T& obj)const
    {
        return reqSize(obj);
    }

    template<class T, class ...Ts>
    inline int CalculateSize(const T& obj, const Ts&... objects)const
    {
        return reqSize(obj) + CalculateSize<Ts...>(objects...);
    }

Here you can already see recursion both at compile time and at run time. Anyone familiar with how tuples are implemented in C++ will easily follow what is happening. Note that there are four overloads of this method in total; the other two are in the private section and are not called directly from the derived class, but are used during deserialization.

Well, the SerializedBuf method simply returns a pointer to the serialized data:

 const char* SerializedBuf()const { return serializedBuf; }

And finally, what is, as they say, "under the hood."

Because there are so many methods, listing and describing them one by one would be cumbersome and boring, so I will first describe in general terms what happens in the code below and then present the code itself.

In total there are three groups of methods:

First: recursive. They expand the argument list, advance through the buffer, and call the methods that process each object according to its type. These are the methods named CalculateSize, ProcessSerialization, and ProcessDeserialization.

Second: copying. They serialize or deserialize a single object, copying the result into the buffer or out of it. These are the methods named CopyS and CopyD: CopyS is used during serialization and CopyD during deserialization.

Third: auxiliary. They calculate the space occupied by a single object. These are the methods named reqSize.

The code makes heavy use of overloaded templates and of the standard type-traits tools std::is_fundamental, std::is_enum, and std::is_base_of, together with std::enable_if. These tools make it possible to separate objects of constant size from objects of variable size. For clarity and simplicity, I wrote my own type trait:

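    // std::integral_constant is the portable spelling; std::_Cat_base is internal to MSVC's library.
    template<typename T>
    struct is_simple
        : std::integral_constant<bool, std::is_fundamental<T>::value || std::is_enum<T>::value>
    { };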

It simply combines the set of fundamental types with the set of enumeration types. This is very convenient here, because in both cases the size of an object follows unambiguously from its type. For convenience, these types are called simple below.

On the whole, this is the most common kind of serialization: a 4-byte size comes first, followed by the data itself (in this implementation the stored size includes the header). The exception is the simple types, which need no header, because their size follows from their type, and the type is supplied by the derived class during deserialization. This reduces data redundancy.
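
To make the notion of simple types concrete, here is a small self-contained check. The trait is repeated so the snippet compiles on its own, written with `std::integral_constant`. The types `Color` and `Plain` and the helper `hasFixedSize` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

template <typename T>
struct is_simple
    : std::integral_constant<bool, std::is_fundamental<T>::value ||
                                   std::is_enum<T>::value> {};

enum class Color : unsigned char { Red, Green };
struct Plain { int x; };

// Fundamental types and enumerations are simple.
static_assert(is_simple<int>::value, "int is fundamental");
static_assert(is_simple<double>::value, "double is fundamental");
static_assert(is_simple<Color>::value, "enumerations are simple");

// Everything else is not, including pointers and plain structures.
static_assert(!is_simple<std::string>::value, "class type");
static_assert(!is_simple<std::vector<int>>::value, "class type");
static_assert(!is_simple<Plain>::value, "plain structures are not simple");
static_assert(!is_simple<int*>::value, "pointers are compound types");

// The same trait selects between two overloads through std::enable_if.
template <class T>
typename std::enable_if<is_simple<T>::value, bool>::type
hasFixedSize(const T&) { return true; }

template <class T>
typename std::enable_if<!is_simple<T>::value, bool>::type
hasFixedSize(const T&) { return false; }
```

Exactly one of the two `hasFixedSize` overloads survives substitution for any given type, which is the same mechanism that lets one overload of a method handle simple types and another handle everything else.
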

Arrays (vectors) deserve special attention. How a vector is serialized depends on the type of its elements. This implementation handles two cases: a vector of simple types and a vector of types derived from AbstractSaveData. Plain structures are not supported, and trying to use one results in a compilation error. Supporting them would not be hard, but my project does not need it. Besides, it would remove the guarantee of a successful round trip, because the contents of such a structure are unknown and can be anything (pointers, or the same strings and vectors). A structure that inherits from AbstractSaveData can be used instead.
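
For the first case, a vector of simple types, the whole payload can be copied in one block. The following sketch assumes the layout described above (a 4-byte total size, header included, followed by the raw elements). The function names `packVector` and `unpackVector` are hypothetical, and `std::vector<bool>` is excluded because it has no contiguous storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Layout: [int32 total size, header included][raw elements]
template <class T>
std::vector<char> packVector(const std::vector<T>& v) {
    static_assert((std::is_arithmetic<T>::value || std::is_enum<T>::value) &&
                      !std::is_same<T, bool>::value,
                  "only fixed-size element types with contiguous storage");
    const std::size_t payload = v.size() * sizeof(T);
    const std::int32_t total =
        static_cast<std::int32_t>(sizeof(std::int32_t) + payload);
    std::vector<char> out(static_cast<std::size_t>(total));
    std::memcpy(out.data(), &total, sizeof(total));
    if (payload != 0)
        std::memcpy(out.data() + sizeof(total), v.data(), payload);
    return out;
}

// Reads a vector back and returns the number of bytes consumed.
template <class T>
std::size_t unpackVector(const char* buf, std::vector<T>& v) {
    std::int32_t total = 0;
    std::memcpy(&total, buf, sizeof(total));
    const std::size_t payload = static_cast<std::size_t>(total) - sizeof(total);
    assert(payload % sizeof(T) == 0);  // payload must hold whole elements
    v.resize(payload / sizeof(T));     // exact element count
    if (payload != 0)
        std::memcpy(v.data(), buf + sizeof(total), payload);
    return static_cast<std::size_t>(total);
}
```

Returning the number of bytes consumed lets the caller advance through the buffer to the next object, in the same way the copying methods return their sizes.
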

This is, perhaps, all that I could tell in theory. Here is the code:

    // Fragment of the AbstractSaveData class body.
    // Requires <cstring>, <string>, <type_traits>, <utility> and <vector>.

    template<class T>
    inline void ProcessDeserialization(const char* buf, T& obj) {
        CopyD(buf, obj);
    }

    template<class T, class ...Ts>
    inline void ProcessDeserialization(const char* buf, T& obj, Ts&... objects) {
        ProcessDeserialization<Ts...>(buf + CopyD(buf, obj), objects...);
    }

    template<class T>
    inline void ProcessSerialization(int shift, const T& obj) {
        CopyS(shift, obj);
    }

    template<class T, class ...Ts>
    inline void ProcessSerialization(int shift, const T& obj, const Ts&... objects) {
        shift += CopyS(shift, obj);
        ProcessSerialization<Ts...>(shift, objects...);
    }

    //Copy Serialization methods begin
    template<typename saveData>
    inline typename std::enable_if<std::is_base_of<AbstractSaveData, saveData>::value, int>::type
    CopyS(int shift, saveData& obj) {
        AbstractSaveData* data = dynamic_cast<AbstractSaveData*>(&obj);
        int size;
        auto ptr = data->Serialize(size);
        size += sizeof(int);
        memcpy(serializedBuf + shift, &size, sizeof(int));
        memcpy(serializedBuf + shift + sizeof(int), ptr, size - sizeof(int));
        data->CleanSerializedBuffer();
        return size;
    }

    template<typename T>
    inline typename std::enable_if<is_simple<T>::value, int>::type
    CopyS(int shift, const T& obj) {
        memcpy(serializedBuf + shift, &obj, sizeof(T));
        return sizeof(T);
    }

    inline int CopyS(int shift, const std::string& obj) {
        int size = reqSize(obj);
        memcpy(serializedBuf + shift, &size, sizeof(int));
        memcpy(serializedBuf + shift + sizeof(int), obj.c_str(), size - sizeof(int));
        return size;
    }

    inline int CopyS(int shift, const std::pair<const void*, int>& obj) {
        memcpy(serializedBuf + shift, &obj.second, sizeof(int));
        memcpy(serializedBuf + shift + sizeof(int), obj.first, obj.second);
        return obj.second + sizeof(int);
    }

    template<class T>
    inline typename std::enable_if<is_simple<T>::value, int>::type
    CopyS(int shift, const std::vector<T>& obj) {
        int size = reqSize(obj);
        memcpy(serializedBuf + shift, &size, sizeof(int));
        memcpy(serializedBuf + shift + sizeof(int), obj.data(), size - sizeof(int));
        return size;
    }

    template<class T>
    inline typename std::enable_if<!is_simple<T>::value, int>::type
    CopyS(int shift, const std::vector<T>& objects) {
        int size = reqSize(objects);
        memcpy(serializedBuf + shift, &size, sizeof(int));
        for (auto obj : objects) {
            shift += CopyS(shift + sizeof(int), obj);
        }
        return size;
    }
    //Copy Serialization methods end

    //Copy Deserialization methods begin
    template<class T>
    inline typename std::enable_if<is_simple<T>::value, int>::type
    CopyD(const void* buf, T& obj) {
        memcpy(&obj, buf, sizeof(T));
        return sizeof(T);
    }

    inline int CopyD(const void* buf, std::string& obj) {
        int size = *static_cast<const int*>(buf) - sizeof(int);
        obj.reserve(size);
        obj.assign(size, '0');
        memcpy(&obj[0], static_cast<const char*>(buf) + sizeof(int), size);
        return size + sizeof(int);
    }

    template<typename T>
    inline typename std::enable_if<is_simple<T>::value, int>::type
    CopyD(const void* buf, std::vector<T>& obj) {
        int size = *static_cast<const int*>(buf) - sizeof(int);
        // resize() yields exactly size / sizeof(T) elements; capacity() may be larger
        obj.resize(size / sizeof(T));
        memcpy(obj.data(), static_cast<const char*>(buf) + sizeof(int), size);
        return size + sizeof(int);
    }

    template<typename T>
    inline typename std::enable_if<!is_simple<T>::value, int>::type
    CopyD(const void* buf, std::vector<T>& objects) {
        int size = *static_cast<const int*>(buf);
        int remainedSize = size - sizeof(int);
        while (remainedSize != 0) {
            T obj;
            remainedSize -= CopyD(static_cast<const char*>(buf) + size - remainedSize, obj);
            objects.push_back(obj);
            if (remainedSize < 0)
                throw ApproxException(L"     .");
        }
        return size;
    }

    template<typename saveData>
    inline typename std::enable_if<std::is_base_of<AbstractSaveData, saveData>::value, int>::type
    CopyD(const void* buf, saveData& obj) {
        AbstractSaveData* data = dynamic_cast<AbstractSaveData*>(&obj);
        const int size = *static_cast<const int*>(buf);
        data->Deserialize(static_cast<const char*>(buf) + sizeof(int), size - sizeof(int));
        return size;
    }
    //Copy Deserialization methods end

    template<typename simpleType>
    inline typename std::enable_if<is_simple<simpleType>::value, int>::type
    reqSize(simpleType)const {
        return sizeof(simpleType);
    }

    template<typename simpleType>
    inline typename std::enable_if<is_simple<simpleType>::value, int>::type
    reqSize(const void*, simpleType)const {
        return sizeof(simpleType);
    }

    template<typename saveData>
    inline typename std::enable_if<std::is_base_of<AbstractSaveData, saveData>::value, int>::type
    reqSize(const saveData& Data)const {
        return Data.SerializedSize() + sizeof(int);
    }

    inline int reqSize(const std::string& obj)const {
        return obj.size() + sizeof(int);
    }

    template<class T>
    inline typename std::enable_if<is_simple<T>::value, int>::type
    reqSize(const std::vector<T>& obj)const {
        return obj.size() * sizeof(T) + sizeof(int);
    }

    template<class T>
    inline typename std::enable_if<!(is_simple<T>::value), int>::type
    reqSize(const std::vector<T>& objects)const {
        int res = 0;
        for (auto obj : objects) {
            res += reqSize(obj);
        }
        return res + sizeof(int);
    }

    inline int reqSize(const std::pair<const void*, int>& obj)const {
        return obj.second + sizeof(int);
    }

    // For variable-size objects the size is read from the header in the buffer.
    template<typename notSimpleType>
    inline typename std::enable_if<!(is_simple<notSimpleType>::value), int>::type
    reqSize(const void* buf, const notSimpleType&)const {
        return *static_cast<const int*>(buf);
    }

    template<class T>
    inline int CalculateSize(const void* buf, const T& obj)const {
        return reqSize(buf, obj);
    }

    template<class T, class ...Ts>
    inline int CalculateSize(const void* buf, const T& obj, const Ts&... objects)const {
        int shift = reqSize(buf, obj);
        return shift + CalculateSize<Ts...>(static_cast<const char*>(buf) + shift, objects...);
    }

Usage example

And what was all this for? So that you can write code like this:

    using std::string;

    struct ShaderPart : AbstractSaveData
    {
        string Str_code;
        Shader_Type Shader_Type = ST_NONE;
        string EntryPoint;
        vector<RuntimeBufferInfo> BuffersInfo;
        vector<int> ParamsIDs;
        vector<int> TextureSlots;

        const void* Serialize(int& size)override final
        {
            size = Serialization(Str_code, Shader_Type, EntryPoint, BuffersInfo, ParamsIDs, TextureSlots);
            return SerializedBuf();
        }

        void Deserialize(const void* buf, size_t size)override final
        {
            Deserialization(buf, size, Str_code, Shader_Type, EntryPoint, BuffersInfo, ParamsIDs, TextureSlots);
        }

        int SerializedSize()const override final
        {
            return CalculateSize(Str_code, Shader_Type, EntryPoint, BuffersInfo, ParamsIDs, TextureSlots);
        }
    };

In principle, to avoid repeating the same list of variables in every method and to reduce the risk of the parameter lists getting out of sync (which would have very bad consequences), you could add tuple support and pass a single tuple to all the methods instead of an argument list.
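
One possible shape for that tuple idea is sketched below. It is not part of the class described in this article: `Part`, `fields`, `sizeOfField`, `sumSizes`, and `sumSizesOfTuple` are hypothetical names, `sumSizes` merely stands in for a variadic method such as CalculateSize, and `std::index_sequence` requires C++14.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>

// Stand-ins for the per-object size helpers.
inline std::size_t sizeOfField(int) { return sizeof(int); }
inline std::size_t sizeOfField(const std::string& s) { return sizeof(int) + s.size(); }

// Stand-in for a variadic method that takes the members one by one.
inline std::size_t sumSizes() { return 0; }
template <class T, class... Ts>
std::size_t sumSizes(const T& first, const Ts&... rest) {
    return sizeOfField(first) + sumSizes(rest...);
}

// Expand a tuple of references back into an argument list.
template <class Tuple, std::size_t... I>
std::size_t sumSizesOfTuple(const Tuple& t, std::index_sequence<I...>) {
    return sumSizes(std::get<I>(t)...);
}
template <class... Ts>
std::size_t sumSizesOfTuple(const std::tuple<Ts...>& t) {
    return sumSizesOfTuple(t, std::index_sequence_for<Ts...>{});
}

struct Part {
    std::string code;
    int kind = 0;
    std::string entryPoint;

    // The member list is written exactly once, as a tuple of references.
    std::tuple<const std::string&, const int&, const std::string&> fields() const {
        return std::tie(code, kind, entryPoint);
    }

    std::size_t serializedSize() const { return sumSizesOfTuple(fields()); }
};
```

Every method that needs the members would go through `fields()`, so adding or reordering a member means editing one line. Deserialization would need a second, non-const accessor returning references to mutable members.
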

My thanks to everyone who took the time to read this article, and my respect as well to those who made it to the end.

Source: https://habr.com/ru/post/263599/
