How to copy an array correctly and what does it have to do with SFINAE

Copy items from one container to another? There is nothing easier, the universal algorithm is placed in 5 lines:

template<class InputIterator, class OutputIterator> OutputIterator copy(InputIterator first, InputIterator last, OutputIterator result) { while(first != last) *result++ = *first++; return result; }

You may have learned the implementation of std :: copy from cplusplus.com . Why does the implementation of std :: copy from GNU STL take 139 lines? Let's see.
STL is the most basic library used by all normal C ++ programs. Therefore, it should provide the most efficient implementations of the algorithms. And effective in many ways:

Execution speed
Size generated by the compiler code
Memory consumption
Compilation speed

Execution speed

The code can be made faster if you consider the features of the types by which the template is instantiated. For example, if a type has a trivial copy constructor, it can be copied byte-by-byte. And if the objects lie in a continuous memory area, instead of repeatedly calling the constructor, you can use the good old C function memmove, which can use vector processor commands that copy data especially quickly. Please note that the implementation of std :: copy cannot use memcpy, since memcpy works only on non-overlapping memory areas.
Thus, we want to write two implementations of std :: copy: one fast, for trivially copied types, and the other universal, for all others.

SFINAE

And here comes a technique known as "substitution failure is not an error". If the substitution of template parameters results in an incorrect expression, this is not an error. The compiler should ignore the pattern and continue searching. It is important that in the case of a template function, an incorrect expression should not be found in the body of the function, when a particular template has already been selected and there is nowhere to continue the search, but in its prototype. The easiest way to do this is to use std :: enable_if.

std :: enable_if

By itself, enable_if is very simple. This is a pattern from an integral constant and type. If the constant is true, then it contains a nested type declaration named type, otherwise it is empty.

  // Define a nested type if some predicate holds. // Primary template. /// enable_if template<bool, typename _Tp = void> struct enable_if { }; // Partial specialization for true. template<typename _Tp> struct enable_if<true, _Tp> { typedef _Tp type; };

Type of

 std::enable_if<, Type>::type

will be defined and have type Type, but only if the condition is true. Use it like this:

 //      template<class T> std::enable_if<false, T>::type get_t() {return T();} //     template<class T> std::enable_if<true, T>::type get_t() {return T();}

type traits

To make everything described above make sense, you need to have constants that depend on the type. They are called type traits. These are template structures that have a static constant public property value, which is true or false, depending on the type by which the template is instantiated. Some of them are described using partial template specialization, some are implemented by the compiler itself, and some are built as expressions over other type traits.

Special implementation of std :: copy

Before the universal implementation we add the following:

 template<class T> //  gcc 4.5  std::is_trivially_copiable,      //inline typename std::enable_if<std::is_trivially_copiable<T>::value, T*>::type inline typename std::enable_if<std::is_trivial<T>::value, T*>::type copy(T* first, T* last, T* result) { const ptrdiff_t num = last - first; memmove(result, first, sizeof(T) * num); return result + num; }

If copy is invoked with three pointers to the trivially copied type, the compiler will apply this optimized implementation. Otherwise, this template will be ignored and the universal version will be used.
In a real library, everything is a bit more complicated, because the standard states that std :: copy has two template parameters. If the programmer explicitly indicates them, you still need to choose an optimized version. Therefore, various implementations are described under utility names, and in the std :: move itself there is a code that selects the most appropriate implementation. Here is a significantly simplified version:

 #include <type_traits> #include <cstring> // :     C++11. //  C++03   type_traits (     ) //      . //  is_simple,   memmove,    . //      . template<bool is_simple> struct __do_copy; //    ,   _.    //        . //        ,      //  .         std::copy template<> struct __do_copy<true> { //      //     ,      //  memmove  . template<class InputIterator, class OutputIterator> static inline OutputIterator do_copy(InputIterator first, InputIterator last, OutputIterator result) { typedef typename std::iterator_traits<InputIterator>::value_type ValueType; const std::ptrdiff_t num = last - first; memmove(result, first, sizeof(ValueType) * num); return result + num; } }; template<> struct __do_copy<false> { //   template<class InputIterator, class OutputIterator> static inline OutputIterator do_copy(InputIterator first, InputIterator last, OutputIterator result) { while(first != last) *result++ = *first++; return result; } }; //  trait      //    STL   template<typename, typename> struct are_same : public std::false_type {}; template<typename Tp> struct are_same<Tp, Tp>: public std::true_type {}; template<class InputIterator, class OutputIterator> inline InputIterator copy( InputIterator first, InputIterator last, OutputIterator result) { //  typename    ,  value_type  // , , .   . typedef typename std::iterator_traits<InputIterator>::value_type ValueTypeI; typedef typename std::iterator_traits<OutputIterator>::value_type ValueTypeO; const bool is_simple = ( std::is_trivial<ValueTypeI>::value && std::is_pointer<InputIterator>::value && std::is_pointer<OutputIterator>::value && are_same<ValueTypeI, ValueTypeO>::value); //    return __do_copy<is_simple>::do_copy(first, last, result); }

In addition, another optimization is used in GNU STL: for a random access iterator, a for loop is used to calculate the number of iterations. The compiler is able to develop such cycles, increasing the speed of the program.

Size of generated code

Please note that the template at the beginning of the article will generate a completely new code for each new type. The second pattern degenerates into one subtraction and one addition of pointers, and the memmove implementation is common to all types. That is, in addition to speeding up the program, its size decreases. Similar tricks can be used in containers: parts that are not dependent on the stored type are removed from the template and use a common implementation. For example, in the implementation of std :: list, the template structure storing data contains nothing but the data itself. References to neighbors and operations on them are implemented in the base non-template class, from which it is inherited.

Use library functions

The moral of this article is simple: use the algorithms provided by the standard library as often as possible. It makes your programs more efficient, more understandable and more flexible .
More efficiently because developers of standard libraries can apply optimizations that you cannot afford in normal code. You are not ready to write and support several implementations of copying objects for a dozen platforms? An application programmer has no time for this.
It is more understandable because you can write a short high-level code, operating with primitives familiar to your colleagues.
More flexible because you don’t have to do premature optimizations and because you use more general algorithms than you would write yourself.

PS

Note that std :: copy knows that result is not between start and finish, but uses the memmove function, which is forced to check this and select the copy direction (from beginning to end or from end to beginning). It was possible to make a slightly more optimal implementation, but then the STL developers would have to support special implementations for each of the supported architectures. The glibc developers are already spending their time on this. Nobody needs to duplicate this work.

Source: https://habr.com/ru/post/150654/

All Articles