Again about STL: containers

In the previous article we talked about arrays as a prototype and the progenitor of containers. Now it is the turn of the actual container classes and their supporting libraries.

The term library of standard templates (STL, S tandard T emplate L ibrary) refers to a set of interfaces and components originally developed by Alexander Stepanov, Meng Lee and other employees of AT & T Bell Laboratories and Hewlett-Packard Research Laboratories in the early 90s (although later still quite a few had a hand in what has become today the standard component of C ++). Further, the STL library became the property of SGI, and was also included as a component in the Boost library set. Finally, the STL library was included in the C ++ standards of 1998 and 2003 (ISO / IEC 14882: 1998 and ISO / IEC 14882: 2003) and has since been considered one of the components of the standard C ++ libraries.

The standard does not name this part of the STL library, but this chronology would be well taken into account when dealing with which version of the compiler, language and literature you are dealing with - in the process of reducing HP STL to sizes suitable for standardization, some of the algorithms and functors dropped out of the library, and something is added over time (for example, expanding the set of redefined prototypes of some container methods). The text will use the traditional name STL only to make it clear what part of the standard C ++ library we mean.

The original purpose of STL (this is clearly seen from the chronology of comments in the header files) was to create a more flexible model of regular containers compared to arrays and generalize to them some widely used algorithms (such as search, sort, and some others). But the idea turned out to be more fruitful than the original intentions, and was significantly expanded. STL introduces a number of concepts and data structures that in almost all cases make it possible to greatly simplify the program code. The following categories of concepts are introduced:
')

Container - a way to store a set of objects in memory.
Iterator - a means of accessing the contents of individual objects in a container.
Algorithm - determination of the most standard computing procedures on containers.
Adapter - adaptation of the main categories to provide the most commonly used interfaces (such as stack or queue).
Functor (functional object) - hiding a function in an object for use by other categories.

The STL library is a very large area. Her description is devoted to entire individual books. Here, by virtue of a sufficiently initial level of acquaintance, limited volume and following the previously announced goals, the technique of using STL will be considered using intuitively clear examples. The STL syntax is based on the use of such syntax constructs of the C ++ language as templates (templates) of classes and function templates. But for the successful application of the STL technique it is not necessary to have a deep understanding of the templates technique (this may come later).

The central concept of STL, around which everything else is spinning, is a container (they also use the term collection). A container is a collection of a certain number of necessarily identical elements packed in a container in a certain way. The simplest prototype of a container in the classic C ++ language is an array. The way in which elements are packaged in a container and determines the type of container and the peculiarities of working with elements in such a container. STL introduces a variety of different types of containers, the main ones are:

consecutive containers - vector (vector), doubly-connected list (list), deck (deque);
associative containers - sets (set and multiset), hash tables (map and multimap);
pseudo containers - bitmasks (bitset), strings (string and wstring), arrays (valarray);

We will look at examples of using consistently basic ones, moving from simpler to more complex ones. The simplest type of container is a vector . The closest prototype of the vector is the C ++ array, but the vector size can be dynamically changed at any time by adding (push_back () method) or removing (for example, the pop_back () method) element. As well as for an array, we can refer to an arbitrary element of the vector by the indexing operation [n]. What has already been said is already the first, surface layer of knowledge about the vector, but which is sufficient to allow us to start working with it on the analogies of how we work with traditional arrays:

#include <iostream> #include <vector> #include <climits> using namespace std; void put( const vector<float>& v ) { cout << "capacity=" << v.capacity() << ", size=" << v.size() << " : "; for( unsigned i = 0; i < v.size(); i++ ) cout << v[ i ] << " "; cout << endl; } vector<float>& operator +=( vector<float>& v, unsigned n ) { float last = v[ v.size() - 1 ]; for( unsigned i = 0; i < n; i++ ) v.push_back( last + i + 1 ); return v; } int main( void ) { float data[] = { 1., 2., 3., 4., 5., 6., 7. }; int n = sizeof( data ) / sizeof( data[ 0 ] ); vector<float> array( data, data + n ); cout << "max_size=" << array.max_size() << " ((INT_MAX+1)/2)=" << ( (unsigned)INT_MAX + 1 ) / 2 << endl; put( array ); put( array += 2 ); put( array += 6 ); }

  Description vector <float> (this is the previously mentioned template in the class description) declares an object </ b> in the code: vector

elements of type float. Next we see such methods of the vector class as max_size () - the maximum possible length of the vectors in general (implementation constant), size () - the current size (number of elements) of the vector, capacity () - the current capacity of the vector, the maximum number of elements that can be placed in the vector in its current placement . Execution of this fragment will give something like the following (details may vary depending on the implementation):

 $ ./vect1 max_size=1073741823 ((INT_MAX+1)/2)=1073741824 capacity=7, size=7 : 1 2 3 4 5 6 7 capacity=14, size=9 : 1 2 3 4 5 6 7 8 9 capacity=28, size=15 : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Here you can see quite an interesting behavior of the vector (this is its meaning): as soon as adding the next element of the vector, its capacity becomes insufficient for one more element, a new placement of the vector is made, reserving the double capacity for it (with a margin, so that the next addition of a new element did not immediately require new relocation).

Note: Doubling the capacity of a vector with relocation shown above is a typical behavior for the implementation of the GCC compiler libraries. But the standard leaves the implementation of the exact algorithm for reserving capacity for future elements, so you cannot count on it, and it is described here only for a qualitative understanding of the picture (Visual Studio implementations behave differently, reserving only a small excess ... you will learn this yourself).

We note further, with no comments so far, the important fact that the operation of placing elements into a container performs copying of the element , which entails a). Requiring a copy constructor for the type of elements and b). For structural elements, this can lead to noticeable performance costs. .

Thus, we obtained the equivalent of a C ++ array, the size of which (size ()) dynamically changes in arbitrary limits from a few units to millions of elements. Note (this is very important) that an increase in the size of a vector is not achieved by indexation beyond its current size, but by pushing (the push_back () method) to the end of the vector (symmetrically, the pop_back () method pushes the last element from the array and reduces its size ()). Another way to change the size of a vector is to immediately call the resize () methods to the desired size. It is precisely because the size of the vector, unlike the array, can dynamically change, for the vector there are 2 different ways of indexing: as an operation [i] and as a method function at (i). They differ : the at () method checks the current size of the size () vector, and when indexing beyond its boundary, throws an exception . In contrast, the indexing operation does not check the boundary, which is not safe, but it is faster. The at () method allows us to control the output of the vector and either qualify it as a logical error, or adjust the current container size to fit the need, as in such a fragment (here, access attempts are twice as large as the actual operations performed):

 int main( void ) { vector<int> nums; for( int i = 0; i < 10; ) { try { nums.at( i ) = i; // vector::at throws an out-of-range i++; } catch( const out_of_range& ) { cout << i << " "; nums.resize( i + 1 ); } } cout << endl << nums.size() << endl; }

 $ ./vect7 0 1 2 3 4 5 6 7 8 9 10

The C ++ 11 standard introduces additional expressive means, such as, for example, initialization lists and type deducibility, which greatly simplify the work with containers (and even make old customary recording techniques unnecessary). This is how a matrix can be described when its a) are simultaneously described. configuration (square, although it may be rectangular and even triangular), b). dimension (3x3) and c). initializing values:

 void print( const vector< vector<float> >& m ) { for( auto &row : m ) { for( auto x : row ) cout << x << ' '; cout << endl; } } void trans( vector< vector<float> >& m ) { for( unsigned i = 0; i < m.size(); i++ ) for( unsigned j = i + 1; j < m[ i ].size(); j++ ) { float tmp = m[ i ][ j ]; m[ i ][ j ] = m[ j ][ i ]; m[ j ][ i ] = tmp; } } int main( void ) { vector< vector<float> > matrix = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9 } }; print( matrix ); cout << "---------" << endl; trans( matrix ); print( matrix ); }

And at the same time, work with vectors is shown here (transposition of a square matrix and output to the output stream) as with pseudo-arrays (using only indexing), which are essentially the vectors (in particular, it is shown how the type of the element is determined based on output type according to C ++ 11 standard):

 $ ./vect6 1 2 3 4 5 6 7 8 9 --------- 1 4 7 2 5 8 3 6 9

Note: Within the framework of what we already know about vectors, the question sometimes arises: how strictly should the type of the returned size () result be determined (to avoid platform dependence) and, accordingly, any variable cycles operating with the vector size? From time to time, the followers of the syntax purity follow the answer that it must be size_t, and this answer is incorrect (especially since for many platforms size_t is defined as unsigned int). If you want to write an absolutely strict definition of the size () type of a vector, then the line in the example above should be written like this:

 for( vector<float>::size_type j = i + 1; j < m[ i ].size(); j++ ) { ...

Or, relying on the derivability of C ++ 11 types, like this:

 for( auto j = i + 1; j < m[ i ].size(); j++ ) { ...

Having noted this subtle nuance here (having taken note of it), we will not use it further, in order to avoid unnecessarily cumbersome examples, but we will use unsigned for dimensions.

PS Who may be interested in this, the archive of all code examples for testing and further experiments (both the previous and this parts, and all subsequent ones) can be taken here or here , so as not to bash all this manually.

Source: https://habr.com/ru/post/278369/

All Articles

Again about STL: containers

More articles: