Again about STL

This selection of short notes on C ++ containers emerged as a result of a request to prepare a brief STL review for beginning fellow programmers. Why was this necessary when there is original documentation on this subject and many whole separate books, of which at least 5-6 excellent quality exist in translations into Russian?
But the meaning, however, is, and here it is:

All books were published in the late 90s - early 2000s. Later standards of the C ++ language (up to C ++ 11) introduce new syntactic constructions that, when used with STL, give a cross effect ... with a very interesting interference. This allows frequent use of STL constructs with much greater ease.
Books usually describe the subject matter too detailed (this is good for students, but redundant for programmers, even if it is a junior level, which, after other languages, for example, requires only basic familiarization). The original documentation, on the contrary, is crammed with formal syntactic definitions (this is fine as a reference book at hand, but redundant for acquaintance). These notes, completely avoiding formal definitions, are built around use cases that are mostly understood by the programmer even without any additional explanation.
The contingent for which the texts were originally prepared, among other things, is largely focused on numerical mathematical methods, data processing in the stream, for which existing publications are not designed. Such a bias is also closer to me, and such an accent will be noticeable in the examples on which the description is built.

Due to the requirements of the mandatory uniformity of objects in containers, they could be called regular containers with clarification, this clarifies a lot (such clarification is not made just because everything is already clear about what we are talking about). Of course, we are talking about STL containers, but the traditional C / C ++ array is the same regular container, and they will appear in the text and examples. (Structures, and even more generally, classes with data fields are also containers, but you can’t call them regular.)

I would like to hope that these notes will be useful to someone from mastering STL, simplify this process of understanding. And the suggestions and remarks, when they will be in essence, from those readers who are already pros in C ++, will help improve these texts for the future, when they can be useful to someone else.

Static and dynamic dimension arrays

But it is necessary to begin the review of STL containers from traditional arrays, since the former are a logical development of the latter. Arrays are one of the most widely used forms of data organizations, and historically one of the very first forms of structuring that appeared in programming languages (late 50s), before the appearance of structures in languages and any other aggregation methods. An array is a representation of a set of consecutive elements of the same type, and it is also organic for an internal representation for machine computations at the command level. Fundamentally important in this definition are 2 points that for the array must be met:
')

Each element of the array can be specified by its location number in the sequence of elements similar to it.
All array elements must necessarily be of the same type . Everyone knows and understands the simplest definitions, for example, integer arrays: int array [100]. But this does not mean that only the simplest built-in types of the C ++ language can be organized into arrays - variable objects of any complex type (class) and of any degree of complexity can be organized into an array.

The only limitation is that all elements of the same array must be of the same type. For example, a student group can be described like this in the dean’s accounting program:

class student { ... } student group[ 30 ];

To this extremely important circumstance - the types of the elements of the array - we will return later.

Ever since the earliest programming languages (FORTRAN, etc.), a very strong restriction was imposed on arrays: the size of the array should be determined only by an integer constant , the value of which should be determined at the time of compiling the code. The same restriction persists in the C language, which became the progenitor of C ++. For example:

 #define size 100 double array[ size ];

In C ++ (in classical, except for the standards of recent years!), This restriction is slightly relaxed to the point that the size of the array can be an integer constant, the value of which can be calculated at the time of compiling the code. For example:

 #include <iostream> using namespace std; float array[ (int)( 10. * 10. ) + 2 ]; int main( void ) { cout << "array size = " << sizeof( array ) / sizeof( array[ 0 ] ) << endl; }

 $ ./siz1 array size = 102

In all such cases, after defining the array, its size is fixed and we will not be able to increase its size in any way (if, for example, during the calculations it turns out that we lack this size). Thus, certain arrays are called arrays with a statically declared (at the time of writing) size.

Note: Due to the fact that all the elements of the array are arranged sequentially (the first rule from those mentioned above), you can use the trick shown in the example to calculate the size of the array (in its visible area): the size of the array is equal to the length of the entire array divided by the length any of its elements (since they are all of the same type).

It may seem that the arrays are defined differently without specifying the size, but with a list of initial values:

 float array[] = { 1., 2., 3., 4., 5. };

But it is not so! Just here the constant value of the size of the declared array is extracted from the list of values, and is equal, in the shown example 5.

The main way to create an array, in classic C and C ++, with the size of N elements, calculated at the time of array creation (at the execution stage), was the way of dynamic allocation of the array. Which in C and C ++ is done, respectively:

 double *array = (double*)calloc( N, sizeof( double ) ); //  C double *array = new double[ N ]; //   C++

For example:

 #include <iostream> #include <cmath> using namespace std; int main( void ) { int size = (int)( sin( M_PI / 2 ) * pow( 2, 10 ) ); float *array = new float[ size ]; cout << "array size = " << size << endl; delete [] array; }

 $ ./siz2 array size = 1024

Note: Dynamic array placement is the main way, although there are some other ways, such as alloca () from the C API, or including zero-length extensions in extensible structures (extension of the GCC compiler) or replacing them with variable bounds ( C99):

 typedef struct vararr { // int n, data[ 0 ]; //    -  GCC int n, data[]; //     -  C99 } vararr_t;

But all these methods, judging by the frequency of their use, are only exotic in comparison with dynamic placement. And their diversity in different ways is only an indication that the static size of the array size is a strong limiting factor, and all this is the search for the removal of this limitation.

The latest standards (C99, C ++ 11) have made extensions that allow the creation of local arrays in functions with sizes calculated at the execution stage when entering a function (under such conditions the array will be allocated in the function stack). This is quite significant in cases where we cannot know in advance the size of the data being processed (the C99 standard calls this: VLA - variable-length array). As an example, let's look at the task of finding all primes not exceeding N (the sieve of Eratosthenes), where N is specified by the command line parameter when the program is started:

 #include <iostream> #include <cstdlib> using namespace std; int main( int argc, char **argv ) { unsigned k, j, n; //   bool a[ n = atoi( argv[ 1 ] ) + 1 ]; a[ 0 ] = a[ 1 ] = false; for( k = 2; k < n; k++ ) a[ k ] = true; for( k = 2; k < n; k++ ) for( j = k + k; j < n; j += k ) //     ka[ j ] = false; const int line = 10; for( k = 0, j = 0; k < n; k++ ) if( a[ k ] ) { cout << k << "\t"; if( 0 == ++j % line ) cout << endl; } if( j % line != 0 ) cout << endl; return 0; }

We are already obliged to compile this code with an indication of the C ++ standard of 2011 (using options of the compiler, or properties of the compiled project):

 $ g++ -Wall -std=c++11 -O3 erastof.cc -o erastof $ ./erastof 300 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293

But even after all extensions, the simplest array, as a form of organizing a collection of objects, is not flexible enough, the main limiting factors of which are:

No matter how the size of the array is determined (by a constant or by calculating the definition point), this size cannot be further increased (if you did not guess the required size in advance, or if you didn’t put a sufficient margin).
According to the rules of C / C ++, when calling functions instead of an array, a pointer to its beginning (the address of the 1st element) is passed as a parameter to the function. This allows you to greatly improve the efficiency of many computational algorithms, but this completely loses information about the size of the array (it is necessary, as an option) to pass a separate parameter. For example, if we wanted to form the sieve of Eratosthenes not in the main () function, but in a separate function, then we should form its call as an erastof (a, n).
Many intuitively simple operations on arrays cause difficulties, such as: in the 15-element array, insert element number 10 between elements 2 and 3. In addition, a). all elements from 3 to 9 need to be copied one position to the right, b). this can only be done in descending order from 9 to 3 and c). all indices of these operations must be monitored manually.

In order to make it easier to perceive the access mechanisms in STL containers, it is useful to first recall how this is done to the elements of the arrays. The choice (for reading or writing) of an arbitrary element of the array (let's call it arbitrarily arr []) can be chosen by two mechanisms: a). indexing operation - arr [n], or b). according to the pointer, counted from the beginning of the array - * (& arr [0] + n). The process of moving the pointer along an array, in the course of accessing its various elements, can be called an iteration, and the pointer used in this capacity is an iterator. Thus, access to the elements of any arrays can be done either by index or by iterator.

The C ++ 11 standard complements the operation of cyclic access to array elements with a new syntactic form, in the manner that the for_each () algorithm from STL does, which (using the deducibility of types from the same C ++ 11) may look quite unusual for the traditional syntax:

  for( auto i : arr ) . . . for( auto &i : arr ) . . . //

The following example shows all these features at the same time:

 #include <iostream> using namespace std; double arr[] = { 1, 2, 3, 4, 5, 6, 8, 9 }; const unsigned n = sizeof( arr ) / sizeof( arr[ 0 ] ); int main( void ) { for( unsigned i = 0; i < n; i++ ) cout << arr[ i ] << " "; cout << endl; for( double* i = arr; i - arr < (int)n; i++ ) cout << *i << " "; cout << endl; for( auto i : arr ) cout << i << " "; cout << endl; for( auto &i : arr ) i = i * i; for( double i : arr ) cout << i << " "; cout << endl; }

Pay attention to the expression for (auto & i: arr), when using a reference to the elements of the array, representing the left-sided expression, which allows not only to read, but also to write values to the elements of the array:

 $ g++ -Wall -std=c++11 -O3 siz3.cc -o siz3 $ ./siz3 1 2 3 4 5 6 8 9 1 2 3 4 5 6 8 9 1 2 3 4 5 6 8 9 1 4 9 16 25 36 64 81

Postscript: Naturally, that what is described here, and even more in the subsequent parts, not only works, but at least even elementarily compiles:

When compiling examples, one should explicitly indicate the following standard C ++ 11: either by the command line option (GCC and Clang - as shown in the text), or in the properties of the compiled project (Code :: Blocks). Implicitly using C ++ 11 constructs is not yet supported.
It is necessary for your compiler to know and support the C ++ 11 standard, that is, the compiler (or the IDE as part of which) was later ... well, let's say, 2013.

The last condition is completely satisfied by all the current versions of GCC or Clang in Linux. At the request of readers, the following code examples (hereinafter) were tested in the Windows 7 virtual machine, and in the implementations of Visual Sudio 2013 and Code :: Blocks 2013. With stated support for C ++ 11 (with reservations on “incompleteness”) in Visual Sudio 2013, the support there, if any, is very limited, and the compiler is not suitable for experimenting with the examples shown. But Code :: Blocks (with MinGW) turned out to be quite suitable (although, in my firm conviction, it is necessary to learn C ++ languages, and especially C, only in the POSIX / Linux environment, and then use it in any environment ... but this is a purely personal conviction, which can not be taken into account).

From such a cursory review of arrays as containers, it is clear that practice needs often require more, which created a need for STL (Standard Template Library) containers as a means of expanding the functionality of arrays.

Source: https://habr.com/ru/post/278039/

All Articles

Again about STL

Static and dynamic dimension arrays

More articles: