1. Introduction
In this article I will try to investigate and debunk five popular myths about C++:
1. To understand C++, you must first learn C.
2. C++ is an object-oriented programming language.
3. Reliable programs require garbage collection.
4. To achieve efficiency, you must write low-level code.
5. C++ is suitable only for large, complex programs.
If you or your colleagues believe these myths, this article is for you. Each of them has been true for someone, for some task, at some point in time. However, with today's C++ and ISO C++ 2011 compilers, they are mere myths.
I call these myths popular because I hear them often. Occasionally they are backed by arguments, but more often they are simply stated as axioms. Often they are used to dismiss C++ as a possible solution to a problem.
Each myth deserves a book of its own, but here I will confine myself to stating each one and briefly giving my arguments against it.
2. Myth 1: To understand C++, you must first learn C
No. It is easier to learn the basics of programming in C++ than in C. C is almost a subset of C++, but it is not the best subset to learn first, because C lacks the type safety and the convenient libraries that C++ offers to simplify simple tasks. Consider a simple function that composes an email address:
string compose(const string& name, const string& domain) { return name+'@'+domain; }
It is used like this:
string addr = compose("gre","research.att.com");
Naturally, in a real program, not all arguments will be string literals.
In the C version, you have to manipulate characters and memory directly:
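char* compose(const char* name, const char* domain)
{
    char* res = malloc(strlen(name)+strlen(domain)+2);   /* room for both strings, '@', and the terminating 0 */
    char* p = strcpy(res,name);
    p += strlen(name);
    *p = '@';
    strcpy(p+1,domain);
    return res;
}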
It is used like this:
char* addr = compose("gre","research.att.com");
/* ... */
free(addr);   /* release the memory when done */
Which version would you rather teach? Which is easier to use? Did I get the C version right? Are you sure? Why?
And finally, which version of compose() is likely to be more efficient? The C++ version, because it does not need to count the characters in its arguments and typically does not use dynamic memory for short strings.
2.1 Learning C++
This is not some strange, exotic example. In my opinion, it is typical. So why do so many teachers preach the "C first" approach? Because:
- they have always done so
- the curriculum requires it
- that is how they learned it themselves
- C is smaller than C++, so it is assumed to be simpler
- students will have to learn C sooner or later anyway
But C is not the simplest or most useful subset of C++. Once you know enough C++, C is easy to learn. If you learn C first, you will run into many mistakes that are easily avoided in C++, and you will spend time learning how to avoid them. For a sound approach to learning C++, see my book Programming: Principles and Practice Using C++. It even has a chapter at the end on how to use C. It has been used successfully to teach many students. To simplify learning, its second edition uses C++11 and C++14.
Thanks to C++11, C++ has become friendlier to beginners. For example, here is a standard-library vector initialized with a sequence of elements:
vector<int> v = {1,2,3,5,8,13};
In C++98, we could use initializer lists only for arrays. In C++11, we can define a constructor that accepts a {} list for any type. We can loop over the vector:
for (int x : v) test(x);
test() is called for each element of v.
A range-for loop can traverse any sequence, so we could simply write:
for (int x : {1,2,3,5,8,13}) test(x);
In C++11, we tried to make simple things simple, naturally without sacrificing performance.
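The two C++11 features just mentioned can be tried together in a few lines. The following is only a minimal sketch: Samples is a hypothetical user-defined type (not from the article) that accepts a {} list through a std::initializer_list constructor, and sum_of_list() loops directly over a braced list.

```cpp
#include <initializer_list>
#include <vector>

// Hypothetical user-defined type that accepts a {} list, as vector does.
class Samples {
    std::vector<int> data;
public:
    Samples(std::initializer_list<int> init) : data(init) {}

    // Range-for over the stored elements.
    int total() const
    {
        int t = 0;
        for (int x : data) t += x;
        return t;
    }
};

// Range-for written directly over a braced list.
int sum_of_list()
{
    int t = 0;
    for (int x : {1, 2, 3, 5, 8, 13}) t += x;
    return t;
}
```

With these definitions, Samples s = {1,2,3,5,8,13}; compiles, and s.total() and sum_of_list() both add up the same six numbers.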
3. Myth 2: C++ is an object-oriented programming language
No. C++ supports OOP and other programming styles, but it is not limited to any one of them. It supports a synthesis of styles, including OOP and generic programming. More often than not, the best solution to a problem uses several styles. "Best" here means shortest, most understandable, most efficient, most maintainable, and so on.
This myth leads people to conclude that they do not need C++ (as opposed to C) unless they want large class hierarchies with many virtual functions. Those who believe the myth also reproach C++ for not being purely object-oriented: if you equate "good" with "object-oriented," then C++, which contains much that is not object-oriented, must be "bad." Either way, the myth provides an excuse not to learn C++.
Example:
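void rotate_and_draw(vector<Shape*>& vs, int r)
{
    for_each(vs.begin(),vs.end(), [r](Shape* p) { p->rotate(r); });   // rotate all elements of vs (r is captured)
    for (Shape* p : vs) p->draw();                                    // draw all elements of vs
}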
Is this OOP? Of course: it relies on a class hierarchy and virtual functions. Is it generic programming? Of course: it uses a parameterized container (vector) and the generic function for_each. Is it functional programming? Sort of: it uses a lambda (the [] construct). So what style is it? It is modern C++11 style.
I used both the range-for loop and the for_each library algorithm just to show what is possible. In real code, I would use only one kind of loop, either of them.
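To see the mix of styles run, the function needs a Shape hierarchy, which the article does not show. The sketch below assumes a hypothetical Shape with virtual rotate() and draw() and a hypothetical Circle that overrides rotate(); the members angle and drawn exist only so that the effect can be observed.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical Shape hierarchy; the article does not give its definition.
struct Shape {
    int angle = 0;        // accumulated rotation, in degrees
    bool drawn = false;   // set by draw() so the call can be observed
    virtual void rotate(int r) { angle += r; }
    virtual void draw() { drawn = true; }
    virtual ~Shape() = default;
};

struct Circle : Shape {
    void rotate(int) override { /* rotating a circle changes nothing */ }
};

void rotate_and_draw(std::vector<Shape*>& vs, int r)
{
    // Generic algorithm plus lambda; r must be captured to be used in the body.
    std::for_each(vs.begin(), vs.end(), [r](Shape* p) { p->rotate(r); });
    // Range-for plus virtual call.
    for (Shape* p : vs) p->draw();
}
```

Calling rotate_and_draw() on a vector holding a Shape and a Circle rotates only the former, because the virtual call selects Circle::rotate() at run time, while both get drawn.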
3.1 Generic programming
Want more general code? After all, the function above works only for vectors of pointers to Shapes. What about lists and built-in arrays? What about smart pointers, such as shared_ptr and unique_ptr? What about objects that are not called Shape but can still draw() and rotate()? Consider:
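template<typename Iter>
void rotate_and_draw(Iter first, Iter last, int r)
{
    for_each(first,last,[r](const auto& p) { p->rotate(r); });   // rotate all elements of [first:last)
    for (auto it = first; it!=last; ++it) (*it)->draw();         // draw all elements of [first:last)
}
This works for any sequence of pointer-like elements that can be iterated over. It is the style of the standard-library algorithms. I used auto to avoid naming the type of the interface to the shape-like objects. auto is a C++11 feature meaning "use the type of the expression used as the initializer," so in the for loop the type of it is deduced to be the type of first. Using auto for the parameter of a lambda is a C++14 feature.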
Another example:
void user(list<unique_ptr<Shape>>& lus, Container<Blob>& vb)
{
    rotate_and_draw(lus.begin(),lus.end(),45);   // rotate by an illustrative 45 degrees
    rotate_and_draw(begin(vb),end(vb),45);
}
Here, Blob is some graphical type that has draw() and rotate() operations, and Container is some container type. The standard-library list (std::list) has begin() and end() member functions that help you traverse its sequence of elements. That is nice, classical OOP. But what if Container does not support the standard notion of iterating over a half-open sequence, [b:e)? What if it has no begin() and end() members? Well, I have never met anything container-like that could not be traversed, so we can define free-standing begin() and end() functions for it. The standard library provides these for C-style arrays, so if Container is a C-style array, the problem is solved.
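The free-standing begin() and end() for built-in arrays are std::begin and std::end from <iterator>. As a small sketch, count_elements() below is a hypothetical helper (not from the article) written in the [first:last) style; it is applied once to a built-in array and once to a std::list.

```cpp
#include <iterator>
#include <list>

// Counts the elements of the half-open sequence [first:last).
template<typename Iter>
int count_elements(Iter first, Iter last)
{
    int n = 0;
    for (auto it = first; it != last; ++it) ++n;
    return n;
}

int count_in_array()
{
    int a[] = {1, 2, 3, 5, 8, 13};
    return count_elements(std::begin(a), std::end(a));   // free-standing begin/end
}

int count_in_list()
{
    std::list<int> l {1, 2, 3};
    return count_elements(l.begin(), l.end());           // member begin/end
}
```

The same algorithm serves both containers; only the way the two iterators are obtained differs.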
3.2 Adaptation
Now consider a harder case: what if Container holds pointers to objects and has a different model for access and traversal? For example, suppose it must be accessed like this:
for (auto p = c.first(); p!=nullptr; p=c.next()) { }
This style is not uncommon. It can be mapped to a [b:e) sequence like this:
template<typename T> struct Iter {
    T* current;
    Container<T>& c;
};

template<typename T> Iter<T> begin(Container<T>& c) { return Iter<T>{c.first(),c}; }   // start of the sequence
template<typename T> Iter<T> end(Container<T>& c) { return Iter<T>{nullptr,c}; }       // one past the end
template<typename T> Iter<T>& operator++(Iter<T>& p) { p.current = p.c.next(); return p; }   // advance in place
template<typename T> T* operator*(Iter<T> p) { return p.current; }
template<typename T> bool operator!=(Iter<T> a, Iter<T> b) { return a.current!=b.current; }  // needed to test against end()
Such a modification is non-intrusive: I did not have to change Container or its class hierarchy to map it to the traversal model supported by the C++ standard library. This is adaptation, not refactoring. I chose this example to show that such generic programming techniques are not restricted to the standard library. They also do not fit the definition of "object-oriented."
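To check that such an adapter is enough for a range-for loop, it needs a concrete container to adapt. The sketch below assumes a hypothetical Container with first() and next() members (its internals are invented for illustration) and a hypothetical Blob with rotate(). The operators take the iterator by reference where it is modified, and operator!= is included because the loop compares against end().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container with a first()/next() traversal model.
template<typename T>
class Container {
    std::vector<T*> items;
    std::size_t pos = 0;
public:
    void add(T* p) { items.push_back(p); }
    T* first() { pos = 0; return items.empty() ? nullptr : items[0]; }
    T* next()  { return ++pos < items.size() ? items[pos] : nullptr; }
};

// Non-intrusive adapter: Container itself is left unchanged.
template<typename T>
struct Iter {
    T* current;
    Container<T>& c;
};

template<typename T> Iter<T> begin(Container<T>& c) { return Iter<T>{c.first(), c}; }
template<typename T> Iter<T> end(Container<T>& c)   { return Iter<T>{nullptr, c}; }
template<typename T> Iter<T>& operator++(Iter<T>& p) { p.current = p.c.next(); return p; }
template<typename T> T* operator*(Iter<T> p) { return p.current; }
template<typename T> bool operator!=(Iter<T> a, Iter<T> b) { return a.current != b.current; }

// Hypothetical element type.
struct Blob {
    int angle = 0;
    void rotate(int r) { angle += r; }
};

void rotate_all(Container<Blob>& c, int r)
{
    for (Blob* p : c) p->rotate(r);   // finds the free-standing begin() and end()
}
```

Because the traversal state lives inside the container, this adapter supports only one traversal at a time; that is a property of the first()/next() model, not of the adapter.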
The idea that C++ code must be object-oriented (that is, use hierarchies and virtual functions everywhere) can seriously damage performance. If you need run-time resolution among a set of types, that approach is a good one, and I use it often. However, it is relatively rigid (not every related type fits into a hierarchy), and a virtual function call inhibits inlining, which can cost as much as a factor of 50 in speed in simple cases.
4. Myth 3: Reliable programs require garbage collection
Garbage collection does a good, but not perfect, job of reclaiming unused memory. It is not a panacea. Memory can be retained indirectly, and many resources are not plain memory. Example:
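class Filter {   // takes input from one file and produces output on another
public:
    Filter(const string& iname, const string& oname);   // constructor: opens both files
    ~Filter();                                          // destructor
    // ...
private:
    ifstream is;   // input stream
    ofstream os;   // output stream
    // ...
};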
The Filter constructor opens two files. After that, the Filter performs some task on input from one file and writes the result to the other. The task could be hardwired into Filter, supplied as a lambda, or supplied as a function provided by a derived class that overrides a virtual function. For resource management, those details do not matter. A Filter can be created like this:
void user()
{
    Filter flt {"books","authors"};
    Filter* p = new Filter{"novels","favorites"};
    // use flt and *p
    delete p;
}
From a resource management perspective, the problem is how to ensure that the files are closed and that the resources associated with the two streams are properly reclaimed for later reuse.
The usual solution in systems that rely on garbage collection is to eliminate the delete and the destructor (garbage-collected languages rarely have destructors, and finalizers are best avoided because they can be logically tricky and can hurt performance). The garbage collector can reclaim all the memory, but we still have to close the files and release any non-memory resources (such as locks) associated with the streams. So memory is reclaimed automatically, but the management of other resources is manual and therefore open to leaks and errors.
The common and recommended approach in C++ is to rely on destructors to ensure that resources are released. Typically, resources are acquired in constructors, which gives the technique its name, "Resource Acquisition Is Initialization" (RAII). In user(), the destructor for flt implicitly invokes the destructors for the streams is and os. They, in turn, close the files and release the resources associated with the streams. The delete does the same for *p.
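The acquire-in-constructor, release-in-destructor pattern can be observed without touching real files. In the sketch below, everything is hypothetical: Handle stands in for a non-memory resource such as a file or a lock, open_handles counts how many are currently held, and this Filter is a stand-in that owns two handles the way the article's Filter owns two streams.

```cpp
// Hypothetical non-memory resource: a counter of "open handles" stands in
// for files or locks so that acquisition and release can be observed.
int open_handles = 0;

class Handle {
public:
    Handle()  { ++open_handles; }                 // acquire in the constructor
    ~Handle() { --open_handles; }                 // release in the destructor
    Handle(const Handle&) = delete;               // a handle is not copyable
    Handle& operator=(const Handle&) = delete;
};

// Stand-in for the article's Filter: two members that each own a resource.
class Filter {
    Handle in;    // plays the role of the input stream
    Handle out;   // plays the role of the output stream
};

int handles_in_use_inside_scope()
{
    Filter flt;            // acquires two handles
    return open_handles;   // the value is read before flt is destroyed
}                          // flt's destructor releases both handles here
```

Inside the function two handles are held; once it returns, the count is back to where it started, with no explicit cleanup code at the call site.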
Experienced C++ users will notice that user() is clumsy and error-prone. This would be better:
void user2()
{
    Filter flt {"books","authors"};
    unique_ptr<Filter> p {new Filter{"novels","favorites"}};
    // use flt and *p
}
Now, on exit from user2(), *p is released automatically. The programmer cannot forget to do it. unique_ptr is a standard-library class that ensures its resource is released, with no run-time or space overhead compared to built-in pointers.
However, this solution is still a bit verbose (Filter is repeated), and separating the construction of the ordinary pointer (with new) from that of the smart pointer (unique_ptr) inhibits some optimizations. We can improve on it with the C++14 helper function make_unique, which constructs an object of a specified type and returns a unique_ptr to it:
void user3()
{
    Filter flt {"books","authors"};
    auto p = make_unique<Filter>("novels","favorites");
    // use flt and *p
}
Or even better, unless we really need the second Filter to be accessed through a pointer:
void user4()
{
    Filter flt {"books","authors"};
    Filter flt2 {"novels","favorites"};
    // use flt and flt2
}
This is shorter, simpler, clearer, and faster.
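The lifetime behavior of the smart-pointer and local-variable versions can be checked with a small experiment. This is only a sketch: Tracked is a hypothetical class that counts its live instances in live_objects, and the function names are invented for illustration. std::make_unique requires C++14.

```cpp
#include <memory>

int live_objects = 0;   // hypothetical counter, used only to observe lifetimes

struct Tracked {
    Tracked()  { ++live_objects; }
    ~Tracked() { --live_objects; }
};

int live_with_unique_ptr()
{
    auto p = std::make_unique<Tracked>();   // no explicit new, no explicit delete
    return live_objects;                    // read while p still owns the object
}                                           // p's destructor deletes the object here

int live_with_local()
{
    Tracked t;                              // no pointer at all
    return live_objects;                    // read while t is still in scope
}                                           // t is destroyed here
```

In both functions exactly one object is alive inside the scope and none remains after the call returns.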
But what does Filter's destructor do? It releases the resources owned by the Filter, that is, it closes the files (by invoking their destructors). That happens implicitly, so unless Filter needs something more, we can drop the explicit mention of the destructor and let the compiler generate it. So all we need to write is:
class Filter {   // takes input from one file and produces output on another
public:
    Filter(const string& iname, const string& oname);
    // ...
private:
    ifstream is;
    ofstream os;
    // ...
};
This notation is simpler than what you would write in most garbage-collected languages (such as Java or C#), and it is not open to leaks caused by forgetful programmers. It is also faster than the obvious alternatives.
This is my ideal of resource management. It handles not only memory but also other resources, such as files, threads, and locks. But is it really general? What about objects that do not have a single obvious owner?
4.1 Transfer of ownership: move
Consider the problem of transferring objects between scopes. The question is how to get a lot of information out of a scope without unnecessary copying and without error-prone use of pointers. The traditional approach is to use a pointer:
X* make_X()
{
    X* p = new X;
    // ... fill *p ...
    return p;
}

void user()
{
    X* q = make_X();
    // ... use *q ...
    delete q;
}
Now, who is responsible for deleting the object? In this simple case, it is the caller of make_X(), but in general the answer is not so obvious. What if make_X() keeps a cache of objects to minimize allocation overhead? What if user() passes the pointer on to some other_user()? There are many opportunities for confusion, and leaks are not uncommon with this programming style. We could use shared_ptr or unique_ptr to state ownership of the object explicitly:
unique_ptr<X> make_X();
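With this signature, ownership is visible in the types. The sketch below fills in hypothetical details: X is given a single member value, the number 42 is arbitrary, and user() and other_user() are written here only to show the hand-over.

```cpp
#include <memory>
#include <utility>

struct X { int value = 0; };   // hypothetical contents

std::unique_ptr<X> make_X()
{
    auto p = std::make_unique<X>();
    p->value = 42;     // "fill" the object; the number is arbitrary
    return p;          // ownership moves to the caller
}

int other_user(std::unique_ptr<X> q)   // taking by value means taking ownership
{
    return q->value;
}                                      // the X is destroyed here

int user()
{
    auto q = make_X();
    return other_user(std::move(q));   // explicit hand-over; q is empty afterwards
}
```

No delete appears anywhere, and passing the pointer on without std::move would not compile, so the transfer of responsibility cannot happen by accident.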
But why use a pointer at all? Often it is not needed, and often it distracts from the conventional use of the object. For example, a Matrix addition function creates a new object (the sum) from its two arguments, but returning a pointer leads to odd code:
unique_ptr<Matrix> operator+(const Matrix& a, const Matrix& b);
Matrix res = *(a+b);
The * is needed to get the sum itself rather than a pointer to it. What I really want is an object, not a pointer to an object. Small objects are cheap to copy, and I would not think of using a pointer for them:
double sqrt(double);
On the other hand, objects that hold a lot of data are usually handles to that data. istream, string, vector, list, and thread each use only a few words of data to give access to potentially much more data. Let us return to Matrix addition. What we want is:
Matrix operator+(const Matrix& a, const Matrix& b);
That is easily done:
Matrix operator+(const Matrix& a, const Matrix& b)
{
    Matrix res;
    // ... fill res with the element sums ...
    return res;
}
By default, this copies the elements of res into the variable that receives the result (call it r). However, since res is about to be destroyed and the memory holding its elements freed, there is no need to copy: we can "steal" the elements. This has been possible since the first days of C++, but it was hard to implement and the technique was not widely understood. C++11 directly supports "stealing the representation" in the form of move operations that transfer ownership. Consider a simple two-dimensional matrix of doubles:
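class Matrix {
    double* elem;   // pointer to the elements
    int nrow;       // number of rows
    int ncol;       // number of columns
public:
    Matrix(int nr, int nc);              // constructor: allocate the elements
    Matrix(const Matrix&);               // copy constructor
    Matrix& operator=(const Matrix&);    // copy assignment
    Matrix(Matrix&&);                    // move constructor
    Matrix& operator=(Matrix&&);         // move assignment
    ~Matrix() { delete[] elem; }         // destructor: free the elements
    // ...
};
A copy operation is recognized by its reference (&) argument. A move operation is recognized by its rvalue reference (&&) argument. A move operation is supposed to "steal" the representation and leave an "empty object" behind. For Matrix, that means something like:
Matrix::Matrix(Matrix&& a)                       // move constructor
    : elem{a.elem}, nrow{a.nrow}, ncol{a.ncol}   // "steal" the representation
{
    a.elem = nullptr;                            // leave an "empty object" behind
    a.nrow = a.ncol = 0;
}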
That is all. When the compiler sees return res; it knows that res is about to be destroyed and will not be used after the return. It therefore uses the move constructor, rather than the copy constructor, to transfer the return value. So, for
Matrix r = a+b;
res inside operator+() becomes empty. Its destructor is left with very little work, and r now owns the elements that belonged to res. We have got the elements of the result (possibly megabytes of memory) out of the function operator+() and into a variable, and we have done so at minimal cost.
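The effect of "stealing" can be observed directly. The following is a sketch under stated assumptions, not the article's full Matrix: the accessors rows(), empty(), and at() and the helper make_filled() are hypothetical additions, and the copy operations are deleted on purpose so that any accidental copy would fail to compile.

```cpp
#include <utility>

class Matrix {
    double* elem;   // pointer to the elements
    int nrow;       // number of rows
    int ncol;       // number of columns
public:
    Matrix(int nr, int nc)
        : elem{new double[nr * nc]()}, nrow{nr}, ncol{nc} {}   // zero-initialized elements

    Matrix(const Matrix&) = delete;              // copying left out to keep the sketch short
    Matrix& operator=(const Matrix&) = delete;

    Matrix(Matrix&& a) noexcept                  // move constructor: steal the representation
        : elem{a.elem}, nrow{a.nrow}, ncol{a.ncol}
    {
        a.elem = nullptr;                        // leave an empty object behind
        a.nrow = a.ncol = 0;
    }

    Matrix& operator=(Matrix&& a) noexcept       // move assignment
    {
        if (this != &a) {
            delete[] elem;                       // release what this object held
            elem = a.elem; nrow = a.nrow; ncol = a.ncol;
            a.elem = nullptr; a.nrow = a.ncol = 0;
        }
        return *this;
    }

    ~Matrix() { delete[] elem; }                 // deleting a null pointer does nothing

    // Hypothetical accessors, added so the state can be inspected.
    int rows() const { return nrow; }
    bool empty() const { return elem == nullptr; }
    double& at(int r, int c) { return elem[r * ncol + c]; }
};

// Hypothetical helper: builds an n-by-n matrix with every element set to value.
Matrix make_filled(int n, double value)
{
    Matrix res{n, n};
    for (int r = 0; r < n; ++r)
        for (int c = 0; c < n; ++c) res.at(r, c) = value;
    return res;   // moved or elided, never copied (copying is deleted)
}
```

After Matrix b = std::move(a); the source a is empty and b holds the elements, which is the same hand-over that happens implicitly for return res;.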
C++ experts will point out that in some cases a good compiler can eliminate the copy on return entirely. However, that depends on the implementation, and I do not like the performance of simple things to depend on how clever a particular compiler is. Moreover, a compiler that can eliminate the copy can equally well eliminate the move. What we have here is a simple, reliable, and general way of eliminating the complexity and cost of moving a lot of information from one scope to another.
In addition, move semantics also works for assignment, so for
r = a+b;
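we get the move optimization for the assignment operator. Optimizing away an assignment is much harder for a compiler than optimizing a return.
Often we do not even need to define all those copy and move operations. If a class is composed of members that behave as desired, we can simply rely on the operations generated by default. Example:
class Matrix {
    vector<double> elem;   // the elements
    int nrow;              // number of rows
    int ncol;              // number of columns
public:
    Matrix(int nr, int nc)   // constructor: allocate nr*nc elements
        : elem(nr*nc), nrow{nr}, ncol{nc}
    { }
    // ...
};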
This version behaves like the previous one, except that it handles errors better and has a slightly larger representation (a vector is usually three words).
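That the default operations do the right thing can be checked with a short experiment. In this sketch, size() and at() are hypothetical accessors added for inspection; the class declares no copy operations, no move operations, and no destructor.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class Matrix {
    std::vector<double> elem;   // the elements
    int nrow;                   // number of rows
    int ncol;                   // number of columns
public:
    Matrix(int nr, int nc) : elem(nr * nc), nrow{nr}, ncol{nc} {}

    // No copy, move, or destructor declared: the compiler generates them
    // member by member.

    // Hypothetical accessors, added so the state can be inspected.
    std::size_t size() const { return elem.size(); }
    double& at(int r, int c) { return elem[r * ncol + c]; }
};
```

Copying such a Matrix duplicates the elements, and move-constructing from it leaves the source with an empty vector. One difference from the hand-written version is that the generated move simply copies the int members, so nrow and ncol of the moved-from object keep their old values.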
What about objects that are not handles? If they are small, like an int or a complex, do not worry. Otherwise, make them handles or return them through the smart pointers unique_ptr and shared_ptr. Do not use "naked" new and delete operations. Unfortunately, a Matrix like the one in the example is not part of the ISO C++ standard library, but several matrix libraries are available. For example, search for "Origin Matrix Sutton" and see Chapter 29 of The C++ Programming Language (Fourth Edition) for comments on its implementation.
Part 2