Initializer lists in C++: the good, the bad, and the ugly

In this article I would like to talk about how braced initializer lists work in C++, which problems they were designed to solve, which problems they caused in turn, and how to stay out of trouble.

First of all, I suggest you put yourself in the shoes of a compiler (or a language lawyer) and work out whether the following examples compile, why, and what they do:

Classic:

    std::vector<int> v1{5};
    std::vector<int> v2(5);
    std::vector<int> v3({5});
    std::vector<int> v4{5};
    std::vector<int> v5 = 5;

Modern C++ is a safe language, I will never shoot myself in the foot:

    std::vector<std::string> x( {"a", "b"} );
    std::vector<std::string> y{ {"a", "b"} };

More braces for the god of braces!

    std::vector<std::vector<int>> v1{{{{{}}}}};

If one constructor doesn't fit, we take the next one, right?

    struct T{};
    struct S {
        S(std::initializer_list<int>);
        S(double, double);
        S(T, T);
    };

    int main() {
        S{T{}, T{}};
        S{1., 2.};
    }

Almost Always Auto, they said. It improves readability, they said:

    auto x = {0}; // what is the type of x?
    auto y{0};    // and the type of y?

Hello from ancient times:

    struct S { std::vector<int> a, b; };
    struct T { std::array<int, 2> a, b; };

    int main() {
        T t1{{1, 2}, {3, 4}};
        T t2{1, 2, 3, 4};
        T t3{1, 2};
        S s1{{1, 2}, {3, 4}};
        S s2{1, 2, 3, 4};
        S s3{1, 2};
    }

Everything clear? Or nothing at all? Then read on.

Disclaimers

This article is introductory, does not claim to be complete, and will often sacrifice correctness for the sake of clarity. On the other hand, the reader is assumed to have a basic knowledge of C++.
I tried to come up with sensible Russian translations for the English terms, but with some of them I failed completely. So I will call syntactic constructs like {...} braced-init-lists, the standard library type std::initializer_list, and the kind of initialization where we write something like int x{5} list-initialization, also known as uniform initialization syntax or universal initialization syntax.

Attention!

First of all, I want to draw your attention to one important observation. Even if it is the only thing you take away from this article and you are too lazy to read the rest, my mission here is accomplished.

So: braced-init-lists (the things in curly braces, like {1, 2, 3}, that is, uniform initialization syntax) and std::initializer_list are different things! They are closely related and interact in all sorts of subtle ways, but each can exist perfectly well without the other.

But first, a little background.

Unicorn initialization syntax

In C++98 (and its bugfix update, C++03), initialization had plenty of problems and inconsistencies. Here are some of them:

The syntax for initializing variables (including arrays and structs) with curly braces came from C, but it did not interact well with C++ features (for example, struct-style initialization was not available for C++ classes).
Often you want to build a container (for example, std::vector) from elements known in advance. The language had no built-in facility for this, and library solutions (Boost.Assign) had inelegant syntax, were not free at run time, and did compile times no favors.
When initializing primitive types, it is easy to lose information by accident through a narrowing conversion, for example by assigning a double to an int.
The most vexing parse, which people love to use to scare C++ novices.

Therefore, during the development of C++11, the following idea was born: let's make it possible to initialize anything with curly braces:

Where this was already possible in C, the new syntax works the same way, only better.
Narrowing conversions are forbidden.
And if we initialize a class that has constructors, a constructor is called with the given arguments.
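
The three points above, plus the most vexing parse, can be sketched roughly as follows. Point, Celsius and brace_basics_demo are names I made up for this illustration; nothing here comes from a library.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types, used only for this illustration.
struct Point { int x; int y; };              // C-style aggregate

class Celsius {                              // class with a constructor
public:
    explicit Celsius(double degrees) : degrees_(degrees) {}
    double degrees() const { return degrees_; }
private:
    double degrees_;
};

inline void brace_basics_demo() {
    // 1. Where C-style initialization worked, braces work too.
    Point p{1, 2};
    int numbers[]{10, 20, 30};
    assert(p.x + p.y + numbers[2] == 33);

    // 2. Narrowing: the old syntax truncates silently, braces reject it.
    double d = 3.9;
    int truncated = d;           // compiles, the value becomes 3
    // int rejected{d};          // ill-formed: narrowing double -> int
    assert(truncated == 3);

    // 3. For a class with constructors, braces call a constructor.
    Celsius body{36.6};
    assert(body.degrees() == 36.6);

    // Most vexing parse: empty parentheses declare a function,
    // empty braces create a value-initialized object.
    Point declared_function();
    Point object{};
    static_assert(std::is_function<decltype(declared_function)>::value,
                  "Point f(); is a function declaration");
    assert(object.x == 0 && object.y == 0);
}
```

Calling brace_basics_demo() from main runs the checks; some compilers warn about the vexing-parse line, which is rather the point.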

Pitfalls

It would seem that we could stop here: container initialization should now work by itself, because C++11 also introduced variadic templates, so if we write a variadic constructor... actually, no, that will not work:

Such a constructor must be a template, which is often undesirable.
A separate constructor has to be instantiated for every combination of argument types and counts, which leads to code bloat and slow compilation.
Initialization efficiency, for example for std::vector, would still not be ideal.
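
To make the objections concrete, here is a rough sketch of what the variadic-constructor approach might look like. VariadicBag is a hypothetical class invented for this illustration; it is not how any standard container is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container that takes its elements through a variadic
// constructor instead of std::initializer_list.
template <typename T>
class VariadicBag {
public:
    // The constructor has to be a template, and every distinct combination
    // of argument types and argument count is a separate instantiation.
    template <typename... Args>
    explicit VariadicBag(Args&&... args) {
        items_.reserve(sizeof...(Args));
        // C++11-compatible pack expansion: append the arguments one by one.
        int expand[] = {0, (items_.push_back(T(std::forward<Args>(args))), 0)...};
        (void)expand;
    }

    std::size_t size() const { return items_.size(); }
    const T& operator[](std::size_t i) const { return items_[i]; }

private:
    std::vector<T> items_;
};

// Both objects hold three ints, yet (int, int, int) and (int, long, double)
// instantiate two different constructors.
inline std::size_t two_instantiations() {
    VariadicBag<int> a(1, 2, 3);
    VariadicBag<int> b(1, 2L, 3.0);
    assert(a[2] == 3 && b[2] == 3);
    return a.size() + b.size();
}
```

Note also that the elements are appended one at a time, which is the efficiency concern from the last point.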

To solve these problems, std::initializer_list was invented: a "magic" class that is a very light wrapper around an array of elements of known size and that can be constructed from a braced-init-list.

Why is it "magic"? For exactly the reasons described above, it cannot be constructed efficiently in user code, so the compiler creates it in a special way.

Why is it needed? Mainly so that user-defined classes can say "I want to be constructible from a braced-init-list of elements of such-and-such type" without needing a template constructor.

(By the way, by this point it should be clear that std::initializer_list and braced-init-list are different concepts.)
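
A minimal sketch of such a user-defined class. IntBag, sum_of and int_bag_demo are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <numeric>

// Hypothetical class, used only to show the shape of such a constructor.
class IntBag {
public:
    // A non-template constructor: one function handles any number of ints.
    IntBag(std::initializer_list<int> values)
        : count_(values.size()),
          sum_(std::accumulate(values.begin(), values.end(), 0)) {}

    std::size_t count() const { return count_; }
    int sum() const { return sum_; }

private:
    std::size_t count_;
    int sum_;
};

// std::initializer_list is a light view: it offers size(), begin() and
// end(), and copying it does not copy the underlying elements.
inline int sum_of(std::initializer_list<int> values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

inline void int_bag_demo() {
    IntBag bag{1, 2, 3};           // the compiler builds the list from the braces
    assert(bag.count() == 3);
    assert(bag.sum() == 6);
    assert(sum_of({4, 5}) == 9);   // a braced-init-list as a function argument
}
```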

Is everything all right now? We just add a vector(std::initializer_list<T>) constructor to our container and it all works? Almost.

Consider this declaration:

 std::vector<int> v{5};

What is meant here: v(5) or v({5})? In other words, do we want a vector of 5 elements, or a vector with a single element whose value is 5?

To resolve this conflict, overload resolution (choosing the right function based on the arguments passed) for list-initialization happens in two stages:

First, only constructors with a single parameter of type std::initializer_list are considered (this is one of the main places where the compiler generates a std::initializer_list from the contents of the curly braces). Overload resolution is performed among them.
If no such constructor is viable, everything proceeds as usual: the braced-init-list is expanded into an argument list and overload resolution is performed among all available constructors.
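
The two stages can be observed with std::vector itself. This is only a sketch; two_stage_demo is a name I made up.

```cpp
#include <cassert>
#include <string>
#include <vector>

inline void two_stage_demo() {
    // Stage 1 succeeds: vector(std::initializer_list<int>) is viable.
    std::vector<int> one_element{5};
    assert(one_element.size() == 1 && one_element[0] == 5);

    // Parentheses are not list-initialization: vector(size_type) is called.
    std::vector<int> five_elements(5);
    assert(five_elements.size() == 5 && five_elements[0] == 0);

    // The same split with two arguments.
    std::vector<int> two_values{5, 7};     // elements 5 and 7
    std::vector<int> five_sevens(5, 7);    // five copies of 7
    assert(two_values.size() == 2 && five_sevens.size() == 5);

    // Stage 1 fails: an int cannot become a std::string, so no
    // initializer_list<std::string> can be built. Stage 2 then picks
    // vector(size_type), and we get five empty strings.
    std::vector<std::string> five_strings{5};
    assert(five_strings.size() == 5 && five_strings[0].empty());
}
```

The last case is the one that tends to surprise: the same braces mean "one element" for vector<int> and "five elements" for vector<std::string>.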

Note that a constructor that lost in the first stage may well be chosen in the second. This explains the example from the beginning of the article with the excess of braces when initializing a vector. For clarity, let's remove one level of template nesting and replace std::vector with our own class:

    #include <initializer_list>

    template<typename T>
    struct vec {
        vec(std::initializer_list<T>);
    };

    int main() {
        vec<int> v1{{{}}};
    }

In the first stage our constructor is not viable: {{{}}} does not look like a std::initializer_list<int>, because an int cannot be initialized from {{}}. In the second stage the outer braces are stripped and the argument becomes {{}}, a list with a single element {}, and {} is a perfectly good zero-initialization of an int, so the constructor is accepted.

Amusingly, though, a narrowing conversion is not sufficient reason to discard a constructor: in the following example the first constructor is selected in the first stage of overload resolution and then causes a compilation error. Whether that is good or bad I do not know; to me it is simply surprising.

    struct S {
        S(std::initializer_list<int>);
        S(double, double);
    };

    int main() {
        S{1., 2.};
    }
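
For comparison, here is a sketch of the forms that do compile, with a class that records which constructor was selected. Picker and narrowing_demo are hypothetical names.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical class that remembers which constructor was selected.
struct Picker {
    enum class Chosen { list, doubles };
    Chosen chosen;

    Picker(std::initializer_list<int>) : chosen(Chosen::list) {}
    Picker(double, double) : chosen(Chosen::doubles) {}
};

inline void narrowing_demo() {
    // Braces with ints: the initializer_list<int> constructor wins in stage 1.
    Picker from_ints{1, 2};
    assert(from_ints.chosen == Picker::Chosen::list);

    // Parentheses never build a list from the arguments,
    // so Picker(double, double) is called.
    Picker from_doubles(1., 2.);
    assert(from_doubles.chosen == Picker::Chosen::doubles);

    // Picker{1., 2.};  // ill-formed: the list constructor is still selected
    //                  // in stage 1, and only then is narrowing diagnosed
}
```

So the way out in non-generic code is simply to use parentheses when the non-list constructor is the one you mean.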

A similar problem, with a much scarier result, arises in the vector-of-strings example from the beginning of the article. Unfortunately, std::string has a constructor that treats two pointers passed to it as the beginning and end of a string. The consequences of this for string literals are obviously dire, while syntactically the code looks very similar to the correct version and may well show up, for example, in generic code.
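
A sketch of what is going on, with the dangerous line left as a comment because executing it is undefined behavior. string_range_demo is a made-up name.

```cpp
#include <cassert>
#include <string>
#include <vector>

inline void string_range_demo() {
    // The constructor in question: two pointers are taken as [first, last).
    const char* text = "hello";
    std::string prefix{text, text + 3};
    assert(prefix == "hel");

    // Correct: the parenthesized list becomes an
    // initializer_list<std::string> with two elements.
    std::vector<std::string> x({"a", "b"});
    assert(x.size() == 2 && x[0] == "a" && x[1] == "b");

    // Also correct: a single pair of braces.
    std::vector<std::string> z{"a", "b"};
    assert(z.size() == 2);

    // Dangerous: the inner braces initialize ONE std::string from the two
    // pointers "a" and "b", which do not delimit a valid range.
    // std::vector<std::string> y{ {"a", "b"} };  // undefined behavior
}
```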

Aggregate classes

Is that all now? Not quite. The old struct initialization syntax inherited from C has not gone anywhere, and you can do this:

    struct A { int i, j; };
    struct B { A a1, a2; };

    int main() {
        B b1 = {{1, 2}, {3, 4}};
        B b2 = {1, 2, 3, 4}; // brace elision
        B b3 = {{1, 2}};     // trailing initializers omitted
    }

As you can see, when initializing aggregates (roughly speaking, C-like structs; not to be confused with POD, which is about something else), you can omit nested braces and leave out some of the initializers. All of this behavior was carefully carried over into C++.
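
A small sketch that checks what each of these forms actually produces; aggregate_demo is a name I made up, A and B are the structs from the example above.

```cpp
#include <cassert>

struct A { int i, j; };
struct B { A a1, a2; };

inline void aggregate_demo() {
    B full = {{1, 2}, {3, 4}};
    B elided = {1, 2, 3, 4};     // brace elision: same result as 'full'
    assert(elided.a1.i == full.a1.i && elided.a1.j == full.a1.j);
    assert(elided.a2.i == 3 && elided.a2.j == 4);

    B partial = {{1, 2}};        // a2 has no initializer and is zeroed
    assert(partial.a1.i == 1 && partial.a1.j == 2);
    assert(partial.a2.i == 0 && partial.a2.j == 0);

    B mixed = {1, 2, 3};         // both at once: a2.j is zeroed
    assert(mixed.a2.i == 3 && mixed.a2.j == 0);
}
```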

It would seem: what nonsense, why does a modern language need this? Let's at least have the compiler warn about it, thought the developers of GCC and clang, and they would have been right, were it not for std::array being an aggregate class that contains an array. As a result, the warning about omitted nested braces fires, for obvious reasons, on code as innocent as this:

    int main() {
        std::array<int, 3> a = {1,2,3};
    }

GCC "solved" this problem by disabling the corresponding warning (-Wmissing-braces) in -Wall mode; in clang, nothing has changed for three years now.

By the way, the fact that std::array is an aggregate is not a whim of crazy standard authors or of lazy standard library developers: the required semantics of this class simply cannot be achieved by language means without losing efficiency. One more greeting from C and its weird arrays.
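
To see why the warning fires, it helps to picture std::array<int, 3> as roughly a struct wrapping int[3]. A sketch under that assumption (array_braces_demo is a made-up name):

```cpp
#include <array>
#include <cassert>

inline void array_braces_demo() {
    // Fully braced form: one pair of braces for the struct, one for the
    // array inside it.
    std::array<int, 3> fully_braced = {{1, 2, 3}};

    // Brace elision lets us drop the inner pair; this is the form the
    // missing-braces warning complains about.
    std::array<int, 3> elided = {1, 2, 3};

    assert(fully_braced == elided);
    assert(elided[0] == 1 && elided[2] == 3);
}
```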

Perhaps the biggest problem with aggregate classes is their poor interaction with generic functions (including those from the standard library). At the moment, functions that construct an object from the arguments passed to them (for example, vector::emplace_back or make_unique) use ordinary initialization, not "universal" initialization. The reason is that with list-initialization there is no sane way to call an "ordinary" constructor instead of the one taking std::initializer_list (roughly the same problem as with initialization in non-template code, except that here the user cannot work around it by calling a different constructor). Work in this direction is underway, but for now we have what we have.
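
A sketch of how this looks from the caller's side. Point and generic_construction_demo are hypothetical names; the remark about C++20 refers to parenthesized aggregate initialization, which postdates this article.

```cpp
#include <cassert>
#include <initializer_list>
#include <memory>
#include <vector>

struct Point { int x, y; };   // hypothetical aggregate without constructors

inline void generic_construction_demo() {
    // make_unique initializes with parentheses, so this is vector(5, 7):
    // five elements equal to 7, not the two-element list {5, 7}.
    auto five_sevens = std::make_unique<std::vector<int>>(5, 7);
    assert(five_sevens->size() == 5 && five_sevens->front() == 7);

    // To reach the list constructor, the caller has to spell out the
    // std::initializer_list type.
    auto two_values =
        std::make_unique<std::vector<int>>(std::initializer_list<int>{5, 7});
    assert(two_values->size() == 2);

    // For an aggregate, points.emplace_back(1, 2) needs Point(1, 2), which
    // is not valid before C++20, so the usual workaround is to build the
    // object with braces at the call site.
    std::vector<Point> points;
    points.push_back(Point{1, 2});
    assert(points.size() == 1 && points[0].y == 2);
}
```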

Almost Always Auto

How do braced-init-lists behave in combination with type deduction? What happens if I write auto x = {0}; auto y = {1, 2};? Several sensible strategies come to mind:

Forbid such initialization altogether (really, what is the programmer trying to say with it?)
Deduce the type of the first variable as int, and forbid the second form
Make both x and y have type std::initializer_list<int>

I like the last option the least (very few people in real life have local variables of type std::initializer_list), but it is the one that made it into the C++11 standard. Gradually it became clear that this causes problems for programmers (who would have thought), so a fix was added to the standard, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3922.html, which implements behavior #2... but only for direct-list-initialization (auto x{5}); for copy-list-initialization (auto x = {5}) it leaves everything as it was.

I cannot comment on this. In my opinion, this is one of the very rare cases where common sense temporarily abandoned the authors of the language. If you have something to say about it, let me know in the comments.
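
The resulting rules can be checked at compile time. The sketch below assumes a compiler that implements N3922; auto_deduction_demo is a made-up name.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

inline void auto_deduction_demo() {
    auto copy_single = {0};      // copy-list-initialization, one element
    auto copy_pair = {1, 2};     // copy-list-initialization, two elements
    auto direct_single{0};       // direct-list-initialization, one element
    // auto direct_pair{1, 2};   // ill-formed once N3922 is applied

    static_assert(std::is_same<decltype(copy_single),
                               std::initializer_list<int>>::value,
                  "auto x = {0} deduces std::initializer_list<int>");
    static_assert(std::is_same<decltype(copy_pair),
                               std::initializer_list<int>>::value,
                  "auto x = {1, 2} deduces std::initializer_list<int>");
    static_assert(std::is_same<decltype(direct_single), int>::value,
                  "auto x{0} deduces int (assumes N3922 is implemented)");

    assert(copy_single.size() == 1 && copy_pair.size() == 2);
    assert(direct_single == 0);
}
```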

Interim summary

Although uniform initialization syntax and std::initializer_list are language features added for good and valid reasons, it seems to me that, because of the eternal need for backward compatibility and some not very far-sighted decisions made early on, the whole situation around them is currently overcomplicated, contrived, and not much fun for anyone involved: the authors of the standard, compiler writers, library authors, and application developers. We wanted the best, but it turned out like in that famous comic strip.

Take, for example, the story of [over.best.ics]/4.5, which was first added to the standard, then thoughtlessly removed as redundant, and then added back in a modified form, as a description of a corner case with five (!) conditions.

Nevertheless, the feature is useful and makes life easier, so here is a short and admittedly subjective list of ways not to shoot yourself in the foot:

Spend some time getting acquainted with what is actually going on (I recommend reading the relevant paragraph of the standard: it is surprisingly understandable and does not depend much on the rest)
Do not use std::initializer_list anywhere except as a constructor parameter
And use it as a constructor parameter only if you understand what is going on (if you are not sure, it is better to construct from a vector, a pair of iterators, or a range)
Do not use aggregate classes unless absolutely necessary; better write a constructor that initializes all the fields
Do not use braced-init-lists in combination with auto
Read this article about what to do with empty initializer lists (I am itching to translate and post it, and maybe I will get around to it soon)
And, as I wrote at the very beginning, keep in mind that braced-init-list and std::initializer_list are different concepts that interact with each other in very devious ways

Let's dream

Here I finish my overview of the current state of affairs and want to ~~stir things up~~ dream a little about how things could be if we lived in an ideal world.

It seems to me that reusing curly braces to create a std::initializer_list during initialization is a language design mistake. I would be very happy if we had instead been given a more explicit, separate syntax (even an uglier one, for example some strange brackets like <$...$> or a built-in intrinsic like std::of(...)). That is, we would initialize a vector something like this: std::vector<std::vector<int>> x = std::of(std::of(1, 2), std::of(3, 4));

What would this give us? The new way of initializing (with protection against the most vexing parse and narrowing conversions) would be decoupled from std::initializer_list, there would be no need for a separate stage of overload resolution, the problem with vector<int> or vector<string> would go away, and the new initialization syntax could be used in generic code without any problems.

Of course, the drawbacks of this approach are quite serious: uglier syntax in the simplest cases, and a departure from the goal of making the syntax more uniform with C-style initialization (I am rather skeptical about such unification, but that is a topic for another conversation).

I also dislike aggregate classes. Leaving aside the problem with std::array, I see no decent justification for the existence of such a large and special language feature. The problem that programmers do not want to write trivial constructors for simple classes could be solved in less invasive ways, for example by letting them ask for a generated constructor that initializes all the fields in order:

    struct S {
        int a, b;
        S(...) = aggregate;
    };

Conclusion

Finally, let me repeat once more that I do not claim to be 100% correct or to hold the ultimate truth. Welcome to the comments if something remains unclear or if you have something to say on this rather specific topic.

Source: https://habr.com/ru/post/330402/
