[CppCon 2018] Herb Sutter: Towards a simpler and more powerful C++

In his CppCon 2018 talk, Herb Sutter presented two lines of his work. The first is lifetime checking for variables (Lifetime), which makes it possible to detect whole classes of bugs at compile time. The second is an updated metaclasses proposal, which avoids code duplication: the behavior of a category of classes is described once and then attached to specific classes with a single line.

Preface: more = simpler?!

C++ is often accused of having a standard that grows senselessly and mercilessly. But even the most ardent conservatives will not dispute that new constructs such as range-for (a loop over a collection) and auto (at least for iterators) make code simpler. One can formulate approximate criteria that a new language extension should satisfy (at least one, ideally all) in order to simplify code in practice:

Reduce and simplify code, remove duplication (range-for, auto, lambdas, metaclasses)
Make safe code easier to write, prevent errors and special cases (smart pointers, Lifetime)
Completely replace older, less capable features (typedef → using)
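As a small illustration of the first and third criteria, the sketch below writes the same loop twice: once with typedef and an explicit iterator type, once with using, range-for and auto. The names ScoresOld, Scores, sum_old, sum_modern and check_equivalent are invented for this example and do not come from the talk.

```cpp
#include <cassert>
#include <map>
#include <string>

// Older style: typedef plus a spelled-out iterator type.
typedef std::map<std::string, int> ScoresOld;

int sum_old(const ScoresOld& scores) {
    int total = 0;
    for (ScoresOld::const_iterator it = scores.begin(); it != scores.end(); ++it) {
        total += it->second;
    }
    return total;
}

// Modern style: alias declaration, range-for and auto.
using Scores = std::map<std::string, int>;

int sum_modern(const Scores& scores) {
    int total = 0;
    for (const auto& entry : scores) {
        total += entry.second;
    }
    return total;
}

// Both versions compute the same result; only the amount of ceremony differs.
void check_equivalent(const Scores& scores) {
    assert(sum_old(scores) == sum_modern(scores));
}
```

Both aliases name the same type, so either function accepts either alias; the modern version simply has fewer places where a type can be misspelled.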

Herb Sutter singles out "modern C++": a subset of features that conform to modern coding standards (such as the C++ Core Guidelines), and treats the full standard as a "compatibility mode" that not everyone needs to know. Accordingly, as long as "modern C++" does not grow, everything is fine.

Variable lifetime checks (Lifetime)

A new group of Lifetime checks is now available in the Core Guidelines Checker for Clang and Visual C++. The goal is not absolute rigor and precision, as in Rust, but simple and fast checks within individual functions.

Basic principles of verification

From the point of view of lifetime analysis, types fall into three categories:

Value: what any Pointer can point to.
Pointer: refers to a Value but does not control its lifetime. It can dangle (a dangling pointer). Examples: T* , T& , iterators, std::experimental::observer_ptr<T> , std::string_view , gsl::span<T>
Owner: manages the lifetime of a Value. It can usually destroy its Value early. Examples: std::unique_ptr<T> , std::shared_ptr<T> , std::vector<T> , std::string , gsl::owner<T*>

A Pointer can be in one of the following states:

Point to a Value stored on the stack
Point to a Value contained "inside" some Owner
Be null
Be dangling (invalid)

Pointers and Values

For each Pointer $p$ the analysis tracks $pset(p)$: the set of Values it may point to. When a Value is destroyed, its occurrence in every $pset$ is replaced by $invalid$. When the Value behind a Pointer $p$ is accessed while $invalid \in pset(p)$, an error is reported.

    string_view s;        // pset(s) = {null}
    {
        char a[100];
        s = a;            // pset(s) = {a}
        cout << s[0];     // OK
    }                     // pset(s) = {invalid}
    cout << s[0];         // ERROR: invalid ∈ pset(s)

Annotations let you configure which operations count as accessing the Value. By default: * , -> , [] , begin() , end() .

Note that the warning is issued only when an invalid Pointer is actually accessed. If the Value is destroyed but nobody ever uses that Pointer again, everything is fine.
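The bookkeeping described in this section can be modelled in a few lines. The following is a toy runtime model only, not how the checker is implemented (the real analysis runs statically inside the compiler); the class PsetModel and its member names are invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model: Pointers and Values are identified by name.
class PsetModel {
public:
    // Models "pointer = &target": pset(pointer) becomes {target}.
    void assign(const std::string& pointer, const std::string& target) {
        psets_[pointer] = {target};
    }

    // Models a Value being destroyed: it is replaced by "invalid" in every pset.
    void destroy(const std::string& value) {
        for (auto& entry : psets_) {
            if (entry.second.erase(value) > 0) {
                entry.second.insert("invalid");
            }
        }
    }

    // An access is accepted only if "invalid" is not in pset(pointer).
    bool can_access(const std::string& pointer) const {
        auto it = psets_.find(pointer);
        return it != psets_.end() && it->second.count("invalid") == 0;
    }

private:
    std::map<std::string, std::set<std::string>> psets_;
};

// Replays the string_view example: valid inside the block, invalid after it.
void replay_example() {
    PsetModel model;
    model.assign("s", "a");
    assert(model.can_access("s"));
    model.destroy("a");
    assert(!model.can_access("s"));
}
```

As in the text, destroying a Value is never an error by itself in this model; only a later can_access query on an affected Pointer fails.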

Pointers and Owners

If a Pointer $p$ points to a Value contained inside an Owner $o$, this is written $pset(p) = \{o'\}$.

The member functions of Owners and the functions that take them are divided into:

Operations that access the Owner's Value. By default: * , -> , [] , begin() , end()
Operations that access the Owner itself and invalidate Pointers, such as v.clear() . By default, these are all other non-const operations.
Operations that access the Owner itself and do not invalidate Pointers, such as v.empty() . By default, these are all const operations.

The Owner's old contents are declared $invalid$ when the Owner is destroyed or when an invalidating operation is applied to it.

These rules are enough to detect many typical bugs in C++ code:

    string_view s;          // pset(s) = {null}
    string name = "foo";
    s = name;               // pset(s) = {name'}
    cout << s[0];           // OK
    name = "bar";           // pset(s) = {invalid}
    cout << s[0];           // ERROR

    vector<int> v = get_ints();
    int* p = &v[5];         // pset(p) = {v'}
    v.push_back(42);        // pset(p) = {invalid}
    cout << *p;             // ERROR

    std::string_view s = "foo"s;
    cout << s[0];                   // ERROR
    // Step by step:
    std::string_view s = "foo"s     // pset(s) = {"foo"s'}
    ;                               // pset(s) = {invalid}

    vector<int> v = get_ints();
    for (auto i = v.begin(); i != v.end(); ++i) {   // pset(i) = {v'}
        if (*i == 2) {
            v.erase(i);                             // pset(i) = {invalid}
        }                                           // pset(i) = {v', invalid}
    }                                               // ERROR: ++i
    for (auto i = v.begin(); i != v.end(); ) {
        if (*i == 2)
            i = v.erase(i);                         // OK
        else
            ++i;
    }

    std::optional<std::vector<int>> get_data();
    // Assume that get_data() != nullopt
    for (int value : *get_data())                       // ERROR
        cout << value;                                  // *get_data() refers into a temporary that is already gone
    for (int value : std::vector<int>(*get_data()))     // OK
        cout << value;
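For comparison, here is one way the first three examples could be rewritten so that no Pointer outlives the contents it refers to. This is only a sketch of possible fixes, not something shown in the talk, and the function names are invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Re-acquire the view after the Owner has been modified.
char first_char_after_rename() {
    std::string name = "foo";
    std::string_view s = name;
    name = "bar";   // the old view is now invalid
    s = name;       // the view refers to the new contents again
    return s[0];
}

// Keep an index instead of a pointer across push_back.
int element_after_growth(std::vector<int> v, std::size_t index) {
    v.push_back(42);   // may reallocate, but an index stays meaningful
    return v[index];
}

// Give the temporary string a named Owner that outlives the view.
char first_char_of_literal() {
    using namespace std::string_literals;
    std::string owner = "foo"s;
    std::string_view s = owner;
    return s[0];
}

// Quick self-check of the three rewrites.
void check_fixes() {
    assert(first_char_after_rename() == 'b');
    assert(first_char_of_literal() == 'f');
}
```

The common idea is that the Pointer is either refreshed after every invalidating operation or replaced by something that cannot dangle, such as an index or a named Owner.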

Tracking the lifetime of function parameters

When we deal with C++ functions that return Pointers, we can only guess at the relationship between the lifetimes of the parameters and the return value. If a function takes and returns Pointers to the same type, it is assumed that the function "derives" the return value from one of its input parameters:

    auto f(int* p, int* q) -> int*;       // pset(ret) = {p', q'}
    auto g(std::string& s) -> char*;      // pset(ret) = {s'}

Suspicious functions that produce their result out of nowhere are easy to detect:

    std::reference_wrapper<int> get_data() {
        int i = 3;
        return {i};     // pset(ret) = {i'}
    }                   // pset(ret) = {invalid}

Since a temporary can be passed to const T& parameters, such parameters are not taken into account, except when there is nowhere else for the result to come from:

    template <typename T>
    const T& min(const T& x, const T& y);   // pset(ret) = {x', y'}
    // min takes only const T& parameters,
    // so there is nowhere else for the result to come from

    auto x = 10, y = 2;
    auto& bad = min(x, y + 1);   // pset(bad) = {x, temp}
                                 // pset(bad) = {x, invalid}
    cout << bad;                 // ERROR

    using K = std::string;
    using V = std::string;
    const V& find_or_default(const std::map<K, V>& m, const K& key, const V& def);
    // pset(ret) = {m', key', def'}

    std::map<K, V> map;
    K key = "foo";
    const V& s = find_or_default(map, key, "none");
    // pset(s) = {map', key', temp} ⇒ pset(s) = {map', key', invalid}
    cout << s;   // ERROR
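One way to avoid the dangling reference in the find_or_default example is to give the default a named Owner and copy the result out while everything is still alive. The sketch below assumes an obvious implementation of find_or_default (the article only declares it); lookup_with_named_default is a name invented here.

```cpp
#include <cassert>
#include <map>
#include <string>

using K = std::string;
using V = std::string;

// Assumed implementation: the article shows only the declaration.
const V& find_or_default(const std::map<K, V>& m, const K& key, const V& def) {
    auto it = m.find(key);
    return it != m.end() ? it->second : def;
}

// Safe caller: the default lives in a named variable, and the result is
// returned by value, so no reference escapes the scope of its Owner.
V lookup_with_named_default(const std::map<K, V>& m, const K& key) {
    const V none = "none";
    const V& s = find_or_default(m, key, none);
    return s;   // copied while m, key and none are all still alive
}

// Quick self-check with one present and one missing key.
void check_lookup() {
    std::map<K, V> m{{"foo", "bar"}};
    assert(lookup_with_named_default(m, "foo") == "bar");
    assert(lookup_with_named_default(m, "missing") == "none");
}
```

Returning V by value costs a copy, but it removes the question of which argument the result depends on.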

It is also assumed that if a function takes a pointer (rather than a reference), the pointer may be nullptr, and it must not be dereferenced before it has been compared with nullptr.
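A minimal sketch of what this rule expects from ordinary code, using two invented functions: the pointer version tests for nullptr before dereferencing, while the reference version needs no test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A pointer parameter may be null, so it is tested before any dereference.
std::size_t length_or_zero(const std::string* s) {
    if (s == nullptr) {
        return 0;
    }
    return s->size();
}

// A reference parameter cannot be null, so no test is needed.
std::size_t length_of(const std::string& s) {
    return s.size();
}

// Quick self-check of both variants.
void check_lengths() {
    std::string text = "abc";
    assert(length_or_zero(nullptr) == 0);
    assert(length_or_zero(&text) == 3);
    assert(length_of(text) == 3);
}
```

Choosing a reference parameter when null is not a meaningful input removes the check altogether.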

Conclusion on lifetime checking

To repeat: Lifetime is not yet a proposal for the C++ standard but a bold attempt to implement lifetime checks in C++, where, unlike in Rust for example, the corresponding annotations have never existed. There will be many false positives at first, but the heuristics will improve over time.

Questions from the audience

Do the Lifetime checks give a mathematically exact guarantee that there are no dangling pointers?

In theory, one could (in new code) put a lot of annotations on classes and functions, and in return the compiler would give such guarantees. But these checks were designed on the 80:20 principle: most errors can be caught with a small number of rules and a minimum of annotations.

Metaclasses

A metaclass augments the code of the class it is applied to, and also serves as a name for a group of classes that satisfy certain conditions. For example, as shown below, the interface metaclass makes all functions public and pure virtual for you.

Last year Herb Sutter presented his first metaclasses proposal. The proposed syntax has changed since then.

First of all, the syntax for using a metaclass has changed:

    // Before
    interface Shape {
        int area() const;
        void scale_by(double factor);
    };
    // Now
    class(interface) Shape { … }

It has become longer, but there is now a natural syntax for applying several metaclasses at once: class(meta1, meta2) .

Metaclass description

Previously, a metaclass was a set of rules for modifying a class. Now a metaclass is a constexpr function that takes the old class (as declared in the code) as input and creates a new one.

Specifically, the function takes one parameter, the meta-information about the old class (the parameter's type is implementation-dependent), creates class members (fragments), and then adds them to the body of the new class with the __generate statement.

Fragments can be produced with the __fragment , __inject and idexpr(…) constructs. The speaker chose not to dwell on their purpose, since this part will change before it is presented to the standardization committee. The names themselves are guaranteed to change; the double underscore was added precisely to make that clear. The talk focused on the examples that follow.

interface

    template <typename T>
    constexpr void interface(T source) {
        // Every interface gets a virtual destructor.
        // It is written as ~X, where X stands for the name of the generated class.
        __generate __fragment struct X {
            virtual ~X() noexcept {}
        };

        // compiler.require is used instead of static_assert
        // inside constexpr functions.
        compiler.require(source.variables().empty(),
            "interfaces may not contain data members");

        // member_functions() presumably returns something like tuple<…>, hence for...
        for... (auto f : source.member_functions()) {
            // Check that f is not a copy/move operation
            compiler.require(!f.is_copy() && !f.is_move(),
                "interfaces may not copy or move; consider a virtual clone()");

            // Make the function public by default
            if (f.has_default_access())
                f.make_public();                     // (1)

            // Check that the author did not ask for protected/private
            compiler.require(f.is_public(),
                "interface functions must be public");

            // Make the function pure virtual
            f.make_pure_virtual();                   // (2)

            // Add f to the body of the new class
            __generate f;
        }
    }

You might think that lines (1) and (2) modify the original class, but they do not. Note that we iterate over the functions of the original class by copy, modify these copies, and then insert them into the new class.

Using the metaclass:

    class(interface) Shape {
        int area() const;
        void scale_by(double factor);
    };

    // Expands to:
    class Shape {
    public:
        virtual ~Shape() noexcept {}
    public:
        virtual int area() const = 0;
    public:
        virtual void scale_by(double factor) = 0;
    };
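Until metaclasses exist, the expanded form can be written by hand, and some of the properties the interface metaclass is meant to enforce can be checked with standard type traits. The sketch below does that in today's C++; the Square class is a hypothetical implementation added only to exercise the interface.

```cpp
#include <cassert>
#include <type_traits>

// Hand-written equivalent of the expanded Shape class.
class Shape {
public:
    virtual ~Shape() noexcept {}
    virtual int area() const = 0;
    virtual void scale_by(double factor) = 0;
};

// Two properties of an interface, verified at compile time.
static_assert(std::is_abstract<Shape>::value,
              "an interface cannot be instantiated");
static_assert(std::has_virtual_destructor<Shape>::value,
              "an interface needs a virtual destructor");

// Hypothetical implementation, used only to exercise the interface.
class Square : public Shape {
public:
    explicit Square(int side) : side_(side) {}
    int area() const override { return side_ * side_; }
    void scale_by(double factor) override {
        side_ = static_cast<int>(side_ * factor);
    }
private:
    int side_;
};

// Uses the object only through the interface.
int scaled_area(Shape& shape, double factor) {
    shape.scale_by(factor);
    return shape.area();
}
```

Unlike the metaclass, these static_assert checks have to be repeated for every interface, which is exactly the duplication the proposal aims to remove.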

Mutex Debugging

Suppose we have non-thread-safe data protected by a mutex. Debugging becomes easier if, in a debug build, every access checks whether the current thread holds this mutex. A simple TestableMutex class was written for this:

    class TestableMutex {
    public:
        void lock()    { m.lock(); id = std::this_thread::get_id(); }
        void unlock()  { id = std::thread::id{}; m.unlock(); }
        bool is_held() { return id == std::this_thread::get_id(); }
    private:
        std::mutex m;
        std::atomic<std::thread::id> id;
    };

Next, in our class MyData we would like every public field such as

    vector<int> v;

to be replaced with a field plus a getter:

    private:
        vector<int> v_;
    public:
        vector<int>& v() { assert(m_.is_held()); return v_; }

Similar transformations can be applied to functions as well.

Such tasks are usually solved with macros and code generation. Herb Sutter has declared war on macros: they are unsafe, ignore semantics, namespaces, and so on. Here is what the solution looks like with metaclasses:

    constexpr void guarded_with_mutex() {
        __generate __fragment class {
            TestableMutex m_;
            // lock, unlock
        };
    }

    template <typename T, typename U>
    constexpr void guarded_member(T type, U name) {
        auto field = …;
        __generate field;
        auto getter = …;
        __generate getter;
    }

    template <typename T>
    constexpr void guarded(T source) {
        guarded_with_mutex();
        for... (auto o : source.member_variables()) {
            guarded_member(o.type(), o.name());
        }
    }

How to use it:

    class(guarded) MyData {
        vector<int> v;
        Widget* w;
    };

    MyData& x = findData("foo");
    x.v().clear();   // assertion failed: m_.is_held()
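To make the intended result concrete, here is a hand-written approximation of what class(guarded) MyData might produce for the field v, together with a correct caller. It is a sketch under assumptions: the forwarding lock/unlock members and the helper append_and_count are invented here, and the Widget* field is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

class TestableMutex {
public:
    void lock()   { m.lock(); id = std::this_thread::get_id(); }
    void unlock() { id = std::thread::id{}; m.unlock(); }
    bool is_held() const { return id.load() == std::this_thread::get_id(); }
private:
    std::mutex m;
    std::atomic<std::thread::id> id{std::thread::id{}};
};

// Roughly what the guarded metaclass is meant to generate for one field.
class MyData {
public:
    void lock()   { m_.lock(); }
    void unlock() { m_.unlock(); }
    bool is_held() const { return m_.is_held(); }

    std::vector<int>& v() {
        assert(m_.is_held());   // fires in debug builds if the caller forgot to lock
        return v_;
    }
private:
    TestableMutex m_;
    std::vector<int> v_;
};

// Correct use: hold the lock for the whole access to v().
std::size_t append_and_count(MyData& data, int value) {
    std::lock_guard<MyData> hold(data);
    data.v().push_back(value);
    return data.v().size();
}
```

Because MyData exposes lock() and unlock(), it works directly with std::lock_guard, and the assertion in v() documents the locking contract at the point of access.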

actor

Fine, we have protected an object with a mutex; now everything is thread-safe and there are no complaints about correctness. But if the object is often accessed by many threads in parallel, the mutex will be contended, and acquiring it will carry a large overhead.

The fundamental solution to the problem of buggy mutexes is the actor concept: an object has a request queue, and all calls to the object are put into the queue and executed one after another on a dedicated thread.

Let the class Active contain the implementation of all this: essentially a thread pool / executor with a single thread. Metaclasses then help get rid of duplicated code and put all operations into the queue:

    class(active) ImageFilter {
    public:
        ImageFilter(std::function<void(Buffer*)> w) : work(std::move(w)) {}
        void apply(Buffer* b) { work(b); }
    private:
        std::function<void(Buffer*)> work;
    };

    // Expands to:
    class ImageFilter {
    public:
        ImageFilter(std::function<void(Buffer*)> w) : work(std::move(w)) {}
        void apply(Buffer* b) { a.send([=] { work(b); }).join(); }
    private:
        std::function<void(Buffer*)> work;
        Active a;
    };

    class(active) log {
        std::fstream f;
    public:
        void info(…) { f << …; }
    };
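The article does not show the Active class itself. The following is one possible minimal sketch of such a helper, assuming a single worker thread and a FIFO queue; here send returns a std::future<void> (an assumption of this sketch, not the API from the talk), and Counter is a hypothetical hand-written stand-in for a class(active) type.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical Active helper: one worker thread runs queued tasks in order.
class Active {
public:
    Active() : worker_([this] { run(); }) {}

    ~Active() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();   // tasks still in the queue run before the thread exits
    }

    Active(const Active&) = delete;
    Active& operator=(const Active&) = delete;

    // Enqueue a task; the returned future becomes ready once it has run.
    std::future<void> send(std::function<void()> task) {
        assert(task != nullptr);
        auto packaged =
            std::make_shared<std::packaged_task<void()>>(std::move(task));
        std::future<void> result = packaged->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push([packaged] { (*packaged)(); });
        }
        ready_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;   // done_ is set and nothing is left
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();   // executed outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    bool done_ = false;
    std::thread worker_;   // declared last: the other members exist before it starts
};

// Hand-written stand-in for an active class: every operation goes through the queue.
class Counter {
public:
    void add(int n) { a_.send([this, n] { total_ += n; }).wait(); }
    int total() {
        int t = 0;
        a_.send([this, &t] { t = total_; }).wait();
        return t;
    }
private:
    int total_ = 0;
    Active a_;   // last member: its thread is joined before total_ is destroyed
};
```

Waiting on every future, as Counter does, keeps the calls synchronous like the expanded ImageFilter::apply above; a caller that does not need the result could drop the wait and let the work proceed in the background.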

property

Properties exist in almost every modern programming language, and there have been countless attempts to implement them on top of C++: Qt, C++/CLI, all sorts of ugly macros. However, they will never be added to the C++ standard, since on their own they are considered too narrow a feature, and there has always been hope that some proposal would provide them as a special case. Well, they can be implemented with metaclasses!

    // We write:
    class X {
    public:
        class(property<int>) WidthClass { } width;
    };

    // We get:
    class X {
    public:
        class WidthClass {
            int value;
            int get() const;
            void set(const int& v);
            void set(int&& v);
        public:
            WidthClass();
            WidthClass(const int& v);
            WidthClass& operator=(const int& v);
            operator int() const;
            // Move support as well
            WidthClass(int&& v);
            WidthClass& operator=(int&& v);
        } width;
    };

You can define your own getter and setter:

    class Date {
    public:
        class(property<int>) MonthClass {
            int month;
            auto get() { return month; }
            void set(int m) { assert(m > 0 && m < 13); month = m; }
        } month;
    };

    Date date;
    date.month = 15;   // assertion failed

Ideally, one would like to write property int month { … } , but even this implementation can replace the zoo of C++ extensions that invent properties.
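For comparison, a library-only approximation of a property is possible in current C++, at the cost of more boilerplate at the point of use. The sketch below is such an approximation, not the generated code from the talk: the Property template and its Validator parameter are invented here, and the setter throws instead of asserting so that the rejection is observable in any build mode.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical approximation of a property: a value plus a validating setter.
template <typename T>
class Property {
public:
    using Validator = std::function<bool(const T&)>;

    explicit Property(T initial, Validator ok = nullptr)
        : value_(std::move(initial)), ok_(std::move(ok)) {}

    // Setter: assignment from T goes through the validator, if there is one.
    Property& operator=(const T& v) {
        if (ok_ && !ok_(v)) throw std::out_of_range("property value rejected");
        value_ = v;
        return *this;
    }

    // Getter: implicit conversion to T.
    operator T() const { return value_; }

private:
    T value_;
    Validator ok_;
};

struct Date {
    Property<int> month{1, [](const int& m) { return m > 0 && m < 13; }};
};

// Returns true if assigning the given month is rejected by the setter.
bool month_is_rejected(Date& date, int month) {
    try {
        date.month = month;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With this in place, date.month = 12 is accepted and date.month = 15 is rejected, mirroring the Date example above, but each property carries a std::function, an overhead that generated code would not need.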

Metaclass Conclusion

Metaclasses are a big new feature for an already complex language. Are they worth it? Here are some of their advantages:

Let programmers express their intent more clearly (I want to write an actor)
Reduce code duplication and simplify developing and maintaining code that follows specific patterns
Eliminate some groups of common mistakes (all the subtleties need to be handled only once)
Make it possible to get rid of macros? (Herb Sutter is in a very militant mood)

Questions from the audience

How to debug metaclasses?

At least for Clang there is an intrinsic function which, when called, prints the actual contents of the class at compile time, that is, what results after all metaclasses have been applied.

There used to be talk of being able to declare non-members such as swap and hash in metaclasses. Where did that go?

The syntax will be refined.

Why do we need metaclasses if concepts (Concepts) have already been accepted for standardization?

These are different things. Metaclasses are needed to define parts of a class, while concepts check whether a class matches a certain pattern by looking at example usages of the class. In fact, metaclasses and concepts combine well. For example, you can define an iterator concept and a "typical iterator" metaclass that defines some redundant operations in terms of the others.

Source: https://habr.com/ru/post/425873/
