Translation of the article “Pimp my Pimpl”, part 2

The first part of the article, translated by skb7, covered the Pimpl idiom (pointer to implementation), its purpose and its advantages. This second part discusses the problems that arise when the idiom is used and proposes some solutions.

Links to the original

This is a translation of the second part of an article published on the Heise Developer website. The translation of the first part can be found here. The originals of both parts (in German) are here and here.

This translation was made from the English translation.

Annotation

Much has been written about this funny-sounding idiom, also known as d-pointer, compiler firewall or Cheshire Cat. The first article in Heise Developer, which presented the classic implementation of the Pimpl idiom and its advantages, is followed by this, the second and final, article about solving some problems that inevitably arise when using the Pimpl idiom.

Part 2

Violation of const-correctness

The first nuance, which is far from obvious, concerns how the constness of an object's fields is interpreted. With the Pimpl idiom, methods access the fields of the implementation object through the d pointer:

 SomeThing & Class::someThing() const { return d->someThing; }

A careful look at this example shows that the code bypasses C++'s protection of constant objects. Because the method is declared const, the this pointer inside someThing() has type const Class *, and the d pointer accordingly has type Class::Private * const. That, however, is not enough to prevent modification of the fields of Class::Private, because d is constant but *d is not.

Remember that in C++ the position of the const modifier matters:

 const int * pci;        // pointer to const int
 int * const cpi;        // const pointer to int
 const int * const cpci; // const pointer to const int

 *pci = 1;   // error: *pci is const
 *cpi = 1;   // OK: *cpi is not const
 *cpci = 1;  // error: *cpci is const

 int i;
 pci = &i;   // OK
 cpi = &i;   // error: cpi is const
 cpci = &i;  // error: cpci is const

Thus, with the Pimpl idiom, all methods (including those declared const) can modify the fields of the implementation object. Without Pimpl, the compiler would catch such errors.
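The effect is easy to reproduce. The following is a minimal sketch, assuming C++11; the Counter class and the demo function are hypothetical names invented for this illustration and do not come from the article:

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate "shallow" constness.
class Counter {
    struct Private { int hits = 0; };
    Private * d;   // inside const methods this is 'Private * const'
public:
    Counter() : d( new Private ) {}
    ~Counter() { delete d; }
    Counter( const Counter & ) = delete;
    Counter & operator=( const Counter & ) = delete;

    int hits() const { return d->hits; }

    // Declared const, yet it modifies the implementation object.
    // The compiler accepts this because only 'd' is const, not '*d'.
    void sneakyIncrement() const { ++d->hits; }
};

// Modifies a const Counter twice and returns the resulting state.
int demo_shallow_const()
{
    const Counter c;
    assert( c.hits() == 0 );
    c.sneakyIncrement();
    c.sneakyIncrement();
    return c.hits();
}
```

Had hits been a direct member of Counter, the body of sneakyIncrement() would have been rejected at compile time.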

This hole in the type system is usually undesirable and should be closed. There are two ways to do so: a deep_const_ptr wrapper class or a pair of d_func() methods. The first is a “smart” pointer that propagates constness to the object it points to. Such a class is defined as follows:

 template <typename T>
 class deep_const_ptr {
     T * p;
 public:
     explicit deep_const_ptr( T * t ) : p( t ) {}

     const T & operator*() const { return *p; }
     T & operator*() { return *p; }

     const T * operator->() const { return p; }
     T * operator->() { return p; }
 };

By overloading const and non-const versions of operator*() and operator->(), the constness of d is propagated to *d. Replacing Private *d with deep_const_ptr<Private> d eliminates the problem completely. But this solution may be overkill: the same overloading trick can be applied directly to Class:

 class Class {
     // ...
 private:
     const Private * d_func() const { return _d; }
     Private * d_func() { return _d; }
 private:
     Private * _d;
 };
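To see the wrapper at work, here is a small sketch, assuming C++11. The deep_const_ptr template repeats the definition from the article; the Widget class and the demo function are hypothetical and exist only for this illustration:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <typename T>
class deep_const_ptr {
    T * p;
public:
    explicit deep_const_ptr( T * t ) : p( t ) {}
    const T & operator*() const { return *p; }
    T & operator*() { return *p; }
    const T * operator->() const { return p; }
    T * operator->() { return p; }
};

// Dereferencing a const wrapper yields a const reference ...
static_assert( std::is_same<
    decltype( *std::declval<const deep_const_ptr<int> &>() ),
    const int & >::value, "constness is propagated to *d" );
// ... while a non-const wrapper still gives mutable access.
static_assert( std::is_same<
    decltype( *std::declval<deep_const_ptr<int> &>() ),
    int & >::value, "non-const access stays mutable" );

// Hypothetical Pimpl class using the wrapper.
class Widget {
    struct Private { int value = 0; };
    deep_const_ptr<Private> d;
public:
    Widget() : d( new Private ) {}
    ~Widget() { delete &*d; }   // the wrapper itself does not own the object
    Widget( const Widget & ) = delete;
    Widget & operator=( const Widget & ) = delete;

    int value() const { return d->value; }     // read-only access
    void setValue( int v ) { d->value = v; }   // mutable access
    // void broken() const { d->value = 1; }   // would not compile
};

int demo_deep_const()
{
    Widget w;
    assert( w.value() == 0 );
    w.setValue( 7 );
    const Widget & cw = w;
    return cw.value();
}
```

Uncommenting broken() should produce a compile error, which is exactly the protection that a raw Private * loses.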

Now, instead of using _d in method implementations, call d_func():

 void Class::f() const {
     const Private * d = d_func();
     // use 'd' ...
 }

Of course, nothing prevents direct access to _d inside methods, which is impossible with the deep_const_ptr smart pointer. The d_func() approach therefore demands more discipline from the developer. On the other hand, deep_const_ptr can be extended to delete the Private object automatically when the Class object is destroyed. The d_func() overloads, in turn, are useful when building a hierarchy of polymorphic classes, as demonstrated later.
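An owning variant of the smart pointer might look roughly like this. It is only a sketch, assuming C++11; the name owning_deep_const_ptr, the Tracked helper and the demo function are hypothetical and not part of the article:

```cpp
#include <cassert>

// Hypothetical owning variant of deep_const_ptr: same const propagation,
// plus deletion of the pointee in the destructor.
template <typename T>
class owning_deep_const_ptr {
    T * p;
public:
    explicit owning_deep_const_ptr( T * t ) : p( t ) {}
    ~owning_deep_const_ptr() { delete p; }

    // Copying would lead to a double delete, so it is disabled here.
    owning_deep_const_ptr( const owning_deep_const_ptr & ) = delete;
    owning_deep_const_ptr & operator=( const owning_deep_const_ptr & ) = delete;

    const T & operator*() const { return *p; }
    T & operator*() { return *p; }
    const T * operator->() const { return p; }
    T * operator->() { return p; }
};

// Helper that counts live instances, to make the deletion observable.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of live Tracked objects after the owner went out of scope.
int demo_owning_ptr()
{
    {
        owning_deep_const_ptr<Tracked> d( new Tracked );
        assert( Tracked::alive == 1 );
    }
    return Tracked::alive;
}
```

One caveat with this design: T must be a complete type wherever the wrapper's destructor is instantiated, so a Pimpl class using it would still need to define its own destructor in the .cpp file, where Private is fully defined.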

Access to the container class

The next obstacle arises when the developer moves all private methods of Class into Private: these methods can no longer call other (non-static) methods of Class, because the Class -> Private association is unidirectional:

 class Class::Private {
 public:
     Private() : ... {}
     // ...
     void callPublicFunc() { /*???*/Class::publicFunc(); }
 };

 Class::Class() : d( new Private ) {}

This problem can be solved by introducing a back-link (the field name q is borrowed from the Qt code):

 class Class::Private {
     Class * const q; // back-link
 public:
     explicit Private( Class * qq ) : q( qq ), ... {}
     // ...
     void callPublicFunc() { q->publicFunc(); }
 };

 Class::Class() : d( new Private( this ) ) {}

When using a back-link, remember that d is not initialized until the Private constructor has finished. The body of the Private constructor must not call Class methods that access d; doing so results in undefined behavior.

To be safe, the developer should initialize the back-link with a null pointer and set it to the correct value only after the Private constructor has finished, in the body of the Class constructor:

 class Class::Private {
 public:
     Class * q; // back-link; not const, because it is set after construction
     explicit Private( /*Class * qq*/ ) : q( 0 ), ... {}
     // ...
 };

 Class::Class()
     : d( new Private/*( this )*/ )
 {
     // establish the back-link:
     d->q = this;
 }

Despite these limitations, a substantial part of a class's initialization code can usually be moved into the Private constructor, which matters for classes with several constructors. Note also that the q pointer (the back-link) suffers from the same const-correctness violation discussed above, which can be solved in the same way.
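Put together, the safe variant of the back-link can be sketched as a complete, compilable example. This assumes C++11; the method bodies, viaPrivate() and the demo function are invented for the illustration, and only the overall structure follows the article:

```cpp
#include <cassert>

class Class {
public:
    Class();
    ~Class();
    Class( const Class & ) = delete;
    Class & operator=( const Class & ) = delete;

    int publicFunc() const { return 42; }
    int viaPrivate();   // forwards to Private, which calls back into Class
private:
    class Private;
    Private * d;
};

class Class::Private {
public:
    Private() : q( nullptr ) {}   // back-link not yet established
    int callPublicFunc() { return q->publicFunc(); }

    Class * q;   // back-link, set by the Class constructor
};

Class::Class()
    : d( new Private )
{
    // Only now is 'd' initialized, so the back-link can be set safely.
    d->q = this;
}

Class::~Class() { delete d; }

int Class::viaPrivate() { return d->callPublicFunc(); }

int demo_back_link()
{
    Class c;
    assert( c.publicFunc() == 42 );
    return c.viaPrivate();
}
```

The price of this safety is that q can no longer be declared Class * const, since it is assigned after the Private constructor has run.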

Interim summary

Now that we have restored the functionality lost by introducing the private implementation class, the rest of the article is devoted to some “magic” that offsets the additional memory cost of the Pimpl idiom.

Increasing efficiency by reusing objects

A good C++ developer was probably skeptical on reading the description of the classic Pimpl idiom. In particular, the additional memory allocations can be very costly, especially for classes that need almost no memory themselves.

Such concerns should first be verified by profiling, but that is no reason not to look for a solution to a potential performance problem. The first part of the article already mentioned embedding class fields in the implementation object, which reduces the number of memory allocations. Below we consider another, much more advanced technique: reusing the implementation pointer.

In a hierarchy of polymorphic classes, the extra memory cost grows with the depth of the hierarchy: each class in the hierarchy has its own hidden implementation, even if it adds no new fields (for example, when a class is derived only to override virtual methods).

A developer can curb the proliferation of d pointers (and the associated memory allocations) by reusing the base-class d pointer in derived classes:

 // base.h:
 class Base {
     // ...
 public:
     Base();
 protected:
     class Private;
     explicit Base( Private * d );
     Private * d_func() { return _d; }
     const Private * d_func() const { return _d; }
 private:
     Private * _d;
 };

 // base.cpp:
 Base::Base()
     : _d( new Private )
 {
     // ...
 }

 Base::Base( Private * d )
     : _d( d )
 {
     // ...
 }

The protected constructor, in addition to the public ones, lets derived classes inject their own d pointer into the base class. The code also uses the const-correctness fix based on the d_func() methods (now also protected), which give derived classes read-only access to the _d pointer itself.

 // derived.h:
 class Derived : public Base {
 public:
     Derived();
     // ...
 protected:
     class Private;
     Private * d_func();             // hides Base::d_func()
     const Private * d_func() const; // ditto
 };

 // derived.cpp:
 Derived::Private * Derived::d_func() {
     return static_cast<Private*>( Base::d_func() );
 }

 const Derived::Private * Derived::d_func() const {
     return static_cast<const Private*>( Base::d_func() );
 }

 Derived::Derived()
     : Base( new Private ) {}

The author of Derived now uses the new Base constructor to pass a Derived::Private instead of a Base::Private to Base::_d (note the same name Private used in different contexts). The author also implements Derived's d_func() methods in terms of the Base methods, with a type cast.

For the Base constructor to accept this pointer, Base::Private must be a base class of Derived::Private:

 class Derived::Private : public Base::Private {
     // ...
 };

For Derived::Private to actually inherit from Base::Private, three conditions must be met.

First, the destructor of Base::Private must be virtual. Otherwise the behavior is undefined when the Base destructor runs and tries to delete the Derived::Private implementation object through a pointer to Base::Private.

Second, both classes must be implemented in the same library, since Private classes are usually not exported: they are not marked __declspec(dllexport) on Windows and are marked visibility=hidden in ELF binaries. Export is unavoidable, however, if Base and Derived live in different libraries. In exceptional cases the Private classes of a library's core classes are exported: Nokia's developers, for example, exported QObjectPrivate (from QtCore) and QWidgetPrivate (from QtGui), which are in heavy demand because so many classes in other modules inherit from QObject and QWidget. In doing so, however, developers create dependencies between libraries not only at the interface level but also at the level of internals, which breaks compatibility between different library versions: in general, libQtGui.so.4.5.0 will not work if the dynamic linker binds it to libQtCore.so.4.6.0.

Third, and finally, the definition of Base::Private can no longer be hidden in the base class's implementation file (base.cpp), because the definition of Derived::Private needs it. So where should the definition of Base::Private go? It could simply be put in base.h, but then what is the point of Pimpl if the internals are visible from outside? The answer is a special private header file. Qt and KDE have established the naming scheme classname_p.h for this purpose (the suffixes _priv, _i and _impl are also in use). Besides the definition of Base::Private, this private header can contain inline implementations of Base methods, for example a constructor:

 inline Base::Base( Private * d ) : _d( d ) {}

And in derived_p.h:

 inline Derived::Derived( Private * d ) : Base( d ) {}

 inline const Derived::Private * Derived::d_func() const {
     return static_cast<const Private*>( Base::d_func() );
 }

 inline Derived::Private * Derived::d_func() {
     return static_cast<Private*>( Base::d_func() );
 }

Strictly speaking, the code above violates the One Definition Rule, since d_func() is inline in files that include derived_p.h and not inline in files that do not.

In practice this is not a problem, since anyone who calls d_func() has to include derived_p.h anyway. To be safe, the affected methods can be declared inline in the definition of Derived in derived.h: modern compilers accept the inline keyword on method declarations that have no body.
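The reuse technique as a whole can be condensed into a single translation unit to make it easier to try out. The following is a sketch only, assuming C++11: the header split (base.h, base_p.h and so on) is collapsed into one file, and the value members, the live counter, livePrivates() and the demo function are hypothetical additions that make the behavior observable:

```cpp
#include <cassert>

class Base {
public:
    Base();
    virtual ~Base();
    int baseValue() const;
    static int livePrivates();   // hypothetical helper for the demo
protected:
    class Private;
    explicit Base( Private * d );   // lets derived classes inject their Private
    Private * d_func() { return _d; }
    const Private * d_func() const { return _d; }
private:
    Base( const Base & ) = delete;
    Base & operator=( const Base & ) = delete;
    Private * _d;
};

class Base::Private {
public:
    Private() { ++live; }
    virtual ~Private() { --live; }   // virtual: deleted via Base::Private *
    static int live;                 // implementation objects currently alive
    int baseValue = 1;
};
int Base::Private::live = 0;

Base::Base() : _d( new Private ) {}
Base::Base( Private * d ) : _d( d ) {}
Base::~Base() { delete _d; }
int Base::baseValue() const { return d_func()->baseValue; }
int Base::livePrivates() { return Private::live; }

class Derived : public Base {
public:
    Derived();
    int derivedValue() const;
protected:
    class Private;
    Private * d_func();
    const Private * d_func() const;
};

class Derived::Private : public Base::Private {
public:
    int derivedValue = 2;
};

Derived::Private * Derived::d_func()
{ return static_cast<Private *>( Base::d_func() ); }
const Derived::Private * Derived::d_func() const
{ return static_cast<const Private *>( Base::d_func() ); }

Derived::Derived() : Base( new Private ) {}
int Derived::derivedValue() const { return d_func()->derivedValue; }

// Returns the number of implementation objects left after a Derived is destroyed.
int demo_reuse()
{
    {
        Derived d;
        assert( Base::livePrivates() == 1 );   // one allocation for both levels
        assert( d.baseValue() == 1 && d.derivedValue() == 2 );
    }
    return Base::livePrivates();
}
```

The counter shows that a Derived object carries exactly one implementation object, and that the virtual destructor of Base::Private lets Base clean up the Derived::Private correctly.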

Developers often hide the boilerplate this technique produces behind macros. Qt, for example, defines the Q_DECLARE_PRIVATE macro for use in a class definition and the Q_D macro, which declares a pointer d in a method implementation and initializes it with a call to d_func().
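To give an idea of what such macros can look like, here is a sketch for the nested-Private naming used in this article. The macros PIMPL_DECLARE_PRIVATE, PIMPL_DEFINE_D_FUNC and PIMPL_D are hypothetical and are not Qt's actual definitions; the Shape and Circle classes are likewise invented for the illustration, and C++11 is assumed:

```cpp
#include <cassert>

// Hypothetical macros; Qt's real Q_DECLARE_PRIVATE / Q_D are defined differently.
#define PIMPL_DECLARE_PRIVATE \
    class Private; \
    Private * d_func(); \
    const Private * d_func() const;

#define PIMPL_DEFINE_D_FUNC( Class, BaseClass ) \
    Class::Private * Class::d_func() \
    { return static_cast<Private *>( BaseClass::d_func() ); } \
    const Class::Private * Class::d_func() const \
    { return static_cast<const Private *>( BaseClass::d_func() ); }

// Declares a local pointer 'd' of the given pointee type.
#define PIMPL_D( Type ) Type * const d = d_func()

class Shape {
public:
    Shape();
    virtual ~Shape();
protected:
    class Private;
    explicit Shape( Private * d );
    Private * d_func() { return _d; }
    const Private * d_func() const { return _d; }
private:
    Shape( const Shape & ) = delete;
    Shape & operator=( const Shape & ) = delete;
    Private * _d;
};

class Shape::Private {
public:
    virtual ~Private() {}
};

Shape::Shape() : _d( new Private ) {}
Shape::Shape( Private * d ) : _d( d ) {}
Shape::~Shape() { delete _d; }

class Circle : public Shape {
public:
    Circle();
    void setRadius( int r );
    int radius() const;
protected:
    PIMPL_DECLARE_PRIVATE
};

class Circle::Private : public Shape::Private {
public:
    int radius = 0;
};

PIMPL_DEFINE_D_FUNC( Circle, Shape )

Circle::Circle() : Shape( new Private ) {}
void Circle::setRadius( int r ) { PIMPL_D( Private ); d->radius = r; }
int Circle::radius() const { PIMPL_D( const Private ); return d->radius; }

int demo_macros()
{
    Circle c;
    assert( c.radius() == 0 );
    c.setRadius( 3 );
    return c.radius();
}
```

In a const method the pointee type passed to PIMPL_D has to be const Private, because the const overload of d_func() is selected there.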

One drawback remains: combining reuse of the implementation pointer with the back-link mechanism is tricky. In particular, care must be taken not to dereference (even implicitly!) the pointer to Derived that is passed to the Private constructor until all constructors in the inheritance hierarchy have finished.

 Derived::Private::Private( Derived * qq )
     : Base::Private( qq ) // OK: qq is not dereferenced
 {
     q->setFoo( ... ); // error: dereferences the back-link too early
 }

At the point of dereferencing, not only has Derived not been constructed yet, but neither has Base (and this is the difference from the non-polymorphic case described earlier), since its Private object is still being created.

In this case, just as before, the back-link should be initialized with a null pointer. Setting it to the correct value is the job of the class at the end of the hierarchy chain, that is, the class that injects its Private class into the hierarchy. For Derived, the code looks like this:

 Derived::Derived() : Base( new Private/*( this )*/ ) { d_func()->_q = this; }

If desired, the developer can move the initialization code that needs the back-link into a separate method Private::init() (which makes the construction of Private a two-stage process). This method is called (only) in the constructor of the class that itself creates the Private instance.

 Derived::Derived( Private * d )
     : Base( d )
 {
     // does NOT call d->init()!
 }

 Derived::Derived()
     : Base( new Private )
 {
     d_func()->init( this );
 }

 void Derived::Private::init( Derived * qq )
 {
     Base::Private::init( qq ); // sets _q
     // more initialization
 }

In addition, each Private class must either have its own back-link to the container class or define q_func() methods that cast the back-link of Base::Private to the right type. The corresponding code is not shown here; writing it is left as an exercise for the reader. A solution can be found on the Heise FTP server in the form of a “pimped” ¹ Shape hierarchy.
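For orientation, one possible shape of the q_func() approach with two-stage initialization is sketched below. This is not the solution from the Heise FTP server; it assumes C++11, and foo(), fooViaPrivate(), callFoo() and the demo function are hypothetical names added for the illustration:

```cpp
#include <cassert>

class Base {
public:
    Base();
    virtual ~Base();
protected:
    class Private;
    explicit Base( Private * d );   // does NOT call d->init()
    Private * d_func() { return _d; }
private:
    Base( const Base & ) = delete;
    Base & operator=( const Base & ) = delete;
    Private * _d;
};

class Base::Private {
public:
    Private() : _q( nullptr ) {}
    virtual ~Private() {}
    void init( Base * qq ) { _q = qq; }   // second construction stage
    Base * q_func() { return _q; }
private:
    Base * _q;   // the only back-link in the whole hierarchy
};

Base::Base() : _d( new Private ) { _d->init( this ); }
Base::Base( Private * d ) : _d( d ) {}
Base::~Base() { delete _d; }

class Derived : public Base {
public:
    Derived();
    int foo() const { return 7; }
    int fooViaPrivate();
protected:
    class Private;
    Private * d_func();
};

class Derived::Private : public Base::Private {
public:
    void init( Derived * qq ) { Base::Private::init( qq ); }
    // Casts the shared back-link to the type of the owning class.
    Derived * q_func()
    { return static_cast<Derived *>( Base::Private::q_func() ); }
    int callFoo() { return q_func()->foo(); }
};

Derived::Private * Derived::d_func()
{ return static_cast<Private *>( Base::d_func() ); }

// The class that creates the Private instance is the one that calls init().
Derived::Derived() : Base( new Private ) { d_func()->init( this ); }

int Derived::fooViaPrivate() { return d_func()->callFoo(); }

int demo_q_func()
{
    Derived d;
    assert( d.foo() == 7 );
    return d.fooViaPrivate();
}
```

The static_cast in q_func() is safe here only because a Derived::Private is always created by a Derived (or a class derived from it), which is the invariant the whole scheme relies on.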

Conclusions

A well-known C++ idiom, Pimpl lets developers separate interface and implementation to a degree that C++'s built-in facilities cannot. As positive side effects, developers gain faster compilation, the ability to implement transaction semantics and, through heavy use of composition, potentially faster code overall.

Not everything about d pointers is rosy: the additional Private class, the memory allocations that come with it, the violation of const-correctness and potential initialization-order errors can cause a developer plenty of grief. Solutions were proposed for all the problems listed in this article, but they require a lot of code. Because of the added complexity, a fully “pimped” Pimpl (with reuse and back-links) can be recommended only for a small number of classes or projects.

Projects that are not deterred by these difficulties, however, are rewarded with remarkable interface stability, which leaves them free to overhaul the implementation.

Sources

John Lakos; Large-Scale C++ Software Design; Addison-Wesley Longman, 1996
Herb Sutter; Exceptional C++: 47 Engineering Puzzles, Programming Problems, and Solutions; Addison-Wesley Longman, 2000
Herb Sutter, Andrei Alexandrescu; C++ Coding Standards: 101 Rules, Guidelines and Best Practices; Addison-Wesley Longman, 2004
Marc Mutz; Pimp my Pimpl; C++: Vor- und Nachteile des d-Zeiger-Idioms, Teil 1; article on heise Developer (English translation available)

Translator's notes

¹ Here and below: a play on words. Pimpl sounds like the verb “to pimp”, a reference to the TV show “Pimp My Ride”.

Source: https://habr.com/ru/post/137702/
