C++, C-style casts, and the unexpected consequences of combining them

C++ inherited from C the cast written as (type)(expression), usually called a C-style cast. C++ adds four more explicit casts: static_cast, reinterpret_cast, dynamic_cast, and const_cast.

C++ is not the newest of languages, and the heated debate over which is better, the C-style cast or the right combination of *_cast operators, began long ago and continues to this day. We will not add fuel to the fire; instead, let us look at an example and let everyone decide what they like more.

Windows-specific and COM-specific constructs are mentioned here, but the same problems can arise in any sufficiently complex class hierarchy if type casts are not given enough attention.
The example is based on real code from a real open source project. One of the project's subsystems declares a class that implements several COM interfaces:

 class CInterfacesImplementor :
     public IComInterface1,
     public IComInterface2,
     public IComInterface3,
     ... (4 through 9),
     public IComInterface10
 {
     // ...
 };

Of course, in real life the interfaces have more meaningful names, but once there are close to ten of them, that does little to prevent the problem discussed below.

Recall that every COM interface derives, directly or indirectly, from IUnknown, and IUnknown contains the QueryInterface() method, which is so hard to implement correctly that Raymond Chen wrote a whole series of posts about it.

Our example is precisely the implementation of QueryInterface() in the class above. Brief background: when a developer declares a new COM interface, they must assign it a unique identifier. A caller invokes QueryInterface() to find out whether an object implements the interface with a given identifier and, if it does, to obtain a pointer of the appropriate type. The __uuidof() construct asks Visual C++ to look up, at compile time, the identifier of the interface named in the parentheses.

So…

 HRESULT STDMETHODCALLTYPE CInterfacesImplementor::QueryInterface( REFIID iid, void** ppv )
 {
     if( ppv == 0 ) {
         return E_POINTER;
     }
     if( iid == __uuidof( IUnknown ) || iid == __uuidof( IComInterface1 ) ) {
         *ppv = (IComInterface1*)this;
     } else if( iid == __uuidof( IComInterface2 ) ) {
         *ppv = (IComInterface2*)this;
     } else if( iid == __uuidof( IComInterface3 ) ) {
         *ppv = (IComInterface3*)this;
     } else if...
     ... // likewise for the remaining COM interfaces
     } else {
         // none of the supported COM interfaces was requested
         *ppv = 0;
         return E_NOINTERFACE;
     }
     AddRef();
     return S_OK;
 }

The implementation above works and is almost perfect. It checks the pointer before dereferencing it. It checks whether a known interface has been requested. It writes a null pointer before returning E_NOINTERFACE. It increments the reference count if the interface is supported. It even responds to a request for IUnknown. Raymond Chen would be pleased, were it not for one question.

Why is the cast there? Why not just write "*ppv = this;"?

With multiple inheritance, the object is composed of the subobjects of its base classes, laid out so that each subobject can be accessed separately. Suppose some function can work only with an IComInterface2*: it must be given a pointer to that subobject, not to the derived object, about which it quite possibly knows nothing.

Assigning "*ppv = this;" would hand out the address of the start of the derived object rather than the address of the required subobject. Calling a virtual method of one interface through a pointer to a different subobject will, predictably, lead to a long debugging session.

The cast in the example above is what adjusts the pointer. It is needed so that the caller receives a pointer to the right subobject.
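The pointer adjustment can be observed without any Windows headers. The sketch below is only an illustration: IFirst, ISecond, Implementor, AsSecond and AsWhole are hypothetical names, not taken from the project, and the claim that the two addresses differ relies on the object layout used by common compilers, where the second polymorphic base is placed at a non-zero offset.

```cpp
// Hypothetical stand-ins for COM interfaces; not the project's real types.
struct IFirst {
    virtual ~IFirst() = default;
    virtual int First() = 0;
};

struct ISecond {
    virtual ~ISecond() = default;
    virtual int Second() = 0;
};

// One object built from two polymorphic base subobjects.
struct Implementor : IFirst, ISecond {
    int First() override { return 1; }
    int Second() override { return 2; }
};

// Address of the ISecond subobject: the compiler adjusts the pointer.
inline void* AsSecond(Implementor* p) {
    return static_cast<ISecond*>(p);
}

// Address of the whole derived object: no adjustment, as with "*ppv = this;".
inline void* AsWhole(Implementor* p) {
    return p;
}
```

For an Implementor object obj, AsWhole(&obj) and AsSecond(&obj) are expected to return different addresses, and only the second one may safely be treated as an ISecond* by the caller.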

Is there happiness? Up to this paragraph, certainly. Now 100,500 days go by, the project evolves, and new functionality is added. In the next paragraph we will see the consequences of careless copy-paste during that evolution. And let us skip the objections that the “right programmers” with the “right programming” and the “right architecture” never do such things.

In another subsystem of the same open source project there is another class that implements almost the same set of interfaces:

 class CYetOtherImplementor :
     public IComInterface1,
     public IComInterface3,
     ... (4 through 9),
     public IComInterface10
 {
     // ...
 };

and, of course, nobody wants to write that chain of conditions anew, especially since the implementation is obviously the same:

 HRESULT STDMETHODCALLTYPE CYetOtherImplementor::QueryInterface( REFIID iid, void** ppv )
 {
     if( ppv == 0 ) {
         return E_POINTER;
     }
     if( iid == __uuidof( IUnknown ) || iid == __uuidof( IComInterface1 ) ) {
         *ppv = (IComInterface1*)this;
     } else if( iid == __uuidof( IComInterface2 ) ) {
         *ppv = (IComInterface2*)this;
     } else if( iid == __uuidof( IComInterface3 ) ) {
         *ppv = (IComInterface3*)this;
     } else if...
     ... // likewise for the remaining COM interfaces
     } else {
         // none of the supported COM interfaces was requested
         *ppv = 0;
         return E_NOINTERFACE;
     }
     // V2UncmUgaGlyaW5nIC0gd3d3LmFiYnl5LnJ1L3ZhY2FuY3k=
     AddRef();
     return S_OK;
 }

Now let us mentally walk through what happens when the IComInterface2 interface is requested. Control passes down the if-else-if chain until the identifier matches, and then the C-style cast is executed.

Paragraph 5.4/5 of the C++ standard, ISO/IEC 14882:2003(E), says that a C-style cast performs (in our case) either a static_cast or, if a static_cast is impossible, a reinterpret_cast.

In the first example the class derives from IComInterface2, so a static_cast of the this pointer to a pointer to the required subobject was performed.

In the second example the class no longer derives from IComInterface2 (yes, copy-paste plus some manual touching up), so a static_cast is impossible. A reinterpret_cast is performed instead, and the this pointer is copied unchanged. And, by the way, the object does not implement IComInterface2 at all. Here the word is appropriate.

When IComInterface2 is requested in the second example, the caller receives a non-null pointer to an object that does not implement this interface and has nothing to do with it.
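The silent fallback can be reproduced in a few lines of portable C++. This is a minimal sketch under stated assumptions: the types and the helper functions CStyleAsSecond and CStyleAsThird are hypothetical and merely mimic the shape of the second class, which no longer derives from the second interface.

```cpp
// Hypothetical stand-ins for COM interfaces; not the project's real types.
struct IFirst {
    virtual ~IFirst() = default;
    virtual int First() = 0;
};

struct ISecond {
    virtual ~ISecond() = default;
    virtual int Second() = 0;
};

struct IThird {
    virtual ~IThird() = default;
    virtual int Third() = 0;
};

// Mimics the copy-pasted class: ISecond is no longer a base.
struct OtherImplementor : IFirst, IThird {
    int First() override { return 1; }
    int Third() override { return 3; }
};

// Compiles without complaint: ISecond is unrelated, so the C-style cast
// falls back to reinterpret_cast and the address is passed through as is.
inline void* CStyleAsSecond(OtherImplementor* p) {
    return (ISecond*)p;
}

// IThird is a real base, so the same syntax acts as static_cast
// and yields the address of the IThird subobject.
inline void* CStyleAsThird(OtherImplementor* p) {
    return (IThird*)p;
}
```

With an OtherImplementor object, CStyleAsSecond returns the unchanged address of the whole object, which is not an ISecond in any sense, while CStyleAsThird returns a properly adjusted pointer. The two lines look identical in the source, which is exactly why the mistake is easy to miss.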

For comparison, had static_cast been used in each of the if-else-if branches, the compiler would have issued an error and the second example would not have compiled, gently hinting to the developer that a little more work is needed. That is one day of debugging saved and free for something useful.
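Since a compile error cannot be shown in a program that compiles, the sketch below uses a small detection trait to ask the compiler whether a given static_cast would be accepted. The trait can_static_cast and all the type names are hypothetical helpers written for this illustration only; they are not part of the project or of the standard library.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for COM interfaces; not the project's real types.
struct IFirst  { virtual ~IFirst() = default; };
struct ISecond { virtual ~ISecond() = default; };
struct IThird  { virtual ~IThird() = default; };

struct Implementor      : IFirst, ISecond, IThird {};
struct OtherImplementor : IFirst, IThird {};  // ISecond was dropped

// Primary template: assume the static_cast is not well-formed.
template <class To, class From, class = void>
struct can_static_cast : std::false_type {};

// Chosen only when static_cast<To>(From) is a valid expression.
template <class To, class From>
struct can_static_cast<To, From,
    decltype(void(static_cast<To>(std::declval<From>())))>
    : std::true_type {};

// The cast used in the first class is accepted by the compiler...
constexpr bool kFirstClassCompiles =
    can_static_cast<ISecond*, Implementor*>::value;

// ...while the same cast in the copy-pasted class would be rejected.
constexpr bool kSecondClassCompiles =
    can_static_cast<ISecond*, OtherImplementor*>::value;

static_assert(kFirstClassCompiles, "ISecond is a base of Implementor");
static_assert(!kSecondClassCompiles, "ISecond is not a base of OtherImplementor");
```

In real code no trait is needed: writing static_cast<IComInterface2*>(this) directly in the branch is enough for the compiler to refuse the copy-pasted class.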

While we are here, using dynamic_cast is another bad idea. With dynamic_cast in the second example, the caller would get a null pointer together with a false success code, and the object's reference count would be incremented for nothing, so the object may leak. That adds a couple of hours of debugging; a null pointer is at least easier to notice, but there is still no point in using dynamic_cast here.
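The dynamic_cast scenario can be sketched the same way. Everything here is a simplified, hypothetical model: QuerySecond stands in for the IComInterface2 branch followed by the unconditional AddRef() and the S_OK return, a plain int stands in for the COM reference count, and a bool stands in for the HRESULT.

```cpp
// Hypothetical stand-ins for COM interfaces; not the project's real types.
struct IFirst {
    virtual ~IFirst() = default;
};

struct ISecond {
    virtual ~ISecond() = default;
};

// Mimics the copy-pasted class: ISecond is not a base.
struct OtherImplementor : IFirst {
    int refs = 1;  // simplified reference count (assumption, not real COM)
};

// Simplified model of the matching branch written with dynamic_cast.
inline bool QuerySecond(OtherImplementor* self, void** out) {
    // Compiles, because the source type is polymorphic, but fails at run time:
    // the object has no ISecond subobject, so the result is a null pointer.
    *out = dynamic_cast<ISecond*>(self);
    ++self->refs;  // mirrors the unconditional AddRef()
    return true;   // mirrors the unconditional S_OK
}
```

After the call the caller holds a null pointer, a success result, and an object whose reference count has grown by one with nobody left to release it.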

One may argue that C-style casts make the code shorter, but they make correct code harder to write and merely postpone the moment when you have to get used to the *_cast operators.

The conclusions are obvious. Use the C-style cast as often as possible: it will give other developers a competitive advantage, and you yourself (who knows) may even win a Darwin Award one day.

Dmitry Mescheryakov
Department of Data Entry Products

Source: https://habr.com/ru/post/113429/
