
Virtuality and overhead

I think everyone knows what inheritance is, or has at least heard of it. We often use inheritance to get polymorphic behavior from objects. But do we think about the price we pay for virtuality? Let me put the question differently: does everyone know what that price is? Let's try to understand it.


In general, inheritance looks like this:

class Base { int variable; };
class Child: public Base { };

In this case, as we well know, the class Child inherits all members of the class Base. That is, in terms of object size we now have sizeof(Base) = sizeof(Child) = 4 (since sizeof(int) = 4).
It does not hurt to recall right away what alignment is. We have two classes:

class A1 { int iv; double dv; int iv2; };
class A2 { double dv; int iv; int iv2; };

At first glance they do not differ from each other. However, their sizes are not the same: sizeof(A2) = 16, sizeof(A1) = 24.

It is all about the order of the variables inside the class. If they have different types, their order can seriously affect the size of the object. Here sizeof(double) = 8, so 8 + 4 + 4 = 16, yet class A1 is larger. The reason is that a double has to start at an offset that is a multiple of 8. In A1, iv occupies bytes 0 to 3, then come 4 bytes of padding, dv occupies bytes 8 to 15, iv2 occupies bytes 16 to 19, and 4 more bytes of padding round the size up to a multiple of 8.

As a result, we see 8 extra bytes that were added because the double sits in the middle. In A2 the double comes first, at offset 0, and the two ints fill bytes 8 to 15, so no padding is needed.
But, most likely, you already knew it.

Now let's recall what we pay for virtual functions in a class. You may be aware of virtual method tables. The C++ standard does not prescribe any particular implementation for finding a function's address at runtime. In practice it all comes down to a pointer stored in every object of a class that has at least one virtual function.

Let's add one virtual function to the Base class and see how the sizes change:

class Base { int variable; virtual void f() {} };
class Child: public Base { };

The size is now 16: 8 bytes for the pointer, 4 for the int, plus 4 bytes of alignment padding. On a 32-bit architecture the size will be 8: 4 for the pointer and 4 for the int, with no padding.

So that you don't have to take my word for it, here is the listing produced by Hopper Disassembler v4:

// source
class Base {
public:
    int variable;
    virtual void f() {}
    Base(): variable(10) {}
};

// main
Base a;

Assembly code:

; Variables:
;    var_8: -8

__ZN4BaseC2Ev: // Base::Base()
0000000100000f70 push rbp ; CODE XREF=__ZN4BaseC1Ev+16
0000000100000f71 mov rbp, rsp
0000000100000f74 mov rax, qword [0x100001000]
0000000100000f7b add rax, 0x10
0000000100000f7f mov qword [rbp+var_8], rdi
0000000100000f83 mov rdi, qword [rbp+var_8]
0000000100000f87 mov qword [rdi], rax
0000000100000f8a mov dword [rdi+8], 0xa
0000000100000f91 pop rbp
0000000100000f92 ret

Without a virtual function, the assembler code looks like this:

; Variables:
;    var_8: -8

__ZN4BaseC2Ev: // Base::Base()
0000000100000fa0 push rbp ; CODE XREF=__ZN4BaseC1Ev+16
0000000100000fa1 mov rbp, rsp
0000000100000fa4 mov qword [rbp+var_8], rdi
0000000100000fa8 mov rdi, qword [rbp+var_8]
0000000100000fac mov dword [rdi], 0xa
0000000100000fb2 pop rbp
0000000100000fb3 ret

You can see that in the second case no address is stored in the object, and the variable is written at offset 0 instead of offset 8.

For those who do not like assembly, let's print roughly what the object looks like in memory:

#include <cstring>   // memset
#include <iomanip>
#include <iostream>
#include <new>       // placement new
using namespace std;

const int memorysize = 16;

class Base {
public:
    int variable;
    //virtual void f() {}
    Base(): variable(0xAAAAAAAA) {} // an AA pattern that is easy to spot in the dump
};

class Child: public Base { };

// Prints the buffer as rows of 8 hexadecimal bytes.
void PrintMemory(const unsigned char memory[]) {
    for (size_t i = 0; i < memorysize / 8; ++i) {
        for (size_t j = 0; j < 8; ++j) {
            cout << setw(2) << setfill('0') << uppercase << hex
                 << (int)(memory[i * 8 + j]) << " ";
        }
        cout << endl;
    }
}

int main() {
    alignas(Base) unsigned char memory[memorysize]; // suitably aligned storage for a Base
    memset(memory, 0xFF, memorysize * sizeof(unsigned char)); // fill the memory with FF
    Base *object = new (memory) Base; // construct the object inside memory
    PrintMemory(memory);
    object->~Base();
    return 0;
}

Output:

AA AA AA AA FF FF FF FF
FF FF FF FF FF FF FF FF

Let's uncomment the virtual function and admire the result:

E0 30 70 01 01 00 00 00
AA AA AA AA FF FF FF FF

Now that we have recalled all this, let's talk about virtual inheritance. It is no secret that C++ allows multiple inheritance. This is a powerful feature that is better left alone by unskilled hands; nothing good will come of it. But let's not dwell on sad things. The best-known problem with multiple inheritance is the diamond problem.

class A { };
class B: public A { };
class C: public A { };
class D: public B, public C { };

In class D we get duplicate members of class A. What is wrong with that? Even setting aside that the object grows by the size of an extra copy of A, the bad part is that calls to functions of class A become ambiguous: it is not clear which one to call, B::A::func or C::A::func. We can always resolve such ambiguities with explicit qualification, but that is not very convenient. This is where virtual inheritance comes into play. To avoid getting a duplicate of class A, we inherit from it virtually:

class A { };
class B: public virtual A { };
class C: public virtual A { };
class D: public B, public virtual C { };

Now everything is fine. Or not? What size will a class D have if we have only one virtual method in class A?

 cout << sizeof(A) << " " << sizeof(B) << " " << sizeof(C) << " " << sizeof(D) << endl; 

This is an interesting question, because everything depends on the compiler. For example, Visual Studio 2015 with the default project settings will display: 4 8 8 12.

That is, we have 4 bytes for the pointer in class A (from here on I will abbreviate these pointers as, for example, vtbA), plus an additional 4 bytes in each of B and C for the pointer required by virtual inheritance (vtbB and vtbC). Finally, D is 8 + 8 - 4 = 12, since vtbA is not duplicated.

But gcc 4.2.1 will give 8 8 8 16.

Let's first consider the case without virtual inheritance, because the sizes come out the same.

A takes 8 bytes for vtbA. B and C each hold only a pointer to their own virtual table. So the virtual tables are duplicated, but in return the derived classes do not need to store vtbA. Class D stores two addresses: vtbB and vtbC.

0000000100000f7f mov rax, qword [0x100001018]
0000000100000f86 mov rdi, rax
0000000100000f89 add rdi, 0x28
0000000100000f8d add rax, 0x10
0000000100000f91 mov rcx, qword [rbp+var_10]
0000000100000f95 mov qword [rcx], rax
0000000100000f98 mov qword [rcx+8], rdi
0000000100000f9c add rsp, 0x10

0000000100001018 dq 0x00000001000010a8
…
__ZTV1D: // vtable for D
00000001000010a8 db 0x00 ; '.' ; DATA XREF=0x100001018
...
00000001000010b0 dq __ZTI1D
00000001000010b8 db 0xc0 ; '.'
...
00000001000010c8 dq __ZTI1D
00000001000010d0 db 0xc0 ; '.'
…

Nothing is clear? Look: at 0f95 and 0f98 we store two addresses. Both are computed from the address held at 1018 (that is, 10a8): plus 0x10 for the first store and plus 0x28 for the second. In total we get 10b8 and 10d0, the entries that follow each __ZTI1D pointer in the table.

Now consider the case when inheritance is virtual.

In terms of assembly code, little changes; we also have two addresses, but the virtual tables for B, C, and D have become much larger. For example, the table for class D has increased more than 7 times!

We saved on the size of the object but increased the size of the tables. But what if we use virtual inheritance everywhere, as some authors advise?

I can no longer give exact references, but I have read somewhere that if multiple inheritance is allowed at all, you should always use virtual inheritance in order to avoid duplication.

So let's follow this advice literally:

class A { };
class B: public virtual A { };
class C: public virtual A { };
class D: public virtual B, public virtual C { };

What will the size of D be?

Visual Studio 2015 will output 4 8 8 16, i.e., one more pointer has been added to class D. Through experiments we found that if a class inherits virtually from every one of its bases, Visual Studio adds another pointer to that class. For example, if we wrote this:

class D: public virtual B, public C { };

or this:

class D: public B, public virtual C { };

then the size would be 12 bytes.

Do not think that Visual Studio saves memory; it does not. With the default settings the pointer size is 4 bytes, not 8 as in gcc. So multiply the result by 2.

What about gcc 4.2.1? It will not change the size of the objects at all; the output is the same: 8 8 8 16. But can you imagine what happened to the table for D?!

In fact, it certainly grew, but only slightly. A separate question is how this will affect deeper hierarchies.

As a pure experiment (we will not think whether there is a practical use in this), we will check what happens with such a hierarchy:

class A { virtual void func() {} };
class B: public virtual A { };
class C: public virtual A { };
class D: public virtual B, public virtual C { };
class E: public virtual B, public virtual C, public virtual D { };

In Visual Studio the size of class E grows by 4 bytes, for the reason we have already worked out, while in gcc both D and E are 16 bytes.

But at the same time, the size of the virtual table for class E (which is already rather large even if you remove all virtual inheritance) will increase 4 times! If I have calculated everything correctly, it will reach half a kilobyte or so.

What conclusion can we draw? The same as before: multiple inheritance should be used very carefully; virtual inheritance is not a panacea, and one way or another we pay for it. It may be worth thinking in terms of interfaces and abandoning virtual inheritance altogether.

Source: https://habr.com/ru/post/327052/

