Posted by: rmn on: 20/01/2010
Actual object memory layout can be a little tricky when inheritance and its virtual tables are involved. And it gets even trickier when pointer arithmetic is employed. Do you consider yourself a low-level expert?
Let us consider the following main program:
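#include <iostream>
// A and B are defined by one of the three versions below.
void f (A *a) {
    // a[2] is located using sizeof(A), whatever the array really holds
    std::cout << a[2].x << std::endl;
}
int main () {
    B b[10];
    f(b);
}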
In this post we will present three different definitions for classes A and B. Each definition will vary slightly from the previous one, and is likely to generate different output.
The first definition is the most straightforward:
// version I
struct A {
    A () : x(1) {}
    unsigned x;
    void dummy () {}
};
struct B : A {
    B () : y(2) {}
    unsigned y;
    void dummy () {}
};
As you might guess, we have defined two classes: A, and B, which inherits from A. A contains one member, x, which is initialized to 1. B adds an extra member, y, which is initialized to 2. What will this version print?
At this point we shall add an innocent ‘virtual’ specifier to the member function B::dummy:
// version II
struct A {
    A () : x(1) {}
    unsigned x;
    void dummy () {}
};
struct B : A {
    B () : y(2) {}
    unsigned y;
    virtual void dummy () {}
};
What will now be printed?
We shall be fair and give A::dummy its own ‘virtual’ specifier as well:
// version III
struct A {
    A () : x(1) {}
    unsigned x;
    virtual void dummy () {}
};
struct B : A {
    B () : y(2) {}
    unsigned y;
    virtual void dummy () {}
};
What do you expect to be printed now?
… .. .
Hint: Each of the suggested implementations generates a different output.
Nadav – I don’t know what the standard has to say about this, but I do think such a compiler/linker design would be infeasible. A and B can be declared in different files, and in fact in different libraries altogether. When the compiler is faced with an ‘A*’ argument, it has only A’s type declaration to consult, and *not* those of its derived classes.
Ofek, you raise a really good point. I looked at the compiled code and was surprised to see the memory layout. Taking into account that class “A” may not be modified, it does make sense. My mistake was assuming that once a single virtual function is declared anywhere in the path, the entire path becomes virtual. In fact only the path DOWNWARDS (the derived classes) becomes virtual, not upwards, as we can see from this example.
rmn, I really enjoyed this one!
Roman, you could devise an even nastier ‘gotcha’ with virtual inheritance… You can fiddle with the size of the virtual *base* table, as well as the virtual *function* table – as both precede the data members in the object memory layout.
It is conceivable that the different versions of the classes above will result in different output. But isn’t A* the wrong way to access the array in all of these cases?
C++ does not have an inherent array type, and we assume A* to be an array. Secondly, related to the intent of your post which is object layout, the user should be aware of what he is accessing.
Nice post!
20/01/2010 at 23:20
Hi RMN!
I am surprised that the second and the third definitions have a different output. To my understanding, once a single virtual function is declared in the path, the entire path becomes virtual, which means that the second example should be the same as the third.
Nadav