A question of memory layout
January 20th, 2010 § 6 Comments
Actual object memory layout can be a little tricky when inheritance and its virtual tables are involved. And it gets even trickier when pointer arithmetic is employed. Do you consider yourself a low-level expert?
Let us consider the following main program:
#include <iostream>

// The definitions of A and B (one of the three versions below) go here.

void f (A *a) {
    std::cout << a[2].x << std::endl;
}

int main () {
    B b[10];
    f(b);
}
In this post we will present three different definitions for classes A and B. Each definition will vary slightly from the previous one, and is likely to generate different output.
The first definition is the most straightforward:
// version I
struct A {
    A () : x(1) {}
    unsigned x;
    void dummy () {}
};
struct B : A {
    B () : y(2) {}
    unsigned y;
    void dummy () {}
};
As you might guess, we have defined two classes: A, and B which inherits from A. A contains one member, x, which is initialized to 1. B contains an extra member, y, which is initialized to 2. What will this version print?
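One way to reason about the question without running the program: indexing through a pointer always advances by the size of the pointer's static type. The following is a minimal sketch and not part of the original program; the `v1` namespace and the helper names `index_offset` and `report_sizes` are made up for illustration, and the printed sizes depend on the compiler and target.

```cpp
#include <cstddef>
#include <iostream>

namespace v1 {  // version I, wrapped in a hypothetical namespace
struct A {
    A() : x(1) {}
    unsigned x;
    void dummy() {}
};
struct B : A {
    B() : y(2) {}
    unsigned y;
    void dummy() {}
};
}  // namespace v1

// Byte distance from p to &p[i] for a pointer p of static type T*.
template <class T>
constexpr std::size_t index_offset(std::size_t i) {
    return i * sizeof(T);
}

// Prints the sizes and strides; the values are implementation-specific.
inline void report_sizes() {
    std::cout << "sizeof(A) = " << sizeof(v1::A) << '\n'
              << "sizeof(B) = " << sizeof(v1::B) << '\n'
              << "offset of a[2] = " << index_offset<v1::A>(2) << '\n'
              << "offset of b[2] = " << index_offset<v1::B>(2) << '\n';
}
```

Calling `report_sizes()` from `main` shows how far `a[2]` and `b[2]` are from the start of the array. Comparing the two offsets is the key to the puzzle.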
To spice things up we shall add an innocent ‘virtual’ specifier to the member function B::dummy:
// version II
struct A {
    A () : x(1) {}
    unsigned x;
    void dummy () {}
};
struct B : A {
    B () : y(2) {}
    unsigned y;
    virtual void dummy () {}
};
What will now be printed?
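To see what the added ‘virtual’ changes, one can ask the compiler directly. This is a sketch under assumptions: the namespace `v2` and the helpers `base_offset` and `report_layout` are hypothetical names, not part of the original program. It checks which class is polymorphic and where the A subobject sits inside a B. The standard does not fix these numbers, so they may vary between compilers and ABIs.

```cpp
#include <cstddef>
#include <iostream>
#include <type_traits>

namespace v2 {  // version II, wrapped in a hypothetical namespace
struct A {
    A() : x(1) {}
    unsigned x;
    void dummy() {}
};
struct B : A {
    B() : y(2) {}
    unsigned y;
    virtual void dummy() {}
};
}  // namespace v2

// Byte distance from the start of a B object to its A base subobject.
// The B* -> A* conversion here is the same one f(b) performs implicitly.
inline std::ptrdiff_t base_offset(v2::B& b) {
    v2::A* as_base = &b;
    return reinterpret_cast<char*>(as_base) - reinterpret_cast<char*>(&b);
}

// Prints implementation-specific layout facts for version II.
inline void report_layout() {
    v2::B b;
    std::cout << std::boolalpha
              << "A polymorphic: " << std::is_polymorphic<v2::A>::value << '\n'
              << "B polymorphic: " << std::is_polymorphic<v2::B>::value << '\n'
              << "sizeof(A) = " << sizeof(v2::A) << '\n'
              << "sizeof(B) = " << sizeof(v2::B) << '\n'
              << "offset of A inside B = " << base_offset(b) << '\n';
}
```

If `report_layout()` shows a nonzero offset for the A subobject, the pointer received by `f` does not point at the first byte of `b[0]`, which is worth keeping in mind when working out what `a[2].x` reads.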
At this point we shall be fair and let A have its own ‘virtual’ specifier as well:
// version III
struct A {
    A () : x(1) {}
    unsigned x;
    virtual void dummy () {}
};
struct B : A {
    B () : y(2) {}
    unsigned y;
    virtual void dummy () {}
};
What do you expect to be printed now?
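For any of the three versions, the byte that `a[2].x` actually reads can be located with plain arithmetic once the sizes are known. The sketch below is illustrative only: `Landing` and `locate` are hypothetical names, and the four layout parameters must be measured on your own compiler (for example with `sizeof`), since none of them is fixed by the language.

```cpp
#include <cstddef>

// Where a read through A* lands, expressed in terms of the real B array.
struct Landing {
    std::size_t element;  // index of the B element that gets read
    std::size_t offset;   // byte offset inside that element
};

// i        : index used through the A* (2 in the program above)
// size_a   : sizeof(A), the stride the compiler uses for a[i]
// size_b   : sizeof(B), the real distance between array elements
// base_off : offset of the A subobject inside a B
// x_off    : offset of the member x inside an A
constexpr Landing locate(std::size_t i, std::size_t size_a, std::size_t size_b,
                         std::size_t base_off, std::size_t x_off) {
    // Address of a[i].x relative to the array start:
    //   base_off + i * size_a + x_off
    return Landing{(base_off + i * size_a + x_off) / size_b,
                   (base_off + i * size_a + x_off) % size_b};
}

// Example with made-up sizes: locate(2, 8, 12, 0, 4) gives element 1, offset 8.
```

Whatever lives at the resulting offset inside that B element (x, y, or part of a virtual table pointer) is what ends up being printed.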
Hint: each of the suggested implementations is expected to generate a different output, although the exact values depend on the compiler and target platform.
Hi RMN!
I am surprised that the second and the third definitions have a different output. To my understanding, once a single virtual function is declared in the path, the entire path becomes virtual, which means that the second example should be the same as the third.
Nadav
Nadav – I don’t know what the standard has to say about this, but I do think such a compiler/linker design would be infeasible. A and B can be declared in different files, and in fact in different libraries altogether. When the compiler is faced with an ‘A*’ argument, it has only A’s type declaration to consult, and *not* its children.
Ofek, you bring up a really good point. I looked at the compiled code and was surprised to see the memory layout. After taking into account the fact that class “A” may not be modified, it does make sense. My mistake: when a single virtual function is declared in the path, the entire path DOWNWARDS becomes virtual, not upwards, as we can see from this example.
rmn, I really enjoyed this one!
Roman, you could devise an even nastier ‘gotcha’ with virtual inheritance… You can fiddle with the size of the virtual *base* table, as well as the virtual *function* table – as both precede the data members in the object memory layout.
Good suggestion! I’ll add this to my TODO list
Thanks for the input.
It is understandable that the different versions of the classes above result in different output. But isn’t A* the wrong way to access the array in all of these cases?
C++ pointers carry no array information; we merely assume that the A* points to an array of A. Secondly, regarding the intent of your post, which is object layout, the user should be aware of what he is accessing.
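One possible way to act on this remark is to keep the real element type in the function signature, so the compiler steps through the array with the correct stride. This is only a sketch: the `v3` namespace and the helper `x_at` are hypothetical, and the classes are copied from version III.

```cpp
#include <cassert>
#include <cstddef>

namespace v3 {  // version III, wrapped in a hypothetical namespace
struct A {
    A() : x(1) {}
    unsigned x;
    virtual void dummy() {}
};
struct B : A {
    B() : y(2) {}
    unsigned y;
    virtual void dummy() {}
};
}  // namespace v3

// T is deduced from the argument, so first[i] steps by sizeof(T):
// sizeof(B) for an array of B, sizeof(A) for an array of A.
template <class T>
unsigned x_at(const T* first, std::size_t count, std::size_t i) {
    assert(i < count);  // the caller supplies the array length
    return first[i].x;
}
```

With `v3::B b[10];`, the call `x_at(b, 10, 2)` reads the x member of `b[2]` regardless of how the classes are laid out.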
Nice post!