C++ background: Static, reinterpret and C-Style casts
Ever wondered why C-style casts and reinterpret_cast are considered evil? Let’s take an in-depth look at what goes wrong with them.
Background
C++ knows 5 different casts (yeah, C-style casting is not reinterpret_cast):
- static_cast: The least harmful cast. It performs conversions the compiler can check at compile time, including downcasting pointers within a class hierarchy.
- const_cast: Removes the const modifier. If used incorrectly, this can be a killer: if the target really is const, writing through the result is undefined behaviour and may show up as invalid access errors.
- dynamic_cast: Safe down/cross-casting between classes. It requires RTTI, and RTTI in C++ is something that is often not enabled at all.
- reinterpret_cast: Casts between unrelated pointer types, and between pointers and integers that are large enough to hold them, for example int to FancyClass* on x86. This is not really a cast any more but just a way to tell the compiler to throw away type information and treat the data differently.
- C-style casting, using the (type)variable syntax. The worst ever invented. This tries the following casts, in this order (see also C++ Standard, 5.4 expr.cast paragraph 5):
  1. const_cast
  2. static_cast
  3. static_cast followed by const_cast
  4. reinterpret_cast
  5. reinterpret_cast followed by const_cast
And you thought it was just a single evil cast; in fact it’s a hydra!
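To make the hydra visible, here is a minimal sketch that asks the compiler which conversions each cast syntax is willing to accept. The trait names can_static_cast and can_c_style_cast and the type Unrelated are made up for this illustration, and the detection trick assumes a C++11 compiler.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: true if static_cast<To>(value of type From) compiles.
template <typename From, typename To, typename = void>
struct can_static_cast : std::false_type {};

template <typename From, typename To>
struct can_static_cast<From, To,
    decltype(void(static_cast<To>(std::declval<From>())))> : std::true_type {};

// Hypothetical helper: true if the C-style cast (To)value compiles.
template <typename From, typename To, typename = void>
struct can_c_style_cast : std::false_type {};

template <typename From, typename To>
struct can_c_style_cast<From, To,
    decltype(void((To)std::declval<From>()))> : std::true_type {};

struct Unrelated {};

// Dropping const: static_cast refuses, the C-style cast acts as a const_cast.
static_assert(!can_static_cast<const int*, int*>::value,
              "static_cast must not remove const");
static_assert(can_c_style_cast<const int*, int*>::value,
              "C-style cast silently removes const");

// Unrelated pointer types: static_cast refuses, the C-style cast acts as a
// reinterpret_cast.
static_assert(!can_static_cast<int*, Unrelated*>::value,
              "static_cast must not convert unrelated pointers");
static_assert(can_c_style_cast<int*, Unrelated*>::value,
              "C-style cast silently reinterprets");

// Both at once: reinterpret_cast followed by const_cast, in one innocent-looking cast.
static_assert(can_c_style_cast<const int*, Unrelated*>::value,
              "C-style cast reinterprets and removes const");
```

Every static_assert here passes at compile time, so the file compiles without emitting any code. The point is that the same (type)variable syntax is accepted in all three situations without a single diagnostic.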
The rule of thumb should be: never use C-style casting, and avoid reinterpret_cast wherever you can. If you need to cast between pointer types, cast them via void* using static_cast, and use reinterpret_cast only if absolutely necessary, that is, if you really have to reinterpret the data. Remember, C++ is an expert language: it gives you all the control over your machine you wish, but with power comes responsibility!
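As an illustration of casting via void*, here is a small sketch of a callback that receives its context as void*, as many C APIs do. The names Counter, on_event and fire_twice are invented for this example. The only cast needed is a static_cast back to the type that was originally stored; casting the void* to any other type would be just as wrong as a reinterpret_cast.

```cpp
#include <cassert>

// Hypothetical context object that travels through a type-erased callback.
struct Counter {
    int hits;
};

// The callback only sees void*.
void on_event(void* user_data) {
    assert(user_data != nullptr);
    // static_cast from void* is fine as long as user_data really points
    // to a Counter, which is the caller's responsibility.
    Counter* counter = static_cast<Counter*>(user_data);
    ++counter->hits;
}

// Fires the callback twice and returns how often it was counted.
int fire_twice() {
    Counter counter = {0};
    void* erased = &counter;  // implicit conversion to void*, no cast needed
    on_event(erased);
    on_event(erased);
    return counter.hits;
}
```

Calling fire_twice() from main returns the number of times on_event ran. No reinterpret_cast and no C-style cast is needed anywhere.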
Example
Let’s take a look at why the “hard” casts are really evil and should be avoided like the plague. In this example, we’ll assume the following class hierarchy:
class ParentWithoutVtable
{
    char empty;
};

class ParentWithVTable
{
public:
    virtual ~ParentWithVTable () { }
};

class Derived : public ParentWithoutVtable, public ParentWithVTable
{
};
Now, consider this code:
int main () {
    ParentWithoutVtable* p = new Derived ();

    Derived* a = reinterpret_cast<Derived*>(p); // (1)
    Derived* b = static_cast<Derived*>(p);      // (2)
    Derived* c = (Derived*)(p);                 // (3)
}
What happens? Well, (1) will fail miserably, (2) will work and (3) depends on whether you included the header properly or just forward-declared the types! Let’s take a look at (1) first: it fails because the ParentWithoutVtable* pointer does not point at the beginning of the newly created Derived object in memory, but somewhere else. For example, Visual C++ puts the virtual table pointer as the first element of a class, so the real layout of Derived is something like:
struct __Derived
{
    function_ptr vtable;
    char empty;
};
So if we obtain a pointer to an object without a vtable, it will point at empty, otherwise access via the pointer would fail. Now, the reinterpret_cast has no idea of this: it keeps the address unchanged, and if you call a virtual function through the resulting Derived pointer, it will think that empty is the vtable pointer. The static_cast does it right and adjusts the address, but for this it has to know the full definition of both types. This is exactly where the C-style cast is dangerous: as long as the full definition is available, everything will work fine without any warning, but if you choose to forward-declare the types, the C-style cast will still not emit a warning (while the static_cast will fail to compile!) and perform a reinterpret_cast which will break your code. So be aware, and avoid C-style casting at all costs.
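The forward-declaration trap can be reproduced in a single file. The sketch below performs a C-style cast while the classes are only forward-declared and repeats the very same cast after the definitions are visible. The helper names cast_while_incomplete, cast_when_complete and early_cast_is_broken are made up for this illustration, and empty is made public only to keep compilers quiet. Note that the standard leaves it unspecified which interpretation is used for incomplete types; the assumption here is that your compiler falls back to reinterpret_cast, as the common ones do.

```cpp
#include <cassert>

class ParentWithoutVtable;
class Derived;

// Only forward declarations are visible here, so the compiler cannot know
// that the two classes are related. The C-style cast still compiles silently.
Derived* cast_while_incomplete(ParentWithoutVtable* p) {
    return (Derived*)p;
}

class ParentWithoutVtable { public: char empty; };
class ParentWithVTable { public: virtual ~ParentWithVTable() {} };
class Derived : public ParentWithoutVtable, public ParentWithVTable {};

// Same syntax, but now the definitions are known, so the cast behaves
// like a static_cast and adjusts the address if necessary.
Derived* cast_when_complete(ParentWithoutVtable* p) {
    return (Derived*)p;
}

// Returns true if the early cast handed back an address that is not the
// start of the Derived object.
bool early_cast_is_broken() {
    Derived object;
    ParentWithoutVtable* p = &object;          // implicit, correct upcast
    assert(cast_when_complete(p) == &object);  // static_cast semantics
    return cast_while_incomplete(p) != &object;
}
```

On an implementation that places the vtable pointer in front of empty, early_cast_is_broken() is expected to return true, although both functions contain exactly the same cast expression. Replacing the early cast with static_cast turns the silent bug into a compile error.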
Update: Changing the order in which Derived inherits from its parents does not solve this problem. The C++ standard makes no guarantees about how the members will be laid out:
the order of derivation is not significant except as specified by the semantics of initialization by constructor (12.6.2), cleanup (12.4), and storage layout (9.2, 11.1).
and the referenced paragraph says:
Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
Basically, the compiler is free to put the virtual function table pointer wherever it wants. Visual C++, for example, always stores it first, no matter in what order the class derives from its parents.
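Since the layout is up to the implementation, the easiest way to find out what your compiler does is to measure it. The following sketch repeats the class hierarchy from the example (with empty made public and initialised, to keep compilers quiet) and reports where the ParentWithoutVtable subobject sits and which cast recovers the original Derived pointer. CastReport and compare_casts are names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>

class ParentWithoutVtable { public: char empty = 0; };
class ParentWithVTable { public: virtual ~ParentWithVTable() {} };
class Derived : public ParentWithoutVtable, public ParentWithVTable {};

// Hypothetical summary of what the two casts did on this compiler.
struct CastReport {
    bool static_cast_recovers_object;
    bool reinterpret_cast_recovers_object;
    std::ptrdiff_t base_offset_bytes;  // offset of ParentWithoutVtable in Derived
};

CastReport compare_casts() {
    Derived object;
    Derived* original = &object;
    ParentWithoutVtable* p = original;  // implicit upcast, may change the address

    Derived* via_static = static_cast<Derived*>(p);            // undoes the adjustment
    Derived* via_reinterpret = reinterpret_cast<Derived*>(p);  // keeps the raw address

    CastReport report;
    report.static_cast_recovers_object = (via_static == original);
    report.reinterpret_cast_recovers_object = (via_reinterpret == original);
    report.base_offset_bytes = reinterpret_cast<const char*>(p) -
                               reinterpret_cast<const char*>(original);

    // The static_cast round trip is guaranteed by the language.
    assert(report.static_cast_recovers_object);
    return report;
}
```

Printing the fields of compare_casts() from main shows the offset your compiler chose. Whenever that offset is not zero, the reinterpret_cast result does not point at the Derived object, and only the comparison above is safe to perform on it; dereferencing it is undefined behaviour.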