CMake 2.6 reaches beta

CMake 2.6 has reached beta status (mailing list post). CMake is a very nice cross-platform makefile generator -- it does not build the software itself but relies on a platform-specific tool for this. This is not a real drawback, though: on Windows, for example, it generates Visual Studio solutions, so a Windows developer can continue to work with the tools he is used to instead of having to use an external build script. 2.6 fixes some important bugs which occur in not-so-exotic environments (I couldn't build a project with 2.4.8 because of them) and seems good to use.

C++, <windows.h>, precompiled headers and pimpl

Today we take a look at how to make best use of precompiled headers, how to avoid the <windows.h>-is-messing-up-my-symbols-syndrome and how to properly use pimpl to reduce our compile times.

Well, everything you learned about writing good headers is basically ignored by <windows.h>. Just look at things like ...

#undef INTERFACE
#define INTERFACE INamedPropertyBag

or

#define ERROR 0

and then try something like

class Reflection
{
    enum Type {
        INTERFACE
    };
};

There is no chance to prevent this from happening; the only thing you can do is clean up behind windows.h. The worst that can happen is that some header includes windows.h before you get to do it, and does so without cleaning up. If you are the first one to include it, everything is fine, as the next include will do nothing (thanks to include guards). The best way to enforce this is to include it in your precompiled header. The problem is amplified by the fact that even headers like boost/shared_ptr.hpp wind up including windows.h on Windows, and there is no chance to use PIMPL to combat this. Which brings us to our next topic ...
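The "clean up behind windows.h" idea could look roughly like the sketch below. The file name clean_windows.h and the check function are made up for illustration; WIN32_LEAN_AND_MEAN and NOMINMAX are the usual opt-out macros, and the two #undef lines cover just the macros mentioned above, so a real project would probably need a longer list.

```cpp
#include <cassert>

// clean_windows.h (hypothetical name): include this instead of <windows.h>.
#ifdef _WIN32
#  ifndef WIN32_LEAN_AND_MEAN
#    define WIN32_LEAN_AND_MEAN  // leave out rarely used parts of the Windows headers
#  endif
#  ifndef NOMINMAX
#    define NOMINMAX             // keep the min/max macros out as well
#  endif
#  include <windows.h>
#endif

// Clean up behind windows.h; #undef is harmless if the macro was never defined.
#undef ERROR
#undef INTERFACE

// With the macros gone, the enum from above compiles on every platform.
class Reflection
{
public:
    enum Type {
        INTERFACE,
        ERROR
    };
};

// Small check (illustrative only): the enumerators are ordinary names again.
inline void check_reflection()
{
    assert(Reflection::INTERFACE == 0);
    assert(Reflection::ERROR == 1);
}
```

Putting such a wrapper first in the precompiled header means every later include of windows.h hits the include guards and cannot bring the macros back.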

PIMPL

The silver bullet against windows.h and friends -- hide them away from the public. PIMPL is easy to do, you just have to keep two things in mind:

  • Make pimpled classes non-copyable. As the implementation is just a pointer, the pointer gets copied by default and is then shared between the two instances, unless you implement a custom "deep copy".
  • Make sure that it can't leak memory. Just use a shared_ptr and you should be fine.

Good candidates are classes which use the filesystem, process creation and similar low-level things. Classes which pull in other expensive classes, like mutexes, should be pimpled as well.
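A minimal sketch of the two rules, squeezed into one file for brevity. LogFile, its members and the check function are invented for this example, and it assumes a C++11 compiler for std::shared_ptr and = delete (with an older compiler, boost::shared_ptr and a private copy constructor would play the same roles).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>

// --- Would live in the public header, e.g. log_file.h (hypothetical) ---
class LogFile
{
public:
    explicit LogFile(const std::string& name);

    // Non-copyable: a default copy would share one Impl between two objects.
    LogFile(const LogFile&) = delete;
    LogFile& operator=(const LogFile&) = delete;

    void write(const std::string& line);
    std::size_t line_count() const;

private:
    struct Impl;                  // only forward declared in the header
    std::shared_ptr<Impl> impl_;  // releases the Impl automatically
};

// --- Would live in log_file.cpp; heavy includes like <windows.h> go here ---
#include <vector>

struct LogFile::Impl
{
    std::string name;
    std::vector<std::string> lines;  // stand-in for a real OS file handle
};

LogFile::LogFile(const std::string& name)
    : impl_(std::make_shared<Impl>())
{
    impl_->name = name;
}

void LogFile::write(const std::string& line) { impl_->lines.push_back(line); }
std::size_t LogFile::line_count() const { return impl_->lines.size(); }

static_assert(!std::is_copy_constructible<LogFile>::value,
              "LogFile must not be copyable");

// Usage check (illustrative only).
inline void check_log_file()
{
    LogFile log("build.log");
    log.write("first");
    log.write("second");
    assert(log.line_count() == 2);
}
```

Users of the header only ever see the forward declaration of Impl, so whatever the implementation file includes never reaches them.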

Precompiled headers

Not much new since my last post about this, just some things to keep in mind:

  • Make sure it compiles fine without the precompiled header.
  • Build the precompiled header after evaluating the include dependencies.
  • Try to move the headers into the implementation files as much as possible -- forward declare what you can. This can especially help if you pass classes by reference. Moreover, often you get rid of more than one include as headers tend to include other headers excessively.

That is basically it. With a careful layout, you should get gains of around 50% just by using precompiled headers.
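The forward-declaration advice from the list can be sketched as follows. Renderer and Mesh are hypothetical names, and the three "files" are shown in one block only so that the example compiles on its own.

```cpp
#include <cassert>

// --- renderer.h (hypothetical): needs no #include "mesh.h" ---
class Mesh;  // a forward declaration is enough for references and pointers

class Renderer
{
public:
    Renderer() : triangles_drawn_(0) {}
    void draw(const Mesh& mesh);  // by reference: Mesh may stay incomplete here
    int triangles_drawn() const { return triangles_drawn_; }

private:
    int triangles_drawn_;
};

// --- mesh.h (hypothetical): only renderer.cpp has to include this ---
class Mesh
{
public:
    explicit Mesh(int triangle_count) : triangle_count_(triangle_count) {}
    int triangle_count() const { return triangle_count_; }

private:
    int triangle_count_;
};

// --- renderer.cpp: the full definition of Mesh is needed only here ---
void Renderer::draw(const Mesh& mesh)
{
    triangles_drawn_ += mesh.triangle_count();
}

// Usage check (illustrative only).
inline void check_renderer()
{
    Mesh mesh(12);
    Renderer renderer;
    renderer.draw(mesh);
    renderer.draw(mesh);
    assert(renderer.triangles_drawn() == 24);
}
```

Everything that includes renderer.h is spared mesh.h and whatever mesh.h includes in turn.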

C++, legacy and future

Yes, I do like C++ for being C++, an expert language close to the metal. But there are some things in C++ that should be seriously overhauled, if C++ does not want to pass the field over to C# and friends.

Frontend changes

  • Redundant keywords, basically anything where there are multiple ways to express the same thing semantically. Do struct and class mean the same? If so, get rid of one, or add back some meaning, like "a struct may contain only an empty constructor and no functions".
  • Variable argument functions, i.e. the ... . This is basically totally broken. I'd rather have something like printf (const char* string, void* params = ...) which would mean "convert anything that comes to void*" and let the user cast it. If you specify something that is not void*, the compiler would check the type. This would also allow things like vector<Type>::vector(Type* params = ...) where you do get compile time type checking. Anyway, the current ... construct is dangerous and needs some overhaul.
  • A way to enforce variable initialisations. The default being "do not initialise at all" is dangerous, as the hundreds of bugs constantly remind us. If the programmer really does not want to initialise some variable, he should make it clear, like int variable = undefined; in all other cases, he should be forced to initialise to some sane value. I know, most compilers do warn, but anyway, this should be enforced on the language level.
  • Lambda functions. Seriously, std::for_each without lambdas is no fun, and this kind of stuff is just getting more and more important as it is extremely easy to make this run in parallel. There are even two proposals in the pipeline but it seems they won't make it into C++0x.
  • Automatic type deduction -- for (var it = coolMap.begin (), end = coolMap.end (); ... is so much nicer than for (std::map<std::string, std::pair<std::vector<std::string>, int> >::const_iterator ... At least, this seems to be "in" for C++0x.
  • Getting rid of legacy features. Things like static void foo () should be done with namespace { void foo (); }; there is no need to support the legacy stuff if someone is writing against a C++0x compiler. Let's call this the strict mode or something like that, but stop C++ compilers from parsing 20-year-old deprecated stuff. Same for sstream and strstream: if someone uses the deprecated one with a C++0x compiler, issue an error and refuse to compile it.
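To make the lambda and type deduction items concrete: both features did end up in the standard that was finally published as C++11, with auto as the keyword instead of var. The sketch below assumes such a compiler; CoolMap and the three function names are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, std::pair<std::vector<std::string>, int> > CoolMap;

// Sums the int of every entry; auto replaces the spelled-out const_iterator.
int sum_with_auto(const CoolMap& coolMap)
{
    int sum = 0;
    for (auto it = coolMap.begin(), end = coolMap.end(); it != end; ++it)
        sum += it->second.second;
    return sum;
}

// The same sum with std::for_each and a lambda instead of a hand-written functor.
int sum_with_lambda(const CoolMap& coolMap)
{
    int sum = 0;
    std::for_each(coolMap.begin(), coolMap.end(),
                  [&sum](const CoolMap::value_type& entry) {
                      sum += entry.second.second;
                  });
    return sum;
}

// Usage check (illustrative only).
inline void check_sums()
{
    CoolMap coolMap;
    coolMap["a"].second = 2;
    coolMap["b"].second = 5;
    assert(sum_with_auto(coolMap) == 7);
    assert(sum_with_lambda(coolMap) == 7);
}
```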

Runtime changes

  • A usable RTTI system. C++'s RTTI system is totally rudimentary and not practically useful. Just take a look at Qt, MFC, Unreal Engine, wxWidgets -- all of them roll their own RTTI instead of using the C++ RTTI system.
  • Extended memory allocation functions. New and delete is fine, but I'd like to have the ability to redirect all new calls to my custom allocator (easily! Something like std::set_new_handler ()). Same for array new. Same goes for allocation hints, hinting that some memory is going to be permanent is an important clue for the allocator, as it might be put into a special heap.
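For comparison, the mechanism the language already offers is replacing the global operator new and operator delete. It is a sketch of what exists, not of the wished-for std::set_new_handler-style hook: there is no way to pass hints, and the replacement is fixed at link time. The counter and the check function are hypothetical, and malloc/free stand in for a real custom heap.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter; a real allocator would route to its own heaps instead.
static std::atomic<std::size_t> g_allocation_count(0);

// Replacing the global operator new redirects every plain new in the program;
// the default operator new[] forwards to it as well.
void* operator new(std::size_t size)
{
    ++g_allocation_count;
    if (void* memory = std::malloc(size ? size : 1))
        return memory;
    throw std::bad_alloc();
}

void operator delete(void* memory) noexcept
{
    std::free(memory);
}

// Sized variant that newer compilers may call; it releases memory the same way.
void operator delete(void* memory, std::size_t) noexcept
{
    std::free(memory);
}

// Usage check (illustrative only): a direct call goes through the replacement.
inline void check_allocation_counter()
{
    const std::size_t before = g_allocation_count.load();
    void* memory = ::operator new(16);
    assert(g_allocation_count.load() == before + 1);
    ::operator delete(memory);
}
```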

Language changes

  • Arrays should be of type std::array by default, that is, if I write char [6] it should be actually the same as std::array<char, 6>, so I can use .begin () etc. on it. This is just a consistency thing, but anyway, it surely wouldn't hurt.
  • An enumerable concept is needed. This goes hand in hand with concepts and arrays being of an array type, as one can easily loop over them using a for each keyword. Boost already has a FOR_EACH macro, but language level support is needed here so people can easily use the standard containers.
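As a rough illustration of what these two items are after, here is how it reads with a compiler that has std::array and a range-based for loop (both are part of C++11). Built-in arrays are still a separate type, but the loop treats both alike. count_non_zero and the check function are made-up names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Counts the non-zero characters in anything the loop can enumerate.
template <typename Range>
std::size_t count_non_zero(const Range& range)
{
    std::size_t count = 0;
    for (char c : range)  // works for std::array and for built-in arrays
        if (c != '\0')
            ++count;
    return count;
}

// Usage check (illustrative only).
inline void check_count_non_zero()
{
    std::array<char, 6> wrapped = {{'h', 'e', 'l', 'l', 'o', '\0'}};
    char raw[6] = "hello";

    assert(wrapped.end() - wrapped.begin() == 6);  // .begin () etc. on the array
    assert(count_non_zero(wrapped) == 5);
    assert(count_non_zero(raw) == 5);
}
```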

Library changes

  • Library overhaul with performance in mind. Take a look at EASTL or something like it; the default C++ STL implementations are sometimes just crap. This is getting better, but the standard should also enforce some performance constraints, like "empty containers must not allocate memory".
  • The new style iterators from Boost should be the default, they are easy to use -- more people will be willing to write an iterator if it is easy.
  • Proper string handling. No, std::string, std::wstring etc. is not proper string handling in the 21st century. The default should be probably UTF-8 based, with a working system to convert between various encodings -- like ICU.
  • Proper stream handling. Operating systems write data byte-oriented, not char ... This is something that makes me wonder all the time, why the heck are the iostreams char oriented? I'd much rather have some basic input_stream which can read bytes, same for output, and layer the functionality over this, i.e. input_stream::ptr is (new file_input_stream ("filename")); char_stream<char> cs (is); Makes many things much easier, and looking at Java and C#, this seems to be working in practice, too.
  • Thread creation, synchronisation. There is some thread stuff going into C++0x, but it is looking half-serious only. For example, no way to specify priorities as it is not really portable -- heck, something like priority_hint would have done it, and I wonder how someone is supposed to implement a background thread without priorities on Windows. Ok, an embedded platform might not have priorities but this should not cripple everyone.
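The layered stream idea from the list could be sketched like this. The names input_stream and char_stream follow the snippet in the text, but everything else is an assumption: the interface is invented, memory_input_stream stands in for file_input_stream so the example needs no file, and the character layer simply maps one byte to one character instead of decoding a real encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical byte-oriented base: all it knows about is raw bytes.
class input_stream
{
public:
    typedef std::shared_ptr<input_stream> ptr;
    virtual ~input_stream() {}
    // Reads up to size bytes into buffer, returns the number actually read.
    virtual std::size_t read(unsigned char* buffer, std::size_t size) = 0;
};

// In-memory source, standing in for the file_input_stream from the text.
class memory_input_stream : public input_stream
{
public:
    explicit memory_input_stream(const std::vector<unsigned char>& bytes)
        : bytes_(bytes), position_(0) {}

    std::size_t read(unsigned char* buffer, std::size_t size) override
    {
        std::size_t count = 0;
        while (count < size && position_ < bytes_.size())
            buffer[count++] = bytes_[position_++];
        return count;
    }

private:
    std::vector<unsigned char> bytes_;
    std::size_t position_;
};

// Character layer on top: turns bytes into CharT, one byte per character.
// A real implementation would decode an encoding such as UTF-8 here.
template <typename CharT>
class char_stream
{
public:
    explicit char_stream(input_stream::ptr source) : source_(source) {}

    std::basic_string<CharT> read_all()
    {
        std::basic_string<CharT> text;
        unsigned char byte;
        while (source_->read(&byte, 1) == 1)
            text.push_back(static_cast<CharT>(byte));
        return text;
    }

private:
    input_stream::ptr source_;
};

// Usage check (illustrative only), mirroring the two lines from the text.
inline void check_streams()
{
    std::vector<unsigned char> bytes = {'h', 'i', '!'};
    input_stream::ptr is(new memory_input_stream(bytes));
    char_stream<char> cs(is);
    assert(cs.read_all() == "hi!");
}
```

Because the character layer only talks to the byte interface, swapping the source for a file, a socket or a compressed stream would not touch it.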

There are changes coming that go in the right direction (atomics, regular expressions, smart pointers -- but not smart arrays, ugh -- and finally hash maps, a working bind, function objects), but the pace is too slow to remain competitive. C++ was successful in the past because it used to be extremely backwards compatible with C, but this is no longer the case, with C99 and C++ going slightly different routes. Moreover, most people are stuck with either C++ or C, and the interop part of C++ via C (i.e., unmangled names via extern "C") is good and does not need to be changed at all. The main problem for C++ is the lack of a driving force that can push a standard: changing something in this language is difficult, and it really needs a cleanup so that compilers become maintainable and robust and the language itself becomes easier to use. Even 5 years after C++03, there is still just one compiler that supports all of C++, and that is the EDG frontend. Seriously, if a language gets so complicated that it becomes nearly impossible to write a compiler that supports all of it, something is going really, really wrong.

Vista x64 SP1

Believe it or not, Microsoft is distributing SP1 for Windows Vista x64 via Microsoft Update. I just installed it, everything seems to work fine, and it really looks like the final release. The download was 120 MiB, and installation took approximately 30 min.

Vista SP1 Final