E3

It's E3 time again! Lots and lots of great new games incoming. Take a look at the trailers. Some amazing ones I've already seen: Killzone 2, Resident Evil 5, Unreal Tournament 3. I'm pretty sure there are many more, I just haven't gotten around to watching them yet. Also, most pages hosting them are under heavy load at the moment, so you need some patience ;) Feel free to comment if you have seen a particularly impressive game trailer.

Cache-aware programming

I've been working on a project today, and after the first implementation session I ran it through a profiler to see whether I had any obvious performance bottlenecks. That turned out not to be the case, but looking through the code, I saw some opportunities to reduce the working set size a bit and to partition the data so the CPU would work on a smaller part of it at a time. It took quite a while, but I got down to less than 0.000x misses per instruction (the x is there because the profiler only displays 0.000) for L1, L2 and TLB alike, giving a 0.001-0.002% performance penalty for the L1 data misses and 0.000-0.001% for the L2 misses. Some more tuning improved the branch prediction hit rate to 99.39% (originally, it was slightly below 99% due to the partitioning overhead), making my program 50% faster overall. Note that I didn't change the underlying algorithms, I just changed how the data is presented to the algorithmic kernel! So even on modern CPUs with large caches and rather small working sets (just a few times bigger than the cache), cache-aware code is still a win.
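To make the partitioning idea concrete, here is a minimal sketch in Python. It is not the code from my project: the function names, the `block_size` parameter and the assumption that every pass works on each element independently are all made up for illustration. The baseline sweeps the whole array once per pass, while the blocked version runs all passes on one small block before moving to the next, so the block can stay in cache.

```python
def process_whole(data, passes):
    """Baseline: each pass sweeps the full array before the next one starts."""
    out = list(data)
    for p in passes:
        for i in range(len(out)):
            out[i] = p(out[i])
    return out


def process_blocked(data, passes, block_size=4096):
    """Blocked: run every pass on one block before touching the next block.

    Only valid because each pass is assumed to be element-wise, i.e. it
    never reads a neighbouring element.
    """
    out = list(data)
    for start in range(0, len(out), block_size):
        end = min(start + block_size, len(out))
        for p in passes:
            for i in range(start, end):
                out[i] = p(out[i])
    return out
```

Both functions return the same result; only the order in which memory is touched differs. In pure Python the interpreter overhead hides the cache effect, so treat this as a picture of the access pattern rather than a benchmark. In a compiled language, `block_size` would be chosen so that one block fits comfortably into the L1 or L2 cache.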

War. War never changes ...

Bethesda Softworks, which took over the development of Fallout from Interplay, released the Fallout 3 teaser trailer today. It seems to be in the same spirit as Fallout 1 & 2 :) I was afraid they would change too much, but it has the same bleak mood with the same subtle humor (the music is "I don't want to set the world on fire", and it's coming from a "Radiation King" radio) and the same retro look. I hope they carry the original Fallout spirit over into 3D so Fallout 3 becomes an RPG milestone.

What's going on?

It's been quite some time since my last post - I've been really busy with various things. First of all, I had to work a bit more as I was falling behind schedule, but I'm back on track now. I haven't worked much on niven lately, but the new object management is in good shape. I polished the integrated math suite a bit; it's more flexible and generic than before and has a few more useful features. For university, I took some time to implement a few things - iterative solvers, FFT and LUP-Decomposition. Read on for more details.

  • Iterative solvers - solve equation systems by iterating from a first (guessed) solution. Sorted by difficulty ... actually, Richardson is the trickiest, as you need a correction factor smaller than 2 divided by the spectral radius of the matrix (if you think "huh?" - I had no idea either, but it turns out that the spectral radius is less than or equal to any matrix norm, so using a norm in its place gives a safe factor and it's not that difficult)
    • Richardson
    • Jacobi
    • Gauss-Seidel, normal and with successive over-relaxation (which did not change much on the test data)
  • FFT - fast polynomial multiplication by transforming between coefficient and point-value representation. Although I did a "by-the-book" implementation, I'm not satisfied at all with the quality - the Intel MKL offers much better numeric stability than the naive solution I implemented. I think this might come from the following issue: instead of computing each n-th root of unity directly, I start from the first one and keep multiplying by it, n times in total, so the rounding errors accumulate; I assume computing each root with a good complex exponential would be much more precise here.
  • LUP-Decomposition - fast and exact solving of linear equations, plus very fast determinants. I implemented it with pivoting so I can solve a wider range of problems and I get better numeric stability. This one was real fun, and it turned out to be simpler than expected.
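
As a small illustration of the Richardson point from the list above, here is a minimal sketch in Python (not the code I wrote for university). It assumes a symmetric positive definite matrix, for which the iteration converges when the correction factor is below 2 divided by the spectral radius. Since the spectral radius never exceeds the row-sum norm, 1 divided by that norm is a safe choice. The names `mat_vec`, `inf_norm` and `richardson` and the default tolerances are made up for this example.

```python
def mat_vec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def inf_norm(A):
    """Row-sum norm: largest sum of absolute values in any row."""
    return max(sum(abs(a) for a in row) for row in A)


def richardson(A, b, tol=1e-10, max_iter=10000):
    """Richardson iteration x <- x + omega * (b - A x).

    Assumes A is symmetric positive definite. The spectral radius is at
    most inf_norm(A), so omega = 1 / inf_norm(A) stays below 2 / rho(A).
    """
    omega = 1.0 / inf_norm(A)
    x = [0.0] * len(b)  # first (guessed) solution
    for _ in range(max_iter):
        residual = [bi - axi for bi, axi in zip(b, mat_vec(A, x))]
        if max(abs(r) for r in residual) < tol:
            break
        x = [xi + omega * ri for xi, ri in zip(x, residual)]
    return x
```

Calling `richardson([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` approaches the exact solution (1/11, 7/11). The norm-based factor is usually smaller than strictly necessary, so convergence is safe but not as fast as it could be.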

niven object management R16 working

Yeah, you read that right: Version 16 of niven's object management is done and works - better than expected. The long, long quest for the ideal object manager is over now. During this time, I have written a thread library and three garbage collectors, and gone through 20 different file formats, one minor release of TinyXML, a huge stack of research papers, 10 pages of design documents, approx. 200 hours of development and a truckload of Coke, but hey, it was really worth it, and the final system really does reflect what I have learned during that time :) I hope to be able to close the nivenCore development soon and move on to the really interesting part, the rendering engine. Stay tuned for the first teaser images during the next few months.