File I/O performance
Over the weekend I’ve been implementing some more low-level classes for the package system. The new archives, which work directly on the filesystem, are around 20 times slower than the memory-based ones, so it’s very likely I’ll stick with the memory archive approach: storing an object to memory first and streaming it to file later.
Write performance
My first test consists of writing 100,000 3-float vectors to an archive. This results in 300,000 raw write calls of 4 bytes each on the underlying stream.
MemoryStream: 0.22 s
FileStream: 3.11 s
The FileStream timings get much worse when some other disk I/O is going on, sometimes by as much as 50%! All these timings are from a fully optimized release build running on my machine with 2 GiB RAM and an S-ATA2 RAID1 disk array. Note that the resulting file is smaller than the HDD cache, let alone the system cache, so the OS should be able to buffer the complete file instead of writing 4-byte chunks.
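For readers who want to try a similar measurement, here is a minimal sketch of the write test. The `MemorySink` and `FileSink` types and the `timeSmallWrites` helper are hypothetical stand-ins, not niven's actual `MemoryStream` and `FileStream` classes, which are not shown in this post. Note that `std::fwrite` is buffered by the C runtime, so the gap measured with this sketch will likely differ from the numbers above.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for a memory stream: appends raw bytes to a growable buffer.
struct MemorySink {
    std::vector<unsigned char> bytes;
    void write(const void* data, std::size_t size) {
        const auto* p = static_cast<const unsigned char*>(data);
        bytes.insert(bytes.end(), p, p + size);
    }
};

// Hypothetical stand-in for a file stream: forwards every call to the C runtime.
struct FileSink {
    std::FILE* file;
    explicit FileSink(const char* path) : file(std::fopen(path, "wb")) {}
    ~FileSink() { if (file) std::fclose(file); }
    FileSink(const FileSink&) = delete;
    FileSink& operator=(const FileSink&) = delete;
    void write(const void* data, std::size_t size) {
        if (file) std::fwrite(data, 1, size, file);
    }
};

// Writes `vectorCount` 3-float vectors as individual 4-byte writes and
// returns the elapsed wall-clock time in seconds.
template <typename Sink>
double timeSmallWrites(Sink& sink, std::size_t vectorCount) {
    const float components[3] = {1.0f, 2.0f, 3.0f};
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < vectorCount; ++i) {
        for (float c : components) {
            sink.write(&c, sizeof c);  // one raw 4-byte write per component
        }
    }
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

Calling `timeSmallWrites(sink, 100000)` once with a `MemorySink` and once with a `FileSink` mirrors the 300,000 small writes described above.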
Read performance
Reading in the same file, again 300,000 raw reads of 4 bytes each.
MemoryStream: 0.09 s
FileStream: 1.22 s
StreamView (FileStream): 1.52 s
Again, the memory stream outperforms the filesystem by more than a factor of 10, although the file was probably in the system cache already. Anyway, as bad as this sounds, 1 second for loading 100,000 micro-objects is not that bad, and most objects are going to be rather huge (textures, models, etc.). Reading is also not that sensitive to other processes; it becomes at most around 10% slower.
The StreamView code performs additional checks on every low-level read/write, which slows down the whole process by roughly another 20%. This is not much if you look at what it really does: in every single read call it checks for a valid stream, checks the stream mode, gets the current position in the stream, checks if the next read is valid and passes the read call over to its contained stream. All these functions throw exceptions on error, and this only adds 20% overhead!
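To make the per-call checks concrete, here is a rough sketch of what such a checked read could look like. The `Stream` interface, the `BufferStream` helper and the windowed `StreamView` below are assumptions made for illustration. In particular, treating the view as a window of `length` bytes starting at `begin` is a guess at what "the next read is valid" means, and the real niven classes may look quite different.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical minimal stream interface (not niven's actual API).
class Stream {
public:
    virtual ~Stream() = default;
    virtual bool canRead() const = 0;
    virtual std::size_t position() const = 0;
    virtual void read(void* dst, std::size_t size) = 0;
};

// Simple in-memory stream used to exercise the view.
class BufferStream : public Stream {
public:
    explicit BufferStream(std::vector<unsigned char> data) : data_(std::move(data)) {}
    bool canRead() const override { return true; }
    std::size_t position() const override { return pos_; }
    void read(void* dst, std::size_t size) override {
        if (size > data_.size() - pos_) throw std::out_of_range("read past end of buffer");
        std::memcpy(dst, data_.data() + pos_, size);
        pos_ += size;
    }
private:
    std::vector<unsigned char> data_;
    std::size_t pos_ = 0;
};

// Restricts reads to the window [begin, begin + length) of the wrapped stream.
class StreamView {
public:
    StreamView(Stream* stream, std::size_t begin, std::size_t length)
        : stream_(stream), begin_(begin), end_(begin + length) {}

    void read(void* dst, std::size_t size) {
        if (!stream_) throw std::logic_error("no underlying stream");            // valid stream?
        if (!stream_->canRead()) throw std::logic_error("stream not readable");  // stream mode?
        const std::size_t pos = stream_->position();                             // current position
        if (pos < begin_ || pos > end_ || size > end_ - pos)                     // next read valid?
            throw std::out_of_range("read outside of view");
        stream_->read(dst, size);                                                // forward the call
    }
private:
    Stream* stream_;
    std::size_t begin_;
    std::size_t end_;
};
```

Each call does a few comparisons and virtual calls before forwarding, which fits the observation that the checks cost little compared to the read itself.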
Raw write performance
Something rather odd is the raw write performance. Here the file system wins because it does not need to copy the 1.2 MB of data into a temporary buffer. No matter how often I run this test, the memcpy code seems to take 0.02 s, although there is enough memory available.
MemoryStream: 0.02 s
FileStream: 0.00 s
Conclusion
What does this mean for niven? I’ll probably stick with the memory archives for storing data but use file streams when loading stuff. After all, the file system gets better the larger the chunks get, so I expect that it won’t matter any longer as soon as I’m loading megabyte-sized textures.