Just seen this: inside template-heavy code, Visual C++ 8 was able to
unroll a loop and insert a prefetchnta instruction to prefetch data -
first time I've seen this with VC++ 8. Well done, folks!
It even seems to work if the address of the prefetched byte is computed
via a static function call, because it adds the prefetch inside code
where all access happens via matrix (row, column), which in turn calls
a small function to compute the exact memory location. Now that is cool!
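
For anyone who wants to picture the setup, here is a minimal sketch of the kind of code I mean. It is not the original code: Matrix, offset and sum_all are made-up names, the prefetch distance of 16 elements is an arbitrary guess, and the explicit _mm_prefetch with _MM_HINT_NTA only stands in for the prefetchnta instruction that the compiler inserted on its own.

```cpp
#include <cstddef>
#include <vector>

// _mm_prefetch with _MM_HINT_NTA maps to the prefetchnta instruction on x86.
// On other targets the hint is compiled out so the sketch still builds.
#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>
#define SKETCH_PREFETCH_NTA(p) \
    _mm_prefetch(reinterpret_cast<const char*>(p), _MM_HINT_NTA)
#else
#define SKETCH_PREFETCH_NTA(p) ((void)0)
#endif

// Hypothetical matrix type: every element access goes through operator(),
// which calls a small static helper to compute the memory location.
template <typename T>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t columns)
        : rows_(rows), columns_(columns), data_(rows * columns) {}

    T& operator()(std::size_t row, std::size_t column) {
        return data_[offset(row, column, columns_)];
    }
    const T& operator()(std::size_t row, std::size_t column) const {
        return data_[offset(row, column, columns_)];
    }

    std::size_t rows() const { return rows_; }
    std::size_t columns() const { return columns_; }

private:
    // Row-major index computation (an assumption made for this sketch).
    static std::size_t offset(std::size_t row, std::size_t column,
                              std::size_t columns) {
        return row * columns + column;
    }

    std::size_t rows_;
    std::size_t columns_;
    std::vector<T> data_;
};

// Sums all elements. The manual prefetch shows by hand what the compiler
// did automatically; the distance of 16 elements is an arbitrary choice.
template <typename T>
T sum_all(const Matrix<T>& m) {
    const std::size_t distance = 16;
    T total = T();
    for (std::size_t r = 0; r < m.rows(); ++r) {
        for (std::size_t c = 0; c < m.columns(); ++c) {
            if (c + distance < m.columns()) {
                SKETCH_PREFETCH_NTA(&m(r, c + distance));
            }
            total += m(r, c);
        }
    }
    return total;
}
```

Whether a given compiler unrolls this loop or adds the prefetch by itself depends on the compiler version and optimization settings, so the only way to be sure is to check the generated assembly.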