Taking Linux seriously

During my last vacation, I did a small experiment at home: I installed Linux, copied over all the stuff from Windows that I use daily (e-mails, bookmarks, code, IM settings, etc.) and forced myself to use Linux for two weeks straight for everything. The last time I used Linux for a longer time was during my assisted texture assignment project while I was at INRIA. Back then, the choice of using Linux was mostly a pragmatic solution to work around the fact that I had a hard time getting my hands on a recent Windows with a recent Visual Studio. This time however, I've been evaluating Linux as a serious alternative to Windows. That is, I wanted to understand whether I can get work done on Linux (so I can develop productively at home on Linux) and whether Linux works fine for everyday usage.

Another reason to try switching completely is the fact that for a lot of computation tasks, I already use a Linux-based server machine at work, which outperforms my Windows-based desktop by ~30% (when I normalize for CPU count/speed.) Over the last weeks, I also spent a lot of time at work on my OpenGL back-end so I could test new graphics hardware features on Windows, bringing it up to feature parity with Direct3D. This would allow me to also properly evaluate the "work side", assuming that OpenGL would work fine on Linux.

Setting up

To ensure a fair comparison, I bought a new SSD and started the Linux experiment using Ubuntu 13.10. Installation went fine out-of-the-box and all my hardware was working right away, but I hit two annoying problems immediately:

  • I could only change my audio volume using alsamixer, but not using the UI
  • There was no way to set a per-monitor wallpaper with Unity

As I'm not too religious about my window manager, I simply switched to KDE, which solves both problems. However, installing KDE on top of an existing Ubuntu installation leaves you with a bloated system full of packages that no longer make much sense, so I ended up doing a clean installation with KDE.

Once set up, the first thing I noticed was that the default font rendering in KDE sucks. While Unity has very nice anti-aliasing set up by default, KDE ships with some poor defaults. Luckily, that's easy to change: by enabling sub-pixel anti-aliasing and setting the hinting mode to "slight", you get basically DirectWrite/OS X quality text on Linux as well.

Some minor issues with the default colour scheme and window decoration were fixed in no time. All that was left was to install the AMD binary drivers. The Ubuntu package management doesn't provide beta drivers, so I had to do this manually. This works just fine, but there is one minor caveat: to get the drivers to install OpenCL, you must not run the installer directly, but instead use it to generate packages for your distribution and install all three of them.

Developing

So let's get started with developing. I use CMake everywhere, so the first step was to get and build CMake from source to get the latest version. That worked right away, including the Qt UI. For the IDE, I'm using Qt Creator as well as Sublime Text. Funnily enough, this is pretty similar to what I use on Windows: debugging and building happen in Visual Studio, but some of the typing happens in Sublime Text. This is also where I hit the first major problem: on Ubuntu, debugging with Qt Creator does not work (the local variables window just remains empty), as the system gdb is linked against Python 3, but Qt Creator expects a "stock" gdb with Python 2. That's easy enough to solve (get gdb, build it from source) once you know it, but figuring this out took me a while, as there is no error message at all indicating that the system GDB is incompatible.

On the compiler side, I decided to go ahead and set up builds with both GCC and Clang. Ubuntu comes with recent versions, and as I was already building the code with GCC on Linux and Clang on Mac OS X, I was set up in no time. Build times are also better than on Windows, more so with Clang, which is roughly 30% faster than GCC in my testing.

Unfortunately, I quickly ran into another issue with debugging with GDB: Stepping into a dynamic library sometimes takes several minutes. LLDB doesn't have this problem, so I hope that either GDB gets fixed or Qt Creator gets support for LLDB, as I do have to sometimes switch to LLDB just to get the debugging done.

Let's move on to the graphics and OpenCL part: I'm using an AMD HD 7970 card at home, so I was prepared for the worst, but surprisingly, all of my OpenGL stuff worked right away. That is, it worked as long as I was using only OpenGL, as I can't get the OpenCL interop to work. I can create an interop context just fine, but mapping a texture or a buffer produces an invalid memory object while returning CL_SUCCESS at the same time. Unfortunately, I have similar trouble on Windows with AMD, OpenCL/OpenGL interop and the latest drivers, so I assume that something got broken in the 13.9/13.11 beta drivers. On Windows, the 13.4 drivers seem fine and the FirePro drivers are working as well.

The voxel rasterizer from the niven research framework running on Linux.

However, graphics on Linux is still not on par with Windows. Basically, the Linux OpenGL drivers didn't get the love that the Windows Direct3D drivers got. Performance is simply worse, and I also see extreme frame time variance which I don't have on Windows. That said, at least correctness seems fine, as all of my OpenGL code that works on Windows also works on Linux.

For profiling, I used the built-in Linux perf commands and CodeXL, and both get the job done. In particular, setting up perf is a no-brainer, and you can get useful data from it very quickly. I'm not sure how well it will work for profiling multi-threaded applications, though. Unfortunately, GPUPerfStudio2 doesn't work on Linux; for graphics profiling and debugging, I'm using apitrace right now.

Development itself isn't much different. One thing that is faster on Linux though is looking up the system API, as I'm much faster at typing man 2 open than switching to my browser, typing in CreateFile msdn, and waiting for the page to load.

Gaming, entertainment, and other stuff

If it were just for work, I'd be done here, but I'm using my machine at home for some occasional gaming, movie watching, photo work, office use and of course for browsing & e-mail. Let's start with what doesn't work: gaming. While DOTA 2 runs on Linux, it runs slowly, it lags, audio stutters, and overall it's just a bad experience. It's worse than when I tried it on Mac OS X. Nearly all the games I have don't support Linux at all. The notable exception besides DOTA was Metro: Last Light, which does work and looks just as good as on Windows, but is completely unplayable, as it slows down to a crawl every few seconds.

Movie watching (using Lovefilm) works surprisingly well if you like inception: there's a package which installs a specially packaged Wine with Firefox in order to run Silverlight. Energy efficiency looks different, but heck, it does work.

For photo retouching, Lightroom is still king, and Lightroom doesn't work on Linux. No idea why an application written primarily in Lua is so difficult to port, but for better or worse, I still have to boot Windows to develop my images. There are Linux alternatives, but they still lack the speed of Lightroom as well as its camera profiles. While lensfun does support an impressive list of lenses for distortion correction, vignetting is basically left to the user. Sure, this is not so important when working on a single image, but if you're bulk-correcting an import, these things do matter.

Everything else: it just works. There's nothing I really miss from Windows. I'm actually surprised how little difference there is for "normal" use. If I didn't do gaming and graphics development, all I would notice from switching to Linux is that the default fonts look different and that the start button has a different icon.

The verdict

I'm going to stick with Linux at home and try to get the issues with OpenCL/OpenGL interop sorted out. Either AMD will fix them, or I'll have to get an NVIDIA card again and check there, but I do believe that this is just a minor obstacle. For gaming I will still have to boot into Windows occasionally, but for everything else, Linux it is. The only downside is that I can't go right ahead and port everything to use the latest C++ 11 features, as I can't kill off Windows support in my research framework, at least not until graphics performance is the same or better on Linux. Oh, and before you ask, yes, this blog post was written 100% on Linux already :)

OpenCL & graphics: Are we there yet?

It's been a while since my last blog post on OpenCL and graphics interop. But before I take a look at the current state of OpenCL and graphics interop, I would like to motivate a bit why I even care about it.

Compute & graphics

Every time I start talking about OpenCL & graphics interop, the first reaction is basically "why don't you use OpenGL compute shaders or DirectCompute?" Well, one part of the answer is that when I started, OpenGL compute shaders weren't available (they have only recently been added to the AMD driver), and DirectCompute was horrible as well (the compiler took hours to compile a shader), but as it turned out, there are reasons which are much more important.

First and most importantly: I can use the same code with several graphics backends (D3D9, D3D10, D3D11, OpenGL). Once debugged, I can simply switch the graphics backend and everything remains the same. This is extremely valuable for me, as developing OpenCL kernels is still faster than debugging GLSL or HLSL shaders (more on this below.) I don't see any reason why I would want to develop code that already works great in OpenCL twice more, in HLSL and GLSL, and spend money & time on what is basically a source-to-source translation (and spend more money down the road, as I'd now have to maintain duplicated code.)

Second, I can decide whether to run the code on the GPU or on the CPU, which neither DirectCompute nor OpenGL compute shaders can provide. Two examples where this is useful:

  • My voxel raytracer is limited only by the size of memory on the device, and while my GPU has 4 GiB, my desktop has 24 GiB of memory. Once the data set size becomes too large for the GPU, I can transparently switch to the CPU and continue working with the same code (the only difference is that the interop texture is no longer mapped from OpenGL, but is an image created on the host instead.) Sure, it runs slower, but customers like it when it still works.
  • For iso-surface rendering, I have a pre-process kernel which extracts the surface from a volume. When the data is already present on the GPU, it makes sense to run it there, but when loading from disk, I can run the same code on the CPU to lower the amount of uploaded data. Same kernel, nearly identical host code.
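
To give an idea of how cheap this flexibility is on the host side, here is a minimal sketch of the device selection; the function name and the fallback logic are my own illustration rather than code from my framework, but the OpenCL calls are the standard ones:

    #include <CL/cl.h>

    // Pick a CPU or GPU device at runtime; the rest of the host code
    // (context, queue, kernels) stays exactly the same.
    cl_device_id SelectDevice (const bool preferCpu)
    {
        cl_platform_id platform = nullptr;
        clGetPlatformIDs (1, &platform, nullptr);

        const cl_device_type type = preferCpu ? CL_DEVICE_TYPE_CPU
                                              : CL_DEVICE_TYPE_GPU;

        cl_device_id device = nullptr;
        if (clGetDeviceIDs (platform, type, 1, &device, nullptr) != CL_SUCCESS) {
            // Fall back to whatever the platform offers, for instance when
            // no CPU runtime is installed
            clGetDeviceIDs (platform, CL_DEVICE_TYPE_DEFAULT, 1, &device, nullptr);
        }

        return device;
    }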

Moreover, the extension API is well designed and minimal. All you get is a few functions that give you access to every kind of graphics resource (textures, buffers), and then you can do what you want with them. That means only a few functions to learn, which are also the same for OpenGL and all Direct3D versions (except for the suffix.) If new texture formats or new resource formats are added, there is no need for new APIs. Note that OpenCL requires no new API functions on the OpenGL/Direct3D side at all. Interestingly, the API became even smaller with OpenCL 1.2, as there is only one function left for mapping any texture.
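
To illustrate how small that surface is, here is a minimal sketch of the OpenGL texture interop path with OpenCL 1.2; the variable names are placeholders, error handling is omitted, and the context is assumed to have been created with the GL sharing properties set:

    #include <CL/cl.h>
    #include <CL/cl_gl.h>
    #include <GL/gl.h>   // or your extension loader header, for GL_TEXTURE_2D

    // Wrap an existing OpenGL texture and run a kernel on it.
    void ProcessGLTexture (cl_context context, cl_command_queue queue,
                           GLuint glTexture)
    {
        cl_int error = CL_SUCCESS;

        // Since OpenCL 1.2, a single function maps any texture target
        cl_mem image = clCreateFromGLTexture (context, CL_MEM_WRITE_ONLY,
            GL_TEXTURE_2D, 0 /* mip level */, glTexture, &error);

        // Acquire it from OpenGL before any kernel touches it ...
        clEnqueueAcquireGLObjects (queue, 1, &image, 0, nullptr, nullptr);

        // ... enqueue the kernels writing to 'image' here ...

        // ... and hand it back to OpenGL once done
        clEnqueueReleaseGLObjects (queue, 1, &image, 0, nullptr, nullptr);
        clReleaseMemObject (image);
    }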

Finally, development is faster & the tooling is better. Even though I'm still waiting for a good debugger, I'm still more efficient when writing OpenCL code than when working with shaders. Typically, I start by writing OpenCL for the CPU until the code is correct. Once everything works fine, I switch to the GPU and optimize performance. This is much nicer than having to work on the GPU directly, where I still run into problems which will freeze the machine; this is a non-issue when developing on the CPU. This also covers graphics stuff, as I can simply dump the input textures once, debug on the CPU, and once it works, enable interop again, and I'm done. On the tool side, Intel & AMD provide nice kernel editors; in particular, AMD's kernel analyzer allows you to immediately see the generated ISA and get statistics on register usage. This comes in very handy when optimizing GPU kernels.

State of the interop nation

So where are we today, and what has improved since my last post? Let's take a look:

  • Depth buffer access: This is solved by cl_khr_depth_images, which is unfortunately not implemented by either AMD or NVIDIA. It's a very simple addition to the API which does not introduce any new functions; it just extends the texture mapping to support depth images as well. There's no corresponding extension for Direct3D, though. Basically, everything is specified and done, but the solution hasn't been shipped yet.
  • MSAA textures: Again, solved, by cl_khr_gl_msaa_sharing, but not shipped by either AMD or NVIDIA. This extension also doesn't add new functions; it just extends texture mapping support to MSAA images. Again, there is no Direct3D equivalent.
  • Mip-mapped textures: Solved in OpenCL 2.0 with cl_khr_mipmap_image. There is even an extension to write to individual mip-map levels. While there is no matching Direct3D extension, it's pretty easy to imagine that it'll be mostly identical.
  • Offline kernel compiler: Solved by SPIR, but not shipping yet. Again, a minimal addition to the API, though in this case, the vendors have significant amounts of work to do to actually generate portable SPIR.
  • Named kernel parameter access: Not solved yet, but I could work around it using OpenCL 1.2 APIs.

To sum it up, 3 out of 5 issues have been solved for OpenCL 1.x but not shipped widely. One can be solved using OpenCL 1.2, but unfortunately, NVIDIA is still shipping OpenCL 1.1 only. In OpenCL 2.0, mip-mapped images also get resolved, bringing it to 4 solved and one that can be worked around -- all we need at this point is for the vendors to ship the already specified extensions.

The current verdict is thus a bit better than a year ago, as we do have the extensions specified now, but shipping implementations are still lagging behind the specification. I still don't understand why things like MSAA & depth texture sharing are not exposed on AMD & NVIDIA, as this seems to be a minimal addition which would enable extremely efficient graphics interop -- finally, we could write a full-blown, Battlefield 4 style tiled deferred renderer using OpenCL only and reuse it across AMD, NVIDIA, Intel, OpenGL, Direct3D, and potentially mobile platforms as well (Sony is shipping OpenCL on their mobile phones!) Intel, on the other hand, is making good progress on OpenCL: as far as I know, they do expose the depth & MSAA sharing on their integrated graphics processors, and they have also started working on OpenCL libraries like the Intel Integrated Performance Primitives for OpenCL.

[Update]: cl_khr_mipmap_image solves the mip-mapped images problem. I somehow missed that completely when looking into the OpenCL 2.0 spec.

How to uninstall the Intel OpenCL SDK 2013 after upgrading to Windows 8.1

If you had the Intel OpenCL SDK 2013 installed before upgrading your machine to Windows 8.1, you'll notice that you cannot uninstall it after the update, as it will always fail with the following error message:

Intel® SDK for OpenCL Applications 2013 designed to work on Microsoft Windows Vista x64, Windows 7 x64, Windows 8 x64, Windows Server* 2008 R2 x64 operating systems only. The installer detected that you're trying to install the SDK on a different version. Aborting installation

Great. This prevents you from upgrading to the 2013 R2 release, which does support 8.1 -- if you simply run the 2013 R2 installer, it will fail when trying to remove the old one as well. So how can we solve this? First of all, you have to find the MSI file. I've downloaded the intel_sdk_for_ocl_applications_2013_x64_setup.exe and extracted it using 7zip, which gives me the Intel® SDK for OpenCL Applications 2013#x64#intel_sdk_for_ocl_applications_2013_x64.msi file, which I've renamed to sdk.msi to reduce typing. Now, uninstall with logging enabled using msiexec.exe /x sdk.msi /L*V "log.log".

If you open the log file, you'll find:

Action 09:57:30: LaunchConditions. Evaluating launch conditions
Action start 09:57:30: LaunchConditions.
Intel® SDK for OpenCL* Applications 2013 designed to work on Microsoft Windows Vista* x64, Windows* 7 x64, Windows* 8 x64, Windows Server* 2008 R2 x64 operating systems only. The installer detected that you're trying to install the SDK on a different version. Aborting installation

So the problem is a launch condition. Let's take a look at the installer database to find out which one it is. For this, you'll need to install the Windows SDK to get Orca, a tool for inspecting and modifying MSI packages. Orca is a separate installer which is bundled in the Windows SDK.

Once you have Orca installed, you can right-click the MSI file, choose "Edit with Orca", and then go to "LaunchCondition" on the left side. Lo and behold, here it is. This condition has to evaluate to true for the installation to proceed: ((VersionNT64 AND (((VersionNT64=600) AND (MsiNTProductType=1)) OR VersionNT64=601 OR VersionNT64=602)) OR (VersionNT AND (((VersionNT=600) AND (MsiNTProductType=1)) OR VersionNT=601 OR VersionNT=602))) -- and on Windows 8.1 it doesn't, so we get the error we see. Ok, so we have to get rid of it. Just right-click, remove the row, and you're done ... not so fast. Unfortunately, this is not the MSI file that is actually used during uninstall, as the MSI file has already been cached by the system, so we need to find out which MSI file is really used. Open up the log again and look for a line like this:

MSI (s) (F4:F8) [23:37:48:764]: Package we're running from ==> C:\Windows\Installer\6bcd9687.msi

That's the file we really need to modify. You'll have to run Orca as an administrator for this, and then you can open the file above and remove the launch condition. Now, you can start the uninstall, and voilà, it works.

Porting from DirectX11 to OpenGL 4.2: API mapping

Welcome to my Direct3D to OpenGL mapping cheat-sheet, which will hopefully help you get started with adding OpenGL support to your renderer. The hardest part for me during porting was finding out which OpenGL API corresponds to a specific Direct3D API call, and here is a write-down of what I found out & implemented in my rendering engine. If you find a mistake, please drop me a line so I can fix it!

Device creation & rendering contexts

In OpenGL, I go through the usual hoops: that is, I create an invisible window, query the extension functions on that, and then finally go on to create an OpenGL context that suits me. For extensions, I use glLoadGen, which is by far the easiest and safest way I have found to load OpenGL extensions.

I also follow the Direct3D split of a device and a device context. The device handles all resource creation, and the device context handles all state changes. As using multiple device contexts is not beneficial for performance, my devices only expose the "immediate" context. That is, in OpenGL, a context is just used to bundle the state-changing functions, while in Direct3D, it wraps the immediate device context.

Object creation

In OpenGL, everything is an unsigned integer. I wrap every object type into a class, just like in Direct3D.

Vertex and index buffers

These work similarly to Direct3D. Create a new buffer using glGenBuffers, bind it to either vertex storage (GL_ARRAY_BUFFER) or to index storage (GL_ELEMENT_ARRAY_BUFFER) and populate it using glBufferData.
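
As a minimal sketch (assuming a working 4.2 context and a loader such as glLoadGen; the function itself is purely illustrative), creating an index buffer looks like this; a vertex buffer is identical with GL_ARRAY_BUFFER as the target:

    #include <cstdint>
    #include <cstddef>

    GLuint CreateIndexBuffer (const std::uint32_t* indices, const std::size_t count)
    {
        GLuint buffer = 0;
        glGenBuffers (1, &buffer);
        glBindBuffer (GL_ELEMENT_ARRAY_BUFFER, buffer);
        glBufferData (GL_ELEMENT_ARRAY_BUFFER, count * sizeof (std::uint32_t),
            indices, GL_STATIC_DRAW);
        return buffer;
    }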

Buffer mapping

This works basically the same in OpenGL as in Direct3D; just make sure to use glMapBufferRange and not glMapBuffer. The former gives you better control over how the data is mapped and makes it easy to guarantee that no synchronization happens. With glMapBufferRange, you can mimic the Direct3D behaviour perfectly and with the same performance.
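
For example, the rough equivalent of a D3D11_MAP_WRITE_DISCARD update could look like this (my sketch, not the engine's actual code):

    #include <cstddef>
    #include <cstring>

    // Orphan the old contents and write new data, similar to MAP_WRITE_DISCARD.
    // For MAP_WRITE_NO_OVERWRITE-style updates, use GL_MAP_UNSYNCHRONIZED_BIT
    // instead of GL_MAP_INVALIDATE_BUFFER_BIT.
    void UpdateDynamicBuffer (const GLuint buffer, const void* data,
                              const std::size_t size)
    {
        glBindBuffer (GL_ARRAY_BUFFER, buffer);
        void* mapped = glMapBufferRange (GL_ARRAY_BUFFER, 0, size,
            GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
        std::memcpy (mapped, data, size);
        glUnmapBuffer (GL_ARRAY_BUFFER);
    }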

Rasterizer state

This maps directly to OpenGL, but it's split across several functions: glPolygonMode, glEnable/glDisable for things like culling, glCullFace, etc.
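
A small sketch of what applying such a state looks like; the struct is just my illustration of how the values could be grouped:

    struct RasterizerState    // illustrative only
    {
        GLenum fillMode  = GL_FILL;    // GL_FILL or GL_LINE
        bool   cullBack  = true;
        GLenum frontFace = GL_CCW;
    };

    void ApplyRasterizerState (const RasterizerState& state)
    {
        glPolygonMode (GL_FRONT_AND_BACK, state.fillMode);

        if (state.cullBack) {
            glEnable (GL_CULL_FACE);
            glCullFace (GL_BACK);
        } else {
            glDisable (GL_CULL_FACE);
        }

        glFrontFace (state.frontFace);
    }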

Depth/Stencil state

Similar to the rasterizer state, you need to use glEnable/Disable to set things like the depth test, and then glDepthMask, glDepthFunc, etc.
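
For instance, a typical "depth test on, depth writes on, less-equal" state (sketch only) maps to:

    // Equivalent of a D3D11_DEPTH_STENCIL_DESC with DepthEnable = TRUE,
    // DepthWriteMask = ALL and DepthFunc = LESS_EQUAL
    glEnable (GL_DEPTH_TEST);
    glDepthMask (GL_TRUE);
    glDepthFunc (GL_LEQUAL);

    // The stencil part works the same way
    glDisable (GL_STENCIL_TEST);   // or glEnable + glStencilFunc/glStencilOp/glStencilMask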

Blend state

And another state which is split across several functions. Here we're talking about glEnable/Disable for blending in general, then glBlendEquationi to set the blend equations, glColorMaski, glBlendFunci and glBlendColor. The functions with the i suffix allow you to set the blend state for each "blend unit", just as in Direct3D.
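
As a sketch, classic alpha blending on render target 0 (the index is the "blend unit") would be set up like this:

    // Standard alpha blending on draw buffer 0, writing to all channels
    glEnablei (GL_BLEND, 0);
    glBlendEquationi (0, GL_FUNC_ADD);
    glBlendFunci (0, GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glColorMaski (0, GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    // glBlendColor is only needed if the blend factors reference the constant colour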

Vertex layouts

I require a similar approach to Direct3D here. First of all, I create one vertex layout per vertex shader program. This allows me to query the location of all attributes using glGetAttribLocation and store them for the actual binding later.

At binding time, I bind the vertex buffer first, and then set the layout for it. I call glVertexAttribPointer (or glVertexAttribIPointer, if it is an integer type) followed by glEnableVertexAttribArray and glVertexAttribDivisor to handle per-instance data. Setting the layout after the vertex buffer is bound allows me to handle draw-call specific strides as well. For example, I sometimes render with a stride that is a multiple of the vertex size to skip data, which has to be specified using glVertexAttribPointer (unlike in Direct3D, where this is a part of the actual draw call.)
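
Put together, the binding path looks roughly like the following sketch; the attribute description struct is my own placeholder for whatever the layout stores:

    #include <cstdint>

    struct AttributeDesc    // illustrative placeholder
    {
        GLint   location;     // from glGetAttribLocation at layout creation time
        GLint   components;   // 1..4
        GLenum  type;         // GL_FLOAT, GL_INT, ...
        bool    isInteger;
        GLsizei offset;       // offset within the vertex
        GLuint  divisor;      // 0 = per-vertex, 1 = per-instance
    };

    void BindVertexBuffer (const GLuint buffer, const GLsizei stride,
                           const AttributeDesc* attributes, const int count)
    {
        glBindBuffer (GL_ARRAY_BUFFER, buffer);

        for (int i = 0; i < count; ++i) {
            const AttributeDesc& a = attributes [i];
            const void* offset =
                reinterpret_cast<const void*> (static_cast<std::uintptr_t> (a.offset));

            if (a.isInteger) {
                glVertexAttribIPointer (a.location, a.components, a.type,
                    stride, offset);
            } else {
                glVertexAttribPointer (a.location, a.components, a.type,
                    GL_FALSE, stride, offset);
            }

            glEnableVertexAttribArray (a.location);
            glVertexAttribDivisor (a.location, a.divisor);
        }
    }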

The better solution here is to use ARB_vertex_attrib_binding, which would map directly to a vertex layout in Direct3D parlance and which does not require lots of function calls per buffer. I'm not sure how this interacts with custom vertex strides, though.

Draw calls

That's pretty simple once the layouts are bound, as the stride handling already happened there. After that, you just pick the OpenGL function which maps to the Direct3D equivalent; a rough sketch of the correspondence is shown below.
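
This is the mapping I'd assume for the common cases (GL_TRIANGLES and 32-bit indices assumed; the function names simply mirror the Direct3D calls for clarity):

    #include <cstdint>
    #include <cstddef>

    // Draw (vertexCount, startVertex)
    void Draw (const GLsizei vertexCount, const GLint startVertex)
    {
        glDrawArrays (GL_TRIANGLES, startVertex, vertexCount);
    }

    // DrawIndexed (indexCount, startIndex, baseVertex)
    void DrawIndexed (const GLsizei indexCount, const std::size_t startIndex,
                      const GLint baseVertex)
    {
        glDrawElementsBaseVertex (GL_TRIANGLES, indexCount, GL_UNSIGNED_INT,
            reinterpret_cast<const void*> (startIndex * sizeof (std::uint32_t)),
            baseVertex);
    }

    // DrawIndexedInstanced (indexCount, instanceCount, 0, baseVertex, 0)
    void DrawIndexedInstanced (const GLsizei indexCount, const GLsizei instanceCount,
                               const GLint baseVertex)
    {
        glDrawElementsInstancedBaseVertex (GL_TRIANGLES, indexCount, GL_UNSIGNED_INT,
            nullptr, instanceCount, baseVertex);
    }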

Textures & samplers

First, storing texture data. Currently, I use glTexImage2D and glCompressedTexImage2D for each mip-map individually. The only problem here is handling the internal format, format and type for OpenGL -- I store them along with the texture, as they are all needed at some point. Using glTexImage2D is however not the best way to define texture storage, as it allows you to resize a texture later on, which is something Direct3D doesn't. The Direct3D behaviour can be obtained in OpenGL using the glTexStorage2D function, which allocates and fixes the texture storage and afterwards only allows you to upload new data.
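
A sketch of the immutable-storage path (RGBA8 with a full mip chain; the mip count computation is simplified):

    #include <algorithm>
    #include <cmath>

    GLuint CreateTexture2D (const GLsizei width, const GLsizei height,
                            const void* topLevelPixels)
    {
        GLuint texture = 0;
        glGenTextures (1, &texture);
        glBindTexture (GL_TEXTURE_2D, texture);

        const GLsizei levels = 1 + static_cast<GLsizei> (
            std::floor (std::log2 (std::max (width, height))));

        // Allocate and fix the storage ...
        glTexStorage2D (GL_TEXTURE_2D, levels, GL_RGBA8, width, height);
        // ... afterwards, only uploads are possible
        glTexSubImage2D (GL_TEXTURE_2D, 0, 0, 0, width, height,
            GL_RGBA, GL_UNSIGNED_BYTE, topLevelPixels);

        return texture;
    }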

Uploading and downloading data is the next part. For a simple update (where I use UpdateSubresource in Direct3D), I simply replace all image data using glTexSubImage2D. For mapping, I allocate a temporary buffer, and on unmap, I call glTexImage2D to replace the storage. I'm not sure if this is the recommended solution, but it works and allows for the same host code as Direct3D.

Binding textures and samplers is a more involved topic that I have previously blogged about in more detail. It boils down to statically assigning texture slots to shaders, and manually binding them to samplers and textures. I simply chose to add a new #pragma to the shader source code which I handle in my shader preprocessor to figure out which texture to bind to which slot, and which sampler to bind. On the Direct3D side, this requires me to use numbered samplers, to allow the host & shader code to be as similar as possible.
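
The per-slot binding itself is then just the standard calls; a minimal sketch (the slot value is whatever the #pragma assigned, a 2D texture is assumed, and setting the sampler uniform to the same unit happens once after linking):

    // Bind a texture and a sampler object to a fixed texture unit.
    // The matching GLSL sampler uniform is set once after linking, e.g.:
    //   glProgramUniform1i (program,
    //       glGetUniformLocation (program, "DiffuseTexture"), slot);
    void BindTexture (const GLuint slot, const GLuint texture, const GLuint sampler)
    {
        glActiveTexture (GL_TEXTURE0 + slot);
        glBindTexture (GL_TEXTURE_2D, texture);
        glBindSampler (slot, sampler);
    }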

Texture buffers work just like normal buffers in OpenGL, but you have to associate a texture with your texture buffer. That is, you create and populate a normal buffer first, using glBindBuffer with GL_TEXTURE_BUFFER as the target, and then, with this buffer bound, you attach it to a buffer texture using glTexBuffer.
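
In code, the whole dance is only a handful of calls (sketch; RGBA32F data assumed):

    #include <cstddef>

    void CreateTextureBuffer (const void* data, const std::size_t size,
                              GLuint* outBuffer, GLuint* outTexture)
    {
        // The backing store is an ordinary buffer object
        glGenBuffers (1, outBuffer);
        glBindBuffer (GL_TEXTURE_BUFFER, *outBuffer);
        glBufferData (GL_TEXTURE_BUFFER, size, data, GL_STATIC_DRAW);

        // The texture merely references the buffer's data store
        glGenTextures (1, outTexture);
        glBindTexture (GL_TEXTURE_BUFFER, *outTexture);
        glTexBuffer (GL_TEXTURE_BUFFER, GL_RGBA32F, *outBuffer);
    }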

Constant buffers

This maps to uniform buffers in OpenGL. One major difference is where global variables end up: in Direct3D, they are put into a special constant buffer called $Globals; in OpenGL, they have to be set directly. I added special-case handling for global variables to shader programs; in OpenGL, they set the variables directly, and in Direct3D, globals are set through a "hidden" constant buffer which is only uploaded when the shader is actually bound.

The nice thing about OpenGL is that it gives you binding of sub-parts of a buffer for free. Instead of using glBindBufferBase to bind the complete constant buffer, you simply use glBindBufferRange -- no need to fiddle around with different device context versions as in Direct3D.
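
For example (sketch; the only catch is that the offset must be a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT):

    #include <cassert>

    // Bind a sub-range of a large uniform buffer to binding point 'index',
    // roughly what the *SetConstantBuffers1 calls do in Direct3D 11.1
    void BindConstantBufferRange (const GLuint index, const GLuint buffer,
                                  const GLintptr offset, const GLsizeiptr size)
    {
        GLint alignment = 0;
        glGetIntegerv (GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, &alignment);
        assert (offset % alignment == 0);

        glBindBufferRange (GL_UNIFORM_BUFFER, index, buffer, offset, size);
    }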

Shaders

I use the separate shader programs extension to handle this. Basically, I have a pipeline bound with all stages set, and when a shader program is bound, I use glUseProgramStages to set it to its correct slot. The only minor difference here is that I don't use glCreateShaderProgram, but instead do the steps manually. This allows me to set the binary shader program hint (GL_PROGRAM_BINARY_RETRIEVABLE_HINT), which you cannot do otherwise. Oh, and I grab the shader info log manually as well, as there is no way from client code to append the shader info log to the program info log.
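
A condensed sketch of that manual path (error handling and log retrieval trimmed down to comments):

    // Create a separable program from a single stage, with the binary
    // retrievability hint set before linking -- the part glCreateShaderProgram
    // doesn't let you do.
    GLuint CreateSeparableProgram (const GLenum stage, const char* source)
    {
        const GLuint shader = glCreateShader (stage);
        glShaderSource (shader, 1, &source, nullptr);
        glCompileShader (shader);
        // ... grab the shader info log here and append it to your own log ...

        const GLuint program = glCreateProgram ();
        glProgramParameteri (program, GL_PROGRAM_SEPARABLE, GL_TRUE);
        glProgramParameteri (program, GL_PROGRAM_BINARY_RETRIEVABLE_HINT, GL_TRUE);
        glAttachShader (program, shader);
        glLinkProgram (program);
        glDetachShader (program, shader);
        glDeleteShader (shader);

        return program;
    }

    // Usage: attach the program to a pipeline object instead of glUseProgram
    //   glUseProgramStages (pipeline, GL_VERTEX_SHADER_BIT, vertexProgram);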

For shader reflection, the API is very similar. First, you query how many constant buffers and uniforms a program has using glGetProgramiv. Then, you can use glGetActiveUniform to query a global variable, and glGetActiveUniformBlockiv together with glGetActiveUniformBlockName to query everything about a buffer.
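
A sketch of the block enumeration side (the uniform side via glGetActiveUniform is analogous):

    // Enumerate the active uniform blocks ("constant buffers") of a program
    void DumpUniformBlocks (const GLuint program)
    {
        GLint blockCount = 0;
        glGetProgramiv (program, GL_ACTIVE_UNIFORM_BLOCKS, &blockCount);

        for (GLint i = 0; i < blockCount; ++i) {
            char name [256] = {};
            glGetActiveUniformBlockName (program, i, sizeof (name), nullptr, name);

            GLint dataSize = 0, memberCount = 0;
            glGetActiveUniformBlockiv (program, i,
                GL_UNIFORM_BLOCK_DATA_SIZE, &dataSize);
            glGetActiveUniformBlockiv (program, i,
                GL_UNIFORM_BLOCK_ACTIVE_UNIFORMS, &memberCount);

            // 'name', 'dataSize' and 'memberCount' mirror what you would get
            // from ID3D11ShaderReflection on the Direct3D side
        }
    }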

Unordered access views

These are called image load/store in OpenGL. You can take a normal texture and bind it to an image unit using glBindImageTexture. In the shader, you have a new data type called image2D or imageBuffer, which is the equivalent of an unordered access view.
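
A sketch of the host side; the GLSL declaration in the comment is what the shader would contain:

    // Bind level 0 of a texture as a writable image at unit 0 -- the rough
    // equivalent of binding an unordered access view.
    //
    // GLSL side:
    //   layout (binding = 0, rgba8) writeonly uniform image2D outputImage;
    //   ...
    //   imageStore (outputImage, coordinate, value);
    void BindAsImage (const GLuint texture)
    {
        glBindImageTexture (0 /* unit */, texture, 0 /* level */,
            GL_FALSE /* layered */, 0 /* layer */, GL_WRITE_ONLY, GL_RGBA8);
    }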

Acknowledgements

That's it. What I found super-helpful during porting was the OpenGL wiki and the 8th edition of the OpenGL programming guide. Moreover, thanks to the following people (in no particular order): Johan Andersson of DICE fame who knows the performance of every Direct3D API call, Aras Pranckevičius, graphics guru at Unity, Christophe Riccio, who has used every OpenGL API call, and Graham Sellers, who has probably implemented every OpenGL API call.

Programmer productivity

I'm a graphics researcher, but I'm also a programmer, and as such, I'm programming most of my day. The question is: how productive am I, and what factors influence this? By productivity, I mean the number of completed tasks per day and the time spent working on completing tasks compared to "unproductive" time like answering e-mails, reading a web page or talking with an office colleague.

As a programmer, measuring is of course the right approach. What makes this slightly difficult is that there is no simple measurement for productivity, like code commits or number of lines written. Instead, I logged what I did every day for the last 6 months. Every morning, I open up a new text file and write down how I spend my time that day. If I end up talking with a colleague for half an hour, this gets noted. If I write three e-mails, it gets noted. If I improve the performance of my raytracer by 3x, well, you know the drill.

Over time, I found that there are lots of different reasons why a day turns out to be unproductive, but there are two patterns which indicate a productive day.

Finishing lots of small stuff

The one category of productive days is when I finish lots and lots of small tasks. For instance, I manage to reply to the five e-mails waiting in my inbox, get those three small bugs fixed, and finish my expense report for the last trip, all before lunch break. These tasks have a few things in common:

  • They are not really urgent. Sure, if you are the one waiting for an e-mail reply from me, you might think otherwise, but heck, if you wait a day longer the world isn't going to end.
  • They usually require little intellectual effort: That is, you have high skill in that particular area and the task provides little to no challenge, making these tasks boring. Or the task is mostly time-consuming, like installing some SDK and setting up the debugging environment.
  • If they need intellectual effort, then the tasks have some frustrating component to them: Bugs which require elaborate setup before you can actually debug them; code refactoring which will need touching lots of files, etc.

So why do I end up having productive days when solving lots of these really stupid or frustrating tasks, even though each of them requires a context switch and usually a bit of preparation time?

It turns out that on the days where I wind up solving a lot of them, I usually have a list of the tasks to do. That is, all the stupid/frustrating bugs are tracked, so I can see my progress, the e-mails are all lined up in my inbox and most importantly, the tasks are actually finished when I finish them. That is, there won't be an immediate follow-up task. And finally, the scope of these tasks is really clear. If these circumstances come together, I can crunch through them really fast and be productive even if I get interrupted.

Flow

The other productive days can be simply described as being in the "flow", or working on very few tasks for longer times. Optimizing your BVH traversal kernel simply needs a few hours of work. In order to be productive, I need long uninterrupted chunks of time; ideally, I'll start with anything urgent in the morning so I have nothing else to do afterwards. It's interesting to see that I typically tend to delay anything coming up during such days until either shortly before I leave work or the next morning.

I'm also much more likely to continue working on such days from home, either to really finish the task or to solve related problems. For instance, if I hit a non-critical issue, I just delay it, and in the evening, I sit down to solve it so that on the next day, I can focus solely on the core problem again.

On such days, you might also see me having a hard time killing "dead" time. Chances are high that I'll finish some large task at 11.45, and I'll have no clue what to do until 12.00 (lunch break), as 15 minutes is too short to start working on the next task. I haven't found a satisfactory solution to this yet. Most often, I spend the time reading some paper or blog online, so it's not completely wasted.

Bad days

All other days are bad days. The number one problem is of course interruptions; the worst ones are those that need follow-up action. For example, I send back some required paper work, and once I get the reply, I will have to quickly do some other paper work/call someone, etc. This blocks me from starting work on larger tasks, as I need uninterrupted time for them. You might think that this is one of the days for the small tasks, but it turns out that I might not have enough small tasks piled up to fill it up effectively. Or, worse, several of the small tasks have dependencies, so I can't schedule them the way I want.

There are also external reasons why a day might suck, such as traffic jams, getting to bed too late, eating the wrong stuff, etc. Two in particular are detrimental to productivity:

  • Lack of sleep: Your day is going to suck, get over it. Go home earlier, go for a walk, and get some sleep!
  • Headache: You overdid it the last day or you slept badly. Try to focus on easy tasks, take more breaks, go for a walk.

Otherwise, there's only one good solution: focus on getting things done instead of "working". As long as there are not too many interruptions, this usually works. What I have learned though is not to worry too much about bad days. If you have them logged, you can check how many you had and take corrective measures. For instance, you'll just delay that e-mail until tomorrow, together with three other small tasks that you wanted to do, to get a longer uninterrupted chunk today. Just make sure to note these things down so you are productive the next day (because you can see progress) and today (so they don't bother you all the time.) Or, go home, do something else, and sit down in the evening to churn through the most boring stuff and go to bed afterwards.

Programming task observations

I have also done a few observations over the last few months:

  • Simple tasks take either less than 5 minutes or between 30-60 minutes. If the task is really simple (fix a typo), it can be done nearly immediately, but if the task is merely simple (fix the integer ceil-division routine to work with negative numbers), you will spend half an hour to make sure that the documentation is updated, that you have written the right unit tests, and that you have updated all the right places in the code.
  • Complex tasks require wasted work: Adding rendering tests is a complex task, but putting them into a single test library which turns out to be a dead end due to excessive compile times is an additional time sink you have to accept. While it takes some experience to see dead ends right away, often you have to do some extra work to make sure that you solved the complex task correctly.
  • Putting in the hours: Programming takes time. There's no kidding yourself that you can skip work by doing something really fast. Writing all these functions, documentation and unit tests requires you to spend hours. While there are days where you only have one commit with just a few lines of code changed, you should typically commit lots of lines of new or improved code every day. Each new feature means more code, tests and documentation. Even most refactorings which result in a net reduction usually require some new code and then changes in lots of places. All of this takes a lot of time.
  • Seemingly stupid tasks like building your code base from scratch on another machine have a high risk of taking longer than expected, due to obscure problems. These tasks either take exactly as long as you have planned for, or much longer, but they are never faster. I found it best to do those things on Friday.

So much for today! I'm curious to hear how you track your productivity, and whether you have spotted patterns similar to mine.