Work is all about predictability

As a programmer, you might get home and continue tinkering on your home projects -- guess what, I bet it very much looks like what you do at work. For me at least, I do the same stuff at home as I do at work; I even run my own bug tracker and wiki at home on my own server, manage releases of projects, write documentation and take care of user feedback.

Recently, it dawned upon me that the work bit in what you do at work is only about one thing: Predictability. If you think about it, what's the main difference between a bug report you send me at work compared to one for my private projects? It's the fact that at work, I know I will have time to look at it and I can give you an estimate when it'll be done -- and that's about it. If I get asked when my Python Clang bindings will be updated to the latest Clang shipping in Ubuntu, the answer is "as soon as possible" but you shouldn't rely on this at all. I try my best to spend a fixed amount of time on my personal projects each week, but it's really hard to commit hours to it -- and that's the main difference between work and "hobby projects" for me.

Oh and before someone comes with money argument -- there's plenty of private projects that make money, so I'm not taking this as the main difference. Money makes you a professional, but that's all. I could very well have some ads on this page (and maybe I'll put some down, eventually) and this blog would make money; would it become work that moment? No, unless I start devoting fixed amounts of time to it, with a predicatable schedule and so on.

So next time you think about work, think about "how predictable is it?". Because that's what you should be, during your hours you sit around at the office. Doesn't mean you can't have creative solutions, but it means people you interact with you can work off the assumption that you're going to spend a particular amount of cycles on your backlog every day and making forward progress. You can't rely on me doing things at home in a predictable way, and that's fine -- because, you know, it's fun, not work :)

Introducing kyla - part 2

Today we're going to look at the current state of kyla, what's in the box and what's missing.

What do we have here?

kyla as of today has everything you need to get started -- some documentation, a sample user interface written using Qt, the repository generator, and a sample installation to play around it. All of it is written with some C++ 11 flavor which happens to compile on Visual Studio 2015 Update 3 and recent GCC/Clang. I haven't bothered trying older compilers -- the interface itself is plain C and should work with any language. Overall, I'd estimate I've spent well over a hundred hours on this, but the project is still rather small and is comprised of just 5750 lines of code.


The sample user interface running an installation on Linux.

Does it work? If you need what I need -- a simple way to dump an application onto a server and allow someone to install parts of it selectively, then yes, it does work. It's better than just a ZIP package because the user can select what to pick, and it allows for easy updating which is really helpful if most of your application doesn't change between releases.

So what's missing?

Is kyla production ready? No, not yet. It does work in the "general" case but it's not as robust as it could be. Error handling is rather minimal, but the hooks and entry points are all there. Same goes for progress reporting -- right now, only configure reports an useful progress but everything necessary is in place. The Linux side needs some more love as the web installer is not functional there.

Other than that, the biggest item that's missing is better patch handling. Everything necessary is in place, but a tool to remove contents to create a patch repository hasn't been written. In addition, patching of files is something that fits very nicely into the current design but hasn't been implemented. The basic idea is to introduce a new concept, called "transformation", which can transform one content object into another. For a patch, the installer first identifies the missing contents, and then tries to transform existing content using a transformation. This would allow binary diffs and other things to cut down the patch size even more.

And there's a lot of small things that would be nice to have. More compression algorithms, smarter handling of small files, cancel/resume during an installation, and more.

You might think there's more missing than implemented -- but if you take a closer look, what you get today does solve the problem I've set out to solve when I started the project. If you want to know if it's enough for you, why not give it a shot: Grab the source from either Bitbucket or Github and try it out!

What's next?

My hope is that you, dear reader, will find this project useful, either to take a look at the solutions developed in there, or because some of the code looks interesting for you. Of course, I'd be most delighted if you'd actually use kyla in production for what it was designed for, and report issues, request features or submit pull requests. From my side, I'm happy with where I got: It solves the problems I wanted it to solve so I'll only touch it if I run into a bug affecting me, but otherwise, there's nothing on my list that's missing right now. But I don't know what you need and I'm really curious to see how it will evolve!

Introducing kyla - part 1

Hello and welcome - today I'll be introducing kyla, a tool I've been working on for the last 19 months. Before you ask - no, not continously, often just one evening a week, with month-long breaks inbetween. So what is kyla, why am I releasing it, and how did this happen? Glad you asked, because this is what we're going to cover today.

What? How? Why?

So let's start with what kyla actually is: Kyla is a low-level installation framework, which takes care of "installing things". In the spirit of scratching your itch, kyla exists because I was really fed up with the status quo. Every time I was downloading and installing something, I wondered: Does this really need to download everything first before I can select what features I want? Why is upgrading some complicated - can't it just fetch what changed? Why do I have this downloading, extracting, installing phases and not everything happening at the same time?

It seemed to me that most installers where stuck in design choices done during the pre-web area, and only a few people dared to do a web-first installation system -- Steam being the best example. At the same time, I had to deploy my home framework with some data sets, and this was something that just was way harder using the installation systems I'm familiar with -- mostly Windows Installer, and NSIS. While they worked fine for "normal" applications, they started to fall apart for my framework which had many files to install (documentation, headers) and also large files (PDBs, sample data). So I wrote an initial "dumb" installer for my framework which focused on easy-to-create installations, and after a year or two of using it, I've decided to do it properly - kyla was born.


I was striving to fix the shortcomings of Windows Installer, which I considered -- and still consider -- the most conceptually sound system. However, Windows Installer is not particularly fast; authoring is hard for simple cases, even when using WiX; it isn't cross-platform; and web installations were never planned for it apparently. What I wanted was the ability to run an installation from a web source, get only the data actually requested, and have at least as good update & configuration capabilities as Windows Installer. In particular, when updating my framework from version A to B, where several hundred header files didn't change for example, I'd expect an installer to skip touching them even without a dedicated "patch" installer.

The initial design was quite a bit more involved -- I was trying to solve installations in general, and that has actually lots of implications like handling conditions, some way to keep track of registry keys, etc. After a couple of months I realized that for a user-mode only installer, none of this is actually required, and I refined the scope. Kyla would be limited to the file handling part only. Any additional logic would have to go into the host application into which kyla would be embedded -- like creating shortcuts, or even just keeping track of the last installation directory.

Changing the focus was critical as it led to the super-clean unified repository concept. Instead of having installation sources and targets, why not treat everything the same way? It's all about the contents anyway. With this conceptual change, I could use an installation target as a source, and all operations could be expressed in terms of repository changes.

The repository idea

So what is the repository idea about? Basically, it means that all the data needed for an application is stored in some kind of abstract repository, with a mapping of contents to file paths. Any operation is defined in terms of what it does to the content -- file paths are merely a detail and are just passed around. An installation thus becomes adding contents to an empty repository. An upgrade first computes the difference between the contents stored in the source and target repository, and then transfers the missing bits to get both into the same state. A configuration operation adds or removes contents. The installation forlder itself is also treated like a repository as well, with the only difference that the contents happen to be stored in exactly the file layout requested by the user for deployment. How does a patch look like in this world? Well, it's a repository which doesn't store the content which is already available in another repository -- and it's easy to generate, because you can simply run an update, find out which contents were requested, and remove those from the update repository. This also implies that the system doesn't care whether you're upgrading, downgrading, or even replacing one application with another, as it will do "the right thing" in any case in terms of minimizing data transfers.

With the design set, I did some initial experiements with SQLite to make sure that the required operations would be fast enough. Turns out, SQLite was more than sufficient for my needs. Now nothing was stopping me from getting my hands dirty!

The implementation

Over the following months, I implemented the required functionality in various late evening coding sessions. Mostly after work, some on the weekends, and thanks to the help of a couple of friends with a sane C API and a user-friendly documentation. Having some other people to take a look at what you're doing, who listen to your explanations and who provide some feedback turned out to be invaluable, especially given the very long periods of time where this project wasn't making any progress. Without you guys, I wouldn't have finished this, so thanks again!

Does it work?

After I was done with the implementation, I validated the whole thing by installing some large applications like Boost, Qt, and multiple revisions of the Linux kernel. Compared to existing installers, kyla did exactly what I wanted it to do. Installations only had to fetch the database before starting, the preparation time to compute the required operations were super fast, and the installation itself was done in one step which covered downloading, extracting and verifying the data.

If you want to try for yourself, grab the source from the repository (either on Bitbucket or Github) and check it out. Once everything was working as expected, it was time to get this out. One of the major decisions I delayed until the end was which license to use. I started with GPL as it seemed like a good idea at the time. After all, this thing was meant for launchers and similar non-critical parts. However, thinking more about how this could get integrated, and how I could get people to contribute I settled with the more permissive BSD license. There's quite some neat things you can do by deeply integrating kyla into an application, and GPL would make that impossible.

Wrapping up

Just before release, I spent some time to focus on the "small bits". A proper documentation, some automated testing, a fully automated build, a sample installation, and so on. Again, thanks a lot to everyone who looked at kyla during that time and provided feedback on what to focus!

With that, I'd like to say thanks for staying with me for so long. In the next (and as it looks right now, the last one as well) post of this series we'll take a look at what's there, what's left to do, and how you can contribute!

Dependency-free autocomplete - a web development story

This week I've updated the search box on my GPU database. Previously, there were two search boxes -- one for ASICs, one for cards -- and the autocompletion was done using jQuery UI. For that, I had a simple REST API hooked up was easy to reach from jQuery UI, and you'd get the results instantly while typing. Selecting one result and hitting enter would get you to the page, so everything was fine. Well, mostly. Turns out, the front page of the GPU database is really tiny in terms of HTML and script code. Pulling in jQuery, jQuery UI, and the jQuery UI CSS is pretty huge compared to the rest of the page. Even a minimized jQuery UI with only the autocomplete ticked is around 45 KiB for the CSS and JS alone, and jQuery adds another 32 KiB on top. For comparison, the GPU database landing page is 0.94 KiB, the CSS is 1.15 KiB, and the (new script with autocomplete) is 12.41 KiB (and that is not minified). That's less than 15 total - compared to 77 for jQuery and its dependencies. Moreover, there's no access to external web pages to fetch jQuery from a CDN, which increases loading time even more.

The requirement list was pretty simple:

  1. Must work with keyboard (i.e. cursor up/down, enter)
  2. Must also work with mouse (and only one thing must be highlighted)
  3. Must be able to carry some additional data for each option (so I can actually use it instead of going to a generic search page)

Notice that 3. isn't a super-hard requirement -- it does simplify life however a bit.

And here's how the final solution looks like, before we go onto a detour on our way towards it!


The final product -- user types into the input field, autocomplete shows up, can be navigated using the cursor keys, etc.

The <datalist> fail

Technically, <datalist> is exactly what I should need. It's the HTML5 way of providing autocomplete hints to the user. The way it works is you provide a <input> element, set the list attribute on it and make it point to a <datalist>. Inside the <datalist>, you provide the options. The list itself can get populated using AJAX, so everything seems easy enough.

And in fact, I got that one working pretty quickly. But it turns out there's a few ugly things about <datalist>. To start with, it's designed for completion only -- that is, expanding text. It's not designed to provide additional data. The way I solved this is to add an event handler to select on the input, and once that happened, I'd go through the list of items and search for a matching entry, and if I found one, I assumed that's the selected one and would pull the data from there.

Well, all good, except select doesn't trigger on Chrome, only Firefox. Wrap a form around it, handle submit, and there you go. Except that Chrome (as of today) has a quite ... erm, unhelpful implementation of the completion.

Let's assume for a moment I have something like this:

<input list="autocomplete">
<datalist id="autocomplete">
    <option>R9 Fury</option>
    <option>R9 Fury X</option>

If I type Fury into the input field, what would you expect? If you want to give it a shot on your browser, try this Plunker demo. Turns out, on Firefox, it does autocomplete no matter where the string appears in an option, but Chrome only autocompletes if the option starts with the string. Which makes it pretty much useless, as graphics card names tend to have a boring prefix ...

Ok, so let's roll it on our own -- how hard can it be, after all?

100% custom

Given the fail with <datalist>, we're going to do it "old-school". Below the input field, a <div> is placed, matching the size exactly. We hook input on the input field (as well as focus) and run the autocomplete core loop there. Which is pretty simple:

  1. Send a XMLHttpRequest to get suggestions
  2. Build a list
  3. Clear the <div> and put the freshly built list into it

That gives us a basic autocomplete. Well, not so fast. It only solves 2. -- selecting by mouse. Selecting by cursor keys doesn't work, the list is very long, it does not disappear when the user clicks elsewhere ... let's get on those problems.

Focus & blur

On blur, we just hide the input list. Done. Except for one small thing -- this does break on Chrome when you click on a link inside the autocomplete suggestions. What happens is that mousedown triggers, then blur triggers -- hiding the list -- and then mouseup and click follow. By the time you get to click, which would follow the link, there's nothing left to click. Of course, it works in Firefox ... the solution is pretty simple though, add a new event listener on the element for mousedown and just call preventDefault on the event. That fixes it for Chrome, and we got this part working!

Keyboard handling

While we're on the list, we want to navigate through it using the keyboard. This means we need to hook up the cursor keys and enter. It's pretty straightforward, just hook up to keydown event and then we're going to store currently selected item somewhere and add/remove classes as needed.

inputBox.addEventListener('keydown', function (ev) {
    switch (ev.key) {
        case 'ArrowDown':
                // Update index
        // Similar for up
}, true);

And indeed, this would work, if not for Edge, who decided that ArrowDown (as the spec hints at) is not the string that should be used for the arrow down, but instead, Down. Of course the answer is to use the keyCode or just add another case. I used another case, to improve maintainability (if you want to look how different browsers interpret key codes, check out this huge table of which code maps to which key on which browser).

So this is solved now. We can navigate using the keyboard. Let's hook up the mouse next.

Mouse handling

Handling the mouse might seem easy to do by hooking up mouseenter but that's not going to fly. If the list scrolls due to keyboard events, then the mouse event will fire, and suddenly you'll have two items selected. This is not an uncommen problem: As of today, the Angular autocomplete has this issue. Select an entry using the keyboard, mouse over another, and you have two selected, and god knows where you'll go if you press enter.

The solution I use is to handle mousemove on my element, and also checking for the mouse move delta. Only if the cursor was actually moved, the selection changes, so basically whatever was used last (mouse or keyboard) decides what's selected.


Wait a moment, I just said when an entry scrolls under the mouse. That's another missing bit. So far, the autocomplete <div> was basically the size of the autocomplete list, but what if we add a max-height on the <div> to ensure it doesn't go crazy long?

Now we need to make sure that if we navigate using the keyboard, the entry is actually visible. Thank god there's scrollIntoView which is all we need. Except it's not quite what we want. The problem with scrollIntoView is that it's not 100% standardized and at least the Firefox implementation scrolls the container down until the item you called scrollIntoView is at the top of the container (or the container is already scrolled to bottom). This is pretty bad for two reasons. First of all, if you have 5 items in view, and your selection moves from the first to the second one, it will scroll even though no scrolling is needed (so you loose context). Second, if you mouse over, and it scrolls to the item selected by the mouse, hilarity ensues ... so you'd have to special case it for selections triggered by mouse and those by keyboard.

The way I solved it is to scroll "on-demand", that is, only if the item would go out of scope, and only from keyboard. The assumption is that you can always see the item you want to work on, so scrolling one item up or down is safe.

Wrapping up

Whew, that was a lot of stuff for a seemingly simple feature. The lesson here is that a lot of the user experience things are not obvious right away. On the desktop, we're used to good user experience because it's all taken care of by the OS or GUI toolkit. On the web, we only got a few basic things to work with and building a more complex user experience requires quite some work and attention to detail. Even then, my solution as presented here has some accessibility shortcomings which I hope to fix at some point. For instance, a user with a screen reader will probably fail to use the up/down keys to select an item. This means I should probably have a fallback path -- for example, a form submit handler which goes to a separate page which then redirects if the entry did match directly, instead of relying on the fact that the user selected the item from the list in one way or another.

Interestingly, the complete solution so far clocks in at roughly 150 lines of TypeScript. This seems quite reasonable for being able to get rid of jQuery and improve the loading times significantly. It also resolves all three requirements that I have, which the previous solution didn't, so there's that that :)

Mapping between HLSL and GLSL

It's 2016 and we're still stuck with various shading languages - the current contenders being HLSL for Direct3D, and GLSL for OpenGL and as the "default" front-end language to generate SPIR-V for Vulkan. SPIR-V may become eventually the IL of choice for everything, but that will take a while, so right now, you need to convert HLSL to GLSL or vice versa if you want to target both APIs.

I won't dig into the various cross-compilers today - that's a huge topic - and focus on the language similarities instead. Did you ever ask yourself how your SV_Position input is called in GLSL? Then this post is for you!


This is by no means complete. It's meant as a starting point when you're looking to port some shaders between GLSL to HLSL. I'm omitting functions which are the same for instance.

System values & built-in inputs

Direct3D specifies a couple of system values, GLSL has the concept of built-in variables. The mapping is as following:

SV_ClipDistance gl_ClipDistance
SV_CullDistance gl_CullDistance if ARB_cull_distance is present
SV_Coverage gl_SampleMaskIn & gl_SampleMask
SV_Depth gl_FragDepth
SV_DepthGreaterEqual layout (depth_greater) out float gl_FragDepth;
SV_DepthLessEqual layout (depth_less) out float gl_FragDepth;
SV_DispatchThreadID gl_GlobalInvocationID
SV_DomainLocation gl_TessCord
SV_GroupID gl_WorkGroupID
SV_GroupIndex N/A
SV_GroupThreadID gl_LocalInvocationID
SV_GSInstanceID gl_InvocationID
SV_InsideTessFactor gl_TessLevelInner
SV_InstanceID gl_InstanceID & gl_InstanceIndex (latter in Vulkan with different semantics)
SV_IsFrontFace gl_FrontFacing
SV_OutputControlPointID gl_InvocationID
N/A gl_PatchVerticesIn
SV_Position gl_Position in a vertex shader, gl_FragCoord in a fragment shader
SV_PrimitiveID gl_PrimitiveID
SV_RenderTargetArrayIndex gl_Layer
SV_SampleIndex gl_SampleID
The equivalent functionality is available through EvaluateAttributeAtSample gl_SamplePosition
SV_StencilRef gl_FragStencilRef if ARB_shader_stencil_export is present
SV_Target layout(location=N) out your_var_name;
SV_TessFactor gl_TessLevelOuter
SV_VertexID gl_VertexID & gl_VertexIndex (latter Vulkan with different semantics)
SV_ViewportArrayIndex gl_ViewportIndex

This table is sourced from the OpenGL wiki, the HLSL semantic documentation and the GL_KHR_vulkan_glsl extension specification.

Atomic operations

These map fairly easily. Interlocked becomes atomic. So InterlockedAdd becomes atomicAdd, and so on. The only difference is InterlockedCompareExchange which turns into atomicCompSwap.

Shared/local memory

groupshared memory in HLSL is shared memory in GLSL. That's it.


GroupMemoryBarrierWithGroupSync groupMemoryBarrier and barrier
GroupMemoryBarrier groupMemoryBarrier
DeviceMemoryBarrierWithGroupSync memoryBarrier, memoryBarrierImage, memoryBarrierImage and barrier
DeviceMemoryBarrier memoryBarrier, memoryBarrierImage, memoryBarrierImage
AllMemoryBarrierWithGroupSync All of the barriers above and barrier
AllMemoryBarrier All of the barriers above
N/A memoryBarrierShared

Texture access

Before Vulkan, this is bundled and not trivial to emulate. Fortunately, this changes with Vulkan, where the semantics are the same as in HLSL. The main difference is that in HLSL, the access method is part of the "texture object", while in GLSL, they are free functions. In HLSL, you'll sample a texture called Texture with a sampler called Sampler like this:

Texture.Sample (Sampler, coordinate)

In GLSL, you need to specify the type of the texture and the sampler, but otherwise, it's similar:

texture (sampler2D(Texture, Sampler), coordinate)
CalculateLevelOfDetail & CalculateLevelOfDetailUnclamped textureQueryLod
Load texelFetch and texelFetchOffset
GetDimensions textureSize, textureQueryLevels and textureSamples
Gather textureGather, textureGatherOffset, textureGatherOffsets
Sample, SampleBias texture, textureOffset
SampleCmp samplerShadow
SampleGrad textureGrad, textureGradOffset
SampleLevel textureLod, textureLodOffset
N/A textureProj

General math

GLSL and HLSL differ in their default matrix interpretation. GLSL assumes column-major, and multiplication on the right (that is, you apply \(M * v\)) and HLSL assumes multiplication from left (\(v * M\)) While you can usually ignore that - you can override the order, and multiply from whatever side you want in both - it does change the meaning of m[0] with m being a matrix. In HLSL, this will return the first row, in GLSL, the first column. That also extends to the constructors, which initialize the members in the "natural" order.

Various functions

atan2(y,x) atan, with parameters swapped
ddx dFdx
ddx_coarse dFdxCoarse
ddx_fine dFdxFine
ddy dFdy
ddy_coarse dFdyCoarse
ddy_fine dFdyFine
EvaluateAttributeAtCentroid interpolateAtCentroid
EvaluateAttributeAtSample interpolateAtSample
EvaluateAttributeSnapped interpolateAtOffset
frac fract
lerp mix
mad fma
saturate clamp(x, 0.0, 1.0)

Anything else I'm missing? Please add a comment and I'll update this post!