Introducing kyla - part 2

Today we're going to look at the current state of kyla, what's in the box and what's missing.

What do we have here?

kyla as of today has everything you need to get started -- some documentation, a sample user interface written using Qt, the repository generator, and a sample installation to play around with. All of it is written in C++ 11, which happens to compile on Visual Studio 2015 Update 3 and recent GCC/Clang. I haven't bothered trying older compilers -- the interface itself is plain C and should work with any language. Overall, I'd estimate I've spent well over a hundred hours on this, but the project is still rather small and consists of just 5,750 lines of code.

/images/2016/kyla-ui.png

The sample user interface running an installation on Linux.

Does it work? If you need what I need -- a simple way to dump an application onto a server and allow someone to install parts of it selectively -- then yes, it does work. It's better than just a ZIP package because the user can pick exactly what to install, and it allows for easy updating, which is really helpful if most of your application doesn't change between releases.

So what's missing?

Is kyla production ready? No, not yet. It does work in the "general" case, but it's not as robust as it could be. Error handling is rather minimal, but the hooks and entry points are all there. The same goes for progress reporting -- right now, only configure reports useful progress, but everything necessary is in place. The Linux side needs some more love, as the web installer is not functional there.

Other than that, the biggest item that's missing is better patch handling. Everything necessary is in place, but a tool to remove contents to create a patch repository hasn't been written. In addition, patching of files is something that fits very nicely into the current design but hasn't been implemented. The basic idea is to introduce a new concept, called "transformation", which can transform one content object into another. For a patch, the installer first identifies the missing contents, and then tries to transform existing content using a transformation. This would allow binary diffs and other things to cut down the patch size even more.
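
To make the idea a bit more concrete, here's a purely illustrative sketch in TypeScript -- kyla's actual interface is plain C, and none of these names exist in it. Given the set of missing contents, the planner first checks whether some transformation can produce a missing content object from content that is already present, and only the remainder has to be fetched from the source repository.

// Purely illustrative sketch -- these types are made up for this post.
type ContentId = string; // e.g. a content hash

interface Transformation {
    // Returns a locally available content id that can be turned into `target`, or null.
    findSource(target: ContentId, available: Set<ContentId>): ContentId | null;
}

function planPatch(missing: ContentId[], available: Set<ContentId>,
                   transformations: Transformation[]) {
    const viaTransformation: Array<{ target: ContentId, source: ContentId }> = [];
    const viaFetch: ContentId[] = [];
    for (const target of missing) {
        let source: ContentId | null = null;
        for (const t of transformations) {
            source = t.findSource(target, available);
            if (source !== null) break;
        }
        if (source !== null) {
            viaTransformation.push({ target, source }); // e.g. apply a binary diff locally
        } else {
            viaFetch.push(target); // no transformation applies -- fetch from the source repository
        }
    }
    return { viaTransformation, viaFetch };
}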

And there are a lot of small things that would be nice to have: more compression algorithms, smarter handling of small files, cancel/resume during an installation, and more.

You might think there's more missing than implemented -- but if you take a closer look, what you get today does solve the problem I set out to solve when I started the project. If you want to know whether it's enough for you, why not give it a shot: grab the source from either Bitbucket or GitHub and try it out!

What's next?

My hope is that you, dear reader, will find this project useful, either to take a look at the solutions developed in there, or because some of the code looks interesting to you. Of course, I'd be most delighted if you'd actually use kyla in production for what it was designed for, and report issues, request features or submit pull requests. From my side, I'm happy with where I ended up: it solves the problems I wanted it to solve, so I'll only touch it if I run into a bug affecting me -- otherwise, there's nothing missing from my list right now. But I don't know what you need, and I'm really curious to see how it will evolve!

Introducing kyla - part 1

Hello and welcome - today I'll be introducing kyla, a tool I've been working on for the last 19 months. Before you ask - no, not continuously; often just one evening a week, with month-long breaks in between. So what is kyla, why am I releasing it, and how did this happen? Glad you asked, because this is what we're going to cover today.

What? How? Why?

So let's start with what kyla actually is: Kyla is a low-level installation framework which takes care of "installing things". In the spirit of scratching your own itch, kyla exists because I was really fed up with the status quo. Every time I was downloading and installing something, I wondered: Does this really need to download everything first before I can select which features I want? Why is upgrading so complicated - can't it just fetch what changed? Why do I have these separate downloading, extracting and installing phases instead of everything happening at the same time?

It seemed to me that most installers were stuck with design choices made in the pre-web era, and only a few people dared to build a web-first installation system -- Steam being the best example. At the same time, I had to deploy my home framework with some data sets, and that was just way harder using the installation systems I'm familiar with -- mostly Windows Installer and NSIS. While they worked fine for "normal" applications, they started to fall apart for my framework, which has many files to install (documentation, headers) as well as large files (PDBs, sample data). So I wrote an initial "dumb" installer for my framework which focused on easy-to-create installations, and after a year or two of using it, I decided to do it properly - kyla was born.

Goals

I was striving to fix the shortcomings of Windows Installer, which I considered -- and still consider -- the most conceptually sound system. However, Windows Installer is not particularly fast; authoring is hard for simple cases, even when using WiX; it isn't cross-platform; and web installations were apparently never planned for. What I wanted was the ability to run an installation from a web source, fetch only the data actually requested, and have at least as good update & configuration capabilities as Windows Installer. In particular, when updating my framework from version A to B, where for example several hundred header files didn't change, I'd expect an installer to skip touching them even without a dedicated "patch" installer.

The initial design was quite a bit more involved -- I was trying to solve installations in general, and that actually has lots of implications, like handling conditions, some way to keep track of registry keys, and so on. After a couple of months I realized that for a user-mode-only installer none of this is actually required, and I refined the scope: kyla would be limited to the file handling part only. Any additional logic would have to go into the host application into which kyla is embedded -- like creating shortcuts, or even just keeping track of the last installation directory.

Changing the focus was critical as it led to the super-clean unified repository concept. Instead of having installation sources and targets, why not treat everything the same way? It's all about the contents anyway. With this conceptual change, I could use an installation target as a source, and all operations could be expressed in terms of repository changes.

The repository idea

So what is the repository idea about? Basically, it means that all the data needed for an application is stored in some kind of abstract repository, with a mapping of contents to file paths. Any operation is defined in terms of what it does to the content -- file paths are merely a detail and are just passed around. An installation thus becomes adding contents to an empty repository. An upgrade first computes the difference between the contents stored in the source and target repository, and then transfers the missing bits to get both into the same state. A configuration operation adds or removes contents. The installation folder itself is treated like a repository as well, with the only difference that the contents happen to be stored in exactly the file layout requested by the user for deployment.

What does a patch look like in this world? Well, it's a repository which doesn't store the content that is already available in another repository -- and it's easy to generate, because you can simply run an update, find out which contents were requested, and remove those from the update repository. This also implies that the system doesn't care whether you're upgrading, downgrading, or even replacing one application with another -- it will do "the right thing" in any case in terms of minimizing data transfers.
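
A minimal sketch of that model, in TypeScript purely for illustration (kyla itself is C++ with a C interface, and these names are made up): every operation reduces to a set difference over content identifiers, and an installation is just the special case where the target starts out empty.

// Illustrative only: contents are identified by something like a hash; paths are a detail.
type ContentId = string;

// "Make the target repository contain exactly `desired`."
function computeChanges(desired: Set<ContentId>, present: Set<ContentId>) {
    const toTransfer = [...desired].filter(c => !present.has(c)); // fetch from the source repository
    const toRemove   = [...present].filter(c => !desired.has(c)); // drop from the target repository
    return { toTransfer, toRemove };
}

// Installation: the target starts out empty, so everything gets transferred.
const install = computeChanges(new Set(['docs', 'binaries']), new Set());
// Configuration change: drop 'docs', keep 'binaries' -- nothing needs to be fetched.
const configure = computeChanges(new Set(['binaries']), new Set(['docs', 'binaries']));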

With the design set, I did some initial experiments with SQLite to make sure that the required operations would be fast enough. It turns out SQLite was more than sufficient for my needs. Now nothing was stopping me from getting my hands dirty!

The implementation

Over the following months, I implemented the required functionality in various late-evening coding sessions -- mostly after work, some on the weekends -- and, thanks to the help of a couple of friends, ended up with a sane C API and user-friendly documentation. Having other people take a look at what you're doing, listen to your explanations and provide feedback turned out to be invaluable, especially given the very long stretches where this project wasn't making any progress. Without you guys, I wouldn't have finished this, so thanks again!

Does it work?

After I was done with the implementation, I validated the whole thing by installing some large applications like Boost, Qt, and multiple revisions of the Linux kernel. Compared to existing installers, kyla did exactly what I wanted it to do: installations only had to fetch the database before starting, computing the required operations was super fast, and the installation itself was done in one step which covered downloading, extracting and verifying the data.

If you want to try it for yourself, grab the source from the repository (either on Bitbucket or GitHub) and check it out. Once everything was working as expected, it was time to get this out. One of the major decisions I delayed until the end was which license to use. I started with the GPL, as it seemed like a good idea at the time -- after all, this thing was meant for launchers and similar non-critical parts. However, after thinking more about how this could get integrated, and how I could get people to contribute, I settled on the more permissive BSD license. There are quite a few neat things you can do by deeply integrating kyla into an application, and the GPL would make that impossible.

Wrapping up

Just before release, I spent some time on the "small bits": proper documentation, some automated testing, a fully automated build, a sample installation, and so on. Again, thanks a lot to everyone who looked at kyla during that time and provided feedback on what to focus on!

With that, I'd like to say thanks for staying with me for so long. In the next (and, as it looks right now, also the last) post of this series we'll take a look at what's there, what's left to do, and how you can contribute!

Dependency-free autocomplete - a web development story

This week I've updated the search box on my GPU database. Previously, there were two search boxes -- one for ASICs, one for cards -- and the autocompletion was done using jQuery UI. For that, I had a simple REST API hooked up which was easy to reach from jQuery UI, and you'd get the results instantly while typing. Selecting one result and hitting enter would get you to the page, so everything was fine. Well, mostly.

Turns out, the front page of the GPU database is really tiny in terms of HTML and script code, and pulling in jQuery, jQuery UI, and the jQuery UI CSS is pretty huge compared to the rest of the page. Even a minimized jQuery UI with only the autocomplete ticked is around 45 KiB for the CSS and JS alone, and jQuery adds another 32 KiB on top. For comparison, the GPU database landing page is 0.94 KiB, the CSS is 1.15 KiB, and the new script with autocomplete is 12.41 KiB (and that is not minified). That's less than 15 KiB total -- compared to 77 KiB for jQuery and its dependencies. Moreover, there's no longer a request to an external server to fetch jQuery from a CDN, which would increase loading time even more.

The requirement list was pretty simple:

  1. Must work with keyboard (i.e. cursor up/down, enter)
  2. Must also work with mouse (and only one thing must be highlighted)
  3. Must be able to carry some additional data for each option (so I can actually use it instead of going to a generic search page)

Notice that 3. isn't a super-hard requirement -- it does simplify life a bit, however.

And here's how the final solution looks, before we go on a detour on our way towards it!

/images/2016/search-autocomplete.png

The final product -- user types into the input field, autocomplete shows up, can be navigated using the cursor keys, etc.

The <datalist> fail

Technically, <datalist> is exactly what I should need. It's the HTML5 way of providing autocomplete hints to the user. The way it works is that you provide an <input> element, set the list attribute on it and make it point to a <datalist>. Inside the <datalist>, you provide the options. The list itself can be populated using AJAX, so everything seems easy enough.

And in fact, I got that one working pretty quickly. But it turns out there are a few ugly things about <datalist>. To start with, it's designed for completion only -- that is, expanding text -- not for providing additional data. The way I solved this is to add an event handler for select on the input; once it fired, I'd go through the list of items and search for a matching entry, and if I found one, I assumed that's the selected one and would pull the data from there.
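
Roughly like this -- a sketch, where the element ids and the data-url attribute are placeholders for whatever your page actually uses:

// Sketch: map the completed text back to the <option> that carries the extra data.
const input = document.querySelector('input[list="autocomplete"]') as HTMLInputElement;

input.addEventListener('select', () => {
    const options = document.querySelectorAll('#autocomplete option');
    for (const option of Array.from(options)) {
        if (option.textContent === input.value) {
            // Found the matching entry -- pull the associated data from it.
            const url = option.getAttribute('data-url'); // hypothetical data attribute
            if (url) {
                window.location.href = url;
            }
            break;
        }
    }
});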

Well, all good, except select doesn't trigger on Chrome, only Firefox. Wrap a form around it, handle submit, and there you go. Except that Chrome (as of today) has a quite ... erm, unhelpful implementation of the completion.

Let's assume for a moment I have something like this:

<input list="autocomplete">
<datalist id="autocomplete">
    <option>R9 Fury</option>
    <option>R9 Fury X</option>
</datalist>

If I type Fury into the input field, what would you expect? If you want to give it a shot in your browser, try this Plunker demo. It turns out that Firefox autocompletes no matter where the string appears in an option, but Chrome only autocompletes if the option starts with the string. Which makes it pretty much useless, as graphics card names tend to have a boring prefix ...

Ok, so let's roll our own -- how hard can it be, after all?

100% custom

Given the fail with <datalist>, we're going to do it "old-school". Below the input field, a <div> is placed, matching its size exactly. We hook input on the input field (as well as focus) and run the autocomplete core loop there, which is pretty simple (a rough sketch in code follows the steps):

  1. Send a XMLHttpRequest to get suggestions
  2. Build a list
  3. Clear the <div> and put the freshly built list into it
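
In code, the loop looks roughly like this -- a sketch, with the element ids, the API endpoint and the response format being placeholders:

const inputBox = document.getElementById('search') as HTMLInputElement;
const suggestionBox = document.getElementById('suggestions') as HTMLDivElement;

inputBox.addEventListener('input', () => {
    const request = new XMLHttpRequest();
    request.open('GET', '/api/search?q=' + encodeURIComponent(inputBox.value));
    request.onload = () => {
        const items = JSON.parse(request.responseText); // assume [{name, url}, ...]
        const list = document.createElement('ul');
        for (const item of items) {
            const entry = document.createElement('li');
            const link = document.createElement('a');
            link.textContent = item.name;
            link.href = item.url;
            entry.appendChild(link);
            list.appendChild(entry);
        }
        suggestionBox.innerHTML = '';    // clear the <div> ...
        suggestionBox.appendChild(list); // ... and put the freshly built list into it
    };
    request.send();
});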

That gives us a basic autocomplete. Well, not so fast: it only solves 2. -- selecting by mouse. Selecting by cursor keys doesn't work, the list is very long, and it does not disappear when the user clicks elsewhere ... so let's get to those problems.

Focus & blur

On blur, we just hide the suggestion list. Done. Except for one small thing -- this breaks on Chrome when you click on a link inside the autocomplete suggestions. What happens is that mousedown triggers, then blur triggers -- hiding the list -- and then mouseup and click follow. By the time you get to click, which would follow the link, there's nothing left to click. Of course, it works in Firefox ... the solution is pretty simple though: add an event listener for mousedown on the suggestion list and just call preventDefault on the event. That fixes it for Chrome, and we've got this part working!
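
In code, continuing the sketch from above (inputBox and suggestionBox being the input field and the suggestion <div>):

// Hide the suggestion list when the input loses focus ...
inputBox.addEventListener('blur', () => {
    suggestionBox.style.display = 'none';
});

// ... but keep a mousedown inside the list from stealing the focus, so the click
// on a link still goes through before the list disappears.
suggestionBox.addEventListener('mousedown', ev => {
    ev.preventDefault();
});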

Keyboard handling

While we're on the list, we want to navigate through it using the keyboard. This means we need to hook up the cursor keys and enter. It's pretty straightforward: hook up the keydown event, store the currently selected item somewhere, and add/remove classes as needed.

inputBox.addEventListener('keydown', function (ev) {
    switch (ev.key) {
        case 'ArrowDown':
            {
                ev.preventDefault();
                // Update index
                return;
            }
        // Similar for up
    }
}, true);

And indeed, this would work, if not for Edge, which decided that ArrowDown (the value the spec suggests) is not the string reported for the arrow-down key -- it reports Down instead. The answer is of course to either use keyCode or just add another case. I added another case, to keep things maintainable (if you want to see how different browsers interpret key codes, check out this huge table of which code maps to which key in which browser).
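
With the extra case added (and the same treatment for the other keys), the handler becomes roughly:

inputBox.addEventListener('keydown', function (ev) {
    switch (ev.key) {
        case 'ArrowDown': // per the spec, reported by Firefox and Chrome
        case 'Down':      // reported by Edge
            ev.preventDefault();
            // Move the selection down and update the highlight class
            return;
        case 'ArrowUp':
        case 'Up':
            ev.preventDefault();
            // Move the selection up
            return;
        case 'Enter':
            // Navigate to the currently selected entry
            return;
    }
}, true);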

So this is solved now. We can navigate using the keyboard. Let's hook up the mouse next.

Mouse handling

Handling the mouse might seem easy to do by hooking up mouseenter, but that's not going to fly. If the list scrolls due to keyboard events, the mouse event will fire, and suddenly you'll have two items selected. This is not an uncommon problem: as of today, the Angular autocomplete has this issue. Select an entry using the keyboard, mouse over another, and you have two selected -- and god knows where you'll go if you press enter.

The solution I use is to handle mousemove on my element and also check the mouse move delta. The selection only changes if the cursor actually moved, so basically whatever was used last (mouse or keyboard) decides what's selected.
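
A sketch of that, assuming the entries are <li> elements inside suggestionBox and a selected CSS class marks the highlighted one:

let lastX = -1, lastY = -1;

for (const entry of Array.from(suggestionBox.querySelectorAll('li'))) {
    entry.addEventListener('mousemove', ev => {
        if (ev.clientX === lastX && ev.clientY === lastY) {
            return; // the list scrolled under a stationary cursor -- ignore it
        }
        lastX = ev.clientX;
        lastY = ev.clientY;
        // The cursor really moved -- move the highlight to this entry.
        const selected = suggestionBox.querySelector('li.selected');
        if (selected) {
            selected.classList.remove('selected');
        }
        entry.classList.add('selected');
    });
}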

Scrolling

Wait a moment, I just said the list can scroll under the mouse. That's another missing bit: so far, the autocomplete <div> was basically the size of the autocomplete list, but what if we add a max-height to the <div> to make sure it doesn't get crazy long?

Now we need to make sure that if we navigate using the keyboard, the entry is actually visible. Thank god there's scrollIntoView, which is all we need. Except it's not quite what we want. The problem with scrollIntoView is that it's not 100% standardized, and at least the Firefox implementation scrolls the container until the item you called scrollIntoView on is at the top of the container (or the container is already scrolled to the bottom). This is pretty bad for two reasons. First of all, if you have 5 items in view and your selection moves from the first to the second one, it will scroll even though no scrolling is needed (so you lose context). Second, if you mouse over an item and it scrolls to the item selected by the mouse, hilarity ensues ... so you'd have to special-case selections triggered by the mouse versus those triggered by the keyboard.

The way I solved it is to scroll "on demand", that is, only if the item would go out of view, and only for keyboard-triggered selections. The assumption is that you can always see the item you want to work on, so scrolling one item up or down is safe.
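
Here's a sketch of that logic; it assumes the scrolling container is the entry's offset parent (e.g. position: relative on the suggestion <div>), and it's only ever called from the keyboard handler:

// Scroll just enough to keep `entry` visible -- and only if it isn't already.
function scrollIntoViewIfNeeded(container: HTMLElement, entry: HTMLElement) {
    const top = entry.offsetTop;
    const bottom = top + entry.offsetHeight;
    if (top < container.scrollTop) {
        // Entry is above the visible area -- align it with the top.
        container.scrollTop = top;
    } else if (bottom > container.scrollTop + container.clientHeight) {
        // Entry is below the visible area -- align it with the bottom.
        container.scrollTop = bottom - container.clientHeight;
    }
    // Otherwise it's already visible: don't scroll, so the mouse never causes scrolling.
}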

Wrapping up

Whew, that was a lot of stuff for a seemingly simple feature. The lesson here is that a lot of user experience details are not obvious right away. On the desktop, we're used to a good user experience because it's all taken care of by the OS or the GUI toolkit. On the web, we only get a few basic things to work with, and building a more complex user experience requires quite some work and attention to detail. Even then, the solution presented here has some accessibility shortcomings which I hope to fix at some point. For instance, a user with a screen reader will probably fail to use the up/down keys to select an item. This means I should probably have a fallback path -- for example, a form submit handler which goes to a separate search page that redirects if the entry matches directly, instead of relying on the user having selected the item from the list in one way or another.

Interestingly, the complete solution so far clocks in at roughly 150 lines of TypeScript. This seems quite reasonable for being able to get rid of jQuery and improve loading times significantly. It also fulfills all three requirements I had, which the previous solution didn't, so there's that :)

Mapping between HLSL and GLSL

It's 2016 and we're still stuck with various shading languages - the current contenders being HLSL for Direct3D, and GLSL for OpenGL and as the "default" front-end language to generate SPIR-V for Vulkan. SPIR-V may eventually become the IL of choice for everything, but that will take a while, so right now you need to convert HLSL to GLSL or vice versa if you want to target both APIs.

I won't dig into the various cross-compilers today - that's a huge topic - and will focus on the language similarities instead. Did you ever ask yourself what your SV_Position input is called in GLSL? Then this post is for you!

Note

This is by no means complete. It's meant as a starting point when you're looking to port some shaders between GLSL and HLSL. For instance, I'm omitting functions which are the same in both.

System values & built-in inputs

Direct3D specifies a couple of system values, GLSL has the concept of built-in variables. The mapping is as follows:

HLSL                        GLSL
SV_ClipDistance             gl_ClipDistance
SV_CullDistance             gl_CullDistance (if ARB_cull_distance is present)
SV_Coverage                 gl_SampleMaskIn & gl_SampleMask
SV_Depth                    gl_FragDepth
SV_DepthGreaterEqual        layout (depth_greater) out float gl_FragDepth;
SV_DepthLessEqual           layout (depth_less) out float gl_FragDepth;
SV_DispatchThreadID         gl_GlobalInvocationID
SV_DomainLocation           gl_TessCoord
SV_GroupID                  gl_WorkGroupID
SV_GroupIndex               gl_LocalInvocationIndex
SV_GroupThreadID            gl_LocalInvocationID
SV_GSInstanceID             gl_InvocationID
SV_InsideTessFactor         gl_TessLevelInner
SV_InstanceID               gl_InstanceID & gl_InstanceIndex (the latter in Vulkan, with different semantics)
SV_IsFrontFace              gl_FrontFacing
SV_OutputControlPointID     gl_InvocationID
N/A                         gl_PatchVerticesIn
SV_Position                 gl_Position in a vertex shader, gl_FragCoord in a fragment shader
SV_PrimitiveID              gl_PrimitiveID
SV_RenderTargetArrayIndex   gl_Layer
SV_SampleIndex              gl_SampleID
N/A                         gl_SamplePosition (equivalent functionality is available through EvaluateAttributeAtSample)
SV_StencilRef               gl_FragStencilRef (if ARB_shader_stencil_export is present)
SV_Target                   layout(location=N) out your_var_name;
SV_TessFactor               gl_TessLevelOuter
SV_VertexID                 gl_VertexID & gl_VertexIndex (the latter in Vulkan, with different semantics)
SV_ViewportArrayIndex       gl_ViewportIndex

This table is sourced from the OpenGL wiki, the HLSL semantic documentation and the GL_KHR_vulkan_glsl extension specification.

Atomic operations

These map fairly easily. Interlocked becomes atomic. So InterlockedAdd becomes atomicAdd, and so on. The only difference is InterlockedCompareExchange which turns into atomicCompSwap.

Shared/local memory

groupshared memory in HLSL is shared memory in GLSL. That's it.

Barriers

HLSL                               GLSL
GroupMemoryBarrierWithGroupSync    groupMemoryBarrier and barrier
GroupMemoryBarrier                 groupMemoryBarrier
DeviceMemoryBarrierWithGroupSync   memoryBarrier, memoryBarrierImage, memoryBarrierBuffer and barrier
DeviceMemoryBarrier                memoryBarrier, memoryBarrierImage, memoryBarrierBuffer
AllMemoryBarrierWithGroupSync      all of the barriers above and barrier
AllMemoryBarrier                   all of the barriers above
N/A                                memoryBarrierShared

Texture access

Before Vulkan, texture and sampler are bundled into a single object in GLSL, which makes separate HLSL-style textures and samplers non-trivial to emulate. Fortunately, this changes with Vulkan, where the semantics are the same as in HLSL. The main difference is that in HLSL, the access method is part of the "texture object", while in GLSL, they are free functions. In HLSL, you'll sample a texture called Texture with a sampler called Sampler like this:

Texture.Sample (Sampler, coordinate)

In GLSL, you need to specify the type of the texture and the sampler, but otherwise, it's similar:

texture (sampler2D(Texture, Sampler), coordinate)

HLSL                              GLSL
CalculateLevelOfDetail            textureQueryLod
CalculateLevelOfDetailUnclamped   textureQueryLod
Load                              texelFetch and texelFetchOffset
GetDimensions                     textureSize, textureQueryLevels and textureSamples
Gather                            textureGather, textureGatherOffset, textureGatherOffsets
Sample, SampleBias                texture, textureOffset
SampleCmp                         texture with a shadow sampler (sampler2DShadow etc.)
SampleGrad                        textureGrad, textureGradOffset
SampleLevel                       textureLod, textureLodOffset
N/A                               textureProj

General math

GLSL and HLSL differ in their default matrix interpretation. GLSL assumes column-major matrices and multiplication on the right (that is, you apply \(M * v\)), while HLSL assumes multiplication from the left (\(v * M\)). While you can usually ignore that - you can override the order, and multiply from whatever side you want in both - it does change the meaning of m[0], with m being a matrix: in HLSL, this returns the first row, in GLSL, the first column. That also extends to the constructors, which initialize the members in the "natural" order.

Various functions

HLSL                          GLSL
atan2(y,x)                    atan(y,x)
ddx                           dFdx
ddx_coarse                    dFdxCoarse
ddx_fine                      dFdxFine
ddy                           dFdy
ddy_coarse                    dFdyCoarse
ddy_fine                      dFdyFine
EvaluateAttributeAtCentroid   interpolateAtCentroid
EvaluateAttributeAtSample     interpolateAtSample
EvaluateAttributeSnapped      interpolateAtOffset
frac                          fract
lerp                          mix
mad                           fma
saturate                      clamp(x, 0.0, 1.0)

Anything else I'm missing? Please add a comment and I'll update this post!

Setting up your own mailserver

Today we're going to look at yet another use for your own home server - handling all your emails. There are a couple of reasons why you might want to do this, so let's start with some motivation. I always used to download my emails from my email account using POP3. That works fine as long as you only use one machine to access the account, as the emails get deleted immediately after fetching. This wasn't a problem for me at first, but once you start accessing your mail account from your mobile phone and notebook as well, it becomes one.

The easy solution is to switch to IMAP and just leave all emails on the server. While that works, I did accumulate a lot of emails over the years - my inbox contains several tens of thousands of emails and uses a couple of GiB of disk space. Even though space is not much of an issue these days, it's still slow to sync an inbox that has thousands of emails, and there's not much reason for me to keep a lot of the old ones around.

With a home server in place, it was time to consolidate this. My goal was to achieve the following:

  • Have all emails on my server, so I don't need to worry about backups. That means the server needs to pull them from my web accounts.
  • Have all emails in an easy to process format - something where I can search through them manually, if ever needed.
  • Remove old emails from my web accounts after a grace period.

Turns out, all of this is possible with the help of a couple of tools and some scripting. Let's dive into how to do this!

The mail server

First we need a mail server. The mail server is the thing we'll connect to in the future - my Thunderbird at home doesn't connect to my web account any more, but to my home server instead. My server of choice is dovecot which you can install on Ubuntu using apt install dovecot-imapd. Next we need to configure it. There's a couple of files we need to edit. In the following, I'll assume we're going to store our emails on /tank/mail in a per-user directory.

In /etc/dovecot/dovecot.conf, add protocols = imap as the last line to enable IMAP. Next, you'll need to edit /etc/dovecot/conf.d/10-mail.conf where you specify the mail_location. I'm using mail_location = maildir:/tank/mail/%u/.maildir - maildir ensures the emails are stored in the maildir format, which you can for instance read using the Python mailbox module.

We also need to setup a way to connect to the server, so we edit /etc/dovecot/conf.d/10-auth.conf and enable the passwdfile authentication - just remove the # at the beginning of the !include auth-passwdfile.conf.ext line.

Before we continue, I'm going to set up a new user which will own all the emails - this user will be called vmail. This is straightforward:

adduser --disabled-password --shell=/bin/false vmail
chown -R vmail /tank/mail

We also need the user-id and the group-id of that user - check it using:

id vmail

With that, we can finally add users to access our email server. I have a couple of mailboxes and one user per mailbox. I'll be using a simple password file here. In /etc/dovecot/conf.d/10-auth.conf, check that auth_mechanisms = plain is set, as well as disable_plaintext_auth = no. By default, dovecot only allows secure connections, but here we're connecting within our local network only, so we don't have to bother setting up SSL. Now we're ready to add the users - put them into /etc/dovecot/users, with one line per user like this:

username:{plain}password:vmail-user-id:vmail-group-id::

vmail-user-id and vmail-group-id are the user and group id of the vmail user, respectively. Phew! You can now try to connect using the username and (plain-text) password you just specified.

Note

If you're not comfortable with plain because you're not the only root on your server, you can also use encrypted passwords.

Getting email

The next part is to load the emails from your web account into dovecot. The easiest solution I've found is to stuff them directly into dovecot using getmail. getmail can write into a maildir directly, and dovecot - as we've set it up above - stores all emails in a maildir, so let's have getmail fetch it right in there!

Installing getmail is simple, just use apt install getmail, and then we need to set up one file per mail account we want to fetch. Those files go into /etc/getmail, and look like this (let's call it your-email_example_com.conf):

[retriever]
type = SimpleIMAPSSLRetriever
server = mymail.server.com
username = your-email@example.com
password = swordfish

[destination]
type = Maildir
path = /tank/mail/your-email_example_com/.maildir/
user = vmail

[options]
verbose = 2
delete = false
message_log_syslog = true
read_all = false
delivered_to = false
received = false

All that is left is to set up a cron job to fetch the emails. I'm running it every 5 minutes - this is what my /etc/cron.d/get-mail job looks like:

PATH="/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin"

*/5 * * * * root getmail --rcfile your-email_example_com.conf --getmaildir /etc/getmail/

You can pass in as many --rcfile options to getmail as you want. Notice that it's also possible to use IMAP IDLE to get instant emails - it's a bit tricky to set up, and I didn't bother with it as I don't need it.

All right, emails are coming in, now we only need to clean up the mail folder somehow!

Cleaning up

Unfortunately, I didn't find a solution for mail cleanup, so I ended up writing my own script for it. It works very similarly to getmail: for every account, you set up a configuration file. For the account above, we'd put the following content into /etc/delmail/your-email_example_com.conf:

[mailbox]
server = mymail.server.com
username = your-email@example.com
password = swordfish

[backup]
type = Maildir
path = /tank/mail/your-email_example_com/.backup/

[options]
min_age = 28

min_age specifies how old emails have to be before they get deleted. I'm also erring on the side of safety: all emails the script is about to delete are backed up into the path specified in the [backup] section.

All that is left now is a cron job for this, which doesn't have to run as frequently. Here are the contents of my /etc/cron.d/delete-mail:

PATH="/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin"

2 */4 * * * vmail delete-old-emails.py -config /etc/delmail/your-email_example_com.conf

Notice that the server only handles receiving emails, not sending them. For sending, I still go directly to my web email server, but I store a copy locally when sending from home.

And that's it! I've been using this setup for a couple of years now, and so far I'm very pleased. While traveling, I can always look up my recent emails. At home, I have access to all emails I have ever received, and I have everything backed up nicely. My web accounts run with 100 MiB of storage instead of several GiB as they used to, and when I fire up my mobile phone or notebook occasionally, there are only a couple of emails to synchronize because the web inbox is nearly empty all the time.