Skip to main content

Getting started with D3D12

Welcome to a short introduction to Direct3D 12 (also know as DX12, DirectX12 and D3D12) - the new graphics API from Microsoft, which brings new concepts to the table that have been introduced with Mantle. These new APIs could be classified as "explicit" APIs, as they have very few things that happen automatically unlike previous APIs like Direct3D 11 and OpenGL 4. In this blog post, I'll introduce the basic concepts behind these new APIs. To follow along, I'd recommend that you check out my tiny D3D12 sample application which illustrates the techniques.

Some kind of motivation

So why did these new APIs emerge? Let's start with a motivating example. In D3D11, you can map a buffer for writing and specify the discard flag. That flag is actually a serious problem for the GPU. Let's assume for a moment that the buffer hasn't been used yet, and that a frame where it will be used is queued and being processed by the GPU. The driver can't simply overwrite the buffer in GPU memory because when you submitted the frame, it wasn't mapped, and time travel is still quite hard.

The driver has only two choices. The naïve one is to simply drain the GPU and wait for it to finish. Performance will be horrible if this happens for every map call, but it will be correct. The right choice is to simply create a new buffer, put the data in there, upload it to the GPU and track the original buffer. Once the frame where the original buffer is used finishes, the original buffer can be recycled and everything is fine. Except the driver now needs to manage a new buffer per map call -- tricky, but possible.

If you think that's just an example -- no, it isn't. This buffer replacement is called buffer renaming and is a standard technique used by D3D11 drivers. Depending on how large the rename buffer is, and how often buffers are discarded, it can work quite well but it means there has to logic in the driver to manage and track this.

Going explicit

With D3D12, these things go away, and the developer is now directly exposed to memory management and synchronization. What does this mean exactly? Well, for starters, tracking of resources has to be done by the developer. If you look into my sample, you'll notice I create "frame fences" which allow me to check if a frame has finished. For the constant buffers, I have one constant buffer per queued frame in a cheap-man's ring buffer. Using the frame fence, I can synchronize with the GPU while still allowing the GPU queue to fill up. This removes the need for rename buffers from the driver.

Memory management is now also explicit, for instance, uploading does no longer happen "under the hood". You'll notice that I use two kinds of resources: Static data like the vertex and index buffer as well as the texture, and dynamic data like the constant buffer. For the dynamic data, which is read only once, it doesn't make too much sense to push it to the GPU at all. In my sample, I hence place the constant buffer in CPU memory and let the GPU read that directly. In D3D11, the driver has to guess how often a buffer will be read and where to place it, but in D3D12, I can use the knowledge I have about my access patterns to optimize this.

The other data needs to be uploaded, and unlike D3D11 where this happens automatically, I have to do this on my own. Which means I need to reserve space on the CPU from where to stage the update, allocate some GPU memory, issue a copy and wait for it to finish before I use the resource. In the small sample, you can see that I wait for it to finish manually and hence keep everything deterministic but in a larger application I could take advantage of the copy queue and copy data independent of the rendering. This makes it easy to implement advanced streaming which was very hard to do before, as the driver can't predict when a resource has to be resident on the GPU.

Resource state tracking

Another completely new responsibility for developers is state tracking. In D3D11, resources transition between states automatically which can lead to bad performance. Imagine the following scenario: Four shadow maps are rendered and applied onto the scene. The application renders into a shadow map, changes the target, renders into the next and so on and then finally loops over the four shadow maps and reads them. What you may not know is that GPUs compress depth data to improve bandwidth and eventually performance, but the texture units may not be able to read that compressed data directly and hence require a decompression. This decompression can potentially require a flush and wait-for-idle to make sure that the compressed data is written completely and no longer used before it gets decompressed.

Now, if the driver is not careful, this could result in a decompress, flush, read cycle, four times. The reason for this is that the driver only notices that the decompression is needed when it sees that the resource is bound for reading. With D3D12, these transitions are now explicit and the developer can schedule them. In the example above, he can choose to decompress all four shadow maps at once in a single transition, pay the cost for the flush once and improve performance.

Draw state & shaders

Another big area where the D3D11 driver spends time is setting and validating state. For instance, let's assume you set a vertex and a pixel shader. The driver must check that the signatures of both match and this can only happen at draw time because the driver cannot precompute all permutations of vertex and pixel shaders to look this up. Often, the driver will even delay the compilation of a shader until it is used for the first time to improve startup time and easily skip unused shader. Games often have to to "pre-warm" the driver shader cache by touching all combinations once during loading to ensure that the gameplay doesn't get interrupted when the driver starts to compile a shader.

In D3D12, this changes completely with the introduction of pipeline state objects which group all shaders and quite a bit of rendering state together. Grouping this data allows the driver to validate everything once and at runtime just swap the state without any further checks. It also means the driver can check if the pixel shader output is used at all and optimize the shader is some data is going to be discarded anyway. This is a huge change from previous APIs, and is also a major pain point when transitioning legacy engines which tend to identify the required combinations at run-time. In the D3D12 world, the shaders need to become part of the asset pipeline. In the sample, you can see how much state actually goes into the pipeline state object, even for a rather simple shader setup.

Resource binding

Finally, resource binding in the D3D12 world is totally different from D3D11. Legacy APIs tend to model the GPU as something I call the "slot machine". You have lots of different slots where you plug in textures, samplers, etc. This used to be the case how hardware worked but it's not true since several years. If you look for instance at the GCN ISA documentation, specifically for "image resources", you'll notice that there is no "sampler slot" or "texture slot" being used there. Instead, the texture and sampler descriptor is loaded into a bunch of registers and that's it. This new model is what is used by D3D12 through the root signature and descriptor tables.

The root signature serves as the first indirection level for resource bindings. It can contain some data in-line if it is small enough -- for instance, a pointer to memory (also known as a constant buffer) or a few floats, or pointers to descriptor tables that can contain larger descriptors (for instance, texture descriptors.)

It is interesting that the root signature is still tracked with renaming, but as it is generally very small, this is not a huge problem (for best performance, it should be small and some other rules should be followed as well -- check out this GDC 2015 presentation on D3D12 for details.) In the sample, you can see how the texture descriptor is placed in such a table and then referenced from the root signature. Again, the goal here is to allow to change large amounts of bindings very quickly. Unlike in D3D11, where the developer changes slots and the driver needs to map them to descriptors and build the table on demand, the developer can now swap for instance all textures and samplers required by a material by updating the descriptor table pointers in the root signature -- a very cheap and fast operation.

Things we didn't look at

D3D12 also comes with explicit command buffers which allow multiple CPU threads to record commands. I'm not covering this here as the sample doesn't take advantage of multiple threads -- maybe some other time :) I'm also not covering the different queues exposed by D3D12 today. In D3D12, it is possible to execute a compute shader concurrently with draw calls and data transfers happening by taking advantage of the graphics, compute and copy queue. This is again and advanced feature which is no good fit for an introductory post.

Shadow mapping basics

Welcome to a different kind of blog post! This time, I'll be writing an educational piece about shadow mapping for all of you who are just getting started with real-time graphics and shader programming. If you want to see the implementation, please follow the OctoAwesome live-coding project where this will be included over time.


Let's start right away by trying to define what problem we want to solve. As of today, OctoAwesome is an outdoor game without any kind of shadows but with some kind of fixed-function illumination. For every pixel that is shaded, the lighting is evaluated even if the point is occluded by some geometry. What we want to add is the ability for a point to query whether it is occluded or not.

Some of you might shout ray-tracing. And you're absolutely right, we will use this as our mental model to derive shadow mapping! So how would the lighting code work? If we had something like a shadow() API call in the shader, we could use that to trace a ray from the point being shaded to the light source. shadow() returns a boolean, which indicates whether the path is clear or not.

Three shadow rays cast by the sun. Two hit the green occluder before hitting the blue object which is currently shaded.

The problem here is how to implement the shadow() call. For efficient ray-tracing, we're going to need some kind of acceleration structure and then a rather involved kernel to do the real tracing. In the case of OctoAwesome, we would probably want to trace a binary volume for maximum efficiency. We'd also need some special way to handle transparent blocks like the trees and animated objects. Not impossible but a lot of work.

We'll notice pretty quickly that for each frame, we have to trace a lot of rays and that gets expensive due to the traversal. It's even worse as we trace the same rays over and over, at least as long the light source and the camera is not moving. This is surely not the most efficient way; it feels as if we should be able to store and reuse the results of the shadow() calls somehow.


And indeed, there is a way to reuse shadow() calls. The key insight is that we can store one value per ray to resolve the shadow() query for all points along the ray. The value we store is the distance to the closest hit, and our new shadow() call now just checks the distance of the query point against the closest point. If further away, it is in shadow.

Ray-tracing with a cache. The blue/green cell stores the values for all rays passing through it. The two top points on the blue object are tested against the blue cell, and one of them is classified as lit even though it should be in shadow. Nitpick: If done with utmost precision, actually all points would be classified as occluded as all three have a larger depth value. Generally, a small epsilon (bias) is introduced to avoid self-shadowing; in the example above, it's large enough to fix the upper point on the blue object from getting shadowed by itself.

Now the only problem is how to store the "per-ray" data, which are now cast from arbitrary points. For a distant light source, imagine we place a grid orthogonal to the light direction and quantize rays into small "cells". That is, all rays which are emitted "nearby" will go into the same cell. This introduces a bit of error, depending on how big our cells are and other factors, but in general it's quite acceptable.


What we need to do now is to produce a grid of distance values from the light source, store this somehow, and during shading, project the points into that grid and compare the values. Turns out the GPU is perfectly suited for this. Producing distance values is exactly what we do when writing into the depth buffer. Storing is equally easy, we can re-use a depth buffer as a texture. The only remaining problem is to project the points into the shadow map and compare the values.

Let's tackle the problems one-by-one. First of all, we need a new camera, which captures the scene as seen from the light into a depth map. This means we need to create a new render target, which has only a depth buffer bound. We also need to set up the camera correctly: It should cover the complete view frustum and nothing else, to maximize the effective resolution. Finally, when rendering, we should turn off all pixel shaders to improve performance. Of course, for alpha-tested geometry, we need the shaders, but if possible we should use a simplified version which only calls discard() while generating the shadow map.

The next step is the normal render pass in which we need to implement the shadow() call. For this to work, we have to project the point being shaded into the shadow map, that is, it has to go from world-space into the light-space using the same projection as we used to generate the shadow map. This means we need to pass-through the world-space position somehow through the vertex shader, the easiest way is to simply forward it and then multiply with the light projection in the pixel shader. One division, one adjustment for the -1..1 to 0..1 coordinate system difference, and a comparison later and we know if the point is in shadow!


The solution above will work in practice but result in quite ugly, blocky shadows. We can achieve higher quality by using a comparison sampler and linear filtering, which will enable hardware-accelerated percentage closer filtering.

I've also omitted lots of other problems we're going to run into. For instance, the shadow map resolution should be improved by using cascaded shadow maps, we should use some kind of contact hardening shadows to make the shadows softer further away, and so on. Good shadow mapping is actually really hard to implement, due to the fixed precision and resolution of shadow maps, but right now, it's the best we can do until the hardware becomes fast enough to trace soft-shadows in real-time. This will be hopefully the topic of a future blog post :)

Scratching your itch: Side projects

Everyone has them, small side projects which somehow never want to get finished. That small tool you wrote to convert some ancient file format into a newer one. The tiny hack to an image library to add support for scanline offset caching to improve TGA loading performance. Small things which make a library much more usable for a particular use case, or tiny tools which help with corner cases that only few people run into.

What do these side projects have in common? First of all, I guess the majority of them never gets released, which means we're all going to have our small hack to some image library, our own small converter for ancient file formats and other tiny pieces of code on which we hack occasionally. Well, not really occasionally, but every time we run into a new bug or when we find a new use-case. Second, these side projects take away increasing amounts of time and mental capacity as you have this lingering feeling at the back of your head that you should really "finish" this at some point, but you simply don't have the time to do it "properly".

I know this feeling very well. Doing it "properly" means for us to have all know issues fixed, have good documentation, port it to all platforms under the sun and ideally have 100% test coverage. After all, it's a side project, so at least here we can do everything right, right? Here, we can be the programmer we want everyone to believe we are, writing perfect code.

So how can we solve this dilemma? The first step is to understand that even non-perfect code can solve problems, especially if the problem domain is very small. If all you need is to decompress DXT images, a bunch of C-functions with inline comments might not be the packaged library that you would like to see, but it does solve the problem for people. If anyone has the urgent need to decompress DXT, he will use that library, and chances are high he'll contribute support for that one more format he cares about. This assumes that the code is out there in the first place!

The five-minute guide to release your side project is quite easy, but what you need to do is to prepare yourself to spend some time on the release itself. Not polishing the code, not fixing crazy corner cases, but doing the stuff that really matters:

  • Signing up at some web code repository: Bitbucket, Github, everything else doesn't matter.
  • Get familiar with Mercurial or git. If your code uses a different revision control system, export and reimport now. Currently, only those two revision control systems matter, with a strong bias on git.
  • Decide on the license to use. BSD or MIT is the license of choice if you want people to use your code. GPL may be acceptable for Python or other script languages where you have to release the whole source anyway, but BSD or MIT is better still.
  • Write a readme: What problems does this code solve, on what system does it run, how do I compile it. The readme is crucial for search engines to find your code. Use something like Markdown or ReStructuredText for it so the plain text can be parsed easily.
  • Decide on a name: You don't want to rename your project and loose your search engine rank. Check first if the name is already in use, calling your SQL database my-SQL-DB might not work as expected.
  • Write docs if needed: Don't waste time on docs unless they are really needed. If needed, use Sphinx or something else which is readable in plain text if people don't build the docs. Sphinx is great as there are web services like ReadTheDocs which you can point people to.
  • If you wrote a sufficiently self-contained library and there's a distribution system for your language, do the extra leg work to publish your library on the packaging system. For Python, that would be PyPI, for C#, you probably want NuGet to work and for JavaScript npm is your friend. For very small projects, you can skip this.
  • Mark the current version as 1.0. If you don't feel like 1.0, fix the most urgent bugs and push. If your stuff works, and doesn't crash on every corner, go ahead with 1.0. I know this might sound a bit crazy (hey, 1.0 means stable, right?) but the sad truth about your pet project is that the current state is probably as stable as it will get (remember? You only fix critical bugs anyway), and there won't be a "future" proper release. So you can go ahead and call it 1.0 just as well, and other people are more likely to use it. If you see a project hanging around at 0.1 for 3 years, you assume it's dead, wasn't used for anything and someone simply forgot to delete it.
  • Most importantly: Ship it! Your code is ready, don't waste time. If it's not good but useful people will tell you, and then you can improve stuff that really matters.

The steps above will likely take you something on the order of a few hours for your first project, and less than one hour later on. If you are spending a significant amount on writing docs, packaging or fixing bugs, then your side-project is probably quite big and not really solving one problem any more. Then you're in framework or application development which is an area where people are much more unlikely to use your code snippet, and you really need to nail a lot of things before you can release stuff. In this case, this post is not for you!

One great example of such a small, reusable library is the stb lib. It's a bunch of solutions to common problems which you can easily integrate into your own application. However, I'm sure you all have similar code lying around just waiting to get pushed to the web to the benefit of others! So go ahead, give it the small "release polishing" and share it with all of us -- thank you!

Building your own home server, part #4

Finishing touches

The last thing that remains to be done is to hook up the UPS with the server so it shuts down once power is low. Fortunately, there's already a package which does this for us, called apcupsd. You can fetch it using:

$ apt-get install apcupsd apcupsd-cgi

It's needs a bit of configuration to work with our UPS. Before you continue, make sure you have the USB cable connected to the server. First of all, you have to open the configuration and set the device type:

$ nano /etc/apcupsd/apcupsd.conf

Find the lines which contain UPSTYPE and DEVICE and change them to:


Now we need to write a configuration file so the daemon knows it's configure. Edit /etc/default/apcupsd and set


You can restart it now using service apcupsd restart. One nice thing about the APC UPS daemon is that it also comes with a web-interface:

Web site showing battery load and other power usage metrics.
The apcupsd web interface.

We'll use the Apache 2 web server to host the interface. This requires us to install the server, map the cgi-bin directory and then enable the CGI module. The following few commands will accomplish this:

$ apt-get install apache2
$ cat ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/ >> /etc/apache2/apache2.conf
$ a2enmod cgi
$ sudo service apache2 restart

You can now navigate to the IP address of your server to the /cgi-bin/apcupsd/multimon.cgi page and get the UPS overview. There's only one thing left to do, which is to pull the power cable to check that the UPS does actually work. The apcupsd documentation has exactly the right words for this step:

To begin the test, pull the power plug from the UPS. The first time that you do this, psychologically it won't be easy, but after you have pulled the plug a few times, you may even come to enjoy it.

I couldn't have said it better myself.

Power usage & performance

A watt meter measuring the power usage.
Measuring the power usage of the whole PC without UPS.

I've measured the total system power usage both with the UPS and without. Idle usage of the server alone is around 28-29W, and goes up to 34W under full load -- that is, all CPUs busy and maximum usage of the disk drives. With the UPS, you can expect around 33W while idle, and 36W or so under load. Keep in mind that 99% of the time, the server will be in fact idle.

Performance wise, I get sustained write rates onto the ZFS filesystem of roughly 150 MiB/s. This includes the time to generate the checksums and writes to both disk drives. You can test this easily by writing a zero-byte file:

$ dd if=/dev/zero of=/tank/dummyfile count=8192 bs=1048576
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 52.9725 s, 162 MB/s

Over network, I could easily reach more than 900 MBit over my home gigabit ethernet, which does not achieve exactly 1000 MBit even in the best circumstances. I assume that with a proper, server-grade switch you'll be able to achieve close to 1000 MBit -- if you have some numbers, please get in touch!

Closing remarks

That's all folks, you have now a full-blown Linux server at home at your disposal. There's no end to the things you can run on it, ranging from DLNA servers like MediaTomb over databases like PostgreSQL to virtual machine hosts using KVM. You'll probably want to set up a backup solution as well, which can be easily done with ZFS as you can backup a snapshot while the file system is being mutated by the users. I hope you enjoyed this guide, and if you have any questions, comment or drop me a line!

The server described here has found its new location at my friends home so I can't run additional tests on it. The deployment was very simple, I plugged in the cable and it immediately showed up on the network. Interestingly, after swapping the network cable to the monitoring port and back, I was also able to access the management console even though the network was plugged in into the "client" port. Otherwise, nothing special was needed to get it working in a different network.

Building your own home server, part #3


Ok, our home server hardware is ready, but what software are we going to run? That totally depends on your use case, but I guess at least you'll want to run a file server on it. In this post, we'll set up a not-so-basic Samba file server using Ubuntu Linux.

With a Samba file server, you can serve both Windows and Linux clients, with fine-grained access right management. As the file-system, I'll be setting up the super-robust ZFS, which is a next-gen file-system with extremely high reliability and some cool features. I'll also set up automatic snapshots and integrate them into Windows shadow copies, so Windows clients will be able to restore files that they have mistakenly deleted on their own.

As the operating system, we'll be using a long-term support release of Ubuntu Linux. You can use any other Linux you want of course, but the installation instructions here will be for Ubuntu 14.04, which does support ZFS and our hardware, and is available for free.

Server grade hardware

And now comes the seriously cool part. As we bought server-grade hardware, we can take advantage of server-grade management tools. In particular, here's what we won't need:

  • Spare monitor
  • Keyboard
  • Graphics card
  • USB thumb drive

Instead, we'll forward the screen output from the server via network to our desktop machine, mount the installation media over network and even restart the machine without getting up from our desk!

All you need is to figure out which IP address has been assigned to the network ports and point your browser at it. You'll get the management dashboard, from which you can redirect the console.

Browser window showing the server status.
The server management console. All of this runs on the management port.
Window showing the Ubuntu installation progress bar.

The login is "ADMIN"/"ADMIN", just in case. What's seriously cool is that we can now open a "remote console" here which will forward the display output to our desktop, even while the machine is starting up. In fact, you can even get the BIOS welcome screen:

Window showing the BIOS welcome screen
The BIOS welcome screen, forwarded through the management console.

From here on, it's smooth sailing, or as my administrator friend MoepMan likes to say: "Stuff works just as in the advertisement!" You plug in your Ubuntu ISO using virtual media, start the installer as always and follow the on-screen instructions, and in less than 20 minutes, the machine will boot into Linux. There are only three things you need to double-check during the setup:

  • Make sure to install to the SSD drive. When the installer asks you which drive to use, take the 40 GiB sized, and just use the guided partitioning.
  • Double-check that the first network port is selected as the default.
  • When you can select which software to install, pick OpenSSH server and Samba.

That's it, some waiting, and a reboot later, you're all set.

Window showing the Ubuntu installation progress bar.
Installing Ubuntu from an ISO mounted from the host. This should take only a few minutes.

Network administration

Once the installer has finished, it's time to log in using SSH. On Linux, SSH is built-in, so you can just use ssh your-server-name and log in, on Windows you'll need to get an SSH client like Putty.

With SSH, you get a console on the server, pretty much the same as if you would log in sitting in front of it. In fact, you can run the whole installation by just logging in on the server through the console forwarding, but I'll use SSH because it is quite a bit more comfortable if I can copy/paste into my console window.

As we're setting up the server, we'll be running lots of commands with administrator rights. The best approach is to elevate us once and then just do everything as an administrator. On Ubuntu, simply use sudo -i once logged in to become root (that is, administrator.)

The first step should be to update all installed packages, which you do using:

$ apt-get update && apt-get upgrade

You'll probably have to reboot at this point, so just type in reboot and log in again. On Ubuntu, you can't log-in as root, so make sure you log in as your normal user and then switch to root.


The first thing we want to set up is ZFS. Unfortunately, due to licensing restrictions, it's not shipped by default with Ubuntu, so we need to register a repository and fetch it from there. That's actually not that complicated:

$ apt-add-repository ppa:zfs-native/stable
$ apt-get update
$ apt-get install ubuntu-zfs zfs-auto-snapshot

This will take quite some time to build the kernel modules, so be patient. Now we can create our first pool. ZFS works in two layers: There are pools, which group hard drives, and then there are file systems which are created inside a pool. We'll be using a mirrored pool over our two hard drives and create two file systems inside it.

Before we can do this, we have to check our hard drives though, in particular, we want to know the sector size. Modern hard drives have sectors with 4096 bytes, but due to legacy reasons, they often advertise as 512 bytes sectors, and that mismatch can cost us some performance. Let's check using fdisk -l, which will print output similar to this:

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Bingo, our hard drive uses 4096 sized sectors. Now we're ready to create our first pool. We want to add our hard drives by their disk ID, so the pool will survive if we swap the cables. You can see all drives by id if you call ls /dev/disk/by-id. The two Western Digital should be easy to spot, with their name starting with ata-WDC-WD20EFRX.

To create the pool, call:

$ zpool create tank -o ashift=12 mirror ata-WDC_WD20EFRX-1 ata-WDC_WD20EFRX-2

The ashift=12 line tells ZFS to use blocks with \(2^{12}=4096\) bytes. We're calling the pool tank, because this is what the ZFS documentation always uses, and because it doesn't matter much :)

If you get an error like this, don't worry:

does not contain an EFI label but it may contain partition

Just use the -f flag as recommended, after double checking that you are using the right drives. I got this for one disk drive for whatever reason, but as I don't care about the data, we can just go ahead and ignore this. ZFS will then take over ownership of the drive and just destroy anything that is written on it.

You can now go to /tank and see that it's running. We'll also want to create a few file systems. Let's say we'll have two users on our server (Markus and Raymund -- you can create users using adduser username), and we want a shared file system. Nothing easier than that:

$ zfs create tank/Markus
$ zfs create tank/Raymund
$ zfs create tank/Shared

In ZFS, you should create one file system for every "use" case, as many settings are per-file-system (compression, deduplication, sharing, etc.) Moreover, file systems don't cost you anything.

In case your system disk fails, you'll want to reimport the pool instead of recreating it. The command to do this is:

$ zpool import -f tank

All that remains to be done is to set the access rights for the file systems, which are simply mounted as directories below /tank, and also behave like them. We'll assume that each user owns his folder:

$ chown -R raymund /tank/Raymund
$ chown -R markus /tank/Markus

This sets the folder owners, from there on, the users can log in and set the permissions to their liking.


Samba is the Linux implementation of the SMB protocol used by Windows for file sharing. Setting up Samba is very simple, as its configuration is contained in a single file. We'll set up three shares: Two for the users which can be only used with a valid log-in, and a public share for the Shared folder which can be read without logging in onto the server. For writing into Shared, a valid account will be still required.

All we need is to edit the /etc/samba/smb.conf file and add the following lines at the end:

path = /tank/Shared
public = yes
writable = yes
create mask = 0775
directory mask = 0775

# Duplicate this for Raymund
path = /tank/Markus
public = no
valid users = markus
writable = yes

The part within the brackets is the name of the share, and the rest should be self explaining. On the public shared directory, we set the file access masks such that everyone can read the data, but only the user who created a file can modify it again. One quick restart of the samba server using service smbd restart, and you should see the network shares from Windows.

Volume shadow copies for Samba using ZFS

One major feature of ZFS are zero-cost snap-shots. Unlike other file systems, ZFS is always copy-on-write, so you can store the state of the file system at a particular moment in time for free by simply creating a snapshot. Later, if you find that you want to restore a file, you just open the snapshot and take it from there. This is a bit similar to Windows' "file history", but works on the file-system level instead of individual files. The cool thing is that we can expose ZFS snapshots to Windows clients through through the file history interface right in their explorer.

The setup is straightforward, but a bit tricky. In particular, using the zfs-auto-snapshot script is not enough, as Samba requires the snapshots in a particular format. Each snapshot must contain the date and time in UTC, with a uniform prefix. So we just roll our own script to do this: zfs-snapshot. This scripts must be started regularly (every 15 minutes, for example), and what it'll do is create a snapshot in the right format and also automatically delete old snapshots. Using the default settings, it will keep only hourly snapshots for one week, then daily snapshots for a month, then monthly for a year and so forth -- that is, the older the snapshot, the lower the frequency. I've stored the script as /usr/local/bin/ Now let's set up a cron job -- basically, a simple timer which will call our script regularly:

$ cat >> /etc/cron.d/zfs-snapshot << EOL
> PATH="/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin"
> */15 * * * * root
$ chmod +x /etc/cron.d/zfs-snapshot</pre>

This will run the zfs-snapshot script every 15 minutes. All that is left is integrating it with Samba, so the snapshots actually show up in Windows. For each share, append the following lines:

vfs objects = shadow_copy2
shadow: snapdir = .zfs/snapshot
shadow: sort = desc
shadow: format = shadow_copy-%Y.%m.%d-%H.%M.%S</pre>

They're all the same for each network share, as they are all hosted on their own ZFS filesystems and hence the snapshots are in the .zfs folder. Yet another reason to use a separate file system per share! That's it, one more restart and you should see snapshots showing up in Windows.

At this stage, the rest totally depends on your needs. We have basic file sharing set up, on a robust file system with automatic snapshots. Next time, we'll look at power usage and how to integrate the APC UPS.