You may have noticed (but you probably did not), but on 2017-02-04, at 14:00, in room UB2.252A (aka "Lameere"), which at that point in time will be the Virtualisation and IaaS devroom, I'll be giving a talk on the Network Block Device protocol.

Having been involved in maintaining NBD in one way or another for over fifteen years now, I've never seen as much activity around the concept as I have in the past few years. The result of that has been that we've added a number of new features to the protocol, to allow it to be used in settings where it was not as useful or as attractive as it would have been before. As a result, these are exciting times for those who care about NBD, and I thought it would be about time that I'd speak about it somewhere. The Virtualisation/IaaS devroom seemed like the right fit today (even though I wouldn't have thought so a mere two years ago), and so I'm happy they agreed to have me.

While I still have to start writing my slides, and therefore don't really know (yet) exactly what I'll be talking about, the schedule already contains an outline.

See you there?

Posted Wed Dec 28 09:35:34 2016

Last month, I was abroad with my trusty old camera, but without its SD cards. Since the old camera has an SD only slot, which does not accept SDHC (let alone SDXC) cards, I cannot use it with cards larger than 2GiB. Today, such cards are not being manufactured anymore. So, I found myself with a few options:

  1. Forget about the camera, just don't take any photos. Given the nature of the trip, I did not fancy this option.
  2. Go on eBay or some such, and find a second-hand 2GiB card.
  3. Find a local shop, and buy a new camera body.

While option 2 would have worked, the lack of certain features on my old camera had meant that I'd been wanting to buy a new camera body for a while, but it just hadn't happened yet; so I decided to go with option 3.

The Nikon D7200 is the latest model in the Nikon D7xxx series of cameras, a DX-format ("APS-C") camera that is still fairly advanced. Slightly cheaper than the D610, the cheapest full-frame Nikon camera (which I considered for a moment until I realized that two of my three lenses are DX-only lenses), it is packed with a similar amount of features. It can shoot photos at shutter speeds of 1/8000th of a second (twice as fast as my old camera), and its sensor can be set to ISO speeds of up to 102400 (64 times as much as the old one) -- although for the two modes beyond 25600, the sensor is switched to black-and-white only, since the amount of color available in such lighting conditions is very very low already.

A camera which is not only ten years more recent than the older one, but also is targeted at a more advanced user profile, took some getting used to at first. For instance, it took a few days until I had tamed the camera's autofocus system, which is much more advanced than the older one, so that it would focus on the things I wanted it to focus on, rather than just whatever object happens to be closest.

The camera shoots photos at up to twice the resolution in both dimensions (which combines to it having four times the amount of megapixels as the old body), which is not something I'm unhappy about. Also, it does turn out that a DX camera with a 24 megapixel sensor ends up taking photos with a digital resolution that is much higher than the optical resolution of my lenses, so I don't think more than 24 megapixels is going to be all that useful.

The builtin WiFi and NFC communication options are a nice touch, allowing me to use Nikon's app to take photos remotely, and see what's going through the lens while doing so. Additionally, the time-lapse functionality is something I've used already, and which I'm sure I'll be using again in the future.

The new camera is definitely a huge step forward from the old one, and while the price over there was a few hundred euros higher than it would have been here, I don't regret buying the new camera.

The result is nice, too:


All in all, I'm definitely happy with it.

Posted Sat Nov 12 09:48:32 2016

Authenticating HTTPS clients using TLS certificates has long been very easy to do. The hardest part, in fact, is to create a PKI infrastructure robust enough so that compromised (or otherwise no longer desirable) certificates cannot be used anymore. While setting that up can be fairly involved, it's all pretty standard practice these days.

But authentication using a private CA is laughably easy. With apache, you just do something like:

SSLCACertificateFile /path/to/ca-certificate.pem
SSLVerifyClient require

...and you're done. With the above two lines, apache will send a CertificateRequest message to the client, prompting the client to search its local certificate store for certificates signed by the CA(s) in the ca-certificate.pem file, and using the certificate thus found to authenticate against the server.

This works perfectly fine for setting up authentication for a handful of users. It will even work fine for a few thousands of users. But what if you're trying to set up website certificate authentication for millions of users? Well, in that case, storing everything in a single CA just doesn't work, and you'll need intermediate CAs to make things not fall flat on their face.

Unfortunately, the standard does not state what should happen when a client certificate is signed by an intermediate certificate, and the distinguished name(s) in the CertificateRequest message are those of the certificate(s) at the top of the chain rather than the intermediates. Previously, browsers would not just send out the client certificate, but also send along the certificate that it knew to have signed that client certificate, and so on until it found a self-signed certificate. With that, the server would see a chain of certificates that it could verify against the root certificates in its local trust store, and certificate verification would succeed.

It would appear that browsers are currently changing this behaviour, however. With the switch to BoringSSL, Google Chrome on the GNU/Linux platform no longer sent the signing certificates, and instead only sends the leaf certificate that it wants to use for certificate authentication. In the bug report, Ryan Sleevi explains that while the change was premature, the long term plans are for this to be the behaviour not just on GNU/Linux, but on Windows and macOS too. Meanwhile, the issue has been resolved for Chrome 54 (due to be released), but there's no saying for how long. As if that was not enough, the new version of Safari as shipped with macOS 10.12 has also stopped sending intermediate certificates, and expects the web server to be happy with just receiving the leaf certificates.

So, I guess it's safe to say that when you want to authenticate a client in a multi-level hierarchical CA environment, you cannot just hand the webserver a list of root certificates and expect things to work anymore; you'll have to hand it the intermediary certificates as well. To do so, you need to modify the SSLCACertificateFile parameter in apache configuration so that it not only contains the root certificate, but all the intermediate certificates as well. If your list of intermediate certificates is rather long, it may improve performance to use SSLCACertificatePath and the c_rehash tool to create a hashed directory with certificates rather than a PEM file, but in essense, that should work.

While this works, the problem with doing so is that now the DNs of all the intermediate certificates are sent along with the CertificateRequest message to the client, which (again, if your list of intermediate certificates is rather long) may result in performance issues. The fix for that is fairly easy: add a line like

SSLCADNRequestFile /path/to/root-certificates.pem

where the file root-certificates.pem contains the root certificates only. This file will tell the webserver which certificates should be announced in the CertificateRequest message, but the SSLCACertificateFile (or SSLCACertificatePath) configuration item will still be used for actual verification of the certificates in question. Note though that the root certificates apparently also seem to need to be available in the SSLCACertificatePath or SSLCACertificateFile configuration; if you do not do so, then authentication seems to fail, although I haven't yet found why.

I've set up a test page for all of those "millions" of certificates, and it seems to work for me. I've you're trying to use one of those millions of certificates against your own webserver or have a similar situation with a different set of certificates, you might want to make the above changes, too.

Posted Thu Oct 13 14:35:07 2016

Let's say you have a file which contains this:

PKG_CHECK_VAR([p11_moduledir], "p11-kit-1", "p11_module_path")

and that it goes with a which contains this:

dist_p11_module_DATA = foo.module

Then things should work fine, right? When you run make install, your modules install to the right location, and p11-kit will pick up everything the way it should.

Well, no. Not exactly. That is, it will work for the common case, but not for some other cases. You see, if you do that, then make distcheck will fail pretty spectacularly. At least if you run that as non-root (which you really really should do). The problem is that by specifying the p11_moduledir variable in that way, you hardcode it; it doesn't honour any $prefix or $DESTDIR variables that way. The result of that is that when a user installs your package by specifying --prefix=/opt/testmeout, it will still overwrite files in the system directory. Obviously, that's not desireable.

The $DESTDIR bit is especially troublesome, as it makes packaging your software for the common distributions complicated (most packaging software heavily relies on DESTDIR support to "install" your software in a staging area before turning it into an installable package).

So what's the right way then? I've been wondering about that myself, and asked for the right way to do something like that on the automake mailinglist a while back. The answer I got there wasn't entirely satisfying, and at the time I decided to take the easy way out (EXTRA_DIST the file, but don't actually install it). Recently, however, I ran against a similar problem for something else, and decided to try to do it the proper way this time around.

p11-kit, like systemd, ships pkg-config files which contain variables for the default locations to install files into. These variables' values are meant to be easy to use from scripts, so that no munging of them is required if you want to directly install to the system-wide default location. The downside of this is that, if you want to install to the system-wide default location by default from an autotools package (but still allow the user to --prefix your installation into some other place, accepting that then things won't work out of the box), you do need to do the aforementioned munging.

Luckily, that munging isn't too hard, provided whatever package you're installing for did the right thing:

PKG_CHECK_VAR([p11_moduledir], "p11-kit-1", "p11_module_path")
PKG_CHECK_VAR([p11kit_libdir], "p11-kit-1", "libdir")
if test -z $ac_cv_env_p11_moduledir_set; then
    p11_moduledir=$(echo $p11_moduledir|sed -e "s,$p11kit_libdir,\${libdir},g")

Whoa, what just happened?

First, we ask p11-kit-1 where it expects modules to be. After that, we ask p11-kit-1 what was used as "libdir" at installation time. Usually that should be something like /usr/lib or /usr/lib/gnu arch triplet or some such, but it could really be anything.

Next, we test to see whether the user set the p11_moduledir variable on the command line. If so, we don't want to munge it.

The next line looks for the value of whatever libdir was set to when p11-kit-1 was installed in the value of p11_module_path, and replaces it with the literal string ${libdir}.

Finally, we exit our if and AC_SUBST our value into the rest of the build system.

The resulting package will have the following semantics:

  • If someone installs p11-kit-1 your package with the same prefix, the files will install to the correct location.
  • If someone installs both packages with a different prefix, then by default the files will not install to the correct location. This is what you'd want, however; using a non-default prefix is the only way to install something as non-root, and if root installed something into /usr, a normal user wouldn't be able to fix things.
  • If someone installs both packages with a different prefix, but sets the p11_moduledir variable to the correct location, at configure time, then things will work as expected.

I suppose it would've been easier if the PKG_CHECK_VAR macro could (optionally) do that munging by itself, but then, can't have everything.

Posted Thu Sep 8 14:44:16 2016

By popular request...

If you go to the Debian video archive, you will notice the appearance of an "lq" directory in the debconf16 subdirectory of the archive. This directory contains low-resolution re-encodings of the same videos that are available in the toplevel.

The quality of these videos is obviously lower than the ones that have been made available during debconf, but their file sizes should be up to about 1/4th of the file sizes of the full-quality versions. This may make them more attractive as a quick download, as a version for a small screen, as a download over a mobile network, or something of the sorts.

Note that the audio quality has not been reduced. If you're only interested in the audio of the talks, these files may be a better option.

Posted Wed Jul 27 22:13:54 2016

I've been tweaking the video review system which we're using here at debconf over the past few days so that videos are being published automatically after review has finished; and I can happily announce that as of a short while ago, the first two files are now visible on the meetings archive. Yes, my own talk is part of that. No, that's not a coincidence. However, the other talks should not take too long ;-)

Future plans include the addition of a video RSS feed, and showing the videos on the debconf16 website. Stay tuned.

Posted Tue Jul 5 11:58:57 2016

I had planned to do some work on NBD while here at debcamp. Here's a progress report:

Task Concept Code Tested
Change init script so it uses /etc/nbdtab rather than /etc/nbd-client for configuration
Change postinst so it converts existing /etc/nbd-client files to /etc/nbdtab
Change postinst so it generates /etc/nbdtab files from debconf
Create systemd unit for nbd based on /etc/nbdtab
Write STARTTLS support for client and/or server

The first four are needed to fix Debian bug #796633, of which "writing the systemd unit" was the one that seemed hardest. The good thing about debcamp, however, is that experts are aplenty (thanks Tollef), so that part's done now.

What's left:

  • Testing the init script modifications that I've made, so as to support those users who dislike systemd. They're fairly straightforward, and I don't anticipate any problems, but it helps to make sure.
  • Migrating the /etc/nbd-client configuration file to an nbdtab(5) one. This should be fairly straightforward, it's just a matter of Writing The Code(TM).
  • Changing the whole debconf setup so it writes (and/or updates) an nbdtab(5) file rather than a /etc/nbd-client shell snippet. This falls squarely into the "OMFG what the F*** was I thinking when I wrote that debconf stuff 10 years ago" area. I'll probably deal with it somehow. I hope. Not so sure how to do so yet, though.

If I manage to get all of the above to work and there's time left, I'll have a look at implementing STARTTLS support into nbd-client and nbd-server. A spec for that exists already, there's an alternative NBD implementation which has already implemented it, and preliminary patches exist for the reference implementation, so it's known to work; I just need to spend some time slapping the pieces together and making it work.

Ah well. Good old debcamp.

Posted Wed Jun 29 15:07:26 2016

If you're reading this through Planet Grep, you may notice that the site's layout has been overhauled. The old design, which had just celebrated its 10th birthday, was starting to show its age. For instance, the site rendered pretty badly on mobile devices.

In that context, when I did the move to github a while back, I contacted Gregory (who'd done the original design), asking him whether he would be willing to update it. He did say yes, although it would take him a while to be able to make the time.

That's now happened, and the new design is live. We hope you like it.

Posted Wed Mar 16 08:52:40 2016

Before this bug:

screenshot of Iceweasel


screenshot of Firefox


Posted Fri Mar 11 12:25:58 2016
cat hello.c
#include <stdio.h>
int main(void) {
    printf("Hello World!\n");
    return 0;

A simple, standard, C program. Compiling and running it shouldn't produce any surprises... except when it does.

bash: ./hello: No such file or directory

The first time I got that, it was quite confusing. So, strace to the rescue -- I thought. But nope:

strace ./hello
execve("./hello", ["./hello"], [/* 14 vars */]) = -1 ENOENT (No such file or directory)

(output trimmed)

No luck there. What's going on? No, it's not some race condition whereby the file is being removed. A simple ls -l will show it's there, and that it's executable. So what? When I first encountered this, I was at a loss. Eventually, after searching for several hours, I figured it out and filed it under "surprising things with an easy fix". And didn't think of it anymore for a while, because once you understand what's going on, it's not that complicated. But I recently realized that it's not that obvious, and I've met several people who were at a loss when encountering this, and who didn't figure it out. So, here goes:

If you tell the kernel to run some application (i.e., if you run one of the exec system calls), it will open the binary and try to figure out what kind of application it's dealing with. It may be a script with a shebang line (in which case the kernel calls the appropriate interpreter), or it may be an ELF binary, or whatever.

If it is an ELF binary, the kernel checks if the architecture of the binary matches the CPU we're running on. If that's the case, it will just execute the instructions in the binary. If it's not an ELF binary, or if the architecture doesn't match, it will fall back on some other mechanism (e.g., the binfmt_misc subsystem could have some emulator set up to run binaries for the architecture in question, or may have been set up to run java on jar files, etc). Eventually, if all else fails, the kernel will return an error:

./hello: cannot execute binary file: Exec format error

"Exec format error" is the error message for ENOEXEC, the error code which the kernel returns if it determines that it cannot run the given binary. This makes it fairly obvious what's wrong, and why.

Now assume the kernel is biarch -- that is, it runs on an architecture which can run binaries for two CPU ISAs; e.g., this may be the case for an x86-64 machine, where the CPU can run binaries for that architecture as well as binaries for the i386 architecture (and its 32-bit decendants) without emulation. If the kernel has the option to run 32-bit x86 binaries enabled at compile time (which most binary distributions do, these days), then running i386 ELF binaries is possible, in theory. As far as the kernel is concerned, at least.

So, the kernel maps the i386 binary into memory, and jumps to the binary's entry point. And here is where it gets interesting: When the binary in question uses shared libraries, the kernel not only needs to open this binary itself, but also the runtime dynamic linker (RTDL). It is then the job of the RTDL to map the shared libraries into memory for the process to use, before jumping to its code.

But what if the RTDL isn't installed? Well, then the kernel won't find it. The RTDL is just this file, so that means it will produce an ENOENT error code -- the error message for which is "No such file or directory".

And there you have it. The solution is simple, and the explanation too; but still, even so, it can be a bit baffling: the system tells you "the file isn't there", without telling you which file it's missing. Since you passed it only one file, it's reasonable for you to believe the missing file is the binary you're asking the system to execute. But usually, that's not the problem: the problem is that the file containing the RTDL is not installed, which is just a result of you not having enabled multiarch.


dpkg --add-architecture <target architecture>
apt update
apt install libc6:<target architecture>

Obviously you might need to add some more libraries, too, but that's not usually a problem.

Posted Fri Mar 11 12:25:58 2016