... but that doesn't mean the work is over.

One big job that needs to happen after the conference is to review and release the video recordings that were made. With several hundred videos to be checked and only a handful of people with the ability to do so, review was a massive job that for the past three editions took several months; in 2016, for example, the last video work was done in July, when the preparation of the 2017 edition had already started.

Obviously this is suboptimal, and therefore another solution was required. After working on it for quite a while (in my spare time), I came up with SReview, a video review and transcoding system written in Perl.

An obvious question, though, is why I wrote yet another system rather than using something that already existed. The short answer to that is "because what's there did not exactly do what I wanted to". The somewhat longer answer also involves the fact that I felt like writing something from scratch.

The full story, however, is this: there isn't very much out there, and what does exist is flawed in some ways. I am aware of three other review systems that are or were used by other conferences:

  1. A bunch of shell scripts that were written by the DebConf video team and hooked into the penta database. Nobody but DebConf ever used it. It allowed review via an NFS share and a web interface, and required people to watch .dv files directly from the filesystem in a media player. For this and other reasons, it could only ever be used from the conference itself. If nothing else, that final limitation made it impossible for FOSDEM to use it, but even if that wasn't the case it was still too basic to ever be useful for a conference the size of FOSDEM.
  2. A review system used by the CCC "voc" team. I've never actually seen it in use, but I've heard people describe it. It involves a complicated setup of Samba servers, short MPEG transport stream segments, a FUSE filesystem, and kdenlive, which took someone several days to set up as an experiment back at DebConf15. Critically, important parts of it are also not licensed as free software, which to me rules it out for a tool in support of FOSDEM. Even if that wasn't the case, however, I'm still not sure it would be ideal; this system requires intimate knowledge of how it works from its user, which makes it harder for us to crowdsource the review to the speaker, as I had planned to.
  3. Veyepar. This one gets many things right, and we used it for video review at DebConf from DebConf14 onwards, as well as FOSDEM 2014 (but not 2015 or 2016). Unfortunately, it also gets many things wrong. Most of these can be traced back to the fact that Carl, as he freely admits, is not a programmer; he's more of a sysadmin type who also manages to cobble together a few scripts now and then. Some of the things it gets wrong are minor issues that would theoretically be fixable with a minimal amount of effort; others would be more involved. It is also severely underdocumented, and as a result rather tedious to use for someone not very familiar with the system. On a more personal note, veyepar is also written in the wrong language, so while I might have spent some time improving it, I ended up starting from scratch.

Something all these systems have in common is that they try to avoid postprocessing as much as possible. This only makes sense; if you have to deal with loads and loads of video recordings, having to do too much postprocessing only ensures that it won't get done...

Despite the issues that I have with it, I still think that veyepar is a great system, and am not ashamed to say that SReview borrows many ideas and concepts from it. However, it does things differently in some areas, too:

  1. A major focus has been on making the review form be as easy to use as possible. While there is still room for improvement (and help would certainly be welcome in that area from someone with more experience in UI design than me), I think the SReview review form is much easier to use than the veyepar one (which has so many options that it's pretty hard to understand sometimes).
  2. SReview assumes that as soon as there are recordings in a given room sufficient to fill all the time that a particular event in that room was scheduled for, the whole event is available. It will then generate a first rough cut, and send a notification to the speaker in question, as well as the people who organized the devroom. The reviewer will then almost certainly be required to request a second (and possibly third or fourth) cut, but I think the advantage of that is that it makes the review workflow more intuitive and easier to understand.
  3. Where veyepar requires one or more instances of per-state scripts to be running (each of which polls the database and starts a transcode or cut script as needed), SReview uses a single "dispatch" script, which does all the required database polling and needs to be run only once for the whole system (if using an external scheduler), or once per core that may be used (if not). Using an external scheduler seemed more appropriate, given that things like gridengine exist; gridengine is a job scheduler which allows one to submit a job to be run on any node in a cluster, along with the resources that particular job requires, and which will then either find an appropriate node to run the job on, or put the job in a "pending" state until the required resources can be found. This allows me to more easily add extra encoding capacity when required, and also to allocate fewer resources to a particular part of the whole system, even while jobs are already running, without necessarily needing to abort jobs that might be using those resources.

The system seems to be working fine, although there's certainly still room for improvement. I'm thinking of using it for DebConf17 too, and will therefore probably work on improving it during DebCamp.

Additionally, the experience of using it for FOSDEM 2017 has given me ideas of where to improve it further, so it can be used more easily by different parties, too. Some of these have been filed as issues against a "1.0" milestone on github, but others are only newly formed in my gray matter and will need some thinking through before they can be properly implemented. Certainly, it looks like this will be something that'll give me quite some fun developing further.

In the mean time, if you're interested in the state of a particular video of FOSDEM 2017, have a look at the video overview page, which lists all talks along with their review/transcode status. Also, if you were a speaker or devroom organizer at FOSDEM 2017, please check your mailbox and review your talk! With your help, we should hopefully be able to release all our videos by the end of the week.

Update (2017-02-06 17:18): clarified my position on the qualities of some of the other systems after feedback from people who were a bit disappointed by my description of them... and which was fair enough. Apologies.

Update (2017-02-08 16:26): Fixes to the c3voc stuff after feedback from them.

Posted Monday afternoon, February 6th, 2017

I got myself a bunch of Raspberry Pi 3s a short while ago:

Stack of Raspberry Pis

...the idea being that I can use those to run a few experiments on. After the experiment, I would need to be able to somehow remove the stuff that I put on them again. The most obvious way to do that is to reimage them after every experiment; but the one downside of the multi-pi cases that I've put them in is that removing the micro-SD cards from the Pis without removing them from the case, while possible, is somewhat fiddly. That, plus the fact that I cared more about price than about speed when I got the micro-SD cards makes "image all the cards of these Raspberry Pis" something I don't want to do every day.

So I needed a different way to reset them to a clean state, and I decided to use ansible to get it done. It required a bit of experimentation, but eventually I came up with this:

- name: install package-list fact
  hosts: all
  remote_user: root
  tasks:
  - name: install libjson-perl
    apt:
      pkg: libjson-perl
      state: latest
  - name: create ansible directory
    file:
      path: /etc/ansible
      state: directory
  - name: create facts directory
    file:
      path: /etc/ansible/facts.d
      state: directory
  - name: copy fact into ansible facts directory
    copy:
      dest: /etc/ansible/facts.d/allowclean.fact
      src: allowclean.fact
      mode: 0755

- name: remove extra stuff
  hosts: pis
  remote_user: root
  tasks:
  - apt: pkg={{ item }} state=absent purge=yes
    with_items: "{{ ansible_local.allowclean.cleanable_packages }}"

- name: remove package facts
  hosts: pis
  remote_user: root
  tasks:
  - file:
      path: /etc/ansible/facts.d
      state: absent
  - apt:
      pkg: libjson-perl
      state: absent
      autoremove: yes
      purge: yes
This ansible playbook has three plays. The first play installs a custom ansible fact (written in perl) that emits a json file which defines cleanable_packages as a list of packages to remove. The second uses that fact to actually remove those packages; and the third removes the fact (and the libjson-perl library we used to produce the JSON).

Which means, really, that all the hard work is done inside the allowclean.fact file. That looks like this:


use strict;
use warnings;
use JSON;

open PACKAGES, "dpkg -l|";

my @packages;

my %whitelist = (
    ...
);

while(<PACKAGES>) {
    next unless /^ii\s+(\S+)/;

    my $pack = $1;

    next if(exists($whitelist{$pack}));

    if($pack =~ /(.*):armhf/) {
        $pack = $1;
    }

    push @packages, $pack;
}

close PACKAGES;

my %hash;

$hash{cleanable_packages} = \@packages;

print to_json(\%hash);

Basically, we create a whitelist, and compare the output of dpkg -l against that whitelist. If a certain package is in the whitelist, then we don't add it to our "packages" list; if it isn't, we do.

The whitelist is just a list of package => 1 tuples. We just need the hash keys and don't care about the data, really; I could equally well have made it package => undef, I suppose, but whatever.

Creating the whitelist is not that hard. I did it by setting up one raspberry pi to the minimum that I wanted, running

dpkg -l|awk '/^ii/{print "\t\""$2"\" => 1,"}' > packagelist

and then adding the contents of that packagelist file in place of the ... in the above perl script.
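To see what that one-liner produces, you can feed it a single fabricated dpkg -l line (the version and description below are made up):

```shell
# A fabricated dpkg -l line, piped through the awk command from above:
echo 'ii  libjson-perl  4.02-1  all  module for manipulating JSON' \
  | awk '/^ii/{print "\t\""$2"\" => 1,"}'
# prints a line of the form: "libjson-perl" => 1,  (preceded by a tab)
```

That is, one hash entry per installed package, ready to be pasted into the whitelist.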

Now whenever we apply the ansible playbook above, all packages (except for those in the whitelist) are removed from the system.

Doing the same for things like users and files is left as an exercise to the reader.

Side note: ... is a valid perl construct. You can write it, and the compiler won't complain. You can't run it, though.

Posted late Wednesday morning, February 1st, 2017

You may have noticed (though you probably did not) that on 2017-02-04, at 14:00, in room UB2.252A (aka "Lameere"), which at that point in time will be the Virtualisation and IaaS devroom, I'll be giving a talk on the Network Block Device protocol.

Having been involved in maintaining NBD in one way or another for over fifteen years now, I've never seen as much activity around the concept as I have in the past few years. The result of that has been that we've added a number of new features to the protocol, to allow it to be used in settings where it was not as useful or as attractive as it would have been before. As a result, these are exciting times for those who care about NBD, and I thought it would be about time that I'd speak about it somewhere. The Virtualisation/IaaS devroom seemed like the right fit today (even though I wouldn't have thought so a mere two years ago), and so I'm happy they agreed to have me.

While I still have to start writing my slides, and therefore don't really know (yet) exactly what I'll be talking about, the schedule already contains an outline.

See you there?

Posted early Wednesday morning, December 28th, 2016

Last month, I was abroad with my trusty old camera, but without its SD cards. Since the old camera has an SD-only slot, which does not accept SDHC (let alone SDXC) cards, I cannot use it with cards larger than 2GiB. Today, such cards are not being manufactured anymore. So, I found myself with a few options:

  1. Forget about the camera, just don't take any photos. Given the nature of the trip, I did not fancy this option.
  2. Go on eBay or some such, and find a second-hand 2GiB card.
  3. Find a local shop, and buy a new camera body.

While option 2 would have worked, the lack of certain features on my old camera had meant that I'd been wanting to buy a new camera body for a while, but it just hadn't happened yet; so I decided to go with option 3.

The Nikon D7200 is the latest model in the Nikon D7xxx series of cameras, a DX-format ("APS-C") camera that is still fairly advanced. Slightly cheaper than the D610, the cheapest full-frame Nikon camera (which I considered for a moment until I realized that two of my three lenses are DX-only lenses), it is packed with a similar amount of features. It can shoot photos at shutter speeds of 1/8000th of a second (twice as fast as my old camera), and its sensor can be set to ISO speeds of up to 102400 (64 times as much as the old one) -- although for the two modes beyond 25600, the sensor is switched to black-and-white only, since the amount of color available in such lighting conditions is already very low.

A camera that is not only ten years more recent than the old one, but also targeted at a more advanced user profile, took some getting used to at first. For instance, it took a few days until I had tamed the camera's autofocus system, which is much more advanced than the old one's, so that it would focus on the things I wanted it to focus on, rather than just whatever object happens to be closest.

The camera shoots photos at up to twice the resolution in both dimensions (which combines to four times the number of megapixels of the old body), which is not something I'm unhappy about. Also, it turns out that a DX camera with a 24 megapixel sensor takes photos with a digital resolution that is much higher than the optical resolution of my lenses, so I don't think more than 24 megapixels is going to be all that useful.

The builtin WiFi and NFC communication options are a nice touch, allowing me to use Nikon's app to take photos remotely, and see what's going through the lens while doing so. Additionally, the time-lapse functionality is something I've used already, and which I'm sure I'll be using again in the future.

The new camera is definitely a huge step forward from the old one, and while the price over there was a few hundred euros higher than it would have been here, I don't regret buying the new camera.

The result is nice, too:


All in all, I'm definitely happy with it.

Posted early Saturday morning, November 12th, 2016

Authenticating HTTPS clients using TLS certificates has long been very easy to do. The hardest part, in fact, is to create a PKI infrastructure robust enough so that compromised (or otherwise no longer desirable) certificates cannot be used anymore. While setting that up can be fairly involved, it's all pretty standard practice these days.

But authentication using a private CA is laughably easy. With apache, you just do something like:

SSLCACertificateFile /path/to/ca-certificate.pem
SSLVerifyClient require

...and you're done. With the above two lines, apache will send a CertificateRequest message to the client, prompting the client to search its local certificate store for certificates signed by the CA(s) in the ca-certificate.pem file, and using the certificate thus found to authenticate against the server.

This works perfectly fine for setting up authentication for a handful of users. It will even work fine for a few thousand users. But what if you're trying to set up website certificate authentication for millions of users? Well, in that case, storing everything in a single CA just doesn't work, and you'll need intermediate CAs to make things not fall flat on their face.

Unfortunately, the standard does not state what should happen when a client certificate is signed by an intermediate certificate, and the distinguished name(s) in the CertificateRequest message are those of the certificate(s) at the top of the chain rather than the intermediates. Previously, browsers would not just send out the client certificate, but also send along the certificate that it knew to have signed that client certificate, and so on until it found a self-signed certificate. With that, the server would see a chain of certificates that it could verify against the root certificates in its local trust store, and certificate verification would succeed.

It would appear that browsers are currently changing this behaviour, however. With the switch to BoringSSL, Google Chrome on the GNU/Linux platform stopped sending the signing certificates, sending only the leaf certificate that it wants to use for client authentication. In the bug report, Ryan Sleevi explains that while the change was premature, the long-term plan is for this to be the behaviour not just on GNU/Linux, but on Windows and macOS too. Meanwhile, the issue has been resolved for Chrome 54 (due to be released), but there's no saying for how long. As if that were not enough, the new version of Safari as shipped with macOS 10.12 has also stopped sending intermediate certificates, and expects the web server to be happy with receiving just the leaf certificate.

So, I guess it's safe to say that when you want to authenticate a client in a multi-level hierarchical CA environment, you cannot just hand the webserver a list of root certificates and expect things to work anymore; you'll have to hand it the intermediate certificates as well. To do so, you need to modify the SSLCACertificateFile parameter in the apache configuration so that it not only contains the root certificate, but all the intermediate certificates as well. If your list of intermediate certificates is rather long, it may improve performance to use SSLCACertificatePath and the c_rehash tool to create a hashed directory with certificates rather than a PEM file, but in essence, that should work.

While this works, the problem with doing so is that now the DNs of all the intermediate certificates are sent along with the CertificateRequest message to the client, which (again, if your list of intermediate certificates is rather long) may result in performance issues. The fix for that is fairly easy: add a line like

SSLCADNRequestFile /path/to/root-certificates.pem

where the file root-certificates.pem contains the root certificates only. This file will tell the webserver which certificates should be announced in the CertificateRequest message, but the SSLCACertificateFile (or SSLCACertificatePath) configuration item will still be used for actual verification of the certificates in question. Note, though, that the root certificates apparently also need to be available in the SSLCACertificatePath or SSLCACertificateFile configuration; if they are not, authentication seems to fail, although I haven't yet found out why.
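Putting the pieces together, the relevant part of the apache configuration ends up looking something like this (the paths and file names are examples, obviously):

```apache
# Announce only the root CAs in the CertificateRequest message,
# to keep that message small...
SSLCADNRequestFile /path/to/root-certificates.pem
# ...but verify client certificates against roots and intermediates alike,
# so all-certificates.pem must contain the roots as well as all intermediates.
SSLCACertificateFile /path/to/all-certificates.pem
SSLVerifyClient require
```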

I've set up a test page for all of those "millions" of certificates, and it seems to work for me. If you're trying to use one of those millions of certificates against your own webserver, or have a similar situation with a different set of certificates, you might want to make the above changes, too.

Posted Thursday afternoon, October 13th, 2016

Let's say you have a configure.ac file which contains this:

PKG_CHECK_VAR([p11_moduledir], "p11-kit-1", "p11_module_path")

and that it goes with a Makefile.am which contains this:

dist_p11_module_DATA = foo.module

Then things should work fine, right? When you run make install, your modules install to the right location, and p11-kit will pick up everything the way it should.

Well, no. Not exactly. That is, it will work for the common case, but not for some other cases. You see, if you do that, then make distcheck will fail pretty spectacularly. At least if you run that as non-root (which you really really should do). The problem is that by specifying the p11_moduledir variable in that way, you hardcode it; it doesn't honour any $prefix or $DESTDIR variables that way. The result of that is that when a user installs your package by specifying --prefix=/opt/testmeout, it will still overwrite files in the system directory. Obviously, that's not desirable.

The $DESTDIR bit is especially troublesome, as it makes packaging your software for the common distributions complicated (most packaging software heavily relies on DESTDIR support to "install" your software in a staging area before turning it into an installable package).

So what's the right way then? I've been wondering about that myself, and asked for the right way to do something like that on the automake mailinglist a while back. The answer I got there wasn't entirely satisfying, and at the time I decided to take the easy way out (EXTRA_DIST the file, but don't actually install it). Recently, however, I ran into a similar problem for something else, and decided to try to do it the proper way this time around.

p11-kit, like systemd, ships pkg-config files which contain variables for the default locations to install files into. These variables' values are meant to be easy to use from scripts, so that no munging of them is required if you want to directly install to the system-wide default location. The downside of this is that, if you want to install to the system-wide default location by default from an autotools package (but still allow the user to --prefix your installation into some other place, accepting that then things won't work out of the box), you do need to do the aforementioned munging.

Luckily, that munging isn't too hard, provided whatever package you're installing for did the right thing:

PKG_CHECK_VAR([p11_moduledir], "p11-kit-1", "p11_module_path")
PKG_CHECK_VAR([p11kit_libdir], "p11-kit-1", "libdir")
if test -z $ac_cv_env_p11_moduledir_set; then
    p11_moduledir=$(echo $p11_moduledir|sed -e "s,$p11kit_libdir,\${libdir},g")
fi
Whoa, what just happened?

First, we ask p11-kit-1 where it expects modules to be. After that, we ask p11-kit-1 what was used as "libdir" at installation time. Usually that should be something like /usr/lib or /usr/lib/<gnu arch triplet> or some such, but it could really be anything.

Next, we test to see whether the user set the p11_moduledir variable on the command line. If so, we don't want to munge it.

The next line looks for the value of whatever libdir was set to when p11-kit-1 was installed in the value of p11_module_path, and replaces it with the literal string ${libdir}.

Finally, we exit our if and AC_SUBST our value into the rest of the build system.
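The substitution itself is plain sed, so it can be tried out in a shell; the values below are examples of what pkg-config might have returned at configure time:

```shell
# Example values; at configure time these come from pkg-config.
p11kit_libdir=/usr/lib/x86_64-linux-gnu
p11_moduledir=/usr/lib/x86_64-linux-gnu/pkcs11

# The same substitution as in the configure.ac snippet: replace the
# literal libdir with the (unexpanded) ${libdir} variable reference.
p11_moduledir=$(echo $p11_moduledir|sed -e "s,$p11kit_libdir,\${libdir},g")
echo $p11_moduledir
# prints: ${libdir}/pkcs11
```

Note that the backslash before the $ in the sed expression is what keeps ${libdir} from being expanded by the shell; it ends up in the variable as a literal string, to be expanded by make later on.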

The resulting package will have the following semantics:

  • If someone installs p11-kit-1 and your package with the same prefix, the files will install to the correct location.
  • If someone installs both packages with a different prefix, then by default the files will not install to the correct location. This is what you'd want, however; using a non-default prefix is the only way to install something as non-root, and if root installed something into /usr, a normal user wouldn't be able to fix things.
  • If someone installs both packages with a different prefix, but sets the p11_moduledir variable to the correct location, at configure time, then things will work as expected.

I suppose it would've been easier if the PKG_CHECK_VAR macro could (optionally) do that munging by itself, but then, can't have everything.

Posted Thursday afternoon, September 8th, 2016

By popular request...

If you go to the Debian video archive, you will notice the appearance of an "lq" directory in the debconf16 subdirectory of the archive. This directory contains low-resolution re-encodings of the same videos that are available in the toplevel.

The quality of these videos is obviously lower than the ones that have been made available during debconf, but their file sizes should be up to about 1/4th of the file sizes of the full-quality versions. This may make them more attractive as a quick download, as a version for a small screen, as a download over a mobile network, or something of the sort.

Note that the audio quality has not been reduced. If you're only interested in the audio of the talks, these files may be a better option.

Posted Wednesday night, July 27th, 2016

I've been tweaking the video review system which we're using here at debconf over the past few days so that videos are being published automatically after review has finished; and I can happily announce that as of a short while ago, the first two files are now visible on the meetings archive. Yes, my own talk is part of that. No, that's not a coincidence. However, the other talks should not take too long ;-)

Future plans include the addition of a video RSS feed, and showing the videos on the debconf16 website. Stay tuned.

Posted at noon on Tuesday, July 5th, 2016

I had planned to do some work on NBD while here at debcamp. Here's a progress report:

The tasks, each tracked through Concept, Code, and Tested stages:

  • Change init script so it uses /etc/nbdtab rather than /etc/nbd-client for configuration
  • Change postinst so it converts existing /etc/nbd-client files to /etc/nbdtab
  • Change postinst so it generates /etc/nbdtab files from debconf
  • Create systemd unit for nbd based on /etc/nbdtab
  • Write STARTTLS support for client and/or server

The first four are needed to fix Debian bug #796633, of which "writing the systemd unit" was the one that seemed hardest. The good thing about debcamp, however, is that experts are aplenty (thanks Tollef), so that part's done now.

What's left:

  • Testing the init script modifications that I've made, so as to support those users who dislike systemd. They're fairly straightforward, and I don't anticipate any problems, but it helps to make sure.
  • Migrating the /etc/nbd-client configuration file to an nbdtab(5) one. This should be fairly straightforward; it's just a matter of Writing The Code(TM).
  • Changing the whole debconf setup so it writes (and/or updates) an nbdtab(5) file rather than a /etc/nbd-client shell snippet. This falls squarely into the "OMFG what the F*** was I thinking when I wrote that debconf stuff 10 years ago" area. I'll probably deal with it somehow. I hope. Not so sure how to do so yet, though.

If I manage to get all of the above to work and there's time left, I'll have a look at implementing STARTTLS support into nbd-client and nbd-server. A spec for that exists already, there's an alternative NBD implementation which has already implemented it, and preliminary patches exist for the reference implementation, so it's known to work; I just need to spend some time slapping the pieces together and making it work.

Ah well. Good old debcamp.

Posted Wednesday afternoon, June 29th, 2016

If you're reading this through Planet Grep, you may notice that the site's layout has been overhauled. The old design, which had just celebrated its 10th birthday, was starting to show its age. For instance, the site rendered pretty badly on mobile devices.

In that context, when I did the move to github a while back, I contacted Gregory (who'd done the original design), asking him whether he would be willing to update it. He did say yes, although it would take him a while to be able to make the time.

That's now happened, and the new design is live. We hope you like it.

Posted early Wednesday morning, March 16th, 2016