Linus apologising

Someone pointed me towards this email, in which Linus apologizes for some of his more unhealthy behaviour.

The above is basically a long-winded way to get to the somewhat painful personal admission that hey, I need to change some of my behavior, and I want to apologize to the people that my personal behavior hurt and possibly drove away from kernel development entirely.

To me, this came somewhat as a surprise. I'm not really involved in Linux kernel development, so the history of what led up to this email mostly passed me by; but that doesn't mean I cannot recognize how difficult this must have been for him to write.

As I know from experience, admitting that you have made a mistake is hard. Admitting that you have been making the same mistake over and over again is even harder. Doing so publicly? Even more so, since you're placing yourself in a vulnerable position, one that the less honorably inclined will take advantage of if you're not careful.

There isn't much I can contribute to the whole process, but there is this: thanks, Linus, for being willing to work on those things; the community can only become healthier as a result. It takes courage to admit things like that, and that can only be admired. Hopefully this will have the result we're all hoping for, too; but only time will tell.

Autobuilding Debian packages on salsa with Gitlab CI

Now that Debian has migrated away from alioth and towards a gitlab instance known as salsa, we get a pretty advanced Continuous Integration system for (almost) free. Given that, it might make sense to use that setup to automatically build and test a package whenever something is committed. I had a look at doing so for one of my packages, ola; the reason I chose that package is that it ships an autopkgtest, which makes testing it slightly easier (even if the autopkgtest is far from complete).

Gitlab CI is configured through a .gitlab-ci.yml file, which supports many options and may therefore be a bit complicated for first-time users. Since I've worked with it before and understand how it works, I thought it might be useful to show people how you can set things up.

First, let's look at the .gitlab-ci.yml file which I wrote for the ola package:

stages:
  - build
  - autopkgtest
.build: &build
  before_script:
  - apt-get update
  - apt-get -y install devscripts adduser fakeroot sudo
  - mk-build-deps -t "apt-get -y -o Debug::pkgProblemResolver=yes --no-install-recommends" -i -r
  - adduser --disabled-password --gecos "" builduser
  - chown -R builduser:builduser .
  - chown builduser:builduser ..
  stage: build
  artifacts:
    paths:
    - built
  script:
  - sudo -u builduser dpkg-buildpackage -b -rfakeroot
  after_script:
  - mkdir built
  - dcmd mv ../*ges built/
.test: &test
  before_script:
  - apt-get update
  - apt-get -y install autopkgtest
  stage: autopkgtest
  script:
  - autopkgtest built/*ges -- null
build:testing:
  <<: *build
  image: debian:testing
build:unstable:
  <<: *build
  image: debian:sid
test:testing:
  <<: *test
  dependencies:
  - build:testing
  image: debian:testing
test:unstable:
  <<: *test
  dependencies:
  - build:unstable
  image: debian:sid

That's a bit much. How does it work?

Let's look at every individual toplevel key in the .gitlab-ci.yml file:

stages:
  - build
  - autopkgtest

Gitlab CI has a "stages" feature. A stage can have multiple jobs, which will run in parallel, and gitlab CI won't proceed to the next stage unless and until all the jobs in the last stage have finished. Jobs from one stage can use files from a previous stage by way of the "artifacts" or "cache" features (which we'll get to later). However, in order to be able to use the stages feature, you have to create stages first. That's what we do here.

.build: &build
  before_script:
  - apt-get update
  - apt-get -y install devscripts adduser fakeroot sudo
  - mk-build-deps -t "apt-get -y -o Debug::pkgProblemResolver=yes --no-install-recommends" -i -r
  - adduser --disabled-password --gecos "" builduser
  - chown -R builduser:builduser .
  - chown builduser:builduser ..
  stage: build
  artifacts:
    paths:
    - built
  script:
  - sudo -u builduser dpkg-buildpackage -b -rfakeroot
  after_script:
  - mkdir built
  - dcmd mv ../*ges built/

This tells gitlab CI what to do when building the ola package. The main bit is the script: key in this template: it essentially tells gitlab CI to run dpkg-buildpackage. However, before we can do so, we need to install all the build-dependencies and a few helper things, as well as create a non-root user (since ola refuses to be built as root). This we do in the before_script: key. Finally, once the packages have been built, we create a built directory, and use devscripts' dcmd to move the output of the dpkg-buildpackage command into the built directory.

Note that the name of this key starts with a dot. This signals to gitlab CI that it is a "hidden" job, which it should not start by default. Additionally, we create an anchor (the &build at the end of that line) that we can refer to later. This makes it a job template, not a job itself, that we can reuse if we want to.

The reason we split up the script to be run into three different scripts (before_script, script, and after_script) is simply so that gitlab can understand the difference between "something is wrong with this commit" and "we failed to even configure the build system". It's not strictly necessary, but I find it helpful.

Since we configured the built directory as the artifacts path, gitlab will do two things:

  • First, it will create a .zip file in gitlab, which allows you to download the packages from the gitlab web interface (and inspect them if need be). The length of time for which the artifacts are stored can be configured by way of the artifacts:expire_in key; if that is not set, it defaults to 30 days, or to whatever the salsa maintainers have configured (I'm not sure which).
  • Second, it will make the artifacts available in the same location on jobs in the next stage.

The first can be avoided by using the cache feature rather than the artifacts one, if preferred.

.test: &test
  before_script:
  - apt-get update
  - apt-get -y install autopkgtest
  stage: autopkgtest
  script:
  - autopkgtest built/*ges -- null

This is very similar to the build template that we had before, except that it sets up and runs autopkgtest rather than dpkg-buildpackage, and that it does so in the autopkgtest stage rather than the build one, but there's nothing new here.

build:testing:
  <<: *build
  image: debian:testing
build:unstable:
  <<: *build
  image: debian:sid

These two use the build template that we defined before. This is done by way of the <<: *build line, which is YAML-ese to say "inject the other template here". In addition, we add extra configuration -- in this case, we simply state that we want to build inside the debian:testing docker image in the build:testing job, and inside the debian:sid docker image in the build:unstable job.

test:testing:
  <<: *test
  dependencies:
  - build:testing
  image: debian:testing
test:unstable:
  <<: *test
  dependencies:
  - build:unstable
  image: debian:sid

This is almost the same as the build:testing and the build:unstable jobs, except that:

  • We instantiate the test template, not the build one;
  • We say that the test:testing job depends on the build:testing one. This does not cause the job to start before the end of the previous stage (that is not possible); instead, it tells gitlab that the artifacts created in the build:testing job should be copied into the test:testing working directory. Without this line, all artifacts from all jobs from the previous stage would be copied, which in this case would create file conflicts (since the files from the build:testing job have the same name as the ones from the build:unstable one).

It is also possible to run autopkgtest in the same image in which the build was done. However, the downside of doing that is that if one of your built packages lacks a dependency that is an indirect dependency of one of your build dependencies, you won't notice; by blowing away the docker container in which the package was built and running autopkgtest in a pristine container, we avoid this issue.

With that, you have a complete working example of how to do continuous integration for Debian packaging. To see it work in practice, you might want to have a look at the ola repository on salsa.
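
If you want to try one of these jobs locally before pushing, and you happen to have docker and gitlab-runner installed, gitlab-runner's exec mode can run a single job straight from the .gitlab-ci.yml in the current directory. Note that this is only a rough check: exec runs one job in isolation, so stages and artifact passing don't apply, and the test jobs therefore won't find the built packages this way.

gitlab-runner exec docker build:unstable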

UPDATE (2018-09-16): dropped the autoreconf call; it isn't needed (it was there because things didn't work on the first try and I thought that might have been related, but that turned out to be a red herring, and I then forgot to drop it).

VP9 live streaming from a webcam

I recently bought a new webcam, because my laptop screen broke down, and the replacement screen that the Fujitsu people had in stock did not include a webcam anymore -- and, well, I need one.

One thing the new webcam has which the old one did not is a sensor with a 1920x1080 resolution. Since I've been playing around with various video-related things, I wanted to see whether it was possible to record something in VP9 at live encoding settings. A year or so ago I would have said "no, that takes waaay too much CPU time", but by now I know that this is not true: you can easily do so if you use the right ffmpeg settings.

After a bit of fiddling about, I came up with the following:

ffmpeg -f v4l2 -framerate 25 -video_size 1920x1080 -c:v mjpeg -i /dev/video0 -f alsa -ac 1 -i hw:CARD=C615 -c:a libopus -map 0:v -map 1:a -c:v libvpx-vp9 -r 25 -g 90 -s 1920x1080 -quality realtime -speed 6 -threads 8 -row-mt 1 -tile-columns 2 -frame-parallel 1 -qmin 4 -qmax 48 -b:v 4500 output.mkv

Things you might want to change in the above:

  • Set the hw:CARD=C615 bit to something that occurs in the output of arecord -L on your system rather than on mine.
  • Run v4l2-ctl --list-formats-ext and verify that your camera supports a 1920x1080 resolution in motion JPEG at 25 fps (the sketch below this list shows how to check this, as well as the audio devices). If it doesn't, change the values of the -framerate and -video_size parameters, and the -c:v that occurs before the -i /dev/video0 option (which sets the input video codec; the one after it selects the output video codec, and you don't want to touch that unless you don't want VP9).
  • If you don't have a quad-core CPU with hyperthreading, change the -threads setting.
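
For reference, a quick way to check both of those -- this assumes the v4l-utils and alsa-utils packages are installed, and that your camera shows up as /dev/video0:

# list the pixel formats, resolutions and frame rates the camera supports
v4l2-ctl -d /dev/video0 --list-formats-ext
# list the ALSA capture devices, to find the right hw:CARD= name to use
arecord -L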

If your CPU can't keep up with things, you might want to read the documentation on the subject and tweak the -qmin, -qmax, and/or -speed parameters.

This was done on a four-year-old Haswell Core i7; it should be easier on more modern hardware.
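
For what it's worth, here's an untested sketch of what such a tweak might look like: the same command as above, but with -speed raised to 8 and a wider quantizer range, trading some quality for CPU time.

ffmpeg -f v4l2 -framerate 25 -video_size 1920x1080 -c:v mjpeg -i /dev/video0 \
  -f alsa -ac 1 -i hw:CARD=C615 -c:a libopus -map 0:v -map 1:a \
  -c:v libvpx-vp9 -r 25 -g 90 -s 1920x1080 -quality realtime -speed 8 \
  -threads 8 -row-mt 1 -tile-columns 2 -frame-parallel 1 \
  -qmin 10 -qmax 56 -b:v 4500 output.mkv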

Next up: try to get a live stream into a DASH system. Or something.

DebConf18 video work

For personal reasons, I didn't make it to DebConf18 in Taiwan this year; but that didn't mean I wasn't interested in what was happening. Additionally, I remotely configured SReview, the video review and transcoding system which I originally wrote for FOSDEM.

I received a present for that today:

And yes, of course I'm happy about that :-)

On a side note, the videos for DebConf18 have all been transcoded now. There are actually three files per event:

  • The .webm file contains the high quality transcode of the video. It uses VP9 for video, and Opus for audio, and uses the Google-recommended settings for VP9 bitrate configuration. On modern hardware with decent bandwidth, you'll want to use this file.
  • The .lq.webm file contains the low-quality transcode of the video. It uses VP8 for video, and Vorbis for audio, and uses the builtin default settings of ffmpeg for VP8 bitrate configuration, as well as a scale-down to half the resolution; those usually end up being somewhere between a third and half of the size of the high quality transcodes.
  • The .ogg file contains the Vorbis audio extracted from the .lq.webm file, and is useful only if you want to listen to the talk rather than watch it. It's also the smallest download, for obvious reasons.

If you have comments, feel free to let us know on the mailing list.

PKCS#11 v2.20

By way of experiment, I've just enabled the PKCS#11 v2.20 implementation in the eID packages for Linux, but for now only in the packages in the "continuous" repository. In the past, enabling this has caused issues; there have been a few cases where Firefox would deadlock when PKCS#11 v2.20 was enabled, rather than the (very old and outdated) v2.11 version that we support by default. We believe we have identified and fixed all outstanding issues that caused such deadlocks, but it's difficult to be sure. So, if you have a Belgian electronic ID card and are willing to help me out and experiment a bit, here's something I'd like you to do:

  • Install the eID software (link above) as per normal.
  • Enable the "continuous" repository and upgrade to the packages in that repository:

    • For Debian, Ubuntu, or Linux Mint: edit /etc/apt/sources.list.d/eid.list, and follow the instructions there to enable the "continuous" repository. Don't forget the dpkg-reconfigure eid-archive step. Then, run apt update; apt -t continuous upgrade.
    • For Fedora and CentOS: run yum --enablerepo=beid-continuous install eid-mw
    • For OpenSUSE: run zypper mr -e beid-continuous; zypper up

The installed version of the eid-mw-libs or libbeidpkcs11-0 package should be v4.4.3-42-gf78d786e or higher.
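
A quick way to check which version you actually ended up with (a sketch, using the same package names as above):

# Debian, Ubuntu, or Linux Mint
dpkg-query -W -f='${Version}\n' libbeidpkcs11-0
# Fedora, CentOS, or OpenSUSE
rpm -q eid-mw-libs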

One of the new features in version 2.20 of the PKCS#11 API is that it supports hotplugging of card readers; in version 2.11 of that API, this is not the case (like I said, it is outdated). So, try experimenting with hotplugging your card reader a bit; it should generally work. Try leaving it installed and using your system (and web browser) for a while with that version of the middleware; you shouldn't have any issues doing so, but if you do, I'd like to know about it.
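
If you want to verify from the command line that the middleware still sees your reader after a replug, OpenSC's pkcs11-tool can load the module directly. The library path below is a guess; adjust it to wherever libbeidpkcs11.so.0 lives on your system.

# list the PKCS#11 slots as seen through the eID middleware;
# run it again after unplugging and replugging the reader
pkcs11-tool --module /usr/lib/x86_64-linux-gnu/libbeidpkcs11.so.0 --list-slots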

Bug reports are welcome as issues on our GitHub repository.

Thanks!

Digitizing my DVDs

I have a rather sizeable DVD collection. The database that I created of them a few years back -- after a few episodes where I accidentally bought the same movie more than once -- claims there are over 300 movies in the cabinet. Additionally, I own a number of TV shows on DVD which, if you count individual discs, probably add up to about the same number.

A few years ago, I decided that I was tired of walking to the DVD cabinet, taking out a disc, and placing it in the reader; instead, I wanted to digitize them and use kodi to be able to watch a movie whenever I felt like it. So I made some calculations, and came up with a system with enough storage (on ZFS, of course) to store all the DVDs without needing to re-encode them.

I got started on ripping most of the DVDs using dvdbackup, but it quickly became apparent that I'd made a miscalculation: where I thought that most of the DVDs would be 4.7G ones, it turned out that most commercial DVDs are actually of the 9G type. Come to think of it, that does make a lot of sense. Additionally, now that I had a home server with some significant redundant storage, I found that I had additional uses for that space. The storage that I had, vast though it seemed, wouldn't suffice.

So, I gave this some more thought, but then life interfered and nothing happened for a few years.

Recently, however, I've picked it up again, with a changed workflow. I started using handbrake to re-encode the DVDs so they wouldn't take up quite so much space; having chosen VP9 as my preferred codec, I end up storing the DVDs at about 1 to 2 G per main feature, rather than the 8 to 9 that it used to be -- a significant gain. However, my first workflow wasn't very efficient: I would run the handbrake GUI from my laptop over ssh -X sessions to multiple machines, encoding the videos directly from DVD that way. That worked, but it meant I couldn't shut down my laptop to take it to work without interrupting the encodes that were running; it also meant that if a DVD finished encoding in the middle of the night, I wouldn't be there to replace it, so the system would sit idle for several hours. Clearly some form of improvement was necessary if I was going to do this in any reasonable amount of time.

So after fooling around a bit, I came up with the following:

  • First, I use dvdbackup -M -r a to read the DVD without re-encoding anything. This can be done at the speed of the optical medium, and is therefore much more efficient than running handbrake directly from the DVD. The -M option tells dvdbackup to read everything from the DVD (to make a mirror of it, in effect). The -r a option tells dvdbackup to abort if it encounters a read error; I found that DVDs can sometimes be read successfully if I eject the drive and immediately reinsert it, give the disc another clean, or just try again in a different DVD reader. Sometimes the disc is really damaged, and then using dvdbackup's default mode of skipping the unreadable blocks makes sense, but not on a first attempt.
  • Then, I run a small perl script that I wrote (a rough shell equivalent is sketched just below this list). It basically does two things:

    1. Run HandBrakeCLI -i <dvdbackup output> --previews 1 -t 0, parse its stderr output, and figure out what the first and the last titles on the DVD are.
    2. Run qsub -N <movie name> -v FILM=<dvdbackup output> -t <first title>-<last title> convert-film
  • The convert-film script is a bash script, which (in its first version) did this:

    mkdir -p "$OUTPUTDIR/$FILM/tmp"
    HandBrakeCLI -x "threads=1" --no-dvdnav -i "$INPUTDIR/$FILM" -e vp9 -E copy -T -t $SGE_TASK_ID --all-audio --all-subtitles -o "$OUTPUTDIR/$FILM/tmp/T${SGE_TASK_ID}.mkv"
    

    Essentially, that converts a single title to a VP9-encoded matroska file, with all the subtitles and audio streams intact, and forcing it to use only one thread -- having it use multiple threads is useful if you care about a single DVD converting as fast as possible, but I don't, and having four DVDs on a four-core system all convert at 100% CPU seems more efficient than having two convert at about 180% each. I did consider using HandBrakeCLI's options to only extract the "interesting" audio and subtitle tracks, but I prefer to not have dubbed audio (to have subtitled audio instead); since some of my DVDs are originally in non-English languages, doing so gets rather complex. The audio and subtitle tracks don't take up that much space, so I decided not to bother with that in the end.
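
For the curious, here's a rough shell equivalent of that perl script. It assumes that HandBrakeCLI's scan output lists titles as "+ title N:" lines, that INPUTDIR points at the directory holding the dvdbackup output, and that the convert-film script exists as described above; SomeMovie is just a placeholder name.

FILM=SomeMovie
# scan the ripped DVD and extract the list of title numbers
TITLES=$(HandBrakeCLI -i "$INPUTDIR/$FILM" --previews 1 -t 0 2>&1 \
  | sed -n 's/^+ title \([0-9]*\):.*/\1/p')
FIRST=$(echo "$TITLES" | head -n 1)
LAST=$(echo "$TITLES" | tail -n 1)
# submit one gridengine array job, with one task per title
qsub -N "$FILM" -v FILM="$FILM" -t "$FIRST-$LAST" convert-film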

The use of qsub, which submits the script into gridengine, allows me to hook up several encoder nodes (read: the server plus a few old laptops) to the same queue.
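
Seeing which encodes are queued or running on which node is then just a matter of gridengine's own tooling, e.g.:

# full overview of all queue instances and the array tasks assigned to them
qstat -f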

That went pretty well, until I wanted to figure out how far along something was. HandBrakeCLI provides progress information on stderr, and I can just do a tail -f of the stderr output logs, but that really only works well for one DVD at a time, not if you're trying to follow along with about a dozen of them.

So I made a database, and wrote another perl script. The latter parses the stderr output of HandBrakeCLI, fishes out the progress information, and puts the completion percentage as well as the ETA into the database. Then it became interesting:

CREATE OR REPLACE FUNCTION transjob_tcn() RETURNS trigger AS $$
BEGIN
  IF (TG_OP = 'INSERT') OR (TG_OP = 'UPDATE' AND (NEW.progress != OLD.progress) OR NEW.finished = TRUE) THEN
    PERFORM pg_notify('transjob', row_to_json(NEW)::varchar);
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER transjob_tcn_trigger
  AFTER INSERT OR UPDATE ON transjob
  FOR EACH ROW EXECUTE PROCEDURE transjob_tcn();

This uses PostgreSQL's asynchronous notification feature to send out a notification whenever an interesting change has happened to the table.

#!/usr/bin/perl -w

use strict;
use warnings;

use Mojolicious::Lite;
use Mojo::Pg;

...

helper dbh => sub { state $pg = Mojo::Pg->new->dsn("dbi:Pg:dbname=transcode"); };

websocket '/updates' => sub {
    my $c = shift;
    $c->inactivity_timeout(600);
    my $cb = $c->dbh->pubsub->listen(transjob => sub { $c->send(pop) });
    $c->on(finish => sub { shift->dbh->pubsub->unlisten(transjob => $cb) });
};

app->start;

This uses the Mojolicious framework and Mojo::Pg to send out the payload of the "transjob" notification (which we created with the FOR EACH ROW trigger inside PostgreSQL earlier, and which contains the JSON version of the table row) over a WebSocket. Then it's just a small matter of programming to write some javascript which dynamically updates the webpage whenever that happens, and Tadaa! I have an online overview of the videos that are transcoding, and how far along they are.

That only requires me to keep the queue non-empty, which I can easily do by running dvdbackup a few times in parallel every so often. That's a nice Saturday afternoon project...

host detection in bash

There are many tools to implement this, and yeah, this is not the fastest. But the advantage is that you don't need extra tools beyond "bash" and "ping"...

# scan 192.168.0.0/24; adjust the prefix to match your own network
for i in $(seq 1 254); do
  if ping -W 1 -c 1 192.168.0.$i > /dev/null 2>&1; then
    HOST[$i]=1
  fi
done
# the array indices are the final octets of the hosts that replied
echo "${!HOST[@]}"

will give you the final octets of the addresses of the machines that are live on the given network...
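
If the sequential scan is too slow, the same idea can be parallelised with nothing more than bash's job control. A quick sketch (the output order is not deterministic):

for i in $(seq 1 254); do
  ( ping -W 1 -c 1 192.168.0.$i > /dev/null 2>&1 && echo 192.168.0.$i ) &
done
wait   # wait for all the background pings to finish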

Day four of the pre-FOSDEM Debconf Videoteam sprint

Day four was a day of pancakes and stew. Oh, and some video work, too.

Nattie

Did more documentation review. She finished the SReview documentation, and got started on documenting the examples in our ansible repository.

Kyle

Finished splitting out the ansible configuration from the ansible code repository. The code repository now includes an example configuration that is well documented for getting started, whereas our production configuration lives in a separate repository.

Stefano

Spent much time on the debconf website, mostly working on a new upstream release of wafer.

He also helped review Kyle's documentation, and spent some time together with Tzafrir debugging our ansible test setup.

Tzafrir

Worked on documentation, and did a test run of the ansible repository. Found and fixed issues that cropped up during that.

Wouter

Spent much time trying to figure out why SReview was not doing what he was expecting it to do. Side note: I hate video codecs. Things are working now, though, and most of the fixes were implemented in a way that makes it reusable for other conferences.

There's one more day coming up today. Hopefully I won't forget to blog about it tonight.

Day three of the pre-FOSDEM Debconf Videoteam sprint

This should really have been the "day two" post, but I forgot to do that yesterday, and now it's the end of day three already, so let's just do the two together for now.

Kyle

Has been hacking on the opsis so we can get audio through it, but so far without much success. In addition, he's been working a bit more on documentation, as well as splitting up some data that's currently in our ansible repository into a separate one so that other people can use our ansible configuration more easily, without having to fork too much.

Tzafrir

Ran some tests on the ansible setup, did some documentation work, and worked on a kodi plugin for parsing the metadata that we've generated.

Stefano

Did some work on the DebConf website. This wasn't meant to be much, but yak shaving sucks. Additionally, he's been doing some work on the youtube uploader.

Nattie

Did more work reviewing our documentation, and has been working on rewording some of the more awkward bits.

Wouter

Spent much time on improving the SReview installation for FOSDEM. While at it, fixed a number of bugs in some of the newer code that were exposed by full tests of the FOSDEM installation. Additionally, added code to SReview to generate metadata files that can be handed to Stefano's youtube uploader.

Pollo

Although he had less time to sprint remotely yesterday than he did on Monday (and apparently no time today), Pollo still managed to add a basic CI infrastructure for linting our ansible playbooks.

Day one of the pre-FOSDEM Debconf Videoteam sprint

I'm at the Linux Belgium training center, where the DebConf video team is holding a sprint during this last week before FOSDEM. The nice folks of Linux Belgium made us feel pretty welcome:

Linux Belgium message

Yesterday was the first day of that sprint. I had planned to blog about it then, but I forgot, so here goes (first thing this morning).

Nattie and Tzafrir

Nattie and Tzafrir have been spending much of their time proofreading our documentation, and giving us feedback to improve its readability and accuracy.

Stefano

Spent some time working on his youtube uploader. He hasn't finished it to a committable state yet, but more is to be expected today.

He also worked on landing a gstreamer pipeline change that was suggested at LCA last week (which he also attended), and did some work on setting up the debconf18 dev website.

Finally, he fixed the irker config on salsa so that it would actually work and send commit messages to IRC after a push.

Kyle

Wrote a lot of extra documentation on the opsis that we use and various other subjects, and also fixed some of the templates of the documentation, so that things would look better and link correctly.

Wouter

I spent much of my time working on the FOSDEM SReview instance, which will be used next weekend; that also allowed me to improve the code quality of some of the newer stuff that I wrote over the past few months. In between things, being the local guy here, I also drove around getting a bit of stuff that we needed.

Pollo

Pollo isn't here, but he's sprinting remotely from home. He spent some time setting up gitlab-ci so that it would build our documentation after pushing to salsa.
