Playing with Moose and FFmpeg

As I've blogged before, I've been working on and off on SReview, a video review system. Development over the past year has been mostly driven by the need to have something up and running first for FOSDEM 2017 and then for DebConf17, so I've cut corners left and right. The result is a system that, while functional, is not quite perfect everywhere. For instance, the backend scripts were written in ad-hoc Perl, each reinventing its own wheel. Doing so made it easier for me to experiment with things and figure out where I want them to go, without immediately creating a lot of baggage that I might not want to be stuck with. This flexibility has already paid off: I redid the state machine between FOSDEM and DebConf17, and all that needed was updating a few SQL statements here and there. Well, and adding a few, too.

It was always the intent to replace most of the ad-hoc perl with something better, however, once the time was ripe. One place where historical baggage is not so much of a problem, and where in fact abstracting away the complexity would now be an asset, is in the area of FFmpeg command lines. Currently, these are built by simple string expansion. For example, we do something like this (shortened for brevity):

system("ffmpeg -y -i $outputdir/$slug.ts -pass 1 -passlogfile ...");

in a script where the $outputdir and $slug Perl variables are set. That works, but it has some downsides; for example, adding or removing options based on which codecs we're using is not so easy. It would be much more flexible if the command lines were generated dynamically based on the requested output bandwidth and codecs, rather than being hardcoded in the file. Case in point: there are currently multiple versions of some of the backend scripts that differ only in details, mostly the chosen codec on the ffmpeg command line. Obviously this is suboptimal.
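To make the alternative concrete, here is a minimal sketch of assembling the command as a list of arguments, so that options can be added or dropped depending on what was requested. The build_ffmpeg_command function and its parameter names are invented for this illustration and are not SReview code; the ffmpeg options it emits (-y, -i, -c:v, -b:v, -pass, -passlogfile) are standard ones.

```perl
use strict;
use warnings;

# Hypothetical helper (not SReview code): build an ffmpeg argument list
# from named settings instead of interpolating into one long string.
sub build_ffmpeg_command {
    my (%opt) = @_;
    my @cmd = ("ffmpeg", "-y", "-i", $opt{input});

    # Only emit the options that were actually requested.
    push @cmd, "-c:v", $opt{video_codec}        if defined $opt{video_codec};
    push @cmd, "-b:v", $opt{video_bitrate}      if defined $opt{video_bitrate};
    push @cmd, "-pass", $opt{pass}              if defined $opt{pass};
    push @cmd, "-passlogfile", $opt{passlogfile} if defined $opt{passlogfile};

    push @cmd, $opt{output};
    return @cmd;
}

my @cmd = build_ffmpeg_command(
    input         => "talk.ts",
    output        => "talk.webm",
    video_codec   => "libvpx-vp9",
    video_bitrate => "1M",
);
print join(" ", @cmd), "\n";
# system(@cmd) would then run it without going through a shell.
```

Because the command is a list rather than a string, switching codecs becomes a matter of passing a different value, not maintaining a second copy of the script.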

Instead, we want a way to autodetect video file formats, so that I can just say "create a file that uses the encoder etc. settings of this other file here". In addition, we also want to be able to say "create a file that uses the encoder etc. settings of this other file here, except for these one or two options that I want to fine-tune manually". When I first thought about doing that about a year ago, it seemed complicated and not worth it, or at least not to that extent.

Enter Moose.

The Moose OO system for Perl 5 is an interesting way to do object orientation in Perl. I knew Perl supports OO, and I had heard about Moose, but had never looked into it, mostly because the standard Perl OO features were "good enough". Until now.

Moose has a concept of adding 'attributes' to objects. Attributes can be set at object construction time, or can be accessed later on by way of getter/setter functions, or even simply functions named after the attribute itself (the default). For more complicated attributes, where the value may not be known until some time after the object has been created, Moose borrows the concept of "lazy" variables from Perl 6:

package Object;

use Moose;

has 'time' => (
    is => 'rw',
    builder => 'read_time',
    lazy => 1,
);

sub read_time {
    return localtime();
}

The above object has an attribute 'time', which will not have a value initially. However, upon first read, the read_time builder calls 'localtime()' and the result is cached; on that read and on all further reads, the cached result is returned. In addition, since the attribute is read/write, the time can also be written to. In that case, any cached value that may exist will be overwritten, and if no cached value exists yet, the read_time function will never be called. (It is also possible to clear values if need be, by declaring a 'clearer' method on the attribute, so that the function would be called again.)
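A small sketch of that whole lifecycle, including clearing, might look as follows. The Clock class, its method names and the $builds counter exist only for this example; 'clearer' and 'predicate' are regular Moose attribute options.

```perl
package Clock;    # example class, made up for this sketch
use Moose;

our $builds = 0;  # counts builder calls, for illustration only

has 'time' => (
    is        => 'rw',
    lazy      => 1,
    builder   => '_build_time',
    clearer   => 'clear_time',   # generates $obj->clear_time
    predicate => 'has_time',     # generates $obj->has_time
);

sub _build_time {
    $builds++;
    return scalar localtime();
}

package main;

my $clock = Clock->new;
print "value before first read? ", ($clock->has_time ? "yes" : "no"), "\n";

my $first = $clock->time;    # the builder runs here
my $again = $clock->time;    # cached, the builder is not called again
print "builder calls so far: $Clock::builds\n";

$clock->time("noon");        # an explicit write replaces the cached value
$clock->clear_time;          # forget the value...
$clock->time;                # ...so the builder runs once more
print "builder calls in total: $Clock::builds\n";
```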

We use this with the following pattern:

package SReview::Video;

use Moose;
use JSON;    # provides decode_json, used in _probe below

has 'url' => (
    is => 'rw',
);

has 'video_codec' => (
    is => 'rw',
    builder => '_probe_videocodec',
    lazy => 1,
);

has 'videodata' => (
    is => 'bare',
    reader => '_get_videodata',
    builder => '_probe_videodata',
    lazy => 1,
);

has 'probedata' => (
    is => 'bare',
    reader => '_get_probedata',
    builder => '_probe',
    lazy => 1,
);

sub _probe_videocodec {
    my $self = shift;
    return $self->_get_videodata->{codec_name};
}

sub _probe_videodata {
    my $self = shift;
    if(!exists($self->_get_probedata->{streams})) {
        return {};
    }
    foreach my $stream(@{$self->_get_probedata->{streams}}) {
        if($stream->{codec_type} eq "video") {
            return $stream;
        }
    }
    return {};
}

sub _probe {
    my $self = shift;

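    # Run ffprobe and read its JSON report through a pipe. The list form of
    # open passes the URL as a single argument, bypassing the shell.
    open(my $fh, "-|", "ffprobe", "-print_format", "json",
        "-show_format", "-show_streams", $self->url)
        or die "Cannot run ffprobe: $!";
    my $json = "";
    while(<$fh>) {
        $json .= $_;
    }
    close $fh;
    return decode_json($json);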
}

The videodata and probedata attributes are for internal use only, and are therefore of the 'bare' type: no default accessor is generated for them, so they can be neither read nor written to by name. However, we do add 'reader' functions that can be used from inside the object, so that the object itself can access them. These reader functions are generated, so they're not part of the object source. The probedata attribute's builder simply calls ffprobe with the right command-line arguments to retrieve data in JSON format, and then decodes that JSON output.

Since the parsed JSON data contains an array with (at least) two streams, one for video and one for audio, and since the ordering of those streams depends on the file and is therefore not guaranteed, we have to loop over them. Since doing so in each and every attribute of the file we might be interested in would be tedious, we add a videodata attribute that just returns the data for the first video stream found (the actual source also contains a similar one for audio streams).
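For illustration, here is the stream-selection step on its own. The JSON below is a hand-written, heavily trimmed stand-in for ffprobe output (real output carries many more fields), and first_stream_of_type is a name invented for this sketch. It uses JSON::PP from the Perl core so that it runs without extra modules.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hand-written sample in the shape of "ffprobe -print_format json
# -show_streams" output; note that the audio stream comes first here.
my $sample = <<'EOF';
{
  "streams": [
    { "index": 0, "codec_type": "audio", "codec_name": "aac" },
    { "index": 1, "codec_type": "video", "codec_name": "h264" }
  ]
}
EOF

# Return the first stream of the requested type, or an empty hash
# reference when there is none (mirroring _probe_videodata above).
sub first_stream_of_type {
    my ($probedata, $type) = @_;
    foreach my $stream (@{ $probedata->{streams} // [] }) {
        return $stream if ($stream->{codec_type} // "") eq $type;
    }
    return {};
}

my $probedata = decode_json($sample);
my $video     = first_stream_of_type($probedata, "video");
print "video codec: $video->{codec_name}\n";
```

Returning an empty hash reference for the "not found" case means callers can still look up keys such as codec_name and simply get undef back.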

So, if you create an SReview::Video object and you pass it a filename in the url attribute, and then immediately run print $object->video_codec, then the object will

1. call ffprobe, and cache the (decoded) output for further use;
2. from that, extract the video stream data, and cache that for further use;
3. from that, extract the name of the used codec, cache it, and then return that name to the caller.

However, if the caller first calls $object->video_codec('h264'), then the ffprobe call and most of the caching will be skipped, and instead 'h264' will be returned as the video codec name.
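The difference between the two cases can be observed with a cut-down stand-in for the class. In this sketch, Demo::Video is a made-up name, the _probe builder returns canned data instead of calling ffprobe, and the $probes counter is there only to show when the builder runs.

```perl
package Demo::Video;    # stand-in for SReview::Video, for illustration only
use Moose;

our $probes = 0;        # how often the (fake) ffprobe step ran

has 'url' => (is => 'rw');

has 'video_codec' => (
    is      => 'rw',
    lazy    => 1,
    builder => '_probe_videocodec',
);

has 'probedata' => (
    is      => 'bare',
    reader  => '_get_probedata',
    lazy    => 1,
    builder => '_probe',
);

sub _probe_videocodec {
    my $self = shift;
    return $self->_get_probedata->{streams}[0]{codec_name};
}

# Canned data instead of a real ffprobe call, so the sketch runs anywhere.
sub _probe {
    $probes++;
    return { streams => [ { codec_type => "video", codec_name => "vp9" } ] };
}

package main;

my $probed = Demo::Video->new(url => "file.ts");
print $probed->video_codec, "\n";    # triggers the probe
print $probed->video_codec, "\n";    # served from the cache

my $forced = Demo::Video->new(url => "file.ts");
$forced->video_codec("h264");        # set before any read
print $forced->video_codec, "\n";    # no probe needed
print "probe ran $Demo::Video::probes time(s)\n";
```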

Okay, so with a reasonably small amount of code, we now have a bunch of attributes that have defaults based on actual files but can be overwritten when necessary. Useful, right? Well, sometimes you also want to generate a video file that uses the same codec settings as this other file here. That's easy. First, we add another attribute:

has 'reference' => (
    is => 'ro',
    isa => 'SReview::Video',
    predicate => 'has_reference'
);

which we then use in the _probe method like so:

sub _probe {
    my $self = shift;

    if($self->has_reference) {
        return $self->reference->_get_probedata;
    }
    # original code remains here
}

With that, we can create an object like so:

my $video = SReview::Video->new(url => 'file.ts');
my $generated = SReview::Video->new(url => 'file2.ts', reference => $video);

Now, if we ask the $generated object what the value of its video_codec setting is without telling it ourselves first, it will fall back on the probed data of the $video object.
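Put together in runnable form, this could look like the sketch below. Demo::RefVideo is a made-up stand-in for SReview::Video, and the %FAKE_PROBE table with its codec name replaces the real ffprobe call with invented data. The third object shows the "same settings, except for this one option" case.

```perl
package Demo::RefVideo;    # hypothetical stand-in for SReview::Video
use Moose;

# Invented probe results keyed by URL, replacing a real ffprobe call.
my %FAKE_PROBE = ('file.ts' => { codec_name => 'mpeg2video' });

has 'url' => (is => 'rw');

has 'reference' => (
    is        => 'ro',
    isa       => 'Demo::RefVideo',
    predicate => 'has_reference',
);

has 'video_codec' => (
    is      => 'rw',
    lazy    => 1,
    builder => '_probe_videocodec',
);

has 'probedata' => (
    is      => 'bare',
    reader  => '_get_probedata',
    lazy    => 1,
    builder => '_probe',
);

sub _probe_videocodec {
    my $self = shift;
    return $self->_get_probedata->{codec_name};
}

sub _probe {
    my $self = shift;
    # Borrow the reference's probe data when a reference was given.
    return $self->reference->_get_probedata if $self->has_reference;
    return $FAKE_PROBE{ $self->url } // {};
}

package main;

my $video     = Demo::RefVideo->new(url => 'file.ts');
my $generated = Demo::RefVideo->new(url => 'file2.ts', reference => $video);
my $tuned     = Demo::RefVideo->new(url => 'file3.ts', reference => $video,
                                    video_codec => 'vp9');

print $generated->video_codec, "\n";    # taken from $video's probe data
print $tuned->video_codec, "\n";        # the explicit value wins
```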

That only misses generating the ffmpeg command line, but that's all fairly straightforward and therefore left as an exercise to the reader. Or you can cheat, and look it up.