Transcoding videos

Since some people asked for it, and since it's (unfortunately) fairly nontrivial, here is the script that I'm using currently to transcode a .dv file into webm:

#!/bin/bash

set -e

# Output and temporary file names, derived from the input file name.
newfile="$(basename "$1" .dv).webm"
wavfile="$(basename "$1" .dv).wav"
normalfile="$(basename "$1" .dv)-normal.wav"
# The file:// URIs below need absolute paths.
normalfile="$(readlink -f "$normalfile")"
oldfile="$(readlink -f "$1")"

# Extract the audio track to a WAV file so that sox can normalise it.
echo "Audio split"
gst-launch-0.10 uridecodebin uri="file://$oldfile" ! progressreport ! audioconvert ! audiorate ! wavenc ! filesink location="$wavfile"
echo "Audio normalize"
sox --norm "$wavfile" "$normalfile"
# First pass: only collect VP8 statistics; the muxed output goes to fakesink.
echo "Pass 1"
gst-launch-0.10 webmmux name=mux ! fakesink \
  uridecodebin uri="file://$oldfile" name=demux \
  demux. ! ffmpegcolorspace ! deinterlace ! vp8enc multipass-cache-file=/tmp/vp8-multipass multipass-mode=1 threads=2 ! queue ! mux.video_0 \
  demux. ! progressreport ! audioconvert ! audiorate ! vorbisenc ! queue ! mux.audio_0
# Second pass: encode the video for real and mux in the normalised audio.
echo "Pass 2"
gst-launch-0.10 webmmux name=mux ! filesink location="$newfile" \
  uridecodebin uri="file://$oldfile" name=video \
  uridecodebin uri="file://$normalfile" name=audio \
  video. ! ffmpegcolorspace ! deinterlace ! vp8enc multipass-cache-file=/tmp/vp8-multipass multipass-mode=2 threads=2 ! queue ! mux.video_0 \
  audio. ! progressreport ! audioconvert ! audiorate ! vorbisenc ! queue ! mux.audio_0

# Clean up the temporary audio files.
rm "$wavfile" "$normalfile"

It's fairly non-optimal, because many of these command-line A/V tools are either fairly badly documented, or have a horrific interface, or something similar.

avconv is supposed to be able to do audio normalisation, but I haven't been able to figure out exactly how it's done. The options that are specified in the manpage seemingly have no effect.

gstreamer has a 'ReplayGain' plugin which can do audio normalisation. Audio normalisation requires two passes; I could in theory just add the element in the gstreamer first-pass pipeline, add a '-t' to the gst-launch-0.10 invocation, and parse out the required gain value that I can then add to the second-pass pipeline. However, the elements that I can then pass that gain value to either have a different range for the gain (rgvolume's fallback-gain parameter) or expect a completely different unit of values with no obvious way to translate between the two (the 'volume' element's 'volume' parameter expects a multiplier, the replaygain plugin produces a dB value; a simple conversion seems wrong as it produces a value way out of range). The 'volume' element reportedly has an interface that takes a dB value, except you can't reach it from gst-launch-0.10. Bummer.
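
For reference, the usual way to turn a gain in dB into an amplitude multiplier is 10^(dB/20). The sketch below only shows that arithmetic. It assumes the value reported by the replaygain element is a plain amplitude gain in dB, which I have not verified, and db_to_linear is a made-up helper name rather than part of any tool. If the result still falls outside what the 'volume' element accepts, the reservation above stands.

```shell
#!/bin/bash
# Convert a gain in dB to a linear amplitude multiplier: 10^(dB/20).
# db_to_linear is a hypothetical helper name, not part of gstreamer or sox.
db_to_linear() {
  # awk has no pow(), so compute 10^x as exp(x * ln(10)).
  awk -v db="$1" 'BEGIN { printf "%.4f\n", exp(db / 20 * log(10)) }'
}

db_to_linear 0      # unity gain
db_to_linear -6.02  # roughly halves the amplitude
db_to_linear 6.02   # roughly doubles the amplitude
```

The printed value is what would go into something like volume=... in the second-pass pipeline, assuming the conversion is in fact the right one for the reported gain.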

So instead we split out the audio, do normalisation in sox, and mux that back in during the second pass of the video transcoding. Sox is good. Sox is easy. If all A/V command-line tools were like sox, I would be smiling now.

What's missing from the above script is how to throw away the start or end of a file in case there are several minutes of uninteresting material there. This is easiest (in my experience) with avconv's -t and -ss options. I suspect gstreamer is capable of doing that, too; but I haven't figured out how, and this works.
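
A minimal sketch of such a trimming step, to be run before the transcoding script. The file names and times are invented for illustration, and trim_cmd is a hypothetical helper that only prints the avconv command line so you can check it first. It assumes that copying the streams (-c copy) is acceptable for your .dv input; check the avconv manpage on your system before relying on it.

```shell
#!/bin/bash
# Build an avconv command line that keeps only part of a recording.
# trim_cmd is a hypothetical helper; it prints the command instead of running it.
trim_cmd() {
  local input=$1 start=$2 duration=$3 output=$4
  # -ss: position to start from; -t: how much to keep; -c copy: do not re-encode.
  local cmd=(avconv -i "$input" -ss "$start" -t "$duration" -c copy "$output")
  echo "${cmd[@]}"
}

# Skip the first 3m30s of talk.dv, then keep the following 45 minutes.
trim_cmd talk.dv 00:03:30 00:45:00 talk-trimmed.dv
```

Once the printed command looks right, run it and feed the trimmed file to the transcoding script instead of the original.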

You may want to play with some of the options of the vp8enc element. For instance, the threads= value can be increased (or decreased) depending on how many cores your transcoding machine has. You could probably also add a "target-bitrate" value if you find the quality is too low or the result uses too much disk space. VP8 has several more options; the documentation lists them all.
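
One way to avoid hard-coding the thread count is to derive it from the number of cores. This is only a sketch: the vp8_opts variable is made up for illustration, and whether one thread per core is the best setting for vp8enc is an assumption, not something I have measured.

```shell
#!/bin/bash
# Use the number of online cores as the vp8enc thread count.
# nproc comes with GNU coreutils; fall back to getconf where it is missing.
threads=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# vp8_opts is a hypothetical variable holding the element description
# that would replace "vp8enc ... threads=2" in the first-pass pipeline.
vp8_opts="vp8enc multipass-cache-file=/tmp/vp8-multipass multipass-mode=1 threads=$threads"
echo "$vp8_opts"

# To list the properties your installed vp8enc actually supports:
#   gst-inspect-0.10 vp8enc
```

The gst-inspect-0.10 output is also the safest place to check the exact name and unit of the bitrate property before adding it to the pipeline.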

If you've done video recordings for a devroom at FOSDEM outside of the FOSDEM video team, note that FOSDEM is more than willing to host the videos. The easiest way to arrange that is to put the videos online somewhere (this does not have to be a public system) that we can wget or scp them from; once you've done that, contact me with the location of the files and we'll put them on the FOSDEM video server.