Converting BigBlueButton recordings to self-contained videos
When the pandemic lockdowns started, my local Linux User
Group started looking at video conferencing
tools we could use to continue presenting talks and other events to
members. We ended up adopting
BigBlueButton: as well as being Open
Source, its focus on education made it well suited for presenting
talks. It has the concept of a presenter role, and built-in support for
slides (it sends them to viewers as images, rather than as another video
stream). It can also record sessions for later viewing.
To view those recordings, though, you need to use BBB's web player. I
wanted to make sure we could keep the recordings available should the
BBB instance we were using go away. Ideally, we'd just be able to
convert the recordings to self-contained video files that could be
archived and published alongside our other recordings. There are a few
tools intended to help with this:

- One captures Chrome displaying BBB's web player to produce a video.
- Another is intended to run on the BBB server, and combines slides,
  screen share and presentation audio using ffmpeg. It does not include
  the camera footage.
I really wanted something that would include both the camera footage and
the slides in one video, so I decided to make my own. The result, at
present, consists of two scripts. The first is download.py,
which takes the URL of a public BBB recording and downloads all of its
assets to a local folder. The second is make-xges.py, which assembles
those assets so they're ready to render.
The resources retrieved by the download script include:

- Video from the presenters' cameras, plus the audio track for the
  presentation.
- Video for the screen sharing segments of the presentation. This is the
  same length as the webcams video, with blank footage when nothing is
  being shared.
- Timing information for when to show the screen share video, along
  with the aspect ratio for each particular share session.
- An SVG file with custom timing attributes that is used to present
  the slides and whiteboard scribbles. By following links in the SVG,
  we can download all the slide images.
- Mouse cursor position over time. This is used for the "red dot
  laser pointer" effect.
- Not actually slides: for some reason, this is the text chat replay.
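Following the slide links boils down to walking the SVG for image references. Here is a minimal sketch of that step; the shapes.svg snippet below is an assumed, cut-down stand-in for the real file (slide images appear as `image` elements with an `xlink:href` plus custom timing attributes), not actual recording data:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XLINK = "{http://www.w3.org/1999/xlink}href"
SVG_NS = "http://www.w3.org/2000/svg"

# A trimmed-down stand-in for BBB's shapes.svg.
SHAPES_SVG = """\
<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <image xlink:href="slides/slide-1.png" in="0.0" out="95.5"/>
  <image xlink:href="slides/slide-2.png" in="95.5" out="200.0"/>
</svg>
"""

def slide_urls(svg_text, base_url):
    """Collect the absolute URL of every slide image referenced in shapes.svg."""
    root = ET.fromstring(svg_text)
    urls = []
    for image in root.iter(f"{{{SVG_NS}}}image"):
        href = image.get(XLINK)
        if href and urljoin(base_url, href) not in urls:
            urls.append(urljoin(base_url, href))
    return urls

print(slide_urls(SHAPES_SVG, "https://bbb.example.com/presentation/abc/"))
```

Each URL can then be fetched and saved next to the other downloaded assets.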
My first thought was to combine the various parts by constructing a
GStreamer pipeline that would play
everything back together, using timers to bring slides in and out. This
turned out to be easier said than done, so I started looking for
something higher level.
It turns out GStreamer has that covered in the form of GStreamer
Editing Services (GES),
a library intended to help write non-linear editing applications. That
fits the problem really well: I've got a collection of assets and
metadata, so I just need to convert all the timing information into an
appropriate edit list. I can put the webcam footage in the bottom right
corner, ask for a particular slide image to display at a particular
point on the timeline and go away at another point, display screen share
footage, etc. It also made it easy to add a backdrop image to fill in
the blank space around the slides and camera and add a bit of branding
to the result.
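The conversion itself is mostly timestamp bookkeeping: GES, like the rest of GStreamer, works in nanoseconds. Here is a rough sketch (not the actual make-xges.py code) of turning screen-share intervals into clip entries; the `Clip` tuple is a hypothetical stand-in for a `GES.UriClip` placed on a timeline layer:

```python
from typing import NamedTuple

# GStreamer clock units: GST_SECOND nanoseconds per second.
GST_SECOND = 1_000_000_000

class Clip(NamedTuple):
    asset: str       # which media file the clip pulls from
    start: int       # position on the timeline (ns)
    inpoint: int     # offset into the source media (ns)
    duration: int    # length of the clip (ns)

def deskshare_clips(events, asset="deskshare.webm"):
    """One clip per (start, end) share interval, in seconds.

    Because the screen-share video is the same length as the whole
    session, the inpoint into the source equals the timeline start.
    """
    clips = []
    for start_s, end_s in events:
        start = int(start_s * GST_SECOND)
        duration = int((end_s - start_s) * GST_SECOND)
        clips.append(Clip(asset, start=start, inpoint=start, duration=duration))
    return clips

for clip in deskshare_clips([(10.0, 25.5), (300.0, 420.0)]):
    print(clip)
```

The real script does the same kind of arithmetic for slides and webcam footage, then hands the results to GES.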
On top of that, I can serialise that edit list to a file, rather than
encoding the video directly. The ges-launch-1.0 utility can load the
project to quickly play back the result without having to wait
for the video to encode.
I can even load the project in Pitivi, a video
editor built on top of GES:
This makes it very easy to scrub through the timeline to quickly verify
that everything looks correct.
At this point, the scripts can produce a crisp 1080p video that should
be good enough for most presentations. There are a few areas that could
be improved, though:

- If there are multiple presenters with their webcams on, we still get
  a single webcam video with each presenter's feed shown in a square
  grid. It would probably look better to stack the presenters
  vertically. This could probably be done by applying videocrop as
  an effect to extract each individual presenter, and including the
  video multiple times in the project.
- The data in cursor.xml is ignored. It would be pretty easy to
  display a small red circle image at the correct times and positions.
- Whiteboard scribbles are also ignored. This would be a bit trickier
  to implement. It would probably involve dissecting shapes.svg into
  a sequence of SVGs containing the elements visible at each point in
  time. One complication is that the viewBox is adjusted when
  switching to/from slides and screen share, and that changes how the
  coordinates of the scribbles are interpreted.
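The presenter-splitting idea comes down to computing crop values for each grid cell, which videocrop could then apply. A sketch of that arithmetic; the layout (presenters packed into the smallest roughly-square grid) is an assumption about how BBB composes the webcams video:

```python
import math

def grid_crops(n_presenters, width, height):
    """Per-presenter left/right/top/bottom crop values (pixels) for a
    webcams video laid out as a roughly-square grid of equal cells."""
    cols = math.ceil(math.sqrt(n_presenters))
    rows = math.ceil(n_presenters / cols)
    cell_w, cell_h = width // cols, height // rows
    crops = []
    for i in range(n_presenters):
        row, col = divmod(i, cols)
        crops.append({
            "left": col * cell_w,
            "right": width - (col + 1) * cell_w,
            "top": row * cell_h,
            "bottom": height - (row + 1) * cell_h,
        })
    return crops

# Two presenters side by side in a 640x480 webcams video:
print(grid_crops(2, 640, 480))
```

Each crop dict would be applied to a separate copy of the webcams clip, and the resulting feeds stacked on the timeline.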
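For the cursor overlay, the first step is parsing cursor.xml into (time, position) samples. A minimal sketch; the XML shown is an assumed format (one event per cursor move, with normalised coordinates and negative values when the cursor is hidden), not taken from a real recording:

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for cursor.xml.
CURSOR_XML = """\
<recording>
  <event timestamp="5.2"><cursor>0.25 0.50</cursor></event>
  <event timestamp="9.8"><cursor>-1.0 -1.0</cursor></event>
</recording>
"""

def cursor_positions(xml_text):
    """Yield (timestamp, x, y) tuples, or (timestamp, None, None) when hidden."""
    root = ET.fromstring(xml_text)
    for event in root.iter("event"):
        ts = float(event.get("timestamp"))
        x, y = (float(v) for v in event.findtext("cursor").split())
        yield (ts, x, y) if x >= 0 else (ts, None, None)

print(list(cursor_positions(CURSOR_XML)))
```

Each visible sample would become a short timeline clip placing the red-dot image at the scaled position.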
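The shapes.svg dissection could start from a time filter over the timing attributes. A sketch under the assumption that scribble elements carry the same in/out attributes as the slides; the snippet below is invented test data, not a real recording:

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Cut-down stand-in for shapes.svg: two whiteboard scribble groups with
# assumed in/out timing attributes (seconds).
SHAPES = """\
<svg xmlns="http://www.w3.org/2000/svg">
  <g in="0.0" out="12.0" id="draw1"/>
  <g in="8.0" out="30.0" id="draw2"/>
</svg>
"""

def visible_ids(svg_text, t):
    """Return ids of elements whose [in, out) interval contains time t."""
    root = ET.fromstring(svg_text)
    return [el.get("id")
            for el in root.iter(f"{{{SVG_NS}}}g")
            if float(el.get("in")) <= t < float(el.get("out"))]

print(visible_ids(SHAPES, 10.0))
```

Rendering one filtered SVG per change point would give the sequence of overlay images, though the viewBox issue described above would still need handling.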
As GUADEC is using BigBlueButton this year, hopefully these scripts
will help with processing the recordings into individual videos.