Leaving a proprietary format is never easy

A story: STARDIVA®

Jérôme Martinez

No Time to Wait 6, October 2022

The issue

Proprietary software was used to record events with up to 8 simultaneous interpretations.

StarDiva description

Content playable only on Windows by a modified version of VLC.

Metadata readable only on Windows by StarDiva tools.

A modified version of VLC?

VLC is open source. And copyleft.

So the source code is available, as the license requires?

Well... No.

StarDiva VLC

Metadata?

Location, date, time, pauses, speaker.

Only one tool is able to read them.

No automation, no export: you are locked in.

StarDiva player

First check

NSV (Nullsoft Streaming Video)

This format is documented, yeah!

Well... No.

First check

NSV is 1 video and 1 audio only.

But we expect 8 audio tracks.

NSV has a spec for metadata.

But no NSV metadata there.

The reverse engineering

It is slow, with no guarantee of a result, but we found no other option.

The content looked easy:
AVC for video and AAC for audio.

Well... It was not so easy.

The metadata looked difficult:
completely opaque bytes.

Well... It was even more difficult than anticipated.

Video

AVC at 25 fps in the raw stream, but stored as 23 fps in the container.

We demux it and fix the raw stream.

A key frame every 10 seconds.

But the video stream can start without a key frame.

Up to the first 10 seconds of video are lost forever.

(and it is not worth trying to retrieve a few macroblocks)
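
A minimal sketch of how the first decodable point could be located, assuming the demuxed AVC stream is in Annex B byte stream form (start codes followed by NAL units). This is an illustration only, not the LeaveSD code; the function names are hypothetical.

```python
def iter_nal_units(data):
    """Yield (offset, nal_unit_type) for each NAL unit found in an
    Annex B byte stream. The offset points at the 3-byte start code."""
    start_code = b"\x00\x00\x01"
    pos = data.find(start_code)
    while pos != -1 and pos + 3 < len(data):
        header = data[pos + 3]
        # The low 5 bits of the NAL header byte hold nal_unit_type.
        yield pos, header & 0x1F
        pos = data.find(start_code, pos + 3)


def first_idr_offset(data):
    """Return the offset of the first IDR slice (nal_unit_type 5),
    or None if the data contains no key frame at all."""
    for offset, nal_type in iter_nal_units(data):
        if nal_type == 5:
            return offset
    return None
```

Everything before the returned offset cannot be decoded cleanly. A real tool would also have to keep the parameter sets (SPS and PPS, types 7 and 8) that the key frame depends on.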

Audio

8-channel AAC with a custom channel mapping and rare AAC features.

FFmpeg fails to decode it.

Fortunately, FAAD does decode it:
- if we hack the AAC bitstream a bit (channel mapping),
- if we discard buggy frames (otherwise the decoder stops).

Audio

2 possibilities:
- improve FFmpeg playback: awful hacks needed to show 8 tracks, support only in FFmpeg-based players, and the AAC decoder to improve too.
- transcoding: more versatile, less work, but it means re-encoding, so some quality loss.

Decision: decode, split the 8 channels into 8 tracks, re-encode.
Silent tracks are discarded
(thanks to the FFmpeg astats RMS level/peak).

Playable by any player supporting MKV+AVC+AAC.
We lose some audio quality, but this is mitigated by using fdk_aac HE-AAC at the same bitrate.
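
The split and the silent-track check can be sketched on decoded PCM as follows. This is only an illustration of the idea: the project itself relied on FFmpeg for both steps, and the 16-bit full scale and the -60 dB threshold below are assumptions, not values from the project.

```python
import math


def split_channels(interleaved, channels=8):
    """Split interleaved PCM samples into one list per channel."""
    if len(interleaved) % channels:
        raise ValueError("sample count is not a multiple of the channel count")
    return [interleaved[c::channels] for c in range(channels)]


def rms_dbfs(samples, full_scale=32768):
    """RMS level in dB relative to full scale (-inf for digital silence)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square / (full_scale * full_scale))


def non_silent_tracks(interleaved, channels=8, threshold_db=-60.0):
    """Return the indexes of the channels worth keeping as tracks.
    threshold_db is a hypothetical value chosen for this sketch."""
    tracks = split_channels(interleaved, channels)
    return [i for i, track in enumerate(tracks) if rms_dbfs(track) > threshold_db]
```

An event recorded with only 2 interpretations would then end up as a 2-track file instead of carrying 6 empty tracks.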

Metadata

It took a lot of work to guess the logic of the tool.

We also needed to understand the bugs in the files :-D.

Not everything is understood, but we have what we wanted.

Converted to Matroska chapters.
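
A minimal sketch of what building a chapters XML for MKVToolNix can look like. The element names follow the MKVToolNix XML chapter format; the event list, the labels, and the function names are hypothetical, and this is not the LeaveSD code.

```python
import xml.etree.ElementTree as ET


def format_timestamp(seconds):
    """Format a position in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_chapters(events, language="eng"):
    """Build a chapters XML string from (start_seconds, label) pairs."""
    root = ET.Element("Chapters")
    edition = ET.SubElement(root, "EditionEntry")
    for start, label in events:
        atom = ET.SubElement(edition, "ChapterAtom")
        ET.SubElement(atom, "ChapterTimeStart").text = format_timestamp(start)
        display = ET.SubElement(atom, "ChapterDisplay")
        ET.SubElement(display, "ChapterString").text = label
        ET.SubElement(display, "ChapterLanguage").text = language
    return ET.tostring(root, encoding="unicode")
```

For example, `build_chapters([(0, "Speaker A"), (3725.5, "Pause")])` yields one chapter per event, which can then be given to the muxer together with the video and audio tracks.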

Metadata

StarDiva Metadata

Time spent

80% of the files were successfully reverse engineered in 20% of the time spent on this project.

We had to process all files several times (over several days) to be sure that our algorithm works everywhere.

Never underestimate the time spent on corner cases.

Quality of data

Estimation of work effort is often based on the assumption that the files are good.

99% of files are usually good.

You don't know the quality of your files before you try to do QA on them.

Found ~1% of files with bad content.

These files usually had 0.01% of their audio packets missing or corrupted.
Such data is lost forever and replaced by silence.

A few files are totally undecodable: everything is lost. Now you know.

The (open source) tools used.

MediaInfo: metadata readout, demux.

FAAD: audio decode.

FFmpeg: channel split, silent track detection.

FFmpeg+fdk_aac: HE-AAC encode.

MKVToolNix: mux with chapters, bitrate stats.

LeaveSD: dedicated command-line tool for automation, fixing buggy packets, tweaking the other tools' command lines, creating the chapters XML, reporting.

Summary

It is not only about "just a quick transcoding".

Buying proprietary software has a long-term cost.

You have no idea about the quality of your files before you analyze them.

No time to wait to check the state of your files.

A developer quotes a high cost for such work?
Well, experience...

We want to share our experience: code is open source.

Stay in touch

MediaArea: https://mediaarea.net, @MediaArea_net

LeaveSD page: https://MediaArea.net/LeaveSD

Jérôme Martinez: jerome@mediaarea.net

Slides: https://MediaArea.net/Events

License (except images): CC BY