This page is under construction. Current status: personal notes. Last edited: 2025-04-13
If you just want the tl;dr, jump to #Methodology and shell commands
Note to self: I should put collapsing menus for the explanations of choices of specific software or methods, so that people that just want to cut to the guide can do so.
In this order: dvdbackup to copy the disk to the hard drive; mkvmerge (from mkvtoolnix) to convert the disk's multi-file video layout into a single mkv file; mkvpropedit (also from mkvtoolnix) to fix the naming and tagging of audio and subtitle tracks (for instance, setting the language of an audio track or tagging an audio track as commentary); ffmpeg to transcode the video to the desired formats.
Why use dvdbackup over makemkv for backing up dvds? The first and most important reason is that dvdbackup is libre software and makemkv is not. THE MAKEMKV DEVELOPERS CALLED THE GPL A CANCER: citation from source code in mmdvdnav.h: « GPL is cancer. A hacky and otherwise useless glue code to comply with GPL licensing. ». Moreover, dvdbackup is much faster than makemkv. Before copying the disk, makemkv has to analyze the different movie titles on the disk, which takes a long time: VOB files are not contiguous and the format is rather weird, so makemkv has to run through the whole disk to figure out the mapping between VOB files and titles. By comparison, dvdbackup does not have this step; it just makes a 1 to 1 copy of the disk. What's more, makemkv is a graphical tool and requires multiple manual steps (one click to open the disk, another to analyse the titles and yet another to back up the titles to mkv), whilst dvdbackup, being a command line tool, can be fully automated.
Why use mkvtoolnix over ffmpeg for converting the multiple VOB files of a disk into a single mkv file? I had issues with ffmpeg: when converting directly to the mkv container, the middle of the file was missing, and I ran into different issues when converting to mp4 instead. I concluded that I needed a tool dedicated to creating mkv files, which is mkvtoolnix.
Why use ffmpeg over handbrake for transcoding? In my opinion, ffmpeg gives you more control and is easier to understand than handbrake because it uses less abstraction, which may scare people but is good for understanding (read ffmpeg's manual! it's very nice). ffmpeg is a ubiquitous tool and as such has high quality documentation and a million answered questions - if you encounter a problem, someone will already have solved it. ffmpeg is installed pretty much everywhere and relatively recent versions are in every repo (not the case with handbrake on parabola, for instance).
From Freedom respecting file formats to use for video - Conclusion
mkv container with AV1 video, Opus audio and WebVTT subtitles for text subtitles, and fallback to VOBSUB if existing subtitles were in an image format.
See Freedom respecting file formats to use for video for the full breakdown.
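As a rough illustration of what that target looks like on the command line, here is a minimal sketch of a transcode to AV1 video and Opus audio in an mkv container. The function name `transcode_to_av1`, the choice of the libsvtav1 encoder, the CRF of 30 and the 128k audio bitrate are all assumptions made for the example, not recommendations from this guide, and the sketch assumes an ffmpeg build that includes libsvtav1 and libopus. Subtitles are copied as they are, which covers the VOBSUB fallback; converting text subtitles to WebVTT is not shown.

```bash
# Sketch only: transcode one remuxed mkv to AV1 video + Opus audio.
# ASSUMPTIONS: encoder (libsvtav1), CRF (30) and audio bitrate (128k) are
# placeholder values chosen for the example; tune them for your material.
transcode_to_av1() {
    input=$1
    output=$2
    # -map 0 keeps every stream of the input (all audio and subtitle tracks);
    # -c:s copy leaves the subtitle tracks untouched (e.g. VOBSUB).
    ffmpeg -i "$input" \
        -map 0 \
        -c:v libsvtav1 -crf 30 \
        -c:a libopus -b:a 128k \
        -c:s copy \
        "$output"
}
```

It would be called as `transcode_to_av1 "remuxed/film.mkv" "transcoded/film.mkv"`, where both paths are placeholders.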
If we use `dvdbackup -M` to make a complete mirror of the movie "The Lord of the Rings: The Return of the King (2003)", we get this:
Lotr Return Of The King See D1/VIDEO_TS
---- Lotr Return Of The King See D1/VIDEO_TS/VIDEO_TS.BUP
---- Lotr Return Of The King See D1/VIDEO_TS/VIDEO_TS.IFO
---- Lotr Return Of The King See D1/VIDEO_TS/VIDEO_TS.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_0.BUP
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_0.IFO
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_0.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_1.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_2.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_3.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_4.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_5.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_6.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_7.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_01_8.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_02_0.BUP
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_02_0.IFO
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_02_0.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_02_1.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_03_0.BUP
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_03_0.IFO
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_03_0.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_03_1.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_04_0.BUP
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_04_0.IFO
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_04_0.VOB
---- Lotr Return Of The King See D1/VIDEO_TS/VTS_04_1.VOB
The VOB files are the actual video content. The IFO files contain information on the video files, like chapter info, menu information and so on. The BUP files are just backups of the IFO files (so that if your dvd gets scratched there's less chance that it becomes totally unreadable because the menu file got corrupted, I'm guessing). The VTS_01_*.VOB files form the Title set 1. Why are there so many .VOB files? Well, dvds have this weird limitation that files can't be over 1 GB, therefore a whole movie has to be split into multiple video files. Usually, a Title set corresponds to a feature of the dvd: here, for instance, the Title set 1 corresponds to the main feature, that is the actual movie, and the Title set 4 corresponds to a bonus interview of the actors. However, the Title set 2 is just a short black screen and the Title set 3 is the animated logo of the company. This means we can skip all the commercials and annoying stuff by backing up just the right files.
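To get a feel for which Title set is which on a mirrored dvd, the following sketch adds up the size of the content VOBs of each Title set. The function name `title_set_sizes` is made up for this example, and it assumes that VTS_NN_0.VOB only holds the menu of the Title set, so that file is left out of the total. The biggest Title set is usually, but not always, the main feature.

```bash
# Sketch only: print "title-set total-bytes" for each Title set found in a
# VIDEO_TS directory, one per line.
# ASSUMPTION: VTS_NN_0.VOB holds only the menu and is therefore skipped.
title_set_sizes() {
    video_ts=$1
    for ifo in "$video_ts"/VTS_[0-9][0-9]_0.IFO; do
        [ -e "$ifo" ] || continue
        set_id=${ifo##*/VTS_}   # e.g. "01_0.IFO"
        set_id=${set_id%%_*}    # e.g. "01"
        total=0
        for vob in "$video_ts"/VTS_"$set_id"_[1-9].VOB; do
            [ -e "$vob" ] || continue
            size=$(wc -c < "$vob")
            total=$((total + size))
        done
        echo "$set_id $total"
    done
}
```

For the mirror shown above it would be called as `title_set_sizes "Lotr Return Of The King See D1/VIDEO_TS"`.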
First off, you must choose whether you want to keep all the extra features of a dvd, like making-ofs, trailers, bonuses etc., or whether you just want the main film (which we shall call the "main feature").
Directory hierarchy
raw/ # we put the raw copies of the dvd titles here
named/ # we put the renamed raw copies of the dvd titles here
remuxed/ # we put the merged mkv files here
transcoded/ # [optional] we put the transcoded files here
If you just want the main feature, just use `dvdbackup -F -o raw/` to back up all the VOB files of the main feature to raw/. For instance, if you were ripping the movie "The Lord of the Rings: The Return of the King (2003)", you would get a directory hierarchy like this:
raw/ # we put the raw copies of the dvd titles here
---- raw/Lotr Return Of The King See D1
---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_0.BUP
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_0.IFO
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_0.VOB
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_1.VOB
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_2.VOB
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_3.VOB
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_4.VOB
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_5.VOB
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_6.VOB
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_7.VOB
---- ---- ---- raw/Lotr Return Of The King See D1/VIDEO_TS/VTS_01_8.VOB
As you can see, the name of the directory set by dvdbackup, which is the title of the dvd, can be a bit random. In this case it's not too bad, but sometimes you can even get dvds with really generic names like "awesomedvd". Therefore, after the dvd is copied, go find the reference of the movie on TheMovieDataBase (TMDB) and rename the directory holding the dvd accordingly. Sometimes it is hard to find the reference of the dvd: there are, for instance, many different versions of Cinderella. In that case, look at the copyright year written on the dvd to help you, or search by actor/director on TheMovieDataBase and identify the correct movie that way. Put the renamed dvd in named/.
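The renaming step can be wrapped in a small helper so that nothing gets overwritten by accident. This is only a sketch: the function name `name_dvd` is hypothetical, the reference is still looked up and typed in by hand, and the "Title (Year)" naming simply follows the way movies are written out elsewhere on this page.

```bash
# Sketch only: move raw/<dvd label> to named/<reference>.
# Both arguments are supplied by hand after looking the movie up.
name_dvd() {
    label=$1       # directory name chosen by dvdbackup
    reference=$2   # e.g. "The Lord of the Rings: The Return of the King (2003)"
    if [ ! -d "raw/$label" ]; then
        echo "raw/$label does not exist" >&2
        return 1
    fi
    # refuse to overwrite a dvd that was already named
    if [ -e "named/$reference" ]; then
        echo "named/$reference already exists" >&2
        return 1
    fi
    mkdir -p named
    mv "raw/$label" "named/$reference"
}
```

Run from the directory that contains raw/ and named/, for example: `name_dvd "Lotr Return Of The King See D1" "The Lord of the Rings: The Return of the King (2003)"`.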
Then, you can use mkvmerge to remux all the VTS_*_*.VOB files into a single mkv. Of course, you don't have to do this movie by movie; you can back up and rename a bunch of dvds, then use a bash for loop to remux them all at once. Put the result in remuxed/. Don't forget to use `--chapters raw/$movie_name/VIDEO_TS:
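A minimal sketch of such a loop, for dvds that were backed up as main feature only and already renamed into named/. The function name `remux_all` is hypothetical. The sketch makes two assumptions of its own: it skips VTS_NN_0.VOB, on the assumption that this file only holds the menu, and it uses mkvmerge's parenthesis syntax, which treats the listed files as one concatenated file. Chapter import is deliberately left out.

```bash
# Sketch only: remux every dvd in named/ into remuxed/<name>.mkv.
# ASSUMPTIONS: each dvd holds a single Title set (main feature backup) and
# VTS_NN_0.VOB is a menu file that should not end up in the mkv.
remux_all() {
    mkdir -p remuxed
    for dir in named/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # collect the content VOBs in order; skip the dvd if there are none
        set -- "$dir"VIDEO_TS/VTS_*_[1-9].VOB
        [ -e "$1" ] || continue
        # ( a b c ) makes mkvmerge read the files as one concatenated file
        mkvmerge -o "remuxed/$name.mkv" '(' "$@" ')'
    done
}
```

Run it from the directory that contains named/ and remuxed/.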
Ok, I lied, the dvd structure is in fact more complicated than I presented in #Some background on dvd structure. In fact, a single VOB file can hold multiple titles. If you mkvmerge that file, you will get an mkv which contains all the titles of the VOB in a single file, whilst we want those titles to be split into different mkvs. Splitting a file into multiple mkvs is of course possible, but it would have to be done by manually specifying the timestamps at which to split, which is really annoying. This means that backing up all the titles of the dvd is not as simple as making a mirror of the dvd then running mkvmerge on each Title set. Instead, we have to back up each individual title with dvdbackup. So, we first run `dvdbackup -I` to get a list of all the titles of the dvd, then we use some scripting glue to run dvdbackup `--title=
Unfortunately, it's not as simple as that either, because dvds list many duplicate titles, so if you back up each title you will end up filling your drive with tons of duplicate data. I wrote a script to detect and delete the duplicates, which I shall add here when it is finalised. It uses xxh3sum to get a hash of each title (which is just the concatenation of the hashes of the VOB files of the title); to print the duplicates, we then just sort by hash and run a little awk script. The issue is that this solution is not optimal, in that we are still reading and writing duplicate data that is immediately going to be deleted. When ripping a single dvd, that is fine, but if you are ripping a collection of 350 dvds like me, then 10 GB of duplicate data per dvd means 3.5 TB of extraneous reading and writing, which is going to take forever and wear out my hard drive for nothing. So, it would be really nice to patch dvdbackup to add a way of detecting duplicates by printing the sectors of the dvd to which each title points. dvdbackup is only 2000 lines of C if you don't count the dependencies, so perhaps I can figure it out myself. I will certainly try when I have time.
Also, dvds hold a lot of dummy titles that are just a few seconds of black video. I should add a way of detecting those blank titles in my script. Perhaps we can just use file size as an indicator. That sounds like it could work, because compression should make the file size of black-only video very small. The only thing to be aware of is that there are sometimes image galleries on dvds, which can be something like 1 second of video at 25 fps with a different image in each frame. The size of such an image gallery might be quite small, so we have to make sure we aren't deleting any extras.
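A minimal sketch of the file size idea, assuming each title was backed up to its own directory laid out as `<root>/<title>/VIDEO_TS/`. The function name `list_small_titles` and the byte limit are hypothetical, and a sensible limit still has to be found by experiment. The sketch only prints the candidates and deletes nothing, precisely because a small title may be an image gallery rather than a blank one.

```bash
# Sketch only: print every title directory under $1 whose files add up to
# fewer than $2 bytes, followed by that total.
# ASSUMPTION: titles are laid out as <root>/<title>/VIDEO_TS/<files>.
list_small_titles() {
    root=$1
    limit=$2
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        total=0
        for f in "$dir"VIDEO_TS/*; do
            [ -f "$f" ] || continue
            size=$(wc -c < "$f")
            total=$((total + size))
        done
        if [ "$total" -lt "$limit" ]; then
            # strip the trailing slash left over from the glob
            echo "${dir%/} $total"
        fi
    done
}
```

The printed list is meant to be reviewed by hand before anything is removed.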
TODO: finish and add the script that takes a directory full of dvd mirrors, extracts all titles from those dvds, removes all duplicate titles, then remuxes each title into a single mkv, putting the result in /remuxed
Current status of the script UNTESTED, USE AT YOUR OWN RISK
#+begin_src bash
tempdir=$(mktemp --directory)
for movie in ripping/*; do
    name=$(basename "$movie")
    #backup all titles of the movie and add the hash of the title to dirsums
    echo "backing up all titles of $movie"
    dirsums=""
    titles=$(dvdbackup -i "$movie" -I | grep "Title [[:digit:]]*:$" | cut -d ' ' -f 2 | cut -d ':' -f 1)
    for title in $titles; do
        dvdbackup -i "$movie" -o "$tempdir" --title="$title" --name="title$title" --progress
        #the glob must sit outside the quotes, otherwise it is never expanded
        sum=$(xxh3sum "$tempdir/title$title/VIDEO_TS/"* | cut -d ' ' -f 1 | tr -d '\n')
        #one "path hash" line per title
        dirsums+="$tempdir/title$title $sum"$'\n'
    done
    #delete all duplicate titles: once sorted by hash, every line whose hash
    #equals the hash of the previous line is a duplicate
    to_delete=$(printf '%s' "$dirsums" | sort -k2,2 | awk '$2 == prev { print $1 } { prev = $2 }')
    for title_path in $to_delete; do
        echo "deleting duplicate $title_path"
        rm -rf "$title_path"
    done
    #TODO: dvds have a lot of blank titles that are just ~1 second of black video
    #use filesize to detect and delete these
    #merge each remaining title into a single mkv file
    mkdir -p "./cleaned/$name/Extras"
    for title in $titles; do
        #skip the titles that were deleted as duplicates
        [ -d "$tempdir/title$title" ] || continue
        #TODO: import the chapters with --chapters
        mkvmerge -o "./cleaned/$name/title$title.mkv" '[' "$tempdir/title$title/VIDEO_TS/"* ']'
    done
    #we consider the biggest file to be the one containing the main feature
    #TODO: use ffprobe or mkvtoolnix to get the duration instead
    #and we put everything else in Extras/
    main=$(ls -S "./cleaned/$name"/*.mkv | head -n 1)
    for mkv in "./cleaned/$name"/*.mkv; do
        [ "$mkv" = "$main" ] || mv "$mkv" "./cleaned/$name/Extras/"
    done
    #empty the temporary directory before the next dvd
    rm -rf "${tempdir:?}"/*
done
rm -rf "$tempdir"
#+end_src
Note to self: talk about the productivity tricks, like using eject -t after the backup command if you are near the machine that rips, so that you get notified when the disk has finished ripping and you can put in a new one, or using a visual bell otherwise (i3 has settings to react to bells btw, like switching to that window; also talk about urxvt notification)
Have an example section with some tricky examples I encountered
Warn about the booby-trapped dvd we encountered, which would have filled my drive with junk. A defense would perhaps be to monitor the size of the directory of the dvd backup and kill the process if that size exceeds a limit.
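That defense could look something like the sketch below: a watchdog that polls the size of the backup directory once per second and kills the backup process as soon as a limit is exceeded. The function name `watch_backup_size`, the polling interval and the limit in the usage comment are all assumptions made for the example, as is the /dev/dvd device path.

```bash
# Sketch only: kill process $1 as soon as directory $2 uses more than $3
# kilobytes. Returns 1 if the process was killed, 0 if it ended by itself.
watch_backup_size() {
    pid=$1
    dir=$2
    limit_kb=$3
    # kill -0 sends no signal, it only checks that the process still exists
    while kill -0 "$pid" 2>/dev/null; do
        used_kb=$(du -sk "$dir" 2>/dev/null | cut -f 1)
        if [ "${used_kb:-0}" -gt "$limit_kb" ]; then
            echo "$dir exceeds $limit_kb kB, killing $pid" >&2
            kill "$pid"
            return 1
        fi
        sleep 1
    done
    return 0
}

# Possible usage (device path and limit are placeholders):
#   dvdbackup -i /dev/dvd -F -o raw/ &
#   watch_backup_size $! raw/ 10000000
```

A dvd holds less than 10 GB, so a limit a little above that should never trigger on a normal disk; the exact value is left open here.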