Wednesday, 4 February 2009

Of Rips and Magical Musical DVDs

If there is one thing that irks me almost as much as mistagged mp3s, it's poorly encoded videos. Why is it that AVI is still so popular when Matroska containers are superior in every way? Why is MP3 still being used for the audio when any device that can play the video certainly will have enough grunt to play Ogg Vorbis (codec support notwithstanding)? God forbid something encoded in ... [gasp] MPEG-2 - H.264 people, H.264 (patent law notwithstanding)! Even mplayer on my Eee 701SD (running Debian Lenny) can handle all that without missing a frame!
This rant comes about as a result of me trying to buy a certain music DVD for about five months now. I've had it on backorder for months and I've tried JB HiFi and some other local shops. Not really trusting eBay for these kinds of purchases, I eventually decided to just use my leftover internet quota for the month and download the thing.
On a side note, record companies - if you want to make money, why do you make it so difficult to buy things from you? "On backorder, you will receive an email when the item is back in stock", "Unfortunately xxxxx are sold out. Would you like to...", "still out of stock, there's some sorta licensing issue which is taking forever to resolve." - these are just some of the quotes I've heard as a consumer over the last year. Then there was that incident with those CDs stuck in US customs for two months without anyone knowing where they were, costing more money as replacements were sent, then more money again as they were returned after customs finally released them (So, the only thing that Free Trade Agreement did was stuff up our legal system then?)... Yeah, I'm a little off buying CDs over the Internet by now, but of course the alternative of buying in a store is quite difficult given that the bands practically have to have achieved worldwide fame to have a snowball's chance in hell of actually being in stock (ok, I am exaggerating that a little).
Now, since I live in Australia and have limits on how much I can download in a month, I strongly prefer not to download any files larger than ~700MB - and why should I? All the music DVDs I've ripped myself sound (and to a lesser extent, look) superb at that size, so surely this one couldn't be that much worse than mine, right? Wrong.
Now, ask yourself this - if you were ripping a music DVD, you would make sure that you set a decent bitrate on the audio track wouldn't you? I certainly would - at least 196kbps, perhaps even as high as 256kbps for those of you with ultra sensitive hearing. Well, let's just say that this particular download was a tad less than that and not go into the details too much. I won't even mention just how excruciatingly painful it was to try to listen to.
Now, it's not that hard to do a decent encoding, but it is important to have a reasonable understanding of what's actually involved in the process. It is important to know your source media - is it interlaced? Does it need to be cropped? Is there a subtitle track that you should rip as well? Is there just the one audio track; which one is the right one? Does the aspect ratio need to be fixed?
Many of those answers will vary from situation to situation and from DVD to DVD, so there isn't a one size perfectly fits all solution. Of course there are graphical tools to do this for you and some of them are no doubt pretty good, though they do not remove the need to have at least a basic understanding of what is actually happening if you want good results. I'm not going to cover any graphical tool though, I learned how to do this on the command line years ago and have stuck with that, merely expanding my knowledge when new codecs and options came out. This shell script (which I know needs work - patches welcome) is my current best practice for ripping DVDs for my personal use.
It does make a few assumptions - that the DVD is interlaced and that you want it de-interlaced (because you will be playing it on a computer monitor as opposed to a TV), that there is no subtitle track that you want to extract (if you do, add "-sid n" without the quotes and where n is the subtitle track you want, usually 0, to the end of each line starting with mencoder, though also note that there are "better" ways to do this), that this is a music DVD and not a movie (I recommend lowering the audiobitrate to 128 if it is a movie), that you only want the one default audio track (if not, specify it with mplayer's -aid option and find the appropriate ID with mplayer's -identify option), and that it doesn't need to be cropped (too error prone to automate - look at the -vf cropdetect and -vf crop options in mplayer if you need it).
You will need a few dependencies: You need the Matroska tools, Vorbis tools and x264 libraries. You will also need to make sure that you have mplayer AND mencoder built with x264 support and able to play your DVD. This probably means you will need to compile it from source, which is outside the scope of this article on account of me needing to sleep soon. Also note that depending on your location you may find that you may have legal issues regarding the patents surrounding the H.264 codec. Not to mention that you may live in a country where you cannot legally format shift or where breaking Technological Protection Measures (such as encrypted DVDs) is plain illegal - I leave it to the reader to verify that they can legally do these things or go away and complain loudly to their Government if they can't, just don't go and drag me into it all, I'm just not in the mood.
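Before going any further, it can save some frustration to confirm that the command line tools the script calls are actually on your PATH. The following is only a minimal sketch: missing_tools is a helper name made up for illustration, and it only checks that each command exists, not that mplayer and mencoder were built with x264 support or can read your DVD.

```shell
# Hypothetical helper (not part of rip.sh): print every named command that
# cannot be found on the PATH, one per line. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# The commands that rip.sh invokes directly.
missing_tools mplayer mencoder oggenc mkvmerge bc
```

An empty result means every command was found; anything printed still needs to be installed or built.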

So, if you've kept reading instead of going to complain to someone in authority then I guess that you are bearing the responsibility and want to know how to actually use this.
Save it as something like rip.sh and use it like
./rip.sh filename track
where filename is the base filename you will end up with and track is the DVD track number to extract - if you leave the track blank then it will rip whatever would have played with mplayer dvd://

#!/bin/bash

targetfilesize=$(( 700 * 1024 * 1024 ))
audiobitrate=256

file=$1
dvddump="dvd://$2"
rawaudio="$file-rawaudio.wav"
compressedaudio="$file-compressedaudio.ogg"
pass1out="$file-pass1.avi"
pass2out="$file-pass2.avi"
finalcut="$file.mkv"

#extract audio
mplayer "$dvddump" -vc null -vo null -ao pcm:file="$rawaudio":fast </dev/null

#compress audio
oggenc "$rawaudio" -b $audiobitrate -o "$compressedaudio"
rm "$rawaudio"

#Sometimes the length of the video is misreported, so use the length of the audio track instead since it was just encoded and therefore more likely to be accurate:
#NOTE: There is a rare situation where the audio track is really not the same length as the video track - if that is the case you will need to alter this section appropriately
videolength=$(echo "$(mplayer -identify "$dvddump" -vo null -ao null -frames 0 2>/dev/null | awk -F= '/ID_LENGTH/ {print $2}') / 1 + 1" | bc)
audiolength=$(echo "$(mplayer -identify "$compressedaudio" -vo null -ao null -frames 0 2>/dev/null | awk -F= '/ID_LENGTH/ {print $2}') / 1 + 1" | bc)
echo videolength: $videolength
echo audiolength: $audiolength
length=$audiolength
echo length: $length

#calculate video bitrate
videotargetsize=$(( targetfilesize - $(du -b "$compressedaudio" | awk '{print $1}') ))
videobitrate=$(echo "$videotargetsize * 8 / $length / 1000" | bc)
echo video bitrate: $videobitrate

#video pass 1
rm -f divx2pass.log
mencoder "$dvddump" -vf kerndeint,scale -ovc x264 -oac lavc -lavcopts abitrate=64 -x264encopts bitrate=$videobitrate:threads=auto:pass=1:turbo=1 -o "$pass1out"

#video pass 2
mencoder "$dvddump" -vf kerndeint,scale -ovc x264 -oac lavc -lavcopts abitrate=64 -x264encopts bitrate=$videobitrate:threads=auto:pass=2 -o "$pass2out"

#compile
mkvmerge -o "$finalcut" -A "$pass2out" "$compressedaudio"

If anyone does want to submit patches for this, the main features I've been intending to implement are more flexible command line usage, a better way to extract the audio (that doesn't have the same risk of pressing left/right yet still produces perfectly synced audio), embedding all the subtitle tracks into the mkv file and converting the DVD chapters into a format that can be embedded into the mkv.
Update: I can't believe that I didn't think of this earlier - simply redirecting stdin from /dev/null solves the keyboard input issue when dumping the audio with mplayer.
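The bitrate arithmetic in the script is easy to sanity check on its own before committing to a long two pass encode. This is a minimal sketch of the same calculation as a standalone function; video_bitrate_kbps is a made up name and the example figures are invented rather than taken from a real rip.

```shell
# Hypothetical helper mirroring the "calculate video bitrate" step of rip.sh.
# Arguments: target file size in bytes, audio size in bytes, length in seconds.
# Prints the video bitrate in kbit/s, truncated to a whole number as bc does
# with its default scale.
video_bitrate_kbps() {
    target_bytes=$1
    audio_bytes=$2
    length_seconds=$3
    if [ "$length_seconds" -le 0 ] || [ "$audio_bytes" -ge "$target_bytes" ]; then
        echo "nothing left over for the video track" >&2
        return 1
    fi
    echo $(( (target_bytes - audio_bytes) * 8 / length_seconds / 1000 ))
}

# Invented example: 700 MiB target, 115200000 bytes of audio (roughly an hour
# at 256 kbit/s) and a running time of exactly one hour.
video_bitrate_kbps $(( 700 * 1024 * 1024 )) 115200000 3600
```

With those figures it prints 1375. If the number comes out very low for a long DVD, that is a hint to raise the target size or lower the audio bitrate.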

As for me, well, I guess I'll just eBay it after all hoping it's not a bootleg and go to sleep.

Update: I'm just going to go over an issue I mentioned in this post - how to deal with media that needs its aspect ratio corrected. The symptoms of this are generally that while you are watching a video everything just feels slightly distorted - in many cases this will be your imagination playing tricks on you, but if you are fairly certain that it isn't, read on. I'm going to use the music video for "Stick Together" which was on the bonus DVD from the album "Rock Music" by "The Superjesus" as an example. Every time I watched this it looked distorted, so today I paused it at this frame and used the GIMP to take a screenshot:

Now, the reason I took the screenshot here is that there is a fairly large (easier to measure) drum (circular object) reasonably close to the centre of the screen (not too heavily distorted by the camera's lens) and facing the camera almost perfectly straight on (avoids perspective distortion). Using the measure tool in the GIMP I find that the drum is approximately 170 pixels wide but about 184 pixels high - clearly the aspect ratio is way out and in this case it wasn't just my imagination (phew).
You will also notice the large black bars above and below the image - these need to be cropped. And here's another reason I chose this video - take a look at this screenshot:

Notice where the black bars are and how large they are this time? This is exactly why it's important to know your source media. If I simply run mplayer -vf cropdetect over this it's going to change its mind 3 times during playback - within the first second as that fades in it changes from crop=560:496:80:38 to crop=576:496:72:38. Then when the widescreen video starts it decides on crop=688:496:18:38. None of these are correct - the first two would cut off the left and right of the video and the last one will still leave small black bars at the top and bottom. This is one of the reasons why I mentioned that automating cropping is just too error prone. So, what's the solution? Tell mplayer to start playback after the intro artwork is gone of course! If, hypothetically, you wanted to modify my above script to attempt to detect how to crop it, I would suggest adding a line something like this (and using the crop variable in the appropriate place in the video filter chain - I cover this later):

crop=`mplayer "$dvddump" -vf cropdetect -vo null -ao null -fps 1000 -ss 60 -endpos 5|grep CROP|tail -n 1|sed 's/^.*(-vf //'|sed 's/).*$//'`

This starts the playback one minute in, quickly runs the video for 5 seconds and gives me a crop parameter of crop=688:432:18:72 - checking this with mplayer -vf crop=688:432:18:72 video.vob looks about right so it's time to move back to the problem of the aspect ratio (you could also crop the video after changing the aspect ratio - just remember to keep your video filters, including cropdetect, in the same order that you are working with).
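If you'd rather test the text processing separately from the DVD, the same extraction can be run over saved cropdetect output. A rough sketch, assuming the report lines look the way the grep and sed above expect them to (a [CROP] prefix with the suggestion wrapped in "(-vf ...)"); last_crop is a name made up for this example and the sample lines are illustrative, not captured output.

```shell
# Hypothetical helper: read cropdetect report lines on stdin and print the
# last suggested crop filter, e.g. "crop=688:432:18:72".
last_crop() {
    sed -n 's/^.*(-vf \(crop=[0-9:]*\)).*$/\1/p' | tail -n 1
}

# Illustrative input only; real lines come from mplayer -vf cropdetect.
printf '%s\n' \
    '[CROP] Crop area: X: 72..647  Y: 38..533  (-vf crop=576:496:72:38).' \
    '[CROP] Crop area: X: 18..705  Y: 72..503  (-vf crop=688:432:18:72).' |
    last_crop
```

Taking only the last line is the same design choice as the one liner above: it trusts whatever cropdetect settled on at the end of the sampled stretch, so the -ss offset still has to land on real footage.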
So, let's see - I have a width of 688 pixels with the drum 170 pixels wide, and a height of 432 pixels with the drum 184 pixels high. Personally, I want to keep the width as is and scale the height to adjust the aspect. So, the current aspect ratio is about 1.6 (688/432) and I probably want about 1.7 (688/399, where 399 is the corrected height of 432/184*170) - plugging this value into mplayer still doesn't look quite right, but I know that this is close to the standard 16:9 (about 1.78) aspect and a little more eyeballing tells me that's probably a bit closer. What I'm trying to get at here is that despite your best measuring efforts, it's quite difficult to get this exact and you eventually will need to just eyeball it and see if it looks good enough.
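The arithmetic is simple enough to script if you have several videos to measure. A minimal sketch using awk for the floating point maths; corrected_aspect is a hypothetical name, and it assumes the measured object really is round and square on to the camera.

```shell
# Hypothetical helper: estimate the true aspect ratio from the frame size and
# the measured width and height of an object known to be round.
# Arguments: frame width, frame height, object width, object height (pixels).
corrected_aspect() {
    awk -v w="$1" -v h="$2" -v ow="$3" -v oh="$4" 'BEGIN {
        corrected_height = h * ow / oh   # height at which the object is round
        printf "%.2f\n", w / corrected_height
    }'
}

# The cropped frame size and drum measurements from this post.
corrected_aspect 688 432 170 184
```

For the drum this prints 1.72, which sits between the uncorrected 1.59 and 16:9 at roughly 1.78, so treat the result as a starting point for eyeballing rather than an answer.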
So, all together now the filter chain will look something like this:

mplayer -vf kerndeint,crop=688:432:18:72,dsize=16:9,scale=-1:-2

Breaking that down:
1. Deinterlace the video before any other processing (the absolute last thing you would ever want to do is scale first and then try to deinterlace, unless of course you like to make your eyes bleed).
2. Crop the black bars away (again, if you altered the aspect ratio before cropping the video this would be at the end of the chain).
3. dsize is used to change the intended aspect ratio used by all the following video filters (but doesn't change the aspect ratio itself).
4. Actually change the aspect ratio: a width of -1 tells it to use the original width (688 pixels), and a height of -2 tells it to scale the height using the other dimension and the intended aspect ratio.
Ok, I have lied a little - my source media is actually not interlaced in this case, so I did not use the kerndeint filter, but I wanted to drive home the point about the importance of getting the video filter order correct - I've seen it done wrong. My eyes started to bleed.
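Since the order matters so much, one way to avoid getting it wrong is to let a small function assemble the chain. This is just a sketch of the ordering described in the breakdown; build_vf and its arguments are invented for the example and it only knows about the four filters used in this post.

```shell
# Hypothetical helper: assemble an mplayer/mencoder -vf chain in a safe order.
# Arguments: "yes" to deinterlace (anything else skips it), a crop filter such
# as crop=688:432:18:72 (or "" for none), and a display aspect such as 16:9
# (or "" to leave the aspect alone).
build_vf() {
    chain=""
    if [ "$1" = "yes" ]; then
        chain="kerndeint"                          # always deinterlace first
    fi
    if [ -n "$2" ]; then
        chain="${chain:+$chain,}$2"                # then crop the black bars
    fi
    if [ -n "$3" ]; then
        chain="${chain:+$chain,}dsize=$3,scale=-1:-2"   # aspect fix last
    fi
    echo "$chain"
}

build_vf yes crop=688:432:18:72 16:9   # interlaced source
build_vf no  crop=688:432:18:72 16:9   # progressive source, as in my case
```

The first call prints the full chain shown earlier and the second prints the same thing without kerndeint.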

Friday, 16 January 2009

New Years Resolution: Massive Music Tag Cleanup

Once again I find that months have passed since my last entry. The blog will be a year old in a little over a week and I will once again be attending linux.conf.au, this time down in Hobart. I've got myself some new gadgets - in particular an Eee PC 701SD which only cost $327 AU from JB Hifi so I have a decent computer for the conference. I'll be posting a lot more about it in the coming weeks, but am just mentioning it now as it is linked to today's post. Allow me to explain - while I've kept the default Xandros install on the internal 8 gig solid state drive I've installed Debian on a 2 gig SD card. 2 gig. Yep. Small, isn't it? And encrypted, but that's for another post. The point is that I've been looking for lightweight alternatives to all the software that I traditionally use in my day to day tasks, so while I'll happily leave Amarok alone on Xandros, I didn't really want to pull in all the KDE dependencies to have it on Debian, and I've come across a nice little ncurses music player called cmus to use instead.

Now, on my desktop and main laptop I use Amarok pretty much exclusively and have tried to keep all the tags in my music collection accurate - I try to check the track listing, the genre, the year and that the capitalisation complies with English capitalisation rules (except when it is apparent that the odd capitalisation is a conscious decision on the part of the artist and forms part of the art). I'm well aware that I've missed some - some of the artists that have been in my collection for longer still have bad capitalisation and I've only started to check the accuracy of the album years recently.

But there is a larger problem - Amarok doesn't reveal every tag to me. While that doesn't matter in the least as long as I'm only using Amarok, it can matter when I use other media players. I'm not worried about any of those albums I own a physical copy of - they're all in ogg (but if you do need a powerful ogg tag editor, tagtool's advanced mode _looks_ promising), but rather the music I've downloaded and have left in mp3. I've been aware of the issue for a while because I occasionally observe some of the symptoms on the various media players available on the Internet Tablet. I have looked at dedicated tag editors, but until now I haven't been able to find one that would show me *every* tag - not just the ones it's programmed to recognise, not just the id3v2.3 tags, but all of them. And not just the first 30 characters of them either.

Why is this *so* important, they're just extra tags, right? Well, my biggest annoyance is that cmus uses the contents of the TPE2 tag if it is present for the Artist in its library view rather than the TPE1 tag which Amarok uses. TPE1 is defined as "Lead performer(s)/Soloist(s)", while TPE2 is defined as "Band/orchestra/accompaniment". Now, the TPE2 tag may well be perfectly valid and correct, but it is not a tag that I have been organising or validating so far with Amarok, so I'd like to get everything consistent and delete the TPE2 tags. While I'm at it, why not remove all the cover art from the mp3s - I've always felt it wasteful to keep 12 copies of the same image when I could and do just put a single image in the same folder. In fact, why not go and remove all the tags that aren't recognised by Amarok - do I really care that it was encoded with lame? I might be happy to leave the 'free download from http://www.last.fm' comment tags alone and I certainly don't want to destroy any comments that I've added, but do I really want any of the other comment tags in there?

So I finally found an id3 tag editing tool that can show me most of the tags - eyeD3. It's still not perfect - there isn't any support for id3v2.2, it doesn't show me the tags that replaygain uses and it did crash while parsing some of the mp3s - I dare say I'll have to come back to those later with another tool, even if it is hexedit. Edit: As the author pointed out, eyeD3 is in fact able to read id3v2.2 tags, just not write them, and those crashes will doubtless be solved in no time.

The first step was to find out what tags are actually present in my collection:

find music -iname "*.mp3" -exec eyeD3 -v {} \; | tee index
sort -u index | awk -F\): '/^<.*$/ {print $1}' | uniq | awk -F\)\> '{print $1}' | awk -F\( '{print $(NF)}' > tags

So, that gives me a list of all the different types of tags in my collection - 44 unique tags in my case. Next step is to work out which ones are used by Amarok and if I want to keep any of the others. While I could go through and speculate on which of the three tags I can immediately see that might be a year, it's probably a better idea to look at the source code.

apt-get source amarok libtag1c2a
view amarok-1.4.9.1/amarok/src/metabundle.cpp
view taglib-1.4/taglib/mpeg/id3v2/id3v2tag.cpp

Some tags are immediately obvious because the source names their identifiers directly: TPOS (Disc number), TBPM (beats per minute), TCOM (Composer - admittedly this is one tag that I have not been validating), TPE2 (which is marked as a non-standard MS/Apple extension - so it is aware of it but since it's messing up my collection and Amarok doesn't seem to display it anywhere I'm getting rid of it anyway) and TCMP (Compilation album, i.e. show under various artists. Unfortunately cmus doesn't appear to use this tag, though it does seem to have some logic for compilation albums - this is a matter I will need to investigate further later on).
Digging deeper to look past the nice friendly names that the programmers can recognise to the harsh id3 reality I also identify that I'll need to keep title (TIT2), artist (TPE1), album (TALB), comment (COMM), genre (TCON), year (TDRC) and track (TRCK) - as well as anything that is used when playing the file that isn't identified here.

Though Amarok can use images embedded in the mp3s, I don't want any - I much prefer to use Amarok's cover manager combined with copycover-offline.py to copy them into the appropriate directory (look through the comments for useful patches - hmmm, should probably submit my fix for albums with Various Artists come to think of it).

So, I made a list of these tags, one per line in a file called amaroktags. Then found all the tags in my collection that aren't supported by Amarok:

cat amaroktags tags | sort | uniq -u
view taglib-1.4/taglib/mpeg/id3v2/id3v2.4.0-frames.txt
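One thing to watch with the uniq -u approach is that a tag listed in amaroktags but absent from the collection also appears exactly once, so it gets printed too. A minimal sketch of a one directional comparison follows; tags_not_kept is a made up name and the sample files are throwaway examples, not my real lists.

```shell
# Hypothetical helper: print the tags listed in the first file that do not
# appear in the second file. Both files hold one frame ID per line.
tags_not_kept() {
    present_file=$1
    keep_file=$2
    grep -vxF -f "$keep_file" "$present_file" | sort -u
}

# Throwaway example: TALB is on the keep list but not in the collection,
# so it must not show up in the result.
workdir=$(mktemp -d)
printf '%s\n' TIT2 TPE1 TPE2 APIC > "$workdir/tags"
printf '%s\n' TIT2 TPE1 TALB > "$workdir/amaroktags"
tags_not_kept "$workdir/tags" "$workdir/amaroktags"
rm -r "$workdir"
```

For the example files this prints APIC and TPE2, one per line.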


Which left me with a list of tags that I wanted to keep:
COMM, TALB, TBPM, TCMP, TCOM, TCON, TDRC, TIT2, TPE1, TPOS, TRCK, MCDI (Music CD Identifier), TFLT (File type), TLEN (length, used for seeking), TSRC (International Standard Recording Code - the only album using it in my collection is Nine Inch Nails' Ghosts I-IV)

And an even larger list of tags to zap:
TPE2, APIC (Attached picture), TDTG (Tagging time), GEOB (arbitrary file), PCNT (Play count), POPM (Popularimeter), PRIV (private textual & binary data), TCOP (copyright), TDEN (encoding timestamp), TENC (Encoded by), TIT1 (content group description), TIT3 (Description refinement), TLAN (language), TMED (Media type), TOAL (Original title), TOFN (original filename), TPUB (publisher), TSSE (encoding settings), TXXX (User defined text), UFID (unique file identifier), USLT (lyrics), WCOM (commercial info), WOAR (artist web page), WXXX (other URL)

As well as these ones that I couldn't identify, so I'll zap 'em and hope nothing breaks:
NCON, TAGC (appears to be a timestamp)

And a couple to manually check later:
TOPE (Original artist - I notice that Kong in Concert uses these for the original track names, though not accurately - they should probably be in TOAL), TYER and TDRL (years with subtly different meanings - taglib does seem to fallback and use these, but I will need to check for conflicts)

So, now I have a pretty definitive list of tags it's time to zap 'em (after backing up in case something blows up in my face of course). Although not immediately obvious, it appears that using --set-text-frame with the 4 letter name of the frame and no contents will remove it, even if it isn't a text frame. Now, this doesn't appear to actually conserve any space in the file - it shuffles the rest of the tags upwards and zeroes out the gap (presumably conserving the space would be possible, but I don't know an easy way off the top of my head - suggestions welcome). There may be some tags that you want to have more intelligent processing on - maybe only remove some of the images or maybe only remove some of the GEOBs - and if that is the case read the eyeD3 documentation, but for me I'm sick of them all and want them gone:


find music -iname "*.mp3" -exec eyeD3 --set-text-frame=TAGC: --set-text-frame=TPE2: --set-text-frame=TDTG: --set-text-frame=TCOP: --set-text-frame=TDEN: --set-text-frame=TENC: --set-text-frame=TIT1: --set-text-frame=TIT3: --set-text-frame=TLAN: --set-text-frame=TMED: --set-text-frame=TOAL: --set-text-frame=TOFN: --set-text-frame=TPUB: --set-text-frame=TSSE: --set-text-frame=TXXX: --set-text-frame=UFID: --set-text-frame=USLT: --set-text-frame=WCOM: --set-text-frame=WOAR: --set-text-frame=WXXX: --set-text-frame=NCON: --set-text-frame=APIC: --set-text-frame=GEOB: --set-text-frame=PCNT: --set-text-frame=POPM: --set-text-frame=PRIV: --set-text-frame=TCMP: {} \; | tee log
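Typing out that many options by hand is error prone, so the option list can be generated from the frame names instead. A small sketch, using only the --set-text-frame=NAME: form from the command above; strip_args is an invented helper name and nothing here touches any files.

```shell
# Hypothetical helper: turn a list of frame IDs into the matching
# "--set-text-frame=ID:" options, printed on one line separated by spaces.
strip_args() {
    args=""
    for frame in "$@"; do
        args="${args:+$args }--set-text-frame=${frame}:"
    done
    printf '%s\n' "$args"
}

# Print the options for a few of the frames on the zap list.
strip_args TPE2 APIC GEOB
```

The output is meant to be used unquoted, for example eyeD3 $(strip_args TPE2 APIC GEOB) file.mp3, so that the shell splits it back into separate options. That is safe here only because frame IDs never contain spaces.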


Depending on how large your collection is, at this stage you may choose to blink, stretch your arms, get some coffee, go to bed or take a vacation. Personally, I wrote a blog post.

I still have some things I know I'll have to fix up - the Deus Ex Soundtracks all seem to have multiple redundant comments, and there are some non-English comment fields, but you should by this stage have a decent understanding of how to do this - that is of course, if this whole article didn't just go over your head (congrats if it did and you still read this far though :)

Update: It turns out that the TCMP frame is not actually set by Amarok, so my solution is to remove all the TCMP flags from the library, then to manually add them for all relevant tracks, which hopefully will ease future migration. (I've added TCMP to the list above. Where it is set to 1 in my collection it is correct, but very few of the other tracks in the same albums are tagged the same way, which would explain some odd behaviour when importing those albums.) Unfortunately, as best I can tell, cmus doesn't appear to have any concept of compilation albums in its id3.c. OGG files will supposedly get them since their tags don't require almost one thousand lines of C code to process (by contrast, cmus' vorbis.c file has a mere 285 lines including 33 lines of tag parsing), which raises the question of why only 1 of my OGG compilation albums is marked as such in cmus.

find music/V/Various\ Artists/ -iname "*.mp3" -exec eyeD3 --set-text-frame=TCMP:1 {} \;


Update: I've written a simple shell script to do this automatically. Save it as striptags.sh and execute it from your music directory:

#!/bin/sh

oktags="COMM TALB TBPM TCMP TCOM TCON TDRC TIT2 TPE1 TPOS TRCK MCDI TFLT TLEN TDTG"

indexfile=`mktemp`

#Determine tags present:
find . -iname "*.mp3" -exec eyeD3 -v {} \; > $indexfile
tagspresent=`sort -u $indexfile | awk -F\): '/^<.*$/ {print $1}' | uniq | awk -F\)\> '{print $1}' | awk -F\( '{print $(NF)}' | awk 'BEGIN {ORS=" "} {print $0}'`

rm $indexfile

#Determine tags to strip:
tostrip=`echo -n $tagspresent $oktags $oktags | awk 'BEGIN {RS=" "; ORS="\n"} {print $0}' | sort | uniq -u | awk 'BEGIN {ORS=" "} {print $0}'`

#Confirm action:
echo
echo The following tags have been found in the mp3s:
echo $tagspresent
echo These tags are to be stripped:
echo $tostrip
echo The tags will also be converted to ID3 v2.4 where appropriate
echo
echo -n Press enter to confirm, or Ctrl+C to cancel...
read dummy

#Strip 'em
stripstring=`echo $tostrip | awk 'BEGIN {FS="\n"; RS=" "} {print "--set-text-frame=" $1 ": "}'`
find . -iname "*.mp3" -exec eyeD3 --to-v2.4 $stripstring {} \; | tee -a striptags.log