I thought I'd document this, because it is actually quite the PITA to do.
The idea is to get the audio from a video off YouTube, whether it's a song you want to download, audio you want for some reworking, or whatever.
The key is getting the audio out of the video without losing quality to re-encoding. The following steps make sure the audio reaches your computer in the best quality available on YouTube, and in a proper container. Keep in mind that audio quality on YouTube isn't the best you can get, but it's usually sufficient for listening on a sub-professional system. The difference will be audible on high-quality headphones, but PC speakers, or even a PA at a party, will be quite fine with it.
Downloading the video is the easiest step. I use http://keepvid.com, but there are other similar services that let you download a video from YouTube. Download the video in the highest quality available. The best indicator is the number of lines, for instance 720p. If the best that is available is 480p, then download that one. Don't worry too much about MP4 versus FLV, we'll deal with the container later. That said, I suggest downloading FLV: the MP4 container used to be broken and had weird clicks in the audio.
To extract the audio, I use FFmpeg.
I like to check what kind of audio stream there is in the video before doing anything with it. YouTube normally uses AAC for audio encoding:
ffmpeg -i input.flv gives us this as output:
[...]
  Duration: 00:14:58.93, start: 0.000000, bitrate: 765 kb/s
    Stream #0.0: Video: h264, yuv420p, 854x480, 633 kb/s, 29.92 tbr, 1k tbn, 59.94 tbc
    Stream #0.1: Audio: aac, 44100 Hz, stereo, s16, 131 kb/s
[...]
As expected, it is an AAC stream; we don't care about the video. What we can see here is that the Flash video is composed of an H.264 video stream and an AAC audio stream, effectively making it an MP4-type file.
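If you'd rather check the codec in a script than by eye, the stream listing can be filtered down to the codec name. This is only a minimal sketch: it assumes the listing contains an "Audio: <codec>" entry like the one shown above, and audio_codec_from_info is a name I made up for illustration, not something FFmpeg provides.

```shell
# Read ffmpeg's stream listing on stdin and print the codec name
# of the audio stream (hypothetical helper, not part of FFmpeg).
audio_codec_from_info() {
    sed -n 's/.*Audio: \([A-Za-z0-9_]*\).*/\1/p' | head -n 1
}

# Usage (ffmpeg prints the listing on stderr, hence 2>&1):
# ffmpeg -i input.flv 2>&1 | audio_codec_from_info
```

For the listing above this should print aac. If it prints something else, the file extension used in the extraction step has to change accordingly.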
Now, to simply get the audio from the video container, just do:
ffmpeg -i input.flv -acodec copy audio.aac
You'll notice this takes only a second or two, depending on the size of the video. FFmpeg doesn't re-encode anything here; it just copies the data into a separate file.
Note: FFmpeg uses file extensions, such as .aac, to determine the format, so stick to those you see here!
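If you have more than one video, the same copy can be wrapped in a small function that derives the output name from the input name. Treat this as a sketch: aac_name_for and extract_aac are names I invented here, and it assumes the audio stream really is AAC, so check that first.

```shell
# Derive the raw AAC file name from a video file name: clip.flv -> clip.aac
aac_name_for() {
    printf '%s.aac\n' "${1%.*}"
}

# Copy the audio stream out of "$1" without re-encoding (assumes it is AAC).
extract_aac() {
    ffmpeg -i "$1" -acodec copy "$(aac_name_for "$1")"
}

# Usage, for every FLV in the current directory:
# for f in *.flv; do extract_aac "$f"; done
```

Keeping the name logic separate makes it easy to see which file will be written before FFmpeg runs.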
Now let's check with file what this gave us:
file audio.aac gives us:
audio.aac: MPEG ADTS, AAC, v4 LC, 44.1 kHz, stereo
The audio stream is a pure AAC stream without any container. While this is actually playable, some players don't seem to like it. Putting it into an MP4 container is a better idea, especially if you plan to tag it, etc.
From this point on, we no longer need the .flv you downloaded from keepvid or someplace else, so simply delete it. It saves some space on the hard drive, too.
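The same check can be done in a script by matching the description that file prints. A rough sketch, assuming file words its output like the line above (the exact wording may differ between versions of file); is_raw_adts is a hypothetical helper name.

```shell
# Succeed if the given `file` description names a bare ADTS stream,
# i.e. AAC without a container around it.
is_raw_adts() {
    case "$1" in
        *"MPEG ADTS"*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Usage:
# if is_raw_adts "$(file audio.aac)"; then echo "bare AAC, wrap it in MP4"; fi
```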
Now, to put it into an MPEG v4 system version 2 container, you need a tool that actually uses version 2. FFmpeg just puts everything into version 1 when constructing an MP4 container, which is unsuitable for some players and for things like Flash broadcast. Some decoders have trouble with it too, most notably neroAacDec.
I use MP4Box to put MP4 files together. I'm not gonna go through the instructions on how to compile it; the documentation on its site is sufficient, in my experience.
Anyway, once MP4Box is installed, put the AAC stream into an MP4 container:
MP4Box -new -add audio.aac -isma audio.m4a
The -isma switch forces the construction of an MP4 system version 2.
We check again with file, and this is what we get:
audio.m4a: ISO Media, MPEG v4 system, version 2
And you're done! Jumping through hoops has never been easier!
The file extension doesn't matter as long as it is a somewhat accepted one for MP4 files. Whether it's .m4a or .mp4, pick whatever you like better. Just don't name it .aac, since that extension is reserved for plain AAC streams with no container around them.
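Putting the two commands together, the whole thing can be sketched as one function. This is an assumption-laden sketch, not a polished tool: run, flv_to_m4a and the DRY_RUN variable are names I made up, the audio is assumed to be AAC, and the ffmpeg and MP4Box invocations are just the ones used above with the file names filled in.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
# DRY_RUN is a hypothetical variable used only by this sketch.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Extract the audio from "$1" and wrap it into <basename>.m4a.
flv_to_m4a() {
    base="${1%.*}"
    run ffmpeg -i "$1" -acodec copy "$base.aac" &&
    run MP4Box -new -add "$base.aac" -isma "$base.m4a"
}

# Usage:
# DRY_RUN=1; flv_to_m4a input.flv   # only prints the two commands
# DRY_RUN=0; flv_to_m4a input.flv   # actually runs them
```

The dry run is there so you can see which files will be written before anything touches the disk. The intermediate .aac is left in place, so delete it yourself once the .m4a checks out.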