Rule of thumb: The worse your audio options in-camera, the more you need additional audio gear.
What is the minimum acceptable audio for professional applications?
This is a tough question to answer, and most people have their own ideas. I’ll share mine, and maybe you’ll find it useful. Here are the minimum requirements for a few delivery standards:
- Quality of audio CDs – 16-bit, 44.1 kHz, Uncompressed, 2 Channel
- Quality of DVDs – 16-bit, 48 kHz, Compressed, 2 Channel
- Quality of Blu-rays – 16-bit, 48 kHz, Compressed, 2 Channel
- Quality of Broadcast HDTV – 16-bit, 48 kHz, Compressed, 2 Channel
- Quality of DCI – 24-bit, 48 kHz, Uncompressed, 2 Channel
I’d say 16-bit, uncompressed or lossless PCM, sampled at 48 kHz is the minimum acceptable standard for recording audio. How you mix will depend on how many channels you are delivering. The standard today has become 5.1 (6 channels), but don’t forget that this is specific to the final mix, not the recording format.
Uncompressed two-channel audio at the above specification will have a bit rate of 1536 kbps (approximately 1.5 Mbps).
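If you want to check the arithmetic yourself, here is a minimal sketch in Python. It assumes plain uncompressed PCM and counts 1 kbps as 1000 bits per second; the function name is mine, purely for illustration.

```python
def pcm_bit_rate_kbps(bit_depth: int, sample_rate_hz: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kbps (1 kbps = 1000 bits/s)."""
    return bit_depth * sample_rate_hz * channels / 1000


# 16-bit, 48 kHz, two channels: the minimum recording standard suggested above
print(pcm_bit_rate_kbps(16, 48_000, 2))  # 1536.0

# The same specification for a 5.1 mix (6 channels)
print(pcm_bit_rate_kbps(16, 48_000, 6))  # 4608.0
```

Two channels come to 1536 kbps, which is roughly 1.5 Mbps. Every extra channel adds another 768 kbps.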
Want to know something cool?
Footage shot on high-end cameras like the Red Epic will have a bit rate of 150 MB/s (1200 Mbps or 1.2 million kbps). A 2 hour movie in this ‘native format’ will have an image-only size of approx 1 TB.
The audio, according to our standard, will have a total size of approximately 4 GB for 5.1 channels.
The ratio of video to audio in this ‘ideal’ scenario is 256:1.
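Here is a rough sketch of where those sizes come from, assuming decimal units (1 MB = 1,000,000 bytes), a running time of exactly two hours, and 16-bit, 48 kHz PCM on all six channels. The variable names are only for illustration.

```python
SECONDS = 2 * 60 * 60  # a two-hour movie

# Video: 150 MB/s, as quoted above
video_bytes = 150e6 * SECONDS

# Audio: 16-bit, 48 kHz, 6 channels (5.1), converted from bits to bytes
audio_bytes = 16 * 48_000 * 6 / 8 * SECONDS

print(round(video_bytes / 1e12, 2))       # 1.08 (TB)
print(round(audio_bytes / 1e9, 2))        # 4.15 (GB)
print(round(video_bytes / audio_bytes))   # 260
```

The exact ratio lands near 260:1. Rounding the sizes to 1 TB and 4 GB, and counting a terabyte as 1024 GB, gives the 256:1 figure.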
Now what happens when someone wants to transfer this to DVD? The maximum bit rate of an average DVD is about 9 Mbps (1.125 MB/s), which includes audio and video. If the audio has to fit in 1/256th of 1.125 MB/s, then it should have a data rate of 36 kbps. That’s small.
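The same thought experiment in a few lines of Python, as a sketch only. It assumes a round 9 Mbps (9000 kbps) budget and the 256:1 ratio from the ‘ideal’ scenario, and compares the result with uncompressed two-channel PCM at 16-bit, 48 kHz.

```python
DVD_TOTAL_KBPS = 9_000       # about 9 Mbps for audio and video together
VIDEO_TO_AUDIO_RATIO = 256   # ratio from the 'ideal' scenario

# Audio's slice if it only got 1/256th of the total
audio_share_kbps = DVD_TOTAL_KBPS / VIDEO_TO_AUDIO_RATIO
print(round(audio_share_kbps, 1))  # 35.2

# For comparison: uncompressed 16-bit, 48 kHz, two-channel PCM
pcm_stereo_kbps = 16 * 48_000 * 2 / 1000
print(round(pcm_stereo_kbps / audio_share_kbps))  # 44
```

In decimal units the slice is about 35 kbps (36 kbps if you count a megabyte as 1024 kilobytes). Uncompressed stereo PCM needs roughly 44 times that.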
And this is shooting in compressed Redcode. If I wanted uncompressed video I might only have 10 kbps or less for audio! Does encoding work this way? Of course not.
Clearly, our DVDs have better audio than they do video. In fact, all our delivery formats demand better specs for audio than for video!
Why? I don’t know, but I have a theory. Our ears can sense minute imperfections easily, while our eyes can accept a larger degree of ‘error’. The eyes accept and ‘forgive’ temporal and spatial aliasing more often. But if we hear a false or discordant note, the psychological effect is more ‘painful’.
I make a full argument of what I feel about audio in Driving Miss Digital.
In the next chapter, we’ll step higher.