Intra or Inter-Frame?
A football match between two countries is called an ‘international’ match because it is ‘between’ two countries. Similarly, the internet is a network between many computers.
On the other hand, an intranet is a network of computers ‘within’ a strictly specified area.
An interframe codec is one which compresses a frame after looking at data from many frames (between frames) near it, while an intraframe codec applies compression to each individual frame without looking at the others. Let me simplify:
The basic idea behind Compression
If I have a 256×256 image in 8-bit color (three channels), the file size is 256 x 256 x 3 x 8 = 1,572,864 bits, or 196,608 bytes – approx. 197KB.
What if this image is totally black, end to end? Each pixel is 00000000. A smart algorithm can
- Study the file and decide that each pixel value can be changed to just 0, and
- Add metadata with the code 00000000 to the compressed file – at the beginning or end
Since the decoding algorithm already expects this, it will know that
- 00000000 metadata represents black color that has to be used for all pixels
- Each 0 is to be replaced by 00000000
- Recreate the original file perfectly
If this file is compressed, the intermediate file is now (256x256x1)+8 = 65,544 bits – one bit per pixel plus the 8-bit metadata code – or about 8.2KB.
That’s a 24x reduction!
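The all-black example above can be sketched as a toy run-length encoder. This is purely illustrative – the function names and the (value, count) scheme are my own invention, not any real codec's format:

```python
def compress(pixels):
    """Collapse runs of identical values into [value, run_length] pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return runs

def decompress(runs):
    """Rebuild the original pixel stream – a perfect, lossless round trip."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

# A 256x256, 3-channel, all-black image: every byte is 0.
image = [0] * (256 * 256 * 3)
compressed = compress(image)
assert compressed == [[0, 196608]]       # the whole image collapses to one run
assert decompress(compressed) == image   # recreated perfectly
```

Real codecs use far smarter schemes than run-length encoding, but the shape is the same: find redundancy, store a shorter description of it, and reverse the process on decode.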
This example is, of course, a highly simplistic view of the compression process. Computer engineers and mathematicians have found much more complex and brilliant ways to compress data. Let’s not go there.
When you take a still picture and compress it, you are applying an intraframe codec. There is only one frame, and you are analyzing the data within this frame to compress it. Nothing more, nothing less.
The algorithm doesn’t look at the photo you took before this one, or the next one in the folder; it just looks at one frame and tries to compress it as best as it can. It stays within the frame – hence intra-frame.
If the individual frames are saved as separate files, the result is called an image sequence – like a folder full of JPEGs. Some popular image sequence formats are:
- TIFF – used for mastering
- DPX – used in film scans and digital mastering
- EXR – used in VFX
- JPEG 2000 – used in DCI digital cinema packages
An intraframe codec, on the other hand, bundles all the frames into one file. A few popular intraframe codecs:
- MJPEG – JPEGs bunched together
- ProRes – Apple’s favorite
- DNxHD – Avid’s baby
- ALL-I – found in the newer DSLRs
- CinemaDNG – Adobe’s baby for RAW image sequences
Video is nothing but a stream of images called frames. In film-land, each second holds 24 frames. Computer scientists realized that if video is always going to have more than one frame, then why not look at every frame before compressing one frame?
The immediate problem with this is, if the video is long, like a movie, looking at every frame is madness. The scenes and imagery change anyway. So why not look at a few frames before and after each frame and then compress them all together?
This group of frames is specified beforehand so there’s no confusion. It is known as a GOP (group of pictures), and its length – the number of frames it holds – is the GOP size.
An image is two-dimensional – it has rows and columns of pixels. An intraframe codec compresses two-dimensionally. An interframe codec compresses three-dimensionally, with time as the third dimension.
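Here’s a toy sketch of that third dimension: within a GOP, store the first frame whole and each later frame only as its difference from the one before it. The ‘I’/‘P’ labels and flat-list frames are simplifications of mine – real codecs like H.264 do vastly more (motion estimation, block transforms):

```python
def encode_gop(frames):
    """Store the first frame whole (an 'I' frame), then only the
    per-pixel differences from the previous frame ('P' frames)."""
    encoded = [("I", list(frames[0]))]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append(("P", [c - p for p, c in zip(prev, cur)]))
    return encoded

def decode_gop(encoded):
    """Rebuild every frame; each 'P' frame needs the frame before it."""
    frames = [list(encoded[0][1])]
    for kind, delta in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], delta)])
    return frames

# Three nearly identical frames: the deltas are mostly zeros,
# and long runs of zeros compress extremely well.
gop = [[10, 10, 10], [10, 11, 10], [10, 11, 12]]
assert decode_gop(encode_gop(gop)) == gop
```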
If all it did was look at neighboring frames for compression, it wouldn’t be much more efficient than intraframe codecs, so modern interframe codecs employ both interframe and intraframe voodoo!
That’s right, folks, an interframe codec looks at its neighbors, and looks at itself, and compresses both ways. This is why interframe codecs produce smaller files than intraframe codecs.
A few popular interframe codecs are:
- H.264 (MPEG-4 Part 10, also known as AVC)
- AVCHD (A variant of H.264 AVC) – developed by Sony and Panasonic
- XDCAM – Sony’s baby
- XAVC – Sony’s new baby
In addition to image sequences, intraframe codecs and interframe codecs, there’s also uncompressed video, uncompressed RAW, and compressed RAW. To know more about RAW, check out Deconstructing RAW.
Which one needs more processing power and why?
Let’s look at two common scenarios for codecs: Decoding and Editing.
Imagine a person standing in an assembly line getting compressed intra-frames one at a time. He or she unwraps each frame and passes it on. Then the second frame is pulled up and so on.
In an interframe codec, when a frame is pulled up, the person has to look at the frames before and after it, and then decide how to unwrap it – before actually unwrapping it.
If the person had to unwrap and display the frame within the same time period, which one do you think would take more time? If the person is unable to unwrap interframe frames as quickly as intraframe frames, we’ll need to find a ‘faster person’.
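The assembly line can be sketched in a few lines. In this toy decoder (my own simplification, not a real decoder API), ‘I’ frames unwrap on their own, while each ‘P’ frame must wait for the frame decoded just before it:

```python
def decode_stream(stream):
    """Unwrap a stream of (kind, payload) frames. 'I' frames stand
    alone; 'P' frames carry only the change from the previous frame."""
    decoded, last = [], None
    for kind, payload in stream:
        if kind == "I":
            last = payload            # decodes independently
        else:                         # 'P' frame
            if last is None:
                raise ValueError("P-frame with no reference frame")
            last = last + payload     # depends on its neighbor
        decoded.append(last)
    return decoded

# Intraframe stream: every frame is an I-frame.
assert decode_stream([("I", 1), ("I", 2), ("I", 3)]) == [1, 2, 3]
# Interframe stream: the same images, but the P-frames only store deltas.
assert decode_stream([("I", 1), ("P", 1), ("P", 1)]) == [1, 2, 3]
```

Notice the extra bookkeeping in the interframe branch: every P-frame forces a dependency on earlier work, which is exactly where the extra processing goes.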
When an intraframe codec sits on the editing timeline, and is sliced, the slice happens between two frames. Since they have no connection to each other, the application can proceed to the next step, which is unwrapping the frame for further use.
If an interframe codec sits on the timeline and is sliced, the application will have to calculate its effect on the frames before and after it. It will have to re-draw the frames and put everything in order before the unwrapping can happen.
Which do you think takes more calculations per second? That’s right, interframe!
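A rough way to count those calculations, assuming a simple IPPP… structure where the only keyframe is the first frame of each GOP (the numbers are illustrative, not from any real editor):

```python
def decodes_needed_at_cut(cut_frame, gop_size, intraframe):
    """Frames that must be decoded before the editor can display the
    frame at a cut point. Intraframe: just that one frame. Interframe:
    everything back to the keyframe at the start of its GOP."""
    if intraframe:
        return 1
    return cut_frame % gop_size + 1

# Cutting at frame 29 with a 15-frame GOP:
assert decodes_needed_at_cut(29, 15, intraframe=True) == 1    # one frame
assert decodes_needed_at_cut(29, 15, intraframe=False) == 15  # a whole GOP
```

Multiply that by every cut, trim, and scrub on a timeline and the extra work adds up quickly.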
Here’s another analogy:
What if you are dictating something, and your personal secretary (you wish!) – let’s call her Siri – is taking notes in shorthand? Each syllable has a one-to-one symbol associated with it. As soon as Siri hears you she jots down the corresponding symbol. This is intraframe.
What if you are dictating the same thing, but in a different language? The person this message is going to is very important, and you can’t make mistakes like mentioning interframe where you meant intraframe, and so on. Siri now has to hear your complete sentence first, and then start the process of rephrasing the sentence so your meaning is preserved. This is interframe. You’ll agree this is always going to be slower than intraframe, no matter how much you’re willing to pay Siri.
This is why interframe codecs need more processing power to deal with the same resolution/frame rate video.
Which one is better and why?
Both are crap.
If I’m shooting something worthwhile, I want it to be in the best format possible. For budget reasons, we don’t have access to the best cameras, so we compromise. So let’s get that straight – both are compromises. Compression is fundamentally a compromise, whichever way you slice it.
To get closer to the ‘truth’, you need to get the image as soon as it leaves the sensor. Today, you can do that with RAW files.
But most of us can’t afford RAW workflows. What’s left?
Here’s the rule of thumb: Get the first image you can get – nothing will ever beat that, ever.
This means, if your camera is giving you interframe codecs like H.264 or AVCHD, edit native without re-compressing it to an intraframe codec. The logic behind why people transcode to Prores or DNxHD is:
- Editing is easier on the CPU.
- It’s a better format, so color grading and visual effects are easier.
To answer the first hurdle: modern CPUs are capable of handling interframe codecs easily, so there’s no advantage in going intraframe for this reason alone.
To answer the second hurdle, I say BS. If you really cared about color or visual effects or green-screen work, you’ll do better with uncompressed image sequences. Call it what it really is – a smaller compromise, nothing more.
I have never seen any advantage in using intraframe codecs as an intermediary codec. And believe me, I tried five years ago, with both Prores and Cineform, for my movie, which contained plenty of green screen work and visual effects. Today, I see it all the time with modern DSLR and C300 footage.
Let me make myself clear: I’m not saying there is zero advantage to intermediary codecs – just not enough, in this day and age, to be worth bothering about.
If at all you are compelled by the forces of misinformation and marketing to scratch your itch for intermediary codecs, please take a few hours to test first:
- Shoot a few seconds’ worth of footage containing motion and every challenge you want to throw at it. Bring this native codec to a timeline and edit.
- Transcode the native footage to an intermediary intra-frame codec and apply the same edits.
- Transcode the native footage to an uncompressed TIFF sequence and apply the same edits.
- Render all three timelines out as an uncompressed image sequence and study them.
- Apply effects, filters, plug-ins, whatever, to all three timelines and export as an uncompressed image sequence and study them.
- Be at peace with whatever works for you.
Just because intraframe codecs are better theoretically doesn’t mean the manufacturer’s implementation of it is good!