The Difference between Lossy and Lossless Compression

Compression is the elimination of data to reduce file size. The word ‘elimination’ is used with full intent. It’s final – those pixels are gone forever.

True Lossless Compression

When a file that has been compressed can be decoded back into its original form with zero loss of information, the compression is said to be lossless.

Is this just a pipe dream, or are there any scenarios where a true lossless compression can be achieved? I can think of two:

  • Vector images are scaled based on pure mathematical principles. Simple shapes can be made quite small in file size; however, complex shapes can be larger than raster images!
  • An algorithm that stores duplicate pixel data once, in one place, can put it back just as easily. My 256×256 image example in Interframe vs Intraframe Compression is one good example. Just like vectors though, once the image gets complicated, the level of compression will become negligible.
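To make the second point concrete, here is a minimal sketch using run-length encoding. This is my choice of illustration and an assumption: it is not necessarily the scheme behind the 256×256 example, and the function names are hypothetical. It shows a flat row of pixels collapsing to a single entry, while a busy row gains nothing.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of identical values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


flat_row = [255] * 256                          # one solid white row, 256 pixels
busy_row = [(i % 2) * 255 for i in range(256)]  # alternating black and white pixels

flat_encoded = rle_encode(flat_row)
busy_encoded = rle_encode(busy_row)

# Both rows decode back to exactly what went in.
print(rle_decode(flat_encoded) == flat_row, rle_decode(busy_encoded) == busy_row)

# Number of stored pairs: the flat row needs 1, the busy row needs 256.
print(len(flat_encoded), len(busy_encoded))
```

The busy row ends up with as many pairs as it had pixels, and each pair carries two numbers, so in this sketch the "compressed" version is actually larger than the original.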

Most compression algorithms don’t understand the subject matter being presented. They apply generalized formulas and systems – very good ones, mind you – but still under pressure from the whip of having to reduce file size. That’s the whole point.

Lossy Compression

If, after compression, the original file cannot be brought back again (like Humpty Dumpty), then the compression is said to be Lossy.

Let me dispel a myth: As far as video is concerned, I have yet to see lossless compression. Name your codec – it is lossy. Compression, if your intention is to reduce file size, is always lossy.

What about Visually Lossless Compression?

Friend, let me ask you a question: When was the last time you compressed an image to get a visually lossy result?

The aim of every algorithm is to achieve a visually lossless look. These algorithms put power into your hands through software, so you can use them to the degree you see fit. If you go too far, your images will look like mush.

Look at the following image, a test TIFF file that has been compressed to JPEGs under five settings – Maximum (12), High (8), Medium (5), Low (3) and Zero (0).

The original TIFF file is 1920×1080 8-bit sRGB, with a file size of 5.95 MB. You can estimate the perfect file size according to my formula here. The numbers in the second row show you the size in KB (5.95 MB = 6096 KB). The numbers in blue show the size relative to the original size.
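As a rough check on that figure, the sketch below uses the standard pixel-count calculation: width × height × channels × bits per channel. It is an assumption that this matches the linked formula, which is not reproduced here, and the function name is hypothetical. Three channels (RGB) are assumed.

```python
def uncompressed_size_bytes(width, height, channels=3, bits_per_channel=8):
    """Raw pixel payload in bytes; ignores any header or metadata."""
    return width * height * channels * bits_per_channel // 8


size = uncompressed_size_bytes(1920, 1080)
print(size)                       # bytes
print(size / 1024)                # KB, with 1 KB = 1024 bytes
print(round(size / 1024**2, 2))   # MB, with 1 MB = 1024 * 1024 bytes
```

This gives 6,220,800 bytes, or 6075 KB (about 5.93 MB). The small gap to the 5.95 MB TIFF is presumably header and metadata overhead.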

Comparison of JPEG compression levels

What do you see? I see this:

  • After a point, compression makes the image worse, but does not reduce file size proportionally. The relationship is exponential rather than linear.
  • Large expanses of color get worse faster than fine detail. The first and last boxes show degradation at level 8, while the fine detail holds up till level 5. One of the fundamental tenets of compression is the elimination of duplicate data. Fine detail makes that difficult.
  • Once fine detail has been compressed to the limit, compression artifacts give birth to patterns of their own, giving the illusion of detail and sharpness.
  • The maximum quality setting for JPEG is visually lossless, while at the same time being only about 10% of the size of a full raster image!
  • If the quality of the original image is poor (the example is a perfect image created in Photoshop), even the best compression level will contain artifacts.
  • Level 12 is 575 KB – visually lossless JPEG at maximum quality. If you had a JPEG image sequence (or an intraframe codec using similar algorithms), at 30 fps this would result in 134 Mbps. At 25 fps, it would mean 112 Mbps. This explains why broadcast-quality delivery requirements for intraframe codecs are set at 100 Mbps and above.
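The bitrate figures in the last point can be reproduced with a few lines. This is a sketch with a hypothetical function name, and it assumes 1024-based units (1 KB = 1024 bytes, 1 Mb = 1024 × 1024 bits). That convention is my inference, because it is the one that matches the 134 and 112 Mbps figures.

```python
def sequence_bitrate_mbps(frame_kb, fps):
    """Bitrate of an image sequence where every frame is frame_kb kilobytes.

    Assumes 1 KB = 1024 bytes and 1 Mb = 1024 * 1024 bits.
    """
    bits_per_second = frame_kb * 1024 * 8 * fps
    return bits_per_second / (1024 * 1024)


print(int(sequence_bitrate_mbps(575, 30)))  # 30 fps
print(int(sequence_bitrate_mbps(575, 25)))  # 25 fps
```

With decimal megabits (1 Mb = 1,000,000 bits) the same frames work out to roughly 141 and 118 Mbps, so the conclusion about the 100 Mbps threshold holds either way.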

You can perform similar tests on any algorithm that claims to reduce file size. Some algorithms are designed to stop after a certain point – giving the illusion that they are ‘better’ somehow – and they increase file sizes! Really cool codecs like JPEG get bad publicity from misinformed souls who don’t know how to test or when to stop.

Have you ever wondered why the broadcast requirement for interframe codecs is 50 Mbps, while for intraframe codecs it is 100 Mbps? Aren’t interframe codecs worse? Shouldn’t they need more space if this were true?

The reality is different. In fact, interframe codecs like MPEG-2, H.264 and AVCHD are better than intraframe codecs – in the sense that they can provide similar visually lossless results at a smaller file size! Crazy world, huh?

How to never be taken for a ride

Whenever you are faced with a choice of codecs, conduct your own tests:

  • Create a perfect image: either build one in Photoshop or take a RAW photograph of a scene.
  • Convert to an uncompressed TIFF file (Why TIFF? It can be opened in most software). This is A.
  • If you want to test video, you will also need to test for motion or temporal artifacts. If you have uncompressed RAW video (as far as I know only Arriraw is true uncompressed RAW) use that. Otherwise, create a simple vector animation – a few seconds’ worth. You could combine it with a still image or another vector background. This is B.
  • Transcode both A and B to all the codecs you want to compare, from best to worst. Leave out the nitty-gritties – try one slider at a time. Do this systematically – take notes and name each baby correctly. If you plan ahead this step is easy.
  • Finally, study each result and share it with people. Then you’ll know what’s good for you. This exercise will take a few hours, but will set you up for many years.
  • Share it with the world.
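For the note-taking part, a small helper can tabulate how each transcode compares with the reference file. This is only a sketch: the function name is hypothetical, and it measures file size alone, so you still have to judge image quality with your own eyes.

```python
from pathlib import Path


def size_report(reference, candidates):
    """Return (name, size_in_bytes, percent_of_reference) rows, largest file first."""
    ref_size = Path(reference).stat().st_size
    rows = []
    for candidate in candidates:
        size = Path(candidate).stat().st_size
        percent = round(100 * size / ref_size, 1)
        rows.append((Path(candidate).name, size, percent))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Call it with the path to A (or B) and a list of the transcoded files, then print each row. Naming each file after its codec and setting keeps the table readable.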

Have you conducted tests of your own that you want to share? Tell me!

11 replies on “The Difference between Lossy and Lossless Compression”

  1. SDub I’m no expert, not even close. But as far as I know, Zip compresses by removing redundant data, like spaces in file systems, etc. There’s always redundant space for metadata in a file system, and it adds up.
    I could be wrong, but this is not a subject I want to study any further!

  2. Sareesh Sudhakaran SDub Very interesting read! How does the counting theorem apply to a compression algorithm like .zip? Obviously a .zip can be decompressed and still retain its data. This article, as far as I can tell, doesn’t claim that the compression is for a certain file type/type of compression.

  3. SDub Thanks for taking the trouble. You are using the word ‘lossless’ as you understand it. However, I have used it within the context of the definition I have formed – which is ‘true lossless compression’.
    Such an algorithm doesn’t exist, and cannot exist. If you really want a technical but simplified explanation, here’s a great starting place:
    The statistical analysis used in Huffyuv encoding is pretty smart, brilliant in fact. But it, like every other compression algorithm, cannot truly recreate humpty dumpty back again.
    Lagarith is an implementation of the huffyuv encoding scheme, which is pretty much used in JPEG 2000 as well.
    If you’re a working filmmaker then please don’t bother with this stuff. If practical compression is the goal, codecs like Prores HQ and better do a stellar job, at much better compression rates – and are supported by every NLE on the market. Proof of both is available elsewhere on this site.

  4. Sareesh Sudhakaran SDub You don’t literally say it, but here: “As far as video is concerned, I have yet to see lossless compression. Name your codec – it is lossy. Compression, if your intention is to reduce file size, is always lossy.” You say you have yet to see it, but Lagarith is a lossless compression technique that aims to reduce file size. Am I misinterpreting something?

  5. Sareesh Sudhakaran SDub Right but you conclusively claim that such a codec doesn’t even exist, but they do. Maybe it’s just not applicable anymore?

  6. Aren’t there multiple lossless codecs for video that are used to compress the master before delivering to DVD, Internet, or Bluray? These are used with the intention of being intermediary, but they still do exist. Codecs such as HuffYUV, utvideo, lagarith, etc.

  7. Hi Sareesh:
    Good article.
    CinemaDNG is Uncompressed Raw as well.
    I have worked with Uncompressed Raw/RGB video since 2006 recording with Streampix (starting with V3 now V5), recording with Machine Vision Cameras. Even before ArriRaw or CinemaDNG…
    I also have worked with Cineform Raw material, and let me tell you this: There is NO comparison!
    The Uncompressed Raw footage then de-Bayered in post to Uncompressed RGB (AVI or MOV), even at 8 bit, is FAR superior to Cineform Raw then de-Bayered to Cineform FilmScan2 444 (the highest-quality, least-compression setting with this codec), even at 12 bit.
    The Uncompressed RGB video can be pushed very hard in post for color correction, even with extreme gamma curves, and it preserves the image quality much better, whilst the Cineform video breaks up more easily when pushed.
    Even when not applying any color correction at all, the Cineform video has mosquito noise that is not present as much in the Uncompressed video (with video shot at 0dB gain), and we are talking about the same scene taken with the same camera here…
    So my conclusion is: There is no such thing as “lossless” compression at all!
    Cesar Rubio.
