Notes by Dr. Optoglass: Stereoscopy – Challenges in Reproducing Stereopsis in Cameras and Displays

Topics Covered:

  • Interaxial Distance
  • Hyper Stereo vs Macro Stereo
  • Keystone Effect and Keystoning
  • Stereoscopic Window

In order for stereoscopic cinematography to be a creative medium it’s got to be intuitive. If it’s not in the gut, it’s not going to succeed as an art form. If you have to rely solely on tables and calculations, composing stereoscopic images is not going to work – Lenny Lipton

Stereoscopy is the art or technique of trying to reproduce stereopsis. In the olden days, there was just one set of objects collectively called stereoscopes. Today, things have become a lot more complicated.

The challenges in reproducing stereopsis in cameras and displays can be divided into 3 parts:

  • Shooting stereoscopic images
  • Displaying stereoscopic images
  • Processing stereoscopic images

The scope of this essay is very limited. I’m only going to touch on a few major challenges that stereographers (the people making stereoscopic images) face. For a thorough study, please take a look at the links for further study.

Shooting Stereoscopic Images

No camera and lens combination can exactly match the eye’s characteristics. If we try to match one property, another property will have to be compromised.

One of the first problems encountered in shooting with cameras is the choice of interaxial distance. The interaxial distance is the distance between the optical paths of two camera systems. I use the term inter-ocular only in reference to the eye, and interaxial only in reference to cameras.

To match the eye (ortho-stereography), the cameras need an interaxial distance equal to the inter-ocular distance (60-65 mm). Even then, though, the result is not always similar to how the eye sees it. Every change in lens, f-number, frame rate, etc. will have profound implications on the effect of depth.

E.g., when filming faraway objects, common practice is to have the interaxial distance greater than the inter-ocular distance. This is called Hyper Stereo. If we overdo it, large objects like mountains will begin to look like miniature models. To shoot objects at close distances, the usual method is to reduce the interaxial distance to a value lower than the inter-ocular distance. This practice is called Macro Stereo. Since large lenses cannot be brought closer together beyond a certain point, stereoscopic beamsplitter rigs are used.
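To make the terms concrete, here is a minimal Python sketch, not a production tool. The function names are hypothetical, the 60-65 mm band is the inter-ocular range quoted above, and the disparity formula assumes two ideal parallel pinhole cameras, which is a simplification of any real rig.

```python
# Illustrative sketch only: names are hypothetical, and the pinhole model is an assumption.
INTEROCULAR_RANGE_MM = (60.0, 65.0)  # inter-ocular range quoted in the text


def classify_interaxial(interaxial_mm: float) -> str:
    """Label an interaxial setting relative to the inter-ocular range."""
    if interaxial_mm <= 0:
        raise ValueError("interaxial distance must be positive")
    low, high = INTEROCULAR_RANGE_MM
    if interaxial_mm > high:
        return "hyper stereo"
    if interaxial_mm < low:
        return "macro stereo"
    return "ortho stereo"


def sensor_disparity_mm(focal_mm: float, interaxial_mm: float, distance_mm: float) -> float:
    """Disparity on the sensor for parallel pinhole cameras: f * b / Z."""
    return focal_mm * interaxial_mm / distance_mm


if __name__ == "__main__":
    mountain_mm = 5_000_000.0  # a subject 5 km away
    for interaxial in (65.0, 10_000.0):
        print(
            classify_interaxial(interaxial),
            sensor_disparity_mm(35.0, interaxial, mountain_mm),
        )
```

Under these assumptions, a 65 mm interaxial gives a distant mountain a disparity of well under a thousandth of a millimetre on the sensor, which suggests why a much wider base is used for far subjects.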

One also has to focus on objects when using cameras. The big debate in the stereoscopy world is whether to apply convergence while shooting or later. While shooting, cameras are physically turned inwards depending on the desired effect. Convergence might also have to be adjusted ‘on the fly’ while focus is being racked.

Keeping the aesthetic issues aside for a moment, the issue with setting the interaxial distance and convergence while shooting on set is that you can’t change it by much later. Slight manipulations are possible, but at the expense of time, money and artifacting.

Why would anyone want to change these settings later?

Displaying stereoscopic images

Our eyes take in two image streams and fuse them in real time. For most people with healthy binocular vision this happens perfectly, with the exception of objects coming too close to the eyes.

When you bring your finger too close, your brain can no longer fuse the two images if you try to focus on the finger.

Once the images are shot they have to be displayed on a monitor or screen. This itself is a problem because, in general terms, as we have seen earlier, the sensor of the camera is in the position of the retina. Now, the same image must be increased in size to fit whatever size display you are viewing on.

Luckily, the brain is able to fuse these stereoscopic images as well, otherwise we wouldn’t have stereoscopy.

Herein lies the crux of the issue: the camera(s) have already focused (and maybe converged) on one object, and that image has been recorded. While viewing it, our eyes will naturally try to focus on the image plane on which the monitor/display lies. But at the same time, they will fight to focus on the object that the camera(s) focused or converged on in the image itself.

[Figure: screen convergence]

Look at the red arrows. The eye will have to fight to focus on the screen, the background (the farthest depth object on the screen), and the bird (somewhere in between). The image probably is focused on the bird – now it is the eye’s job to fuse the two images together and forget its natural instinct to focus on the screen plane.

This is further compounded by distance D. When the distance of the observer changes, the mental calisthenics (brain power) required is different.

This is the greatest challenge in stereoscopy – the problem of having to focus on one object while trying to converge on another. As we know, this is not how the eye is designed to be used. This is one of the reasons why many people have issues with watching stereoscopic content.

Every time one of these properties changes, the image would have to change too, which is not possible when many people at various distances have to watch the same content, or when many people have to watch the same content on screens of different sizes.
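The dependence on screen size and viewing distance can be sketched with similar triangles. This is a simplified geometric model, not a description of how the brain actually judges depth. The function names, the 65 mm default eye separation and the sign convention (positive parallax means the right-eye image lies to the right of the left-eye image) are assumptions made for illustration.

```python
# Simplified viewing geometry: an assumption-laden sketch, not a perceptual model.

def screen_parallax_mm(parallax_fraction: float, screen_width_mm: float) -> float:
    """Parallax stored as a fraction of image width, scaled to a given screen."""
    return parallax_fraction * screen_width_mm


def perceived_distance_mm(
    parallax_mm: float,
    viewing_distance_mm: float,
    eye_separation_mm: float = 65.0,
) -> float:
    """Distance from the viewer at which the two lines of sight cross.

    Similar triangles give Z = D * e / (e - p). Zero parallax lands on the
    screen, negative parallax in front of it, positive parallax behind it.
    """
    if parallax_mm >= eye_separation_mm:
        return float("inf")  # lines of sight are parallel or diverging
    return viewing_distance_mm * eye_separation_mm / (eye_separation_mm - parallax_mm)


if __name__ == "__main__":
    fraction = 0.005  # the same recorded image: parallax is 0.5% of image width
    for width_mm, distance_mm in ((1_000.0, 2_500.0), (10_000.0, 15_000.0)):
        p = screen_parallax_mm(fraction, width_mm)
        print(width_mm, p, perceived_distance_mm(p, distance_mm))
```

The same recorded parallax fraction produces a different physical parallax, and therefore a different apparent depth, on every combination of screen width and seat position.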

In order to find a common ground between recording ‘fixed’ stereo images and displaying them on variable monitors with varying viewing distances, we process the images.

Processing stereoscopic images

As I have mentioned earlier, one of the great debates in stereoscopic imaging is whether to shoot ‘parallel’ as opposed to applying convergence while shooting. In the parallel method, images are shot straight on, and convergence is then ‘added’ in post production, depending on the effect desired. This allows for greater creativity and freedom of choice, but with the disadvantage that it might introduce errors.

One of the major artifacts of stereoscopy is the keystone effect.

The keystone effect rears its head whenever a flat image has to be projected on a surface that is not at the same angle as it is. As you can imagine, stereoscopic images shot in parallel, when converged in post production, will have a keystone effect problem. This is corrected via a method known as keystoning.
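One way to see where the distortion comes from is a pinhole model in which each image plane is rotated about the vertical axis relative to the scene. The sketch below makes that assumption explicitly: it models the rotation as a physical toe-in of two hypothetical cameras, and the same trapezoidal warp would appear if an image were re-projected afterwards to simulate that rotation. All names and numbers are illustrative.

```python
import math

# Pinhole sketch: a hypothetical two-camera model, used only to illustrate keystone geometry.

def project(point, camera_x, yaw_rad, focal=1.0):
    """Project a 3D point through a pinhole camera at (camera_x, 0, 0).

    yaw_rad rotates the camera about the vertical axis; a positive value
    turns the optical axis towards +x.
    """
    x, y, z = point
    dx = x - camera_x
    depth = math.sin(yaw_rad) * dx + math.cos(yaw_rad) * z
    u = focal * (math.cos(yaw_rad) * dx - math.sin(yaw_rad) * z) / depth
    v = focal * y / depth
    return u, v


def vertical_disparity(point, interaxial, toe_in_rad):
    """Difference in image height of one point between the left and right views."""
    half = interaxial / 2.0
    _, v_left = project(point, -half, +toe_in_rad)   # left camera turns right
    _, v_right = project(point, +half, -toe_in_rad)  # right camera turns left
    return v_left - v_right


if __name__ == "__main__":
    corner = (1.0, 1.0, 3.0)  # upper-right corner of a flat subject, in metres
    print(vertical_disparity(corner, 0.065, 0.0))
    print(vertical_disparity(corner, 0.065, math.radians(2.0)))
```

With no rotation the point sits at the same height in both views. With rotation the two views disagree about its height, with opposite signs on the left and right sides of the frame, which is the mismatch that keystone correction has to remove.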

Let’s look at another challenge.

Since most images are displayed on a flat surface, the boundaries of this surface act as a window. This is called the stereoscopic window. It is the choice of the filmmaker to either have depth inside this window (as in the image with the bird above), outside it (when objects seem to fly out of screen towards the viewer) or a combination of both.

The impact on the observer is that, instead of a screen, we now have a window. Filmmakers creating stereoscopic content try to use this effect by manipulating the stereo window. Obviously the window itself cannot change size on the display plane, since the screen or monitor cannot be shrunk or expanded. What can be changed is the illusion of depth inside and outside.
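Where an object appears relative to the window can be read off the sign of its screen parallax. The sketch below assumes the common convention that positive parallax places an object behind the window and negative parallax in front of it. The second function illustrates horizontal image translation, a technique not named in the text, in which sliding the left and right images against each other adds the same offset to every parallax. The function names are hypothetical.

```python
# Sketch under an assumed sign convention: positive parallax = behind the window.

def position_relative_to_window(parallax_mm: float, tolerance_mm: float = 0.0) -> str:
    """Classify a point by the sign of its screen parallax."""
    if parallax_mm > tolerance_mm:
        return "behind the window"
    if parallax_mm < -tolerance_mm:
        return "in front of the window"
    return "at the window"


def shift_scene(parallaxes_mm, shift_mm: float):
    """Horizontal image translation: add one constant offset to every parallax."""
    return [p + shift_mm for p in parallaxes_mm]


if __name__ == "__main__":
    scene = [-10.0, 0.0, 15.0]  # near object, window plane, background
    print([position_relative_to_window(p) for p in scene])
    pushed_back = shift_scene(scene, 10.0)
    print([position_relative_to_window(p) for p in pushed_back])
```

The window stays the same size throughout; only the parallax values, and with them the apparent positions of objects, change.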

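Changing the convergence (or shifting the left and right images horizontally relative to each other) moves the entire scene back or forward relative to the window, while changing the interaxial distance expands or compresses the depth of the scene.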

Anyway, now that our screen has become a window, what happens to objects that are cut off by the edges of the screen? As long as the object is ‘behind’ the window it’ll look natural – just as if a person were standing behind a window and we can only see him or her partially.

But what if this object is in front of the window? How can half an object be in front of the window physically? This confuses the brain, and strains it. This is called a window violation. It should really be called viewer violation.

In a scene where some objects need to pop out of the screen, if there are also objects that are being cut off by the screen window, these half-objects are specifically processed so that they stay ‘behind’ the window. It’s not fun to do.
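A check for this condition can be sketched in a few lines. The data structure below is hypothetical: it assumes each object comes with a screen parallax (negative meaning in front of the window) and a bounding box in normalised frame coordinates, which a real pipeline would have to obtain from depth or disparity analysis.

```python
from dataclasses import dataclass

# Hypothetical data model for illustration; real tools derive these values from the footage.

@dataclass
class ScreenObject:
    name: str
    parallax_mm: float  # negative = in front of the window (assumed convention)
    x0: float           # bounding box in normalised frame coordinates, 0.0 to 1.0
    y0: float
    x1: float
    y1: float


def is_cut_off(obj: ScreenObject) -> bool:
    """True when the bounding box reaches any edge of the frame."""
    return obj.x0 <= 0.0 or obj.y0 <= 0.0 or obj.x1 >= 1.0 or obj.y1 >= 1.0


def is_window_violation(obj: ScreenObject) -> bool:
    """An object in front of the window that is also cut off by the frame."""
    return obj.parallax_mm < 0.0 and is_cut_off(obj)


if __name__ == "__main__":
    objects = [
        ScreenObject("bird", -12.0, 0.4, 0.4, 0.6, 0.6),
        ScreenObject("arm", -8.0, 0.0, 0.3, 0.2, 0.7),
        ScreenObject("wall", 20.0, 0.0, 0.0, 1.0, 1.0),
    ]
    for obj in objects:
        print(obj.name, is_window_violation(obj))
```

Only objects flagged this way would need the special treatment described above; objects behind the window can be cut off freely.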

So far we’ve assumed our eyes converge on close objects and stay parallel when focused at infinity. But can our eyes diverge?

Not in practice, and it is not comfortable to try either. But if the window is adjusted too far back, some portions of the left and right images can drift apart by more than 65 mm on the screen. The observer’s eyes will have to diverge in order to fuse them!
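Because the 65 mm limit is a physical distance on the screen, the same image can be safe on a small display and force divergence on a large one. A minimal sketch, assuming parallax is stored as a fraction of image width and using hypothetical function names:

```python
# Sketch only: 65 mm is the inter-ocular figure used in the text; names are hypothetical.

def max_parallax_fraction(screen_width_mm: float, eye_separation_mm: float = 65.0) -> float:
    """Largest background parallax, as a fraction of screen width, before eyes must diverge."""
    return eye_separation_mm / screen_width_mm


def causes_divergence(
    parallax_fraction: float,
    screen_width_mm: float,
    eye_separation_mm: float = 65.0,
) -> bool:
    """True when the physical parallax on this screen exceeds the eye separation."""
    return parallax_fraction * screen_width_mm > eye_separation_mm


if __name__ == "__main__":
    background = 0.02  # background parallax is 2% of image width
    for width_mm in (1_000.0, 10_000.0):  # a 1 m television and a 10 m cinema screen
        print(width_mm, max_parallax_fraction(width_mm), causes_divergence(background, width_mm))
```

Under these assumptions a background parallax of 2% of image width is 20 mm on the television but 200 mm on the cinema screen, far beyond what the eyes can fuse without diverging.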

Half the stereoscopic challenge is to control every element in a shot while filming. The other half is correcting everything that couldn’t be controlled. Yes, it’s tedious. Now you know why most producers aren’t jumping on the 3D bandwagon.

In animated images this works out much better because every aspect (except the display) can be controlled to a certain extent, and the results are more desirable, bearable and pleasurable.

Summary:

  • The interaxial distance is the distance between the optical paths of two camera systems.
  • The greatest challenge in stereoscopy is the problem of having to focus on one object while trying to converge on another.
  • The method of shooting at an interaxial distance greater than the inter-ocular distance is called hyper stereo.
  • The method of shooting at an interaxial distance smaller than the inter-ocular distance is called macro stereo.
  • The keystone effect is the distortion produced whenever a flat image has to be projected on a surface that is not at the same angle as it is.
  • The frame of the display screen is called the stereoscopic window.

Links for further study:
