3D movies are gaining in popularity, but when they first arrived, many critics dismissed the technology as a short-lived fad because it had too many issues. Moviegoers complained about headaches and about blurry, dark pictures.

As of 2012, many of the early mistakes have been addressed. The Hobbit was projected, in selected cinemas at least, in 4K resolution at 48fps. Seeing New Zealand like that was the best you could get without a $2000 flight to see the real deal. For all its glory, though, it only looked almost real. Why almost? I was curious. What were the missing pieces needed to make the illusion real?

From frames to motion

The optical flow vector of a moving object in a video sequence.

How many frames per second (“fps”) make motion truly fluid? Peter Jackson doubled the usual 24 frames to 48, but that is clearly not enough. Apple’s iOS device screens (and most consumer devices) refresh 60 times per second. Some TVs refresh up to 240 times per second, but they do not show that many individual frames. It is a difficult question. But no number of frames is ever enough, and frames will eventually cause headaches. The reason is simply that our brain does not work in frames. Instead, it tracks motion (also see optical flow).

Which is why I am so excited about the work of researchers at the University of Bath, who created a codec based on vectorization. While eliminating pixels was their main goal, they could eliminate frames altogether: when the movie runs (and the codec decodes), “frames” would be rendered on the fly to match the display frequency, and on newer displays, only what changed would be sent to the display and updated. A display that literally moves stuff around on its surface, the way your brain expects it.
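
To make the idea of rendering “frames” on the fly a bit more concrete, here is a toy sketch. It is not the Bath codec; it only assumes that an object is stored as a start position plus a constant motion vector, and the names `MovingObject` and `render_positions` are made up for this illustration. The point is that the same description can be sampled at whatever rate the display happens to run at.

```python
from dataclasses import dataclass


@dataclass
class MovingObject:
    """An object described by a motion vector instead of per-frame pixels."""
    x0: float  # position at t = 0, in pixels
    y0: float
    vx: float  # velocity, in pixels per second (assumed constant)
    vy: float

    def position_at(self, t: float) -> tuple[float, float]:
        # Valid at any instant, not just on frame boundaries.
        return (self.x0 + self.vx * t, self.y0 + self.vy * t)


def render_positions(obj: MovingObject, refresh_hz: int, duration_s: float):
    """Sample the motion once per display refresh, whatever the rate is."""
    count = round(refresh_hz * duration_s)
    return [obj.position_at(i / refresh_hz) for i in range(count)]


ball = MovingObject(x0=0.0, y0=0.0, vx=120.0, vy=0.0)
cinema = render_positions(ball, refresh_hz=48, duration_s=1.0)
tv = render_positions(ball, refresh_hz=240, duration_s=1.0)
print(len(cinema), len(tv))
print(cinema[24], tv[120])  # both samples are taken at t = 0.5 s
```

The 48 Hz and 240 Hz outputs come from the same source data, and they agree wherever their sample times coincide. Nothing in the stored description is tied to a frame rate.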

Focus and blur

An issue that became more visible with 3D movies is the inability to refocus on objects or let them blur. Focus on an object far behind the screen you are reading this on: it will sharpen after a second. Now focus on this screen again. Try the same when watching a 3D movie. It won’t work.

Viewing the movie in 3D gives you the impression that some objects are further away and some are closer. This is achieved the same way the brain does it, by sending a slightly different image to each eye. But the depth of field in both images is static: you can’t focus on something that wasn’t meant to be in focus. This is one of the main reasons people complain about headaches.

The way to fix this is relatively straightforward but not easy to implement:

  1. Instead of a single image, use a layered image (i.e. Photoshop style)
  2. Track the viewer’s eye (and pupil) position and calculate where she is looking
  3. Apply a new depth of field, bringing the layer the viewer is looking at into focus in a short transition (~1s)
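
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a working renderer: each layer is reduced to a depth and a rectangular bounding box (a stand-in for a real alpha mask), the gaze point is assumed to arrive in screen pixels from some eye tracker, and blur is crudely modelled as proportional to the difference in diopters (1/distance) between a layer and the current focus. All names and the `strength` factor are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    depth_m: float  # distance from the viewer, in metres
    box: tuple      # (left, top, right, bottom) in screen pixels

    def covers(self, x: float, y: float) -> bool:
        left, top, right, bottom = self.box
        return left <= x < right and top <= y < bottom


def layer_under_gaze(layers, x, y):
    """Step 2: the nearest layer that covers the gaze point, or None."""
    hits = [layer for layer in layers if layer.covers(x, y)]
    return min(hits, key=lambda layer: layer.depth_m) if hits else None


def focus_diopters(t, start_m, target_m, duration_s=1.0):
    """Step 3: move the focus from start to target over about one second."""
    a = min(max(t / duration_s, 0.0), 1.0)  # clamp progress to [0, 1]
    start_d, target_d = 1.0 / start_m, 1.0 / target_m
    return start_d + a * (target_d - start_d)


def blur_px(layer, focus_d, strength=10.0):
    """Crude defocus model: blur grows with the distance from the focus in diopters."""
    return strength * abs(1.0 / layer.depth_m - focus_d)


layers = [
    Layer("tree", 2.0, (0, 0, 100, 100)),
    Layer("mountain", 100.0, (0, 0, 1920, 1080)),
]
target = layer_under_gaze(layers, 500, 500)  # viewer looks past the tree
for t in (0.0, 0.5, 1.0):
    focus = focus_diopters(t, start_m=2.0, target_m=target.depth_m)
    print(t, [(layer.name, round(blur_px(layer, focus), 2)) for layer in layers])
```

Over the one-second transition the tree goes from sharp to blurred while the mountain does the opposite, which is roughly what your eyes do when you look out of a real window.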

Interestingly, I haven’t seen any research on this. The good news is that the Tobii Rex is an upcoming consumer product that enables #2: tracking where the user is looking. Funny: in their long list of possible use cases, they never mention the one above.

Head tracking

Many think VR displays have to be so large they cover your whole viewing area. But wouldn’t that mean that looking out of a window should somehow look fake? Well, it turns out that it doesn’t matter how much of our view the screen fills: we just need to make it work like a window!

Take a look out of a window and focus on something. Now move your head (don’t turn it – move it along with your body). Notice how the object you focussed on gets closer or further away based on where you are looking from? Well, try the same with the screen you are reading this on. It won’t work.

Movie images are captured from a specific perspective, and if you move around, the perspective stays the same. Thus, the picture will only look real if you, the viewer, become the camera! The good news is that a few brave folks have already done this, like the guy who turned a Wiimote into a head-tracking device, and the Oculus Rift is one of the first consumer devices to include head tracking.
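
The “screen as a window” idea comes down to a bit of geometry, sketched here under a few assumptions of my own: the screen lies in the plane z = 0, the viewer’s eye is in front of it (z > 0), virtual objects sit behind it (z < 0), and all coordinates are in metres. The function name `project_to_window` is made up for this illustration. It finds where the line from the eye to the object crosses the screen, which is where that object has to be drawn for this particular head position.

```python
def project_to_window(eye, point):
    """Return the (x, y) spot on the screen plane z = 0 where `point` must be drawn."""
    ex, ey, ez = eye
    px, py, pz = point
    if ez <= 0:
        raise ValueError("the eye must be in front of the screen (z > 0)")
    if pz >= ez:
        raise ValueError("the point must be further from the viewer than the eye")
    # Parameter along the eye-to-point line at which z reaches 0.
    s = ez / (ez - pz)
    return (ex + s * (px - ex), ey + s * (py - ey))


near_object = (0.0, 0.0, -1.0)     # one metre behind the screen
far_object = (0.0, 0.0, -1000.0)   # far away, like a mountain

for head_x in (0.0, 0.2):
    eye = (head_x, 0.0, 1.0)       # one metre in front of the screen
    print(head_x, project_to_window(eye, near_object), project_to_window(eye, far_object))
```

When the head moves 20 cm sideways, the nearby object shifts by about half that on the screen, while the distant one moves almost as far as the head does. That difference is the parallax a real window gives you and a fixed movie image does not.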

Conclusion

We’re very close! The good news is that all of the above can be implemented, and none of it needs further research except for the video codec. The bad news is that solutions will be costly at first, and not combined into a single experience. Also, dynamic focus and head tracking will likely never work in a movie theater, as the images sent to your eyes need to be unique to you. Never say never though: as soon as we all have direct neural interfaces, it’ll be trivial :)
