GDC 2013 in 3D, Part II

Today we have part two of Kris Roberts' coverage of GDC 2013. This time, Kris shares what he learned from Valve's presentations about VR technology and what game developers need to look out for.

Why Virtual Reality Is Hard (And Where It Might Be Going)
Michael Abrash (Valve Software)

Quite a few years ago, before I switched gears in my career and went into games professionally, I read Michael Abrash's Graphics Programming Black Book. I had been running a Quake server for a while and found it fascinating to read about what went into the state of the art in computer graphics at the time and learn about some of the background story behind how they developed Quake in particular. Michael writes with a splendid tone that gets complex concepts across clearly and succinctly, but without the impression that details are being glossed over or simplified. His talk this year was the first time I had ever seen him in person, and it was no big surprise that his personal presentation style was just as illuminating.

My main takeaway from his talk was that VR is hard, really hard, and once we solve the problems at hand, it's likely to expose even harder things to solve. However, the session was hardly a downer or buzzkill for everyone excited about VR! Instead, I think it was more of a call to arms for our best and brightest to get to work: even where the problems look intractable, we should strive to overcome them, because at the end of the day it really is going to be awesome and worthwhile.

Michael started his talk describing his own flashback to 1996 and the notion that the Metaverse from Neal Stephenson's book Snow Crash was coming. That's when he went to work with John Carmack on Quake and although it wasn't quite the Metaverse, what they produced was pretty amazing.

So now it's 2013 and the overwhelming sense is that virtual reality is coming! We have heard this before, but why is it different this time? We are seeing a convergence right now with flat-panel displays, batteries and power management, mobile CPU/GPUs, wireless, cameras, gyroscopes, accelerometers, compasses, projectors, waveguides, computer vision, tracking hardware, and content in the form of 3D games.

True virtual reality, a simulation indistinguishable from reality, is not just around the corner. We are only seeing the very first glimpse of it with the Rift, and that's just the start. The technology has room for improvement across the board, and it will take many years to fully refine the virtual reality dream. Augmented reality is going to be even harder.

How hard can it be? What's so hard about mounting a screen in a visor and displaying images? Is that all there is to it?

These are the really hard parts Michael laid out: Tracking, latency, and producing perceptions indistinguishable from reality.

For tracking to work convincingly, images must seem fixed in space. This is harder with a VR headset because it moves differently than other displays we use: it moves with our head, so it moves relative to both real reality and our eyes. The truth is that your head can move very fast, and while your head is moving your eyes can counter-rotate just as fast.

In rapid relative motion, your head moves (which moves the display) while your eyes move in the opposite direction, requiring the pixels that make up an object you are tracking to shift on the display and yet appear to be stationary in the space of the presented stereo projection. Images must always be exactly in the right place relative to real reality and your head. Any errors introduced in any part of the system come across as anomalies, and the human perceptual system is tuned to pick out just these kinds of problems. After all, anomalies may be something trying to eat us, or something WE might want to hunt and eat!

The main point is that tracking has to be super accurate.

How close is VR tracking now? The Rift uses an inertial measurement unit (IMU) which is inexpensive and lightweight, but it does drift and only supports rotation, not translation. Translation is an important part of tracking, and moving side to side or forward and backward without that movement being reflected in the simulation is disorienting.
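To see why drift is a concern, here is a minimal sketch that assumes yaw is estimated by simply summing gyroscope rate samples, with a hypothetical constant sensor bias. It is not how the Rift's sensor fusion actually works; the function name and the numbers are made up for illustration.

```python
def integrate_yaw(rate_samples_deg_s, dt_s, bias_deg_s=0.0):
    """Estimate yaw in degrees by summing angular-rate samples.

    bias_deg_s is a hypothetical constant gyro error added to every sample.
    """
    yaw = 0.0
    for rate in rate_samples_deg_s:
        yaw += (rate + bias_deg_s) * dt_s
    return yaw


# A head held perfectly still for 10 seconds, sampled at 100 Hz.
still_head = [0.0] * 1000
drift_deg = integrate_yaw(still_head, dt_s=0.01, bias_deg_s=0.1)
```

Even with the head motionless, the small bias accumulates into an error that grows with time, which is why an IMU alone needs some outside reference to stay anchored.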

Latency is the delay between head motion and the virtual world update reaching the eyes. When the update arrives too late or not at all, images get drawn in the wrong place and appear as anomalies. The type and severity of latency-related anomalies vary with the type of head motion.

To overcome this, latency needs to be super-low. How low is super-low? The goal is somewhere between 1 and 20 ms total for:
  • Tracking
  • Rendering
  • Transmitting to the display
  • Getting photons coming out of the display
  • Getting photons to stop coming out
That's a lot to do in not very much time! Google "Carmack Abrash latency" for details.
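The stages listed above add up to a single motion-to-photon delay, and a rough budget check can be sketched as follows. The per-stage timings are hypothetical placeholders, not measurements from the talk; only the 20 ms upper bound comes from the goal stated above.

```python
# Hypothetical per-stage latencies in milliseconds (illustrative only).
STAGES_MS = {
    "tracking": 2.0,
    "rendering": 8.0,
    "transmit_to_display": 4.0,
    "photons_on": 2.0,
    "photons_off": 3.0,
}


def motion_to_photon_ms(stages):
    """Total delay is the sum of every stage in the pipeline."""
    return sum(stages.values())


def within_budget(stages, budget_ms=20.0):
    """True if the whole pipeline fits inside the latency budget."""
    return motion_to_photon_ms(stages) <= budget_ms


def angular_error_deg(latency_ms, head_speed_deg_s):
    """How far the head has turned before the image catches up."""
    return head_speed_deg_s * latency_ms / 1000.0
```

With these made-up numbers the pipeline totals 19 ms and just fits. At a head speed of 100 degrees per second, a 20 ms delay means the image lags the head by 2 degrees.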

The remainder of the presentation focused on an investigation of the issues for tracking, using space-time diagrams to examine pixel-based movement over each frame. In true reality, photons are bouncing off the object and entering our eyes continuously. In virtual reality we have different display technologies, which show color either by flashing the red, green and blue components sequentially or by showing all components simultaneously, and which keep each image lit for varying lengths of time (persistence).

With a sequential RGB display, the red, green and blue light for a pixel moving across the display arrives one after the other, and with our eyes fixed it shows the color properly and in the right location. But if the eyes are moving, it effectively slants the segment on the space-time diagram, and we see the fringing 'rainbows' familiar from DLP displays when your eyes dart across the screen. When that happens on a VR display it is less than satisfactory.

The alternative display technology shows each color component simultaneously, and for a set duration each frame. Full persistence shines for the entire frame, and stops only when the next frame starts. Half persistence shines for the initial period and then goes dark well before the next frame. A zero persistence display just flashes at high intensity at the start of each frame.

With each of the persistence types, a pixel moving across the screen shows properly. However, moving your head causes the light to stay in the old location for as long as it shines and then pop to the proper location at the start of the next frame. So the shorter the persistence, the less smearing shows. This would lead us to think that zero persistence would be ideal, and indeed it looks good when your eye tracks the motion, but untracked motion of a pixel going across the screen strobes because the distance between flashes is too far for the eye to reconcile smoothly. The punch line is that none of the technologies we currently have is perfect in every situation, and great VR visual quality is going to take a lot of time and R&D. There are certainly bigger problems than just color fringing, strobing and judder.
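The trade-off between smearing and strobing can be put into rough numbers. This is a back-of-the-envelope sketch, not a calculation from the talk: it assumes smear is simply relative eye speed multiplied by the time a pixel stays lit, and the refresh rate, eye speed and pixels-per-degree values are hypothetical.

```python
def lit_time_ms(refresh_hz, lit_fraction):
    """Time per frame that a pixel emits light, in milliseconds."""
    return 1000.0 / refresh_hz * lit_fraction


def smear_pixels(relative_speed_deg_s, lit_ms, pixels_per_deg):
    """Distance the image slides relative to the eye while the pixel stays lit."""
    return relative_speed_deg_s * (lit_ms / 1000.0) * pixels_per_deg


def frame_step_pixels(relative_speed_deg_s, refresh_hz, pixels_per_deg):
    """Jump between consecutive frames; large jumps read as strobing."""
    return relative_speed_deg_s / refresh_hz * pixels_per_deg


# Hypothetical setup: 60 Hz display, 10 pixels per degree, 120 deg/s motion.
full = smear_pixels(120.0, lit_time_ms(60.0, 1.0), 10.0)  # full persistence
half = smear_pixels(120.0, lit_time_ms(60.0, 0.5), 10.0)  # half persistence
zero = smear_pixels(120.0, lit_time_ms(60.0, 0.0), 10.0)  # idealized zero
step = frame_step_pixels(120.0, 60.0, 10.0)
```

Under these assumptions full persistence smears across about 20 pixels, half persistence across about 10, and idealized zero persistence not at all. The image still jumps about 20 pixels between flashes, though, which is the gap the eye cannot reconcile for untracked motion.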

Once we have a handle on the hard problems, we can move on to thinking about the really really hard problems:
  • Per-pixel focus
  • Haptics
  • Input
  • How to be environmentally aware of real reality while using virtual reality
  • Solving virtual reality motion sickness
  • Figuring out what is uniquely fun about virtual reality
Figuring out what's great in VR is where all the effort is leading. We don't know what it is now; it may seem obvious looking back, but nobody can tell for sure ahead of time. Michael says he thinks the road map for VR is likely to be pretty straightforward. First the Rift ships and is successful, and that starts a lot of development and activity similar to what we saw with 3D accelerators, with various companies competing and improving the technology. The real question is whether VR ends up as a niche or the foundation for a whole new platform.

So yes, VR is going to be hard - but the same can be said for real-time 3D and just look at the tremendous strides that have been made in that technology since Quake. These are exciting times, and the problems are not to be shied away from.

For more info and references check out:
Google "abrash black book" (free online)

What We Learned Porting Team Fortress 2 to Virtual Reality
Joe Ludwig (Valve Software)

Joe was involved with the Valve team's effort to support VR in Team Fortress 2. The game is free to play, and with the current update you simply need to add -vr to the command line launch options to run in Virtual Reality mode with support for the Oculus Rift.

They worked on it with the Nvis St-50 headset as well as prototype (duct tape and love) early versions of the Rift that Palmer would send them. Joe seemed pretty excited about the actual production versions of the devkits and maybe a little jealous of the developers who are going to get to use them without knowing the joy of working with early prototype hardware.

Two specific recommendations he had were to turn off desktop effects in Windows and to get a DVI splitter. The desktop effects can introduce latency, and with a splitter you can run the headset while also seeing what's being displayed on a monitor. At Valve they use an Aluratek model which you can find online for around $80. With whatever splitter you get, you may have to experiment with the connections and/or particular power-on sequences to ensure that the proper EDID data gets to the right devices. You'll figure it out.

Once you have your Rift and development environment all set up, what are the critical pieces involved in porting your game to VR?
  • Latency
  • Stereo Rendering
  • User Interface
  • Input
  • VR Motion Sickness
The first topic, latency, is super important, but for the sake of time Joe did not cover it in the talk and instead pointed to this reference material:
Google "John Carmack latency" and "Michael Abrash latency"

Stereo rendering on the Rift is done with a 1280x800 panel, split into 640x800 per eye. In practice the visible area is less than that, and because of the lens distortion and its correction, the image needs to be rendered at a higher resolution in the rendering pipeline. In the end, you need to have two virtual cameras that respect the interpupillary distance set for the user at the time.
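The two-camera setup can be sketched as below. This is an illustration, not TF2's code: the function names are hypothetical, the 64 mm default IPD is an assumed placeholder for the user's real setting, and the viewports ignore the higher-resolution render target and distortion pass mentioned above.

```python
def eye_positions(head_pos, right_dir, ipd_m=0.064):
    """Offset each virtual camera half the IPD along the head's right vector.

    right_dir is assumed to be unit length; ipd_m is a placeholder default.
    """
    half = ipd_m / 2.0
    left = tuple(p - r * half for p, r in zip(head_pos, right_dir))
    right = tuple(p + r * half for p, r in zip(head_pos, right_dir))
    return left, right


def eye_viewports(panel_w=1280, panel_h=800):
    """Split the panel into side-by-side (x, y, width, height) viewports."""
    half_w = panel_w // 2
    return (0, 0, half_w, panel_h), (half_w, 0, half_w, panel_h)


left_eye, right_eye = eye_positions((0.0, 1.7, 0.0), (1.0, 0.0, 0.0))
left_vp, right_vp = eye_viewports()
```

The important design point is that the separation is read from the user's setting rather than hard-coded, since the scene's apparent scale and depth depend on it.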

In the regular version of TF2 they use a player weapon model for the first-person character that just includes the gun, hands and arms to the elbow. Normally that moves with the screen and you never see where the geometry ends. But in VR mode, with its wider FOV, it was too easy to look around and see that the model was incomplete. They ended up using the third-person model so you would look down and see the entire body. I know we did similar tricks with the player model in the cockpit camera mode in Midnight Club Los Angeles, but we had to eliminate the character's head so there were no clipping issues with the geometry in the same place as the cameras. Even though Joe said they were using the full third-person model, I suspect they did have to do some geometry elimination.

Getting the world and character to look good in the stereoscopic view from the headset sounded like it was pretty straightforward, with the exception of full-screen effects. Almost none of them worked right away, and they required some effort to get working in stereo. That is a pretty common problem for anyone who has worked on porting a game to stereoscopic 3D.

The user interface sounds like it presented the most aesthetic challenges. The first set revolved around conflicting depth cues. When you look at a scene in 3D, these are the factors which help you identify the relative position of objects: size, occlusion, parallax, convergence, perspective, distance fog, stereo disparity, and focal depth.

Putting user interface elements in the player's view at any depth typically introduces conflicts in occlusion and convergence. These mismatched depth cues make it confusing or distracting to have UI on a virtual HUD as your eyes switch back and forth between looking at the UI information and the general scene. In the end, the TF2 HUD was basically shrunk down and positioned within the low-distortion, high-resolution usable display space near the center of the player's view to make it legible and convey the information the player needs.

They did continue to use full screen menus but discovered that players were more comfortable when they still had head tracking and were not locked into a static view forcing them to see nothing but the menu.

Handling the targeting reticule is a classic problem for stereo games, and their solution was to cast a ray and render the reticule at the distance of the targeted object. This will pop its position in and out of the scene as you move your view, but in practice it sounds like most players were unaware and even after being told that's what was being done they didn't necessarily recognize the effect.
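A minimal sketch of that idea follows, assuming targets are approximated by spheres and the aim direction is a unit vector. The function names, the sphere approximation and the 50-unit fallback depth are all hypothetical; the talk did not describe the actual trace used in TF2.

```python
import math


def ray_sphere_distance(origin, direction, center, radius):
    """Distance along a unit-length ray to the first hit on a sphere, or None.

    Hits behind the origin (or an origin inside the sphere) are ignored.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    t = -b - math.sqrt(disc)
    return t if t >= 0 else None


def reticle_position(origin, direction, targets, max_depth=50.0):
    """Place the reticle at the nearest hit, or at max_depth if nothing is hit."""
    hits = [ray_sphere_distance(origin, direction, c, r) for c, r in targets]
    depth = min([h for h in hits if h is not None], default=max_depth)
    depth = min(depth, max_depth)
    return tuple(o + d * depth for o, d in zip(origin, direction))
```

Because the reticle is a point in the world at the aim depth, both eye cameras render it with the same disparity as the object behind it, so the eyes do not have to reconverge between the crosshair and the target.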

Joe then went over the various experiments they tried for the actual player input. How to combine head tracking, mouse and keyboard is a big domain for design in VR, and they set up a number of modes in TF2 that you can switch between to see which works best for you.

Input mode 0 has you aim and steer with your nose. The mouse or control pad just rotates your torso.

Mode 1 has you aim with your nose, but move your body with the mouse. There is some drift in the Rift tracking and it sounded like sometimes players would get a little confused as to which direction their 'body' was pointed when they were looking in another direction.

Modes 2, 3 and 4 experimented with a vertical band around the center of the screen where the reticule could move freely, but if it got to the edge would pull the view along with it. The default they ended up setting was mode 3, which I think I understood had the look/move direction tied to the torso. Play with the various modes and see what works for you and what ideas you have to try in your games.
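The band behavior can be sketched for yaw alone. This is a guess at the general mechanism, not Valve's implementation: the function name and the 15-degree half-width are hypothetical, angles are in degrees, and head tracking (which would be applied on top) is not modeled.

```python
def apply_mouse_yaw(view_yaw, reticle_offset, mouse_delta, band_half_width=15.0):
    """Move the reticle freely inside the band; past the edge, turn the view.

    Returns the new (view_yaw, reticle_offset) pair.
    """
    offset = reticle_offset + mouse_delta
    if offset > band_half_width:
        view_yaw += offset - band_half_width  # overflow drags the view along
        offset = band_half_width
    elif offset < -band_half_width:
        view_yaw += offset + band_half_width
        offset = -band_half_width
    return view_yaw, offset
```

Small mouse movements only shift the reticle, so aiming does not require head motion, while larger ones rotate the view and give the player a way to turn around with the mouse.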

The last topic Joe covered was VR motion sickness. It's something very real that the majority of players experience to some degree. The symptoms vary, but being sensitive to it as developers and trying to minimize the things that make people the most uncomfortable is important. The first thing they realized was that taking orientation control away and/or animating it without player input, as is typically done in death camera sequences or cutscenes, is really disorienting. This is most evident when introducing roll or moving the camera sideways without the gamer's control or intent. Even when players are in control of their view, certain movements in the game, such as going up or down stairs or ramps, seem to bother players, perhaps because they are moving in two directions simultaneously.

To summarize, these were the parting topics Joe wanted to reiterate:

  • Eliminate latency
  • Buy a splitter
  • Fix your screen-space effects
  • Fix your player weapon models
  • Pre-distort in a shader
  • Eliminate the HUD if you can
  • Draw the HUD in stereo if you can't
  • Draw the crosshair at the aim depth
  • Include a way to turn around with the mouse
  • Give people some aiming without head motion
  • Don't mess with the horizon. Ever.
  • Keep view rotation 1:1 with head tracking
  • Don't slide the camera sideways

Great stuff! Next up, Kris had a chance to go through the GDC exhibit floor and spot some gems to look forward to. Come back for more!