Data/Algos for most realistic orientation/position tracking

Post by **daniel2e** » Fri Feb 15, 2013 5:14 pm

Hi,

This is my first post here. I was looking for the most serious forum out there for the OR, and it looks like I found it!

After following much of the technical discussion of the Rift for a while, I gather that there are some hairy problems to be solved related to accurately interpreting multi-modal, noisy and/or inaccurate sensor data. Broadly, these problems are sensor interpretation (what does this series of readings really mean in this environment?), sensor fusion (how do I unify these readings into a single model?) and sensor prediction (to offset latency). Obviously Oculus already has a pretty good handle on orientation alone, but it's not perfect, and they seem to be pretty tight lipped about the timeline for integrating high-fidelity positional tracking.

If Oculus-style VR takes off, there are going to be some deep-pocketed companies ready to jump in, so Oculus needs to stay ahead of the curve. I think they are going about this exactly the right way by building a developer community, and I think they would do well to get that community working on the hard sensor interpretation problems.

Towards this end, Oculus should consider producing time-stamped data sets that represent both (a) data read from all Rift sensors and (b) high-precision & accuracy real-world movements of the rift for a large set of random movements of the rift. Then let the community develop the algorithms that best map (a) to (b). The algorithms could be anything. If you are a machine learning guy, try out your fanciest Neural Nets/SVMs/fill-in-the-blank regression algorithms. iIf you are a math guy, throw your sensor fusion/kalman filter/statistical whatever at it. If you are a probability guy, try out your best bayesian methods to determine the maximal P(position&orientation|observations). Lots of folks, myself included, would be excited to crack this problem. For my part, I'd probably go with time-series Machine Learning/Bayesian techniques.

So the only problem is: how does one produce such a data set?

I think there are lots of viable possibilities. The great thing about building a rig for this is that you are not constrained by human-factors. The rig need not be wearable, it just needs to be able to manipulate the Rift and record what it does. I think you can tackle it from either the sensor or actuator side (or both!).

If you go the actuator route, you'll probably need 3 each of high precision servos and linear actuators. They would be responsible for manipulating the Rift in 3D space and, since you are giving them instructions, you would know precisely how the rift is really moving. I'm not sure what this rig would look like, but I bet hardware buffs can think of something.

The other possibility is using high-performance sensors, either on the rift itself or external. What's the best thing out there? Those white mo-cap balls? High-precision laser range finders? Heck, you could even throw some Kinects and Playstation-moves at it since latency is not a problem (you can do the processing offline, all you need is the timestamps for the sensor readings themselves). In this scenario, you don't even need a rig to move the rift, you can simply move it by whatever means you like, as long as you don't get in the way of the sensors.

Perhaps best of all, you only need to build this rig one time. As the Rift goes through its inevitable refinements and improvements, you just put it back in the rig and churn out a new data set. If there is a new sensor, that's just a new input to the algorithm, which for a lot of techniques is no big deal at all (I can speak for Machine Learning, anyway).

What do you guys think? Should Oculus set aside time to generate data like this? Does anyone in this community have the knowledge and inclination to build such a rig themselves?

Daniel

Post by **Apotheosis** » Fri Feb 15, 2013 7:13 pm

Welcome to the forum (madness). Lots of thoughtful questions there. I'm looking forward to seeing how this thread develops.

Post by **PasticheDonkey** » Fri Feb 15, 2013 7:47 pm

i think this is exactly the kind of stuff oculus is going to be doing themselves. too much like work or too expensive for hobbyists. they already have machines to calibrate their sensors against reality in some regards.

Post by **Pyry** » Fri Feb 15, 2013 8:00 pm

A rig like that would work to characterize the sensor noise, but wouldn't let you build a model of human motion, which is what you really want. That is, the statistical and learning approaches are trying to build a statistical model of what motions are likely, and then use that model to bias the sensor fusion towards the movements that are the most likely according to the model. To learn a statistical model of human head motions you really do need to gather data from users.

Post by **PasticheDonkey** » Fri Feb 15, 2013 8:07 pm

well that hasn't made great strides as the approach used by kinect. you'd be better of throwing random movement constrained figure models at a evolutionary learn to interpret human movement program.

Post by **daniel2e** » Fri Feb 15, 2013 10:37 pm

i think this is exactly the kind of stuff oculus is going to be doing themselves. too much like work or too expensive for hobbyists. they already have machines to calibrate their sensors against reality in some regards.

If they do - more power to them, problem solved!

However, I suspect that if it that were the case, we would see dev kits shipping with positional sensors similar to those in the razer hydra or wii controllers. These problems are hard, and they have a lot on their plate already scaling up manufacturing.

As far as the problem being too expensive for hobbyists, the rig would probably take a thousand or so dollars to build, and the algorithm research could take far more in man-hours. But as I said, people like myself are willing to work on it free of charge! I just want to see the highest fidelity tracking in the headset

A rig like that would work to characterize the sensor noise, but wouldn't let you build a model of human motion

This is a good point! But I offer two counterpoints:

(1) If the data set is reasonably large, algorithms derived from it should be robust. In other words, they should generalize to human motion.

(2) Why not have a data set representative of human motion? While replicating human motion with actuators would be tricky, if you go the sensor route (laser range finders, mo-cap, kinect/move, etc.), you could generate data by having a human wear the set and manipulate it realistically. It might be unwieldy, but all you care about here is getting solid data.

Post by **2EyeGuy** » Fri Feb 15, 2013 10:51 pm

Pyry wrote:A rig like that would work to characterize the sensor noise, but wouldn't let you build a model of human motion, which is what you really want. That is, the statistical and learning approaches are trying to build a statistical model of what motions are likely, and then use that model to bias the sensor fusion towards the movements that are the most likely according to the model. To learn a statistical model of human head motions you really do need to gather data from users.

It probably varies a bit from person to person though. You might need a large sample size. And then in other countries you might still end up with head motions you didn't expect, like this one:

[youtube]http://www.youtube.com/watch?v=tRwzcnOdNFc[/youtube]

Post by **Pyry** » Sat Feb 16, 2013 1:23 am

This is one of those cases where there's "no free lunch", so to speak. The only way you can really do better than some basic averaging strategy is by making (correct) assumptions about the underlying system (head motion in this case). In a Kalman filter, for example, these assumptions are built into the state transition and control matrices; you assume that you know the dynamics that govern the underlying system, but you don't know what the system's current state is. In the case of human head motion, however, you can't start with physics equations and derive a transition matrix, because there's this hugely unpredictable human in the loop. So then either you make some weak assumptions (that human motion is mostly smooth, that there are acceleration and velocity limits), or you try to derive a statistical model based on a large amount of data. I suspect that the algorithms learned from a non-human-controlled rig would essentially boil down to making smoothness and acceleration assumptions.

Post by **lainse** » Sat Feb 16, 2013 3:58 am

what a bout integrating eye tracking in to the head model.
I would think that most smaller head moments are there to center ones vision better.

Post by **daniel2e** » Sat Feb 16, 2013 10:51 am

Pyry - thank you for your thoughtful reply.

The only way you can really do better than some basic averaging strategy is by making (correct) assumptions about the underlying system (head motion in this case).

What you are calling 'assumptions' is similar to what we call 'bias' in Machine Learning. It is the fundamental trick that actually makes these algorithms useful, and I think you have correctly identified one of the most useful types of bias to learn in this context - tendencies and constraints of the human head.

However, I disagree that this is the 'only' useful thing to be learned! I suspect, for example, that there are useful nuances of the sensors (perhaps combined with the electromagnetic environment induced by the Rift) that can be exploited. We know they are noisy, but they doesn't mean that they give us simple gaussian distributions around the true values. Carmack mentioned playing with the hydra, and described it as extremely precise, but inaccurate, such that you could wiggle your head around and end up with your view having drifted 15 degrees. This particular case is easy to correct for (I think your gyros tell you which way is down), but it betrays that the sensor data could be more interesting than simple noise.

For example, there may be contexts in which a given sensor tends to be particularly reliable or unreliable. Perhaps a sensor performs well up to a certain linear or angular velocity, but performance starts to degrade above that. Or perhaps a sensor performs well for a given range or duration of motions, but begins to accumulate significant error outside that range. The algorithm could learn these limitations, and lean more on other sensors, or inferences it draws from them and earlier readings, in these contexts.

I suspect that the algorithms learned from a non-human-controlled rig would essentially boil down to making smoothness and acceleration assumptions.

So obviously I think there are more interesting things to learn either way, but I would reiterate that this need not be a non-human-controlled rig. I can imagine equipping a rift with high-fidelity sensors that still permit the rift to be worn, but would not be practical in a consumer device either because of cost, physical awkwardness, or assumptions made about the environment that may hold in the lab but not the living room.

My only slight concern with a human-controlled set up is the possibility of over-fitting the idiosyncrasies of a a single person's musculoskeletal/physiological behavior. You'd probably want to put a handful of people in there representing a variety of body types.

Post by **daniel2e** » Sat Feb 16, 2013 11:04 am

lainse wrote:what a bout integrating eye tracking in to the head model.
I would think that most smaller head moments are there to center ones vision better.

This is an example of another sensor that could be added in the future that might usefully inform overall tracking. I'm not predicting that it would, but it would be interesting to investigate.

However, not to go off on a tangent, but if eye tracking is eventually added it would likely be for another purpose entirely, and a pretty cool one at that. John Carmack has proposed using eye tracking to track the target of the 'fovea', which is the part of your eye that receives the light from the direct center of your vision and can interpret orders of magnitude more detail than any other part of your vision. The only reason we have high resolution in games is to satisfy the fovea, everything else can be extremely low resolution and you wouldn't be able to tell the difference!

If you can track the fovea, you can devote the vast majority of your rendering to a small 100x100 or so moving window of the display, say creating a 426ppi 120 stereo-frames-per-second 'window' into the world. The other 90% of the display can be rendered at low resolution and low framerate. This capability would by equivalent to a several fold increase in GPU power without changing any hardware!

Obviously, we'll need a headset with better than a 7" 720p display before this becomes super attractive

Post by **geekmaster** » Sat Feb 16, 2013 11:58 am

daniel2e wrote:.... If you can track the fovea, you can devote the vast majority of your rendering to a small 100x100 or so moving window of the display, say creating a 426ppi 120 stereo-frames-per-second 'window' into the world. The other 90% of the display can be rendered at low resolution and low framerate. This capability would by equivalent to a several fold increase in GPU power without changing any hardware! ...

http://ai.stanford.edu/~sgould/papers/i ... foveal.pdf

The attentive interest map, generated from features extracted from the peripheral view, is used to determine where to direct the foveal gaze...

Although intended for robot vision systems, these methods could be used to compress (or render) video data constrained primarily to areas of visual interest.

Post by **MSat** » Sat Feb 16, 2013 1:15 pm

Welcome to the forums, and great first post!

While there may be some methods and algorithms to increase the quality of data from the current IMU, there is only so much that is possible due to inherent limitations. What these limitations are and where they derive from, I don't know exactly (at least without knowing the sensor specs and the process by which the data is acquired), but they seem to be in line with what numerous researchers have experienced using similar sensors. From the looks of it, it would seem that orientation data is quite reliable in terms of both response time and accuracy. All three sensors work well together towards these goals, and low-pass filtering of accelerometer and magnetometer data is good for providing low noise orientation info. However, when it comes to translations, you're pretty much stuck with using just the accelerometer - low pass filtering is only viable if accuracy and sampling rate is high enough (as it turns out with most sensors, resolution decreases with an increase in sampling rate), the only things you can use the gygro/mag for towards these ends is to verify that there was no change in orientation, which is of marginal use.

As is, the Rfit's IMU on its own is unsuitable for translational tracking. There needs to be additional sensing systems in order to make such functionality possible - for which there appears to be no shortage of.

Post by **daniel2e** » Sat Feb 16, 2013 6:23 pm

MSat wrote: While there may be some methods and algorithms to increase the quality of data from the current IMU, there is only so much that is possible due to inherent limitations. What these limitations are and where they derive from, I don't know exactly (at least without knowing the sensor specs and the process by which the data is acquired), but they seem to be in line with what numerous researchers have experienced using similar sensors. From the looks of it, it would seem that orientation data is quite reliable in terms of both response time and accuracy. All three sensors work well together towards these goals, and low-pass filtering of accelerometer and magnetometer data is good for providing low noise orientation info. However, when it comes to translations, you're pretty much stuck with using just the accelerometer - low pass filtering is only viable if accuracy and sampling rate is high enough (as it turns out with most sensors, resolution decreases with an increase in sampling rate), the only things you can use the gygro/mag for towards these ends is to verify that there was no change in orientation, which is of marginal use.

As is, the Rfit's IMU on its own is unsuitable for translational tracking. There needs to be additional sensing systems in order to make such functionality possible - for which there appears to be no shortage of.

I think you are probably correct that the current Rift sensor configuration won't cut it even with clever software tricks, but I suspect that the configuration is limited only because Oculus doesn't have a good positional solution, so they can't justify the marginal cost of equipping the dev version with the additional sensors, as they would serve little purpose out of the box.

But I started this thread with an eye towards future configurations, which doesn't necessarily mean waiting until we have that configuration in our hands. We don't need the physical device to develop the algorithms, we only need the data, so Oculus could conceivably release such a data set as soon as they start settling on designs with the positional sensors (whatever they may be) and let the community contribute towards solving the hard problems from the software side.

Post by **MSat** » Sun Feb 17, 2013 4:40 pm

Ah, I see. Not sure if the official driver will have a hook for raw data, but it sounds like a cool idea - have a raw data buffer that can be read and cleared. Otherwise, a custom driver would need to be written. Still, I don't think we'll get a word on this until the first SDK gets released.

Post by **PalmerTech** » Sun Feb 17, 2013 5:59 pm

Raw IMU data is readable to people who want to use it. We anticipate that the vast majority of developers will want to use the default sensor fusion, but it will be very cool to see people try other things, especially integration with position tracking solutions.

In addition to raw data, the Oculus SDK provides a SensorFusion class that takes care of the details, returning orientation data as either rotation matrices, quaternions, or Euler angles.

http://www.kickstarter.com/projects/152 ... game/posts

Post by **FingerFlinger** » Sun Feb 17, 2013 6:27 pm

PalmerTech wrote:Raw IMU data is readable to people who want to use it. We anticipate that the vast majority of developers will want to use the default sensor fusion, but it will be very cool to see people try other things, especially integration with position tracking solutions.

In addition to raw data, the Oculus SDK provides a SensorFusion class that takes care of the details, returning orientation data as either rotation matrices, quaternions, or Euler angles.
http://www.kickstarter.com/projects/152 ... game/posts

Really glad to hear this! I assumed that you would expose it, and have been blindly working under that assumption, ha ha.

Post by **MSat** » Sun Feb 17, 2013 6:46 pm

D'oh! I was going to consult that KS update before I made that comment to see if perhaps I simply forgot, but my internet connection is so painfully slow, that I decided to rely on my sometimes questionable memory. Great to know!

Could you tell us if the raw data buffer is "deep" enough so that whatever software hooks into the driver doesn't necessarily have to poll it at the 1000Hz update rate?

Post by **PalmerTech** » Sun Feb 17, 2013 7:00 pm

It can be polled at rates slower than 1000hz.

Post by **android78** » Sun Feb 17, 2013 7:18 pm

Just so I'm clear... will the devkit actually have X,Y,Z accelerometers? It's been mentioned in the past that there is not position tracking with the devkit, but is this just because you haven't got the software for it working to a reasonable level, or is it because the kit will leave out the accelerometers which would be required for position tracking?

Post by **geekmaster** » Sun Feb 17, 2013 8:00 pm

android78 wrote:Just so I'm clear... will the devkit actually have X,Y,Z accelerometers? It's been mentioned in the past that there is not position tracking with the devkit, but is this just because you haven't got the software for it working to a reasonable level, or is it because the kit will leave out the accelerometers which would be required for position tracking?

Accelerometers do poor absolute position tracking, with significant drift causing by integrating offset noise. Optical or magnetic position tracking is more reliable. Integrated accelerometer position can supplement optical tracking when an optical solution cannot be obtained, and can help resolve anomalous readings from magnetic sensors such as the Hydra (which sometimes reverses coordinates relative to the base if a sensor gets too close to the base unit).

Integrated X,Y,Z accelerometers would not be of much use on their own to track body position, but they may be useful to detect constrained relative head bobbing and weaving, commonly used for depth perception by those who lack stereoscopic ability. Head bobbing can be used to detect shifting your weight from foot to foot, for a simple walking simulation without moving your knees or requiring external hardware.

Post by **brantlew** » Sun Feb 17, 2013 8:25 pm

What a great thread! I so badly want to contribute, but I'm going to have to limit my comments to some fairly lame statements. IMU's have been fairly well studied and their inherent faults well documented. Here's a great survey of the topic. http://www.cl.cam.ac.uk/research/dtg/ww ... thesis.pdf Oculus has some really bright people working on motion problems. And finally as Palmer mentioned, the raw sensor data will soon be available to all, so there is no reason that these data sets could not be generated by a third party.

Post by **android78** » Sun Feb 17, 2013 8:34 pm

geekmaster wrote:... but they may be useful to detect constrained relative head bobbing and weaving, commonly used for depth perception by those who lack stereoscopic ability. Head bobbing can be used to detect shifting your weight from foot to foot, for a simple walking simulation without moving your knees or requiring external hardware.

This is exactly what it would be good for. I would be using it for relative position shifting as well as roll/yaw absolute adjustments with reasonable low precision requirement.
From my experimentation using arduino and three axis accelerometer, they are accurate enough to move your head side to side and back to the center with little drift, so long as you are accurate in the calculation of the position shift with relation to the angle. Basically, make sure you don't just assume that the position is relative to the differential of the x and z axis. If you tilt your head 45 degrees to the right, then move 10 cm to the right, then straighten your head back to 0 degrees and move you head 10cm to the left, you aren't where you started.
I think that, so long as you aren't using the accelerometer as your primary input for your position, it doesn't really matter if your position drifts.

Post by **MSat** » Sun Feb 17, 2013 8:41 pm

android78 wrote:Just so I'm clear... will the devkit actually have X,Y,Z accelerometers? It's been mentioned in the past that there is not position tracking with the devkit, but is this just because you haven't got the software for it working to a reasonable level, or is it because the kit will leave out the accelerometers which would be required for position tracking?

geekmaster explained the limitations of the accelerometers pretty good. The only thing I wanted to add was that when an IMU is strictly being used for resolving orientation, the accelerometer's primary function is to find the gravity vector in 3 dimensions which is always stable.

Post by **daniel2e** » Sun Feb 17, 2013 11:02 pm

brantlew wrote:What a great thread! I so badly want to contribute, but I'm going to have to limit my comments to some fairly lame statements. IMU's have been fairly well studied and their inherent faults well documented. Here's a great survey of the topic. http://www.cl.cam.ac.uk/research/dtg/ww ... thesis.pdf Oculus has some really bright people working on motion problems. And finally as Palmer mentioned, the raw sensor data will soon be available to all, so there is no reason that these data sets could not be generated by a third party.

Thanks for the link, brantlew, I think I need someone to englighten me a little bit more...

First, to make sure that I don't overstate my qualifications: I am a general-purpose machine learning guy, I don't have a lot of expertise on inertial sensors, so I could be mistaken on almost anything here. When I began this thread I had in mind future Rift sensor configurations. My inclination was that the initial Rift configuration would probably not have the necessary sensors to perform adequate positional tracking - which has been reinforced by a couple of other posters - but now I'm not so sure.

From what I can gather from the dissertation linked to above, double-integration of accelerometer data theoretically yields positional information, and it's actually pretty decent, but its certainly not perfect and accumulated error ("drift") quickly becomes a problem.

But why do we care about so much about drift?

It seems to me that a VR setting affords a great deal of leeway. All you really care about for positional tracking in VR is high enough accuracy and precision to fool the brain, and no more. It doesn't actually matter how far you move in reality as long as the brain is convinced it's reasonable. And not only that, but I think for the Rift we only need to worry about very short durations and distances - say a meter or so sphere around a person. I think any scenario where the user is being tracked as they move about, say, an entire room, is well out of scope for the Rift. For the vast majority of people, I think the most reasonable use of the Rift is standing in place and using a conventional input device to actually navigate the environment. Positional tracking is very important, but only to capture a limited range of movements to evade danger or interact with/inspect the environment, for example dodging side to side, leaning to peer around corners, ducking/crouching, jumping, etc.

Not only that, but there are potentially profitable ways to cheat - for example the system can easily self-calibrate during operation to determine the users' height when standing/crouching and inconspicuously 'anchor' the camera at these levels whenever there is ambiguity.

If I'm right, then the bar for positional tracking fidelity might actually be pretty low, perhaps even achievable with the IMU alone.

Please, though, anyone with actual expertise correct me. Are only high-end IMU's (i.e. superior to those in the Rift) capable of decent tracking? Or are they really all that bad?

Post by **MSat** » Mon Feb 18, 2013 12:38 am

There's really not much leeway at all. You don't want the system indicating motions that didn't actually happen, but that's precisely what every IMU does. The only difference from one IMU to the next is the rate at which it occurs. Some are better than others, but none will ever be 100%, so drift will accumulate. As an example, imagine that drift accumulated to the point where now your avatar is crouching, even though you're still standing in real life. With strictly IMU-based tracking, the system will never be able to correct for the error. You would either have to recalibrate it manually (could be something as simple as just standing straight up, and pressing a button - but this ruins immersion), or you need an additional system that could at least provide occasional positional data for feedback (like optical tracking).

Placing an artificial limit on the range coverage would do more harm than good. For example, lets say you placed a limit to a radius of 1m from the home position. If accumulated errors caused it to drift to the right 1m, and then the user actually does try to move to the right, the limit will be exceeded, and the system will therefore not respond. At least this wouldn't happen if there was no limit in place.

Unless a super-accurate IMU can be developed (and cheaply, might I add), then sadly it will never be feasible for positional tracking.

Now, positional tracking may not be entirely necessary in the sense of where are you exactly in 3D space, but where is your head relative to the rest of your body. What I'm talking about is skeletal tracking, and this is feasible using only IMUs by placing multiple nodes strategically along the body. This may be a bit cumbersome, but it does provide a fair amount of capabilities that could be difficult to achieve with other methods.

Post by **geekmaster** » Mon Feb 18, 2013 1:35 am

Perceived motion (such as drift) when the body thinks it should be at rest causes more motion sickness than drift while moving, or not moving when you think you should. The reason for this is because perceived visual motion without vestibular sensation of motion is typically a sign of poisoning (such as from toxic mushrooms), causing a reflex to vomit up the offending toxic substance. Like when the room spins around you when you lay down while intoxicated.

Due to our extreme sensitivity to this, drift-susceptible accelerometer-based absolute position tracking is a bad idea. However, as mentioned before, relative motion tracking between head and body should be fine to a certain extent, when constrained by skeletal modelling and the known direction of gravity (such as for detecting head bob and weave, which could be extrapolated into walking).

The latest Kinect SDK does a much better job of keeping the latency down. Alternatives for absolute position tracking are PS Move (used by Project Holodeck) or other optical or magnetic tracking solutions. Arm tracking needs non-drifty absolute tracking (relative to the head), and a Hydra magnetic tracker can be used for that (also used by Project Holodeck).

We are really just rehashing stuff that is already out there, but it can be hard to find. We really need an index wiki to sort this stuff out so we do not have to keep duplicating our efforts.

Post by **android78** » Mon Feb 18, 2013 4:47 am

@MSat - The problem of height could be an issue, but regarding the moving side to side/forward and backward, I don't think that's a problem. Basically, I only really see the position tracking of the head being relative to your current position in the world. So if you move right a meter, then left a meter, you are unlikely to notice that you're actually 10cm right of where you started in the game world (and from playing with an imu, it's more like a 1% error). The gross movements would still be made using the controller, so there would be little point in adding a constraint. I would think that it wouldn't be much of a problem for height with maintaining an average height above the floor which would adjust the height by a small amount while a vertical movement is being detected.

Post by **daniel2e** » Mon Feb 18, 2013 10:12 am

MSat wrote:There's really not much leeway at all. You don't want the system indicating motions that didn't actually happen, but that's precisely what every IMU does. The only difference from one IMU to the next is the rate at which it occurs. Some are better than others, but none will ever be 100%, so drift will accumulate. As an example, imagine that drift accumulated to the point where now your avatar is crouching, even though you're still standing in real life. With strictly IMU-based tracking, the system will never be able to correct for the error. You would either have to recalibrate it manually (could be something as simple as just standing straight up, and pressing a button - but this ruins immersion), or you need an additional system that could at least provide occasional positional data for feedback (like optical tracking).

I touched on this before, but it seems to me that you could ameliorate this somewhat with a coarse-grained internal state model of what the user is actually doing. I certainly buy your argument that drift will kill you if all you are doing is adding up 30 seconds worth of inferred translational data, but what if you were consistently and subtly biasing your interpretation by heuristically guessing at what the user was doing in broad terms? For example, if the user crouches, then stands, and you correctly identify both these movements, you know that at the end of that sequence that the camera should be roughly at standing eye level. As long as you correctly identify each of these broad movements, drift will never have the opportunity to accumulate in the vertical dimension (which, I would argue, is the dimension that really matters in terms of fully eliminating drift, since your brain won't notice if you wind up two inches too far north in the environment, but very well could notice if you are suddenly 2 inches shorter).

As you point out, guessing wrong about one of these coarse movements would be disastrous, if the user ends up crouching in the world when they are not in reality, that would just be awful.

MSat wrote: Placing an artificial limit on the range coverage would do more harm than good. For example, lets say you placed a limit to a radius of 1m from the home position. If accumulated errors caused it to drift to the right 1m, and then the user actually does try to move to the right, the limit will be exceeded, and the system will therefore not respond. At least this wouldn't happen if there was no limit in place.

I actually wasn't proposing a hard (or even a soft) limit in software, it was more a suggestion that if we have high-fidelity tracking over those sorts of durations and distances, then we've got a pretty immersive experience. The user can wander around their house with the thing on at their own risk (both to their shins and their stomach).

Unless a super-accurate IMU can be developed (and cheaply, might I add), then sadly it will never be feasible for positional tracking.

If this is settled science, so be it. Just wanted to make sure

Now, positional tracking may not be entirely necessary in the sense of where are you exactly in 3D space, but where is your head relative to the rest of your body. What I'm talking about is skeletal tracking, and this is feasible using only IMUs by placing multiple nodes strategically along the body. This may be a bit cumbersome, but it does provide a fair amount of capabilities that could be difficult to achieve with other methods.

Very interesting!

Post by **daniel2e** » Mon Feb 18, 2013 10:42 am

geekmaster wrote:Perceived motion (such as drift) when the body thinks it should be at rest causes more motion sickness than drift while moving, or not moving when you think you should. The reason for this is because perceived visual motion without vestibular sensation of motion is typically a sign of poisoning (such as from toxic mushrooms), causing a reflex to vomit up the offending toxic substance. Like when the room spins around you when you lay down while intoxicated.

I was using to 'drift' to refer to small errors accumulated over time, not to literally reporting 'drift' when at rest. Is that what they do? If so....ick. If that's the case, you might have won me over that the IMU alone won't cut it.

The latest Kinect SDK does a much better job of keeping the latency down. Alternatives for absolute position tracking are PS Move (used by Project Holodeck) or other optical or magnetic tracking solutions. Arm tracking needs non-drifty absolute tracking (relative to the head), and a Hydra magnetic tracker can be used for that (also used by Project Holodeck).

Kinect really seems like overkill. For this application, I kinda like the idea of Move or the Wii Sensor bar - a big glowing light (or two IR lights in the case of the Wii) dedicated to tracking the position of precisely one thing in space as accurately and quickly as possible.

Post by **geekmaster** » Mon Feb 18, 2013 11:37 am

daniel2e wrote:I was using to 'drift' to refer to small errors accumulated over time, not to literally reporting 'drift' when at rest. Is that what they do? If so....ick. If that's the case, you might have won me over that the IMU alone won't cut it.

The fusion algorithms typically use the magnetometer and gravity vector (derived from accelerometers) to cancel rotational drift. Translational drift (moving position) cannot be compensated without some kind of absolute position sensor. Rotational drift is more likely to cause nausea than translational drift, but even small drift can accumulate into a large error. I think it would be fine for detecting gestures though, such as crouching, jumping, or head bob or weave.

Post by **geekmaster** » Mon Feb 18, 2013 11:39 am

daniel2e wrote:Kinect really seems like overkill. For this application, I kinda like the idea of Move or the Wii Sensor bar - a big glowing light (or two IR lights in the case of the Wii) dedicated to tracking the position of precisely one thing in space as accurately and quickly as possible.

You can fix the position of the wii motion sensor and attach the "sensor bar" (pair of infrared light sources) to your head. An example using "IR glasses" is shown here:

[youtube]http://www.youtube.com/watch?v=Jd3-eiid-Uw[/youtube]

BTW, if your wii sensor bar malfunctions, in a pinch, you can use a pair of tea candles instead. They provide sufficient IR, when positioned beside the TV. I would not mount candles on my glasses though for head tracking...

Post by **PasticheDonkey** » Mon Feb 18, 2013 11:43 am

positional tracking using psmove/wiimote like tech not limited by angle is in this thread. http://www.mtbs3d.com/phpbb/viewtopic.php?f=138&t=16072

in fact they were aiming for absolute position and angle within a limited area.

Post by **geekmaster** » Mon Feb 18, 2013 11:58 am

PasticheDonkey wrote:positional tracking using psmove/wiimote like tech not limited by angle is in this thread. http://www.mtbs3d.com/phpbb/viewtopic.php?f=138&t=16072

in fact they were aiming for absolute position and angle within a limited area.

Yes, I have been following that thread. It provides an extension to the "wii with IR glasses" method, allowing full 360-degree rotation.

Post by **daniel2e** » Mon Feb 18, 2013 12:39 pm

Geek - I will have to take your word for it on the limitations of the IMU, you are clearly pretty knowledgeable.

geekmaster wrote:
PasticheDonkey wrote:positional tracking using psmove/wiimote like tech not limited by angle is in this thread. http://www.mtbs3d.com/phpbb/viewtopic.php?f=138&t=16072

in fact they were aiming for absolute position and angle within a limited area.
Yes, I have been following that thread. It provides an extension to the "wii with IR glasses" method, allowing full 360-degree rotation.

I was not aware of this thread (still getting a feel for this place) - thanks for the link Pastiche. After my last post, I started thinking about potential set ups that would allow for continuous tracking without any blind spots. The best I could think of was hanging the camera from the ceiling and putting the light source on top of your head - which would suck. It looks like that guy has the real solution - very impressive work! And it may be exactly the type of 'rig' that I was hypothesizing when I began this thread - one that could be used to generate a comprehensive set of training data.

I bet that with clever software, we can coax good positional tracking out of a good deal less information than that provided by 22 LEDs. The IMU probably isn't enough, but maybe just one more sensor would do it (perhaps something based on hydra tech, what is that type of sensing called anyway, magnetic field?). I know the data is noisy, but that's what machine learning is for, that's what makes a problem fun!

But you need labeled data - you need to be able to know the real world movements that were happening when sensor data was recorded. It looks like Patim may have a very good method for generating such data. Very exciting.

Post by **FingerFlinger** » Mon Feb 18, 2013 12:46 pm

I've been working on this problem, with regard to integrating IMUs with an optical flow solution.

If you are using optical flow for your actual motion deltas, then you can reduce the IMU problem to: when does a motion begin and end. When the motion begins, you let the optical flow completely take over, and monitor the IMU for absurd results as a sanity check. When the motion ends, you take the current position (as reported by optical flow) to be the new baseline position, which you can then apply a Kalman filter against.

This has the advantage of allowing good responsiveness during gross movement, and allowing small natural head movements when the user is "at rest", while cutting down on drift.

Unfortunately, I am only to the stage of plausibly confirming this idea using simulated data, but I just got my IMU in the mail a few days ago, and will hopefully be able to integrate it with my solution given a few more weeks. I've gone down a few rabbit holes with the optical flow stuff while I was waiting on the mail, and I want to pursue that a little bit further before I begin working on the sensor fusion again.

Post by **MSat** » Mon Feb 18, 2013 2:51 pm

Oculus has stated that they'll be adding positional tracking to the consumer version, so perhaps it's a moot point to try to tackle it, unless, of course, it's more about personal interest and the challenge

If you don't have to be constrained to using the Rift's IMU, and are willing to go a custom route, then perhaps this thread may be of interest, particularly the following research paper cited on the second page: http://www.mdpi.com/1424-8220/12/2/1720

In that paper you'll find that when multiple identical sensors were combined to create a single "virtual" sensor, the noise output, and therefore drift, had been substantially reduced over that of a single sensor. I couldn't tell you if the errors could be low enough to make IMU-based translation tracking feasible at the quality level that VR demands, but it might be worthwhile to look into it. I think the most compelling case for a high performance IMU is that feedback requirements could possibly be reduced - perhaps to something like a relatively slow webcam and rough position and pose calculations based off the captured images. I admit that the last part is just my speculation, and have no evidence to support my theory.

That said, there doesn't appear to be a single solution that is the most ideal for every scenario. It's important to decide how you anticipate the system to be used, and then find the simplest way achieve it.

Post by **Fredz** » Mon Feb 18, 2013 3:51 pm

daniel2e wrote:Kinect really seems like overkill. For this application, I kinda like the idea of Move or the Wii Sensor bar - a big glowing light (or two IR lights in the case of the Wii) dedicated to tracking the position of precisely one thing in space as accurately and quickly as possible.

I've tried this with the PS Move / PS Eye here : http://www.mtbs3d.com/phpbb/viewtopic.php?f=140&t=15994

Not perfect - probably because of the fusion code - but it still has less latency than the Kinect and it's simpler than building an attachment with multiple LEDs. In the end the multiple LEDs idea will be a lot better if it can be build easily and cheaply, but for a simple short-term solution that could be enough for now.

Post by **daniel2e** » Mon Feb 18, 2013 4:43 pm

MSat wrote:Oculus has stated that they'll be adding positional tracking to the consumer version, so perhaps it's a moot point to try to tackle it, unless, of course, it's more about personal interest and the challenge

Have they? I've seen some 'we're working on it' type of responses, but not 'it's a solved problem, coming to a store near you'. If I thought that were the case, I would probably devote my attention to something else.

For whatever reason, I get the feeling they are having a hard time devising robust solution that is ready for a consumer device. By 'work' I don't just mean function correctly, but also not add any awkwardness to set up or operation, or make any significant assumptions about the environment. That probably eliminates a lot of otherwise attractive sensing options, and makes this a tough nut. I'm sure Oculus has great folks working on it, but there are only so many hours in the day.

Thanks to everyone posting links to other threads around the board and various research articles , it has been illuminating.

Post by **PatimPatam** » Tue Feb 19, 2013 3:00 pm

Fredz wrote:I've tried this with the PS Move / PS Eye here : http://www.mtbs3d.com/phpbb/viewtopic.php?f=140&t=15994

Not perfect - probably because of the fusion code - but it still has less latency than the Kinect and it's simpler than building an attachment with multiple LEDs. In the end the multiple LEDs idea will be a lot better if it can be build easily and cheaply, but for a simple short-term solution that could be enough for now.

I have to agree with Fredz on this one.. I developed the PosiTTron thingy mainly as an idea for someone designing a HMD from scratch (like Oculus for instance). If it was mass produced in a factory as just another part of the overall HMD assembly I believe it could be a cheap and effective solution. It could also work as an add-on, but again it would be cheap and simple only if a company was actually building it and supplying it.

As an easy-hack to get positional tracking on top of an existing HMD, the classic PS move option is a probably a better fit. Some people seem to like the idea, so I still plan to improve the PosiTTron and release the info and the software for those brave enough to try it, but I don’t really think it's the most practical DIY solution out there..

Also as daniel2e suggested, if it can serve some other purpose like as an information gathering device, i'll be happy to help!