Fast sparse stereo-matching [Computer Vision]

This is for discussion and development of non-commercial open source VR/AR projects (e.g. Kickstarter applicable, etc). Contact MTBS admins at customerservice@mtbs3d.com if you are unsure if your efforts qualify.
FingerFlinger
Sharp Eyed Eagle!
Posts: 429
Joined: Tue Feb 21, 2012 11:57 pm
Location: Irvine, CA

Fast sparse stereo-matching [Computer Vision]

Post by FingerFlinger »

[image: stereo pair with matched features overlaid]

As part of a larger project, I made a little bare-bones library for stereo matching of sparse features. It's very basic and completely tailored to my own needs, but I thought it was nifty, so I spent a little time this evening cleaning it up and putting it on GitHub. Binaries here

Based on various papers and snippets, it keeps up with or exceeds the speed and performance of similar sparse matchers, despite running on 7-year-old hardware. For comparison, LibViso2 claims that it can match 1000 features in 35ms. The image above was matched in 7.7ms, with 253 features matched.

Put in equivalent terms:
LibViso2: ~28,571 matches per second
My thing: ~32,857 matches per second

But that's tuned for rock-solid matches. If you relax the matching criteria slightly, you can still get over 90% matching accuracy with 469 matches in 5.6ms, which works out to roughly 84,000 matches per second.

There are a few major optimizations left, but I am saving them for later, since it suits my needs as-is. One funny thing is that the algorithm is almost completely memory-limited, so my netbook (with DDR3 memory) can actually run this about 20% faster than my "gaming PC" from 2008.
Last edited by FingerFlinger on Fri Jul 05, 2013 7:31 pm, edited 2 times in total.
cybereality
3D Angel Eyes (Moderator)
Posts: 11407
Joined: Sat Apr 12, 2008 8:18 pm

Re: Fast sparse stereo-matching [Computer Vision]

Post by cybereality »

Looking good.
WiredEarp
Golden Eyed Wiseman! (or woman!)
Posts: 1498
Joined: Fri Jul 08, 2011 11:47 pm

Re: Fast sparse stereo-matching [Computer Vision]

Post by WiredEarp »

Very cool FingerFlinger.
zalo
Certif-Eyed!
Posts: 661
Joined: Sun Mar 25, 2012 12:33 pm

Re: Fast sparse stereo-matching [Computer Vision]

Post by zalo »

What would a depth map composed of Voronoi regions look like using these points?
Each region would be a grayscale value representing that point's depth, and it might look interesting overlaid on the original image.

More points means better depth map!
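
Something like this nearest-neighbour sketch is what I'm picturing (pure NumPy/SciPy with made-up point positions and disparities, so it's only a guess at the idea rather than code against your actual output):

import numpy as np
from scipy.spatial import cKDTree

h, w = 375, 450                          # demo image size mentioned above (guessing which is height/width)
n = 253                                   # number of matched features from the example
pts = np.column_stack([np.random.randint(0, w, n),   # made-up (x, y) feature positions
                       np.random.randint(0, h, n)])
disp = np.random.uniform(5.0, 60.0, n)    # made-up disparities (bigger = closer)

# assign every pixel to its nearest matched point -> Voronoi regions
yy, xx = np.mgrid[0:h, 0:w]
_, idx = cKDTree(pts).query(np.column_stack([xx.ravel(), yy.ravel()]))
depth_map = disp[idx].reshape(h, w)

# scale to grayscale; this could then be blended over the original image (e.g. cv2.addWeighted)
gray = (255.0 * (depth_map - depth_map.min()) / depth_map.ptp()).astype(np.uint8)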

By the way, is this used for your optical flow project? With my limited knowledge of optical flow algorithms, I imagine knowing depth would help immensely in figuring out parallax and improving the accuracy of the track.
FingerFlinger
Sharp Eyed Eagle!
Posts: 429
Joined: Tue Feb 21, 2012 11:57 pm
Location: Irvine, CA

Re: Fast sparse stereo-matching [Computer Vision]

Post by FingerFlinger »

Actually, when I get a little time, I am going to add a dense option to the library. I was perhaps a little premature when I named the repo SparseStereo.

There is no reason you couldn't attempt to find a match for every pixel in the image, but I haven't done so because I am targeting a real-time application. In my demo, the default settings generate about 10,000 keypoints between the two images. That's far fewer points than a dense map would need to match (375 × 450 = 168,750 px), and therefore much faster to compute.

Yes, it is used for my optical flow/stereo visual odometry project. It doesn't necessarily improve the quality of the actual optical flow, but it helps you figure out how much weight to give each flow vector. That is, vectors at greater depth have a smaller magnitude than nearer ones, even though they represent the same amount of camera motion in real-world terms.
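
As a toy illustration of what I mean (made-up numbers, pure sideways translation only, nothing to do with the library's actual weighting):

import numpy as np

f = 500.0                        # focal length in pixels (made up)
t = 0.10                         # 10 cm of sideways camera motion (made up)
Z = np.array([1.0, 2.0, 8.0])    # point depths in metres, from the stereo matches

# for pure sideways translation, flow magnitude is roughly f * t / Z,
# so the same motion gives ~50 px of flow at 1 m but only ~6 px at 8 m
flow_px = f * t / Z
print(flow_px)                   # [50.   25.    6.25]

# knowing Z lets you put near and far vectors on a comparable footing
# (e.g. weight them by expected magnitude) before the odometry solve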
Fredz
Petrif-Eyed
Posts: 2255
Joined: Sat Jan 09, 2010 2:06 pm
Location: Perpignan, France
Contact:

Re: Fast sparse stereo-matching [Computer Vision]

Post by Fredz »

FingerFlinger wrote:Actually, when I get a little time, I am going to add a dense option to the library. I was perhaps a little premature when I named the repo SparseStereo.
You may already know about it, but you can find a lot of good implementations of dense stereo correspondence here: http://vision.middlebury.edu/stereo/eval/
FingerFlinger wrote:There is no reason you couldn't attempt to find a match for every pixel in the image, but I haven't done so because I am targeting a real-time application.
The second entry in the evaluation (ADCensus) is near real-time; it takes 0.1s on a GeForce GTX 480 with CUDA. I guess something simpler could be implemented to run a bit faster.
brantlew
Petrif-Eyed
Posts: 2221
Joined: Sat Sep 17, 2011 9:23 pm
Location: Menlo Park, CA

Re: Fast sparse stereo-matching [Computer Vision]

Post by brantlew »

Impressive. Keep it comin...
FingerFlinger
Sharp Eyed Eagle!
Posts: 429
Joined: Tue Feb 21, 2012 11:57 pm
Location: Irvine, CA

Re: Fast sparse stereo-matching [Computer Vision]

Post by FingerFlinger »

Yep, I am aware of the Middlebury evaluations; the whole website is great!

I'm doing sparse because there is a lot of stuff to do with each point of interest. I also need to do optical flow and outlier rejection in addition to stereo correspondence. All of those steps add up, and I don't think it is practical to go completely dense for what I am trying to do.

Given that, I anticipate that my current performance is already good enough for my needs, and optimization will come after I have gotten the rest of my algorithm implemented.

I'm definitely interested in working with CUDA, since SSE has been pretty fruitful for me.

And thanks for pointing me specifically to that second link! I hadn't seen their paper before, but their solution uses the Census transform, which is also what my library uses (albeit in a radically different way), so I might be able to gain some insight from their CUDA implementation.
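
For anyone following along, here's the textbook Census transform plus a Hamming-distance matching cost in plain NumPy. To be clear, this is not how my library does it (mine differs quite a bit); it's just the basic building block that ADCensus and similar methods start from:

import numpy as np

def census_5x5(img):
    """Census transform: bit i of each pixel's descriptor is 1 when
    neighbour i in a 5x5 window is darker than the centre pixel."""
    img = img.astype(np.int32)
    desc = np.zeros(img.shape, dtype=np.uint32)
    bit = 0
    for dy in range(-2, 3):
        for dx in range(-2, 3):
            if dy == 0 and dx == 0:
                continue
            # np.roll wraps at the borders, which is sloppy but fine for a sketch
            shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            desc |= (shifted < img).astype(np.uint32) << np.uint32(bit)
            bit += 1
    return desc

def hamming_cost(a, b):
    """Matching cost between two census descriptors = number of differing bits."""
    x = np.bitwise_xor(a, b)
    cost = np.zeros_like(x)
    while np.any(x):
        cost += x & 1
        x = x >> 1
    return cost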
zalo
Certif-Eyed!
Posts: 661
Joined: Sun Mar 25, 2012 12:33 pm

Re: Fast sparse stereo-matching [Computer Vision]

Post by zalo »

I was trawling some ancient Leap hacking threads, and I found this image:
[image: raw frames from the Leap's two cameras]

Supposedly it's raw data from the Leap's sensors (640x480 each). Even though each Leap camera has ~150 degrees FoV (with substantial distortion), I'm curious how your stereo algorithm works on it.

The CTO of Leap mentioned: "the lens distortions are nonlinear, non-monotonic, and fish-eye so they are very difficult to calibrate with existing vision packages. This is why we developed our own calibration method..." so it might be a lost cause in terms of creating dense stereo maps...

PS: If it does work, some guy made a raw data plug-in for Linux with some super sketchy code:
https://github.com/elinalijouvni/OpenLeap
FingerFlinger
Sharp Eyed Eagle!
Posts: 429
Joined: Tue Feb 21, 2012 11:57 pm
Location: Irvine, CA

Re: Fast sparse stereo-matching [Computer Vision]

Post by FingerFlinger »

[image: result of running the matcher on the Leap frames]

Not very well, I'd say!

EDIT: You can always do a manual calibration. OpenCV is great for that. I've got a half-finished tool that would automate it, actually. When life settles down a little bit, it's at the top of my list to finish...
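
For reference, the standard OpenCV chessboard routine looks roughly like this (an untested sketch using a hypothetical "calib/" folder of captures; whether the normal distortion model can cope with the Leap's lenses at all is exactly the question raised by that CTO quote):

import glob
import cv2
import numpy as np

pattern = (9, 6)                                    # inner corners of the printed chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_pts, img_pts = [], []
for path in glob.glob("calib/*.png"):               # hypothetical folder of chessboard captures
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

# K is the camera matrix, dist the distortion coefficients
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)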
android78
Certif-Eyable!
Posts: 990
Joined: Sat Dec 22, 2007 3:38 am

Re: Fast sparse stereo-matching [Computer Vision]

Post by android78 »

It looks like they probably use the brightness as a rough depth map. Using that, they can do very localized matching between the two images to get the final position.
Basically, for a given pixel, you could estimate its distance from its brightness, since the illumination falls off roughly with the inverse square of distance (times a scaling factor). That estimate can be validated against the second image, and then matching within a small region gives an even more accurate position.
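
Put as a rough formula (the scale factor k would have to be calibrated; this is pure speculation on my part about how they might do it):

import numpy as np

def depth_from_brightness(brightness, k=1.0):
    """If the IR illumination falls off as 1/d^2, then d ~ k / sqrt(brightness)."""
    b = np.clip(np.asarray(brightness, dtype=np.float32), 1e-6, None)
    return k / np.sqrt(b)

# e.g. a pixel four times dimmer than another should be roughly twice as far away
print(depth_from_brightness(np.array([100.0, 25.0])))   # second value is 2x the first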
MSat
Golden Eyed Wiseman! (or woman!)
Posts: 1329
Joined: Fri Jun 08, 2012 8:18 pm

Re: Fast sparse stereo-matching [Computer Vision]

Post by MSat »

android78 wrote:It looks like they probably use the brightness as a rough depth map. Using that, they can do very localized matching between the two images to get the final position.
Basically, for a given pixel, you could estimate its distance from its brightness, since the illumination falls off roughly with the inverse square of distance (times a scaling factor). That estimate can be validated against the second image, and then matching within a small region gives an even more accurate position.

I had figured it was going to be something along those lines, which makes it somewhat similar to the original Kinect (more densely illuminated objects are closer). You can probably deduce further depth information by simply comparing the overlapped stereo pairs; the fact that it's monochromatic probably makes that much easier.