Uncalibrated Stereo Vision
A CS 766 Project
University of Wisconsin - Madison, Fall 2004
Tom Brunet & Leo Chao

Abstract

By viewing the same scene from two different points of view, it is often possible to ascertain auxiliary information about the scene. Information such as depth ordering and the relative placement of objects, as well as motion detection and tracking, can often be found by comparing the two images. In the simplest case, where all camera parameters are known, this is a straightforward problem; but when we are given only two images and no knowledge of the cameras' intrinsic or extrinsic parameters, we are unable to directly apply the straightforward approaches. In this paper we discuss an attempt to circumvent the lack of calibration information in order to naively apply the calibrated algorithms to two images modified from the originals.

1. Introduction

Given a series of two-dimensional images, it is possible to extract a significant amount of auxiliary information about the scene being captured. One of the most useful pieces of information is knowledge of the relative depth of objects in the scene. It is known that, given two images of a single scene, it is possible to extract the depth of various objects in the scene from the disparity between the two images. The human brain handles this task constantly, adapting a stream of paired two-dimensional images to provide us with what is commonly known as depth perception: an intrinsic feel for the relative depth of objects in the scene. In a much simpler case, we can consider how to extract this scene disparity from two images viewing the same scene from close, but not identical, positions. It should be noted that, without a significant amount of extra information and calculation, it is generally not possible to ascertain an exact depth measurement (as in "so many meters out"); rather, we can isolate "planes" of depth, that is, localize which parts of the scene are at the same, or relatively close, depths.

Being able to retrieve this depth information is useful for any number of applications. The most prominent of these is 3D scene reconstruction, where the depth information is used to aid in the creation of a three-dimensional model of the scene being captured. Similarly, robotics applications often use a rough 3D scene model obtained from a stereo rig to model the world around a robot, in order to provide sane movement and navigation data to the machine. In general, these stereo vision techniques are desirable because they are passive in nature; that is, no active measurements of the scene with instruments such as radar or lasers need to be taken. Furthermore, there exists a significant range of processes which enable a user to process the data either offline or in real time, as the application dictates.

In order to obtain the desired depth information, we first need to determine the disparity between the two images. Traditionally these two images are referred to as the left and right images. Since the left and right images view the scene from different positions, there will be a noticeable disparity between the two images. If we are able to calculate the relative disparity between points in the scene across the two images, we should be able to create a depth map from that information. The vital point here is that points at similar depth levels in the world will have similar disparities across the left and right stereo images. Intuitively, this can be seen just by moving one's head laterally: objects close to you move a large distance in your field of view, while those farther away move a small distance. By determining a displacement for each point in an image, we can determine roughly which depth layer it belongs to. Thus, given a set of point correspondences between the left and right images, we can determine the depth map of the scene.

Fundamental to the calculation of these point correspondences is the idea of an epipolar plane. Essentially, this is a technique for constraining the search space when looking for corresponding points between the left and right images. Simply put, the epipolar plane is the plane formed by a point in the scene and the optical centers of the left and right cameras. Where this plane intersects the two image planes, an epipolar line is formed on each image plane. This is the line that joins the image of the scene point in each of the images to the epipole in that image. The epipoles are the points in each image where the line between the two optical centers intersects the image plane. The figure below demonstrates these terms. [1]

[Figure: the epipolar geometry of a stereo pair, after [1].]

Ideally, then, the point in the right image corresponding to any given point in the left image is known to lie on the epipolar line in the right image associated with the point in the left. We can then merely search along that line to find the proper correspondence. This ideal case is the calibrated stereo correspondence case, where the parameters of the stereo rig are known. In our project we attempt to deal with the case where these parameters are not known; that is, we seek to perform stereo correspondence on two images without knowledge of their parameters. This is known as the uncalibrated stereo correspondence problem.
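To make the epipolar constraint concrete, consider a minimal NumPy sketch for the special case of a rectified pair (identical cameras translated purely horizontally): in that case the fundamental matrix takes, up to scale, a simple skew-symmetric form, and every epipolar line is a horizontal scanline. The point coordinates below are illustrative, not values from this project.

```python
import numpy as np

# For a rectified stereo pair (identical cameras, pure translation along x),
# the fundamental matrix is, up to scale, the skew-symmetric matrix of
# t = (1, 0, 0). The epipolar constraint is x_R^T F x_L = 0, and the epipolar
# line in the right image for a left-image point is l_R = F @ x_L.
F = np.array([[0., 0.,  0.],
              [0., 0., -1.],
              [0., 1.,  0.]])

x_left = np.array([120., 200., 1.])  # illustrative left-image point (u, v, 1)
line = F @ x_left                    # line (a, b, c) meaning a*u + b*v + c = 0
print(line)                          # [0. -1. 200.] -> the scanline v = 200 in the right image
```

Any right-image point (u', 200, 1) on that scanline satisfies the constraint exactly, which is why the correspondence search collapses from 2-D to 1-D.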
2. Calibrated Stereo Correspondence

This approach assumes that we know the parameters of the stereo rig, and reduces the search for correspondences to the epipolar lines. The key feature of this type of correspondence processing is that the epipolar lines are easily determined, since the camera parameters are known, and the emphasis is on increasing the speed and accuracy of the search itself. There is a wide variety of approaches to solving this problem; Scharstein and Szeliski [2] provide a good overview of the general methods and approaches, as well as a characterization of the performance of each. The key features that differentiate these algorithms are the calculation of match costs, the means by which costs are aggregated, the actual disparity calculations used, and the amount of refinement the disparity levels are subject to. Of these, the key structural differences generally lie in the disparity calculation step, where techniques such as local window searches, global cost optimization, and dynamic programming routines are used to compute the disparities efficiently (or not so efficiently).
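As one concrete instance of the local window-search family just described, the following sketch uses OpenCV's block matcher on an already-rectified pair. The filenames and parameter values are placeholders for illustration, not the pipeline used in this project.

```python
import cv2

# Hypothetical filenames; the images are assumed to be a rectified grayscale
# stereo pair, so epipolar lines are horizontal and the search is 1-D.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Local window search: SAD match costs aggregated over a blockSize x blockSize
# window, with winner-take-all disparity selection -- one point in the design
# space surveyed by Scharstein and Szeliski [2].
matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = matcher.compute(left, right).astype("float32") / 16.0  # output has 4 fractional bits
```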

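Once a dense disparity map is available (for example, from the block matcher above), the "planes of depth" idea from the introduction can be approximated by quantizing disparities into a small number of layers. The sketch below is illustrative only; the function name, bin count, and equal-width binning are assumptions, not the report's method.

```python
import numpy as np

def disparity_to_layers(disparity, n_layers=8):
    """Quantize a dense disparity map into coarse depth layers.

    Depth is inversely proportional to disparity (Z = f*B/d for a rig with
    focal length f and baseline B), so pixels with similar disparity lie at
    similar depth, and equal-width disparity bins act as rough depth planes.
    """
    valid = disparity > 0                         # non-positive disparity = no match found
    d_min, d_max = disparity[valid].min(), disparity[valid].max()
    edges = np.linspace(d_min, d_max, n_layers + 1)
    layers = np.digitize(disparity, edges[1:-1])  # labels 0 .. n_layers-1
    layers[~valid] = -1                           # flag unmatched pixels
    return layers                                 # higher label = larger disparity = closer
```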
