Feature-Based Mosaic Construction Proposal

Kristin M. Branson
Department of Computer Science and Engineering
University of California, San Diego
[email protected]

1 Introduction

In applications such as space-based remote imaging and surveillance, data is collected in the form of multiple small images that each contain a piece of a desired scene. For example, consider film taken by a satellite orbiting a planet. Each frame shows a small piece of the planet's surface, but what is useful is an image of a large fraction of the surface. The goal of mosaicing is to stitch together a single image from these smaller, overlapping image frames.

Each 2-D point in each image is the projection of a 3-D point in the world. A mosaic of all the images contains the projection of every 3-D point viewable in any image, in a common coordinate system. For any pair of images, if the camera's intrinsic parameters and its motion between images are known, then image mosaicing is trivial. The intrinsic parameters define the 3 x 3 viewing matrix V, which transforms the 3-D location of a point relative to the camera's position into the 2-D location of that point in an image. The inverted viewing matrix, V^{-1}, can therefore be used to transform the 2-D image location of a point in the first image, x_1, to the 3-D world location of that point relative to the camera's current position, X_1:

    X_1 = V^{-1} x_1.

In this equation, the world coordinates are inhomogeneous, three-element coordinates and the image coordinates are homogeneous, three-element coordinates. The motion of the camera between images defines the transformation from the world coordinate system relative to the camera's first position to the world coordinate system relative to the camera's second position. The location of point X_1 in the world coordinate system relative to the camera's second position is

    X_2 = R X_1 + t,

where R is the camera's rotation and t is the camera's translation. Because the same camera takes all the images, the viewing matrix also defines the transformation from X_2, the 3-D location of a point relative to the second camera position, to x_2, the 2-D location of that point in the second image. Thus, the projection of each point in the first image into the coordinate system of the second image is

    x_2 = V (R V^{-1} x_1 + t).    (1)

In general, the camera's intrinsic parameters and motion are both unknown, so the transformation from one image's coordinate system to another's is unknown. Given only the images, it is impossible to estimate V, R, and t without ambiguity. Estimating these parameters even up to an ambiguity is extremely difficult, because V is common to all pairs of images: the estimation cannot be broken into parts, and must be performed over all image pairs at once.

However, if we assume that the camera does not translate between images, the problem is greatly simplified. With the translation vector t = 0, Equation 1 reduces to

    x_2 = V R V^{-1} x_1 = H x_1.

Thus, there is a 3 x 3 matrix H that projects points in the first image into the coordinate system of the second image (Szeliski, 1994). Any 3 x 3 matrix that projects coordinates from one image to another is a planar homography.
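To make this reduction concrete, the following is a minimal NumPy sketch (not part of the proposal): it builds an illustrative viewing matrix V and rotation R (the focal length, principal point, and pan angle are made-up values), composes H = V R V^{-1}, and checks that mapping a homogeneous image point through H agrees with pushing the same point through the camera model of Equation 1 with t = 0.

```python
import numpy as np

# Viewing (intrinsic) matrix V: an illustrative pinhole camera with focal
# length f and principal point (cx, cy). These values are assumptions made
# up for the example, not values from the proposal.
f, cx, cy = 500.0, 320.0, 240.0
V = np.array([[f,   0.0, cx],
              [0.0, f,   cy],
              [0.0, 0.0, 1.0]])

# Rotation R: a small pan (rotation about the camera's y axis).
theta = np.deg2rad(5.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])

# With zero translation, Equation 1 collapses to a single homography.
H = V @ R @ np.linalg.inv(V)

def dehomogenize(y):
    """Scale a homogeneous image point so its third coordinate is 1."""
    return y / y[2]

# A homogeneous point in the first image...
x1 = np.array([100.0, 150.0, 1.0])

# ...mapped directly by H, and via the full camera model with t = 0.
# The two agree, which is the point of the reduction.
x2_homography = dehomogenize(H @ x1)
x2_camera = dehomogenize(V @ (R @ (np.linalg.inv(V) @ x1)))
assert np.allclose(x2_homography, x2_camera)
print(x2_homography)
```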
More intuitively, if the camera does not translate, then there is no motion parallax. Parallax is the displacement of points in one image that are coincident in the second. It is observable when you switch from just your left eye to just your right eye, and it is the key to depth perception. Without parallax, no effects of points in a scene being at different depths are observable, so it can be assumed that all points in the scene lie on a plane. Because the transformation between one projection of a plane and a second projection of the same plane is a homography, the transformation between one frame and any other frame is a homography.

Assuming that all the images are related to each other by homographies, the first task is to estimate these homographies given only the images. Approaches to homography estimation all estimate both point correspondences between images and the transformations between one image and the others. Once the image transformations are known, all images can be transformed into the same frame and fused together to create a single panoramic image.

In this project, I applied the feature-based approach to homography estimation described in (Torr and Zisserman, 1999). Feature-based methods, in contrast with direct methods, calculate point correspondences only in areas where an accurate correspondence is likely to be found. In the next section, I explain the approach I used for homography estimation. In Section 3, I discuss methods for blending the projected images together. In Section 4, I present my results. In Section 5, I discuss my conclusions and future work.

2 Feature-Based Methods for Homography Estimation

Feature-based methods find correspondences, and the corresponding homographies, for select sets of points in each image where there is a chance of estimating the correct correspondence. As an illustration of points that would not be useful in correspondence calculations, consider points where the gradient of the pixel intensities is small in all directions or in one direction. If the gradient is small in all directions, the point lies inside a flat region, and there is no way of telling one point in this flat region from another. If the gradient is small in one direction, the point lies on a line, and there is no telling one point on this line from another. Points where the gradient has these properties are therefore not useful for correspondence calculation, so feature-based methods ignore them and concentrate on calculating correspondences between interest points, where the gradients in both directions are large. In addition, this makes the search space smaller. Thus, the first step in feature-based methods is to calculate a set of interest points in each image. This is done using a corner detection algorithm, which finds interest points where the gradient is high in both directions.

The Harris corner detection algorithm finds the points for which the criterion

    \det M - k\,(\mathrm{tr}\, M)^2

is large, where k is a constant usually set to 0.04 and M is the matrix of intensity gradient products,

    M = \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix},

where I is the pixel intensity at each point, I_x is the gradient of the pixel intensity in the x direction, and I_y is the gradient in the y direction.
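As a concrete illustration of the detector described above, here is a small Python sketch of the Harris response using NumPy and SciPy. The criterion det M - k (tr M)^2 and the value k = 0.04 come from the text; the Sobel gradients, the Gaussian window with sigma = 1.5, and the strongest-responses selection are assumed implementation choices, since the excerpt does not specify them.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(image, k=0.04, sigma=1.5):
    """Harris corner criterion det(M) - k * tr(M)^2 at every pixel.

    M is the matrix of smoothed gradient products; sigma is the width of
    the Gaussian window used to accumulate the products around each pixel
    (an assumed choice, not specified in the excerpt).
    """
    image = image.astype(float)

    # Intensity gradients I_x (along columns) and I_y (along rows).
    Ix = sobel(image, axis=1)
    Iy = sobel(image, axis=0)

    # Entries of M, averaged over a Gaussian window around each pixel.
    Ixx = gaussian_filter(Ix * Ix, sigma)
    Ixy = gaussian_filter(Ix * Iy, sigma)
    Iyy = gaussian_filter(Iy * Iy, sigma)

    det_M = Ixx * Iyy - Ixy ** 2
    tr_M = Ixx + Iyy
    return det_M - k * tr_M ** 2

# Interest points are the pixels where the response is large, e.g. the
# 500 strongest responses (the count is an illustrative threshold):
# response = harris_response(img)
# rows, cols = np.unravel_index(
#     np.argsort(response.ravel())[-500:], response.shape)
```

In practice one would also suppress non-maximal responses so that detected corners do not cluster around a single strong corner; that step is omitted here for brevity.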

