CSE 190A Projects in Vision and Learning, Winter 2007, Final Report

3D Photography

Kristen Kho
Department of Computer Science and Engineering
University of California, San Diego
[email protected]

Abstract

This paper presents the results of an undergraduate research project for CSE 190A: Projects in Vision and Learning, Winter 2007. The subject of this project is 3D photography, a computer vision technique used to reconstruct 3D models from 2D images. While the topic itself has already been thoroughly researched, this project seeks to apply knowledge of computer graphics to create models optimized for virtual gaming environments. This report addresses the objectives, results, and lessons learned during the course, as well as topics for future work on the project.

1 Introduction

3D photography is the use of technology to capture the shape and appearance of real-world objects and reconstruct them in the digital world. The resulting digital model can be viewed from novel viewpoints that may be difficult or impossible to achieve in the real world, without requiring one to physically traverse great distances. Due to the detail and realism of the models generated by this process, 3D photography has many practical applications and has been employed across many different fields [15].

There are two main approaches to 3D photography: passive and active. Active techniques require tightly controlled conditions when capturing a scene, often utilizing projected light and precisely calibrated cameras or scanners. The results are extremely accurate and offer an unparalleled amount of detail. However, this approach is usually not practical for distant or fast-moving objects. In addition, the process is intrusive and requires expensive equipment.

The other approach, and the one used in this project, is passive 3D photography. Passive techniques use existing light in the scene and can work with an unrigged camera. The resulting models do not have the same level of detail and accuracy as those produced with active techniques, but they are a practical alternative. There are several passive approaches: wide-baseline stereo, structure from motion, shape from shading, and photometric stereo [12]. This project takes the stereo approach.

The stereo approach uses images taken from two different viewpoints to extract depth information from the scene. Once the correspondence between points in the two images is found, the amount of displacement (the disparity) can be used to compute the distance from the camera: for a rectified pair with focal length f and baseline B, a point with disparity d lies at depth Z = fB/d. The accuracy of stereo techniques often depends critically on camera calibration, that is, knowledge of the camera's position, orientation, and internal parameters [18]. However, as demonstrated by [10] and [2], it is possible to reconstruct a complex scene without this information.

Originally, the objective of this project was to use 3D photography techniques to reconstruct digital models of sculptures around campus. However, the emphasis shifted to the more practical topic of model acquisition for virtual gaming environments. Models in videogames differ from the usual models obtained through 3D photography in that they are subject to the performance constraints of the gaming platform. For example, a reconstructed mesh could have thousands of vertices and triangles, but in the context of a videogame that would not be practical, because it would require more memory to store the data and a longer time to load and render it; one common remedy, mesh decimation, is sketched below.
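The report does not name a tool for reducing a dense reconstructed mesh to a game-ready budget, but quadric-error decimation is a standard choice. The following is a minimal sketch using the Open3D library; the library choice, file names, and triangle budget are illustrative assumptions, not values from the report.

    import open3d as o3d

    # Load a dense reconstructed mesh (hypothetical file name).
    mesh = o3d.io.read_triangle_mesh("reconstruction.ply")
    print(f"before: {len(mesh.vertices)} vertices, {len(mesh.triangles)} triangles")

    # Collapse edges in order of quadric error until the triangle budget is met.
    # A few thousand triangles is a plausible budget for a game asset.
    lowpoly = mesh.simplify_quadric_decimation(target_number_of_triangles=2000)
    lowpoly.compute_vertex_normals()  # recompute shading normals after decimation
    print(f"after: {len(lowpoly.vertices)} vertices, {len(lowpoly.triangles)} triangles")

    o3d.io.write_triangle_mesh("reconstruction_lowpoly.ply", lowpoly)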
Therefore, this project must address the problem of creating a model that has a limited number of vertices yet still convincingly resembles the original scene or object.

This paper is organized as follows: Section 2 gives an overview of the system, and Section 3 explains the steps in more detail. Section 4 evaluates the results of running the system on a dataset. Section 5 discusses lessons learned and future work.

2 Overview of the System

A common approach in the literature to reconstructing a 3D model from a set of images consists of the following steps: image acquisition, feature selection, feature correspondence/matching, reconstruction, and visualization.

The first step is to acquire the images of the scene or object to be reconstructed with a digital camera. While this sounds simple enough, the conditions under which the pictures are taken largely determine which methods can be used later in the reconstruction stage. For this system, it is assumed that the position and orientation of the camera for each image are unknown and that the internal camera parameters (focal distance, skew, optical center) may or may not be available. However, all images should be taken with the same camera under the same settings (no auto focus or flash) to simplify the calculations during reconstruction.

Once the images have been acquired, the next step is to identify potential features (interest points) that are distinct and easy to match across images. This project evaluated two different approaches: the Harris corner detector and the difference-of-Gaussians step of the Scale-Invariant Feature Transform (SIFT). The number of features extracted may vary from tens to hundreds of points per image, depending on the parameters supplied.

The next step is to establish correspondences between the extracted features across the images. The approach chosen for feature selection determines the approach for feature correspondence: if Harris was used, it is followed by normalized cross-correlation (NCC); if SIFT was used, matching is done by comparing each feature's invariant descriptor vector [3].

After a set of matches has been obtained, the 3D positions of those points are computed. Assuming no calibration information is available, only a projective reconstruction is possible, though it can be improved by applying certain techniques. Two views are used to initialize the projection, and subsequent views are then added one at a time to achieve a multiple-view reconstruction. The result of
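As a concrete illustration of the feature-selection step, here is a minimal sketch of the two detectors the report evaluates. The report does not name an implementation; OpenCV, the input file name, and the parameter values below are assumptions.

    import cv2
    import numpy as np

    img = cv2.imread("view1.jpg")  # hypothetical input image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Harris corner detector; blockSize, ksize, and k are conventional values.
    harris = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    corners = np.argwhere(harris > 0.01 * harris.max())  # threshold on corner response
    print(f"Harris: {len(corners)} interest points")

    # SIFT; keypoints come from difference-of-Gaussians extrema, and each gets
    # a 128-dimensional invariant descriptor used later for matching.
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    print(f"SIFT: {len(keypoints)} interest points")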
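For the correspondence step, the sketch below shows both strategies. The ncc() helper is a hypothetical patch-comparison function (extracting the patches around each Harris corner is omitted for brevity), and the 0.8 ratio threshold for SIFT matching is Lowe's conventional choice rather than a value stated in the report.

    import cv2
    import numpy as np

    def ncc(patch1, patch2):
        """Normalized cross-correlation between two equal-sized grayscale patches."""
        a = patch1.astype(np.float64) - patch1.mean()
        b = patch2.astype(np.float64) - patch2.mean()
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        return (a * b).sum() / denom if denom > 0 else 0.0

    # For Harris corners, one would score ncc() between a window around each
    # corner in one image and candidate windows in the other, keeping the best
    # score above a threshold.

    # For SIFT, descriptors are compared directly; Lowe's ratio test discards
    # ambiguous matches.
    img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical inputs
    img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    candidates = matcher.knnMatch(des1, des2, k=2)
    good = [m for m, n in candidates if m.distance < 0.8 * n.distance]
    print(f"{len(good)} matches survive the ratio test")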
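Finally, a sketch of the triangulation at the heart of the reconstruction step. With uncalibrated cameras the recovered structure is only projective, as noted above; the synthetic cameras and points below are illustrative stand-ins for the projection matrices and matches the system would recover, used here only to show the mechanics.

    import cv2
    import numpy as np

    # A canonical first camera, as used when initializing from two views, and a
    # second camera translated along x. Both are illustrative stand-ins.
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

    # A few synthetic 3D points in front of both cameras.
    X = np.array([[0.0, 0.0, 5.0], [1.0, -0.5, 6.0], [-0.7, 0.3, 4.0]])

    def project(P, X):
        """Project Nx3 points through a 3x4 camera matrix; returns 2xN image points."""
        X_h = np.hstack([X, np.ones((len(X), 1))]).T  # 4xN homogeneous
        x = P @ X_h
        return x[:2] / x[2]

    pts1, pts2 = project(P1, X), project(P2, X)

    # cv2.triangulatePoints returns 4xN homogeneous coordinates.
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
    X_rec = (X_h[:3] / X_h[3]).T
    print(np.allclose(X_rec, X))  # True: the synthetic points are recovered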

