CMU CS 10701 - Inferring Depth from Single Images in Natural Scenes

Inferring Depth from Single Images in Natural Scenes

Byron Boots
Department of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
beb@cs.cmu.edu

Abstract

The inverse optics problem is one of the oldest and most well-known problems in visual perception: inferring the underlying sources of visual images has no analytic solution. Recent work on brightness, color, and form has suggested that visual percepts represent the probable sources of visual stimuli, not the stimuli as such, suggesting an empirical theory of visual perception. Here I explore this idea by framing the perception of depth as a machine learning problem. I apply two algorithms with varying levels of model complexity and compare their ability to infer depth both with each other and with the best previous solutions.

1 Introduction

It has long been recognized that the sources of visual stimuli cannot be uniquely specified by the energy that reaches sensory receptors: the same pattern of light projected onto the retina may arise from different combinations of illumination, reflectance, and transmittance, and from objects of different sizes at different distances and in different orientations (Figure 1). Nevertheless, visual agents must respond to real-world events. The inevitably uncertain sources of visual stimuli thus present a quandary: although the physical properties of a stimulus cannot uniquely specify its provenance, success depends on behavioral responses that are appropriate to the stimulus source. This dilemma is referred to as the inverse optics problem.

For more than a century, investigators have surmised that the basis of successful biological vision in the face of the inverse optics problem is the inclusion of prior experience in visual processing, presumably derived from both evolution and individual development. This empirical influence on visual perception, first suggested by George Berkeley in 1709 [1], has been variously considered in terms of Helmholtz's unconscious inferences [2], the organizational principles advocated by gestalt psychology [3], and the framework of ecological optics developed by Gibson [4]. More recently, these broad interpretations have been bolstered by a wealth of evidence suggesting that many visual percepts can be predicted according to the real-world sources to which an animal has always been exposed [5, 6, 7]. In fact, many of the anomalous percepts that humans see in response to simple visual stimuli may be rationalized in this way [6, 7].

In the present work, I have explored the notion of an empirical approach to visual perception by framing the inverse optics problem as a machine learning problem. Specifically, I have asked how the depth to surfaces in natural scenes may be inferred from monocular images. I look at previous approaches to the problem and suggest novel alternatives. My results demonstrate the feasibility of solving the inverse problem from two perspectives: a naive linear regression perspective and a more complex graphical modeling perspective.

Figure 1: The inverse optics problem with respect to geometry. Objects of different sizes, at different distances, and in different orientations may project the same image on the retina.
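To make the naive linear regression perspective mentioned above concrete, the following is a minimal sketch of predicting per-patch depth from a feature vector by ordinary least squares. It is an illustrative sketch rather than the model evaluated in this paper: the random feature matrix, the log-depth targets, and the patch and feature dimensions are placeholder assumptions standing in for features extracted from images paired with ground-truth depth maps.

```python
import numpy as np

# Minimal sketch of the "naive linear regression" perspective: predict
# per-patch (log-)depth from a feature vector by ordinary least squares.
# X and y are random placeholders standing in for patch features and
# ground-truth depths from a range-image dataset (an assumption, not
# the paper's actual data).
rng = np.random.default_rng(0)
n_patches, n_features = 1000, 50
X = rng.normal(size=(n_patches, n_features))          # patch feature vectors
y = np.log(rng.uniform(1.0, 80.0, size=n_patches))    # log-depth targets (m)

# Append a bias column and solve the least-squares problem.
Xb = np.hstack([X, np.ones((n_patches, 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Predict depths for new patches and map back out of log space.
X_new = rng.normal(size=(5, n_features))
predicted_depth = np.exp(np.hstack([X_new, np.ones((5, 1))]) @ w)
print(predicted_depth)
```

Regressing on log-depth rather than raw depth is a common choice because errors tend to scale with distance; it is assumed here for illustration, not taken from the paper.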
2 Related Work

2.1 Traditional approaches to computer vision

Geometrical aspects of the inverse optics problem are frequently encountered in computer vision in the form of recovering three-dimensional structure from two-dimensional images. Most work in this area has focused on stereopsis [8], structure from motion [9], or depth from defocus [10], all of which rely on a differential comparison between multiple images. Animals, however, are able to judge spatial geometry from a monocular image, and it is thought that this ability lies at the heart of the perception of geometrical space [11]. Despite this, the only well-known method of inferring depth from a single image is the shape-from-shading algorithm [12], a technique that devises models of image formation based on the physics of light interaction and then inverts those models to solve for depth. These inverted models are highly underconstrained, requiring many simplifying assumptions (e.g., Lambertian surface reflectance) that seldom hold in images of natural scenes [14]. Recently, researchers have begun to recover geometrical structure from two-dimensional images empirically, using learning-based techniques.

2.2 Learning-based methods in computer vision

Despite the large quantity of evidence suggesting the importance of empirical data in vision, there have been surprisingly few attempts to leverage machine learning techniques to infer scene geometry from monocular images; I am aware of only two. Andrew Ng's group at Stanford University is using discriminatively trained Markov random fields to infer depth from monocular images collected from a mobile platform [13]. This approach is quite successful and has the advantage of directly learning depth maps from the statistics of images and their underlying sources. Potentially, such an approach could be tied to vision studies that have similarly used images and depth maps to explain perceptual phenomena [5, 14]. Alyosha Efros's group at Carnegie Mellon University is using a completely different technique, in which subjects hand-label the possible orientations of surfaces in images [15]. Their algorithm learns geometric classes defined by simple orientations, such as sky, ground, and vertical surfaces in a scene. The labels are then used to cut and fold the image, providing a simple pop-up model of a visual scene. This method performs surprisingly well for a wide range of images and is visually appealing, but it is highly inaccurate and not directly related to the true statistics of underlying depths in visual images.

2.3 Filtering in visual inference

In previous attempts at monocular inference, it has been suggested that a variety of cues are essential for judging depth. In particular, convolutional filters such as Laws' masks for texture energy and oriented edge detectors have been used to develop complex feature vectors describing local information in image patches [13, 15]. Additionally, the local patches themselves are augmented with information from multi-scale decompositions of the image in order to provide additional scene context [13]. This latter point is extremely important, as local information is insufficient to determine depth [6, 7]. Despite their extensive use, linear filters are problematic: features derived in this way introduce a priori assumptions about the importance of particular patterns and spatial frequencies on
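To illustrate the kind of filter-based features described in Section 2.3, the sketch below computes Laws'-mask texture-energy responses for a local patch and appends responses from a coarser, decimated view of the surrounding region to provide scene context. It is a sketch under stated assumptions: the four 1-D Laws kernels, the 2x decimation used for the coarser scale, and the absolute-sum energy measure are illustrative choices, not the exact feature set used in the cited work [13, 15].

```python
import numpy as np
from scipy.signal import convolve2d

# Standard 1-D Laws kernels; 2-D masks are their outer products.
LAWS_1D = {
    "L5": np.array([1, 4, 6, 4, 1], float),     # level (local average)
    "E5": np.array([-1, -2, 0, 2, 1], float),   # edge
    "S5": np.array([-1, 0, 2, 0, -1], float),   # spot
    "R5": np.array([1, -4, 6, -4, 1], float),   # ripple
}
MASKS = [np.outer(a, b) for a in LAWS_1D.values() for b in LAWS_1D.values()]

def texture_energy(patch):
    """Sum of absolute filter responses over the patch, one value per mask."""
    return np.array([
        np.abs(convolve2d(patch, m, mode="same", boundary="symm")).sum()
        for m in MASKS
    ])

def patch_features(image, row, col, size=16):
    """Feature vector for one patch: fine-scale texture energies plus the
    energies of a 2x-decimated view of the surrounding region for context."""
    half = size // 2
    fine = image[row - half:row + half, col - half:col + half]
    coarse = image[row - 2 * half:row + 2 * half:2,
                   col - 2 * half:col + 2 * half:2]
    return np.concatenate([texture_energy(fine), texture_energy(coarse)])

# Example on a random "image"; a real pipeline would use grayscale photographs.
img = np.random.default_rng(0).random((128, 128))
print(patch_features(img, 64, 64).shape)   # (32,) = 16 masks x 2 scales
```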

