U of M PSY 5036W - Object recognition background

Computational Vision
U. Minn. Psy 5036
Daniel Kersten
Lecture 24: Object recognition, background

Initialize

Off[General::spell];

Outline

Last time

Object recognition overview

Today

Object recognition: finishing up compensating for viewpoint changes
Recognition, background variation, segmentation & learning objects

Variation over view: review

From the previous lecture...

Background context, clutter, and occlusion

‡Background/context useful for "indexing"

Background can provide prior information, which could be called "index" cues, to narrow down the space of possible objects to be recognized. See, e.g., Oliva et al. (2003) and Torralba et al. (2006). One of the first demonstrations of the role of background information in human perception was:

Biederman I (1972) Perceiving real-world scenes. Science 177:77-80.

‡Background (clutter) as a confound

How vision handles variation over background (clutter) is challenging, very important, yet poorly understood. Background clutter poses three types of problems: 1) segmentation is difficult because clutter near a target object's borders produces misleading boundary fragments; 2) because local information is often incomplete for objects in a scene, there can be false positives for a target object; and 3) other surfaces may cover parts of the target object, i.e., occlusion leads to missing features/parts of a target object.

We need a better understanding of local image cues, as well as of how high-level models can be used to disambiguate local information.

Natural image statistics

Let's look at the problem of segmentation.
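As a minimal numerical sketch of the point made in the demo below (the original notebook uses Mathematica; this stand-alone Python/NumPy version is an illustration, not the course code): the same object, with the same physical boundary, produces very different local edge-filter responses depending on the background it is placed against.

```python
import numpy as np

def edge_magnitude(img):
    # Crude edge detector: gradient magnitude via finite differences.
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

obj = np.full((8, 8), 0.9)       # bright "object" patch
dark = np.full((16, 16), 0.1)    # location 1: dark background
light = np.full((16, 16), 0.85)  # location 2: background nearly matches the object

scene1, scene2 = dark.copy(), light.copy()
scene1[4:12, 4:12] = obj
scene2[4:12, 4:12] = obj

e1 = edge_magnitude(scene1)
e2 = edge_magnitude(scene2)

# The object's boundary is identical in both scenes, but the local edge
# response at that boundary is an order of magnitude weaker in scene 2.
print(e1.max(), e2.max())
```

A boundary that is trivial to find in one context can be nearly invisible to a local filter in another, which is exactly why purely local edge linking is ambiguous.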
The same image of an object appearing at different locations will produce quite different local responses in spatial filters. Let's place the antlers (right) on the background below (left) at two different locations.

Location 1 (left) and location 2 (right) are shown below. Your visual system has no problem segmenting the antlers.

But compare the local information in the following image blow-ups, and the corresponding edge detector outputs, for locations 1 and 2.

Although different types of edge detectors will give different outputs, it is difficult to resolve the ambiguity of which edge elements to link at the boundaries.

‡Texture-based grouping

This illustrates the need to take into account region/texture information for segmentation (e.g., Martin et al., 2004).

Martin, D. R., Fowlkes, C. C., & Malik, J. (2004). Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Trans Pattern Anal Mach Intell, 26(5), 530-549.

Konishi SM, Yuille AL, Coughlan JM, Zhu SC (2003) Statistical edge detection: Learning and evaluating edge cues. IEEE Transactions on Pattern Analysis and Machine Intelligence 25:57-74.

The problem of background and clutter suggests that the visual system can make use of both intermediate-level information (grouping of features) and high-level information (familiarity with object domains, such as "antlers") to select and integrate features, both contours and texture, that belong together.

In this lecture, we'll focus on the recognition component of segmentation.

‡Analysis-by-synthesis

Feedforward and feedback: use high-level information to predict the input and to compare the prediction with the actual input.

From: Cavanagh P (1991) What's up in top-down processing? In: Representations of Vision: Trends and tacit assumptions in vision research (Gorea A, ed), pp 295-304.
Cambridge, UK: Cambridge University Press.

Information from a high-level model (in memory) can be used to "explain away" the cast shadow contours.

See also:

Sinha P, Poggio T (2001) High-level learning of early perceptual tasks. In: Perceptual Learning (Fahle M, ed). Cambridge, MA: MIT Press.

Epshtein, B., Lifshitz, I., & Ullman, S. (2008). Image interpretation by a single bottom-up top-down cycle. Proc Natl Acad Sci U S A, 105(38), 14298-14303.

Bootstrapped learning of object models in clutter

Brady MJ, Kersten D (2003) Bootstrapped learning of novel objects. J Vis 3:413-422.

http://gandalf.psych.umn.edu/users/kersten/kersten-lab/camouflage/digitalembryo.html

‡Occlusion

‡The solution?

Efficient grouping based on similarity. But that may not be enough. One can also use occlusion information to "explain away" missing features, consistent with the Bayesian idea of "explaining away".

Neural evidence for top-down processing: analysis by synthesis

See the "Top-down" pdf notes.

Next

‡Perceptual integration, perception as "puzzle solving"

‡Learning object categories

‡Spatial layout

Appendix

‡Writing Packages

The basic format is straightforward:

BeginPackage["Geometry`Homogeneous`"]

XRotationMatrix::usage =
    "XRotationMatrix[phi] gives the matrix for rotation about the x-axis by phi radians";

YRotationMatrix::usage =
    "YRotationMatrix[phi] gives the matrix for rotation about the y-axis by phi radians";

ZRotationMatrix::usage =
    "ZRotationMatrix[phi] gives the matrix for rotation about the z-axis by phi radians";

ScaleMatrix::usage =
    "ScaleMatrix[sx,sy,sz] gives the matrix to scale a vector by sx, sy, and sz in the x, y and z directions, respectively.";

TranslateMatrix::usage =
    "TranslateMatrix[x,y,z] gives the matrix to translate coordinates by x,y,z.";

ThreeDToHomogeneous::usage =
    "ThreeDToHomogeneous[sx,sy,sz] converts 3D coordinates to 4D homogeneous coordinates.";

HomogeneousToThreeD::usage =
    "HomogeneousToThreeD[4Dvector] converts 4D homogeneous coordinates to 3D coordinates.";

ZProjectMatrix::usage =
    "ZProjectMatrix[focal] gives the 4x4 projection matrix to map a vector through the origin to an image plane at focal distance from the origin along the z-axis.";

ZOrthographic::usage =
    "ZOrthographic[vector] projects vector on to the x-y plane.";

Begin["`private`"]

XRotationMatrix[theta_] :=
    {{1, 0, 0, 0}, {0, Cos[theta], -Sin[theta], 0},
     {0, Sin[theta], Cos[theta], 0}, {0, 0, 0, 1}};

YRotationMatrix[theta_] :=
    {{Cos[theta], 0, Sin[theta], 0}, {0, 1, 0, 0},
     {-Sin[theta], 0, Cos[theta], 0}, {0, 0, 0, 1}};

