U of M PSY 5036W - Lecture 24 Top Down

Top-down

Object recognition, given real images
• clutter, occlusion, noise
• role of cortical architecture
Learning object categories
• amazing ability to learn from a small number of examples

Object recognition in real images
• background clutter and occlusion

Object recognition given occlusion, clutter
Linking local information (features) likely to belong to the same object or pattern
• local ambiguity, noise
• need for generic priors, e.g. smoothness
Resolving competing explanations
• occlusion, clutter
• need for domain-specific priors

Simple influence graphs: cue integration
Long line / short segments

Parent P, Zucker SW (1989) Trace inference, curvature consistency, and curve detection. IEEE Transactions on Pattern Analysis & Machine Intelligence 11:823-839.
Yuille AL, Fang F, Schrater P, Kersten D (2004) Human and Ideal Observers for Detecting Image Curves. In: Advances in Neural Information Processing Systems 16 (Thrun S, Saul L, Schoelkopf B, eds). Cambridge, MA: MIT Press.

Cortical basis?
Short segments to long lines? Within-area linkage? (1 to 2 mm; ~8 mm)
Das A, Gilbert CD (1999) Topography of contextual modulations mediated by short-range interactions in primary visual cortex. Nature 399:655-661.

But what about whole shapes?

Competing explanations: explaining away missing data
• ...or... or not?
• auxiliary evidence for occlusion

Recognition despite cast shadows
Cavanagh P (1991) What's up in top-down processing? In: Representations of Vision: Trends and tacit assumptions in vision research (Gorea A, ed), pp 295-304. Cambridge, UK: Cambridge University Press.
Suggests... rather than this.

Computer vision
Image parsing: analysis by synthesis (Tu Z, Chen X, Yuille A, Zhu S, 2005)

...models [38] [39]. In particular, the pattern types can be expanded to include material properties which are not explicit objects.

The advantages of a generative model for the entire image include the ability to "explain away". Submodels corresponding to different objects, or processes, compete and cooperate to explain different parts of the image (e.g. the letter B plus bar competes with the interpretation of accidentally aligned fragments in Figure 1B). A face model might hallucinate a face in the trunk of a tree; but a tree model can overrule this and provide the correct interpretation of the tree trunk, see Figure (5). In addition, full generative models enforce consistency of the interpretation of the image.

[Figure 3 graph: a "scene" root node with child region nodes for face, text, and background, each with attributes (ζ_i, L_i, Θ_i).]
Figure 3: A. The image is generated (left panel) by a probabilistic context-free grammar shown by a two-layer graph with nodes with properties (ζ, L, Θ) corresponding to regions L_i in the image. B. The right panel shows samples from the face model and the letter model, i.e. from p(I_R(L) | ζ, L, Θ).

We now switch to the task of performing inference on this generative model to estimate W* = arg max_W P(W|I). This requires a sophisticated inference algorithm that can perform operations such as creating nodes, deleting nodes, diffusing the boundaries, and altering the node attributes. The strategy used in [3] is to perform analysis by synthesis by a data-driven Markov chain Monte Carlo (DDMCMC) algorithm. This algorithm is guaranteed to converge by standard properties of MCMC. Informally, low-level cues are used to make hypotheses about the scene which can be verified or rejected by sampling from the models. For example, low-level cues [31, 32] can be used to hypothesize that there is a face in a region of the image. This hypothesis can be validated or rejected by sampling from a generative face model. The bottom-up cues propose that there are faces in the tree bark, but this proposal is rejected by the top-down generative model, see Figure (5). Inference is performed by applying a set of operators which change the structure of the parse graph, see Figure (4). These operators are implemented by transition kernels K; see Box 1 for a more technical description of the algorithm. The bottom-up cues are based on discriminative models, which are described in Box 2.

• Find most probable scene description
• Bottom-up "proposals" (cues) to access low-level (shading) and high-level (faces, letters) models
• Verification through top-down synthesis
• If bottom-up proposals are good, synthesis is not needed to find the most probable scene
• Flexible graph

Figure 5: Top left: input image. Top right: bottom-up proposals for text and faces are shown by boxes; a face is "hallucinated" in a tree. Bottom left: overall segmentation. Bottom centre: detection of letters and faces. Bottom right: synthesised image.

There is evidence that reliable diagnostic information for certain categories is available from very simple image measurements [35, 32], and that humans make certain categorical decisions sufficiently fast to preclude a verification loop [40] (but see [41] and [42]).

"Where do the generative models come from?"
Ideally the generative models, the discriminative models, and the stochastic grammar would all be learnt from natural images. This is not difficult in principle because, as discussed in Griffiths and Yuille, learning the model from data is simply another example of statistical inference. The Helmholtz machine [43] gives an illustration of how a generative model, and an
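The generic smoothness prior invoked above for linking short segments into long lines (in the spirit of Parent & Zucker's curvature-consistency work) can be sketched numerically. The quadratic orientation-change penalty and the `beta` weight below are illustrative assumptions, not the actual model from either paper:

```python
# Sketch of a generic smoothness prior over chains of oriented edge
# segments: penalise squared orientation change between neighbours.
# The quadratic form and beta weight are illustrative assumptions.

def smoothness_log_prior(orientations, beta=10.0):
    """Log prior of a chain of segment orientations (radians)."""
    return -beta * sum((b - a) ** 2
                       for a, b in zip(orientations, orientations[1:]))

smooth = [0.00, 0.05, 0.10, 0.15]   # gently curving chain of segments
jagged = [0.00, 0.60, -0.40, 0.50]  # abrupt orientation changes

print(smoothness_log_prior(smooth) > smoothness_log_prior(jagged))  # True
```

Under such a prior, locally ambiguous segments are preferentially grouped into smooth curves, which is the generic (domain-independent) constraint the slides contrast with domain-specific priors.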
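The "explaining away" pattern behind the occlusion slides can be made concrete with a minimal two-cause Bayes net: an object boundary (B) or an occluder (O) can each account for missing image data (M). All probabilities here are made-up illustrative numbers, not values from the lecture:

```python
# Minimal "explaining away" demonstration: two independent causes, an
# object boundary (B) or an occluder (O), can each account for missing
# image data (M). All probabilities are illustrative assumptions.

def posterior_B(evidence_O=None):
    """P(B=1 | M=1), optionally also conditioning on the occluder O."""
    pB, pO = 0.1, 0.1                       # priors on the two causes
    def pM(b, o):                           # likelihood of the missing data
        return 0.95 if (b or o) else 0.01
    num = den = 0.0
    for b in (0, 1):
        for o in (0, 1):
            if evidence_O is not None and o != evidence_O:
                continue                    # condition on the observed O
            p = (pB if b else 1 - pB) * (pO if o else 1 - pO) * pM(b, o)
            den += p
            if b:
                num += p
    return num / den

print(round(posterior_B(), 3))              # P(B | M) ≈ 0.504
print(round(posterior_B(evidence_O=1), 3))  # P(B | M, O=1) = 0.1
```

Auxiliary evidence for the occluder drives the posterior on the boundary back down to its prior: the occluder "explains away" the missing data, exactly the competition between explanations the slides describe.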
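The DDMCMC loop described in the excerpt — bottom-up proposals verified or rejected against a top-down generative model — can be caricatured in a few lines. The per-region Gaussian image model, the prior, and all the numbers below are assumptions for illustration; the real algorithm operates on a structured parse graph with many kernel types:

```python
import math
import random

# Toy "analysis by synthesis": bottom-up cues propose changes to the
# scene description, and a top-down generative model verifies or rejects
# each proposal via the Metropolis rule. The Gaussian image model and
# all numbers are illustrative assumptions, not the image-parsing model.

random.seed(0)
obs = [0.9, 0.1, 0.8, -0.2, 0.95]   # per-region "face-likeness" measurements
PRIOR_FACE = 0.2                    # prior probability a region is a face

def log_joint(labels):
    """Top-down score: log P(labels) + log P(obs | labels)."""
    lp = 0.0
    for y, f in zip(obs, labels):
        lp += math.log(PRIOR_FACE if f else 1 - PRIOR_FACE)
        mu = 1.0 if f else 0.0      # face regions generate values near 1
        lp += -0.5 * ((y - mu) / 0.4) ** 2
    return lp

# Data-driven (bottom-up) proposal: regions with strong cues are visited
# more often. The choice does not depend on the current labels, so the
# proposal is symmetric and needs no Hastings correction.
weights = [abs(y) + 0.1 for y in obs]

def ddmcmc(steps=4000):
    labels = [0] * len(obs)
    best = list(labels)
    for _ in range(steps):
        i = random.choices(range(len(obs)), weights)[0]
        prop = list(labels)
        prop[i] = 1 - prop[i]       # propose relabelling one region
        # Verification: accept or reject against the generative model
        delta = log_joint(prop) - log_joint(labels)
        if random.random() < math.exp(min(0.0, delta)):
            labels = prop
        if log_joint(labels) > log_joint(best):
            best = list(labels)
    return best

print(ddmcmc())                     # best scene description found
```

A spurious bottom-up proposal (a weak cue in a non-face region, like the face "hallucinated" in the tree bark of Figure 5) is occasionally accepted but scores poorly under the generative model, so the chain quickly rejects it in favour of the consistent interpretation.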

