U of M PSY 5036W - Object Recognition - D358028

School name University of Minnesota- Twin Cities

Course Psy 5036w- Computational Vision

Pages 34

Computational Vision
U. Minn. Psy 5036
Daniel Kersten
Lecture 23: Object Recognition
Notebook: 23.ObjectRecognition.nb

Initialize

  Spell check off

  In[1]:= Off[General::spell1];

Outline

Last time
  Texture--notes
  Shawn Green: Action video games, learning & probabilistic inference
  Science writing notes
  Jackie Fulvio: Visual tracking

Today
  Tasks & discounting
  Object recognition
  Role of geometric modeling in theories of object recognition

High-level vision, visual tasks

Intermediate-level vision
  Generic, global organizational processes
    Domain overlap, occlusion
    Surface grouping, selection
    Gestalt principles
  Cue integration
  Cooperative computation
  Attention

  Intermediate-level processes could be useful for interpreting novel objects and scenes. They could also extract features to be stored in memory and later tested for a match against stored representations. The idea is that more abstract features and relations would be more robust to image variations of the sort discussed below.

High-level vision
  Functional tasks
    Object recognition--familiar objects
      entry-level, subordinate-level
    Object-object relations
      Scene recognition
      Spatial layout
    Viewer-object relations
      Object manipulation
        reach & grasp
      Heading

Task dependency: explicit (primary) and generic (secondary, nuisance) variables

  I = f(shape, material, articulation, viewpoint, relative position, illumination)

  Task: Object Recognition
  Task: Absolute depth (e.g. for reaching)
  Task: grasp

  Problem: all the scene variables contribute to the variations in the image

Preview of Mathematica application below
  Shape-based object recognition
    estimate geometrical shape (primary variables)
    discount sources of image variation not having to do with shape (secondary variables)
      e.g. integrating out geometrical variables such as translation, rotation, and scale to estimate shape for object recognition
      (Variations due to illumination, background clutter, and within-category shape also need to be taken into account.)

Object recognition

Sources of image variation

  We'll work from lower to higher levels of object abstraction

  Variation within subordinate-level category (subordinate level, e.g. mallard, Doberman, Braeburn)
    illumination
      level, direction, source arrangement, shadows, spectral content
    view
      scale
      translation
      2D & 3D rotation
    articulation
      non-rigid, e.g. joints, hinges, facial expression, hair, cloth
    background (segmentation)
      bounding contour/edge variation
    occlusion (segmentation)

  Variation within basic-level category (e.g. duck, dog, chair, apple)
    "entry-level", "basic-level"
    structural relation invariance? Or image fragment-based learning and memory?

  Variation across super-ordinate category (e.g. bird, mammal, furniture, fruit)
    more cognitive than perceptual, non-pictorial

  Variation across context
    ball on tennis court vs. billiard table

Basic-level vs. subordinate-level

  Is the distinction relevant to human behavior?
  Behavioral experiments (Rosch et al.), Neuropsychological (Damasio and Damasio)
    temporal lobe lesions disrupt object recognition
    fine-grain distinctions more easily disrupted than coarse-grain ones
    e.g. Boswell patient: can't recognize faces of family, friends, unique objects, or unique places. Can assign names like face, house, car, appropriately. Also superordinate categories: "tool"
  prosopagnosics
    faces vs. subordinate-level?
  neural evidence for distinction? IT hypercolumns?

  Basic-level
    Shape--particularly critical, but qualitative rather than metric aspects are important. E.g. geons and geon relations (Biederman).
    Material, perhaps for some natural classes?
    Issue of prototypes with a model for variation vs. parts.
      e.g. average image face, the most familiar
      priming methods to tease apart distinct representations
    Fragment-based methods or "features of intermediate complexity": Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Learning informative features for class categorization

  Subordinate-level
    geometric variations important for subordinate--e.g. sensitivity to configurational etc.
    material
    Prototypes -> what kind of model for variation?
    Problem: With only a discrete set of views, how does vision generalize to other views?

Consider viewpoint
  Features invariant to viewpoint change?
    Material (surface color, texture)
    2D image features correlated with 3D shape
  Object ensemble is important
    E.g. only red object amidst others -- no need to process its shape
  Context is important
    Small red thing flying past the window.
    High "cue validity" for male Cardinal bird

  Context can even override local cues for identity
    Sinha and Poggio, Nature 1996
    See too: Cox, D., Meyers, E., & Sinha, P. (2004). Contextually evoked object-specific responses in human visual cortex. Science, 304(5667), 115-117.

Getting a good image representation

Overview of processes that seem to be necessary--i.e. what we think we know

  For object recognition, the contributions due to the secondary or "generic" variables (e.g. illumination and viewpoint) need to be discounted, and object features such as shape and material need to be estimated. How?

  o Measurements of image information likely to belong to the object. This principle should constrain segmentation.
      problems with: specularities, cast shadows, attached shadows (from shading).
      edge detection is really noisy, and ambiguous as to cause, so what are these image "features"?
      although noisy, are edges/groupings sufficiently reliable to determine object class?

  o "Sensor fusion" or cue integration to improve estimates of where object boundaries are located:
      combine stereo, motion, chromatic, luminance, etc.

  o Incorporate intermediate-level constraints to help to find object boundaries or "silhouettes".
      Gestalt principles of perceptual organization
        Mohan (1988); Zhu (1999); Geiger -- figure/ground talk
      symmetry (Vetter et al., 1994)
      long smooth lines (David & Zucker, 1989; Shashua & Ullman, 1988; Field and Hess, 1993)

  o "Cooperative computation" for object shape, reflectance and lighting.
      There is no single local cue to edge identity
      "intrinsic images" of Barrow and Tenenbaum to extract
      Clark & Yuille (1990); Knill & Kersten, Kersten (1991); Kersten and Madarasmi (1995).
      Problem: Still no bottom-up
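The "sensor fusion" item above (combining stereo, motion, chromatic, and luminance estimates of where an object boundary lies) is often modeled as reliability-weighted averaging of independent Gaussian cues. The notes do not state a combination rule, so the Python sketch below is only one common assumption. The function name `fuse_cues` and all numbers are hypothetical.

```python
def fuse_cues(estimates, variances):
    """Combine independent Gaussian cue estimates by inverse-variance weighting.

    estimates: per-cue estimates of the same quantity (e.g. boundary position).
    variances: per-cue noise variances (assumed known and positive).
    Returns (fused_estimate, fused_variance).
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]  # reliability of each cue
    total = sum(weights)
    fused = sum(w * x for w, x in zip(weights, estimates)) / total
    return fused, 1.0 / total


# Hypothetical boundary positions (pixels) from stereo, motion, and luminance.
position, variance = fuse_cues([10.0, 12.0, 11.0], [1.0, 4.0, 2.0])
```

Under these assumptions the fused variance is smaller than that of the most reliable single cue, which is one sense in which integration can "improve estimates".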
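The shape-based recognition idea earlier in the notes (estimate shape while "integrating out" translation, rotation, and scale) can be illustrated with a small numerical sketch. This is not the Mathematica application the notes preview. It is a toy Python version under loudly stated assumptions: shapes are 2D point sets with known point correspondence, image noise is isotropic Gaussian, rotation and scale have a uniform prior over a coarse grid, and translation is removed by centering instead of being integrated. All names (`marginal_likelihood`, `triangle`, `sliver`) are hypothetical.

```python
import math


def transform(points, angle, scale):
    """Rotate by `angle` (radians) and scale 2D points about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y), scale * (s * x + c * y)) for x, y in points]


def center(points):
    """Subtract the centroid (a shortcut that removes translation)."""
    mx = sum(x for x, _ in points) / len(points)
    my = sum(y for _, y in points) / len(points)
    return [(x - mx, y - my) for x, y in points]


def marginal_likelihood(image_pts, shape_pts, sigma=0.1, n_angles=72,
                        scales=(0.5, 0.75, 1.0, 1.5, 2.0)):
    """Approximate p(image | shape) by averaging a Gaussian likelihood
    over a uniform grid of rotations and scales (the nuisance variables)."""
    image_c, shape_c = center(image_pts), center(shape_pts)
    total = 0.0
    for k in range(n_angles):
        angle = 2.0 * math.pi * k / n_angles
        for scale in scales:
            predicted = transform(shape_c, angle, scale)
            sse = sum((px - ix) ** 2 + (py - iy) ** 2
                      for (px, py), (ix, iy) in zip(predicted, image_c))
            total += math.exp(-sse / (2.0 * sigma ** 2))
    return total / (n_angles * len(scales))


# Two hypothetical stored shapes.
shapes = {
    "triangle": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    "sliver": [(0.0, 0.0), (1.0, 0.0), (0.5, 0.1)],
}

# Image: the triangle rotated 90 degrees, scaled by 1.5, and shifted.
image = [(x + 3.0, y - 2.0)
         for x, y in transform(shapes["triangle"], math.pi / 2, 1.5)]

scores = {name: marginal_likelihood(image, pts) for name, pts in shapes.items()}
best = max(scores, key=scores.get)
```

Because the score for each shape sums over all grid poses, the decision depends on the primary variable (shape) and not on the particular view that generated the image. A fuller treatment would also need to handle illumination, clutter, and within-category shape variation, as the notes point out.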
