UW-Madison CS 766 - Object Recognition - D61924



School name: University of Wisconsin, Madison

Pages: 17



Object Recognition

Lighting affects appearance
The “Margaret Thatcher Illusion” by Peter Thompson

Recognition Problems
• What is it?
  – Object detection
• Who is it?
  – Recognizing identity
• Object recognition
• Category recognition
• What are they doing?
  – Activity recognition
• All of these are classification problems
  – Choose one class from a list of possible candidates
Face Detection
Face Recognition
P. Sinha and T. Poggio, Last but not least, Perception 31, 2002, 133.

What Makes Recognition Hard?
• Intrinsic variability within each class
• Pose variability
• Illumination variability
• Background variability
• Segmentation problem
  – What region within an image contains the object?
• Feature selection problem
  – What features describe shape and appearance?

Approximate Invariants
• Over a limited range of viewpoint variation
• Parallelism
• Collinearity
• Angle between a pair of lines
• Co-termination

Pose consistency
• Correspondences between image features and model features are not independent
• A small number of correspondences yields a camera; the others must be consistent with this
• Strategy:
  – Generate hypotheses using small numbers of correspondences (e.g. triples of points for a calibrated perspective camera, etc.)
  – Backproject and verify
• Notice that the main issue here is camera calibration
• Appropriate groups are “frame groups”

Voting on Pose
• Each model leads to many correct sets of correspondences, each of which has the same pose
  – Vote on pose, in an accumulator array
  – This is a Hough transform, with all its issues

Figure from “The evolution and testing of a model-based object recognition system”, J.L. Mundy and A. Heller, Proc. Int. Conf. Computer Vision, 1990, copyright 1990 IEEE

Invariants
• There are geometric properties that are invariant to camera transformations
• Easiest case: view a plane object in weak perspective
• Assume we have three base points P_i on the object
  – then any other point on the object can be written as
    $P_k = P_1 + \mu_{ka}(P_2 - P_1) + \mu_{kb}(P_3 - P_1)$
• Now image points are obtained by multiplying by a plane affine transformation, so
    $p_k = A P_k = A\left(P_1 + \mu_{ka}(P_2 - P_1) + \mu_{kb}(P_3 - P_1)\right) = p_1 + \mu_{ka}(p_2 - p_1) + \mu_{kb}(p_3 - p_1)$

Invariants
• This means that, if I know the base points in the image, I can read off the µ values for the object
  – they’re the same in object and in image, i.e. invariant
• Suggests a strategy rather like the Hough transform
  – search correspondences, form µ’s and vote

Geometric Hashing
• Vote on identity and correspondence using invariants
  – Take hypotheses with large enough votes
• Fill up a table, indexed by µ’s, with
  – the base points and fourth point that yield those µ’s
  – the object identity

Recognizing Human Actions
• Movement and posture change
  – run, walk, crawl, jump, hop, swim, skate, sit, stand, kneel, lie, dance (various), …
• Object manipulation
  – pick, carry, hold, lift, throw, catch, push, pull, write, type, touch, hit, press, stroke, shake, stir, turn, eat, drink, cut, stab, kick, point, drive, bike, insert, extract, juggle, play musical instrument (various)…
• Conversational gesture
  – point, …
• Sign Language

Activities and Situation Assessment
• Example: Withdrawing money from an ATM
• Activities constructed by composing actions. Partial order plans may be a good model.
• Activities may involve multiple agents
• Detecting unusual situations or activity patterns is facilitated by the video activity transform

Objects in Space
• Segment/Region-of-interest
• Features (points, curves, wavelet coefficients..)
• Correspondence and deform into alignment
• Recover parameters of generative model
• Discriminative classifier

Actions in Space-Time
• Segment/volume-of-interest
• Features (points, curves, wavelets, motion vectors..)
• Correspondence and deform into alignment
• Recover parameters of generative model
• Discriminative classifier

Key Cues for Action Recognition
• “Morpho-kinesics” of action (shape and movement of the body)
• Identity of the object/s
• Activity context

Image/Video → Stick figure → Action
• Stick figures can be specified in a variety of ways or at various resolutions (degrees of freedom)
  – 2D joint positions
  – 3D joint positions
  – Joint angles
• Complete representation
• Evidence that it is effectively computable

Human Body Configurations

Mathematical Challenges
• Modeling shape variation
• Nearest neighbor search in high dimensions
• Combining statistical optimality with computational efficiency
• Reconstruction algorithms for novel sensing
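
As an illustration of the affine-invariant construction on the Invariants slides, the following is a minimal sketch in Python, not code from the lecture. It solves for the coefficients µ_ka and µ_kb of a fourth point relative to three base points, then checks that the same coefficients come out after a plane affine map is applied to all four points. The function names `affine_coords` and `apply_affine`, the example points, and the particular matrix and translation are all assumptions chosen for illustration.

```python
def affine_coords(p1, p2, p3, pk):
    """Solve pk = p1 + mu_a*(p2 - p1) + mu_b*(p3 - p1) for (mu_a, mu_b).

    Points are (x, y) tuples. The 2x2 linear system is solved with
    Cramer's rule; collinear base points make it singular.
    """
    ax, ay = p2[0] - p1[0], p2[1] - p1[1]   # first basis vector
    bx, by = p3[0] - p1[0], p3[1] - p1[1]   # second basis vector
    dx, dy = pk[0] - p1[0], pk[1] - p1[1]   # offset of the fourth point
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("base points are collinear")
    mu_a = (dx * by - dy * bx) / det
    mu_b = (ax * dy - ay * dx) / det
    return mu_a, mu_b


def apply_affine(M, t, p):
    """Apply the plane affine map p -> M p + t (M is 2x2, t is a 2-vector)."""
    return (M[0][0] * p[0] + M[0][1] * p[1] + t[0],
            M[1][0] * p[0] + M[1][1] * p[1] + t[1])


# Hypothetical model: three base points and one further point on a planar object.
P1, P2, P3, Pk = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.25, 0.5)

# Hypothetical camera: an arbitrary invertible affine map (assumed values).
M = [[2.0, 1.0], [0.5, 3.0]]
t = (4.0, -1.0)

# Image points are the affine images of the model points.
p1, p2, p3, pk = (apply_affine(M, t, P) for P in (P1, P2, P3, Pk))

mu_model = affine_coords(P1, P2, P3, Pk)
mu_image = affine_coords(p1, p2, p3, pk)
print("model mu:", mu_model)
print("image mu:", mu_image)
```

In a geometric hashing scheme of the kind the slides describe, the (µ_ka, µ_kb) pair would be quantized and used as the table index; the quantization step is a design choice that this sketch does not fix.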
