Berkeley COMPSCI C280 - Recognition in Context

School name University of California, Berkeley

Pages 130

C280, Computer Vision
Prof. Trevor Darrell
Lecture 16: Recognition in Context

Last Lecture
• Naïve-Bayes Nearest Neighbor (Irani)
• ISM (Leibe)
• Constellation Models (Fergus)
• Transformed LDA Models (Sudderth)
• 3-D view models (Savarese)

This week
• Two last topics in recognition:
– Context
– Articulation

Today: Three papers on computational models of context:
• A. Torralba, K. P. Murphy, and W. T. Freeman, "Contextual models for object detection using boosted random fields," in Advances in Neural Information Processing Systems 17 (NIPS), 2005.
• D. Hoiem, A. A. Efros, and M. Hebert, "Putting objects in perspective," in Computer Vision and Pattern Recognition, 2006.
• G. Heitz and D. Koller, "Learning spatial context: Using stuff to find things," in ECCV 2008, pp. 30-43.

Why is detection hard?
[Figure: image stream with axes x, y, and time]
• 10,000 patches/object/image
• 1,000,000 images/day
• Plus, we want to do this for ~1000 objects

Is local information enough?
Slide credit: A. Torralba

With hundreds of categories
[Figure: image with detector outputs labeled road, table, chair, keyboard, table, car, keyboard, road]
If we have 1000 categories (detectors), and each detector produces 1 false alarm every 10 images, we will have 100 false alarms per image (1000 × 1/10 = 100)… pretty much garbage…

Is local information even enough?
[Figure: plot with axes "Information" and "Distance" and the labels "Contextual features" and "Local features"]

The system does not care about the scene, but we do…
• We know there is a keyboard present in this scene even if we cannot see it clearly.
• We know there is no keyboard present in this scene… even if there is one indeed.

The multiple personalities of a blob
[Image-only slides]

Look-Alikes by Joan Steiner
[Image-only slides]

The context challenge
How far can you go without using an object detector?

What are the hidden objects?
[Figure: scene with hidden objects marked 1 and 2]
Chance ~ 1/30000

The importance of context
• Cognitive psychology
– Palmer 1975
– Biederman 1981
– …
• Computer vision
– Noton and Stark (1971)
– Hanson and Riseman (1978)
– Barrow & Tenenbaum (1978)
– Ohta, Kanade, Sakai (1978)
– Haralick (1983)
– Strat and Fischler (1991)
– Bobick and Pinhanez (1995)
– Campbell et al (1997)

Multiclass object detection and context modeling
Antonio Torralba
In collaboration with Kevin P. Murphy and William T. Freeman

Object representations
[Figure: representations arranged by object size]
• Inside the object (intrinsic features)
– Pixels
– Parts
– Global appearance
Agarwal & Roth (02), Moghaddam, Pentland (97), Turk, Pentland (91), Vidal-Naquet, Ullman (03), Heisele, et al (01), Agarwal & Roth (02), Kremp, Geman, Amit (02), Dorko, Schmid (03), Fergus, Perona, Zisserman (03), Fei Fei, Fergus, Perona (03), Schneiderman, Kanade (00), Lowe (99), etc.
• Outside the object (contextual features)
– Local context
– Global context
Kruppa & Schiele (03), Fink & Perona (03), Carbonetto, Freitas, Barnard (03), Kumar, Hebert (03), He, Zemel, Carreira-Perpinan (04), Moore, Essa, Monson, Hayes (99), Strat & Fischler (91), Murphy, Torralba & Freeman (03)

Previous work on context
• Strat & Fischler (91)
– Context defined using hand-written rules about relationships between objects
• Fink & Perona (03)
– Use output of boosting from other objects at previous iterations as input into boosting for this iteration
• Murphy, Torralba & Freeman (03)
– Use global context to predict objects, but there is no modeling of spatial relationships between objects.
[Figure: graphical model for Class 1 and Class 2 (keyboards); node labels include S, E1, E2, O p1,c1 … O pN,c2, v p1,c1 … v pN,c2, c1 maxV, c2 maxV, X1, X2, vg]
• Carbonetto, de Freitas & Barnard (04)
– Enforce spatial consistency between labels using MRF
• He, Zemel & Carreira-Perpinan (04)
– Use latent variables to induce long distance correlations between labels in a Conditional Random Field (CRF)

Graphical models for image labeling
• Nearest neighbor grid
• Densely connected graphs with low informative connections
• Want to model long-range correlations between labels

Outline of this talk
• Use global image features (as well as local features) in boosting to help object detection
• Learn structure of dense CRF (with long range connections) using boosting, to exploit spatial correlations

Image database
• ~2500 hand labeled images with segmentations
• ~30 objects and stuff
• Indoor and outdoor
• Sets of images are separated by locations and camera (digital/webcam)
• No graduate students or low-income-student-class exploited for labeling.

Which objects are important?
[Figure: average percentage of pixels occupied by each object]

Object representation
• Discrete/bounded/rigid
– Screen, car, pedestrian, bottle, …
• Extended/unbounded/deformable
– Building, sky, road, shelves, desk, …
We will use region labeling as a representation.

Learning local features (intrinsic object features)
[Figure: per-class label nodes s11…s1n, s21…s2n, s31…s3n (car, building, road) over local features VL1…VLn computed from pixels]
We maximize the probability of the true labels using Boosting.

Object local features (Borenstein & Ullman, ECCV 02)
[Figure labels:]
• Convolve with oriented filter
• Threshold
• Convolve with segmentation fragment
• Normalized correlation with an object patch
Patches from 5x5 to 30x30 pixels.

Results with local
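
The "normalized correlation with an object patch" step named on the local-features slide can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name normalized_correlation, the eps guard, the random test image, and the 0.9 threshold are all assumptions made for illustration, and the oriented filtering and segmentation-fragment convolution are left out.

```python
import numpy as np


def normalized_correlation(image, patch, eps=1e-8):
    """Normalized cross-correlation of `patch` with every same-sized window.

    Returns an array of shape (H - ph + 1, W - pw + 1) with values in [-1, 1].
    `eps` is a hypothetical guard against division by zero on flat windows.
    """
    image = np.asarray(image, dtype=float)
    patch = np.asarray(patch, dtype=float)
    ph, pw = patch.shape

    # All (ph, pw) windows of the image: shape (H-ph+1, W-pw+1, ph, pw).
    windows = np.lib.stride_tricks.sliding_window_view(image, (ph, pw))

    # Subtract the mean of each window and of the patch.
    win_zero = windows - windows.mean(axis=(2, 3), keepdims=True)
    patch_zero = patch - patch.mean()

    # Correlation divided by the product of the two norms.
    numerator = (win_zero * patch_zero).sum(axis=(2, 3))
    denominator = np.sqrt(
        (win_zero ** 2).sum(axis=(2, 3)) * (patch_zero ** 2).sum()
    )
    return numerator / (denominator + eps)


# Illustrative data only: a random image and a 10x10 patch cut out of it
# (the slide mentions patches from 5x5 to 30x30 pixels).
rng = np.random.default_rng(0)
image = rng.random((40, 40))
patch = image[12:22, 7:17].copy()

scores = normalized_correlation(image, patch)
row, col = np.unravel_index(np.argmax(scores), scores.shape)

# Thresholding the response map; 0.9 is an arbitrary illustrative value.
detections = scores > 0.9
print(row, col, int(detections.sum()))
```

Subtracting the means and dividing by the norms makes the score insensitive to the brightness and contrast of each window, which is why a fixed threshold on the response map is meaningful across image regions.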
