Recovering Human Body Configurations: Combining Segmentation and Recognition
Greg Mori, Xiaofeng Ren, and Jitendra Malik (UC Berkeley)
Alexei A. Efros (Oxford)

The goal
Given an image:
- Detect a human figure
- Localize joints and limbs
- Create a skeleton of their pose
- Create a segmentation mask of the person

Other approaches: Simple features
- Model people as generalized cylinders (1980s)
- Easily implemented bottom up
- Often use a tree to express relations
- Problems:
  - Cylinders are common
  - Often dependencies between body parts
  - Really need context

Other approaches: Probable pose
- Often use probable pose
- Template matching
- Top-down constraints on pose
- But even highly improbable poses are still possible

Other approaches: Frequent simplifications
- Nude models
- Limited poses
- Background subtraction or limited clutter

"Arguably the most difficult recognition problem in computer vision"
- Variation in clothing
- Variation in limbs
- Variation in pose

Solution: "Islands of Saliency"
- Use low-level features that are informative independent of context
- Based on these islands, one is able to fill in gaps with context

Algorithm

Algorithm: Segmenting into regions and superpixels

Segmentation
- Combine boundary finder (Martin et al., 2002) with Normalized Cuts (Malik, Belongie, et al., 2001)
- Groups similar pixels into regions

Segmentation: Regions
- 40 regions
- Most salient parts of body become regions
- Limbs usually two "half-limbs"

Segmentation: Superpixels
- 200 regions (oversegmentation)
- Retains virtually all structures in original
- Still reduces complexity from 400,000 pixels to 200 superpixels

Algorithm: Finding salient limbs and torsos

Finding limbs
- Candidates: all 40 regions
- Four cues for half-limb detection
- Contour: Probability of the boundary
  - Average probability of the region's boundary, as measured by Martin's boundary finder
- Shape: How close to a rectangle
  - Area of overlap with reconstructed rectangle

Find limbs
- Shading
  - Limbs are roughly cylindrical, so should have 3D pop out due to shading
  - Compare I_x-, I_x+, I_y-, I_y+ for region to mean of I_x-, I_x+, I_y-, I_y+ for training set
- Focus cue
  - Background is often not in focus
  - C_focus = E_high / (a E_low + b)

Finding limbs
- Cues are combined by summing
- Use logistic regression to learn weights (training set of hand-labeled half-limbs)

Evaluation: Cues
[Figure; axis labels: number of candidates generated, number of hits]

Evaluation: Performance

Evaluation summary
- Individual cues are not very good detectors
- Boundary strength is the best single cue
- Combining cues yields better performance
- On average 4.08 of top 8 candidates produced were hits
- 89% have at least 3 hits among top 8
- Motivates search for 3 half-limbs combined with head and torso

Finding torsos
- Unlike half-limbs, typically several regions
- Consider all sets of adjacent regions within some range of total sizes
- Set of cues:
  - Contour
  - Shape
  - Focus
  - (No shading)

Finding torsos
- Find orientation of torso
- Find
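
The focus cue and the cue combination from the "Finding limbs" slides can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the default values of `a` and `b`, the weights, and the bias are all hypothetical, and it assumes that E_high and E_low are scalar energy measurements for a region (the slides do not define them). The top-8 cutoff mirrors the evaluation summary.

```python
import math


def focus_cue(e_high, e_low, a=1.0, b=1.0):
    """C_focus = E_high / (a * E_low + b); a and b are placeholder constants."""
    return e_high / (a * e_low + b)


def limb_score(cues, weights, bias=0.0):
    """Weighted sum of cue values passed through the logistic function.

    In the slides the weights come from logistic regression on hand-labeled
    half-limbs; here they are simply supplied by the caller.
    """
    z = bias + sum(w * c for w, c in zip(weights, cues))
    return 1.0 / (1.0 + math.exp(-z))


def top_candidates(scores, k=8):
    """Indices of the k highest-scoring regions, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]
```

A caller would compute the four cue values (contour, shape, shading, focus) for each of the 40 regions, score each region with `limb_score`, and keep the regions returned by `top_candidates` as half-limb candidates.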