02/22/10 15-494 Cognitive Robotics

Object Recognition
15-494 Cognitive Robotics
David S. Touretzky & Ethan Tira-Thompson
Carnegie Mellon
Spring 2010

What Makes Object Recognition Hard?
● Translation invariance
● Scale invariance
● Rotation invariance (2D)
● Rotation invariance (3D)
● Occlusion
● Figure/ground segmentation (where is the object?)
● Articulated objects (limbs, scissors)

Template Matching
● Simplest possible object recognition scheme.
● Compare template pixels against image pixels at each image position.
[Figure panels: Template, Match Score]

Template Matcher

    Sketch<uint> templateMatch(const Sketch<uchar> &sketch, Sketch<uchar> &kernel,
                               int istart, int jstart, int width, int height) {
      Sketch<uint> result("templateMatch("+sketch->getName()+")", sketch);
      result->setColorMap(jetMapScaled);
      int const npix = width * height;
      // Offsets that center the template on the current image position.
      int const di = - (int)(width/2);
      int const dj = - (int)(height/2);
      for (int si=0; si<sketch.width; si++)
        for (int sj=0; sj<sketch.height; sj++) {
          int sum = 0;  // sum of squared pixel differences at (si,sj)
          for (int ki=0; ki<width; ki++)
            for (int kj=0; kj<height; kj++) {
              int k_pix = kernel(istart+ki,jstart+kj);
              if ( si+di+ki >= 0 && si+di+ki < sketch.width &&
                   sj+dj+kj >= 0 && sj+dj+kj < sketch.height ) {
                int s_pix = sketch(si+di+ki,sj+dj+kj);
                sum += (s_pix - k_pix) * (s_pix - k_pix);
              }
              else  // template pixel falls outside the image: compare against 0
                sum += k_pix * k_pix;
            }
          // Higher score means a better match (smaller RMS difference).
          result(si,sj) = uint(65535 - sqrt(sum/float(npix)));
        }
      result -= result->min();  // shift scores so the lowest score is 0
      return result;
    }

Limited Invariance Properties
[Figure panels: Original, Occluded, Rotated, Flipped, Sideways, Diagonal]

Color Histograms (Swain)
● Invariant to translation, 2D rotation, and scale.
● Handles some occlusion.
● But assumes object has already been segmented.

[Figure panels: Object Classes, Test Images]
Figure from M. A.
Stricker, http://www.cs.uchicago.edu/files/tr_authentic/TR-92-22.ps

Blocks World Vision
● One of the earliest computer vision domains.
  – Roberts (1965) used line drawings of block scenes: the first “computer vision” program.
● Simplified problem because shapes were regular.
  – Occlusions could be handled.
● Still a hard problem. No standard blocks world vision package exists.

AIBO Blocks World
● Matt Carson's senior thesis (CMU CSD, 2006).
● Goal: recover positions, orientations, and sizes of blocks.

Find the Block Faces

Find the Block From the Faces

SIFT (Lowe, 2004)
● Scale-Invariant Feature Transform
● Can recognize objects independent of scale, translation, rotation, or occlusion.
● Can segment cluttered scenes.
● Slow training, but fast recognition.

How Does SIFT Work?
● Generate large numbers of features that densely cover each training object at various scales and orientations.
● A 500 x 500 pixel image may generate 2000 stable features.
● Store these features in a library.
● For recognition, find clusters of features present in the image that agree on the object position, orientation, and scale.

SIFT Feature Generation
1) Scale-space extrema detection
   ➔ Use differences of Gaussians to find potential interest points.
2) Keypoint localization
   ➔ Fit detailed model to determine location and scale.
3) Orientation assignment
   ➔ Assign orientations based on local image gradients.
4) Keypoint descriptor
   ➔ Extract description of local gradients at selected scale.

Gaussian Smoothing

Difference of Gaussians: Edge Detection
[Figure labels: Difference of Gaussians; Zero Crossings = Edges]

Scale Space

Scale Space Extrema
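Step 1 of SIFT feature generation, scale-space extrema detection, can be sketched in a few dozen lines of C++. This is a minimal illustration under stated assumptions, not Lowe's implementation and not the Tekkotsu SIFT tool: the Image struct and the functions gaussianBlur, dogStack, and findExtrema are hypothetical names invented for this sketch. It processes a single octave with no downsampling, and it omits keypoint localization and feature filtering.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical grayscale image type for this sketch (not a Tekkotsu Sketch).
struct Image {
  int width, height;
  std::vector<float> pix;  // row-major pixel values
  Image(int w, int h) : width(w), height(h), pix(static_cast<std::size_t>(w) * h, 0.0f) {}
  // Read a pixel, clamping coordinates so border values are replicated.
  float at(int x, int y) const {
    x = std::max(0, std::min(x, width - 1));
    y = std::max(0, std::min(y, height - 1));
    return pix[static_cast<std::size_t>(y) * width + x];
  }
  float &ref(int x, int y) { return pix[static_cast<std::size_t>(y) * width + x]; }
};

// Separable Gaussian blur; the kernel is truncated at about 3 sigma.
Image gaussianBlur(const Image &src, float sigma) {
  const int radius = static_cast<int>(std::ceil(3.0f * sigma));
  std::vector<float> kernel(2 * radius + 1);
  float norm = 0.0f;
  for (int i = -radius; i <= radius; i++) {
    kernel[i + radius] = std::exp(-(i * i) / (2.0f * sigma * sigma));
    norm += kernel[i + radius];
  }
  for (float &k : kernel) k /= norm;  // kernel weights sum to 1
  Image tmp(src.width, src.height), dst(src.width, src.height);
  for (int y = 0; y < src.height; y++)  // horizontal pass
    for (int x = 0; x < src.width; x++) {
      float s = 0.0f;
      for (int i = -radius; i <= radius; i++) s += kernel[i + radius] * src.at(x + i, y);
      tmp.ref(x, y) = s;
    }
  for (int y = 0; y < src.height; y++)  // vertical pass
    for (int x = 0; x < src.width; x++) {
      float s = 0.0f;
      for (int i = -radius; i <= radius; i++) s += kernel[i + radius] * tmp.at(x, y + i);
      dst.ref(x, y) = s;
    }
  return dst;
}

// Blur at sigma0 * k^i for i = 0..levels, then subtract adjacent blurs.
// Returns `levels` difference-of-Gaussians images.
std::vector<Image> dogStack(const Image &src, float sigma0, float k, int levels) {
  std::vector<Image> blurred;
  float sigma = sigma0;
  for (int i = 0; i <= levels; i++) {
    blurred.push_back(gaussianBlur(src, sigma));
    sigma *= k;
  }
  std::vector<Image> dog;
  for (int i = 0; i < levels; i++) {
    Image d(src.width, src.height);
    for (std::size_t p = 0; p < d.pix.size(); p++)
      d.pix[p] = blurred[i + 1].pix[p] - blurred[i].pix[p];
    dog.push_back(d);
  }
  return dog;
}

struct Extremum { int x, y, level; };

// A point is a candidate if it is strictly larger or strictly smaller than
// all 26 neighbors in position and scale, and its magnitude reaches threshold.
std::vector<Extremum> findExtrema(const std::vector<Image> &dog, float threshold) {
  std::vector<Extremum> out;
  for (int s = 1; s + 1 < static_cast<int>(dog.size()); s++)
    for (int y = 1; y + 1 < dog[s].height; y++)
      for (int x = 1; x + 1 < dog[s].width; x++) {
        const float v = dog[s].at(x, y);
        if (std::fabs(v) < threshold) continue;
        bool isMax = true, isMin = true;
        for (int ds = -1; ds <= 1; ds++)
          for (int dy = -1; dy <= 1; dy++)
            for (int dx = -1; dx <= 1; dx++) {
              if (ds == 0 && dy == 0 && dx == 0) continue;
              const float n = dog[s + ds].at(x + dx, y + dy);
              if (n >= v) isMax = false;
              if (n <= v) isMin = false;
            }
        if (isMax || isMin) out.push_back({x, y, s});
      }
  return out;
}
```

A call such as findExtrema(dogStack(img, 1.0f, std::sqrt(2.0f), 5), 0.03f) returns candidate interest points for an image with values in [0, 1]; the parameter values here are illustrative choices, not values taken from the slides. For a bright blob, the extremum appears at the blob center, at the level whose blur best matches the blob size, which is what lets object scale be read off the feature.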
Filtering the Features

Keypoint Descriptors

Real-Time SIFT Example
Fred Birchmore used SIFT to recognize soda cans.
http://eyecanseecan.blogspot.com/
See demo videos on his blog.

SIFT in Tekkotsu
● Lionel Heng did a SIFT implementation as his class project in 2006.
● Xinghao Pan implemented a SIFT tool for Tekkotsu:
  – Allow users to construct libraries of objects
  – Each object has a collection of representative images
  – User can control which SIFT features to use for matching
  – Java GUI provides for easy management of the library
● How to integrate SIFT with the dual coding system?
  – Object scale can be used to estimate distance
  – Match in camera space must be converted to local space

SIFT Tool

Object Recognition in the Brain
● Mishkin & Ungerleider: dual visual pathways.
  – The dorsal, “where” pathway lies in parietal cortex.
  – The ventral, “what” pathway lies in temporal cortex.
  – Lesions to these areas yield very specific effects.

The Macaque “Vision Pipeline”
DJ Felleman and DC Van Essen (1991), Cerebral Cortex 1:1-47.
RGC = retinal ganglion cells

Serre & Poggio (PAMI 2007): Model Based on Temporal Cortex

To Learn More About Computer and Biological Vision
● Take Tai Sing Lee's Computer Vision class, 15-385.
● Take Tai Sing Lee's Computational Neuroscience class, 15-490.
● There are many books on this subject. One of the classics is “Vision” by David