CMU 15494 Cognitive Robotics - Object Recognition


02/22/10 15-494 Cognitive Robotics

Object Recognition
15-494 Cognitive Robotics
David S. Touretzky & Ethan Tira-Thompson
Carnegie Mellon
Spring 2010

What Makes Object Recognition Hard?
● Translation invariance
● Scale invariance
● Rotation invariance (2D)
● Rotation invariance (3D)
● Occlusion
● Figure/ground segmentation (where is the object?)
● Articulated objects (limbs, scissors)

Template Matching
● Simplest possible object recognition scheme.
● Compare template pixels against image pixels at each image position.
[Figure panels: Template; Match Score]

Template Matcher

    // Slide a width x height template, taken from kernel at (istart, jstart),
    // over every pixel of sketch and score the match at each position.
    Sketch<uint> templateMatch(const Sketch<uchar> &sketch, Sketch<uchar> &kernel,
                               int istart, int jstart, int width, int height) {
      Sketch<uint> result("templateMatch("+sketch->getName()+")", sketch);
      result->setColorMap(jetMapScaled);
      int const npix = width * height;
      // Offsets that center the template on the current image pixel.
      int const di = - (int)(width/2);
      int const dj = - (int)(height/2);
      for (int si=0; si<sketch.width; si++)
        for (int sj=0; sj<sketch.height; sj++) {
          int sum = 0;  // sum of squared differences at this position
          for (int ki=0; ki<width; ki++)
            for (int kj=0; kj<height; kj++) {
              int k_pix = kernel(istart+ki, jstart+kj);
              if ( si+di+ki >= 0 && si+di+ki < sketch.width &&
                   sj+dj+kj >= 0 && sj+dj+kj < sketch.height ) {
                int s_pix = sketch(si+di+ki, sj+dj+kj);
                sum += (s_pix - k_pix) * (s_pix - k_pix);
              }
              else
                sum += k_pix * k_pix;  // outside the image: treat the image pixel as 0
            }
          // Higher score means a better match (lower RMS difference).
          result(si,sj) = uint(65535 - sqrt(sum/float(npix)));
        }
      result -= result->min();  // shift scores so the minimum is 0
      return result;
    }

Limited Invariance Properties
[Figure panels: Original, Occluded, Rotated, Flipped, Sideways, Diagonal]

Color Histograms (Swain)
● Invariant to translation, 2D rotation, and scale.
● Handles some occlusion.
● But assumes object has already been segmented.
[Figure panels: Object Classes; Test Images. Figure from M. A. Stricker, http://www.cs.uchicago.edu/files/tr_authentic/TR-92-22.ps]

Blocks World Vision
● One of the earliest computer vision domains.
  – Roberts (1965) used line drawings of block scenes: the first “computer vision” program.
● Simplified problem because shapes were regular.
  – Occlusions could be handled.
● Still a hard problem. No standard blocks world vision package exists.

AIBO Blocks World
● Matt Carson's senior thesis (CMU CSD, 2006).
● Goal: recover positions, orientations, and sizes of blocks.

Find the Block Faces
[Figure]

Find the Block From the Faces
[Figure]

SIFT (Lowe, 2004)
● Scale-Invariant Feature Transform
● Can recognize objects independent of scale, translation, rotation, or occlusion.
● Can segment cluttered scenes.
● Slow training, but fast recognition.

How Does SIFT Work?
● Generate large numbers of features that densely cover each training object at various scales and orientations.
● A 500 x 500 pixel image may generate 2000 stable features.
● Store these features in a library.
● For recognition, find clusters of features present in the image that agree on the object position, orientation, and scale.

SIFT Feature Generation
1) Scale-space extrema detection
   ➔ Use differences of Gaussians to find potential interest points.
2) Keypoint localization
   ➔ Fit detailed model to determine location and scale.
3) Orientation assignment
   ➔ Assign orientations based on local image gradients.
4) Keypoint descriptor
   ➔ Extract description of local gradients at selected scale.

Gaussian Smoothing
[Figure]

Difference of Gaussians: Edge Detection
[Figure labels: Difference of Gaussians; Zero Crossings = Edges]

Scale Space
[Figure]

Scale Space Extrema
[Figure]

Filtering the Features
[Figure]

Keypoint Descriptors
[Figure]

Real-Time SIFT Example
Fred Birchmore used SIFT to recognize soda cans.
http://eyecanseecan.blogspot.com/
See demo videos on his blog.

SIFT in Tekkotsu
● Lionel Heng did a SIFT implementation as his class project in 2006.
● Xinghao Pan implemented a SIFT tool for Tekkotsu:
  – Allows users to construct libraries of objects
  – Each object has a collection of representative images
  – User can control which SIFT features to use for matching
  – Java GUI provides for easy management of the library
● How to integrate SIFT with the dual coding system?
  – Object scale can be used to estimate distance
  – Match in camera space must be converted to local space

SIFT Tool
[Figure]

Object Recognition in the Brain
● Mishkin & Ungerleider: dual visual pathways.
  – The dorsal, “where” pathway lies in parietal cortex.
  – The ventral, “what” pathway lies in temporal cortex.
  – Lesions to these areas yield very specific effects.

The Macaque “Vision Pipeline”
[Figure: DJ Felleman and DC Van Essen (1991), Cerebral Cortex 1:1-47. RGC = retinal ganglion cells]

Serre & Poggio (PAMI 2007): Model Based on Temporal Cortex
[Figure]

To Learn More About Computer and Biological Vision
● Take Tai Sing Lee's Computer Vision class, 15-385.
● Take Tai Sing Lee's Computational Neuroscience class, 15-490.
● There are many books on this subject. One of the classics is “Vision” by David
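
To make the Color Histograms (Swain) slide more concrete, here is a minimal sketch of matching by color histogram. It assumes the comparison is histogram intersection, the measure associated with Swain and Ballard's color indexing work; the slides themselves do not name the measure. The Rgb type, the quantization of 4 bins per channel, and both function names are hypothetical choices for illustration and are not Tekkotsu APIs.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel type for this sketch (not a Tekkotsu type).
struct Rgb { std::uint8_t r, g, b; };

constexpr int kBinsPerChannel = 4;  // assumed quantization, chosen for brevity
constexpr int kNumBins = kBinsPerChannel * kBinsPerChannel * kBinsPerChannel;
using Histogram = std::array<double, kNumBins>;

// Build a normalized color histogram from the pixels of an object that has
// already been segmented from the background.
Histogram colorHistogram(const std::vector<Rgb>& pixels) {
    assert(!pixels.empty());
    Histogram h{};  // all bins start at zero
    for (const Rgb& p : pixels) {
        // Map each 0..255 channel value to a bin index 0..kBinsPerChannel-1.
        const int r = p.r * kBinsPerChannel / 256;
        const int g = p.g * kBinsPerChannel / 256;
        const int b = p.b * kBinsPerChannel / 256;
        h[(r * kBinsPerChannel + g) * kBinsPerChannel + b] += 1.0;
    }
    // Normalize by pixel count so the histogram does not depend on object size.
    for (double& v : h) v /= static_cast<double>(pixels.size());
    return h;
}

// Histogram intersection: the sum of bin-wise minima. For two normalized
// histograms the score lies between 0 (no shared colors) and 1 (identical).
double histogramIntersection(const Histogram& image, const Histogram& model) {
    double score = 0.0;
    for (std::size_t i = 0; i < image.size(); ++i)
        score += std::min(image[i], model[i]);
    return score;
}
```

Pixel positions never enter the histogram, so moving or rotating the object in the image plane leaves it unchanged, and normalizing by pixel count removes the dependence on apparent size. The input must contain only object pixels, which is why the slide notes that segmentation is assumed.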

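The Gaussian Smoothing and Difference of Gaussians: Edge Detection slides can be illustrated with a one-dimensional sketch: smooth a signal at two widths, subtract, and look for sign changes. This is a simplified illustration under stated assumptions, not the SIFT detector itself, which searches for extrema across a 2D scale space. The function names, the 3-sigma kernel radius, the border handling, and the default ratio of 1.6 between the two widths are all arbitrary choices made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sampled 1-D Gaussian kernel with radius ceil(3*sigma), normalized to sum to 1.
std::vector<double> gaussianKernel(double sigma) {
    assert(sigma > 0.0);
    const int radius = static_cast<int>(std::ceil(3.0 * sigma));
    std::vector<double> k(2 * radius + 1);
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) {
        k[i + radius] = std::exp(-(i * i) / (2.0 * sigma * sigma));
        sum += k[i + radius];
    }
    for (double& v : k) v /= sum;
    return k;
}

// Convolve the signal with a Gaussian, replicating the end samples at the borders.
std::vector<double> smooth(const std::vector<double>& signal, double sigma) {
    const std::vector<double> k = gaussianKernel(sigma);
    const int radius = static_cast<int>(k.size() / 2);
    const int n = static_cast<int>(signal.size());
    std::vector<double> out(signal.size(), 0.0);
    for (int x = 0; x < n; ++x)
        for (int i = -radius; i <= radius; ++i) {
            const int xi = std::min(std::max(x + i, 0), n - 1);
            out[x] += k[i + radius] * signal[xi];
        }
    return out;
}

// Difference of Gaussians: the wider blur minus the narrower blur.
std::vector<double> differenceOfGaussians(const std::vector<double>& signal,
                                          double sigma, double ratio = 1.6) {
    assert(ratio > 1.0);
    const std::vector<double> narrow = smooth(signal, sigma);
    const std::vector<double> wide = smooth(signal, ratio * sigma);
    std::vector<double> dog(signal.size());
    for (std::size_t i = 0; i < signal.size(); ++i) dog[i] = wide[i] - narrow[i];
    return dog;
}

// Indices i where the response changes sign between samples i and i+1.
// Values within eps of zero are ignored so rounding noise is not reported.
std::vector<std::size_t> zeroCrossings(const std::vector<double>& dog,
                                       double eps = 1e-9) {
    std::vector<std::size_t> crossings;
    for (std::size_t i = 0; i + 1 < dog.size(); ++i) {
        const bool posToNeg = dog[i] > eps && dog[i + 1] < -eps;
        const bool negToPos = dog[i] < -eps && dog[i + 1] > eps;
        if (posToNeg || negToPos) crossings.push_back(i);
    }
    return crossings;
}
```

On a step edge, the wider blur spreads the transition further than the narrower one, so the difference is positive on one side of the step and negative on the other. The sign change marks the edge, which is the "Zero Crossings = Edges" idea on the slide.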

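The SIFT in Tekkotsu slide notes that object scale can be used to estimate distance and that a camera-space match must be converted to local space. The sketch below shows one way this could work under a simple pinhole camera model. It is not the Tekkotsu implementation: the LocalPoint type and both functions are hypothetical, and the code assumes the object is viewed roughly head-on, the training image was taken at a known distance, and the camera is level, forward-facing, and located at the origin of the local frame.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical robot-centered ground-plane point (not a Tekkotsu type).
struct LocalPoint {
    double forward;  // distance ahead of the camera, same units as the depth
    double left;     // lateral offset, positive to the left
};

// Under a pinhole model, apparent size is inversely proportional to depth.
// matchScale is (size in current image) / (size in training image), so an
// object that appears half as large is twice as far away.
double distanceFromScale(double trainingDistance, double matchScale) {
    assert(trainingDistance > 0.0 && matchScale > 0.0);
    return trainingDistance / matchScale;
}

// Convert a match at image column u, with known depth along the optical
// axis, into a point in the assumed local frame.
LocalPoint cameraToLocal(double depth, double u, double imageWidth,
                         double horizontalFovRadians) {
    assert(imageWidth > 0.0);
    assert(horizontalFovRadians > 0.0 && horizontalFovRadians < 3.14159);
    // Focal length in pixels, derived from the horizontal field of view.
    const double focalPixels =
        (imageWidth / 2.0) / std::tan(horizontalFovRadians / 2.0);
    // Columns left of the image center map to positive lateral offsets.
    const double left = depth * (imageWidth / 2.0 - u) / focalPixels;
    return {depth, left};
}
```

For example, an object trained at 1 m that matches at scale 0.5 would be placed at about 2 m. On a real robot the camera pose relative to the body would also have to be applied, which this sketch leaves out.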