New version page

Efficient Shape Matching Using Shape Contexts

This preview shows page 1-2 out of 6 pages.

View Full Document
View Full Document

End of preview. Want to read all 6 pages?

Upload your study docs or become a GradeBuddy member to access this document.

View Full Document
Unformatted text preview:

Efficient Shape MatchingUsing Shape ContextsGreg Mori, Member, IEEE,Serge Belongie, Member, IEEE,andJitendra Malik, Senior Member, IEEEAbstract—We demonstrate that shape contexts can be used to quickly prune asearch for similar shapes. We present two algorithms for rapid shape retrieval:representative shape contexts, performing comparisons based on a small numberof shape contexts, and shapemes, using vector quantization in the space of shapecontexts to obtain prototypical shape pieces.Index Terms—Shape, object recognition, optical character recognition.æ1INTRODUCTIONWE are interested in the use of shape for recognizing 3D objects,represented by a collection of multiple 2D views. A satisfactorytheory of shape representation would have a number of desirableattributes:1. It should support recognition based on exquisitely finedifferences, e.g., distinguishing faces of twins.2. At the same time, it should support making coarsediscriminations very quickly. Thorpe et al. [1] showedthat people, when presented with an image, can answercoarse queries such as the presence or absence of an animalin as little as 150ms.3. The approach should scale to deal with a large number ofobjects. Biederman [2] has argued that humans candistinguish on the order of 30,000 different objects.4. It should be possible to acquire a representation of anobject category from relatively few examples, i.e., thereshould be a good generalization ability.In this paper, we further develop an approach based on therepresentation of shape contexts, introduced in Belongie et al. [3],which arguably satisfies criteria 1 and 4 above while 3 is yet only adistant possibility. The techniques we develop here can be used for2 and attempt to address the issues involved in 3, scaling to largenumbers of objects.The basic idea of shape contexts is illustrated in Fig. 1. A shapeis represented by a discrete set of points sampled from the internaland external contours on the shape. These can be obtained aslocations of edge pixels as found by an edge detector, giving us aset IP ¼fp1; ...;png, pi2 IR2,ofn points. Consider the set ofvectors originating from a point to all other sample points on ashape. These n  1 vectors express the configuration of the entireshape relative to the reference point. One way to capture thisinformation is as the distribution of the relative positions of theremaining n  1 points in a spatial histogram. Concretely, for apoint pion the shape, compute a coarse histogram hiof the relativecoordinates of the remaining n  1 points,hki¼ # q 6¼ pi: ðq  piÞ2binðkÞfg:This histogram is defined to be the shape context of pi. We use binsthat are uniform in log-polar space, making the descriptor moresensitive to positions of nearby sample points than to those ofpoints farther away. In the absence of background clutter, theshape context of a point on a shape can be made invariant underuniform scaling of the shape as a whole. This is accomplished bynormalizing all radial distances by the mean distance  betweenthe n2point pairs in the shape.As illustrated in Fig. 1, shape contexts will be different fordifferent points on a single shape S; however, corresponding(homologous) points on similar shapes S and S0will tend to havesimilar shape contexts. By construction, the shape context at agiven point on a shape is invariant under translation and scaling.Shape contexts are not invariant under arbitrary affine transforms,but the log-polar binning ensures that, for small locally affinedistortions due to pose change, intracategory, variation, etc., thechange in the shape context is correspondingly small. In addition,since the shape context descriptor gathers coarse information fromthe entire shape, it is relatively insensitive to the occlusion of anyparticular part.In contrast to the original work [3] on shape contexts, whichused the 2distance to compare shape contexts, we now treat themas feature vectors and compare them using the L2-norm. Theresults using the L2-norm are comparable to those using the2distance and the L2-norm is marginally faster to compute.We turn now to the use of shape contexts as part of a theory ofobject recognition based on shape matching. As stated earlier, it isdesirable for such a theory to support both accurate finediscrimination, as well as rapid coarse discrimination. Thissuggests a two stage approach to shape matching, namely:1. Fast pruning: Given an unknown 2D query shape, we shouldbe able to quickly retrieve a small set of likely candidateshapes from a potentially very large collection of storedshapes. The present paper will introduce two algorithms forthis problem.2. Detailed matching: Once we have a small set of candidateshapes, we can perform a more expensive and moreaccurate matching procedure to find the best matchingshape to the query shape.In this wo rk, we will not address the problem of scaleestimation. Shapes will be presented in a setting that allows forsimple estimation of scale via the mean distance between points ona shape. In a natural setting, multiscale search could be performed,or scale-invariant interest point detection or segmentation could beused to estimate scale.The thrust of this paper is in Section 4, where we developtwo different algorithms for fast pruning based on shape contexts,resulting in a short list of likely candidate shapes to be evaluatedlater by a more accurate and expensive procedure [3]. This ispreceded by Section 2 on past work and Section 3 on the structureof our matching framework. In Section 5, we show experimentalresults on the ETH-80 object database [4], the Snodgrass andVanderwart drawings [5], and the EZ-Gimpy CAPTCHA [6]. Weconclude in Section 6.2PAST WORKPast work on object recognition has developed the use of two majorcues:appearanceandshape.Thefirstgroupofwork,onappearance-based recognition, makes direct use of pixel brightnessvalues. The work of Turk and Pentland [7] is a prime example of1832 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 27, NO. 11, NOVEMBER 2005. G. Mori is with the School of Computing Science, Simon Fraser University,Burnaby, BC, Canada V5A 1S6. E-mail: [email protected] S. Belongie is with the Department of Computer Science and Engineering,University of California at San Diego, La Jolla, CA 92093.E-mail: [email protected] J. Malik is with the Electrical Engineering and Computer Science Division,University of California at Berkeley, Berkeley, CA 94720.E-mail:


Loading Unlocking...
Login

Join to view Efficient Shape Matching Using Shape Contexts and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Efficient Shape Matching Using Shape Contexts and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?