UCF CAP 5937 - An Image-Based Trainable Symbol Recognizer for Sketch-Based Interfaces


An Image-Based Trainable Symbol Recognizer for Sketch-Based Interfaces

Levent Burak Kara
Mechanical Engineering Department
Carnegie Mellon University
Pittsburgh, Pennsylvania
[email protected]

Thomas F. Stahovich
Mechanical Engineering Department
University of California, Riverside
Riverside, California
[email protected]

Abstract

We describe a trainable, hand-drawn symbol recognizer based on a multi-layer recognition scheme. Symbols are internally represented as binary templates. An ensemble of four template classifiers ranks each definition according to similarity with an unknown symbol. Scores from the individual classifiers are then aggregated to determine the best definition for the unknown. Ordinarily, template matching is sensitive to rotation, and existing solutions for rotation invariance are too expensive for interactive use. We have developed an efficient technique for achieving rotation invariance based on polar coordinates. This technique also filters out the bulk of unlikely definitions, thereby simplifying the task of the multi-classifier recognition step.

Introduction

A long-standing challenge in pen-based interaction concerns symbol recognition, the task of recognizing individual hand-drawn figures such as geometric shapes, glyphs, and symbols. While there has been significant recent progress in symbol recognition (Rubine 1991; Fonseca, Pimentel, & Jorge 2002; Matsakis 1999; Hammond & Davis 2003), many recognizers are either hard-coded or require large sets of training data to reliably learn new symbol definitions. Such issues make it difficult to extend these systems to new domains with novel shapes and symbols. The work presented here is focused on the development of a trainable symbol recognizer that provides (1) interactive performance, (2) easy extensibility to new shapes, and (3) fast training capabilities.

Our recognizer uses an image-based recognition approach. This approach has a number of desirable characteristics.
First, segmentation – the process of decomposing the sketch into constituent primitives such as lines and curves – is eliminated entirely. Second, our system is well suited for recognizing “sketchy” symbols such as those shown in Figure 1. Lastly, multiple pen strokes or different drawing orders do not pose difficulty. Many of the existing recognition approaches have either relied on single-stroke methods, in which an entire symbol must be drawn in a single pen stroke (Rubine 1991; Kimura, Apte, & Sengupta 1994), or constant drawing order methods, in which two similarly shaped patterns are considered different unless the pen strokes leading to those shapes follow the same sequence (Ozer et al. 2001; Yasuda, Takahashi, & Matsumoto 2000).

Copyright © 2004, UCR Smart Tools Lab. All rights reserved.

Figure 1: Examples of symbols correctly recognized by our system. The top row shows symbols used in training and the bottom row shows correctly recognized test symbols. At the time of the test, the database contained 104 definition symbols.

Unlike many traditional methods, our shape recognizer can learn new symbol definitions from a single prototype example. Because only one example is needed, users can seamlessly train new symbols, and remove or overwrite existing ones on the fly, without having to leave the main application. This makes it easy for users to extend and customize their symbol libraries. To increase the flexibility of a definition, the user can provide additional examples of a symbol.

Ordinarily, template matching is sensitive to rotation, and existing solutions for rotation invariance are too expensive for interactive use. We have developed an efficient technique for rotation invariance based on a novel polar coordinate analysis. The unknown symbol is transformed into a polar coordinate representation, which allows the program to efficiently determine which orientation of the unknown best matches a given definition.
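The key property being exploited here is that a rotation of the symbol about its centroid becomes a simple angular offset in polar coordinates. The following sketch illustrates that idea under assumptions of our own: the helper names (`to_polar`, `best_rotation`), the mean-squared scoring function, the fixed 10-degree scan, and the point-by-point correspondence between the two strokes are all hypothetical simplifications, not the paper's actual algorithm.

```python
import math

def to_polar(points):
    """Convert (x, y) stroke points to (r, theta) about the symbol's centroid.

    Hypothetical sketch: in polar coordinates a rotation of the symbol
    becomes a constant offset added to every theta, so candidate
    orientations can be scanned cheaply without re-rendering the image.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx))
            for x, y in points]

def best_rotation(unknown, definition, steps=36):
    """Scan candidate angular offsets (here, 10-degree increments) and
    return the offset that best aligns the unknown with the definition,
    using a mean-squared distance over corresponding polar samples."""
    u, d = to_polar(unknown), to_polar(definition)
    n = min(len(u), len(d))

    def score(offset):
        total = 0.0
        for (ru, tu), (rd, td) in zip(u[:n], d[:n]):
            # Wrap the angular difference into (-pi, pi].
            dt = (tu + offset - td + math.pi) % (2 * math.pi) - math.pi
            total += (ru - rd) ** 2 + dt ** 2
        return total / n

    offsets = [2 * math.pi * k / steps for k in range(steps)]
    return min(offsets, key=score)
```

The minimum score over offsets could also serve as the pruning criterion the paper describes: definitions whose best score is still poor are dropped before the more expensive screen-coordinate analysis.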
During this process, definitions that are found to be markedly dissimilar to the unknown are pruned away, and the remaining ones are kept for further analysis. In a second step, recognition switches to screen coordinates, where the surviving definitions are analyzed in more detail using an ensemble of four different classifiers. Each classifier produces a list of definitions ranked according to their similarity to the unknown. In the final step of recognition, the results of the individual classifiers are pooled together to produce the recognizer’s final decision.

Figure 2: Examples of symbol templates: a mechanical pivot, the letter ‘a’, and the digit ‘8’. The examples are shown on 24x24 templates to better illustrate the quantization.

The analysis in polar coordinates precedes the analysis in screen coordinates. However, for the sake of presentation, we have found it useful to begin the discussion with our template representation and the four template matching techniques, since some of those concepts are necessary to set the context for the analysis in polar coordinates.

Template Matching

Symbols are drawn using a 9 x 12 Wacom Intuos2 digitizing tablet and a cordless stylus. Data points are collected as time-sequenced (x, y) coordinates sampled along the stylus’s trajectory. There is no restriction on the number of strokes, and symbols can be drawn anywhere on the tablet, at any size and orientation.

Input symbols are internally described as 48x48 quantized bitmap images which we call “templates” (Figure 2). This quantization significantly reduces the amount of data to consider while preserving the patterns’ distinguishing characteristics. The template representation preserves the original aspect ratio so that one can distinguish between, say, a circle and an ellipse.

During recognition, the template of the unknown is matched against the templates in the database of definitions. We use four different methods to evaluate the match between a pair of templates.
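The quantization step described above can be sketched as follows. This is a minimal illustration under our own assumptions: the function name `rasterize` is hypothetical, and the real system presumably also interpolates between successive pen samples so that fast strokes leave no gaps. The larger of the two bounding-box dimensions sets the scale, which is what preserves the aspect ratio.

```python
def rasterize(points, size=48):
    """Quantize (x, y) stroke samples into a size x size binary template.

    Hypothetical sketch of the paper's template representation. Scaling
    by the larger bounding-box dimension preserves the aspect ratio, so
    a circle and an ellipse produce visibly different templates.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x0, y0 = min(xs), min(ys)
    span = max(max(xs) - x0, max(ys) - y0) or 1.0
    scale = (size - 1) / span
    grid = [[0] * size for _ in range(size)]
    for x, y in points:
        # Clamp to guard against floating-point rounding at the edge.
        col = min(int((x - x0) * scale), size - 1)
        row = min(int((y - y0) * scale), size - 1)
        grid[row][col] = 1
    return grid
```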
The first two methods are based on the Hausdorff distance, which measures the dissimilarity between two point sets. Hausdorff-based methods have been successfully applied to object detection in complex scenes (Rucklidge 1996; Sim, Kwon, & Park 1999), but only a few researchers have recently employed them for hand-drawn pattern recognition (Cheung, Yeung, & Chin 2002; Miller, Matsakis, & Viola 2000). Our other two recognition methods are based on the Tanimoto and Yule coefficients. The Tanimoto coefficient is extensively used in chemical informatics such as drug testing, where the goal is to identify
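For reference, the two families of measures named above can be sketched as follows. These are the textbook definitions, not the paper's exact variants (the paper uses four classifiers, and Hausdorff-based recognizers typically use modified or ranked versions to resist outliers); the black-pixel form of the Tanimoto coefficient shown here is a common convention we are assuming.

```python
import math

def directed_hausdorff(a, b):
    """h(A, B): the worst-case distance from a point of A to its nearest
    neighbor in B."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance: max of the two directed distances.
    Small values mean the point sets nearly overlap."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def tanimoto(t1, t2):
    """Tanimoto coefficient between two binary templates: black pixels
    shared by both, divided by black pixels present in either."""
    both = sum(p and q for r1, r2 in zip(t1, t2) for p, q in zip(r1, r2))
    either = sum(p or q for r1, r2 in zip(t1, t2) for p, q in zip(r1, r2))
    return both / either if either else 1.0
```

Note the complementary behavior: Hausdorff operates on point coordinates and penalizes the single worst mismatch, while Tanimoto operates on the quantized bitmaps and rewards overall pixel overlap, which is one plausible reason for combining both kinds of measure in an ensemble.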

