Fall 2007 CAP 5937 – Topics in Pen-Based User Interfaces ©Joseph J. LaViola Jr.

Ink Preprocessing and Preparation
Lecture #5: Preparing Ink
Joseph J. LaViola Jr.
Fall 2007

Recall Pen-Based Interface Dataflow
[Figure: dataflow diagram with stages Raw Stroke Data; Preprocessing; Segmentation; Feature Extraction and Analysis; Classification; Ink Parsing; Sketch Understanding; Make Inferences]

Representing Data
- Points and strokes
  $$s = p_1 p_2 \ldots p_n, \quad \text{where } p_i = (x_i, y_i, t_i),\ 1 \le i \le n$$
  $$S = s_1 s_2 \ldots s_m$$
- Image pixel matrix
  - not as popular
[Figure: a stroke with endpoints labeled (x1,y1,t1) and (xn,yn,tn)]

Preprocessing
- Often required to clean raw data
- Stroke invariance
  - scale
  - position
  - orientation
  - slant/skew
  - order/direction
- Filtering and smoothing
- Dehooking
[Figure: normal view of stroke; zoomed-in view of stroke showing unwanted cusps and self-intersections]

Scale Invariance
- Why?
  - want to ensure stroke has a canonical representation so its size makes no difference in recognition
- Approach
  - define constant width or height
  - scale stroke maintaining aspect ratio
  - choose constant width or height based on stroke

Translation Invariance
- Why?
  - want to ensure stroke has canonical representation so its position makes no difference in recognition
- Approach
  - translate stroke to origin
  - use stroke bounding box
  - possible translation points
    - top left point
    - center point

Rotation Invariance
- Primarily used for handwriting (sometimes for shapes)
- Why?
  - want to remove baseline drift, which could affect recognition
- Baseline drift: deviation between baseline and horizontal axis
- Difficult problem to deal with
  - ambiguous baseline locations
- One approach (Guerfali and Plamondon 1993)
  - uses center of mass of word regions
  - least squares for baseline construction

Slant/Skew Invariance
- Important in handwriting recognition
- Handwriting slant: deviation between the principal axis of strokes and vertical axis
- Often referred to as deskewing process
- Why?
  - can be important for segmentation
- Difficult problem: very subjective
- One approach (Guerfali and Plamondon 1993)
  - zone extraction
  - observation windows
  - local and global slants

Stroke Direction and Ordering Invariance
- Can be large variation in ways a symbol is drawn
  - order of strokes
  - direction of strokes
- Possible approach is to model each possible combination
  - combinatorially expensive
  - could hurt recognition accuracy
- Want to assign canonical ordering and direction
  - see Matsakis (1999)

Stroke Invariance Summary
- Want to have canonical representation
- Makes calculating features easier
- Makes recognition easier

Resampling
- Why?
  - sometimes we want to have all strokes have the same number of points
  - helps deal with some recognition algorithms
- Approach
  - linear interpolation between points

Filtering and Smoothing
- Remove duplicate points
- Remove unwanted cusps and self-intersections
- Thinning: reduce points
- Dot reduction: reduce dots to single point
- Stroke connection: deal with extraneous pen lifts (e.g., stroke segmentation)
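
The translation, scale, and resampling steps from the slides above might be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names `normalize_stroke` and `resample_stroke` are hypothetical, points are assumed to be `(x, y, t)` tuples in screen coordinates (y grows downward, so the bounding-box top-left is the minimum x and y), and spacing the resampled points evenly along arc length is one possible choice that the slides do not specify.

```python
from math import hypot


def normalize_stroke(stroke, target_height=1.0):
    """Move the bounding-box top-left corner to the origin and scale the
    stroke to a constant height while keeping its aspect ratio."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    min_x, min_y = min(xs), min(ys)
    width = max(xs) - min_x
    height = max(ys) - min_y
    # Choose the reference extent based on the stroke: height normally,
    # width for a flat horizontal stroke, 1.0 for a single dot.
    extent = height if height > 0 else (width if width > 0 else 1.0)
    return [((x - min_x) / extent * target_height,
             (y - min_y) / extent * target_height,
             t) for x, y, t in stroke]


def resample_stroke(stroke, n):
    """Return n points spaced evenly along the stroke's arc length,
    linearly interpolating x, y, and t between the original points."""
    if n < 2 or len(stroke) < 2:
        raise ValueError("need n >= 2 and at least two input points")
    # Cumulative arc length at each original point.
    cum = [0.0]
    for (x0, y0, _), (x1, y1, _) in zip(stroke, stroke[1:]):
        cum.append(cum[-1] + hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0:
        return [stroke[0]] * n  # all points coincide
    out = []
    seg = 0  # index of the segment containing the current target length
    for i in range(n):
        target = total * i / (n - 1)
        while seg < len(stroke) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        a = 0.0 if span == 0 else min(1.0, (target - cum[seg]) / span)
        p, q = stroke[seg], stroke[seg + 1]
        out.append(tuple(u + a * (v - u) for u, v in zip(p, q)))
    return out
```

For example, `resample_stroke(normalize_stroke(stroke), 32)` would give every stroke the same height and the same number of points, which is the kind of canonical form some recognizers expect.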

Gaussian Smoothing
  $$p_i^{\mathrm{filt}} = \sum_{j=-3\sigma}^{3\sigma} w_j \, p_{i+j}, \qquad w_j = \frac{e^{-j^2 / 2\sigma^2}}{\sum_{k=-3\sigma}^{3\sigma} e^{-k^2 / 2\sigma^2}}$$
- σ is a scaling parameter
- Should try to maintain cusps when filtering

A Filtering Algorithm
[Figure: filtering algorithm]

Dehooking
- Want to eliminate hooks that can occur at the end of strokes (sometimes at the beginning)
- Hooks come from
  - inaccuracies in pen-down detection
  - rapid and erratic stylus motion
- Hooks vary depending on user and on stroke

A Dehooking Algorithm
[Figure: dehooking algorithm]

Dehooking Algorithm Cont'd
[Figure: dehooking algorithm, continued]

Next Class – Discussion
- Assignment 1: due tomorrow
- Assignment 2: out tomorrow
- Readings
  - Guerfali, Wacef and Réjean Plamondon. Normalizing and Restoring On-Line Handwriting. Pattern Recognition, 26(3):419-431, March 1993.
  - Sezgin, Tevfik Metin. Feature Point Detection and Curve Approximation for Early Processing of Free-Hand Sketches. Master's thesis, Department of EECS, MIT, May 2001.
  - Matsakis, Nicholas. Recognition of Mathematical Expressions. Master's thesis, MIT, pages 21-28.
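
The Gaussian smoothing formula from this lecture, with weights normalized over the window from -3σ to 3σ, can be sketched in a few lines. This is a hedged illustration only: the function name `gaussian_smooth` is hypothetical, σ is assumed to be a positive integer so the window limits are whole indices, and clamping indices at the stroke ends is an assumption, since the slides do not say how boundaries are handled. The sketch also makes no attempt to maintain cusps, which the slides note a practical filter should do.

```python
from math import exp


def gaussian_smooth(points, sigma):
    """Smooth a list of (x, y) points with a Gaussian window of half-width
    3 * sigma, where sigma is a positive integer scaling parameter."""
    r = 3 * sigma
    # Unnormalized weights e^(-j^2 / (2 sigma^2)) for j = -3*sigma .. 3*sigma.
    raw = [exp(-(j * j) / (2.0 * sigma * sigma)) for j in range(-r, r + 1)]
    norm = sum(raw)
    w = [v / norm for v in raw]  # normalized weights w_j, summing to 1
    n = len(points)
    out = []
    for i in range(n):
        sx = sy = 0.0
        for j in range(-r, r + 1):
            # Assumption: indices past either end reuse the end point.
            idx = min(max(i + j, 0), n - 1)
            sx += w[j + r] * points[idx][0]
            sy += w[j + r] * points[idx][1]
        out.append((sx, sy))
    return out
```

Because the weights are symmetric and sum to one, points on a straight segment far enough from the stroke ends are left where they are, while jitter on a scale smaller than σ is averaged out.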