UHCL CSCI 5931 - The Gesture and Activity Recognition Toolkit

GART: The Gesture and Activity Recognition Toolkit

Kent Lyons, Helene Brashear, Tracy Westeyn, Jung Soo Kim, and Thad Starner
College of Computing and GVU Center
Georgia Institute of Technology
Atlanta, GA 30332-0280 USA
{kent, brashear, turtle, jszzang, thad}@cc.gatech.edu

Abstract. The Gesture and Activity Recognition Toolkit (GART) is a user interface toolkit designed to enable the development of gesture-based applications. GART provides an abstraction to machine learning algorithms suitable for modeling and recognizing different types of gestures. The toolkit also provides support for data collection and the training process. In this paper, we present GART and its machine learning abstractions. Furthermore, we detail the components of the toolkit and present two example gesture recognition applications.

Key words: Gesture recognition, user interface toolkit

1 Introduction

Gestures are a natural part of our everyday life. As we move about and interact with the world, we use body language and gestures to help us communicate, and we perform gestures with the physical artifacts around us. Using similar motions to provide input to a computer is an interesting area for exploration. Gesture systems allow a user to employ movements of her hand, arm or other parts of her body to control computational objects.

While potentially a rich area for novel and natural interaction techniques, building gesture recognition systems can be very difficult. In particular, a programmer must be a good application developer, understand the issues surrounding the design and implementation of user interface systems, and be knowledgeable about machine learning techniques. While there are high-level tools to support building user interface applications, there is relatively little support for a programmer building a gesture system. To create such an application, a developer must build components to interact with sensors, provide mechanisms to save and parse that data, build a system capable of interpreting the sensor data as gestures, and finally interpret and utilize the results.

One of the most difficult challenges is turning the raw data into something meaningful. For example, imagine a programmer who wants to add a small gesture control system to his stylus-based application. How would he transform the sequence of mouse events generated by the UI toolkit into gestures? Most likely, the programmer would use his domain knowledge to develop a (complex) set of rules and heuristics to classify the stylus movement. As he further developed the gesture system, this set of rules would likely become increasingly complex and unmanageable. A better solution would be to use machine learning techniques to classify the stylus gestures. Unfortunately, doing so requires extensive domain knowledge about machine learning algorithms.

In this paper we present the Gesture and Activity Recognition Toolkit (GART), a user interface toolkit designed to abstract away many machine learning details so that an application programmer can build gesture recognition based interfaces. Our goal is to give the programmer access to powerful machine learning techniques without requiring her to become an expert in machine learning. In doing so, we hope to bridge the gap between the state of the art in machine learning and user interface development.

2 Related Work

Gestures are used in a large variety of user interfaces. Gesture recognition has been used for text input on many pen-based systems.
ParcTab's Unistroke [8] and Palm's Graffiti are two early examples of gesture-based text entry systems for recognizing handwritten characters on PDAs. EdgeWrite is a more recent gesture-based text entry method that reduces the amount of dexterity needed to create the gesture [11]. In Shark2, Kristensson and Zhai explored adding gesture recognition to soft keyboards [4]. The user enters text by drawing through each key in the word on the soft keyboard, and the system recognizes the pattern formed by the trajectory of the stylus through each letter. Hinckley et al. augmented a hand-held device with several sensors to detect different types of interaction with the device (recognizing when it is in position to take a voice note, powering on when it is picked up, etc.) [3]. Another use of gesture is as an interaction technique for large wall or tabletop surfaces. Several systems utilize hand (or finger) posture and gestures [5, 12]. Grossman et al. also used multi-finger gestures to interact with a 3D volumetric display [2].

From a high level, the basic process of using a machine learning algorithm for gesture recognition is rather straightforward. To create a machine learning model, one needs to collect a set of data and provide descriptive labels for it. This process is repeated many times for each gesture and then repeated again for all of the different gestures to be recognized. The data is used by a machine learning algorithm and is modeled via the "training" process. To use the recognition system in an application, data is again collected. It is then sent through the machine learning algorithms using the models trained above, and the label of the model most closely matching the data is returned as the recognized value.

While conceptually this is a rather simple process, in practice it is unfortunately much more difficult. For example, there are many details in implementing most machine learning algorithms (such as dealing with limited precision), many of which may not be covered in machine learning texts. A developer might use one of the machine learning software packages created to encapsulate a variety of algorithms, such as Weka [1] or Matlab. An early predecessor to this work, the Georgia Tech Gesture Toolkit (GT2k), was designed in a similar vein [9]. It was built around Cambridge University's speech recognition toolkit (CU-HTK) [13] to facilitate building gesture-based applications. Unfortunately, GT2k requires the programmer to have extensive knowledge about the underlying machine learning mechanisms and leaves several tasks, such as the collection and management of the data, to the programmer.

3 GART

The Gesture and Activity Recognition Toolkit (GART) is a user interface toolkit. It is designed to provide a high-level interface to the machine learning process, facilitating the building of gesture recognition applications. The toolkit consists of an abstract interface to the machine learning algorithms (training and recognition), several example sensors, and a library for samples.

To build a gesture based application using GART,
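To make the train-then-recognize workflow described above concrete, the following is a minimal sketch in Python. It is not GART's actual API: it assumes an HMM-based recognizer (GT2k, GART's predecessor, was built around the HMM-based CU-HTK), uses the hmmlearn library as a stand-in for that machinery, and the gesture labels, data shapes, and helper functions are hypothetical. One model is trained per gesture, and recognition returns the label of the model that scores a new sample highest.

    # Minimal sketch of the collect/label/train/recognize workflow described above.
    # NOT GART's API: hmmlearn stands in for the HMM machinery, and the gesture
    # names and data shapes (x/y stylus samples) are hypothetical.
    import numpy as np
    from hmmlearn import hmm

    def train_gesture_models(examples_by_label, n_states=4):
        """Train one HMM per gesture label.

        examples_by_label maps a label (e.g. "circle") to a list of example
        sequences, each an (n_samples, n_features) array.
        """
        models = {}
        for label, sequences in examples_by_label.items():
            X = np.concatenate(sequences)              # stack all examples
            lengths = [len(seq) for seq in sequences]  # per-example boundaries
            model = hmm.GaussianHMM(n_components=n_states,
                                    covariance_type="diag", n_iter=50)
            model.fit(X, lengths)                      # the "training" step
            models[label] = model
        return models

    def recognize(models, sequence):
        """Return the label whose model best explains the new sequence."""
        scores = {label: m.score(sequence) for label, m in models.items()}
        return max(scores, key=scores.get)

    # Hypothetical usage: two stylus gestures recorded as (x, y) samples.
    rng = np.random.default_rng(0)
    training_data = {
        "circle": [rng.normal(size=(30, 2)) for _ in range(5)],
        "swipe":  [rng.normal(loc=2.0, size=(30, 2)) for _ in range(5)],
    }
    models = train_gesture_models(training_data)
    print(recognize(models, rng.normal(loc=2.0, size=(30, 2))))  # likely "swipe"

As the paper argues, GART's contribution is to hide exactly this kind of model bookkeeping, along with sensor handling and sample management, behind a user interface toolkit abstraction.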

