A Comparison of Machine Learning Techniques for Modeling Human-Robot Interaction with Children with Autism

Elaine Short, Univ. of Southern California, Dept. of Computer Science, Los Angeles, California, USA, [email protected]
David Feil-Seifer, Univ. of Southern California, Dept. of Computer Science, Los Angeles, California, USA, [email protected]
Maja Matarić, Univ. of Southern California, Dept. of Computer Science, Los Angeles, California, USA, [email protected]

ABSTRACT
Two machine learning techniques are used to model the behavior of children with autism interacting with a humanoid robot, comparing a static model to a dynamic model using hand-coded features. Good accuracy (over 80%) is achieved in predicting child vocalizations; directions for future approaches to modeling the behavior of children with autism are suggested.

Categories and Subject Descriptors
H.1.2 [Models and Principles]: User/Machine Systems; I.2.9 [Robotics]: Miscellaneous

General Terms
Performance, Design, Experimentation, Human Factors

Keywords
Human-robot interaction, machine learning, autism

1. INTRODUCTION
The use of robotic systems is a promising technological possibility for enhancing therapy for children with autism, a common and often debilitating developmental disorder affecting between one in 80 and one in 240 children in the United States [6]. Anecdotal evidence and case studies suggest not only that robots are highly salient to children with autism, but that those children may exhibit social behaviors with robots that they do not otherwise use (e.g., [7]). A number of research groups (including our own) have used robots with children with autism (e.g., [7], [3], and [2]). On the machine learning side, modeling the behavior of children with autism has mainly focused on diagnosis; machine learning techniques have been used to discriminate between children with autism and typically developing children, as in [5] and [9].
Other work has attempted to model behavior in children with autism, but has either been focused on nonsocial behavior ([1]) or has been limited in the generalizations that could be made due to the heterogeneity of the population ([8]). The approach of this work is unique in the use of multimodal features to model social behaviors in children with autism. We use a combination of audio and video features in order to identify one particular social behavior: child vocalizations. We use two different machine learning techniques to model the interaction in order to predict vocalizations.

Copyright is held by the author/owner(s). HRI’11, March 6–9, 2011, Lausanne, Switzerland. ACM 978-1-4503-0561-7/11/03.

2. EXPERIMENTAL DESIGN
The dataset comes from a study comparing children’s interactions with a robot behaving in a way that is contingent on the child’s behavior, a robot that behaves randomly, and a non-robotic toy. A description of the system used appears in [2]. There are three primary experiment conditions, but for this preliminary work we use data from sessions for six children with autism interacting with the humanoid robot where the robot’s behavior is contingent on the child’s behavior. At this time, these data are annotated by a single coder (the author). There are 18 features, with 44 total possible feature-value pairs, including such features as where the child was standing and whether the child was touching the robot, the wall, or the parent. Additionally, the Praat audio analysis software was used to extract pitch and intensity features from the audio.

Two machine learning algorithms are used in the analysis of the data. Conditional Random Fields (CRF) are used because their dynamic nature and ability to both classify and segment data are well-suited to the time-series data generated by the experiment (using code by Kevin Murphy [4]). As a comparison, we also examine the performance of decision trees (generated using the C4.5 algorithm in the Weka toolkit).
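As an illustration of the information-gain criterion that underlies both C4.5's split selection and the feature-selection step reported in the results, here is a minimal pure-Python sketch. This is an independent reconstruction, not the authors' or Weka's code, and the toy feature and label values below are invented:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Reduction in label entropy obtained by splitting on one feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy example: a hypothetical "touching robot" feature vs. a vocalization label.
touching = ["yes", "yes", "no", "no", "no", "yes"]
vocal = [1, 1, 0, 0, 0, 1]
print(information_gain(vocal, touching))  # 1.0: the split perfectly separates the labels
```

C4.5 greedily chooses, at each tree node, the feature whose split maximizes this quantity (with a normalization, the gain ratio); the same score ranks features when selecting an "optimal" subset.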
One social behavior of particular interest to us is child vocalizations, since that behavior is used in the experiment as one that receives the “reward” behavior from the robot (blowing bubbles). Thus we focus on the recognition of vocalizations, with the following hypotheses:

H1: CRF will outperform decision trees for recognition of child vocalizations.
H2: Recognition using the full multimodal feature set will outperform recognition using only audio features.

3. RESULTS
Additional time-shifted features were added to the feature set from five, ten, and fifteen frames in the past (up to half of a second). This results in a total feature set size of 241 features. The learning algorithms were evaluated using leave-one-out validation, first on the full set of features, then on the set of audio features only. CRF was additionally tested on a subset of 64 optimal features chosen using information gain. Additional analysis was performed with k-means clustering, but is beyond the scope of this abstract.

Because of the limited number of children on whose data we could train the models, there was limited statistical significance in the results. The only statistically significant pairwise difference in performance was for the F1 values in the audio-only case, with p < 0.05.

For the full set of data, the CRF yielded a mean F1 value of 0.0889, with a variance of 0.0156, and an error rate of 26.35% with a variance of 1.62%. The decision tree gave an F1 value of 0.1181 (variance: 0.0017) and an error rate of 28.19% (var: 1.74%). For the audio-only data, the CRF’s mean F1 value was 0.1784 (var: 0.0035) with an error rate of 21.48% (var: 3.22%).
The decision tree had a mean F1 value of 0.0714 (var: 0.0021) and an error rate of 20.02% (var: 2.51%). Finally, for the set of best features, the CRF had a mean F1 value of 0.2281 (var: 0.0700) and an error rate of 19.81% (var: 1.61%).

4. DISCUSSION
Hypothesis 1: CRF outperforms decision trees: The outcome of this hypothesis depends on the set of features used. Although the conditional random field and the decision tree seem to perform comparably in terms of error rates, when we look at the F1 value (the harmonic mean of precision and recall), we see that the decision tree outperforms the conditional random field on the set of all data, while the conditional
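To make the time-shifted features of Section 3 concrete, the following sketch augments each frame's feature vector with copies of the values from 5, 10, and 15 frames earlier. This is a hypothetical reconstruction of the idea, not the authors' pipeline; in particular, the choice to pad early frames with the first frame's values, and the feature names used, are assumptions:

```python
def add_time_shifted(frames, lags=(5, 10, 15)):
    """Augment each frame's feature dict with values from earlier frames.

    `frames` is a chronological list of {feature_name: value} dicts; frames
    too early to have a given lag reuse the first frame's values (assumed
    padding, not necessarily what the paper did).
    """
    augmented = []
    for i, frame in enumerate(frames):
        new = dict(frame)
        for lag in lags:
            past = frames[max(i - lag, 0)]
            for name, value in past.items():
                new[f"{name}_t-{lag}"] = value
        augmented.append(new)
    return augmented

# Toy sequence with two hypothetical per-frame features.
frames = [{"pitch": p, "touching_robot": p % 2} for p in range(20)]
out = add_time_shifted(frames)
print(len(out[19]))  # 8: 2 original features + 2 features x 3 lags
```

With three lags, each original feature contributes four columns per frame, which is how a modest per-frame feature set grows into the larger set the classifiers were trained on.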
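For reference, the F1 value used throughout is the harmonic mean of precision and recall, 2PR/(P+R), computable directly from prediction counts. A generic sketch, independent of the paper's data:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from raw counts of
    true positives, false positives, and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# E.g. 20 true positives, 30 false positives, 50 false negatives:
print(round(f1_score(20, 30, 50), 4))  # 0.3333
```

Because F1 ignores true negatives, a classifier for a rare positive class such as vocalization frames can show a low error rate alongside a low F1, which matches the divergence between the two metrics in the results above.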

