0x1A Great Papers in Computer Security
Vitaly Shmatikov
CS 380S
http://www.cs.utexas.edu/~shmat/courses/cs380s/

L. Zhuang, F. Zhou, D. Tygar
Keyboard Acoustic Emanations Revisited
(CCS 2005)

Acoustic Information in Typing
Different keystrokes make different sounds
• Different locations on the supporting plate
• Each key is slightly different
Frequency information in the sound of the typed key can be used to learn which key it is
• Observed by Asonov and Agrawal (2004)

“Key” Observation
Build acoustic model for keyboard and typist
Exploit the fact that typed text is non-random (for example, English)
• Limited number of words
• Limited letter sequences (spelling)
• Limited word sequences (grammar)
This requires a language model
• Statistical learning theory
• Natural language processing

Sound of a Keystroke [Zhuang, Zhou, Tygar]
Each keystroke is represented as a vector of Cepstrum features
• Fourier transform of the decibel spectrum
• Standard technique from speech processing

Bi-Grams of Characters [Zhuang, Zhou, Tygar]
Group keystrokes into N clusters
Find the best mapping from cluster labels to characters
Unsupervised learning: exploit the fact that some 2-character combinations are more common
• Example: “th” vs. “tj”
• Hidden Markov Models (HMMs)
[Figure: cluster labels 5, 11, 2 mapped to the characters “t”, “h”, “e”]

Add Spelling and Grammar [Zhuang, Zhou, Tygar]
Spelling correction
Simple statistical model of English grammar
• Tri-grams of words
Use HMMs again to model

Recovered Text [Zhuang, Zhou, Tygar]
[Figure: sample of recovered text in two panels, “Before spelling and grammar correction” and “After spelling and grammar correction”. Legend: _____ = errors in recovery; a second marking = errors corrected by grammar]

Feedback-based Training [Zhuang, Zhou, Tygar]
Recovered characters + language correction provide feedback for more rounds of training
Output: keystroke classifier
• Language-independent
• Can be used to recognize random sequence of keys
  – For example, passwords
• Representation of keystroke classifier
  – Neural networks, linear classification, Gaussian mixtures

Overview [Zhuang, Zhou, Tygar]
[Diagram: two pipelines]
• Initial training: wave signal (recorded sound) → Feature Extraction → Unsupervised Learning → Language Model Correction → Sample Collector → Classifier Builder → keystroke classifier; also outputs recovered keystrokes
• Subsequent recognition: wave signal → Feature Extraction → Keystroke Classifier → Language Model Correction (optional) → recovered keystrokes

Experiment: Single Keyboard [Zhuang, Zhou, Tygar]
Logitech Elite Duo wireless keyboard
4 data sets recorded in two settings: quiet and noisy
• Consecutive keystrokes are clearly separable
Automatically extract keystroke positions in the signal with some manual error correction

Results for a Single Keyboard [Zhuang, Zhou, Tygar]

Datasets
         Recording length   Number of words   Number of keys
Set 1    ~12 min            ~400              ~2500
Set 2    ~27 min            ~1000             ~5500
Set 3    ~22 min            ~800              ~4200
Set 4    ~24 min            ~700              ~4300

Initial and final recognition rate (%)
           Set 1         Set 2         Set 3         Set 4
           Word  Char    Word  Char    Word  Char    Word  Char
Initial    35    76      39    80      32    73      23    68
Final      90    96      89    96      83    95      80    92

Experiment: Multiple Keyboards [Zhuang, Zhou, Tygar]
Keyboard 1: Dell QuietKey PS/2
• In use for about 6 months
Keyboard 2: Dell QuietKey PS/2
• In use for more than 5 years
Keyboard 3: Dell Wireless Keyboard
• New

Results for Multiple Keyboards [Zhuang, Zhou, Tygar]
12-minute recording with approx. 2300 characters

Recognition rate (%)
           Keyboard 1    Keyboard 2    Keyboard 3
           Word  Char    Word  Char    Word  Char
Initial    31    72      20    62      23    64
Final      82    93      82    94      75    90

Defenses
Physical security
Two-factor authentication
Masking noise
Keyboards with uniform sound (?)
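
Sketch: cepstrum features
The “Sound of a Keystroke” slide describes each keystroke as a vector of cepstrum features, computed as a Fourier transform of the decibel spectrum. A minimal Python sketch of that idea follows. It computes a plain real cepstrum with NumPy and is not necessarily the exact feature set used in the paper. The function name `cepstrum_features`, the Hann window, the number of coefficients kept, and the synthetic test signals are all assumptions made for illustration.

```python
import numpy as np

def cepstrum_features(frame, n_coeffs=16):
    """Return the first n_coeffs real-cepstrum coefficients of one audio frame.

    Hypothetical helper: frame is a 1-D array of samples covering a single
    keystroke; n_coeffs is an arbitrary choice, not a value from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    # Taper the frame to reduce spectral leakage at its edges.
    windowed = frame * np.hanning(len(frame))
    # Magnitude spectrum of the keystroke sound.
    magnitude = np.abs(np.fft.rfft(windowed))
    # Decibel spectrum; the small constant avoids log(0).
    decibels = 20.0 * np.log10(magnitude + 1e-12)
    # A second (inverse) Fourier transform of the decibel spectrum
    # yields the cepstrum.
    cepstrum = np.fft.irfft(decibels)
    return cepstrum[:n_coeffs]

# Synthetic stand-ins for two keystrokes: decaying tones at different pitches.
rate = 44100                      # samples per second (assumed)
t = np.arange(1024) / rate
key_a = np.sin(2 * np.pi * 800 * t) * np.exp(-200 * t)
key_b = np.sin(2 * np.pi * 2500 * t) * np.exp(-200 * t)

features_a = cepstrum_features(key_a)
features_b = cepstrum_features(key_b)
print(features_a.shape, features_b.shape)
```

Vectors of this kind are what the clustering step on the “Bi-Grams of Characters” slide would group into N clusters. In a real recording, each frame would first have to be cut out around a detected keystroke.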