Keyboard Acoustics Emanations Revisited
Li Zhuang, Feng Zhou, J. D. Tygar
{zl,zf,tygar}@cs.berkeley.edu, University of California, Berkeley
http://redtea.cs.berkeley.edu/~zl/keyboard
Motivation
•Emanations of electronic devices leak information
•How much information is leaked by emanations?
•Apply statistical learning methods to security
•What is learned from the sound of typing on a keyboard?
Key Observation
•Build acoustic model for keyboard & typist
•Non-random typed text (English)
  - Limited number of words
  - Limited letter sequences (spelling)
  - Limited word sequences (grammar)
•Build language model
  - Statistical learning theory
  - Natural language processing
Acoustic Information: Previous and Ours
•Frequency information in sound of each typed key
•Why do keystrokes make different sounds?
  - Different locations on the supporting plate
  - Each key is slightly different
Some Experiment Results
•4 data sets (12–27 mins of recordings)

Feedback-based Training
•Feedback for more rounds of training
•Output: keystroke classifier
  - Language independent
  - Can be used to recognize random sequences of keys, e.g. passwords
•Representation of keystroke classifier
  - Neural networks, linear classification, Gaussian mixtures
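The feedback loop above can be sketched as follows. This is a minimal illustration on synthetic 2-D data, not the authors' code: a nearest-centroid classifier stands in for the neural-network, linear, and Gaussian-mixture classifiers, and the noisy starting labels stand in for the language-model-corrected output of the unsupervised first pass.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical "keys" with 2-D feature vectors around (0,0) and (3,3).
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
true_y = np.array([0] * 50 + [1] * 50)

# Noisy initial labels: 20% flipped, standing in for imperfect
# language-model-corrected recognition results.
y = true_y.copy()
flip = rng.choice(100, 20, replace=False)
y[flip] = 1 - y[flip]

for _ in range(3):  # feedback for more rounds of training
    # Supervised step: fit one centroid per key on the current labels.
    centroids = np.array([X[y == k].mean(axis=0) for k in (0, 1)])
    # Reclassify every keystroke with the newly trained classifier;
    # the real system would apply language-model correction here too.
    y = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)

print((y == true_y).mean())  # fraction of keystrokes recovered correctly
```

Each round the retrained classifier relabels the data more accurately. This is also why the final classifier is language independent: once trained, it classifies individual keystrokes without any language model, so it can recognize random key sequences such as passwords.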
Comparison with previous work:

                            Asonov and Agrawal (SSP'04)    Ours
  Requirement               Text-labeling                  Direct recovery
  Analogy in crypto         Known-plaintext attack         Known-ciphertext attack
  Feature extraction        FFT                            Cepstrum
  Initial training          Supervised learning with       Clustering (K-means,
                            neural networks                Gaussian), EM algorithm
  Language model            N/A                            HMMs at different levels
  Feedback-based training   N/A                            Self-improving feedback
[System overview diagram]
Initial training: wave signal → Feature Extraction → Unsupervised Learning → Language Model Correction → Sample Collector → Classifier Builder → keystroke classifier
Subsequent recognition: wave signal → Feature Extraction → Keystroke Classifier (uses the trained classifiers for each key to recognize sound samples) → Language Model Correction → recovered keystrokes
Feature Extraction
•How to represent a keystroke?
•Vector of features: FFT, Cepstrum
•Cepstrum features are better
  - Also used in speech recognition
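The cepstral features can be sketched as follows; the window, sampling rate, and coefficient count are assumptions for illustration (the slides do not specify them). The real cepstrum is the inverse FFT of the log-magnitude spectrum, and its low-order coefficients summarize the spectral envelope of a keystroke sound:

```python
import numpy as np

def cepstrum_features(segment, n_coeffs=20):
    """Real cepstrum of one keystroke segment: inverse FFT of the
    log-magnitude spectrum; keep the low-order coefficients."""
    windowed = segment * np.hanning(len(segment))
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-10)  # avoid log(0)
    return np.fft.irfft(log_mag)[:n_coeffs]

# Synthetic 10 ms "keystroke" at 44.1 kHz, purely for illustration.
segment = np.random.default_rng(0).standard_normal(441)
features = cepstrum_features(segment)
print(features.shape)  # (20,)
```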
Unsupervised Learning
•Group keystrokes into N clusters
•Assign each keystroke a label, 1, …, N
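A tiny numpy sketch of the clustering step, with invented data and N = 3. The slides name both K-means and Gaussian clustering with the EM algorithm; only the K-means variant is shown here:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 3  # number of clusters (the real system would use more)

# Synthetic 2-D "keystroke" feature vectors from 3 hypothetical keys.
feats = np.vstack([rng.normal(c, 0.3, (30, 2)) for c in (0.0, 2.0, 4.0)])

centers = feats[rng.choice(len(feats), N, replace=False)]
for _ in range(10):
    # Assign each keystroke the label of its nearest cluster center ...
    labels = np.argmin(((feats[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    # ... then move each center to the mean of its assigned keystrokes
    # (keeping the old center if a cluster happens to be empty).
    centers = np.array([feats[labels == k].mean(axis=0) if (labels == k).any()
                        else centers[k] for k in range(N)])

print(np.bincount(labels, minlength=N))  # keystrokes per cluster label
```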
Language Model Correction
•Find best mapping from cluster labels to characters
•Some character combinations are more common (“th” vs. “tj”)
•Hidden Markov Models (HMMs)
•Example: cluster labels 5, 11, 2 → “t”, “h”, “e”
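The “th” vs. “tj” disambiguation can be illustrated with a toy Viterbi decode over the slide's label sequence 5, 11, 2. Every probability below is invented for illustration:

```python
import numpy as np

chars = ["t", "h", "j", "e"]

# P(character | cluster label) for observed labels 5, 11, 2 (invented).
emission = np.array([
    [0.90, 0.05, 0.02, 0.03],  # label 5: almost surely "t"
    [0.10, 0.45, 0.40, 0.05],  # label 11: "h" vs. "j" is ambiguous
    [0.05, 0.05, 0.05, 0.85],  # label 2: almost surely "e"
])

# Bigram prior P(next char | previous char): "th" and "he" are common
# in English, "tj" and "je" are rare (numbers invented).
bigram = np.array([
    [0.1, 0.6, 0.1, 0.2],  # after "t"
    [0.2, 0.1, 0.1, 0.6],  # after "h"
    [0.3, 0.2, 0.2, 0.3],  # after "j"
    [0.4, 0.2, 0.1, 0.3],  # after "e"
])

def viterbi(emission, bigram):
    """Most likely character sequence under emission + bigram scores
    (uniform prior over the first character)."""
    T, K = emission.shape
    score = np.log(emission[0])
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + np.log(bigram) + np.log(emission[t])[None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

print("".join(chars[i] for i in viterbi(emission, bigram)))  # prints "the"
```

The ambiguous middle label resolves to “h” because the bigram prior makes “th” followed by “he” far more likely than “tj” followed by “je”.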
[Example: recovered keystrokes before and after spelling and grammar correction]
            Set 1 (%)     Set 2 (%)     Set 3 (%)     Set 4 (%)
            Word  Char    Word  Char    Word  Char    Word  Char
  Initial    35    76      39    80      32    73      23    68
  Final      90    96      89    96      83    95      80    92
3 different models of keyboards (12 mins recording):

            Keyboard 1 (%)   Keyboard 2 (%)   Keyboard 3 (%)
            Word  Char       Word  Char       Word  Char
  Initial    31    72         20    62         23    64
  Final      82    93         82    94         75    90
3 different supervised learning methods in feedback:
[Bar chart: word and character recognition rates (0–100%) for Neural Networks (NN), Linear Classification (LC), and Gaussian Mixtures (GM)]
4/26/2006