Keystroke Biometrics Study Software Engineering Project Team + DPS Student


Keystroke Biometrics Study

Software Engineering Project Team

+ DPS Student


Keystroke Biometric

Like other biometrics, keystroke biometrics is becoming important for security applications

Advantage - inexpensive and easy to implement, the only hardware needed is a keyboard

Disadvantage - behavioral rather than physiological biometric, easy to disguise

One of the least studied biometrics, thus good for dissertation studies


Focus of Study

Previous studies mostly concerned short character string input
  Password hardening
  Short name strings

We focus on large text input
  200 or more characters per sample


Focus of Study (cont)

Applications of interest

Identification
  1-of-n classification problem
  e.g., sender of inappropriate e-mail in a business environment with a limited number of employees

Verification
  Binary classification problem, yes/no
  e.g., student taking online exam


Software Components

Raw Keystroke Data Capture over the Internet (Java applet)

Feature Extraction (SAS software)

Classification (SAS software)
  Training
  Testing


Keystroke Data Capture (Java Applet)

Raw data recorded for each entry
  Key’s character
  Key’s code text equivalent
  Key’s location on keyboard
    1 = standard, 2 = left, 3 = right
  Time key was pressed (msec)
  Time key was released (msec)
  Number of left, right, double mouse clicks
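
The raw record listed above can be pictured as a small data structure. The following is a minimal sketch in Python, not the applet's actual code: the class name, field names, and the definition of a transition (release of one key to press of the next) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeystrokeEvent:
    """One raw keystroke record (hypothetical field names)."""
    char: str        # key's character, e.g. "H"
    code_text: str   # text equivalent of the key's code, e.g. "Shift"
    location: int    # 1 = standard, 2 = left, 3 = right
    press_ms: int    # time key was pressed (msec)
    release_ms: int  # time key was released (msec)

    @property
    def duration_ms(self) -> int:
        """Key press duration: release time minus press time."""
        return self.release_ms - self.press_ms


def transition_ms(first: KeystrokeEvent, second: KeystrokeEvent) -> int:
    """Assumed transition definition: release of `first` to press of `second`.

    The value is negative when the second key is pressed before the
    first one is released.
    """
    return second.press_ms - first.release_ms


# Made-up timings for the first two keys of "Hello World!"
h = KeystrokeEvent("H", "H", 1, 1000, 1080)
e = KeystrokeEvent("e", "E", 1, 1150, 1230)
print(h.duration_ms, transition_ms(h, e))
```

Mouse click counts are per-entry totals rather than per-key values, so they would sit on a separate per-sample record in this sketch.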


Keystroke Data Capture (Java Applet)


Aligned Raw Data File (Hello World!)


SAS Statistical Software: Feature Extraction & Classification

Powerful tool with its own programming language and development environment

Data management
  Relational database built-in
  Many data manipulation functions

Statistical analysis
  Library of procedures to do a wide variety of statistical analyses


Feature Extraction

10 Mean and 10 Std of key press durations
  8 most frequent alphabet letters (e, a, r, i, o, t, n, s)
  Space & shift keys

10 Mean and 10 Std of key transitions
  8 most common digrams (in, th, ti, on, an, he, al, er)
  Space-to-any-letter & any-letter-to-space

15 Total number of keypresses for
  Space, backspace, delete, insert, home, end, enter, ctrl, 4 arrow keys combined, shift (left), shift (right), total entry time, left, right, & double mouse clicks
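
The duration and transition features listed above could be computed along the following lines. This is a hedged sketch in Python rather than the project's SAS code. It covers only the 8 letters and 8 digrams and omits the space and shift features and the keypress counts. The function names, the feature names, the use of the population standard deviation, and the transition definition (release of one key to press of the next) are all assumptions.

```python
from statistics import mean, pstdev

FREQUENT_LETTERS = ("e", "a", "r", "i", "o", "t", "n", "s")
COMMON_DIGRAMS = ("in", "th", "ti", "on", "an", "he", "al", "er")


def mean_std(values):
    """Return (mean, population std), or (0.0, 0.0) when no data was seen."""
    if not values:
        return (0.0, 0.0)
    return (mean(values), pstdev(values))


def extract_features(events):
    """events: list of (char, press_ms, release_ms) in typing order."""
    durations = {c: [] for c in FREQUENT_LETTERS}
    transitions = {d: [] for d in COMMON_DIGRAMS}

    # Key press duration = release time minus press time.
    for char, press, release in events:
        key = char.lower()
        if key in durations:
            durations[key].append(release - press)

    # Assumed transition = press of second key minus release of first key.
    for (c1, _, r1), (c2, p2, _) in zip(events, events[1:]):
        digram = (c1 + c2).lower()
        if digram in transitions:
            transitions[digram].append(p2 - r1)

    features = {}
    for c, vals in durations.items():
        features[f"dur_mean_{c}"], features[f"dur_std_{c}"] = mean_std(vals)
    for d, vals in transitions.items():
        features[f"trans_mean_{d}"], features[f"trans_std_{d}"] = mean_std(vals)
    return features


# Made-up timings for the keys t, h, e, e
sample = [("t", 0, 100), ("h", 150, 230), ("e", 300, 390), ("e", 500, 610)]
feats = extract_features(sample)
print(feats["dur_mean_e"], feats["dur_std_e"], feats["trans_mean_th"])
```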


Feature Measurement Sample


Feature Extraction Preprocessing

Outlier removal
  Remove samples > 2 std from mean
  Prevents skewing of feature measurements caused by pausing of the keystroker

Standardization
  x’ = (x - xmin) / (xmax - xmin)
  Scales to range 0-1 to give roughly equal weight to each feature
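
Both preprocessing steps are small enough to sketch directly. The Python below is an illustration, not the project's SAS code. It assumes that the "samples" being removed are individual timing measurements for one key or digram, and that xmin and xmax are taken per feature over the available data. The function names are hypothetical.

```python
from statistics import mean, pstdev


def remove_outliers(values, n_std=2.0):
    """Drop values more than n_std standard deviations from the mean."""
    if len(values) < 2:
        return list(values)
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= n_std * sigma]


def min_max_scale(x, x_min, x_max):
    """x' = (x - xmin) / (xmax - xmin); returns 0.0 for a constant feature."""
    if x_max == x_min:
        return 0.0
    return (x - x_min) / (x_max - x_min)


# Made-up key press durations (msec); 1500 represents a pause by the typist.
durations = [100, 105, 95, 102, 98, 100, 103, 97, 1500]
cleaned = remove_outliers(durations)
scaled = min_max_scale(150, 100, 300)
print(cleaned, scaled)
```

In this made-up example the 1500 msec pause lies well beyond two standard deviations and is dropped, while the other eight durations are kept.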


Classification

Identification
  Nearest neighbor classifier using Euclidean distance
  Input sample compared to every training sample

Verification
  Dichotomizer (feature difference model)
  Train with neural network
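
The two classifiers above can be sketched as follows. This is a minimal Python illustration, not the project's SAS implementation. For verification it only shows how labeled feature-difference vectors might be built as input to a dichotomizer; the neural network itself is not shown. The use of absolute differences, the function names, and the toy feature vectors are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(sample, training):
    """Nearest neighbor: compare the sample to every training sample.

    training: list of (subject, feature_vector) pairs.
    Returns the subject of the closest training sample.
    """
    subject, _ = min(training, key=lambda item: euclidean(sample, item[1]))
    return subject


def difference_vector(a, b):
    """Per-feature absolute difference between two samples (assumed form)."""
    return [abs(x - y) for x, y in zip(a, b)]


def difference_pairs(training):
    """Label every pair of samples: 1 = same subject, 0 = different subjects."""
    pairs = []
    for i, (s1, v1) in enumerate(training):
        for s2, v2 in training[i + 1:]:
            pairs.append((difference_vector(v1, v2), int(s1 == s2)))
    return pairs


# Toy, already-scaled feature vectors for two hypothetical subjects
training = [
    ("alice", [0.10, 0.20]),
    ("alice", [0.20, 0.20]),
    ("bob", [0.80, 0.90]),
]
who = identify([0.15, 0.25], training)
labels = [label for _, label in difference_pairs(training)]
print(who, labels)
```

The difference model turns the n-subject problem into a two-class one (same person or different person), which is why it suits the yes/no verification task.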


Experimental Design: Identification Experiment

15 subjects who know the purpose of the experiment
  Training – 5 reps of text a (approx. 600 char)
  Testing
    5 reps of text a
    5 reps of text b (same length as text a)
    5 reps of text c (half length of text a)

28 subjects who don’t know the purpose of the input
  Subset of above training/testing data
  Also, arbitrary text input of reasonable length


Experimental Design: Instructions for Subjects

All subjects will be told to make any necessary corrections to the input data (texts a, b, and c are Aesop fables)

Knowing subjects will be told to input the data using their normal keystroke dynamics

The experiments are designed so that subjects leave at least a day between entering samples


Experimental Design: Text a – about 600 characters

This is an Aesop fable about the bat and the weasels. A bat who fell upon the ground and was caught by a weasel pleaded to be spared his life. The weasel refused, saying that he was by nature the enemy of all birds. The bat assured him that he was not a bird, but a mouse, and thus was set free. Shortly afterwards the bat again fell to the ground and was caught by another weasel, whom he likewise entreated not to eat him. The weasel said that he had a special hostility to mice. The bat assured him that he was not a mouse, but a bat, and thus a second time escaped. The moral of the story: it is wise to turn circumstances to good account.


Experimental Design (cont)

Verification
  Basically the same as for identification
  The training and testing data consist of various text input samples collected over a period of approximately 10 weeks


Expected Outcomes: Recognition Accuracy

Accuracy on text a > that on text b
  text a is the training text

Accuracy on text b > that on text c
  text b is longer than text c

Accuracy on texts a, b, c > arbitrary text
  texts a, b, & c are similar, all Aesop fables

Accuracy on knowing subjects > that on unknowing ones
  Knowing subjects are more likely to use their normal keystroke dynamics for all input


Expected Outcomes: Analysis of Experimental Results

Feature analysis – which are better?
  Key press durations or transitions
  More or less frequent letters/digrams
  Other feature measurements

Determine the spread (std) of feature measurements within versus across subjects
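
One possible way to compare the within-subject and across-subject spread of a single feature is sketched below in Python. The specific definitions are assumptions, not the study's stated method: "within" is taken as the average of each subject's own std, and "across" as the std of the per-subject means. The function name and data are hypothetical.

```python
from statistics import mean, pstdev


def spread(samples_by_subject):
    """Return (within, across) spread for one feature.

    samples_by_subject: {subject: [feature values from repeated samples]}
    within: mean of the per-subject standard deviations
    across: standard deviation of the per-subject means
    """
    within = mean([pstdev(v) for v in samples_by_subject.values()])
    across = pstdev([mean(v) for v in samples_by_subject.values()])
    return within, across


# Made-up mean "e" key press durations (msec) for two subjects
data = {"subject_1": [90, 110], "subject_2": [190, 210]}
within, across = spread(data)
print(within, across)
```

A feature whose across-subject spread is large relative to its within-subject spread separates subjects well, which is one way to rank the feature groups listed above.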


Preliminary Results

Reduced identification experiment
  Smaller text input
    “The quick brown fox jumps over the lazy dog.”
  Fewer subjects
    Three project team members
  Fewer feature measurements
    Mean and std for “e” and “o” key press durations

Accuracy of 80%, which is promising


Questions/Comments?

Focus or applications?
Software implementation?
Experimental design?
Expected experimental outcomes?