ENHANCED PASSWORD AUTHENTICATION THROUGH KEYSTROKE TYPING CHARACTERISTICS

Ozlem Guven(1), Selim Akyokus(1), Mitat Uysal(1), Aykut Guven(2)

(1) Department of Computer Engineering, Dogus University, Istanbul, Turkey, {oguven,sakyokus,muysal}@dogus.edu.tr

(2) IDEA Tekonoloji Inc., Istanbul, Turkey, [email protected]

The IASTED International Conference on Artificial Intelligence and Applications, AIA 2007, February 12 – 14, 2007, Innsbruck, Austria

Selim Akyokus, AIA 2007, 12/2/2007




Outline

– Biometric Security Systems
– Keystroke Pattern Recognition Systems
– Keystroke Timing Information
– Capturing Keystroke Dynamics Data
– A Statistical Modeling Approach for Keystroke Recognition
– Experimental Results
– Conclusion


Biometric Security Systems

One of the most active fields in computer security research is the development of more secure user authentication methods based on biometrics.

Biometrics is a relatively new discipline that concerns the use of a person’s physiological or behavioral characteristics for the automatic identification of that person.

There are many types of biometric security systems, based on methods such as face recognition, fingerprint recognition, iris recognition, and handwriting recognition.


Biometric Security Systems

A biometric security system is a pattern recognition system that compares a feature data set obtained from a person with a template data set stored in a database.

The configuration of a biometric security system depends on the chosen biometric feature, but there are some basic procedural functions that every system must include.


A General Biometric Security System Architecture

The enrollment part is responsible for registering people’s characteristics in the biometric template database.

The identification/verification part of a biometric security system is responsible for identifying/verifying individuals at the point of access by using a classifier.

[Figure: general biometric security system architecture. Enrollment: Input Data, Biometric Sensor, Feature Extractor, Template Database. Identification/Verification: Input Data, Biometric Sensor, Feature Extractor, Feature Matcher (Classifier), Access granted/denied.]


Keystroke Pattern Recognition Systems

Keystroke dynamics biometric systems analyze the way a user types at a terminal by monitoring keyboard events.

Keystroke dynamics refers to the timing information, or pattern, collected about the way a user types on a computer keyboard.

Keystroke dynamics is also known by several other names: keyboard dynamics, keystroke analysis, typing biometrics, and typing rhythms.

Biometric security systems based on keystroke dynamics use this information for user authentication, since every user has a different typing pattern.


Keystroke Timing Information

Keystroke dynamics include several different measurements which can be detected when the user presses keys on the keyboard.

Possible measurements include:

– Latency between consecutive keystrokes
– Duration of the keystroke (hold time)
– Overall typing speed
– Frequency of errors (how often the user has to use the backspace key)
– The habit of using additional keys on the keyboard, for example writing numbers with the numeric pad
– The order in which the user presses and releases keys when writing capital letters (whether the shift key or the letter key is released first)
– The force used when hitting keys while typing (requires a special keyboard)

Most keystroke recognition systems do not employ all of these features. Most applications measure only the latencies between consecutive keystrokes or the durations of keystrokes.
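The two most commonly used measurements can be derived from timestamped key press and release events. The following is a minimal sketch, not the capture code used in the study: the `KeyEvent` record and the function names are hypothetical, and latency is assumed to be measured from one key press to the next key press.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeyEvent:
    """One keystroke with its press and release timestamps (hypothetical record)."""
    key: str
    press_ms: float
    release_ms: float


def hold_times(events: List[KeyEvent]) -> List[float]:
    """Duration of each keystroke: release time minus press time."""
    return [e.release_ms - e.press_ms for e in events]


def latencies(events: List[KeyEvent]) -> List[float]:
    """Latency between consecutive keystrokes.

    Assumption: measured press-to-press, so overlapping keys
    (next key pressed before the previous one is released) still
    yield a positive value.
    """
    return [b.press_ms - a.press_ms for a, b in zip(events, events[1:])]


# Made-up timings for a three-character entry.
sample = [
    KeyEvent("a", 0.0, 80.0),
    KeyEvent("b", 120.0, 190.0),
    KeyEvent("c", 260.0, 350.0),
]
sample_holds = hold_times(sample)
sample_latencies = latencies(sample)
```

An entry of n characters yields n hold times but only n - 1 latencies.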


Capturing Keystroke Dynamics Data

When typing on a keyboard, both key press and release events generate hardware interrupts.

Keystroke dynamics information can be easily captured by using these interrupts.

However, capturing keystroke dynamics data has a few complications. Several keys can be pressed at the same time, or the user may press the next key before releasing the previous one.

Another very important problem is that people’s typing skills vary greatly.

– A beginner may type very slowly with one finger in a “hunt-and-peck” style, while a professional typist can type on the order of ten times faster.

– Typing also depends on the typist’s mood at the time of typing, on what is being typed, and on the type of keyboard used.

Many factors must be taken into account when designing a keystroke dynamics recognition system.


Capturing Keystroke Dynamics Data

Generally, each user has a different typing pattern.

The following figure shows the keystroke latencies of passwords entered by a user in 10 trials. As seen in the figure, each user has a typing pattern in which the latencies between successive key hits are very close to each other from trial to trial.

[Figure: keystroke latencies of a user’s password over 10 access trials. x-axis: Characters of password; y-axis: Time (ms), ticks from 0 to 400000; legend: 1. access trial to 10. access trial.]


Keystroke Pattern Recognition Systems

Keystroke dynamics recognition systems can be used for both verification (is this person who they claim to be?) and identification (who is this person?).

Identification involves comparing the acquired keystroke information against templates corresponding to all users in the database.

Verification involves comparison with only those templates corresponding to the claimed identity.
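The difference between the two modes can be sketched as follows. This is only an illustration: the mean absolute difference used as a distance, the `threshold` parameter, and the user names are assumptions, not the matching method of this study.

```python
from typing import Dict, List, Optional


def distance(sample: List[float], template: List[float]) -> float:
    """Mean absolute difference between two latency vectors (illustrative metric)."""
    return sum(abs(s - t) for s, t in zip(sample, template)) / len(template)


def verify(claimed_user: str, sample: List[float],
           templates: Dict[str, List[float]], threshold: float) -> bool:
    """Verification: compare only against the template of the claimed identity."""
    return distance(sample, templates[claimed_user]) <= threshold


def identify(sample: List[float], templates: Dict[str, List[float]],
             threshold: float) -> Optional[str]:
    """Identification: compare against the templates of all users in the database."""
    best_user = min(templates, key=lambda user: distance(sample, templates[user]))
    if distance(sample, templates[best_user]) <= threshold:
        return best_user
    return None  # no enrolled user is close enough


# Made-up templates (latencies in ms) for two hypothetical users.
templates = {"alice": [100.0, 150.0], "bob": [300.0, 320.0]}
sample = [110.0, 140.0]
```

Verification costs one comparison per attempt, while identification grows with the number of enrolled users.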

These systems have the advantage of requiring neither specially designed devices nor complex software.

Keystroke recognition systems are usually used to harden (strengthen) the login-password verification process.


Login-password Verification Process

A typical and very common example of verification is when a user logs on to a computer at work.

The user is asked for a username and password. The system then finds the matching username in the database and verifies whether the entered password matches the one stored with that username.

Anyone who knows a username together with its password can access the computer system. Passwords are also often quite easy to guess.

– People tend to use passwords such as birthdays and pet names, which may have a direct relationship with the person, or ordinary dictionary words. In most cases, such passwords are easily guessed by trying all the candidates.

Keystroke recognition systems harden the password verification process by comparing the captured keystroke dynamics information with the user’s templates stored in a template database.

The system accepts or rejects the login depending on whether the entered information matches the stored template.
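The hardened login flow can be sketched as a two-step check: the password must be correct, and the typing pattern must match the stored template. This is a hedged sketch only. The account layout, the `login` and `within_tolerance` names, the 30 ms tolerance, and the PBKDF2 password hashing are assumptions added for illustration; the source does not describe how passwords are stored.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def within_tolerance(latencies, template, tolerance_ms=30.0):
    """Hypothetical placeholder matcher: every latency within a fixed tolerance."""
    return len(latencies) == len(template) and all(
        abs(a - b) <= tolerance_ms for a, b in zip(latencies, template)
    )


def login(username, password, latencies, accounts, pattern_matches):
    """Accept only if the password is correct AND the typing pattern matches."""
    account = accounts.get(username)
    if account is None:
        return False
    entered = hash_password(password, account["salt"])
    # Constant-time comparison of the stored and entered password hashes.
    if not hmac.compare_digest(entered, account["password_hash"]):
        return False
    return pattern_matches(latencies, account["template"])


# Made-up account for a hypothetical user.
salt = os.urandom(16)
accounts = {
    "alice": {
        "salt": salt,
        "password_hash": hash_password("s3cret!", salt),
        "template": [120.0, 95.0, 140.0],
    }
}
```

With this arrangement, an imposter who knows the password is still rejected unless the typing rhythm also matches.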


Classification Methods in Keystroke Recognition

Many methods are used in keystroke dynamics recognition systems:

– statistical methods, including t-tests [8], means, standard deviations [9,10], the non-weighted probability algorithm, and the weighted probability algorithm [10],

– machine learning or data mining methods, including nearest neighbor classifiers that use different distance metrics such as Euclidean and Mahalanobis [11,12,13], neural networks [14,15,16], k-means [12], Bayesian classification [12,17], and decision trees [18],

– fuzzy classification methods [19,20], and genetic algorithms and support vector machines [21].


A Statistical Modeling Approach for Keystroke Recognition

In this study, we use a keystroke recognition model whose layered architecture resembles, and carries the characteristics of, a neural network structure.

Normally, weights of a neural network are adjusted using a learning technique that minimizes the difference between the actual output and predicted output.

In this study, the weights of the layered network structure are determined by statistical methods.


Training Phase

The average and standard deviation of each keystroke latency are determined for each user from the training dataset.

$P_{u,k}$ and $\sigma_{u,k}$ are the average and standard deviation of the $k$th keystroke latency for user $u$.

[Flowchart: Start → Select a user → Take the user’s keystroke training patterns → Calculate $\sigma_{u,k}$ and $P_{u,k}$ → Record the values in the Template Database → End]
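The training steps above can be sketched in a few lines. This is an illustrative sketch, not the study’s MATLAB program: the function name, the dictionary used as a template database, and the choice of the sample standard deviation (`statistics.stdev`) are assumptions, since the source does not say whether the sample or population formula was used.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple

Template = List[Tuple[float, float]]  # one (average, standard deviation) pair per latency


def train_template(training_patterns: List[List[float]]) -> Template:
    """Compute (P_k, sigma_k) for each latency position k across a user's training trials.

    Assumption: sample standard deviation; at least two trials are required.
    """
    columns = zip(*training_patterns)  # group the k-th latency of every trial together
    return [(mean(col), stdev(col)) for col in columns]


# Made-up training trials (latencies in ms) for one hypothetical user.
trials = [
    [100.0, 200.0],
    [110.0, 220.0],
    [120.0, 210.0],
]

template_database: Dict[str, Template] = {}
template_database["user_1"] = train_template(trials)
```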


Testing Phase

At the testing stage, the keystroke latencies entered by a user in a trial form the test pattern for that user.

The latencies obtained in a trial are compared, using our matching algorithm, with the user’s template (averages and standard deviations) stored in the template database.

The user is then authorized to enter the computer system if the test pattern matches the template; otherwise, the attempt is rejected.

[Flowchart: Start → Take the user’s real-time keystroke patterns → Take the user’s template from the Template Database → Run the algorithm → Are they matching? If yes: “Accepted! User’s pattern matches the pattern stored in database.” If no: “Failed! User’s pattern does not match the pattern stored in the database.” → End]


The Matching Algorithm

Our algorithm uses keystroke latencies (the time between successive key hits) as the measure for distinguishing between users.

The averages and standard deviations of the keystroke latencies determine the weights of a layered network structure.

The layered network structure is used to compare and identify keystroke rhythms. Because it resembles a neural network, we sometimes call it a neural-network-like structure.


The Layered Network Structure

$T_{t,k}$ is the $k$th keystroke latency entered by user $u$ in trial $t$; together, these latencies form the test pattern for the user. The weights $P_{u,k}$ and $\sigma_{u,k}$ are the average and standard deviation of the $k$th keystroke latency for user $u$. The layered network structure compares the latencies of each login with the reference values and tests whether each one falls within two standard deviations of its average reference latency. If all latencies pass this test, the input for that password string is considered valid.

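[Figure: the layered network structure. Inputs $T_{t,1}, T_{t,2}, \dots, T_{t,k}$ with weights $P_{u,1}, \sigma_{u,1}$; $P_{u,2}, \sigma_{u,2}$; ...; $P_{u,k}, \sigma_{u,k}$. Each node computes $O_i = T_{t,i} - P_{u,i}$ and tests $-2\sigma_{u,i} < O_i < 2\sigma_{u,i}$. The node outputs pass through $\sum$ and $\prod$ units to an output labeled (0, 1).]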
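The acceptance test described on this slide can be sketched directly: compute the deviation of each test latency from its reference average and require every deviation to lie strictly inside two standard deviations. The function and parameter names below are hypothetical, and the sketch does not reproduce the intermediate sum and product units of the figure, only the final all-latencies-must-pass decision stated in the text.

```python
from typing import List, Tuple


def matches_template(trial: List[float],
                     template: List[Tuple[float, float]],
                     n_sigma: float = 2.0) -> bool:
    """Return True if every latency satisfies -n_sigma*sigma < T - P < n_sigma*sigma.

    trial    : test latencies T_1..T_k from one login attempt
    template : (P_i, sigma_i) pairs, the stored average and standard deviation
    """
    if len(trial) != len(template):
        return False
    for t, (p, sigma) in zip(trial, template):
        o = t - p  # deviation from the reference average
        if not (-n_sigma * sigma < o < n_sigma * sigma):
            return False
    return True


# Made-up template: averages 110 ms and 210 ms, standard deviation 10 ms each.
example_template = [(110.0, 10.0), (210.0, 10.0)]
```

Because the inequality is strict, a latency with zero recorded standard deviation can never match; a practical system would probably need a minimum tolerance for such cases.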


Biometric Classifier Performance Metrics

Three metrics are typically used to describe the performance of classifiers in biometric systems.

– false rejection rate (FRR): the percentage of valid (genuine) user attempts identified as imposters. It measures how often a valid user is not verified successfully.

– false acceptance rate (FAR): the percentage of imposter access attempts identified as valid users. It measures how often an imposter can successfully bypass the security system.

– equal error rate (EER): the crossover point at which FRR equals FAR.

FRR and FAR trade off against each other: adjusting the decision threshold to lower one error rate raises the other.

For most common biometric applications, the EER point, where FAR = FRR, gives the best operating point for a specific biometric system.

The decision threshold parameters used in biometric recognition algorithms must therefore be adjusted toward the EER crossover point, where FRR equals FAR.

In our study, the threshold parameter is chosen as 2σ. Experimental studies have been carried out with the threshold values σ, 2σ, and 3σ. These studies show that the 2σ threshold produces the best results [10,13].
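The trade-off between FRR and FAR can be illustrated by sweeping the threshold multiplier over σ, 2σ, and 3σ. The sketch below uses a small made-up data set, so the rates it produces are not results from this study; the function names and the toy trials are assumptions chosen only to show the direction of the effect.

```python
from typing import List, Tuple

Template = List[Tuple[float, float]]


def accepts(trial: List[float], template: Template, n_sigma: float) -> bool:
    """Accept when every latency lies within n_sigma standard deviations of its average."""
    return all(abs(t - p) < n_sigma * s for t, (p, s) in zip(trial, template))


def error_rates(genuine: List[List[float]], imposter: List[List[float]],
                template: Template, n_sigma: float) -> Tuple[float, float]:
    """Return (FRR %, FAR %) for one threshold multiplier."""
    rejected_genuine = sum(not accepts(t, template, n_sigma) for t in genuine)
    accepted_imposter = sum(accepts(t, template, n_sigma) for t in imposter)
    frr = 100.0 * rejected_genuine / len(genuine)
    far = 100.0 * accepted_imposter / len(imposter)
    return frr, far


# Toy data (made up): one template, four genuine and four imposter trials.
toy_template = [(100.0, 10.0), (200.0, 10.0)]
toy_genuine = [[105.0, 195.0], [115.0, 205.0], [125.0, 200.0], [100.0, 225.0]]
toy_imposter = [[118.0, 215.0], [150.0, 260.0], [92.0, 228.0], [60.0, 200.0]]

sweep = {n: error_rates(toy_genuine, toy_imposter, toy_template, n) for n in (1.0, 2.0, 3.0)}
for n, (frr, far) in sweep.items():
    print(f"{n:.0f} sigma: FRR = {frr:.1f}%  FAR = {far:.1f}%")
```

On this toy data, widening the band lowers FRR and raises FAR, which is why the threshold has to be tuned rather than simply made as tight or as loose as possible.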


Experimental Results

This study uses a dataset consisting of the keystroke latencies of passwords for 16 users.

The datasets were collected by Aykut Guven and Ibrahim Sogukpinar in the study reported in [22].

All passwords are 8 characters long. For each password entry, 7 keystroke latencies are recorded in the datasets.

A MATLAB program was written to test our model on this dataset of password typing patterns of 16 users.

In the learning phase, the averages $P_{u,k}$ and standard deviations $\sigma_{u,k}$ of the keystroke latencies are determined for each user from the training dataset, where $u = 1, 2, \dots, 16$ is the user number and $k = 1, 2, \dots, 7$ is the latency number.

Then, the test datasets are applied to the neural-statistical algorithm.

The recognition rate (RR) and false rejection rate (FRR) are computed for each user. RR is the percentage of attempts in which an authorized user accesses the system successfully, and FRR is the percentage of attempts in which an authorized (valid) user is identified as an imposter.


FRR (False Rejection Ratio) Results for all Users

As can be seen from the table, user 13 has the lowest recognition rate, 72.12%, and the highest FRR, 27.88%, which is the worst case in the test results.

The best result is obtained for user 15, with a recognition rate of 94.22% and an FRR of 5.78%.

Averaged over the 16 users, the overall system achieves a recognition rate of 83% and an FRR of 17%. These results are compatible with another study done by Fabian Monrose and Aviel D. Rubin.

User   Recognition Ratio RR (%)   False Rejection Ratio FRR (%)
1      74.13                      25.87
2      76.39                      23.61
3      81.10                      18.90
4      83.72                      16.28
5      82.86                      17.14
6      84.75                      15.25
7      88.00                      12.00
8      80.86                      19.14
9      87.06                      12.94
10     86.33                      13.68
11     84.71                      15.29
12     75.96                      24.04
13     72.12                      27.88
14     83.87                      16.13
15     94.22                      5.78
16     88.16                      11.84
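As a quick arithmetic check, the per-user values in the table can be averaged to confirm the overall figures quoted on this slide. The values below are copied from the table; the variable names are illustrative only.

```python
# Per-user rates copied from the table (users 1 to 16, in order).
rr = [74.13, 76.39, 81.10, 83.72, 82.86, 84.75, 88.00, 80.86,
      87.06, 86.33, 84.71, 75.96, 72.12, 83.87, 94.22, 88.16]
frr = [25.87, 23.61, 18.90, 16.28, 17.14, 15.25, 12.00, 19.14,
       12.94, 13.68, 15.29, 24.04, 27.88, 16.13, 5.78, 11.84]

avg_rr = sum(rr) / len(rr)     # average recognition rate over all users
avg_frr = sum(frr) / len(frr)  # average false rejection rate over all users

worst_user = rr.index(min(rr)) + 1  # user numbers start at 1
best_user = rr.index(max(rr)) + 1

print(f"average RR = {avg_rr:.2f}%, average FRR = {avg_frr:.2f}%")
print(f"worst user = {worst_user}, best user = {best_user}")
```

The averages round to the 83% and 17% reported above, and the worst and best users are 13 and 15.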


FAR (False Acceptance Ratio) Results for each user

As seen in the table, the FAR performance of the system is within reasonable limits except for users 4, 5, 9, and 14.

Excluding these users, the average FAR is 10%.

When the keystroke latencies of users 4, 5, 9, and 14 are analyzed, we see that these users’ latencies have large standard deviations because of their typing behavior.

Over all 16 users, the average FAR is 26%.

Imposter Access to User   Total Trial Number   Successful Entries   False Acceptance Ratio FAR (%)
1                         3171                 49                   1.55
2                         3170                 0                    0.00
3                         3187                 104                  3.26
4                         3099                 2140                 69.06
5                         3122                 1992                 63.81
6                         3091                 563                  18.21
7                         3139                 367                  11.69
8                         3011                 556                  18.47
9                         3229                 2025                 62.71
10                        3080                 158                  5.13
11                        2987                 73                   2.44
12                        2898                 464                  16.01
13                        3064                 817                  26.66
15                        3159                 3043                 96.33
16                        3141                 95                   3.02
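The FAR column follows from the two count columns: it is the number of successful imposter entries divided by the total number of imposter trials, expressed as a percentage. A minimal sketch using the first three rows of the table (the function name is hypothetical):

```python
def far_percent(successful_entries: int, total_trials: int) -> float:
    """False acceptance ratio as a percentage of imposter trials."""
    return 100.0 * successful_entries / total_trials


# (total trials, successful imposter entries) copied from the first three table rows.
rows = {1: (3171, 49), 2: (3170, 0), 3: (3187, 104)}

far_by_user = {user: round(far_percent(ok, total), 2) for user, (total, ok) in rows.items()}
```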


Discussion

This kind of variation in the results is expected, since each user has different typing skills.

Each user’s typing pattern depends on characteristics such as typing speed, the mood of the typist at the time of typing, and the work being done.


Discussion

The figure shows the averages of the mean keystroke latencies and of 2 × the standard deviations for the 16 users.

The high FAR results from users who have a slow typing speed and different typing behaviors.

As can be seen from the figure, some users in our data set, such as users 5 and 14, have large averages and standard deviations.

These large averages and deviations form a wide band of accepted keystroke latencies, which allows imposters to gain access under the approach used in this study.

It can be concluded that any method based on a similar methodology of averages and standard deviations can be expected to produce a high FAR for users who have a slow typing speed and different typing behaviors.

[Figure: Average, 2*Std, and Average + 2*Std of keystroke latencies for each user. x-axis: Users (1 to 16); y-axis ticks from 0 to 1600000.]


Conclusion

Biometric security systems based on keystroke dynamics can be a considerably effective way to enhance password-based authentication when accessing a computer system.

The approach used in this study compares the latencies of the password at each login and tests whether they fall within two standard deviations of the average reference latency. If all latencies pass this test, the input for that password string is considered valid.

The experimental results obtained in this study yield satisfactory FRR and FAR values for most of the users in our data set.

We tried to improve the FRR and FAR values with preprocessing methods such as outlier removal and normalization (min-max, z-score). Applying these preprocessing methods did not have much effect on the performance of the system.
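For reference, the preprocessing steps named above can be sketched as follows. The source does not specify the exact procedures, so everything here is an assumption: the function names, the use of the sample standard deviation, and the outlier rule (drop values more than `n_sigma` standard deviations from the mean).

```python
from statistics import mean, stdev
from typing import List


def min_max(values: List[float]) -> List[float]:
    """Min-max normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values equal; nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values: List[float]) -> List[float]:
    """Z-score normalization: subtract the mean and divide by the standard deviation."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def remove_outliers(values: List[float], n_sigma: float = 2.0) -> List[float]:
    """Assumed outlier rule: keep values within n_sigma standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= n_sigma * s]


# Made-up latencies (ms) for one key pair, with one obviously slow entry.
raw = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 97.0, 500.0]
cleaned = remove_outliers(raw)
```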


Conclusion

In keystroke recognition, there is no common keystroke dynamics data set that everyone can use to make a comparative evaluation of their methodologies.

We are currently working on an experiment to collect a new keystroke data set with a large number of users. We plan to make this data set publicly available on the Internet.

As future work, we plan to implement different classification algorithms and methods on this data set and make a comparative evaluation of them.


AIA 2007

THANKS

http://www.akyokus.com/Presentations/