Page 1

What Are You Searching For? A Remote Keylogging Attack on Search Engine Autocomplete

Vinnie Monaco
Naval Postgraduate School

Page 2

Search engine autocomplete

[Figure labels: Search query, Packet capture]

Page 3

20 years of network side channels

Page 4

Attack overview

• Predict search queries using only client traffic

• Combine multiple independent weak predictors
  • Escaped URL characters
  • HTTP2 header compression
  • Key-press time intervals
  • Natural language

Page 5

Threat model

• Capture encrypted traffic at the NIC

• Victim types lowercase English letters + space
  • No typos/backspace

• Autocomplete requests triggered by keydown events

Page 6

Attack workflow

• Keystroke detection (packet trace)
• Tokenization
• Dictionary pruning
• Word identification
• Beam search

[Figure: along a time axis, candidate words with probabilities (and / are / the, lazy / onto / that, cat / dog / fox; values shown range from 0.1 to 0.4) are combined by beam search into hypotheses: "the lazy dog", "the lazy fox", "and that dog".]

Page 7

Autocomplete GET requests

GET /complete/search?q=t&cp=1
GET /complete/search?q=th&cp=2
GET /complete/search?q=the&cp=3
GET /complete/search?q=the%20&cp=4
GET /complete/search?q=the%20l&cp=5
GET /complete/search?q=the%20la&cp=6
GET /complete/search?q=the%20laz&cp=7
GET /complete/search?q=the%20lazy&cp=8
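The request sequence can be reproduced with a short sketch. The function name `autocomplete_requests` is hypothetical, and the path and the `q`/`cp` parameters are copied from the requests listed on this slide; real autocomplete requests carry additional parameters and headers.

```python
from urllib.parse import quote


def autocomplete_requests(query, path="/complete/search"):
    """Return one request line per typed character (one per keydown event)."""
    requests = []
    for n in range(1, len(query) + 1):
        prefix = quote(query[:n])  # a space is escaped as %20
        requests.append(f"GET {path}?q={prefix}&cp={n}")
    return requests


for line in autocomplete_requests("the lazy"):
    print(len(line), line)
```

Each printed length grows by 1 per letter and by 3 when the space is typed, because the single space character becomes the three characters `%20`.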

Page 8

Keystroke detection

• Find the longest increasing subsequence (LIS) of packet sizes

[Figure: Baidu example, searching for “the lazy dog”. Annotations: Page load, First keystroke.]
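A minimal sketch of the LIS step, assuming the outgoing packet sizes are already available as a list of integers. The trace values below are made up for illustration, and the actual tool may add constraints beyond a plain strictly increasing subsequence.

```python
from bisect import bisect_left


def longest_increasing_subsequence(sizes):
    """Return the indices of one longest strictly increasing subsequence."""
    tail_vals = []  # tail_vals[k]: smallest tail value of a run of length k + 1
    tail_idx = []   # index in `sizes` of each of those tail values
    prev = [-1] * len(sizes)  # back-pointers used to rebuild the subsequence
    for i, size in enumerate(sizes):
        k = bisect_left(tail_vals, size)
        if k > 0:
            prev[i] = tail_idx[k - 1]
        if k == len(tail_vals):
            tail_vals.append(size)
            tail_idx.append(i)
        else:
            tail_vals[k] = size
            tail_idx[k] = i
    indices = []
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        indices.append(i)
        i = prev[i]
    return indices[::-1]


# Hypothetical trace: page-load packets first, then keystroke packets mixed with noise.
trace = [540, 310, 101, 102, 600, 103, 106, 107, 95, 108, 109, 110]
keystrokes = longest_increasing_subsequence(trace)
print([trace[i] for i in keystrokes])
```

The packets that fall outside the subsequence (page load, unrelated traffic) are discarded, and the remaining ones are treated as one packet per keystroke.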

Page 9

Tokenization

Request                                   Packet size difference
GET /complete/search?q=t&cp=1
GET /complete/search?q=th&cp=2            +1
GET /complete/search?q=the&cp=3           +1
GET /complete/search?q=the%20&cp=4        +3
GET /complete/search?q=the%20l&cp=5       +1
GET /complete/search?q=the%20la&cp=6      +1
GET /complete/search?q=the%20laz&cp=7     +1
GET /complete/search?q=the%20lazy&cp=8    +1
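The size differences above suggest a simple tokenizer: a jump of 3 marks a space, anything smaller marks a letter. The sketch below assumes uncompressed request sizes and a query that does not start with a space; the function name `tokenize` and the sample sizes are hypothetical.

```python
def tokenize(sizes, space_delta=3):
    """Split a sequence of keystroke packet sizes into word lengths."""
    if not sizes:
        return []
    word_lengths = [1]  # the first packet carries the first letter
    for prev, cur in zip(sizes, sizes[1:]):
        if cur - prev >= space_delta:
            word_lengths.append(0)  # space typed: start a new word
        else:
            word_lengths[-1] += 1   # letter typed: extend the current word
    return word_lengths


# Hypothetical sizes whose differences are +1 +1 +3 +1 +1 +1 +1.
print(tokenize([101, 102, 103, 106, 107, 108, 109, 110]))
```

For these sizes the result is a three-letter word followed by a four-letter word, matching "the lazy".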

Page 10

HPACK (HTTP2 header compression)

Static Huffman Encoding

Page 11

PETAL (Preset Encoding Table Information Leakage)

D O G S: 6 + 5 + 6 + 5 = 22 bits
F I S H: 6 + 5 + 5 + 6 = 22 bits
V O T E: 7 + 5 + 5 + 5 = 22 bits
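The sums on this slide can be checked against the HPACK static Huffman table. The sketch below covers only lowercase letters, with code lengths taken from RFC 7541, Appendix B; the helper names `code_length` and `word_bits` are hypothetical.

```python
# Code lengths of lowercase letters in the HPACK static Huffman table
# (RFC 7541, Appendix B). All other lowercase letters use 6 bits.
FIVE_BIT = set("aceiost")
SEVEN_BIT = set("jkqvwxyz")


def code_length(ch):
    """Return the Huffman code length in bits of one lowercase letter."""
    if len(ch) != 1 or not "a" <= ch <= "z":
        raise ValueError(f"unsupported character: {ch!r}")
    if ch in FIVE_BIT:
        return 5
    if ch in SEVEN_BIT:
        return 7
    return 6


def word_bits(word):
    """Return the per-letter code lengths of a word."""
    return [code_length(c) for c in word.lower()]


for word in ("dogs", "fish", "vote"):
    bits = word_bits(word)
    print(word, " + ".join(map(str, bits)), "=", sum(bits), "bits")
```

Because the table is fixed and public, different words can share a total length, so a single compressed size narrows the candidates without identifying the word on its own.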

Page 12

Incremental compression

D O G S: 22 bits
D O G: 17 bits
?: 22 − 17 = 5 bits

One of these… a e i o s t
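The same lookup can be run in reverse: given the growth in compressed size between two consecutive requests, list the letters whose code has that length. This is a bit-level sketch as on the slide; an observer sees whole bytes, since HPACK pads Huffman-encoded strings to an octet boundary. Code lengths follow RFC 7541, Appendix B, where the letter c also has a 5-bit code, so it appears in the output alongside the six letters listed above. The helper names are hypothetical.

```python
import string

# Code lengths of lowercase letters in the HPACK static Huffman table
# (RFC 7541, Appendix B). All other lowercase letters use 6 bits.
FIVE_BIT = set("aceiost")
SEVEN_BIT = set("jkqvwxyz")


def code_length(ch):
    """Return the Huffman code length in bits of one lowercase letter."""
    if ch in FIVE_BIT:
        return 5
    if ch in SEVEN_BIT:
        return 7
    return 6


def candidates(bit_delta):
    """Return the lowercase letters whose code length equals bit_delta."""
    return [c for c in string.ascii_lowercase if code_length(c) == bit_delta]


delta = 22 - 17  # "dogs" minus "dog"
print(delta, candidates(delta))
```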

Page 13

Dictionary pruning

Observed

dogs
guns
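One way to picture the pruning step is to compare each dictionary word's cumulative compressed size, keystroke by keystroke, with the observed sizes. The sketch below makes strong simplifying assumptions: each word is encoded in isolation starting at a byte boundary, sizes are rounded up to whole bytes, and the dictionary is a made-up list. The names `byte_signature` and `prune` are hypothetical and are not taken from the kreep source.

```python
import math

# Code lengths of lowercase letters in the HPACK static Huffman table
# (RFC 7541, Appendix B). All other lowercase letters use 6 bits.
FIVE_BIT = set("aceiost")
SEVEN_BIT = set("jkqvwxyz")


def code_length(ch):
    """Return the Huffman code length in bits of one lowercase letter."""
    if ch in FIVE_BIT:
        return 5
    if ch in SEVEN_BIT:
        return 7
    return 6


def byte_signature(word):
    """Return the compressed size in bytes after each keystroke of a word."""
    total_bits = 0
    signature = []
    for ch in word:
        total_bits += code_length(ch)
        signature.append(math.ceil(total_bits / 8))  # padded to whole bytes
    return signature


def prune(dictionary, observed):
    """Keep only the words whose byte signature matches the observation."""
    return [w for w in dictionary if byte_signature(w) == observed]


dictionary = ["dogs", "guns", "fish", "cats", "the"]  # illustrative only
print(prune(dictionary, byte_signature("dogs")))
```

Under these assumptions "dogs" and "guns" produce the same byte signature and both survive pruning, while "fish", "cats", and "the" are eliminated.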

Page 14

Word identification

• Use a BiRNN to predict keys

[Figure: packet arrivals over time for the keys D, O, G, S, with the intervals between arrivals labeled Short, Long, Short.]

Page 15

Language model and beam search

Which word comes next?

> the lazy ____

1) dog
2) car
3) hat
4) big

Top 50 hypotheses:

the lazy dog
the blue car
and some fox
…
how they run
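A toy version of the beam search, assuming each word position comes with a small set of candidate words and probabilities from the word identification step, and that a language model returns a bigram probability. All probabilities below are invented for illustration, and `beam_search`, `BIGRAMS`, and `bigram_prob` are hypothetical names.

```python
import math
from heapq import nlargest


def beam_search(candidates, bigram_prob, beam_width=50):
    """Return the top hypotheses as (sentence, probability) pairs."""
    beams = [(0.0, [])]  # (log probability, words so far)
    for options in candidates:
        expanded = []
        for score, words in beams:
            prev = words[-1] if words else "<s>"
            for word, p_word in options.items():
                p = p_word * bigram_prob(prev, word)
                if p > 0:
                    expanded.append((score + math.log(p), words + [word]))
        # Keep only the best `beam_width` partial hypotheses.
        beams = nlargest(beam_width, expanded, key=lambda b: b[0])
    return [(" ".join(words), math.exp(score)) for score, words in beams]


# Invented bigram probabilities; unseen pairs get a small default value.
BIGRAMS = {
    ("<s>", "the"): 0.5, ("<s>", "and"): 0.3,
    ("the", "lazy"): 0.4, ("and", "that"): 0.4,
    ("lazy", "dog"): 0.5, ("lazy", "fox"): 0.3, ("that", "dog"): 0.3,
}


def bigram_prob(prev, word):
    return BIGRAMS.get((prev, word), 0.01)


# Invented per-position word probabilities.
candidates = [
    {"and": 0.3, "are": 0.2, "the": 0.5},
    {"lazy": 0.4, "onto": 0.2, "that": 0.4},
    {"cat": 0.2, "dog": 0.5, "fox": 0.3},
]

for sentence, prob in beam_search(candidates, bigram_prob, beam_width=3):
    print(f"{prob:.5f}  {sentence}")
```

With a beam width of 3 this keeps three hypotheses at each step; the slide uses 50.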

Page 16

Data collection and results

• Data collection
  • Browser automation with Selenium
  • Replay keystrokes with uinput
  • 4k unique queries
  • 2 search engines (Google, Baidu)
  • 2 browsers (Chrome, Firefox)
  • 16k total queries recorded

• Keystroke detection and tokenization accuracy
  • > 99% (Google and Baidu)

• Top-50 classification accuracy (entire query is correct)
  • 15% (Google)
  • 13% (Baidu)
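The top-50 metric counts a query as correct only if the entire true query appears among the first 50 hypotheses. A minimal sketch, with a hypothetical function name and made-up inputs:

```python
def top_k_accuracy(truths, hypotheses, k=50):
    """Fraction of queries whose full text appears in the first k hypotheses."""
    hits = sum(truth in hyps[:k] for truth, hyps in zip(truths, hypotheses))
    return hits / len(truths)


truths = ["he is recovering from a sprained", "the lazy dog"]
hypotheses = [
    ["he is recovering from a sprained", "he is recovering from a strained"],
    ["the lazy fox", "and that dog"],
]
print(top_k_accuracy(truths, hypotheses))
```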

Example

Truth
he is recovering from a sprained

Good hypotheses
he is recovering from a sprained
he is recovering from a strained

Bad hypotheses
to be president from a position
is to learn from such a position

Page 17

Conclusions

• This attack has many moving parts…
  • Several independent weak side channels combine to create a strong one

• Language modeling is key
  • The predictability of human behavior is difficult to mask

• Where else does incremental compression occur?
  • Thin clients/websites with an autosave feature?
  • Mapping services (latitude/longitude changes incrementally)?

Page 18

Thank you

• Source code
  • kreep (keystroke recognition and entropy elimination program)
  • https://github.com/vmonaco/kreep

• Contact me
  • https://vmonaco.com

• Questions?
