Strata 2015 Presentation -- Detecting Lateral Movement


Outline:
• Problems
• Sensors/detections
• Ranking

Why is this important?

Why is this difficult?

Problem #1: Independent alert streams

Problem #2: Burden of triage

"Attacks are complex. Need more detections!"
"So, now I have to triage all of them?"

Problem #3: Feedback not captured

Problem #4: Interpretability of alerts

Windows Security Events Data

On average, an online service in O365 produces 30 billion sessions/day; 82 TB/day.

Data: Sequences of Windows security event IDs from user sessions
• Examples: user logs into machine, process start, credential switch, etc.
• 367 unique security event IDs

• We built separate models to detect our goal: compromised accounts/machines
• The models independently assess whether an account is acting suspiciously

[Figure: example per-model detectors: probability of logging, sequences of events, credential elevation, auto-generated]
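To make the "sequences of events" detector concrete, here is a minimal sketch, assuming a first-order Markov chain over event IDs; the deck does not specify the model family, the function names are hypothetical, and event IDs 4624/4688/4672 are the standard Windows logon, process-creation, and special-privilege events, used here only as illustration:

from collections import defaultdict
import math

def train_markov(sessions):
    """Fit first-order transition probabilities over event IDs.
    sessions: list of event-ID lists, e.g. [[4624, 4688, 4672], ...]."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in sessions:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    probs = {}
    for a, nxt in counts.items():
        total = sum(nxt.values())
        probs[a] = {b: c / total for b, c in nxt.items()}
    return probs

def neg_log_prob(session, probs, floor=1e-6):
    """-log P(session): large when the session contains rare
    transitions; unseen transitions get a small floor probability
    so the score stays finite."""
    return -sum(math.log(probs.get(a, {}).get(b, floor))
                for a, b in zip(session, session[1:]))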

[Figure: a session x is scored by each model P_1, ..., P_d; the per-model scores P_1(x), ..., P_d(x) are combined with weights w_1, ..., w_d into a combined score]

Burges, Chris, et al. "Learning to rank using gradient descent." ICML 2005.

[Figure: pairwise training: a malicious session m and a benign session b are each scored by P_1, ..., P_d; the weights w_1, ..., w_d are learned so the combined score ranks m above b, i.e., maximizes P(m > b)]
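A minimal sketch of this pairwise weight learning, assuming the RankNet-style logistic loss of Burges et al. (2005) on score differences; the function name and gradient-descent hyperparameters are illustrative, and the features are taken to be the per-model -log P_i values introduced on the next slide:

import numpy as np

def ranknet_weights(feat_mal, feat_ben, lr=0.01, epochs=200):
    """Learn weights w so malicious sessions outrank benign ones.
    feat_mal, feat_ben: (n_pairs, d) arrays of per-model features
    for the malicious session m and benign session b of each pair."""
    w = np.zeros(feat_mal.shape[1])
    for _ in range(epochs):
        diff = (feat_mal - feat_ben) @ w    # s(m) - s(b) per pair
        p = 1.0 / (1.0 + np.exp(-diff))     # P(m > b), logistic
        # gradient of the average -log P(m > b) with respect to w
        grad = -((1.0 - p)[:, None] * (feat_mal - feat_ben)).mean(axis=0)
        w -= lr * grad
    return w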

Putting it together

For a session x, each model contributes the feature -log P_i(x); with weights w_1, ..., w_d:

Rank Score = w^T P, where P = (-log P_1(x), ..., -log P_d(x))
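As a sketch, the final scoring step in code (function name hypothetical): taking -log maps rare, suspicious behavior to large positive features, so a high weighted sum means the session is unlikely under every model at once.

import numpy as np

def rank_score(model_probs, w):
    """Rank Score = w^T P with P_i = -log P_i(x).
    model_probs: per-model probabilities P_i(x) for one session."""
    return float(w @ -np.log(np.asarray(model_probs)))

# e.g. rank_score([1e-6, 1e-4, 1e-2], np.array([0.5, 0.3, 0.2]))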

Testing the system
• Wargame with the red team
• Blind experiment
• 8 out of 12 top-ranked sessions on day 1, among ~28 billion sessions, are pen testers; precision at 12 is 96%
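Precision at k, the metric quoted above, is just the fraction of the top-k ranked sessions that are confirmed malicious; a one-function sketch (hypothetical name):

def precision_at_k(ranked_labels, k):
    """ranked_labels: 1 = confirmed malicious (e.g. pen tester),
    0 = benign, sorted by descending rank score."""
    top = ranked_labels[:k]
    return sum(top) / len(top)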

Alert Score Weights w'_1, w'_2, ..., w'_d
• Higher weight = larger contributing factor to the alert
• Tells the user the probable cause of the alert
• Extensible
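One way to surface the probable cause, sketched under the assumption that each model's contribution to the rank score is w'_i * (-log P_i(x)); the function and argument names are illustrative:

import numpy as np

def explain_alert(model_probs, w, names):
    """Per-model contribution to the rank score, largest first:
    the top entry is the probable cause of the alert."""
    contrib = w * -np.log(np.asarray(model_probs))
    order = np.argsort(contrib)[::-1]
    return [(names[i], float(contrib[i])) for i in order]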

Reality
• Constantly changing environment… but you can account for it during training and by adding metadata
• In the beginning, there will be false positives… but you will reduce your attack surface
• No labelled data… but you can get away with a good red team

Takeaways

Combine alert streams

Make your alerts interpretable

Capture feedback and close the last mile

Check out ranking algorithms – they are powerful!
