35
1 Data Leaks In the newsSep ‘17 143M Mar ‘19 passwords stored in readable format 600M Nov ’18 500M 1.8B US ~500 companies 2018 Cost of a Data Breach Study www.ibm.com/security/data-breach Data Breaches shutdown after data leaks 0.5M Apr ‘19 exposed user data 1B Mar ‘18

In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

1

Data Leaks In the news…

Sep‘17143M 👤

Mar‘19passwords stored in readable format 600M 👤

Nov’18500M 👤 1.8B US ~500 companies 2018

CostofaDataBreachStudywww.ibm.com/security/data-breach

Data Breaches

shutdownafterdataleaks0.5M 👤

Apr‘19exposeduserdata1B 👤

Mar‘18

Page 2: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

Dynamic Taint Tacking tracksinformationflow

2TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves!

name

scanf( );

send( );

cc#

Page 3: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

explicitdataflows

Propagation

AssociatetaintswithsensitivedataPropagatetaintstoderivedvaluesChecktaintedvaluesdon’treachuntrustedchannels

programargumentskeyboard

filesnetwork

Sources

sendtonetwork

Sinks

x = secret + y; if (secret) x = y;

implicitcontrolflows

printtoscreenwritetofile

TaintTracking 3

Dynamic Taint Tacking canpreventinformationleak

isslow! OptimisticHybridAnalysis withSafeElisions improves!

Page 4: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

enablespowerfulanalyses

overwriteattackscommandinjectionattacks

XSSattackssecurity

semanticanalysistestinganddebuggingsoftware

engineering

informationleakageprivacy

TaintTracking 4

Dynamic Taint Tacking

isslow! OptimisticHybridAnalysis withSafeElisions improves!

Page 5: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

P r o b l e m

Dynamic Taint Tracking is expensive !

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves!

Page 6: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

~𝟓× slowdown[Newsomeetal.‘05]

main (…) { x = c + 3;

y = secret;

if (p < 0) {

z = c * y;

} out = z;

printf(out); }

is expensive !

secret

c

p

x

y

z

out

⋮ ⋮

MEMORY

track

}}

}

}check}

Dynamic Taint Tacking

TaintTracking isslow! 5OptimisticHybridAnalysis withSafeElisions improves!

Page 7: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

main (…) { x = c + 3;

y = secret;

if (p < 0) {

z = c * y;

} out = z;

printf(out); }

Staticanalyses—dataflowtaintanalysis+pointeranalysis

𝟓× → 𝟐.𝟕× ∴ not effective enough…

TaintTracking isslow!

Static Analysis can help ?

sound

imprecise

notscalable

OptimisticHybridAnalysis withSafeElisions improves! 6

Page 8: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

?undecidableimprecise

SP

P: PossibleprogramstatesS: SoundStaticanalysis’statespace

TaintTracking isslow! 7

Static Analysis Limitation

OptimisticHybridAnalysis withSafeElisions improves!

sound notscalable

Page 9: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

S o l u t i o n

Optimistic Hybrid Analysis

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves!

Page 10: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

SP

O

TaintTracking isslow! OptimisticHybridAnalysis 8

P: PossibleprogramstatesS: SoundStaticanalysis’statespace

T: Testedprogramstates

O: PredicatedStaticanalysis’state space

Predicated Static Analysis

withSafeElisions improves!

T

sound notscalableimpreciseprecise scalableunsound

Page 11: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

p ≥𝟎(Assume)

Forwardoptimization

Backwardoptimization

TaintTracking isslow! OptimisticHybridAnalysis 9

Predicated Static Analysis main (…) { x = c + 3;

y = secret;

if (p < 0) {

z = c * y;

} out = z;

printf(out); }

Optimisticanalyses—dataflowtaintanalysis+pointeranalysis

+invariantassumption

preciseoptimizedforcommoncase

scalable

withSafeElisions improves!

Page 12: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

• likelyunreachablecode• likelycalleesets• likelyunrealizedcallcontexts

Profiling

OptimizedDynamicAnalysis

workflow

main () { unsigned c; c = secret; int x, y, z; if (c < 0) x = secret; if (c == 1) y = secret; z = x + y; ⋮ printf(z); }

likelyinvariants

inputs

[Devecseryetal.‘18]

main () { unsigned c; int x, y, z; c = secret;

if (c < 0) x = secret;

if (c == 1) y = secret; z = x + y; ⋮ printf(z); }

Optimistic Hybrid Analysis

PredicatedStaticAnalysis

TaintTracking isslow! OptimisticHybridAnalysis 10withSafeElisions improves!

Page 13: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

p ≥𝟎(Assume)

1.  likelyUnreachableCode

2.  likelyCalleeSets

3.  likelyUnrealizedCallContexts

TaintTracking isslow! OptimisticHybridAnalysis 11

Optimistic Assumptions

withSafeElisions improves!

invariant violation detection + analysis recovery

detectionrecovery

unsoundsound

{secret}

Taintset

→ missed state ?

{secret,y}

main (…) { x = c + 3;

y = secret;

if (p < 0) {

z = c * y;

} out = z;

printf(out); }

Page 14: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

Optimistic Hybrid Analysis Recovery in OHA is a serious issue

Profiling

OptimizedDynamicAnalysis

main () { unsigned c; c = secret; int x, y, z; if (c < 0) x = secret; if (c == 1) y = secret; z = x + y; ⋮ printf(z); }

likelyinvariants

inputs

main () { unsigned c; int x, y, z; c = secret;

if (c < 0) x = secret;

if (c == 1) y = secret; z = x + y; ⋮ printf(z); }

PredicatedStaticAnalysis

+

TaintTracking isslow! OptimisticHybridAnalysis 12

RecoveryMechanism

Conservativeapproach: Rollbacktothebeginning and re-executewithunoptimizedanalysis

SufficientforofflineanalysisProhibitiveforliveexecutions

withSafeElisions improves!

Page 15: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

UnboundedRollbacks Overheads!

check-pointing

logging rollback-replay

Rollback Recovery is Problematic !

TaintTracking isslow! OptimisticHybridAnalysis 13withSafeElisions improves!

Page 16: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

• FullDynamicAnalysisisprohibitivelyexpensive.

• ConservativeHybridAnalysisisimpreciseandinefficient.

• OptimisticHybridAnalysiscanimprove.

•  ButRollbackRecoveryischallenging.TaintTracking isslow! OptimisticHybridAnalysis 14

RECAP

withSafeElisions improves!

Page 17: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

R o l l b a c k - f r e e

Optimistic Hybrid Analysis

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves!

Page 18: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

metadata

Rollback Recovery

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions 15

Forward

Recovery

improves!

metadata?

Page 19: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

Safe Elisions

ensures metadata equivalence !

Invariantfails

{metadata1}

{metadata2}

=

monitornoop

of noop monitors

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions 16improves!

exact semantics

y = public; {secret}

Taintset

{secret}

Page 20: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

main (…) { x = c + 3;

y = secret;

if (p < 0) {

z = c * y;

} out = z;

printf(out); }

{secret,y}original

safe

unsafe

Predicated forward optimizations are safe

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions 17

ensure exact metadata state !

improves!

{secret}elided

Safe Elisions

of noop monitors

{secret,y}original

{secret,y}elided

=

Page 21: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

•  Separatecontrolflowdomainsfast-pathandslow-path

•  Switchoninvariantfailure

•  Switchoncallreturnfromslow-path

Switching to conservative analysis

fast-path slow-path

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions 18

Forward Recovery :

main()

in()

parse()

lex()

parse_tag()

template() callgraph

improves!

Page 22: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

E v a l u a t i o n

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves!

Page 23: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

•  LLVM3.9compilerinfrastructure•  Cprograms

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves! 19

IODINE Implementation

ConservativeStatic:

•  Andersen’spointeranalysis(contextinsensitive)

•  data-flowtaintanalysis

ConservativeHybrid Rollback-freeOptimisticHybrid

Dynamic:

•  tainttrackinginstrumentation-LLVMDataFlowSanitizer

PredicatedStatic:

•  Andersen’spointeranalysis(contextsensitive)

•  taintanalysis:predicatedforward+ conservativebackward

OptimizedDynamic:

•  optimizedtainttracking•  invariantchecking+forwardrecovery

Profiling:3likelyinvarianttypes

Page 24: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

Informationflowsecuritypolicies—

EmailintegrityandprivacyOverwriteattackdetection

7.23

8.14

5.25

1.27

1.32

1.52

1.07

1.07

1.12

1

2

3

4

5

6

7

8

9

smtpintegrity qmqpintegrity nginxsecurityDy

namicTaintTrackingOverhead

FullDynamic ConservativeHybrid Iodine

IODINE accelerates DIFT applications

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves! 20

POSTFIXMailserver

Webserver

4.𝟒× faster than conservative

Page 25: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

improved by 𝟐× Static Analysis Precision

Mailserver Webserver Texteditor CompressiontoolDatabaseGzip POSTFIX

0.550

0.584 0.

686

0.729

0.709

0.580

0.611

0.549

0.602 0.684

0.625

0.383

0.417

0.422

0.427 0.507

0.464

0.478

0.429

0.416

0.465

0.439

0.383

0.364 0.422

0.388 0.447

0.432

0.432

0.372

0.381

0.395

0.401

0.359

0.342

0.379

0.353

0.407

0.417

0.425

0.322

0.293 0.

395

0.367

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0Fractio

nofstaticm

onito

rs

Conservative +UnreachableCodes +CalleeSets +CallContexts

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves! 21

Page 26: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

1.0

1.2

1.4

1.6

1.8

2.0

0 100 200 300 400 500 600 700 8001.0

1.2

1.4

1.6

1.8

2.0

0 20 40 60 80 100 120 140 160

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves! 22

Profiling Effort Normalized

dynam

icana

lysistim

e

Profilingtime(s)

1.0

1.2

1.4

1.6

1.8

2.0

0 500 1000 1500 2000 2500

nginx redis vim

conservative

conservative

conservative

: regressiontestsuitesareadequate!

Page 27: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

[CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

1.0

1.2

1.4

1.6

1.8

2.0

0 100 200 300 400 500 600 700 800

[CELLRANGE]

[CELLRANGE][CELLRANGE

][CELLRANGE][CELLRANGE

][CELLRANGE

][CELLRANGE

][CELLRANGE

][CELLRANGE

]

[CELLRANGE]

[CELLRANGE][CELLRANGE

][CELLRANGE

][CELLRANGE

]

1.0

1.2

1.4

1.6

1.8

2.0

0 20 40 60 80 100 120 140 160

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves! 22

Profiling Effort Normalized

dynam

icana

lysistim

e

Profilingtime(s)

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE]

1.0

1.2

1.4

1.6

1.8

2.0

0 500 1000 1500 2000 2500

nginx redis vim

conservative

conservative

conservative

: regressiontestsuitesareadequate!

Page 28: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

[CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

1.0

1.2

1.4

1.6

1.8

2.0

0 100 200 300 400 500 600 700 800

[CELLRANGE]

[CELLRANGE][CELLRANGE

][CELLRANGE][CELLRANGE

][CELLRANGE

][CELLRANGE

][CELLRANGE

][CELLRANGE

]

[CELLRANGE]

[CELLRANGE][CELLRANGE

][CELLRANGE

][CELLRANGE

]

1.0

1.2

1.4

1.6

1.8

2.0

0 20 40 60 80 100 120 140 160

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves! 22

Profiling Effort Normalized

dynam

icana

lysistim

e

Profilingtime(s)

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE]

1.0

1.2

1.4

1.6

1.8

2.0

0 500 1000 1500 2000 2500

nginx redis vim

RegressionTests BetaTests

: regressiontestsuitesareadequate!

conservative

conservative

conservative

Page 29: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

T a k e a w a y s

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves!

Page 30: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

PracticalDynamicTaintTracking:

-  𝟐.𝟖×loweroverheadthanconservativehybridanalysis[ShadowReplica‘13,TaintPipe‘15,StraightTaint‘16]

fulldynamic ~𝟒×

~𝟏.𝟓×conservative

hybrid~𝟏.𝟐×IODINE

native

IODINE Summary

ImprovesOptimisticHybridAnalysis-  Rollback-freeusingonlysafeelisions-  Profilingusingtestsuitesisadequate

TaintTracking isslow! OptimisticHybridAnalysis withSafeElisions improves! 23

Page 31: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program
Page 32: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

Safety Guarantee

ensures metadata equivalence !

Invariantfails

{metadata1}

{metadata2}

=

monitornoop

exact semantics

y = public; {secret}

Taintset

{secret}

Page 33: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

UnboundedRollbacks Overheads!

check-pointing

logging rollback-replay

Rollbacks!

Page 34: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

[CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

1.0

1.2

1.4

1.6

1.8

2.0

0 100 200 300 400 500 600 700 800

[CELLRANGE]

[CELLRANGE][CELLRANGE

][CELLRANGE][CELLRANGE

][CELLRANGE

][CELLRANGE

][CELLRANGE

][CELLRANGE

]

[CELLRANGE]

[CELLRANGE][CELLRANGE

][CELLRANGE

][CELLRANGE

]

1.0

1.2

1.4

1.6

1.8

2.0

0 20 40 60 80 100 120 140 160

Sensitivity to Profiling Normalized

dynam

icana

lysistim

e

Profilingtime(s)

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE][CELLRANGE]

[CELLRANGE][CELLRANGE][CELLRANGE]

1.0

1.2

1.4

1.6

1.8

2.0

0 500 1000 1500 2000 2500

nginx redis vim

RegressionTests BetaTests

conservative

conservative

conservative

Page 35: In the news...explicit data flows Propagation Associate taints with sensitive data Propagate taints to derived values Check tainted values don’t reach untrusted channels program

NewAttackVector: violatelikelyinvariantsBoundedSlowdown:bestavailableconservativeanalysisAdaptingInvariants:re-analyzeexcludingfailedinvariantEarlyDetection: forcesattackertoinduceunusualbehavior

Attacks on Availability