A Model Counter For Constraints Over Unbounded Strings

Loi Luu*, Shweta Shinde*, Prateek Saxena*, Brian Demsky+

National University of Singapore*

University of California, Irvine+


Example: Quantifying Password Strength Meters


Password Cracking Dictionary

Password Database

How many?

STRONG
ch1 = [a-Z]*
ch2 = [0-9]*
ch3 = [@#?^&*-+]*
INPUT ch1^ch2^ch3

INPUT Dictionary


The Model Counting Problem

String Constraints → String Model Counter → # of Solutions


Contributions

• SMC: String Model Counter
  – Handles constraints over structured data types
  – Available at https://github.com/loiluu/smc

• Handles Unbounded Strings
  – Uses Generating functions

• Many Practical Applications
  – Quantifying Password Strength meters
  – Quantifying Information Leaks via Side Channels

Fast & Scalable · Expressive · Better Precision


Technical Problem Definition

• Set of string constraints (e.g., the policy for a STRONG password)

• Set of feasible string solutions for those constraints (all possible STRONG passwords)

• Model count: the size of that solution set (the number of STRONG passwords)

• Model count bounds for a given string length


Soundness & Precision

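ε-precise: distance of LB and UB on log-scale

[Figure: count axis with the lower bound (LB), the Exact Count, and the upper bound (UB) marked; ε is the gap between LB and UB; regions labeled Unsound and Imprecise]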


Insufficiency of Previous Approaches: Enumeration or Sampling

• Enumerate more than strings [NDSS’ 14]

• Does not scale for large strings


Insufficiency of Previous Approaches: Symbolic Execution & Integer MC

const char* mystrstr(const char* str, const char* search) {
    if (!str || !search) {
        return 0;
    }
    while (*str != '\0') {
        int len = 0;
        const char* sub = search;
        while (*sub != '\0') {
            if (*sub == *str) {
                sub++; str++; len++;
                if (*sub == '\0') {
                    return str - len;   /* full match: return its start */
                }
            } else {
                str -= len;             /* rewind the partial match */
                break;
            }
        }
        str++;
    }
    return 0;
}

Symbolic execution → Symbolic Path Constraint → Integer Model Counter → Model Count

5100 Paths


String Model Counting: Representing Set Cardinalities with GFs

Example: Set of all strings over alphabet {a, b}:

S0 = {Φ}, |S0| = 1

S1 = {a, b}, |S1| = 2

S2 = {aa, ab, ba, bb}, |S2| = 4

S3 = {aaa, aab, aba, abb, …}, |S3| = 8

. . .
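The cardinalities on this slide can be checked by brute force. The sketch below is only an illustration and is not part of SMC; the two-letter alphabet and the helper name `strings_of_length` are assumptions made for this example.

```python
from itertools import product

ALPHABET = "ab"  # assumed two-letter alphabet, matching the slide's example


def strings_of_length(n):
    """Return every string of length n over ALPHABET."""
    return ["".join(chars) for chars in product(ALPHABET, repeat=n)]


# Print the size and contents of S0..S3.
for n in range(4):
    s_n = strings_of_length(n)
    print(n, len(s_n), s_n)
```

Enumeration like this grows exponentially with the length, which is why a compact representation of the sequence of cardinalities is attractive.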


String Model Counting: Representing String Sets with GFs

• Can be viewed as a Series:

• Represent as polynomial

• Has a closed form algebraic expression:

• Can represent infinite sets!

Generating Function (or GF)
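For the running example of all strings over a two-letter alphabet, the series, polynomial, and closed form can be written out as below. The formulas on the original slide were not preserved, so this is a reconstruction of the standard result; the names G and z are assumptions chosen for illustration.

```latex
\begin{align*}
\text{series of cardinalities:}\quad & 1,\ 2,\ 4,\ 8,\ \dots,\ 2^n,\ \dots \\
G(z) &= 1 + 2z + 4z^2 + 8z^3 + \cdots = \sum_{n \ge 0} 2^n z^n \\
G(z) &= \frac{1}{1 - 2z}
\end{align*}
```

The closed form is a finite expression even though the underlying set of strings is infinite, which is the point of the last bullet.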


Recovering Coefficient from GFs

• How many strings of length 3?

• Coefficient of the degree-3 term in the GF
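A minimal sketch of coefficient recovery, assuming the GF is a rational function given as two coefficient lists. The slides indicate that SMC delegates algebraic computation to Mathematica; the hand-written power-series division below (function name `series_coefficients` is hypothetical) only illustrates the idea.

```python
def series_coefficients(numerator, denominator, n_terms):
    """Expand numerator(z) / denominator(z) as a power series.

    Both polynomials are lists of integer coefficients, lowest degree first.
    Assumes denominator[0] == 1 so every coefficient stays an integer.
    """
    if denominator[0] != 1:
        raise ValueError("this sketch assumes a constant term of 1")
    coeffs = []
    for n in range(n_terms):
        value = numerator[n] if n < len(numerator) else 0
        # c_n = p_n - sum_{k >= 1} q_k * c_{n-k}
        for k in range(1, min(n, len(denominator) - 1) + 1):
            value -= denominator[k] * coeffs[n - k]
        coeffs.append(value)
    return coeffs


# 1 / (1 - 2z): all strings over a two-letter alphabet.
coeffs = series_coefficients([1], [1, -2], 6)
print(coeffs[3])  # number of strings of length 3
```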


Modeling String Ops Over GFs: Concatenation

{+, -} • {a, b, aa, ab, ba, bb, …} = {+a, -a, +b, -b, +aa, -aa, +ab, -ab, +ba, -ba, …}
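The concatenation example can be sketched as a product of generating functions, under the assumption that every concatenated string splits in exactly one way (true here, since the sign is always the first character). The coefficient-list encoding and the helper names are assumptions for this illustration, not SMC's internals.

```python
from itertools import product


def multiply(f, g, n_terms):
    """Multiply two truncated GFs given as coefficient lists (index = string length)."""
    result = [0] * n_terms
    for i, fi in enumerate(f[:n_terms]):
        for j, gj in enumerate(g[:n_terms - i]):
            result[i + j] += fi * gj
    return result


N = 5
signs = [0, 2, 0, 0, 0]                      # {+, -}: two strings of length 1
words = [0] + [2 ** n for n in range(1, N)]  # non-empty strings over {a, b}
concat = multiply(signs, words, N)


def brute_force(length):
    """Count distinct strings sign + word of the given total length by enumeration."""
    if length < 2:
        return 0  # needs one sign and at least one letter
    seen = set()
    for sign in "+-":
        for chars in product("ab", repeat=length - 1):
            seen.add(sign + "".join(chars))
    return len(seen)


print(concat)
```

The product and the enumeration agree for every length below N; if the same string could be split in two different ways, the product would count it twice.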

Modeling String Ops Over GFs: Regular Expression Match
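The slide's formulas were not preserved, so the following is a sketch of the textbook translation from regular expressions to generating functions (as in Analytic Combinatorics), not necessarily SMC's exact rules. It assumes unambiguous expressions: alternation of disjoint languages becomes a sum, concatenation a product, and Kleene star 1 / (1 - G). All helper names are hypothetical.

```python
import re
from itertools import product

N = 6  # keep coefficients for string lengths 0..N-1


def char():
    """GF of a single character: one string of length 1."""
    return [0, 1] + [0] * (N - 2)


def union(f, g):
    """Alternation; exact only when the two languages are disjoint."""
    return [a + b for a, b in zip(f, g)]


def concat(f, g):
    """Concatenation; exact only when each string splits in one way."""
    out = [0] * N
    for i in range(N):
        for j in range(N - i):
            out[i + j] += f[i] * g[j]
    return out


def star(f):
    """Kleene star as 1 / (1 - f); assumes f[0] == 0 and unique decomposition."""
    out = [1] + [0] * (N - 1)
    for n in range(1, N):
        out[n] = sum(f[k] * out[n - k] for k in range(1, n + 1))
    return out


a, b, c = char(), char(), char()
gf_ab_star = star(union(a, b))               # (a|b)*
gf_abc_star = star(union(concat(a, b), c))   # (ab|c)*


def brute_force(pattern, alphabet, length):
    """Count strings of the given length that fully match the pattern."""
    regex = re.compile(pattern)
    return sum(1 for chars in product(alphabet, repeat=length)
               if regex.fullmatch("".join(chars)))


print(gf_ab_star, gf_abc_star)
```

Both coefficient lists agree with direct enumeration for the lengths covered here, which is a quick sanity check on the translation rules.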

Preserving Precision: contains Operation

aba babbbab aba bbb

Modeling contains as regex is not always precise

Exact!
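One way to see why a regex encoding of contains can be imprecise: translating `.*ab.*` into a GF product counts a string once per occurrence of the pattern. The sketch below compares that naive count with an exact brute-force count over a two-letter alphabet. The pattern "ab", the alphabet, and the function names are assumptions for this illustration; it does not show how SMC itself obtains the exact count.

```python
from itertools import product


def naive_regex_count(n, pattern_len=2, alphabet_size=2):
    """Coefficient of z^n in the GF product for .* pattern .* (counts every split)."""
    if n < pattern_len:
        return 0
    free = n - pattern_len
    # The free characters are divided between prefix and suffix in (free + 1) ways.
    return (free + 1) * alphabet_size ** free


def exact_contains_count(n, pattern="ab", alphabet="ab"):
    """Count strings of length n that contain the pattern, by enumeration."""
    return sum(1 for chars in product(alphabet, repeat=n)
               if pattern in "".join(chars))


for n in range(2, 6):
    print(n, naive_regex_count(n), exact_contains_count(n))
```

At length 4 the naive product gives 12 while only 11 strings actually contain "ab", because "abab" is counted for each of its two occurrences.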

SMC: Full Language

RegExp := character | ε

| RegExp RegExp | RegExp|RegExp | RegExp*

Constraint := Var = Var | Var IN RegExp

| Var = Var • Var | Var = ConsString

| contains(Var, ConsString) | strstr(Var, ConsString)

| length(Var) ○ Num

○ := < | ≤ | > | ≥ | ≠

Formula := Formula | Formula OR Formula | Formula AND Formula | NOT Formula | Constraint

Core Constraints

Full Constraints - String level

Combining multiple Constraints

SMC Design

[Diagram stages: Constraints; Constraint Formula; Generating Function Translation; CNF Translation; Algebraic Computation (Mathematica); Generating Function; Evaluate at n; Model Count]

Case Studies & Applications

Experiments

• Real-world Password strength meters
  – 3 websites: Drupal, eBay and Microsoft meters [NDSS '14]

• Quantify Leakage in C Programs
  – 4 UNIX Utilities: obscure, grep, wc, csplit [CCS '13]
  – 2 Web Servers: Ghttpd, Null HTTPd [ASPLOS '08]

• String constraints from JavaScript applications
  – 18,901 path constraints from 18 real world apps [SSP '10]
    • 13 iGoogle gadgets and 5 AJAX applications

Application I: Password Strength Meters
Guessing Attacks w/o Dictionaries

• eBay: Length | Invalid | Weak | Medium | Strong (rows: L = 5, L = 10)

• Microsoft: Length | Weak | Medium (rows: L = 5, L = 10)

• Drupal: Length | Weak | Fair | Good | Strong (rows: L = 5, L = 10)

Application I: Password Strength Meters
Guessing Attacks with Dictionaries

• How many of these also exist in JtR Database?
  – Password database dictionary of 3,106 words

STRONG PASSWORDS: the smaller, the better

Application I: Password Strength Meter

Website     Strength   Total (L = 1..10)
eBay        Invalid    2678
eBay        Weak       413
eBay        Medium     3
eBay        Strong     0
Microsoft   Weak       2640
Microsoft   Medium     461
Microsoft   Good       0
Microsoft   Best       0
Drupal      Weak       936
Drupal      Fair       1974
Drupal      Good       369
Drupal      Strong     3

Drupal will be more vulnerable to JtR dictionary attack


Application II: Quantifying Side Channel Leakage

[Diagram: a Malicious VM and a Web Server VM (Sensitive Execution) share the Hypervisor & Hardware; Side Channel Attacks]

input = readline(file);
if (strstr(input, "\n"))
    lines++;
if (strstr(input, "\r") || strstr(input, "\f")) {
    if (linepos > linelength)
        linelength = linepos;
    linepos = 0;
    words++;
}
if (strstr(input, "\t")) {
    linepos += 8 - (linepos % 8);
    words++;
}
write_counts(lines, words);

ch1 = \n
ch2 = \r|\f
INPUT (ch1|ch2)*

[Diagram: SMC answers "How many?" for two constraint sets, "ch1 = \n, ch2 = \r|\f, INPUT (ch1|ch2)*" and "INPUT .*"; the counts are 2^X and 2^Y, and the difference is X - Y]
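A hedged sketch of the arithmetic behind the X - Y figure, assuming X is the base-2 logarithm of the count of all inputs and Y the base-2 logarithm of the count of inputs matching the path constraint. The 256-symbol alphabet, the input length of 4, and the function names are assumptions for illustration only; they are not taken from the slides.

```python
import math


def count_star(alphabet_size, length):
    """Number of strings of the given length matching (c1|...|ck)* for k distinct characters."""
    return alphabet_size ** length


def leakage_bits(total_count, consistent_count):
    """Difference of the two model counts on a log2 scale (X - Y)."""
    return math.log2(total_count) - math.log2(consistent_count)


length = 4                            # assumed input length
total = count_star(256, length)       # INPUT .* over bytes (assumed alphabet size)
consistent = count_star(3, length)    # INPUT (\n|\r|\f)*
bits = leakage_bits(total, consistent)
print(round(bits, 2))
```

The closed-form counts used here stand in for what a model counter would return for each constraint set.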

Application II: Quantifying Side Channel Leakage

[Bar chart: Path Leakage. x-axis: UNIX Utilities (grep, wc, csplit, obscure); y-axis: Leakage in Bits (0 to 800); bar values: 355.3, 750.1, 179.4, 0.133]


Tool Evaluation

Result I: Speed

• SMC vs. FuzzBALL [PLAS '09]

Program                         len   SMC       FuzzBALL
Obscure                         6     0.5 sec   2 Hrs
strstr(input, "abc")!=NULL      5     0.4 sec   2 Hrs
strstr(input, "abc")!=NULL      4     0.5 sec   150 sec
match regex(input, "(a|b)*")    4     0.4 sec   2 Hrs

• SMC vs. QUAIL [CAV '13]

Program                         len   SMC       QUAIL
strstr(input, "ab")=2           5     0.2 sec   6.1 sec
strstr(input, "ab")=2           7     0.2 sec   648 sec
input.contains("ab")            5     0.3 sec   5.1 sec
input.contains("ab")            7     0.3 sec   606 sec

Result II: Expressiveness (JavaScript Applications & UNIX utilities)

[Chart: Frequency of Constraints in Kaluza's benchmarks (18,901 JavaScript Benchmarks) and UNIX Case studies; categories: regexes, Contains, length, concatenation, comparison with const string; data labels: 38242, 8338289771, 43381, 191, 95239, 202, 24]

Result III: Precision

• SMC vs. FuzzBALL [PLAS '09]

Program                         len   SMC Bits   SMC ε-precise   FuzzBALL
Obscure                         6     0          0.06            > 2 Hrs
strstr(input, "abc")!=NULL      5     22.4       0               > 2 Hrs
strstr(input, "abc")!=NULL      4     23.0       0               13.5
match regex(input, "(a|b)*")    4     32         0.13            > 2 Hrs

• SMC vs. Castro et al. [ASPLOS '08]

Program      len   SMC Bits   SMC ε-precise   Castro et al.
Ghttpd       620   80.2       0.003           248
Null HTTPd   500   248.0      0.002           500


Conclusion

• SMC: String Model Counter

• Handles Unbounded Strings

• Practical Applications
  – Quantifying Password Strength meters
  – Quantifying Information Leaks via Side Channels

Fast & Scalable · Expressive · Better Precision

Related Work

• [Cambridge University Press '09] R. Sedgewick and P. Flajolet. Analytic Combinatorics.

• [JAIR '99] E. Birnbaum and E. L. Lozinskii. The good old davis-putnam procedure helps counting models.

• [FOCS '93] A. I. Barvinok. A Polynomial Time Algorithm for Counting Integral Points in Polyhedra When the Dimension Is Fixed.

• [Algorithmica '07] S. Verdoolaege, R. Seghir, K. Beyls, V. Loechner, and M. Bruynooghe. Counting Integer Points in Parametric Polytopes Using Barvinok's Rational Functions.

• [SSP '09] M. Backes, B. Kopf, and A. Rybalchenko. Automatic Discovery and Quantification of Information Leaks.

• [PLAS '09] J. Newsome, S. McCamant, and D. Song. Measuring Channel Capacity to Distinguish Undue Influence.

• [CAV '13] F. Biondi, A. Legay, L.M. Traonouez, and A. Wasowski. QUAIL: A quantitative security analyzer for imperative code.

• [SEN '12] Q.S. Phan, P. Malacaria, O. Tkachuk, and C. S. Pasareanu. Symbolic Quantitative Information Flow.

• LattE Tool. http://www.math.ucdavis.edu/~latte/

• RelSat Tool. http://code.google.com/p/relsat/


Contact

• Loi Luu, Shweta Shinde {loiluu, shweta24} @comp.nus.edu.sg

• Our SMC Tool is available at:
  – https://github.com/loiluu/smc

Thank You!

References

• [FoSSaCS '09] G. Smith. On the Foundations of Quantitative Information Flow.

• [CCS '13] S. Tople, S. Shinde, Z. Chen, and P. Saxena. AutoCrypt: Enabling homomorphic computation on servers to protect sensitive web content.

• [ASPLOS '08] M. Castro, M. Costa, and J.-P. Martin. Better Bug Reporting with Better Privacy.

• [SSP '10] P. Saxena, D. Akhawe, S. Hanna, F. Mao, S. McCamant, and D. Song. A Symbolic Execution Framework for JavaScript.

• [NDSS '14] X. de Carne de Carnavalet and M. Mannan. From Very Weak to Very Strong: Analyzing Password-Strength Meters.

• John the Ripper password cracker. http://www.openwall.com/john

• Wolfram Mathematica. http://www.wolfram.com/mathematica


Backup Slides

Robustness

• Real world JavaScript Benchmarks [SSP '10]

Evaluation Parameters                                 Big Test Cases (variable > 4)   Small Test Cases (variable < 4)
Number of tests                                       1342                            17559
Total running time                                    1h 58 mins                      1h 9 mins
Average no. of constraints                            187                             2.05
Average running time                                  5.29 seconds                    0.24 seconds
Test cases for which SMC reports exact model count    21%                             94%