A Model Counter For Constraints Over Unbounded Strings
Loi Luu* Shweta Shinde*
Prateek Saxena* Brian Demsky+
National University of Singapore*
University of California, Irvine+
Example: Quantifying Password Strength Meters
Password Cracking Dictionary
Password Database
How many?
STRONG
ch1 = [a-Z]*
ch2 = [0-9]*
ch3 = [@#?^&*-+]*
INPUT ∈ ch1^ch2^ch3
INPUT Dictionary
String Model Counter
The Model Counting Problem
String Constraints
# of Solutions
Contributions
• SMC: String Model Counter
  – Handles constraints over structured data types
  – Available at https://github.com/loiluu/smc
• Handles Unbounded Strings
  – Uses Generating Functions
• Many Practical Applications
  – Quantifying Password Strength Meters
  – Quantifying Information Leaks via Side Channels
Fast & Scalable · Expressive · Better Precision
Technical Problem Definition
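• Set of string constraints (example: the policy for a STRONG password)
• Set of feasible string solutions for those constraints (example: all possible STRONG passwords)
• Model count: the size of that solution set (example: the number of STRONG passwords)
• Model count bounds for a given string length: a lower bound (LB) and an upper bound (UB) on the model count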
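To make the problem statement concrete, here is a minimal brute-force sketch of model counting at a fixed string length. The policy `POLICY`, the alphabet `ALPHABET`, and the helper name are hypothetical toy choices, not the STRONG policy from the slides, and enumeration is exactly the approach the deck later argues does not scale.

```python
import re
from itertools import product


def brute_force_model_count(pattern: str, alphabet: str, length: int) -> int:
    """Count strings of exactly `length` over `alphabet` that fully match `pattern`."""
    regex = re.compile(pattern)
    return sum(
        1
        for chars in product(alphabet, repeat=length)
        if regex.fullmatch("".join(chars))
    )


# Hypothetical toy policy: one or more letters followed by one or more digits.
ALPHABET = "ab01"
POLICY = r"[ab]+[01]+"

# Model count per string length 1..4.
counts = {n: brute_force_model_count(POLICY, ALPHABET, n) for n in range(1, 5)}
print(counts)
```

Each count is the size of the solution set at that length; the work grows as |alphabet|^length, which is why a symbolic representation is attractive.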
Soundness & Precision
[Figure: number line from 0 to 2^L with markers LB, Exact Count, and UB; labels: Unsound, Imprecise, ε]
ε-precise: distance between LB and UB on a log scale
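One literal reading of "distance on a log scale" is the difference of the base-2 logarithms of the two bounds. The sketch below computes that gap; whether the original definition normalizes this quantity is not stated on the slide, so treat the function as an assumption rather than the tool's definition.

```python
import math


def log_scale_gap(lower_bound: int, upper_bound: int) -> float:
    """Return log2(UB) - log2(LB), the gap between the bounds in bits.

    Assumption: this is the raw log-scale distance; any normalization
    used in the original work is not reproduced here.
    """
    if lower_bound <= 0 or upper_bound < lower_bound:
        raise ValueError("need 0 < LB <= UB")
    return math.log2(upper_bound) - math.log2(lower_bound)


print(log_scale_gap(4, 16))  # bounds a factor of 4 apart
print(log_scale_gap(8, 8))   # identical bounds, i.e. an exact count
```

A gap of 0 means LB = UB, so the count is exact.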
Insufficiency of Previous Approaches: Enumeration or Sampling
• Enumerate more than strings [NDSS’ 14]
• Does not scale for large strings
Insufficiency of Previous Approaches: Symbolic Execution & Integer MC

    const char* mystrstr(const char* str, const char* search) {
        if (!str || !search) {
            return 0;
        }
        while (*str != '\0') {
            int len = 0;                  /* characters matched so far */
            const char* sub = search;
            while (*sub != '\0') {
                if (*sub == *str) {
                    sub++; str++; len++;
                    if (*sub == '\0') {
                        return str - len; /* start of the match */
                    }
                } else {
                    str -= len;           /* rewind the partial match */
                    break;
                }
            }
            str++;
        }
        return 0;
    }

Symbolic execution → Symbolic Path Constraint → Integer Model Counter → Model Count
5100 Paths
String Model Counting: Representing Set Cardinalities with GFs
Example: Set of all strings over alphabet {a, b}:
S0 = {Φ} (the empty string), |S0| = 1
S1 = {a, b}, |S1| = 2
S2 = {aa, ab, ba, bb}, |S2| = 4
S3 = {aaa, aab, aba, abb, …}, |S3| = 8
. . .
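The cardinalities in this example can be checked by direct enumeration. A small sketch (the helper name `strings_of_length` is made up for illustration):

```python
from itertools import product


def strings_of_length(alphabet: str, n: int) -> list[str]:
    """All strings of exactly n characters over the given alphabet."""
    return ["".join(chars) for chars in product(alphabet, repeat=n)]


# |S0|, |S1|, |S2|, |S3| for the alphabet {a, b}.
sizes = [len(strings_of_length("ab", n)) for n in range(4)]
print(sizes)
print(strings_of_length("ab", 2))
```

The sizes double with each extra character, which is the sequence the generating function on the next slide encodes.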
String Model Counting: Representing String Sets with GFs
Generating Function (or GF)
• Can be viewed as a series: 1, 2, 4, 8, …
• Represent as polynomial: $1 + 2z + 4z^2 + 8z^3 + \dots$
• Has a closed form algebraic expression: $\frac{1}{1 - 2z}$
• Can represent infinite sets!
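For the all-strings example over {a, b}, the counts 1, 2, 4, 8, … correspond to the closed form 1/(1 - 2z). The sketch below recovers coefficients from any rational closed form P(z)/Q(z) using the standard recurrence that follows from G(z)·Q(z) = P(z). The variable name z and the helper `series_coefficients` are illustrative assumptions; the slides delegate this algebra to Mathematica.

```python
from fractions import Fraction


def series_coefficients(numerator, denominator, terms):
    """First `terms` power-series coefficients of numerator(z) / denominator(z).

    Polynomials are coefficient lists, lowest degree first.
    Requires denominator[0] != 0.
    """
    q0 = Fraction(denominator[0])
    coeffs = []
    for n in range(terms):
        # Coefficient of z^n on the right-hand side P(z).
        acc = Fraction(numerator[n]) if n < len(numerator) else Fraction(0)
        # Subtract the contributions of already-known coefficients.
        for k in range(1, min(n, len(denominator) - 1) + 1):
            acc -= denominator[k] * coeffs[n - k]
        coeffs.append(acc / q0)
    return coeffs


# G(z) = 1 / (1 - 2z): numerator [1], denominator [1, -2].
print(series_coefficients([1], [1, -2], 4))
```

The coefficient at index n is the number of strings of length n, so no string is ever enumerated.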
Recovering Coefficients from GFs
• How many strings of length 3?
• Coefficient of $z^3$ in the GF $1 + 2z + 4z^2 + 8z^3 + \dots$, which is 8
Modeling String Ops Over GFs: Concatenation
{+, -} concatenated with {a, b, aa, ab, ba, bb, …} = {+a, -a, +b, -b, +aa, -aa, +ab, -ab, +ba, -ba, …}
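A standard fact from analytic combinatorics is that when every concatenated string splits back into its two parts in only one way, the GF of the concatenation is the product of the two GFs. The sketch below checks this on the sign-prefix example, truncating the word set at length 2; the helper name and the truncation are illustrative assumptions.

```python
from itertools import product


def multiply_counts(left, right):
    """Multiply two GFs given as count lists (index = string length)."""
    result = [0] * (len(left) + len(right) - 1)
    for i, a in enumerate(left):
        for j, b in enumerate(right):
            result[i + j] += a * b
    return result


signs = [0, 2]     # {+, -}: two strings of length 1
words = [0, 2, 4]  # {a, b, aa, ab, ba, bb}: lengths 1 and 2 only
via_product = multiply_counts(signs, words)

# Cross-check by building the concatenated set explicitly.
sign_set = ["+", "-"]
word_set = ["".join(p) for n in (1, 2) for p in product("ab", repeat=n)]
concatenated = {s + w for s in sign_set for w in word_set}
by_length = [sum(1 for x in concatenated if len(x) == n) for n in range(4)]

print(via_product, by_length)
```

Both lists agree: 4 strings of length 2 (+a, -a, +b, -b) and 8 of length 3.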
Modeling String Ops Over GFs: Regular Expression Match
Preserving Precision: contains Operation
aba babbbab aba bbb
Modeling contains as regex is not always precise
Exact!
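As a reference point for what an exact count of `contains` looks like, the sketch below counts strings of a given length that contain a fixed substring, using a prefix-matching automaton and dynamic programming, and compares the result with brute force. This is a textbook technique chosen for illustration; it is not claimed to be how SMC computes its exact answer.

```python
from itertools import product


def count_containing(pattern: str, alphabet: str, length: int) -> int:
    """Exact number of strings of `length` over `alphabet` containing `pattern`."""
    m = len(pattern)

    def next_state(state: int, ch: str) -> int:
        # Longest prefix of `pattern` that is a suffix of what was matched plus ch.
        candidate = pattern[:state] + ch
        for k in range(min(m, len(candidate)), -1, -1):
            if candidate.endswith(pattern[:k]):
                return k
        return 0

    # ways[s] = number of prefixes read so far that leave the automaton in state s.
    ways = [0] * (m + 1)
    ways[0] = 1
    for _ in range(length):
        new_ways = [0] * (m + 1)
        for state, w in enumerate(ways):
            if w == 0:
                continue
            for ch in alphabet:
                # State m means the pattern was already seen; it is absorbing.
                target = m if state == m else next_state(state, ch)
                new_ways[target] += w
        ways = new_ways
    return ways[m]


def brute_force_containing(pattern: str, alphabet: str, length: int) -> int:
    return sum(
        1 for p in product(alphabet, repeat=length) if pattern in "".join(p)
    )


print(count_containing("aba", "ab", 4), brute_force_containing("aba", "ab", 4))
```

The dynamic program touches (pattern length + 1) states per step instead of every string, so it stays exact while avoiding enumeration.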
SMC: Full Language
RegExp := character | ε
| RegExp RegExp | RegExp|RegExp | RegExp*
Constraint := Var = Var | Var IN RegExp
| Var = Var • Var | Var = ConsString
| contains(Var, ConsString) | strstr(Var, ConsString)
| length(Var) ○ Num
○ := < | ≤ | > | ≥ | ≠
Formula := Formula | Formula OR Formula | Formula AND Formula | NOT Formula | Constraint
Core Constraints
Full Constraints - String level
Combining multiple Constraints
SMC Design
[Figure: SMC pipeline. Boxes: Constraints, Constraint Formula, CNF Translation, Generating Function Translation, Algebraic Computation, Generating Function, Evaluate at n, Model Count, Mathematica]
Case Studies & Applications
Experiments
• Real-world Password strength meters
  – 3 websites: Drupal, eBay and Microsoft meters [NDSS’ 14]
• Quantify Leakage in C Programs
  – 4 UNIX Utilities: obscure, grep, wc, csplit [CCS’ 13]
  – 2 Web Servers: Ghttpd, Null HTTPd [ASPLOS’ 08]
• String constraints from JavaScript applications
  – 18,901 path constraints from 18 real world apps [SSP’ 10]
    • 13 iGoogle gadgets and 5 AJAX applications
Application I: Password Strength Meters
Guessing Attacks w/o Dictionaries
• eBay
  Length   Invalid   Weak   Medium   Strong
  L = 5
  L = 10
• Microsoft
  Length   Weak   Medium
  L = 5
  L = 10
• Drupal
  Length   Weak   Fair   Good   Strong
  L = 5
  L = 10
Application I: Password Strength Meters
Guessing Attacks with Dictionaries
• How many of these also exist in the JtR (John the Ripper) database?
  – Password database dictionary of 3,106 words
STRONG PASSWORDS: the smaller, the better
Application I: Password Strength Meter
Website     Strength   Total (L = 1..10)
eBay        Invalid    2678
            Weak       413
            Medium     3
            Strong     0
Microsoft   Weak       2640
            Medium     461
            Good       0
            Best       0
Drupal      Weak       936
            Fair       1974
            Good       369
            Strong     3
Drupal will be more vulnerable to JtR dictionary attack
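The dictionary experiment boils down to rating every dictionary word with a meter and tallying the ratings. The sketch below shows that tally with a made-up meter and a five-word sample list; neither the `rate` function nor `SAMPLE_DICTIONARY` reflects any real website's policy or the JtR word list.

```python
import re

CHARACTER_CLASSES = (r"[a-z]", r"[A-Z]", r"[0-9]", r"[^a-zA-Z0-9]")


def rate(password: str) -> str:
    """Hypothetical toy meter: rating depends on length and character classes used."""
    classes = sum(bool(re.search(p, password)) for p in CHARACTER_CLASSES)
    if len(password) >= 8 and classes >= 3:
        return "Strong"
    if len(password) >= 6 and classes >= 2:
        return "Medium"
    return "Weak"


def count_by_rating(dictionary):
    """Tally how many dictionary words fall under each rating."""
    counts = {"Weak": 0, "Medium": 0, "Strong": 0}
    for word in dictionary:
        counts[rate(word)] += 1
    return counts


# Illustrative sample only, not the 3,106-word dictionary from the slides.
SAMPLE_DICTIONARY = ["password", "abc123", "Passw0rd!", "letmein", "Qwerty12"]
print(count_by_rating(SAMPLE_DICTIONARY))
```

A meter that rates fewer dictionary words as Strong is harder to attack with that dictionary, which is the comparison the table makes.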
Application II: Quantifying Side Channel Leakage
Hypervisor & Hardware
Side Channel Attacks
Malicious VM
Web ServerVM
Sensitive Execution
    input = readline(file);
    if (strstr(input, "\n"))
        lines++;
    if (strstr(input, "\r") || strstr(input, "\f")) {
        if (linepos > linelength)
            linelength = linepos;
        linepos = 0;
        words++;
    }
    if (strstr(input, "\t")) {
        linepos += 8 - (linepos % 8);
        words++;
    }
    write_counts(lines, words);

ch1 = \n
ch2 = \r|\f
INPUT ∈ (ch1|ch2)*
[Figure: SMC answers "How many?" for two constraints: INPUT ∈ (ch1|ch2)* with ch1 = \n and ch2 = \r|\f, and the unconstrained INPUT ∈ .*; the two counts are labeled 2^X and 2^Y, and the leakage is X - Y]
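The leakage computation can be sketched with brute-force counts at a small fixed length. The assumption here is that X is the log2 count of all inputs and Y is the log2 count of inputs consistent with the observed path, so X - Y is the leakage in bits; the four-symbol alphabet (with `x` standing in for every other character) is a toy choice.

```python
import math
import re
from itertools import product


def count_matching(pattern: str, alphabet: str, length: int) -> int:
    """Count strings of exactly `length` over `alphabet` that fully match `pattern`."""
    regex = re.compile(pattern, flags=re.DOTALL)  # DOTALL so '.' also matches '\n'
    return sum(
        1 for p in product(alphabet, repeat=length) if regex.fullmatch("".join(p))
    )


ALPHABET = "\n\r\fx"  # toy alphabet; 'x' represents any other character
LENGTH = 3

total = count_matching(r".*", ALPHABET, LENGTH)              # INPUT in .*
feasible = count_matching(r"(\n|\r|\f)*", ALPHABET, LENGTH)  # INPUT in (ch1|ch2)*

x_bits = math.log2(total)
y_bits = math.log2(feasible)
leakage_bits = x_bits - y_bits
print(total, feasible, round(leakage_bits, 3))
```

The smaller the feasible set relative to the full input space, the more bits the observation reveals about the input.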
[Bar chart: Path Leakage in bits for UNIX utilities (y-axis 0 to 800): grep 355.3, wc 750.1, csplit 179.4, obscure 0.133]
Tool Evaluation
Result I: Speed
• SMC vs. FuzzBALL [PLAS’ 09]
Program                          len   SMC       FuzzBALL
Obscure                          6     0.5 sec   2 Hrs
strstr(input, "abc")!=NULL       5     0.4 sec   2 Hrs
strstr(input, "abc")!=NULL       4     0.5 sec   150 sec
match regex(input, "(a|b)*")     4     0.4 sec   2 Hrs
• SMC vs. QUAIL [CAV’ 13]
Program                          len   SMC       QUAIL
strstr(input, "ab")=2            5     0.2 sec   6.1 sec
strstr(input, "ab")=2            7     0.2 sec   648 sec
input.contains("ab")             5     0.3 sec   5.1 sec
input.contains("ab")             7     0.3 sec   606 sec
Result II: Expressiveness (JavaScript Applications & UNIX utilities)
[Bar chart: Frequency of Constraints by type (regexes, contains, length, concatenation, comparison with const string) for Kaluza's benchmarks (18,901 JavaScript Benchmarks) and the UNIX Case studies]
Result III: Precision
• SMC vs. FuzzBALL [PLAS’ 09]
Program                          len   SMC Bits   SMC ε-precise   FuzzBALL
Obscure                          6     0          0.06            > 2 Hrs
strstr(input, "abc")!=NULL       5     22.4       0               > 2 Hrs
strstr(input, "abc")!=NULL       4     23.0       0               13.5
match regex(input, "(a|b)*")     4     32         0.13            > 2 Hrs
• SMC vs. Castro et al. [ASPLOS’ 08]
Program                          len   SMC Bits   SMC ε-precise   Castro et al.
Ghttpd                           620   80.2       0.003           248
Null HTTPd                       500   248.0      0.002           500
Conclusion
• SMC: String Model Counter
• Handles Unbounded Strings
• Practical Applications
  – Quantifying Password Strength Meters
  – Quantifying Information Leaks via Side Channels
Fast & Scalable · Expressive · Better Precision
Related Work
• [Cambridge University Press’ 09] R. Sedgewick and P. Flajolet. Analytic Combinatorics.
• [JAIR’ 99] E. Birnbaum and E. L. Lozinskii. The Good Old Davis-Putnam Procedure Helps Counting Models.
• [FOCS’ 93] A. I. Barvinok. A Polynomial Time Algorithm for Counting Integral Points in Polyhedra When the Dimension Is Fixed.
• [Algorithmica’ 07] S. Verdoolaege, R. Seghir, K. Beyls, V. Loechner, and M. Bruynooghe. Counting Integer Points in Parametric Polytopes Using Barvinok's Rational Functions.
• [SSP’ 09] M. Backes, B. Kopf, and A. Rybalchenko. Automatic Discovery and Quantification of Information Leaks.
• [PLAS’ 09] J. Newsome, S. McCamant, and D. Song. Measuring Channel Capacity to Distinguish Undue Influence.
• [CAV’ 13] F. Biondi, A. Legay, L.M. Traonouez, and A. Wasowski. QUAIL: A Quantitative Security Analyzer for Imperative Code.
• [SEN’ 12] Q.S. Phan, P. Malacaria, O. Tkachuk, and C. S. Pasareanu. Symbolic Quantitative Information Flow.
• LattE Tool. http://www.math.ucdavis.edu/~latte/
• RelSat Tool. http://code.google.com/p/relsat/
Contact
• Loi Luu, Shweta Shinde: {loiluu, shweta24}@comp.nus.edu.sg
• Our SMC Tool is available at:
  – https://github.com/loiluu/smc
Thank You!
References
• [FoSSaCS’ 09] G. Smith. On the Foundations of Quantitative Information Flow.
• [CCS’ 13] S. Tople, S. Shinde, Z. Chen, and P. Saxena. AutoCrypt: Enabling Homomorphic Computation on Servers to Protect Sensitive Web Content.
• [ASPLOS’ 08] M. Castro, M. Costa, and J.-P. Martin. Better Bug Reporting with Better Privacy.
• [SSP’ 10] P. Saxena, D. Akhawe, S. Hanna, F. Mao, S. McCamant, and D. Song. A Symbolic Execution Framework for JavaScript.
• [NDSS’ 14] X. de Carne de Carnavalet and M. Mannan. From Very Weak to Very Strong: Analyzing Password-Strength Meters.
• John the Ripper password cracker. http://www.openwall.com/john
• Wolfram Mathematica. http://www.wolfram.com/mathematica
Backup Slides
Robustness
Evaluation Parameters                                 Big Test Cases (variable > 4)   Small Test Cases (variable < 4)
Number of tests                                       1342                            17559
Total running time                                    1h 58 mins                      1h 9 mins
Average no. of constraints                            187                             2.05
Average running time                                  5.29 seconds                    0.24 seconds
Test cases for which SMC reports exact model count    21%                             94%
• Real world JavaScript Benchmarks [SSP’10]