Open Source Model Checking Radu Grosu SUNY at Stony Brook Joint work with X. Huang, S. Jain and S. A. Smolka


Page 1: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Open Source Model Checking

Radu Grosu SUNY at Stony Brook

Joint work with

X. Huang, S. Jain and S. A. Smolka

Page 2: Open Source Model Checking Radu Grosu SUNY at Stony Brook

GCC Compiler

• Early stages: A modest C compiler.

- Translation: source code translated directly to RTL.

- Optimization: at low RTL level.

- High level information lost: calls, structures, fields, etc.

• Nowadays: A full-blown, multi-language compiler generating code for more than 30 architectures.

- Input: C, C++, Objective-C, Fortran, Java and Ada.

- Tree-SSA: added GENERIC, GIMPLE and SSA ILs.

- Optimization: at GENERIC, GIMPLE, SSA and RTL levels.

- Verification: Tree-SSA API suitable for verification, too.

Page 3: Open Source Model Checking Radu Grosu SUNY at Stony Brook

GCC Compilation Process

[Figure: GCC compilation pipeline. C File, C++ File, Java File → C Parser, C++ Parser, Java Parser → Parse Tree → Genericize → GEN AST → Gimplify → GPL AST → Build CFG → SSA/GPL CFG → Rest Comp → RTL Code → Code Gen → Obj Code]

Page 4: Open Source Model Checking Radu Grosu SUNY at Stony Brook

C Program and its GIMPLE IL

int main() {
    int a,b,c;

    a = 5;
    b = a + 10;
    c = a + foo(a,b);
    if (a > c)
        c = b++/a + b*a;
    bar(a,b,c);
}

int main() {
    int a,b,c;
    int T1,T2,T3,T4;

    a = 5;
    b = a + 10;
    T1 = foo(a,b);
    T2 = a + T1;
    if (a <= T2) goto fi;
    T3 = b / a;
    T4 = b * a;
    c = T3 + T4;
    b = b + 1;
fi:
    bar(a,b,c);
}

Page 5: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Associated GIMPLE CFG

[Figure: GIMPLE CFG of main, with expression trees for the statements of block A.]

FUNCTION DECL: int a, b, c, T1, T2, T3, T4

Entry → A

A: a = 5; b = a + 10; T1 = foo(a,b); T2 = a + T1; if (a > T2) goto B;

true → B, false → C

B: T3 = b / a; T4 = b * a; c = T3 + T4; b = b + 1; → C

C: bar(a,b,c); return; → Exit

Page 6: Open Source Model Checking Radu Grosu SUNY at Stony Brook

GCC Model Checking (GMC)

• GMC: a suite of analysis and verification tools we are developing for the Tree-SSA level of GCC. Currently:

– Intra-procedural slicer: inter-procedural slicing is in progress.

– Symbolic execution engine: for Boolean C programs.

– Interpreter: traverses the CFG using Tree-SSA iterators.

– Monte Carlo MC (GMC2): a one-sided-error (OSE) randomized algorithm for LTL MC.

• GMC2: a newly developed technique that uses the theory of geometric random variables, statistical hypothesis testing and random sampling of lassos.

Page 7: Open Source Model Checking Radu Grosu SUNY at Stony Brook

LTL MC: Finding Accepting Lassos

[Figure: computation tree (CT) of lassos for the LTL property, with depth bounded by the recurrence diameter.]

• Explore all lassos in the CT.

– DDFS, SCC: time efficient.

– DFS: memory efficient.

Page 8: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Randomized Algorithms

• The next step the algorithm takes may depend on a random choice (coin flip).

– Benefits: simplicity, efficiency, and symmetry breaking.

• Monte Carlo: may produce an incorrect result, but with bounded error probability.

– Example: Election result prediction

• Las Vegas: always gives correct result but running time is a random variable.

– Example: Randomized Quick Sort
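As a small illustration of the Las Vegas class, here is a minimal sketch of randomized quicksort in Python. The function name and the list-based partitioning are choices made for this example and are not taken from the slides. The output is always correctly sorted; only the amount of work depends on the random pivots.

```python
import random

def randomized_quicksort(items, rng=random):
    """Las Vegas sketch: the result is always sorted, the running time is random."""
    if len(items) <= 1:
        return list(items)
    pivot = rng.choice(items)                    # the coin flip: pick a random pivot
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)
```

A Monte Carlo algorithm makes the opposite trade: a fixed amount of work, with a bounded probability of a wrong answer.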

Page 9: Open Source Model Checking Radu Grosu SUNY at Stony Brook

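Monte Carlo Approach

[Figure: computation tree (CT) of lassos for the LTL property, with depth bounded by the recurrence diameter; at each state, flip a k-sided coin to pick the successor.]

• Explore N(ε, δ) independent lassos in the CT.

• ε is the error margin and δ the confidence ratio.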

Page 10: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Bernoulli Random Variable Z (coin flip)

[Figure: example automaton with states 1 to 4 and its tree of lassos; the branch probabilities shown are ½, ¼ and ⅛.]

Probability mass function:

$p(1) = P[Z=1] = p_Z = 1/8$

$p(0) = P[Z=0] = q_Z = 7/8$

Page 11: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Geometric Random Variable

• Value of geometric RV X with parameter $p_Z$:

– Number of independent lassos until success.

• Probability mass function:

– $p(N) = P[X = N] = q_Z^{N-1}\, p_Z$

• Cumulative distribution function:

– $F(N) = P[X \le N] = \sum_{i \le N} p(i) = 1 - q_Z^{N} = 1 - (1 - p_Z)^{N}$
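The closed form of the cumulative distribution function can be checked numerically against the partial sum of the probability mass function. The following is a minimal sketch; the function names are hypothetical, and the value 1/8 is simply reused from the Bernoulli example.

```python
def geometric_pmf(n: int, p: float) -> float:
    """P[X = n] = (1 - p)**(n - 1) * p, for n >= 1."""
    return (1.0 - p) ** (n - 1) * p

def geometric_cdf(n: int, p: float) -> float:
    """P[X <= n] = 1 - (1 - p)**n."""
    return 1.0 - (1.0 - p) ** n

p_z = 1 / 8  # assumed example value
# The partial sum of the pmf should agree with the closed-form CDF.
partial_sum = sum(geometric_pmf(i, p_z) for i in range(1, 11))
assert abs(partial_sum - geometric_cdf(10, p_z)) < 1e-12
```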

Page 12: Open Source Model Checking Radu Grosu SUNY at Stony Brook

How Many Lassos?

• Requiring $1 - (1 - p_Z)^{N} = 1 - \delta$ yields:

$N = \ln(\delta) / \ln(1 - p_Z)$

• Lower bound on number of trials N needed to achieve success with confidence ratio δ.

Page 13: Open Source Model Checking Radu Grosu SUNY at Stony Brook

What If pz Unknown?

• Requiring $p_Z \ge \varepsilon$ yields:

$M = \ln(\delta) / \ln(1 - \varepsilon) \;\ge\; N = \ln(\delta) / \ln(1 - p_Z)$

and therefore $P[X \le M] \ge 1 - \delta$.

• Lower bound on the number of trials M needed to achieve success with confidence ratio δ and error margin ε.
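The bound can be turned into a sample-size calculation. The sketch below rounds up to the next integer, since a number of trials must be whole; that rounding and the function names are assumptions of this example, not something stated on the slide.

```python
import math

def num_samples(eps: float, delta: float) -> int:
    """Smallest integer M with (1 - eps)**M <= delta."""
    if not (0.0 < eps < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("eps and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(delta) / math.log(1.0 - eps))

def success_probability(p_z: float, m: int) -> float:
    """P[X <= m] for a geometric random variable with parameter p_z."""
    return 1.0 - (1.0 - p_z) ** m

m = num_samples(eps=0.01, delta=0.01)
# Any p_z at or above the error margin reaches the required confidence within m trials.
assert success_probability(0.01, m) >= 1.0 - 0.01
assert success_probability(0.05, m) >= 1.0 - 0.01
```

Note that M depends only on ε and δ, not on the size of the system being checked.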

Page 14: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Statistical Hypothesis Testing

• Null hypothesis $H_0$: $p_Z \ge \varepsilon$

• Alternative hypothesis $H_1$: $p_Z < \varepsilon$

• If no success after M trials, then reject $H_0$

• Type I error: $\alpha = P[X > M \mid H_0] < \delta$

• Since: $P[X \le M \mid H_0] \ge 1 - \delta$

Page 15: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Monte Carlo Model Checking (MC2)

input: B = (Σ, Q, Q0, δ, F), ε, δ

N = ln(δ) / ln(1 − ε)

for (i = 1; i ≤ N; i++)

    if (RL(B) == 1) return (1, error-trace);

return (0, “reject H0 with α = Pr[ X > N | H0 ] < δ”);

where RL(B) performs a uniform random walk through B to obtain a random lasso.
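The loop above can be sketched in Python as follows. To keep the example self-contained, the random-lasso routine is passed in as a callable named sample_lasso (a hypothetical name), N is rounded up to an integer, and the number of samples drawn is returned in place of the error trace. The stand-in sampler at the end is purely illustrative.

```python
import math
import random
from typing import Callable, Tuple

def mc2(sample_lasso: Callable[[], bool], eps: float, delta: float) -> Tuple[int, int]:
    """Return (1, i) if the i-th sampled lasso is accepting, otherwise (0, N)."""
    n = math.ceil(math.log(delta) / math.log(1.0 - eps))
    for i in range(1, n + 1):
        if sample_lasso():
            return 1, i      # accepting lasso found: a counterexample
    return 0, n              # no success in N trials: reject H0

# Stand-in sampler (an assumption): each lasso is accepting with probability 1/8.
rng = random.Random(0)
result, samples = mc2(lambda: rng.random() < 0.125, eps=0.01, delta=0.01)
```

The procedure stops at the first accepting lasso, so buggy systems are typically decided after far fewer than N samples.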

Page 16: Open Source Model Checking Radu Grosu SUNY at Stony Brook

GCC MC2 (GMC2)

• Input: a set of CFGs.

– Main function: A specifically designated CFG.

• Random walks in the Büchi automaton: generated on-the-fly.

– Initial state: of the main routine + bookkeeping information.

– Next state: choose process + call interpreter on its CFG.

– Processes: created by using the fork primitive.

– Optimization: interpreter returns only upon context switch.

• Lassos: detected by using a hierarchical hash table.

– Local variables: removed upon return from a procedure.

Page 17: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Program State

– Shared Variables Valuation (channels & semaphores)

– List of Process States: p1, p2, p3, …

Each process state:

– Control State: CFG Name, Statement #

– Data State

Page 18: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Program State

– Shared Variables Valuation (channels & semaphores)

– List of Process States: p1, p2, p3, …

Each process state:

– Control State

– Data State: Heap, Global Variables Valuation

– Frame Stack: f1, f2, …

Each frame:

– Return Control State

– Local Variables Valuation
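One way to picture this layout is as a set of immutable records, so that a whole program state can be used as a hash-table key when detecting lassos. The sketch below is an assumption throughout: the class and field names are hypothetical, the nesting is one reading of the diagram, and Python's built-in hashing stands in for the hierarchical hash table GMC2 uses.

```python
from dataclasses import dataclass
from typing import Tuple

Valuation = Tuple[Tuple[str, int], ...]   # sorted (variable, value) pairs

@dataclass(frozen=True)
class ControlState:
    cfg_name: str
    statement: int

@dataclass(frozen=True)
class Frame:
    return_control: ControlState
    local_vars: Valuation

@dataclass(frozen=True)
class ProcessState:
    control: ControlState
    heap: Valuation
    global_vars: Valuation
    frame_stack: Tuple[Frame, ...]

@dataclass(frozen=True)
class ProgramState:
    shared_vars: Valuation
    processes: Tuple[ProcessState, ...]

# Two structurally equal states hash alike, so a revisit is detected by lookup.
main_entry = ControlState("main", 0)
p1 = ProcessState(main_entry, heap=(), global_vars=(("x", 0),), frame_stack=())
state_a = ProgramState(shared_vars=(("sem", 1),), processes=(p1,))
state_b = ProgramState(shared_vars=(("sem", 1),), processes=(p1,))
seen = {state_a: 1}
assert state_b in seen
```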

Page 19: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Interpreter

• Interprets GIMPLE statements: according to their semantics. Interesting:

– Inter-procedural: call(), return(). Manipulate the frame stack.

• Catches and interprets: function calls to various modeling and concurrency primitives:

– Modeling: toss(), assert(). Nondeterminism and checks.

– Processes: fork(), … Manipulate the process list.

– Communication: send(), recv(). Manipulate shared vars. May involve a context switch.

Page 20: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Results: TCAS

GMC2

property                      rule  bugs  time  sampl
Safe Advisory Selection       1     no    0.23  1278
                              2     yes   0.03  147
Best Advisory Selection       1     no    0.23  1278
                              2     yes   0.04  206
Avoid unnecessary Crossing    1     yes   0.01  36
                              2     yes   0.03  180
No. Crossing Adv. Selection   1     yes   0.01  27
                              2     yes   0.01  8
Optimal Advisory Selection    1     no    0.23  1278
                              2     yes   0.06  217

Page 21: Open Source Model Checking Radu Grosu SUNY at Stony Brook

DPh: Symmetric Fair Version (Deadlock freedom)

      GMC2                       Verisoft
ph    time     sampl  ce.len     time     states  trans
4     0:00.07  2      12         0:00.61  16      37
6     0:00.11  4      12         0:16.60  773     1171
8     0:00.78  11     20         2:57.29  5431    8449
10    0:02.17  31     24         10:41    17908   31433
12    0:04.82  24     27         >2hr     N/A     N/A
14    0:06.22  22     44         >2hr     N/A     N/A
16    0:11.56  14     32         >2hr     N/A     N/A

Page 22: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Needham-Schroeder Protocol

GMC2                    Verisoft          Genetic
time     sampl          time   states     time     errors
6h 37'   10,682,639     >8h    N/A        2h 33'   3

• Quite sophisticated C implementation.

• However, of a sequential nature:

- Essentially executes only one round of a reactive system.

Page 23: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Related Work

• Software model checkers for concurrent C/C++:

– VeriSoft, Spin, Blast (Slam), Magic, C-Wolf. Bogor?

• Cooperative Bug Isolation [Liblit, Naik & Zheng]:

– Compile-time instrumentation. Distribute binaries/collect bugs.

– Statistical analysis to isolate erroneous code segments.

• Random interpretation [Gulwani & Necula]:

– Execute random paths and merge with random linear operators.

• Monte Carlo and abstract interpretation [Monniaux]:

– Analyze programs with probabilistic and nondeterministic input.

Page 24: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Conclusions

• Presented GMC2: a software MC for GCC based on Monte Carlo MC:

– At Tree-SSA level: applicable to C, C++, Ada, Java, etc.

– Open source: freely available for usage/critique/extension.

• Ongoing and Future Work: Create a software MC branch of GCC, which also includes:

– Automated abstraction/refinement/interpolation techniques.

– Currently we manually apply a form of bounded-range abstraction (e.g. in TCAS).

Page 25: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Talk Outline

1. Model Checking

2. Randomized Algorithms

3. LTL Model Checking

4. Probability Theory Primer

5. Monte Carlo Model Checking

6. Implementation & Results

7. Conclusions & Open Problem

Page 26: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Linear Temporal Logic

• LTL formula: made up inductively of

• atomic propositions p, boolean connectives ¬, ∨, ∧,

• temporal modalities X (neXt) and U (Until).

• Safety: “nothing bad ever happens”

E.g. G(¬(pc1=cs ∧ pc2=cs)), where G is a derived modality (Globally).

• Liveness: “something good eventually happens”

E.g. G(req → F serviced), where F is a derived modality (Finally).

Page 27: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Model Checking

• S is a nondeterministic/concurrent system.

• φ is a temporal logic formula.

– in our case Linear Temporal Logic (LTL).

• Basic idea: intelligently explore S’s state space in an attempt to establish S |= φ.

Page 28: Open Source Model Checking Radu Grosu SUNY at Stony Brook

LTL Model Checking

• Every LTL formula φ can be translated to a Büchi automaton $B_\varphi$ such that $L(\varphi) = L(B_\varphi)$

• Automata-theoretic approach:

$S \models \varphi$ iff $L(B_S) \subseteq L(B_\varphi)$ iff $L(B_S \times B_{\neg\varphi}) = \emptyset$

• Checking non-emptiness is equivalent to finding a reachable accepting cycle (lasso).

Page 29: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Emptiness Checking

• Checking non-emptiness is equivalent to finding an accepting cycle reachable from initial state (lasso).

• Double Depth-First Search (DDFS) algorithm can be used to search for such cycles, and this can be done on-the-fly!

[Figure: a lasso through states s1, s2, s3, …, sk-2, sk-1, sk, sk+1, sk+2, sk+3, …, sn, with the two searches labeled DFS1 and DFS2.]
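The double search can be sketched as the classic nested depth-first search: an outer DFS visits states, and on backtracking from an accepting state an inner DFS looks for a path back to that state. The version below assumes an explicit successor map and uses hypothetical names. It is recursive for brevity, so it only suits small graphs, and it returns a yes/no answer instead of the lasso itself.

```python
def has_accepting_lasso(init, succ, accepting):
    """Nested DFS sketch: True iff an accepting cycle is reachable from init."""
    visited1, visited2 = set(), set()

    def dfs2(s, seed):
        # Inner search: look for a path leading back to the accepting seed.
        visited2.add(s)
        for t in succ.get(s, []):
            if t == seed:
                return True
            if t not in visited2 and dfs2(t, seed):
                return True
        return False

    def dfs1(s):
        visited1.add(s)
        for t in succ.get(s, []):
            if t not in visited1 and dfs1(t):
                return True
        # Post-order: start the inner search once all successors are explored.
        return s in accepting and dfs2(s, s)

    return any(s not in visited1 and dfs1(s) for s in init)
```

For example, with successors {1: [2], 2: [3], 3: [2]} and initial state 1, the call returns True when state 3 is accepting and False when only state 1 is.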

Page 30: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Randomized Algorithms

• Huge impact on CS: (distributed) algorithms, complexity theory, cryptography, etc.

• The next step the algorithm takes may depend on a random choice (coin flip).

• Benefits of randomization include simplicity, efficiency, and symmetry breaking.

Page 31: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Lassos Probability Space

• Sample Space: lassos in $B_S \times B_{\neg\varphi}$

• Bernoulli random variable Z:

– Outcome = 1 if the randomly chosen lasso is accepting

– Outcome = 0 otherwise

• $p_Z = \sum_i p_i Z_i$ (expectation of an accepting lasso)

where $p_i$ is the probability of lasso i under a uniform random walk
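For a small automaton the expectation above can be computed exactly by enumerating every lasso of a uniform random walk. The sketch below does this under assumptions that are not from the slides: the automaton is an explicit successor map, the function name is hypothetical, and the three-state example is made up for illustration.

```python
from fractions import Fraction

def accepting_lasso_probability(init, succ, accepting):
    """Sum the probabilities of all lassos whose cycle contains an accepting state."""
    def walk(path, prob):
        total = Fraction(0)
        nexts = succ[path[-1]]
        for t in nexts:
            p = prob / len(nexts)            # uniform choice among successors
            if t in path:                    # the walk closes a cycle: one lasso
                cycle = path[path.index(t):]
                if any(s in accepting for s in cycle):
                    total += p
            else:
                total += walk(path + [t], p)
        return total

    return sum(walk([s], Fraction(1, len(init))) for s in init)

# Made-up example: 1 -> 2, 2 -> 2 or 3, 3 -> 1, with state 3 accepting.
example = {1: [2], 2: [2, 3], 3: [1]}
p_z = accepting_lasso_probability([1], example, {3})
```

Here the lasso that closes on the self-loop at state 2 is not accepting, while the one that returns to state 1 through state 3 is, and each has probability one half.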

Page 32: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Bernoulli Random Variable (coin flip)

• Value of Bernoulli RV Z:

Z = 1 (success) & Z = 0 (failure)

• Probability mass function:

p(1) = Pr[Z=1] = pz

p(0) = Pr[Z=0] = 1- pz = qz

• Expectation: E[Z] = pz

Page 33: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Statistical Hypothesis Testing

• Example: Given a fair and a biased coin.

– Null hypothesis H0 - fair coin selected.

– Alternative hypothesis H1 - biased coin selected.

• Hypothesis testing: Perform N trials.

– If number of heads is LOW, reject H0 .

– Else fail to reject H0 .

Page 34: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Statistical Hypothesis Testing

                     H0 is True                      H0 is False
reject H0            Type I error w/prob. α          Correct to reject H0
fail to reject H0    Correct to fail to reject H0    Type II error w/prob. β

Page 35: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Random Lasso (RL) Algorithm

input: Büchi automaton B;

output: samples a lasso; returns 1 if it is accepting, 0 if not;

(1) s := rInit(B); i := 1; f := 0;

(2) while (s ∉ HashTbl) {

(3)     HashTbl(s) := i;

(4)     if (acc(s,B)) f := i;

(5)     s := rNext(s,B); i := i + 1; }

(6) if (HashTbl(s) ≤ f) return 1 else return 0;
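The random walk can be sketched in Python as follows, assuming the automaton is given explicitly as a list of initial states, a successor map and a set of accepting states, and that every state has at least one successor. The parameter names are hypothetical; rng.choice plays the role of rInit and rNext, and a dictionary plays the role of HashTbl.

```python
import random

def random_lasso(init, succ, accepting, rng):
    """Sample one lasso by a uniform random walk; return 1 if it is accepting, else 0."""
    s = rng.choice(init)              # uniform choice of an initial state
    i, f = 1, 0
    position = {}                     # state -> position on the walk
    while s not in position:
        position[s] = i
        if s in accepting:
            f = i                     # position of the latest accepting state
        s = rng.choice(succ[s])       # uniform choice of a successor
        i += 1
    # The cycle starts at position[s]; it is accepting iff it contains position f.
    return 1 if position[s] <= f else 0
```

The walk stops at the first repeated state, so its length and the memory it needs are bounded by the longest simple path through the automaton.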

Page 36: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Correctness of MC2

Theorem: Given a Büchi automaton B, error margin ε, and confidence ratio δ, if MC2 rejects H0, then its type I error has probability

α = P[ X > M | H0 ] < δ

Page 37: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Complexity of MC2

Theorem: Given a Büchi automaton B having diameter D, error margin ε, and confidence ratio δ, MC2 runs in time O(N∙D) and uses space O(D), where N = ln(δ) / ln(1 − ε).

Cf. DDFS, which runs in $O(2^{|S|+|\varphi|})$ time for $B = B_S \times B_{\neg\varphi}$.

Page 38: Open Source Model Checking Radu Grosu SUNY at Stony Brook

Alternative Sampling Strategies

[Figure: chain of states 0, 1, …, n-1, n.]

• Multilasso sampling: ignores backedges that do not lead to an accepting lasso.

$\Pr[L_n] = O(2^{-n})$

• Probabilistic systems: there is a natural way to assign a probability to a RL.

• Input partitioning: partition input into classes that trigger the same behavior (guards).