Open Source Model Checking
Radu Grosu SUNY at Stony Brook
Joint work with
X. Huang, S. Jain and S. A. Smolka
GCC Compiler
• Early stages: a modest C compiler.
- Translation: source code translated directly to RTL.
- Optimization: at low RTL level.
- High level information lost: calls, structures, fields, etc.
• Nowadays: a full-blown, multi-language compiler generating code for more than 30 architectures.
- Input: C, C++, Objective-C, Fortran, Java and Ada.
- Tree-SSA: added GENERIC, GIMPLE and SSA ILs.
- Optimization: at GENERIC, GIMPLE, SSA and RTL levels.
- Verification: Tree-SSA API suitable for verification, too.
GCC Compilation Process
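[Figure: GCC compilation pipeline. C, C++ and Java files pass through their language parsers (C Parser, C++ Parser, Java Parser) to a parse tree; Genericize produces the GENERIC AST; Gimplify produces the GIMPLE AST; Build CFG produces the SSA/GIMPLE CFG; the rest of the compilation produces RTL code; code generation produces object code.]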
C Program and its GIMPLE IL
int main() {
int a,b,c;
a = 5;
b = a + 10;
c = a + foo(a,b);
if (a > c)
c = b++/a + b*a;
bar(a,b,c); }
int main() {
  int a, b, c;
  int T1, T2, T3, T4;
  a = 5;
  b = a + 10;
  T1 = foo(a, b);
  T2 = a + T1;
  c = T2;
  if (a <= T2) goto fi;
  T3 = b / a;
  T4 = b * a;
  c = T3 + T4;
  b = b + 1;
fi:
  bar(a, b, c);
}
Associated GIMPLE CFG
[Figure: control-flow graph of main, with the GIMPLE statements of block A drawn as trees (node labels CE and CallE; operators =, +, >, if).]
FUNCTION DECL: int a, b, c, T1, T2, T3, T4
Entry → A
A: a = 5; b = a + 10; T1 = foo(a,b); T2 = a + T1; c = T2; if (a > T2) goto B; (true → B, false → C)
B: T3 = b / a; T4 = b * a; c = T3 + T4; b = b + 1; (→ C)
C: bar(a,b,c); return; (→ Exit)
GCC Model Checking (GMC)
• GMC: a suite of analysis and verification tools we are developing for the Tree-SSA level of GCC. Currently:
– Intra-procedural slicer: inter-procedural slicing is in progress.
– Symbolic execution engine: for Boolean C programs.
– Interpreter: traverses the CFG using Tree-SSA iterators.
– Monte Carlo MC (GMC2): a one-sided error (OSE) randomized algorithm for LTL MC.
• GMC2: a newly developed technique that uses the theory of geometric random variables, statistical hypothesis testing and random sampling of lassos.
LTL MC: Finding Accepting Lassos
[Figure: computation tree (CT) of lassos, annotated with the recurrence diameter.]
• Explore all lassos in the CT.
• DDFS, SCC: time efficient. DFS: memory efficient.
Randomized Algorithms
• The next step the algorithm takes may depend on a random choice (coin flip).
– Benefits: simplicity, efficiency, and symmetry breaking.
• Monte Carlo: may produce an incorrect result, but with bounded error probability.
– Example: election result prediction
• Las Vegas: always gives correct result but running time is a random variable.
– Example: Randomized Quick Sort
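As a minimal sketch of the Las Vegas example named above, here is randomized quicksort in Python. The function name and the `rng` parameter are illustrative choices, not taken from the talk. The output is always sorted; only the amount of work depends on the random pivots.

```python
import random

def randomized_quicksort(items, rng=random):
    """Las Vegas algorithm: the result is always correct (sorted);
    only the running time depends on the random pivot choices."""
    if len(items) <= 1:
        return list(items)
    pivot = rng.choice(items)                     # the "coin flip"
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return (randomized_quicksort(smaller, rng)
            + equal
            + randomized_quicksort(larger, rng))

print(randomized_quicksort([3, 1, 2, 3, 0]))
```

A Monte Carlo algorithm, by contrast, would bound the running time up front and accept a bounded probability of a wrong answer.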
Monte Carlo Approach
[Figure: computation tree (CT) of lassos, annotated with the recurrence diameter; successors are chosen by flipping a k-sided coin.]
• Explore N(ε, δ) independent lassos in the CT.
• ε: error margin; δ: confidence ratio.
Bernoulli Random Variable Z (coin flip)
[Figure: a four-state automaton (states 1 to 4) and its lassos, which have probabilities ½, ¼, ⅛ and ⅛.]
Probability mass function:
$p(0) = P[Z=0] = q_Z = 7/8$
$p(1) = P[Z=1] = p_Z = 1/8$
Geometric Random Variable
• Value of geometric RV X with parameter $p_Z$:
– No. of independent lassos until success.
• Probability mass function:
– $p(N) = P[X = N] = q_Z^{N-1} p_Z$
• Cumulative distribution function:
– $F(N) = P[X \le N] = \sum_{i \le N} p(i) = 1 - q_Z^N = 1 - (1 - p_Z)^N$
How Many Lassos?
• Requiring $1 - (1 - p_Z)^N = 1 - \delta$ yields:
$N = \ln(\delta) / \ln(1 - p_Z)$
• Lower bound on number of trials N needed to achieve success with confidence ratio δ.
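To make the bound concrete, the following sketch evaluates it numerically. The function names are illustrative, and rounding N up to an integer is an assumption of this sketch (the slide leaves N real-valued).

```python
import math

def lassos_needed(delta, p_z):
    """Smallest integer N with 1 - (1 - p_z)**N >= 1 - delta."""
    if not (0 < delta < 1 and 0 < p_z < 1):
        raise ValueError("delta and p_z must lie strictly between 0 and 1")
    return math.ceil(math.log(delta) / math.log(1 - p_z))

def success_within(n, p_z):
    """Geometric CDF: probability of at least one success in n trials."""
    return 1 - (1 - p_z) ** n

n = lassos_needed(delta=0.1, p_z=1 / 8)
print(n, success_within(n, 1 / 8))   # n is 18
```

With an accepting-lasso probability of 1/8 and δ = 0.1, 18 samples suffice, while 17 samples fall just short of the 0.9 target.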
What If pz Unknown?
• Requiring $p_Z \ge \varepsilon$ yields:
$M = \ln(\delta) / \ln(1 - \varepsilon) \ge N = \ln(\delta) / \ln(1 - p_Z)$
and therefore $P[X \le M] \ge 1 - \delta$
• Lower bound on number of trials M needed to achieve success with confidence ratio δ and error margin ε.
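The step from the assumption on the success probability to the bound on M can be spelled out as below. This is a worked derivation from the definitions on the preceding slides, treating M as real-valued as the slides do.

```latex
\begin{align*}
p_Z \ge \varepsilon
  &\;\Rightarrow\; 1 - p_Z \le 1 - \varepsilon
   \;\Rightarrow\; \ln(1 - p_Z) \le \ln(1 - \varepsilon) < 0 \\
  &\;\Rightarrow\; N = \frac{\ln \delta}{\ln(1 - p_Z)}
   \le \frac{\ln \delta}{\ln(1 - \varepsilon)} = M
   \qquad (\text{since } \ln \delta < 0) \\
P[X \le M] &= 1 - (1 - p_Z)^M \;\ge\; 1 - (1 - \varepsilon)^M = 1 - \delta .
\end{align*}
```

The last equality holds because $(1 - \varepsilon)^M = e^{M \ln(1 - \varepsilon)} = e^{\ln \delta} = \delta$.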
Statistical Hypothesis Testing
• Null hypothesis $H_0$: $p_Z \ge \varepsilon$
• Alternative hypothesis $H_1$: $p_Z < \varepsilon$
• If no success after M trials, then reject $H_0$
• Type I error: $\alpha = P[X > M \mid H_0] < \delta$
• Since: $P[X \le M \mid H_0] \ge 1 - \delta$
Monte Carlo Model Checking (MC2)
input: B=(Σ,Q,Q0,δ,F), ε, δ
N = ln (δ) / ln (1- ε)
for (i = 1; i ≤ N; i++)
if (RL(B) == 1) return (1, error-trace);
return (0, “reject H0 with α = Pr[ X>N | H0 ] < δ”);
where RL(B) performs a uniform random walk through B to obtain a random lasso.
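The loop above can be sketched in Python as follows. The `sample_lasso` callable is a hypothetical stand-in for RL(B), assumed to return a pair (1, trace) for an accepting lasso and (0, trace) otherwise. Rounding N up to an integer is also an assumption of this sketch.

```python
import math

def mc2(sample_lasso, epsilon, delta):
    """Monte Carlo model-checking loop.

    Returns (1, error_trace) as soon as an accepting lasso is sampled,
    otherwise (0, n) after n unsuccessful samples, at which point the
    null hypothesis H0 is rejected.
    """
    n = math.ceil(math.log(delta) / math.log(1 - epsilon))
    for _ in range(n):
        accepting, trace = sample_lasso()
        if accepting == 1:
            return 1, trace        # counterexample found
    return 0, n                    # no accepting lasso in n samples

# Two trivial stand-in samplers, for illustration only.
print(mc2(lambda: (0, []), epsilon=0.1, delta=0.1))
print(mc2(lambda: (1, ["s0", "s0"]), epsilon=0.1, delta=0.1))
```

Passing the sampler in as a parameter keeps the statistical loop independent of how lassos are generated, which mirrors the separation between MC2 and RL on the slide.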
GCC MC2 (GMC2)
• Input: a set of CFGs.
– Main function: a specifically designated CFG.
• Random walks in the Büchi automaton: generated on-the-fly.
– Initial state: of the main routine + bookkeeping information.
– Next state: choose process + call interpreter on its CFG.
– Processes: created by using the fork primitive.
– Optimization: interpreter returns only upon context switch.
• Lassos: detected by using a hierarchic hash table.
– Local variables: removed upon return from a procedure.
Program State
• Program state: shared variables valuation (channels & semaphores) + list of process states p1, p2, p3, …
• Process state: control state + data state.
– Control state: CFG name + statement #.
– Data state: heap + global variables valuation + frame stack f1, f2, …
• Frame: return control state + local variables valuation.
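One way to picture the program-state layout is as nested immutable records. The sketch below is an illustration only: all class and field names are hypothetical, valuations are simplified to tuples of (name, value) pairs, and a flat Python dict stands in for the hierarchic hash table mentioned earlier.

```python
from dataclasses import dataclass
from typing import Tuple

Valuation = Tuple[Tuple[str, int], ...]   # simplified: (name, value) pairs

@dataclass(frozen=True)
class ControlState:
    cfg_name: str        # which CFG (function) is executing
    statement: int       # index of the current statement in that CFG

@dataclass(frozen=True)
class Frame:
    return_to: ControlState   # where to resume after the call returns
    local_vars: Valuation

@dataclass(frozen=True)
class ProcessState:
    control: ControlState
    heap: Valuation
    global_vars: Valuation
    frames: Tuple[Frame, ...]  # frame stack f1, f2, ...

@dataclass(frozen=True)
class ProgramState:
    shared: Valuation                     # channels & semaphores
    processes: Tuple[ProcessState, ...]   # p1, p2, p3, ...

def make_state(stmt):
    """Hypothetical one-process state stopped at statement `stmt` of main."""
    proc = ProcessState(ControlState("main", stmt), (), (("x", 0),), ())
    return ProgramState((("sem", 1),), (proc,))

seen = {make_state(3): 1}       # state -> position on the random walk
print(make_state(3) in seen)    # an equal state was already visited
```

Because frozen dataclasses built from tuples are hashable, revisiting a state, and hence closing a lasso, reduces to a dictionary lookup.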
Interpreter
• Interprets GIMPLE statements: according to their semantics. Interesting:
– Inter-procedural: call(), return(). Manipulate the frame stack.
• Catches and interprets: function calls to various modeling and concurrency primitives:
– Modeling: toss(), assert(). Nondeterminism and checks.
– Processes: fork(), … Manipulate the process list.
– Communication: send(), recv(). Manipulate shared vars. May involve a context switch.
Results: TCAS (GMC2)
property                     rule  bugs  time  samples
Safe Advisory Selection      1     no    0.23  1278
Safe Advisory Selection      2     yes   0.03  147
Best Advisory Selection      1     no    0.23  1278
Best Advisory Selection      2     yes   0.04  206
Avoid unnecessary Crossing   1     yes   0.01  36
Avoid unnecessary Crossing   2     yes   0.03  180
No. Crossing Adv. Selection  1     yes   0.01  27
No. Crossing Adv. Selection  2     yes   0.01  8
Optimal Advisory Selection   1     no    0.23  1278
Optimal Advisory Selection   2     yes   0.06  217
DPh: Symmetric Fair Version (Deadlock freedom)
ph  GMC2 time  GMC2 samples  GMC2 ce.len  Verisoft time  Verisoft states  Verisoft trans
4   0:00.07    2             12           0:00.61        16               37
6   0:00.11    4             12           0:16.60        773              1171
8   0:00.78    11            20           2:57.29        5431             8449
10  0:02.17    31            24           10:41          17908            31433
12  0:04.82    24            27           >2hr           N/A              N/A
14  0:06.22    22            44           >2hr           N/A              N/A
16  0:11.56    14            32           >2hr           N/A              N/A
Needham-Schroeder Protocol
GMC2 time  GMC2 samples  Verisoft time  Verisoft states  Genetic time  Genetic errors
6h 37'     10,682,639    >8h            N/A              2h 33'        3
• Quite sophisticated C implementation.
• However, of a sequential nature:
- Essentially executes only one round of a reactive system.
Related Work
• Software model checkers for concurrent C/C++:
– VeriSoft, Spin, Blast (Slam), Magic, C-Wolf. Bogor?
• Cooperative Bug Isolation [Liblit, Naik & Zheng]:
– Compile-time instrumentation. Distribute binaries/collect bugs.
– Statistical analysis to isolate erroneous code segments.
• Random interpretation [Gulwani & Necula]:
– Execute random paths and merge with random linear operators.
• Monte Carlo and abstract interpretation [Monniaux]:
– Analyze programs with probabilistic and nondeterministic input.
Conclusions
• Presented GMC2: a software MC for GCC based on Monte Carlo MC:
– At Tree-SSA level: applicable to C, C++, Ada, Java, etc.
– Open source: freely available for usage/critique/extension.
• Ongoing and Future Work: Create a software MC branch of GCC, which also includes:
– Automated abstraction/refinement/interpolation techniques.
– Currently we manually apply a form of bounded-range abstraction (e.g. in TCAS).
Talk Outline
1. Model Checking
2. Randomized Algorithms
3. LTL Model Checking
4. Probability Theory Primer
5. Monte Carlo Model Checking
6. Implementation & Results
7. Conclusions & Open Problem
Linear Temporal Logic
• LTL formula: made up inductively of
– atomic propositions p, boolean connectives $\wedge$, $\vee$, $\neg$
– temporal modalities X (neXt) and U (Until).
• Safety: “nothing bad ever happens”
E.g. $G(\neg(pc_1 = cs \wedge pc_2 = cs))$ where G is a derived modality (Globally).
• Liveness: “something good eventually happens”
E.g. $G(\mathit{req} \rightarrow F\,\mathit{serviced})$ where F is a derived modality (Finally).
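Both example properties can be evaluated on a single lasso, that is, on the infinite word made of a finite stem followed by a cycle repeated forever. The sketch below is illustrative: states are modeled as Python sets of atomic proposition names, and the function names are hypothetical.

```python
def globally(pred, stem, cycle):
    """G pred on the infinite word stem . cycle^omega."""
    return all(pred(s) for s in stem + cycle)

def response(trigger, goal, stem, cycle):
    """G(trigger -> F goal) on the infinite word stem . cycle^omega."""
    word = stem + cycle
    goal_in_cycle = any(goal(s) for s in cycle)
    for i, s in enumerate(word):
        if trigger(s):
            # F goal holds at position i iff goal occurs at or after i
            # in the unrolled word, or anywhere on the cycle (which is
            # visited again and again).
            if not (goal_in_cycle or any(goal(t) for t in word[i:])):
                return False
    return True

# Safety: the two processes are never both in the critical section.
mutex = lambda s: not {"cs1", "cs2"} <= s
print(globally(mutex, [{"cs1"}], [{"cs2"}]))

# Liveness: every request is eventually serviced.
req = lambda s: "req" in s
srv = lambda s: "serviced" in s
print(response(req, srv, [{"req"}], [{"serviced"}, set()]))
```

Only one pass over the stem and the cycle is needed, because every state of the cycle recurs infinitely often.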
Model Checking
• S is a nondeterministic/concurrent system.
• $\varphi$ is a temporal logic formula.
– in our case Linear Temporal Logic (LTL).
• Basic idea: intelligently explore S’s state space in an attempt to establish $S \models \varphi$.
LTL Model Checking
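• Every LTL formula $\varphi$ can be translated to a Büchi automaton $B_\varphi$ such that $L(\varphi) = L(B_\varphi)$.
• Automata-theoretic approach:
$S \models \varphi$ iff $L(B_S) \subseteq L(B_\varphi)$ iff $L(B_S \times B_{\neg\varphi}) = \emptyset$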
• Checking non-emptiness is equivalent to finding a reachable accepting cycle (lasso).
Emptiness Checking
• Checking non-emptiness is equivalent to finding an accepting cycle reachable from initial state (lasso).
• Double Depth-First Search (DDFS) algorithm can be used to search for such cycles, and this can be done on-the-fly!
[Figure: a lasso over states s1, s2, s3, …, sk-2, sk-1, sk, sk+1, sk+2, sk+3, …, sn, annotated with the two searches DFS1 and DFS2.]
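A compact sketch of the double (nested) depth-first search follows. It assumes the automaton is given explicitly as a successor dictionary, which is a simplification: an on-the-fly implementation would compute successors on demand. The function and parameter names are illustrative.

```python
def find_accepting_lasso(succ, init, accepting):
    """Nested DFS: True iff some accepting state that lies on a cycle
    is reachable from init."""
    visited1, visited2 = set(), set()

    def dfs2(s, seed):
        # Inner search (DFS2): look for a path from s back to the seed.
        visited2.add(s)
        for t in succ.get(s, ()):
            if t == seed:
                return True
            if t not in visited2 and dfs2(t, seed):
                return True
        return False

    def dfs1(s):
        # Outer search (DFS1): explore all successors first, then start
        # the inner search from accepting states in postorder.
        visited1.add(s)
        for t in succ.get(s, ()):
            if t not in visited1 and dfs1(t):
                return True
        return s in accepting and dfs2(s, s)

    return dfs1(init)

graph = {1: [2], 2: [3], 3: [2]}
print(find_accepting_lasso(graph, 1, accepting={3}))
print(find_accepting_lasso(graph, 1, accepting={1}))
```

The recursive form is meant for small examples. Sharing `visited2` across all inner searches is what keeps the total work linear in the size of the graph.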
Randomized Algorithms
• Huge impact on CS: (distributed) algorithms, complexity theory, cryptography, etc.
• The next step the algorithm takes may depend on a random choice (coin flip).
• Benefits of randomization include simplicity, efficiency, and symmetry breaking.
Lassos Probability Space
• Sample space: lassos in $B_S \times B_{\neg\varphi}$
• Bernoulli random variable Z:
– Outcome = 1 if the randomly chosen lasso is accepting
– Outcome = 0 otherwise
• $p_Z = \sum_i p_i Z_i$ (expectation of an accepting lasso)
where $p_i$ is the probability of lasso i (uniform random walk)
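For a small automaton, the expectation above can be computed exactly by enumerating every lasso together with its probability under a uniform random walk. The four-state graph below is a hypothetical example, chosen so that the lasso probabilities come out as 1/2, 1/4, 1/8 and 1/8. The function names are illustrative.

```python
from fractions import Fraction

def lasso_probabilities(succ, init):
    """Enumerate every lasso from init with its probability under a
    uniform random walk (each successor is equally likely)."""
    results = []

    def walk(path, prob):
        s = path[-1]
        for t in succ[s]:
            p = prob / len(succ[s])
            if t in path:                  # t closes the cycle
                results.append((path + [t], p))
            else:
                walk(path + [t], p)

    walk([init], Fraction(1))
    return results

def is_accepting(lasso, accepting):
    """A lasso is accepting iff its cycle contains an accepting state."""
    start = lasso.index(lasso[-1])         # first visit of the repeated state
    return any(s in accepting for s in lasso[start:-1])

# Hypothetical automaton: state 4 is the only accepting state.
succ = {1: [1, 2], 2: [1, 3], 3: [3, 4], 4: [4]}
lassos = lasso_probabilities(succ, init=1)
p_z = sum(p for lasso, p in lassos if is_accepting(lasso, {4}))
print(p_z)   # 1/8
```

Exact enumeration like this is only feasible for tiny graphs, which is the reason for estimating by sampling instead.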
Bernoulli Random Variable (coin flip)
• Value of Bernoulli RV Z:
Z = 1 (success) & Z = 0 (failure)
• Probability mass function:
$p(1) = \Pr[Z=1] = p_Z$
$p(0) = \Pr[Z=0] = 1 - p_Z = q_Z$
• Expectation: $E[Z] = p_Z$
Statistical Hypothesis Testing
• Example: Given a fair and a biased coin.
– Null hypothesis H0 - fair coin selected.
– Alternative hypothesis H1 - biased coin selected.
• Hypothesis testing: Perform N trials.
– If number of heads is LOW, reject H0 .
– Else fail to reject H0 .
Statistical Hypothesis Testing
decision           H0 is True                    H0 is False
reject H0          Type I error w/prob. α        Correct to reject H0
fail to reject H0  Correct to fail to reject H0  Type II error w/prob. β
Random Lasso (RL) Algorithm
input: Büchi automaton B;
output: samples a lasso; returns 1 if it is accepting, 0 if not;
(1) s := rInit(B); i := 1; f := 0;
(2) while (s ∉ HashTbl) {
(3) HashTbl(s) := i;
(4) if (acc(s,B)) f := i;
(5) s := rNext(s,B); i := i + 1; }
(6) if (HashTbl(s) ≤ f) return 1 else return 0;
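A Python sketch of the random-lasso walk follows. It returns 1 for an accepting lasso, matching the `RL(B) == 1` test in the MC2 loop. The explicit successor dictionary and all names are assumptions of this sketch, and every state is assumed to have at least one successor.

```python
import random

def random_lasso(init_states, succ, accepting, rng=random):
    """Sample one lasso by a uniform random walk.

    Returns (1, trace) if the cycle of the lasso contains an accepting
    state and (0, trace) otherwise.
    """
    s = rng.choice(init_states)        # rInit(B)
    position = {}                      # HashTbl: state -> step of first visit
    trace = []
    i, f = 1, 0                        # f: step of the last accepting state
    while s not in position:
        position[s] = i
        trace.append(s)
        if s in accepting:             # acc(s, B)
            f = i
        s = rng.choice(succ[s])        # rNext(s, B)
        i += 1
    trace.append(s)                    # the state that closes the cycle
    # The cycle starts at step position[s]; it is accepting iff the last
    # accepting state was seen at or after that step.
    return (1 if position[s] <= f else 0), trace

print(random_lasso([1], {1: [2], 2: [1]}, accepting={2}))
```

Memory use is bounded by the length of the sampled lasso, since only the states of the current walk are stored.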
Correctness of MC2
Theorem: Given a Büchi automaton B, error margin ε, and confidence ratio δ, if MC2 rejects H0, then its type I error has probability
α = P[ X > M | H0 ] < δ
Complexity of MC2
Theorem: Given a Büchi automaton B having diameter D, error margin ε, and confidence ratio δ, MC2 runs in time O(N∙D) and uses space O(D), where N = ln(δ) / ln(1- ε)
Cf. DDFS, which runs in $O(2^{|S|+|\varphi|})$ time for $B = B_S \times B_{\neg\varphi}$.
Alternative Sampling Strategies
[Figure: a chain of states 0, 1, …, n-1, n, annotated with $\Pr[L_n] = O(2^{-n})$.]
• Multilasso sampling: ignores backedges that do not lead to an accepting lasso.
• Probabilistic systems: there is a natural way to assign a probability to a RL.
• Input partitioning: partition input into classes that trigger the same behavior (guards).