Huayang Guo (1,2), Ming Wu (1), Lidong Zhou (1), Gang Hu (1,2), Junfeng Yang (2), Lintao Zhang (1)
(1) Microsoft Research Asia, (2) Columbia University
Practical Software Model Checking via Dynamic Interface Reduction
Building reliable distributed systems is hard
  Machine failure
  Message loss
  Message reordering
  Thread interleaving
Non-determinism leads to tricky bugs
[Figure: threads Thr1 and Thr2, asynchronous I/O, and a crash]
Implementation-level software model checkers
  MaceMC (NSDI’07), MoDist (NSDI’09)
  Directly check implementations
  No need to construct an abstract model beforehand
[Figure: a state space explorer driving threads Thr1 and Thr2, async I/O, and crashes]
State space explosion
  MPS: product-level Paxos
  Never fully explored with 3 nodes
  34 years for MoDist
Dynamic Interface Reduction (DIR)
  Effective
    34 years → 18 hours (fully explored MPS-3)
    Exponential reduction: 100K : 1 states for MPS and Berkeley DB w/ replication
  Automatic, no manual effort required
  Provably sound and complete
  Easy to integrate with legacy MCs
    DeMeter: DIR with MoDist and MaceMC
    MC-specific modifications: ≤ 1k LOC
Outline
  Insight
  Challenges
  Dynamic Interface Reduction
  Evaluation
  Related work
  Conclusion
Insight
  Distributed systems: componentized
  Local non-determinism is isolated
    Empirically, 99.9% do not propagate (Berkeley DB)
  Previous work: check components together
    |m1| * |m2| * |m3|
  DIR: check components separately
    |m1| + |m2| + |m3|
[Figure: components m1, m2, and m3 containing threads (Thr1 to Thr4) and async I/O, connected through interface behavior]
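The gap between the two formulas can be made concrete with a small calculation. The sketch below is only an illustration: the component sizes are hypothetical, and the sum is an idealized count that ignores the cost of exploring interface behaviors.

```python
from math import prod

def joint_states(component_sizes):
    """States visited when components are checked together: |m1| * |m2| * ..."""
    return prod(component_sizes)

def separate_states(component_sizes):
    """States visited when components are checked separately: |m1| + |m2| + ..."""
    return sum(component_sizes)

# Hypothetical local state counts, chosen only for illustration.
sizes = [1000, 1000, 1000]
print(joint_states(sizes))     # 1000000000
print(separate_states(sizes))  # 3000
```

With three components of a thousand local states each, the joint exploration is larger by a factor of several hundred thousand, which is the kind of gap the slide's product-versus-sum contrast points at.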
Challenges and Solutions
  How to discover/construct the interface behavior of a component?
    Manually or statically construct an interface process
    Impractical for complex software systems
  How to guarantee
    Completeness: find all bugs
    Soundness: no false positives
  Our solution
    Dynamically discover interface behaviors
    Combine discovered interface behaviors
    Track dependencies
DIR Overview
[Figure: a global explorer explores global interface behaviors; local explorers for Component1, Component2, and Component3 each explore local states; interface behaviors are exchanged between the global explorer and the local explorers]
Example
Client (Cli):
  if (Choose(2)==0) {
    Send(P,1);
    Send(P,2);
  } else {
    Send(P,1);
    Send(P,3);
  }
Primary/Secondary (Pri/Sec):
  //Main thread
  while (n=Recv()) {
    Lock();
    total+=n;      // Sum
    Unlock();
    if (isPrimary)
      Send(S,n);
  }
  //Checkpoint thread
  Lock();
  Log(total);      // Ckpt
  Unlock();
[Figure: Client, Primary, and Secondary]
Produce initial global trace
Global explorer:
  Cli.Choose(2) = 0
  Cli.Send(Pri, 1)
  Pri.Recv(Cli, 1)
  Pri.Ckpt
  Pri.Sum
  Pri.Send(Sec, 1)
  Sec.Recv(Pri, 1)
  Sec.Ckpt
  Sec.Sum
  Cli.Send(Pri, 2)
  Pri.Recv(Cli, 2)
  Pri.Sum
  Pri.Send(Sec, 2)
  Sec.Recv(Pri, 2)
  Sec.Sum
Construct message trace
Global explorer: the Send and Recv statements of the global trace form the message trace.
  Cli.Send(Pri, 1)
  Pri.Recv(Cli, 1)
  Pri.Send(Sec, 1)
  Sec.Recv(Pri, 1)
  Cli.Send(Pri, 2)
  Pri.Recv(Cli, 2)
  Pri.Send(Sec, 2)
  Sec.Recv(Pri, 2)
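Constructing the message trace amounts to filtering the global trace down to its Send and Recv events. A minimal sketch, assuming events are represented as the strings shown on the slide (the string encoding and the helper name `message_trace` are illustrative choices, not DeMeter's data structures):

```python
import re

# Initial global trace from the example, one string per event.
GLOBAL_TRACE = [
    "Cli.Choose(2) = 0",
    "Cli.Send(Pri, 1)", "Pri.Recv(Cli, 1)", "Pri.Ckpt", "Pri.Sum",
    "Pri.Send(Sec, 1)", "Sec.Recv(Pri, 1)", "Sec.Ckpt", "Sec.Sum",
    "Cli.Send(Pri, 2)", "Pri.Recv(Cli, 2)", "Pri.Sum",
    "Pri.Send(Sec, 2)", "Sec.Recv(Pri, 2)", "Sec.Sum",
]

# An event belongs to the message trace if it is a Send or a Recv.
MESSAGE_EVENT = re.compile(r"^\w+\.(Send|Recv)\(")

def message_trace(global_trace):
    """Keep only the message events, preserving their global order."""
    return [event for event in global_trace if MESSAGE_EVENT.match(event)]

for event in message_trace(GLOBAL_TRACE):
    print(event)
```

Local events such as `Pri.Ckpt`, `Pri.Sum`, and `Cli.Choose(2) = 0` are dropped, leaving the eight message events.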
Project message trace
Global explorer: project the global message trace to components.
Primary:
  Pri.Recv(Cli, 1)
  Pri.Send(Sec, 1)
  Pri.Recv(Cli, 2)
  Pri.Send(Sec, 2)
Secondary:
  Sec.Recv(Pri, 1)
  Sec.Recv(Pri, 2)
Client:
  Cli.Send(Pri, 1)
  Cli.Send(Pri, 2)
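Projection keeps, for each component, only the message events that the component itself executes. A minimal sketch under the same assumed string encoding (`project` is a hypothetical helper name):

```python
# Global message trace from the example.
MESSAGE_TRACE = [
    "Cli.Send(Pri, 1)", "Pri.Recv(Cli, 1)",
    "Pri.Send(Sec, 1)", "Sec.Recv(Pri, 1)",
    "Cli.Send(Pri, 2)", "Pri.Recv(Cli, 2)",
    "Pri.Send(Sec, 2)", "Sec.Recv(Pri, 2)",
]

def project(message_trace, component):
    """Return the events executed by `component`, in their original order."""
    prefix = component + "."
    return [event for event in message_trace if event.startswith(prefix)]

for component in ("Pri", "Sec", "Cli"):
    print(component, project(MESSAGE_TRACE, component))
```

Each projected trace is what the corresponding local explorer has to reproduce at its interface while it explores the component's local states.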
Local explorer for Primary
Projected message trace from the global explorer:
  Pri.Recv(Cli, 1)
  Pri.Send(Sec, 1)
  Pri.Recv(Cli, 2)
  Pri.Send(Sec, 2)
[Figure: the local explorer for Primary explores orderings of the local events Pri.Ckpt and Pri.Sum around this message trace]
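The local exploration for Primary can be illustrated by enumerating the orderings of its two threads. This is a simplified sketch, not the explorer itself: it assumes each listed event is atomic (the lock protects `Sum` and `Ckpt`) and that the checkpoint thread runs exactly once.

```python
# Primary's main thread, in program order, for the messages 1 and 2.
MAIN_THREAD = [
    "Pri.Recv(Cli, 1)", "Pri.Sum", "Pri.Send(Sec, 1)",
    "Pri.Recv(Cli, 2)", "Pri.Sum", "Pri.Send(Sec, 2)",
]
# Primary's checkpoint thread (assumed to run once).
CKPT_THREAD = ["Pri.Ckpt"]

def interleavings(a, b):
    """Yield every merge of a and b that preserves each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def interface(trace):
    """The part of a local trace that other components can observe."""
    return [e for e in trace if ".Send(" in e or ".Recv(" in e]

schedules = list(interleavings(MAIN_THREAD, CKPT_THREAD))
print(len(schedules))                                  # 7
print(len({tuple(interface(s)) for s in schedules}))   # 1
```

All seven local schedules expose the same message trace, so none of this thread-level non-determinism needs to be revisited by the other components.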
Local explorer for Client
Projected message trace from the global explorer (Cli.Choose(2) = 0):
  Cli.Send(Pri, 1)
  Cli.Send(Pri, 2)
Branching trace (Cli.Choose(2) = 1):
  Cli.Send(Pri, 1)
  Cli.Send(Pri, 3)
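The branching trace comes from the Client's non-deterministic `Choose(2)`. The sketch below re-expresses the slide's Client in Python and runs it once per outcome; the injected `choose` and `send` callbacks and the function names are hypothetical stand-ins for what a model checker would intercept.

```python
def client(choose, send):
    """Python rendering of the example Client; P on the slide is the Primary."""
    if choose(2) == 0:
        send("Pri", 1)
        send("Pri", 2)
    else:
        send("Pri", 1)
        send("Pri", 3)

def explore_client():
    """Run the client once per Choose(2) outcome and record its message trace."""
    traces = {}
    for outcome in range(2):
        sent = []
        client(lambda n: outcome,
               lambda dst, value: sent.append(f"Cli.Send({dst}, {value})"))
        traces[outcome] = sent
    return traces

for outcome, trace in explore_client().items():
    print(f"Cli.Choose(2) = {outcome}: {trace}")
```

Outcome 0 reproduces the projected trace, while outcome 1 yields a trace that shares the first send and then diverges, which is what makes it a branching trace.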
Composition
Existing global message trace:
  Cli.Send(Pri, 1)
  Pri.Recv(Cli, 1)
  Pri.Send(Sec, 1)
  Sec.Recv(Pri, 1)
  Cli.Send(Pri, 2)
  Pri.Recv(Cli, 2)
  Pri.Send(Sec, 2)
  Sec.Recv(Pri, 2)
Branching local message trace:
  Cli.Send(Pri, 1)
  Cli.Send(Pri, 3)
[Figure annotations: "==" and "dependence"]
New global message trace:
  Cli.Send(Pri, 1)
  Pri.Recv(Cli, 1)
  Pri.Send(Sec, 1)
  Sec.Recv(Pri, 1)
  Cli.Send(Pri, 3)
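One way to read the composition step: keep the events of the existing global message trace that do not depend on the point where the branching trace diverges, then append the branching events. The sketch below implements that reading under stated assumptions (payloads identify messages uniquely, and dependence means "later on an affected component, or the receive of a dropped send"). The `Event` type and `compose` function are hypothetical, and this is not presented as DeMeter's actual algorithm.

```python
from typing import List, NamedTuple

class Event(NamedTuple):
    comp: str   # component executing the event
    op: str     # "Send" or "Recv"
    peer: str   # destination of a Send, source of a Recv
    val: int    # payload, assumed unique per message

    def __str__(self):
        return f"{self.comp}.{self.op}({self.peer}, {self.val})"

def compose(global_trace: List[Event], comp: str, branch: List[Event]) -> List[Event]:
    """Substitute comp's branching local trace into an existing global trace."""
    local = [e for e in global_trace if e.comp == comp]
    k = 0  # length of the prefix shared by the old and the branching local trace
    while k < len(local) and k < len(branch) and local[k] == branch[k]:
        k += 1
    start = global_trace.index(local[k]) if k < len(local) else len(global_trace)
    kept = list(global_trace[:start])
    affected = {comp}      # components whose later events depend on the divergence
    dropped_sends = set()  # sends that no longer happen in the new trace
    for e in global_trace[start:]:
        matching_send = Event(e.peer, "Send", e.comp, e.val)
        if e.comp in affected or (e.op == "Recv" and matching_send in dropped_sends):
            affected.add(e.comp)
            if e.op == "Send":
                dropped_sends.add(e)
        else:
            kept.append(e)
    return kept + list(branch[k:])

existing = [
    Event("Cli", "Send", "Pri", 1), Event("Pri", "Recv", "Cli", 1),
    Event("Pri", "Send", "Sec", 1), Event("Sec", "Recv", "Pri", 1),
    Event("Cli", "Send", "Pri", 2), Event("Pri", "Recv", "Cli", 2),
    Event("Pri", "Send", "Sec", 2), Event("Sec", "Recv", "Pri", 2),
]
branching = [Event("Cli", "Send", "Pri", 1), Event("Cli", "Send", "Pri", 3)]
new_trace = compose(existing, "Cli", branching)
print([str(e) for e in new_trace])
```

On the example, the events that depend on `Cli.Send(Pri, 2)` are dropped and `Cli.Send(Pri, 3)` is appended, giving the five-event trace listed as the new global message trace.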
Evaluation: Experiment Setup
  DeMeter-MoDist
    MPS, a deployed product implementation of Paxos
    Berkeley DB (BDB)
  DeMeter-MaceMC
    Chord, a peer-to-peer DHT implementation
Evaluation: Effectiveness of Dynamic Interface Reduction
  App-n: n is the number of distributed nodes
  Reduction ratio: |M_{w/o DIR}| / |M_{w/ DIR}|

  Checker         App      Reduction  Speedup
  DeMeter-MoDist  MPS-2          488      153
  DeMeter-MoDist  MPS-3       542944   217178
  DeMeter-MoDist  BDB-2          277       50
  DeMeter-MoDist  BDB-3       278481    44203
  DeMeter-MaceMC  Chord-2         19        7
  DeMeter-MaceMC  Chord-3       1587      547

  From 2 to 3 nodes: roughly x1000 (MPS), x1000 (BDB), x100 (Chord)
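The x1000 and x100 annotations can be checked against the reduction figures with a few lines of arithmetic. The sketch below uses only the numbers reported on the slide; the dictionary layout and the function name `growth` are illustrative.

```python
# Reduction ratios from the slide: (2 nodes, 3 nodes) per application.
REDUCTION = {
    "MPS": (488, 542944),
    "BDB": (277, 278481),
    "Chord": (19, 1587),
}

def growth(table):
    """How many times larger the 3-node figure is than the 2-node figure."""
    return {app: round(three / two) for app, (two, three) in table.items()}

print(growth(REDUCTION))  # {'MPS': 1113, 'BDB': 1005, 'Chord': 84}
```

The reduction ratio grows by about three orders of magnitude for MPS and BDB and about two for Chord when one node is added.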
Related Work
  Compositional model checking
    E. M. Clarke et al. (Symposium on Logic in Computer Science, 1989)
  Partial-order reduction
    C. Flanagan and P. Godefroid (POPL’05)
  Model checking network systems
    R. Guerraoui and M. Yabandeh (NSDI’11)
Conclusion
  Distributed systems are componentized
    Local non-determinism does not propagate
  Dynamic interface reduction
    Effective, automatic, easy
    Provably sound and complete
  DeMeter: enables DIR for legacy MCs