49
Context-bounded model checking of concurrent software Shaz Qadeer Microsoft Research Joint work with: •Jakob Rehof, Microsoft Research •Dinghao Wu, Princeton University

Context-bounded model checking of concurrent software

  • Upload
    inga

  • View
    34

  • Download
    1

Embed Size (px)

DESCRIPTION

Context-bounded model checking of concurrent software. Shaz Qadeer Microsoft Research. Joint work with: Jakob Rehof, Microsoft Research Dinghao Wu, Princeton University. . . . . . . . . . . . . . . . . . . . . . . . . - PowerPoint PPT Presentation

Citation preview

Page 1: Context-bounded model checking of concurrent software

Context-bounded model checking of concurrent

softwareShaz Qadeer

Microsoft Research

Joint work with:•Jakob Rehof, Microsoft Research•Dinghao Wu, Princeton University

Page 2: Context-bounded model checking of concurrent software

Concurrent software

•Operating systems, device drivers•Databases, web servers, browsers, GUIs, ...•Modern languages: C#, Java

Processor 1

Processor 2

Thread 1

Thread 2

Thread 3

Thread 4

Page 3: Context-bounded model checking of concurrent software

Concurrency is increasingly important

• New classes of concurrent software– Web services– Workflows

• Single-chip multiprocessors are an architectural inflexion point– Software running on these chips will be

even more concurrent

Page 4: Context-bounded model checking of concurrent software

Reliable concurrent software?

•Correctness Problem– does program behaves correctly for all

inputs and all interleavings?

•Bugs due to concurrency are insidious – non-deterministic, timing dependent– difficult to detect, reproduce, eliminate– coverage from testing very poor

Page 5: Context-bounded model checking of concurrent software

Analysis of concurrent programs is difficult (1)

• Finite-data single-procedure program– n lines– m states for global data variables

• 1 thread– n * m states

• K threads– (n)

K * m states

Page 6: Context-bounded model checking of concurrent software

Analysis of concurrent programs is difficult (2)

• Finite-data program with procedures– n lines– m states for global data variables

• 1 thread– Infinite number of states– Can still decide assertions in O(n * m3)– SLAM, ESP, BLAST implement this algorithm

• K 2 threads– Undecidable! (Ramalingam 00)

Page 7: Context-bounded model checking of concurrent software

Context-bounded verification of concurrent software

Context Context Context

Context switch Context switch

Analyze all executions with small number of context switches !

Page 8: Context-bounded model checking of concurrent software

• Many subtle concurrency errors are manifested in executions with a small number of contexts

• Context-bounded analysis can be performed efficiently

Why context-bounded analysis?

Page 9: Context-bounded model checking of concurrent software

KISS: A static checker for concurrent software

• An implementation of context-bounded analysis– Technique to use any sequential checker

to perform context-bounded concurrency analysis

• Has found a number of concurrency errors in NT device drivers

Page 10: Context-bounded model checking of concurrent software

Sequentialprogram QKISS

Sequential Checker

Concurrentprogram P

No error found

Error in Q indicateserror in P

KISS: A static checker for concurrent software

Page 11: Context-bounded model checking of concurrent software

Sequentialprogram QKISS

Concurrentprogram P

KISS: A static checker for concurrent software

No error found

Error in Q indicateserror in P

SDV

Page 12: Context-bounded model checking of concurrent software

Sequentialprogram QKISS

Concurrentprogram P

KISS: A static checker for concurrent software

No error found

Error in Q indicateserror in P

PREfix

Page 13: Context-bounded model checking of concurrent software

Sequentialprogram QKISS

Concurrentprogram P

KISS: A static checker for concurrent software

No error found

Error in Q indicateserror in P

ESP

Page 14: Context-bounded model checking of concurrent software

Inside a static checker for sequential programs

int x, y, z;

void foo ( ) { if (x > y) { y = x; } if (y > z) { z = y; }

assert (x ≤ z);}

• Symbolically analyze all paths

• Check the assertion for each path

• Interprocedural analysis – e.g., PREfix, ESP, SLAM,

BLAST

Page 15: Context-bounded model checking of concurrent software

KISS strategy

• Q encodes executions of P with small number of context switches– instrumentation introduces lots of extra paths to

mimic context switches

• Leverage all-path analysis of sequential checkers

Sequentialprogram QKISS

Concurrentprogram P

SDV

Page 16: Context-bounded model checking of concurrent software

PnpStop( ) { int t; de->stopping = T; t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); WaitEvent(& de->stopEvent);}

DispatchRoutine( ) { int t; if (! de->stopping) { AtomicIncr(& de->count); // do useful work // … t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); }}

Page 17: Context-bounded model checking of concurrent software

DispatchRoutine( ) { int t; if (! de->stopping) { AtomicIncr(& de->count); // do useful work // … t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); }}

PnpStop( ) { int t; if ($) return; de->stopping = T; if ($) return; t = AtomicDecr(& de->count); if ($) return; if (t == 0) SetEvent(& de->stopEvent); if ($) return; WaitEvent(& de->stopEvent);}

Page 18: Context-bounded model checking of concurrent software

PnpStop( ) { int t; if ($) return; de->stopping = T; if ($) return; t = AtomicDecr(& de->count); if ($) return; if (t == 0) SetEvent(& de->stopEvent); if ($) return; WaitEvent(& de->stopEvent);}

DispatchRoutine( ) { int t; CODE; if (! de->stopping) { CODE; AtomicIncr(& de->count); // do useful work // … CODE; t = AtomicDecr(& de->count); CODE; if (t == 0) SetEvent(& de->stopEvent); }}

if ( !done ) { if ($) { done = T; PnpStop( ); }}

CODE bool done = F;

Page 19: Context-bounded model checking of concurrent software

PnpStop( ) { int t; if ($) return; de->stopping = T; if ($) return; t = AtomicDecr(& de->count); if ($) return; if (t == 0) SetEvent(& de->stopEvent); if ($) return; WaitEvent(& de->stopEvent);}

DispatchRoutine( ) { int t; CODE; if (! de->stopping) { CODE; AtomicIncr(& de->count); // do useful work // … CODE; t = AtomicDecr(& de->count); CODE; if (t == 0) SetEvent(& de->stopEvent); }}

if ( !done ) { if ($) { done = T; PnpStop( ); }}

CODE bool done = F;

main( ) { DispatchRoutine( ); }

Page 20: Context-bounded model checking of concurrent software

PnpStop( ) { int t; CODE; de->stopping = T; CODE; t = AtomicDecr(& de->count); CODE; if (t == 0) SetEvent(& de->stopEvent); CODE; WaitEvent(& de->stopEvent);}

DispatchRoutine( ) { int t; if ($) return; if (! de->stopping) { if ($) return; AtomicIncr(& de->count); // do useful work // … if ($) return; t = AtomicDecr(& de->count); if ($) return; if (t == 0) SetEvent(& de->stopEvent); }}

if ( !done ) { if ($) { done = T; PnpStop( ); }}

CODE bool done = F;

main( ) { PnpStop( ); }

Page 21: Context-bounded model checking of concurrent software

KISS features• KISS trades off soundness for scalability • Cost of analyzing a concurrent program P =

cost of analyzing a sequential program Q– Size of Q asymptotically same as size of P

• Unsoundness is precisely quantifiable– for 2-thread program, explores all executions

with up to two context switches – for n-thread program, explores up to 2n-2

context switches

• Allows any sequential checker to analyze concurrency

Page 22: Context-bounded model checking of concurrent software
Page 23: Context-bounded model checking of concurrent software

Experimental Evaluation of KISS

Page 24: Context-bounded model checking of concurrent software

Driver Stopping Error in Bluetooth Driver (1 KLOC)

DispatchRoutine() { int t; if (! de->stopping) { AtomicIncr(& de->count); assert ! driverStopped; // do useful work // … t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); }}

PnpStop() { int t; de->stopping = T; t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); WaitEvent(& de->stopEvent); driverStopped = T;}

Page 25: Context-bounded model checking of concurrent software

int t;if (! de->stopping) {

int t;de->stopping = T;t = AtomicDecr(& de->count);if (t == 0) SetEvent(& de->stopEvent);WaitEvent(& de->stopEvent);driverStopped = T;

AtomicIncr(& de->count); assert ! driverStopped; // do useful work // … t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent);}

Assertion fails!

Page 26: Context-bounded model checking of concurrent software

DispatchRoutine(IRP *irp) { … irp->CancelRoutine = PacketCancelRoutine; Enqueue(irp); IoMarkIrpPending(irp); …}

IoCancelIrp(IRP *irp) { IoAcquireCancelSpinLock(); if (irp->CancelRoutine) { (irp->CancelRoutine)(irp); } …}

PacketCancelRoutine(IRP *irp) { … Dequeue(irp); IoCompleteRequest(irp); IoReleaseCancelSpinLock(); …}

IRP Cancellation Error in Packet Driver (2.5 KLOC)

Page 27: Context-bounded model checking of concurrent software

…irp->CancelRoutine = PacketCancelRoutine;Enqueue(irp);

IoAcquireCancelSpinLock();if (irp->CancelRoutine) { // inline PacketCancelRoutine(irp) … Dequeue(irp); IoCompleteRequest(irp); IoReleaseCancelSpinLock();

IoMarkIrpPending(irp);

Error: An irp should not be marked pending after it has been completed !

Page 28: Context-bounded model checking of concurrent software

Data-race Conditions in DDK Sample Drivers

• Device extension shared among threads• Data-races on device extension fields• 18 sample DDK drivers

– Range 0.5-9.2 KLOC– Total 70 KLOC

• Each field checked separately with resource limit of 20 minutes and 800MB

• Two threads: each calls nondeterministically chosen dispatch routine

Page 29: Context-bounded model checking of concurrent software

Driver KLOC # Fields # Races

Tracedrv 0.5 3 0

Moufiltr 1.0 14 0

Kbfiltr 1.1 15 0

Imca 1.1 5 1

Startio 1.1 9 0

Toaster/toastmon 1.4 8 1

Diskperf 2.4 16 0

1394diag 2.7 18 0

1394vdev 2.8 18 1

Fakemodem 2.9 39 6

Toaster/bus 5.0 30 0

Serenum 5.9 41 2

Toaster/func 6.6 24 5

Mouclass 7.0 34 1

Kbdclass 7.4 36 1

Mouser 7.6 34 1

Fdc 9.2 92 9

Total:30 races

Page 30: Context-bounded model checking of concurrent software

ToastMon_DispatchPnp(DEVICE_OBJECT *obj,IRP *irp)

{ … IoAcquireRemoveLock(); … case IRP_MN_QUERY_STOP_DEVICE: // Race: write access deviceExt->DevicePnPState = StopPending; … break; … IoReleaseRemoveLock(); …}

ToastMon_DispatchPower(DEVICE_OBJECT *obj,IRP *irp)

{ … // Race: read access if (deviceExt->DevicePnpState == Deleted) { … } …}

DevicePnpState Field in Toaster/toastmon

Page 31: Context-bounded model checking of concurrent software

Acknowledgments

• Tom Ball• Byron Cook• John Henry• Doron Holan• Vladimir Levin• Jakob Lichtenberg• Adrian Oney• Sriram Rajamani• Peter Wieland• …

Page 32: Context-bounded model checking of concurrent software

Keep It Simple and Sequential

• Context-bounded analysis by leveraging existing sequential checkers

• Validates the hypothesis that many concurrency errors require few context switches to show up

Page 33: Context-bounded model checking of concurrent software

However…

• Hard limit on number of explored contexts– e.g., two context switches for concurrent

program with two threads

• Case study: Concurrent transaction management code written in C# (Naik-Rehof 04)– Analyzed by the Zing model checker after

automatically translating to the Zing input language

– Found three bugs each requiring between three and four context switches

Page 34: Context-bounded model checking of concurrent software

Is a tuning knob possible?

Given a concurrent boolean program P and a positive integer c, does P go wrong by failing an assertion via anexecution with at most c contexts?

Given a concurrent boolean program P, does P go wrong by failing an assertion? Undecidable

Decidable

Given a concurrent boolean program P with unbounded fork-join parallelism and a positive integer c, does P go wrong by failing an assertion via an execution with at most c contexts? Decidable

Page 35: Context-bounded model checking of concurrent software

Context Context Context

Context switch Context switch

Problem:• Unbounded computation possible within each context!• Unbounded execution depth and reachable state space• Different from bounded-depth model checking

Page 36: Context-bounded model checking of concurrent software

Global store g, valuation to global variablesLocal store l, valuation to local variables Stack s, sequence of local storesState (g, s)

Sequential pushdown system

Transition relation:

(g, s) (g’, s’)

Page 37: Context-bounded model checking of concurrent software

Reachability problem for sequential pushdown

systemGiven (g, s), is there s’ such that (g, s) * (error,s’)?

Page 38: Context-bounded model checking of concurrent software

Aggregate state

Set of stacks ssAggregate state (g, ss) = { (g,s) | s ss }

Reach(g, ss, g’) = {s’ | (g’, s’) Reach(g, ss)}

Reach(g, ss) = { (g’, s’) | exists s ss such that (g, s) * (g’, s’) }

Page 39: Context-bounded model checking of concurrent software

Theorem (Buchi, Schwoon00)

• If ss is regular, then Reach(g, ss, g’) is regular.

• If ss is given as a finite automaton A, then a finite automaton A’ for Reach(g, ss, g’) can be constructed from A in polynomial time.

Page 40: Context-bounded model checking of concurrent software

Algorithm

Solution:Compute automaton for Reach(g, {s}, error) and report error if it is nonempty.

Problem:Given (g, s), is there s’ such that (g, s) * (error,s’)?

Page 41: Context-bounded model checking of concurrent software

Global store g, valuation to global variablesLocal store l, valuation to local variables Stack s, sequence of local storesState (g, s1, s2)

Concurrent pushdown system

Transition relation:

(g, s1) (g’, s’1) in thread 1

(g, s1, s2) 1 (g, s’1, s2)

(g, s2) (g’, s’2) in thread 2

(g, s1, s2) 2 (g, s1, s’2)

Page 42: Context-bounded model checking of concurrent software

Reachability problem for concurrent pushdown

system

Given (g, s1, s2), are there s’1 and s’2 such that (g, s1, s2) reaches (error, s’1, s’2) via an execution with at most c contexts?

Page 43: Context-bounded model checking of concurrent software

Aggregate transition relation

ss’1 = Reach1(g, ss1, g’)

(g, ss1, ss2) 1 (g’, ss’1, ss2)

(g, ss1, ss2) 2 (g’, ss1, ss’2)

ss’2 = Reach2(g, ss2, g’)

Page 44: Context-bounded model checking of concurrent software

Algorithm: 2 threads, c contexts

1 2

1 2

1 2Depth c

(g, {s1}, {s2})

Compute the set of reachable aggregate states.Report an error if (g, ss1, ss2) is reachable andg = error, ss1 is nonempty, and ss2 is nonempty.

Page 45: Context-bounded model checking of concurrent software

Complexity: 2 threads, c contexts

1 2

1 2

1 2

Depth of tree = context bound cBranching factor bounded by G 2 (G = # of global stores)Number of edges bounded by (G 2) (c+1)

Each edge computable in polynomial time

Depth c

(g, {s1}, {s2})

Page 46: Context-bounded model checking of concurrent software

Unbounded fork-join parallelism

• Fork operation: x = fork• Join operation: join(x)• Copy thread identifier from one

variable to another

Page 47: Context-bounded model checking of concurrent software

Algorithm: unbounded fork-join parallelism, c contexts

• At most c threads may perform a transition

• Reduce to previously solved problem with c threads and c contexts– Nondeterministically pick c forked

threads for execution

Page 48: Context-bounded model checking of concurrent software

start : {1, …, c} boolean, initialized to i. (i == 1) end : {1, …, c} boolean, initialized to i. false

x = fork translates to

if ($) { assume(count < c); count = count + 1; x = count; start[count] = true;} else { x = c + 1;}

join(x) translates to

assume(x c);assume(end[x]);

count : {1, …, c}, initialized to 1

• c statically created threads• thread i starts execution when start[i] is true • thread i sets end[i] to true on termination

Page 49: Context-bounded model checking of concurrent software

Context-bounded analysis of concurrent software

• Many subtle concurrency errors are manifested in executions with few context switches – Experience with KISS on Windows drivers– Experience with Zing on transaction manager

• Algorithms for context-bounded analysis are more efficient than those for unbounded analysis– Reducibility to sequential checking with KISS– Decidability of assertion checking for

concurrent boolean programs