On Sequentializing Concurrent Programs (Bounded Model Checking) Gennaro Parlato University of...

Preview:

Citation preview

On Sequentializing Concurrent Programs

(Bounded Model Checking)

Gennaro ParlatoUniversity of Southampton, UK

UPMARC 7th Summer School on Multicore Computing, June 8-10, 2015

Concurrent Programs - Reachability Problem

concurrent C programs• POSIX threads• SC memory model

reachability• assertion failure• out-of-bound array• division-by-zero, …

bounded model checking (BMC)• bug-finding, not complete analysis

SHARED MEMORY

T1 T2TN

THREADS

BMC approach – Sequential Programs

Efficient tools for C• BLITZ [ Cho, D'Silva, Song – ASE’13 ]• CBMC [ Clarke, Kroening, Lerda – TACAS’04 ] • LLBMC [ Falke, Merz, Sinz – ASE’13 ]• ESBMC [ Cordeiro, Fischer, Marques-Silva – ASE’09 ]

PROGRAMBOUNDEDPROGRAM

SAT/SMTFORMULA

SOLVER

inliningunrollingSSA form

direct SAT/SMT approach• encode each thread as in the sequential case • add a conjunct for shared memory operations• all possible interleavings in the bounded program

φthreads ∧ φconcurrency

papers:• [ Sinha, Wang – POPL’11 ]• [ Alglave, Kroening, Tautschnig – CAV’13 ]

BMC approach - Concurrent C Programs

CONCPROGRAM

BOUNDEDPROGRAM

SAT/SMTFORMULA

SOLVER

concurrencyhandling

SE

QU

EN

TIA

LIZ

AT

ION

(cod

e-to

-cod

e tr

ansl

atio

n)

Sequentialization

papers

• proposal [ Qadeer, Wu – PLDI’04 ]• eager, bounded context-switch, finite # threads [ Lal, Reps – CAV’08 ]• lazy, finite # threads, parameterized [La Torre, Madhusudan, Parlato – CAV’09, CAV’10]• thread creation [Bouajjani, Emmi, Parlato – SAS’11] [Emmi, Qadeer, Rakamaric – POPL’11]• Lal/Reps for real-time systems [Chaki, Gurfinkel, Strichman – FMCAD’11]• message-passing programs [Bouajjani, Emmi -- TACAS’12]

CONCPROGRAM

BOUNDEDPROGRAM

SAT/SMTFORMULA

SOLVER

SEQPROGRAM

BMCSEQ TOOL

SE

QU

EN

TIA

LIZ

AT

ION

(cod

e-to

-cod

e tr

ansl

atio

n)

Sequentialization

BMC-based tools (Implementations of variants of Lal/Reps schema)

• Corral (microsoft research) [ Lal, Qadeer, Lahiri – CAV’12, FSE’14 ]

• Rek [ Chaki, Gurfinkel, Strichman – FMCAD’11 ]

• STORM [ Lahiri,Qadeer,Rakamaric CAV’09 ]

CONCPROGRAM

BOUNDEDPROGRAM

SAT/SMTFORMULA

SOLVER

SEQPROGRAM

BMCSEQ TOOL

Sequentialization

We have designed new sequentializations targeting BMC scalable analyses + surprisingly simple

•Lazy-CSeq•Memory Unwinding

CONCPROGRAM

BOUNDEDPROGRAM

SAT/SMTFORMULA

SOLVER

SEQPROGRAM

SE

QU

EN

TIA

LIZ

AT

ION

(cod

e-to

-cod

e tr

ansl

atio

n)

BMCSEQ TOOL

Lazy-CSeq: Schema Overview(new sequentialization for BMC)

[ Inverso–Tomasco–Fischer–La Torre–Parlato, CAV’14 ]

Lazy-CSeq Approach

BOUNDEDPROGRAM

BMC SEQUENTIAL

TOOL

SEQPROGRAM

SE

QU

EN

TIA

LIZ

AT

ION

(cod

e-to

-cod

e tr

ansl

atio

n)

CONCPROGRAM

Bounded Concurrent Programs

main()

T0 TNTN-1T1 …

• no loops• no function calls• Control flow only forward• one procedure for each thread

Round Robin Schedule

main()

T0 TNTN-1T1

Lazy-Cseq sequentialization:•captures all bounded Round-Robin computations for a given bound•error manifest themselves within very few rounds [ Musuvathi, Qadeer – PLDI’07 ]

round 1

round 2

round k

round 3

Schema Overview

…main()T0 T1

TN

…F0 F1

FN main()

bounded concurrent program

“equivalent”

Sequential program with non determinism

Sequentialization(code-to-code translation)

Sequentialized functions Driver

translatestranslates

translatestranslates

translatestranslates

Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; ... pcN=0;local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei)

Fi();}

main driver• a global pc for each thread • thread locals thread global

Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; ... pcN=0;local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei)

Fi();}

main driverfor each round

for each thread Ti

simulate Ti

Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; ... pcN=0;local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei)

Fi();}

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi()

Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; ... pcN=0;local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei)

Fi();}

switch(pci) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi()

... ...

context-switch resum

e m

echanism

Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; ... pcN=0;local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei)

Fi();}

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(1); stmt0;2: CS(2); stmt1;3: CS(3); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi() ...

... ...

Context-switch simulation: #define CS(j) if (*) { pci=j; return; }

Context-switch simulation: #define CS(j) if (*) { pci=j; return; }

Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; ... pcN=0;local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei)

Fi();}

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(1); stmt0;2: CS(2); stmt1;3: CS(3); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi() ...

... ...

Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; pc1=0; ... pcN=0;local0; local1; ... localk;

main() { for (r=0; r<R; r++) for (k=0; k<N; k++) // simulate Tk

Fk();}

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

...

... ...

Formula encoding:

goto statement to formula

add a guard for each crossing control-flow edge

= O(M2) guards

Formula encoding:

goto statement to formula

add a guard for each crossing control-flow edge

= O(M2) guards

pc0=0; ... pcN=0;local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei) Fi();}

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi() ...

... ...

CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; ... pcN=0;local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei) Fi();}

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi() ...

... ...

CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION

pc0=0; ... pcN=0; local0; ... localk;

main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei) Fi();}

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi() ...

CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(1); stmt0;2: CS(2); stmt1;3: CS(3); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi() ...

CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION

#define CS(j) NEW if (j<pci || j>=nextCS) goto j+1; #define CS(j) NEW if (j<pci || j>=nextCS) goto j+1;

pc0=0; ... pcN=0; local0; ... localk;nextCS;main() for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei) nextCS = nondet; assume(nextCS>=pci) Fi(); pci = nextCS;

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(1); stmt0;2: CS(2); stmt1;3: CS(3); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi()

CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION

#define CS(j) NEW if (j<pci || j>=nextCS) goto j+1; #define CS(j) NEW if (j<pci || j>=nextCS) goto j+1;

pc0=0; ... pcN=0; local0; ... localk;nextCS;main() for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei) nextCS = nondet; assume(nextCS>=pci) Fi(); pci = nextCS;

switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;

... case M: goto M;}

1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;

main driver

Fi()

CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION

#define CS(j) NEW if (j<pci || j>=nextCS) goto j+1; #define CS(j) NEW if (j<pci || j>=nextCS) goto j+1;

...

pc0=0; ... pcN=0; local0; ... localk;nextCS;main() for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti

if (activei) nextCS = nondet; assume(nextCS>=pci) Fi(); pci = nextCS;

resuming + context-switch

1: CS(1); stmt1;2: CS(2); stmt2;3: CS(3); stmt3;

EXECUTE

M: CS(M); stmtM;

nextCS

...

skip

pci

...

skip

#define CS(j) NEW if (j<pci || j>=nextCS) goto j+1; #define CS(j) NEW if (j<pci || j>=nextCS) goto j+1;

CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION

Individual Threads Sequentialization

1: CS(1); stmt1;2: CS(2); stmt2;3: CS(3); stmt3;

EXECUTE

M: CS(M); stmtM;

...

...

Formula encoding:

goto statement to formula

add a guard for each crossing control-flow edge

= O(M) guards

Formula encoding:

goto statement to formula

add a guard for each crossing control-flow edge

= O(M) guards

Individual Threads Sequentialization

1: CS(1); stmt1;2: CS(2); stmt2;3: CS(3); stmt3;

EXECUTE

M: CS(M); stmtM;

...

...

#define CS(j) if (j<pci || j>=next_CS) goto pc+1; #define CS(j) if (j<pci || j>=next_CS) goto pc+1;

inject light-weight, non-invasive control code

•no non-determinism•no assignments•no return

inject light-weight, non-invasive control code

•no non-determinism•no assignments•no return

Experiments

Evaluation: bug-hunting SVCOMP’14, Concurrency (UNSAFE instances)

Evaluation: bug-hunting SVCOMP’14, Concurrency (UNSAFE instances)

Remarks

Lazy-CSeq is a new sequentialization targeted to BMC backends• lazy• based on bounded round-robin computations• efficient for bug-hunting• simple to implement (CSeq framework)

Eager vs Lazy• no empirical evidence that lazy is faster • Lazy does not require an implementation of a memory model and handling of error

checks (as opposed to LR sequentialization)

Lazy-CSeq won 2 gold medals in the Concurrency category of the last two editions of the software verification competition SV-COMP

• all verification tasks solved• 30x faster than the best tool with native

concurrency handling

Sequentialization based on

Memory Unwinding

[ Tomasco–Inverso–Fischer–La Torre–Parlato, TACAS’15 ]

T1 T2 T3

Interleaving semantics

rxwx rx wy rxwxrywy ryrx wywxrun

T1 T2 T3

From interleavings to memory unwinding

rxwx rx wy rxwxrywyrun ryrx wy

accesses to shared memory

writes

rxwx rx wy rxwxrywy ryrx wy

wx

⇒ Memory unwinding models thread interaction history by data

MEMORYUNWINDINGwx wy wxwy wywx wy wxwy wy

Memory unwindings as thread interfaces

T1 T2

Assume-guarantee reasoning style decomposition of verification

TASK 1 TASK 2

T1

bounding parameter: # of shared write operations-most concurrency bugs exposed by few interactions [Lu, Park, Seo, Zhou – ASPLOS 2008]

Assume-guarantee reasoning

T1 T2

TASK 1 TASK 2

Thread T2

assumes: MU writes of other threads

guarantees: its own MU writes

Task 1: Task 2:

Thread T1

assumes: MU writes of other threads

guarantees: its own MU writes

Simulating a thread against an MU

y=1 …. a=x+5 ..... x=1 ….

x=2y=1y=1

x=1

x=3y=8

z=2

pcmc

Global variable write – check against current MU entry

void write(uint t, uint v, int val) { mc[t] = th_nxt_wr[t][mc[t]]; assume(var[mc[t]] == v); assume(val[mc[t]] == val);}

pc

void write(uint t, uint v, int val) { mc[t] = th_nxt_wr[t][mc[t]];

}mc

Simulating a thread against an MU

y=1 …. a=x+5 ..... x=1 ….

x=2y=1y=1

x=1

x=3y=8

z=2

pc

mc

Local statement (no global variables) – update pc, keep mc

pc

Simulating a thread against an MU

y=1 …. a=x+5 ..... x=1 ….

x=2y=1y=1

x=1

x=3y=8

z=2

pcmc

Global variable read – “pick write position in range”

int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from < mc[t]) assume(var_nxt_wr[r_from] > mc[t]); else { if (r_from < var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}

non-deterministic assignment

non-deterministic assignment

Simulating a thread against an MU

y=1 …. a=x+5 ..... x=1 ….

x=2y=1y=1

x=1

x=3y=8

z=2

pcmc

Global variable read – “pick write position in range”

int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from < mc[t]) assume(var_nxt_wr[r_from] > mc[t]); else { if (r_from < var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}

...before next write of thread...

...before next write of thread...

Check in unwinding...

Check in unwinding...

Simulating a thread against an MU

y=1 …. a=x+5 ..... x=1 ….

x=2y=1y=1

x=1

x=3y=8

z=2

pcmc

Global variable read – “pick write position in range”

int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from< mc[t]) assume(var_nxt_wr[nxt_mc] > mc[t]); else { if (r_from< var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}

...before next write of thread...

...before next write of thread...

...right variable......right variable...

Simulating a thread against an MU

y=1 …. a=x+5 ..... x=1 ….

x=2y=1y=1

x=1

x=3y=8

z=2

pcmc

Global variable read – “pick write position in range”

int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from < mc[t]) assume(var_nxt_wr[r_from] > mc[t]); else { if (r_from < var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}

...right variable......right variable......next write of variable in

future...

...next write of variable in

future...

Simulating a thread against an MU

y=1 …. a=x+5 ..... x=1 ….

x=2y=1y=1

x=1

x=3y=8

z=2

pcmc

Global variable read – “pick write position in range”

int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from < mc[t]) assume(var_nxt_wr[r_from] > mc[t]); else { if (r_from < var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}

...before next write of

variable...

...before next write of

variable...

...else update mc.

...else update mc.

mc

Return value.Return value.

pc

Simulating a thread against an MU

y=1 …. a=x+5 ..... x=1 ….

assert(b);

x=2y=1y=1

x=1

x=3y=8

z=2

Handling errors – “flag that an error occurs”

pc

translates to if (! b) _error = 1;

(computation might not be feasible)

translates to if (! b) _error = 1;

(computation might not be feasible)

Our apporach: sequentialization by MU

…T0 T1

TN

…F0 F1

FN main()

Concurrent program

Sequential program

Sequentialization(code-to-code translation)

Simulation functions

translates

translates

translates

Main Sequential Program

void main(void){

memory_init(W);

ct := mem_thread_create();

F0();

thread_terminate(ct);

all_thread_simulated();

assert(_error == 0) }

main

instantiate memory unwinding

error check

all writes of the thread are executed

simulation starts from main thread

all writes of the MU are executed

register main thread

Thread creationThread creation

CSeq framework + Evaluation

CSeq framework

sequentialnon-deterministic

C program

P'

concurrentC program

P

sequentialanalysis

toolcode-to-codetranslation

is a framework that simplifies code-to-code translations- for C programs + Pthread- comprises several code-to-code translation modules- supports several sequential analysis back-end tools

Internal modules-unrolling-function inlining-counter-example

Sequentialisations-Memory-Unwinding-Lazy-CSeq-LR-CSeq

testingKlee

bounded model-checking-BLITZ-CBMC-ESBMC-LLBM

abstraction-CPA-checker-Frama-C-SATABS

http://users.ecs.soton.ac.uk/gp4/cseq/

Evaluation [ SV–COMP 2014 2015 ]

SV-COMP software verification competition @ TACASConcurrency category benchmarks: 1003 files •Lazy-CSeq won 2 GOLD medals (2014, 2015)•MU-CSeq won 2 SILVER medals (2014, 2015)

Ongoing & Future Work

Sequentializations based on BMC and Abstraction for programs communication through FIFO channels

• weak memory models (TSO) [Atig, Bouajjani, – CAV 2011 ]• message passing interface (MPI)

Symbolic partial order reduction based on interleaving partitioning (CEGAR) combining:

• abstraction interpretation to discard “safe” partitions• bug-finding for partitions with “potential” bugs with few interleavings• swarm verification (on clusters)• implemented using Lazy- and MU-Cseq sequentialization

Foundational work: Reasoning about programs using graph representations. Analysis based on graph decompositions.

[“The Tree Width of Automata with Auxiliary Storage” Madhusudan, Parlato – POPL 2011 ]for multi-stack pushdown automata communicating either through shared variables or queues)

Ongoing work & Future work

Sequentializations based on BMC and Abstraction for programs communication through FIFO channels

• weak memory models (TSO) [Atig, Bouajjani, – CAV 2011 ]• message passing interface (MPI)

Symbolic partial order reduction based on interleaving partitioning (CEGAR) combining:

• abstraction interpretation to discard “safe” partitions• bug-finding for partitions with “potential” bugs with few interleavings• swarm verification (on clusters)• implemented using Lazy- and MU-Cseq sequentialization

Foundational work: Reasoning about programs using graph representations. Analysis based on graph decompositions.

[“The Tree Width of Automata with Auxiliary Storage” Madhusudan, Parlato – POPL 2011 ]for multi-stack pushdown automata communicating either through shared variables or queues)

Ongoing work & Future work

Thank You

users.ecs.soton.ac.uk/gp4/cseq

Recommended