Automatic verification of parameterized data structures
Automatic Verification of Parameterized Data Structures
Jyotirmoy V. Deshmukh, E. Allen Emerson and Prateek Gupta
The University of Texas at Austin
The University of Texas at Austin 1
Automatic verification of parameterized data structures
Outline
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
• Efficiency
• Related work and conclusions
The University of Texas at Austin 2
Automatic verification of parameterized data structures
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
• Efficiency
• Related work and conclusions
The University of Texas at Austin 3
Automatic verification of parameterized data structures Motivation
Motivation
• Data structures: basic building blocks of software systems.
• Methods: programs operating on data structures.
• Traditional approach: check correctness up to bounded size.
• Parameterized verification: correctness for arbitrarily large sizes.
• Parameterized verification faces several difficulties!
The University of Texas at Austin 4
Automatic verification of parameterized data structures Motivation
Verifying programs operating on data structures
• Data structures:
– may have arbitrarily large sizes.
– may use pointers that range over arbitrarily large address space.
– may use data values that range over unbounded domains.
• Parameterized correctness is generally undecidable.
• Decidable classes of programs face severe combinatorial explosion.
The University of Texas at Austin 5
Automatic verification of parameterized data structures Motivation
Potential Applications
• Verification of data structure libraries in C++, Java.
• File system manipulation routines.
• Memory management algorithms, e.g. garbage collection.
• Algorithms in SoC designs.
The University of Texas at Austin 6
Automatic verification of parameterized data structures
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
• Efficiency
• Related work and conclusions
The University of Texas at Austin 7
Automatic verification of parameterized data structures Preliminaries
Correspondence between a data structure and a graph
ab
d
a
b
d
c
root
c
root(G)null
nullnull
null
null
Data Structure Corresponding Graph
The University of Texas at Austin 8
Automatic verification of parameterized data structures Preliminaries
Problem Definition
• Given :
– Method M : operates on input graph Gi to
produce output graph Go =M (Gi).
– Property ϕ: some predicate on graphs.
• Parameterized correctness :
For any arbitrarily large Gi, determine if: 〈ϕ(Gi)〉M 〈ϕ(Go)〉
The University of Texas at Austin 9
Automatic verification of parameterized data structures Preliminaries
Review: Tree automata
The University of Texas at Austin 10
Automatic verification of parameterized data structures Preliminaries
Example: Nondeterministic tree automaton
for reachability (EF b)
Σ = {a,b,c,d} Q = {q,q f } q0 = q Φ : {c(q) = red(1),c(q f ) = green(2)}
q
a,c,d a,c,d b
1
0
0 0
1
1
0 1
01
a,b,c,d
a,b,c,d
Transition diagram for δ
qf
The University of Texas at Austin 11
Automatic verification of parameterized data structures Preliminaries
Example: accepting run of A reach (EF b)
a
a
a b
c
d
a
The University of Texas at Austin 12
Automatic verification of parameterized data structures Preliminaries
Example: accepting run of A reach (EF b)
a
a
a b
c
d
a
q
The University of Texas at Austin 13
Automatic verification of parameterized data structures Preliminaries
Example: accepting run of A reach (EF b)
a
a
a b
c
d
a
q qf
q
The University of Texas at Austin 14
Automatic verification of parameterized data structures Preliminaries
Example: accepting run of A reach (EF b)
a
a
a b
c
d
a
qf
q qf
qqf qf
q
The University of Texas at Austin 15
Automatic verification of parameterized data structures Preliminaries
Example: accepting run of A reach (EF b)
a
a
a b
c
d
a
q
q qf
qf qfqf qf qf
qf
qf
qqf qf
The University of Texas at Austin 16
Automatic verification of parameterized data structures Preliminaries
Definition: Destructive pass
• Pass : Traversal of graph visiting each node at most once.
• Destructive update : Modification of the input graph.
e.g. Adding a node, Deleting a node, Changing a link, Changing a value, etc.
• Destructive pass : pass that performs at least one destructive update.
The University of Texas at Austin 17
Automatic verification of parameterized data structures Preliminaries
Stipulations
• Methods:
– must terminate.
– should perform only a bounded number of
destructive passes over the graph.
– should be iterative (no recursion).
• Domain of data values should be finite.
• Input graphs have varying, but bounded branching.
The University of Texas at Austin 18
Automatic verification of parameterized data structures Preliminaries
Example methods
• Insertion/Deletion of nodes in linked lists (linear/circular),
• Insertion/Deletion of nodes in k-ary trees,
• Iterative modification of nodes in general graphs,
• Reversal of linked lists,
• Swapping nodes within a bounded distance.
The University of Texas at Austin 19
Automatic verification of parameterized data structures Preliminaries
Property specification
• Properties specified as non-deterministic tree automata.
• Aϕ and A¬ϕ called property automata.
• Examples include: Acyclicity, Sortedness, Reachability,
Treeness, Listness, etc.
The University of Texas at Austin 20
Automatic verification of parameterized data structures Preliminaries
Example: checking acyclicity in a binary graph
A cy = {Σ,{q,q f },q,δ,{c(q) = red(1),c(q f ) = green(2)}}
1
1
10
0
0
qf
∀σ null
q
∀σ
6= null
Transition diagram for Acy
The University of Texas at Austin 21
Automatic verification of parameterized data structures
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
• Efficiency
• Related work and conclusions
The University of Texas at Austin 22
Automatic verification of parameterized data structures Solution strategy
Modeling the method
• Method M modeled using tree automaton AM .
• (Gi,Go) represented as composite graph Gc.
• AM accepts all graphs Gc that represent valid I/O behavior of M .
The University of Texas at Austin 23
Automatic verification of parameterized data structures Solution strategy
Input Graph: Gi
Added
Deleted
Original node/edge
node/edge
node/edge
n1 n2 n3
(d2, d′
2)(d1, d
′
1) (d3, d
′
3)
root
The University of Texas at Austin 24
Automatic verification of parameterized data structures Solution strategy
Output Graph: Go
Added
Deleted
Original node/edge
node/edge
node/edge
n1 n3
nnew
(d1, d′
1) (d3, d
′
3)
root
(dnew,d
′
new)
The University of Texas at Austin 25
Automatic verification of parameterized data structures Solution strategy
Composite Graph
Added
Deleted
Original node/edge
node/edge
node/edge
n1 n2 n3
nnew
(d2, d′
2)(d1, d
′
1) (d3, d
′
3)
root
(dnew,d
′
new)
The University of Texas at Austin 26
Automatic verification of parameterized data structures Solution strategy
Composite automaton
• Given : Aϕ, A¬ϕ and AM .
• Construct : Composite automaton A c.
• A c: (synchronous) product of Aϕ, AM and A¬ϕ.
• A c accepts Gc, iff:
– AM accepts Gc,
– Aϕ accepts Gi (input part), and
– A¬ϕ accepts Go (output part).
The University of Texas at Austin 27
Automatic verification of parameterized data structures Solution strategy
Reduction to language emptiness
• A c accepts exactly those graphs that witness a failure of M .
• M is correct iff language accepted by A c is empty.
• A c is empty implies parameterized correctness of M .
The University of Texas at Austin 28
Automatic verification of parameterized data structures
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
• Efficiency
• Related work and conclusions
The University of Texas at Austin 29
Automatic verification of parameterized data structures Programming language
Node of a data structure
Data Fields (data)
next1
next2
nextk
The University of Texas at Austin 30
Automatic verification of parameterized data structures Programming language
Programming language
• Methods equipped with an iterator called “cursor”.
• Bounded window (w): set of nodes within fixed distance from cursor.
• Auxiliary pointers: denote positions within w, relative to cursor.
• Types of statements: Assignment, Conditional and Loop statements.
The University of Texas at Austin 31
Automatic verification of parameterized data structures Programming language
Example method: Insertion in a singly linked list
method InsertNode (value, newValue){
1: cursor := head;
2: while (cursor != null) {
[ncursor := cursor->next]
3: if (cursor->data == value) {
4: cursor->next := new node {
data := newValue;
next := ncursor;};
5: break; }
6: cursor := ncursor when true; } }
The University of Texas at Austin 32
Automatic verification of parameterized data structures
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
• Efficiency
• Related work and conclusions
The University of Texas at Austin 33
Automatic verification of parameterized data structures Translation
How does AM emulate M ?
While operating on composite graph Gc = (Gi,Go), AM :
• reads a new node n = (ni,no),
• changes state to mimic atomic updates to ni,
• checks if updated node matches no, and
• if yes, moves to next node.
The University of Texas at Austin 34
Automatic verification of parameterized data structures Translation
From M to AM : I
• AM starts in state q0 and reads node (ni,no).
• State of AM encodes updated value of ni.
• Statements that do not alter cursor position map to ε-moves.
e.g. conditionals, loop body, assignments (except to cursor)
The University of Texas at Austin 35
Automatic verification of parameterized data structures Translation
From M to AM : II
• For assignments that alter cursor position:
– check if current state matches no,
– if yes, read new node,
– if no, transition to reject state.
• Transition to accept state after last statement in M .
• Add self-loops to reject and accept states.
The University of Texas at Austin 36
Automatic verification of parameterized data structures Translation
Example: Compilation of a while statement
Reject!States encoding action of stmt
Exit loop;ψ is false
Enter loop; ψ is true
Skip loop, ψ is false
Read NextNode
while (ψ) {
stmt;
update statement; }
loop body;l.b. acting on ni
does not match no!
(l.b.)
ψ holds true
States encoding actionof loop body
The University of Texas at Austin 37
Automatic verification of parameterized data structures
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
– Examples
• Efficiency
• Related work and conclusions
The University of Texas at Austin 38
Automatic verification of parameterized data structures
Example method: Insertion in a singly linked list
method InsertNode (value, newValue){
1: cursor := head;
2: while (cursor != null) {
[ncursor := cursor->next]
3: if (cursor->data == value) {
4: cursor->next := new node {
data := newValue;
next := ncursor;};
5: break; }
6: cursor := ncursor when true; } }
The University of Texas at Austin 39
Automatic verification of parameterized data structures
Example method automaton for insertNode
qacc
qrej
q0
nodeQ4
Qτ4 Q¬τ
4
Q6
Q¬ψ6 Q
ψ6
Q1
Q2
Qψ2 Q
¬ψ2
Q3
Qφ3 R¬τ Rτ R = Q
¬φ3
The University of Texas at Austin 40
Automatic verification of parameterized data structures
Example: An incorrect method
method sampleMethod {
cursor->next := cursor;
cursor->data := 10; }
Does this method preserve acyclicity?
The University of Texas at Austin 41
Automatic verification of parameterized data structures
Constructing the composite automaton
c = 1c = 2
β
¬β
β
¬β
qrej
q
qf
q0
qaccqf
q
AM Aϕ A¬ϕ
β = (n 6= null)
Method automaton Property automata
Q1
conform ¬conform
The University of Texas at Austin 42
Automatic verification of parameterized data structures
Composite automaton is non-empty!
c = 1c = 2
null
n(d, d’)
Q1
β
¬β
β
¬β
qrej
q
qf
q0
qaccqf
q
AM Aϕ A¬ϕ
β = (n 6= null)
Method automaton Property automata Witness to
nonemptiness
conform ¬conform
The University of Texas at Austin 43
Automatic verification of parameterized data structures
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
• Efficiency
• Related work and conclusions
The University of Texas at Austin 44
Automatic verification of parameterized data structures Efficiency
Efficiency
• A c: linear in |AM |, |Aϕ| and |A¬ϕ|.
– Size of AM : O(|M |).
– AM , Aϕ, A¬ϕ have small, fixed number of colors in parity condition.
• Non-emptiness: polynomial in |A c|.
• Overall complexity: polynomial in size of M and property automata.
The University of Texas at Austin 45
Automatic verification of parameterized data structures
• Motivation
• Preliminaries
• Solution strategy
• Programming language
• Compiling into automata
• Efficiency
• Related work and conclusions
The University of Texas at Austin 46
Automatic verification of parameterized data structures Related work
Related work: I
• Pointer Assertion Logic Engine : [Møller, Schwartzbach, 2001]
– More general (uses MSOL), but complexity is non-elementary.
– Requires human ingenuity in providing loop invariants.
• Separation logic : [O’Hearn, Reynolds, Yang, 2001]
– Deductive system with proof rules.
– Decidable fragment treats only linked lists.
The University of Texas at Austin 47
Automatic verification of parameterized data structures Related work
Related Work: II
• Shape analysis : [Sagiv, Reps, Wilhelm, 1999]
– Shape invariants represented using 3-valued logic.
– Broad scope, but inexact solutions.
• Transducer-based approach :[Bouajjani et al, 2005]
– Abstraction refinement based approach.
– Limited to single successor data structures.
The University of Texas at Austin 48
Automatic verification of parameterized data structures
Conclusions
• Efficient algorithmic technique for verification of
parameterized data structures.
• Reasoning about a large class of methods, examples include:
Adding, deleting, inserting nodes in linked lists, binary search trees,
swapping nodes within a bounded distance, reversing lists, etc.
• Properties such as: acyclicity, reachability, sortedness,
treeness, listness, sharing etc.
• Complexity: polynomial in size of method and property specifications.
The University of Texas at Austin 49
Automatic verification of parameterized data structures
Thank You!
The University of Texas at Austin 50
Automatic verification of parameterized data structures
Tree automata
A (parity) tree automaton A has the form: (Σ,Q,δ,q0,Φ), where:
• Σ is the input alphabet (nodes of the graph),
• Q is the finite non-empty set of states,
• δ : Q×Σ → 2Qkis the non-deterministic transition relation,
• q0 is the initial state, and
• Φ is the parity acceptance condition.
The University of Texas at Austin 51
Automatic verification of parameterized data structures
Run of a tree automaton A
• Run: Annotation of input tree with states of A .
• Accepting run: Run in which acceptance condition is true for all paths.
• A accepts tree T if there is some accepting run on T .
• Notion of run can be generalized to general graphs.
The University of Texas at Austin 52
Automatic verification of parameterized data structures
Parity acceptance condition
• States colored with colors {c0, . . . ,cm}.
• π is some finite/infinite sequence of states.
• π satisfies parity condition iff: maximal index of color appearing
infinitely often is even .
• Remark: Our technique needs 2 colors in most cases.
The University of Texas at Austin 53
Automatic verification of parameterized data structures
Programming language: Syntax
• Assignment statement syntax:
– cursor->data := d; (Modify data value)
– cursor->next := ptr; (Redirect an edge)
– cursor := ptr; (Change cursor location)
– cursor := new node{data:=d;next1:=null;. . .}; (Add new node)
– cursor->next := new node { . . . }; (Add new node after cursor)
• Conditional statements:
– standard if-then-else construct
– test condition: data comparison, pointer comparison (within the window)
The University of Texas at Austin 54
Automatic verification of parameterized data structures
Loop statements
while (ψ) {
loop body;
update statement; }
• Used for iterating through the data structure.
• Nesting of loops not permitted.
• cursor cannot be changed inside loop body.
• Update statement used to change cursor position.
The University of Texas at Austin 55