Introduction Contract-Driven Data Structure Repair Structure Generation with Dynamic Programming Conclusion and Future Work
Contract-Driven Data Structure Repair: A Novel Approach for Error Recovery
Razieh Nokhbeh Zaeem, Ph.D.
The University of Texas at Austin
January 4, 2016
Khajeh Nasir Toosi University of Technology (KNTU)
Joint work with Prof. Sarfraz Khurshid
Razieh Nokhbeh Zaeem, Ph.D. Contract-Driven Data Structure Repair 1/ 53
Outline
1 Introduction
   Motivation
   Problem Definition and Approach
   Related Work
2 Contract-Driven Data Structure Repair
   Example
   Background
   History-Aware Data Structure Repair
   Repair Abstractions
3 Structure Generation with Dynamic Programming
   Problem Definition and Approach
   Example
   Test Input Generation Using Dynamic Programming
4 Conclusion and Future Work
   Ideas on Using Dynamic Programming for Repair
   Conclusion
   Contributions
Motivation
Software reliability is important
A lot of research is dedicated to before-deployment activities
Software failure is still frequent and costly
Examples of software failures
Ariane-5 rocket crashed: overflow exception (left)
Mars Polar Lander crashed: premature shutdown (center)
USS Yorktown was dead in the water: division by zero (right)
Photos: ESA/CNES Photos: JPL/NASA Photo: navsource.org
Repair Versus Halt-on-Error
Many systems get deployed with unknown or unfixed bugs
When bugs cause failures, the traditional halt-on-error approach
May fix the bug permanently
But is not always desirable or feasible
Figure: Traditional approach to errors (left) versus repair approach (right).
Our Repair Approach
Data structure repair is continuing program operation
By fixing the effect of bugs
In the program state (i.e., data structures)
On-the-fly
Our approach: contract-driven data structure repair
Contracts are formal interface specifications for software components including
Abstract data types and invariants
Pre-conditions and post-conditions
The idea is to transmute contracts into efficient implementations
Related Work: Constraint-Based Repair
Constraint-based repair uses data structure invariants (aka repOK)
Written using first-order logic [DemskyRinardOOPSLA03] or Java assertions [Khurshid+SPIN05, Elkarablieh+ASE07]
Does not necessitate writing dedicated repair routines as preceding repair frameworks did
But data structure constraints are too weak for error recovery
[Diagram, three panels. Input: root 3, size = 3, with 3.right = 5 and 5.left = 4. Faulty output of remove(5): size = 2, same nodes, plus a back edge 4.right = 3. Constraint-based repair result: identical to the input (size = 3, element 5 still present).]
Figure: Constraint-based repair is limited to data structure invariants.
Example
Binary search tree
    class BinarySearchTree {
        Node root;
        int size;

        boolean remove(int x) { ... }

        class Node {
            Node left, right;
            int element;
        }
    }
Listing 1: A binary search tree declaration in Java.
Invariants include acyclicity, search property, correct size, etc.
remove method post-conditions include correct removal and correct remove result
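The invariants listed above can be written as an executable Java predicate. The following is a minimal sketch rather than the code used in the talk: the name repOK follows the convention the slides use, while the class name BstInvariantSketch, the Frame helper, and the worklist traversal are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Map;

class BstInvariantSketch {
    static class Node {
        Node left, right;
        int element;
        Node(int element) { this.element = element; }
    }

    // Hypothetical helper: a node plus the open interval (lo, hi) its element must lie in.
    private static final class Frame {
        final Node node;
        final long lo, hi;
        Frame(Node node, long lo, long hi) { this.node = node; this.lo = lo; this.hi = hi; }
    }

    Node root;
    int size;

    /** Checks acyclicity, the search property, and the size field. */
    boolean repOK() {
        Map<Node, Boolean> visited = new IdentityHashMap<>();
        Deque<Frame> work = new ArrayDeque<>();
        if (root != null) work.push(new Frame(root, Long.MIN_VALUE, Long.MAX_VALUE));
        while (!work.isEmpty()) {
            Frame f = work.pop();
            Node n = f.node;
            // A node reached twice means a cycle or a shared subtree.
            if (visited.put(n, Boolean.TRUE) != null) return false;
            // Search property: the element must lie strictly inside its interval.
            if (n.element <= f.lo || n.element >= f.hi) return false;
            if (n.left != null) work.push(new Frame(n.left, f.lo, n.element));
            if (n.right != null) work.push(new Frame(n.right, n.element, f.hi));
        }
        // The size field must match the number of reachable nodes.
        return visited.size() == size;
    }
}
```

On the tree 3, 5, 4 with size = 3 the predicate holds; adding a back edge from 4 to 3 makes it fail, because node 3 is reached a second time.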
Example Bug
left.right = parent (bug) instead of parent.right = left
Invariants and/or post-conditions might be violated
[Diagram, three panels. Input: root 3, size = 3, with 3.right = 5 and 5.left = 4. Expected output of remove(5): root 3, size = 2, with 3.right = 4. Faulty output of remove(5): size = 2, 3.right = 5 and 5.left = 4 are still in place, and 4.right points back to 3.]
Figure: Example bug manifested as a faulty output.
Our Previous Work: Contract-Based Data Structure Repair Using Alloy
Contract-Based Data Structure Repair Using Alloy
A Master’s thesis [ZaeemKhurshidABZ10, ZaeemKhurshidECOOP10, ZaeemMS10]
Introduced the basic idea of contract-based data structure repair
Used Alloy tool-set
Example Alloy constraint
all n: t.root.*(left+right) | n !in n.^(left+right) // directed acyclicity
Tarmeem
Necessary background
Formal Definition of Repair
Definition: Let φ be a method post-condition that relates pre- and post-states such that φ(r, t) if and only if pre-state r and post-state t satisfy the post-condition. Given a valid pre-state u and an invalid post-state s (i.e., ¬φ(u, s)), mutate s into a state s′ such that φ(u, s′).
Definition: Graph edit distance, defined as the minimum number of edge/node additions/deletions to change a graph to another, is used to measure the similarity between two states.
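The edit-distance definition above can be illustrated with a small sketch. It makes a simplifying assumption that is not in the slides: nodes keep fixed identities across the two states, so each state is just a set of labeled edges and the distance reduces to the size of the symmetric difference. Node additions and deletions are not counted separately, and the string encoding of edges and the name EditDistanceSketch are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class EditDistanceSketch {
    /**
     * Counts edges that appear in exactly one of the two states.
     * Assumes fixed node identities; edges are encoded as strings such as "N0.right=N1".
     */
    static int editDistance(Set<String> a, Set<String> b) {
        Set<String> onlyInA = new HashSet<>(a);
        onlyInA.removeAll(b);              // edges to delete
        Set<String> onlyInB = new HashSet<>(b);
        onlyInB.removeAll(a);              // edges to add
        return onlyInA.size() + onlyInB.size();
    }
}
```

A repaired state that matches the expected state exactly has distance 0 under this measure.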
Background on Alloy and Kodkod
Alloy is a relational first-order logic language
Models everything as a relation
LB(R) ⊆ instance(R) ⊆ UB(R)
Alloy Analyzer/Kodkod check Alloy specifications
Translate the model to a satisfiability problem
Use scope-bounded SAT solving
Relaxation: let the SAT solver decide about the values
E.g., relax the dotted edge
LB(right) = {(N0, N1)}
UB(right) = {(N0, N1), (N2, N0), (N2, N1), (N2, N2)}
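[Diagram, left panel (faulty output of remove(5)): T0.root = N0 and T0.size = 2, with nodes N0 : 3, N1 : 5, N2 : 4 and edges labeled l (left) and r (right); the relaxed edge is the right field of N2. Right panel: the relational representation listed below.]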
root = {(T0, N0)} size = {(T0, 2)}
right = {(N0, N1), (N2, N0)} left = {(N1, N2)}
element = {(N0, 3), (N1, 5), (N2, 4)}
Figure: Alloy models.
Basic Contract-Based Repair (Implemented in Tarmeem)
Basic repair method is oblivious to the current faulty post-state
Holds the pre-state constant, relaxes everything in the post-state
root = {(T0, ?)}
size = {(T0, ?)}
right = {(N0, ?), (N1, ?), (N2, ?)}
left = {(N0, ?), (N1, ?), (N2, ?)}
element = {(N0, ?), (N1, ?), (N2, ?)}
Displayed for visualization: ? can be anything
Tarmeem adds heuristics
[Diagram, three panels. Faulty output of remove(5): root 3, size = 2, with 3.right = 5, 5.left = 4, and a back edge 4.right = 3. Expected output: root 3, size = 2, with 3.right = 4. Basic method repair result: root 4, size = 2, with left child 3.]
Figure: Basic contract-based repair result.
Challenges for Repair
Repair should be
Efficient
When repairing an error
When nothing goes wrong
Scalable
In enforcing contracts, locating and fixing the error
Effective
I.e., introduce little perturbation
Usable
I.e., provide logs and feedback
Tarmeem was neither efficient nor scalable
It was independent of the program's dynamic execution
Each call to SAT acted independently
History-Aware Contract-Based Repair
The idea is to use [Zaeem+TACAS12]
1 Dynamic program history of field writes and reads (through barriers)
A barrier is a code sequence that does additional work while writing to or reading from the heap
It is widely supported in languages with automatic memory management, like Java
2 Repair history (through the unsatisfiable cores that SAT solvers return)
An unsat core includes the contradicting parts of the contract and state
It is sound and irreducible

    if (!assertContracts()) {
        relaxSAT(writeBarrierLog);
        if (!assertContracts()) {
            relaxSAT(writeBarrierLog, readBarrierLog);
            if (!assertContracts()) {
                relaxSAT(unsatCoreFields);
                if (!assertContracts()) {
                    reportModelInconsistency();
                }
            }
        }
    }
Listing 2: History-aware contract-based repair.
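To make the barrier logs in Listing 2 concrete, here is a minimal sketch of how field accesses might be recorded. It is an illustration under stated assumptions, not Cobbler's implementation: barriers are emulated with plain accessor methods instead of runtime support, and the names BarrierLogSketch, node, and fieldsToRelax are hypothetical. Only writeBarrierLog and readBarrierLog are taken from the listing.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class BarrierLogSketch {
    final Set<String> writeBarrierLog = new LinkedHashSet<>();
    final Set<String> readBarrierLog = new LinkedHashSet<>();

    /** A node whose right field is only reachable through logging accessors. */
    class Node {
        final String id;
        private Node right;
        Node(String id) { this.id = id; }
        // Read barrier: record the field before returning its value.
        Node getRight() { readBarrierLog.add(id + ".right"); return right; }
        // Write barrier: record the field before updating it.
        void setRight(Node value) { writeBarrierLog.add(id + ".right"); right = value; }
    }

    Node node(String id) { return new Node(id); }

    /** Fields to relax at a given stage: 1 = written fields only, 2 = written and read fields. */
    Set<String> fieldsToRelax(int stage) {
        Set<String> fields = new LinkedHashSet<>(writeBarrierLog);
        if (stage >= 2) fields.addAll(readBarrierLog);
        return fields;
    }
}
```

Starting with the written fields keeps the first search small; the read fields are added only if that first attempt does not satisfy the contracts.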
History-Aware Repair Example
[Diagram, four panels. (a) Input, which is also the constraint-based repair result: root 3, btSize = 3, with 3.right = 5 and 5.left = 4. (b) Expected output of remove(5), which is also the history-aware contract-based repair result: root 3, btSize = 2, with 3.right = 4. (c) Faulty output of remove(5): btSize = 2, with 3.right = 5, 5.left = 4, and a back edge 4.right = 3. (d) Contract-based repair without history: root 4, btSize = 2, with left child 3.]
Figure: History-aware repair result. Dotted and dashed lines indicate written and read fields respectively.
Experimental Evaluation of Cobbler
Cobbler implements history-aware repair for Java programs
Subject programs for experiments:
Singly-linked list
ANTLR, from DaCapo [Blackburn+06] (BaseTree data structure)
A tool to build recognizers, interpreters, compilers, and translators from grammars
Kodkod (Red-black tree data structure)
A SAT-based constraint solver
The back-end of Alloy Analyzer
Cobbler found and repaired a real bug in ANTLR
Experimental Results for Cobbler I
[Chart: Singly-Linked List Repair Time (s), logarithmic scale with axis ticks from 0.0001 to 10. X-axis: number of nodes (10, 50, 200) for each of Error 1 through Error 7, with one bar per tool (C, T, J, P). Each bar is split into Logging, Check, and Repair time.]
Figure: Performance: repairing singly-linked lists with Cobbler (C).
Subject tools
1 Tarmeem (T) [ZaeemKhurshidABZ10, ZaeemKhurshidECOOP10]
2 Enhanced Juzi (J) [ElkarabliehKhurshidICSE08, Elkarablieh+ASE07]
3 PBnJ (P) [Samimi+ECOOP10]
Time measurements
1 Logging time: the overhead due to logging read and write actions
2 Check time: the time to detect a contract violation
3 Repair time: the time to search and find a repaired data structure
Experimental Results for Cobbler II
[Chart: the singly-linked list repair time chart from Experimental Results for Cobbler I, shown again.]
Figure: Performance: repairing singly-linked lists with Cobbler (C).
Cobbler performs best for 5 out of 7 types of errors
Error 2 skips an update to the size field (not read or written)
Error 4 introduces a cycle; Juzi is tailored for such errors
Cobbler’s overhead on an error-free execution includes
Logging time, which is negligible
Check time, can we improve it?
Experimental Results for Cobbler III
[Chart: Singly-Linked List Repair Accuracy. Y-axis: similarity in percent (0 to 100). X-axis: number of nodes (10, 50, 200) for each of Error 1 through Error 7, with one bar per tool (C, T, J, P).]
Figure: Accuracy: repairing singly-linked lists with Cobbler.
Cobbler, except for one case, always produces exactly the same output as expected (edit distance = 0, similarity = 100%)
Edit distance is used solely for evaluation and not for optimization
A Bug in ANTLR v3.2
Public addChild Method of BaseTree class
The input to this method is a tree
A check is missing to avoid adding a tree to itself as a child
Violates acyclicity and ascending child indexes
Cobbler finds and repairs the erroneous state
Zero edit distance
30 s for a tree of 300 nodes
Contacted ANTLR team
They assume acyclicity in the tree but do not check for that
Yet, since addChild is public, they should perform checks
Cobbler and Repair Challenges
Efficiency
It speeds up over Tarmeem by reducing the size of the search space
Scalability
It scales ten times better than Tarmeem
Effectiveness
Repair results are 90% to 100% similar to the correct structure in more than 90% of the cases
Usability
Barrier logs can be used as repair reports
Cobbler still does not scale to real applications
Repair overhead should be decreased
Use repair abstractions
Check overhead is unacceptable
Translate Alloy checks to Java and check through JVM, not SAT
Repair Abstractions
The insight is that once an error is repaired, it might recur
The idea [Zaeem+RV13]
Abstract out repair actions
Reuse them as possible repair action candidates in the future
Malik’s PhD proposal [MalikPhD13] introduced the basic idea of repair abstraction
We develop it in the context of repair with a SAT back-end
Repair Abstraction Example I
[Diagram, two panels. Faulty output of remove(5): root 3, size = 2, with 3.right = 5, 5.left = 4, and a back edge 4.right = 3. Repair result: root 3, size = 2, with 3.right = 4.]
Concrete repair actions and their abstractions:
[3].right = [4]     abstracts to   First(in post-state).right = First.Neighbor.Neighbor(in post-state)
[4].right = null    abstracts to   First.Neighbor.Neighbor(in post-state).right = null
Repair Abstraction Example II
Running the same faulty code on a different tree
[Diagram, two panels. Faulty output of remove(7): root 2, size = 4, with 2.left = 1, 2.right = 7, 7.left = 6, 6.left = 5, and a back edge from 6. Repair result: root 2, size = 4, with 2.left = 1, 2.right = 6, and 6.left = 5.]
Reused abstract repair actions and their concretizations:
First(in post-state).right = First.Neighbor.Neighbor(in post-state)   concretizes to   [2].right = [6]
First.Neighbor.Neighbor(in post-state).right = null                    concretizes to   [6].right = null
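The abstract-then-concretize step in the two examples can be sketched in Java. The slides do not define the First and Neighbor vocabulary precisely, so this sketch substitutes a different, simpler encoding and says so plainly: an abstract action names its target and its value by field paths from the root. The class RepairAbstractionSketch and all of its members are hypothetical. As in the slides, paths are resolved against the faulty post-state before any write is applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RepairAbstractionSketch {
    static class Node {
        Node left, right;
        final int element;
        Node(int element) { this.element = element; }
    }

    /** Hypothetical abstract action: (node at targetPath).right = node at valuePath; a null path means null. */
    static class AbstractAction {
        final List<String> targetPath;
        final List<String> valuePath;
        AbstractAction(List<String> targetPath, List<String> valuePath) {
            this.targetPath = targetPath;
            this.valuePath = valuePath;
        }
    }

    /** Follows field names ("left" or "right") from start; returns null if the path leaves the structure. */
    static Node resolve(Node start, List<String> path) {
        Node current = start;
        for (String field : path) {
            if (current == null) return null;
            current = field.equals("left") ? current.left : current.right;
        }
        return current;
    }

    /** The two actions abstracted from the remove(5) repair, expressed as paths from the root. */
    static List<AbstractAction> abstractedFromFirstRepair() {
        List<String> grandchild = Arrays.asList("right", "left");
        return Arrays.asList(
            new AbstractAction(Collections.<String>emptyList(), grandchild),
            new AbstractAction(grandchild, null));
    }

    /** Concretizes every path against the faulty post-state first, then applies the writes. */
    static void concretizeAndApply(Node root, List<AbstractAction> actions) {
        List<Node[]> writes = new ArrayList<>();
        for (AbstractAction a : actions) {
            Node target = resolve(root, a.targetPath);
            Node value = a.valuePath == null ? null : resolve(root, a.valuePath);
            writes.add(new Node[] {target, value});
        }
        for (Node[] w : writes) {
            if (w[0] != null) w[0].right = w[1];
        }
    }
}
```

Applied to the faulty remove(7) tree, the same two abstract actions concretize to [2].right = [6] and [6].right = null. A real framework would still re-check the contracts after applying them.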
DREAM
DREAM (Data structure Repair using Efficient Abstraction Methods)
A generic framework that piggybacks on different repair frameworks
1 DREAM concretizes and applies abstracted repair actions, repaired?
2 If not repaired, DREAM calls the underlying repair framework
3 DREAM abstracts out the concrete repair actions the underlying repair framework took for future
Java Program
DREAM
Underlying Repair Framework
Java Virtual Machine
Figure: Placement of DREAM.
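The three DREAM steps can be summarized as a control loop. This is a rough sketch under several assumptions: states are treated as immutable values of a generic type S, an abstracted repair action is modeled as a function from state to state, and the underlying framework is modeled as a function that returns such an action. The class DreamLoopSketch and its members are hypothetical names, not DREAM's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

class DreamLoopSketch<S> {
    private final List<UnaryOperator<S>> abstractions = new ArrayList<>();
    private final Predicate<S> contractsHold;
    private final Function<S, UnaryOperator<S>> underlyingRepair;
    int fallbackCalls = 0; // how often the underlying framework had to search

    DreamLoopSketch(Predicate<S> contractsHold, Function<S, UnaryOperator<S>> underlyingRepair) {
        this.contractsHold = contractsHold;
        this.underlyingRepair = underlyingRepair;
    }

    S repair(S faulty) {
        // Step 1: concretize and apply each stored abstraction, then re-check the contracts.
        for (UnaryOperator<S> action : abstractions) {
            S candidate = action.apply(faulty);
            if (contractsHold.test(candidate)) return candidate;
        }
        // Step 2: not repaired, so call the underlying repair framework.
        fallbackCalls++;
        UnaryOperator<S> learned = underlyingRepair.apply(faulty);
        // Step 3: keep the abstracted action for future repairs.
        abstractions.add(learned);
        return learned.apply(faulty);
    }
}
```

The point of the design is amortization: the expensive search in step 2 runs once per kind of error, and later occurrences are handled by step 1.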
Arreh: a Tool for Translating Alloy Checks to Java
DREAM improves repair
Frequent checking of contracts through SAT is time consuming
Arreh translates Alloy checks to Java, checks through JVM not SAT
Based on Minshar [Al NaffouriMS04]
Arreh adds support for pre- and post-conditions
Arreh is an extension to Alloy Analyzer
Parses the model to Alloy Abstract Syntax Tree (AST)
Recursively replaces each operation of AST with a Java method call
Performance improvement of Arreh is two-fold
Executing Java checks is faster than SAT
Translation to Java is a one-time operation for Arreh
Alloy Analyzer/Arreh Snapshot
Figure: A snapshot of Arreh.
Arreh Example
    pred repOK { // directed acyclicity
        all n: t.root'.*(left'+right') | n !in n.^(left'+right') }
Listing 3: Simplified Alloy specification for Binary Search Tree.

    public boolean repOK() {
        for (Node n : method0()) {
            if (!method4(n)) { return false; }
        }
        return true;
    }

    // method0 implements t.root'.*(left'+right')
    public Set<Node> method0() {
        Node x0 = getRoot();
        return getTransitiveClosure(x0);
    }

    // method4 implements n !in n.^(left'+right')
    public boolean method4(Node n) {
        if (getClosure(n).contains(n)) return false;
        return true;
    }
Listing 4: Simplified Arreh translation to Java.
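Listing 4 calls getTransitiveClosure and getClosure without showing their bodies. The sketch below shows one plausible way to write them for the left and right fields; it is an assumption about their behavior based on the Alloy operators they stand for (.* and .^), not Arreh's generated code. The class name ClosureSketch is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

class ClosureSketch {
    static class Node {
        Node left, right;
    }

    /** n.^(left+right): all nodes reachable from n in one or more steps. */
    static Set<Node> getClosure(Node n) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        if (n == null) return seen;
        Deque<Node> work = new ArrayDeque<>();
        work.push(n);
        while (!work.isEmpty()) {
            Node current = work.pop();
            for (Node next : new Node[] {current.left, current.right}) {
                // seen.add returns false for nodes already visited, so cycles terminate.
                if (next != null && seen.add(next)) work.push(next);
            }
        }
        return seen;
    }

    /** n.*(left+right): the closure above plus n itself. */
    static Set<Node> getTransitiveClosure(Node n) {
        Set<Node> result = getClosure(n);
        if (n != null) result.add(n);
        return result;
    }
}
```

With these helpers, a node lies on a cycle exactly when getClosure(n) contains n, which is the condition method4 tests.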
Experimental Evaluation of DREAM
DREAM + Cobbler + Arreh
Same subject errors as Cobbler
Repair scenario: re-use repair on a structure with a different size but similar fault as repaired before
E.g., build repair abstraction with respect to an erroneous list with 10 nodes, and apply it to an erroneous list with 500 nodes
Experimental Results for DREAM + Arreh
[Chart: Singly-Linked List Repair Time (s), logarithmic scale. X-axis: lists of 500 nodes for each of Error 1 through Error 7, with one bar each for C and D. Each bar is split into Check, Repair, and Check after Repair time.]
Figure: Performance: repairing singly-linked lists. Cobbler (C), DREAM (D).
Time measurements
1 DREAM repair time includes concretization and application
2 Check after repair time is the time to check contracts on the result of DREAM repair
Improved Cobbler performance in two ways
1 Cost of initial check (Arreh), also for error-free executions
2 Cost of repair (DREAM)
However, repair abstractions do not always work
They can be too tailored to a specific problem instance, e.g., error 2
Repair Abstractions and Repair Challenges
Efficiency and scalability
DREAM reuses repair actions without searching the space
Amortizes the cost of repair from cases that do require a search
Arreh improves the efficiency and scalability of checking
Effectiveness
Accuracy: DREAM is as accurate as the underlying repair
Exact same result as Cobbler when they both worked
Usability
Summarizes repair actions into intuitive descriptions for debugging faulty code
Test Input Generation and Repair
Testing is the most commonly used methodology to find faults
Web browsers, etc. need structurally complex test inputs
Constraint-based testing is a well-known automation technique
1 Logical constraints define desired inputs
2 Constraints are solved for all small instances
3 Solutions are refined as test inputs
The problems of contract-based repair and constraint-based test input generation are similar
Repair finds a solution to φ(pre, post) when pre is held constant
Input generation enumerates solutions to ρ(pre)
We present a test input generation technique [ZaeemKhurshidFSE12]
We discuss future ideas to adapt it for repair
Our Test Input Generation Approach
We present a constraint-based test input generation technique that
Leverages recursive structures of complex inputs
To enable more natural formulation of constraints
To provide faster generation of inputs and better scalability
Uses dynamic programming
To build larger inputs using smaller ones with the same structure
Dynamic programming is a problem solving methodology designed to exploit common subproblems
[Diagram: a two-node tree and the empty tree (null) are combined under a new root to form a three-node tree.]
Figure: Combining smaller inputs to build larger ones.
Example: Generating HTML Files
Many data structures are inherently recursive
Example: model HTML files as Trees
Recursive predicates
Assuming acyclicity
The framework is aware that there might be cycles

    class BinaryTree {
        BinaryTree left, right;
        int size;

        boolean repOK() {
            int rightsize;
            if (right == null) rightsize = 0;
            else {
                if (!right.repOK()) return false;
                rightsize = right.size;
            }
            ...
            return (size == rightsize + leftsize + 1);
        }
    }
Listing 5: A recursive binary tree in Java.
[Diagram: html has children head and body; head has child link; body has child h1.]
Figure: A tree representation of an HTML input.
Test Input Generation Using Dynamic Programming
User writes constraints as recursive predicates
We break the problem into building smaller similar data structures and solve it efficiently using dynamic programming
Non-conventional application
We present three algorithms for test input generation
1 DP: Core algorithm to use dynamic programming
2 LazyDP: Lazy initialization strategy for efficiency
3 SymboLazyDP: Combination with symbolic execution
Three Algorithms: DP Finding Binary Trees I
(Size ≤ 2)
Iteration 1 (columns: size = 0, size = 1, size = 2)
pool: empty
lastRoundTests: null (size = 0, #0)
thisRoundTests: a single node with two null children (size = 1, #1)
pool: all structures that we have already used to build bigger ones
lastRoundTests: all previously generated structures that we have not yet used
thisRoundTests: all structures that we are generating at this iteration
Three Algorithms: DP Finding Binary Trees II
Iteration 2 (columns: size = 0, size = 1, size = 2)
pool: null (size = 0, #0)
lastRoundTests: a single node with two null children (size = 1, #1)
thisRoundTests: two trees with size = 2 (#2 and #3), each a new root with one single-node child; repOK is called on the new root, with no repOK call on the reused substructure
Non-isomorphism: ≥ 1 structures from lastRoundTests
Pruning: No structure with > 2 nodes (e.g., combining two size = 1 trees would give btSize = 3)
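The pool, lastRoundTests, and thisRoundTests bookkeeping from the two iterations above can be sketched for plain binary tree shapes. This is a simplified illustration, not the DP algorithm from [ZaeemKhurshidFSE12]: it has no data fields, and the repOK call on each candidate is replaced by the size-bound check alone. The class DpTreeGenSketch and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class DpTreeGenSketch {
    static final class Tree {
        final Tree left, right;
        final int size;
        Tree(Tree left, Tree right) {
            this.left = left;
            this.right = right;
            this.size = 1 + sizeOf(left) + sizeOf(right);
        }
    }

    static int sizeOf(Tree t) { return t == null ? 0 : t.size; }

    /** Returns all binary tree shapes with at most maxSize nodes; null stands for the empty tree. */
    static List<Tree> generate(int maxSize) {
        List<Tree> pool = new ArrayList<>();       // already used as building blocks
        List<Tree> lastRound = new ArrayList<>();  // generated but not yet used
        lastRound.add(null);                       // structure #0: the empty tree
        while (!lastRound.isEmpty()) {
            List<Tree> all = new ArrayList<>(pool);
            all.addAll(lastRound);
            int firstNew = pool.size();            // indices >= firstNew come from lastRound
            List<Tree> thisRound = new ArrayList<>();
            for (int i = 0; i < all.size(); i++) {
                for (int j = 0; j < all.size(); j++) {
                    // Non-isomorphism: at least one child must come from lastRound,
                    // otherwise this combination was already built in an earlier round.
                    if (i < firstNew && j < firstNew) continue;
                    Tree l = all.get(i), r = all.get(j);
                    // Pruning: skip candidates that exceed the size bound.
                    if (1 + sizeOf(l) + sizeOf(r) > maxSize) continue;
                    thisRound.add(new Tree(l, r));
                }
            }
            pool = all;
            lastRound = thisRound;
        }
        return pool;
    }
}
```

For a bound of 2 this yields four structures, matching #0 through #3 in the slides. Subtrees are shared between results rather than copied, which is what makes reuse cheap.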
Three Algorithms: LazyDP
Data structures are kept in a concise format to preserve memory
When repOK is called, we need to expand them
Expand them lazily (only when needed)
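[Diagram: structure #3 (size = 2) and null (size = 0, #0) are combined under a new root into a structure with size = 3. The children are left unexpanded; only left.size and right.size are read.]
Concise format of structure #3: 〈(right child =) 1, (left child =) 0, (size =) 2〉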
Figure: Lazy initialization.
Three Algorithms: SymboLazyDP
SymboLazyDP avoids a systematic search for non-recursive (data) fields
Symbolic execution of repOK
Uses symbols instead of values for non-recursive fields
Builds a path condition along the execution path
Uses a constraint solver to solve for non-recursive fields
Saves path conditions (not solutions) with valid substructures
After combining the substructures, solves for the entire candidate
[Diagram: structure #3 with symbolic root value x, a null left child, and a right child with symbolic value y; saved path condition x < y.]
Figure: Binary Tree with symbolic values.
Experimental Evaluation of Dynamic Programming for Test Generation
RQ1: How efficient and scalable are our algorithms, compared to
Korat: an open-source tool for solving (Java) repOK’s
Performs execution-driven pruning and isomorphism breaking
Pex: a symbolic execution tool from Microsoft
We run Pex on (C#) repOK’s (but not full application)
Six small subjects: sorted lists, binary trees, red-black trees, Fibonacci heaps, binary heaps, hash tables
Evaluate for both systematic and random test generation
RQ2: How effective are the generated tests in finding real bugs?
Two web browsers: Google Chrome and Apple Safari
Experimental Results for Test Generation I
[Chart: logarithmic Y-axis with ticks from 0.01 to 1,000.00. X-axis: size of generated tests (10 to 17) for SortedSinglyLinkedList. Series: Korat, DP, LazyDP, SymboLazyDP, Pex.]
Figure: Systematic generation performance: sorted linked lists.
Experimental Results for Test Generation II
Table: Random generation of ten tests with 90 ≤ size ≤ 100. TO represents a time out of 1000 s.
                    Generation Time (s)
Benchmark           Korat     SymboLazyDP   Pex     State Space
Linked List         0.137     0.136         322     10^351
Binary Tree         0.266     0.174         375     10^353
Red-Black Tree      TO        20.256        TO      10^529
Fibonacci Heap      0.114     0.567         TO      10^527
Binary Heap         7.617     2.823         83      10^527
Hash Table          TO        3.947         TO      10^527
Testing Safari and Chrome: Overview
We tested the support for rendering CSS3 3D effects
CSS allows separating presentation from content
Applied our approach to directly test the two browsers
Wrote repOK’s and a test harness and oracle
An instance of an HTML file as a tree
A CSS rule as a linked list of properties
Each property has a linked list of values
Used staged generation: first generate CSS, then HTML
2 declarations inside a CSS block with 5 properties each
8 tags inside an HTML file
Used differential testing: image differencing algorithm
Testing Safari and Chrome: Results
Table: Chrome and Safari test input generation results.
                 Gen. Time (s)
                 CSS       HTML      #Tests
DP               0.140     10.851    3081
LazyDP           0.140     10.850    3081
SymboLazyDP      N/A       1.628     3081
818 out of 3081 tests failed
148 false positives due to the inaccuracy of our test oracle
Manually classified the remaining failing tests
At least three distinct bugs in Chrome
Example Bug Found in Chrome
Figure (screenshots not recoverable): a CSS style sheet and an HTML file using the CSS, rendered in Chrome and in Safari.
Chrome: the second line should be hidden because it is inside ClassName12.
Using Dynamic Programming for Repair
1 Memoized checks (Ditto [ShankarBodikPLDI07]) locate the error
Ditto incrementally checks recursive repOK
We improve Ditto with support for
Cycles
Some pre- and post-conditions, according to the user
2 We use a set of small patches built using SymboLazyDP
We solve for symbolic values using the values at the location of the error
We check if patches repair the data structure using contracts
Example: Ideas on Using Dynamic Programming for Repair
Figure (tree diagrams not recoverable), four panels:
(a) input for remove(7): binary tree with root 2 and btSize = 5, containing 1, 7, 6, and 5; every annotated node has repOK(): true
(b) faulty output of remove(7): root 2 and btSize = 4, still containing 1, 7, 6, and 5; three nodes are annotated repOK(): false, postCondition(): false and one is annotated repOK(): true, postCondition(): true
(c) patches generated with SymboLazyDP: small symbolic structures labeled x#1; x#3, x < y; x#5, x < y < z; x#6, x < z < y
(d) repair result of remove(7): root 2 and btSize = 4, containing 1, 6, and 5
Dynamic Programming and Repair Challenges
Efficiency and effectiveness
Memoized checks locate the error more accurately
This repair technique does not alter correct parts of the state
Usability
Provides an alternative way of describing contracts recursively
Amortizes the overhead of writing and maintaining contracts between test input generation and repair
Conclusion
Contract-driven data structure repair improves our ability to develop reliable programs
For programs with contracts, error recovery comes at a low cost
The same contracts can be used for testing code before deployment
Using existing techniques and our new technique for input generation
Amortizing the cost of writing and maintaining contracts between testing and repair
Contract-driven data structure repair can potentially improve software quality
Publications
References I
[Al NaffouriMS04] Bassel Y. Al-Naffouri. MintEra: A testing environment for Java programs. Master’s thesis, Massachusetts Institute of Technology, 2004.
[Blackburn+06] Stephen M Blackburn, Robin Garner, Chris Hoffmann, Asjad M Khan, Kathryn S McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, J Eliot B Moss, Aashish Phansalkar, Darko Stefanovic, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. The DaCapo Benchmarks: Java Benchmarking Development and Analysis. In Proceedings of Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 190–211, Portland, Oregon, 2006.
[DemskyRinardOOPSLA03] Brian Demsky and Martin Rinard. Automatic detection and repair of errors in data structures. In Proceedings of Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 78–95, Anaheim, California, 2003.
[Elkarablieh+ASE07] Bassem Elkarablieh, Ivan Garcia, Yuk Lai Suen, and Sarfraz Khurshid. Assertion-based repair of complex data structures. In 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 64–73, Atlanta, GA, 2007.
References II
[ElkarabliehKhurshidICSE08] Bassem Elkarablieh and Sarfraz Khurshid. Juzi: A tool for repairing complex data structures. In Proceedings of 30th International Conference on Software Engineering (ICSE), pages 855–858, Leipzig, Germany, May 2008. Research Demo Paper.
[Khurshid+SPIN05] Sarfraz Khurshid, Ivan García, and Yuk Lai Suen. Repairing structurally complex data. In 12th SPIN Workshop on Model Checking of Software, pages 123–138, San Francisco, CA, August 2005.
[MalikPhD13] Muhammad Zubair Malik. Combining data structure repair and program repair. PhD Thesis Proposal, University of Texas at Austin, Austin, TX, 2013.
[Samimi+ECOOP10] Hesam Samimi, Ei Darli Aung, and Todd Millstein. Falling Back on Executable Specifications. In Proceedings of 24th European Conference on Object-Oriented Programming (ECOOP), pages 552–576, Maribor, Slovenia, EU, May 2010.
References III
[ShankarBodikPLDI07] Ajeet Shankar and Rastislav Bodik. DITTO: automatic incrementalization of data structure invariant checks (in Java). In Proceedings of ACM SIGPLAN’07 Conference on Programming Language Design and Implementation (PLDI), pages 310–319, San Diego, California, 2007.
[ZaeemKhurshidABZ10] Razieh Nokhbeh Zaeem and Sarfraz Khurshid. Introducing Specification-Based Data Structure Repair Using Alloy. In Proceedings of International Conference on ASM Alloy B and Z, pages 398–399, Orford, Quebec, Canada, February 2010.
[ZaeemKhurshidECOOP10] Razieh Nokhbeh Zaeem and Sarfraz Khurshid. Contract-Based Data Structure Repair Using Alloy. In Proceedings of 24th European Conference on Object-Oriented Programming (ECOOP), pages 577–598, Maribor, Slovenia, EU, May 2010.
[ZaeemKhurshidFSE12] Razieh Nokhbeh Zaeem and Sarfraz Khurshid. Test input generation using dynamic programming. In Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, number 34, Cary, North Carolina, 2012. 11 pages.
References IV
[ZaeemMS10] Razieh Nokhbeh Zaeem. Contract-based data structure repair using Alloy. Master’s thesis, Department of Electrical and Computer Engineering, University of Texas at Austin, May 2010.
[Zaeem+RV13] Razieh Nokhbeh Zaeem, Muhammad Zubair Malik, and Sarfraz Khurshid. Repair abstractions for more efficient data structure repair. In Fourth International Conference on Runtime Verification, pages 235–250, INRIA Rennes, France, 2013.
[Zaeem+TACAS12] Razieh Nokhbeh Zaeem, Divya Gopinath, Sarfraz Khurshid, and Kathryn S McKinley. History-aware data structure repair using SAT. In Tools and Algorithms for the Construction and Analysis of Systems (TACAS), pages 2–17, Tallinn, Estonia, March 2012.
Supplemental Slides
Outline
5 Supplemental Slides
More Background on Alloy and Kodkod
Each Alloy relation has its instance with respect to two bounds
Lower bound: all tuples that a relation must have
Upper bound: all tuples that a relation may have
LB(R) ⊆ instance(R) ⊆ UB(R)
A lower or upper bound is a function of type R → 2^T
Summary of Four Heuristics of Tarmeem
Table: Comparison of the four repair algorithms of Tarmeem.
Algorithm                   Uses post-state   Uses post-condition to optimize   Uses annotations
Basic approach
Iterative relaxation        √
Error localization          √                 √
Guided error localization   √                 √                                 √
Edit Distance and Similarity
r : repaired data structure
e: expected data structure
We measure the edit distance in set difference operations using the relational representation
\[ dist(e, r) = \sum_{R} \left( |inst_e(R) - inst_r(R)| + |inst_r(R) - inst_e(R)| \right) \]
\[ sim(e, r) = \left( 1 - \frac{dist(e, r)}{\sum_{R} |inst_e(R)|} \right) \times 100 \]
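The two measures above can be computed directly once each structure is given in relational form. The following Java sketch assumes a hypothetical encoding in which a structure is a map from relation name to a set of tuples, and a tuple is a list of atom names; the class and method names are illustrative only.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical encoding: relation name -> set of tuples (lists of atom names).
final class RelationalDistance {

    // |a - b|: number of tuples in a that are not in b.
    static int differenceSize(Set<List<String>> a, Set<List<String>> b) {
        Set<List<String>> diff = new HashSet<>(a);
        diff.removeAll(b);
        return diff.size();
    }

    // dist(e, r): sum over all relations R of both set-difference sizes.
    static int dist(Map<String, Set<List<String>>> e, Map<String, Set<List<String>>> r) {
        Set<String> relations = new HashSet<>(e.keySet());
        relations.addAll(r.keySet());
        int total = 0;
        for (String name : relations) {
            Set<List<String>> instE = e.getOrDefault(name, Set.of());
            Set<List<String>> instR = r.getOrDefault(name, Set.of());
            total += differenceSize(instE, instR) + differenceSize(instR, instE);
        }
        return total;
    }

    // sim(e, r) = (1 - dist(e, r) / sum over R of |inst_e(R)|) * 100.
    // Assumes the expected structure e has at least one tuple.
    static double sim(Map<String, Set<List<String>>> e, Map<String, Set<List<String>>> r) {
        int expectedTuples = 0;
        for (Set<List<String>> inst : e.values()) {
            expectedTuples += inst.size();
        }
        return (1.0 - (double) dist(e, r) / expectedTuples) * 100.0;
    }

    public static void main(String[] args) {
        // Expected: root -> n2, n2.left = n1, n6.left = n5, n2.right = n6.
        Map<String, Set<List<String>>> expected = Map.of(
                "root", Set.of(List.of("t", "n2")),
                "left", Set.of(List.of("n2", "n1"), List.of("n6", "n5")),
                "right", Set.of(List.of("n2", "n6")));
        // Repaired: identical except that n2.right points to n7.
        Map<String, Set<List<String>>> repaired = Map.of(
                "root", Set.of(List.of("t", "n2")),
                "left", Set.of(List.of("n2", "n1"), List.of("n6", "n5")),
                "right", Set.of(List.of("n2", "n7")));
        System.out.println(dist(expected, repaired)); // 2: one tuple missing, one extra
        System.out.println(sim(expected, repaired));  // 50.0: 2 of 4 expected tuples' worth of edits
    }
}
```

Note that one wrong pointer costs two set-difference operations (one tuple removed, one added), so sim can reach zero or go negative when the repaired structure differs widely from the expected one.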
Abstraction
Abstracting a concrete repair action o.f = v has three steps
1 Use BFS (for pre-state and post-state) to find sequences of dereferences to reach o and v
E.g., [3].right = [4] is reachable through root(in post-state).right = root.right.left(in post-state)
2 Find all abstractions
E.g., First(in post-state).right = First.Neighbor.Neighbor(in post-state), among others
3 Build abstract repair actions α.f = β
A pre-defined, extensible repository of abstractions
Pointer-based (Null, First, Leaf, Self, Neighbor, . . . )
Value-based (Offset, Coefficient, . . . )
Figure (diagram not recoverable): faulty output of remove(5), with root → 3 and size = 2; nodes 5 and 4
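Step 1 above can be sketched in Java as a breadth-first search that records, for each node, the shortest sequence of dereferences from root. The Node class, the path-as-string encoding, and the assumed shape of the faulty post-state (3.right = 5 and 5.left = 4) are illustrative assumptions; the slides do not show how access paths are represented.

```java
import java.util.ArrayDeque;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical helper that maps each reachable node to its shortest access path.
final class AccessPaths {
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Breadth-first search from root; the first path that reaches a node is a shortest one.
    static Map<Node, String> shortestPaths(Node root) {
        Map<Node, String> paths = new IdentityHashMap<>();
        if (root == null) return paths;
        Queue<Node> queue = new ArrayDeque<>();
        paths.put(root, "root");
        queue.add(root);
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            String prefix = paths.get(current);
            visit(current.left, prefix + ".left", paths, queue);
            visit(current.right, prefix + ".right", paths, queue);
        }
        return paths;
    }

    // Skips null fields and nodes that were already reached (so cycles terminate).
    private static void visit(Node child, String path, Map<Node, String> paths, Queue<Node> queue) {
        if (child != null && !paths.containsKey(child)) {
            paths.put(child, path);
            queue.add(child);
        }
    }

    public static void main(String[] args) {
        // Assumed faulty post-state of remove(5): 3.right = 5, 5.left = 4.
        Node n3 = new Node(3), n5 = new Node(5), n4 = new Node(4);
        n3.right = n5;
        n5.left = n4;
        Map<Node, String> paths = shortestPaths(n3);
        // The concrete repair [3].right = [4] rewritten with access paths.
        System.out.println(paths.get(n3) + ".right = " + paths.get(n4));
        // prints "root.right = root.right.left"
    }
}
```

Steps 2 and 3 would then replace concrete path segments with entries from the abstraction repository (First, Neighbor, and so on); that mapping is not attempted here.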
Concretization
Concretizing an abstract repair action α.f = β has two steps
1 Traverse the data structure (in the appropriate state) to find concrete values for α and β
E.g., First(in post-state).right = First.Neighbor.Neighbor(in post-state) is concretized to root(in post-state).right = root.right.left(in post-state), among others
2 Perform the concrete repair, i.e., mutate the post-state accordingly
E.g., [2].right = [6]
All possible previous abstractions with all possible concretizations are tried
Figure (diagram not recoverable): faulty output of remove(7), with root → 2 and size = 4; nodes 1, 7, 6, and 5
DP, LazyDP, SymboLazyDP
1 DP: Core algorithm to use dynamic programming
Memoizes internal repOK checks (on smaller structures)
Prunes based on the expected size
Avoids repetitions (non-isomorphic generation)
Stores structures as candidate vectors with pointers
Supports systematic and random generation
2 LazyDP: Lazy initialization strategy for efficiency
Expands structure as needed during repOK execution
3 SymboLazyDP: Combination with symbolic execution
Handles constraints on non-recursive (data) fields
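The idea of building larger structures from memoized smaller ones can be illustrated with a small Java sketch that enumerates all binary tree shapes of a given size. This is a textbook dynamic program chosen for illustration, not the DP algorithm of [ZaeemKhurshidFSE12]: it does not execute repOK, and the class and method names are hypothetical. It does show memoization on smaller sizes, pruning by the expected size, and sharing of sub-structures by reference.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: enumerates binary tree shapes bottom-up with memoization.
final class TreeShapeDp {
    static final class Shape {
        final Shape left, right; // null stands for the empty subtree
        Shape(Shape left, Shape right) {
            this.left = left;
            this.right = right;
        }
    }

    // Memo table: size -> all shapes with exactly that many nodes.
    private final Map<Integer, List<Shape>> memo = new HashMap<>();

    List<Shape> shapes(int size) {
        List<Shape> cached = memo.get(size);
        if (cached != null) return cached; // reuse the result for a smaller structure
        List<Shape> result = new ArrayList<>();
        if (size == 0) {
            result.add(null); // exactly one empty tree
        } else {
            // Only split sizes that add up to size - 1 are tried (pruning by size).
            for (int leftSize = 0; leftSize < size; leftSize++) {
                for (Shape l : shapes(leftSize)) {
                    for (Shape r : shapes(size - 1 - leftSize)) {
                        result.add(new Shape(l, r)); // subtrees are shared, not copied
                    }
                }
            }
        }
        memo.put(size, result);
        return result;
    }

    public static void main(String[] args) {
        TreeShapeDp dp = new TreeShapeDp();
        for (int n = 1; n <= 4; n++) {
            System.out.println(n + " nodes: " + dp.shapes(n).size() + " shapes");
        }
        // Sizes follow the Catalan numbers: 1, 2, 5, 14.
    }
}
```

Each shape is produced exactly once, so no isomorphic duplicates arise for this simple structure; richer structures with data fields would need the symbolic handling that SymboLazyDP adds.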
SymboLazyDP
SymboLazyDP automatically performs source-to-source instrumentation
    if (getBoolean()) {
        addCond("btSize", EQ, "rightBtSize+leftBtSize+1");
        return true;
    } else {
        addCond("btSize", NOTEQ, "rightBtSize+leftBtSize+1");
        return false;
    }
Listing 6: Instrumenting BinaryTree for symbolic execution.
Systematic Generation of Fibonacci Heaps
Figure: Systematic generation performance: Fibonacci heaps. (Plot not recoverable; x-axis: size of generated tests (FibonacciHeap), 1 to 8; series: Korat, DP, LazyDP, SymboLazyDP, Pex.)
Ditto
Memoized checks (Ditto [ShankarBodikPLDI07]) locate the error
Ditto incrementally checks repOK
Memoizes the result of previous checks and re-checks only where the data structure changes
Uses write barriers to identify changes
Optimistically assumes repOK returns true
Propagates the result up if it does not
We improve Ditto with support for
Cycles
Some pre- and post-conditions, according to the user
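The memoize-and-invalidate idea can be sketched in Java as follows. This is a simplified illustration, not Ditto itself and not the extended version described here: it assumes an acyclic tree with parent pointers, uses setter methods as stand-ins for write barriers, and checks only the size invariant btSize = leftBtSize + rightBtSize + 1. All names are hypothetical.

```java
// Simplified illustration of memoized invariant checking (acyclic trees only).
final class MemoizedRepOk {
    static int checksRun = 0; // counts nodes whose invariant was actually re-evaluated

    static final class Node {
        final int key;
        private Node left, right, parent;
        private int btSize;       // stored size of the subtree rooted here
        private Boolean cachedOk; // null means the node must be re-checked

        Node(int key, int btSize) {
            this.key = key;
            this.btSize = btSize;
        }

        // Setters play the role of write barriers: each mutation invalidates
        // the memoized result of this node and of all its ancestors.
        void setLeft(Node child) { left = attach(left, child); }
        void setRight(Node child) { right = attach(right, child); }
        void setBtSize(int size) { btSize = size; invalidate(); }

        private Node attach(Node oldChild, Node newChild) {
            if (oldChild != null) oldChild.parent = null;
            if (newChild != null) newChild.parent = this;
            invalidate();
            return newChild;
        }

        private void invalidate() {
            for (Node n = this; n != null; n = n.parent) n.cachedOk = null;
        }
    }

    private static int size(Node n) { return n == null ? 0 : n.btSize; }

    // Recursive repOK that reuses memoized results for unchanged subtrees.
    static boolean repOk(Node n) {
        if (n == null) return true;
        if (n.cachedOk != null) return n.cachedOk;
        checksRun++;
        boolean ok = repOk(n.left) && repOk(n.right)
                && n.btSize == size(n.left) + size(n.right) + 1;
        n.cachedOk = ok;
        return ok;
    }

    // Deepest failing node: its check fails while no node below it fails.
    static Node locate(Node n) {
        if (n == null || repOk(n)) return null;
        Node inLeft = locate(n.left);
        if (inLeft != null) return inLeft;
        Node inRight = locate(n.right);
        return inRight != null ? inRight : n;
    }

    public static void main(String[] args) {
        Node n5 = new Node(5, 1), n6 = new Node(6, 2), n7 = new Node(7, 3);
        Node n1 = new Node(1, 1), n2 = new Node(2, 5);
        n6.setLeft(n5);
        n7.setLeft(n6);
        n2.setLeft(n1);
        n2.setRight(n7);
        System.out.println(repOk(n2) + " after " + checksRun + " checks"); // true after 5 checks
        checksRun = 0;
        n5.setBtSize(2); // corrupt one node
        System.out.println(repOk(n2) + " after " + checksRun + " checks"); // false after 4 checks
        System.out.println("error located at node " + locate(n2).key);     // node 5
    }
}
```

After the corruption, node 1 is not re-checked because its memoized result is still valid, and the failure is traced to the deepest failing node rather than to the root.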
Contributions
Contract-based data structure repair [ZaeemKhurshidABZ10, ZaeemKhurshidECOOP10, ZaeemMS10]
History-aware contract-based repair [Zaeem+TACAS12]
Read and write barriers to obtain program execution history
Unsatisfiable cores to obtain SAT solving history
Repair abstraction for contract-based repair using Alloy [Zaeem+RV13]
Dynamic programming for structure generation
Test input generation [ZaeemKhurshidFSE12]
Ideas for repair
Implementation
Evaluation
Repair: microbenchmarks, open source projects (ANTLR and Kodkod)
Test generation: versus state-of-the-art tools Korat and Pex, tested Apple Safari and Google Chrome (found 3 bugs)