Recent Work, in Two Acts
Carlos Pacheco
8/15/2008
Agenda
1. Deconstructing Randoop
2. Mutation-based test generation
Deconstructing Randoop
deconstruct
verb [trans.] analyze (a text or a linguistic or conceptual system), typically in order to expose its hidden internal assumptions and contradictions and subvert its apparent significance or unity.
deconstruct (alt.)
verb [trans.] analyze (a tool, algorithm or software system), typically in order to expose its hidden internal assumptions and components and evaluate its apparent significance or unity.
Goals
• Identify Randoop's key, separable ideas
• Determine their individual effectiveness
• Determine their combination's effectiveness
Randoop: feedback-directed random test generator
Input: classes under test
    java.util.Collections, java.util.ArrayList, java.util.TreeSet, java.util.LinkedList, ...
Input: properties to check
    Reflexivity of equality: ∀ o != null : o.equals(o) == true
Output: failing test cases
    public void test() {
        Object o = new Object();
        ArrayList a = new ArrayList();
        a.add(o);
        TreeSet ts = new TreeSet(a);
        Set us = Collections.unmodifiableSet(ts);
        // Fails at runtime.
        assertTrue(us.equals(us));
    }
Feedback-directed random test generation
1. Seed component set: components = { ... }
   Example seeds: int i = 0; boolean b = false;
2. Do until time limit expires:
   a. Create a new sequence
      i. Randomly pick a method call m(T1...Tk)/Tret
      ii. For each input parameter of type Ti, randomly pick a sequence Si from the components that constructs an object vi of type Ti
      iii. Create new sequence Snew = S1; ... ; Sk ; Tret vnew = m(v1...vk);
      iv. If Snew was previously created (lexically), go to i
   b. Classify the new sequence Snew
      i. May discard, output as test case, or add to components
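Classifying a sequence
[Flowchart]
start → execute and check properties → property violated?
property violated? yes → minimize sequence → contract-violating test case
property violated? no → exception thrown?
exception thrown? yes → discard sequence
exception thrown? no → add to component set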
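The generation loop and the classification step can be sketched in a few lines of Java. This is a toy model under loud assumptions, not Randoop's implementation: every value is an Integer (so type matching is trivial), a sequence is flattened to a nested expression string, the time limit is replaced by a step budget, and the names FeedbackDirectedSketch, Seq, Op and generate are hypothetical. Sequence minimization is omitted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class FeedbackDirectedSketch {
    /** A sequence, simplified to an expression string plus a way to re-execute it. */
    static final class Seq {
        final String code;
        final Supplier<Integer> run;
        Seq(String code, Supplier<Integer> run) { this.code = code; this.run = run; }
    }

    /** Hypothetical stand-in for a method under test: name, input count, behaviour. */
    static final class Op {
        final String name;
        final int arity;
        final Function<List<Integer>, Integer> body;
        Op(String name, int arity, Function<List<Integer>, Integer> body) {
            this.name = name; this.arity = arity; this.body = body;
        }
    }

    /** Grows components in place; returns the code of property-violating sequences. */
    static List<String> generate(List<Seq> components, List<Op> ops,
                                 Predicate<Integer> property, int budget, long seed) {
        Random rnd = new Random(seed);
        Set<String> seen = new HashSet<>();
        List<String> failing = new ArrayList<>();
        for (int step = 0; step < budget; step++) {
            // Step i: randomly pick a method.
            Op m = ops.get(rnd.nextInt(ops.size()));
            // Step ii: pick one input sequence per parameter from the components.
            List<Seq> inputs = new ArrayList<>();
            List<String> inputCode = new ArrayList<>();
            for (int i = 0; i < m.arity; i++) {
                Seq s = components.get(rnd.nextInt(components.size()));
                inputs.add(s);
                inputCode.add(s.code);
            }
            // Step iii: build the new sequence from its inputs plus the call.
            String code = m.name + "(" + String.join(", ", inputCode) + ")";
            // Step iv: skip sequences that were already created (lexical check).
            if (!seen.add(code)) continue;
            Supplier<Integer> run = () -> {
                List<Integer> args = new ArrayList<>();
                for (Seq s : inputs) args.add(s.run.get());
                return m.body.apply(args);
            };
            // Classification. In this toy model a thrown exception leaves no value
            // to check, so the exception test runs before the property test.
            Integer value;
            try {
                value = run.get();
            } catch (RuntimeException e) {
                continue;                            // exception thrown: discard
            }
            if (!property.test(value)) {
                failing.add(code);                   // property violated: output as test
            } else {
                components.add(new Seq(code, run));  // normal: reuse as a component
            }
        }
        return failing;
    }
}
```

Because only sequences that execute normally enter the component set, later sequences are built exclusively from inputs that are known to be constructible, which is the feedback the slides refer to.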
Prior evaluation
• Compared with other techniques
  – Model checking, symbolic execution, traditional random testing
• On collection classes (lists, sets, maps, etc.)
  – Randoop achieved equal or higher code coverage in less time
• On a large benchmark of programs (750KLOC)
  – Randoop revealed more errors
Randoop's two key ideas
1. Create method sequences incrementally (component set)
2. Use runtime information to guide generation
What makes it work?
• Component set?
• Runtime feedback?
• Both... Or neither?
Four techniques
                       use feedback? yes      use feedback? no
use components? yes    Randoop                Randoop without feedback
use components? no     naive with feedback    naive
Naive sequence generation
• To generate one sequence:
  1. Start from the empty sequence S
  2. Select an enabled method at random
  3. Select input to the method from S
  4. Extend S with the new method call, go back to step 2
• A method is enabled if S declares objects that can serve as its receiver and arguments
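The "enabled" test above amounts to a type check against the objects the sequence has already declared. A minimal sketch, assuming a hypothetical MethodSig record of required input types (receiver first) and a list of the types S declares; neither name comes from Randoop:

```java
import java.util.Arrays;
import java.util.List;

final class NaiveEnabledSketch {
    /** Hypothetical stand-in for a method under test. */
    static final class MethodSig {
        final String name;
        final List<Class<?>> inputs;   // receiver (if any) first, then arguments
        MethodSig(String name, Class<?>... inputs) {
            this.name = name;
            this.inputs = Arrays.asList(inputs);
        }
    }

    /** True if every required input type can be served by some type declared in S. */
    static boolean isEnabled(MethodSig m, List<Class<?>> declaredTypes) {
        for (Class<?> needed : m.inputs) {
            boolean satisfied = false;
            for (Class<?> declared : declaredTypes) {
                if (needed.isAssignableFrom(declared)) {
                    satisfied = true;
                    break;
                }
            }
            if (!satisfied) return false;
        }
        return true;
    }
}
```

A method with no inputs, such as a no-argument constructor, is enabled even for the empty sequence, which is what lets naive generation get started.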
Naive generation with feedback
• Extend new sequence with method call
• Execute method call, check properties
• If exception/failure, go back one step
  – Remove last method call
  – Attempt different extension
Randoop without feedback
Add every new sequence to component set, regardless of its execution result.
Evaluation
• Apply the four techniques to a set of libraries
• Compare
  – coverage
  – errors revealed
Libraries
library       members   LOC
chain         189       8K
logging       136       4K
javax         90        14K
prims         990       6K
collections   415       39K
jelly         469       14K
utilmde       577       13K
collext       2114      61K
math          687       21K
Input space size: distinct input sequences of length...
library       1      2      3             4             5
chain         28     1.3K   97K           10M           1B
logging       35     1.6K   112K          10.7M         1B
javax         38     2.2K   167K          15M           1.3B
prims         372    154K   63M           26B           1 x 10^12
collections   1.6K   2.7M   4.6B          7.8 x 10^12   1.3 x 10^16
jelly         910    1.5M   3.5B          8.1 x 10^12   1.8 x 10^16
utilmde       2.8K   9.2M   30B           3.0 x 10^14   3.0 x 10^18
collext       6.9K   49M    343M          2.4 x 10^15   1.7 x 10^19
math          25K    623M   1.5 x 10^13   3.8 x 10^17   9.6 x 10^21
Input
For each library:
  – All public members in library
  – Sequence limit: 50 calls
  – Small set of primitives (0, -1, 100, 'a', etc.)
Other details
• Stopping criterion
  – Coverage does not increase after 100 seconds
• Five properties
  – Equals symmetric, equals reflexive, equals to null returns false, equals-hashcode, no NPEs
• Engineering fairness
  – Optimized all four techniques to make sequence construction efficient
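The five properties listed above can be approximated by a small checker over a pair of objects. This is a sketch under assumptions: the class and method names are hypothetical, and the "no NPEs" property is reduced here to "none of the contract calls throws a NullPointerException", which is narrower than checking every method under test.

```java
import java.util.ArrayList;
import java.util.List;

final class ContractCheckSketch {
    /** Returns the names of the properties that the non-null pair (a, b) violates. */
    static List<String> violations(Object a, Object b) {
        List<String> out = new ArrayList<>();
        try {
            if (!a.equals(a)) out.add("equals reflexive");
            if (a.equals(b) != b.equals(a)) out.add("equals symmetric");
            if (a.equals(null)) out.add("equals to null returns false");
            // Equal objects must report equal hash codes.
            if (a.equals(b) && a.hashCode() != b.hashCode()) out.add("equals-hashcode");
        } catch (NullPointerException e) {
            out.add("no NPEs");
        }
        return out;
    }
}
```

Calling violations on the objects a generated sequence produces gives the "property violated?" answer needed to classify that sequence.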
Output
• Failing test cases
• One test per (violating method,property) pair
• Ongoing: manually inspecting all failures
Failures
library       naive   Randoop w/o feedback   naive w/ feedback   Randoop
chain         24      0                      0                   13
logging       19      0                      0                   12
javax         0       0                      0                   0
prims         10      0                      13                  16
collections   21      0                      20                  15
jelly         57      0                      0                   80
utilmde       1       2                      0                   2
collext       50      3                      14                  85
math          64      8                      2                   10
TOTAL         246     13                     49                  233
Failure kinds
kind          naive   Randoop w/o feedback   naive w/ feedback   Randoop
NPEs          218     13                     0                   176
Other         28      0                      49                  57
TOTAL         246     13                     49                  233
Coverage achieved
[Bar chart: coverage (0 to 0.9) per library (javax, chain, jelly, collext, logging, util, collections, prims, math) for the four techniques: Randoop without feedback, naive with feedback, naive, Randoop]
Coverage vs. time
[Plot: coverage against time for Randoop and Other, with tother and tRandoop marked on the time axis]
tRandoop / tother
library       naive   Randoop w/o feedback   naive w/ feedback
chain         0.1     1                      0.01
logging       n/a     1                      0.15
javax         0.67    1                      0.12
prims         0.03    0.15                   0.01
collections   0.03    0.22                   0.01
jelly         0.03    1                      0.01
utilmde       0.001   0.11                   0.005
collext       0.04    0.19                   0.06
math          0.01    0.13                   0.01
Conclusion
• Randoop:
  – High coverage very quickly
  – More "serious" failures
• Naive:
  – Good coverage, slower/less than Randoop
  – More NPE failures
• Other techniques
  – Not as effective
Mutation-based generation
Carlos Pacheco
Jeff Perkins
Motivation
• Randoop
  – Achieves reasonable coverage
  – Hits a coverage plateau
• Can we push the coverage plateau up?
[Plot: coverage against time; the Randoop curve plateaus below the level marked Goal]
Idea
• Follow random generation with systematic mutation of method sequences
  – null
  – unrelated types
  – related types (super, subclasses)
  – aliasing
  – structurally-equivalent objects
Mutation via dataflow tracking
1. When coverage plateaus, stop random generation
2. Identify frontier branches
3. For each frontier branch:
   a) Select candidate sequences (that reach frontier branches)
   b) Track the variables whose data flows into the branch condition
   c) Systematically mutate the variables
Example
Candidate sequence:
    int var1 = 5;
    BinTree var2 = new BinTree(var1);
    int var3 = 2;
    var2.add(var3);
    int var4 = 6;
    var2.remove(var4);
Frontier branch:
    class BinTree {
        public boolean remove(int x) {
            . . .
            if (current.value == x)
            . . .
        }
    }
Runtime analysis:
    relevant variables: var3 and var4
    var3 was compared to 6
    var4 was compared to 2
Strategy:
    Modify every relevant variable to take on each compared value
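The strategy above can be sketched as a function that produces one mutated variable assignment per (relevant variable, compared value) pair. The representation is an assumption made for illustration: a sequence's primitive variables are modeled as a name-to-value map, and ComparedValueMutationSketch and mutants are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ComparedValueMutationSketch {
    /**
     * For every relevant variable and every value it was compared to, returns a copy
     * of the original assignment in which that one variable takes the compared value.
     */
    static List<Map<String, Integer>> mutants(Map<String, Integer> original,
                                              Map<String, List<Integer>> comparedTo) {
        List<Map<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : comparedTo.entrySet()) {
            for (Integer value : e.getValue()) {
                if (value.equals(original.get(e.getKey()))) continue; // would change nothing
                Map<String, Integer> mutant = new LinkedHashMap<>(original);
                mutant.put(e.getKey(), value);
                out.add(mutant);
            }
        }
        return out;
    }
}
```

For the example, the inputs {var1=5, var3=2, var4=6} with var3 compared to 6 and var4 compared to 2 yield two mutants: one that adds 6 before removing 6, and one that removes the 2 that was added. Either makes the equality in the frontier branch true.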
Runtime analysis
• Determine data flow at frontier branch
  1. Tag each variable's runtime value on creation
  2. On each operation, create a tree with the operation as the root and operands as branches
  3. From branch tree, determine
     – relevant variables
     – values that each variable was compared to
• Could also track control flow
Sequence mutation strategies
• Primitive variables
  – For each primitive variable x: set x to compared values +/- {0, 1, 10, 100}
• Reference variables
  – Given two variables x and y (of the same type):
    • Replace uses of x by y (alias)
    • Make x and y structurally equivalent (copy)
    • Make one null, the other non-null
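The primitive-variable rule is simple enough to spell out. A minimal sketch, assuming long arithmetic to sidestep int overflow and using hypothetical names (PrimitiveMutationSketch, candidates):

```java
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;

final class PrimitiveMutationSketch {
    // Offsets from the slide: compared value +/- {0, 1, 10, 100}.
    private static final long[] OFFSETS = {0, 1, 10, 100};

    /** Candidate replacement values for one primitive variable, sorted and de-duplicated. */
    static SortedSet<Long> candidates(Collection<Long> comparedValues) {
        SortedSet<Long> out = new TreeSet<>();
        for (long compared : comparedValues) {
            for (long offset : OFFSETS) {
                out.add(compared + offset);
                out.add(compared - offset);
            }
        }
        return out;
    }
}
```

A variable that was compared only to 4 gets seven candidates: -96, -6, 3, 4, 5, 14 and 104. The nearby values target off-by-one conditions such as `<` versus `<=`.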
Example 2
Candidate sequence:
    int var0 = 100;
    int var1 = -1;
    List var2 = nCopies(var0, var1);
    shuffle(var2);
Frontier branch:
    public int next(int n) {
        . . .
        if ((n & -n) == n) // i.e. n is a power of 2
        . . .
    }
Runtime analysis:
    Relevant variables: var0, var1
    var0 was compared to 4, 100
Winning strategy:
    set var0 to 4
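To see why 4 wins where 100 does not, the frontier condition can be evaluated on its own. This sketch isolates only the predicate from the slide; the class and method names are hypothetical, and nothing here models how the value reaches the branch.

```java
final class PowerOfTwoBranchSketch {
    /** The frontier condition: n & -n isolates the lowest set bit of n. */
    static boolean takesBranch(int n) {
        return (n & -n) == n;
    }
}
```

For n = 100 the lowest set bit is 4, so the comparison fails; for n = 4 the lowest set bit is n itself, so the branch is taken.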
Example 3
Candidate sequence:
    ArrayList var0 = new ArrayList();
    int var1 = 0;
    String var2 = "a";
    var0.add(var1, var2);
    int var4 = 1;
    String var5 = "a";
    var0.add(var4, var5);
    long var7 = 100;
    boolean var8 = var0.add(var7);
    int var9 = 0;
    short var10 = 0;
    Object var11 = var0.set(var9, var10);
    String var12 = "b";
    boolean var13 = var0.remove(var12);
    double var14 = 0.0;
    int var15 = var0.lastIndexOf(var14);
Frontier branch:
    public int lastIndexOf(Object elem) {
        . . .
        for (int i = size-1 ; i >= 0 ; i--) {
            if (elem.equals(elementData[i]))
            . . .
    }
Runtime analysis:
    Relevant variables: var1, var9, var10, var14
Example 3 (continued)
Winning strategy:
    Replace uses of var14 with var7
Coverage-directed sequence mutation
• Randoop covered 933 of 2064 branches
  – 163 frontier branches
  – Dataflow information was found for 29 frontier branches
• Mutation strategies were able to cover 19 of those branches
Dataflow implementation
• Instrument Java class files as they are loaded
• Maintain tags for each runtime value
• When two values interact, merge their tags
• Create summaries for JDK methods
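The tagging idea can be illustrated without bytecode instrumentation by wrapping values by hand. This is a sketch under assumptions, not the tool's implementation: tags are flat sets of variable names rather than operation trees, only int values are covered, and TaggedInt, plus and equalsLogged are hypothetical names.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TaggedInt {
    final int value;
    final Set<String> tags;   // sequence variables whose data flowed into this value

    private TaggedInt(int value, Set<String> tags) {
        this.value = value;
        this.tags = tags;
    }

    /** Tags a value on creation with the sequence variable that holds it. */
    static TaggedInt of(String variable, int value) {
        Set<String> tags = new TreeSet<>();
        tags.add(variable);
        return new TaggedInt(value, tags);
    }

    /** When two values interact, the result carries the merged tags. */
    TaggedInt plus(TaggedInt other) {
        Set<String> merged = new TreeSet<>(tags);
        merged.addAll(other.tags);
        return new TaggedInt(value + other.value, merged);
    }

    /** A comparison at a branch: records, per tagged variable, the value it met. */
    static boolean equalsLogged(TaggedInt a, TaggedInt b, Map<String, Set<Integer>> comparedTo) {
        for (String v : a.tags) comparedTo.computeIfAbsent(v, k -> new TreeSet<>()).add(b.value);
        for (String v : b.tags) comparedTo.computeIfAbsent(v, k -> new TreeSet<>()).add(a.value);
        return a.value == b.value;
    }
}
```

Comparing a value tagged var3 (2) with one tagged var4 (6) leaves a log saying var3 was compared to 6 and var4 was compared to 2, which is the form of runtime analysis output shown in the first example.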