Recent Work, in Two Acts Carlos Pacheco 8/15/2008 Agenda


Recent Work, in Two Acts

Carlos Pacheco

8/15/2008

Agenda

1. Deconstructing Randoop

2. Mutation-based test generation


Deconstructing Randoop

deconstruct

verb [trans.] analyze (a text or a linguistic or conceptual system), typically in order to expose its hidden internal assumptions and contradictions and subvert its apparent significance or unity.

deconstruct (alt.)

verb [trans.] analyze (a tool, algorithm or software system), typically in order to expose its hidden internal assumptions and components and evaluate its apparent significance or unity.


Goals

• Identify Randoop's key, separable ideas

• Determine their individual effectiveness

• Determine their combination's effectiveness

Randoop

Inputs: classes under test, properties to check
Tool: Randoop, a feedback-directed random test generator
Output: failing test cases


Randoop: example

Classes under test: java.util.Collections, java.util.ArrayList, java.util.TreeSet, java.util.LinkedList, ...

Property to check (reflexivity of equality): ∀ o != null : o.equals(o) == true

Failing test case:

  public void test() {
    Object o = new Object();
    ArrayList a = new ArrayList();
    a.add(o);
    TreeSet ts = new TreeSet(a);
    Set us = Collections.unmodifiableSet(ts);
    // Fails at runtime.
    assertTrue(us.equals(us));
  }

Feedback-directed random test generation

1. Seed component set
   components = { ... }   (e.g. int i = 0; boolean b = false;)
2. Do until time limit expires:
   a. Create a new sequence
      i.   Randomly pick a method call m(T1...Tk)/Tret
      ii.  For each input parameter of type Ti, randomly pick a sequence Si from the components that constructs an object vi of type Ti
      iii. Create new sequence Snew = S1; ... ; Sk ; Tret vnew = m(v1...vk);
      iv.  If Snew was previously created (lexically), go to i
   b. Classify the new sequence Snew
      – May discard, output as test case, or add to components
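The loop above can be sketched as a toy model. This is not Randoop's implementation: it assumes one-argument operations on immutable values, so a sequence can be represented by its source text plus the value it built. It replaces the time limit with an iteration count and checks only reflexivity of equals. The names FeedbackDirectedSketch, Op, Seq and generate are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Function;

/** Toy model of feedback-directed generation (all names are hypothetical). */
final class FeedbackDirectedSketch {

    /** A method under test, reduced to a name and a one-argument body. */
    static final class Op {
        final String name;
        final Function<Object, Object> body;
        Op(String name, Function<Object, Object> body) { this.name = name; this.body = body; }
    }

    /** A sequence, represented by its source text and the value it builds. */
    static final class Seq {
        final String code;
        final Object value;
        Seq(String code, Object value) { this.code = code; this.value = value; }
    }

    /** Runs the loop for a fixed number of iterations; returns the failing sequences. */
    static List<String> generate(List<Op> ops, List<Seq> seeds, int iterations, long randomSeed) {
        Random rnd = new Random(randomSeed);
        List<Seq> components = new ArrayList<>(seeds);  // step 1: seed component set
        Set<String> created = new HashSet<>();
        List<String> failing = new ArrayList<>();
        for (int n = 0; n < iterations; n++) {          // step 2: bounded by iterations, not time
            Op m = ops.get(rnd.nextInt(ops.size()));                     // 2.a.i
            Seq input = components.get(rnd.nextInt(components.size()));  // 2.a.ii
            String code = m.name + "(" + input.code + ")";               // 2.a.iii
            if (!created.add(code)) {
                continue;                                                // 2.a.iv: lexical duplicate
            }
            try {                                                        // 2.b: classify
                Object result = m.body.apply(input.value);
                if (result == null) {
                    continue;                        // nothing to check or reuse
                }
                if (!result.equals(result)) {
                    failing.add(code);               // property violated: output as test case
                } else {
                    components.add(new Seq(code, result));  // normal run: add to components
                }
            } catch (RuntimeException e) {
                // exception thrown: discard the sequence
            }
        }
        return failing;
    }
}
```

Caching the built value is a shortcut that only works because the toy values are immutable; a real generator would re-execute the component sequences.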

Classifying a sequence

start -> execute and check properties -> property violated?
  yes -> minimize sequence -> contract-violating test case
  no  -> exception thrown?
           yes -> discard sequence
           no  -> component set
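The decision order in this flowchart can be written down as a small function. This is a sketch under assumptions: properties are modeled as predicates over the value a sequence returns, minimization is left out, and the names SequenceClassifier, Outcome and classify are hypothetical. A property such as "no NPEs" would need to inspect the thrown exception instead, which this model does not do.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Sketch of the classification step (names are hypothetical). */
final class SequenceClassifier {

    enum Outcome { CONTRACT_VIOLATING_TEST_CASE, DISCARD, ADD_TO_COMPONENTS }

    static Outcome classify(Callable<Object> sequence, List<Predicate<Object>> properties) {
        Object result = null;
        Exception thrown = null;
        try {
            result = sequence.call();          // execute
        } catch (Exception e) {
            thrown = e;
        }
        // Property violated? Checked first, on the returned value only.
        if (thrown == null && result != null) {
            for (Predicate<Object> property : properties) {
                if (!property.test(result)) {
                    // A full tool would minimize the sequence before reporting it.
                    return Outcome.CONTRACT_VIOLATING_TEST_CASE;
                }
            }
        }
        // Exception thrown? Then the sequence is not reused.
        if (thrown != null) {
            return Outcome.DISCARD;
        }
        return Outcome.ADD_TO_COMPONENTS;
    }
}
```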

Prior evaluation

• Compared with other techniques
  – Model checking, symbolic execution, traditional random testing
• On collection classes (lists, sets, maps, etc.)
  – Randoop achieved equal or higher code coverage in less time
• On a large benchmark of programs (750KLOC)
  – Randoop revealed more errors


Randoop's two key ideas

1. Create method sequences incrementally (component set)

2. Use runtime information to guide generation


What makes it work?

• Component set?
• Runtime feedback?
• Both... Or neither?

Four techniques

                        use feedback?
                        yes                    no
use components?   yes   Randoop                Randoop without feedback
                  no    naive with feedback    naive

Naive sequence generation

• To generate one sequence:
  1. Start from the empty sequence S
  2. Select an enabled method at random
  3. Select input to the method from S
  4. Extend S with the new method call, go back to 2
• A method is enabled if S declares objects that can serve as its receiver and arguments
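A minimal sketch of these steps, assuming a toy model in which types are plain strings matched exactly (no subtyping) and every method returns a value. It only builds the text of one sequence and never executes it, which is what makes it "naive". NaiveGenerator and Sig are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Toy model of naive generation: builds one sequence as text, without running it. */
final class NaiveGenerator {

    /** A method signature; types are plain strings in this model. */
    static final class Sig {
        final String name;
        final String returnType;
        final List<String> inputTypes;  // receiver (if any) followed by arguments
        Sig(String name, String returnType, List<String> inputTypes) {
            this.name = name;
            this.returnType = returnType;
            this.inputTypes = inputTypes;
        }
    }

    static List<String> generate(List<Sig> methods, int length, Random rnd) {
        List<String> statements = new ArrayList<>();  // the sequence S, initially empty
        List<String> varTypes = new ArrayList<>();    // varTypes.get(i) is the type of v<i>
        while (statements.size() < length) {
            // A method is enabled if S declares a variable for each of its inputs.
            List<Sig> enabled = new ArrayList<>();
            for (Sig m : methods) {
                if (varTypes.containsAll(m.inputTypes)) {
                    enabled.add(m);
                }
            }
            if (enabled.isEmpty()) {
                break;
            }
            Sig m = enabled.get(rnd.nextInt(enabled.size()));
            // Select each input at random among the variables of the right type.
            List<String> args = new ArrayList<>();
            for (String type : m.inputTypes) {
                List<Integer> candidates = new ArrayList<>();
                for (int i = 0; i < varTypes.size(); i++) {
                    if (varTypes.get(i).equals(type)) {
                        candidates.add(i);
                    }
                }
                args.add("v" + candidates.get(rnd.nextInt(candidates.size())));
            }
            // Extend S with the new call, then loop back to method selection.
            statements.add(m.returnType + " v" + varTypes.size() + " = " + m.name
                    + "(" + String.join(", ", args) + ");");
            varTypes.add(m.returnType);
        }
        return statements;
    }
}
```

From the empty sequence only methods without inputs are enabled, so the first statement is always a no-argument call.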

Naive generation with feedback

• Extend new sequence with method call
• Execute method call, check properties
• If exception/failure, go back one step
  – Remove last method call
  – Attempt different extension
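The backtracking step can be illustrated with a small sketch. It assumes the object under test is a java.util.ArrayDeque used as a stack, treats only exceptions as failures (property checks are omitted), and replays the whole sequence on a fresh object after each extension. NaiveWithFeedback, Call and the attempt bound are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Random;
import java.util.function.Consumer;

/** Sketch of naive generation with feedback on a stack-like object. */
final class NaiveWithFeedback {

    /** One method call on the object under test. */
    static final class Call {
        final String name;
        final Consumer<Deque<Integer>> body;
        Call(String name, Consumer<Deque<Integer>> body) { this.name = name; this.body = body; }
    }

    static List<Call> generate(List<Call> methods, int length, int maxAttempts, Random rnd) {
        List<Call> sequence = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts && sequence.size() < length; attempt++) {
            sequence.add(methods.get(rnd.nextInt(methods.size())));  // extend with a method call
            if (!executesCleanly(sequence)) {
                sequence.remove(sequence.size() - 1);  // go back one step, then try again
            }
        }
        return sequence;
    }

    /** Replays the sequence on a fresh object; false if any call throws. */
    static boolean executesCleanly(List<Call> sequence) {
        Deque<Integer> stack = new ArrayDeque<>();
        try {
            for (Call c : sequence) {
                c.body.accept(stack);
            }
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }
}
```

With push and pop as the available calls, a pop on an empty stack throws and is removed, so every sequence this returns replays without an exception.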


Randoop without feedback

Add every new sequence to component set, regardless of its execution result.

Review: four techniques

                        use feedback?
                        yes                    no
use components?   yes   Randoop                Randoop without feedback
                  no    naive with feedback    naive

Evaluation

• Apply the four techniques to a set of libraries
• Compare
  – coverage
  – errors revealed

Libraries

library       members   LOC
chain             189    8K
logging           136    4K
javax              90   14K
prims             990    6K
collections       415   39K
jelly             469   14K
utilmde           577   13K
collext          2114   61K
math              687   21K

Input space size: distinct input sequences of length...

library       1      2      3            4            5
chain         28     1.3K   97K          10M          1B
logging       35     1.6K   112K         10.7M        1B
javax         38     2.2K   167K         15M          1.3B
prims         372    154K   63M          26B          1 x 10^12
collections   1.6K   2.7M   4.6B         7.8 x 10^12  1.3 x 10^16
jelly         910    1.5M   3.5B         8.1 x 10^12  1.8 x 10^16
utilmde       2.8K   9.2M   30B          3.0 x 10^14  3.0 x 10^18
collext       6.9K   49M    343M         2.4 x 10^15  1.7 x 10^19
math          25K    623M   1.5 x 10^13  3.8 x 10^17  9.6 x 10^21

Input

For each library:
  – All public members in library
  – Sequence limit: 50 calls
  – Small set of primitives (0, -1, 100, 'a', etc.)

Other details

• Stopping criterion
  – Coverage does not increase after 100 seconds
• Five properties
  – Equals symmetric, equals reflexive, equals to null returns false, equals-hashcode, no NPEs
• Engineering fairness
  – Optimized all four techniques to make sequence construction efficient
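The five properties can be expressed as plain Java checks. The sketch below is one possible reading, not the checks used in the evaluation: the four equals/hashCode properties are tested on a pair of non-null objects, and "no NPEs" is modeled as "the given call does not throw NullPointerException". ContractChecks and NeverEqual are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the five checked properties (names are hypothetical). */
final class ContractChecks {

    /** Returns the names of the equals/hashCode properties violated by non-null a and b. */
    static List<String> violations(Object a, Object b) {
        List<String> violated = new ArrayList<>();
        if (!a.equals(a)) {
            violated.add("equals reflexive");
        }
        if (a.equals(b) != b.equals(a)) {
            violated.add("equals symmetric");
        }
        if (a.equals(null)) {
            violated.add("equals to null returns false");
        }
        if (a.equals(b) && a.hashCode() != b.hashCode()) {
            violated.add("equals-hashcode");
        }
        return violated;
    }

    /** "No NPEs", modeled as: the given call does not throw NullPointerException. */
    static boolean throwsNpe(Runnable call) {
        try {
            call.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    /** Deliberately broken class, used only to exercise the checks. */
    static final class NeverEqual {
        @Override public boolean equals(Object o) { return false; }
        @Override public int hashCode() { return 0; }
    }
}
```

For example, violations(new ContractChecks.NeverEqual(), "x") reports only "equals reflexive", since the other three properties hold for that pair.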

Output

• Failing test cases

• One test per (violating method, property) pair

• Ongoing: manually inspecting all failures

Failures

library       naive   Randoop w/o feedback   naive w/ feedback   Randoop
chain            24                      0                   0        13
logging          19                      0                   0        12
javax             0                      0                   0         0
prims            10                      0                  13        16
collections      21                      0                  20        15
jelly            57                      0                   0        80
utilmde           1                      2                   0         2
collext          50                      3                  14        85
math             64                      8                   2        10
TOTAL           246                     13                  49       233

Failure kinds

kind    naive   Randoop w/o feedback   naive w/ feedback   Randoop
NPEs      218                     13                   0       176
Other      28                      0                  49        57
TOTAL     246                     13                  49       233

Coverage achieved

[Bar chart: coverage achieved (vertical axis, 0 to 0.9) for each library (javax, chain, jelly, collext, logging, util, collections, prims, math); one series each for Randoop without feedback, naive with feedback, naive, and Randoop]

Coverage vs. time

[Plot: coverage (vertical axis) vs. time (horizontal axis), with one curve for Randoop and one for Other; the times tother and tRandoop are marked on the time axis]

tRandoop / tother

library       naive   Randoop w/o feedback   naive w/ feedback
chain         0.1     1                      0.01
logging       n/a     1                      0.15
javax         0.67    1                      0.12
prims         0.03    0.15                   0.01
collections   0.03    0.22                   0.01
jelly         0.03    1                      0.01
utilmde       0.001   0.11                   0.005
collext       0.04    0.19                   0.06
math          0.01    0.13                   0.01

Conclusion

• Randoop:
  – High coverage very quickly
  – More "serious" failures
• Naive:
  – Good coverage, slower/less than Randoop
  – More NPE failures
• Other techniques
  – Not as effective

Mutation-based generation

Carlos Pacheco
Jeff Perkins

Motivation

• Randoop
  – Achieves reasonable coverage
  – Hits a coverage plateau
• Can we push the coverage plateau up?

[Plot: coverage (vertical axis) vs. time (horizontal axis) for Randoop, with a label "Goal"]

Idea

• Follow random generation with systematic mutation of method sequences
  – null
  – unrelated types
  – related types (super, subclasses)
  – aliasing
  – structurally-equivalent objects

Mutation via dataflow tracking

1. When coverage plateaus, stop random generation
2. Identify frontier branches
3. For each frontier branch:
   a) Select candidate sequences (that reach frontier branches)
   b) Track the variables whose data flows into branch condition
   c) Systematically mutate the variables

Example

Candidate sequence:

  int var1 = 5;
  BinTree var2 = new BinTree(var1);
  int var3 = 2;
  var2.add(var3);
  int var4 = 6;
  var2.remove(var4);

Frontier branch:

  class BinTree {
    public boolean remove(int x) {
      . . .
      if (current.value == x)
      . . .
    }
  }

Runtime analysis:

  relevant variables: var3 and var4
  var3 was compared to 6
  var4 was compared to 2

Strategy:

  Modify every relevant variable to take on each compared value

Runtime analysis

• Determine data flow at frontier branch
  1. Tag each variable's runtime value on creation
  2. On each operation, create a tree with the operation as the root and operands as branches
  3. From branch tree, determine
     – relevant variables
     – values that each variable was compared to
• Could also track control flow
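One way to picture the operation tree is the sketch below. It is an illustration under assumptions, not the tool's data structure: values are longs, leaves are named variables, the only operation modeled is addition, and a comparison is represented by its two operand trees. OpTree and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Sketch of an operation tree built while values flow through a program. */
final class OpTree {
    final String label;           // variable name for a leaf, operator for an inner node
    final long value;             // runtime value of this (sub)expression
    final List<OpTree> operands;  // empty for a leaf

    private OpTree(String label, long value, List<OpTree> operands) {
        this.label = label;
        this.value = value;
        this.operands = operands;
    }

    /** Tags a variable's runtime value on creation. */
    static OpTree variable(String name, long value) {
        return new OpTree(name, value, Collections.<OpTree>emptyList());
    }

    /** An operation becomes the root; its operands become the branches. */
    static OpTree plus(OpTree a, OpTree b) {
        return new OpTree("+", a.value + b.value, Arrays.asList(a, b));
    }

    /** Leaves of the tree: the variables whose data flows into this value. */
    Set<String> relevantVariables() {
        Set<String> vars = new TreeSet<>();
        collect(this, vars);
        return vars;
    }

    private static void collect(OpTree node, Set<String> vars) {
        if (node.operands.isEmpty()) {
            vars.add(node.label);
        }
        for (OpTree child : node.operands) {
            collect(child, vars);
        }
    }

    /** For a branch condition "left == right": the value each variable was compared to. */
    static Map<String, Set<Long>> comparedValues(OpTree left, OpTree right) {
        Map<String, Set<Long>> compared = new TreeMap<>();
        for (String v : left.relevantVariables()) {
            compared.computeIfAbsent(v, k -> new TreeSet<>()).add(right.value);
        }
        for (String v : right.relevantVariables()) {
            compared.computeIfAbsent(v, k -> new TreeSet<>()).add(left.value);
        }
        return compared;
    }
}
```

With var3 = 2 on one side of the comparison and var4 = 6 on the other, comparedValues records that var3 was compared to 6 and var4 to 2, which matches the analysis result in the first example.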

Sequence mutation strategies

• Primitive variables
  – For each primitive variable x: set x to compared values +/- {0, 1, 10, 100}
• Reference variables
  – Given two variables x and y (of the same type):
    • Replace uses of x by y (alias)
    • Make x and y structurally equivalent (copy)
    • Make one null, the other non-null
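For the primitive strategy, the candidate values follow directly from the rule above. The sketch covers only primitive variables modeled as longs; the reference-variable strategies are not shown. PrimitiveMutations and candidates are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;

/** Sketch of the primitive mutation rule: compared values +/- {0, 1, 10, 100}. */
final class PrimitiveMutations {
    private static final long[] DELTAS = {0, 1, 10, 100};

    /** Returns every value a primitive variable would be set to, in ascending order. */
    static Set<Long> candidates(Set<Long> comparedValues) {
        Set<Long> result = new TreeSet<>();
        for (long compared : comparedValues) {
            for (long delta : DELTAS) {
                result.add(compared + delta);
                result.add(compared - delta);
            }
        }
        return result;
    }
}
```

A variable that was compared to 6 would be tried with the values -94, -4, 5, 6, 7, 16 and 106.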

Example 2

Candidate sequence:

  int var0 = 100;
  int var1 = -1;
  List var2 = nCopies(var0, var1);
  shuffle(var2);

Frontier branch:

  public int next(int n) {
    . . .
    if ((n & -n) == n) // i.e. n is a power of 2
    . . .
  }

Runtime analysis:

  Relevant variables: var0, var1
  var0 was compared to 4, 100

Winning strategy:

  set var0 to 4
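The branch condition can be checked in isolation to see why 4 wins and 100 does not, assuming (as the runtime analysis suggests) that var0 reaches the branch as n. PowerOfTwoBranch is a hypothetical name; the expression is the one on the slide.

```java
/** The frontier-branch condition from Example 2, isolated for inspection. */
final class PowerOfTwoBranch {

    /** True when n is a power of two, for n > 0 (n & -n isolates the lowest set bit). */
    static boolean takesBranch(int n) {
        return (n & -n) == n;
    }
}
```

takesBranch(100) is false because the lowest set bit of 100 is 4, while takesBranch(4) is true.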

Example 3

Candidate sequence:

  ArrayList var0 = new ArrayList();
  int var1 = 0;
  String var2 = "a";
  var0.add(var1, var2);
  int var4 = 1;
  String var5 = "a";
  var0.add(var4, var5);
  long var7 = 100;
  boolean var8 = var0.add(var7);
  int var9 = 0;
  short var10 = 0;
  Object var11 = var0.set(var9, var10);
  String var12 = "b";
  boolean var13 = var0.remove(var12);
  double var14 = 0.0;
  int var15 = var0.lastIndexOf(var14);

Frontier branch:

  public int lastIndexOf(Object elem) {
    . . .
    for (int i = size-1 ; i >= 0 ; i--) {
      if (elem.equals(elementData[i]))
      . . .
  }

Runtime analysis:

  Relevant variables: var1, var9, var10, var14

Example 3 (continued)

Winning strategy:

  Replace uses of var14 with var7
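The effect of this mutation can be reproduced with java.util.ArrayList directly. The sketch rebuilds the list from the candidate sequence, using List<Object> in place of the raw type, and compares the original lookup with the mutated one. LastIndexOfMutation and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Replays Example 3 with and without the winning mutation. */
final class LastIndexOfMutation {

    /** Builds the list exactly as the candidate sequence does. */
    static List<Object> buildList() {
        List<Object> var0 = new ArrayList<>();
        var0.add(0, "a");
        var0.add(1, "a");
        var0.add(100L);          // var7, boxed to Long
        var0.set(0, (short) 0);  // var10, boxed to Short
        var0.remove("b");        // "b" is not in the list, so nothing changes
        return var0;
    }

    /** Original call: looks up var14 (0.0), which equals no element. */
    static int original() {
        return buildList().lastIndexOf(0.0);
    }

    /** Mutated call: var7 (100) is used in place of var14. */
    static int mutated() {
        return buildList().lastIndexOf(100L);
    }
}
```

The original lookup returns -1 because a Double never equals the Short, String or Long elements; the mutated lookup finds the Long at index 2, so elem.equals(...) evaluates to true.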

Coverage-directed sequence mutation

• Randoop covered 933 of 2064 branches
  – 163 frontier branches
  – Dataflow information was found for 29 frontier branches
• Mutation strategies were able to cover 19 of those branches

Dataflow implementation

• Instrument Java class files as they are loaded
• Maintain tags for each runtime value
• When two values interact, merge their tags
• Create summaries for JDK methods
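Tag merging can be pictured with a hand-written wrapper, shown here only as an illustration: the tool described above works by instrumenting class files, whereas this sketch wraps an int explicitly and models a single operation. TaggedInt and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Sketch of a runtime value that carries the tags of the variables it came from. */
final class TaggedInt {
    final int value;
    final Set<String> tags;

    private TaggedInt(int value, Set<String> tags) {
        this.value = value;
        this.tags = Collections.unmodifiableSet(tags);
    }

    /** A freshly created value carries only its own tag. */
    static TaggedInt of(String tag, int value) {
        Set<String> tags = new TreeSet<>();
        tags.add(tag);
        return new TaggedInt(value, tags);
    }

    /** When two values interact, the result carries the union of their tags. */
    TaggedInt plus(TaggedInt other) {
        Set<String> merged = new TreeSet<>(tags);
        merged.addAll(other.tags);
        return new TaggedInt(value + other.value, merged);
    }
}
```

Adding a value tagged var3 to one tagged var4 yields a result tagged with both, which is how a branch condition can be traced back to the variables that influenced it.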