Has the Bug Really Been Fixed? Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su University of California, Davis ICSE 2010


Page 1: Has the Bug Really Been Fixed?

Has the Bug Really Been Fixed?

Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su

University of California, Davis

ICSE 2010

Page 2: Has the Bug Really Been Fixed?

Authors

Publications:

• Has the bug really been fixed? Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su. (ICSE 2010)

• Effective Identification of Failure-Inducing Changes: A Hybrid Approach. Sai Zhang, Yu Lin, Zhongxian Gu and Jianjun Zhao. (PASTE 2008)

• Change Impact Analysis for AspectJ Programs. Sai Zhang, Zhongxian Gu, Yu Lin and Jianjun Zhao. (ICSM 2008)

• AutoFlow: An Automatic Debugging Framework for AspectJ Programs. Sai Zhang, Zhongxian Gu, Yu Lin and Jianjun Zhao. (ISSTA 2008)

• Celadon: A Change Impact Analysis Tool for Aspect-Oriented Programs. Sai Zhang, Zhongxian Gu, Yu Lin and Jianjun Zhao. (ICSE 2008)

Zhongxian GU

Page 3: Has the Bug Really Been Fixed?

Authors

Publications:

• Scalable and precise detection of buggy inconsistencies (OOPSLA'10)

• How unique is source code? (FSE'10)

• Perturbing numerical computation to detect instabilities (ISSTA'10)

• Dynamic detection of unsafe component loadings (ISSTA'10, Distinguished Paper Award)

• Has the bug really been fixed? (ICSE'10)

• Simultaneously learning and enforcing temporal properties (ICSE'10)

Zhendong Su

Current Projects:

• Automated debugging [ICSE'06, ASE'07]

• Clone detection and similarity checking [ICSE'07, FSE'07]

• Firewall modeling, analysis, and optimization [S&P'06, TNSM]

• Malicious code detection, analysis, and prevention [CCS'05, ASPLOS'06, ACSAC'06]

• Program analysis of numerical software [TACAS'04, TCS'05, ICSE'06]

• Web and database application security and reliability [ICSE'04, SAVCBS'04, POPL'06, PLDI'07]

NEW: Please submit good papers to the following venues: TOSEM, SAS'11, ESEC/FSE'11,OOPSLA'11, ICSE'12, and ISSTA'12.

Page 4: Has the Bug Really Been Fixed?

Fixing a Bug

[Figure: the bug-fixing cycle (detect bug, understand bug, fix code, verify fix), with a timeline from "detect" through fixes f1, f2, …, fn, labeled "bad fixes".]

Motivation Approach Implementation Evaluation

Page 5: Has the Bug Really Been Fixed?

Do Bad Fixes Exist?

• Empirical study

– Explore Bugzilla databases of Ant, AspectJ and Rhino

– Focus on “reopened” bugs

– Study the comment histories

• Bad fixes do exist!

– Of reopened bugs, 66% in Ant, 73% in AspectJ, and 80% in Rhino are due to bad fixes

– “Oh, I am sorry, I didn’t consider that possibility.”

Page 6: Has the Bug Really Been Fixed?

Example

Bug(Rhino): Continuations do not work for __noSuchMethod__

    // no idea what to do if it’s a TAIL_CALL
    if (fun instanceof NoSuchMethodShim && op != Icode_TAIL_CALL) {
        // get the shim and the actual method
        NoSuchMethodShim noSuchMethodShim = (NoSuchMethodShim) fun;
        ...
    }

1st fix

2nd fix

3rd fix

Page 7: Has the Bug Really Been Fixed?

The Bad Fix Problem

[Figure: regions of the input domain, labeled "known bug-triggering input", "bug-triggering input domain", "input domain", and "inputs covered by the fix".]

• Coverage: some inputs in the bug-triggering input domain are not covered by the fix

• Disruption: the fix changes the behavior of unrelated inputs
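
The two failure modes can be made concrete with a toy program. The sketch below is not from the talk: `BadFixDemo`, its `percent`-style computation, and both fixes are hypothetical. The bug-triggering input domain is every input with `total == 0`, and the known bug-triggering input is `(0, 0)`.

```java
import java.util.function.IntBinaryOperator;

class BadFixDemo {
    // Buggy program: throws ArithmeticException for every input with total == 0.
    static int buggy(int part, int total) {
        return part * 100 / total;
    }

    // Coverage problem: only the reported input (0, 0) is handled.
    static int narrowFix(int part, int total) {
        if (part == 0 && total == 0) return 0;
        return part * 100 / total;
    }

    // Disruption problem: the guard is too wide, so inputs with total == 1
    // or total < 0, which never triggered the bug, now return 0.
    static int broadFix(int part, int total) {
        if (total <= 1) return 0;
        return part * 100 / total;
    }

    // True when the program under test crashes on the given input.
    static boolean crashes(IntBinaryOperator program, int part, int total) {
        try {
            program.applyAsInt(part, total);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("narrowFix crashes on (0, 0): " + crashes(BadFixDemo::narrowFix, 0, 0));
        System.out.println("narrowFix crashes on (5, 0): " + crashes(BadFixDemo::narrowFix, 5, 0));
        System.out.println("buggy(1, 1) = " + buggy(1, 1) + ", broadFix(1, 1) = " + broadFix(1, 1));
    }
}
```

Here `narrowFix` passes the reported test case yet still crashes on `(5, 0)`, and `broadFix` removes the crash but alters the result for `(1, 1)`, an input outside the bug-triggering domain.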


Page 8: Has the Bug Really Been Fixed?

Our Approach

• Detect bad fix as soon as possible

• Coverage

– Discover the bug-triggering input domain

– Test the fixed program using the bug-triggering input domain

• Disruption

– Regression testing

– Random testing: use buggy program as the oracle
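
As a rough illustration of the disruption check, the sketch below compares a fixed program against the buggy one on random inputs outside the bug-triggering domain, where the buggy program's output is trusted as the oracle. All names (`DisruptionCheck`, `findDisruption`, the toy programs, the input range) are assumptions made for this example, not part of the talk.

```java
import java.util.OptionalInt;
import java.util.Random;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

class DisruptionCheck {
    // Toy buggy program: crashes only for x == 0.
    static int buggy(int x) { return 100 / x; }

    // A fix that changes nothing outside the bug-triggering domain.
    static int goodFix(int x) { return x == 0 ? 0 : 100 / x; }

    // A fix whose guard is too wide: it also changes x == 1 and negative x.
    static int disruptiveFix(int x) { return x <= 1 ? 0 : 100 / x; }

    // Returns an input outside the bug domain on which the two programs disagree, if one is found.
    static OptionalInt findDisruption(IntUnaryOperator oracle, IntUnaryOperator fixed,
                                      IntPredicate inBugDomain, long seed, int trials) {
        Random rng = new Random(seed);
        for (int i = 0; i < trials; i++) {
            int x = rng.nextInt(201) - 100;      // uniform in [-100, 100]
            if (inBugDomain.test(x)) continue;   // the oracle is wrong there by definition
            if (oracle.applyAsInt(x) != fixed.applyAsInt(x)) return OptionalInt.of(x);
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        IntPredicate bugDomain = x -> x == 0;
        System.out.println("goodFix disrupted: "
            + findDisruption(DisruptionCheck::buggy, DisruptionCheck::goodFix, bugDomain, 42L, 1000).isPresent());
        System.out.println("disruptiveFix disrupted: "
            + findDisruption(DisruptionCheck::buggy, DisruptionCheck::disruptiveFix, bugDomain, 42L, 1000).isPresent());
    }
}
```

Random testing of this kind can only show that a disruption exists; an empty result does not prove the fix leaves unrelated inputs untouched.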

Page 9: Has the Bug Really Been Fixed?

Discover the Bug-Triggering Input Domain

• A known bug-triggering input induces a concrete path

• Dijkstra’s weakest precondition (WP)

– Path explosion

– Loop invariants

Page 10: Has the Bug Really Been Fixed?

Path Neighborhood

• Intuition: paths in the neighborhood of the concrete buggy path are more error prone

• Under-approximate the bug-triggering input domain via exploring neighboring paths

Page 11: Has the Bug Really Been Fixed?

Distance-Bounded WP (WPd)

• Inputs: program P, an initial predicate, a concrete path, and a distance budget d

• Generate the candidate paths C

• Restrict the computation of WP to the candidate paths C

• Under-approximation of WP
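
The steps above can be sketched on a toy program. Everything in this block is an assumption made for illustration: the three branches, the initial predicate `PHI`, and in particular the distance metric (Hamming distance over branch outcomes), which the slides do not specify. Because the toy program has no assignments, WP along a single path reduces to the path condition conjoined with `PHI`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

class DistanceBoundedWp {
    // Hypothetical toy program: three consecutive branches on one int input x.
    static final IntPredicate[] BRANCHES = {
        x -> x > 0,
        x -> x % 2 == 0,
        x -> x > 10
    };

    // Hypothetical initial predicate: the condition under which the final assertion fails.
    static final IntPredicate PHI = x -> x % 3 == 0;

    // The path an input takes, as the vector of branch outcomes.
    static boolean[] pathOf(int x) {
        boolean[] path = new boolean[BRANCHES.length];
        for (int i = 0; i < path.length; i++) path[i] = BRANCHES[i].test(x);
        return path;
    }

    // Assumed distance metric: the number of branch outcomes that differ.
    static int distance(boolean[] a, boolean[] b) {
        int d = 0;
        for (int i = 0; i < a.length; i++) if (a[i] != b[i]) d++;
        return d;
    }

    // WP along one path: the input follows exactly this path and PHI holds.
    static IntPredicate wpAlong(boolean[] path) {
        return x -> Arrays.equals(pathOf(x), path) && PHI.test(x);
    }

    // WPd: disjunction of the per-path WPs over all candidate paths within distance d.
    static IntPredicate wpd(int buggyInput, int d) {
        boolean[] concrete = pathOf(buggyInput);
        int n = BRANCHES.length;
        List<IntPredicate> disjuncts = new ArrayList<>();
        for (int mask = 0; mask < (1 << n); mask++) {
            boolean[] candidate = new boolean[n];
            for (int i = 0; i < n; i++) candidate[i] = ((mask >> i) & 1) == 1;
            if (distance(candidate, concrete) <= d) disjuncts.add(wpAlong(candidate));
        }
        return x -> disjuncts.stream().anyMatch(p -> p.test(x));
    }

    public static void main(String[] args) {
        // 12 is the known bug-triggering input; 6 lies on a path at distance 1.
        System.out.println("d = 0 contains 6: " + wpd(12, 0).test(6));
        System.out.println("d = 1 contains 6: " + wpd(12, 1).test(6));
    }
}
```

Raising `d` only adds disjuncts, so each WPd is contained in the next and all of them stay inside the full WP, which in this toy is reached at `d = 3`.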

Page 12: Has the Bug Really Been Fixed?

Loop Invariants

• Unroll the loop nodes

• All paths are simple

– Compute the distance

– Compute the WP

[Figure: unrolled CFG]
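
A minimal sketch of why unrolling helps, assuming a single `while` loop and hypothetical node names (`entry`, `headN`, `bodyN`, `exit`): once the loop is unrolled a fixed number of times, the CFG is acyclic, so every path is simple and the set of paths is finite and enumerable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LoopUnroll {
    // CFG of `entry; while (c) body; exit` with the loop unrolled k times.
    static Map<String, List<String>> unroll(int k) {
        Map<String, List<String>> cfg = new LinkedHashMap<>();
        cfg.put("entry", List.of("head0"));
        for (int i = 0; i < k; i++) {
            cfg.put("head" + i, List.of("body" + i, "exit"));
            cfg.put("body" + i, List.of("head" + (i + 1)));
        }
        cfg.put("head" + k, List.of("exit")); // unrolling budget exhausted: only the exit edge remains
        cfg.put("exit", List.of());
        return cfg;
    }

    // Enumerates every path between two nodes; terminates because the unrolled CFG is acyclic.
    static List<List<String>> paths(Map<String, List<String>> cfg, String from, String to) {
        List<List<String>> out = new ArrayList<>();
        walk(cfg, from, to, new ArrayDeque<>(), out);
        return out;
    }

    private static void walk(Map<String, List<String>> cfg, String node, String to,
                             Deque<String> prefix, List<List<String>> out) {
        prefix.addLast(node);
        if (node.equals(to)) {
            out.add(new ArrayList<>(prefix));
        } else {
            for (String next : cfg.get(node)) walk(cfg, next, to, prefix, out);
        }
        prefix.removeLast();
    }

    public static void main(String[] args) {
        for (List<String> p : paths(unroll(2), "entry", "exit")) System.out.println(p);
    }
}
```

With `k` unrollings this toy loop yields `k + 1` paths, one per iteration count from 0 to `k`; executions that iterate more often are simply not represented, which is one reason the result is an under-approximation.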

Page 13: Has the Bug Really Been Fixed?

Coverage Analysis

• Collect the concrete path

• Under-approximate input domain using WPd

• Symbolically execute the fixed program

[Figure: coverage analysis. Labels: buggy input, buggy program, WPd with distance budget d, fixed program, symbolic execution.]

Page 14: Has the Bug Really Been Fixed?

FIXATION Architecture

[Figure: FIXATION architecture. Inputs: buggy program, fixed program, buggy input, distance budget. Components: CFG generator, instrumentation module, WPd module, symbolic execution module, post-processor. Intermediate artifacts: Pb-CFG, concrete path. Output: results.]

Page 15: Has the Bug Really Been Fixed?

FIXATION Architecture

• CFG generator: instrument the WALA CFG generator to support finitely-unrolled CFG generation

Page 16: Has the Bug Really Been Fixed?

FIXATION Architecture

• Instrumentation module: WALA Shrike bytecode library

Page 17: Has the Bug Really Been Fixed?

FIXATION Architecture

• WPd module: implemented as a prototype

Page 18: Has the Bug Really Been Fixed?

FIXATION Architecture

• Symbolic execution (SE) module and post-processor: Java PathFinder

Page 19: Has the Bug Really Been Fixed?

Evaluation


• Objective

– Demonstrate the feasibility of our approach

– Differentiate WPd from WP

• Experimental setup

– Dell XPS 630i

– 2.4 GHz quad-core CPU

– 3.2 GB of memory

– Ubuntu 8.04

Page 20: Has the Bug Really Been Fixed?

Program Transformation

Page 21: Has the Bug Really Been Fixed?

Benchmark

Name            LOC         Nodes       Arcs        Cyclomatic complexity
                P     Pi    P     Pi    P     Pi    P     Pi
NoSuchMethod    60    65    60    64    70    75    12    13
MultiTask       23    36    51    62    54    68    5     8
Substring       8     17    10    16    10    18    2     4
NativeErr       9     20    10    18    10    21    2     5
Loop            34    42    42    46    46    51    6     7
PathExp         114   133   103   119   124   147   23    30

Page 22: Has the Bug Really Been Fixed?

Evaluation - Example

[Figure: CFG of the example. Node labels: begin; fun == InterpretedFun; fun == Continuation; fun == IdFunObject; call nodes (three times); end; assert(false). Edge labels: pass, fail. Predicates annotated along the path: true; fun != idFunObject; fun != idFunObject && fun != Continuation; fun != idFunObject && fun != Continuation && fun != InterpretedFun.]

Known bug-triggering input: fun = NoSuchMethodShim

Distance budget: d = 0

Page 23: Has the Bug Really Been Fixed?

Evaluation - Example

WPd = (fun != InterpretedFun) && (fun != Continuation) && (fun != IdFunctionObject)

FIXATION

Bad fix! Assertion fails again. New bug-triggering input is: fun == NoSuchMethodShim && op == Icode_TAIL_CALL
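
The result on this slide can be reproduced on a small model. The sketch below is an assumption-laden stand-in, not FIXATION itself: the enums `Fun` and `Op` are hypothetical abstractions of the Rhino types, the first fix is modeled after the guard shown on the example slide, and exhaustive enumeration of the tiny input space replaces symbolic execution.

```java
import java.util.ArrayList;
import java.util.List;

class NoSuchMethodModel {
    enum Fun { INTERPRETED_FUN, CONTINUATION, ID_FUNCTION_OBJECT, NO_SUCH_METHOD_SHIM }
    enum Op { CALL, TAIL_CALL }

    // Buggy program: any function kind other than the three handled ones reaches assert(false).
    static boolean buggyFails(Fun fun, Op op) {
        switch (fun) {
            case INTERPRETED_FUN:
            case CONTINUATION:
            case ID_FUNCTION_OBJECT:
                return false;
            default:
                return true;
        }
    }

    // First fix: handles NoSuchMethodShim, but only when the call is not a tail call.
    static boolean fixedFails(Fun fun, Op op) {
        if (fun == Fun.NO_SUCH_METHOD_SHIM && op != Op.TAIL_CALL) return false;
        return buggyFails(fun, op);
    }

    // WPd for d = 0, as given on the slide.
    static boolean inWpd(Fun fun, Op op) {
        return fun != Fun.INTERPRETED_FUN && fun != Fun.CONTINUATION && fun != Fun.ID_FUNCTION_OBJECT;
    }

    // Inputs inside WPd on which the fixed program still fails.
    static List<String> stillFailing() {
        List<String> out = new ArrayList<>();
        for (Fun fun : Fun.values()) {
            for (Op op : Op.values()) {
                if (inWpd(fun, op) && fixedFails(fun, op)) out.add(fun + "/" + op);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(stillFailing());
    }
}
```

In this model the only surviving input is the `NO_SUCH_METHOD_SHIM` function kind combined with `TAIL_CALL`, matching the new bug-triggering input reported above.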

Page 24: Has the Bug Really Been Fixed?

WPd vs. WP

Name            d     Paths explored by WPd    Total paths
NoSuchMethod    0     1                        30
MultiTask       3     32                       48
SubString       2     3                        3
NativeErr       2     4                        5
Loop            5     354                      ∞
PathExp         27    234                      1021

Page 25: Has the Bug Really Been Fixed?

WPd vs. WP (cont.)

[Figure: PathExp benchmark. Feasible paths (number of paths explored) plotted against distance budget; labeled values: 81, 83, 234, 1021; annotation: "detect bad fix".]

Page 26: Has the Bug Really Been Fixed?

Conclusion

• Introduce and formalize the bad fix problem

• Propose distance-bounded WP

• Implement a prototype FIXATION

Page 27: Has the Bug Really Been Fixed?

No Bad Fixes!

detect f1 f2 … fn

Page 28: Has the Bug Really Been Fixed?

Soundness and Completeness

• Under-approximation of real bug-triggering input domain

• Sound: every bad fix we detect is a real bad fix.

• Not complete: we may miss some bad fixes.

Page 29: Has the Bug Really Been Fixed?

Threats to Validity

• Determining the distance budget (d)

• FIXATION is currently not optimized

• Benchmark selection

• Inherits the limitations of the WP computation and symbolic execution components

Page 30: Has the Bug Really Been Fixed?

Strength & Weakness

Weaknesses:

− How to model a bug as an assertion failure

− Difficult to determine the distance budget

Strengths:

+ Paths close to a buggy path are more likely to be error-prone.

+ Although not complete, the approach is sound: every detected bad fix is a real bad fix.
