Has the Bug Really Been Fixed?

Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su

University of California, Davis

ICSE 2010

Authors

Publications:

Has the bug really been fixed? Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su (ICSE 2010)

Effective Identification of Failure-Inducing Changes: A Hybrid Approach. Sai Zhang, Yu Lin, Zhongxian Gu, and Jianjun Zhao (PASTE 2008)

Change Impact Analysis for AspectJ Programs. Sai Zhang, Zhongxian Gu, Yu Lin, and Jianjun Zhao (ICSM 2008)

AutoFlow: An Automatic Debugging Framework for AspectJ Programs. Sai Zhang, Zhongxian Gu, Yu Lin, and Jianjun Zhao (ISSTA 2008)

Celadon: A Change Impact Analysis Tool for Aspect-Oriented Programs. Sai Zhang, Zhongxian Gu, Yu Lin, and Jianjun Zhao (ICSE 2008)

Zhongxian Gu

Authors

Publications:

Scalable and precise detection of buggy inconsistencies (OOPSLA'10)

How unique is source code? (FSE'10)

Perturbing numerical computation to detect instabilities (ISSTA'10)

Dynamic detection of unsafe component loadings (ISSTA'10, Distinguished Paper Award)

Has the bug really been fixed? (ICSE'10)

Simultaneously learning and enforcing temporal properties (ICSE'10)

Zhendong Su

Current Projects:

Automated debugging [ICSE'06, ASE'07]

Clone detection and similarity checking [ICSE'07, FSE'07]

Firewall modeling, analysis, and optimization [S&P'06, TNSM]

Malicious code detection, analysis, and prevention [CCS'05, ASPLOS'06, ACSAC'06]

Program analysis of numerical software [TACAS'04, TCS'05, ICSE'06]

Web and database application security and reliability [ICSE'04, SAVCBS'04, POPL'06, PLDI'07]

NEW: Please submit good papers to the following venues: TOSEM, SAS'11, ESEC/FSE'11, OOPSLA'11, ICSE'12, and ISSTA'12.

Fixing a Bug

[Figure: the bug-fixing cycle: detect bug, understand bug, fix code, verify fix. Timeline labels: detect, f1, f2, …, fn, with bad fixes marked.]

Motivation Approach Implementation Evaluation

Do Bad Fixes Exist?

• Empirical study

– Explore Bugzilla databases of Ant, AspectJ and Rhino

– Focus on “reopened” bugs

– Study the comment histories

• Bad fixes do exist!

– Of reopened bugs, 66% in Ant, 73% in AspectJ, 80% in Rhino are due to bad fixes

– “Oh, I am sorry, I didn’t consider that possibility.”

Example

Bug (Rhino): Continuations do not work for __noSuchMethod__

    // no idea what to do if it’s a TAIL_CALL
    if (fun instanceof NoSuchMethodShim && op != Icode_TAIL_CALL) {
        // get the shim and the actual method
        NoSuchMethodShim noSuchMethodShim = (NoSuchMethodShim) fun;
        ...
    }

1st fix

2nd fix

3rd fix

The Bad Fix Problem

[Figure: the input domain, the bug-triggering input domain, the known bug-triggering input, and the inputs covered by the fix.]

• Coverage: some inputs in the bug-triggering input domain are not covered by the fix

• Disruption: the fix changes the behavior of unrelated inputs


Our Approach

• Detect bad fixes as soon as possible

• Coverage

– Discover the bug-triggering input domain

– Test the fixed program using the bug-triggering input domain

• Disruption

– Regression testing

– Random testing: use the buggy program as the oracle
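The random-testing half of the disruption check can be sketched as differential testing. This is a minimal Java sketch under stated assumptions, not FIXATION's implementation: `buggy`, `fixed`, and `inBugDomain` are hypothetical stand-ins for the two program versions and the bug-triggering input domain, and the integer inputs and toy absolute-value programs are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

public class DisruptionCheck {
    /**
     * Runs random inputs through both versions and returns the inputs outside
     * the bug-triggering domain on which the fixed program disagrees with the
     * buggy program (used here as the oracle).
     */
    static List<Integer> findDisruptions(IntUnaryOperator buggy, IntUnaryOperator fixed,
                                         IntPredicate inBugDomain, int trials, long seed) {
        Random random = new Random(seed);
        List<Integer> disrupted = new ArrayList<>();
        for (int i = 0; i < trials; i++) {
            int input = random.nextInt(201) - 100; // inputs in [-100, 100]
            if (inBugDomain.test(input)) {
                continue; // behavior is expected to change here
            }
            if (buggy.applyAsInt(input) != fixed.applyAsInt(input)) {
                disrupted.add(input);
            }
        }
        return disrupted;
    }

    public static void main(String[] args) {
        // Hypothetical toy programs: the intended behavior is absolute value.
        IntUnaryOperator buggy = x -> x;                     // wrong for negative inputs
        IntPredicate inBugDomain = x -> x < 0;
        IntUnaryOperator goodFix = x -> x < 0 ? -x : x;
        IntUnaryOperator badFix = x -> x <= 0 ? -x : x + 1;  // also changes positive inputs
        System.out.println(findDisruptions(buggy, goodFix, inBugDomain, 1000, 42L).isEmpty());
        System.out.println(findDisruptions(buggy, badFix, inBugDomain, 1000, 42L).isEmpty());
    }
}
```

The buggy program is a usable oracle only outside the bug-triggering domain, which is why inputs inside it are skipped rather than compared.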

Discover the Bug-Triggering Input Domain

• A known bug-triggering input induces a concrete path

• Dijkstra’s weakest precondition (WP)

– Path explosion

– Loop invariants
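For reference, the standard weakest-precondition rules for a simple imperative language are given below. The notation (statements S, postcondition Q, guard b, expression e) is generic textbook notation and is not taken from the slides.

```latex
\begin{align*}
wp(\texttt{skip},\, Q) &= Q \\
wp(x := e,\, Q) &= Q[e/x] \\
wp(S_1; S_2,\, Q) &= wp(S_1,\, wp(S_2,\, Q)) \\
wp(\texttt{if } b \texttt{ then } S_1 \texttt{ else } S_2,\, Q)
  &= (b \Rightarrow wp(S_1, Q)) \land (\lnot b \Rightarrow wp(S_2, Q))
\end{align*}
```

Each conditional contributes both branches to the formula, which is where path explosion comes from, and there is no comparable closed-form rule for a loop: it needs an invariant or some bound on its iterations.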

Path Neighborhood

• Intuition: paths in the neighborhood of the concrete buggy path are more error-prone

• Under-approximate the bug-triggering input domain by exploring neighboring paths

Distance-Bounded WP (WPd)

• Inputs: program P, an initial predicate, a concrete path, and a distance budget d

• Generate candidate paths C

• Restrict the computation of WP to the candidate paths C

• Under-approximation of WP
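Candidate-path generation can be sketched as follows. This is a minimal Java sketch, not the WPd module itself. The slides do not state the distance metric, so Levenshtein edit distance over node sequences is assumed here purely for illustration; the class name, method names, and the diamond-shaped CFG are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CandidatePaths {
    /** Levenshtein edit distance between two node sequences (assumed metric). */
    static int distance(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 0; i <= a.size(); i++) dp[i][0] = i;
        for (int j = 0; j <= b.size(); j++) dp[0][j] = j;
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                int substitution = a.get(i - 1).equals(b.get(j - 1)) ? 0 : 1;
                dp[i][j] = Math.min(dp[i - 1][j - 1] + substitution,
                        Math.min(dp[i - 1][j], dp[i][j - 1]) + 1);
            }
        }
        return dp[a.size()][b.size()];
    }

    /** Enumerates all entry-to-exit paths of an acyclic CFG by depth-first search. */
    static void enumerate(Map<String, List<String>> cfg, String node, String exit,
                          List<String> prefix, List<List<String>> out) {
        prefix.add(node);
        if (node.equals(exit)) {
            out.add(new ArrayList<>(prefix));
        } else {
            for (String next : cfg.getOrDefault(node, List.of())) {
                enumerate(cfg, next, exit, prefix, out);
            }
        }
        prefix.remove(prefix.size() - 1); // backtrack
    }

    /** Keeps only the paths whose distance to the concrete path is at most d. */
    static List<List<String>> candidates(Map<String, List<String>> cfg, String entry,
                                         String exit, List<String> concretePath, int d) {
        List<List<String>> all = new ArrayList<>();
        enumerate(cfg, entry, exit, new ArrayList<>(), all);
        List<List<String>> kept = new ArrayList<>();
        for (List<String> path : all) {
            if (distance(path, concretePath) <= d) kept.add(path);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical acyclic CFG with two consecutive two-way branches.
        Map<String, List<String>> cfg = Map.of(
                "begin", List.of("a1", "a2"),
                "a1", List.of("mid"), "a2", List.of("mid"),
                "mid", List.of("b1", "b2"),
                "b1", List.of("end"), "b2", List.of("end"));
        List<String> concrete = List.of("begin", "a1", "mid", "b1", "end");
        for (int d = 0; d <= 2; d++) {
            System.out.println("d=" + d + ": "
                    + candidates(cfg, "begin", "end", concrete, d).size() + " candidate path(s)");
        }
    }
}
```

The sketch enumerates every path and filters afterwards, which is only workable on a toy graph; a practical implementation would have to prune during exploration to avoid the path explosion that the distance budget is meant to contain.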

Loop Invariants

• Unroll the loop nodes

• All paths are simple

– Compute the distance

– Compute the WP

[Figure: unrolled CFG]

Coverage Analysis

• Collect the concrete path

• Under-approximate input domain using WPd

• Symbolically execute the fixed program

[Figure: coverage analysis pipeline. Labels: buggy input, buggy program, distance budget d, WPd, fixed program, symbolic execution.]

FIXATION Architecture

[Figure: FIXATION architecture. Inputs: fixed program, buggy program, buggy input, distance budget. Components: CFG generator, instrumentation module, WPd module, symbolic execution module, post-processor. Other labels: Pb-CFG, concrete path, results.]

• CFG generator: instrument the WALA CFG generator to support finitely-unrolled CFG generation

• Instrumentation module: WALA-Shrike bytecode library

• WPd module: implement a prototype

• SE module & post-processor: Java Pathfinder

Evaluation


• Objective

– Demonstrate the feasibility of our approach

– Differentiate WPd from WP

• Experimental setup

– Dell XPS 630i

– 2.4 GHz Quad CPU

– 3.2 GB of memory

– Ubuntu 8.04

Program Transformation

Benchmark

Name          LOC         Nodes       Arcs        Cyclomatic complexity
              P     Pi    P     Pi    P     Pi    P     Pi
NoSuchMethod  60    65    60    64    70    75    12    13
MultiTask     23    36    51    62    54    68    5     8
Substring     8     17    10    16    10    18    2     4
NativeErr     9     20    10    18    10    21    2     5
Loop          34    42    42    46    46    51    6     7
PathExp       114   133   103   119   124   147   23    30
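The cyclomatic complexity figures agree with the standard formula for a single connected control-flow graph with E arcs and N nodes. The slides do not state the formula; the lines below only check it against two of the benchmark rows.

```latex
\begin{align*}
M &= E - N + 2 \\
\text{NoSuchMethod (P)}: \quad 70 - 60 + 2 &= 12 \\
\text{PathExp (Pi)}: \quad 147 - 119 + 2 &= 30
\end{align*}
```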

Evaluation - Example

[Figure: CFG of the example. Node labels: begin, fun == InterpretedFun, fun == Continuation, fun == IdFunObject, call nodes (three times), end, assert(false). Outcome labels: pass, fail.]

Known bug-triggering input: fun = NoSuchMethodShim

Distance budget: d = 0

Predicates shown along the path:

true

fun != idFunObject

fun != idFunObject && fun != Continuation

fun != idFunObject && fun != Continuation && fun != InterpretedFun

Evaluation - Example

WPd = (fun != InterpretedFun) && (fun != Continuation) && (fun != IdFunctionObject)

FIXATION

Bad fix! Assertion fails again. New bug-triggering input is: fun == NoSuchMethodShim && op == Icode_TAIL_CALL
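The coverage check on this example can be imitated with a small Java sketch. It is not FIXATION: exhaustive enumeration over a tiny input space stands in for symbolic execution, the enums are hypothetical stand-ins for the callee kinds and opcodes named on the slides, and `fixedProgramFails` is an assumed model of the first fix based on the code fragment on the Example slide.

```java
import java.util.ArrayList;
import java.util.List;

public class NoSuchMethodExample {
    // Hypothetical stand-ins for the callee kinds and opcodes named on the slides.
    enum Fun { INTERPRETED_FUN, CONTINUATION, ID_FUNCTION_OBJECT, NO_SUCH_METHOD_SHIM }
    enum Op { CALL, TAIL_CALL }

    /** WPd from the slide: the callee is none of the three handled kinds. */
    static boolean inBugDomain(Fun fun) {
        return fun != Fun.INTERPRETED_FUN && fun != Fun.CONTINUATION
                && fun != Fun.ID_FUNCTION_OBJECT;
    }

    /** Assumed model of the fixed program: true if assert(false) is still reached. */
    static boolean fixedProgramFails(Fun fun, Op op) {
        if (fun == Fun.INTERPRETED_FUN || fun == Fun.CONTINUATION
                || fun == Fun.ID_FUNCTION_OBJECT) {
            return false; // handled by the existing call nodes
        }
        if (fun == Fun.NO_SUCH_METHOD_SHIM && op != Op.TAIL_CALL) {
            return false; // handled by the first fix
        }
        return true; // falls through to assert(false)
    }

    /** Checks every input in the bug-triggering domain against the fixed program. */
    static List<String> uncoveredInputs() {
        List<String> uncovered = new ArrayList<>();
        for (Fun fun : Fun.values()) {
            for (Op op : Op.values()) {
                if (inBugDomain(fun) && fixedProgramFails(fun, op)) {
                    uncovered.add(fun + "/" + op);
                }
            }
        }
        return uncovered;
    }

    public static void main(String[] args) {
        System.out.println(uncoveredInputs());
    }
}
```

Under this model the only uncovered input is the shim combined with a tail call, matching the new bug-triggering input reported on the slide.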

WPd vs. WP

Name          d     Paths explored by WPd   Total paths
NoSuchMethod  0     1                       30
MultiTask     3     32                      48
Substring     2     3                       3
NativeErr     2     4                       5
Loop          5     354                     ∞
PathExp       27    234                     1021

WPd vs. WP (cont.)

[Figure: PathExp benchmark. Axis labels: feasible paths, distance budget, number of paths explored. Data labels: 81, 83, 234, 1021. Annotation: detect bad fix.]

Conclusion

• Introduce and formalize the bad fix problem

• Propose distance-bounded WP

• Implement a prototype, FIXATION

No Bad Fixes!

detect f1 f2 … fn

Soundness and Completeness

• Under-approximation of real bug-triggering input domain

• Sound: every bad fix we detect is a real bad fix.

• Not complete: we may miss some bad fixes.

Threats to Validity

• Determining the distance budget (d)

• FIXATION is currently not optimized

• Benchmark selection

• Inherits the limitations of the WP computation and symbolic execution components

Strength & Weakness

− How to model a bug as an assertion failure

− Difficult to determine the distance budget

Paths close to a buggy path are more likely to be error-prone.

Although not complete, all detected bad fixes are real bad fixes.

