Specification-Based Error Localization Brian Demsky Cristian Cadar Daniel Roy Martin Rinard Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology


Specification-Based Error Localization

Brian Demsky, Cristian Cadar, Daniel Roy, Martin Rinard

Computer Science and Artificial Intelligence Laboratory

Massachusetts Institute of Technology


Problem

[Diagram: Error Introduced -> Execution with Broken Data Structure -> Crash or Unexpected Result]

• Have to trace symptom back to cause
• Error may be present but not visible in test suite


Problem

• Goal is to discover bugs when
  • they corrupt data
  • not when effect becomes visible
• Perform frequent consistency checks
• Bug localized between
  • first unsuccessful check and
  • last successful check

[Diagram: Error Introduced -> Execution with Broken Data Structure -> Crash or Unexpected Result]
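The localization idea can be sketched in C as follows. This is only an illustration under stated assumptions, not code produced by Archie: the `localizer` struct, the `check_site` function, the numbered sites, and the toy property (a balance must never be negative) are all hypothetical names introduced for the example.

```c
#include <assert.h>

/* Hypothetical bookkeeping for check sites; not part of Archie. */
struct localizer {
    int last_pass;   /* id of the most recent site whose check passed, -1 if none */
    int first_fail;  /* id of the first site whose check failed, -1 if none */
};

/* Toy consistency property (an assumption): the balance is never negative. */
static int consistent(const int *balance) {
    return *balance >= 0;
}

/* Run the check at a numbered site and record the outcome.
   Once a failure is recorded, later sites no longer move the window. */
static void check_site(struct localizer *loc, int site, const int *balance) {
    if (loc->first_fail != -1)
        return;
    if (consistent(balance))
        loc->last_pass = site;
    else
        loc->first_fail = site;
}

/* Toy program with a bug between sites 2 and 3. */
struct localizer run_example(void) {
    struct localizer loc = { -1, -1 };
    int balance = 10;

    check_site(&loc, 1, &balance);
    balance -= 5;
    check_site(&loc, 2, &balance);
    balance -= 20;                 /* bug: corrupts the data */
    check_site(&loc, 3, &balance);
    balance += 100;                /* the corruption is masked later on */
    check_site(&loc, 4, &balance);
    return loc;
}
```

In this sketch the corrupting statement is bracketed by `last_pass` and `first_fail`, even though the final state of the program looks healthy again.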


Our Approach

[Diagram: Specification of Data Structure Consistency Properties -> Archie Compiler -> Efficient Consistency Checker; Efficient Consistency Checker + Program -> Instrumented Program with Early Data Structure Corruption Detection]


Architecture

[Diagram: Concrete Data Structure -> (Model Definition Rules) -> Abstract Model, checked against Model Consistency Constraints]


Architecture Rationale

Why use the abstract model?

• Model construction separates objects into sets
  • Reachability properties
  • Field values
• Different constraints for objects in different sets
• Appropriate division of complexity
  • Data structure representation complexity encapsulated in model definition rules
  • Consistency property complexity encapsulated in (clean, uniform) model constraint language


List Example

structure node {
    node *next;
    value *data;
}

structure value {
    int data;
}

node * head;


Sets and Relations in Model

• Sets of objects
    set NODE of node
    set VALUE of value

• Relations between objects – values of object fields, referencing relationships between objects
    relation NEXT : NODE -> NODE
    relation DATA : NODE -> VALUE


Model Translation

Bits translated to sets and relations in abstract model using statements of the form:

    Quantifiers, Condition => Inclusion Constraint

    true => head in NODE
    for n in NODE, !n.next = NULL => n.next in NODE
    for n in NODE, !n.next = NULL => <n, n.next> in NEXT
    for n in NODE, !n.data = NULL => n.data in VALUE
    for n in NODE, !n.data = NULL => <n, n.data> in DATA
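A rough C sketch of how these five rules could be evaluated for the list example is given below. It is not Archie's generated code: the `model` struct, the fixed `MAX_OBJS` capacity, the array-based sets, and the NULL guard on `head` are assumptions made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

struct value { int data; };
struct node  { struct node *next; struct value *data; };

#define MAX_OBJS 64   /* assumed capacity for this sketch */

/* Abstract model: sets hold objects, relations hold pairs of objects. */
struct model {
    struct node  *NODE[MAX_OBJS];   size_t n_node;
    struct value *VALUE[MAX_OBJS];  size_t n_value;
    struct { struct node *from; struct node  *to; } NEXT[MAX_OBJS]; size_t n_next;
    struct { struct node *from; struct value *to; } DATA[MAX_OBJS]; size_t n_data;
};

/* Add n to NODE unless it is already present (or the sketch's capacity is hit). */
static void add_node(struct model *m, struct node *n) {
    for (size_t i = 0; i < m->n_node; i++)
        if (m->NODE[i] == n) return;
    if (m->n_node < MAX_OBJS) m->NODE[m->n_node++] = n;
}

static void add_value(struct model *m, struct value *v) {
    for (size_t i = 0; i < m->n_value; i++)
        if (m->VALUE[i] == v) return;
    if (m->n_value < MAX_OBJS) m->VALUE[m->n_value++] = v;
}

void build_model(struct model *m, struct node *head) {
    m->n_node = m->n_value = m->n_next = m->n_data = 0;

    /* true => head in NODE (the NULL guard is an addition of this sketch) */
    if (head != NULL)
        add_node(m, head);

    /* for n in NODE, !n.next = NULL => n.next in NODE
       NODE grows while it is scanned, so the loop stops only when the rule
       adds nothing new; add_node's duplicate test makes cyclic lists terminate. */
    for (size_t i = 0; i < m->n_node; i++)
        if (m->NODE[i]->next != NULL)
            add_node(m, m->NODE[i]->next);

    /* The remaining three rules only read NODE, so one pass suffices. */
    for (size_t i = 0; i < m->n_node; i++) {
        struct node *n = m->NODE[i];
        if (n->next != NULL) {             /* <n, n.next> in NEXT */
            m->NEXT[m->n_next].from = n;
            m->NEXT[m->n_next].to = n->next;
            m->n_next++;
        }
        if (n->data != NULL) {
            add_value(m, n->data);         /* n.data in VALUE */
            m->DATA[m->n_data].from = n;   /* <n, n.data> in DATA */
            m->DATA[m->n_data].to = n->data;
            m->n_data++;
        }
    }
}
```

Calling `build_model(&m, head)` on a list whose nodes share a `value` object yields a DATA relation with more tuples than VALUE has members, which is the kind of situation the consistency properties are meant to catch.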


Generated Model

[Figure: generated model showing the sets NODE and VALUE and the relations NEXT and DATA]


Consistency Properties

    Quantifiers, Body

• Body is first-order property of basic propositions
  • Inequality constraints on numeric fields
  • Cardinality constraints on sizes of sets
  • Referencing relationships for each object
  • Set and relation inclusion constraints
• Example:
    for n in NODE, size(NEXT.n)<=1
    for v in VALUE, size(DATA.v)=1
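The two example properties can be evaluated over a model roughly as in the C sketch below. It assumes that `NEXT.n` and `DATA.v` denote inverse images (the set of objects that refer to `n` or `v`), and it encodes objects as small integer ids. The `pair` struct and the function names are hypothetical and chosen for the example only.

```c
#include <assert.h>
#include <stddef.h>

/* A relation is a list of (from, to) pairs over integer object ids
   (an assumed encoding for this sketch). */
struct pair { int from; int to; };

/* size(R.x): number of tuples of R whose target is x. */
size_t inverse_image_size(const struct pair *rel, size_t len, int x) {
    size_t count = 0;
    for (size_t i = 0; i < len; i++)
        if (rel[i].to == x)
            count++;
    return count;
}

/* for n in NODE, size(NEXT.n) <= 1 */
int check_next(const int *NODE, size_t n_node,
               const struct pair *NEXT, size_t n_next) {
    for (size_t i = 0; i < n_node; i++)
        if (inverse_image_size(NEXT, n_next, NODE[i]) > 1)
            return 0;
    return 1;
}

/* for v in VALUE, size(DATA.v) = 1 */
int check_data(const int *VALUE, size_t n_value,
               const struct pair *DATA, size_t n_data) {
    for (size_t i = 0; i < n_value; i++)
        if (inverse_image_size(DATA, n_data, VALUE[i]) != 1)
            return 0;
    return 1;
}
```

Each check returns 1 when the property holds and 0 on the first violating object. Scanning the whole relation per object is quadratic, which is acceptable for an illustration but not how an optimized checker would do it.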


Consistency Violations

Evaluate consistency properties
    for v in VALUE, size(DATA.v)=1

[Figure: model with the sets NODE and VALUE and the relations NEXT and DATA, labeled "Inconsistency Found!!!"]


Default Instrumentation

void copynode(struct node *n) {
    /* Insert check here */
    struct node *newnode = malloc(sizeof(struct node));
    newnode->data = n->data;
    newnode->next = n->next;
    n->next = newnode;
    /* Insert check here */
}


Instrumentation

void copynode(struct node *n) {
    /* Insert check here: Pass */
    struct node *newnode = malloc(sizeof(struct node));
    newnode->data = n->data;
    newnode->next = n->next;
    n->next = newnode;
    /* Insert check here: Failed */
}
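A compilable version of the instrumented procedure might look like the sketch below. Several things here are assumptions rather than Archie output: the extra `head` and `log` parameters, the `check_log` struct, and the hand-written `check_list` function, which checks `size(DATA.v)=1` directly on an acyclic list. The failure at exit is presumably caused by the copy sharing its `data` object with the original node.

```c
#include <assert.h>
#include <stdlib.h>

struct value { int data; };
struct node  { struct node *next; struct value *data; };

/* Direct check of: for v in VALUE, size(DATA.v) = 1.
   Every value in VALUE is referenced by at least one node by construction,
   so it suffices to verify that no two nodes share a value.
   Assumes an acyclic list (a simplification for this sketch). */
int check_list(const struct node *head) {
    for (const struct node *a = head; a != NULL; a = a->next)
        for (const struct node *b = a->next; b != NULL; b = b->next)
            if (a->data != NULL && a->data == b->data)
                return 0;
    return 1;
}

/* Check results are recorded rather than aborting, so they can be inspected. */
struct check_log { int entry_ok; int exit_ok; };

void copynode(struct node *head, struct node *n, struct check_log *log) {
    log->entry_ok = check_list(head);        /* check at procedure entry */

    struct node *newnode = malloc(sizeof(struct node));
    if (newnode == NULL) {                   /* nothing changed on allocation failure */
        log->exit_ok = log->entry_ok;
        return;
    }
    newnode->data = n->data;                 /* both nodes now refer to one value */
    newnode->next = n->next;
    n->next = newnode;

    log->exit_ok = check_list(head);         /* check at procedure exit */
}
```

With a one-element list, the entry check passes and the exit check fails, which localizes the corruption to the body of `copynode`.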



Performance is a Key Issue

• Would like to perform checks as often as possible

• Performance of consistency checking limits how frequently program can check

• Have developed compiler optimizations
  • Fixed point elimination
  • Relation elimination
  • Set elimination

• Key idea: Perform checks directly on data structures (eliminating model when possible)


Fixed Point Elimination

• Evaluation of model definition rules requires fixed point computation

• Replace fixed point computation with more efficient traversal when possible
  • Compute dependence graph for model definition rules
  • Compute strongly connected components (SCCs)
  • Topologically sort SCCs
  • Eliminate fixed point computation for SCCs with no cyclic dependences
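The analysis steps listed above can be sketched in C with Tarjan's SCC algorithm, which is swapped in here as one standard way to compute SCCs; the slides do not say which algorithm Archie uses. The `rule_graph` encoding, the `MAX_RULES` bound, and the function names are assumptions for the example.

```c
#include <assert.h>

#define MAX_RULES 16   /* assumed bound for this sketch */

/* dep[a][b] != 0 means rule b reads a set or relation that rule a writes. */
struct rule_graph {
    int n;
    int dep[MAX_RULES][MAX_RULES];
};

struct scc_state {
    int index[MAX_RULES], low[MAX_RULES], on_stack[MAX_RULES];
    int stack[MAX_RULES], sp, next_index;
    int comp[MAX_RULES];   /* SCC id assigned to each rule */
    int n_comp;
};

/* Recursive step of Tarjan's algorithm. */
static void visit(const struct rule_graph *g, struct scc_state *s, int v) {
    s->index[v] = s->low[v] = s->next_index++;
    s->stack[s->sp++] = v;
    s->on_stack[v] = 1;
    for (int w = 0; w < g->n; w++) {
        if (!g->dep[v][w])
            continue;
        if (s->index[w] < 0) {
            visit(g, s, w);
            if (s->low[w] < s->low[v]) s->low[v] = s->low[w];
        } else if (s->on_stack[w] && s->index[w] < s->low[v]) {
            s->low[v] = s->index[w];
        }
    }
    if (s->low[v] == s->index[v]) {      /* v is the root of an SCC */
        int w;
        do {
            w = s->stack[--s->sp];
            s->on_stack[w] = 0;
            s->comp[w] = s->n_comp;
        } while (w != v);
        s->n_comp++;
    }
}

/* Sets needs_fixpoint[r] = 1 iff rule r lies in an SCC with a cyclic
   dependence: more than one rule, or a rule that depends on itself. */
void find_fixpoint_rules(const struct rule_graph *g, int needs_fixpoint[]) {
    struct scc_state s;
    int size[MAX_RULES] = {0};

    s.sp = s.next_index = s.n_comp = 0;
    for (int v = 0; v < g->n; v++) {
        s.index[v] = -1;
        s.on_stack[v] = 0;
    }
    for (int v = 0; v < g->n; v++)
        if (s.index[v] < 0)
            visit(g, &s, v);
    for (int v = 0; v < g->n; v++)
        size[s.comp[v]]++;
    for (int v = 0; v < g->n; v++)
        needs_fixpoint[v] = size[s.comp[v]] > 1 || g->dep[v][v] != 0;
}
```

Tarjan's algorithm numbers components in reverse topological order, so evaluating components from the highest id down respects the dependences without a separate sort. For the list example, only the rule that adds `n.next` to NODE depends on itself, so it would be the only rule that still needs a fixed point.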


Relation Elimination

Model Definition Rules:
    for i in 0..C, true => f[i] in S
    for s in S, true => <s, s.r> in R
    for s in S, !s.q=NULL => <s, s.q> in Q
    for s in S, !s.q=NULL => s.q in T

Model Constraints:
    for s in S, MIN<=s.R and s.R<=MAX
    for t in T, (Q.t).R!=K


Relation Elimination

Model Definition Rules:
    for i in 0..C, true => f[i] in S
    for s in S, true => <s, s.r> in R
    for s in S, !s.q=NULL => <s, s.q> in Q
    for s in S, !s.q=NULL => s.q in T

Model Constraints:
    for s in S, MIN<=s.r and s.r<=MAX
    for t in T, (Q.t).r!=K


Model Definition Rule Inlining

Model Definition Rules:
    for i in 0..C, true => f[i] in S
    for s in S, !s.q=NULL => <s, s.q> in Q
    for s in S, !s.q=NULL => s.q in T


Model Definition Rule Inlining

Model Definition Rules:
    for i in 0..C, true => f[i] in S
        !f[i].q=NULL => <f[i], f[i].q> in Q
        !f[i].q=NULL => f[i].q in T


Constraint Inlining

Model Definition Rules:
    for i in 0..C, true => f[i] in S
        !f[i].q=NULL => <f[i], f[i].q> in Q
        !f[i].q=NULL => f[i].q in T

Model Constraints:
    for s in S, MIN<=s.r and s.r<=MAX
    for t in T, (Q.t).r!=K


Constraint Inlining

Model Definition Rules:
    for i in 0..C, true => f[i] in S
        !f[i].q=NULL => <f[i], f[i].q> in Q
        !f[i].q=NULL => f[i].q in T
        MIN<=f[i].r and f[i].r<=MAX

Model Constraints:
    for t in T, (Q.t).r!=K


Set Elimination

Model Definition Rules:
    for i in 0..C, true => f[i] in S
        !f[i].q=NULL => <f[i], f[i].q> in Q
        !f[i].q=NULL => f[i].q in T
        MIN<=f[i].r and f[i].r<=MAX

Model Constraints:
    for t in T, (Q.t).r!=K
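The effect of these transformations can be pictured with the C sketch below: the range constraint is checked while scanning `f`, so neither S nor R is ever built, and only Q is still materialized for the remaining constraint. This is a guess at the shape of the optimized checker, not Archie's actual output. The struct layouts, the parameter names (`count`, `lo`, `hi`, `k` standing in for C, MIN, MAX, K), the reading of `0..C` as `count` elements, and the choice to represent T implicitly as the targets of Q are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

struct t_obj { int unused; };
struct s_obj { int r; struct t_obj *q; };

/* One tuple <s, s.q> of the relation Q. */
struct q_pair { struct s_obj *s; struct t_obj *t; };

/* q_rel must have room for count tuples; entries of f are assumed non-NULL.
   Returns 1 if both constraints hold, 0 otherwise. */
int check_optimized(struct s_obj *const f[], size_t count,
                    int lo, int hi, int k,
                    struct q_pair q_rel[], size_t *n_q) {
    *n_q = 0;
    for (size_t i = 0; i < count; i++) {
        struct s_obj *s = f[i];

        /* Inlined constraint: MIN <= f[i].r and f[i].r <= MAX. */
        if (s->r < lo || s->r > hi)
            return 0;

        /* !f[i].q = NULL => <f[i], f[i].q> in Q (and f[i].q in T). */
        if (s->q != NULL) {
            q_rel[*n_q].s = s;
            q_rel[*n_q].t = s->q;
            (*n_q)++;
        }
    }

    /* Remaining model constraint: for t in T, (Q.t).r != K.
       Every t in T is the target of some tuple of Q, so scanning Q covers T. */
    for (size_t j = 0; j < *n_q; j++)
        if (q_rel[j].s->r == k)
            return 0;
    return 1;
}
```

Compared with building S, R, Q and T first and evaluating the constraints afterwards, this version touches each `f[i]` once for the range check and allocates storage only for Q.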



Freeciv Benchmark

• Multiplayer Client/Server based online game
• Available at www.freeciv.org
• Looked at the server
• Server contains 73,000 lines of code
• Added 750 instrumentation sites
• 20,000 consistency checks performed in our sample execution


Performance Evaluation

• Fixed point elimination (47x speedup)
• Relation construction elimination (110x speedup)
• Set construction elimination (820x speedup)
• Bottom line
  • Baseline compiled version 5,100 times slower than uninstrumented
  • Optimized version 6 times slower than uninstrumented
  • Optimized version can be used interactively


User Study

• Designed to answer following question:

Does inconsistency detection help developers to more quickly localize and correct detected data structure corruption errors?


User Study

• Created three buggy versions of Freeciv
• Two groups of three developers
  • One used conventional tools
  • One used specification-based consistency checking
• Each participant was asked to spend at least one hour on each version
• Both populations given an instrumented version of Freeciv


Results

[Figure: two charts, "With Archie" and "Without Archie", each plotting Time (min) on a 0 to 100 scale for the 1st Bug, 2nd Bug, and 3rd Bug]


Extension: Data Structure Repair

• Do not stop program with inconsistent data
• Instead, use consistency specification to repair data structure and keep executing!
  • Input: inconsistent data structure
  • Output: consistent data structure
• “Automatic detection and repair of errors in data structures” (Demsky, Rinard OOPSLA 2003)
  • Repair enables continued execution
  • All programs execute successfully after repair


Related Work

• Specification languages such as UML or Alloy
• Specification-based testing
  • Korat (Boyapati et al. ISSTA 2002)
  • Testera (Marinov and Khurshid ASE 2001)
  • Eiffel (Meyer 1992)
• Invariant inference and checking
  • Daikon (Ernst et al. ICSE 1999)
  • DIDUCE (Hangal and Lam ICSE 2002)
  • Carrot (Pytlik et al. 2003)
• Debugging tools
  • AskIgor (Zeller FSE 2002)
  • Debugging Backwards in Time (Lewis AADEBUG 2003)


Conclusion

• Consistency checking to localize data structure corruption bugs
• Optimizations for good performance
• Experimental confirmation that consistency checking may be useful
• Data structure repair