42
Figh%ng So*ware Inefficiency Through Automated Bug Detec%on Shan Lu University of Chicago 1

Fighting Software Inefficiency Through Automated Bug Detection

Embed Size (px)

Citation preview

Page 1: Fighting Software Inefficiency Through Automated Bug Detection

Figh%ngSo*wareInefficiencyThroughAutomatedBugDetec%on

ShanLuUniversityofChicago

1

Page 2: Fighting Software Inefficiency Through Automated Bug Detection

Ali=lebitaboutmyself

2

Page 3: Fighting Software Inefficiency Through Automated Bug Detection

MylifeJ

3

Page 4: Fighting Software Inefficiency Through Automated Bug Detection

Ilearnedalotfrom…

4

Page 5: Fighting Software Inefficiency Through Automated Bug Detection

BugDetec%onBackground

5

Page 6: Fighting Software Inefficiency Through Automated Bug Detection

Figh%ngso*warebugsiscrucial

•  So3wareiseverywhere– h5p://en.wikipedia.org/wiki/List_of_so3ware_bugs

•  So3warebugsarewidespreadandcostly– Leadto40%systemdownEme[Blueprints2000]– Cost312Billionlostperyear[Cambridge2013]

6

Page 7: Fighting Software Inefficiency Through Automated Bug Detection

Whatisyourfavoritebug?

•  Howmanyofyouhavebeenbotheredbybugs?

7

Page 8: Fighting Software Inefficiency Through Automated Bug Detection

Howtodetectbugs?

•  Study&understandreal-worldbugs•  Discoverpa5ernsofcommonbugs

– Sourcecodelevel– Binarycodelevel– …

•  Designpa5ern-matchingprogramanalysis– StaEcanalysis– Dynamicanalysis– …

8

Page 9: Fighting Software Inefficiency Through Automated Bug Detection

Bugdetec%onexamples– MemorybugdetecEon

•  Pa5ern:over-boundwrites,…•  DetecEon:checkmemoryaccesses&…

– ConcurrencybugdetecEon•  Pa5ern:dataraces,atomicityviolaEons,…•  DetecEon:checkmemoryaccesses&…

9

P = malloc (10); P[100] = ‘a’;

if (P) *P=‘a’;

P=NULL;

Page 10: Fighting Software Inefficiency Through Automated Bug Detection

WhyPerformance-BugDetec%on?

10

Page 11: Fighting Software Inefficiency Through Automated Bug Detection

Howdidthisstart?

•  Oneofourbugdetectorsisstrangelyslow

– Whynotprofiling?•  Lotsofnoisesinprofiling•  Measuringcostnotinefficiency

•  AmIabletodothis?•  WhereshouldIstart?

11

Page 12: Fighting Software Inefficiency Through Automated Bug Detection

AnEmpiricalStudyofReal-WorldPerformanceBugs

Arethereperformancebugs?Aretheyimportant?

Whattypesofperformancebugsarethere?

12 Understanding and detecting real-world performance bugs [PLDI '12]

Page 13: Fighting Software Inefficiency Through Automated Bug Detection

Methodology

13

ApplicaEon

Apache

Chrome

GCC

Mozilla

MySQL

So3wareType

ServerSo3ware

GUIApplicaEon

GUIApplicaEon

Compiler

Command-lineUElity+Server+Library

Language

C/Java

C/C++

C/C++

C++/JS

C/C++/C#

MLOC

1.3

BugDBHistory Tags

Compile-Eme-hog 5.7

4.7

14.0

N/A

N/A

perf

S5

0.45

14y

13y

10y

13y

4y

#Bugs

25

10

10

36

28

Total:109

Page 14: Fighting Software Inefficiency Through Automated Bug Detection

Findings

•  Arethereperformancebugs?– Yes

•  Aretheyimportant?– Someare

•  Whattypesofperformancebugsarethere?– Whataretheirrootcauses?– Wherearetheytypicallylocated?– Howaretheyusuallyfixed?

14

Page 15: Fighting Software Inefficiency Through Automated Bug Detection

BugExamples

15

+inti=-k.length();-while(s.indexOf(k)==-1){+while(i++<0||+s.substring(i).indexOf(k)==-1){s.append(nextchar());}

PatchforApache-AntBug34464

for(i=0;i<tabs.length;i++){…tabs[i].doTransact();}+doAggregateTransact(tabs);

MozillaBug490742&Patch

Page 16: Fighting Software Inefficiency Through Automated Bug Detection

Whatisnext?

Canwedetectperformancebugs?What“pa5ern”didwefind?

16

Page 17: Fighting Software Inefficiency Through Automated Bug Detection

APatch-BasedInefficiencyDetector

17

Page 18: Fighting Software Inefficiency Through Automated Bug Detection

Sta%cinefficiencypa=ernsexist

•  StaEcallycheckableinefficiencypa5ernsexist

+inti=-k.length();-while(s.indexOf(k)==-1){+while(i++<0||+s.substring(i).indexOf(k)==-1){s.append(nextchar());}

PatchforApache-AntBug34464

for(i=0;i<tabs.length;i++){…tabs[i].doTransact();}+doAggregateTransact(tabs);

MozillaBug490742&Patch

Page 19: Fighting Software Inefficiency Through Automated Bug Detection

Howtogetthesepa=erns?

•  Manuallyextractfrompatches

19

NotContainRules

DynamicRules

LLVMCheckers

PythonCheckers

Page 20: Fighting Software Inefficiency Through Automated Bug Detection

Detec%onResults

•  17checkersfindPPPsinoriginalbuggyversions•  13checkersfind332PPPsinlatestversions

Foundbycross-applicaKon

checking

Inheritsfrombuggyversions

Introducedlater

*PPP:Poten6alPerformanceProblem

Inefficiencypa=ernbasedbugdetec%onispromising!

Page 21: Fighting Software Inefficiency Through Automated Bug Detection

Whatisnext?

Dowehavetomanuallyspecifyrules?Canwebuildgenericdetectors?

21

Page 22: Fighting Software Inefficiency Through Automated Bug Detection

Toddler

AdynamicandgenericdetectortargeEnginefficientnestedloops

22 Toddler: Detecting Performance Problems via Similar Memory-Access Patterns [ICSE '13]

Page 23: Fighting Software Inefficiency Through Automated Bug Detection

Whataregenericinefficiencypa=erns?

23

Page 24: Fighting Software Inefficiency Through Automated Bug Detection

Previousexample

24

while(s.indexOf(k)==-1){s.append(nextchar());}

Apache-AntBug34464

Password: abcdefg Password: abcdefgh Password: abcdefghi

Page 25: Fighting Software Inefficiency Through Automated Bug Detection

Whatisthepa=ern?

•  Whattypeofnestedloopsarelikelyinefficient?– Manyinnerloopsaresimilarwitheachother

•  SomeinstrucEonskeepsreadingsimilarsequencesofvalues

25

abcdefg abcdefgh abcdefghi

Page 26: Fighting Software Inefficiency Through Automated Bug Detection

Howtodetect?•  Input:Testcode+systemundertest•  Steps:

1.  InstrumentthesystemundertestMonitorloopstarts/ends&memoryreadsinsideloops

2.  AnalyzetraceproducedbyinstrumentaEonIdenEfyrepeEEvememory-readsequences

•  Output:Loopsthatarelikelyperformancebugs

26

Page 27: Fighting Software Inefficiency Through Automated Bug Detection

Evalua%onSubjectsandNewBugs

27

Applica%on Descrip%on LOC KnownBugs NewBugs Fixed Confirmed

Ant Buildtool 109,765 1 8 1 0

ApacheCollecEons CollecEonslibrary 51,516 1 20 10 4

Groovy Dynamiclanguage 136,994 1 2 2 0

GoogleCoreLibraries CollecEonslibrary 156,004 2 10 1 2

JFreeChart Chartframework 64,184 1 1 0 0

Jmeter LoadtesEngtool 86,549 1 0 0 0

Lucene Textsearchengine 320,899 2 0 0 0

PDFBox PDFframework 78,578 1 0 0 0

Solr Searchserver 373,138 1 0 0 0

JDKstandardlibrary 2 0 0

JUnittesEngframework 1 1 0

9Apps+2Libs 50,000–320,000 11 44 15 6

Page 28: Fighting Software Inefficiency Through Automated Bug Detection

Toddlervs.HProf

28

KnownBugBugDetected? FalseP. Rank Slowdown

TODD. PROF TODD. PROF TODD. PROF

Ant ü û 0 19.3 13.7 4.2

ApacheCollecEons ü ü 0 1.0 10.0 2.1

Groovy ü ü 0 3.7 15.5 3.7

GoogleCoreLibraries#1 ü ü 0 1.8 9.0 3.8

GoogleCoreLibraries#2 ü û 0 5.3 7.5 3.2

JFreeChart ü û 0 53.7 13.4 8.8

JMeter ü û 0 10.3 8.5 1.9

Lucene#1 ü û 0 7.7 6.8 2.5

Lucene#2 ü ü 0 3.1 25.4 3.1

PDFBox ü û 1 18.8 51.8 12.1

Solr ü û 0 178.3 114.2 7.1

11 11 4 1 n/a 15.9X 4.0X

Page 29: Fighting Software Inefficiency Through Automated Bug Detection

Whatisnext?

Whysomanybugsarenotfixedbydevelopers?

29

Page 30: Fighting Software Inefficiency Through Automated Bug Detection

Caramel

AstaEcandgenericdetectortargeEnginefficientloopswith

simplepatches

30

CARAMEL: Detecting and Fixing Performance Problems That Have Non-Intrusive Fixes [ICSE'15] Won SIGSOFT Distinguished Paper Award

Page 31: Fighting Software Inefficiency Through Automated Bug Detection

Whatareperf.bugsnotfixed?

31

Correctness Maintainability Manual effort

Potential speedup under certain workload

Can we detect bugs with simple fixes?

Page 32: Fighting Software Inefficiency Through Automated Bug Detection

Whatisthepa=ern?

•  Whatisatypicalsimplefixforaninefficientloop?

•  Whattypesofbugshavetheabovetypeoffix?– WethoughtforalongEme…

32

for(…) + if (cond) break;

Page 33: Fighting Software Inefficiency Through Automated Bug Detection

Bugexample

33

booleanalreadyPresent=false;while(isActualEmbeddedProperty.hasNext()){if(alreadyPresent)break;//CondBreakFIXif(oldVal.getStr().equals(newVal.getStr()))alreadyPresent=true;if(!alreadyPresent)prop.container().addProp(newVal);//sideeffect}

•  NewbuginPDFBoxfoundbyus,fixedbydevelopers

•  DevelopersfixbugsthathaveCondBreakfixes:– WastecomputaEoninloops– Fixisnon-intrusive

Page 34: Fighting Software Inefficiency Through Automated Bug Detection

WhatBugsHaveCondBreakFixes?

34

EveryItera%on

LateItera%ons

EarlyItera%ons

No-Result Type1 Type2 TypeY

Useless-Result TypeX Type3 Type4

Where Is Computation Wasted? How Is Computation Wasted?

Page 35: Fighting Software Inefficiency Through Automated Bug Detection

Ingredient1:ResultInstruc%on

booleanalreadyPresent=false;while(isActualEmbeddedProperty.hasNext()){if(alreadyPresent)break;//CondBreakFIXif(oldVal.getStr().equals(newVal.getStr()))alreadyPresent=true;if(!alreadyPresent)prop.container().addProp(newVal);//sideeffect}

ResultInstrucEon

Page 36: Fighting Software Inefficiency Through Automated Bug Detection

Ingredient2:Instruc%on-Condi%on

booleanalreadyPresent=false;while(isActualEmbeddedProperty.hasNext()){if(alreadyPresent)break;//CondBreakFIXif(oldVal.getStr().equals(newVal.getStr()))alreadyPresent=true;if(!alreadyPresent)prop.container().addProp(newVal);//ResultIns.}

InstrucEon-CondiEon

36

Page 37: Fighting Software Inefficiency Through Automated Bug Detection

Ingredient3:Loop-Condi%on

booleanalreadyPresent=false;while(isActualEmbeddedProperty.hasNext()){if(alreadyPresent)break;//CondBreakFIXif(oldVal.getStr().equals(newVal.getStr()))alreadyPresent=true;if(!alreadyPresent)prop.container().addProp(newVal);//ResultIns.}

InstrucEon-CondiEon

AlsoLoop-CondiEon

37

Page 38: Fighting Software Inefficiency Through Automated Bug Detection

Evalua%onSubjectsandNewBugs

•  15applica%ons–  11Java,4C/C++

•  150newbugs•  116bugsfixed

–  51inJava–  65inC/C++

•  Only4rejected•  22bugsinGCCfixed•  149/150fixedautomaEcally

•  Only23falseposiEves

Applica%on Descrip%on LOC Bugs

Ant Buildtool 140,674 1

Groovy Dynamiclanguage 161,487 9

JMeter LoadtesEngtool 114,645 4

Log4J Loggingframework 51,936 6

Lucene Textsearchengine 441,649 14

PDFBox PDFframework 108,796 10

Sling Webapp.framework 202,171 6

Solr Searchserver 176,937 2

Struts Webapp.framework 175,026 4

Tika ContentextracEon 50,503 1

Tomcat Webserver 295,223 4

GoogleChrome Webbrowser 13,371,208 22

GCC Compiler 1,445,425 22

Mozilla Webbrowser 5,893,397 27

MySQL Databaseserver 1,774,926 18

Page 39: Fighting Software Inefficiency Through Automated Bug Detection

Differentaspectsoffigh%ngbugs

39

In-housebugdetecEon

In-fieldfailurerecovery

In-fieldfailurediagnosis

In-housebugfixing

Lowoverhead

Highaccuracy Highaccuracy

Page 40: Fighting Software Inefficiency Through Automated Bug Detection

Workfrommygroup

40

In-housebugdetecEon

[ASPLOS06];[SOSP07];[ASPLOS09];[ASPLOS10];[ASPLOS11];[OOPSLA13]

[PLDI12];[ICSE13];[ICSE15]

In-fieldfailurerecovery

[ASPLOS13.A][FSE14]

Notyet

In-fieldfailurediagnosis

[OOPSLA10];[ASPLOS13.B];[ASPLOS14];[OOPSLA16*]

[OOPSLA14]

In-housebugfixing

[PLDI11];[OSDI12];[FSE16]

[CAV13]

concurrencybugs

performancebugs

Page 41: Fighting Software Inefficiency Through Automated Bug Detection

Conclusions&FutureWork

41

Constraints/Requirements

Techniques

Bugs

Page 42: Fighting Software Inefficiency Through Automated Bug Detection

Thanks!Ques%ons?

42

Mycollaborators•  Prof.DarkoMarinov•  AdrianNistor•  LinhaiSong