Fighting Software Inefficiency Through Automated Bug Detection


Shan Lu, University of Chicago

A little bit about myself

My life

I learned a lot from…

Bug Detection Background

Fighting software bugs is crucial

• Software is everywhere
  – http://en.wikipedia.org/wiki/List_of_software_bugs
• Software bugs are widespread and costly
  – Lead to 40% system downtime [Blueprints 2000]
  – Cost 312 billion lost per year [Cambridge 2013]

What is your favorite bug?

• How many of you have been bothered by bugs?

How to detect bugs?

• Study & understand real-world bugs
• Discover patterns of common bugs
  – Source code level
  – Binary code level
  – …
• Design pattern-matching program analysis
  – Static analysis
  – Dynamic analysis
  – …

Bug detection examples

– Memory bug detection
  • Pattern: over-bound writes, …
  • Detection: check memory accesses & …

      P = malloc(10);
      P[100] = 'a';

– Concurrency bug detection
  • Pattern: data races, atomicity violations, …
  • Detection: check memory accesses & …

      Thread 1: if (P) *P = 'a';
      Thread 2: P = NULL;

Why Performance-Bug Detection?

How did this start?

• One of our bug detectors is strangely slow
  – Why not profiling?
    • Lots of noise in profiling
    • Measuring cost, not inefficiency
• Am I able to do this?
• Where should I start?

An Empirical Study of Real-World Performance Bugs

Are there performance bugs? Are they important?

What types of performance bugs are there?

Understanding and detecting real-world performance bugs [PLDI '12]

Methodology

Application   Software Type                             Language   Tags               #Bugs
Apache        Command-line Utility + Server + Library   C/Java     N/A                25
Chrome        GUI Application                           C/C++      N/A                10
GCC           Compiler                                  C/C++      Compile-time-hog   10
Mozilla       GUI Application                           C++/JS     perf               36
MySQL         Server Software                           C/C++/C#   S5                 28
Total                                                                                 109

MLOC across the five applications: 0.45, 1.3, 4.7, 5.7, 14.0
Bug DB History across the five applications: 4y, 10y, 13y, 13y, 14y

Findings

• Are there performance bugs?
  – Yes
• Are they important?
  – Some are
• What types of performance bugs are there?
  – What are their root causes?
  – Where are they typically located?
  – How are they usually fixed?

Bug Examples

    + int i = -k.length();
    - while (s.indexOf(k) == -1) {
    + while (i++ < 0 ||
    +        s.substring(i).indexOf(k) == -1) {
        s.append(nextchar());
      }

Patch for Apache-Ant Bug 34464

      for (i = 0; i < tabs.length; i++) {
        …
        tabs[i].doTransact();
      }
    + doAggregateTransact(tabs);

Mozilla Bug 490742 & Patch
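The Mozilla patch replaces one transaction per tab with a single aggregate transaction. The following is a minimal Java sketch of why that shape tends to be cheaper. It assumes a hypothetical `Store` class whose `commit()` stands in for the expensive per-transaction work; `Store`, `TransactDemo`, and their methods are illustrative names, not Mozilla's code or API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a transactional store; not Mozilla's API.
final class Store {
    final List<String> saved = new ArrayList<>();
    int commits = 0; // each commit models one expensive flush

    void commit(List<String> batch) {
        saved.addAll(batch);
        commits++;
    }
}

final class TransactDemo {
    // Buggy shape: one transaction (and one commit) per tab.
    static void transactEach(Store store, List<String> tabs) {
        for (String tab : tabs) {
            store.commit(Arrays.asList(tab));
        }
    }

    // Patched shape: one aggregate transaction covering all tabs.
    static void transactAggregate(Store store, List<String> tabs) {
        store.commit(tabs);
    }

    public static void main(String[] args) {
        List<String> tabs = Arrays.asList("t1", "t2", "t3");
        Store perTab = new Store();
        transactEach(perTab, tabs);
        Store aggregate = new Store();
        transactAggregate(aggregate, tabs);
        System.out.println(perTab.commits + " commits vs " + aggregate.commits);
    }
}
```

Both versions save the same tabs; the per-tab loop pays the commit cost once per tab, while the aggregate version pays it once.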

What is next?

Can we detect performance bugs?
What “pattern” did we find?

A Patch-Based Inefficiency Detector

Static inefficiency patterns exist

• Statically checkable inefficiency patterns exist

    + int i = -k.length();
    - while (s.indexOf(k) == -1) {
    + while (i++ < 0 ||
    +        s.substring(i).indexOf(k) == -1) {
        s.append(nextchar());
      }

Patch for Apache-Ant Bug 34464

      for (i = 0; i < tabs.length; i++) {
        …
        tabs[i].doTransact();
      }
    + doAggregateTransact(tabs);

Mozilla Bug 490742 & Patch

How to get these patterns?

• Manually extract from patches

Not Contain Rules
Dynamic Rules
LLVM Checkers
Python Checkers

Detection Results

• 17 checkers find PPPs in original buggy versions
• 13 checkers find 332 PPPs in latest versions
  – Found by cross-application checking
  – Inherits from buggy versions
  – Introduced later

* PPP: Potential Performance Problem

Inefficiency pattern based bug detection is promising!

What is next?

Do we have to manually specify rules?
Can we build generic detectors?

Toddler

A dynamic and generic detector targeting inefficient nested loops

Toddler: Detecting Performance Problems via Similar Memory-Access Patterns [ICSE '13]

What are generic inefficiency patterns?

Previous example

    while (s.indexOf(k) == -1) {
      s.append(nextchar());
    }

Apache-Ant Bug 34464

Password: abcdefg
Password: abcdefgh
Password: abcdefghi

What is the pattern?

• What type of nested loops are likely inefficient?
  – Many inner loops are similar to each other
• Some instructions keep reading similar sequences of values

abcdefg
abcdefgh
abcdefghi
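To make the redundancy concrete, here is a small Java sketch of the same loop shape. It is not the actual Ant code or patch: `ReadUntil` and its methods are hypothetical names, the input is a plain `String` instead of a stream, and the second method uses a tail-only check as one possible way to avoid rescanning.

```java
final class ReadUntil {
    // Original shape: rescans the whole buffer after every appended character,
    // so the total work grows roughly quadratically with the input length.
    static String readUntilRescan(String input, String k) {
        StringBuilder s = new StringBuilder();
        int next = 0;
        while (s.indexOf(k) == -1 && next < input.length()) {
            s.append(input.charAt(next++));
        }
        return s.toString();
    }

    // Tail-only variant: earlier positions were already checked, so a first
    // match can only end at the newest character.
    static String readUntilTail(String input, String k) {
        StringBuilder s = new StringBuilder();
        int next = 0;
        while (!endsWith(s, k) && next < input.length()) {
            s.append(input.charAt(next++));
        }
        return s.toString();
    }

    // True when the buffer currently ends with k.
    static boolean endsWith(StringBuilder s, String k) {
        int start = s.length() - k.length();
        return start >= 0 && s.indexOf(k, start) == start;
    }

    public static void main(String[] args) {
        System.out.println(readUntilRescan("login: Password: abc", "Password:"));
        System.out.println(readUntilTail("login: Password: abc", "Password:"));
    }
}
```

Both methods stop at the same point; the difference is that the first one re-reads `abcdefg`, then `abcdefgh`, then `abcdefghi`, which is exactly the similar-sequence behavior described above.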

How to detect?

• Input: Test code + system under test
• Steps:
  1. Instrument the system under test
     Monitor loop starts/ends & memory reads inside loops
  2. Analyze trace produced by instrumentation
     Identify repetitive memory-read sequences
• Output: Loops that are likely performance bugs
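The analysis step can be sketched roughly as follows. This is a simplified illustration, not Toddler's implementation: it assumes the trace has already been reduced to one value sequence per inner-loop execution, and it measures similarity by common-prefix length, which is a choice made for this sketch only. The class name, method names, and thresholds are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class SimilarReads {
    // Share of the longer sequence covered by the common prefix of a and b.
    // Prefix matching is an assumption of this sketch, not Toddler's metric.
    static double prefixSimilarity(int[] a, int[] b) {
        int shorter = Math.min(a.length, b.length);
        int longer = Math.max(a.length, b.length);
        int common = 0;
        while (common < shorter && a[common] == b[common]) {
            common++;
        }
        return longer == 0 ? 1.0 : (double) common / longer;
    }

    // readsPerExecution.get(i) holds the values one instruction read during
    // the i-th execution of an inner loop. The loop is flagged when at least
    // minSimilarPairs consecutive executions read similar sequences.
    static boolean looksRepetitive(List<int[]> readsPerExecution,
                                   double minSimilarity, int minSimilarPairs) {
        int run = 0;
        for (int i = 1; i < readsPerExecution.size(); i++) {
            double sim = prefixSimilarity(readsPerExecution.get(i - 1),
                                          readsPerExecution.get(i));
            run = sim >= minSimilarity ? run + 1 : 0;
            if (run >= minSimilarPairs) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<int[]> trace = Arrays.asList(
            "abcdefg".chars().toArray(),
            "abcdefgh".chars().toArray(),
            "abcdefghi".chars().toArray());
        System.out.println(looksRepetitive(trace, 0.8, 2));
    }
}
```

With the three read sequences from the Ant example, consecutive executions share almost all of their values, so the loop is reported as likely inefficient.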

Evaluation Subjects and New Bugs

Application             Description           LOC              Known Bugs   New Bugs   Fixed   Confirmed
Ant                     Build tool            109,765          1            8          1       0
Apache Collections      Collections library   51,516           1            20         10      4
Groovy                  Dynamic language      136,994          1            2          2       0
Google Core Libraries   Collections library   156,004          2            10         1       2
JFreeChart              Chart framework       64,184           1            1          0       0
JMeter                  Load testing tool     86,549           1            0          0       0
Lucene                  Text search engine    320,899          2            0          0       0
PDFBox                  PDF framework         78,578           1            0          0       0
Solr                    Search server         373,138          1            0          0       0
JDK                     Standard library      -                -            2          0       0
JUnit                   Testing framework     -                -            1          1       0
9 Apps + 2 Libs                               50,000–320,000   11           44         15      6

Toddler vs. HProf

Known Bug                  Bug Detected?    False P.   Rank    Slowdown
                           TODD.   PROF     TODD.      PROF    TODD.    PROF
Ant                        yes     no       0          19.3    13.7     4.2
Apache Collections         yes     yes      0          1.0     10.0     2.1
Groovy                     yes     yes      0          3.7     15.5     3.7
Google Core Libraries #1   yes     yes      0          1.8     9.0      3.8
Google Core Libraries #2   yes     no       0          5.3     7.5      3.2
JFreeChart                 yes     no       0          53.7    13.4     8.8
JMeter                     yes     no       0          10.3    8.5      1.9
Lucene #1                  yes     no       0          7.7     6.8      2.5
Lucene #2                  yes     yes      0          3.1     25.4     3.1
PDFBox                     yes     no       1          18.8    51.8     12.1
Solr                       yes     no       0          178.3   114.2    7.1
11                         11      4        1          n/a     15.9X    4.0X

What is next?

Why are so many bugs not fixed by developers?

Caramel

A static and generic detector targeting inefficient loops with simple patches

CARAMEL: Detecting and Fixing Performance Problems That Have Non-Intrusive Fixes [ICSE'15] Won SIGSOFT Distinguished Paper Award

Why are perf. bugs not fixed?

Correctness, Maintainability, Manual effort

Potential speedup under certain workload

Can we detect bugs with simple fixes?

What is the pattern?

• What is a typical simple fix for an inefficient loop?

      for (…)
    +   if (cond) break;

• What types of bugs have the above type of fix?
  – We thought for a long time…

Bug example

    boolean alreadyPresent = false;
    while (isActualEmbeddedProperty.hasNext()) {
      if (alreadyPresent) break;  // CondBreak FIX
      if (oldVal.getStr().equals(newVal.getStr()))
        alreadyPresent = true;
      if (!alreadyPresent)
        prop.container().addProp(newVal);  // side effect
    }

• New bug in PDFBox found by us, fixed by developers
• Developers fix bugs that have CondBreak fixes:
  – Waste computation in loops
  – Fix is non-intrusive
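A self-contained Java sketch of the CondBreak idea follows. It is a simplified search loop, not the PDFBox code: `CondBreakDemo`, its methods, and the `iterations` counter are hypothetical and exist only to show how many loop bodies run with and without the added break.

```java
import java.util.Arrays;
import java.util.List;

final class CondBreakDemo {
    static int iterations; // loop bodies executed by the last call, for illustration

    // Buggy shape: keeps iterating after the result is already known.
    static boolean containsNoBreak(List<String> values, String target) {
        iterations = 0;
        boolean alreadyPresent = false;
        for (String v : values) {
            iterations++;
            if (v.equals(target)) {
                alreadyPresent = true;
            }
        }
        return alreadyPresent;
    }

    // Same loop with a CondBreak-style fix: one added line, same result.
    static boolean containsWithBreak(List<String> values, String target) {
        iterations = 0;
        boolean alreadyPresent = false;
        for (String v : values) {
            if (alreadyPresent) break; // CondBreak-style fix
            iterations++;
            if (v.equals(target)) {
                alreadyPresent = true;
            }
        }
        return alreadyPresent;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "c", "d", "e");
        containsNoBreak(values, "b");
        System.out.println("without break: " + iterations);
        containsWithBreak(values, "b");
        System.out.println("with break: " + iterations);
    }
}
```

The fix changes neither the return value nor the loop structure, which is what makes this kind of patch non-intrusive.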

What Bugs Have CondBreak Fixes?

                             Where Is Computation Wasted?
How Is Computation Wasted?   Every Iteration   Late Iterations   Early Iterations
No-Result                    Type 1            Type 2            Type Y
Useless-Result               Type X            Type 3            Type 4

Ingredient 1: Result Instruction

    boolean alreadyPresent = false;
    while (isActualEmbeddedProperty.hasNext()) {
      if (alreadyPresent) break;  // CondBreak FIX
      if (oldVal.getStr().equals(newVal.getStr()))
        alreadyPresent = true;
      if (!alreadyPresent)
        prop.container().addProp(newVal);  // side effect
    }

Result Instruction

Ingredient 2: Instruction-Condition

    boolean alreadyPresent = false;
    while (isActualEmbeddedProperty.hasNext()) {
      if (alreadyPresent) break;  // CondBreak FIX
      if (oldVal.getStr().equals(newVal.getStr()))
        alreadyPresent = true;
      if (!alreadyPresent)
        prop.container().addProp(newVal);  // Result Ins.
    }

Instruction-Condition

Ingredient 3: Loop-Condition

    boolean alreadyPresent = false;
    while (isActualEmbeddedProperty.hasNext()) {
      if (alreadyPresent) break;  // CondBreak FIX
      if (oldVal.getStr().equals(newVal.getStr()))
        alreadyPresent = true;
      if (!alreadyPresent)
        prop.container().addProp(newVal);  // Result Ins.
    }

Instruction-Condition
Also Loop-Condition

Evaluation Subjects and New Bugs

• 15 applications
  – 11 Java, 4 C/C++
• 150 new bugs
• 116 bugs fixed
  – 51 in Java
  – 65 in C/C++
• Only 4 rejected
• 22 bugs in GCC fixed
• 149/150 fixed automatically
• Only 23 false positives

Application     Description          LOC          Bugs
Ant             Build tool           140,674      1
Groovy          Dynamic language     161,487      9
JMeter          Load testing tool    114,645      4
Log4J           Logging framework    51,936       6
Lucene          Text search engine   441,649      14
PDFBox          PDF framework        108,796      10
Sling           Web app. framework   202,171      6
Solr            Search server        176,937      2
Struts          Web app. framework   175,026      4
Tika            Content extraction   50,503       1
Tomcat          Web server           295,223      4
Google Chrome   Web browser          13,371,208   22
GCC             Compiler             1,445,425    22
Mozilla         Web browser          5,893,397    27
MySQL           Database server      1,774,926    18

Different aspects of fighting bugs

• In-house bug detection
• In-field failure recovery
• In-field failure diagnosis
• In-house bug fixing

Low overhead
High accuracy
High accuracy

Work from my group

• In-house bug detection
  – Concurrency bugs: [ASPLOS06]; [SOSP07]; [ASPLOS09]; [ASPLOS10]; [ASPLOS11]; [OOPSLA13]
  – Performance bugs: [PLDI12]; [ICSE13]; [ICSE15]
• In-field failure recovery
  – Concurrency bugs: [ASPLOS13.A]; [FSE14]
  – Performance bugs: Not yet
• In-field failure diagnosis
  – Concurrency bugs: [OOPSLA10]; [ASPLOS13.B]; [ASPLOS14]; [OOPSLA16*]
  – Performance bugs: [OOPSLA14]
• In-house bug fixing
  – Concurrency bugs: [PLDI11]; [OSDI12]; [FSE16]
  – Performance bugs: [CAV13]

Conclusions & Future Work

Constraints/Requirements
Techniques
Bugs

Thanks! Questions?

My collaborators
• Prof. Darko Marinov
• Adrian Nistor
• Linhai Song