
Representations and Operators for Improving Evolutionary Software Repair


Page 1: Representations and Operators for Improving Evolutionary Software Repair

http://genprog.cs.virginia.edu

REPRESENTATIONS AND OPERATORS FOR IMPROVING EVOLUTIONARY SOFTWARE REPAIR

Claire Le Goues

Westley Weimer

Stephanie Forrest

Page 2: Representations and Operators for Improving Evolutionary Software Repair

Claire Le Goues, GECCO 2012, http://genprog.cs.virginia.edu

PROBLEM: BUGGY SOFTWARE

“Everyday, almost 300 bugs appear […] far too many for only the Mozilla programmers to handle.”
– Mozilla Developer, 2005

Annual cost of software errors in the US: $59.5 billion (0.6% of GDP).

Average time to fix a security-critical error: 28 days.

90%: Maintenance

10%: Everything Else

Page 3: Representations and Operators for Improving Evolutionary Software Repair


APPROACH: EVOLUTIONARY COMPUTATION

Page 4: Representations and Operators for Improving Evolutionary Software Repair

Genetic Programming

Search

Input: source code, specification

Output: repaired version of the program

Page 5: Representations and Operators for Improving Evolutionary Software Repair


SEARCH SPACE

Page 6: Representations and Operators for Improving Evolutionary Software Repair

PROGRAM REPAIR
• Learning: patches or repaired programs
• Population: 40
• Iterations: 10
• Max variant size: unbounded
• Largest benchmark program: 2.8 million lines of C code

OTHER GP PROBLEMS: GECCO GP-TRACK BEST PAPERS
• Learning: expression trees or lists
• Population: 64 – 2500
• Iterations: 50 – 10000
• Max variant size: 16 operations, 48 operations, 17 levels, 11 levels

Page 7: Representations and Operators for Improving Evolutionary Software Repair


SEARCH SPACE

EC-based repair starts with a large genome.

The starting individual is mostly correct.

Page 8: Representations and Operators for Improving Evolutionary Software Repair


Genetic Programming

Search

Input: source code, specification

Output: repaired version of the program

Page 9: Representations and Operators for Improving Evolutionary Software Repair

OUR GOAL

IN-DEPTH STUDY OF REPRESENTATION AND OPERATORS FOR EVOLUTIONARY PROGRAM REPAIR.

Page 10: Representations and Operators for Improving Evolutionary Software Repair


INPUT

OUTPUT

EVALUATE FITNESS

DISCARD

ACCEPT

CROSSOVER, MUTATE, SELECT
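The loop named by these diagram labels can be sketched in Python. This is a minimal illustration, not GenProg's implementation: `repair_search`, `tournament`, and the callback signatures are hypothetical names, and treating "passes every test" as `fitness == max_fitness` is an assumption. The defaults (population 40, 10 iterations, tournament size 2) are the default parameters the talk reports in its experimental setup.

```python
import random


def tournament(population, fitness, k, rng):
    """Return the fittest of k individuals drawn at random (k <= len(population))."""
    return max(rng.sample(population, k), key=fitness)


def repair_search(original, fitness, mutate, crossover, max_fitness,
                  pop_size=40, iterations=10, tournament_size=2, rng=None):
    """Search for a variant of `original` whose fitness equals max_fitness.

    Hypothetical interface: fitness(v) counts passed tests, mutate(v, rng)
    returns one variant, crossover(a, b, rng) returns two children.
    Returns the first passing variant found, or None.
    """
    rng = rng or random.Random(0)
    # INPUT: the population starts as mutated copies of the original program.
    population = [mutate(original, rng) for _ in range(pop_size)]
    for generation in range(iterations + 1):
        # EVALUATE FITNESS and ACCEPT any variant that passes every test.
        for variant in population:
            if fitness(variant) == max_fitness:
                return variant  # OUTPUT
        if generation == iterations:
            break
        # SELECT: tournament winners become parents; the rest are DISCARDed.
        parents = [tournament(population, fitness, tournament_size, rng)
                   for _ in range(pop_size)]
        # CROSSOVER consecutive parent pairs, then MUTATE every child.
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            children.extend(crossover(a, b, rng))
        population = [mutate(child, rng) for child in children]
    return None


# Toy stand-in for a program: a tuple of flags, one flag per "test".
original = (1, 1, 0, 1)
always_fix = lambda v, rng: v[:2] + (1,) + v[3:]   # mutation that always repairs
swap_tails = lambda a, b, rng: (a[:2] + b[2:], b[:2] + a[2:])
print(repair_search(original, sum, always_fix, swap_tails, max_fitness=4))
```

The toy mutation repairs the flag on the first try, so the search returns in generation 0; a real fitness function would compile the variant and run the test suite.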

Page 11: Representations and Operators for Improving Evolutionary Software Repair

OUR GOAL

IN-DEPTH STUDY OF REPRESENTATION AND OPERATORS FOR EVOLUTIONARY PROGRAM REPAIR.

Page 12: Representations and Operators for Improving Evolutionary Software Repair


INPUT

OUTPUT

EVALUATE FITNESS

DISCARD

ACCEPT

CROSSOVER, MUTATE, SELECT

Page 13: Representations and Operators for Improving Evolutionary Software Repair

CROSSOVER, MUTATE, SELECT

DISCARD

INPUT EVALUATE FITNESS

ACCEPT

OUTPUT

Page 14: Representations and Operators for Improving Evolutionary Software Repair

[Figure: example input program drawn as numbered statement nodes 1–12.]

Page 15: Representations and Operators for Improving Evolutionary Software Repair

REPRESENTATION

Input: [Figure: the input program drawn as numbered statement nodes.]

Patch:
Delete(3)
Replace(3,5)

AST/WP: [Figure: variant programs drawn as modified copies of the numbered statement nodes, including a changed node 5’.]

Delete(3) Insert(2,4) Replace(5,1) Insert(5,4) Insert(3,3) Delete(4) …
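To make the patch representation concrete, here is a small Python sketch that applies a list of edits to a program. It rests on several assumptions not stated on the slide: the program is flattened to a numbered list of statements rather than a tree, `Insert(x, y)` places a copy of original statement `y` after statement `x`, and `Replace(x, y)` overwrites `x` with a copy of `y`. The name `apply_patch` and the tuple encoding of edits are hypothetical.

```python
def apply_patch(statements, patch):
    """Apply a patch (a list of edits) to a program and return the new program.

    statements: list of source strings, numbered from 1 as on the slides.
    patch: edits such as ("delete", 3), ("insert", 2, 4), ("replace", 5, 1).
    Edits always refer to the ORIGINAL statement numbers.
    """
    # Each original position holds the statement numbers currently living there.
    slots = {i: [i] for i in range(1, len(statements) + 1)}
    for edit in patch:
        op, x = edit[0], edit[1]
        if op == "delete":
            slots[x] = []
        elif op == "insert":      # assumed meaning: copy statement y after x
            slots[x].append(edit[2])
        elif op == "replace":     # assumed meaning: overwrite x with a copy of y
            slots[x] = [edit[2]]
        else:
            raise ValueError(f"unknown edit: {edit!r}")
    # Flatten the slots back into program order.
    return [statements[j - 1] for i in sorted(slots) for j in slots[i]]


program = ["s1", "s2", "s3", "s4", "s5"]
print(apply_patch(program, [("delete", 3)]))
# prints ['s1', 's2', 's4', 's5']
print(apply_patch(program, [("insert", 2, 4), ("replace", 5, 1)]))
# prints ['s1', 's2', 's4', 's3', 's4', 's1']
```

The individual is only the short edit list; the full program is materialized when fitness is evaluated, which is what keeps a patch genome small even for very large programs.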

Page 16: Representations and Operators for Improving Evolutionary Software Repair

GENETIC OPERATORS

Mutation operators (swap, insert, delete):
• Manipulate only existing genetic material.
• Semantic checking improves probability that mutation is viable.

Crossover:
• One-point: on the weighted path or edit list.
• Patch-subset: uniform, on the edit list.

Aside: mutation operator selection matters!
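The two crossover operators on edit lists can be sketched as follows. This is an illustration under assumptions: the slide does not say how cut points are chosen or how many children are produced, so `one_point` picks an independent cut in each parent and returns two children, while `patch_subset` keeps each edit of the combined list with probability 0.5 and returns one child. Both function names are hypothetical.

```python
import random


def one_point(a, b, rng):
    """One-point crossover on two edit lists: exchange the tails after a cut.

    Assumption: each parent gets its own cut point, since edit lists
    may differ in length.
    """
    cut_a = rng.randint(0, len(a))
    cut_b = rng.randint(0, len(b))
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]


def patch_subset(a, b, rng, keep=0.5):
    """Uniform crossover: keep each edit from either parent with probability `keep`."""
    return [edit for edit in a + b if rng.random() < keep]


rng = random.Random(0)
parent_a = [("delete", 3), ("insert", 2, 4)]
parent_b = [("replace", 5, 1), ("insert", 5, 4), ("delete", 4)]
child_1, child_2 = one_point(parent_a, parent_b, rng)
child_3 = patch_subset(parent_a, parent_b, rng)
```

Note that `one_point` only redistributes edits between the children, whereas `patch_subset` can drop edits entirely, so the two operators explore different neighborhoods of the parents.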

Page 17: Representations and Operators for Improving Evolutionary Software Repair

Input: [Figure: the example program, statement nodes 1–12, shaded according to the legend.]

Legend:
• Likely faulty: high change probability.
• Maybe faulty: low change probability.
• Not faulty: not changed.

Default: 10 : 1 ratio
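A sketch of how such weights might drive the choice of which statement to mutate. It assumes, following the hypothesis stated on the search-space slides, that "likely faulty" means executed only by failing tests and "maybe faulty" means also executed by passing tests. The function names and the coverage-set inputs are hypothetical.

```python
import random


def localization_weights(failing_cov, passing_cov, ratio=10.0):
    """Map statement number -> mutation weight.

    failing_cov / passing_cov: sets of statements executed by the failing
    and passing tests. Statements never run by a failing test get no entry,
    so they are never chosen for mutation.
    """
    weights = {}
    for stmt in failing_cov:
        if stmt in passing_cov:
            weights[stmt] = 1.0 / ratio   # maybe faulty: low change probability
        else:
            weights[stmt] = 1.0           # likely faulty: high change probability
    return weights


def pick_statement(weights, rng):
    """Draw one statement to mutate, proportionally to its weight."""
    stmts = sorted(weights)
    return rng.choices(stmts, weights=[weights[s] for s in stmts], k=1)[0]


weights = localization_weights(failing_cov={3, 4, 5}, passing_cov={1, 2, 3})
print(weights == {3: 0.1, 4: 1.0, 5: 1.0})   # prints True
target = pick_statement(weights, random.Random(0))
```

With the default 10 : 1 ratio, statement 3 above is ten times less likely to be picked than statement 4 or 5, and statements 1 and 2 are never picked.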

Page 18: Representations and Operators for Improving Evolutionary Software Repair

OUR GOAL

IN-DEPTH STUDY OF REPRESENTATION AND OPERATORS FOR EVOLUTIONARY PROGRAM REPAIR.

Page 19: Representations and Operators for Improving Evolutionary Software Repair

EXPERIMENTAL SETUP

Benchmarks: 105 bugs in 8 real-world programs.¹
• 5 million lines of C code, 10,000 test cases.
• Bugs correspond to human-written repairs for regression test failures.

Default parameters, for comparison:
• Patch representation.
• Mutation operators selected with equal random probability.
• 1 mutation, 1 crossover/individual/iteration.
• Population size: 40.
• 10 iterations or 12 wall-clock hours, whichever comes first.
• Tournament size: 2.

¹ Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, and Westley Weimer, “A Systematic Study of Automated Program Repair: Fixing 55 out of 105 bugs for $8 Each.” International Conference on Software Engineering (ICSE), 2012, pp. 3–13.

Page 20: Representations and Operators for Improving Evolutionary Software Repair

EXPERIMENTAL SETUP

55/105 bugs repaired using default parameters.

Some bugs are more difficult to repair than others!
• Easy: 100% success rate on default parameters.
• Medium: 50 – 100% success rate on default parameters.
• Hard: 1 – 50% success rate on default parameters.
• Unfixed: 0% success.

Metrics:
• # fitness evaluations to a repair.
• GP success rate.
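The two metrics and the difficulty buckets can be computed from per-run records, as in this sketch. The record format (one entry per GP run holding the number of fitness evaluations, or `None` when no repair was found) and all function names are assumptions. The slide does not say which bucket a rate of exactly 50% belongs to; this sketch puts it in Medium and treats any nonzero rate below 50% as Hard.

```python
def success_rate(trials):
    """Fraction of GP runs that found a repair."""
    return sum(t is not None for t in trials) / len(trials)


def mean_evaluations_to_repair(trials):
    """Average fitness evaluations over successful runs, or None if none succeeded."""
    wins = [t for t in trials if t is not None]
    return sum(wins) / len(wins) if wins else None


def difficulty(rate):
    """Bucket a bug by its success rate under default parameters."""
    if rate >= 1.0:
        return "Easy"
    if rate >= 0.5:       # assumption: exactly 50% counts as Medium
        return "Medium"
    if rate > 0.0:
        return "Hard"
    return "Unfixed"


runs = [10, None, 30, None]          # illustrative records, not measured data
print(success_rate(runs))            # prints 0.5
print(difficulty(success_rate(runs)))  # prints Medium
print(mean_evaluations_to_repair(runs))  # prints 20.0
```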

Page 21: Representations and Operators for Improving Evolutionary Software Repair

RESEARCH QUESTIONS

Representation:
• Which representation choice gives better results?
• Which representation features contribute most to success?

Crossover: Which crossover operator is best?

Operators:
• Which operators contribute the most to success?
• How should they be selected?

Search space: How should the representation weight program statements to best define the search space?

Page 22: Representations and Operators for Improving Evolutionary Software Repair

REPRESENTATION: RESULTS

Procedure:
• Compare AST/WP to PATCH on original benchmarks with default parameters.
• For both representations, test effectiveness of:
  1. Crossover.
  2. Semantic check.

Results:
1. Patch outperforms AST/WP (14 – 30%).
2. Semantic check strongly influences success rate of both representations.
3. Crossover also improves results.

Page 23: Representations and Operators for Improving Evolutionary Software Repair

RESEARCH QUESTIONS

Representation:
• Which representation choice gives better results?
• Which representation features contribute most to success?

Crossover: Which crossover operator is best?

Operators:
• Which operators contribute the most to success?
• How should they be selected?

Search space: How should the representation weight program statements to best define the search space?

Page 24: Representations and Operators for Improving Evolutionary Software Repair

CROSSOVER: RESULTS

Crossover Operator    Success Rate    Fitness evaluations to repair
None                  54.4%           82.43
Default/“Uniform”     61.1%           163.05
One-Point/AST-WP      63.7%           114.12
One-Point/Patch       65.2%           118.20

Page 25: Representations and Operators for Improving Evolutionary Software Repair

RESEARCH QUESTIONS

Representation:
• Which representation choice gives better results?
• Which representation features contribute most to success?

Crossover: Which crossover operator is best?

Operators:
• Which operators contribute the most to success?
• How should they be selected?

Search space: How should the representation weight program statements to best define the search space?

Page 26: Representations and Operators for Improving Evolutionary Software Repair

SEARCH SPACE: SETUP

Hypothesis: statements executed only by the failing test case(s) should be mutated more often than those also executed by the passing test cases.

Procedure: examine that ratio in actual repairs.

Result:
Expected: 10 : 1 vs. Actual: 1 : 1.85
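To see what the two ratios mean as per-statement weights, the sketch below normalizes a ratio so the larger side is 1.0. The helper `ratio_to_weights` is a hypothetical name, and reading each ratio as "failing-only : also-passing" follows the hypothesis above.

```python
def ratio_to_weights(failing_only, shared):
    """Normalize a 'failing-only : also-passing' ratio so the larger weight is 1.0."""
    top = max(failing_only, shared)
    if top <= 0:
        raise ValueError("ratio terms must be positive")
    return failing_only / top, shared / top


expected = ratio_to_weights(10, 1)     # (1.0, 0.1)
actual = ratio_to_weights(1, 1.85)     # failing-only weight is 1/1.85, about 0.54
equal = ratio_to_weights(1, 1)         # (1.0, 1.0)
```

Under the expected 10 : 1 weighting, statements also executed by passing tests are penalized tenfold; under the observed 1 : 1.85 ratio they are the favored class, so the direction of the bias is reversed.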

Page 27: Representations and Operators for Improving Evolutionary Software Repair

SEARCH SPACE: REPAIR TIME

[Bar chart: # fitness evaluations to repair (y-axis, 0 to 110) by search difficulty (Easy, Medium, Hard, All) for three series: Default, Realistic, Equal. Bar labels in extraction order: 34.3, 93.6, 103.1, 66.0, 49.1, 75.7, 27.1, 59.5, 36.3, 67.0, 57.7, 58.6.]

Page 28: Representations and Operators for Improving Evolutionary Software Repair

SEARCH SPACE: SUCCESS RATE

GP success rate by search difficulty:

             Easy    Medium    Hard    0%      All
Default      1.00    0.78      0.24    0.00    0.62
Realistic    0.93    0.85      0.23    0.23    0.69
Equal        0.92    0.90      0.33    0.19    0.70

Page 29: Representations and Operators for Improving Evolutionary Software Repair

CONCLUSIONS/DISCUSSION

This EC problem is atypical; atypical problems warrant study.

We studied representation and operators for EC-based bug repair. These choices matter, especially for difficult bugs.
• Incorporating all recommendations, GenProg repairs 5 new bugs; repair time decreases by 17–43% on difficult bugs.

We don’t know why some of these things are true, but:
• We now have lots of interesting data to dig into!
• We are currently (as we sit here) doing more and bigger runs with the new parameters on the as-yet unfixed bugs.

Page 30: Representations and Operators for Improving Evolutionary Software Repair


PLEASE ASK QUESTIONS

Page 31: Representations and Operators for Improving Evolutionary Software Repair

REPRESENTATION BENCHMARKS

Program       LOC      Tests   Bug              Description
gcd           22       6       infinite loop    Example
uniq-utx      1146     6       segfault         Text processing
look-utx      1169     6       segfault         Dictionary lookup
look-svr      1363     6       infinite loop    Dictionary lookup
units-svr     1504     6       segfault         Metric conversion
deroff-utx    2236     6       segfault         Document processing
nullhttpd     5575     7       buffer exploit   Webserver
indent        9906     6       infinite loop    Source code processing
flex          18775    6       segfault         Lexical analyzer generator
atris         21553    3       buffer exploit   Graphical tetris game
Total         63249

Page 32: Representations and Operators for Improving Evolutionary Software Repair

SEARCH SPACE BENCHMARKS

Program      LOC          Tests     Bugs   Description
fbc          97,000       773       3      Language (legacy)
gmp          145,000      146       2      Multiple precision math
gzip         491,000      12        5      Data compression
libtiff      77,000       78        24     Image manipulation
lighttpd     62,000       295       9      Web server
php          1,046,000    8,471     44     Language (web)
python       407,000      355       11     Language (general)
wireshark    2,814,000    63        7      Network packet analyzer
Total        5,139,000    10,193    105

55/105 bugs repaired using default parameters.