Automated Search For “Good” Coverage Criteria

Phil McMinn (University of Sheffield), Mark Harman (University College London), Gordon Fraser (University of Sheffield), Gregory Kapfhammer (Allegheny College)

Position Paper




Coverage Criteria: The “OK”, The Bad and The Ugly

The “OK”

• Divide up system into things to test

• Useful as a basis for generating tests when no functional model exists

• Indicates what parts of the system are and aren’t tested


The Bad

• Not based on anything to do with faults, not even:

• Fault histories

• Fault taxonomies

• Common faults


The Ugly

• Studies disagree as to which criteria are best

• Coverage or test suite size?


The Key Question of this Talk

Can we evolve “good” coverage criteria?

Coverage criteria that are better correlated with fault revelation?


Why This Might Work

• The best criterion might actually be a mix and match of aspects of existing criteria

• For example, “cover the top n longest d-u paths, and then any remaining uncovered branches”

• Or…


Maybe this is One Big Empirical Study using SBSE

… which aspects of which criteria and how much

[Figure: three scales, each running from “less” to “more”, for branches, complex d-u chains, and basis paths]
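One way to read the “how much of each aspect” idea is as a weight vector over aspects of existing criteria. The following is a minimal Python sketch, assuming (this is not stated in the slides) that each aspect's coverage is a fraction in [0, 1] and that the mix is a normalised weighted sum. The function name `mixed_coverage`, the aspect keys, and the weights are all hypothetical.

```python
def mixed_coverage(aspect_coverage, weights):
    """Weighted average of per-aspect coverage fractions (each in [0, 1]).

    Both arguments are dicts keyed by aspect name; an aspect missing from
    aspect_coverage counts as 0.0 coverage.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    weighted = sum(w * aspect_coverage.get(a, 0.0) for a, w in weights.items())
    return weighted / total_weight


# Hypothetical measurements for one test suite.
coverage = {"branches": 0.8, "du_chains": 0.5, "basis_paths": 0.25}
# Hypothetical weights: "more" d-u chains, "less" of the other two.
weights = {"branches": 1.0, "du_chains": 2.0, "basis_paths": 1.0}

score = mixed_coverage(coverage, weights)
```

Under this reading, a search would tune the weights rather than the coverage measurements themselves.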


What About Including Aspects Not Incorporated into Existing Criteria?

Non-functional aspects

• For example timing behaviour, memory usage

• “Cover all branches using as much memory as possible”

Fault histories

• “Maximize basis path coverage in classes with the longest fault histories”


“Isn’t This Just Mutation Testing?”

Our criteria are more like generalised strategies

• Potentially offer more insight into the nature of faults

• Cheaper to apply (coverage is generally easier to obtain than a 100% mutation score)

Perhaps different strategies will work best for different types of software, or different teams of software developers


How This Might Work


Fault Database

Need examples of real faults

• Defects4J

• CoREBench

• … or, just use mutation


Fitness Function

“Goodness” is correlation between greater coverage and greater fault revelation

• Needs test suites to establish
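A fitness function of this kind could be sketched as follows. The slides say only “correlation”; the use of Pearson's coefficient here is an assumption (a rank correlation would be an equally plausible choice). The names `pearson`, `fitness`, `criterion`, and `faults_revealed` are hypothetical, as is the toy data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sample carries no signal; treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def fitness(criterion, suites, faults_revealed):
    """Score a candidate criterion over a collection of test suites.

    criterion:       callable, suite -> coverage level under the candidate
    faults_revealed: callable, suite -> number of known faults the suite reveals
    """
    coverage = [criterion(s) for s in suites]
    revealed = [faults_revealed(s) for s in suites]
    return pearson(coverage, revealed)


# Toy data: each suite carries precomputed coverage and fault counts.
suites = [
    {"cov": 0.2, "faults": 1},
    {"cov": 0.5, "faults": 2},
    {"cov": 0.9, "faults": 4},
]
toy_fitness = fitness(lambda s: s["cov"], suites, lambda s: s["faults"])
```

In a real study the two callables would run the suite against the system under test and against the fault database; here they only read stored numbers.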


Generation of Test Suites

At least two possibilities

• Generate a universe of test suites up front

• Generate specific test suites with the aim of achieving specific coverage levels of the criteria under evaluation (drawback: expensive)
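The first option might look something like the sketch below, which assumes a fixed pool of existing test cases and draws random subsets of varying size from it. The slides do not say how the universe would be built, so the sampling scheme, the function name `generate_universe`, and the test identifiers are all hypothetical.

```python
import random


def generate_universe(test_pool, n_suites, seed=0):
    """Draw n_suites random test suites (subsets) from a pool of test cases.

    Suite sizes are chosen uniformly between 1 and the pool size so that the
    universe contains suites with a wide range of coverage levels.
    """
    rng = random.Random(seed)  # seeded for repeatable experiments
    pool = list(test_pool)
    if not pool:
        raise ValueError("test pool must not be empty")
    universe = []
    for _ in range(n_suites):
        size = rng.randint(1, len(pool))
        universe.append(frozenset(rng.sample(pool, size)))
    return universe


pool = [f"t{i}" for i in range(10)]
universe = generate_universe(pool, n_suites=50)
```

Because the universe is generated once, every candidate criterion can be scored against the same suites. Recording each suite's size alongside its coverage would also allow size to be controlled for later.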


Search Representation

GP Trees

[Figure: an example GP tree combining the nodes “up to 50% branch coverage”, “maximise memory usage”, and “over 75% basis path coverage” with AND and OR operators]
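A tree representation of this kind can be sketched in a few lines of Python. This is an illustration only, not a reconstruction of the slide's figure: it assumes every leaf is a simple “metric at least threshold” test on a value normalised to [0, 1], and the class names, metric names, and thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class AtLeast:
    """Leaf: the named metric must reach the given threshold."""
    metric: str
    threshold: float


@dataclass(frozen=True)
class And:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Or:
    left: "Node"
    right: "Node"


Node = Union[AtLeast, And, Or]


def satisfied(node, measured):
    """Evaluate a criterion tree against a dict of measured metric values."""
    if isinstance(node, AtLeast):
        return measured.get(node.metric, 0.0) >= node.threshold
    if isinstance(node, And):
        return satisfied(node.left, measured) and satisfied(node.right, measured)
    if isinstance(node, Or):
        return satisfied(node.left, measured) or satisfied(node.right, measured)
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical criterion: (branch >= 0.5 AND memory >= 0.9) OR basis_path >= 0.75
tree = Or(
    And(AtLeast("branch", 0.5), AtLeast("memory", 0.9)),
    AtLeast("basis_path", 0.75),
)
```

Immutable nodes keep subtree crossover and mutation simple, since a new tree can share unchanged subtrees with its parent.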


Handling Bloat

GP techniques classically involve “bloat”

• Consequence: generated criteria may not be very succinct

• Various techniques could be applied to simplify the criteria, e.g. delta debugging
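As a rough illustration of such simplification, the sketch below greedily drops one clause at a time and keeps the removal whenever fitness does not fall. It is a simpler one-at-a-time reduction in the spirit of delta debugging, not the ddmin algorithm itself. It also assumes, purely for illustration, that a criterion can be flattened into a list of clauses; the names `simplify`, `toy_fitness`, and the clause labels are hypothetical.

```python
def simplify(clauses, fitness, tolerance=0.0):
    """Remove clauses one at a time while fitness stays within tolerance.

    clauses: list of criterion clauses (any hashable labels)
    fitness: callable, list of clauses -> float (higher is better)
    """
    current = list(clauses)
    baseline = fitness(current)
    changed = True
    while changed and len(current) > 1:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fitness(candidate) >= baseline - tolerance:
                current = candidate  # clause i was not pulling its weight
                changed = True
                break
    return current


def toy_fitness(clauses):
    """Toy fitness in which only clauses 'a' and 'c' contribute."""
    return len({"a", "c"} & set(clauses)) / 2


reduced = simplify(["a", "b", "c", "d"], toy_fitness)
```

Each removal requires a fresh fitness evaluation, so with an expensive fitness function the cost of simplification is worth budgeting for.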


Overfitting

The evolved criteria may not generalise beyond the systems studied and the faults seeded

• May not be a disadvantage:

• insights into classes of system

• faults made by particular developers

• … apply traditional techniques from machine learning to combat overfitting.


Summary

Our Position: SBSE can be used to automatically evolve coverage criteria that are well correlated with fault revelation

Over to the audience: Is it feasible that we could do this?