Must.Kill.Mutants. Introduction to Mutation Testing
Gerald Mücke, DevCon5 GmbH, Switzerland
AGENDA
Quality Assurance
Value of Testing
Mutation Testing
Tool Demo: Pit
Conclusion
About me
• Gerald Mücke
• Founder & CEO of DevCon5 GmbH
• Passionate Software Developer
• Focal Points
  • Performance Analysis
  • Test Automation
  • Mutation Testing
  • DevOps
• Using mutation testing for > 2 years
Effectiveness & Efficiency
[Figure: 2×2 matrix. Horizontal axis: doing the right thing (ineffectively to effectively). Vertical axis: doing things right (inefficiently to efficiently). Quadrants: Die Quickly, Thrive, Die Slowly, Survive. Annotations: "Get Shit Done" and "Do right things right".]
«Quality Assurance»
“is a way of preventing mistakes or defects in manufactured products and avoiding problems when delivering solutions or services to customers” (Wikipedia)
«manufactured» products
• «The process of converting raw materials, components, or parts into finished goods that meet a customer's expectations or specifications.»
• Most of the critical code is written manually
• Raw Materials
  • existing software or parts of it
  • brain, ideas, knowledge, experience, requirements
• Every product is unique (may look similar, though)
«Preventing» defects
• Defects are «created» in development
  • Cannot be prevented, it’s human to make mistakes
  • Could be detected: the earlier, the better
• Defects manifest in production
  • Or during test
  • Can be prevented: the earlier, the better
Sources of a Product
• Internal Development
  • QA embeddable
  • QA along the pipeline
  • Quality is a shared effort
  • Easier to change or influence
• External Development
  • Software Vendors
  • More effort required for dedicated QA
  • Less easy to change
  • Handoff «Waterfall» style
«We have tested it» (Anonymous Developer)
Real-Life Bugs
if( isThreadSafe() ) {
    computeSingleThreaded();
} else {
    computeMultiThreaded();
}
Made it to production. Performance impact: 500% duration of day-end processing.
Real-Life Bugs
if( ! isDevelopmentMode() ) {
    collectProfileDataAndSendDeveloperReport();
}
In production. Impact: 20% performance loss, compliance violation.
Real-Life Bugs
void function(LocalDate begin, LocalDate end, LocalDate minFrom, ...) {
    //...
    outerLoop:
    while( it.hasNext() ) {
        Object current = it.next();
        Local from = funcA(current);
        Local upto = funcB(current);
        while(true) {
            if( ! isBeforeOrEqual( from , upto ) ) {
                continue outerLoop;
            }
            if( condY(from, minFrom) ) {
                from = DateUtil.addDaysToDate(upto, 1);
                upto = DateUtil.getLastOfMonth(from);
                from = DateUtil.min(new LocalDate[]{ end, from});
                upto = DateUtil.min(new LocalDate[]{ end, upto});
«This will never happen in production» (Anonymous Developer)
How to make informed decisions?
… without having a clue
Product Delivery Pipeline
[Figure: pipeline stages Development, Continuous Integration, Quality Assurance, Release, Operations, with a marked Decision Point]
Good Decisions are based on Information
• Simple Metrics
  • Number of Unit Tests
  • Line Coverage
  • Branch Coverage
• Complex
  • Test Results
  • Code Review
  • Static Code Analysis
  • …
Code Coverage
Information about what elements of a product have been touched by a test.
Common Coverage Metrics
• Line Coverage
• Condition Coverage
• Branch Coverage
[Figure: Code, Test and Test Oracle, with the question "Semantics?"]
Would you release a product based on
• 100% Line Coverage
• 100% Branch Coverage
• and all tests are green?
«Line or branch coverage provides no value»
Arcane Arts
• To most non-developers, software development seems increasingly like an arcane art
• Languages, Paradigms, Frameworks
• Algorithms & Data Structures
• O(n), ByteCode, Lambdas, ...
Magic Delivery Pipeline
[Figure: pipeline stages Development, Continuous Integration, Quality Assurance, Release, Operations, with the annotation "Magic Happens Here" and a marked Decision Point]
Quality Gates
• Decision Point
• List of checks that determine when the product is ready to be released
• Based on information
• Based on agreement between stakeholders
• Part of the Definition of Done
• Evolves over time
• Should not replace human judgement
«Testing is about gaining new information»
Perspectives
Programmers
• Implement the solution
• Provide indication that the solution is working
• Claim they did it «right»
Testers
• Show if and how the solution will fail
• Have to provide information for stakeholders to make informed decisions
• Usually don’t understand arcane arts
Checking vs. Testing
[Figure: 2×2 matrix of Awareness (Unknown, Known) against Understanding (Unknowns, Knowns). Quadrants: things we are aware of but don’t understand; things we are aware of and understand; things we are neither aware of nor understand; things we understand but are not aware of. Regions are labeled Checking, Testing and Automated Checking.]
The Testing Pyramid of Functional Tests
[Figure: pyramid with UI Tests at the top, Integration Tests in the middle and Unit Tests at the base; side axis: Degree of Automation]
Information Gain (without Testing)
[Figure: Information plotted along the pipeline stages Development, Continuous Integration, Quality Assurance, Release, Operations]
The Testing Pyramid of Functional Tests
[Figure: pyramid with UI Tests at the top, Integration Tests in the middle and Unit Tests at the base; side axes: Degree of Automation and Degree of Exploratory Testing]
Information Gain (with Testing)
[Figure: Information plotted along the pipeline stages Development, Continuous Integration, Quality Assurance, Release, Operations]
Value of Testing
The Testing Pyramid of Functional Tests
[Figure: pyramid with UI Tests at the top, Integration Tests in the middle and Unit Tests at the base; side axes: Degree of Automation and Degree of Exploratory Testing; annotation: Information Gap]
Test Coverability
[Figure: % Coverable Semantics plotted along the pipeline stages Development, Continuous Integration, Quality Assurance, Release, Operations]
Cost of Defects
[Figure: Cost plotted along the pipeline stages Development, Continuous Integration, Quality Assurance, Release, Operations]
Where to improve?
[Figure: Cost / Defect, Coverable Semantics, Information and Information Gap plotted along the pipeline stages Development, Continuous Integration, Quality Assurance, Release, Operations, with the annotation "Magic Happens Here"]
How to prove, to test the right thing right?
Mutation Testing – History
Mutation testing injects faults, based on rules, into a product to verify whether the test suite is capable of finding them.
• Fault injection technique
• Concept is known since ~1970
• First implementation of a mutation testing tool in 1980
• Most of the time it was subject to academic research only
• Recently, with increasing processing power, there is a growing interest
  • More academic research ongoing
  • Practical tooling available
Mutation Testing – Some Theory
• Mutation testing is a special form of Fault Injection
• Based on two hypotheses
  1: Most of the software faults are due to small syntactic errors
  2: Simple faults can cascade to more emergent faults
• Assumption: “if a mutant was introduced without the behavior of the test suite being affected, this indicated either that the code that had been mutated was never executed (dead code) or that the test suite was unable to locate the faults represented by the mutant” (Wikipedia)
Mutation Testing - Definitions
• Mutant
  • a variation P’ of the product P created by applying a mutation operator m: P’ = m(P)
• Killed Mutant
  • a variation P’ in which a test has found at least ONE error
• Live Mutant
  • a variation P’ in which a test has found NO errors
• Mutation Operators
  • A function m() that creates a variation of the product P by applying a set of modification rules
  • Inject faults into the product
  • Based on bug taxonomies
• Mutation Score
  • Number of killed mutants / total number of mutants
  • Also called Mutation Coverage
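The mutation score defined above can be sketched in a few lines of Java. This is only an illustration: the `Status` enum and the `mutationScore` method are hypothetical names, not part of any tool's API.

```java
import java.util.List;

final class MutationScoreSketch {

    // Hypothetical status of a single mutant after the test suite has run.
    enum Status { KILLED, LIVE }

    // Mutation score = number of killed mutants / total number of mutants.
    static double mutationScore(List<Status> mutants) {
        if (mutants.isEmpty()) {
            throw new IllegalArgumentException("no mutants were generated");
        }
        long killed = mutants.stream().filter(s -> s == Status.KILLED).count();
        return (double) killed / mutants.size();
    }

    public static void main(String[] args) {
        List<Status> mutants =
                List.of(Status.KILLED, Status.KILLED, Status.KILLED, Status.LIVE);
        System.out.println(mutationScore(mutants)); // prints 0.75
    }
}
```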
Some more definitions
• Equivalent Mutation
  • a variation P’ that is semantically identical to P
• Duplicate Mutation
  • a variation P’ that is equivalent to another variation P’’
• Weak Mutation
  • Fault does not lead to incorrect output
• Strong Mutation
  • Fault propagates to incorrect output
• Unstable Mutation
  • Any test can find the mutations generated by it
• High-Order Mutants
  • Mutants that are defined by a set of Low-Level Mutants
• Subsumed Mutants
  • One mutant subsumes another if at least one test kills the first and every test that kills the first also kills the second.
Mutation Operators
• Boundaries
  • Conditional Boundary
  • Negate Conditionals
  • Remove Conditional
• Return Values
  • Return Values
  • Argument Propagation
• Method Calls
  • Non Void Method Calls
  • Void Method Calls
  • Constructor Calls
• Calculations
  • Invert Negatives
  • Increments / Remove Increments
  • Math
• Members and Constants
  • Inline Constants
  • Member Variable (experimental)
• Java Language
  • Switch (experimental)
  • Modifiers
  • ...
• ...
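To make one of these operators concrete, here is a minimal sketch of a conditional boundary mutation, which turns `>=` into `>`. The `isAdult` method and both hand-written "tests" are hypothetical; a real tool would generate the mutant automatically instead of keeping it as a second method.

```java
import java.util.function.IntPredicate;

final class ConditionalBoundarySketch {

    // Original code under test.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Hand-written mutant: the conditional boundary changed from >= to >.
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // Weak test: never checks the boundary value 18.
    static boolean weakTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5);
    }

    // Strong test: additionally checks the boundary value.
    static boolean strongTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5) && impl.test(18);
    }

    public static void main(String[] args) {
        // A mutant is killed when a test that passes on the original fails on it.
        System.out.println("weak test kills mutant:   "
                + !weakTestPasses(ConditionalBoundarySketch::isAdultMutant));
        System.out.println("strong test kills mutant: "
                + !strongTestPasses(ConditionalBoundarySketch::isAdultMutant));
    }
}
```

Both tests execute every line of `isAdult`, so line coverage cannot tell them apart; only the strong test kills the mutant.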
«Alive Mutants will eventually turn into a Bug»
Approaches to Mutation Testing
• Byte Code Mutation
  • Can be done on-the-fly
  • Faster to apply and execute
  • Might be affected by compiler optimizations
• Source Code Mutation
  • Requires recompilation after every change
  • Takes very long
  • Is not affected by compiler optimizations
• Higher Level Mutations
  • Configuration, Architecture, Specification, Use/Business Case, ...
  • No Tooling Support (yet?)
Mutation Testing Phases
• Mutant generation
  • analyzing classes and generating mutations for them
• Test selection
  • selecting the tests to run against the mutations
• Mutant insertion
  • loading the mutations into a JVM / Runtime Environment
• Mutant detection
  • executing tests against the loaded mutants
Mutation Testing 101
• Modify your code (mutant generation)
• Re-run the test (test selection + loading)
• Check if the test is failing (detection)

class Builder {
    private String value;

    Builder withValue(String in) {
        this.value = in; // the mutant removes this assignment
        return this;
    }
}

@Test
public void testLeft() {
    Builder b = new Builder().withValue("one");
    assertNotNull(b); // still passes for the mutant, so the mutant survives
}

If the test is green, it’s a fail!!!
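The Builder example can be turned into a small runnable sketch without a test framework. Everything beyond `Builder` and `withValue` is an assumption made for illustration: the `getValue` accessor, the `MutantBuilder` subclass that stands in for the generated mutant, and the two boolean "tests".

```java
final class BuilderMutantSketch {

    static class Builder {
        protected String value;

        Builder withValue(String in) {
            this.value = in;
            return this;
        }

        // Hypothetical accessor, added so that a test can observe the state.
        String getValue() {
            return value;
        }
    }

    // Stand-in for the mutant: the assignment has been removed.
    static class MutantBuilder extends Builder {
        @Override
        Builder withValue(String in) {
            return this;
        }
    }

    // Test in the style of the slide: only checks that something is returned.
    static boolean weakTest(Builder fresh) {
        Builder b = fresh.withValue("one");
        return b != null;
    }

    // Test with an effective assertion on the observable state.
    static boolean strongTest(Builder fresh) {
        Builder b = fresh.withValue("one");
        return "one".equals(b.getValue());
    }

    public static void main(String[] args) {
        System.out.println("weak test green on mutant:   " + weakTest(new MutantBuilder()));
        System.out.println("strong test green on mutant: " + strongTest(new MutantBuilder()));
    }
}
```

A green result on the mutant means it survived, which is the "fail" the slide refers to; the test with the effective assertion turns red and kills it.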
Related Techniques
• Bebugging / Fault Seeding
  • randomly adding bugs, programmers are tasked to find them
• Fuzzing
  • Injecting faults into test data
• For Operations: Chaos Monkey (Simian Army, Netflix)
  • Randomly terminating running processes or servers to test operational procedures or fitness
Tooling
Tool          Languages
PIT           Java, Scala, Kotlin
muJava        Java
Jester        Java
Judy          Java
Mutator       Java, JavaScript, Ruby, PHP
Javalanche    Java
Jumble        Java
Major         Java
Stryker       JavaScript
mutate.py     Python
Mutant        Ruby
Heckle        Ruby
NinjaTurtles  .Net, Mono
Nester        C#
Humbug        PHP
MuCheck       Haskell
…
Tool: PIT
• Mutation Testing for Java / JVM
• Operates on ByteCode modification
• Easy to use: works with ant, maven, gradle and others
• ~20 mutation operators for altering your code
• Parallel execution
• Fast: can analyze in minutes what would take earlier systems days
• Active Community: actively developed & supported
• Mature Tooling
  • Good Documentation
  • HTML & XML Reports
Example Output
Interpreting Results
• Live Mutants
  • Reflect unspecified behavior
  • Superfluous code / unrequired semantics
  • Could be an actual bug that is not covered by the test suite
  • Could be an equivalent mutation
• Killed by TimeOut or MemError
  • Could be a “real kill” (i.e. endless loop)
  • Could be still alive
• Mutation Score
  • Gives an indication of the overall quality of your test suite
Unit Test Maturity Model
Level  Description
0      No test
1      We have a test
2      1 + we have > 0% line coverage
3      2 + we have > 50% branch coverage
4      3 + we have at least 1 effective assertion per test
5      4 + we have > 80% mutation coverage
Demo
Timeouts
• Mutating abort-conditions of loops can cause timeouts
  • Loop runs endlessly
    • Mutation is effectively killed
  • Mutation might not be killed
    • Loop runs longer (i.e. counter underrun / overrun) -> mutation might eventually survive
    • Your system is just too slow / the tests take too long
• When to stop the test? Will the test fail?
• If a loop runs longer, the machine performance is important for choosing the timeout.
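The "loop runs longer" case can be sketched as follows. This is an illustration only, not how PIT itself executes mutants: `sumMutant` is a hand-written stand-in for a mutated increment (`i++` became `i--`), and `runTest`, the `Outcome` enum and the 500 ms limit are assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutSketch {

    enum Outcome { PASSED, FAILED, TIMED_OUT }

    // Original: sums 0..n-1.
    static long sumOriginal(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Mutant: the counter runs down to Integer.MIN_VALUE, wraps around to
    // Integer.MAX_VALUE and only then fails the loop condition (for n > 0).
    static long sumMutant(int n) {
        long sum = 0;
        for (int i = 0; i < n; i--) {
            sum += i;
        }
        return sum;
    }

    // Runs one "test" (expects the sum of 0..999) under a time limit.
    static Outcome runTest(Callable<Long> codeUnderTest, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // a runaway loop must not keep the JVM alive
            return t;
        });
        Future<Long> result = executor.submit(codeUnderTest);
        try {
            long sum = result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return sum == 499_500L ? Outcome.PASSED : Outcome.FAILED;
        } catch (TimeoutException e) {
            return Outcome.TIMED_OUT;
        } catch (InterruptedException | ExecutionException e) {
            return Outcome.FAILED;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("original: " + runTest(() -> sumOriginal(1000), 500));
        System.out.println("mutant:   " + runTest(() -> sumMutant(1000), 500));
    }
}
```

Whether the mutant is reported as timed out or as failed depends on how fast the machine gets through roughly 2^31 iterations within the chosen limit. In this sketch the assertion would still catch the wrong sum; a test that does not check the result would let the mutant survive once the timeout is generous enough.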
Limitations
• Fault Coverage
  • ~¼ of real faults are not coverable by mutation testing
• Mutation Score
  • PIT does not recognize subsumed or equivalent mutations
  • Mutation score may not be “academically” precise – context matters!
• Mutation Operators
  • PIT has no Java concurrency mutation operators
  • PIT has no high-order mutation operators
  • PIT has no Java-language specific mutation operators
• Techniques
  • PIT does not support sampling
Value has its cost
• Mutation Testing is computationally expensive
• Duration of a mutation test depends on
  • number of tests
  • test suite execution time
  • number of mutation operators
  • processing power
• Basically: D = xn
  • n = number of mutation operators
  • x = number of tests
Deviation in Mutation Score
Impact of Mutation Operator Selection
[Figure: axes and labels: Size of Codebase, Computational Effort, Mutations Found; curves: More Operators, Fewer Operators]
Mutation Analysis of Large Code bases
[Figure: Computational Effort over Size of Codebase with a Time Cap; the codebase is broken into chunks spread over the days Mo, Tu, We, Th, Fr]
Other techniques
• Incremental Analysis
  • Based on historical data
  • Only test code that has changed
  • Increases deviation
• Sampling
  • Good for Mutation Scoring
  • Increases deviation
  • No support for sampling in PIT
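Sampling can be sketched generically, independent of any tool (as noted, PIT does not support it). For simplicity the sketch assumes the kill result of every mutant is already known, so the sampled score can be compared with the full score; in practice the sample would be drawn before running any tests. All names and numbers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class MutantSamplingSketch {

    // Draws a random sample of the given size from all generated mutants.
    static <T> List<T> sample(List<T> mutants, int sampleSize, long seed) {
        List<T> shuffled = new ArrayList<>(mutants);
        Collections.shuffle(shuffled, new Random(seed));
        return shuffled.subList(0, Math.min(sampleSize, shuffled.size()));
    }

    // Mutation score of a list of kill flags (true = killed).
    static double score(List<Boolean> killed) {
        long kills = killed.stream().filter(k -> k).count();
        return (double) kills / killed.size();
    }

    public static void main(String[] args) {
        // Hypothetical analysis result: 100 mutants, 80 of them killed.
        List<Boolean> all = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            all.add(i < 80);
        }
        System.out.println("full score:   " + score(all));
        for (long seed = 1; seed <= 3; seed++) {
            System.out.println("sample of 10: " + score(sample(all, 10, seed)));
        }
    }
}
```

The sampled scores scatter around the full score, which is the increased deviation mentioned above; in exchange, only a tenth of the mutants have to be run.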
Challenges of Mutation Testing
• Redundant Mutants
  • Subsuming
  • Duplicates
  • Equivalent
• Equivalent Detection
  • Current algorithm achieves 50% detection rate in research
• High Order Mutations
• Computational Cost
• Mutation and Test Selection
[Figure labels: Equivalent Mutants, Subsuming/Duplicate Mutants]
Conclusion
Some Advice
• Unit tests are usually owned by development
  • Challenge them with Mutation Testing!
  • It’s NOT unit tested until mutation tested.
• Don’t go on a killing spree
  • Set achievable goals for mutation score
  • Triage surviving mutants
  • A mutation score > 0.8 is considered good (it depends…)
• Determine mutation score regularly in a sensible interval
  • Every build vs. every release
  • Use historical data & SCM support
• Find concrete mutants as needed
  • Adjust mutators & scope
Use Cases
• Finding Gaps in Test Suite
• Testing Highly Exposed Code
  • Algorithms and Calculations
  • Security-related code
  • Transaction-related code
• Assessing Test Suites / Testing Strategies / Methodologies
  • By comparing the mutation scores, i.e.
• Developing Test Suites for Legacy Code
  • Finding semantic hotspots
  • Finding gaps in Test Suite
  • Forced to break Code Base into more manageable pieces
• Minimizing Test Suites
  • Reduce number of tests while keeping mutation score stable
  • ! Reduces the effectiveness of the suite for detecting real faults
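Test suite minimization can be sketched as a greedy selection over a kill matrix, assuming the analysis has recorded which test kills which mutant. The map layout and the test and mutant names are hypothetical, and greedy selection is one simple heuristic among several.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SuiteMinimizationSketch {

    // Greedy selection: repeatedly keep the test that kills the most mutants
    // not yet killed by the tests kept so far. The kept tests kill the same
    // set of mutants as the full suite, so the mutation score stays stable.
    static List<String> minimize(Map<String, Set<String>> killsByTest) {
        Set<String> remaining = new LinkedHashSet<>();
        killsByTest.values().forEach(remaining::addAll);

        List<String> kept = new ArrayList<>();
        while (!remaining.isEmpty()) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, Set<String>> e : killsByTest.entrySet()) {
                Set<String> gain = new LinkedHashSet<>(e.getValue());
                gain.retainAll(remaining);
                if (gain.size() > bestGain) {
                    best = e.getKey();
                    bestGain = gain.size();
                }
            }
            kept.add(best);
            remaining.removeAll(killsByTest.get(best));
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> kills = new LinkedHashMap<>();
        kills.put("testA", Set.of("m1", "m2"));
        kills.put("testB", Set.of("m2"));
        kills.put("testC", Set.of("m3"));
        System.out.println(minimize(kills)); // prints [testA, testC]
    }
}
```

Here `testB` is dropped because it kills nothing that `testA` does not already kill. As the warning above says, such a test may still detect real faults that no generated mutant represents.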
«Test First does not lead to better test suites»
Takeaways
• Don’t trust your unit tests unless you have mutation-tested them.
• Mutation testing is the practice of finding bugs in your test suite
• Forget about other coverage metrics
  • Cheap to get, but next to no value
• Include Mutation Testing in your project. Always.
• Use it with common sense
  • Don’t go on a killing spree.
• For Java, PIT is the tool to use.
Gerald Mücke
DevCon5 @gmuecke