Can You Trust Your Tests? (Agile Tour 2015 Kaunas)

CanYou Trust Your Tests?

2015 Vaidas Pilkauskas & Tadas Ščerbinskas

Van Halen

Band Tour Rider

Van Halen’s 1982 Tour Rider

Agenda

1. Test quality & code coverage2. Mutation testing in theory3. Mutation testing in practice

Prod vs. Test code quality

Code has bugs.

Tests are code.

Tests have bugs.

Test quality

Readable

Focused

Concise

Well named

“Program testing can be used to show the presence of bugs, but never to

show their absence!”- Edsger W. Dijkstra

Code Coverage

Types of Code Coverage

Branches

Instructions

Cyclomatic Complexity

Methods

& more

string foo() { return "a" + "b"}

assertThat(a.foo(), is("ab"))

string foo(boolean arg) { return arg ? "a" : "b"}

assertThat(a.foo(true), is("a"))

Branches

string foo(boolean arg) { return arg ? "a" : "b"}

assertThat(a.foo(true), is("a"))

assertThat(a.foo(false), is("b"))

SUCCESS: 26/26 (100%) Tests passed

Can you trust 100% coverage?

Code coverage can only show what is not tested.

For interpreted languages 100% code coverage is kind of like full compilation.

Code Coverage can be gamed

On purpose or by accident

Mutation testing

Changes your program code and expects your tests to fail.

What exactly is a mutation?

def isFoo(a) { return a == foo}

def isFoo(a) { return a != foo}

def isFoo(a) { return true}

def isFoo(a) { return null}

Terminology

Applying a mutation to some code creates a mutant.

If test passes - mutant has survived.

If test fails - mutant is killed.

Failing is the new passing

array = [a, b, c]

max(array) == ???

// testmax([0]) == 0 ✘

// testmax([0]) == 0 ✔

// implementationmax(a) { return 0}

// testmax([0]) == 0 ✔max([1]) == 1 ✘

// implementationmax(a) { return 0}

// testmax([0]) == 0 ✔max([1]) == 1 ✔

// implementationmax(a) { return a.first}

// testmax([0]) == 0 ✔max([1]) == 1 ✔max([0, 2]) == 2 ✘

// implementationmax(a) { return a.first}

// testmax([0]) == 0 ✔max([1]) == 1 ✔max([0, 2]) == 2 ✔

// implementationmax(a) { m = a.first for (e in a) if (e > m) m = e return m}

Coverage

Mutation// testmax([0]) == 0 ✔max([1]) == 1 ✔max([0, 2]) == 2 ✔

// implementationmax(a) { m = a.first for (e in a) if (true) m = e return m}

// testmax([0]) == 0 ✔max([1]) == 1 ✔max([0, 2]) == 2 ✔

// implementationmax(a) { return a.last}

// testmax([0]) == 0 ✔max([1]) == 1 ✔max([0, 2]) == 2 ✔max([2, 1]) == 2 ✘

// implementationmax(a) { return a.last}

// testmax([0]) == 0 ✔max([1]) == 1 ✔max([0, 2]) == 2 ✔max([2, 1]) == 2 ✔

Baby steps matter

Tests’ effectiveness is measured by number of killed mutants by

your test suite.

It’s like hiring a white-hat hacker to try to break into your server and making sure you detect it.

What if mutant survives

● Simplify your code● Add additional tests

● TDD - minimal amount of code to pass the test

Challenges

1. High computation cost - slow2. Equivalent mutants - false negatives3. Infinite loops

Equivalent mutations

// Originalint i = 0;while (i != 10) { doSomething(); i += 1;}

// Mutantint i = 0;while (i < 10) { doSomething(); i += 1;}

Infinite Runtime

// Originalwhile (expression) doSomething();

// Mutantwhile (true) doSomething();

Disadvantages

● Can slow down your TDD rhythm

● May be very noisy

Let’s say we have codebase with:● 300 classes● around 10 tests per class● 1 test runs around 1ms● total test suite runtime is about 3s

Is it really slow?

Let’s do 10 mutations per class● We get 3000 (300 * 10) mutations● runtime with all mutations is 150 minutes (3s * 3000)

Speeding it up

Run only tests that cover the mutation● 300 classes● 10 tests per class● 10 mutations per class● 1ms test runtime● total mutation runtime 10 * 10 * 1 * 300 = 30s

Speeding it up

During development run tests that cover only your current changes

● Continuous integration● TDD with mutation testing only

on new changes● Add mutation testing to your

legacy project, but do not fail a build - produce warning report

Usage scenarios

● Ruby - Mutant● Java - PIT● And many tools for other

languages

Summary

● Code coverage highlights code that is definitely not tested

● Mutation testing highlights code that definitely is tested

● Given non equivalent mutations, good test suite should work the same as a hash function

Vaidas Pilkauskas

@liucijus

● Vilnius JUG co-founder

● Vilnius Scala leader

● Coderetreat facilitator

● Mountain bicycle rider

● Snowboarder

About usTadas Ščerbinskas

@tadassce

● VilniusRB co-organizer

● RubyConfLT co-organizer

● RailsGirls Vilnius & Berlin coach

● Various board sports’ enthusiast

Credits

A lot of presentation content is based on work by these guys

● Markus Schirp - author of Mutant● Henry Coles - author of PIT● Filip Van Laenen - working on a book

Can You Trust Your Tests? (Agile Tour 2015 Kaunas)

Software

Demystifying roles in scrum team - Agile Tour 2013 Kaunas

Agile Tour -Comverse case study

Agile tour Nancy 2010 - Agile values and continuous improvement

Agile tour 2011 Dublin

Spicing up Agile Retrospectives - Agile Tour London 2015 - Ben Linders

Ponencia Agile Tour

Srijan - agile tour 2015 -- building agile cultures

Agile tour 2011 radu davidescu

Agile tour Thailand2014: Agile & Psychology

TDD Agile Tour Beirut

Keynote Need for Continuous Improvement - Agile Tour Kaunas 2016 - Ben Linders

A tour of agile tour beirut 2013

Saying "no" is agile (Agile Tour Riga 2012)

Agile tour 2011 ralph jocham

Rethinking Agile Transformation - Agile Tour Montreal Keynote

Agile Tour DC Chasing Windmills: Agile in the Government

FLUPA Agile Tour 2010

Value Driven Development / Agile with GUTS - Présentation Agile Tour Strasbourg

Expeirence design-vietnam-agile tour

EventStorming Agile Tour Aix-Marseille