Covrig: A Framework for the Analysis of Code, Test, and Coverage Evolution in Real Software
Paul Marinescu, Petr Hosek, Cristian Cadar
Imperial College London


Page 1: Covrig : A Framework for the Analysis of Code, Test, and Coverage Evolution in Real Software

Covrig: A Framework for the Analysis of Code, Test, and Coverage Evolution in Real Software

Paul Marinescu, Petr Hosek, Cristian Cadar
Imperial College London

Page 2

Goal

• Answer questions about software evolution
  – Code quality
  – Test quality
  – Development model
  – Testing improvement opportunities

…using historical software development data

Page 3

Target Audience

• Researchers
  – Hypothesis validation (e.g. are software patches poorly tested?)

• Programmers/Project Managers
  – Assess development quality

Page 4

Software Metrics

• Static
  – Measured by parsing the software artifacts

• Dynamic
  – Require running the evolving software
  – More challenging
  – Very few studies

Page 5

Example questions

1. Do executable and test code evolve in sync?
2. How many patches touch only code/test/none/both?
3. What is the distribution of patch sizes?
4. How spread out is each patch through the code?
5. Is test suite execution deterministic?
6. How does the overall coverage evolve?
7. What is the distribution of patch coverage across revisions?
8. What is the latent patch coverage?
9. Are bug fixes better covered than other patches?
10. Is the coverage of buggy code less than average?
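Question 2 can be made concrete with a small classifier. The sketch below is only an illustration under stated assumptions, not Covrig's actual logic: it assumes test code lives under a hypothetical `tests/` or `test/` directory and that executable code is recognized by a handful of C/C++ file extensions. Real projects would need per-project rules.

```python
from collections import Counter

# Hypothetical conventions, chosen for illustration only.
TEST_DIRS = ("tests/", "test/")
CODE_EXTS = (".c", ".h", ".cc", ".cpp", ".hpp")


def classify_patch(paths):
    """Return 'code', 'test', 'both' or 'none' for the files a patch touches."""
    touches_test = any(p.startswith(TEST_DIRS) for p in paths)
    touches_code = any(
        p.endswith(CODE_EXTS) and not p.startswith(TEST_DIRS) for p in paths
    )
    if touches_code and touches_test:
        return "both"
    if touches_code:
        return "code"
    if touches_test:
        return "test"
    return "none"


def patch_type_histogram(patches):
    """Count patch types over a sequence of per-patch file lists."""
    return Counter(classify_patch(paths) for paths in patches)
```

Calling `patch_type_histogram` on the file lists of all revisions gives the counts needed to answer the question for one project.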

Page 6

Data mining infrastructure

Empirical case study

Page 7

Covrig Overview

[Figure: Covrig overview diagram with elements numbered 1, 2 and 3]

Page 8

Docker Containers

• Lightweight, OS-level virtualization
  – Guest shares kernel with host
  – Namespace isolation
    • PID
    • Network
    • IPC
    • Filesystem
  – Resource limiting

• cgroups + Linux Containers + Docker
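To illustrate how isolation and resource limiting look in practice, here is a minimal sketch that only builds a `docker run` command line. It is not taken from Covrig: the helper name `docker_run_argv`, the image name, and the default limits are assumptions made for this example. The flags `--rm`, `--memory` and `--cpus` are standard `docker run` options.

```python
def docker_run_argv(image, command, memory="1g", cpus="1"):
    """Build a `docker run` command line for one isolated, resource-limited run."""
    return [
        "docker", "run",
        "--rm",              # remove the container when the command exits
        "--memory", memory,  # memory limit, enforced through cgroups
        "--cpus", cpus,      # CPU limit, enforced through cgroups
        image,
    ] + list(command)


# Hypothetical image name and test command, for illustration only.
argv = docker_run_argv("covrig-example", ["make", "check"], memory="2g", cpus="2")
```

The resulting list can be passed to `subprocess.run(argv)` on a machine where Docker is installed. Each invocation starts from the same image, which is what makes repeated runs consistent.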

Page 9

Docker Containers Features

• Isolation
• Consistency
• Reproducibility
• Easy cloud deployment
• Performance

Page 10

Covrig

Page 11

Metric                     Granularity

Static
  Test size                Lines
  Executable code size     Lines
  Patch executable size    Hunks, Files

Dynamic
  Overall coverage         Lines, Branches
  Patch coverage           Lines, Branches
  Latent patch coverage    Lines
  Test result              FAIL/PASS
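As a rough illustration of the line-granularity patch coverage metric, the sketch below computes the fraction of a patch's executable lines that the test suite executed. This is one plausible reading of the metric rather than Covrig's implementation; the function name and the representation of lines as `(file, line_number)` pairs are assumptions.

```python
def patch_coverage(patch_lines, executable_lines, covered_lines):
    """Fraction of a patch's executable lines that the test suite executed.

    All arguments are sets of (file, line_number) pairs. Returns None when the
    patch contains no executable lines, since the ratio is then undefined.
    """
    relevant = patch_lines & executable_lines
    if not relevant:
        return None
    return len(relevant & covered_lines) / len(relevant)


# A patch touching three executable lines and one documentation line.
patch = {("a.c", 1), ("a.c", 2), ("a.c", 3), ("README", 1)}
executable = {("a.c", 1), ("a.c", 2), ("a.c", 3), ("a.c", 4)}
covered = {("a.c", 1), ("a.c", 4)}
ratio = patch_coverage(patch, executable, covered)  # one of three lines covered
```

Returning `None` for patches without executable lines keeps them from being counted as either 0% or 100% covered.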

Page 12

Challenges

Evolving dependencies
Evolving containers
Custom compile flags (-Wno-error)

Page 13

Challenges

Branching development structure
Consider only the ‘main’ branch

[Figure: history with developers Alice and Bob, revisions r1 to r4 and merge m1; the resulting main-branch sequence is r1, r3, r2+r4]
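One way to restrict a branching history to its main line is to follow only the first parent of each commit, which is the traversal `git log --first-parent` performs in Git. The sketch below does this on an in-memory commit graph. The graph is an assumed reading of the slide's example (Alice commits r1 and r3, Bob commits r2 and r4 on a branch from r1, and m1 merges Bob's branch), and the function name is hypothetical.

```python
def main_branch(head, parents):
    """Follow first parents from `head` back to the root; return oldest-first.

    `parents` maps a commit id to the list of its parent ids, first parent
    first (empty list for the root commit).
    """
    chain = []
    commit = head
    while commit is not None:
        chain.append(commit)
        ps = parents.get(commit, [])
        commit = ps[0] if ps else None
    chain.reverse()
    return chain


# Assumed topology for the slide's example.
parents = {
    "r1": [],
    "r3": ["r1"],
    "r2": ["r1"],
    "r4": ["r2"],
    "m1": ["r3", "r4"],
}
history = main_branch("m1", parents)
```

Under this assumed topology the main line is r1, r3, m1. Diffing m1 against its first parent r3 yields the combined changes of r2 and r4, which matches the r2+r4 entry in the slide.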

Page 14

Challenges

Revisions that fail to compile
Accumulate until reaching a compilable revision

[Figure: revisions r1, r2 and r3 accumulated into a single revision r1+r2+r3]
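The accumulation rule can be sketched as a simple grouping pass. This is a minimal illustration, not Covrig's code: the function name and the `compiles` predicate are assumptions, and in a real setup the predicate would trigger an actual build.

```python
def accumulate_until_compilable(revisions, compiles):
    """Group consecutive revisions so that each group ends at a compilable one.

    `compiles` is a predicate on a revision id. Trailing revisions that never
    reach a compilable state are returned as a final, incomplete group.
    """
    groups, pending = [], []
    for rev in revisions:
        pending.append(rev)
        if compiles(rev):
            groups.append(pending)
            pending = []
    if pending:
        groups.append(pending)
    return groups


# r1 and r2 fail to compile, r3 compiles: the three are analysed as one patch.
groups = accumulate_until_compilable(
    ["r0", "r1", "r2", "r3"], lambda rev: rev in {"r0", "r3"}
)
```

Each group is then treated as one patch, so metrics such as patch size and patch coverage are computed over the combined changes.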

Page 15

Data mining infrastructure

Empirical case study

Page 16

Case Study Subjects

App          ELOC      Tests lang   Tests LOC   Period (mo)
Binutils     27,029    DejaGnu          5,186            35
Git          79,760    C/shell        108,464             5
Lighttpd     23,884    Python           2,440            36
Memcached     4,426    C/Perl           4,605            47
Redis        18,203    Tcl              7,589             6
ZeroMQ        7,276    C++              3,460            17

1500 revisions and 12 years of development in total

Page 17

Patch type

Page 18

Is test suite execution deterministic?

FAIL/PASS determinism

                             Binutils   Git   Lighttpd   Memcached   Redis   ZeroMQ
Nondeterministic revisions          0     1          1          21      16       32

Page 19

Is test suite execution deterministic?

Coverage determinism

                                  Binutils   Git   Lighttpd   Memcached   Redis   ZeroMQ
Nondeterministic lines (median)          0    13         10         8.5      23       27
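Both kinds of nondeterminism can be detected by running the test suite several times on the same revision and comparing the results. The sketch below assumes, for illustration, that each run yields a PASS/FAIL outcome and a set of covered `(file, line_number)` pairs; it treats a line as nondeterministic when it is covered in some runs but not in all. The function names are hypothetical.

```python
def nondeterministic_lines(runs):
    """Lines covered in at least one run but not in every run.

    `runs` is a non-empty list of sets of (file, line_number) pairs, one set
    per execution of the test suite on the same revision.
    """
    return set.union(*runs) - set.intersection(*runs)


def is_result_nondeterministic(outcomes):
    """True if repeated runs of one revision disagree on PASS/FAIL."""
    return len(set(outcomes)) > 1


# Three runs of the same revision with slightly different coverage.
runs = [
    {("a.c", 1), ("a.c", 2)},
    {("a.c", 1)},
    {("a.c", 1), ("a.c", 3)},
]
flaky = nondeterministic_lines(runs)
```

Counting `len(nondeterministic_lines(runs))` per revision and taking the median across revisions gives a figure comparable in spirit to the table above.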

Page 20

Test Suite Nondeterminism Causes

• Bugs
  – Race conditions
  – Hardcoded wall clock timeouts
  – Incorrect resource consumption expectations

• Random test data
• Benign race conditions

Page 21

Are patches properly tested?

Sometimes

Page 22

Patch coverage

Page 23

Patch coverage

[Figure: patch coverage plot; only six 0% data labels survive]

Page 24

Does covered code contain fewer bugs than uncovered code?

Not really

Page 25

Does covered code contain fewer bugs than uncovered code?

             Patch coverage (median)   Patches fully covered
             Buggy      All            Buggy      All
Memcached    100%       89%            67%        45%
Redis         94%        0%            47%        25%
ZeroMQ        71%       76%            37%        33%

85 total bugs

Page 26

Conclusions

Dynamic software metrics mining

Case study on 6 systems/1500 revisions/12 years of development

Open source extensible infrastructure
http://srg.doc.ic.ac.uk/projects/covrig/