Aug./Sept. 2009
Data Mutation Software Testing
1
Data Mutation Testing -- A Method for Automated Generation of Structurally Complex Test Cases
Hong Zhu, Dept. of Computing and Electronics, Oxford Brookes Univ., Oxford, OX33 1HX, UK
Email: hzhu@brookes.ac.uk
Outline
- Motivation: overview of existing work on software test case generation; the challenges to software testing
- The Data Mutation Testing method: basic ideas; process; measurements
- A case study: subject software under test; the mutation operators; experiment process; main results
- Perspectives and future work: potential applications; integration with other black-box testing methods
Motivation
Test case generation needs to meet multiple goals:
- Reality: represent the real operation of the system
- Coverage: functions, program code, input/output data space, and their combinations
- Efficiency: no overkill, easy to execute, etc.
- Effectiveness: capable of detecting faults, which implies the correctness of the program's output is easy to check
- External usefulness: help with debugging, reliability estimation, etc.
Test case generation has a huge impact on test effectiveness and efficiency, and is one of the most labour-intensive tasks in practice.
Existing Work
Program-based test case generation
- Static: analysis of code without execution, e.g. symbolic execution
  - Path oriented: Howden (1975, 1977, 1978); Ramamoorthy, Ho and Chen (1976); King (1975); Clarke (1976); Xie, Marinov and Notkin (2004); Zhang (2004); Xu and Zhang (2006)
  - Goal oriented: DeMillo, Guindi, McCracken, Offutt and King (1988); Pargas, Harrold and Peck (1999); Gupta, Mathur and Soffa (2000)
- Dynamic: through execution of the program: Korel (1990); Beydeda and Gruhn (2003)
- Hybrid: combination of dynamic execution with symbolic execution, e.g. concolic techniques: Godefroid, Klarlund and Sen (2005)
Techniques: constraint solvers; heuristic search, e.g. genetic algorithms: McMinn and Holcombe (2003); survey: McMinn (2004)
Specification-based test case generation
Derive test cases from either formal or semi-formal specifications of the required functions and/or the designs.
- Formal specification-based:
  - First order logic, Z specifications and logic programs: Tai (1993); Stocks and Carrington (1993); Ammann and Offutt (1994); Denney (1991)
  - Algebraic specifications: Bouge, Choquet, Fribourg and Gaudel (1986); Doong and Frankl (1994); Chen, Tse and Chen (2001); Zhu (2007)
  - Finite state machines: Fujiwara et al. (1991); Lee and Yannakakis (1996); Hierons (2001); Zhu, Jin and Diaper (1999)
  - Petri nets: Morasca and Pezze (eds) (1990); Zhu and He (2002)
- Model-based: derive from semi-formal graphic models
  - SSADM models: Zhu, Jin and Diaper (1999, 2001)
  - UML models: Offutt and Abdurazik (2000); Tahat et al. (2001); Hartman and Nagin (2004); Li, Wang and Qi (2004)
Techniques: constraint solving; theorem provers; model-checkers
Random testing
Generate test cases through random sampling over the input domain, based on probabilistic models of the operation of the software under test.
- Profile-based: sampling at random over an existing operational profile
- Stochastic model based: use a probabilistic model of software usage
  - Markov chains: Avritzer and Larson (1993); Avritzer and Weyuker (1994); Whittaker and Poore (1993); Guen, Marie and Thelin (2004); Prowell (2005)
  - Stochastic automata networks: Farina, Fernandes and Oliveira (2002, 2004)
  - Bayesian networks: Fine and Ziv (2003)
- Adaptive random testing: even spread of random test cases (Chen, Leung and Mak, 2004); variants: Mirror, Restricted, and Probabilistic ART
Domain-specific techniques
- Database applications: Zhang, Xu and Cheung (2001)
- Spreadsheets: Fisher, Cao, Rothermel, Cook and Burnett (2002); Erwig, Abraham, Cooperstein and Kollmansberger (2005)
- XML Schema: Lee and Offutt (2001); Li and Miller (2005)
- Compilers: see Boujarwah and Saleh (1997) for a survey
The Challenge
How to generate adequate test cases of high reality for programs that process structurally complex inputs?
Structural complexity:
- A large number of elements
- A large number of possible explicitly represented relationships between the elements
- A large number of constraints imposed on the relationships
- The meaning of the data depends not only on the values of the elements, but also on the relationships, and so does their processing
Reality:
- Likely to be, or close to, a correct real input in the operation of the system
- Likely to be, or close to, an input containing the kinds of errors a user makes when entering data into the system in operation
Examples: CAD, word processors, web browsers, spreadsheets, Powerpoint, software modelling tools, language processors, theorem provers, model-checkers, speech recognition, handwriting recognition, search engines, ...
Basic Ideas of Data Mutation Testing
1. Prepare the seeds, i.e. a small set of test cases that:
   - contain various types of elements and relationships between them;
   - are highly close to real input data;
   - are easy to check for correctness.
2. Generate mutant test cases by modifying the seeds slightly:
   - preserve the validity of the input;
   - change one place at a time unless the constraints impose otherwise (but second or even higher order mutants may be used);
   - make as many different mutants as possible.
3. Execute the software under test on both the seeds and their mutants. Observe:
   - the program's correctness on both seeds and mutants;
   - the differences between the program's behaviours on the seeds and their mutants.
Use metrics and measurements to check that the seeds are sufficient and the mutations are effective and/or sufficient; feed back to steps 1 and 2 if necessary, or improve the observation.
Illustrative Example: Triangle classification
Input: x, y, z: natural numbers
Output: {equilateral, isosceles, scalene, non-triangle}
Seeds (the lengths of the sides and the expected type of triangle):

ID   Input             Expected output
t1   (x=5, y=5, z=5)   Equilateral
t2   (x=5, y=5, z=7)   Isosceles
t3   (x=5, y=7, z=9)   Scalene
t4   (x=3, y=5, z=9)   Non-triangle
Mutation operators
- IVP: increase the value of a parameter by 1
- DVP: decrease the value of a parameter by 1
- SPL: set the value of a parameter to a very large number, say 1000000
- SPZ: set the value of a parameter to 0
- SPN: set the value of a parameter to a negative number, say -2
- WXY: swap the values of parameters x and y
- WXZ: swap the values of parameters x and z
- WYZ: swap the values of parameters y and z
- RPL: rotate the values of the parameters towards the left
- RPR: rotate the values of the parameters towards the right
Generation of mutant test cases
For example, by applying the mutation operator IVP to test case t1 on parameter x, we obtain the following test case t5:
IVP(t1, x) = t5 = Input: (x=6, y=5, z=5)
Total number of mutants: (5 value operators x 3 parameters + 5 swap/rotate operators) x 4 seeds = 80
The mutants cover all sorts of combinations of data elements and are systematically produced from the four seeds.
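The generation step above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the paper's tool; the function names (ivp, dvp, ...) mirror the operator names but are otherwise assumptions.

```python
# Sketch of the ten mutation operators for the Triangle Classification
# example and the generation of first-order mutants from the four seeds.

SEEDS = {
    "t1": (5, 5, 5),   # equilateral
    "t2": (5, 5, 7),   # isosceles
    "t3": (5, 7, 9),   # scalene
    "t4": (3, 5, 9),   # non-triangle
}

# Value operators: each is applied to one parameter (x, y or z) at a time.
def ivp(t, i):  # increase the value of a parameter by 1
    v = list(t); v[i] += 1; return tuple(v)

def dvp(t, i):  # decrease the value of a parameter by 1
    v = list(t); v[i] -= 1; return tuple(v)

def spl(t, i):  # set a parameter to a very large number
    v = list(t); v[i] = 1000000; return tuple(v)

def spz(t, i):  # set a parameter to 0
    v = list(t); v[i] = 0; return tuple(v)

def spn(t, i):  # set a parameter to a negative number
    v = list(t); v[i] = -2; return tuple(v)

VALUE_OPS = [ivp, dvp, spl, spz, spn]

# Structural operators: swap or rotate the whole input tuple.
STRUCT_OPS = {
    "WXY": lambda t: (t[1], t[0], t[2]),
    "WXZ": lambda t: (t[2], t[1], t[0]),
    "WYZ": lambda t: (t[0], t[2], t[1]),
    "RPL": lambda t: (t[1], t[2], t[0]),   # rotate left
    "RPR": lambda t: (t[2], t[0], t[1]),   # rotate right
}

def mutants_of(seed):
    """All first-order mutants of one seed: 5 value ops x 3 params + 5."""
    ms = [op(seed, i) for op in VALUE_OPS for i in range(3)]
    ms += [f(seed) for f in STRUCT_OPS.values()]
    return ms

all_mutants = {sid: mutants_of(t) for sid, t in SEEDS.items()}
total = sum(len(ms) for ms in all_mutants.values())
print(total)  # (5*3 + 5) * 4 = 80
```

Running this reproduces the count from the slide: 20 mutants per seed, 80 in total, with IVP(t1, x) yielding t5 = (6, 5, 5).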
Execution of the program and classification of mutants
A mutant is classified as dead if the execution of the software under test on the mutant differs from the execution on its seed test case. Otherwise, the mutant is classified as alive.
For example, for a correctly implemented Triangle Classification program TrC, the execution on the mutant test case t5 will output isosceles, while the execution on its seed t1 will output equilateral:
TrC(t5) != TrC(t1), so t5 is dead.
Note that the classification depends on how you observe the behaviour!
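The dead/alive decision can be made concrete with a small sketch, assuming a correct classifier as the program under test and using only the program's output as the observed behaviour (richer observations, e.g. execution paths, would classify more mutants as dead).

```python
# A correct Triangle Classification program and the dead/alive
# classification of a mutant against its seed (illustrative sketch).

def trc(x, y, z):
    """Classify the triangle with side lengths x, y, z."""
    if min(x, y, z) <= 0 or x + y <= z or x + z <= y or y + z <= x:
        return "non-triangle"
    if x == y == z:
        return "equilateral"
    if x == y or y == z or x == z:
        return "isosceles"
    return "scalene"

def classify(seed, mutant, program=trc):
    """Dead if the observed behaviour (here: the output) on the mutant
    differs from the behaviour on the seed; otherwise alive."""
    return "dead" if program(*mutant) != program(*seed) else "alive"

t1 = (5, 5, 5)            # seed: equilateral
t5 = (6, 5, 5)            # IVP(t1, x): isosceles
print(classify(t1, t5))   # dead, since trc(t5) != trc(t1)
```

With this observation, RPL(t2) = (5, 7, 5) stays alive: both t2 and its mutant are classified isosceles even though a correct program takes different execution paths, which is exactly the point made on the next slides.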
Analysing test effectiveness
Reasons why a mutant can remain alive:
- The mutant is equivalent to the original with respect to the functionality or property of the software under test, e.g. RPL(t1) = t1.
- The observation of the behaviour and output of the software under test is not sufficient to detect the difference, e.g. RPL(t2) = t6 = Input: (x=5, y=7, z=5): same output, but different execution paths for a correct program.
- The software is incorrectly designed and/or implemented, so that it is unable to differentiate the mutant from the original.
Measurements of Data Mutation
Let TM be the total number of mutants, EM the number of equivalent mutants, and LM the number of live mutants.
- Equivalent mutant score: EMS = EM / TM.
  A high EMS indicates that the mutation operators have not been well designed to achieve variety in the test cases.
- Live mutant score: LMS = LM / (TM - EM).
  A high LMS indicates that the observation of the behaviour and output of the software under test is insufficient.
- Typed live mutant score: LMS_φ = LM_φ / (TM_φ - EM_φ), where φ is a type of mutation operators.
  A high LMS_φ reveals that the program is not sensitive to that type of mutation, probably because of a fault in the design or implementation.
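The scores above are simple ratios; a minimal sketch, with illustrative counts rather than the case-study data:

```python
# Mutant score measurements of data mutation testing.
# TM = total mutants, EM = equivalent mutants, LM = live (non-equivalent) mutants.

def ems(tm, em):
    """Equivalent mutant score: EM / TM."""
    return em / tm

def lms(tm, em, lm):
    """Live mutant score: LM / (TM - EM), live counted among non-equivalent mutants."""
    return lm / (tm - em)

# e.g. 80 mutants of which 4 are equivalent and 23 of the rest remain alive:
print(ems(80, 4))      # 0.05
print(lms(80, 4, 23))  # 23/76, roughly 0.30
```

The same `lms` function applies per operator type φ by restricting TM, EM and LM to the mutants generated by that type.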
Process of Data Mutation Testing
1. Prepare seed test cases
2. Develop data mutation operators
3. Generate mutant test cases
4. Execute the software on seeds and mutants
5. Classify mutant test cases
6. Analyse test effectiveness
   - A high equivalent mutant score or a lack of mutants feeds back to step 2: revise the mutation operators.
   - A lack of seeds feeds back to step 1: prepare new seed test case(s).
7. Analyse software correctness; if faults are found, modify the software under test.
8. Analyse test adequacy
Analysis of Program Correctness
Can data mutation testing help with the analysis of program correctness? Consider the examples in Triangle Classification:
- Applying IVP or DVP to test case t1, we can expect the output to be isosceles.
- For the RPL, RPR, WXY, WXZ, and WYZ mutation operators, we can expect the program to output the same classification on a seed and its mutant test cases.
If the software's behaviour on a mutant is not as expected, an error in the software under test has been detected.
Case Study
The subject:
- CAMLE: Caste-centric Agent-oriented Modelling Language and Environment
- An automated modelling tool for an agent-oriented methodology, developed at NUDT, China
Potential threats to the validity of the case study:
- The subject was developed by the tester.
- The developer is not a professional software developer.
Validation of the case study against the potential threats:
- The test method is black-box testing, so knowledge of the code and program structure does not affect the outcomes.
- The subject was developed before the case study, and no change at all was made during its course to enable the case study to be carried out.
- In software testing practice, systems are often tested by their developers.
- The developer is a capable master's degree student with training at least equivalent to an average programmer.
- The correctness of the program's output can be judged objectively.
Complexity of the Input Data
Input: models in the CAMLE language.
Multiple views:
- a caste diagram that describes the static structure of a multi-agent system,
- a set of collaboration diagrams that describe how agents collaborate with each other,
- a set of scenario diagrams that describe typical scenarios, namely situations in the operation of the system, and
- a set of behaviour diagrams that define the behaviour rules of the agents in the context of various scenarios.
Well-formedness constraints:
- Each diagram has a number of different types of nodes and arcs, etc.
- Each diagram and the whole model must satisfy a set of well-formedness conditions to be considered a valid input (e.g. the types of nodes and arcs must match each other).
The Function to Be Tested
Consistency checker: the consistency constraints are formally defined in first order logic.
Potential threat to the validity: the program is not representative.
Validation of the case study: the program's input is structurally complex, and the program is non-trivial.

Table 1. Summary of CAMLE's Consistency Constraints

                                        Local   Global
Horizontal consistency (intra-model):
  Intra-diagram                           1       0
  Inter-diagram                           8       8
Vertical consistency (inter-model)        4      14
Types of Data Mutation Operators

1  Add diagram         Add a collaboration, behaviour or scenario diagram
2  Delete diagram      Delete an existing diagram
3  Rename diagram      Change the title of an existing diagram
4  Add node            Add a node of some type to a diagram
5  Add node with edge  Add a node and link it to an existing node
6  Add edge            Add an edge of some type to a diagram
7  Replicate node      Replicate an existing node in a diagram
8  Delete node         Delete an existing node in a diagram
9  Rename node         Rename an existing node in a diagram
10 Change node type    Replace an existing node with a new node of another type
11 Add sub diagram     Generate a sub-collaboration diagram for an existing node
12 Delete env node     Delete an existing env node in a sub-collaboration diagram
13 Rename env node Rename an existing environment node in a sub-collaboration diagram
14 Delete node annotation Remove an annotation on an existing node
15 Replicate edge Replicate an existing non-interaction edge
16 Delete edge Delete an existing edge in a diagram
17 Change edge association Change the Start or End node of an existing edge
18 Change edge direction Reverse the direction of an existing edge
19 Change edge type Replace an existing edge in a diagram with a new edge of another type
20 Replicate interaction edge Replicate an existing interaction edge without Action List
21 Replicate interaction Replicate an existing interaction edge with Action List
22 Change edge annotation Change the Action List annotated to an existing interaction edge
23 Delete edge annotation Delete the Action List of an existing interaction edge
24 Change edge end to env Change the Start or End node of an existing edge to an env node
The Seed Test Cases
Models developed in previous case studies of the agent-oriented software development methodology:
- The evolutionary multi-agent Internet information retrieval system Amalthaea (originally developed at the MIT Media Lab);
- An online auction web service;
- The agent-oriented model of the United Nations Security Council (UNSC): its organisational structure and the work procedure for passing resolutions.
All seeds passed the consistency check before the case study started, and no change was made to the seeds during the case study.
The Seed Test Cases and Their Mutants

                                Total   UNSC   Auction   Amalthaea
Caste Diagram       #Diagrams       3      1         1           1
                    #Nodes         19      3         7           9
                    #Edges         17      4         6           7
Collaboration       #Diagrams      12      4         5           3
Diagram             #Nodes         37      8        14          15
                    #Edges         49      6        17          26
Behaviour Diagram   #Diagrams      16      2         6           8
                    #Nodes        270     43       115         112
                    #Edges        170     28        75          67
Scenario Diagram    #Diagrams       3      0         1           2
                    #Nodes         26      0         4          22
                    #Edges         11      0        11           0
Total               #Diagrams      34      7        13          14
                    #Nodes        352     54       140         158
                    #Edges        257     38        99         110
Number of Mutants                7808   1082      3260        3466
The Results: Fault Detecting Ability

                                  No. of     No. detected   No. detected   Indigenous
Fault Type                        inserted   by seeds       by mutants     faults
Domain: Missing path                  12      5 (42%)       12 (100%)          2
Domain: Path selection                17      8 (47%)       17 (100%)          2
Computation: Incorrect variable       24     14 (58%)       21 (88%)           0
Omission of statements                31     13 (42%)       31 (100%)          0
Incorrect expression                  15      9 (60%)       14 (93%)           1
Transposition of statements           19     12 (63%)       19 (100%)          0
Total                                118     61 (52%)      114 (97%)           5
Detecting Design Errors
In the case study, we found that a large number of mutants remained alive.

Table. The numbers of alive and dead mutants

Seed        #Mutant   #Dead   #Alive   %Dead
Amalthaea      3466     697     2769   20.11%
Auction        3260     422     2838   12.94%
UNSC           1082     167      915   15.43%
Total          7808    1286     6522   16.47%

Review: three possible reasons:
(a) improper design of the data mutation operators;
(b) insufficient observation of the behaviour and output;
(c) defects in the software under test.
Statistics on the Amalthaea test suite

Operator type               #Total   #Dead   #Live   %Dead
Add diagram                      3       2       1     67%
Delete diagram                   9       2       7     22%
Rename diagram                   9       9       0    100%
Add node                        88      14      74     16%
Combine node                    61      39      22     64%
Add edge                      1378     173    1205     13%
Replicate node                 130       0     130      0%
Delete node                    147      37     110     25%
Rename node                    123      77      46     63%
Change node type                61      24      37     39%
Add sub diagram                  8       8       0    100%
Delete environment node          4       4       0    100%
Rename environment node          4       4       0    100%
Delete annotation on node       39       0      39      0%
Replicate edge                  22       0      22      0%
...

Some typed mutation scores are very low: the design of the consistency checker has errors! In particular, the consistency constraints are weak.
Results: Detecting Design Errors
Hypothesis: the design of the tool is weak in detecting certain types of inconsistency or incompleteness.
Validation of the hypothesis:
- Strengthen the well-formedness constraints
- Strengthen the consistency constraints: 3 constraints modified
- Introduce new completeness constraints: 13 new constraints introduced
- Test again using the same seeds and the same mutation operators
A significant change in the statistics was observed.

Table. The statistics of alive and dead mutants after modification

Seed        #Mutant   #Dead   #Alive   %Dead
Amalthaea      3065    2692      373   87.83%
Auction        3095    2579      516   83.33%
UNSC            992     821      171   82.76%
Total          7152    6092     1060   85.18%
Test Adequacy
Our experiments show that high test adequacy can be achieved through data mutation:
- Coverage of the input data space, measured by the coverage of the various kinds of mutants
- Coverage of the program structure, measured by code coverage (the branches covered)
- Coverage of the functions of the requirements, measured by the consistency constraints used in checking
Two factors determine the test adequacy: the seeds and the mutation operators.
Coverage of scenario diagram variants

Mutation operator type   Total   UNSC   Auction   Amalthaea
1                            3      1        1           1
2                            3      0        1           2
3                            3      0        1           2
4                           21      0        7          14
5                           17      0        3          14
6                           24      0        0          24
7                           24      0        4          20
8                           24      0        4          20
9                           24      0        4          20
10                          10      0        2           8
14                          10      0        2           8
16                          11      0       11           0
17                          40      0        0          40
Coverage of Program Structure and Functions
[Figure: bar charts of the number of mutants per error message type (1-21) and per warning message type (1-15), for the Amalthaea, Auction and UNSC seeds.]
The test data achieved 100% coverage of the functions of the consistency checker and 100% of the branches in the code.
Test Cost

Table. Summary of the test cost spent in the case study

Source of cost                                         Amount in case study
Design and implementation of data mutation operators   1.5 man-months
Development of seed test cases                         0 man-months
Analysis of program correctness on each test case      2 man-months (estimated)

The seeds were readily available from previous case studies of the tool.
Analysis of the Program's Correctness
The experiment took the black-box approach.
The output on a test case consists of:
- whether the input (a model) is consistent and complete;
- the error message(s) and/or warning message(s), if any.
The expected output on a mutant is specified, for example:

Mutant   Operator / Location          Violated     Message   Expected output:
No.                                   constraint   ID        #messages, message content
1        Add a new collaboration          1        E003      5, (Agent nodes in the main collaboration diagram)
         diagram / top of model           2        E004      6, (Caste nodes in the main collaboration diagram)
                                          5        E016      14, (Interaction edges in the main collaboration diagram)
Experiments
- Mutants were selected at random.
- The program's correctness on each mutant was checked manually.
- The time needed to check the correctness of the program on each test case was measured.
Two experiments were conducted:
- Experiment 1: 1 mutant selected at random from each set of mutants generated by one type of mutation operator (24 mutants in total); detected 2 faults in the checker and 1 fault in other parts of the tool.
- Experiment 2: 22 live mutants from the Amalthaea suite selected at random; detected 2 faults in other parts of the tool.
The Experiment Data

Type of Mutant   Aliveness   #Mutants   #Detected Faults
Equivalent       Alive             34           2
Equivalent       Dead               0           0
Non-equivalent   Alive              1           1
Non-equivalent   Dead              11           2

Results:
- Checking correctness on dead mutants: 3 minutes per mutant
- Checking correctness on live mutants: 1 minute per mutant
Related Work
- Mutation testing: the program or specification is modified; used as a criterion to measure test adequacy. Data mutation testing adopts the idea of mutation operators, but applies it to test cases in order to generate test cases, rather than to measure adequacy.
- Meek and Siu (1989): randomised error seeding into programs to test compilers.
- Adaptive random testing (Chen et al. 2003, 2004): random test cases as far apart as possible; not yet applied to structurally complex input spaces.
- Data perturbation testing (Offutt, 2001): tests XML messages for web services; an application-specific technique, applicable to XML files.
- Metamorphic testing (Chen, Tse, et al. 2003): a test oracle automation technique, focused on metamorphic relations rather than on generating test cases; could be integrated with the data mutation method.
Future Work
More case studies with potential applications:
- Security control software: Role-Based Access Control.
  Input: a role model ⟨Roles, Resources, Permissions: Roles → Resources, Constraints ⊆ Roles × Resources × Permissions⟩ and user assignments: Users → P(Roles).
- Virus detection.
  Input: files infected by viruses. Viruses are programs in assembly/binary code format; one virus may have many variants obtained by equivalent transformations of the code.
- Spreadsheet processing software and spreadsheet applications.
  Input: spreadsheets ⟨data cells, program cells⟩.
Perspectives and Future Work
Integration of the data mutation, metamorphic and algebraic testing methods.
- Let P: D → O be the program under test.
- Data mutation testing generates test cases using a set of data mutation operators Φ = {φ_i: D → D | 0 < i ≤ k}.
- Metamorphic testing uses a set of metamorphic relations Ψ = {ψ_i | 0 < i ≤ n} over inputs and outputs to check output correctness.
- We can use the φ_i to define metamorphic relations as follows:
  ψ_φ(x) = (x, φ(x), P(x), P(φ(x)))
Example
Consider the Triangle Classification program P. The following is a metamorphic relation:
  P(t) = equilateral => P(IVP(t)) = isosceles
For each of the data mutation operators φ = WXY, WXZ, WYZ, RPL, or RPR, the following is a metamorphic relation:
  P(φ(t)) = P(t)
We observed in the case study that data mutation operators are very helpful for finding metamorphic relations.
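The two relations above can be checked mechanically. A minimal sketch, assuming a correct classifier `P` as the program under test (the operator implementations are illustrative):

```python
# Checking the two metamorphic relations derived from data mutation
# operators against a (correct) Triangle Classification program P.

def P(t):
    x, y, z = t
    if min(x, y, z) <= 0 or x + y <= z or x + z <= y or y + z <= x:
        return "non-triangle"
    if x == y == z:
        return "equilateral"
    if x == y or y == z or x == z:
        return "isosceles"
    return "scalene"

def IVP(t, i=0):          # increase one parameter by 1
    v = list(t); v[i] += 1; return tuple(v)

PERMUTE = {               # WXY, WXZ, WYZ, RPL, RPR
    "WXY": lambda t: (t[1], t[0], t[2]),
    "WXZ": lambda t: (t[2], t[1], t[0]),
    "WYZ": lambda t: (t[0], t[2], t[1]),
    "RPL": lambda t: (t[1], t[2], t[0]),
    "RPR": lambda t: (t[2], t[0], t[1]),
}

t1 = (5, 5, 5)
# Relation 1: P(t) = equilateral  =>  P(IVP(t)) = isosceles
ok1 = P(t1) != "equilateral" or P(IVP(t1)) == "isosceles"
# Relation 2: P(phi(t)) = P(t) for every swap/rotate operator phi
ok2 = all(P(phi(t1)) == P(t1) for phi in PERMUTE.values())
print(ok1, ok2)  # True True
```

A faulty implementation of P would violate one of these checks on some seed or mutant, turning the mutation operators into an automated partial oracle.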
Integration with Algebraic Testing
In algebraic software testing, axioms are written in the form
  T1 = T'1 ∧ T2 = T'2 ∧ ... ∧ Tn = T'n => T = T',
where the Ti and T'i are terms constructed from variables and the functions/procedures/methods of the program under test.
The integration of data mutation testing, metamorphic testing and algebraic testing can be achieved by developing:
- a black-box software testing specification language;
- an automated tool to check metamorphic relations, using observation contexts to check whether a relation holds;
- support for invoking user-defined data mutation operators;
- support for specifying metamorphic relations.
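To illustrate what checking an axiom through an observation context means, here is a small sketch. The `Stack` class and the axiom pop(push(S, x)) = S are textbook examples chosen for illustration; they are not from the CAMLE case study or the CASCAT tool.

```python
# Checking an algebraic axiom via an observation context (a sketch).
# Two terms are considered equal if all observations made on them agree.

class Stack:
    def __init__(self, items=()):
        self._items = list(items)
    def push(self, x):
        return Stack(self._items + [x])
    def pop(self):
        return Stack(self._items[:-1])
    def top(self):
        return self._items[-1] if self._items else None

def observe(s):
    """Observation context: the visible behaviour used to compare terms."""
    return (s.top(), s.push(99).top())

def check_axiom(s, x):
    """Axiom (no premises): pop(push(S, x)) = S."""
    lhs, rhs = s.push(x).pop(), s
    return observe(lhs) == observe(rhs)

print(check_axiom(Stack([1, 2]), 7))  # True
```

Data-mutated inputs would supply the ground terms (the concrete `s` and `x` values) on which such axioms are exercised.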
Screen Snapshot of Algebraic Testing Tool CASCAT
References
- Shan, L. and Zhu, H., Generating Structurally Complex Test Cases by Data Mutation: A Case Study of Testing an Automated Modelling Tool, The Computer Journal, Special Issue on Automation of Software Test (in press).
- Shan, L. and Zhu, H., Testing Software Modelling Tools Using Data Mutation, Proc. of AST'06, ACM Press, 2006, pp. 43-49.
- Zhu, H. and Shan, L., Caste-Centric Modelling of Multi-Agent Systems: The CAMLE Modelling Language and Automated Tools, in Beydeda, S. and Gruhn, V. (eds), Model-driven Software Development, Research and Practice in Software Engineering, Vol. II, Springer, 2005, pp. 57-89.
- Kong, L., Zhu, H. and Zhou, B., Automated Testing of EJB Components Based on Algebraic Specifications, Proc. of TEST'07, IEEE CS Press, 2007.