Aug./Sept. 2009
Data Mutation Software Testing
1
Data Mutation Testing -- A Method for Automated Generation of Structurally Complex Test Cases
Hong Zhu, Dept. of Computing and Electronics, Oxford Brookes Univ., Oxford, OX33 1HX, UK
Email: hzhu@brookes.ac.uk
Outline
- Motivation: overview of existing work on software test case generation; the challenges to software testing
- The Data Mutation Testing method: basic ideas; process; measurements
- A case study: subject software under test; the mutation operators; experiment process; main results
- Perspectives and future work: potential applications; integration with other black-box testing methods
Motivation
Test case generation needs to meet multiple goals:
- Reality: represent the real operation of the system
- Coverage: functions, program code, input/output data space, and their combinations
- Efficiency: no overkill, easy to execute, etc.
- Effectiveness: capable of detecting faults, which implies the correctness of the program's output is easy to check
- External usefulness: help with debugging, reliability estimation, etc.
Test case generation has a huge impact on test effectiveness and efficiency, and is one of the most labour-intensive tasks in practice.
Existing Work
Program-based test case generation
- Static: analysis of code without execution, e.g. symbolic execution
  - Path oriented: Howden (1975, 1977, 1978); Ramamoorthy, Ho and Chen (1976); King (1975); Clarke (1976); Xie, Marinov and Notkin (2004); Zhang (2004); Xu and Zhang (2006)
  - Goal oriented: DeMillo, Guindi, McCracken, Offutt and King (1988); Pargas, Harrold and Peck (1999); Gupta, Mathur and Soffa (2000)
- Dynamic: through execution of the program: Korel (1990); Beydeda and Gruhn (2003)
- Hybrid: combination of dynamic execution with symbolic execution, e.g. concolic techniques: Godefroid, Klarlund and Sen (2005)
Techniques: constraint solvers; heuristic search, e.g. genetic algorithms: McMinn and Holcombe (2003); survey: McMinn (2004)
Specification-based test case generation
Derive test cases from either formal or semi-formal specifications of the required functions and/or the designs.
- Formal specification-based:
  - First order logic, Z specifications and logic programs: Tai (1993); Stocks and Carrington (1993); Ammann and Offutt (1994); Denney (1991)
  - Algebraic specifications: Bouge, Choquet, Fribourg and Gaudel (1986); Doong and Frankl (1994); Chen, Tse and Chen (2001); Zhu (2007)
  - Finite state machines: Fujiwara et al. (1991); Lee and Yannakakis (1996); Hierons (2001); Zhu, Jin and Diaper (1999)
  - Petri nets: Morasca and Pezze (eds) (1990); Zhu and He (2002)
- Model-based: derive from semi-formal graphic models
  - SSADM models: Zhu, Jin and Diaper (1999, 2001)
  - UML models: Offutt and Abdurazik (2000); Tahat et al. (2001); Hartman and Nagin (2004); Li, Wang and Qi (2004)
Techniques: constraint solving; theorem provers; model-checkers
Random testing
Generate test cases through random sampling over the input domain, based on probabilistic models of the operation of the software under test.
- Profile-based: sampling at random over an existing operational profile
- Stochastic model based: use a probabilistic model of software usage
  - Markov chains: Avritzer and Larson (1993); Avritzer and Weyuker (1994); Whittaker and Poore (1993); Guen, Marie and Thelin (2004); Prowell (2005)
  - Stochastic automata networks: Farina, Fernandes and Oliveira (2002, 2004)
  - Bayesian networks: Fine and Ziv (2003)
- Adaptive random testing: even spread of random test cases (Chen, Leung and Mak, 2004); variants: Mirror, Restricted, and Probabilistic ART
Domain-specific techniques
- Database applications: Zhang, Xu and Cheung (2001)
- Spreadsheets: Fisher, Cao, Rothermel, Cook and Burnett (2002); Erwig, Abraham, Cooperstein and Kollmansberger (2005)
- XML Schema: Lee and Offutt (2001); Li and Miller (2005)
- Compilers: see Boujarwah and Saleh (1997) for a survey
The Challenge
How to generate adequate test cases of high reality for programs that process structurally complex inputs?
Structural complexity:
- A large number of elements
- A large number of possible explicitly represented relationships between the elements
- A large number of constraints imposed on the relationships
- The meaning of the data depends not only on the values of the elements, but also on the relationships, and so does their processing
Reality:
- Likely to be, or close to, a correct real input in the operation of the system
- Likely to be, or close to, an input containing the kinds of errors a user makes when entering data into the system in operation
Examples: CAD, word processors, web browsers, spreadsheets, Powerpoint, software modelling tools, language processors, theorem provers, model-checkers, speech recognition, handwriting recognition, search engines, ...
Basic Ideas of Data Mutation Testing
1. Prepare the seeds, i.e. a small set of test cases that:
   - contain various types of elements and relationships between them;
   - are highly close to real input data;
   - are easy to check for correctness.
2. Generate mutant test cases by modifying the seeds slightly:
   - preserve the validity of the input;
   - change one place at a time unless the constraints impose otherwise (but second or even higher order mutants may be used);
   - make as many different mutants as possible.
3. Execute the software under test on both the seeds and their mutants. Observe:
   - the program's correctness on both seeds and mutants;
   - the differences between the program's behaviours on the seeds and their mutants.
Use metrics and measurements to check that the seeds are sufficient and the mutations are effective and/or sufficient; feed back to steps 1 and 2 if necessary, or improve the observation.
Illustrative Example: Triangle classification
Input: x, y, z: natural numbers
Output: {equilateral, isosceles, scalene, non-triangle}
Seeds (the lengths of the sides and the expected type of triangle):

ID   Input             Expected output
t1   (x=5, y=5, z=5)   Equilateral
t2   (x=5, y=5, z=7)   Isosceles
t3   (x=5, y=7, z=9)   Scalene
t4   (x=3, y=5, z=9)   Non-triangle
Mutation operators
- IVP: increase the value of a parameter by 1
- DVP: decrease the value of a parameter by 1
- SPL: set the value of a parameter to a very large number, say 1000000
- SPZ: set the value of a parameter to 0
- SPN: set the value of a parameter to a negative number, say -2
- WXY: swap the values of parameters x and y
- WXZ: swap the values of parameters x and z
- WYZ: swap the values of parameters y and z
- RPL: rotate the values of the parameters towards the left
- RPR: rotate the values of the parameters towards the right
Generation of mutant test cases
For example, by applying the mutation operator IVP to test case t1 on parameter x, we obtain the following test case t5:
IVP(t1, x) = t5 = Input: (x=6, y=5, z=5)
Total number of mutants: (5 value operators x 3 parameters + 5 swap/rotate operators) x 4 seeds = 80
The mutants cover all sorts of combinations of data elements and are systematically produced from the four seeds.
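The generation step above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the paper's tool; the function names (ivp, dvp, ...) mirror the operator names but are otherwise assumptions.

```python
# Sketch of the ten mutation operators for the Triangle Classification
# example and the generation of first-order mutants from the four seeds.

SEEDS = {
    "t1": (5, 5, 5),   # equilateral
    "t2": (5, 5, 7),   # isosceles
    "t3": (5, 7, 9),   # scalene
    "t4": (3, 5, 9),   # non-triangle
}

# Value operators: each is applied to one parameter (x, y or z) at a time.
def ivp(t, i):  # increase the value of a parameter by 1
    v = list(t); v[i] += 1; return tuple(v)

def dvp(t, i):  # decrease the value of a parameter by 1
    v = list(t); v[i] -= 1; return tuple(v)

def spl(t, i):  # set a parameter to a very large number
    v = list(t); v[i] = 1000000; return tuple(v)

def spz(t, i):  # set a parameter to 0
    v = list(t); v[i] = 0; return tuple(v)

def spn(t, i):  # set a parameter to a negative number
    v = list(t); v[i] = -2; return tuple(v)

VALUE_OPS = [ivp, dvp, spl, spz, spn]

# Structural operators: swap or rotate the whole input tuple.
STRUCT_OPS = {
    "WXY": lambda t: (t[1], t[0], t[2]),
    "WXZ": lambda t: (t[2], t[1], t[0]),
    "WYZ": lambda t: (t[0], t[2], t[1]),
    "RPL": lambda t: (t[1], t[2], t[0]),   # rotate left
    "RPR": lambda t: (t[2], t[0], t[1]),   # rotate right
}

def mutants_of(seed):
    """All first-order mutants of one seed: 5 value ops x 3 params + 5."""
    ms = [op(seed, i) for op in VALUE_OPS for i in range(3)]
    ms += [f(seed) for f in STRUCT_OPS.values()]
    return ms

all_mutants = {sid: mutants_of(t) for sid, t in SEEDS.items()}
total = sum(len(ms) for ms in all_mutants.values())
print(total)  # (5*3 + 5) * 4 = 80
```

Running this reproduces the count from the slide: 20 mutants per seed, 80 in total, with IVP(t1, x) yielding t5 = (6, 5, 5).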
Execution of the program and classification of mutants
A mutant is classified as dead if the execution of the software under test on the mutant differs from the execution on its seed test case. Otherwise, the mutant is classified as alive.
For example, for a correctly implemented Triangle Classification program TrC, the execution on the mutant test case t5 will output isosceles, while the execution on its seed t1 will output equilateral:
TrC(t5) != TrC(t1), so t5 is dead.
Note that the classification depends on how you observe the behaviour!
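The dead/alive decision can be made concrete with a small sketch, assuming a correct classifier as the program under test and using only the program's output as the observed behaviour (richer observations, e.g. execution paths, would classify more mutants as dead).

```python
# A correct Triangle Classification program and the dead/alive
# classification of a mutant against its seed (illustrative sketch).

def trc(x, y, z):
    """Classify the triangle with side lengths x, y, z."""
    if min(x, y, z) <= 0 or x + y <= z or x + z <= y or y + z <= x:
        return "non-triangle"
    if x == y == z:
        return "equilateral"
    if x == y or y == z or x == z:
        return "isosceles"
    return "scalene"

def classify(seed, mutant, program=trc):
    """Dead if the observed behaviour (here: the output) on the mutant
    differs from the behaviour on the seed; otherwise alive."""
    return "dead" if program(*mutant) != program(*seed) else "alive"

t1 = (5, 5, 5)            # seed: equilateral
t5 = (6, 5, 5)            # IVP(t1, x): isosceles
print(classify(t1, t5))   # dead, since trc(t5) != trc(t1)
```

With this observation, RPL(t2) = (5, 7, 5) stays alive: both t2 and its mutant are classified isosceles even though a correct program takes different execution paths, which is exactly the point made on the next slides.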
Analysing test effectiveness
Reasons why a mutant can remain alive:
- The mutant is equivalent to the original with respect to the functionality or property of the software under test, e.g. RPL(t1) = t1.
- The observation of the behaviour and output of the software under test is not sufficient to detect the difference, e.g. RPL(t2) = t6 = Input: (x=5, y=7, z=5): same output, but different execution paths for a correct program.
- The software is incorrectly designed and/or implemented, so that it is unable to differentiate the mutant from the original.
Measurements of Data Mutation
Let TM be the total number of mutants, EM the number of equivalent mutants, and LM the number of live mutants.
- Equivalent mutant score: EMS = EM / TM.
  A high EMS indicates that the mutation operators have not been well designed to achieve variety in the test cases.
- Live mutant score: LMS = LM / (TM - EM).
  A high LMS indicates that the observation of the behaviour and output of the software under test is insufficient.
- Typed live mutant score: LMS_φ = LM_φ / (TM_φ - EM_φ), where φ is a type of mutation operators.
  A high LMS_φ reveals that the program is not sensitive to that type of mutation, probably because of a fault in the design or implementation.
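The scores above are simple ratios; a minimal sketch, with illustrative counts rather than the case-study data:

```python
# Mutant score measurements of data mutation testing.
# TM = total mutants, EM = equivalent mutants, LM = live (non-equivalent) mutants.

def ems(tm, em):
    """Equivalent mutant score: EM / TM."""
    return em / tm

def lms(tm, em, lm):
    """Live mutant score: LM / (TM - EM), live counted among non-equivalent mutants."""
    return lm / (tm - em)

# e.g. 80 mutants of which 4 are equivalent and 23 of the rest remain alive:
print(ems(80, 4))      # 0.05
print(lms(80, 4, 23))  # 23/76, roughly 0.30
```

The same `lms` function applies per operator type φ by restricting TM, EM and LM to the mutants generated by that type.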
Process of Data Mutation Testing
1. Prepare seed test cases
2. Develop data mutation operators
3. Generate mutant test cases
4. Execute the software on seeds and mutants
5. Classify mutant test cases
6. Analyse test effectiveness
   - A high equivalent mutant score or a lack of mutants feeds back to step 2: revise the mutation operators.
   - A lack of seeds feeds back to step 1: prepare new seed test case(s).
7. Analyse software correctness; if faults are found, modify the software under test.
8. Analyse test adequacy
Analysis of Program Correctness
Can data mutation testing help with the analysis of program correctness? Consider the examples in Triangle Classification:
- Applying IVP or DVP to test case t1, we can expect the output to be isosceles.
- For the RPL, RPR, WXY, WXZ, and WYZ mutation operators, we can expect the program to output the same classification on a seed and its mutant test cases.
If the software's behaviour on a mutant is not as expected, an error in the software under test has been detected.
Case Study
The subject:
- CAMLE: Caste-centric Agent-oriented Modelling Language and Environment
- An automated modelling tool for an agent-oriented methodology, developed at NUDT, China
Potential threats to the validity of the case study:
- The subject was developed by the tester.
- The developer is not a professional software developer.
Validation of the case study against the potential threats:
- The test method is black-box testing, so knowledge of the code and program structure does not affect the outcomes.
- The subject was developed before the case study, and no change at all was made during its course to enable the case study to be carried out.
- In software testing practice, systems are often tested by their developers.
- The developer is a capable master's degree student with training at least equivalent to an average programmer.
- The correctness of the program's output can be judged objectively.
Complexity of the Input Data
Input: models in the CAMLE language.
Multiple views:
- a caste diagram that describes the static structure of a multi-agent system,
- a set of collaboration diagrams that describe how agents collaborate with each other,
- a set of scenario diagrams that describe typical scenarios, namely situations in the operation of the system, and
- a set of behaviour diagrams that define the behaviour rules of the agents in the context of various scenarios.
Well-formedness constraints:
- Each diagram has a number of different types of nodes and arcs, etc.
- Each diagram and the whole model must satisfy a set of well-formedness conditions to be considered a valid input (e.g. the types of nodes and arcs must match each other).
The Function to Be Tested
Consistency checker: the consistency constraints are formally defined in first order logic.
Potential threat to the validity: the program is not representative.
Validation of the case study: the program's input is structurally complex, and the program is non-trivial.

Table 1. Summary of CAMLE's Consistency Constraints

                                        Local   Global
Horizontal consistency (intra-model):
  Intra-diagram                           1       0
  Inter-diagram                           8       8
Vertical consistency (inter-model)        4      14
Types of Data Mutation Operators

1  Add diagram         Add a collaboration, behaviour or scenario diagram
2  Delete diagram      Delete an existing diagram
3  Rename diagram      Change the title of an existing diagram
4  Add node            Add a node of some type to a diagram
5  Add node with edge  Add a node and link it to an existing node
6  Add edge            Add an edge of some type to a diagram
7  Replicate node      Replicate an existing node in a diagram
8  Delete node         Delete an existing node in a diagram
9  Rename node         Rename an existing node in a diagram
10 Change node type    Replace an existing node with a new node of another type
11 Add sub diagram     Generate a sub-collaboration diagram for an existing node
12 Delete env node     Delete an existing env node in a sub-collaboration diagram
13 Rename env node Rename an existing environment node in a sub-collaboration diagram
14 Delete node annotation Remove an annotation on an existing node
15 Replicate edge Replicate an existing non-interaction edge
16 Delete edge Delete an existing edge in a diagram
17 Change edge association Change the Start or End node of an existing edge
18 Change edge direction Reverse the direction of an existing edge
19 Change edge type Replace an existing edge in a diagram with a new edge of another type
20 Replicate interaction edge Replicate an existing interaction edge without Action List
21 Replicate interaction Replicate an existing interaction edge with Action List
22 Change edge annotation Change the Action List annotated to an existing interaction edge
23 Delete edge annotation Delete the Action List of an existing interaction edge
24 Change edge end to env Change the Start or End node of an existing edge to an env node
The Seed Test Cases
Models developed in previous case studies of the agent-oriented software development methodology:
- The evolutionary multi-agent Internet information retrieval system Amalthaea (originally developed at the MIT Media Lab);
- An online auction web service;
- The agent-oriented model of the United Nations Security Council (UNSC): its organisational structure and the work procedure for passing resolutions.
All seeds passed the consistency check before the case study started, and no change was made to the seeds during the case study.
The Seed Test Cases and Their Mutants

                                Total   UNSC   Auction   Amalthaea
Caste Diagram       #Diagrams       3      1         1           1
                    #Nodes         19      3         7           9
                    #Edges         17      4         6           7
Collaboration       #Diagrams      12      4         5           3
Diagram             #Nodes         37      8        14          15
                    #Edges         49      6        17          26
Behaviour Diagram   #Diagrams      16      2         6           8
                    #Nodes        270     43       115         112
                    #Edges        170     28        75          67
Scenario Diagram    #Diagrams       3      0         1           2
                    #Nodes         26      0         4          22
                    #Edges         11      0        11           0
Total               #Diagrams      34      7        13          14
                    #Nodes        352     54       140         158
                    #Edges        257     38        99         110
Number of Mutants                7808   1082      3260        3466
The Results: Fault Detecting Ability

                                  No. of     No. detected   No. detected   Indigenous
Fault Type                        inserted   by seeds       by mutants     faults
Domain: Missing path                  12      5 (42%)       12 (100%)          2
Domain: Path selection                17      8 (47%)       17 (100%)          2
Computation: Incorrect variable       24     14 (58%)       21 (88%)           0
Omission of statements                31     13 (42%)       31 (100%)          0
Incorrect expression                  15      9 (60%)       14 (93%)           1
Transposition of statements           19     12 (63%)       19 (100%)          0
Total                                118     61 (52%)      114 (97%)           5
Detecting Design Errors
In the case study, we found that a large number of mutants remained alive.

Table. The numbers of alive and dead mutants

Seed        #Mutant   #Dead   #Alive   %Dead
Amalthaea      3466     697     2769   20.11%
Auction        3260     422     2838   12.94%
UNSC           1082     167      915   15.43%
Total          7808    1286     6522   16.47%

Review: three possible reasons:
(a) improper design of the data mutation operators;
(b) insufficient observation of the behaviour and output;
(c) defects in the software under test.
Statistics on the Amalthaea test suite

Operator type               #Total   #Dead   #Live   %Dead
Add diagram                      3       2       1     67%
Delete diagram                   9       2       7     22%
Rename diagram                   9       9       0    100%
Add node                        88      14      74     16%
Combine node                    61      39      22     64%
Add edge                      1378     173    1205     13%
Replicate node                 130       0     130      0%
Delete node                    147      37     110     25%
Rename node                    123      77      46     63%
Change node type                61      24      37     39%
Add sub diagram                  8       8       0    100%
Delete environment node          4       4       0    100%
Rename environment node          4       4       0    100%
Delete annotation on node       39       0      39      0%
Replicate edge                  22       0      22      0%
...

Some typed mutation scores are very low: the design of the consistency checker has errors! In particular, the consistency constraints are weak.
Results: Detecting Design Errors
Hypothesis: the design of the tool is weak in detecting certain types of inconsistency or incompleteness.
Validation of the hypothesis:
- Strengthen the well-formedness constraints
- Strengthen the consistency constraints: 3 constraints modified
- Introduce new completeness constraints: 13 new constraints introduced
- Test again using the same seeds and the same mutation operators
A significant change in the statistics was observed.

Table. The statistics of alive and dead mutants after modification

Seed        #Mutant   #Dead   #Alive   %Dead
Amalthaea      3065    2692      373   87.83%
Auction        3095    2579      516   83.33%
UNSC            992     821      171   82.76%
Total          7152    6092     1060   85.18%
Test Adequacy
Our experiments show that high test adequacy can be achieved through data mutation:
- Coverage of the input data space, measured by the coverage of the various kinds of mutants
- Coverage of the program structure, measured by code coverage (the branches covered)
- Coverage of the functions of the requirements, measured by the consistency constraints used in checking
Two factors determine the test adequacy: the seeds and the mutation operators.
Coverage of scenario diagram variants

Mutation operator type   Total   UNSC   Auction   Amalthaea
1                            3      1        1           1
2                            3      0        1           2
3                            3      0        1           2
4                           21      0        7          14
5                           17      0        3          14
6                           24      0        0          24
7                           24      0        4          20
8                           24      0        4          20
9                           24      0        4          20
10                          10      0        2           8
14                          10      0        2           8
16                          11      0       11           0
17                          40      0        0          40
Coverage of Program Structure and Functions
[Figure: bar charts of the number of mutants per error message type (1-21) and per warning message type (1-15), for the Amalthaea, Auction and UNSC seeds.]
The test data achieved 100% coverage of the functions of the consistency checker and 100% of the branches in the code.
Test Cost

Table. Summary of the test cost spent in the case study

Source of cost                                         Amount in case study
Design and implementation of data mutation operators   1.5 man-months
Development of seed test cases                         0 man-months
Analysis of program correctness on each test case      2 man-months (estimated)

The seeds were readily available from previous case studies of the tool.
Analysis of the Program's Correctness
The experiment took the black-box approach.
The output on a test case consists of:
- whether the input (a model) is consistent and complete;
- the error message(s) and/or warning message(s), if any.
The expected output on a mutant is specified, for example:

Mutant   Operator / Location          Violated     Message   Expected output:
No.                                   constraint   ID        #messages, message content
1        Add a new collaboration          1        E003      5, (Agent nodes in the main collaboration diagram)
         diagram / top of model           2        E004      6, (Caste nodes in the main collaboration diagram)
                                          5        E016      14, (Interaction edges in the main collaboration diagram)
Experiments
- Mutants were selected at random.
- The program's correctness on each mutant was checked manually.
- The time needed to check the correctness of the program on each test case was measured.
Two experiments were conducted:
- Experiment 1: 1 mutant selected at random from each set of mutants generated by one type of mutation operator (24 mutants in total); detected 2 faults in the checker and 1 fault in other parts of the tool.
- Experiment 2: 22 live mutants from the Amalthaea suite selected at random; detected 2 faults in other parts of the tool.
The Experiment Data

Type of Mutant   Aliveness   #Mutants   #Detected Faults
Equivalent       Alive             34           2
Equivalent       Dead               0           0
Non-equivalent   Alive              1           1
Non-equivalent   Dead              11           2

Results:
- Checking correctness on dead mutants: 3 minutes per mutant
- Checking correctness on live mutants: 1 minute per mutant
Related Work
- Mutation testing: the program or specification is modified; used as a criterion to measure test adequacy. Data mutation testing adopts the idea of mutation operators, but applies it to test cases in order to generate test cases, rather than to measure adequacy.
- Meek and Siu (1989): randomised error seeding into programs to test compilers.
- Adaptive random testing (Chen et al. 2003, 2004): random test cases as far apart as possible; not yet applied to structurally complex input spaces.
- Data perturbation testing (Offutt, 2001): tests XML messages for web services; an application-specific technique, applicable to XML files.
- Metamorphic testing (Chen, Tse, et al. 2003): a test oracle automation technique, focused on metamorphic relations rather than on generating test cases; could be integrated with the data mutation method.
Future Work
More case studies with potential applications:
- Security control software: Role-Based Access Control.
  Input: a role model ⟨Roles, Resources, Permissions: Roles → Resources, Constraints ⊆ Roles × Resources × Permissions⟩ and user assignments: Users → P(Roles).
- Virus detection.
  Input: files infected by viruses. Viruses are programs in assembly/binary code format; one virus may have many variants obtained by equivalent transformations of the code.
- Spreadsheet processing software and spreadsheet applications.
  Input: spreadsheets ⟨data cells, program cells⟩.
Perspectives and Future Work
Integration of the data mutation, metamorphic and algebraic testing methods.
- Let P: D → O be the program under test.
- Data mutation testing generates test cases using a set of data mutation operators Φ = {φ_i: D → D | 0 < i ≤ k}.
- Metamorphic testing uses a set of metamorphic relations Ψ = {ψ_i | 0 < i ≤ n} over inputs and outputs to check output correctness.
- We can use the φ_i to define metamorphic relations as follows:
  ψ_φ(x) = (x, φ(x), P(x), P(φ(x)))
Example
Consider the Triangle Classification program P. The following is a metamorphic relation:
  P(t) = equilateral => P(IVP(t)) = isosceles
For each of the data mutation operators φ = WXY, WXZ, WYZ, RPL, or RPR, the following is a metamorphic relation:
  P(φ(t)) = P(t)
We observed in the case study that data mutation operators are very helpful for finding metamorphic relations.
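The two relations above can be checked mechanically. A minimal sketch, assuming a correct classifier `P` as the program under test (the operator implementations are illustrative):

```python
# Checking the two metamorphic relations derived from data mutation
# operators against a (correct) Triangle Classification program P.

def P(t):
    x, y, z = t
    if min(x, y, z) <= 0 or x + y <= z or x + z <= y or y + z <= x:
        return "non-triangle"
    if x == y == z:
        return "equilateral"
    if x == y or y == z or x == z:
        return "isosceles"
    return "scalene"

def IVP(t, i=0):          # increase one parameter by 1
    v = list(t); v[i] += 1; return tuple(v)

PERMUTE = {               # WXY, WXZ, WYZ, RPL, RPR
    "WXY": lambda t: (t[1], t[0], t[2]),
    "WXZ": lambda t: (t[2], t[1], t[0]),
    "WYZ": lambda t: (t[0], t[2], t[1]),
    "RPL": lambda t: (t[1], t[2], t[0]),
    "RPR": lambda t: (t[2], t[0], t[1]),
}

t1 = (5, 5, 5)
# Relation 1: P(t) = equilateral  =>  P(IVP(t)) = isosceles
ok1 = P(t1) != "equilateral" or P(IVP(t1)) == "isosceles"
# Relation 2: P(phi(t)) = P(t) for every swap/rotate operator phi
ok2 = all(P(phi(t1)) == P(t1) for phi in PERMUTE.values())
print(ok1, ok2)  # True True
```

A faulty implementation of P would violate one of these checks on some seed or mutant, turning the mutation operators into an automated partial oracle.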
Integration with Algebraic Testing
In algebraic software testing, axioms are written in the form
  T1 = T'1 ∧ T2 = T'2 ∧ ... ∧ Tn = T'n => T = T',
where the Ti and T'i are terms constructed from variables and the functions/procedures/methods of the program under test.
The integration of data mutation testing, metamorphic testing and algebraic testing can be achieved by developing:
- a black-box software testing specification language;
- an automated tool to check metamorphic relations, using observation contexts to check whether a relation holds;
- support for invoking user-defined data mutation operators;
- support for specifying metamorphic relations.
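To illustrate what checking an axiom through an observation context means, here is a small sketch. The `Stack` class and the axiom pop(push(S, x)) = S are textbook examples chosen for illustration; they are not from the CAMLE case study or the CASCAT tool.

```python
# Checking an algebraic axiom via an observation context (a sketch).
# Two terms are considered equal if all observations made on them agree.

class Stack:
    def __init__(self, items=()):
        self._items = list(items)
    def push(self, x):
        return Stack(self._items + [x])
    def pop(self):
        return Stack(self._items[:-1])
    def top(self):
        return self._items[-1] if self._items else None

def observe(s):
    """Observation context: the visible behaviour used to compare terms."""
    return (s.top(), s.push(99).top())

def check_axiom(s, x):
    """Axiom (no premises): pop(push(S, x)) = S."""
    lhs, rhs = s.push(x).pop(), s
    return observe(lhs) == observe(rhs)

print(check_axiom(Stack([1, 2]), 7))  # True
```

Data-mutated inputs would supply the ground terms (the concrete `s` and `x` values) on which such axioms are exercised.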
Screen Snapshot of Algebraic Testing Tool CASCAT
References
- Shan, L. and Zhu, H., Generating Structurally Complex Test Cases by Data Mutation: A Case Study of Testing an Automated Modelling Tool, The Computer Journal, Special Issue on Automation of Software Test (in press).
- Shan, L. and Zhu, H., Testing Software Modelling Tools Using Data Mutation, Proc. of AST'06, ACM Press, 2006, pp. 43-49.
- Zhu, H. and Shan, L., Caste-Centric Modelling of Multi-Agent Systems: The CAMLE Modelling Language and Automated Tools, in Beydeda, S. and Gruhn, V. (eds), Model-driven Software Development, Research and Practice in Software Engineering, Vol. II, Springer, 2005, pp. 57-89.
- Kong, L., Zhu, H. and Zhou, B., Automated Testing of EJB Components Based on Algebraic Specifications, Proc. of TEST'07, IEEE CS Press, 2007.