@ GMU
of 33
Improving Logic-Based Testing
Jeff Offutt
Professor, Software Engineering
George Mason University
Fairfax, VA USA
www.cs.gmu.edu/~offutt/
offutt@gmu.edu
Joint research with Gary Kaminski and Paul Ammann
Improving Logic-Based Testing, invited to Journal of Software Systems
Outline
Better Logic-Testing © Kaminski, Ammann, Offutt 2
1. Motivation
2. Logic-Based Testing
3. Making Mutation Cheaper
4. Strengthening MCDC
5. Making MCDC Obsolete
6. Summary
Software is a Skin that Surrounds Our Civilization
Linköping, January 2011 © Jeff Offutt 3
Quote due to Dr. Mark Harman
Costly Software Failures
• “The Economic Impacts of Inadequate Infrastructure for Software Testing”
  – Inadequate software testing costs the US alone between $22 and $59 billion annually
  – Better testing could cut this amount in half
• 2006 : Amazon’s BOGO offer became a double discount
• 2007 : Symantec says that most security vulnerabilities are now due to faulty software
  – And more than half are in web applications
• Huge losses due to web application failures
  – Financial services : $6.5 million per hour (just in USA!)
  – Credit card sales applications : $2.4 million per hour (in USA)
World-wide monetary loss is staggering
Cost Of Late Testing
[Figure: chart of Fault origin (%), Fault detection (%), and Unit cost (X) by development phase (Requirements, Design, Prog / Unit Test, Integration Test, System Test, Production); vertical axis from 0 to 60]
Software Engineering Institute; Carnegie Mellon University; Handbook CMU/SEI-96-HB-002
Covering Logic Expressions
• Logic expressions show up in many situations
• They are essential to defining software behavior
• Covering logic expressions is required by the US Federal Aviation Administration for safety critical software
• Logical expressions can come from many sources
  – Decisions in programs
  – UML : FSMs and statecharts, activity diagrams
  – Requirements
  – SQL queries
• Test designs are subsets of expressions’ truth assignments
Logic Predicates and Clauses
• A predicate is an expression that evaluates to a boolean value
• Predicates can contain
  – boolean variables
  – non-boolean variables compared with >, <, ==, >=, <=, !=
  – boolean function calls
• Internal structure is created by logical operators
  – ¬ – the negation operator
  – ∧ – the and operator
  – ∨ – the or operator
  – → – the implication operator
  – ⊕ – the exclusive or operator
  – ↔ – the equivalence operator
• A clause is a predicate with no logical operators
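As a small illustration of predicates, clauses, and truth assignments, the sketch below treats a predicate as a function of its clauses and enumerates every truth assignment. The predicate (x > y ∨ flag) ∧ is_valid(z) and its clause names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical predicate: (x > y or flag) and is_valid(z).
# It has three clauses, each an expression with no logical operators.
CLAUSES = ["x > y", "flag", "is_valid(z)"]


def predicate(c1, c2, c3):
    """Evaluate the predicate from the truth values of its three clauses."""
    return (c1 or c2) and c3


def truth_table():
    """List every truth assignment of the clauses with the predicate's value."""
    return [(values, predicate(*values))
            for values in product([True, False], repeat=len(CLAUSES))]


if __name__ == "__main__":
    for values, result in truth_table():
        print(dict(zip(CLAUSES, values)), "->", result)
```

A test design for this predicate is then a choice of some subset of the eight rows.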
Power of Logic Testing
• Logic expressions encode the behavior of software
• Logic expressions define the domain of values for which the software behaves in a certain way
• Logic expressions are often
  – Complicated
  – Subtle
  – Easy to get wrong, both in design and implementation
Testing logic predicates is a cost-effective way to find many subtle software faults
Problems Addressed
• This (mostly) theoretical talk presents results on three problems with logic predicate testing :
  1. Redundant mutation operators for predicate testing
  2. Weakness of major logic testing criterion : MCDC
  3. A stronger logic test criterion, minimal-MUMCUT
• Solutions based on theoretical analysis
• Solutions can be immediately used to create better tools and stronger criteria, with very slight additional cost
Mutation Testing
• Mutation helps testers design tests directly to find common mistakes
  1. Modify the software in small, syntactic, ways (mutants)
     • Replace a variable, replace an operator, delete statements, …
  2. Design or find a test to cause each mutant to result in incorrect behavior (killing mutants)
• The resulting tests are very strong and will detect most mistakes in the software
Fundamental Premise : If the software contains a fault, there will usually be mutants that can only be killed by a test that also detects that fault
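The idea of killing a mutant can be sketched in a few lines. The function `max_original`, its mutant, and the sample inputs are hypothetical and are not taken from the talk; the mutant replaces `>` with `<`.

```python
def max_original(a, b):
    """Original program: return the larger of two values."""
    return a if a > b else b


def max_mutant(a, b):
    """Mutant: the relational operator > has been replaced by <."""
    return a if a < b else b


def kills(test_input):
    """A test kills the mutant when original and mutant outputs differ."""
    return max_original(*test_input) != max_mutant(*test_input)


if __name__ == "__main__":
    print(kills((3, 3)))  # equal inputs cannot tell the two versions apart
    print(kills((5, 2)))  # distinct inputs expose the changed operator
```

A test with equal inputs leaves this mutant alive, so the tester is pushed toward inputs that actually exercise the comparison.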
Redundancy in Mutation
• Mutation is widely considered to be “expensive”
• This expense is largely based on the high number of test requirements (mutants)
• But Li et al. found that mutation needed fewer tests!
[Figure: two bar charts over 29 Java classes comparing Edge-Pair, All-Uses, Prime Path, and Mutation: Number of Tests (axis 0 to 500) and Number of Test Requirements (axis 0 to 3000)]
Li, Praphamontripong, Offutt, An experimental comparison of four unit test criteria, Mutation 2009
Eliminating Redundancy
• This is strong evidence that mutation tools use many redundant operators
• A more clever mutation system should have less redundancy
• Fewer mutants means less work for the tester … cheaper!
Mutation Predicate Testing
• Traditional ROR operator :
  Each occurrence of a relational operator (<, >, <=, >=, ==, !=) is replaced by each other operator, and the expression is replaced by True and False.
• Example:
  – Original predicate: a > b
  – Mutant 1 : a < b
  – Mutant 2 : a <= b
  – Mutant 3 : a >= b
  – Mutant 4 : a == b
  – Mutant 5 : a != b
  – Mutant 6 : true
  – Mutant 7 : false
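The traditional ROR operator described above can be sketched as a small mutant generator. This is only an illustration working on strings; the function name `ror_mutants` is hypothetical and is not how any particular mutation tool implements ROR.

```python
RELATIONAL_OPERATORS = ["<", ">", "<=", ">=", "==", "!="]


def ror_mutants(left, op, right):
    """Return the traditional ROR mutants of the expression `left op right`.

    The operator is replaced by each of the five other relational operators,
    and the whole expression is replaced by true and by false.
    """
    if op not in RELATIONAL_OPERATORS:
        raise ValueError(f"not a relational operator: {op}")
    replaced = [f"{left} {other} {right}"
                for other in RELATIONAL_OPERATORS if other != op]
    return replaced + ["true", "false"]


if __name__ == "__main__":
    for number, mutant in enumerate(ror_mutants("a", ">", "b"), start=1):
        print(f"Mutant {number} : {mutant}")
```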
Mutation Predicate Testing
A fault hierarchy establishes theoretical dominance relations among faults:
[Figure: Lau and Yu’s logic fault hierarchy, a graph over the fault classes TNF, LNF, LRF, LIF, TOF, ORF+, ENF, ORF., and LOF, in which an arrow means “detects”]
If fault A dominates fault B, then any test that detects fault A will by definition detect fault B
of 33
ROR Mutant Hierarchy
Better Logic-Testing © Kaminski, Ammann, Offutt 17
If mutant A dominates mutant B, then any test that detects mutant A will by definition detect mutant B
Mutants for a < b
a <= bfalse a != b
true
a >= b
a == b a > b
Mutants for a >= ba > btrue a == b
false
a < b
a != b a <= b
A Cheaper ROR Operator
Each occurrence of a relational operator (<, >, <=, >=, ==, !=) is replaced by operators as follows:
• < : <=, !=, False
• > : >=, !=, False
• <= : <, ==, True
• >= : >, ==, True
• == : <=, >=, False
• != : <, >, True
Saves four mutants for each relational operator
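The claim that the three cheaper-ROR mutants lose no strength can be checked by brute force. The sketch below assumes operands from a totally ordered domain, so that a test's effect on `x op y` depends only on whether x < y, x == y, or x > y; the helper names and sample operand pairs are hypothetical.

```python
import operator
from itertools import combinations

OPS = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}
CONSTANTS = {"True": lambda x, y: True, "False": lambda x, y: False}

# Cheaper-ROR replacements, as listed on the slide.
CHEAPER = {"<": ["<=", "!=", "False"], ">": [">=", "!=", "False"],
           "<=": ["<", "==", "True"], ">=": [">", "==", "True"],
           "==": ["<=", ">=", "False"], "!=": ["<", ">", "True"]}

# One sample operand pair per possible relation between x and y.
SAMPLES = {"x < y": (1, 2), "x == y": (2, 2), "x > y": (2, 1)}


def evaluate(name, x, y):
    return (OPS.get(name) or CONSTANTS[name])(x, y)


def killed(op, mutant, tests):
    """A mutant is killed if some test makes it differ from the original."""
    return any(evaluate(op, *SAMPLES[t]) != evaluate(mutant, *SAMPLES[t])
               for t in tests)


def traditional(op):
    """The seven traditional ROR mutants of op."""
    return [m for m in list(OPS) + list(CONSTANTS) if m != op]


def cheaper_subsumes_traditional(op):
    """Check: every test set killing the 3 cheaper mutants kills all 7."""
    for size in range(len(SAMPLES) + 1):
        for tests in combinations(SAMPLES, size):
            if all(killed(op, m, tests) for m in CHEAPER[op]):
                if not all(killed(op, m, tests) for m in traditional(op)):
                    return False
    return True


if __name__ == "__main__":
    for op in OPS:
        print(op, cheaper_subsumes_traditional(op))
```

Under this assumption, killing the three cheaper mutants forces a test for each of x < y, x == y, and x > y, which is why the other four mutants add no new requirements.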
MCDC
• Modified Condition / Decision Coverage
  – Required by the USA Federal Aviation Administration to test safety critical software
• Each clause (condition) in a predicate (decision) is required to be tested with true and false when the clause “matters”
  – Changing the value of the clause changes the predicate’s value
• MCDC is considered to thoroughly probe predicates
• Most useful when predicate has more than 4 clauses
  – Otherwise, we can test all 2^N truth assignments
• Example :
p = a ∨ (b ∧ c)
Test 1 for a : a=true, (b ∧ c)=false
Test 2 for a : a=false, (b ∧ c)=false
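The notion of a clause that “matters” can be sketched as follows. The sketch assumes the example predicate is p = a ∨ (b ∧ c) (the operators are an assumption), and it pairs tests that differ only in the clause under test, which is one common reading of MCDC; the function names are hypothetical.

```python
from itertools import product


def determines(pred, index, values):
    """True if flipping clause `index` changes the predicate's value."""
    flipped = list(values)
    flipped[index] = not flipped[index]
    return pred(*values) != pred(*flipped)


def mcdc_pairs(pred, num_clauses, index):
    """Pairs of tests that differ only in clause `index` and flip the predicate."""
    pairs = []
    for values in product([True, False], repeat=num_clauses):
        if values[index] and determines(pred, index, values):
            flipped = values[:index] + (False,) + values[index + 1:]
            pairs.append((values, flipped))
    return pairs


def p(a, b, c):
    # Assumed example predicate: a or (b and c).
    return a or (b and c)


if __name__ == "__main__":
    for true_test, false_test in mcdc_pairs(p, 3, 0):
        print("a=true:", true_test, " a=false:", false_test)
```

Every pair returned for clause a has (b ∧ c) false, matching the two tests listed for a.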
Weakness of MCDC
• MCDC was invented in the early 1990s
• Research community has invented additional logic criteria since
  – MCDC is weaker than ROR-mutation
• MCDC works at the clause level
• ROR works at the relational operator level
Solution : Extend MCDC to the relational operator level
Stronger MCDC
• MCDC can be extended to include requirements to kill ROR mutants
• Method :
  – MCDC requires clause c = x op y to have two values, True and False
  – Cheaper-ROR requires c to have three values : x < y, x == y, x > y
  – The two MCDC values will always satisfy at least two of the cheaper-ROR requirements
  – Add one additional test to cover the third
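The method above amounts to finding which of the three relations the MCDC tests for a clause have not yet exercised. A minimal sketch, with hypothetical helper names and operand values; it is not the algorithm from the paper:

```python
RELATIONS = ("<", "==", ">")


def relation(x, y):
    """Classify an operand pair as x < y, x == y, or x > y."""
    if x < y:
        return "<"
    if x == y:
        return "=="
    return ">"


def missing_relations(tests):
    """Relations not yet covered by the (x, y) operand pairs of existing tests."""
    seen = {relation(x, y) for x, y in tests}
    return [r for r in RELATIONS if r not in seen]


def witness(rel, base=0):
    """One possible operand pair (x, y) that realizes the given relation."""
    return {"<": (base, base + 1), "==": (base, base), ">": (base + 1, base)}[rel]


if __name__ == "__main__":
    # Two MCDC tests for a clause x < y: one makes it true, one makes it false.
    mcdc_tests = [(5, 6), (6, 5)]
    for rel in missing_relations(mcdc_tests):
        print("add a test with x", rel, "y, for example", witness(rel, base=5))
```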
Cost is Minor
• MCDC on a predicate with N clauses requires N+1 to 2N tests
• MCDC + ROR requires N more (2N+1 to 3N tests)
• Algorithm and proof in paper
Example
p = a ∧ b ∨ c
a = (a1 < a2), b = (b1 <= b2), c = (c1 == c2)
The following test set satisfies MCDC :
T = { t1, t2, t3, t4 } = { ttf, tft, tff, ftf }
Which can be refined with the following value assignments :

Test       Value   a1   a2   b1   b2   c1   c2   a    b    c
t1         TTF     5    6    10   11   21   22   <    <    -
t2         TFT     5    6    11   10   21   21   -    -    ==
t3         TFF     5    6    11   10   21   22   -    >    <
t4         FTF     6    5    10   11   21   22   >    -    -
ROR tests :
New (t1)   -       5    5    10   11   21   22   ==   -    -
New (t1)   -       5    6    10   10   21   22   -    ==   -
New (t2)   -       5    6    11   10   22   21   -    -    >
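The example can be checked mechanically. The sketch below assumes the predicate is p = (a ∧ b) ∨ c, with ∧ binding tighter than ∨ (the operators are an assumption), and verifies that each clause is exercised with all three relations in tests where that clause determines p. The test labels n1 to n3 for the added ROR tests are hypothetical.

```python
# Operand values (a1, a2, b1, b2, c1, c2) from the example's table.
TESTS = {
    "t1": (5, 6, 10, 11, 21, 22),
    "t2": (5, 6, 11, 10, 21, 21),
    "t3": (5, 6, 11, 10, 21, 22),
    "t4": (6, 5, 10, 11, 21, 22),
    "n1": (5, 5, 10, 11, 21, 22),  # added ROR test derived from t1
    "n2": (5, 6, 10, 10, 21, 22),  # added ROR test derived from t1
    "n3": (5, 6, 11, 10, 22, 21),  # added ROR test derived from t2
}


def p(a, b, c):
    # Assumed predicate: (a and b) or c.
    return (a and b) or c


def clause_values(test):
    a1, a2, b1, b2, c1, c2 = test
    return (a1 < a2, b1 <= b2, c1 == c2)


def relation(x, y):
    return "<" if x < y else "==" if x == y else ">"


def determines(values, index):
    """True if flipping clause `index` changes the value of p."""
    flipped = list(values)
    flipped[index] = not flipped[index]
    return p(*values) != p(*flipped)


def relations_seen(index):
    """Relations of clause `index` over tests in which it determines p."""
    seen = set()
    for test in TESTS.values():
        if determines(clause_values(test), index):
            seen.add(relation(test[2 * index], test[2 * index + 1]))
    return seen


if __name__ == "__main__":
    for index, name in enumerate("abc"):
        print(name, sorted(relations_seen(index)))
```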
Faults in Logic Expressions
• Types of common and possible faults in logic expressions have been categorized
  – LIF : Literal Insertion Fault … An extraneous literal, or clause, is in the predicate
  – LRF : Literal Replacement Fault … The wrong literal, or clause, is used in the predicate
  – LOF : Literal Omission Fault … A literal, or clause, that should have been in the predicate was omitted
  – TNF : Term Negation Fault … A term, or collection of clauses, that should have been negated was not (or vice versa)
• Lau and Yu defined nine types of logic faults and explored their relationships
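To make the four fault types above concrete, the sketch below applies each one to a hypothetical predicate (a ∧ b) ∨ c over the variables a, b, c, d and lists the truth assignments that expose it. The predicate and the particular faulty variants are illustrative assumptions, not examples from the talk.

```python
from itertools import product


def original(a, b, c, d):
    # Hypothetical correct predicate: (a and b) or c.
    return (a and b) or c


# One illustrative faulty version per fault type.
FAULTS = {
    "LIF": lambda a, b, c, d: (a and b and d) or c,   # extraneous literal d
    "LRF": lambda a, b, c, d: (a and d) or c,         # b replaced by d
    "LOF": lambda a, b, c, d: a or c,                 # literal b omitted
    "TNF": lambda a, b, c, d: (not (a and b)) or c,   # term (a and b) negated
}


def detecting_tests(faulty):
    """Truth assignments on which the faulty predicate differs from the original."""
    return [values for values in product([True, False], repeat=4)
            if original(*values) != faulty(*values)]


if __name__ == "__main__":
    for name, faulty in FAULTS.items():
        print(name, len(detecting_tests(faulty)), "of 16 assignments detect it")
```

For this predicate the term negation fault is exposed by far more assignments than the literal faults, which gives some intuition for why the fault types are not equally hard to detect.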
Lau and Yu’s Fault Hierarchy
A fault hierarchy establishes theoretical dominance relations among faults:
[Figure: fault hierarchy over the fault classes TNF, LNF, LRF, LIF, TOF, ORF+, ENF, ORF., and LOF, annotated with the faults that MCDC detects and the faults that minimal-MUMCUT detects]
If fault A dominates fault B, then any test that detects fault A will by definition detect fault B
Empirical Studies
• The fault hierarchy result is theoretical
  – That is, MCDC is only guaranteed to detect TNF and ENF faults
  – But it could detect others by serendipity
• Two studies on different sets of logic expressions
  1. Software from 5 airplane “black boxes” (Line Replaceable Units)
  2. Air traffic collision avoidance system (TCAS)
Empirical Results
• Black boxes
  – 20,256 logic expressions from 5 different “black boxes”
  – Expressions originally studied by Chilenski
  – 132 expressions with at least 5 unique literals
  – 125 were simple, so that MCDC can find all faults, leaving 7
  – MCDC found 81% of the possible faults (140 of 173)
• TCAS
  – 19 predicates (originally studied by Chen, Lau and Yu)
  – Larger predicates, several with 25 clauses
  – MCDC found only 35% of the possible faults (205 of 580)
Size Matters
Faults found grouped by number of unique literals

Unique Literals   Total Faults   Faults Found   Percent
5                 71             61             86%
7                 523            328            63%
8                 316            152            48%
9                 1630           650            40%
10                1120           432            39%
11                956            312            33%
12                4870           1421           29%
13                1623           535            33%
Minimal-MUMCUT vs. MCDC
• Minimal-MUMCUT finds significantly more faults than MCDC
  – Especially in large, complicated, logic expressions
  – Which is precisely when engineers are most likely to make mistakes!
  – Also very hard to debug when the software fails
• Requires up to four times as many tests
• Suggested approach
  1. Use all combinations on predicates with fewer than 5 clauses
  2. Use MCDC with 5 to 10 clauses
  3. Use minimal-MUMCUT with above 10 clauses
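The suggested approach is a simple threshold rule on the number of clauses, which could be encoded as below. The function name and the returned labels are hypothetical.

```python
def suggested_criterion(num_clauses: int) -> str:
    """Pick a logic test criterion from the number of clauses in a predicate."""
    if num_clauses < 1:
        raise ValueError("a predicate has at least one clause")
    if num_clauses < 5:
        return "all combinations"   # at most 2**4 = 16 truth assignments
    if num_clauses <= 10:
        return "MCDC"
    return "minimal-MUMCUT"


if __name__ == "__main__":
    for n in (3, 5, 10, 11, 25):
        print(n, "clauses ->", suggested_criterion(n))
```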
Recommendations
1. Replace ROR with cheaper-ROR in mutation tools
  – No loss in strength
  – Saves 4 test requirements (mutants) for each relational operator
  – We are currently implementing this in the muJava mutation tool
2. Extend logic criteria such as MCDC with ROR
  – Logic testing should apply to the relational operator level
  – Small increase in the number of tests
  – Large increase in the testing strength
3. Replace MCDC
  – Better: Replace MCDC with Minimal-MUMCUT + ROR
  – RTCA-DO-178B has been in effect for almost 20 years
  – MCDC was a brilliant idea … in 1992
  – MCDC was adopted and required without scientific validation
     • Little theoretical analysis and no experimental studies
  – We now know that MCDC is poor at discovering crucial faults
Recommended