33
@ GMU of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA www.cs.gmu.edu/~offutt/ [email protected] Joint research with Gary Kaminski and Paul Ammann Improving Logic-Based Testing, invited to Journal of Software Systems

Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/[email protected]

Embed Size (px)

Citation preview

Page 1: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Improving Logic-Based TestingJeff Offutt

Professor, Software EngineeringGeorge Mason University

Fairfax, VA USAwww.cs.gmu.edu/~offutt/

[email protected]

Joint research with Gary Kaminski and Paul AmmannImproving Logic-Based Testing, invited to Journal of Software Systems

Page 2: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Outline

Better Logic-Testing © Kaminski, Ammann, Offutt 2

1. Motivation

2. Logic-Based Testing

3. Making Mutation Cheaper

4. Strengthening MCDC

5. Making MCDC Obsolete

6. Summary

Page 3: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Software is a Skin that Surrounds Our Civilization

Linköping, January 2011 © Jeff Offutt 3

Quote due to Dr. Mark Harman

Page 4: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Costly Software Failures• “The Economic Impacts of Inadequate

Infrastructure for Software Testing”– Inadequate software testing costs the US alone

between $22 and $59 billion USD annually– Better testing could cut this amount in half

• 2006 : Amazon’s BOGO offer became a double discount

• 2007 : Symantec says that most security vulnerabilities are now due to faulty software– And more than half are in web applications

• Huge losses due to web application failures– Financial services : $6.5 million per hour (just in

USA!)– Credit card sales applications : $2.4 million per hour

(in USA)Better Logic-Testing © Kaminski, Ammann, Offutt 4

World-wide monetary loss is staggering

Page 5: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Cost Of Late Testing

Linköping, January 2011 © Jeff Offutt 5

60

50

40

30

20

10

0

Require

men

ts

Prog /

Unit

Test

Desig

n

Inte

gratio

n

Test

Fault origin (%)

Fault detection (%)

Unit cost (X)

Software Engineering Institute; Carnegie Mellon University; Handbook CMU/SEI-96-HB-002

Syste

m Tes

t

Product

ion

Page 6: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Outline

Better Logic-Testing © Kaminski, Ammann, Offutt 6

1. Motivation

2. Logic-Based Testing

3. Making Mutation Cheaper

4. Strengthening MCDC

5. Making MCDC Obsolete

6. Summary

Page 7: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Covering Logic Expressions• Logic expressions show up in many situations • They are essential to defining software behavior• Covering logic expressions is required by the US

Federal Aviation Administration for safety critical software

• Logical expressions can come from many sources– Decisions in programs– UML : FSMs and statecharts, activity diagrams– Requirements– SQL queries

• Test designs are subsets of expressions’ truth assignments

Better Logic-Testing © Kaminski, Ammann, Offutt 7

Page 8: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33Better Logic-Testing © Kaminski, Ammann, Offutt 8

Logic Predicates and Clauses• A predicate is an expression that evaluates to a

boolean value• Predicates can contain

– boolean variables– non-boolean variables that contain >, <, ==, >=,

<=, !=– boolean function calls

• Internal structure is created by logical operators– ¬ – the negation operator– – the and operator– – the or operator– – the implication operator– – the exclusive or operator– – the equivalence operator

• A clause is a predicate with no logical operators

Page 9: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Power of Logic Testing• Logic expressions encode the behavior of

software• Logic expressions define the domain of values

for which the software behaves in a certain way• Logic expressions are often

– Complicated– Subtle– Easy to get wrong, both in design and

implementation

Better Logic-Testing © Kaminski, Ammann, Offutt 9

Testing logic predicates is a cost-effective way to find many subtle

software faults

Page 10: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Problems Addressed• This (mostly) theoretical talk presents results on

three problems with logic predicate testing :1. Redundant mutation operators for predicate testing2. Weakness of major logic testing criterion : MCDC3. A stronger logic test criterion, minimal-MUMCUT

• Solutions based on theoretical analysis

• Solutions can be immediately used to create better tools and stronger criteria, with very slight additional cost

Better Logic-Testing © Kaminski, Ammann, Offutt 10

Page 11: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Outline

Better Logic-Testing © Kaminski, Ammann, Offutt 11

1. Motivation

2. Logic-Based Testing

3. Making Mutation Cheaper

4. Strengthening MCDC

5. Making MCDC Obsolete

6. Summary

Page 12: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Mutation Testing• Mutation helps testers design tests directly to

find common mistakes1. Modify the software in small, syntactic, ways

(mutants)• Replace a variable, replace an operator, delete

statements, …

2. Design or find a test to cause each mutant to result in incorrect behavior (killing mutants)

• The resulting tests are very strong—will detect most mistakes in the software

Better Logic-Testing © Kaminski, Ammann, Offutt 12

Fundamental Premise : If the software contains a fault, there will usually be mutants that can only be killed by a test that also detects that fault

Page 13: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Redundancy in Mutation• Mutation is widely considered to be “expensive”• This expense is largely based on the high

number of test requirements—mutants• But Li et al. found that mutation needed fewer

tests !

Better Logic-Testing © Kaminski, Ammann, Offutt 13

Edge-Pair

All-Uses Prime Path

Mutation0

100

200

300

400

500

Number of Tests29 Java Classes

Edge-Pair

All-Uses Prime Path

Mutation0

1000

2000

3000

Number of Test Requirements

29 Java Classes

Li, Praphamontripong, Offutt, An experimental comparison of four unit test criteria, Mutation 2009

Page 14: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Eliminating Redundancy

• This is strong evidence that mutation tools use many redundant operators

• A more clever mutation system should have less redundancy

• Fewer mutants means less work for the tester … cheaper!

Better Logic-Testing © Kaminski, Ammann, Offutt 14

Page 15: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Mutation Predicate Testing• Traditional ROR operator :

Better Logic-Testing © Kaminski, Ammann, Offutt 15

Each occurrence of a relational operator (<, >, <=, >=, =, !=) is replaced by each other operator, and the expression is replaced by True and False.

• Example:– Original predicate: a > b– Mutant 1 : a < b– Mutant 2 : a <= b– Mutant 3 : a >= b– Mutant 4 : a == b– Mutant 5 : a != b– Mutant 6 : true– Mutant 7 : false

Page 16: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Mutation Predicate Testing A fault hierarchy establishes theoretical

dominance relations among faults:

Better Logic-Testing © Kaminski, Ammann, Offutt 16

TNF

LNF

LRF

LIF

TOF

ORF+

ENF

ORF.

LOF

If fault A dominates fault B, then any test that detects fault A will by definition detect fault B

Lau and Yu’s logic fault hierarchydetects

Page 17: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

ROR Mutant Hierarchy

Better Logic-Testing © Kaminski, Ammann, Offutt 17

If mutant A dominates mutant B, then any test that detects mutant A will by definition detect mutant B

Mutants for a < b

a <= bfalse a != b

true

a >= b

a == b a > b

Mutants for a >= ba > btrue a == b

false

a < b

a != b a <= b

Page 18: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

A Cheaper ROR Operator

Better Logic-Testing © Kaminski, Ammann, Offutt 18

Each occurrence of a relational operator (<, >, <=, >=, =, !=) is replaced by operators as follows:• < : <=, !=, False• > : >=, !=, False• <= : <, ==, True• >= : >, ==, True• == : <=, >=, False• != : <, >, True

Saves four mutants for each relational operator

Page 19: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Outline

Better Logic-Testing © Kaminski, Ammann, Offutt 19

1. Motivation

2. Logic-Based Testing

3. Making Mutation Cheaper

4. Strengthening MCDC

5. Making MCDC Obsolete

6. Summary

Page 20: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

MCDC• Multiple Condition - Decision Coverage

– Required by the USA Federal Aviation Administration to test safety critical software

• Each clause (condition) in a predicate (decision) is required to be tested with true and false when the clause “matters”– Changing the value of the clause changes the

predicate’s value

• Example :

• MCDC considered to thoroughly probe predicates

• Most useful when predicate has more than 4 clauses– Otherwise, we can test all 2N truth assignments

Better Logic-Testing © Kaminski, Ammann, Offutt 20

p = a (b c)Test 1 for a : a=true, (b

c)=falseTest 2 for a : a=false, (b

c)=false

Page 21: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Weakness of MCDC• MCDC was invented in the early 1990s

• Research community has invented additional logic criteria since– MCDC is weaker than ROR-mutation

• MCDC works at the clause level

• ROR works at the relational operator level

Better Logic-Testing © Kaminski, Ammann, Offutt 21

Solution : Extend MCDC to the relational operator level

Page 22: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Stronger MCDC• MCDC can be extended to include requirements

to kill ROR mutants• Method :

– MCDC requires clause c = x op y to have two values, True and False

– Cheaper-ROR requires c to have three values : x < y, x == y, x > y– The two MCDC values will always satisfy at least two

of the cheaper-ROR requirements– Add one additional test to cover the third

Better Logic-Testing © Kaminski, Ammann, Offutt 22

Page 23: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Cost is Minor

• MCDC on a predicate with N clauses requires N+1 .. 2N tests

• MCDC + ROR requires N more (2N+1 ... 3N tests)

• Algorithm and proof in paper

Better Logic-Testing © Kaminski, Ammann, Offutt 23

Page 24: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Example

Better Logic-Testing © Kaminski, Ammann, Offutt 24

p = a b ca = (a1 < a2), b = (b1 <= b2), c = (c1 == c2)

The following test set satisfies MCDC :T = { t1, t2, t3, t4} = { ttf, tft, tff, ftf }

Which can be refined with the following value assignments :

Test Value a1 a2 b1 b2 c1 c2 a b c

t1 TTF 5 6 10 11 21 22 < <

t2 TFT 5 6 11 10 21 21 ==

t3 TFF 5 6 11 10 21 22 > <

t4 FTF 6 5 10 11 21 22 >

New (t1) 5 5 10 11 21 22 ==

New (t1) 5 6 10 10 21 22 ==

New (t2) 5 6 11 10 22 21 >

ROR

tests

Page 25: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Outline

Better Logic-Testing © Kaminski, Ammann, Offutt 25

1. Motivation

2. Logic-Based Testing

3. Making Mutation Cheaper

4. Strengthening MCDC

5. Making MCDC Obsolete

6. Summary

Page 26: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Faults in Logic Expressions• Types of common and possible faults in logic

expressions have been categorized– LIF : Literal Insertion Fault … An extraneous literal, or

clause, is in the predicate– LRF : Literal Replacement Fault … The wrong literal,

or clause, is used in the predicate– LOF : Literal Omission Fault … A literal, or clause,

that should have been in the predicate was omitted– TNF : Term Negation Fault … A term, or collection of

clauses, that should have been negated was not (or vice versa)

• Lau and Yu defined nine types of logic faults and explored their relationships

Better Logic-Testing © Kaminski, Ammann, Offutt 26

Page 27: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Lau and Yu’s Fault Hierarchy A fault hierarchy establishes theoretical

dominance relations among faults:

Better Logic-Testing © Kaminski, Ammann, Offutt 27

TNF

LNF

LRF

LIF

TOF

ORF+

ENF

ORF.

LOF

If fault A dominates fault B, then any test that detects fault A will by definition detect fault B

MCDC

minimal-MUMCUT

detects

detects

Page 28: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Empirical Studies• The fault hierarchy result is theoretical

– That is, MCDC is only guaranteed to detect TNF and ENF faults

– But it could detect others by serendipity

• Two studies on different sets of logic expressions1. Software from 5 airplane “black boxes” (Line

Replaceable Units)2. Air traffic collision avoidance system (TCAS)

Better Logic-Testing © Kaminski, Ammann, Offutt 28

Page 29: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Empirical Results• Black boxes

– 20,256 logic expressions from 5 different “black boxes”

– Expressions originally studied by Chilenski– 132 expressions with at least 5 unique literals– 125 were simple, so that MCDC can find all faults,

leaving 7– MCDC found 81% of the possible faults (140 of 173)

• TCAS– 19 predicates (originally studied by Chen, Lau and

Yu)– Larger predicates, several with 25 clauses– MCDC found only 35% of the possible faults (205 of

580)Better Logic-Testing © Kaminski, Ammann, Offutt 29

MCDC found 81% of the possible faults (140 of173)

MCDC found only 35% of the possible faults (205 of 580)

Page 30: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Size Matters

Better Logic-Testing © Kaminski, Ammann, Offutt 30

Unique

Literals

Total Faults

Faults Found

Percent

5 71 61 86%

7 523 328 63%

8 316 152 48%

9 1630 650 40%

10 1120 432 39%

11 956 312 33%

12 4870 1421 29%

13 1623 535 33%

Faults found grouped by number of unique literals

Page 31: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Minimal-MUMCUT vs. MCDC• Minimal-MUMCUT finds significantly more faults

than MCDC– Especially in large, complicated, logic expressions– Which is precisely when engineers are most likely to

make mistakes!– Also very hard to debug when the software fails

• Requires up to four times as many tests• Suggested approach

1. Use all combinations on predicates with less than 5 clauses

2. Use MCDC with 5 to 10 clauses3. Use minimal-MUMCUT with above 10 clauses

Better Logic-Testing © Kaminski, Ammann, Offutt 31

Page 32: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Outline

Better Logic-Testing © Kaminski, Ammann, Offutt 32

1. Motivation

2. Logic-Based Testing

3. Making Mutation Cheaper

4. Strengthening MCDC

5. Making MCDC Obsolete

6. Summary

Page 33: Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu

@ GMU

of 33

Recommendations1. Replace ROR with cheaper-ROR in mutation

tools– No loss in strength– Saves 4 test requirements (mutants) for each

relational operator– We are currently implementing in the muJava

mutation tool

Better Logic-Testing © Kaminski, Ammann, Offutt 33

2. Extend logic criteria such as MCDC with ROR– Logic testing should apply to the relational operator

level– Small increase in the number of tests– Large increase in the testing strength3. Replace MCDC– Better: Replace MCDC with Minimal-MUMCUT + ROR– RTCA-DO-178B has been in effect for almost 20 years– MCDC was a brilliant idea … in 1992– MCDC was adopted and required without scientific

validation• Little theoretical analysis and no experimental studies

– We now know that MCDC is poor at discovering crucial faults