Lintanlecutures

ECE453/SE465/ECE653/CS447/CS647: Software Testing, Quality Assurance,

and Maintenance

Lin Tan

January 1, 2015

01-Intro - January 1, 2015

What this course is and is not

• A class to teach you

• how to test as a developer & become a better developer

• how to measure testing coverage, and

• how to use & write automated testing tools.

• Not a QA course (only 1-2 lectures on QA)

2


Software Testing Instructor: Lin Tan

“The Testing Course”

ECE451/SE4653:Requirements

ECE452/SE464:Design

ECE453/SE465/CS447/ECE653/CS647:

Testing3


4



Why Software Testing?

• Software failures are costly!

• One hour downtime costs >$6M on average [Gartner’98].

• Bugs account for 30% of system failures [Marcus’00]

• Contribute to 55% of most severe vulnerabilities [CERT]

• Cost billions annually [NIST’02, myitVendor’02]

• $22.2B annual potential cost reduction from feasible infrastructure improvements [NIST’02]

• Testing is an effective way of avoiding software failures.

5


Top 2 Requests from Companies

• “..., we spoke to representatives from a half-dozen leading software companies.

• We were struck by the unanimity of the number-one request from each company: that students learn how to enhance sparsely documented legacy code. In priority order, other requests were making testing a first-class citizen, working with nontechnical customers, performing design reviews, and working in teams.” [CACM May 2012]

6



Testing is Important

• Testing accounts for over 50% of software development costs.

7

Not testing is even more expensive

Program Managers often say: “Testing is too expensive.”



How Software Fails?

• Crash

• Hang

• Data Corruption

• Incorrect Functionality

• Performance Degradation

• ...

8



“Famous” Software Bugs

9

http://www.wired.com/software/coolapps/news/2005/11/69355

• Therac-25 medical accident, 1985-1987

• At least 5 patients die; several seriously injured

• Northeast Blackout, 2003

• 50M people out of power;

• $6B financial losses

• Ariane 5 Flight 501 Crash, 1996

• a loss of $370M

• http://www.youtube.com/watch?v=kYUrqdUyEpI

• Morris Worm, 1988

• ~6000 computers affected; $10M-100M losses estimated



Why Does Software Fail? (1)

• Segfaults - incorrect memory usage

10

This course will use slides and notes adapted from Prof. Patrick Lam’s.




• Deadlocks11




• Wrong implementation

12




• Memory Leaks

• Exist in safe languages such as Java as well

13




• Regression bugs

14



Some “Bugging” Software Failure

• Suspected leaks:

• Firefox or Chrome becomes a memory hog after days

• Your own “favourite” example?

15



Avoiding Software Failures

• Test the software (in-house, externally)

• Code review

• Better design (“write better code!”)

• Include fewer features

• Better programming languages & IDEs

• Defensive programming (esp. w.r.t. plugins)

16



Mitigation: Failure is Inevitable

• Software (in reality) never completely works!

• Aim: Produce software that is good enough!

17



Coping with An Imperfect World (1)

• Disclaim liability

18

25. LIMITATION ON AND EXCLUSION OFDAMAGES. You can recover from Microsoft and itssuppliers only direct damages up to the amount youpaid for the software. You cannot recover any otherdamages, including consequential, lost profits, special,indirect or incidental damages.(Vista license)



• Release patches

• Backup user data

• Failure recovery

19

Coping with An Imperfect World (2)



Ways of Testing Software

• Compile it

• Run it on one input

• Run it on many inputs

• Run it on a representative set of inputs

• Run it on all inputs (static analysis)

20



Other Testing Concerns

• Integration testing

• Nonfunctional properties

• Regression tests

21



Key Concepts: Coverage

• Idea: find a reduced space and cover it with tests.

• Possible spaces: graphs, logic, input space, syntax.

22



• Learn about

• Graph Coverage

• Logic Coverage

• Syntax-based Coverage

• Input-space Based Coverage

• Testing in Practice

• State-of-the-art Techniques

• Use & Build Tools

• Concurrency

• ...

• View this course as a window to the larger world …

In This Course

23



Tools

• JUnit

• Clang/LLVM

• Valgrind

• Daikon

• Randoop

• Java Path Finder

• cppcheck

24

• FindBugs

• Splint

• Coverity

• Visual Studio (SAL)

• iComment

• ...


Software Testing Instructor: Lin TanSoftware Testing Instructor: Lin Tan

Openings• Our research lab has multiple openings

• ECE/SE499, USRA, URA, Co-op

• Master’s, PhD

• Research areas & topics (incomplete):

• Automated bug detection & fixing

• Apply Natural Language Processing (NLP), Machine Learning, Program Analysis to improve software reliability

• Comment & document analysis

• Code dependency analysis

• ...25


ECE453/SE465/CS447/ECE653/CS647: Fault, Error, Failure

Lin Tan

January 8, 2015

02-FaultErrorFailure - January 8, 2015

Software Testing Instructor: Lin Tan 2

•Fault -- often referred to as Bug [Avizienis’00]•A static defect in software (incorrect lines of code)

•Error•An incorrect internal state (unobserved)

•Failure•External, incorrect behaviour with respect to the expected behaviour (observed)

•Not used consistently in literature!

Terminology, IEEE 610.12-1990


Bernd Bruegge & Allen H. Dutoit Object-Oriented Software Engineering: Using UML, Patterns, and Java

What is this?

A failure?

An error?

A fault?

We need to describe specified and desired behaviour first!

3



Erroneous State (“Error”)

4



Design Fault

5



Mechanical Fault

6



Example: Fault, Error, Failure

7

public static int numZero (int[] x) { //Effects: if x==null throw NullPointerException// else return the number of occurrences of 0 in x

int count = 0;for (int i =1; i <x.length; i++) {

if (x[i]==0) {count++;

}}return count;

}

x = [2,7,0], fault executed, error, no failure

x = [0,7,2], fault executed, error, failure

State of the program: x, i, count, PC

Wrong State:x = [2,7,0]i =1count =0PC=first iteration of if

Expected State:x = [2,7,0]i =0count =0PC=first iteration of if

Fix: for (int i =0, i<x.length; i++)


Bernd Bruegge & Allen H. Dutoit Object-Oriented Software Engineering: Using UML, Patterns, and Java 8

Exercise



Exercise - cont.

• Read this faulty program, which includes a test case that results in failure. Answer the following questions.• (a) Identify the fault, and fix the fault.• (b) If possible, identify a test case that does not execute

the fault.• (c) If possible, identify a test case that executes the fault,

but does not result in an error state.• (d) If possible identify a test case that results in an error,

but not a failure. Hint: Don't forget about the program counter.

• (e) For the given test case, identify the first error state. Be sure to describe the complete state.

9



States• State 0:

• x = [2,3,5] • y = 2 • i = undefined • PC = findLast(...)

10

• State 1: • x = [2,3,5] • y = 2 • i = undefined • PC = before i =x.length-1;

• State 2: • x = [2,3,5] • y = 2 • i = 2 • PC = after i =x.length-1;

• State 3: • x = [2,3,5] • y = 2 • i = 2 • PC = i>0;



States

11

• State 3: • x = [2,3,5] • y = 2 • i = 2 • PC = i>0;

• State 4: • x = [2,3,5] • y = 2 • i = 2 • PC = if (x[i] ==y);

• State 5: • x = [2,3,5] • y = 2 • i = 1 • PC = i--;

• State 8: • x = [2,3,5] • y = 2 • i = 0 • PC = i--;

• State 7: • x = [2,3,5] • y = 2 • i = 1 • PC = if (x[i] ==y);

• State 6: • x = [2,3,5] • y = 2 • i = 1 • PC = i>0;



States

12

• State 9: • x = [2,3,5] • y = 2 • i = 0 • PC = i>0;

• State 8: • x = [2,3,5] • y = 2 • i = 0 • PC = i--;

• State 10: • x = [2,3,5] • y = 2 • i = 0 (undefined) • PC = return -1;

Incorrect Program

• State 10: • x = [2,3,5] • y = 2 • i = 0 • PC = if (x[i]==y);

Correct Program


Bernd Bruegge & Allen H. Dutoit Object-Oriented Software Engineering: Using UML, Patterns, and Java Bernd Bruegge & Allen H. Dutoit Object-Oriented Software Engineering: Using UML, Patterns, and Java

Exercise - Solutions• (a) The for-loop should include the 0 index:

• for (int i=x.length-1; i >= 0; i--)

• (b) The null value for x will result in a NullPointerException before the loop test is evaluated, hence no execution of the fault.

• Input: x = null; y = 3

• Expected Output: NullPointerException

• Actual Output: NullPointerException

• (c) For any input where y appears in the second or later position, there is no error. Also, if x is empty, there is no error.

• Input: x = [2, 3, 5]; y = 3;

• Expected Output: 1

• Actual Output: 1

13


Bernd Bruegge & Allen H. Dutoit Object-Oriented Software Engineering: Using UML, Patterns, and Java Bernd Bruegge & Allen H. Dutoit Object-Oriented Software Engineering: Using UML, Patterns, and Java

Exercise - Solutions cont.• (d) For an input where y is not in x, the missing path (i.e. an incorrect PC on the

final loop that is not taken, normally i = 2, 1, 0, but this one has only i = 2, 1, ) is an error, but there is no failure.

• Input: x = [2, 3, 5]; y = 7;

• Expected Output: -1

• Actual Output: -1

• (e) Note that the key aspect of the error state is that the PC is outside the loop (following the false evaluation of the 0>0 test. In a correct program, the PC should be at the if-test, with index i==0.

• Input: x = [2, 3, 5]; y = 2;

• Expected Output: 0

• Actual Output: -1

• First Error State:

• x = [2, 3, 5]

• y = 2;

• i = 0 (or undefined);

• PC = return -1;

14



RIP Model

• Three conditions must be present for a failure to be observed:• Reachability: the location or locations in the program

that contain the fault must be reached.• Infection: After executing the location, the state of

the program must be incorrect.• Propagation: The infected state must propagate to

cause some output of the program to be incorrect.

15



How do we deal with Faults, Errors, and Failures?

16



Addressing Faults at Different Stages

Fault Avoidance Fault ToleranceFault Detection

Better Design, Better PL, ...

Testing, Debugging, ...

Redundancy, Isolation, ...

17



Declaring the Bug as a Feature

18



Modular Redundancy: Fault Tolerance

19



Patching: Fixing the Fault

20



Testing: Fault Detection

21



Testing vs. Debugging

• Testing: Evaluating software by observing its execution• Debugging: The process of finding a fault given a failure

• Testing is hard: • Often, only specific inputs will trigger the fault into creating

a failure.

• Debugging is hard:• Given a failure, it is often difficult to know the fault.

22



Testing is hard

• Only input x=100 & y =100 triggers the crash.• Probability to trigger the crash: 1 over 2

• Assuming x and y are 32-bit integers

23

if ( x - 100 <= 0 )if ( y - 100 <= 0 )if ( x + y - 200 == 0 )crash();

64


ECE453/SE465/CS447/ECE653/CS647: Structural Coverage

Lin Tan

January 22, 2015

03-StructuralCoverage - January 22, 2015

Software Testing Lin Tan

How would you test this program?

2

public class Floor{ public static int Floor(int x, int y) { // effects: return floor(max(x,y)/min(x,y)) if (x > y) return x/y; else return y/x; }

}

floor(x) is the largest integer not greater than x.



Testing

• Static Testing [at compile time] • Static Analysis • Review

• Walk-through [informal] • Code inspection [formal]

• Dynamic Testing [at run time] • Black-box testing • White-box testing

• Commonly, testing refers to dynamic testing.

3



Complete Testing?

• Poorly defined terms: “complete testing”, “exhaustive testing”, “full coverage”

• The number of potential inputs are infinite.

• Impossible to completely test a nontrivial system

• Practical limitations: Complete testing is prohibitive in time and cost [e.g., 30 branches, 50 branches, ...]

• Theoretical limitations: e.g. Halting problem

• Need testing criteria

4



Test Case• Test Case: [informally]

• What you feed to software; and • What the software should output in response.

• Test Set: A set of test cases

• Test Case: test case values, expected results, prefix values, and postfix values necessary to evaluate software under test

• Expected Results: The result that will be produced when executing the test if and only if the program satisfies its intended behaviour

5



• Consider testing a cellphone from the off state:

• Test Case Values: The input values necessary to complete some execution of the software under test.

• Traditionally called Test Case • Prefix Values: inputs to prepare software for test case values. • Postfix Values: inputs for software after test case values;

• Verification values: inputs to show results of test case values;

• Exit commands: inputs to terminate program or to return it to initial state.

Anatomy of a Test Case

6



Test Requirement & Coverage Criterion• Test Requirement: A test requirement is a specific

element of a software artifact that a test case must satisfy or cover.

• Ice cream cone flavors: vanilla, chocolate, mint • One test requirement: test one chocolate cone • TR denotes a set of test requirements

• A coverage criterion is a rule or collection of rules that impose test requirements on a test set.

• Coverage criterion is a recipe for generating TR in a systematic way.

• Flavor criterion [cover all flavors] • TR = {flavor=chocolate, flavor=vanilla, flavor=mint}

7

More general



Measure test sets

• How to know how good a test set is?• Testing a ice cream stand:

• Test set 1:• 3 chocolate cones, 1 vanilla cone

• Test set 2:• 1 chocolate cone, 1 vanilla cone, 1 mint cone

8



Coverage• Coverage: Given a set of test requirements

TR for a coverage criterion C, a test set T satisfies C iff for every test requirement tr ∈ TR, at least one t ∈ T satisfies tr.

9



Infeasible Test Requirements

10

Statement criterion cannot be satisfied for many programs.



Coverage Level

• Coverage Level: Given a set of test requirements TR and a test set T, the coverage level is the ratio of the number of test requirements satisfied by T to the size of TR.

• TR = {flavor=chocolate, flavor=vanilla, flavor=mint} • Test set 1 T1 = {3 chocolate cones, 1 vanilla cone} • Coverage Level = 2/3 = 66.7%

• Coverage levels help us evaluate the goodness of a test set, especially in the presence of infeasible test requirements.

11



Subsumption• Criteria Subsumption: A test criterion C1

subsumes C2 if and only if every set of test cases that satisfies criterion C1 also satisfies C2

• Must be true for every set of test cases

12

Edge Coverage

EC

Node Coverage

NC

subsumes

• Subsumption is a rough guide for comparing criteria, although it’s hard to use in practice.

Which one is stronger?



More powerful coverage criterion helps find more bugs!

13

N2: print x

N1: if (x !=0)

print x

N5: exit

N3: if (y>0)

N4: print y/x

T

T

F

F

- Path [N1, N2, N3, N4, N5]: satisfies node coverage but not edge coverage. The corresponding test case passes. No bug found.- Path [N1, N3, N4, N5] or [N1, N3, N4]: division-by-zero bug!



Graph Coverage

14


Introduction to Software Testing [Ch 2] © Ammann & Offutt

Three Example Graphs

0

21

3

N0 = { 0 }

Nf = { 3 }

0

21

3

N0 = { }

Nf = { 3 }

9

0

43

7

1

5

8

2

6

N0 = { 0, 1, 2 }

Nf = { 7, 8, 9 }

Not a valid graph



Subgraph

16



Path

• A path is a sequence of nodes from a graph G whose adjacent pairs all belong to the set of edges E of G.

17



Subpath

• A subpath is a subsequence of a path. • This textbook definition is ambiguous.

• Is [1,2,5] a subpath of [1,2,3,5,6,2]?

18



Subpath

• A subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements.

• [1,2,5] is a subsequence of [1,2,3,5,6,2].

• But here should interpret it as a subsequence of a path, i.e., all edges should be in the path.

• [1,2,5] is NOT a subpath of [1,2,3,5,6,2].

• For more details, see the slide about sidetrips and detours later.

19



Test Path

• Test path examples: • [1, 2, 3, 5, 6, 7] • [1, 2, 3, 5, 6, 2, 3, 5, 6, 7]

• A test path is a path p [possibly of length 0] that starts at some node in N0 and ends at some node in Nf.

20



Paths & Semantics

• Some paths in a control flow graph may not correspond to program semantics.

• In this course, we generally only talk about the syntax of a graph -- its nodes and edges -- and not its semantics.

21

β is never executed!



Syntactical and Semantic Reachability

• A node n is syntactically reachable from ni if there exists a path from ni to n.

• A node n is semantically reachable if one of the paths from ni to n can be reached on some input.

• Standard graph algorithms, like breadth-first search and depth-first search, can compute syntactic reachability.

• Semantic reachability is undecidable.

22



Reachability• We define reachG[] as the subgraph that is

syntactically reachable from . • is a node, an edge, or a set of nodes or edges.

23

reachG[1] is G in this example.



Syntactical Reachability

24



Reachability Example

N0 = { 0, 1, 2 }

Nf = { 7, 8, 9 }

9

0

43

7

1

5

8

2

6

Reach [0] = { 0, 3, 4, 7, 8, 5, 1, 9 }

Reach [{0, 2}] = G



ReachG[N0]

• When we talk about the nodes or edges in a graph G in a coverage criterion, we'll generally mean ReachG[N0].

• The unreachable nodes tend to • be uninteresting; and • frustrate coverage criteria.

26



SESEs• SESE graphs: All test paths start at a single node and end at

another node – Single-entry, single-exit – N0 and Nf have exactly one node. 0

2

1

63

5

4

Double-diamond graph Four test paths p=[ 0, 1, 3, 4, 6 ]

[ 0, 1, 3, 5, 6 ] [ 0, 2, 3, 4, 6 ] [ 0, 2, 3, 5, 6 ]

• p0 = [1, 3, 4], p0 is a subpath of p, and conversely, p tours p0. • Any path tours itself.

• p visits node 3 and edge [0, 1] • 3 ∈ p • [0, 1] ∈ p

27



Connect Test Cases and Test Paths• Connect test cases and test paths with a mapping

pathG from test cases to test paths • e.g., pathG[t] is the set of test paths corresponding to

test case t.

• Usually just write path, as G is obvious from the context

• Lift the definition of path to test set T by defining path(T)

• Each test case gives at least one test path. If the software is deterministic, then each test case gives exactly one test path; otherwise, multiple test cases may arise from one test path.

28



Deterministic and Nondeterministic CFG

29



Connecting Test Cases, Test Paths & CFG

30



Node Coverage

• Node coverage: For each node n ∈ reachG[N0], TR contains a requirement to visit node n.

• Node Coverage [NC]: TR contains each reachable node in G.

• TR = {n0,n1,n2,n3,n4,n5,n6}

31

The graphs that we’ll be talking about in this course will almost always be SESE.

Here’s another example of a graph, which happens to be SESE, and test paths in that graph. We’llcall this graph D, for double-diamond, and it’ll come up a few times.

n0

n1

n2

n3

n4

n5

n6

Here are the four test paths in D:

[n0, n1, n3, n4, n6][n0, n1, n3, n5, n6][n0, n2, n3, n4, n6][n0, n2, n3, n5, n6]

We next focus on the path p = [n0, n1, n3, n4, n6] and use it to explain several path-related defini-tions. We can say that p visits node n3 and edge (n0, n1); we can write n3 ⇥ p and (n0, n1) ⇥ prespectively.

Let p0 = [n1, n3, n4]. Then p0 is a subpath of p, and conversely, p tours p0. Note that any path toursitself.

Test cases and test paths. We connect test cases and test paths with a mapping pathG fromtest cases to test paths; e.g. pathG(t) is the set of test paths corresponding to test case t.

• usually we just write path since G is obvious from the context.

• we can lift the definition of path to test sets T by defining path(T ) = {path(t)|t ⇥ T}.

• each test case gives at least one test path. If the software is deterministic, then each test casegives exactly one test path; otherwise, multiple test cases may arise from one test path.

Here’s an example of deterministic and nondeterministic control-flow graphs:

2



Edge Coverage

• Edge Coverage [EC]: TR contains each reachable path of length up to 1, inclusive, in G.

• TR = {[1,2], [2,4], [2,3], [3,5], [4,5], [5,6] [6,7], [6,2]}

32



Edge Pair Coverage

• Edge-Pair Coverage [EPC]: TR contains each reachable path of length up to 2, inclusive, in G.

• TR= {[1,2,3], [1,2,4], [2,3,5], [2,4,5], [3,5,6], [4,5,6]}

33

Here is a more involved example:

n0

n1

n2

x < y

x ⇤ y

Let’s define

path(t1) = [n0, n1, n2]path(t2) = [n0, n2]

Then

T1 = satisfies node coverage

T2 = satisfies edge coverage

Going beyond 1. So far we’ve seen length ⇥ 0 (node coverage) and length ⇥ 1. Of course, wecan go to lengths ⇥ 2, etc., but we quickly get diminishing returns. Here is the criterion for length⇥ 2.

Criterion 2 Edge-Pair Coverage. (EPC) TR contains each reachable path of length up to 2,inclusive, in G.

Here’s an example.

1 2 3

4

5 6

• nodes:

• edges:

• paths of length 2:

2



Node and Edge Coverage• Edge coverage is slightly stronger than node coverage.

• NC and EC are only different when there is an edge and another subpath between a pair of nodes [as in an “if-else” statement]

Node Coverage : TR = { 0, 1, 2 } Test Path = [ 0, 1, 2 ]

Edge Coverage : TR = { [0,1], [0, 2], [1, 2] } Test Paths = [ 0, 1, 2 ] [ 0, 2 ]

1

2

0



Simple Path

• A path is simple if no node appears more than once in the path, except that the first and last nodes may be the same.

• Some properties of simple paths: • no internal loops; • can bound their length; • can create any path by composing simple paths; and • many simple paths exist [too many!]

35



Simple Path Examples

• Simple path examples: • [1, 2, 3, 5, 6, 7] • [1, 2, 4] • [2,3,5,6,2]

• Not simple Path: [1,2,3,5,6,2,4]

36



Prime Path

• Because there are so many simple paths, let’s instead consider prime paths, which are simple paths of maximal length.

• A path is prime if it is simple and does not appear as a proper subpath of any other simple path.

37



Prime Path Examples

• Prime path examples: • [1, 2, 3, 5, 6, 7] • [1, 2, 4, 5, 6, 7] • [6, 2, 4, 5, 6]

• Not a prime path: [3, 5, 6, 7]

38



Prime Path Coverage

• Prime Path Coverage [PPC]: TR contains each prime path in G.

• There is a problem with using PPC as a coverage criterion: a prime path may be infeasible but contains feasible simple paths.

• How to address this issue?

39



More Path Coverage Criterions

• Complete Path Coverage [CPC]: TR contains all paths in G.

• Specified Path Coverage [SPC]: TR contains a specified set S of paths.

40



Prime Path Example

Len 0 [1] [2] [3] [4] [5] [6] [7] !

Len 1 [1,2] [2,4] [2,3] [3,5] [4,5] [5,6] [6,7] ! [6,2]

Len 2 [1,2,4] [1,2,3] [2,4,5] [2,3,5] [3,5,6] [4,5,6] [5,6,7] ! [5,6,2] [6,2,4] [6,2,3]

Len 3 [1,2,4,5] [1,2,3,5] [2,4,5,6] [2,3,5,6] [3,5,6,7]! [3,5,6,2] [4,5,6,7]! [4,5,6,2] [5,6,2,4] [5,6,2,3] [6,2,4,5] [6,2,3,5]

Len 4 [1,2,4,5,6] [1,2,3,5,6] [2,4,5,6,7]! [2,4,5,6,2]* [2,3,5,6,7]![2,3,5,6,2]* [3,5,6,2,4] [3,5,6,2,3]* [4,5,6,2,4]* [4,5,6,2,3] [5,6,2,4,5]* [5,6,2,3,5]* [6,2,4,5,6]* [6,2,3,5,6]*

Simple paths

41

Len 5 [1,2,4,5,6,7]! [1,2,3,5,6,7]!

12 Prime Paths

! means path terminates.

* denotes path cycles.

x means not prime paths.

x x x x x x

x x x x x x x

x x x x x x x x x

x x x x

x x x x x x

x x

Check paths without a x or *:

53 Simple Paths



Prime Path Example (2)• This graph has 38 simple paths • Only 9 prime paths

Prime Paths [ 0, 1, 2, 3, 6 ] [ 0, 1, 2, 4, 5 ] [ 0, 1, 2, 4, 6 ] [ 0, 2, 3, 6 ] [ 0, 2, 4, 5] [ 0, 2, 4, 6 ] [ 5, 4, 6 ] [ 4, 5, 4 ] [ 5, 4, 5 ]5

0

2

1

3 4

6


Introduction to Software Testing [Ch 2] © Ammann & Offutt25

Prime Path Example (2)

Len 0 [0] [1] [2] [3] [4] [5] [6] !

! means path terminates

Len 1 [0, 1] [0, 2] [1, 2] [2, 3] [2, 4] [3, 6] ! [4, 6] ! [4, 5] [5, 4]

Len 2 [0, 1, 2] [0, 2, 3] [0, 2, 4] [1, 2, 3] [1, 2, 4] [2, 3, 6] ! [2, 4, 6] ! [2, 4, 5] ! [4, 5, 4] * [5, 4, 6] ! [5, 4, 5] *

Len 4 [0, 1, 2, 3, 6] ! [0, 1, 2, 4, 6] ! [0, 1, 2, 4, 5] !

Simple paths

43

5

0

2

1

3 4

69 Prime Paths

* denotes path cycles

Len 3 [0, 1, 2, 3] [0, 1, 2, 4] [0, 2, 3, 6] ! [0, 2, 4, 6] ! [0, 2, 4, 5] ! [1, 2, 3, 6] ! [1, 2, 4, 5] ! [1, 2, 4, 6] !



Node Coverage TR = { 0, 1, 2, 3, 4, 5, 6 } Test Paths:

Edge Coverage TR = { [0,1], [0,2], [1,2], [2,3], [2,4], [3,6], [4,5], [4,6], [5,4] } Test Paths:

Edge-Pair Coverage TR = { [0,1,2], [0,2,3], [0,2,4], [1,2,3], [1,2,4], [2,3,6], [2,4,5], [2,4,6], [4,5,4], [5,4,5], [5,4,6] } Test Paths:

Complete Path Coverage Test Paths: [ 0, 1, 2, 3, 6 ] [ 0, 1, 2, 4, 6 ] [ 0, 1, 2, 4, 5, 4, 6 ] [ 0, 1, 2, 4, 5, 4, 5, 4, 6 ] [ 0, 1, 2, 4, 5, 4, 5, 4, 5, 4, 6 ] …

44

5

0

2

1

3 4

6

[ 0, 1, 2, 3, 6 ] [ 0, 1, 2, 4, 5, 4, 6 ]

[ 0, 1, 2, 3, 6 ] [ 0, 2, 4, 5, 4, 6 ]

[ 0, 1, 2, 3, 6 ] [ 0, 2, 3, 6 ] [ 0, 2, 4, 5, 4, 5, 4, 6 ] [ 0, 1, 2, 4, 6 ]



Prime Path Coverage vs. Complete Path Coverage

45

[n0, n1, n3][n0, n2, n3]

[n0, n1, n3], [n0, n2, n3]



Prime Path Coverage vs. Complete Path Coverage (2)

46

[q0, q1, q2], [q0, q1, q3, q4], [q3, q4, q1, q2], [q1, q3, q4, q1], [q3, q4, q1, q3], [q4, q1, q3, q4]

[q0, q1, q2][q0, q1, q3, q4, q1, q3, q4, q1, q2]



Sidetrips and Detours

• It may not be possible for any test path to tour path q = [1,2,4].

• Why?

• Any path that visits 3 does not tour path q. • p1 =[0, 1, 2, 3, 2, 4, 5] • p2 =[0, 1, 2, 3, 4, 5]

• But p1 and p2 tour q in a more general sense. 47

0 21 5

3

4



Sidetrips and Detours Example0 21 5

3

4

Touring q with a sidetrip

Touring q with a detour

a b e f

c d

a b e

cd

a b c d

Touring q directly

0 21 5

3

4

0 21 5

3

4

q = [1,2, 4]



Tour with Sidetrips and Detours

• For the following definitions, assume that q is simple.

• Tour (directly): Test path p tours subpath q iff q is a subpath of p.

• Tour with sidetrips: Test path p tours subpath q with sidetrips iff every edge in q is also in p, in the same order.

• Tour with detours: Test path p tours subpath q with detours iff every node in q is also in p, in the same order.

49



Sidetrips and Detours Example (2)

• p = [0, 1, 2, 4, 5, 4, 6] • p1 = [0, 1, 2, 4, 6] • p2 = [0, 2, 4, 5,4, 6]

• p tours p1 with a sidetrip. • p tours p2 with a detour.

50

5

0

2

1

3 4

6



Refining Coverage Criteria

• We could define each graph coverage criterion and explicitly include the kinds of tours allowed, e.g.

• prime paths, with direct tours; • prime paths, sidetrips allowed; • prime paths, detours allowed.

• Detours seem less practical, so we do not include detours further.

• We do need sidetrips sometimes, or too many TRs are infeasible.

51



Best Effort Touring• We'd rather not use sidetrips when we don't have to. • TRtour: the subset of test requirements that can be toured [directly] • TRsidetrip: the subset of test requirements that can be toured with

sidetrips. • Best Effort Touring: A set T of test paths achieves

best effort touring if for every path p in TRtour, some path in T tours p [directly], and for every path p in TRsidetrip, some path in T tours p either directly or with a sidetrip.

• Best-effort touring meets as many TRs as possible, each in the strictest possible way. Best-effort touring has desirable theoretical properties with respect to subsumption.

52



Round trip path

• A round trip path is a prime path of nonzero length that starts and ends at the same node.

• Simple Round Trip Coverage [SRTC]: TR contains at least one round-trip path for each reachable node in G that begins and ends a round-trip path.

• Complete Round Trip Coverage [CRTC]: TR contains all round-trip paths for each reachable node in G.

53



Graph Coverage Criteria Subsumption

Simple Round Trip Coverage

SRTC

Prime Path Coverage

PPC

Complete Path Coverage

CPC

Node CoverageNC

Edge CoverageEC

Edge-Pair CoverageEPC

Complete Round Trip Coverage

CRTC

54

Simple Test Path Coverage

Length-up-to-3-path Coverage

Bridge Coverage



Length-up-to-3-path Coverage


Simple Round Trip Coverage

SRTC

Prime Path Coverage

PPC


CPC

Node CoverageNC

Edge CoverageEC

Edge-Pair CoverageEPC

Complete Round Trip Coverage

CRTC

55

Bridge Coverage

Simple Test Path Coverage



Bridge Coverage• Bridge Coverage (BC): If removing an edge adds

unreachable nodes to the graph, then this edge is a bridge. The set of test requirements for BC contains all bridges.

• (a) Assume that a graph contains at least two nodes, and all nodes in a graph are reachable from the initial nodes. Does BC subsume Node coverage (NC)? If yes, justify your answer. If no, give a counterexample.

• TRBC = {[1,2], [2,3], [3,5]} • TRNC = {[1, 2, 3,4, 5} • Test path [1,2,3,5] satisfies BC, but not NC because node

4 is not visited.

56

1

5

2

3

4



Bridge Coverage Cont.

• (b) Does NC subsume BC? If yes, justify your answer. If no, give a counterexample. Simply saying yes or no receives no points.

• Key points for the proof: • For any bridge [a, b], any test case that visits b must

also visit the edge [a, b] (can be proved by contradiction).

• Any test set that satisfies NC must visit node b (TR of NC contains all nodes, including node b). Therefore, for any bridge [a, b], the test set will visit it. Therefore, NC subsumes BC.

57



Exercise

• Answer questions [a]-[g] for the graph defined by the following sets:

• N ={1, 2, 3, 4, 5, 6, 7} • N0 = {1} • Nf = {7} • E = {[1, 2], [1, 7], [2, 3], [2, 4], [3, 2], [4, 5], [4, 6],

[5, 6], [6, 1]}

• Also consider the following test paths: • t0 = [1, 2, 4, 5, 6, 1, 7] • t1 = [1, 2, 3, 2, 4, 6, 1, 7]

58



Exercise - cont.• [a] Draw the graph. • [b] List the test requirements for EPC. [Hint: You

should get 12 requirements of length 2]. • [c] Does the given set of test paths satisfy EPC? If

not, identify what is missing. • [d] Consider the simple path [3,2,4,5,6] and test

path [1,2,3,2,4,6,1,2,4,5,6,1,7]. Does the test path tour the simple path directly? With a sidetrip? Is so, identify the sidetrip.

• [e] List the test requirements for NC, EC and PPC on the graph.

• [f] List a test path that achieve NC but not EC on the graph.

• [g] List a test path that achieve EC but not PPC on the graph.

59



Partial Solutions• [a] Draw the graph. • [b] List the test requirements for EPC. [Hint:

You should get 12 requirements of length 2]. • The edge pairs are: {[1, 2, 3], [1, 2, 4], [2, 3, 2], [2,

4, 5], [2, 4, 6], [3, 2, 3], [3, 2, 4], [4, 5, 6], [4, 6, 1], [5, 6, 1], [6, 1, 2], [6, 1, 7] }

• [c] Does the given set of test paths satisfy EPC? If not, identify what is missing.

• No. Neither t0 nor t1 tours the following edge-pairs: {[3, 2, 3], [6, 1, 2]}

• [d] Consider the simple path [3,2,4,5,6] and test path [1,2,3,2,4,6,1,2,4,5,6,1,7]. Does the test path tour the simple path directly? With a sidetrip? Is so, identify the sidetrip.

• No, with a sidetrip [4, 6, 1, 2, 4] or [2,4,6,1,2]60



Partial Solutions Cont.

• [e] TR for NC, EC, and PPC. • TRNC = • TREC = • TRPPC = {[3, 2, 4, 6, 1, 7], [3, 2, 4, 5, 6, 1, 7], [4, 6, 1,

2, 3], [4, 5, 6, 1, 2, 3], [3, 2, 3], [2, 3, 2], [1, 2, 4, 5, 6, 1], [1, 2, 4, 6, 1], [2, 4, 6, 1, 2], [2, 4, 5, 6, 1, 2], [4, 6, 1, 2, 4], [4, 5, 6, 1, 2, 4], [5, 6, 1, 2, 4, 5], [6, 1, 2, 4, 6], [6, 1, 2, 4, 5, 6]}

• [f] A test path that achieve NC but not EC. • [1, 2, 3, 2, 4, 5, 6, 1, 7] does not cover edge [4, 6].

• [g] A test path that achieve EC but not PPC. • [1, 2, 3, 2, 4, 5, 6, 1, 2, 4, 6, 1, 7] does not cover

prime paths such as [3,2,3].

61


ECE453/SE465/CS447/ECE653/CS647: Control Flow Graph

Lin TanJanuary 22, 2015

06-CFG - January 22, 2015


Control-Flow Graphs

• So far, we’ve seen a number of coverage criteria for graphs, but I’ve been vague about how to actually construct graphs from source code.

• Fundamental graph for source code: Control-Flow Graph (CFG).

• CFG nodes: zero or more statements; • CFG edges: an edge (s1,s2) indicates that s1 may be

followed by s2 in an execution.

206-CFG - January 22, 2015


CFG Example

x=5 y=2 L0: if (y< 17) goto L1 y=y+1 print (x) goto L0 L1: nop

3

1

y=2

L0: if (y<17)

L1: nop

T Fy=y+1

x=5

print (x)



Basic Block

• We can simplify a CFG by grouping together statements which always execute together (in sequential programs)

• A basic block has one entry point and one exit point.

406-CFG - January 22, 2015


CFG With Basic Block

5

x=5 y=2 L0: if (y< 17) goto L1 y=y+1 print (x) goto L0 L1: nop

1L0: if (y<17)

L1: nop

T F

x=5 y=2

y=y+1 print(x)



Basic Block

• A basic block has one entry point and one exit point.

• Note that a basic block may have multiple successors. • However, there may not be any jumps into the middle

of a basic block (which is why statement L0 has its own basic block.)

606-CFG - January 22, 2015


CFG Example/Exercise - For-Loop

7

f o r ( i n t i = 0 ; i < 5 7 ; i++) { i f ( i % 3 == 0) {

pr in t ( i ) ; }

}

int i =0

1i<57

if (i%3 ==0)

T

T

F

print (i) i++

F

exit



if (x[i]%2 ==1)

CFG Example/Exercise - Short Circuit

8

if (x[i] % 2 == 1 || x [i] >0 ) { count++;

}

Tcount++

F

x [i] >0T

F

exit



CFG Take-Home Exercise - Short Circuit

9



CFG Example/Exercise - Switch Statement

switch (arg.length()) {case 0: q /= 2; break; case 1: o = new A(); q = 10; break; default: o = new B(); break;

}

10

switch (arg.length)

defaulto=new B();

break;

exit

q/=2; break;

0 1o=new A();

q=10; break;



CFG Example/Exercise - Switch Statement (2)

11

switch (n)

‘J’

print (‘J’);

‘Default’

print (‘-’); break;

exit

print (‘I’); break;

‘I’

switch (n) {case ‘I’: print(‘I’); break; case ‘J’: print(‘J’); //fall through default: print(‘-’); break;

}



CFG - Put it all together

12

int binary_search(int a[], int low, int high, int target) { /* binary search for target in the sorted a[low, high] */ 1 while (low <= high) { 2 int middle = low + (high - low)/2; 3 if (target < a[middle]) 4 high = middle - 1; 5 else if (target > a[middle]) 6 low = middle + 1; else 7 return middle; } 8 return -1; /* return -1 if target is not found in a[low, high] */}

1

2, 3

FT

8

T

6

F

T

4

F

5

7

Draw a control flow graph with 7 nodes.



CFG - Put it all together

13

int binary_search(int a[], int low, int high, int target) { /* binary search for target in the sorted a[low, high] */ 1 while (low <= high) { 2 int middle = low + (high - low)/2; 3 if (target < a[middle]) 4 high = middle - 1; 5 else if (target > a[middle]) 6 low = middle + 1; else 7 return middle; } 8 return -1; /* return -1 if target is not found in a[low, high] */}

1

2

3

FT

8

T

6

F

T

4

F

5

7

Draw a control flow graph with 8 nodes.


Software Testing Instructor: Lin Tan 14

/* Function: ReturnAverage Computes the average of all those numbers in the input array in the positive range [MIN, MAX]. The max size of the array is AS. But, the array size could be smaller than AS in which case the end of input is designated by -999. */

1 public static double ReturnAverage(int value[], int AS, int MIN, int MAX) { 2 int i, ti, tv, sum; 3 double av; 4 i = 0; ti = 0; tv = 0; sum = 0; 5 while (ti < AS && value[i] != -999) { 6 ti++; 7 if (value[i] >= MIN && value[i] <= MAX) { 8 tv++; 9 sum = sum + value[i]; 10 } 11 i++; 12 } 13 if (tv > 0) av = (double)sum/tv; 14 else av = (double) -999; 15 return (av); 16 }

CFG - Put it all together (2)


ECE453/SE465/CS447/ECE653/CS647: Testing Concurrent Programs

Lin Tan

January 27, 2015

07-Concurrency - January 27, 2015

``More often than not, printing a page on my dual-G5 crashes the application. The funny thing is, printing almost never crashes on my (single-core) G4 PowerBook.” http://www.oreillynet.com/mac/blog/2006/09/dreaded_concurrency.html

2

Multicores are here now• Hardware shifts to multi-core

• Concurrent programs will be pervasive • To leverage the multi-core resource

• Concurrency bugs will be more severe in field • More frequent manifestation on multi-core machines


Hard to Write Concurrent Programs

3

1 #include <stdio.h> 2 #include <pthread.h> 3 4 int counter = 0; // shared variable 5 6 void * func() { 7 int tmp; 8 9 tmp = counter; 10 tmp++; 11 counter = tmp; 12 } 13 14 int main() { 15 pthread_t thread1, thread2; 16 pthread_create (&thread1, 0, func, 0); 17 pthread_create (&thread2, 0, func, 0); 18 pthread_join (thread1, 0); 19 pthread_join (thread2, 0); 20 printf("%d\n", counter); 21 return 0; 22 }


Races

• A race happens when you have two parallel accesses to the same state, at least one of which is a write.

4


Race Detection Tools• Helgrind (part of Valgrind)

• The Linux kernel’s lockdep

• Oracle Solaris Studio's Thread Analyzer

• Coverity Thread Analyzer

• Intel Inspector XE 2011 (Formerly Intel® Thread Checker)

• ...

5


Helgrind Example

6

$ gcc -pthread -g race.c

$ valgrind --tool=helgrind ./a.out

==706== Possible data race during read of size 4 at 0x600a2c by thread #3==706== at 0x4005A0: func (race.c:9)==706== by 0x4C27330: mythread_wrapper (hg_intercepts.c:201)==706== by 0x4E3077C: start_thread (in /lib64/libpthread-2.5.so)==706== by 0x511825C: clone (in /lib64/libc-2.5.so)==706== This conflicts with a previous write of size 4 by thread #2==706== at 0x4005B0: func (race.c:11)==706== by 0x4C27330: mythread_wrapper (hg_intercepts.c:201)==706== by 0x4E3077C: start_thread (in /lib64/libpthread-2.5.so)==706== by 0x511825C: clone (in /lib64/libc-2.5.so)

ecelinux.uwaterloo.ca


Race-free ≠ Bug-free

7

pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;int counter = 0; // shared variable

void * func(void * params) { int tmp;

pthread_mutex_lock(&mutex1); tmp = counter; pthread_mutex_unlock(&mutex1); tmp++; pthread_mutex_lock(&mutex1); counter = tmp; pthread_mutex_unlock(&mutex1);}


How to Test Programs with Locks?

• Run multiple times

• Add noise - sleep, background load, ...

• Helgrind, etc.

• Force different scheduling

• Static approaches

• lock-set

• happens-before

• state-of-art techniques8


pthread locks

• pthread_mutex_trylock(): If the mutex is already locked, it returns immediately. Otherwise, this operation returns with the mutex in the locked state with the calling thread as its owner.

• gcc/g++ -pthread for GNU C/C++

• -D_REENTRANT9

pthread_mutex_lock (mutex)pthread_mutex_trylock (mutex) - for better performance

pthread_mutex_unlock (mutex)


Reentrant Locks

10

A thread can request a re-entrant lock multiple times, and then must release it the same number of times.


Reentrant Locks (2)

11

The ReentrantLock specification states that you should always put the unlock() call in a finally block, in case the lock body throws an exception.


lock-unlock must be paired!

• The award for best effort:

/* 2.4.0:drivers/sound/cmpci.c:cm_midi_release: */ lock_kernel(); if (file->f_mode & FMODE_WRITE) { add_wait_queue(&s->midi.owait, &wait); ... if (file->f_flags & O_NONBLOCK) { remove_wait_queue(&s->midi.owait, &wait); set_current_state(TASK_RUNNING); return –EBUSY; … unlock_kernel();

12Bugs as Deviant Behaviours: A general approach to inferring errors in systems code (SOSP’01)


Deriving “A() must be followed by B()”

• “a(); … b();” implies MAY belief that a() follows b()

• Programmer may believe a-b paired.

• Results: 23 errors, 11 false positives.

foo(p, …) bar(p, …);

foo(p, …); …

“error:foo, no bar!”

13



Lin Tan

Locks in OpenSolaris

14

opensolaris/common/os/taskq.c:/* Assumes: tq->tq_lock is held. */static void taskq_ent_free(…){…}

static taskq_t *taskq_create_common(…){ …

mutex_enter(…);taskq_ent_free(…);…

}

/* iComment: Bugs or Bad Comments? */ (SOSP’07)


Lin Tan

Locks in Mozilla

15

mozilla/security/nss/lib/ssl/sslsnce.c:/* Caller must hold cache lock when calling this. */static sslSessionID * ConvertToSID(…) {…}

Specification in Comment.

}

Mismatch!The bad comment

is already confirmed by

Mozilla developers after we reported it.

A bad comment automatically detected by iComment:

• Comments are not updated accordingly.

• Bad comments can cause and have caused new bugs.

static sslSessionID *ServerSessionIDLookup(…) { …

UnlockSet(cache, set); …sid = ConvertToSID(…);…

}

The lock is released before calling

ConvertToSID().



Lin Tan

Locks in the Linux kernel

linux/drivers/ata/libata-core.c: /* LOCKING: caller. */ void ata_dev_select(…) { … }


}

Mismatch!The bug is

already confirmed by

Linux developersafter we

reported it.

16

A bug automatically detected by iComment:

int ata_dev_read_id(…) {…ata_dev_select(…);…

}

No lock is held before calling

ata_dev_select.




Deadlocks

17


Interrupts Complicate OS Synchronization

18

Thread (T2)

L

Context Switch

L

L

L

Failed Lock Acquisition

Lock Acquisition

L

L

Lock ReleaseL

Thread (T1)

aComment: Mining Annotations from Comments and Code to Detect Interrupt Related Concurrency Bugs (ICSE’11)07-Concurrency - January 27, 2015

D

19

1

Thread (T1) Interrupt Handler Thread (TH)

Interrupt

L

DeadlockFailed Lock Acquisition

Lock Acquisition

L

L

Should disable interrupts

Interrupts Complicate OS Synchronization

L

• If you have a spinlock that can be taken by code that runs in (hardware or software) interrupt context, you must use one of the forms of spin_lock that disables interrupts. Doing otherwise can deadlock the system, sooner or later.

aComment: Mining Annotations from Comments and Code to Detect Interrupt Related Concurrency Bugs (ICSE’11)07-Concurrency - January 27, 2015

Spin_lock_irqsave

• spin_lock_irqsave() will disable interrupts locally and provide the spinlock on SMP.

• This covers both interrupt and SMP concurrency issues. With a call to spin_unlock_irqrestore(), interrupts are restored to the state when the lock was acquired.

20

spinlock_t mr_lock = SPIN_LOCK_UNLOCKED;unsigned long flags;spin_lock_irqsave(&mr_lock, flags);/* critical section ... */spin_unlock_irqrestore(&mr_lock, flags);


ECE453/SE465/CS447/ECE653/CS647: Automated Testing & Bug Detection

Tools

Lin Tan

January 27, 2015

08-Tools - January 27, 2015


OS X Mavericks Goto Fail Bug

2

if ((err = SSLHashSHA1.update(&hashCtx,&serverRandom)) != 0) goto fail; if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) goto fail; goto fail; if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) goto fail; err = sslRawVerify(...); fail: return err;

http://opensource.apple.com/source/Security/Security-55471/libsecurity_ssl/lib/sslKeyExchange.c — contains the bughttp://www.opensource.apple.com/source/Security/Security-55179.13/libsecurity_ssl/lib/sslKeyExchange.c — doesn’t contain the bug


• -Wunreachable-code compiler option • PC-Lint:warning 539: Did not expect positive

indentation • PVS-Studio:V640: Logic does not match

formatting

3

https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-1266

https://nakedsecurity.sophos.com/2014/02/24/anatomy-of-a-goto-fail-apples-ssl-bug-explained-plus-an-unofficial-patch/

How to Detect?

Developed from an earlier version from Sye Van De Veen


Automated Testing & Bug Detection

• Automated testing (including test-generation) technology is becoming the mainstream.

• Coverity used by 900+ companies including Blackberry, Mozilla, etc.

• Microsoft requires Windows device drivers to pass the Static Driver Verifier.

4


Downloadable Tools

5

• For Java

• For C/C++


• An open-source static bytecode analyzer for Java from the University of Maryland.

• Find bug patterns

• Off-by-one

• Null Pointer Dereference

• Read() Return Should Be Checked - # of bytes

• Return Value Should Be Checked - immutable classes

• Uninitialized Read In Constructor

• ...

• Techniques to reduce false positives: https://ece.uwaterloo.ca/~lintan/publications/ua-msr14.pdf

6

http://findbugs.sourceforge.net/


Java Path Finder (JPF) - NASA

• JPF is a special Java Virtual Machine that, by default, explores different thread interleavings looking for concurrency bugs.

• “JPF is an explicit state software model checker for Java™ bytecode”.

• JPF can search for deadlocks and unhandled exceptions (e.g. NullPointerExceptions and AssertionErrors). A number of extensions, like race condition and heap bounds checks are included in the JPF distribution.

• http://javapathfinder.sourceforge.net/7


Korat - U. of Illinois• Korat generates Java objects from a representation

invariant specification that is written as a Java method.

• Consider a BinaryTree Java class. The Java repOk() method will return true if the tree is well-formed.

• It might check for each node in the tree that the left and right pointers do not refer to the same child node.

• Given this repOk() method, Korat will generate all non-isomorphic trees up to a given size (say, 3). These trees can then be used as inputs for testing the add() method of the tree class (or any other methods in the class).

• http://korat.sourceforge.net/index.html

8Thanks to Derek Rayside for the description of Korat, Randoop, ESE/Java and JPF.


Randoop - MIT

• Randoop generates random sequences of method calls looking for object contract violations. Just point it at the program under test let it run.

• Randoop discards method sequences that result in illegal argument exceptions, etc. It remembers method sequences that construct complex objects, and sequences that result in object contract violations.

• http://code.google.com/p/randoop/

9


DAIKON - Dynamic Invariant Detection

• Goal: recover invariants from programs

• Technique: run the program, examine values

• Results:

• recovered formal specifications

• revealed bugs

• http://groups.csail.mit.edu/pag/daikon/

• Daikon works on Java, C, C++, Lisp


ESC/Java - Compaq

• ESC/Java checks Java programs against specification written in JML (Java Modelling Language). We have seen a few examples in class.

• //@ public invariant balance >= 0 && balance <= MAX_BALANCE;

• http://www.hpl.hp.com/downloads/crl/jtk/index.html

11


Downloadable Tools

12

• For Java

• For C/C++


cppcheck• cppcheck — Open-source tool that statically checks

for several types of errors:

• Bounds checking

• Memory leaks

• division by zero

• null pointer

• obsolete functions

• uninitialized variables

• ...

• http://sourceforge.net/projects/cppcheck/13


Valgrind• Memcheck is a memory error detector. It helps

you make your programs, particularly those written in C and C++, more correct.

• Illegal read / Illegal write errors

• Use of uninitialised values

• Illegal frees

• Overlapping source and destination blocks

• Memory leak detection

• Helgrind is a thread error detector. It helps you make your multi-threaded programs more correct.

• http://valgrind.org/14


flawfinder

• Finding security bugs based on known patterns

• buffer overflow risks (e.g., strcpy(), strcat(), gets(), sprintf(), and the scanf() family)

• format string problems ([v][f]printf(), [v]snprintf(), and syslog())

• race conditions (such as access(), chown(), chgrp(), chmod(), tmpfile(), tmpnam(), tempnam(), and mktemp())

• ...

• http://www.dwheeler.com/flawfinder/

15


Clang Static Analyzer - U. of Illinois

• A C/C++ compiler front-end that includes a static analyzer.

• http://clang-analyzer.llvm.org/

• http://clang-analyzer.llvm.org/available_checks.html

• Check for null pointers passed as arguments to a function whose arguments are marked with the 'nonnull' attribute.

• Memory leaks, Division by Zero, Null pointer dereference, ...

16


Sparse - Linux

• Sparse — A parser tool designed to find faults in the Linux kernel.

• User/kernel space

• Null pointer

• ...

• https://sparse.wiki.kernel.org/index.php/Main_Page

• http://linux.die.net/man/1/sparse

17


Splint - U. of Virginia

• Lint — The original static code analyzer for C.

• Splint — An open source evolved version of Lint (for C).

• Splint is a tool for statically checking C programs for security vulnerabilities and coding mistakes.

• http://www.splint.org/18

/*@falsewhennull@*/ bool isNonEmpty (/*@null@*/ char *x) { return (x != NULL && *x != ‘\0’); }


Pex - Microsoft’s White Box Unit Testing

• Try http://pexforfun.com/

19


DASE: Document-Assisted Symbolic Execution for Improving Automated Software Testing. ICSE’15. Wong, Zhang, Wang, Liu, & Tan.

Symbolic Executionbool isPositiveEvenNum(int x) { if (x <= 0) return false; else { if (x % 2 == 0) return true; else return false; } }

20

x <= 0 x >0

x % 2 == 0

x %2 != 0

x: *

x: > 0 && % 2 == 0

x: <=0

x: > 0 && % 2 != 0

x: -1

x: 2

x: 1

Dynamic binary execution tree

State

Path


Path Explosion

…… ……

next state to run

Search strategy

…… ……

21DASE: Document-Assisted Symbolic Execution for Improving Automated Software Testing. ICSE’15. Wong, Zhang, Wang, Liu, & Tan.


KLEE - Symbolic Execution Engine

• To download: http://klee.github.io/

• Research paper: http://www.doc.ic.ac.uk/~cristic/papers/klee-osdi-08.pdf

• Techniques to use input constraints automatically extracted form documents to guide symbolic execution to test more effectively:

• DASE: Document-Assisted Symbolic Execution for Improving Automated Software Testing. ICSE’15. Edmund Wong, Lei Zhang, Song Wang, Taiyue Liu and Lin Tan.

22


Other Tools From Academia & Industry

23


Coverity Static Analyzer

• Coverity Static Analysis

• Identifies bugs in C/C++, Java, and C# codebases, “scaling to hundreds of users, thousands of defects, and millions of lines of code in a single analysis”

• Must beliefs & May beliefs

• Coverity aims for low false positives

• http://www.coverity.com/

24


GrammaTech CodeSonar

• Static analysis tool for C & C++ and Java

• CodeSonar is very good at C/C++

• Its Java bug finding performance might be similar to FindBugs (except CodeSonar has better user interface)

• CodeSonar aims for lower false negatives.

• http://www.grammatech.com/products/codesonar/

25


Visual Studio - Microsoft

• Visual Studio Team Suite and Development Edition

26

opensolaris/intel/io/acpica/resources/rscalc.c: /* … AmlBufferLength - Size of AmlBuffer … */ ACPI_STATUS AcpiRsGetListLength ( UINT8 *AmlBuffer, UINT32 AmlBufferLength, ...)__ecount(AmlBuffer)


Lin Tan 27

Commit: 09a02f...Author: John SmithMessage: I submitted some code.

file1.c+++-

file2.c+----

Predicted Buggy!

Source Control

Code Comments,

Documentation

Pictures: courtesy of A. Hassan

Personalized Defect Prediction (ASE’13) Jiang, Tan, and Kim.

Defect Prediction


Other Commercial Tools

• PCLint - Fast, http://www.gimpel.com/html/pcl.htm

• PVS-Studio - http://www.viva64.com/en/b/0149/

• Fortify — Helps developers identify software security vulnerabilities in C/C++, .NET, Java, JSP, ASP.NET, ColdFusion, "Classic" ASP, PHP, VB6, VBScript, JavaScript, PL/SQL, T-SQL, python and COBOL as well as configuration files.

• Intel — Intel Parallel Studio XE: Contains Static Security Analysis (SSA) feature supports C/C++ and Fortran

• Klocwork Insight — finds security issues and bugs in C/C++, Java, and C#

28


Other Tools

• ScalaTest (Unit testing)

• ScalaCheck (random generators / property based testing)

• Jacoco (Code coverage)

• Atlassian Bamboo (Continuous integration server)

29

Suggested by Michael Viana


Homework

• Draw a decision tree to help a software tester/developer select an appropriate tool (if any) for a given program.

• More than one tool may be appropriate for a given program.

• Pick two tools, use them, and compare them

30


ECE453/SE465/CS447/ECE653/CS647: Coverity & iComment

Lin Tan

January 31, 2015

09-Beliefs-iComment - February 1, 2015


static int reset_hardware(…){…}

linux/drivers/scsi/in2000.c:static int in2000_bus_reset(…){ …

reset_hardware(…); …

}

Finding Bugs Requires Specifications

• A specification/rule/intention example:

2

Callers of reset_hardware() must

acquire the lock.

No lock acquisition ⇒ A bug!




• Specification/constraint examples:

• len < 100

• x = 16*y + 4*z + 3

• fclose() is called after fopen()

• Callers of reset_hardware() must acquire the lock.

3



How to Get Specifications?

• Use Automated Tools

• Infer from source code, execution traces, comments, ...

• Static tools: Coverity, iComment, PR-Miner,...

• Dynamic tools: Daikon, DIDUCE, ...

4



Impact

• Coverity (http://coverity.com/)

• Able to find many bugs in large programs (millions lines of code)

• A leading company in building bug detection tools

• 900+ customers including Blackberry, Yahoo, Mozilla, MySQL, McAfee, ECI Telecom, Samsung, Siemens, Synopsys, NetApp, Akamai, ...

• http://softwareintegrity.coverity.com/FreeTrialWebSite.html

5


A partial list of 70+ customers…

Biz Applications

Government

OS

EDA SecurityStorage

Open Source

Networking Embedded


Courtesy of Dawson Engler

“No, your tool is broken: that’s not a bug”• “No, the loop will go

through once!”

• “No, && is ‘or’!”

• “No, ANSI lets you write 1 past end of the array!” • (“We’ll have to agree to disagree.” !!!!)

for(i=1; i < 0; i++) { …deadcode… }

void *foo(void *p, void *q) { if(!p && !q) return 0;

unsigned p[4]; p[4] = 1;



• Intuition: how to find errors without knowing truth?

• Contradiction. To find lies: cross-examine. Any contradiction is an error.

• Deviance. To infer correct behaviour: if 1 person does X, might be right or a coincidence. If 1000s do X and 1 does Y, probably an error.

• Crucial: we know contradiction is an error without knowing the correct belief!

Goal: find as many serious bugs as possible

8



• MUST beliefs: – Inferred from acts that imply beliefs code *must* have.

– Check using internal consistency: infer beliefs at different locations, then cross-check for contradiction

• MAY beliefs: could be coincidental – Inferred from acts that imply beliefs code may have

– Check as MUST beliefs; rank errors by belief confidence.

Cross-checking program belief systems

x = *p / z; // MUST belief: p not null // MUST: z != 0 unlock(l); // MUST: l acquired x++; // MUST: x not protected by l

// MAY: A() and B() // must be paired

A(); …

B();

A(); …

B();

A(); …

B();

A(); …

B();

9



Trivial consistency: NULL pointers

• *p implies MUST belief:

• p is not null

• A check (p == NULL) implies two MUST beliefs:

• POST: p is null on true path, not null on false path

• PRE: p was unknown before check

• Cross-check these for three different error types.

• Check-then-use (79 errors, 26 false pos)

/* 2.4.1: drivers/isdn/svmb1/capidrv.c */ if (!card)

printk(KERN_ERR, “capidrv-%d: …”, card->contrnr…)

10



Null pointer fun

• Use-then-check: 102 bugs, 4 false

• Contradiction/redundant checks (24 bugs, 10 false)

/* 2.4.7: drivers/char/mxser.c */ struct mxser_struct *info = tty->driver_data;

unsigned flags; if(!tty || !info->xmit_buf)

return 0;

/* 2.4.7/drivers/video/tdfxfb.c */ fb_info.regbase_virt = ioremap_nocache(...);

if(!fb_info.regbase_virt) return -ENXIO;

fb_info.bufbase_virt = ioremap_nocache(...); /* [META: meant fb_info.bufbase_virt!] */

if(!fb_info.regbase_virt) { iounmap(fb_info.regbase_virt);

11



Redundancy checking

• Assume: code supposed to be useful

• Useless actions = conceptual confusion. Like type systems, high level bugs map to low-level redundancies

• Identity operations: “x = x”, “1 * y”, “x & x”, “x | x”

• Assignments that are never read:

/* 2.4.5-ac8/net/appletalk/aarp.c */ da.s_node = sa.s_node; da.s_net = da.s_net;

for(entry=priv->lec_arp_tables[i];entry != NULL; entry=next){ next = entry->next; if (…) lec_arp_remove(priv->lec_arp_tables, entry); lec_arp_unlock(priv); return 0; }

12



Handling MAY beliefs

• MUST beliefs: only need a single contradiction

• MAY beliefs: need many examples to separate fact from coincidence

• Conceptually:

• Assume MAY beliefs are MUST beliefs

• Record every successful check with a “check” message

• Every unsuccessful check with an “error” message

• rank errors based on ratio of checks (n) to errors (err)

• The most likely errors are those where n is large, and err is small.

13



Statistical: Deriving deallocation routines

• Use-after free errors are horrible.

• Problem: lots of undocumented sub-system free functions

• Solution: derive behaviourally: pointer “p” not used after call “foo(p)” implies MAY belief that “foo” is a free function

• Conceptually: Assume all functions free all arguments

• (in reality: filter functions that have suggestive names)

• Emit a “check” message at every call site.

• Emit an “error” message at every use

• rank bar’s error first

• Results: 23 free errors, 11 false positives

foo(p); *p = x;

foo(p); *p = x;

foo(p); *p = x;

bar(p); p = 0;

bar(p); p = 0;

bar(p); *p = x;

14



• Traditional:

•Use global analysis to track which routines return NULL

•Problem: false positives when pre-conditions hold, difficult to tell statically (“return p->next”?)

• Instead: see how often programmer checks.

•Rank errors based on number of checks to non-checks.

• Algorithm: Assume *all* functions can return NULL

• If pointer checked before use, emit “check” message

• If pointer used before check, emit “error”

•Sort errors based on ratio of checks to errors

• Result: 152 bugs, 16 false.

Statistical: deriving routines that can fail

p = bar(…); If(!p) return;

*p = x;


*p = x;


*p = x;

15

p = bar(…); *p = x;



Deriving “A() must be followed by B()”

• “a(); … b();” implies MAY belief that a() follows b()

• Programmer may believe a-b paired, or might be a coincidence.

• Algorithm:

• Assume every a-b is a valid pair (reality: prefilter functions that seem to be plausibly paired)

• Emit “check” for each path that has a() then b()

• Emit “error” for each path that has a() and no b()

• Results: 23 errors, 11 false positives.


“check foo-bar”

foo(p,); bar(p,);

“check foo-bar”

foo(p, …); …

“error:foo, no bar!”

16



Project Related

17

void scope1() { A(); B(); C(); D();}void scope2() { A(); C(); D();}void scope3() { A(); B();}void scope4() { B(); D(); scope1();}void scope5() { B(); D(); A();}void scope6() { B(); D();}

• “A() and B() must be paired”: either A() then B() or B() then A().

• Support is the number of times a pair of functions appears together.

• support({A,B}) = 3

• Confidence({A,B}, {A}) = support({A,B})/support({A})= 3/4



Project Related

18

void scope1() { A(); B(); C(); D();}void scope2() { A(); C(); D();}void scope3() { A(); B();}void scope4() { B(); D(); scope1();}void scope5() { B(); D(); A();}void scope6() { B(); D();}

• The sample output with the support threshold 3 and confidence threshold 65% is (intra-procedural analysis):

• bug: A in scope2, pair: (A B), support: 3, confidence: 75.00%

• bug: A in scope3, pair: (A D), support: 3, confidence: 75.00%

• bug: B in scope3, pair: (B D), support: 4, confidence: 80.00%

• bug: D in scope2, pair: (B D), support: 4, confidence: 80.00%



Checking derived lock functions

• The award for best effort:

/* 2.4.0:drivers/sound/cmpci.c:cm_midi_release: */ lock_kernel(); if (file->f_mode & FMODE_WRITE) { add_wait_queue(&s->midi.owait, &wait); ... if (file->f_flags & O_NONBLOCK) { remove_wait_queue(&s->midi.owait, &wait); set_current_state(TASK_RUNNING); return –EBUSY; … unlock_kernel();

19



• Key ideas:

• Check code beliefs: find errors without knowing truth.

• Beliefs code MUST have: Contradictions = errors

• Beliefs code MAY have: check as MUST beliefs and rank errors by belief confidence

• Assumption for May Beliefs: Majority of the code is correct.

Summary: Belief Analysis

20



Additional Reading• Bugs as Deviant Behaviours:A general approach to inferring errors

in systems code

• Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, Benjamin Chelf Stanford University, SOSP’01

• Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions (Best Paper) (OSDI’00)

• http://www.stanford.edu/~engler/mc-osdi.pdf

• eXplode: a Lightweight, General System for Finding Serious Storage System Errors (OSDI’06)

• http://www.stanford.edu/~engler/explode-osdi06.pdf

21



How to Get Specifications?

• Use Automated Tools

• Infer from source code, execution traces, comments, ...

• Static tools: Coverity, iComment, PR-Miner, ...

• Dynamic tools: Daikon, DIDUCE, ...

22



static int reset_hardware(…){…}

linux/drivers/scsi/in2000.c:static int in2000_bus_reset(…){ …

reset_hardware(…); …

}


• A specification/rule/intention example:

23

Callers of reset_hardware() must

acquire the lock.

No lock acquisition ⇒ A bug!



Extracting Specs from Comments & Code

• Many comments contain specifications.

24

linux/drivers/scsi/in2000.c:/* Caller must hold instance lock! */static int reset_hardware(…) {…}

• Infer from source code & traces: [ErnstICSE’00], [EnglerSOSP’01], [HangalICSE’02], [LiFSE’05], [KremenekOSDI’06], [TanSecurity’08] ...

• Lack of support

• E.g., reset_hardware() is called with the lock held once

• Lack of some type of information

• E.g., Units, 128 MB



Comments for Reliability & More

• Specifications/rules (examples from real-world software):

• Calling context:

• Calling order:

• Unit:

• Help ensure correct software evolution:

• Beyond reliability:

• Help reduce code navigation:

25

/* Must be called with interrupts disabled *//* Call scsi_free before mem_free since ... */

int mem; /* memory in 128 MB units */

/* See comment in struct sock definition to understand ... */

/* WARNING: If you change any of these defines, make sure to change ... */

/* FIXME: We should group addresses here. */

timeout_id_t msd_timeout_id; /* id returned by timeout() */



Comments are Under-utilized

26

• Millions lines of comments (23-30%) exist in software.

(kernel only for OSs, excluding blank lines, including copyright notices)

Software Linux FreeBSD OpenSolaris Chrome Eclipse

Lines of Comment 1.0M 0.6M 1.1M 0.7M 1.7M

• Comments are ignored by current compilers and tools.

• Difficult to obtain information from comments

• Written in natural language

• Many are not grammatically correct.



Statement

• We show it is feasible to automatically extract specs from comments to detect comment-code mismatches

• Program analysis

• Natural language processing (NLP)

• Machine learning

• Statistics

27



Comment-code Inconsistency

• Use comment-code redundancy to detect comment-code mismatches.

• Mismatches indicate:

• Possibility (1): Bugs

• Possibility (1I): Bad comments

28



Possibility (1): Bugs

linux/drivers/ata/libata-core.c: /* LOCKING: caller. */ void ata_dev_select(…) { … }


}

Mismatch!The bug is

already confirmed by

Linux developersafter we

reported it.

29

A bug automatically detected by iComment:

int ata_dev_read_id(…) {…ata_dev_select(…);…

}

No lock is held before calling

ata_dev_select.



Possibility (2): Bad Comments

30

mozilla/security/nss/lib/ssl/sslsnce.c:/* Caller must hold cache lock when calling this. */static sslSessionID * ConvertToSID(…) {…}


}

Mismatch!The bad comment

is already confirmed by

Mozilla developers after we reported it.

A bad comment automatically detected by iComment:

• Comments are not updated accordingly.

• Bad comments can cause and have caused new bugs.

static sslSessionID *ServerSessionIDLookup(…) { …

UnlockSet(cache, set); …sid = ConvertToSID(…);…

}

The lock is released before calling

ConvertToSID().



Bug-causing Bad Comments

31

mozilla/nsprpub/pr/include/prio.h: /* The thread invoking this function BLOCKS until all the data is written */ PR_EXTERN(PRInt32) PR_Write(…); …/* The operating will BLOCK until … */ PR_EXTERN(PRInt32) PR_Recv(…);

Quote from Mozilla Bug Report 363114:“These statements have led numerous Mozilla developers to

conclude that the functions are blocking, ... Some very wrong code has been written in Mozilla due to this mistaken belief ... These statements need to be fixed ASAP.”

Bad comments!



Challenges

• Goal: Detecting comment-code inconsistencies

• Challenges:

• Various ways to paraphrase

• Comments do not contain the complete information.

• Use Natural Language Processing (NLP) techniques?

32

/* We need to acquire the write IRQ lock before calling ep_unlink(). */

/* Lock must be acquired on entry to this function. */

/* Caller must hold instance lock! */



NLP is Not Magic

• NLP does NOT “understand” natural language text.

33

Noun ... Verb Noun...

Caller must hold instance lock

1. POS Tagging (acc: ~97%)

3. Semantic Role Labeling (acc: ~80%)2. Chunking (acc: ~90%)

Subject ObjectVerb

• Impossible to automatically analyze any arbitrary comments.

• Ambiguity, variations, incomplete information, ...

• NLP can analyze sentence structures.



iComment Contributions

• First work to automatically analyze comments written in natural language to detect bugs and bad comments

• Combined program analysis, NLP, machine learning, and statistics

• Automatically extracted 1832 rules and detected 60 new bugs and bad comments (many confirmed and fixed)

• 2 important topics*: lock-related & caller-callee.

• Latest versions of large software: Linux, Mozilla, Apache & Wine.

34

* Supported by our topic miners and cComment study (ICSE’09)



Overall Results

• Automatically detected 60 new bugs and bad comments

• 20 already confirmed by developers.

• Time: ~1 hour for each software

35

Software True Mismatches FP Rules

Linux 51 32 1209

Mozilla 6 3 410

Wine 2 3 149

Apache 1 0 64

Total 60 38 1832

Software Bugs BadCom FP Rules

Linux 30 21 32 1209

Mozilla 2 4 3 410

Wine 1 1 3 149

Apache 0 1 0 64

Total 33 27 38 1832



Additional Reading• /* iComment: Bugs or Bad Comments? */, SOSP’07

• Listening to Programmers - Taxonomies and Characteristics of Comments in Operating System Code. ICSE’09

• aComment: Mining Annotations from Comments and Code to Detect Interrupt-Related Concurrency Bugs. ICSE’11

• @tComment: Testing Javadoc Comments to Detect Comment-Code Inconsistencies. ICST’12

36

https://ece.uwaterloo.ca/~lintan/projects.html



Recall: Symbolic Executionbool isPositiveEvenNum(int x) { if (x <= 0) return false; else { if (x % 2 == 0) return true; else return false; } }

37

x <= 0 x >0

x % 2 == 0

x %2 != 0

x: *

x: > 0 && % 2 == 0

x: <=0

x: > 0 && % 2 != 0

x: -1

x: 2

x: 1

Dynamic binary execution tree

State

Path


Recall: Path Explosion

…… ……

next state to run

Search strategy

…… ……

38DASE: Document-Assisted Symbolic Execution for Improving Automated Software Testing. ICSE’15. Wong, Zhang, Wang, Liu, & Tan.


DASE Approach

• Use high-level info. from documents – i.e., input constraints

• To filter paths before scheduling via search strategies – Favoring paths of core functionality code

39

Document-Assisted Symbolic Execution



Approach Overview

…… ……

40

docs



Approach Overview

…… ……

41

docs



Approach Overview

…… ……

42

docs

…… ……

Search strategy

next to run



Lin Tan

One-Click

Register & Login

HttpAccess

SendSMS

Stealthy!

43

AsDroid: Detecting Stealthy Behaviors in Android Applications by User Interface and Program Behavior Contradiction. Huang, Zhang, Tan, Wang, and Liang. (ICSE’14)

An app’s behaviors that mismatch its UI Text

Real App “Qiyu"

Detecting Stealthy Behaviors in Mobile Apps


Lin Tan

Reducing False Alerts

44

Finding Patterns in Static Analysis Alerts. Hanam, Tan, Holmes, and Lam (MSR’14)

We tried bug detection tools, but there were too many false alerts.


Lin Tan

• FindBugs Alert: Method might ignore exception.

• Find statements related to the alert and rank the alerts with a machine learner.

45

AUnactionable Alerts VS

try{ reader.close(); resource = null;} catch (Exception ignore) {}

try{ object.doSomething();} catch (Exception e) {}

Suppress this ( Display this (

DE_MIGHT_IGNORE DE_MIGHT_IGNORE

Finding Patterns in Static Analysis Alerts. Hanam, Tan, Holmes, and Lam (MSR’14)


Lin Tan 46

Commit: 09a02f...Author: John SmithMessage: I submitted some code.

file1.c+++-

file2.c+----

Predicted Buggy!

Source Control

Code Comments,

Documentation

Pictures: courtesy of A. Hassan

Personalized Defect Prediction (ASE’13) Jiang, Tan, and Kim.

Defect Prediction


ECE453/SE465/CS447/ECE653/CS647: Data Flow Criteria

Lin Tan

February 13, 2015

10-DataFlow - February 13, 2015

Introduction to Software Testing (Ch 2) © Ammann & Offutt

Data Flow Criteria

• Definition (def) : A location where a value for a variable is stored into memory

• Use : A location where a variable’s value is accessed • def (n) or def (e) : The set of variables that are defined by node n

or edge e • use (n) or use (e) : The set of variables that are used by node n or

edge e

Goal: Try to ensure that values are computed and used correctly

X = 42

Z = X-8

Z = X*2 Defs: def (0) = {X}

def (4) = {Z}

def (5) = {Z}

Uses: use (4) = {X}

use (5) = {X}2

0

2

1

63

5

4



Another Example

3

def(q0) = {z}, use(q0) = {x, y}def(q1) = {}, use(q1) = {z, w} def(q2) = {}, use(q2) = {z}def(q3) = {}, use(q3) = {w}def(q4) = {}, use(q4) = {}



Reach• The definition at n2 does not reach the use at

n3, since no path goes from n2 to n3.

• Another example:

4



DU-pair

5

• DU pair: A pair of locations (li, lj) such that a variable v is defined at li and used at lj.


Software Testing Instructor: Lin TanIntroduction to Software Testing (Ch 2)

Def-clear and Reach

• Def-clear: A path p from li to lj is def-clear with respect to variable v if for every node nk and every edge ek on p from li to lj, where k ≠ i and k ≠ j, v is not in def(nk) or in def(ek).

• Reach: If there is a def-clear path from li to lj with respect to v, the def of v at li reaches the use at lj.

6



Does the def at n0 reach the use at n5?

7



DU Paths

• du-path : A simple path that is def-clear with respect to v from a def of v to a use of v

• associated with a variable • simple (otherwise there are too many) • may be any number of uses in a du-path

• Def-pair set du (ni, nj, v) – the set of du-paths from ni to nj

• Def-path set du (ni, v) – the set of du-paths that start at ni

8



Def-pair & Def-set Example

• du(n5, u1, x) = • du(n5,x) =

• du(n3,x)=

9

{[n5, u0, n, u1]}du(n5, u0,x) U du (n5, u1, x) U du (n5, u2, x)

= {[n5, u0], [n5, u0, n, u1], [n5, u0, n, u1, u2]}du (n3, u2,x)= { [n3, u2]}



Data Flow Test Criteria

• Then we make sure that every def reaches all possible uses:• All-uses coverage (AUC) : For each set of du-paths to uses S =

du (ni, nj, v), TR contains at least one path d in S.

• Finally, we cover all the du-paths between defs and uses:• All-du-paths coverage (ADUPC) : For each set S = du (ni, nj, v),

TR contains every path d in S.

• First, we make sure every def reaches a use:• All-defs coverage (ADC) : For each set of du-paths S = du (n, v),

TR contains at least one path d in S.

10

Assume definitions occur on nodes only, for the following definitions.



ADC, AUC, ADUPC Example

X = 42

Z = X-8

Z = X*2

All-defs for X

[ 0, 1, 3, 4 ]

All-uses for X

[ 0, 1, 3, 4 ]

[ 0, 1, 3, 5 ]

All-du-paths for X

[ 0, 1, 3, 4 ]

[ 0, 2, 3, 4 ]

[ 0, 1, 3, 5 ]

[ 0, 2, 3, 5 ]

11

0

2

1

63

5

4



ADC, AUC, ADUPC Example (2)

• ADC requires: • AUC requires: • ADUPC requires:

12

{[n0,n1]}{[n0,n1], [n0, n2,n3,n5]}

{[n0,n1], [n0,n2,n3,n5], [n0,n1,n3,n5]}List all du-paths even if they are a subpath of another.




• ADC requires: • AUC requires:

• ADUPC requires:

13

{[n5,u0], [n9, n, u1], [n3, u2]}{[n5,u0], [n5,u0, n, u1], [n5,u0, n, u1, u2], [n9, n, u1], [n9, n, u1, u2],[n3, u2]}

{[n5,u0], [n5,u0, n, u1], [n5,u0, n, u1, u2], [n9, n, u1], [n9, n, u1, u2],[n3, u2]}



Touring DU-Paths

• A test path p du-tours subpath d with respect to v if p tours d and the subpath taken is def-clear with respect to v

• Sidetrips can be used, just as with previous touring

• Three criteria - assume best effort touring – Use every def – Get to every use – Follow all du-paths

14



Definitions and Uses in Code• Definitions. Here are some statements which

correspond to definitions. • x = 5: x occurs on the left-hand side of an assignment

statement; • foo(T x) { ...}: implicit definition for x at the start of a

method; • bar(x): during a call to bar, x might be defined if x is a

C++ reference parameter. • subsumed by others: x is an input to the program.

• Uses. The book lists a number of cases of uses, but it boils down to “x occurs in an expression that the program evaluates.''

• E.g., RHS of an assignment, as part of a method parameter, or in a conditional

15



Basic blocks and defs/uses

• Basic blocks can rule out some definitions and uses as irrelevant.

• Defs: consider the last definition of a variable in a basic block. (If we're not sure whether x and y are aliased, keep both of them.)

• Uses: consider only uses that aren't dominated by a definition of the same variable in the same basic block, e.g. y = 5; use(y) is not interesting.

16




17

Prime Path Coverage

PPC


CPC

Node Coverage NC

Edge Coverage EC

Edge-Pair Coverage EPC

All-defs Coverage

ADC

All-DU-Paths Coverage ADUPC

All-uses Coverage

AUC



AUC & EC

• Under what additional assumptions, AUC subsumes EC? • For every node with multiple outgoing edges, at

least one variable is used on each out edge, (and the same variables are used on each out edge).

• Reasoning: • Each edge e1 must be either a branch edge or not.• If e1 is a branch edge, then it has a use, thus it must

be covered by AUC.• If e1 is not a branch edge, the closest branch edge

before e1, denoted as e2, should be covered AUC, and if e2 is covered by a test, e1 must also be. If the closed branch edge doesn’t exit, then the program has no branch edges; therefore e1 must be covered.

18



Compiler tidbit

• In a compiler, we use intermediate representations to simplify expressions, including definitions and uses.

• For instance, we would simplify:

• x = foo(y + 1, z * 2)

19



Exercise• Answer questions (a)-(f) for the graph defined

by the following sets: • N ={0, 1, 2, 3, 4, 5, 6, 7} • N0 = {0} • Nf = {7} • E = {(0,1), (1, 2), (1, 7), (2, 3), (2, 4), (3, 2), (4, 5),

(4, 6), (5, 6), (6, 1)} • def(0)=def(3)=use(5)=use(7) = {X}

• Also consider the following test paths: • t1 = [0,1,7] • t2 = [0,1,2,4,6,1,7] • t3 = [0,1,2,4,5,6,1,7] • t4 = [0,1,2,3,2,4,6,1,7] • t5 = [0,1,2,3,2,3,2,4,5,6,1,7] • t6 = [0,1,2,3,2,4,6,1,2,4,5,6,1,7]

20



Exercise - cont.

• (a) Draw the graph.

• (b) List all of the du-paths with respect to x. (Note: Include all du-paths, even those that are subpaths of some other du-paths).

• (c) For each test path, determine which du-paths that test path du-tours. Consider direct touring only. Hint: A table is a convenient format for describing the relationship.

21



Exercise - cont.

• (d) List a minimal test set that satisfies all-defs coverage with respect to x (Direct tours only). Use the given test paths.

• (e) List a minimal test set that satisfies all-uses coverage with respect to x. (Direct tours only). Use the given test paths.

• (f) List a minimal test set that satisfies all-du-paths coverage with respect to x. (Direct tours only). Use the given test paths.

22



Exercise - cont.• (b) du (0, 5, x): i: [0,1,2,4,5]

du (0, 7, x): ii: [0,1,7] du (3, 5, x): iii: [3,2,4,5] du (3, 7, x): iv: [3,2,4,6,1,7] v: [3,2,4,5,6,1,7]

• (c)

23

Test Path Du-tour directly

t1 = [0,1,7] ii

t2 = [0,1,2,4,6,1,7] /

t3 = [0,1,2,4,5,6,1,7] i

t4 = [0,1,2,3,2,4,6,1,7] iv

t5 = [0,1,2,3,2,3,2,4,5,6,1,7] iii, v

t6 = [0,1,2,3,2,4,6,1,2,4,5,6,1,7] /



Exercise - Partial Solutions.

• (d) List a minimal test set that satisfies all-defs coverage with respect to x (Direct tours only). Use the given test paths.

• {t1, t4} or {t1, t5} or {t3, t4} or {t3, t5} • Why isn’t {t1, t6} a correct answer?

• t6 does not du-tour any path in du(3, x). • (e) List a minimal test set that satisfies all-uses

coverage with respect to x. (Direct tours only). Use the given test paths.

• {t1, t3, t5} • (f) List a minimal test set that satisfies all-du-paths

coverage with respect to x. (Direct tours only). Use the given test paths.

• {t1, t3, t4, t5}

24


Software Testing Instructor: Lin Tan© Ammann & Offutt 7

Interprocedural Data Flow

• Data flow couplings among units are more complicated than control flow couplings.

– When values are passed, they “change names”. – Finding which uses a def can reach is very difficult.

• Caller : A unit that invokes another unit• Callee : The unit that is called• Callsite : Statement or node where the call appears• Actual parameter : Variable in the caller• Formal parameter : Variable in the callee

Introduction to Software Testing (Ch 2) 25


© Ammann & OffuttIntroduction to Software Testing (Ch 2)

• Applying data flow criteria to def-use pairs between units is too expensive.

• Too many possibilities • But this is integration testing, and we really only care about the

interface …

Example Call Site

a() { ... b(x) ...}

b(y){ ...}

Caller

Actual Parameter

Formal Parameter

Calleeinterface


Software Testing Instructor: Lin Tan© Ammann & Offutt 9

Inter-procedural DU Pairs

• If we focus on the interface, then we just need to consider the last definitions of variables before calls and returns and first uses inside units and after calls.

• Last-def : The set of nodes that define a variable x and has a def-clear path from the node through a callsite to a use in the other unit

• Can be from caller to callee (parameter or shared variable) or from callee to caller as a return value

• First-use : The set of nodes that have uses of a variable y and for which there is a def-clear and use-clear path from the call site to the nodes.

Introduction to Software Testing (Ch 2) 27



Last-defs & First-uses

28

last-defs are 2, 3 the first-use is 12


© Ammann & OffuttIntroduction to Software Testing (Ch 2)

Example Inter-procedural DU Pairs

f() { x = 14

y = g (x)

print (y)

g(a) { print (a)

b = 42

return (b)

Caller

Callee

DU pair

DU pair11

B (int y)

Z = y T = y

print (y)

10

12

13

Last Defs 2, 3

First Uses 11, 12

callsite

first-use

first-use

last-def

last-defx = 5

x = 4

x = 3

B (x)

1

2

3

4

29

}

}


© Ammann & Offutt 11

1 // Program to compute the quadratic root for two numbers 2 import java.lang.Math; 3 4 class Quadratic5 { 6 private static float Root1, Root2; 78 public static void main (String[] argv) 9 { 10 int X, Y, Z;11 boolean ok;12 int controlFlag = Integer.parseInt (argv[0]); 13 if (controlFlag == 1) 14 { 15 X = Integer.parseInt (argv[1]);16 Y = Integer.parseInt (argv[2]);17 Z = Integer.parseInt (argv[3]); 18 } 19 else20 { 21 X = 10;22 Y = 9;23 Z = 12; 24 }

25 ok = Root (X, Y, Z);26 if (ok) 27 System.out.println 28 (“Quadratic: ” + Root1 + Root2); 29 else30 System.out.println (“No Solution.”);31 } 3233 // Three positive integers, finds quadratic root 34 private static boolean Root (int A, int B, int C)35 { 36 float D; 37 boolean Result;38 D = (float) Math.pow ((double)B, (double2-4.0)*A*C ); 39 if (D < 0.0) 40 { 41 Result = false; 42 return (Result);43 } 44 Root1 = (float) ((-B + Math.sqrt(D))/(2.0*A)); 45 Root2 = (float) ((-B – Math.sqrt(D))/(2.0*A)); 46 Result = true; 47 return (Result);48 } / /End method Root4950 } // End class Quadratic

Example – Quadratic

Introduction to Software Testing (Ch 2)



1 // Program to compute the quadratic root for two numbers 2 import java.lang.Math; 3 4 class Quadratic5 { 6 private static float Root1, Root2; 78 public static void main (String[] argv) 9 { 10 int X, Y, Z;11 boolean ok;12 int controlFlag = Integer.parseInt (argv[0]); 13 if (controlFlag == 1) 14 { 15 X = Integer.parseInt (argv[1]); 16 Y = Integer.parseInt (argv[2]); 17 Z = Integer.parseInt (argv[3]); 18 } 19 else20 { 21 X = 10; 22 Y = 9; 23 Z = 12; 24 }




last-defs

first-uses

last-defs

first-use







last-defs

first-uses

Pairs of locations: method name, variable name, statement (main (), X, 15) – (Root (), A, 38)(main (), Y, 16) – (Root (), B, 38) (main (), Z, 17) – (Root (), C, 38)(main (), X, 21) – (Root (), A, 38)(main (), Y, 22) – (Root (), B, 38)(main (), Z, 23) – (Root (), C, 38)






last-defs

first-usePairs of locations: method name, variable name,

statement (Root (), Root1, 44) – (main (), Root1, 28)(Root (), Root2, 45) – (main (), Root2, 28)(Root (), Result, 41) – (main (), ok, 26 )(Root (), Result, 46) – (main (), ok, 26 )


© Ammann & OffuttIntroduction to Software Testing (Ch 2) 14

Quadratic – Coupling DU-pairs


(Root (), Root1, 44) – (main (), Root1, 28)(Root (), Root2, 45) – (main (), Root2, 28)(Root (), Result, 41) – (main (), ok, 26 )(Root (), Result, 46) – (main (), ok, 26 )


ECE453/SE465/CS447/ECE653/CS647: Syntax-based Testing (Mutation Testing)

Lin Tan

February 13, 2015

12-Syntax - February 13, 2015

Mutation Testing

• We will see two types of mutation testing:

• Input-space grammars: to create inputs (both valid and invalid)

• Program-based grammars: to modify programs (valid)

2


Regular Expression

• Consider this Perl Regular expression:

• ^(deposit|debit) [0-9]{3} \$[0-9]+\.[0-9]{2}$

• What are some valid strings?

3


Grammar

• What is the limitation of regular expressions?

• Grammar

• action = dep | deb

• dep = "deposit" account amount

• deb = "debit" account amount

• account = digit{3}

• amount = "$"digit+"."digit{2}

• digit = ["0"-"9"]

4


Using Grammar

• Two ways to use input grammars for software testing and maintenance:

• Recognizer: can include them in a program to validate inputs;

• Generator: can create program inputs for testing.

5


Grammar Terminology

• Grammar

• actions = action*





• amount = “$”digit+"."digit{2}

• digit = ["0"-"9"]

6

Start symbol

Non-terminals

Production rule

Terminals


Coverage Criteria

• Terminal Symbol Coverage (TSC). TR contains each terminal of grammar G.

• Production Coverage (PDC). TR contains each production of grammar G.

• Derivation Coverage (DC). TR contains every possible string derivable from G.

• PDC subsumes TSC.

• DC often generates infinite test sets, and even if you limit to fixed-length strings, you still have huge numbers of inputs.

7


Grammar Terminology

• Grammar







• digit = ["0"-"9"]

8

Start symbol

Non-terminals

Production rule

Terminals

How many terminals?How many productions? How many derivable strings?


Terminology

• Ground String: A (valid) string belonging to the language of the grammar.

• Mutation Operator: A rule that specifies syntactic variations of strings generated from a grammar.

• Mutant: The result of one application of a mutation operator to a ground string.

• Mutants may be generated either by modifying existing strings or by changing a string while it is being generated.

9


Mutation Testing




10


Testing Invalid Inputs

• Your program accepts strings described by a regex or a grammar.

• Was it implemented correctly?

• Reasons to test invalid inputs:

• Often implementers overlook invalid inputs;

• Undefined behaviour not acceptable (e.g. buffer overruns).

• Solution: Mutate grammars and generate test strings from the mutated grammars.

11


Some Grammar Mutation Operators

• Nonterminal Replacement

• dep = "deposit" account amount =>

• dep = "deposit" amount amount

• Use your judgement to replace nonterminals with similar nonterminals.

• Terminal Replacement

• amount = “$”digit+"."digit{2} =>

• amount = “$”digit+"$"digit{2}

12

deposit $9.22 $12.35

ground string: deposit 123 $12.35

deposit 123 $12$35


More Grammar Mutation Operators

• Terminal and Nonterminal Deletion; e.g.


• dep = "deposit" amount

• Terminal and Nonterminal Duplication; e.g.


• dep = "deposit" account account amount

13

deposit $12.35

deposit 123 123 $12.35


Killing Mutants

• Test case t kills m if running t on m produces different output than running t on m0,

• where m is a mutant for an original ground string program m0.

14


Recall the Terminology

• Ground String: A (valid) string belonging to the language of the grammar.

• Mutation Operator: A rule that specifies syntactic variations of strings generated from a grammar.

• Mutant: The result of one application of a mutation operator to a ground string.

15


Recall Production

• Grammar







• digit = ["0"-"9"]

16

Production rule


Coverage Criteria• Mutation Coverage (MC). For each mutant m, TR contains a

requirement to “kill m".

• Mutation score is the percentage of mutants killed.

• Mutation Operator Coverage (MOC). For each mutation operator op, TR contains requirement to create a mutated string m derived using op.

• Mutation Production Coverage (MPC). For each mutation operator op and each production p that op can be applied to, TR contains requirement to create a mutated string from p.

17


Mutation Testing




18


Mutant Testing: Program-based Grammars

• Generating mutants by modifying programs according to the language grammar, using mutation operators.

• Mutants are valid programs (not tests) which ought to behave differently from the ground string.

• The goal of mutation testing is to create tests which distinguish mutants from originals.

19


Example

20

int foo(int x, int y) { // original if (x > 5)

return x + y; else

return x;}

int foo(int x, int y) { // mutant if (x > 5) return x - y; else return x;}

• What test case can kill the mutant?

• Once we find a test case that kills a mutant, we can forget the mutant and keep the test case. The mutant is then dead.


Uninteresting Mutants

• Three kinds of mutants are uninteresting:

• Stillborn: such mutants cannot compile (or immediately crash);

• Trivial : killed by almost any test case;

• Equivalent: indistinguishable from original program.

• The usual application of program-based mutation is to individual statements in unit testing.

21


Example Mutants

22

Original Method

int Min (int A, int B){ int minVal; minVal = A; if (B < A) { minVal = B; } return (minVal);} // end Min

Mutant

int Min (int A, int B){ int minVal; minVal = A;∆ 1 minVal = B; if (B < A)∆ 2 if (B > A)∆ 3 if (B < minVal) { minVal = B;∆ 4 Crash ();∆ 5 minVal = A;∆ 6 minVal = failOnZero (B); } return (minVal);} // end Min

6 mutants

Each represents a separate program

Replace one variable with another

Changes operator

Immediate runtime failure … if reached

Immediate runtime failure if B==0 else does nothing


Goals of Mutation Testing

• 1. Mimic (and hence test for) typical mistakes

• 2. Encode knowledge about specific kinds of effective tests in practice

• Statement coverage (∆4)

• Checking for 0 values (∆6)

23


Process of Using Mutation Testing

• Goal

• Kill mutants

• Desired side effect

• Good tests which kill the mutants.

• The tests will help find faults.

• We find these tests by intuition and analysis.

24


Strong and Weak Mutation

• Strong mutation: A fault must be reachable, infect the state, and propagate to output.

• Weak mutation: A fault which kills a mutant need only be reachable and infect the state.

• The book claims that experiments show that weak and strong mutation require almost the same number of tests to satisfy them.

25


Strong Mutation Coverage (SMC)

• Strongly Killing Mutants: Given a mutant m for a program P and a test t, t is said to strongly kill m iff the output of t on P is different from the output of t on m.

• Strong Mutation Coverage (SMC). For each mutant m ∈ M, TR contains a test which strongly kills m.

26


Weak Mutation Coverage (WMC)

• Weakly Killing Mutants: Given a mutant m that modifies a source location L in program P and a test t, t is said to weakly kill m iff the state of the execution of P on t is different from the state of the execution of m on t, immediately after some execution of L.

• Weak Mutation Coverage (WMC). For each mutant m ∈ M, TR contains a test which weakly kills m.

27


Condition Analysis

28

Original Methodint Min (int A, int B){ int minVal; minVal = A; if (B < A) { minVal = B; } return (minVal);} // end Min

Mutantint Min (int A, int B){ int minVal;∆ 1 minVal = B; if (B < A) { minVal = B; } return (minVal);} // end Min

• Reachability:

• Infection:

• Propagation:

Unavoidable;

Need B ≠ A;

Need B > A;Wrong minVal needs to return to the caller; that is, we can't execute the body of the if statement.


Necessary and Sufficient Conditions

• Reachability: unavoidable;

• Infection: need B ≠ A;

• Propagation: need B > A.

Wrong minVal needs to return to the caller; that is, we can't execute the body of the if statement.

• Condition for weakly killing mutant: B ≠ A

• Condition for strongly killing mutant: B > A

29


WMC vs SMC

30


With Embedded Mutantsint Min (int A, int B){ int minVal;∆ 1 minVal = B; if (B < A) { minVal = B; } return (minVal);} // end Min

• Condition for strongly killing mutant B > A

• A test case is A = 5, B = 7 (return value =7 , expected = 5).


• A test case is A = 8, B = 2 (return value = 2, expected = 2).


Another Example

31

Original Method

int Min (int A, int B){ int minVal; minVal = A; if (B < A) { minVal = B; } return (minVal);} // end Min

With Embedded Mutants

int Min (int A, int B){ int minVal; minVal = A;∆ 3 if (B < minVal) { minVal = B; } return (minVal);} // end Min

• This mutant is an equivalent mutant, since A = minVal. (The infection condition boils down to “false”.)

• Equivalence testing is, in its full generality, undecidable, but we can always estimate.


Workflow of Mutation Testing

32


Mutation Operators

33

int mutationTest ( int a , b) {int x = 3 * a , y ;if (a > b) {

y = -x ;}else if ( ! ( a > -b ) ) {

x = a * b ;}return x ;

}

What (intra-procedural) mutation operators can you come up with?

• Absolute Value Insertion

• Operator Replacement

• Arithmetic, Relational, ...

• Scalar Variable Replacement

• Crash Statement replacement


Interface/Integration Mutation

• Change calling method by changing actual parameter values;

• Change calling method by changing callee; or

• Change callee by modifying the return statement.

34

class M { int f, g; void c(int x) { foo (x, g); bar (3, x); } int foo(int a, int b) { return a + b * f; } int bar(int a, int b) { return a * b; }}


Summary of Syntax-based Testing

• Program-based testing has notion of strong and weak mutants.

• Sometimes we mutate the grammar, not strings, and get tests from the mutated grammar.

35


PIT Mutation Testing Tool•Conditionals Boundary Mutator

•Negate Conditionals Mutator

•Remove Conditionals Mutator

•Math Mutator

• Increments Mutator

• Invert Negatives Mutator

• Inline Constant Mutator

•Return Values Mutator

•Void Method Calls Mutator

•Non Void Method Calls Mutator

•Constructor Calls Mutator

•Experimental Inline Constant Mutator

•Experimental Member Variable Mutator

•Experimental Switch Mutator36

• http://pitest.org/


ECE453/SE465/CS447/ECE653/CS647: Graph Coverage for

Specifications

Lin Tan

February 24, 2015

13-GC4Specs - February 24, 2015


How are the coverage criteria useful?

• Does industry use NC, EC, PPC, etc.? • Some, NC and EC are used more often

• Why aren’t they used (more widely)? • Ignorance • Lack of tools to measure coverage criteria &

expensive to achieve them • Lack of evidence to support the benefits

• Possible Solutions • You can educate them! • Build tools • Measure the benefits (e.g., find more bugs)

2



Sequencing Constraints

3

/* @requires size() > 0 */ public int dequeue() { ... }

/* @ensures size() = \old(size()) + 1 */ public void enqueue (int x) { ... }

Must call enqueue() before calling dequeue()!



Using Sequencing Constraints

• Let's consider the following three methods: • open() • close() • write()

• and the following rules: • 1. open() must be executed before any call to write(); • 2. open() must be executed before any call to close(); • 3. write() must not execute after close() unless open()

has executed in between; • 4. there must be write() before every close(); • 5. close() must not execute after close() unless open()

executes in between; • 6. open() must not execute after open() unless close()

4



An Example

5

s1

s2s3

s4 s5

s6

open ()

write()

write ()

close ()

Which rule is violated? Which path violates it?

• 1. open() must be executed before any call to write();

• 2. open() must be executed before any call to close();

• 3. write() must not execute after close() unless open() has executed in between;

• 4. there must be write() before every close();

• 5. close() must not execute after close() unless open() executes in between;

• 6. open() must not execute after open() unless close() executes in between.NC, EC, PPC?



Another Example

6

close ()

s1

s2 s3

s4s5

s8

open ()

write ()

write ()

s6 s7close ()

• 1. open() must be executed before any call to write();

• 2. open() must be executed before any call to close();

• 3. write() must not execute after close() unless open() has executed in between;

• 4. there must be write() before every close();

• 5. close() must not execute after close() unless open() executes in between;

• 6. open() must not execute after open() unless close() executes in between.

Which rule is violated? Which path violates it?

NC, EC, PPC?



Two Questions

• How to obtain sequencing constraints?

• How to check these constraints? • Static • Dynamic

7



How to Get Sequencing Constraints?

8

/* Must Call A() before B()! */

A(); …

B();

A(); …

B();

A(); …

B();

A(); …

B();

Developers/Users

Code Comments/Docs


Sneak Peek: Dynamically Detecting

Likely Program InvariantsMichael Ernst, Jake Cockrell,

Bill Griswold (UCSD), David NotkinUniversity of Washington

ICSE’00



Two Questions

• How to obtain sequencing constraints?

• How to check these constraints? • Static • Dynamic

10



Beyond Sequencing Constraints

11

/* @requires size() > 0 */ public int dequeue() { ... }

/* @ensures size() = \old(size()) + 1 */ public void enqueue (int x) { ... }

Must have at least as many enqueue() calls as dequeue() calls!

How to represent it?



Testing State Behavior• A finite state machine (FSM) is a graph that describes how

software variables are modified during execution • Nodes : States, representing sets of values for key variables • Edges : Transitions, possible changes in the state

Off On

switch up

switch down




Finite State Machine – Two Variables

Tropical Depression circulation = yes

windspeed < 39mph

Tropical Storm circulation = yes

windspeed = 39..73 mph

Major Hurricane circulation = yes

windspeed >= 110 mph

Hurricane circulation = yes

windspeed 74..109 mph

Something Else circulation = no windspeed = ...

Other variables may exist but not be part of state




Finite State Machines are Common• FSMs can accurately model many kinds of software

– Embedded and control software (think electronic gadgets) – Abstract data types – Compilers and operating systems – Web applications

• Creating FSMs can help find software problems

• Limitations : FSMs are not always practical for programs that have lots of states (for example, GUIs)




Annotations on FSMs• FSMs can be annotated with different types of actions

– Actions on transitions – Entry actions to nodes – Exit actions on nodes

• Actions can express changes to variables or conditions on variables

• These slides use the basics: – Preconditions (guards) : conditions that must be true for transitions to be

taken – Triggering events : changes to variables that cause transitions to be taken




Example Annotations

Closed Open

open elevator door

pre: elevSpeed = 0

trigger: openButton = pressed




Covering FSMs• Node coverage : execute every state (state coverage) • Edge coverage : execute every transition (transition coverage) • Edge-pair coverage : execute pairs of transitions (transition-pair)

• Data flow: – Nodes often do not include defs or uses of variables – Defs of variables in triggers are used immediately (the next state) – Defs and uses are usually computed for guards, or states are extended – FSMs typically only model a subset of the variables

• Generating FSMs is often harder than covering them …




Deriving FSMs• With some projects, a FSM (such as a statechart) was

created during design – Tester should check to see if the FSM is still current with

respect to the implementation • If not, it is very helpful for the tester to derive the

FSM • Strategies for deriving FSMs from a program:

1. Combining control flow graphs 2. Using the software structure 3. Modelling state variables 4. Using implicit or explicit specifications

• Example in textbook




CFG

• Nodes aren't really states; they just abstract the program counter;

• Inessential nondeterminism due e.g. to method calls;

• Can only build these when you have an implementation;

• Tend to be large and unwieldy.

19



Software Structure. Better than CFGs.

• Subjective (which is not necessarily bad);

• Requires lots of effort;

• Requires detailed design information and knowledge of system.

20



Modelling State & Specifications

• Modelling State. This approach is more mechanical: once you've chosen relevant state variables and abstracted them, you need not think much.

• You can also remove impossible states from such an FSM, for instance by using domain knowledge.

• Specifications. These are similar to building FSMs based on software structure. Generally cleaner and easier to understand. Should resemble UML statecharts.

21



Advantages and Disadvantages of FSMs

• Advantages: • Enable creation of tests before implementation; • Easier to analyze an FSM than the code.

• Disadvantages: • abstract models are not necessarily exhaustive; • subjective (so they could be poorly done); • FSM may not match the implementation.

22


ECE453/SE465/CS447/ECE653/CS647: Midterm Review

Lin TanFebruary 24, 2015

14-MidtermReview - February 24, 2015

Admin.

✤ Lin’s additional office hours will be 9:30-10:30 Friday Feb. 27 in DC 2536.

✤ Midterm from 2012 and its solutions have been posted on LEARN.



Graph Coverage

• You should be able to do the following with graphs: • Define graphs, paths • Create TRs for structural and dataflow criteria • Structural: NC, EC, EPC, PPC, CPC • Dataflow: ADC, AUC, ADUPC • Use the subsumption chart to evaluate test sets • Create graphs from source code • Understand concerns in handling multiple methods • Use specifications (e.g., sequencing constraints) to create

graphs • Different approaches to check/test specifications, and their

pros and cons

3



Strengths and Weaknesses of Graph Coverage:• Must create graph • Node coverage is usually easy, but cycles make

it hard to get good coverage in general. • Incomplete node or edge coverage point to

deficiencies in a test set.

4



Summary• Summarizing Structural Coverage:

• Generic; hence broadly applicable • Uses no domain knowledge

• Summarizing Dataflow Coverage: • Definitions and uses are useful but hard to reason.

• Miscellaneous other notes: • Control-flow graphs are manageable for single methods, but

not generally more than that. • Use call graphs to represent multiple methods, hiding details

of each method. • When we want to test du-paths across multiple elements,

use first-use/last-def heuristic. • Testing based on specifications: sequencing constraints can

be hard to find, and hard to check/test.

5



Midterm

• Open-book exam

• See course website for time and location! • https://ece.uwaterloo.ca/~lintan/courses/testing/

exams.html

6



Coverage

• Lecture 1 - 13 (Graph Coverage for Specifications)

7



Main Topics

• Fault, Error, Failure • Graph Coverage

• Structural Coverage • Data Flow Coverage • Control Flow Graphs • For Specifications

• Coverity Rule Inference • Syntax-based Testing

8



Fault, Error, Failure

9

public static int numZero (int[] x) { //Effects: if x==null throw NullPointerException// else return the number of occurrences of 0 in x

int count = 0;for (int i =1; i <x.length; i++) {

if (x[i]==0) {count++;

}}return count;

}

x = [2,7,0], fault executed, error, no failurex = [0,7,2], fault executed, error, failureState of the program: x, i, count, PC

Wrong State:x = [2,7,0]i =1count =0PC=first iteration of if

Expected State:x = [2,7,0]i =0count =0PC=first iteration of if

Fix: for (int i =0, i<x.length; i++)



Graph

10




• ADC requires: • AUC requires: • ADUPC requires:

11

{[n0,n1]}{[n0,n1], [n0, n2,n3,n5]}

{[n0,n1], [n0,n2,n3,n5], [n0,n1,n3,n5]} List all du-paths even if they are a subpath of another.



Subsumption• Criteria Subsumption: A test criterion C1

subsumes C2 if and only if every set of test cases that satisfies criterion C1 also satisfies C2

• Must be true for every set of test cases

12

Edge Coverage

EC

Node Coverage

NC

subsumes




13

Prime Path Coverage

PPC


CPC

Node Coverage NC

Edge Coverage EC

Edge-Pair Coverage EPC

All-defs Coverage

ADC

All-DU-Paths Coverage ADUPC

All-uses Coverage

AUC



Paths

• Path • Subpath • Test Path • Simple Path • Prime Path • Visit • Tour

• direct

• Du-tour

14



Definitions & Uses

• Definitions. Here are some Java statements which correspond to definitions.

• x = 5: x occurs on the left-hand side of an assignment statement;

• foo(T x) {...} : implicit definition for x at the start of a method;

• bar(x): during a call to bar, x might be defined if x is a C++ reference parameter.

• (subsumed by others): x is an input to the program.

• Uses. The book lists a number of cases of uses, but it boils down to x occurs in an expression that the program evaluates.

15



Last-defs & First-uses

16

last-defs are 2, 3 the first-use is 12


Introduction to Software Testing (Ch 2) © Ammann & Offutt© Ammann & Offutt 11





last-defs

first-uses



Introduction to Software Testing (Ch 2) © Ammann & Offutt© Ammann & Offutt 11




last-defs

first-usePairs of locations: method name, variable name,

statement (Root (), Root1, 44) – (main (), Root1, 28)(Root (), Root2, 45) – (main (), Root2, 28)(Root (), Result, 41) – (main (), ok, 26 )(Root (), Result, 46) – (main (), ok, 26 )


Introduction to Software Testing (Ch 2) © Ammann & Offutt© Ammann & OffuttIntroduction to Software Testing (Ch 2) 14

Quadratic – Coupling DU-pairs


(Root (), Root1, 44) – (main (), Root1, 28)(Root (), Root2, 45) – (main (), Root2, 28)(Root (), Result, 41) – (main (), ok, 26 )(Root (), Result, 46) – (main (), ok, 26 )



• Traditional:

•Use global analysis to track which routines return NULL

•Problem: false positives when pre-conditions hold, difficult to tell statically (“return p->next”?)

• Instead: see how often programmer checks.

•Rank errors based on number of checks to non-checks.

• Algorithm: Assume *all* functions can return NULL

• If pointer checked before use, emit “check” message

• If pointer used before check, emit “error”

•Sort errors based on ratio of checks to errors

• Result: 152 bugs, 16 false.

Statistical: deriving routines that can fail


*p = x;


*p = x;


*p = x;

20

p = bar(…); *p = x;



Mutation Testing

• Mutation operators • Generate mutants • Strongly kill a mutant • Weakly kill a mutant • Mutation Coverage (MC)

21



Condition Analysis

22


Mutantint Min (int A, int B){ int minVal;∆ 1 minVal = B; if (B < A) { minVal = B; } return (minVal);} // end Min

• Reachability:

• Infection:

• Propagation:

Unavoidable;

Need B ≠ A;

Need B > A;Wrong minVal needs to return to the caller; that is, we can't execute the body of the if statement.



Necessary and Sufficient Conditions

• Reachability: unavoidable;

• Infection: need B ≠ A;

• Propagation: need B > A.

Wrong minVal needs to return to the caller; that is, we can't execute the body of the if statement.


• Condition for strongly killing mutant: B > A

23


Last Hints

• Make sure you know how to do the in-class exercises.

• Be familiar with questions in 2012’s midterm.

• If a topic was not mentioned here, it does not imply that it will not appear in the exam.

• Good luck!14-MidtermReview - February 24, 2015

Documents

Lintanlecutures