
1

Verification

2

Outline

• What are the goals of verification?

• What are the main approaches to verification?

– What kind of assurance do we get through testing?

– How can testing be done systematically?

– How can we remove defects (debugging)?

• What are the main approaches to software analysis?

– informal vs. formal

3

Verification / Validation

• Verification: all activities that make sure the implementation meets the design objectives

– These activities include a wide range of efforts, such as testing, mathematical proofs, and informal testing.

– Program verification often consists of trying a few sample cases to see whether the results of running the code match our expectations.

– Experimental data from industry show that removing an error after the software has been completely developed costs much more than eliminating it earlier.

• Validation: checking that the final product’s features conform to the software requirements.

4

Requirements for verification

• In an ideal case, everything must be verified:

– In general, everyone makes mistakes, including designers, even if they are skilled and follow sound principles.

• Every required non-functional quality of both process and product must be verified

– Performance, portability, modifiability

– Even the test cases that are used must be verified.

5

Properties of verification

• May not be binary (OK, not OK)

– severity of defect is important

– some defects may be tolerated in large software systems

• May be subjective or objective

– e.g., usability, portability, … (subjective)

• Even implicit qualities should be verified

– because requirements are often incomplete

– e.g., robustness

6

Approaches to verification

• Two approaches:

1. Experiment with the behavior of the product, i.e., testing

– sample behaviors via testing

– goal is to find "counterexamples"

– dynamic technique

2. Analyze the product to deduce its adequacy

– analytic study of properties

– static technique

7

Testing and lack of "continuity"

• Testing samples behaviors by examining "test cases"

• It is impossible to extrapolate the behavior of software from a finite set of test cases

• No continuity of behavior

– software can exhibit correct behavior in infinitely many cases, but may still be incorrect in some cases

8

Verification in engineering

9

procedure binary-search (key: in element; table: in elementTable; found: out Boolean) is
begin
    bottom := table-first; top := table-last;
    while bottom < top loop
        if (bottom + top) rem 2 ≠ 0 then
            middle := (bottom + top - 1) / 2;
        else
            middle := (bottom + top) / 2;
        end if;
        if key ≤ table (middle) then
            top := middle;
        else
            bottom := middle + 1;
        end if;
    end loop;
    found := key = table (top);
end binary-search

If we omit the else branch "middle := (bottom + top) / 2", the routine still works whenever that else is never hit (i.e., if the size of the table is a power of 2).

Power of 2, size 8:

index: 0 1 2 3 4 5 6 7
table: [10, 14, 22, 33, 45, 66, 68, 90]

(0 + 7) rem 2 = 7 rem 2 = 1, so the first branch is taken.
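The lack of continuity can be tried out with a small Python sketch of the routine above. It is an illustration under assumptions, not the slide's own code: indices are 0-based, `omit_else` simulates dropping the assignment in the else branch of the parity test, `middle` is initialized to `bottom` (the pseudocode leaves it uninitialized), and a step guard returns `None` because the faulty variant may never terminate.

```python
def binary_search(key, table, omit_else=False):
    """Return True if key occurs in the sorted, non-empty list `table`."""
    bottom, top = 0, len(table) - 1
    middle = bottom  # assumption: stands in for the uninitialized variable
    steps = 0
    while bottom < top:
        steps += 1
        if steps > len(table):  # guard: the faulty variant can loop forever
            return None
        if (bottom + top) % 2 != 0:
            middle = (bottom + top - 1) // 2
        elif not omit_else:
            middle = (bottom + top) // 2
        # with omit_else=True and an even sum, middle keeps its stale value
        if key <= table[middle]:
            top = middle
        else:
            bottom = middle + 1
    return key == table[top]


def agrees_with_spec(size, omit_else):
    """Compare the routine with Python's `in` on present and absent keys."""
    table = list(range(0, 10 * size, 10))
    keys = table + [k + 5 for k in table] + [-5]
    return all(binary_search(k, table, omit_else) == (k in table) for k in keys)
```

Calling `agrees_with_spec(n, True)` for table sizes that are powers of 2 reports no failure, while a size such as 6 exposes the defect, so a test set drawn only from power-of-2 tables would say nothing about the other sizes.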

10

Goals of testing

• To show the presence of bugs (Dijkstra, 1987)

• If tests do not detect failures, we cannot conclude that software is defect-free

• Still, we need to do testing– driven by sound and systematic principles

11

Goals of testing (cont.)

• Should help isolate errors

– to facilitate debugging

• Should be repeatable

– repeating the same experiment, we should get the same results

– this may not be true because of the effect of the execution environment on testing

– because of nondeterminism

• Should be accurate

12

Theoretical foundations of testing

13

Definitions (1)

• P (program)

• D (input Domain)

• R (output domain or “Range”)

P: D → R (may be a partial function)

• Correctness is defined by OR ⊆ D × R (OR: output requirement)

P(d) is correct if <d, P(d)> ∈ OR

P is correct if all P(d) are correct

14

Definitions (2)

• FAILURE

– P(d) is not correct

– may be undefined (error state, hang, or crash) or may be the wrong result

• ERROR (DEFECT or BUG)

– anything that may cause a failure

– typing mistake

– programmer forgot to test “x = 0”

• FAULT

– incorrect intermediate state entered by the program

– A FAULT happens only if the program has an ERROR

– A FAILURE occurs if a FAULT happens during execution

15

Definitions (3)

• Test case t

– an element of D

• Test set T

– a finite subset of D

• Test t is successful if P(t) is correct

• Test set T is successful if P(t) is correct for all t in T

16

Definitions (4)

• Ideal test set T

– if P is incorrect, there is an element t of T such that P(t) is incorrect

• If an ideal test set existed for every program, we could prove program correctness by testing
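The definitions above can be sketched in Python as follows. All names are illustrative assumptions: `faulty_abs` plays the role of P, `output_requirement` plays the role of OR as a predicate on the pair <d, P(d)>, and a test set is a list of inputs.

```python
def faulty_abs(d):
    """A deliberately incorrect program P: wrong only for d = -1."""
    return -d if d < -1 else d


def output_requirement(d, result):
    """OR as a predicate: the pair <d, result> belongs to OR."""
    return result == abs(d)


def is_successful(program, requirement, test_set):
    """A test set is successful if P(t) is correct for every t in it."""
    return all(requirement(t, program(t)) for t in test_set)
```

Here `is_successful(faulty_abs, output_requirement, [-5, 0, 7])` holds although P is incorrect, so that test set is successful but not ideal; a set containing -1 would reveal the failure.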

17

Test criterion

• A criterion C defines finite subsets of D (i.e., test sets) that have some common property

C ⊆ 2^D

• A test set T satisfies C if it is an element of C

Example

C = {<x1, x2, ..., xn> | n ≥ 3 ∧ ∃ i, j, k (xi < 0 ∧ xj = 0 ∧ xk > 0)}

What is missing in this set definition?

<-5, 0, 22> is a test set that satisfies C

<-10, 2, 8, 33, 0, -19> also does

<1, 3, 99> does not
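As a rough illustration, the example criterion C can be read as a predicate on a test set. The function name `satisfies_c` is an assumption made for this sketch, and test sets are represented as tuples of integers.

```python
def satisfies_c(test_set):
    """True if the test set has at least three elements and contains
    a negative value, a zero, and a positive value."""
    return (
        len(test_set) >= 3
        and any(x < 0 for x in test_set)
        and any(x == 0 for x in test_set)
        and any(x > 0 for x in test_set)
    )
```

Applied to the three test sets above, the predicate accepts the first two and rejects the third, which has neither a negative element nor a zero.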

18

Empirical testing principles

• Find a strategy to select significant test cases

– Significant = has high potential of uncovering the presence of an error

19

Complete-Coverage Principle

• Try to group elements of D into subdomains D1, D2, …, Dn where any element of each Di is likely to have similar behavior

D = D1 ∪ D2 ∪ … ∪ Dn

• Select one test case as a representative of each subdomain

– If Dj ∩ Dk = ∅ for all j ≠ k (partition), any element (test case) can be chosen from each subdomain

– Otherwise (not a partition), choose representatives to minimize the number of tests while still fulfilling the principle

[Figure: example of a partition of domain D into subdomains]

20

Testing in the small

We test individual modules

• BLACK BOX (functional) testing

– partitioning criteria based on the module’s specification

– tests what the program is supposed to do

• WHITE BOX (structural) testing

– partitioning criteria based on the module’s internal code

– tests what the program does

21

White box testing

derives test cases from program code

22

Structural Coverage Testing

• (In)adequacy criteria

– If significant parts of the program structure are not tested, testing is inadequate

• Control flow coverage criteria

– Statement coverage

– Edge coverage

– Condition coverage

– Path coverage

23

Statement-coverage criterion

• Select a test set T such that every elementary statement in P is executed at least once by some d in T:

– Assignments; I/Os; procedure calls

– If one input datum executes many statements, try to minimize the number of test cases while still preserving the desired coverage

24

Example

read (x); read (y);
if x > 0 then
    write ("1");
else
    write ("2");
end if;
if y > 0 then
    write ("3");
else
    write ("4");
end if;

{<x = 2, y = 3>, <x = -13, y = 51>, <x = 97, y = -17>, <x = -1, y = -1>} covers all statements

{<x = -13, y = 51>, <x = 2, y = -3>} is minimal
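A minimal sketch of how statement coverage could be measured for this example, assuming a hand-instrumented Python transcription in which each write statement records its own label. The names `program`, `ALL_STATEMENTS`, and `covers_all_statements` are introduced here for illustration; the two read statements run on every input, so only the four writes are tracked.

```python
ALL_STATEMENTS = {'write("1")', 'write("2")', 'write("3")', 'write("4")'}


def program(x, y, executed):
    """Transcription of the example; `executed` collects the labels
    of the write statements that run for this input."""
    if x > 0:
        executed.add('write("1")')
    else:
        executed.add('write("2")')
    if y > 0:
        executed.add('write("3")')
    else:
        executed.add('write("4")')


def covers_all_statements(test_set):
    """True if the test set executes every write statement at least once."""
    executed = set()
    for x, y in test_set:
        program(x, y, executed)
    return executed == ALL_STATEMENTS
```

Both test sets from the example satisfy the criterion under this check, while a single input such as <x = 2, y = 3> does not.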

25

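Weakness of the statement coverage criterion

if x < 0 then
    x := -x;
end if;
z := x;

The value of z is never negative, so the output does not show whether the branch has been executed or not. A single test case with x < 0 executes every statement, yet the case in which the branch is skipped is never exercised.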

26

Edge-coverage criterion

• Select a test set T such that every edge (branch) of the control flow is exercised at least once by some d in T

This requires formalizing the concept of the control graph and how to construct it.

Control Graph:

– Edges represent statements

– Nodes at the beginning and end of an edge represent entry into the statement and exit

27

Control graph construction rules

[Figure: control graph fragments (built from subgraphs G1 and G2) for an I/O, assignment, or procedure call; if-then-else; if-then; while loop; two sequential statements]

28

Simplification

A sequence of edges can be collapsed into just one edge. Why?

[Figure: the chain of nodes n1, n2, n3, …, nk-1, nk collapses into a single edge from n1 to nk]

29

Example: Euclid's algorithm

begin
    read (x); read (y);
    while x ≠ y loop
        if x > y then
            x := x - y;
        else
            y := y - x;
        end if;
    end loop;
    GCD := x;
end;

[Figure: control graph of the program, with nodes read(x), read(y), the loop test x != y, the branches x > y (x := x - y) and x <= y (y := y - x), end if, end loop, GCD := x, end]

GCD with Pre-/Post-conditions

PRE: {x > 0 and y > 0}

begin
    read (x); read (y);
    while x ≠ y loop
        if x > y then
            x := x - y;
        else
            y := y - x;
        end if;
    end loop;
    GCD := x;
end;

POST: {(exists z1, z2 (x = GCD * z1 and y = GCD * z2)) and not (exists h (exists z1, z2 (x = h * z1 and y = h * z2) and h > GCD))}
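The pre- and post-conditions can be turned into executable checks. The sketch below is one possible Python transcription, not the slide's code: the function name is an assumption, the inputs are passed as arguments instead of being read, and the original inputs are saved as `x0` and `y0` because the loop overwrites `x` and `y`.

```python
def gcd_by_subtraction(x, y):
    """Euclid's algorithm by repeated subtraction, with PRE and POST checked."""
    # PRE: both inputs are strictly positive
    assert x > 0 and y > 0, "precondition violated"
    x0, y0 = x, y  # keep the inputs: the loop modifies x and y
    while x != y:
        if x > y:
            x = x - y
        else:
            y = y - x
    gcd = x
    # POST, first part: gcd divides both inputs
    assert x0 % gcd == 0 and y0 % gcd == 0
    # POST, second part: no larger h divides both inputs
    assert not any(
        x0 % h == 0 and y0 % h == 0 for h in range(gcd + 1, min(x0, y0) + 1)
    )
    return gcd
```

With these checks in place the post-condition acts as an oracle: any input that satisfies PRE either yields a value satisfying POST or triggers an assertion failure.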

31

Weakness of edge coverage

Example: search for an element in a table

found := false; counter := 1;
while (not found) and counter < number_of_items loop
    if table (counter) = desired_element then
        found := true;
    end if;
    counter := counter + 1;
end loop;
if found then
    write ("the desired element is in the table");
else
    write ("the desired element is not in the table");
end if;

Test cases: (1) empty table, (2) table with 3 items, the second of which is the item to look for. These cover every edge but cannot reveal the error (< should be replaced by ≤), because boundary cases are not tested.
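A small Python transcription makes the missed boundary case concrete. It is a sketch under assumptions: the function name and the `inclusive` flag (which switches between the faulty `<` and the corrected `≤`) are introduced here, and the 1-based `counter` of the pseudocode is kept by indexing with `counter - 1`.

```python
def search(table, desired_element, inclusive=False):
    """Linear search as on the slide; inclusive=True uses <= instead of <."""
    found = False
    counter = 1  # 1-based, as in the pseudocode
    number_of_items = len(table)
    while (not found) and (
        counter <= number_of_items if inclusive else counter < number_of_items
    ):
        if table[counter - 1] == desired_element:
            found = True
        counter = counter + 1
    return found
```

The two edge-covering test cases give the expected answers with the faulty comparison, whereas a test whose desired element sits in the last position, for instance `search([1, 7, 3], 3)`, returns false and exposes the defect.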

32

Weakness of Edge Coverage

if x ≠ 0 then
    y := 5;
else
    z := z - x;
end if;
if z > 1 then
    z := z / x;
else
    z := 0;
end if;

{<x = 0, z = 1>, <x = 1, z = 3>} causes the execution of all edges, but fails to expose the risk of a division by zero
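To see the hidden failure, the fragment can be transcribed into Python and run on inputs outside the edge-covering test set. The function name `fragment` is an assumption of this sketch; it returns the final value of z.

```python
def fragment(x, z):
    """Transcription of the fragment above; returns the final value of z."""
    if x != 0:
        y = 5  # assigned but never used, as in the original fragment
    else:
        z = z - x
    if z > 1:
        z = z / x  # divides by zero when x == 0 and z > 1
    else:
        z = 0
    return z
```

Both inputs of the edge-covering test set run without error, but an input such as <x = 0, z = 2> takes the else edge of the first decision and the then edge of the second, and raises `ZeroDivisionError`. Only a criterion that considers combinations of edges (paths) forces such a test.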

33

Condition-coverage criterion

• Select a test set T such that every edge of P’s control flow is traversed and all possible values of the constituents of compound conditions are exercised at least once

– it is finer than edge coverage
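The difference between the two criteria can be sketched for a hypothetical decision `x > 0 or y > 0`, which is not taken from the slides. The helper names are assumptions, and both constituents are evaluated for every test (no short-circuit) to keep the bookkeeping simple.

```python
def observed_values(tests):
    """Collect the truth values seen for the decision and each constituent."""
    seen = {"x > 0": set(), "y > 0": set(), "decision": set()}
    for x, y in tests:
        a, b = x > 0, y > 0  # both constituents evaluated, no short-circuit
        seen["x > 0"].add(a)
        seen["y > 0"].add(b)
        seen["decision"].add(a or b)
    return seen


def edge_coverage(tests):
    """Both outcomes of the whole decision have been taken."""
    return observed_values(tests)["decision"] == {True, False}


def condition_coverage(tests):
    """Edges are covered and every constituent has been both true and false."""
    return all(v == {True, False} for v in observed_values(tests).values())
```

The test set {<1, -1>, <-1, -1>} traverses both edges of the decision, yet `y > 0` is never true; adding <-1, 1> is enough to satisfy condition coverage as well.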

34

Path-coverage criterion

• Select a test set T which traverses all paths from the initial to the final node of P’s control flow

– it is finer than previous kinds of coverage

– however, number of paths may be too large, or even infinite (see while loops)

• Therefore, additional constraints must be provided

35

The infeasibility problem

• Syntactically indicated behaviors (statements, edges, etc.) are often impossible

– unreachable code, infeasible edges, paths, etc.

• Adequacy criteria may be impossible to satisfy

– manual justification for omitting each impossible test case

– adequacy “scores” based on coverage

– example: 95% statement coverage

36

Further problem

• What if the code omits the implementation of some part of the specification?

• White box test cases derived from the code will ignore that part of the specification!

37

Black box testing

derives test cases from specifications

38

Specification of an example: Sorted file of Invoices

The program receives as input a record describing an invoice. (A detailed description of the format of the record has also been given).

The invoice must be inserted into a file of invoices that is sorted by date.

The invoice must be inserted in the appropriate position: If other invoices exist in the file with the same date, then the invoice should be inserted after the last one.

Also, some consistency checks must be performed: The program should verify whether the customer is already in a corresponding file of customers, whether the customer’s data in the two files match, etc.

39

Consider these cases for testing the invoice system

• An invoice whose date is the current date

• An invoice whose date is before the current date (this might even be forbidden by law). This case, in turn, can be split into the two following subcases:

– An invoice whose date is the same as that of some existing invoice

– An invoice whose date does not exist in any previously recorded invoice

• Several incorrect invoices, checking different types of inconsistencies

What type of data structure would you choose for such a system: array, linked list, hash table, or tree structure?

40

Systematic black-box techniques

• Testing driven by logic specifications: pre-/post-conditions (we cover only this)

• Syntax-driven testing

• Decision table based testing

• Cause-effect graph based testing

41

Logic specification for insertion of invoice record in a file

for all x: Invoice, f: Invoice_Files

{sorted_by_date(f) and not exists j, k (j ≠ k and f(j) = f(k))}

insert(x, f)

{sorted_by_date(f) and
 for all k (old_f(k) = z implies exists j (f(j) = z)) and
 for all k (f(k) = z and z ≠ x) implies exists j (old_f(j) = z) and
 exists j (f(j).date = x.date and f(j) ≠ x) implies j < pos(x, f) and
 result ≡ x.customer belongs_to customer_file and
 warning ≡ (x belongs_to old_f or x.date < current_date or ....)}

42

Apply the condition-coverage criterion to the post-condition. Rewrite it in a more convenient way:

TRUE implies sorted_by_date(f) and
    for all k old_f(k) = z implies exists j (f(j) = z) and
    for all k (f(k) = z and z ≠ x) implies exists j (old_f(j) = z)
    (no less, no more)
and (x.customer belongs_to customer_file) implies result
and not (x.customer belongs_to customer_file and ...) implies not result
and x belongs_to old_f implies warning
and x.date < current_date implies warning
and ....

43

Applying Condition Coverage to generate Test Cases

• Test case to verify that the file which is produced contains all and only previous invoices plus the new one and is sorted.

• At least one test case with an invoice whose field customer exists in the customer_file, and one test case with an invoice whose field customer does not exist in that file.

• At least one test case with an invoice whose field date is the same as that of an already existing invoice, and one test case with an invoice whose field date is not that of an already existing invoice.

44

The oracle problem

How do we define the correctness of the output we obtain?

• Oracles are required at each stage of testing

• Automated test oracles are required for running large numbers of tests

• Oracles are difficult to design; there is no universal recipe

– Example: if x > 0 then S1 else S2 end if

– Test against x = 2 and x = 0

– How do you know whether the outcomes of S1 and S2 are correct? Or which x belongs to which S?
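One way to automate an oracle is to check a property of the output instead of comparing it with a precomputed answer. The sketch below assumes a sorting routine as the program under test, an example chosen here and not taken from the slide; `sort_oracle` and `run_random_tests` are hypothetical helper names.

```python
import random
from collections import Counter


def sort_oracle(inp, out):
    """Accept `out` if it is ordered and holds exactly the elements of `inp`."""
    is_sorted = all(a <= b for a, b in zip(out, out[1:]))
    same_elements = Counter(inp) == Counter(out)
    return is_sorted and same_elements


def run_random_tests(sort_fn, runs=200, seed=0):
    """Feed random lists to sort_fn; return the first failing input, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        if not sort_oracle(data, sort_fn(list(data))):
            return data
    return None
```

Because the oracle needs no expected output, it can judge any number of generated inputs. For example, `run_random_tests(sorted)` finds no failing input, while the oracle rejects an implementation that drops duplicates.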

45

Testing in the large

• Module (Unit) testing

– testing a single module

• Integration testing

– integration of modules and subsystems

• System testing

– testing the entire system

• Acceptance testing

– performed by the customer

46

Module testing

• A scaffold is needed to recreate the environment in which the module should be tested

– stubs: fake modules used by the module under test

– driver: module activating the module under test

47

Module (Unit) Testing

• Driver

– Usually a main program that accepts data, passes it to the module to be tested, and prints the relevant results.

• Stub

– Simulates a subroutine module that is called by the module to be tested

• Test harness

– A collection of drivers and stubs

– Automatically checking test results against anticipated results accelerates the testing process.

48

type sequence (max_size: NATURAL) is
    record
        size : INTEGER range 0 .. max_size := 0;
        contents : array (1 .. max_size) of INTEGER;
    end record;

The stub looks like this:

procedure sort (seq: in out sequence) is   -- unsorted data input
begin
    write ("the sequence to be sorted is the following: ");
    for i in 1 .. seq.size loop
        write (seq.contents (i));   -- write unsorted data for user
    end loop;
    write ("enter the result of sorting the sequence");
    for i in 1 .. seq.size loop
        read (seq.contents (i));   -- user provides sorted data
    end loop;
    -- a safer version of the stub could verify the consistency of
    -- the user-supplied data with respect to the procedure specification
end sort;

49

Testing a functional module

[Figure: the DRIVER calls the PROCEDURE UNDER TEST, which calls the STUB; the diagram also shows access to nonlocal variables, with the labels "sets the values" and "provides the values"]

- The stub is implemented as a look-up table.

- For each value in the domain of a function, the stub returns a value from the table as the result.

- For example, the stub plays the role of a math function: f := sin(x)
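A look-up-table stub and a driver might look like the following Python sketch. Everything named here is hypothetical: `vertical_component` stands for the module under test, `sin_stub` replaces the real sine function with a table keyed by angle in degrees, and `driver` supplies the inputs and compares results with anticipated values.

```python
# Stub: a look-up table standing in for the real sine function.
SIN_TABLE = {0.0: 0.0, 30.0: 0.5, 90.0: 1.0}  # angle in degrees -> result


def sin_stub(angle_degrees):
    """Return the tabulated value; only the tabulated angles are supported."""
    return SIN_TABLE[angle_degrees]


def vertical_component(magnitude, angle_degrees, sin_fn):
    """Hypothetical module under test; its sine dependency is passed in."""
    return magnitude * sin_fn(angle_degrees)


def driver():
    """Activate the module under test and compare with anticipated results."""
    cases = [(10.0, 0.0, 0.0), (10.0, 30.0, 5.0), (4.0, 90.0, 4.0)]
    return [
        (magnitude, angle, vertical_component(magnitude, angle, sin_stub) == expected)
        for magnitude, angle, expected in cases
    ]
```

Passing the dependency as a parameter is a design choice of this sketch; it lets the same module run against the stub during unit testing and against the real function after integration.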

50

Integration testing

• Big-bang approach

– first test individual modules in isolation

– then test the integrated system

• Incremental approach

– modules are progressively integrated and tested

– can proceed both top-down and bottom-up according to the USES relation

51

Integration testing and USES relation

A driver is a program that simulates the use of the module being tested. It sets the values of the shared data as they would be set in the real application by other modules that are yet to be designed.

[Figure: modules connected by "uses" relations]

52

Example

M1 USES M2 and M2 IS_COMPOSED_OF {M2,1, M2,2}

[Figure: M1 uses M2; M2 is composed of M2,1 and M2,2]

CASE 1 (top-down): Test M1, providing a stub for M2 and a driver for M1. Then provide an implementation for M2,1 and a stub for M2,2.

CASE 2 (bottom-up): Implement M2,2 and test it by using a driver. Implement M2,1 and test the combination of M2,1 and M2,2 (i.e., M2) by using a driver. Finally, implement M1 and test it with M2, using a driver for M1.

53

Analysis

54

Analysis vs. testing

• Testing characterizes a single execution

• Verification by experimentation

• Analysis characterizes a class of executions; it is based on a model

• They have complementary advantages and disadvantages

Analyzing a system means inspecting it to understand its properties and capabilities

Example of testing a car

55

Informal analysis techniques: code walkthroughs

• Based on “playing the computer” operations

• Recommended prescriptions– Small number of people (three to five)

– Participants receive written documentation from the designer a few days before the meeting

– Predefined duration of meeting (a few hours)

– Focus on the discovery of errors, not on fixing them

– Participants: designer, moderator, and a secretary

– Foster cooperation; no evaluation of people

• Experience shows that most errors are discovered by the designer during the presentation, while trying to explain the design decisions to other people.

56

Informal analysis techniques: code inspection

• Organizational aspects similar to code walk-through

• A reading technique aiming at error discovery

• Based on checklists; e.g.:

– use of uninitialized variables

– jumps into loops

– non-terminating loops

– array indexes out of bounds

– mismatch between actual and formal parameters, for example writing a procedure that modifies a formal parameter and calling the procedure with a constant value as the actual parameter

57

Java Code Inspection Checklist, by “Praktikum Software Engineering”

58


59


Important

60

Example: Low-level Design of ABM Modules

61

Design of database schema (i.e., data format for mydatabase.txt)

62

Example ABM User-interface

Running the Simplified ABM System

63

ABM Example: Executable Files for Different Machines

• You can run the executable file on a PC Windows system, on an Apple computer, or on a Sun Solaris system (see below) and compare the operations of your system with this reference system.

• Note: you should put both the "executable file" and "mydatabase.txt" in one directory (for Apple and Solaris systems you must change the mode of the file to be "executable") and then run it.

• Go to Lab 8 in the course web page and click on:

– PC Windows

– Apple Macintosh

– Sun Solaris

http://www.cas.mcmaster.ca/~sartipi/course/se3km4/f06/LABS/LAB_9/ABM-Windows

http://www.cas.mcmaster.ca/~sartipi/course/se3km4/f06/LABS/LAB_9/ABM-Apple

http://www.cas.mcmaster.ca/~sartipi/course/se3km4/f06/LABS/LAB_9/ABM-Solaris

64

Black Box Testing of ABM (Lab 9)

65

Random Test Generation: ABM system

• Consider a typical module (class) Account in an ABM system and try to generate random test cases for this module.

• Class Account: open(); setup(); deposit(); withdraw(); balance(); summarize(); creditLimit(); close()

• Constraints: first open the account, do operations, finally close the account.

– Minimum case: open > setup > deposit > withdraw > close

• General case using a regular expression (n = 0 or more repetitions):

– open > setup > deposit > [deposit | withdraw | balance | summarize | creditLimit]^n > withdraw > close

• Random Testing:

– R1: open > setup > deposit > deposit > balance > summarize > withdraw > close

– R2: open > setup > deposit > withdraw > deposit > balance > creditLimit > withdraw > close

• Partition Testing:

– P1: (Change of state) open > setup > deposit > deposit > withdraw > close

– P2: (No change of state) open > setup > deposit > summarize > creditLimit > withdraw > close
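Random operation sequences that respect the constraints above could be generated along these lines. This is a sketch: the helper names and the `max_middle` bound are assumptions, and the sequences are lists of operation names, not calls on a real Account class.

```python
import random

# Operations allowed in the repeated middle part of the regular expression.
MIDDLE_OPS = ["deposit", "withdraw", "balance", "summarize", "creditLimit"]


def random_account_sequence(rng, max_middle=5):
    """Build open > setup > deposit > [middle ops]^n > withdraw > close."""
    n = rng.randint(0, max_middle)
    middle = [rng.choice(MIDDLE_OPS) for _ in range(n)]
    return ["open", "setup", "deposit", *middle, "withdraw", "close"]


def is_valid(sequence):
    """Check a sequence against the regular expression of the general case."""
    return (
        len(sequence) >= 5
        and sequence[:3] == ["open", "setup", "deposit"]
        and sequence[-2:] == ["withdraw", "close"]
        and all(op in MIDDLE_OPS for op in sequence[3:-2])
    )
```

Seeding the generator, for example with `random.Random(1)`, keeps the generated test cases repeatable, which matches the earlier requirement that tests give the same results when rerun.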

66

InterClass Test Case Design: class collaboration (integration) testing

Random test cases:

1. For each client class, use the list of operations to generate a series of random test sequences (as on the previous slide). The operations will send messages to other server classes.

2. For each message that is generated, determine the collaborator class and the corresponding operation in the server object.

3. For each operation in the server object (that has been invoked by the messages sent from the client object), determine the messages that it transmits.

4. For each of the messages, determine the next level of operations that are invoked and incorporate those into the test sequence.

[Figure: Client and Server]

67

Example of ATM (ABM) machine

68

Example of ATM (ABM) machine


69

Example Test Cases for ATM: ATM --> Bank

• Sequences of messages between ATM and Bank

• verifyAcct > verifyPIN > [[verifyPolicy > withdrawReq] | depositReq | acctInfo]^n

• Random test cases according to guideline in slide #69:

– R3: verifyAcct > verifyPIN > depositReq

– R4: verfiyAccBank [validAccValidationInfo] > verifyPINBank > [validPinValidationInfo] > depositReqBank > [depositAccount]

70

Tests Derived from Behavior Model

• State diagrams model the dynamic behavior of a class and can be used to derive a sequence of tests to test a class and its collaborators.

• The tests should traverse all states.

• S1: open > setupAccnt > deposit(initial) > withdraw(final) > close

• S2: open > setupAccnt > deposit(initial) > deposit > balance > credit > withdraw(final) > close

• S3: open > setupAccnt > deposit(initial) > deposit > withdraw > accntInfo > withdraw(final) > close

Account’s Life Cycle

71

Scenario-based Testing

• Tests the overall behavior of the system through complex interactions (task scenarios or use-cases) between user and system.

• Example use-cases for a text editor:

– Fix the final draft:

1. Print the entire document
2. Move around in the document, changing certain pages
3. As each page is changed, it is printed
4. Sometimes a series of pages is printed

– Print a new copy:

1. Open the document
2. Select “print” in the menu
3. Check if you’re printing a page range; if so, click to print the entire document
4. Click on the print button
5. Close the document

– Scenario generation is critical in behavior recovery

72

Presentation Outline for the ABM System 5 Bonus Marks!

• Overview: a short introduction to software engineering and why it is important as an engineering discipline; the different phases of the software life cycle process; and the way we practiced them in this class.

• Discuss the benefits of having a good requirement specification and present the SRS that you obtained from the RFP.

• Discuss how you generated your high-level design (SDS using Component Diagram) from the SRS and what its features are.

• Discuss how you refined the SDS to a low-level design (SDS using modules).

• Present your implementation in Java and discuss how good it is.

• Present a short demo of your running ABM system.

• Conclusion: your experiences in team working; what you gained from this software engineering exercise in the labs and how you think these exercises will be useful for your future career. Suggestions to improve the lab.

73

What you learned in SE 3KO4

• Learned how to apply systematic approaches to a design and development problem (can be any engineering discipline: EE, CE, Control, MechT, or SE)

• Were exposed to and learned leading-edge design and development tools: Eclipse, Rational Rose, NetBeans

• Used the popular OO programming languages Java and C#

• Learned several specification techniques to specify the requirements of a system: Logical, Z, Statechart

• Used different UML design tools: class diagrams, component diagrams, sequence diagrams, use-case diagrams, as well as design patterns.

• Gained lab experience with a complete set of activities in a software design and development life cycle, using standard IEEE templates and an example case study to develop and test a multi-component system.

• Experienced team work and leadership, and had the opportunity to present your work professionally in class.

• Will learn the Web Services Development Environment.

• All these will be valuable assets to mention in your CV.
