Rob Oshana Southern Methodist University Software Testing


Why do we Test ?

• Assess Reliability

• Detect Code Faults

Industry facts

Software testing accounts for 50% of pre-release costs, and 70% of post-release costs [Cigital Corporation]

30-40% of errors detected after deployment are run-time errors [U.C. Berkeley, IBM's TJ Watson Lab]

The amount of software in a typical device doubles every 18 months [Reme Bourguignon, VP of Philips Holland]

Defect densities are stable over the last 20 years: 0.5 - 2.0 sw failures / 1000 lines [Cigital Corporation]

Critical SW Applications

Critical software applications which have failed:

Mariner 1, NASA, 1962: missing '-' in Fortran code; rocket bound for Venus destroyed

Therac-25, Atomic Energy of Canada Ltd, 1985-87: data conversion error; radiation therapy machine for cancer

Long Distance Service, AT&T, 1990: a single line of bad code; service outages up to nine hours long

Patriot Missiles, U.S. military, 1991: endurance errors in tracking system; 28 US soldiers killed in barracks

Tax Calculation Program, Intuit, 1995: incorrect results; SW vendor paid tax penalties for users

Good and successful testing

• What is a good test case?

• A good test case has a high probability of finding an as-yet undiscovered error

• What is a successful test case?

• A successful test is one that uncovers an as-yet undiscovered error

Who tests the software better?

• The developer: understands the system, but will test "gently", and is driven by "delivery"

• The independent tester: must learn about the system, but will attempt to break it, and is driven by quality

Testability – can you develop a program for testability?

• Operability - “The better it works, the more efficiently it can be tested”

• Observability - the results are easy to see, distinct output is generated for each input, incorrect output is easily identified

• Controllability - processing can be controlled, tests can be automated & reproduced

• Decomposability - software modules can be tested independently

• Simplicity - no complex architecture and logic

• Stability - few changes are requested during testing

• Understandability - program is easy to understand

Did You Know...

• Testing/Debugging can worsen reliability?

• We often chase the wrong bugs?

• Testing cannot show the absence of faults, only their existence?

• The cost to develop software is directly proportional to the cost of testing?
– Y2K testing cost $600 billion

Did you also know...

• The most commonly applied software testing techniques (black box and white box) were developed back in the 1960’s

• Most Oracles are human (error prone)!!

• 70% of safety critical code can be exception handling – this is the last code written!

Testing Problems

• Time

• Faults hide from tests

• Test Management costs

• Training Personnel

• What techniques to use

• Books and education

“Errors are more common, more pervasive, and more troublesome in software than with other technologies”

David Parnas

What is testing?

• How does testing software compare with testing students?

What is testing?

• "Software testing is the process of comparing the invisible to the ambiguous so as to avoid the unthinkable." James Bach, Borland Corp.

What is testing?

• "Software testing is the process of predicting the behavior of a product and comparing that prediction to the actual results." R. Vanderwall

Purpose of testing

• Build confidence in the product

• Judge the quality of the product

• Find bugs

Finding bugs can be difficult

[Figure: a mine field. Each bug is a mine; each test (use case) is one path through the mine field.]

Why is testing important?

• Therac25: Cost 6 lives

• Ariane 5 Rocket: Cost $500M

• Denver Airport: Cost $360M

• Mars missions, orbital explorer & polar lander: Cost $300M

Why is testing so hard?

Reasons for customer reported bugs

• User executed untested code

• Order in which statements were executed in actual use was different from that during testing

• User applied a combination of untested input values

• User’s operating environment was never tested

Interfaces to your software

• Human interfaces

• Software interfaces (APIs)

• File system interfaces

• Communication interfaces
– Physical devices (device drivers)
– Controllers

Selecting test scenarios

• Execution path criteria (control)
– Statement coverage
– Branching coverage

• Data flow
– Initialize each data structure
– Use each data structure

• Operational profile

• Statistical sampling…

What is a bug?

• Error: mistake made in translation or interpretation ( many taxonomies exist to describe errors)

• Fault: manifestation of the error in implementation (very nebulous)

• Failure: observable deviation in behavior of the system

Example

• Requirement: “print the speed, defined as distance divided by time”

• Code: s = d/t; print s

Example

• Error: I forgot to account for t = 0

• Fault: omission of code to catch t=0

• Failure: exception is thrown
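A minimal sketch in C of the same example (the function names are hypothetical): the first version contains the fault, the second corrects the omission.

/* Faulty version: the t == 0 case is not accounted for (the fault) */
int speed(int d, int t)
{
    return d / t;              /* failure: a runtime trap when t == 0 */
}

/* Corrected version: the omission is repaired */
int speed_checked(int d, int t, int *ok)
{
    if (t == 0) {              /* catch the t = 0 case */
        *ok = 0;
        return 0;
    }
    *ok = 1;
    return d / t;
}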

Severity taxonomy

• Mild - trivial

• Annoying - minor

• Serious - major

• Catastrophic - Critical

• Infectious - run for the hills

What is your taxonomy ?

IEEE 1044-1993

Life cycle

Requirements → Design → Code → Testing

Errors can be introduced at each of these stages.

Classify → Isolate → Resolve

The testing and repair process (classify, isolate, resolve each error) can be just as error prone as the development process (more so??).

OK, so let's just design our systems with "testability" in mind…

Testability

• How easily a computer program can be tested (Bach)

• We can relate this to “design for testability” techniques applied in hardware systems

JTAG

A standard integrated circuit with design-for-test hardware

[Figure: core IC logic surrounded by boundary-scan cells at the I/O pads, linked into a boundary-scan path; a test access port controller driven by Test Mode Select (TMS), Test Clock (TCK), Test Data In (TDI) and Test Data Out (TDO); data in and data out pass through the TDI/TDO cells.]

Operability

• "The better it works, the more efficiently it can be tested"
– System has few bugs (bugs add analysis and reporting overhead)
– No bugs block execution of tests
– Product evolves in functional stages (simultaneous development and testing)

Observability

• "What you see is what you get"
– Distinct output is generated for each input
– System states and variables are visible and queriable during execution
– Past system states are ….. (transaction logs)
– All factors affecting output are visible

Observability

– Incorrect output is easily identified
– Internal errors are automatically detected through self-testing mechanisms
– Internal errors are automatically reported
– Source code is accessible

Visibility Spectrum

[Figure: a visibility spectrum spanning DSP visibility, GPP visibility, factory visibility and end customer visibility.]

Controllability

• "The better we can control the software, the more the testing can be automated and optimized"
– All possible outputs can be generated through some combination of input
– All code is executable through some combination of input

Controllability

– SW and HW states and variables can be controlled directly by the test engineer

– Input and output formats are consistent and structured

Decomposability

• "By controlling the scope of testing, we can more quickly isolate problems and perform smarter testing"
– The software system is built from independent modules
– Software modules can be tested independently

Simplicity

• "The less there is to test, the more quickly we can test it"
– Functional simplicity (feature set is the minimum necessary to meet requirements)
– Structural simplicity (architecture is modularized)
– Code simplicity (coding standards)

Stability

• "The fewer the changes, the fewer the disruptions to testing"
– Changes to the software are infrequent, controlled, and do not invalidate existing tests
– Software recovers well from failures

Understandability

• "The more information we have, the smarter we will test"
– Design is well understood
– Dependencies between external, internal, and shared components are well understood
– Technical documentation is accessible, well organized, specific and detailed, and accurate

“Bugs lurk in corners and congregate at boundaries”

Boris Beizer

Types of errors

• What is a Testing error?
– Claiming behavior is erroneous when it is in fact correct
– "Fixing" this type of error actually breaks the product

Errors in classification

• What is a Classification error?
– Classifying the error into the wrong category

• Why is this bad?
– It puts you on the wrong path for a solution

Example Bug Report

• “Screen locks up for 10 seconds after ‘submit’ button is pressed”

• Classification 1: Usability error
• Solution may be to catch user events and present an hour-glass icon

• Classification 2: Performance error
• Solution may be a modification to a sort algorithm (or vice versa)

Isolation error

• Incorrectly isolating the erroneous modules

• Example: consider a client server architecture. An improperly formed client request results in an improperly formed server response

• Isolation (incorrectly) determined that the server was at fault, so the server was changed

• Resulted in regression failure for other clients

Resolve errors

• Modifications to remediate the failure are themselves erroneous

• Example: Fixing one fault may introduce another

What is the ideal test case?

• Run one test whose output is "Modify line n of module i."

• Run one test whose output is "Input Vector v produces the wrong output"

• Run one test whose output is "The program has a bug" (Useless, we know this)

More realistic test case

• One input vector and expected output vector
– A collection of these makes up a Test Suite

• Typical (naïve) Test Case
– Type or select a few inputs and observe output
– Inputs not selected systematically
– Outputs not predicted in advance

Test case definition

• A test case consists of:
– an input vector
– a set of environmental conditions
– an expected output

• A test suite is a set of test cases chosen to meet some criteria (e.g. Regression)

• A test set is any set of test cases

Testing Software Intensive Systems

V&V

• Verification
– are we building the product right?

• Validation
– are we building the right product?
– is the customer satisfied?

• How do we do it?
• Inspect and Test

What do we inspect and test?

• All work products!

• Scenarios

• Requirements

• Designs

• Code

• Documentation

Defect Testing

A Testing Test

• Problem
– A program reads three integer values from the keyboard separated by spaces. The three values are interpreted as representing the lengths of the sides of a triangle. The program prints a message that states whether the triangle is scalene, isosceles or equilateral.

• Write a set of test cases to adequately test this program (a sketch follows below)
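A partial sketch of such a test suite, written as a C test table; the classification strings and the specific values are illustrative only, and a thorough answer (e.g. Myers' classic treatment of this problem) lists additional cases.

/* Hypothetical test table for the triangle program */
struct tri_test { int a, b, c; const char *expected; };

static const struct tri_test cases[] = {
    { 3,  4,  5, "scalene"     },   /* ordinary valid triangle                */
    { 5,  5,  5, "equilateral" },   /* all sides equal                        */
    { 3,  3,  4, "isosceles"   },   /* equal pair, all three permutations     */
    { 3,  4,  3, "isosceles"   },
    { 4,  3,  3, "isosceles"   },
    { 0,  4,  5, "invalid"     },   /* zero side                              */
    {-3,  4,  5, "invalid"     },   /* negative side                          */
    { 1,  2,  3, "invalid"     },   /* one side equals the sum of the others  */
    { 1,  2,  5, "invalid"     },   /* one side exceeds the sum of the others */
};
/* Also worth testing: fewer than three values, non-integer and non-numeric
   input - and every case needs a predicted expected output. */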

Static and Dynamic V&V

[Figure: static verification applies to the requirements specification, high-level design, detailed design and program; dynamic verification (testing) applies to the program and prototype.]

Techniques

• Static Techniques
– Inspection
– Analysis
– Formal verification

• Dynamic Techniques
– Testing

SE-CMM PA 07: Verify & Validate System

• Verification: perform comprehensive evaluations to ensure that all work products meet requirements
– Address all work products: from user needs and expectations through production and maintenance

• Validation - meeting customer needs - continues throughout product lifecycle

V&V Base Practices

• Establish plans for V&V
– objectives, resources, facilities, special equipment
– come up with a master test plan

• Define the work products to be tested (requirements, design, code) and the methods (reviews, inspections, tests) that will be used to verify

• Define verification methods
– test case input, expected results, criteria
– connect requirements to tests

V&V Base Practices...

• Define how to validate the system
– includes customer as user/operator
– test conditions
– test environment
– simulation conditions

• Perform V&V and capture results
– inspection results; test results; exception reports

• Assess success
– compare results against expected results
– success or failure?

Testing is...

• The process of executing a program with the intent of finding defects

• This definition implies that testing is a destructive process - often going against the grain of what software developers do, i.e., construct and build software

• A successful test run is NOT one in which no errors are found

Test Cases

• A successful test case finds an error

• An unsuccessful test case is one that causes the program to produce the correct result

• Analogy: feeling ill, going to the doctor, paying $300 for a lab test only to be told that you’re OK!

Testing demonstrates the presence, not the absence, of faults

Iterative Testing Process

[Figure: unit testing and module testing (component testing) feed sub-system testing and system testing (integration testing), which feed acceptance testing (user testing).]

It is impossible to completely test a program

Testing and Time

• Exhaustive testing is impossible for any program of even low to moderate complexity

• Testing must focus on a subset of possible test cases

• Test cases should be systematically derived, not random

Testing Strategies

• Top Down testing
– use with top-down programming; stubs required; difficult to generate output

• Bottom Up testing
– requires driver programs; often combined with top-down testing

• Stress testing
– test system overload; often want the system to fail-soft rather than shut down
– often finds unexpected combinations of events

Test-Support Tools

• Scaffolding
– code created to help test the software

• Stubs
– a dummied-up low-level routine so it can be called by a higher-level routine

Stubs Can Vary in Complexity

• Return, no action taken
• Test the data fed to it
• Print/echo input data
• Get return values from interactive input
• Return a standard answer
• Burn up clock cycles
• Function as a slow, fat version of the ultimate routine
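A minimal sketch of a stub in C (the routine name and return value are hypothetical): the real low-level routine is replaced by a dummy that echoes the data fed to it and returns a standard answer, so higher-level routines can be tested before the real one exists.

#include <stdio.h>

/* Stub standing in for the real low-level driver */
int read_pressure_sensor(int channel)
{
    printf("stub: read_pressure_sensor(channel=%d)\n", channel);  /* echo input  */
    return 1013;                          /* standard answer (millibars)          */
}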

Driver Programs

• Fake (testing) routine that calls other real routines

• Drivers can:
– call with a fixed set of inputs
– prompt for input and use it
– take arguments from the command line
– read arguments from a file

• main() can be a driver - then "remove" it with preprocessor statements. Code is unaffected
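A sketch of main() used as a driver, assuming a hypothetical routine_under_test(); compiled with -DUNIT_TEST the driver takes its arguments from the command line, and without it the driver disappears while the routine itself is unaffected.

#include <stdio.h>
#include <stdlib.h>

/* hypothetical routine being exercised */
int routine_under_test(int x) { return 2 * x + 1; }

#ifdef UNIT_TEST
/* Driver: calls the real routine with arguments taken from the command line */
int main(int argc, char *argv[])
{
    int i;
    for (i = 1; i < argc; i++) {
        int input = atoi(argv[i]);
        printf("input %d -> output %d\n", input, routine_under_test(input));
    }
    return 0;
}
#endif /* UNIT_TEST */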

System Tests Should be Incremental

[Figure: modules A and B are integrated and tested first (tests 1 and 2); module C is then added and tested (test 3); module D is added and tested last (test 4).]

Not Big-Bang

Approaches to Testing

• White Box testing
– based on the implementation - the structure of the code; also called structural testing

• Black Box testing
– based on a view of the program as a function of Input and Output; also called functional testing

• Interface Testing
– derived from the program specification and knowledge of interfaces

White Box (Structural) Testing

• Testing based on the structure of the code

[Figure: start with the actual program code (e.g., if x then j = 2 else k = 5 ...); derive tests and test data from the code and examine the test output.]

White Box Technique: Basis Path Testing

• Objective:
– test every independent execution path through the program

• If every independent path has been executed, then every statement will be executed

• All conditional statements are tested for both true and false conditions

• The starting point for path testing is the flow graph

Flow Graphs

[Figure: flow graph notation for the if-then-else, while-loop, and case-of constructs.]

How many paths through this program?

1) j = 2;
2) k = 5;
3) read (a);
4) if a = 2
5)    then j = a
6)    else j = a*k;
7) a = a + 1;
8) j = j + 1;
9) print (j);

[Flow graph: node {1, 2, 3} → decision node {4} → node {5} or node {6} → node {7, 8, 9}]

How Many Independent Paths?

• An independent path introduces at least one new statement or condition to the collection of already existing independent paths

• Cyclomatic Complexity (McCabe)

• For programs without GOTOs:
Cyclomatic Complexity = Number of decision nodes (also called predicate nodes) + 1

The Number of Paths

• Cyclomatic Complexity gives an upper bound on the number of tests that must be executed in order to cover all statements

• To test each path requires
– test data to trigger the path
– expected results to compare against

For the program and flow graph above (one decision node, so cyclomatic complexity = 2):

Test 1 - input: 2, expected output: 3 (path 1,2,3 → 4 → 5 → 7,8,9)

Test 2 - input: 10, expected output: 51 (path 1,2,3 → 4 → 6 → 7,8,9)
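A direct C translation of the example, as a sketch, showing how the two tests exercise both outcomes of the single decision node:

#include <stdio.h>

int main(void)
{
    int j = 2;                  /* 1 */
    int k = 5;                  /* 2 */
    int a;
    scanf("%d", &a);            /* 3: read (a)                           */
    if (a == 2)                 /* 4: the only decision node             */
        j = a;                  /* 5: taken by Test 1 (input 2)          */
    else
        j = a * k;              /* 6: taken by Test 2 (input 10)         */
    a = a + 1;                  /* 7 */
    j = j + 1;                  /* 8 */
    printf("%d\n", j);          /* 9: prints 3 for Test 1, 51 for Test 2 */
    return 0;
}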

What Does Statement Coverage Tell You?

• All statements have been executed at least once

so?

What Does Statement Coverage Tell You?

• All statements have been executed at least once

Coverage testing may lead to the false illusion that the software has been comprehensively tested

The Downside of Statement Coverage

• Path testing results in the execution of every statement

• BUT, not all possible combinations of paths thru the program

• There are an infinite number of possible path combinations in programs with loops

The Downside of Statement Coverage

• The number of paths is usually proportional to program size making it useful only at the unit test level

Black Box Testing

Forget the code details! Treat the program as a black box.

[Figure: inputs go in, outputs come out.]

Black Box Testing

• Aim is to test all functional requirements

• Complementary, not a replacement for White Box Testing

• Black box testing typically occurs later in the test cycle than white box testing

Defect Testing Strategy

[Figure: input test data is fed to the system; the output is examined for results indicating defects, and the inputs causing erroneous output are located.]

Black Box Techniques

• Equivalence partitioning

• Boundary value testing

Equivalence Partitioning

• Data falls into categories

• Positive and Negative Numbers

• Strings with & without blanks

• Programs often behave in a comparable way for all values in a category -- also called an equivalence class

[Figure: the system's input domain is partitioned into valid and invalid classes; choose test cases from each partition.]

Specification determines Equivalence Classes

• Program accepts 4 to 8 inputs

• Each is 5 digits, greater than 10000

Number of inputs: less than 4 | 4 thru 8 | more than 8

Value of each input: less than 10000 | 10000 thru 99999 | more than 99999

Boundary Value Analysis

• Complements equivalence partitioning

• Select test cases at the boundaries of a class

• Range boundary a..b
– test just below a and just above b

• Input specifies 4 values
– test 3 and 5

• Output that is limited should be tested above and below its limits
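Combining the two techniques for the earlier "4 to 8 inputs, each 10000 thru 99999" specification, a sketch of values worth trying (the specific numbers are illustrative):

/* Equivalence classes and boundary values for the example specification */
int  counts_to_test[] = { 3, 4, 6, 8, 9 };          /* below, on, inside, on, above the 4..8 range */
long values_to_test[] = { 9999L, 10000L, 50000L,    /* below, on, inside ...                       */
                          99999L, 100000L };        /* ... on, above the 10000..99999 range        */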

Other Testing Strategies

• Array testing

• Data flow testing

• GUI testing

• Real-time testing

• Documentation testing

Arrays

• Test software with arrays of one value

• Use different arrays of different sizes in different tests

• Derive tests so that the first, last and middle elements are tested

Data Flow testing

• Based on the idea that data usage is at least as error-prone as control flow

• Boris Beizer claims that at least half of all modern programs consist of data declarations and initializations

Data Can Exist in One of Three States

• Defined
– initialized, not used
  a = 2;

• Used
  x = a * b + c;
  z = sin(a);

• Killed
  free(a);
– end of the for loop or block where it was defined

Entering & Exiting

• Terms describing the context of a routine before doing something to a variable

• Entered
– control flow enters the routine before the variable is acted upon

• Exited
– control flow leaves the routine immediately after the variable is acted upon

Data Usage Patterns

• Normal
– define the variable; use it one or more times; perhaps kill it

• Abnormal Patterns
– Defined-Defined
– Defined-Exited
  • if local, why?
– Defined-Killed
  • wasteful if not strange

More Abnormal Patterns

• Entered-Killed

• Entered-Used
– should be defined before use

• Killed-Killed
– double kills are fatal for pointers

• Killed-Used
– what value are you really using?

• Used-Defined
– what's its value?

Define-Use Testing

if (condition-1)
    x = a;
else
    x = b;

if (condition-2)
    y = x + 1;
else
    y = x - 1;

Path Testing

Test 1: condition-1 TRUE, condition-2 TRUE

Test 2: condition-1 FALSE, condition-2 FALSE

These WILL EXERCISE EVERY LINE OF CODE ... BUT will NOT test the def-use combinations x=a / y = x-1 and x=b / y = x+1
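Covering the def-use pairs needs two more tests: each definition of x must reach each of its uses.

Test 1: condition-1 TRUE, condition-2 TRUE (x=a reaches y = x + 1)

Test 2: condition-1 FALSE, condition-2 FALSE (x=b reaches y = x - 1)

Test 3: condition-1 TRUE, condition-2 FALSE (x=a reaches y = x - 1)

Test 4: condition-1 FALSE, condition-2 TRUE (x=b reaches y = x + 1)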

GUIs

• Are complex to test because of their event-driven character

• Windows
– moved, resized and scrolled
– regenerate when overwritten and then recalled
– menu bars change when a window is active
– multiple window functionality available?

GUI.. Menus

• Menu bars in context?

• Submenus - listed and working?

• Are names self explanatory?

• Is help context sensitive?

• Cursor changes with operations?

Testing Documentation

• Great software with lousy documentation can kill a product

• Documentation testing should be part of every test plan

• Two phases
– review for clarity
– review for correctness

Documentation Issues

• Focus on functional usage?
• Are descriptions of interaction sequences accurate?
• Examples should be used
• Is it easy to find how to do something?
• Is there a troubleshooting section?
• Easy to look up error codes?
• TOC and index?

Real-Time Testing

• Needs white box and black box PLUS
– consideration of states, events, interrupts and processes

• Events will often have different effects depending on state

• Looking at event sequences can uncover problems

Real-Time Issues

• Task testing
– test each task independently

• Event testing
– test each separately; then in the context of state diagrams
– scenario sequences and random sequences

• Intertask testing
– Ada rendezvous
– message queuing; buffer overflow

Other Testing Terms

• Statistical testing
– running the program against expected usage scenarios

• Regression testing
– retesting the program after modification

• Defect testing
– trying to find defects (aka bugs)

• Debugging
– the process of discovering and removing defects

Summary V&V

• Verification– Are we building the system right?

• Validation– Are we building the right system?

• Testing is part of V&V
• V&V is more than testing...
• V&V is plans, testing, reviews, methods, standards, and measurement

Testing Principles

• The necessary part of a test case is a definition of the expected output or result
– the eye often sees what it wants to see

• Programmers should avoid testing their own code

• Organizations should not test their own programs

• Thoroughly inspect the results of each test

Testing Principles

• Test invalid as well as valid conditions

• The probability of errors in a section of code is proportional to the number of errors already found there

Testing Principles

• Tests should be traceable to customer requirements
• Tests should be planned before testing begins
• The Pareto principle applies - 80% of all errors are in 20% of the code
• Begin small, scale up
• Exhaustive testing is not possible
• The best testing is done by a 3rd party

Guidelines

• Testing capabilities is more important than testing components
– users have a job to do; tests should focus on things that interfere with getting the job done, not minor irritations

• Testing old capabilities is more important than testing new features

• Testing typical situations is more important than testing boundary conditions

System Testing

Ian Sommerville

System Testing

• Testing the system as a whole to validate that it meets its specification and the objectives of its users

Development testing

• Hardware and software components should be tested:
– as they are developed
– as sub-systems are created

• These testing activities include:
– Unit testing
– Module testing
– Sub-system testing

Development testing

• These tests do not cover:
– Interactions between components or sub-systems where the interaction causes the system to behave in an unexpected way
– The emergent properties of the system

System testing

• Testing the system as a whole instead of individual system components

• Integration testing
– As the system is integrated, it is tested by the system developer for specification compliance

• Stress testing
– The behavior of the system is tested under conditions of load

System testing

• Acceptance testing
– The system is tested by the customer to check if it conforms to the terms of the development contract

• System testing reveals errors which were undiscovered during testing at the component level

System Test Flow

[Figure: the requirements specification, system specification, system design and detailed design drive the acceptance test plan, system integration test plan and sub-system integration test plan; after unit code and test, the flow runs back up through sub-system integration test, system integration test, acceptance test and service.]

Integration testing

• Concerned with testing the system as it is integrated from its components

• Integration testing is normally the most expensive activity in the systems integration process

Integration testing

• Should focus on
– Interface testing, where the interactions between sub-systems and components are tested
– Property testing, where system properties such as reliability, performance and usability are tested

Integration Test Planning

• Integration testing is complex and time-consuming, and planning of the process is essential

• The larger the system, the earlier this planning must start and the more extensive it must be

• Integration test planning may be the responsibility of a separate IV&V (independent verification and validation) team
– or a group which is separate from the development team

Test planning activities

• Identify possible system tests using the requirements document

• Prepare test cases and test scenarios to run these system tests

• Plan the development, if required, of tools such as simulators to support system testing

• Prepare, if necessary, operational profiles for the system

• Schedule the testing activities and estimate testing costs

Interface Testing

• Within a system there may be literally hundreds of different interfaces of different types. Testing these is a major problem.

• Interface tests should not be concerned with the internal operation of the sub-system although they can highlight problems which were not discovered when the sub-system was tested as an independent entity.

Two levels of interface testing

• Interface testing during development when the developers test what they understand to be the sub-system interface

• Interface testing during integration where the interface, as understood by the users of the subsystem, is tested.

Two levels of interface testing

• What developers understand as the system interface and what users understand by this are not always the same thing.

Interface Testing

[Figure: test cases exercise the interfaces between sub-systems A, B and C.]

Interface Problems

• Interface problems often arise because of poor communications within the development team or because of poor change management procedures

• Typically, an interface definition is agreed but, for good reasons, this has to be changed during development

Interface Problems

• To allow other parts of the system to cope with this change, they must be informed of it

• It is very common for changes to be made and for potential users of the interface to be unaware of these changes
– problems arise which emerge during interface testing

What is an interface?

• An agreed mechanism for communication between different parts of the system

• System interface classes
– Hardware interfaces
  • Involving communicating hardware units
– Hardware/software interfaces
  • Involving the interaction between hardware and software

What is an interface?

– Software interfaces
  • Involving communicating software components or sub-systems
– Human/computer interfaces
  • Involving the interaction of people and the system
– Human interfaces
  • Involving the interactions between people in the process

Hardware interfaces

• Physical-level interfaces
– Concerned with the physical connection of different parts of the system, e.g. plug/socket compatibility, physical space utilization, wiring correctness, etc.

• Electrical-level interfaces
– Concerned with the electrical/electronic compatibility of hardware units, i.e. can a signal produced by one unit be processed by another unit

Hardware interfaces

• Protocol-level interfaces
– Concerned with the format of the signals communicated between hardware units

Software interfaces

• Parameter interfaces
– Software units communicate by setting pre-defined parameters

• Shared memory interfaces
– Software units communicate through a shared area of memory
– Software/hardware interfaces are usually of this type

Software interfaces

• Procedural interfaces
– Software units communicate by calling pre-defined procedures

• Message passing interfaces
– Software units communicate by passing messages to each other

Parameter Interfaces
[Figure: Subsystem 1 and Subsystem 2 communicate through a parameter list.]

Shared Memory Interfaces
[Figure: SS1, SS2 and SS3 communicate through a shared memory area.]

Procedural Interfaces
[Figure: Subsystem 1 calls defined procedures (an API) in Subsystem 2.]

Message Passing Interfaces
[Figure: Subsystem 1 and Subsystem 2 communicate through exchanged messages.]

Interface errors

• Interface misuse
– A calling component calls another component and makes an error in its use of its interface, e.g. parameters in the wrong order

• Interface misunderstanding
– A calling component embeds assumptions about the behavior of the called component which are incorrect

Interface errors

• Timing errors
– The calling and called component operate at different speeds and out-of-date information is accessed

Stress testing

• Exercises the system beyond its maximum design load
– The argument for stress testing is that system failures are most likely to show themselves at the extremes of the system's behavior

• Tests failure behavior
– When a system is overloaded, it should degrade gracefully rather than fail catastrophically

Stress testing

• Particularly relevant to distributed systems
– As the load on the system increases, so too does the network traffic. At some stage, the network is likely to become swamped and no useful work can be done

Acceptance testing

• The process of demonstrating to the customer that the system is acceptable

• Based on real data drawn from customer sources. The system must process this data as required by the customer if it is to be acceptable

Acceptance testing

• Generally carried out by customer and system developer together

• May be carried out before or after a system has been installed

Performance testing

• Concerned with checking that the system meets its performance requirements
– Number of transactions processed per second
– Response time to user interaction
– Time to complete specified operations

Performance testing

• Generally requires some logging software to be associated with the system to measure its performance

• May be carried out in conjunction with stress testing using simulators developed for stress testing

Reliability testing

• The system is presented with a large number of 'typical' inputs and its response to these inputs is observed

• The reliability of the system is based on the number of incorrect outputs which are generated in response to correct inputs

• The profile of the inputs (the operational profile) must match the real input probabilities if the reliability estimate is to be valid

Security testing

• Security testing is concerned with checking that the system and its data are protected from accidental or malicious damage

• Unlike other types of testing, this cannot really be tested by planning system tests. The system must be secure against unanticipated as well as anticipated attacks

Security testing

• Security testing may be carried out by inviting people to try to penetrate the system through security loopholes

Some Costly and Famous Software Failures

Mariner 1 Venus probe loses its way: 1962

Mariner 1

• A probe launched from Cape Canaveral was set to go to Venus

• After takeoff, the unmanned rocket carrying the probe went off course, and NASA had to blow up the rocket to avoid endangering lives on earth

• NASA later attributed the error to a faulty line of Fortran code

Mariner 1

• “... a hyphen had been dropped from the guidance program loaded aboard the computer, allowing the flawed signals to command the rocket to veer left and nose down…

• The vehicle cost more than $80 million, prompting Arthur C. Clarke to refer to the mission as "the most expensive hyphen in history."

Therac 25 Radiation Machine

Radiation machine kills four: 1985 to 1987

• Faulty software in a Therac-25 radiation-treatment machine made by Atomic Energy of Canada Limited (AECL) resulted in several cancer patients receiving lethal overdoses of radiation

• Four patients died

Radiation machine kills four: 1985 to 1987

• "A lesson to be learned from the Therac-25 story is that focusing on particular software bugs is not the way to make a safe system"

• "The basic mistakes here involved poor software engineering practices and building a machine that relies on the software for safe operation."

AT&T long distance service fails

AT&T long distance service fails: 1990

• Switching errors in AT&T's call-handling computers caused the company's long-distance network to go down for nine hours, the worst of several telephone outages in the history of the system

• The meltdown affected thousands of services and was eventually traced to a single faulty line of code

Patriot missile

Patriot missile misses: 1991

• The U.S. Patriot missile's battery was designed to head off Iraqi Scuds during the Gulf War

• System also failed to track several incoming Scud missiles, including one that killed 28 U.S. soldiers in a barracks in Dhahran, Saudi Arabia

Patriot missile misses: 1991

• The problem stemmed from a software error that put the tracking system off by 0.34 of a second

• The system was originally supposed to be operated for only 14 hours at a time
– In the Dhahran attack, the missile battery had been on for 100 hours
– errors in the system's clock accumulated to the point that the tracking system no longer functioned

Pentium chip

Pentium chip fails math test: 1994

• The Pentium chip gave incorrect answers to certain complex equations
– the bug occurred rarely and affected only a tiny percentage of Intel's customers

• Intel offered to replace the affected chips, which cost the company $450 million

• Intel then started publishing a list of known "errata," or bugs, for all of its chips

New Denver airport

New Denver airport misses its opening: 1995

• The Denver International Airport was intended to be a state-of-the-art airport, with a complex, computerized baggage-handling system and 5,300 miles of fiber-optic cabling

• Bugs in the baggage system caused suitcases to be chewed up and drove automated baggage carts into walls

New Denver airport misses its opening: 1995

• The airport eventually opened 16 months late, $3.2 billion over budget, and with a mainly manual baggage system

The millennium bug: 2000

• No need to discuss this !!

Ariane 5 Rocket

Ariane 5

• The failure of the Ariane 501 was caused by the complete loss of guidance and attitude information 37 seconds after start of the main engine ignition sequence (30 seconds after lift-off)

• This loss of information was due to specification and design errors in the software of the inertial reference system

Ariane 5

• The extensive reviews and tests carried out during the Ariane 5 Development Programme did not include adequate analysis and testing of the inertial reference system or of the complete flight control system, which could have detected the potential failure

More on Testing

From Beatty – ESC 2002

Agenda

• Introduction

• Types of software errors

• Finding errors – methods and tools

• Embedded systems and RT issues

• Risk management and process

Introduction

• Testing is expensive

• Testing progress can be hard to predict

• Embedded systems have different needs

• Desire for best practices

Method

• Know what you are looking for

• Learn how to effectively locate problems

• Plan to succeed – manage risk

• Customize and optimize the process

Entomology

• What are we looking for ?

• How are bugs introduced?

• What are their consequences?

Entomology – Bug Frequency

• Rare

• Less common

• More common

• Common

Entomology – Bug severity

• Non-functional: doesn't affect object code
• Low: correct the problem when convenient
• High: correct as soon as possible
• Critical: change MUST be made
– Safety related or legal issue

Domain Specific!

Entomology - Sources

• Non-implementation error sources
– Specifications
– Design
– Hardware
– Compiler errors

• Frequency: common - 45 to 65%

• Severity: non-functional to critical

Entomology - Sources

• Poor specifications and designs are often:
– Missing
– Ambiguous
– Wrong
– Needlessly complex
– Contradictory

Testing can fix these problems!

Entomology - Sources

• Implementation error sources:
– Algorithmic/processing bugs
– Data bugs
– Real-time bugs
– System bugs
– Other bugs

Entomology – Algorithm Bugs

• Parameter passing
– Common only in complex invocations
– Severity varies

• Return codes
– Common only in complex functions or libraries

• Reentrance problems
– Less common
– Critical

Entomology – Algorithm Bugs

• Incorrect control flow
– Common
– Severity varies

• Logic/math/processing error
– Common
– High

• Off by "1"
– Common
– Varies, but typically high

Example of logic error

if (( this AND that ) OR ( that AND other )
    AND NOT ( this AND other )
    AND NOT ( other OR NOT another ))

Boolean operations and mathematical calculations can be easily misunderstood in complicated algorithms!

Example of off by 1

for ( x = 0; x <= 10; x++ )

This will execute 11 times, not 10!

for ( x = array_min; x <= array_max; x++ )

If the intention is to set x to array_max on the last pass through the loop, then this is in error!

Be careful when switching between a 1-based language (Pascal, Fortran) and a zero-based one (C)

Entomology – Algorithm bugs

• Math underflow/overflow
– Common with integer or fixed-point math
– High severity
– Be careful when switching between floating-point and fixed-point processors

Entomology – Data bugs

• Improper variable initialization
– Less common
– Varies; typically low

• Variable scope error
– Less common
– Low to high

Example - Uninitialized data

int some_function ( int some_param )
{
    int j;
    if (some_param >= 0)
    {
        for ( j = 0; j <= 3; j++ )
        {
            /* iterate through some process */
        }
    }
    else
    {
        if (some_param <= -10)
        {
            some_param += j;    /* j is uninitialized */
        }
        return some_param;
    }
    return 0;
}

Entomology – Data bugs

• Data synchronization error
– Less common
– Varies; typically high

Example – synchronized data

struct state {                    /* an interrupt will trigger       */
    GEAR_TYPE gear;               /* sending a snapshot in a message */
    U16 speed;
    U16 speed_limit;
    U8  last_error_code;
} snapshot;

snapshot.speed = new_speed;                     /* ...somewhere in code */

snapshot.gear = new_gear;                       /* somewhere else       */
snapshot.speed_limit = speed_limit_tb[ gear ];

An interrupt splitting these two would be bad

Entomology – Data bugs

• Improper data usage
– Common
– Varies

• Incorrect flag usage
– Common when hard-coded constants are used
– Varies

Example – mixed math error

unsigned int a = 5;
int b = -10;

/* somewhere in code */
if ( a + b > 0 )
{

a + b is not evaluated as -5!
The signed int b is converted to an unsigned int.
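One way to avoid the trap, as a sketch: keep the arithmetic signed (or avoid mixing signedness at all).

unsigned int a = 5;
int b = -10;

if ( (int)a + b > 0 )    /* the cast keeps the arithmetic signed, so -5 > 0 is false */
{                        /* (only safe while a fits in an int)                       */
    /* not taken */
}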

Entomology – Data bugs

• Data/range overflow/underflow
– Common in asm and on 16-bit micros
– Low to critical

• Signed/unsigned data error
– Common in asm and fixed-point math
– High to critical

• Incorrect conversion/type cast/scaling
– Common in complex programs
– Low to critical

Entomology – Data bugs

• Pointer error
– Common
– High to critical

• Indexing problem
– Common
– High to critical

Entomology – Real-time bugs

• Task synchronization
– Waiting, sequencing, scheduling, race conditions, priority inversion
– Less common
– Varies

• Interrupt handling
– Unexpected interrupts
– Improper return from interrupt
– Rare
– Critical

Entomology – Real-time bugs

• Interrupt suppression
– Critical sections
– Corruption of shared data
– Interrupt latency
– Less common
– Critical
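A sketch of the usual remedy for the snapshot example shown earlier: suppress interrupts around the related updates so an ISR can never see a half-written snapshot. The DISABLE/ENABLE macros are hypothetical placeholders for whatever the target provides, and the critical section must stay short to limit interrupt latency.

DISABLE_INTERRUPTS();                              /* begin critical section             */
snapshot.gear        = new_gear;                   /* update the related fields together */
snapshot.speed       = new_speed;
snapshot.speed_limit = speed_limit_tb[ new_gear ];
ENABLE_INTERRUPTS();                               /* end critical section               */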

Entomology – System bugs

• Stack overflow/underflow
– Pushing, pulling and nesting
– More common in asm and complex designs
– Critical

• Resource sharing problem
– Less common
– High to critical
– Mars Pathfinder

Entomology – System bugs

• Resource mapping
– Variable maps, register banks, development maps
– Less common
– Critical

• Instrumentation problem
– Less common
– Low

Entomology – System bugs

• Version control error
– Common in complex or mismanaged projects
– High to critical

Entomology – other bugs

• Syntax/typing
– if (*ptr=NULL)   cut & paste errors
– More common
– Varies

• Interface
– Common
– High to critical

• Missing functionality
– Common
– High

Entomology – other bugs

• Peripheral register initialization
– Less common
– Critical

• Watchdog servicing
– Less common
– Critical

• Memory allocation/de-allocation
– Common when using malloc(), free()
– Low to critical

Entomology – Review

• What are you looking for ?

• How are bugs being introduced ?

• What are their consequences ?

Form your own target list!

Finding the hidden errors…

• All methods use these basic techniques:
– Review: checking
– Tests: demonstrating
– Analysis: proving

These are all referred to as "testing"!

Testing

• “Organized process of identifying variances between actual and specified results”

• Goal: zero significant defects

Testing axioms

• All software has bugs
• Programs cannot be exhaustively tested
• Cannot prove the absence of all errors
• Complex systems often behave counter-intuitively
• Software systems are often brittle

Finding spec/design problems

• Reviews / Inspections / Walkthroughs

• CASE tools

• Simulation

• Prototypes

Still need consistently effective methods !

Testing – Spec/Design Reviews

• Can be formal or informal– Completeness– Consistency– Feasibility– Testability

Testing – Evaluating methods

• Relative costs– None– Low– Moderate– High

• General effectiveness– Low– Moderate– High– Very high

Testing – Code reviews

• Individual review
– Effectiveness: high
– Cost: time - low, material - none

• Group inspections
– Effectiveness: very high
– Cost: time - moderate, material - none

Testing – Code reviews

• Strengths
– Early detection of errors
– Logic problems
– Math errors
– Non-testable requirements or paths

• Weaknesses
– Individual preparation and experience
– Focus on details, not the "big picture"
– Timing and system issues

Step by step execution

• Exercise every line of code or every branch condition

• Look for errors
– Use a simulator, ICE, logic analyzer
– Effectiveness: moderate - dependent on the tester
– Cost: time is high, material is low or moderate

Functional (Black Box)

• Exercise inputs and examine outputs

• Test procedures describe expected behavior

• Subsystems tested and integrated
– Effectiveness is moderate
– Cost: time is moderate, material varies

Tip: where functional testing finds problems, look deeper in that area!

Functional (Black Box)

• Strengths
– Requirements problems
– Interfaces
– Performance issues
– Most critical/most used features

• Weaknesses
– Poor coverage
– Timing and other problems masked
– Error conditions

Functional test process

• ID requirements to test

• Choose strategy
– 1 test per requirement
– Test small groups of requirements
– Scenario: broad sweep of many requirements

• Write test cases
– Environment
– Inputs
– Expected outputs

• Traceability

Structural (White box)

• Looks at how the code works
• Test procedures
• Exercise paths using many data values
• Consistency between design and implementation
– Effectiveness: high
– Cost: time is high, material low to moderate

Structural (White box)

• Strengths
– Coverage
– Effectiveness
– Logic and structure problems
– Math and data errors

• Weaknesses
– Interface and requirements
– Focused; may miss the "big picture"
– Interaction with the system
– Timing problems

Structural (White box)

• Test rigor based on 3 levels of risk (FAA)

• C - Reduced safety margins or functionality: Statement Coverage
– Invoke every statement at least once

• B - Hazardous: Decision Coverage
– Invoke every statement at least once
– Invoke every entry and exit
– Every control statement takes all possible outcomes
– Every non-constant Boolean expression evaluates to both a True and a False result

• A - Catastrophic: Modified Condition Decision Coverage
– Every statement has been invoked
– Every point of entry and exit has been invoked
– Every control statement has taken all possible outcomes
– Every Boolean expression has evaluated to both a True and a False result
– Every condition in a Boolean expression has evaluated to both True and False
– Every condition in a Boolean expression has been shown to independently affect that expression's outcome
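As an illustration (the expression is hypothetical, not taken from the course material), a minimal modified condition/decision coverage set for one decision with three conditions needs only four tests, one more than the number of conditions:

/* Decision:  if ( A && ( B || C ) )

      Test   A  B  C   outcome   condition shown to act independently
      ----   -  -  -   -------   ------------------------------------
       1     T  T  F      T      A (vs. test 2), B (vs. test 3)
       2     F  T  F      F      A
       3     T  F  F      F      B, C (vs. test 4)
       4     T  F  T      T      C
*/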

Unit test standards

• What is the white box testing plan?

• What do you test?

• When do you test it?

• How do you test it?

Structural test process

• ID all inputs
• ID all outputs
• ID all paths
• Set up test cases
– Decision coverage
– Boundary value analysis
– Checklist
– Weaknesses

Structural test process

• Measure worst case execution time

• Determine worst case stack depth

• Bottom up

Integration

• Combines elements of white and black box testing
– Unexpected return codes or acknowledgements
– Parameters - boundary values
– Assumed initial conditions/state
– Unobvious dependencies
– Aggregate functionality

Integration

• Should you do this... when?
– Depends on the complexity of the system
– Boundary values of parameters in functions
– Interaction between units
– "Interesting" paths

• Errors
• Most common

Verification

• Verify the structural integrity of the code

• Find errors hidden at other levels of examination

• Outside of requirements

• Conformance to standards

Verification

• Detailed inspection, analysis, and measurement of code to find common errors

• Examples– Stack depth analysis– Singular use of flags/variables– Adequate interrupt suppression– Maximum interrupt latency– Processor-specific constraints

Verification

• Strengths
– Finds problems that testing and inspection can't
– Stack depth
– Resource sharing
– Timing

• Weaknesses
– Tedious
– Focused on certain types of errors

Verification

• Customize for your process/application– What should be checked– When– How– By whom

Stress/performance

• Load the system to maximum…and beyond!

• Helps determine “factor of safety”

• Performance to requirements

Stress/performance

• Examples– Processor utilization– Interrupt latency– Worst time to complete a task– Periodic interrupt frequency jitter– Number of messages per unit time– Failure recovery

Other techniques

• Fault injection
• Scenario testing
• Regression
– Critical functions
– Most functionality with the least tests
– Automation
– Risk of not re-testing is higher than the cost

• Boundary value testing

Tools

                      ICE    Simulator    Logic analyzer
Step through code      X         X              X
Control execution      X         X
Modifying data         X         X
Coverage               X         X              X
Timing analysis        X         X              X

Code Inspection Checklist

• Code correctly implements the documented software design

• Code adheres to coding standards and guidelines

• Code is clear and understandable

• Code has been commented appropriately

• Code is within complexity guidelines
– Cyclomatic complexity < 12

Code Inspection Checklist

• Macro formal parameters should not have side effects (lint message 665)

• Use parentheses to enhance code robustness; use parentheses around all macro parameters (665, 773)

• Examine all typecasts for correct operation

• Examine the effects of all implicit type conversions (910-919)
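A small sketch of why the parentheses matter (the macro is hypothetical):

/* Fragile: SQUARE(x + 1) expands to  x + 1 * x + 1                    */
#define SQUARE(x)      x * x

/* Robust: parentheses around each parameter and around the whole body */
#define SQUARE_OK(x)   ( (x) * (x) )

/* Side effects remain a problem: SQUARE_OK(i++) still increments i twice */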

Code Inspection Checklist

• Look for off-by-one errors in loop counters, arrays, etc

• Assignment statements within condition expressions (use cchk)

• Guarantee that a pointer can never be Null when de-referencing it

• Cases within a switch should end in a break (616)

Code Inspection Checklist

• All switch statements should have a default case (744)

• Examine all arguments passed to functions for appropriate use of pass by value, pass by reference, and const

• Local variables must be initialized before use

• Equality test on floating point numbers may never be True (777)

Code Inspection Checklist

• Adding and subtracting floats of different magnitudes can result in lost precision

• Ensure that division by zero cannot occur

• Sequential multiplications and divisions may produce round-off errors

Code Inspection Checklist

• Subtracting nearly equal values can produce cancellation errors

• C rounds towards zero – is this appropriate here ?

• Mathematical underflow/overflow potential

• Non-deterministic timing constructs

Unit test standards


• 1. Each test case must be capable of independent execution, i.e. the setup and results of a test case shall not be used by subsequent test cases

• 2. All input variables shall be initialized for each test case. All output variables shall be given an expected value, which will be validated against the actual result for each test case

Unit test standards

• 3. Initialize variables to valid values taking into account any relationships among inputs. In other words, if the value of a variable A affects the domain of variable B, select values for A and B which satisfy the relationship

• 4. Verify that the minimum and maximum values can be obtained for each output variable (i.e. select input values that produce output values as close to the max/min as possible)

Unit test standards

• 5. Initialize output variables according to the following:
– If an output is expected to change, set its initial value to something other than the expected result
– If an output is not expected to change, set its initial value to its expected value

• 6. Verify loop entry and exit criteria

Unit test standards

• 7. Maximum loop iterations should be executed to provide worst case timing scenarios

• 8. Verify that the loss of precision due to multiplication or division is within acceptable tolerance

Unit test standards

• 9. The following apply to conditional expressions:
– "OR" expressions are evaluated by setting all predicates "FALSE" and then setting each one "TRUE" individually
– "AND" expressions are evaluated by setting all predicates "TRUE" and then setting each one "FALSE" individually
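A sketch of the vectors standard 9 calls for, using a hypothetical three-predicate expression:

/* if ( p1 || p2 || p3 )   -- "OR" expression:
     test 1: p1=F  p2=F  p3=F     (all predicates FALSE)
     test 2: p1=T  p2=F  p3=F     (each one TRUE individually)
     test 3: p1=F  p2=T  p3=F
     test 4: p1=F  p2=F  p3=T

   if ( p1 && p2 && p3 )   -- "AND" expression:
     test 1: p1=T  p2=T  p3=T     (all predicates TRUE)
     test 2: p1=F  p2=T  p3=T     (each one FALSE individually)
     test 3: p1=T  p2=F  p3=T
     test 4: p1=T  p2=T  p3=F
*/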

Unit test standards

• 10. Do not stub any functions that are simple enough to include within the unit test

• 11. Non-trivial tests should include an explanation of what is being tested

Unit test standards

• 12. Unit test case coverage is complete when the following criteria are satisfied (where applicable):
– 100% function and exit coverage
– 100% call coverage
– 100% statement block coverage
– 100% decision coverage
– 100% loop coverage
– 100% basic condition coverage
– 100% modified condition coverage

Unit test checklist - Common coding error checks
(Status for each check: <Pass/Fail/NA>)

• Mathematical expression underflow/overflow

• Off-by-one errors in loop counters

• Assignment statements within conditional expressions (may be detected by compiler, lint, cchk)

• Floats are not compared solely for equality (lint message 777)

• Variables and calibrations use correct precision and ranges in calculations

• Pointers initialized and de-referenced properly

• Intermediate calculations are not stored in global variables

• All declared local variables are used in the function (may be detected by compiler or lint)

• Typecasting has been done correctly

• Unreachable code has been removed (lint message 527)

• All denominators are guaranteed to be non-zero (no divide by 0)

• Switch statements handle every case of the control variable (have DEFAULT paths); any cases that "fall through" to the next case are intended to do so (lint messages 744, 787; fall-through 616)

• Static variables are used for only one purpose

• All variables have been properly initialized before being used; do not assume a value of "0" after power-up

Recommended