1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002


Page 1: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

1

Software Reliability Engineering:Techniques and Tools

CS130Winter, 2002

Page 2: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

2

Source Material

“Software Reliability and Risk Management: Techniques and Tools”, Allen Nikora and Michael Lyu, tutorial presented at the 1999 International Symposium on Software Reliability Engineering

Allen Nikora, John Munson, “Determining Fault Insertion Rates For Evolving Software Systems”, proceedings of the International Symposium on Software Reliability Engineering, Paderborn, Germany, November, 1998

Page 3: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

3

Agenda

Part I: Introduction
Part II: Survey of Software Reliability Models
Part III: Quantitative Criteria for Model Selection
Part IV: Input Data Requirements and Data Collection Mechanisms
Part V: Early Prediction of Software Reliability
Part VI: Current Work in Estimating Fault Content
Part VII: Software Reliability Tools

Page 4: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

4

Part I: Introduction

• Reliability Measurement Goal
• Definitions
• Reliability Theory

Page 5: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

5

Reliability Measurement Goal

Reliability measurement is a set of mathematical techniques that can be used to estimate and predict the reliability behavior of software during its development and operation.

The primary goal of software reliability modeling is to answer the following question:

“Given a system, what is the probability that it will fail in a given time interval, or, what is the expected duration between successive failures?”

Page 6: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

6

Basic Definitions

Software Reliability R(t): The probability of failure-free operation of a computer program for a specified time under a specified environment.

Failure: The departure of program operation from user requirements.

Fault: A defect in a program that causes failure.

Page 7: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

7

Basic Definitions (cont’d)

Failure Intensity (rate) f(t): The expected number of failures experienced in a given time interval.

Mean-Time-To-Failure (MTTF): Expected value of a failure interval.

Expected total failures μ(t): The number of failures expected in a time period t.

Page 8: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

8

Reliability Theory

Let "T" be a random variable representing the failure time or lifetime of a physical system.For this system, the probability that it will fail by time "t" is:

The probability of the system surviving until time "t" is:

t

dxxftTPtF0

)(][)(

t

dxxftFtTPtR )()(1][)(

Page 9: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

9

Reliability Theory (cont’d)

Failure rate - the probability that a failure will occur in the interval [t1, t2], given that a failure has not occurred before time t1, normalized by the length of the interval. This is written as:

\frac{P[t_1 \le T \le t_2 \mid T > t_1]}{t_2 - t_1} = \frac{P[t_1 \le T \le t_2]}{(t_2 - t_1)\,P[T > t_1]} = \frac{F(t_2) - F(t_1)}{(t_2 - t_1)\,R(t_1)}

Page 10: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

10

Reliability Theory (cont'd)

Hazard rate - limit of the failure rate as the length of the interval approaches zero. This is written as:

This is the instantaneous failure rate at time t, given that the system survived until time t. The terms hazard rate and failure rate are often used interchangeably.

z(t) = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t \, R(t)} = \frac{f(t)}{R(t)}

Page 11: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

11

Reliability Theory (cont’d)

A reliability objective expressed in terms of one reliability measure can be easily converted into another measure as follows (assuming an "average" failure rate, λ, is measured):

\lambda = \frac{1}{\text{MTTF}}, \qquad R(t) = e^{-\lambda t}
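As a quick illustration (not part of the original slides; it assumes the constant failure rate λ above, with illustrative numbers), the conversions can be computed directly in Python:

import math

def mttf(lam):
    # MTTF = 1 / lambda for a constant failure rate
    return 1.0 / lam

def reliability(lam, t):
    # R(t) = exp(-lambda * t)
    return math.exp(-lam * t)

lam = 0.002                      # illustrative: failures per CPU-hour
print(mttf(lam))                 # 500.0 CPU-hours
print(reliability(lam, 100.0))   # ~0.819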

Page 12: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

12

Reliability Theory (cont'd)

Region 1 - debugging, infant mortality

Region 2 - useful life period

Region 3 - component wear-out or fatigue

[Figure: bathtub curve of hazard rate z(t) versus time t, spanning the three regions above]

Page 13: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

13

Part II: Survey of Software Reliability Models

• Software Reliability Estimation Models:
  • Exponential NHPP Models
    • Jelinski-Moranda/Shooman Model
    • Musa-Okumoto Model
  • Geometric Model
• Software Reliability Modeling and Acceptance Testing

Page 14: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

14

Jelinski-Moranda/Shooman Models

Jelinski-Moranda model was developed by Jelinski and Moranda of McDonnell Douglas Astronautics Company for use on Navy NTDS software and a number of modules of the Apollo program. The Jelinski-Moranda model was published in 1971.

Shooman's model, developed independently of Jelinski and Moranda's work, was also published in 1971. Shooman's model is identical to the JM model.

Page 15: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

15

Jelinski-Moranda/Shooman (cont'd)

Assumptions:
1. The number of errors in the code is fixed.
2. No new errors are introduced into the code through the correction process.
3. The number of machine instructions is essentially constant.
4. Detections of errors are independent.
5. The software is operated in a similar manner as the anticipated operational usage.
6. The error detection rate is proportional to the number of errors remaining in the code.

Page 16: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

16

Jelinski-Moranda/Shooman (cont'd)

Let τ represent the amount of debugging time spent on the system since the start of the test phase. From assumption 6, we have:

z(\tau) = K\,r(\tau)

where K is the proportionality constant, and r is the error rate (number of remaining errors normalized with respect to the number of instructions):

r(\tau) = \frac{E_T}{I_T} - \epsilon_c(\tau)

E_T = number of errors initially in the program
I_T = number of machine instructions in the program
ε_c(τ) = cumulative number of errors fixed in the interval [0, τ], normalized by the number of instructions

Page 17: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

17

Jelinski-Moranda/Shooman (cont'd)

E_T and I_T are constant (assumptions 1 and 3).

No new errors are introduced through the correction process (assumption 2).

As \tau \to \infty, \epsilon_c(\tau) \to E_T/I_T, so r(\tau) \to 0.

The hazard rate becomes:

z(\tau) = K\left[\frac{E_T}{I_T} - \epsilon_c(\tau)\right]

Page 18: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

18

Jelinski-Moranda/Shooman (cont'd)

The reliability function becomes:

R(t) = \exp\left\{-K\left[\frac{E_T}{I_T} - \epsilon_c(\tau)\right] t\right\}

The expression for MTTF is:

\text{MTTF} = \frac{1}{K\left[\frac{E_T}{I_T} - \epsilon_c(\tau)\right]}
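A minimal sketch of these JM expressions in Python (the parameter values are illustrative, and the function names are chosen here, not taken from the slides):

import math

def jm_hazard(K, E_T, I_T, eps_c):
    # z(tau) = K * (E_T/I_T - eps_c(tau)); constant between failures
    return K * (E_T / I_T - eps_c)

def jm_reliability(t, K, E_T, I_T, eps_c):
    # R(t) = exp(-z * t)
    return math.exp(-jm_hazard(K, E_T, I_T, eps_c) * t)

def jm_mttf(K, E_T, I_T, eps_c):
    # MTTF = 1 / z
    return 1.0 / jm_hazard(K, E_T, I_T, eps_c)

# Illustrative: 100 initial errors in 10,000 instructions, 40 corrected so far
print(jm_mttf(K=0.5, E_T=100, I_T=10000, eps_c=40.0 / 10000))   # ~333.3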

Page 19: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

19

Geometric Model

Proposed by Moranda in 1975 as a variation of the Jelinski-Moranda model.

Unlike models previously discussed, it does not assume that the number of errors in the program is finite, nor does it assume that errors are equally likely to occur.

This model assumes that errors become increasingly difficult to detect as debugging progresses, and that the program is never completely error free.

Page 20: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

20

Geometric Model (cont'd)

Assumptions:
1. There are an infinite number of total errors.
2. All errors do not have the same chance of detection.
3. The detections of errors are independent.
4. The software is operated in a similar manner as the anticipated operational usage.
5. The error detection rate forms a geometric progression and is constant between error occurrences.

Page 21: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

21

Geometric Model (cont'd)

The above assumptions result in the following hazard rate:

z(t) = D\phi^{i-1}

for any time "t" between the (i - 1)st and the i'th error, where 0 < φ < 1.

The initial value is z(t) = D.

Page 22: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

22

Geometric Model (cont'd)

Hazard Rate Graph

[Step plot: z(t) starts at D and drops to Dφ, Dφ², … at successive error discoveries; the successive drops are D(1 - φ), Dφ(1 - φ), …]

\frac{\text{change in } z(t) \text{ on discovery of the } i\text{th error}}{\text{change in } z(t) \text{ on discovery of the } (i+1)\text{st error}} = \frac{D\phi^{i-1}(1-\phi)}{D\phi^{i}(1-\phi)} = \frac{1}{\phi}

That is, each successive drop in the hazard rate is a constant fraction φ of the previous one.
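A one-function sketch of the geometric hazard sequence (D and the ratio φ, here phi, are illustrative values):

def geometric_hazard(D, phi, i):
    # z(t) = D * phi**(i-1) between the (i-1)st and ith errors
    return D * phi ** (i - 1)

for i in range(1, 5):
    print(i, geometric_hazard(D=0.10, phi=0.8, i=i))   # 0.10, 0.08, 0.064, 0.0512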

Page 23: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

23

Musa-Okumoto Model

The Musa-Okumoto model assumes that the failure intensity function decreases exponentially with the expected number of failures observed:

\lambda(t) = \lambda_0 e^{-\theta\mu(t)}

Since \lambda(t) = d\mu(t)/dt, we have the following differential equation:

\frac{d\mu(t)}{dt} = \lambda_0 e^{-\theta\mu(t)}

Page 24: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

24

Musa-Okumoto Model (cont’d)

Note that

\frac{d}{dt}\,e^{\theta\mu(t)} = \theta\,e^{\theta\mu(t)}\,\frac{d\mu(t)}{dt}

We then obtain

\frac{d}{dt}\,e^{\theta\mu(t)} = \theta\lambda_0

Page 25: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

25

Musa-Okumoto Model (cont’d)

Integrating this last equation yields:

e^{\theta\mu(t)} = \theta\lambda_0 t + C

Since μ(0) = 0, C = 1, and the mean value function μ(t) is:

\mu(t) = \frac{1}{\theta}\ln\left(\theta\lambda_0 t + 1\right)
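A small sketch of the resulting mean value and intensity functions (the parameter values are illustrative, and the function names are ours):

import math

def mo_mean_value(t, lam0, theta):
    # mu(t) = (1/theta) * ln(theta * lam0 * t + 1)
    return math.log(theta * lam0 * t + 1.0) / theta

def mo_intensity(t, lam0, theta):
    # lambda(t) = lam0 * exp(-theta * mu(t)) = lam0 / (theta * lam0 * t + 1)
    return lam0 / (theta * lam0 * t + 1.0)

lam0, theta = 20.0, 0.025                   # illustrative parameters
print(mo_mean_value(100.0, lam0, theta))    # ~157.3 expected failures by t=100
print(mo_intensity(100.0, lam0, theta))     # ~0.39 failures per unit time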

Page 26: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

26

Software Reliability Modeling and Acceptance Testing

Given a piece of software advertised as having a failure rate λ, you can see if it meets that failure rate to a specific level of confidence.

α is the risk (probability) of falsely saying that the software does not meet the failure rate goal.

β is the risk of saying that the goal is met when it is not.

The discrimination ratio, γ, is the factor you specify that identifies acceptable departure from the goal. For instance, if γ = 2, the acceptable failure rate lies between λ/2 and 2λ.

Page 27: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

27

Software Reliability Modeling and Acceptance Testing (cont’d)

[Demonstration chart: failure number plotted against normalized failure time (time to failure times the failure intensity objective), divided into Accept, Continue, and Reject regions]

Page 28: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

28

Software Reliability Modeling and Acceptance Testing (cont’d)

We can now draw a chart as shown in the previous slide. Define intermediate quantities A and B as follows:

A = \ln\frac{1-\beta}{\alpha}, \qquad B = \ln\frac{\beta}{1-\alpha}

The boundary between the "reject" and "continue" regions is given by:

x_n = \frac{n\ln\gamma - A}{\gamma - 1}

where n is the number of failures observed and x_n is the normalized failure time. The boundary between the "continue" and "accept" regions of the chart is given by:

x_n = \frac{n\ln\gamma - B}{\gamma - 1}

Page 29: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

29

Part III: Criteria for Model Selection

• Background
• Non-Quantitative Criteria
• Quantitative Criteria

Page 30: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

30

Criteria for Model Selection - Background

When software reliability models first appeared, it was felt that a process of refinement would produce "definitive" models that would apply to all development and test situations.

Current situation:
• Dozens of models have been published in the literature
• Studies over the past 10 years indicate that the accuracy of the models is variable
• Analysis of the particular context in which reliability measurement is to take place, so as to decide a priori which model to use, does not seem possible

Page 31: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

31

Criteria for Model Selection (cont’d)

Non-Quantitative Criteria:
• Model Validity
• Ease of measuring parameters
• Quality of assumptions
• Applicability
• Simplicity
• Insensitivity to noise

Page 32: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

32

Criteria for Model Selection (cont’d)

Quantitative Criteria for Post-Model Application:
• Self-consistency
• Goodness-of-Fit
• Relative Accuracy (Prequential Likelihood Ratio)
• Bias (U-Plot)
• Bias Trend (Y-Plot)

Page 33: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

33

Criteria for Model Selection (cont’d)

Self-consistency - Analysis of a model's predictive quality can help the user decide which model(s) to use.

The simplest question an SRM user can ask is "How reliable is the software at this moment?"

The time to the next failure, T_i, is usually predicted using the observed times to failure t_1, t_2, \ldots, t_{i-1}.

In general, predictions of T_i can be made using the observed times to failure t_1, t_2, \ldots, t_{i-K},\ K \ge 0. The results of predictions made for different values of K can then be compared. If a model produces "self-consistent" results for differing values of K, this indicates that its use is appropriate for the data on which the particular predictions were made.

HOWEVER, THIS PROVIDES NO GUARANTEE THAT THE PREDICTIONS ARE CLOSE TO THE TRUTH.

Page 34: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

34

Criteria for Model Selection (cont’d)

Goodness-of-fit - Kolmogorov-Smirnov Test
• Uses the absolute vertical distance between two CDFs to measure goodness of fit.
• Depends on the fact that

D_n = \sup_x \left| F_n(x) - F_0(x) \right|

is distribution-free, where F_0 is a known, continuous CDF and F_n is the sample CDF.

Page 35: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

35

Criteria for Model Selection (cont'd)

Goodness-of-fit (cont'd) - Chi-Square Test
• More suited to determining GOF of failure counts data than of interfailure times.
• Value given by:

Q_k = \sum_{j=1}^{k+1} \frac{\left(N_j - np_j\right)^2}{np_j}

where:
n = number of independent repetitions of an experiment in which the outcomes are decomposed into k+1 mutually exclusive sets A_1, A_2, \ldots, A_{k+1}
N_j = number of outcomes in the j'th set
p_j = P[A_j]
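A minimal sketch of this statistic in Python (the data are illustrative; a chi-square critical value, e.g. from scipy.stats, would complete the test):

def chi_square_stat(N, p, n):
    # N: observed counts per category; p: hypothesized probabilities; n: trials
    return sum((Nj - n * pj) ** 2 / (n * pj) for Nj, pj in zip(N, p))

observed = [18, 12, 10]     # failures per category (illustrative)
probs = [0.5, 0.3, 0.2]     # probabilities under the fitted model
print(chi_square_stat(observed, probs, n=40))   # 0.7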

Page 36: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

36

Criteria for Model Selection (cont'd)

Prequential Likelihood Ratio

The pdf \hat{f}_i(t) for T_i is based on the observations t_1, t_2, \ldots, t_{i-1}: \hat{f}_i(t) = d\hat{F}_i(t)/dt.

For one-step-ahead predictions of T_{j+1}, \ldots, T_{j+n}, the prequential likelihood is:

PL_n = \prod_{i=j+1}^{j+n} \hat{f}_i(t_i)

Two prediction systems, A and B, can be evaluated by computing the prequential likelihood ratio:

PLR_n = \frac{\prod_{i=j+1}^{j+n} \hat{f}_i^A(t_i)}{\prod_{i=j+1}^{j+n} \hat{f}_i^B(t_i)}

If PLR_n approaches infinity as n approaches infinity, B is discarded in favor of A.
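A sketch of a prequential-likelihood-ratio computation (summing logs rather than multiplying densities to avoid underflow; the exponential predictive pdfs are purely illustrative stand-ins for two models' one-step-ahead predictions):

import math

def log_plr(preds_a, preds_b, times):
    # preds_a[i], preds_b[i]: one-step-ahead predictive pdfs for T_i from
    # models A and B; times[i]: the observed time to failure t_i
    return sum(math.log(fa(t)) - math.log(fb(t))
               for fa, fb, t in zip(preds_a, preds_b, times))

exp_pdf = lambda lam: (lambda t: lam * math.exp(-lam * t))
times = [12.0, 15.0, 30.0, 42.0]
preds_a = [exp_pdf(1.0 / 25.0)] * len(times)   # model A predicts MTTF 25
preds_b = [exp_pdf(1.0 / 10.0)] * len(times)   # model B predicts MTTF 10
print(log_plr(preds_a, preds_b, times))        # > 0 favors model A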

Page 37: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

37

Prequential Likelihood Example

[Two sketches comparing successive one-step-ahead predictive pdfs f_i, f_{i+1}, f_{i+2}, … with the true pdf: one showing high bias and low noise, the other low bias and high noise]

Page 38: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

38

Criteria for Model Selection (cont'd)

Prequential Likelihood Ratio (cont'd)

When predictions have been made for T_{j+1}, \ldots, T_{j+n}, the PLR is given by:

PLR_n = \frac{p\left(t_{j+n}, \ldots, t_{j+1} \mid t_j, \ldots, t_1, A\right)}{p\left(t_{j+n}, \ldots, t_{j+1} \mid t_j, \ldots, t_1, B\right)}

Using Bayes' Rule, the PLR is rewritten as:

PLR_n = \frac{p\left(A \mid t_{j+n}, \ldots, t_1\right)\, p\left(t_{j+n}, \ldots, t_{j+1} \mid t_j, \ldots, t_1\right) \big/ p\left(A \mid t_j, \ldots, t_1\right)}{p\left(B \mid t_{j+n}, \ldots, t_1\right)\, p\left(t_{j+n}, \ldots, t_{j+1} \mid t_j, \ldots, t_1\right) \big/ p\left(B \mid t_j, \ldots, t_1\right)}

Page 39: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

39

Criteria for Model Selection (cont’d)

Prequential Likelihood Ratio (cont’d)

This equals:

PLR_n = \frac{p\left(A \mid t_{j+n}, \ldots, t_1\right)}{p\left(B \mid t_{j+n}, \ldots, t_1\right)} \cdot \frac{p\left(B \mid t_j, \ldots, t_1\right)}{p\left(A \mid t_j, \ldots, t_1\right)}

If the initial conditions were based only on prior belief, the second factor of the final equation is the prior odds ratio. If the user is indifferent between models A and B, this ratio has a value of 1.

Page 40: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

40

Criteria for Model Selection (cont’d)

Prequential Likelihood Ratio (cont’d):

The final equation is then written as:

PLR_n = \frac{w_A}{1 - w_A}

This is the posterior odds ratio, where w_A is the posterior belief that A is true after making predictions with both A and B and comparing them with actual behavior.

Page 41: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

41

Criteria for Model Selection (cont’d)

The "u-plot" can be used to assess the predictive quality of a model.

Given a predictor \hat{F}_i(t) that estimates the probability that the time to the next failure is less than t, consider the sequence

u_i = \hat{F}_i(t_i)

where each u_i is a probability integral transform of the observed t_i using the previously calculated predictor \hat{F}_i based upon t_1, t_2, \ldots, t_{i-1}.

If each \hat{F}_i were identical to the true, but hidden, F_i, then the u_i would be realizations of independent random variables with a uniform distribution in [0,1].

The problem then reduces to seeing how closely the sequence {u_i} resembles a random sample from U[0,1].

Page 42: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

42

U-Plots for JM and LV Models

[U-plots for the JM and LV models, drawn against the unit line from (0,0) to (1,1)]

Page 43: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

43

Criteria for Model Selection (cont’d)

The y-plot: Temporal ordering is not shown in a u-plot. The y-plot addresses this deficiency.

To generate a y-plot, the following steps are taken:
• Compute the sequence of u_i
• For each u_i, compute x_i = -\ln(1 - u_i)
• Obtain y_i by computing:

y_i = \frac{\sum_{j=1}^{i} x_j}{\sum_{j=1}^{m} x_j}

for i \le m, m representing the number of observations made.

If the u_i really do form a sequence of independent random variables uniformly distributed in [0,1], the slope of the plotted y_i will be constant.
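A sketch of both checks (assuming the u_i have already been computed from the predictors \hat{F}_i; the sample values are illustrative):

import math

def u_plot_distance(u):
    # Kolmogorov distance between the sorted u_i and the uniform CDF;
    # a small distance indicates small bias in the predictions
    u = sorted(u)
    n = len(u)
    return max(max((k + 1) / n - x, x - k / n) for k, x in enumerate(u))

def y_sequence(u):
    # x_i = -ln(1 - u_i); y_i is the running sum normalized by the total
    x = [-math.log(1.0 - ui) for ui in u]
    total = sum(x)
    y, running = [], 0.0
    for xi in x:
        running += xi
        y.append(running / total)
    return y

u = [0.31, 0.62, 0.45, 0.83, 0.17, 0.55]   # illustrative u_i values
print(u_plot_distance(u))                  # ~0.21 for this sample
print(y_sequence(u))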

Page 44: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

44

Y-Plots for JM and LV Models

[Y-plots for the JM and LV models, drawn against the unit line from (0,0) to (1,1)]

Page 45: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

45

Criteria for Model Selection (cont’d)

Quantitative Criteria Prior to Model Application:
• Arithmetical Mean of Interfailure Times
• Laplace Test

Page 46: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

46

Arithmetical Mean of Interfailure Times

Calculate the arithmetical mean of the interfailure times as follows:

\tau(i) = \frac{1}{i}\sum_{j=1}^{i} \theta_j

i = number of observed failures
θ_j = j'th interfailure time

An increasing series of τ(i) suggests reliability growth. A decreasing series of τ(i) suggests reliability decrease.

Page 47: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

47

Laplace Test

The occurrence of failures is assumed to follow a non-homogeneous Poisson process whose failure intensity is decreasing:

h(t) = e^{a + bt}, \quad b < 0

The null hypothesis is that occurrences of failures follow a homogeneous Poisson process (i.e., b = 0 above).

For interfailure times, the test statistic is computed by:

u(i) = \frac{\dfrac{1}{i-1}\displaystyle\sum_{n=1}^{i-1}\sum_{j=1}^{n}\theta_j \;-\; \dfrac{1}{2}\displaystyle\sum_{j=1}^{i}\theta_j}{\displaystyle\sum_{j=1}^{i}\theta_j\,\sqrt{\dfrac{1}{12(i-1)}}}

Page 48: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

48

Laplace Test (cont’d)

For interval data, the test statistic is computed by:

u(k) = \frac{\displaystyle\sum_{i=1}^{k}(i-1)\,n(i) \;-\; \dfrac{k-1}{2}\displaystyle\sum_{i=1}^{k} n(i)}{\sqrt{\dfrac{k^2 - 1}{12}\displaystyle\sum_{i=1}^{k} n(i)}}

where n(i) is the number of failures observed in interval i and k is the number of intervals.

Page 49: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

49

Laplace Test (cont'd) - Interpretation

• Negative values of the Laplace factor indicate decreasing failure intensity; positive values suggest an increasing failure intensity. Values varying between +2 and -2 indicate stable reliability.
• Significance is that associated with the normal distribution, e.g.:
  • The null hypothesis "H0 : HPP" vs. "H1 : decreasing failure intensity" is rejected at the 5% significance level for u(T) < -1.645
  • The null hypothesis "H0 : HPP" vs. "H1 : increasing failure intensity" is rejected at the 5% significance level for u(T) > +1.645
  • The null hypothesis "H0 : HPP" vs. "H1 : there is a trend" is rejected at the 5% significance level for |u(T)| > 1.96
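A sketch of both Laplace statistics as reconstructed above (the data are illustrative; negative values suggest reliability growth):

import math

def laplace_interfailure(theta):
    # theta: interfailure times theta_1..theta_i
    i = len(theta)
    cum, running = [], 0.0
    for th in theta:
        running += th
        cum.append(running)              # cumulative failure times
    total = cum[-1]
    num = sum(cum[:-1]) / (i - 1) - total / 2.0
    return num / (total * math.sqrt(1.0 / (12.0 * (i - 1))))

def laplace_interval(counts):
    # counts: number of failures n(i) observed in test intervals 1..k
    k = len(counts)
    total = sum(counts)
    num = sum(i * n for i, n in enumerate(counts)) - (k - 1) / 2.0 * total
    return num / math.sqrt((k * k - 1) / 12.0 * total)

print(laplace_interfailure([5, 8, 7, 12, 15, 20, 25, 30]))   # ~ -1.6
print(laplace_interval([10, 8, 8, 6, 5, 3]))                 # ~ -2.1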

Page 50: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

50

Part IV: Input Data Requirements and Data Collection Mechanisms

Model Inputs:
• Time Between Successive Failures
• Failure Counts and Test Interval Lengths

Setting up a Data Collection Mechanism:
• Minimal Set of Required Data
• Data Collection Mechanism Examples

Page 51: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

51

Input Data Requirements and Data Collection Mechanisms

Model Inputs - Time between Successive Failures
• Most of the models discussed in Section II require the times between successive failures as inputs.
• Preferred units of time are CPU time (e.g., CPU seconds between subsequent failures).
  • Allows computation of reliability independent of wall-clock time.
  • Reliability computations in one environment can be easily transformed into reliability estimates in another, provided that the operational profiles in both environments are the same and that the instruction execution rates of the original environment and the new environment can be related.

Page 52: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

52

Input Data Requirements and Data Collection Mechanisms (cont’d)

Model Inputs - Time between Successive Failures (cont’d)

Advantage - CPU time between successive failures tends to more accurately characterize the failure history of a software system than calendar time. Accurate CPU time between failures can give greater resolution than other types of data.

Disadvantage - CPU time between successive failures can often be more difficult to collect than other types of failure history data.

Page 53: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

53

Input Data Requirements and Data Collection Mechanisms (cont’d)

Model Inputs (cont'd) - Failure Counts and Test Interval Lengths
• Failure history can be collected in terms of test interval lengths and the number of failures observed in each interval. Several of the models described in Section II use this type of input.
• The failure reporting systems of many organizations will more easily support collection of this type of data rather than times between successive failures. In particular, the use of automated test systems can easily establish the length of each test interval. Analysis of the test run will then provide the number of failures for that interval.
• Disadvantage - failure counts data does not provide the resolution that accurately collected times between failures provide.

Page 54: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

54

Input Data Requirements and Data Collection Mechanisms (cont’d)

Setting up a Data Collection Mechanism

1. Establish clear, consistent objectives.

2. Develop a plan for the data collection process. Involve all individuals concerned (e.g. software designers, testers, programmers, managers, SQA and SCM staff). Address the following issues:

a. Frequency of data collection.

b. Data collection responsibilities

c. Data formats

d. Processing and storage of data

e. Assuring integrity of data/adherence to objectives

f. Use of existing mechanisms to collect data

Page 55: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

55

Input Data Requirements and Data Collection Mechanisms (cont’d)

Setting up a Data Collection Mechanism (cont’d)

3. Identify and evaluate tools to support data collection effort.

4. Train all parties in use of selected tools.

5. Perform a trial run of the plan prior to finalizing it.

6. Monitor the data collection process on a regular basis (e.g. weekly intervals) to assure that objectives are being met, determine current reliability of software, and identify problems in collecting/analyzing the data.

7. Evaluate the data on a regular basis. Assess software reliability as testing proceeds, not only at scheduled release time.

8. Provide feedback to all parties during the data collection/analysis effort.

Page 56: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

56

Input Data Requirements and Data Collection Mechanisms (cont’d)

Minimal Set of Required Data - to measure software reliability during test, the following minimal set of data should be collected by a development effort:

• Time between successive failures OR test interval lengths/number of failures per test interval
• Functional area tested during each interval
• Date on which functionality was added to software under test; identifier for functionality added
• Number of testers vs. time
• Dates on which testing environment changed, and nature of changes
• Dates on which test method changed

Page 57: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

57

Part VI: Early Prediction of Software Reliability

Background RADC Study Phase-Based Model

Page 58: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

58

Part VI: Background

• Modeling techniques discussed in preceding sections can be applied only during test phases.
• These techniques do not take into account structural properties of the system being developed or characteristics of the development environment.
• Current techniques can measure software reliability, but model outputs cannot be easily used to choose development methods or structural characteristics that will increase reliability.
• Measuring software reliability prior to test is an open area. Work in this area includes:
  • RADC study of 59 projects
  • Phase-Based model
  • Analysis of complexity

Page 59: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

59

Part VI: RADC Study

• Study of 59 software development efforts, sponsored by RADC in the mid-1980s.
• Purpose - develop a method for predicting software reliability in the life cycle phases prior to test. Acceptable model forms were:
  • measures leading directly to reliability/failure rate predictions
  • predictions that could be translated to failure rates (e.g., error density)
• Advantages of error density as a software reliability figure of merit, according to participating investigators:
  • It appears to be a fairly invariant number.
  • It can be obtained from commonly available data.
  • It is not directly affected by variables in the environment.
  • Conversion among error density metrics is fairly straightforward.

Page 60: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

60

Part VI: RADC Study (cont'd)

Advantages of error density as a software reliability figure of merit (cont'd):
• Possible to include faults found by inspection with those found during testing and operations, since the time-dependent elements of the latter do not need to be accounted for.

Major disadvantages cited by the investigators are:
• This metric cannot be combined with hardware reliability metrics.
• Does not relate to observations in the user environment. It is far easier for users to observe the availability of their systems than their fault density, and users tend to be far more concerned about how frequently they can expect the system to go down.
• No assurance that all of the faults have been found.

Page 61: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

61

Part VI: RADC Study (cont’d)

Given these advantages and disadvantages, the investigators decided to attempt prediction of error density during the early phases of a development effort, and develop a transformation function that could be used to interpret the predicted error density as a failure rate. The driving factor seemed to be that data available early in life cycle could be much more easily used to predict error densities rather than failure rates.

Page 62: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

62

Part VI: RADC Study (cont’d)Investigators postulated that the following measures representing development environment and product characteristics could be used as inputs to a model that would predict the error density, measured in errors per line of code, at the start of the testing phase.

A -- Application Type (e.g. real-time control system, scientific computation system, information management system)

D -- Development Environment (characterized by development methodology and available tools). The types of development environments considered are the organic, semi-detached, and embedded modes, familiar from the COCOMO cost model.

Page 63: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

63

Part VI: RADC Study (cont'd)

Measures of development environment and product characteristics (cont'd):

Requirements and Design Representation Metrics
• SA - Anomaly Management
• ST - Traceability
• SQ - Incorporation of Quality Review results into the software

Software Implementation Metrics
• SL - Language Type (e.g. assembly, high-order language, fourth generation language)
• SS - Program Size
• SM - Modularity
• SU - Extent of Reuse
• SX - Complexity
• SR - Incorporation of Standards Review results into the software

Page 64: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

64

Part VI: RADC Study (cont'd)

Initial error density at the start of test is given by:

\rho_0 = A \times D \times SA \times ST \times SQ \times SL \times SS \times SM \times SU \times SX \times SR

Initial failure rate:

\lambda_0 = F \times K \times W_0

F = linear execution frequency of the program
K = fault exposure ratio (1.4 \times 10^{-7} < K < 10.6 \times 10^{-7}, with an average value of 4.2 \times 10^{-7})
W_0 = number of inherent faults = \rho_0 \times (number of lines of source code)

Page 65: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

65

Part VI: RADC Study (cont'd)

Moreover, F = R/I, where
• R is the average instruction rate
• I is the number of object instructions in the program

I can be further rewritten as I_S \times Q_X, where
• I_S is the number of source instructions
• Q_X is the code expansion ratio (the ratio of machine instructions to source instructions, which has an average value of 4 according to this study)

Therefore, the initial failure rate can be expressed as:

\lambda_0 = \frac{R}{I_S \times Q_X} \times K \times W_0
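A small sketch of this calculation (all numbers are illustrative; Qx defaults to the study's average code expansion ratio of 4):

def initial_failure_rate(R, K, rho0, source_lines, Qx=4.0):
    # lambda_0 = (R / (I_S * Q_X)) * K * W_0, with W_0 = rho0 * source_lines
    W0 = rho0 * source_lines            # number of inherent faults
    return (R / (source_lines * Qx)) * K * W0

# Illustrative: 6 errors/KLOC predicted at the start of test, a 50 KLOC
# program, a 10 MIPS processor, and the study's average fault exposure ratio
print(initial_failure_rate(R=10e6, K=4.2e-7, rho0=6.0 / 1000.0,
                           source_lines=50000))   # ~0.0063 failures/sec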

Page 66: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

66

Part VI: Phase-Based Model

• Developed by John Gaffney, Jr. and Charles F. Davis of the Software Productivity Consortium.
• Makes use of error statistics obtained during technical review of requirements, design and the implementation to predict software reliability during test and operations.
• Can also use failure data during testing to estimate reliability.
• Assumptions:
  • The development effort's current staffing level is directly related to the number of errors discovered during a development phase.
  • The error discovery curve is monomodal.
  • Code size estimates are available during early phases of a development effort.
  • Fagan inspections are used during all development phases.

Page 67: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

67

Part VI: Phase-Based Model

The first two assumptions, plus Norden's observation that the Rayleigh curve represents the "correct" way of applying staff to a development effort, result in the following expression for the number of errors discovered during a life cycle phase:

\Delta V_t = E\left[e^{-B(t-1)^2} - e^{-Bt^2}\right]

E = Total Lifetime Error Rate, expressed in Errors per Thousand Source Lines of Code (KSLOC)
t = Error Discovery Phase index

Page 68: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

68

Part VI: Phase-Based Model

Note that t does not represent ordinary calendar time. Rather, t represents a phase in the development process. The values of t and the corresponding life cycle phases are:

t = 1 - Requirements Analysis
t = 2 - Software Design
t = 3 - Implementation
t = 4 - Unit Test
t = 5 - Software Integration Test
t = 6 - System Test
t = 7 - Acceptance Test

Page 69: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

69

Part VI: Phase-Based Model

τ_p, the Defect Discovery Phase Constant, is the location of the peak in a continuous fit to the failure data. This is the point at which 39% of the errors have been discovered:

\tau_p = \frac{1}{\sqrt{2B}}

The cumulative form of the model is:

V_t = E\left[1 - e^{-Bt^2}\right]

where V_t is the number of errors per KSLOC that have been discovered through phase t.

Page 70: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

70

Part VI: Phase-Based Model

Typical Error Discovery Profile

[Bar chart: error density (errors per KSLOC) discovered in each phase, plotted against the development phase index (1-8); the profile rises to a peak in mid-development and then declines through the test phases]

Page 71: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

71

Part VI: Phase-Based Model

This model can also be used to estimate the number of latent errors in the software. Recall that the number of errors per KSLOC removed through the n'th phase is:

V_n = E\left[1 - e^{-Bn^2}\right]

The number of errors remaining in the software at that point is:

E\,e^{-Bn^2}

times the number of source statements (in KSLOC).
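A sketch of the discovery profile and the latent-error estimate (E and B are illustrative; the phase indices follow the table above):

import math

def discovered_in_phase(E, B, t):
    # Delta V_t = E * (exp(-B*(t-1)**2) - exp(-B*t**2))
    return E * (math.exp(-B * (t - 1) ** 2) - math.exp(-B * t ** 2))

def latent_errors(E, B, n, ksloc):
    # errors per KSLOC remaining after phase n, times program size in KSLOC
    return E * math.exp(-B * n * n) * ksloc

E, B = 20.0, 0.1        # illustrative lifetime error rate and shape parameter
for t in range(1, 8):   # phases: requirements analysis ... acceptance test
    print(t, round(discovered_in_phase(E, B, t), 2))
print(latent_errors(E, B, n=7, ksloc=50.0))   # ~7.5 latent errors in 50 KSLOC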

Page 72: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

72

Part VII: Current Work in Estimating Fault Content

Analysis of Complexity Regression Tree Modeling

Page 73: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

73

Analysis of Complexity

• The need for measurement
• The measurement process
• Measuring software change
• Faults and fault insertion
• Fault insertion rates

Page 74: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

74

Analysis of Complexity (cont’d)

Recent work has focused on relating measures of software structure to fault content.

Problem - although different software metrics will say different things about a software system, they tend to be interrelated and can be highly correlated with one another (e.g., McCabe complexity and line count are highly correlated).

Page 75: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

75

Relative complexity measure, developed by Munson and Khoshgoftaar, attempts to handle the problem of interdependence and multicollinearity among software metrics.

Technique used is factor analysis, whose purpose is to decompose a set of correlated measures into a set of eigenvalues and eigenvectors.

Analysis of Complexity (cont’d)

Page 76: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

76

Analysis of Complexity (cont’d)

• The need for measurement
• The measurement process
• Measuring software change
• Faults and fault insertion
• Fault insertion rates

Page 77: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

77

Analysis of Complexity - Measuring Software

Module source code:

    {
        int i, j;
        for (i = 0; Array[i][0] != '\0'; i++) {
            j = strcmp(String, Array[i]);
            if (j > 0) continue;
            if (j < 0) return -1;
            else return i;
        }
        return -1;
    }

Metric analysis (CMA) produces module characteristics such as:

LOC 14, Stmts 12, N1 30, N2 23, eta1 15, eta2 12, ...

Page 78: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

78

Analysis of Complexity - Simplifying Measurements

[Flow diagram: each module's vector of raw metrics (e.g. 12 23 54 12 203 39 238 34) passes through metric analysis (CMA) and principal components analysis (PCA/RCM), which reduce it to a single relative complexity value per module (e.g. 50, 40, 60, 45, 55)]

Page 79: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

79

Analysis of Complexity - Relative Complexity

• Relative complexity is a synthesized metric
• Relative complexity is a fault surrogate
  • Composed of metrics closely related to faults
  • Highly correlated with faults

\rho_i^B = \sum_{j=1}^{m} \lambda_j^B d_{ij}^B

where λ_j^B is the eigenvalue associated with the j'th principal component of the baseline build B, and d_{ij}^B is the j'th domain score of module i.
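A sketch of computing a relative complexity measure via principal components (using numpy; the rescaling to mean 50 and standard deviation 10 is an assumption about the conventional RCM scaling, not something stated on the slide):

import numpy as np

def relative_complexity(raw):
    # raw: modules x metrics array of raw software metrics
    z = (raw - raw.mean(axis=0)) / raw.std(axis=0)   # standardize the metrics
    eigvals, eigvecs = np.linalg.eigh(np.cov(z, rowvar=False))
    order = np.argsort(eigvals)[::-1]                # principal components
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    d = z @ eigvecs                                  # domain scores d_ij
    rho = d @ eigvals                                # weight by eigenvalues
    # rescale so the baseline build has mean 50 and standard deviation 10
    return 50.0 + 10.0 * (rho - rho.mean()) / rho.std()

rng = np.random.default_rng(0)
metrics = rng.poisson(30, size=(8, 5)).astype(float)   # 8 modules, 5 metrics
print(relative_complexity(metrics))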

Page 80: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

80

Analysis of Complexity (cont’d)

• The need for measurement
• The measurement process
• Measuring software change
• Faults and fault insertion
• Fault insertion rates

Page 81: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

81

Analysis of Complexity (cont’d)

Software Evolution
• We assume that we are developing (maintaining) a program
• We are really working with many programs over time
• They are different programs in a very real sense
• We must identify and measure each version of each program module

Page 82: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

82

Analysis of Complexity (cont’d)

Evolution of the STS Primary Avionics Software System (PASS)

[Build-evolution diagram: several hundred numbered builds of PASS (1 through roughly 364) arranged as a tree; most builds lie on a main sequence, with branch builds (labeled A, B, and C) that later rejoin the main sequence or are pruned]

Page 83: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

83

Analysis of Complexity (cont'd)

The Problem

[Diagram: modules A-K in Build N map to the modules of Build N+1; module C disappears, new modules L and M appear, and the remaining modules persist across the two builds]

Page 84: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

84

Analysis of Complexity (cont’d)

Managing fault counts during evolution
• Some faults are inserted during branch builds
  • These fault counts must be removed when the branch is pruned
• Some faults are eliminated on branch builds
  • These faults must be removed from the main sequence build
• Fault count should contain only those faults on the main sequence to the current build
• Faults attributed to modules not in the current build must be removed from the current count

Page 85: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

85

Analysis of Complexity (cont’d)

Baselining a software system
• Software changes over software builds
• Measurements, such as relative complexity, change across builds
• Use the initial build as a baseline
• Compute the relative complexity of each build
• Measure the change in the fault surrogate from the initial baseline

Page 86: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

86

Analysis of Complexity - Measurement Baseline

[Figure: measurement baseline, showing a module's position at Point A and Point B]

Page 87: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

87

Analysis of Complexity - Baseline Components

• Vector of means \bar{x}^B
• Vector of standard deviations s^B
• Transformation matrix T^B

z_{ij}^B = \frac{w_{ij} - \bar{x}_j^B}{s_j^B}

d_i^B = z_i^B\,T^B

Page 88: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

88

Analysis of Complexity - Comparing Two Builds

[Flow diagram: source code for Build i and Build j passes through the measurement tools; applying the baseline yields baselined RCM values for each build, and the RCM deltas between the baselined builds produce the code deltas and code churn]

Page 89: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

89

Analysis of Complexity - Measuring Evolution

Different modules appear in different builds:
• M_a^{i,j} - set of modules not in the latest build
• M_b^{i,j} - set of modules not in the early build
• M_c^{i,j} - set of common modules

Code delta for module a between builds i and j:

\delta_a^{i,j} = \rho_a^{B_j} - \rho_a^{B_i}

Code churn for module a:

\chi_a^{i,j} = \left|\rho_a^{B_j} - \rho_a^{B_i}\right|

Net code churn sums the churn of the common modules together with the relative complexities of the modules added or removed between the two builds.
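A sketch of computing code delta and churn across two builds (the treatment of added and deleted modules in the net churn follows the wording above and is an assumption):

def evolution_measures(rho_i, rho_j):
    # rho_i, rho_j: module name -> baselined relative complexity in the
    # early build i and the later build j
    common = rho_i.keys() & rho_j.keys()
    added = rho_j.keys() - rho_i.keys()
    deleted = rho_i.keys() - rho_j.keys()
    delta = sum(rho_j[m] - rho_i[m] for m in common)        # code delta
    churn = sum(abs(rho_j[m] - rho_i[m]) for m in common)   # code churn
    # assumption: added/deleted modules contribute their full complexity
    net_churn = churn + sum(rho_j[m] for m in added) + \
        sum(rho_i[m] for m in deleted)
    return delta, churn, net_churn

build_i = {"A": 52.0, "B": 47.5, "C": 55.0}
build_j = {"A": 58.0, "B": 47.0, "L": 50.5}   # module C pruned, L added
print(evolution_measures(build_i, build_j))   # (5.5, 6.5, 112.0)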

Page 90: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

90

Analysis of Complexity (cont’d)

• The need for measurement
• The measurement process
• Measuring software change
• Faults and fault insertion
• Fault insertion rates

Page 91: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

91

Analysis of Complexity - Fault Insertion

[Diagram: the existing faults of Build N carry forward into Build N+1, less the faults removed and plus the faults added by the intervening changes]

Page 92: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

92

Analysis of Complexity - Identifying and Counting Faults

• Unlike failures, faults are not directly observable
  • Fault counts should be at the same level of granularity as the software structure metrics
• Failure counts could be used as a surrogate for fault counts if:
  • The number of faults were related to the number of failures
  • The distribution of the number of faults per failure had low variance
  • The faults associated with a failure were confined to a single procedure/function
• The actual situation is shown on the next slide

Page 93: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

93

Analysis of Complexity - Observed Distribution of Faults per Failure

Statistics - Defects per Failure: N = 30 valid, 0 missing; Mean = 10.5667; Median = 7.5000; Std. Deviation = 9.3428; Percentiles: 25th = 3.7500, 50th = 7.5000, 75th = 13.2500

[Histogram: Distribution of Defects per Failure - frequency of defects per reported failure, for roughly 1 to 39 defects per failure]

Page 94: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

94

Analysis of Complexity - Fault Identification and Counting Rules

Taxonomy based on corrective actions taken in response to failure reports:

Faults in variable usage:
• Definition and use of new variables
• Redefinition of existing variables (e.g. changing type from float to double)
• Variable deletion
• Assignment of a different value to a variable

Faults involving constants:
• Definition and use of new constants
• Constant definition deletion

Page 95: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

95

Analysis of Complexity - Fault Identification and Counting Rules (cont’d)

Control flow faults:
• Addition of new source code block
• Deletion of erroneous conditionally-executed path(s) within a set of conditionally executed statements
• Addition of execution paths within a set of conditionally executed statements
• Redefinition of existing condition for execution (e.g. change "if i < 9" to "if i <= 9")
• Removal of source code block
• Incorrect order of execution
• Addition of a procedure or function
• Deletion of a procedure or function

Page 96: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

96

Analysis of Complexity (cont'd)

Control flow fault examples - removing execution paths from a code block.

[Code example not reproduced] Counts as two faults, since two paths were removed.

Page 97: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

97

Analysis of Complexity (cont'd)

Control flow examples (cont'd) - addition of conditional execution paths to a code block.

[Code example not reproduced] Counts as three faults, since three paths were added.

Page 98: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

98

Analysis of Complexity - Estimating Fault Content

The fault potential of module i is directly proportional to its relative complexity:

r_i^0 = \frac{\rho_i^0}{R^0}, \qquad R^0 = \sum_i \rho_i^0

From previous development projects, develop a proportionality constant, k, for total faults:

F^0 = k\,R^0

Faults per module:

g_i^0 = r_i^0\,F^0

Page 99: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

99

Analysis of Complexity - Estimating Fault Insertion Rate

A second proportionality constant, k', represents the rate of fault insertion as the system changes.

For the j'th build, the total fault burden is:

F^j = k\,R^0 + k'\chi^{0,j}

The estimate for the fault insertion rate between builds j and j+1 is:

F^{j+1} - F^j = \left(kR^0 + k'\chi^{0,j+1}\right) - \left(kR^0 + k'\chi^{0,j}\right) = k'\left(\chi^{0,j+1} - \chi^{0,j}\right) = k'\chi^{j,j+1}

Page 100: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

100

Analysis of Complexity (cont’d)

• The need for measurement
• The measurement process
• Measuring software change
• Faults and fault insertion
• Fault insertion rates

Page 101: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

101

Analysis of Complexity - Relationships Between Change in Fault Count and Structural Change

χ = code churn, δ = code delta

Version ID     Number of Cases   Pearson's R (χ)   Pearson's R (δ)   Spearman (χ)   Spearman (δ)
Version 3      10                .376              -.323             .451           -.121
Version 4      12                .661              .700              .793           .562
Version 5      13                .891              -.276             .871           -.233
Versions 3-5   35                .568              .125              .631           .087

Page 102: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

102

Analysis of Complexity - Regression Models

d_{j,j+1} is the number of faults inserted between builds j and j+1
χ_{j,j+1} is the measured code churn between builds j and j+1
δ_{j,j+1} is the measured code delta between builds j and j+1

Model                          b0      b1      t(b1)   b2      t(b2)   R       R^2     Adj. R^2   Residual SS   DF
d = b0 + b1 χ                  1.507   0.373   3.965   -       -       0.568   0.323   0.302      140.812       33
d = b0 + b1 χ + b2 δ           1.312   0.460   4.972   0.172   2.652   0.667   0.445   0.412      115.441       32
d = b1 χ                       -       0.576   7.937   -       -       0.806   0.649   0.639      179.121       34
d = b1 χ + b2 δ                -       0.647   9.172   0.201   2.849   0.846   0.719   0.702      143.753       33

Page 103: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

103

Analysis of Complexity - PRESS Scores - Linear vs. Nonlinear Models

Effect            Linear Models (excluding churn = 0)   Nonlinear Models (excluding churn = 0)
Churn only        186.090                               205.718
Churn and Delta   159.875                               157.831

Page 104: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

104

Analysis of Complexity - Selecting an Adequate Linear Model

• The linear model gives the best R^2 and PRESS score.
• Is the model based only on code churn an adequate predictor at the 5% significance level?
• The R^2-adequate test shows that code churn alone is not an adequate predictor at the 5% significance level.

Page 105: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

105

Analysis of Complexity - Analysis of Predicted Residuals

[Scatter plot: predicted residuals vs. number of observed defects (versions 3-5), for the model d_{j,j+1} = b1 χ_{j,j+1} + b2 δ_{j,j+1}]

Wilcoxon Test Results - Predictions for Excluded Observations
Sample pair: observed defects and estimated defects from excluded observations, with estimated defects predicted by d_{j,j+1} = b1 χ_{j,j+1} + b2 δ_{j,j+1}
Negative ranks: 10 (mean rank 19.20, sum of ranks 192.00)
Positive ranks: 25 (mean rank 17.52, sum of ranks 438.00)
Ties: 0; Total: 35
Test statistic Z = -2.015; asymptotic significance (2-tailed) = 0.044

Page 106: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

106

Regression Tree Modeling

Objectives:
• Attractive way to encapsulate the knowledge of experts and to aid decision making
• Uncovers structure in data
• Can handle data with complicated and unexplained irregularities
• Can handle both numeric and categorical variables in a single model

Page 107: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

107

Algorithm:
• Determine a set of predictor variables (software metrics) and a response variable (number of faults).
• Partition the predictor variable space such that each partition or subset is homogeneous with respect to the dependent variable.
• Establish a decision rule based on the predictor variables which will identify the programs with the same number of faults.
• Predict the value of the dependent variable as the average of all the observations in the partition.

Regression Tree Modeling (cont'd)

Page 108: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

108

Algorithm (cont'd):
• Minimize the deviance function given by:

D\left(\hat{\mu}\right) = \sum_i \left(y_i - \hat{\mu}_i\right)^2

• Establish stopping criteria based on:
  • Cardinality threshold - leaf node is smaller than a certain absolute size.
  • Homogeneity threshold - deviance of leaf node is less than some small percentage of the deviance of the root node; i.e., the leaf node is homogeneous enough.

Regression Tree Modeling (cont'd)
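A sketch of the partitioning idea using scikit-learn's DecisionTreeRegressor, a modern stand-in for the tree tools of the period (min_samples_leaf plays the role of the cardinality threshold, ccp_alpha controls pruning, and the data are synthetic):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.poisson(40, size=(200, 4)).astype(float)   # metrics for 200 modules
y = 0.05 * X[:, 0] + 0.02 * X[:, 2] + rng.normal(0.0, 0.5, 200)  # fault counts

tree = DecisionTreeRegressor(min_samples_leaf=10, ccp_alpha=0.01)
tree.fit(X, y)               # splits minimize the squared-error deviance
print(tree.predict(X[:5]))   # predictions are the leaf-node averages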

Page 109: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

109

Regression Tree Modeling (cont'd)

Application:
• Software for medical imaging system, consisting of 4500 modules amounting to 400,000 lines of code written in Pascal, FORTRAN, assembly language, and PL/M.
• Random sample of 390 modules from the ones written in Pascal and FORTRAN, consisting of about 40,000 lines of code.
• Software was developed over a period of five years, and had been in use at several hundred sites.
• Number of changes made to the executable code documented by Change Reports (CRs) indicates software development effort.


Page 111: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

111

Application (cont'd) - Metrics:
• Total lines of code (TC)
• Number of code lines (CL)
• Number of characters (Cr)
• Number of comments (Cm)
• Comment characters (CC)
• Code characters (Co)
• Halstead's program length
• Halstead's estimate of program length metric
• Jensen's estimate of program length metric
• Cyclomatic complexity metric
• Bandwidth metric

Regression Tree Modeling (cont’d)

Page 112: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

112

Regression Tree Modeling (cont'd)

Pruning:
• The tree grown using the stopping rules is too elaborate.
• Pruning is equivalent to variable selection in linear regression.
• Pruning determines a nested sequence of subtrees of the given tree by recursively snipping off partitions with minimal gains in deviance reduction.
• The degree of pruning can be determined by using cross-validation.


Page 114: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

114

Cross-Validation:
• Evaluate the predictive performance of the regression tree and the degree of pruning in the absence of a separate validation set.
• Data are divided into two mutually exclusive sets, viz., a learning sample and a test sample.
• The learning sample is used to grow the tree, while the test sample is used to evaluate the tree sequence.
• Deviance is the measure used to assess the performance of the prediction rule in predicting the number of errors for the test sample at different tree sizes.

Regression Tree Modeling (cont’d)

Page 115: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

115

Performance Analysis:
• Two types of errors:
  • Predicting more faults than the actual number - Type I misclassification.
  • Predicting fewer faults than the actual number - Type II error.
• Type II error is more serious.
• The Type II error in the case of tree modeling is 8.7%, and in the case of fault density it is 13.1%.
• The tree modeling approach is significantly better than the fault density approach.
• Can also be used to classify modules into fault-prone and non fault-prone categories.
• Decision rule - classifies the module as fault-prone if the predicted number of faults is greater than a certain cutoff a. The choice of a determines the misclassification rate.

Regression Tree Modeling (cont’d)

Page 116: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

116

Part VIII: Software Reliability Tools

• SRMP
• SMERFS
• CASRE

Page 117: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

117

Where Do They Come From?
• Software Reliability Modeling Program (SRMP)
  • Bev Littlewood of City University, London
• Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS)
  • William Farr of Naval Surface Warfare Center
• Computer-Aided Software Reliability Estimation Tool (CASRE)
  • Allen Nikora, JPL & Michael Lyu, Chinese University of Hong Kong

Page 118: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

118

SRMP Main Features
• Multiple Models (9)
• Model Application Scheme: Multiple Iterations
• Data Format: Time-Between-Failures Data Only
• Parameter Estimation: Maximum Likelihood
• Multiple Evaluation Criteria - Prequential Likelihood, Bias, Bias Trend, Model Noise
• Simple U-Plots and Y-Plots

Page 119: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

119

SMERFS Main Features
• Multiple Models (12)
• Model Application Scheme: Single Execution
• Data Format: Failure-Counts and Time-Between-Failures
• On-line Model Description Manual
• Two Parameter Estimation Methods
  • Least Squares Method
  • Maximum Likelihood Method
• Goodness-of-Fit Criteria: Chi-Square Test, KS Test
• Model Applicability - Prequential Likelihood, Bias, Bias Trend, Model Noise
• Simple Plots

Page 120: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

120

The SMERFS Tool Main Menu
• Data Input
• Data Edit
• Data Transformation
• Data Statistics
• Plots of the Raw Data
• Model Applicability Analysis
• Executions of the Models
• Analyses of Model Fit
• Stop Execution of SMERFS

Page 121: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

121

CASRE Main Features
• Multiple Models (12)
• Model Application Scheme: Multiple Iterations
• Goodness-of-Fit Criteria - Chi-Square Test, KS Test
• Multiple Evaluation Criteria - Prequential Likelihood, Bias, Bias Trend, Model Noise
• Conversions between Failure-Counts Data and Time-Between-Failures Data
• Menu-Driven, High-Resolution Graphical User Interface
• Capability to Make Linear Combination Models

Page 122: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

122

CASRE High-Level Architecture

Page 123: 1 Software Reliability Engineering: Techniques and Tools CS130 Winter, 2002

123

Further Reading

A. A. Abdel-Ghaly, P. Y. Chan, and B. Littlewood, "Evaluation of Competing Software Reliability Predictions," IEEE Transactions on Software Engineering, vol. SE-12, pp. 950-967, Sep. 1986.

T. Bowen, "Software Quality Measurement for Distributed Systems," RADC TR-83-175.

W. K. Erlich, A. Iannino, B. S. Prasanna, J. P. Stampfel, and J. R. Wu, "How Faults Cause Software Failures: Implications for Software Reliability Engineering," proceedings of the International Symposium on Software Reliability Engineering, pp. 233-241, May 17-18, 1991, Austin, TX.

M. E. Fagan, "Advances in Software Inspections," IEEE Transactions on Software Engineering, vol. SE-12, no. 7, July 1986, pp. 744-751.

M. E. Fagan, "Design and Code Inspections to Reduce Errors in Program Development," IBM Systems Journal, vol. 15, no. 3, pp. 182-211, 1976.

W. H. Farr, O. D. Smith, and C. L. Schimmelpfenneg, "A PC Tool for Software Reliability Measurement," proceedings of the 1988 Institute of Environmental Sciences, King of Prussia, PA.

124

Further Reading (cont'd)

W. H. Farr and O. D. Smith, "Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS) User's Guide," Naval Surface Warfare Center, December 1988 (approved for unlimited public distribution by NSWC).

J. E. Gaffney, Jr. and C. F. Davis, "An Approach to Estimating Software Errors and Availability," SPC-TR-88-007, version 1.0, March 1988, proceedings of the Eleventh Minnowbrook Workshop on Software Reliability, July 26-29, 1988, Blue Mountain Lake, NY.

J. E. Gaffney, Jr. and J. Pietrolewicz, "An Automated Model for Software Early Error Prediction (SWEEP)," proceedings of the Thirteenth Minnowbrook Workshop on Software Reliability, July 24-27, 1990, Blue Mountain Lake, NY.

A. L. Goel and S. N. Sahoo, "Formal Specifications and Reliability: An Experimental Study," proceedings of the International Symposium on Software Reliability Engineering, pp. 139-142, May 17-18, 1991, Austin, TX.

A. Grnarov, J. Arlat, and A. Avizienis, "On the Performance of Software Fault-Tolerance Strategies," proceedings of the Tenth International Symposium on Fault Tolerant Computing (FTCS-10), Kyoto, Japan, October 1980, pp. 251-253.

125

Further Reading (cont'd)

K. Kanoun, M. Bastos Martini, and J. Moreira De Souza, "A Method for Software Reliability Analysis and Prediction - Application to the TROPICO-R Switching System," IEEE Transactions on Software Engineering, April 1991, pp. 334-344.

J. C. Kelly, J. S. Sherif, and J. Hops, "An Analysis of Defect Densities Found During Software Inspections," Journal of Systems Software, vol. 17, pp. 111-117, 1992.

T. M. Khoshgoftaar and J. C. Munson, "A Measure of Software System Complexity and its Relationship to Faults," proceedings of the 1992 International Simulation Technology Conference and 1992 Workshop on Neural Networks (SIMTEC '92, sponsored by the Society for Computer Simulation), pp. 267-272, November 4-6, 1992, Clear Lake, TX.

M. Lu, S. Brocklehurst, and B. Littlewood, "Combination of Predictions Obtained from Different Software Reliability Growth Models," proceedings of the IEEE 10th Annual Software Reliability Symposium, pp. 24-33, June 25-26, 1992, Denver, CO.

M. Lyu, ed., Handbook of Software Reliability Engineering, McGraw-Hill and IEEE Computer Society Press, 1996, ISBN 0-07-039400-8.

126

Further Reading (cont'd)

M. Lyu, "Measuring Reliability of Embedded Software: An Empirical Study with JPL Project Data," proceedings of the International Conference on Probabilistic Safety Assessment and Management, February 4-6, 1991, Los Angeles, CA.

M. Lyu and A. Nikora, "A Heuristic Approach for Software Reliability Prediction: The Equally-Weighted Linear Combination Model," proceedings of the IEEE International Symposium on Software Reliability Engineering, May 17-18, 1991, Austin, TX.

M. Lyu and A. Nikora, "Applying Reliability Models More Effectively," IEEE Software, vol. 9, no. 4, pp. 43-52, July 1992.

M. Lyu and A. Nikora, "Software Reliability Measurements Through Combination Models: Approaches, Results, and a CASE Tool," proceedings of the 15th Annual International Computer Software and Applications Conference (COMPSAC '91), September 11-13, 1991, Tokyo, Japan.

J. McCall, W. Randall, S. Fenwick, C. Bowen, P. Yates, N. McKelvey, M. Hecht, H. Hecht, R. Senn, J. Morris, and R. Vienneau, "Methodology for Software Reliability Prediction and Assessment," Rome Air Development Center (RADC) Technical Report RADC-TR-87-171, volumes 1 and 2, 1987.

J. Munson and T. Khoshgoftaar, "The Use of Software Metrics in Reliability Models," presented at the initial meeting of the IEEE Subcommittee on Software Reliability Engineering, April 12-13, 1990, Washington, DC.

127

Further Reading (cont'd)

J. C. Munson, "Software Measurement: Problems and Practice," Annals of Software Engineering, J. C. Baltzer AG, Amsterdam, 1995.

J. C. Munson, "Software Faults, Software Failures, and Software Reliability Modeling," Information and Software Technology, December 1996.

J. C. Munson and T. M. Khoshgoftaar, "Regression Modeling of Software Quality: An Empirical Investigation," Journal of Information and Software Technology, 32, 1990, pp. 105-114.

J. Munson and A. Nikora, "Estimating Rates of Fault Insertion and Test Effectiveness in Software Systems," invited paper, proceedings of the Fourth ISSAT International Conference on Quality and Reliability in Design, Seattle, WA, August 12-14, 1998.

John D. Musa, Anthony Iannino, and Kazuhiro Okumoto, Software Reliability: Measurement, Prediction, Application, McGraw-Hill, 1987, ISBN 0-07-044093-X.

A. Nikora and J. Munson, "Finding Fault with Faults: A Case Study," presented at the Annual Oregon Workshop on Software Metrics, May 11-13, 1997, Coeur d'Alene, ID.

A. Nikora, N. Schneidewind, and J. Munson, "IV&V Issues in Achieving High Reliability and Safety in Critical Control System Software," proceedings of the Third ISSAT International Conference on Reliability and Quality in Design, March 12-14, 1997, Anaheim, CA.

128

Further Reading (cont'd)

A. Nikora and J. Munson, "Determining Fault Insertion Rates For Evolving Software Systems," proceedings of the Ninth International Symposium on Software Reliability Engineering, Paderborn, Germany, November 4-7, 1998.

Norman F. Schneidewind and Ted W. Keller, "Applying Reliability Models to the Space Shuttle," IEEE Software, pp. 28-33, July 1992.

N. Schneidewind, "Reliability Modeling for Safety-Critical Software," IEEE Transactions on Reliability, March 1997, pp. 88-98.

N. Schneidewind, "Measuring and Evaluating Maintenance Process Using Reliability, Risk, and Test Metrics," proceedings of the International Conference on Software Maintenance, September 29-October 3, 1997, Bari, Italy.

N. Schneidewind, "Software Metrics Model for Integrating Quality Control and Prediction," proceedings of the 8th International Symposium on Software Reliability Engineering, November 2-5, 1997, Albuquerque, NM.

N. Schneidewind, "Software Metrics Model for Quality Control," proceedings of the Fourth International Software Metrics Symposium, November 5-7, 1997, Albuquerque, NM.

129

Additional Information

CASRE Screen Shots
Further Modeling Details:
  Additional Software Reliability Models
  Quantitative Criteria for Model Selection - the Subadditivity Property
  Increasing the Predictive Accuracy of Models

130

CASRE - Initial Display

131

CASRE - Applying Filters

132

CASRE - Running Average Trend Test

133

CASRE - Laplace Test

134

CASRE - Selecting and Running Models

135

CASRE - Displaying Model Results

136

CASRE - Displaying Model Results (cont’d)

137

CASRE - Prequential Likelihood Ratio

138

CASRE - Model Bias

139

CASRE - Model Bias Trend

140

CASRE - Ranking Models

141

CASRE - Model Ranking Details

142

CASRE - Model Ranking Details (cont’d)

143

CASRE - Model Results Table

144

CASRE - Model Results Table (cont’d)

145

CASRE - Model Results Table (cont’d)

146

Additional Software Reliability Models

Software Reliability Estimation Models:

Exponential NHPP Models:
  Generalized Poisson Model
  Non-homogeneous Poisson Process Model
  Musa Basic Model
  Musa Calendar Time Model
  Schneidewind Model
Littlewood-Verrall Bayesian Model
Hyperexponential Model

147

Generalized Poisson Model

Proposed by Schafer, Alter, Angus, and Emoto for Hughes Aircraft Company under contract to RADC in 1979.

The model is analogous in form to the Jelinski-Moranda model, but taken within the error count framework. It can be shown to reduce to the Jelinski-Moranda model under the appropriate circumstances.

148

Generalized Poisson Model (cont'd)

Assumptions:

1. The expected number of errors occurring in any time interval is proportional to the error content at the time of testing and to some function of the amount of time spent in error testing.
2. All errors are equally likely to occur and are independent of each other.
3. Each error is of the same order of severity as any other error.
4. The software is operated in a similar manner as the anticipated usage.
5. The errors are corrected at the ends of the testing intervals without introduction of new errors into the program.

149

Generalized Poisson Model (cont'd)

Construction of Model:

Given testing intervals of length X1, X2, ..., Xn, with $f_i$ errors discovered during the i'th interval, and a total of $M_i$ errors corrected by the end of the i'th interval, the first assumption of the model yields:

    $E(f_i) = \phi\,(N - M_{i-1})\,g_i(x_1, x_2, \ldots, x_i)$

where $\phi$ is a proportionality constant, N is the initial number of errors, and $g_i$ is a function of the amount of testing time spent, previously and currently; $g_i$ is usually non-decreasing. If $g_i(x_1, x_2, \ldots, x_i) = x_i$, the model reduces to the Jelinski-Moranda model.
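As a concrete illustration (not part of the original slides), the following minimal Python sketch evaluates the expected per-interval error counts under the Jelinski-Moranda reduction $g_i(x_1, \ldots, x_i) = x_i$; the parameter values and data are illustrative assumptions, not fitted estimates.

```python
# Sketch: expected errors per testing interval under the Generalized
# Poisson model with g_i(x_1, ..., x_i) = x_i (Jelinski-Moranda case).
# phi, N, interval lengths, and corrected counts are illustrative.

def expected_errors(phi, N, lengths, corrected):
    """E(f_i) = phi * (N - M_{i-1}) * x_i for each interval i."""
    expected = []
    M_prev = 0  # M_0: no errors corrected before testing starts
    for x_i, M_i in zip(lengths, corrected):
        expected.append(phi * (N - M_prev) * x_i)
        M_prev = M_i  # becomes M_{i-1} for the next interval
    return expected

# Illustrative data: 4 intervals, 100 initial errors, phi = 0.002
print(expected_errors(0.002, 100, [10, 10, 20, 20], [5, 12, 20, 30]))
```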

150

Schneidewind Model

Proposed by Norman Schneidewind in 1975.

Model's basic premise is that as the testing progresses over time, the error detection process changes. Therefore, recent error counts are usually of more use than earlier counts in predicting future error counts.

Schneidewind identifies three approaches to using the error count data. These are identified in the following slide.

151

Schneidewind Model (cont'd)

The first approach is to use all of the error counts for all testing intervals.

The second approach is to use only the error counts from test intervals s through m, and to ignore completely the error counts from the first s - 1 test intervals, assuming that there have been m test intervals to date.

The third approach is a hybrid approach which uses the cumulative error count for the first s - 1 intervals and the individual error counts for the last m - s + 1 intervals.

152

Schneidewind Model (cont'd)

Assumptions:

1. The number of errors detected in one interval is independent of the error count in another.

2. The error correction rate is proportional to the number of errors to be corrected.

3. The software is operated in a similar manner as the anticipated operational usage.

4. The mean number of detected errors decreases from one interval to the next.

5. The intervals are all of the same length.

153

Schneidewind Model (cont'd)

Assumptions (cont'd):

6. The rate of error detection is proportional to the number of errors within the program at the time of test. The error detection process is assumed to be a nonhomogeneous Poisson process with an exponentially decreasing error detection rate. For the i'th interval, this rate is:

    $d_i = \alpha e^{-\beta i}$

where $\alpha > 0$ and $\beta > 0$ are constants of the model.

157

Schneidewind Model (cont'd)

From assumption 6, the cumulative mean number of errors through the i'th interval is:

    $D_i = \frac{\alpha}{\beta}\left(1 - e^{-\beta i}\right)$

and the mean number of errors for the i'th interval is:

    $m_i = D_i - D_{i-1} = \frac{\alpha}{\beta}\left(e^{-\beta (i-1)} - e^{-\beta i}\right)$
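A minimal sketch (illustrative values, not fitted estimates) that evaluates these two formulas:

```python
import math

# Sketch: Schneidewind model mean error counts per interval, using
# D_i = (alpha/beta) * (1 - exp(-beta * i)) and m_i = D_i - D_{i-1}.
# alpha and beta below are illustrative, not fitted, values.

def schneidewind_counts(alpha, beta, intervals):
    D = lambda i: (alpha / beta) * (1.0 - math.exp(-beta * i))
    return [D(i) - D(i - 1) for i in range(1, intervals + 1)]

# Expected errors for 5 equal-length intervals with alpha=20, beta=0.3
print([round(m, 2) for m in schneidewind_counts(20.0, 0.3, 5)])
```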

158

Nonhomogeneous Poisson Process

Proposed by Amrit Goel of Syracuse University and Kazu Okumoto in 1979.

Model assumes that the error counts over non-overlapping time intervals follow a Poisson distribution.

It is also assumed that the expected number of errors in an interval of time is proportional to the remaining number of errors in the program at that time.

159

Nonhomogeneous Poisson Process (cont'd)

Assumptions:

1. The software is operated in a similar manner as the anticipated operational usage.

2. The numbers of errors $(f_1, f_2, f_3, \ldots, f_m)$ detected in each of the respective time intervals $[(0, t_1), (t_1, t_2), \ldots, (t_{m-1}, t_m)]$ are independent for any finite collection of times $t_1 < t_2 < \ldots < t_m$.

3. Every error has the same chance of being detected and is of the same severity as any other error.

4. The cumulative number of errors detected at any time t, N(t), follows a Poisson distribution with mean m(t). m(t) is such that the expected number of error occurrences in any interval $(t, t + \Delta t)$ is proportional to the expected number of undetected errors at time t.

160

Nonhomogeneous Poisson Process (cont'd)

Assumptions (cont'd):

5. The expected cumulative number of errors function, m(t), is assumed to be a bounded, nondecreasing function of t with:

    $m(0) = 0$
    $m(t) \to a$ as $t \to \infty$

where a is the expected total number of errors to be eventually detected in the testing process.

161

Nonhomogeneous Poisson Process (cont'd)

Assumptions 4 and 5 give the number of errors discovered in the interval $(t, t + \Delta t)$ as:

    $m(t + \Delta t) - m(t) = b\,[a - m(t)]\,\Delta t + o(\Delta t)$

where b is a constant of proportionality and a is the total number of errors in the program. Solving this differential equation with the initial conditions $m(0) = 0$ and $m(\infty) = a$ yields:

    $m(t) = a\left(1 - e^{-bt}\right)$
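For concreteness, here is a minimal sketch of the resulting mean value function and its failure intensity $\lambda(t) = ab\,e^{-bt}$; the parameter values are illustrative, not fitted.

```python
import math

# Sketch: Goel-Okumoto NHPP mean value function m(t) = a(1 - e^(-bt))
# and failure intensity lambda(t) = m'(t) = a*b*e^(-bt).
# Parameter values below are illustrative, not fitted.

def go_mean(a, b, t):
    return a * (1.0 - math.exp(-b * t))

def go_intensity(a, b, t):
    return a * b * math.exp(-b * t)

a, b = 120.0, 0.05          # total errors, per-unit-time detection rate
for t in (0.0, 10.0, 50.0, 100.0):
    print(t, round(go_mean(a, b, t), 1), round(go_intensity(a, b, t), 3))
```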

162

Musa Basic Model

Assumptions:

1. The cumulative number of failures by time t, M(t), follows a Poisson process with the following mean value function:

    $\mu(t) = \beta_0\left[1 - \exp(-\beta_1 t)\right]$

2. Execution times between failures are piecewise exponentially distributed, i.e., the hazard rate for a single failure is constant.

163

Musa Basic Model (cont’d)

The mean value function is given on the previous slide. The failure intensity is given by:

    $\lambda(t) = \beta_0 \beta_1 \exp(-\beta_1 t)$

164

Musa Calendar Time

Developed by John Musa of AT&T Bell Labs between 1975 and 1980. This model is one of the most popular software reliability models, having been employed both inside and outside of AT&T.

The basic model (see previous slides) is based on the amount of CPU time that occurs between successive failures, rather than on wall clock time.

The calendar time component of the model attempts to relate CPU time to wall clock time by modeling the resources (failure identification personnel, failure correction personnel, and computer time) that may limit various time segments of testing.

165

Musa Calendar Time (cont'd)

Musa's basic model is essentially the same as the Jelinski-Moranda model. The importance of the calendar time component of the model is:

  Development of resource allocations (failure identification staff, failure correction staff, and computer time)
  Determining the relationship between CPU time and wall clock time

Let $dt_I/d\tau$ = instantaneous ratio of calendar time to execution time resulting from the effects of the failure identification staff
Let $dt_F/d\tau$ = instantaneous ratio of calendar time to execution time resulting from the effects of the failure correction staff
Let $dt_C/d\tau$ = instantaneous ratio of calendar time to execution time resulting from the effects of available computer time

166

Musa Calendar Time (cont'd)

The increment in calendar time is proportional to the average amount by which the limiting resource constrains testing over a given execution time segment:

    $\Delta t = \int_{\tau_1}^{\tau_2} \max\left(\frac{dt_I}{d\tau}, \frac{dt_F}{d\tau}, \frac{dt_C}{d\tau}\right) d\tau$

The resource requirements associated with a change in MTTF from $\tau_1$ to $\tau_2$ can be approximated by:

    $\Delta\chi_K = \theta_K \Delta\tau + \mu_K \Delta m$

where
    $\Delta\tau$ = execution time increment
    $\Delta m$ = increment of failures experienced
    $\theta_K$ = execution time coefficient of resource expenditure
    $\mu_K$ = failure coefficient of resource expenditure
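A minimal numerical sketch of the calendar-time increment, assuming illustrative stand-in functions for the three resource ratios (the real model derives these from staffing and utilization parameters):

```python
# Sketch: Musa calendar-time increment, approximated numerically.
# The limiting resource at each execution-time point tau is the one
# with the largest instantaneous calendar/execution ratio dt_K/dtau.
# The three ratio functions below are illustrative stand-ins.

def calendar_time_increment(ratios, tau_grid):
    """Trapezoid-rule estimate of the integral of max_K dt_K/dtau."""
    total = 0.0
    for lo, hi in zip(tau_grid, tau_grid[1:]):
        f_lo = max(r(lo) for r in ratios)
        f_hi = max(r(hi) for r in ratios)
        total += 0.5 * (f_lo + f_hi) * (hi - lo)
    return total

# Illustrative ratios: identification staff, correction staff, computer
dt_I = lambda tau: 1.5                  # constant staffing limit
dt_F = lambda tau: 2.0 - 0.01 * tau     # correction load easing over time
dt_C = lambda tau: 0.5                  # computer time rarely limiting

grid = [float(t) for t in range(0, 101, 10)]
print(round(calendar_time_increment([dt_I, dt_F, dt_C], grid), 2))
```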

167

Littlewood-Verrall Bayesian Model

Littlewood's model, a reformulation of the Jelinski-Moranda model, postulated that all errors do not contribute equally to the reliability of a program.

Littlewood's model postulates that the error rate, assumed to be a constant in the Jelinski-Moranda model, should be treated as a random variable.

The Littlewood-Verrall model of 1978 attempts to account for error generation in the correction process by allowing for the probability that the program could be made less reliable by correcting an error.

168

Littlewood-Verrall (cont'd)

Assumptions:

1. Successive execution times between failures are independent random variables with probability density functions:

    $f(x_i \mid \lambda_i) = \lambda_i e^{-\lambda_i x_i}$

where the $\lambda_i$ are the error rates.

2. The $\lambda_i$'s form a sequence of random variables, each with a gamma distribution of parameters $\alpha$ and $\psi(i)$, such that:

    $g(\lambda_i) = \frac{[\psi(i)]^{\alpha}\,\lambda_i^{\alpha - 1}\,e^{-\psi(i)\lambda_i}}{\Gamma(\alpha)}$

$\psi(i)$ is an increasing function of i that describes the "quality" of the programmer and the "difficulty" of the task.

169

Littlewood-Verrall (cont'd)

Assumptions (cont'd):

Imposing the constraint

    $P(\lambda_{j+1} < x) \geq P(\lambda_j < x)$ for any x

reflects the intention to make the program better by correcting errors. It also reflects the fact that sometimes corrections will make the program worse.

3. The software is operated in a similar manner as the anticipated operational usage.

170

Littlewood-Verrall (cont'd)

Integrating out the error rate gives the marginal density of $X_i$:

    $f(x_i \mid \alpha, \psi(i)) = \int_0^{\infty} f(x_i \mid \lambda_i)\,g(\lambda_i)\,d\lambda_i = \frac{\alpha\,[\psi(i)]^{\alpha}}{[x_i + \psi(i)]^{\alpha + 1}}$

This is a Pareto distribution.

171

Littlewood-Verrall (cont'd)

The joint density for the $X_i$'s is given by:

    $f(x_1, x_2, \ldots, x_n \mid \alpha, \psi(i)) = \prod_{i=1}^{n} \frac{\alpha\,[\psi(i)]^{\alpha}}{[x_i + \psi(i)]^{\alpha + 1}}$

For $\psi(i)$, Littlewood and Verrall suggest:

    $\psi(i) = \beta_0 + \beta_1 i$   or   $\psi(i) = \beta_0 + \beta_1 i^2$
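A minimal sketch (illustrative assumptions throughout) of the resulting joint log-likelihood for the linear family $\psi(i) = \beta_0 + \beta_1 i$; a real analysis would maximize this over $\alpha$, $\beta_0$, and $\beta_1$.

```python
import math

# Sketch: Littlewood-Verrall joint log-likelihood for interfailure
# times, using the linear family psi(i) = b0 + b1*i. The parameter
# values and data below are illustrative, not fitted.

def lv_log_likelihood(alpha, b0, b1, times):
    ll = 0.0
    for i, x in enumerate(times, start=1):
        psi = b0 + b1 * i
        # log of alpha * psi^alpha / (x + psi)^(alpha + 1)
        ll += math.log(alpha) + alpha * math.log(psi) \
              - (alpha + 1) * math.log(x + psi)
    return ll

interfailure_times = [3.0, 30.0, 113.0, 81.0, 115.0]  # illustrative
print(round(lv_log_likelihood(1.2, 10.0, 5.0, interfailure_times), 3))
```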

172

Hyperexponential Model

An extension to the classical exponential models. First considered by Ohba; addressed in variations by Yamada, Osaki, and Laprie et al.

The basic idea is that different sections of the software experience an exponential failure rate. Rates may vary over these sections to reflect their different natures.

173

Hyperexponential Model (cont’d)

Assumptions:

Suppose that there are K sections of the software such that within each class:

1. The rate of fault detection is proportional to the current fault content within that section of the software.

2. The fault detection rate remains constant over the intervals between fault occurrence.

3. A fault is corrected instantaneously without introducing new faults into the software.

4. Every fault has the same chance of being encountered and is of the same severity as any other fault.

174

Hyperexponential Model (cont'd)

Assumptions (cont'd):

1. The failures, when the faults are detected, are independent.

2. The cumulative number of failures by time t, $\mu(t)$, follows a Poisson process with the following mean value function:

    $\mu(t) = N \sum_{i=1}^{K} p_i \left[1 - \exp(-\beta_i t)\right]$

where $0 < p_i < 1$, $\sum_{i=1}^{K} p_i = 1$, $\beta_i > 0$ for $i = 1, \ldots, K$, and N is finite.
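A minimal sketch of the hyperexponential mean value function; the section weights and rates are assumed values for illustration.

```python
import math

# Sketch: hyperexponential mean value function
# mu(t) = N * sum_i p_i * (1 - exp(-beta_i * t)) for K software sections.
# Section weights and rates below are illustrative assumptions.

def hyperexp_mean(N, p, beta, t):
    assert abs(sum(p) - 1.0) < 1e-9, "section probabilities must sum to 1"
    return N * sum(pi * (1.0 - math.exp(-bi * t))
                   for pi, bi in zip(p, beta))

N = 80.0                       # expected total faults
p = [0.6, 0.3, 0.1]            # fraction of faults in each section
beta = [0.10, 0.02, 0.005]     # per-section detection rates
for t in (10.0, 50.0, 200.0):
    print(t, round(hyperexp_mean(N, p, beta, t), 1))
```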

175

Criteria for Model Selection

Quantitative Criteria Prior to Model Application
Subadditive Property Analysis

176

Subadditive Property Analysis

A common definition of reliability growth is that successive interfailure times tend to grow larger:

    $T_i \leq_{st} T_j$ for $i \leq j$

Under the assumption that the interfailure times are stochastically independent, this is equivalent to:

    $F_{T_i}(x) \geq F_{T_j}(x)$ for all $x \geq 0$ and $i \leq j$

177

Subadditive Property Analysis (cont’d)

As an alternative to the assumption of stochastic independence, we can consider that successive failures are governed by a non-homogeneous Poisson process:

    N(t) = cumulative number of failures observed in [0,t]
    H(t) = E[N(t)], mean value of cumulative failures
    h(t) = dH(t)/dt, failure intensity

A natural definition of reliability growth is that the increase in the expected number of failures, H(t), tends to become lower. However, there are situations where reliability growth may take place on average even though the failure intensity fluctuates locally.

178

Subadditive Property Analysis (cont'd)

To allow for local fluctuations, we can say that the expected number of failures in any initial interval (i.e., of the form [0,t]) is no less than the expected number of failures in any interval of the same length occurring later (i.e., in [x, x+t]).

The independent increment property of an NHPP allows the above definition to be written as:

    $H(t_1) + H(t_2) \geq H(t_1 + t_2)$

When the above inequality holds, H(t) is said to be subadditive. This definition of reliability growth allows there to be local intervals in which reliability decreases without affecting the overall trend of reliability increase.

179

Subadditive Property Analysis (cont'd)

We can interpret the subadditive property graphically. Consider the curve H(t), representing the mean value function for the cumulative number of failures, and the chord $L_T$ joining the two end points of the curve, at (0, H(0)) and (T, H(T)), as shown below.

Let $A_H(t)$ denote the difference between (1) the area delimited by H(t) and the coordinate axis and (2) the area delimited by $L_T$ and the coordinate axis. If H(t) is subadditive, then $A_H(t) \geq 0$ for all $t \in [0,T]$. We call $A_H(t)$ the subadditivity factor.

[Figure: the mean value curve H(t) over [0,T] with the chord $L_T$ joining its end points.]

180

Subadditive Property Analysis (cont'd)

Summary:

$A_H(t) \geq 0$ over [0,T] implies reliability growth on average over [0,T].
$A_H(t) \leq 0$ over [0,T] implies reliability decrease on average over [0,T].
$A_H(t)$ constant over [0,T] implies stable reliability on average over [0,T].
Changes in the sign of $A_H(t)$ indicate reliability trend changes.
When the derivative of $A_H(t)$ with respect to time, $A_h(t)$, is $\geq 0$ over a subinterval [t1,t2], this implies local reliability growth on average over [t1,t2].
$A_h(t) \leq 0$ over [t1,t2] implies local reliability decrease on average over [t1,t2].
Changes in the sign of $A_h(t)$ indicate local reliability trend changes.
Transient changes of the failure intensity are not detected by the subadditivity property.
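To make the subadditivity factor operational, here is a minimal sketch (illustrative, not from the slides) that approximates $A_H(t)$ from a cumulative failure history, using a piecewise-linear estimate of H(t) and the trapezoid rule; the failure times are assumed data.

```python
# Sketch: empirical subadditivity factor A_H(T) from cumulative failure
# counts. H is approximated by the piecewise-linear count curve; the
# chord L_T joins (0, 0) to (T, H(T)). Data below are illustrative.

def subadditivity_factor(times, counts):
    """Area between the (piecewise-linear) count curve and its chord."""
    T, H_T = times[-1], counts[-1]
    chord = lambda t: H_T * t / T
    area = 0.0
    prev_t, prev_d = 0.0, 0.0            # H(0) = 0, chord(0) = 0
    for t, n in zip(times, counts):
        d = n - chord(t)                 # curve minus chord at time t
        area += 0.5 * (prev_d + d) * (t - prev_t)   # trapezoid rule
        prev_t, prev_d = t, d
    return area  # >= 0 suggests reliability growth on average

times = [10, 25, 35, 60, 100, 160, 240]   # cumulative failure times
counts = [1, 2, 3, 4, 5, 6, 7]
print(round(subadditivity_factor(times, counts), 1))
```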

181

Increasing the Predictive Accuracy of Models

Introduction
Linear Combination Models
  Statically-Weighted Models
  Statically-Determined/Dynamically-Assigned Weights
  Dynamically-Determined/Dynamically-Assigned Weights
  Model Application Results
Model Recalibration

182

Problems in Reliability Modeling

Introduction:

Significant differences exist among the performances of software reliability models.
When software reliability models were first introduced, it was felt that a process of refinement would produce a "definitive" model.
The reality is that no single model can be determined a priori to be the best model during measurement.

183

The Reality Is In Between

[Figure: cumulative defects vs. elapsed time, with the observed reality falling between an optimistic projection and a pessimistic projection.]

184

Linear Combination Models

Forming Linear Combination Models:

(1) Identify a basic set of models (called component models)
(2) Select models that tend to cancel out in their biased predictions
(3) Keep track of the software failure data with all the component models
(4) Apply certain criteria to weigh the selected component models and form one or several Linear Combination Models for final predictions

185

A Set of Linear Combination Models

Selected component models: GO, MO, LV

(1) ELC: Equally-Weighted LC Model (Statically-Weighted)
(2) MLC: Median-Oriented LC Model (Statically-Determined/Dynamically-Assigned)
(3) ULC: Unequally-Weighted LC Model (Statically-Determined/Dynamically-Assigned)
(4) DLC: Dynamically-Weighted LC Model (Dynamically-Determined and Assigned); weighting is determined by dynamically calculating the posterior "prequential likelihood" of each model as a meta-predictor

186

A Set of Linear Combination Models (cont'd)

Equally-Weighted Linear Combination Model - apply equal weights to the selected component models and form the Equally-Weighted Linear Combination model for final predictions. One possible ELC is:

    $ELC = \frac{1}{3}GO + \frac{1}{3}MO + \frac{1}{3}LV$

These models were chosen as component models because:
  Their predictive validity has been observed in previous investigations.
  They represent different categories of models: exponential NHPP, logarithmic NHPP, and inverse-polynomial Bayesian.
  Their predictive biases tend to cancel for the five data sets analyzed.

187

A Set of Linear Combination Models (cont'd)

Median-Oriented Linear Combination Model - instead of choosing the arithmetic mean as was done for the ELC model, use the median value. This model's formulation is:

    $MLC = 0 \cdot P + 1 \cdot M + 0 \cdot O$

where:
    O = optimistic prediction
    M = median prediction
    P = pessimistic prediction

188

A Set of Linear Combination Models (cont'd)

Unequally-Weighted Linear Combination Model - instead of choosing the arithmetic mean as was done for the ELC model, use the PERT weighting scheme. This model's formulation is:

    $ULC = \frac{1}{6}P + \frac{4}{6}M + \frac{1}{6}O$

where:
    O = optimistic prediction
    M = median prediction
    P = pessimistic prediction
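For concreteness, here is a minimal sketch (not from the slides) of the three static combination schemes applied at a single prediction step; the three component predictions are illustrative stand-ins for GO, MO, and LV outputs.

```python
# Sketch: static linear-combination predictors over one prediction step.
# go, mo, lv are illustrative next-time-to-failure predictions from the
# three component models; a real tool would produce these per step.

def elc(go, mo, lv):
    """Equally-Weighted LC: arithmetic mean of the components."""
    return (go + mo + lv) / 3.0

def mlc(go, mo, lv):
    """Median-Oriented LC: the median component prediction."""
    return sorted([go, mo, lv])[1]

def ulc(go, mo, lv):
    """Unequally-Weighted LC: PERT weights 1/6, 4/6, 1/6 on P, M, O."""
    p, m, o = sorted([go, mo, lv])   # pessimistic, median, optimistic
    return (p + 4.0 * m + o) / 6.0

go, mo, lv = 41.0, 55.0, 73.0        # illustrative predictions
print(elc(go, mo, lv), mlc(go, mo, lv), ulc(go, mo, lv))
```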

189

A Set of Linear Combination Models (cont'd)

Dynamically-Weighted LC Model - use changes in one or more of the four previously defined criteria (e.g., prequential likelihood) to determine weights for each component model.

Changes can be observed over windows of varying size and type. Fixed or sliding windows can be used. Weights for component models vary over time.

[Figure: reference windows w_i, w_i+1, w_i+2 and their associated computation windows along the time axis, for both fixed and sliding window schemes.]
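As an illustration of the dynamic weighting idea (a sketch under assumed details, not CASRE's actual algorithm), the weights below are made proportional to each component model's prequential likelihood over a sliding window of recent one-step-ahead predictions.

```python
import math

# Sketch: DLC-style weight assignment. Each component model's weight is
# proportional to its prequential likelihood over the last `window`
# one-step-ahead predictions. The per-step predictive log densities are
# illustrative inputs; a real tool computes them from each model.

def dlc_weights(log_densities, window):
    """log_densities[k][j] = log predictive density of model k at step j."""
    scores = [sum(ld[-window:]) for ld in log_densities]  # log PL per model
    top = max(scores)
    unnorm = [math.exp(s - top) for s in scores]          # stable exponent
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Illustrative log densities for GO, MO, LV over 6 prediction steps
logd = [[-3.1, -2.9, -3.4, -3.0, -2.8, -3.2],   # GO
        [-3.0, -3.1, -3.0, -2.7, -2.6, -2.9],   # MO
        [-3.5, -3.3, -3.6, -3.4, -3.5, -3.3]]   # LV
print([round(w, 3) for w in dlc_weights(logd, window=4)])
```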

190

Data Set 1: Model Comparisons for Voyager Project

Recommended Models: 1. DLC; 2. ELC; 2. LV (ELC and LV tied)

191

Overall Model Comparisons Using All Four Criteria

Summary of Model Ranking for Each Data Set by All Four Criteria:

Data Set      JM   GO   MO   DU   LM   LV   ELC  ULC  MLC  DLC
RADC 1        10    9    1    6    8    6    4    2    3    5
RADC 2         9   10    6    7    8    1    4    5    2    2
RADC 3         6    8    4    9    9    6    4    3    2    1
Voyager       10    7    6    7    9    2    2    4    5    1
Galileo        5    7   10    6    9    4    1    3    8    2
Galileo CDS    8    6    6    8   10    1    1    1    4    5
Magellan       5    5    8    1    9   10    1    5    4    3
Alaska SAR     1    5    1    9    3   10    8    7    3    6
Sum of Rank   54   57   42   53   65   40   25   30   31   25
Handicap     +22  +25  +10  +21  +33   +8   -7   -2   -1   -7
Total Rank     8    9    6    7   10    5    1    3    4    1

192

Overall Model Comparisons by the Prequential Likelihood Measure

Model Ranking Summary for Each Data Set's Prequential Likelihood:

Data Set      JM   GO   MO   DU   LM   LV   ELC  ULC  MLC  DLC
RADC 1        10    9    2    8    6    7    5    4    3    1
RADC 2         7    9    4   10    7    1    4    4    3    2
RADC 3         4    7    4   10    8    9    2    2    4    1
Voyager       10    7    6    8    9    2    3    4    5    1
Galileo        5    7    9   10    5    4    2    3    8    1
Galileo CDS    6    5    8   10    6    2    3    4    8    1
Magellan       6    6    6    2    6    5    3    4    6    1
Alaska SAR     2    6    2   10    2    9    8    7    2    1
Sum of Rank   50   56   41   68   49   39   30   32   39    9
Handicap     +18  +24   +9  +36  +17   +7   -2    0   +7  -23
Total Rank     8    9    6   10    7    4    2    3    4    1

193

Summary of Long-Term Predictions

194

Possible Extensions of LC Models

1. Apply models other than GO, MO, and LV

2. Apply more than three component models

3. Apply other meta-predictors for weight assignments in DLC-type models

4. Apply user-determined weighting schemes, subject to project criteria and user judgment

5. Apply combination models as component models for a hybrid combination

195

Increasing the Predictive Accuracy of Models

Model Recalibration - developed by Brocklehurst, Chan, Littlewood, and Snell. It uses the model bias function (u-plot) to reduce bias in model predictions.

Let the random variable $T_i$ represent the next time to failure, based on the observed times to failure $t_1, t_2, \ldots, t_{i-1}$.

Let $F_i(t)$ represent the true, but hidden, distribution of the random variable $T_i$, and let $\hat{F}_i(t)$ represent the prediction of $F_i(t)$. The relationship between the estimated and true distributions can be written as:

    $F_i(t) = G_i[\hat{F}_i(t)]$

196

Increasing the Predictive Accuracy of Models (cont’d)

Model Recalibration (cont’d):

If $G_i$ were known, it would be possible to determine the true distribution of $T_i$ from the inaccurate predictor.

The key notion in recalibration is that the sequence $\{G_i\}$ is approximately stationary. Experience seems to show that $G_i$ changes only slowly in many cases. This opens the possibility of approximating $G_i$ with an estimate $G_i^*$ and using it to form a new prediction:

    $F_i^*(t) = G_i^*[\hat{F}_i(t)]$

197

Increasing the Predictive Accuracy of Models (cont’d)

Model Recalibration (cont’d):

A suitable estimator for $G_i$ is suggested by the observation that $G_i$ is the distribution function of $U_i = \hat{F}_i(T_i)$. The estimate $G_i^*$ is based on the u-plot calculated from predictions which have been made prior to $T_i$.

This new prediction recalibrates the model based on knowledge of the accuracy of past predictions. The simplest form of $G_i^*$ is the u-plot with the steps joined to form a polygon. Smooth versions can be constructed using spline techniques.

198

Increasing the Predictive Accuracy of Models (cont’d)

Model Recalibration (cont’d):

To recalibrate a model's predictions, follow these four steps:

1. Check that the error in previous predictions is approximately stationary. The y-plot can be used for this purpose.
2. Find the u-plot for predictions made before $T_i$, and join up the steps on the plot to form a polygon $G^*$ (alternatively, use a spline technique to construct a smooth version).
3. Use the basic model (e.g., the JM, LV, or MO models) to make a "raw" prediction $\hat{F}_i(t)$.
4. Recalibrate the raw prediction using:

    $F_i^*(t) = G^*[\hat{F}_i(t)]$
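The four steps can be sketched in a few lines of Python. This is an illustrative implementation of the polygon form of $G^*$, with an assumed raw predictive cdf and assumed past u-values rather than outputs from a real model.

```python
import bisect
import math

# Sketch: u-plot recalibration of a raw predictive cdf, following the
# four steps above. The raw cdf and past u-values are illustrative; a
# real application would take both from a fitted reliability model.

def uplot_polygon(us):
    """Empirical cdf of past u-values, with steps joined as a polygon."""
    xs = [0.0] + sorted(us) + [1.0]
    ys = [i / (len(xs) - 1) for i in range(len(xs))]
    def G_star(u):
        j = min(bisect.bisect_right(xs, u) - 1, len(xs) - 2)
        frac = (u - xs[j]) / (xs[j + 1] - xs[j])
        return ys[j] + frac * (ys[j + 1] - ys[j])
    return G_star

# Raw prediction: exponential cdf with an (illustratively) biased rate
raw_cdf = lambda t: 1.0 - math.exp(-t / 50.0)

# u-values from earlier one-step-ahead predictions (illustrative)
past_us = [0.81, 0.62, 0.91, 0.74, 0.55, 0.88]
G_star = uplot_polygon(past_us)

recalibrated_cdf = lambda t: G_star(raw_cdf(t))  # F*(t) = G*[F_hat(t)]
for t in (10.0, 50.0, 150.0):
    print(t, round(raw_cdf(t), 3), round(recalibrated_cdf(t), 3))
```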