Rescaling Reliability Bounds for a New Operational Profile Peter G Bishop pgb@adelard

Rescaling Reliability Bounds for a New Operational Profile

Peter G Bishop

[email protected]@csr.city.ac.uk

Adelard, Drysdale Building, Northampton Square, London EC1V 0HB +44 20 7490 9450 www.adelard.com

2

Outline of Talk

Original reliability bound theory (same op. profile)

Extended theory (different operational profile)

Implications of the theory

Experimental evaluation

3

Original Theory

[Figure: input domain containing defects D1, D2 and D3; the operational profile (I) determines the observed failure frequency of each defect.]

4

Theory assumptions

the operational profile is invariant, i.e. the defect failure rates λ are constant over time

when a failure occurs the associated defect is immediately and perfectly corrected

removal of a defect does not affect the λs of the remaining defects

5

Basic idea

Given some test interval t:

Defects with large λs will have been removed already

Defects with small λs will remain, but have little effect on program reliability

So there must be a “worst case” λ for a defect that maximises the program failure rate after t

6

Worst-case bound

Original paper showed that, given the assumptions, the max failure rate per unit time for a defect i is:

$\lambda_i \mid t \le \frac{1}{et}$ (where t is the test time)

So if there are N faults in the program, the failure rate at time t is bounded by:

$\lambda \mid t \le \frac{N}{et}$
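The worst case can be checked numerically. The sketch below assumes that a defect with failure rate λ survives test time t with probability exp(-λt), so its expected contribution to the failure rate after testing is λ·exp(-λt). The function names are illustrative and are not taken from the original paper.

```python
import math

def expected_failure_rate(lam: float, t: float) -> float:
    """Expected failure rate contributed by one defect after test time t.

    Assumes the defect (rate lam) survives testing with probability exp(-lam * t).
    """
    return lam * math.exp(-lam * t)

def worst_case_bound(n_faults: int, t: float) -> float:
    """Worst-case bound N / (e * t), reached when every defect has lam = 1 / t."""
    return n_faults / (math.e * t)

t = 100.0
# Scan failure rates from 1e-6 to 10 on a logarithmic grid.
rates = [10 ** (k / 10) for k in range(-60, 11)]
worst = max(expected_failure_rate(lam, t) for lam in rates)

print(f"largest value on the grid: {worst:.6e}")
print(f"bound 1/(e*t):             {worst_case_bound(1, t):.6e}")
```

The largest value on the grid occurs at λ = 1/t, which is why the bound does not depend on the actual defect failure rates.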

7

Bound is independent of λ

[Figure: log-log plot of probability of failure | t (0.00001 to 1) against t (1 to 10000) for λ = 0.1, λ = 0.01 and λ = 0.001, together with the 1/et bound.]

8

Refinement for discrete tests

For a discrete sequence of T tests the result is:

$\lambda \mid T \le N \left(\frac{T}{T+1}\right)^{T} \frac{1}{T+1}$

$\le \frac{N}{eT}$ (conservative approx.)

So it is conservative to use the original equation.
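A quick numerical comparison of the two forms, as a sketch. It assumes the discrete result comes from maximising p(1-p)^T over the per-test failure probability p, which gives p = 1/(T+1). The function names are illustrative.

```python
import math

def discrete_bound(n_faults: int, tests: int) -> float:
    """Worst case for T discrete tests: N * (T / (T + 1))**T / (T + 1)."""
    return n_faults * (tests / (tests + 1)) ** tests / (tests + 1)

def continuous_bound(n_faults: int, tests: int) -> float:
    """Conservative approximation N / (e * T)."""
    return n_faults / (math.e * tests)

for tests in (1, 10, 100, 1000):
    exact = discrete_bound(1, tests)
    approx = continuous_bound(1, tests)
    print(f"T={tests:5d}  discrete={exact:.6f}  N/(eT)={approx:.6f}  ratio={exact / approx:.4f}")
```

The ratio stays below 1 and approaches 1 as T grows, so the continuous form is only noticeably pessimistic for very short test sequences.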

9

Limitations

Assumes the operational profile I is constant, hence the λs are constant

But we know that in practice the profile changes

So the reliability bound does not apply if the operational profile changes (e.g. from system test to actual use), but it will “settle back” in the long term if the new profile is stable

New theory gives a means for “rescaling” the reliability bound for a different profile

10

Additional assumptions

Each defect is localised to a single code “block”

The operational profile I can be characterised by the distribution of code block executions Q in the program {q(1), q(2), … }

The failure rate of a defect in block i is proportional to the execution rate of the block: λ(i) ∝ q(i)

There is a constant probability of a fault existing in any line of executable code.

11

Rescaling for known defect

For a defect i in code block j, the re-scaled bound would be:

$\lambda'(i) \mid T \le \frac{q'(j)}{q(j)} \cdot \frac{1}{eT}$

where q’(j) is the new execution rate and q(j) is the old execution rate.

12

Probability of defect in block

We do not know which block contains defect i, but we assume that the chance of it being in block j is:

L(j)/L

where L(j) is the length of the code block, and L is the total length of the executable code.

13

Re-scaled bound

Taking the average over all blocks:

$\lambda'(i) \mid T \le \sum_j \frac{L(j)}{L} \frac{q'(j)}{q(j)} \cdot \frac{1}{eT}$

So the “scale factor” relative to the original bound is:

$S = \sum_j \frac{L(j)}{L} \frac{q'(j)}{q(j)}$

Also true if there are N faults rather than 1

14

Theory predictions - Fair testing

If q ∝ L for the blocks “dominated” by a decision branch, the scale factor is unchanged by any other profile

Applies to any acyclic graph, and to subgraphs with fixed iteration loops

Segment     j   L(j)   q(j)   q’(j)   L(j)·q’(j)/q(j)
Root        0   10     1      1       10
Branch 1    1   10     0.1    0.9     90
Branch 2    2   90     0.9    0.1     10
Sum             110                   110
S = Sum/L                             1
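The invariance can be checked on the three-segment example from the table. The sketch below assumes the scale factor is the sum over blocks of (L(j)/L)·q’(j)/q(j) and sweeps the operational split p between the two branches; the function name and the sweep values are illustrative.

```python
def scale_factor(lengths, q_test, q_new):
    """S = sum over blocks j of (L(j) / L) * q'(j) / q(j)."""
    total = sum(lengths)
    return sum(l * qn / qt for l, qt, qn in zip(lengths, q_test, q_new)) / total

lengths = [10, 10, 90]     # L(j) for root, branch 1, branch 2
q_fair = [1.0, 0.1, 0.9]   # test profile: branch execution rates proportional to L(j)

# Whatever fraction p of operational runs takes branch 1, S stays at 1.
for p in (0.0, 0.25, 0.5, 0.9, 1.0):
    s = scale_factor(lengths, q_fair, [1.0, p, 1.0 - p])
    print(f"p={p:.2f}  S={s:.3f}")
```

The terms in p cancel because 10/0.1 and 90/0.9 are equal, which is the q ∝ L condition.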

15

Unfair testing

Use of an “unbalanced” test profile can make the bound very sensitive to changes in profile

The factor can be less than 1 if under-tested blocks are avoided, e.g. Q’ = {1, 1, 0} gives S = 0.19

Segment     j   L(j)   q(j)   q’(j)   L(j)·q’(j)/q(j)
Root        0   10     1      1       10
Branch 1    1   10     0.9    0.1     1.1
Branch 2    2   90     0.1    0.9     810
Sum             110                   821.1
S = Sum/L                             7.5
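To see where the sensitivity comes from, the per-block terms of the scale factor can be listed separately. This is a sketch using the segment lengths and execution rates of the unfair example; the helper name and labels are illustrative.

```python
def block_contributions(lengths, q_test, q_new):
    """Per-block terms (L(j) / L) * q'(j) / q(j); their sum is the scale factor S."""
    total = sum(lengths)
    return [l / total * qn / qt for l, qt, qn in zip(lengths, q_test, q_new)]

names = ["Root", "Branch 1", "Branch 2"]
lengths = [10, 10, 90]
q_unfair = [1.0, 0.9, 0.1]   # test profile that under-tests the long branch

profiles = (
    ("branches reversed", [1.0, 0.1, 0.9]),
    ("branch 2 avoided", [1.0, 1.0, 0.0]),
)
for label, q_new in profiles:
    terms = block_contributions(lengths, q_unfair, q_new)
    detail = ", ".join(f"{n}: {t:.2f}" for n, t in zip(names, terms))
    print(f"{label}: S = {sum(terms):.2f} ({detail})")
```

Almost all of the increase comes from branch 2, the long segment that the test profile rarely executed.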

16

Limits to fair test approach

Fair test apportionment does not work for variable loops, recursion and subroutines

Even if we identify a fair test profile, it may be infeasible to execute

Decisions not independent (shared variable)

17

Maximum scale factor

If we know the max. possible execution rates for each block, we can estimate a “maximum scale factor”:

$\sum_k \frac{q(k)_{max}}{q(k)} \frac{L(k)}{L}$

where k relates to a worst case “thread” through the graph.

It is hard to identify this thread, but easier to compute a more pessimistic factor:

$\sum_j \frac{q(j)_{max}}{q(j)} \frac{L(j)}{L}$

where j includes all blocks.

No knowledge of the new profile is needed
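A sketch of the pessimistic all-blocks factor, reusing the three-segment example from the fair and unfair testing slides. The maximum execution rates (each block executed at most once per test) are an assumption made for illustration, as is the function name.

```python
def max_scale_factor(lengths, q_test, q_max):
    """Pessimistic factor: sum over all blocks j of (q_max(j) / q(j)) * (L(j) / L)."""
    total = sum(lengths)
    return sum(l * qm / qt for l, qt, qm in zip(lengths, q_test, q_max)) / total

lengths = [10, 10, 90]
q_max = [1.0, 1.0, 1.0]   # assumed: no block can run more than once per test

test_profiles = (
    ("fair", [1.0, 0.1, 0.9]),
    ("unfair", [1.0, 0.9, 0.1]),
)
for name, q_test in test_profiles:
    print(f"{name} test profile: max scale factor = {max_scale_factor(lengths, q_test, q_max):.2f}")
```

Under these assumptions the fair profile still gets a factor above 1, because the sum lets both branches run at their maximum rate at once, which no single thread through the graph can do.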

18

Including module tests

Can combine module tests and system tests; the composite scale factor is:

$S = \sum_j \frac{L(j)}{L} \cdot \frac{q'(j)}{q(j) + x(j)/T}$

where x(j) are the total executions under module testing

Module tests can “fill in” uncovered segments that would make the test profile “unbalanced”
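The effect of “filling in” can be sketched numerically. The code assumes the composite factor has the form Σ (L(j)/L)·q’(j)/(q(j) + x(j)/T), i.e. module-test executions x(j) are spread over the T system tests and added to the system-test execution rate. The module-test counts and the value of T are hypothetical.

```python
def composite_scale_factor(lengths, q_test, q_new, module_execs, system_tests):
    """S = sum_j (L(j) / L) * q'(j) / (q(j) + x(j) / T)  (assumed form)."""
    total = sum(lengths)
    return sum(
        l * qn / (qt + x / system_tests)
        for l, qt, qn, x in zip(lengths, q_test, q_new, module_execs)
    ) / total

lengths = [10, 10, 90]
q_test = [1.0, 0.9, 0.1]   # unbalanced system-test profile
q_new = [1.0, 0.1, 0.9]    # operational profile
T = 1000                   # hypothetical number of system tests

without_module = composite_scale_factor(lengths, q_test, q_new, [0, 0, 0], T)
# Hypothetical module testing: 800 extra executions of the under-tested branch 2.
with_module = composite_scale_factor(lengths, q_test, q_new, [0, 0, 800], T)

print(f"system tests only:      S = {without_module:.2f}")
print(f"with module tests:      S = {with_module:.2f}")
```

With no module executions the expression reduces to the ordinary scale factor, so the two strategies can be compared on the same footing.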

19

Experimental evaluation

Use programs with a known set of defects:

PODS: simple reactor trip application (<1000 code lines); simple structure, fixed loops

PREPRO: ~10 000 code lines; parses an input description file of indefinite length; recursive, so max execution unknown

Similar results - will only discuss PODS here

20

PODS evaluation

Measure Q for different test profiles: Uniform, Normal, Inverse normal (“bathtub”)

Measure defect failure rates λ(i) under all profiles

Predict residual failure rate:

$\sum_i \lambda(i) \exp(-\lambda(i) T)$

Compute failure rate for new profile:

$\sum_i \lambda'(i) \exp(-\lambda(i) T)$

Compare with scaled bound:

$\sum_j \frac{L(j)}{L} \frac{q'(j)}{q(j)} \cdot \frac{N}{eT}$
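The comparison steps can be sketched as follows. All numbers are hypothetical and are not PODS measurements: three defects, one per block of the three-segment example, with operational failure rates obtained from the assumption λ’(i) = λ(i)·q’(j)/q(j). Function and variable names are illustrative.

```python
import math

def residual_rate(rates_test, rates_now, tests):
    """Sum of rate_now(i) * exp(-rate_test(i) * T): defect i survives T tests
    under the test profile, then fails at rate_now(i)."""
    return sum(rn * math.exp(-rt * tests) for rt, rn in zip(rates_test, rates_now))

def scale_factor(lengths, q_test, q_new):
    """S = sum_j (L(j) / L) * q'(j) / q(j)."""
    total = sum(lengths)
    return sum(l * qn / qt for l, qt, qn in zip(lengths, q_test, q_new)) / total

lengths = [10, 10, 90]
q_test = [1.0, 0.9, 0.1]
q_op = [1.0, 0.1, 0.9]
T = 1000

# Hypothetical defects: (block index j, failure rate per test under the test profile).
defects = [(0, 0.02), (1, 0.005), (2, 0.001)]
lam_test = [lam for _, lam in defects]
lam_op = [lam * q_op[j] / q_test[j] for j, lam in defects]

original_bound = len(defects) / (math.e * T)
scaled_bound = scale_factor(lengths, q_test, q_op) * original_bound

rate_test = residual_rate(lam_test, lam_test, T)   # profile unchanged
rate_op = residual_rate(lam_test, lam_op, T)       # new operational profile

print(f"residual rate, test profile: {rate_test:.2e} (N/eT bound {original_bound:.2e})")
print(f"residual rate, new profile:  {rate_op:.2e} (scaled bound {scaled_bound:.2e})")
```

In this made-up case the failure rate under the new profile exceeds the original N/eT bound but stays within the scaled bound. Because the scaled bound averages over possible defect locations, a particular placement of defects is not guaranteed to behave this way.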

21

Variation in q(j)

[Figure: segment executions per test q(j) (log scale, 0.0001 to 10) against segment j, for the uniform, normal and inverse normal profiles.]

22

Predicted scale factors

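Rows: test profile. Columns: operational profile.

Test profile   uniform   inv-normal   normal
uniform        1         1.2          0.9
inv-normal     3.2       1            6.2
normal         115       346          1

Note the predicted reduction in the bound (factor 0.9 for a uniform test profile and a normal operational profile)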

23

Maximum scale factor

Test profile   Max scale-up factor
uniform        6.6
inv-normal     10.0
normal         1059

2-5 times worse than the bound with a known profile

Can be over-pessimistic

But could indicate relative sensitivity to change

24

“Unfair” Normal test profile

[Figure: log-log plot of mean fails/test (0.000001 to 1) against number of tests (10 to 100000), showing test (normal), operation (uniform), the N/et bound, the scaled bound and the max bound.]

25

“Fairer” Uniform test profile

[Figure: log-log plot of mean fails/test (0.000001 to 1) against number of tests (10 to 100000), showing test (uniform), operation (normal), the N/et bound, the scaled bound and the max bound.]

26

PREPRO

Similar results: changes in failure rates are within the scaled bounds

But could not compute a maximum bound: the program is recursive, so there is no upper bound on the execution of program code blocks

27

Summary

Theory suggests:

Can rescale bound (knowing Q and Q’)

Can include module test execution information

Can compute max scale up (knowing Q and Qmax)

For some program structures can identify a totally "fair" test profile - bound insensitive to change

The experimental evaluations appear to be consistent with the predictions of the theory

28

Conclusions

Could affect approach to testing:

“fairer” test profiles rather than realistic profiles

integrated module and system test strategy

Could improve reliability bound prediction for new environment

Could assess sensitivity to profile change, e.g. by computing maximum scale factor

But based on quite strong assumptions, need to:

validate assumptions

assess impact of assumption violation

evaluate on more examples

29
