The Tester’s Dashboard: Release Decision Support

Industry track paper, November 2, 2010. ISSRE-18, San Jose. Combining reliability, coverage, and an information-theoretic metric provides better feedback for release decisions.

Page 1: The Tester’s Dashboard: Release Decision Support

The Tester’s Dashboard: Release Decision Support

Robert V. Binder System Verification Associates, LLC

Peter B. Lakey Cognitive Concepts, Inc.

Page 2: The Tester’s Dashboard: Release Decision Support

Overview

• Complementary metrics for release decision-support

– Model-based testing

• Operational profile

• Model coverage metrics

– Reliability Demonstration Chart

– Relative Proximity

• Case Study

• Observations

Page 3: The Tester’s Dashboard: Release Decision Support

Release Decision Support

Page 4: The Tester’s Dashboard: Release Decision Support
Page 5: The Tester’s Dashboard: Release Decision Support

Model-based Reliability Estimation

• Test suites must be

– Proportional to operational profile

– Sequentially feasible

– Input feasible

• Approach

– Markov model

– Monte Carlo simulation

– Post run analytics
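
The approach above can be sketched as a random walk over a Markov usage model. This is a minimal illustration, not the authors' tooling: the states, transition probabilities, and the names `USAGE_MODEL`, `generate_test`, and `generate_suite` are all hypothetical. Drawing each step from the model's transition probabilities keeps the suite proportional to the operational profile, and walking only along modeled transitions keeps each test sequentially feasible. Input feasibility is not addressed here.

```python
import random

# Hypothetical usage model: state -> list of (next_state, probability).
# States and probabilities are illustrative, not taken from the slides.
USAGE_MODEL = {
    "Start": [("Edit", 0.7), ("Open", 0.3)],
    "Open": [("Edit", 0.8), ("Exit", 0.2)],
    "Edit": [("Edit", 0.5), ("Save", 0.3), ("Exit", 0.2)],
    "Save": [("Edit", 0.6), ("Exit", 0.4)],
}


def generate_test(model, rng, start="Start", end="Exit", max_steps=1000):
    """Random walk from start toward end; each step is drawn per the profile."""
    path = [start]
    state = start
    while state != end and len(path) <= max_steps:
        targets, weights = zip(*model[state])
        state = rng.choices(targets, weights=weights, k=1)[0]
        path.append(state)
    return path


def generate_suite(model, n_tests, seed=0):
    """Monte Carlo generation of n_tests sequences, reproducible via seed."""
    rng = random.Random(seed)
    return [generate_test(model, rng) for _ in range(n_tests)]
```

Fixing the seed makes a run repeatable, which helps when post-run analytics need to be recomputed for the same suite.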

Page 6: The Tester’s Dashboard: Release Decision Support

Model Coverage Metrics

[Figure labels: Software System (Process Space, Data Space); Usage Profile; Trigger; Latent Defect; fault activated; Observe System Failure]

• % States Reached

• % State-Transitions Reached
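
The two coverage percentages can be computed directly from the executed test paths. The sketch below assumes a model given as a mapping from each state to its successor states; the function name `model_coverage` and the sample model are hypothetical.

```python
def model_coverage(model, tests):
    """Return (% states reached, % state-transitions reached).

    model: dict mapping state -> list of successor states (assumed format).
    tests: list of executed paths, each a list of visited states.
    """
    all_states = set(model) | {t for succ in model.values() for t in succ}
    all_transitions = {(s, t) for s, succ in model.items() for t in succ}

    seen_states = {s for path in tests for s in path}
    seen_transitions = {pair for path in tests for pair in zip(path, path[1:])}

    pct_states = 100.0 * len(seen_states & all_states) / len(all_states)
    pct_transitions = (
        100.0 * len(seen_transitions & all_transitions) / len(all_transitions)
    )
    return pct_states, pct_transitions


# Illustrative model and two executed paths (not from the slides).
model = {
    "Start": ["Open", "Edit"],
    "Open": ["Edit", "Exit"],
    "Edit": ["Edit", "Save", "Exit"],
    "Save": ["Edit", "Exit"],
}
tests = [["Start", "Edit", "Save", "Exit"], ["Start", "Edit", "Exit"]]
pct_states, pct_transitions = model_coverage(model, tests)
```

Here 4 of 5 states and 4 of 9 transitions are reached, so the two paths alone would fall well short of full state-transition coverage.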

Page 7: The Tester’s Dashboard: Release Decision Support

Reliability Demonstration Chart

• Sequential Sampling

• Risk-Adjusted

• Musa equations

http://sourceforge.net/projects/rdc/
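
As a rough sketch of how a sequential-sampling chart yields a decision, the code below uses the boundary equations as they are commonly stated in Musa's software reliability engineering texts: with supplier risk alpha, consumer risk beta, and discrimination ratio gamma, the region boundaries at the n-th failure are linear in normalized time. The default values (alpha = beta = 0.1, gamma = 2) and the function names are assumptions for illustration, and the code is not taken from the tool linked above.

```python
import math


def rdc_boundaries(n_failures, alpha=0.1, beta=0.1, gamma=2.0):
    """Normalized-time boundaries at the n-th failure (assumed Musa form).

    Returns (reject_at, accept_at). Normalized time is elapsed test units
    multiplied by the failure intensity objective.
    """
    a = math.log(beta / (1.0 - alpha))
    b = math.log((1.0 - beta) / alpha)
    accept_at = (a - n_failures * math.log(gamma)) / (1.0 - gamma)
    reject_at = (b - n_failures * math.log(gamma)) / (1.0 - gamma)
    return reject_at, accept_at


def rdc_decision(n_failures, normalized_time, **risk):
    """Classify a point on the chart as accept, reject, or continue."""
    reject_at, accept_at = rdc_boundaries(n_failures, **risk)
    if normalized_time >= accept_at:
        return "accept"
    if normalized_time <= reject_at:
        return "reject"
    return "continue"
```

For example, under these assumed defaults and an objective of 1 failure per 1,000 tests, 2,500 failure-free tests give a normalized time of 2.5, which falls in the accept region.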

Page 8: The Tester’s Dashboard: Release Decision Support

Relative Proximity

• Kullback-Leibler Distance (KLD)

– Information theoretic

– Characterizes difference in variation of message population E (expected) and sample A (actual) as “relative entropy”

• Relative Proximity

– KLD math doesn’t work unless failures are modeled (sum of the actuals must be 1.0)

– Assume the target failure rate is aggregate

– Allocate failure rate in proportion to each operation

KLD = \sum_i A_i \log_2 (A_i / E_i)

Page 9: The Tester’s Dashboard: Release Decision Support

Profile Explicit Failure Modes

• Assume maximum acceptable failure intensity of 1 in 10,000

Operation  Mode  Standard Profile  Explicit Failure Profile  Expected Number, 10000 Tests
A          Pass  0.7               0.6993                    6993
B          Pass  0.2               0.1998                    1998
C          Pass  0.1               0.0999                    999
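
The allocation of an aggregate failure rate across operations can be sketched as follows. The function name `explicit_failure_profile` is hypothetical. Note one assumption: the tabulated Pass values correspond to a per-test failure rate of 0.001 (for example, 0.7 × 0.999 = 0.6993), so that value is used here to match the table rather than the 1 in 10,000 figure stated in the bullet.

```python
def explicit_failure_profile(profile, failure_rate):
    """Split each operation's probability into explicit Pass and Fail modes.

    The aggregate failure rate is allocated in proportion to each
    operation's share of the profile, so the result still sums to 1.0.
    """
    explicit = {}
    for op, p in profile.items():
        explicit[(op, "Pass")] = p * (1.0 - failure_rate)
        explicit[(op, "Fail")] = p * failure_rate
    return explicit


profile = {"A": 0.7, "B": 0.2, "C": 0.1}

# failure_rate=0.001 is an assumption chosen to reproduce the table values.
explicit = explicit_failure_profile(profile, failure_rate=0.001)

# Expected number of tests per (operation, mode) in a 10,000-test run.
expected_counts = {key: round(p * 10_000) for key, p in explicit.items()}
```

Under this assumption the expected Fail counts for A, B, and C come out as 7, 2, and 1.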

Page 10: The Tester’s Dashboard: Release Decision Support

Profile Explicit Failure Modes

• Relative Proximity indicates the difference between expected and actual (observed) failure rates

• Many possible operation failure rates with better or worse fidelity

• RDC is based on the aggregate failure intensity objective (FIO), so it is not sensitive to operation variance

Operation  Mode  Expected  Actual  KL Distance  Actual  KL Distance
A          Pass  6993      7000    10.104       6990    -4.327
           Fail  7         0       0.000        10      5.146
B          Pass  1998      1990    -11.518      2000    2.887
           Fail  2         10      23.219       0       0.000
C          Pass  999       980     -27.149      994     -7.195
           Fail  1         20      86.439       6       15.510
Total            10000     10000   81.094       10000   12.020
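
The KL Distance columns can be reproduced with a few lines of code. The tabulated terms appear to be computed on test counts rather than probabilities, that is, each term is A × log2(A / E) with A and E as counts, and a term with zero actual count is taken as 0. The function name `kl_terms` and the variable names are hypothetical.

```python
import math


def kl_terms(expected, actual):
    """Per-mode terms A * log2(A / E); a zero actual count contributes 0."""
    terms = []
    for e, a in zip(expected, actual):
        terms.append(0.0 if a == 0 else a * math.log2(a / e))
    return terms


# Counts in table order: A Pass, A Fail, B Pass, B Fail, C Pass, C Fail.
EXPECTED = [6993, 7, 1998, 2, 999, 1]
RUN_1 = [7000, 0, 1990, 10, 980, 20]
RUN_2 = [6990, 10, 2000, 0, 994, 6]

total_1 = sum(kl_terms(EXPECTED, RUN_1))  # close to the tabulated 81.094
total_2 = sum(kl_terms(EXPECTED, RUN_2))  # close to the tabulated 12.020
```

Both runs have 30 and 16 failures respectively against an expected 10, but the first concentrates its excess failures in operations B and C, which is why its total is so much larger. Dividing either total by 10,000 gives the distance in bits over probabilities.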

Page 11: The Tester’s Dashboard: Release Decision Support

Case Studies

• Stochastic Models

• Assumed Failure Rates

• Word Processing Application

• Ground-Based Midcourse Missile Defense

Page 12: The Tester’s Dashboard: Release Decision Support

GBMD Test Run, 0-100

Page 13: The Tester’s Dashboard: Release Decision Support

GBMD Test Run, 1K, 5K

Page 14: The Tester’s Dashboard: Release Decision Support

GBMD Test Run, 10K

Page 15: The Tester’s Dashboard: Release Decision Support

GBMD Relative Proximity Trend

[Chart: y-axis ticks 0.00 to 1600.00; x-axis ticks 10, 100, 1000, 10000; data labels 1484.00, 418.20, 6.30, 12.00, 67.90]

Page 16: The Tester’s Dashboard: Release Decision Support

Observations

• Model coverage indicates minimal sufficiency

– Wouldn’t release without all state-transition pairs covered

– Stochastic testing can take a long time to achieve this

– Cover with N+ first

• RDC assumes “flat” profile

– With sequential constraints, may be optimistic

– Strength is explicit risk-adjustment

• Relative Proximity will indicate when operation-specific Failure Intensity is as expected (or not)

Page 17: The Tester’s Dashboard: Release Decision Support

Q & A