Project Data Incorporating Qualitative Factors for Improved Software Defect Prediction

Norman Fenton, Martin Neil, William Marsh, Peter Hearty, Łukasz Radliński, Paul Krause

Slide 1

Project Data Incorporating Qualitative Factors for Improved Software Defect Prediction

Norman Fenton, Martin Neil, William Marsh, Peter Hearty, Łukasz Radliński, Paul Krause

PROMISE

20 May 2007

Slide 2

Overview

• Background

• The data

• Results

• Caveats

Slide 3

Background

• Predicting reliability

• Statistical models

• Causal models

Slide 4

Causal model (Bayesian network)

[Diagram: Bayesian network fragment with the following nodes]

• Probability of finding defect

• Testing process effectiveness

• Testing process quality

• Testing effort

• Testing staff experience

• Quality of documented test cases

• Testing process well-defined
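
As a rough illustration of how a causal model of this kind produces a probability, the sketch below marginalises over two parent nodes to obtain the probability of finding a defect. It is a minimal sketch, not the authors' model: the arcs (quality and effort both feeding the defect-finding node), the two-state nodes, and every number in the tables are assumptions made up for illustration, since the slide does not give them.

```python
from itertools import product

# ASSUMPTION: hypothetical priors for two parent nodes; not from the slides.
P_QUALITY = {"high": 0.5, "low": 0.5}   # testing process quality
P_EFFORT = {"high": 0.5, "low": 0.5}    # testing effort

# ASSUMPTION: hypothetical conditional probability table,
# P(defect found | quality, effort); not from the slides.
P_FIND = {
    ("high", "high"): 0.9,
    ("high", "low"): 0.6,
    ("low", "high"): 0.5,
    ("low", "low"): 0.2,
}


def prob_finding_defect(evidence=None):
    """Return P(defect found | evidence) by enumerating parent states.

    `evidence` may fix "quality" and/or "effort" to "high" or "low".
    """
    evidence = evidence or {}
    total = 0.0
    norm = 0.0
    for quality, effort in product(P_QUALITY, P_EFFORT):
        # Skip parent states that contradict the observed evidence.
        if evidence.get("quality", quality) != quality:
            continue
        if evidence.get("effort", effort) != effort:
            continue
        weight = P_QUALITY[quality] * P_EFFORT[effort]
        total += weight * P_FIND[(quality, effort)]
        norm += weight
    return total / norm


if __name__ == "__main__":
    print(prob_finding_defect())                     # no evidence entered
    print(prob_finding_defect({"quality": "high"}))  # quality observed
```

Entering evidence on a parent node changes the result, which is the behaviour a causal model relies on when project-specific answers are supplied.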

Slide 5

Background

• AID

• MODIST

Slide 6

Schematic view of model

[Diagram: schematic of the model with the following components]

• Existing code base

• Defect insertion and recovery

• Testing and rework

• Design and development

• Specification and documentation

• Common influences

• Scale of new required functionality

Slide 8

Example question: “Relevant Experience of Spec & Doc Staff”

• Very High: Over 3 years experience in requirements management, and extensive domain knowledge.

• High: Over 3 years experience in requirements management, but limited domain knowledge.

• Medium: 1-3 years experience in requirements management.

• Low: 1-3 years experience, but no experience in requirements management.

• Very Low: Less than one year’s experience, and no previous domain experience.
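
One way to feed an answer on this five-point scale into a numerical model is to map each label to an equal-width interval on [0, 1]. The sketch below shows that mapping. It is an assumption for illustration only: the slides do not say how the answers are encoded, and the names `SCALE`, `rank_to_interval` and `rank_to_midpoint` are hypothetical.

```python
# Labels from the example question, ordered from worst to best.
SCALE = ["Very Low", "Low", "Medium", "High", "Very High"]


def rank_to_interval(label):
    """Map a scale label to an equal-width interval on [0, 1].

    ASSUMPTION: equal-width intervals are an illustrative encoding,
    not one stated in the slides.
    """
    index = SCALE.index(label)  # raises ValueError for unknown labels
    width = 1.0 / len(SCALE)
    return (index * width, (index + 1) * width)


def rank_to_midpoint(label):
    """Return the midpoint of the interval assigned to a label."""
    low, high = rank_to_interval(label)
    return (low + high) / 2


if __name__ == "__main__":
    for label in SCALE:
        print(label, rank_to_interval(label), rank_to_midpoint(label))
```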

Slide 9

How projects were selected

• Reliable data

• Satisfactory end

• Key people available

• Breadth

• Depth

Slide 10

Defects vs size

[Scatter plot: Defects Found (y-axis, 0 to 2500) against Code Size in KLoC (x-axis, 0 to 200)]

Slide 11

Actual versus predicted defects

[Scatter plot: Predicted defects (y-axis, 0 to 2000) against Actual defects (x-axis, 0 to 2500)]
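
A comparison of actual and predicted defect counts like the one on this slide is often summarised with a correlation coefficient. The sketch below computes Pearson's r from two lists of counts. It is only an assumed way of scoring the agreement: the slides do not state which measure the authors used, and the function name `pearson_r` is hypothetical. No project data is included; callers supply their own counts.

```python
import math


def pearson_r(actual, predicted):
    """Return Pearson's correlation between two equal-length sequences."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Sum of cross-deviations and the two sums of squared deviations.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_p)
```

Squaring the result gives the proportion of variance in one series explained by a linear fit to the other, which is a common single-number summary for a plot of this kind.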

Slide 12

Caveats

• Biased priors

• Structural aspects biased

• Data accuracy

• Projects overly ‘uniform’

Slide 13

Conclusions

• No ‘data fitting’

• Dataset provided a validation

• Good predictions with few of the inputs

• Causal model provides genuine support for risk management
