Comparison and Assessment of Cost Models for NASA Flight Projects
Ray Madachy, Barry Boehm, Danni Wu
{madachy, boehm, danwu}@usc.edu
USC Center for Systems & Software Engineering
http://csse.usc.edu
21st International Forum on COCOMO and Software Cost Modeling
November 8, 2006
Outline
• Introduction and background
• Model comparison examples
• Estimation performance analysis
• Conclusions and future work
Introduction
• This work is sponsored by the NASA Ames project Software Risk Advisory Tools, Cooperative Agreement No. NNA06CB29A
• Existing parametric software cost, schedule, and quality models are being assessed and updated for critical NASA flight projects
– Includes a comparative survey of their strengths, limitations, and suggested improvements
– Developing transformations between the models
– Accuracies and needs for calibration are being examined with relevant NASA project data
• This presents the latest developments in ongoing research at the USC Center for Systems and Software Engineering (USC-CSSE)
– Current work builds on previous research with NASA and the FAA
Frequently Used Cost/Schedule Models for Critical Flight Software
• COCOMO II is a public domain model that USC continually updates; it is implemented in several commercial tools
• SEER-SEM and True S are proprietary commercial models with unique features that also share some aspects with COCOMO
– Include factors for project type and application domain
• All three have been extensively used and tailored for flight project domains
Support Acknowledgments
• Galorath Inc. (SEER-SEM)
– Dan Galorath, Tim Hohmann, Bob Hunt, Karen McRitchie
• PRICE Systems (True S)
– Arlene Minkiewicz, James Otte, David Seaver
• Softstar Systems (COCOMO calibration)
– Dan Ligett
• Jet Propulsion Laboratory
– Jairus Hihn, Sherry Stukes
• NASA Software Risk Advisory Tools research team
– Mike Lowry, Tim Menzies, Julian Richardson

This study was performed mostly by persons highly familiar with COCOMO but not necessarily with the vendor models. The vendors do not certify or sanction the data or information contained in these charts.
Approach
• Develop “Rosetta Stone” transformations between the models so COCOMO inputs can be converted into corresponding inputs to the other models, or vice versa
– Crosscheck multiple estimation methods
– Represent projects in a consistent manner in all models and help understand why estimates may vary
– Extensive discussions with model proprietors to clarify definitions
• Models assessed against a common database of relevant projects
– Using a database with effort, size, and COCOMO cost factors for completed NASA projects, called NASA 94 (completion dates 1970s through late 1980s)
– Additional data as it comes in from NASA or other data collection initiatives
• Analysis considerations
– Calibration issues
– Model deficiencies and extensions
– Accuracy with relevant project data
• Repeat analysis with updated calibrations, revised domain settings, improved models, and new data
Critical Factor Distributions by Project Type
[Figure: Distributions of the Required Software Reliability factor (rated Very Low through Very High) and the Product Complexity factor (rated Very Low through Extra High), shown as percent of projects at each rating for three project types: Flight Projects, Ground Embedded Projects, and Ground Other Projects.]
Outline
• Introduction and background
• Model comparison examples
• Estimation performance analysis
• Conclusions and future work
Cost Model Comparison Attributes
• Algorithms
• Size definitions
– New, reused, modified, COTS
– Language adjustments
• Cost factors
– Exponential, linear
• Work breakdown structure (WBS) and labor parameters
– Scope of activities and phases covered
– Hours per person-month
Common Effort Formula
Effort = A * Size^B * EM

where:
– Effort: estimated effort in person-months
– A: calibrated constant
– B: scale factor (exponent)
– EM: effort multiplier composed from the cost factors

[Diagram: size and cost factor inputs feed the effort equation; phase and activity calibrations decompose the resulting effort.]
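As a concrete illustration, here is a minimal sketch of this formula in Python. A = 2.94 is the published COCOMO II.2000 constant; the default exponent and the multipliers in the example are assumed values, not calibrations from this study.

```python
# Minimal sketch of the common effort formula: Effort = A * Size^B * EM.
# A = 2.94 is the published COCOMO II.2000 constant; the exponent and
# the example multipliers below are illustrative assumptions.
from math import prod

def estimate_effort(size_ksloc, A=2.94, B=1.10, effort_multipliers=()):
    """Return estimated effort in person-months for a size in KSLOC."""
    em = prod(effort_multipliers)  # empty tuple -> 1.0 (nominal)
    return A * size_ksloc ** B * em

# Example: a 100 KSLOC project with two hypothetical multiplier ratings.
print(round(estimate_effort(100, effort_multipliers=(1.26, 0.85))))
```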
Example: Top-Level Rosetta Stone for COCOMO II Factors (1/3)
COCOMO II Factor | SEER Factor(s) | True S Factor(s)

PRODUCT ATTRIBUTES
Required Software Reliability | Specification Level - Reliability | Operating Specification
Data Base Size | none | Code Size Non-executable
Product Complexity | Complexity (Staffing); Application Class Complexity | Functional Complexity
Required Reusability | Reusability Level Required; Software Impacted by Reuse | Design for Reuse
Documentation Match to Lifecycle Needs | none | Operating Specification

PLATFORM ATTRIBUTES
Execution Time Constraint | Time Constraints | Project Constraints - Communications and Timing
Main Storage Constraint | Memory Constraints | Project Constraints - Memory & Performance
Platform Volatility | Target System Volatility; Host System Volatility | Hardware Platform Availability (3)
Example: Top-Level Rosetta Stone for COCOMO II Factors (2/3)
COCOMO II Factor | SEER Factor(s) | True S Factor(s)

PERSONNEL ATTRIBUTES
Analyst Capability | Analyst Capability | Development Team Complexity - Capability of Analysts and Designers
Programmer Capability | Programmer Capability | Development Team Complexity - Capability of Programmers
Personnel Continuity | none | Development Team Complexity - Team Continuity
Application Experience | Application Experience | Development Team Complexity - Familiarity with Product
Platform Experience | Development System Experience; Target System Experience | Development Team Complexity - Familiarity with Platform
Language and Toolset Experience | Programmer's Language Experience | Development Team Complexity - Experience with Language
Example: Top-Level Rosetta Stone for COCOMO II Factors (3/3)
COCOMO II Factor | SEER Factor(s) | True S Factor(s)

PROJECT ATTRIBUTES
Use of Software Tools | Software Tool Use | Design Code and Test Tools
Multi-site Development | Multiple Site Development | Multi Site Development
Required Development Schedule | none (2) | Start and End Date

(1) The SEER Process Improvement factor rates the impact of improvement, not the CMM level
(2) Schedule constraints are handled differently
(3) A software assembly input factor
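To make the translation mechanical, a top-level Rosetta Stone can be encoded as a simple lookup. Below is a sketch covering a representative subset of the rows above; the factor names come from the tables, but the data structure and function are our illustration.

```python
# Illustrative encoding of a few top-level Rosetta Stone rows:
# COCOMO II factor -> corresponding SEER-SEM and True S factor names.
ROSETTA_STONE = {
    "Required Software Reliability": {
        "SEER-SEM": ["Specification Level - Reliability"],
        "True S": ["Operating Specification"],
    },
    "Analyst Capability": {
        "SEER-SEM": ["Analyst Capability"],
        "True S": ["Development Team Complexity - Capability of Analysts and Designers"],
    },
    "Personnel Continuity": {
        "SEER-SEM": [],  # no direct SEER-SEM counterpart
        "True S": ["Development Team Complexity - Team Continuity"],
    },
}

def translate(cocomo_factor, target_model):
    """Return target-model factor names for a COCOMO II factor ([] if none)."""
    return ROSETTA_STONE.get(cocomo_factor, {}).get(target_model, [])

print(translate("Personnel Continuity", "True S"))
```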
Example: Model Size Inputs
New Software
– COCOMO II: New Size
– SEER-SEM: New Size
– True S: New Size; New Size Non-executable

Adapted Software
– COCOMO II: Adapted Size; % Design Modified; % Code Modified; % Integration Required; Assessment and Assimilation; Software Understanding (1); Programmer Unfamiliarity (1)
– SEER-SEM: Pre-exists Size (2); Deleted Size; Redesign Required %; Reimplementation Required %; Retest Required %
– True S: Adapted Size; Adapted Size Non-executable; % of Design Adapted; % of Code Adapted; % of Test Adapted; Reused Size; Reused Size Non-executable; Deleted Size; Code Removal Complexity

(1) Not applicable for reused software
(2) Specified separately for Designed for Reuse and Not Designed for Reuse
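For the COCOMO II column, the adapted-software inputs combine into an equivalent new size. Below is a sketch of the published COCOMO II reuse model (Boehm et al., 2000); the parameter values in the example call are hypothetical.

```python
# Sketch of the COCOMO II reuse model: adapted code is converted to
# equivalent new size via the Adaptation Adjustment Factor (AAF) plus
# Assessment & Assimilation (AA), Software Understanding (SU), and
# Programmer Unfamiliarity (UNFM), per Boehm et al., 2000.
def equivalent_ksloc(adapted_ksloc, dm, cm, im, aa=0.0, su=30.0, unfm=0.4):
    """dm/cm: % design/code modified; im: % integration required (0-100)."""
    aaf = 0.4 * dm + 0.3 * cm + 0.3 * im
    if aaf <= 50:
        aam = (aa + aaf * (1 + 0.02 * su * unfm)) / 100
    else:
        aam = (aa + aaf + su * unfm) / 100
    return adapted_ksloc * aam

# Example: 20 KSLOC adapted, 10% design / 20% code modified, 30% re-integration.
print(round(equivalent_ksloc(20, dm=10, cm=20, im=30), 1))
```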
Example: SEER Factors with No Direct COCOMO II Mapping
PERSONNEL CAPABILITIES AND EXPERIENCE
– Practices and Methods Experience

DEVELOPMENT SUPPORT ENVIRONMENT
– Modern Development Practices
– Logon thru Hardcopy Turnaround
– Terminal Response Time
– Resource Dedication
– Resource and Support Location
– Process Volatility

PRODUCT DEVELOPMENT REQUIREMENTS
– Requirements Volatility (Change) (1)
– Test Level (2)
– Quality Assurance Level (2)
– Rehost from Development to Target

PRODUCT REUSABILITY
– Software Impacted by Reuse

DEVELOPMENT ENVIRONMENT COMPLEXITY
– Language Type (Complexity)
– Host Development System Complexity
– Application Class Complexity (3)
– Process Improvement

TARGET ENVIRONMENT
– Special Display Requirements
– Real Time Code
– Security Requirements

(1) COCOMO II uses the Requirements Evolution and Volatility size adjustment factor
(2) Captured in the COCOMO II Required Software Reliability factor
(3) Captured in the COCOMO II Complexity factor
Vendor Elaborations of Critical Domain Factors
COCOMO II | SEER (*) | True S

Required Software Reliability | Specification Level - Reliability; Test Level; Quality Assurance Level | Operating Specification Level (platform and environment settings for required reliability, portability, structuring, and documentation)
Product Complexity | Complexity (Staffing); Language Type (Complexity); Host Development System Complexity; Application Class Complexity; Language | Functional Complexity - Application Type; Language; Object-Oriented
(*) SEER factors are supplemented with, and may be impacted via, knowledge base settings for:
– Platform
– Application
– Acquisition method
– Development method
– Development standard
– Class
– Component type (COTS only)
Example: Required Reusability Mapping (1/2)
• SEER-SEM
– Reusability Level
  XH = Across organization
  VH = Across product line
  H = Across project
  N = No requirements
– Software Impacted by Reuse (% reusable): 100%, 50%, 25%, 0%
• COCOMO II (cost to develop a software module for subsequent reuse)
– XH = Across multiple product lines
– VH = Across product line
– H = Across program
– N = Across project
– L = None
Example: Required Reusability Mapping (2/2)
SEER-SEM to COCOMO II:
• XH = XH in COCOMO II
– 100% reuse level = 1.50
– 50% reuse level = 1.40
– 25% reuse level = 1.32
– 0% reuse level = 1.25
• VH = VH in COCOMO II
– 100% reuse level = 1.32
– 50% reuse level = 1.26
– 25% reuse level = 1.22
– 0% reuse level = 1.16
• H = N in COCOMO II
• N = L in COCOMO II
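A minimal sketch encoding this mapping as a lookup; the ratings and multipliers are taken from the table above, while the function and names are our illustration.

```python
# SEER-SEM Reusability Level + % of software impacted by reuse ->
# COCOMO II Required Reusability rating and multiplier, per the mapping above.
REUSE_MAP = {
    ("XH", 100): ("XH", 1.50), ("XH", 50): ("XH", 1.40),
    ("XH", 25): ("XH", 1.32), ("XH", 0): ("XH", 1.25),
    ("VH", 100): ("VH", 1.32), ("VH", 50): ("VH", 1.26),
    ("VH", 25): ("VH", 1.22), ("VH", 0): ("VH", 1.16),
}
FIXED_MAP = {"H": "N", "N": "L"}  # levels mapped without a reuse percentage

def cocomo_ruse(seer_level, pct_impacted=0):
    """Return (COCOMO II rating, multiplier or None) for a SEER-SEM level."""
    if seer_level in FIXED_MAP:
        return FIXED_MAP[seer_level], None
    return REUSE_MAP[(seer_level, pct_impacted)]

print(cocomo_ruse("XH", 50))  # -> ('XH', 1.4)
print(cocomo_ruse("H"))       # -> ('N', None)
```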
Example: WBS Mapping
[Figure: WBS mapping of phases and activities. COCOMO II activities (Management, Environment/CM, Requirements, Design, Implementation, Assessment, Deployment) across the Inception, Elaboration, Construction, and Transition phases are mapped against True S activities (Concept, System Requirements, Software Requirements, Preliminary Design, Detailed Design, Code/Unit Test, Integration & Test, Hardware/Software Integration, Field Test, System Integration & Test) and True S cost categories (Design, Programming, Data, SEPGM, Q/A, CFM). Legend: core effort coverage per model; common estimate baseline; effort add-on as % of core coverage; effort add-on with revised model.]
Example: Model Normalization
[Figure: Effort (person-months, 0-1500) vs. size (0-250 KSLOC) curves for COCOMO II (nominal) and True S at complexity = 5, 5.5, and 6. True S parameters: Functional Complexity = 5-6, Operating Specification = 1.0, Organizational Productivity = 1.33, Development Team Complexity = 2.5.]
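The nominal COCOMO II curve in such a normalization plot can be regenerated directly from the common effort formula. Below is a sketch assuming A = 2.94 and an all-nominal composite exponent of about 1.10 (COCOMO II.2000), with all effort multipliers at 1.0.

```python
# Sketch: regenerate a nominal COCOMO II effort-vs-size curve for
# comparison against other models' normalized outputs. A = 2.94 with an
# assumed all-nominal exponent of ~1.10; all multipliers nominal (1.0).
def nominal_cocomo_ii(size_ksloc, A=2.94, E=1.10):
    return A * size_ksloc ** E

for size in range(25, 251, 25):
    print(f"{size:4d} KSLOC -> {nominal_cocomo_ii(size):7.1f} person-months")
```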
Outline
• Introduction and background
• Model comparison examples
• Estimation performance analysis
• Conclusions and future work
Model Analysis Flow
The analysis proceeds in parallel tracks for the three models (not all steps are performed on iterations 2-n):

1. Start with the NASA 94 database; apply the COCOMO 81 → COCOMO II Rosetta Stone, perform outlier analysis, and select relevant domain projects; fold in additional data as it arrives
2. COCOMO II track: uncalibrated COCOMO II analysis; COCOMO II calibration (via Costar); calibrated COCOMO II analysis
3. SEER-SEM track: apply the COCOMO II → SEER Rosetta Stone; uncalibrated SEER analysis with additional factors defaulted; apply SEER knowledge base settings; SEER analysis with knowledge base settings; SEER analysis with calibration and refined settings
4. True S track: apply the COCOMO II → True S Rosetta Stone; set additional True S factors for the application domain; True S analysis with application domain settings
5. Consolidated analysis; if the analysis is to be updated (e.g., with new data), repeat from the start; otherwise end
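Read as a pipeline, the flow has one preparation stage followed by per-model analysis tracks. Below is a schematic sketch in Python in which every helper is a hypothetical stub standing in for a manual or tool-supported step; only the overall structure reflects the flowchart.

```python
# Schematic of the analysis flow. Every helper is a hypothetical stub;
# only the iteration structure mirrors the flowchart above.
def to_cocomo_ii(db):     return db  # COCOMO 81 -> COCOMO II Rosetta Stone
def remove_outliers(db):  return db  # outlier analysis
def select_domain(db):    return db  # choose relevant domain projects

def analyze(projects, model, variant):
    return {"model": model, "variant": variant, "projects": len(projects)}

def run_iteration(nasa94):
    projects = select_domain(remove_outliers(to_cocomo_ii(nasa94)))
    return [
        analyze(projects, "COCOMO II", "uncalibrated"),
        analyze(projects, "COCOMO II", "calibrated (via Costar)"),
        analyze(projects, "SEER-SEM", "knowledge bases + calibration"),
        analyze(projects, "True S", "application domain settings"),
    ]

print(run_iteration([{"id": i} for i in range(11)]))
```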
Performance Measures
For each model, compare actual and estimated effort for the n projects in a dataset:

• Relative Error (RE) = (Estimated Effort - Actual Effort) / Actual Effort
• Magnitude of Relative Error (MRE) = |Estimated Effort - Actual Effort| / Actual Effort
• Mean Magnitude of Relative Error (MMRE) = Σ(MRE) / n
• Root Mean Square (RMS) = ((1/n) Σ (Estimated Effort - Actual Effort)²)^(1/2)
• Prediction level PRED(L) = k / n, where k is the number of projects in the set of n whose MRE ≤ L
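These definitions translate directly into code. Below is a minimal sketch; the function and variable names are ours, and the efforts in the example call are illustrative values.

```python
# Compute MMRE, RMS, and PRED(L) from paired actual/estimated efforts,
# per the definitions above.
def performance_measures(actual, estimated, L=0.40):
    n = len(actual)
    mre = [abs(e - a) / a for a, e in zip(actual, estimated)]
    mmre = sum(mre) / n
    rms = (sum((e - a) ** 2 for a, e in zip(actual, estimated)) / n) ** 0.5
    pred = sum(m <= L for m in mre) / n
    return {"MMRE": mmre, "RMS": rms, f"PRED({round(L * 100)})": pred}

# Example with illustrative effort values (person-months).
print(performance_measures([100, 250, 900], [120, 210, 1000]))
```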
COCOMO II Performance Examples
Dataset name: NASA 94
Category: Avionics
Mode: Embedded
Number of projects: 11

Effort prediction summary:

Measure  | Calibrated | Uncalibrated
MMRE     | 55%        | 91%
RMS      | 433.0      | 1404.4
PRED(10) | 12%        | 9%
PRED(20) | 19%        | 9%
PRED(30) | 45%        | 18%
PRED(40) | 65%        | 36%
[Figures: Effort estimates vs. actuals scatter plot (log-log, 10-10,000 person-months), and bar charts of the COCOMO II MMRE and PRED(40) calibration effects (percent, uncalibrated vs. calibrated) across project types: Flight Avionics Embedded, Flight Science Embedded, Flight (All), and Ground Embedded.]
SEER-SEM Performance Examples
[Figures: SEER MMRE and PRED(40) progressive adjustment effects (percent) for Flight Avionics Embedded and Flight (All) project types, comparing uncalibrated results, initial knowledge base settings, and calibrated and project-adjusted results.]
Model Performance Summaries For Flight Projects
[Figure: Effort estimates vs. actuals scatter plot (log-log, 100-10,000 person-months) for Models 1, 2, and 3.]

Measure  | Model 1 | Model 2 | Model 3
MMRE     | 29%     | 39%     | 49%
RMS      | 460.2   | 642.4   | 613.5
PRED(10) | 36%     | 20%     | 17%
PRED(20) | 45%     | 50%     | 42%
PRED(30) | 64%     | 50%     | 50%
PRED(40) | 73%     | 60%     | 58%
Outline
• Introduction and background
• Model comparison examples
• Estimation performance analysis
• Conclusions and future work
Vendor Concerns
• Study limited to a COCOMO viewpoint only
• Current Rosetta Stones need review and may be weak translators from the original data
• Results not indicative of model performance due to ignored parameters
• Risk and uncertainty were ground ruled out
• Data sanity checking needed
Conclusions (1/2)

• All cost models (COCOMO II, SEER-SEM, True S) performed well against the NASA database of critical flight software
– Calibration and knowledge base settings improved default model performance
– Estimate performance varies by domain subset
• Complexity and reliability factor distributions characterize the domains as expected
• SEER-SEM and True S vendor models provide additional factors beyond COCOMO II
– More granular factors for the overall effects captured in the COCOMO II Complexity factor
– Additional factors for other aspects, many of which are relevant for NASA projects
• Some difficulties mapping inputs between models, but simplifications are possible
• Reconciliation of effort WBS is necessary for valid comparison between models
Conclusions (2/2)
• Models exhibited nearly equivalent performance trends for embedded flight projects within the different subgroups
– Initial uncalibrated runs from COCOMO II and SEER-SEM both underestimated the projects by approximately 50% overall
– Improvement trends between uncalibrated estimates and those with calibrations or knowledge base refinements were almost identical
• SEER experiments illustrated that model performance measures markedly improved when incorporating knowledge base information for the domains
– All three models have roughly the same final performance measures for either individual flight groups or combined
• In practice no one model should be preferred over all others
– Use a variety of methods and tools and then investigate why the estimates may vary
Future Work
• Study has been helpful in reducing sources of misinterpretation across the models, but considerably more should be done (*)
– Developing two-way and/or multiple-way Rosetta Stones
– Explicit identification of residual sources of uncertainty across models and their estimates not fully addressable by Rosetta Stones:
  - Factors unique to some models but not others
  - Many-to-many factor mappings
  - Partial factor-to-factor mappings
  - Similar factors that affect estimates in different ways: linear, multiplicative, exponential, other
  - Imperfections in data: subjective rating scales, code counting, counting of other size factors, effort/schedule counting, endpoint definitions and interpretations, WBS element definitions and interpretations
• Repeating the analysis with improved models, new data, and updated Rosetta Stones
– COCOMO II may be revised for critical flight project applications
• Improved analysis process
– Revision of vendor tool usage to set knowledge bases before COCOMO translation parameter setting
– Capture estimate inputs in all three model formats; try different translation directionalities
• With modern and more comprehensive data, COCOMO II and other models can be further improved and tailored for NASA project usage
– Additional data always welcome

(*) The study participants welcome sponsorship of further joint efforts to pin down sources of uncertainty and to more explicitly identify the limits of comparing estimates across models
Bibliography
• Boehm B, Abts C, Brown A, Chulani S, Clark B, Horowitz E, Madachy R, Reifer D, Steece B, Software Cost Estimation with COCOMO II, Prentice Hall, 2000
• Boehm B, Abts C, Chulani S, Software Development Cost Estimation Approaches - A Survey, USC-CSE-00-505, 2000
• Galorath Inc., SEER-SEM User Manual, 2005
• Lum K, Powell J, Hihn J, Validation of Spacecraft Software Cost Estimation Models for Flight and Ground Systems, JPL Technical Report, 2001
• Madachy R, Boehm B, Wu D, Comparison and Assessment of Cost Models for NASA Flight Projects, USC Center for Systems and Software Engineering Technical Report USC-CSSE-2006-616, 2006, http://sunset.usc.edu/csse/TECHRPTS/2006/usccse2006-616/usccse-2006-616.pdf
• PRICE Systems, True S User Manual, 2005
• Reifer D, Boehm B, Chulani S, The Rosetta Stone - Making COCOMO 81 Estimates Work with COCOMO II, CrossTalk, 1999