1
ECE 355: Software Engineering
Project Cost Estimation
Instructor:
Kostas Kontogiannis
2
Course Outline
• Introduction to software engineering• Requirements Engineering• Design Basics• Traditional Design• OO Design• Design Patterns• Software Architecture• Design Documentation• Verification & ValidationSoftware Process & Project Management
3
• These slides are based on:– Lecture slides by Ian Summerville, see
http://www.comp.lancs.ac.uk/computing/resources/ser/
– ECE355 Lecture slides by Sagar Naik
4
Process/Project Management
• Project management involves a whole host of issues and skills– Effort estimation– Staffing– Defining and managing the process– Scheduling activities– Monitoring quality– …
• Process management at the level of an organization– A software development organization should define, implement and
constantly improve their• Software processes• Organizational structure
5
Overview - Software Process & Project Management
Cost estimation & Staffing• Project scheduling• Software Life-Cycle Models• Examples of Software processes• Process improvement and Software metrics
6
Software cost estimation
• Predicting the resources required for a software development process
©Ian Sommerville 1995
7
Topics covered
• Productivity
• Estimation techniques
• Algorithmic cost modelling
• Project duration and staffing
©Ian Sommerville 1995
8
Software cost components• Effort costs (the dominant factor in most
projects)– salaries of engineers involved in the project– costs of building, heating, lighting– costs of networking and communications– costs of shared facilities (e.g library, staff restaurant, etc.)– costs of pensions, health insurance, etc.
• Other costs– Hardware and software costs– Travel and training costs– …
©Ian Sommerville 1995 [modified]
9
Costing and pricing• There is not a simple relationship between the
development cost and the price charged to the customer
• Software pricing factors– Market opportunity – low price to enter the market, e.g.,
initially “free software”– Cost estimation uncertainty– Contractual terms– Requirements volatility– Financial health– …
©Ian Sommerville 1995 [modified]
10
• A measure of the rate at which individual engineers involved in software development produce software and associated documentation– Not quality-oriented although quality assurance
is a factor in productivity assessment
• Measure useful functionality produced per time unit & programmer
Programmer productivity
©Ian Sommerville 1995 [modified]
11
• Size related measures based on some output from the software process. This may be lines of delivered source code (SLOC), object code instructions, etc.– E.g., SLOC / person-month
• Function-related measures based on an estimate of the functionality of the delivered software. Function-points are the best known of this type of measure– E.g., FP / person-month
Productivity metrics
©Ian Sommerville 1995 [modified]
12
• What's a line of code?– Many different ways to count lines (e.g., with
or without comments, counting statements rather than lines, or counting lines in a automatically formatted code)
– Need to know the measurement method before comparing SLOC numbers
• Assumes linear relationship between system size and volume of documentation
Lines of code
©Ian Sommerville 1995 [modified]
13
• Problems of LOC-based comparisons– The lower level the language, the more
productive the programmer– The more verbose the programmer, the higher
the productivity
• Function points provide a more accurate measure of productivity than LOC
Cross-language comparisons
©Ian Sommerville 1995 [modified]
14
System development times
Analysis Design Coding Testing DocumentationAssembly codeHigh-level language
3 weeks3 weeks
5 weeks5 weeks
8 weeks8 weeks
10 weeks6 weeks
2 weeks2 weeks
Size Effort ProductivityAssembly codeHigh-level language
5000 lines1500 lines
28 weeks20 weeks
714 lines/month300 lines/month
©Ian Sommerville 1995
15
Productivity
The “Vicious Square”
+ +
--
- -
+ +
Quality Scope
Development time Cost
16
• All metrics based on volume/unit time are flawed because they do not take quality into account
• Productivity may generally be increased at the cost of quality
• It is not clear how productivity/quality metrics are related
Quality and productivity
©Ian Sommerville 1995
17
• Real-time embedded systems, 40-160 LOC/P-month
• Systems programs , 150-400 LOC/P-month
• Commercial applications, 200-800 LOC/P-month
Productivity estimates
©Ian Sommerville 1995
18
The four variables
• The main four variables of a project– Development cost– Time– Quality– Scope
• Only three of these variables can be (more or less) freely adjusted• Development cost, time and quality are bad control variables
– The number of developers can only be incrementally increased (negative effects beyond the optimal count)
– Deadlines are often predetermined externally (e.g., market window, important presentation)
– Low quality upsets customers and developers
• Scope is the only real control variable
19
Accuracy of Estimation
Feasibility Requirements Design Code Deliveryx
2x
4x
0.25x
0.5x
As a project progresses, more information aboutthe progress becomes available and the accuracyof estimation can be increased over time.
x – the actual cost of the system
Estimates on projects studied by BarryBoehm occupied the area between the curves
20
Estimation techniques• Expert judgement• Estimation by analogy• Parkinson's Law• Pricing to win• Top-down estimation• Bottom-up estimation• Function point estimation• Algorithmic cost modelling
©Ian Sommerville 1995 [modified]
21
Expert judgement• One or more experts in both software
development and the application domain use their experience to predict software costs. Process iterates until some consensus is reached.
• Advantages: Relatively cheap estimation method. Can be accurate if experts have direct experience of similar systems
• Disadvantages: Very inaccurate if there are no experts!
©Ian Sommerville 1995
22
Estimation by analogy• The cost of a project is computed by comparing
the project to a similar project in the same application domain
• Advantages: Accurate if project data available• Disadvantages: Impossible if no comparable
project has been tackled. Needs systematically maintained cost database
©Ian Sommerville 1995
23
Parkinson's Law
• The project costs whatever resources are available
• Advantages: No overspend
• Disadvantages: System is usually unfinished
©Ian Sommerville 1995
24
Pricing to win• The project costs whatever the customer has to
spend on it• Advantages: You get the contract• Disadvantages: The probability that the
customer gets the system he or she wants is small. Costs do not accurately reflect the work required
©Ian Sommerville 1995
25
Top-down estimation• Approaches may be applied using a top-down
approach. Start at system level and work out how the system functionality is provided
• Takes into account costs such as integration, configuration management and documentation
• Can underestimate the cost of solving difficult low-level technical problems
©Ian Sommerville 1995
26
Bottom-up estimation• Start at the lowest system level. The cost of each
component is estimated individually. These costs are summed to give final cost estimate
• Accurate method if the system has been designed in detail
• May underestimate costs of system level activities such as integration and documentation
©Ian Sommerville 1995
27
Function Points
• The idea of function point was first proposed by Albrecht in 1979.
• The function point of a system is a measure of the “functionality” of the system.
• Steps– Counting the information domain – counting FPs– Assessing complexity of the software – adjusting FPs– Applying an empirical relationship to come up with
LOC or P-months based on the adjusted FPs
• This method cannot be performed automatically
©Ian Sommerville 1995
28
Counting Function Points
29
Counting Function Points
• User inputs. Each user input that provides distinct application oriented data to the software is counted.
• User outputs. Each user output that provides application oriented information to the user is counted. Individual data items within a report are not counted separately.
• User inquiries. This is an on-line input that results in the generation of some response.
• Files. Each master file is counted.
• External interfaces. Each interface that is used to transmit information to another system is counted.
30
Adjusting Function Points
Answer the following questions using a scale of [0-5]: 0 not important; 5 absolutely essential. We call them influence factors (Fi).
1. Does the system require reliable backup and recovery?2. Are data communications required?3. Are there distributed processing functions?4. Is performance critical?5. Will the system run in an existing, heavily utilized operational env.?6. Does the system require on-line data entry?
31
Adjusting Function Points
7. Does the on-line data entry require the input transaction to be built over multiple screens or operations (user efficiency)?
8. Are the master files updated on-line?9. Are the inputs, outputs, files, or inquiries complex?10. Is the internal processing complex?11. Is the code designed to be reusable?12. Is installation included in the design?13. Is the system designed for multiple installations?14. Is the application designed to facilitate change and ease of
use by the user?
32
Map FPs to LOC
• Use an empirical relationship– Function point = count total [0.65 + 0.01 (sum of the 14 Fi)]– Companies may want to refine their own version
• According to a 1989 study, implementing a function point in a given programming language requires the following number of lines of code– Assembly 320– C 128– COBOL 106– C++ 64– Visual Basic 32– SQL 12
• See www.ifpug.org for more information on FP
33
Example: Your PBX project
34
Example: Your PBX project
• Total of FPs = 25
• F4 = 4, F10 = 4, other Fi’s are set to 0. Sum of all
Fi’s = 8.
• FP = 25 x (0.65 + 0.01 x 8) = 18.25• Lines of code in C = 18.25 x 128 LOC = 2336
LOC• In the past, students have implemented their
projects using about 2500 LOC.
35
Algorithmic cost modelling• Cost is estimated as a mathematical function of
product, project and process attributes whose values are estimated by project managers
• The function is derived from a study of historical costing data
• Most commonly used product attribute for cost estimation is LOC (code size)
• Most models are basically similar but with different attribute values
©Ian Sommerville 1995
36
The COCOMO model• Developed at TRW, a US defence contractor• Based on a cost database of more than 60
different projects• Exists in three stages
– Basic - Gives a 'ball-park' estimate based on product attributes
– Intermediate - Modifies basic estimate using project and process attributes
– Advanced - Estimates project phases and parts separately
©Ian Sommerville 1995
37
Project classes
• Organic mode small teams, familiar environment, well-understood applications, no difficult non-functional requirements (EASY)
• Semi-detached mode Project team may have experience mixture, system may have more significant non-functional constraints, organization may have less familiarity with application (HARDER)
• Embedded Hardware/software systems, tight constraints, unusual for team to have deep application experience (HARD)
38
Basic COCOMO Formula
• Organic mode: PM = 2.4 (KDSI) 1.05
• Semi-detached mode: PM = 3 (KDSI) 1.12
• Embedded mode: PM = 3.6 (KDSI) 1.2
• KDSI = Kilo Delivered Source Instructions
©Ian Sommerville 1995
39
Effort estimates1000
800
600
400
200
00
20 40 60 80 100 120
KDSI
Person-months
Embedded
Intermediate
Simple
©Ian Sommerville 1995
40
• Organic mode project, 32KLOC– PM = 2.4 (32) 1.05 = 91 person months– TDEV = 2.5 (91) 0.38 = 14 months– N = 91/15 = 6.5 people
• Embedded mode project, 128KLOC– PM = 3.6 (128)1.2 = 1216 person-months– TDEV = 2.5 (1216)0.32 = 24 months– N = 1216/24 = 51
COCOMO examples
©Ian Sommerville 1995
41
• Implicit productivity estimate – Organic mode = 16 LOC/day– Embedded mode = 4 LOC/day
• Time required is a function of total effort NOT team size
• Not clear how to adapt model to personnel availability
COCOMO assumptions
©Ian Sommerville 1995
42
• Takes basic COCOMO as starting point
• Identifies personnel, product, computer and
project attributes which affect cost
• Multiplies basic cost by attribute multipliers which may increase or decrease costs
Intermediate COCOMO
©Ian Sommerville 1995
43
• Personnel attributes– Analyst capability– Virtual machine experience– Programmer capability– Programming language experience– Application experience
• Product attributes– Reliability requirement– Database size– Product complexity
Personnel attributes
©Ian Sommerville 1995
44
• Computer attributes– Execution time constraints
– Storage constraints
– Virtual machine volatility
– Computer turnaround time
• Project attributes– Modern programming practices
– Software tools
– Required development schedule
Computer attributes
©Ian Sommerville 1995
45
• These are attributes which were found to be significant in one organization with a limited size of project history database
• Other attributes may be more significant for other projects
• Each organization must identify its own attributes and associated multiplier values
Attribute choice
©Ian Sommerville 1995
46
• All numbers in cost model are organization
specific. The parameters of the model must be modified to adapt it to local needs
• A statistically significant database of detailed cost information is necessary
Model tuning
©Ian Sommerville 1995
47
Predicted costs
0 20 40 60 80 100Size
Effort
Curve fitted tomeasured effort
Predictedeffort
©Ian Sommerville 1995
48
• Embedded software system on microcomputer hardware.
• Basic COCOMO predicts a 45 person-month effort requirement
• Attributes = RELY (1.15), STOR (1.21), TIME (1.10), TOOL (1.10)
• Intermediate COCOMO predicts – 45*1.15*1.21.1.10*1.10 = 76 person-months.
• Total cost = 76*$7000 = $532, 000
Example
©Ian Sommerville 1995
49
• Organic: TDEV = 2.5 (PM) 0.38
• Semi-detached: TDEV = 2.5 (PM) 0.35
• Embedded mode: TDEV = 2.5 (PM) 0.32
• Personnel requirement: N = PM/TDEV– This last formula needs to be adjusted (see
next slide)
Development time estimates
©Ian Sommerville 1995 [modified]
50
Staffing requirements• Staff required can’t be computed by diving the
development time by the required schedule• The number of people working on a project
varies depending on the phase of the project• The more people who work on the project, the
more total effort is usually required• Very rapid build-up of people often correlates
with schedule slippage• Adding more people to a delayed project will
delay it even more
©Ian Sommerville 1995 [modified]
51
Rayleigh manpower curves
Time
©Ian Sommerville 1995
Rc
Resources Rc=(t/k2) e-t2/2k2
k1 k2 k3
52
Estimation methods - Summary
• Function points– SRS -> LOC– SRS -> PM
• COCOMO– LOC -> PM– May use FP as a front-end to COCOMO
• COCOMO II– Refined version with different estimation models based on
• Requirements (FP->PM),• Early design (FP->PM), and• Architecture (FP or LOC->PM)
53
Estimation methods - Summary• Each method has strengths and weaknesses• Estimation should be based on several methods• If these do not return approximately the same
result, there is insufficient information available• Some action should be taken to find out more in
order to make more accurate estimates• Pricing to win is sometimes the only applicable
method
©Ian Sommerville 1995