SDPM - Lecture 5 - Software effort estimation

Leiden Institute of Advanced Computer Science

Systems Development and Project Management – Software effort estimation

Prof. Dr. Thomas Bäck

Dates

Feb. 1   14:45 – 17:30  Introduction, Project Description
Feb. 2   13:45 – 16:30  STEP WISE Approach to Project Planning
Feb. 9   13:10 – 15:45  Selecting an Appropriate Software Dev. Approach
Feb. 15  14:45 – 17:30  Activity Planning and Resource Allocation
Feb. 16  15:15 – 18:00  Software Effort Estimation
Feb. 22  14:45 – 17:30  Risk management, project escalation
Feb. 23  13:45 – 16:30  Project monitoring and control
Mar. 1   14:45 – 17:00  Exam
Mar. 2   13:45 – 16:30  Software Quality Assurance
Mar. 8   14:45 – 17:30  Managing People; Contract Management
Mar. 9   13:45 – 16:30  Various
Mar. 15  14:45 – 17:30  Trade Fair

STEP WISE overview

0. Select project
1. Identify project objectives
2. Identify project infrastructure
3. Analyze project characteristics
4. Identify products and activities
5. Estimate effort for activity
6. Identify activity risks
7. Allocate resources
8. Review / publicize plan
9. Execute plan
10. Lower level planning

[Diagram labels: "For each activity", "Review", "Lower level detail"]

What makes a successful project?

Delivering:
• Agreed functionality
• On time
• At the agreed cost
• With the required quality

Stages:
1. Set targets
2. Attempt to achieve targets

But what if the targets are not achievable?

Software effort estimation: Notoriously difficult …

• Complexity and invisibility of software
• Intensely human activities of system development
• Cannot be treated in a purely mechanistic way
• Novel applications of software
• Changing technology
• Lack of homogeneity of project experience

The Cone of Uncertainty

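[Figure: the cone of uncertainty. Horizontal axis: time, running through the milestones Initial concept, Approved product definition, Requirements complete, User Interface Design Complete, Detailed Design Complete, Software complete. Vertical axis ticks: 4x, 2x, 1.5x, 1.25x, 1.0x, 0.8x, 0.67x, 0.5x, 0.25x.]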

TAXONOMY OF METHODS


Over and under-estimating

• Parkinson’s Law: “Work expands to fill the time available”
• An over-estimate is likely to cause a project to take longer than it would otherwise
• Brooks’ Law: “Putting more people on a late job makes it later.”
• Weinberg’s Zeroth Law of reliability: “a software project that does not have to meet a reliability requirement can meet any other requirement”
• “If you don’t care about quality, you can meet any other requirement”


Effort estimation: Taxonomy

Effort estimation methods
• Algorithmic/Parametric
  - Function Points
  - Data Points
  - Object Points
  - Web Points
  - Use Case Points
  - COCOMO
  - COCOMO II
• Empirical
  - Price-to-win
  - Parkinson
  - Analogy-based
  - Percentage-based
  - Expert-estimation: Delphi, PERT, Single person estimate

Why do we need them?

•  Complexity and invisibility of software
•  Intensely human activities of system development
•  Cannot be treated in a purely mechanistic way
•  Novel applications of software
•  Changing technology
•  Lack of homogeneity of project experience

Heemstra & Kuster (share of respondents using each method):
Expert-estimation     25.5%
Analogy               60.8%
“Capacity problem”    20.8%
Price-to-win           8.9%
Parametric models     13.7%

-  Only 50% kept project data on past projects - but 60.8% used analogy!
-  35% did not produce estimates
-  62% used methods based on intuition - only 16% used formalized methods
-  Function point users produced worse estimates!

Overview

Function Points
  Complexity: High
  Pros: Standardized, transparent
  Cons: Requires data foundation, partially subjective

COCOMO II
  Complexity: Average to high
  Pros: Good for first raw estimate, tool support available
  Cons: System size estimate required

Expert-estimation
  Complexity: Small to average
  Pros: Quick, flexible
  Cons: Subjective, missing transparency

Analogy-based
  Complexity: Small to average
  Pros: Quick for similar projects
  Cons: Not applicable to new projects, data foundation required

Percentage-based
  Complexity: Small
  Pros: Applicable early on, after initial phase
  Cons: Requires data foundation, variance for new projects

A taxonomy of estimating methods

• Expert opinion - just guessing?
• Bottom-up: activity based
  - Components are identified and sized
  - Estimates are aggregated
  - More appropriate at later, more detailed stages of project planning
• Parametric: e.g. function points
  - Use effort drivers representing characteristics of the target system and the implementation environment to predict effort

A taxonomy of estimating methods (cont’d)

• Analogy
  - A similar, completed project is identified, and its actual effort is used as the basis for the new project
• Artificial neural networks - a view of the future?
• Parkinson: based on staff-effort available (not recommended)
  - Cf. Parkinson’s law – use staff effort available
• ‘Price to win’: figure sufficiently low to win contract (not recommended)
  - Estimate what the client’s budget is

Top-down vs. bottom-up

• Top-down (usually parametric models)
  - Produce overall estimate based on project cost drivers
  - Based on past project data
  - Normally formulas such as effort = system size x productivity rate (productivity rate in hours per KLOC)
  - FP focuses on system size
  - COCOMO focuses on productivity factors
• Bottom-up
  - Use when no past project data is available

Top-down estimating

• Produce overall estimate using effort driver(s)
• Distribute proportions of overall estimate to components

Overall project: estimate 100 days
  - Design: 30%, i.e. 30 days
  - Code: 30%, i.e. 30 days
  - Test: 40%, i.e. 40 days
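The 100-day split on the top-down slide can be reproduced in a few lines of Python. This is a minimal sketch for illustration only: the function name `distribute_top_down` and the use of integer percentages are assumptions, not part of the lecture.

```python
def distribute_top_down(total_days, percentages):
    """Split an overall estimate across components by percentage share."""
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must add up to 100")
    return {name: total_days * pct / 100 for name, pct in percentages.items()}


# Shares as on the slide: design 30%, code 30%, test 40% of 100 days.
allocation = distribute_top_down(100, {"design": 30, "code": 30, "test": 40})
print(allocation)  # {'design': 30.0, 'code': 30.0, 'test': 40.0}
```

The percentages themselves would normally come from past project data, which is why the method needs a data foundation.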

Bottom-up estimating

1. Break project into smaller and smaller components
2. [Stop when you get to what one person can do in one/two weeks]
3. Estimate costs for the lowest level activities
4. At each higher level calculate estimate by adding estimates for lower levels
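The roll-up in the last step can be sketched as a recursive sum over a work breakdown structure. The dictionary layout, the function name `roll_up`, and every figure below are hypothetical and chosen only to illustrate the idea.

```python
def roll_up(node):
    """Return the estimate for a WBS node.

    A leaf carries its own estimate in 'days'; any other node is the
    sum of the estimates of its children.
    """
    children = node.get("children", [])
    if not children:
        return node["days"]
    return sum(roll_up(child) for child in children)


# Hypothetical work breakdown structure; all figures are invented.
project = {
    "name": "project",
    "children": [
        {"name": "design", "days": 8},
        {"name": "code", "children": [
            {"name": "module A", "days": 5},
            {"name": "module B", "days": 7},
        ]},
        {"name": "test", "days": 10},
    ],
}
print(roll_up(project))  # 30
```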

Empirical Methods

Price-to-win
• Figure sufficiently low to win contract

Parkinson
• Based on staff-effort available
• Parkinson’s Law: “Work expands to fill the time available”

Analogy-based
• A similar, completed project is identified, and its actual effort is used as the basis for the new project

Percentage-based
• Top-down approach, based on overall effort estimate
• Distribute percentages of effort to components in work breakdown structure

Expert-estimation
• Delphi: Multiple estimates
• Single expert estimate
• PERT: Schedule risk estimation approach


Algorithmic/parametric methods

• Parametric = top-down
  - Produce overall estimate based on project cost drivers
  - Based on past project data
  - Normally formulas such as effort = system size / productivity rate (KLOC/h)

Function points
•  Focus on system size/functionality

COCOMO
•  Focus on productivity factors


PARAMETRIC MODELS



Parametric Models

• Simplistic model: estimated effort = (system size) / productivity rate
• E.g.:
  - System size = lines of code (KLOC)
  - Productivity rate = KLOC/day
• How to derive productivity rate?
  - Based on data from past projects: productivity rate = (system size) / effort
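As a worked illustration of the two formulas, the sketch below derives a productivity rate from past projects and applies it to a new system. The project history is invented, and pooling all past projects into one rate is an assumption made here for simplicity.

```python
def productivity_rate(past_projects):
    """Productivity rate = total system size / total effort (KLOC per day)."""
    total_size = sum(size for size, _ in past_projects)
    total_effort = sum(effort for _, effort in past_projects)
    return total_size / total_effort


def estimated_effort(size_kloc, rate):
    """Estimated effort = system size / productivity rate (days)."""
    return size_kloc / rate


# Hypothetical history: (size in KLOC, effort in days) per completed project.
history = [(10, 200), (20, 400), (30, 600)]
rate = productivity_rate(history)
print(rate)                        # 0.05 KLOC per day
print(estimated_effort(15, rate))  # roughly 300 days
```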


Basic approach

[Diagram: two chained models]
  Number of file types, number of I/O transaction types -> Model -> System size
  System size, productivity factors -> Model -> Effort

Function points
•  Used to estimate system size (SLOC, source lines of code)

COCOMO
•  Focus on productivity
•  Lines of code as input

Parametric models

• COCOMO (lines of code) and function points are examples of these
• Problem with COCOMO etc.:
    Guess -> Algorithm -> Estimate
  but what is desired is…
    System characteristic -> Algorithm -> Estimate

Parametric models (cont’d)

• Examples of system characteristics
  - No. of screens x 4 hours
  - No. of reports x 2 days
  - No. of entity types x 2 days
• The quantitative relationship between the input and output products of a process can be used as the basis of a parametric model
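The three example rates (4 hours per screen, 2 days per report, 2 days per entity type) already form a tiny parametric model. The sketch below applies them to an invented system; the 8-hour working day and the counts are assumptions, since the slides give neither.

```python
HOURS_PER_DAY = 8  # assumption: the slides do not define a working day

# Effort per unit from the slide, converted to hours.
HOURS_PER_UNIT = {
    "screens": 4,
    "reports": 2 * HOURS_PER_DAY,
    "entity_types": 2 * HOURS_PER_DAY,
}


def effort_hours(counts):
    """Sum count x rate over the system characteristics."""
    return sum(HOURS_PER_UNIT[kind] * n for kind, n in counts.items())


# Hypothetical system: 10 screens, 5 reports, 6 entity types.
hours = effort_hours({"screens": 10, "reports": 5, "entity_types": 6})
print(hours, "hours =", hours / HOURS_PER_DAY, "days")  # 216 hours = 27.0 days
```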

Parametric models (cont’d)

• Simplistic model for an estimate: estimated effort = (system size) / productivity rate
• E.g.
  - System size = lines of code
  - Productivity = lines of code per day
• Productivity rate = (system size) / effort
  - Based on past projects
  - Unit annotated on the slide: KLOC per hour


Productivity Focus: COCOMO

COCOMO: Constructive Cost Model

• Based on industry productivity standards - database is constantly updated
• Allows an organization to benchmark its software development productivity
• Basic equation: $effort_{nom} = m \cdot size^{n}$
  - Effort in person-months
  - Size in thousands of delivered source code instructions (kdsi)
• Refers to a group of models

COCOMO: Constructive Cost Model

• System types:
  - Organic (small teams, in-house software development, small and flexible systems)
  - Embedded (tight product operation constraints; costly changes)
  - Semi-detached (combines elements of the above; intermediate)

COCOMO: Example

• System types:
  - Organic: m=2.4, n=1.05, o=2.5, p=0.38
  - Semi-detached: m=3.0, n=1.12, o=2.5, p=0.35
  - Embedded: m=3.6, n=1.20, o=2.5, p=0.32
• $effort = m \cdot size^{n}$ (size in kdsi)
• $time = o \cdot effort^{p} = o \cdot (m \cdot size^{n})^{p}$
• Example: Organic project, 1,500 dsi
  - $effort = 2.4 \cdot 1.5^{1.05} = 3.67$
  - $time = 2.5 \cdot 3.67^{0.38} = 4.1$
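The organic example can be checked with a short script. This is a sketch of the basic equations as given on the slide, not a full COCOMO implementation; the names `COCOMO_MODES` and `basic_cocomo` are made up for illustration. Effort is in person-months, and time is taken to be in months as in Boehm's model, a unit the slide itself does not state.

```python
# (m, n, o, p) per system type, as listed on the slide.
COCOMO_MODES = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(size_kdsi, mode):
    """Return (effort, time) for a system of size_kdsi thousand dsi."""
    m, n, o, p = COCOMO_MODES[mode]
    effort = m * size_kdsi ** n   # effort = m * size^n
    time = o * effort ** p        # time = o * effort^p
    return effort, time


effort, time = basic_cocomo(1.5, "organic")
print(f"effort = {effort:.2f}, time = {time:.1f}")  # effort = 3.67, time = 4.1
```

Because n is greater than 1 for every system type, effort grows faster than size: doubling the size more than doubles the estimate.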

COCOMO (cont’d)

• Intermediate version of COCOMO incorporates 15 cost drivers:
  - Product attributes:
    •  Required software reliability, database size used, product complexity.
  - Computer attributes:
    •  Execution time constraint, main storage constraint, virtual machine volatility, computer turnaround time.
  - Personnel attributes:
    •  Analyst capability, application experience, programmer capability, virtual machine experience, language experience
  - Project factors:
    •  Modern programming practice, software tools, development schedule.

COCOMO (cont’d)

• Complementary equation: $effort_{est} = effort_{nom} \cdot dem_{1} \cdot \ldots \cdot dem_{15}$
  ($dem_{i}$: development effort multiplier)
• $effort_{nom}$ as before (i.e., a power function of size, $m \cdot size^{n}$)
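A minimal sketch of the complementary equation follows. The multiplier values are invented placeholders, not figures from the published COCOMO tables, and the driver names `driver_1` to `driver_15` are hypothetical stand-ins for the 15 cost drivers.

```python
from math import prod


def adjusted_effort(nominal_effort, multipliers):
    """Multiply the nominal effort by every development effort multiplier."""
    return nominal_effort * prod(multipliers.values())


# Hypothetical ratings: 13 drivers at the neutral value 1.0, two deviating.
# The numeric values are invented, not taken from the COCOMO tables.
multipliers = {f"driver_{i}": 1.0 for i in range(1, 16)}
multipliers["driver_1"] = 1.15  # a driver that increases effort
multipliers["driver_9"] = 0.90  # a driver that decreases effort

print(adjusted_effort(3.67, multipliers))
```

A multiplier above 1.0 raises the estimate and one below 1.0 lowers it, so a project where all drivers are neutral keeps its nominal effort.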


System Size Focus: FUNCTION POINTS

System size: Function Points (FP)

• Based on work at IBM 1979 onwards
  - Albrecht and Gaffney wanted to measure the productivity independently of lines of code
  - Has now been developed by the International FP User Group (which is US based)
  - Mark II FPs developed by Symons mainly used in UK
• Based on functionality of the program


Albrecht function points

[Diagram: the five component types: internal logical files, external inputs, external outputs, external inquiries, external interface files]

External input types
• Input transaction through screens, forms, dialog boxes.
• Updates internal files

External output types
• Data is output to user by screens, reports, graphs, control signals.

Logical internal files
• Standing files used by system
• One or more record types (group of data that is usually accessed together)

External interface files
• Input and output passing to and from other computer applications

External inquiry types
• Transactions initiated by user
• Provide information, but do not update internal files
• User inputs some information that directs system to details required

•  Based on program functionality
•  2 “data function” types
•  3 “transactional function” types
•  Each occurrence is judged simple, average, or complex

Albrecht FP weightings

If judged simple, average, or complex, then assign the following points:

Type    Simple    Average    Complex
ILF     7         10         15
EIF     5         7          10
EI      3         4          6
EO      4         5          7
EQ      3         4          6
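With the Albrecht weightings in hand, a function point count is a weighted sum over all occurrences. The sketch below assumes a simple dictionary keyed by (type, complexity); the function name and the example counts are invented for illustration.

```python
# Albrecht weightings per component type and judged complexity.
WEIGHTS = {
    "ILF": {"simple": 7, "average": 10, "complex": 15},
    "EIF": {"simple": 5, "average": 7, "complex": 10},
    "EI": {"simple": 3, "average": 4, "complex": 6},
    "EO": {"simple": 4, "average": 5, "complex": 7},
    "EQ": {"simple": 3, "average": 4, "complex": 6},
}


def function_point_count(occurrences):
    """Sum the weighting of every (type, complexity) occurrence."""
    return sum(WEIGHTS[kind][level] * n
               for (kind, level), n in occurrences.items())


# Hypothetical system; the occurrence counts are invented.
counts = {
    ("ILF", "simple"): 4,   # 4 x 7 = 28
    ("EI", "simple"): 2,    # 2 x 3 = 6
    ("EI", "complex"): 3,   # 3 x 6 = 18
    ("EQ", "simple"): 3,    # 3 x 3 = 9
}
print(function_point_count(counts))  # 61
```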

IFPUG developments

Functional complexity was later defined by rules, e.g. for internal logical files and external interface files as below:

Record types    Data types
                1-19       20-50      > 50
1               simple     simple     average
2-5             simple     average    complex
> 5             average    complex    complex
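The ILF/EIF rule table lends itself to a small lookup function. This is a sketch of that one table only (record types against data types); the function name `ilf_eif_complexity` is an assumption made for illustration.

```python
# Rows: 1, 2-5, more than 5 record types.
# Columns: 1-19, 20-50, more than 50 data types.
LEVELS = [
    ["simple", "simple", "average"],
    ["simple", "average", "complex"],
    ["average", "complex", "complex"],
]


def ilf_eif_complexity(record_types, data_types):
    """Look up the complexity of an internal logical or external interface file."""
    if record_types < 1 or data_types < 1:
        raise ValueError("counts must be at least 1")
    row = 0 if record_types == 1 else 1 if record_types <= 5 else 2
    col = 0 if data_types <= 19 else 1 if data_types <= 50 else 2
    return LEVELS[row][col]


print(ilf_eif_complexity(1, 10))  # simple
print(ilf_eif_complexity(3, 30))  # average
print(ilf_eif_complexity(6, 60))  # complex
```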

IFPUG external input complexity

File types      Data types
                < 5        5-15       > 15
1               simple     simple     average
> 5             average    complex    complex

IFPUG external output complexity

File types      Data types
                1-19       20-50      > 50
0 or 1          simple     simple     average
> 5             average    complex    complex

IFPUG external inquiry complexity

• External inquiries are counted both as if they were an external input and an external output.
• Use higher score of the two.

Assignment 3

Rows 1 to 5 against the columns EI, EO, ILF, EIF, EQ:

1. EI: 2 FTR (LECTURER, DEP), 4 DETs. EO: 0. ILF: 4 (all 1 RET, < 20 data types). EIF: 0. EQ: 0.
2. EI: 1 FTR (LECTURER), 3 DETs. EO: 0. EQ: 1 FTR (LECTURER), 3 DETs.
3. EI: 3 FTR (TEACHING, LECTURER, COURSE), 11 DET (do not count activity_ref, staff_id, course_code). EO: 0. EQ: 0.
4. EI: 3 FTR (LECTURER, COURSE, TEACHING_ACT), 11 DETs. EO: 0. EQ: 1 FTR (TEACHING_ACT), 7 DET.
5. EI: 4 FTR, 12 DETs. EO: 0. EQ: 2 FTR (LECTURER, COURSE), 2 DET.

[Slide annotations, point values: Low: 3, Low: 3, Low: 3; 4 x Low: 7 = 28; High: 6, Low: 3, High: 6, Low: 3, High: 6]

[Slide annotations, totals: 34, 34, 37, 37, 31; :3 = 10.3, :5 = 6.8, :6 = 5.6, :4.5 = 8.2, :7.72 = 4.8]

From FPs to LoCs

Language           Minimum (minus 1 s.-dev.)   Mode (most common)   Maximum (plus 1 s.-dev.)
C                  60                          128                  170
C#                 40                          55                   80
C++                40                          55                   140
Cobol              65                          107                  150
Fortran 90         45                          80                   125
Fortran 95         30                          71                   100
Java               40                          55                   80
SQL                7                           13                   15
MS Visual Basic    15                          32                   41

After: McConnell, Steve: Software Estimation, Microsoft Press, Redmond, WA, 2006, p. 202.
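Reading the FP-to-LoC table as lines of code per function point, a function point count converts to a size range by simple multiplication. The sketch below assumes that reading; the function name `loc_range` and the 100-FP example are invented.

```python
# (minimum, mode, maximum) per language, copied from the table after McConnell.
LOC_PER_FP = {
    "C": (60, 128, 170),
    "C#": (40, 55, 80),
    "C++": (40, 55, 140),
    "Cobol": (65, 107, 150),
    "Fortran 90": (45, 80, 125),
    "Fortran 95": (30, 71, 100),
    "Java": (40, 55, 80),
    "SQL": (7, 13, 15),
    "MS Visual Basic": (15, 32, 41),
}


def loc_range(function_points, language):
    """Return (low, most likely, high) lines of code for a function point count."""
    low, mode, high = LOC_PER_FP[language]
    return function_points * low, function_points * mode, function_points * high


# Hypothetical system of 100 function points implemented in Java.
print(loc_range(100, "Java"))  # (4000, 5500, 8000)
```

The width of the range is a reminder that the conversion adds its own uncertainty on top of the function point count.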

Mark II function points

• Developed by Charles R. Symons
  - ‘Software sizing and estimating - Mark II FPA’, Wiley & Sons, 1991.
• Builds on work by Albrecht
• Work originally for CCTA:
  - Should be compatible with SSADM; mainly used in UK
• Has developed in parallel to IFPUG FPs

Mark II function points (cont’d)

For each transaction (TA) count:
• Data types input (Ni)
• Data types output (No)
• Entity types accessed (Ne)

[Diagram: a transaction with its no. of input data items, no. of entities accessed, and no. of output data items]

FP count = Ni * 0.58 + Ne * 1.66 + No * 0.26

Technical complexity adjustments (TCA)
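The Mark II formula can be applied per transaction and then totalled for the system. The sketch below uses the weightings from the slide and leaves out the technical complexity adjustment; the transaction figures and function names are hypothetical.

```python
def mark2_fp(ni, ne, no):
    """Mark II FP count for one transaction: Ni*0.58 + Ne*1.66 + No*0.26."""
    return ni * 0.58 + ne * 1.66 + no * 0.26


def system_fp(transactions):
    """Total the per-transaction counts to get a count for the system."""
    return sum(mark2_fp(ni, ne, no) for ni, ne, no in transactions)


# Hypothetical transactions as (Ni, Ne, No); the figures are invented.
transactions = [(10, 2, 5), (4, 1, 8)]
print(round(mark2_fp(10, 2, 5), 2))         # 10.42
print(round(system_fp(transactions), 2))    # 16.48
```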

Mark II function points (cont’d)

• Weightings derived by asking developers
  - How much effort was spent in previous projects
  - … on processing inputs
  - … on processing outputs
  - … on accessing and modifying stored data.
• Work out average hours of work
• Normalize averages into ratios or weightings which add up to 2.5 (adjustment to Albrecht FPs)
• … or: use industry averages.
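The normalization step can be sketched as a rescaling so that the three weightings add up to 2.5. The average hours below are invented, and the function name `normalise_weightings` is an assumption made for illustration.

```python
def normalise_weightings(average_hours, target_sum=2.5):
    """Scale average hours per element type so the weightings sum to target_sum."""
    total = sum(average_hours.values())
    return {name: hours * target_sum / total
            for name, hours in average_hours.items()}


# Hypothetical averages reported by developers (hours per element).
average_hours = {"input": 2.0, "entity": 6.0, "output": 1.0}
weightings = normalise_weightings(average_hours)
for name, weight in weightings.items():
    print(name, round(weight, 2))
```

The industry-average weightings 0.58, 1.66 and 0.26 also add up to 2.5, so locally derived weightings stay on the same scale.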

Using Mark II function points

• Calculate FPs for each transaction in a system
• Total transaction counts to get a count for the system
• Recall that estimated effort = size (FPs) x productivity rate (effort per FP)
• Productivity rate obtained from past projects


ANALOGY BASED


Estimating by analogy: case-based reasoning

[Diagram: a set of source cases, each with attribute values and a known effort, and a target case with attribute values and unknown effort (?????)]

• Select case with closest attribute values
• Use effort from source as estimate

Estimating by analogy (cont’d)

• Identify significant attributes (‘drivers’)
• Locate closest match amongst source cases for target
• Adjust for differences between source and target

Code-Oriented Approach

• Envisage number / types of modules in final system.
• Estimate SLOC of each identified program
  - Implementation language specific
• Estimate work
  - Take into account complexity and technical difficulty.
• Calculate work-days effort

Anchor and adjust

[Diagram: a map with a north arrow, a church with steeple, and forest. You are here: how do you get to red cross? Go to church by line of sight, then pace distance on a bearing.]

Machine assistance for source selection (ANGEL)

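$\text{Euclidean distance} = \sqrt{(I_t - I_s)^2 + (O_t - O_s)^2}$

[Figure: plot of number of inputs (vertical axis) against number of outputs (horizontal axis), showing the Target, Source A, and Source B, with the differences It-Is and Ot-Os marked]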
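Source selection by Euclidean distance can be sketched as a nearest-neighbour search over past projects. This is an illustration of the distance formula only, not the ANGEL tool itself; the project figures and function names are hypothetical, with I the number of inputs and O the number of outputs of the target (t) and source (s).

```python
from math import sqrt


def euclidean_distance(target, source):
    """sqrt((It - Is)^2 + (Ot - Os)^2) over inputs and outputs."""
    return sqrt((target["inputs"] - source["inputs"]) ** 2
                + (target["outputs"] - source["outputs"]) ** 2)


def closest_source(target, sources):
    """Pick the source case with the smallest distance to the target."""
    return min(sources, key=lambda s: euclidean_distance(target, s))


# Hypothetical cases; the counts and efforts are invented.
target = {"inputs": 7, "outputs": 15}
sources = [
    {"name": "Source A", "inputs": 8, "outputs": 17, "effort": 25},
    {"name": "Source B", "inputs": 5, "outputs": 10, "effort": 30},
]
best = closest_source(target, sources)
print(best["name"], best["effort"])  # Source A 25
```

The effort of the selected source is only the anchor; it still has to be adjusted for the differences between source and target.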

Stages: identify

• Significant features of the current project
• Previous project(s) with similar features
• Differences between the current and previous projects
• Possible reasons for error (risk)
• Measures to reduce uncertainty


CONCLUSIONS


Some conclusions: how to review estimates

Ask the following questions about an estimate:
• What are the task size drivers?
• What productivity rates have been used?
• Is there an example of a previous project of about the same size?
• Are there examples of where the productivity rates used have actually been found?

Strengths and Weaknesses

• Expert judgement:
  - Expert with relevant experience can provide good estimation.
  - Fast estimation.
  - Dependent on the “expert”.
  - May be biased.
  - Suffers from incomplete recall.


• Analogy:
  - Based on actual project data and past experience.
  - Similar projects may not exist.
  - Historical data may not be accurate.
• Parkinson, price-to-win:
  - Often win the contract.
  - Poor practice.
  - May have large overruns.


• Top-Down (parametric, FP, COCOMO):
  - System level focus
  - Faster and easier than bottom-up
  - Require minimal project detail.
  - Provide little detail for justifying estimates.
  - Less accurate than the other methods.


• Bottom-Up:
  - Based on detailed analysis.
  - Support project tracking better than other methods.
  - Estimates address lower level tasks.
  - May overlook system level cost factors.
  - Requires more estimation effort than top-down.
  - Difficult to perform the estimate early in the life cycle.


• Algorithmic (FP, COCOMO):
  - Objective, repeatable results.
  - Gain a better understanding of the estimation method.
  - Subjective inputs.
  - Calibrated to past projects and may not reflect the current environment.
  - Algorithms may be company specific and not be suitable for software development in general.

Heemstra and Kuster’s survey (cont’d)

• Only 50% kept project data on past projects - but 60.8% used analogy!
• 35% did not produce estimates
• 62% used methods based on intuition - only 16% used formalized methods
• Function point users produced worse estimates!