Introduction to monte-carlo analysis for software development - Troy Magennis (Focused Objective)

Introduction to Monte-carlo Analysis for Software Development

Forecasting and managing software development project risks & uncertainty

Monte-carlo analysis is the tool of choice for managing risk in many fields

where risk is an inherent part of doing business. This paper examines how

to use monte-carlo techniques to understand and leverage risk in Software

Development projects and teams.

2011

Troy Magennis

Focused Objective (FocusedObjective.com)

6/1/2011



Introduction

For software development, it is often

necessary to estimate a project upfront in

order to get project approval, obtain budget

and hire the correct team size and skill-mix.

This is often at odds with the Agile

development methodology where full

upfront design and specification is avoided,

and delivery happens in small iterations

until a backlog is completed. The desire to

work iteration to iteration and choose a

finite level of work each cycle is compelling,

and it does undeniably bring value to

production earlier than a pure waterfall

approach. However, the fact still remains

that in order to provide any value to an

organization, a finite minimum level of

functionality (work) needs to be delivered

by a preferred date, within a budget

constraint; very few companies will sign off

on a project that has no target date, and an

open budget. Delays often incur high costs: not just development costs, but also lost ground as competitors launch new features first or take an increasing market share. Even with Agile

teams it is important for any development

manager or organization to be ready to

answer the following questions on an

ongoing basis –

1. How much will this product cost to

develop and deliver?

2. What is the likelihood of releasing

by date x?

3. What resources do you need to hit

date x (money equals people, so the

question is often how much more

money do you need to hit date x)?

This paper introduces a technique for

answering these questions given the risks

involved in software development and

delivery. Monte-carlo analysis is a proven

technique for determining the likelihood of

an outcome in the face of many difficult-to-measure input criteria. Monte-carlo analysis doesn’t completely eliminate risk, but it does give a far more defensible answer than the plain guesses and gut feel that are employed today (as to release date) in many software projects.

What is Monte-carlo analysis?

Monte-carlo analysis is a mathematical

technique that finds the likely patterns in an

equation's result given random input values that are constrained between likely real-world values for those inputs. In place of an equation, for most purposes a spreadsheet or software model of the real-world process

is built, and likely (but random) inputs are

fed into these models many thousands of

times to find a pattern in the results.

For example, if you know that there are

one-hundred software product stories

(features) to develop, and that from history

(or educated estimate), you know that the

shortest time it would take each story is one

day, and the longest is three days then a

Monte-carlo analysis would simulate in

software completing these one-hundred

stories with a random work time of between

one and three days; and it would do this

thousands of times. The result would be a

histogram of the total time for each

simulated project. This would be similar to



if you had the benefit of actually doing the

project one thousand times, but the

computer does this quicker. For a model

this simple, the answer can be computed by

simple averaging without employing

Monte-carlo analysis. But as the model for

developing and delivering software starts to

follow a more real-world scenario, it

quickly gets too complex for simple

arithmetic. Defects, added scope,

environment downtime and other blocking

events, staff availability are just a few of the

normal day to day events that cause a

cascading impact on software delivery, and

Monte-carlo analysis is the right tool to

manage estimation given the unpredictable nature of these events (which nonetheless tend to follow a pattern).
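The one-hundred story example above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in this paper: it assumes each story's time is drawn uniformly between one and three days (the text only says "random"), and the function names, the trial count and the 5-day histogram buckets are choices made for this example.

```python
import random
from collections import Counter


def simulate_project(num_stories, low_days, high_days, rng):
    """Total days for one simulated project; each story takes a random time in [low, high]."""
    return sum(rng.uniform(low_days, high_days) for _ in range(num_stories))


def run_trials(trials=5000, seed=1):
    """Simulate the 100-story project many times and return every total."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [simulate_project(100, 1.0, 3.0, rng) for _ in range(trials)]


totals = run_trials()

# Text histogram: group the simulated totals into 5-day buckets.
histogram = Counter(round(total / 5) * 5 for total in totals)
for bucket in sorted(histogram):
    print(f"{bucket:>4} days: {'#' * (histogram[bucket] // 50)}")

mean_total = sum(totals) / len(totals)
print(f"mean total: {mean_total:.1f} days")
```

For this simple model the mean lands close to 200 days (100 stories at an average of two days), which is the "simple averaging" answer; the histogram adds the spread around it.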

The problem with traditional estimation

Developer estimates for software stories

often turn out to be inaccurate, causing more erosion of trust in organizations than any other aspect of the relationship between the business and technology teams. When a

new project is explored, developers are

given vague single sentence descriptions of

a vision an analyst or business owner has in

their head, and asked to give an estimate.

These estimates are totaled, and that total

divided by a utilization rate for developers,

and turned into a number of weeks. From

that time forward the date is fixed, and

often the budget.

Given the vague inputs and knowledge

they will be held to this estimate, developers (or anyone put in this position) over-estimate. They add a little bit more to cover the unknowns, often doubling each estimate. Worse still, knowing that estimates are traditionally under-estimated, each time they are presented the next level of management doubles what they see or hear, either mentally or in a PowerPoint presentation.

This leads to projects not being funded

because of the excessive investment needed for even the smallest of features. At the other end of accuracy, all too often other staff aren’t in this estimate loop: QA for testing, DevOps for release management and graphic designers for the artistic flair often don’t get the chance to add their input to the estimate equation, leaving the estimates (even given the contingency fudge factor) under-estimated for high-risk features.

The organization as a whole still has the

problem of needing to make a decision on

whether the cost involved in a project will

give the return on investment needed to

proceed. For that decision, a delivery date

and the cost (staff, equipment, software

licenses, etc.) of development and delivery

are needed. Given no other option, the

developer estimates have to be taken as an

input, and therefore delivery date and

budget are fixed. Through the use of Monte-

carlo simulation, and the ongoing tuning of

historical patterns of events within an

organization, it is possible to improve the

estimates without causing more work by

the developers in estimating, or requiring

more detailed specification up-front.



Modeling software projects

This paper looks at two common Agile

methodologies and presents Monte-carlo

models for each. Scrum is a commonly

employed agile development process, and

Kanban is an emerging methodology that

shows great promise in predictable software

delivery.

Scrum Modeling with Excel

Scrum delivers value through fixed time

iterations. Teams choose a set of

functionality (stories) to deliver each

iteration time-box, measured in a points

system. For this example we will use

Microsoft Excel. The first step to Monte-

carlo simulation is to build a model of a

Scrum process using various Excel formulas

that cascade into a final amount of story

points for each simulation row. From the

story points, the number of iterations, and

therefore a date can be determined.

The inputs required for this model are –

1. Number of stories for each “size”

story (in this example, story

estimates were limited to one of 1, 2,

3, 5, 8, 13, 20, 40 units by the

developers. Also in this example,

some stories were missing estimates,

so they were spread according to the

median story size of existing story

estimates)

2. The lower bound, average and high

boundary adjustments to apply

against each estimate size. Random

numbers will fall within these

boundaries, weighted towards the

average. It’s common for small

stories to be more accurate than

massive stories with lots of

unknown risks, so these adjustments

are entered for each estimate size

level. These can be obtained by

analyzing actual versus estimate

data of already completed stories, or

if no data exists, initially guessed

3. Defect rate (expressed as 1 defect for

x points of y size)

4. Added scope rate (expressed as 1

story for every x points of the

medium story size for the project, 5

in this example)

5. Start date

6. Days per iteration (work days, for

example 10 for a two week iteration

cycle)

7. Number of story points per iteration targets (team velocity: pick a lower-bound velocity that the team will always beat, a stretch-goal velocity for the upper bound, and a velocity falling between these two limits as a starting point)

To capture this data, an input worksheet

can be built in Excel, similar to that shown

in Figure 1.



These inputs allow a simulation model to be

built. The calculations required at each step

are pretty simple, except for the random

number generation (and this is also pretty

simple in Excel). A thorough explanation of

strategies for building random numbers

that follow certain patterns evident in real-

life for a given input is covered later. For now it is just important to understand that the random numbers provided by the random number generator will, after many generations, follow a bell-curve pattern (more occurrences happen around the chosen mean value, falling off either side). When the mean sits midway between the bounds, roughly 45% fall between the lower bound and the mean, and roughly 45% between the mean and the upper bound. About 5% fall below the lower bound and about 5% above the upper bound. More advanced tuning of a model

would allow specification of an exact

probability curve for random numbers, and

this is exactly what commercial products

offer. For this example, we stay within the

simple but often indicative standard bell

curve (Normal Distribution) with the user

being able to specify the bounds and the

mean adjustment for each estimate size (the

rationale being that the bigger the estimate

size, the more variability, but it is wise to

look at historical data and make this model

conform to a team’s estimation ability).

Figure 1 - Input worksheet for Scrum simulation

Once a random adjustment is generated for each estimate size bin (a bin being the set of stories with that estimate size), it is multiplied by the total story points in that bin, and the bin results are summed. For example, Equation 1 shows the formula that draws a random adjustment within the chosen bounds and multiplies it by the total story size for that estimate bin to get that bin's adjusted story points for a single simulation event.

=NORMINV(RAND(),Mean1,(UpperBound1-LowBound1)/3.29)*Bin1Total

Equation 1 – Calculating the adjusted story points

for an estimate size using a random number within

the boundaries chosen
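Equation 1 can be mirrored outside Excel. The sketch below is an illustration only: it uses Python's random.gauss in place of NORMINV(RAND(), ...), which draws from the same Normal distribution, and the bin (50 points of stories, adjusted between 0.8x and 1.4x around a mean of 1.1x) is a hypothetical input, not data from the example worksheet. Dividing the bound width by 3.29 makes the bounds span about 90% of the draws.

```python
import random


def adjusted_bin_points(mean, low_bound, upper_bound, bin_total, rng):
    """One simulated total for an estimate bin, following the shape of Equation 1."""
    # The bounds span 3.29 standard deviations, i.e. about 90% of all draws.
    sigma = (upper_bound - low_bound) / 3.29
    return rng.gauss(mean, sigma) * bin_total


rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical bin: 50 story points in total, adjustments from 0.8x to 1.4x around 1.1x.
draws = [adjusted_bin_points(1.1, 0.8, 1.4, 50, rng) for _ in range(20000)]

inside = sum(0.8 * 50 <= d <= 1.4 * 50 for d in draws) / len(draws)
print(f"share of draws inside the bounds: {inside:.1%}")
```

Roughly nine draws in ten fall inside the bounds; the remainder land in the two tails, which is why a simulated bin can occasionally come out below the lower bound or above the upper bound.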

The equation shown in Equation 1 is

replicated for each estimate size bin, and

these are summed to obtain a final “total”

number of story points in an entire backlog.

Additional story points to account for

defects are calculated and added to the

total. The total project points are divided by the defect scale (the x in "1 defect for x points"), multiplied by the number of defects per scale, and multiplied by the number of points per defect, as shown in Equation 2.

=((Total_Project_Points/DefectScale)*DefectsPerScale)*PointsPerDefect

Equation 2 – Calculating the adjustment to add for

defects

Scope is often added after the project

begins, whether in the form of new features,

or work relating to fixing production

defects, or non-development specific tasks.

This model simply applies a single introduced-stories value following the rate specified as an input, and adds this to the story points totaled so far.

=((Total_Points/IntroducedScale)*IntroducedPerScale)*PointsPerIntroduced

Equation 3 – Calculating the adjustment to add for

introduced scope
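As a worked illustration of Equations 2 and 3, the two adjustments can be written as plain functions. The argument names follow the spreadsheet names; the input values (a 1000-point backlog, 1 defect per 50 points costing 3 points each, 1 added 5-point story per 100 points) are hypothetical, and applying both rates to the same backlog total is an assumption of this sketch.

```python
def defect_points(total_project_points, defect_scale, defects_per_scale, points_per_defect):
    """Equation 2: extra story points caused by defects."""
    return (total_project_points / defect_scale) * defects_per_scale * points_per_defect


def introduced_scope_points(total_points, introduced_scale, introduced_per_scale,
                            points_per_introduced):
    """Equation 3: extra story points caused by scope added after the project begins."""
    return (total_points / introduced_scale) * introduced_per_scale * points_per_introduced


backlog = 1000                                        # hypothetical simulated backlog total
defects = defect_points(backlog, 50, 1, 3)            # 20 defects at 3 points each
scope = introduced_scope_points(backlog, 100, 1, 5)   # 10 added stories at 5 points each

print(f"defect points: {defects}, added scope points: {scope}")
print(f"points to burn down: {backlog + defects + scope}")
```

With these inputs the backlog grows by 60 points for defects and 50 points for added scope, giving 1110 points to burn down.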

The project story point total, the defect

point total and the introduced scope total

are summed and this value represents the

total story points to burn down over the

course of a project to achieve 100%

complete. To determine the number of

iterations it would take to complete these

stories, it is a matter of simply dividing by

one of the three target points per iteration

inputs as shown in Equation 4.

=(Total_Points + Defect_Total + Increased_Scope_Total) / (PointsPerIteration*Vacation_Adjustment)

Equation 4 - Calculating the number of iterations

required to complete ALL points. In this model, we

calculate this for three different target point per

iteration inputs

Some adjustment is done at this point to

account for developer vacation time, a

particular problem for long running projects

with a large number of developers. An adjustment is calculated that reduces the number of points per iteration to account for

this. In this model, it is simply the formula

shown in Equation 5.

=1-(AvgDevsOnVacation/TotalDevs)

Equation 5 - Simple Vacation Adjustment Equation,

normally in the range of 0.9 to 0.95
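Equations 4 and 5 combine as in the sketch below. The three velocities (190, 200 and 220 points per iteration) are the ones used in this example; the point totals and the team size (1 of 20 developers on vacation on average) are hypothetical values. Rounding up to whole iterations is an assumption added here for readability and is not part of Equation 4.

```python
import math


def vacation_adjustment(avg_devs_on_vacation, total_devs):
    """Equation 5: fraction of the team available on an average day."""
    return 1 - (avg_devs_on_vacation / total_devs)


def iterations_required(total_points, defect_total, increased_scope_total,
                        points_per_iteration, adjustment):
    """Equation 4: iterations needed to burn down all points at a given velocity."""
    all_points = total_points + defect_total + increased_scope_total
    return all_points / (points_per_iteration * adjustment)


adjustment = vacation_adjustment(1, 20)  # hypothetical: 1 of 20 developers away on average

for velocity in (190, 200, 220):
    iterations = iterations_required(1000, 60, 50, velocity, adjustment)
    # math.ceil is this sketch's assumption: a partly used iteration still has to be run.
    print(f"{velocity} points/iteration: {iterations:.2f} -> {math.ceil(iterations)} iterations")
```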

The above equations are run many

thousands of times in different rows

(simply the first row is copied down in



Excel to get a set of simulation results). The

more times they are calculated, the firmer the probability patterns that emerge. Figure 2

shows the first five rows of many

thousands. Each row will determine how

many iterations it will take to complete a

backlog for the three target velocities, in this

example 190, 200 and 220 points per

iteration.

The only remaining steps to determine

completion date and probability results like

those shown in Figure 3 are to calculate the

calendar date, and how many simulation

rows fall within a given number of

iterations. The Figure 3 results ask the user

to give a range of iterations, seven to twelve

in this example. The completion date is

calculated using a convenient Excel function

that determines a date from a given number

of workdays, and optionally excludes public holidays, as Equation 6 demonstrates.

=WORKDAY(StartingDate, DaysPerIteration * Iteration_Target, Public_Holiday_Dates)

Equation 6 - Finding the date given number of

workdays. Iteration_Target will be 7 to 12 for our

example. StartingDate and DaysPerIteration are user

inputs as shown in Figure 1
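For readers working outside Excel, the behaviour of WORKDAY for a positive number of days can be approximated with the standard library. This is a simplified stand-in written for this illustration, not Excel's implementation: it treats Saturday and Sunday as the weekend and only counts forwards. The start date and the holiday list are hypothetical.

```python
from datetime import date, timedelta


def workday(start, days, holidays=()):
    """Date that is `days` working days after `start`, skipping weekends and holidays."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() is 0-4 for Monday-Friday; holidays are skipped as well.
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


starting_date = date(2011, 6, 1)             # hypothetical start date
days_per_iteration = 10                      # two week iteration cycle
public_holidays = {date(2011, 7, 4)}         # hypothetical holiday list

for iteration_target in range(7, 13):
    finish = workday(starting_date, days_per_iteration * iteration_target, public_holidays)
    print(f"{iteration_target} iterations -> {finish.isoformat()}")
```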

To calculate the percentage probability of

achieving a result at a target velocity (one of

three), the equation shown in Equation 7 is

used. This equation counts the number of

simulation rows less than the target, divides

that by the total number of simulations to

find the percentage likelihood. This is done

for each of the target velocities.

Figure 2 - The results for the calculations showing the first 5 simulations of many thousands

Figure 3 - The results showing probabilities of hitting certain dates



=(COUNTIF(Velocity1Range,"< " & Iteration_Target)/COUNT(Velocity1Range))

Equation 7 - Count the number of simulations that

complete within a target, and convert to a

percentage
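The counting in Equation 7 amounts to the short function below. The list of simulated iteration counts is made up for this illustration (a real run would have thousands of rows), and the function name is a choice made here.

```python
def probability_within(iterations_per_row, iteration_target):
    """Share of simulation rows that finish in fewer iterations than the target."""
    hits = sum(1 for value in iterations_per_row if value < iteration_target)
    return hits / len(iterations_per_row)


# Hypothetical simulation results for one target velocity.
rows = [8.2, 9.7, 10.4, 8.9, 11.6, 9.1, 12.3, 9.9]

for target in range(7, 13):
    print(f"within {target} iterations: {probability_within(rows, target):.0%}")
```

Like the COUNTIF criteria, the comparison is strictly "less than", so a row that needs exactly the target number of iterations does not count as a hit.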

The results shown in Figure 3 indicate that

as long as the team can maintain 200 story

points per iteration, they have an 87%

chance of finishing by 22 October 2009

(when this simulation was done). As a

project progresses, the model can be tuned

to improve confidence and accuracy. Defect rates can be determined from the bug tracking database (how many points for x number of defects raised), and the random number boundaries for each estimate size can be adjusted to match actual prior data. By maintaining this model, the probability of hitting a given date is always available, and it is backed by some rigor in the calculation.

Kanban Model

Modeling a Kanban project using Excel is

difficult, not because the calculations are

complex, but because the interaction

between stories would require at least one

column per story, per simulation row, and

this quickly becomes unmaintainable. A custom

application makes more sense, and this

article covers one such application.

Kanban divides the steps of delivering a

single story into columns (called Status’

throughout this article). For example, a

story might pass from Design, to

Development, to Testing, to Release. The

time taken for each story in each Status is

recorded. Work is limited in each Status,

and a new story can only be pulled from left

to right when a vacant position is available

(total cards within a Status are below the

limit). A card system on a wall using post-it notes (or an electronic version) is used to represent stories flowing from left to right as shown in Figure 4.

Figure 4 - Example digital Kanban Board visualizing work flowing from left to right through a process.

To simulate, the application takes the inputs

of the number of Status columns, and a

lower bound and upper bound for time

taken to complete stories in each status, and

the limit of stories allowed in each status at

one time (called the WIP Limit or work in

progress limit). In place of an actual

backlog, a number of initial story cards are

specified by the user as shown in Figure 5.

These inputs are enough to do a simple

simulation, where the application loops

simulating a given time interval, for

example 1 day. The simulator grabs the first

few stories and populates the first status

column. For each story a random time

within that status’ boundaries is calculated

and stories are only moved to the right when

a) that time has elapsed, b) there is an open

position that keeps the number of stories

below the WIP limit for that status. This

process continues until all cards have

traversed from the imaginary backlog to the

completed stories pile, and the time taken to

do this is recorded.
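The loop just described can be sketched as follows. This is a simplified illustration written for this passage, not the application discussed in this article: it steps one day at a time, draws each story's time in a status uniformly between that status' bounds (an assumption), and ignores defects, added scope and blocking. The four statuses and their bounds and WIP limits are hypothetical, and every WIP limit is assumed to be at least 1.

```python
import random


def simulate_kanban(num_cards, statuses, rng):
    """Days needed to move every card through all statuses.

    statuses is a list of (low_days, high_days, wip_limit) tuples, left to right.
    """
    columns = [[] for _ in statuses]  # remaining days of work for each card in a status
    backlog, done, day = num_cards, 0, 0
    while True:
        # Pull from right to left so a slot freed downstream can be reused the same day.
        for i in reversed(range(len(statuses))):
            for card in [c for c in columns[i] if c <= 0]:  # work in this status elapsed
                if i == len(statuses) - 1:
                    columns[i].remove(card)
                    done += 1
                elif len(columns[i + 1]) < statuses[i + 1][2]:  # open WIP position
                    columns[i].remove(card)
                    low, high, _ = statuses[i + 1]
                    columns[i + 1].append(rng.uniform(low, high))
        if done == num_cards:
            return day
        # Populate the first status from the backlog, up to its WIP limit.
        low, high, wip_limit = statuses[0]
        while backlog > 0 and len(columns[0]) < wip_limit:
            columns[0].append(rng.uniform(low, high))
            backlog -= 1
        # One simulated day elapses for every card on the board.
        columns = [[c - 1 for c in column] for column in columns]
        day += 1


# Hypothetical statuses: Design, Development, Testing, Release.
statuses = [(1, 3, 2), (2, 6, 3), (1, 4, 2), (1, 2, 1)]
rng = random.Random(3)  # fixed seed so the run is repeatable
results = [simulate_kanban(30, statuses, rng) for _ in range(1000)]
print(f"fastest {min(results)} days, average {sum(results) / len(results):.1f} days, "
      f"slowest {max(results)} days")
```

Running the single-pass function many times, as in the last lines, is what turns the simulation into a Monte-carlo result: the list of day counts can be charted as a histogram.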

This type of simulation avoids having to

have accurate estimates for each story by

looking at the previous lower and upper

bounds for completing stories and using

random numbers between these

boundaries. It would be a small

enhancement to add the ability to have size

for each story, but this would complicate

the model and may not increase accuracy in

a significant way. The actual times measured on previous work are likely more

indicative of future patterns. These actual

ranges can be mined from any work

tracking tool, and are often easy to read

from a Cumulative Flow Diagram, which is a graphical representation of how many cards are in each status at any given moment.

Figure 5 - Kanban Simulation Setup Screen for the basic inputs.

Defects, added scope and the time stories

spend in a “Blocked” state (no test

environment, questions to a stakeholder,

unavailable experts) are represented by

adding more stories to the backlog

according to rates specified by the user, and

extending story times by given user rates.

Each of these real-world values can be

obtained from tracking systems, the defect

database for example, or the spreadsheet

holding the story data, or initially guessed

from prior experience. Tuning these values

over time and demonstrating to the entire

team the impact of these occurrences is a

great way to manage scope creep and

quality issues in a team. Figure 6 shows

how defects are specified in our example

system where defects can be raised in

different status’ and those defects will cause

a story to start back in another status for a

random time between the specified

boundaries.

Figure 6 - Blocking rate, Defects, and Added scope

all materially impact time and need to be simulated

Kanban simulation is carried out with the specified setup either visually for a single pass, or many hundreds or thousands of times for Monte-carlo results. Once the simulation has run, this application writes the results to Excel for further analysis as shown in the histogram chart in Figure 7.

[Chart: Histogram - Completion Date Probability (Project start: 5/24/2011). Horizontal axis: Completion Date; vertical axis: Frequency (0 to 250).]

Figure 7 - Sample histogram of Monte-carlo simulated completion dates from a Kanban model

There is no absolute result, just a pattern of

the most commonly simulated completion

dates; in this case, early December to mid-December is the likely range.

Kanban simulation can answer another key

question – If you had to add staff, how

many and what skills do you need? If we

assume that each Kanban status has a

specific skillset, for example, graphic

designers in the Design status (Status 2 in

this example), Developers in the Dev status

(Status 3), QA in the Testing status (Status

4), and release management represented in

the DevOps status (Status 5) – then by

systematically increasing and decreasing

the WIP limits for each status and executing

a Monte-carlo simulation run, the status

that has the most impact can be determined.

The example simulator supports this feature, as the results in Figure 8 show.

Monte-carlo simulation offers advantages to

teams expected to give completion dates for

projects, and to model the uncertainties in a

productive way. Whilst Monte-carlo

simulation doesn’t give an exact date, it

shows the likely pattern and ranges that can

be expected, and the factors that influence

that date most, a process called Sensitivity

Analysis.

Sensitivity Analysis – What input has the most impact on date

Sensitivity analysis answers the question of

what input factor has the greatest impact on

the final result. In essence, each input is increased and decreased by 10% (a consistent amount; 10% makes the math easy), one at a time, with a simulation run each time to see how much each change impacts the final result.

Figure 8 - Kanban simulation finds what Status column increase gives the best improvement.



Commercial Monte-carlo simulation tools

make this functionality easy to visualize in

graphs, but for our Excel model, it is easy to

do by hand. To determine if defects or

introduced scope is having a bigger impact

on outcome, temporarily increase the defect

rate and then the introduced scope rate by

a percentage and take the average number

of iterations before and after the change.

Figure 9 shows such a result for the Scrum

model used earlier. Although close,

increasing the defect rate by a percentage

has more impact on average number of

iterations to complete than the same

percentage rate change for additional scope.

From observing many models, this is a

common case, and the model has helped

teams understand the impact of quality

earlier when developing code. After

reducing defect rate, look for the next most

important factor and improve that area.

Sensitivity analysis and a Monte-carlo

simulation give teams the tool to

demonstrate how little improvements

count.
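A manual sensitivity check of this kind can be sketched as below. The model is a stripped-down stand-in for the Scrum worksheet, and every number in it is hypothetical: a 1000-point backlog adjusted between 0.8x and 1.4x, 2 defects per 100 points at 3 points each, 1 added 5-point story per 100 points, and a velocity of 200 points per iteration. Each rate is raised by 10% in turn and the average number of iterations is compared with the baseline.

```python
import random


def mean_iterations(defects_per_100_points, stories_added_per_100_points,
                    trials=4000, seed=11):
    """Average iterations to complete a hypothetical backlog for the given rates."""
    rng = random.Random(seed)  # same seed on every call so only the changed input differs
    total = 0.0
    for _ in range(trials):
        backlog = rng.gauss(1.1, (1.4 - 0.8) / 3.29) * 1000          # adjusted backlog points
        defects = backlog / 100 * defects_per_100_points * 3         # 3 points per defect
        scope = backlog / 100 * stories_added_per_100_points * 5     # 5-point added stories
        total += (backlog + defects + scope) / 200                   # 200 points per iteration
    return total / trials


baseline = mean_iterations(2.0, 1.0)
more_defects = mean_iterations(2.2, 1.0)   # defect rate raised by 10%
more_scope = mean_iterations(2.0, 1.1)     # added scope rate raised by 10%

print(f"baseline:          {baseline:.3f} iterations")
print(f"defect rate +10%:  {more_defects:.3f} ({more_defects - baseline:+.3f})")
print(f"added scope +10%:  {more_scope:.3f} ({more_scope - baseline:+.3f})")
```

With these made-up inputs the defect rate has the larger effect simply because defects contribute more points per 100 backlog points than added scope does; a real model should be run with the team's own rates before drawing that conclusion.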

Relevant Random Number Generation

Random number generation is a complex

field of mathematics. For truly random

numbers, a computer is the last thing you

want. Random numbers generated by

computer are never truly random; they rely

on algorithms that attempt to be random,

but the algorithm used is repeatable given

the same starting value, therefore – not

random! For most purposes this won’t

cause an issue for the modeling we

undertake, but it is important to realize that

random number generators have flaws, and

to avoid them if they will impact the results.
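The repeatability mentioned above is easy to see for yourself. The short Python sketch below (the seed value 2011 is arbitrary) creates two generators with the same starting value and compares their output.

```python
import random

first = random.Random(2011)    # two generators started from the same seed value
second = random.Random(2011)

run_a = [first.random() for _ in range(5)]
run_b = [second.random() for _ in range(5)]

print(run_a)
print(run_b)
print("identical sequences:", run_a == run_b)
```

For simulation work this repeatability is often useful rather than harmful: fixing the seed lets a model run be reproduced exactly while it is being tuned.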

To simulate effectively, we need sets of

random numbers that fall within the likely

bounds of the real world problem. Excel

helps with the function: Rand(). Rand()

returns random numbers from 0 to 1 with

an equal chance of occurrence across those

bounds as shown in Figure 10, but

obtaining a random number within a bound

is just one part of the problem. In the real

world, the random numbers between a

range might occur more frequently around

one value, or towards one end of the range. If we

Figure 9 - Manual sensitivity analysis. Defects have more impact on average iterations than increased scope in

our example Scrum model. Motivate the team to reduce the defect rate any way they can.



ignore this, we compromise accuracy in

the final result. For example, when looking

at the actual time taken for previous

estimates, a bias towards overrunning time

could be the pattern. Even though the

boundaries might be from 80% (20% under

the estimate) to 200% (double the estimate),

the majority of actual times cluster around 175% of the estimate. Left

alone, the random number generator in

Excel would evenly distribute random

values across the range.
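The difference between an evenly spread and a biased random input can be illustrated with the 80% to 200% example. The sketch below compares uniform draws with draws from a triangular distribution peaking at 175%. The triangular shape is an assumption chosen for simplicity in this illustration; it is not the distribution used in the models in this paper.

```python
import random

rng = random.Random(5)  # fixed seed so the run is repeatable

# Actual time as a multiple of the estimate: 0.8 (20% under) to 2.0 (double).
uniform_draws = [rng.uniform(0.8, 2.0) for _ in range(10000)]
skewed_draws = [rng.triangular(0.8, 2.0, 1.75) for _ in range(10000)]  # low, high, mode


def share_above(draws, threshold):
    """Fraction of draws that exceed the threshold."""
    return sum(1 for d in draws if d > threshold) / len(draws)


print(f"uniform:    {share_above(uniform_draws, 1.5):.1%} of draws overrun by more than 50%")
print(f"triangular: {share_above(skewed_draws, 1.5):.1%} of draws overrun by more than 50%")
```

Both sets of numbers stay inside the same bounds, but the biased set produces large overruns noticeably more often, which is the pattern an evenly distributed input would hide.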

Excel supports applying bias to the random

number generation, and custom

applications take this to a whole new level

offering features not only to produce sets of

random numbers that fit a curve, but also to

look at existing data and match a random

number set to that data.

In the Scrum model covered in the article,

we forced the random number generator to

follow the common Bell Curve, or Normal

Distribution as shown in Figure 11.

[Chart: histogram of =RAND() output. Horizontal axis: Bin (about 0.00 to 0.96); vertical axis: Frequency (0 to 80).]

Figure 10 - Excel's RAND() function returns random numbers from the range 0 to 1 with equal probability

[Chart: histogram of =NORMINV(RAND(),1,1) output. Horizontal axis: Bin (about -3.05 to 4.95); vertical axis: Frequency (0 to 200).]

Figure 11 - To obtain random numbers that fit the Normal distribution (Bell curve), use the NormInv() function



“Normal” is one distribution curve of

many, and commercial Monte-carlo

packages support more than the basic

curves.

EasyFit is a commercial curve-fitting

package that will analyze existing data and

determine what probability curve fits that

data. This application will also then create a

set of random numbers that match this

curve, allowing you to simulate with a

random input that is indicative of the real

world values. One use for this type of

application is looking at the frequency of

previous estimates and employing those

values in future Monte-carlo simulations.

Figure 12 shows an actual set of estimates

from a previous project. Without any

better information, random numbers

generated from this curve could be used to

simulate future similar project estimates

without disrupting development staff for

detailed analysis. Where possible, it is

recommended to analyze prior data, or to

carefully consider the likely range of

possible values, and whether they are

weighted more frequently towards one

boundary than another when choosing a

random number distribution fit. If it is

significant, then look for commercial tools.
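Where no curve-fitting tool is at hand, a rough alternative is to draw random values directly from the historical data itself. The sketch below does this by simple resampling, which is a different and cruder technique than the curve fitting EasyFit performs. The ratios of actual to estimated time and the list of story estimates are invented for this illustration.

```python
import random

# Hypothetical history: actual time divided by estimated time for completed stories.
historical_ratios = [0.9, 1.0, 1.1, 1.2, 1.5, 1.7, 1.75, 1.75, 1.8, 2.0]

# Hypothetical estimates (in points) for the stories of a future project.
estimates = [3, 5, 8, 5, 13, 2, 8]


def simulate_total(story_estimates, ratios, rng):
    """One simulated project total: each estimate scaled by a randomly chosen past ratio."""
    return sum(estimate * rng.choice(ratios) for estimate in story_estimates)


rng = random.Random(9)  # fixed seed so the run is repeatable
totals = sorted(simulate_total(estimates, historical_ratios, rng) for _ in range(5000))

p50 = totals[len(totals) // 2]
p85 = totals[int(len(totals) * 0.85)]
print(f"estimated total: {sum(estimates)} points")
print(f"simulated median: {p50:.1f} points, 85th percentile: {p85:.1f} points")
```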

Conclusion

This article touches the surface of how to

build and model software projects using

Monte-carlo techniques. The ability to

quickly forecast a project's most likely completion date, and the impact of adding more staff or reducing defect counts, makes this analysis an important tool in any development manager's arsenal, and places them in a position to answer with a level of

Figure 12 - EasyFit from Mathwave Technologies is a commercial curve fitting tool that can create random

numbers that fit real-world data



unprecedented confidence the three likely

ongoing upper-management questions -

1. How much will this product cost to

develop and deliver?

2. What is the likelihood of hitting date

x?

3. What resources do you need to hit

date x (money equals people, so the

question is often how much more

money do you need to hit date x)?

[END]

About the Author

Troy Magennis is founder of Focused

Objective, a consulting firm that aims to

improve software development practices

and management through better tools and

education. Troy has held positions at VP

level for many companies in diverse fields

from Automotive, Financial, Image Rights

Management and Travel.

For feedback, Troy can be contacted at –

[email protected]

For more articles like this, visit us at –

http://www.FocusedObjective.com
