Monte Carlo Simulation OCR


  • 8/6/2019 Monte Carlo Simulation OCR


    Monte Carlo Simulation Overview

    In the business world, you often have to make far-reaching decisions based on limited information. To ascertain the full consequences of your decision, you will want to use all tools and methods available to you.

    Monte Carlo simulation, one type of risk analysis, is a powerful tool that can make you aware of the positive, as well as the negative, outcomes of your decision.

    Course Objectives

    This course will:

    Introduce you to the benefits of using Monte Carlo simulation.

    Present some basic statistical terminology.

    Expose you to two Monte Carlo simulation packages.

    Provide realistic practice exercises.

    Benefits of Using Monte Carlo Simulation

    Benefits of Using Monte Carlo Simulation: Objectives

    Once you have completed this section, you should be able to:

    Define Monte Carlo simulation.


    combinations.

    These results are presented in the form of histograms.

    Monte Carlo simulation cannot provide either a single, absolutely correct answer or the decision itself. Used with skill, however, these simulations can help you make reasonable, and better, business decisions.

    When a problem or question arises in which there is uncertainty in the variables, Monte Carlo simulation can help the analysis and lead to better decision-making.

    It can also:

    Facilitate a thorough investigation of both the direct and indirect consequences of random variation within a system.

    Identify prime sources of fluctuations.

    In the petroleum industry common uses for Monte Carlo simulation are:

    To estimate capital.

    To appraise and evaluate projects.

    To create a production forecast.

    To provide strategic planning and portfolio mix.

    In a capital-estimation problem, some of the questions investigated are:

    What would the total cost of a project be?


    When creating a production forecast, Monte Carlo simulation can help estimate:

    Production demands

    Material requirements

    Material and labor costs

    Capacity

    Rate of return

    Net present value

    Economic forecasts

    Your company can use Monte Carlo simulation for strategic planning and portfolio mix.

    You can simulate estimates of:

    Aggregate capital

    Revenue and NPV

    Rate of return

    Efficiency

    Benefits of Using Monte Carlo Simulation: Summary

    In summary, Monte Carlo simulation is any numerical method that uses random sampling to construct the solution to a physical or mathematical problem.
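To make the idea of "random sampling to construct the solution" concrete, here is a minimal sketch of a capital-estimation run in plain Python. Everything specific in it is an assumption made for illustration: the three cost items, their minimum, maximum, and most likely values, the number of trials, and the function name `simulate_total_cost`. It is not how @Risk or Crystal Ball work internally.

```python
import random
import statistics

def simulate_total_cost(n_trials=10_000, seed=42):
    """Return a list of simulated total project costs (hypothetical units)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(n_trials):
        # random.triangular takes (low, high, mode). The three cost items
        # and all of their numbers are invented for this sketch.
        drilling = rng.triangular(8.0, 15.0, 10.0)
        facilities = rng.triangular(4.0, 9.0, 5.0)
        pipeline = rng.triangular(2.0, 6.0, 3.0)
        totals.append(drilling + facilities + pipeline)
    return totals

totals = simulate_total_cost()
ordered = sorted(totals)

# Simple percentile lookups on the sorted results.
pct10 = ordered[int(0.10 * len(ordered))]
pct50 = statistics.median(ordered)
pct90 = ordered[int(0.90 * len(ordered))]

print(f"mean={statistics.fmean(totals):.2f}")
print(f"10th pct={pct10:.2f}  50th pct={pct50:.2f}  90th pct={pct90:.2f}")
```

The output is a spread of possible totals rather than a single number, which is the point made above: the simulation informs the decision but does not make it.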


    By organizing our age data in cumulative percentages, we can construct a cumulative frequency graph.

    To do this, we need to calculate the percentage of "older than" and "younger than" for each class interval.

    This figure represents the "older than" graph.

    We have been discussing cumulative probability distributions. Another way to represent distributions is with common density functions, such as the familiar bell-shaped curve (the normal distribution), an asymmetrical bell with a tail to


    Notice that the bell-shaped curve is symmetric and that the mean, median, and mode occur at the same location.

    The normal distribution actually extends infinitely in each direction, but it is customary to draw it to extend only three standard deviations on either side of the mean.

    [Figure: lognormal curve with marked points A = Mode, B = Median, C = Mean]

    Lognormal is a right-skewed curve. Notice that for this distribution, the mean, median, and mode have different values. Recall that in the normal distribution, all three values are the same.

    You will find that the lognormal distribution is very important in Monte Carlo simulation used for upstream petroleum models.

    The lognormal distribution actually extends from zero to infinity. It always represents items with positive values. There is no conventional cutoff point, as there is for a normal distribution.
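The ordering of mode, median, and mean for a lognormal distribution can be checked numerically. The sketch below uses the standard closed-form expressions for a lognormal whose underlying normal has parameters `mu` and `sigma`; the particular values chosen (0.0 and 0.75), the seed, and the sample size are assumptions for illustration only.

```python
import math
import random
import statistics

mu, sigma = 0.0, 0.75  # parameters of the underlying normal (assumed values)

# Closed-form locations for a lognormal distribution.
mode = math.exp(mu - sigma ** 2)
median = math.exp(mu)
mean = math.exp(mu + sigma ** 2 / 2)

# Draw samples to compare against the closed-form values.
rng = random.Random(1)
samples = [rng.lognormvariate(mu, sigma) for _ in range(50_000)]
sample_median = statistics.median(samples)
sample_mean = statistics.fmean(samples)

print(f"mode={mode:.3f}  median={median:.3f}  mean={mean:.3f}")
print(f"sample median={sample_median:.3f}  sample mean={sample_mean:.3f}")
print(f"smallest sample={min(samples):.4f}")
```

For any positive `sigma` the mode falls below the median, which falls below the mean, and every sample is positive, in line with the description above.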

    [Figure: triangular distribution with the Most Likely value marked]

    The triangular distribution uses three points: the minimum, the maximum, and the most likely.

    When choosing a minimum value, make sure it is a value lower than the lowest value that could ever occur.

    Likewise, the maximum value should be a value higher than the highest value that


    Review of Statistics Fundamentals: Lesson

    The Median is the point that separates the members of the data set into two groups, each with an equal number of samples. The median is also referred to as the P50 or the 50th percentile.

    For a sample with an even N (the number of data points in the sample or population), Excel picks the average of the middle two numbers (41 and 44 in this sample), which is why our median for age is 42.5.

    The Mode is the one value that occurs most frequently within a sample. The mode in our age example is 29. Although 49 also occurs twice, Excel picks the first occurring number (not necessarily the smallest) if there is a tie.

    The Mean of this example is the arithmetic average, or the sum of the values divided by the total number of measurements.

    In this case we would add up all the ages (769) and divide by the sample size (18). The


    Only the mean value changes. It is now 43.56. Notice how the mode and the median remain the same.
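The same effect can be reproduced with Python's `statistics` module. The eight ages below are hypothetical, invented for this sketch; they are not the course's 18-person sample, so the mean values differ from those quoted in the text. The list was chosen so that 29 and 49 both occur twice and the two middle values are 41 and 44.

```python
import statistics

# Hypothetical ages, invented for this sketch (not the course's sample).
ages = [29, 29, 35, 41, 44, 49, 49, 60]

def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.fmean(values),
        statistics.median(values),  # averages the two middle values when N is even
        statistics.mode(values),    # Python 3.8+: first mode encountered on a tie
    )

before = summarize(ages)

# Replace the largest age (60) with 75 and summarize again.
changed = [75 if a == 60 else a for a in ages]
after = summarize(changed)

print("before:", before)
print("after: ", after)
```

Only the first element of the tuple (the mean) moves. The median and mode are unaffected because the change touches neither the middle of the sorted list nor the most frequent value.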

    These measures (mean, median, and mode) are often referred to as characteristics of central tendency. They are very useful and essential in risk analysis.

    Skewness is a measure of the lopsidedness of a distribution. It illustrates the relationship between the mode, median, and mean.

    When the Skewness = 0, the data are symmetric: 10, 20, 30, 40, 50. A normal distribution, for example, has zero skewness.

    When the Skewness < 0, there are a few numbers much smaller than the mean: 1, 2, 30, 30, 30, 40, 50.

    When the Skewness > 0, there are a few numbers much larger than the mean: 10, 20, 30, 30, 30, 30, 70, 100.

    These data might have come from a lognormal distribution, which is always skewed right (has positive skewness).
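The sign of the skewness for small data sets like these can be checked with a few lines of Python. This sketch uses the simple population form (average cubed deviation divided by the cube of the population standard deviation); Excel's SKEW applies a small-sample correction, so its values will differ slightly. The function name `population_skewness` and the left-tailed list are assumptions made for illustration.

```python
import statistics

def population_skewness(values):
    """Average cubed deviation from the mean, divided by the cube of the
    population standard deviation (no small-sample correction)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    avg_cubed_dev = statistics.fmean([(v - mean) ** 3 for v in values])
    return avg_cubed_dev / sd ** 3

symmetric = [10, 20, 30, 40, 50]
left_tailed = [1, 2, 30, 30, 30, 40, 50]        # illustrative: a few low values
right_tailed = [10, 20, 30, 30, 30, 30, 70, 100]

for name, data in [("symmetric", symmetric),
                   ("left-tailed", left_tailed),
                   ("right-tailed", right_tailed)]:
    print(f"{name:12s} skewness = {population_skewness(data):+.3f}")
```

The symmetric list gives zero, the list with a few low values gives a negative result, and the list with a few high values gives a positive result.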


    Range of values is a descriptive device. It expresses the gap between the extremes of the data (the maximum minus the minimum). In this example, our age range is 31 (60 - 29).

    Variance indicates how scattered the data are.

    It is the sum of the squares of the differences between the individual values and the mean value, divided by the number of data points in the population.

    If you are calculating variance from a sample, you need to use N - 1 (VAR in the Excel spreadsheet) instead of N (VARP).

    In our example, if 18 were the population (for example, a physics class at a university), then the variance is 99.98. However, it is more likely that this group is a sample of a larger population (such as all undergraduate students at the university). Using N - 1, the variance becomes 105.86.

    The problem with variance is that much of what we measure cannot be thought of in terms of squared units.

    In this example, how would you use the units of years squared?

    Standard Deviation, another measure of dispersion, solves this squared unit problem. It is the square root of the variance.
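The two variance figures quoted above differ only by the factor N/(N - 1), and the standard deviations follow by taking square roots. The sketch below checks this using the numbers from the text, then shows the same relationship with Python's `statistics.pvariance` (divide by N) and `statistics.variance` (divide by N - 1) on a small list that is invented for illustration.

```python
import math
import statistics

n = 18
var_population = 99.98  # divide-by-N variance quoted in the text (VARP)

# Converting to the divide-by-(N - 1) sample variance (VAR).
var_sample = var_population * n / (n - 1)

# Standard deviations are the square roots, back in units of years.
sd_population = math.sqrt(var_population)
sd_sample = math.sqrt(var_sample)

print(f"sample variance  = {var_sample:.2f}")
print(f"population sd    = {sd_population:.2f}")
print(f"sample sd        = {sd_sample:.2f}")

# The same relationship on a small, made-up data set.
demo = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
demo_pvar = statistics.pvariance(demo)  # divides by N
demo_var = statistics.variance(demo)    # divides by N - 1
print(f"demo: pvariance={demo_pvar:.3f}  variance={demo_var:.3f}")
```

The gap between the two versions shrinks as N grows, which is why the distinction matters most for small samples such as this one.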


    What happens when we change one of the ages from 60 to 75?

    Remember that standard deviation uses the mean; therefore, every single data point affects it.

    Now that we know how to calculate the standard deviation, how is it linked to probability determination?

    One standard deviation on one side of the mean includes about 34.1% of the total observations in a normal distribution.

    Therefore, if we measure one standard deviation to the right and one standard deviation to the left of the mean, the area covered is 68.3%.

    A randomly selected value from this distribution would have only a 31.7% chance of occurring outside this area.

    Two standard deviations on either side would include 95.5% of the total curve. A randomly selected value from this distribution would have only a 4.5% chance of occurring outside this area.
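These coverage figures come straight from the normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) is shown here; the helper name `coverage_within` is an assumption for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage_within(k):
    """Probability that a normally distributed value lies within
    k standard deviations of the mean (both sides combined)."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    inside = coverage_within(k)
    print(f"within {k} sd: {inside:.4f}   outside: {1 - inside:.4f}")
```

Because the result depends only on the number of standard deviations, the same percentages apply to any normal distribution regardless of its mean or spread.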


    The next important term is correlation. Correlation (CORREL) is a relationship between two variables.

    Correlation is always between -1 and 1. When it is 0, the XY-scatter plot has no apparent trend or relationship. If the correlation is less than 0, as X increases, Y has a tendency to decrease. With a correlation greater than 0, as X increases, Y has a tendency to increase.

    In this example, there is a negative correlation between age and weight. According to these data, as people get older, their weight tends to decrease somewhat.
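The ordinary (Pearson) correlation coefficient can be written out in a few lines. In this sketch the function name `correl` simply mirrors the Excel function name, and the age and weight values are hypothetical, invented to show a negative relationship; they are not the course's data.

```python
import math

def correl(xs, ys):
    """Ordinary (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: weight tends to fall as age rises.
ages = [29, 35, 41, 44, 49, 56, 60]
weights = [180, 172, 185, 160, 166, 158, 150]

r = correl(ages, weights)
print(f"correlation(age, weight) = {r:.3f}")
```

Swapping the two arguments gives the same value, and a list correlated with itself gives 1, which matches the bounds stated above.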

    One final concept that is essential to Monte Carlo simulation is that of sensitivity.

    Sensitivity analysis identifies which input variables have the largest impact on your model. These are the variables that are causing the most uncertainty.

    Statistically, sensitivity is measured by the correlation coefficient between the inputs and the outputs. This will be discussed more in the first Crystal Ball or @RISK exercise part of the lesson.
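As a rough illustration of measuring sensitivity by correlation, the sketch below simulates a toy profit model and correlates each input with the output. The model (`profit = price * volume - cost`), the input ranges, the seed, and all names are assumptions invented for this example. It uses ordinary correlation for simplicity and is not a reproduction of what Crystal Ball or @RISK compute.

```python
import math
import random

def correl(xs, ys):
    """Ordinary (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

rng = random.Random(7)  # fixed seed for a repeatable run
inputs = {"price": [], "volume": [], "cost": []}
profit = []

for _ in range(5_000):
    # Hypothetical triangular inputs: (low, high, most likely).
    price = rng.triangular(15.0, 30.0, 20.0)
    volume = rng.triangular(90.0, 110.0, 100.0)
    cost = rng.triangular(900.0, 1100.0, 1000.0)
    inputs["price"].append(price)
    inputs["volume"].append(volume)
    inputs["cost"].append(cost)
    profit.append(price * volume - cost)  # toy model output

# Correlate each input with the output, then rank by absolute value.
sensitivity = {name: correl(values, profit) for name, values in inputs.items()}
ranked = sorted(sensitivity, key=lambda name: abs(sensitivity[name]), reverse=True)

for name in ranked:
    print(f"{name:7s} {sensitivity[name]:+.3f}")
```

In this toy setup the price input has by far the widest relative range, so it dominates the ranking; an input that reduces the output, such as cost, shows up with a negative coefficient.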

    Review of Statistics Fundamentals: Summary

    In this section we have learned that distributions are a useful method for organizing data.

    The three distributions commonly used in Monte Carlo simulations are


    Using Monte Carlo Simulation

    Using Monte Carlo Simulation: Objectives

    Many software companies have developed statistical programs to run Monte Carlo simulations. @Risk and Crystal Ball are the two most widely used packages; both are add-ons to Excel.

    This part of the lesson will explain:

    Learning @Risk and Crystal Ball menus

    Running Monte Carlo simulations

    Analyzing three common distributions

    Using Monte Carlo Simulation: Exercises

    Please choose an exercise:

    Crystal Ball

    @Risk

    Using Monte Carlo Simulation: Summary

    Although both @Risk and Crystal Ball provide an easy method of generating Monte Carlo simulations, keep in mind they can only produce results based upon your input.


    Search for reality checks by correlation, by comparisons to similar but known situations, or by checks of limits set by reality.

    Glossary

    Assumption: An estimated value (input to a spreadsheet model in Crystal Ball).

    Coefficient of Variability: A measure of relative variation that relates the Standard Deviation to the mean. Results are represented in percentages for comparison purposes. (Also called Coefficient of Variance or Coefficient of Variation.)

    Continuous Probability Distribution: A probability distribution that describes a set of uninterrupted values over a range. In contrast to a discrete distribution, a continuous distribution assumes there are an infinite number of possible values.

    Correlation: Relationship between two variables.

    Correlation Coefficient: A number between -1 and +1 that describes the degree of positive or negative correlation between variables. Correlation of +1 indicates


    Forecast: In Crystal Ball, an output for a simulation model.

    The Standard Deviation of the distribution of possible sample means. This statistic gives one indication of the accuracy of the simulation. Algebraically, the standard deviation divided by the square root of N.

    Iteration (Trial): One calculation of the user's model during a simulation. A simulation consists of many recalculations or iterations.

    Mean (Expected Value): Sum of all the values in a set divided by the total number of values in the set.

    Mean Standard Error: The Standard Deviation of the distribution of possible sample means.


    being exceeded (i.e., P50 is the 50th percentile).

    Mode: For data, the mode is the item that repeats most. For a continuous distribution, the mode is the value corresponding to the highest point on the probability density function.

    Monte Carlo Simulation: Any numerical method that uses random sampling to construct the solution to a physical or mathematical problem. It refers to the traditional method of sampling random variables in simulation modeling. Samples are chosen


    Probability Distribution: A set of all possible events and their associated probabilities.

    Range: The difference between the largest and smallest values in a data set. Range is the simplest measure of the dispersion or "risk" of a distribution.

    Risk (Uncertainty): The uncertainty or variability in the outcome of some event or decision.

    Sensitivity: The extent to which a simulation output is influenced by each of the inputs. Thus, an output is more sensitive to some input variables than others. Sensitivity is measured by a correlation coefficient between the output and the input.

    Skewness: The measure of the shape or degree of asymmetry of a distribution. A negatively skewed distribution has most of its values at the upper end of


    Uncertainty (Risk): The uncertainty or variability in the outcome of some event or decision.

    Variance: The square of the Standard Deviation. It is the measure of how widely dispersed the values are in a distribution. It is one indicator of uncertainty. Variance gives disproportionate weight to outliers, or values that are far away from the mean. When values are close to the mean, the variance is small; when they are widely scattered, the variance is larger.


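    [Spreadsheet: rows numbered 1 to 18 under "count", with columns AGE (X), (x - xave)^2, (x - xave)^3, and Weight (Y); summary rows MEDIAN, MODE, VARP, VAR, STDEVP, STDEV, SKEW, and CORREL. One surviving data row: (x - xave)^2 = 188.30, (x - xave)^3 = -2583.89, Y = 165.]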

    To get this in Excel: Tools > Data Analysis > Descriptive Statistics

    Column

    Mean                 42.72
    Standard Error        2.43
    Median               42.50
    Mode                 29.00
    Standard Deviation   10.29
    Sample Variance     105.86
    Kurtosis             -1.17
    Skewness              0.22
    Range                31.00
    Minimum              29.00
    Maximum              60.00
    Sum                 769.00
    Count                18.00
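Several entries in this summary are tied together by simple formulas, which makes a quick consistency check possible. The sketch below recomputes the mean, standard error, and range from the other reported figures; the variable names are chosen for illustration, and the inputs are the rounded values from the table, so small rounding differences are expected.

```python
import math

# Values as reported in the descriptive-statistics summary.
total = 769.0
count = 18
standard_deviation = 10.29
minimum, maximum = 29.0, 60.0

mean = total / count                                  # sum divided by count
standard_error = standard_deviation / math.sqrt(count)  # sd / sqrt(N)
value_range = maximum - minimum                        # max minus min
variance_from_sd = standard_deviation ** 2             # square of the rounded sd

print(f"mean             = {mean:.2f}")
print(f"standard error   = {standard_error:.2f}")
print(f"range            = {value_range:.2f}")
print(f"variance from sd = {variance_from_sd:.2f}")
```

The recomputed variance differs from the table's sample variance in the second decimal place only because it starts from the rounded standard deviation.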

    2340.86 175

    4313.06 185

    5157.79 139

    Nicknames

    199.73 Mean = average, arithmetic average

    0.20 Median = P50, the 50th percentile

    Mode = most likely

    99.98 VARP is the average of the squared deviations from the mean (column C).

    105.86 VAR is the sum of the squared deviations divided by N - 1 (instead of N).

    10.00 This is the sqrt of VARP.

    10.29 This is the sqrt of VAR.

    0.22 SKEW is almost the average CUBED deviation from the mean, divided by the cube of the standard deviation. Check out the formula for SKEW in Excel. Excel uses N/[(N-1)*(N-2)], which is close to 1/N. When SKEW is between -0.1 and 0.1, or even -0.2 and 0.2, the histogram would appear symmetric.

    -0.322008 CORREL is the ordinary correlation coefficient between X and Y.

    CORREL(X,Y) = CORREL(Y,X)

    CORREL is always between -1 and 1. When it is 0, the xy-scatter plot has no apparent trend.

    CORREL > 0 indicates that as X increases, Y has a tendency to increase.

    Monte Carlo software uses Rank correlation, which is CORREL on the ranks of the data.
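Rank correlation, described above as CORREL applied to the ranks of the data, can be sketched as follows. The helper names `ranks`, `correl`, and `rank_correl` are assumptions for illustration, ties are given the average of their positions (one common convention), and the sample data are invented.

```python
import math

def correl(xs, ys):
    """Ordinary (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def rank_correl(xs, ys):
    """Correlation computed on the ranks of the data instead of the raw values."""
    return correl(ranks(xs), ranks(ys))

# Invented data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(f"ordinary correlation = {correl(x, y):.3f}")
print(f"rank correlation     = {rank_correl(x, y):.3f}")
```

Because ranks ignore how far apart the values are, any relationship that is consistently increasing scores the maximum rank correlation even when the ordinary coefficient is lower.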


    Mean 41.42

    Standard Error 1.25

    Median

    Mode

    Standard Deviation

    Sample Variance

    Kurtosis

    Skewness -0.12

    Range 15.00

    Minimum 33.00

    Maximum 48.00

    Sum 497.00

    Count 12.00