Business Econometrics using SAS Tools (BEST) Class IV – Probability Refresher


Probability

Quantifying randomness
The context: an experiment that admits several possible outcomes.
Some outcome will occur.
The observer is uncertain which (or what) before the experiment takes place.
Event space = the set of possible outcomes. (Also called the sample space.)
Probability = a measure of likelihood attached to the events in the event space. (Try to define probability without using a word that means probability.)

Rules (Axioms) of Probability

An event E will occur or not occur.
P(E) is a number that equals the probability that E will occur.
By convention, 0 ≤ P(E) ≤ 1.
E' = the event that E does not occur.
P(E') = the probability that E does not occur.

Essential Results for Probability

If P(E) = 0, then E cannot (will not) occur.
If P(E) = 1, then E must (will) occur.
E and E' are exhaustive: either E or E' will occur.
Something will occur: P(E) + P(E') = 1.
Only one thing can occur. If E occurs, then E' will not occur: E and E' are exclusive.

Joint Events

Pairs (or groups) of events: A and B.
One or the other occurs: A or B, written A ∪ B.
Both events occur: A and B, written A ∩ B.
Independent events: occurrence of A does not affect the probability of B.
An addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
The product rule for independent events: P(A ∩ B) = P(A)P(B).

Application

Survey of 27,326 German individuals over 5 years. The first table gives sample proportions, the second gives frequencies.

Sample proportions:
             Female    Male      Total
Uninsured    .04186    .07242    .11429
Insured      .43691    .44880    .88571
Total        .47877    .52123    1.00000

E.g., .04186 = 1144/27326, .52123 = 14243/27326.

Frequencies:
             Female    Male      Total
Uninsured    1144      1979      3123
Insured      11939     12264     24203
Total        13083     14243     27326

The Addition Rule - Application

An individual is drawn randomly from the sample of 27,326 observations. Using the sample proportions above:
P(Female or Insured) = P(Female) + P(Insured) - P(Female and Insured) = .47877 + .88571 - .43691 = .92757

Independent Events

Events are independent if the occurrence of one does not affect probabilities related to the other.
Events A and B are independent if P(A|B) = P(A). I.e., conditioning on B does not affect the probability of A.

Using Conditional Probabilities: Bayes Theorem
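The formula for this slide did not survive extraction. In its standard form, Bayes' theorem reads P(A|B) = P(B|A)P(A) / P(B). The sketch below checks it, together with the addition rule and the independence condition P(A|B) = P(A), on the frequencies in the insurance table. The helper name `prob` and the choice of events are illustrative assumptions, not part of the original slides.

```python
from fractions import Fraction

# Frequencies copied from the survey table (27,326 individuals).
counts = {
    ("Uninsured", "Female"): 1144,
    ("Uninsured", "Male"): 1979,
    ("Insured", "Female"): 11939,
    ("Insured", "Male"): 12264,
}
n = sum(counts.values())


def prob(status=None, sex=None):
    """Probability of the table cells matching the given status and/or sex.

    Hypothetical helper written for this illustration.
    """
    total = sum(
        c for (s, g), c in counts.items()
        if (status is None or s == status) and (sex is None or g == sex)
    )
    return Fraction(total, n)


p_female = prob(sex="Female")
p_insured = prob(status="Insured")
p_both = prob(status="Insured", sex="Female")

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_either = p_female + p_insured - p_both

# Conditional probabilities from the definition P(A|B) = P(A and B) / P(B).
p_insured_given_female = p_both / p_female
p_female_given_insured = p_both / p_insured

# Bayes' theorem: P(Female|Insured) = P(Insured|Female) P(Female) / P(Insured).
bayes = p_insured_given_female * p_female / p_insured
assert bayes == p_female_given_insured

print(f"P(Female or Insured) = {float(p_either):.5f}")
print(f"P(Insured | Female)  = {float(p_insured_given_female):.5f}")
print(f"P(Insured)           = {float(p_insured):.5f}")
```

Working from the raw counts, P(Female or Insured) agrees with the slide's .92757 up to rounding of the proportions. P(Insured | Female) comes out slightly above P(Insured), so in this sample the two events are close to, but not exactly, independent.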

Drug Testing

Notation
+ = test indicates disease, - = test indicates no disease
D = presence of disease, N = absence of disease

Known data
P(Disease) = P(D) = .005 (fairly rare) (incidence)
P(Test correctly indicates disease) = P(+|D) = .98 (sensitivity) (correct detection of the disease)
P(Test correctly indicates absence) = P(-|N) = .95 (specificity) (correct failure to detect the disease)

Objectives: deduce
P(D|+) (probability disease really is present | test positive)
P(N|-) (probability disease really is absent | test negative)

Note, P(D|+) = the probability that a patient actually has the disease when the test says they do.

More Information

Deduce: Since P(+|D) = .98, we know P(-|D) = .02, because P(-|D) + P(+|D) = 1. [P(-|D) is P(false negative).]

Deduce: Since P(-|N) = .95, we know P(+|N) = .05, because P(-|N) + P(+|N) = 1. [P(+|N) is P(false positive).]

Deduce: Since P(D) = .005, P(N) = .995, because P(D) + P(N) = 1.

Now, Use Bayes Theorem
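The worked calculation for this slide was lost in extraction. As a minimal sketch, the two objectives can be computed from the known data with the standard two-hypothesis form of Bayes' theorem. The function name `posterior` is a hypothetical helper, not something from the slides.

```python
def posterior(prior, likelihood, likelihood_alt):
    """P(H | evidence) for a hypothesis H and its complement.

    prior          = P(H)
    likelihood     = P(evidence | H)
    likelihood_alt = P(evidence | not H)
    """
    numerator = likelihood * prior
    return numerator / (numerator + likelihood_alt * (1.0 - prior))


p_d = 0.005          # P(D), incidence
sensitivity = 0.98   # P(+|D)
specificity = 0.95   # P(-|N)

# P(D|+) = P(+|D)P(D) / [P(+|D)P(D) + P(+|N)P(N)], with P(+|N) = 1 - specificity.
p_d_given_pos = posterior(p_d, sensitivity, 1.0 - specificity)

# P(N|-) = P(-|N)P(N) / [P(-|N)P(N) + P(-|D)P(D)], with P(-|D) = 1 - sensitivity.
p_n_given_neg = posterior(1.0 - p_d, specificity, 1.0 - sensitivity)

print(f"P(D|+) = {p_d_given_pos:.4f}")
print(f"P(N|-) = {p_n_given_neg:.6f}")
```

With these inputs P(D|+) is only about 9%: because the disease is rare, most positive results come from the 5% false-positive rate applied to the large healthy group. P(N|-) is above 99.9%.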

Expected Value - A Risky Business Venture

4 alternative projects: success depends on economic conditions, which cannot be forecasted perfectly.

                 Boom       Recession   Expected
(Probability)    (90%)      (10%)       Value
Beer             -10,000    +12,000     -7,800
Fine Wine        +20,000    -8,000      +17,200
Both             +10,000    +4,000      +9,400
T-bill           +3,000     +3,000      +3,000

Actuarially Fair Insurance

Insurance policy
You pay premium = F
If you collect on the policy, the payout = W
Probability they pay you = P
Expected profit to them is E[Profit] = F - P x W > 0 if F/W > P
When is insurance fair? E[Profit] = 0?
Applications
Automobile deductible
Consumer product warranties

Litigation Risk Analysis

Form probability tree for decisions and outcomes.
Determine conditional expected payoffs (gains or losses).
Choose strategy to optimize expected value of payoff function (minimize loss or maximize (net) gain).

Litigation Risk Analysis: Using Probabilities to Determine a Strategy

Two paths to a favorable outcome. Probability = (upper) .7(.6)(.4) + (lower) .5(.3)(.6) = .168 + .09 = .258.
How can I use this to decide whether to litigate or not?
Suppose the cost to litigate = $1,000,000 and a favorable outcome pays $3,000,000. What should you do?
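The expected values on these slides can be reproduced with a few lines of code. The sketch below recomputes the project table and then applies the same calculation to the litigation question. It assumes the litigation cost is paid whatever the outcome and that an unfavorable outcome pays nothing. Neither assumption is stated on the slide, so the result is one possible reading, not the author's answer.

```python
def expected_value(outcomes):
    """Sum of probability * payoff over (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)


# Project payoffs from the slide: (boom, recession).
p_boom, p_recession = 0.90, 0.10
projects = {
    "Beer": (-10_000, 12_000),
    "Fine Wine": (20_000, -8_000),
    "Both": (10_000, 4_000),
    "T-bill": (3_000, 3_000),
}
project_ev = {
    name: expected_value([(p_boom, boom), (p_recession, recession)])
    for name, (boom, recession) in projects.items()
}

# Litigation: two paths lead to a favorable outcome.
p_favorable = 0.7 * 0.6 * 0.4 + 0.5 * 0.3 * 0.6
payout, cost = 3_000_000, 1_000_000
# Assumption: the cost is paid regardless of outcome and an unfavorable
# outcome pays 0.
litigation_ev = expected_value(
    [(p_favorable, payout), (1 - p_favorable, 0)]
) - cost

for name, ev in project_ev.items():
    print(f"{name:10s} {ev:+,.0f}")
print(f"P(favorable) = {p_favorable:.3f}, E[net payoff] = {litigation_ev:+,.0f}")
```

Under these assumptions the expected net payoff from litigating is negative (.258 x 3,000,000 - 1,000,000), so a decision maker who looks only at expected value would not litigate at these figures.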

P(Upper path) = P(Causation|Liability,Document) P(Liability|Document) P(Document) = P(Causation,Liability|Document) P(Document) = P(Causation,Liability,Document) = .7(.6)(.4) = .168. (Similarly for lower path, probability = .5(.3)(.6) = .09.)

Random Variable

Definition: A variable that will take a value assigned to it by the outcome of a random experiment.

Realization of a random variable: The outcome of the experiment after it occurs. The value that is assigned to the random variable is the realization. X = the variable, x = the realization.

Use random variables to organize the information about a random occurrence.

Types of Random Variables

Discrete: Takes integer values.
  Finite: How many female children in families with 4 children; values = 0, 1, 2, 3, 4.
  Infinite: How many people will catch a certain disease per year in a given population? Values = 0, 1, 2, 3, ... (How can the number be infinite? It is a model.)

Continuous: A measurement. How long will a light bulb last? Values = 0 to ∞.

Intervals and preferences: On the scale 1=worst, 2, 3, 4, 5=best, how do you feel about candidate _____ ? (What does this ranking mean? Intensity of feelings should be continuously variable.)

Probability Distribution

Range of the random variable = the set of values it can take.
Discrete: A set of integers. May be finite or infinite.
Continuous: A range of values.
Probability distribution: Probabilities associated with values in the range.

Notation

Probability distribution = probabilities assigned to outcomes.

P(X=x) or P(Y=y) is common.

Probability function = P_X(x). Sometimes called the density function.

Cumulative probability is Prob(X ≤ x) for a specific value x.
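To make the notation concrete, here is a minimal sketch of a probability function and its cumulative probabilities for the earlier example, the number of female children in families with 4 children. It assumes each child is female with probability 1/2, independently of the others (a binomial model). That assumption is introduced here for illustration and is not stated in the slides.

```python
from fractions import Fraction
from math import comb

n_children = 4
p_female = Fraction(1, 2)  # assumption for illustration, not from the slides

# Probability function P_X(x) = P(X = x) for x = 0, 1, ..., 4.
pmf = {
    x: comb(n_children, x) * p_female**x * (1 - p_female) ** (n_children - x)
    for x in range(n_children + 1)
}

# Cumulative probability Prob(X <= x): running sum of the probability function.
cdf = {}
running = Fraction(0)
for x in sorted(pmf):
    running += pmf[x]
    cdf[x] = running

for x in sorted(pmf):
    print(f"x={x}  P(X=x)={pmf[x]}  Prob(X<=x)={cdf[x]}")
```

Exact fractions are used so that the probabilities sum to exactly 1 and the last cumulative probability equals 1 without rounding error.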

Rules for Probabilities

1. 0 ≤ P(x) ≤ 1 (Valid probabilities)

2. Σ_x P(x) = 1 (The probabilities of all values in the range sum to 1)

3. For different values of x, say A and B, Prob(X=A or X=B) = P(A) + P(B)

Common Results for Random Variables

Concentration of Probability
Here μ is the mean of the random variable and σ is its standard deviation.
For almost any random variable, 2/3 of the probability lies within μ ± 1σ.
For almost any random variable, 95% of the probability lies within μ ± 2σ.
For almost any random variable, more than 99.5% of the probability lies within μ ± 3σ.
What it means: For any random outcome, an (observed) outcome more than one σ away from μ is somewhat unusual. One that is more than 2σ away is very unusual. One that is more than 3σ away from the mean is so unusual that it might be an outlier (a freak outcome).

Probabilities for Two Events, A, B

Marginal Probability = The probability of an event not considering any other events. P(A)
Joint Probability = The probability that two events happen at the same time. P(A,B)
Conditional Probability = The probability that one event happens given that another event has happened. P(A|B)

Independence

Random variables are independent if the occurrence of one does not affect the probability distribution of the other.

If P(Y|X) does not change when X changes, then the variables are independent.

Two Important Math Results

For two random variables, P(X,Y) = P(X|Y) P(Y).
  P(Color blind, Male) = P(Color blind|Male) P(Male) = .05 x .5 = .025

For two independent random variables, P(X,Y) = P(X) P(Y).
  P(Ace, Heart) = P(Ace) x P(Heart). (This does not work if they are not independent.)

Measuring How Variables Move Together: Covariance
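The covariance formula for this slide did not survive extraction. The standard definition for discrete random variables, which matches the slide's description of a measure that is positive when Y tends to be above its mean whenever X is above its mean, is given below. The symbols μ_X and μ_Y (the means of X and Y) are introduced here as notation.

```latex
\sigma_{XY} = \operatorname{Cov}(X,Y)
            = E\big[(X-\mu_X)(Y-\mu_Y)\big]
            = \sum_{x}\sum_{y} (x-\mu_X)(y-\mu_Y)\,P(X=x,\,Y=y)
```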

Covariance can be positive or negative.
The measure will be positive if it is likely that Y is above its mean when X is above its mean.
It is usually denoted σ_XY.

Correlation is Units Free
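The correlation formula for this slide was also lost. The standard definition, with the covariance as its numerator, is:

```latex
\rho_{XY} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y}
          = \frac{\sigma_{XY}}{\sigma_X\,\sigma_Y},
\qquad -1 \le \rho_{XY} \le 1
```

The covariance carries the units of X times the units of Y, and so does the product of the two standard deviations, so the units cancel in the ratio. That is the sense in which correlation is units free.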

Aspect of Correlation

Independence implies zero correlation. If the variables are independent, then the numerator of the correlation coefficient is 0.

Math Facts 1 - Mean of a Sum

Mean of a sum. The mean of X+Y = E[X+Y] = E[X] + E[Y]

Mean of a weighted sum. Mean of aX + bY = E[aX] + E[bY] = aE[X] + bE[Y]

Math Facts 2 - Variance of a Sum

Variance of a sum. Var[x+y] = Var[x] + Var[y] + 2Cov(x,y)
Variance of a sum equals the sum of the variances only if the variables are uncorrelated.
Standard deviation of a sum. The standard deviation of x+y is not, in general, equal to the sum of the standard deviations.

Math Facts 3 - Variance of a Weighted Sum

Var[ax+by] = Var[ax] + Var[by] + 2Cov(ax,by) = a^2 Var[x] + b^2 Var[y] + 2ab Cov(x,y).

Also, Cov(x,y) is the numerator in ρ_xy, so Cov(x,y) = ρ_xy σ_x σ_y.
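The math facts above can be checked numerically. The sketch below uses a small joint distribution made up purely for illustration (the probabilities and the weights a and b are not from the slides) and verifies the mean and variance of a weighted sum with exact fractions.

```python
from fractions import Fraction as F
from math import sqrt

# Hypothetical joint distribution P(X=x, Y=y), chosen only for illustration.
joint = {
    (0, 0): F(3, 10),
    (0, 1): F(1, 10),
    (1, 0): F(2, 10),
    (1, 1): F(4, 10),
}


def expect(g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())


def var(g):
    """Var[g(X, Y)] = E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2


mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = var(lambda x, y: x)
var_y = var(lambda x, y: y)
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))
rho_xy = float(cov_xy) / sqrt(float(var_x) * float(var_y))

a, b = 2, -3  # arbitrary weights for the weighted sum aX + bY

# Mean of a weighted sum: E[aX + bY] = aE[X] + bE[Y].
mean_sum = expect(lambda x, y: a * x + b * y)
assert mean_sum == a * mean_x + b * mean_y

# Variance of a weighted sum: a^2 Var[X] + b^2 Var[Y] + 2ab Cov(X, Y).
var_sum = var(lambda x, y: a * x + b * y)
assert var_sum == a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

print(f"Cov = {cov_xy}, rho = {rho_xy:.4f}, Var[aX+bY] = {var_sum}")
```

Because the covariance in this example is positive and the weights have opposite signs, the cross term 2ab Cov(X,Y) is negative, so the variance of the weighted sum is smaller than a^2 Var[X] + b^2 Var[Y] alone.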
