Random Variables
Streamlining Probability: Probability Distribution, Expected Value and Standard Deviation of a Random Variable
Graphically and Numerically Summarize a Random Experiment
Principal vehicle by which we do this: random variables
A random variable assigns a number to each outcome of an experiment
Random Variables
Definition: A random variable is a numerical-valued variable whose value is based on the outcome of a random event.
Denoted by upper-case letters X, Y, etc.
When the number of possible values of X is finite (number of heads in 3 tosses of a coin) or countably infinite (number of tosses until you get 3 heads in a row), the random variable is discrete. (Will study continuous rv’s later).
Examples: Discrete rv’s
1. X = # of games played in a randomly selected World Series
Possible values of X are x=4, 5, 6, 7
2. Y=score on 13th hole (par 5) at Augusta National golf course for a randomly selected golfer on day 1 of 2015 Masters
y=3, 4, 5, 6, 7
Examples: Discrete rv’s
Number of girls in a 5 child family
Number of customers that use an ATM in a 1-hour period.
Number of tosses of a fair coin that is required until you get 3 heads in a row (note that this discrete random variable has a countably infinite number of possible values: x=3, 4, 5, 6, 7, . . .)
CUSIP     IND  CONAME                       PE    NPM
60855410  4    MOLEX INC                    24.7  8.7
40262810  5    GULFMARK INTL INC            21.4  8.1
81180410  4    SEAGATE TECHNOLOGY           21.3  2.2
46489010  9    ISOMEDIX INC                 25.2  21.1
69318010  9    PCA INTERNATIONAL INC        21.4  4.7
26157010  7    DRESS BARN INC               24.5  4.5
90249410  4    TYSON FOODS INC              20.9  3.9
4886910   5    ATLANTIC SOUTHEAST AIRLINES  20.1  15.7
87183910  9    SYSTEM SOFTWARE ASSOC INC    23.7  11.6
62475210  4    MUELLER (PAUL) CO            14.5  3.9
36473510  7    GANTOS INC                   15.7  1.8
00755P10  9    ADVANTAGE HEALTH CORP        23.3  5.3
23935910  2    DAWSON GEOPHYSICAL CO        14.9  9.3
68555910  4    ORBIT INTERNATIONAL CP       15.0  3.0
16278010  4    CHECK TECHNOLOGY CORP        17.1  3.2
51460610  4    LANCE INC                    19.0  8.5
4523710   4    ASPECT TELECOMMUNICATIONS    25.7  8.2
74555310  4    PULASKI FURNITURE CORP       22.0  2.1
80819410  4    SCHULMAN (A.) INC            19.4  6.0
19770920  9    COLUMBIA HOSPITAL CORP       18.3  3.1
23790310  4    DATA MEASUREMENT CORP        11.3  2.6
11457710  4    BROOKTREE CORP               13.8  13.6
00431L10  9    ACCESS HEALTH MARKETING INC  22.4  11.0
29605610  4    ESCALADE INC                 10.8  2.0
23303110  4    DBA SYSTEMS INC              6.3   5.0
64124610  4    NEUTROGENA CORP              27.2  9.0
59492810  6    MICROAGE INC                 9.0   0.5
22821010  7    CROWN BOOKS CORP             24.4  1.8
190710    4    AST RESEARCH INC             9.7   7.3
46978310  6    JACO ELECTRONICS INC         31.9  0.4
531320    4    ADAC LABORATORIES            18.5  10.6
49766010  4    KIRSCHNER MEDICAL CORP       33.0  0.8
30205210  4    EXIDE ELECTRS GROUP INC      29.0  2.4
46065P10  5    INTERPROVINCIAL PIPE LN      11.9  19.2
19247910  4    COHERENT INC                 40.2  1.2
Data Variables and Data Distributions
Data variables are known outcomes.
DATA DISTRIBUTION: Price-Earnings Ratios
Class (bin)  Class boundary  Tally           Frequency  Relative frequency
1            6.00-12.99      |||| |          6          6/35 = 0.171
2            13.00-19.99     |||| ||||       10         10/35 = 0.286
3            20.00-26.99     |||| |||| ||||  14         14/35 = 0.400
4            27.00-33.99     ||||            4          4/35 = 0.114
5            34.00-40.99     |               1          1/35 = 0.029
Handout 2.1, P. 10
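The frequency table can be rebuilt directly from the PE column. The sketch below is illustrative only: the list name `pe` and the variable names are assumptions, and the class boundaries are the ones shown on the slide.

```python
# P/E ratios (PE column) of the 35 companies in the data table.
pe = [24.7, 21.4, 21.3, 25.2, 21.4, 24.5, 20.9, 20.1, 23.7, 14.5,
      15.7, 23.3, 14.9, 15.0, 17.1, 19.0, 25.7, 22.0, 19.4, 18.3,
      11.3, 13.8, 22.4, 10.8, 6.3, 27.2, 9.0, 24.4, 9.7, 31.9,
      18.5, 33.0, 29.0, 11.9, 40.2]

# Class boundaries taken from the slide (five classes of width 7).
bins = [(6.00, 12.99), (13.00, 19.99), (20.00, 26.99),
        (27.00, 33.99), (34.00, 40.99)]

# Frequency = number of observations in each class.
freq = [sum(lo <= x <= hi for x in pe) for lo, hi in bins]

# Relative frequency = frequency / number of observations.
rel_freq = [round(f / len(pe), 3) for f in freq]

for (lo, hi), f, r in zip(bins, freq, rel_freq):
    print(f"{lo:5.2f}-{hi:5.2f}  {f:2d}  {r:.3f}")
```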
Data distributions tell us what happened.
Random Variables and Probability Distributions
Random variables are unknown chance outcomes.
Probability distributions tell us what is likely to happen.
Notation
X = the random variable (profits)
xi = outcome i

Economic Scenario  Profit X ($ Millions)  Probability
Great              x1 = 10                0.20
Good               x2 = 5                 0.40
OK                 x3 = 1                 0.25
Lousy              x4 = -4                0.15
Notation
P is the probability.
p(xi) = Pr(X = xi) is the probability of X being outcome xi.
p(x1) = Pr(X = 10) = .20
p(x2) = Pr(X = 5) = .40
p(x3) = Pr(X = 1) = .25
p(x4) = Pr(X = -4) = .15
Probability Histogram
[Figure: probability histogram of profit. Horizontal axis: Profit (-4 to 12); vertical axis: Probability (.05 to .40); bars at -4 (Lousy), 1 (OK), 5 (Good) and 10 (Great).]
Probability Distribution of Number of Games Played in Randomly Selected World Series
Estimate based on results from 1946 to 2014.
x     4               5               6               7
p(x)  12/65 = 0.185   12/65 = 0.185   14/65 = 0.215   27/65 = 0.415
Probability Histogram
[Figure: probability histogram of the number of games in a randomly selected World Series; bar heights 0.185, 0.185, 0.215 and 0.415 at x = 4, 5, 6 and 7.]
Probability distributions: requirements
Notation: p(x)= Pr(X = x) is the probability that the random variable X has value x
Requirements
1. 0 ≤ p(x) ≤ 1 for all values x of X
2. Σ p(x) = 1, where the sum is over all values x of X
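Both requirements are easy to check mechanically. The following is a minimal sketch, assuming a distribution is stored as a dictionary from value x to p(x); the helper name `is_valid_distribution` is hypothetical and not part of the slides.

```python
import math


def is_valid_distribution(p):
    """Check the two requirements for a discrete probability distribution.

    `p` maps each possible value x to its probability p(x).
    """
    probs = list(p.values())
    # Requirement 1: every probability lies between 0 and 1.
    in_range = all(0 <= q <= 1 for q in probs)
    # Requirement 2: the probabilities sum to 1 (up to rounding error).
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=1e-9)
    return in_range and sums_to_one


# The two distributions used in the slides.
profit = {10: 0.20, 5: 0.40, 1: 0.25, -4: 0.15}
world_series = {4: 12 / 65, 5: 12 / 65, 6: 14 / 65, 7: 27 / 65}

print(is_valid_distribution(profit), is_valid_distribution(world_series))
```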
Expected Value of a Discrete Random Variable
A measure of the “middle” of the values of a random variable
k = the number of outcomes
µ = x1·p(x1) + x2·p(x2) + x3·p(x3) + ... + xk·p(xk)
Weighted mean: each outcome is weighted by its probability.
µ is called the mean or expected value of X.
Sample Mean
X̄ = (1/n) Σ Xi, summed over i = 1, ..., n
X̄ = (x1 + x2 + x3 + ... + xn)/n = (1/n)·x1 + (1/n)·x2 + (1/n)·x3 + ... + (1/n)·xn
Expected value
E(X) = Σ xi·P(X = xi), summed over i = 1, ..., k
Other Weighted Means
1. Stock Market: The Dow Jones Industrial Average
The “Dow” consists of 30 companies (the 30 companies in the “Dow” change periodically)
To compute the Dow Jones Industrial Average, a weight proportional to the company’s “size” is assigned to each company’s stock price
2. GPA
A=4, B=3, C=2, D=1, F=0
Five 3-hour courses: 2 A's (6 hrs), 1 B (3 hrs), 2 C's (6 hrs)
GPA = (4·6 + 3·3 + 2·6)/15 = 45/15 = 3.0
Mean
k = the number of outcomes (k=4)
µ = x1·p(x1) + x2·p(x2) + x3·p(x3) + ... + xk·p(xk)
EXAMPLE
µ = 10·.20 + 5·.40 + 1·.25 - 4·.15 = 3.65 ($ mil)
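The weighted sum for the profit example can be reproduced in a few lines. This is a sketch under the assumption that the distribution is stored as a dictionary; `expected_value` is an illustrative helper name.

```python
def expected_value(dist):
    """Return the sum of x * p(x) over all values x of the random variable."""
    return sum(x * p for x, p in dist.items())


# Profit ($ millions) and probability for each economic scenario.
profit = {10: 0.20, 5: 0.40, 1: 0.25, -4: 0.15}

mu = expected_value(profit)
print(round(mu, 2))
```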
[Figure: probability histogram of profit (horizontal axis: Profit, -4 to 12; vertical axis: Probability) with the mean µ = 3.65 marked on the profit axis.]
Interpretation
E(x) is not the value of the random variable x that you “expect” to observe if you perform the experiment once
E(x) is a “long run” average: if you perform the experiment many times and observe the random variable x each time, then the average x̄ of these observed x-values gets closer to E(x) as you observe more and more values of x.
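A small simulation illustrates the long-run interpretation, using the profit distribution from the earlier example (E(x) = 3.65). It is a sketch only: the seed, the sample sizes and the helper name `sample_average` are arbitrary choices, not from the slides.

```python
import random

values = [10, 5, 1, -4]            # profit outcomes ($ millions)
probs = [0.20, 0.40, 0.25, 0.15]   # their probabilities

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_average(n):
    """Average of n independent simulated observations of the random variable."""
    draws = rng.choices(values, weights=probs, k=n)
    return sum(draws) / n


# The averages should tend to settle near E(x) = 3.65 as n grows.
for n in (10, 1_000, 100_000):
    print(n, round(sample_average(n), 3))
```

With only 10 observations the average can be far from 3.65; with 100,000 it is typically within a few hundredths.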
Example: Green Mountain Lottery
State of Vermont
choose 3 digits from 0 through 9; repeats allowed
win $500

x     $0    $500
p(x)  .999  .001

E(x) = $0(.999) + $500(.001) = $.50
Example (cont.)
E(x) = $.50
On average, each ticket wins $.50.
Important for Vermont to know
E(x) is not necessarily a possible value of the random variable (values of x are $0 and $500)
Example
Let X = number of heads in 3 tosses of a fair coin.
So the probability distribution of X is:

x     0    1    2    3
p(x)  1/8  3/8  3/8  1/8

E(x) (or µ) is
E(x) = Σ xi·p(xi), summed over i = 1, ..., 4
     = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 12/8 = 1.5
US Roulette Wheel and Table
The roulette wheel has alternating black and red slots numbered 1 through 36.
There are also 2 green slots numbered 0 and 00.
A bet on any one of the 38 numbers (1-36, 0, or 00) pays odds of 35:1; that is . . .
If you bet $1 on the winning number, you receive $36, so your winnings are $35
American Roulette 0 - 00 (The European version has only one 0.)
US Roulette Wheel: Expected Value of a $1 bet on a single number
Let x be your winnings resulting from a $1 bet on a single number; x has 2 possible values
x     -1     35
p(x)  37/38  1/38

E(x) = -1(37/38) + 35(1/38) = -2/38 ≈ -.05
So on average the house wins about 5 cents on every such bet. A “fair” game would have E(x)=0.
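The roulette calculation can be done in exact fractions to avoid rounding. The sketch below uses Python's standard `fractions` module; the variable names are illustrative, and the "fair payout" line is a derived consequence of setting E(x) = 0, not a figure stated in the slides.

```python
from fractions import Fraction

# Winnings x from a $1 bet on a single number, with probabilities p(x).
winnings = {-1: Fraction(37, 38), 35: Fraction(1, 38)}

# E(x) = sum of x * p(x); exact value is -2/38 = -1/19.
ev = sum(x * p for x, p in winnings.items())
print(ev, round(float(ev), 4))

# For a fair game, the payout b must satisfy -1*(37/38) + b*(1/38) = 0.
fair_payout = Fraction(37, 38) / Fraction(1, 38)
print(fair_payout)
```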
The roulette wheels are spinning 24/7, winning big $$ for the house, resulting in …
Summarizing data and probability
Data: histogram
  measure of the center: sample mean x̄
  measure of spread: sample standard deviation s
Random variable: probability histogram
  measure of the center: population mean µ
  measure of spread: population standard deviation σ
Standard Deviation of a Discrete Random Variable
Measures how “spread out” the random variable is
s² = Σ (Xi - X̄)² / (n - 1) = 1805.703 / 34 = 53.10892, summed over i = 1, ..., n
Variance
Variation: for data, the deviations are Xi - X̄.
The deviations of the individual xi's from the mean (expected value) of their probability distribution: xi - µ
Var(X) = σ² (sigma squared) is the variance of the probability distribution.
Variance of discrete random variable X:
Var(X) = σ² = Σ (xi - µ)²·P(X = xi), summed over i = 1, ..., k
P. 207, Handout 4.1, P. 4
Example (µ = 3.65)
σ² = Σ (xi - µ)²·P(X = xi), summed over i = 1, ..., k
σ² = (x1-µ)²·P(X=x1) + (x2-µ)²·P(X=x2) + (x3-µ)²·P(X=x3) + (x4-µ)²·P(X=x4)
   = (10-3.65)²·0.20 + (5-3.65)²·0.40 + (1-3.65)²·0.25 + (-4-3.65)²·0.15
   = 19.3275
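The variance of the profit distribution can be verified numerically. This is a minimal sketch; the helper names `mean` and `variance` are illustrative, and the distribution is the profit example from the slides.

```python
import math


def mean(dist):
    """Expected value: sum of x * p(x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Variance: sum of (x - mu)^2 * p(x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


profit = {10: 0.20, 5: 0.40, 1: 0.25, -4: 0.15}

var = variance(profit)   # in ($ millions) squared
sd = math.sqrt(var)      # in $ millions
print(round(var, 4), round(sd, 2))
```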
Standard Deviation: of More Interest than the Variance
The population standard deviation is the square root of the population variance.
Standard Deviation (σ) = Positive Square Root of the Variance
σ = √σ²
σ, or SD, is the standard deviation of the probability distribution.
Standard Deviation
σ² = 19.3275
σ (or SD) = √σ² = √19.3275 ≈ 4.40 ($ mil.)
Finance and Investment Interpretation
X = return on an investment (stock, portfolio, etc.)
E(X) = µ = expected return on this investment
σ is a measure of the risk of the investment
Example
A basketball player shoots 3 free throws. P(make) = P(miss) = 0.5. Let X = number of free throws made.

x     0    1    2    3
p(x)  1/8  3/8  3/8  1/8

E(X) = 1.5
Compute the variance:
σ² = Σ (xi - E(X))²·P(X = xi), summed over i = 1, ..., k
   = (0 - 1.5)²(1/8) + (1 - 1.5)²(3/8) + (2 - 1.5)²(3/8) + (3 - 1.5)²(1/8)
   = 2.25(1/8) + .25(3/8) + .25(3/8) + 2.25(1/8)
   = .75
σ = √.75 = .866
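The distribution in this example can be derived by listing all make/miss sequences. The sketch below assumes the three shots are independent, so that the 8 sequences are equally likely; the slide states only P(make) = P(miss) = 0.5, so independence is an assumption here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 = 8 sequences of three shots (1 = make, 0 = miss).
# Assumption: shots are independent, so each sequence has probability 1/8.
outcomes = list(product([0, 1], repeat=3))

# X = number of makes; count how many sequences give each value of X.
counts = Counter(sum(seq) for seq in outcomes)
dist = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

mu = sum(x * p for x, p in dist.items())               # E(X)
var = sum((x - mu) ** 2 * p for x, p in dist.items())  # Var(X)
sd = float(var) ** 0.5                                 # SD(X)

print(dist, mu, var, round(sd, 3))
```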
© 2010 Pearson Education
Expected Value of a Random Variable
Example: The probability model for a particular life insurance policy is shown. Find the expected annual payout on a policy.
We expect that the insurance company will pay out $200 per policy per year.
Standard Deviation of a Random Variable
Example: The probability model for a particular life insurance policy is shown. Find the standard deviation of the annual payout.
68-95-99.7 Rule for Random Variables
For random variables x whose probability histograms are approximately mound-shaped:
P(µ - σ ≤ x ≤ µ + σ) ≈ .68
P(µ - 2σ ≤ x ≤ µ + 2σ) ≈ .95
P(µ - 3σ ≤ x ≤ µ + 3σ) ≈ .997
Example (µ = 50, σ = 5):
(µ - σ, µ + σ) = (50-5, 50+5) = (45, 55)
P(µ - σ ≤ X ≤ µ + σ) = P(45 ≤ X ≤ 55) = .048+.057+.066+.073+.078+.08+.078+.073+.066+.057+.048 = .724
Rules for E(X), Var(X) and SD(X): adding a constant a
If X is a rv and a is a constant:
E(X+a) = E(X)+a
Example: a = -1
E(X+a)=E(X-1)=E(X)-1
Rules for E(X), Var(X) and SD(X): adding constant a (cont.)
Var(X+a) = Var(X)
SD(X+a) = SD(X)
Example: a = -1
Var(X+a)=Var(X-1)=Var(X)
SD(X+a)=SD(X-1)=SD(X)
Economic Scenario  Profit X+2 ($ Millions)  Probability
Great              x1+2 = 10+2              0.20
Good               x2+2 = 5+2               0.40
OK                 x3+2 = 1+2               0.25
Lousy              x4+2 = -4+2              0.15
E(x + a) = E(x) + a; SD(x + a)=SD(x); let a = 2
[Figure: two probability histograms of profit (horizontal axis: Profit, -4 to 14; vertical axis: Probability, 0 to 0.5). Original X: µ = 3.65, σ = 4.40. Shifted X+2: µ = 5.65, σ = 4.40.]
New Expected Value
Long (UNC-CH) way:
E(x+2) = 12(.20) + 7(.40) + 3(.25) + (-2)(.15) = 5.65
Smart (NCSU) way:
a=2; E(x+2) = E(x) + 2 = 3.65 + 2 = 5.65
New Variance and SD
Long (UNC-CH) way: (compute from “scratch”)
Var(X+2) = (12-5.65)²(0.20) + … + (-2-5.65)²(0.15) = 19.3275
SD(X+2) = √19.3275 = 4.40
Smart (NCSU) way:
Var(X+2) = Var(X) = 19.3275
SD(X+2) = SD(X) = 4.40
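The "long way" and the "smart way" can be compared numerically. In this sketch the helper `mean_sd` is an illustrative name, and the shifted distribution is built by adding a = 2 to every profit value.

```python
import math


def mean_sd(dist):
    """Return (E(X), SD(X)) for a distribution given as {x: p(x)}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, math.sqrt(var)


profit = {10: 0.20, 5: 0.40, 1: 0.25, -4: 0.15}
a = 2
shifted = {x + a: p for x, p in profit.items()}  # distribution of X + 2

mu, sd = mean_sd(profit)            # smart way starts from these
mu_long, sd_long = mean_sd(shifted)  # long way: recompute from scratch

print(round(mu + a, 2), round(mu_long, 2))  # E(X+2) both ways
print(round(sd, 2), round(sd_long, 2))      # SD(X+2) both ways
```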
Rules for E(X), Var(X) and SD(X): multiplying by constant b
E(bX) = b E(X)
Var(bX) = b² Var(X)
SD(bX) = |b| SD(X)
Example: b = -1
E(bX) = E(-X) = -E(X)
Var(bX) = Var(-1X) = (-1)² Var(X) = Var(X)
SD(bX) = SD(-1X) = |-1| SD(X) = SD(X)
Expected Value and SD of Linear Transformation a + bx
Let X=number of repairs a new computer needs each year. Suppose E(X)= 0.20 and SD(X)=0.55
The service contract for the computer offers unlimited repairs for $100 per year plus a $25 service charge for each repair.
What are the mean and standard deviation of the yearly cost of the service contract?
Cost = $100 + $25X
E(cost) = E($100+$25X) = $100 + $25E(X) = $100 + $25*0.20 = $100 + $5 = $105
SD(cost) = SD($100+$25X) = SD($25X) = $25*SD(X) = $25*0.55 = $13.75
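The two rules combine into one small function for a linear transformation a + bX. This is a sketch; `linear_transform_stats` is a hypothetical helper name, and it only needs E(X) and SD(X), not the full distribution.

```python
import math


def linear_transform_stats(a, b, mean_x, sd_x):
    """Mean and SD of a + b*X, given E(X) and SD(X).

    E(a + bX) = a + b*E(X); SD(a + bX) = |b|*SD(X) (adding a does not change spread).
    """
    return a + b * mean_x, abs(b) * sd_x


# Service contract: $100 per year plus $25 per repair, E(X)=0.20, SD(X)=0.55.
mean_cost, sd_cost = linear_transform_stats(100, 25, 0.20, 0.55)
print(round(mean_cost, 2), round(sd_cost, 2))
```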
Addition and Subtraction Rules for Random Variables
E(X+Y) = E(X) + E(Y); E(X-Y) = E(X) - E(Y)
When X and Y are independent random variables:
1. Var(X+Y) = Var(X) + Var(Y)
2. SD(X+Y) = √(Var(X) + Var(Y))
   SD’s do not add: SD(X+Y) ≠ SD(X) + SD(Y)
3. Var(X−Y) = Var(X) + Var(Y)
4. SD(X−Y) = √(Var(X) + Var(Y))
   SD’s do not subtract: SD(X−Y) ≠ SD(X) − SD(Y)
   SD(X−Y) ≠ SD(X) + SD(Y)
Motivation for Var(X-Y) = Var(X) + Var(Y)
Let X = amount automatic dispensing machine puts into your 16 oz drink (say at McD’s)
A thirsty, broke friend shows up.
Let Y = amount you pour into friend’s 8 oz cup
Let Z = amount left in your cup; Z = ?
Z = X-Y
Var(Z) = Var(X-Y) = Var(X) + Var(Y)
The variability in Z has 2 components.
Example: rv’s NOT independent
X=number of hours a randomly selected student from our class slept between 9 am yesterday and 9 am today.
Y=number of hours a randomly selected student from our class was awake between 9 am yesterday and 9 am today. Y = 24 – X.
What are the expected value and variance of the total hours that a student is asleep and awake between 9 am yesterday and 9 am today?
Total hours that a student is asleep and awake between 9 am yesterday and 9 am today = X+Y
E(X+Y) = E(X+24-X) = E(24) = 24
Var(X+Y) = Var(X+24-X) = Var(24) = 0.
We don't add Var(X) and Var(Y) since X and Y are not independent.
Pythagorean Theorem of Statistics for Independent X and Y
[Figure: right triangle with legs a = SD(X) and b = SD(Y) and hypotenuse c = SD(X+Y); the squares on the sides have areas a² = Var(X), b² = Var(Y) and c² = Var(X+Y).]
a² + b² = c²
Var(X) + Var(Y) = Var(X+Y)
a + b ≠ c
SD(X) + SD(Y) ≠ SD(X+Y)
Example: SD(X) = 3, SD(Y) = 4, SD(X+Y) = 5
3² + 4² = 5², that is, 9 + 16 = 25
Var(X) + Var(Y) = Var(X+Y)
3 + 4 ≠ 5
SD(X) + SD(Y) ≠ SD(X+Y)
Example: meal plans
Regular plan: X = daily amount spent
E(X) = $13.50, SD(X) = $7
Expected value and standard deviation of total spent in 2 consecutive days?
E(X1+X2) = E(X1) + E(X2) = $13.50 + $13.50 = $27
Assuming spending on the two days is independent:
SD(X1+X2) = √Var(X1+X2) = √(Var(X1) + Var(X2)) = √(($7)² + ($7)²) = √($²49 + $²49) = √($²98) ≈ $9.90
SD(X1 + X2) ≠ SD(X1) + SD(X2) = $7 + $7 = $14
Example: meal plans (cont.)
Jumbo plan for football players
Y = daily amount spent
E(Y) = $24.75, SD(Y) = $9.50
Amount by which football player’s spending exceeds regular student spending is Y-X
E(Y-X) = E(Y) – E(X) = $24.75 - $13.50 = $11.25
SD(Y-X) = √Var(Y-X) = √(Var(Y) + Var(X)) = √(($9.50)² + ($7)²) = √($²90.25 + $²49) = √($²139.25) ≈ $11.80
SD(Y - X) ≠ SD(Y) - SD(X) = $9.50 - $7 = $2.50
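Both meal-plan standard deviations follow the "Pythagorean" pattern, which is what `math.hypot` computes. The sketch below assumes the variables being combined are independent; `sd_independent` is a hypothetical helper name.

```python
import math


def sd_independent(sd_a, sd_b):
    """SD of A + B or of A - B for independent A and B.

    Variances add in both cases, so SD = sqrt(sd_a^2 + sd_b^2).
    """
    return math.hypot(sd_a, sd_b)


# Regular plan over 2 consecutive days: SD(X1 + X2) with SD(X) = $7.
sd_two_days = sd_independent(7.00, 7.00)

# Jumbo plan minus regular plan: SD(Y - X) with SD(Y) = $9.50, SD(X) = $7.
sd_difference = sd_independent(9.50, 7.00)

print(round(sd_two_days, 2), round(sd_difference, 2))
```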
For random variables, X+X ≠ 2X
Let X be the annual payout on a life insurance policy. From mortality tables E(X)=$200 and SD(X)=$3,867.
1) If the payout amounts are doubled, what are the new expected value and standard deviation?
Double payout is 2X.
E(2X) = 2E(X) = 2*$200 = $400
SD(2X) = 2SD(X) = 2*$3,867 = $7,734
2) Suppose insurance policies are sold to 2 people. The annual payouts are X1 and X2. Assume the 2 people behave independently. What are the expected value and standard deviation of the total payout?
E(X1 + X2) = E(X1) + E(X2) = $200 + $200 = $400
SD(X1 + X2) = √Var(X1 + X2) = √(Var(X1) + Var(X2)) = √((3867)² + (3867)²) = √(14,953,689 + 14,953,689) = √29,907,378 = $5,468.76
The risk to the insurance co. when doubling the payout (2X) is not the same as the risk when selling policies to 2 people.
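The contrast between the two cases can be checked with a short calculation. This is a sketch using the figures from the slide; the variable names are illustrative, and the second case relies on the stated assumption that the 2 policyholders are independent.

```python
import math

mean_x = 200.0   # E(X): expected annual payout per policy ($)
sd_x = 3867.0    # SD(X): standard deviation of the annual payout ($)

# Case 1: one policy with doubled payout, 2X (multiply-by-constant rule).
mean_double = 2 * mean_x
sd_double = abs(2) * sd_x

# Case 2: two independent policies, X1 + X2 (variances add).
mean_two = mean_x + mean_x
sd_two = math.sqrt(sd_x ** 2 + sd_x ** 2)

print(mean_double, sd_double)
print(mean_two, round(sd_two, 2))
```

The expected values agree, but the standard deviation for two independent policies is smaller than for one doubled policy.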