07: Variance, Bernoulli, Binomial - Stanford...

Preview:

Citation preview

07: Variance, Bernoulli, BinomialJerry CainApril 12, 2021

1

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Quick slide reference

2

3 Variance 07a_variance_i

10 Properties of variance 07b_variance_ii

17 Bernoulli RV 07c_bernoulli

22 Binomial RV 07d_binomial

34 Exercises LIVE

Variance

3

07a_variance_i

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Stanford, CA𝐸 high = 68°F𝐸 low = 52°F

Washington, DC𝐸 high = 67°F𝐸 low = 51°F

4

Average annual weather

Is 𝐸 𝑋 enough?

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Stanford, CA𝐸 high = 68°F𝐸 low = 52°F

Washington, DC𝐸 high = 67°F𝐸 low = 51°F

5

Average annual weather

0

0.1

0.2

0.3

0.4

0

0.1

0.2

0.3

0.4

𝑃𝑋=π‘₯

𝑃𝑋=π‘₯

Stanford high temps Washington high temps

35 50 65 80 90 35 50 65 80 90

68Β°F 67Β°F

Normalized histograms are approximations of PMFs.

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Variance = β€œspread”Consider the following three distributions (PMFs):

β€’ Expectation: 𝐸 𝑋 = 3 for all distributionsβ€’ But the β€œspread” in the distributions is different!β€’ Variance, Var 𝑋 : a formal quantification of β€œspread”

6

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Variance

The variance of a random variable 𝑋 with mean 𝐸 𝑋 = πœ‡ is

Var 𝑋 = 𝐸 𝑋 βˆ’ πœ‡ (

β€’ Also written as: 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

β€’ Note: Var(X) β‰₯ 0β€’ Other names: 2nd central moment, or square of the standard deviation

Var 𝑋

def standard deviation SD 𝑋 = Var 𝑋7

Units of 𝑋!

Units of 𝑋

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Variance of Stanford weather

8

Varianceof 𝑋

Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

Stanford, CA𝐸 high = 68°F𝐸 low = 52°F

0

0.1

0.2

0.3

0.4

𝑃𝑋=π‘₯

Stanford high temps

35 50 65 80 90

𝑋 𝑋 βˆ’ πœ‡ !

57Β°F 121 (Β°F)2

𝐸 𝑋 = πœ‡ = 68Β°F

71Β°F 9 (Β°F)2

75Β°F 49 (Β°F)2

69Β°F 1 (Β°F)2

… …

Variance 𝐸 𝑋 βˆ’ πœ‡ ! = 39 (Β°F)2

Standard deviation = 6.2Β°F

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Stanford, CA𝐸 high = 68°F

Washington, DC𝐸 high = 67°F

9

Comparing variance

0

0.1

0.2

0.3

0.4

0

0.1

0.2

0.3

0.4

𝑃𝑋=π‘₯

𝑃𝑋=π‘₯

Stanford high temps Washington high temps

35 50 65 80 90 35 50 65 80 90

Var 𝑋 = 39 Β°F ! Var 𝑋 = 248 Β°F !

Varianceof 𝑋

Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

68Β°F 67Β°F

Properties of Variance

10

07b_variance_ii

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Properties of varianceDefinition Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

def standard deviation SD 𝑋 = Var 𝑋

Property 1 Var 𝑋 = 𝐸 𝑋! βˆ’ 𝐸 𝑋 !

Property 2 Var π‘Žπ‘‹ + 𝑏 = π‘Ž!Var 𝑋

11

Units of 𝑋!

Units of 𝑋

β€’ Property 1 is often easier to compute than the definitionβ€’ Unlike expectation, variance is not linear

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Properties of varianceDefinition Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

def standard deviation SD 𝑋 = Var 𝑋

Property 1 Var 𝑋 = 𝐸 𝑋! βˆ’ 𝐸 𝑋 !

Property 2 Var π‘Žπ‘‹ + 𝑏 = π‘Ž!Var 𝑋

12

Units of 𝑋!

Units of 𝑋

β€’ Property 1 is often easier to compute than the definitionβ€’ Unlike expectation, variance is not linear

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Computing variance, a proof

13

Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

= 𝐸 𝑋! βˆ’ 𝐸 𝑋 !

= 𝐸 𝑋! βˆ’ πœ‡!= 𝐸 𝑋! βˆ’ 2πœ‡! + πœ‡!= 𝐸 𝑋! βˆ’ 2πœ‡πΈ 𝑋 + πœ‡!

= 𝐸 𝑋 βˆ’ πœ‡ ! Let 𝐸 𝑋 = πœ‡

Everyone, please

welcome the second

moment!

Varianceof 𝑋

Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

= 𝐸 𝑋! βˆ’ 𝐸 𝑋 !

=,"

π‘₯ βˆ’ πœ‡ !𝑝 π‘₯

=,"

π‘₯! βˆ’ 2πœ‡π‘₯ + πœ‡! 𝑝 π‘₯

=,"

π‘₯!𝑝 π‘₯ βˆ’ 2πœ‡,"

π‘₯𝑝 π‘₯ + πœ‡!,"

𝑝 π‘₯

β‹… 1

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Let Y = outcome of a single die roll. Recall 𝐸 π‘Œ = 7/2 .Calculate the variance of Y.

14

Variance of a 6-sided dieVarianceof 𝑋

Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

= 𝐸 𝑋! βˆ’ 𝐸 𝑋 !

1. Approach #1: Definition

Var π‘Œ =161 βˆ’

72

!+162 βˆ’

72

!

+163 βˆ’

72

!+164 βˆ’

72

!

+165 βˆ’

72

!+166 βˆ’

72

!

2. Approach #2: A property

𝐸 π‘Œ! =161! + 2! + 3! + 4! + 5! + 6!

= 91/6

Var π‘Œ = 91/6 βˆ’ 7/2 !

= 35/12 = 35/12

2nd moment

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Properties of varianceDefinition Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 !

def standard deviation SD 𝑋 = Var 𝑋

Property 1 Var 𝑋 = 𝐸 𝑋! βˆ’ 𝐸 𝑋 !

Property 2 Var π‘Žπ‘‹ + 𝑏 = π‘Ž!Var 𝑋

15

Units of 𝑋!

Units of 𝑋

β€’ Property 1 is often easier to compute than the definitionβ€’ Unlike expectation, variance is not linear

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Property 2: A proofProperty 2 Var π‘Žπ‘‹ + 𝑏 = π‘Ž!Var 𝑋

16

Var π‘Žπ‘‹ + 𝑏= 𝐸 π‘Žπ‘‹ + 𝑏 ! βˆ’ 𝐸 π‘Žπ‘‹ + 𝑏 ! Property 1

= 𝐸 π‘Ž!𝑋! + 2π‘Žπ‘π‘‹ + 𝑏! βˆ’ π‘ŽπΈ 𝑋 + 𝑏 !

= π‘Ž!𝐸 𝑋! + 2π‘Žπ‘πΈ 𝑋 + 𝑏! βˆ’ π‘Ž! 𝐸[𝑋] ! + 2π‘Žπ‘πΈ[𝑋] + 𝑏!

= π‘Ž!𝐸 𝑋! βˆ’ π‘Ž! 𝐸[𝑋] !

= π‘Ž! 𝐸 𝑋! βˆ’ 𝐸[𝑋] !

= π‘Ž!Var 𝑋 Property 1

Proof:

Factoring/Linearity of Expectation

Bernoulli RV

17

07c_bernoulli

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Consider an experiment with two outcomes: β€œsuccess” and β€œfailure.”def A Bernoulli random variable 𝑋 maps β€œsuccess” to 1 and β€œfailure” to 0.

Other names: indicator random variable, boolean random variable

Examples:β€’ Coin flipβ€’ Random binary digitβ€’ Whether a disk drive crashed

Bernoulli Random Variable

18

𝑃 𝑋 = 1 = 𝑝 1 = 𝑝𝑃 𝑋 = 0 = 𝑝 0 = 1 βˆ’ 𝑝𝑋~Ber(𝑝)

Support: {0,1} VarianceExpectation

PMF

𝐸 𝑋 = 𝑝Var 𝑋 = 𝑝(1 βˆ’ 𝑝)

Remember this nice property of expectation. It will come back!

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Jacob Bernoulli

19

Jacob Bernoulli (1654-1705), also known as β€œJames”, was a Swiss mathematician

One of many mathematicians in Bernoulli familyThe Bernoulli Random Variable is named for him

My academic great14 grandfather

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Defining Bernoulli RVs

20

Serve an ad.β€’ User clicks w.p. 0.2β€’ Ignores otherwise

Let 𝑋: 1 if clicked

𝑋~Ber(___)𝑃 𝑋 = 1 = ___𝑃 𝑋 = 0 = ___

Roll two dice.β€’ Success: roll two 6’sβ€’ Failure: anything else

Let 𝑋 : 1 if success

𝑋~Ber(___)

𝐸 𝑋 = ___

𝑋~Ber(𝑝) 𝑝" 1 = 𝑝𝑝" 0 = 1 βˆ’ 𝑝𝐸 𝑋 = 𝑝

Run a programβ€’ Crashes w.p. 𝑝‒ Works w.p. 1 βˆ’ 𝑝

Let 𝑋: 1 if crash

𝑋~Ber(𝑝)𝑃 𝑋 = 1 = 𝑝𝑃 𝑋 = 0 = 1 βˆ’ 𝑝 πŸ€”

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Defining Bernoulli RVs

21

Serve an ad.β€’ User clicks w.p. 0.2β€’ Ignores otherwise

Let 𝑋: 1 if clicked

𝑋~Ber(___)𝑃 𝑋 = 1 = ___𝑃 𝑋 = 0 = ___

Roll two dice.β€’ Success: roll two 6’sβ€’ Failure: anything else

Let 𝑋 : 1 if success

𝑋~Ber(___)

𝐸 𝑋 = ___

𝑋~Ber(𝑝) 𝑝" 1 = 𝑝𝑝" 0 = 1 βˆ’ 𝑝𝐸 𝑋 = 𝑝

Run a programβ€’ Crashes w.p. 𝑝‒ Works w.p. 1 βˆ’ 𝑝

Let 𝑋: 1 if crash

𝑋~Ber(𝑝)𝑃 𝑋 = 1 = 𝑝𝑃 𝑋 = 0 = 1 βˆ’ 𝑝

Binomial RV

22

07d_binomial

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Consider an experiment: 𝑛 independent trials of Ber(𝑝) random variables.def A Binomial random variable 𝑋 is the number of successes in 𝑛 trials.

Examples:β€’ # heads in n coin flipsβ€’ # of 1’s in randomly generated length n bit stringβ€’ # of disk drives crashed in 1000 computer cluster

(assuming disks crash independently)

Binomial Random Variable

23

π‘˜ = 0, 1, … , 𝑛:

𝑃 𝑋 = π‘˜ = 𝑝 π‘˜ = π‘›π‘˜ 𝑝! 1 βˆ’ 𝑝 "#!𝑋~Bin(𝑛, 𝑝)

Support: {0,1, … , 𝑛}

PMF

𝐸 𝑋 = 𝑛𝑝Var 𝑋 = 𝑛𝑝(1 βˆ’ 𝑝)Variance

Expectation

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021 24

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Reiterating notation

The parameters of a Binomial random variable:β€’ 𝑛: number of independent trialsβ€’ 𝑝: probability of success on each trial

25

1. The random variable

2. is distributed as a

3. Binomial 4. with parameters

𝑋 ~ Bin(𝑛, 𝑝)

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Reiterating notation

If 𝑋 is a binomial with parameters 𝑛 and 𝑝, the PMF of 𝑋 is

26

𝑋 ~ Bin(𝑛, 𝑝)

𝑃 𝑋 = π‘˜ = π‘›π‘˜ 𝑝( 1 βˆ’ 𝑝 )*(

Probability Mass Function for a BinomialProbability that 𝑋takes on the value π‘˜

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Three coin flipsThree fair (β€œheads” with 𝑝 = 0.5) coins are flipped.β€’ 𝑋 is number of headsβ€’ 𝑋~Bin 3, 0.5

Compute the following event probabilities:

27

𝑋~Bin(𝑛, 𝑝) 𝑝 π‘˜ = π‘›π‘˜ 𝑝# 1 βˆ’ 𝑝 $%#

𝑃 𝑋 = 0

𝑃 𝑋 = 1

𝑃 𝑋 = 2

𝑃 𝑋 = 3

𝑃 𝑋 = 7P(event)

πŸ€”

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Three coin flipsThree fair (β€œheads” with 𝑝 = 0.5) coins are flipped.β€’ 𝑋 is number of headsβ€’ 𝑋~Bin 3, 0.5

Compute the following event probabilities:

28

𝑋~Bin(𝑛, 𝑝) 𝑝 π‘˜ = π‘›π‘˜ 𝑝# 1 βˆ’ 𝑝 $%#

𝑃 𝑋 = 0 = 𝑝 0 = 30 𝑝$ 1 βˆ’ 𝑝 % = &

'

𝑃 𝑋 = 1

𝑃 𝑋 = 2

𝑃 𝑋 = 3

𝑃 𝑋 = 7

= 𝑝 1 = 31 𝑝& 1 βˆ’ 𝑝 ! = %

'

= 𝑝 2 = 32 𝑝! 1 βˆ’ 𝑝 & = %

'

= 𝑝 3 = 33 𝑝% 1 βˆ’ 𝑝 $ = &

'

= 𝑝 7 = 0P(event) PMF

Extra math note:By Binomial Theorem,we can proveβˆ‘()$* 𝑃 𝑋 = π‘˜ = 1

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Consider an experiment: 𝑛 independent trials of Ber(𝑝) random variables.def A Binomial random variable 𝑋 is the number of successes in 𝑛 trials.

Examples:β€’ # heads in n coin flipsβ€’ # of 1’s in randomly generated length n bit stringβ€’ # of disk drives crashed in 1000 computer cluster

(assuming disks crash independently)

Binomial Random Variable

29

𝑋~Bin(𝑛, 𝑝)Range: {0,1, … , 𝑛} Variance

Expectation

PMF

𝐸 𝑋 = 𝑛𝑝Var 𝑋 = 𝑛𝑝(1 βˆ’ 𝑝)

π‘˜ = 0, 1, … , 𝑛:

𝑃 𝑋 = π‘˜ = 𝑝 π‘˜ = π‘›π‘˜ 𝑝! 1 βˆ’ 𝑝 "#!

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Ber 𝑝 = Bin(1, 𝑝)

Binomial RV is sum of Bernoulli RVs

Bernoulliβ€’ 𝑋~Ber(𝑝)

Binomialβ€’ π‘Œ~Bin 𝑛, 𝑝‒ The sum of 𝑛 independent

Bernoulli RVs

30

π‘Œ =,+)&

*

𝑋+ , 𝑋+ ~Ber(𝑝)

+

+

+

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Consider an experiment: 𝑛 independent trials of Ber(𝑝) random variables.def A Binomial random variable 𝑋 is the number of successes in 𝑛 trials.

Examples:β€’ # heads in n coin flipsβ€’ # of 1’s in randomly generated length n bit stringβ€’ # of disk drives crashed in 1000 computer cluster

(assuming disks crash independently)

Binomial Random Variable

31

𝑋~Bin(𝑛, 𝑝)Range: {0,1, … , 𝑛}

𝐸 𝑋 = 𝑛𝑝Var 𝑋 = 𝑛𝑝(1 βˆ’ 𝑝)Variance

Expectation

PMF π‘˜ = 0, 1, … , 𝑛:

𝑃 𝑋 = π‘˜ = 𝑝 π‘˜ = π‘›π‘˜ 𝑝! 1 βˆ’ 𝑝 "#!

Proof:

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

Consider an experiment: 𝑛 independent trials of Ber(𝑝) random variables.def A Binomial random variable 𝑋 is the number of successes in 𝑛 trials.

Examples:β€’ # heads in n coin flipsβ€’ # of 1’s in randomly generated length n bit stringβ€’ # of disk drives crashed in 1000 computer cluster

(assuming disks crash independently)

Binomial Random Variable

32

𝑋~Bin(𝑛, 𝑝)Range: {0,1, … , 𝑛}

PMF

𝐸 𝑋 = 𝑛𝑝Var 𝑋 = 𝑛𝑝(1 βˆ’ 𝑝)

We’ll prove this later in the course

VarianceExpectation

π‘˜ = 0, 1, … , 𝑛:

𝑃 𝑋 = π‘˜ = 𝑝 π‘˜ = π‘›π‘˜ 𝑝! 1 βˆ’ 𝑝 "#!

Lisa Yan, Chris Piech, Mehran Sahami, and Jerry Cain, CS109, Spring 2021

No, give me the variance proof right now

33

proofwiki.org

Recommended