
Sample Space and Probability, Part IV: Pascal Triangle and Bernoulli Trials

ECE 302, Spring 2012

Purdue University, School of ECE

Prof. Ilya Pollak


Connection between Pascal triangle and probability theory: number of successes in a sequence of independent Bernoulli trials

•  A Bernoulli trial is any probabilistic experiment with two possible outcomes, e.g.,
   – Will Citigroup become insolvent during the next 12 months?
   – Democrats or Republicans in the next election?
   – Will the Dow Jones go up tomorrow?
   – Will a new drug cure at least 80% of the patients?

•  Terminology: sometimes the two outcomes are called "success" and "failure."

•  Suppose the probability of success is p. What is the probability of k successes in n independent trials?



Probability of k successes in n independent Bernoulli trials

•  n independent coin tosses, P(H) = p
•  E.g., P(HTTHHH) = p(1−p)(1−p)p^3 = p^4 (1−p)^2
•  P(specific sequence with k H's and (n−k) T's) = p^k (1−p)^(n−k)
•  P(k heads) = (number of k-head sequences) · p^k (1−p)^(n−k)

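The formula above can be checked directly in a few lines. This is a minimal sketch (the helper name `binom_pmf` is mine, not from the slides), using the fact that the number of k-head sequences of length n is the binomial coefficient C(n, k):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent Bernoulli(p) trials)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The slide's example: any specific sequence with 4 H's and 2 T's, such as
# HTTHHH, has probability p^4 (1-p)^2; multiplying by the number of such
# sequences, C(6, 4) = 15, gives P(4 heads in 6 tosses).
p = 0.5
print(binom_pmf(4, 6, p))
```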

An interesting property of binomial coefficients

Since P(zero H's) + P(one H) + P(two H's) + … + P(n H's) = 1, it follows that

\[ \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = 1. \]

Another way to show the same thing is to realize that

\[ \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = (p + (1-p))^n = 1^n = 1. \]

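The identity can be verified numerically; a quick sketch (the choices of n and p are arbitrary, for illustration only):

```python
from math import comb, isclose

# Verify that sum_{k=0}^{n} C(n,k) p^k (1-p)^(n-k) = 1
# for a few choices of n and p.
for n, p in [(5, 0.3), (20, 0.8), (100, 0.5)]:
    total = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
    assert isclose(total, 1.0)
    print(n, p, total)
```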

Binomial probabilities: illustration

[Figure slides: plots of the binomial probabilities P(k heads) for various values of n and p.]


Comments on binomial probabilities and the bell curve

•  Summing many independent random contributions usually leads to the bell-shaped distribution.

•  This is called the central limit theorem (CLT).

•  We have not yet covered the tools to precisely state the CLT, but we will later in the course.

•  The behavior of the binomial distribution for large n shown above is a manifestation of the CLT.

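The bell-curve behavior can be checked numerically. A minimal sketch (not from the slides) comparing the binomial probabilities with a normal density of matching mean np and standard deviation √(np(1−p)):

```python
from math import comb, exp, pi, sqrt

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mu, sigma):
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

# For large n, P(k heads) is closely approximated by a normal density
# with mean n*p and standard deviation sqrt(n*p*(1-p)).
n, p = 400, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))
worst = max(abs(binom_pmf(k, n, p) - normal_pdf(k, mu, sigma))
            for k in range(n + 1))
print(worst)  # tiny compared to the peak value, ~1/(sigma*sqrt(2*pi)) ~ 0.04
```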

Interestingly, we get the bell curve even for asymmetric binomial probabilities

[Figure slides: binomial probability plots for p away from 1/2.]


This tells us how to empirically estimate the probability of an event!

•  To estimate the probability p based on n flips, divide the observed number of H's by the total number of experiments: k/n.

•  To see the distribution of k/n for any n, simply rescale the x-axis in the distribution of k.

•  This distribution will tell us
   – what we should expect our estimate to be, on average, and
   – what error we should expect to make, on average.
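The estimator k/n is easy to simulate. A sketch (not from the slides; the seed and sample sizes are arbitrary) that flips a biased coin n times and reports the observed fraction of heads:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def estimate_p(p, n):
    """Estimate p as (number of H's) / n from n simulated Bernoulli(p) flips."""
    k = sum(1 for _ in range(n) if random.random() < p)
    return k / n

p = 0.8
for n in (50, 1000, 100000):
    print(n, estimate_p(p, n))  # estimates cluster around p as n grows
```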


Note:
–  for 50 flips, the most likely outcome is the correct one, 0.8;
–  it's also close to the "average" outcome;
–  it's very unlikely to make a mistake of more than 0.2.



•  If p = 0.8, when estimating based on 1000 flips, it's extremely unlikely to make a mistake of more than 0.05.

•  Hence, when the goal is to forecast a two-way election, and the actual p is reasonably far from 1/2, polling a few hundred people is very likely to give accurate results.

•  However,
   – independence is important;
   – getting a representative sample is important (for a country with a population of 300M, this is tricky!);
   – when the actual p is extremely close to 1/2 (e.g., the 2000 presidential election in Florida or the 2008 senatorial election in Minnesota), pollsters' forecasts are about as accurate as a random guess.

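The "extremely unlikely" claim for n = 1000, p = 0.8 can be computed exactly by summing the binomial probabilities of all outcomes that miss p by more than 0.05; a minimal sketch (not from the slides):

```python
from math import comb

# Exact probability that the estimate k/n misses p by more than 0.05
# when n = 1000 and p = 0.8.
n, p = 1000, 0.8
prob_large_error = sum(
    comb(n, k) * p**k * (1 - p)**(n - k)
    for k in range(n + 1)
    if abs(k / n - p) > 0.05
)
print(prob_large_error)  # on the order of 1e-4
```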

The 2008 Franken-Coleman election

•  Franken: 1,212,629 votes
•  Coleman: 1,212,317 votes

•  In our analysis, we will disregard the third-party candidate, who got 437,505 votes (he actually makes pre-election polling even more complicated).

•  Effectively, p ≈ 0.500064.


Probabilities for fractions of Franken vote in pre-election polling based on n = 2.5M (more than all Franken and Coleman votes combined)

•  Even though we are unlikely to make an error of more than 0.001, this is not enough, because p − 0.5 = 0.000064!

•  Note: 42% of the area under the bell curve is to the left of 1/2.

•  When the election is this close, no poll can accurately predict the outcome.

•  In fact, the noise in the voting process itself (voting machine malfunctions, human errors, etc.) becomes very important in determining the outcome.

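The 42% figure can be reproduced with the bell-curve approximation: the poll's estimate k/n is approximately normal with mean p and standard deviation √(p(1−p)/n), and we ask how much probability falls below 1/2 (i.e., the poll wrongly calls it for Coleman). A sketch, not from the slides:

```python
from math import erf, sqrt

p, n = 0.500064, 2_500_000
sigma = sqrt(p * (1 - p) / n)        # std of the estimate k/n
z = (0.5 - p) / sigma                # standardized distance from 1/2 to p
prob_below_half = 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z
print(prob_below_half)  # close to 0.42, matching the slide
```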

Estimating the probability of success in a Bernoulli trial: summary

•  As the number n of independent experiments increases, the empirical fraction of occurrences of success becomes close to the actual probability of success, p.

•  The error goes down in proportion to 1/n^(1/2); i.e., the error after 400 trials is half the error after 100 trials.

•  This is called the law of large numbers.

•  This result will be stated precisely later in the course.
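The 1/√n scaling of the error can be seen by simulation: repeat the whole n-flip experiment many times and measure the spread of the estimates. A sketch (not from the slides; the seed, p, and repetition count are arbitrary):

```python
import random
from math import sqrt

random.seed(0)  # reproducible illustration

def estimation_error_std(p, n, trials=2000):
    """Empirical std of the estimate k/n over many repeated experiments."""
    errs = []
    for _ in range(trials):
        k = sum(1 for _ in range(n) if random.random() < p)
        errs.append(k / n - p)
    mean = sum(errs) / len(errs)
    return sqrt(sum((e - mean) ** 2 for e in errs) / len(errs))

p = 0.5
s100, s400 = estimation_error_std(p, 100), estimation_error_std(p, 400)
print(s100, s400, s100 / s400)  # the ratio should be close to 2
```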
