Probability

• Formal study of uncertainty
• The engine that drives statistics

Introduction

• Nothing in life is certain
• We gauge the chances of successful outcomes in business, medicine, weather, and other everyday situations such as the lottery (recall the birthday problem)

History

• For most of human history, probability, the formal study of the laws of chance, has been used for only one thing: gambling

History (cont.)

• Nobody knows exactly when gambling began; it goes back at least as far as ancient Egypt, where 4-sided “astragali” (made from animal heelbones) were used

History (cont.)

• The Roman emperor Claudius (10 BC-AD 54) wrote the first known treatise on gambling.

• The book, “How to Win at Gambling”, was lost.

Rule 1: Let Caesar win IV out of V times

Approaches to Probability

• Relative frequency: event probability = x/n, where x = # of occurrences of the event of interest and n = total # of observations

• Coin tossing, die tossing; nuclear power plants?

• Limitations: repeated observations are not always practical
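The relative frequency idea can be illustrated with a short simulation. This is only a sketch: the function name `relative_frequency`, the fixed seed, and the choice of a fair coin are assumptions made for the illustration, not part of the original slides.

```python
import random

def relative_frequency(n_tosses, seed=0):
    """Estimate P(heads) as x/n from n simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))  # x = number of heads
    return heads / n_tosses  # x / n

# The estimate tends to settle near 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

The same approach is not available when the experiment cannot be repeated many times, which is the limitation noted above.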

Approaches to Probability (cont.)

• Subjective probability: an individual assigns a probability based on personal experience, anecdotal evidence, etc.

• Classical approach: every possible outcome has equal probability (more later)

Basic Definitions

• Experiment: act or process that leads to a single outcome that cannot be predicted with certainty

• Examples:
  1. Toss a coin
  2. Draw 1 card from a standard deck of cards
  3. Arrival time of flight from Atlanta to RDU

Basic Definitions (cont.)

• Sample space: all possible outcomes of an experiment. Denoted by S

• Event: any subset of the sample space S; typically denoted A, B, C, etc.
  Simple event: event with only 1 outcome
  Null event: the empty set ∅
  Certain event: S

Examples

1. Toss a coin once
   S = {H, T}; A = {H}, B = {T} are simple events

2. Toss a die once; count dots on upper face
   S = {1, 2, 3, 4, 5, 6}
   A = even # of dots on upper face = {2, 4, 6}
   B = 3 or fewer dots on upper face = {1, 2, 3}

Laws of Probability

1. 0 ≤ P(A) ≤ 1 for any event A

2. P(∅) = 0, P(S) = 1

Laws of Probability (cont.)

3. P(A’) = 1 - P(A)
For an event A, A’ is the complement of A; A’ is everything in S that is not in A.

[Venn diagram: sample space S divided into event A and its complement A’]

Birthday Problem

• What is the smallest number of people you need in a group so that the probability of 2 or more people having the same birthday is greater than 1/2?

• Answer: 23

No. of people   23     30     40     60
Probability     .507   .706   .891   .994

Example: Birthday Problem

• A={at least 2 people in the group have a common birthday}

• A’ = {no two people in the group have a common birthday}

3 people: P(A’) = (364/365) × (363/365)

23 people: P(A’) = (364/365) × (363/365) × … × (343/365) = .493

so P(A) = 1 - P(A’) = 1 - .493 = .507
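The complement calculation can be checked numerically. The sketch below assumes 365 equally likely birthdays and ignores leap years; the function name `p_shared_birthday` is made up for this illustration.

```python
def p_shared_birthday(k, days=365):
    """P(at least 2 of k people share a birthday), using P(A) = 1 - P(A')."""
    p_no_match = 1.0
    for i in range(k):
        # Person i+1 must avoid the i birthdays already taken.
        p_no_match *= (days - i) / days
    return 1.0 - p_no_match

# Group sizes taken from the examples above.
for k in (3, 23, 30, 40, 60):
    print(k, round(p_shared_birthday(k), 3))
```

With these assumptions, 23 is the first group size for which the probability exceeds 1/2.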

Unions and Intersections

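[Venn diagrams: events A and B inside sample space S, illustrating the union A ∪ B and the intersection A ∩ B]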

Mutually Exclusive Events

• Mutually exclusive events: no outcomes from S in common

A ∩ B = ∅

[Venn diagram: non-overlapping events A and B inside sample space S]

Laws of Probability (cont.)

Addition Rule for Disjoint Events:

4. If A and B are disjoint events, then

P(A ∪ B) = P(A) + P(B)

5. For two independent events A and B

P(A ∩ B) = P(A) × P(B)

Laws of Probability (cont.)

General Addition Rule

6. For any two events A and B

P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

[Venn diagram: overlapping events A and B inside sample space S, with the intersection A ∩ B]

Example: toss a fair die once

• S = {1, 2, 3, 4, 5, 6}
• A = even # appears = {2, 4, 6}
• B = 3 or fewer = {1, 2, 3}
• P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
  = P({2, 4, 6}) + P({1, 2, 3}) - P({2})
  = 3/6 + 3/6 - 1/6 = 5/6
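The die example can be verified with Python sets. This is a minimal sketch that assumes a fair die (each outcome has probability 1/6); the helper name `prob` is hypothetical.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one toss of a die
A = {2, 4, 6}            # even number appears
B = {1, 2, 3}            # 3 or fewer

def prob(event, sample_space=S):
    """Equally likely outcomes: P(event) = (# outcomes in event) / (# outcomes in S)."""
    return Fraction(len(event & sample_space), len(sample_space))

lhs = prob(A | B)                        # P(A ∪ B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)    # general addition rule
print(lhs, rhs)
```

Counting A ∪ B directly and applying the rule give the same fraction, because the shared outcome 2 is counted once in each case.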

Laws of Probability: Summary

• 1. 0 ≤ P(A) ≤ 1 for any event A
• 2. P(∅) = 0, P(S) = 1
• 3. P(A’) = 1 – P(A)
• 4. If A and B are disjoint events, then P(A ∪ B) = P(A) + P(B)
• 5. If A and B are independent events, then P(A ∩ B) = P(A) × P(B)
• 6. For any two events A and B, P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

Probability Models

The Equally Likely Approach (also called the Classical Approach)

Assigning Probabilities

• If an experiment has N equally likely outcomes, then each outcome has probability 1/N of occurring

• If an event A1 has n1 outcomes, then

P(A1) = n1/N

We Need Efficient Methods for Counting Outcomes

Product Rule for Ordered Pairs

• A student wishes to commute to a junior college for 2 years and then commute to a state college for 2 years. Within commuting distance there are 4 junior colleges and 3 state colleges. How many junior college-state college pairs are available to her?

Product Rule for Ordered Pairs

• junior colleges: 1, 2, 3, 4
• state colleges: a, b, c
• possible pairs:
  (1, a) (1, b) (1, c)
  (2, a) (2, b) (2, c)
  (3, a) (3, b) (3, c)
  (4, a) (4, b) (4, c)


4 junior colleges, 3 state colleges: total number of possible pairs = 4 × 3 = 12


In general, if there are n1 ways to choose the first element of the pair, and n2 ways to choose the second element, then the number of possible pairs is n1 × n2. Here n1 = 4, n2 = 3.

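The product rule can be sketched by enumerating the pairs with `itertools.product`. The variable names are illustrative; the college labels are the ones used in the example.

```python
from itertools import product

junior_colleges = [1, 2, 3, 4]       # n1 = 4 choices for the first element
state_colleges = ["a", "b", "c"]     # n2 = 3 choices for the second element

# Every (junior college, state college) ordered pair.
pairs = list(product(junior_colleges, state_colleges))

for pair in pairs:
    print(pair)
print(len(pairs), len(junior_colleges) * len(state_colleges))
```

Listing the pairs is practical only for small problems; the product n1 × n2 gives the count without enumeration.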

Counting in “Either-Or” Situations

• NCAA Basketball Tournament: how many ways can the “bracket” be filled out?

1. How many games? 63
2. 2 choices for each game
3. Number of ways to fill out the bracket: 2^63 = 9.2 × 10^18

• Earth pop. about 6 billion; suppose everyone fills out 1 million different brackets

• Even then, the chance that anyone gets all games correct is on the order of 1 in 1,000 (6 × 10^15 brackets against 9.2 × 10^18 possibilities, or about 1 in 1,500)
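The bracket arithmetic can be reproduced directly. The sketch assumes 63 games (a 64-team single-elimination field) and, for simplicity, that no two of the filled-out brackets coincide; the variable names are made up for the illustration.

```python
games = 63                           # assumption: 64-team single-elimination field
brackets = 2 ** games                # 2 choices for each game

population = 6_000_000_000           # Earth population figure used above
brackets_per_person = 1_000_000
filled = population * brackets_per_person   # assumes all filled brackets are distinct

print(f"possible brackets: {brackets:.2e}")
print(f"filled brackets:   {filled:.2e}")
print(f"about 1 chance in {brackets / filled:,.0f} that one of them is perfect")
```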

Counting Example

• Pollsters minimize lead-in effect by rearranging the order of the questions on a survey

• If Gallup has a 5-question survey, how many different versions of the survey are required if all possible arrangements of the questions are included?

Solution

• There are 5 possible choices for the first question, 4 remaining choices for the second question, 3 choices for the third question, 2 choices for the fourth question, and 1 choice for the fifth question.

• The number of possible arrangements is therefore

5 × 4 × 3 × 2 × 1 = 120

Efficient Methods for Counting Outcomes

• Factorial notation: n! = 1 × 2 × … × n

• Examples: 1! = 1; 2! = 1 × 2 = 2; 3! = 1 × 2 × 3 = 6; 4! = 24; 5! = 120

• Special definition: 0! = 1
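Besides a calculator or spreadsheet, factorials can be computed with Python's standard `math.factorial`. The sketch below also enumerates every ordering of a 5-question survey to confirm the count; the question labels are placeholders.

```python
import math
from itertools import permutations

# Factorials for n = 0..5, including the special definition 0! = 1.
for n in range(6):
    print(n, math.factorial(n))

# Every ordering of a 5-question survey (placeholder labels).
questions = ["Q1", "Q2", "Q3", "Q4", "Q5"]
versions = list(permutations(questions))
print(len(versions))
```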

Factorials with calculators and Excel

• Calculator:
  non-graphing: x! (second function)
  graphing: bottom of p. 9, TI Calculator Commands (math button)

• Excel: Paste: Math, FACT

Factorial Examples

• 20! = 2.43 × 10^18
• 1,000,000 seconds? About 11.5 days
• 1,000,000,000 seconds? About 31 years
• 31 years ≈ 10^9 seconds
• 10^18 = 10^9 × 10^9
• 31 × 10^9 years ≈ 10^9 × 10^9 = 10^18 seconds
• 20! seconds is therefore about 77 billion years, several times the age of the universe (roughly 14 billion years)
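The time conversions can be checked with a few lines of arithmetic. The sketch assumes a year of 365.25 days; the constant and variable names are illustrative.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY   # assumption: 365.25-day year

f20 = math.factorial(20)
days_in_million_seconds = 1_000_000 / SECONDS_PER_DAY
years_in_billion_seconds = 1_000_000_000 / SECONDS_PER_YEAR
f20_in_billion_years = f20 / SECONDS_PER_YEAR / 1e9

print(f"20! = {f20:.3e}")
print(f"1,000,000 seconds is about {days_in_million_seconds:.1f} days")
print(f"1,000,000,000 seconds is about {years_in_billion_seconds:.1f} years")
print(f"20! seconds is about {f20_in_billion_years:.0f} billion years")
```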

Permutations

A B C D E

• How many ways can we choose 2 letters from the above 5, without replacement, when the order in which we choose the letters is important?

• 5 × 4 = 20

Permutations (cont.)

5 × 4 = 20 = 5!/(5 - 2)! = 5!/3!

Notation: 5P2 = 5!/(5 - 2)! = 20
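The permutation count can be confirmed three ways: by listing the ordered pairs, by the factorial formula, and with `math.perm` (available in Python 3.8 and later). The variable names are illustrative.

```python
import math
from itertools import permutations

letters = "ABCDE"

# List every ordered choice of 2 letters without replacement.
ordered_pairs = list(permutations(letters, 2))

# Factorial formula: nPr = n! / (n - r)!
by_formula = math.factorial(5) // math.factorial(5 - 2)

print(len(ordered_pairs), by_formula, math.perm(5, 2))
```

Here ('A', 'B') and ('B', 'A') appear as separate entries, since order matters.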

Permutations with calculator and Excel

• Calculator
  non-graphing: nPr
  graphing: p. 9 of TI Calculator Commands (math button)

• Excel
  Paste: Statistical, Permut

Combinations

A B C D E

• How many ways can we choose 2 letters from the above 5, without replacement, when the order in which we choose the letters is not important?

• 5 × 4 = 20 when order is important
• Each unordered pair is counted twice (for example AB and BA), so divide by 2: (5 × 4)/2 = 10 ways

Combinations (cont.)

5C2 = 5!/((5 - 2)! 2!) = 5!/(3! 2!) = (5 × 4)/(1 × 2) = 20/2 = 10

Notation: nCr = n!/((n - r)! r!)
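The combination count can be checked the same way, using `itertools.combinations` for the listing and `math.comb` (Python 3.8 and later) for the formula. The variable names are illustrative.

```python
import math
from itertools import combinations

letters = "ABCDE"

# List every unordered choice of 2 letters without replacement.
unordered_pairs = list(combinations(letters, 2))

# Factorial formula: nCr = n! / ((n - r)! r!)
by_formula = math.factorial(5) // (math.factorial(5 - 2) * math.factorial(2))

print(unordered_pairs)
print(len(unordered_pairs), by_formula, math.comb(5, 2))
```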

ST 101 Powerball Lottery

From the numbers 1 through 20, choose 6 different numbers.

Write them on a piece of paper.

Chances of Winning?

Choose 6 numbers from 20, without replacement, order not important.

Number of possibilities?

20C6 = 20!/((20 - 6)! 6!) = 38,760

North Carolina Powerball Lottery

Prior to Jan. 1, 2009

• 5 from 1-55: 55C5 = 55!/(5! 50!) = 3,478,761
• 1 from 1-42 (p'ball #): 42C1 = 42!/(1! 41!) = 42
• 3,478,761 × 42 = 146,107,962

After Jan. 1, 2009

• 5 from 1-59: 59C5 = 59!/(5! 54!) = 5,006,386
• 1 from 1-39 (p'ball #): 39C1 = 39!/(1! 38!) = 39
• 5,006,386 × 39 = 195,249,054
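The two Powerball counts combine the combination formula with the product rule. A minimal sketch follows; the function name `powerball_tickets` and its parameter names are hypothetical.

```python
import math

def powerball_tickets(main_max, main_picks, powerball_max):
    """Distinct tickets: choose the main numbers (order ignored), then 1 Powerball number."""
    main_choices = math.comb(main_max, main_picks)
    powerball_choices = math.comb(powerball_max, 1)
    return main_choices * powerball_choices   # product rule

before_2009 = powerball_tickets(55, 5, 42)
after_2009 = powerball_tickets(59, 5, 39)

print(f"{before_2009:,}")
print(f"{after_2009:,}")
print(f"chance with one ticket after the change: {1 / after_2009:.2e}")
```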

Visualize Your Lottery Chances

• How large is 195,249,054?
• $1 bill and $100 bill are both 6” in length
• 10,560 bills = 1 mile
• Let’s start with 195,249,053 $1 bills and one $100 bill …
• … and take a long walk, putting down bills end-to-end as we go

Raleigh to Ft. Lauderdale…

… still plenty of bills remaining, so continue from …

… Ft. Lauderdale to San Diego


… still plenty of bills remaining, so continue from …

… San Diego to Seattle

… still plenty of bills remaining, so continue from …

… Seattle to New York

… still plenty of bills remaining, so …

… New York back to Raleigh

Go around again! Lay a second path of bills

Still have ~ 5,000 bills left!!

Chances of Winning NC Powerball Lottery?

• Remember: one of the bills you put down is a $100 bill; all others are $1 bills

• Your chance of winning the lottery is the same as bending over and picking up the $100 bill while walking the route blindfolded.

Example: Illinois State Lottery

Choose 6 numbers from 54 numbers without replacement; order not important

54C6 = 54!/(48! 6!) = 25,827,165

(about 1 second in 10 months)

(1200 ft² house, 16.5 million ping pong balls)

Virginia State Lottery

Pick 5: 50C5 = 50!/(45! 5!) = 2,118,760

2,118,760 × 25C1 = 2,118,760 × 25!/(24! 1!) = 52,969,000

Probability Trees

A Graphical Method for Complicated Probability Problems

Example: AIDS Testing

• V = {person has HIV}; CDC: P(V) = .006
• +: test outcome is positive (test indicates HIV present)
• -: test outcome is negative
• clinical reliabilities for a new HIV test:

1. If a person has the virus, the test result will be positive with probability .999

2. If a person does not have the virus, the test result will be negative with probability .990

Question 1

• What is the probability that a randomly selected person will test positive?

Probability Tree Approach

• A probability tree is a useful way to visualize this problem and to find the desired probability.

Probability Tree

[Figure: probability tree. First-level branches: V and not V; second-level branches: + and -, labeled with the clinical reliabilities. Multiply branch probabilities along each path.]

Question 1 Answer

• What is the probability that a randomly selected person will test positive?

• P(+) = .00599 + .00994 = .01593

Question 2

• If your test comes back positive, what is the probability that you have HIV? (Remember: we know that if a person has the virus, the test result will be positive with probability .999; if a person does not have the virus, the test result will be negative with probability .990.)

• Looks very reliable

Question 2 Answer

Answer: two sequences of branches lead to a positive test; only 1 sequence represents people who have HIV.

P(person has HIV given that test is positive) = .00599/(.00599 + .00994) = .376
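The tree calculation amounts to multiplying along each branch and then adding the branches that end in a positive test. A minimal sketch follows; the names `sensitivity` and `specificity` are labels chosen here for the two clinical reliabilities and do not appear in the slides.

```python
p_v = 0.006            # P(V): person has HIV
sensitivity = 0.999    # P(+ | V)
specificity = 0.990    # P(- | not V)

# Multiply along each branch that ends in a positive test.
p_v_and_pos = p_v * sensitivity                     # has HIV and tests positive
p_not_v_and_pos = (1 - p_v) * (1 - specificity)     # no HIV but tests positive

# Question 1: add the two branches.
p_pos = p_v_and_pos + p_not_v_and_pos

# Question 2: share of positive tests that come from the V branch.
p_v_given_pos = p_v_and_pos / p_pos

print(round(p_v_and_pos, 5), round(p_not_v_and_pos, 5))
print(round(p_pos, 5), round(p_v_given_pos, 3))
```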

Summary

• Question 1: P(+) = .00599 + .00994 = .01593

• Question 2: two sequences of branches lead to a positive test; only 1 sequence represents people who have HIV.

P(person has HIV given that test is positive) = .00599/(.00599 + .00994) = .376

Recap

• We have a test with very high clinical reliabilities:
  1. If a person has the virus, the test result will be positive with probability .999
  2. If a person does not have the virus, the test result will be negative with probability .990

• But we have extremely poor performance when the test is positive:

P(person has HIV given that test is positive) = .376

• In other words, 62.4% of the positives are false positives! Why?

• When the characteristic the test is looking for is rare, most positives will be false.
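To see how rarity drives this result, the same calculation can be repeated for several prevalence values while the two reliabilities stay fixed. This is a sketch only: apart from .006, the prevalence values are hypothetical, and the function name is made up for the illustration.

```python
def p_condition_given_positive(prevalence, p_pos_if_present=0.999, p_neg_if_absent=0.990):
    """P(has the condition | positive test) for a given prevalence."""
    true_pos = prevalence * p_pos_if_present
    false_pos = (1 - prevalence) * (1 - p_neg_if_absent)
    return true_pos / (true_pos + false_pos)

# Only .006 comes from the example; the other prevalences are hypothetical.
for prevalence in (0.001, 0.006, 0.05, 0.5):
    print(prevalence, round(p_condition_given_positive(prevalence), 3))
```

As the prevalence rises, a growing share of the positives are true positives, even though the test itself has not changed.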

Examples

1. P(A) = .3, P(B) = .4; if A and B are mutually exclusive events, then P(A ∩ B) = ?

A ∩ B = ∅, so P(A ∩ B) = 0

2. 15 entries in pie baking contest at state fair. Judge must determine 1st, 2nd, 3rd place winners. How many ways can judge make the awards?

15P3 = 15 × 14 × 13 = 2730