Chapter 7 Random Variables

Random Variables

Sample spaces are not always numeric (for example, tossing 4 coins: HTTH, TTTH, etc.). If we let X = the number of heads, then in those 2 outcomes X = 2 and X = 1.

We call X a “random variable” because its values vary when the coin tossing is repeated.

Random variables are usually denoted by capital letters such as X or Y.

As we progress from general probability to inference, the sample space S just lists the possible values of the random variable

7.1 Discrete and Continuous random variables

Example

A university posts the grade distributions for its courses online. Students in stats received 21% A’s, 43% B’s, 30% C’s, 5% D’s, and 1% F’s. Choose a student at random. The student’s grade on a four-point scale (A = 4) is a random variable X. The value of X changes when we repeatedly choose students at random, but it is always 0, 1, 2, 3, or 4.

Here is the distribution:

Value of X     0     1     2     3     4
Probability   .01   .05   .30   .43   .21

The probability that the student got a B or better is the sum of the probabilities of an A and a B:

P(X ≥ 3) = P(X = 3) + P(X = 4) = .43 + .21 = .64
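
The same calculation can be sketched in a few lines of Python. This is only an illustration: the dictionary name grade_dist and the variable names are made up here, and the probabilities are the ones from the grade example.

```python
# Grade distribution from the example: value of X -> probability
grade_dist = {0: 0.01, 1: 0.05, 2: 0.30, 3: 0.43, 4: 0.21}

# A legitimate distribution has probabilities that add up to 1
total = sum(grade_dist.values())

# P(X >= 3): add the probabilities of every value that is 3 or more
p_b_or_better = sum(p for x, p in grade_dist.items() if x >= 3)

print(round(total, 2))
print(round(p_b_or_better, 2))
```

Summing over a condition like x >= 3 works for any event of a discrete random variable, not just "B or better".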

Probability Histograms

Idealized pictures of the results over many trials.

Continuous Random Variables

When we choose a digit between 0 and 9 at random, each digit has probability .1. What if we wanted to allow any number between 0 and 1 as our outcome? There are infinitely many possible outcomes! How can we assign probabilities? X in this case is a continuous random variable because its values are not isolated numbers but an entire interval of numbers.

The random number generator will spread its output uniformly across the entire interval from 0 to 1, creating the density curve of a UNIFORM DISTRIBUTION. The area under the curve is 1, and the probability of any event is the area under the curve and above the event in question.

The probability model for a continuous random variable assigns probabilities to INTERVALS of outcomes, not to individual outcomes! An individual outcome actually gets assigned a probability of 0!

Ex: The random number generator produces a number between .79 and .81 with probability .02. An outcome between .799 and .801 has probability .002. (We can ignore the distinction between > and ≥ for continuous random variables, but not for discrete ones.)
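
For the uniform density on 0 to 1, "area under the curve" is just the area of a rectangle. The sketch below illustrates that idea; the helper uniform_prob is a hypothetical name introduced here, not something from the slides.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """P(a <= X <= b) for X uniform on [low, high], computed as a rectangle area."""
    # Keep only the part of [a, b] that lies inside the interval of possible values
    a, b = max(a, low), min(b, high)
    if b <= a:
        return 0.0
    height = 1.0 / (high - low)  # the density curve is flat at this height
    return (b - a) * height      # width times height

p_wide = uniform_prob(0.79, 0.81)
p_narrow = uniform_prob(0.799, 0.801)
p_point = uniform_prob(0.8, 0.8)  # a single outcome has zero width, so zero area

print(round(p_wide, 3), round(p_narrow, 3), p_point)
```

Because a single point has zero width, including or excluding the endpoints does not change the area.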

Normal Distributions

The density curve we are most familiar with is a Normal curve. Remember that N(μ, σ) is our notation for a Normal distribution.

If X has the N(μ, σ) distribution, the z-score Z = (X – μ)/σ is a standard Normal random variable having the distribution N(0, 1).

Example: Cheating in school

Suppose the true population proportion of students who would report another student for cheating is 12%. If we survey a random sample of 400 students many times, we expect the average of the sample proportions to get close to this .12 value (with a standard deviation of .016).

The sample proportion is a random variable because each sample of 400 would likely give a different proportion. We treat it as continuous and use a Normal curve with mean .12 and standard deviation .016.

What is the probability that a survey result will differ from the true proportion by more than 2 percentage points, that is, that the result is less than .10 or greater than .14? Standardizing, (.10 - .12)/.016 = -1.25 and (.14 - .12)/.016 = 1.25. From Table A or a calculator, P(-1.25 ≤ Z ≤ 1.25) = .8944 - .1056 = .7888. So the probability we seek is 1 - .7888 = .2112, or about 21%.
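
As a cross-check on the Table A lookup, here is a minimal sketch using NormalDist from Python's standard statistics module. It assumes, as the example does, that the survey result follows a Normal curve with mean .12 and standard deviation .016; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumed model for the survey result: Normal with mean .12 and SD .016
survey_result = NormalDist(mu=0.12, sigma=0.016)

# z-scores for the two cutoffs
z_low = (0.10 - 0.12) / 0.016
z_high = (0.14 - 0.12) / 0.016

# Probability of landing within 2 percentage points of the truth, and its complement
p_within = survey_result.cdf(0.14) - survey_result.cdf(0.10)
p_outside = 1 - p_within

print(round(z_low, 2), round(z_high, 2))
print(round(p_within, 4), round(p_outside, 4))
```

The result can differ from the table answer in the fourth decimal place because Table A rounds its entries.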

Homework Read P. 465-474, #2-5, 9 and 10

7.2 Means and variances of Random Variables (weighted average)

The mean of a probability distribution describes the LONG RUN average outcome (not all outcomes need be equally likely). It is a weighted average: multiply each value by its probability and add. The mean of the probability distribution of X is denoted μx.

Example

Lottery: You choose a 3-digit number. If the lottery draws your number, you win $500. Since there are 1000 possible 3-digit numbers, you have a 1/1000 chance of winning.

Payoff X      $0     $500
Probability   .999   .001

What is your average payoff from many tickets? $500(.001) + $0(.999) = $.50

How would this differ if you paid $1 for your ticket?
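
The weighted-average calculation can be sketched as follows. The names payoff_dist and ticket_price are illustrative, and the $1 price is the one raised in the question.

```python
# Payoff distribution for one lottery ticket: payoff in dollars -> probability
payoff_dist = {0: 0.999, 500: 0.001}

# Mean = sum of (value x probability) over all values
mean_payoff = sum(x * p for x, p in payoff_dist.items())

# If the ticket costs $1, every outcome is shifted down by 1, and so is the mean
ticket_price = 1
mean_winnings = mean_payoff - ticket_price

print(round(mean_payoff, 2), round(mean_winnings, 2))
```

So paying $1 per ticket turns an average payoff of $.50 into an average loss of $.50 per ticket.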

Variance of a Random Variable

The variance of a random variable X is denoted σ²x (different from the variance of a sample, which is denoted s²). For a discrete random variable, it is the sum of (xi – μx)²·pi over all values of X.

Example: Linda sells cars

Cars sold X    0    1    2    3
Probability   .3   .4   .2   .1

xi    pi    xi·pi    (xi – μx)²·pi
0     .3    .0       (0 – 1.1)²(.3) = .363
1     .4    .4       (1 – 1.1)²(.4) = .004
2     .2    .4       (2 – 1.1)²(.2) = .162
3     .1    .3       (3 – 1.1)²(.1) = .361
            μx = 1.1           σ²x = .890

We can find the mean and variance of X with a table (or with our calculator: enter X in L1 and P in L2, then run 1-Var Stats L1, L2).
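
The table method translates directly into code. This is a small sketch, with mean_and_variance as a hypothetical helper name; it follows the same two columns as the table (xi·pi, then (xi – μx)²·pi).

```python
def mean_and_variance(dist):
    """Return (mean, variance) of a discrete distribution given as value -> probability."""
    mu = sum(x * p for x, p in dist.items())               # column xi * pi
    var = sum((x - mu) ** 2 * p for x, p in dist.items())  # column (xi - mu)^2 * pi
    return mu, var

# Linda's car sales distribution from the example
cars = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}
mu_x, var_x = mean_and_variance(cars)
sd_x = var_x ** 0.5  # standard deviation is the square root of the variance

print(round(mu_x, 2), round(var_x, 3), round(sd_x, 3))
```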

Mean of a continuous random variable

The point at which the area under the density curve would balance if it were made out of solid material (the center, if the curve is symmetric).

Law of Large Numbers

Suppose we want to estimate the mean height μ of all American women aged 18-24. To estimate μ, we take an SRS of women aged 18-24 and use the sample mean X bar to estimate the unknown population mean.

If we repeat this and choose another sample, the mean height will likely differ. But the more times we repeat this (drawing a sample and recording the mean), the closer we expect the average of all our sample mean heights to get to the true μ.

The behavior of X bar is just like the behavior of probabilities in the long run!

Example: Suppose the true μ for women’s heights was 64.5 inches with a standard deviation of 2.5 inches.

Suppose I repeatedly draw a sample of women from this population and record the sample’s average height. After each recording, I write down the average of all my sample means so far. The more times I do this, the closer my overall average gets to 64.5.
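
A quick simulation can illustrate this. The sketch below makes several assumptions that are not in the example: heights are drawn from a Normal distribution, each sample has 25 women, and the seed is arbitrary. Only the mean 64.5 and standard deviation 2.5 come from the example.

```python
import random

random.seed(7)           # arbitrary seed so the run is repeatable
MU, SIGMA = 64.5, 2.5    # population mean and SD from the example
SAMPLE_SIZE = 25         # assumed sample size (not given in the example)
NUM_SAMPLES = 4000       # how many samples we draw in total

def sample_mean(n):
    """Average height of one simulated sample of n women (Normal heights assumed)."""
    return sum(random.gauss(MU, SIGMA) for _ in range(n)) / n

running_total = 0.0
checkpoints = {}
for k in range(1, NUM_SAMPLES + 1):
    running_total += sample_mean(SAMPLE_SIZE)
    if k in (1, 10, 100, 1000, NUM_SAMPLES):
        checkpoints[k] = running_total / k  # average of all sample means so far

for k, avg in checkpoints.items():
    print(k, round(avg, 3))
```

The early averages can wander, but the later ones should sit very close to 64.5.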

Casinos, Insurance Companies, and the Law of Large Numbers

Gamblers may win or lose, but the casino will win in the long run because the law of large numbers says what the average outcome of many thousands of bets will be.

The same concept applies when insurance companies decide what to charge, or when McDonald’s decides how many beef patties to make per day.

Law of Small Numbers (there isn’t one)

The rules of probability and the law of large numbers describe the regular behavior of chance phenomena in the LONG run, but psychologists have discovered that our intuitive understanding of randomness is quite different from the true laws of chance. We expect that even short sequences of random events will show the kind of average behavior that in fact only appears in the long run.

Ex: Write down a sequence of heads and tails that you think imitates 10 tosses of a balanced coin. What was the longest run of consecutive heads or tails in your tosses?

Most people don’t write a run of more than 2 consecutive heads or tails. Longer runs don’t seem “random” to us.

In fact, the probability of a run of 3 or more consecutive heads in 10 tosses is .5078, and the probability of a run of 3 or more consecutive heads or tails is even greater (about .83)!

Seeing a run of 3 or more may cause us to incorrectly conclude that we have a biased coin. Some gamblers follow “hot-hand” theory…silly!
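
With only 2^10 = 1024 equally likely sequences, the run probabilities can be checked by listing every sequence. This is a brute-force sketch assuming a balanced coin; longest_run is a hypothetical helper name.

```python
from itertools import groupby, product

def longest_run(seq, symbols):
    """Length of the longest run in seq whose symbol is one of the given symbols."""
    return max((len(list(g)) for s, g in groupby(seq) if s in symbols), default=0)

# All 1024 equally likely outcomes of 10 tosses of a balanced coin
sequences = list(product("HT", repeat=10))

# Run of 3 or more heads specifically
p_heads_run = sum(longest_run(s, "H") >= 3 for s in sequences) / len(sequences)

# Run of 3 or more of either heads or tails
p_any_run = sum(longest_run(s, "HT") >= 3 for s in sequences) / len(sequences)

print(p_heads_run)  # 0.5078125
print(p_any_run)    # 0.826171875
```

Runs of 3 or more are the norm in 10 tosses, not a sign of a biased coin.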

How large is large?

The law doesn’t say how many trials are needed to guarantee a mean outcome close to μ. That depends on the variability of the outcomes. The more variable the outcomes, the more trials are needed to ensure that the mean outcome X bar is close to the distribution mean μ.

Casinos understand this: the outcomes of games of chance are variable enough to hold the interest of gamblers. Only the casino plays often enough to rely on the law of large numbers. Gamblers get entertainment; the casino has a business!

Homework Expected Value WS Special Problem 7A, #3

Rules for means

Review: How can we tell if something is a legitimate probability distribution?

Example (Linda cars)

Cars sold X    0    1    2    3
Probability   .3   .4   .2   .1

Trucks/SUVs sold Y    0    1    2
Probability          .4   .5   .1

Let X be the number of cars Linda sells and Y the number of trucks and SUV’s.

μx = (0)(.3) + (1)(.4) + (2)(.2) + (3)(.1) = 1.1 cars

μy = (0)(.4) + (1)(.5) + (2)(.1) = .7 trucks and SUV’s

At her commission rate of 25% of gross profit on each vehicle she sells, Linda expects to earn $350 on each car and $400 on each truck/SUV sold. Her earnings are Z = 350X + 400Y. What are her average (expected) earnings?

Combining rules 1 and 2 (multiplying by a constant multiplies the mean by that constant, and the mean of a sum is the sum of the means), her mean earnings are

μz = 350μx + 400μy = (350)(1.1) + (400)(.7) = $665

That’s her best estimate of her earnings for the day.
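
The earnings calculation can be sketched in Python using the two distributions from the Linda example. The helper name mean is illustrative; the dollar amounts are the ones given above.

```python
# Distributions from the Linda example: number sold -> probability
cars = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}
trucks = {0: 0.4, 1: 0.5, 2: 0.1}

def mean(dist):
    """Mean of a discrete distribution given as value -> probability."""
    return sum(x * p for x, p in dist.items())

mu_x = mean(cars)    # mean number of cars sold
mu_y = mean(trucks)  # mean number of trucks and SUVs sold

# Rules for means: the mean of 350X + 400Y is 350*mu_x + 400*mu_y
mu_z = 350 * mu_x + 400 * mu_y

print(round(mu_x, 2), round(mu_y, 2), round(mu_z, 2))
```

Note that this rule for means does not require X and Y to be independent.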

Rules for Variances

For this course we only need to deal with variances of 2 variables that are independent. This is an ASSUMPTION when we do these problems (always ask yourself if the assumption of independence seems reasonable).

RULE 1: Multiplying X by a constant multiplies the SD by that constant (in absolute value) and thus the variance by the square of that constant.

RULE 2: If X and Y are independent, then

σ²x+y = σ²x + σ²y and σ²x–y = σ²x + σ²y

(They’re the same because –Y has the same variance as Y: the constant –1 is squared, so the sign doesn’t matter.) The difference X – Y is more variable than either X or Y alone because variations in both X and Y contribute to variation in their difference.
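
Rule 2 can be checked numerically by building the full distribution of X + Y and X – Y. The sketch below reuses Linda's car and truck distributions and ASSUMES her car and truck sales are independent, which the example does not verify; combine and mean_and_variance are hypothetical helper names.

```python
from itertools import product

cars = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}   # X
trucks = {0: 0.4, 1: 0.5, 2: 0.1}         # Y

def mean_and_variance(dist):
    mu = sum(v * p for v, p in dist.items())
    var = sum((v - mu) ** 2 * p for v, p in dist.items())
    return mu, var

def combine(dist_x, dist_y, op):
    """Distribution of op(X, Y), assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        v = op(x, y)
        # Independence assumption: P(X = x and Y = y) = px * py
        out[v] = out.get(v, 0.0) + px * py
    return out

_, var_x = mean_and_variance(cars)
_, var_y = mean_and_variance(trucks)
_, var_sum = mean_and_variance(combine(cars, trucks, lambda x, y: x + y))
_, var_diff = mean_and_variance(combine(cars, trucks, lambda x, y: x - y))

print(round(var_x + var_y, 3), round(var_sum, 3), round(var_diff, 3))
```

All three printed values agree, so the sum and the difference have the same variance.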

Example: Winning lottery (review)

The payoff X of a $1 ticket (do on calc):

xi     pi      xi·pi    (xi – μx)²·pi
0      .999    0        (0 – .5)²(.999) = .24975
500    .001    .5       (500 – .5)²(.001) = 249.50025
               μx = .5            σ²x = 249.75

The standard deviation is σx = √249.75 = $15.80. (It is usual for games of chance to have large variances; that keeps them exciting.)

If you buy a ticket, your winnings are W = X – 1 (because you paid $1). By the rules for means, the mean amount you win is μw = μx – 1 = –$.50. (The standard deviation and variance of X – 1 are the same as those of X, because adding or subtracting a constant is a linear transformation that does not change the spread!)

Suppose now that you buy a ticket on each of 2 different days. The payoffs X and Y on the two tickets are independent because separate drawings are held each day. Your total payoff X + Y has mean

μx+y = μx + μy = $.50 + $.50 = $1.00

Because X and Y are independent, the variance of X + Y is

σ²x+y = σ²x + σ²y = 249.75 + 249.75 = 499.50

and the standard deviation of the total payoff is

σx+y = √499.5 = $22.35

**Not the same as the sum of the individual standard deviations ($15.80 + $15.80 = $31.60)!**

If you buy a ticket every day (365 tickets a year), your mean total payoff is the sum of the 365 daily means: 365 x $.50 = $182.50. Of course, it cost you $365 to play, so on average you actually lose $182.50!
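
The two-ticket and full-year numbers can be reproduced with a short sketch. The yearly standard deviation at the end is an extra step not worked in the example; it applies the same variance rule and assumes all 365 drawings are independent.

```python
from math import sqrt

mu_one = 0.50     # mean payoff of one ticket, in dollars
var_one = 249.75  # variance of one ticket's payoff

# Two independent tickets: variances add, standard deviations do not
var_two = var_one + var_one
sd_one = sqrt(var_one)
sd_two = sqrt(var_two)

# A year of daily tickets at $1 each
days = 365
mean_year_payoff = days * mu_one
net_year = mean_year_payoff - days * 1

# Extra step (not in the example): SD of the yearly total, assuming independence
sd_year = sqrt(days * var_one)

print(round(sd_one, 2), round(sd_two, 2), round(2 * sd_one, 2))
print(mean_year_payoff, net_year, round(sd_year, 2))
```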

Combining Normal Random Variables

Any linear combination of independent Normal random variables is also Normally distributed.

Example: Golf

Tom and George are playing in a tournament. Their scores vary as they play the course repeatedly. Tom’s score X has the N(110, 10) distribution, and George’s score Y varies from round to round according to the N(100, 8) distribution. If they play independently, what is the probability that Tom will score lower than George and thus do better in the tournament?

The difference X – Y between their scores is Normally distributed with mean and variance

μx–y = μx – μy = 110 – 100 = 10
σ²x–y = σ²x + σ²y = 10² + 8² = 164

Since √164 = 12.8, X – Y has the N(10, 12.8) distribution.

The probability that Tom wins is

P(X < Y) = P(X – Y < 0) = P(Z < (0 – 10)/12.8) = P(Z < -.78) = .2177
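
The golf calculation can be checked with NormalDist from Python's standard statistics module. This is a sketch under the example's own assumptions (independent Normal scores); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Tom: N(110, 10), George: N(100, 8), assumed independent
mu_diff = 110 - 100
sd_diff = sqrt(10 ** 2 + 8 ** 2)  # variances add, then take the square root

# Distribution of the difference X - Y
diff = NormalDist(mu=mu_diff, sigma=sd_diff)

# Tom wins when X - Y < 0
z = (0 - mu_diff) / sd_diff
p_tom_wins = diff.cdf(0)

print(round(sd_diff, 1), round(z, 2), round(p_tom_wins, 4))
```

The probability can differ slightly from the Table A value because the table rounds z to two decimal places.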

Homework Read P. 480-486, #10, 16, 24-28