
***SECTION 7.1*** Discrete and Continuous Random Variablesapstatsclass.weebly.com/uploads/1/2/4/4/12442984/unit_6_notes.pdf · Discrete and Continuous Random Variables ... A _____


UNIT 6 ~ Random Variables


***SECTION 7.1*** Discrete and Continuous Random Variables

Sample spaces need not consist of numbers; tossing a coin four times yields sequences of H’s and T’s. However, in statistics we are most often interested in numerical outcomes such as the ______________ of heads in the four tosses. We will then use shorthand notation. For example, let X be the number of heads; if our outcome is HHTH, we write ___________. Notice that X can be __________________. We call X a ____________________________________ because its values vary when the coin tossing is repeated.

A ____________________________________ is a variable whose value is a numerical outcome of a random phenomenon. Random variables are usually denoted by capital letters near the end of the alphabet (X, Y, etc.). Random variables are the basic units of _____________________ distributions, which, in turn, are the foundations for inference. There are two types of random variables to be studied: _______________ and _____________________________.

Discrete Random Variables

A _____________________________________________ X has a countable number of possible values. The ________________________________________ of a discrete random variable X lists the values and their probabilities:

Value of X: $x_1$ $x_2$ $x_3$ … $x_k$

Probability: $p_1$ $p_2$ $p_3$ … $p_k$

The probabilities $p_i$ must satisfy two requirements:

1. Every probability $p_i$ is a number between ________________.

2. The sum of the probabilities is _______: $p_1 + p_2 + \dots + p_k = 1$.

Find the probability of any event by adding the probabilities $p_i$ of the particular values $x_i$ that make up the event.

Example 1: Automobile Defects A consumer organization that evaluates new automobiles customarily reports the number of major defects on each car examined. Let x denote the number of major defects on a randomly selected car of a certain type. A large number of automobiles were evaluated, and a probability distribution consistent with these observations is:

x 0 1 2 3 4 5 6 7 8 9 10

p(x) .041 .010 .209 .223 .178 .114 .061 .028 .011 .004 .001

Interpret p(3).

Find and interpret the probability that the number of defects is between 2 and 5 inclusive.
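As a hedged sketch of the rule "add the probabilities of the values that make up the event", the following Python snippet works through Example 1 with the table entries exactly as printed. The function name `prob_between` is made up for illustration. The printed probabilities sum to 0.880 rather than 1, so one entry may have been mistranscribed; the interval calculation uses only the entries for x = 2 through 5 as printed.

```python
# Probabilities exactly as printed in the Example 1 table (index = number of defects x).
p = [.041, .010, .209, .223, .178, .114, .061, .028, .011, .004, .001]

def prob_between(p, lo, hi):
    """P(lo <= x <= hi): add the probabilities of the values in the event."""
    return sum(p[lo:hi + 1])

p3 = p[3]                        # long-run proportion of cars with exactly 3 major defects
p_2_to_5 = prob_between(p, 2, 5)
total = sum(p)                   # requirement 2 says this should be 1

print(p3)                  # 0.223
print(round(p_2_to_5, 3))  # 0.724
print(round(total, 3))     # 0.88 for the values as printed
```

Read this way, about 72% of cars of this type would have between 2 and 5 major defects in the long run.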


A probability histogram is a pictorial representation of a discrete probability distribution. Below is a probability histogram for the previous example:

Note:
1) The height of each bar shows the probability of the outcome at its base
2) Since the heights are probabilities, they add to 1
3) All bars in a histogram are the same width
4) Histograms make it easy to quickly ___________________ distributions

Example 2: Hot Tub Models Suppose that each of four randomly selected customers purchasing a hot tub at a certain store chooses either an electric (E) or a gas (G) model. Assume that these customers make their choices independently of one another and that 40% of all customers select an electric model. This implies that for any particular one of the four customers, P(E) = _______ and P(G) = _______

One possible experimental outcome is EGGE, where the first and fourth customers select electric models and the other two choose gas models. Because the customers make their choices independently, the multiplication rule for independent events implies that

P(EGGE) = P(1st chooses E and 2nd chooses G and 3rd chooses G and 4th chooses E) = P(E) P(G) P(G) P(E) = (.4) (.6) (.6) (.4) = .0576

What is the probability for the experimental outcome GGGE?

Let us now consider x = the number of electric hot tubs purchased by the four customers. Then we can consider the outcomes and probabilities as follows.


Outcome   Probability   x Value   |   Outcome   Probability   x Value
GGGG      .1296         0         |   GEEG
EGGG      .0864         1         |   GEGE
GEGG      .0864                   |   GGEE      .0576
GGEG                              |   GEEE                    3
GGGE      .0864                   |   EGEE
EEGG                              |   EEGE
EGEG      .0576         2         |   EEEG      .0384
EGGE      .0576                   |   EEEE

Notice that there are four different outcomes for which x = 1, so p(1) results from summing the four corresponding probabilities:

p(1) = P(x = 1) = P(EGGG) + P(GEGG) + P(GGEG) + P(GGGE)

Similarly, p(2) = .3456, p(3) = .1536, p(4) = .0256

We can then summarize the probability distribution of x:

The probability histogram for this distribution is:

Find and interpret the probability that at least two of the four customers choose electric models.
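One way to check the hot tub table is to enumerate all 16 outcomes and add up probabilities by the value of x. The sketch below assumes P(E) = 0.4 and P(G) = 0.6, which follow from the 40% figure in the example; the variable names are illustrative only.

```python
from itertools import product

P_CHOICE = {"E": 0.4, "G": 0.6}  # per-customer probabilities implied by the example

# dist[x] accumulates p(x) for x = number of electric models among 4 customers.
dist = {x: 0.0 for x in range(5)}
for outcome in product("EG", repeat=4):
    prob = 1.0
    for choice in outcome:       # multiplication rule for independent choices
        prob *= P_CHOICE[choice]
    dist[outcome.count("E")] += prob

p_at_least_two = dist[2] + dist[3] + dist[4]

for x, px in dist.items():
    print(x, round(px, 4))
print("P(x >= 2) =", round(p_at_least_two, 4))
```

Under these assumptions the enumeration reproduces p(2) = .3456, p(3) = .1536 and p(4) = .0256, so the probability that at least two customers choose electric is about .5248.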


Continuous Random Variables

When we use the table of random digits to select a digit between 0 and 9, the result is a __________ random variable. The probability model assigns probability ________ to each of the 10 possible outcomes. Suppose that we want to choose a number at random between 0 and 1, allowing ________ number between 0 and 1 as the outcome. The sample space is now an entire interval of numbers:

How can we assign probabilities? We would like all outcomes to be ________________________, but we cannot assign probabilities to each individual value of x and then sum because there are _________________ many possible values. So we will use a new way of assigning probabilities directly to events; that is, using ______________________________________________________. Any density curve has area exactly _____ underneath it, corresponding to total probability 1.

Example 3: Application Processing Time Define a continuous random variable x by x = amount of time (in minutes) taken by a clerk to process a certain type of application form. Suppose that x has a probability distribution with density function

$f(x) = \begin{cases} 0.5 & 4 < x < 6 \\ 0 & \text{otherwise} \end{cases}$

The graph of f(x), the density curve, is shown below in (a). When the density curve is constant over an interval (resulting in a horizontal density curve), the probability distribution is called a __________________ distribution. It is especially easy to use this density curve to calculate probabilities, because it just requires finding the area of rectangles using the formula

area = (base)(height)

The curve has positive height, 0.5, only between x = 4 and x = 6. The total area under the curve is:

b) P(4.5 < x < 5.5) =

c) P(x < 5.5) =

Interpretation:
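The rectangle-area calculations in Example 3 can be sketched in a few lines of Python. The helper `uniform_prob` is a hypothetical name, and the defaults simply encode the density from the example (height 0.5 between 4 and 6).

```python
def uniform_prob(a, b, lo=4.0, hi=6.0, height=0.5):
    """Area under the constant density between a and b, clipped to [lo, hi]."""
    left, right = max(a, lo), min(b, hi)
    base = max(0.0, right - left)   # no area outside the interval (4, 6)
    return base * height            # area = (base)(height)

total_area = uniform_prob(4, 6)               # whole density curve
p_b = uniform_prob(4.5, 5.5)                  # P(4.5 < x < 5.5)
p_c = uniform_prob(float("-inf"), 5.5)        # P(x < 5.5)

print(total_area, p_b, p_c)  # 1.0 0.5 0.75
```

So, under this model, about 75% of applications would be processed in less than 5.5 minutes.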


A _____________________ random variable X takes all values in an ______________ of numbers. The probability distribution of X is described by a density curve. The probability of any event is the area __________ the density curve and ___________ the values of X that make up the event. The probability model for a continuous random variable assigns probabilities to intervals of outcomes rather than to individual outcomes. In fact, all continuous probability distributions assign probability 0 to every individual outcome! Only intervals of values have positive probability. This is clear when you think of finding area.

* We can ignore the distinction between > and ≥ when finding probabilities for CONTINUOUS (but not discrete) random variables.

Normal Distributions as Probability Distributions

The density curves that are most familiar to us are the Normal curves (from Chapter 2). Because any density curve describes an assignment of probabilities, Normal distributions are probability distributions.

Example 4: Internet for Information According to a recent Associated Press poll, approximately 40% of American adults indicated they used the internet to get news and information about political candidates. Suppose 40% of all American adults use this method to get their political information. What would happen if you randomly sampled a group of American adults and asked them whether they used the internet to get this information? Define X to be the proportion of your sample that would respond that the internet was their primary source. Use the fact that the distribution of X is approximately N(0.4, 0.01265) to answer the following questions.

a) P(X ≥ 0.42)   b) P(0.38 < X < 0.42)
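For Example 4, the two Normal probabilities can be approximated with Python's standard library instead of a table or calculator. This is only a sketch: it takes N(0.4, 0.01265) to mean a mean of 0.4 and a standard deviation of 0.01265, as in the notes.

```python
from statistics import NormalDist

# X is approximately Normal with mean 0.4 and standard deviation 0.01265.
X = NormalDist(mu=0.4, sigma=0.01265)

p_a = 1 - X.cdf(0.42)             # a) P(X >= 0.42)
p_b = X.cdf(0.42) - X.cdf(0.38)   # b) P(0.38 < X < 0.42)

print(round(p_a, 4))
print(round(p_b, 4))
```

Because X is continuous, using ≥ or > in part a) gives the same answer.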


***SECTION 7.2*** Means and Variances of Random Variables In previous chapters, we moved from graphs to numerical measures such as _____________ and __________________________________________. Now we will make the same move to expand our descriptions of the distributions of random variables.

The Mean of a Random Variable

The mean $\bar{x}$ of a set of observations is their ordinary average. The mean of a discrete random variable X is also an _________________ of the possible values of X, but with an essential change to take into account the fact that ______________ outcomes need to be equally likely.

The common symbol for the ___________ of a probability distribution is µ, the Greek letter mu. We used µ in Chapter 2 for the mean of a Normal distribution, so this is not a new notation. We will often be interested in _________________ random variables, each having a _______________ probability distribution with a __________________ mean. To remind ourselves that we are talking about the mean of X we often write $\mu_X$ rather than simply µ. You will often find the mean of a random variable called the _________________________________________.

Mean of a Discrete Random Variable Suppose that X is a discrete random variable whose distribution is

Value of X: $x_1$ $x_2$ $x_3$ … $x_k$

Probability: $p_1$ $p_2$ $p_3$ … $p_k$

To find the mean of X, multiply each possible value by its probability, then add all the products:

$\mu_X = x_1 p_1 + x_2 p_2 + \dots + x_k p_k = \sum x_i p_i$

Example 5: Exam Attempts Individuals applying for a certain license are allowed up to four attempts to pass the licensing exam. Let x denote the number of attempts made by a randomly selected applicant. The probability distribution of x is as follows:

x 1 2 3 4

p(x) .1 .2 .3 .4

Then x has mean value:
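The mean in Example 5 is a direct application of "multiply each value by its probability, then add". A minimal Python sketch, with illustrative variable names:

```python
values = [1, 2, 3, 4]          # number of attempts x
probs = [0.1, 0.2, 0.3, 0.4]   # p(x) from the table

# mu = x_1 p_1 + x_2 p_2 + ... + x_k p_k
mean_x = sum(x * p for x, p in zip(values, probs))

print(round(mean_x, 4))  # 3.0
```

The result, 3 attempts, is a long-run average over many applicants, not a prediction for any single applicant.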


The Variance of a Random Variable

The variance and the standard deviation are the measures of ______________ that accompany the choice of the mean to measure center. Just as for the mean, we need a _________________ symbol to distinguish the variance of a random variable from the variance $s^2$ of a data set. We write the variance of a random variable X as $\sigma_X^2$. The definition of the variance $\sigma_X^2$ of a random variable is similar to the definition of the sample variance $s^2$ given in Chapter 1. Here is the definition.

Variance of a Discrete Random Variable Suppose that X is a discrete random variable whose distribution is

Value of X: $x_1$ $x_2$ $x_3$ … $x_k$

Probability: $p_1$ $p_2$ $p_3$ … $p_k$

and that $\mu_X$ is the mean of X. The variance of X is

$\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \dots + (x_k - \mu_X)^2 p_k = \sum (x_i - \mu_X)^2 p_i$

The standard deviation $\sigma_X$ of X is the square root of the variance.

Example 6: Defective Components A television manufacturer receives certain components in lots of four from two different suppliers. Let x and y denote the number of defective components in randomly selected lots from the first and second suppliers, respectively. The probability distributions for x and y are as follows:

x      0    1    2    3    4
p(x)   .4   .3   .2   .1   0

y      0    1    2    3    4
p(y)   .2   .6   .2   0    0

Probability histograms are given below: What can we say by examining the histograms? Find the variance and standard deviation for x and y. What can we conclude?
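The variance definition can be applied to both suppliers in Example 6 with one small helper. This is a sketch; the function name `mean_var_sd` is invented for illustration.

```python
from math import sqrt

def mean_var_sd(values, probs):
    """Mean, variance and standard deviation of a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, var, sqrt(var)

vals = [0, 1, 2, 3, 4]
mu_x, var_x, sd_x = mean_var_sd(vals, [0.4, 0.3, 0.2, 0.1, 0.0])  # first supplier
mu_y, var_y, sd_y = mean_var_sd(vals, [0.2, 0.6, 0.2, 0.0, 0.0])  # second supplier

print(round(mu_x, 4), round(var_x, 4), round(sd_x, 4))
print(round(mu_y, 4), round(var_y, 4), round(sd_y, 4))
```

Both suppliers average 1 defective component per lot, but the second supplier's smaller standard deviation (about 0.63 versus 1) indicates less lot-to-lot variability.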


Statistical Estimation and the Law of Large Numbers

To estimate µ, we often choose an _________ and use the sample mean $\bar{x}$ to estimate the unknown population mean µ. Statistics obtained from probability samples are random variables because their values would ________ in ___________________ sampling. The _______________ _______________________ of statistics are just the probability distributions of these random variables. We will study sampling distributions in Chapter 9.

It seems reasonable to use $\bar{x}$ to _______________ µ. An SRS should fairly represent the population, so the mean $\bar{x}$ of the sample should be somewhere near the mean µ of the population. We don’t expect $\bar{x}$ to be _______________ µ. We realize that if we choose another SRS, the luck of the draw will probably produce a ____________________ $\bar{x}$. However, if we keep adding observations to our random sample, the statistic $\bar{x}$ is ______________________ to get as close as we wish to the parameter µ and then stay that close. This remarkable fact is called the law of large numbers and it holds for _______ population.

Law of Large Numbers

Draw ______________________ observations at random from any population with finite mean µ. Decide how accurately you would like to estimate µ. As the number of observations drawn __________________, the mean $\bar{x}$ of the observed values eventually ______________________ the mean µ of the population as closely as you specified and then stays that close.

Notice that as we increase the size of our sample, the sample mean $\bar{x}$ approaches the mean µ of the population:
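A small simulation can illustrate this behavior. The sketch below is an assumption-laden illustration, not part of the original notes: it reuses the exam-attempts distribution from Example 5 (mean 3) as the population, fixes a random seed so the run is repeatable, and records the running sample mean at a few sample sizes.

```python
import random

rng = random.Random(0)                 # fixed seed, chosen arbitrarily
values = [1, 2, 3, 4]
probs = [0.1, 0.2, 0.3, 0.4]           # population with mean mu = 3

checkpoints = (10, 100, 1_000, 10_000, 100_000)
draws = rng.choices(values, weights=probs, k=max(checkpoints))

running = {}                           # sample size -> sample mean so far
total = 0
for n, x in enumerate(draws, start=1):
    total += x
    if n in checkpoints:
        running[n] = total / n

for n, xbar in running.items():
    print(n, round(xbar, 4))
```

The early means can wander noticeably, while the means at the largest sample sizes should sit very close to 3.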

Thinking about the Law of Large Numbers The gamblers in a casino may win or lose, but the casino will win ___________________________ because the law of large numbers says what the average outcome of many thousands of bets will be. An insurance company deciding how much to charge for life insurance and a fast-food restaurant deciding how many beef patties to prepare also rely on the fact that averaging over __________ individuals produces a _______________ result.


How large is a large number? The law of large numbers doesn’t say how many trials are needed to guarantee a mean outcome close to µ. It all depends on the ______________________ of the random outcomes. The more variable the outcomes, the more trials are needed to ensure that the mean outcome $\bar{x}$ is close to the distribution mean µ. Casinos understand this: the outcomes of games of chance are variable enough to hold the interest of gamblers. Only the casino plays often enough to rely on the law of large numbers. Gamblers get entertainment; the casino has a business.

* Our intuition doesn’t do a good job of distinguishing random behavior from systematic influences. This is also true when we look at data. We need statistical inference to supplement exploratory analysis of data because probability calculations can help verify that what we see in the data is more than a random pattern.

Rules for Means

Rule 1 – If X is a random variable and a and b are fixed numbers, then $\mu_{a+bX} = a + b\mu_X$.

Rule 2 – If X and Y are random variables, then $\mu_{X+Y} = \mu_X + \mu_Y$.

Example 7: Linda Sells Cars and Trucks Linda is a sales associate at a large auto dealership. She motivates herself by using probability estimates of her sales. For a sunny Saturday in April, she estimates her car sales as follows:

Cars sold:     0     1     2     3
Probability:   0.3   0.4   0.2   0.1

Linda’s estimate of her truck and SUV sales is:

Vehicles sold:   0     1     2
Probability:     0.4   0.5   0.1

Take X to be the number of cars Linda sells and Y the number of trucks and SUVs. The means of these random variables are:


Example 7 – continued… At her commission rate of 25% of gross profit on each vehicle she sells, Linda expects to earn $350 for each car sold and $400 for each truck or SUV sold. So her earnings are

Thus, her mean earnings (best estimate of her earnings for the day) are
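Example 7 combines both rules for means: earnings are a linear combination of X and Y, so the mean earnings follow from the two individual means. A minimal sketch, writing earnings as Z = 350X + 400Y (the name Z and the helper `mean` are introduced here for illustration):

```python
cars = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}    # distribution of X
trucks = {0: 0.4, 1: 0.5, 2: 0.1}          # distribution of Y

def mean(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

mu_x = mean(cars)
mu_y = mean(trucks)

# Rules 1 and 2 for means: mu_{350X + 400Y} = 350 mu_X + 400 mu_Y
mu_z = 350 * mu_x + 400 * mu_y

print(round(mu_x, 2), round(mu_y, 2), round(mu_z, 2))  # 1.1 0.7 665.0
```

Note that the rules for means do not require X and Y to be independent.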

Rules for Variances Two random variables X and Y are _____________________ if knowing that any event involving X alone did or did not occur tells us _______________ about the occurrence of any event involving Y alone. Probability models often assume independence when the random variables describe outcomes that appear unrelated to each other. You should _____________ ask in each instance whether the assumption of independence _________________________________________.

Rules for Variances

Rule 1 – If X is a random variable and a and b are fixed numbers, then $\sigma_{a+bX}^2 = b^2 \sigma_X^2$.

Rule 2 – If X and Y are independent random variables, then

$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$

$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2$

This is the addition rule for variances of independent random variables. As with data, we prefer the _____________________________________ of a random variable to the variance as a measure of ___________________________.

Example 8: Propane Gas Consider the experiment in which a customer of a propane gas company is randomly selected. Suppose that the standard deviation of the random variable

x = number of gallons required to fill a customer’s propane tank

is known to be 42 gallons. The company is considering two different pricing models:

Model 1: $3 per gal
Model 2: service charge of $50 + $2.80 per gal


Example 8 – continued… The company is interested in the variable y = amount billed. For each of the two models, y can be expressed as a function of the random variable x:

Model 1: $y_{\text{model 1}} = 3x$

Model 2: $y_{\text{model 2}} = 50 + 2.8x$

Find the standard deviation of the billing amount variable.

Example 9: Luggage Weights A commuter airline flies small planes between San Luis Obispo and San Francisco. For small planes, the baggage weight is a concern, especially on foggy mornings, because the weight of the plane has an effect on how quickly the plane can ascend. Suppose that it is known that the variable x = weight of baggage checked by a randomly selected passenger has mean and standard deviation of 42 and 16, respectively. Consider a flight on which 10 passengers, all traveling alone, are flying.

If we use $x_i$ to denote the baggage weight for passenger i (for i ranging from 1 to 10), the total weight of checked baggage, y, is then

$y = x_1 + x_2 + \dots + x_{10}$

Note that y is a linear combination of the $x_i$. Find the mean and standard deviation of y.
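Examples 8 and 9 both reduce to the two variance rules. The sketch below assumes, for Example 9, that the ten baggage weights are independent (plausible since the passengers travel alone, but still an assumption); variable names are illustrative.

```python
from math import sqrt

# Example 8: y = a + b x, so sigma_y = |b| * sigma_x (the constant a drops out).
sd_x = 42
sd_model_1 = abs(3) * sd_x        # y = 3x
sd_model_2 = abs(2.8) * sd_x      # y = 50 + 2.8x

# Example 9: y = x_1 + ... + x_10 with each x_i having mean 42 and SD 16.
n, mu_bag, sd_bag = 10, 42, 16
mu_y = n * mu_bag                 # means always add
sd_y = sqrt(n * sd_bag ** 2)      # variances add ONLY under independence

print(sd_model_1, round(sd_model_2, 1))  # 126 117.6
print(mu_y, round(sd_y, 2))
```

The standard deviation of the total is about 50.6, not 10 × 16 = 160: it is the variances, not the standard deviations, that add.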


Combining Normal Random Variables So far, we have concentrated on finding rules for means and variances of random variables. If a random variable is Normally distributed, we can use its mean and variance to compute probabilities. What if we combine two Normal random variables? Any linear combination of independent Normal random variables is also Normally distributed. Example 10: A Round of Golf Tom and George are playing in the club golf tournament. Their scores vary as they play the course repeatedly. Tom’s score X has the N(110, 10) distribution, and George’s score Y varies from round to round according to the N(100, 8) distribution. If they play independently, what is the probability that Tom will score lower than George and thus do better in the tournament?
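For Example 10, one approach is to look at the difference D = X − Y, which is Normal because X and Y are independent Normal variables. The sketch below reads N(110, 10) and N(100, 8) as (mean, standard deviation), as in the notes; the name D is introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_d = 110 - 100                  # mean of D = X - Y (Tom minus George)
sd_d = sqrt(10 ** 2 + 8 ** 2)     # variances add for independent X and Y

D = NormalDist(mu=mu_d, sigma=sd_d)
p_tom_lower = D.cdf(0)            # P(X < Y) = P(X - Y < 0)

print(round(sd_d, 3))
print(round(p_tom_lower, 4))
```

Under these assumptions Tom scores lower than George in roughly 22% of rounds, even though his mean score is 10 strokes higher.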


***SECTION 8.1*** The Binomial Distributions

In practice, we frequently encounter random phenomena where there are two outcomes of interest. For example,

- Tossing a coin in football to see who will kick or receive.
- Shooting a free throw in basketball.
- Having a baby.
- Production of parts in an assembly line.

In this chapter we will explore two important classes of distributions – the ____________________ distributions and the _______________________ distributions – and learn some of their properties.

The Binomial Setting

1) Each observation falls into _______ of just _______ categories, which for convenience we call “_________________” or “___________________.”
2) There is a ____________ number n of observations.
3) The n observations are all ___________________________. That is, knowing the result of one observation tells you _________________ about the other observations.
4) The probability of success, call it p, is the ____________ for each observation.

* If you are presented with a random phenomenon, it is important to be able to _________________ it as a binomial setting, a geometric setting (covered in the next section), or neither.

If data are produced in a binomial setting, then the random variable X = number of successes is called a _________________ random variable, and the probability distribution of X is called a ____________________________________________________.

Binomial Distribution

The distribution of the count X of _____________________ in the binomial setting is the binomial distribution with parameters n and p. The parameter n is the _____________________ of __________________________, and p is the ________________________ of _________________ on any ________ observation. We say that X is B(n, p).

* The most important skill for using binomial distributions is the ability to recognize situations to which they do and don’t apply.

Example 1: Blood Types Blood type is inherited. If both parents carry genes for the O and A blood types, each child has probability 0.25 of getting two O genes and so of having blood type O. Different children inherit independently of each other. Identify the distribution of X = number of O blood types among 5 children of these parents.


Example 2: Dealing Cards Deal 10 cards from a shuffled deck and count the number X of red cards. There are 10 observations, and each gives either a red or a black card. A “success” is a red card. Identify the distribution of X = number of red cards among the 10 drawn.

Binomial Distributions in Statistical Sampling The binomial distributions are important in statistics when we wish to make ___________________ about the proportion p of “successes” in a population.

Sampling Distribution of a Count Choose an ________ of size n from a population with proportion p of successes. When the population is much _______________ than the sample, the count X of successes in the sample has ___________________________ the ____________________ distribution with parameters n and p. Example 3: Aircraft Engine Reliability Engineers define reliability as the probability that an item will perform its function under specific conditions for a specific period of time. If an aircraft engine turbine has probability 0.999 of performing properly for an hour of flight, identify the distribution of X = number of turbines in a fleet of 350 engines that fly for an hour without failure.

Binomial Formulas We can find a formula for the probability that a binomial random variable takes any value by adding probabilities for the different ways of getting exactly that many successes in n observations.


Binomial Coefficient

The number of ways of arranging k successes among n observations is given by the binomial coefficient

$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

for k = 0, 1, 2, …, n.

Binomial coefficients (read as “binomial coefficient n choose k”) have many uses in mathematics, but we are interested in them only as an aid to finding binomial _________________________.

The binomial coefficient $\binom{n}{k}$ counts the number of ways in which k ___________________ can be ________________________ among n __________________________.

The binomial probability P(X = k) is this count multiplied by the probability of any specific arrangement of the k successes.

Binomial Probability

If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, …, n. If k is any one of these values,

$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$

Example 4: Defective switches The number X of switches that fail inspection has approximately the binomial distribution with n = 10 and p = 0.1. Find the probability that no more than 1 switch fails.
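The binomial probability formula translates almost directly into code. The sketch below applies it to Example 4; `binom_pmf` is a hypothetical helper name, and `math.comb` supplies the binomial coefficient.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): (n choose k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.1
# "No more than 1 switch fails" means X = 0 or X = 1.
p_at_most_1 = binom_pmf(0, n, p) + binom_pmf(1, n, p)

print(round(binom_pmf(0, n, p), 4))  # 0.3487
print(round(binom_pmf(1, n, p), 4))  # 0.3874
print(round(p_at_most_1, 4))         # 0.7361
```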


Finding Binomial Probabilities In practice, you will _____________ have to use the formula for calculating the probability that a binomial random variable takes any of its values. We will use a calculator or other statistical software to calculate binomial probabilities. Example 5: Inspecting switches A quality engineer selects an SRS of 10 switches from a large shipment for detailed inspection. Unknown to the engineer, 10% of the switches in the shipment fail to meet the specifications. What is the probability that no more than 1 of the 10 switches in the sample fail inspection?

pdf Given a discrete random variable X, the _______________________________________________ ________________________ (pdf) assigns a probability to each value of X. The probabilities must satisfy the rules for probabilities given in Chapter 6. Example 6: Multiple-Choice Quiz Suppose you were taking a 10-item multiple-choice quiz with choices: A, B, C, D, and E. Identify the random variable of interest, and find the probability that the number of correct guesses is 4.


In applications we frequently want to find the probability that a random variable takes a __________ of ______________.

cdf

Given a random variable X, the ______________________________________________________ _____________________ (cdf) of X calculates the sum of the probabilities for 0, 1, 2, …, up to the value X. That is, it calculates the probability of obtaining at most X successes in n trials. The cdf is also particularly useful for calculating the probability that the number of successes is more than a certain number. This calculation uses the complement rule:

$P(X > k) = 1 - P(X \le k), \quad k = 0, 1, 2, 3, 4, \dots$

Example 7: Multiple-Choice Quiz Suppose you were taking a 10-item multiple-choice quiz with choices: A, B, C, D, and E. a) Find the probability that the number of correct guesses is at most 4. b) Find the probability that the number of correct guesses is more than 6.
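The quiz questions in Examples 6 and 7 can be sketched with a pdf and a cdf helper. This assumes every answer is a pure guess among the five choices, so X = number of correct guesses is B(10, 0.2); the helper names loosely mirror the calculator's pdf and cdf commands but are otherwise made up.

```python
from math import comb

def binom_pmf(k, n, p):
    """pdf: P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """cdf: P(X <= k), the sum of the pdf over 0, 1, ..., k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 1 / 5                           # 10 items, 5 choices, pure guessing (assumed)

p_exactly_4 = binom_pmf(4, n, p)           # Example 6
p_at_most_4 = binom_cdf(4, n, p)           # Example 7 a)
p_more_than_6 = 1 - binom_cdf(6, n, p)     # Example 7 b), complement rule

print(round(p_exactly_4, 4))    # 0.0881
print(round(p_at_most_4, 4))    # 0.9672
print(round(p_more_than_6, 6))  # 0.000864
```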

Binomial Mean and Standard Deviation

The binomial distribution is a special case of a probability distribution for a ______________ random variable. Hence, it is possible to find the mean and standard deviation of a binomial in the same way as we did for a discrete random variable – but we really don’t want to! It is just too much work (e.g., if n = 25, there are 26 values and 26 probabilities). We will have formulas for the __________ and __________________________________________ for BINOMIAL DISTRIBUTIONS ONLY!!!

Mean and Standard Deviation of a Binomial Random Variable

If a count X has the binomial distribution with number of observations n and probability of success p, the mean and standard deviation of X are:

$\mu = np$

$\sigma = \sqrt{np(1-p)}$


Example 8: Bad Switches Continuing Example 5, the count X of bad switches is binomial with n = 10 and p = 0.1. This is the sampling distribution the engineer would see if she drew all possible SRSs of 10 switches from the shipment and recorded the value of X for each sample. Find the mean and standard deviation.
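For Example 8, the shortcut formulas can be computed directly and, as a sanity check, compared with the general discrete definitions of mean and variance. This is a sketch with illustrative variable names.

```python
from math import comb, sqrt

n, p = 10, 0.1

# Shortcut formulas, valid for binomial distributions only.
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Cross-check with the general definitions for a discrete random variable.
pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
mu_def = sum(k * q for k, q in enumerate(pmf))
sigma_def = sqrt(sum((k - mu_def) ** 2 * q for k, q in enumerate(pmf)))

print(round(mu, 4), round(sigma, 4))          # 1.0 0.9487
print(round(mu_def, 4), round(sigma_def, 4))  # 1.0 0.9487
```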

The Normal Approximation to Binomial Distributions The formula for binomial probabilities becomes awkward as the number of trials n increases. As the number of trials n gets larger, the binomial distribution gets close to a Normal distribution. When n is large, we can use Normal probability calculations to approximate hard-to-calculate binomial probabilities. Look at the following probability histograms for the binomial distribution:


In the following figure, the Normal curve is overlaid on the probability histogram of 1000 counts of a binomial distribution with n = 2500 and p = 0.6. As the figure shows, this Normal distribution approximates the binomial distribution quite well.

Normal Approximation for Binomial Distributions Suppose that a count X has the binomial distribution with n trials and success probability p. When n is large, the distribution of X is approximately Normal, $N\left(np, \sqrt{np(1-p)}\right)$.

As a rule of thumb, we will use the Normal approximation when n and p satisfy $np \ge 10$ and $n(1-p) \ge 10$.

Example 9: Attitudes toward shopping Are attitudes toward shopping changing? Sample surveys show that fewer people enjoy shopping than in the past. A survey asked a nationwide random sample of 2500 adults if they agreed or disagreed that “I like buying new clothes, but shopping is often frustrating and time-consuming.” The population that the poll wants to draw conclusions about is all U.S. residents aged 18 and over. Suppose that in fact 60% of all adult U.S. residents would say “Agree” if asked the same question. Approximate the probability that 1520 or more of the sample agree.

Note: Now you can simulate a binomial event by using randBin(digit, p, n)
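The approximation in Example 9 can be sketched in a few lines. The code below checks the rule of thumb (np and n(1 - p) both at least 10), then treats X as Normal with mean np and standard deviation equal to the square root of np(1 - p). It applies no continuity correction, which is an assumption about the intended method, and the helper name normal_upper_tail is illustrative.

```python
from math import erfc, sqrt

def normal_upper_tail(x, mean, sd):
    """P(Y >= x) for a Normal variable Y with the given mean and standard deviation."""
    z = (x - mean) / sd
    return 0.5 * erfc(z / sqrt(2))

n, p = 2500, 0.6  # sample size and assumed population proportion from Example 9

# Rule of thumb: both expected counts should be at least 10.
rule_ok = n * p >= 10 and n * (1 - p) >= 10

mean = n * p
sd = sqrt(n * p * (1 - p))
approx = normal_upper_tail(1520, mean, sd)

print(f"rule of thumb satisfied: {rule_ok}")
print(f"mean = {mean:.0f}, sd = {sd:.2f}, P(X >= 1520) is roughly {approx:.4f}")
```

The same z-score, (1520 - 1500) / 24.49, can be looked up in a standard Normal table instead of calling erfc.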


***SECTION 8.2*** The Geometric Distributions If the goal of an experiment is to obtain _________ success, a random variable X can be defined that _____________ the number of trials __________________ to obtain that __________ success. A random variable that satisfies the above description is called ______________________, and the distribution produced by this random variable is called a geometric distribution. The possible values of a geometric random variable are 1, 2, 3, …, that is, an __________________ set, because it is ________________________ possible to proceed indefinitely _______________ ever obtaining a success. Some examples are,

- Flip a coin until you get a head.
- Roll a die until you get a 3.
- In basketball, attempt a three-point shot until you make a basket.

The Geometric Setting
1) Each observation falls into _______ of just _______ categories, which for convenience we call “_________________” or “___________________.”
2) The observations are all __________________________.
3) The probability of success, call it p, is the __________ for each observation.
4) The ___________________ of interest is the __________________ of trials required to obtain the ___________ success.

Example 10: Roll a die A game consists of rolling a single die. The event of interest is rolling a 3; this event is called a success. The random variable is defined as X = the number of trials until a 3 occurs. Is this a geometric setting? Why or why not?

Example 11: Draw an ace Suppose you repeatedly draw cards without replacement from a deck of 52 cards until you draw an ace. There are two categories of interest: ace = success; not ace = failure. Is this a geometric setting? Why or why not?
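One way to examine Example 11 is to compute the chance that the next card is an ace after each failed draw. The sketch below assumes a standard 52-card deck with 4 aces and that every earlier draw was a non-ace. The function name p_ace_next is hypothetical.

```python
from fractions import Fraction

def p_ace_next(draws_so_far, aces_left=4, deck_size=52):
    """P(next card is an ace) given that `draws_so_far` non-aces were already removed."""
    return Fraction(aces_left, deck_size - draws_so_far)

# Probability of success on each of the first four draws, given no ace so far.
for k in range(4):
    print(f"draw {k + 1}: P(ace | no ace yet) = {p_ace_next(k)}")
```

Comparing these values against the requirement that the probability of success be the same for each observation helps answer the question. For the die in Example 10, the corresponding probability stays at 1/6 on every roll.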


In general, if p is the probability of success, then ____________ is the probability of failure. So, thinking of a probability for a geometric random variable, we would have:

failure ⋅ failure ⋅ failure ⋅ ... ⋅ failure ⋅ success

This leads us to the rule for calculating geometric probabilities.

Rule for Calculating Geometric Probabilities

If X has a geometric distribution with probability p of success and (1 − p) of failure on each observation, the possible values of X are 1, 2, 3, …. If n is any one of these values, the probability that the first success occurs on the nth trial is

$P(X = n) = (1-p)^{n-1} \cdot p$.

Example 12: Roll a die The rule for calculating geometric probabilities can be used to construct a probability distribution table for X = number of rolls of a die until a 3 occurs:
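The table for Example 12 can be generated from the geometric rule, where the probability that the first success occurs on trial n is (1 - p) raised to the power n - 1, times p. The sketch below assumes a fair six-sided die, so p = 1/6, and lists only the first seven values, since the full table never ends. The function name geom_pmf is illustrative.

```python
def geom_pmf(n, p):
    """P(X = n): n - 1 failures followed by the first success on trial n."""
    return (1 - p) ** (n - 1) * p

p = 1 / 6  # probability of rolling a 3 with a fair die (assumed fair)

# First seven rows of the (infinite) probability distribution table.
table = [(n, geom_pmf(n, p)) for n in range(1, 8)]

print("n    P(X = n)")
for n, prob in table:
    print(f"{n:<4} {prob:.4f}")
```

Each probability is 5/6 of the one before it, so the table decreases steadily and the probabilities add up to 1 only across all values 1, 2, 3, ….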

The Expected Value and Other Properties of the Geometric Random Variable

If you were flipping a fair coin, how many times would you expect to have to flip the coin in order to observe the first head? If you were rolling a die, how many times would you expect to have to roll the die in order to observe the first 3? We have formulas that will assist us in determining this.


The Mean and Standard Deviation of a Geometric Random Variable If X is a geometric random variable with probability of success p on each trial, then the ___________, or ______________________________, of the random variable, that is, the expected number of trials required to get the first success, is $\mu = \frac{1}{p}$.

The ____________________ of X is $\frac{1-p}{p^2}$.

To find the standard deviation, simply take the square root.

Example 13: Arcade game Glenn likes the game at the state fair where you toss a coin into a saucer. You win if the coin comes to rest in the saucer without sliding off. Glenn has played this game many times and has determined that on average he wins 1 out of every 12 times he plays. He believes that his chances of winning are the same for each toss. He has no reason to think that his tosses are not independent. Let X be the number of tosses until a win. Does this describe a geometric setting? Why or why not? Then find the mean and standard deviation.
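The numerical part of Example 13 follows from the geometric formulas: the mean is 1/p, and the standard deviation is the square root of (1 - p)/p². This sketch assumes Glenn's long-run record gives p = 1/12 on every toss.

```python
from math import sqrt

p = 1 / 12  # assumption: Glenn wins about 1 toss in 12, the same on every toss

mean = 1 / p               # expected number of tosses until the first win
variance = (1 - p) / p**2  # (1 - p) / p^2
sd = sqrt(variance)        # the standard deviation is the square root

print(f"mean = {mean:.1f} tosses, standard deviation = {sd:.2f} tosses")
```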

P(X > n) The probability that it takes more than n trials to see the first success is

$P(X > n) = (1-p)^n$

Example 14: Applying the formula
a) Roll a die until a 3 is observed. The probability that it takes more than 6 rolls to observe a 3 is:
b) What is the probability that it takes Glenn (from Ex. 13) more than 12 tosses to win? More than 24 tosses?
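Both parts of Example 14 apply the same formula: the probability that the first success takes more than n trials is (1 - p) raised to the power n. The sketch below assumes a fair die (p = 1/6) for part a) and Glenn's p = 1/12 from Example 13 for part b). The function name geom_more_than is illustrative.

```python
def geom_more_than(n, p):
    """P(X > n): the first n trials are all failures."""
    return (1 - p) ** n

die_more_than_6 = geom_more_than(6, 1 / 6)       # part a)
glenn_more_than_12 = geom_more_than(12, 1 / 12)  # part b), more than 12 tosses
glenn_more_than_24 = geom_more_than(24, 1 / 12)  # part b), more than 24 tosses

print(f"a) P(X > 6)  = {die_more_than_6:.4f}")
print(f"b) P(X > 12) = {glenn_more_than_12:.4f}")
print(f"   P(X > 24) = {glenn_more_than_24:.4f}")
```

Even though Glenn expects a win about once every 12 tosses, the chance of going more than 12 tosses without one is still above one third.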