STAT 203 – Lecture 4-1.
- The normal distribution is symmetric.
- Getting the probability from between two z-scores
- Translating standard scores to and from raw scores.
- Extreme values beyond the table.
So Majestic!
Text from last Friday:
Say a value X followed the normal distribution, with mean μ
(mu, pronounced ‘mew’) and standard deviation σ (sigma).
We used the z-table to find things like the probability that X is
greater than 1.28 standard deviations above the mean.
In other words, we found Pr( X > μ + 1.28σ)
μ + 1.28σ means a z-score of 1.28.
From the z-table, page 515
z       Area between Mean and z    Area beyond z
…       …                          …
1.27    39.80                      10.20
1.28    39.97                      10.03
1.29    40.15                      9.85
…       …                          …
Since we’re looking at the values farther away from the mean
than the cutoff, we want the area beyond z.
Pr( X > μ + 1.28σ) = 10.03%, or about 10%
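Can we find Pr( X > μ - 1.28σ)? Hint: Think symmetry.
Symmetry: The same on both sides.
The area below μ - 1.28σ matches the area above μ + 1.28σ, so
Pr( X < μ - 1.28σ) = 10.03%, and
Pr( X > μ - 1.28σ) = 100% - 10.03% = 89.97%, or about 90%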
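For anyone who wants to check table lookups on a computer, here is a minimal sketch, assuming Python 3.8 or later, where the standard library provides `statistics.NormalDist`. The helper name `area_beyond` is made up for this illustration; it is not from the textbook.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def area_beyond(z):
    """Upper-tail area Pr(Z > z). For z >= 0 this is the table's 'area beyond z'."""
    return 1.0 - Z.cdf(z)

upper = area_beyond(1.28)         # Pr(Z > 1.28)
lower = Z.cdf(-1.28)              # Pr(Z < -1.28), the mirror-image tail
above_minus = area_beyond(-1.28)  # Pr(Z > -1.28), everything except the lower tail

print(f"Pr(Z > 1.28)  = {upper:.4f}")
print(f"Pr(Z < -1.28) = {lower:.4f}")
print(f"Pr(Z > -1.28) = {above_minus:.4f}")
```

The first two values agree with each other (symmetry), and the third is 1 minus the tail area.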
What is the chance that this value, X, is more than 1 standard
deviation away from the mean in either direction?
Start with Pr( X > μ + 1σ) ,
or, because it’s simpler to write:
Pr( Z > 1)
By the table (page 514)…

z       Area between Mean and z    Area beyond z
…       …                          …
.99     33.89                      16.11
1.00    34.13                      15.87
1.01    34.38                      15.62
Pr( Z > 1) = .1587, or about .16,
so by symmetry Pr( Z < -1) = .16 also
Pr( Z > 1) + Pr(Z < -1) = .32
Not surprising, since Pr( -1 < Z < 1) = .68 and .68 + .32 = 1.00
We could have done this the other way too:
Working backwards from Pr( -1 < Z < 1) = .68
By converse, we get Pr(Z < -1) + Pr(Z > 1) = 1 - .68 = .32
… and by symmetry, Pr(Z > 1) = .32 / 2 = .16
One other thing to note is that Z = 0 right at the mean, because
the mean is 0 standard deviations above or below the mean.
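The "more than 1 standard deviation in either direction" calculation can be sketched in code as well. This is only an illustration, assuming Python 3.8+ and its `statistics.NormalDist` class; the function name `two_tailed` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # defaults to mean 0, standard deviation 1

def two_tailed(z):
    """Pr(Z < -z) + Pr(Z > z) for z >= 0, using symmetry to double one tail."""
    upper_tail = 1.0 - Z.cdf(z)
    return 2.0 * upper_tail

outside = two_tailed(1.0)          # more than 1 sd from the mean, either side
inside = Z.cdf(1.0) - Z.cdf(-1.0)  # within 1 sd of the mean
at_mean = Z.cdf(0.0)               # half the area lies below z = 0

print(f"outside = {outside:.4f}, inside = {inside:.4f}, total = {outside + inside:.4f}")
```

The two pieces add to 1, which is the converse relationship used above.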
Let’s try with some uglier z-scores.
Pr( -1.75 < Z < 0.52)
z       Area between Mean and z    Area beyond z
…       …                          …
0.51    19.50                      30.50
0.52    19.85                      30.15
…       …                          …
1.74    45.91                      4.09
1.75    45.99                      4.01
Doing the math…
Pr( -1.75 < Z < 0.52) can be split into two ranges using the
mean as the split point.
Pr( -1.75 < Z < 0 ) + Pr( 0 < Z < 0.52)
Why would we do this? Because the table measures every area
starting from the mean.
Pr(-1.75 < Z < 0) = .4599 (by symmetry, the same as Pr(0 < Z < 1.75))
Pr(0 < Z < 0.52) = .1985
.4599 + .1985 = .6584, about 66% of the area.
Pic of the 66%
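The split at the mean can be mirrored in a short sketch. It assumes Python 3.8+ with `statistics.NormalDist`; the variable names are illustrative only. A computer can also take the difference of two cumulative areas directly, which is shown for comparison.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

lo, hi = -1.75, 0.52

# Table-style approach: split the range at the mean (z = 0).
left_piece = Z.cdf(0.0) - Z.cdf(lo)    # Pr(-1.75 < Z < 0)
right_piece = Z.cdf(hi) - Z.cdf(0.0)   # Pr(0 < Z < 0.52)
split_total = left_piece + right_piece

# Direct approach: area below hi minus area below lo.
direct = Z.cdf(hi) - Z.cdf(lo)

print(f"left = {left_piece:.4f}, right = {right_piece:.4f}")
print(f"split total = {split_total:.4f}, direct = {direct:.4f}")
```

Both routes give the same area, about 66%.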
Z-scores, or standard scores, are a bridge between real
data and probabilities surrounding them.
We find z-scores with this (important!):

    z = (x - μ) / σ

Leave some space here for notes; we’re looking at z in detail.
Example problem:
The time spent on homework in hours/week for
full time students is normally distributed with
mean 25, and standard deviation = 7
What proportion of students spend more than
20 hours on homework?
Step 1: Identify – μ = 25, σ = 7, x = 20.
We want the proportion, which is like the
probability.
We know the distribution is normal.
These are clues to find the z-score / standard
score, and use it in the z-table to get the
proportion.
Step 2: Apply.
What do we want?! z !!!!
What do we have?! μ = 25, σ = 7, x = 20 !!!!
Use the formula that has z on one side, and μ, σ, and x on
the other:
z = (x - μ) / σ = (20 - 25) / 7 = -5/7 ≈ -0.71
-0.71 isn’t on the table, but by symmetry, we can use 0.71.
By the table, 26.11% is between the mean and z = 0.71,
and 23.89% is beyond z = 0.71.
We want Pr( X > 20), which is Pr(Z > -0.71)…
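Method 1: Split
Pr( Z > -0.71) = Pr(-0.71 < Z < 0) + Pr( Z > 0) = .2611 + .5000 = .7611
Method 2: Converse
Pr( Z > -0.71) = 1 – Pr(Z < -0.71) = 1 – .2389 = .7611
Either way, about 76% of students spend more than 20 hours on homework.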
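The whole X to Z to probability route for this example can be sketched as follows, assuming Python 3.8+ and `statistics.NormalDist`. The helper `z_score` and the constant names are made up for the illustration. The sketch computes the answer twice: once with z rounded to two decimals, as the table forces us to do, and once without rounding.

```python
from statistics import NormalDist

# Homework hours per week, as given in the example: mean 25, sd 7.
MU, SIGMA = 25.0, 7.0
Z = NormalDist()  # standard normal

def z_score(x, mu, sigma):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

z = z_score(20.0, MU, SIGMA)         # -5/7, about -0.71
p_table = 1.0 - Z.cdf(round(z, 2))   # z rounded to two places, like the table
p_exact = 1.0 - Z.cdf(z)             # no rounding of z

print(f"z = {z:.4f}")
print(f"Pr(X > 20) with z = -0.71: {p_table:.4f}")
print(f"Pr(X > 20) with unrounded z: {p_exact:.4f}")
```

The two answers differ slightly in the third decimal place. That gap comes only from rounding z, not from a mistake in either method.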
We can work backwards from a probability to get a value too,
with this (also important):

    x = μ + zσ

This is the same formula as the z-score (standard score)
formula, but rearranged so that X is the value we get out of it.
Example problem:
Homework/week is normally distributed, μ = 25, σ = 7
What’s the minimum homework I can expect 90% of the class
to do?
In other words Pr(X > ??? ) = .9000
Step 1: Identify.
We have the proportion, and we want the value x.
Again, z-score is going to be our bridge.
Going X → Z → Prob, we used the table last.
Going Prob → Z → X, we’ll use the table first.
We want the z value such that 10% of the area is beyond it.
As z increases, the area beyond that value decreases.
Z       % Area Beyond
0.00    50.00
0.01    49.60
0.02    49.20
0.03    48.80
0.04    48.40
0.05    48.01
…       …
We can use that to find the Z-score with 10% beyond.
(Approximation may be needed)
Z       % Area Beyond
0.00    50.00
0.01    49.60
0.02    49.20
…       …
0.44    33.00
0.45    32.64
0.46    32.28
…       …
1.27    10.20
1.28    10.03
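Now we know Pr( Z > 1.28) = 10.03%; that’s the closest table entry
to 10%.
We want 90% of the area above our cutoff, so the cutoff sits below
the mean: by symmetry, Pr( Z > -1.28) = 89.97%, so we use z = -1.28.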
What do we want?! x !!!!
What do we have?! μ = 25, σ = 7, z = -1.28 !!!!
x = μ + zσ = 25 + (-1.28)(7) = 25 - 8.96 = 16.04
So 90% of the full-time students spend 16.04 hours or more on
homework.
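Going the other way, from probability to Z to X, can be sketched like this. It assumes Python 3.8+ and `statistics.NormalDist`, whose `inv_cdf` method plays the role of reading the table backwards. The helper `raw_score` is a hypothetical name for the rearranged formula.

```python
from statistics import NormalDist

MU, SIGMA = 25.0, 7.0   # homework hours: mean and standard deviation
Z = NormalDist()        # standard normal

def raw_score(z, mu, sigma):
    """Rearranged z-score formula: turn a standard score back into a raw score."""
    return mu + z * sigma

# Table route: closest table entry to 10% beyond is z = 1.28, flipped to -1.28.
x_table = raw_score(-1.28, MU, SIGMA)

# Software route: the z with 10% of the area below it (so 90% above it).
z_exact = Z.inv_cdf(0.10)
x_exact = raw_score(z_exact, MU, SIGMA)

# Check: the share of students above x_exact should be 90%.
share_above = 1.0 - NormalDist(MU, SIGMA).cdf(x_exact)

print(f"table route: x = {x_table:.2f}")
print(f"software route: z = {z_exact:.4f}, x = {x_exact:.2f}")
print(f"share above x: {share_above:.4f}")
```

The table answer and the software answer differ by about a hundredth of an hour, because the table only lists z to two decimal places.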
What proportion of students spend more than 60 hours/week?
μ = 25, σ = 7, x = 60.
z = (x - μ) / σ = (60 - 25) / 7 = 5
Now we have z = 5, how do we get Pr(Z > 5)?
The table only goes to z = 3.5ish.
Use inference: We want the area beyond z=5, and the area
shrinks as z goes up.
The smallest area in the table is 0.01%, so the area beyond z=5 must
be smaller than that. That’s all we can tell from this table.
Fewer than 0.01% of students spend 60 hours/week on
homework.
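The table runs out, but software does not. As a rough check on the "smaller than 0.01%" bound, here is a minimal sketch assuming Python 3.8+ and `statistics.NormalDist`. It uses the lower tail at -5, which by symmetry equals the upper tail at +5.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

z = (60.0 - 25.0) / 7.0   # z-score for 60 hours/week
tail = Z.cdf(-z)          # Pr(Z < -5), equal to Pr(Z > 5) by symmetry

print(f"z = {z}")
print(f"Pr(Z > 5) = {tail:.3e}, or {tail * 100:.7f}%")
print("below the table's smallest area of 0.01%:", tail < 0.0001)
```

The result is consistent with the inference from the table: the area is positive but far below 0.01%.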
(for interest)
Very few data points are going to be more than six standard
deviations above or below the mean: far less than 0.01%.
Six Sigma is a business practice based on making each part in a
machine consistent enough that it will work as long as it’s
within six standard deviations, or 6σ of the mean.
Next time:
- A few more notes on Z-scores
- Post Mortem of Assignment
- Discuss Midterm
- We start chapter 6.