STAT 203 Lecture 4-1. The normal distribution is symmetric ...jackd/Stat203_2011/Wk04_1.pdf · Now we have z = 5, how do we get Pr(Z > 5)? The table only goes to z = 3.5ish. Use inference:

STAT 203 – Lecture 4-1.

- The normal distribution is symmetric.

- Getting the probability from between two z-scores

- Translating standard scores to and from raw scores.

- Extreme values beyond the table.

So Majestic!

Text from last Friday:

Say a value X followed the normal distribution, with mean μ

(mu, pronounced ‘mew’) and standard deviation σ (sigma).

We used the z-table to find things like the probability that X is

greater than 1.28 standard deviations above the mean.

In other words, we found Pr( X > μ + 1.28σ)

μ + 1.28σ means a z-score of 1.28.

From the z-table, page 515

z       Area between Mean and z (%)   Area beyond z (%)
…       …                             …
1.27    39.80                         10.20
1.28    39.97                         10.03
1.29    40.15                         9.85
…       …                             …

Since we’re looking at the values farther away from the mean

than the cutoff, we want the area beyond z.

Pr( X > μ + 1.28σ) = 10.03%, or about 10%
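
The table lookup above can be double-checked with a few lines of Python. This is only a sketch: it uses `NormalDist` from the standard library in place of the printed table, and the helper names `area_between_mean_and` and `area_beyond` are made up here to mirror the two table columns.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

def area_between_mean_and(z):
    """Percent of the area between the mean (z = 0) and a positive z."""
    return 100 * (Z.cdf(z) - 0.5)

def area_beyond(z):
    """Percent of the area farther from the mean than a positive z."""
    return 100 * (1 - Z.cdf(z))

# Print the same three rows that were read off the z-table.
for z in (1.27, 1.28, 1.29):
    print(f"{z:.2f}  {area_between_mean_and(z):.2f}  {area_beyond(z):.2f}")
```

The printed rows should agree with the table excerpt up to rounding, with about 10.03% of the area beyond z = 1.28.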

Can we find Pr( X > μ - 1.28σ) ? Hint: Think symmetry.

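We can find Pr( X > μ - 1.28σ)

Symmetry: The same on both sides.

The area below μ - 1.28σ equals the area above μ + 1.28σ, so

Pr( X > μ - 1.28σ) = 1 – .1003 = .8997, or about 90%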
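
As a rough numerical check of the symmetry argument (a sketch using the standard library's `NormalDist` rather than the printed table; the variable names are illustrative):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

z = 1.28
upper_tail = 1 - Z.cdf(z)          # Pr(Z > 1.28), the area beyond z
lower_tail = Z.cdf(-z)             # Pr(Z < -1.28), the mirror-image area
above_lower_cut = 1 - lower_tail   # Pr(Z > -1.28), everything except the lower tail

print(f"Pr(Z > 1.28)  = {upper_tail:.4f}")
print(f"Pr(Z < -1.28) = {lower_tail:.4f}")
print(f"Pr(Z > -1.28) = {above_lower_cut:.4f}")
```

The two tails come out equal, so the area above μ - 1.28σ is one minus the tabled area beyond z = 1.28.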

What is the chance that this value, X, is more than 1 standard

deviation away from the mean in either direction?

Start with Pr( X > μ + 1σ) ,

or, because it’s simpler to write:

Pr( Z > 1)

By the table (page 514)…

z       Area between Mean and z (%)   Area beyond z (%)
…       …                             …
0.99    33.89                         16.11
1.00    34.13                         15.87
1.01    34.38                         15.62

Pr( Z > 1) = .16,

so Pr( Z < -1) = .16 also

Pr( Z > 1) + Pr(Z < -1) = .32

Not surprising, since Pr( -1 < Z < 1) = .68 and .68 + .32 = 1.00

We could have done this the other way too:

Working backwards from Pr( -1 < Z < 1) = .68

By the converse, we get Pr(Z < -1) + Pr(Z > 1) = 1 – .68 = .32

… and by symmetry, Pr(Z > 1) = .32 / 2 = .16
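
Both directions of this argument can be checked numerically. The sketch below uses the standard library's `NormalDist` instead of the table, so the values carry more decimals than the rounded .16, .32 and .68 used on the slides.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

upper = 1 - Z.cdf(1)             # Pr(Z > 1)
lower = Z.cdf(-1)                # Pr(Z < -1), equal to upper by symmetry
both_tails = upper + lower       # more than 1 standard deviation from the mean
middle = Z.cdf(1) - Z.cdf(-1)    # Pr(-1 < Z < 1)

print(f"one tail   = {upper:.4f}")
print(f"both tails = {both_tails:.4f}")
print(f"middle     = {middle:.4f}")
print(f"total      = {both_tails + middle:.4f}")
```

The tails and the middle always add up to 1, which is why either one can be recovered from the other.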

One other thing to note is that Z = 0 right at the mean, because

the mean is 0 standard deviations above or below the mean.

Let’s try with some uglier z-scores.

Pr( -1.75 < Z < 0.52)

z       Area between Mean and z (%)   Area beyond z (%)
…       …                             …
0.51    19.50                         30.50
0.52    19.85                         30.15
…       …                             …
1.74    45.91                         4.09
1.75    45.99                         4.01

Doing the math…

Pr( -1.75 < Z < 0.52) can be split into two ranges using the

mean as the split point.

Pr( -1.75 < Z < 0 ) + Pr( 0 < Z < 0.52)

Why would we do this? Because the table has everything from

the mean.

Pr(-1.75 < Z < 0) = .4599

Pr(0 < Z < 0.52) = .1985

.4599 + .1985 = .6584 About 66% of the area.
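
The split-at-the-mean calculation can be sketched in Python as follows. It assumes the standard library's `NormalDist` as a stand-in for the table, and also computes the range directly as a cross-check.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Split the range at the mean (z = 0), as the table requires.
left_part = Z.cdf(0) - Z.cdf(-1.75)    # Pr(-1.75 < Z < 0)
right_part = Z.cdf(0.52) - Z.cdf(0)    # Pr(0 < Z < 0.52)
total = left_part + right_part

# The same range in one step, without splitting.
direct = Z.cdf(0.52) - Z.cdf(-1.75)

print(f"left  = {left_part:.4f}")
print(f"right = {right_part:.4f}")
print(f"total = {total:.4f}  (direct: {direct:.4f})")
```

Software does not need the split, but the split is what makes the range readable from a table that measures everything from the mean.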

[Figure: the 66% area]

Z-scores, or standard scores, are a bridge between real

data and probabilities surrounding them.

We find z-scores with this (important!):

z = (x - μ) / σ

Leave some space here for notes; we’re looking at z in detail.

Example problem:

The time spent on homework in hours/week for

full time students is normally distributed with

mean 25, and standard deviation = 7

What proportion of students spend more than

20 hours on homework?

Step 1: Identify – μ = 25, σ = 7, x = 20.

We want the proportion, which is like the

probability.

We know the distribution is normal.

These are clues to find the z-score / standard

score, and use it in the z-table to get the

proportion.

Step 2: Apply.

What do we want?! z!!!!

What do we have?! μ = 25, σ = 7, x = 20!!!!

Use the formula that has Z on one side, and μ, σ, and x on

the other.

z = (x - μ) / σ = (20 - 25) / 7 = -0.71

-0.71 isn’t on the table, but by symmetry, we can use 0.71.

By the table, 26.11% is between the mean and z=0.71,

and 23.89% is beyond z=0.71.

We want Pr( X > 20), which is Pr(Z > -0.71)…

Method 1: Split

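Pr( Z > -0.71) = Pr( Z > 0) + Pr(-0.71 < Z < 0) = .5000 + .2611 = .7611

Method 2: Converse

Pr( Z > -0.71) = 1 – Pr(Z < -0.71) = 1 – .2389 = .7611

About 76% of students spend more than 20 hours on homework.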
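
Here is a small sketch of the whole homework calculation, from raw score to z-score to proportion. It assumes the standard library's `NormalDist` in place of the z-table, and rounds z to two decimals only to mimic the table lookup.

```python
from statistics import NormalDist

mu, sigma, x = 25, 7, 20

# Standard score, rounded to two decimals as it would be for the table.
z = round((x - mu) / sigma, 2)

Z = NormalDist()  # standard normal

# Method 1 (split): area above the mean plus area between z and the mean.
split = (1 - Z.cdf(0)) + (Z.cdf(0) - Z.cdf(z))

# Method 2 (converse): one minus the area below z.
converse = 1 - Z.cdf(z)

print(f"z = {z}")
print(f"split    = {split:.4f}")
print(f"converse = {converse:.4f}")
```

Both methods describe the same area, so they should print the same proportion, roughly .76.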

We can work backwards from a probability to get a value too,

with this (also important):

x = μ + zσ

This is the same formula as the z-score (standard score)

formula, but rearranged so that X is the value we get out of it.
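
To see that the two formulas undo each other, here is a minimal sketch. The function names `z_score` and `raw_score` are illustrative, and the numbers are the homework values used in the examples.

```python
def z_score(x, mu, sigma):
    """Raw score -> standard score: how many standard deviations x is from the mean."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Standard score -> raw score: the same formula solved for x."""
    return mu + z * sigma

mu, sigma = 25, 7

z = z_score(20, mu, sigma)         # 20 hours is below the mean, so z is negative
x_back = raw_score(z, mu, sigma)   # converting back should recover 20 hours

print(f"z = {z:.2f}")
print(f"x = {x_back:.2f}")
```

Going from x to z and back again returns the original value, which is all that "rearranged" means here.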

Example problem:

Homework/week is normally distributed, μ = 25, σ = 7

What’s the minimum homework I can expect 90% of the class

to do?

In other words Pr(X > ??? ) = .9000

Step 1: Identify.

We have the proportion, and we want the value x.

Again, z-score is going to be our bridge.

Going X → Z → Prob, we used the table last.

Going Prob → Z → X, we’ll use the table first.

We want the Z value such that 10% of the area is beyond that

value.

As z increases, the area beyond that value decreases.

Z       % Area Beyond
0.00    50.00
0.01    49.60
0.02    49.20
0.03    48.80
0.04    48.40
0.05    48.01
…       …

We can use that to find the Z-score with 10% beyond.

(Approximation may be needed)

Z       % Area Beyond
0.00    50.00
0.01    49.60
0.02    49.20
…       …
0.44    33.00
0.45    32.64
0.46    32.28
…       …
1.27    10.20
1.28    10.03

Now we know Pr( Z > 1.28) = 10.03%, that’s the closest z-score

to 10% in the table.
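
The "scan down the table until the area beyond is closest to 10%" step can be sketched as a search. This is an illustration only: the table is rebuilt from the standard library's `NormalDist`, and the 0.00 to 3.50 range in steps of 0.01 is an assumption about how far the printed table goes.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def percent_beyond(z):
    """Percent of the area beyond z, rounded to two decimals like a table entry."""
    return round(100 * (1 - Z.cdf(z)), 2)

target = 10.0  # we want 10% of the area beyond z

# Rebuild the table's z column: 0.00, 0.01, ..., 3.50 (assumed range).
candidates = [i / 100 for i in range(0, 351)]

# Pick the z whose tabled area is closest to the target.
closest_z = min(candidates, key=lambda z: abs(percent_beyond(z) - target))

print(f"closest z = {closest_z:.2f}, area beyond = {percent_beyond(closest_z):.2f}%")
```

Because the area beyond z only shrinks as z grows, there is one best match, and the search stops being useful past the end of the table.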

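What do we want?! x!!!!

What do we have?! μ = 25, σ = 7, z = -1.28!!!!

(z is negative because the cutoff is below the mean: by symmetry, Pr( Z > -1.28) = 1 – .1003 = .8997, or about 90%.)

x = μ + zσ = 25 + (-1.28)(7) = 16.04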

So 90% of the full-time students spend 16.04 hours or more on

homework.
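
For comparison, here is a sketch of the same Prob → Z → X calculation done without the table. It assumes the standard library's `NormalDist.inv_cdf`, which returns the value with a given area below it, so 90% above the cutoff means 10% below it.

```python
from statistics import NormalDist

mu, sigma = 25, 7

# Table-based answer: z = -1.28 from the closest table entry.
x_table = mu + (-1.28) * sigma

# Exact z with 10% of the area below it, then the same x = mu + z*sigma step.
z_exact = NormalDist().inv_cdf(0.10)
x_exact = mu + z_exact * sigma

# Or skip z entirely and ask the homework distribution directly.
homework = NormalDist(mu, sigma)
x_direct = homework.inv_cdf(0.10)

print(f"table z: x = {x_table:.2f}")
print(f"exact z = {z_exact:.4f}: x = {x_exact:.2f}")
print(f"direct:  x = {x_direct:.2f}")
```

The exact answer differs from the table answer only in the second decimal, because the table's closest entry (10.03%) is not exactly 10%.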

What proportion of students spend more than 60 hours/week?

μ = 25, σ = 7, x = 60.

z = (x - μ) / σ = (60 - 25) / 7 = 5

Now we have z = 5. How do we get Pr(Z > 5)?

The table only goes to z = 3.5ish.

Use inference: We want the area beyond z=5, and the area

shrinks as z goes up.

The smallest area in the table is 0.01%, so the area beyond z=5 must be

smaller than that. That’s all we can tell from this table.

Fewer than 0.01% of students spend more than 60 hours/week on

homework.
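
Software is not limited by where the printed table stops. As a sketch, the tail area can be computed from the complementary error function in Python's standard `math` module; the helper name `upper_tail` is made up here.

```python
import math

def upper_tail(z):
    """Pr(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

mu, sigma, x = 25, 7, 60

z = (x - mu) / sigma   # (60 - 25) / 7 = 5
p = upper_tail(z)

print(f"z = {z}")
print(f"Pr(Z > {z}) = {p:.3e}  ({100 * p:.6f}%)")
print("smaller than the table's 0.01%:", p < 0.0001)
```

The result is roughly 3 in 10 million, which agrees with the table-based conclusion that the area is below 0.01%.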

(for interest)

Very few data points are going to be more than six standard

deviations above or below the mean: far fewer than 0.01%.

Six Sigma is a business practice based on making each part in a

machine consistent enough that it will work as long as it’s

within six standard deviations, or 6σ of the mean.

Next time:

- A few more notes on Z-scores

- Post Mortem of Assignment

- Discuss Midterm

- We start chapter 6.
