Stat 31, Section 1, Last Time Distribution of Sample Means Expected Value same Variance less, Law of Averages, I Dist’n Normal, Law of Averages, II Statistical Inference Confidence Intervals


Stat 31, Section 1, Last Time

• Distribution of Sample Means

– Expected Value: same

– Variance: less, Law of Averages, I

– Dist’n: Normal, Law of Averages, II

• Statistical Inference

– Confidence Intervals


Choice of Sample Size

Additional use of margin of error idea

Background: distributions of \(\bar{X}\)

[Figure: distribution of \(\bar{X}\) for small n and for large n; its standard deviation is \(\sigma/\sqrt{n}\)]

Choice of Sample Size

Could choose n to make \(\sigma/\sqrt{n}\) = desired value

But S. D. is not very interpretable, so make “margin of error”, m = desired value

Then get: “\(\bar{X}\) is within m units of \(\mu\), 95% of the time”
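A minimal Python sketch of the margin of error idea (not part of the original slides): it computes m = z·σ/√n for a few sample sizes, assuming σ is known. The function name `margin_of_error` and the numbers used are illustrative assumptions.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Half-width m such that X-bar is within m of mu with the given probability."""
    # For 95% confidence this is the 0.975 quantile of N(0, 1), about 1.96.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / math.sqrt(n)


# Illustrative values only: sigma = 2400 is an assumption for this sketch.
for n in (10, 40, 160):
    print(n, round(margin_of_error(2400, n), 1))
```

Quadrupling n halves m, which is why a small margin of error can require a large sample.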

Choice of Sample Size

Given m, how do we find n?

Solve for n (the equation):

\[
0.95 = P\{|\bar{X} - \mu| \le m\}
     = P\left\{\left|\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right| \le \frac{m}{\sigma/\sqrt{n}}\right\}
     = P\left\{|Z| \le \frac{\sqrt{n}\,m}{\sigma}\right\}
\]

Choice of Sample Size

Graphically, find n so that:

[Figure: standard normal density; Area = 0.95 between \(-\sqrt{n}\,m/\sigma\) and \(\sqrt{n}\,m/\sigma\); Area = 0.975 to the left of \(\sqrt{n}\,m/\sigma\)]

Choice of Sample Size

Thus solve:

\[
\frac{\sqrt{n}\,m}{\sigma} = \mathrm{NORMINV}(0.975, 0, 1)
\]

\[
\sqrt{n} = \frac{\mathrm{NORMINV}(0.975, 0, 1)\,\sigma}{m}
\]

\[
n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)\,\sigma}{m}\right)^2
\]

Choice of Sample Size

EXCEL Implementation:

Class Example 20, Part 3:

https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg20.xls

\[
n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)\,\sigma}{m}\right)^2
\]

HW: 6.19, 6.21
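For readers working outside Excel, here is a rough Python equivalent of the sample size formula (a sketch, not the contents of the class spreadsheet). `NormalDist().inv_cdf(0.975)` plays the role of NORMINV(0.975, 0, 1). The function name `sample_size`, the rounding up to a whole number, and the input values are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def sample_size(m, sigma, confidence=0.95):
    """Smallest whole n with margin of error at most m, for known sigma."""
    # Same quantity as NORMINV(0.975, 0, 1) when confidence = 0.95.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / m)^2, rounded up because n must be an integer.
    return math.ceil((z * sigma / m) ** 2)


# Illustrative inputs only: want X-bar within 500 of mu, with sigma = 2400.
print(sample_size(m=500, sigma=2400))  # 89
```

Rounding up, not to the nearest integer, keeps the achieved margin of error no larger than the requested m.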

Interpretation of Conf. Intervals

2 Equivalent Views:

[pic 1: Distribution of \(\bar{X}\), with 95% of the area between \(\mu - m\) and \(\mu + m\)]

[pic 2: Distribution of \(\bar{X} - \mu\), with 95% of the area between \(-m\) and \(m\)]

Interpretation of Conf. Intervals

Mathematically:

\[
\begin{aligned}
0.95 &= P\{\mu - m \le \bar{X} \le \mu + m\} && \text{(pic 1)} \\
     &= P\{-m \le \bar{X} - \mu \le m\} && \text{(pic 2)} \\
     &= P\{\bar{X} - m \le \mu \le \bar{X} + m\} && \text{(no pic)} \\
     &= P\{\text{the C.I. } [\bar{X} - m,\ \bar{X} + m] \text{ “brackets” } \mu\}
\end{aligned}
\]

Interpretation of Conf. Intervals

Frequentist View: If repeat the experiment many times,

About 95% of the time, the CI will contain \(\mu\)

(and 5% of the time it won’t)
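The frequentist view can be checked by simulation. The sketch below (an illustration, not from the slides) repeats the "experiment" many times with normal data and known σ, and reports the fraction of intervals that contain μ. The function name `coverage`, the seed, and all parameter values are assumptions.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, reps, confidence=0.95, seed=0):
    """Fraction of simulated CIs [xbar - m, xbar + m] that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    m = z * sigma / math.sqrt(n)  # margin of error, sigma assumed known
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - m <= mu <= xbar + m:
            hits += 1
    return hits / reps


# Illustrative parameters only; the result should be close to 0.95.
print(coverage(mu=20_000, sigma=2_400, n=10, reps=10_000))
```

The reported fraction varies a little from run to run (or seed to seed), which is the same natural variation seen when a handful of intervals are drawn by hand.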


Interpretation of Conf. Intervals

A nice Applet, from Ogden and West:

http://www.amstat.org/publications/jse/v6n3/applets/ConfidenceInterval.html

• Try a few at

• “more interval” allows regeneration

• “on average” about 2.5/50 don’t cover

• This is idea of “% coverage”

Interpretation of Conf. Intervals

Revisit Class Example 16

https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg16.xls

Recall Class HW: Estimate % of Male Students at UNC

CI View: Class Example 21

https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg21.xls

Illustrates idea: CI should cover 95% of time

Interpretation of Conf. Intervals

Class Example 21:

https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg21.xls

Q1: SD too small; too many cover

Q2: SD too big; too few cover

Q3: Big Bias; too few cover

Q4: Good sampling; about right

Q5: Simulated Bi; shows “natural var’n”

Interpretation of Conf. Intervals

HW: 6.23, 6.26 (0.857, 0.135, 0.993)

Sec. 6.2 Tests of Significance = Hypothesis Tests

Big Picture View:

Another way of handling random error

I.e. a different viewpoint

Idea: Answer yes or no questions, under uncertainty

(e.g. from sampling or measurement error)

Hypothesis Tests

Some Examples:

• Will Candidate A win the election?

• Does smoking cause cancer?

• Is Brand X better than Brand Y?

• Is a drug effective?

• Is a proposed new business strategy effective? (marketing research focuses on this)

Hypothesis Tests

E.g. A fast food chain currently brings in profits of $20,000 per store, per day. A new menu is proposed. Would it be more profitable?

Test: Have 10 stores (randomly selected!) try the new menu, let \(\bar{X}\) = average of their daily profits.

Fast Food Business Example

Simplest View: for \(\bar{X} > \$20{,}000\): new menu looks better.

Otherwise looks worse.

Problem: New menu might be no better (or even worse), but could have \(\bar{X} > \$20{,}000\) by bad luck of sampling

(only sample of size 10)

Fast Food Business Example

Problem: How to handle & quantify gray area in these decisions.

Note: Can never make a definite conclusion, e.g. as in Mathematics. Statistics is more about real life…

(E.g. even if \(\bar{X} = \$0\) or \(\bar{X} = \$1{,}000{,}000\), that might be bad luck of sampling, although very unlikely)

Hypothesis Testing

Note: Can never make a definite conclusion,

Instead measure strength of evidence.

Approach I: (note: different from text)

Choose among 3 Hypotheses:

H+: Strong evidence new menu is better

H0: Evidence is inconclusive

H-: Strong evidence new menu is worse

Hypothesis Testing

Terminology:

H0 is called the null hypothesis

Setup: H+, H0, H- are in terms of parameters, i.e. population quantities

(recall population vs. sample)

Fast Food Business Example

E.g. Let \(\mu\) = true (over all stores) daily profit from new menu.

H+: \(\mu > \$20{,}000\) (new is better)

H0: \(\mu = \$20{,}000\) (about the same)

H-: \(\mu < \$20{,}000\) (new is worse)

Fast Food Business Example

Base decision on best guess: \(\bar{X}\)

Will quantify strength of the evidence using probability distribution of \(\bar{X}\)

E.g. \(\bar{X} \gg \$20{,}000\): Choose H+

\(\bar{X} \approx \$20{,}000\): Choose H0

\(\bar{X} \ll \$20{,}000\): Choose H-

Fast Food Business Example

How to draw line?

(There are many ways,

here is traditional approach)

Insist that H+ (or H-) show strong evidence

I.e. They get burden of proof

(Note: one way of solving

gray area problem)


Fast Food Business Example

Assess strength of evidence by asking:

“How strange is observed value of \(\bar{X}\),

assuming H0 is true?”

In particular, use tails of H0 distribution as measure of strength of evidence

Fast Food Business Example

Use tails of H0 distribution as measure of strength of evidence:

[Figure: distribution of \(\bar{X}\) under H0, centered at $20k, with the observed value of \(\bar{X}\) marked and the tail area beyond it indicated]

Use this probability to measure strength of evidence

Hypothesis Testing

Define the p-value, for either H+ or H-, as:

P{what was seen, or more conclusive | H0}

Note 1: small p-value means strong evidence against H0, i.e. for H+ (or H-)

Note 2: p-value is also called observed significance level.

Fast Food Business Example

Suppose observe: \(\bar{X} = \$21{,}000\), \(s = \$2{,}400\), based on \(n = 10\)

Note \(\bar{X} > \$20{,}000\), but is this conclusive?

or could this be due to natural sampling variation?

(i.e. do we risk losing money from new menu?)

Fast Food Business Example

Assess evidence for H+ by:

H+ p-value = Area

[Figure: \(\bar{X}\) dist’n \(N(20{,}000,\ 2{,}400/\sqrt{10})\), centered at $20,000; the p-value is the area to the right of $21,000]

Fast Food Business Example

Computation in EXCEL:

Class Example 22, Part 1:

https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg22.xls

P-value = 0.094
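The same number can be reproduced outside Excel. This Python sketch (not the contents of the class spreadsheet) computes the upper tail area of the N(20,000, 2,400/√10) distribution beyond $21,000, treating the observed SD of $2,400 as the known SD, as the normal distribution on the slide does. The function name `upper_tail_p_value` is an assumption.

```python
import math
from statistics import NormalDist


def upper_tail_p_value(xbar, mu0, sd, n):
    """P{X-bar >= observed value | H0}, with X-bar ~ N(mu0, sd / sqrt(n)) under H0."""
    null_dist = NormalDist(mu=mu0, sigma=sd / math.sqrt(n))
    return 1 - null_dist.cdf(xbar)


p = upper_tail_p_value(xbar=21_000, mu0=20_000, sd=2_400, n=10)
print(round(p, 3))  # 0.094
```

A p-value of about 0.094 means that, if the new menu were really no better, a sample mean this large would still turn up in roughly 9 of 100 samples of 10 stores.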