Last Time
• Central Limit Theorem
  – Illustrations
  – How large n?
  – Normal Approximation to Binomial
• Statistical Inference
  – Estimate unknown parameters
  – Unbiasedness (centered correctly)
  – Standard error (measures spread)
Administrative Matters
Midterm II, coming Tuesday, April 6
• Numerical answers:
  – No computers, no calculators
  – Handwrite Excel formulas (e.g. =9+4^2)
  – Don't do arithmetic (i.e. use such formulas)
• Bring with you:
  – One 8.5 x 11 inch sheet of paper
  – With your favorite info (formulas, Excel, etc.)
• Course is about Concepts, not Memorization
Administrative Matters
Midterm II, coming Tuesday, April 6
• Material Covered: HW 6 – HW 10
  – Note: due Thursday, April 2
  – Will ask grader to return Mon., April 5
  – Can pick up in my office (Hanes 352)
  – So today's HW not included
Administrative Matters
Extra Office Hours before Midterm II
Monday, Apr. 23 8:00 – 10:00
Monday, Apr. 23 11:00 – 2:00
Tuesday, Apr. 24 8:00 – 10:00
Tuesday, Apr. 24 1:00 – 2:00
(usual office hours)
Study Suggestions
1. Work an Old Exam
   a) On Blackboard
   b) Course Information Section
   c) Afterwards, check against given solutions
2. Rework HW problems
   a) Print Assignment sheets
   b) Choose problems in "random" order
   c) Rework (don't just "look over")
Reading In Textbook
Approximate Reading for Today’s Material:
Pages 356-369, 487-497
Approximate Reading for Next Class:
Pages 498-501, 418-422, 372-390
Law of Averages
Case 2: any random sample X_1, …, X_n
CAN SHOW, for n "large":
X̄ is "roughly" N(μ, σ/√n)
Terminology: "Law of Averages, Part 2" = "Central Limit Theorem"
(widely used name)
Central Limit Theorem
Illustration: Rice Univ. Applet
http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html
[Figure: user-input starting distribution (very non-Normal); distribution of the average of n = 25 (seems very mound shaped?)]
Extreme Case of CLT
Consequences:
X is roughly N(np, √(np(1−p)))
p̂ is roughly N(p, √(p(1−p)/n))
Terminology: Called
The Normal Approximation to the Binomial
Normal Approx. to Binomial
How large n?
• Bigger is better
• Could use the "n ≥ 30" rule from the Law of Averages above
• But clearly depends on p
• Textbook Rule:
OK when {np ≥ 10 & n(1−p) ≥ 10}
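The rule and the approximation itself can be sanity-checked numerically. Here is a minimal Python sketch (the course uses Excel's BINOMDIST and NORMINV; `NormalDist` stands in for those, and n = 100, p = 0.3 are made-up illustration values, not course data):

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.3                      # rule OK: np = 30 >= 10 and n(1-p) = 70 >= 10

# Exact Binomial probability P(X <= 35), summed term by term
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(36))

# Normal Approximation to the Binomial: N(np, sqrt(np(1-p)))
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(35)

print(exact, approx)                 # the two agree to within a few percent
```

When np or n(1−p) falls below 10, the same comparison shows the gap widening, which is the point of the textbook rule.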
Statistical Inference
Idea: Develop a formal framework for handling the unknowns p & μ
e.g. 1: Political Polls
e.g. 2a: Population Modeling
e.g. 2b: Measurement Error
Statistical Inference
A parameter is a numerical feature of the population, not the sample.
An estimate of a parameter is some function of the data
(hopefully close to the parameter).
Statistical Inference
Standard Error: for an unbiased estimator, the standard error is its standard deviation.
Notes: For the SE of p̂, since we don't know p, use the sensible estimate √(p̂(1−p̂)/n).
For the SE of X̄, use the sensible estimate s/√n.
Statistical Inference
Another view:
Form conclusions by quantifying uncertainty
(will study several approaches, first is…)
Confidence Intervals
Background:
The sample mean, X̄, is an "estimate" of the population mean, μ
How accurate?
(there is "variability", how much?)
Confidence Intervals
Idea:
Since a point estimate (e.g. X̄ or p̂) is never exactly right
(in particular P(X̄ = μ) = 0),
give a reasonable range of likely values
(range also gives a feeling for the accuracy of estimation)
Confidence Intervals
E.g. with σ known: X_1, …, X_n ~ N(μ, σ)
Think: measurement error
  – Each measurement is Normal
  – Known accuracy (maybe)
Think: population modeling
  – Normal population
  – Known s.d. (a stretch, really need to improve)
Confidence Intervals
E.g. with σ known: X_1, …, X_n ~ N(μ, σ)
Recall the Sampling Distribution:
X̄ ~ N(μ, σ/√n)
(recall we have this even when the data are not Normal, by the Central Limit Theorem)
Use to analyze variation
Confidence Intervals
Understand error via the dist'n of X̄:
(normal density quantifies randomness in X̄)
(distribution centered at μ)
(spread: s.d. = σ/√n)
How to explain to untrained consumers?
(who don't know randomness, distributions, normal curves)
Confidence Intervals
Approach: present an interval
With endpoints:
Estimate ± margin of error
I.e. X̄ ± m
reflecting variability
How to choose m?
Confidence Intervals
Choice of Confidence Interval Radius, i.e. margin of error, m:
Notes:
• No Absolute Range (i.e. including "everything") is available
• From the infinite tail of the normal dist'n
• So need to specify the desired accuracy
Confidence Intervals
Choice of margin of error, m:
Approach:
• Choose a Confidence Level
• Often 0.95
(e.g. FDA likes this number for approving new drugs, and it is a common standard for publication in many fields)
• And take the margin of error to include that part of the sampling distribution
Confidence Intervals
E.g. For confidence level 0.95, want:
[Figure: X̄ distribution, central area 0.95 between μ − m and μ + m, with m = margin of error]
Confidence Intervals
Computation: Recall NORMINV takes areas (probs), and returns cutoffs.
Issue: NORMINV works with lower areas
(Note: lower tail included)
Confidence Intervals
So adapt the needed probs to lower areas….
When inner area = 0.95,
Right tail = 0.025
Shaded Area = 0.975
So need to compute the cutoff as:
NORMINV(0.975, μ, σ/√n)
Confidence Intervals
Need to compute:
NORMINV(0.975, μ, σ/√n)
Major problem: μ is unknown
• But should the answer depend on μ?
• "Accuracy" is only about spread
• Not centerpoint
• Need another view of the problem
Confidence Intervals
Approach to unknown μ:
Recenter, i.e. look at the dist'n of X̄ − μ
Key concept: Centered at 0
Now can calculate as:
m = NORMINV(0.975, 0, σ/√n)
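The recentering trick, in which the unknown μ drops out, can be sketched in Python, with the stdlib `NormalDist.inv_cdf` standing in for NORMINV (σ = 10 and n = 15 here are illustrative values, not part of the derivation):

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, level=0.95):
    """m = NORMINV(1 - (1 - level)/2, 0, sigma/sqrt(n)):
    recentered at 0, so the unknown mu drops out entirely."""
    upper_area = 1 - (1 - level) / 2          # e.g. 0.975 for a 95% level
    return NormalDist(mu=0, sigma=sigma / sqrt(n)).inv_cdf(upper_area)

print(margin_of_error(10, 15))                # about 5.06 at the 95% level
```

Note that `margin_of_error` never takes a mean as input: only the spread σ/√n and the confidence level matter, which is the point of the slide above.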
Confidence Intervals
Computation of:
m = NORMINV(0.975, 0, σ/√n)
Smaller Problem: Don't know σ
Approach 1: Estimate σ with s
(natural approach: use the estimate)
• Leads to complications
• Will study later
Approach 2: Sometimes know σ
Research Corner
How many bumps in the stamps data?
Kernel Density Estimates: depends on the window
~1? ~2? ~7? ~10?
Early Approach:
Use the data to choose the window width
Challenge:
Not enough info in the data for a good choice
Alternate Approach:
Scale Space
Research Corner
Scale Space:
Main Idea:
• Don't try to choose a window width
• Instead use all of them
• Terminology from Computer Vision
(goal: teach computers to "see")
  – Oversmoothing: coarse scale view (zoomed out – macroscopic perception)
  – Undersmoothing: fine scale view (zoomed in – microscopic perception)
Research Corner
Scale Space:
View 1: Rainbow colored movie
View 2: Rainbow colored overlay
View 3: Rainbow colored surface
Research Corner
Scale Space:
Main Idea:
• Don't try to choose a window width
• Instead use all of them
Challenge: how to do statistical inference?
Which bumps are really there?
(i.e. statistically significant)
Address this next time
Confidence Intervals
E.g. Crop researchers plant 15 plots with a new variety of corn. The yields, in bushels per acre, are:
138, 139.1, 113, 132.5, 140.7, 109.7, 118.9, 134.8, 109.6, 127.3, 115.6, 130.4, 130.2, 111.7, 105.5
Assume that σ = 10 bushels / acre
Confidence Intervals
E.g. Find:
a) The 90% Confidence Interval for the mean value μ, for this type of corn.
b) The 95% Confidence Interval.
c) The 99% Confidence Interval.
d) How do the CIs change as the confidence level increases?
Solution, part 1 of Class Example 11:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg11.xls
Confidence Intervals
E.g. Find: a) 90% Confidence Interval for μ
Next study relevant parts of E.g. 11:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg11.xls
Use Excel, data in C8:C22
Steps:
- Sample Size, n
- Average, X̄
- S. D., σ
- Margin, m
- CI endpoint, left
- CI endpoint, right
Confidence Intervals
E.g. Find: a) 90% CI for μ: [119.6, 128.0]
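As a cross-check on the spreadsheet steps, here is a hedged Python version of the same computation, with the stdlib `NormalDist.inv_cdf` standing in for NORMINV (data and σ = 10 from the corn example above):

```python
from math import sqrt
from statistics import NormalDist

yields = [138, 139.1, 113, 132.5, 140.7, 109.7, 118.9, 134.8,
          109.6, 127.3, 115.6, 130.4, 130.2, 111.7, 105.5]
sigma = 10                                         # assumed known s.d.

n = len(yields)                                    # sample size, 15
xbar = sum(yields) / n                             # average, 123.8
m = NormalDist().inv_cdf(0.95) * sigma / sqrt(n)   # 90% level -> 0.95 cutoff
lo, hi = xbar - m, xbar + m

print(round(lo, 1), round(hi, 1))                  # 119.6 128.0
```

The 0.95 cutoff (not 0.90) appears because the 90% central area leaves 5% in each tail, exactly the lower-area adjustment discussed earlier.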
Confidence Intervals
An EXCEL shortcut: CONFIDENCE
(Note: same margin of error as before)
Inputs:
- Sample Size
- S. D.
- Alpha
Careful: parameter α is the 2-tailed outer area
So for level = 0.90, α = 0.10
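Excel's CONFIDENCE shortcut packages exactly the margin-of-error computation above. A minimal Python sketch of what it computes (a stand-in using the stdlib, not Excel's implementation; values from the corn example):

```python
from math import sqrt
from statistics import NormalDist

def confidence(alpha, standard_dev, size):
    """Margin of error in the style of Excel's CONFIDENCE:
    alpha is the 2-tailed outer area, so the cutoff uses 1 - alpha/2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / sqrt(size)

# level = 0.90 -> alpha = 0.10; corn example: sigma = 10, n = 15
print(round(confidence(0.10, 10, 15), 2))   # about 4.25, same m as before
```

The `1 - alpha/2` line is the "careful" point on the slide: passing the confidence level itself instead of the outer area α would give the wrong cutoff.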
Confidence Intervals
E.g. Find:
a) 90% CI for μ: [119.6, 128.0]
b) 95% CI for μ: [118.7, 128.9]
c) 99% CI for μ: [117.1, 130.5]
d) How do the CIs change as the confidence level increases?
– Intervals get longer
– Reflects higher demand for accuracy
Confidence Intervals
HW: 6.11 (use Excel to draw curve &
shade by hand)
6.13, 6.14 (7.30,7.70, wider)
6.16 (n = 2673, so CLT gives Normal)
Choice of Sample Size
Additional use of the margin of error idea
Background: [Figure: X̄ distributions, wide for small n, narrow for large n; spread is σ/√n]
Could choose n to make σ/√n = a desired value
But the S. D. is not very interpretable, so instead make the "margin of error", m = the desired value
Then get: "X̄ is within m units of μ, 95% of the time"
Choice of Sample Size
Given m, how do we find n?
Solve for n the equation:
0.95 = P(|X̄ − μ| ≤ m)
(where is n in this?)
(use of "standardization"):
0.95 = P(|X̄ − μ| ≤ m) = P(|X̄ − μ| / (σ/√n) ≤ m / (σ/√n)) = P(|Z| ≤ m / (σ/√n))
[so use NORMINV & the Standard Normal, N(0,1)]
Choice of Sample Size
Graphically, find m so that:
Area between −m/(σ/√n) and m/(σ/√n) = 0.95,
i.e. Area below m/(σ/√n) = 0.975
Choice of Sample Size
Thus solve:
m / (σ/√n) = NORMINV(0.975, 0, 1)
√n = NORMINV(0.975, 0, 1) · σ / m
n = (NORMINV(0.975, 0, 1) · σ / m)²
(put this on the list of formulas)
Choice of Sample Size
n = (NORMINV(0.975, 0, 1) · σ / m)²
Numerical fine points:
• Change the 0.975 for coverage prob. ≠ 0.95
• Round decimals upwards, to be "sure of desired coverage"
Choice of Sample Size
EXCEL Implementation of n = (NORMINV(0.975, 0, 1) · σ / m)²
Class Example 11, Part 2:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg11.xls
Choice of Sample Size
Class Example 11, Part 2:
Recall: Corn Yield Data
- Gave X̄
- Assumed σ = 10
- Resulted in margin of error, m
How large should n be to give a smaller (90%) margin of error, say m = 2?
Choice of Sample Size
Class Example 11, Part 2:
How large should n be to give a smaller (90%) margin of error, say m = 2?
Compute from:
n = (NORMINV(0.95, 0, 1) · σ / m)²
(recall 90% central area, so use the 95% cutoff)
Round up, to be safe in the statement
Choice of Sample Size
Class Example 11, Part 2:
Excel function to round up: CEILING
Choice of Sample Size
Class Example 11, Part 2:
How large should n be to give a smaller (90%) margin of error, say m = 2?
n = 68
Choice of Sample Size
Now ask for a higher confidence level:
How large should n be to give a (99%) margin of error of m = 2?
Similar computations: n = 166
Choice of Sample Size
Now ask for a smaller margin:
How large should n be to give a (99%) margin of error of m = 0.2?
Similar computations: n = 16588
Note: serious round up
(10 times the accuracy requires 100 times as much data)
(Law of Averages: Square Root)
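The three sample-size answers above can be reproduced from the formula n = (NORMINV(cutoff, 0, 1) · σ/m)², rounded up. A minimal Python sketch (stdlib only; `inv_cdf` stands in for NORMINV and `ceil` for Excel's CEILING):

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, m, level):
    """n = (NORMINV(1 - (1 - level)/2, 0, 1) * sigma / m)^2,
    rounded up to be safe in the coverage statement."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z * sigma / m) ** 2)

print(sample_size(10, 2,   0.90))   # 68
print(sample_size(10, 2,   0.99))   # 166
print(sample_size(10, 0.2, 0.99))   # 16588
```

The last two calls show the square-root law directly: shrinking m by a factor of 10 at the same level multiplies the required n by 100.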
Choice of Sample Size
HW: 6.29, 6.30 (52), 6.31
n = (NORMINV(0.95, 0, 1) · σ / m)²
And now for something completely different….
An interesting advertisement:
http://www.albinoblacksheep.com/flash/honda.php
C.I.s for proportions
Recall:
Counts: X ~ Bi(n, p), with μ_X = np, σ_X = √(np(1−p))
Sample Proportions: p̂ = X/n, with μ_p̂ = p, σ_p̂ = √(p(1−p)/n)
C.I.s for proportions
Calculate prob's with BINOMDIST
(but C.I.s need the inverse of probs)
Note: there is no BINOMINV, so instead use the Normal Approximation
Recall:
Normal Approx. to Binomial
Example: from StatsPortal
http://courses.bfwpub.com/ips6e.php
[Applet: for Bi(n, p), control n and p, see the prob. histo., compare to the fitted (by mean & sd) Normal dist'n]
C.I.s for proportions
Recall the Normal Approximation to the Binomial:
For np ≥ 10 & n(1−p) ≥ 10:
X is approximately N(np, √(np(1−p)))
p̂ is approximately N(p, √(p(1−p)/n))
So use NORMINV (and often NORMDIST)
C.I.s for proportions
Main problem: don't know p
Solution: Depends on context: CIs or hypothesis tests
Different from the Normal, since now the mean and sd are linked, with both depending on p, instead of separate μ & σ
C.I.s for proportions
Case 1: Margin of Error and CIs:
p̂ − p ~ N(0, √(p(1−p)/n)), with central area 0.95 between −m and m, so area 0.975 below m
So: m = NORMINV(0.975, 0, √(p(1−p)/n))
C.I.s for proportions
Case 1: Margin of Error and CIs:
m = NORMINV(0.975, 0, √(p(1−p)/n))
Continuing problem: Unknown p
Solution 1: "Best Guess"
Replace p by p̂
C.I.s for proportions
Solution 2: "Conservative"
Idea: make the sd (and thus m) as large as possible
(makes no sense for the Normal)
f(p) = p(1−p):
- zeros at 0 & 1
- max at p = 1/2
C.I.s for proportions
Solution 2: "Conservative"
Can check by calculus:
max over p in [0, 1] of √(p(1−p)) = √((1/2)(1 − 1/2)) = √(1/4) = 1/2
Thus m = NORMINV(0.975, 0, √(1/(4n))) = NORMINV(0.975, 0, 1/(2·√n))
C.I.s for proportions
Example: Old Text Problem 8.8
Power companies spend time and money trimming trees to keep branches from falling on lines. Chemical treatment can stunt tree growth, but too much may kill the tree. In an experiment on 216 trees, 41 died. Give a 99% CI for the proportion expected to die from this treatment.
Solution: Class example 12, part 1:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg12.xls
C.I.s for proportions
Class e.g. 12, part 1
Steps:
- Sample Size, n
- Data Count, X
- Sample Prop., p̂
- Check the Normal Approximation
- Best Guess Margin of Error
- Conservative Margin of Error
(Recall 99% level & 2 tails…)
C.I.s for proportions
Class e.g. 12, part 1
Best Guess CI: [0.121, 0.259]
Conservative CI: [0.102, 0.277]
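Both intervals can be reproduced in Python, with the stdlib `NormalDist.inv_cdf` standing in for NORMINV (n = 216 and X = 41 are from the tree problem):

```python
from math import sqrt
from statistics import NormalDist

n, X = 216, 41
p_hat = X / n                                   # sample proportion, about 0.19
z = NormalDist().inv_cdf(0.995)                 # 99% level -> 0.995 cutoff

m_best = z * sqrt(p_hat * (1 - p_hat) / n)      # best-guess margin of error
m_cons = z * (1 / (2 * sqrt(n)))                # conservative margin of error

print(round(p_hat - m_best, 3), round(p_hat + m_best, 3))   # 0.121 0.259
print(round(p_hat - m_cons, 3), round(p_hat + m_cons, 3))   # 0.102 0.277
```

Note `m_cons > m_best` here: the conservative margin replaces √(p(1−p)) with its maximum value 1/2, which costs extra width whenever p̂ is far from 0.5.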
C.I.s for proportions
Example: Old Text Problem 8.8
Solution: Class example 12, part 1:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg12.xls
Note: the Conservative interval is bigger
Since p̂ = 0.19 is far from 0.5 (big gap)
So may pay a substantial price for being "safe"
C.I.s for proportions
HW:
8.7
Do both best-guess and conservative CIs: 8.11, 8.13a, 8.19
Recommended