Point and Interval Estimation
Daniel Y.T. Fong (Email: [email protected])
NURS4302 - STATISTICS
School of Nursing, The University of Hong Kong
Statistical Inference
Population Sample
Random sampling :- easy!
Statistical Inference
■ Estimation
■ Hypothesis Testing
Learning Objectives
1. To estimate a population mean
2. To estimate a population proportion
… when we do not have data from the population
Estimation for Mean - Point estimation
… when we do not have data from the whole population
Population Sample
Can You Plan for This?
What is the blood pressure, on average, of an
African-Chinese after taking calcium for 12 weeks?
What we want to do …
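Population: blood pressure of all African-Chinese at 12 weeks after calcium intake (mean: unknown)
Sampling: draw a sample from the population (mean: observed)
Estimation: use the observed sample mean to estimate the unknown population mean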
How Will You Start the Study?
1. Decide a sample size !
2. Draw a random sample
Let’s take it as 5 !
What is the blood pressure, on average, of an African-Chinese
after taking calcium for 12 weeks?
97, 121, 113, 98, 101 Average = 106
Point estimate of the population mean
Are we done?
Point Estimation
97, 121, 113, 98, 101 Average = 106
112, 108, 97, 101, 113 Average = 106.2
99, 109, 92, 98, 121 Average = 103.8
What is the blood pressure, on average, of an African-Chinese
after taking calcium for 12 weeks?
Is it sufficient to just report the sample mean ?
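As a small illustration of why the sample mean alone may not be enough, the sketch below recomputes the averages of the three samples listed above. The variable names are illustrative only; the data are the values shown on the slide.

```python
from statistics import mean

# The three random samples of size 5 shown on the slide
samples = [
    [97, 121, 113, 98, 101],
    [112, 108, 97, 101, 113],
    [99, 109, 92, 98, 121],
]

# Each sample gives a different point estimate of the same population mean
sample_means = [mean(s) for s in samples]
for s, m in zip(samples, sample_means):
    print(s, "average =", round(m, 1))
```

The three averages differ even though the samples come from the same population, which is the sense in which the sample mean is random.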
Knowing the Sample Mean
Sample mean is Random ! Average of all possible
sample means is the population mean!
Sample mean is an unbiased estimate of the population mean
Sample variance (using n-1) is also an unbiased estimate of the population variance
Population mean: the truth
Sample means: Sample 1 = 106, Sample 2 = 106.2, Sample 3 = 103.8, Sample 4 = 108
Average 105
Example – An Illustration
You are interested in the heights of a family consisting of 4 members
You did not know that their heights are 1.7m, 1.5m, 0.9m and 0.8m
Taking random samples of size 2, with replacement
Mean = 1.23 m, Variance = 0.15 m²
Example – An Illustration
Sample variances are computed using a divisor of 2-1 = 1
Average of the 16 sample means = 1.23
Average of the 16 sample variances = 0.15
Population mean = 1.23
Population variance = 0.15
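The family illustration can be checked by brute force. The sketch below enumerates all 16 equally likely samples of size 2 (with replacement) and averages their means and variances. The comparison with the divisor-n variance is an extra check added here for contrast, not something shown on the slide.

```python
from itertools import product
from statistics import mean, pvariance, variance

heights = [1.7, 1.5, 0.9, 0.8]  # the whole population: a family of 4

pop_mean = mean(heights)       # 1.225, shown as 1.23 on the slide
pop_var = pvariance(heights)   # divisor N = 4; about 0.147, shown as 0.15

# All 4 x 4 = 16 equally likely ordered samples of size 2, with replacement
samples = list(product(heights, repeat=2))

avg_of_means = mean(mean(s) for s in samples)
avg_of_vars = mean(variance(s) for s in samples)          # divisor n - 1 = 1
avg_of_biased_vars = mean(pvariance(s) for s in samples)  # divisor n = 2

print("population mean    :", round(pop_mean, 3))
print("avg of sample means:", round(avg_of_means, 3))
print("population variance:", round(pop_var, 3))
print("avg of sample variances (n-1):", round(avg_of_vars, 3))
print("avg of sample variances (n)  :", round(avg_of_biased_vars, 3))
```

With the n-1 divisor the average of the sample variances matches the population variance, while the divisor-n version comes out too small.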
Example – An Illustration
Their average = 1.23
Their standard deviation (divisor = 16) is the standard error for the mean
Standard error for the mean refers to the variability of a sample mean
Standard deviation (σ) of the population refers to the variability of measurements in the population
Both of them are population parameters and are therefore unknown
Standard Error for the Mean
Standard error for the mean = $\sigma/\sqrt{n}$
where σ is the population standard deviation and n is the sample size
Example – An Illustration
Their average = 1.23
Their standard deviation (divisor = 16) is 0.27
$\sigma/\sqrt{n} = 0.38/\sqrt{2} = 0.27$
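A minimal sketch, using the same family of four, that compares the standard deviation of the 16 possible sample means with the formula σ/√n. Variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

heights = [1.7, 1.5, 0.9, 0.8]  # the whole population
n = 2                            # sample size

# Means of all 16 possible samples of size 2 (with replacement)
sample_means = [mean(s) for s in product(heights, repeat=n)]

sd_of_sample_means = pstdev(sample_means)  # divisor = 16
sigma = pstdev(heights)                    # population SD, divisor = 4
se_formula = sigma / sqrt(n)

print("SD of the 16 sample means:", round(sd_of_sample_means, 2))
print("sigma / sqrt(n)          :", round(se_formula, 2))
```

Both routes give about 0.27, but the first needs every possible sample and the second needs σ, neither of which is available in practice.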
With only one random sample, how can the standard error for the mean be estimated ?
Sample Standard Error for the Mean (SEM)
$SEM = SD/\sqrt{n}$
SEM = sample standard error for the mean
SD = sample standard deviation
n = sample size
SEM vs SD
[Figure: SEM (vertical axis, 0 to 30) plotted against sample size n (horizontal axis, 0 to 20), with one curve each for SD = 10, SD = 20 and SD = 30]
$SEM = SD/\sqrt{n}$
The larger the variability in the sample/population, the larger is the variability of the sample mean
An increase in sample size will reduce the SEM
Is SEM < SD always true?
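The relationship in the figure can be tabulated directly. This is a sketch only: `sem` is a hypothetical helper name, and the sample sizes are chosen for illustration.

```python
from math import sqrt

def sem(sd, n):
    """Sample standard error for the mean: SD divided by the square root of n."""
    return sd / sqrt(n)

# Same SD values as the three curves in the figure
for sd in (10, 20, 30):
    row = [round(sem(sd, n), 2) for n in (1, 5, 10, 20)]
    print(f"SD = {sd}: SEM at n = 1, 5, 10, 20 ->", row)
```

For a given SD, the formula gives SEM equal to SD at n = 1 and a smaller value for every n above 1.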
Revisiting Example
97, 121, 113, 98, 101 Average = 106
What is the blood pressure, on average, of an African-Chinese
after taking calcium for 12 weeks?
SD = 10.54, SEM = 10.54/√5 = 4.71
112, 108, 97, 101, 113: Average = 106.2, SD = 6.98, SEM = 6.98/√5 = 3.12
■ We need to report the sample mean in order to estimate the population mean
■ We need to report the SEM in order to know the precision of using the sample mean for estimation
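The figures for the two samples can be reproduced with the standard library. In this sketch `summarize` is a hypothetical helper; `statistics.stdev` uses the n-1 divisor, which matches the sample SD used on the slides.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(sample):
    """Return (mean, SD, SEM) for one sample; stdev uses the n - 1 divisor."""
    sd = stdev(sample)
    return mean(sample), sd, sd / sqrt(len(sample))

for sample in ([97, 121, 113, 98, 101], [112, 108, 97, 101, 113]):
    m, sd, sem = summarize(sample)
    print(f"mean = {m:.1f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```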
Example from the Literature
A survey was conducted to examine the perception of Hong Kong high school students in choosing nursing as their career
604 male and 639 female students responded
For each student, a PNC score was obtained to assess the perception of nursing as a career
◦ The highest possible PNC score is 44, indicating the most positive attitude towards nursing as a career.
◦ The lowest score is 11, signifying a strong negative attitude.
Suppose you are particularly interested in the mean PNC score of all male high school students in Hong Kong.
~ Law & Arthur (2003)
Estimating the mean PNC score of all male high school students in Hong Kong
1. In the sample of 604 male students, the sample mean and standard deviation are 27.8 and 3.42.
2. A point estimate is 27.8
3. The SEM is 3.42/√604 = 0.14
It can generally be considered as relatively small. Therefore, the point estimate can be considered as precise.
Estimation for Mean - Interval estimation
… when we do not have data from the whole population
Population Sample
An Alternative to Point Estimate
The Idea
Obtain an interval that we are highly certain includes the population mean (the truth)
Truth (unknown)
So, how?
Sampling Distribution for the Mean
[Figure: Sample 1, Sample 2 and Sample 3 are drawn from the population; their means (mean1, mean2, mean3) form the sampling distribution for the mean]
Sampling Distribution for the Mean
What is given:
A sample whose corresponding population has mean = µ and variance = σ²
What we can say:
The distribution of the sample mean is approximately N(µ, σ²/n), provided the sample size is large enough (Central limit theorem)
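The statement above can be explored with a small simulation. Everything in this sketch is an assumption made for illustration: the population is a hypothetical skewed one (exponential with mean 1, so µ = 1 and σ² = 1), and n = 50 is taken as "large enough".

```python
import random
from math import sqrt
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma2 = 1.0, 1.0  # hypothetical population: exponential with mean 1
n = 50                 # sample size, assumed large enough for this illustration
n_samples = 5000       # number of repeated samples

# One sample mean per repeated sample
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(n_samples)
]

# Share of sample means within mu +/- 1.96 * sqrt(sigma^2 / n)
half_width = 1.96 * sqrt(sigma2 / n)
within = sum(abs(m - mu) <= half_width for m in sample_means) / n_samples

print("average of sample means :", round(mean(sample_means), 3))
print("variance of sample means:", round(pvariance(sample_means), 4))
print("sigma^2 / n             :", sigma2 / n)
print("share within 1.96 SE    :", round(within, 3))
```

Although the population is far from Normal, the sample means centre on µ with variance close to σ²/n, and roughly 95% of them fall within 1.96 standard errors of µ.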
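Getting an interval with 95% chance to include the true value (µ)?
Suppose the interval is $(l_1, l_2)$. We require
$$P(l_1 < \mu < l_2) = 0.95$$
$$P(-l_2 < -\mu < -l_1) = 0.95$$
$$P(\bar{X} - l_2 < \bar{X} - \mu < \bar{X} - l_1) = 0.95$$
$$P\left(\frac{\bar{X} - l_2}{\sqrt{\sigma^2/n}} < \frac{\bar{X} - \mu}{\sqrt{\sigma^2/n}} < \frac{\bar{X} - l_1}{\sqrt{\sigma^2/n}}\right) = 0.95$$
$$P\left(\frac{\bar{X} - l_2}{\sqrt{\sigma^2/n}} < Z < \frac{\bar{X} - l_1}{\sqrt{\sigma^2/n}}\right) = 0.95, \quad Z = \frac{\bar{X} - \mu}{\sqrt{\sigma^2/n}} \sim N(0, 1)$$
[Figure: standard Normal curve centred at 0 with central area 0.95]
The central 95% of N(0, 1) lies between -1.96 and 1.96, so
$$\frac{\bar{X} - l_1}{\sqrt{\sigma^2/n}} = 1.96 \quad\Rightarrow\quad l_1 = \bar{X} - 1.96\sqrt{\sigma^2/n}$$
Similarly,
$$l_2 = \bar{X} + 1.96\sqrt{\sigma^2/n}$$
This is what we required!
But σ² is unknown!!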
Interval Estimation
(1-α)100% confidence interval for the (population) mean:
$(\bar{X} - t(df, \alpha/2)\,SEM,\ \bar{X} + t(df, \alpha/2)\,SEM)$
$\bar{X}$ = sample mean
SEM = estimated standard error for the mean
t(df, α/2) = t value that depends on two values: df and α
df = degrees of freedom = n-1
1-α = level of certainty/confidence, 0 < α < 1
e.g. α = 0.05 specifies a 95% confidence interval
Obtaining t(df, α/2)
[Figure: t-distribution with degrees of freedom = df; an area of α/2 lies in each tail, beyond -t(df, α/2) and t(df, α/2)]
1. Determine df and α
2. Look up the critical value in the t-table (using the two-tailed column)
Obtaining t(df, α/2)
[t-table excerpt, row df = 4: t(4, 0.1/2), t(4, 0.05/2), t(4, 0.01/2)]
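When no t-table is at hand, the critical value can be approximated numerically. The sketch below is one possible way using only the standard library: it integrates the t density with Simpson's rule and finds, by bisection, the point where the central area equals 1-α. The function names (`t_pdf`, `central_area`, `t_critical`) are hypothetical helpers, not part of any library; a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def central_area(t, df, steps=1000):
    """Area under the t density between -t and t (Simpson's rule, steps even)."""
    if t == 0:
        return 0.0
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # the density is symmetric, so double the 0..t area

def t_critical(df, alpha):
    """Two-tailed critical value t(df, alpha/2): central area equals 1 - alpha."""
    lo, hi = 0.0, 200.0
    for _ in range(50):  # bisection on the central area
        mid = (lo + hi) / 2
        if central_area(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for alpha in (0.10, 0.05, 0.01):
    print(f"t(4, {alpha}/2) = {t_critical(4, alpha):.3f}")
```

For df = 4 this should agree with the t-table row to three decimals. The upper search limit of 200 is an arbitrary choice that covers the usual table range.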
Revisiting Example
97, 121, 113, 98, 101: Average = 106, SD = 10.54, SEM = 10.54/√5 = 4.71
What is the blood pressure, on average, of an African-Chinese
after taking calcium for 12 weeks?
1. Determine df and α: df = 4 (= 5-1); α = 0.05 (specified)
2. Look up the critical value in the t-table: t(4, 0.05/2) = 2.776
(1-α)100% confidence interval: $(\bar{X} - t(df, \alpha/2)\,SEM,\ \bar{X} + t(df, \alpha/2)\,SEM)$
A 95% confidence interval for the mean is (106 - 2.776 × 4.71, 106 + 2.776 × 4.71) = (92.9, 119.1)
So, we are 95% certain that the interval (92.9, 119.1) includes the true population mean.
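The calculation above can be sketched in a few lines. `t_interval` is a hypothetical helper, and the critical value 2.776 is taken from the t-table as on the slide rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(sample, t_value):
    """Return (mean - t * SEM, mean + t * SEM) for a given critical value t."""
    sem = stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - t_value * sem, centre + t_value * sem

sample = [97, 121, 113, 98, 101]
t_value = 2.776  # t(4, 0.05/2), read from the t-table
low, high = t_interval(sample, t_value)
print(f"95% CI: ({low:.1f}, {high:.1f})")
```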
Confidence Intervals
Drew 100 random samples. Hence, 100 CIs
[Figure: the 100 95% CIs and the 100 99% CIs, each marked as either including or not including the true value]
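The repeated-sampling picture can be imitated by simulation. This is a sketch under stated assumptions: the Normal population with mean 105 and SD 10 is hypothetical, 2000 samples are drawn instead of 100 to steady the percentages, and the t values for df = 4 are those used on the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 105.0, 10.0         # hypothetical population, not from the slides
n = 5                                    # sample size, so df = 4
t_values = {"95%": 2.776, "99%": 4.604}  # t(4, alpha/2) from the t-table

def covers(sample, t_value):
    """True if the confidence interval from this sample includes the true mean."""
    sem = stdev(sample) / sqrt(len(sample))
    return abs(mean(sample) - TRUE_MEAN) <= t_value * sem

n_samples = 2000
samples = [[random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
           for _ in range(n_samples)]

coverage = {
    level: sum(covers(s, t) for s in samples) / n_samples
    for level, t in t_values.items()
}
for level, share in coverage.items():
    print(f"{level} CIs that include the true value: {share:.1%}")
```

The share of intervals that include the true value should land near the stated confidence level, with the 99% intervals missing less often than the 95% ones.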
Revisiting Example
What is the blood pressure, on average, of an African-Chinese
after taking calcium for 12 weeks?
112, 108, 97, 101, 113: Average = 106.2, SD = 6.98, SEM = 6.98/√5 = 3.12
A 90% confidence interval for the mean is (106.2 - 2.132 × 3.12, 106.2 + 2.132 × 3.12) = (99.5, 112.9)
A 95% confidence interval for the mean is (106.2 - 2.776 × 3.12, 106.2 + 2.776 × 3.12) = (97.5, 114.9)
A 99% confidence interval for the mean is (106.2 - 4.604 × 3.12, 106.2 + 4.604 × 3.12) = (91.8, 120.6)
Width of Confidence Interval
The width reflects the precision
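90% CI: (99.5, 112.9), 十拿九穩 ("nine sure out of ten")
95% CI: (97.5, 114.9), 九五之尊 ("the supremacy of nine and five", a title of the emperor)
99% CI: (91.8, 120.6), 百發失一 ("one miss in a hundred shots")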
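To see how the width grows with the confidence level, the sketch below recomputes the three intervals for the second sample. The t values for df = 4 are those quoted on the slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

sample = [112, 108, 97, 101, 113]
sem = stdev(sample) / sqrt(len(sample))
centre = mean(sample)

# t(4, alpha/2) values as used on the slides
t_table = {90: 2.132, 95: 2.776, 99: 4.604}

intervals = {}
widths = {}
for level, t_value in t_table.items():
    low, high = centre - t_value * sem, centre + t_value * sem
    intervals[level] = (round(low, 1), round(high, 1))
    widths[level] = high - low
    print(f"{level}% CI: ({low:.1f}, {high:.1f}), width = {high - low:.1f}")
```

A higher confidence level buys more certainty at the cost of a wider, less precise interval.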
Example from the Literature - Revisit
1. In the sample of 604 male students, the sample mean and standard deviation are 27.8 and 3.42.
2. A point estimate is 27.8 with SEM = 0.14
3. df = 603, no such row on the t-table!
(1-α)100% confidence interval: $(\bar{X} - t(df, \alpha/2)\,SEM,\ \bar{X} + t(df, \alpha/2)\,SEM)$
Normal Approximation
A t-distribution with df = ∞ is the standard Normal distribution
For a 90% confidence interval, approximate t(603, 0.1/2) by 1.645: (27.57, 28.03)
For a 95% confidence interval, approximate t(603, 0.05/2) by 1.96: (27.53, 28.07)
For a 99% confidence interval, approximate t(603, 0.01/2) by 2.576: (27.44, 28.16)
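The Normal approximation can be reproduced with the standard library's `statistics.NormalDist`. In this sketch `z_value` is a hypothetical helper, and the summary figures (n = 604, mean 27.8, SD 3.42) are those quoted from the survey.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sd = 604, 27.8, 3.42  # summary figures quoted from the survey
sem = sd / sqrt(n)

def z_value(alpha):
    """Two-tailed critical value z(alpha/2) of the standard Normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

results = {}
for alpha in (0.10, 0.05, 0.01):
    z = z_value(alpha)
    low, high = sample_mean - z * sem, sample_mean + z * sem
    results[alpha] = (round(low, 2), round(high, 2))
    print(f"{(1 - alpha):.0%} CI: z = {z:.3f}, ({low:.2f}, {high:.2f})")
```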
Q & A
1. On the sample standard error for the mean (SEM) of a sample: True or False?
■ It measures the variability of the observations
■ It is a measure of how far the sample mean is likely to be from the population mean
■ It is greater than the SD of the sample
■ It is proportional to the number of observations
Estimation for Proportion
… when we do not have data from the whole population
Population Sample
Estimation of Proportion
What is the percentage of persons who experienced lethal shock
after receiving the current vaccine?
What we want to do …
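Population: whether or not all vaccinated persons experienced lethal shock (proportion: unknown)
Sampling: draw a sample from the population (proportion: observed)
Estimation: use the observed sample proportion to estimate the unknown population proportion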
How Will You Start the Study?
What is the percentage of persons who experienced lethal shock
after receiving the current vaccine?
■ Decide a sample size!
■ Sample size = 136
■ Number of persons who experienced lethal shock = 9
Sample proportion (p) = 9/136 = 6.6%, the point estimate of the population proportion
Confidence Interval for Proportion
Sample standard error for proportion (SEP):
$SEP = \sqrt{p(1-p)/n}$
(1-α)100% confidence interval for the proportion, when the sample size is sufficiently large:
$(p - z(\alpha/2)\,SEP,\ p + z(\alpha/2)\,SEP)$
SEP = 0.021
A 95% CI = (2.4%, 10.8%)
Critical Value from the Standard Normal Distribution
z(0.05/2) = 1.96
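A minimal sketch of the large-sample interval for a proportion, applied to the vaccine example (9 cases of lethal shock out of 136). `proportion_ci` is a hypothetical helper name; the critical value comes from `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, alpha=0.05):
    """Return (p, SEP, low, high) using the large-sample Normal interval."""
    p = successes / n
    sep = sqrt(p * (1 - p) / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2), about 1.96 for alpha = 0.05
    return p, sep, p - z * sep, p + z * sep

p, sep, low, high = proportion_ci(9, 136)
print(f"p = {p:.1%}, SEP = {sep:.3f}, 95% CI = ({low:.1%}, {high:.1%})")
```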
Example from the Literature- Revisit
In the same survey, students were also asked if they would consider nursing as a career possibility
A total of 348 from the total of 1243 students responded they would
Estimate the proportion of students who would consider nursing as a career possibility
A point estimate for the proportion is 348/1243 = 28%
The SEP = $\sqrt{p(1-p)/n} = \sqrt{0.28(1-0.28)/1243} = 0.013$
A 95% confidence interval for the proportion of school students who would consider nursing as a career possibility is
(0.28 - 1.96(0.013), 0.28 + 1.96(0.013)) = (0.255, 0.305)
That is, we are 95% confident that the proportion of school students who would consider nursing as a career possibility is between 25.5% and 30.5%.
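As a check on the arithmetic, the sketch below computes the interval twice: once with the rounded figures used on the slide (p = 0.28, SEP = 0.013) and once with unrounded values. The variable names are illustrative, and 1.96 is the critical value used on the slide.

```python
from math import sqrt

successes, n = 348, 1243
z = 1.96  # z(0.05/2), as used on the slide

# Unrounded calculation
p = successes / n
sep = sqrt(p * (1 - p) / n)
exact = (p - z * sep, p + z * sep)

# Calculation with the rounded figures shown on the slide
rounded = (0.28 - z * 0.013, 0.28 + z * 0.013)

print(f"p = {p:.3f}, SEP = {sep:.4f}")
print(f"unrounded: ({exact[0]:.3f}, {exact[1]:.3f})")
print(f"rounded  : ({rounded[0]:.3f}, {rounded[1]:.3f})")
```

Both routes agree to three decimals here, so the rounding on the slide does not change the reported interval.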
Q & A
2. Decide True or False in the following questions.
■ The SEP becomes smaller when the sample size becomes smaller.
■ The SEP when p=0.1 is larger than that when p=0.5.
■ A 95% confidence interval for a population proportion bears 95% chance to include the sample proportion.
True or False ?