2014.3.13
Medical Statistics
Tao Yuchun
5
http://cc.jlu.edu.cn/ms.html
Statistical inference
1. Estimation of population parameter
1.1 Sampling error and standard error of mean
• Sampling study
Sampling error:
Sample → sample mean
(different from the population mean)
Different samples → different sample means
(different from each other)
(1) Sampling error is related to the variation of the population
• No variation, no error, and also no sampling error!
Example: The sample means of systolic blood pressure.
For an adult population (age 25~90)
-- vary substantially
For a young population (age 18~25)
-- do not vary too much
No variation, no statistics, too!
(2) Sampling error is also related to sample size
If
sample size = population size
there is no sampling error!
If
sample size = 1
Sampling error ≡ variation of population!
• How does the sample mean change from sample to sample?
• See the simulation experiment below.
• Sampling from N(4.6602, 0.5746²) by computer (100 times)
Frequency distribution of sample means ("-" means no sample mean fell in that class)
Value of mean   n=5   n=10   n=20   n=50
0.75              1      -      -      -
1.25              1      -      -      -
1.75              4      1      -      -
2.25              2      2      -      -
2.75             12      5      2      1
3.25             15      8      9      5
3.75             12     16     24     22
4.25             10     26     31     45
4.75             17     16     22     24
5.25              8     15     10      3
5.75              6      8      2      -
6.25              7      3      -      -
6.75              4      -      -      -
7.25-7.75         1      -      -      -
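The simulation experiment can be sketched in Python as follows. This is only an illustration, not the program used for the slide: the helper name `simulate_sample_means`, the fixed seed, and the use of the standard deviation of the 100 means as a summary of their spread are assumptions.

```python
import random
import statistics

MU = 4.6602      # population mean from the slide
SIGMA = 0.5746   # population standard deviation from the slide


def simulate_sample_means(n, times=100, seed=0):
    """Draw `times` samples of size n from N(MU, SIGMA^2); return their means."""
    rng = random.Random(seed)  # fixed seed (assumption) so the run is repeatable
    return [
        statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(times)
    ]


# Spread of the 100 sample means for each sample size used in the table
spread = {}
for n in (5, 10, 20, 50):
    means = simulate_sample_means(n)
    spread[n] = statistics.stdev(means)
    print(f"n={n:2d}  mean of means={statistics.fmean(means):.3f}  "
          f"SD of means={spread[n]:.3f}")
```

The sample means should cluster more tightly around the population mean as the sample size grows.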
•Sampling from a skew distribution by computer
[Figure: histograms of (a) the skew population and of the sample means for (b) n=5, (c) n=10, (d) n=20, (e) n=30]
1.2 The distribution of sample mean
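If the variable follows a normal distribution, $X \sim N(\mu, \sigma^2)$,
the sample mean follows a normal distribution: $\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$, where $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
If the distribution of the variable is skew,
For a small sample
the distribution of the sample mean is skew
For a large sample
the sample mean is close to a normal distribution: $\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$ approximately
-- from the Central Limit Theorem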
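The effect of sample size on the shape of the distribution of the sample mean can be checked with a small simulation. In this sketch the skew population is assumed to be exponential with mean 1, and the helper names `skewness` and `sample_means`, the number of repetitions, and the seed are all illustrative choices.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed SD."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([(v - m) ** 3 for v in values]) / sd ** 3


def sample_means(n, times=5000, seed=1):
    """Means of `times` samples of size n from a right-skewed population."""
    rng = random.Random(seed)
    # Exponential population with mean 1: an assumed stand-in for "skew"
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(times)
    ]


skew_by_n = {n: skewness(sample_means(n)) for n in (5, 10, 20, 30)}
for n, g in skew_by_n.items():
    print(f"n={n:2d}  skewness of sample means={g:.2f}")
```

A skewness near 0 indicates a symmetric, normal-like shape; the value should shrink toward 0 as n increases.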
1.3 Standard error
• Standard deviation of the population: $\sigma$
• Standard deviation of the sample mean, or standard error of the sample mean, or standard error: $\sigma_{\bar{X}}$
• In any case: Standard error of sample mean = (standard deviation of the population) / $\sqrt{n}$
or
$\sigma_{\bar{X}} = \sigma/\sqrt{n}$
• For application
$S_{\bar{X}} = S/\sqrt{n}$
S is an estimate of σ, and $S_{\bar{X}}$ is an estimate of $\sigma_{\bar{X}}$.
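As a minimal sketch, the estimated standard error can be computed from a sample like this. The function name `standard_error` and the data values are hypothetical and serve only to exercise the formula.

```python
import math
import statistics


def standard_error(sample):
    """Estimate the standard error of the mean as S / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation S (n - 1 divisor)
    return s / math.sqrt(n)


# Hypothetical measurements, used only to exercise the function
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"S = {statistics.stdev(data):.3f}, S_xbar = {standard_error(data):.3f}")
```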
1.4 Student’s t distribution
• The t distribution was discovered by William S. Gosset (1876 - 1937) in 1908.
• “Student” is his pen name.
For a normal distribution $N(\mu, \sigma^2)$, $\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$.
If
$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
then Z follows a standard normal distribution, N(0,1).
• When σ is unknown,
$t = \frac{\bar{X} - \mu}{S_{\bar{X}}} = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
t follows a t distribution.
[Figure: t curve, centered at 0]
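The difference between Z and t can be made visible by simulation. The sketch below assumes a standard normal population and samples of size 5; the function name `tail_fractions`, the cutoff 1.96, the number of repetitions, and the seed are illustrative choices, not part of the lecture.

```python
import math
import random
import statistics


def tail_fractions(n=5, times=20000, mu=0.0, sigma=1.0, cutoff=1.96, seed=2):
    """Fraction of samples with |Z| > cutoff and with |t| > cutoff."""
    rng = random.Random(seed)
    z_hits = t_hits = 0
    for _ in range(times):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # Z uses the known population sigma
        z = (xbar - mu) / (sigma / math.sqrt(n))
        # t replaces sigma with the sample standard deviation S
        t = (xbar - mu) / (statistics.stdev(sample) / math.sqrt(n))
        z_hits += abs(z) > cutoff
        t_hits += abs(t) > cutoff
    return z_hits / times, t_hits / times


z_frac, t_frac = tail_fractions()
print(f"P(|Z| > 1.96) ~ {z_frac:.3f}   P(|t| > 1.96) ~ {t_frac:.3f}  (n = 5)")
```

Because S varies from sample to sample, t lands in the tails more often than Z does, which is why the t curve is wider than the standard normal curve for small n.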
• The properties of the t distribution
I. Symmetric
• Center is 0.
II. ν: shape parameter
• also called the degrees of freedom, ν = n-1.
• determines the shape of a t curve.
• different ν, different t curve. As ν increases, the t curve gets closer to the standard normal curve; when ν → ∞, the t curve becomes the standard normal curve. See the animation.
In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.
-- from Wikipedia
• The different t curves
[Figure: density f(t) of the t distribution for ν = 1, ν = 4, and ν = ∞ (standard normal curve)]
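The curves can be compared numerically with the standard formula for the t density. This is a sketch using only the Python standard library; the function names `t_pdf` and `normal_pdf` and the evaluation points 0 and 3 are illustrative choices.

```python
import math


def t_pdf(t, nu):
    """Density f(t) of the t distribution with nu degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large nu
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(t * t / nu))


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


for nu in (1, 4, 30, 1000):
    print(f"nu={nu:4d}  f(0)={t_pdf(0, nu):.4f}  f(3)={t_pdf(3, nu):.4f}")
print(f"normal   f(0)={normal_pdf(0):.4f}  f(3)={normal_pdf(3):.4f}")
```

For small ν the curve is lower at the center and higher in the tails; as ν grows, both values approach those of the standard normal curve.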
III. The area under the t curve
• The table for the t distribution.
• A t value is denoted $t_{\alpha,\nu}$, where α is a probability and ν is the degrees of freedom, ν = n-1.
• The area under the t curve means:
One side: $P(t \le -t_{\alpha,\nu}) = \alpha$ or $P(t \ge t_{\alpha,\nu}) = \alpha$
Two sides: $P(t \le -t_{\alpha,\nu}) + P(t \ge t_{\alpha,\nu}) = \alpha$
• See next figure
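[Figure: t curve with ν degrees of freedom; the area of each tail, below $-t_{\alpha,\nu}$ and above $t_{\alpha,\nu}$, is α/2]
• The meaning of the area under the t curve for two sides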
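The two-sided area can be verified numerically by integrating the t density. The sketch below uses Simpson's rule from the standard library only; the function names and the number of integration steps are assumptions, and 2.093 is the tabulated two-sided value for α = 0.05, ν = 19.

```python
import math


def t_pdf(t, nu):
    """Density of the t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(t * t / nu))


def two_sided_tail(t_crit, nu, steps=2000):
    """P(t <= -t_crit) + P(t >= t_crit), via Simpson's rule on [-t_crit, t_crit]."""
    h = 2 * t_crit / steps  # steps must be even for Simpson's rule
    total = t_pdf(-t_crit, nu) + t_pdf(t_crit, nu)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-t_crit + i * h, nu)
    central_area = total * h / 3
    return 1 - central_area  # total area under the curve is 1


alpha = two_sided_tail(2.093, 19)
print(f"two-sided area = {alpha:.4f}, each tail = {alpha / 2:.4f}")
```

The two tails together should come out close to 0.05, so each tail holds about 0.025.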
1.5 Confidence Interval of Population Mean
Statistical inference
  Estimation of parameter
    point estimation
    interval estimation
  Hypothesis testing
Point estimation of population mean
-- sample mean
Interval estimation of population mean
-- (1-α) confidence interval
Confidence level: 1-α, such as 95% or 99%.
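• From $P(t \le -t_{\alpha,\nu}) + P(t \ge t_{\alpha,\nu}) = \alpha$
• We can get
$P(-t_{\alpha,\nu} < t < t_{\alpha,\nu}) = 1 - \alpha$
$P\left(-t_{\alpha,\nu} < \frac{\bar{X} - \mu}{S_{\bar{X}}} < t_{\alpha,\nu}\right) = 1 - \alpha$
$P(\bar{X} - t_{\alpha,\nu} S_{\bar{X}} < \mu < \bar{X} + t_{\alpha,\nu} S_{\bar{X}}) = 1 - \alpha$
$\bar{X} \pm t_{\alpha,\nu} S_{\bar{X}}$
• It is the formula of the (1-α) confidence interval of the population mean for two sides.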
• The (1-α) confidence interval of the population mean can be abbreviated to 95% CI or 99% CI.
Whenever we get a mean and standard deviation from a sample, put them into
$\bar{X} \pm t_{\alpha,\nu} S_{\bar{X}}$
then
$(\bar{X} - t_{\alpha,\nu} S_{\bar{X}},\ \bar{X} + t_{\alpha,\nu} S_{\bar{X}})$
• The two extreme values are called confidence limits.
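Example: Systolic blood pressures of 20 healthy males were measured: $\bar{X} = 118.4$ mmHg, $S = 10.8$ mmHg.
What is the 95% confidence interval of the population mean?
$n = 20$, $S = 10.8$, $S_{\bar{X}} = S/\sqrt{n} = 10.8/\sqrt{20} = 2.415$
$\nu = n - 1 = 20 - 1 = 19$, $\alpha = 0.05$
$t_{0.05,19} = 2.093$ (from the table of the t distribution)
$\bar{X} - t_{0.05,19} S_{\bar{X}} = 118.4 - 2.093 \times 2.415 = 113.3$
$\bar{X} + t_{0.05,19} S_{\bar{X}} = 118.4 + 2.093 \times 2.415 = 123.5$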
95% CI: (113.3, 123.5) mmHg
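The same arithmetic can be written as a small Python function. This is a sketch: the function name `confidence_interval` is hypothetical, and the critical value 2.093 is passed in as read from the t table rather than computed.

```python
import math


def confidence_interval(mean, sd, n, t_crit):
    """Two-sided CI for the population mean: mean -/+ t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)   # standard error S_xbar
    margin = t_crit * se
    return mean - margin, mean + margin


# Values from the example; 2.093 is read from the t table (alpha = 0.05, nu = 19)
lower, upper = confidence_interval(mean=118.4, sd=10.8, n=20, t_crit=2.093)
print(f"95% CI: ({lower:.1f}, {upper:.1f}) mmHg")
```

For a 99% CI, only `t_crit` changes: it is read from the same row of the table under α = 0.01.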
• What does “confidence interval” mean?
[Figure: (1-α) CIs from repeated samples; some intervals do not include μ]
You should know:
Once you have a 95% confidence interval for a certain population mean, the μ of this population may be in it or may not be in it, but the probability that an interval constructed in this way contains μ is 95%!
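This statement can be illustrated by repeated sampling. The sketch below assumes a normal population whose mean and standard deviation are borrowed from the blood pressure example purely for illustration; the function name `coverage`, the number of repetitions, and the seed are also assumptions.

```python
import math
import random
import statistics


def coverage(mu=118.4, sigma=10.8, n=20, t_crit=2.093, times=4000, seed=3):
    """Fraction of simulated 95% CIs that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(times):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        margin = t_crit * statistics.stdev(sample) / math.sqrt(n)
        hits += xbar - margin < mu < xbar + margin
    return hits / times


print(f"share of intervals containing mu: {coverage():.3f}")
```

Each single interval either contains μ or does not; the 95% describes how often the procedure succeeds over many samples.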
In the figure, the red curve is the standard normal curve, the blue curve is the t curve, and df is ν (degrees of freedom).