2014.3.13 Medical Statistics, Tao Yuchun, 5, http://cc.jlu.edu.cn/ms.html



Statistical inference

1. Estimation of population parameter


1.1 Sampling error and standard error of mean

•Sampling study

Sampling error:

Sample → sample mean (different from the population mean)

Different samples → different sample means (different from each other)


(1) Sampling error is related to the variation of the population

•No variation, no error, and therefore no sampling error!

Example: The sample means of systolic blood pressure.

For an adult population (age 25~90) -- the sample means vary substantially

For a young population (age 18~25) -- the sample means do not vary much

No variation, no statistics, either!


(2) Sampling error is also related to sample size

If sample size = population size, there is no sampling error!

If sample size = 1, sampling error ≡ variation of the population!


•How does the sample mean change?
•See the simulation experiment below
•Sampling from $N(4.6602,\ 0.5746^2)$ by computer (100 times)

Frequency distribution of sample means

Value of mean   Sample size=5   Sample size=10   Sample size=20   Sample size=50
0.75             1               -                -                -
1.25             1               -                -                -
1.75             4               1                -                -
2.25             2               2                -                -
2.75            12               5                2                1
3.25            15               8                9                5
3.75            12              16               24               22
4.25            10              26               31               45
4.75            17              16               22               24
5.25             8              15               10                3
5.75             6               8                2                -
6.25             7               3                -                -
6.75             4               -                -                -
7.25-7.75        1               -                -                -
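A minimal sketch of such a simulation experiment, assuming Python's standard library only. The population parameters are read from the slide as N(4.6602, 0.5746²); the function name `simulate_sample_means` and the seed are hypothetical, and the output is not expected to reproduce the frequencies in the table above.

```python
import random
import statistics

def simulate_sample_means(n, reps=100, mu=4.6602, sigma=0.5746, seed=0):
    """Draw `reps` samples of size `n` from N(mu, sigma^2); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
            for _ in range(reps)]

if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        means = simulate_sample_means(n)
        # The spread of the sample means is the sampling error we observe.
        print(f"n={n:2d}  mean of means={statistics.fmean(means):.3f}  "
              f"SD of means={statistics.stdev(means):.3f}")
```

The SD of the sample means should shrink as the sample size grows, which is the pattern the frequency table shows.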


•Sampling from a skewed distribution by computer

[Figure: panels (a)-(e), horizontal axis from 1 to 9; panels (b), (c), (d), (e) are labelled n=5, n=10, n=20, n=30]


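1.2 The distribution of sample mean

If the variable follows a normal distribution $N(\mu, \sigma^2)$, the sample mean follows a normal distribution:

$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$, with $\sigma_{\bar{X}} = \sigma/\sqrt{n}$

If the distribution of the variable is skewed:

For a small sample, the distribution of the sample mean is skewed.

For a large sample, the distribution of the sample mean is close to a normal distribution:

$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$ -- from the Central Limit Theorem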
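The effect of sample size on a skewed variable can be checked with a small simulation. This is only a sketch: the exponential distribution is an assumed stand-in for "a skewed distribution" (the slides do not say which one was used), and the names `skewness` and `sample_mean_skewness` are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the SD cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean((v - m) ** 3 for v in values) / sd ** 3

def sample_mean_skewness(n, reps=5000, seed=1):
    """Skewness of `reps` sample means, each computed from n exponential draws."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
             for _ in range(reps)]
    return skewness(means)

if __name__ == "__main__":
    for n in (1, 5, 10, 20, 30):
        # Skewness near 0 indicates a roughly symmetric, normal-like shape.
        print(f"n={n:2d}  skewness of sample means={sample_mean_skewness(n):.2f}")
```

The skewness of the sample means should move toward 0 as n increases, in line with the Central Limit Theorem.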


1.3 Standard error

• Standard deviation of the population: $\sigma$
• Standard deviation of the sample mean, also called the standard error of the sample mean or simply the standard error: $\sigma_{\bar{X}}$
• In any case:

Standard error of sample mean = (standard deviation of the population) / $\sqrt{n}$

or

$\sigma_{\bar{X}} = \sigma/\sqrt{n}$


•In practice

$S_{\bar{X}} = S/\sqrt{n}$

$S$ is an estimate of $\sigma$; $S_{\bar{X}}$ is an estimate of $\sigma_{\bar{X}}$.
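As a small illustration, the estimated standard error can be computed directly from a sample. The function name `standard_error` and the blood pressure readings are hypothetical, chosen only to show the arithmetic.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the mean: S / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD, with n - 1 in the denominator
    return s / math.sqrt(n)

if __name__ == "__main__":
    # Hypothetical systolic blood pressure readings (mmHg), for illustration only.
    readings = [112, 125, 118, 130, 109, 121, 116, 124]
    print(f"mean = {statistics.fmean(readings):.2f}")
    print(f"S    = {statistics.stdev(readings):.2f}")
    print(f"SE   = {standard_error(readings):.2f}")
```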


1.4 Student's t distribution

•The t distribution was discovered by William S. Gosset (1876 - 1937) in 1908.

•"Student" is his pen name.

For a normal distribution $N(\mu, \sigma^2)$,

$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$

If

$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

then Z follows a standard normal distribution, N(0,1).


•When σ is unknown,

$t = \frac{\bar{X} - \mu}{S_{\bar{X}}} = \frac{\bar{X} - \mu}{S/\sqrt{n}}$

t follows a t distribution.

[Figure: t curve, centered at 0]


•The properties of the t distribution

I. Symmetric about its center

•Center is 0.

II. ν -- shape parameter

•also called the degrees of freedom, ν = n-1.

•determines the shape of a t curve.
•different ν, different t curve. As ν increases, the t curve gets closer to the standard normal curve; as ν → ∞, the t curve becomes the standard normal curve. See this animation

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.

--From Wikipedia


•The different t curves

[Figure: density f(t) of the t distribution for ν = 1, ν = 4 and ν = ∞ (standard normal curve)]
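The curves can be compared numerically with the standard density formula of the t distribution, $f(t) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} (1 + t^2/\nu)^{-(\nu+1)/2}$. The formula is not given on the slides; the sketch below assumes it, and the function names are hypothetical.

```python
import math

def t_pdf(t, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    # Work on the log scale so that large nu does not overflow the gamma function.
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(t * t / nu))

def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

if __name__ == "__main__":
    for nu in (1, 4, 30, 1000):
        print(f"nu={nu:4d}  f(0)={t_pdf(0, nu):.4f}  f(3)={t_pdf(3, nu):.4f}")
    print(f"normal   f(0)={normal_pdf(0):.4f}  f(3)={normal_pdf(3):.4f}")
```

For small ν the peak at 0 is lower and the tails are heavier than the normal curve; the values approach the normal ones as ν grows.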


III. The area under the t curve

•The Table for t distribution.
•The t value is denoted $t_{\alpha,\nu}$; α is a probability, ν is the degrees of freedom, ν = n-1.
•The area under the t curve means:

One side : $P(t \le -t_{\alpha,\nu}) = \alpha$ or $P(t \ge t_{\alpha,\nu}) = \alpha$

Two sides : $P(t \le -t_{\alpha,\nu}) + P(t \ge t_{\alpha,\nu}) = \alpha$

•See next figure


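[Figure: t curve with ν degrees of freedom; the tails beyond $-t_{\alpha,\nu}$ and $t_{\alpha,\nu}$ each have area α/2]

•The meaning of the area under the t curve for two sides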
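The two-sided area can also be checked without a printed table. The sketch below is one possible approach, not the method used to build the table: it integrates the standard t density with Simpson's rule and finds the two-sided critical value by bisection. All function names are hypothetical, and the search assumes the critical value lies below 100.

```python
import math

def t_pdf(t, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(t * t / nu))

def central_area(x, nu, steps=2000):
    """P(-x < t < x): Simpson's rule on [0, x], doubled by symmetry."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 2 * total * h / 3

def two_sided_critical(alpha, nu, upper=100.0):
    """Find x with P(t <= -x) + P(t >= x) = alpha by bisection on [0, upper]."""
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - central_area(mid, nu) > alpha:
            lo = mid  # both tails together still exceed alpha: move right
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    for alpha, nu in ((0.05, 19), (0.01, 19), (0.05, 1)):
        print(f"alpha={alpha}, nu={nu}: t = {two_sided_critical(alpha, nu):.3f}")
```

For α = 0.05 and ν = 19 the result should agree with the two-sided table value 2.093 to three decimals.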


1.5 Confidence Interval of Population Mean

Statistical inference
  Estimation of parameter
    point estimation
    interval estimation
  Hypothesis testing

Point estimation of population mean -- sample mean

Interval estimation of population mean -- (1-α) confidence interval

Confidence level: 1-α, such as 95% or 99%.


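•From $P(t \le -t_{\alpha,\nu}) + P(t \ge t_{\alpha,\nu}) = \alpha$

•We can get

$P(-t_{\alpha,\nu} < t < t_{\alpha,\nu}) = 1 - \alpha$

$P\left(-t_{\alpha,\nu} < \frac{\bar{X} - \mu}{S_{\bar{X}}} < t_{\alpha,\nu}\right) = 1 - \alpha$

$P(\bar{X} - t_{\alpha,\nu} S_{\bar{X}} < \mu < \bar{X} + t_{\alpha,\nu} S_{\bar{X}}) = 1 - \alpha$

$\bar{X} \pm t_{\alpha,\nu} S_{\bar{X}}$

•It is the formula of the (1-α) confidence interval of the population mean for two sides.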


•With 1-α = 95% or 99%, the (1-α) confidence interval of the population mean can be abbreviated to 95% CI or 99% CI.

Whenever we get a mean and standard deviation from a sample, put them into

$\bar{X} \pm t_{\alpha,\nu} S_{\bar{X}}$

then the interval is

$(\bar{X} - t_{\alpha,\nu} S_{\bar{X}},\ \bar{X} + t_{\alpha,\nu} S_{\bar{X}})$

•The two extreme values are called confidence limits.


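Example: Systolic blood pressures of 20 healthy males were measured: $\bar{X} = 118.4$ mmHg, $S = 10.8$ mmHg.

What is the 95% confidence interval of the population mean?

$n = 20,\ \bar{X} = 118.4\ \text{mmHg},\ S = 10.8\ \text{mmHg}$

$S_{\bar{X}} = S/\sqrt{n} = 10.8/\sqrt{20} = 2.415$

$\nu = n - 1 = 20 - 1 = 19,\ \alpha = 0.05$

$t_{0.05,19} = 2.093$ (from the Table of t distribution)

$\bar{X} - t_{0.05,19} S_{\bar{X}} = 118.4 - 2.093 \times 2.415 = 113.3$

$\bar{X} + t_{0.05,19} S_{\bar{X}} = 118.4 + 2.093 \times 2.415 = 123.5$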
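The same arithmetic as a short sketch. The function name `confidence_interval` is hypothetical; the inputs are the values of the example, with the critical value 2.093 taken from the t table rather than computed.

```python
import math

def confidence_interval(mean, sd, n, t_crit):
    """Two-sided CI for the population mean: mean ± t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)  # estimated standard error of the mean
    return mean - t_crit * se, mean + t_crit * se

if __name__ == "__main__":
    # Values from the example: n = 20, mean 118.4 mmHg, S = 10.8 mmHg,
    # and t(0.05, 19) = 2.093 read from the t table.
    lower, upper = confidence_interval(118.4, 10.8, 20, 2.093)
    print(f"SE = {10.8 / math.sqrt(20):.3f}")
    print(f"95% CI: ({lower:.1f}, {upper:.1f}) mmHg")
```

Run as a script, this prints SE = 2.415 and 95% CI: (113.3, 123.5) mmHg.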


95% CI: (113.3, 123.5) mmHg

•What does "confidence interval" mean?

[Figure labels: (1-α) CI; does not include μ]


You should know:

Once you have a 95% confidence interval for a certain population mean, the μ of this population may be in it and may also not be in it. The 95% describes the method: in repeated sampling, about 95% of the intervals constructed this way contain μ!
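What the 95% refers to can be seen in a repeated-sampling sketch. It assumes, purely for illustration, a normal population with μ = 118.4 and σ = 10.8 (borrowed from the blood pressure example), samples of n = 20, and the table value 2.093; the function name `coverage` and the seed are hypothetical.

```python
import math
import random
import statistics

def coverage(mu=118.4, sigma=10.8, n=20, t_crit=2.093, reps=5000, seed=2):
    """Share of repeated-sample confidence intervals that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        # Each repetition yields a different interval; mu itself stays fixed.
        if mean - t_crit * se < mu < mean + t_crit * se:
            hits += 1
    return hits / reps

if __name__ == "__main__":
    print(f"share of intervals containing mu: {coverage():.3f}")
```

The share should come out close to 0.95: any single interval either contains μ or does not, and the 95% describes the long-run behaviour of the procedure.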

(Guilin Pagodas http://en.wikipedia.org/wiki/Guilin)


In the figure, the red curve is the standard normal curve, the blue curve is the t curve, and df is ν (degrees of freedom).