[email protected] Chi-Square Test and Goodness-of-Fit Testing Ming-Tsung Hsu


Chi-Square Test and Goodness-of-Fit Testing

Ming-Tsung Hsu


Outline

Goal of Hypothesis Test
Terms & Notation
Chi-Square Test
Goodness-of-Fit Testing
Example


Goal of Hypothesis Test

To examine statistical evidence and determine whether it supports or contradicts a claim, for example:
  The life of lamps is more than 10,000 hours
  The data are from a normal distribution

To reduce the directly-relevant data to a “level of suspicion” based purely on the data


Terms & Notation

Null Hypothesis (H0) vs. Alternative Hypothesis (H1 or HA)
Type I Error vs. Type II Error
Parametric Test vs. Non-Parametric Test
Significance Level (α) and Critical Region
“Reject H0” vs. “Do not reject H0”
Central Limit Theorem
Sampling distribution of the sample mean
Test Statistic vs. Table Value
P-value


Null Hypothesis vs. Alternative hypothesis

$$H_0: \text{Data are from a normal distribution} \qquad H_1: \text{Not } H_0$$

$$H_0: \mu = \mu_0 \qquad H_1: \mu \neq \mu_0$$


Type I Error vs. Type II Error

Type I error: H0 is true, but H0 is rejected
Pr(reject H0 | H0 is true) = α

Type II error: H1 is true, but H0 is not rejected
Pr(do not reject H0 | H1 is true) = β
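To make α concrete, the following is a minimal simulation sketch: it repeatedly draws data for which H0 is true and counts how often a two-sided Z test rejects. Everything specific here is an assumption for illustration and not from the slides: standard normal data with known σ = 1, α = 0.05, the sample size, the number of replications, and the function name `estimate_type_i_error`.

```python
import random
from statistics import NormalDist, fmean


def estimate_type_i_error(alpha=0.05, n=30, reps=4000, seed=1):
    """Estimate Pr(reject H0 | H0 is true) for a two-sided Z test by simulation."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # table value Z_{1-alpha/2}
    rejections = 0
    for _ in range(reps):
        # Data are generated under H0: mean 0, known standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / n ** 0.5)
        if abs(z) > critical:
            rejections += 1
    return rejections / reps


rate = estimate_type_i_error()
print(f"estimated Type I error rate: {rate:.3f}")
```

The estimated rejection rate should land close to the chosen α, which is what "Pr(reject H0 | H0) = α" states.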


Parametric Test vs. Non-Parametric Test

Parametric Test
  Tests about parameters of the population
  Mean test, variance test, etc.

Non-Parametric Test
  Makes no assumptions about the frequency distributions of the variables being assessed
  Independence test, distribution test, etc.


Significance level (α) and Critical Region


Central Limit Theorem

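Central Limit Theorem (CLT): If $X_1, \ldots, X_n$ is a random sample from a distribution with mean $\mu$ and variance $\sigma^2$, then the limiting distribution of

$$Z_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n}\,\sigma}$$

is the standard normal, that is, $Z_n \xrightarrow{d} Z \sim N(0, 1)$ as $n \to \infty$.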
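The Central Limit Theorem can be checked numerically. The sketch below standardizes sums of exponential random variables, Z_n = (ΣX_i − nμ) / (√n σ), and looks at whether the results behave like N(0, 1). The choice of an exponential population, the rate 0.246, the sample size, the replication count, and the function name `standardized_sums` are all assumptions made for illustration.

```python
import random
from statistics import fmean, pvariance


def standardized_sums(n=50, reps=2000, rate=0.246, seed=7):
    """Draw `reps` samples of size n from Exp(rate) and return each Z_n."""
    rng = random.Random(seed)
    mu = 1.0 / rate     # mean of the exponential distribution
    sigma = 1.0 / rate  # its standard deviation equals its mean
    zs = []
    for _ in range(reps):
        total = sum(rng.expovariate(rate) for _ in range(n))
        zs.append((total - n * mu) / (n ** 0.5 * sigma))
    return zs


zs = standardized_sums()
z_mean = fmean(zs)
z_var = pvariance(zs)
# Share of standardized sums inside (-1.96, 1.96); about 0.95 for N(0, 1).
inside = sum(abs(z) < 1.96 for z in zs) / len(zs)
print(f"mean={z_mean:.3f} variance={z_var:.3f} inside={inside:.3f}")
```

Even though the exponential distribution is strongly skewed, the standardized sums should show a mean near 0, a variance near 1, and roughly 95% of values within ±1.96.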


Test Statistic vs. Table Value

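Test Statistic (T.S.):

$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$

Table Value (T.V.): $Z_{1-\alpha}$ (One-Sided), $Z_{1-\alpha/2}$ (Two-Sided)

$$Z_{0.95} = 1.645, \quad Z_{0.975} = 1.96, \quad Z_{0.99} = 2.326$$

Decision Rule: $|Z| > \text{T.V.} \Rightarrow$ Rej. $H_0$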
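As a sketch of how the test statistic is compared with the table value, the code below computes Z = (X̄ − μ0) / (σ/√n) and looks up the standard normal quantile with Python's `statistics.NormalDist`. The lamp numbers (sample mean 10,300 hours, σ = 1,000, n = 50) are invented for illustration, and the helper names `table_value` and `z_statistic` are hypothetical.

```python
from statistics import NormalDist


def table_value(alpha, two_sided=True):
    """Standard normal quantile Z_{1-alpha/2} (two-sided) or Z_{1-alpha} (one-sided)."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - mu0) / (sigma / sqrt(n)), with sigma assumed known."""
    return (sample_mean - mu0) / (sigma / n ** 0.5)


# Hypothetical data: H0 says the mean lamp life is 10,000 hours.
z = z_statistic(sample_mean=10300.0, mu0=10000.0, sigma=1000.0, n=50)
tv = table_value(alpha=0.05, two_sided=True)
reject = abs(z) > tv
print(f"Z={z:.4f} T.V.={tv:.3f} reject H0: {reject}")
```

With these made-up numbers the statistic exceeds the two-sided table value at α = 0.05, so H0 would be rejected.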


P-value

$$p\text{-value} = P(X \ge x \mid H_0)$$

Decision rule: $p\text{-value} < \alpha$ (one-sided) or $p\text{-value} < \alpha/2$ (two-sided) $\Rightarrow$ Rej. $H_0$
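For a Z statistic, the p-value can be sketched as a tail probability of the standard normal distribution. This version doubles the tail for a two-sided test and compares the result with α, which is equivalent to comparing the single tail with α/2. The standard normal null distribution and the function name `p_value` are assumptions for illustration.

```python
from statistics import NormalDist


def p_value(z, two_sided=True):
    """Tail probability of an observed Z statistic under a standard normal null."""
    if two_sided:
        # Both tails: double the upper-tail area beyond |z|.
        return 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    # One-sided (upper tail) alternative.
    return 1.0 - NormalDist().cdf(z)


p_two = p_value(1.96)                    # two-sided p-value at the 1.96 table value
p_one = p_value(1.645, two_sided=False)  # one-sided p-value at the 1.645 table value
print(f"two-sided: {p_two:.4f}  one-sided: {p_one:.4f}")
```

Both calls return approximately 0.05, which shows that the p-value rule and the table-value rule describe the same critical region.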


Chi-Square Test

Non-Parametric Test
T.S. ~ $\chi^2(\nu)$

Goodness-of-Fit Test
  Also known as “Pearson's chi-square test”
Independence Test
Homogeneity Test


Goodness-of-Fit Testing

Used to test if a sample of data came from a population with a specific distribution

T.S.:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2(k - 1 - m)$$

$O_i$: Observed frequency of the ith group
$E_i$: Expected frequency of the ith group
$k$: Number of groups
$m$: Number of estimated parameters
$k - 1 - m$: Degrees of freedom
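The goodness-of-fit statistic, the sum over groups of (O_i − E_i)² / E_i with k − 1 − m degrees of freedom, translates directly into code. The sketch below uses invented counts, and the function names `chi_square_statistic` and `degrees_of_freedom` are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Pearson statistic: sum of (O_i - E_i)^2 / E_i over the k groups."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(k, m):
    """k groups and m estimated parameters give k - 1 - m degrees of freedom."""
    return k - 1 - m


# Hypothetical counts: 60 observations tested against three equally likely groups.
observed = [10, 20, 30]
expected = [20.0, 20.0, 20.0]
stat = chi_square_statistic(observed, expected)
df = degrees_of_freedom(k=3, m=0)
print(f"chi-square={stat} df={df}")
```

Here no parameter is estimated from the data, so m = 0; when parameters of the hypothesized distribution are estimated first, each one costs a further degree of freedom.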


Example


Parameter Estimation - λ

The MLE of $\lambda$:

$$\hat{\lambda} = \frac{1}{\bar{t}} = \frac{1}{4.06} = 0.246$$
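The slide gives only the result. Assuming the data are modelled as exponential with rate $\lambda$ (an inference from the example's $F(t)$ values, which match $1 - e^{-0.246t}$, and not stated explicitly in the slides), the estimate follows from maximizing the log-likelihood of $n$ observations $t_1, \ldots, t_n$:

$$\ell(\lambda) = \sum_{i=1}^{n} \ln\left(\lambda e^{-\lambda t_i}\right) = n \ln \lambda - \lambda \sum_{i=1}^{n} t_i$$

$$\frac{d\ell}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} t_i = 0 \quad \Rightarrow \quad \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i} = \frac{1}{\bar{t}}$$

Since one parameter is estimated from the data, $m = 1$ in the degrees of freedom $k - 1 - m$.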


Observations and Expected Frequencies

C.F.: cumulative expected frequency; E.F.: expected frequency

Interval      Obs   t      F(t) = p(T < t)   C.F.       E.F.
0 ~ < 1       14    1      0.218078          12.86659   12.86659
1 ~ < 2.5     12    2.5    0.459359          27.10219   14.2356
2.5 ~ < 5     18    5      0.707707          41.75474   14.65255
5 ~ < 7.5     5     7.5    0.841975          49.67651   7.921768
7.5 ~ < 10    5     10     0.914565          53.95934   4.282832
≧10           5     ≧10    1                 59         5.040662
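The expected frequencies in this table can be reproduced with a short sketch. It assumes an exponential model, F(t) = 1 − exp(−λt), with λ = 0.246 and n = 59 observations; the exponential form is inferred from the tabulated F(t) values, and the function names are hypothetical.

```python
import math


def exponential_cdf(t, rate):
    """F(t) = 1 - exp(-rate * t) for an exponential distribution."""
    return 1.0 - math.exp(-rate * t)


def expected_frequencies(upper_bounds, rate, n):
    """Expected count per interval; the final interval is open-ended."""
    # Cumulative expected frequency at each boundary, then n for the open tail.
    cumulative = [n * exponential_cdf(t, rate) for t in upper_bounds] + [float(n)]
    previous = [0.0] + cumulative[:-1]
    return [c - p for c, p in zip(cumulative, previous)]


expected = expected_frequencies([1, 2.5, 5, 7.5, 10], rate=0.246, n=59)
for value in expected:
    print(f"{value:.4f}")
```

Each expected frequency is the difference between consecutive cumulative frequencies, so the six values sum to n = 59.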



Test Statistic and P-value

T.S.:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = 2.4137$$

p-value:

$$P(\chi^2 > 2.4137 \mid \chi^2(6 - 1 - 1)) = 0.66$$
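The statistic and p-value for this example can be recomputed from the observed and expected frequencies in the table. To stay within the standard library, this sketch uses the closed-form chi-square tail probability that holds only for an even number of degrees of freedom (here 6 − 1 − 1 = 4); the function name `chi_square_sf_even_df` is hypothetical.

```python
import math


def chi_square_sf_even_df(x, df):
    """P(chi2(df) > x) using the closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    # exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
    return math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(df // 2)
    )


observed = [14, 12, 18, 5, 5, 5]
expected = [12.86659, 14.2356, 14.65255, 7.921768, 4.282832, 5.040662]

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 1  # k groups, minus 1, minus m = 1 estimated parameter
p = chi_square_sf_even_df(stat, df)
print(f"chi-square={stat:.4f} df={df} p-value={p:.2f}")
```

A p-value this large gives no evidence against H0 at the usual significance levels. For odd degrees of freedom a general routine such as a statistics library's chi-square survival function would be needed instead.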


Observations and Expected Frequencies - Paper

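T.S.:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = 2.043$$

p-value:

$$P(\chi^2 > 2.043 \mid \chi^2(6 - 1 - 1)) = 0.72785$$

[Figure: observed and expected frequencies as reported in the paper. Recoverable labels: 18; 12.87, 14.24, 14.65, 7.92, 4.28, 5.04]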


Re-Grouping

ID   Lower   Upper   Freq.
1    0.3     3.3     30
2    3.3     6.3     17
3    6.3     9.3     6
4    9.3     12.3    3
5    12.3    15.3    1
6    15.3    18.3    1
7    18.3    21.3    1

Obs   t       F(x)    C.F.     E.F.
30    3.3     0.556   32.812   32.813
17    6.3     0.788   46.486   13.673
6     9.3     0.899   53.020   6.534
6     ≧9.3    1       59       5.980

T.S.:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = 1.0942$$

p-value:

$$P(\chi^2 > 1.0942 \mid \chi^2(4 - 1 - 1)) = 0.579$$

Number of groups = 1 + 3.322 × log10(n) (Sturges' rule); for n = 59 this gives about 6.9, rounded up to 7 groups
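The re-grouping step can be sketched in two parts: choosing the number of classes with the rule 1 + 3.322 log10(n), then pooling sparse classes in the upper tail. Rounding up to a whole number and pooling on observed counts until the last class holds at least 5 are assumptions for illustration (the usual rule of thumb is stated in terms of expected counts), and both function names are hypothetical.

```python
import math


def sturges_groups(n):
    """Number of classes from 1 + 3.322 * log10(n), rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))


def pool_sparse_tail(freqs, minimum=5):
    """Merge the last class into its neighbour until it holds at least `minimum`."""
    pooled = list(freqs)
    while len(pooled) > 1 and pooled[-1] < minimum:
        last = pooled.pop()
        pooled[-1] += last
    return pooled


k = sturges_groups(59)
pooled = pool_sparse_tail([30, 17, 6, 3, 1, 1, 1])
print(f"groups={k} pooled frequencies={pooled}")
```

Applied to the seven equal-width classes, the pooling reproduces the four groups used in the re-grouped test, with the open-ended last class (≧9.3) holding 6 observations.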
