STATISTICS Advanced Higher Chi-squared test

Chi-squared test: finding if there is a significant association between sets of data.

Lesson Objectives
1. Explain why it is used.
2. List the advantages and disadvantages.
3. Understand how to apply the statistical test.
4. Apply it to a relevant context.

Chi-squared test: looking for a difference

The situation
A group of students have visited the Lake District National Park to investigate the impact of tourism upon the landscape. One of their data collection techniques is to record the number of traditional and modern looking houses in 8 villages inside the National Park boundary and 8 villages outside the boundary line...

What should they do?
• What data should they have collected to complete this investigation?
• How much data should they collect?
• How can they make sure that the data is reliable?
• What initial data representation skill could they utilise to discover an initial impression?
• What statistical test should they use to confidently state there is or is not a relationship?

- What did you observe? (What data did you actually collect?)

- What would you expect if there was no association?

O = the Observed frequency (what you actually counted)

E = the Expected frequency (what you would expect if there was no association)

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Traditional houses: we found 180 traditional homes inside the National Park and 23 outside.

Modern houses: we found 103 modern homes inside the National Park and 452 outside.

Null Hypothesis: There is no significant difference between building ages inside and outside of the National Park.

Alternative Hypothesis: There is a significant difference between building ages inside and outside of the National Park.

TESTING THE RELATIONSHIP

1st: construct a table with the data that you have observed

OBSERVED FREQUENCIES

                     Inside the national park   Outside the national park   Row total
Traditional houses   180                        23                          203
Modern houses        103                        452                         555
Column total         283                        475                         758

2nd: work out the expected frequency

$$\text{Expected frequency} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

EXPECTED FREQUENCIES

                     Inside the national park   Outside the national park   Row total
Traditional houses   75.8                       127.2                       203
Modern houses        207.2                      347.8                       555
Column total         283                        475                         758
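The expected-frequency rule can be sketched in Python as a quick way to check hand calculations. This is only an illustration using the house counts quoted earlier; the function name `expected_frequencies` is an assumption made for this sketch, not part of the lesson.

```python
# Observed counts quoted in the lesson.
# Rows are house types, columns are locations (inside, outside the park).
observed = [
    [180, 23],   # traditional houses
    [103, 452],  # modern houses
]


def expected_frequencies(table):
    """Return expected counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    # Round only for display; keep full precision for later calculations.
    print([round(value, 1) for value in row])
```

A useful self-check is that each row and column of the expected table should add up to the same totals as the observed table.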

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
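As a rough sketch of how the formula is applied, the Python below sums (O − E)² / E over every cell of the observed table, working out each E from the row, column and grand totals. The function name `chi_squared` is made up for this example, and no continuity correction is applied, which is an assumption based on the formula as written.

```python
def chi_squared(observed):
    """Sum (O - E)^2 / E over every cell of a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if there were no association.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic


# House counts quoted in the lesson: [inside, outside] for each house type.
observed = [[180, 23], [103, 452]]
chi2 = chi_squared(observed)
print(round(chi2, 2))
```

For these counts the statistic comes out at a little over 312. The larger the gap between observed and expected counts, the larger the value.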

Degrees of Freedom   0.05 (95%)   0.01 (99%)
1                    3.84         6.64
2                    5.99         9.21
3                    7.82         11.34
4                    9.49         13.28
5                    11.07        15.09
6                    12.59        16.81
7                    14.07        18.48
8                    15.51        20.09
9                    16.92        21.67
10                   18.31        23.21
11                   19.68        24.72
12                   21.03        26.22
13                   22.36        27.69
14                   23.68        29.14
15                   25.00        30.58
16                   26.30        32.00
17                   27.59        33.41
18                   28.87        34.80
19                   30.14        36.19
20                   31.41        37.57
30                   43.77        50.89

FINAL STATEMENT

IF χ² IS HIGHER THAN OR EQUAL TO THE CRITICAL VALUE, REJECT THE NULL HYPOTHESIS AND ACCEPT THE ALTERNATIVE.

As χ² is (greater than / less than) the Critical Value I can (accept / reject) the Null Hypothesis and (accept / reject) the Alternative Hypothesis.

Therefore I can state that there (is no / is a) significant association… to a significance level of 0.05 (95% sure results have not occurred by chance).

CALCULATE THE DEGREES OF FREEDOM: (Number of Rows - 1) × (Number of Columns - 1)

χ² value of ____ is higher than 3.84 and 6.64 so…
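The degrees-of-freedom rule and the decision rule can be sketched together in Python. The critical values are copied from the first four rows of the table, the helper names are invented for this sketch, and the χ² value of 4.5 is a made-up number chosen only to show a result that passes at 0.05 but not at 0.01.

```python
# Critical values for 1 to 4 degrees of freedom, copied from the table.
CRITICAL_0_05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49}
CRITICAL_0_01 = {1: 6.64, 2: 9.21, 3: 11.34, 4: 13.28}


def degrees_of_freedom(n_rows, n_cols):
    """(number of rows - 1) x (number of columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def reject_null(chi2, df, critical_values):
    """True when chi2 is higher than or equal to the critical value."""
    return chi2 >= critical_values[df]


# A table with 2 rows and 2 columns (totals are not counted).
df = degrees_of_freedom(2, 2)

# Hypothetical chi-squared value, for illustration only.
example_chi2 = 4.5
print(df)
print(reject_null(example_chi2, df, CRITICAL_0_05))
print(reject_null(example_chi2, df, CRITICAL_0_01))
```

Only the data rows and columns count towards the degrees of freedom; the row-total and column-total entries are left out.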

Reasons to use it
• It allows you to identify if there is a difference or a relationship between two characteristics.
• It is simple to carry out.
• It compares the data that you have observed with what you would expect to happen.

Disadvantages of using it
• The data must be in the form of frequencies.
• The frequency of the data must have a precise numerical value and be able to be organised into categories or groups.
• The total number of observations must be more than 20.
• The expected frequency in any one cell of the table must be more than 5.
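The last two conditions in the list can be checked before running the test. The sketch below is one possible way to do that in Python; the function name `check_conditions` and the small second table are assumptions made up for illustration.

```python
def check_conditions(observed):
    """Check the two numerical conditions listed for the chi-squared test."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count for every cell: row total * column total / grand total.
    expected = [r * c / grand_total for r in row_totals for c in col_totals]
    return {
        "total_over_20": grand_total > 20,
        "expected_over_5": all(e > 5 for e in expected),
    }


# House counts quoted in the lesson.
houses = check_conditions([[180, 23], [103, 452]])
# A deliberately small, made-up table that breaks both conditions.
small = check_conditions([[3, 2], [4, 1]])
print(houses)
print(small)
```

Note that the check uses the expected frequencies, not the observed ones, because that is how the condition is worded in the list.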

State the answer in terms of the alternative hypothesis:

There is a significant association between housing age inside and outside of the Lake District National Park.

• Sometimes buildings are built recently but designed to look old.
• The survey may have counted unused farm buildings as traditional even though they are not necessarily used as homes.
• It is uncertain how the survey determined what was modern or traditional.
• The survey indicates that villages inside of the Park are smaller.
• Perhaps there is a static village size and new buildings aren't being built.

Referring to a National Park that you have studied, comment on the results shown in this test.

Justify the suitability of using the chi-squared test.

You compare the observed data with the data that you would expect. Looking for a difference between O & E.

If there is a difference, then there is an association!

Reason to use this test:
• If you have categorical data (e.g. blue eyes)
• Means are not a category. Colours, for example, are.

Must have:
• More than one category
• A minimum of 5 in each one