Computing in Archaeology
Session 10. Statistical tests of significance
© Richard Haddlesey www.medievalarchitecture.net
Aims
To understand what we mean by statistical significance and archaeological significance
To test significance (through the Null hypothesis and Chi-squared testing)
Key text: Fletcher & Lock (2nd Ed) 2005. Digging Numbers. Oxford. 63-5, 128-38
Is it significant?
What do we mean by significant?
Choosing a test
positively skewed distribution: non-parametric test
normal distribution: parametric test
Hypothesis testing
Before we can test significance we must formulate two hypotheses
So what do we mean by hypothesis testing?
Theories abound in archaeology, although many of them cannot be tested in any way, let alone in the formal way described throughout this lecture
Hypothesis testing
A test must be repeatable, not just by you, but by anyone who has access to the data set
A hypothesis, therefore, must represent a quantifiable relationship and it is this relationship which is tested formally
We could say that all hypotheses are theories whereas not all theories are hypotheses
Example hypothesis
In order to illustrate the logic of a hypothesis test, consider testing the hypothesis that at least 40% of all bronze spearheads come from burials
Fletcher & Lock, 63
Step 1 – formulate two hypotheses
• null hypothesis (H0)
• alternative hypothesis (H1)
Step One: H0 & H1
This should be done so that one and only one must be true
In this case we would have:
• H0: proportion of bronze spears from burials is ≥40%
• H1: proportion of bronze spears from burials is <40%
Step 2 – take measurements
Step Two
Take a suitable measurement or observation from which a test statistic and its associated probability (step 3) can be calculated
Here we have a sample of 20 bronze spearheads, 7 of which have been found in burials (this is the observed result)
So far so good!
Step 3 – calculate test statistic
Step 3: the difficult bit
Here we calculate a test statistic which can then be tested for significance in step 4.
The test statistic allows for the calculation of the probability of the observed result, which is often called the p-value
Step 3: continued
If H0 is true and at least 40% of all bronze spearheads do come from burials, what is the probability of a sample of 20 containing 7 from burials?
P(burial) = 0.40 and so P(not burial) = 0.60
P(not burial for 1st & 2nd) = (0.60)(0.60) = (0.60)^2
hence P(not burial for 13 (20-7)) = (0.60)^13 = 0.0013
The p-value (probability of the observed result) is 0.0013 or 0.13%
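The arithmetic above can be checked with a short sketch. Python is an assumed choice (the slides use no code) and all names are illustrative. It reproduces (0.60)^13 and, assuming a binomial model that the slides do not state, also prints the probability of exactly 7 burial finds in 20 and of 7 or fewer.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, observed, p_burial = 20, 7, 0.40

# The slide's figure: 13 specified spearheads all being non-burial finds.
slide_value = (1 - p_burial) ** (n - observed)

# Assumption (not in the slides): burial finds follow a binomial model
# with P(burial) = 0.40, the boundary value allowed by H0.
exactly_seven = binomial_pmf(observed, n, p_burial)
seven_or_fewer = sum(binomial_pmf(k, n, p_burial) for k in range(observed + 1))

print(f"(0.60)^13           = {slide_value:.4f}")
print(f"P(exactly 7 of 20)  = {exactly_seven:.4f}")
print(f"P(7 or fewer of 20) = {seven_or_fewer:.4f}")
```

The binomial figures are larger than (0.60)^13 because they also count the number of ways of choosing which 13 spearheads are the non-burial finds and include the factor for the 7 burial finds. A one-sided test of H0 would conventionally use the "7 or fewer" tail.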
Step 4 – calculate significance
Step 4: testing the hypotheses
Remember that the null hypothesis is being tested. The significance of the test statistic will determine whether the Null Hypothesis is accepted or rejected
There are set conventions for significance testing and these will guide our discussion
Step 4: continued
Common significance levels used in the social sciences are:
• p<0.10 reject at the 10% level
• p<0.05 reject at the 5% level
• p<0.01 reject at the 1% level
• p<0.001 reject at the 0.1% level
The 5% level is often used within archaeology
Step 4: continued
If p<0.05 (5%) reject H0 at the 5% level and conclude that there is significant evidence to show that the percentage of bronze spearheads from burials is less than 40% (in other words, if H0 is rejected H1 must be accepted)
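The decision rule can be sketched as a small function. This is a minimal illustration in Python (an assumed choice of language). The names `decide` and `strictest_level` are hypothetical, and the "do not reject" branch is the conventional wording for the case where p is not below the chosen level.

```python
# Conventional significance levels, strictest first.
LEVELS = [0.001, 0.01, 0.05, 0.10]


def decide(p_value, alpha=0.05):
    """Return the conclusion about H0 at significance level alpha."""
    label = f"{alpha * 100:g}%"
    if p_value < alpha:
        return f"reject H0 at the {label} level"
    return f"do not reject H0 at the {label} level"


def strictest_level(p_value):
    """Return the smallest conventional level at which H0 is rejected, or None."""
    for level in LEVELS:
        if p_value < level:
            return level
    return None


print(decide(0.0013))           # the p-value quoted in step 3
print(decide(0.20))             # a hypothetical larger p-value
print(strictest_level(0.0013))
```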
What does this mean?
We can now conclude that we are 95% certain that the percentage of bronze spearheads from burials is less than 40%
If, however, the p-value had been greater than 0.05, the conclusion would have been not to reject H0 at the 5% level
Confidence interval    Probability
90%                    p=0.10
95%                    p=0.05
99%                    p=0.01
p<0.10 reject at the 10% level
p<0.05 reject at the 5% level
p<0.01 reject at the 1% level
Chi-squared test
The Chi-squared Test was developed by Karl Pearson in 1900 to test if a contingency table provides significant evidence of an association between two variables
It can be used for both nominal and ordinal levels, though it is better suited to nominal data
sample:
• 40 spearheads
variables:
• material – iron/bronze
• loop – yes/no
First we need to display the data in a contingency table
Chi-squared explained
It’s a method of comparing the observed frequencies (the data) with those expected under the null hypothesis of no association between two variables
bivariate frequency table
          No loop   Loop   Total
Iron      20        0      20
Bronze    9         11     20
Total     29        11     40
• Is there any association between the two variables?
• How strong is the association between the two variables?
expected frequency (E) = (row total)(column total) / (overall total)
Iron with loop: expected frequency (E) = (20)(11) / (40) = 5.5
Observed frequencies, with expected frequencies in brackets:
          No loop     Loop       Total
Iron      20 (14.5)   0 (5.5)    20
Bronze    9 (14.5)    11 (5.5)   20
Total     29          11         40
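The expected frequencies can be reproduced for every cell with a short sketch. Python is an assumed choice and the variable names are illustrative. The observed counts are those of the spearhead table.

```python
# Observed counts from the spearhead table: rows are material, columns are loop.
observed = {
    "Iron": {"No loop": 20, "Loop": 0},
    "Bronze": {"No loop": 9, "Loop": 11},
}

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {
    col: sum(observed[row][col] for row in observed)
    for col in ("No loop", "Loop")
}
overall = sum(row_totals.values())

# E = (row total)(column total) / (overall total) for every cell.
expected = {
    row: {col: row_totals[row] * col_totals[col] / overall for col in col_totals}
    for row in observed
}

for row in observed:
    for col in col_totals:
        print(f"{row:6} {col:8} O={observed[row][col]:2d} E={expected[row][col]:.1f}")
```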
degrees of freedom (d.f.) = (r-1)(c-1) = (2-1)(2-1) = (1)(1) = 1
Critical values of the χ2 distribution
d.f. = 1
χ2 = 15.18
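The χ2 value can be checked with a short sketch, assuming the standard Pearson form of the statistic, the sum of (O - E)^2 / E over all cells, which the slides do not write out. Python is an assumed choice and the names are illustrative. The critical value 3.841 is the tabulated figure for d.f. = 1 at the 5% level.

```python
# Observed (O) and expected (E) frequencies for the four cells of the table.
cells = [
    (20, 14.5),  # iron, no loop
    (0, 5.5),    # iron, loop
    (9, 14.5),   # bronze, no loop
    (11, 5.5),   # bronze, loop
]

# Pearson's statistic: sum of (O - E)^2 / E over all cells.
chi_squared = sum((o - e) ** 2 / e for o, e in cells)

rows, cols = 2, 2
degrees_of_freedom = (rows - 1) * (cols - 1)

# Tabulated critical value of chi-squared for d.f. = 1 at the 5% level.
CRITICAL_5_PERCENT_DF1 = 3.841

print(f"chi-squared = {chi_squared:.2f}, d.f. = {degrees_of_freedom}")
if chi_squared > CRITICAL_5_PERCENT_DF1:
    print("reject H0 at the 5% level")
else:
    print("do not reject H0 at the 5% level")
```

Computed without intermediate rounding the statistic comes out at about 15.17. The 15.18 quoted above is consistent with rounding each term to two decimal places before summing, though that is an assumption about how the figure was obtained.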
Cramer’s V statistic
Cramer’s V statistic can be calculated to measure the strength of association
This gives us a value V between 0 and 1, with values close to 1 indicating a strong relationship
Cramer’s V statistic
V = √(χ2 / (n × m))
Where:
n = total of all frequencies (40)
m = the smaller of (c-1) and (r-1)
V = √(15.18 / ((40)(1))) = √0.3795 = 0.62
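The calculation of V can be sketched as follows. Python is an assumed choice and the function name `cramers_v` is hypothetical. The inputs are the values quoted above.

```python
from math import sqrt


def cramers_v(chi_squared, n, rows, cols):
    """Cramer's V = sqrt(chi-squared / (n * m)), m = smaller of (rows-1), (cols-1)."""
    m = min(rows - 1, cols - 1)
    return sqrt(chi_squared / (n * m))


v = cramers_v(chi_squared=15.18, n=40, rows=2, cols=2)
print(f"V = {v:.2f}")
```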
Summary