Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey www.medievalarchitecture.net


Page 1: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Computing in Archaeology

Session 12. Multivariate statistics

© Richard Haddlesey www.medievalarchitecture.net

Page 2: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Aims

To introduce the techniques of multivariate analysis
• Cluster analysis
• Correspondence analysis
• Principal components and factor analysis
• Multiple regression
• Discriminant analysis

Key text
• Fletcher & Lock 2005 Digging Numbers

Page 3: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Introduction to multivariate analysis

In earlier lectures we have seen examples of univariate analysis, using such techniques as simple bar charts, frequency tables of one variable and calculations of a simple sample mean

When 2 variables are involved, as in clustered bar charts and scatterplots, or when we are comparing the means of 2 groups or asking whether there is any association between 2 variables, we are using techniques of bivariate analysis

Page 4: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Introduction to multivariate analysis

With more than two variables, however, we are dealing with multivariate analysis

Page 5: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

SPSS

These techniques require the use of suitable statistical packages, such as SPSS, because of the considerable computation involved

Consequently, the approach of working examples by hand used in earlier lectures is not relevant here and we will not be going into the statistical and mathematical details behind the techniques

Page 6: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Techniques discussed

Type A: reduction and grouping
• Given several measurements (ordinal, interval or presence/absence) on each of many objects (i.e. several variables and many cases), is it possible to reduce the number of variables while still maintaining the information in the data?
• Using either the original variables or the new reduced set, can these objects be put into groups or clusters so that within each group the objects are similar but between groups there are interpretable differences?

Page 7: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Techniques discussed

Type B: prediction
• Given several measurements (ordinal, interval or presence/absence) on each of many objects (i.e. several variables and many cases), with one of the variables of particular interest, is it possible to predict this variable from the others and, if so, which variables are important in this prediction?

Page 8: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Type A techniques

Cluster analysis

Correspondence Analysis

Principal Components and Factor Analysis (PCA)

Page 9: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Type B techniques

Multiple regression

Discriminant analysis

Page 10: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Type A:
1. reduction and grouping
2. cluster analysis

We may wish to ask
• Can spearheads be grouped or clustered, so that those within a cluster are similar to each other but there are important differences between the clusters?
• i.e. if we group by dimension, thus creating clusters of like-sized spearheads, will it show a difference between the various size clusters?

Page 11: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Hierarchical cluster analysis

Most stats packages offer a standard clustering method called hierarchical cluster analysis

It starts by making each spearhead a single cluster. We then tell it how we want the clusters produced, and SPSS will merge the single clusters, step by step, into one big cluster

It will then output the data and provide information on cluster membership and indicate how good the clustering has been (i.e. how similar the members are)
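The merging process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not what SPSS does internally: the spearhead measurements are made up, the function names are hypothetical, and the "single linkage" (nearest neighbour) rule with Euclidean distance is just one of several ways of telling the software how clusters should be joined.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two objects measured on the same variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points):
    """Agglomerative clustering sketch (single linkage).

    Starts with every object in its own cluster and repeatedly merges the
    two closest clusters until one big cluster remains. Returns the merge
    history as (members_a, members_b, distance) tuples.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the nearest members
                d = min(euclidean(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Made-up spearhead measurements: (length, maximum width) in cm
spearheads = [(10.0, 2.0), (10.5, 2.1), (11.2, 2.3), (24.0, 4.0), (25.0, 4.2)]
merges = single_linkage(spearheads)
for a, b, d in merges:
    print(a, "+", b, "merged at distance", round(d, 2))
```

With these invented figures the three short spearheads join each other early and the two long ones join each other, and only the final merge, at a much larger distance, forces the two groups together.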

Page 15: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Dendrograms

The way to “visualise” the clusters as they are formed, as an aid to deciding how many are “significant”, is by asking the software to produce a dendrogram
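A dendrogram plots each merge at the distance (height) at which it happened, and a long vertical gap between successive merges suggests a natural place to stop. The sketch below mimics that visual judgement numerically. It is an informal heuristic rather than a significance test, and the function name and merge heights are hypothetical.

```python
def suggest_cluster_count(merge_heights):
    """Suggest a number of clusters from the largest jump in merge height.

    merge_heights: distances at which successive merges occurred, in the
    order they happened (n objects produce n - 1 merges).
    """
    if len(merge_heights) < 2:
        raise ValueError("need at least two merges to compare heights")
    n_objects = len(merge_heights) + 1
    # Gap between each merge and the next one up the dendrogram
    gaps = [merge_heights[k + 1] - merge_heights[k]
            for k in range(len(merge_heights) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    # Stopping just after merge k (0-based) leaves n_objects - (k + 1) clusters
    return n_objects - (k + 1)

# Hypothetical merge heights for five spearheads
heights = [0.51, 0.73, 1.02, 12.91]
print(suggest_cluster_count(heights))
```

Here the big jump comes before the last merge, so the suggestion is to cut the dendrogram where two clusters remain. In practice the dendrogram should still be inspected by eye, since the largest gap is not always the most interpretable cut.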

Page 27: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Type B:
1. prediction
2. multiple regression

We have already covered the theory of prediction and regression in the previous lecture. Although we are now talking about multiple regression, the principle is the same and is best understood through the practical session to follow

Page 28: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Type B:
1. prediction
2. multiple regression

We may ask
• Can the length of a spear be predicted if the tip is missing?

Previously we discussed correlation and regression between two variables; multiple regression allows us to use multiple variables

Page 29: Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey

Multiple regression

Multiple regression will produce a linear equation relating spear length, the dependent variable, to several independent variables such as socket length, maximum width, width of upper socket and width of lower socket.

Both the dependent variable (the one to be predicted) and the independent variables (the ingredients for this prediction) must be measured on an interval scale or be presence/absence data
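As a rough illustration of what the linear equation looks like, the sketch below fits length = b0 + b1 × socket length + b2 × maximum width by ordinary least squares in plain Python. It is a sketch under stated assumptions rather than the SPSS procedure: the function names are hypothetical, only two of the independent variables are used, and the measurements are invented so that they lie exactly on a known plane, which makes the recovered coefficients easy to check.

```python
def fit_multiple_regression(X, y):
    """Ordinary least squares; returns [intercept, b1, ..., bk]."""
    rows = [[1.0] + list(r) for r in X]  # leading 1.0 carries the intercept
    k = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y)
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("independent variables are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict(coeffs, x):
    """Apply the fitted linear equation to one new object."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], x))

# Invented data: (socket length, maximum width) in cm for complete spears,
# with lengths constructed as 2.0 + 3.0 * socket + 1.5 * width
X = [(4.0, 2.0), (5.0, 2.5), (6.0, 2.2), (7.0, 3.1), (8.0, 2.9)]
y = [2.0 + 3.0 * s + 1.5 * w for s, w in X]

coeffs = fit_multiple_regression(X, y)
print([round(c, 3) for c in coeffs])

# Predict the length of a spear whose tip is missing
print(round(predict(coeffs, (6.5, 2.6)), 2))
```

Real measurements will not fit a plane exactly, so a statistical package also reports how well the equation fits and which independent variables contribute to the prediction.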