1 Discriminant Analysis Discriminant analysis is used to determine which variables discriminate between two or more naturally occurring groups. Computationally, discriminant function analysis is very similar to analysis of variance (ANOVA ). Sunday 15 May 2022 09:42 AM

Discriminant Analysis


Page 1: Discriminant Analysis

1

Discriminant Analysis

Discriminant analysis is used to determine which variables discriminate between two or more naturally occurring groups.

Computationally, discriminant function analysis is very similar to analysis of variance (ANOVA).

Wednesday 19 April 2023 04:37 PM

Page 2: Discriminant Analysis

2

Discriminant Analysis

For example, an educational researcher may want to investigate which variables discriminate between high school graduates who decide to (1) go to college, (2) attend a trade or professional school, or (3) seek no further training or education. For that purpose the researcher could collect data on numerous variables prior to students' graduation. After graduation, most students will naturally fall into one of the three categories. Discriminant analysis could then be used to determine which variable(s) are the best predictors of students' subsequent educational choice.

Page 3: Discriminant Analysis

3

Discriminant Analysis

For example, a medical researcher may record different variables relating to patients' backgrounds in order to learn which variables best predict whether a patient is likely to recover completely (group 1), partially (group 2), or not at all (group 3). A biologist could record different characteristics of similar types (groups) of flowers, and then perform a discriminant function analysis to determine the set of characteristics that allows for the best discrimination between the types.

Page 4: Discriminant Analysis

4

Discriminant Analysis

The data examined here consist of five measurements on each of 32 skulls found in the southwestern and eastern districts of Tibet.

1. Greatest length of skull (measure 1)
2. Greatest horizontal breadth of skull (measure 2)
3. Height of skull (measure 3)
4. Upper face length (measure 4)
5. Face breadth between outermost points of cheekbones (measure 5)

There are also location and grouping variables.

Page 5: Discriminant Analysis

5

Discriminant Analysis

This work is loosely based on A Handbook of Statistical Analyses Using SPSS (Sabine Landau and Brian S. Everitt, Chapman & Hall/CRC, 2003) and Handbook of Statistical Analyses Using Stata, Fourth Edition (Brian S. Everitt and Sophia Rabe-Hesketh, CRC, 2006).

These data, collected by Colonel L.A. Waddell, were first reported in Morant (1923) and are also given in Hand et al. (1994).

Hand, D.J., Daly, F., Lunn, A.D., McConway, K.J. and Ostrowski, E. (1994). A Handbook of Small Data Sets. London: Chapman & Hall.

Morant, G.M. (1923). A first study of the Tibetan skull. Biometrika 14, 193-260.

Page 6: Discriminant Analysis

6

Discriminant Analysis

The data can be divided into two groups. The first comprises skulls 1 to 17 found in graves in Sikkim and the neighbouring area of Tibet (Type A skulls). The remaining 15 skulls (Type B skulls) were picked up on a battlefield in the Lhasa district and are believed to be those of native soldiers from the eastern province of Khams. These skulls were of particular interest since it was thought at the time that Tibetans from Khams might be survivors of a particular human type, unrelated to the Mongolian and Indian types that surrounded them.

Page 7: Discriminant Analysis

7

Discriminant Analysis

There are two questions that might be of interest for these data:

Do the five measurements discriminate between the two assumed groups of skulls and can they be used to produce a useful rule for classifying other skulls that might become available?

Taking the 32 skulls together, are there any natural groupings in the data and, if so, do they correspond to the groups assumed?

Page 8: Discriminant Analysis

8

Discriminant Analysis

Classification is an important component of virtually all scientific research. Statistical techniques concerned with classification are essentially of two types. The first (cluster analysis) aims to uncover groups of observations from initially unclassified data. The second (discriminant analysis) works with data that is already classified into groups to derive rules for classifying new (and as yet unclassified) individuals on the basis of their observed variable values.

Page 9: Discriminant Analysis

9

Discriminant Analysis

Initially it is wise to take a look at your raw data.

Page 10: Discriminant Analysis

10

Discriminant Analysis

Select matrix scatter

Use Define to select.

Page 11: Discriminant Analysis

11

Discriminant Analysis

Select matrix variables and markers.

Note that greatest length of skull is above the list shown.

Use OK to accept.

Page 12: Discriminant Analysis

12

Discriminant Analysis

While this diagram only allows us to assess the group separation in two dimensions, it seems to suggest that face breadth between outermost points of cheekbones (meas5), greatest length of skull (meas1), and upper face length (meas4) provide the greatest discrimination between the two skull types.

Page 13: Discriminant Analysis

13

Discriminant Analysis

We shall now use Fisher’s linear discriminant function to derive a classification rule for assigning skulls to one of the two predefined groups on the basis of the five measurements available.

Page 14: Discriminant Analysis

14

Discriminant Analysis

Now proceed to complete the analysis.

Page 15: Discriminant Analysis

15

Discriminant Analysis

As before use the secondary screens to select the grouping variable (place) and use Define Range.

Page 16: Discriminant Analysis

16

Discriminant Analysis

From the statistics button make the following selection

Now proceed to complete the analysis.

Page 17: Discriminant Analysis

17

Discriminant Analysis

Select the independents, use OK to run.

Page 18: Discriminant Analysis

18

Discriminant Analysis

The Group Statistics table gives the resulting descriptive output. It displays the means and standard deviations of each of the five measurements for each type of skull, and overall (total).

Group Statistics

| Place where skulls were found | Measurement | Mean | Std. Deviation | Valid N (listwise), unweighted | Valid N (listwise), weighted |
|---|---|---|---|---|---|
| Sikkem or Tibet | Greatest length of skull | 174.824 | 6.7475 | 17 | 17.000 |
| Sikkem or Tibet | Greatest horizontal breadth of skull | 139.353 | 7.6030 | 17 | 17.000 |
| Sikkem or Tibet | Height of skull | 132.000 | 6.0078 | 17 | 17.000 |
| Sikkem or Tibet | Upper face length | 69.824 | 4.5756 | 17 | 17.000 |
| Sikkem or Tibet | Face breadth between outermost points of cheek bones | 130.353 | 8.1370 | 17 | 17.000 |
| Lhasa | Greatest length of skull | 185.733 | 8.6269 | 15 | 15.000 |
| Lhasa | Greatest horizontal breadth of skull | 138.733 | 6.1117 | 15 | 15.000 |
| Lhasa | Height of skull | 134.767 | 6.0263 | 15 | 15.000 |
| Lhasa | Upper face length | 76.467 | 3.9118 | 15 | 15.000 |
| Lhasa | Face breadth between outermost points of cheek bones | 137.500 | 4.2384 | 15 | 15.000 |
| Total | Greatest length of skull | 179.938 | 9.3651 | 32 | 32.000 |
| Total | Greatest horizontal breadth of skull | 139.063 | 6.8412 | 32 | 32.000 |
| Total | Height of skull | 133.297 | 6.0826 | 32 | 32.000 |
| Total | Upper face length | 72.938 | 5.3908 | 32 | 32.000 |
| Total | Face breadth between outermost points of cheek bones | 133.703 | 7.4443 | 32 | 32.000 |

Page 19: Discriminant Analysis

19

Discriminant Analysis

The within-group covariance matrices shown in the Covariance Matrices table suggest that the sample values differ to some extent; Box’s test for equality of covariances (see Log Determinants and Test Results, below) assesses whether these differences are significant.

Covariance Matrices

| Place where skulls were found | Measurement | Greatest length of skull | Greatest horizontal breadth of skull | Height of skull | Upper face length | Face breadth between outermost points of cheek bones |
|---|---|---|---|---|---|---|
| Sikkem or Tibet | Greatest length of skull | 45.529 | 25.222 | 12.391 | 22.154 | 27.972 |
| Sikkem or Tibet | Greatest horizontal breadth of skull | 25.222 | 57.805 | 11.875 | 7.519 | 48.055 |
| Sikkem or Tibet | Height of skull | 12.391 | 11.875 | 36.094 | -.313 | 1.406 |
| Sikkem or Tibet | Upper face length | 22.154 | 7.519 | -.313 | 20.936 | 16.769 |
| Sikkem or Tibet | Face breadth between outermost points of cheek bones | 27.972 | 48.055 | 1.406 | 16.769 | 66.211 |
| Lhasa | Greatest length of skull | 74.424 | -9.523 | 22.737 | 17.794 | 11.125 |
| Lhasa | Greatest horizontal breadth of skull | -9.523 | 37.352 | -11.263 | .705 | 9.464 |
| Lhasa | Height of skull | 22.737 | -11.263 | 36.317 | 10.724 | 7.196 |
| Lhasa | Upper face length | 17.794 | .705 | 10.724 | 15.302 | 8.661 |
| Lhasa | Face breadth between outermost points of cheek bones | 11.125 | 9.464 | 7.196 | 8.661 | 17.964 |

Page 20: Discriminant Analysis

20

Discriminant Analysis

The within-group covariance matrices shown in the Covariance Matrices table suggest that the sample values differ to some extent, but according to Box’s test for equality of covariances (tables Log Determinants and Test Results) these differences are not statistically significant (F(15,3490) = 1.2, p = 0.25).

Log Determinants

| Place where skulls were found | Rank | Log Determinant |
|---|---|---|
| Sikkem or Tibet | 5 | 16.164 |
| Lhasa | 5 | 15.773 |
| Pooled within-groups | 5 | 16.727 |

The ranks and natural logarithms of determinants printed are those of the group covariance matrices.

Test Results

| Statistic | Value |
|---|---|
| Box's M | 22.371 |
| F: Approx. | 1.218 |
| F: df1 | 15 |
| F: df2 | 3489.901 |
| F: Sig. | .249 |

Tests null hypothesis of equal population covariance matrices.
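As a rough check on the Test Results table, Box's M can be recomputed from the printed log determinants. The sketch below assumes the usual definition, M = (N - g) ln|S_pooled| - Σ (n_i - 1) ln|S_i|; the variable names are illustrative, and the small gap to the reported 22.371 comes from the log determinants being rounded to three decimals.

```python
# Values transcribed from the Group Statistics and Log Determinants tables.
group_sizes = {"Sikkem or Tibet": 17, "Lhasa": 15}
log_dets = {"Sikkem or Tibet": 16.164, "Lhasa": 15.773}
pooled_log_det = 16.727

n_total = sum(group_sizes.values())   # 32 skulls
n_groups = len(group_sizes)           # 2 groups

# Box's M: pooled term minus the weighted sum of the group terms.
box_m = (n_total - n_groups) * pooled_log_det - sum(
    (group_sizes[name] - 1) * log_dets[name] for name in group_sizes
)

print(f"Box's M (from rounded log determinants): {box_m:.2f}")
```

The result is close to, but not exactly, the value SPSS prints, because SPSS works with unrounded determinants.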

Page 21: Discriminant Analysis

21

Discriminant Analysis

It appears that the equality of covariance matrices assumption needed for Fisher’s linear discriminant approach to be strictly correct is valid here.

In practice, Box’s test is not of great use, since even if it suggests a departure from the equality hypothesis, the linear discriminant may still be preferable to a quadratic function. Here we shall simply assume normality for our data, relying on the robustness of Fisher’s approach to deal with any minor departure from the assumption.

Page 22: Discriminant Analysis

22

Discriminant Analysis

The Eigenvalues table shows the eigenvalue (here 0.93), which is the ratio of the between-group sum of squares to the within-group sum of squares of the discriminant scores. It is this criterion that is maximized in discriminant function analysis.

Eigenvalues

| Function | Eigenvalue | % of Variance | Cumulative % | Canonical Correlation |
|---|---|---|---|---|
| 1 | .930 (a) | 100.0 | 100.0 | .694 |

a. First 1 canonical discriminant functions were used in the analysis.

Page 23: Discriminant Analysis

23

Discriminant Analysis

The canonical correlation is simply the Pearson correlation between the discriminant function scores and group membership coded as 0 and 1. For the skull data, the canonical correlation value is 0.694, so that 0.694² × 100 = 48% of the variance in the discriminant function scores can be explained by group differences.
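The canonical correlation and the eigenvalue carry the same information: with a single discriminant function, the squared canonical correlation equals eigenvalue / (1 + eigenvalue). A minimal sketch of that arithmetic, using the rounded eigenvalue from the Eigenvalues table (variable names are illustrative):

```python
import math

eigenvalue = 0.930  # between-group SS / within-group SS of the scores

# Squared canonical correlation = between-group SS / total SS
#                               = eigenvalue / (1 + eigenvalue)
canonical_r = math.sqrt(eigenvalue / (1 + eigenvalue))
explained_pct = 100 * canonical_r ** 2

print(f"canonical correlation = {canonical_r:.3f}")
print(f"variance explained by group differences = {explained_pct:.0f}%")
```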


Page 24: Discriminant Analysis

24

Discriminant Analysis

Wilks’ lambda provides a test for assessing the null hypothesis that, in the population, the vectors of means of the five measurements are the same in the two groups. The lambda coefficient is defined as the proportion of the total variance in the discriminant scores not explained by differences among the groups, here 51.8%. The formal test confirms that the sets of five mean skull measurements differ significantly between the two sites (χ²(5) = 18.1, p = 0.003). If the equality of mean vectors hypothesis had not been rejected, there would be little point in carrying out a linear discriminant function analysis.

Wilks' Lambda

| Test of Function(s) | Wilks' Lambda | Chi-square | df | Sig. |
|---|---|---|---|---|
| 1 | .518 | 18.083 | 5 | .003 |
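The figures in the Wilks' Lambda table can be traced back to the eigenvalue. The sketch below assumes Bartlett's chi-square approximation, χ² = -(n - 1 - (p + g)/2) ln Λ, which reproduces the reported value closely; the source does not state which formula the software uses, so treat the formula choice as an assumption.

```python
import math

eigenvalue = 0.930                     # from the Eigenvalues table
n_cases, n_vars, n_groups = 32, 5, 2   # skulls, measurements, skull types

# With one discriminant function, Wilks' lambda = 1 / (1 + eigenvalue).
wilks_lambda = 1 / (1 + eigenvalue)

# Bartlett's approximation (assumed) for the chi-square statistic.
chi_square = -(n_cases - 1 - (n_vars + n_groups) / 2) * math.log(wilks_lambda)
df = n_vars * (n_groups - 1)

print(f"Wilks' lambda = {wilks_lambda:.3f}")
print(f"chi-square({df}) = {chi_square:.2f}")
```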

Page 25: Discriminant Analysis

25

Discriminant Analysis

Next we come to the Classification Function Coefficients. This table is displayed as a result of checking Fisher’s in the Statistics sub-dialogue box.

Classification Function Coefficients

| Measurement | Sikkem or Tibet | Lhasa |
|---|---|---|
| Greatest length of skull | 1.468 | 1.558 |
| Greatest horizontal breadth of skull | 2.361 | 2.205 |
| Height of skull | 2.752 | 2.747 |
| Upper face length | .775 | .952 |
| Face breadth between outermost points of cheek bones | .195 | .372 |
| (Constant) | -514.956 | -545.419 |

Columns give the place where skulls were found. Fisher's linear discriminant functions.

Page 26: Discriminant Analysis

26

Discriminant Analysis

The table can be used to find Fisher’s linear discriminant function by simply subtracting, for each variable, the coefficient for the Lhasa group from the coefficient for the Sikkim or Tibet group, giving the following result:

| Measurement | Sikkim or Tibet | Lhasa | Difference |
|---|---|---|---|
| Greatest length of skull (measure 1) | 1.468 | 1.558 | -0.090 |
| Greatest horizontal breadth of skull (measure 2) | 2.361 | 2.205 | 0.156 |
| Height of skull (measure 3) | 2.752 | 2.747 | 0.005 |
| Upper face length (measure 4) | 0.775 | 0.952 | -0.177 |
| Face breadth between outermost points of cheekbones (measure 5) | 0.195 | 0.372 | -0.177 |

Z = -0.090 meas1 + 0.156 meas2 + 0.005 meas3 - 0.177 meas4 - 0.177 meas5
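The subtraction that produces the coefficients of Z can be written out in a few lines. This is only an illustration of the arithmetic; the dictionary keys meas1 to meas5 follow the variable names used on the slides.

```python
# Classification function coefficients for each group.
coef_sikkim_tibet = {"meas1": 1.468, "meas2": 2.361, "meas3": 2.752,
                     "meas4": 0.775, "meas5": 0.195}
coef_lhasa = {"meas1": 1.558, "meas2": 2.205, "meas3": 2.747,
              "meas4": 0.952, "meas5": 0.372}

# Fisher's linear discriminant function: group A minus group B, per variable.
z_coefficients = {
    name: round(coef_sikkim_tibet[name] - coef_lhasa[name], 3)
    for name in coef_sikkim_tibet
}

for name, value in z_coefficients.items():
    print(f"{name}: {value:+.3f}")
```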

Page 27: Discriminant Analysis

27

Discriminant Analysis

Z = -0.090 meas1 + 0.156 meas2 + 0.005 meas3 - 0.177 meas4 - 0.177 meas5

The difference between the constant coefficients (-514.956 and -545.419, bottom row of the Classification Function Coefficients table shown previously) provides the sample mean of the discriminant function scores:

z̄ = -514.956 - (-545.419) = 30.463

Page 28: Discriminant Analysis

28

Discriminant Analysis

The coefficients defining Fisher’s linear discriminant function in the equation are proportional to the unstandardised coefficients given in the “Canonical Discriminant Function Coefficients” table which is produced when Unstandardised is checked in the Statistics sub-dialogue box.

Canonical Discriminant Function Coefficients

| Measurement | Function 1 |
|---|---|
| Greatest length of skull | .048 |
| Greatest horizontal breadth of skull | -.083 |
| Height of skull | -.003 |
| Upper face length | .095 |
| Face breadth between outermost points of cheek bones | .095 |
| (Constant) | -16.222 |

Unstandardized coefficients.

Page 29: Discriminant Analysis

29

Discriminant Analysis

The discriminant scores obtained from these coefficients can be compared with the average of the two group means (shown in the Functions at Group Centroids table) to allocate skulls into groups. Here the threshold against which a skull’s discriminant score is evaluated is

0.0585 = ½ (-0.877 + 0.994)

Functions at Group Centroids

| Place where skulls were found | Function 1 |
|---|---|
| Sikkem or Tibet | -.877 |
| Lhasa | .994 |

Unstandardized canonical discriminant functions evaluated at group means.

Thus new skulls with discriminant scores above 0.0585 would be assigned to the Lhasa site (type B); otherwise, they would be classified as Sikkim/Tibet (type A).
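The allocation rule can be sketched as a small function. The coefficients are the rounded values from the Canonical Discriminant Function Coefficients table, so scores computed this way reproduce the centroids only approximately (about -0.78 and 1.10 at the two group means, rather than -0.877 and 0.994). The function and variable names are hypothetical.

```python
# Unstandardized canonical coefficients (rounded, as printed).
COEFFICIENTS = {"meas1": 0.048, "meas2": -0.083, "meas3": -0.003,
                "meas4": 0.095, "meas5": 0.095}
CONSTANT = -16.222

# Threshold: midpoint between the two group centroids.
CENTROID_A, CENTROID_B = -0.877, 0.994
THRESHOLD = (CENTROID_A + CENTROID_B) / 2


def discriminant_score(skull):
    """Linear combination of the five measurements plus the constant."""
    return CONSTANT + sum(COEFFICIENTS[name] * skull[name] for name in COEFFICIENTS)


def classify(skull):
    """Type B (Lhasa) above the threshold, otherwise type A (Sikkim/Tibet)."""
    return "B" if discriminant_score(skull) > THRESHOLD else "A"


# Group means from the Group Statistics table, used here as example inputs.
mean_a = {"meas1": 174.824, "meas2": 139.353, "meas3": 132.000,
          "meas4": 69.824, "meas5": 130.353}
mean_b = {"meas1": 185.733, "meas2": 138.733, "meas3": 134.767,
          "meas4": 76.467, "meas5": 137.500}

print(classify(mean_a), classify(mean_b))
```

A skull measured at exactly the type A means falls below the threshold and one at the type B means falls above it, as expected.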

Page 30: Discriminant Analysis

30

Discriminant Analysis

When variables are measured on different scales, the magnitude of an unstandardised coefficient provides little indication of the relative contribution of the variable to the overall discrimination. The “Standardized Canonical Discriminant Function Coefficients” listed attempt to overcome this problem by rescaling of the variables to unit standard deviation.

Standardized Canonical Discriminant Function Coefficients

| Measurement | Function 1 |
|---|---|
| Greatest length of skull | .367 |
| Greatest horizontal breadth of skull | -.578 |
| Height of skull | -.017 |
| Upper face length | .405 |
| Face breadth between outermost points of cheek bones | .627 |
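The rescaling can be checked numerically. The sketch below assumes that each standardized coefficient is the unstandardized coefficient multiplied by the pooled within-group standard deviation of that variable, with the variances taken from the diagonals of the Covariance Matrices table. Under that assumption the printed values are recovered to within rounding.

```python
import math

n_a, n_b = 17, 15  # group sizes (Sikkim/Tibet, Lhasa)

# Diagonal entries (variances) of the two within-group covariance matrices.
var_a = [45.529, 57.805, 36.094, 20.936, 66.211]
var_b = [74.424, 37.352, 36.317, 15.302, 17.964]

unstandardized = [0.048, -0.083, -0.003, 0.095, 0.095]

# Pooled within-group standard deviation for each measurement.
pooled_sd = [
    math.sqrt(((n_a - 1) * va + (n_b - 1) * vb) / (n_a + n_b - 2))
    for va, vb in zip(var_a, var_b)
]

standardized = [b * sd for b, sd in zip(unstandardized, pooled_sd)]

for i, value in enumerate(standardized, start=1):
    print(f"meas{i}: {value:+.3f}")
```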

Page 31: Discriminant Analysis

31

Discriminant Analysis

For our data, such standardisation is not necessary since all skull measurements were in millimetres. Standardisation should, however, not matter much since the within-group standard deviations were similar across different skull measures. According to the standardised coefficients, skull height (meas3) seems to contribute little to discriminating between the two types of skulls.

Page 32: Discriminant Analysis

32

Discriminant Analysis

A question of some importance about a discriminant function is: how well does it perform? One possible method of evaluating performance is to apply the derived classification rule to the data set and calculate the misclassification rate.

Page 33: Discriminant Analysis

33

Discriminant Analysis

Repeat using the following classification.

Now proceed to complete the analysis.

Page 34: Discriminant Analysis

34

Discriminant Analysis

This is known as the re-substitution estimate, and the corresponding results are shown in the Original part of the Classification Results table. According to this estimate, 81.3% ((17×82.4+15×80)/(15+17)) of skulls can be correctly classified as type A or type B on the basis of the discriminant rule.

Classification Results (b, c)

| | | Place where skulls were found | Predicted: Sikkem or Tibet | Predicted: Lhasa | Total |
|---|---|---|---|---|---|
| Original | Count | Sikkem or Tibet | 14 | 3 | 17 |
| Original | Count | Lhasa | 3 | 12 | 15 |
| Original | % | Sikkem or Tibet | 82.4 | 17.6 | 100.0 |
| Original | % | Lhasa | 20.0 | 80.0 | 100.0 |
| Cross-validated (a) | Count | Sikkem or Tibet | 12 | 5 | 17 |
| Cross-validated (a) | Count | Lhasa | 6 | 9 | 15 |
| Cross-validated (a) | % | Sikkem or Tibet | 70.6 | 29.4 | 100.0 |
| Cross-validated (a) | % | Lhasa | 40.0 | 60.0 | 100.0 |

a. Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the functions derived from all cases other than that case.

b. 81.3% of original grouped cases correctly classified.

c. 65.6% of cross-validated grouped cases correctly classified.

Page 35: Discriminant Analysis

35

Discriminant Analysis

However, estimating misclassification rates in this way is known to be overly optimistic, and several alternatives for estimating misclassification rates in discriminant analysis have been suggested. One of the most commonly used of these alternatives is the so-called leaving-one-out method, in which the discriminant function is first derived from only n – 1 sample members, and then used to classify the observation left out. The procedure is repeated n times, each time omitting a different observation.
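The leaving-one-out loop itself is short. In the sketch below a nearest-group-mean rule stands in for Fisher's discriminant function to keep the example dependency-free, and the six two-dimensional points are made up for illustration; they are not the skull data.

```python
def nearest_mean_label(train, point):
    """Assign `point` to the group whose mean is closest (stand-in rule)."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    best_label, best_dist = None, float("inf")
    for label, rows in groups.items():
        mean = [sum(column) / len(rows) for column in zip(*rows)]
        dist = sum((x - m) ** 2 for x, m in zip(point, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(data):
    """Derive the rule from n - 1 cases, classify the omitted case, repeat n times."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]   # every case except case i
        if nearest_mean_label(train, features) == label:
            correct += 1
    return correct / len(data)


# Hypothetical, well-separated toy data.
toy = [((1.0, 1.0), "A"), ((1.5, 0.5), "A"), ((0.5, 1.5), "A"),
       ((5.0, 5.0), "B"), ((5.5, 4.5), "B"), ((4.5, 5.5), "B")]

loo_rate = leave_one_out_accuracy(toy)
print(f"leave-one-out correct classification rate: {loo_rate:.2f}")
```

Because each case is classified by a rule that never saw it, the estimate is less optimistic than re-substitution.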

Page 36: Discriminant Analysis

36

Discriminant Analysis


The Cross-validated part of the Classification Results table shows the results from applying this procedure. The correct classification rate now drops to 65.6% ((17×70.6+15×60)/(15+17)), a considerably lower success rate than suggested by the simple re-substitution rule.
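Both success rates follow directly from the counts in the Classification Results table: the correctly classified skulls sit on the diagonal of each 2×2 block. A short sketch of that calculation (names are illustrative):

```python
# Rows: actual group (Sikkim/Tibet, Lhasa); columns: predicted group.
original = [[14, 3],
            [3, 12]]
cross_validated = [[12, 5],
                   [6, 9]]


def percent_correct(table):
    """Share of cases on the diagonal, as a percentage."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return 100 * correct / total


print(f"re-substitution: {percent_correct(original):.1f}%")
print(f"cross-validated: {percent_correct(cross_validated):.1f}%")
```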

Page 37: Discriminant Analysis

37

Discriminant Analysis

We now turn to applying cluster analysis to the skull data. Here the prior classification of the skulls will be ignored and the data simply “explored” to see if there is any evidence of interesting “natural” groupings of the skulls and, if there is, whether these groups correspond in any way with Morant’s classification.

Here we will use two hierarchical agglomerative clustering procedures, complete and average linkage clustering, and then k-means clustering.

Page 38: Discriminant Analysis

38

Discriminant Analysis

Select Analyze > Classify > Hierarchical Cluster

Page 39: Discriminant Analysis

39

Discriminant Analysis

In the usual way select the variables of interest

Page 40: Discriminant Analysis

40

Discriminant Analysis

Select the plots desired

Page 41: Discriminant Analysis

41

Discriminant Analysis

Select the desired method

Now proceed to complete the analysis.

Page 42: Discriminant Analysis

42

Discriminant Analysis

The complete linkage clustering output shows which skulls or clusters are combined at each stage of the cluster procedure.

Page 43: Discriminant Analysis

43

Discriminant Analysis

First, skull 8 is joined with skull 13 since the Euclidean distance between these two skulls is smaller than the distance between any other pair of skulls. The distance is shown in the column labelled “Coefficients”.

Agglomeration Schedule

| Stage | Cluster Combined: Cluster 1 | Cluster Combined: Cluster 2 | Coefficients | Stage Cluster First Appears: Cluster 1 | Stage Cluster First Appears: Cluster 2 | Next Stage |
|---|---|---|---|---|---|---|
| 1 | 8 | 13 | 3.041 | 0 | 0 | 4 |
| 2 | 15 | 17 | 5.385 | 0 | 0 | 14 |
| 3 | 9 | 23 | 5.701 | 0 | 0 | 11 |
| 4 | 8 | 19 | 5.979 | 1 | 0 | 8 |
| 5 | 24 | 28 | 6.819 | 0 | 0 | 17 |
| 6 | 21 | 22 | 6.910 | 0 | 0 | 21 |
| 7 | 16 | 29 | 7.211 | 0 | 0 | 15 |
| 8 | 7 | 8 | 8.703 | 0 | 4 | 13 |
| 9 | 2 | 3 | 8.874 | 0 | 0 | 14 |
| 10 | 27 | 30 | 9.247 | 0 | 0 | 23 |
| 11 | 5 | 9 | 9.579 | 0 | 3 | 13 |
| 12 | 18 | 32 | 9.874 | 0 | 0 | 18 |
| 13 | 5 | 7 | 10.700 | 11 | 8 | 24 |
| 14 | 2 | 15 | 11.522 | 9 | 2 | 28 |
| 15 | 6 | 16 | 12.104 | 0 | 7 | 22 |
| 16 | 14 | 25 | 12.339 | 0 | 0 | 21 |
| 17 | 24 | 31 | 13.528 | 5 | 0 | 23 |
| 18 | 11 | 18 | 13.537 | 0 | 12 | 22 |
| 19 | 1 | 20 | 13.802 | 0 | 0 | 26 |
| 20 | 4 | 10 | 14.062 | 0 | 0 | 28 |
| 21 | 14 | 21 | 15.588 | 16 | 6 | 25 |
| 22 | 6 | 11 | 16.302 | 15 | 18 | 24 |
| 23 | 24 | 27 | 18.554 | 17 | 10 | 27 |
| 24 | 5 | 6 | 18.828 | 13 | 22 | 29 |
| 25 | 12 | 14 | 20.700 | 0 | 21 | 30 |
| 26 | 1 | 26 | 24.597 | 19 | 0 | 27 |
| 27 | 1 | 24 | 25.269 | 26 | 23 | 30 |
| 28 | 2 | 4 | 25.880 | 14 | 20 | 29 |
| 29 | 2 | 5 | 26.930 | 28 | 24 | 31 |
| 30 | 1 | 12 | 36.342 | 27 | 25 | 31 |
| 31 | 1 | 2 | 48.816 | 30 | 29 | 0 |

Page 44: Discriminant Analysis

44

Discriminant Analysis

Second, skull 15 is joined with skull 17 and so on.


Page 45: Discriminant Analysis

45

Discriminant Analysis

The dendrogram is simpler to interpret (see next slide).

Page 46: Discriminant Analysis

46

Page 47: Discriminant Analysis

47

Discriminant Analysis

The dendrogram may, on occasions, also be useful in deciding the number of clusters in a data set, with a sudden increase in the size of the difference in adjacent steps taken as an informal indication of the appropriate number of clusters to consider.

Page 48: Discriminant Analysis

48

Discriminant Analysis

A fairly large jump occurs between stages 29 and 30 (indicating a three-group solution) and an even bigger one between this penultimate and the ultimate fusion of groups (a two-group solution).
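The informal "jump" rule can be read off the last rows of the Agglomeration Schedule. The sketch below uses only the coefficients of stages 28 to 31 from that table; with 32 skulls, 32 - s clusters remain after stage s, which is how a jump is translated into a suggested number of groups.

```python
n_cases = 32

# Fusion distances ("Coefficients") for the last four stages.
final_stages = {28: 25.880, 29: 26.930, 30: 36.342, 31: 48.816}

stages = sorted(final_stages)
jumps = []
for prev, nxt in zip(stages, stages[1:]):
    increase = round(final_stages[nxt] - final_stages[prev], 3)
    clusters_if_stopped = n_cases - prev   # clusters left after stage `prev`
    jumps.append((prev, nxt, increase, clusters_if_stopped))

for prev, nxt, increase, k in jumps:
    print(f"stage {prev} -> {nxt}: increase {increase:.3f} "
          f"(stopping before it leaves {k} clusters)")
```

The increases are roughly 1.1, 9.4 and 12.5, which matches the reading on the slide: a noticeable jump pointing to three groups and a larger one pointing to two.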

Page 49: Discriminant Analysis

49

Discriminant Analysis

For an alternate approach use

Now proceed to produce the plot

Page 50: Discriminant Analysis

50

Discriminant Analysis

The initial steps agree with the complete linkage solution, but eventually the trees diverge, with the average linkage dendrogram successively adding small clusters to one increasingly large cluster. For the average linkage dendrogram (see next slide) it is not clear where to cut the dendrogram to give a specific number of groups.

Page 51: Discriminant Analysis

51

Page 52: Discriminant Analysis

52

Discriminant Analysis

Since we believe there are two groups, a final cluster analysis employing this information may be attempted.

Page 53: Discriminant Analysis

53

Discriminant Analysis

The variable selection and number of clusters are shown.

Page 54: Discriminant Analysis

54

Discriminant Analysis

In the resulting cluster output, the Initial Cluster Centers table displays the starting values used by the algorithm.

Initial Cluster Centers

| Measurement | Cluster 1 | Cluster 2 |
|---|---|---|
| Greatest length of skull | 200.0 | 167.0 |
| Greatest horizontal breadth of skull | 139.5 | 130.0 |
| Height of skull | 143.5 | 125.5 |
| Upper face length | 82.5 | 69.5 |
| Face breadth between outermost points of cheek bones | 146.0 | 119.5 |

Page 55: Discriminant Analysis

55

Discriminant Analysis

The Iteration History table indicates that the algorithm has converged.

Iteration History (a)

| Iteration | Change in Cluster Centers: Cluster 1 | Change in Cluster Centers: Cluster 2 |
|---|---|---|
| 1 | 16.626 | 16.262 |
| 2 | .000 | .000 |

a. Convergence achieved due to no or small change in cluster centers. The maximum absolute coordinate change for any center is .000. The current iteration is 2. The minimum distance between initial centers is 48.729.
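The distance quoted in the footnote can be verified from the Initial Cluster Centers table: it is the ordinary Euclidean distance between the two starting centres. A minimal check (variable names are illustrative):

```python
import math

# Initial cluster centres: meas1 .. meas5 for cluster 1 and cluster 2.
center_1 = [200.0, 139.5, 143.5, 82.5, 146.0]
center_2 = [167.0, 130.0, 125.5, 69.5, 119.5]

# Euclidean distance: square root of the summed squared differences.
distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(center_1, center_2)))

print(f"distance between initial centres = {distance:.3f}")
```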

Page 56: Discriminant Analysis

56

Discriminant Analysis

The Final Cluster Centers table describes the final cluster solution.

Final Cluster Centers

| Measurement | Cluster 1 | Cluster 2 |
|---|---|---|
| Greatest length of skull | 188.4 | 174.1 |
| Greatest horizontal breadth of skull | 141.3 | 137.6 |
| Height of skull | 135.8 | 131.6 |
| Upper face length | 77.6 | 69.7 |
| Face breadth between outermost points of cheek bones | 138.5 | 130.4 |

Page 57: Discriminant Analysis

57

Discriminant Analysis

The Number of Cases in each Cluster table gives the size of each cluster in the final solution.

Number of Cases in each Cluster

| | Number of cases |
|---|---|
| Cluster 1 | 13.000 |
| Cluster 2 | 19.000 |
| Valid | 32.000 |
| Missing | .000 |

Page 58: Discriminant Analysis

58

Discriminant Analysis

How does the k-means two-group solution compare with the original classification of the skulls into types A and B?

We can investigate this by first using the Save button on the k-Means Cluster Analysis dialogue box to save cluster membership for each skull in the Data View spreadsheet.

Page 59: Discriminant Analysis

59

Discriminant Analysis

The new categorical variable now available (labelled QCL_1) can be cross-tabulated with assumed skull type (variable place). The display shows the resulting table; the k-means clusters largely agree with the skull types as originally suggested by Morant, with cluster 1 consisting primarily of Type B skulls (those from Lhasa) and cluster 2 containing mostly skulls of Type A (from Sikkim and the neighbouring area of Tibet). Only six skulls are wrongly placed.


62

Discriminant Analysis

The new categorical variable now available (labelled QCL_1) can be cross-tabulated with assumed skull type.

| Cluster Number of Case | Assumed type of skull: A (Count) | Assumed type of skull: B (Count) |
|---|---|---|
| 1 | 2 | 11 |
| 2 | 15 | 4 |

The k-means clusters largely agree with the skull types as originally suggested, with cluster 1 consisting primarily of Type B skulls (those from Lhasa) and cluster 2 containing mostly skulls of Type A (from Sikkim and the neighbouring area of Tibet). Only six skulls are wrongly placed.
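The count of wrongly placed skulls follows from the cross-tabulation: within each cluster, every skull that does not belong to the cluster's majority type counts as misplaced. A short sketch of that tally (names are illustrative):

```python
# Cross-tabulation of k-means cluster (QCL_1) by assumed skull type.
crosstab = {
    1: {"A": 2, "B": 11},
    2: {"A": 15, "B": 4},
}

total = sum(sum(counts.values()) for counts in crosstab.values())

# Skulls outside the majority type of their cluster.
misplaced = sum(sum(counts.values()) - max(counts.values())
                for counts in crosstab.values())

agreement_pct = 100 * (total - misplaced) / total

print(f"{misplaced} of {total} skulls misplaced "
      f"({agreement_pct:.1f}% agreement with the assumed types)")
```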