Upload
homer-wade
View
215
Download
0
Embed Size (px)
Citation preview
Analysis Using R
R>voles_mds<- cmdscale(watervoles, k=13, eig=TRUE)
1.Finds the classical scaling solution.
2. Computes the Trace criterion & Magnitude criterion to determine the # of dimensions.
R>voles_mds$eig [1] 7.359910e-01 2.626003e-01 1.492622e-01 6.990457e-02 2.956972e-02 1.931184e-02 8.326673e-17 -1.139451e-02 -1.279569e-02 -2.849924e-02[11] -4.251502e-02 -5.255450e-02 -7.406143e-02
(1*)
We have 13 eigenvalues, some positive & some negative.
R>sum(abs(voles_mds$eig[1:2]))/ sum (abs(voles_mds$eig))[1] 0.6708889
This code satisfies Magnitude Criterion. (2*)
R>sum((voles_mds$eig[1:2])^2/ sum((voles_mds$eig)^2) [1] 0.9391378Which is the criteria suggested by Mardia et al.;
the above code is exactly: k n
∑│λ│ / ∑ │λ│ i=1 i=1
0.6708889 & .9391378 are large enough to suggest that the original distancesBetween water vole populations can be represented adequately in two dimensions. (3*)
R>x<- voles_mds$points[, 1] 1: Coordinate 1R>y<-voles_mds$points[, 2] 2: Coordinate 2R>plot(x,y,xlab= "Coordinate 1", ylab= "Coordinate 2", xlim= range(x) * 1.2, type= "n")R>text(x,y, labels= colnames(watervoles))
-0.2 0.0 0.2 0.4 0.6
-0.2
-0.1
0.0
0.1
0.2
0.3
Coordinate 1
Co
ord
ina
te 2
Surrey
Shropshire
YorkshirePerthshire
AberdeenElean Gamhna
Alps
Yugoslavia
GermanyNorway
Pyrenees I
Pyrenees II
North Spain
South Spain
6 British populations are close to populations living in the Alps, Yugoslavia, Germany, Norway, & Pyrenees I. (Arvicola terrestris)
These British populations are distant from the populations in Pyrenees II, North Spain and South Spain. (Arvicola sapidus) (4*)
Minimum Spanning Tree (MDS solution) highlight possible distortions. (5,6*)
Links of a minimum spanning tree of the proximity matrix may be plotted onto the two-dimensional scaling representation to identify distortions (if any) produced by scaling solutions.
These distortions look like nearby points on the plot that are not linked by an edge of a tree. (7*)
R>library("ape") R>st<- mst(watervoles)R>plot(x,y,xlab= "Coordinate 1", ylab= "Coordinate 2", xlim= range (x) * 1.2, type= "n")R>for (i in 1:nrow(watervoles)) {+ w1<- which(st[i, ] == 1)+ segments(x[i], y[i], x[w1], y[w1]) }R>text(x, y, labels= colnames(watervoles))
-0.2 0.0 0.2 0.4 0.6
-0.2
-0.1
0.0
0.1
0.2
0.3
Coordinate 1
Coo
rdin
ate
2
Surrey
Shropshire
YorkshirePerthshire
AberdeenElean Gamhna
Alps
Yugoslavia
GermanyNorway
Pyrenees I
Pyrenees II
North Spain
South Spain
The apparent closeness of the populations, Germany & Norway, by the points suggested here in this MDS solution, does not reflect accurately their calculated dissimilarity. (8*)
R Function isoMDS from package “MASS”
Example: Analyzing Voting Behavior among Republicans.
R>library("MASS")R>data("voting", package= "HSAUR")R>voting_mds<- isoMDS(voting)
Two-Dimensional Solution:R>x<- voting_mds$points[, 1]R>y<- voting_mds$points[, 2]R>plot(x,y, xlab="Coordinate 1", ylab= "Coordinate 2",+ xlim= range(voting_mds$points[, 1]) *1.2,+ type= "n" )R>text(x,y, labels= colnames(voting))R>voting_sh<- Shepard(voting[ lower.tri(voting)], + voting_mds$points
-10 -5 0 5
-6-4
-20
24
68
Coordinate 1
Co
ord
ina
te 2
Hunt(R)
Sandman(R)
Howard(D)
Thompson(D)
Freylinghuysen(R)
Forsythe(R)
Widnall(R)
Roe(D)Heltoski(D)
Rodino(D)Minish(D)
Rinaldo(R)
Maraziti(R)
Daniels(D)
Patten(D)
Figure suggests that there is more variation in voting behavior among Republicans. (*9)
R>library("MASS")R>voting_sh<- Shepard(voting[lower.tri(voting)],+ voting_mds$points)R>plot(voting_sh, pc= ".", xlab= "Dissimilarity", + ylab= "Distance", xlim= range(voting_sh$x),+ ylim= range(voting_sh$x))R>lines(voting_sh$x, voting_sh$yf, type= "S")
Quality of a multidimensional scaling can be assessed informally by plotting the original dissimilarities & the distances obtained from a multidimensional scaling in a scatterplot.
Ideally, all the lines should fall on the bisecting line.
5 10 15
51
01
5
Dissimilarity
Dis
tan
ce