8/8/2019 Topic 10 Factor Analysis and Reliability
INTRODUCTION

Factor Analysis is used to uncover the latent structures (dimensions) of a set of variables. It belongs to a family of data-reduction methods; other methods are Latent Class Analysis, Latent Profile Analysis, Latent Trait Analysis, and Principal Component Analysis (PCA). The focus of Principal Component Analysis is to reduce the number of variables into a smaller set of principal components (dimensions). It allows researchers to use a smaller number of factors to explain what the long list of variables actually measures. PCA is normally used to reduce a large number of variables to a smaller number of factors. Prior to Multiple Regression analysis, factor analysis was used to create a set of factors to be treated as uncorrelated variables, as one approach to handling multi-collinearity. Factor analysis is an interdependency technique; it aims to find the latent factors that account for the patterns of collinearity among multiple metric variables. Some statisticians do not consider PCA to be factor analysis.
LEARNING OUTCOMES
By the end of this topic, you should be able to:
1. Describe the requirements for factor analysis for a given data set;
2. Use the appropriate method to determine the principal components underpinning the responses on a set of variables; and
3. Compute the reliability index.
Salient features of Principal Component Analysis:

(a) It is a variable reduction procedure.

(b) The main purpose is to reduce the number of variables into a smaller set of principal components (dimensions).

(c) It is a large sample procedure where the focus is only on summarising the sample information into a smaller set of principal components, as opposed to detecting the latent factors that influence the scores on the observed variables.
The following are the assumptions for Factor Analysis:

(a) A large enough sample to yield reliable estimates of the correlations among the variables (according to Hair et al., at least 5 respondents per item in the scale are preferred).

(b) Statistical inference is improved if the variables are multivariate normal (not required for PCA).

(c) Relationships among the pairs of variables are linear.

(d) Absence of outliers among the cases.

(e) Some degree of collinearity among the variables, but not an extreme degree or singularity among the variables (according to Kline (1998), the correlation values between the variables should fall between 0.3 and 0.8).
10.1 ILLUSTRATING THE INTERDEPENDENCY BETWEEN VARIABLES
A teacher wanted to gauge the Emotional Intelligence of the Form Five students of his school. Based on his readings, he drafted a nine-item questionnaire. Respondents were required to provide their ratings on a five-point Likert scale. He administered the questionnaire to a group of Form Five students and ran a simple correlation analysis.
The following Table 10.1 shows the results:
Table 10.1: Correlation Analysis Output

      X1    X2    X3    X4    X5    X6    X7    X8    X9
X1   1.00  0.75  0.70  0.65  0.01  0.20  0.18  0.16  0.03
X2         1.00  0.63  0.65  0.08  0.11  0.13  0.04  0.09
X3               1.00  0.74  0.02  0.12  0.07  0.15  0.05
X4                     1.00  0.01  0.11  0.06  0.02  0.13
X5                           1.00  0.73  0.72  0.65  0.83
X6                                 1.00  0.71  0.79  0.72
X7                                       1.00  0.95  0.75
X8                                             1.00  0.73
X9                                                   1.00
Basic Principle: Variables that significantly correlate with each other do so because they are measuring the same "construct".

The Problem: What is the "construct" that brings the variables together?
The interpretation of Table 10.1:
(a) Variables 1, 2, 3 & 4 correlate highly with each other, but not with the rest ofthe variables.
(b) Variables 5, 6, 7, 8 & 9 correlate highly with each other, but not with the restof the variables.
(c) The nine variables seem to be measuring TWO "constructs" or underlyingfactors.
To find out the answer, we need to carry out Factor Analysis or, to be more precise, Principal Component Analysis.
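The two-construct structure suggested by Table 10.1 can be checked numerically: the eigenvalues of the correlation matrix count the dominant dimensions. A minimal sketch (assuming NumPy is available):

```python
import numpy as np

# Upper triangle of the correlation matrix in Table 10.1 (X1..X9).
upper = [
    [1.00, 0.75, 0.70, 0.65, 0.01, 0.20, 0.18, 0.16, 0.03],
    [0.00, 1.00, 0.63, 0.65, 0.08, 0.11, 0.13, 0.04, 0.09],
    [0.00, 0.00, 1.00, 0.74, 0.02, 0.12, 0.07, 0.15, 0.05],
    [0.00, 0.00, 0.00, 1.00, 0.01, 0.11, 0.06, 0.02, 0.13],
    [0.00, 0.00, 0.00, 0.00, 1.00, 0.73, 0.72, 0.65, 0.83],
    [0.00, 0.00, 0.00, 0.00, 0.00, 1.00, 0.71, 0.79, 0.72],
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00, 0.95, 0.75],
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00, 0.73],
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],
]
U = np.array(upper)
R = U + U.T - np.eye(9)          # mirror the upper triangle to get a symmetric matrix
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# The eigenvalues sum to the number of variables; only two exceed 1.0,
# matching the two "constructs" visible in the correlation pattern.
n_factors = int((eigenvalues > 1.0).sum())
print(eigenvalues.round(2), n_factors)
```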
Variance, Cov(Y, Y):

Var(Y) = Cov(Y, Y) = [1 / (n − 1)] Σᵢ (Yᵢ − Ȳ)(Yᵢ − Ȳ)

Using the standardised values, Cov(Y, Y) = 1.00

Covariance, Cov(X, Y):

Cov(X, Y) = [1 / (n − 1)] Σᵢ (Xᵢ − X̄)(Yᵢ − Ȳ) = 0.77

Variance-covariance matrix:

| Cov(x, x)  Cov(x, y) |
| Cov(y, x)  Cov(y, y) |

If the X and Y scores are transformed into standardised scores, the variance-covariance matrix gives us the correlation matrix. Thus, for the above example, the variance-covariance matrix for the standardised values of X and Y is

| 1.00  0.77 |
| 0.77  1.00 |
Step 3
Calculate the eigenvectors and eigenvalues of the covariance matrix. An eigenvector (a, b) and its eigenvalue λ satisfy:

| Cov(x, x)  Cov(x, y) | | a |       | a |
| Cov(y, x)  Cov(y, y) | | b |  =  λ | b |

Variance-covariance Matrix × Eigenvector = Eigenvalue × Eigenvector
The number of eigenvalues depends on the number of variables in the analysis. In general, if there are n variables in the analysis, there will be n eigenvalues. Not all the eigenvalues will have the same magnitude, but their total is equal to the number of variables in the analysis.
Each eigenvalue will have its corresponding eigenvector. The computation ofeigenvalues and eigenvectors involves complicated mathematical proceduresespecially if the number of variables in the analysis is large. Any computersoftware that performs the Principal Component Analysis will compute theeigenvalues (while some programmes will also provide the eigenvectors).
For the example above, the eigenvalues are:

λ₁ = 1 + r₁₂   and   λ₂ = 1 − r₁₂

(where r₁₂ is the correlation between the two variables; in this case, the correlation value is 0.77)

Thus, the eigenvalues are 1.77 and 0.23.
Each eigenvalue gives a corresponding eigenvector. For this matrix, when the eigenvalue is 1.77, the eigenvector is proportional to (1, 1); when the eigenvalue is 0.23, the eigenvector is proportional to (1, −1).
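The eigenvalues of the 2 × 2 correlation matrix with r = 0.77 can be verified numerically; a minimal sketch:

```python
import numpy as np

# Correlation matrix of the standardised X and Y (r = 0.77).
r = 0.77
S = np.array([[1.0, r],
              [r, 1.0]])

eigenvalues, eigenvectors = np.linalg.eigh(S)   # returned in ascending order

lam1, lam2 = eigenvalues[1], eigenvalues[0]     # 1 + r = 1.77 and 1 - r = 0.23
print(lam1, lam2)

# The eigenvector for 1.77 points along (1, 1); the one for 0.23 along (1, -1).
v1 = eigenvectors[:, 1]
v2 = eigenvectors[:, 0]
```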
Step 4
Plot the standardised values on a two-dimensional plane and overlay the eigenvectors.

[Figure 10.1: Plot of the standardised values on a two-dimensional plane with the two eigenvectors overlaid]

From the plot shown in Figure 10.1, it can be concluded that the data set is fairly well represented by the eigenvector derived when the eigenvalue is 1.77.
The above discussion is just for illustrative purposes. In real situations, there will be more than two observed variables and thus a visual (graphical) representation is not possible.
10.2.2 Types of Factor Analysis
Basically, there are two types of factor analysis:

(a) Exploratory factor analysis
It is a non-theoretical application. The aim is to answer the question: "Given a set of variables, what are the underlying dimensions (factors) that account for the patterns of collinearity among the variables?"

Example: Respondents' responses on a scale measuring delinquency are not assumed to be governed by any particular theory; as such, what are the latent factors that influence their behaviour?

(b) Confirmatory factor analysis
It is used to validate a predetermined theory. The aim is to answer the question: "Do the responses on a scale conform with the theory that explains respondents' behaviour?"

Example: Given a theory that attributes delinquency to four independent factors, do respondents' responses on a scale that measures delinquency converge into these four factors?
10.3 THE LOGIC OF FACTOR ANALYSIS (PRINCIPAL COMPONENT ANALYSIS)
In studying the Emotional Intelligence of teachers, a researcher used focus group interviews in developing the instrument for his study. The following items (the data are attached in Appendix II, Data Set B) were generated based on focus group interviews with selected teachers from the Klang Valley.
1 It is difficult for me to face unpleasant situations.
2 I am able to face challenges pretty well.
3 I am able to deal with upsetting problems.
4 I find it difficult to control my anxiety.
5 I am able to keep calm in difficult situations.
6 I can handle stress without getting too nervous.
7 I am usually calm when facing challenging situations.
8 I am motivated to continue, even when things get difficult.
9 Whatever the situation, I believe I can handle it well.
10 I am optimistic about most things I do.
11 I am sure of what I am doing in most situations.
12 I believe things will turn out all right despite setbacks from time to time.
13 I believe in my ability to handle upsetting problems/situations.
14 If others can do it, I don't see why I can't.
15 I feel good about myself.
16 I feel that I am not inferior compared with others.
17 I feel confident of myself in most situations.
18 I have good self respect.
19 I am happy with what I am now.
20 It is fairly easy for me to express my feelings.
21 I am aware of what is happening around me even when I am upset.
22 I am aware of the way I feel.
23 It is difficult for me to describe my feelings.
The researcher developed a questionnaire to assess the Emotional Intelligence of teachers using the items generated from the focus group interviews. He used a 7-point Likert scale for his questionnaire. The following is the description of the Likert scale:

[1 = Strongly Disagree; 2 = Disagree; 3 = Slightly Disagree; 4 = Not Sure; 5 = Slightly Agree; 6 = Agree; 7 = Strongly Agree]

He administered the questionnaire to 176 randomly selected teachers from both private and public schools in the Klang Valley. Table 10.3 shows a sample of the responses.
Table 10.3: Sample of Responses

Subject    X1   X2   X3   X4   X5   X6   ...   Xn
1           6    5    7    3    4    4
2           5    7    4    4    4    3
3           7    5    6    2    5    4
...
N        Mean X1  Mean X2  Mean X3  Mean X4  Mean X5  Mean X6  ...  Mean Xn
Having run the correlation analysis, the researcher found that some of the items have high correlations with one another while others do not. Table 10.4 shows an example of the correlation analysis.
Table 10.4: Sample of Inter-Correlation Values between Variables

      X1    X2    X3    X4    X5    X6   ...   Xk
X1   1.00  0.76  0.84
X2         1.00  0.76
X3               1.00
X4                     1.00  0.76  0.77
X5                           1.00  0.81
X6                                 1.00
...
Xk                                            1.00
The next logical thing to do is to cluster the variables with high inter-correlations together and define them as belonging to the same family. This is what factor analysis (or Principal Component Analysis, to be precise) is all about. Table 10.5 displays an example of the factor analysis. The values in the cells are the factor loadings (refer to Subsection 10.3.1 for further explanation of factor loadings).
Table 10.5: Sample of Factor Analysis Outcome

Variables  Factor I  Factor II  Factor III  ...  Factor n
X1          0.932     0.013      0.250
X2          0.851     0.426      0.211
X3          0.634     0.451      0.231
X4          0.322     0.644      0.293
X5          0.725     0.714      0.293
X6          0.435     0.641      0.332
X7          0.322     0.311      0.677
X8          0.211     0.233      0.771
...
Xk          0.122     0.110      0.200
10.3.1 Factor Loading
What is a Factor Loading?
A factor loading is the correlation between a variable and a factor that has been extracted from the data.

Example
Note the factor loadings for variable X1.

Variables  Factor I  Factor II  Factor III
X1          0.932     0.013      0.250
Variable X1 is highly correlated with Factor I, but negligibly correlated withFactors II and III.
Communality refers to the total variance in variable X1 accounted for by the three factors that were extracted.

Simply square the factor loadings and add them together:

(0.932)² + (0.013)² + (0.250)² = 0.931
As such, the initial communality for each variable, before extracting the factors, is always 1.00. In the above example, emotional intelligence is operationalised using 23 specific situations, and the initial factors will be 23, with some having greater dominance than the others (this will be reflected in the eigenvalues).

Once the dominant factors are identified (e.g. those with eigenvalues greater than 1.00), the communality value for each variable will be less than 1.00. This is because in factor analysis, those factors that have a negligible effect on the variables will be dropped.
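The square-and-sum computation of a communality can be reproduced directly:

```python
# Communality of a variable = sum of its squared factor loadings.
# Loadings for X1 on the three extracted factors (Table 10.5).
loadings_x1 = [0.932, 0.013, 0.250]

communality = sum(l ** 2 for l in loadings_x1)
print(round(communality, 3))   # ~0.931: the three factors explain about 93% of X1's variance
```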
10.4 STEPS IN FACTOR ANALYSIS (PRINCIPAL COMPONENT ANALYSIS)
There are a few crucial steps to be followed in factor analysis or, to put it more precisely, Principal Component Analysis:

Step 1
Compute a k by k inter-correlation matrix. According to Hair et al., inter-correlation values must be at least 0.3 for the items to be considered for factor analysis.

Step 2
Extract an initial solution.

Step 3
Determine the appropriate number of factors to be extracted in the final solution.

Step 4
Rotate the factors to clarify the factor pattern in order to better interpret the nature of the factors, if necessary.

Step 5
Establish the measures of goodness-of-fit of the factor solution.
A Ten-Variable Example
The researcher used the responses on the first 10 questions of the Emotional Intelligence questionnaire to perform factor analysis. Table 10.6 below shows the codes and the variable names for the variables included in the factor analysis.
Table 10.6: Codes and Variable Names

Code   Variable Name
rq1    It is difficult for me to face unpleasant situations.
rq2 I am able to face challenges pretty well.
rq3 I am able to deal with upsetting problems.
rq4 I find it difficult to control my anxiety.
rq5 I am able to keep calm in difficult situations.
rq6 I can handle stress without getting too nervous.
rq7 I am usually calm when facing challenging situations.
rq8 I am motivated to continue, even when things get difficult.
rq9 Whatever the situation, I believe I can handle it well.
rq10 I am optimistic about most things I do.
The principal components can be illustrated as follows. Each of the ten observed variables (X1 to X10) contributes, with a strong or weak regression weight, to Component 1:

C1 = b11(X1) + b12(X2) + ... + b1,10(X10)

where:
C1 = factor score on Component 1
b = regression weight (also known as factor weight)
X = respondent's score on the observed variables
Similarly, for Component 2:

C2 = b21(X1) + b22(X2) + ... + b2,10(X10)

where:
C2 = factor score on Component 2
bij = regression weight (also known as factor loading)
Xi = respondent's score on the observed variables

A large factor loading indicates a strong regression weight; a small factor loading indicates a weak regression weight.
All the observed variables have some influence on all the factors extracted; however, different sets of variables have different degrees of influence on the different common factors.

In short, a principal component is a linear combination of optimally weighted observed variables. The weighting is done in such a way that it maximises the amount of variance explained in the data set.
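The idea of a component as an optimally weighted linear combination can be sketched numerically. The data below are synthetic (generated purely for illustration); the point is that component scores built from the leading eigenvector have variance equal to the largest eigenvalue, and no other unit-length weighting explains more:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 200 respondents on 4 correlated observed variables.
base = rng.normal(size=(200, 1))
X = base + 0.6 * rng.normal(size=(200, 4))

# Standardise, then weight by the leading eigenvector of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)
w = eigenvectors[:, -1]          # optimal weights for the first component

C1 = Z @ w                       # component scores: a weighted sum of the variables
# The variance of the component scores equals the largest eigenvalue.
print(C1.var(ddof=1), eigenvalues[-1])
```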
Figure 10.2 summarises the requirements and assumptions for principal component analysis:

Purpose: Summarise the original information into a minimal number of factors.
Type of analysis: Principal Component Analysis.
Measurement level of the principal component factor: Continuous.
Parameter for analysis: Total variance.
Assumption of normality: None; it is a large sample procedure and no generalisation is involved.
Type of rotation: If the principal components correlate, use Oblique rotation; if not, use Orthogonal rotation.

Figure 10.2: Requirements and assumptions for principal component analysis

Extracting the principal components from the list of observed variables is an iterative procedure that requires one to check the assumptions along the process until the final conclusion is made. The procedural map in Appendix VI summarises the procedure and assumptions required for PCA with orthogonal rotation.
10.4.1 Correlation between Variables
As a first step, correlations between the variables are computed. Table 10.7 shows the values of the correlations between the variables. The shaded cells represent the diagonal, while the values below and above the diagonal are the correlation values between the variables. Since each variable has a correlation greater than 0.3 with at least one other variable, all 10 variables are factorable. At the same time, the values are not too high (not more than 0.85) and, as such, each variable is distinct from the others.
Table 10.7: Inter-Correlation among the Variables

Correlation Matrixa
rq1 rq2 rq3 rq4 rq5 rq6 rq7 rq8 rq9 rq10
rq1 1.000 .604 .578 .419 .514 .580 .497 .555 .554 .481
rq2 .604 1.000 .615 .518 .488 .545 .543 .402 .402 .401
rq3 .578 .615 1.000 .519 .567 .536 .572 .481 .484 .496
rq4 .419 .518 .519 1.000 .581 .430 .450 .336 .174 .357
rq5 .514 .488 .567 .581 1.000 .577 .577 .466 .382 .574
rq6 .580 .545 .536 .430 .577 1.000 .575 .510 .417 .437
rq7 .497 .543 .572 .450 .577 .575 1.000 .459 .442 .521
rq8 .555 .402 .481 .336 .466 .510 .459 1.000 .585 .602
rq9 .554 .402 .484 .174 .382 .417 .442 .585 1.000 .529
rq10 .481 .401 .496 .357 .574 .437 .521 .602 .529 1.000
a. Determinant = 0.005
There is more evidence of factorability:

(a) Bartlett's Test of Sphericity
Table 10.8 shows an inter-correlation matrix that is an identity matrix.
Table 10.8: Inter-Correlation of an Identity Matrix

      X1    X2    X3    X4    X5
X1   1.00  0.00  0.00  0.00  0.00
X2         1.00  0.00  0.00  0.00
X3               1.00  0.00  0.00
X4                     1.00  0.00
X5                           1.00
The variables are totally non-collinear. If this matrix were factor-analysed, it would extract as many factors as variables, since each variable would be its own factor. As such, it is totally non-factorable. The factor solution would be exactly the same as the initial solution.

The determinant of an identity matrix is equal to one, while the determinant of a non-identity matrix is some other value. Bartlett's Test of Sphericity calculates the determinant of the matrix of the sums of products and cross-products (S) from which the inter-correlation matrix is derived. The determinant of the matrix S is converted to a chi-square statistic and tested for significance.
Null Hypothesis: The inter-correlation matrix of the variables is not different from an identity matrix.

Alternative Hypothesis: The inter-correlation matrix of the variables is different from an identity matrix.

Table 10.9 shows the sample results:
Table 10.9: Sample Results of Bartlett's Test of Sphericity

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: 0.914
Bartlett's Test of Sphericity: Approx. Chi-Square = 887.955; df = 45; Sig. = .000
Test Results: χ² = 887.955; df = 45; p < 0.0001
Statistical Decision
The inter-correlation matrix of the variables is significantly different from an identity matrix. In other words, the sample inter-correlation matrix did not come from a population in which the inter-correlation matrix is an identity matrix.
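The chi-square statistic follows from the determinant of the correlation matrix via Bartlett's formula. A sketch using the rounded determinant from Table 10.7; the result differs slightly from the reported 887.955 because the determinant is rounded to 0.005:

```python
import math

# Bartlett's test of sphericity:
# chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|,  n = sample size, p = number of variables
n, p = 176, 10
det_R = 0.005                     # determinant reported in Table 10.7 (rounded)

chi2 = -(n - 1 - (2 * p + 5) / 6) * math.log(det_R)
df = p * (p - 1) // 2             # 45 degrees of freedom for 10 variables
print(round(chi2, 1), df)         # ~905 with the rounded determinant
```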
(b) Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO)
If two variables share a common factor with other variables, their partial correlation (aij) will be small, indicating that they share little unique variance.

If aij → 0.0, the variables are measuring a common factor, and KMO → 1.0.

If aij → 1.0, the variables are not measuring a common factor, and KMO → 0.0.
Table 10.10 portrays the interpretation of the KMO as characterised by Kaiser, Meyer, and Olkin:

Table 10.10: Degree of Common Variance

KMO Value      Degree of Common Variance
0.90 to 1.00   Marvelous
0.80 to 0.89 Meritorious
0.70 to 0.79 Middling
0.60 to 0.69 Mediocre
0.50 to 0.59 Miserable
0.00 to 0.49 Not Appropriate for Factor Analysis
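The overall KMO can be computed from a correlation matrix via the partial (anti-image) correlations, which are obtained from the inverse of the matrix. A sketch with a small hypothetical matrix (the values are illustrative, not from the study data):

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d                       # partial (anti-image) correlations
    np.fill_diagonal(partial, 0.0)
    off = ~np.eye(R.shape[0], dtype=bool)    # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    a2 = (partial[off] ** 2).sum()
    return r2 / (r2 + a2)                    # near 1 when partials are near 0

# A toy 3-variable correlation matrix (hypothetical, for illustration only).
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
print(round(kmo(R), 3))
```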
As characterised by Kaiser, Meyer, and Olkin, the KMO result for the present data is shown in Table 10.11.

Table 10.11: KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy: 0.914
Bartlett's Test of Sphericity: Approx. Chi-Square = 887.955; df = 45; Sig. = .000
The KMO = 0.914
Interpretation
The degree of common variance among the ten variables is marvellous.
If a factor analysis is conducted, the factors extracted will account for asubstantial amount of variance.
10.4.2 Extracting an Initial Solution
A variety of methods have been developed to extract factors from an inter-correlation matrix. SPSS Statistics offers the following methods:
(i) Principal components
(ii) Unweighted least-squares
(iii) Generalised least squares
(iv) Maximum likelihood
(v) Principal axis factoring
(vi) Alpha factoring
(vii) Image factoring

Note: In this module, we will only focus on the Principal Component Method.
Communality is the proportion of variance of a particular variable (item in thequestionnaire) that is due to common factors. In the initial solution, each variable(item) is considered as a single factor, as such, the communality for the initialsolution is 1.00. After extraction, the number of factors will be reduced and each
initial factor (item) now belongs to new factors and the new factors explain acertain proportion of the variance in the variable. Thus, the proportion ofvariance of each variable (item) explained by the new factors is less than 1.00
(refer to Table 10.12).
Table 10.12: Communalities

       Initial   Extraction
rq1 1.000 .626
rq2 1.000 .623
rq3 1.000 .647
rq4 1.000 .732
rq5 1.000 .649
rq6 1.000 .588
rq7 1.000 .594
rq8 1.000 .694
rq9 1.000 .762
rq10 1.000 .614
The variance of each variable is 1.0, so the total variance to be explained is 10 (10 variables, each with a variance of 1.0). Since a single variable can account for 1.0 unit of variance, a useful new factor must account for more than 1.0 unit of variance, or have an eigenvalue (λ) greater than 1.0. Otherwise, the factor extracted (new factor) explains less variance than a single variable. Table 10.13 shows the results of the factor analysis of the 10 items.
10.4.3 Determine the Appropriate Number of Factors tobe Extracted in the Final Solution
Table 10.13: The Results of Factor Analysis

            Initial Eigenvalues           Extraction Sums of          Rotation Sums of
                                          Squared Loadings            Squared Loadings
Component   Total   % of Var.  Cum. %     Total   % of Var.  Cum. %   Total   % of Var.  Cum. %
1           5.489   54.888     54.888     5.489   54.888     54.888   3.515   35.152     35.152
2           1.041   10.406     65.294     1.041   10.406     65.294   3.014   30.143     65.294
3            .691    6.910     72.205
4            .539    5.387     77.592
5            .506    5.056     82.648
6            .395    3.948     86.596
7            .383    3.830     90.426
8            .359    3.590     94.017
9            .320    3.201     97.218
10           .278    2.782    100.000

Extraction Method: Principal Component Analysis.

Referring to the above Table 10.13, the results of the initial solution:
Interpretation
10 factors (components) were extracted, the same as the number of variables factored:
(a) Factor I
The 1st factor has an eigenvalue = 5.489. The value is greater than 1.0; as such, it explains more variance than a single variable, in fact 5.489 times as much.

The percent of variance explained by Factor I is:

(5.489 / 10 units of variance) × 100 = 54.89%
(b) Factor II
The 2nd factor has an eigenvalue = 1.041. It is also greater than 1.0 and, therefore, explains more variance than a single variable.

The percent of variance explained by Factor II is:

(1.041 / 10 units of variance) × 100 = 10.41%

(c) Subsequent factors
The subsequent factors (3 through 10) have eigenvalues less than 1.0 and, as such, explain less variance than a single variable. These are not good factors.
The Key Points

The sum of the eigenvalues associated with the factors (components) is 10 (i.e. 5.489 + 1.041 + 0.691 + 0.539 + ... + 0.278 = 10).

The cumulative percentage of variance explained by the first two factors is 65.29%.

In other words, 65.29% of the common variance shared by the 10 variables can be accounted for by the 2 factors.

This initial solution suggests that the final solution should extract not more than 2 factors.
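The percentages and cumulative percentage in Table 10.13 follow directly from the eigenvalues:

```python
# Eigenvalues from the initial solution (Table 10.13).
eigenvalues = [5.489, 1.041, 0.691, 0.539, 0.506,
               0.395, 0.383, 0.359, 0.320, 0.278]

total = sum(eigenvalues)                   # ~10: one unit of variance per variable
pct = [100 * e / 10 for e in eigenvalues]  # percent of variance per factor
cumulative_two = pct[0] + pct[1]           # variance explained by the first two factors
print(round(cumulative_two, 2))            # ~65.3, matching the table
```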
In determining the appropriate number of factors to be extracted in the final solution, there are two more important elements to be addressed:

(a) Cattell's Scree Plot
Another way to determine the number of factors to extract in the final solution is via Cattell's scree plot (refer to Figure 10.3). This is a plot of the eigenvalues associated with each of the factors extracted, against each factor. At the point where the plot begins to level off, the additional factors explain less variance than a single variable.
Figure 10.3: Cattell's Scree Plot

(b) Factor Loadings
The component matrix indicates the correlation of each variable with each factor.
Component Matrixa

        Component
         1       2
rq1 .785 .099
rq2 .748 -.253
rq3 .795 -.127
rq4 .640 -.567
rq5 .776 -.216
rq6 .762 -.086
rq7 .765 -.096
rq8 .727 .406
rq9 .667 .563
rq10 .728 .289
Extraction Method: Principal Component Analysis.
a. 2 components extracted.
Explanation:
The variable rq1 correlates 0.785 with Factor I and 0.099 with Factor II.

The total proportion of the variance in rq1 explained by the two factors is:

(0.785)² + (0.099)² = 0.626
This is called the communality of the variable rq1.
The communalities of the 10 variables are as follows (cf. the column headed "Extraction"):

Communalities
       Initial   Extraction
rq1 1.000 .626
rq2 1.000 .623
rq3 1.000 .647
rq4 1.000 .732
rq5 1.000 .649
rq6 1.000 .588
rq7 1.000 .594
rq8 1.000 .694
rq9 1.000 .762
rq10 1.000 .614
The proportion of variance in each variable accounted for by the two factors is not the same.
The key to determining what the factors measure is the factor loadings.
Component Matrixa

        Component
         1       2
rq1 .785 .099
rq2 .748 -.253
rq3 .795 -.127
rq4 .640 -.567
rq5 .776 -.216
rq6 .762 -.086
rq7 .765 -.096
rq8 .727 .406
rq9 .667 .563
rq10 .728 .289
Extraction Method: Principal Component Analysis.
a. 2 components extracted.
Factor I

Variable   Factor Loading
rq1        .785
rq2        .748
rq3 .795
rq4 .640
rq5 .776
rq6 .762
rq7 .765
rq8 .727
rq9 .667
rq10 .728
The correlation coefficient between rq1 and Factor I is 0.785
The correlation coefficient between rq2 and Factor I is 0.748
The correlation coefficient between rq3 and Factor I is 0.795
The correlation coefficient between rq4 and Factor I is 0.640
The correlation coefficient between rq5 and Factor I is 0.776
The correlation coefficient between rq6 and Factor I is 0.762
The correlation coefficient between rq7 and Factor I is 0.765
The correlation coefficient between rq8 and Factor I is 0.727
The correlation coefficient between rq9 and Factor I is 0.667
The correlation coefficient between rq10 and Factor I is 0.728
Factor II

Variable   Factor Loading
rq1 .099
rq2 -.253
rq3 -.127
rq4 -.567
rq5 -.216
rq6 -.086
rq7 -.096
rq8 .406
rq9 .563
rq10 .289
The correlation coefficient between rq1 and Factor II is 0.099
The correlation coefficient between rq2 and Factor II is -0.253
The correlation coefficient between rq3 and Factor II is -0.127
The correlation coefficient between rq4 and Factor II is -0.567
The correlation coefficient between rq5 and Factor II is -0.216
The correlation coefficient between rq6 and Factor II is -0.086
The correlation coefficient between rq7 and Factor II is -0.096
The correlation coefficient between rq8 and Factor II is 0.406
The correlation coefficient between rq9 and Factor II is 0.563
The correlation coefficient between rq10 and Factor II is 0.289
10.4.4 Rotate the Factors to Clarify the Factor Pattern in order to Better Interpret the Nature of the Factors
In many instances, one or more variables may load about the same on more than one factor, making the interpretation of the factors ambiguous. Ideally, the analyst would like to find that each variable loads high (close to 1.0) on one factor and approximately zero (close to 0.0) on all the others. The factor pattern can be clarified by "rotating" the factors in F-dimensional space. There are two types of rotation:
(a) Orthogonal Rotation: Preserves the independence of the factors; geometrically, they remain 90° apart.

(b) Oblique Rotation: Produces factors that are not independent; geometrically, they are not 90° apart.
Below is the comparison between the Component Matrix and the Rotated Component Matrix (using Varimax rotation, an orthogonal type) for the ten variables:

Component Matrixa

        Component
         1      2
rq1    .785   .099
rq2    .748  -.253
rq3    .795  -.127
rq4    .640  -.567
rq5    .776  -.216
rq6    .762  -.086
rq7    .765  -.096
rq8    .727   .406
rq9    .667   .563
rq10   .728   .289

Extraction Method: Principal Component Analysis.
a. 2 components extracted.

Rotated Component Matrixa

        Component
         1      2
rq1    .519   .597
rq2    .726   .309
rq3    .677   .435
rq4    .855   .003
rq5    .723   .356
rq6    .625   .443
rq7    .634   .438
rq8    .272   .788
rq9    .123   .864
rq10   .350   .701

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.
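A generic varimax rotation can be sketched in a few lines of linear algebra. This is not the exact SPSS routine (SPSS also applies Kaiser normalization, so the rotated loadings will only approximate the table above), but it shows the key property: the rotation is orthogonal, so each variable's communality is unchanged.

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix L toward varimax-simple structure."""
    p, k = L.shape
    T = np.eye(k)
    var_old = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        # Standard SVD-based varimax update.
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p))
        T = u @ vt
        var_new = s.sum()
        if var_new - var_old < tol:
            break
        var_old = var_new
    return L @ T, T

# Unrotated loadings from the Component Matrix (Factors I and II).
L = np.array([
    [.785,  .099], [.748, -.253], [.795, -.127], [.640, -.567],
    [.776, -.216], [.762, -.086], [.765, -.096], [.727,  .406],
    [.667,  .563], [.728,  .289],
])

rotated, T = varimax(L)
print(np.round(rotated, 3))
```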
Reproduced correlation matrix
One measure of the goodness-of-fit is whether the factor solution can reproduce the original inter-correlation matrix among the ten variables.

Table 10.14: Reproduced Correlations
rq1 rq2 rq3 rq4 rq5 rq6 rq7 rq8 rq9 rq10
rq1 .626a .562 .611 .446 .588 .590 .591 .611 .580 .600
rq2 .562 .623a .626 .622 .635 .591 .596 .441 .357 .471
rq3 .611 .626 .647a .580 .644 .616 .620 .526 .459 .542
rq4 .446 .622 .580 .732a .619 .536 .544 .235 .108 .302
rq5 .588 .635 .644 .619 .649a .610 .614 .477 .397 .503
rq6 .590 .591 .616 .536 .610 .588a .591 .519 .460 .530
rq7 .591 .596 .620 .544 .614 .591 .594a .517 .456 .529
rq8 .611 .441 .526 .235 .477 .519 .517 .694a .714 .647
rq9 .580 .357 .459 .108 .397 .460 .456 .714 .762a .649
rq10  .600  .471  .542  .302  .503  .530  .529  .647  .649  .614a

Residualb
rq1          .042  -.033  -.027  -.074  -.009  -.094  -.056  -.026  -.119
rq2    .042        -.011  -.104  -.147  -.047  -.053  -.039   .046  -.070
rq3   -.033  -.011        -.061  -.077  -.080  -.048  -.045   .025  -.046
rq4   -.027  -.104  -.061        -.038  -.106  -.094   .101   .066   .055
rq5   -.074  -.147  -.077  -.038        -.033  -.037  -.011  -.014   .071
rq6   -.009  -.047  -.080  -.106  -.033        -.016  -.009  -.042  -.093
rq7   -.094  -.053  -.048  -.094  -.037  -.016        -.058  -.014  -.008
rq8   -.056  -.039  -.045   .101  -.011  -.009  -.058        -.129  -.045
rq9   -.026   .046   .025   .066  -.014  -.042  -.014  -.129        -.120
rq10  -.119  -.070  -.046   .055   .071  -.093  -.008  -.045  -.120
Extraction Method: Principal Component Analysis.
a. Reproduced communalities
b. Residuals are computed between observed and reproduced correlations. There are 21 (46.0%) non-redundant residuals with absolute values greater than 0.05.
The upper half of Table 10.14 presents the reproduced correlations. Compare these with the lower half of the table, which presents the residuals:

Residual = observed correlation - reproduced correlation

Less than half of the residuals (46%) have absolute values greater than 0.05.
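The reproduced matrix and residuals above can be computed directly from the factor loadings: the reproduced correlation matrix is the loading matrix multiplied by its transpose, with the reproduced communalities on its diagonal. A minimal Python sketch, using a made-up two-factor loading matrix for three items (illustrative values, not the chapter's actual loadings):

```python
import numpy as np

# Hypothetical 2-factor loadings for 3 items (made-up illustrative values)
loadings = np.array([
    [0.75, 0.25],   # item 1
    [0.78, 0.12],   # item 2
    [0.70, 0.40],   # item 3
])

# Reproduced correlation matrix: R_hat = L @ L.T
# Its diagonal holds the reproduced communalities.
reproduced = loadings @ loadings.T

# Hypothetical observed correlations among the same 3 items
observed = np.array([
    [1.000, 0.604, 0.585],
    [0.604, 1.000, 0.402],
    [0.585, 0.402, 1.000],
])

# Residual = observed - reproduced correlation (diagonal is not compared)
residuals = observed - reproduced
np.fill_diagonal(residuals, 0.0)

# Count non-redundant residuals (upper triangle) with |residual| > 0.05
mask = np.triu(np.ones_like(residuals, dtype=bool), k=1)
n_large = int((np.abs(residuals[mask]) > 0.05).sum())
print(n_large)
```

The same count of large residuals, expressed as a percentage of the non-redundant entries, is what SPSS reports in footnote b of Table 10.14.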
10.4.5 Establish the Measures of Goodness-of-Fit of the Factor Solution

Table 10.15 shows the goodness of fit of the two-factor solution.

Table 10.15: Goodness of Fit of the Two-Factor Solution

Measure                    Value                      Interpretation
KMO                        0.914                      Marvelous
Bartlett's Test            χ² = 887.955; df = 45;     The inter-correlation matrix provides
                           p < 0.0001                 evidence of the presence of common factors
Total Variance Explained   65.29%                     The two factors extracted can explain 65.29%
                                                      of the variance in the ten variables
Factor pattern             2 factors                  The pattern is clear for two factors
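The Bartlett's test statistic reported in Table 10.15 can be reproduced from the inter-correlation matrix using the standard formula χ² = -[(n - 1) - (2p + 5)/6] · ln|R| with df = p(p - 1)/2. A minimal sketch with a made-up 3-variable correlation matrix and sample size (not the chapter's data):

```python
import numpy as np
from math import log

def bartlett_sphericity(R, n):
    """Bartlett's test of sphericity for a p x p correlation matrix R
    observed on n cases. Returns (chi_square, degrees_of_freedom)."""
    p = R.shape[0]
    chi_sq = -(n - 1 - (2 * p + 5) / 6.0) * log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi_sq, df

# Hypothetical correlation matrix for 3 variables, n = 200 respondents
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
chi_sq, df = bartlett_sphericity(R, n=200)
print(round(chi_sq, 2), df)
```

A large χ² (small p-value) means the correlation matrix is not an identity matrix, i.e. there is enough shared variance among the variables for factoring to be worthwhile.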
10.5 RELIABILITY
In many areas of educational and psychological research, the precise measurement of variables or theoretical constructs poses a challenge. For example, the precise measurement of personality variables or attitudes is usually a necessary first step before any theories of personality or attitudes can be considered. In general, unreliable measurements of people's beliefs or intentions will hamper efforts to predict their behaviour. Reliability analysis is often used to check the reliability of an instrument statistically. Reliability is the measure of consistency of a particular instrument: the capability of the instrument to produce consistently similar results if it were administered to a homogeneous group of respondents. Generally, there are four classes of reliability estimates: inter-rater or inter-observer reliability, test-retest reliability, parallel-form reliability, and internal consistency. Inter-rater or inter-observer reliability assesses the degree to which two different observers agree in describing a phenomenon. This is widely used in establishing reliability
for open-ended questions. The test-retest, parallel-forms, and internal consistency approaches are mainly used to assess the reliability of fixed-response items. Test-retest reliability measures the consistency of the measure from one time to another, while parallel-form reliability measures the consistency of two tests constructed from the same content domain. Internal consistency evaluates the consistency of the responses across the items within the instrument. It is reported in terms of the Cronbach's alpha coefficient, whose values range from zero to one and which is given by the formula:
α = [k / (k − 1)] × [1 − (Σ Si²) / Ssum²]

where the sum Σ runs over the k items, and

Si²   = variance of the scores on item i
Ssum² = variance of the sum of all k items

If the items contain no true score but only random errors (uncorrelated across items), then Σ Si² = Ssum² and α = 0.

If all items measure the same thing (true score), then α = 1.

Nunnally (1978) suggests an α > 0.7.
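The formula above can be computed directly from raw item scores. A minimal Python sketch, using a small made-up data set (5 respondents, 3 Likert items), not the chapter's 10-item data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: alpha = k/(k-1) * (1 - sum(S_i^2) / S_sum^2)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of items
    item_vars = items.var(axis=0, ddof=1)        # S_i^2 for each item
    total_var = items.sum(axis=1).var(ddof=1)    # S_sum^2: variance of total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses of 5 students to 3 Likert items (1-5)
scores = [
    [4, 4, 5],
    [3, 3, 3],
    [5, 4, 5],
    [2, 2, 3],
    [4, 5, 4],
]
print(round(cronbach_alpha(scores), 3))  # -> 0.913
```

By Nunnally's criterion (α > 0.7), this hypothetical scale would be judged acceptable.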
10.5.1 Reliability Using Cronbach's Alpha

There are many different statistics for checking reliability, and one of the most commonly used is Cronbach's Alpha, which is based on the average correlation of items within a test. Cronbach's alpha is the most common form of internal consistency reliability coefficient. By convention, a lenient cut-off of 0.60 is common in exploratory research; alpha should be at least 0.70 or higher to retain an item in an "adequate" scale; and many researchers require a cut-off of 0.80 for a "good" scale.
Example

A researcher gave a 10-item questionnaire on Emotional Intelligence to a sample of randomly selected secondary school students. The aim is to determine the internal consistency of the scale using Cronbach's alpha. Table 10.16 below shows the SPSS output.
Table 10.16: Item-Total Statistics

       Scale Mean if   Scale Variance    Corrected Item-    Squared Multiple   Cronbach's Alpha
       Item Deleted    if Item Deleted   Total Correlation  Correlation        if Item Deleted
rq1    41.89           63.948            .718               .560               .895
rq2    41.78           64.915            .676               .533               .897
rq3    41.89           64.380            .731               .555               .894
rq4    42.24           65.499            .560               .458               .905
rq5    42.19           62.074            .713               .573               .895
rq6    42.14           63.800            .692               .516               .896
rq7    42.00           63.202            .696               .508               .896
rq8    41.83           64.745            .654               .521               .899
rq9    41.93           66.185            .583               .491               .903
rq10   41.97           64.849            .658               .517               .898
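The "Corrected Item-Total Correlation" column of Table 10.16 correlates each item with the sum of the remaining items. A minimal sketch of that computation, using a small made-up data set (5 respondents, 3 items) rather than the chapter's data:

```python
import numpy as np

def corrected_item_total(items):
    """Corrected item-total correlation: each item correlated with the
    sum of the *remaining* items (as in the third column of Table 10.16)."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Hypothetical responses of 5 students to 3 Likert items (1-5)
scores = [
    [4, 4, 5],
    [3, 3, 3],
    [5, 4, 5],
    [2, 2, 3],
    [4, 5, 4],
]
print(np.round(corrected_item_total(scores), 3))
```

Subtracting the item from the total before correlating ("corrected") avoids inflating the coefficient by correlating the item with itself.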
SPSS STATISTICS Commands for Reliability Analysis
- Select the Analyze menu and click on Scale, then Reliability Analysis, to open the Reliability Analysis dialogue box.
- Select the variables or items you require and click the right arrow to move them to the Items: box.
- Ensure that Alpha is displayed in the Model: box.
- Click on the Statistics... command pushbutton to open the Reliability Analysis: Statistics sub-dialogue box.
- In the Descriptives for box, select the Scale and Scale if item deleted check boxes.
- In the Inter-Item box, select the Correlations check box.
- Click on Continue and then OK.
10.5.2 Interpretation of Cronbach's Alpha

There are several aspects to the interpretation of Cronbach's alpha:
(a) Scale Mean if Item Deleted
This column gives the average scale score if the specific item is excluded from the scale. So, if rq1 is deleted, the average score will be 41.89.
(b) Corrected Item-Total Correlation
This column gives the Pearson correlation coefficient between the individual item and the sum of the scores on the remaining items. A low item-total correlation means that the item correlates only weakly with the overall scale, and the researcher should consider dropping it. However, a scale with an acceptable Cronbach's alpha may still have one or more items with low item-total correlations. Items rq4 and rq9 are not very strong in that they are less consistent with the rest of the scale: their correlations with the sum scale are 0.56 and 0.58 respectively, while all other items correlate at 0.65 or better.
(c) Cronbach's Alpha if Item Deleted
This column gives the alpha coefficient that would result if the item were removed from the attitude scale. The researcher may wish to drop items with high coefficients in this column as another way to improve the alpha level.
(d) Cronbach's Alpha
The Cronbach's alpha for the overall attitude scale is 0.7678 for the 10 items without removal of any items. The alpha can be increased if the two weaker items are removed. It is common practice for researchers either to remove the problematic items, or to rewrite them and administer the instrument again to see if the alpha improves.
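The "Cronbach's Alpha if Item Deleted" column can be reproduced by simply recomputing alpha with each item removed in turn. A minimal sketch on the same kind of made-up data set as before (5 respondents, 3 items, not the chapter's data):

```python
import numpy as np

def alpha_if_deleted(items):
    """Recompute Cronbach's alpha with each item removed in turn,
    mirroring the 'Cronbach's Alpha if Item Deleted' column."""
    items = np.asarray(items, dtype=float)
    out = []
    for j in range(items.shape[1]):
        sub = np.delete(items, j, axis=1)        # drop item j
        m = sub.shape[1]
        item_vars = sub.var(axis=0, ddof=1).sum()
        total_var = sub.sum(axis=1).var(ddof=1)
        out.append((m / (m - 1)) * (1 - item_vars / total_var))
    return np.array(out)

# Hypothetical responses of 5 students to 3 Likert items (1-5)
scores = [
    [4, 4, 5],
    [3, 3, 3],
    [5, 4, 5],
    [2, 2, 3],
    [4, 5, 4],
]
print(np.round(alpha_if_deleted(scores), 3))
```

An entry in this column that is noticeably higher than the overall alpha flags an item whose removal would improve the scale's internal consistency.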
ACTIVITY 10.1
(a) What is reliability analysis?

(b) What does Cronbach's alpha indicate?

(c) Explain "Cronbach's alpha if item deleted".
Factor analysis is used to uncover the latent structure (dimensions) of a set of variables.
Principal Component Analysis is used to reduce the number of variables intoa smaller set of principal components (dimensions).
Among the required assumptions for factor analysis are a large sample, normality (not for PCA), linear relationships among variables, absence of outliers, and no multicollinearity.
Factor loading is the correlation between a variable and a factor that has beenextracted from the data.
Bartlett's Test of Sphericity and the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) are two commonly used tests of the factorability of the data.
An initial factor solution is normally rotated to obtain a more interpretablesolution.
The initial solution can be rotated using orthogonal or oblique rotations.

Reliability is the measure of consistency of a particular instrument.

There are four classes of reliability estimates: inter-rater or inter-observer reliability, test-retest reliability, parallel-form reliability, and internal consistency.
Cronbach's Alpha is the most common form of internal consistency reliability coefficient.
Factor analysis
Principal component analysis
Factor loading
Correlation matrix
Covariance
Rotation
Orthogonal
Oblique
Reliability
Cronbach's alpha coefficient
Carry out factor analysis to determine the dimensions in the Emotional Intelligence construct developed by the researcher (you can either name the factors or label them as Factor 1, Factor 2, etc.).

Report the Cronbach's Alpha for each dimension.
Black, T. R. (1999). Doing quantitative research in the social sciences. London: Sage Publications.

Coladraci, T., Cobb, C., Minium, E., & Clarke, R. (2007). Fundamentals of statistical reasoning in education. New Jersey: Wiley.

Dancey, C. P., & Reidy, J. (2007). Statistics without maths for psychology. Harlow, England: Pearson Prentice Hall.

Field, A. (2005). Discovering statistics using SPSS. London: Sage Publications.

Hair, J. F., Black, W. C., Babin, B. J., Anderson, R. E., & Tatham, R. L. (2006). Multivariate data analysis. Upper Saddle River: Prentice Hall.

Welkowitz, J., Cohen, B., & Ewen, R. (2006). Introductory statistics for the behavioral sciences. New Jersey: Wiley.