Topic 10 Factor Analysis and Reliability


  • 8/8/2019 Topic 10 Factor Analysis and Reliability

    1/32


    INTRODUCTION

    Factor analysis is used to uncover the latent structures (dimensions) of a set of variables. It belongs to a family of data reduction analyses; other methods in this family are Latent Class Analysis, Latent Profile Analysis, Latent Trait Analysis, and Principal Component Analysis (PCA). The focus of PCA is to reduce a large number of variables into a smaller set of principal components (dimensions), which allows researchers to use a smaller number of factors to explain what the long list of variables actually measures. Prior to multiple regression analysis, factor analysis can be used to create a set of factors that are treated as uncorrelated variables, as one approach to handling multicollinearity. Factor analysis is an interdependency technique: it aims to find the latent factors that account for the patterns of collinearity among multiple metric variables. Some statisticians do not consider PCA to be factor analysis.


    LEARNING OUTCOMES

    By the end of this topic, you should be able to:

    1. Describe the requirements for factor analysis for a given data set;

    2. Use the appropriate method to determine the principal components underpinning the responses on a set of variables; and

    3. Compute the reliability index.


    Salient features of Principal Component Analysis:

    (a) It is a variable reduction procedure.

    (b) The main purpose is to reduce the number of variables into a smaller set of principal components (dimensions).

    (c) It is a large sample procedure where the focus is only on summarising the sample information into a smaller set of principal components, as opposed to detecting the latent factors that influence the scores on the observed variables.

    The following are the assumptions for factor analysis:

    (a) The sample is large enough to yield reliable estimates of the correlations among the variables (according to Hair et al., 5 respondents per item in the scale are preferred).

    (b) Statistical inference is improved if the variables are multivariate normal (not required for PCA).

    (c) Relationships among the pairs of variables are linear.

    (d) Absence of outliers among the cases.

    (e) Some degree of collinearity among the variables, but not an extreme degree or singularity (according to Kline (1998), the correlation values between the variables should fall between 0.3 and 0.8).

    10.1 ILLUSTRATING THE INTER-DEPENDENCY BETWEEN VARIABLES

    A teacher wanted to gauge the Emotional Intelligence of the Form Five students of his school. Based on his readings, he drafted a nine-item questionnaire. Respondents were required to provide their ratings on a five-point Likert scale. He administered the questionnaire to a group of Form Five students and ran a simple correlation analysis.


    The following Table 10.1 shows the results:

    Table 10.1: Correlation Analysis Output

          X1    X2    X3    X4    X5    X6    X7    X8    X9
    X1   1.00  0.75  0.70  0.65  0.01  0.20  0.18  0.16  0.03
    X2         1.00  0.63  0.65  0.08  0.11  0.13  0.04  0.09
    X3               1.00  0.74  0.02  0.12  0.07  0.15  0.05
    X4                     1.00  0.01  0.11  0.06  0.02  0.13
    X5                           1.00  0.73  0.72  0.65  0.83
    X6                                 1.00  0.71  0.79  0.72
    X7                                       1.00  0.95  0.75
    X8                                             1.00  0.73
    X9                                                   1.00

    Basic Principle:
    Variables that correlate significantly with each other do so because they are measuring the same "construct".

    The Problem:
    What is the "construct" that brings the variables together?

    The interpretation of Table 10.1:

    (a) Variables 1, 2, 3 and 4 correlate highly with each other, but not with the rest of the variables.

    (b) Variables 5, 6, 7, 8 and 9 correlate highly with each other, but not with the rest of the variables.

    (c) The nine variables seem to be measuring TWO "constructs" or underlying factors.

    To find out the answer, we need to carry out factor analysis or, to be more precise, Principal Component Analysis.
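The block pattern read off Table 10.1 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names and the grouping into two blocks are illustrative choices, not part of the module.

```python
import numpy as np

# Upper triangle of Table 10.1, row by row (X1..X9), starting at the diagonal.
upper = [
    [1.00, 0.75, 0.70, 0.65, 0.01, 0.20, 0.18, 0.16, 0.03],
    [1.00, 0.63, 0.65, 0.08, 0.11, 0.13, 0.04, 0.09],
    [1.00, 0.74, 0.02, 0.12, 0.07, 0.15, 0.05],
    [1.00, 0.01, 0.11, 0.06, 0.02, 0.13],
    [1.00, 0.73, 0.72, 0.65, 0.83],
    [1.00, 0.71, 0.79, 0.72],
    [1.00, 0.95, 0.75],
    [1.00, 0.73],
    [1.00],
]

# Mirror the upper triangle to obtain the full symmetric matrix.
R = np.eye(9)
for i, row in enumerate(upper):
    for offset, value in enumerate(row):
        R[i, i + offset] = R[i + offset, i] = value

block_a = [0, 1, 2, 3]        # X1..X4
block_b = [4, 5, 6, 7, 8]     # X5..X9


def mean_off_diagonal(block):
    """Average correlation among the variables of one block."""
    sub = R[np.ix_(block, block)]
    mask = ~np.eye(len(block), dtype=bool)
    return sub[mask].mean()


within_a = mean_off_diagonal(block_a)
within_b = mean_off_diagonal(block_b)
between = R[np.ix_(block_a, block_b)].mean()

print(f"mean r within X1-X4: {within_a:.3f}")
print(f"mean r within X5-X9: {within_b:.3f}")
print(f"mean r between the two blocks: {between:.3f}")
```

The two within-block averages are high while the between-block average is close to zero, which is the pattern that suggests two underlying constructs.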


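    Variance [Cov(Y, Y)]:

    $$\operatorname{Cov}(Y, Y) = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(Y_i - \bar{Y})}{n - 1}$$

    Using the standardised values, Cov(Y, Y) = 1.00.

    Covariance [Cov(X, Y)]:

    $$\operatorname{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} = 0.77$$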

    Variance-covariance matrix:

    $$\begin{pmatrix} \operatorname{cov}(x, x) & \operatorname{cov}(x, y) \\ \operatorname{cov}(y, x) & \operatorname{cov}(y, y) \end{pmatrix}$$

    If the X and Y scores are transformed into standardised scores, the variance-covariance matrix gives the correlation matrix. Thus, for the above example, the variance-covariance matrix for the standardised values of X and Y is

    $$\begin{pmatrix} 1.00 & 0.77 \\ 0.77 & 1.00 \end{pmatrix}$$

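    Step 3
    Calculate the eigenvectors and eigenvalues of the covariance matrix:

    $$\underbrace{\begin{pmatrix} \operatorname{cov}(x, x) & \operatorname{cov}(x, y) \\ \operatorname{cov}(y, x) & \operatorname{cov}(y, y) \end{pmatrix}}_{\text{variance-covariance matrix}} \times \underbrace{\begin{pmatrix} a \\ b \end{pmatrix}}_{\text{eigenvector}} = \underbrace{\lambda}_{\text{eigenvalue}} \begin{pmatrix} a \\ b \end{pmatrix}$$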


    The number of eigenvalues depends on the number of variables in the analysis. In general, if there are n variables in the analysis, there will be n eigenvalues. The eigenvalues will not all have the same magnitude, but (for standardised variables) their total is equal to the number of variables in the analysis.

    Each eigenvalue has its corresponding eigenvector. The computation of eigenvalues and eigenvectors involves complicated mathematical procedures, especially if the number of variables in the analysis is large. Any computer software that performs Principal Component Analysis will compute the eigenvalues (while some programmes will also provide the eigenvectors).

    For the example above, the eigenvalues are:

    $$\lambda_1 = 1 + r_{12} \quad \text{and} \quad \lambda_2 = 1 - r_{12}$$

    (where $r_{12}$ is the correlation between the two variables; in this case, the correlation value is 0.77)

    Thus, the eigenvalues are 1.77 and 0.23.
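The two eigenvalues can be verified with any linear algebra routine. The sketch below assumes NumPy and uses its `numpy.linalg.eigvalsh` function for symmetric matrices; it only illustrates the arithmetic above.

```python
import numpy as np

r12 = 0.77  # correlation between the two standardised variables

# Correlation matrix of the standardised X and Y scores.
R = np.array([[1.0, r12],
              [r12, 1.0]])

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order;
# reverse them so the dominant one comes first.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

print("eigenvalues:", np.round(eigenvalues, 2))
print("sum of eigenvalues:", round(float(eigenvalues.sum()), 2))
```

The sum of the eigenvalues equals the number of variables (two here), in line with the statement that the eigenvalues total the number of variables in the analysis.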

    The eigenvalues will give the eigenvectors.

    When eigenvalue is 1.77, the eigenvector is 542 1.When eigenvalue is 0.23, the eigenvector is

    431

    1

    .


    Step 4
    Plot the standardised values on a two-dimensional plane and overlay the eigenvectors.

    [Figure 10.1: Plotting on a two-dimensional plane and overlaying the eigenvectors. Both axes run from -2.00 to 2.00, and the two eigenvectors are drawn over the plotted values.]

    From the plot shown in Figure 10.1, it can be concluded that the data set is fairly well represented by the eigenvector derived when the eigenvalue is 1.77.

    The above discussion is for illustrative purposes only. In real situations, there will be more than two observed variables and thus a visual (e.g. graphical) representation is not possible.


    10.2.2 Types of Factor Analysis

    Basically, there are two types of factor analysis:

    (a) Exploratory factor analysis
    It is a non-theoretical application. The aim is to answer the question "Given a set of variables, what are the underlying dimensions (factors) that account for the patterns of collinearity among the variables?"

    Example: Respondents' responses on a scale measuring delinquency are governed by a certain theory; as such, what are the latent factors that influence their behaviour?

    (b) Confirmatory factor analysis
    It is used to validate a predetermined theory. The aim is to answer the question "Do the responses on a scale conform with the theory that explains respondents' behaviour?"

    Example: Given a theory that attributes delinquency to four independent factors, do respondents' responses on a scale that measures delinquency converge into these four factors?

    10.3 THE LOGIC OF FACTOR ANALYSIS (PRINCIPAL COMPONENT ANALYSIS)

    In studying the Emotional Intelligence of teachers, a researcher used focus group interviews to develop the instrument for his study. The following items (the data are attached in Appendix II, Data Set B) were generated based on focus group interviews with selected teachers from Klang Valley.

    1 It is difficult for me to face unpleasant situations.

    2 I am able to face challenges pretty well.

    3 I am able to deal with upsetting problems.

    4 I find it difficult to control my anxiety.

    5 I am able to keep calm in difficult situations.

    6 I can handle stress without getting too nervous.

    7 I am usually calm when facing challenging situations.

    8 I am motivated to continue, even when things get difficult.

    9 Whatever the situation, I believe I can handle it well.

    10 I am optimistic about most things I do.


    11 I am sure of what I am doing in most situations.

    12 I believe things will turn out all right despite setbacks from time to time.

    13 I believe in my ability to handle upsetting problems/situations.

    14 If others can do it, I don't see why I can't.

    15 I feel good about myself.

    16 I feel that I am not inferior compared with others.

    17 I feel confident of myself in most situations.

    18 I have good self respect.

    19 I am happy with what I am now.

    20 It is fairly easy for me to express my feelings.

    21 I am aware of what is happening around me even when I am upset.

    22 I am aware of the way I feel.

    23 It is difficult for me to describe my feelings.

    The researcher developed a questionnaire to assess the Emotional Intelligence of teachers using the items generated from the focus group interviews. He used a 7-point Likert scale for his questionnaire, described as follows:

    [1 = Strongly Disagree; 2 = Disagree; 3 = Slightly Disagree; 4 = Not Sure; 5 = Slightly Agree; 6 = Agree; 7 = Strongly Agree]

    He administered the questionnaire to 176 randomly selected teachers from both private and public schools in Klang Valley. Table 10.3 shows a sample of the responses.

    Table 10.3: Sample of Students' Responses

    Subject   Variable
              X1        X2        X3        X4        X5        X6        ...   Xn
    1         6         5         7         3         4         4
    2         5         7         4         4         4         3
    3         7         5         6         2         5         4
    ...
    N         Mean X1   Mean X2   Mean X3   Mean X4   Mean X5   Mean X6   ...   Mean Xn

    Having run the correlation analysis, the researcher found that some of the items have high correlations with one another while others do not. Table 10.4 shows an example of the correlation analysis.


    Table 10.4: Sample of Inter-Correlation Values between Variables

          X1    X2    X3    X4    X5    X6    ...   Xk
    X1   1.00  0.76  0.84
    X2         1.00  0.76
    X3               1.00
    X4                     1.00  0.76  0.77
    X5                           1.00  0.81
    X6                                 1.00
    ...
    Xk                                             1.00

    The next logical thing to do is to cluster the variables with high inter-correlations together and define them as belonging to the same family. This is what factor analysis (or Principal Component Analysis, to be precise) is all about. Table 10.5 displays an example of the factor analysis. The values in the cells are the factor loadings (refer to Subsection 10.3.1 for further explanation of factor loadings).

    Table 10.5: Sample of Factor Analysis Outcome

    Variables   Factor I   Factor II   Factor III   Factor IV   Factor ..   Factor n
    X1          0.932      0.013       0.250
    X2          0.851      0.426       0.211
    X3          0.634      0.451       0.231
    X4          0.322      0.644       0.293
    X5          0.725      0.714       0.293
    X6          0.435      0.641       0.332
    X7          0.322      0.311       0.677
    X8          0.211      0.233       0.771
    ...
    Xk          0.122      0.110       0.200


    10.3.1 Factor Loading

    What is a Factor Loading?
    A factor loading is the correlation between a variable and a factor that has been extracted from the data.

    Example
    Note the factor loadings for variable X1.

    Variables   Factor I   Factor II   Factor III
    X1          0.932      0.013       0.250

    Variable X1 is highly correlated with Factor I, but negligibly correlated with Factors II and III.

    Communality: Refers to the total variance in variable X1 accounted for by the three factors that were extracted.

    Simply square the factor loadings and add them together:

    $$0.932^2 + 0.013^2 + 0.250^2 = 0.93129$$

    Before any factors are extracted, the initial communality of each variable is always 1.00. In the above example, emotional intelligence is operationalised using 23 specific situations, so there will be 23 initial factors, with some having greater dominance than others (this will be reflected in the eigenvalues).

    Once the dominant factors are identified (e.g. those with an eigenvalue greater than 1.00), the communality value for each variable will be less than 1.00. This is because, in factor analysis, those factors that have a negligible effect on the variables will be dropped.


    10.4 STEPS IN FACTOR ANALYSIS (PRINCIPAL COMPONENT ANALYSIS)

    There are a few crucial steps to be followed in factor analysis or, to put it more precisely, Principal Component Analysis:

    Step 1
    Compute a k by k inter-correlation matrix. According to Hair et al., inter-correlation values must be at least 0.3 for the items to be considered for factor analysis.

    Step 2
    Extract an initial solution.

    Step 3
    Determine the appropriate number of factors to be extracted in the final solution.

    Step 4
    If necessary, rotate the factors to clarify the factor pattern in order to better interpret the nature of the factors.

    Step 5
    Establish the measures of goodness-of-fit of the factor solution.

    A Ten Variable Example
    The researcher used the responses to the first 10 questions of the Emotional Intelligence questionnaire to perform factor analysis. Table 10.6 below shows the codes and the variable names for the variables included in the factor analysis.

    Table 10.6: Codes and Variable Names

    Code   Variable Name
    rq1    It is difficult for me to face unpleasant situations.
    rq2    I am able to face challenges pretty well.
    rq3    I am able to deal with upsetting problems.
    rq4    I find it difficult to control my anxiety.
    rq5    I am able to keep calm in difficult situations.
    rq6    I can handle stress without getting too nervous.
    rq7    I am usually calm when facing challenging situations.
    rq8    I am motivated to continue, even when things get difficult.
    rq9    Whatever the situation, I believe I can handle it well.
    rq10   I am optimistic about most things I do.


    The principal components can be illustrated as follows:

    [Diagram: each of the observed variables X1 to X10 is linked by an arrow to component C1; the arrow style distinguishes strong regression weights from weak regression weights.]

    $$C_1 = b_{11}X_1 + b_{12}X_2 + \dots + b_{1,10}X_{10}$$

    C1 = Factor score on Component 1
    b = Regression weight (also known as factor weight)
    X = Respondent's score on the observed variables

    [Diagram: the same observed variables X1 to X10 linked to component C2; the arrow style distinguishes strong regression weights (large factor loadings) from weak regression weights (small factor loadings).]

    $$C_2 = b_{21}X_1 + b_{22}X_2 + \dots + b_{2,10}X_{10}$$

    C2 = Factor score on Component 2
    bij = Regression weight (also known as factor loading)
    Xi = Respondent's score on the observed variables


    All the observed variables will have some influence on all the factors extracted; however, different sets of variables will have different degrees of influence on the different common factors.

    In short, a principal component is a linear combination of optimally weighted observed variables. The weighting is done in such a way that it maximises the amount of variance in the data set accounted for by the component.
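The idea of "optimally weighted" can be made concrete with a small simulation. This is a minimal sketch under stated assumptions: the data are randomly generated (not the module's data set), NumPy is assumed, and the weights are taken from the eigenvector with the largest eigenvalue of the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 respondents, 4 items driven by one shared latent trait.
latent = rng.normal(size=(200, 1))
X = latent + 0.8 * rng.normal(size=(200, 4))

# Standardise each item (mean 0, variance 1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Eigen-decomposition of the correlation matrix (eigh returns ascending order).
R = np.corrcoef(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)

# Weights of the first principal component: eigenvector of the largest eigenvalue.
weights = eigenvectors[:, -1]
scores = Z @ weights                    # C1 = b1*Z1 + b2*Z2 + b3*Z3 + b4*Z4

score_variance = scores.var(ddof=1)
largest_eigenvalue = eigenvalues[-1]

# Any other unit-length weighting, e.g. equal weights, gives no more variance.
equal_weights = np.ones(4) / 2.0
equal_variance = (Z @ equal_weights).var(ddof=1)

print(f"variance of component scores: {score_variance:.4f}")
print(f"largest eigenvalue:           {largest_eigenvalue:.4f}")
print(f"variance with equal weights:  {equal_variance:.4f}")
```

The variance of the component scores equals the largest eigenvalue, and no other unit-length set of weights can exceed it, which is what "maximises the amount of variance" means here.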

    Figure 10.2 summarises the requirements and assumptions for principal component analysis.

    Figure 10.2: Requirements and assumptions for principal component analysis

    Purpose: Summarise original information into minimal factors
    Measurement level of the principal component factor: Continuous
    Parameter for analysis: Total variance
    Assumption of normality: It is a large sample procedure. No generalisation involved. No assumption of normality.
    Type of analysis: Principal component analysis
    Principal components correlate: Yes / No
    Type of rotation: Oblique (if the components correlate) / Orthogonal (if they do not)

    Extracting the principal components from the list of observed variables is an iterative procedure that requires one to check the assumptions along the way until the final conclusion is made. The procedural map in Appendix VI summarises the procedure and assumptions required for PCA with orthogonal rotation.


    10.4.1 Correlation between Variables

    As a first step, correlations between the variables are computed. Table 10.7 shows the values of the correlations between the variables. The shaded cells represent the diagonal, while the values below and above the diagonal are the correlation values between the variables. Since every variable has a correlation greater than 0.3 with at least one other variable, all 10 variables are factorable. At the same time, the values are not too high (not more than 0.85) and, as such, each variable is distinct from the others.

    Table 10.7: Inter-Correlation among the Variables

    Correlation Matrix (a)

            rq1    rq2    rq3    rq4    rq5    rq6    rq7    rq8    rq9    rq10
    rq1    1.000   .604   .578   .419   .514   .580   .497   .555   .554   .481
    rq2     .604  1.000   .615   .518   .488   .545   .543   .402   .402   .401
    rq3     .578   .615  1.000   .519   .567   .536   .572   .481   .484   .496
    rq4     .419   .518   .519  1.000   .581   .430   .450   .336   .174   .357
    rq5     .514   .488   .567   .581  1.000   .577   .577   .466   .382   .574
    rq6     .580   .545   .536   .430   .577  1.000   .575   .510   .417   .437
    rq7     .497   .543   .572   .450   .577   .575  1.000   .459   .442   .521
    rq8     .555   .402   .481   .336   .466   .510   .459  1.000   .585   .602
    rq9     .554   .402   .484   .174   .382   .417   .442   .585  1.000   .529
    rq10    .481   .401   .496   .357   .574   .437   .521   .602   .529  1.000

    a. Determinant = 0.005
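The two screening rules applied to Table 10.7 (every variable correlates above 0.3 with at least one other variable, and no pair correlates above 0.85) can be checked mechanically. The sketch below assumes NumPy; the matrix is typed in from Table 10.7, and the variable names are illustrative.

```python
import numpy as np

# Correlation matrix from Table 10.7 (rq1..rq10).
R = np.array([
    [1.000, .604, .578, .419, .514, .580, .497, .555, .554, .481],
    [.604, 1.000, .615, .518, .488, .545, .543, .402, .402, .401],
    [.578, .615, 1.000, .519, .567, .536, .572, .481, .484, .496],
    [.419, .518, .519, 1.000, .581, .430, .450, .336, .174, .357],
    [.514, .488, .567, .581, 1.000, .577, .577, .466, .382, .574],
    [.580, .545, .536, .430, .577, 1.000, .575, .510, .417, .437],
    [.497, .543, .572, .450, .577, .575, 1.000, .459, .442, .521],
    [.555, .402, .481, .336, .466, .510, .459, 1.000, .585, .602],
    [.554, .402, .484, .174, .382, .417, .442, .585, 1.000, .529],
    [.481, .401, .496, .357, .574, .437, .521, .602, .529, 1.000],
])

# Blank out the diagonal so only correlations between different items remain.
off_diag = R.copy()
np.fill_diagonal(off_diag, np.nan)

# Strongest correlation each item has with any other item.
strongest = np.nanmax(np.abs(off_diag), axis=1)

factorable = bool(np.all(strongest > 0.3))                 # rule 1: above 0.3
distinct = bool(np.nanmax(np.abs(off_diag)) <= 0.85)       # rule 2: not above 0.85

for i, value in enumerate(strongest, start=1):
    print(f"rq{i}: strongest correlation with another item = {value:.3f}")
print("all items factorable:", factorable)
print("all items distinct:  ", distinct)
```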


    There is more evidence of factorability:

    (a) Bartlett's Test of Sphericity

    Table 10.8 shows the inter-correlation matrix of an identity matrix.

    Table 10.8: Intercorrelation of an Identity Matrix

          X1    X2    X3    X4    X5
    X1   1.00  0.00  0.00  0.00  0.00
    X2         1.00  0.00  0.00  0.00
    X3               1.00  0.00  0.00
    X4                     1.00  0.00
    X5                           1.00

    The variables are totally non-collinear. If this matrix were factor-analysed, it would extract as many factors as variables, since each variable would be its own factor. As such, it is totally non-factorable: the factor solution would be exactly the same as the initial solution.

    The determinant of an identity matrix is equal to one, while the determinant of a non-identity matrix is some other value (different from one). Bartlett's Test of Sphericity calculates the determinant of the matrix of the sums of products and cross-products (S) from which the inter-correlation matrix is derived. The determinant of the matrix S is converted to a chi-square statistic and tested for significance.

    Null Hypothesis: The inter-correlation matrix of the variables is not different from an identity matrix.

    Alternate Hypothesis: The inter-correlation matrix of the variables is different from an identity matrix.

    Table 10.9 shows the sample results:

    Table 10.9: Sample Results of Bartlett's Test of Sphericity

    KMO and Bartlett's Test
    Kaiser-Meyer-Olkin Measure of Sampling Adequacy                 0.914
    Bartlett's Test of Sphericity     Approx. Chi-Square          887.955
                                      df                               45
                                      Sig.                           .000


    Test Results
    $\chi^2$ = 887.955; df = 45; p < 0.0001

    Statistical Decision
    The inter-correlation matrix of the variables is significantly different from an identity matrix. In other words, the sample inter-correlation matrix did not come from a population in which the inter-correlation matrix is an identity matrix.
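To show how a determinant is turned into a chi-square statistic, here is a minimal sketch. It assumes the commonly cited form of the statistic computed from the determinant of the correlation matrix R, namely chi-square = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom; this is a swapped-in formulation, not the module's own description in terms of S, and the function name `bartlett_sphericity` is hypothetical. NumPy is assumed.

```python
import numpy as np


def bartlett_sphericity(R, n):
    """Return (chi_square, df) for a p x p correlation matrix R from n respondents.

    Assumes the correlation-matrix form of the statistic:
    chi_square = -(n - 1 - (2p + 5) / 6) * ln|R|,  df = p(p - 1) / 2.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, log_det = np.linalg.slogdet(R)          # ln|R|, computed stably
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return chi_square, df


# An identity matrix (Table 10.8): determinant 1, so the statistic is zero.
identity_stat, identity_df = bartlett_sphericity(np.eye(5), n=176)

# A two-variable example with r = 0.77, using n = 176 purely for illustration.
two_var_stat, two_var_df = bartlett_sphericity([[1.0, 0.77], [0.77, 1.0]], n=176)

print("identity matrix:", abs(identity_stat), "df =", identity_df)
print("r = 0.77 matrix:", round(two_var_stat, 2), "df =", two_var_df)
```

With 10 variables the degrees of freedom are 10 × 9 / 2 = 45, which matches the df reported in Table 10.9.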

    (b) Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO)
    If two variables share a common factor with other variables, their partial correlation (aij) will be small, indicating the unique variance they share.

    If aij ≈ 0.0, the variables are measuring a common factor, and KMO ≈ 1.0.

    If aij ≈ 1.0, the variables are not measuring a common factor, and KMO ≈ 0.0.

    Table 10.10 portrays the interpretation of the KMO as characterised by Kaiser, Meyer, and Olkin:

    Table 10.10: Degree of Common Variance

    KMO Value       Degree of Common Variance
    0.90 to 1.00    Marvellous
    0.80 to 0.89    Meritorious
    0.70 to 0.79    Middling
    0.60 to 0.69    Mediocre
    0.50 to 0.59    Miserable
    0.00 to 0.49    Not Appropriate for Factor Analysis
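The relationship between partial correlations and the KMO can be sketched in a few lines. This is an illustration under stated assumptions: the overall KMO is taken as the sum of squared correlations divided by the sum of squared correlations plus the sum of squared partial correlations (off-diagonal entries only), the partial correlations are obtained from the inverse of the correlation matrix, the 3-variable matrix is hypothetical, and NumPy is assumed.

```python
import numpy as np


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    precision = np.linalg.inv(R)
    # Partial correlation a_ij between items i and j, controlling for all others.
    scale = np.sqrt(np.outer(np.diag(precision), np.diag(precision)))
    partial = -precision / scale
    off = ~np.eye(R.shape[0], dtype=bool)       # off-diagonal positions only
    r_squared = np.sum(R[off] ** 2)
    a_squared = np.sum(partial[off] ** 2)
    return r_squared / (r_squared + a_squared)


# Hypothetical example: three items that all correlate 0.5 with one another.
R_example = np.array([[1.0, 0.5, 0.5],
                      [0.5, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])
kmo_value = kmo(R_example)
print(f"KMO = {kmo_value:.4f}")
```

In this small example each partial correlation is 1/3, so the KMO is 0.25 / (0.25 + 1/9), roughly 0.69, which Table 10.10 would label mediocre. The smaller the partial correlations become relative to the ordinary correlations, the closer the KMO gets to 1.0.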


    The results of the KMO, to be interpreted as characterised by Kaiser, Meyer, and Olkin, can be seen in Table 10.11 below.

    Table 10.11: KMO and Bartlett's Test

    KMO and Bartlett's Test
    Kaiser-Meyer-Olkin Measure of Sampling Adequacy                 0.914
    Bartlett's Test of Sphericity     Approx. Chi-Square          887.955
                                      df                               45
                                      Sig.                           .000

    The KMO = 0.914

    Interpretation
    The degree of common variance among the ten variables is marvellous. If a factor analysis is conducted, the factors extracted will account for a substantial amount of variance.

    10.4.2 Extracting an Initial Solution

    A variety of methods have been developed to extract factors from an inter-correlation matrix. SPSS Statistics offers the following methods:

    (i) Principal components
    (ii) Unweighted least squares
    (iii) Generalised least squares
    (iv) Maximum likelihood
    (v) Principal axis factoring
    (vi) Alpha factoring
    (vii) Image factoring

    Note: In this module, we will only focus on the Principal Component Method.

    Communality is the proportion of variance of a particular variable (item in the questionnaire) that is due to common factors. In the initial solution, each variable (item) is considered as a single factor; as such, the communality for the initial solution is 1.00. After extraction, the number of factors is reduced, each initial factor (item) now belongs to the new factors, and the new factors explain a certain proportion of the variance in the variable. Thus, the proportion of variance of each variable (item) explained by the new factors is less than 1.00 (refer to Table 10.12).

    Table 10.12: Communalities

    Communalities
            Initial   Extraction
    rq1     1.000     .626
    rq2     1.000     .623
    rq3     1.000     .647
    rq4     1.000     .732
    rq5     1.000     .649
    rq6     1.000     .588
    rq7     1.000     .594
    rq8     1.000     .694
    rq9     1.000     .762
    rq10    1.000     .614

    The variance of each variable is 1.0, so the total variance to be explained is 10 (10 variables, each with a variance of 1.0). Since a single variable can account for 1.0 unit of variance, a useful new factor must account for more than 1.0 unit of variance, that is, have an eigenvalue ($\lambda$) greater than 1.0. Otherwise, the extracted factor (new factor) explains less variance than a single variable. Table 10.13 shows the results of the factor analysis of the 10 items.


    10.4.3 Determine the Appropriate Number of Factors to be Extracted in the Final Solution

    Table 10.13: The Results of Factor Analysis

                 Initial Eigenvalues              Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
    Component    Total   % of Var.   Cum. %       Total   % of Var.   Cum. %            Total   % of Var.   Cum. %
    1            5.489   54.888      54.888       5.489   54.888      54.888            3.515   35.152      35.152
    2            1.041   10.406      65.294       1.041   10.406      65.294            3.014   30.143      65.294
    3             .691    6.910      72.205
    4             .539    5.387      77.592
    5             .506    5.056      82.648
    6             .395    3.948      86.596
    7             .383    3.830      90.426
    8             .359    3.590      94.017
    9             .320    3.201      97.218
    10            .278    2.782     100.000

    Extraction Method: Principal Component Analysis.
    (% of Var. = % of Variance; Cum. % = Cumulative %)

    Referring to Table 10.13 above, the results of the initial solution are as follows:

    Interpretation
    10 factors (components) were extracted, the same as the number of variables factored:

    (a) Factor I
    The 1st factor has an eigenvalue of 5.489. The value is greater than 1.0; as such, it explains more variance than a single variable, in fact 5.489 times as much.

    The percent of variance explained by Factor I is:

    (5.489 / 10 units of variance) × 100 = 54.89%


    (b) Factor II
    The 2nd factor has an eigenvalue of 1.041. It is also a value greater than 1.0 and, therefore, explains more variance than a single variable.

    The percent of variance explained by Factor II is:

    (1.041 / 10 units of variance) × 100 = 10.41%

    (c) Subsequent factors
    The subsequent factors (3 through 10) have eigenvalues less than 1.0 and, as such, explain less variance than a single variable. These are not good factors.

    The Key Points

    The eigenvalues associated with the factors (components) sum to 10 (i.e. 5.489 + 1.041 + 0.691 + 0.539 + ... + 0.278 = 10).

    The cumulative percentage of variance explained by the first two factors is 65.29%.

    In other words, 65.29% of the common variance shared by the 10 variables can be accounted for by the 2 factors.

    This initial solution suggests that the final solution should extract not more than 2 factors.
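The arithmetic behind these key points can be reproduced directly from the eigenvalues in Table 10.13. The following is a small sketch in plain Python; the variable names are illustrative, and the eigenvalues are the rounded values printed in the table.

```python
# Initial eigenvalues from Table 10.13 (rounded to three decimals).
eigenvalues = [5.489, 1.041, 0.691, 0.539, 0.506, 0.395, 0.383, 0.359, 0.320, 0.278]
n_variables = len(eigenvalues)   # 10 standardised variables, so 10 units of variance

# Percentage of the total variance explained by each component.
percent = [100 * ev / n_variables for ev in eigenvalues]

# Running (cumulative) percentage.
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

# Eigenvalue-greater-than-1 rule: keep components that beat a single variable.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

for i, (ev, p, c) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"component {i:2d}: eigenvalue {ev:.3f}  {p:6.2f}%  cumulative {c:6.2f}%")
print("components retained:", retained)
```

Because the table values are rounded, the percentages computed here can differ from the SPSS output in the last decimal place.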

    On the subject of determining the appropriate number of factors to be extracted in the final solution, there are two more important elements to be addressed:

    (a) Cattell's Scree Plot
    Another way to determine the number of factors to extract in the final solution is via Cattell's Scree Plot (refer to Figure 10.3). This is a plot of the eigenvalues associated with each of the factors extracted, against each factor. At the point where the plot begins to level off, the additional factors explain less variance than a single variable.


    Figure 10.3: Cattell's Scree Plot

    (b) Factor Loadings
    The component matrix indicates the correlation of each variable with each factor.

    Component Matrix (a)

            Component
            1       2
    rq1     .785    .099
    rq2     .748   -.253
    rq3     .795   -.127
    rq4     .640   -.567
    rq5     .776   -.216
    rq6     .762   -.086
    rq7     .765   -.096
    rq8     .727    .406
    rq9     .667    .563
    rq10    .728    .289

    Extraction Method: Principal Component Analysis.
    a. 2 components extracted.

    Explanation:
    The variable rq1 correlates 0.785 with Factor I and 0.099 with Factor II.

    The total proportion of the variance in rq1 explained by the two factors is:

    $$0.785^2 + 0.099^2 = 0.626$$


    This is called the communality of the variable rq1.

    The communalities of the 10 variables are as follows (cf. the column headed Extraction):

    Communalities
            Initial   Extraction
    rq1     1.000     .626
    rq2     1.000     .623
    rq3     1.000     .647
    rq4     1.000     .732
    rq5     1.000     .649
    rq6     1.000     .588
    rq7     1.000     .594
    rq8     1.000     .694
    rq9     1.000     .762
    rq10    1.000     .614

    The proportion of variance in each variable accounted for by the two factors is not the same.
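The same squaring-and-adding step used for rq1 gives the communality of every variable. The sketch below assumes NumPy and takes the loadings from the Component Matrix and the reported values from the Extraction column; small differences arise because the printed loadings are rounded to three decimals.

```python
import numpy as np

# Unrotated loadings from the Component Matrix (columns: Factor I, Factor II).
loadings = np.array([
    [.785,  .099],   # rq1
    [.748, -.253],   # rq2
    [.795, -.127],   # rq3
    [.640, -.567],   # rq4
    [.776, -.216],   # rq5
    [.762, -.086],   # rq6
    [.765, -.096],   # rq7
    [.727,  .406],   # rq8
    [.667,  .563],   # rq9
    [.728,  .289],   # rq10
])

# Extraction communalities as reported in the Communalities table.
reported = np.array([.626, .623, .647, .732, .649, .588, .594, .694, .762, .614])

# Communality = sum of squared loadings across the extracted factors.
communalities = (loadings ** 2).sum(axis=1)
max_gap = np.abs(communalities - reported).max()

for i, (calc, rep) in enumerate(zip(communalities, reported), start=1):
    print(f"rq{i}: computed {calc:.3f}  reported {rep:.3f}")
print(f"largest difference due to rounding: {max_gap:.4f}")
```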

    The key to determining what the factors measure is the factor loadings.

    Component Matrix (a)

            Component
            1       2
    rq1     .785    .099
    rq2     .748   -.253
    rq3     .795   -.127
    rq4     .640   -.567
    rq5     .776   -.216
    rq6     .762   -.086
    rq7     .765   -.096
    rq8     .727    .406
    rq9     .667    .563
    rq10    .728    .289

    Extraction Method: Principal Component Analysis.
    a. 2 components extracted.


    Factor I

    Variable   Factor Loading
    rq1        .785
    rq2        .748
    rq3        .795
    rq4        .640
    rq5        .776
    rq6        .762
    rq7        .765
    rq8        .727
    rq9        .667
    rq10       .728

    The correlation coefficient between rq1 and Factor I is 0.785

    The correlation coefficient between rq2 and Factor I is 0.748

    The correlation coefficient between rq3 and Factor I is 0.795

    The correlation coefficient between rq4 and Factor I is 0.640

    The correlation coefficient between rq5 and Factor I is 0.776

    The correlation coefficient between rq6 and Factor I is 0.762

    The correlation coefficient between rq7 and Factor I is 0.765

    The correlation coefficient between rq8 and Factor I is 0.727

    The correlation coefficient between rq9 and Factor I is 0.667

    The correlation coefficient between rq10 and Factor I is 0.728

    Factor II

    Variable   Factor Loading
    rq1         .099
    rq2        -.253
    rq3        -.127
    rq4        -.567
    rq5        -.216
    rq6        -.086
    rq7        -.096
    rq8         .406
    rq9         .563
    rq10        .289

    The correlation coefficient between rq1 and Factor II is 0.099

    The correlation coefficient between rq2 and Factor II is -0.253

    The correlation coefficient between rq3 and Factor II is -0.127

    The correlation coefficient between rq4 and Factor II is -0.567

    The correlation coefficient between rq5 and Factor II is -0.216

    The correlation coefficient between rq6 and Factor II is -0.086

    The correlation coefficient between rq7 and Factor II is -0.096

    The correlation coefficient between rq8 and Factor II is 0.406

    The correlation coefficient between rq9 and Factor II is 0.563

    The correlation coefficient between rq10 and Factor II is 0.289

    10.4.4 Rotate the Factors to Clarify the Factor Pattern in order to Better Interpret the Nature of the Factors

    In many instances, one or more variables may load about the same on more than one factor, making the interpretation of the factors ambiguous. Ideally, the analyst would like to find that each variable loads high (≈ 1.0) on one factor and approximately zero (≈ 0.0) on all the others. The factor pattern can be clarified by "rotating" the factors in F-dimensional space. There are two types of rotation:


    (a) Orthogonal Rotation: Preserves the independence of the factors; geometrically, they remain 90° apart.

    (b) Oblique Rotation: Produces factors that are not independent; geometrically, they are not 90° apart.

    Below is the comparison between the Component matrix and RotatedComponent matrix (Using Varimax rotation, an orthogonal type) for the tenvariables:

                Component Matrix(a)        Rotated Component Matrix(b)
                Component                  Component
    Variable    1         2                1         2
    rq1         .785      .099             .519      .597
    rq2         .748     -.253             .726      .309
    rq3         .795     -.127             .677      .435
    rq4         .640     -.567             .855      .003
    rq5         .776     -.216             .723      .356
    rq6         .762     -.086             .625      .443
    rq7         .765     -.096             .634      .438
    rq8         .727      .406             .272      .788
    rq9         .667      .563             .123      .864
    rq10        .728      .289             .350      .701

    Extraction Method: Principal Component Analysis.
    a. 2 components extracted.
    b. Rotation Method: Varimax with Kaiser Normalization. Rotation converged in 3 iterations.
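The effect of an orthogonal rotation on the two-component solution can be sketched in Python. This is a minimal illustration, not the SPSS algorithm: with only two factors, a Varimax rotation reduces to choosing one rotation angle, so the sketch simply searches a grid of angles for the one that maximises the Varimax criterion on row-normalised (Kaiser-normalised) loadings. The function names `rotate` and `varimax_criterion` and the grid size are assumptions made for this example; the loadings are copied from the Component Matrix above.

```python
import numpy as np

# Unrotated loadings copied from the Component Matrix (rows rq1..rq10).
unrotated = np.array([
    [.785,  .099], [.748, -.253], [.795, -.127], [.640, -.567], [.776, -.216],
    [.762, -.086], [.765, -.096], [.727,  .406], [.667,  .563], [.728,  .289],
])

def rotate(loadings, theta):
    """Rotate a two-column loading matrix by the angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return loadings @ np.array([[c, s], [-s, c]])

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared, row-normalised loadings."""
    h = np.sqrt((loadings ** 2).sum(axis=1, keepdims=True))  # Kaiser normalisation
    return ((loadings / h) ** 2).var(axis=0).sum()

# Grid search over a quarter turn; the criterion repeats every 90 degrees.
angles = np.linspace(0.0, np.pi / 2, 9000, endpoint=False)
best = max(angles, key=lambda t: varimax_criterion(rotate(unrotated, t)))
rotated = rotate(unrotated, best)

print(f"rotation angle: {np.degrees(best):.1f} degrees")
print(np.round(rotated, 3))

# An orthogonal rotation redistributes variance between the factors
# but leaves each variable's communality unchanged.
same_communalities = np.allclose((unrotated ** 2).sum(axis=1),
                                 (rotated ** 2).sum(axis=1))
print("communalities preserved:", same_communalities)
```

Because the input loadings are rounded to three decimals, the rotated values should agree with the Rotated Component Matrix only approximately. For three or more factors a grid search is no longer practical and an iterative Varimax routine would be needed.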


    Reproduced Correlation Matrix

    One measure of the goodness-of-fit is whether the factor solution can reproduce the original inter-correlation matrix among the ten variables.

    Table 10.14: Reproduced Correlations

    Reproduced Correlation

            rq1    rq2    rq3    rq4    rq5    rq6    rq7    rq8    rq9    rq10
    rq1     .626a  .562   .611   .446   .588   .590   .591   .611   .580   .600
    rq2     .562   .623a  .626   .622   .635   .591   .596   .441   .357   .471
    rq3     .611   .626   .647a  .580   .644   .616   .620   .526   .459   .542
    rq4     .446   .622   .580   .732a  .619   .536   .544   .235   .108   .302
    rq5     .588   .635   .644   .619   .649a  .610   .614   .477   .397   .503
    rq6     .590   .591   .616   .536   .610   .588a  .591   .519   .460   .530
    rq7     .591   .596   .620   .544   .614   .591   .594a  .517   .456   .529
    rq8     .611   .441   .526   .235   .477   .519   .517   .694a  .714   .647
    rq9     .580   .357   .459   .108   .397   .460   .456   .714   .762a  .649
    rq10    .600   .471   .542   .302   .503   .530   .529   .647   .649   .614a

    Residual(b)

            rq1    rq2    rq3    rq4    rq5    rq6    rq7    rq8    rq9    rq10
    rq1            .042  -.033  -.027  -.074  -.009  -.094  -.056  -.026  -.119
    rq2     .042         -.011  -.104  -.147  -.047  -.053  -.039   .046  -.070
    rq3    -.033  -.011         -.061  -.077  -.080  -.048  -.045   .025  -.046
    rq4    -.027  -.104  -.061         -.038  -.106  -.094   .101   .066   .055
    rq5    -.074  -.147  -.077  -.038         -.033  -.037  -.011  -.014   .071
    rq6    -.009  -.047  -.080  -.106  -.033         -.016  -.009  -.042  -.093
    rq7    -.094  -.053  -.048  -.094  -.037  -.016         -.058  -.014  -.008
    rq8    -.056  -.039  -.045   .101  -.011  -.009  -.058         -.129  -.045
    rq9    -.026   .046   .025   .066  -.014  -.042  -.014  -.129         -.120
    rq10   -.119  -.070  -.046   .055   .071  -.093  -.008  -.045  -.120

    Extraction Method: Principal Component Analysis.
    a. Reproduced communalities
    b. Residuals are computed between observed and reproduced correlations. There are 21 (46.0%) non-redundant residuals with absolute values greater than 0.05.


    The upper half of Table 10.14 presents the bivariate correlations reproduced by the two-factor solution. Compare these with the lower half of the table, which presents the residuals.

    Residual = observed correlation - reproduced correlation

    Less than half of the non-redundant residuals (21 of 45, reported as 46.0% in Table 10.14) have absolute values greater than 0.05.
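The reproduced correlations and the residual count can be checked with a short Python sketch. It assumes the usual relationship for an orthogonal solution, namely that the reproduced correlation matrix is the loading matrix multiplied by its transpose, with the communalities on the diagonal. The loadings and the upper-triangle residuals are copied from the tables above; the variable names are illustrative only.

```python
import numpy as np

# Unrotated loadings on the two components (rows rq1..rq10).
loadings = np.array([
    [.785,  .099], [.748, -.253], [.795, -.127], [.640, -.567], [.776, -.216],
    [.762, -.086], [.765, -.096], [.727,  .406], [.667,  .563], [.728,  .289],
])

# Reproduced correlations: loadings times their transpose.
reproduced = loadings @ loadings.T
communalities = np.diag(reproduced)  # the values flagged "a" in Table 10.14
print(np.round(reproduced[0, :3], 3))  # rq1 with rq1, rq2, rq3

# Non-redundant residuals (upper triangle of Table 10.14, row by row).
residuals = np.array([
    .042, -.033, -.027, -.074, -.009, -.094, -.056, -.026, -.119,   # rq1
    -.011, -.104, -.147, -.047, -.053, -.039, .046, -.070,          # rq2
    -.061, -.077, -.080, -.048, -.045, .025, -.046,                 # rq3
    -.038, -.106, -.094, .101, .066, .055,                          # rq4
    -.033, -.037, -.011, -.014, .071,                               # rq5
    -.016, -.009, -.042, -.093,                                     # rq6
    -.058, -.014, -.008,                                            # rq7
    -.129, -.045,                                                   # rq8
    -.120,                                                          # rq9
])

large = np.abs(residuals) > 0.05
print(f"{large.sum()} of {residuals.size} residuals exceed 0.05 "
      f"({100 * large.mean():.1f}%)")

# The observed correlation is the reproduced value plus the residual.
observed_rq1_rq2 = reproduced[0, 1] + residuals[0]
print(round(observed_rq1_rq2, 3))
```

The count comes to 21 of 45 residuals, which matches the footnote of Table 10.14.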

    10.4.5 Establish the Measures of Goodness-of-Fit of the Factor Solution

    Table 10.15 shows the goodness of fit of the two-factor solution.

    Table 10.15: Goodness of Fit of the Two-Factor Solution

    Measure                     Value                                Interpretation
    KMO                         0.914                                Marvelous
    Bartlett's Test             χ² = 887.955; df = 45; p < 0.0001    The inter-correlation matrix provides evidence of the presence of common factors
    Total Variance Explained    65.29%                               The two factors extracted can explain 65.29% of the variance in the ten variables
    Factor pattern              2 factors                            The pattern is clear for two factors
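Two of the figures in Table 10.15 can be traced back to simple arithmetic, as the following Python sketch illustrates. The total variance explained is the sum of all squared loadings divided by the number of variables. For Bartlett's test, the sketch uses the standard chi-square approximation, chi-square = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. The function name `bartlett_sphericity` and the sample size passed to it are hypothetical, since the sample size and the observed correlation matrix are not given in this section; the identity matrix is used only to show the call.

```python
import numpy as np

# Unrotated loadings on the two extracted components (rows rq1..rq10).
loadings = np.array([
    [.785,  .099], [.748, -.253], [.795, -.127], [.640, -.567], [.776, -.216],
    [.762, -.086], [.765, -.096], [.727,  .406], [.667,  .563], [.728,  .289],
])

# Sum of squared loadings per component, then as a share of the ten variables.
ss_loadings = (loadings ** 2).sum(axis=0)
pct_explained = 100 * ss_loadings.sum() / loadings.shape[0]
print(f"total variance explained: {pct_explained:.2f}%")

def bartlett_sphericity(corr, n):
    """Chi-square approximation for Bartlett's test of sphericity.

    corr is a p x p correlation matrix and n is the sample size.
    """
    p = corr.shape[0]
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical call: an identity matrix (no correlation at all) and n = 200.
chi2, df = bartlett_sphericity(np.eye(10), n=200)
print(chi2, df)
```

With ten variables the degrees of freedom are 10 × 9 / 2 = 45, as in Table 10.15. An identity matrix gives a chi-square of zero, which is the null hypothesis that Bartlett's test tries to reject.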

    10.5 RELIABILITY

    In many areas of educational and psychological research, the precise measurement of variables or theoretical constructs poses a challenge. For example, the precise measurement of personality variables or attitudes is usually a necessary first step before any theories of personality or attitudes can be considered. In general, unreliable measurements of people's beliefs or intentions will hamper efforts to predict their behaviour. Reliability analysis is often used to check the reliability of an instrument statistically. Reliability is the measure of consistency of a particular instrument. It refers to the capability of the instrument to produce consistently similar results if it were administered to a homogeneous group of respondents. Generally, there are four classes of reliability estimates: inter-rater or inter-observer reliability, test-retest reliability, parallel-form reliability, and internal consistency. Inter-rater or inter-observer reliability is used to assess the degree to which two different observers agree in describing a phenomenon. This is widely used in establishing reliability


    for open-ended questions. Test-retest, parallel-form and internal consistency reliability are mainly used to assess the reliability of fixed-response items. Test-retest reliability measures the consistency of a measure from one time to another, while parallel-form reliability measures the consistency of two tests that were constructed from the same content domain.

    Internal consistency evaluates the consistency of the responses across the items within the instrument. It is reported as the coefficient of Cronbach's alpha, whose values normally range from zero to one, and is computed by the formula:

    $$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} S_i^2}{S_{sum}^2}\right)$$

    where

    $k$ = number of items

    $S_i^2$ = variances of the $k$ individual items

    $S_{sum}^2$ = variance of the sum of all items

    If there is no true score but only random error in the items (uncorrelated across items), then $\sum S_i^2 = S_{sum}^2$ and $\alpha = 0$.

    If all items measure the same thing (true score) without error, then $\alpha = 1$.

    Nunnally (1978) suggests $\alpha > 0.7$.
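The formula can be sketched as a small Python function. This is an illustration of the formula above rather than the SPSS routine; the function name `cronbach_alpha` and the two tiny data sets are made up for the example. Rows are respondents and columns are items.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items array of scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                         # number of items
    item_vars = scores.var(axis=0, ddof=1)      # S_i^2 for each item
    sum_var = scores.sum(axis=1).var(ddof=1)    # S_sum^2 of the total score
    return k / (k - 1) * (1 - item_vars.sum() / sum_var)

# Hypothetical data: three items that are exact copies of each other.
identical_items = np.array([[1, 1, 1], [2, 2, 2], [4, 4, 4], [5, 5, 5]])

# Hypothetical data: two items with zero covariance.
uncorrelated_items = np.array([[1, 1], [-1, 1], [1, -1], [-1, -1]])

alpha_identical = cronbach_alpha(identical_items)
alpha_uncorrelated = cronbach_alpha(uncorrelated_items)
print(alpha_identical, alpha_uncorrelated)
```

The two data sets reproduce the limiting cases described above: items that measure exactly the same thing give an alpha of one, and items that share no variance give an alpha of zero.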

    10.5.1 Reliability using Cronbach's Alpha

    There are many different statistics for checking reliability. One of the most commonly used is Cronbach's alpha, which is based on the average correlation of items within a test. Cronbach's alpha is the most common form of internal consistency reliability coefficient. By convention, a lenient cut-off of 0.60 is common in exploratory research; alpha should be at least 0.70 to retain an item in an "adequate" scale; and many researchers require a cut-off of 0.80 for a "good" scale.


    Example

    A researcher gave a 10-item questionnaire on Emotional Intelligence to a sample of randomly selected secondary school students. The aim is to determine the internal consistency of the scale using Cronbach's alpha. Table 10.16 below is the SPSS output.

    Table 10.16: Item-Total Statistics

            Scale Mean if   Scale Variance     Corrected Item-      Squared Multiple   Cronbach's Alpha
            Item Deleted    if Item Deleted    Total Correlation    Correlation        if Item Deleted
    rq1     41.89           63.948             .718                 .560               .895
    rq2     41.78           64.915             .676                 .533               .897
    rq3     41.89           64.380             .731                 .555               .894
    rq4     42.24           65.499             .560                 .458               .905
    rq5     42.19           62.074             .713                 .573               .895
    rq6     42.14           63.800             .692                 .516               .896
    rq7     42.00           63.202             .696                 .508               .896
    rq8     41.83           64.745             .654                 .521               .899
    rq9     41.93           66.185             .583                 .491               .903
    rq10    41.97           64.849             .658                 .517               .898

    SPSS Statistics Commands for Reliability Analysis

    1. Select the Analyse menu, click on Scale and then Reliability Analysis... to open the Reliability Analysis dialogue box.

    2. Select the variables or items you require and click the right arrow to move the variables to the Items: box.

    3. Ensure that Alpha is displayed in the Model: box.

    4. Click on the Statistics... command pushbutton to open the Reliability Analysis: Statistics sub-dialogue box.

    5. In the Descriptives for box, select the Scale and Scale if item deleted check boxes.

    6. In the Inter-Item box, select the Correlations check box.

    7. Click on Continue and OK.


    10.5.2 Interpretation of Cronbach's Alpha

    The reliability output is interpreted as follows:

    (a) Scale Mean if Item Deleted
    This column gives the average total scale score if the specific item is excluded from the scale. So, if rq1 is deleted, the average score will be 41.89.

    (b) Corrected Item-Total Correlation
    This column gives the Pearson correlation coefficient between the individual item and the sum of the scores on the remaining items. A low item-total correlation means that the item is only weakly correlated with the overall scale, and the researcher should consider dropping it. However, it should be noted that a scale with an acceptable Cronbach's alpha may still have one or more items with low item-total correlations. Items rq4 and rq9 are the weakest in that they are the least consistent with the rest of the scale. Their correlations with the sum scale are 0.56 and 0.58 respectively, while all other items correlate at 0.65 or better.

    (c) Cronbach's Alpha if Item Deleted
    This column gives the Cronbach's alpha that would result if the item were removed from the scale. The researcher may wish to drop items whose value in this column is higher than the overall alpha, as another way to improve the alpha level.

    (d) Cronbach's Alpha
    The Cronbach's alpha for the overall scale is 0.7678 for the 10 items without removal of any items. The alpha can be increased if the two weakest items (rq4 and rq9) are removed. It is common practice for researchers either to remove the problematic items or to rewrite them and administer them again to see if the alpha improves.
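The columns of the item-total table can be reproduced outside SPSS with a short Python sketch. Everything here is an assumption made for illustration: the function names, the simulated responses (four items driven by a common trait plus a fifth item of pure noise) and the random seed. The squared multiple correlation column is left out to keep the example short.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items array of scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    sum_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / sum_var)

def item_total_statistics(scores):
    """Return one dict per item with the 'if item deleted' statistics."""
    scores = np.asarray(scores, dtype=float)
    rows = []
    for i in range(scores.shape[1]):
        rest = np.delete(scores, i, axis=1)   # the scale without item i
        rest_total = rest.sum(axis=1)
        rows.append({
            "scale_mean_if_deleted": rest_total.mean(),
            "scale_var_if_deleted": rest_total.var(ddof=1),
            # Item correlated with the sum of the remaining items.
            "corrected_item_total_r": np.corrcoef(scores[:, i], rest_total)[0, 1],
            "alpha_if_deleted": cronbach_alpha(rest),
        })
    return rows

# Simulated (hypothetical) responses: items 1 to 4 share a common trait,
# item 5 is unrelated noise and should stand out as a weak item.
rng = np.random.default_rng(0)
trait = rng.normal(size=200)
data = trait[:, None] + rng.normal(size=(200, 5))
data[:, 4] = rng.normal(size=200)

overall_alpha = cronbach_alpha(data)
stats = item_total_statistics(data)
print(f"overall alpha: {overall_alpha:.3f}")
for name, row in zip(["item1", "item2", "item3", "item4", "item5"], stats):
    print(name, {key: round(float(value), 3) for key, value in row.items()})
```

In output of this kind, the weak item is the one with a low corrected item-total correlation and an alpha-if-deleted value above the overall alpha, which is the pattern the researcher looks for in (b) and (c).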

    ACTIVITY 10.1

    (a) What is reliability analysis?

    (b) What does Cronbach's alpha indicate?

    (c) Explain what "Cronbach's alpha if item deleted" means.


    Factor analysis is used to uncover the latent structure (dimensions) of a set of variables.

    Principal Component Analysis is used to reduce the number of variables into a smaller set of principal components (dimensions).

    Among the required assumptions for factor analysis are a large sample, normality (not for PCA), linear relationships among variables, absence of outliers, and no multi-collinearity.

    Factor loading is the correlation between a variable and a factor that has been extracted from the data.

    Bartlett's Test of Sphericity and the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) are two commonly used tests of the factorability of the data.

    An initial factor solution is normally rotated to obtain a more interpretable solution.

    The initial solution can be rotated using orthogonal or oblique rotations.

    Reliability is the measure of consistency of a particular instrument.

    There are four classes of reliability estimates. They are inter-rater or inter-observer reliability, test-retest reliability, parallel-form reliability, and internal consistency.

    Cronbach's alpha is the most common form of internal consistency reliability coefficient.

    Factor analysis

    Principal component analysis

    Factor loading

    Correlation matrix

    Co-variance

    Rotation

    Orthogonal

    Oblique

    Reliability

    Cronbach's alpha coefficient


    Carry out Factor Analysis to determine the dimensions in the Emotional Intelligence construct developed by the researcher. (You can either name the factors or label them as Factor 1, Factor 2, etc.)

    Report the Cronbach's alpha for each dimension.
