Welcome to Powerpoint slides for Chapter 12 Factor Analysis for Data Reduction Marketing Research Text and Cases by Rajendra Nargundkar

Page 1: Chapter12 Slides

Welcome to Powerpoint slides for Chapter 12

Factor Analysis for Data Reduction

Marketing Research Text and Cases

by Rajendra Nargundkar

Page 2: Chapter12 Slides

Introduction

1. Factor Analysis is a set of techniques used for understanding variables by grouping them into “factors” consisting of similar variables

2. It can also be used to confirm whether a hypothesized set of variables groups into a factor or not

3. It is most useful when a large number of variables needs to be reduced to a smaller set of “factors” that contain most of the variance of the original variables

4. Generally, Factor Analysis is done in two stages, called

• Extraction of Factors, and
• Rotation of the Solution obtained in stage 1

5. Factor Analysis is best performed with interval or ratio-scaled variables

Slide 1

Page 3: Chapter12 Slides

Application Areas/Example

1. In marketing research, a common application area of Factor Analysis is to understand underlying motives of consumers who buy a product category or a brand

2. The worked out example in the chapter will help clarify the use of Factor Analysis in Marketing Research

3. In this example, we assume that a two wheeler manufacturer is interested in determining which variables his potential customers think about when they consider his product

4. Let us assume that twenty two-wheeler owners were surveyed by this manufacturer (or by a marketing research company on his behalf). They were asked to indicate on a seven point scale (1=Completely Agree, 7=Completely Disagree), their agreement or disagreement with a set of ten statements relating to their perceptions and some attributes of the two-wheelers.

5. The objective of doing Factor Analysis is to find underlying "factors" which would be fewer than 10 in number, but would be linear combinations of some of the original 10 variables

Slide 2

Page 4: Chapter12 Slides

The research design for data collection can be stated as follows:

Twenty 2-wheeler users were surveyed about their perceptions and image attributes of the vehicles they owned. Ten questions were asked to each of them, all answered on a scale of 1 to 7 (1= completely agree, 7= completely disagree).

1. I use a 2-wheeler because it is affordable.
2. It gives me a sense of freedom to own a 2-wheeler.
3. Low maintenance cost makes a 2-wheeler very economical in the long run.
4. A 2-wheeler is essentially a man’s vehicle.
5. I feel very powerful when I am on my 2-wheeler.
6. Some of my friends who don’t have their own vehicle are jealous of me.
7. I feel good whenever I see the ad for 2-wheeler on T.V., in a magazine or on a hoarding.
8. My vehicle gives me a comfortable ride.
9. I think 2-wheelers are a safe way to travel.
10. Three people should be legally allowed to travel on a 2-wheeler.

Slide 3

Page 5: Chapter12 Slides

Slide 4

The input data containing responses of twenty respondents to the 10 statements are in Appendix 1, in the form of a 20 row by 10 column matrix (reproduced below).

         QUESTION NO.
S.No.    1   2   3   4   5   6   7   8   9  10
  1      1   4   1   6   5   6   5   2   3   2
  2      2   3   2   4   3   3   3   5   5   2
  3      2   2   2   1   2   1   1   7   6   2
  4      5   1   4   2   2   2   2   3   2   3
  5      1   2   2   5   4   4   4   1   1   2
  6      3   2   3   3   3   3   3   6   5   3
  7      2   2   5   1   2   1   2   4   4   5
  8      4   4   3   4   4   5   3   2   3   3
  9      2   3   2   6   5   6   5   1   4   1
 10      1   4   2   2   1   2   1   4   4   1

Table contd on next slide...

Page 6: Chapter12 Slides

Slide 4 contd

         QUESTION NO.
S.No.    1   2   3   4   5   6   7   8   9  10
 11      1   5   1   3   2   3   2   2   2   1
 12      1   6   1   1   1   1   1   1   2   2
 13      3   1   4   4   4   3   3   6   5   3
 14      2   2   2   2   2   2   2   1   3   2
 15      2   5   1   3   2   3   2   2   1   6
 16      5   6   3   2   1   3   2   5   5   4
 17      1   4   2   2   1   2   1   1   1   3
 18      2   3   1   1   2   2   2   3   2   2
 19      3   3   2   3   4   3   4   3   3   3
 20      4   3   2   7   6   6   6   2   3   6

Page 7: Chapter12 Slides

Slide 5

The data are subjected to Factor Analysis in two stages (though there are two stages, both outputs can be requested at the same time, at least in SPSS, by the process described in the SPSS Commands Appendix to the chapter).

1. In stage 1, we request the software package used (SPSS, Statistica, etc.) to EXTRACT factors with an Eigen Value of 1 or higher. The method requested is PRINCIPAL COMPONENTS. This gives us the output in Figs. 2 and 3.

Fig. 2: Factor Matrix (Unrotated)

            Factor 1    Factor 2    Factor 3
VAR00001     .17581      .66967      .49301
VAR00002     -          -.60774      .25369
VAR00003     -           .81955      .21827
VAR00004     .96647     -.03627     -.09745
VAR00005     .95098      .16594     -.13593
VAR00006     .95184     -.08442     -.02522
VAR00007     .97128      .09591     -.04636
VAR00008     -           .77498     -.03757
VAR00009     -           .73502     -.48213
VAR00010     .16143      .31862     -.81356
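The extraction stage can be sketched in a few lines of NumPy. This is a minimal sketch, not the SPSS procedure itself: it assumes that principal components extraction amounts to an eigen-decomposition of the correlation matrix of the ten variables, and that the data table on Slide 4 has been read correctly. The names `responses`, `loadings` and `communalities` are illustrative. The signs of whole columns of loadings may come out flipped relative to Fig. 2, which does not affect interpretation.

```python
import numpy as np

# Responses of the 20 respondents (rows) to the 10 statements (columns), from Slide 4.
responses = np.array([
    [1, 4, 1, 6, 5, 6, 5, 2, 3, 2],
    [2, 3, 2, 4, 3, 3, 3, 5, 5, 2],
    [2, 2, 2, 1, 2, 1, 1, 7, 6, 2],
    [5, 1, 4, 2, 2, 2, 2, 3, 2, 3],
    [1, 2, 2, 5, 4, 4, 4, 1, 1, 2],
    [3, 2, 3, 3, 3, 3, 3, 6, 5, 3],
    [2, 2, 5, 1, 2, 1, 2, 4, 4, 5],
    [4, 4, 3, 4, 4, 5, 3, 2, 3, 3],
    [2, 3, 2, 6, 5, 6, 5, 1, 4, 1],
    [1, 4, 2, 2, 1, 2, 1, 4, 4, 1],
    [1, 5, 1, 3, 2, 3, 2, 2, 2, 1],
    [1, 6, 1, 1, 1, 1, 1, 1, 2, 2],
    [3, 1, 4, 4, 4, 3, 3, 6, 5, 3],
    [2, 2, 2, 2, 2, 2, 2, 1, 3, 2],
    [2, 5, 1, 3, 2, 3, 2, 2, 1, 6],
    [5, 6, 3, 2, 1, 3, 2, 5, 5, 4],
    [1, 4, 2, 2, 1, 2, 1, 1, 1, 3],
    [2, 3, 1, 1, 2, 2, 2, 3, 2, 2],
    [3, 3, 2, 3, 4, 3, 4, 3, 3, 3],
    [4, 3, 2, 7, 6, 6, 6, 2, 3, 6],
], dtype=float)

# Correlation matrix of the 10 variables (columns are variables).
corr = np.corrcoef(responses, rowvar=False)

# eigh returns eigenvalues in ascending order; sort them largest first.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Retain only the factors with an eigenvalue of 1 or higher.
keep = eigenvalues >= 1.0

# Unrotated loadings: each eigenvector scaled by the square root of its eigenvalue.
loadings = eigenvectors[:, keep] * np.sqrt(eigenvalues[keep])

# Communality of a variable: sum of its squared loadings on the retained factors.
communalities = (loadings ** 2).sum(axis=1)

print("Eigenvalues:", np.round(eigenvalues, 5))
print("Factors retained:", int(keep.sum()))
print("Unrotated loadings:\n", np.round(loadings, 5))
print("Communalities:", np.round(communalities, 5))
```

The printed eigenvalues, loadings and communalities can then be compared against Figs. 2 and 3.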

Page 8: Chapter12 Slides

Slide 6

Interpretation of the Output

1. The first step in interpreting the output is to look at the factors extracted, their eigen values and the cumulative percentage of variance (fig. 3, reproduced below).

Fig. 3: Final Statistics

Variable    Communality  *  Factor  Eigenvalue  Pct of Var  Cum Pct
VAR00001      .72243     *    1      3.88282      38.8       38.8
VAR00002      .45214     *    2      2.77701      27.8       66.6
VAR00003      .73056     *    3      1.37475      13.7       80.3
VAR00004      .94488     *
VAR00005      .95038     *
VAR00006      .91376     *
VAR00007      .95474     *
VAR00008      .79869     *
VAR00009      .77745     *
VAR00010      .78946     *

Page 9: Chapter12 Slides

1. We note that three factors have been extracted, based on our criterion that only Factors with eigen values of 1 or more should be extracted. We see from the Cum. Pct. (Cumulative Percentage of Variance Explained) column in Fig. 3 that the three factors extracted together account for 80.3 percent of the total variance (information contained in the original ten variables). This is a pretty good bargain, because we are able to economise on the number of variables (from 10 we have reduced them to 3 underlying factors), while we lost only about 20 percent of the information content (80 percent is retained by the 3 factors extracted out of the 10 original variables).

2. This represents a reasonably good solution for our problem.
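The percentage columns of Fig. 3 follow directly from the eigenvalues. The short sketch below assumes the usual convention that, with ten standardised variables, the total variance equals 10, so each factor's share is its eigenvalue divided by 10. The variable names are illustrative.

```python
# Eigenvalues of the three retained factors, taken from Fig. 3.
eigenvalues = [3.88282, 2.77701, 1.37475]
n_variables = 10  # total variance of 10 standardised variables

# Percentage of variance explained by each factor.
pct_of_var = [100 * ev / n_variables for ev in eigenvalues]

# Running total gives the cumulative percentage.
cum_pct = []
running = 0.0
for pct in pct_of_var:
    running += pct
    cum_pct.append(running)

for factor, (ev, pct, cum) in enumerate(zip(eigenvalues, pct_of_var, cum_pct), start=1):
    print(f"Factor {factor}: eigenvalue {ev:.5f}, pct of var {pct:.1f}, cum pct {cum:.1f}")
```

The last cumulative figure is the 80.3 percent quoted above; the remaining 19.7 percent or so is the information given up by moving from 10 variables to 3 factors.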

Slide 6 contd...

Page 10: Chapter12 Slides

Slide 7

1. Now, we try to interpret what these 3 extracted factors represent. This we can accomplish by looking at figs. 4 and 2, the rotated and unrotated factor matrices.

Fig. 4: Rotated Factor Matrix

            Factor 1    Factor 2    Factor 3
VAR00001     .13402      .34749      .76402
VAR00002    -.18143     -.64300     -.07596
VAR00003    -.10944      .62985      .56742
VAR00004     .96986     -.06383     -.01338
VAR00005     .96455      .13362      .04660
VAR00006     .94544     -.13868      .02600
VAR00007     .97214      .02862      .09411
VAR00008    -.26169      .85203      .06517
VAR00009     .00891      .87772     -.08347
VAR00010     .07209     -.10990      .87874
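The slides do not say which rotation method produced Fig. 4. One common orthogonal choice is varimax, and the sketch below shows how such a rotation could be carried out. It is an assumption, not the procedure used in the chapter. The `varimax` function is a hypothetical helper written for this illustration, and the input is only the six rows of Fig. 2 in which all three loadings are present, so the output will not reproduce Fig. 4.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loadings matrix using the varimax criterion."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        column_ss = (rotated ** 2).sum(axis=0)
        gradient = loadings.T @ (rotated ** 3 - rotated @ np.diag(column_ss) / n_vars)
        # The SVD gives the nearest orthogonal matrix to the gradient.
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        new_criterion = s.sum()
        # Stop once the criterion no longer improves.
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Sample input only: rows of Fig. 2 with all three loadings present.
unrotated = np.array([
    [0.17581,  0.66967,  0.49301],   # VAR00001
    [0.96647, -0.03627, -0.09745],   # VAR00004
    [0.95098,  0.16594, -0.13593],   # VAR00005
    [0.95184, -0.08442, -0.02522],   # VAR00006
    [0.97128,  0.09591, -0.04636],   # VAR00007
    [0.16143,  0.31862, -0.81356],   # VAR00010
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 5))
```

Because the rotation matrix is orthogonal, each variable's sum of squared loadings is the same before and after rotation; only the way that variance is distributed across the factors changes.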

Page 11: Chapter12 Slides

1. Looking at fig. 4, the rotated factor matrix, we notice that variable nos. 4, 5, 6 and 7 have loadings of 0.96986, 0.96455, 0.94544 and 0.97214 on factor 1 (we look down the Factor 1 column in fig. 4, and look for high loadings close to 1.00). This suggests that Factor 1 is a combination of these four original variables. Fig. 2 also suggests a similar grouping. Therefore, there is no problem interpreting factor 1 as a combination of “a man’s vehicle” (statement in variable 4), “feeling of power” (variable 5), “others are jealous of me” (variable 6) and “feel good when I see my 2-wheeler ads” (variable 7).

2. At this point, the researcher’s task is to find a suitable phrase which captures the essence of the original variables which form the underlying concept or “factor”. In this case, factor 1 could be named “male ego”, or “machismo”, or “pride of ownership” or something similar. With the same mathematical output, interpretations of different researchers may differ.

Slide 7 contd...

Page 12: Chapter12 Slides

1. Now we will attempt to interpret factor 2. We look in fig 4, down the column for Factor 2, and find that variables 8 and 9 have high loadings of 0.85203 and 0.87772, respectively. This indicates that factor 2 is a combination of these two variables.

2. But if we look at fig. 2, the unrotated factor matrix, a slightly different picture emerges. Here, variable 3 also has a high loading on factor 2, along with variables 8 and 9. It is left to the researcher which interpretation he wants to use, as there are no hard and fast rules. Assuming we decide to use all three variables, the related statements are “low maintenance”, “comfort” and “safety” (from statements 3, 8 and 9). We may combine these variables into a factor called “utility” or “functional features” or any other similar word or phrase which captures the essence of these three statements / variables.

Slide 8

Page 13: Chapter12 Slides

3. For interpreting Factor 3, we look at the column labelled factor 3 in fig. 4 and find that variables 1 and 10 are loaded high on factor 3. According to the unrotated factor matrix of fig. 2, only variable 10 loads high on factor 3. Supposing we stick to fig. 4, then the combination of “affordability” and “cost saving by 3 people legally riding on a 2-wheeler” gives the impression that factor 3 could be “economy” or “low cost”.

4. We have now completed interpretation of the 3 factors with eigen values of 1 or more. We will now look at some additional issues which may be of importance in using factor analysis.
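The reading of Fig. 4 described above can be sketched as a simple rule: assign each variable to the factor on which its absolute loading is largest, provided that loading clears a cutoff. The cutoff of 0.7 used here is an assumption chosen for illustration, and the names `rotated`, `groups` and `unassigned` are hypothetical.

```python
# Rotated loadings from Fig. 4: variable number -> (Factor 1, Factor 2, Factor 3).
rotated = {
    1:  (0.13402,  0.34749,  0.76402),
    2:  (-0.18143, -0.64300, -0.07596),
    3:  (-0.10944,  0.62985,  0.56742),
    4:  (0.96986, -0.06383, -0.01338),
    5:  (0.96455,  0.13362,  0.04660),
    6:  (0.94544, -0.13868,  0.02600),
    7:  (0.97214,  0.02862,  0.09411),
    8:  (-0.26169,  0.85203,  0.06517),
    9:  (0.00891,  0.87772, -0.08347),
    10: (0.07209, -0.10990,  0.87874),
}

THRESHOLD = 0.7  # assumed cutoff for a "high" loading

groups = {1: [], 2: [], 3: []}
unassigned = []
for variable, loads in rotated.items():
    # Index of the factor with the largest absolute loading for this variable.
    best = max(range(3), key=lambda j: abs(loads[j]))
    if abs(loads[best]) >= THRESHOLD:
        groups[best + 1].append(variable)
    else:
        unassigned.append(variable)

print("Factor groupings:", groups)
print("No high loading:", unassigned)
```

With this cutoff the rule gives variables 4, 5, 6 and 7 for factor 1, variables 8 and 9 for factor 2, and variables 1 and 10 for factor 3, while variables 2 and 3 fall below the cutoff. A lower cutoff would change the groupings, which is why the choice is left to the researcher.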

Slide 8 contd...

Page 14: Chapter12 Slides

Slide 9

Additional Issues in Interpreting Solutions

1. We must guard against the possibility that a variable may load highly on more than one factor. Strictly speaking, a variable should load close to 1.00 on one and only one factor, and load close to 0 on the other factors. If this is not the case, it indicates that either the sample of respondents has more than one opinion about the variable, or that the question/variable may be unclear in its phrasing.

2. The other issue important in practical use of factor analysis is the answer to the question “what should be considered a high loading and what is not a high loading?” Here, unfortunately, there is no clear-cut guideline, and many a time, we must look at relative values in the factor matrix. Sometimes, 0.7 may be treated as a high value, while sometimes 0.9 could be the cutoff for high values.
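A check for variables that load on more than one factor can be sketched as follows, using the rotated loadings of Fig. 4. The cutoff of 0.5 for a "substantial" loading is an assumption made only for this illustration; as noted above, there is no clear-cut guideline, and a different cutoff will flag different variables.

```python
# Rotated loadings from Fig. 4: variable number -> (Factor 1, Factor 2, Factor 3).
rotated = {
    1:  (0.13402,  0.34749,  0.76402),
    2:  (-0.18143, -0.64300, -0.07596),
    3:  (-0.10944,  0.62985,  0.56742),
    4:  (0.96986, -0.06383, -0.01338),
    5:  (0.96455,  0.13362,  0.04660),
    6:  (0.94544, -0.13868,  0.02600),
    7:  (0.97214,  0.02862,  0.09411),
    8:  (-0.26169,  0.85203,  0.06517),
    9:  (0.00891,  0.87772, -0.08347),
    10: (0.07209, -0.10990,  0.87874),
}

CUTOFF = 0.5  # assumed cutoff for a substantial loading

# Flag variables whose absolute loading reaches the cutoff on two or more factors.
cross_loading = [
    variable
    for variable, loads in rotated.items()
    if sum(abs(value) >= CUTOFF for value in loads) > 1
]

print("Variables loading on more than one factor:", cross_loading)
```

At this cutoff only variable 3 (low maintenance cost) is flagged, since it loads on both factor 2 and factor 3.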

Page 15: Chapter12 Slides

Slide 9 contd...

Additional Issues (Contd.)

1. The proportion of variance in any one of the original variables which is captured by the extracted factors is known as Communality. For example, fig. 3 tells us that after 3 factors were extracted and retained, the communality is 0.72243 for variable 1, 0.45214 for variable 2 and so on (from the column labelled communality in fig. 3). This means that 0.72243 or 72.24 percent of the variance (information content) of variable 1 is being captured by our 3 extracted factors together. Variable 2 exhibits a low communality value of 0.45214. This implies that only 45.214 percent of the variance in variable 2 is captured by our extracted factors. This may also partially explain why variable 2 is not appearing in our final interpretation of the factors (in the earlier section). It is possible that variable 2 is an independent variable which is not combining well with any other variable, and therefore should be further investigated separately. “Freedom” could be a different concept in the minds of our target audience.

2. As a final comment, it is again the author’s recommendation that we use the rotated factor matrix (rather than unrotated factor matrix) for interpreting factors, particularly when we use the principal components method for extraction of factors in stage 1.
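The communalities in Fig. 3 can be checked against the rotated loadings in Fig. 4. The sketch below assumes the standard definition of communality as the sum of a variable's squared loadings across the retained factors, which is unchanged by an orthogonal rotation. The array names are illustrative.

```python
import numpy as np

# Rotated loadings from Fig. 4 (rows: VAR00001 to VAR00010, columns: Factors 1 to 3).
rotated = np.array([
    [0.13402,  0.34749,  0.76402],
    [-0.18143, -0.64300, -0.07596],
    [-0.10944,  0.62985,  0.56742],
    [0.96986, -0.06383, -0.01338],
    [0.96455,  0.13362,  0.04660],
    [0.94544, -0.13868,  0.02600],
    [0.97214,  0.02862,  0.09411],
    [-0.26169,  0.85203,  0.06517],
    [0.00891,  0.87772, -0.08347],
    [0.07209, -0.10990,  0.87874],
])

# Communalities as printed in Fig. 3.
fig3_communalities = np.array([
    0.72243, 0.45214, 0.73056, 0.94488, 0.95038,
    0.91376, 0.95474, 0.79869, 0.77745, 0.78946,
])

# Communality: sum of squared loadings of each variable on the three factors.
communalities = (rotated ** 2).sum(axis=1)

for number, (computed, printed) in enumerate(zip(communalities, fig3_communalities), start=1):
    print(f"Variable {number}: computed {computed:.5f}, Fig. 3 {printed:.5f}")

# The communalities add up to the variance captured by the three factors,
# i.e. roughly the sum of the three eigenvalues in Fig. 3.
print("Sum of communalities:", round(float(communalities.sum()), 4))
```

The computed values agree with Fig. 3 to about three decimal places, and variable 2 stands out with the lowest communality.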