amos-collins
Cluster Analysis
Forming Groups within the Sample of Respondents
Cluster vs. Factor Analysis
• Factor analysis forms groups of items, identifying common traits that underlie the variation in respondents’ scores.
• Cluster analysis forms groups of respondents, based on the similarity of responses to “independent” items.
K-Means Cluster
• Procedural, judgmental approach, where you compare results of 2, 3, 4… cluster solutions.
• Best suited when you have a set of (6 or more) continuous, or interval coded, variables…
– that have low and non-significant inter-correlation (near independence), and…
– good range of responses across sample.
• Produces cluster scores for subsequent analysis.
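The procedure above (comparing 2, 3, and 4 cluster solutions) can be sketched in a few lines. This is a minimal illustration, not the routine any particular statistics package uses: the respondent data, the three "profiles" used to generate it, and the helper names `k_means`, `cluster_sizes`, and `within_ss` are all hypothetical, and the algorithm shown is plain Lloyd-style k-means with a fixed random seed.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two respondents' item scores."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(rows, k, iterations=50, seed=0):
    """Plain Lloyd-style k-means. Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]  # k respondents as starting centers
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each respondent joins the nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centers[c]))
            for row in rows
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, new_labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centers

def cluster_sizes(labels, k):
    return [labels.count(c) for c in range(k)]

def within_ss(rows, labels, centers):
    """Total within-cluster sum of squares (smaller means tighter clusters)."""
    return sum(squared_distance(r, centers[lab]) for r, lab in zip(rows, labels))

# Hypothetical data: 60 respondents, 6 items on a 1-7 scale, drawn around 3 made-up profiles.
data_rng = random.Random(42)
profiles = [[6, 6, 2, 2, 4, 4], [2, 2, 6, 6, 4, 4], [4, 4, 4, 4, 6, 2]]
rows = [
    [min(7, max(1, round(data_rng.gauss(m, 0.8)))) for m in profile]
    for profile in profiles
    for _ in range(20)
]

# Compare the 2, 3, and 4 cluster solutions by size and tightness.
for k in (2, 3, 4):
    labels, centers = k_means(rows, k)
    print(k, cluster_sizes(labels, k), round(within_ss(rows, labels, centers), 1))
```

The `labels` list plays the role of the cluster score: one categorical value per respondent that can be carried into later analyses. Because the starting centers are random, a different `seed` can give a different solution, which is one reason to compare several runs.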
Validity of Clusters
• Examine the relative sizes and composition of the clusters—are the sizes helpful?
• Do the clusters have face validity? Can you assign names to segments produced from the analysis based on the means on the individual items?
• Can you add new items and retain the same clusters?
Reliability of the Clusters
• Are the clusters “stable” across different sets of (randomly assigned) respondents?
• Are the clusters “stable” with the inclusion or deletion of items?
• Can significant differences be shown from an ANOVA across means on each of the items used in the cluster analysis?
• Do the clusters illustrate differences in responses to separate items?
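The ANOVA check in the list above compares the mean of one item across clusters. A minimal sketch of the F-statistic follows; the scores and the function name `one_way_anova_f` are hypothetical, and the example returns only F (a p-value would need an F-distribution table or a statistics package).

```python
def one_way_anova_f(groups):
    """One-way ANOVA F-statistic for a list of groups of scores."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-cluster variation: how far each cluster mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-cluster variation: spread of respondents around their own cluster mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores on a single item, split by cluster membership.
item_by_cluster = [
    [5, 6, 7],  # cluster 1
    [3, 4, 5],  # cluster 2
    [1, 2, 3],  # cluster 3
]
f_stat = one_way_anova_f(item_by_cluster)
print(f_stat)
```

Items that were used to form the clusters will tend to show large F values almost by construction, so the comparison on separate items (the last bullet above) is usually the more informative check.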
Limitations of Cluster
• Is largely dependent on the composition of the sample; a different sample composition will produce different clusters.
• Has a well-deserved poor reputation as an atheoretical approach toward classification and data analysis.
• Best suited for exploratory research designed to exaggerate differences between groups of respondents.
• Addicting, creative approach to forming segments.
Suggested Technique to Create an Understandable Cluster Analysis
• Start with a subset of the questionnaire items used to form clusters.
• Start with 2 clusters, and increase to 3, then 4, examining changing clusters and sizes of each.
• Include additional items one at a time—do the cluster definitions improve in consistency?
Discriminant Analysis
• Cluster analysis forms a classification, or categorical variable, based on responses to continuous variables.
• Discriminant analysis takes a pre-determined classification variable and identifies continuous variables that show significant differences.
Appropriate Analyses for Project
• Descriptive Statistics
– Frequencies on items, particularly those showing popularity, strongest sentiments, importance.
• “Bivariate” Statistics
– ANOVA, F-statistics for differences in means
– T-tests for comparisons of means between two groups
– Cross-tabulations of categorical, nominally coded items.
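Two of the bivariate tools listed above can be illustrated briefly. The sketch below computes a pooled-variance t-statistic for two groups and a simple cross-tabulation of two nominally coded items. All data, category labels, and the function name `pooled_t` are hypothetical, and the pooled form assumes roughly equal variances in the two groups.

```python
import math
from collections import Counter
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Two-sample t-statistic with pooled variance (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical importance ratings for one feature, split into two respondent groups.
group_a = [4, 5, 6]
group_b = [1, 2, 3]
t_stat = pooled_t(group_a, group_b)
print(round(t_stat, 3))

# Hypothetical nominal items: plan type and whether the respondent uses text messaging.
plan = ["prepaid", "contract", "contract", "prepaid", "contract", "contract"]
texting = ["yes", "yes", "no", "no", "yes", "yes"]
crosstab = Counter(zip(plan, texting))  # cell counts keyed by (plan, texting)
for cell, count in sorted(crosstab.items()):
    print(cell, count)
```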
Multiple Regression
• Limited number of continuous variables that would be appropriate/interesting as dependent variables for the Alltel project.
• Best:
– Willingness to pay $xx for a certain carrier service
– “Must have” vs. “Don’t need” for features items
Measurement Issues
• Correlations
• Reliability—mean inter-item correlations
• Factor Analysis
– Data reduction to subscales and underlying traits through explained variance.
– Factor loadings (pattern matrices)
– Later, we’ll use factor scores for visual plots (perceptual mapping)
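The reliability measure named above, the mean inter-item correlation, is the average Pearson correlation over every pair of items in a proposed subscale. A minimal sketch follows; the item names and responses are hypothetical, and `pearson` and `mean_inter_item_correlation` are illustrative helpers rather than functions from any package.

```python
import math
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    covariation = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread = math.sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )
    return covariation / spread

def mean_inter_item_correlation(items):
    """Average correlation over all pairs of items (dict: item name -> scores)."""
    pairs = combinations(items.values(), 2)
    return mean(pearson(x, y) for x, y in pairs)

# Hypothetical subscale: three items answered by the same six respondents (1-7 scale).
subscale = {
    "coverage_matters": [6, 5, 7, 3, 2, 4],
    "dropped_calls_bother_me": [7, 5, 6, 2, 3, 4],
    "signal_at_home": [6, 4, 7, 3, 1, 5],
}
print(round(mean_inter_item_correlation(subscale), 3))
```

A higher average suggests the items move together and can reasonably be combined into one subscale; this is the opposite of what the cluster analysis wants, where items should be close to independent.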
Classification Methods
• Discriminant Analysis
– Categorical, nominally coded dependent variable
– Wilks’ Lambda
– Classification results
• Cluster Analysis
– Creates categorical variables from sets of continuous, or interval coded items