Multi Variate Analysis Techniques

Embed Size (px)

Citation preview

  • 7/27/2019 Multi Variate Analysis Techniques

    1/4

    Eleven Multivariate Analysis Techniques:

    Key Tools In Your Marketing Research Survival KitBy Michael Richarme

    1.817.640.6166 or 1.800. ANALYSIS www.decisionanalyst.com

    Situation 1: A harried executive walks into your

    oce with a stack o printouts. She says, Yourethe marketing research whiztell me how many

    o this new red widget we are going to sell next

    year. Oh, yeah, we dont know what price we can

    get or it either.

    Situation 2: Another harried executive (they all

    seem to be that way) calls you into his oce and

    shows you three proposed advertising campaigns

    or next year. He asks, Which one should I use?

    They all look pretty good to me.

    Situation 3: During the annual budget meet-

    ing, the sales manager wants to know why two o

    his main competitors are gaining share. Do they

    have better widgets? Do their products appeal to

    dierent types o customers? What is going on

    in the market?

    All o these situations are real, and they happen

    every day across corporate America. Fortunately,

    all o these questions are ones to which solid,

    quantiable answers can be provided.

    An astute marketing researcher quickly develops

    a plan o action to address the situation. The

    researcher realizes that each question requires

    a specic type o analysis, and reaches into the

    analysis tool bag or...

    Over the past 20 years, the dramatic increase

    in desktop computing power has resulted in acorresponding increase in the availability o

    computation intensive statistical sotware. Pro-

    grams like SAS and SPSS, once restricted to

    mainrame utilization, are now readily available

    in Windows-based, menu-driven packages. The

    marketing research analyst now has access to a

    much broader array o sophisticated techniques

    with which to explore the data. The challenge

    becomes knowing which technique to select, and

    clearly understanding its strengths and weak-

    nesses. As my ather once said to me, I you

    only have a hammer, then every problem starts

    to look like a nail.

    Overview

    The purpose o this white paper is to provide

    an executive understanding o 11 multivariate

    analysis techniques, resulting in an understand-

    ing o the appropriate uses or each o the tech-

    niques. This is not a discussion o the underlying

    statistics o each technique; it is a eld guide to

    understanding the types o research questions

    that can be ormulated and the capabilities and

    limitations o each technique in answering thosequestions.

    In order to understand multivariate analysis, it

    is important to understand some o the termi-

    nology. A variate is a weighted combination o

    variables. The purpose o the analysis is to nd

    the best combination o weights. Nonmetric

    data reers to data that are either qualitative or

    categorical in nature. Metric data reers to data

    that are quantitative, and interval or ratio innature.

    The purpose of

    this white paper

    is to provide

    an executive

    understanding

    of 11 multivariate

    analysis

    techniques,

    resulting in an

    understanding ofthe appropriate

    uses for each of

    the techniques.

    Strategic ResearchAnalyticsModelingOptimization

    Copyright 2002 Decision Analyst. All rights reserved.

  • 7/27/2019 Multi Variate Analysis Techniques

    2/4

    Initial StepData Quality

    Beore launching into an analysis technique, it is important to

    have a clear understanding o the orm and quality o the data.

    The orm o the data reers to whether the data are nonmet-

    ric or metric. The quality o the data reers to how normally

    distributed the data are. The rst ew techniques discussed

    are sensitive to the linearity, normality, and equal variance

    assumptions o the data. Examinations o distribution, skew-

    ness, and kurtosis are helpul in examining distribution. Also,

    it is important to understand the magnitude o missing values

    in observations and to determine whether to ignore them

    or impute values to the missing observations. Another data

    quality measure is outliers, and it is important to determine

    whether the outliers should be removed. I they are kept, they

    may cause a distortion to the data; i they are eliminated, theymay help with the assumptions o normality. The key is to at-

    tempt to understand what the outliers represent.

    Multiple Regression Analysis

    Multiple regression is the most commonly utilized multivari-

    ate technique. It examines the relationship between a single

    metric dependent variable and two or more metric indepen-

    dent variables. The technique relies upon determining the

    linear relationship with the lowest sum o squared variances;

    thereore, assumptions o normality, linearity, and equal vari-

    ance are careully observed. The beta coecients (weights)

    are the marginal impacts o each variable, and the size o the

    weight can be interpreted directly. Multiple regression is oten

    used as a orecasting tool.

    Logistic Regression Analysis

    Sometimes reerred to as choice models, this technique is

    a variation o multiple regression that allows or the predic-

    tion o an event. It is allowable to utilize nonmetric (typically

    binary) dependent variables, as the objective is to arrive at a

    probabilistic assessment o a binary choice. The independent

    variables can be either discrete or continuous. A contingency

    table is produced, which shows the classication o observa-

    tions as to whether the observed and predicted events match.

    The sum o events that were predicted to occur which actu-

    ally did occur and the events that were predicted not to occur

    which actually did not occur, divided by the total number o

    events, is a measure o the eectiveness o the model. This

    tool helps predict the choices consumers might make when

    presented with alternatives.

    Discriminant Analysis

    The purpose o discriminant analysis is to correctly classiy

    observations or people into homogeneous groups. The inde-

    pendent variables must be metric and must have a high degree

    o normality. Discriminant analysis builds a linear discrimi-

    nant unction, which can then be used to classiy the observa-

    tions. The overall t is assessed by looking at the degree to

    which the group means dier (Wilkes Lambda or D2) and

    how well the model classies. To determine which variables

    have the most impact on the discriminant unction, it is pos-

    sible to look at partial F values. The higher the partial F, the

    more impact that variable has on the discriminant unction.

    This tool helps categorize people, like buyers and nonbuyers.

    Multivariate Analysis of Variance (MANOVA)

    This technique examines the relationship between several

    categorical independent variables and two or more metric

    dependent variables. Whereas analysis o variance (ANOVA)

    assesses the dierences between groups (by using T tests or

    2 means and F tests between 3 or more means), MANOVA

    examines the dependence relationship between a set o depen-

    dent measures across a set o groups. Typically this analysis is

    used in experimental design, and usually a hypothesized rela-

    tionship between dependent measures is used. This technique

    is slightly dierent in that the independent variables are cate-

    gorical and the dependent variable is metric. Sample size is an

    issue, with 15-20 observations needed per cell. However, too

    many observations per cell (over 30) and the technique loses

    its practical signicance. Cell sizes should be roughly equal,with the largest cell having less than 1.5 times the observa-

    tions o the smallest cell. That is because, in this technique,

    normality o the dependent variables is important. The model

    t is determined by examining mean vector equivalents

    across groups. I there is a signicant dierence in the means,

    the null hypothesis can be rejected and treatment dierences

    can be determined.

    Decision Analyst

  • 7/27/2019 Multi Variate Analysis Techniques

    3/4

    Factor Analysis

    When there are many variables in a research design, it is

    oten helpul to reduce the variables to a smaller set o ac-

    tors. This is an independence technique, in which there is no

    dependent variable. Rather, the researcher is looking or the

    underlying structure o the data matrix. Ideally, the indepen-

    dent variables are normal and continuous, with at least 3 to

    5 variables loading onto a actor. The sample size should be

    over 50 observations, with over 5 observations per variable.

    Multicollinearity is generally preerred between the variables,

    as the correlations are key to data reduction. Kaisers Measure

    o Statistical Adequacy (MSA) is a measure o the degree to

    which every variable can be predicted by all other variables.

    An overall MSA o .80 or higher is very good, with a measure

    o under .50 deemed poor.

    There are two main actor analysis methods: common actor

    analysis, which extracts actors based on the variance shared

    by the actors, and principal component analysis, which ex-

    tracts actors based on the total variance o the actors. Com-

    mon actor analysis is used to look or the latent (underlying)

    actors, where as principal components analysis is used to nd

    the ewest number o variables that explain the most variance.

    The rst actor extracted explains the most variance. Typical-

    ly, actors are extracted as long as the eigenvalues are greaterthan 1.0 or the scree test visually indicates how many actors

    to extract. The actor loadings are the correlations between

    the actor and the variables. Typically a actor loading o .4

    or higher is required to attribute a specic variable to a actor.

    An orthogonal rotation assumes no correlation between the

    actors, whereas an oblique rotation is used when some rela-

    tionship is believed to exist.

    Cluster Analysis

    The purpose o cluster analysis is to reduce a large data set to

    meaningul subgroups o individuals or objects. The division

    is accomplished on the basis o similarity o the objects across

    a set o specied characteristics. Outliers are a problem with

    this technique, oten caused by too many irrelevant variables.

    The sample should be representative o the population, and

    it is desirable to have uncorrelated actors. There are three

    main clustering methods: hierarchical, which is a treelike pro-

    cess appropriate or smaller data sets; nonhierarchical, which

    requires specication o the number o clusters a priori, and

    a combination o both. There are 4 main rules or develop-

    ing clusters: the clusters should be dierent, they should be

    reachable, they should be measurable, and the clusters should

    be protable (big enough to matter). This is a great tool ormarket segmentation.

    Multidimensional Scaling (MDS)

    The purpose o MDS is to transorm consumer judgments

    o similarity into distances represented in multidimensional

    space. This is a decompositional approach that uses percep-

    tual mapping to present the dimensions. As an exploratory

    technique, it is useul in examining unrecognized dimensions

    about products and in uncovering comparative evaluations o

    products when the basis or comparison is unknown. Typically

    there must be at least 4 times as many objects being evalu-

    ated as dimensions. It is possible to evaluate the objects with

    nonmetric preerence rankings or metric similarities (paired

    comparison) ratings. Kruskals Stress measure is a badness

    o t measure; a stress percentage o 0 indicates a perect t,

    and over 20% is a poor t. The dimensions can be interpreted

    either subjectively by letting the respondents identiy the di-

    mensions or objectively by the researcher.

    Correspondence Analysis

    This technique provides or dimensional reduction o object

    ratings on a set o attributes, resulting in a perceptual map o

    the ratings. However, unlike MDS, both independent vari-

    ables and dependent variables are examined at the same time.

    This technique is more similar in nature to actor analysis.

    It is a compositional technique, and is useul when there are

    many attributes and many companies. It is most oten used in

    assessing the eectiveness o advertising campaigns. It is alsoused when the attributes are too similar or actor analysis to

    be meaningul. The main structural approach is the develop-

    ment o a contingency (crosstab) table. This means that the

    orm o the variables should be nonmetric. The model can be

    assessed by examining the Chi-square value or the model.

    Correspondence analysis is dicult to interpret, as the di-

    mensions are a combination o independent and dependent

    variables.

    Eleven Multivariate Analysis Techniques

  • 7/27/2019 Multi Variate Analysis Techniques

    4/4

    Conjoint Analysis

    Conjoint analysis is oten reerred to as trade-o analysis,

    in that it allows or the evaluation o objects and the various

    levels o the attributes to be examined. It is both a composi-

    tional technique and a dependence technique, in that a level

    o preerence or a combination o attributes and levels is

    developed. A part-worth, or utility, is calculated or each level

    o each attribute, and combinations o attributes at specic

    levels are summed to develop the overall preerence or the

    attribute at each level. Models can be built which identiy the

    ideal levels and combinations o attributes or products and

    services.

    Canonical Correlation

    The most fexible o the multivariate techniques, canonicalcorrelation simultaneously correlates several independent

    variables and several dependent variables. This powerul tech-

    nique utilizes metric independent variables, unlike MANO-

    VA, such as sales, satisaction levels, and usage levels. It can

    also utilize nonmetric categorical variables. This technique

    has the ewest restrictions o any o the multivariate tech-

    niques, so the results should be interpreted with caution due

    to the relaxed assumptions. Oten, the dependent variables

    are related, and the independent variables are related, so nd-

    ing a relationship is dicult without a technique like canoni-

    cal correlation.

    Structural Equation Modeling (SEM)

    Unlike the other multivariate techniques discussed, structural

    equation modeling (SEM) examines multiple relationships

    between sets o variables simultaneously. This represents a

    amily o techniques, including LISREL, latent variable anal-

    ysis, and conrmatory actor analysis. SEM can incorporate

    latent variables, which either are not or cannot be measured

    directly into the analysis. For example, intelligence levels can

    only be inerred, with direct measurement o variables like

    test scores, level o education, grade point average, and other

    related measures. These tools are oten used to evaluate many

    scaled attributes or build summated scales.

    Conclusions

    Each o the multivariate techniques described abovehas a specifc type o research question or which it is

    best suited. Each technique also has certain strengths

    and weaknesses that should be clearly understood by the

    analyst beore attempting to interpret the results o the

    technique. Current statistical packages (SAS, SPSS,

    S-Plus, and others) make it increasingly easy to run a

    procedure, but the results can be disastrously misinter-

    preted without adequate care.

    About the Author

    Michael Richarme is a Senior Vice President at Decision Analyst. The author may be reached by emailing

    [email protected] by calling 1-800-262-5974 or1-817-640-6166.

    Decision Analyst is a leading international marketing research and analytical consulting rm. The companyspecializes in advertising testing, strategy research, new product ideation, new product research, and advanced

    modeling for marketing-decision optimization.

    604 Avenue H East Arlington, TX 76011-100, USA

    1.817.640.6166 or 1.800. ANALYSIS www.decisionanalyst.comStrategic ResearchAnalyticsModelingOptimization

    4 Copyright 2002 Decision Analyst. All rights reserved.