Market Research Doc

  • Upload
    zqasim

  • 8/11/2019 Market Research Doc

    MARKET RESEARCH

    Teaching you how to fish

    Why, you ask?

    Because I have this burning desire inside of me to give gyaan to people who don't need it.

    Also because it is saddening to see the one sensible marketing subject detested by the majority.

    And because Mala is so awesome.

    To the best of my knowledge, this guide is accurate and should suffice for the end-term examination. It took me two days to do this, and it was an invaluable learning experience for me as well. However, I am not claiming it to be perfect. So, if you have any corrections or additions to make to this document, please drop a mail to [email protected].

    Cheers.

    Oh, I almost forgot.

    If you find this useful, do share it with others who might need it as well. There is no point in holding onto information just to score more marks. That A grade in the mark sheet will take you only so far.

    *Hardcore open source fan*

    Contents

    MULTI-DIMENSIONAL SCALING
    CONJOINT ANALYSIS
    CLUSTER ANALYSIS
    FACTOR ANALYSIS
    BINARY LOGISTIC REGRESSION
    DISCRIMINANT ANALYSIS

    MULTI-DIMENSIONAL SCALING

    MDS allows the perceptions and preferences of consumers to be clearly represented in a spatial map. It gives quantitative estimates of similarity between groups of items.

    Let's say that, in a survey, the respondents were asked to rate all possible pairs of nine soft drinks in terms of their similarity, and MDS results in the following spatial representation of the soft drinks:

    From this it can be inferred that Coke and Pepsi are the most similar, as there is the least distance between them, while Dr Pepper and Diet 7-up (or Diet Coke and Tango) are the most dissimilar.

    The catch is that MDS doesn't tell you what the horizontal and vertical axes represent. These are dependent on the judgment of the researcher. For instance, the vertical axis could be dietness and the horizontal axis could be flavor. Or it could be price, color, soda amount etc.

    Other applications of MDS would be:

    Market segmentation - Group customers with similar interests

    New product development - Where would you want your new soft drink to be placed in the above map?

    Pricing

    Choice of retail outlets - Where and which channel?

    Advertising - Which tagline or actor is suitable for different brands/products?

    E.g. for advertising: suppose there are four categories and four actors:

    Category               Actor
    1. Telecom             1. Shahrukh Khan
    2. Fairness creams     2. Uday Chopra
    3. Tourism             3. John Abraham
    4. Viagra              4. Abhishek Bachchan

    The respondents can be asked to rate individual pairs on a Likert scale of 1-7 (1 = good match, 7 = not a good match) to see which actor might be suitable for a category:

    So, this shows that Abhishek Bachchan would be most suitable for Telecom ads (this is a random figure; we know that it's not true in reality), and none of the four is close enough to Viagra, probably because they don't have what it takes to represent that category.

    How to do MDS in SPSS?

    To obtain a spatial map for 5 brands of beer:

    1. Obtain the data from respondents on a Likert scale of 1-5 or 1-7, with 1 = most similar and 5 (or 7) = least similar.

    2. Here row 1 represents Budvar, row 2 Budweiser and so on. So, they'll have 0 (or 1) against themselves. Only the upper half of the triangle in the data matrix suffices here. There might be some special cases where respondents may say that New Zealand is similar to Australia, but Australia is not similar to New Zealand. Then, you will need the complete matrix of data to analyze.

    3. Now, there are two methods of MDS - PROXSCAL and ALSCAL. PROXSCAL is an update in SPSS after ALSCAL was found to be inefficient. Apparently, we were taught ALSCAL, which I am not comfortable with. So you may figure it out on your own once you have gone through the PROXSCAL method.

    4. Go to Analyze > Scale > MDS [PROXSCAL]. Click on Define in the next window that appears.

    5. Put all 5 brands of beer under Proximities.

    6. In Model, choose upper-triangular matrix and set the dimensions from 1 to 4, as we have 5 brands. We need to see the stress plot and reduce the dimensions later.

    7. Go to Plots, check the stress plot and uncheck common space for now.

    8. Run it. The scree plot shows the elbow formed at dimension 2. After that, the stress doesn't reduce by a significant amount as the dimensions increase. Our objective is to keep both the dimensions and the stress to a minimum. So, we have to make a call at some point to balance the two.

    9. Now, run the MDS PROXSCAL all over again, but this time, change the dimensions range to min 2, max 2 and check the common space plot.

    10. You'll get the above result. It can be seen that Budweiser is the most unique among the 5. Heineken and Carlsberg, and Corona and Budvar, are similar pairs. It is hard to name the dimensions with such a small sample. It could be aftertaste, the degree of high that you get, etc. The dimensions can also be judged by taking a ranking of beer attributes in a separate survey.

    11. Validity: Stress values are indicative of the quality of MDS solutions. In general, the following is the recommendation for stress values:

    In our output, it is about 2.8% (0.02796), which makes it an excellent fit.
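
    To make the idea of stress concrete, here is a minimal sketch of one common definition, Kruskal's Stress-1, in its simplest (metric) form: it compares the original dissimilarity ratings directly with the distances on the map. SPSS PROXSCAL reports several stress measures, so this is not necessarily the exact figure shown in its output. The ratings and map coordinates below are hypothetical, and the function names are made up for illustration.

```python
import math


def pairwise_distances(coords):
    """Euclidean distance between every pair of map points, keyed by (i, j) with i < j."""
    dists = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dists[(i, j)] = math.dist(coords[i], coords[j])
    return dists


def stress_1(dissimilarities, coords):
    """Kruskal's Stress-1: sqrt(sum((rating - distance)^2) / sum(distance^2))."""
    dists = pairwise_distances(coords)
    numerator = sum((dissimilarities[pair] - d) ** 2 for pair, d in dists.items())
    denominator = sum(d ** 2 for d in dists.values())
    return math.sqrt(numerator / denominator)


# Hypothetical upper-triangular ratings for three items (not the beer data).
ratings = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}

# A map that reproduces the ratings exactly, and one that distorts them.
perfect_map = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
distorted_map = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]

print(stress_1(ratings, perfect_map))    # 0.0, a perfect fit
print(stress_1(ratings, distorted_map))  # larger than 0, a worse fit
```

    The lower the value, the better the map reproduces the ratings, which is why adding dimensions can only help the stress and why we look for the elbow.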

    CONJOINT ANALYSIS

    Used to determine how people value the different features that make up an individual product or service.

    Let's say I am a shoe manufacturer and there are three important features (attributes):

    Material

    Durability

    Price

    Further, we know that there is a range of feasible alternatives (attribute levels) for each of these features, say,

    Material    Durability    Price
    Leather     1 year        300
    Canvas      2 years       1000
    Rubber                    2000

    Obviously, the market's ideal shoe would be

    Material    Durability    Price
    Leather     2 years       300

    And the ideal shoe from my perspective (manufacturer) would be

    Material    Durability    Price
    Rubber      1 year        2000

    Here is the basic marketing issue: I'd be stripped naked selling the first shoe, whereas the market wouldn't buy the second. So, the most viable product lies somewhere in between. Conjoint analysis lets us find out where.

    Now, we would need a survey wherein the respondents are asked to evaluate different product combinations. In the present case, we have 3*2*3 = 18 possible combinations. Generally, it is advisable to bring it down to 10-12 combinations after careful judgment. For now, let us consider all 18 cases.

    Once you have got the scores/ranking, convert the attributes to binary variable format and enter them into SPSS.

    Below is an example:

    For the Material attribute, there are 3 levels, so we use 2 variables.

    If both VarCanvas and VarLeather are zero, it implies the material is rubber. The same goes for the other attributes and their respective levels.

    Dur2 = 1 implies a durability of 2 years.

    V2000 and V1000 are two variables for the prices 2000 and 1000 rupees respectively.

    Preference = 18 is the highest preference of the customer. So, consider these values as scores and not as a ranking from 1 to 18.

    Then, you do a linear regression.
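
    The binary coding described above can be sketched in a few lines. The variable names (VarCanvas, VarLeather, Dur2, V2000, V1000) are the ones used in this guide; the function name and the way the levels are written are only assumptions for illustration. Rubber, 1 year and 300 are the implicit levels, so they show up as all zeros.

```python
from itertools import product

MATERIALS = ("Leather", "Canvas", "Rubber")
DURABILITIES = (1, 2)          # years
PRICES = (300, 1000, 2000)     # rupees


def encode_profile(material, durability_years, price):
    """Turn one shoe profile into the 0/1 variables entered into SPSS."""
    if material not in MATERIALS or durability_years not in DURABILITIES or price not in PRICES:
        raise ValueError("unknown attribute level")
    return {
        "VarCanvas": int(material == "Canvas"),
        "VarLeather": int(material == "Leather"),
        "Dur2": int(durability_years == 2),
        "V2000": int(price == 2000),
        "V1000": int(price == 1000),
    }


# All 3 * 2 * 3 = 18 product combinations, each coded as a row of dummies.
all_profiles = [encode_profile(m, d, p) for m, d, p in product(MATERIALS, DURABILITIES, PRICES)]

print(len(all_profiles))                     # 18
print(encode_profile("Rubber", 1, 300))      # every variable is 0 (the baseline shoe)
print(encode_profile("Leather", 2, 1000))    # VarLeather, Dur2 and V1000 are 1
```

    The respondent's preference score for each combination would then be added as one more column before running the regression.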

    Preference is the dependent variable and the rest are independent variables. Click OK.

    The output is as follows:

    Interpretation:

    Firstly, the validity of the linear regression model has to be checked from the model summary. R square = 0.89, i.e. the model explains 89% of the variance. In general, 60%+ is considered to be a good fit.

    Note on R square:

    The regression model on the left accounts for 38.0% of the variance while the one on the right accounts for 87.4%. The more variance that is accounted for by the regression model, the closer the data points will fall to the fitted regression line. Theoretically, if a model could explain 100% of the variance, the fitted values would always equal the observed values and, therefore, all the data points would fall on the fitted regression line.
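
    As a rough sketch of what R square measures, it can be computed as 1 minus the ratio of the unexplained variation to the total variation. The observed and fitted values below are hypothetical and are not taken from the SPSS output in this guide.

```python
def r_squared(observed, fitted):
    """Share of the variance in the observed values that the fitted values explain."""
    mean_observed = sum(observed) / len(observed)
    total_ss = sum((y - mean_observed) ** 2 for y in observed)
    residual_ss = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1 - residual_ss / total_ss


# Hypothetical preference scores.
observed = [1.0, 2.0, 3.0, 4.0]

# A model whose fitted values equal the observed values explains everything.
print(r_squared(observed, [1.0, 2.0, 3.0, 4.0]))   # 1.0

# A model that always predicts the mean explains nothing.
print(r_squared(observed, [2.5, 2.5, 2.5, 2.5]))   # 0.0
```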

    Second, the ANOVA table. In a regression, the null hypothesis of the ANOVA is that the independent variables, taken together, explain none of the variation in the dependent variable (all their coefficients are zero), which would make the model redundant. So, the level of significance should be < 0.05 so that the null hypothesis is rejected. Here, sig. = 0 from the table.

    Third, we calculate the utilities from the coefficients table. The B column gives the observed

    utilities. Note that the total utility for any attribute (like material) is equal to zero.

    Utility (Canvas) = 1.667

    Utility (Leather) = 5.833

    Therefore, Utility (Rubber) = -7.50

    Similarly for others:

    Utility (Dur2) = 6.778, Utility (Dur1) = -6.778

    Utility (V2000) = -6.167, Utility (V1000) = -3.833, Utility (V300) = 10

    Fourth, we calculate the actual utilities, because SPSS considered the utility of the implicit variables (Rubber, Dur1 and V300) as zero and gave the utilities of the explicit variables relative to zero.

    So, Utility (Canvas) - Utility (Rubber) = 1.667

    And, Utility (Leather) - Utility (Rubber) = 5.833

    We already know that Utility (Canvas) + Utility (Leather) + Utility (Rubber) = 0

    Solving these three equations, we get the actual utilities.

    Material    Utility
    Canvas      -0.85
    Leather     3.36
    Rubber      -2.51

    Similarly, do the same for the other two attributes.

    Durability    Utility
    2 years       3.389
    1 year        -3.389

    Price    Utility
    2000     -2.837
    1000     -0.493
    300      3.33
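
    Solving the three equations for an attribute amounts to shifting the coefficients so that they add up to zero. Here is a minimal sketch, assuming the B coefficients quoted earlier (1.667, 5.833, 6.778, -6.167, -3.833) and a made-up function name. Solved exactly, the material equations give about -0.833, 3.333 and -2.5, which is close to the table values above but differs slightly in the second decimal.

```python
def zero_centred_utilities(coefficients, baseline):
    """Shift dummy-variable coefficients so the utilities of an attribute sum to zero.

    coefficients maps each explicit level to its B value (measured relative to the
    baseline level, which SPSS treats as zero).
    """
    relative = dict(coefficients)
    relative[baseline] = 0.0
    mean = sum(relative.values()) / len(relative)
    return {level: value - mean for level, value in relative.items()}


material = zero_centred_utilities({"Canvas": 1.667, "Leather": 5.833}, baseline="Rubber")
durability = zero_centred_utilities({"2 years": 6.778}, baseline="1 year")
price = zero_centred_utilities({2000: -6.167, 1000: -3.833}, baseline=300)

for name, utilities in (("Material", material), ("Durability", durability), ("Price", price)):
    print(name, {level: round(u, 3) for level, u in utilities.items()})
```

    The differences between levels stay exactly as SPSS reported them; only the zero point moves.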

    Now, we can rank the different product combinations on the basis of the maximum sum of utilities. The top ten combinations would be

    Material    Durability    Price    Utility sum
    Leather     2 years       300      10.079
    Leather     2 years       1000     6.256
    Canvas      2 years       300      5.869
    Rubber      2 years       300      4.209
    Leather     2 years       2000     3.912
    Leather     1 year        300      3.301
    Canvas      2 years       1000     2.046
    Rubber      2 years       1000     0.386
    Canvas      2 years       2000     -0.298
    Leather     1 year        1000     -0.522

    This is where I, as a manufacturer, would make a trade-off. I would leave out the top 3 preferred products as they would mean only losses for me and maybe go for the 4th or 5th combination. There would be no customers for the last few combinations.
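
    The ranking can be reproduced by adding up the utilities for every one of the 18 combinations and sorting. This is only a sketch: it uses the utility values from the tables in this guide, and the function name is made up.

```python
from itertools import product

# Utilities as tabulated in this guide.
MATERIAL = {"Leather": 3.36, "Canvas": -0.85, "Rubber": -2.51}
DURABILITY = {"2 years": 3.389, "1 year": -3.389}
PRICE = {300: 3.33, 1000: -0.493, 2000: -2.837}


def rank_profiles(material, durability, price):
    """Return every combination with its utility sum, highest sum first."""
    profiles = []
    for (m, um), (d, ud), (p, up) in product(material.items(), durability.items(), price.items()):
        profiles.append({"material": m, "durability": d, "price": p, "utility": round(um + ud + up, 3)})
    return sorted(profiles, key=lambda profile: profile["utility"], reverse=True)


ranking = rank_profiles(MATERIAL, DURABILITY, PRICE)
for position, profile in enumerate(ranking[:10], start=1):
    print(position, profile["material"], profile["durability"], profile["price"], profile["utility"])
```

    Sorting all 18 rather than picking by eye is a cheap way to make sure no combination is missed when drawing up the top ten.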

    CLUSTER ANALYSIS

    Identifying groups of individuals or objects that are similar to each other but different from individuals in other groups

    Each object is assigned to only one cluster

    In cluster analysis, there is no a priori information about the group or cluster membership for any of the objects (in discriminant analysis, there is; that is discussed in a later section)

    Mainly used for understanding buying behaviors and market segmentation

    DURR CASE:

    Background - DURR Environmental Controls is a German conglomerate that produces air emission control systems and has extensive industrial operations in the US. The company is considering introducing one or more offers in the US market and believes that its product will need lower service costs than its competitors' products.

    Our objective is to propose a market segmentation that allows DURR to target customers with a specific and efficient sales pitch.

    MARKET AND COMPETITION:

    While choosing a product in the market, customers look at four dimensions:

    Efficiency

    Delivery Time

    Price

    Delivery Terms

    Each of these dimensions has 4 levels.

    Now, we do analysis using SPSS. We have 3 types of clustering: Hierarchical, K-means and two

    step. We have studied only the first two methods.

    For K-means, we need to know how many clusters we want. So first, we use hierarchical cluster

    method.

    We use squared Euclidean distance as the measure.

    Also, check Dendrogram under the Plots section and run the program.

    The proximity matrix gives the squared Euclidean distances (I'll call it SED) between 2 companies across the 16 variables. For example, the SED between 8 and 9 is 235.

    How is this obtained? You remember how we used to find the distance between two points in co-ordinate geometry? That was Euclidean distance too: for two points (x1, y1) and (x2, y2), the distance is c = sqrt((x1 - x2)^2 + (y1 - y2)^2).

    Square c and you get SED. So, consider company 8 and company 9 as two points and the variable values as their co-ordinates.

    Going back to the DURR data sheet, we get the SED between 8 and 9 as

    (19-17)^2 + (5-7)^2 + (3-4)^2 + (0-0)^2 + (19-20)^2 + ... = 235
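
    The same calculation as a small sketch. Only the first five of the 16 variable values for companies 8 and 9 are visible in the line above, read here as the pairs (19, 17), (5, 7), (3, 4), (0, 0) and (19, 20); that reading is an assumption, so this example gives the partial sum for those five terms and not the full 235.

```python
def squared_euclidean_distance(a, b):
    """Sum of squared differences between two cases, variable by variable."""
    if len(a) != len(b):
        raise ValueError("both cases need the same number of variables")
    return sum((x - y) ** 2 for x, y in zip(a, b))


# First five variable values only (the remaining 11 are not shown in the text).
company_8 = [19, 5, 3, 0, 19]
company_9 = [17, 7, 4, 0, 20]

print(squared_euclidean_distance(company_8, company_9))  # 4 + 4 + 1 + 0 + 1 = 10
```

    With all 16 values for each company, the same function would give the figure that appears in the proximity matrix.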

    Now, we know how all the possible SEDs are calculated. SPSS then arranges the pairs in increasing order of the distances between them and puts them into clusters step by step. This is when you study the agglomeration schedule table. It lists the 30 least distances. As there are 31 companies, we get 30 stages of merging.

    You go on doing this till stage 8; then a company re-appears for the second time. This is where the right half of the table helps.

    At stage 9, in the right half, it says that company 8 had already appeared in stage 4. This means you are going to put company 3 into the STAGE 4 box and then rename it to STAGE 9.

    The number of boxes (or clusters) goes on reducing as you go further. In stage 14, for instance, boxes 7 and 8 are merged to form a bigger box 14.

    The dendrogram also depicts this process. Take a moment and you'll understand the figure. The plus signs are the points where the merges happen. (Some SPSS outputs don't give dotted lines or plus symbols, but they are also quite simple to read.)

    Now, you'll have to make a call on the number of clusters to keep. Look at the agglomeration schedule and see where the maximum jump in distances happens. In class, Mala mentioned it as stage 29 (a jump of 1000), so we should keep 3 clusters. But if you see the dendrogram, the three clusters would have 21, 1 and 9 companies, which is not good segmentation (company 14 appears for the first time in stage 29). So, you'll have to come down to two clusters with 22 and 9 companies.
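
    The "maximum jump" rule can be written down as a short sketch. With n objects there are n - 1 merge stages, and stopping just before the stage with the biggest jump leaves n - (stage - 1) clusters. The coefficients below are hypothetical (they are not the DURR agglomeration schedule); they are only shaped so that the jump of 1000 lands at stage 29, as in the class example.

```python
def clusters_before_largest_jump(coefficients, n_objects):
    """Find the stage with the largest increase in distance and the clusters left before it.

    coefficients[i] is the agglomeration coefficient (merge distance) at stage i + 1.
    Returns (stage_with_jump, number_of_clusters).
    """
    if len(coefficients) != n_objects - 1:
        raise ValueError("expected one coefficient per merge stage (n_objects - 1)")
    jumps = [coefficients[i] - coefficients[i - 1] for i in range(1, len(coefficients))]
    biggest = max(range(len(jumps)), key=jumps.__getitem__)
    jump_stage = biggest + 2            # jumps[0] is the jump going into stage 2
    clusters_left = n_objects - (jump_stage - 1)
    return jump_stage, clusters_left


# Hypothetical schedule for 31 companies: slow growth, then a jump of 1000 at stage 29.
schedule = list(range(1, 29)) + [1028, 1100]

print(clusters_before_largest_jump(schedule, n_objects=31))  # (29, 3)
```

    As the DURR case shows, the rule is only a starting point: the sizes of the resulting clusters still have to make sense for segmentation.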

    Now that we know how many clusters we want, we can use K-means cluster analysis. The path

    is: Analyze > Classify > K-means cluster

    Put in the 16 variables and set the number of clusters to 2.

    Under Save, check cluster membership.

    Under Options, check ANOVA table and cluster information for each case. Run the program.
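
    For intuition about what K-means does behind that dialog, here is a bare-bones sketch: assign each case to the nearest cluster center, recompute the centers as the means of their members, and repeat until nothing changes. It is not SPSS's implementation; in particular, starting from the first k cases is a simplification assumed here, and the data points are hypothetical.

```python
def k_means(points, k, max_iterations=100):
    """Plain K-means on a list of numeric tuples; returns (assignment, centers)."""
    centers = [list(p) for p in points[:k]]   # naive start: the first k cases
    assignment = None
    for _ in range(max_iterations):
        # Step 1: put every case into the cluster with the nearest center (squared distance).
        new_assignment = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break                             # memberships stopped changing
        assignment = new_assignment
        # Step 2: move each center to the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(column) / len(members) for column in zip(*members)]
    return assignment, centers


# Hypothetical two-variable data with two obvious groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
membership, final_centers = k_means(data, k=2)

print(membership)       # [0, 0, 0, 1, 1, 1]
print(final_centers)
```

    The membership list plays the role of the cluster membership column that SPSS saves to the data sheet (SPSS numbers clusters from 1, this sketch from 0), and the final centers are the cluster means compared later.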

    In the output, two tables are important. First is the cluster membership table. Cluster membership will also appear in the last column of the data sheet.

    Second is the ANOVA table. The ANOVA analysis in itself is not important, i.e. the Sig. values play no role. However, the differences between the F-ratios (F column in the ANOVA table) make it possible to draw general conclusions about the role of the different variables in the forming of the clusters.

    It shows that V11 has the greatest influence in the forming of the clusters and V13 has the least influence, among the given values.
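
    An F-ratio is the variation between the cluster means divided by the variation within the clusters. As a rough sketch, it can be approximated from the mean, N and standard deviation of each cluster. The V11 and V13 figures below are taken from the means comparison table further down in this guide; the function name is made up, and because the inputs are rounded, the result will not match the SPSS ANOVA table digit for digit.

```python
def f_ratio_from_summary(groups):
    """One-way ANOVA F from per-cluster (mean, n, standard deviation) triples."""
    total_n = sum(n for _, n, _ in groups)
    grand_mean = sum(mean * n for mean, n, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, n, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for _, n, sd in groups)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (total_n - len(groups))
    return ms_between / ms_within


# (mean, N, std. deviation) for cluster 1 and cluster 2.
v11 = [(24.33, 9, 10.817), (7.77, 22, 4.985)]
v13 = [(0.33, 9, 1.000), (0.09, 22, 0.426)]

print(f_ratio_from_summary(v11))   # large: the two clusters differ a lot on V11
print(f_ratio_from_summary(v13))   # small: V13 hardly separates the clusters
```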

    Now, we do a means comparison between the two clusters.

    Means by Cluster Number of Case

                                        Cluster 1                   Cluster 2                   Total
    Variable                            Mean     N    Std. Dev.     Mean     N    Std. Dev.     Mean     N    Std. Dev.
    Exceeds 9%                          15.56    9    5.028         32.14    22   12.552        27.32    31   13.250
    Exceeds 5%                          6.11     9    2.667         21.27    22   9.867         16.87    31   10.908
    Meets specifications                4.22     9    2.819         10.05    22   8.375         8.35     31   7.644
    Short by 5%                         .00      9    .000          .00      22   .000          .00      31   .000
    6 months                            17.33    9    4.975         36.82    22   11.283        31.16    31   13.287
    9 months                            10.11    9    4.226         22.77    22   10.099        19.10    31   10.502
    12 months                           6.44     9    3.909         10.32    22   5.801         9.19     31   5.552
    15 months                           .00      9    .000          .00      22   .000          .00      31   .000
    V10                                 32.33    9    14.018        12.95    22   6.191         18.58    31   12.617
    V11                                 24.33    9    10.817        7.77     22   4.985         12.58    31   10.343
    V12                                 12.33    9    7.089         3.95     22   2.820         6.39     31   5.823
    V13                                 .33      9    1.000         .09      22   .426          .16      31   .638
    Installed, with 2-year warranty     29.33    9    11.619        16.77    22   8.799         20.42    31   11.126
    Installed, with 1-year warranty     19.56    9    11.706        11.59    22   6.659         13.90    31   9.005
    Installed, with service contract    4.33     9    3.775         6.09     22   3.663         5.58     31   3.722
    FOB, with service contract          .00      9    .000          .00      22   .000          .00      31   .000
    Sales$_2004                         34.133   9    41.9710       4.464    22   3.8761        13.077   31   25.8396
    Profit%                             9.611    9    11.8505       3.764    22   6.2700        5.461    31   8.4999
    Return_on_Equity                    19.467   9    13.6689       17.714   22   17.0450       18.223   31   15.9327
    Employees                           56.33    9    67.050        14.11    22   15.130        26.37    31   41.697
    SalesGrowth_(2003-2004)             9.467    9    12.0271       20.164   22   24.7709       17.058   31   22.1913
    TopMgt                              18.89    9    2.713         35.77    22   2.617         30.87    31   8.213
    Engineering                         20.11    9    3.296         40.59    22   2.631         34.65    31   9.851
    Finance                             28.89    9    3.180         10.09    22   2.759         15.55    31   9.124
    Purchasing                          32.11    9    3.983         13.64    22   5.019         19.00    31   9.723
    Growth                              21.89    9    2.315         9.59     22   2.594         13.16    31   6.192
    Profit                              28.33    9    2.828         20.77    22   2.159         22.97    31   4.191
    MarketShare                         14.22    9    2.819         10.36    22   3.170         11.48    31   3.511
    TechLeadership                      15.78    9    2.489         8.77     22   3.023         10.81    31   4.301
    CorpCitEnv                          6.00     9    3.000         25.55    22   3.143         19.87    31   9.521
    GovReg                              13.56    9    7.452         25.18    22   6.382         21.81    31   8.491

    It can be noticed that cluster 1 has high mean values (cluster centers) for the price variables and the warranty variables. So, they are price sensitive and prefer good service.

    Cluster 2 has high values for the efficiency and delivery time variables. So, they want efficient products and quick delivery.

    You can write more gas on the company's strategies using the above two points.

    FACTOR ANALYSIS

    This is slightly complicated to explain in writing and requires a lot of time.

    There is a series of 6 videos, less than 40 min in total, by Dr. Dawg (yeah, I know).

    He explains the basics pretty well. Here's a link to the first video:

    https://www.youtube.com/watch?v=MB-5WB3eZI8

    You can find the remaining 5 videos on your own from there. You can watch all 6 to get a hold on factor analysis and then read a bit on the net or in the text book.

    BINARY LOGISTIC REGRESSION

    You can use binary logistic regression to predict whether you'll pass the upcoming MR examination or not, based on study time, test anxiety and lecture attendance.

    BLR predicts the probability that an observation falls into one of two categories of a dichotomous dependent variable, based on one or more independent variables that can be either continuous or categorical.

    First, let's understand the different types of variables. Broadly there are two types: categorical and continuous.

    Categorical variables can be divided further into nominal, ordinal or dichotomous.

    Nominal - 2 or more categories with no intrinsic order. E.g. types of property (houses, co-ops or bungalows) or occupation (doctor, engineer, artist)

    Ordinal - Similar to nominal, but here the categories can be ordered and ranked. So if you asked someone if they liked the concept of the ALS ice bucket challenge and they could answer "Not very much", "It is OK" or "Yes, a lot", then it is ordinal. You can rank these three as least positive, middle response and most positive respectively.

    Dichotomous - Nominal variable with only 2 categories. E.g. gender, or property divided into two segments only (residential and commercial)

    Continuous variables are also known as quantitative variables. They are further divided into interval or ratio variables.

    Interval - Can be measured along a continuum and have a numerical value. E.g. temperature in degrees Celsius

    Ratio - Interval variables with the added condition that a zero of the measurement indicates that there is none of the variable. E.g. height, mass, distance

    For BLR, the dependent variable must be dichotomous, with the 2 categories being mutually exclusive and exhaustive, while the independent variables can be continuous or categorical.

    How to do in SPSS?

    Consider the example from our final project. We wanted to predict whether the customer shopped online, based on their age, gender and expenditure per month.

    Therefore, age, gender and expenditure become our independent variables while online becomes the dependent variable. The responses to online shopping were coded as:

    1 = No, I don't shop online

    2 = Yes, I shop online

    The SPSS output will model the changes in the likelihood of being an online shopper, as that category has the higher coding value.

    We had 100 respondents. Part of the survey looked like this:

    Gender    Age         Expenditure on clothing per month    Online shopping?
    Female    26 to 30    Less than 1000                       Yes
    Male      26 to 30    1000-2000                            No
    Female    22 to 25    1000-2000                            No
    Male      22 to 25    1000-2000                            No
    Male      26 to 30    2000-5000                            Yes
    Male      26 to 30    2000-5000                            Yes
    Female    26 to 30    2000-5000                            No

    Code it suitably to use in SPSS.

    Go to binary logistic regression.

    Put online into the dependent variable box and the other three into the covariates box.

    Under Options, choose the three plots and continue.

    Basically, the regression develops an equation of the kind -

    ln(odds of shopping online) = b1 * age + b2 * gender + b3 * expenditure + constant (intercept)

    If no model is used, it just uses the intercept to explain the behavior. So the sig. value under

    coefficient for the model has to be

    Classification table (the cut value is .500)

                             Predicted Online
    Observed                 1      2      Percentage correct
    Step 1    Online 1       16     23     41.0
              Online 2       7      54     88.5
    Overall percentage                     70.0
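
    The percentages in the classification table can be checked by hand. Here is a small sketch, assuming the rows are the observed categories and the columns the predicted ones (which is what the percentages imply); the function name is made up.

```python
# counts[observed][predicted], using the coding 1 = no, 2 = yes.
counts = {1: {1: 16, 2: 23}, 2: {1: 7, 2: 54}}


def percentage_correct(table):
    """Percentage of correct predictions for each observed category and overall."""
    per_category = {}
    correct = 0
    total = 0
    for observed, predicted_counts in table.items():
        row_total = sum(predicted_counts.values())
        per_category[observed] = round(100 * predicted_counts[observed] / row_total, 1)
        correct += predicted_counts[observed]
        total += row_total
    return per_category, round(100 * correct / total, 1)


by_category, overall = percentage_correct(counts)
print(by_category)   # {1: 41.0, 2: 88.5}
print(overall)       # 70.0
```

    So the model is much better at spotting online shoppers (88.5%) than non-shoppers (41.0%), even though the overall figure is 70%.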

    Variables in the Equation

                            B        S.E.     Wald     df    Sig.    Exp(B)
    Step 1a   Gender        .767     .433     3.133    1     .077    2.153
              Age           -.328    .278     1.397    1     .237    .720
              Expenditure   .599     .263     5.173    1     .023    1.820
              Constant      -.817    1.211    .456     1     .500    .442

    a. Variable(s) entered on step 1: Gender, Age, Expenditure.

    Examine the standard errors for the b coefficients. A standard error larger than 2.0 indicates numerical problems. Analyses that indicate numerical problems should not be interpreted. None of the independent variables in this analysis had a standard error larger than 2.0.

    Also, from the level of significance, expenditure (p = 0.023) added value significantly to the model, unlike age (p = 0.237) and gender (p = 0.077). The variable with the lowest significance value has the highest contribution to the equation.

    We can observe that the logistic coefficient is highest for gender and least for age. Consider the Exp(B) column:

    A unit increase in gender multiplies the odds of online shopping by 2.153. Since male was coded as 1 and female as 0, the odds were more in favor of the males.

    A unit increase in age multiplies the odds of online shopping by 0.72, which means every unit increase in age decreases the odds of online shopping by 28%. In our case, a unit increase in age is an increase of about 5 years, as we have recorded age in intervals.

    A unit increase in expenditure multiplies the odds of online shopping by 1.82.
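
    The Exp(B) column is simply e raised to the power of B, and the fitted equation can be turned into a predicted probability with the logistic function. Here is a sketch using the B values from the table above; the codes used for the example respondent (gender 1, age band 2, expenditure band 3) are assumptions for illustration, since the exact coding of the bands is not given in this guide.

```python
import math

# B coefficients from the "Variables in the Equation" table.
COEFFICIENTS = {"Gender": 0.767, "Age": -0.328, "Expenditure": 0.599}
CONSTANT = -0.817


def odds_ratios(coefficients):
    """Exp(B): the factor by which the odds change for a one-unit increase."""
    return {name: math.exp(b) for name, b in coefficients.items()}


def predicted_probability(values):
    """Probability of shopping online for one respondent, given coded predictor values."""
    log_odds = CONSTANT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)
    return 1 / (1 + math.exp(-log_odds))


ratios = odds_ratios(COEFFICIENTS)
print({name: round(r, 3) for name, r in ratios.items()})   # matches the Exp(B) column

# Percentage change in the odds per unit increase in age: about -28%.
print(round((ratios["Age"] - 1) * 100))

# Hypothetical respondent; the codes are assumed, not taken from the survey.
respondent = {"Gender": 1, "Age": 2, "Expenditure": 3}
print(predicted_probability(respondent))
```

    With the cut value of .500, a respondent whose predicted probability is above 0.5 is classified as an online shopper.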

    Now, create/use your own data or get data from the net and try Binary Logistic regression.

    DISCRIMINANT ANALYSIS

    Explained in the PPT attached along with this document. 122 slides, but most of them are

    screenshot images and I guess only the first half of the PPT is required for us i.e. one example of

    discriminant analysis should be enough to understand.