Time-varying Coefficients in Brand Choice Modelling

Thomas Kneib

Department of Statistics
Ludwig-Maximilians-University Munich

joint work with

Bernhard Baumgartner
University of Regensburg

Winfried J. Steiner
Clausthal University of Technology

17.9.2008


Overview

• Brand Choice Modelling and the Multinomial Logit Model

• Semiparametric Multinomial Logit Models with Time-varying effects

• Application to Coffee Data.


Brand Choice Data

• When purchasing a product, the consumer is faced with a discrete set of alternatives (substitute goods).

• Aim of marketing research: Identify factors (covariates) guiding the decision.

• Two types of covariates:

– Global covariates: Independent of the brand, e.g. age or gender of the consumer.

– Product-specific covariates: Depending on the brand, e.g. loyalty to a brand, price, or advertising.


• In the following: Data on purchases of the coffee brands with largest market share.

• Some characteristics of the data set:

– Scanner panel data collected by the Gesellschaft für Konsumforschung (GfK).

– Five coffee brands with 53% market share in total.

– Data collected over one year.

– 7,439 households with 32,477 purchase acts in total.

– Divided into 16,238 observations for model estimation and 16,239 observations for model validation.


• Covariates:

– loyalty: loyalty of the consumer to a specific brand.

– reference price: internal reference price built through experience.

– loss: negative difference between reference price and price.

– gain: positive difference between reference price and price.

– promotional activity: dummy variables for the presence of special promotion.

• Loyalty and reference price are estimated based on an exponentially weighted average of former purchases.
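
The covariate construction can be illustrated with a minimal sketch. It assumes the common exponential-smoothing recursion x_t = lam * x_{t-1} + (1 - lam) * z_t; the weight `lam = 0.8`, the starting values, and all function names are hypothetical and not taken from the talk. Gain and loss are stored here as non-negative magnitudes, which is an assumption about the sign convention.

```python
def exp_smooth(previous, observed, lam=0.8):
    """One smoothing step: lam * previous + (1 - lam) * observed (lam is assumed)."""
    return lam * previous + (1.0 - lam) * observed


def update_loyalty(loyalty, bought_brand, lam=0.8):
    """Update all brand loyalties after one purchase.

    loyalty: dict mapping brand -> current loyalty value.
    The purchased brand is smoothed towards 1, all other brands towards 0.
    """
    return {
        brand: exp_smooth(value, 1.0 if brand == bought_brand else 0.0, lam)
        for brand, value in loyalty.items()
    }


def gain_loss(reference_price, price):
    """Split reference_price - price into non-negative (gain, loss) parts."""
    diff = reference_price - price
    return max(diff, 0.0), max(-diff, 0.0)


# Hypothetical household with two brands and equal starting loyalties.
loyalty = update_loyalty({"A": 0.5, "B": 0.5}, "A")   # household buys brand A
gain, loss = gain_loss(4.0, 3.5)                       # price below reference
reference = exp_smooth(4.0, 3.5)                       # reference price moves towards 3.5
```

With this recursion the loyalties of a household keep summing to one if they do so at the start, which is one reason the smoothing form is convenient.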


Multinomial Logit Models

• General idea of regression models describing brand choice: Latent utility associated with consumer $i$'s decision to buy brand $r$:

$$L_i^{(r)}, \quad r = 1, \dots, k.$$

• Note: We do not observe the utilities but only the brand choice decisions.

• Rational behaviour: The consumer chooses the product that maximises her/his utility:

$$Y_i = r \iff L_i^{(r)} = \max_{s=1,\dots,k} L_i^{(s)}.$$


• Express the utilities in terms of covariates and random error:

$$L_i^{(r)} = u_i'\alpha^{(r)} + w_i^{(r)\prime}\delta + \varepsilon_i^{(r)}.$$

• Ingredients of the model:

– $u_i'\alpha^{(r)}$: product-specific effects of global covariates $u_i$.

– $w_i^{(r)\prime}\delta$: global effects of product-specific covariates $w_i^{(r)}$.

– $\varepsilon_i^{(r)}$: random error.

• In our application:

$$L_i^{(r)} = \alpha^{(r)} + w_i^{(r)\prime}\delta + \varepsilon_i^{(r)}.$$


• If the error terms are i.i.d. standard extreme value distributed, the principle of maximum utility

$$Y_i = r \iff L_i^{(r)} = \max_{s=1,\dots,k} L_i^{(s)}$$

yields the multinomial logit model

$$\pi_i^{(r)} = P(Y_i = r) = \frac{\exp(\eta_i^{(r)})}{1 + \sum_{s=1}^{k-1} \exp(\eta_i^{(s)})}, \quad r = 1, \dots, k-1,$$

with

$$\eta_i^{(r)} = u_i'\alpha^{(r)} + (w_i^{(r)} - w_i^{(k)})'\delta = u_i'\alpha^{(r)} + \tilde{w}_i^{(r)\prime}\delta,$$

where $\tilde{w}_i^{(r)} = w_i^{(r)} - w_i^{(k)}$ denotes the contrast with the reference category $k$.

• Only contrasts of product-specific covariates can be included due to identifiability restrictions.

• The covariate effects act multiplicatively on the ratios of choice probabilities $\pi_i^{(r)} / \pi_i^{(k)}$.

• Positive effects cause an increase in the preference for brand $r$ as compared to the reference category $k$.
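
As an illustration of the formulas above, the following minimal sketch computes multinomial logit choice probabilities for one purchase occasion. The function names and all numbers (three brands, covariates price and promotion, coefficient values) are hypothetical. The reference brand has predictor zero, which gives the "1" in the denominator.

```python
import math


def linear_predictor(alpha_r, w_r, w_k, delta):
    """eta^(r) = alpha^(r) + (w^(r) - w^(k))' delta for one brand r."""
    return alpha_r + sum((a - b) * d for a, b, d in zip(w_r, w_k, delta))


def mnl_probabilities(eta):
    """Choice probabilities for k brands from the k-1 predictors in eta.

    Returns a list of k probabilities with the reference brand last.
    The common shift only guards against overflow; it cancels in the ratio.
    """
    shift = max(0.0, max(eta))
    weights = [math.exp(e - shift) for e in eta] + [math.exp(-shift)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: brand 3 is the reference category.
delta = [-1.0, 0.5]                                    # effects of price, promotion
w = {1: [2.0, 1.0], 2: [2.5, 0.0], 3: [2.2, 0.0]}      # brand-specific covariates
eta = [linear_predictor(0.3, w[1], w[3], delta),
       linear_predictor(-0.1, w[2], w[3], delta)]
pi = mnl_probabilities(eta)
```

In this sketch the ratio `pi[0] / pi[2]` equals `exp(eta[0])`, which is the multiplicative interpretation of the covariate effects stated above.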


• Purely parametric models neglect the temporal dimension of the data.

• Time-varying preferences and time-varying effects result, for example,

– if the economic conditions change.

– due to promotional activities.

– due to specific usage situations or special events (such as holidays).

– due to changes in the properties of the product.

⇒ Semiparametric extensions of the multinomial logit model.


Semiparametric Multinomial Logit Models

• Extend the linear model $L_{it}^{(r)} = \alpha^{(r)} + w_{it}^{(r)\prime}\delta + \varepsilon_{it}^{(r)}$ in two steps.

• Time-varying preferences: Replace the constant intercepts with time-varying functions.

$$L_{it}^{(r)} = f_0^{(r)}(t) + w_{it}^{(r)\prime}\delta + \varepsilon_{it}^{(r)}.$$

• Time-varying effects: Replace the regression coefficients with time-varying parameters.

$$L_{it}^{(r)} = f_0^{(r)}(t) + \sum_{j=1}^{J} w_{itj}^{(r)} f_j(t) + \varepsilon_{it}^{(r)}.$$

• The functional form of the functions $f_0^{(r)}(t)$ and $f_j(t)$ shall remain unspecified and will be estimated based on penalised splines.


• Spline smoothing: Approximate a function $f(t)$ in terms of a B-spline basis:

$$f(t) = \sum_{m=1}^{M} \beta_m B_m(t).$$

[Figure: three panels titled "B-spline basis", "Scaled B-splines" and "Resulting estimate"; horizontal axis from −3 to 3, vertical axis from −2 to 2.]
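
The basis expansion can be sketched in a few lines. The code below evaluates B-spline basis functions with the Cox-de Boor recursion on equidistant knots and forms f(t) as a weighted sum. The knot placement, the cubic degree, and the coefficient values are assumptions chosen for illustration, not settings from the talk.

```python
def bspline_basis(t, knots, degree):
    """Evaluate all B-spline basis functions of the given degree at t.

    Uses the Cox-de Boor recursion and assumes strictly increasing knots.
    Returns len(knots) - degree - 1 values.
    """
    # Degree 0: indicator functions of the half-open knot intervals.
    basis = [1.0 if knots[m] <= t < knots[m + 1] else 0.0
             for m in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        new = []
        for m in range(len(knots) - d - 1):
            left = (t - knots[m]) / (knots[m + d] - knots[m]) * basis[m]
            right = ((knots[m + d + 1] - t)
                     / (knots[m + d + 1] - knots[m + 1]) * basis[m + 1])
            new.append(left + right)
        basis = new
    return basis


def spline(t, beta, knots, degree=3):
    """f(t) = sum_m beta_m * B_m(t)."""
    return sum(b * B for b, B in zip(beta, bspline_basis(t, knots, degree)))


# Hypothetical setup: domain [0, 4], unit knot spacing, cubic B-splines.
degree = 3
knots = [float(x) for x in range(-degree, 4 + degree + 1)]   # -3, ..., 7
beta = [0.0, 0.5, 1.5, 1.0, -0.5, -1.0, 0.0]                 # M = 7 coefficients
values = [spline(t / 10.0, beta, knots) for t in range(0, 41)]
```

Extending the knots beyond the domain by `degree` intervals on each side ensures that the basis functions sum to one everywhere on [0, 4].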


• To regularise estimation, add a smoothness penalty to the likelihood.

• Integral penalties can be approximated with difference penalties:

$$\frac{1}{2\tau^2} \sum_{m=2}^{M} (\beta_m - \beta_{m-1})^2 \quad \text{(first order differences)}$$

$$\frac{1}{2\tau^2} \sum_{m=3}^{M} (\beta_m - 2\beta_{m-1} + \beta_{m-2})^2 \quad \text{(second order differences)}$$

• The smoothing parameter $\tau^2$ governs the trade-off between fidelity to the data ($\tau^2$ large) and smoothness of the function estimate ($\tau^2$ small).
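
A minimal sketch of the two difference penalties, written for a plain list of spline coefficients; the function names and the example coefficients are hypothetical.

```python
def differences(beta, order):
    """Apply the first-difference operator `order` times to the coefficients."""
    for _ in range(order):
        beta = [b - a for a, b in zip(beta, beta[1:])]
    return beta


def difference_penalty(beta, tau2, order=2):
    """Penalty 1 / (2 * tau2) * sum of squared order-th differences of beta."""
    return sum(d * d for d in differences(beta, order)) / (2.0 * tau2)


linear = [1.0, 2.0, 3.0, 4.0]                               # coefficients on a straight line
first_order = difference_penalty(linear, tau2=0.5, order=1)
second_order = difference_penalty(linear, tau2=0.5, order=2)
```

For coefficients on a straight line the second-order penalty vanishes while the first-order penalty does not, so the two penalties shrink towards a linear and a constant function respectively. Dividing by a smaller `tau2` scales the same roughness up, which is why small values enforce smoothness.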


Statistical Inference

• The model contains two types of parameters:

– Regression coefficients describing parametric or nonparametric effects and

– smoothing parameters.

• Penalised likelihood estimation for the regression coefficients based on a modified Fisher scoring algorithm.

• Estimation of smoothing parameters based on an approximate marginal likelihood.


Results

[Figure: four panels titled "brand 2", "brand 3", "brand 4" and "brand 5"; horizontal axis from 0 to 50.]


[Figure: three panels titled "Gain", "Loss" and "reference price"; horizontal axis from 0 to 50.]


[Figure: two panels titled "loyalty" and "promotional activity"; horizontal axis from 0 to 50.]


Model Validation & Proper Scoring Rules

• Is the increased model complexity in semiparametric regression models really required for our data?

• Model validation based on the prediction performance.

• What are suitable measures of predictive performance? What is a prediction?

• We consider predictive distributions

$$\pi = (\pi^{(1)}, \dots, \pi^{(k)})$$

based on the estimated choice probabilities

$$\pi^{(r)} = P(Y = r).$$

• A scoring rule is a real-valued function $S(\pi, r)$ that assigns a value to the event that category $r$ is observed when $\pi$ is the predictive distribution.


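• Score: Sum over individuals in a validation data set

$$S = \sum_{i=1}^{n} S(\pi_i, r_i).$$

• Let $\pi_0$ denote the true distribution. With scores oriented so that larger values are better, a scoring rule is called

– Proper if $E_{\pi_0}(S(\pi_0, \cdot)) \ge E_{\pi_0}(S(\pi, \cdot))$ for all $\pi$.

– Strictly proper if equality holds only if $\pi = \pi_0$.

• Some common examples:

– Hit rate (proper but not strictly proper):

$$S(\pi, r_i) = \begin{cases} \frac{1}{n} & \text{if } \pi^{(r_i)} = \max\{\pi^{(1)}, \dots, \pi^{(k)}\}, \\ 0 & \text{otherwise.} \end{cases}$$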


– Logarithmic score (strictly proper):

$$S(\pi, r_i) = \log(\pi^{(r_i)}).$$

– Brier score (strictly proper):

$$S(\pi, r_i) = -\sum_{r=1}^{k} \left(1(r_i = r) - \pi^{(r)}\right)^2.$$

– Spherical score (strictly proper):

$$S(\pi, r_i) = \frac{\pi^{(r_i)}}{\sqrt{\sum_{r=1}^{k} (\pi^{(r)})^2}}.$$
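
The scoring rules above can be sketched as plain functions of a predictive distribution and an observed category. Categories are indexed from 0 here, the hit indicator is summed and divided by n afterwards rather than carrying the factor 1/n, and the example predictions are hypothetical.

```python
import math


def hit(pi, r):
    """1 if the observed category r has the largest predicted probability."""
    return 1.0 if pi[r] == max(pi) else 0.0


def log_score(pi, r):
    return math.log(pi[r])


def brier_score(pi, r):
    return -sum(((1.0 if s == r else 0.0) - p) ** 2 for s, p in enumerate(pi))


def spherical_score(pi, r):
    return pi[r] / math.sqrt(sum(p * p for p in pi))


def total_score(rule, predictions, observed):
    """Sum of S(pi_i, r_i) over all observations in the validation set."""
    return sum(rule(pi, r) for pi, r in zip(predictions, observed))


def expected_score(rule, pi0, pi):
    """Expected score of forecast pi when categories are drawn from pi0."""
    return sum(p0 * rule(pi, r) for r, p0 in enumerate(pi0))


# Hypothetical validation set with two purchases and three brands.
predictions = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2]]
observed = [0, 2]
hit_rate = total_score(hit, predictions, observed) / len(observed)
brier = total_score(brier_score, predictions, observed)
```

`expected_score` allows a numerical check of propriety: for the logarithmic, Brier and spherical scores the expected score under a true distribution `pi0` is largest when the forecast equals `pi0`.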


• In our data sets:

                            parametric    time-varying preferences    time-varying effects
hit rate (estimation)           0.6956                      0.7011                  0.7043
hit rate (prediction)           0.6967                      0.7007                  0.7020
logarithmic (estimation)   -13816.1104                 -13591.8655             -13498.9401
logarithmic (prediction)   -13953.8974                 -13844.1197             -13801.7610
Brier (estimation)          -6913.6315                  -6807.7259              -6763.6767
Brier (prediction)          -6930.8855                  -6874.8116              -6859.8093
spherical (estimation)      12101.5453                  12174.2805              12199.5373
spherical (prediction)      12093.4520                  12132.6391              12139.2852

• The inclusion of time-varying preferences and effects results in an increased prediction performance.

• This is also reflected in the prediction of market shares.


[Figure: market share (brand 1); horizontal axis from 5 to 30, vertical axis from 0.2 to 0.45. Solid line: true market share; dashed line: time-varying effects; dotted line: time-constant effects.]


Software

• Implemented in the free software package BayesX.

• Stand-alone software for additive and geoadditive regression.

• Supports univariate exponential families, categorical regression and time-continuous duration time models.

• Available from

http://www.stat.uni-muenchen.de/~bayesx


Summary

• Semiparametric extensions of the multinomial logit model.

• Fully automatic estimation of all model parameters (incl. the smoothing parameters).

• Model validation based on scoring rules.

• References:

– Kneib, T., Baumgartner, B. & Steiner, W. J. (2007). Semiparametric Multinomial Logit Models for Analysing Consumer Choice Behaviour. AStA Advances in Statistical Analysis, 91, 225–244.

– Kneib, T., Baumgartner, B. & Steiner, W. J. (2008). Time-Varying Coefficients in Brand Choice Models (in preparation).

• A place called home:

http://www.stat.uni-muenchen.de/~kneib
