Data sets from the music broadcasting industry | Multidimensional panel data | An exponential random model | Estimation method
Bayesian estimation of complex networks and dynamic choice in the music industry
Stefano Nasini, Víctor Martínez-de-Albéniz
Dept. of Production, Technology and Operations Management, IESE Business School, University of Navarra,
Barcelona, Spain
Stefano Nasini, Víctor Martínez-de-Albéniz ENBIS-Spring-meeting-2015
Outline
1 Data sets from the music broadcasting industry
2 Multidimensional panel data
3 An exponential random model
   Multidimensional Gaussian reduction
   The exponential family of distributions
4 Estimation method
   Numerical results
   Goodness of fit
Artist goods: the music broadcasting industry
Artist goods
Their life cycles resemble clothing fashion trends, with a time window in which popularity increases shortly after the premiere and then decreases.
This is due to network externalities in individual preferences and opinions.
Artist goods: the music broadcasting industry
A data set of songs played on TV channels and radio stations

                        Germany     UK
Broadcasting companies  41          51
Artists                 13860       16169
Songs                   48785       65531
Time periods            163 weeks   163 weeks
Artist goods: the music broadcasting industry
A song's popularity increases after its premiere and then decreases
(a) B. Mars, Just the way you are, in Germany.
(b) B. Mars, Locked Out Of Heaven, in Germany.
(c) B. Mars, Just the way you are, in the UK.
(d) B. Mars, Locked Out Of Heaven, in the UK.
Artist goods: the music broadcasting industry
Correlated choices from different broadcasting companies

                     BBC 1 Xtra  Capital FM  Kiss 100 FM  Metro Radio  Radio City  Smooth Radio London
BBC 1 Xtra           1.000       0.729       0.668        0.686        0.010       –
Capital FM                       1.000       0.814        0.830        -0.135      –
Kiss 100 FM                                  1.000        0.829        -0.142      –
Metro Radio                                               1.000        0.078       –
Radio City                                                             1.000       –
Smooth Radio London                                                                1.000
Table: Spearman's correlations among the dynamic plays of Locked Out Of Heaven.

                     BBC 1 Xtra  Capital FM  Kiss 100 FM  Metro Radio  Radio City  Smooth Radio London
BBC 1 Xtra           1.000       0.508       –            0.329        0.001       -0.076
Capital FM                       1.000       –            0.417        -0.128      -0.091
Kiss 100 FM                                  1.000        –            –           –
Metro Radio                                               1.000        -0.268      -0.222
Radio City                                                             1.000       0.495
Smooth Radio London                                                                1.000
Table: Spearman's correlations among the dynamic plays of Just the way you are.
Artist goods: the music broadcasting industry
Our goal is to have a joint model which allows:
- Predicting the common life cycle of song diffusion within the music broadcasting industry.
- Detecting the structure of imitation and spillover between radio stations and TV channels, based on the observed correlations.
- Deciding which broadcaster is the best one on which to launch a song, in order to maximize the future number of plays.
Multidimensional panel data as two-mode network
Notation
$R$ := set of individuals (primary layer); $S$ := set of items (secondary layer); $T$ := set of time periods;
$x_{st} = [x_{s1t}\ x_{s2t}\ \dots\ x_{s|R|t}]^T \in \chi$ is the $|R|$-dimensional connection profile of the $s$th item at time $t$;
$E \subseteq R \times R$ := a set of connections between broadcasting companies.
Multidimensional panel data as two-mode network
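Spillover measurements to internalize cross-section dependency in the panel
(i) $$G_{hk}(x_{st}; x_{s,t-1}, \dots, x_{s,t-\tau}) = \frac{1}{|E|\tau} \sum_{\ell=0}^{\tau} d_\ell \left( \frac{x_{sht}}{u_h} \, \frac{x_{sk(t-\ell)}}{u_k} \right)^{1/p};$$
(ii) $$G_{hk}(x_{st}; x_{s,t-1}, \dots, x_{s,t-\tau}) = -\frac{1}{|E|\tau} \sum_{\ell=0}^{\tau} d_\ell \left| \frac{x_{sht}}{u_h} - \frac{x_{sk(t-\ell)}}{u_k} \right|^{2/p}.$$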
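The two spillover measurements can be sketched in a few lines. This is only an illustration under stated assumptions: `u_h` and `u_k` are read as normalizing divisors in both variants, the common factor 1/(|E|τ) is left out, and `x_k_lags[l]` is assumed to hold the plays of company k at lag l = 0, ..., τ. All function and argument names are hypothetical.

```python
def spillover_product(x_h, x_k_lags, d, u_h=1.0, u_k=1.0, p=2.0):
    """Variant (i): sum over lags of d[l] * ((x_h/u_h) * (x_k[t-l]/u_k)) ** (1/p)."""
    return sum(
        w * ((x_h / u_h) * (x_k / u_k)) ** (1.0 / p)
        for w, x_k in zip(d, x_k_lags)
    )


def spillover_distance(x_h, x_k_lags, d, u_h=1.0, u_k=1.0, p=2.0):
    """Variant (ii): minus the sum over lags of d[l] * |x_h/u_h - x_k[t-l]/u_k| ** (2/p)."""
    return -sum(
        w * abs(x_h / u_h - x_k / u_k) ** (2.0 / p)
        for w, x_k in zip(d, x_k_lags)
    )


# Hypothetical values: company h played the song 9 times this week; company k
# played it 4 times this week and 16 times last week; lag weights 1.0 and 0.5.
g_product = spillover_product(9, [4, 16], [1.0, 0.5])
g_distance = spillover_distance(9, [4, 16], [1.0, 0.5])
```

Variant (i) rewards pairs of companies that both play a song heavily, while variant (ii) penalizes differences between their (normalized) play counts.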
An exponential random model
$$P(x_{st} \mid x_{s,t-1}, \dots, x_{s,t-\tau}) \propto h(x_{st}) \exp\left( \alpha_{st} S_s + \sum_{r \in R} \beta_r R_r + \sum_{(h,k) \in E} \gamma_{hk} G_{hk} \right)$$
- $S_s$ accounts for the size effect of each item in the secondary layer, for $s \in S$;
- $R_r$ accounts for the size effect of each individual in the primary layer, for $r \in R$;
- $G_{hk}$ internalizes the one-mode projection into the primary layer, for $(h, k) \in E$.
Underlying measure: either $h(x_{st}) = \frac{1}{\prod_{r \in R} x_{srt}!}$ or $h(x_{st}) = (2\pi)^{-\frac{(\tau+1)|R|}{2}}$.
An exponential random model
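The spillover measurement $G_{hk}$ plays an important role.
$$P(x_{srt} \mid x_{sr't'} \text{ such that } r' \neq r,\ t' < t) \propto \frac{1}{x_{srt}!} \exp\left( \begin{bmatrix} \alpha_{st} + \beta_r \\ \eta \end{bmatrix}^T \begin{bmatrix} x_{srt} \\ C(x_{srt}) \end{bmatrix} \right),$$
where, for (i),
$$\eta = \frac{1}{\tau |E|} \sum_{k \in R} \gamma_{rk} \sum_{\ell=1}^{\tau} (x_{sk(t-\ell)})^{1/p} \quad \text{and} \quad C(x_{srt}) = (x_{srt})^{1/p},$$
and, for (ii),
$$\eta = \frac{1}{\tau |E|} \begin{bmatrix} \gamma_{r1} \\ \vdots \\ \gamma_{rn} \end{bmatrix} \quad \text{and} \quad C(x_{srt}) = \begin{bmatrix} \sum_{\ell=1}^{\tau} d_\ell \left| \frac{x_{srt}}{u_r} - \frac{x_{s1(t-\ell)}}{u_1} \right|^{2/p} \\ \vdots \\ \sum_{\ell=1}^{\tau} d_\ell \left| \frac{x_{srt}}{u_r} - \frac{x_{sn(t-\ell)}}{u_n} \right|^{2/p} \end{bmatrix}.$$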
An exponential random model
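Figure: bivariate densities for each spillover measurement, plotted for α = 1, γ = 1 and for α = −1, γ = 1.
Spillover measurement (first row): $\frac{1}{x!\,y!} \exp\left(\alpha(x + y) + \gamma (xy)^{1/2}\right)$
Spillover measurement (second row): $\frac{1}{x!\,y!} \exp\left(\alpha(x + y) + \gamma |x - y|\right)$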
Multidimensional Gaussian reduction
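Under special conditions:
$$P(x_{st} \mid x_{s,t-1}, \dots, x_{s,t-\tau}) \propto h(x_{st}) \exp\left( \alpha_{st} S_s + \sum_{r \in R} \beta_r R_r + \sum_{(h,k) \in E} \gamma_{hk} G_{hk} \right)$$
- $G_{hk}(x_{st}; x_{s,t-1}, \dots, x_{s,t-\tau}) = \sum_{\ell=0}^{\tau} d_\ell \left( x_{sht} \, x_{sk(t-\ell)} \right)$;
- $h(x_{st}) = (2\pi)^{-\frac{(\tau+1)|R|}{2}}$;
$$\begin{bmatrix} X_{st} \\ \vdots \\ X_{s,t-\tau} \end{bmatrix} \sim N(\mu, \Sigma), \quad \text{where} \quad \mu = \Sigma \begin{bmatrix} \alpha_{st}\, e + \beta \\ \vdots \\ \alpha_{s,t-\tau}\, e + \beta \end{bmatrix} \quad \text{and} \quad \Sigma = -\frac{1}{2} \begin{bmatrix} d_0 \Gamma & \dots & d_\tau \Gamma \\ \vdots & \ddots & \vdots \\ d_\tau \Gamma & \dots & d_0 \Gamma \end{bmatrix}^{-1}.$$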
Why is our model an extension of the ERGM?
Exponential Family
Whenever the density of a random variable may be written $f(x) \propto h(x) \exp\{\theta^T C(x)\}$, the family of all such random variables (for all possible $\theta$) is called an exponential family.
Exponential Random Graph Model (ERGM)
$$P_\theta(X = x) = \frac{\exp\{\theta^T C(x)\}}{Z(\theta)},$$
where
X is a random network on n nodes (a matrix of 0's and 1's);
θ is a vector of parameters;
C(x) is a known vector of graph statistics on x.
Why it is difficult to find the MLE
The log-likelihood function
- The model: $P(X = x^{(0)} \mid \theta) = \frac{\exp\{\theta^T C(x^{(0)})\}}{Z(\theta)}$, where $x^{(0)}$ is the observed data set.
- The log-likelihood function is
$$\ell(\theta) = \theta^T C(x^{(0)}) - \log Z(\theta) = \theta^T C(x^{(0)}) - \log\left( \sum_{\text{all possible } x} \exp\{\theta^T C(x)\} \right)$$
- Even in the simplest case of undirected graphs without self-edges, the number of graphs in the sum is very large: $2^{n(n-1)/2}$ for n nodes.
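To give a feeling for the size of that sum, here is a minimal sketch that evaluates log Z(θ) by brute-force enumeration. It assumes the simplest possible statistic, C(x) = number of edges (chosen here only because a closed form is then available for checking), and is feasible only for very small n.

```python
import itertools
import math


def num_undirected_graphs(n):
    """Number of simple undirected graphs on n labelled nodes."""
    return 2 ** (n * (n - 1) // 2)


def log_partition_bruteforce(theta, n):
    """log Z(theta) for the edge-count model, by enumerating every graph."""
    n_pairs = n * (n - 1) // 2
    total = 0.0
    for edges in itertools.product((0, 1), repeat=n_pairs):
        total += math.exp(theta * sum(edges))  # C(x) = number of edges
    return math.log(total)


# Four nodes already give 64 graphs; ten nodes give 2**45 of them.
log_z_small = log_partition_bruteforce(0.3, 4)
graphs_on_ten_nodes = num_undirected_graphs(10)
```

For the edge-count model the result can be compared with the closed form n(n−1)/2 · log(1 + e^θ); for statistics with dependence between edges no such shortcut exists, which is why the MLE is hard to compute.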
Maximum Pseudo-likelihood
Let $x_w$ be a single component of $x$ and $x_{-w}$ the vector of all the remaining components.
The pseudo-likelihood function
Approximate the marginal $P(x_w \mid \theta)$ by the conditional $P(x_w \mid x_{-w}; \theta)$.
Then $\tilde{\ell}(\theta) = \prod_w P(x_w \mid x_{-w}; \theta)$.
Result: the maximum pseudo-likelihood (MPL) estimate.
Unfortunately, little is known about the quality of MPL estimates.
Pseudo-likelihood for ERGM
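Notation: for a network x and a pair (i, j) of nodes, $x_{ij}$ is the corresponding entry of x and $x_{-ij}$ collects all the remaining entries.
$$\tilde{\ell}(\theta) = \prod_w P(x_w \mid x_{-w}; \theta) = \prod_{(i,j)} \frac{\exp\{\theta^T C(x^{(0)})\}}{\exp\{\theta^T C(x_{ij} = 1, x_{-ij})\} + \exp\{\theta^T C(x_{ij} = 0, x_{-ij})\}}$$
$$= \frac{\exp\{n(n-1)\, \theta^T C(x^{(0)})\}}{\prod_{(i,j)} \left( \exp\{\theta^T C(x_{ij} = 1, x_{-ij})\} + \exp\{\theta^T C(x_{ij} = 0, x_{-ij})\} \right)}$$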
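The ERGM pseudo-likelihood is cheap to evaluate because each factor depends only on the change in C(x) when one entry is switched. The sketch below assumes an undirected graph (so the product runs over unordered pairs rather than the n(n−1) ordered pairs of the slide) and two illustrative statistics, edge count and triangle count; both choices are assumptions made for the example.

```python
import itertools
import math


def change_statistics(adj, i, j):
    """Change in (edge count, triangle count) when edge {i, j} is switched on."""
    shared = sum(
        1 for k in range(len(adj))
        if k != i and k != j and adj[i][k] and adj[k][j]
    )
    return (1.0, float(shared))


def log_pseudolikelihood(theta, adj):
    """Sum over node pairs of log P(x_ij | all other entries; theta)."""
    total = 0.0
    for i, j in itertools.combinations(range(len(adj)), 2):
        delta = change_statistics(adj, i, j)
        logit = sum(t * d for t, d in zip(theta, delta))
        # Each conditional is logistic with log-odds theta^T delta.
        if adj[i][j]:
            total += -math.log1p(math.exp(-logit))
        else:
            total += -math.log1p(math.exp(logit))
    return total


# A triangle on three nodes, as a symmetric 0/1 adjacency matrix.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
value = log_pseudolikelihood((0.5, 0.2), triangle)
```

Maximizing this function over θ is an ordinary logistic regression on the change statistics, which is what makes MPL attractive for binary networks.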
Pseudo-likelihood for our model
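$$\tilde{\ell}(\theta) = \prod_{(r,t)} P(x_{srt} \mid x_{sr't'} \text{ such that } r' \neq r,\ t' < t) \propto \prod_{(r,t)} \frac{1}{x_{srt}!} \exp\left( \begin{bmatrix} \alpha_{st} + \beta_r \\ \eta \end{bmatrix}^T \begin{bmatrix} x_{srt} \\ C(x_{srt}) \end{bmatrix} \right)$$
What is the normalizing constant for the full conditional?
$$Z(\alpha_{st}, \beta_r, \eta) = \sum_{x_{srt} \geq 0} \frac{1}{x_{srt}!} \exp\left( \begin{bmatrix} \alpha_{st} + \beta_r \\ \eta \end{bmatrix}^T \begin{bmatrix} x_{srt} \\ C(x_{srt}) \end{bmatrix} \right)$$
Even the pseudo-likelihood is hard to define for our model.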
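The difficulty is that this constant is an infinite series with no closed form in general. One way to see what is involved is to truncate the series numerically. The sketch below assumes the scalar case C(x) = x^{1/p}, writes `a` for α_st + β_r, and uses an arbitrary truncation point `x_max`; none of this is claimed to be the authors' procedure.

```python
import math


def log_normalizer(a, eta, p=2.0, x_max=500):
    """Approximate log of sum_{x=0}^{x_max} exp(a*x + eta*x**(1/p)) / x!."""
    log_terms = [
        a * x + eta * x ** (1.0 / p) - math.lgamma(x + 1.0)
        for x in range(x_max + 1)
    ]
    top = max(log_terms)
    # log-sum-exp keeps the sum stable when individual terms are huge or tiny.
    return top + math.log(sum(math.exp(v - top) for v in log_terms))


# Without spillover (eta = 0) the series is that of a Poisson model,
# so log Z must equal exp(a).
check = log_normalizer(1.0, 0.0)
with_spillover = log_normalizer(1.0, 0.4)
```

Such a truncation would have to be repeated for every (r, t) pair and every candidate θ, which illustrates why an estimation method that avoids Z altogether is preferable.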
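Bayesian posterior
Let $\theta = [\alpha_{1t}, \dots, \alpha_{|S|t}, \beta_1, \dots, \beta_{|R|}, \gamma_{11}, \dots, \gamma_{|R|,|R|}]^T$ be the vector of natural parameters, $\pi(\theta)$ a prior distribution and $x^{(0)}$ the observed data set. By applying Bayes' rule we have:
$$P(\theta \mid x^{(0)}) = \frac{P(x^{(0)} \mid \theta)\, \pi(\theta)}{\int_\theta P(x^{(0)} \mid \theta)\, \pi(\theta)\, d\theta}$$
$$= \frac{P(x_1, \dots, x_\tau; \theta) \prod_{t=\tau+1}^{w} P(x_t \mid x_{t-1}, \dots, x_{t-\tau}; \theta)\, \pi(\theta)}{\int_\theta P(x_1, \dots, x_\tau; \theta) \prod_{t=\tau+1}^{w} P(x_t \mid x_{t-1}, \dots, x_{t-\tau}; \theta)\, \pi(\theta)\, d\theta}$$
$$= \frac{\dfrac{P(x_1, \dots, x_\tau; \theta)\, \pi(\theta)}{Z(\theta)} \prod_{t=\tau+1}^{w} \prod_{s=1}^{m} q_{s,t,\theta}(x_{st})}{\int_\theta \dfrac{P(x_1, \dots, x_\tau; \theta)\, \pi(\theta)}{Z(\theta)} \prod_{t=\tau+1}^{w} \prod_{s=1}^{m} q_{s,t,\theta}(x_{st})\, d\theta}$$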
Metropolis-Hastings
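Since both $P(x^{(0)} \mid \theta)$ and $P(\theta \mid x^{(0)})$ are known only up to a normalizing constant, most standard MCMC algorithms for θ cannot be applied directly. Consider, for instance, the Metropolis-Hastings acceptance probability:
$$\pi_{\text{accept}}(\theta, \theta') = \min\left\{ 1,\ \frac{P(x^{(0)} \mid \theta')\, \pi(\theta')}{P(x^{(0)} \mid \theta)\, \pi(\theta)} \times \frac{Q(\theta \mid \theta')}{Q(\theta' \mid \theta)} \right\}$$
$$= \min\left\{ 1,\ \frac{P(x_1, \dots, x_\tau; \theta') \prod_{t=\tau+1}^{w} \prod_{s=1}^{m} q_{s,t,\theta'}(x_{st})\, \pi(\theta')}{P(x_1, \dots, x_\tau; \theta) \prod_{t=\tau+1}^{w} \prod_{s=1}^{m} q_{s,t,\theta}(x_{st})\, \pi(\theta)} \times \frac{Z(\theta)\, Q(\theta \mid \theta')}{Z(\theta')\, Q(\theta' \mid \theta)} \right\}$$
where $Q(\theta' \mid \theta)$ is the proposal distribution. The ratio $Z(\theta)/Z(\theta')$ does not cancel and cannot be evaluated.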
Specialized MCMC for doubly intractable distributions
Murray proposed an MCMC approach that largely overcomes this drawback. It is based on simulating from the joint distribution of the parameter and sample spaces, conditional on the observed data set $x^{(0)}$, that is to say $P(x, \theta \mid x^{(0)})$.
Algorithm 1 Exchange algorithm of Murray.
1: Initialize θ
2: repeat
3:   Draw θ′ from an arbitrary proposal distribution;
4:   Draw x′ from P(· | θ′)
5:   Accept θ′ with probability $\min\left\{ 1,\ \frac{P(x' \mid \theta)\, P(x^{(0)} \mid \theta')\, \pi(\theta')}{P(x^{(0)} \mid \theta)\, P(x' \mid \theta')\, \pi(\theta)} \right\}$
6:   Update θ
7: until convergence
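The mechanics of the exchange algorithm can be illustrated on a toy model where exact simulation is easy. The sketch below is not the model of this talk: it assumes i.i.d. counts with unnormalized density q(x | θ) = exp(θx)/x! (a Poisson model whose constant Z(θ) is deliberately never used), a symmetric Gaussian random-walk proposal, and a N(0, 10²) prior. All names and tuning values are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for small rates."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def log_prior(theta, sd=10.0):
    """Log density of a N(0, sd^2) prior, up to a constant."""
    return -0.5 * (theta / sd) ** 2


def exchange_sampler(data, n_iter=5000, step=0.1, seed=0):
    """Exchange algorithm for q(x | theta) = exp(theta * x) / x!."""
    rng = random.Random(seed)
    n, total_obs = len(data), sum(data)
    theta, chain = 0.0, []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)  # symmetric proposal
        # Auxiliary data set drawn exactly from the model at the proposal.
        total_aux = sum(
            sample_poisson(math.exp(proposal), rng) for _ in range(n)
        )
        # log of q(x'|theta) q(x0|theta') / (q(x0|theta) q(x'|theta'));
        # the 1/x! factors cancel and Z(theta), Z(theta') never appear.
        log_ratio = (proposal - theta) * (total_obs - total_aux)
        log_ratio += log_prior(proposal) - log_prior(theta)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        chain.append(theta)
    return chain
```

With counts averaging 5, the chain settles around log 5 after a short burn-in. In the talk's model, step 4 (drawing x′) is the expensive part, since it requires simulating a full panel at each proposed θ′.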
Goodness of fit: graphical illustration
Figure: total number of plays over time for the top-30 songs.
(a) Full model. (b) Null model (γ = 0).
Goodness of fit: graphical illustration
Figure: total number of plays over time for the top-30 songs.
(a) Total plays over time. (b) Market share.
Reducing the dimensionality of the parameter space
Model specification based on structural properties of the music industry
The parameter space is the whole (|T| × |S| + |R| + |E|)-dimensional Euclidean space, while the sample space has dimension (|T| × |S| × |R|).
We use two strategies to reduce the dimensionality of the parameter space:
A. Define communities of broadcasting companies to consider only within-group spillover effects γ;
B. Define a functional form for the effect of the song life cycle α.
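A quick count shows why the reduction is needed. The sketch below plugs in the UK sizes reported in the data description (163 weeks, 65531 songs, 51 broadcasting companies). Taking |E| = |R|² (all ordered pairs) and three life-cycle coefficients per song are assumptions made only for this illustration.

```python
def parameter_dimension(n_periods, n_songs, n_companies, n_links):
    """|T| x |S| + |R| + |E|: alphas, betas and gammas of the full model."""
    return n_periods * n_songs + n_companies + n_links


def sample_dimension(n_periods, n_songs, n_companies):
    """|T| x |S| x |R|: one observation per (week, song, company)."""
    return n_periods * n_songs * n_companies


def reduced_alpha_dimension(n_songs, n_coefficients=3):
    """Alphas left when each song has a few life-cycle coefficients."""
    return n_songs * n_coefficients


# UK sizes from the data description; |E| = |R|^2 is an assumption.
uk = dict(n_periods=163, n_songs=65531, n_companies=51)
full_parameters = parameter_dimension(n_links=51 * 51, **uk)
observations = sample_dimension(**uk)
alphas_after_reduction = reduced_alpha_dimension(uk["n_songs"])
```

Under these assumptions the α block shrinks from |T| × |S| values to a small multiple of |S|, which is the effect of strategy B.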
Reducing the dimensionality of the parameter space
A. Reducing the |E| effects γ
Pairwise spillover effects $\gamma_{kh}$ between individual companies h and k with the same radio format.
A common spillover effect $\gamma_{kh}$ between different radio formats, if h and k have different formats.
B. Reducing the |T| × |S| effects α
The broadcasting pattern of songs exhibits a time window in which their popularity increases quickly shortly after their premiere and then decreases.
Groups of broadcasting companies
Figure: within-format and between-format groups. TV channels form one group; radio stations are grouped as Contemporary and Easy listening, Top 40 and Urban, and Rock music.
We introduce only the effects γ associated with TV channels and radio stations of the same format.
The estimated spillover effects
                  Contemporary     Rock             News             Sport            Top-40           World-Music
Contemporary      –                (−0.089,0.004)   (0.012,0.021)    (−0.028,0.014)   (−0.164,0.012)   (−0.030,0.019)
Rock              (−0.035,−0.021)  –                (−0.049,0.037)   (−0.018,0.001)   (−0.032,0.001)   (−0.015,−0.021)
News              (−0.023,0.047)   (−0.072,−0.010)  –                (−0.035,0.008)   (−0.005,−0.024)  (0.009,0.030)
Sport             (−0.009,0.076)   (−0.036,−0.001)  (−0.068,0.030)   –                (−0.015,0.013)   (−0.029,0.001)
Top-40            (−0.070,0.001)   (−0.083,0.022)   (−0.052,0.000)   (−0.038,0.022)   –                (−0.025,0.019)
World-Music       (−0.017,0.014)   (−0.029,0.036)   (−0.022,0.005)   (−0.017,0.011)   (−0.014,0.024)   –
TV channels column: (−0.186,−0.068)
TV channels row: (−0.291,−0.038)

                  BBC 1 Xtra       Capital FM       Kiss 100 FM      Metro Radio      Radio City       Smooth R. London
BBC 1 Xtra        –                (−0.009,0.060)   (−0.104,0.057)   (−0.015,0.012)   (0.005,0.024)    (−0.015,0.012)
Capital FM        (−0.015,0.051)   –                (−0.060,0.001)   (−0.009,0.025)   (0.000,0.025)    (−0.013,0.019)
Kiss 100 FM       (−0.020,0.124)   (−0.028,0.025)   –                (−0.009,0.025)   (−0.032,0.021)   (0.001,0.029)
Metro Radio       (−0.008,0.094)   (−0.009,0.012)   (−0.027,0.026)   –                (−0.014,0.037)   (0.000,0.055)
Radio City        (−0.019,0.110)   (−0.040,0.012)   (−0.015,0.022)   (−0.021,0.009)   –                (0.010,0.033)
Smooth R. London  (−0.033,0.011)   (−0.021,0.014)   (−0.022,0.016)   (−0.032,0.023)   (−0.022,0.001)   –
Songs’ dynamics
Define a functional form for the effect of song dynamics
The attractiveness trajectory of the sth song can be specified by letting $t_0$ be the week when the song is launched and then using a gamma kernel to shape its time dynamics:
$$\alpha_{st} = \begin{cases} \delta^0_s + \delta^1_s (t - t_0) + \delta^2_s \log(t - t_0) & \text{if } t > t_0 \\ -\infty & \text{otherwise} \end{cases}$$
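The shape implied by this functional form is easy to explore numerically. In the minimal sketch below the coefficient values are hypothetical; the peak formula follows from setting the derivative δ¹ + δ²/(t − t0) to zero and applies only when δ¹ < 0 < δ².

```python
import math


def alpha(t, t0, delta0, delta1, delta2):
    """Log-attractiveness alpha_st of a song launched in week t0."""
    if t <= t0:
        return -math.inf  # the song cannot be played before its launch
    age = t - t0
    return delta0 + delta1 * age + delta2 * math.log(age)


def peak_age(delta1, delta2):
    """Age in weeks at which alpha peaks, valid when delta1 < 0 < delta2."""
    return -delta2 / delta1


# Hypothetical coefficients: fast rise (delta2 > 0), slow decay (delta1 < 0).
trajectory = [alpha(t, 5, 1.0, -0.5, 2.0) for t in range(4, 16)]
weeks_to_peak = peak_age(-0.5, 2.0)
```

Since exp(α_st) is proportional to (t − t0)^{δ²} exp(δ¹(t − t0)), the curve has the rise-then-decay shape of a gamma density, matching the life cycle described for artist goods.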
Songs' life cycle
Figure: common life cycle of the top-30 songs.
Propagation of the broadcasting decision after the premiere week $t_0$.
$$\max \ \frac{1}{|S|} \sum_{s \in S} \sum_{t'=1}^{T} E\left[ x_{s,\bullet,t+t'} \mid x_{srt} = z_r \text{ for all } r \in R \right],$$
subject to
$$\sum_{r \in R} y_r = 1,$$
$$z_r \leq \min\{M y_r, \phi\}, \quad r \in R,$$
$$y_r \in \{0, 1\}, \quad z_r \geq 0, \quad r \in R, \qquad F \geq \phi \geq 0.$$

                            Expected plays in t0 + 1   Expected plays in t0 + 2
Format        Eigenvector   φ = 10      φ = 100        φ = 10      φ = 100
Contemporary  0.098         265.795     267.647        265.949     267.720
Rock          0.121         265.209     261.803        265.687     261.381
News          0.098         265.609     264.058        265.995     263.211
Sport         0.177         260.301     263.021        260.875     263.055
Top-40        0.097         264.272     265.318        264.879     265.098
World Music   0.187         267.345     266.350        266.858     266.603
TV-channels   0.101         264.165     263.425        264.171     263.438
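Because exactly one y_r equals 1, the decision problem can be solved by enumerating the candidate launch points. The sketch below assumes that the objective does not decrease in z_r, so the upper bound min{M, φ} is used, and it reads the table as giving the expected plays in week t0 + 1 when the song is launched in each format. The function `best_launch` and its arguments are hypothetical names.

```python
def best_launch(expected_plays, companies, phi, big_m):
    """Enumerate the one-hot choices y_r; return the best (company, z_r, value)."""
    best = None
    for r in companies:
        z_r = min(big_m, phi)  # largest launch intensity allowed when y_r = 1
        value = expected_plays(r, z_r)
        if best is None or value > best[2]:
            best = (r, z_r, value)
    return best


# Expected plays in week t0 + 1 as reported in the table: (phi = 10, phi = 100).
reported = {
    "Contemporary": (265.795, 267.647),
    "Rock": (265.209, 261.803),
    "News": (265.609, 264.058),
    "Sport": (260.301, 263.021),
    "Top-40": (264.272, 265.318),
    "World Music": (267.345, 266.350),
    "TV-channels": (264.165, 263.425),
}


def lookup(r, z_r):
    """Read the reported value for launch intensity 10 or 100."""
    return reported[r][0] if z_r == 10 else reported[r][1]


choice_low = best_launch(lookup, list(reported), phi=10, big_m=1000)
choice_high = best_launch(lookup, list(reported), phi=100, big_m=1000)
```

On the reported values the preferred format changes with φ, which suggests that the launch decision depends on the allowed launch intensity and not only on network centrality.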
Discussion
What are the main achievements of this work?
We considered a large multidimensional panel of songs broadcast weekly on radio stations and TV channels and detected a pattern of cross-section dependencies, based on pairwise imitations.
An exponential random model has been proposed to internalize both the songs' life cycle and the complex correlation structure in a single probabilistic framework.
A specialized MCMC method has been implemented to estimate the model parameters.
The out-of-sample goodness of fit has been analyzed, assessing the model adequacy for the observed data set.
THANK YOU FOR YOUR ATTENTION
Acknowledgements: The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement n. 283300.