Scalable inference for a full multivariate stochastic volatility model SYstemic Risk TOmography: Signals, Measurements, Transmission Channels, and Policy Interventions P. Dellaportas, A. Plataniotis and M. Titsias UCL(London),AUEB(Athens),AUEB(Athens) Final SYRTO Conference - Université Paris1 Panthéon-Sorbonne February 19, 2016


Scalable inference for a full multivariate stochastic volatility model

SYstemic Risk TOmography: Signals, Measurements, Transmission Channels, and Policy Interventions

P. Dellaportas, A. Plataniotis and M. Titsias
UCL (London), AUEB (Athens), AUEB (Athens)

Final SYRTO Conference, Université Paris 1 Panthéon-Sorbonne, February 19, 2016


- An important indicator of systemic risk is given by instantaneous volatilities and correlations.

- $N$-dimensional asset returns: $r_t = \mu_t + \epsilon_t$, $\epsilon_t \sim N(0, \Sigma_t)$, $t = 1, \dots, T$.

- The focus shifts to modelling and predicting the covariance matrices $\Sigma_t$, so we assume that $r_t \equiv \epsilon_t$.

- For realistic financial applications (portfolio allocation, systemic risk), think of $N$ in the hundreds and $T = 2000$.

- Problem 1: The number of parameters in $\Sigma_t$ is $N(N+1)/2$, which grows quadratically in $N$. The total number of parameters that need to be estimated is $TN(N+1)/2$.

- Problem 2: The $N(N+1)/2$ parameters of each $\Sigma_t$ are restricted, since $\Sigma_t$ must be positive definite.

- Problem 3: There are many missing values (about 3% in the data we looked at) and series with short lengths.
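To get a feel for Problem 1, the counts can be evaluated directly. The snippet below is only an illustrative sketch: the function name `covariance_parameter_count` and the example value $N = 500$ are assumptions made here, not part of the talk.

```python
def covariance_parameter_count(n_assets: int, n_times: int = 1) -> int:
    """Free parameters in n_times symmetric (n_assets x n_assets) covariance matrices."""
    per_matrix = n_assets * (n_assets + 1) // 2  # upper triangle including the diagonal
    return n_times * per_matrix


# Hypothetical sizes in the range mentioned above (N in the hundreds, T = 2000).
per_day = covariance_parameter_count(500)
total = covariance_parameter_count(500, 2000)
print(f"per day: {per_day:,}  over all days: {total:,}")
```

Even before the positive-definiteness constraint is considered, the total count is far larger than the number of observed returns $NT$, which is why some structure on $\Sigma_t$ is unavoidable.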


1-d Stochastic volatility model

- 1-dimensional returns

\[ r_t \sim N(\mu_t, \sigma_t^2), \]

with unobservable variances

\[ \log \sigma_{t+1}^2 = \mu + \phi \log \sigma_t^2 + \eta_t, \qquad \eta_t \sim N(0, \tau^2). \]

- MCMC algorithms since 1994; sequential importance sampling, adaptive MCMC, Laplace approximations, etc.

- Compare the parameter-driven stochastic volatility models with GARCH-type observation-driven models.
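The 1-d model above is easy to simulate, which helps to see what "parameter-driven" means: the log-variance has its own noise source $\eta_t$. The following is a minimal sketch; the function name `simulate_sv`, the parameter values and the choice to start the log-variance at its stationary mean $\mu/(1-\phi)$ are assumptions made for illustration.

```python
import numpy as np


def simulate_sv(n_steps, mu, phi, tau, rng, mean_return=0.0):
    """Simulate r_t ~ N(mean_return, sigma_t^2) with
    log sigma_{t+1}^2 = mu + phi * log sigma_t^2 + eta_t, eta_t ~ N(0, tau^2)."""
    log_var = np.empty(n_steps)
    log_var[0] = mu / (1.0 - phi)  # assumed start: stationary mean (requires |phi| < 1)
    for t in range(n_steps - 1):
        log_var[t + 1] = mu + phi * log_var[t] + tau * rng.standard_normal()
    returns = mean_return + np.exp(0.5 * log_var) * rng.standard_normal(n_steps)
    return returns, log_var


rng = np.random.default_rng(0)
returns, log_var = simulate_sv(n_steps=2000, mu=-0.2, phi=0.98, tau=0.15, rng=rng)
```

In a GARCH-type model the variance recursion would instead be a deterministic function of past returns, with no separate $\eta_t$.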


Volatility matrices - State of the art

- Two recent review articles on multivariate stochastic volatility (Asai, McAleer and Yu, 2006; Chib, Omori and Asai, 2009): the current state of the art is parsimonious modelling of $\Sigma_t$ and factor models with a few independent factors, each modelled as a univariate stochastic volatility process.

- A review article on multivariate GARCH models (Bauwens, Laurent and Rombouts, 2006): the state of the art is parsimonious modelling of $\Sigma_t$ and two-step estimation procedures.

- Other approaches include Wishart processes (Philipov and Glickman, 2006) and dynamic matrix-variate graphical models via inverted Wishart processes (Carvalho and West, 2007).


Dynamic eigenvalue and eigenvector modelling

- We decompose $\Sigma_t = U_t \Lambda_t U_t^T$ and model $U_t$ and $\Lambda_t$ with AR(1) processes. Direct modelling of $U_t$ is hard.

- Since $U_t$ is a rotation matrix, it can be parameterised by $N(N-1)/2$ Givens angles, each one belonging to a Givens matrix $G_{jt}$:

\[ U_t = \prod_{j=1}^{N(N-1)/2} G_{jt} \]


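2-Dim

\[
\Sigma_t =
\begin{pmatrix} \cos(\omega_t) & \sin(\omega_t) \\ -\sin(\omega_t) & \cos(\omega_t) \end{pmatrix}
\begin{pmatrix} \lambda_{1t} & 0 \\ 0 & \lambda_{2t} \end{pmatrix}
\begin{pmatrix} \cos(\omega_t) & \sin(\omega_t) \\ -\sin(\omega_t) & \cos(\omega_t) \end{pmatrix}^T
\]

- Uniqueness: $\lambda_{1t} > \lambda_{2t}$, $-\frac{\pi}{2} < \omega_t < \frac{\pi}{2}$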


3-Dim

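(Ignoring $t$): $\Sigma = U \Lambda U^T = G_{12} G_{13} G_{23} \Lambda G_{23}^T G_{13}^T G_{12}^T$

\[
U =
\begin{pmatrix} \cos(\omega_{12}) & \sin(\omega_{12}) & 0 \\ -\sin(\omega_{12}) & \cos(\omega_{12}) & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos(\omega_{13}) & 0 & \sin(\omega_{13}) \\ 0 & 1 & 0 \\ -\sin(\omega_{13}) & 0 & \cos(\omega_{13}) \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos(\omega_{23}) & \sin(\omega_{23}) \\ 0 & -\sin(\omega_{23}) & \cos(\omega_{23}) \end{pmatrix}
\]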


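\[
U = \prod_{j=1,\,k>j}^{N(N-1)/2} G_{jk}, \qquad
G_{jk} =
\begin{pmatrix}
1 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots & & \vdots \\
0 & \cdots & \cos(\omega_{jk}) & \cdots & \sin(\omega_{jk}) & \cdots & 0 \\
\vdots & & \vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & -\sin(\omega_{jk}) & \cdots & \cos(\omega_{jk}) & \cdots & 0 \\
\vdots & & \vdots & & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & 0 & \cdots & 1
\end{pmatrix}
\]

where the cosine and sine entries sit in rows and columns $j$ and $k$.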

- Note the sparsity of each $N$-dimensional Givens rotation matrix: it contains only four elements with cosines and sines of the angle, ones on the rest of the diagonal, and zeroes everywhere else.
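The construction of $U$ from Givens angles can be sketched in a few lines. This is an illustrative dense implementation under stated assumptions: the helper names `givens_matrix` and `rotation_from_angles`, the 0-based indexing and the storage of the angles in the upper triangle of an $N \times N$ array are choices made here, and the explicit matrix products are used only for clarity (they do not reflect the fast computation discussed later in the talk).

```python
import numpy as np


def givens_matrix(n, j, k, angle):
    """N x N Givens rotation acting on coordinates j < k (0-based)."""
    g = np.eye(n)
    c, s = np.cos(angle), np.sin(angle)
    g[j, j], g[j, k] = c, s
    g[k, j], g[k, k] = -s, c
    return g


def rotation_from_angles(angles):
    """U = G_12 G_13 ... G_{N-1,N}; only the upper triangle of `angles` is used."""
    n = angles.shape[0]
    u = np.eye(n)
    for j in range(n - 1):
        for k in range(j + 1, n):
            u = u @ givens_matrix(n, j, k, angles[j, k])
    return u


rng = np.random.default_rng(0)
n = 4
angles = rng.uniform(-np.pi / 2, np.pi / 2, size=(n, n))
eigenvalues = np.array([4.0, 3.0, 2.0, 1.0])
u = rotation_from_angles(angles)
sigma = u @ np.diag(eigenvalues) @ u.T  # positive definite by construction
```

Any choice of angles and positive eigenvalues yields a valid covariance matrix, so the positive-definiteness constraint on $\Sigma_t$ is satisfied automatically.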


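- $r_t = (r_{1t}, \dots, r_{Nt})^T$, $r_t \sim \mathrm{MVN}\left(0, U_t \Lambda_t U_t^T\right)$.

- Transformations, for $t = 1, \dots, T$:

\[
h_{it} = \log \lambda_{it}, \quad i = 1, \dots, N, \qquad
\delta_{jt} = \log\left(\frac{\pi/2 + \omega_{jt}}{\pi/2 - \omega_{jt}}\right), \quad j = 1, \dots, \frac{N(N-1)}{2},
\]

where $\lambda_{it}$ is the $i$-th diagonal element of $\Lambda_t$.

\[ h_{i,t+1} = \mu^h_i + \phi^h_i \cdot (h_{it} - \mu^h_i) + \sigma^h_i \cdot \eta^h_{it}, \quad i = 1, \dots, N \]

\[ \delta_{j,t+1} = \mu^\delta_j + \phi^\delta_j \cdot (\delta_{jt} - \mu^\delta_j) + \sigma^\delta_j \cdot \eta^\delta_{jt}, \quad j = 1, \dots, \frac{N(N-1)}{2} \]

where $\eta^h_{it}, \eta^\delta_{jt} \sim N(0, 1)$ independently, and we denote

\[ \theta_h = (\phi^h_1, \dots, \phi^h_N, \mu^h_1, \dots, \mu^h_N, \sigma^h_1, \dots, \sigma^h_N) \]

\[ \theta_\delta = (\phi^\delta_1, \dots, \phi^\delta_{N(N-1)/2}, \mu^\delta_1, \dots, \mu^\delta_{N(N-1)/2}, \sigma^\delta_1, \dots, \sigma^\delta_{N(N-1)/2}) \]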
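The two transformations map the constrained quantities (positive eigenvalues, angles in $(-\pi/2, \pi/2)$) to the whole real line, where Gaussian AR(1) dynamics make sense. A small sketch of the maps and of one AR(1) path follows; the function names, the parameter values and the choice to start each path at its mean are assumptions made for illustration. The inverse of the angle map is written as $\omega = (\pi/2)\tanh(\delta/2)$, which follows by solving the transformation for $\omega$.

```python
import numpy as np


def angle_to_delta(omega):
    """delta = log((pi/2 + omega) / (pi/2 - omega)) for omega in (-pi/2, pi/2)."""
    return np.log((np.pi / 2 + omega) / (np.pi / 2 - omega))


def delta_to_angle(delta):
    """Inverse map: omega = (pi/2) * tanh(delta / 2)."""
    return (np.pi / 2) * np.tanh(delta / 2)


def ar1_path(n_steps, mean, phi, sigma, rng):
    """x_{t+1} = mean + phi * (x_t - mean) + sigma * eta_t, eta_t ~ N(0, 1)."""
    x = np.empty(n_steps)
    x[0] = mean  # assumed starting value
    for t in range(n_steps - 1):
        x[t + 1] = mean + phi * (x[t] - mean) + sigma * rng.standard_normal()
    return x


rng = np.random.default_rng(0)
delta_path = ar1_path(500, mean=0.3, phi=0.95, sigma=0.1, rng=rng)
omega_path = delta_to_angle(delta_path)  # stays inside (-pi/2, pi/2)
eigenvalue_path = np.exp(ar1_path(500, mean=-1.0, phi=0.95, sigma=0.1, rng=rng))  # stays positive
```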


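Priors

\[ h_{i,t+1} = \mu^h_i + \phi^h_i \cdot (h_{it} - \mu^h_i) + \sigma^h_i \cdot \eta^h_{it}, \quad i = 1, \dots, N \]

\[ \delta_{j,t+1} = \mu^\delta_j + \phi^\delta_j \cdot (\delta_{jt} - \mu^\delta_j) + \sigma^\delta_j \cdot \eta^\delta_{jt}, \quad j = 1, \dots, \frac{N(N-1)}{2} \]

\[ \mu^h_i \sim N(\mu_1, \sigma_1^2), \quad i = 1, \dots, N \]

\[ \phi^h_i \sim N(\mu_2, \sigma_2^2), \quad i = 1, \dots, N \]

\[ \mu^\delta_j \sim N(\mu_3, \sigma_3^2), \quad j = 1, \dots, \frac{N(N-1)}{2} \]

\[ \phi^\delta_j \sim N(\mu_3, \sigma_3^2), \quad j = 1, \dots, \frac{N(N-1)}{2} \]

The exchangeability assumption via a hierarchical model allows borrowing strength. Partial exchangeability conditional on markets, sectors, etc. is probably more realistic.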


A general model formulation

A more general structure is a $K$-factor model constructed with an $N \times K$ matrix of factor loadings $B$:

- $r_t = B f_t + e_t$, $e_t \sim N(0, \sigma^2 I)$

- The factor loadings matrix $B$ has a fixed, known structure, while its non-zero elements follow a Gaussian prior distribution.

- $f_t \sim N(0, \Sigma_t)$

- $\Sigma_t$ follows the multivariate stochastic volatility model with the Givens matrix construction.

- We need to constrain $B$ so that the model is identifiable.

- This model is not needed only when $N$ is large: it can also handle missing values, which is very important in real applications.
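One way to see why missing values are harmless in the factor model: the noise covariance $\sigma^2 I$ is diagonal, so given $f_t$ the entries of $r_t$ are independent and a missing entry simply drops out of the likelihood. The sketch below illustrates this under stated assumptions: missing returns are encoded as `NaN`, and the function name `observed_loglik` and all numerical values are hypothetical.

```python
import numpy as np


def observed_loglik(r, B, f, sigma2):
    """log N(r_obs | B_obs f, sigma2 * I), using only the non-missing entries of r."""
    observed = ~np.isnan(r)
    resid = r[observed] - B[observed] @ f
    n_obs = resid.size
    return -0.5 * n_obs * np.log(2 * np.pi * sigma2) - 0.5 * resid @ resid / sigma2


rng = np.random.default_rng(0)
n_assets, n_factors, sigma2 = 6, 2, 0.25
B = rng.standard_normal((n_assets, n_factors))
f = rng.standard_normal(n_factors)
r = B @ f + np.sqrt(sigma2) * rng.standard_normal(n_assets)

r_missing = r.copy()
r_missing[[1, 4]] = np.nan  # two unobserved returns on this day
loglik_missing = observed_loglik(r_missing, B, f, sigma2)
```

No imputation of the missing returns is involved: the rows of $B$ that correspond to missing entries are just left out for that day.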


Computation

- With the Givens-angle model formulation we now deal with a non-linear likelihood plus a Gaussian process prior.

- MCMC for these problems: use an auxiliary Langevin MCMC scheme based on an idea by Titsias in the discussion of the JRSS B discussion paper by Girolami and Calderhead (2011).

- Computational complexity: it is $O(d^3)$ for Normal densities of dimension $d$; we achieve $O(d^2)$ even for the derivatives of the likelihood with respect to the Givens angles, so our MCMC algorithm has complexity $O(d^2)$.

- Missing data are treated without any problem.


The Sampling algorithm

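Model: $r_t = B f_t + e_t$, $e_t \sim N(0, \sigma^2 I)$, $f_t \sim N(0, \Sigma_t)$. Denote by $X$ all latent paths.

\[ p(B, \sigma^2, (f_t)_{t=1}^T \mid \text{rest}) \propto \left( \prod_{t=1}^T N(r_t \mid B f_t, \sigma^2 I) \, N(f_t \mid 0, \Sigma_t(x_t)) \right) p(B, \sigma^2), \]

\[ p(X \mid \text{rest}) \propto \left( \prod_{t=1}^T N(f_t \mid 0, \Sigma_t(x_t)) \right) p(X \mid \theta_h, \theta_\delta), \]

\[ p(\theta_h, \theta_\delta \mid \text{rest}) \propto p(X \mid \theta_h, \theta_\delta) \, p(\theta_h, \theta_\delta). \]

We do not need to generate the missing data in $r_t$.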


Sampling the Gaussian latent process

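- Denote $F = (f_1, \dots, f_T)$.

- Prior $p(X) = N(X \mid M, Q^{-1})$.

- Current state of $X$ is $X_n$. Use slice Gibbs:

- Introduce auxiliary variables $U$ that live in the same space as $X$:

\[ p(U \mid X_n) = N\left(U \,\Big|\, X_n + \frac{\delta}{2} \nabla \log p(F \mid X_n), \frac{\delta}{2} I\right), \]

where $\delta > 0$ is a step-size parameter.

- $U$ injects Gaussian noise into $X_n$ and shifts it by $(\delta/2) \nabla \log p(F \mid X_n)$.

- We cannot sample from $p(X \mid U)$, so we use a Metropolis step: propose $Y$ from the proposal $q$:

\[
q(Y \mid U) = \frac{1}{Z(U)} N\left(Y \,\Big|\, U, \frac{\delta}{2} I\right) p(Y)
= N\left(Y \,\Big|\, \left(I + \frac{\delta}{2} Q\right)^{-1} \left(U + \frac{\delta}{2} Q M\right), \frac{\delta}{2} \left(I + \frac{\delta}{2} Q\right)^{-1}\right),
\]

where $Z(U) = \int N(Y \mid U, \frac{\delta}{2} I) \, p(Y) \, dY$.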


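- Accept $Y$ with Metropolis-Hastings probability $\min(1, r)$:

\[
\begin{aligned}
r &= \frac{p(F \mid Y) \, p(U \mid Y) \, p(Y)}{p(F \mid X_n) \, p(U \mid X_n) \, p(X_n)} \cdot \frac{q(X_n \mid U)}{q(Y \mid U)} \\
&= \frac{p(F \mid Y) \, p(U \mid Y) \, p(Y)}{p(F \mid X_n) \, p(U \mid X_n) \, p(X_n)} \cdot \frac{\frac{1}{Z(U)} N(X_n \mid U, \frac{\delta}{2} I) \, p(X_n)}{\frac{1}{Z(U)} N(Y \mid U, \frac{\delta}{2} I) \, p(Y)} \\
&= \frac{p(F \mid Y) \, N(U \mid Y + \frac{\delta}{2} G_y, \frac{\delta}{2} I)}{p(F \mid X_n) \, N(U \mid X_n + \frac{\delta}{2} G_t, \frac{\delta}{2} I)} \cdot \frac{N(X_n \mid U, \frac{\delta}{2} I)}{N(Y \mid U, \frac{\delta}{2} I)} \\
&= \frac{p(F \mid Y)}{p(F \mid X_n)} \exp\left\{ -(U - X_n)^T G_t + (U - Y)^T G_y - \frac{\delta}{4} \left( \|G_y\|^2 - \|G_t\|^2 \right) \right\}
\end{aligned}
\]

where $G_t = \nabla \log p(F \mid X_n)$, $G_y = \nabla \log p(F \mid Y)$ and $\|Z\|$ denotes the Euclidean norm of a vector $Z$.

- The Gaussian prior terms $p(X_n)$ and $p(Y)$ have been cancelled out from the acceptance probability, so their computationally expensive evaluation is not required: the resulting $q(Y \mid U)$ is invariant under the Gaussian prior.

- Tune $\delta$ to achieve an acceptance rate of around 50-60%.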
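The update described above (draw $U$, propose $Y$ from $q(Y \mid U)$, accept with probability $\min(1, r)$) can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the function name `aux_langevin_step`, the dense linear algebra, the step size and the toy Gaussian likelihood (chosen so that the exact posterior $N(\text{obs}/2, I/2)$ is known) are all assumptions made here.

```python
import numpy as np


def aux_langevin_step(x, log_lik, grad_log_lik, prior_mean, prior_prec, delta, rng):
    """One auxiliary Langevin update of x under the prior N(prior_mean, prior_prec^{-1})."""
    d = x.size
    g_x = grad_log_lik(x)
    # Gibbs step for the auxiliary variable: U | X_n ~ N(X_n + (delta/2) G_t, (delta/2) I)
    u = x + 0.5 * delta * g_x + np.sqrt(0.5 * delta) * rng.standard_normal(d)
    # Proposal q(Y | U): N(Y | U, (delta/2) I) combined with the Gaussian prior
    a = np.eye(d) + 0.5 * delta * prior_prec
    mean = np.linalg.solve(a, u + 0.5 * delta * prior_prec @ prior_mean)
    cov = 0.5 * delta * np.linalg.inv(a)
    y = mean + np.linalg.cholesky(cov) @ rng.standard_normal(d)
    g_y = grad_log_lik(y)
    # Log acceptance ratio; the prior terms p(X_n) and p(Y) have cancelled
    log_r = (log_lik(y) - log_lik(x)
             - (u - x) @ g_x + (u - y) @ g_y
             - 0.25 * delta * (g_y @ g_y - g_x @ g_x))
    if np.log(rng.uniform()) < log_r:
        return y, True
    return x, False


# Toy check: prior N(0, I), likelihood N(obs | x, I), so the posterior is N(obs / 2, I / 2).
obs = np.array([2.0, -2.0])
log_lik = lambda x: -0.5 * np.sum((obs - x) ** 2)
grad_log_lik = lambda x: obs - x

rng = np.random.default_rng(0)
x = np.zeros(2)
samples, n_accepted = [], 0
for _ in range(5000):
    x, accepted = aux_langevin_step(x, log_lik, grad_log_lik,
                                    np.zeros(2), np.eye(2), 1.0, rng)
    samples.append(x)
    n_accepted += accepted
samples = np.array(samples)
```

In the model of the talk, `log_lik` would be the sum over $t$ of the log densities $N(f_t \mid 0, \Sigma_t(x_t))$ and the prior precision $Q$ would come from the AR(1) dynamics; those pieces are not reproduced here.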


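$O(K^2)$ computation for the K-factor MSV model

- $f_t \sim N(0, \Sigma_t)$, $\Sigma_t = U_t \Lambda_t U_t^T$, $U_t = \prod_{j=1}^{K(K-1)/2} G_{jt}$

\[ \log \mathrm{MSV}(f_t) = -\frac{K}{2} \log(2\pi) - \frac{1}{2} \sum_{i=1}^K h_{it} - \frac{1}{2} v_t^T v_t, \qquad (1) \]

where $v_t = \Lambda_t^{-1/2} U_t^T f_t$ and where we used that $\log |\Sigma_t| = \log |\Lambda_t| = \sum_{i=1}^K h_{it}$.

- Given $v_t$, the above expression takes $O(K)$ time to compute.

- $G_{ij}(\omega_{ij,t})^T f_t$ takes $O(1)$ time to compute, since all of its elements are equal to the corresponding ones of the vector $f_t$ apart from the $i$-th and $j$-th elements, which become $f_t[i] \cos(\omega_{ij,t}) - f_t[j] \sin(\omega_{ij,t})$ and $f_t[i] \sin(\omega_{ij,t}) + f_t[j] \cos(\omega_{ij,t})$, respectively. Applying all $K(K-1)/2$ rotations in sequence therefore gives $v_t$ in $O(K^2)$ time.

- Similarly, $\nabla_{h_t} \log \mathrm{MSV}$ and $\nabla_{\omega_{ij,t}} \log \mathrm{MSV}$ are calculated in $O(K^2)$ time.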


$O(N^2)$ computation for the MSV model

    Initialize v_t = f_t.
    for i = 1 to N − 1 do
        for j = i + 1 to N do
            c = cos(ω_{ij,t}), s = sin(ω_{ij,t})
            t1 = v_t[i], t2 = v_t[j]
            v_t[i] ← c * t1 − s * t2
            v_t[j] ← s * t1 + c * t2
        end for
    end for
    v_t = v_t ∘ diag(Λ_t^{−1/2})   (elementwise product)
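The loop above translates almost line by line into Python. The sketch below is illustrative: the function names, the 0-based indexing and the storage of the angles in the upper triangle of an array are assumptions made here. A dense $O(N^3)$ reference is included only to check that the fast version evaluates the same Gaussian log density.

```python
import numpy as np


def whiten(f, angles, h):
    """v = Lambda^{-1/2} U^T f via N(N-1)/2 plane rotations, O(N^2) in total."""
    v = np.array(f, dtype=float)
    n = v.size
    for i in range(n - 1):
        for j in range(i + 1, n):
            c, s = np.cos(angles[i, j]), np.sin(angles[i, j])
            t1, t2 = v[i], v[j]
            v[i] = c * t1 - s * t2
            v[j] = s * t1 + c * t2
    return v * np.exp(-0.5 * h)  # elementwise product with diag(Lambda^{-1/2})


def log_msv(f, angles, h):
    """Log density of f under N(0, U Lambda U^T), with h = log diag(Lambda)."""
    v = whiten(f, angles, h)
    return -0.5 * v.size * np.log(2 * np.pi) - 0.5 * np.sum(h) - 0.5 * v @ v


def dense_log_density(f, angles, h):
    """O(N^3) reference: build U explicitly and evaluate the same density."""
    n = f.size
    u = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            c, s = np.cos(angles[i, j]), np.sin(angles[i, j])
            g = np.eye(n)
            g[i, i], g[i, j], g[j, i], g[j, j] = c, s, -s, c
            u = u @ g
    sigma = u @ np.diag(np.exp(h)) @ u.T
    _, logdet = np.linalg.slogdet(sigma)
    quad = f @ np.linalg.solve(sigma, f)
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * logdet - 0.5 * quad


rng = np.random.default_rng(0)
n = 5
angles = rng.uniform(-np.pi / 2, np.pi / 2, size=(n, n))
h = rng.standard_normal(n)
f = rng.standard_normal(n)
fast, slow = log_msv(f, angles, h), dense_log_density(f, angles, h)
```

The fast version never forms $U_t$ or $\Sigma_t$; it only touches two entries of the vector per angle.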


The Sampling algorithm revisited

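Model: $r_t = B f_t + e_t$, $e_t \sim N(0, \sigma^2 I)$, $f_t \sim N(0, \Sigma_t)$

Denote by $X$ all latent paths.

\[ p(B, \sigma^2, (f_t)_{t=1}^T \mid \text{rest}) \propto \left( \prod_{t=1}^T N(r_t \mid B f_t, \sigma^2 I) \, N(f_t \mid 0, \Sigma_t(x_t)) \right) p(B, \sigma^2), \]

\[ p(X \mid \text{rest}) \propto \left( \prod_{t=1}^T N(f_t \mid 0, \Sigma_t(x_t)) \right) p(X \mid \theta_h, \theta_\delta), \]

\[ p(\theta_h, \theta_\delta \mid \text{rest}) \propto p(X \mid \theta_h, \theta_\delta) \, p(\theta_h, \theta_\delta). \]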


Sampling the latent factors in $O(TNK)$ time

- $p(f_t \mid \text{rest}) \propto N(r_t \mid B f_t, \sigma^2 I) \, N(f_t \mid 0, \Sigma_t) = N(f_t \mid \sigma^{-2} M_t^{-1} B^T r_t, M_t^{-1})$, where $M_t = \sigma^{-2} B^T B + \Sigma_t^{-1}$. To simulate from this Gaussian we first need to compute the stochastic volatility matrix $\Sigma_t$ and subsequently the Cholesky decomposition of $M_t$. Both operations have a cost of $O(K^3)$.

- We replace the exact Gibbs step with a much faster Metropolis-within-Gibbs step that scales as $O(T(NK + K^2))$.

- To achieve this we apply the same auxiliary Langevin scheme as before.
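For reference, the exact Gibbs draw that the faster step replaces can be sketched as follows. This is an illustrative baseline only: the function names and test values are assumptions, the precision is taken to be $M_t = \sigma^{-2} B^T B + \Sigma_t^{-1}$ (the standard Gaussian conditioning result), and the explicit inverse and Cholesky factorisation are exactly the $O(K^3)$ operations the talk aims to avoid.

```python
import numpy as np


def factor_conditional(r, B, sigma2, Sigma):
    """Mean and precision M_t of p(f_t | rest) = N(mean, M_t^{-1})."""
    precision = B.T @ B / sigma2 + np.linalg.inv(Sigma)
    mean = np.linalg.solve(precision, B.T @ r) / sigma2
    return mean, precision


def draw_factor(r, B, sigma2, Sigma, rng):
    """Exact draw using the Cholesky factor of M_t (the O(K^3) step)."""
    mean, precision = factor_conditional(r, B, sigma2, Sigma)
    chol = np.linalg.cholesky(precision)        # M_t = L L^T
    z = rng.standard_normal(mean.size)
    return mean + np.linalg.solve(chol.T, z)    # L^{-T} z has covariance M_t^{-1}


rng = np.random.default_rng(0)
n_assets, n_factors, sigma2 = 8, 3, 0.5
B = rng.standard_normal((n_assets, n_factors))
a = rng.standard_normal((n_factors, n_factors))
Sigma = a @ a.T + np.eye(n_factors)             # an arbitrary positive definite Sigma_t
r = rng.standard_normal(n_assets)

mean, precision = factor_conditional(r, B, sigma2, Sigma)
sample = draw_factor(r, B, sigma2, Sigma, rng)
```

The conditional mean can be cross-checked against the equivalent form $\Sigma_t B^T (B \Sigma_t B^T + \sigma^2 I)^{-1} r_t$, which follows from the matrix inversion lemma.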

I To achieve this we apply the same auxiliary Langevin scheme as before


The Data

- 571 stocks from the STOXX Europe 600 index

- Daily data from 08/01/2010 to 5/1/2014 ($T = 2017$)

- 36340 missing values, or $36340 / (571 \times 2017) = 3.2\%$

- Factor model with 30 factors: the dimension of the latent path is $2017 \times 30 \times 31 / 2 = 937{,}905$

- Choice of number of factors: based on predictive performance with respect to quadratic covariation. We tried 20, 30 and 40 factors.
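The two figures quoted above can be reproduced with a few lines of arithmetic. The variable names are illustrative; the factor $30 \times 31 / 2$ is read here as 30 log-eigenvalues plus $30 \times 29 / 2 = 435$ Givens angles per day.

```python
n_stocks, n_days, n_factors = 571, 2017, 30
n_missing = 36340

missing_fraction = n_missing / (n_stocks * n_days)

log_eigenvalues = n_factors                        # h_it: one per factor
givens_angles = n_factors * (n_factors - 1) // 2   # delta_jt: one per pair of factors
latent_per_day = log_eigenvalues + givens_angles   # equals K(K + 1)/2
latent_dimension = n_days * latent_per_day

print(f"missing: {missing_fraction:.1%}, latent dimension: {latent_dimension:,}")
```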


Next day minimum variance portfolio weights for the 571 stocks


Pairwise correlations across time


Log-Variances across time


- January 2009: Banking shares in the UK plummet as the Royal Bank of Scotland posts the biggest loss in British history. The Bank of England reduces the base rate of interest to a new historic low of 1%. The U.S. economy lost 598,000 jobs during January 2009, with unemployment rising to 7.6 percent. Bankruptcies in the United Kingdom rose during 2008 by 50 percent to an all-time high. California's Alliance Bank and Georgia's FirstBank are closed, raising the number of 2009 U.S. bank failures to eight.

- July 2012: The chairman and the chief executive of British bank Barclays resign following a scandal in which the bank tried to manipulate the Libor and Euribor interest rate systems. The central banks of the European Union, Great Britain, and the People's Republic of China, in what appears to be a co-ordinated action, each loosen their respective monetary policies.


Discussion

- Incorporation of leverage effects and jumps

- Small $N$: nested Laplace approximations (PhD thesis by Plataniotis, AUEB), importance sampling based on copulas (in progress)

- Bayesian model determination for the number of factors

- Relations with other pCN (preconditioned Crank-Nicolson) proposals


This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement n° 320270.

www.syrtoproject.eu

This document reflects only the author’s views.

The European Union is not liable for any use that may be made of the information contained therein.