Cleaning correlation matrices,
Random Matrix Theory &
HCIZ integrals
J.P Bouchaud
with: M. Potters, L. Laloux, R. Allez, J. Bun, S. Majumdar
http://www.cfm.fr
Portfolio theory: Basics
• Portfolio weights $w_i$, asset returns $X_i^t$
• If expected/predicted gains are $g_i$, then the expected gain of the portfolio is
$G = \sum_i w_i g_i$
• Let risk be defined as the variance of the portfolio returns (maybe not a good definition!)
$R^2 = \sum_{ij} w_i \sigma_i C_{ij} \sigma_j w_j$
where $\sigma_i^2$ is the variance of asset $i$, and $C_{ij}$ is the correlation matrix.
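As a small numerical illustration of the two definitions above, the following sketch evaluates the expected gain and the risk for a hypothetical three-asset portfolio. All numbers are made up for illustration and are not taken from the talk.

```python
import numpy as np

# Hypothetical three-asset example; every number here is illustrative only.
w = np.array([0.5, 0.3, 0.2])        # portfolio weights w_i
g = np.array([0.04, 0.06, 0.02])     # predicted gains g_i
sigma = np.array([0.2, 0.3, 0.1])    # asset volatilities sigma_i
C = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])      # correlation matrix C_ij

G = w @ g                            # expected gain: sum_i w_i g_i
ws = w * sigma                       # volatility-weighted positions w_i sigma_i
R2 = ws @ C @ ws                     # risk: sum_ij w_i sigma_i C_ij sigma_j w_j
```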
J.-P. Bouchaud
Markowitz Optimization
• Find the portfolio with maximum expected return for a given risk or, equivalently, minimum risk for a given return ($G$)
• In matrix notation:
$w_C = G \, \frac{C^{-1} g}{g^T C^{-1} g}$
where all gains are measured with respect to the risk-free rate and $\sigma_i = 1$ (absorbed in $g_i$).
• Note: in the presence of non-linear constraints, e.g. $\sum_i |w_i| \le A$, this becomes a “spin-glass” problem! (see [JPB, Galluccio, Potters])
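The unconstrained solution above can be sketched in a few lines. The function name `markowitz_weights` is hypothetical, and the sketch assumes a positive-definite `C` and ignores the non-linear constraint case entirely.

```python
import numpy as np

def markowitz_weights(C, g, G=1.0):
    """Weights w_C = G * C^{-1} g / (g^T C^{-1} g) for a target gain G.

    Assumes C is symmetric positive definite and sigma_i = 1.
    """
    x = np.linalg.solve(C, g)        # C^{-1} g without forming the inverse
    return G * x / (g @ x)
```

By construction `w @ g` equals the target gain `G`, and for `C` equal to the identity the weights are proportional to `g`.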
Markowitz Optimization
• More explicitly:
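$w \propto \sum_\alpha \lambda_\alpha^{-1} (\Psi_\alpha \cdot g) \Psi_\alpha = g + \sum_\alpha (\lambda_\alpha^{-1} - 1) (\Psi_\alpha \cdot g) \Psi_\alpha$
• Compared to the naive allocation $w \propto g$:
– Eigenvectors with $\lambda \gg 1$ are projected out
– Eigenvectors with $\lambda \ll 1$ are overallocated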
• Very important for “stat. arb.” strategies (for example)
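The eigenvector expansion can be checked numerically against a direct solve. This is a minimal sketch; `spectral_allocation` is a hypothetical helper name, and the proportionality constant is dropped as in the formula above.

```python
import numpy as np

def spectral_allocation(C, g):
    """Unnormalized allocation sum_a lambda_a^{-1} (Psi_a . g) Psi_a."""
    lam, Psi = np.linalg.eigh(C)     # columns Psi[:, a] are the eigenvectors
    proj = Psi.T @ g                 # projections Psi_a . g
    return Psi @ (proj / lam)        # small eigenvalues get the largest weight
```

The division by `lam` makes the over-allocation on small eigenvalues explicit: any noise that pushes an eigenvalue towards zero is amplified in the weights.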
Empirical Correlation Matrix
• Before inverting them, how should one estimate/clean correlation matrices?
• Empirical Equal-Time Correlation Matrix $E$
$E_{ij} = \frac{1}{T} \sum_t \frac{X_i^t X_j^t}{\sigma_i \sigma_j}$
Order $N^2$ quantities estimated with $NT$ data points.
When $T < N$, $E$ is not even invertible.
Typically: $N = 500$ to $2000$; $T = 500$ to $2500$ days (10 years; beware of high frequencies) $\to q := N/T = O(1)$
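A minimal sketch of the estimator, assuming zero-mean returns stored as a `(T, N)` array (the array layout and the function name are assumptions of this example). It also makes the rank deficiency for $T < N$ easy to observe.

```python
import numpy as np

def empirical_correlation(X):
    """E_ij = (1/T) sum_t X_i^t X_j^t / (sigma_i sigma_j).

    X has shape (T, N). Returns are assumed to have zero mean, so sigma_i
    is estimated as the root mean square of column i (no centering).
    """
    T = X.shape[0]
    sigma = np.sqrt(np.mean(X ** 2, axis=0))
    Z = X / sigma
    return Z.T @ Z / T
```

With `T = 50` observations of `N = 80` assets, `np.linalg.matrix_rank(empirical_correlation(X))` is at most 50, so the matrix cannot be inverted.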
Risk of Optimized Portfolios
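• “In-sample” risk (for $G = 1$):
$R_{\text{in}}^2 = w_E^T E w_E = \frac{1}{g^T E^{-1} g}$
• True minimal risk
$R_{\text{true}}^2 = w_C^T C w_C = \frac{1}{g^T C^{-1} g}$
• “Out-of-sample” risk
$R_{\text{out}}^2 = w_E^T C w_E = \frac{g^T E^{-1} C E^{-1} g}{(g^T E^{-1} g)^2}$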
Risk of Optimized Portfolios
• Let $E$ be a noisy, unbiased estimator of $C$. Using convexity arguments, and for large matrices:
$R_{\text{in}}^2 \le R_{\text{true}}^2 \le R_{\text{out}}^2$
• In fact, using RMT: $R_{\text{out}}^2 = R_{\text{true}}^2 (1 - q)^{-1} = R_{\text{in}}^2 (1 - q)^{-2}$, independent of $C$! (For large $N$)
• If $C$ has some time dependence (beyond observation noise) one expects an even worse underestimation
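The $(1 - q)$ factors can be checked with a quick simulation. The sketch below assumes the simplest setting, $C = I$ with Gaussian returns and an arbitrary gain vector; the sizes and the seed are arbitrary choices, and at finite $N$ the ratios only match the asymptotic values up to fluctuations of a few percent.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 300, 1200
q = N / T

C = np.eye(N)                          # assumed "true" matrix (null case)
g = rng.standard_normal(N)             # arbitrary predicted gains
X = rng.standard_normal((T, N))        # Gaussian returns with covariance C
E = X.T @ X / T                        # empirical estimate of C

Einv_g = np.linalg.solve(E, g)
R2_in = 1.0 / (g @ Einv_g)                            # 1 / (g^T E^-1 g)
R2_true = 1.0 / (g @ np.linalg.solve(C, g))           # 1 / (g^T C^-1 g)
R2_out = (Einv_g @ C @ Einv_g) / (g @ Einv_g) ** 2    # g^T E^-1 C E^-1 g / (g^T E^-1 g)^2

ratio_out_true = R2_out / R2_true      # expected close to 1 / (1 - q)
ratio_true_in = R2_true / R2_in        # expected close to 1 / (1 - q)
```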
In Sample vs. Out of Sample
[Figure: return vs. risk for the raw in-sample, cleaned in-sample, cleaned out-of-sample and raw out-of-sample portfolios.]
Rotational invariance hypothesis (RIH)
• In the absence of any cogent prior on the eigenvectors, one can assume that $C$ is a member of a Rotationally Invariant Ensemble: “RIH”
• Surely not true for the “market mode” $\vec{v}_1 \approx (1, 1, \ldots, 1)/\sqrt{N}$, with $\lambda_1 \approx N\rho$, but OK in the bulk (see below)
A more plausible assumption: factor model → hierarchical, block diagonal $C$’s (“Parisi matrices”)
• “Cleaning” $E$ within RIH: keep the eigenvectors, play with eigenvalues
→ The simplest, classical scheme, shrinkage:
$C = (1 - \alpha) E + \alpha I \;\to\; \hat{\lambda}_C = (1 - \alpha) \lambda_E + \alpha, \quad \alpha \in [0, 1]$
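The shrinkage recipe is short enough to write out directly. In this sketch `linear_shrinkage` is a hypothetical name, and choosing `alpha` is left to the caller.

```python
import numpy as np

def linear_shrinkage(E, alpha):
    """Return (1 - alpha) * E + alpha * I for alpha in [0, 1].

    Eigenvectors of E are kept; each eigenvalue lambda_E is mapped to
    (1 - alpha) * lambda_E + alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * E + alpha * np.eye(E.shape[0])
```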
RMT: from ρC(λ) to ρE(λ)
• Solution using different techniques (replicas, diagrams, free matrices) gives the resolvent $G_E(z) = N^{-1} \mathrm{Tr}\,(zI - E)^{-1}$ as:
$G_E(z) = \int d\lambda \, \rho_C(\lambda) \, \frac{1}{z - \lambda (1 - q + q z G_E(z))}$
Note: One should work from $\rho_C \to G_E$
• Example 1: $C = I$ (null hypothesis) → Marcenko-Pastur [67]
$\rho_E(\lambda) = \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{2 \pi q \lambda}, \quad \lambda \in [(1 - \sqrt{q})^2, (1 + \sqrt{q})^2]$
• Suggests a second cleaning scheme (Eigenvalue clipping, [Laloux et al. 1997]): any eigenvalue beyond the Marcenko-Pastur edge can be trusted, the rest is noise.
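For reference, the Marcenko-Pastur density of Example 1 can be evaluated as follows. This is a sketch restricted to $q < 1$ (no point mass at zero); the function name is an assumption of this example.

```python
import numpy as np

def marchenko_pastur_pdf(lam, q):
    """Marcenko-Pastur density for C = I and 0 < q < 1.

    Returns sqrt((lam_+ - lam)(lam - lam_-)) / (2 pi q lam) inside
    [lam_-, lam_+] and zero outside, with lam_+- = (1 +- sqrt(q))^2.
    """
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lam_minus = (1.0 - np.sqrt(q)) ** 2
    lam_plus = (1.0 + np.sqrt(q)) ** 2
    rho = np.zeros_like(lam)
    inside = (lam > lam_minus) & (lam < lam_plus)
    x = lam[inside]
    rho[inside] = np.sqrt((lam_plus - x) * (x - lam_minus)) / (2.0 * np.pi * q * x)
    return rho
```

Integrating the returned density on a fine grid should give a total mass and a mean both close to 1, which is a convenient sanity check before comparing with empirical spectra.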
Eigenvalue clipping
Eigenvalues $\lambda < \lambda_+$ are replaced by a single common value, chosen so as to preserve $\mathrm{Tr}\, C = N$.
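A minimal sketch of eigenvalue clipping, assuming the edge is the $C = I$ Marcenko-Pastur value $\lambda_+ = (1 + \sqrt{q})^2$ and that the common replacement value is the mean of the clipped eigenvalues (which is what trace preservation requires). The function name is hypothetical.

```python
import numpy as np

def clip_eigenvalues(E, q):
    """Replace all eigenvalues below the MP edge by their mean.

    Eigenvectors are kept; eigenvalues above (1 + sqrt(q))^2 are untouched.
    Using the mean of the clipped eigenvalues leaves the trace unchanged.
    """
    lam, V = np.linalg.eigh(E)
    lam_plus = (1.0 + np.sqrt(q)) ** 2
    noise = lam < lam_plus
    cleaned = lam.copy()
    if noise.any():
        cleaned[noise] = lam[noise].mean()
    return (V * cleaned) @ V.T       # V diag(cleaned) V^T
```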
RMT: from ρC(λ) to ρE(λ)
• Example 2: Power-law spectrum (motivated by data)
$\rho_C(\lambda) = \frac{\mu A}{(\lambda - \lambda_0)^{1 + \mu}} \, \Theta(\lambda - \lambda_{\min})$
• Suggests a third cleaning scheme (Eigenvalue substitution, Potters et al. 2009, El Karoui 2010): $\lambda_E$ is replaced by the theoretical $\lambda_C$ with the same rank $k$
Empirical Correlation Matrix
[Figure: eigenvalue density $\rho(\lambda)$ vs. $\lambda$ for the data, the dressed power law ($\mu = 2$), the raw power law ($\mu = 2$) and Marcenko-Pastur; inset: $\kappa$ vs. rank.]
MP and generalized MP fits of the spectrum
Eigenvalue cleaning
[Figure: $R^2$ vs. $\alpha$ for Classical Shrinkage, Ledoit-Wolf Shrinkage, Power Law Substitution and Eigenvalue Clipping.]
Out-of-sample risk for different 1-parameter cleaning schemes
A RIH Bayesian approach
• All the above schemes lack a rigorous framework and are at best ad-hoc recipes
• A Bayesian framework: suppose $C$ belongs to a RIE, with $P(C)$, and assume Gaussian returns. Then one needs:
$\langle C \rangle|_{X_i^t} = \int \mathcal{D}C \; C \, P(C | \{X_i^t\})$
with
$P(C | \{X_i^t\}) = Z^{-1} \exp\left[ -N \, \mathrm{Tr}\, V(C, \{X_i^t\}) \right]$
where (Bayes):
$V(C, \{X_i^t\}) = \frac{1}{2q} \left[ \log C + E C^{-1} \right] + V_0(C)$
A Bayesian approach: a fully soluble case
• $V_0(C) = (1 + b) \ln C + b \, C^{-1}$, $b > 0$: “Inverse Wishart”
• $\rho_C(\lambda) \propto \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{\lambda^2}$; $\lambda_\pm = \left(1 + b \pm \sqrt{(1 + b)^2 - b^2/4}\right)/b$
• In this case, the matrix integral can be done, leading exactly to the “Shrinkage” recipe, with $\alpha = f(b, q)$
• Note that $b$ can be determined from the empirical spectrum of $E$, using the generalized MP formula
The general case: HCIZ integrals
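• A Coulomb gas approach: integrate over the orthogonal group $C = O \Lambda O^\dagger$, where $\Lambda$ is diagonal.
$\int \mathcal{D}O \, \exp\left[ -\frac{N}{2q} \mathrm{Tr}\left[ \log \Lambda + E O^\dagger \Lambda^{-1} O + 2q V_0(\Lambda) \right] \right]$
• Can one obtain a large $N$ estimate of the HCIZ integral
$F(\rho_A, \rho_B) = \lim_{N \to \infty} N^{-2} \ln \int \mathcal{D}O \, \exp\left[ \frac{N}{2q} \mathrm{Tr}\, A O^\dagger B O \right]$
in terms of the spectrum of $A$ and $B$?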
• When $A$ (or $B$) is of finite rank, such a formula exists in terms of the R-transform of $B$ [Marinari, Parisi & Ritort, 1995].
• When the ranks of $A$ and $B$ are of order $N$, there is a formula due to Matytsin [94] (in the unitary case), later shown rigorously by Zeitouni & Guionnet, but its derivation is quite obscure...
An instanton approach to large N HCIZ
• Consider Dyson’s Brownian motion matrices. The eigenvalues obey:
$dx_i = \sqrt{\frac{2}{\beta N}} \, dW + \frac{1}{N} \, dt \sum_{j \neq i} \frac{1}{x_i - x_j}$
• Constrain $x_i(t = 0) = \lambda_{Ai}$ and $x_i(t = 1) = \lambda_{Bi}$. The probability of such a path is given by a large deviation/instanton formula, with:
$\frac{d^2 x_i}{dt^2} = -\frac{2}{N^2} \sum_{\ell \neq i} \frac{1}{(x_i - x_\ell)^3}$
• This can be interpreted as the motion of particles interacting through an attractive two-body potential $\phi(r) = -(Nr)^{-2}$. Using the virial formula, one finally gets Matytsin’s equations:
$\partial_t \rho + \partial_x [\rho v] = 0, \quad \partial_t v + v \partial_x v = \pi^2 \rho \, \partial_x \rho$
An instanton approach to large N HCIZ
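• Finally, the “action” associated to these trajectories is:
$S \approx \frac{1}{2} \int dx \, \rho \left[ v^2 + \frac{\pi^2}{3} \rho^2 \right] - \frac{1}{2} \left[ \int dx \, dy \, \rho_Z(x) \rho_Z(y) \ln |x - y| \right]_{Z=A}^{Z=B}$
• Now, the link with HCIZ comes from noticing that the propagator of the Brownian motion in matrix space is:
$P(B|A) \propto \exp\left[ -\frac{N}{2} \mathrm{Tr}(A - B)^2 \right] = \exp\left[ -\frac{N}{2} \left( \mathrm{Tr} A^2 + \mathrm{Tr} B^2 - 2 \, \mathrm{Tr}\, A O B O^\dagger \right) \right]$
Disregarding the eigenvectors of $B$ (i.e. integrating over $O$) leads to another expression for $P(\lambda_{Bi} | \lambda_{Aj})$ in terms of HCIZ that can be compared to the one using instantons
• The final result for $F(\rho_A, \rho_B)$ is exactly Matytsin’s expression, up to details (!)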
Back to eigenvalue cleaning...
• Estimating HCIZ at large $N$ is only the first step, but...
• ...one still needs to apply it to $B = C^{-1}$, $A = E = X^\dagger C X$ and to compute also correlation functions such as $\langle O_{ij}^2 \rangle_{E \to C^{-1}}$ with the HCIZ weight
• As we were working on this we discovered the work of Ledoit-Péché, which solves the problem exactly using tools from RMT...
The Ledoit-Péché “magic formula”
• The Ledoit-Péché [2011] formula is a non-linear shrinkage, given by:
$\hat{\lambda}_C = \frac{\lambda_E}{\left| 1 - q + q \lambda_E \lim_{\epsilon \to 0} G_E(\lambda_E - i\epsilon) \right|^2}$
• Note 1: Independent of $C$: only $G_E$ is needed (and is observable)!
• Note 2: When applied to the case where $C$ is inverse Wishart, this gives again the linear shrinkage
• Note 3: Still to be done: reobtain these results using the HCIZ route (many interesting intermediate results to hope for!)
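A rough numerical sketch of the formula, not the authors' implementation: the limit $\epsilon \to 0$ is replaced by a small finite width `eta` (defaulting to $N^{-1/2}$, an assumption of this example), and $G_E$ is approximated by the discrete resolvent built from the sample eigenvalues.

```python
import numpy as np

def ledoit_peche_eigenvalues(lam_E, q, eta=None):
    """Non-linear shrinkage lam_E / |1 - q + q lam_E G_E(lam_E - i eta)|^2.

    lam_E : sample eigenvalues of E
    q     : aspect ratio N / T
    eta   : finite imaginary offset standing in for epsilon -> 0
            (assumed default N**-0.5; not specified in the slides)
    """
    lam_E = np.asarray(lam_E, dtype=float)
    N = lam_E.size
    if eta is None:
        eta = N ** -0.5
    z = lam_E - 1j * eta
    # discrete resolvent G_E(z_k) = (1/N) sum_j 1 / (z_k - lambda_j)
    G = (1.0 / (z[:, None] - lam_E[None, :])).mean(axis=1)
    return lam_E / np.abs(1.0 - q + q * lam_E * G) ** 2
```

For data generated with $C = I$ the cleaned eigenvalues should be pulled towards 1, and for $q \to 0$ the formula leaves the sample eigenvalues essentially unchanged. The result depends on the choice of `eta`, so it should be treated as a tuning parameter in this sketch.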
Eigenvalue cleaning: Ledoit-Péché
Fit of the empirical distribution with $V_0'(z) = a/z + b/z^2 + c/z^3$.
What about eigenvectors?
• Up to now, most results using RMT focus on eigenvalues
• What about eigenvectors? What natural null hypothesis beyond RIH?
• Are eigenvalues/eigendirections stable in time?
• Important source of risk for market/sector neutral portfolios: a sudden/gradual rotation of the top eigenvectors!
• ..a little movie...
What about eigenvectors?
• Correlation matrices need a certain time T to be measured
• Even if the “true” $C$ is fixed, its empirical determination fluctuates:
$E_t = C + \text{noise}$
• What is the dynamics of the empirical eigenvectors induced by measurement noise?
• Can one detect a genuine evolution of these eigenvectors beyond noise effects?
What about eigenvectors?
• More generally, can one say something about the eigenvectors of randomly perturbed matrices:
$H = H_0 + \epsilon H_1$
where $H_0$ is deterministic or random (e.g. GOE) and $H_1$ random.
Eigenvectors exchange
• An issue: upon pseudo-collisions of eigenvalues, eigenvectors exchange
• Example: $2 \times 2$ matrices
$H_{11} = a, \quad H_{22} = a + \epsilon, \quad H_{21} = H_{12} = c \;\to\; \lambda_\pm \approx_{\epsilon \to 0} a + \frac{\epsilon}{2} \pm \sqrt{c^2 + \frac{\epsilon^2}{4}}$
• Let $c$ vary: quasi-crossing for $c \to 0$, with an exchange of the top eigenvector: $(1, -1) \to (1, 1)$
• For large matrices, these exchanges are extremely numerous → labelling problem
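The $2 \times 2$ example is easy to reproduce numerically. In this sketch the values of $a$, $\epsilon$ and $c$ are arbitrary, and `top_eigenvector` is a hypothetical helper name.

```python
import numpy as np

def top_eigenvector(a, eps, c):
    """Eigenvalues (ascending) and top eigenvector of [[a, c], [c, a + eps]]."""
    H = np.array([[a, c], [c, a + eps]])
    lam, V = np.linalg.eigh(H)       # eigh returns eigenvalues in ascending order
    return lam, V[:, -1]

# Sweep c through zero: the top eigenvector switches from ~(1, -1) to ~(1, 1).
lam_neg, v_neg = top_eigenvector(1.0, 1e-3, -1.0)
lam_pos, v_pos = top_eigenvector(1.0, 1e-3, +1.0)
```

The overall sign of each eigenvector returned by `eigh` is arbitrary, so the exchange is best detected through the relative sign of the two components rather than the components themselves.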
Subspace stability
• An idea: follow the subspace spanned by $P$ eigenvectors:
$|\psi_{k+1}\rangle, |\psi_{k+2}\rangle, \ldots, |\psi_{k+P}\rangle \;\to\; |\psi'_{k+1}\rangle, |\psi'_{k+2}\rangle, \ldots, |\psi'_{k+P}\rangle$
• Form the $P \times P$ matrix of scalar products:
$G_{ij} = \langle \psi_{k+i} | \psi'_{k+j} \rangle$
• The determinant of this matrix is insensitive to label permutations and is a measure of the overlap between the two $P$-dimensional subspaces
– $D = -\frac{1}{P} \ln |\det G|$ is a measure of how well the first subspace can be approximated by the second
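A minimal sketch of the determinant-based distance, assuming real eigenvectors stored as orthonormal columns of `(N, P)` arrays (the layout and function name are assumptions of this example). `slogdet` is used so that very small determinants do not underflow.

```python
import numpy as np

def subspace_distance(Psi, Psi_prime):
    """D = -(1/P) ln |det G| with G_ij = <psi_i | psi'_j>.

    Psi, Psi_prime : (N, P) arrays whose columns are orthonormal vectors.
    """
    G = Psi.T @ Psi_prime
    P = G.shape[0]
    _, logabsdet = np.linalg.slogdet(G)   # log |det G|, numerically stable
    return -logabsdet / P
```

Relabelling or flipping the sign of the columns only changes the sign of $\det G$, so `D` is unaffected; `D = 0` means the two subspaces coincide.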
Intermezzo
• Non equal time correlation matrices
$E_{ij}^\tau = \frac{1}{T} \sum_t \frac{X_i^t X_j^{t+\tau}}{\sigma_i \sigma_j}$
$N \times N$ but not symmetrical: ‘leader-lagger’ relations
• General rectangular correlation matrices
$G_{\alpha i} = \frac{1}{T} \sum_{t=1}^{T} Y_\alpha^t X_i^t$
$N$ ‘input’ factors $X$; $M$ ‘output’ factors $Y$
– Example: $Y_\alpha^t = X_j^{t+\tau}$, $N = M$
Intermezzo: Singular values
• Singular values: square root of the nonzero eigenvalues of $G G^T$ or $G^T G$, with associated eigenvectors $u_\alpha^k$ and $v_i^k$ → $1 \ge s_1 > s_2 > \ldots > s_{(M,N)^-} \ge 0$, where $(M,N)^- = \min(M, N)$
• Interpretation: $k = 1$: best linear combination of input variables with weights $v_i^1$, to optimally predict the linear combination of output variables with weights $u_\alpha^1$, with a cross-correlation $= s_1$.
• $s_1$: measure of the predictive power of the set of $X$s with respect to $Y$s
• Other singular values: orthogonal, less predictive, linear combinations
Intermezzo: Benchmark
• Null hypothesis: No correlations between $X$s and $Y$s:
$G_{\text{true}} \equiv 0$
• But arbitrary correlations among $X$s, $C_X$, and $Y$s, $C_Y$, are possible
• Consider exact normalized principal components for the sample variables $X$s and $Y$s:
$\hat{X}_i^t = \frac{1}{\sqrt{\lambda_i}} \sum_j U_{ij} X_j^t; \quad \hat{Y}_\alpha^t = \ldots$
and define $\hat{G} = \hat{Y} \hat{X}^T$.
Intermezzo: Random SVD
• Final result ([Wachter] (1980); [Laloux, Miceli, Potters, JPB]):
$\rho(s) = (m + n - 1)^+ \delta(s - 1) + \frac{\sqrt{(s^2 - \gamma_-)(\gamma_+ - s^2)}}{\pi s (1 - s^2)}$
with
$\gamma_\pm = n + m - 2mn \pm 2\sqrt{mn(1 - n)(1 - m)}, \quad 0 \le \gamma_\pm \le 1$
• Analogue of the Marcenko-Pastur result for rectangular correlation matrices
• Many applications: finance, econometrics (‘large’ models), genomics, etc. and subspace stability!
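The benchmark can be illustrated by simulation. The sketch below makes several assumptions that are not stated on the slides: $n = N/T$ and $m = M/T$, data stored as `(variables, T)` arrays, and $\hat{G}$ normalized by $1/T$ so that its singular values are at most 1. With independent Gaussian $X$ and $Y$, the largest singular value should sit near $\sqrt{\gamma_+}$.

```python
import numpy as np

def whiten(X):
    """Normalized principal components: X_hat @ X_hat.T / T is the identity.

    X has shape (N, T) with N < T.
    """
    T = X.shape[1]
    lam, U = np.linalg.eigh(X @ X.T / T)      # sample covariance spectrum
    return (U.T @ X) / np.sqrt(lam)[:, None]

def singular_value_edges(n, m):
    """Edges sqrt(gamma_-), sqrt(gamma_+) of the continuous part of rho(s)."""
    mid = n + m - 2.0 * m * n
    half_width = 2.0 * np.sqrt(m * n * (1.0 - n) * (1.0 - m))
    return np.sqrt(mid - half_width), np.sqrt(mid + half_width)

rng = np.random.default_rng(1)
N, M, T = 100, 50, 1000                        # arbitrary illustrative sizes
X = rng.standard_normal((N, T))                # 'input' factors
Y = rng.standard_normal((M, T))                # 'output' factors, independent of X
G_hat = whiten(Y) @ whiten(X).T / T            # M x N matrix (1/T normalization assumed)
s = np.linalg.svd(G_hat, compute_uv=False)
s_low, s_high = singular_value_edges(N / T, M / T)
```

A measured top singular value well above `s_high` would then be evidence of genuine predictive power, while anything inside the band is compatible with pure noise.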
Back to eigenvectors
• Extend the target subspace to avoid edge effects:
$|\psi_{k+1}\rangle, |\psi_{k+2}\rangle, \ldots, |\psi_{k+P}\rangle \;\to\; |\psi'_{k-Q+1}\rangle, |\psi'_{k-Q+2}\rangle, \ldots, |\psi'_{k+Q}\rangle$
• Form the $P \times Q$ matrix of scalar products:
$G_{ij} = \langle \psi_{k+i} | \psi'_{k+j} \rangle$
• The singular values of $G$ indicate how well the $Q$ perturbed vectors approximate the initial ones
$D = -\frac{1}{P} \sum_i \ln s_i$
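A sketch of the singular-value version of the distance, assuming orthonormal columns and a target subspace at least as large as the initial one, so that $G$ has exactly $P$ singular values. The function name and array layout are assumptions of this example.

```python
import numpy as np

def subspace_distance_svd(Psi, Psi_target):
    """D = -(1/P) sum_i ln s_i, with s_i the singular values of G = Psi^T Psi_target.

    Psi        : (N, P) initial eigenvectors (orthonormal columns)
    Psi_target : (N, Q') target eigenvectors, Q' >= P (orthonormal columns)
    """
    G = Psi.T @ Psi_target
    s = np.linalg.svd(G, compute_uv=False)    # P singular values when Q' >= P
    return -np.mean(np.log(s))
```

If the target subspace contains the initial one, all singular values equal 1 and `D` vanishes; a singular value of exactly zero (an initial direction orthogonal to the whole target) gives an infinite `D`.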
Null hypothesis
• Note: if P and Q are large, D can be “accidentally” small
• One can compute $D$ exactly in the limit $P, Q \to \infty$, $N \to \infty$, with fixed $p = P/N$, $q = Q/N$.
• Final result (same problem as above!):
$D = -\int_0^1 ds \, \ln s \, \rho(s)$
with:
$\rho(s) = \frac{\sqrt{(s^2 - \gamma_-)(\gamma_+ - s^2)}}{\pi s (1 - s^2)}$
and
$\gamma_\pm = p + q - 2pq \pm 2\sqrt{pq(1 - p)(1 - q)}, \quad 0 \le \gamma_\pm \le 1$
Back to eigenvectors: perturbation theory
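• Consider a randomly perturbed matrix:
$H = H_0 + \epsilon H_1$
• Perturbation theory to second order in $\epsilon$ yields:
$D \approx \frac{\epsilon^2}{2P} \sum_{i \in \{k+1, \ldots, k+P\}} \; \sum_{j \notin \{k-Q+1, \ldots, k+Q\}} \left( \frac{\langle \psi_i | H_1 | \psi_j \rangle}{\lambda_i - \lambda_j} \right)^2$
• The full distribution of $s$ can again be computed exactly (in some limits) using free random matrix tools.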
GOE: the full SV spectrum
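• Initial eigenspace: spanned by $[a, b] \subset [-2, 2]$, $b - a = \Delta$
Target eigenspace: spanned by $[a - \delta, b + \delta] \subset [-2, 2]$
• Two cases (set $s = \epsilon^2 \hat{s}$):
– Weak fluctuations $\Delta \ll \delta \ll 1$ → $\rho(\hat{s})$ is a semicircle centered around $\sim \delta^{-1}$, of width $\sim \sqrt{\Delta}$
– Strong fluctuations $\delta \ll \Delta \ll 1$ → $\rho(\hat{s}) \sim \hat{s}_{\min} / \hat{s}^2$, with $\hat{s}_{\min} \sim \Delta^{-1}$ and $\hat{s}_{\max} \sim \delta^{-1}$.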
The case of correlation matrices
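• Consider the empirical correlation matrix:
$E = C + \eta, \quad \eta = \frac{1}{T} \sum_{t=1}^{T} \left( X^t (X^t)^T - C \right)$
• The noise $\eta$ is correlated as:
$\langle \eta_{ij} \eta_{kl} \rangle = \frac{1}{T} \left( C_{ik} C_{jl} + C_{il} C_{jk} \right)$
• from which one derives:
$D \approx \frac{1}{2TP} \sum_{i=1}^{P} \sum_{j=Q+1}^{N} \frac{\lambda_i \lambda_j}{(\lambda_i - \lambda_j)^2}$
(and a similar equation for eigenvalues)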
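The double sum is straightforward to evaluate for a given spectrum. This sketch assumes the eigenvalues are ranked in decreasing order (so that $i = 1, \ldots, P$ labels the top eigenvalues), that $Q \ge P$, and that the spectrum is non-degenerate; the function name is hypothetical.

```python
import numpy as np

def noise_induced_distance(lam, P, Q, T):
    """D ~ 1/(2 T P) * sum_{i<=P} sum_{j>Q} lam_i lam_j / (lam_i - lam_j)^2.

    lam : eigenvalues of C (any order; sorted in decreasing order internally)
    P   : dimension of the initial subspace (top P eigenvalues)
    Q   : dimension of the target subspace, Q >= P
    T   : number of observations
    """
    lam = np.sort(np.asarray(lam, dtype=float))[::-1]
    top = lam[:P, None]              # i = 1..P
    rest = lam[None, Q:]             # j = Q+1..N
    return np.sum(top * rest / (top - rest) ** 2) / (2.0 * T * P)
```

For example, the spectrum `[4, 2, 1]` with `P = Q = 1` and `T = 100` gives `(2 + 4/9) / 200 = 11/900`.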
Stability of eigenvalues: Correlations
Eigenvalues clearly change: well known correlation crises
Stability of eigenspaces: Correlations
D(τ) for a given T , P = 5, Q = 10
Stability of eigenspaces: Correlations
D(τ = T) for P = 5, Q = 10
Conclusion
• Many RMT tools available to understand the eigenvalue spectrum and suggest cleaning schemes
• The understanding of eigenvectors is comparatively poorer
• The dynamics of the top eigenvector (aka market mode) is relatively well understood
• A plausible, realistic model for the “true” evolution of $C$ is still lacking (many crazy attempts: Multivariate GARCH, BEKK, etc., but “second generation models” are on their way)
Bibliography
• J.-P. Bouchaud, M. Potters, Financial Applications of Random Matrix Theory: a short review, in “The Oxford Handbook of Random Matrix Theory” (2011)
• R. Allez and J.-P. Bouchaud, Eigenvectors dynamics: general theory & some applications, arXiv 1108.4258
• P.-A. Reigneron, R. Allez and J.-P. Bouchaud, Principal regression analysis and the index leverage effect, Physica A, Volume 390 (2011) 3026-3035.