Jeffery-Williams Lecture
On the effectiveness of operator-valued
free probability theory
Roland Speicher
Universität des Saarlandes
Saarbrücken, Germany
joint work with Serban Belinschi, Tobias Mai, John Treilhard,
Carlos Vargas
Once Upon a Time ....
... There Were Large Random Matrices
We are interested in the limiting eigenvalue distribution of an
N ×N random matrix for N →∞.
Typical phenomena for basic random matrix ensembles:
• almost sure convergence to a deterministic limit eigenvalue
distribution
• this limit distribution can be effectively calculated
Wigner Random Matrix
A Wigner random matrix X = (x_ij)_{i,j=1}^N
• is symmetric: X∗ = X
• {x_ij | 1 ≤ i ≤ j ≤ N} are independent and identically distributed
8 eigenvalues of an 8×8 matrix with random ±1 entries
[Figure: eigenvalue histograms for four sign patterns, including the all +1 and all −1 matrices and two random ±1 matrices]
100 eigenvalues of a 100×100 matrix with random ±1/√N entries
[Figure: eigenvalue histograms for three realisations, N = 100]
4000 eigenvalues of a 4000×4000 matrix with random ±1/√N entries
[Figure: eigenvalue histograms for three realisations, N = 4000]
Almost Sure Convergence to a Deterministic
Limit Eigenvalue Distribution
For large N , the eigenvalue distribution of X is with very high
probability very close to a deterministic “limit distribution”.
Wishart Random Matrix
A Wishart random matrix X is of the form X = AA∗ where
• A is an N ×M matrix A = (a_ij)_{i=1,...,N; j=1,...,M}
• all entries are independent and identically distributed:
{a_ij | 1 ≤ i ≤ N, 1 ≤ j ≤ M} are iid
For N →∞, one keeps the ratio λ := N/M fixed.
Eigenvalues of Wishart matrices with λ = 0.25
[Figure: eigenvalue histograms for N = 100, N = 1000, and three realisations with N = 3000]
Almost Sure Convergence to a Deterministic
Limit Eigenvalue Distribution
For large N , the eigenvalue distribution of X is with very high
probability (for generic choices of X) very close to a deterministic
“limit distribution”, which depends on λ.
We are interested in the limiting eigenvalue distribution of an
N ×N random matrix for N →∞.
Typical phenomena for basic random matrix ensembles:
• almost sure convergence to a deterministic limit eigenvalue distribution
• this limit distribution can be effectively calculated
The Cauchy (or Stieltjes) Transform
For any probability measure µ on R we define its Cauchy transform
G(z) := ∫_R 1/(z − t) dµ(t)
This is an analytic function G : C+ → C− and we can recover µ from G by the Stieltjes inversion formula
dµ(t) = −(1/π) lim_{ε→0} Im G(t + iε) dt
For our basic random matrix ensembles one can derive equations for the Cauchy transform of the limiting eigenvalue distribution, solve those equations and then get the density via Stieltjes inversion:
Wigner random matrix
G(z)² + 1 = zG(z),
which can be solved as
G(z) = (z − √(z² − 4))/2, thus dµ_s(t) = (1/(2π))√(4 − t²) dt
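As a small numerical sanity sketch (not from the lecture): the closed form above satisfies the quadratic equation exactly, and Stieltjes inversion recovers the semicircle density. The branch-selection step is an implementation detail chosen so that G(z) ~ 1/z on C+.

```python
import numpy as np

def G_semicircle(z):
    # Cauchy transform of the standard semicircle law;
    # pick the square-root branch with G(z) ~ 1/z on C^+
    s = np.sqrt(z * z - 4)
    if np.imag(s) * np.imag(z) < 0:
        s = -s
    return (z - s) / 2

# the quadratic equation G(z)^2 + 1 = z G(z) at a test point in C^+
z = 0.7 + 0.5j
g = G_semicircle(z)
assert abs(g * g + 1 - z * g) < 1e-12

# Stieltjes inversion: density(t) = -(1/pi) Im G(t + i*eps) for small eps
t = 1.0
density = -np.imag(G_semicircle(t + 1e-8j)) / np.pi
exact = np.sqrt(4 - t * t) / (2 * np.pi)
assert abs(density - exact) < 1e-6
```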
Wigner random matrix and Wigner’s semicircle
[Figure: eigenvalue histogram of a Wigner matrix with the semicircle density overlaid]
For our basic random matrix ensembles one can derive equations for the Cauchy transform of the limiting eigenvalue distribution, solve those equations and then get the density via Stieltjes inversion:
Wishart random matrix
λ/(1 − G(z)) + 1/G(z) = z,
which can be solved as
G(z) = (z + 1 − λ − √((z − (1 + λ))² − 4λ))/(2z) and thus
dµ(t) = (1/(2πλt))√(4λ − (t − (1 + λ))²) dt
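Again a small numerical sketch (not from the lecture): the closed-form G above satisfies the stated equation. Since both roots of the underlying quadratic zG² − (z + 1 − λ)G + 1 = 0 satisfy it, this check is independent of the branch choice.

```python
import numpy as np

lam = 0.25  # the ratio lambda from the slides

def G_wishart(z):
    # closed-form solution of the equation lam/(1 - G) + 1/G = z
    return (z + 1 - lam - np.sqrt((z - (1 + lam)) ** 2 - 4 * lam)) / (2 * z)

z = 3.0 + 1.0j
g = G_wishart(z)
assert abs(lam / (1 - g) + 1 / g - z) < 1e-10
```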
Wishart random matrix and the Marchenko-Pastur distribution
[Figure: eigenvalue histogram of a Wishart matrix with the Marchenko-Pastur density overlaid]
The Saga Begins ...
.... Consider Functions of Several Independent
Random Matrices
We are interested in the limiting eigenvalue distribution of
functions of several N ×N random matrices for N →∞.
Typical phenomena:
• almost sure convergence to a deterministic limit eigenvalue
distribution
• this limit distribution can be effectively calculated
Wigner + Wishart random matrices, N = 3000
[Figure: eigenvalue histograms for three realizations of the sum of a Wigner and a Wishart matrix, N = 3000]
We are interested in the limiting eigenvalue distribution of
functions of several N ×N random matrices for N →∞.
Typical phenomena:
• almost sure convergence to a deterministic limit eigenvalue
distribution
• this limit distribution can be effectively calculated only in
very simple situations
For simple situations one can derive equations for the Cauchy transform of the limiting eigenvalue distribution; those can usually not be solved explicitly; however, as fixed point equations they have good analytic behaviour and can be solved numerically by iteration algorithms.
Wigner + Wishart: For G(z) := G_{Wigner+Wishart}(z) one finds the fixed point equation (in subordination form)
G(z) = G_{Wishart}(z − G(z)),
which can be easily solved by iteration.
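The iteration is easy to sketch numerically (a minimal illustration, using the closed-form Wishart Cauchy transform from the earlier slide; the ratio λ = 0.25 and the test point are arbitrary choices):

```python
import numpy as np

lam = 0.25  # ratio lambda = N/M, as on the Wishart slide

def G_wishart(z):
    # closed-form Cauchy transform of the limiting Wishart distribution,
    # choosing the square-root branch so that G(z) ~ 1/z on C^+
    s = np.sqrt((z - (1 + lam)) ** 2 - 4 * lam)
    if np.imag(s) * np.imag(z) < 0:
        s = -s
    return (z + 1 - lam - s) / (2 * z)

def G_sum(z, iterations=100):
    # subordination fixed point: G(z) = G_wishart(z - G(z))
    g = 1 / z  # initial guess with the right asymptotics
    for _ in range(iterations):
        g = G_wishart(z - g)
    return g

z = 2.0 + 2.0j
g = G_sum(z)
assert abs(g - G_wishart(z - g)) < 1e-10  # the fixed point equation holds
assert np.imag(g) < 0  # a Cauchy transform maps C^+ to C^-
```

For Im z = 2, the iteration map is a strict contraction (|G′| ≤ 1/Im(z)² ≤ 1/4 for a probability measure), so the convergence here is guaranteed and fast; the density is then obtained by Stieltjes inversion near the real axis.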
Wigner + Wishart random matrices, N = 3000
[Figure: eigenvalue histogram of Wigner + Wishart, N = 3000]
Results for Calculations of the Limit Eigenvalue
Distribution
• Marchenko, Pastur 1967: general Wishart matrices ADA∗
• Pastur 1972: deterministic + Wigner (deformed semicircle)
• Speicher, Nica 1998; Vasilchuk 2003: commutator or anti-commutator: X1X2 ± X2X1
• more general models in wireless communications (Tulino, Verdu 2004; Couillet, Debbah, Silverstein 2011): RADA∗R or ∑_i R_iA_iD_iA_i∗R_i
The Quest:
But What About More Complicated or Even
General Selfadjoint Polynomials
.... something like
P(X,Y ) = XY + YX + X²
or
P(X1, X2, X3) = X1X2X1 + X2X3X2 + X3X1X3
or even just
P(X1, . . . , Xk), P a selfadjoint polynomial
P(X,Y ) = XY + YX + X², for independent X, Y ; X is Wigner, Y is Wishart
[Figure: eigenvalue histograms for N = 100, N = 300, and two realizations with N = 3000]
The Hero:
Free Probability Theory
Definition of Freeness (Voiculescu 1985)
Let (A, ϕ) be a non-commutative probability space, i.e., A is a unital algebra and ϕ : A → C is a unital linear functional (i.e., ϕ(1) = 1).
Unital subalgebras Ai (i ∈ I) are free or freely independent, if ϕ(a1 · · · an) = 0 whenever
• ai ∈ Aj(i), j(i) ∈ I for all i, and j(1) ≠ j(2) ≠ · · · ≠ j(n)
• ϕ(ai) = 0 for all i
Random variables x1, . . . , xn ∈ A are freely independent if their generated unital subalgebras Ai := algebra(1, xi) are so.
What Is Freeness?
Freeness between x and y is an infinite set of equations relating various moments in x and y:
ϕ(p1(x) q1(y) p2(x) q2(y) · · · ) = 0
Basic observation: free independence between x and y is actually a rule for calculating mixed moments in x and y from the moments of x and the moments of y:
ϕ(x^{m1} y^{n1} x^{m2} y^{n2} · · · ) = polynomial(ϕ(x^i), ϕ(y^j))
If x and y are freely independent, then we have
ϕ(x^m y^n) = ϕ(x^m) · ϕ(y^n)
ϕ(x^{m1} y^n x^{m2}) = ϕ(x^{m1+m2}) · ϕ(y^n)
but also
ϕ(xyxy) = ϕ(x²) · ϕ(y)² + ϕ(x)² · ϕ(y²) − ϕ(x)² · ϕ(y)²
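The mixed-moment formula can be checked numerically against large independent random matrices, anticipating the asymptotic freeness result quoted later (an illustrative sketch only; the matrix size, the seed, and the shift 0.5·I are arbitrary choices, and equality holds only up to finite-N fluctuations):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000

def wigner(n):
    # symmetric random matrix with entries of variance 1/n (semicircular limit)
    a = rng.standard_normal((n, n)) / np.sqrt(n)
    return (a + a.T) / np.sqrt(2)

X = wigner(N)
Y = wigner(N) + 0.5 * np.eye(N)  # an independent shifted Wigner, so phi(y) != 0

def tr(a):
    # normalized trace, playing the role of phi = (1/N) Tr
    return np.trace(a) / N

lhs = tr(X @ Y @ X @ Y)
# freeness rule: phi(xyxy) = phi(x^2) phi(y)^2 + phi(x)^2 phi(y^2) - phi(x)^2 phi(y)^2
rhs = tr(X @ X) * tr(Y) ** 2 + tr(X) ** 2 * tr(Y @ Y) - tr(X) ** 2 * tr(Y) ** 2
assert abs(lhs - rhs) < 0.05
```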
Free independence is a rule for calculating mixed moments,
analogous to the concept of independence for random variables.
Note: free independence is a different rule from classical indepen-
dence; free independence occurs typically for non-commuting
random variables, like operators on Hilbert spaces
Consequence: The Distribution of a Polynomial in
Freely Independent Variables Is Determined by the
Distributions of the Variables
If x1, . . . , xk are freely independent, and p is a polynomial in k
variables, then the distribution of p(x1, . . . , xk) is determined by
the moments of each of the xi and by the fact that they are
freely independent.
Where Does Free Independence Show Up?
• generators of the free group in the corresponding free group
von Neumann algebras L(Fn)
• creation and annihilation operators on full Fock spaces
• for many classes of random matrices
Asymptotic Freeness of Random Matrices
Basic result of Voiculescu (1991):
Large classes of independent random matrices (like Wigner or Wishart matrices) become asymptotically freely independent, with respect to ϕ = (1/N)Tr, as N →∞.
Consequence: Reduction of Our Random Matrix
Problem to the Problem of Polynomials in Freely
Independent Variables
If the random matrices X1, . . . , Xk are asymptotically freely independent, then the distribution of a polynomial p(X1, . . . , Xk) is
asymptotically given by the distribution of p(x1, . . . , xk), where
• x1, . . . , xk are freely independent variables, and
• the distribution of xi is the asymptotic distribution of Xi
Can We Actually Calculate Polynomials in
Freely Independent Variables?
Free probability can deal effectively with simple polynomials
• the sum of variables (Voiculescu 1986, R-transform)
p(x, y) = x+ y
• the product of variables (Voiculescu 1987, S-transform)
p(x, y) = xy (= √x y √x)
• the commutator of variables (Nica, Speicher 1998)
p(x, y) = xy − yx
There is no hope of effectively calculating more
complicated or general polynomials in freely
independent variables with usual free probability
theory!
The Superhero:
Operator-Valued Extension of Free
Probability
Let B ⊂ A. A linear map
E : A → B
is a conditional expectation if
E[b] = b ∀b ∈ B
and
E[b1ab2] = b1E[a]b2 ∀a ∈ A, ∀b1, b2 ∈ B
An operator-valued probability space consists of B ⊂ A and a
conditional expectation E : A → B
Consider an operator-valued probability space E : A → B.
Random variables xi ∈ A (i ∈ I) are freely independent with
respect to E (or operator-valued freely independent) if
E[a1 · · · an] = 0
whenever ai ∈ B〈xj(i)〉 are polynomials in some xj(i) with coefficients from B and
E[ai] = 0 for all i and j(1) ≠ j(2) ≠ · · · ≠ j(n).
Calculation Rule for Mixed Moments
For operator-valued freely independent variables, one has analogous formulas as in the scalar-valued case, ...
The formula
ϕ(xyxy) = ϕ(xx)ϕ(y)ϕ(y) + ϕ(x)ϕ(x)ϕ(yy) − ϕ(x)ϕ(y)ϕ(x)ϕ(y)
has now to be written as
E[xyxy] = E[x E[y] x] · E[y] + E[x] · E[y E[x] y] − E[x]E[y]E[x]E[y]
Can We Actually Calculate Polynomials in
Operator-Valued Freely Independent Variables?
Again, in principle all operator-valued polynomials in freely independent variables are determined, but effectively we can again only deal with simple polynomials:
• the sum of variables (Voiculescu 1995; Belinschi, Mai, Speicher 2012)
• the product of variables (Voiculescu 1995; Dykema 2006; Belinschi, Speicher, Treilhard, Vargas 2012)
The Miracle:
The Linearization Trick
Operator-Valued Polynomials Are Matrices of
Polynomials
Operator-valued polynomials in variables x1, . . . , xk are matrices
with entries given by polynomials in those random variables:
( p11(x1, . . . , xk) · · · p1r(x1, . . . , xk) )
( ...                 . . . ...                )
( pr1(x1, . . . , xk) · · · prr(x1, . . . , xk) )
The Linearization Philosophy:
In order to understand matrices of polynomials it suffices to understand (bigger) matrices of linear polynomials.
In particular, in order to understand polynomials in non-commuting variables, it suffices to understand matrices of linear polynomials in those variables.
• Voiculescu 1987: motivation
• Haagerup, Thorbjørnsen 2005: largest eigenvalue
• Anderson 2012: the selfadjoint version
The selfadjoint linearization of p = xy + yx + x² is
p =
( 0        x    y + x/2 )
( x        0    −1      )
( y + x/2  −1   0       )
This means: the Cauchy transform Gp(z) of p = xy + yx + x²
is given as the (1,1)-entry of the operator-valued (3×3 matrix)
Cauchy transform of p:
Gp(b) = id⊗ϕ[(b − p)⁻¹] =
( Gp(z) ∗ ∗ )
( ∗     ∗ ∗ )
( ∗     ∗ ∗ )
for b =
( z 0 0 )
( 0 0 0 )
( 0 0 0 )
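For commuting (scalar) x and y the linearization claim can be checked directly by a Schur complement computation (a sanity sketch only; in the actual theory x and y are non-commuting operators and one applies id⊗ϕ):

```python
import numpy as np

x, y, z = 0.7, 1.3, 2.0 + 1.0j  # scalar stand-ins for x, y and a point z in C^+

# selfadjoint linearization of p = xy + yx + x^2
p_hat = np.array([
    [0,         x,  y + x / 2],
    [x,         0,  -1       ],
    [y + x / 2, -1, 0        ],
], dtype=complex)

b = np.diag([z, 0, 0])

# the (1,1)-entry of (b - p_hat)^{-1} should equal 1/(z - p)
p = x * y + y * x + x**2
entry = np.linalg.inv(b - p_hat)[0, 0]
assert abs(entry - 1 / (z - p)) < 1e-12
```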
But
p =
( 0        x    y + x/2 )
( x        0    −1      )
( y + x/2  −1   0       )
= x + y
with
x =
( 0    x   x/2 )
( x    0   0   )
( x/2  0   0   )
and y =
( 0  0   y  )
( 0  0   −1 )
( y  −1  0  )
So p is just the sum of two operator-valued variables
p =
( 0    x   x/2 )
( x    0   0   )
( x/2  0   0   )
+
( 0  0   y  )
( 0  0   −1 )
( y  −1  0  )
where we understand the operator-valued distributions of x and of y.
Are x and y freely independent?
Another Miracle
Matrices of Freely Independent Variables are Matrix-Valued Freely Independent
If x and y are freely independent with respect to ϕ, then for any polynomials pij in x and any polynomials qkl in y the matrices
( p11(x) . . . p1r(x) )
( ...    . . . ...    )
( pr1(x) . . . prr(x) )
and
( q11(y) . . . q1r(y) )
( ...    . . . ...    )
( qr1(y) . . . qrr(y) )
are free with respect to id ⊗ ϕ.
The Final Battle:
Algorithm and Calculation for Arbitrary
Selfadjoint Polynomial in Freely
Independent Variables
Input: p(x, y), Gx(z), Gy(z)
↓
Linearize p(x, y) to p = x + y
↓
Compute Gx(b) out of Gx(z) and Gy(b) out of Gy(z)
↓
Get ω1(b) as the fixed point of the iteration
w ↦ Gy(b + Gx(w)⁻¹ − w)⁻¹ − (Gx(w)⁻¹ − w)
↓
Gp(b) = Gx(ω1(b))
↓
Recover Gp(z) as one entry of Gp(b)
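The subordination step can be illustrated in the scalar case (a hedged sketch: here Gx and Gy are ordinary Cauchy transforms of two standard semicircle laws, so the fixed point must reproduce the semicircle of variance 2, the free convolution of the two; the operator-valued algorithm replaces these numbers by matrices):

```python
import numpy as np

def G_semi(z, var=1.0):
    # Cauchy transform of a semicircle law of variance var,
    # with the square-root branch giving G(z) ~ 1/z on C^+
    s = np.sqrt(z * z - 4 * var)
    if np.imag(s) * np.imag(z) < 0:
        s = -s
    return (z - s) / (2 * var)

def h(G, w):
    # the "h-transform" h(w) = G(w)^{-1} - w appearing in the iteration
    return 1 / G(w) - w

def omega1(b, Gx, Gy, iterations=200):
    # fixed point of w -> Gy(b + h_x(w))^{-1} - h_x(w)
    w = b
    for _ in range(iterations):
        w = 1 / Gy(b + h(Gx, w)) - h(Gx, w)
    return w

Gx = Gy = G_semi
b = 1.0 + 2.0j
g = Gx(omega1(b, Gx, Gy))
# the result agrees with the semicircle of variance 2
assert abs(g - G_semi(b, var=2.0)) < 1e-8
```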
P(X,Y ) = XY + YX + X², for independent X, Y ; X is Wigner and Y is Wishart
[Figure: eigenvalue histogram with the computed distribution of p(x, y) = xy + yx + x² overlaid, for free x, y; x semicircular, y Marchenko-Pastur]
P(X1, X2, X3) = X1X2X1 + X2X3X2 + X3X1X3, for independent X1, X2, X3; X1, X2 Wigner, X3 Wishart
[Figure: eigenvalue histogram with the computed distribution of p(x1, x2, x3) = x1x2x1 + x2x3x2 + x3x1x3 overlaid, for free x1, x2, x3; x1, x2 semicircular, x3 Marchenko-Pastur]
The Happy End
Theorem (Belinschi, Mai, Speicher 2012):
Combining the selfadjoint linearization trick with our
new analysis of operator-valued free convolution, we
can provide an efficient and analytically controllable
algorithm for calculating the asymptotic eigenvalue
distribution of any selfadjoint polynomial in
asymptotically free random matrices.