arXiv:1609.03120v2 [math-ph] 15 Sep 2016

RANDOM MATRIX ENSEMBLES WITH SPLIT LIMITING BEHAVIOR

PAULA BURKHARDT, PETER COHEN, JONATHAN DEWITT, MAX HLAVACEK, STEVEN J. MILLER, CARSTEN SPRUNGER, YEN NHI TRUONG VU, ROGER VAN PESKI, AND KEVIN YANG

ABSTRACT. We introduce a new family of $N \times N$ random real symmetric matrix ensembles, the $k$-checkerboard matrices, whose limiting spectral measure has two components which can be determined explicitly. All but $k$ eigenvalues are in the bulk, and their behavior, appropriately normalized, converges to the semi-circle as $N \to \infty$; the remaining $k$ are tightly constrained near $N/k$ and their distribution converges to the $k \times k$ hollow GOE ensemble (this is the density arising by modifying the GOE ensemble by forcing all entries on the main diagonal to be zero). Similar results hold for complex and quaternionic analogues. We isolate the two regimes by using matrix perturbation results and a nonstandard weight function for the eigenvalues, then derive their limiting distributions using a modification of the method of moments and analysis of the resulting combinatorics.

CONTENTS

1. Introduction
1.1. Background
1.2. Generalized Checkerboard Ensembles
1.3. Results
2. The Bulk Spectral Measure
3. The Blip Spectral Measure
4. Generalizations to $\mathbb{C}$ and $\mathbb{H}$
5. Almost-sure convergence
Appendix A. Details for the Bulk
Appendix B. Proof of Two Regimes
Appendix C. Proof of Lemma 3.16
Appendix D. Bounds for $X^{(r)}_{m,N}$
Appendix E. Moment Convergence Theorem
References

Date: September 16, 2016.
2010 Mathematics Subject Classification. 15B52 (primary), 15B57, 15B33 (secondary).
Key words and phrases. Random Matrix Ensembles, Checkerboard Matrices, Limiting Spectral Measure, Gaussian Orthogonal Ensemble, Gaussian Unitary Ensemble, Gaussian Symplectic Ensemble.
The authors were partially supported by NSF Grants DMS1265673, DMS1561945, DMS1449679 and DMS1347804, Amherst College, the University of Michigan, Princeton University and Williams College. We thank Eyvindur Palsson, Arup Bose and Aaditya Sharma for helpful conversations. This work was supervised by the fifth named author at the Williams SMALL REU program; the first, third and ninth authors determined the behavior in the bulk and began the investigations of the blip during the 2015 SMALL program, which were undertaken by the remaining authors the following year.

1. INTRODUCTION

1.1. Background. Since their introduction by Wishart [Wis] in the 1920s in statistics, the distribution of eigenvalues of random matrix ensembles has played a major role in a variety of fields, especially in nuclear physics and number theory; see for example the surveys [Bai, BFMT-B, Con, FM, KaSa, KeSn] and the textbooks [Fo, Meh, MT-B, Tao2]. One of the central results in the subject is Wigner's semi-circle law. Inspired by studies of energy levels of heavy nuclei, Wigner conjectured that their energy levels are well-modeled by eigenvalues of a random matrix ensemble, and he and others proved that in many matrix ensembles the distribution of the scaled eigenvalues of a typical matrix converges, in some sense, to the semi-circle distribution [Wig1, Wig2, Wig3, Wig4, Wig5].

Which matrix ensemble models the system depends on its physical symmetries. Though the most used in physics and number theory are the Gaussian Orthogonal, Unitary and Symplectic Ensembles, it is of interest to study other families. In many cases the additional symmetry constraints on the matrix (for example, requiring it to be Toeplitz or circulant or arising from a $d$-regular graph) lead to a different density of states. There is now an extensive literature on the density of eigenvalues of special ensembles; see for example [Bai, BasBo1, BasBo2, BanBo, BLMST, BCG, BHS1, BHS2, BM, BDJ, GKMN, HM, JMRR, JMP, Kar, KKMSX, LW, MMS, MNS, MSTW, McK, Me, Sch], where many of them have limiting spectral measures different from the semi-circle (though recent work, see [ERSY, ESY, TV1, TV2] among others, shows that in many cases the spacing between normalized eigenvalues is universal and equals that of the Gaussian ensembles).

In many of these special ensembles, while one is able to prove the density of eigenvalues of a typical matrix converges to a limiting spectral measure, one cannot write down a nice, closed-form expression for this limiting distribution (notable exceptions are $d$-regular graphs [McK], block circulant matrices [KKMSX] and palindromic Toeplitz matrices [MMS]). In what follows we study a new ensemble of 'checkerboard' matrices, the eigenvalues of which are split into two types, each of which converges to a different limiting spectral distribution which can be solved for explicitly. Most of the eigenvalues are of order $\sqrt{N}$ and converge to a semi-circle; however, a small number are of size $\Theta(N)$ and converge to new limiting measures related to the Gaussian ensembles. We define these matrices in the next section, and then summarize our findings and the techniques developed to study such split behavior.

1.2. Generalized Checkerboard Ensembles. Our arguments apply with only minor modification to the reals, complex numbers and quaternions, and show connections between the checkerboard and Gaussian ensembles. As we often use $i$ as an index of summation, we use $\hat{\imath} := \sqrt{-1}$ and similarly $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$ for the quaternions. Additionally, we index the entries $m_{ij}$ of a matrix starting at $0$ to simplify certain congruence conditions.

Definition 1.1. Fix $D = \mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$, $k \in \mathbb{N}$, $w \in \mathbb{R}$. Then the $N \times N$ $(k, w)$-checkerboard ensemble over $D$ is the ensemble of matrices $M = (m_{ij})$ given by

\[
m_{ij} = \begin{cases} a_{ij} & \text{if } i \not\equiv j \bmod k \\ w & \text{if } i \equiv j \bmod k \end{cases} \tag{1.1}
\]

where $a_{ij} = \overline{a_{ji}}$ and

\[
a_{ij} = \begin{cases}
r_{ij} & \text{if } D = \mathbb{R} \\[4pt]
\dfrac{r_{ij} + b_{ij}\hat{\imath}}{\sqrt{2}} & \text{if } D = \mathbb{C} \\[8pt]
\dfrac{r_{ij} + b_{ij}\hat{\imath} + c_{ij}\hat{\jmath} + d_{ij}\hat{k}}{2} & \text{if } D = \mathbb{H}
\end{cases} \tag{1.2}
\]

with $r_{ij}$, $b_{ij}$, $c_{ij}$, and $d_{ij}$ i.i.d. random variables with mean 0, variance 1, and finite higher moments, and the probability measure on the ensemble given by the natural product probability measure. We refer to the $(k, 1)$-checkerboard ensemble over $D$ simply as the $k$-checkerboard ensemble over $D$.

When not stated or otherwise clear from context, we assume that $D = \mathbb{R}$ when talking about $k$-checkerboard matrices. We use $w = 1$ throughout for simplicity, since only slight alterations are needed to make the results hold for any $w \neq 0$.

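For example, a $(2, w)$-checkerboard matrix $A$ would be of the form

\[
A = \begin{pmatrix}
w & a_{0\,1} & w & a_{0\,3} & w & \cdots & a_{0\,N-1} \\
a_{0\,1} & w & a_{1\,2} & w & a_{1\,4} & \cdots & w \\
w & a_{1\,2} & w & a_{2\,3} & w & \cdots & a_{2\,N-1} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
a_{0\,N-1} & w & a_{2\,N-1} & w & a_{4\,N-1} & \cdots & w
\end{pmatrix}. \tag{1.3}
\]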
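As an illustration of Definition 1.1 in the real case, the following Python sketch samples a $(k, w)$-checkerboard matrix. It is not part of the original argument: the function name `checkerboard`, the seed, and the choice of standard Gaussian entries (one admissible mean-0, variance-1 distribution) are assumptions made for the example.

```python
import random

def checkerboard(N, k, w=1.0, seed=None):
    """Sample a real N x N (k, w)-checkerboard matrix as a list of rows.

    Entries with i = j (mod k) equal w; all other entries are symmetric
    i.i.d. standard Gaussians (an assumed choice of distribution).
    Indices start at 0, as in the text.
    """
    rng = random.Random(seed)
    M = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i, N):
            value = w if (i - j) % k == 0 else rng.gauss(0.0, 1.0)
            M[i][j] = M[j][i] = value
    return M

# A small (2, 1)-checkerboard matrix; compare with the pattern in (1.3).
A = checkerboard(6, 2, w=1.0, seed=0)
for row in A:
    print(" ".join(f"{x:6.2f}" for x in row))
```

The printed matrix shows the entry $w = 1$ on every position with $i \equiv j \bmod 2$ and random symmetric entries elsewhere.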

1.3. Results. Let $\nu_{A,N}$ be the empirical spectral measure of an $N \times N$ matrix $A$, where we have normalized the eigenvalues by dividing by $\sqrt{N}$:

\[
\nu_{A,N} = \frac{1}{N} \sum_{i=1}^{N} \delta\left(x - \frac{\lambda_i}{\sqrt{N}}\right), \tag{1.4}
\]

where the $\{\lambda_i\}_{i=1}^{N}$ are the eigenvalues of $A$. Here, we use $A$ and $N$ in the subscript to highlight both the matrix and its size. Wigner's semicircle law states that for many random matrix ensembles, for almost all sequences $\{A_N\}_{N \in \mathbb{N}}$ of $N \times N$ matrices $A_N$, we have weak convergence of empirical spectral measures $\nu_{A_N,N}$ as $N \to \infty$ to the semicircle measure of radius $R$, $\sigma_R$, which has density

\[
\begin{cases} \dfrac{2}{\pi R^2}\sqrt{R^2 - x^2} & \text{if } |x| \le R \\[6pt] 0 & \text{if } |x| > R. \end{cases} \tag{1.5}
\]

Note that for $R \neq 1$ the 'semicircle' distribution is actually a semi-ellipse with horizontal axis $R$. While one can renormalize the eigenvalues by a constant independent of $N$ to rescale to a semicircle, we will see below that in our setting that constant would depend on $k$. We prefer not to introduce a renormalization dependent on $k$, as it makes no material difference.
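As a sanity check on the normalization of the semicircle density of radius $R$, namely $\frac{2}{\pi R^2}\sqrt{R^2 - x^2}$ on $|x| \le R$, the sketch below integrates it numerically with a midpoint rule. The helper names and the number of steps are assumptions of the example; the value $R = 2\sqrt{1 - 1/k}$ with $k = 3$ is the radius that appears for checkerboard matrices later in the paper.

```python
import math

def semicircle_density(x, R):
    """Density of the semicircle measure of radius R, as in (1.5)."""
    if abs(x) > R:
        return 0.0
    return 2.0 / (math.pi * R * R) * math.sqrt(R * R - x * x)

def semicircle_moment(ell, R, steps=200_000):
    """Midpoint-rule approximation of the ell-th moment on [-R, R]."""
    h = 2.0 * R / steps
    total = 0.0
    for s in range(steps):
        x = -R + (s + 0.5) * h
        total += x ** ell * semicircle_density(x, R)
    return total * h

R = 2.0 * math.sqrt(1.0 - 1.0 / 3.0)
print("total mass    :", semicircle_moment(0, R))
print("second moment :", semicircle_moment(2, R), "vs R^2/4 =", R * R / 4.0)
```

The total mass should be close to 1 and the second moment close to $R^2/4$, which is the standard value for a semicircle of radius $R$.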

For the ensembles mentioned in §1.1 one is able to determine the limiting spectral measure through the method of moments. The situation is more subtle here. As we argue later, the $k$-checkerboard matrices have $k$ eigenvalues of size $N/k$. As the variance of these eigenvalues is of order $k$, for fixed $k$ we see these eigenvalues are well-separated from the $N - k$ eigenvalues that are of order $\sqrt{N}$. In fact, by using a matrix perturbation approach we are able to establish the following result, which we prove in Appendix B:

Theorem 1.2. Let $\{A_N\}_{N \in \mathbb{N}}$ be a sequence of $(k, w)$-checkerboard matrices. Then almost surely as $N \to \infty$ the eigenvalues of $A_N$ fall into two regimes: $N - k$ of the eigenvalues are $O(N^{1/2+\epsilon})$ and $k$ eigenvalues are of magnitude $Nw/k + O(N^{1/2+\epsilon})$.

We refer to the $N - k$ eigenvalues that are on the order of $\sqrt{N}$ as the eigenvalues in the bulk, while the $k$ eigenvalues near $N/k$ are called the eigenvalues in the blip. See [CHS] for some general results about a class of random matrices exhibiting a different kind of split behavior.


While the presence of these $k$ large eigenvalues prevents us from using one of the standard techniques, the method of moments, to determine the limiting density of the eigenvalues in the bulk, numerics (see Figure 1) suggest that the limit is a semi-ellipse.

FIGURE 1. A histogram, normalized appropriately to achieve unit mass, of the scaled eigenvalue distribution for $100 \times 100$ 2-checkerboard real matrices with $w = 1$ after 500 trials.

The following result (see [Tao1]) allows us to bypass the complications presented by the small number of large eigenvalues.

Theorem 1.3 ([Tao1]). Let $\{A_N\}_{N \in \mathbb{N}}$ be a sequence of random Hermitian matrix ensembles such that $\{\nu_{A_N,N}\}_{N \in \mathbb{N}}$ converges weakly almost surely to a limit $\nu$. Let $\{\tilde{A}_N\}_{N \in \mathbb{N}}$ be another sequence of random matrix ensembles such that $\frac{1}{N}\operatorname{rank}(\tilde{A}_N)$ converges almost surely to zero. Then $\{\nu_{A_N + \tilde{A}_N,N}\}_{N \in \mathbb{N}}$ converges weakly almost surely to $\nu$.

Taking $\tilde{A}_N$ to be the fixed matrix with entries $m_{ij} = 1_{i \equiv j \pmod{k}}$ implies that the limiting spectral distribution of the $k$-checkerboard ensemble as defined previously with $w = 1$ is the same as the limiting spectral distribution of the ensemble with $w = 0$, which does not have the $k$ large blip eigenvalues (for the remainder of this paper, $A_N$ always refers to an $N \times N$ matrix). This overcomes the issue of diverging moments.
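The rank hypothesis of Theorem 1.3 is easy to check for the perturbation used here: the $0/1$ matrix with entries $1_{i \equiv j \pmod{k}}$ has only $k$ distinct rows, so its rank is $k$ and $\operatorname{rank}/N \to 0$ for fixed $k$. The sketch below confirms this by exact Gaussian elimination over the rationals; the helper names `rank` and `indicator_matrix` are assumptions of the example.

```python
from fractions import Fraction

def indicator_matrix(N, k):
    """N x N matrix with entry 1 where i = j (mod k) and 0 elsewhere."""
    return [[1 if (i - j) % k == 0 else 0 for j in range(N)] for i in range(N)]

def rank(rows):
    """Rank via exact forward elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            if M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

for N, k in [(12, 3), (20, 4), (30, 2)]:
    rk = rank(indicator_matrix(N, k))
    print(f"N={N}, k={k}: rank={rk}, rank/N={rk / N:.3f}")
```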

Theorem 1.4. Let $\{A_N\}_{N \in \mathbb{N}}$ be a sequence of real $N \times N$ $k$-checkerboard matrices. Then the empirical spectral measures $\nu_{A_N,N}$ converge weakly almost surely to the semicircle distribution.

The proof is by standard combinatorial arguments. We give the details in Appendix A.

On the other hand, the blip is where the vast majority of interesting behavior and technical challenges are encountered. We begin with some heuristic arguments which give intuition for how the blip arises and behaves.

Firstly, recall that a matrix $A$ for which the sum of all entries in any given row is equal to some fixed $d$ has the trivial eigenvalue $d$ with eigenvector $(1, 1, \ldots, 1)^T$. For a matrix in the $N \times N$ $k$-checkerboard ensemble, the sum of the $i$th row is equal to $N/k + \sum_{j=1}^{N} a_{ij}$ where the $a_{ij}$ are i.i.d. with mean $0$ and variance $1$. This is approximately $N/k$, so heuristically there should be an eigenvector very close to $(1, 1, \ldots, 1)^T$ with eigenvalue roughly $N/k$. Similarly, there are $k - 1$ other eigenvalues of size approximately $N/k$ with eigenvectors close to the one described previously with some additional periodic sign changes.
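The heuristic can be checked numerically without an eigenvalue solver. The sketch below is an illustration only: it runs power iteration from the all-ones vector on one sampled real $k$-checkerboard matrix and reports the resulting Rayleigh quotient, which should land near $N/k$ because the bulk eigenvalues are much smaller. The Gaussian entries, the seed, the sizes $N = 120$, $k = 3$ and the helper names are all assumptions of the example.

```python
import math
import random

def checkerboard(N, k, w, rng):
    """Real (k, w)-checkerboard matrix with standard Gaussian a_ij (assumed)."""
    M = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i, N):
            M[i][j] = M[j][i] = w if (i - j) % k == 0 else rng.gauss(0.0, 1.0)
    return M

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rayleigh_after_power_iteration(M, iters=60):
    """Power iteration started at (1, ..., 1); returns the Rayleigh quotient."""
    v = [1.0] * len(M)
    for _ in range(iters):
        u = matvec(M, v)
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    return sum(x * y for x, y in zip(v, matvec(M, v)))

N, k = 120, 3
A = checkerboard(N, k, 1.0, random.Random(1))
lam = rayleigh_after_power_iteration(A)
print("Rayleigh quotient:", lam, " N/k:", N / k)
```

Since the $k$ blip eigenvalues are all close to $N/k$, the iteration need not have converged to a single eigenvector after 60 steps; the Rayleigh quotient is then a weighted average of blip eigenvalues, which is enough for this check.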

Hence the blip may be thought of as deviations about the trivial eigenvalues. The surprising result of this paper is that these deviations, while seemingly quite different from the eigenvalue distributions of classical random matrix theory, in fact have the same distribution as the eigenvalues of the following $k \times k$ random matrix sub-ensemble of the classical Gaussian Orthogonal Ensemble (GOE).

Definition 1.5. The hollow Gaussian Orthogonal Ensemble is given by $A = (a_{ij}) = A^T$ with

\[
a_{ij} = \begin{cases} \mathcal{N}_{\mathbb{R}}(0, 1) & \text{if } i \neq j \\ 0 & \text{if } i = j. \end{cases} \tag{1.6}
\]

The spectral distribution of the $2 \times 2$ hollow GOE is Gaussian (see Proposition 3.18), and in the $k \to \infty$ limit the eigenvalue distribution is a semicircle by standard GOE arguments. For larger finite $k$ we see an interesting sequence of distributions which interpolate between the Gaussian and the semicircle, similarly to the results in [KKMSX] for block circulant matrices. The first few are shown in Figures 2 and 3.
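To make Definition 1.5 concrete, the sketch below samples hollow GOE matrices. For $k = 2$ the matrix is $\begin{pmatrix} 0 & g \\ g & 0 \end{pmatrix}$ with $g$ standard Gaussian, whose characteristic polynomial is $\lambda^2 - g^2$, so its eigenvalues are $\pm g$. For general $k$ the code estimates $\mathbb{E}\,\mathrm{Tr}\,B^2$ by Monte Carlo; its exact value is $k(k-1)$, since $\mathrm{Tr}\,B^2 = \sum_{i \neq j} b_{ij}^2$. The function names, seed and trial count are assumptions of the example.

```python
import random

def hollow_goe(k, rng):
    """Sample a k x k hollow GOE matrix: zero diagonal, symmetric N(0,1) entries."""
    B = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            B[i][j] = B[j][i] = rng.gauss(0.0, 1.0)
    return B

def trace_of_square(B):
    """Tr(B^2) computed as the sum over i, j of b_ij * b_ji."""
    k = len(B)
    return sum(B[i][j] * B[j][i] for i in range(k) for j in range(k))

rng = random.Random(0)

# k = 2: the eigenvalues are the roots of lambda^2 - g^2, i.e. -|g| and |g|.
B2 = hollow_goe(2, rng)
g = B2[0][1]
eigs_2x2 = (-abs(g), abs(g))
print("2 x 2 eigenvalues:", eigs_2x2)

# k = 3: Monte Carlo estimate of E Tr B^2, to compare with k(k-1) = 6.
k, trials = 3, 20000
mean_tr2 = sum(trace_of_square(hollow_goe(k, rng)) for _ in range(trials)) / trials
print("estimated E Tr B^2:", mean_tr2, " exact:", k * (k - 1))
```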

FIGURE 2. (Left) Histogram of eigenvalues of 32000 $2 \times 2$ hollow GOE matrices. (Right) Histogram of eigenvalues of 32000 $3 \times 3$ hollow GOE matrices.

FIGURE 3. (Left) Histogram of eigenvalues of 32000 $4 \times 4$ hollow GOE matrices. (Right) Histogram of eigenvalues of 32000 $16 \times 16$ hollow GOE matrices.

Computing this distribution poses substantial challenges. Ideally, we would like to define a weighted blip spectral measure which takes into account only the eigenvalues of the blip and not the bulk. Naively, one could multiply the empirical spectral measure by some smooth cutoff function of the form $1_{[N/k - \delta(N),\, N/k + \delta(N)]}$ for $\delta(N)$ growing appropriately to capture all of the blip and neglect the bulk in the limit. However, with such a weighting function we cannot use the eigenvalue-trace formula to reduce the problem to combinatorics on products of matrix entries in the standard way. The next reasonable possibility is to try Taylor expanding a nice cutoff function, for then each expected moment is of the form

\[
\mathbb{E}\left[\sum_{i=0}^{\infty} c_i\, p_i(\lambda_1, \ldots, \lambda_N)\right], \tag{1.7}
\]

where $p_i$ is the power sum symmetric polynomial of degree $i$ and the $\lambda_j$'s are the eigenvalues. Unfortunately, Taylor series convergence and limit-switching issues make this approach untenable.

Hence, we are led to use a polynomial weighting function. No polynomial of fixed degree is a sufficiently good approximation to a smooth cutoff function, so we use a sequence of polynomials of degree increasing with the matrix size $N$ so that in the limit we mimic a smooth cutoff function. Specifically, let

\[
f_n(x) := x^{2n}(x - 2)^{2n}. \tag{1.8}
\]

Thus we alter the standard empirical spectral measure in the following way to capture the blip.

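Definition 1.6. The empirical blip spectral measure associated to an $N \times N$ $k$-checkerboard matrix $A$ is

\[
\mu_{A,N} := \frac{1}{k} \sum_{\lambda \text{ eigenvalue of } A} f_{n(N)}\left(\frac{k\lambda}{N}\right) \delta\left(x - \left(\lambda - \frac{N}{k}\right)\right) \tag{1.9}
\]

where $n(N)$ is a function for which there exists some $\epsilon$ so that $N^{\epsilon} \ll n(N) \ll N^{1-\epsilon}$.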

At a blip eigenvalue $\lambda \approx N/k$, we have $f_n\left(\frac{\lambda}{N/k}\right) \approx 1$; because the standard deviation of the bulk eigenvalues $\lambda'$ is on the order of $\sqrt{N}$, $f_n\left(\frac{\lambda'}{N/k}\right) \approx 0$ for any bulk eigenvalue $\lambda'$. Because $f_n(1) = 1$, $f_n(0) = 0$, and $f_n'(1) = 0 = f_n'(0) = \cdots = f_n^{(2n-1)}(0)$, the bulk eigenvalues are given weight roughly $0$ and the blip eigenvalues are all given weight roughly $1$, and small deviations about these weights disappear in the limit.
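The following sketch evaluates the weight $f_n(x) = x^{2n}(x-2)^{2n}$ at representative arguments to illustrate these claims. The concrete values are assumptions chosen for the example: $N = 10^4$, $k = 2$, $n = \sqrt{N}$ (which satisfies $N^{\epsilon} \ll n \ll N^{1-\epsilon}$), a blip eigenvalue at $N/k + 3$ and a bulk eigenvalue at $2\sqrt{N}$.

```python
def f(n, x):
    """Weight polynomial f_n(x) = x^(2n) (x - 2)^(2n) = (x (x - 2))^(2n)."""
    return (x * (x - 2.0)) ** (2 * n)

N, k = 10**4, 2
n = round(N ** 0.5)                  # assumed choice n(N) = sqrt(N)

blip_weight = f(n, (N / k + 3.0) / (N / k))      # lambda = N/k + 3
bulk_weight = f(n, (2.0 * N ** 0.5) / (N / k))   # lambda' = 2 sqrt(N)

# f_n is symmetric about x = 1, so a central difference there is ~0.
h = 1e-4
derivative_at_1 = (f(n, 1.0 + h) - f(n, 1.0 - h)) / (2.0 * h)

print("f_n(1) =", f(n, 1.0), " f_n(0) =", f(n, 0.0))
print("blip weight:", blip_weight)
print("bulk weight:", bulk_weight)
print("f_n'(1) (numerical):", derivative_at_1)
```

The blip weight stays very close to 1 while the bulk weight is vanishingly small, which is the separation the weighting is designed to produce.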

Remark 1.7. The authors experimented with several other sequences of polynomials and all give the same end results under some suitable conditions, but this one simplifies computations. Furthermore, it is nonnegative, ensuring that the empirical blip spectral measure is actually a measure. It is almost a probability measure, i.e., for a typical matrix $\mu_{A,N}(\mathbb{R})$ is close to $1$. To make $\mu_{A,N}$ a probability measure we would need to divide by the sum of the weights associated to the eigenvalues, but the expected value of this quotient is intractable, so we instead divide by $k$.

Definition 1.6 finally allows reduction to tractable combinatorics. Interestingly, this combinatorics reduces back to random matrix theory, yielding convergence in expectation of the moments of the weighted blip spectral measure of the $k$-checkerboard matrix ensemble to those of the $k \times k$ hollow GOE. However, we cannot show almost-sure weak convergence of measures by standard arguments because (a) due to the weighting function, the blip empirical spectral measure is no longer a probability measure, and (b) the number of eigenvalues in the blip is fixed so there are not enough to average over. We modify the moment convergence theorem to overcome the first difficulty, and average over the eigenvalues of multiple independent matrices to overcome the second.

We now state this result formally.

Definition 1.8. Fix a function $g : \mathbb{N} \to \mathbb{N}$. The averaged empirical blip spectral measure associated to a $g(N)$-tuple of $N \times N$ $k$-checkerboard matrices $(A_N^{(1)}, A_N^{(2)}, \ldots, A_N^{(g(N))})$ is

\[
\mu_{N,g,A_N^{(1)},A_N^{(2)},\ldots,A_N^{(g(N))}} := \frac{1}{g(N)} \sum_{i=1}^{g(N)} \mu_{A_N^{(i)},N}. \tag{1.10}
\]

Theorem 1.9. Let $g : \mathbb{N} \to \mathbb{N}$ be such that there exists a $\delta > 0$ for which $g(N) \gg N^{\delta}$. Let $A^{(i)} = \{A_N^{(i)}\}_{N \in \mathbb{N}}$ be sequences of fixed $N \times N$ matrices, and let $A = \{A^{(i)}\}_{i \in \mathbb{N}}$ be a sequence of such sequences. Then, as $N \to \infty$, the averaged empirical blip spectral measures $\mu_{N,g,A_N^{(1)},A_N^{(2)},\ldots,A_N^{(g(N))}}$ of the $k$-checkerboard ensemble over $\mathbb{R}$ converge weakly almost-surely to the measure with moments equal to the expected moments of the standard empirical spectral measure of the $k \times k$ hollow Gaussian Orthogonal Ensemble.

One can also naturally define the hollow Gaussian Unitary Ensemble and the hollow Gaussian Symplectic Ensemble by extending Definition 1.5 to complex valued matrices comprised of complex Gaussians and quaternion valued matrices comprised of quaternion Gaussians, respectively. In Theorem 4.3 we obtain analogous results to Theorem 1.9, connecting the limiting blip spectral measure of the $k$-checkerboard ensembles over $\mathbb{C}$ and $\mathbb{H}$ to the empirical spectral measures of the hollow GUE and hollow GSE, respectively.

In §2 we prove our claims concerning the eigenvalues in the bulk, then turn to the blip spectral measure in §3 (and the mentioned generalizations in §4). We then prove results on the convergence to the limiting spectral measure in §5.

2. THE BULK SPECTRAL MEASURE

In this section we establish that the limiting bulk measure for $k$-checkerboard matrices follows a semi-circle law. We denote by $\mu^{(m)}$ the $m$th moment of the measure $\mu$.

Theorem 2.1. Let $\{A_N\}_{N \in \mathbb{N}}$ be a sequence of $N \times N$ $(k, 1)$-checkerboard matrices, and let $\nu_{A_N}$ denote the empirical spectral measure. Then $\nu_{A_N}$ converges weakly almost surely to the Wigner semicircle measure $\sigma_R$ with radius

\[
R = 2\sqrt{1 - 1/k}. \tag{2.1}
\]

One common tool used to study the limiting spectral density of a matrix ensemble is the method of moments. However, this method cannot be applied directly to the study of checkerboard matrices when studying the bulk regime because the limiting expected moments of the empirical spectral measure do not exist. For a proof of their divergence, see Proposition A.1 in the appendix. The following result overcomes this difficulty by allowing us to treat the $w$ entries as $0$.

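Theorem 2.2 ([Tao1]). Let $\{A_N\}_{N \in \mathbb{N}}$ be a sequence of random Hermitian matrix ensembles such that $\{\nu_{A_N,N}\}_{N \in \mathbb{N}}$ converges weakly almost surely to a limit $\nu$. Let $\{\tilde{A}_N\}_{N \in \mathbb{N}}$ be another sequence of random matrix ensembles such that $\frac{1}{N}\operatorname{rank}(\tilde{A}_N)$ converges almost surely to zero. Then $\{\nu_{A_N + \tilde{A}_N,N}\}_{N \in \mathbb{N}}$ converges weakly almost surely to $\nu$.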

We now use the method of moments to establish the result for $(k, 0)$-checkerboard matrices. The main work is using combinatorics to establish convergence of the expected moments. The remaining arguments establishing almost sure weak convergence are standard and may be found in Appendix A.


Lemma 2.3. The expected moments of the bulk empirical spectral measure taken over $A_N$ in the $N \times N$ $(k, 0)$-checkerboard ensemble converge to the moments of the Wigner semicircle distribution $\sigma_R$ with radius $R = 2\sqrt{1 - 1/k}$:

\[
\mathbb{E}\left[\nu_{A_N}^{(\ell)}\right] \to \sigma_R^{(\ell)} \tag{2.2}
\]

as $N \to \infty$.

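Proof. We have immediately from the eigenvalue-trace lemma and linearity of expectation that

\[
\mathbb{E}\left[\nu_{A_N}^{(\ell)}\right] = \frac{1}{N^{\ell/2+1}} \sum_{1 \le i_1, \ldots, i_\ell \le N} \mathbb{E}\left[a_{i_1 i_2} \cdots a_{i_{\ell-1} i_\ell} a_{i_\ell i_1}\right]. \tag{2.3}
\]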

Each term in the sum is associated to a sequence $I = i_1 i_2 \ldots i_\ell i_1$. Each sequence corresponds to a closed walk on the complete graph with vertices labeled by the elements of the set $\{i_1, \ldots, i_\ell\}$ by giving the order in which the vertices are visited. Define the weight of such a sequence $I$ to be the number of distinct entries of $I$. If the weight of a walk is greater than $\ell/2 + 1$, the walk contributes nothing to the sum because some entry in the product is independent of the rest and has expectation $0$.

The sequences of weight less than $\ell/2 + 1$ contribute $o(N^{\ell/2+1})$ to the sum. This is because the sequences may be partitioned into a finite number of equivalence classes by the isomorphism class of the corresponding walk. An isomorphism class of weight $t$ then gives rise to $O(N^t)$ walks of weight $t$ by choosing labels for the distinct nodes in any such walk.

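The sequences of weight $\ell/2 + 1$ require a finer analysis. When $\ell$ is odd, the expectation associated to each such sequence is $0$. When $\ell$ is even, the walk corresponding to such a sequence visits $\ell/2 + 1$ nodes and traverses $\ell/2$ distinct edges. Hence as the walk is connected, it is a tree. Moreover, each walk may be rooted by associating the initial node of the walk to the root. As is well known, there are $C_{\ell/2}$ rooted trees on $\ell/2 + 1$ nodes, where $C_\ell$ is the $\ell$th Catalan number. We may then label the nodes in the tree in such a way that no two adjacent nodes have the same congruence class modulo $k$ in $N^{\ell/2+1}\left(\frac{k-1}{k}\right)^{\ell/2} + o\left(N^{\ell/2+1}\right)$ ways. Writing $\zeta_I$ for $a_{i_1 i_2} \cdots a_{i_{\ell-1} i_\ell} a_{i_\ell i_1}$, we have

\[
\begin{aligned}
\mathbb{E}\left[\nu_{A_N}^{(\ell)}\right]
&= \frac{1}{N^{\ell/2+1}} \left( \sum_{\text{weight } I < \ell/2+1} \mathbb{E}[\zeta_I] + \sum_{\text{weight } I = \ell/2+1} \mathbb{E}[\zeta_I] + \sum_{\text{weight } I > \ell/2+1} \mathbb{E}[\zeta_I] \right) \\
&= \frac{1}{N^{\ell/2+1}} \left( o\left(N^{\ell/2+1}\right) + C_{\ell/2}\left( N^{\ell/2+1}\left(\frac{k-1}{k}\right)^{\ell/2} + o\left(N^{\ell/2+1}\right) \right) + 0 \right) \\
&= C_{\ell/2}\left(\frac{k-1}{k}\right)^{\ell/2} + o(1).
\end{aligned} \tag{2.4}
\]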

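Hence

\[
\lim_{N \to \infty} \mathbb{E}\left[\nu_{A_N}^{(\ell)}\right] = \begin{cases} \left(\dfrac{R}{2}\right)^{\ell} C_{\ell/2} & \text{if } \ell \text{ is even} \\[6pt] 0 & \text{otherwise,} \end{cases} \tag{2.5}
\]

which are the moments of the semicircle distribution of radius $R$.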
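For small $N$ the expected moments in (2.3) can be computed exactly by enumerating all index sequences, which gives a concrete check of the limit. The sketch below does this for the $(k, 0)$-checkerboard ensemble under the additional assumption that the $a_{ij}$ are standard Gaussians (so $\mathbb{E}[a^2] = 1$ and $\mathbb{E}[a^4] = 3$); the function names and the sizes $N = 12$, $k = 3$ are choices made for the example, and the moment table only covers $\ell \le 4$.

```python
from fractions import Fraction
from itertools import product
from math import comb

# E[g^m] for a standard Gaussian g; odd multiplicities give 0 via .get().
GAUSS_MOMENT = {2: 1, 4: 3}

def expected_moment(N, k, ell):
    """Exact E[nu^(ell)] from (2.3) for the (k, 0)-checkerboard ensemble.

    Valid for ell <= 4 (the moment table above stops at the fourth moment).
    """
    total = 0
    for idx in product(range(N), repeat=ell):
        counts = {}
        vanishes = False
        for t in range(ell):
            i, j = idx[t], idx[(t + 1) % ell]
            if (i - j) % k == 0:          # this entry is w = 0
                vanishes = True
                break
            edge = (min(i, j), max(i, j))
            counts[edge] = counts.get(edge, 0) + 1
        if vanishes:
            continue
        term = 1
        for multiplicity in counts.values():
            term *= GAUSS_MOMENT.get(multiplicity, 0)
        total += term
    return Fraction(total, N ** (ell // 2 + 1))

def limiting_moment(k, ell):
    """C_{ell/2} ((k-1)/k)^(ell/2) for even ell, and 0 for odd ell."""
    if ell % 2:
        return Fraction(0)
    half = ell // 2
    catalan = comb(2 * half, half) // (half + 1)
    return catalan * Fraction(k - 1, k) ** half

N, k = 12, 3
for ell in (2, 3, 4):
    print(ell, expected_moment(N, k, ell), limiting_moment(k, ell))
```

At finite $N$ the fourth moment is slightly larger than its limit because lower-weight walks still contribute at order $1/N$; the second moment already equals $1 - 1/k$ exactly when $k$ divides $N$.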


3. THE BLIP SPECTRAL MEASURE

To appropriately modify the measure in (1.4), we weight by the polynomial

\[
f_n(x) = x^{2n}(x - 2)^{2n} \tag{3.1}
\]

and study the following spectral measure.

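Definition 3.1. The empirical blip spectral measure associated to an $N \times N$ $k$-checkerboard matrix $A$ is

\[
\mu_{A,N} := \frac{1}{k} \sum_{\lambda \text{ an eigenvalue of } A} f_{n(N)}\left(\frac{k\lambda}{N}\right) \delta\left(x - \left(\lambda - \frac{N}{k}\right)\right), \tag{3.2}
\]

where $n(N)$ is a function for which there exists some $\epsilon$ so that $N^{\epsilon} \ll n(N) \ll N^{1-\epsilon}$; the particular choice is not important as long as these conditions are satisfied.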

The modified spectral measure of Definition 1.6 weights eigenvalues within the blip by almost exactly 1, due to the scaling, and those in the bulk are weighted by almost exactly zero. We shift the eigenvalues by subtracting roughly the mean of the blip in order to center the blip rather than the bulk. This does not truly center the blip, but causes the center to remain fixed as $N \to \infty$; we compute the limiting moments of this measure and center later.

First, we explicitly derive a formula for the expected $m$th moment of the blip spectral measure given in (1.9), where the expectation is taken over the $N \times N$ $k$-checkerboard ensemble.

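Lemma 3.2. The expected $m$th moment of the blip empirical spectral measure, $\mu_{A,N}$, is

\[
\mathbb{E}\left[\mu_{A,N}^{(m)}\right] = \frac{1}{k}\left(\frac{k}{N}\right)^{2n} \sum_{j=0}^{2n} \binom{2n}{j} \sum_{i=0}^{m+j} \binom{m+j}{i} \left(-\frac{N}{k}\right)^{m-i} \mathbb{E}\operatorname{Tr} A^{2n+i} \tag{3.3}
\]

where both expectations are taken over the $N \times N$ $k$-checkerboard ensemble.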

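Proof. We have

\[
\begin{aligned}
\mathbb{E}\left[\mu_{A,N}^{(m)}\right]
&= \frac{1}{k}\,\mathbb{E}\left[\sum_{\lambda} f_n\left(\frac{\lambda}{N/k}\right)\left(\lambda - \frac{N}{k}\right)^{m}\right] \\
&= \frac{1}{k}\,\mathbb{E}\left[\sum_{\lambda} \left(\frac{k\lambda}{N}\right)^{2n} \sum_{j=0}^{2n} \binom{2n}{j} (-1)^j \left(\frac{k}{N}\right)^{j} \left(-\frac{N}{k}\right)^{j} \sum_{i=0}^{m+j} \binom{m+j}{i} \left(-\frac{N}{k}\right)^{m-i} \lambda^{i}\right] \\
&= \frac{1}{k}\left(\frac{k}{N}\right)^{2n} \sum_{j=0}^{2n} \binom{2n}{j} \sum_{i=0}^{m+j} \binom{m+j}{i} \left(-\frac{N}{k}\right)^{m-i} \mathbb{E}\left[\sum_{\lambda} \lambda^{2n+i}\right] \\
&= \frac{1}{k}\left(\frac{k}{N}\right)^{2n} \sum_{j=0}^{2n} \binom{2n}{j} \sum_{i=0}^{m+j} \binom{m+j}{i} \left(-\frac{N}{k}\right)^{m-i} \mathbb{E}\operatorname{Tr} A^{2n+i},
\end{aligned} \tag{3.4}
\]

where the first equality comes from straightforward algebra using binomial expansion and the last equality comes from the Eigenvalue-Trace Lemma.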
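The algebra behind Lemma 3.2 rests on a polynomial identity for a single eigenvalue $\lambda$: writing $x = k\lambda/N$, one has $(x-2)^{2n} = \sum_j \binom{2n}{j}(-1)^j (x-1)^j$ and $x - 1 = \frac{k}{N}(\lambda - N/k)$, after which $(\lambda - N/k)^{m+j}$ is expanded binomially. The sketch below checks the resulting identity in exact rational arithmetic, without the prefactor $1/k$ and the sum over eigenvalues; the function names and test values are assumptions of the example.

```python
from fractions import Fraction
from math import comb

def weighted_power(lam, N, k, n, m):
    """f_n(k lam / N) * (lam - N/k)^m, computed directly."""
    x = Fraction(k) * lam / N
    return x ** (2 * n) * (x - 2) ** (2 * n) * (lam - Fraction(N, k)) ** m

def expanded(lam, N, k, n, m):
    """Per-eigenvalue double sum from (3.3), with Tr A^(2n+i) replaced by lam^(2n+i)."""
    total = Fraction(0)
    for j in range(2 * n + 1):
        for i in range(m + j + 1):
            total += (comb(2 * n, j) * comb(m + j, i)
                      * Fraction(-N, k) ** (m - i) * lam ** (2 * n + i))
    return Fraction(k, N) ** (2 * n) * total

lam = Fraction(53, 7)   # arbitrary rational stand-in for an eigenvalue
for (N, k, n, m) in [(12, 3, 1, 0), (12, 3, 2, 3), (10, 2, 3, 5)]:
    print(N, k, n, m, weighted_power(lam, N, k, n, m) == expanded(lam, N, k, n, m))
```

Negative exponents $m - i$ occur when $i > m$; they are harmless here because $-N/k \neq 0$, and `Fraction` handles them exactly.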

Now, recall that

\[
\mathbb{E}\operatorname{Tr} M^n = \sum_{1 \le i_1, \ldots, i_n \le N} \mathbb{E}\left[m_{i_1 i_2} m_{i_2 i_3} \cdots m_{i_n i_1}\right]. \tag{3.5}
\]


We refer to terms $\mathbb{E}[m_{i_1 i_2} m_{i_2 i_3} \cdots m_{i_n i_1}]$ as cyclic products and $m$'s as entries of cyclic products. By Lemma 3.2, it suffices to understand the cyclic products making up $\mathbb{E}\operatorname{Tr} A^{2n+i}$, which reduces to a combinatorics problem of understanding the contributions of different cyclic products. We develop the following vocabulary to classify types of cyclic products according to the aspects of their structure that determine overall contributions.

Definition 3.3. A block is a set of adjacent $a$'s surrounded by $w$'s in a cyclic product, where the last entry of a cyclic product is considered to be adjacent to the first. We refer to a block of length $\ell$ as an $\ell$-block or sometimes a block of size $\ell$.

Definition 3.4. A configuration is the set of all cyclic products for which it is specified (a) how many blocks there are, and of what lengths, and (b) in what order these blocks appear. However, it is not specified how many $w$'s there are between each block.

Example 3.5. The set of all cyclic products of the form $w \cdots w a w \cdots w a a w \cdots w a w \cdots w$, where each $\cdots$ represents a string of $w$'s and the indices are not yet specified, is a configuration.

Definition 3.6. Let $S$ be a multiset of natural numbers. An $S$-class, or class when $S$ is clear from context, is the set of all configurations for which there exists a unique $s$-block for every $s \in S$ counting multiplicity. In other words, two configurations in the same class must have the same blocks but they may be ordered differently and have different numbers of $w$'s between them.
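The vocabulary of blocks, configurations and classes can be made concrete by encoding a cyclic product as a string over the letters `a` and `w` with the indices suppressed. The sketch below is an illustrative encoding, not notation from the paper: `cyclic_blocks` returns the block lengths in cyclic order (the data of a configuration), and `s_class` returns the sorted multiset $S$ of block lengths (the data of a class).

```python
def cyclic_blocks(pattern):
    """Lengths of maximal runs of 'a' in a cyclic word over {'a', 'w'}.

    The last letter is treated as adjacent to the first (Definition 3.3).
    Lengths are listed in cyclic order starting after the first 'w'.
    """
    if "w" not in pattern:
        return [len(pattern)] if pattern else []
    start = pattern.index("w")
    rotated = pattern[start:] + pattern[:start]   # now begins with 'w'
    return [len(run) for run in rotated.split("w") if run]

def s_class(pattern):
    """The multiset S of block lengths, as a sorted list (Definition 3.6)."""
    return sorted(cyclic_blocks(pattern))

# One member of the configuration in Example 3.5: blocks of lengths 1, 2, 1.
print(cyclic_blocks("wwawwaawwaww"), s_class("wwawwaawwaww"))

# A block may wrap around the end of the word.
print(cyclic_blocks("awwaawa"))
```

Two strings with the same `cyclic_blocks` output up to rotation differ only in how many $w$'s separate the blocks, which is exactly the freedom a configuration leaves open.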

When we speak of the contribution of a configuration or class to $\mathbb{E}\operatorname{Tr} A^{2n+i}$, we assume that the length of the cyclic product is fixed at $2n + i$. The length of the cyclic product is suppressed in our notation because $n(N)$ varies with $N$ and we wish to consider the contribution of a configuration or class as $N \to \infty$.

Definition 3.7. Given a configuration, a matching is an equivalence relation $\sim$ on the $a$'s in the cyclic product which constrains the ways of indexing (see Definition 3.10) the $a$'s as follows: an indexing of $a$'s conforms to a matching $\sim$ if, for any two $a$'s $a_{i_\ell, i_{\ell+1}}$ and $a_{i_t, i_{t+1}}$, we have $\{i_\ell, i_{\ell+1}\} = \{i_t, i_{t+1}\}$ if and only if $a_{i_\ell i_{\ell+1}} \sim a_{i_t i_{t+1}}$. We further constrain that each $a$ is matched with at least one other by any matching $\sim$.

Remark 3.8. Noting that the $a_{ij}$ are drawn from a mean-0 distribution, any matching with an unmatched $a$ would not contribute in expectation, hence it suffices to only consider those with the $a$'s matched at least in pairs.

Example 3.9. Given a configuration $a_{i_1 i_2} w_{i_2 i_3} a_{i_3 i_4} w_{i_4 i_5} a_{i_5 i_6} w_{i_6 i_7} a_{i_7 i_8} w_{i_8 i_1}$ (the indices are not yet specified because this is a configuration), if $a_{i_1 i_2} \sim a_{i_5 i_6}$ we must have either $i_1 = i_5$ and $i_2 = i_6$ or $i_1 = i_6$ and $i_2 = i_5$.

Definition 3.10. Given a configuration, matching, and length of the cyclic product, an indexing is a choice of

(1) the (positive) number of $w$'s between each pair of adjacent blocks (in the cyclic sense), and
(2) the integer indices of each $a$ and $w$ in the cyclic product.

Two comments on these definitions are in order.

Remark 3.11. It is very important to note that the definitions of class, configuration, and matching do not fix the length of the cyclic product and hence we may consider their contribution as $n(N)$ grows; however, because the length of the product directly affects the number of indexings, we must take it into account when summing over them.


Remark 3.12. Note that the choice of indexings is constrained by the configuration as well as the matching, because entries $a_{ij}$ have $i \not\equiv j \pmod{k}$ and $w_{ij}$ have $i \equiv j \pmod{k}$. This is important later.

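With the above vocabulary, we have

\[
\mathbb{E}\operatorname{Tr} A^{\eta} = \sum_{S\text{-classes } \mathcal{C}} \; \sum_{\text{configurations } C \in \mathcal{C}} \; \sum_{\text{matchings } M} \; \sum_{\text{indexings } I \text{ given } M, C, \eta} \mathbb{E}[\Pi] \tag{3.6}
\]

where $\Pi$ is the cyclic product given by the choice of indexing.

The following lemma allows us to determine which $S$-classes contribute to the trace terms in (3.3) and which contributions become insignificant in the limit as $N \to \infty$.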

Lemma 3.13. In the limit as $N \to \infty$, the only classes which contribute are those with only $1$- or $2$-blocks, in which each $1$-block is matched with exactly one other $1$-block, and both $a$'s in any $2$-block are matched with their adjacent entry and no others.

Proof. Fix the number of blocks $\beta$ in the classes we consider. We refer to the power of $N$ in the contribution of our class as its number of degrees of freedom; each degree of freedom corresponds to the choice of an index in the cyclic product. For a fixed configuration, we consider the total number of degrees of freedom lost by the constraints placed by a matching. Because we have fixed the number of blocks, we may then talk about the average number of degrees of freedom lost per block. Given a $2$-block $a_{ij} a_{j\ell}$, matching the two $a$'s constrains $\ell = i$ and hence loses one degree of freedom; if two singletons $a_{ij}$ and $a_{t\ell}$ are matched then $\{i, j\} = \{t, \ell\}$ so two degrees of freedom are lost. Therefore if all blocks have size $1$ or $2$ and the matchings are as in the hypotheses, then one degree of freedom per block is lost when averaged over all blocks, no matter the configuration or length of the cyclic product. Thus classes in which more than one degree of freedom per block is lost do not contribute in the $N \to \infty$ limit, so it suffices to show any classes and matchings which are not as specified in the lemma statement lose more than one degree of freedom per block.

Fix a configuration $C$ with $\alpha$ $a$'s and a matching $\sim$. Then $\sim$ partitions the $a$'s in $C$ into equivalence classes $T_1, \ldots, T_s$. If there were no matching restrictions, only the restriction that the first index of an $a$ matches the second index of the preceding one, then the number of degrees of freedom from choosing the indices of the $a$'s would be

\[
M = \sum_{\text{blocks } b} (\operatorname{len}(b) + 1) = \beta + \alpha. \tag{3.7}
\]

Let $F$ be the actual number of degrees of freedom from choosing the indices of the $a$'s, given our configuration and matching. Naively, we may choose two indices for each matching class $T_1, \ldots, T_s$, but then there may be restrictions from $a$'s from different matching classes being adjacent that cause a loss of degrees of freedom. Letting $c$ be the number of degrees of freedom lost to such crossovers, the number of degrees of freedom we have is $2s - c$. Then the number of degrees of freedom lost per block is

\[
\frac{M - (2s - c)}{\beta} = 1 + \frac{\alpha + c - 2s}{\beta}. \tag{3.8}
\]

It thus suffices to show that our configuration and matching are of the form specified in the lemma statement if and only if $\alpha + c - 2s = 0$, or equivalently $\frac{\alpha + c}{s} = 2$, and that $\frac{\alpha + c}{s} > 2$ for any other configuration and matching. The forward direction was proven in the beginning of this proof.

For the backward direction, because $|T_i| \ge 2$ for all $i$ by the definition of matching, we immediately have $\frac{\alpha}{s} \ge 2$. If there is some $T_i$ with $|T_i| > 2$ then we have $\frac{\alpha}{s} > 2$, and if there exist $i, j$ such that an $a$ from $T_i$ is adjacent to an $a$ from $T_j$ then we have $\frac{c}{s} > 0$. Therefore if $\frac{\alpha + c}{s} = 2$ then there is no $T_i$ with $|T_i| > 2$ and there are no $i, j$ such that an $a$ from $T_i$ is adjacent to an $a$ from $T_j$, i.e., the $a$'s are matched in pairs and no unmatched $a$'s are adjacent. This proves the lemma.
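The bookkeeping in (3.7) and (3.8) can be checked on small cases. The following is a minimal sketch, not part of the original argument; the function name `dof_lost_per_block` and the three sample configurations are illustrative assumptions.

```python
from fractions import Fraction


def dof_lost_per_block(alpha: int, beta: int, s: int, c: int) -> Fraction:
    """Average number of degrees of freedom lost per block, as in (3.8).

    alpha: number of a's, beta: number of blocks,
    s: number of matching classes, c: degrees of freedom lost to crossovers.
    """
    unconstrained = beta + alpha   # M from (3.7)
    actual = 2 * s - c             # two indices per matching class, minus crossovers
    return Fraction(unconstrained - actual, beta)


# Three matched 2-blocks: 6 a's, 3 blocks, 3 matching classes, no crossovers.
three_matched_two_blocks = dof_lost_per_block(alpha=6, beta=3, s=3, c=0)
# Two 1-blocks matched with each other: 2 a's, 2 blocks, 1 matching class.
two_matched_singletons = dof_lost_per_block(alpha=2, beta=2, s=1, c=0)
# Two 2-blocks whose four a's all lie in one matching class.
one_class_of_four = dof_lost_per_block(alpha=4, beta=2, s=1, c=0)
```

The first two cases lose exactly one degree of freedom per block, while the class of size four loses two, in line with the dichotomy the proof establishes.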

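We now explicitly compute the contributions of each of these classes.

Proposition 3.14. The total contribution to $E \operatorname{Tr} A^{\eta}$ of an $S$-class $C$ with $m_1$ 1-blocks and $(|S| - m_1)$ 2-blocks is

\[
p(\eta) \binom{|S|}{m_1} (k-1)^{|S|-m_1} E_k \operatorname{Tr} B^{m_1} \left( \left(\frac{N}{k}\right)^{\eta-|S|} + O\!\left( \left(\frac{N}{k}\right)^{\eta-|S|-1} \right) \right) \tag{3.9}
\]

where

\[
p(\eta) = \frac{\eta^{|S|}}{|S|!} + O(\eta^{|S|-1}) \tag{3.10}
\]

and the expectation $E_k \operatorname{Tr} B^{m_1}$ is taken over the $k \times k$ hollow GOE as defined in Definition 1.5.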

Proof. Let $A = m_1 + 2(|S| - m_1)$ denote the number of $a$'s in $C$. Let $p(\eta)$ be the number of ways to arrange $|S|$ blocks and $(\eta - A)$ $w$'s into a cyclic product of length $\eta$, where the blocks are taken to be indistinguishable. We first compute $p(\eta)$. We may think of the configurations in $C$ by bijectively identifying them with the set of $(\eta - (A - |S|))$-gons with $|S|$ non-adjacent vertices labeled by $a$ (these correspond to blocks of any size, not just 1-blocks) and the rest labeled by $w$. Each $a$ vertex corresponds to a particular block and each $w$ vertex corresponds to a $w$ in the configuration; these are on a polygon rather than a straight line because the first and last entries of a cyclic product are considered adjacent (see Definition 3.3).

We may calculate $p(\eta)$ by first examining all $\binom{\eta - (A - |S|)}{|S|}$ possible choices of distinct vertices, then subtracting off all the cases for which at least one pair of the selected vertices is adjacent. If one pair is adjacent, then, with no other restrictions placed upon the other vertices, there are $\eta - (A - |S|)$ possible locations for the 2-block to be placed. This leaves $\binom{\eta - A + |S| - 2}{|S| - 2}$ possible locations for the remaining labels. As such, the term that must be subtracted off has degree in $\eta$ strictly less than $|S|$. Hence

\[
p(\eta) = \binom{\eta - (A - |S|)}{|S|} + O(\eta^{|S|-1}) = \frac{\eta^{|S|}}{|S|!} + O\!\left(\eta^{|S|-1}\right). \tag{3.11}
\]
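The counting step behind (3.11) can be illustrated by brute force. This is a sketch under the assumption that the relevant objects are $s$-subsets of the vertices of an $n$-gon with no two chosen vertices adjacent; the helper name `count_nonadjacent` is hypothetical, and the closed form used for comparison is the standard count of non-consecutive selections on a cycle, not a formula taken from the text.

```python
from itertools import combinations
from math import comb


def count_nonadjacent(n: int, s: int) -> int:
    """Count s-subsets of the vertices 0..n-1 of an n-gon with no two adjacent."""
    total = 0
    for subset in combinations(range(n), s):
        chosen = set(subset)
        # A subset is valid when no chosen vertex has its cyclic successor chosen.
        if all((v + 1) % n not in chosen for v in subset):
            total += 1
    return total


def excluded(n: int, s: int) -> int:
    """Number of s-subsets discarded because some pair of vertices is adjacent."""
    return comb(n, s) - count_nonadjacent(n, s)
```

The number of discarded subsets is at most $n \binom{n-2}{s-2}$ (choose an adjacent pair, then the remaining vertices freely), which has degree $s - 1$ in $n$ and so does not affect the leading term $n^s/s!$.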

Having specified the locations of the blocks, there are $\binom{|S|}{m_1}$ ways to choose which locations have a 1-block and which have a 2-block.

By Definition 1.1, we have that for any entry $a_{ij}$, $i \not\equiv j \bmod k$, and for any entry $w_{ij}$, $i \equiv j \bmod k$. We consider what conditions this places on the indices in a given configuration.

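Example 3.15. Consider the configuration

\[
\cdots a_{i_1 i_2} w_{i_2 i_3} w_{i_3 i_4} a_{i_4 i_5} a_{i_5 i_4} w_{i_4 i_6} a_{i_6 i_7} \cdots. \tag{3.12}
\]

Then we have

\[
i_2 \equiv i_3 \equiv i_4 \equiv i_6 \pmod{k} \tag{3.13}
\]

and these are not congruent to $i_1$, $i_5$ or $i_7$ mod $k$.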

We thus see that the congruence class of the second index of a 1-block determines the congruence classes of the indices of the string of $w$'s to its right. Similarly, the leftmost index $i$ of a matched 2-block $a_{ij}a_{ji}$ determines the rightmost index. Thus the congruence class modulo $k$ of the second index of a 1-block propagates through $w$'s and 2-blocks, and hence determines the congruence class modulo $k$ of the first index of the next 1-block, where 'next' is taken in the cyclic sense for the last 1-block in the cyclic product.
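The congruence constraints in Example 3.15 can be checked mechanically. The sketch below assumes a configuration is encoded as a list of `(kind, i, j)` triples with `kind` equal to `"a"` or `"w"`; this encoding, the function name, and the sample index values are illustrative choices, not notation from the paper.

```python
def satisfies_checkerboard(entries, k):
    """Check the index constraints of Definition 1.1 on a list of entries.

    An 'a' entry needs indices in different classes mod k;
    a 'w' entry needs indices in the same class mod k.
    """
    for kind, i, j in entries:
        same_class = (i - j) % k == 0
        if kind == "w" and not same_class:
            return False
        if kind == "a" and same_class:
            return False
    return True


def example_config(i1, i2, i3, i4, i5, i6, i7):
    """The fragment a w w a a w a from (3.12), written with explicit indices."""
    return [
        ("a", i1, i2), ("w", i2, i3), ("w", i3, i4),
        ("a", i4, i5), ("a", i5, i4), ("w", i4, i6), ("a", i6, i7),
    ]


# With k = 3: i2, i3, i4, i6 all lie in the class 2 mod 3, as (3.13) requires.
good = satisfies_checkerboard(example_config(1, 2, 5, 8, 3, 11, 4), k=3)
# Moving i3 out of that class breaks the two w entries it touches.
bad = satisfies_checkerboard(example_config(1, 2, 6, 8, 3, 11, 4), k=3)
```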

We now claim that the number of ways to choose congruence classes of the indices of the 1-blocks, such that there exists a consistent choice of indices for the other entries given the constraints discussed above, is $E_k \operatorname{Tr} B^{m_1}$. First, note that by the above considerations, the number of ways to choose congruence classes mod $k$ of the indices of the 1-blocks is equal to the number of ways to choose indices of the cyclic product $b_{i_1 i_2} b_{i_2 i_3} \cdots b_{i_{m_1} i_1}$ with $i_1, \dots, i_{m_1} \in \{1, \dots, k\}$ under the restriction $i_j \ne i_{j+1}$ for all $j$.

However, there are two restrictions on our choices of indices. Firstly, given any pair of congruence classes mod $k$, any contributing cyclic product must have an even number of $a$'s with both indices coming from that pair of congruence classes, because the $a$'s must be matched in pairs by Lemma 3.13. Secondly, if there are more than two 1-block $a$'s with indices from the same pair of congruence classes, then there is a choice as to how to match them.¹ Specifically, if there are $2q$ $a$'s with indices from the same pair of congruence classes, then there are $(2q - 1)!!$ ways to match them into pairs.

This means that if we have $q$ $a$'s with indices from the same pair of congruence classes, then there are $0$ ways to get a contributing matching if $q$ is odd and $(q-1)!!$ ways if $q$ is even. But these are exactly the moments of a Gaussian, so given a configuration, the number of ways to specify the congruence classes of the 1-blocks and specify a matching which will contribute in the limit is

\[
\sum_{\substack{1 \le i_1, \dots, i_{m_1} \le k \\ i_j \ne i_{j+1}}} E[b_{i_1 i_2} b_{i_2 i_3} \cdots b_{i_{m_1} i_1}] \tag{3.14}
\]

with each $b_{ij} \sim N(0, 1)$ i.i.d. under the restriction that $b_{ij} = b_{ji}$ and $b_{ii} = 0$ for all $i$. This is exactly $E_k \operatorname{Tr} B^{m_1}$, the expectation over the $k \times k$ hollow GOE as defined in Definition 1.5.

Finally, after specifying these congruence classes, the congruence classes of the indices of the $w$'s and the outer indices $i$ of pairs $a_{ij}a_{ji}$ are determined as argued previously. However, there are still $k - 1$ possible choices of congruence class for each inner index $j$ in each 2-block, because the congruence class of the inner index $j$ must be different from that of the outer one, which is already determined. Therefore there are $(k-1)^{|S|-m_1}$ ways to choose these congruence classes. After all congruence classes are determined, there are $N/k$ choices for each index. However, because there are $|S|$ blocks, by the proof of Lemma 3.13 there are $|S|$ indices which are determined by another. Therefore the contribution from actually specifying the indices is $\left(\frac{N}{k}\right)^{\eta-|S|}$. Therefore, the contribution from choosing the locations of blocks, then locations of 1-blocks, then congruence classes of indices, and finally the indices themselves is

\[
p(\eta) \binom{|S|}{m_1} (k-1)^{|S|-m_1} E_k \operatorname{Tr} B^{m_1} \left( \left(\frac{N}{k}\right)^{\eta-|S|} + O\!\left( \left(\frac{N}{k}\right)^{\eta-|S|-1} \right) \right), \tag{3.15}
\]

where the lower order terms in $N$ come from matchings which were proven in Lemma 3.13 to yield fewer degrees of freedom in $N$.

¹ Recall that we have specified that the indices come from the same congruence class, but we must still specify pairs of $a$'s with indices actually equal.

When computing the $m$th moment, the following combinatorial lemma allows us to cancel the contributions of classes with fewer than $m$ blocks.

Lemma 3.16. For any $0 \le p < m$,

\[
\sum_{j=0}^{m} (-1)^j \binom{m}{j} j^p = 0. \tag{3.16}
\]

Furthermore

\[
\sum_{j=0}^{m} (-1)^{m-j} \binom{m}{j} j^m = m!. \tag{3.17}
\]

The proof is a straightforward calculation; see Appendix C. We are now ready to prove our main result on the moments.
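Both identities of Lemma 3.16 are easy to confirm numerically for small $m$. The following sketch uses exact integer arithmetic; the function name is an illustrative choice, and the convention $0^0 = 1$ (which Python also uses) is assumed for the $p = 0$, $j = 0$ term.

```python
from math import comb, factorial


def alternating_power_sum(m: int, p: int) -> int:
    """Return sum_{j=0}^{m} (-1)^(m-j) * C(m, j) * j^p, using 0**0 == 1."""
    return sum((-1) ** (m - j) * comb(m, j) * j ** p for j in range(m + 1))


# (3.16) uses the sign (-1)^j instead of (-1)^(m-j); the two sums differ only by
# the overall factor (-1)^m, so one vanishes exactly when the other does.
vanishing = all(
    alternating_power_sum(m, p) == 0
    for m in range(1, 9)
    for p in range(m)
)
top_degree = all(alternating_power_sum(m, m) == factorial(m) for m in range(1, 9))
```

In the proof of Theorem 3.17 this is what removes every term of $p(2n+i)$ of degree below $m$ in $i$ and leaves $m!$ times the leading coefficient.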

Theorem 3.17. Denote the centered moments of the empirical blip spectral measure of the $N \times N$ $k$-checkerboard ensemble by $\mu^{(m)}_{A,N}$. Then

\[
\lim_{N \to \infty} E[\mu^{(m)}_{A,N}] = \frac{1}{k} E_k \operatorname{Tr} B^m. \tag{3.18}
\]

Proof. Recall that by Lemma 3.2,

\[
E[\mu^{(m)}_{A,N}] = \frac{1}{k} \left(\frac{k}{N}\right)^{2n} \sum_{j=0}^{2n} \binom{2n}{j} \sum_{i=0}^{m+j} \binom{m+j}{i} \left(-\frac{N}{k}\right)^{m-i} E \operatorname{Tr} A^{2n+i}. \tag{3.19}
\]

We consider which values of $|S|$ allow a class to actually contribute in the limit. For fixed $j$, by the formula for the expected $m$th moment of $\mu$ given in (3.3) and Lemma 3.16, the contribution of an $S$-class cancels if $p(\eta)$ has degree less than $m + j$. Hence by the expression for the degree of $p$ given in Proposition 3.14, an $S$-class cancels if $|S| < m + j$. However, again by Proposition 3.14, the contribution of an $S$-class to $E \operatorname{Tr} A^{\eta}$ is $O(N^{\eta-|S|})$, which if $|S| > m$ contributes an $o(1)$ term to (3.19) after multiplying with the $(k/N)^{2n}$ term. Hence the only contributing $S$-classes have $m + j \le |S| \le m$, i.e., $|S| = m$ and $j = 0$.

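Then we may remove the sum over $j$ to yield

\[
E[\mu^{(m)}_{A,N}] = \frac{1}{k} \left(\frac{k}{N}\right)^{2n} \sum_{i=0}^{m} \binom{m}{i} \left(-\frac{N}{k}\right)^{m-i} E \operatorname{Tr} A^{2n+i}. \tag{3.20}
\]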

By the previous discussion and Lemma 3.13, the terms which contribute to $E \operatorname{Tr} A^{2n+i}$ and do not vanish in the limit arise from classes with $m_1$ 1-blocks and $(m - m_1)$ 2-blocks. By Proposition 3.14, these are of the form

\[
p(2n+i) \binom{m}{m_1} (k-1)^{m-m_1} E_k \operatorname{Tr} B^{m_1} \left( \left(\frac{N}{k}\right)^{(2n+i)-m} + O\!\left( \left(\frac{N}{k}\right)^{(2n+i)-m-1} \right) \right). \tag{3.21}
\]

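Hence the contribution of such a class to $E[\mu^{(m)}_{A,N}]$ in the limit is

\[
\begin{aligned}
&\frac{1}{k} \left(\frac{k}{N}\right)^{2n} \sum_{i=0}^{m} \binom{m}{i} \left(-\frac{N}{k}\right)^{m-i} p(2n+i) \binom{m}{m_1} (k-1)^{m-m_1} E_k \operatorname{Tr} B^{m_1} \left(\frac{N}{k}\right)^{(2n+i)-m} \\
&\qquad = \frac{1}{k} \binom{m}{m_1} (k-1)^{m-m_1} E_k \operatorname{Tr} B^{m_1} \sum_{i=0}^{m} (-1)^{m-i} \binom{m}{i} p(2n+i).
\end{aligned}
\tag{3.22}
\]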

By the first part of Lemma 3.16, all terms in $p(2n+i)$ of degree lower than $m$ in $i$ cancel. Since $p$ is of degree $m$ by Proposition 3.14, only the highest degree term in $i$ contributes, and this term is equal to $i^m/m!$ by the same proposition. Applying the second part of Lemma 3.16, we have that the contribution from our class to the limiting expected $m$th moment is

\[
\frac{1}{k} \binom{m}{m_1} (k-1)^{m-m_1} E_k \operatorname{Tr} B^{m_1} \cdot \frac{1}{m!} \, m! = \frac{1}{k} \binom{m}{m_1} (k-1)^{m-m_1} E_k \operatorname{Tr} B^{m_1}. \tag{3.23}
\]

Summing the above contributions over $m_1$, we have that

\[
\lim_{N \to \infty} E\left[\mu^{(m)}_{A,N}\right] = \frac{1}{k} \sum_{m_1=0}^{m} \binom{m}{m_1} (k-1)^{m-m_1} \cdot E_k \operatorname{Tr} B^{m_1}. \tag{3.24}
\]

It is natural to compute the centered moments of the distribution. The uncentered mean is

\[
E\left[\mu^{(1)}_{A,N}\right] = k - 1. \tag{3.25}
\]

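It is not trivial from the definition that centering the limiting expected moments of $\mu_{A,N}$ yields the limiting expected centered moments of $\mu_{A,N}$, but this can be shown straightforwardly from the definitions so we omit the proof. Now, applying the definition of centered moment to the moments given in (3.24) and reindexing summations gives us that the limiting expected centered moments are

\[
\begin{aligned}
\mu^{(m)}_c &:= \lim_{N \to \infty} E\left[\int (x - \mu^{(1)}_{A,N})^m \, d\mu_{A,N}\right] \\
&= \sum_{m_1=0}^{m} \binom{m}{m_1} (-(k-1))^{m-m_1} \lim_{N \to \infty} E\left[\mu^{(m_1)}_{A,N}\right] \\
&= \sum_{m_1=0}^{m} \left[ \binom{m}{m_1} (-1)^{m-m_1} (k-1)^{m-m_1} \frac{1}{k} \sum_{i=0}^{m_1} \binom{m_1}{i} (k-1)^{m_1-i} \cdot E_k \operatorname{Tr} B^i \right] \\
&= \sum_{m_1=0}^{m} \left[ \binom{m}{m_1} (-1)^{m-m_1} \sum_{i=0}^{m_1} \binom{m_1}{i} (k-1)^{m-i} \frac{1}{k} E_k \operatorname{Tr} B^i \right] \\
&= \sum_{j=0}^{m} \left[ \binom{m}{j} (k-1)^{m-j} \frac{1}{k} E_k \operatorname{Tr} B^j \sum_{m_1=j}^{m} \binom{m-j}{m_1-j} (-1)^{m-m_1} \right],
\end{aligned}
\tag{3.26}
\]

where the last step exchanges the order of summation, renames $i$ to $j$, and uses $\binom{m}{m_1}\binom{m_1}{j} = \binom{m}{j}\binom{m-j}{m_1-j}$.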

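Now consider the inner sum over $m_1$ in the last line of (3.26), which after shifting $m_1 \mapsto m_1 + j$ is equal to

\[
\sum_{m_1=0}^{m-j} \binom{m-j}{m_1} (-1)^{m-j-m_1}. \tag{3.27}
\]

By the binomial theorem this is $(1 - 1)^{m-j}$, which is exactly $\delta_{mj}$, where $\delta_{mj}$ is the Kronecker delta function. From this, the limiting expected centered moments are

\[
\mu^{(m)}_c = \frac{1}{k} E_k \operatorname{Tr} B^m, \tag{3.28}
\]

completing the proof. (Note that $E_k \operatorname{Tr} B^m = 0$ for $m$ odd, so an overall factor of $(-1)^m$ would not change this value.)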

Although the formula given by Theorem 3.17 is implicit, it enables us to compute any $m$th centered moment of the limiting empirical blip spectral measure of the $N \times N$ $k$-checkerboard ensemble by combinatorics on the indices of the corresponding $k \times k$ hollow GOE. We illustrate this with the first few cases.

For ease of notation, we define

\[
M_{k,m} := \frac{1}{k} E_k \operatorname{Tr} B^m. \tag{3.29}
\]

For a fixed value of $k$, the $M_{k,m}$'s are the moments of a spectral measure defined upon the $k \times k$ hollow GOE. Some elementary consequences of this are relevant to our purposes, and we present these below.

Proposition 3.18. For $k = 2$, we have that the $M_{k,m}$'s are the moments of the standard Gaussian.

Proof. For $k = 2$, matrices $B$ in the hollow GOE are of the form

\[
B = \begin{bmatrix} 0 & b \\ b & 0 \end{bmatrix} \tag{3.30}
\]

for $b \sim N_{\mathbb{R}}(0, 1)$. The eigenvalues of $B$ are $\lambda = \pm b$, and the proposition follows immediately.

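Proposition 3.19. We have $M_{k,2} = k - 1$.

Proof. For $B$ a hollow GOE matrix, we have

\[
\frac{1}{k} E_k \operatorname{Tr} B^2 = \frac{1}{k} \sum_{1 \le i, j \le k} E[b_{ij} b_{ji}] = \frac{1}{k} (k^2 - k) = k - 1 \tag{3.31}
\]

upon noting that $E[b_{ij} b_{ji}] = 1$ for $i \ne j$ and $b_{ii} = 0$.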
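The "combinatorics on the indices" can be carried out exactly for small $k$ and $m$. The sketch below expands $E_k \operatorname{Tr} B^m$ as a sum over cyclic index sequences and evaluates each term with the Gaussian moments $E[b^q] = (q-1)!!$ for even $q$ and $0$ for odd $q$. The function names are illustrative, and the enumeration is exponential in $m$, so it is meant only as a check on small cases.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gaussian_moment(q: int) -> int:
    """E[b^q] for a standard real Gaussian b: (q-1)!! if q is even, else 0."""
    if q % 2:
        return 0
    result = 1
    for t in range(q - 1, 0, -2):
        result *= t
    return result


def hollow_goe_trace_moment(k: int, m: int) -> int:
    """E_k Tr B^m for the k x k hollow GOE, by direct expansion (requires m >= 1)."""
    total = 0
    for idx in product(range(k), repeat=m):
        pair_counts = Counter()
        vanishes = False
        for t in range(m):
            i, j = idx[t], idx[(t + 1) % m]
            if i == j:          # diagonal entries are zero
                vanishes = True
                break
            pair_counts[frozenset((i, j))] += 1   # b_ij = b_ji
        if not vanishes:
            term = 1
            for q in pair_counts.values():
                term *= gaussian_moment(q)        # distinct entries are independent
            total += term
    return total


def M(k: int, m: int) -> Fraction:
    """M_{k,m} = (1/k) E_k Tr B^m as in (3.29)."""
    return Fraction(hollow_goe_trace_moment(k, m), k)
```

For example, `M(k, 2)` reproduces Proposition 3.19, and `[M(2, m) for m in range(1, 7)]` gives the standard Gaussian moments of Proposition 3.18.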

4. GENERALIZATIONS TO C AND H

We generalize the result of the previous section to complex and quaternion ensembles. Both cases can be reduced to the arguments of the real case in exactly the same manner, so we show only the proof of the quaternion case. The ensembles were defined in Definition 1.1; note we are using $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ for the imaginary units to avoid confusion with the indices $i, j, k$.

Analogously, we define the hollow GUE and GSE.

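Definition 4.1. The $k \times k$ hollow Gaussian unitary ensemble and hollow Gaussian symplectic ensemble are the ensembles of matrices $B = (b_{ij})$ given by

\[
b_{ij} = \begin{cases} N_{\mathbb{C}}(0, 1) \text{ (resp. } N_{\mathbb{H}}(0, 1)\text{)} & \text{if } i \ne j \\ 0 & \text{if } i = j \end{cases} \tag{4.1}
\]

under the restriction $b_{ij} = \overline{b_{ji}}$. We denote the expectation over these ensembles with respect to the natural product probability measures by $E^{\mathbb{C}}_k \operatorname{Tr}$ and $E^{\mathbb{H}}_k \operatorname{Tr}$.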

We define in addition one new combinatorial notation.

Definition 4.2. A congruence configuration is a configuration together with a choice of the congruence class modulo $k$ of every index of a 1-block.

The following generalizes Theorem 3.17 to complex and quaternion ensembles.

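Theorem 4.3. Let $D = \mathbb{C}, \mathbb{H}$, and let $E_D\left[\mu^{(m)}_{A_N}\right]$ be the expected $m$th moments of the empirical blip spectral measures over the $N \times N$ complex or quaternion $k$-checkerboard ensemble defined as in (3.3). Then

\[
\lim_{N \to \infty} E_D\left[\mu^{(m)}_{A_N}\right] = \frac{1}{k} E^{D}_k \operatorname{Tr}[B^m]. \tag{4.2}
\]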

Proof. We prove the quaternion case, and the complex case follows similarly.

To begin, notice that the first statement in Lemma 3.13 regarding counting the arrangements of blocks applies to this case exactly as it was stated.

Now, notice that the $S$-classes for which blocks have size one or two, which were shown to be the only contributing classes in Lemma 3.13, still have the same number of degrees of freedom in the quaternion case. It is apparent that a configuration in the quaternion case cannot have more degrees of freedom than the analogous configuration given above in the real case. It follows that configurations which do not contribute in the real case also do not contribute in the quaternion case.

Note that because the quaternions are not commutative, we must take care in computing the expectations of the cyclic products. Consider a configuration. Note that the $w$'s in the configuration are real. Further, the matched 2-blocks, i.e., $a_{ij}a_{ji} = |a_{ij}|^2$, are also real. Therefore, all the quaternion-valued 1-blocks in the configuration commute with the $w$'s and the matched 2-blocks. Hence, for all cyclic products, we have that the expectation breaks down as

\[
E[\text{Cyclic Product}] = E[\text{1-blocks (in the order they appear)}] \cdot E[\text{2-blocks and } w\text{'s}]. \tag{4.3}
\]
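The two algebraic facts used here, that quaternions do not commute in general but that a quaternion times its conjugate is real and hence central, can be seen in a few lines. This is a minimal sketch with a hypothetical `Quaternion` class over integer coordinates; it is not tied to any particular quaternion library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quaternion:
    """r + x*i + y*j + z*k with the Hamilton product."""
    r: int
    x: int
    y: int
    z: int

    def __mul__(self, other: "Quaternion") -> "Quaternion":
        return Quaternion(
            self.r * other.r - self.x * other.x - self.y * other.y - self.z * other.z,
            self.r * other.x + self.x * other.r + self.y * other.z - self.z * other.y,
            self.r * other.y - self.x * other.z + self.y * other.r + self.z * other.x,
            self.r * other.z + self.x * other.y - self.y * other.x + self.z * other.r,
        )

    def conj(self) -> "Quaternion":
        return Quaternion(self.r, -self.x, -self.y, -self.z)


I = Quaternion(0, 1, 0, 0)
J = Quaternion(0, 0, 1, 0)
K = Quaternion(0, 0, 0, 1)

a = Quaternion(1, 2, 3, 4)
norm_sq = a * a.conj()   # plays the role of a matched 2-block a_ij a_ji = |a_ij|^2
```

Here `I * J` and `J * I` differ by a sign, while `norm_sq` has no imaginary part and commutes with every quaternion, which is why the matched 2-blocks can be pulled out of the ordered product of 1-blocks.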

By Lemma 3.13 we need only consider matchings where the 1-blocks have different indices from the 2-blocks. Also, given a choice of the congruence classes of the indices of the 1-blocks, we may construct a corresponding product of the entries in the $k \times k$ hollow GSE given by entries in the $k \times k$ hollow GSE whose indices are those prescribing the congruence class choices on the indices of the 1-blocks.

Now, suppose that $\Pi_1$ is a congruence configuration of the $k$-checkerboard matrix, and suppose that $\Pi_2$ is the corresponding product of entries of the $k \times k$ hollow GSE. The $a_{ij}$'s that make up these products are quaternions, and hence they do not necessarily commute under multiplication. To deal with this issue, we distribute the product and commute the summed terms to make sure that all the copies of a distinct random variable are placed adjacently in the product. Upon doing this we can use independence of the random variables to convert the expectation of the product into a product of expectations, which allows us to compute the expectation using the moments of the Gaussian.

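In particular, we let $a_{ij} = \frac{r_{ij} + \mathbf{i}x_{ij} + \mathbf{j}y_{ij} + \mathbf{k}z_{ij}}{2}$. Distributing, we get

\[
a_{i_1 i_2} \cdots a_{i_n i_1} = \frac{1}{2^n} \sum_{c_{ij}} \prod_{1 \le \ell \le n} c_{i_\ell i_{\ell+1}}, \tag{4.4}
\]

where the sum over $c_{ij}$ is over all choices of $c_{ij} \in \{r_{ij}, \mathbf{i}x_{ij}, \mathbf{j}y_{ij}, \mathbf{k}z_{ij}\}$ for each $ij = i_\ell i_{\ell+1}$ and the indices are taken cyclically. Note that this expansion is the same for a product of 1-blocks in the $k$-checkerboard ensemble and a product of entries in the hollow GSE.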

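$\Pi_1$ distributes into a product of Gaussian terms times 1, a product of Gaussian terms times $\mathbf{i}$, a product of Gaussian terms times $\mathbf{j}$, and a product of Gaussian terms times $\mathbf{k}$. We denote these products by $\Pi_1^{\mathrm{Re}}$, $\Pi_1^{\mathbf{i}}$, $\Pi_1^{\mathbf{j}}$, and $\Pi_1^{\mathbf{k}}$, respectively. Similarly define $\Pi_2^{\mathrm{Re}}$, $\Pi_2^{\mathbf{i}}$, $\Pi_2^{\mathbf{j}}$, and $\Pi_2^{\mathbf{k}}$. Now, distribution yields

\[
\Pi_1 = \frac{1}{2^n} \left( \Pi_1^{\mathrm{Re}} + \Pi_1^{\mathbf{i}} \mathbf{i} + \Pi_1^{\mathbf{j}} \mathbf{j} + \Pi_1^{\mathbf{k}} \mathbf{k} \right) \tag{4.5}
\]

on the $k$-checkerboard side, and

\[
\Pi_2 = \frac{1}{2^n} \left( \Pi_2^{\mathrm{Re}} + \Pi_2^{\mathbf{i}} \mathbf{i} + \Pi_2^{\mathbf{j}} \mathbf{j} + \Pi_2^{\mathbf{k}} \mathbf{k} \right) \tag{4.6}
\]

on the hollow GSE side. Note that the $E[\Pi_1^{\mathbf{i}}]$, $E[\Pi_1^{\mathbf{j}}]$, and $E[\Pi_1^{\mathbf{k}}]$ terms are all zero, because a nonreal coefficient can only occur when there is an unpaired 1-block $x_{ij}$, $y_{ij}$ or $z_{ij}$. Hence only the real parts $E[\Pi_1^{\mathrm{Re}}]$ and $E[\Pi_2^{\mathrm{Re}}]$ remain.

Using (4.5), we have

\[
E\left[ \sum_{\text{matchings}} \sum_{\text{indexings}} \Pi_1 \right] = E\left[ \sum_{\text{matchings}} \; \sum_{r, x, y \text{ or } z \text{ choices}} \; \sum_{\text{indexings}} \Pi_1'^{\,\mathrm{Re}} \right] \tag{4.7}
\]

where $\Pi_1'^{\,\mathrm{Re}}$ is summed over all matchings, all $4^{\mathrm{length}(\Pi_1)/2}$ ways to substitute in either $r$, $x$, $y$ or $z$ for each matched pair, and finally all indexings of these products.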

Similarly, on the hollow GSE side, we have

\[
E[\Pi_2] = E\left[ \sum_{r, x, y \text{ or } z \text{ choices}} \Pi_2'^{\,\mathrm{Re}} \right], \tag{4.8}
\]

where again the sum is over all ways to substitute an $r$, $x$, $y$ or $z$ for the entries in $\Pi_2'$ so that there are an even number of each. Hence to show equality of the two sides, it suffices to show termwise equality for each summand of $\sum_{r, x, y \text{ or } z \text{ choices}}$. Specifically, we must show

\[
\left(\frac{k}{N}\right)^{m_1} \sum_{\text{matchings}} \sum_{\text{indexings}} E[\Pi_1'^{\,\mathrm{Re}}] = E[\Pi_2'^{\,\mathrm{Re}}]. \tag{4.9}
\]

By the same argument as in the GOE case, when there are, say, $2q$ $x$ terms in $\Pi_1'^{\,\mathrm{Re}}$, there are $(2q-1)!!$ ways to choose a matching of them on the LHS, while on the RHS the expectation of the $2q$ $x$ terms contributes the Gaussian moment $(2q-1)!!$. For any product and choice of matchings there are $\left(\frac{N}{k}\right)^{m_1}$ ways to choose indexings, cancelling the $\left(\frac{k}{N}\right)^{m_1}$ on the LHS. Therefore (4.9) holds, completing the proof of the quaternion case.

The complex case may be proven by the exact same technique of distributing out products of complex Gaussians into products of real Gaussians and arguing as in the real case on these products, proving the theorem.

Remark 4.4. It is possible to prove the complex case directly by a more complicated version of the GOE argument, which was the course first taken by the authors before solving the quaternion case. However, the approach outlined in the proof of Theorem 4.3 is cleaner and more general.

5. ALMOST-SURE CONVERGENCE

The traditional way to show weak convergence of empirical spectral measures to a limiting spectral measure (in probability or almost-surely) is to show that the variance (resp. fourth moment) of the $m$th moment, averaged over the $N \times N$ ensemble, is $O\left(\frac{1}{N}\right)$ (resp. $O\left(\frac{1}{N^2}\right)$). In the case of the blip spectral measure, we encounter a problem: both assertions are false. Heuristically, as $N$ grows, the empirical spectral measures of $N \times N$ matrices from most standard ensembles will all be similar because there is a large and growing number of eigenvalues to average over and so the behavior of individual eigenvalues is drowned out by the average. However, for a $k$-checkerboard matrix there are only $k$ eigenvalues in the blip, so each blip spectral measure is just a collection of $k$ isolated delta spikes distributed randomly according to the limiting spectral measure computed in Theorem 3.17. As such, for fixed $k$ the variance and fourth moment over the ensemble of the general $m$th moment do not go to $0$. We therefore define a modified spectral measure which averages over the eigenvalues of many matrices in order to extend standard techniques.

In order to facilitate the proof of the main convergence result (Theorem 5.5) we first introduce some new notation. In all that follows we fix $k$ and suppress $k$-dependence in our notation for simplicity. Let $\Omega_N$ be the probability space of $N \times N$ $k$-checkerboard matrices with the natural probability measure. Then we define the product probability space

\[
\Omega := \prod_{N \in \mathbb{N}} \Omega_N. \tag{5.1}
\]

By Kolmogorov's extension theorem, this is equipped with a probability measure which agrees with the probability measures on $\Omega_N$ when projected to the $N$th coordinate. Given $\{A_N\}_{N \in \mathbb{N}} \in \Omega$, we denote by $A_N$ the $N \times N$ matrix given by projection to the $N$th coordinate. In what follows, we suppress the subscript $N \in \mathbb{N}$ on elements of $\Omega$, writing them as $\{A_N\}$.

Remark 5.1. [KKMSX] employs a similar construction using product space, while [HM] views elements of $\Omega$ as infinite matrices and the projection map $\Omega \to \Omega_N$ as simply choosing the upper left $N \times N$ minor.

Previously we treated the $m$th moment of an empirical spectral measure $\mu^{(m)}_{A,N}$ as a random variable on $\Omega_N$, but we may equivalently treat it as a random variable on $\Omega$. To highlight this, we define the random variable $X_{m,N}$ on $\Omega$

\[
X_{m,N}(\{A_N\}) := \mu^{(m)}_{A_N,N}. \tag{5.2}
\]

These have centered $r$th moment

\[
X^{(r)}_{m,N} := E[(X_{m,N} - E[X_{m,N}])^r]. \tag{5.3}
\]

Per our motivating discussion at the beginning of this section, because we wish to average over a growing number of matrices of the same size, it is advantageous to work over $\Omega^{\mathbb{N}}$; this again is equipped with a natural probability measure by Kolmogorov's extension theorem. Its elements are sequences of sequences of matrices, and we denote them by $A = \{A^{(i)}\}_{i \in \mathbb{N}}$ where $A^{(i)} \in \Omega$. We now give a more abstract definition of the averaged blip spectral measure defined in Definition 1.8.

Definition 5.2. Fix a function $g : \mathbb{N} \to \mathbb{N}$. The averaged empirical blip spectral measure associated to $A \in \Omega^{\mathbb{N}}$ is

\[
\mu_{N,g,A} := \frac{1}{g(N)} \sum_{i=1}^{g(N)} \mu_{A^{(i)}_N, N}. \tag{5.4}
\]

In other words, we project onto the $N$th coordinate in each copy of $\Omega$ and then average over the first $g(N)$ of these $N \times N$ matrices.

Remark 5.3. If one wishes to avoid defining an empirical spectral measure which takes eigenvalues of multiple matrices, one may use the (rather contrived) construction of a N × N block matrix with independent $N \times N$ checkerboard matrix blocks.

Analogously to $X_{m,N}$, we denote by $Y_{m,N,g}$ the random variable on $\Omega^{\mathbb{N}}$ defined by the moments of the averaged empirical blip spectral measure

\[
Y_{m,N,g}(A) := \mu^{(m)}_{N,g,A}. \tag{5.5}
\]

The centered $r$th moment (over $\Omega^{\mathbb{N}}$) of this random variable will be denoted by $Y^{(r)}_{m,N,g}$.

We now prove almost-sure weak convergence of the averaged blip spectral measures under a growth assumption on $g$. Recall the following definition.

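Definition 5.4. A sequence of random measures $\{\mu_N\}_{N \in \mathbb{N}}$ on a probability space $\Omega$ converges weakly almost-surely to a fixed measure $\mu$ if, with probability $1$ over $\Omega^{\mathbb{N}}$, we have

\[
\lim_{N \to \infty} \int f \, d\mu_N = \int f \, d\mu \tag{5.6}
\]

for all $f \in C_b(\mathbb{R})$ (continuous and bounded functions).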

Theorem 5.5. Let $g : \mathbb{N} \to \mathbb{N}$ be such that there exists a $\delta > 0$ for which $g(N) = \omega(N^{\delta})$. Then, as $N \to \infty$, the averaged empirical spectral measures $\mu_{N,g,A}$ of the $k$-checkerboard ensemble converge weakly almost-surely to the measure with moments $M_{k,m} = \frac{1}{k} E_k \operatorname{Tr}[B^m]$, the limiting expected moments computed in Theorem 3.17.

Proof. For simplicity of notation, we suppress $k$ and denote $M_{k,m}$ by $M_m$. By the triangle inequality, we have

\[
|Y_{m,N,g} - M_m| \le |Y_{m,N,g} - E[Y_{m,N,g}]| + |E[Y_{m,N,g}] - M_m|. \tag{5.7}
\]

From Theorem 3.17, we know that $E[X_{m,N}] \to M_m$, and it follows that $E[Y_{m,N,g}] \to M_m$. Hence to show that $Y_{m,N,g} \to M_m$ almost surely, it suffices to show that $|Y_{m,N,g} - E[Y_{m,N,g}]| \to 0$ almost surely as $N \to \infty$. We show that the limit as $N \to \infty$ of all moments over $\Omega^{\mathbb{N}}$ of any arbitrary moment of the empirical spectral measure exists, and that we may always choose a sufficiently high moment² such that the standard method of Chebyshev's inequality and the Borel-Cantelli lemma gives that $|Y_{m,N,g} - E[Y_{m,N,g}]| \to 0$. Finally, the moment convergence theorem gives almost-sure weak convergence to the limiting averaged blip spectral measure.

Lemma 5.6. Let $X_{m,N}$ be as defined in (5.2). Then for any $r \in \mathbb{N}$, the $r$th centered moment of $X_{m,N}$ satisfies

\[
X^{(r)}_{m,N} = E[(X_{m,N} - E[X_{m,N}])^r] = O_{m,r}(1) \tag{5.8}
\]

as $N$ goes to infinity.

Proof. After expanding $E[(X_{m,N} - E[X_{m,N}])^r]$ binomially, the proof follows similarly to that of Theorem 3.17. For more details, see Appendix D.

² Note the difference between this and the standard technique of, for instance, [HM], which uses only the fourth moment.

We apply the following theorem (Theorem 1.2 of [Fer]) with $X = X_{m,N} - E[X_{m,N}]$, $s = g(N)$ and $\mu_i = X^{(i)}_{m,N}$.

Theorem 5.7. Let $r \in \mathbb{N}$ and let $X_1, \dots, X_s$ be i.i.d. copies of some mean-zero random variable $X$ with absolute moments $E[|X|^{\ell}] < \infty$ for all $\ell \in \mathbb{N}$. Then

\[
E\left[\left(\sum_{i=1}^{s} X_i\right)^{r}\right] = \sum_{1 \le m \le \frac{r}{2}} B_{m,r}(\mu_2, \mu_3, \dots, \mu_r) \binom{s}{m} \tag{5.9}
\]

where $\mu_i$ are the moments of $X$ and $B_{m,r}$ is a function independent of $s$, the details of which are given in [Fer].
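The shape of (5.9) can be seen in the simplest case $r = 4$ with $X$ a uniform random sign, where direct expansion gives $E[(\sum_i X_i)^4] = \mu_4 \binom{s}{1} + 6\mu_2^2 \binom{s}{2}$ with $\mu_2 = \mu_4 = 1$. The sketch below checks this by enumerating all $2^s$ outcomes; the choice of random signs and the function name are illustrative assumptions and are not taken from [Fer].

```python
from fractions import Fraction
from itertools import product
from math import comb


def fourth_moment_of_sign_sum(s: int) -> Fraction:
    """Exact E[(X_1 + ... + X_s)^4] for i.i.d. uniform +-1 signs X_i."""
    total = sum(sum(signs) ** 4 for signs in product((-1, 1), repeat=s))
    return Fraction(total, 2 ** s)


def fourth_moment_of_average(s: int) -> Fraction:
    """Exact fourth moment of the average (X_1 + ... + X_s) / s."""
    return fourth_moment_of_sign_sum(s) / s ** 4
```

Since only $m \le r/2$ appears in (5.9), the sum grows like $s^{r/2}$, and the average has $r$th moment of order $s^{-r/2}$; for $r = 4$ this is the bound $3/s^2$ on `fourth_moment_of_average(s)`.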

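We must first show boundedness of the absolute moments of $X_{m,N}$. By Cauchy-Schwarz,

\[
\left( \int |x^{2\ell+1}| \, d\mu_{X_{m,N}} \right)^2 \le \int |x|^2 \, d\mu_{X_{m,N}} \cdot \int |x|^{4\ell} \, d\mu_{X_{m,N}}, \tag{5.10}
\]

where $\mu_{X_{m,N}}$ is the probability measure on $\Omega$ given by the density of $X_{m,N}$. Since, for fixed $N$, the even moments of $X_{m,N}$ are finite by D.3, the previous bound shows that all odd absolute moments are finite as well. Hence Theorem 5.7 applies, yielding

\[
E\left[\left( \sum_{i=1}^{g(N)} \left( X_{m,N,i} - E[X_{m,N,i}] \right) \right)^{r}\right] = \sum_{1 \le m \le \frac{r}{2}} B_{m,r}\left(X^{(2)}_{m,N}, X^{(3)}_{m,N}, \dots, X^{(r)}_{m,N}\right) \binom{g(N)}{m}, \tag{5.11}
\]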

where the $X_{m,N,i}$ are $i$-indexed i.i.d. copies of $X_{m,N}$. By Lemma 5.6, for sufficiently high $N$, $X^{(t)}_{m,N}$ are uniformly bounded above by some constant $K$ for $1 \le t \le m$, so there exists $C$ such that $B_{m,r}(X^{(2)}_{m,N}, X^{(3)}_{m,N}, \dots, X^{(r)}_{m,N}) < C$ for all sufficiently large $N$ and for all $1 \le m \le r/2$. Hence

\[
E\left[\left( \sum_{i=1}^{g(N)} \left( X_{m,N,i} - E[X_{m,N,i}] \right) \right)^{r}\right] \le \sum_{1 \le m \le \frac{r}{2}} C \binom{g(N)}{m}. \tag{5.12}
\]

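As such, we have

\[
Y^{(r)}_{m,N,g} = \frac{1}{g(N)^r} E\left[\left( \sum_{i=1}^{g(N)} \left( X_{m,N,i} - E[X_{m,N,i}] \right) \right)^{r}\right] \le \sum_{1 \le m \le \frac{r}{2}} \frac{C}{g(N)^r} \binom{g(N)}{m} = O\left(\frac{1}{g(N)^{r/2}}\right). \tag{5.13}
\]

Since $g(N) = \omega(N^{\delta})$, we may choose $r$ sufficiently large so that

\[
Y^{(r)}_{m,N,g} = O\left(\frac{1}{N^2}\right). \tag{5.14}
\]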
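To make the choice of $r$ concrete, here is a short worked step. It assumes only the hypothesis $g(N) = \omega(N^{\delta})$ and the bound (5.13); the specific value $r = 2\lceil 2/\delta \rceil$ is one convenient option and is not singled out in the argument itself.

\[
Y^{(r)}_{m,N,g} = O\!\left(g(N)^{-r/2}\right) = o\!\left(N^{-\delta r/2}\right), \qquad r = 2\left\lceil \frac{2}{\delta} \right\rceil \;\Longrightarrow\; \frac{\delta r}{2} \ge 2.
\]

For instance, $\delta = 1/2$ gives $r = 8$. Taking $r$ even also keeps the $r$th centered moment equal to the $r$th absolute centered moment, which is the quantity a Chebyshev-type bound controls.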

Then by Chebyshev's inequality,

\[
\Pr(|Y_{m,N,g} - E[Y_{m,N,g}]| > \epsilon) \le \frac{E[(Y_{m,N,g} - E[Y_{m,N,g}])^r]}{\epsilon^r} = \frac{Y^{(r)}_{m,N,g}}{\epsilon^r} = O\left(\frac{1}{N^2}\right). \tag{5.15}
\]

We now apply the following.

Lemma 5.8 (Borel-Cantelli). Let $B_i$ be a sequence of events with $\sum_i \Pr(B_i) < \infty$. Then

\[
\Pr\left( \bigcap_{j=1}^{\infty} \bigcup_{\ell=j}^{\infty} B_\ell \right) = 0. \tag{5.16}
\]

Define the events

\[
B^{(m,d,g)}_N := \left\{ A \in \Omega^{\mathbb{N}} : |Y_{m,N,g}(A) - E[Y_{m,N,g}]| \ge \frac{1}{d} \right\}. \tag{5.17}
\]

Then $\Pr(B^{(m,d,g)}_N) \le \frac{C_m d^r}{N^2}$, so for fixed $m, d$, the conditions of the Borel-Cantelli lemma are satisfied. Hence

\[
\Pr\left( \bigcap_{j=1}^{\infty} \bigcup_{\ell=j}^{\infty} B^{(m,d,g)}_\ell \right) = 0. \tag{5.18}
\]

Taking a union of these measure-zero sets over $d \in \mathbb{N}$ we have

\[
\Pr\left( Y_{m,N,g} \ne E[Y_{m,N,g}] \text{ for infinitely many } N \right) = 0, \tag{5.19}
\]

and taking the union over $m \in \mathbb{Z}_{\ge 0}$,

\[
\Pr\left( \exists m \text{ such that } Y_{m,N,g} \ne E[Y_{m,N,g}] \text{ for infinitely many } N \right) = 0. \tag{5.20}
\]

Therefore with probability $1$ over $\Omega^{\mathbb{N}}$, $|Y_{m,N,g} - E[Y_{m,N,g}]| \to 0$ for each $m$. This, together with (5.7) and the discussion following it, yields that the moments $\mu^{(m)}_{N,g} = Y_{m,N,g} \to M_m$ almost surely. We now use the following to show almost-sure weak convergence of measures (see for example [Ta]).

Theorem 5.9 (Moment Convergence Theorem). Let $\mu$ be a measure on $\mathbb{R}$ with finite moments $\mu^{(m)}$ for all $m \in \mathbb{Z}_{\ge 0}$, and $\mu_1, \mu_2, \dots$ a sequence of measures with finite moments $\mu^{(m)}_n$ such that $\lim_{n \to \infty} \mu^{(m)}_n = \mu^{(m)}$ for all $m \in \mathbb{Z}_{\ge 0}$. If in addition the moments $\mu^{(m)}$ uniquely characterize a measure, then the sequence $\mu_n$ converges weakly to $\mu$.

To show Carleman's condition is satisfied for the limiting moments $M_m$, we show that $M_m$ are bounded above by the moments of the Gaussian. The odd moments vanish, and by Theorem 3.17 the even moments are given by

\[
M_{2m} = \frac{1}{k} E_k \operatorname{Tr} A^{2m} = \sum_{1 \le i_1, \dots, i_{2m} \le k} E[a_{i_1 i_2} a_{i_2 i_3} \cdots a_{i_{2m} i_1}], \tag{5.21}
\]

and as $E[a_{i_1 i_2} a_{i_2 i_3} \cdots a_{i_{2m} i_1}]$ is maximized when all $a_{i_\ell i_{\ell+1}}$ are equal,

\[
M_{2m} \le \sum_{1 \le i_1, \dots, i_{2m} \le k} (2m - 1)!! = k^{2m} (2m - 1)!!. \tag{5.22}
\]

These are the moments of $N(0, k)$ so Carleman's condition is satisfied, thus we let $\mu$ be the unique measure determined by the moments $M_m$. Choose $A \in \Omega^{\mathbb{N}}$. Then the preceding argument showed that, with probability $1$ over $A$ chosen from $\Omega^{\mathbb{N}}$, all moments $\mu^{(m)}_{N,g,A}$ of the measures $\mu_{N,g,A}$ converge to $M_m$. Then by Theorem 5.9 the measures $\mu_{N,g,A}$ converge weakly to $\mu$ with probability $1$, completing the proof.

APPENDIX A. DETAILS FOR THE BULK

In this appendix we give additional details related to §2. First, we verify that the expected higher moments of the $(k, 1)$-checkerboard ensemble do not converge as $N \to \infty$. We then demonstrate almost sure weak convergence of the bulk eigenvalues to a semicircle.

Proposition A.1. The average moments diverge in the bulk case, namely

\[
E\left[\nu^{(\ell)}_{A_N}\right] = \Omega(N^{\ell/2 - 1}). \tag{A.1}
\]

Proof. By the eigenvalue-trace lemma, we have that

\[
E\left[\nu^{(\ell)}_{A_N}\right] = \frac{1}{N^{\ell/2+1}} E\left[\operatorname{Tr}(A_N^{\ell})\right] = \frac{1}{N^{\ell/2+1}} \sum_{1 \le i_1, \dots, i_\ell \le N} E\left[a_{i_1 i_2} \cdots a_{i_{\ell-1} i_\ell} a_{i_\ell i_1}\right]. \tag{A.2}
\]

Note that the expectation of any term in the sum is non-negative. We now count the number of terms where each $a_{ij} = 1$. Each such term uniquely corresponds to a choice of $i_1, \dots, i_\ell$ all congruent to each other modulo $k$. Hence the contribution of these terms is $\Omega(N^{\ell})$, which gives the result.

In §2, we established convergence in expectation of the moments. We now show how to extend this to almost sure weak convergence of the empirical densities. This verification is standard; for instance, see [Fe]. To do this, we establish the following lemma.

Lemma A.2. Let $A_N$ be an $N \times N$ $(k, 0)$-checkerboard matrix. Then for each fixed $\ell$,

\[
\operatorname{Var}(\nu^{(\ell)}_{A_N}) = O(1/N^2). \tag{A.3}
\]

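From this lemma, we can obtain almost sure convergence as follows. Firstly, by Chebyshev's inequality and the previous lemma,

\[
\sum_{N=1}^{\infty} \Pr\left( \left| \nu^{(\ell)}_{A_N} - E\left[\nu^{(\ell)}_{A_N}\right] \right| > \epsilon \right) \le \frac{1}{\epsilon^2} \sum_{N=1}^{\infty} \operatorname{Var}(\nu^{(\ell)}_{A_N}) < \infty. \tag{A.4}
\]

Hence, by Borel-Cantelli, $\Pr\left( \limsup_N \left| \nu^{(\ell)}_{A_N} - E\left[\nu^{(\ell)}_{A_N}\right] \right| > \epsilon \right) = 0$, so $\nu^{(\ell)}_{A_N} \to E\left[\nu^{(\ell)}_{A_N}\right]$ almost surely, giving us Theorem 2.1 by the method of moments.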

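Proof of Lemma A.2. This proof is combinatorics. By the eigenvalue-trace lemma and the triangle inequality,

\[
\left| E\left[(\nu^{(\ell)}_{A_N})^2\right] - \left(E\left[\nu^{(\ell)}_{A_N}\right]\right)^2 \right| = \frac{1}{N^{\ell+2}} \left| E\left[\operatorname{tr}(A_N^{\ell})^2\right] - \left(E\left[\operatorname{tr}(A_N^{\ell})\right]\right)^2 \right| \le \frac{1}{N^{\ell+2}} \sum_{I, I'} \left| E[\zeta_I \zeta_{I'}] - E[\zeta_I] E[\zeta_{I'}] \right|, \tag{A.5}
\]

where $\zeta_I$ is a stand-in for writing out the product $a_{i_1 i_2} \cdots a_{i_{\ell-1} i_\ell} a_{i_\ell i_1}$ associated to the sequence $I = i_1 \dots i_\ell$, where $1 \le i_1, \dots, i_\ell \le N$; hence the sum over pairs $(I, I')$. Moreover, as in Lemma 2.3, each pair corresponds to a pair of walks on a graph with vertices $V_{(I,I')} = \{i_1, \dots, i_\ell, i'_1, \dots, i'_\ell\}$ and with edges that we denote as $E_{(I,I')}$. We say that two such pairs of walks are equivalent if they are equivalent up to relabeling the underlying set of nodes. We then define the weight of $(I, I')$ to be $|V_{(I,I')}|$.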

We claim that the pairs of weight $t \le \ell$ contribute $O(N^t)$ to the sum. Each equivalence class of weight $t$ gives rise to $O(N^t)$ equivalent pairs as we are choosing $t$ distinct nodes for the labels. Moreover, the contribution of each term is $O(1)$ as the moments of the random entries are finite.

We now consider the entries with weight $t \ge \ell + 1$. Note that for the expectation of the term $(I, I')$ to be nonzero, each edge in $E_{(I,I')}$ must be traversed twice. In addition, the graphs induced by $I$ and $I'$ must share an edge, as otherwise, $E[\zeta_I \zeta_{I'}] = E[\zeta_I] E[\zeta_{I'}]$ by independence. Since each edge is traversed twice there are at most $\ell$ unique edges in $E_{(I,I')}$, which is too few to form a connected graph on $\ell + 2$ nodes. Therefore, no pair satisfying the two aforementioned conditions can have weight $\ell + 2$. Furthermore, in the case of weight $\ell + 1$, there is no such pair either. In this case there are $\ell + 1$ nodes and at most $\ell$ unique edges in $E_{(I,I')}$. Hence as the graph is connected it is a tree. As the walk induced by $I$ in this graph begins and ends at $i_1$, each edge in the walk is traversed twice: once in each direction. An identical statement holds for the walk induced by $I'$. Hence as there are exactly two of each edge in $E_{(I,I')}$ the walks induced by $I$ and $I'$ are disjoint, a contradiction. Hence, no pairs of weight greater than $\ell$ contribute to the sum, which, together with (A.5), gives us Lemma A.2.

Remark A.3. We note that the previous lemma also establishes that $\operatorname{Var}(\operatorname{Tr}(A_N^{\ell})) = O(N^{\ell})$.

APPENDIX B. PROOF OF TWO REGIMES

This appendix is based on work done by Manuel Fernandez and Nicholas Sieger at Carnegie Mellon under the supervision of the fifth named author, expanded by the third, seventh and eighth named authors.

In this appendix we demonstrate that checkerboard matrices almost surely have two regimes of eigenvalues, one that is $O(N^{1/2+\epsilon})$ (the bulk) and the other of order $N$ (the blip). To do this, we rely on matrix perturbation theory. In particular, we view a $(k, w)$-checkerboard matrix as the sum of a $(k, 0)$-checkerboard matrix and a fixed matrix $Z$ where $Z_{ij} = w \chi_{\{i \equiv j \bmod k\}}$. In that sense, we view the $(k, w)$-checkerboard matrix as a perturbation of the matrix $Z$. Then, as the spectral radius of the $(k, 0)$-checkerboard matrix is $O(N^{1/2+\epsilon})$, we obtain by standard results in the theory of matrix perturbations that the spectrum of the $(k, w)$-checkerboard matrix is the same as that of the matrix $Z$ up to an order $N^{1/2+\epsilon}$ perturbation.

We begin with the following observation on the spectrum of the matrixZ:

Lemma B.1. The matrixZ has exactlyk non-zero eigenvalues, all of which are equal toNw/k.

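Proof. For $1 \le j \le k$ the vectors $\sum_{i=0}^{N/k-1} e_{ki+j}$ are eigenvectors with eigenvalue $Nw/k$. Furthermore, for $0 \le i \le N/k - 2$ and $1 \le j \le k$ the vectors $e_{ki+j} - e_{k(i+1)+j}$ are eigenvectors with eigenvalue $0$, since both indices lie in the same residue class modulo $k$. These $N - k$ vectors are linearly independent, so together with the first $k$ eigenvectors they account for the whole spectrum.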
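Lemma B.1 can be checked numerically on a small case. The following is a minimal sketch, assuming $k$ divides $N$; the helper name `checkerboard_mean_matrix` and the values of `N`, `k`, `w` are illustrative choices and do not come from the paper.

```python
import numpy as np

def checkerboard_mean_matrix(N: int, k: int, w: float) -> np.ndarray:
    """Return Z with Z[i, j] = w when i is congruent to j mod k, else 0."""
    idx = np.arange(N)
    same_class = (idx[:, None] - idx[None, :]) % k == 0
    return w * same_class.astype(float)

# Illustrative parameters (assumption): k divides N.
N, k, w = 12, 3, 2.0
Z = checkerboard_mean_matrix(N, k, w)

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eigs = np.linalg.eigvalsh(Z)
nonzero = eigs[np.abs(eigs) > 1e-9]
print("number of non-zero eigenvalues:", len(nonzero))
print("non-zero eigenvalues:", nonzero)
print("predicted value N*w/k:", N * w / k)
```

The remaining $N - k$ eigenvalues are zero up to floating-point error, matching the kernel vectors described in the proof.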

Weyl’s inequality gives the following:

Lemma B.2. (Weyl's inequality) [HJ] Let $H, P$ be $n \times n$ Hermitian matrices, and let the eigenvalues of $H$, $P$, and $H + P$ be arranged in increasing order. Then for every pair of integers $j, \ell$ such that $1 \le j, \ell \le n$ and $j + \ell \ge n+1$ we have

$$\lambda_{j+\ell-n}(H + P) \le \lambda_j(H) + \lambda_\ell(P), \tag{B.1}$$

and for every pair of integers $j, \ell$ such that $1 \le j, \ell \le n$ and $j + \ell \le n+1$ we have

$$\lambda_j(H) + \lambda_\ell(P) \le \lambda_{j+\ell-1}(H + P). \tag{B.2}$$

Let $\|P\|_{\mathrm{op}}$ denote $\max_i |\lambda_i(P)|$. By using the fact that $|\lambda_\ell(P)| \le \|P\|_{\mathrm{op}}$ and taking $\ell = n$ in (B.1), we obtain that $\lambda_j(H + P) \le \lambda_j(H) + \|P\|_{\mathrm{op}}$. Taking $\ell = 1$ in (B.2) gives the inequality on the other side, hence $|\lambda_j(H + P) - \lambda_j(H)| \le \|P\|_{\mathrm{op}}$.

The above lemma gives that if the spectral radius of $P$ is $O(f)$ then the size of the perturbations will be $O(f)$ as well. Hence it suffices to demonstrate that almost surely the spectral radius of a sequence of $(k, 0)$-checkerboard matrices is $O(N^{1/2+\epsilon})$.
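The perturbation bound $|\lambda_j(H + P) - \lambda_j(H)| \le \|P\|_{\mathrm{op}}$ derived from Weyl's inequality can be illustrated on random real symmetric matrices. This is only a sketch: the matrix size, the scaling of the perturbation, and the helper `random_symmetric` are assumptions made for the illustration.

```python
import numpy as np

def random_symmetric(n: int, rng: np.random.Generator) -> np.ndarray:
    """Return an n x n real symmetric matrix built from Gaussian entries."""
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2.0

rng = np.random.default_rng(0)
n = 50
H = random_symmetric(n, rng)
P = 0.1 * random_symmetric(n, rng)   # a small symmetric perturbation

# eigvalsh sorts eigenvalues in increasing order, as in Lemma B.2.
lam_H = np.linalg.eigvalsh(H)
lam_HP = np.linalg.eigvalsh(H + P)

# Operator norm of P as defined in the text: the largest |eigenvalue|.
op_norm_P = np.max(np.abs(np.linalg.eigvalsh(P)))
max_shift = np.max(np.abs(lam_HP - lam_H))

print("largest eigenvalue shift:", max_shift)
print("operator norm of P:     ", op_norm_P)
```

Every ordered eigenvalue moves by at most the operator norm of the perturbation, which is the only property of $P$ the argument uses.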

Lemma B.3. Let $m \in \mathbb{N}$ and let $\{A_N\}_{N \in \mathbb{N}}$ be a sequence of $(k, 0)$-checkerboard matrices. Then almost surely, as $N \to \infty$, $\|A_N\|_{\mathrm{op}} = O(N^{1/2+1/(2m)})$.

Proof. Suppose that for some $m \in \mathbb{N}$ we have a sequence $\{A_N\}_{N \in \mathbb{N}}$ such that $\|A_N\|_{\mathrm{op}}$ is not $O(N^{1/2+1/(2m)})$. Then, by straightforward calculation, the $(2m+2)$nd moments $\mu^{(2m+2)}_{A_N,N}$ do not converge. Hence if $\Pr\left(\|A_N\|_{\mathrm{op}} = O(N^{1/2+1/(2m)})\right) \ne 1$, then with nonzero probability the moments $\mu^{(2m+2)}_{A_N,N}$ do not converge. This contradicts the almost-sure moment convergence result of Appendix A, and Lemma B.3 follows.

Since Lemma B.3 holds for all $m \in \mathbb{N}$, we have that almost surely $\|A_N\|_{\mathrm{op}}$ is $O(N^{1/2+\epsilon})$. Together with Lemma B.1 and Lemma B.2, we obtain

Theorem B.4. Let $\{A_N\}_{N \in \mathbb{N}}$ be a sequence of $(k, w)$-checkerboard matrices. Then almost surely as $N \to \infty$ the eigenvalues of $A_N$ fall into two regimes: $N - k$ of the eigenvalues are $O(N^{1/2+\epsilon})$ and $k$ eigenvalues are of magnitude $Nw/k + O(N^{1/2+\epsilon})$.
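The two regimes of Theorem B.4 are visible already at moderate $N$. The sketch below builds one sample matrix as the sum described at the start of this appendix: a symmetric random part supported where $i \not\equiv j \bmod k$, plus $w$ where $i \equiv j \bmod k$. The standard Gaussian entry distribution, the seed, the parameter values, and the cutoff $Nw/(2k)$ used to separate the regimes are all assumptions made for the illustration.

```python
import numpy as np

def checkerboard(N: int, k: int, w: float, rng: np.random.Generator) -> np.ndarray:
    """Sample a symmetric matrix with entry w where i = j mod k, random elsewhere."""
    G = rng.standard_normal((N, N))
    A = np.triu(G, 1)
    A = A + A.T                      # symmetric random part, zero diagonal
    idx = np.arange(N)
    same_class = (idx[:, None] - idx[None, :]) % k == 0
    return np.where(same_class, w, A)

rng = np.random.default_rng(1)
N, k, w = 400, 2, 1.0                # illustrative values; k divides N
eigs = np.linalg.eigvalsh(checkerboard(N, k, w, rng))

# Hypothetical cutoff halfway between the bulk (near 0) and the blip (near N*w/k).
threshold = N * w / (2 * k)
blip = eigs[np.abs(eigs) > threshold]
bulk = eigs[np.abs(eigs) <= threshold]

print("blip eigenvalues:", blip, "expected near", N * w / k)
print("largest |bulk eigenvalue|:", np.max(np.abs(bulk)), "vs sqrt(N) =", np.sqrt(N))
```

With these values exactly $k$ eigenvalues sit near $Nw/k$, while the rest stay on the scale of $\sqrt{N}$.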

APPENDIX C. PROOF OF LEMMA 3.16

In §3, we introduced Lemma 3.16 without proof. Here we provide a short proof of it.

Proof. Consider the function

$$f_0(x) = (1 - x)^m = \sum_{j=0}^{m} (-1)^j \binom{m}{j} x^j. \tag{C.1}$$

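We inductively define, for each $0 \le p < m$, the function $f_{p+1}(x) = x f_p'(x)$. One can prove by straightforward induction that

$$f_p(x) = \sum_{i=0}^{p} c_{i,p}\, x^i (1 - x)^{m-i}, \tag{C.2}$$

for each $0 \le p \le m$, with $c_{i,p} \in \mathbb{R}$, by using the product rule. Since applying $x \frac{d}{dx}$ to $x^j$ multiplies it by $j$, the expansion (C.1) gives $f_p(x) = \sum_{j=0}^{m} (-1)^j \binom{m}{j} j^p x^j$. Therefore, for each $0 \le p < m$, every term of (C.2) vanishes at $x = 1$ and

$$0 = f_p(1) = \sum_{j=0}^{m} (-1)^j \binom{m}{j} j^p. \tag{C.3}$$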

By the same reasoning,

$$\sum_{j=0}^{m} (-1)^j \binom{m}{j} j^m = f_m(1) = (-1)^m m! \tag{C.4}$$

and the second claim follows.
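The two identities (C.3) and (C.4) are easy to confirm with exact integer arithmetic for small $m$. The following sketch is a sanity check and not part of the proof; the function name `alternating_power_sum` and the range of $m$ are arbitrary choices.

```python
from math import comb, factorial

def alternating_power_sum(m: int, p: int) -> int:
    """Return sum_{j=0}^{m} (-1)^j * C(m, j) * j^p using exact integers.

    Python evaluates 0 ** 0 as 1, which matches the convention needed for p = 0.
    """
    return sum((-1) ** j * comb(m, j) * j ** p for j in range(m + 1))

for m in range(1, 6):
    # Entries for p = 0, ..., m - 1 correspond to (C.3); the last one to (C.4).
    row = [alternating_power_sum(m, p) for p in range(m + 1)]
    print(f"m = {m}: sums for p = 0..{m} are {row}; (-1)^m m! = {(-1) ** m * factorial(m)}")
```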


APPENDIX D. BOUNDS FOR $X^{(r)}_{m,N}$

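Proof of Lemma 5.6. Firstly, by the binomial theorem and linearity of expectation, we have

$$E\left[(X_{m,N} - E[X_{m,N}])^r\right] = E\left[\sum_{\ell=0}^{r} \binom{r}{\ell} (X_{m,N})^{\ell} \left(-E[X_{m,N}]\right)^{r-\ell}\right] = \sum_{\ell=0}^{r} \binom{r}{\ell} E\left[(X_{m,N})^{\ell}\right] \left(-E[X_{m,N}]\right)^{r-\ell}. \tag{D.1}$$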

By (3.24), we have $E[X_{m,N}] = O_m(1)$, hence $(E[X_{m,N}])^{r-\ell} = O_{m,r,\ell}(1)$ for all $\ell$. As such, it suffices to show that $E\left[(X_{m,N})^{\ell}\right] = O_{m,\ell}(1)$. By (3.3), we have that

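\begin{align}
E[X_{m,N}^{\ell}]
&= \left(\frac{k}{N}\right)^{2n\ell} E\left[\left(\sum_{j=0}^{2n} \binom{2n}{j} \sum_{i=0}^{m+j} \binom{m+j}{i} \left(-\frac{N}{k}\right)^{m-i} \mathrm{Tr}\, A^{2n+i}\right)^{\ell}\right] \tag{D.2} \\
&= \left(\frac{k}{N}\right)^{2n\ell} E\left[\sum_{j_1=0}^{2n} \cdots \sum_{j_\ell=0}^{2n} \left[\prod_{u=1}^{\ell} \binom{2n}{j_u}\right] \sum_{i_1=0}^{m+j_1} \cdots \sum_{i_\ell=0}^{m+j_\ell} \left[\prod_{v=1}^{\ell} \binom{m+j_v}{i_v} \left(-\frac{N}{k}\right)^{m-i_v} \mathrm{Tr}\, A^{2n+i_v}\right]\right] \notag \\
&= \left(\frac{k}{N}\right)^{2n\ell} \sum_{j_1=0}^{2n} \cdots \sum_{j_\ell=0}^{2n} \left[\prod_{u=1}^{\ell} \binom{2n}{j_u}\right] \sum_{i_1=0}^{m+j_1} \cdots \sum_{i_\ell=0}^{m+j_\ell} \left[\prod_{v=1}^{\ell} \binom{m+j_v}{i_v} \left(-\frac{N}{k}\right)^{m-i_v}\right] E\left[\prod_{v=1}^{\ell} \mathrm{Tr}\, A^{2n+i_v}\right]. \tag{D.3}
\end{align}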

Now, recall that

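$$E\left[\prod_{v=1}^{\ell} \mathrm{Tr}\, A^{2n+i_v}\right] = \sum_{\alpha^1_1, \ldots, \alpha^1_{2n+i_1} \le N} \cdots \sum_{\alpha^\ell_1, \ldots, \alpha^\ell_{2n+i_\ell} \le N} E\left[\prod_{j=1}^{\ell} a_{\alpha^j_1, \alpha^j_2} \cdots a_{\alpha^j_{2n+i_j}, \alpha^j_1}\right]. \tag{D.4}$$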

We have now reached a combinatorial problem similar to the one we encounter in §3. For each $j$, since the length of the cyclic product $a_{\alpha^j_1, \alpha^j_2} \cdots a_{\alpha^j_{2n+i_j}, \alpha^j_1}$ is fixed at $2n + i_j$, we can choose the number of blocks (determining the class), the location of the blocks (determining the configuration), the matchings and indexings. By Lemma 3.13, we have that the main contribution from configurations of length $(2n + i_j)$ in the $B_j$-class is $\frac{(2n+i_j)^{B_j}}{B_j!}$. By the same arguments made in §3, the number of ways we can choose the number of blocks having one $a$ and two $a$'s as well as the number of ways to choose matchings across the $\ell$ cyclic products are independent of $N$, the $j$'s and the $i_j$'s, so for simplicity we denote them by $C$. Finally, the contribution from choosing the indices of all the blocks and $w$'s is $O_k(N^{2n\ell + i_1 + \cdots + i_\ell - B_1 - \cdots - B_\ell})$. As such, if $B_1, \ldots, B_\ell \ge m$, the total contribution is $O_{m,k}(1)$. If there exists $B_{j'} < m$, then the overall contribution is

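$$C N^{\ell m - B_1 - \cdots - B_\ell} \prod_{u=1}^{\ell} \left[\sum_{j_u=0}^{2n} \binom{2n}{j_u} \sum_{i_u=0}^{m+j_u} \binom{m+j_u}{i_u} (-1)^{m-i_u} \frac{(2n+i_u)^{B_u}}{B_u!}\right] = 0, \tag{D.5}$$

since the factor with $u = j'$ is equal to 0 by Lemma 3.16. As such, the total contribution to $E[X_{m,N}^{\ell}]$ is simply $O_{m,\ell}(1)$ (suppressing $k$), as desired.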
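The vanishing used in (D.5) can be confirmed exactly for small parameters. The sketch below evaluates one bracketed factor, read as $\sum_{j=0}^{2n} \binom{2n}{j} \sum_{i=0}^{m+j} \binom{m+j}{i} (-1)^{m-i} (2n+i)^{B}/B!$, with rational arithmetic. The function name `d5_factor` and the tested parameter ranges are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial

def d5_factor(n: int, m: int, B: int) -> Fraction:
    """Evaluate one bracketed factor of (D.5) exactly for the given n, m and B."""
    total = Fraction(0)
    for j in range(2 * n + 1):
        for i in range(m + j + 1):
            sign = -1 if (m - i) % 2 else 1
            term = Fraction((2 * n + i) ** B, factorial(B))
            total += comb(2 * n, j) * comb(m + j, i) * sign * term
    return total

# For B < m the inner alternating sum is taken against a polynomial in i of
# degree B < m + j, so every inner sum, and hence the factor, should vanish.
for n in range(1, 3):
    for m in range(1, 5):
        values = [d5_factor(n, m, B) for B in range(m)]
        print(f"n = {n}, m = {m}, factors for B = 0..{m - 1}: {values}")
```

No claim is made here about the case $B \ge m$, which the text bounds separately.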


APPENDIX E. MOMENT CONVERGENCE THEOREM

The following argument is standard (though usually assumes all measures concerned are probability measures), and is given for completeness.

Definition E.1. A sequence of measures $(\mu_n)_{n \ge 1}$ on $\mathbb{R}$ is uniformly tight if, for every $\epsilon > 0$, there is a compact set $K$ such that $\sup_{n \ge 1} \mu_n(\mathbb{R} \setminus K) < \epsilon$.

We are now ready to prove the moment convergence theorem for general finite measures, largely following the treatment of [Cha].

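Proof of Theorem 5.9. By convergence of moments, we have that

$$C_k := \sup_{n \ge 1} \int_{\mathbb{R}} x^k \, d\mu_n \tag{E.1}$$

is bounded. For any $R > 0$ we then have by Chebyshev's inequality that

$$\mu_n(\mathbb{R} \setminus [-R, R]) \le \frac{\int_{\mathbb{R}} x^2 \, d\mu_n}{R^2} \le \frac{C_2}{R^2}. \tag{E.2}$$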

Therefore the $\mu_n$ are uniformly tight. Hence by Prokhorov's theorem for general measures (see [Bog], Theorem 8.6.2), every subsequence of $(\mu_n)_{n \ge 1}$ contains a weakly convergent subsequence which converges to some measure $\nu$.
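The tail bound (E.2) does not require the measure to have total mass one. As a small illustration, the sketch below checks it for a finite discrete measure; the atoms, weights, seed, and list of radii are hypothetical values chosen only for the example.

```python
import numpy as np

# Hypothetical finite discrete measure: mass weights[i] at the point atoms[i].
rng = np.random.default_rng(2)
atoms = 3.0 * rng.standard_normal(1000)
weights = rng.uniform(0.0, 0.01, size=atoms.size)   # total mass need not equal 1

second_moment = np.sum(weights * atoms**2)           # integral of x^2 d(mu)

def tail_mass(R: float) -> float:
    """Return mu(R \\ [-R, R]), the mass carried by atoms with |x| > R."""
    return float(np.sum(weights[np.abs(atoms) > R]))

radii = [0.5, 1.0, 2.0, 4.0, 8.0]
tails = [tail_mass(R) for R in radii]
bounds = [second_moment / R**2 for R in radii]

for R, t, b in zip(radii, tails, bounds):
    print(f"R = {R}: tail mass = {t:.4f}, Chebyshev bound = {b:.4f}")
```

Because the bound depends on the measure only through its second moment, a uniform bound on second moments yields uniform tightness.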

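For any subsequence $(\mu_{n_\ell})_{\ell \ge 1}$ converging weakly to some measure $\nu$, we show that $\nu = \mu$. Fix some $k \in \mathbb{Z}_{\ge 0}$ and $R \in \mathbb{R}_{>0}$. Let $\varphi_R$ be a continuous function such that

$$1_{[-R,R]} \le \varphi_R \le 1_{[-R-1,R+1]}. \tag{E.3}$$

We may split the integral as

$$\int x^k \, d\mu_{n_\ell} = \int x^k \varphi_R \, d\mu_{n_\ell} + \int x^k (1 - \varphi_R) \, d\mu_{n_\ell}. \tag{E.4}$$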

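By the Cauchy-Schwarz inequality,

$$\left|\int x^k (1 - \varphi_R) \, d\mu_{n_\ell}\right|^2 \le \int x^{2k} \, d\mu_{n_\ell} \cdot \int (1 - \varphi_R)^2 \, d\mu_{n_\ell} \le \int x^{2k} \, d\mu_{n_\ell} \cdot \mu_{n_\ell}(\mathbb{R} \setminus [-R, R]), \tag{E.5}$$

and by our moment bounds and the definition of $\varphi_R$, this is $\le \frac{C_2 \cdot C_{2k}}{R^2}$. Therefore, we have

$$\lim_{R \to \infty} \int x^k \varphi_R \, d\mu_{n_\ell} = \int x^k \, d\mu_{n_\ell}. \tag{E.6}$$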
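One concrete choice of cutoff satisfying (E.3) is the piecewise-linear trapezoid used below; the text only requires some continuous $\varphi_R$ with this property, so this particular choice is an assumption. The sketch also checks the split (E.4) and the Cauchy-Schwarz bound (E.5) on a hypothetical finite discrete measure.

```python
import numpy as np

def phi(x, R: float):
    """Piecewise-linear cutoff: equals 1 on [-R, R] and 0 outside [-R-1, R+1]."""
    return np.clip(R + 1.0 - np.abs(x), 0.0, 1.0)

# Hypothetical finite discrete measure: mass weights[i] at the point atoms[i].
rng = np.random.default_rng(3)
atoms = 2.0 * rng.standard_normal(2000)
weights = rng.uniform(0.0, 1e-3, size=atoms.size)

k, R = 3, 2.5
cut = phi(atoms, R)
full = np.sum(weights * atoms**k)                     # integral of x^k
kept = np.sum(weights * atoms**k * cut)               # integral of x^k * phi_R
remainder = np.sum(weights * atoms**k * (1.0 - cut))  # integral of x^k * (1 - phi_R)

moment_2k = np.sum(weights * atoms**(2 * k))          # integral of x^(2k)
tail_mass = np.sum(weights[np.abs(atoms) > R])        # mu(R \ [-R, R])

print(f"full = {full:.6f}, kept + remainder = {kept + remainder:.6f}")
print(f"remainder^2 = {remainder**2:.4g}, bound from (E.5) = {moment_2k * tail_mass:.4g}")
```

The second inequality in (E.5) holds here because $(1 - \varphi_R)^2$ is at most the indicator of $\mathbb{R} \setminus [-R, R]$.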

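By the moment convergence assumption

$$\lim_{\ell \to \infty} \int x^k \, d\mu_{n_\ell} = \int x^k \, d\mu \tag{E.7}$$

and by weak convergence,

$$\lim_{\ell \to \infty} \int x^k \varphi_R \, d\mu_{n_\ell} = \int x^k \varphi_R \, d\nu. \tag{E.8}$$

We must now show

$$\lim_{\ell \to \infty} \lim_{R \to \infty} \int x^k \varphi_R \, d\mu_{n_\ell} = \lim_{R \to \infty} \lim_{\ell \to \infty} \int x^k \varphi_R \, d\mu_{n_\ell}. \tag{E.9}$$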

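For this, it suffices to show that $\int x^k \varphi_R \, d\mu_{n_\ell}$ converges uniformly with respect to $R$ as $\ell \to \infty$. In the following argument, we assume $k$ is even so that $x^k$ is nonnegative, but this may be modified easily for $k$ odd. By the same argument for uniform tightness as in Equation (E.2), there exists $C$ such that for all $\ell$ (and when replacing $\mu_{n_\ell}$ by $\nu$),

$$\int_{\mathbb{R} \setminus [-K,K]} x^k \varphi_R \, d\mu_{n_\ell} \le \int_{\mathbb{R} \setminus [-K,K]} x^k \, d\mu_{n_\ell} \le \frac{C}{K^2}. \tag{E.10}$$

Hence for any $\epsilon > 0$, there exists some $K$ such that

$$\int_{\mathbb{R} \setminus [-K,K]} x^k \varphi_R \, d\mu_{n_\ell} < \epsilon/3 \tag{E.11}$$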

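unconditionally on $\ell$. By weak convergence, for any fixed $R$, there also exists an $N_R$ so that

$$\left|\int x^k \varphi_R \, d\mu_{n_\ell} - \int x^k \varphi_R \, d\nu\right| < \epsilon/3 \tag{E.12}$$

for all $\ell > N_R$. Therefore, letting $N = \sup_{R \in \mathbb{N},\, R \le K} N_R$, we have that for $\ell > N$ and any $R$,

\begin{align}
&\left|\int x^k \varphi_R \, d\mu_{n_\ell} - \int x^k \varphi_R \, d\nu\right| \tag{E.13} \\
&\quad \le \left|\int_{[-K,K]} x^k \varphi_R \, d\mu_{n_\ell} - \int_{[-K,K]} x^k \varphi_R \, d\nu\right| + \left|\int_{\mathbb{R} \setminus [-K,K]} x^k \varphi_R \, d\mu_{n_\ell}\right| + \left|\int_{\mathbb{R} \setminus [-K,K]} x^k \varphi_R \, d\nu\right| \tag{E.14} \\
&\quad < \left|\int_{\mathbb{R}} x^k \varphi_{(K-1)} \, d\mu_{n_\ell} - \int_{\mathbb{R}} x^k \varphi_{(K-1)} \, d\nu\right| + \frac{2}{3}\epsilon \tag{E.15} \\
&\quad < \epsilon. \tag{E.16}
\end{align}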

Thus we have uniform convergence, so the limits may be switched. Putting all this together,

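\begin{align}
\lim_{R \to \infty} \int x^k \varphi_R \, d\nu &= \lim_{R \to \infty} \lim_{\ell \to \infty} \int x^k \varphi_R \, d\mu_{n_\ell} \tag{E.17} \\
&= \lim_{\ell \to \infty} \lim_{R \to \infty} \int x^k \varphi_R \, d\mu_{n_\ell} \tag{E.18} \\
&= \lim_{\ell \to \infty} \int x^k \, d\mu_{n_\ell} \tag{E.19} \\
&= \int x^k \, d\mu \tag{E.20}
\end{align}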

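with the last equality following by the moment convergence hypothesis. We have $\varphi_R x^{2k} \le \varphi_{R+1} x^{2k}$ and both are nonnegative, so by the monotone convergence theorem

$$\int x^{2k} \, d\mu = \lim_{R \to \infty} \int \varphi_R x^{2k} \, d\nu = \int x^{2k} \, d\nu. \tag{E.21}$$

Hence $x^k \in L^2(\nu)$, so $x^k \in L^1(\nu)$ because $\nu$ is a finite measure. Since $|\varphi_R x^k| \le |x^k|$, by the dominated convergence theorem

$$\int x^k \, d\mu = \lim_{R \to \infty} \int \varphi_R x^k \, d\nu = \int x^k \, d\nu. \tag{E.22}$$

Since $\mu$ is uniquely characterized by its moments, $\nu = \mu$. Since every subsequence of $(\mu_n)_{n \ge 1}$ has a subsequence weakly converging to $\mu$, standard arguments give that $\mu_n$ converges weakly to $\mu$.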


REFERENCES

[Bai] Z. Bai, Methodologies in spectral analysis of large-dimensional random matrices, a review, Statist. Sinica 9 (1999), no. 3, 611–677.

[BFMT-B] O. Barrett, F. W. K. Firk, S. J. Miller and C. Turnage-Butterbaugh, From Quantum Systems to L-Functions: Pair Correlation Statistics and Beyond, Open Problems in Mathematics (editors John Nash Jr. and Michael Th. Rassias), Springer-Verlag, 2016.

[BasBo1] A. Basak and A. Bose, Balanced random Toeplitz and Hankel matrices, Electronic Comm. in Prob. 15 (2010), 134–148.

[BasBo2] A. Basak and A. Bose, Limiting spectral distribution of some band matrices, Periodica Mathematica Hungarica 63 (2011), no. 1, 113–150.

[BanBo] S. Banerjee and A. Bose, Noncrossing partitions, Catalan words and the Semicircle Law, Journal of Theoretical Probability 26 (2013), no. 2, 386–409.

[BLMST] O. Beckwith, V. Luo, S. J. Miller, K. Shen and N. Triantafillou, Distribution of eigenvalues of weighted, structured matrix ensembles, Integers: Electronic Journal Of Combinatorial Number Theory 15 (2015), paper A21, 28 pages.

[Bog] V. I. Bogachev, Measure theory, Vol. 2. Springer-Verlag Berlin Heidelberg (2007).

[BCG] A. Bose, S. Chatterjee, and S. Gangopadhyay, Limiting spectral distributions of large dimensional random matrices, J. Indian Statist. Assoc. (2003), 41, 221–259.

[BHS1] A. Bose, R. S. Hazra, and K. Saha, Patterned random matrices and notions of independence, Technical report R3/2010 (2010), Stat-Math Unit, Kolkata.

[BHS2] A. Bose, R. S. Hazra, and K. Saha, Patterned random matrices and method of moments, in Proceedings of the International Congress of Mathematicians Hyderabad, India, 2010, 2203–2230. (Invited article). World Scientific, Singapore and Imperial College Press, UK.

[BM] A. Bose and J. Mitra, Limiting spectral distribution of a special circulant, Statist. Probab. Lett. 60 (2002), no. 1, 111–120.

[BDJ] W. Bryc, A. Dembo, and T. Jiang, Spectral Measure of Large Random Hankel, Markov, and Toeplitz Matrices, Annals of Probability 34 (2006), no. 1, 1–38.

[Cha] D. Chafai, From moments convergence to weak convergence, http://djalil.chafai.net/blog/2010/12/14/from-moments-to-weak-convergence/, Posted: 2012-10-14, Accessed: 2016-08-05.

[CHS] A. Chakrabarty, R. Hazra and D. Sarkar, From random matrices to long range dependence, Random Matrices Theory Appl. 5 (2016), no. 2, 1650008, 52 pp.

[Con] J. B. Conrey, L-Functions and random matrices. Pages 331–352 in Mathematics unlimited — 2001 and Beyond, Springer-Verlag, Berlin, 2001.

[ERSY] L. Erdös, J. A. Ramirez, B. Schlein, and H.-T. Yau, Bulk Universality for Wigner Matrices, Comm. Pure Appl. Math. 63 (2010), no. 7, 895–925.

[ESY] L. Erdös, B. Schlein, and H.-T. Yau, Wegner estimate and level repulsion for Wigner random matrices, Int. Math. Res. Not. 2010, no. 3, 436–479.

[Fe] A. R. Feier, Methods of proof in random matrix theory, http://www.math.harvard.edu/theses/senior/feier/feier.pdf, Posted: 2012, Accessed: 2015-07-05.

[Fer] D. Ferge, Moment equalities for sums of random variables via integer partitions and Faà di Bruno’s formula, Turkish Journal of Mathematics 38 (2014), no. 3, 558–575; doi:10.3906/mat-1301-6.

[FM] F. W. K. Firk and S. J. Miller, Nuclei, Primes and the Random Matrix Connection, Symmetry 1 (2009), 64–105.

[Fo] P. J. Forrester, Log-Gases and Random Matrices, London Mathematical Society Monographs 34, Princeton University Press, Princeton, NJ 2010.

[GKMN] L. Goldmakher, C. Khoury, S. J. Miller and K. Ninsuwan, On the spectral distribution of large weighted random regular graphs, to appear in Random Matrices: Theory and Applications. http://arxiv.org/abs/1306.6714.

[HJ] R. Horn and C. Johnson, Matrix Analysis, Cambridge University Press, 1985.

[HM] C. Hammond and S. J. Miller, Eigenvalue spacing distribution for the ensemble of real symmetric Toeplitz matrices, Journal of Theoretical Probability 18 (2005), no. 3, 537–566.


[JMP] S. Jackson, S. J. Miller, and V. Pham, Distribution of Eigenvalues of Highly Palindromic Toeplitz Matrices, Journal of Theoretical Probability 25 (2012), 464–495.

[JMRR] D. Jakobson, S. D. Miller, I. Rivin, and Z. Rudnick, Eigenvalue spacings for regular graphs. Pages 317–327 in Emerging Applications of Number Theory (Minneapolis, 1996), The IMA Volumes in Mathematics and its Applications, Vol. 109, Springer, New York, 1999.

[Kar] V. Kargin, Spectrum of random Toeplitz matrices with band structure, Elect. Comm. in Probab. 14 (2009), 412–421.

[KaSa] N. Katz and P. Sarnak, Zeros of zeta functions and symmetries, Bull. AMS 36 (1999), 1–26.

[KeSn] J. P. Keating and N. C. Snaith, Random matrices and L-functions. In Random Matrix Theory, J. Phys. A 36 (2003), no. 12, 2859–2881.

[KKMSX] G. S. Kopp, M. Kologlu, S. J. Miller, F. Strauch and W. Xiong, The Limiting Spectral Measure for Ensembles of Symmetric Block Circulant Matrices, J. of Theoretical Probability 26 (2013), no. 4, 1020–1060.

[LW] D.-Z. Liu and Z.-D. Wang, Limit Distribution of Eigenvalues for Random Hankel and Toeplitz Band Matrices, J. of Theoretical Probability 24 (2011), no. 4, 1063–1086.

[MMS] A. Massey, S. J. Miller, J. Sinsheimer, Distribution of eigenvalues of real symmetric palindromic Toeplitz matrices and circulant matrices, Journal of Theoretical Probability 20 (2007), no. 3, 637–662.

[McK] B. McKay, The expected eigenvalue distribution of a large regular graph, Linear Algebra Appl. 40 (1981), 203–216.

[Me] M. Meckes, The spectra of random abelian G-circulant matrices, ALEA Lat. Am. J. Probab. Math. Stat. 9 (2012), no. 2, 435–450.

[Meh] M. Mehta, Random Matrices, 2nd edition, Academic Press, Boston, 1991.

[MNS] S. J. Miller, T. Novikoff, and A. Sabelli, The distribution of the second largest eigenvalue in families of random regular graphs, Experimental Mathematics 17 (2008), no. 2, 231–244.

[MSTW] S. J. Miller, K. Swanson, K. Tor and K. Winsor, Limiting Spectral Measures for Random Matrix Ensembles with a Polynomial Link Function, Random Matrices: Theory and Applications 4 (2015), no. 2, 1550004 (28 pages).

[MT-B] S. J. Miller and R. Takloo-Bighash, An Invitation to Modern Number Theory, Princeton University Press, Princeton, NJ, 2006, 503 pages.

[Sch] J. Schenker, Eigenvector localization for random band matrices with power law band width, Comm. Math. Phys. 290 (2009), no. 3, 1065–1097.

[Ta] L. Takacs, A Moment Convergence Theorem, The Amer. Math. Monthly 98 (Oct., 1991), no. 8, 742–746.

[Tao1] T. Tao, 254a, notes 4: The semi-circular law, https://terrytao.wordpress.com/2010/02/02/254a-notes-4-the-semi-circular-law/, Posted: 2010-02-02, Accessed: 2016-08-04.

[Tao2] T. Tao, Topics in Random Matrix Theory, Graduate Studies in Mathematics 132, AMS 2012.

[TV1] T. Tao and V. Vu, From the Littlewood-Offord problem to the Circular Law: universality of the spectral distribution of random matrices, Bull. Amer. Math. Soc. 46 (2009), 377–396.

[TV2] T. Tao and V. Vu, Random matrices: universality of local eigenvalue statistics up to the edge, Comm. Math. Phys. 298 (2010), no. 2, 549–572.

[Wig1] E. Wigner, On the statistical distribution of the widths and spacings of nuclear resonance levels, Proc. Cambridge Philo. Soc. 47 (1951), 790–798.

[Wig2] E. Wigner, Characteristic vectors of bordered matrices with infinite dimensions, Ann. of Math. 2 (1955), no. 62, 548–564.

[Wig3] E. Wigner, Statistical Properties of real symmetric matrices. Pages 174–184 in Canadian Mathematical Congress Proceedings, University of Toronto Press, Toronto, 1957.

[Wig4] E. Wigner, Characteristic vectors of bordered matrices with infinite dimensions. II, Ann. of Math. Ser. 2 65 (1957), 203–207.

[Wig5] E. Wigner, On the distribution of the roots of certain symmetric matrices, Ann. of Math. Ser. 2 67 (1958), 325–327.

[Wis] J. Wishart, The generalized product moment distribution in samples from a normal multivariate population, Biometrika 20 A (1928), 32–52.


E-mail address: [email protected]

DEPARTMENT OF MATHEMATICS, POMONA COLLEGE, CLAREMONT, CA 91711

E-mail address: [email protected], [email protected]

DEPARTMENT OF MATHEMATICS, BOWDOIN COLLEGE, BRUNSWICK, ME 04011

E-mail address: [email protected]

DEPARTMENT OF MATHEMATICS AND STATISTICS, HAVERFORD COLLEGE, HAVERFORD, PA 19041

E-mail address: [email protected]

DEPARTMENT OF MATHEMATICS, HARVEY MUDD COLLEGE, CLAREMONT, CA 91711

E-mail address: [email protected], [email protected]

DEPARTMENT OF MATHEMATICS AND STATISTICS, WILLIAMS COLLEGE, WILLIAMSTOWN, MA 01267

E-mail address: [email protected]

DEPARTMENT OF MATHEMATICS, UNIVERSITY OF MICHIGAN, ANN ARBOR, MI 48109

E-mail address: [email protected]

DEPARTMENT OF MATHEMATICS AND STATISTICS, AMHERST COLLEGE, AMHERST, MA 01002

E-mail address: [email protected]

DEPARTMENT OF MATHEMATICS, PRINCETON UNIVERSITY, PRINCETON, NJ 08544

E-mail address: [email protected]

DEPARTMENT OF MATHEMATICS, HARVARD UNIVERSITY, CAMBRIDGE, MA 02138


[Figure: four panels titled "Spectral distribution for 2x2 hollow GOE", "Spectral distribution for 3x3 hollow GOE", "Spectral distribution for 4x4 hollow GOE" and "Spectral distribution for 16x16 hollow GOE".]