33
May 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin Chen , Yuejie Chi * , Andrea J. Goldsmith Stanford University , Ohio State University * Page 1

Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

May 9

Exact and Stable Covariance Estimation from

Quadratic Sampling via Convex Programming

Yuxin Chen†, Yuejie Chi∗, Andrea J. Goldsmith†

Stanford University†, Ohio State University∗

Page 1

Page 2: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

• Data Stream / Stochastic Processes

Each data instance can be high-dimensional

We’re interested in information in the datarather than the data themselves

• Covariance Estimation

second-order statistics Σ ∈ Rn×n

cornerstone of many information processing tasks

High-Dimensional Sequential Data / Signals

Page 2

Page 3: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

What are Quadratic Measurements?

• Quadratic Measurements

obtain m measurements of Σ taking the form

yi ≈ a>i Σai (1 ≤ i ≤ m)

rank-1 measurements!

Page 3

Page 4: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Example: Applications in Spectral Estimation

• High-frequency wireless and signal processing (Energy Measurements)

Spectral estimation of stationary processes (possibly sparse)

Page 4

Page 5: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Example: Applications in Spectral Estimation

• High-frequency wireless and signal processing (Energy Measurements)

Spectral estimation of stationary processes (possibly sparse)

Channel Estimation in MIMO Channels

Page 4

Page 6: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

10 20 30 40

5

10

15

20

25

30

35

40

45

10 20 30 40

5

10

15

20

25

30

35

40

45

Fig credit: Chi et al

Example: Applications in Optics

• Phase Space Tomography

measure correlation functions of a wave field

Page 5

Page 7: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

10 20 30 40

5

10

15

20

25

30

35

40

45

10 20 30 40

5

10

15

20

25

30

35

40

45

courtesy of Chi et al

Figure 1: A typical setup for structured illuminations in diffraction imaging using a phase mask.

Figure 2: A typical setup for structured illuminations in diffraction imaging using oblique illumina-tions. The left image shows direct (on-axis) illumination and the right image corresponds to oblique(off-axis) illumination.

6

courtesy of Candes et al

Example: Applications in Optics

• Phase Space Tomography

measure correlation functions of a wave field

• Phase Retrieval

signal recovery from magnitude measurements

Page 5

Page 8: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

binary data stream by Kazmin

Example: Applications in Data Streams

• Covariance Sketching

data stream: real-time data xt∞t=1 arriving sequentially at a high rate...

• Challenges

limited memory

computational efficiency

hopefully a single pass over the data

Page 6

Page 9: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Proposed Quadratic Sketching Method

1) Sketching:

at each time t, obtain a quadratic sketch (a>i xt)2

— ai: sketching vector

Page 7

Page 10: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Proposed Quadratic Sketching Method

1) Sketching:

at each time t, obtain a quadratic sketch (a>i xt)2

— ai: sketching vector

2) Aggregation:

all sketches are aggregated into m measurements

yi = a>i

(1

T

T∑t=1

xtx>t

)ai ≈ a>i Σai (1 ≤ i ≤ m)

Page 7

Page 11: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Proposed Quadratic Sketching Method

1) Sketching:

at each time t, obtain a quadratic sketch (a>i xt)2

— ai: sketching vector

2) Aggregation:

all sketches are aggregated into m measurements

yi = a>i

(1

T

T∑t=1

xtx>t

)ai ≈ a>i Σai (1 ≤ i ≤ m)

• Benefits:

one pass

minimal storage (as will be shown)

Page 7

Page 12: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

• Given: m ( n2) quadratic measurements y = yimi=1

yi = a>i Σai + ηi, i = 1, · · · ,m,

ai : sampling vectors

η = ηimi=1: noise terms

more concise operator form:

y = A(Σ) + η

• Goal: recover Σ ∈ Rn×n.

• Sampling model

sub-Gaussian i.i.d. sampling vectors

Problem Formulation

Page 8

Page 13: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Piet Mondrian

1) low rank 2) Toeplitz low rank 3) jointly sparse and low rank

Geometry of Covariance Structure

• # unknown > # stored measurements

exploit low-dimensional structures!

• Structures considered in this talk:

low rank

Toeplitz low rank

simultaneously sparse and low-rank

Page 9

Page 14: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Low Rank

• Low-Rank Structure:

A few components explains most of the data variability

metric learning, array signal processing, collaborative filtering ...

• rank(Σ) = r n.

Page 10

Page 15: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

• Trace Minimization

(TraceMin) minimizeM trace (M)︸ ︷︷ ︸low rank

s.t. ‖A (M)− y‖1 ≤ ε︸︷︷︸noise bound

,

M 0.

inspired by Candes et. al. for phase retrieval

Trace Minimization for Low-Rank Structure

Page 11

Page 16: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

minimize tr (M) s.t. ‖A (M)− y‖1 ≤ ε, M 0

Theorem 1 (Low Rank). With high prob, for all Σ with rank(Σ) ≤ r, thesolution Σ to TraceMin obeys

‖Σ−Σ‖F .‖Σ−Σr‖∗√

r︸ ︷︷ ︸due to imperfect structure

m︸︷︷︸due to noise

,

provided that m & rn. (Σr: rank-r approx of Σ)

• Exact recovery in the noiseless case

• Universal recovery: simultaneously works for all low-rank matrices

• Robust recovery when Σ is approximately low-rank

• Stable recovery against bounded noise

Near-Optimal Recovery for Low-Rank Structure

Page 12

Page 17: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

m / (n*n)

r/n

0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

theoretic sampling limit

empirical success probability of Monte Carlo trials: n = 50

Phase Transition for Low-Rank Recovery

• Near-Optimal Storage Complexity!

degrees of freedom ≈ rn

Page 13

Page 18: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Toeplitz Low Rank

• Toeplitz Low-Rank Structure:

Spectral sparsity!∗ possibly off-the-grid frequency spikes (Vandemonde decomposition)

wireless communication, array signal processing ...

• rank(Σ) = r n.

Page 14

Page 19: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

• Trace Minimization

(ToepTraceMin) minimizeM trace (M)︸ ︷︷ ︸low rank

s.t. ‖A (M)− y‖2 ≤ ε2︸︷︷︸noise bound

,

M 0,

M is Toeplitz.

Trace Minimization for Toeplitz Low-Rank Structure

Page 15

Page 20: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

minimize tr (M) s.t. ‖A (M)− y‖2 ≤ ε2, M 0, M is Toeplitz

Theorem 2 (Toeplitz Low Rank). With high prob, for all Toeplitz Σ withrank(Σ) ≤ r, the solution Σ to ToepTraceMin obeys

‖Σ−Σ‖F .ε2√m︸︷︷︸

due to noise

,

provided that m & rpoly log(n).

• Exact recovery in the absence of noise

• Universal recovery: simultaneously works for all Toeplitz low-rank matrices

• Stable recovery against bounded noise

Toeplitz ball

Near-Optimal Recovery for Toeplitz Low-Rank Structure

Page 16

Page 21: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

m: number of measurements

r: r

ank

5 10 15 20 25 30 35 40 45 500

5

10

15

20

25

30

35

40

45

50

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

theoretic sampling limit

empirical success probability of Monte Carlo trials: n = 50

Phase Transition for Toeplitz Low-Rank Recovery

• Near-Optimal Storage Complexity!

degrees of freedom ≈ r

Page 17

Page 22: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Simultaneous Structure

• Joint Structure: Σ is simultaneously sparse and low-rank.

rank: r

sparsity: k

SVD: Σ = UΛU>, where U = [u1, · · · ,ur]

Page 18

Page 23: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

• Convex Relaxation

minimizeM trace (M)︸ ︷︷ ︸low rank

+ λ‖M‖1︸ ︷︷ ︸sparsity

s.t. ‖A (M)− y‖1 ≤ ε︸︷︷︸noise bound

,

M 0.

coincides with Li and Voroninski for rank-1 cases

Convex Relaxation for Simultaneous Structure

Page 19

Page 24: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Exact Recovery for Simultaneous Structure

minimize tr (M) + λ ‖M‖1 s.t. A (M) = y, M 0

Theorem 3 (Simultaneous Structure). SDP with λ ∈[

1n,

1NΣ

]is exact with

high probability, provided that

m &r log n

λ2(1)

where NΣ := max

‖sign (ΣΩ)‖ ,

√k∑r

i=1‖ui‖21r

.

• Exact recovery with appropriate regularization parameters

• Question: how good is the storage complexity (1)?

Page 20

Page 25: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Compressible Covariance Matrices: Near-Optimal Recovery

Definition (Compressible Matrices)

• non-zero entries of ui exhibit power-law decays

‖ui‖1 = O(poly log(n)).

Page 21

Page 26: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Compressible Covariance Matrices: Near-Optimal Recovery

Definition (Compressible Matrices)

• non-zero entries of ui exhibit power-law decays

‖ui‖1 = O(poly log(n)).

Corollary 1 (Compressible Case). For compressible covariance matrices, SDPwith λ ≈ 1√

kis exact w.h.p., provided that

m & kr · poly log(n).

• Near-Minimal Measurements!

degree-of-freedom: Θ(kr)

Page 22

Page 27: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

• noise: ‖η‖1 ≤ ε

• imperfect structural assumption: Σ = ΣΩ︸︷︷︸simultaneous sparse and low-rank

+ Σc︸︷︷︸residuals

Stability and Robustness

Page 23

Page 28: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

• noise: ‖η‖1 ≤ ε

• imperfect structural assumption: Σ = ΣΩ︸︷︷︸simultaneous sparse and low-rank

+ Σc︸︷︷︸residuals

Theorem 4. Under the same λ as in Theorem 1 or Corollary 1,

∥∥∥Σ−ΣΩ

∥∥∥F.

1√r

‖Σc‖∗ + λ ‖Σc‖1︸ ︷︷ ︸due to imperfect structure

m︸︷︷︸due to noise

• stable against bounded noise

• robust against imperfect structural assumptions

Stability and Robustness

Page 24

Page 29: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Mixed-Norm RIP (for Low-Rank and Joint Structure)

• Restricted Isometry Property: a powerful notion for compressed sensing

∀X in some class : ‖B (X)‖2 ≈ ‖X‖F .

unfortunately, it does NOT hold for quadratic models

Page 25

Page 30: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Mixed-Norm RIP (for Low-Rank and Joint Structure)

• Restricted Isometry Property: a powerful notion for compressed sensing

∀X in some class : ‖B (X)‖2 ≈ ‖X‖F .

unfortunately, it does NOT hold for quadratic models

• A Mixed-norm Variant: RIP-`2/`1

∀X in some class : ‖B (X)‖1 ≈ ‖X‖F .

Page 25

Page 31: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Mixed-Norm RIP (for Low-Rank and Joint Structure)

• Restricted Isometry Property: a powerful notion for compressed sensing

∀X in some class : ‖B (X)‖2 ≈ ‖X‖F .

unfortunately, it does NOT hold for quadratic models

• A Mixed-norm Variant: RIP-`2/`1

∀X in some class : ‖B (X)‖1 ≈ ‖X‖F .

does NOT hold for A, but hold after A is debiased

A very simple proof for PhaseLift!

Page 25

Page 32: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Piet Mondrian

Concluding Remarks

• Our approach / analysis works for other structural models

Sparse covariance matrix

Low-Rank plus Sparse matrix

• The way ahead

Sparse inverse covariance matrix

Beyond sub-Gaussian sampling

Online recovery algorithms

Page 26

Page 33: Exact and Stable Covariance Estimation from Quadratic ...yc5/slides/CovEst_slides.pdfMay 9 Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming Yuxin

Q&A

Full-length version available at arXiv:

Exact and Stable Covariance Estimation from Quadratic Samplingvia Convex Programming

http://arxiv.org/abs/1310.0807

Thank You! Questions?

Page 27