67
Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence Acceleration of Randomized Kaczmarz Method Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of Randomized Kaczmarz Method

Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Acceleration of Randomized Kaczmarz Method

Deanna Needell [Joint work with Y. Eldar]

Stanford University

BIRS Banff, March 2011

D. Needell Acceleration of Randomized Kaczmarz Method

Page 2: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Problem Background

Setup

Setup

Let Ax = b be an overdetermined consistent system of equations

D. Needell Acceleration of Randomized Kaczmarz Method

Page 3: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Problem Background

Setup

Setup

Let Ax = b be an overdetermined consistent system of equations

D. Needell Acceleration of Randomized Kaczmarz Method

Page 4: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Problem Background

Setup

Setup

Let Ax = b be an overdetermined consistent system of equations

D. Needell Acceleration of Randomized Kaczmarz Method

Page 5: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Problem Background

Setup

Setup

Let Ax = b be an overdetermined consistent system of equations

Goal

From A and b we wish to recover unknown x . Assume m ≫ n.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 6: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Method

Kaczmarz

The Kaczmarz method is an iterative method used to solveAx = b.

Due to its speed and simplicity, it’s used in a variety ofapplications.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 7: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Method

Kaczmarz

The Kaczmarz method is an iterative method used to solveAx = b.

Due to its speed and simplicity, it’s used in a variety ofapplications.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 8: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Method

Kaczmarz

The Kaczmarz method is an iterative method used to solveAx = b.

Due to its speed and simplicity, it’s used in a variety ofapplications.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 9: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Method

Kaczmarz

1 Start with initial guess x02 xk+1 = xk +

b[i ]−〈ai ,xk〉‖ai‖

22

ai where i = (k mod m) + 1

3 Repeat (2)

D. Needell Acceleration of Randomized Kaczmarz Method

Page 10: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Method

Kaczmarz

1 Start with initial guess x02 xk+1 = xk +

b[i ]−〈ai ,xk〉‖ai‖

22

ai where i = (k mod m) + 1

3 Repeat (2)

D. Needell Acceleration of Randomized Kaczmarz Method

Page 11: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Method

Kaczmarz

1 Start with initial guess x02 xk+1 = xk +

b[i ]−〈ai ,xk〉‖ai‖

22

ai where i = (k mod m) + 1

3 Repeat (2)

D. Needell Acceleration of Randomized Kaczmarz Method

Page 12: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Method

Kaczmarz

1 Start with initial guess x02 xk+1 = xk +

b[i ]−〈ai ,xk〉‖ai‖

22

ai where i = (k mod m) + 1

3 Repeat (2)

D. Needell Acceleration of Randomized Kaczmarz Method

Page 13: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Geometrically

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 14: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Geometrically

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 15: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Geometrically

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 16: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

Geometrically

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 17: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 18: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 19: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 20: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 21: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 22: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 23: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 24: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 25: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Kaczmarz Method

But what if...

Denote Hi = {w : 〈ai ,w〉 = b[i ]}.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 26: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz

Kaczmarz

1 Start with initial guess x02 xk+1 = xk +

b[i ]−〈ai ,xk〉‖ai‖

22

ai where i is chosen randomly

3 Repeat (2)

D. Needell Acceleration of Randomized Kaczmarz Method

Page 27: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz

Kaczmarz

1 Start with initial guess x02 xk+1 = xk +

b[i ]−〈ai ,xk〉‖ai‖

22

ai where i is chosen randomly

3 Repeat (2)

D. Needell Acceleration of Randomized Kaczmarz Method

Page 28: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz

Strohmer-Vershynin

1 Start with initial guess x0

2 xk+1 = xk +bp−〈ap,xk〉

‖ap‖22ap where P(p = i) =

‖ai‖22

‖A‖2F

3 Repeat (2)

D. Needell Acceleration of Randomized Kaczmarz Method

Page 29: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz

Strohmer-Vershynin

1 Start with initial guess x0

2 xk+1 = xk +bp−〈ap,xk〉

‖ap‖22ap where P(p = i) =

‖ai‖22

‖A‖2F

3 Repeat (2)

D. Needell Acceleration of Randomized Kaczmarz Method

Page 30: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz (RK)

Strohmer-Vershynin

Let R = ‖A−1‖2‖A‖2F (‖A−1‖def

= inf{M : M‖Ax‖2 ≥ ‖x‖2 for all x})

Then E‖xk − x‖22 ≤(

1− 1R

)k‖x0 − x‖22

Well conditioned A → Convergence in O(n) iterations → O(n2) total runtime.

Better than O(mn2) runtime for Gaussian elimination and empirically oftenfaster than Conjugate Gradient.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 31: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz (RK)

Strohmer-Vershynin

Let R = ‖A−1‖2‖A‖2F (‖A−1‖def

= inf{M : M‖Ax‖2 ≥ ‖x‖2 for all x})

Then E‖xk − x‖22 ≤(

1− 1R

)k‖x0 − x‖22

Well conditioned A → Convergence in O(n) iterations → O(n2) total runtime.

Better than O(mn2) runtime for Gaussian elimination and empirically oftenfaster than Conjugate Gradient.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 32: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz (RK)

Strohmer-Vershynin

Let R = ‖A−1‖2‖A‖2F (‖A−1‖def

= inf{M : M‖Ax‖2 ≥ ‖x‖2 for all x})

Then E‖xk − x‖22 ≤(

1− 1R

)k‖x0 − x‖22

Well conditioned A → Convergence in O(n) iterations → O(n2) total runtime.

Better than O(mn2) runtime for Gaussian elimination and empirically oftenfaster than Conjugate Gradient.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 33: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz (RK)

Strohmer-Vershynin

Let R = ‖A−1‖2‖A‖2F (‖A−1‖def

= inf{M : M‖Ax‖2 ≥ ‖x‖2 for all x})

Then E‖xk − x‖22 ≤(

1− 1R

)k‖x0 − x‖22

Well conditioned A → Convergence in O(n) iterations → O(n2) total runtime.

Better than O(mn2) runtime for Gaussian elimination and empirically oftenfaster than Conjugate Gradient.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 34: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz (RK) with noise

System with noise

We now consider the consistent system Ax = b corrupted by noiseto form the possibly inconsistent system Ax ≈ b + z .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 35: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz (RK) with noise

Theorem [N]

Let Ax = b be corrupted with noise: Ax ≈ b + z . Then

E‖xk − x‖2 ≤(

1− 1

R

)k/2‖x0 − x‖2 +

√Rγ,

where γ = maxi|z [i ]|‖ai‖2

.

This bound is sharp and attained in simple examples.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 36: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz (RK) with noise

Theorem [N]

Let Ax = b be corrupted with noise: Ax ≈ b + z . Then

E‖xk − x‖2 ≤(

1− 1

R

)k/2‖x0 − x‖2 +

√Rγ,

where γ = maxi|z [i ]|‖ai‖2

.

This bound is sharp and attained in simple examples.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 37: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Randomized Version

Randomized Kaczmarz (RK) with noise

0 20 40 60 80 1000.01

0.015

0.02

0.025

0.03

0.035

0.04

0.045

0.05

0.055

Trials

Err

or

Error in estimation: Gaussian 2000 by 100 after 800 iterations.

ErrorThreshold

0 100 200 300 400 500 600 700 8000

0.2

0.4

0.6

0.8

1

Iterations

Err

or

Error in estimation: Gaussian 2000 by 100

0 20 40 60 80 1000

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

Trials

Err

or

Error in estimation: Partial Fourier 700 by 101 after 1000 iterations.

ErrorThreshold

0 20 40 60 80 100

0.02

0.03

0.04

0.05

0.06

0.07

Trials

Err

or

Error in estimation: Bernoulli 2000 by 100 after 750 iterations.

ErrorThreshold

Figure: Comparison between actual error (blue) and predicted threshold(pink). Scatter plot shows exponential convergence over several trials.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 38: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Even better convergence? : Noiseless case revisited

Recall xk+1 = xk +b[i ]−〈ai ,xk〉

‖ai‖22

ai

Since these projections are orthogonal, the optimal projectionis one that maximizes ‖xk+1 − xk‖2.Therefore we choose i maximizing |b[i ]−〈ai ,xk〉

‖ai‖2|.

Too costly → Project onto low dimensional subspace.

Use the low dimensional representations to predict the optimalprojection.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 39: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Even better convergence? : Noiseless case revisited

Recall xk+1 = xk +b[i ]−〈ai ,xk〉

‖ai‖22

ai

Since these projections are orthogonal, the optimal projectionis one that maximizes ‖xk+1 − xk‖2.Therefore we choose i maximizing |b[i ]−〈ai ,xk〉

‖ai‖2|.

Too costly → Project onto low dimensional subspace.

Use the low dimensional representations to predict the optimalprojection.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 40: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Even better convergence? : Noiseless case revisited

Recall xk+1 = xk +b[i ]−〈ai ,xk〉

‖ai‖22

ai

Since these projections are orthogonal, the optimal projectionis one that maximizes ‖xk+1 − xk‖2.Therefore we choose i maximizing |b[i ]−〈ai ,xk〉

‖ai‖2|.

Too costly → Project onto low dimensional subspace.

Use the low dimensional representations to predict the optimalprojection.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 41: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Even better convergence? : Noiseless case revisited

Recall xk+1 = xk +b[i ]−〈ai ,xk〉

‖ai‖22

ai

Since these projections are orthogonal, the optimal projectionis one that maximizes ‖xk+1 − xk‖2.Therefore we choose i maximizing |b[i ]−〈ai ,xk〉

‖ai‖2|.

Too costly → Project onto low dimensional subspace.

Use the low dimensional representations to predict the optimalprojection.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 42: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Even better convergence? : Noiseless case revisited

Recall xk+1 = xk +b[i ]−〈ai ,xk〉

‖ai‖22

ai

Since these projections are orthogonal, the optimal projectionis one that maximizes ‖xk+1 − xk‖2.Therefore we choose i maximizing |b[i ]−〈ai ,xk〉

‖ai‖2|.

Too costly → Project onto low dimensional subspace.

Use the low dimensional representations to predict the optimalprojection.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 43: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Even better convergence? : Noiseless case revisited

Recall xk+1 = xk +b[i ]−〈ai ,xk〉

‖ai‖22

ai

Since these projections are orthogonal, the optimal projectionis one that maximizes ‖xk+1 − xk‖2.Therefore we choose i maximizing |b[i ]−〈ai ,xk〉

‖ai‖2|.

Too costly → Project onto low dimensional subspace.

Use the low dimensional representations to predict the optimalprojection.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 44: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

JL Dimension Reduction

Johnson-Lindenstrauss Lemma

Let δ > 0 and let S be a finite set of points in Rn. Then for any d

satisfying

d ≥ Clog |S |δ2

, (1)

there exists a Lipschitz mapping Φ : Rn → Rd such that

(1− δ)‖si − sj‖22 ≤ ‖Φ(si )− Φ(sj)‖22 ≤ (1 + δ)‖si − sj‖22, (2)

for all si , sj ∈ S .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 45: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

JL Dimension Reduction

Moreover

In the proof of the JL Lemma the map Φ is chosen as theprojection onto a random d -dimensional subspace of Rn. Nowmany known distributions will yield such a projection.

Recently, transforms with fast multiplies have also been shownto satisfy the JL Lemma [Ailon-Chazelle, Hinrichs-Vybiral,Ailon-Liberty, Krahmer-Ward, ...]

Perform Reduction

Choose such a d × n projector Φ and during preprocessing setαi = Φai .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 46: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

JL Dimension Reduction

Moreover

In the proof of the JL Lemma the map Φ is chosen as theprojection onto a random d -dimensional subspace of Rn. Nowmany known distributions will yield such a projection.

Recently, transforms with fast multiplies have also been shownto satisfy the JL Lemma [Ailon-Chazelle, Hinrichs-Vybiral,Ailon-Liberty, Krahmer-Ward, ...]

Perform Reduction

Choose such a d × n projector Φ and during preprocessing setαi = Φai .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 47: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

JL Dimension Reduction

Moreover

In the proof of the JL Lemma the map Φ is chosen as theprojection onto a random d -dimensional subspace of Rn. Nowmany known distributions will yield such a projection.

Recently, transforms with fast multiplies have also been shownto satisfy the JL Lemma [Ailon-Chazelle, Hinrichs-Vybiral,Ailon-Liberty, Krahmer-Ward, ...]

Perform Reduction

Choose such a d × n projector Φ and during preprocessing setαi = Φai .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 48: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

RK via Johnson-Lindenstrauss (RKJL) [N-Eldar]

Select: Select n rows so that each row ai is chosen withprobability ‖ai‖22/‖A‖2F . For each set

γi =|b[i ]− 〈αi ,Φxk〉|

‖αi‖2,

and set j = argmaxi γi .Test: For aj and the first row al selected set

γ∗j =

|b[j ] − 〈aj , xk 〉|

‖aj‖2

and γ∗l =

|b[l ] − 〈al , xk 〉|

‖al‖2

.

If γ∗l

> γ∗j, set j = l .

Project: Set

xk+1 = xk +b[j ] − 〈aj , xk 〉

‖aj‖22

aj .

Update: Set k = k + 1 and repeat.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 49: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

RK via Johnson-Lindenstrauss (RKJL) [N-Eldar]

Select: Select n rows so that each row ai is chosen with probability ‖ai‖22/‖A‖

2F . For each set

γi =|b[i ] − 〈αi , Φxk 〉|

‖αi‖2

,

and set j = argmaxi γi .

Test: For aj and the first row al selected set

γ∗j =|b[j]− 〈aj , xk〉|

‖aj‖2and γ∗l =

|b[l ]− 〈al , xk〉|‖al‖2

.

If γ∗l > γ∗j , set j = l .Project: Set

xk+1 = xk +b[j ] − 〈aj , xk 〉

‖aj‖22

aj .

Update: Set k = k + 1 and repeat.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 50: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

RK via Johnson-Lindenstrauss (RKJL) [N-Eldar]

Select: Select n rows so that each row ai is chosen with probability ‖ai‖22/‖A‖

2F . For each set

γi =|b[i ] − 〈αi , Φxk 〉|

‖αi‖2

,

and set j = argmaxi γi .

Test: For aj and the first row al selected set

γ∗j =

|b[j ] − 〈aj , xk 〉|

‖aj‖2

and γ∗l =

|b[l ] − 〈al , xk 〉|

‖al‖2

.

If γ∗l > γ∗

j , set j = l .

Project: Set

xk+1 = xk +b[j]− 〈aj , xk〉

‖aj‖22aj .

Update: Set k = k + 1 and repeat.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 51: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

RK via Johnson-Lindenstrauss (RKJL) [N-Eldar]

Select: Select n rows so that each row ai is chosen with probability ‖ai‖22/‖A‖

2F . For each set

γi =|b[i ] − 〈αi , Φxk 〉|

‖αi‖2

,

and set j = argmaxi γi .

Test: For aj and the first row al selected set

γ∗j =

|b[j ] − 〈aj , xk 〉|

‖aj‖2

and γ∗l =

|b[l ] − 〈al , xk 〉|

‖al‖2

.

If γ∗l > γ∗

j , set j = l .

Project: Set

xk+1 = xk +b[j ] − 〈aj , xk 〉

‖aj‖22

aj .

Update: Set k = k + 1 and repeat.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 52: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Runtime

Select: Calculate Φxk : In general O(nd)Calculate γi for each i (of n): O(nd)

Test: Calculate γ∗j and γ∗l : O(n)

Project: Calculate xk+1: O(n)

Overall Runtime

Since each iteration takes O(nd), we have convergence in O(n2d).

D. Needell Acceleration of Randomized Kaczmarz Method

Page 53: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Choosing parameter d

Lemma: Choice of d

Let Φ be the n× d (Gaussian) matrix with d = Cδ−2 log(n) as inthe RKJL method. Set γi = 〈Φai ,Φxk〉 also as in the method.Then |γi − 〈ai , xk〉| ≤ 2δ for all i and k in the first O(n) iterationsof RKJL.

Low Risk

This shows worst case expected convergence in at mostO(n2 log n) time, and of course in most cases one expects farfaster convergence.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 54: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Choosing parameter d

Lemma: Choice of d

Let Φ be the n× d (Gaussian) matrix with d = Cδ−2 log(n) as inthe RKJL method. Set γi = 〈Φai ,Φxk〉 also as in the method.Then |γi − 〈ai , xk〉| ≤ 2δ for all i and k in the first O(n) iterationsof RKJL.

Low Risk

This shows worst case expected convergence in at mostO(n2 log n) time, and of course in most cases one expects farfaster convergence.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 55: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Choosing parameter d

Lemma: Choice of d

Let Φ be the n× d (Gaussian) matrix with d = Cδ−2 log(n) as inthe RKJL method. Set γi = 〈Φai ,Φxk〉 also as in the method.Then |γi − 〈ai , xk〉| ≤ 2δ for all i and k in the first O(n) iterationsof RKJL.

Low Risk

This shows worst case expected convergence in at mostO(n2 log n) time, and of course in most cases one expects farfaster convergence.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 56: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Modified RK

Choosing parameter d

Lemma: Choice of d

Let Φ be the n× d (Gaussian) matrix with d = Cδ−2 log(n) as inthe RKJL method. Set γi = 〈Φai ,Φxk〉 also as in the method.Then |γi − 〈ai , xk〉| ≤ 2δ for all i and k in the first O(n) iterationsof RKJL.

Low Risk

This shows worst case expected convergence in at mostO(n2 log n) time, and of course in most cases one expects farfaster convergence.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 57: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Analytical Justification

Theorem [Assuming row normalization]

Fix an estimation xk and denote by xk+1 and x∗k+1 the next estimationsusing the RKJL and the standard RK method, respectively. Setγ∗

j = |〈aj , xk〉|2 and reorder these so that γ∗

1 ≥ γ∗

2 ≥ . . . ≥ γ∗

m. Then

when d = Cδ−2 log n,

E‖xk+1−x‖22 ≤ min

E‖x∗k+1 − x‖22 −m∑

j=1

(

pj −1

m

)

γ∗

j + 2δ, E‖x∗k+1 − x‖22

where

pj =

{

(m−j

n−1)(mn)

, j ≤ m − n + 1

0, j > m − n + 1

are non-negative values satisfying∑m

j=1 pj = 1 andp1 ≥ p2 ≥ . . . ≥ pm = 0.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 58: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Analytical Justification

Theorem [Assuming row normalization]

Fix an estimation xk and denote by xk+1 and x∗k+1 the next estimationsusing the RKJL and the standard RK method, respectively. Setγ∗

j = |〈aj , xk〉|2 and reorder these so that γ∗

1 ≥ γ∗

2 ≥ . . . ≥ γ∗

m. Then

when d = Cδ−2 log n,

E‖xk+1−x‖22 ≤ min

E‖x∗k+1 − x‖22 −m∑

j=1

(

pj −1

m

)

γ∗

j + 2δ, E‖x∗k+1 − x‖22

where

pj =

{

(m−j

n−1)(mn)

, j ≤ m − n + 1

0, j > m − n + 1

are non-negative values satisfying∑m

j=1 pj = 1 andp1 ≥ p2 ≥ . . . ≥ pm = 0.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 59: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Analytical Justification

Theorem [Assuming row normalization]

Fix an estimation xk and denote by xk+1 and x∗k+1 the next estimationsusing the RKJL and the standard RK method, respectively. Setγ∗

j = |〈aj , xk〉|2 and reorder these so that γ∗

1 ≥ γ∗

2 ≥ . . . ≥ γ∗

m. Then

when d = Cδ−2 log n,

E‖xk+1−x‖22 ≤ min

E‖x∗k+1 − x‖22 −m∑

j=1

(

pj −1

m

)

γ∗

j + 2δ, E‖x∗k+1 − x‖22

where

pj =

{

(m−j

n−1)(mn)

, j ≤ m − n + 1

0, j > m − n + 1

are non-negative values satisfying∑m

j=1 pj = 1 andp1 ≥ p2 ≥ . . . ≥ pm = 0.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 60: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Analytical Justification

Theorem [Assuming row normalization]

Fix an estimation xk and denote by xk+1 and x∗k+1 the next estimationsusing the RKJL and the standard RK method, respectively. Setγ∗

j = |〈aj , xk〉|2 and reorder these so that γ∗

1 ≥ γ∗

2 ≥ . . . ≥ γ∗

m. Then

when d = Cδ−2 log n,

E‖xk+1−x‖22 ≤ min

E‖x∗k+1 − x‖22 −m∑

j=1

(

pj −1

m

)

γ∗

j + 2δ, E‖x∗k+1 − x‖22

where

pj =

{

(m−j

n−1)(mn)

, j ≤ m − n + 1

0, j > m − n + 1

are non-negative values satisfying∑m

j=1 pj = 1 andp1 ≥ p2 ≥ . . . ≥ pm = 0.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 61: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Analytical Justification

Corollary

Fix an estimation xk and denote by xk+1 and x∗k+1 the nextestimations using the RKJL and the standard method, respectively.Set γ∗j = |〈aj , xk〉|2 and reorder these so that γ∗1 ≥ γ∗2 ≥ . . . ≥ γ∗m.Then when exact geometry is preserved (δ → 0),

E‖xk+1 − x‖22 ≤ E‖x∗k+1 − x‖22 −m∑

j=1

(

pj −1

m

)

γ∗j .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 62: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Analytical Justification

Corollary

Fix an estimation xk and denote by xk+1 and x∗k+1 the nextestimations using the RKJL and the standard method, respectively.Set γ∗j = |〈aj , xk〉|2 and reorder these so that γ∗1 ≥ γ∗2 ≥ . . . ≥ γ∗m.Then when exact geometry is preserved (δ → 0),

E‖xk+1 − x‖22 ≤ E‖x∗k+1 − x‖22 −m∑

j=1

(

pj −1

m

)

γ∗j .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 63: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Analytical Justification

Corollary

Fix an estimation xk and denote by xk+1 and x∗k+1 the nextestimations using the RKJL and the standard method, respectively.Set γ∗j = |〈aj , xk〉|2 and reorder these so that γ∗1 ≥ γ∗2 ≥ . . . ≥ γ∗m.Then when exact geometry is preserved (δ → 0),

E‖xk+1 − x‖22 ≤ E‖x∗k+1 − x‖22 −m∑

j=1

(

pj −1

m

)

γ∗j .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 64: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Analytical Justification

Corollary

Fix an estimation xk and denote by xk+1 and x∗k+1 the nextestimations using the RKJL and the standard method, respectively.Set γ∗j = |〈aj , xk〉|2 and reorder these so that γ∗1 ≥ γ∗2 ≥ . . . ≥ γ∗m.Then when exact geometry is preserved (δ → 0),

E‖xk+1 − x‖22 ≤ E‖x∗k+1 − x‖22 −m∑

j=1

(

pj −1

m

)

γ∗j .

D. Needell Acceleration of Randomized Kaczmarz Method

Page 65: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Empirical Evidence

500 1000 1500 2000 2500 30000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

RKRKJL

Figure: ℓ2-Error (y-axis) as a function of the iterations (x-axis). Thedashed line is standard Randomized Kaczmarz, and the solid line is themodified one, without a Johnson-Lindenstrauss projection. Instead, thebest move out of the randomly chosen n rows is used. Note that wecannot afford to do this computationally.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 66: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Justification

Empirical Evidence

500 1000 1500 2000 2500 30000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

RKRKJL, d=1000RKJL, d=500RKJL, d=100RKJL, d=10

Figure: ℓ2-Error (y-axis) as a function of the iterations (x-axis) forvarious values of d with m = 60000 and n = 1000.

D. Needell Acceleration of Randomized Kaczmarz Method

Page 67: Acceleration of Randomized Kaczmarz Methoddeanna/Banfftalk.pdf · Deanna Needell [Joint work with Y. Eldar] Stanford University BIRS Banff, March 2011 D. Needell Acceleration of

Introduction Kaczmarz Randomized Kaczmarz Modified RK Evidence

Thank you

For more information

E-mail:

[email protected]

Web: http://www-stat.stanford.edu/~dneedell

References:

Eldar, Needell, “Acceleration of Randomized Kaczmarz Method via theJohnson-Lindenstrauss Lemma”, Num. Algorithms, to appear.

Needell, “Randomized Kaczmarz solver for noisy linear systems”, BITNum. Math., 50(2) 395–403.

Strohmer, Vershynin, “ A randomized Kaczmarz algorithm withexponential convergence”, J. Four. Ana. and App. 15 262–278.

D. Needell Acceleration of Randomized Kaczmarz Method