Low ranks in computational Fourier analysis Part II: Fast matrix vector multiplication Stefan Kunis

Low ranks in computational Fourier analysis - TU Chemnitzpotts/cms/cms15/talk2kunis.pdf · Low ranks in computational Fourier analysis Part II: Fast matrix vector multiplication Stefan



Low ranks in computational Fourier analysis

Part II: Fast matrix vector multiplication

Stefan Kunis


Outline

Part I: Introduction

Part II: Fast matrix vector multiplication

Part III: Efficient Reconstruction


Outline of part II

Hierarchical matrices in a nutshell

Fast Laplace transform - hierarchical approximation

Sparse fast Fourier transform - butterfly approximation

Application I - photoacoustic imaging

Application II - evaluating polynomials in a disk


Hierarchical matrices in a nutshell

hierarchical matrices (Greengard, Rokhlin; Hackbusch; Börm, Grasedyck; Bebendorf; Fenn, Steidl)

model problem d = 1

source nodes $y_\ell \in [0,1]$, coefficients $f_\ell \in \mathbb{R}$, $\ell = 1,\dots,N$, and target nodes $x_j \in [0,1]$, $x_j \ne y_\ell$, $j,\ell = 1,\dots,N$; compute

\[ u_j = \sum_{\ell=1}^{N} \frac{f_\ell}{|x_j - y_\ell|} \]

naive: $O(N^2)$ floating point operations


Hierarchical matrices in a nutshell

the kernel $\kappa : [0,1] \times [0,1] \to \mathbb{R}$,

\[ \kappa(x,y) = \frac{1}{|x-y|} \]

is asymptotically smooth


Hierarchical matrices in a nutshell

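$X = [x_{\min}, x_{\max}]$, $Y = [y_{\min}, y_{\max}]$

degenerate approximation $\tilde\kappa : X \times Y \to \mathbb{R}$

\[ \tilde\kappa(x,y) = \sum_{s=0}^{p-1} (x - x_0)^s \cdot (y - x_0)^{-(s+1)} \]

admissibility condition

\[ |x_{\max} - x_{\min}| = \operatorname{diam}(X) \le \operatorname{dist}(X,Y) = |y_{\min} - x_{\max}| \]

if $p \ge C\,|\log \varepsilon|$, then

\[ \|\kappa - \tilde\kappa\|_{C(X\times Y)} \le \varepsilon \]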
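The degenerate approximation can be tried numerically. The following is a minimal sketch, assuming that the expansion point x0 is the midpoint of X and using the hypothetical example X = [0, 1/4], Y = [1/2, 3/4], which satisfies the admissibility condition with equality; the function names are illustrative only.

```python
import numpy as np

def taylor_factors(x, y, x0, p):
    """Factors of the degenerate kernel sum_s (x - x0)^s (y - x0)^-(s+1)."""
    s = np.arange(p)
    Phi = (x[:, None] - x0) ** s             # column s holds (x_j - x0)^s
    Psi = (y[:, None] - x0) ** (-(s + 1.0))  # column s holds (y_l - x0)^-(s+1)
    return Phi, Psi

def max_error(p, n=50):
    """Maximal error of the rank-p approximation on an admissible pair."""
    x = np.linspace(0.0, 0.25, n)   # X = [0, 1/4]
    y = np.linspace(0.5, 0.75, n)   # Y = [1/2, 3/4], diam(X) = dist(X, Y)
    K = 1.0 / np.abs(x[:, None] - y[None, :])
    Phi, Psi = taylor_factors(x, y, x0=0.125, p=p)  # x0 = midpoint (assumption)
    return np.max(np.abs(K - Phi @ Psi.T))

for p in (5, 10, 20):
    print(p, max_error(p))
```

In this example the ratio |x - x0| / |y - x0| is at most 1/3, so the error should shrink roughly like 3^(-p), in line with the requirement p ≥ C |log ε|.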


Hierarchical matrices in a nutshell

dyadic decomposition of X = [0, 1]

Level l = 0: X00 = [0, 1]

Level l = 1: X10 = [0, 1/2], X11 = [1/2, 1]

Level l = 2: X20, X21, X22, X23 with breakpoints 0, 1/4, 1/2, 3/4, 1

Level l = 3: X30, X31, X32, X33, X34, X35, X36, X37 with breakpoints 0, 1/8, 1/4, 3/8, 1/2, 5/8, 3/4, 7/8, 1


Hierarchical matrices in a nutshell

dyadic decompositions - two binary trees

X00

X10 X11

X20 X21 X22 X23

X30 X31 X32 X33 X34 X35 X36 X37

Y00

Y10 Y11

Y20 Y21 Y22 Y23

Y30 Y31 Y32 Y33 Y34 Y35 Y36 Y37


Hierarchical matrices in a nutshell

admissible pairs - a quadtree

X00 × Y00

X10 × Y10 X10 × Y11 X11 × Y10 X11 × Y11

X20 × Y22 X20 × Y23 X21 × Y22 X21 × Y23

X32 × Y34 X32 × Y35 X33 × Y34 X33 × Y35
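The quadtree of blocks can be generated recursively: a block is kept if it is admissible, otherwise it is split into its four children until a finest level is reached. The sketch below assumes the admissibility condition diam(X) ≤ dist(X, Y) from the earlier slide and encodes a block Xli × Ylj as the triple (l, i, j); the function names and the choice of a fixed finest level are illustrative assumptions.

```python
from fractions import Fraction

def interval(level, index):
    """Dyadic interval number `index` on level `level` of [0, 1]."""
    h = Fraction(1, 2**level)
    return index * h, (index + 1) * h

def is_admissible(level, i, j):
    """Check diam(X) <= dist(X, Y) for X = X_{level,i} and Y = Y_{level,j}."""
    (a, b), (c, d) = interval(level, i), interval(level, j)
    dist = max(c - b, a - d, Fraction(0))
    return (b - a) <= dist

def block_partition(max_level):
    """Split [0,1] x [0,1] into admissible blocks and remaining near blocks."""
    admissible, near = [], []
    stack = [(0, 0, 0)]
    while stack:
        level, i, j = stack.pop()
        if is_admissible(level, i, j):
            admissible.append((level, i, j))
        elif level == max_level:
            near.append((level, i, j))       # too close, handled directly
        else:
            stack.extend((level + 1, 2 * i + a, 2 * j + b)
                         for a in (0, 1) for b in (0, 1))
    return admissible, near

admissible, near = block_partition(3)
print(sorted(admissible))
```

With finest level 3 the list contains, for example, (2, 0, 2) and (3, 2, 4), that is X20 × Y22 and X32 × Y34, while X21 × Y22 is split further because the two intervals touch.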


Hierarchical matrices in a nutshell

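[Figure: partitioning of the matrix with rows X∗ and columns Y∗ into blocks (H-matrix)]

$X_* = \{x_j \in [0,1] : j = 1,\dots,N\}$, $Y_* = \{y_\ell \in [0,1] : \ell = 1,\dots,N\}$ well distributed

local computations, admissible block in level $l$

\[ \kappa(x_j, y_\ell) \approx \sum_{s=0}^{p-1} \varphi_s(x_j)\,\psi_s(y_\ell), \qquad K \approx \Phi\Psi^\top, \qquad \text{cost } \frac{2pN}{2^l} \]

total computations

\[ O(N \log N\, |\log \varepsilon|) \]

naive: $O(N^2)$ floating point operations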
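A rough count behind the total, as a sketch only: it assumes (this is not stated on the slide) that level $l$ contains at most $c\,2^l$ admissible blocks for some constant $c$, that there are about $\log_2 N$ levels, and that $p = C\,|\log\varepsilon|$ as in the error estimate.

```latex
\[
\sum_{l=1}^{\log_2 N} c\,2^l \cdot \frac{2pN}{2^l}
  \;=\; 2c\,p\,N \log_2 N
  \;=\; O\bigl(N \log N\,|\log\varepsilon|\bigr).
\]
```

Under these assumptions every level contributes the same amount of work, which is where the factor $\log N$ comes from.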


Fast Laplace transform - hierarchical approximation

discrete Laplace transform (Rokhlin; Strain; Andersson)

model problem, d = 1

$T \subset [0, T_{\max}]$, $|T| = N$

$X \subset [0, X_{\max}]$, $|X| = N$

$\hat f = (\hat f_k)_{k\in T} \in \mathbb{C}^N$

evaluate sum of exponentials for $x \in X$

\[ f(x) = \sum_{k\in T} \hat f_k\, e^{-kx} \]

naive: $O(N^2)$ floating point operations


Fast Laplace transform - hierarchical approximation

admissibility condition

diam(T ) ≤ dist(T , 0) and diam(X ) ≤ dist(X , 0)

kernel function $\kappa : [0,1]^2 \to \mathbb{R}$, $\kappa(k,x) = e^{-kx}$

using singular value decomposition

[Figure: locally rank 1 approximation of κ]

[Figures: error of the locally rank 1, rank 2, and rank 3 approximations]
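The local low rank can be observed with a truncated singular value decomposition of a sampled block. The sketch below assumes the hypothetical admissible block k, x ∈ [1/2, 1] (so that diam = dist to 0 = 1/2) and a 64 × 64 equispaced sampling; both choices are illustrative and not taken from the slides.

```python
import numpy as np

def kernel_block(n=64, lo=0.5, hi=1.0):
    """Sample kappa(k, x) = exp(-k x) on the admissible block [lo, hi]^2."""
    k = np.linspace(lo, hi, n)
    x = np.linspace(lo, hi, n)
    return np.exp(-np.outer(x, k))

def truncated_svd_errors(B, max_rank=3):
    """Singular values and max-norm errors of the rank-r truncations."""
    U, s, Vt = np.linalg.svd(B)
    errors = []
    for r in range(1, max_rank + 1):
        B_r = (U[:, :r] * s[:r]) @ Vt[:r, :]   # best rank-r approximation
        errors.append(np.max(np.abs(B - B_r)))
    return s, errors

B = kernel_block()
s, errors = truncated_svd_errors(B)
for r, err in enumerate(errors, start=1):
    print(f"rank {r}: max error {err:.2e}, next singular value {s[r]:.2e}")
```

The entrywise error of the rank-r truncation is bounded by the singular value s[r], so rapidly decaying singular values translate directly into small local errors.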


Fast Laplace transform - hierarchical approximation

kernel function $\kappa : [0, T_{\max}] \times [0, X_{\max}] \to \mathbb{R}$,

\[ \kappa(k,x) = e^{-kx} \]

admissibility condition allowing low rank approximation

diam(X ) ≤ dist(X , 0)

diam(T ) ≤ dist(T , 0)

subdivide both intervals geometrically

[Figure: geometric subdivision of [0, Xmax] into intervals X1, X2, X3, X4, X5]
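A geometric subdivision that meets the admissibility condition can be written down explicitly. The sketch below assumes halving, that is the intervals [Xmax / 2^m, Xmax / 2^(m-1)] for m = 1, 2, ...; the factor 2 and the function names are assumptions made for illustration.

```python
def geometric_intervals(x_max, levels):
    """Intervals (x_max / 2**m, x_max / 2**(m - 1)) for m = 1, ..., levels."""
    return [(x_max / 2**m, x_max / 2**(m - 1)) for m in range(1, levels + 1)]

def is_admissible(a, b):
    """Check diam([a, b]) <= dist([a, b], 0) for 0 <= a <= b."""
    return (b - a) <= a

intervals = geometric_intervals(8.0, 5)
for a, b in intervals:
    print((a, b), is_admissible(a, b))
```

Each of these intervals satisfies the condition with equality, while the remaining piece next to 0 is not admissible and has to be treated separately.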


Fast Laplace transform - hierarchical approximation

let X and T be admissible and

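\[ B = \left(e^{-kx}\right)_{x\in X,\,k\in T} \]

polynomial interpolation in $k$ and $x$ at $t_s = \cos\frac{2s-1}{2p}\pi$ (Trefethen)

$L_X \in \mathbb{R}^{|X|\times p}$, $L_T \in \mathbb{R}^{p\times|T|}$ Lagrange matrices, and

\[ B_p = \left(e^{-(t_r+1)(t_s+1)\,\operatorname{diam}T\,\operatorname{diam}X/4}\right)_{r,s=1}^{p} \]

Lemma (local error)

If $X, T \subset [0,\infty)$ are admissible, then

\[ \|B - L_X B_p L_T\|_{1\to\infty} \le 2\cdot 4^{-p} \]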
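The factorisation can be imitated with plain tensor-product interpolation. The sketch below is a simplified variant and not the construction of the lemma: it interpolates the kernel directly at Chebyshev nodes mapped to the hypothetical admissible intervals X = T = [1, 2], so its middle matrix is the kernel at the mapped nodes rather than the scaled B_p above; all function names are illustrative.

```python
import numpy as np

def chebyshev_nodes(p, a, b):
    """Nodes t_s = cos((2s - 1) pi / (2p)), s = 1..p, mapped from [-1, 1] to [a, b]."""
    s = np.arange(1, p + 1)
    t = np.cos((2 * s - 1) * np.pi / (2 * p))
    return 0.5 * (a + b) + 0.5 * (b - a) * t

def lagrange_matrix(points, nodes):
    """Matrix with entries L[i, s] = (Lagrange polynomial s)(points[i])."""
    L = np.ones((points.size, nodes.size))
    for s in range(nodes.size):
        for r in range(nodes.size):
            if r != s:
                L[:, s] *= (points - nodes[r]) / (nodes[s] - nodes[r])
    return L

def interpolation_error(p, n=200, seed=0):
    """Largest entry of |B - L_X B_p L_T| for random nodes in [1, 2]."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(1.0, 2.0, n)           # targets in X = [1, 2]
    k = rng.uniform(1.0, 2.0, n)           # sources in T = [1, 2]
    B = np.exp(-np.outer(x, k))
    xn = chebyshev_nodes(p, 1.0, 2.0)
    kn = chebyshev_nodes(p, 1.0, 2.0)
    L_X = lagrange_matrix(x, xn)           # |X| x p
    L_T = lagrange_matrix(k, kn).T         # p x |T|
    B_p = np.exp(-np.outer(xn, kn))        # kernel at the interpolation nodes
    return np.max(np.abs(B - L_X @ B_p @ L_T))

for p in (4, 8):
    print(p, interpolation_error(p), 2 * 4.0**-p)
```

The largest entry of the difference is the norm used in the lemma, so the printed values can be compared with 2 · 4^(-p) for this particular example.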


Fast Laplace transform - hierarchical approximation

Theorem (global error, complexity)

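Let $N \in \mathbb{N}$, $\varepsilon > 0$, $X, T \subset [0,\infty)$, then a matrix vector product with $B = (e^{-x_j k_\ell})_{j,\ell=1,\dots,N}$ can be computed in

\[ O\left(N \log\frac{1}{\varepsilon} + \log^3\frac{1}{\varepsilon}\,\log\frac{x_{\max}k_{\max}}{\varepsilon}\right) \]

floating point operations.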

[Figure: error vs. p for N = 2^14 (left); time vs. N for p = 8 (right)]


Sparse fast Fourier transform - butterfly approximation

model problem, d = 1

T ⊂ [0,N], |T | = N

X ⊂ [0,N], |X | = N

$\hat f = (\hat f_k)_{k\in T} \in \mathbb{C}^N$

evaluate almost periodic function for $x \in X$

\[ f(x) = \sum_{k\in T} \hat f_k\, e^{2\pi i k x/N} \]

naive: $f = A\hat f$ takes $O(N^2)$ floating point operations

FFT for nonequispaced nodes in time and frequency domain (nnFFT, Elbel, Steidl; Potts, Steidl, Tasche; Keiner, Knopp, Potts, K.; type-3 nuFFT, Greengard, Lee)

butterfly approximation scheme (Edelman; Michielsen, Boag; Chew, Song; Ying; O'Neil, Woolfe, Rokhlin; Candès, Demanet; Tygert)


Sparse fast Fourier transform - butterfly approximation

Lemma (local error)

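Let $N, p \in \mathbb{N}$, $X, T \subset [0,N]$ fulfil the admissibility condition $\operatorname{diam}(T)\operatorname{diam}(X) \le N$, then

\[ \|A - L_X A_p L_T\|_{1\to\infty} \le 3\cdot\left(\frac{\pi}{p-1}\right)^p \]

SVD of an admissible block

lower bound (Widom)

\[ C\left(\frac{\pi}{8}\right)^p \frac{1}{p!} \approx C'\left(\frac{1.06}{p}\right)^p \]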

[Figure: local error vs. p, N = 2^10]
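The constant 1.06 in the lower bound can be traced with Stirling's formula $p! \approx \sqrt{2\pi p}\,(p/e)^p$. This is a short check, assuming that the slowly varying factor $1/\sqrt{2\pi p}$ is what the slide absorbs into $C'$:

```latex
\[
\left(\frac{\pi}{8}\right)^p \frac{1}{p!}
  \;\approx\; \frac{1}{\sqrt{2\pi p}} \left(\frac{\pi e}{8p}\right)^p,
  \qquad \frac{\pi e}{8} \approx 1.067 .
\]
```

Both the upper bound of the lemma and this lower bound therefore decay super-exponentially in $p$, like $(c/p)^p$ with different constants $c$.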


Sparse fast Fourier transform - butterfly approximation

dyadic decomposition of X = [0, N]

Level l = 0: X00 = [0, N]

Level l = 1: X10 = [0, N/2], X11 = [N/2, N]

Level l = 2: X20, X21, X22, X23 with breakpoints 0, N/4, N/2, 3N/4, N


Sparse fast Fourier transform - butterfly approximation

dyadic decompositions - two binary trees

X00

X10 X11

X20 X21 X22 X23

T00

T10 T11

T20 T21 T22 T23


Sparse fast Fourier transform - butterfly approximation

admissible pairs - a butterfly graph, N = 4

X00 × T20 X00 × T21 X00 × T22 X00 × T23

X10 × T10 X10 × T11 X11 × T10 X11 × T11

X20 × T00 X21 × T00 X22 × T00 X23 × T00
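The pairs in the butterfly graph combine level l of the X tree with level L - l of the T tree, where L = log2 N. The sketch below enumerates them for N a power of two and encodes X_{l,m} × T_{L-l,n} as ((l, m), (L - l, n)); the encoding and the function names are illustrative assumptions.

```python
def butterfly_pairs(N):
    """Pairs X_{l,m} x T_{L-l,n} for l = 0..L, with L = log2(N)."""
    L = N.bit_length() - 1
    if 2**L != N:
        raise ValueError("N must be a power of two")
    levels = []
    for l in range(L + 1):
        pairs = [((l, m), (L - l, n))
                 for m in range(2**l) for n in range(2**(L - l))]
        levels.append(pairs)
    return levels

def diam(N, level):
    """Length of a dyadic interval of [0, N] on the given level."""
    return N / 2**level

levels = butterfly_pairs(4)
for pairs in levels:
    print(pairs)
```

For N = 4 this gives three rows of four pairs each, as listed above, and for every pair diam(X) · diam(T) = N, so the admissibility condition of the lemma holds with equality.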


Sparse fast Fourier transform - butterfly approximation

Theorem (global error, complexity)

Let $N \in \mathbb{N}$, $\varepsilon > 0$, $X, T \subset [0,N]$, then a matrix vector product with $A = (e^{2\pi i x_j k_\ell/N})_{j,\ell=1,\dots,N}$ can be computed in

\[ O\left(N \log N \,\log^2\frac{N}{\varepsilon}\right) \]

floating point operations.

[Figure: d = 3, sparse T (left) and sparse X (right)]


Application I - photoacoustic imaging

spherical means, $f : \mathbb{R}^d \to \mathbb{R}$

\[ Mf(z,t) = \int_{S^{d-1}} f(z + t x)\,d\sigma(x), \qquad z \in S^{d-1},\ t \in [0,2] \]

[Figure: N^2 function values (left); N × N measurements (right)]


Application I - photoacoustic imaging

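spectral ansatz $e_k(x) = e^{2\pi i k x}$, $k \in I = [-\frac{N}{2}, \frac{N}{2})^d \cap \mathbb{Z}^d$,

\[ M e_k(y,r) = \frac{\Gamma\left(\frac{d}{2}\right) J_{\frac{d}{2}-1}(2\pi|k|r)}{(\pi|k|r)^{\frac{d}{2}-1}} \cdot e_k(y) \]

$d = 3$, $|I| = N^3$, $|\mathcal{Y}| = N^2$, $|\mathcal{R}| = N$

\[ J_{\frac{1}{2}}(t) = \sqrt{\frac{2}{\pi t}} \cdot \sin t \]

butterfly sparse fast Fourier transform

\[ T = \left\{ \begin{pmatrix} k \\ \pm|k| \end{pmatrix} : k \in I \right\}, \qquad X = \mathcal{Y} \times \mathcal{R} \]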
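For $d = 3$ the multiplier simplifies, which may explain the form of the frequency set $T$. A short derivation, using $\Gamma(3/2) = \sqrt{\pi}/2$ and the formula for $J_{1/2}$ above, for $k \ne 0$; the inner product notation $\langle\cdot,\cdot\rangle$ on $\mathbb{R}^4$ is introduced here for illustration only:

```latex
\[
\frac{\Gamma\left(\tfrac32\right) J_{\frac12}(2\pi|k|r)}{(\pi|k|r)^{\frac12}}
  = \frac{\sqrt{\pi}}{2}\cdot\frac{1}{\pi\sqrt{|k|r}}\cdot\frac{\sin(2\pi|k|r)}{\sqrt{\pi|k|r}}
  = \frac{\sin(2\pi|k|r)}{2\pi|k|r},
\]
\[
M e_k(y,r)
  = \frac{1}{4\pi i\,|k|\,r}
    \left( e^{2\pi i \langle (k,|k|),\,(y,r)\rangle}
         - e^{2\pi i \langle (k,-|k|),\,(y,r)\rangle} \right).
\]
```

Each spherical mean of $e_k$ is thus a combination of two exponentials with frequencies $(k, \pm|k|)$ evaluated at the nodes $(y, r) \in \mathcal{Y} \times \mathcal{R}$.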


Application I - photoacoustic imaging

d = 3, problem size $N^3$, time complexity (Görner, Hielscher, K.)

naive $O(N^5)$, butterfly $O(N^3 \log^6 N)$

iterative reconstruction (Brandt, Dong, Görner, K.)

\[ \|Mf - g\|_2^2 + \lambda \|f\|_{TV} \to \min \]

[Figure, d = 2: geometry, least squares reconstruction, regularised reconstruction]


Application II - evaluating polynomials in a disk

model problem

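$T := \{1,\dots,N\}$, $Z \subset \{z \in \mathbb{C} : |z| \le 1\}$, $|Z| = N$

$\hat f = (\hat f_k)_{k\in T} \in \mathbb{C}^N$

evaluate the polynomial

\[ f(z) = \sum_{k\in T} \hat f_k z^k, \qquad z \in Z \]

naive: $O(N^2)$ floating point operations

idea: write $z = e^{-y} e^{2\pi i x}$; if $B = L_Y B_p L_T$, then, with $\odot$ the entrywise (Hadamard) product and $\mathbf{1}$ the vector of ones,

\[ (A \odot B)\hat f = \left( L_Y \odot A\,(\operatorname{diag}\hat f)\, L_T^\top B_p^\top \right) \mathbf{1} \]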
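The identity can be checked numerically with random matrices. In the sketch below, A is a dense stand-in for the Fourier-type matrix and L_Y, B_p, L_T are arbitrary factors of matching sizes; in an actual fast scheme the p products with A would be computed by a fast transform, which this illustration does not attempt. The function name is hypothetical.

```python
import numpy as np

def hadamard_matvec(A, f_hat, L_Y, B_p, L_T):
    """Evaluate (A ⊙ (L_Y B_p L_T)) f_hat without forming the product L_Y B_p L_T."""
    # A diag(f_hat) L_T^T B_p^T: p matrix vector products with A
    inner = (A * f_hat) @ L_T.T @ B_p.T        # shape N x p
    # (L_Y ⊙ inner) 1: entrywise product followed by row sums
    return np.sum(L_Y * inner, axis=1)

rng = np.random.default_rng(1)
N, p = 6, 3
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
f_hat = rng.standard_normal(N) + 1j * rng.standard_normal(N)
L_Y = rng.standard_normal((N, p))
B_p = rng.standard_normal((p, p))
L_T = rng.standard_normal((p, N))

direct = (A * (L_Y @ B_p @ L_T)) @ f_hat       # left-hand side
fast = hadamard_matvec(A, f_hat, L_Y, B_p, L_T)
print(np.max(np.abs(direct - fast)))
```

The right-hand side needs only p applications of A to the columns of diag(f̂) L_T^T B_p^T, which is why a fast algorithm for A carries over to the entrywise product.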


Application II - evaluating polynomials in a disk

Theorem (global error, complexity)

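Let $N \in \mathbb{N}$, $\varepsilon > 0$, $z_j \in \{z \in \mathbb{C} : |z| \le 1\}$, then a matrix vector product with $C = (z_j^k)_{j,k=1,\dots,N}$ can be computed in

\[ O\left(N \log N \,\log\frac{1}{\varepsilon}\,\log^3\frac{N}{\varepsilon}\right) \]

floating point operations.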

[Figure: error vs. p for N = 2^14 (left); time vs. N for p = 8 (right)]


Summary

local low rank

Laplace transform - asymptotically smooth kernels

Fourier transform - Fourier integral operators

application in photoacoustic imaging

application for evaluating polynomials

think global, act local

\[ O\left(N \log^a \frac{N}{\varepsilon}\right) \]

www.analysis.uos.de