55
Wavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat, Wavelet tour of signal processing Table of contents Week 1 (9/9/2010) .......................... 2 Fourier Transform .................................................... 2 Uncertainty Principles ............................................... 2 Time Frequency Representations ........................................... 4 Week 2 (9/16/2010) ......................... 6 Wavelet Transform .................................................... 8 The Haar System .................................................... 10 Week 3 (9/23/2010) ......................... 11 Convergence of wavelet series ............................................. 11 Week 4 (9/30/2010) ......................... 16 Multiresolution Analysis (MRA) Framework (Meyer, Mallat) ........................ 16 Week 5 (10/7/2010) ......................... 20 Finding the Mother Wavelet ............................................. 20 Week 6 (10/14/2010) ......................... 23 Week 7 (10/21/2010) ......................... 27 Deisgning Wavelets ................................................... 27 Meyer’s construction of wavelets in S ........................................ 31 Week 8 (10/28/2010) ......................... 32 Vanishing Moments ................................................... 32 Spline Wavelets ...................................................... 33 Week 9 (11/4/2010) ......................... 36 Compactly Supported Wavelets ........................................... 36 Week 10 (11/11/2010) ......................... 40 Week 11 (11/24/2010) ......................... 44 Vanishing Moments, Again .............................................. 44 Decay of wavelet coefficients from vanishing moments ........................... 46 Lipschitz regularity from decay of wavelet coefficients ........................... 47 Week 12 (12/1/2010) ......................... 48 L p convergence of wavelet series ........................................... 48 Approximation by wavelet series ........................................... 51 1

Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

  • Upload
    vandung

  • View
    223

  • Download
    2

Embed Size (px)

Citation preview

Page 1: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Wavelets, Approximation Theory, and Signal Processing

Güntürk, Fall 2010Scribe: Evan Chou

References: Mallat, Wavelet tour of signal processing

Table of contents

Week 1 (9/9/2010) . . . . . . . . . . . . . . . . . . . . . . . . . . 2

Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2Uncertainty Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

Time Frequency Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

Week 2 (9/16/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 6

Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8The Haar System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Week 3 (9/23/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 11

Convergence of wavelet series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Week 4 (9/30/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 16

Multiresolution Analysis (MRA) Framework (Meyer, Mallat) . . . . . . . . . . . . . . . . . . . . . . . . 16

Week 5 (10/7/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 20

Finding the Mother Wavelet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

Week 6 (10/14/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 23

Week 7 (10/21/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 27

Deisgning Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27Meyer’s construction of wavelets in S . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Week 8 (10/28/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 32

Vanishing Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32Spline Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

Week 9 (11/4/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 36

Compactly Supported Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

Week 10 (11/11/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 40

Week 11 (11/24/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 44

Vanishing Moments, Again . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44Decay of wavelet coefficients from vanishing moments . . . . . . . . . . . . . . . . . . . . . . . . . . . 46Lipschitz regularity from decay of wavelet coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

Week 12 (12/1/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 48

Lp convergence of wavelet series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48Approximation by wavelet series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

1

Page 2: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Week 13 (12/9/2010) . . . . . . . . . . . . . . . . . . . . . . . . . 52

Linear Approximation v Nonlinear Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54Besov Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Week 1 (9/9/2010)

Fourier Transform

To start, we’ll start with the Fourier transform, studying its deficiencies and how to overcome them.

The convention for Fourier transform we’ll use in this course is the one with 2π in the exponent, for easeof inversion, Plancherel, sampling at integers without 2π, among other reasons:

Ff(ξ) = f (ξ)=

R

f(t)e−2πiξtdt

Initially this makes sense for f ∈ L1, and as an operator F : L1 → L∞ (and in fact maps to C0). By the

standard extension procedure we can define this for L2 (approximate by L1 ∩ L2), and we have that F :L2→L2 and is an isometry, so that ‖f ‖2= ‖f ‖2. By interpolation (Riesz-Thorin), we also have have thatF :Lp→Lp

′for 1< p< 2, a result by the name of Hausdorff-Young.

Also, if we consider the Fourier transform as an operator on the Schwarz space S, then F : S → S and byduality F :S ′→S ′, an operator on the space of tempered distributions such as the dirac delta.

To recover f from f , we have the inversion formula:

f(t) =

f (ξ)e2πiξtdξ

and viewing f =Ff as an analysis of f , this can be considered the synthesis (or reconstruction) of f from

its frequency content f .

Deficiency. This is a very nonlocal transform. To recover f(t) for even a single t, we need all frequency

data f (ξ) for all ξ ∈R.

A quick reason for this is the fact that the complex exponential e2πiξt is globally supported. This is themain deficiency of the Fourier Transform, when local information about a signal is needed. For instance,in music we can discern which frequencies are being played at which time, and if we were to take theFourier transform of the music, it would not give this information; instead, it would give only a vaguesense of which frequencies were present in the entire song. Such local frequency information would beuseful in general for signals with transients: seismic signals, medical signals, images and edges, just toname a few.

Here we investigate why the Fourier transform does not provide an efficient way of capturing local fre-quency information.

Uncertainty Principles

These essentially say that a function cannot be simultaneously localized in time and frequency.

0. Scaling properties:

f(δ · )(ξ)= 1

δf

(

ξ

δ

)

Thus dilating f has the effect of spreading out f .

2

Page 3: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

1. If f 0, then at most one of supp f and supp f can be compact.

The reason for this is that if supp f is compact, then

f (z)=

f(t)e2πiztdt

is analytic, and therefore f (ξ) cannot be compactly supported.

2. Heisenberg Uncertainty Principle. Suppose we had f ∈ L2 with ‖f ‖2 = 1. In physics, we canthink of |f(t)|2 as a pdf. In this context we then define the mean

mf6 ∫

t |f(t)|2 dt

and the variance

σf26 ∫

(t−mf)2 |f(t)|2 dt

Similarly we define mf and σf2 . Then we have

Theorem 1. (Heisenberg Uncertainty Principle)

σfσf ≥1

Proof. Note a few facts: σf and σf are invariant under translations and modulations, i.e. if we

define fα,β(t) = e2πiαtf(t − β) then σfα,β=σf. The same holds for f since the Fourier transform

exchanges translations and modulations. Thus without loss of generality we may assume that mf =mf =0.

For what follows, we assume f is Schwarz, and the general case can be obtained by approximation.

1=

|f |2dt =

f f dt

= t(ff )∣

−∞∞ −

t(ff )′ dt

= − 2Re

tf f ′ dt

(Cauchy-Schwarz) ≤ 2

( ∫

t2 f2 dt

)1/2( ∫

|f ′|2 dt)1/2

(Plancherel) = 2σf

( ∫

|2πiξ f (ξ)|2 dξ)1/2

= 4πσfσf

We conclude that

1

4π≤ σfσf

For general f , we can just approximate with a Schwarz function and take limits.

The minimizers of σf σf turn out to be Gaussians.

3

Page 4: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Examples:

• f = δ0, f = 1. (Not quite an instance of above, but an extreme case). Here we have perfectlocalization in time, but no frequency localization.

• f = 1[− 1

2,1

2

], f =sinπξ

πξ, where σf <∞ but σf =∞. This is a milder case.

• f(t)= e−πt2/2. Then σf σf =

1

4π. This is the balanced case.

3. Amrein-Betheir. If f ∈L1, and f 0, then

|supp f | · |supp f |=∞

where |A| denotes the measure.

4. Donaho-Stark If( ∫

T c |f |2)1/2≤ εT ‖f ‖2 and

(

Ωc |f |2)1/2

≤ εΩ ‖f ‖2, then

|T | |Ω| ≥ (1− εT − εΩ)2

so that even if we wanted to find sets T , Ω such that f |T and f |Ω capture most of the norms of f ,f , there is a limitation on how small we can make T ,Ω.

5. Uniform Uncertainty Principle(s), Candés, Tao, others. For f ∈RN, and k,m related by

m≥ ck logN

If #(supp f)≤ k, then there exists a set Λ⊂1, , N such that |Λ| ≤m and f∣

Λdetermines f .

This means that if f , g are two signals with support of size k, and f∣

Λ= g∣

Λ, then f = g.

Also, if h is a signal of support size 2k, then h cannot vanish on Λ, otherwise we can break h ino a differ-ence two signals of size k with disjoint support, whose Fourier transforms agree on Λ. So this is a discreteversion which says that a signal of small support must have some frequency content in Λ, and equivalently,if a signal vanishes on Λ, then the support of the signal cannot be smaller than 2k.

Time Frequency Representations

We will investigate different ways of analyzing local frequency content of functions, starting with the shorttime (windowed) Fourier transform, and later we will look at wavelets (which analyzes functions at dif-ferent time scales).

Since the Fourier transform does not give us a good view of the frequency content of a signal near partic-ular times, the idea is to preprocess the signal by restricting it to values near a particular time (to awindow of time) and then feeding it to the Fourier transform.

f

ϕ( · − τ )

τ

4

Page 5: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Above ϕ is the window, translated to time τ . We define the windowed Fourier transform to be

(Tϕf)(ξ, τ )6 ∫

f(t) ϕ(t− τ ) e−2πiξtdt= 〈f , ϕξ,τ 〉

which can be interpreted as inner products with a family of functions ϕξ,τ(t) = e2πiξtϕ(t − τ). This is acontinuous transform, and later we will be considering whether we can sample at discrete points on a lat-tice in the time-frequency plane (ξ, τ ).

How do we recover f? Since these are inner products, it is tempting to try placing the coefficients backwith the functions, and this does work:

Theorem 2. (Reconstruction from windowed Fourier transform)

∫ ∫

(Tϕf)(ξ, τ )ϕξ,τ(t)dξdt

Proof. Again we will assume that f is sufficiently nice to apply Fubini. The rest is just computation:

∫ ∫

(Tϕf)(ξ, τ )e2πiξtϕ(t− τ )dξdt =

∫ ∫ ∫

f(s) ϕ(s− τ ) e−2πiξs dse2πiξtϕ(t− τ )dξdτ

=

∫ ∫

f(s)e2πiξ (t−s)[ ∫

ϕ(t− τ ) ϕ(s− τ ) dτ

]Φt(s)

dsdξ

=

e2πiξtFfΦt(ξ)dξ= f(t)Φt(t)

= f(t)

This gives us a better idea about which frequencies are present near a particular time. Also, in the recon-struction formula for f(t), we note that if ϕ has sufficiently fast decay, we can truncate the integral witha small loss in error, so we do not really need all values Tϕf(ξ, τ ) for (ξ, τ ) ∈R2 to recover f(t) approxi-mately. Also, we suspect that the information in Tϕf is fairly redundant, at least at a glance we are usinga R2 → R function to represent our signal f : R → R. There is an example that makes this redundancyvery apparent.

Example 3. Let ϕ = 1[0,1]. Then it is already obvious that we lose no information by restricting τ to

integers, as∑

ϕ(t− k) = 1 form a partition of unity. Furthermore, for each piece f(t)ϕ(t− k) supportedon [k, k+1] we have a Fourier series representation, so we can sample ξ at integers also. In fact, ϕξ,τ , (ξ,τ)∈Z2 form an orthonormal basis for L2(R), and thus

f =∑

(ξ,τ)∈Z2

〈f , ϕξ,τ 〉 ϕξ,τ =∑

(ξ,τ)∈Z2

(Tϕf)(ξ, τ )ϕξ,τ

As a quick note, we cannot sample any less here or else we will actually lose part of the original signal.The other thing to note is that 1[0,1] is not nice, as f 1[0,1] is not even continuous (even on the torus T),

and thus the decay of the Fourier series coefficients will be slow, which will not allow us to truncate theseries for a finite approximation.

5

Page 6: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

In the 1940’s, Gabor suggested using Gaussian windows, since they balance smoothness and locality well.However, it turns out that sampling the windowed Fourier transform at integers does not work.

Notation: For ξ0> 0, τ0> 0, we write

ϕm,nξ0,τ06 ϕmξ0,nτ0

and

G(ϕ, ξ0, τ0) =

ϕm,nξ0,τ0, (m,n)∈Z2

The family G we will call a Gabor system. We can then ask the following questions.

Question 1: When is a given Gabor system an orthonormal basis?

We already have an example above. A necessary condition is that ξ0 τ0=1. But this condition is not suffi-

cient, just considering the earlier example, since we cannot sample at (1

2, 2) for instance. In general, con-

sider ϕ with supp(ϕ)⊂ [0, τ0]

Question 2: When is the system complete? (i.e. it has the proeprty that if 〈f , ϕm,n〉 = 0 for all m, n,then f =0).

A necessary condition is that ξ0 τ0≤ 1. But this condition is not sufficient for the same reason.

Later we will also show

Theorem 4. (Balian-Low) If G(ϕ, 1, 1) is an orthonormal basis, then either σϕ=∞ or σϕ=∞.

In particular, this means that Gaussians can never yield a Gabor system that is an orthonormal basis.

Week 2 (9/16/2010)

When is a good time frequency localization possible for ϕ? Last time we saw a limitation from theHeisenberg Uncertainty Principle (Theorem 1):

σfσf ≥1

Now we turn to proving the Balian-Low theorem concerning Gabor systems.

Proof. (of Theorem 4) Suppose G(ϕ, 1, 1)= ϕm,nm,n∈Z is an orthonormal basis. Define

(Pf)(x) 6 xf(x)

(Qf)(x) 6 (Pf )∨(x) =1

2πi

df

dx

(i.e. (Qf)∧ = Pf ). We want to show that either Pϕ L2 or Qϕ L2. Also, without loss of generality wecan center the functions at 0, i.e. we may assume mf = mf = 0. Towards a contradiction, assume thatboth Pϕ and Qϕ are in L2. We thus have

〈Pϕ, Qϕ〉=∑

m,n

〈Pϕ, ϕm,n〉〈ϕm,n, Qϕ〉

First, note that

〈Pϕ, ϕm,n〉= 〈Pϕ, ϕm,n〉−n 〈ϕ, ϕm,n〉

6

Page 7: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

since ϕ= ϕ0,0 and thus 〈ϕ0,0, ϕm,n〉=0 for (m,n) (0, 0), and otherwise n=0. Continuing, we have that

〈Pϕ, ϕm,n〉 = 〈(P −nI)ϕ, ϕm,n〉=

(x−n)ϕ(x)e−2πimxϕ(x−n) dx

=

xϕ(x+n)e−2πimxϕ(x) dx

= 〈ϕ−m,−n, Pϕ〉

We have a similar identity with Qϕ, which we prove from the identity for Pϕ:

〈ϕm,n, Qϕ〉 =⟨

(ϕm,n)∧, (Qϕ)∧

= 〈(ϕ)−n,m, P ϕ 〉= 〈P ϕ , (ϕ)n,−m〉= 〈(Qϕ)∧, (ϕ−m,−n)∧〉= 〈Qϕ, ϕ−m,−n〉

This implies that

〈Pϕ, Qϕ〉 =∑

m,n

〈Pϕ, ϕm,n〉〈ϕm,n, Qϕ〉

=∑

m,n

〈Qϕ, ϕm,n〉〈ϕm,n, Pϕ〉

= 〈Qϕ,Pϕ〉

Note also that

〈Pf , g〉 =

xf(x) g(x) dx= 〈f , Pg〉

〈Qf , g〉 = 〈(Qf)∧, g 〉=⟨

P f , g⟩

=⟨

f , P g⟩

=⟨

f , (Qg)∧⟩

= 〈f , Qg〉

so that both P , Q are self-adjoint. Thus,

〈ϕ, PQϕ〉= 〈Pϕ, Qϕ〉= 〈Qϕ,Pϕ〉= 〈ϕ, QPϕ〉

and

〈ϕ, (PQ−QP )ϕ〉=0 for all ϕ

But also, we can compute (PQ−QP )ϕ:

(PQ−QP )ϕ =x

2πi

dx− 1

2πi

d

dx(xϕ)

= − 1

2πiϕ

So we have a contradiction since

〈ϕ, (PQ−QP )ϕ〉=− 1

2πi‖ϕ‖2 0

7

Page 8: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

As mentioned earlier, the Balian-Low theorem implies that G(ϕ, 1, 1) is not an orthonormal basis for anyGaussian ϕ. It is true that if ξ0 τ0< 1 then G(ϕ, ξ0, τ0) forms a tight frame (Seip, Lyubarski).

It turns out that we can get around this theorem if we measure the variance slightly differently.

Generalizations:

Define for any p> 0,

σf ,p6 infm

( ∫

(t−m)2p |f |2)1/2p

For p= 1, the m that minimizes this is exactly the mean mf, and hence σf ,1= σf. Note that if σf ,p<∞,

then σf ,q<∞ for q < p by Hölder (treat |f |2 dt as a probability measure).

We then have the following results, which we will not prove:

Theorem 5. (Balian) There exists ϕ such that σϕ,p σϕ , p < ∞ for all p < 1 and G(ϕ, 1, 1) is an

orthonormal basis for L2

Note that the example of the rectangular window ϕ= 1[−1/2,1/2] does not satisfy this, as ϕ(ξ) =sinπξ

πξand

in order for σϕ , p<∞ we need p< 1/2 so that the decay is still o(

1

ξ

)

.

Theorem 6. (Steges) For any orthonormal basis ψk of L2(R), and for any p> 1,

(

supk

σψk, p

)(

supk

σψk, p

)

=∞

Theorem 7. (Bourgain) There exists ψk an orthonormal basis of L2(R) such that

(

supk

σψk

)(

supk

σψk

)

<∞

Wavelet Transform

Let ψ ∈L2(R). We will use the notation

ψa,b(t) =1

a√ ψ

(

t− b

a

)

for a > 0, b ∈R. The a√

normalization is chosen so that ψa,b has the same L2 norm as ψ. The continuouswavelet transform is defined as

(Wψf)(a, b)=⟨

f , ψa,b⟩

where a is the scale and b is the position. The reconstruction formula is similar to that of the windowedFourier transform:

f(t)=1

−∞

∞db

0

∞ da

a2(Wψf)(a, b)ψ

a,b(t)

8

Page 9: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

where cψ =∫ |ψ(ξ)|2

|ξ| dξ <∞ (assumed to be finite). In particular, if this condition is satisfied, it is neces-

sary that ψ(0) =∫

ψ = 0 and a sufficient condition for this to be satisfied is that ψ ∈ L2 ∩ L1, ψ(0) =∫

ψ=0 and some regularity of ψ at ξ=0.

Theorem 8. Assume cψ=∫ |ψ(ξ)|2

|ξ| dξ <∞. Then for all f , g ∈L2(R),

cψ 〈f , g〉=∫

−∞

∞db

0

∞ da

a2(Wψf)(a, b) (Wψg)(a, b)

This implies the reconstruction formula, since if 〈f1, g〉= 〈f2, g〉 for all g, then f1= f2 (take g= f1− f2 sothat 〈f1− f2, g〉= ‖f1− f2‖2=0). Note that

−∞

∞db

0

∞ da

a2(Wψf)(a, b) (Wψg)(a, b) =

⟨∫

−∞

∞db

0

∞ da

a2(Wψf)(a, b)ψ

a,b( · ), g⟩

Proof. By properties of Fourier transform,

(ψa,b)∧(ξ)= a

√e−2πibξ ψ(aξ)

Also,

Wψf (a, b) =⟨

f , ψa,b⟩

=⟨

f , ψa,b⟩

=

f (ξ) a√

e−2πibξψ(aξ) dξ

Computation:

db

da

a2Wψf(a, b)Wψg(a, b) =

db

da

a

[ ∫

dξ f (ξ) ψ(aξ) e2πibξ][ ∫

dξ ′ g(ξ ′) ψ(aξ ′)e−2πibξ ′]

=

da

a

db[

f ( · ) ψ(a · )]∧(b)[

g( · ) ψ(a · )]∧(b)

=

da

a

dξ f (ξ) ψ(aξ) g(ξ) ψ(aξ)

=

dξ f (ξ) g (ξ)

da

a|ψ(aξ)|2

=

dξ f (ξ) g (ξ)

da

a|ψ(a)|2

= cψ

f , g⟩

= cψ〈f , g〉

We can also talk about the discrete wavelet transform which samples a, b in the continuous wavelet trans-form. For sampling dilations and translations it is natural to look at dyadic scales

a∈ 2−j , j ∈Z, b∈2−j k, j ∈Z, k ∈Z

so that for each scale a, the supports of ψa,b cover all of R. We will use the notation

ψj,k(x)6 ψ2−j,2−jk(x)= 2j/2 ψ(2j(x− 2−jk)) = 2j/2 ψ(2jx− k)

9

Page 10: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

As we saw earlier, a necessary admissibility condition is that ψ(0)= 0. Recall

ψj,k(ξ)= 2−j/2 e2πi2−jkξ ψ(2−jξ)

k only affects the phase of ψj,k whereas j dilates ψ , giving higher frequency localization for smaller j. Ifwe consider the time-frequency locality of the family of functions ψj,k, we see a different tiling of thetime-frequency plane. In the windowed Fourier transform, the sampled functions all have the same time-frequency localization. In the wavelet transform, ψj,k has higher frequency localization for negative j andhigher time localization for positive j:

ξ

τ

j=−1

j=0

j=1

j=−2

j=−2

j=−1

j=0

j=1

ψj,k(ξ)

ξ

The Haar System

(A Haar, 1910) For H=L2([0, 1]), define

H(x)= ψ(x)=

1 x∈ [0, 1/2)− 1 x∈ [1/2, 1]

and Hj,k= ψj,k. Then 1[0,1] ∪ Hj,kj=0,k=0, ,2j−1∞ forms an orthonormal basis for H. Orthogonality is

obvious. 1[0,1] and Hj,k are orthogonal since Hj,k have mean zero. Otherwise, for 〈Hj,k, Hj ′,k ′〉, if j = j ′

and k k ′ then the supports are disjoint, so the inner product is zero. If j j ′, then either the supportsare disjoint or the wider function is constant on the support of the narrower function, which has meanzero, and thus the inner product is again zero.

We will show completeness next time.

10

Page 11: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Week 3 (9/23/2010)

Convergence of wavelet series

Some definitions:

• Ij,k= [k2−j , (k+1)2−j). Ij,k, k ∈Z partitions R for all j.

• Vj= f ∈L2: f is constant on each Ij,k, k ∈Z

f ∈Vj f =∑

kαk 1Ij,k with

∑ |ak|2<∞Note that Vj⊂Vj+1 for all j.

• Pj 6 orthogonal projection on Vj.

Fact: minα ‖f −α‖L2(A) is attained at the average value α∗=1

|A|∫

Af .

Given any other α, f −α∗ and α∗−α will be orthogonal with respect to the inner product 〈f , g〉=1

|A|∫

f g dx , and the rest follows by Pythagorean identity

‖f −α‖2= ‖f −α∗‖2+ ‖α∗−α‖2≥‖f −α∗‖2

Thus

Pj f =∑

k∈Z

fj,k 1Ij,k

where fj,k=1

|Ij,k|∫

Ij,kf =2j

Ij,kf .

• Let ϕ= 1[0,1), and define

ϕj,k(x) = 2j/2 ϕ(2jx− k)= 2j/21Ij,k(x)

Then ϕj,k: k ∈Z is an orthonormal basis for Vj for all j.

If we set cj,k(f)= 〈f , ϕj,k〉=2−j/2fj,k, then Pj f =∑

k 〈f , ϕj,k〉 ϕj,k

• Note that Pj is well-defined on Lloc1 by the same formula.

Now we define Wj to be the orthogonal complement of Vj in Vj+1, i.e.

Vj+1=Vj⊕Wj

and let Qj be the orthogonal projection on Wj. We call Wj the “detail space” at scale 2−j. Note thatQj=Pj+1−Pj, and

Qj f =Pj+1 f −Pjf =∑

k

fj+1,k1Ij+1,k −∑

k

fj,k1Ij,k

Note that Ij,k = Ij+1,2k ∪ Ij+1,2k+1, so that fj,k =1

2(fj+1,2k + fj+1,2k+1). Then above we can split the

first sum into even and odd terms and compute:

Qjf =∑

k

(

fj+1,2k1Ij+1,2k + fj+1,2k+11Ij+1,2k+1− fj,k(1Ij+1,2k+ 1Ij+1,2k+1))

=∑

k

[fj+1,2k− fj,k]1Ij+1,2k − [fj,k− fj+1,2k+1]1Ij+1,2k+1

11

Page 12: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

and since 2 fj,k = fj+1,2k + fj+1,2k+1, we have that fj+1,2k − fj,k = fj,k − fj+1,2k+1, and denoting thisquantity bj,k, we have that

Qj f =∑

k

bj,k (1Ij+1,2k − 1Ij+1,2k+1)=∑

k

bj,k 2−j/2 ψj,k

where ψj,k was defined earlier for the Haar system. Note that ψj,k, k ∈ Z form an orthonormal systemfor Wj. In fact, it is a basis by the above computation, where we can express any Qjf ∈ Wj in terms ofthe elements ψj,k. Define

dj,k6 〈f , ψj,k〉=2−j/2 bj,k

Now pick any pair J0<J1. We have that

VJ1=VJ0⊕WJ0⊕WJ0+1⊕ ⊕WJ1−1

by repeatedly applying the decomposition Vj+1 = Vj ⊕ Wj. VJ0 is the function at the coarser scale 2−J0,and WJ0,WJ0+1, are successive refinements (detail spaces). Equivalently,

PJ1=PJ0+∑

k=J0

J0−1

Qk

Naturally, we want to know what happens as we take J0→−∞ and J1→∞.

Theorem 9. For all f ∈L2(R), Pj f→ f in L2 as j→∞, i.e. ‖Pjf − f ‖L2→ 0

Proof. The first observation is that it suffices to check this for a dense subset X of L2. This is a stan-dard approximation argument: for ε> 0, we can find g ∈X with ‖f − g‖2≤ ε. Then

‖Pjf − f ‖2≤‖Pj(f − g)‖2+ ‖Pjg− g‖2+ ‖g− f ‖2≤ 2ε+ ‖Pjg− g‖2

(note ‖Pj (f − g)‖2≤‖f − g‖2 since Pj is a projection in L2)

Now if we take the limsup of both sides as j→∞, we know that ‖Pjg− g‖2→ 0, so

limsupj

‖Pjf − f ‖2≤ 2ε

as ε is arbitrary, we have that limsupj ‖Pjf − f ‖2≤ 0 and thus limj ‖Pjf − f ‖2=0.

So now we find a convenient dense subspace to work with. Here we will use X = Cc(R), the continuousfunctions with compact support, which is dense in L2. Note that these functions are uniformly continuous,i.e. the modulus of continuity

ωg(δ)6 sup|x−y |<δ

|g(x)− g(y)| 0 as δ→ 0

This means that

supk

‖g − gj,k‖L∞(Ij,k)≤ωg(2−j) 0 as j→∞

This implies that ‖Pjg− g‖L∞ 0 as j→∞ (recall Pjg is just gj,k on Ij,k), and thus by Hölder

‖Pjg − g‖L2≤ |supp(Pjg− g)|1/2 ‖Pjg − g‖2 0

12

Page 13: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

noting that we can find |supp(Pjg − g)| ≤C for large j (in fact C = |supp(g)|+1 for j > 0)

We therefore have the following immediate corollary:

Corollary 10. For any f ∈L2(R),

f =P0f +∑

j=0

∞Qjf

(Take J1→∞ and J0=0)

If we restrict ourselves to functions on [0, 1], then this says that

f = f0,0+∑

j=0

∞∑

k=0

2j−1

〈f , ψj,k〉 ψj,k

for any f ∈ L2([0, 1]). Furthermore, if f is continuous, then the convergence is uniform (directly from theproof). Compare this with Fourier series, where uniform convergence is not guaranteed for continuousfunctions (in fact, not even pointwise convergence...). This Haar basis was the first orthonormal basis forL2[0, 1] found with this property. Here the basis is

1[0,1]∪ ψj,k, j > 0, k ∈Z

For L2(R), we have the orthonormal basis

ϕ0,k, k ∈Z∪ ψj,k, j ≥ 0, k ∈Z

for which we have uniform convergence of the basis expansion of f ∈L2 if f is uniformly continuous.

Now let’s consider what happens if we take J0→−∞.

Theorem 11. For all f ∈L2, Pjf→ 0 in L2 as j→−∞.

Proof. Again, it suffices to show the result for a dense subspace X ⊂L2, with the exact same reasoning.

For the dense subspace, it will be convenient to work with X = L1 ∩ L2 since as j→ −∞, the support ofPj f grows to R, in which case we will be integrating f on larger and larger sets.

Note Pj g=∑

k 〈g, ϕj,k〉ϕj,k, and thus by triangle inequality,

‖Pj g‖2 ≤∑

k

|〈g, ϕj,k〉|=∑

k

2j/2∫

Ij,k

g

≤ 2j/2 ‖g‖1

which tends to 0 as j→−∞.

Corollary 12. Taking J1→∞ and J0→−∞, we therefore have that

f =∑

j=−∞

∞Qj f =

j,k

〈f , ψj,k〉 ψj,k

where convergence is in L2.

13

Page 14: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Hence, ψj,k, j , k ∈Z is an orthonormal basis for L2(R).

Remark: Note that∫

ψj,k = 0 for all j , k, yet∫

f need not be zero, even though we can write f as a

linear combination of ψj,k! In fact, this will be the case for any f ∈ L1 ∩ L2 with∫

f 0. Note that thisis not contradictory since the convergence is not in L1.

We can also talk about L1 convergence.

Proposition 13.

1. If f ∈L1, then Pjf→ f in L1 as j→∞

2. It is not necessarily true that Pjf→ 0 in L1 as j→−∞

Proof. To show (1), it suffices to show that ‖Pj‖1,1 are uniformly bounded as j → ∞. Note that in theproof of Theorem 9, we can use the same dense subspace X = Cc to show that Pj g→ g in L1 as j → ∞(we can transfer from uniform convergence to L1 convergence). Now if we take g ∈ X with ‖f − g‖1 ≤ ε,and use the triangle,

‖Pjf − f ‖1≤‖Pj(f − g)‖1+ ‖Pj g− g‖1+ ‖g − f ‖1≤ (1+ ‖Pj‖1,1)ε+ ‖Pj g− g‖1

we would be able to carry out the same argument if we can bound ‖Pj‖1,1 uniformly.

Note that if f ≥ 0 then Pj f ≥ 0, so that Pj is a positive operator (Pj replaces f by its average value onintervals, and if f is never negative, the average value will never be negative). This means that since−|h| ≤h≤ |h|, −Pj |h| ≤Pjh≤Pj |h| and thus

‖Pj h‖1 ≤∫

Pj |h|

=∑

j,k

Ij,k

(

2j∫

Ij,k

|h|dx)

dy

=∑

j,k

2−j 2j∫

Ij,k

|h|dx

=

|h|= ‖h‖1

This implies that ‖Pj‖1,1 ≤ 1 (operator norm), and we can follow through with the density argument asdescribed above.

To show (2), note that∫

Pj f =∫

f by the same reasoning as above (note we need to apply Fubini,

which is okay since f ∈L1), and therefore

f

=

Pjf

≤∫

|Pj f |= ‖Pj f ‖1

Thus if∫

f 0, then ‖Pjf ‖10 as j→−∞.

Remarks

• Note that this tells us that Haar’s expansion f = 〈f , ϕ0,0〉ϕ0,0 +∑

j,k 〈f , ψj,k〉ψj,k on [0, 1] con-

verges in the L1 sense as well (which is also not true for the Fourier series).

14

Page 15: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

• It might be tempting to think of Pj f as a conditional expectation E[f | F j] where F j is the σ-fieldgenerated by the intervals Ij,k; however, we are dealing with Lebesgue measure on R which is not aprobability distribution. Interestingly, if we replace the Lebesgue measure with any probabilitymeasure (which would require renormalization of the orthonormal basis), then we can think of con-ditional expectations, and in fact Pjf =E[f | F j] f (in Lp, 1 ≤ p <∞, including ∞ if f is uni-formly continuous) for j→∞ would follow from martingale convergence theorem (although the ele-mentary proof above works just as well).

Note that

∅, (−∞, 0), [0,∞),R=F−∞⊂ ⊂F−1⊂F0⊂ ⊂F∞=B(R) , Borel sets on R

For j → − ∞, we would have E[f | Fj] E[f | F−∞], which is1

µ(x< 0)

x<0f for x < 0 and

1

µ(x> 0)

x>0f for x> 0. This can proved directly as well. The idea is that given ε> 0, we can find

some j (negative), for which µ([2−j , 2j])≥ 1− ε. Thus, the two intervals Ij,−1 and Ij,0 give almostthe entire space, and thus Pj f will be close to E[f | F−∞].

Implementing the Haar System

The point of the Haar expansion is that it allows us to work with approximations of a function at dif-ferent scales. As we saw earlier, we can go from a coarse scale PJ0 f to a fine scale PJ1f with the succes-sive refinements Qk f , J0≤ k ≤ J1− 1. Let us look at how to transition from coarse to fine scales algorith-mically.

Using the notation dj,k(f)= 〈f , ψj,k〉 and cj,k(f) = 〈f , ϕj,k〉 as before, we note that since ϕj,k=2j/21Ij,kand ψj,k=2j/2(1Ij+1,2k

− 1Ij+1,2k+1), we have that

ψj,k =1

2√ [ϕj+1,2k− ϕj+1,2k+1]

ϕj,k =1

2√ [ϕj+1,2k+ ϕj+1,2k+1]

Thus we have the formulas

cj,k =1

2√ [cj+1,2k− cj+1,2k+1]

dj,k =1

2√ [cj+1,2k− cj+1,2k+1]

Suppose we have PJ1f , the finer approximation of f . This is described entirely by the coefficients cJ1,k.

Using the formulas we obtain cJ1−1,k, dJ1−1,k from cJ1,k, and so on until we obtain cJ0, dJ0:

cJ1 cJ1−1

dJ1−1

cJ1−2 cJ0+1

dJ0+1

cJ0

dJ0

dJ1−2

The boxed elements above give the fine to coarse decomposition, and of course we can invert the formulasto go backwards as well. Let’s examine how the algorithm works in the discrete setting (or, finite interval[0, 1]):

15

Page 16: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

c30 c31 c32 c33 c34 c35 c36 c37

c20 d20 c21 d21 c22 d22 c23 d23

c10 d10 c11 d11

c00 d00

This is a picture in the case where we are going between P3 f and (P0f , Qkf , 0 ≤ k ≤ 2). We can recon-struct the coefficients c3,k from c0,0, d0,0, d1,∗, d2,∗ and vice versa.

Each line in the diagram represents a multiply-add operation to transfer between successive levels. At thetop row, we need 16 = 24 operations, and the second we need 8 = 23, and at the bottom we need 4 = 22

operations. In general, to transfer between the coefficients of PJf and (P0f , Qkf , 0≤ k ≤ J − 1) we need a

total of 2J+1 + 2J + + 22 = 2J+2 − 4≈ 4N operations, which is a linear time algorithm (compare to theFFT).

We can also represent this transformation as a matrix:

1

8√

(

1 1 1 1 1 1 1 11 1 1 1 −1 −1 −1 −1

)

1

4√

(

1 1 −1 −11 1 −1 −1

)

1

2√

1 −11 −1

1 − 11 −1

c30c31c32c33c34c35c36c37

=

c00d00d10d11d20d21d22d23

The Haar basis functions (wavelets) are not ideal since they are not smooth. In terms of applying theHaar decomposition to functions, we will not be able to obtain good estimates (especially not if we wantto apply Fourier transforms). We will need to work towards finding similar systems with smooth wavelets.

Week 4 (9/30/2010)

We have that the Haar basis is a basis for L2 as well as Lp, and also approximates C[0, 1] in the L∞ norm(though since the Haar basis functions are not continuous themselves, it is not a basis for (C[0, 1], ‖ · ‖∞)).In general, there is a framework that captures more general bases, and we will use this to find smoothbasis functions of any specified degree.

Multiresolution Analysis (MRA) Framework (Meyer, Mallat)

Definition 14. An MRA is a sequence (Vj)j∈Z of linear closed subspaces of L2(R) with the followingproperties:

1. Vj⊂Vj+1 for all j ∈Z

16

Page 17: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

2.⋃

j∈ZVj=L2(R), i.e. L2(R) can be approximated by the subspaces Vj.

3.⋂

j∈ZVj= 0

4. f ∈Vj f(2 · )∈Vj+1 for all j ∈Z, i.e. Vj are scaled copies of one another.

5. There exists ϕ∈ V0 such that ϕ( · − k), k ∈Z is an orthonormal basis for V0.

(4) and (5) together imply that 2j/2 ϕ(2jx − k), k ∈ Z is an orthonormal basis for Vj. We will also seethat the conditions are redundant: (3) can be derived from the other properties.

(5) can also be relaxed to require that ϕ( · − k), k ∈ Z be a Riesz basis rather than an orthonormalbasis, i.e. ϕ( · − k), k ∈Z spans V0 and

C1 ‖c‖l2≤∥

k

ckϕ( · − k)

L2

≤C2 ‖c‖l2, for all c∈ l2

As in the Haar basis case, we define

• Pj as the orthogonal projection on Vj

• Wj is the orthogonal complement of Vj in Vj+1, i.e. Vj+1 = Vj ⊕Wj. Note that f ∈Wj if and onlyif f(2 · )∈Wj+1.

• Qj is the orthogonal projection on Wj, so Pj+1=Pj+Qj.

(1) and (2) above say that

limj→∞

‖Pj f − f ‖2=0

which we proved in Theorem 9

and (1) and (3) say that

limj→−∞

‖Pj f ‖2=0, for all f ∈L2

proved in Theorem 11.

As in Corollary 12, we also have

f =∑

j∈Z

Qj f

with convergence in L2.

Summarizing,

Proposition 15.

1. limj→∞ ‖Pj f − f ‖2=0

2. limj→−∞ ‖Pj f ‖2=0, for all f ∈L2

3. f =∑

j∈ZQj f with convergence in L2

Proposition 16. (3) follows from properties (1),(2),(4),(5)

17

Page 18: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Proof. Let f ∈ ⋂j∈Z

Vj. Then f ∈ V−j for j > 0. Property (4) implies that f(2j · ) ∈ V0 for j > 0. Prop-

erty (5) implies that 2j/2 f(2jx) =∑

kαk(j)

ϕ(x − k), with ‖α(j)‖l2 = ‖f ‖L2 for all j. Taking the Fouriertransform, we have that

2−j/2f (2−jξ)=

[

k

αk(j)e−2πikξ

]

ϕ(ξ)

Denote mj(ξ)6∑

kαk(j)e−2πikξ, a 1-periodic function. Note that ‖mj‖L2(T)= ‖α(j)‖l2= ‖f ‖L2. Replace ξ

by 2jξ above:

f (ξ)= 2j/2mj(2jξ) ϕ(2j ξ)

We will show that f =0 by showing that∫

If =0 for any interval I. Let a, b> 0 first. Then

a

b

|f (ξ)|dξ ≤(

a

b

|mj(2jξ)|2dξ

)1/2(∫

a

b

2j |ϕ(2jξ)|2 dξ)1/2

≤(

2−j∫

a2j

b2j

|mj(ξ)|2 dξ)1/2(

a2j

b2j

|ϕ(ξ)|2dξ)1/2

≤(

2−j ⌈2j(b− a)⌉‖f ‖22)1/2

(

a2j

b2j

|ϕ(ξ)|2dξ)1/2

≤ C

(

a2j

b2j

|ϕ(ξ)|2dξ)1/2 0

as j→ ∞. This argument works the same if a, b < 0 as well, and in general we can split the interval intothree parts, [a, δ], [−δ, δ], [δ, b], noting that the integral over [−δ, δ] can be made arbitrarily small. This

implies that f =0 and therefore f =0.

Definition: A wavelet ψ is a function in L2 such that

ψj,k6 2j/2 ψ(2jx− k), j , k ∈Z

is an orthonormal basis for L2(R).

We say that the wavelet is associated to a given MRA (Vj)j∈Z if ψ( · − k)k∈Z is an orthonormal basis

for W0, in which case ψj,k defined above is an orthonormal basis for L2(R).

Remark 17. We can find wavelets ψ for which ψj,k, j , k ∈Z is a basis, but that ψ is not associated toany MRA. In this case, if we try to find the associated MRA, which by definition must be

Vj=⊕

l=−∞

∞Wl

property (5) can fail. Nevertheless, the MRA framework is very convenient to work with, and many exam-ples will fall under this framework.

Example 18. Let Vj6 f ∈L2: supp(f )⊂ [−2j−1, 2j−1], j ∈Z. Properties (1),(2),(3),(4) are all trivial.

V0= f ∈L2: supp(f )⊂ [−1/2, 1/2]

18

Page 19: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Set ϕ=F−1 (1[−1/2,1/2]), then since 2−2πikξ ϕ , k ∈Z is an orthonormal basis for L2[−1/2, 1/2], we have

that ϕ(x− k), k ∈Z is an orthonormal basis for V0, and ϕ(x)=sinπx

πx. Also,

W0= f ∈L2: supp(f )⊂ [−1,−1/2]∪ [1/2, 1]

the wavelet is ψ=F−1(1[−1,−1/2]∪[1/2,1]). Note that J = [−1,−1/2]∪ [1/2, 1] is congruent to [0, 1) mod Z,

i.e. J + k, k ∈ Z is a partition of R. This means that e−2πikξ1J , k ∈ Z is an orthonormal basis for

L2(J)=F(W0) and thus ψ( · − k), k ∈Z is an orthonormal basis for W0. Computing ψ explicitly:

ψ = 1[−1,1]− 1[−1/2,1/2]

ψ = 2sin(2πx)

2πx− sinπx

πx

=sinπx

πx[2 cosπx− 1]

which still has bad localization properties. This is often called the Shannon wavelet in connection to thesampling theorem, which says that if f ∈V0 then f(x)=

kf(k)ϕ(x− k).

This is related to the Littlewood Paley decomposition, which replaces ϕ with smoother cutoffs, but in theprocess property (5) no longer holds.

So we have the Haar wavelet which lacks smoothness in time but has good localization in time, and theShannon wavelet which lacks localization in time but has great smoothness in time. (In frequency theroles reverse). They lie at the two extremes of time frequency localization.

Goal: Find intermediate cases. Daubechies found orthonormal bases of compactly supported wavelets inCk for any specified k.

Further Goal: Understand / characterize smoothness function spaces (Sobolev, Besov, etc) in terms ofthe wavelet coefficients. (This is impossible without good localization / smoothness)

First, we pose the following question:

Question: For what g ∈ L2 is g( · − k), k ∈ Z an orthonormal system? Or equivalently, when is

e−2πikξg(ξ), k ∈Z an orthornomal system. (equivalent since F is an isometry)

Theorem 19. g( · − k), k ∈Z is an orthonormal system if and only if∑ |g (ξ+ l)|2=1 a.e.

(Already we have the example where g(x)=sinπx

πx)

Proof. Define G(ξ)=∑

l∈Z|g(ξ+ l)|2, which is 1-periodic, and also well defined since

0

1

G(ξ)dξ=

−∞

∞|g(ξ)|2= ‖g‖L2

2 <∞

so that G∈L1(T) and hence is finite a.e. We can then look at the Fourier series

G(k)6 ∫

0

1

G(ξ)e−2πikξ dξ =

−∞

∞e−2πikξg(ξ) g(ξ) dξ

=

−∞

∞g(x− k) g(x) dx

19

Page 20: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

using Plancherel. This implies that

G≡ 1 on T G(k)= 〈g, g( · − k)〉= δk 〈g( · − k), g( · − k ′)〉= δk,k ′

which gives the result.

Exercise 1. Let ψ =1J where J = I ∪ (−I) where I = [2/7, 1/2]∪ [2, 2+ 2/7]. Then |J |=1, and check that

• 2jJ , j ∈Z and J + l, l∈Z are both partitions of R

• Defining Wj= f ∈L2, supp(f )⊂ 2jJ and Vj=⊕

−∞

j−1Wl, Property (5) fails.

Short term goal: Given an MRA, what is the corresponding wavelet ψ? We will cover this next time.

Week 5 (10/7/2010)

Finding the Mother Wavelet

Given an MRA, we are looking for the functino ψ (mother wavelet) for which ψ( · − k) is anorthonormal basis for W0, then ψj,k, j , k ∈Z will be an orthonormal basis for L2(R).

Basic Relations

• f ∈V0 if and only if f =∑

〈f , ϕ0,k〉 ϕ0,k, and denoting ck(f)= 〈f , ϕ0,k〉, this is true if and only if

f (ξ)=[

ck e−2πikξ

]

ϕ(ξ)

Call af(ξ)=∑

cke−2πikξ, a one-periodic function, and we note ‖af‖L2(T)= ‖ck(f)‖l2= ‖f ‖L2(R)

• Since V0⊂ V1, f ∈ V0 implies that f ∈ V1 so that f(x) =∑ 〈f , ϕ1,k〉 ϕ1,k where ϕ1,k= 2

√ϕ(2 · − k).

Then

f (ξ)=

[

1

2√

〈f , ϕ1,k〉 e−πikξ]

ϕ(ξ/2)

Call mf(ξ/2)=1

2√∑ 〈f , ϕ1,k〉 e−πikξ, then we have ‖mf‖L2(T)=

1

2√ ‖f ‖L2(R).

So f (ξ)=mf(ξ/2) ϕ(ξ/2)

Notation: For f = ϕ, we will write m06 mϕ.

For f = ϕ, we have that

ϕ(x)=∑

hkϕ1,k(x)

where hk= 〈ϕ, ϕ1,k〉 and ϕ1,k(x)= 2√

ϕ(2x− k). Then

ϕ(x)= 2√ ∑

k

hkϕ(2x− k)

with ‖h‖2=1. This is called the refinement equation (2 scale difference equation).

20

Page 21: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

In the case of the Haar basis, ϕ= 1[0,1] and

1[0,1]= 2√ (

1

2√ 1[0,1/2]+

1

2√ 1[1/2,1]

)

so that h0= h1=1

2√ and ϕ1,0=1[0,1/2], ϕ1,1=

1

2√ 1[1/2,1].

Let us examine m0(ξ) =1

2√∑

hke−2πikξ. We have ϕ(ξ) = m0(ξ/2) ϕ(ξ/2). Recall from last time (The-

orem 19) that if g( · − k), k ∈ Z form an orthonormal basis for L2, then

l|g(ξ + l)|2 = 1 for a.e. ξ.

Applying this to g= ϕ, we have

1=∑

l∈Z

|ϕ(2ξ+ l)|2 =∑

l∈Z

|ϕ(2ξ+2l)|2+∑

l∈Z

|ϕ(2ξ+2l+1)|2 a.e. ξ

=∑

l∈Z

|m0(ξ+ l)|2 |ϕ(ξ+ l)|2+∑

l∈Z

|m0(ξ+1/2)|2|ϕ(ξ+ l+1/2)|2

1 = |m0(ξ)|2+ |m0(ξ+1/2)|2

Above we have used the fact that m0 is 1-periodic so that m0(ξ + l) = m0(ξ). Summarizing the facts, wehave

Proposition 20.

• If f ∈ V0, then f (ξ)=m0(ξ/2) f (ξ/2). In particular, ϕ(ξ)=m0(ξ/2) ϕ(ξ/2)

• 1= |m0(ξ)|2+ |m0(ξ+1/2)|2

Example 21.

1. For Haar, we have ϕ= 1[0,1]=1[−1/2,1/2]( · − 1/2), so ϕ(ξ)= e−πiξsinπξ

πξ, and

m0(ξ) =1

2√(

1

2√ +

1

2√ e−2πiξ

)

=1+ e−2πiξ

2= e−πiξ cos(πξ)

It is easy to see that both properties above are satisfied using the half angle formula. Also, we notethat m0(ξ) is a low-pass filter, mainly supported near 0 ∈ T, and m0(ξ + 1/2) is a high-pass filter,mainly supported near 1/2∈T.

2. For Shannon, we have ϕ = 1[−1/2,1/2] so ϕ(x)=sinπx

πx.

1[−1/2,1/2]= 1[−1/2,1/2]1[−1,1]

so m0(ξ) = 1[−1/4,1/4]. Again, it is easy to check the two properties and that m0(ξ) is low pass and

m0(ξ+1/2) is high-pass.

Let us now characterize the detail space W0. f ∈ W0 if and only if f ∈ V1 ∩ V0⊥. Now we already know

that f ∈V1 if and only if f (ξ)=mf(ξ/2) ϕ(ξ/2) with ‖mf‖L2(T)= ‖f ‖L2.

f is perpendicular to V0 if and only if f is perpendicular to ϕ0,k for all k ∈ Z, which is true if and only if

f is perpendicular to e−2πikξ ϕ(ξ). Computationally, (the same computation as in Theorem 19) thismeans that

0=

R

f (ξ)ϕ(ξ) e2πikξ dξ =

0

1(

l

f (ξ+ l) ϕ(ξ+ l)

)

e2πikξ for all k

21

Page 22: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

noting that∑

lf (ξ+ l) ϕ(ξ+ l) is a one-periodic function that is well-defined, as

l

f (ξ+ l) ϕ(ξ+ l)

L1(T)

≤∑

l

∣f (ξ+ l) ϕ(ξ+ l)

L1(T)

≤∑

l

|f (ξ+ l)|L2(T)|ϕ(ξ+ l)|L2(T)

≤(

l

|f (ξ+ l)|L2(T)2

)1/2(∑

l

|ϕ(ξ+ l)|L2(T)2

)1/2

= ‖f ‖L2 ‖ϕ‖L2 <∞

Continuing on, the previous statement is equivalent to

l

f (ξ+ l) ϕ(ξ+ l)≡ 0 a.e. ξ

Now we split into evens and odds again, and replace ξ by 2ξ. Now we additionally use the fact that f ∈ V1so that f (ξ)=mf(ξ/2) ϕ(ξ/2).

0 =∑

l

f (2ξ+2l) ϕ(2ξ+2l) +∑

l

f (2ξ+2l+1) ϕ(2ξ+2l+1)

=∑

l

mf(ξ+ l)ϕ(ξ+ l)m0(ξ+ l) ϕ(ξ+ l)+∑

l

mf(ξ+ l+1

2)ϕ(ξ+ l+

1

2)m0(ξ+ l+

1

2)ϕ(ξ+ l+

1

2)

= mf(ξ)m0(ξ)∑

l

|ϕ(ξ+ l)|2+mf(ξ+1/2)m0(ξ+1/2)∑

l

|ϕ(ξ+ l+1/2)|2

0 = mf(ξ)m0(ξ)+mf(ξ+1/2)m0(ξ+1/2)

We use the following elementary lemma: Let (z1, z2)∈C2 and (ν1, ν2)∈C2 such that z1 ν1+ z2ν2=0. Thenthere exists λ ∈C such that (z1, z2) = λ(ν2,−ν1). (Proof: if ν1 0, set λ=−z2/ν1, otherwise ν2 0 and weset λ= z1/ν2)

All of this implies that f ∈W0 if and only if there exists λf(ξ) such that

(mf(ξ),mf(ξ+1/2))=λf(ξ)(m0(ξ+1/2),−m0(ξ))

i.e. mf(ξ)=λf(ξ)m0(ξ+1/2) and mf(ξ+1/2)=−λf(ξ)m0(ξ). Combining these shows that

λf(ξ+1/2)m0(ξ)=mf(ξ+1/2)=−λf(ξ)m0(ξ)

so that λf(ξ + 1/2) = − λf(ξ) whenever m0(ξ) 0. Note the second property of m0 says that |m0(ξ)|2 +|m0(ξ + 1/2)|2 = 1 for a.e. ξ, so that either m0(ξ) 0 or m0(ξ + 1/2) 0 for all ξ. In the first case, thismeans that λf(ξ+ 1/2) =−λf(ξ), and in the second case, this means that λf(ξ +1)= λf(ξ) =− λf(ξ +1/2), so in fact λf(ξ+1/2)=−λf(ξ) for all ξ.

Now define νf(ξ)6 e−πiξλf(ξ/2). νf is one-periodic by the anti-symmetry of λf, so

νf(ξ+1)=− e−πiξλf(ξ/2+ 1/2)= e−πiξλf(ξ/2)= νf(ξ)

Thus we have the following characterization for W0:

Proposition 22. f ∈W0 if and only if

f (ξ)= eπiξ νf(ξ)m0(ξ/2+1/2) ϕ(ξ/2)

22

Page 23: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

for some νf ∈L2(T) with ‖νf‖L2(T)= ‖f ‖L2(R).

νf contains information specific to f , and m0(ξ/2+ 1/2) acts as a high pass filter.

Proof. We already have that f ∈V1 implies f (ξ) =mf(ξ/2) ϕ(ξ/2). We just showed that

mf(ξ/2)=λf(ξ/2)m0(ξ/2+ 1/2)= eπiξ νf(ξ)m0(ξ/2+ 1/2)

for some νf (note we can transfer between λf and νf with the definition) Also,

‖f ‖L2(R)2 = 2‖mf‖L2(T)

2

= 2

0

1/2

|mf(ξ)|2 dξ+2

0

1/2

|mf(ξ+1/2)|2 dξ

= 2

0

1/2

|λf(ξ)|2 (|m0(ξ+1/2)|2+ |mf(ξ)|2)dξ

= 2

0

1/2

|λf(ξ)|2

=

0

1

|νf(ξ)|2

as desired.

Now for a function in W0 to be the mother wavelet, there is an additional property to be satisfied (fromTheorem 19). We will show that

Theorem 23. Given an MRA with scaling function ϕ and the associated low pass filter m0, ψ ∈W0 is amother wavelet for the given MRA (i.e. ψ( · − k) an orthonormal basis for W0)

if and only if

ψ(ξ)= eπiξm0(ξ/2+ 1/2) ϕ(ξ/2)γ(ξ)

where |γ(ξ)|=1 a.e. ξ and is 1-periodic.

Week 6 (10/14/2010)

Proof. First assume that the equation for ψ holds. By Proposition 22, with νf = γ, we have that ψ ∈W0

and that ‖ψ‖L2(R) = ‖γ‖L2(T) = 1. We want to check that ψ( · − k), k ∈ Z forms an orthonormal basis

for W0. By Theorem 19, we just need to check that∑ |ψ(ξ + l)|2 = 1 a.e. We already know that

|ϕ(ξ + l)|2 = 1 a.e. since ϕ( · − k), k ∈ Z is an orthonormal basis for V0. Applying the usual computation,we look at 2ξ and split into even and odds:

l

|ψ(2ξ+ l)|2 =∑

l

|m0(ξ+ l/2+ 1/2)|2 |ϕ(ξ+ l/2)|2

=∑

l

|m0(ξ+ l+1/2)|2 |ϕ(ξ+ l)|2+∑

l

|m0(ξ+ l+1)|2|ϕ(ξ+ l+1/2)|2

= |m0(ξ+1/2)|2+ |m0(ξ+ l)|2=1

23

Page 24: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

noting the facts from Proposition 20 for the last line, and that m0 is 1-periodic. Thus ψ( · − k), k ∈ Z

forms an orthonormal system. Now we show that this orthonormal system spans W0. Again from the pre-vious Proposition 22, we know that since f ∈W0 we can write

f (ξ)= eπiξ νf(ξ)m0(ξ/2+ 1/2) ϕ(ξ/2)=νf(ξ)

γ(ξ)ψ(ξ)

so that f (ξ) is the product of a 1-periodic function and ψ(ξ). Taking the Fourier series ofνf(ξ)

γ(ξ)and

inverting the Fourier transform (see remark below) shows that

f(ξ)=∑

k

ckψ(ξ − k)

so that f ∈ spanψ( · − k), k ∈Z. This shows that ψ is a wavelet for the given MRA.

Conversely, now assume that ψ is a wavelet for the given MRA. Then by Proposition 22 we have some νψsuch that ψ(ξ) = eπiξ νψ(ξ)m0(ξ/2 + 1/2) ϕ(ξ/2). and ‖νψ‖L2(T) = ‖ψ‖L2(R) = 1. We also have that∑

l|ψ(ξ + l)|2 = 1 a.e. since ψ is an orthonormal system. Then by the same computation as above, we

have

1=∑

l

|ψ(2ξ+ l)|2 =∑

l

|νψ(2ξ+ l)|2 |m0(ξ+ l/2+ 1/2)|2 |ϕ(ξ+ l/2)|2

= |νψ(2ξ)|2(

l

|m0(ξ+ l+1/2)|2|ϕ(ξ+ l)|2+∑

l

|m0(ξ+ l+1)|2|ϕ(ξ+ l+1/2)|2)

= |νψ(2ξ)|2

so that |νψ(ξ)|=1 for a.e. ξ.

The following is a useful characterization of when a function lies in the span of translates, and we used itin the previous proof:

Proposition 24. Let g( · − k), k ∈ Z be an orthonormal basis for Y ⊂ L2. We note that f ∈ Y =

spang( · − k), k ∈Z if and only if f (ξ)=λf(ξ) g(ξ) where λf is a 1-periodic function.

Proof. We showed this in the previous proof by using the Fourier series λf(ξ)=∑

λf (k)e−2πikξ, so that

f (ξ)=∑

λf (k)e−2πikξ g(ξ) f(x)=

λf (k)g(x− k)

Conversely, if f ∈ Y , so that f =∑

ck g(x − k), then taking the Fourier transform shows that f (ξ) =(∑

kck e

−2πikξ)

g (ξ).

From this, we can show that for any γ, 1-periodic with |γ | = 1, if we set gγ(ξ)6 g(ξ)γ(ξ), then gγ( · −k), k ∈Z also forms an orthonormal basis for Y . This follows from the fact that

|g (ξ+ l)|2=1 a.e. ∑

|gγ(ξ+ l)|2=1 a.e.

so that gγ( · − k), k ∈Z forms an orthonormal system. That the span is the same follows immediately fromthe remark, so that for any f ∈Y ,

f (ξ)=λf(ξ) g(ξ)=λf(ξ)

γ(ξ)gγ(ξ)

24

Page 25: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

just as in the previous proof.

For example, if we take γ(ξ) = e2πiNξ, this shifts ψ by N , which is saying the obvious fact that if ψ( · −k), k ∈Z is an orthonormal basis for W0, then ψ( · −N − k), k ∈Z is an orthonormal basis for W0.

Essentially this observation says that the γ of Theorem 23 can be an arbitrary 1-periodic function with|γ |=1. The canonical choice of course is just to set γ(ξ)= 1.

This allows us to compute ψ explicitly in terms of the refinement equation. Recall

m0(ξ)=1

2√

k

hk e−2πikξ

where hk= 〈ϕ, ϕ1,k〉. Take γ≡ 1, so that

ψ(ξ)= eπiξm0(ξ/2+1/2)ϕ(ξ/2) =1

2√

k

hk e2πik(ξ/2+1/2) eπiξϕ(ξ/2)

=1

2√

k

hk (−1)k eπi(k+1)ξϕ(ξ/2)

(k=−l− 1) =1

2√

l

(−1)−l−1h−l−1 e−πilξ ϕ(ξ/2)

Inverting the Fourier transform gives

Proposition 25.

ψ(x) = 2√ ∑

l

h−l−1 (−1)l−1 ϕ(2x− l)

so ψ is built from the coefficients hk, except flipped and with modulated signs.

Example 26. Check this with the Haar system. In the case of the Haar basis, we had ϕ= 1[0,1],

m0(ξ)=1

2√(

1

2√ +

1

2√ e−2πiξ

)

= e−πiξ cos(πξ)

and h0=h1=1

2√ . Then

ψ(x) = 2√ (

1

2√ 1[0,1](2x+1)− 1

2√ 1[0,1](2x+2)

)

= 1[0,1](2x+1)−1[0,1](2x+2)=

1 −1

2≤ x≤ 0

−1 − 1≤x≤−1

2

Note that it is a shifted and flipped version of what we used before, which of course is equivalent from theearlier remarks.

Computing the Wavelet Coefficients

Define gk6 − h−k−1 (−1)k−1, so that ϕ(x) = 2√

hk ϕ(2x − k) and ψ(x) = 2√

gk ϕ(2x − k). Thismeans that

ϕj,k(x) = 2j/2ϕ(2jx− k)= 2j+1

2

l

hlϕ(2(2jx− k)− l)=

l

hlϕj+1,l+2k

25

Page 26: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Likewise,

ψj,k=∑

l

glϕj+1, l+2k

As before, we will set cj,k(f) = 〈f , ϕj,k〉 and dj,k(f) = 〈f , ψj,k〉. Then taking the inner product with theprevious relations,

cj,k =∑

l

hl cj+1, l+2k=∑

l

hl−2k cj+1,l

We write∑

lh−(2k−l) cj+1,l as the convolution

(

h−(·) ∗ cj+1,(·))

2k

cj,k =∑

l

hl−2k cj+1, l=(

h−(·) ∗ cj+1,(·))

2k

dj,k =∑

l

gl−2k cj+1, l=(

g−(·) ∗ cj+1,(·))

2k

As a block diagram,

(g−n)

(h−n)

↓2

↓2(cj+1)

(cj)

(dj)

where the first blocks denote convolution and ↓2 denotes the downsampling operator

(c0, c1, c2, c3, ) (c0, c2, )

(note we want the 2k-th coefficient of the convolution for the k-th coefficients for cj , dj). As before, giventhe coefficients for a function f at the finer level Vj+1, we can decompose them to the coefficients at thecoarser level Vj and the coefficients in the detail space Wj. In a matrix form, we have

cj,kdj,kcj,k+1

dj,k+1

=

h0 h1 h2 g0 g1 h2 h0 h1 h2 g0 g1 g2

cj+1,2k

cj+1,2k+1

cj+1,2k+2

cj+1,2k+3

cj+1,2k+4

This matrix is an orthogonal matrix.

Exercise 2. Write out |m0(ξ)|2+ |m0(ξ+1/2)|2=1 in terms of hk only. Then

∑khk hk+2n= δn.

Doing the same for mψ(ξ)m0(ξ)+mψ(ξ+1/2)m0(ξ+1/2)= 0 shows that∑

hk g2k+n=0.

For the reconstruction formula, we note that

Pj+1,ff =Pj f +Qj f =∑

cj,lϕj,l+∑

dj,lψj,l

26

Page 27: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

so that

cj+1,k = 〈f , ϕj+1,k〉= 〈Pj+1 f , ϕj+1,k〉=∑

cj,l〈ϕj,l, ϕj+1,k〉+∑

dj,l〈ψj,l, ϕj+1,k〉=∑

cj,lhk−2l+∑

dj,l gk−2l

noting that

〈ϕj,l , ϕj+1,k〉=2j

2 2j+1

2

ϕ(2jx− l)ϕ(2j+1x− k)dx= 2√ ∫

ϕ(x)ϕ(2x− k+2l)dx= 〈ϕ0,0, ϕ1,k−2l〉=hk−2l

and a similar computation hold sfor gk−2l. We can write the above as

cj+1,k=(

h ∗ cj,(·))

k+(

g ∗ dj,(·))

k

where cj = ( c−1, 0, c0, 0, c1, 0, ) and likewise for d (this operation is called upsampling, which we willdenote by ↑2 in the block diagram:

h

g

↑2

↑2

(cj+1)

(cj)

(dj)

Note that for practical purposes, instead of working directly with the functions f ∈ L2, ϕ, ψ, these equa-tions allow us to simply work entirely in terms of the wavelet coefficients cj , dj , hj , gj.

For finite sequences cj, let us compute the time to compute the decomposition dj−1, dj−2, , d0, c0. Aslong as h, g are finite length sequences, we note that the implementation time is still linear! Computingcj−1, dj−1 from cj takes a linear amount of computation, say CN (depending on the support of h, g).Then, we note that there are half as many coefficients in cj−1, so to compute cj−2, dj−2 from cj−1 takeshalf the amount of computation CN/2, and so forth. Adding these up is a geometric series of length atmost log2 N (where N is the number of coefficients in (cj)), so the total amount of computation is

CN[

20 + + 2−log2N]≤ 2CN . (Again, compare to FFT(N) which takes N logN time). The multiresolu-

tion structure is what buys us this speedup.

Week 7 (10/21/2010)

Deisgning Wavelets

What do we want in a wavelet basis?

1. A basis not just for L2 but for other function spaces (Sobolev, Hölder, Besov). This typicallyrequires that ψ has sufficient smoothness.

2. Local expansions. This requires that ψ be well localized in time, ideally compact with small sup-port, otherwise fast decay.

27

Page 28: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

3. If we have f =∑

dj,kψj,k, we want sparse decompositions, or decay of the wavelet coefficients forlarge |j | (the scale). This is analogous to the case of Fourier basis, where the smoothness of thefunction corresponds to decay in the Fourier coefficients. For k (the position) there is no corre-sponding property, as decay in positional coefficients only describes the decay of the function, andnot the smoothness. In any case, we care about wavelets for which the wavelet coefficients decay in|j | at a rate that corresponds to the smoothness class of f .

This requires vanishing moments for ψ, i.e.∫

xk ψ(x)dx= 0 or ψ (k)(0) = 0 for k = 0, 1, , r. Thisis because if functions are locally smooth, then locally they are well-approximated by polynomials.In this case the wavelet coefficients are nothing more than

fψj,k=

(f − pf)ψj,k

if the degree of pf ≤ r. If pj approximates f well near where ψj,k is supported, then the waveletcoefficient is small.

The reason we care about decay is for truncation purposes: we can describe f with fewer coeffi-cients while minimizing the distortion when we reconstruct f .

4. Fast algorithms (using MRA framework and good time localization)

So far we’ve seen the route MRA((Vj), ϕ) ψ , characterizing the wavelet ψ for an MRA. Now wewant to design ϕ so that we construct an MRA corresponding to a wavelet with the properties that wewant, i.e. the route we take now is

ϕ MRA((Vj), ϕ) ψ

Given some ϕ, we can start by setting Vj = spanϕj,k , k ∈ Z. We then ask when this gives rise to anMRA? Here is one ingredient:

Proposition 27. Let ϕ∈L2(R) such that ϕ( · − k), k ∈Z forms an orthonormal system. If ϕ is contin-

uous at 0 and ϕ(0) 0 then⋃

j∈ZVj=L2(R).

Proof. We want to show that if f ⊥ Vj for all j ∈ Z then f ≡ 0. Let f be such a function, and let ε > 0.There exists g ∈ L2 such that supp g is compact, say supp g ⊂ [−2J−1 , 2J−1] such that ‖f − g‖ ≤ ε. Thisimplies that ‖Pj g‖2= ‖Pf(f − g)‖2≤ ε. We will be interested in the finer scales j ≥ J . Consider the func-tion

hj(ξ) = g (ξ) ϕ(2−jξ) on [−2j−1, 2j−1]

and write the Fourier series expansion with respect to 2−j/2 e2πikξ2−j

, k ∈Z:

‖hj‖L2(R)2 =

−2J−1

2J−1

|g(ξ)|2 |ϕ(2−j ξ)|2 dξ

(Plancherel) =∑

k∈Z

|hj(k)|2

(support of g is [−2J−1, 2J−1]) =∑

k∈Z

R

g (ξ) ϕ(2−jξ) 2−j e−2πikξ2−j

2

=∑

k∈Z

|〈g , ϕj,k 〉|2=∑

k∈Z

|〈g, ϕj,k〉|2 (Parseval)

= ‖Pj g‖L2(R)2

28

Page 29: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

This means that ‖hj‖2 = ‖Pj g‖2 ≤ ε. We know that as j → ∞, ‖hj‖2 → ‖g ‖22 · |ϕ(0)|2 since ϕ(ξ 2−j) →ϕ(0) uniformly on ξ ∈ [−2J−1, 2J−1], using the fact that ϕ is continuous at 0. Thus,

‖g ‖2≤ ε

|ϕ(0)| for all ε> 0

Now we use that ϕ(0) 0, and transfer to f :

‖f ‖2≤‖f − g‖2+ ‖g‖2≤ ε+ε

|ϕ(0)| ≤Cε

and since ε is arbitrary, this implies ‖f ‖2=0 and f ≡ 0.

Note that if (Vj , ϕ) also satisfies the other properties of an MRA, then earlier we showed that ‖Pj g‖2 →‖g‖2 (needs nesting of Vj, Proposition 15). Since above we showed that ‖Pjg‖2 →‖g‖2|ϕ(0)|, this impliesthat |ϕ(0)|=1. In other words,

Proposition 28. Let (Vj , ϕ) be an MRA. If ϕ is continuous at 0, and ϕ(0) 0, then |ϕ(0)|=1

Thus, if we want to construct an MRA in this manner, we would find ϕ satisfying ϕ continuous at 0 with|ϕ(0)|=1.

If we assume something even stronger, that ϕ∈L1 so that ϕ is continuous everywhere, then

Corollary 29. Let (Vj , ϕ) be an MRA. If ϕ∈L1∩L2 and ϕ(0) 0, then

1. ϕ(k)= 0 for all k ∈Z\0

2.∑

kϕ(x+ k)= 1 a.e.

Proof. Since ϕ is in L1, ϕ is continuous, and since the translates of ϕ form an orthonormal system(MRA),

∑ |ϕ(ξ + l)|2 = 1 a.e., and by continuity, this in fact holds everywhere. But since ϕ(0) = 1(assumption and the previous proposition), it must be the case that ϕ(k) = 0 for k ∈ Z\0, and thisshows (1).

Now consider Φ(x)=∑

kϕ(x+ k), which is 1-periodic. Taking the Fourier series,

Φ(l)F.S.

= ϕ(l)F.T.

= δl

so that Φ≡ 1 a.e., which shows (2)

Another relation on ϕ, ψ.

Earlier we had derived the following relations:

ϕ(ξ) = m0(ξ/2) ϕ(ξ/2)

ψ(ξ) = eiπξm0(ξ/2+1/2) ϕ(ξ/2)γ(ξ)

1 = |m0(ξ)+m0(ξ+1/2)|2

where m0, γ is 1-periodic, in L2(T) and |γ |=1 (Proposition 20, Theorem 23)

29

Page 30: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

These imply that

|ϕ(ξ)|2+ |ψ(ξ)|2= |ϕ(ξ/2)|2 for all ξ

As an aside, we note that plugging in ξ =0, then if we choose ϕ so that ϕ is continuous at 0, then ψ(0) =0, i.e.

ψ=0 (note this is not a necessary condition for (Vj , ϕ) to be an MRA)

Now rewrite the above:

|ϕ(ξ)|2 = |ϕ(2ξ)|2+ |ψ(2ξ)|2= |ϕ(4ξ)|2+ |ψ(4ξ)|2+ |ψ(2ξ)|2

= |ϕ(2Nξ)|2+∑

j=1

N

|ψ(2j ξ)|2

Note that since∑ |ϕ(ξ + l)|2 = 1, |ϕ(ξ)|2 ≤ 1 a.e. This implies that both

j=1N |ψ(2jξ)|2 and ϕ(2Nξ)

converge a.e. as N → ∞. Since ϕ(2N · ) → 0 in L2, it must be that ϕ(2N · ) → 0 a.e. Finally, this meansthat

|ϕ(ξ)|2=∑

j=1

∞|ψ(2jξ)|2 a.e.

Note that this does not hold everywhere since plugging in ξ = 0 gives 0 on the right but 1 on the left, andplugging in other integer ξ gives 0 on the left but not necessarily on the right.

In any case, this allows us to recover |ϕ | from |ψ | and vice versa. (Can look at what this looks like graph-ically, and can compare to the Haar system).

Summary

We have the following ingredients. We will choose ϕ∈L2 such that

1.∑

l∈Z|ϕ(ξ+ l)|2=1, ensuring that ϕ( · − k), k ∈Z is an orthonormal system.

2. ϕ continuous at 0, and |ϕ(0)|=1. (satisfies spanning criteria⋃

jVj=L2)

3. ϕ(ξ) =m0(ξ/2) ϕ(ξ/2) for some 1-periodic function m0∈L2(T) (a necessary condition for ϕ∈ V0 in

an MRA. ). Note that the same iteration procedure (as above for |ϕ | , |ψ |) allows us to recover ϕfrom m0

Then it turns out that all the properties of an MRA will be satisfied by (Vj , ϕ).

We need to check the conditions of Definition 14. As mentioned above, the second condition follows from (2)and the fifth condition follows from (1).

By construction, we have that ϕj,k, k ∈Z is an orthonormal basis for Vj. Then from Proposition 24, we have

f ∈V0 f (ξ)= (1-periodic fn) ϕ(ξ) 1

2f (ξ/2)= (2-periodic fn) ϕ(ξ/2) f(2 · )∈V1

which is the fourth condition. The final condition is the nesting condition Vj ⊂ Vj+1. Now we use (3), whichgives the refinement equation

ϕ(x)=∑

ckϕ(2x− k)

30

Page 31: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

with ck=2m0(−k) (plug in and take Fourier transforms). This gives the nesting, since for f ∈V0,

f =∑j

aj(f)ϕ0,j=∑j

aj(f)∑k

ckϕ(2x− 2 j − k)=∑j,k

aj(f)ckϕ1,k+2j ∈ V1

Note that we showed earlier that the third condition is implied by the others.

Meyer’s construction of wavelets in S

We will construct a wavelet ψ with ψ compactly supported and ψ ∈C∞. In particular ψ will not be com-pactly supported, which will not be as useful. In fact, there is no wavelet that is compactly supported andC∞ (can only obtain up to fixed order Ck). The idea is as follows: Let ϕ ∈C∞ satisfying:

a) supp (ϕ)⊂ [− 2/3, 2/3]

b) ϕ(ξ) = 1 on [− 1/3, 1/3]

c) ϕ real valued and even (so that ϕ is also real valued and even)

d) |ϕ(ξ)|2+ |ϕ(ξ − 1)|2=1 for all ξ ∈ [0, 1].

We will refer to the ingredients (1,2,3) in the previous discussion.

(a,d) imply (1) since the support condition on ϕ make it so that the infinite sum∑

l∈Z|ϕ(ξ + l)|2 con-

sists of two terms for every ξ, which will be consecutive, and by (d) the sum will be 1.

(b) will imply (2) by construction. (a-d) will imply (3) with

m0(ξ)=

ϕ(2ξ) |ξ | ≤ 1/3

01

3≤ |ξ | ≤ 1

2

(and periodized of course)

Note

ϕ(2ξ) =

m0(ξ) |ξ |< 1/30 |ξ | ≥ 1/3

and

m0(ξ) ϕ(ξ) =

m0(ξ) |ξ | ≤ 1/3 (from (b))0 1/3≤ |ξ | ≤ 2/3 (m0(ξ)= 0 on [1/2, 2/3])0 1/2≤ |ξ | ≤ 2/3 (ϕ(ξ)= 0 after [2/3,∞))

How do we satisfy (d)? Since we are looking for even ϕ, we want |ϕ(ξ)|2+ |ϕ(1− ξ)|2=1. We want to use

|cos(ξ)|2 + |cos(π2− ξ)|2 = |cos(ξ)|2 + |sin(ξ)|2 = 1, but cos(ξ) is not compactly supported. However, now

the problem reduces to finding η ∈ C∞ for which η(ξ) + η(1 − ξ) =π

2for all ξ with η(ξ) =

π

2for |ξ | ≥ 2/3

and η(ξ)= 0 for |ξ | ≤ 1/3.

The construction is as follows. First we find a smooth function which is 1 for x < 0 and 0 for x > 1 and strictlybetween 0 and 1 otherwise. We simply convolve a smooth bump with support [ − ε, ε], ε < 1/2, with the func-tion H(x) where H is 1 for x< 1/2 and 0 for x> 1/2. Call this function f .

Now we take η(x) =f(x)

f(x) + f(1− x), which is smooth since f is smooth and f(x) + f(1 − x) > 0. Then by con-

struction we have η(x) + η(1 − x) =f(x)+ f(1−x)

f(x)+ f(1−x)= 1, and η(x) = 1 for x < 0 (since f(x) = 1 and f(1 − x) = 0

when x< 0) and η(x)= 0 for x> 1.

31

Page 32: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Then just shift and scale η to fit the requirements above.

∗ Alternatively, take a symmetric smooth bump α(t) with support [1/3, 2/3] centered at 1/2 with integralπ

2and consider

η(ξ)=

∫−∞

ξ

α(t)dt

Note that η(ξ)+ η(1− ξ)=∫

−∞

ξ+

∫−∞

1−ξ=

∫−∞

ξ+

∫ξ

2(using symmetry of α).

In summary, ϕ is C∞ and compactly supported, and thus in S, so that ϕ is in S. If ψ is chosen accordingto our convention, then ψ ∈ S as well. Moreover, ψ vanishes in a neighborhood of 0, which we can seefrom

ψ(ξ)2= ϕ(ξ/2)2− ϕ(ξ)2

noting ϕ(ξ) is 1 in a neighorhood of zero, and thus ψ(ξ) is 0 in a neighborhood of 0. This means that

ψ (k)(0)=0 for all k=0, 1, 2, so all moments vanish:∫

xkψ(x)dx=0 for all k.

As mentioned previously, the Meyer wavelet is not usable in practice since it is not compactly supported.

Week 8 (10/28/2010)

Vanishing Moments

First, we revisit specifics about vanishing moments. If f is smooth and ψ has vanishing moments, then〈f , ψj,k〉 decays fast in j→∞, depending on the smoothness of f and the number of vanishing moments.

Here is a basic case:

Theorem 30. Let f be Lipα ∩L2, i.e. |f(x)− f(y)| ≤ |f |Lipα|x− y |α, where 0<α ≤ 1, and let ψ be such

that∫

ψ(x)dx=0. If cψ6 ∫

|x|α |ψ(x)|dx<∞, then |〈f , ψj,k〉|.C(f , ψ)2−j(α+1/2).

Remark: The result should not be confused with exponential decay in frequency, since ψj,k in the fre-

quency domain is concentrated around ξ ∼ 2−j so really the decay in frequency is like |ξ |−α. Note thatthe condition that cψ<∞ means that ψ should have some order of decay at as |x|→∞.

Proof. Note

〈f , ψj,k〉=2j/2∫

f(x) ψ(2jx− k)dx=2j/2∫

[

f(x)− f(k2−j)]

ψ(2jx− k)dx

noting that ψ integrates to 0 and ψj,k centers around k 2−j, so we extract f(k 2−j) from f(x). Thentaking absolute values inside, and applying Lipschitz constant:

|〈f , ψj,k〉| ≤ 2j/2|f |Lipα

|x− k2−j |α |ψ(2jx− k)|dx

= 2j(1/2−α)|f |Lipα

|2jx− k |α |ψ(2jx− k)|2j dx

= 2j(1/2−α) |f |Lipαcψ

32

Page 33: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Thus, having the 0-th vanishing moment for ψ gives results for Lipα. For higher vanishing moments, wecan subtract a polynomial instead of the constant, and for this we need more smoothness of f to get(local) polynomial approximation. l vanishing moments can handle f ∈ Lipl+α. The proof is essentiallythe same.

Note that for f in a range of Lipα spaces and say ψ has all vanishing moments, to obtain the optimaldecay in j we note that we need to look at the how |f |Lipα

and cψ depend on α and optimize over α(there is a tradeoff between the decay rate of 2−jα and the growth of |f |Lipα

, cψ(α)).

Spline Wavelets

At this point, Meyer wavelets can be used to characterize spaces, but implementation-wise we will needother wavelets. In particular we want wavelets with corresponding high pass and low pass having finitesupport. As an intermediate step to the Daubechies compactly supported wavelets, we will look at splines,piecewise polynomials on uniformly spaced knot sequences (Knots are the points at which the functionchanges. Uniform spacing will give us shift invariance).

First, some basic background on splines.. Let d ≥ 0 be an integer (degree of splint), and fix a > 0(spacing). Define

Sd(aZ)6

f :R→C, f |[ak,a(k+1)]is a polynomial of degree ≤ d, f ∈Cd−1

e.g. S0(aZ) are piecewise constants and S1(aZ) are piecewise linear functions which are still continuous.

A basis for Sd(Z) is given by B-splines, Bd( · − k), k ∈Z where B0=1[0,1] and Bn+1=Bn ∗ 1[0,1].

(Note that as n→∞, n√

Bn( n√ · ) tends to a Gaussian, by the central limit theorem)

We can check the following properties (inductively):

• Bd∈Sd(Z). Assuming Bn is piecewise polynomial on [k, k+1], we note that

Bn+1(x) =

x−1

x

Bn(y)dy

and for x∈ [k, k+1], we have

Bn+1(x)=

x−1

k

Bn(y)dy+

k

x

Bn(y)dy

which is polynomial since the antiderivative of polynomials is again polynomial.

Also, we show that Bn+1 ∈ Cn assuming Bn ∈ Cn−1. It suffices to check near integer points k ∈ Z,

since polynomials are smooth. Using properties of convolution, we have that Bn+1(n)

=d

dx

[

Bn(n−1) ∗

1[0,1]

]

, then

Bn+1(n)

(x) =d

dx

[ ∫

x−1

x

Bn(n−1)

(y)dy

]

= Bn(n−1)

(x)−Bn(n−1)

(x− 1)

which shows also that Bn+1(n) is continuous (since Bn

(n−1)(x) is continuous by our inductino hypoth-esis).

• Bd is symmetric aboutd+1

2

33

Page 34: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

We wish to show that Bn+1(x) =Bn+1(d+2−x) given that Bn(x) =Bn(d+1−x):

Bn+1(x) =

x−1

x

Bn(y)dy

=

x−1

x

Bn(d+1− y)dy

=

d+1−x

d+2−xBn(y)dy

= Bn+1(d+2− x)

• supp(Bd)= [0, d+1]

This follows from convolution properties, supp(f ∗ g) = supp(f) + supp(g) (Minkowski sum).

• ∑

Bd(x− k)≡ 1 for all x, i.e. Bd( · − k) form a partition of unity.

Let G(x) =∑

kBd(x − k), then G(l) = Bd(l) =

[

B0(l)]d+1

(G is Fourier series and Bd is Fourier

transform). Noting that B0(ξ)= e−iπξsin(πξ)

πξ, this implies that G(l)= δl so that G≡ 1.

• Bd( · − k), k ∈Z form a basis for Sd(Z).

The previous two properties implies that the functions are linearly independent. Suppose we have afinite linear combination of

ak Bd(x − k) = 0, note that the functions at the end (or beginning)have support where the other functions are not supported, and thus the coefficient must be 0. Byrepeating this process, we conclude that all the coefficients are 0, by peeling off the functions thatare furthest away from the origin.

To show that these functions span the spline space is more difficult (There’s a reference in thewikipedia article for “B-spline”)

Spline MRA (d is fixed)

Define Vj6 Sd(2−jZ)∩L2(R).

Note that we immediately have Vj ⊂ Vj+1 (knot sequence of Vj is a subsequence of the knot sequence ofVj+1).

Also,⋃

j∈ZVj = L2(R). We showed this for B0 (Haar system), but in general we can approximate L2

functions by splines of a fixed degree (but whose knot sequences are finer and finer). This can be achieved

inductively, as if we can find a d − 1 spline g ∈ Sd−1(2−jZ) for which ‖f − g‖2 ≤ ε, if we convolve with

2k1[0,2−k] for suitable k we can obtain a d spline h∈Sd(2−kZ) for which ‖g− h‖2≤ ε.

f ∈Vj f(2 · )∈Vj+1 is also immediate from our choice of Vj.

What’s left is to find the scaling function ϕ for which ϕ( · − k), k ∈Z is an orthonormal basis for V0, andthen (Vj , ϕ) will satisfy all the properties of an MRA (Definition 14)

We have Bd( · − k), k ∈ Z which is not orthogonal, but is still a basis. It turns out that it is a Rieszsequence as well (which we prove later)

Definition: Let H be a Hilbert space. (fn) ∈ H is said to be a Riesz sequence if there exists C1, C2 > 0such that

C1 ‖x‖2≤∥

i

xifi

H≤C2‖x‖2 for all x∈ l2

34

Page 35: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

The left inequality implies linear independence. We have the following characterization of Riesz sequences:

Proposition 31. Let ϕ∈L2(R). Then the following are equivalent:

1. ϕ( · − k), k ∈Z is a Riesz sequence in L2 with constants C1, C2

2. C1≤(∑ |ϕ(ξ+ l)|2

)1/2≤C2 for a.e. ξ

Proof. Computation:

xkϕ( · − k)∥

L2

2=∥

∥ϕ(ξ)

xk e−2πikξ

L2(ξ)

2

=

R

|ϕ(ξ)|2 |x(ξ)|2 dξ

=

0

1 (∑

l

|ϕ(ξ+ l)|2)

|x(ξ)|2 dξ

Now showing (2) (1) is easy. Given (2), we have that

C12

0

1

|x(ξ)|2 dξ ≤∫

0

1 (∑

|ϕ(ξ+ l)|2)

|x(ξ)|2 dξ ≤C22

0

1

|x(ξ)|2 dξ

and from Parseval,∫

0

1 |x(ξ)|2 dξ= ‖x‖l22 , and thus using the computation we have

C12 ‖x‖l22 ≤

xkϕ( · − k)∥

L2

2≤C2

2‖x‖l22

which shows (1).

To show (1) (2), one way is through the contrapositive. Suppose (2) does not hold. Then there is a set

A of positive measure and ε > 0 for which(∑ |ϕ(ξ + l)|2

)1/2is larger than C2 + ε, say (or smaller than

C1− ε, but the same argument will work). If we use x(ξ)= 1A, then we have that

0

1 (∑

|ϕ(ξ+ l)|2)

|x(ξ)|2 dξ > (C2+ ε)2∫

0

1

|x(ξ)|2 dξ=(C2+ ε)2 ‖x‖l22

so that (1) is not satisfied.

The other way is similar, assuming (1) we let I be any interval and set x = 1I. Then we have that

C12‖x‖22≤

I

|ϕ(ξ+ l)|2 dξ ≤C22‖x‖22

since ‖x‖22= ‖x‖L2(T)2 = |I |, we have that

C12≤ 1

|I |

I

|ϕ(ξ+ l)|2≤C22

Take |I |→ 0 centered around any ξ, and by Lebesgue differentiation we ahve that for almost every ξ

C12≤∑

|ϕ(ξ+ l)|2≤C22

35

Page 36: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

which proves (2)

Let’s show that Bd( · − k), k ∈Z is a Riesz sequence. Note that

l

|Bd(ξ+ l)|2=∑

l

sin(πξ)

π(ξ+ l)

2(d+1)

≥∣

sin(πξ)

πξ

2(d+1)

≥A> 0

for ξ ∈ [−1/2, 1/2], and since the function is 1-periodic, the lower bound holds for all ξ. Note thatsin x

x→

1 as x → 0. For the upper bound, we simply note that the sum converges absolutely, and in fact is

bounded by 2+∑

l 0

1

|π(l− 1/2)|2(d+1)= :B <∞. Thus we have a Riesz sequence.

Recall that an orthonormal sequence has a similar characterization, except with C1=C2=1. This gives aneasy way to convert any Riesz sequence into an orthonormal sequence (normalization). We set ϕd so that

ϕd(ξ)6 Bd(ξ)(

∑ |Bd(ξ+ l)|2)1/2

Then we note that∑

l|ϕd(ξ+ l)|2=

l|Bd(ξ+ l)|2

k|Bd(ξ+ k)|2

=1 a.e., and thus ϕd( · − k), k ∈Z is an orthonormal

system, and ϕd is the desired scaling function for our spline MRA.

If we set G(ξ) 6 1(

∑ |Bd(ξ+ l)|2)1/2

=∑

αk e−2πikξ (a 1-periodic function with Fourier series coefficients

αk), then ϕd =BdG, and thus ϕd(x) =∑

αkBd(x− k).

It is known that αk is an infinite sequence, that there is no smooth spline wavelet with compact support.The corresponding wavelet ψd will have infinite support for d > 0 (d= 0 is the Haar system, which is com-pactly supported). However ϕd and ψd have exponential decay, which can be shown by showing that ϕdand ψd have analytic extensions to some horizontal strip 0 < Im z < b (the length of strip corresponds tothe rate of exponential decay). We will not prove this here.

Week 9 (11/4/2010)

Compactly Supported Wavelets

We will approach the construction of compactly supported wavelets through m0. Recall that ϕ satisfies

ϕ(x)= 2√ ∑

k

hkϕ(2x− k)

and

ϕ(ξ)=m0(ξ/2) ϕ(ξ/2)

where m0(ξ)=1

2√∑

khk e

−2πikξ. Also,

|m0(ξ)|2+ |m0(ξ+1/2)|2=1

If ϕ is continuous and ϕ(0) 0, then ϕ(0) =m0(0) ϕ(0) and m0(0) = 1. This implies m0(1/2) = 0. m0 iscontinuous at all ξ for which ϕ(ξ) 0.

If ϕ is compact, then hk is necessarily finite, so that m0 is a trigonometric polynomial.

36

Page 37: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

We can iterate the formula above:

ϕ(ξ) =m0(ξ/2)m0(ξ/4) ϕ(ξ/4)= ′′=′′

[

j=1

∞m0(2

−jξ)

]

ϕ(0)

so long as we have convergence in the infinite product. Without loss of generality ϕ(0) = 1 (in general weknow |ϕ(0)|=1 if ϕ(0) 0 and ϕ is continuous at 0, Proposition 28).

Theorem 32. Let m(ξ)=∑

k=KL

ak e−2πikξ be a trigonometric polynomial such that

1. |m(ξ)|2+ |m(ξ+1/2)|2=1

2. m(0)= 1

3. m(ξ) 0 for |ξ | ≤ 1/4.

Then θ(ξ) =∏

j=1∞

m(2−jξ) defines a continuous function in L2. If ϕ6 θ∨, then supp ϕ ⊂ [K, L] and ϕ

defines an MRA, and the corresponding wavelet ψ is also compactly supported (support contained in [K,L]).

Proof.

(θ is continuous) For all ξ, m(2−jξ) → m(0) = 1, so that the infinite product makes sense. For products,we note

ai converges if∑ |1− ai| converges (verify with logarithms/exponentiation). Note

|m(2−jξ)− 1|= |m(2−j(ξ))−m(0)| ≤ ‖m′‖∞2−j |ξ |

Since m is a trigonometric polynomial, ‖m′‖∞ ≤ C < ∞ (and can be computed with Bernstein’sinequality). Since

j2−j < ∞, get convergence for all ξ, and in fact the convergence is uniform for all

bounded sets. This means that the limiting function θ is continuous.

(θ ∈L2) Since |m(ξ)| ≤ 1 from (1),

|θ(ξ)| ≤∣

j=1

N

m(2−jξ)

for any N , where the product is 2N-periodic, so that the inequality will only be useful up to 2N. We willestimate

−2N−1

2N−1

|θ(ξ)|2 dξ ≤∫

−2N−1

2N−1∣

j=1

N

m(2−jξ)

2

It turns out that the right hand side is 1 for any N . We will show this inductively:

First, for N =1, we have that

−1

1

|m(ξ/2)|2 dξ =

0

2

|m(ξ/2)|2 dξ

=

0

1

|m(ξ/2)|2 dξ+∫

0

1

|m(ξ/2+ 1/2)|2 dξ= 1

37

Page 38: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Assuming that it is true for N − 1, we have

−2N−1

2N−1∣

j=−1

N

m(2−jξ)

2

dξ =

0

2N

|m(2−Nξ)|2∣

j=−1

N−1

m(2−jξ)

2

=

0

2N−1

|m(2−Nξ)|2∣

j=−1

N−1

m(2−jξ)

2

+

0

2N−1

|m(2−Nξ+1/2)|2∣

j=−1

N−1

m(2−jξ)

2

=

0

2N−1∣

j=−1

N−1

m(2−jξ)

2

dξ=1

noting that∣

j=−1N−1

m(2−jξ)∣

2is a 2N−1-periodic function. This implies that

−∞

∞|θ(ξ)|2 dξ ≤ 1

taking the limit as N→∞. Thus θ ∈L2.

(ϕ= θ∨ gives rise to an MRA)

We know that ϕ = θ is continuous, and ϕ(0) = 1. If we show that ϕ( · − k), k ∈Z forms an orthonormalsystem, we are done (from Proposition 27). The translates form an orthonormal system if and only if∑

l|ϕ(ξ+ l)|2=1, which occurs if and only if the Fourier series coefficients

0

1∑

l

|ϕ(ξ+ l)|2 e−2πikξ dξ= δk

or∫

−∞

∞|θ(ξ)|2 e−2πikξ dξ= δk

The claim is that∫

−2N−1

2N−1

|θN(ξ)|2 e−2πikξ dξ= δk for all N and k ∈Z. The proof is identical to the induc-

tive proof earlier that∫

−2N−1

2N−1

|θN(ξ)|2 dξ = 1. Note that 1[−2N−1,2N−1]θN e−2πikξ→ θ(ξ)e−2πikξ pointwise

for all k, so that if we can find a square-integrable function γ for which |1[−2N−1,2N−1]θN | ≤ |γ | for all ξ,then by dominated convergence theorem

−∞

∞|θ(ξ)|2 e−2πikξ dξ= lim

N→∞

−2N−1

2N−1

|θN(ξ)|2 e−2πikξ dξ=1

Note that

θN(ξ)=∏

j=1

N

m(2−jξ)=

j=1∞

m(2−jξ)∏

k=1∞ m(2−j2−Nξ)

=θ(ξ)

θ(2−Nξ)

Thus we just need to show that θ(2−Nξ) has a lower bound in [ − 2N−1, 2N−1]. Let ω = 2−Nξ, then we

wish to show that |θ(ω)| ≥C > 0 for |ω | ≤ 1

2. We will use (3) here.

θ(ω)=∏

j=1

∞m(2−jω)=m(ω/2)m(ω/4)

38

Page 39: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

We are interested in |ω | ≤ 1

2. We know that m(ξ) 0 for |ξ | ≤ 1/4, and by continuity on a compact set

there exists c0 for which |m(ξ)| ≥ c0> 0 for |ξ | ≤ 1/4. This means all terms have the bound |m(ω/2j)| ≥ c0if |ω | ≤ 1/2. We will use this bound for the first few terms, and rely on convergence for the rest. Due touniform convergence on bounded sets of the infinite product, we note that there is an integer M such that

j=M

∞|m(2−jω)| ≥ 1/2 for all |ω | ≤ 1/2

This implies that∣

j=1

∞m(2−jω)

≥ 1

2c0M−1 for all |ω | ≤ 1/2

and thus |θN(2−jω)| ≤ 2

c0M−1

|θ(ξ)| for all |ξ | ≤ 2N−1, and thus dominated convergence applies.

(ϕ is supported in [K,L])

Claim: Without loss of generality we may assume that K =0. If we write

m(ξ)= e−2πikξ∑

0

L−Kak+K e

−2πikξ= : e−2πikξ m(ξ)

we have

θ(ξ)=∏

j=1

∞e−2πiK 2−jξ

j=1

∞m(2−jξ)= e−2πiKξ

j=1

∞m(2−jξ)

now let θ =∏

j=1∞

m(2−jξ). We have that θ∨= θ∨( · −K). So supp θ

∨⊂ [K, L] if and only if supp θ∨⊂ [0,

L−K]. Thus if we prove the result for K =0 we can prove the result for general K by shifting.

θN(ξ) =∏

j=1N

m(2−jξ). Let µ be the measure µ=∑

0Lakδk so that µ =m(ξ). Note supp µ= 0, 1, , L.

Define µj6∑

0Lakδk2−j, so that µj =m(2−jξ). Note supp µj = 0, 2−j , , L2−j. We have that

θN∨ = µ1 ∗ ∗ µN

Note that the support of a convolution is the Minkowski sum of the supports, so

supp(θN∨ )= supp(µ1) + supp(µ2)+ + supp(µN)

with supp(µ1)⊆ [0, L/2], supp(µ2)⊆ [0, L/4] and supp(µk)⊆ [0, L/2k]. This means that supp(θN∨ )⊆ [0, L/

2+L/4+ ] = [0, L]. Since θN→ θ, this means that θN∨ → θ∨= ϕ weakly. Then supp(θN

∨ )⊆ [0, L] for all Nimplies that supp(ϕ)⊆ [0, L].

(ψ is also compactly supported)

We can reconstruct a wavelet from ϕ with the formula Proposition 25

ψ(x)= 2√ ∑

l=−L−1

−K−1

h−l−1 ϕ(2x− l)

39

Page 40: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Since |supp(ϕ)| ≤L−K, we have that |supp(ϕ(2x− l))| ≤ L−K

2and thus |supp(ψ)| ≤L−K: the left-most

function in the sum has support [a, a +L−K

2] and the right-most function in the sum has support [a +

L−K

2, a+L−K], so the total support length is L−K.

Wish List

We seek m with the given properties:

• |m(ξ)|2+ |m(ξ+1/2)|2=1

• m(0)= 1

• m(ξ) 0 for |ξ | ≤ 1/4

We can verify these properties for the Haar system already. m(ξ) =1

2√∑

hk e−2πikξ, h0 = h1 = 1, so

m(ξ) =1+ e−2πiξ

2= e−πiξcos(πξ). Note that m(ξ) 0 for |ξ | ≤ 1/2. The support of ϕ is in [0, 1] (K =0, L=

1).

Where will the smoothness of ϕ come from? We will need to show decay estimates for θ. If we have thedecay estimate |θ(ξ)|. |ξ |−r−1−ε, then ϕ∈Cr, for instance (can use integration by parts to prove).

Week 10 (11/11/2010)

This time we show that for all r > 0, there exists a wavelet satisfying our wish list above in Cr. In partic-ular, we can arrange |supp ψr | ≤Cr for some constant C.

Originally the construction was due to Daubechies. We will be doing a simpler construction.

Remark: Let V0= spanϕ( · − k), k ∈Z= spanϕ( · − k), k ∈Z, i.e. V0 is generated by translates of eitherϕ and ϕ, and suppose that ϕ( · − k, k ∈Z and ϕ( · − k), k ∈Z are orthonormal bases for V0. We havealready characterized orthonormal generators previously. We know that

|ϕ(ξ+ l)|2=1=∑

|ϕ∧(ξ+ l)|2

(Theorem 19) and if we write ϕ =∑

ak ϕ( · − k), then ϕ∧(ξ) = a(ξ) ϕ(ξ) with |a(ξ)|= 1 where a is one-periodic (Proposition 24). If ϕ, ϕ are both compactly supported, we will see that ϕ must be an integertranslate of ϕ.

Proof. Note that if ϕ, ϕ are compactly supported, then ak is a finite sequence, and thus a is a trigono-

metric polynomial a(ξ)=∑

k=KL

ake−2πikξ, with aK 0 and aL 0. Let’s write a in the form

a(ξ) = e−2πikK∑

0

N

ak+K e−2πikξ

and call bk= ak+K, so that b0 0 and bN 0. Then

|a(ξ)|2=1=∑

k,l

bkbl e−2πi(k−l)ξ=

n=−N

N (

k−l=nbkbl

)

e−2πinξ

40

Page 41: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

This means that∑

k−l=n bkbl = δn.

Assume towards a contradiction that N > 0. Then if n = N , we have that l = 0 and k = N and bNb0 = 0,but this contradicts the fact that neither b0 nor bN are zero. Thus N = 0 and a(ξ) = e−2πikK, and thusϕ= ϕ( · −K)

Now we design m with the desired properties. We’ll initially design not m(ξ), but |m(ξ)|2, and we will usethe following Lemma to go back:

Lemma 33. (Riesz) If g(ξ) is a trigonometric polynomial, g(ξ) =∑

−KK

γk e−2πikξ is a non-negative

(real) trigonometric polynomial with real coefficients γk ∈R. Then there exists m(ξ) =∑

0Kak e

−2πikξ suchthat ak∈R and |m(ξ)|2= g(ξ). (Actually we do not need γk to be real)

We will prove this lemma later.

A family of trigonometric polynomials gkk≥0 that will work:

gk(ξ)6 1−∫

0

2πξ(sin t)2k+1 dt

0

π (sin t)2k+1 dt, k=0, 1,

Call ck6 ∫

0

π(sin t)2k+1 dt. Let’s see that gk are admissible for our purposes:

Properties:

• gk(0)= 1, gk(1/2)= 0

gk(ξ)+ gk(ξ+1/2) = 2− ck−1

[

0

2πξ

+

0

2π(ξ+1/2)]

(sin t)2k+1 dt

= 2− ck−1

(

ck+

[

0

2πξ

+

π

2πξ+π]

(sin t)2k+1 dt

)

= 2− ck−1ck

= 1

noting that∫

π

2πξ+π

(sin t)2k+1 dt=

0

2πξ

(sin (t+ π))2k+1 dt=−∫

0

2πξ

(sin t)2k+1 dt

• 0≤∫

0

2πξ(sin t)2k+1 dt≤ ck so that 0≤ gk≤ 1, and gk 0 for |ξ |< 1

2.

• If we graph gk, we will see that as k grows larger, the function becomes flatter at 0, π, and this willcontribute to decay in ξ of the corresponding infinite product.

Proposition 34. Applying Riesz to obtain trigonometric polynomials mk such that |mk|2= gk,

j=1

∞mk(2−jξ)

2

=∏

j=1

∞gk(2−jξ). |ξ |−βk

for some β > 0

41

Page 42: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

First, we prove the following:

Claim: gk(ξ) =[

1+ cos(2πξ)

2

]k+1γk(ξ)= [cos(πξ)]

2k+2γk where γk is another trigonometric polynomial.

Proof. Let pk(x) = ck−1 ∫

−1

x(1− u2)k du which is a degree 2k + 1 polynomial. Set u= cos t in the defini-

tion of gk, and we have

gk(ξ) = 1+ ck−1

1

cos(2πξ)

(1− u2)kdu

= ck−1

[

−∫

1

−1

+

1

cos(2πξ)]

(1−u2)k du

= ck−1

−1

cos 2πξ

(1− u2)k du

= pk(cos 2πξ)

Note pk(−1) = 0 and pk′ (x) = ck

−1(1 − x2)k. The derivative has a root of order k at −1, and thus pk has a

root of order k+1, and pk(x)= (x+1)k+1qk(x) for some polynomial qk. We conclude that

gk(ξ)= (1+ cos(2πξ))k+1qk(cos(2πξ))

and we take γk(ξ) = 2−k−1 qk(cos(2πξ)).

Remark:

j=1

∞cos(2−jξ)=

j=1

∞sin(2−j2πξ)

2 sin(2−jπξ)= limm→∞

sin(πξ)

2m sin(2−mπξ)=

sin(πξ)

πξ

Roughly speaking, each cos πξ factor buys us1

|ξ| decay for the infinite product∏

gk(2−jξ).

Lemma 35. Let Mk(ξ)6 gk(ξ)

[cos(πξ)]k= [cos(πξ)]k+2 γk(ξ). Then there exits α< 1 such that ‖Mk‖∞. 2αk

Proof. Since gk(ξ) = pk(cos 2πξ),

‖Mk‖∞= sup−1≤x≤1

pk(x)

(

1+ x

2

)−k/2, (take x= cos 2πξ)

Then

(1+ x)−k/2 pk(x) = ck−1

−1

x (1− u2)k

(1+x)k/2du

= ck−1

−1

x[

1+ u

1+ x

]k/2≤1

(1+u)k/2(1−u)k du

≤ ck−1

−1

x[

(1+ u)(1−u)2]k/2

du

≤ 2ck−1[

sup|u|≤1

(1+ u)(1−u2)]k/2

≤ 2ck−1 2k/2

42

Page 43: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

where we used that (1 + u)(1 − u2) = (1 − u2)(1 − u) ≤ 2 for |u| ≤ 1. Now we just need a lower bound forck.

ck =

0

π

(sin t)2k+1 dt

=

−π/2

π/2

(cos t)2k+1dt

≥∫

|t|≤ 1

2k+1√

(

1− t2

2

)2k+1

dt

≥[

1− 1

2(2k+1)

]2k+11

k√

≥ C0

k√

and this implies that

‖Mk‖∞≤ 2k

C02k/2

Note that cos t ≥ 1 − t2

2holds when

|t|44!

≥ |t|66!

≥ , or when |t|2 ≤ 30 (alternating series estimates), and in

particular on the set |t| ≤ 1

2k+1√ .

Thus we have shown the result for α=1/2.

Now we prove the estimate.

Proof. (of Proposition 34) gk(ξ)=Mk(ξ)[cos πξ]k. Then

j=1

∞gk(2

−jξ)=

(

sinπξ

πξ

)k∏

j=1

∞Mk(2

−jξ)

We note that the infinite product converges uniformly on bounded sets, since

|Mk(2−jξ)− 1|= |Mk(2

−jξ)−M(0)| ≤Ck 2−jξ

Note Mk(0) =gk(0)

cos(0)k= 1. The sup-norm upper bound we derived earlier is only good for finitely many

terms, so we will obtain the estimate by breaking the product into two parts:

1

∞Mk(2

−jξ)=∏

1

J

Mk(2−jξ)

J+1

∞Mk(2

−jξ)

We know that for |ξ | ≤ 1,∣

1∞Mk(2

−jξ)∣

∣ ≤ Ck′ , since we have uniform convergence of the infinite pro-

duct, and thus the limiting function is continuous, and thus bounded on the compact set |ξ | ≤ 1. Thisimplies that

J+1

∞Mk(2

−jξ)

≤Ck′ for |ξ | ≤ 2J

43

Page 44: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Now for each ξ ∈R, there is a unique J such that 2J−1≤ |ξ | ≤ 2J. Then

j=1

∞Mk(2

−jξ)

≤ ‖Mk‖∞J Ck′

≤ 2αJkCk′

≤ [2 |ξ |]αkCk′

This implies that∣

j=1

∞gk(2−jξ)

≤Ck′ [2 |ξ |]αk 1

|πξ |k ∼ Ck |ξ |−βk

with α< 1, and − β=α− 1, so β > 0. (using the previous lemma we get β=1/2)

From this, in the context of Theorem 32, we note that |θ(ξ)| .k |ξ |−βk/2 so that ϕ ∈ Cβk/2−2, so we endup with a compactly supported wavelet with arbitrarily high smoothness. The one missing piece is the theproof of the Riesz lemma:

Proof. (of Lemma 33) From http://people.virginia.edu/~jlr5m/Papers/FejerRiesz.pdf

Letting w(z) =∑

−KK

γk zk so that w(e−2πiξ) = g(ξ), the goal is to find a polynomial p(z) such that

p(z) p(z) = w(z) for |z | = 1. We note that γ−k = γk since g is real, and thus by inspection we have that

w(1/z) =w(z) for all z 0. In particular, the zeros of w that are not on the unit circle occur in pairs (αj ,1/αj) having equal multiplicity. Also, the zeros of w on the unit circle must have even multiplicity,because the symmetry properties shows that the map z w(z) is locally 2 (or some other even number)to 1 near such zeros.

Now consider q(z) = zKw(z), which is now a polynomial of degree 2K with q(0) = γ−K. Without loss ofgenerality we may assume γ−K 0 by working with the conjugate of g if necessary. Noting the above, wecan factor q as

q(z)=∏

j=1

K

(z −αj)(z − 1/αj)=C∏

j=1

K

(z −αj)z(z−1− αj)

so that

w(z) =C∏

j=1

K

(z −αj)(z−1− αj)

Then we just take p(ξ) = C√

j=1K (e−2πiξ − αj). (If in addition γk are all real, we note the additional

relation w(1/z) =w(z) which then forces the zero-pairs (α, 1/α) to be real)

Week 11 (11/24/2010)

Vanishing Moments, Again

From the formula of Theorem 23 we note that a 0 of order L at m(1/2) gives a 0 of order L at ψ(0), sowe have L− 1 vanishing moments for ψ. Here we consider a set of conditions which is sufficient for havingvanishing moments (note that it does not assume that ψ is a wavelet)

44

Page 45: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Theorem 36. Suppose ψ satisfies the following properties:

• ψj,k is an orthonormal system for j

• ψ ∈CL(R) and ψ(l)∈L∞ for l=0, , L• |ψ(x)|. 1

(1+ |x|)A for A>L+1

Then∫

xlψ(x)dx=0 for l=0, , L.Proof. The idea is that we know 〈ψ, ψj,k〉=0 for all j , k (0, 0), and locally ψ is well-approximated by apolynomial.

Let l0 be the smallest l such that∫

xl ψ(x)dx 0. If l0>L, there is nothing to prove, so we assume l0 ≤L towards a contradiction. ψ(l0) cannot vanish identically since ψ is not a polynomial (boundedness condi-

tion). Thus, there exists some a such that ψ(l0)(a) 0, and without loss of generality we may assume a=

K 2−J for J ≥ 0 and J ,K ∈Z. Consider the Taylor formula for ψ at a:

ψ(x)=∑

l=0

l0ψ(l)(a)

l!(x− a)l+R(x)

where R(x) = o(|x− a|l0) as x→ a.R(x)

|x− a|l0is a bounded function, noting the form

R(x)=

a

x

[ψ(l0)(t)− ψ(l0)(a)](t− a)l0−1

(l0− 1)!dt

and the assumption that ψ(l0)∈L∞.

Let cl=ψ(l)(a)

l!be the Taylor polynomial coefficients. We know cl0 0.

Consider 〈ψ, ψj,k〉=0, i.e.∫

ψ(x)ψ(2jx− k) dx=0. Let kj=2ja for j ≥ J so that kj=K 2j−J ∈Z. Thenwe have that

ψ(x) ψ(2j(x− a)) dx=0

for all j ≥J . Putting in the Taylor formula, we have

[

l=0

l0

cl(x− a)l+R(x)

]

ψ(2j(x− a)) dx=0

Since∫

xlψ(x)dx=0 for all l < l0, we’re left with

[

cl0(x− a)l0+R(x)]

ψ(2j(x− a)) dx=0

We can rearrange the above as

xl0ψ(x)dx=2j(l0+1)

xl0 ψ(2j x)dx=− 2j(l0+1)

cl0

R(x) ψ(2j(x− a)) dx

and we will show that the RHS tends to 0 as j→∞.

45

Page 46: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Using the assumptions, we have

|RHS| ≤ 2j(l0+1)

cl0

∫ |R(x)||x− a|l0

|x− a|l0(1+2j |x− a|)A dx

=2j(l0+1)

cl0

∫ |R(a+2−ju)||2−ju|l0

|2−ju|l0(1+ |u|)A 2−j du (x= a+2−ju)

=1

cl0

∫ |R(a+2−ju)||2−ju|l0

|u|l0(1+ |u|)A du

As j→∞, pointwise the integrand goes to 0 (R(x) = o(|x − a|l0 near a). The integrand is also boundedby

R( · )| · − a|l0

|u|l0(1+ |u|)A ∈L1

and so dominated convergence holds, and we have that∫

xl0ψ(x)dx= 0, which contradicts the definitionof l0.

Decay of wavelet coefficients from vanishing moments

Theorem 37. Let ψ be such that∫

xl ψ(x)dx = 0 for l = 0, , L. and∫

(1 + |x|)L+1 |ψ(x)|dx <∞.

Then if f ∈CL and f (L)∈Lipα for 0<α< 1, then

|〈f , ψj,k〉|. 2−j(L+α+1/2)

Remarks: Above, the 1/2 is an artifact from l2 normalization. We proved this for L= 0 previously (The-orem 30). Also, this is an if and only if statement, which will only partly show later.

Proof. By Taylor’s formula, for each a, there exists a polynomial pa,L of degree L such that

|f(x)−Pa,L(x)| ≤C |x− a|L+α

This is from the integral form of Taylor remainder:

a

x

[f (L)(t)− f (L)(a)](t− a)L−1

(L− 1)!dt

≤ |x− a|α |x− a|LL!

Now

〈f , ψj,k〉 =

f(x)2j/2ψ(2jx− k) dx

(vanishingmoments) =

[f(x)−Pa,L(x)]2j/22j/2ψ(2jx− k) dx

(a=2−jk) ≤∫

C |x− 2−jk |L+α 2j/2 |ψ(2jx− k)|dx

(u=2jx− k) = 2−j(L+α+1/2)

C |u|L+α |ψ(u)|du

≤ C ′ 2−j(L+α+1/2)

where C ′ is a constant that does not depend on j (just f , ψ).

46

Page 47: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Lipschitz regularity from decay of wavelet coefficients

0 < α < 1. Let ϕ, ψ ∈ C1 be compactly supported, ψ is the wavelet corresponding to the MRA generatedby ϕ.

Theorem 38. For f ∈L2, suppose |〈f , ψj,k〉|. 2−j(α+1/2) for all j , k ∈Z. Then f ∈Lipα.

Proof. Decompose f =P0f +∑

j=0∞

Qj f ,

P0f =∑

k

〈f , ϕ0,k〉 ϕ0,k

Qjf =∑

k

〈f , ψj,k〉 ψj,k

where the equations above are valid in L2 as well as pointwise (compactly supported wavelets, so onlyfinitely many terms in each sum). Thus P0f , Qjf ∈C1.

|Qj f(x)| ≤(

supk

|〈f , ψj,k〉|)

2j/2.2−jα

k

|ψ(2jx− k)|

Note∑

k|ψ(2jx− k)| has finitely many terms, and the number of terms does not depend on j. Thus

‖Qj f ‖∞. 2−jα

In particular,∑

kQjf is uniformly convergent, and since each Qjf ∈ C1, the limit

kQj f is contin-

uous. Now showing that |f(x)− f(x)|. |x− y |α boils down to showing that

j

Qjf(x)−Qjf(y)

. |x− y |α

noting that Pjf ∈C1 already (an thus in Lipα). In fact we will show that

j

|Qjf(x)−Qjf(y)|. |x− y |α

Qjf(x)−Qjf(y) =∑

k

〈f , ψj,k〉 (ψj,k(x)− ψj,k(y))

|Qjf(x)−Qjf(y)| . 2−jα∑

k

|ψ(2jx− k)− ψ(2jy− k)|

(mean value theorem, t between x and y) ≤ 2j(1−α)|x− y |∑

k

|ψ ′(2jt− k)|

. 2j(1−α)|x− y |

noting again that∑

k|ψ ′(2j t− k)| has finitely many nonzero terms. This estimate is only useful for small

j.

For large j, we can use the simpler estimate ‖Qjf ‖∞. 2−jα so that

|Qjf(x)−Qjf(y)|. 2−jα

47

Page 48: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Then we can split the sum into two parts and use the two estimates:

j=0

∞|Qjf(x)−Qjf(y)| .

2j≤ 1

|x−y|

|Qjf(x)−Qjf(y)|+∑

2j≥ 1

|x−y|

|Qjf(x)−Qjf(y)|

. |x− y |(|x− y |−1)1−α+(|x− y |−1)−α

. |x− y |α

Here we are using that the sum of exponentially controlled increasing/decreasing sequence is controlled bythe largest term.

Week 12 (12/1/2010)

Lp convergence of wavelet series

Here we address conditions for when Pj f → f in Lp. We already analyzed this for p = 2 and touched onp=1 case in Week 3.

Definition 39. We say that ϕ is majorized by a function µ if |ϕ(x)| ≤ µ(|x|) where µ: R+ → R+ ismonotone decreasing and µ∈L1

Proposition 40. Let ϕ be majorized by µ and suppose ϕ( · − k), k ∈Z is an orthonormal system. Thenthere exists constants C1, C2> 0 (depending on µ) such that

C1 ‖a‖lp ≤∥

k

akϕ( · − k)

Lp

≤C2 ‖a‖lp

for 1≤ p≤∞.

Proof. Let a∈ lp.For p=∞, we have the following upper bound:

akϕ( · − k)∥

L∞≤ ‖a‖∞ sup

x∈[0,1]

k

|ϕ(x− k)|1−periodic

≤ ‖a‖∞(

µ(0)+∑

k

µ(k))

where we have used that ϕ is majorized by µ and that µ is monotone in the second line. We can thentake C2 : = 2

kµ(k) which is finite since µ∈L1.

For the lower bound, let f =∑

kakϕ( · − k). Then ak= 〈f , ϕ( · − k)〉, so that

|ak| ≤ ‖f ‖∞ ‖ϕ‖1

Note ‖ϕ‖1≤ 2‖µ‖L1(R+)≤ 2∑

k=0∞

µ(k)=C2(µ) (monotonicity of µ), so

|ak| ≤C2(µ)‖f ‖∞

Taking supremum over k gives the lower bound.

48

Page 49: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

For p<∞,

akϕ(x− k)∣

p

≤[

|ak| |ϕ(x− k)|1/p |ϕ(x− k)|1/p′]p

(Hölder) ≤(

k

|ak|p|ϕ(x− k)|)

(

|ϕ(x− k)|)p/p′

≤C2p/p′

akϕ(x− k)∣

p

dx ≤ C2(µ)p/p′ ‖ϕ‖L1 ‖a‖pp

Again using ‖ϕ‖L1≤C2(µ) , we have

k

akϕ( · − k)

Lp

≤C2(µ)1/p+1/p′ ‖a‖p=C2(µ)‖a‖p

For the lower bound, by density it suffices to show the result for finite sums f =∑

kakϕ( · − k). We again

have ak= 〈f , ϕ( · − k)〉 (this did not require finiteness), and we essentially use the same trick:

|ak|p ≤[ ∫

|f(y)| |ϕ(y− k)|1/p |ϕ(y− k)|1/p′dy

]p

≤( ∫

|f(y)|p |ϕ(y− k)|dy)( ∫

|ϕ(y− k)|dy)p/p′

k

|ak|p ≤ C2(µ)p/p′

|f(y)|p(

k

|ϕ(y− k)|)

dy

(

k

|ak|p)1/p

≤ C2(µ)‖f ‖p

The goal is to show that Pjf → f in Lp as j→∞ for all f ∈ Lp if 1 ≤ p <∞ and for bounded, uniformlycontinuous f if p=∞.

Proposition 41. As before, assume ϕ is majorized by µ∈L1(R+). Then

1. ‖P0‖p→p≤Cµ2 for all 1≤ p≤∞

2. ‖Pj‖p→p= ‖P0‖p→p for all j

Proof. (of (1)). Let f ∈Lp∩L2. Then since P0f =∑

k 〈f , ψ0,k〉 ϕ0,k, denoting ak= 〈f , ψ0,k〉 we have

‖P0f ‖Lp ≤Cµ‖a‖lp ≤Cµ2 ‖f ‖Lp

(of (2)).

Pjf(x) =∑

j,k

2j∫

f(y) ψ(2jy− k)dyϕ(2jx− k)

Note 2j∫

f(y)ψ(2jy− k) dy=∫

f(2−ju) ϕ(u− k) du (u=2jy), so

[Pjf(2j · )](2−jx) =

j,k

( ∫

f(y) ϕ(u− k) du

)

ϕ(x− k)= [P0f ](x)

49

Page 50: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

This means that with change of variables,

|[Pj f(2j · )](x)|p dx =

|P0f(2jx)|p dx

= 2−j‖P0f ‖pp

≤ ‖P0‖p→pp 2−j‖f ‖pp

= ‖P0‖p→pp ‖f(2j · )‖pp

so that ‖Pj‖p→p= ‖P0‖p→p

Up to now, we have not used the full properties of MRA; we’ve used orthogonality and the definition ofPj.

Theorem 42. Suppose ϕ is majorized by µ and gives rise to an MRA. Then Pjf → f in Lp 1 ≤ p <∞and if p=∞, it holds if f is uniformly continuous and bounded.

Proof. First we write Pj f as a convolution operator:

Pjf(x) =∑

f(y) ϕj,k(y) dyϕj,k(x)=

∫[

k

ϕj,k(x) ϕj,k(y)]

f(y)dy= :

Kj(x, y)f(y)dy

where Kj(x, y)= 2jK0(2jx, 2jy),K0(x, y) =

kϕ(x− k) ϕ(y− k).

Next we prove the following Claim: |K0(x, y)|. cµµ(

|x− y|2

)

|K0(x, y)| ≤∑

k

|ϕ(x− k)| |ϕ(y− k)|

≤∑

k

µ(|x− k |)µ(|y− k |)

(Monotonicity) ≤ 1

(

|x− y |2

)

k

[µ(|x− k |) + µ(|y− k)|]

≤ 2cµµ

(

|x− y |2

)

Continuing on, noting that∫

Kj(x, y)dy=∑

k

|ϕ(x− k)|=1

(property of MRA, P0(1R)= 1R), we have

(Pjf − f)(x) =

Kj(x, y)f(y)dy− f(x)

=

Kj(x, y)[f(y)− f(x)]dy

|Pjf − f |( · ) .

2jµ

(

2j|x− y |

2

)

|f(y)− f( · )|dy

=

2jµ(2j−1|u|)|f( · − u)− f( · )|dy

50

Page 51: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Then Minkowski gives

‖Pjf − f ‖p .

2jµ(2j−1|u|)‖f( · − u)− f( · )‖p du

=

µ

(

|t|2

)

∥f(

· − 2jt)

− f∥

Lp dt

Now ‖f( · − 2jt) − f ‖Lp → 0 as j→∞ (continuity of translation operator in Lp, p <∞ and for uniformlycontinuous functions when p = ∞). Then dominated convergence applies since integrand is bounded by2‖f ‖Lpµ( · ) which is integrable, and we have the result.

Remark: If we consider f = 1[0, 2√

] with the Haar system (ϕ, ψ), then Pjf always has some interval con-

taining 2√

where Pjf is constant and f takes values 0 and 1 so that ‖Pjf − f ‖∞≥ 1/2 for all j. Thus wecannot expect the result to be true for p=∞ in general.

The above result allows us to write f = P0f +∑

j=0∞

Qjf with convergence in Lp. For the full expansion,

we need Pjf → 0 as j → − ∞. The proof of this result is similar to the proof of the result in the Haarsystem.

Theorem 43. Pjf→ 0 as j→−∞ if 1< p<∞. The result is not true if p=1,∞

Proof. Exactly the same proof as Theorem 11 (Week 3)

For 1< p<∞, we can write

f =∑

j=−∞

∞∑

k=−∞

∞〈f , ψj,k〉 ψj,k=

j

Qjf

with convergence in Lp

For p=1,∞, we can just do f =∑

k 〈f , ϕ0,k〉ϕ0,k+∑

j,k 〈f , ψj,k〉ψj,k=P0f +∑

j≥0 Qjf .

Approximation by wavelet series

We showed earlier that f ∈ Lipα |〈f , ψj,k〉|.f ,ψ 2−j(α+1/2) for all j ∈Z (Theorem 37), and it can be

seen from the proof that |f |Lipα6 supx y |f(x− y)|

|x− y|α ≍ supj,k 2j(α+1/2)|〈f , ψj,k〉|.

We established the relationship between smoothness and decay of wavelet coefficients. Now we establishanother connection to the approximation error, the rate of approximation for functions in a smoothnessspace using truncated wavelet series.

For f ∈Lipα∩L∞,

‖f −PJf ‖∞=

j=J

∞Qjf

∞≤∑

j=J

∞‖Qjf ‖∞. 2−Jα

We bound ‖Qjf ‖∞ as follows: Qjf =∑

2j/2〈f , ψj,k〉ψ(2jx− k), so that

‖Qjf ‖∞≤Cµ 2j/2 sup

k

|〈f , ψj,k〉|. 2−j(α+1/2)

and thus ‖Qjf ‖∞. 2−jα.

This can be reversed as well to show that if ‖f −PJf ‖∞. 2−Jα, then f ∈Lipα.

51

Page 52: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

In general, given a subspace X of Lp, we may ask what is the rate of decay of ‖f −Pjf ‖p for f ∈X?

Notation: Single bars |f | will be used to denote seminorms, which will generally be components ofnorms, denoted in double bars ‖f ‖Examples: For the Sobolev space X = Hs(R), s > 0, where |f |Hs

2 6 ∫

|ξ |2s |f (ξ)|2 dξ and the norm isdefined as

‖f ‖Hs2 = ‖f ‖L2

2 + |f |Hs2 ≍

(1+ |ξ |2)s |f (ξ)|2 dξ

(equivalent definitions up to constants)

We will characterize this space (wavelet coefficient decay) with the Shannon wavelet (will work for arbi-trary wavelets with more work). Recally Shannon wavelet had ϕ = F−1

1[−1/2,1/2]

and ψ =F−1

1±[1/2,1]

. Note

ψj,k(ξ)= 2−j/2 ψ(ξ/2j)2−2πi

ξ

2jk

so ψj,k, k ∈Z is an orthonormal basis for L2(± [2j−1, 2j]). Then splitting into dyadic frequencies,

|f |Hs2 =

j∈Z

±[2j−1,2j]

|ξ |2s|f (ξ)|2 dξ

≍∑

j∈Z

22 js∫

[2j−1,2j]

|f (ξ)|2 dξ

=∑

j∈Z

22 js∑

k

|〈f , ψj,k〉|2

so that

‖f ‖Hs2 ≍

j,k

(1+ 2js)|〈f , ψj,k〉|2

Note for the seminorm we can rewrite this as

|f |Hs≍∥

(

2js ‖(dj,k)k‖2)

j

2

(dj,k = 〈f , ψj,k〉) i.e. the seminorm is equivalent to taking the 2 norm of the wavelet coefficients in thetranslation index, and then the 2 norm of the resulting sequence weighted with the dyadic scale

To relate to approximation error, we can also show that f ∈ Hs (

2js‖f − Pjf ‖L2

)

j∈Z∈ l2 (Exercise,

easy using exact formulas).

Week 13 (12/9/2010)

Recall that

f ∈Lipα ‖f −Pjf ‖2. 2−jα |〈f , ψj,k〉| . 2−j(α+1/2)

Note that before we were using L2-normalized ψj,k. Consider instead ψj,kL∞

= ψ(2jx − k) = 2−j/2ψj,k.

Writing f =∑

dj,k∞ ψj,k

L∞using the new normalization means dj,k

∞ = dj,k2j/2 so that

|f |Lipα≍∥

(

2jα ‖(dj,k∞ )k‖∞)

j

52

Page 53: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Compare this to what we have for Hs, we see the exact same expression, with s instead of α and using adifferent norm. We can generalize this to define another space Bp,q

α where α is the degree of smoothness, pis the primary space where smoothness is measured and q is a secondary measure. The ultimate charac-terization for the norm would be something of the form

‖f ‖Bp,qα 6 ‖f ‖Lp+ |f |Bp,q

α

|f |Bp,qα =

(

2jα∥

(

2j,k(p))

k

lp

)

j

lq

where f =∑

dj,k(p)ψj,k(p)

, the p-normalized basus,

Back to Hα: How does ‖f −PJf ‖L2 behave?

‖f −PJf ‖L22 =

j=J

∞‖Qjf ‖22

L2=VJ⊕

j=J+1∞

Wj

On one hand,

j

22jα ‖f −Pjf ‖L22 =

j=−∞

∞22jα

l=j

∞‖Qlf ‖22

=∑

l=−∞

∞‖Qlf ‖22

j=−∞

l

22 jα

.∑

l∈Z

22lα ‖Qlf ‖22∑

k|〈f ,ψj,k〉|2

. |f |Hα2

We also have

j

22jα ‖f −Pjf ‖L22 =

j=−∞

∞22 jα

l=j

∞‖Qlf ‖22

≥∑

j=−∞

∞22jα‖Qjf ‖22

≍ |f |Hα2

So that

|f |Hα2 ≍

(

22jα‖f −Pjf ‖L2

)

j

l2

Three Interacting Components:

• Classical Smoothness Spaces

• Approximation Spaces (abstract, class of functions that can be approximated at a certain rate)

• Interpolation Spaces, i.e. X =L2, Y =W 1,2, then get Z = (X, Y )α,q, (Lp, W 1,p)α,q =Bp,qα , 0< q <

∞, 0 < α < 1. This interpolation is achieved via the definition of a K-functional that combines thetwo norms with some weights...

53

Page 54: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Linear Approximation v Nonlinear Approximation

Let H be a Hilbert space, separable, en1∞ be an orthonormal basis.

Let En6 spane1, , en (e.g. trigonometric polynomials, etc). For f ∈H, we are interested in

infg∈En

‖f − g‖H= dist(f , En) = :En(f) (linear approximation)

Let Σn 6 ∑

k∈Λ ckek, |Λ| ≤ n

. This space is closed under scalar multiplication but not addition. At

best Σn+Σn⊂Σ2n, and so it is not a linear space (an n-dimensional manifold).

For f ∈H, we are interested in

dist(f ,Σn) = infg∈Σn

‖f − g‖H= : σn(f)

Exact formulas:

f =∑

kckek, En(f)

2=∑

1n |ck|2 (norm of projection)

Letting |c1∗|> |c2∗|> be a decreasing rearrangement. Then σn(f)2=∑

1n |ck∗ |2.

In general, suppose we are given X1⊂X1⊂ ⊂Xn⊂ ⊂H=X. Define

Aα(X, Xn)6

f ∈X, infg∈Xn

‖f − g‖X.n−α

This is a linear space. We want to know the description of Aα(En) and Aα(Σn).

Define |f |AαXn6 supnα dist(f ,Xn)X

For linear approximation,

En(f).n−α E2j(f). 2−jα (monotonicity) (

2j<k<2j+1

|ck(f)|2)1/2

. 2−jα

(uses the fact that the sum of a geometric series is bounded by the largest term)

Thus |f |AαEn≍ supj 2jα(

2j<k<2j+1 |ck(f)|2)1/2

For nonlinear approximation,

σn(f).n−α (

k=n+1

∞|ck∗ |2

)1/2

.n−α

(Claim) |ck∗ |. k−α−1/2 for all k

The part of the claim is immediate. For ,

n |c2n∗ |2≤∑

n=1

2n

|ck∗ |2.n−2α |c2n∗ |.n−α−1/2

Thus |f |Aα(Σn)≍ supn nα+1/2 |cn∗ |

54

Page 55: Wavelets, Approximation Theory, and Signal …chou/notes/waveapprox.pdfWavelets, Approximation Theory, and Signal Processing Güntürk, Fall 2010 Scribe: Evan Chou References: Mallat,

Besov Space

The original definition of Bp,qα . For simplicity, assume 0<α< 1.

Define Lp modulus of continuity

ω(f , t)Lp6 sup|h|<t

‖f( ·+ h)− f( · )‖Lp

We are interested in a bound like ω(f , t)Lp. tα. Define the seminorm

|f |Bp,qα 6

suptω(f , t)Lp

tαq=∞

(

0

1 [t−αω(f , t)

]q dt

t0< q <∞

(The dt/t term captures boundedness in the second case)

For higher α, r <α< r+1, need to look at higher differences ∆hr f where ∆hf = f( ·+ h)− f( · ).

This captures fine behavior of ω(f , t).

Some cases: Bp,∞α =Lip(α,Lp). B2,2

α =Hα, B∞,∞α =Lip(α)

Properties:

• For α1>α2, Bp,q1α1 ⊂Bp,q2

α2 for all q1, q2

• For p1> p2, Bp1,q1α ⊂Bp2,q2

α for all q1, q2

(q is a secondary index)

Ex. H1/2+ε=B2,21/2+ε L∞.

Using Wavelet Bases: X1⊂X2 ⊂ with X2j 6 Vj. Then Aqα(Lp, Vj) =Bp,q

α , and for the nonlinear case,

the n-term wavelet series Σn=

j,k∈Λ dj,kψj,k, |Λ| ≤n

, Aqα(L2, Σn)=Bq,q

α where 1/q=1/2+α.

Comparing to the case where p = 2, the linear approximation space gives B2,qα and the nonlinear gives

Bq,qα .

55