
A6523 Modeling, Inference, and Mining
Jim Cordes, Cornell University
2/7/17

• Lecture 4
  – Fourier transform shift, convolution theorem examples
  – DFT of complex exponential
  – Stochastic processes

• Reading
  – See web page tomorrow

• Webpage: www.astro.cornell.edu/~cordes/A6523

Fourier Transforms

• Examples on board
  – Shift theorem
    • Finding maximum of function
    • Shifting discrete data


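Discrete Fourier Transform (DFT)

The DFT of a uniformly spaced array of data $\{x_n,\ n = 0, \ldots, N-1\}$ is defined as

$$X_k = N^{-1} \sum_{n=0}^{N-1} x_n \, e^{-2\pi i\, nk/N}$$

The inverse transform is

$$x_n = \sum_{k=0}^{N-1} X_k \, e^{+2\pi i\, nk/N}$$

which may be shown to have the correct normalization, etc. by substituting for $X_k$:

$$
\begin{aligned}
x_n &= C \sum_{k=0}^{N-1} X_k \, e^{+2\pi i\, nk/N} \\
    &= C \sum_{k=0}^{N-1} N^{-1} \sum_{n'=0}^{N-1} x_{n'} \, e^{-2\pi i\, n'k/N} \, e^{+2\pi i\, nk/N} \\
    &= C\, N^{-1} \sum_{n'=0}^{N-1} x_{n'} \sum_{k=0}^{N-1} e^{2\pi i\,(n-n')k/N} \\
    &= C\, N^{-1} \sum_{n'=0}^{N-1} x_{n'} \, N \delta_{nn'} \\
    &= C x_n = x_n \quad \text{for } C \equiv 1
\end{aligned}
$$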
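The normalization argument can be checked numerically. The sketch below is only an illustration, assuming NumPy; the function names `dft` and `idft` are hypothetical, not part of the course material. It uses the 1/N-forward, no-prefactor-inverse convention defined above and verifies both the round trip and the orthogonality sum.

```python
import numpy as np

def dft(x):
    """Forward DFT with the 1/N normalization used in these notes."""
    N = len(x)
    n = np.arange(N)
    # Basis functions exp(-2*pi*i*n*k/N), indexed [k, n]
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)
    return W @ x / N

def idft(X):
    """Inverse DFT with no prefactor (C = 1)."""
    N = len(X)
    k = np.arange(N)
    W = np.exp(+2j * np.pi * np.outer(k, k) / N)
    return W @ X

N = 16
rng = np.random.default_rng(0)
x = rng.normal(size=N) + 1j * rng.normal(size=N)

# Round trip: the inverse of the forward transform returns the data
roundtrip_error = np.max(np.abs(idft(dft(x)) - x))

# Orthogonality: sum_k exp(2*pi*i*(n - n')*k/N) = N * delta_{n n'}
k = np.arange(N)
E = np.exp(2j * np.pi * np.outer(k, k) / N)
gram = E @ E.conj()
orthogonality_error = np.max(np.abs(gram - N * np.eye(N)))

print(roundtrip_error, orthogonality_error)
```

Note that `np.fft.fft` uses the same sign convention but omits the 1/N factor, so `np.fft.fft(x) / N` should match `dft(x)`.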

The FFT is simply a fast algorithm for calculating the DFT. It exploits redundancies in the exponential when $N$ is factorable, especially $N = 2^M$, but any prime factors will do.

A direct calculation of the DFT requires ~$N^2$ operations. The FFT requires ~$N \log N$ operations.
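To make the redundancy concrete, here is a minimal sketch of the textbook radix-2 decimation-in-time recursion for power-of-two lengths, compared with the direct matrix product. It assumes NumPy; `dft_direct` and `fft_radix2` are illustrative names, and both are unnormalized (divide by N to match the convention in these notes). Production FFT libraries are considerably more elaborate.

```python
import numpy as np

def dft_direct(x):
    """Direct DFT: an N x N matrix product, ~N^2 multiply-adds."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N) @ x

def fft_radix2(x):
    """Recursive radix-2 FFT; the length must be a power of 2."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x.copy()
    if N % 2:
        raise ValueError("length must be a power of 2")
    even = fft_radix2(x[0::2])   # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # DFT of odd-indexed samples
    # The same half-length results are reused for bins k and k + N/2
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

rng = np.random.default_rng(0)
x = rng.normal(size=64)
max_diff = np.max(np.abs(fft_radix2(x) - dft_direct(x)))
print(max_diff)

# Rough operation counts for N = 1024
N = 1024
print(N**2, int(N * np.log2(N)))
```

Each level of the recursion halves the problem, which is where the log N factor comes from.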

• The normalization calculation expresses the orthogonality property of the basis functions (exponentials).

• In calculations of this kind, identifying the implied $\delta$-function is typical. Here it relied on summing over products of basis functions.

• In other contexts, the $\delta$-function will arise from statistical independence of random variables.

• What happens if the sampling of $x_n$ is not uniform? Orthogonality is broken. So what?

• Notation: we often designate that $x_n$ and $X_k$ are Fourier transform pairs by writing

$$x_n \Longleftrightarrow X_k$$

and we say that $n$ and $k$ are conjugate variables.

• $n$ can be time, a spatial coordinate, a wavelength, anything.

• Extension to $N_D$ dimensions is trivial:
  – E.g. a 2D DFT of an $N \times M$ size object can be calculated as a series of $M$ 1D-DFTs of length $N$ followed by $N$ 1D-DFTs of length $M$

• From a systems point of view, the DFT is a linear operation and does not lose information.

• An alternative approach is to fit a sinusoidal model to the data using an assumed frequency $k/N$. It can be shown that the DFT is the least-squares solution for the amplitudes of the sinusoids for all $k$. We will show this later on by using matrix algebra to solve for least-squares solutions.
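The remark about the 2D DFT in the list above can be sketched as follows. This assumes NumPy and an arbitrary random $N \times M$ test array; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 8, 6
x = rng.normal(size=(N, M))

# M one-dimensional DFTs of length N (one per column) ...
step1 = np.fft.fft(x, axis=0)
# ... followed by N one-dimensional DFTs of length M (one per row)
step2 = np.fft.fft(step1, axis=1)

# Apply the 1/(N*M) normalization used in these notes
X2d = step2 / (N * M)

# Compare with NumPy's built-in 2D transform
reference = np.fft.fft2(x) / (N * M)
max_diff = np.max(np.abs(X2d - reference))
print(max_diff)
```

The order of the two passes does not matter, because the 2D kernel factors into a product of two 1D kernels.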


Symmetry Properties of the DFT

$$X_k = N^{-1} \sum_{n=0}^{N-1} x_n \, e^{-2\pi i\, nk/N} \quad \text{and} \quad x_n = \sum_{k=0}^{N-1} X_k \, e^{+2\pi i\, nk/N}$$

• Periodic with period $N$ in both domains

• Time series:

  – Discrete functions with sample intervals $\Delta t$ and $\Delta f$

  – $T = N \Delta t$ = total time span

  – $\Delta f = 1/T$

  – Nyquist frequency = maximum frequency that is represented without distortion:

$$f_N = \frac{N \Delta f}{2} = \frac{1}{2 \Delta t}$$


• Hermitian (show by substituting into the DFT expression for $x_n$)

$$x_n^* \Longleftrightarrow X_{N-k}^*$$

If $x_n$ is real then

$$x_n \Longleftrightarrow X_{N-k}^*$$

$$X_{N-k}^* = X_k$$

• The symmetry properties tell us how to fill an array with data to achieve specific results

• What are the symmetry properties of a 2D DFT?

$$X_{kl} = \frac{1}{NM} \sum_{n} \sum_{m} x_{nm} \, e^{-2\pi i\,(nk/N + ml/M)}$$
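The Hermitian property can be verified numerically. The sketch below assumes NumPy and random test data; index $N$ is identified with index 0 through the period-$N$ property.

```python
import numpy as np

N = 12
rng = np.random.default_rng(2)
k = np.arange(N)
reflect = (N - k) % N            # index N - k, with X_N identified with X_0

# Complex data: the DFT of conj(x) is conj(X_{N-k})
z = rng.normal(size=N) + 1j * rng.normal(size=N)
Z = np.fft.fft(z) / N
Z_of_conj = np.fft.fft(np.conj(z)) / N
conj_pair_error = np.max(np.abs(Z_of_conj - np.conj(Z[reflect])))

# Real data: conj(X_{N-k}) = X_k
x = rng.normal(size=N)
X = np.fft.fft(x) / N
hermitian_error = np.max(np.abs(np.conj(X[reflect]) - X))

print(conj_pair_error, hermitian_error)
```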


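How do we fill an array to get a real signal in the other domain?

• $k = 0, 1, \ldots, N-1$
• Need to know $N/2 - 1 + 2 = N/2 + 1$ unique values

[Diagram: the array $X$ with indices 0, 1, 2, 3, …, N/2−1, N/2, N/2+1, …, N−2, N−1, N, showing $X_N$ identified with $X_0$ and the pairs $X_1 \leftrightarrow X_{N-1}$ and $X_2 \leftrightarrow X_{N-2}$; a span of the array is labeled "unique values".]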
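One way to carry out this filling is sketched below, assuming NumPy and an even $N$. Only the $N/2 + 1$ values at $k = 0, \ldots, N/2$ are chosen freely (here at random); the rest follow from $X_{N-k} = X_k^*$, and the $k = 0$ and $k = N/2$ entries must be real.

```python
import numpy as np

N = 16                       # even length assumed
rng = np.random.default_rng(3)

X = np.zeros(N, dtype=complex)
X[0] = rng.normal()          # k = 0 is its own partner, so it must be real
X[N // 2] = rng.normal()     # k = N/2 is also its own partner, so it must be real
X[1:N // 2] = rng.normal(size=N // 2 - 1) + 1j * rng.normal(size=N // 2 - 1)

# Fill the upper half from the lower half: X_{N-k} = conj(X_k)
X[N // 2 + 1:] = np.conj(X[1:N // 2][::-1])

# Inverse transform with no prefactor, as in these notes
# (np.fft.ifft divides by N, so multiply it back)
x = np.fft.ifft(X) * N
max_imag = np.max(np.abs(x.imag))
print(max_imag)
```

NumPy's `np.fft.irfft` performs the same construction internally from the half-spectrum, if that convenience is preferred.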

Gaussian example

DFT of a Complex Exponential + Noise

Consider a time series

$$x_n = A\, e^{i \omega_0 n \Delta t} + n_n, \quad n = 0, \ldots, N-1$$

where $n_n$ is complex white noise.

What are the properties of white noise? By definition, white noise has a flat spectrum. But this means flat in the mean. Mean over what? Over a statistical ensemble. What is an ensemble?

We will define these later. But it needs to be clear that we have one realization of data that is conceptually part of an ensemble of all possible realizations.
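A small simulation can illustrate "flat in the mean". The sketch below assumes NumPy and assumes the noise is complex Gaussian with independent real and imaginary parts; the notes only require it to be white. The spectrum of a single realization fluctuates strongly, while the average over many realizations is nearly flat.

```python
import numpy as np

rng = np.random.default_rng(4)
N, n_real, sigma = 64, 4000, 1.0

# Complex white noise with <|n|^2> = sigma^2 (variance sigma^2/2 in each part)
noise = (rng.normal(size=(n_real, N))
         + 1j * rng.normal(size=(n_real, N))) * sigma / np.sqrt(2)

# Squared magnitude of the DFT (1/N normalization), one row per realization
S = np.abs(np.fft.fft(noise, axis=1) / N) ** 2

single = S[0]                    # one realization: very ragged
ensemble_mean = S.mean(axis=0)   # average over realizations: nearly flat

print(single.std() / single.mean())
print(ensemble_mean.std() / ensemble_mean.mean())
```

With the 1/N normalization, the flat level is close to `sigma**2 / N`.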


[Figure: spectrum of a complex exponential plus noise, N = 1024, P = 4.0 samples, (S/N)_t = 0.500. Top panel: spectrum vs. frequency index (0 to 1000), with the positive-frequency and negative-frequency regions labeled. Bottom panel: spectrum vs. frequency (Hz) from −0.6 to 0.6, with an inset labeled "Zoom in" on a logarithmic scale.]

Mapping Frequency to Frequency Bin

Consider

$$x(t) = e^{i \omega_0 t} = e^{2\pi i f_0 t}$$

In an $N$-point DFT, in which bin does the signal fall (mostly)?

As before we have $T = N \Delta t$ and $\Delta f = 1/T$.

The frequency mapping is

$$
f_j = \begin{cases}
j\, \Delta f & \text{for } j = 0, \ldots, N/2 \\
-(N - j)\, \Delta f & \text{for } j = N/2 + 1, \ldots, N - 1
\end{cases}
$$

The Nyquist frequency is the maximum frequency in the spectrum

$$f_N = \frac{1}{2 \Delta t}$$

So for $f_0 \le f_N$ the corresponding frequency bin in the DFT is

$$k_0 = f_0 N \Delta t$$

Negative frequencies correspond to bins $k$ above the Nyquist bin $N/2$.

Frequencies $|f_0| > f_N$ are still represented in the DFT (remember ... it is lossless) but they appear at aliased frequencies.

The frequency bin $k_0$ varies with $N$ for fixed $f_0$ and $\Delta t$, and varies with $\Delta t$ for fixed $f_0$ and $N$.
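As a worked example of the mapping, the sketch below simulates a noisy complex exponential with N = 1024, a period of 4 samples and a time-domain S/N of 0.5, then locates the peak bin. It assumes NumPy, a sample interval of 1 s, and Gaussian noise; `bin_of` is a hypothetical helper.

```python
import numpy as np

N, dt = 1024, 1.0            # dt = 1 s is an assumption
P = 4.0                      # period in samples
f0 = 1.0 / (P * dt)          # 0.25 Hz
A, sigma_n = 0.5, 1.0        # time-domain S/N = A / sigma_n = 0.5

def bin_of(f, N, dt):
    """DFT bin for frequency f with |f| <= f_N; negative f wraps to N - |k|."""
    return int(round(f * N * dt)) % N

rng = np.random.default_rng(5)
n = np.arange(N)
noise = (rng.normal(size=N) + 1j * rng.normal(size=N)) * sigma_n / np.sqrt(2)
x = A * np.exp(2j * np.pi * f0 * n * dt) + noise

S = np.abs(np.fft.fft(x) / N) ** 2
peak_bin = int(np.argmax(S))

k0 = bin_of(f0, N, dt)           # positive frequency: k0 = f0 * N * dt
k0_neg = bin_of(-f0, N, dt)      # negative frequency lands above bin N/2

# NumPy's own bin-to-frequency map for comparison
freqs = np.fft.fftfreq(N, d=dt)
print(peak_bin, k0, k0_neg, freqs[k0], freqs[k0_neg])
```

One difference to be aware of: `np.fft.fftfreq` labels bin $N/2$ as $-f_N$, whereas the mapping above assigns it to $+f_N$. The two are aliases of each other.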


1st and 2nd-order Moments

We characterize the noise using the statistical moments. These are the first moment (the mean) and the second moment, the autocorrelation function (ACF).

Ensemble average moments are designated with angular brackets $\langle \cdots \rangle$.

The mean of the white noise is

$$\langle n_n \rangle = 0 \quad \text{(zero mean)}$$

and the ACF is written in terms of a Kronecker delta,

$$\langle n_n n_m^* \rangle = \sigma_n^2\, \delta_{nm} \quad \text{(white noise)}.$$

The $\delta$-form of the ACF is consistent with any pair of values $n_n$ and $n_m$ being statistically independent.

Note that we take a conjugate (*) because the noise is complex. One gets a different answer without the conjugate!

By definition (as we will see formally later on), the power spectrum is the Fourier transform of the ensemble-average ACF where the ACF is assumed to depend only on the difference between the two times $n$ and $m$. This is a property of stochastic processes that have stationary statistics.

For white noise we could write the ACF as

$$R(n, m) \longrightarrow R(n - m) = \sigma_n^2\, \delta_{(n-m)0}.$$

The DFT of a delta function is a constant so we have shown that our definitions are consistent.
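The role of the conjugate can be seen by estimating both averages from a simulated ensemble. This is a sketch assuming NumPy, and assuming the real and imaginary parts of the noise are independent Gaussians of equal variance; that is a stronger assumption than whiteness alone.

```python
import numpy as np

rng = np.random.default_rng(6)
N, n_real, sigma_n = 8, 20000, 2.0

# Rows are realizations, columns are the time index n
noise = (rng.normal(size=(n_real, N))
         + 1j * rng.normal(size=(n_real, N))) * sigma_n / np.sqrt(2)

# Estimate <n_n n_m^*> by averaging over realizations
acf = noise.T @ noise.conj() / n_real      # approaches sigma_n^2 * identity

# The same average without the conjugate, <n_n n_m>
no_conj = noise.T @ noise / n_real         # approaches zero, even for n = m

print(np.round(np.abs(acf[:3, :3]), 2))
print(np.round(np.abs(no_conj[:3, :3]), 2))
```

Under these assumptions the unconjugated average vanishes even on the diagonal, because the squared real and imaginary parts cancel on average.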

Ensemble averages technically require knowledge of an N-dimensional PDF. We consider cases that are much simpler.

Characterizing the noise

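We can calculate the DFT of the complex exponential because it is simply a geometric series. Usually it is not so simple!

The DFT of $x_n$ is

$$
\begin{aligned}
X_k &= N^{-1} \sum_{n=0}^{N-1} x_n \, e^{-2\pi i\, nk/N} \\
    &= A\, N^{-1} \sum_{n=0}^{N-1} e^{i(\omega_0 \Delta t - 2\pi k/N)n} + N^{-1} \sum_{n=0}^{N-1} n_n \, e^{-2\pi i\, nk/N} \\
    &= A\, N^{-1} e^{i\phi_n}\, \frac{\sin\left[\frac{N}{2}(\omega_0 \Delta t - 2\pi k/N)\right]}{\sin\left[\frac{1}{2}(\omega_0 \Delta t - 2\pi k/N)\right]} + N_k
\end{aligned}
$$

where $\phi_n$ is an uninteresting phase factor and $N_k$ is the DFT of the white noise.

The amplitude of the spectral line term is $A$ (the limit where the arguments of the sin functions $\to 0$).

The noise term $N_k$ is a zero mean random process with second moment

$$
\begin{aligned}
\langle N_k N_{k'}^* \rangle &= N^{-2} \sum_{n} \sum_{n'} \langle n_n n_{n'}^* \rangle \, e^{-2\pi i\,(nk - n'k')/N} \\
    &= N^{-2} \sum_{n} \sum_{n'} \sigma_n^2\, \delta_{nn'} \, e^{-2\pi i\,(nk - n'k')/N} \\
    &= (\sigma_n^2/N^2) \sum_{n} e^{-2\pi i\, n(k - k')/N} \\
    &= (\sigma_n^2/N)\, \delta_{kk'}.
\end{aligned}
$$

The second moment of the noise has the same form in both the time and frequency domains.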
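The closed form for the signal term can be checked against a numerical DFT. The sketch assumes NumPy, a noise-free signal, and an arbitrary frequency that does not fall on a bin. The phase is written out as $(N-1)/2$ times the argument of the geometric series, which follows from summing the series; the notes leave it unspecified.

```python
import numpy as np

N, dt, A = 32, 1.0, 1.5
omega0 = 2 * np.pi * 0.1234          # arbitrary; not commensurate with any bin
n = np.arange(N)
k = np.arange(N)

# Noise-free complex exponential and its DFT (1/N normalization)
x = A * np.exp(1j * omega0 * n * dt)
X_numeric = np.fft.fft(x) / N

# Closed form from the geometric series
theta = omega0 * dt - 2 * np.pi * k / N
phase = np.exp(1j * (N - 1) * theta / 2)
X_closed = (A / N) * phase * np.sin(N * theta / 2) / np.sin(theta / 2)

max_diff = np.max(np.abs(X_numeric - X_closed))
peak = np.max(np.abs(X_numeric))     # approaches A only when theta -> 0
print(max_diff, peak)
```

Because the chosen frequency sits between bins, the largest magnitude is somewhat below $A$; it equals $A$ only for a frequency exactly on a bin.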


We can't calculate the noise DFT. But we can calculate its second moment.

Often but not always, the Fourier transform of a stochastic process will be delta correlated.

What is the amplitude of the signal part as the argument of the sin()'s → 0?

What does the signal part look like?

Detection . . . or not?

Suppose you have a data set that you think may have the form of the model given above. To answer the question "is there a signal in the data" we have to assess what are the fluctuations in the DFT (or, more usefully, the squared magnitude of the DFT = an estimate for the power spectrum) due to the additive noise. We would like to have confidence that a feature in the DFT or the spectrum is "real" as opposed to being a noise fluctuation that is spurious. To quantify our confidence, we need to know the properties of our test statistic. The following develops an approach that is applicable to the particular problem and illustrates generally how we go about assessing test statistics.


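Signal to noise ratio:

The rms amplitude of the noise term (in the frequency domain) is therefore $\sigma_N = \sigma_n/\sqrt{N}$ and the signal-to-noise ratio is

$$(S/N)_{\rm DFT} = \frac{\text{line peak}}{\text{rms noise}} = \sqrt{N}\, \frac{A}{\sigma_n}.$$

Thus, the S/N of the line is $\sqrt{N}$ larger than the S/N of the time series

$$(S/N)_{\text{time series}} = \frac{\text{amplitude of exponential}}{\text{rms noise}} = \frac{A}{\sigma_n}.$$

In practice, we must investigate the S/N of the squared magnitude of the DFT. Let $\omega_0 \Delta t = 2\pi f_0 \Delta t = 2\pi k_0/N$ so that the frequency is commensurate with the sampling in frequency space. Then $X_k = A\, \delta_{kk_0} + N_k$ and the spectral estimate becomes

$$
\begin{aligned}
S_k \equiv |X_k|^2 &= |A\, \delta_{kk_0} + N_k|^2 \qquad (1) \\
    &= A^2\, \delta_{kk_0} + A\, \delta_{kk_0} (N_k + N_k^*) + |N_k|^2.
\end{aligned}
$$

The ensemble average of the estimator is

$$
\begin{aligned}
\langle S_k \rangle = \langle |X_k|^2 \rangle &= A^2\, \delta_{kk_0} + \langle |N_k|^2 \rangle \qquad (2) \\
    &= A^2\, \delta_{kk_0} + \sigma_n^2/N
\end{aligned}
$$

The ratio of the peak (signal term) to the off-line mean is $N A^2/\sigma_n^2$, consistent with (the square of) $(S/N)_{\rm DFT}$ calculated before.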
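Equation (2) can be checked with a small Monte Carlo experiment. The sketch assumes NumPy, Gaussian noise, and illustrative parameter values (`N`, `k0`, `A`, `sigma_n`, `n_real`) that are not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(7)
N, k0, A, sigma_n, n_real = 256, 40, 0.5, 1.0, 2000

n = np.arange(N)
signal = A * np.exp(2j * np.pi * k0 * n / N)     # frequency exactly on bin k0
noise = (rng.normal(size=(n_real, N))
         + 1j * rng.normal(size=(n_real, N))) * sigma_n / np.sqrt(2)

# Spectral estimate for each realization, then the ensemble average
S = np.abs(np.fft.fft(signal + noise, axis=1) / N) ** 2
S_mean = S.mean(axis=0)

peak = S_mean[k0]                          # expect about A^2 + sigma_n^2 / N
off_line = np.delete(S_mean, k0).mean()    # expect about sigma_n^2 / N

ratio = (peak - off_line) / off_line       # expect about N * A^2 / sigma_n^2
print(peak, off_line, ratio, N * A**2 / sigma_n**2)
```

In the time domain the signal is buried (A / sigma_n = 0.5), yet the line stands out clearly in the averaged spectrum.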

The Probability of False Alarm:

Suppose we want to test whether a feature in a spectrum is signal or noise. Let's suppose that there is no signal (a 'null' hypothesis) in which case we can calculate the probability that a given amplitude is just a noise fluctuation.

If there is only noise, the probability density function of $S_k$ for any given $k$ is a one-sided exponential because $S_k$ is $\chi^2_2$:

$$f_{S_k}(S) = \frac{1}{\langle S_k \rangle}\, e^{-S/\langle S_k \rangle}\, U(S)$$
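The exponential form can be tested empirically. In the sketch below (assuming NumPy and Gaussian complex noise), each spectral value is the sum of the squares of a real and an imaginary part, both zero-mean Gaussians with equal variance, which is the usual route to a chi-squared distribution with two degrees of freedom. The fraction of values exceeding a multiple `eta` of the mean is compared with `exp(-eta)`.

```python
import numpy as np

rng = np.random.default_rng(8)
N, n_real, sigma_n = 128, 2000, 1.0

noise = (rng.normal(size=(n_real, N))
         + 1j * rng.normal(size=(n_real, N))) * sigma_n / np.sqrt(2)

# Noise-only spectral values, pooled over bins and realizations
S = (np.abs(np.fft.fft(noise, axis=1) / N) ** 2).ravel()
S_mean = S.mean()

# Fraction exceeding eta times the mean vs. the exponential prediction
eta = np.array([1.0, 2.0, 4.0])
empirical = np.array([(S >= e * S_mean).mean() for e in eta])
predicted = np.exp(-eta)

# For an exponential distribution the standard deviation equals the mean
print(empirical, predicted, S.std() / S_mean)
```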

Why is the spectrum distributed as $\chi^2$ with two degrees of freedom?

Suppose there is a spike in the spectrum of amplitude $\eta \langle S_k \rangle$.

The noise-like aspect of $S_k$ implies that there can be spikes above a specified detection threshold that are spurious ("false alarms"). The probability that a spike has an amplitude $\ge \eta \langle S_k \rangle$ is

$$P(S \ge \eta \langle S_k \rangle) = \int_{\eta \langle S_k \rangle}^{\infty} ds\, f_{S_k}(s) = e^{-\eta}$$

If the DFT length is $N_{\rm DFT}$, there are $N_{\rm DFT}$ unique values of the spectrum.

Note this is true for a complex process but not for a real one. Why?

The expected number of spurious (i.e. false-alarm) spikes that equal or exceed $\eta \langle S_k \rangle$ is

$$N_{\rm spurious} = N_{\rm DFT}\, e^{-\eta}$$

To have $N_{\rm spurious} \le 1$ we must have

$$N_{\rm DFT}\, e^{-\eta} \le 1$$

so we need

$$\eta \ge \ln N_{\rm DFT}$$

N_DFT    η to have N_spurious ≤ 1
128      4.9
1k       6.9
16k      9.7
1M       13.9
1G       20.8
1T       27.7
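The tabulated thresholds are the natural logarithm of the DFT length. The short sketch below recomputes them, assuming the suffixes k, M, G and T denote powers of two (2^10, 2^20, 2^30, 2^40); that reading is an assumption, though it matches the rounded values listed.

```python
import math

# DFT lengths, assuming binary prefixes (k = 2**10, M = 2**20, ...)
sizes = {
    "128": 128,
    "1k": 2**10,
    "16k": 2**14,
    "1M": 2**20,
    "1G": 2**30,
    "1T": 2**40,
}

# Threshold (in units of the mean spectrum) giving one expected false alarm
thresholds = {label: math.log(n) for label, n in sizes.items()}

for label, eta in thresholds.items():
    print(f"{label:>4}  {eta:5.1f}")
```

The threshold grows only logarithmically, so a million-fold increase in the number of trials raises it by about 14.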

The larger the number of trials, the higher the threshold that is needed to have a specified number of false positives. There are never zero false positives!