7/23/2019 Lecture 11
Detection Theory
Detection theory
As the last topic of the course, we will briefly consider detection theory.
The methods are based on estimation theory and attempt to answer questions such as:
Is a signal of a specific model present in our time series? E.g., detection of a noisy sinusoid; beep or no beep?
Is the transmitted pulse present in the radar signal at time t?
Does the mean level of a signal change at time t?
After calculating the mean change in pixel values of subsequent frames in a video, is there something moving in the scene?
Detection theory
The area is closely related to hypothesis testing, which is widely used, e.g., in medicine: is the response in patients due to the new drug or due to random fluctuations?
In our case, the hypotheses could be

H1 : x[n] = A cos(2πf0n + φ) + w[n]
H0 : x[n] = w[n]

This example corresponds to the detection of a noisy sinusoid.
The hypothesis H1 corresponds to the case where the sinusoid is present and is called the alternative hypothesis.
The hypothesis H0 corresponds to the case where the measurements consist of noise only and is called the null hypothesis.
Introductory Example
The Neyman-Pearson approach is the classical way of solving detection problems in an optimal manner.
It relies on the so-called Neyman-Pearson theorem.
Before stating the theorem, consider a simplistic detection problem, where we observe one sample x[0] from one of two densities: N(0, 1) or N(1, 1).
The task is to choose the correct density in an optimal manner.
Introductory Example
[Figure: the likelihoods p(x; H0) and p(x; H1) plotted as functions of x[0].]
Our hypotheses are now

H1 : A = 1,
H0 : A = 0.

An obvious approach for deciding between the densities would be to choose the one which is higher for the particular observed x[0].
Introductory Example
More specifically, study the likelihoods and choose the more likely one.
The likelihoods are

H1 : p(x[0] | A = 1) = (1/√(2π)) exp(−(x[0] − 1)²/2),
H0 : p(x[0] | A = 0) = (1/√(2π)) exp(−x[0]²/2).

Now, one should select H1 if

p(x[0] | A = 1) > p(x[0] | A = 0).
Introductory Example
Let's state this in terms of x[0]:

p(x[0] | A = 1) > p(x[0] | A = 0)
⟺ p(x[0] | A = 1) / p(x[0] | A = 0) > 1
⟺ [(1/√(2π)) exp(−(x[0] − 1)²/2)] / [(1/√(2π)) exp(−x[0]²/2)] > 1
⟺ exp(−((x[0] − 1)² − x[0]²)/2) > 1
Introductory Example
Taking the logarithm of both sides,

x[0]² − (x[0] − 1)² > 0 ⟺ 2x[0] − 1 > 0 ⟺ x[0] > 1/2.

In other words, choose H1 if x[0] > 0.5 and H0 otherwise.
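The whole comparison can be sketched in a few lines; the slides' examples use Matlab, but here is an equivalent sketch in Python with SciPy (the library choice is mine, not the slides'):

```python
from scipy.stats import norm

def decide(x0):
    # Compare the likelihoods of the observed sample under the two hypotheses
    p_h1 = norm.pdf(x0, loc=1, scale=1)  # H1: x[0] ~ N(1, 1)
    p_h0 = norm.pdf(x0, loc=0, scale=1)  # H0: x[0] ~ N(0, 1)
    return "H1" if p_h1 > p_h0 else "H0"

# Equivalent to thresholding x[0] at 0.5:
print(decide(0.7))  # H1
print(decide(0.2))  # H0
```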
Introductory Example
Note that it is also possible to study the posterior probability ratio p(H1 | x)/p(H0 | x) instead of the above likelihood ratio p(x | H1)/p(x | H0).
However, using Bayes' rule, this MAP test turns out to be

p(x | H1) / p(x | H0) > p(H0) / p(H1),

i.e., the only effect of using posterior probabilities is on the threshold of the LRT.
Error Types
It might be that the detection problem is not symmetric, and some errors are more costly than others.
For example, when detecting a disease, a missed detection is more costly than a false alarm.
The tradeoff between misses and false alarms can be adjusted using the threshold of the LRT.
Error Types
The figure below illustrates the probabilities of the two kinds of errors. The red area on the left corresponds to the probability of choosing H1 while H0 holds (false alarm). The blue area is the probability of choosing H0 while H1 holds (missed detection).

[Figure: the two likelihoods over x[0] with the error probabilities shaded.]
Error Types
It can be seen that either error probability can be made arbitrarily small by adjusting the detection threshold.

[Figure: the two likelihoods with two different threshold choices.]

Left: large threshold; small probability of false alarm (red), but many misses (blue).
Right: small threshold; only a few missed detections (blue), but a huge number of false alarms (red).
Error Types
The probability of false alarm for the threshold γ = 1.5 is

PFA = P(x[0] > γ | A = 0) = ∫_{1.5}^{∞} (1/√(2π)) exp(−x²/2) dx ≈ 0.0668.

The probability of missed detection is

PM = P(x[0] ≤ γ | A = 1) = ∫_{−∞}^{1.5} (1/√(2π)) exp(−(x − 1)²/2) dx ≈ 0.6915.

An equivalent but more useful term is the complement of PM, the probability of detection:

PD = 1 − PM = ∫_{1.5}^{∞} (1/√(2π)) exp(−(x − 1)²/2) dx ≈ 0.3085.
Neyman-Pearson Theorem
Since PFA and PD depend on each other, we would like to maximize PD subject to a given maximum allowed PFA. Luckily, the following theorem makes this easy.

Neyman-Pearson theorem: For a fixed PFA, the likelihood ratio test maximizes PD with the decision rule

L(x) = p(x; H1) / p(x; H0) > γ,

where the threshold γ is the value for which

∫_{ {x : L(x) > γ} } p(x; H0) dx = PFA.
Neyman-Pearson Theorem
As an example, suppose we want to find the best detector for our introductory example, and we can tolerate 10% false alarms (PFA = 0.1).
According to the theorem, the detection rule is: select H1 if

p(x | A = 1) / p(x | A = 0) > γ.

The only thing to find out now is the threshold γ such that

∫_{γ}^{∞} p(x | A = 0) dx = 0.1.

This can be done with the Matlab function icdf, which evaluates the inverse cumulative distribution function.
Neyman-Pearson Theorem
Unfortunately, icdf solves for the γ for which

∫_{−∞}^{γ} p(x | A = 0) dx = 0.1

instead of

∫_{γ}^{∞} p(x | A = 0) dx = 0.1.

Thus, we have to use the function like this: icdf('norm', 1 - 0.1, 0, 1), which gives γ ≈ 1.2816.
Similarly, we can also calculate the PD with this threshold:

PD = ∫_{1.2816}^{∞} p(x | A = 1) dx ≈ 0.3891.
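The same computation in Python (SciPy's norm.ppf plays the role of Matlab's icdf and likewise returns the lower-tail quantile):

```python
from scipy.stats import norm

P_FA = 0.1
# Solve P(x > gamma | A = 0) = 0.1 via the lower-tail inverse CDF
gamma = norm.ppf(1 - P_FA, loc=0, scale=1)
# Probability of detection: P(x > gamma | A = 1)
P_D = norm.sf(gamma, loc=1, scale=1)

print(round(gamma, 4))  # 1.2816
print(round(P_D, 4))    # 0.3891
```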
Detector for a known waveform
The NP approach applies to all cases where the likelihoods are available.
An important special case is a known waveform s[n] embedded in a WGN sequence w[n]:

H1 : x[n] = s[n] + w[n]
H0 : x[n] = w[n].

An example of a case where the waveform is known is the detection of radar signals, where a pulse s[n] transmitted by us is reflected back after some propagation time.
Detector for a known waveform
For this case the likelihoods are

p(x | H1) = ∏_{n=0}^{N−1} (1/√(2πσ²)) exp(−(x[n] − s[n])²/(2σ²)),
p(x | H0) = ∏_{n=0}^{N−1} (1/√(2πσ²)) exp(−x[n]²/(2σ²)).

The likelihood ratio test is easily obtained as

p(x | H1) / p(x | H0) = exp( −(1/(2σ²)) [ Σ_{n=0}^{N−1} (x[n] − s[n])² − Σ_{n=0}^{N−1} x[n]² ] ) > γ.
Detector for a known waveform
This simplifies by taking the logarithm of both sides:

−(1/(2σ²)) [ Σ_{n=0}^{N−1} (x[n] − s[n])² − Σ_{n=0}^{N−1} x[n]² ] > ln γ.

This further simplifies into

(1/σ²) Σ_{n=0}^{N−1} x[n] s[n] − (1/(2σ²)) Σ_{n=0}^{N−1} s[n]² > ln γ.
Detector for a known waveform
Since s[n] is a known waveform (i.e., a constant), we can simplify the procedure by moving it to the right-hand side and combining it with the threshold:

Σ_{n=0}^{N−1} x[n] s[n] > σ² ( ln γ + (1/(2σ²)) Σ_{n=0}^{N−1} s[n]² ).

We can equivalently call the right-hand side our threshold (say γ′) to get the final decision rule

Σ_{n=0}^{N−1} x[n] s[n] > γ′.
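The final rule is just a correlation against the known waveform; a minimal sketch in Python (the waveform, noise level, and threshold below are illustrative choices, not taken from the slides):

```python
import numpy as np

N = 200
n = np.arange(N)
s = np.sin(2 * np.pi * 0.05 * n)        # known waveform (example choice)

def detect(x, s, threshold):
    # Decision rule: sum_n x[n]*s[n] > threshold  ->  decide H1
    return np.dot(x, s) > threshold

threshold = 30.0                          # in practice set from the desired P_FA
rng = np.random.default_rng(0)
x = s + rng.normal(scale=1.0, size=N)     # H1: signal buried in WGN
print(detect(x, s, threshold))
print(detect(np.zeros(N), s, threshold))  # no signal at all -> False
```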
Examples
This leads to some rather obvious results.
The detector for a known DC level A in WGN is

Σ_{n=0}^{N−1} x[n] A > γ′ ⟺ Σ_{n=0}^{N−1} x[n] > γ′/A.

Equally well we can set a new threshold γ″ = γ′/(AN). This way the detection rule becomes: x̄ > γ″. Note that a negative A would invert the inequality.
Examples
The detector for a sinusoid in WGN is

Σ_{n=0}^{N−1} x[n] A cos(2πf0n + φ) > γ′ ⟺ Σ_{n=0}^{N−1} x[n] cos(2πf0n + φ) > γ′/A.

Again, we can divide by A and use a new threshold γ″:

Σ_{n=0}^{N−1} x[n] cos(2πf0n + φ) > γ″.

In other words, we check the correlation with the sinusoid. Note that the amplitude A does not affect the test statistic, only the threshold, which is in any case selected according to the fixed PFA rate.
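The amplitude invariance is easy to see numerically; in this Python sketch (f0, φ, and A are illustrative values), scaling the input only scales the statistic, so only the threshold needs to absorb A:

```python
import numpy as np

f0, phi, N = 0.1, 0.6, 200
n = np.arange(N)
ref = np.cos(2 * np.pi * f0 * n + phi)   # reference sinusoid

def statistic(x):
    # sum_n x[n] * cos(2*pi*f0*n + phi)
    return np.sum(x * ref)

# An A = 3 sinusoid gives exactly 3x the statistic of an A = 1 sinusoid
print(round(statistic(3.0 * ref), 6))   # 300.0 (= 3 * sum(ref**2) = 3 * N/2)
print(round(statistic(1.0 * ref), 6))   # 100.0
```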
Examples
As an example, the picture below shows the detection process with = 0.5.

[Figure: three panels; the noiseless signal, the noisy signal, and the detection result (correlator output).]
Detection of random signals
The problem with the previous approach was that the model was too restrictive: the results depend on how well the phases match.
The model can be relaxed by considering random signals, whose exact form is unknown but whose correlation structure is known. Since the correlation captures the frequency (but not the phase), this is exactly what we want.
In general, the detection of a random signal can be formulated as follows.
Detection of random signals
Suppose s ∼ N(0, Cs) and w ∼ N(0, σ²I). Then the detection problem is the hypothesis test

H0 : x ∼ N(0, σ²I)
H1 : x ∼ N(0, Cs + σ²I).

It can be shown (see Kay-2, p. 145) that the decision rule becomes

Decide H1 if x^T ŝ > γ,

where

ŝ = Cs (Cs + σ²I)^{−1} x.
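This estimator-correlator can be sketched with plain linear algebra; in the Python sketch below, the covariance Cs is a made-up smooth-process example and σ² = 1 is arbitrary:

```python
import numpy as np

N = 50
sigma2 = 1.0
n = np.arange(N)
# Made-up covariance of a smooth, zero-mean random signal (example choice)
Cs = np.exp(-0.1 * np.abs(n[:, None] - n[None, :]))

def estimator_correlator(x, Cs, sigma2):
    # s_hat = Cs (Cs + sigma^2 I)^{-1} x, the linear MMSE estimate of s
    s_hat = Cs @ np.linalg.solve(Cs + sigma2 * np.eye(len(x)), x)
    # Test statistic x^T s_hat, compared against a threshold
    return x @ s_hat

rng = np.random.default_rng(1)
s = np.linalg.cholesky(Cs) @ rng.normal(size=N)                # draw s ~ N(0, Cs)
T1 = estimator_correlator(s + rng.normal(size=N), Cs, sigma2)  # H1: signal + noise
T0 = estimator_correlator(rng.normal(size=N), Cs, sigma2)      # H0: noise only
print(T1, T0)  # the statistic tends to be larger under H1
```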
Detection of random signals
The term ŝ is in fact an estimate of the signal; more specifically, the linear Bayesian MMSE estimator, which assumes a linear form for the estimator (similar to the BLUE).
A particular special case of a random signal is the Bayesian linear model. The Bayesian linear model assumes linearity, x = Hθ + w, together with a prior for the parameters, such as θ ∼ N(0, Cθ).
Consider the following detection problem:

H0 : x = w
H1 : x = Hθ + w.
Detection of random signals
Within the earlier random-signal framework, this is written as

H0 : x ∼ N(0, σ²I)
H1 : x ∼ N(0, Cs + σ²I)

with Cs = H Cθ H^T.
The assumption s ∼ N(0, H Cθ H^T) states that the exact form of the signal is unknown, and we only know its covariance structure.
Detection of random signals
This is helpful in the sinusoidal detection problem: we are not interested in the phase (included in the exact formulation x[n] = A cos(2πf0n + φ)), but only in the frequency (as described by the covariance matrix H Cθ H^T).
Thus, the decision rule becomes:

Decide H1 if x^T ŝ > γ,

where

ŝ = Cs (Cs + σ²I)^{−1} x = H Cθ H^T (H Cθ H^T + σ²I)^{−1} x.
Detection of random signals
Luckily, the decision rule simplifies quite a lot by noticing that part of it is the MMSE estimate of θ:

x^T ŝ = x^T H Cθ H^T (H Cθ H^T + σ²I)^{−1} x = x^T H θ̂,

where θ̂ = Cθ H^T (H Cθ H^T + σ²I)^{−1} x.

An example of applying the linear model is in Kay: Statistical Signal Processing, vol. 2: Detection Theory, pages 155-158.
In the example, a Rayleigh fading sinusoid is studied, which has an unknown amplitude A and phase φ. Only the frequency f0 is assumed to be known.
Detection of random signals
This can be manipulated into a linear model form with two unknowns corresponding to A and φ.
The final result is the decision rule:

| Σ_{n=0}^{N−1} x[n] exp(−2πi f0 n) | > γ.

As an example, the picture below shows the detection process with = 0.5.
Note the simplicity of the Matlab implementation:

h = exp(-2*pi*sqrt(-1)*f0*n);
y = abs(conv(h,x));
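The same two lines translate directly to NumPy, with a check that the statistic really is phase-invariant (f0 = 0.1 and N = 200 are example values; at these values the cross term sums exactly to zero, so the statistic equals N/2 for a unit-amplitude sinusoid):

```python
import numpy as np

f0, N = 0.1, 200
n = np.arange(N)

def statistic(x):
    # |sum_n x[n] * exp(-2*pi*i*f0*n)|: magnitude of the complex correlation
    return np.abs(np.sum(x * np.exp(-2j * np.pi * f0 * n)))

# Two sinusoids with the same frequency but different (unknown) phases
x1 = np.cos(2 * np.pi * f0 * n + 0.0)
x2 = np.cos(2 * np.pi * f0 * n + 1.234)
print(round(statistic(x1), 6))  # 100.0 (= N/2)
print(round(statistic(x2), 6))  # 100.0: the phase does not matter
```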
Detection of random signals
[Figure: three panels; the noiseless signal, the noisy signal, and the detection result.]
Receiver Operating Characteristics
A usual way of illustrating detector performance is the Receiver Operating Characteristics curve (ROC curve).
It describes the relationship between PFA and PD for all possible values of the threshold.
The functional relationship between PFA and PD depends on the problem (and on the selected detector, although we have proven that the LRT is optimal).
Receiver Operating Characteristics
For example, in the DC level example,

PD(γ) = ∫_{γ}^{∞} (1/√(2π)) exp(−(x − 1)²/2) dx,
PFA(γ) = ∫_{γ}^{∞} (1/√(2π)) exp(−x²/2) dx.

It is easy to see the relationship:

PD(γ) = ∫_{γ−1}^{∞} (1/√(2π)) exp(−x²/2) dx = PFA(γ − 1).
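With this relation, the ROC curve of the DC example is one line of code, and its AUC can be approximated numerically (Python sketch; for two unit-variance Gaussians with means 0 and 1, the exact AUC is Φ(1/√2) ≈ 0.76):

```python
import numpy as np
from scipy.stats import norm

gammas = np.linspace(-5, 6, 1001)
P_FA = norm.sf(gammas)        # P(x > gamma | A = 0)
P_D = norm.sf(gammas - 1)     # P(x > gamma | A = 1) = P_FA(gamma - 1)

# Area under the ROC curve by the trapezoidal rule (P_FA decreases with gamma)
auc = np.sum(0.5 * (P_D[:-1] + P_D[1:]) * -np.diff(P_FA))
print(round(auc, 3))          # 0.76
```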
Receiver Operating Characteristics
Plotting the ROC curve for all γ results in the following curve.

[Figure: ROC curve; probability of detection PD versus probability of false alarm PFA, running from large thresholds (lower left) to small thresholds (upper right).]
Receiver Operating Characteristics
The higher the ROC curve, the better the performance.
A random guess has a diagonal ROC curve.
In the DC level case, the performance improves as the noise variance σ² decreases. Below are the ROC plots for various values of σ².
Receiver Operating Characteristics
[Figure: ROC curves for variances 0.2, 0.4, 0.6, 0.8, and 1, together with the diagonal random-guess line; smaller variance gives a higher curve.]

This gives rise to a widely used measure of detector performance: the Area Under (ROC) Curve, or AUC, criterion.
Composite hypothesis testing
In the previous examples the parameter values specified the distribution completely; e.g., either A = 1 or A = 0.
Such cases are called simple hypothesis tests.
Often we can't specify the parameters exactly for either case, but only a range of values for each.
An example could be our DC model x[n] = A + w[n] with

H1 : A ≠ 0
H0 : A = 0.
Composite hypothesis testing
The question can be posed in a probabilistic manner as follows:
What is the probability of observing x[n] if H0 held?
If the probability is small (e.g., all x[n] ∈ [0.5, 1.5], and let's say the probability of observing such x[n] under H0 is 1%), then we can conclude that the null hypothesis can be rejected with 99% confidence.
An example
As an example, consider detecting a biased coin in a coin-tossing experiment.
If we get 19 heads out of 20 tosses, it seems rather likely that the coin is biased. How do we pose the question mathematically?
Now the hypotheses are

H1 : coin is biased: p ≠ 0.5
H0 : coin is unbiased: p = 0.5,

where p denotes the probability of heads for our coin.
An example
Additionally, let's say we want 99% confidence for the test.
Thus, we can state the hypothesis test as: "what is the probability of observing at least 19 heads assuming p = 0.5?"
This is given by the binomial distribution:

C(20, 19) · 0.5¹⁹ · 0.5¹ (19 heads) + C(20, 20) · 0.5²⁰ (20 heads) ≈ 0.00002.

Since 0.00002 < 1%, we can reject the null hypothesis and conclude that the coin is biased.
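The tail probability can be checked with scipy.stats.binom (the slides use Matlab, but the computation is the same):

```python
from scipy.stats import binom

# P(at least 19 heads in 20 tosses of a fair coin) = P(X >= 19) = P(X > 18)
p_value = binom.sf(18, 20, 0.5)
print(p_value)           # about 2e-05, i.e. the 0.00002 on the slide
print(p_value < 0.01)    # True -> reject H0 at 99% confidence
```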
An example
Actually, the 99% confidence requirement was a bit loose in this case.
We could have set a 99.998% confidence requirement and still rejected the null hypothesis.
The complement of this upper confidence limit (here 0.00002) is widely used and called the p-value.
More specifically:
The p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.