

336 IRE TRANSACTIONS ON COMMUNICATIONS SYSTEMS December

On the Optimum Detection of Digital Signals in the Presence of White Gaussian Noise: A Geometric Interpretation and a Study of Three Basic Data Transmission Systems*

E. ARTHURS† AND H. DYM‡, MEMBER, IRE

* Received June 15, 1962. This work has been supported by the Mitre Corporation, Bedford, Mass., under Contract No. AF 33(600) 39852.

† Bell Telephone Laboratories, Murray Hill, N. J. On leave from Dept. of Elec. Engrg., M. I. T. Research Laboratory of Electronics, Cambridge, Mass. Formerly Consultant to the Mitre Corporation, Bedford, Mass.

‡ M. I. T. Dept. of Mathematics and Research Laboratory of Electronics, Cambridge, Mass. On leave from the Mitre Corporation, Bedford, Mass.

Summary-This paper considers the problem of optimally detecting digital waveforms in the presence of additive white Gaussian noise. A technique for representing the transmitted signals and the additive noise which leads to a geometric interpretation of the detection problem is presented on a tutorial level. Subsequently, this technique is used to derive the optimum detector for each of three basic data transmission systems: m-level Phase Shift Keyed, m-level Amplitude Shift Keyed and m-level Frequency Shift Keyed. Corresponding probability of error curves are derived, compared and discussed in reasonable detail.

I. INTRODUCTION

THIS PAPER WAS written to fulfill two objectives. The first objective is tutorial; the paper is intended to serve as an introduction to some of the ideas of modern statistical communication theory; in particular, the problem of detecting a known signal in a white noise background with minimum probability of error is introduced. The second objective is to analyze and compare in detail the performance of three basic data transmission systems, namely, m-level Phase Shift Keyed, m-level Amplitude Shift Keyed and m-level (orthogonal) Frequency Shift Keyed.

The approach adopted within this paper stresses the geometric viewpoint. Specifically, advantage is taken of the fact that in the white noise case it is possible to choose a convenient (orthonormal) representation for the transmitted signals and yet still be guaranteed that the noise can be decomposed suitably (i.e., as described in Theorem II). This freedom of choice in signal representation leads in a natural way to a geometric interpretation of the detection problem which is both analytically correct and intuitively plausible. Furthermore, it is felt that the presented approach which, strictly speaking, is only applicable to the white noise case can serve as a useful introduction to the more general treatment wherein the transmitted signals are represented in terms of a Karhunen-Loeve expansion.¹,²

The concepts developed in the early portions of the paper are used subsequently to derive the optimum detector for each of the three systems mentioned above (both under the assumption of phase coherence and phase incoherence) and to derive the corresponding expressions for the probability of error. Several sets of curves are presented and discussed in reasonable detail in the final sections of the paper.

In the course of the presentation, references to additional articles and books on related subject matter are cited. However, appreciating the difficulty involved in sifting through many references each with its own peculiar notation, an effort has been made to write this paper as a complete unit. Accordingly, theoretical results which are utilized within the body of this paper without proof are discussed at length in the appendixes, which, with the exception of Appendix II, do not depend critically on outside source material. The treatment of Section V, which is concerned with deriving the probability of error for the systems under consideration, is somewhat different. The objective therein is to present a complete description of the calculations involved and the types of estimation which can be resorted to. It is believed that some of the presented results are new, although this is difficult to ascertain without an extensive search of the literature. Principally, however, it is felt that the value of this paper lies in the presentation; considerable insight into the significant factors which contribute to error is gained by the geometric approach emphasized. References to some alternate techniques for calculating the probability of error are presented where thought to be of interest or where they have been of direct help to the authors. We remark that, as pointed out in the text, Sections V-A to V-F may be skipped by the reader without loss of continuity.

¹ W. B. Davenport and W. L. Root, "An Introduction to the Theory of Random Signals and Noise," McGraw-Hill Book Co., Inc., New York, N. Y., pp. 96-99, 338-345; 1958.

² C. W. Helstrom, "Statistical Theory of Signal Detection," Pergamon Press, Inc., New York, N. Y., pp. 95-109; 1960.


1962 Arthurs and Dym: Detection of Digital Signals in Presence of Noise 337

II. BASIC GEOMETRIC CONCEPTS

A. Discussion of Assumed Model

The analysis of data transmission systems is commonly based on the following model. There is assumed to exist a message source generating a stream of equally likely messages, $M_1, M_2, \ldots, M_m$, into a waveform generator having available an alphabet of $m$ distinct waveforms, $S_1(t), S_2(t), \ldots, S_m(t)$, each of duration $T$ (and necessarily finite energy). One waveform is transmitted every $T$ seconds, the choice of waveform depending in some fashion on the incoming message and possibly on the waveforms transmitted in preceding time slots. The medium coupling the transmitter to the receiver is assumed to add stationary-white-zero mean-Gaussian noise to the transmitted signal but otherwise is assumed to be distortion free. It is generally further assumed that the receiver is time synchronized with the transmitter (synchronous detection). Sometimes it is also assumed that the receiver is phase locked to the transmitter (coherent detection). In this report we shall always assume time synchronism but shall distinguish between coherent and incoherent detection.

The problem we are generally interested in solving, given this model (see Fig. 1), is how to design the receiver so that it makes as few errors as possible. Furthermore, assuming that an optimum receiver (optimum in the sense that it will make fewer errors in the long run than any other receiver) is constructed, we are interested in calculating its error rate.

B. Geometric Representation of a Known Set of Waveforms

One purpose of this report is to point out that all problems of the type mentioned above may be transformed into geometric problems with considerable simplification of detail. Basic to the geometric viewpoint are two theorems, the first of which we shall now state.

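Theorem I

Any finite set of physically realizable waveforms of duration $T$, say $S_1(t), S_2(t), \ldots, S_m(t)$, may be expressed as a linear combination of $k$ orthonormal waveforms $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$, $k \le m$. That is to say, we can rewrite the $S_i(t)$, $i = 1, 2, \ldots, m$, in the form

$$\begin{aligned}
S_1(t) &= a_{11}\varphi_1(t) + a_{12}\varphi_2(t) + \cdots + a_{1k}\varphi_k(t) \\
S_2(t) &= a_{21}\varphi_1(t) + a_{22}\varphi_2(t) + \cdots + a_{2k}\varphi_k(t) \\
&\;\;\vdots \\
S_m(t) &= a_{m1}\varphi_1(t) + a_{m2}\varphi_2(t) + \cdots + a_{mk}\varphi_k(t)
\end{aligned} \tag{1}$$

where

$$a_{ij} = \int_0^T S_i(t)\,\varphi_j(t)\,dt \qquad i = 1, 2, \ldots, m; \quad j = 1, 2, \ldots, k \tag{2}$$

and the $\varphi_j(t)$, $j = 1, 2, \ldots, k$, are waveforms having the property (definition of orthonormal) that

$$\int_0^T \varphi_i(t)\,\varphi_j(t)\,dt = \begin{cases} 1 & i = j \\ 0 & i \ne j. \end{cases} \tag{3}$$

Fig. 1-Idealized model of data transmission system.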

The proof of this theorem is presented in Appendix I. Note that the conventional Fourier Series expansion of a waveform of duration $T$ is an example of a particular expansion of this type. There are, however, two very important distinctions we wish to make.

1) The form of the $\varphi_j(t)$ has not been specified. That is to say, we have not confined the expansion to be in terms of sinusoids and cosinusoids.

2) The expansion of $S_i(t)$ in terms of a finite number of terms is not an approximation wherein only the first $k$ terms are significant but rather an exact expression where $k$ and only $k$ terms are significant. The number $k$, incidentally, is referred to as the dimension of the signal alphabet of waveforms.

The form of the $\varphi_j(t)$ is dependent upon the form of the message waveforms originally specified, $S_1(t), \ldots, S_m(t)$. The proof of Theorem I outlines a method of determining the $\varphi_j(t)$.
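Appendix I is not reproduced in this portion of the paper. As an illustration only, the sketch below assumes that the orthonormal waveforms are obtained by Gram-Schmidt orthogonalization of sampled waveforms, which is one standard way of constructing such a set and not necessarily the procedure of the appendix. The sample alphabet, the sampling grid and the tolerance `tol` are hypothetical choices.

```python
import math

def inner(x, y, dt):
    """Approximate the integral of x(t) * y(t) over [0, T] by a Riemann sum."""
    return sum(a * b for a, b in zip(x, y)) * dt

def gram_schmidt(signals, dt, tol=1e-9):
    """Return orthonormal sampled waveforms spanning the given sampled signals."""
    basis = []
    for s in signals:
        # Remove the components of s that lie along the basis found so far.
        residual = list(s)
        for phi in basis:
            c = inner(s, phi, dt)
            residual = [r - c * p for r, p in zip(residual, phi)]
        energy = inner(residual, residual, dt)
        # Keep the residual only if it adds a new dimension (tol is an assumption).
        if energy > tol:
            norm = math.sqrt(energy)
            basis.append([r / norm for r in residual])
    return basis

# Hypothetical alphabet: the third waveform is a combination of the first two.
T, n = 1.0, 400
dt = T / n
t = [(i + 0.5) * dt for i in range(n)]
s1 = [math.cos(2 * math.pi * 2 * ti) for ti in t]
s2 = [math.sin(2 * math.pi * 2 * ti) for ti in t]
s3 = [a - 0.5 * b for a, b in zip(s1, s2)]
basis = gram_schmidt([s1, s2, s3], dt)
k = len(basis)  # dimension of the signal alphabet
```

Here three waveforms yield only two orthonormal waveforms, which is the sense in which $k \le m$ in Theorem I.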

Accepting the fact that each signal waveform may be represented by a linear combination of the $\varphi_j(t)$, $j = 1, \ldots, k$, namely,

$$S_i(t) = \sum_{j=1}^{k} a_{ij}\,\varphi_j(t) \qquad i = 1, 2, \ldots, m \tag{4}$$

it is apparent that each signal waveform may actually be specified uniquely in terms of the coefficients of the $\varphi_j(t)$ $(j = 1, \ldots, k)$. Thus, we can represent $S_i(t)$ by the $k$-tuple $(a_{i1}, a_{i2}, \ldots, a_{ik})$. Furthermore, if we conceptually extend our conventional notion of 2 and 3 dimensional Euclidean spaces to a $k$-dimensional Euclidean space, we can think of the numbers $a_{i1}, a_{i2}, \ldots, a_{ik}$ as the $k$ coordinate projections of the signal point $S_i$ on a $k$-dimensional Euclidean space.

Thus, for example, if $k = 3$ we may plot the point $S_i$ corresponding to the waveform

$$S_i(t) = a_{i1}\varphi_1(t) + a_{i2}\varphi_2(t) + a_{i3}\varphi_3(t)$$

as a point in a 3-dimensional Euclidean space with coordinates $(a_{i1}, a_{i2}, a_{i3})$ as shown in Fig. 2.

We shall subsequently refer to the k-dimensional space on which Si is plotted as the signal space.

There are some interesting relationships between the energy content of a signal and the distance between a signal point and the origin of the signal space. The distance between a pair of points, $x$ and $y$, with coordinates $(x_1, x_2, \ldots, x_k)$ and $(y_1, y_2, \ldots, y_k)$, respectively, is given by the formula

$$d(x, y) = \sqrt{\sum_{j=1}^{k} (x_j - y_j)^2}. \tag{5}$$

It follows readily that the distance between a signal point $S_i$ with coordinates $(a_{i1}, a_{i2}, \ldots, a_{ik})$ and the origin of the signal space is given by

$$d(S_i, 0) = \sqrt{\sum_{j=1}^{k} a_{ij}^2}. \tag{6}$$

Fig. 2-Geometric representation of the signal waveform $S_i(t)$.

Fig. 3-Planar section in the signal space determined by the origin and the signal points $S_i$ and $S_v$, $i \ne v$.

Now, by (4), we may write

$$\int_0^T S_i^2(t)\,dt = \int_0^T S_i(t)\left[a_{i1}\varphi_1(t) + a_{i2}\varphi_2(t) + \cdots + a_{ik}\varphi_k(t)\right] dt$$

but since the $\varphi_j(t)$ are orthonormal [see (3)], this latter equation reduces simply to

$$\int_0^T S_i^2(t)\,dt = \sum_{j=1}^{k} a_{ij}^2. \tag{7}$$

That is to say, the energy content of $S_i(t)$, $E_i$, is equal to

$$E_i = d^2(S_i, 0) = \sum_{j=1}^{k} (a_{ij})^2. \tag{8}$$

It may similarly be shown that

$$\int_0^T \left[S_i(t) - S_v(t)\right]^2 dt = \sum_{j=1}^{k} (a_{ij} - a_{vj})^2. \tag{9}$$

Though not essential to the development we might point out an interesting sidelight. Namely, if we consider the plane formed by lines joining the signal points $S_i$, $S_v$ $(i \ne v)$ and the origin, then by the law of cosines we may express $\cos\theta$ (see Fig. 3) as

$$\cos\theta = \frac{d^2(0, S_i) + d^2(0, S_v) - d^2(S_i, S_v)}{2\,d(0, S_i)\,d(0, S_v)}. \tag{10}$$

If, in particular, $\theta = \pi/2$, then $\cos\theta = 0$ and (10) reduces to

$$d^2(0, S_i) + d^2(0, S_v) - d^2(S_i, S_v) = 0.$$

It follows, therefore, from (6) and (7) that

$$\int_0^T S_i^2(t)\,dt + \int_0^T S_v^2(t)\,dt - \int_0^T \left[S_i(t) - S_v(t)\right]^2 dt = 0$$

which, however, may be simplified to yield

$$\int_0^T S_i(t)\,S_v(t)\,dt = 0.$$

Thus, we can conclude that if the signal points corresponding to the pair of waveforms $S_i(t)$ and $S_v(t)$ are orthogonal to each other, the angle between a pair of signal points being defined as in Fig. 3, then

$$\int_0^T S_i(t)\,S_v(t)\,dt = 0 \qquad (i \ne v).$$

If, further, each of the waveforms $S_i(t)$, $i = 1, 2, \ldots, m$, is suitably scaled, that is, normalized, so that

$$\int_0^T S_i^2(t)\,dt = 1$$

for $i = 1, 2, \ldots, m$, then the set of waveforms $S_i(t)$, $i = 1, 2, \ldots, m$, is termed an orthonormal set. [See also (3).] Thus, we see that there is a rather simple geometric interpretation which can be given to the notion of orthonormal waveforms.

C. Detection of Signals in the Presence of Noise

Now, returning to the main development, we wish to point out that the coefficients

$$a_{ij} = \int_0^T S_i(t)\,\varphi_j(t)\,dt \tag{2}$$

may be calculated by a series of product integrators (properly synchronized to the waveform generators) as shown in Fig. 4.

Such a series of product integrators can, in fact, be used as the first stage of a detector in a data transmission system. The function of the second stage or decision stage, as we shall term it, is then to decide, on the basis of the $k$ outputs of the product integrators, what signal was actually sent.
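The product integrator stage can be sketched numerically. The example below is a minimal illustration, assuming a midpoint-rule approximation of the integral in (2) and a hypothetical two-dimensional alphabet built on a cosine and a sine; the function name `product_integrator` and all parameter values are illustrative and not taken from the paper. It also checks that the energy of the waveform equals the squared distance of its signal point from the origin, as in (7) and (8).

```python
import math

def product_integrator(signal, phi, T, n=2000):
    """Approximate the integral over [0, T] of signal(t) * phi(t) (midpoint rule)."""
    dt = T / n
    return sum(signal((i + 0.5) * dt) * phi((i + 0.5) * dt) for i in range(n)) * dt

T = 1.0
w0 = 2 * math.pi * 3 / T  # hypothetical carrier: 3 cycles per interval

def phi1(t):
    return math.sqrt(2 / T) * math.cos(w0 * t)

def phi2(t):
    return math.sqrt(2 / T) * math.sin(w0 * t)

def s(t):
    # Hypothetical waveform whose signal point is (1.5, -0.5).
    return 1.5 * phi1(t) - 0.5 * phi2(t)

# One product integrator per orthonormal waveform gives one coordinate each.
coords = [product_integrator(s, phi, T) for phi in (phi1, phi2)]

# Energy of the waveform versus squared distance from the origin.
energy = product_integrator(s, s, T)
squared_distance = sum(a * a for a in coords)
```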


The decision problem is complicated by the fact that the transmitted signal is perturbed by noise. (We are assuming, for the present, coherent detection.)

Typically, the noise is assumed to be additive white-stationary-zero mean-Gaussian, the reasons for this assumption being that

1) it makes calculations more tractable, and

2) it is a reasonable description of the type of noise present in many communication channels.

We shall now outline, briefly, the meaning of each of the terms used in the description of the noise.

Classifying the noise as additive implies simply that the received signal, which we shall designate as $z(t)$, consists of a noise term in addition to the originally transmitted signal. That is, if $S_i(t)$ was transmitted, the received signal

$$z(t) = S_i(t) + n(t).$$

Fig. 4-Set of product integrators which may be used to calculate the signal space coordinates of the signal $S_i(t)$.

Correspondingly, the output of the $j$th product integrator equals

$$\int_0^T z(t)\,\varphi_j(t)\,dt = a_{ij} + n_j \tag{12}$$

where

$$n_j = \int_0^T n(t)\,\varphi_j(t)\,dt. \tag{13}$$

Fig. 5-Samples of possible noise voltages which might be superimposed on the transmitted signal.

The amplitude of the term $n_j$ will be dependent upon the particular noise sample which perturbed the transmitted signal waveform. Since there is an infinite number of such possible noise samples, each of which could have perturbed the transmitted signal (see Fig. 5), there is a correspondingly infinite number of values which the term $n_j$ can take on. Accordingly, the amplitude of $n_j$ cannot be specified in advance and can at best be described in a probabilistic sense.

The fact that the noise is stationary tells us that the statistics of the noise are independent of the particular time we choose to start transmitting data. In particular, referring to Fig. 5, the choice of the point $t = 0$ is arbitrary as far as the noise is concerned since all joint probability density functions will depend only upon time differences and not upon the actual values of time with respect to some absolute reference.

If, in particular, the noise model used is assumed to be stationary Gaussian with zero mean, it may be shown (Appendix II) that the probability density function of the noise perturbation $n_j$ is Gaussian with zero mean. That is to say, the probability that $a \le n_j < b$ is equal to

$$P(a \le n_j < b) = \frac{1}{\sqrt{2\pi}\,\sigma_j} \int_a^b e^{-x^2/2\sigma_j^2}\,dx \qquad j = 1, 2, \ldots, k. \tag{14}$$

Evaluation of the integral described in (14) requires knowledge of the quantity $\sigma_j^2$, that is, the variance of the noise perturbation $n_j$. Since the noise is stationary, the variance of each noise perturbation is determined by the spectral density. In fact, if the noise is specified to be white, which implies that the spectral density $W(f)$ equals $N_0$ watts/cps for all $f$ (positive and negative), it may be shown (Appendix II) that the variance of $n_j$ is

$$\sigma_j^2 = N_0 \qquad j = 1, 2, \ldots, k.$$

It may further be shown (Appendix II) that the perturbations are mutually independent. That is, the probability of the joint event that, say, $a_1 \le n_1 < b_1$ and $a_2 \le n_2 < b_2$ $\ldots$ and $a_k \le n_k < b_k$ is equal to simply the product of the probabilities of the individual events:

$$P(a_1 \le n_1 < b_1,\ a_2 \le n_2 < b_2,\ \ldots,\ a_k \le n_k < b_k) = P(a_1 \le n_1 < b_1)\,P(a_2 \le n_2 < b_2) \cdots P(a_k \le n_k < b_k). \tag{15}$$


We wish to point out that the $n_1, n_2, \ldots, n_k$ do not serve to completely characterize the noise but only that portion of the noise which interacts with the product integrators. That is, $n(t)$ cannot be expanded simply in terms of the $\varphi_j(t)$ alone but, rather, must be expressed as

$$n(t) = n_1\varphi_1(t) + n_2\varphi_2(t) + \cdots + n_k\varphi_k(t) + h(t) \tag{16}$$

where $h(t)$ is a sort of remainder term which must be included on the right to preserve the equality. [Contrast this with the expansion of $S_i(t)$ in (1).] Utilizing the fact that the $\varphi_j(t)$ are orthonormal and that

$$n_j = \int_0^T n(t)\,\varphi_j(t)\,dt \qquad j = 1, 2, \ldots, k \tag{13}$$

it may be deduced from (16) that

$$\int_0^T h(t)\,\varphi_j(t)\,dt = 0 \qquad j = 1, 2, \ldots, k. \tag{17}$$

This is, of course, no more than the statement that $h(t)$ does not have any components on the signal space.

The results of the preceding few pages may be summarized as the second basic theorem.

Theorem II

Given a set of orthonormal waveforms, $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$, which characterize a signal space and a stationary-white-zero mean-Gaussian noise source, $n(t)$, with spectral density $N_0$, the noise may be decomposed into two portions, the first, $n_1\varphi_1(t) + n_2\varphi_2(t) + \cdots + n_k\varphi_k(t)$, consisting of the projection of the noise on the signal space and the second consisting of that portion of the noise which is orthogonal to the signal space. [See (16) and (17).] The $n_j$, $j = 1, 2, \ldots, k$, which are defined by (13), are independent Gaussian random variables with zero mean and variance $N_0$.

That is to say, the $n_1, n_2, \ldots, n_k$ represent the $k$ coordinate projections of the noise on the signal space and represent that portion of the noise which will interfere with the detection process. The remaining portion of the noise [$h(t)$] may be thought of as being effectively tuned out by the detector.
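The statement that the noise coordinates are independent with variance $N_0$ can be checked by simulation. The sketch below is not from the paper; it assumes that white noise of spectral density $N_0$ can be modeled on a time grid of step `dt` by independent Gaussian samples of variance $N_0/dt$, and it uses a hypothetical cosine and sine pair as the orthonormal waveforms. The seed and trial count are arbitrary.

```python
import math
import random

def noise_coordinates(basis, n0, dt, rng):
    """Project one sample of discretized white noise onto each basis waveform."""
    # Assumption: white noise of spectral density n0 is modeled by independent
    # Gaussian samples of variance n0 / dt, one per time step.
    sigma = math.sqrt(n0 / dt)
    noise = [rng.gauss(0.0, sigma) for _ in basis[0]]
    return [sum(x * p for x, p in zip(noise, phi)) * dt for phi in basis]

T, n, n0 = 1.0, 64, 0.5
dt = T / n
t = [(i + 0.5) * dt for i in range(n)]
basis = [
    [math.sqrt(2 / T) * math.cos(2 * math.pi * 2 * ti / T) for ti in t],
    [math.sqrt(2 / T) * math.sin(2 * math.pi * 2 * ti / T) for ti in t],
]

rng = random.Random(1962)  # fixed seed so the run is repeatable
trials = [noise_coordinates(basis, n0, dt, rng) for _ in range(4000)]

# Sample variances of n_1 and n_2 and their sample cross-correlation.
var1 = sum(c[0] ** 2 for c in trials) / len(trials)
var2 = sum(c[1] ** 2 for c in trials) / len(trials)
cov12 = sum(c[0] * c[1] for c in trials) / len(trials)
```

With enough trials, `var1` and `var2` should settle near `n0` and `cov12` near zero, in line with Theorem II.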

III. COHERENT DETECTION

A. Statement of Detection Problem in Geometric Terms

Summarizing the results of Section II, we note that a received signal, $z(t)$, may be represented by a point in a Euclidean space of the appropriate dimension. The coordinates of the point are calculated by a series of product integrators which make up the first stage of our conceptual detector. Each coordinate, as may be deduced from (12), consists of two components: one due to the transmitted signal and the other due to the noise which has been superimposed on the signal in the channel coupling the transmitter to the receiver. The function of the decision stage of the detector is to guess which signal was transmitted from the position of the received "noisy" point. We emphasize the fact that the best the detector can do in the presence of a statistical perturbation such as the additive noise model assumed is to guess at the transmitted message. As a consequence, one reasonable measure for the performance of a detector is the number of times it guesses wrong in a long typical sequence of messages. Or, more precisely, since by assumption the a priori probability for transmission of each signal waveform $S_i(t)$ is known, we can calculate for each detector the probability of making an error.

B. Optimum Decision Rule

In the coherent case, the coordinates of each possible transmitted signal may be calculated by the detector. Thus, $m$ points, each of which corresponds to a transmitted signal, may be plotted in the detector signal space. We shall subsequently refer to these $m$ points as the message points or the transmitted signal points. Note that the received signal point will be displaced from the transmitted signal point due to the addition of noise. Since, as may be deduced from the bell shaped curve, small noise perturbations are much more likely than large ones in the Gaussian case, a reasonable decision rule to adopt is to assume that the signal whose message point lies closest to the received point was actually transmitted. In fact, in Appendix III the following theorem is verified.

Theorem III

If each signal waveform is transmitted with equal probability and if the received signal is perturbed by additive stationary-white-zero mean-Gaussian noise, then, for the case of coherent detection, that decision rule which selects the message point closest to the received point minimizes the probability of error.

A detector which embodies the decision rule of Theorem III is often referred to as a maximum likelihood detector. It should be noted, however, that a maximum likelihood detector will only minimize the probability of error when, as in this case, it is assumed that each possible signal waveform is transmitted with equal probability.³
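The decision rule of Theorem III amounts to a nearest-point search in the signal space. A minimal sketch follows; the function name `nearest_message`, the four message points and the received points are hypothetical, and indices are counted from zero rather than from one as in the text.

```python
def nearest_message(received, message_points):
    """Return the index of the message point closest to the received point."""
    def squared_distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(range(len(message_points)),
               key=lambda i: squared_distance(received, message_points[i]))

# Hypothetical two-dimensional alphabet and received point.
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
decision = nearest_message((0.2, 0.7), points)
```

Comparing squared distances avoids a square root and leaves the ordering of the candidates unchanged.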

We shall now illustrate this rule for three cases of practical interest: coherent Phase Shift Keyed, coherent Amplitude Shift Keyed and coherent Frequency Shift Keyed.

C. Coherent PSK

This modulation scheme is characterized by the fact that the information carried by the transmitted waveform is contained in the phase. A typical set of message waveforms is described by

$$S_i(t) = \begin{cases} \sqrt{\dfrac{2E}{T}}\,\cos\left(\omega_0 t + \dfrac{2\pi i}{m}\right) & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases} \qquad i = 1, 2, \ldots, m \tag{18}$$

where $E$ is the energy content of $S_i(t)$ and

$$\omega_0 = \frac{2\pi n_0}{T}$$

for some fixed integer $n_0$.

³ For further discussion, see Davenport and Root, op. cit., pp. 317-324; R. M. Fano, "Transmission of Information," M. I. T. Press, Cambridge, Mass., John Wiley and Sons, Inc., New York, N. Y., p. 184; 1961.

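Now, recognizing that each $S_i(t)$ may be written in terms of a sinusoid and cosinusoid, which are orthogonal, and then suitably scaling to fulfill the conditions of (3), we conclude that the appropriate form for the orthonormal waveforms $\varphi_1(t)$ and $\varphi_2(t)$ to be used in the product integrators of Fig. 4 (alternately, we could have used the techniques described in Appendix I) is

$$\varphi_1(t) = \sqrt{\frac{2}{T}}\,\cos\omega_0 t, \qquad \varphi_2(t) = \sqrt{\frac{2}{T}}\,\sin\omega_0 t, \qquad 0 \le t \le T. \tag{19}$$

The coordinates of the message points may be calculated by (2), (18) and (19):

$$\begin{aligned}
a_{i1} &= \int_0^T \sqrt{\frac{2E}{T}}\,\cos\left(\omega_0 t + \frac{2\pi i}{m}\right)\sqrt{\frac{2}{T}}\,\cos\omega_0 t\,dt = \sqrt{E}\,\cos\frac{2\pi i}{m} \\
a_{i2} &= \int_0^T \sqrt{\frac{2E}{T}}\,\cos\left(\omega_0 t + \frac{2\pi i}{m}\right)\sqrt{\frac{2}{T}}\,\sin\omega_0 t\,dt = -\sqrt{E}\,\sin\frac{2\pi i}{m}.
\end{aligned} \tag{20}$$

Note that for the particular case $m = 2$ [often termed phase reversal since $S_i(t) = \sqrt{2E/T}\,\sin(\omega_0 t \pm \pi/2)$], $a_{i2} = 0$. Accordingly, we can dispense with $\varphi_2(t)$.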

We shall now illustrate the decision rule embodied in Theorem III for the case $m = 4$. The four possibly transmitted signal points whose coordinates are given by (20) are shown in the diagram of the signal space displayed in Fig. 6. To realize the decision rule we must partition the signal space into four regions, namely, the set of points in the signal space closest to $S_1$, the set of points closest to $S_2$, the set of points closest to $S_3$ and the set of points closest to $S_4$. This is accomplished by constructing the perpendicular bisectors of the 4-sided polygon $S_1 S_2 S_3 S_4$ and marking off the appropriate regions. It may in this way be deduced that the regions of interest are cones whose vertices coincide with the origin. These regions are marked zone 1, zone 2, zone 3 and zone 4 according to the transmitted signal point about which they are constructed.
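Because the zones are cones with vertex at the origin, the zone of a received point depends only on its angle. The sketch below assumes message points at $(\sqrt{E}\cos(2\pi i/m),\ -\sqrt{E}\sin(2\pi i/m))$, which is what the integrals of (20) evaluate to for phases $2\pi i/m$; the function names are hypothetical, and the zone numbering is assumed to follow the index $i$ rather than being read from Fig. 6.

```python
import math

def psk_points(m, energy):
    """Message points (sqrt(E) cos(2 pi i/m), -sqrt(E) sin(2 pi i/m)), i = 1..m."""
    r = math.sqrt(energy)
    return [(r * math.cos(2 * math.pi * i / m), -r * math.sin(2 * math.pi * i / m))
            for i in range(1, m + 1)]

def psk_zone(received, m):
    """Return i in 1..m for the cone (zone) containing the received point."""
    # With the sign convention assumed above, the phase is -atan2(a2, a1).
    phase = -math.atan2(received[1], received[0])
    i = round(phase * m / (2 * math.pi)) % m
    return m if i == 0 else i

points = psk_points(4, 1.0)
zone = psk_zone((0.3, -1.0), 4)  # hypothetical noisy received point
```

Rounding the phase to the nearest multiple of $2\pi/m$ selects the same message point as a nearest-point search, since all message points lie at the same distance from the origin.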

The decision rule is now simply to guess $S_1(t)$ was transmitted if the received signal point falls in zone 1, guess $S_2(t)$ was transmitted if the received signal point falls in zone 2 and so on. An erroneous decision will be made if, for example, $S_4(t)$ is transmitted and the noise is such that the received signal point falls outside zone 4. The probability of error for the PSK coherent case is calculated for various values of $m$ in Section V. Curves and discussions are presented in Section VI.

Fig. 6-Optimum partitioning of detector signal space for a 4-level PSK coherent system.

Fig. 7-Optimum partitioning of detector signal space for a 3-level ASK coherent system.

D. Coherent ASK

In this modulation scheme the information carried by the transmitted waveform is contained in the amplitude. A typical set of message waveforms is described by

$$S_i(t) = \begin{cases} \sqrt{\dfrac{2E_i}{T}}\,\cos\omega_0 t & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases} \qquad i = 1, 2, \ldots, m \tag{21}$$

where $E_i$ is the energy content of $S_i(t)$ and $\omega_0 = 2\pi n_0/T$ for some fixed integer $n_0$.

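It should be clear that each transmitted waveform may be expanded in terms of the single orthonormal waveform

$$\varphi_1(t) = \sqrt{\frac{2}{T}}\,\cos\omega_0 t \qquad 0 \le t \le T \tag{22}$$

and that

$$a_{i1} = \int_0^T \sqrt{\frac{2E_i}{T}}\,\cos\omega_0 t\,\sqrt{\frac{2}{T}}\,\cos\omega_0 t\,dt = \sqrt{E_i}. \tag{23}$$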

The possible transmitted signal points are illustrated for the case $m = 3$ in Fig. 7. The signal space is partitioned into 3 distinct detection zones according to the techniques just discussed. Thus, for example, zone 2 consists of the set of points in the signal space which lie closer to $S_2$ than to $S_1$ or $S_3$.
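On a single axis the zones are intervals whose boundaries are the midpoints between adjacent message points. The following sketch illustrates this; the helper names and the three coordinate values are hypothetical, and the zones are assumed to be numbered in order of increasing coordinate.

```python
from bisect import bisect_right

def ask_thresholds(amplitudes):
    """Midpoints between adjacent message points on the single signal axis."""
    pts = sorted(amplitudes)
    return [(a + b) / 2 for a, b in zip(pts, pts[1:])]

def ask_zone(received, thresholds):
    """Return the 1-based zone number for a received coordinate."""
    return bisect_right(thresholds, received) + 1

# Hypothetical 3-level alphabet with coordinates 0, 1 and 2 (arbitrary units).
thresholds = ask_thresholds([0.0, 1.0, 2.0])
zone = ask_zone(1.3, thresholds)
```

Note that the two outer zones extend to infinity, so a large noise excursion away from the other message points does not cause an error for the outermost signals.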

Probability of error calculations for this case (under the further assumption of average power limitations and uniform amplitude spacing starting with zero) are presented in Section V. Curves and discussion appear in Section VI.


E. Coherent FSK

This modulation scheme is characterized by the fact that the information carried by the transmitted signal is contained in the frequency. A typical set of signal waveforms is described by

$$S_i(t) = \begin{cases} \sqrt{\dfrac{2E}{T}}\,\cos(\omega_i t) & 0 \le t \le T \\ 0 & \text{elsewhere} \end{cases} \tag{24}$$

where $E$ is the energy content of $S_i(t)$,

$$\omega_i = \frac{2\pi (n_0 + i)}{T}$$

for some fixed integer $n_0$, and $i = 1, 2, \ldots, m$.

Fig. 8-Optimum partitioning of detector signal space for a 3-level FSK coherent system.
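The orthogonality of the tones in (24) can be verified numerically. The sketch below assumes the frequencies $\omega_i = 2\pi(n_0 + i)/T$, that is, an integer number of cycles per interval with adjacent tones one cycle apart; the function name and parameter values are hypothetical.

```python
import math

def fsk_inner_product(i, j, n0, energy, T, n=4000):
    """Approximate the integral over [0, T] of S_i(t) * S_j(t) for the tones in (24)."""
    dt = T / n
    amp = math.sqrt(2 * energy / T)
    wi = 2 * math.pi * (n0 + i) / T
    wj = 2 * math.pi * (n0 + j) / T
    return sum(amp * math.cos(wi * (k + 0.5) * dt) * amp * math.cos(wj * (k + 0.5) * dt)
               for k in range(n)) * dt

# Hypothetical parameters: n0 = 5, unit interval, energy E = 2, m = 3 tones.
gram = [[fsk_inner_product(i, j, 5, 2.0, 1.0) for j in (1, 2, 3)] for i in (1, 2, 3)]
```

The diagonal entries of `gram` approximate the energy $E$ and the off-diagonal entries approximate zero, so the waveforms are orthogonal but, unless $E = 1$, not orthonormal.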

Following the procedure of Appendix I or observing directly that the $S_i(t)$ are orthogonal (not orthonormal), it may be deduced that the most useful form for the orthonormal waveforms $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$ is

$$\varphi_j(t) = \sqrt{\frac{2}{T}}\,\cos\omega_j t, \qquad j = 1, 2, 3, \ldots, k = m. \tag{25}$$

Correspondingly,

$$a_{ij} = \int_0^T \sqrt{\frac{2E}{T}}\,\cos\omega_i t\,\sqrt{\frac{2}{T}}\,\cos\omega_j t\;dt = \begin{cases} \sqrt{E}, & i = j \\ 0, & \text{otherwise.} \end{cases} \tag{26}$$

That is to say that the $i$th signal point is located on the $i$th coordinate axis at a displacement of $\sqrt{E}$ from the origin of the signal space.

It should also be noted that in this modulation scheme the distance between any two signal points $S_i$ and $S_j$ is constant, since by (5) and (26) we have

$$d(S_i, S_j) = \sqrt{2E}, \qquad i \ne j.$$

The detection rule is illustrated for the case m = 3 in Fig. 8.

Calculations for the probability of error of an m-level FSK coherent modulation scheme are presented in Section V. Curves and discussion appear in Section VI.

F. Remarks

The procedure we have followed in the last three examples is to partition the detector signal space into $m$ distinct regions, each region containing one and only one message point and consisting of those points in the signal space which are closer to the contained message point than to any other message point. The received signal point will (with probability one) fall into one, and only one, of these regions.

The optimum decision rule (when the hypothesis of Theorem III is satisfied) is simply to identify the region in which the received signal point falls and assume that the signal corresponding to the contained message point was actually transmitted.

IV. INCOHERENT DETECTION

Incoherent systems differ from coherent systems in that no provisions have been made to phase synchronize the receiver with the transmitter. Accordingly, if the waveform

$$S(t) = \sqrt{\frac{2E}{T}}\,\cos(\omega_0 t + \phi) \tag{27}$$

is transmitted, the received signal $z(t)$ will be of the form

$$z(t) = \sqrt{\frac{2E}{T}}\,\cos(\omega_0 t + \phi + \alpha) + n(t) \tag{28}$$

where the angle $\alpha$ is unknown and is usually considered to be a random variable uniformly distributed between 0 and $2\pi$.

It may readily be deduced that the detection schemes presented previously are inadequate for the incoherent case, for if the received signal takes the form described by (28), the outputs of the product integrators will be functions of the unknown angle $\alpha$. We shall now discuss in turn the modifications which must be introduced to the PSK, ASK and FSK systems discussed previously.

A. PSK Incoherent

It should be clear from (28) that the presence of the random phase angle in the argument of the cosine prevents the receiver from deriving any information from the phase of the incoming signal alone, namely, $(\phi + \alpha)$. If, however, $\alpha$ varies slowly (that is, slowly enough so that it may be considered constant over the period of time required to transmit two waveforms, $2T$), then the relative phase difference between two successive waveforms will be independent of $\alpha$ [i.e., $(\phi_1 + \alpha) - (\phi_2 + \alpha) = \phi_1 - \phi_2$]. Thus, if the detector were equipped with storage, it could measure the phase difference between successive signals regardless of the value of $\alpha$. This suggests that we modify the coding scheme at the transmitter as follows. To send the $i$th message ($i = 1, 2, \ldots, m$), phase advance the current signal waveform by $2\pi i/m$ radians over the previous waveform.


Correspondingly, the detector should (at least, conceptually) calculate the coordinates of the incoming signal by product integrating it with the locally generated waveforms $\sqrt{2/T}\cos\omega_0 t$ and $\sqrt{2/T}\sin\omega_0 t$. It should then plot the received signal points and measure the angle between the currently received signal point and the previously received signal point which has been stored. It may be shown (see Appendix IV) that the best rule for the detector to follow is to quantize the measured angle in steps of $2\pi/m$ and guess that the corresponding message was transmitted.

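Thus, for example, if the $i$th message was transmitted, a pair of successively received signals $z_1(t)$ and $z_2(t - T)$ will be of the form

$$z_1(t) = \sqrt{\frac{2E}{T}}\,\cos(\omega_0 t + \alpha) + n(t) \tag{29a}$$

$$z_2(t - T) = \sqrt{\frac{2E}{T}}\,\cos\!\left(\omega_0 t + \alpha + \frac{2\pi i}{m}\right) + n(t) \tag{29b}$$

where the angle $\alpha$ is unknown and is assumed to be uniformly distributed over a $2\pi$ interval (symmetric with respect to some mean which may be unknown). The coordinates of the corresponding signal points $z_1$ and $z_2$, which we designate as $(x_1, y_1)$ and $(x_2, y_2)$, will be of the form

$$x_1 = \int_0^T z_1(t)\,\varphi_1(t)\,dt = \sqrt{E}\cos\alpha + n_{11} \tag{30a}$$

$$y_1 = \int_0^T z_1(t)\,\varphi_2(t)\,dt = -\sqrt{E}\sin\alpha + n_{12} \tag{30b}$$

$$x_2 = \int_T^{2T} z_2(t - T)\,\varphi_1(t - T)\,dt = \sqrt{E}\cos\!\left(\alpha + \frac{2\pi i}{m}\right) + n_{21} \tag{30c}$$

$$y_2 = \int_T^{2T} z_2(t - T)\,\varphi_2(t - T)\,dt = -\sqrt{E}\sin\!\left(\alpha + \frac{2\pi i}{m}\right) + n_{22} \tag{30d}$$

where $n_{11}$, $n_{12}$, $n_{21}$ and $n_{22}$ are independent Gaussian random variables, each having zero mean and variance $N_0$.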

Suppose now that for some particular combination of received signals, the random variables $x_1, y_1, x_2, y_2$ take on the values $a_1, b_1, a_2, b_2$, respectively. The optimum rule for the detector to follow (see Appendix IV) is to measure the angle $\theta$ between the two points $(a_1, b_1)$ and $(a_2, b_2)$, which are shown plotted in Fig. 9, round off to the nearest integral multiple of $2\pi/m$ and guess that the signal corresponding to that phase rotation was transmitted.
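The rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `signal_point` and `detect_phase_step` are hypothetical, the sign convention $x = \sqrt{E}\cos(\cdot)$, $y = -\sqrt{E}\sin(\cdot)$ is taken from (30a)-(30d), and the example is noise-free.

```python
import math

def signal_point(energy, phase):
    """Noise-free product-integrator outputs (x, y) for a given carrier phase."""
    return (math.sqrt(energy) * math.cos(phase), -math.sqrt(energy) * math.sin(phase))

def detect_phase_step(stored_point, current_point, m):
    """Return the message index i in 1..m whose advance 2*pi*i/m is nearest the measured one."""
    a1, b1 = stored_point
    a2, b2 = current_point
    # y carries a minus sign, so the carrier phase is atan2(-y, x).
    advance = (math.atan2(-b2, a2) - math.atan2(-b1, a1)) % (2.0 * math.pi)
    k = round(advance / (2.0 * math.pi / m)) % m
    return m if k == 0 else k  # a full-turn advance corresponds to message m

alpha = 1.234  # unknown channel phase; the decision does not depend on it
stored = signal_point(1.0, alpha)
current = signal_point(1.0, alpha + 2.0 * math.pi * 3 / 8)
print(detect_phase_step(stored, current, 8))  # prints 3
```

Because only the difference of the two measured phases is used, the unknown angle $\alpha$ cancels, which is the point of the modified coding scheme.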

A basic difference between coherent and incoherent PSK systems is that in the coherent case the received signal is being compared with a clean reference, that is, the known position of the transmitted point. In the incoherent case, however, two noisy signals are being compared with each other. Thus, we might, after a crude fashion, say that there is twice as much noise present in the incoherent case as in the coherent and, consequently, there will be a 3-db degradation in performance. This latter statement will in fact turn out to be approximately true under the appropriate restrictions (namely, high signal-to-noise ratio and m > 2) as will be discussed in Section V-D.

Fig. 9-Illustration of detection rule for PSK incoherent.

We wish also to point out that under the proposed detection scheme two product integrators will be necessary for the two-level case as well as for the multilevel case. Recall that only one product integrator is required for the coherent two-level case.

B. ASK Incoherent

In the ASK incoherent case the received signal $z(t)$ is of the form

$$z(t) = \sqrt{\frac{2E_i}{T}}\,\cos(\omega_0 t + \alpha) + n(t), \qquad i = 1, 2, \ldots, m \tag{31}$$

where $\alpha$ is unknown and is assumed to be uniformly distributed over a $2\pi$ interval.

Consider for the present the signal portion of a particular received signal which we shall designate as $z^*(t)$, for which it is known that $\alpha = A$. That is,

$$z^*(t) = \sqrt{\frac{2E_i}{T}}\,\cos(\omega_0 t + A). \tag{32}$$

Any waveform of this form may be expressed as a linear combination of the pair of orthonormal waveforms $\sqrt{2/T}\cos\omega_0 t$ and $\sqrt{2/T}\sin\omega_0 t$. Product integrating $z^*(t)$ with $\sqrt{2/T}\cos\omega_0 t$ and $\sqrt{2/T}\sin\omega_0 t$, respectively, yields

$$a = \int_0^T z^*(t)\,\sqrt{\frac{2}{T}}\,\cos\omega_0 t\;dt = \sqrt{E_i}\cos A \tag{33a}$$

$$b = \int_0^T z^*(t)\,\sqrt{\frac{2}{T}}\,\sin\omega_0 t\;dt = -\sqrt{E_i}\sin A. \tag{33b}$$

In accordance with our previous discussion $z^*(t)$ may be represented by the point $z^*$ with coordinates $(a, b)$ shown plotted in Fig. 10.

It may readily be deduced that the line segment drawn from the origin to the point $z^*$ has length $\sqrt{E_i}$ and is displaced $A$ radians below the abscissa.
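The product integrations can be checked numerically. The sketch below is illustrative only: the helper `product_integrate`, the midpoint-rule integration and all numerical values are assumptions, and the carrier is assumed to complete an integer number of cycles in $T$. It integrates a noise-free waveform $\sqrt{2E/T}\cos(\omega_0 t + A)$ against the two reference waveforms and compares the outputs with $\sqrt{E}\cos A$ and $-\sqrt{E}\sin A$.

```python
import math

def product_integrate(signal, reference, T, steps=20000):
    """Approximate the integral of signal(t) * reference(t) over [0, T] (midpoint rule)."""
    dt = T / steps
    return sum(signal((k + 0.5) * dt) * reference((k + 0.5) * dt) for k in range(steps)) * dt

T = 1.0
w0 = 2.0 * math.pi * 5 / T  # assumed: an integer number of carrier cycles per interval
E, A = 4.0, 0.7             # hypothetical signal energy and phase offset

def z_star(t):
    return math.sqrt(2.0 * E / T) * math.cos(w0 * t + A)

def phi_cos(t):
    return math.sqrt(2.0 / T) * math.cos(w0 * t)

def phi_sin(t):
    return math.sqrt(2.0 / T) * math.sin(w0 * t)

a = product_integrate(z_star, phi_cos, T)
b = product_integrate(z_star, phi_sin, T)
print(a, math.sqrt(E) * math.cos(A))   # the two values should agree closely
print(b, -math.sqrt(E) * math.sin(A))  # likewise
```

The length $\sqrt{a^2 + b^2}$ of the resulting point equals $\sqrt{E}$ whatever the value of $A$, which is what the incoherent ASK receiver exploits.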

Fig. 10-Plot of the point $z^*$ in the detector signal space.

As far as recovery of the transmitted information is concerned, only the distance of the point from the origin, namely, $\sqrt{a^2 + b^2}$, is significant. We wish to point out, however, that Fig. 10 lends itself to a simple geometric interpretation of the difference between ASK coherent and ASK incoherent systems. In both cases the set of message points lies on a straight line (see, e.g., Fig. 7). In the coherent case, however, the receiver knows the orientation of this line (i.e., the angle $A$) and, thus, need only make one measurement (along the line) in order to deduce what message point was sent.

In the incoherent case the receiver does not know the position of the line and must, therefore, perform its analysis in the plane containing all possible rotations of the line. In particular, to calculate the distance of a point from the origin it must first measure the projections of the point on each of two perpendicular axes lying on the plane and passing through the origin and then take the square root of the sum of the squares of the two projections. Accordingly, it should be noted that the ASK coherent detector requires only one product integrator whereas the ASK incoherent detector requires two. (The logic following the product integrators will, of course, be different in the two cases.)
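A minimal sketch of this two-integrator measurement follows, combined with the nearest-amplitude choice that the paper adopts for the incoherent ASK receiver. The function name `ask_incoherent_decide` and the example energies are hypothetical.

```python
import math

def ask_incoherent_decide(a, b, energies):
    """Return the 1-based index i whose sqrt(E_i) is nearest the envelope sqrt(a^2 + b^2)."""
    envelope = math.hypot(a, b)  # square root of the sum of the squared projections
    best = min(range(len(energies)), key=lambda k: abs(envelope - math.sqrt(energies[k])))
    return best + 1

energies = [0.0, 1.0, 4.0]  # assumed example: amplitudes 0, 1 and 2
for angle in (0.0, 1.0, 2.5):
    # A point of amplitude 1.9 is assigned to message 3 whatever its rotation.
    print(ask_incoherent_decide(1.9 * math.cos(angle), -1.9 * math.sin(angle), energies))
```

The decision depends on the two projections only through their root-sum-square, so it is unaffected by the unknown rotation of the line of message points.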

Owing to the presence of noise the actual outputs of the product integrators will be of the form

$$x = \int_0^T z(t)\,\varphi_1(t)\,dt = \sqrt{E_i}\cos\alpha + n_x \tag{34a}$$

$$y = \int_0^T z(t)\,\varphi_2(t)\,dt = -\sqrt{E_i}\sin\alpha + n_y \tag{34b}$$

where $n_x$ and $n_y$ are independent Gaussian random variables, each having zero mean and variance $N_0$, and $\alpha$ is assumed to be a uniformly distributed random variable over a $2\pi$ interval.

It may be shown (see Appendix IV) that a reasonable decision rule for the receiver to adopt is to measure the outputs of the two product integrators, calculate the rms amplitude, quantize it and guess that the corresponding signal was transmitted. Thus, for example, if, in particular, $x = a$ and $y = b$, then the receiver should guess that the message corresponding to the value of $i$ which minimizes the quantity

$$\left|\sqrt{a^2 + b^2} - \sqrt{E_i}\right|$$

was sent. This is equivalent to saying that the two-dimensional space corresponding to all possible outputs of the two product integrators should be partitioned into $m$ distinct regions (zones), each of which is associated with a particular message, where the $i$th region corresponding to the $i$th message consists of those points with coordinates $(a, b)$ for which

$$\left|\sqrt{a^2 + b^2} - \sqrt{E_i}\right| < \left|\sqrt{a^2 + b^2} - \sqrt{E_j}\right| \qquad \text{all } j \ne i. \tag{35}$$

Such a partitioning of the detector signal space is illustrated in Fig. 11 for the three-level case.

Fig. 11-Illustration of detection rule for 3-level ASK incoherent.

We wish to point out that this decision rule is not the optimum one to adopt but approaches the optimum rule in the case of high signal-to-noise ratio. The rationale for adopting this rule is that it is considerably easier to instrument than the optimum rule, for which the decision regions are functions of the signal-to-noise ratio. The decision rule is discussed in Appendix IV. Probability of error calculations for the particular case of uniformly spaced signal amplitudes starting with zero and an average power limited transmitter are presented in Section V. Curves and discussion appear in Section VI.

C. FSK Incoherent

In the FSK incoherent case, if the $i$th message is transmitted, the received signal $z(t)$ will be of the form

$$z(t) = \sqrt{\frac{2E}{T}}\,\cos(\omega_i t + \alpha) + n(t) \tag{36}$$

where the unknown angle $\alpha$ is assumed to be a random variable uniformly distributed over a $2\pi$ interval.

Although each of the transmitted signals may be represented by a point in an $m$-dimensional space, the presence of the unknown angle $\alpha$ makes it necessary to resolve the incoming signal in terms of the $2m$ orthonormal waveforms

$$\sqrt{\frac{2}{T}}\cos\omega_1 t,\ \sqrt{\frac{2}{T}}\cos\omega_2 t,\ \ldots,\ \sqrt{\frac{2}{T}}\cos\omega_m t,\ \sqrt{\frac{2}{T}}\sin\omega_1 t,\ \ldots,\ \sqrt{\frac{2}{T}}\sin\omega_m t.$$

Correspondingly, the $2m$ product integrator outputs will be of the form

$$x_j = \int_0^T z(t)\,\sqrt{\frac{2}{T}}\,\cos\omega_j t\;dt = \begin{cases} \sqrt{E}\cos\alpha + n_{x_j}, & j = i \\ n_{x_j}, & j \ne i \end{cases} \tag{37a}$$

$$y_j = \int_0^T z(t)\,\sqrt{\frac{2}{T}}\,\sin\omega_j t\;dt = \begin{cases} -\sqrt{E}\sin\alpha + n_{y_j}, & j = i \\ n_{y_j}, & j \ne i \end{cases} \tag{37b}$$

where the $2m$ noise perturbations

$$n_{x_j},\ n_{y_j}, \qquad j = 1, 2, \ldots, m$$

are independent random variables with zero mean and variance $N_0$ and $\alpha$ is a uniformly distributed random variable over a $2\pi$ interval.

If, at some instant, the random variables $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$ take on the particular values $a_1, a_2, \ldots, a_m, b_1, b_2, \ldots, b_m$, respectively, it may be shown (Appendix IV) that the optimum decision rule for the receiver to follow is to find that value of $j$, $j = 1, 2, \ldots, m$, for which the quantity $\sqrt{a_j^2 + b_j^2}$ is a maximum and guess that the corresponding signal was transmitted. That is to say, the detector should calculate the rms amplitude associated with each possibly transmitted frequency and select the largest one.
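As a small illustration of this rule (the function name `fsk_incoherent_decide` and the sample integrator outputs are hypothetical), the detector reduces to an arg-max over the $m$ envelopes:

```python
import math

def fsk_incoherent_decide(cos_outputs, sin_outputs):
    """Return the 1-based index j that maximizes sqrt(a_j^2 + b_j^2)."""
    amplitudes = [math.hypot(a, b) for a, b in zip(cos_outputs, sin_outputs)]
    return amplitudes.index(max(amplitudes)) + 1

a_values = [0.1, 0.8, -0.2]  # cosine product-integrator outputs a_1..a_3 (made-up numbers)
b_values = [0.0, -0.9, 0.3]  # sine product-integrator outputs b_1..b_3 (made-up numbers)
print(fsk_incoherent_decide(a_values, b_values))  # prints 2
```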

The probability of error for an orthogonal FSK incoherent system is calculated in Section V. Curves and discussion appear in Section VI.

D. Interpretation of Decision Rules in Light of Appropriate Minimum Distance Criteria

In concluding our discussion of incoherent systems we wish to point out that the decision rule adopted in each case can be interpreted as a minimum distance type rule although the spaces in which the distances are measured are not simply related to the $k$-dimensional Euclidean spaces which characterize the $k$ product-integrator outputs. Thus, in the PSK incoherent case the space of interest is that corresponding to the possible values of the relative phase difference between a pair of successively received signals, namely, an interval of length $2\pi$. The possibly transmitted phase differences determine a set of $m$ message points with coordinate displacements $2\pi i/m$, $i = 1, 2, \ldots, m$, respectively, and the optimum decision rule is equivalent to measuring the phase difference, plotting the resultant number in the space and selecting the closest message point.

In the ASK incoherent case the space of interest is that corresponding to the possible values of the rms amplitude of the received signal. This space can be represented geometrically by a semi-infinite line running from zero through the positive real numbers to infinity. The set of possibly transmitted signals defines a set of $m$ message points with coordinate displacements $\sqrt{E_i}$, $i = 1, 2, \ldots, m$, respectively, and the adopted decision rule (which is only asymptotically optimum) is equivalent to calculating the rms amplitude of the received signal, plotting the resultant value in the space and selecting the closest message point.

In the FSK incoherent case the space of interest is an $m$-dimensional Euclidean space (or, to be more exact, the positive "quadrant" of that space) wherein each direction in that space is associated with one of the possibly transmitted frequencies. The received signal may be considered to be an $m$-dimensional vector whose coordinate projection in each direction is equal to the rms amplitude of the outputs of the sine and cosine product integrators associated with that direction (frequency). The $m$ distinct points, one on each coordinate axis, displaced $\sqrt{E}$ units from the origin, constitute the message points. Correspondingly, the optimum decision rule is equivalent to measuring the distance (in this space) from the received point to each message point and selecting the closest one.

V. PROBABILITY OF ERROR CALCULATIONS

For the systems under consideration it is possible to obtain exact expressions for the probability of error in integral form. Unfortunately, however, in many cases the integrals in question are not simply integrable nor have they been tabulated over the ranges of interest. When this is the case it is sometimes possible to obtain upper and lower bounds on the probability of error which are usually adequate to predict the signal-to-noise ratio (within a decibel or so) required to maintain a prescribed error rate.

The approximations which can be made fall into two categories, namely, simplification of the integrand and simplification of the region of integration. The latter procedure is especially useful in the coherent case where the regions of integration are fixed relative to the signal space and the noise is symmetric Gaussian with zero mean. In fact, Theorem IV may be shown (see Appendix V).

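Theorem IV

Given $m$ message waveforms, each transmitted with equal probability and perturbed by additive stationary white zero-mean Gaussian noise with double-sided spectral density $N_0$ watts/cps, then the average probability of error for a maximum likelihood coherent detector is bounded by$^{4}$

$$\frac{1}{\sqrt{2\pi}}\int_{\bar\rho/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\,dx \;\le\; P_e \;\le\; \frac{m-1}{\sqrt{2\pi}}\int_{\rho^*/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\,dx \tag{38}$$

where $\rho^*$ and $\bar\rho$ are defined in terms of $\rho_i$, the distance between message point $i$ and its closest neighbor. That is,

$$\rho^* = \min_i\,(\rho_i), \qquad i = 1, 2, \ldots, m$$

$$\bar\rho = \frac{1}{m}\sum_{i=1}^{m}\rho_i.$$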

Actually, it is often possible to establish tighter bounds than those presented in Theorem IV. The results presented, however, are indicative of the type of bounds which may be achieved by overestimating and underestimating the regions of integration.
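To make the bounding idea concrete, the following sketch computes nearest-neighbor bounds for an arbitrary set of message points. It is an assumption-laden illustration, not the paper's procedure: it assumes the bounds take the form $Q(\bar\rho/2\sqrt{N_0}) \le P_e \le (m-1)\,Q(\rho^*/2\sqrt{N_0})$, where $Q$ is the Gaussian tail integral, a form consistent with how Theorem IV is applied to coherent FSK in (64). All function names are hypothetical.

```python
import math

def q_tail(x):
    """Gaussian tail: (1/sqrt(2*pi)) * integral from x to infinity of exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def nearest_neighbor_distances(points):
    """rho_i: distance from message point i to its closest neighbor."""
    return [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]

def error_bounds(points, n0):
    """(lower, upper) bounds on the average error probability, under the assumed form."""
    m = len(points)
    rho = nearest_neighbor_distances(points)
    rho_star = min(rho)     # smallest nearest-neighbor distance
    rho_bar = sum(rho) / m  # average nearest-neighbor distance
    lower = q_tail(rho_bar / (2.0 * math.sqrt(n0)))
    upper = (m - 1) * q_tail(rho_star / (2.0 * math.sqrt(n0)))
    return lower, upper

# Three orthogonal message points of energy E = 8 with N0 = 1 (made-up values).
r = math.sqrt(8.0)
print(error_bounds([(r, 0.0, 0.0), (0.0, r, 0.0), (0.0, 0.0, r)], 1.0))
```

For two message points the two bounds coincide, since $m - 1 = 1$ and $\bar\rho = \rho^*$.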

4 The upper bound is similar to one presented by E. N. Gilbert, "A comparison of signaling alphabets," Bell Sys. Tech. J., vol. 31, pp. 504-522 (Theorem 3); 1952. Gilbert's lower bound is incorrect.

Similar reasoning may be employed to estimate the probability of error for the ASK incoherent and FSK incoherent systems where, again, we are dealing with fixed regions of integration independent of the incoming signal (though the probability of error for the latter system may be evaluated exactly). Unfortunately, however, these techniques are not readily applicable to the PSK incoherent case. Therein it was only possible, with the exception of the two-level case (for which the probability of error may be evaluated exactly), to obtain approximate expressions (in simple form) which represent neither an upper bound nor a lower bound to the probability of error.

Sections V-A to V-F are devoted to calculating the probability of error and approximations thereof for each of the six systems under consideration. They may be skipped without loss of continuity. Before proceeding to this section, however, we wish to point out that, in the ASK systems, the FSK systems and the coherent PSK system, an error occurs whenever the $i$th signal waveform $S_i(t)$ is transmitted and the received signal point does not land in the region associated with the message point $S_i$. Designating this region by $R_i$ and the received signal point by $z$, the event $z$ falling inside the region $R_i$ will be written symbolically as $z \in R_i$ whereas the event $z$ falling outside the region $R_i$ will be denoted $z \notin R_i$. Averaging over all possibly transmitted signals it is readily seen that the average probability of error $P_e$ equals

$$P_e = \sum_{i=1}^{m} P(S_i \text{ sent})\,P(z \notin R_i / S_i) \tag{39}$$

where we are using standard notation to denote the probability of an event and the conditional probability of an event.$^{5}$ Now let us consider each of the six systems individually.

5 See, e.g., Davenport and Root, op. cit., pp. 7-13.

A. Coherent PSK

In the coherent PSK case, if $S_i(t)$ is transmitted, the received signal point $z$ has coordinates

$$x = \int_0^T z(t)\,\varphi_1(t)\,dt = \sqrt{E}\cos\frac{2\pi i}{m} + n_x \tag{40a}$$

$$y = \int_0^T z(t)\,\varphi_2(t)\,dt = -\sqrt{E}\sin\frac{2\pi i}{m} + n_y. \tag{40b}$$

Thus, $x$ and $y$ are independent Gaussian random variables with means $\sqrt{E}\cos 2\pi i/m$ and $-\sqrt{E}\sin 2\pi i/m$, respectively, and common variance $N_0$. Consequently, the probability that $z$ lands in $R_i$ when $S_i(t)$ is transmitted is given by

$$P[z \in R_i / S_i] = \frac{1}{2\pi N_0}\iint_{R_i}\exp\left\{-\frac{1}{2N_0}\left[\left(x - \sqrt{E}\cos\frac{2\pi i}{m}\right)^2 + \left(y + \sqrt{E}\sin\frac{2\pi i}{m}\right)^2\right]\right\}dx\,dy$$

which, transforming to polar coordinates with $x = \rho\sqrt{N_0}\cos\theta$ and $y = \rho\sqrt{N_0}\sin\theta$, may be written as

$$P[z \in R_i / S_i] = \frac{1}{2\pi}\iint_{R_i}\exp\left\{-\frac{1}{2}\left[\rho^2 - 2\rho\sqrt{\frac{E}{N_0}}\cos\left(\theta + \frac{2\pi i}{m}\right) + \frac{E}{N_0}\right]\right\}\rho\,d\rho\,d\theta. \tag{41}$$

But $R_i$, as may be deduced from Fig. 12, is simply the set of points satisfying the two conditions

$$0 < \rho < \infty$$

$$-\frac{2\pi i}{m} - \frac{\pi}{m} < \theta < -\frac{2\pi i}{m} + \frac{\pi}{m}.$$

Thus, substituting the appropriate limits of integration in (41), we get

$$P(z \in R_i / S_i) = \frac{1}{2\pi}\int_{-2\pi i/m - \pi/m}^{-2\pi i/m + \pi/m} d\theta \int_0^\infty d\rho\,\rho\,\exp\left\{-\frac{1}{2}\left[\rho^2 - 2\rho\sqrt{\frac{E}{N_0}}\cos\left(\theta + \frac{2\pi i}{m}\right) + \frac{E}{N_0}\right]\right\}$$

$$= \frac{1}{2\pi}\int_{-\pi/m}^{\pi/m} d\theta \int_0^\infty d\rho\,\rho\,\exp\left\{-\frac{1}{2}\left[\rho^2 - 2\rho\sqrt{\frac{E}{N_0}}\cos\theta + \frac{E}{N_0}\right]\right\}. \tag{42}$$

Note that (42) is independent of the choice of $i$. That is to say, the probability of interpreting the received signal correctly is the same regardless of which particular signal was transmitted. Therefore, the probability of error for the $m$-level PSK coherent system is simply equal to one minus the right-hand side of (42). That is,

$$P_e = 1 - \frac{1}{2\pi}\int_{-\pi/m}^{\pi/m} d\theta \int_0^\infty d\rho\,\rho\,\exp\left\{-\frac{1}{2}\left[\rho^2 - 2\rho\sqrt{\frac{E}{N_0}}\cos\theta + \frac{E}{N_0}\right]\right\}. \tag{43}$$

If we now note that

$$\rho^2 - 2\rho\sqrt{\frac{E}{N_0}}\cos\theta + \frac{E}{N_0} = \left(\rho - \sqrt{\frac{E}{N_0}}\cos\theta\right)^2 + \frac{E}{N_0}\sin^2\theta$$

and substitute this result into (43), we get

$$P_e = 1 - \frac{1}{2\pi}\int_{-\pi/m}^{\pi/m} d\theta\,\exp\left\{-\frac{1}{2}\frac{E}{N_0}\sin^2\theta\right\}\int_0^\infty d\rho\,\rho\,\exp\left\{-\frac{1}{2}\left[\rho - \sqrt{\frac{E}{N_0}}\cos\theta\right]^2\right\}$$

$$= 1 - \frac{1}{2\pi}\int_{-\pi/m}^{\pi/m} d\theta\,\exp\left\{-\frac{E}{2N_0}\sin^2\theta\right\}\left\{\exp\left\{-\frac{E}{2N_0}\cos^2\theta\right\} + \sqrt{\frac{E}{N_0}}\cos\theta\int_{-\sqrt{E/N_0}\cos\theta}^{\infty} e^{-x^2/2}\,dx\right\}. \tag{44}$$

Fig. 12-Illustration of the detection region $R_i$ corresponding to the signal $S_i(t)$ for the multilevel PSK coherent case.

Eq. (44) may be bounded quite easily for $m > 2$. For if $m > 2$, then $-\pi/2 < \theta < \pi/2$, which implies that $\cos\theta > 0$ and, therefore, that$^{6}$

$$\int_{-\sqrt{E/N_0}\cos\theta}^{\infty}\exp\left\{-\tfrac{1}{2}t^2\right\}dt > \sqrt{2\pi} - \frac{\exp\left\{-\frac{E}{2N_0}\cos^2\theta\right\}}{\sqrt{E/N_0}\,\cos\theta}. \tag{45}$$

6 This result follows from some elementary inequalities presented in W. Feller, "An Introduction to Probability Theory and its Applications," John Wiley and Sons, Inc., New York, N. Y., vol. 1, 2nd ed., pp. 164-166; 1959.

Combining (44) and (45) results in the bound

$$P_e < 1 - \frac{1}{\sqrt{2\pi}}\int_{-\pi/m}^{\pi/m}\sqrt{\frac{E}{N_0}}\cos\theta\,\exp\left\{-\frac{E}{2N_0}\sin^2\theta\right\}d\theta. \tag{46}$$

If we now let $x = \sqrt{E/N_0}\,\sin\theta$, then the right-hand side of (46) may be written as

$$1 - \frac{1}{\sqrt{2\pi}}\int_{-\sqrt{E/N_0}\sin\pi/m}^{\sqrt{E/N_0}\sin\pi/m} e^{-x^2/2}\,dx$$

yielding finally that

$$P_e < \frac{2}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}\sin\pi/m}^{\infty} e^{-x^2/2}\,dx. \tag{47}$$

Thus, we have established a simple upper bound to the probability of error for the case $m > 2$. We remark that if $E/N_0 \gg 1$, which is usually the case, the bound in (47) represents a good approximation to $P_e$.

A lower bound for the probability of error for $m > 2$ may be established quite easily by geometric reasoning. In particular, it may be deduced from the symmetry of the message points [though it has been shown formally in the discussion following (42)] that the average probability of error is equal to the probability of landing outside detection region $i$ when message point $i$ is sent. The probability of landing outside $R_i$, however, is larger than the probability of landing in the shaded planar region of Fig. 13. But the received point $z$ will only fall in the planar region if the component of noise perpendicular to the boundary line of the planar region exceeds $\sqrt{E}\sin\pi/m$. That is, designating the planar region by the symbol $B_1$,

$$P_e > P[z \in B_1]$$

but

$$P[z \in B_1] = \frac{1}{\sqrt{2\pi N_0}}\int_{\sqrt{E}\sin\pi/m}^{\infty} e^{-x^2/2N_0}\,dx.$$

Therefore,

$$P_e > \frac{1}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}\sin\pi/m}^{\infty} e^{-x^2/2}\,dx. \tag{48}$$

Note that if we were to consider also the shaded planar region of Fig. 14, which we shall denote $B_2$, then it should be clear that

$$P_e < P[z \in B_1] + P[z \in B_2],$$

from which we can conclude that

$$P_e < \frac{2}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}\sin\pi/m}^{\infty} e^{-x^2/2}\,dx.$$

The upper bound so calculated is identical to the one established previously [see (47)] with considerably more effort.

Now let us consider separately the cases $m = 2$ and $m = 4$, for which $P_e$ may be evaluated exactly. If $m = 2$, it may be deduced from Fig. 15 that the probability of error is equal to the probability that a Gaussian random variable of mean zero and variance $N_0$ exceeds $\sqrt{E}$. That is, when $m = 2$,

$$P_e = \frac{1}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{49}$$
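The bounds for $m > 2$ are easy to tabulate. The sketch below assumes they read $Q(s) < P_e < 2\,Q(s)$ with $s = \sqrt{E/N_0}\,\sin(\pi/m)$ and $Q$ the Gaussian tail integral, as in (47) and (48); the function names and the example value of $E/N_0$ are arbitrary choices.

```python
import math

def q_tail(x):
    """Gaussian tail: (1/sqrt(2*pi)) * integral from x to infinity of exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def psk_bounds(m, e_over_n0):
    """(lower, upper) bounds on P_e for m-level coherent PSK, m > 2."""
    s = math.sqrt(e_over_n0) * math.sin(math.pi / m)
    return q_tail(s), 2.0 * q_tail(s)

for m in (4, 8, 16):
    lower, upper = psk_bounds(m, 100.0)  # E/N0 = 100, i.e. 20 db (example value)
    print(m, lower, upper)
```

Since the two bounds differ only by a factor of two, they pin down the required signal-to-noise ratio for a prescribed error rate to within a small fraction of a decibel when $E/N_0$ is large.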


Fig. 13-Illustration of detection region $R_i$ for the multilevel coherent PSK case and the planar region $B_1$.

Fig. 14-Illustration of detection region $R_i$ for the multilevel PSK coherent case and the planar region $B_2$.

Fig. 15-Illustration of detection regions for 2-level PSK coherent system.

Fig. 16-Illustration of the detection region Ra for a 4-level PSK coherent system.

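The probability of error for the case $m = 4$ may be calculated most easily by resolving the noise into the two orthogonal directions $x'$ and $y'$ indicated on Fig. 16. It follows readily that an error will not occur if $n_{x'}$ and $n_{y'}$, the noise components in directions $x'$ and $y'$, each fall short of the distance $\sqrt{E}\sin(\pi/4) = \sqrt{E/2}$ separating the message point from the corresponding boundary of its detection region; but since $n_{x'}$ and $n_{y'}$ are independent noise components with zero mean and variance $N_0$, the probability of this event is simply

$$\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\sqrt{E/2N_0}} e^{-x^2/2}\,dx\right]^2.$$

Thus, the probability of error for the 4-level case equals

$$P_e = 1 - \left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\sqrt{E/2N_0}} e^{-x^2/2}\,dx\right]^2. \tag{50}$$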

The results of the preceding calculations for the probability of error of an $m$-level PSK coherent system are summarized below.

An exact expression for the probability of error in integral form is, by (44),

$$P_e = 1 - \frac{1}{2\pi}\int_{-\pi/m}^{\pi/m} d\theta\,\exp\left\{-\frac{E}{2N_0}\sin^2\theta\right\}\left\{\exp\left\{-\frac{E}{2N_0}\cos^2\theta\right\} + \sqrt{\frac{E}{N_0}}\cos\theta\int_{-\sqrt{E/N_0}\cos\theta}^{\infty} e^{-x^2/2}\,dx\right\}.$$

If $m > 2$, simple bounds for the probability of error are given by (47) and (48). That is,

$$\frac{1}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}\sin\pi/m}^{\infty} e^{-x^2/2}\,dx < P_e < \frac{2}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}\sin\pi/m}^{\infty} e^{-x^2/2}\,dx.$$

The geometrical reasoning used to establish these bounds is similar to that used to establish Theorem IV (see Appendix V). Theorem IV, however, yields a set of slightly weaker bounds when applied to this case, namely,

$$\frac{1}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}\sin\pi/m}^{\infty} e^{-x^2/2}\,dx \le P_e \le \frac{m-1}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}\sin\pi/m}^{\infty} e^{-x^2/2}\,dx.$$

We might point out, however, that Theorem IV is valid even for $m = 2$, in which case both the upper and lower bounds coincide yielding a check for (49), namely,

$$P_e = \frac{1}{\sqrt{2\pi}}\int_{\sqrt{E/N_0}}^{\infty} e^{-x^2/2}\,dx.$$

For the particular case $m = 4$, the probability of error is given by (50) as

$$P_e = 1 - \left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\sqrt{E/2N_0}} e^{-x^2/2}\,dx\right]^2.$$

As a final point it should be noted that both (49) and (50) could have been derived from (44).$^{7}$
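The consistency of the two-level and four-level results with the general integral can be checked numerically. The sketch below is a hypothetical helper, not part of the paper. It assumes (44) has the form $P_e = 1 - \frac{1}{2\pi}\int_{-\pi/m}^{\pi/m} e^{-(E/2N_0)\sin^2\theta}\{e^{-(E/2N_0)\cos^2\theta} + c\int_{-c}^{\infty} e^{-x^2/2}dx\}\,d\theta$ with $c = \sqrt{E/N_0}\cos\theta$, and evaluates it by Simpson's rule.

```python
import math

def q_tail(x):
    """Gaussian tail: (1/sqrt(2*pi)) * integral from x to infinity of exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def psk_error_exact(m, e_over_n0, steps=2000):
    """Evaluate the assumed form of (44) by Simpson's rule over -pi/m <= theta <= pi/m."""
    s = math.sqrt(e_over_n0)

    def integrand(theta):
        c = s * math.cos(theta)
        # integral from -c to infinity of exp(-x^2/2) dx = sqrt(2*pi) * (1 - Q(c))
        tail = math.sqrt(2.0 * math.pi) * (1.0 - q_tail(c))
        return math.exp(-0.5 * (s * math.sin(theta)) ** 2) * (math.exp(-0.5 * c * c) + c * tail)

    lo, hi = -math.pi / m, math.pi / m
    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = integrand(lo) + integrand(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lo + k * h)
    return 1.0 - (total * h / 3.0) / (2.0 * math.pi)

e_over_n0 = 4.0  # example value
print(psk_error_exact(2, e_over_n0), q_tail(math.sqrt(e_over_n0)))                          # two-level check
print(psk_error_exact(4, e_over_n0), 1.0 - (1.0 - q_tail(math.sqrt(e_over_n0 / 2.0))) ** 2)  # four-level check
```

In each printed pair the numerically integrated value should agree with the closed form to within the integration error.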

7 An ingenious, though somewhat involved, derivation of (50) from a form of (44) has been presented by E. A. Trabka, "Embodiments of the Maximum Likelihood Receiver for Detection of Coherent Phase Shift Keyed Signals," Detect Memo. No. 5A (Appendix) in "Investigation of Digital Data Communication Systems," J. G. Lawton, Ed., Cornell Aeronautical Lab., Inc., Ithaca, N. Y., Rept. No. UA-1420-S-1; January, 1961.

B. Coherent ASK

In the ASK coherent case, if $S_i(t)$ is transmitted, the received signal point $z$ has a single coordinate [see (11) and (12)],

$$x = \int_0^T z(t)\,\varphi(t)\,dt = \sqrt{E_i} + n. \tag{51}$$

Thus, $x$ is a Gaussian random variable with mean $\sqrt{E_i}$ and variance $N_0$. Let us assume in particular that the message points are uniformly spaced, starting with $\sqrt{E_1} = 0$. That is,

$$\sqrt{E_i} = (i - 1)\,\Delta, \qquad i = 1, 2, \ldots, m. \tag{52}$$

The corresponding detector signal space is shown in Fig. 17.

Fig. 17-Illustration of detector signal space for the coherent ASK case, assuming uniformly spaced message points starting with $\sqrt{E_1} = 0$.

It is readily deduced from Fig. 17 that if $1 < i < m$, then the probability of interpreting the transmitted signal incorrectly is simply

$$P[z \notin R_i / S_i] = \frac{2}{\sqrt{2\pi}}\int_{\Delta/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{53}$$

On the other hand, if $i = 1$ or $i = m$, then the probability of interpreting the transmitted signal incorrectly is

$$P[z \notin R_1 / S_1] = P[z \notin R_m / S_m] = \frac{1}{\sqrt{2\pi}}\int_{\Delta/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{54}$$

Now, combining (39), (53) and (54), we can compute the average probability of error $P_e$ to equal

$$P_e = \frac{2(m-1)}{m}\,\frac{1}{\sqrt{2\pi}}\int_{\Delta/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{55}$$

If we further assume that the transmitter is subject to an average power limitation of $E/T$ (watts), it follows that

$$\frac{1}{m}\sum_{i=1}^{m} E_i = E. \tag{56}$$

Substituting (52) into (56) and making use of the equality$^{8}$

$$\sum_{i=1}^{m}(i-1)^2 = \frac{(m-1)\,m\,(2m-1)}{6} \tag{57}$$

it is readily seen that the quantity $\Delta$ is constrained to equal

$$\Delta = \sqrt{\frac{6E}{(m-1)(2m-1)}}. \tag{58}$$

Consequently, the average probability of error (valid for all $m$) may be written as

$$P_e = \frac{2(m-1)}{m}\,\frac{1}{\sqrt{2\pi}}\int_{\sqrt{3E/[2N_0(m-1)(2m-1)]}}^{\infty} e^{-x^2/2}\,dx. \tag{59}$$
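The average-power constraint and the resulting error probability can be sketched as follows. The sketch assumes the spacing is $\Delta = \sqrt{6E/[(m-1)(2m-1)]}$ and the error probability is $\frac{2(m-1)}{m}\,Q(\Delta/2\sqrt{N_0})$, with $Q$ the Gaussian tail integral; the function names and numerical values are hypothetical.

```python
import math

def q_tail(x):
    """Gaussian tail: (1/sqrt(2*pi)) * integral from x to infinity of exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ask_spacing(m, energy):
    """Spacing Delta for which amplitudes (i-1)*Delta, i = 1..m, have average energy `energy`."""
    return math.sqrt(6.0 * energy / ((m - 1) * (2 * m - 1)))

def ask_coherent_error(m, energy, n0):
    """Average error probability of m-level coherent ASK under the stated assumptions."""
    delta = ask_spacing(m, energy)
    return 2.0 * (m - 1) / m * q_tail(delta / (2.0 * math.sqrt(n0)))

m, E, N0 = 4, 2.0, 0.05  # example values
delta = ask_spacing(m, E)
print(sum(((i - 1) * delta) ** 2 for i in range(1, m + 1)) / m)  # average energy, close to E
print(ask_coherent_error(m, E, N0))
```

For $m = 2$ the expression reduces to $Q(\sqrt{E/2N_0})$, the same value as the two-level coherent FSK result (65), as expected since both signal sets then consist of two points a distance $\sqrt{2E}$ apart.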

8 This equality may be verified by induction. See, e.g., G. Birkhoff and S. MacLane, "A Survey of Modern Algebra," The Macmillan Co., New York, N. Y., revised ed.; 1960. Note, in particular, example 5a, p. 13.

C. Coherent FSK

In the coherent FSK case, when $S_i(t)$ is transmitted, the received signal point $z$ has coordinates [see (11), (12) and (26)]

$$x_j = \int_0^T z(t)\,\varphi_j(t)\,dt = \begin{cases} \sqrt{E} + n_j, & j = i \\ n_j, & j \ne i \end{cases} \qquad j = 1, 2, \ldots, m. \tag{60}$$

The $x_j$ are independent Gaussian random variables with mean zero if $j \ne i$, with mean $\sqrt{E}$ if $j = i$, and each having variance $N_0$. The decision rule, namely, to choose the message point closest to the received signal point, is equivalent to choosing that value of $j$ for which $x_j$ is largest. This may be deduced by noting that if the received signal point $z$, with coordinates $x_1, x_2, \ldots, x_m$, is closer to, say, the signal point $S_i$ than to any other signal point, then

$$d^2(z, S_i) < d^2(z, S_j) \qquad \text{all } j \ne i.$$

That is to say, [by (5), (26) and (60)]

$$x_1^2 + \cdots + (x_i - \sqrt{E})^2 + \cdots + x_j^2 + \cdots + x_m^2 < x_1^2 + \cdots + x_i^2 + \cdots + (x_j - \sqrt{E})^2 + \cdots + x_m^2$$

but this is true if, and only if,

$$-2\sqrt{E}\,x_i < -2\sqrt{E}\,x_j;$$

that is, if, and only if,

$$x_i > x_j.$$

Since, when $S_i(t)$ is sent, each of the $x_j$ ($j \ne i$) are independent Gaussian random variables with mean zero and variance $N_0$, the probability that each of the $(m - 1)$ $x_j$ is less than $x_i$ is simply

$$P[x_j < x_i \text{ all } j \ne i \,/\, x_i, S_i] = \left[\frac{1}{\sqrt{2\pi N_0}}\int_{-\infty}^{x_i} e^{-u^2/2N_0}\,du\right]^{m-1}. \tag{61}$$

Hence, the probability of a correct decision when $S_i$ is sent is seen to equal

$$P[x_j < x_i \text{ all } j \ne i \,/\, S_i] = \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi N_0}}\exp\left\{-\frac{(x_i - \sqrt{E})^2}{2N_0}\right\}\left[\frac{1}{\sqrt{2\pi N_0}}\int_{-\infty}^{x_i} e^{-u^2/2N_0}\,du\right]^{m-1}dx_i. \tag{62}$$

The right-hand side of (62) is independent of the choice of $i$ and is, in fact, equal to the average probability of a correct decision. It follows, therefore, that the average probability of error

$$P_e = 1 - \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2}\left(x - \sqrt{\frac{E}{N_0}}\right)^2\right\}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}\exp\left\{-\tfrac{1}{2}u^2\right\}du\right]^{m-1}dx. \tag{63}$$

The integral appearing in (63) does not appear to be solvable in terms of standard functions for $m > 2$. It should be noted, however, that the integral has been tabulated for several values of $m$ and $\sqrt{E/N_0}$ by Urbano,$^{9}$ although not over the ranges which are considered to be of interest within this report. Fortunately, simple bounds for the average probability of error may be found quite easily by applying Theorem IV. In particular, noting [by (5) and (26)] that the distance between any two distinct message points $S_i$, $S_j$ is

$$d(S_i, S_j) = \sqrt{2E},$$

it follows that

$$\rho_i = \rho^* = \bar\rho = \sqrt{2E}$$

and, therefore, that

$$\frac{1}{\sqrt{2\pi}}\int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2}\,dx \le P_e \le \frac{m-1}{\sqrt{2\pi}}\int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{64}$$

9 R. H. Urbano, "Analysis and Tabulation of the M Positions Experiment Integral and Related Error Function Integrals," AF Cambridge Res. Ctr., Bedford, Mass., Tech. Rept. No. AFCRC TR-55-100; April, 1955.

Actually a tighter upper bound for $P_e$ has been derived by Fano$^{10}$ for use in a channel capacity argument. For our purposes, however, the considerably simpler, if less sophisticated, bounds presented above will be adequate.

Note that if, in particular, $m = 2$, the upper and lower bounds coincide. It follows, therefore, that in the two-level case,

$$P_e = \frac{1}{\sqrt{2\pi}}\int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{65}$$
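Although the exact expression for orthogonal coherent FSK has no closed form for $m > 2$, it is straightforward to evaluate numerically. The sketch below is illustrative only. It assumes the exact expression is $P_e = 1 - \int_{-\infty}^{\infty} g(x - \sqrt{E/N_0})\,[1 - Q(x)]^{m-1}\,dx$, where $g$ is the unit Gaussian density and $Q$ the Gaussian tail integral, truncates the range to ten standard deviations either side of the mean, and applies Simpson's rule. The result is compared with the bounds $Q(\sqrt{E/2N_0})$ and $(m-1)\,Q(\sqrt{E/2N_0})$.

```python
import math

def q_tail(x):
    """Gaussian tail: (1/sqrt(2*pi)) * integral from x to infinity of exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def gauss_pdf(x):
    """Unit-variance, zero-mean Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def fsk_coherent_error(m, e_over_n0, steps=4000, half_width=10.0):
    """Simpson's-rule evaluation of the assumed exact expression on a truncated range."""
    s = math.sqrt(e_over_n0)

    def integrand(x):
        return gauss_pdf(x - s) * (1.0 - q_tail(x)) ** (m - 1)

    lo = s - half_width
    h = 2.0 * half_width / steps  # steps must be even for Simpson's rule
    total = integrand(lo) + integrand(lo + 2.0 * half_width)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lo + k * h)
    return 1.0 - total * h / 3.0

e_over_n0 = 16.0  # example value
base = q_tail(math.sqrt(e_over_n0 / 2.0))
for m in (2, 4, 8):
    print(m, base, fsk_coherent_error(m, e_over_n0), (m - 1) * base)
```

For $m = 2$ the three printed values coincide (up to integration error), in agreement with the two-level result.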

D. PSK Incoherent

A distinguishing feature of the PSK incoherent case is the fact that there are no pre-assigned detection regions in the signal space, each of which corresponds to a particular transmitted signal. The decision, rather, is based on the phase angle between successively received signals. If the $i$th message has been sent, such a pair of successively received signals will, by (29a) and (29b), be of the form

$$z_1(t) = \sqrt{\frac{2E}{T}}\,\cos(\omega_0 t + \alpha) + n(t)$$

$$z_2(t - T) = \sqrt{\frac{2E}{T}}\,\cos\!\left(\omega_0 t + \alpha + \frac{2\pi i}{m}\right) + n(t).$$

Correspondingly, the measurement will (at least conceptually) be based on the pair of signal points $z_1$ and $z_2$, having coordinates $(x_1, y_1)$ and $(x_2, y_2)$, respectively, where, by (30a)-(30d), it is known that $x_1, y_1, x_2, y_2$ are of the form

$$x_1 = \sqrt{E}\cos\alpha + n_{11}$$

$$y_1 = -\sqrt{E}\sin\alpha + n_{12}$$

$$x_2 = \sqrt{E}\cos\!\left(\alpha + \frac{2\pi i}{m}\right) + n_{21}$$

$$y_2 = -\sqrt{E}\sin\!\left(\alpha + \frac{2\pi i}{m}\right) + n_{22}$$

where $n_{11}, n_{12}, n_{21}, n_{22}$ are independent Gaussian random variables with zero mean and variance $N_0$ and $\alpha$ is a uniformly distributed random variable over a $2\pi$ interval.

A possible set of received signal points corresponding to the case $x_1 = a_1$, $y_1 = b_1$, $x_2 = a_2$, $y_2 = b_2$, $\alpha = A$ is shown plotted in Fig. 18 as vectors. Each vector is represented as the sum of a signal vector and a noise vector.

10 R. M. Fano, op. cit., pp. 200-206.

The decision will be based on the angle $\psi$ which equals

$$\psi = \frac{2\pi i}{m} + \phi_2^* - \phi_1^*. \tag{66}$$

Fig. 18-Illustration of a pair of successively received signal points with coordinates $(a_1, b_1)$ and $(a_2, b_2)$, respectively.

An erroneous decision will be made if, and only if, the noise is such that

$$\left|\phi_2^* - \phi_1^*\right| > \frac{\pi}{m}.$$

Note that the angles $\phi_1^*$ and $\phi_2^*$ which are defined in Fig. 18 are samples of identically distributed random variables. Since the probability density of these random variables, which we shall denote by $\phi_1$ and $\phi_2$, may readily be determined, the probability density of the new random variable $\eta$,

$$\eta = \left|\phi_2 - \phi_1\right|, \tag{67}$$

can be calculated. Designating this density function by $p(\eta)$, it follows that the average probability of error

$$P_e = \int_{\pi/m}^{\pi} p(\eta)\,d\eta. \tag{68}$$

The approach to calculating the probability of error which has just been outlined has, in fact, been used by Fleck and Trabka.11 They have shown that12

(1 - cos 11 sin +)

and, correspondingly, that

cos r ) sin +) d+ d v . (70) 1 Since the manipulations required to establish (69) are

rather involved and since, furthermore, the derived expression for the probability of error, (70), is awkward to work with (except if $m = 2$, in which case it reduces to a more tractable form) unless some simplifying approximations are introduced, it is worthwhile to consider an alternate procedure for estimating the probability of error. This we shall now do.

11 J. T. Fleck and E. A. Trabka, "Error Probabilities of Multiple-State Differentially Coherent Phase Shift Keyed Systems in the Presence of White Gaussian Noise," Detect Memo. No. 2A in "Investigation of Digital Data Communication Systems," J. G. Lawton, Ed., Cornell Aeronautical Lab., Inc., Ithaca, N. Y., Rept. No. UA-1420-S-1; January, 1961.

12 Ibid., see (32) and (33) from which (69) of this paper follows. It should be noted that $p(\eta) = 2h(\eta)$ and $R = E/2N_0$ since we are using a double-sided noise spectrum.

Fig. 19-Decomposition of received signal vector into components parallel and perpendicular to the signal component of the received signal.

Initially let us calculate the probability density of the random variable $\phi_1$, corresponding to the angle $\phi_1^*$ defined in Fig. 18. The probability density of the random variable $\phi_2$ is, of course, the same. It is convenient to resolve the noise component of the corresponding received signal vector into components which are parallel and perpendicular to the signal component as illustrated in Fig. 19.

Designating the projections of the received signal on the directions parallel to and perpendicular to the signal component of the received signal as $\hat{a}$ and $\hat{b}$, respectively, it follows readily from Fig. 19 that

$$\hat{a} = \sqrt{E} + n_1^* \tag{71a}$$

$$\hat{b} = n_2^* \tag{71b}$$

where $n_1^*$ and $n_2^*$ are observed samples of the independent Gaussian random variables $n_1$, $n_2$, each of which has zero mean and variance $N_0$. That is to say, $\hat{a}$ and $\hat{b}$ are sample values of a pair of independent random variables which we shall denote by $\xi$ and $\zeta$. Since $\xi$ is Gaussian with mean $\sqrt{E}$ and variance $N_0$ and $\zeta$ is Gaussian with mean zero and variance $N_0$, the joint probability density is given by

$$p(\xi, \zeta) = \frac{1}{2\pi N_0}\exp\left\{-\frac{1}{2N_0}\left[(\xi - \sqrt{E})^2 + \zeta^2\right]\right\}. \tag{72}$$

Now, transforming to polar coordinates by means of the relationships

$$\xi = \nu\sqrt{N_0}\,\cos\phi_1 \tag{73a}$$

$$\zeta = \nu\sqrt{N_0}\,\sin\phi_1 \tag{73b}$$

we can express (72) in terms of the random variables $\nu$ and $\phi_1$. Since the Jacobian of the transformation is equal to $\nu N_0$, the joint probability density of $\nu$ and $\phi_1$, which we shall denote by $q(\nu, \phi_1)$, is equal to13

$$q(\nu, \phi_1) = p\left(\nu\sqrt{N_0}\,\cos\phi_1,\ \nu\sqrt{N_0}\,\sin\phi_1\right)\cdot\nu N_0.$$

That is, by (72),

13 Davenport and Root, op. cit., pp. 37-38.


Integrating out the $\nu$ dependency, we get, finally, the probability density of $\phi_1$, $q(\phi_1)$, equal to

The reader might find it interesting to compare the right-hand side of (75) with the integrand of (42) and to note accordingly that the probability of a correct decision in the PSK coherent case is equal to the probability that the angle between the transmitted signal vector and the received signal vector is less than $\pi/m$ radians in magnitude.

Now, recall [see (66) and preceding discussion] that a correct decision will be made by the detector if, and only if, $|\phi_1 - \phi_2| \le \pi/m$. The region in the $\phi_1 \times \phi_2$ space corresponding to a correct decision is illustrated by cross hatchings in Fig. 20 (we are assuming that $\phi_1$ and $\phi_2$ are restricted to lie between $-\pi$ and $\pi$ modulo $2\pi$). Neglecting edge effects, it may be deduced from Fig. 20 that the correct decision region is characterized by the condition

Thus, the probability of a correct decision is approximately equal to the probability that inequality (76) is satisfied. Unfortunately, the exact probability density of $\phi_1'$ is not readily determinable. For small $\phi_1'$, however, (75) yields a reasonably good approximation to the probability density of $\phi_1'$, as we shall now demonstrate.

It is clear from (75) that $q(\phi_1)$ is an even function of $\phi_1$ and that, in the interval $|\phi_1| \le \pi$, it takes on its maximum value at $\phi_1 = 0$ and decreases monotonically with $|\phi_1|$ to its minimum value at $|\phi_1| = \pi$.

Furthermore, following the procedure that was used to transform (43) into (44), (75) can be rewritten in the form

Substituting some particular values of $\phi_1$ into (77) to gauge the way in which $q(\phi_1)$ decreases as $\phi_1$ increases, we get14

14 Some plots of $q(\phi_1)$ for various values of $E/2N_0$ ($= S/N$ in his notation) have been presented by C. R. Cahn, "Performance of digital phase-modulation communication systems," IRE TRANS. ON COMMUNICATIONS SYSTEMS, vol. CS-7, pp. 3-6 (Fig. 2); May, 1959.

Fig. 20-Illustration of correct decision region in $\phi_1 \times \phi_2$ space.

d t ] . (78d) 27r

The point we wish to make is that in the case of high signal-to-noise ratio, say $E/2N_0 \gg 20$, $q(\phi_1)$ falls off quite rapidly as $|\phi_1|$ departs from the origin. If, in particular, $|\phi_1| < \pi/2$, then inequality (45) is valid. Treating this inequality as an approximate equality (the approximation improves as $\sqrt{E/2N_0}\,\cos\phi_1$ increases) and substituting into (77), we get

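$$q(\phi_1) = \frac{1}{\sqrt{\pi}}\sqrt{\frac{E}{2N_0}}\,\cos\phi_1\,\exp\left\{-\frac{E}{2N_0}\sin^2\phi_1\right\} \qquad \left(\text{if } |\phi_1| < \pi/2\right). \tag{79}$$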
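A quick numerical check on (79) may be helpful here. With the substitution $u = \sqrt{E/2N_0}\,\sin\phi_1$, the integral of (79) over $|\phi_1| \le w$ works out to $\operatorname{erf}(\sqrt{E/2N_0}\,\sin w)$, which is close to unity for $w = \pi/2$ at high signal-to-noise ratio. The following is only a sketch under that reading of (79); the function names and the shorthand `rho` for $E/2N_0$ are illustrative assumptions, not notation from the paper.

```python
import math


def q79(phi, rho):
    """High-SNR approximation (79) to the phase density; rho stands for E/(2*N0)."""
    return math.sqrt(rho / math.pi) * math.cos(phi) * math.exp(-rho * math.sin(phi) ** 2)


def integrate_q79(rho, half_width=math.pi / 2, steps=20000):
    """Trapezoidal integral of q79 over the interval |phi| <= half_width."""
    h = 2.0 * half_width / steps
    total = 0.5 * (q79(-half_width, rho) + q79(half_width, rho))
    for k in range(1, steps):
        total += q79(-half_width + k * h, rho)
    return total * h


def closed_form(rho, half_width=math.pi / 2):
    """Closed form of the same integral, obtained with u = sqrt(rho) * sin(phi)."""
    return math.erf(math.sqrt(rho) * math.sin(half_width))
```

Comparing `integrate_q79(rho)` with `closed_form(rho)` for a few values of `rho` shows how little probability mass the approximation leaves outside $|\phi_1| < \pi/2$ once $E/2N_0$ is large.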

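In the neighborhood of the origin

$$\sin\phi_1 \approx \phi_1$$

and

$$\cos\phi_1 \approx 1$$

and, correspondingly,

$$q(\phi_1) \approx \frac{1}{\sqrt{\pi}}\sqrt{\frac{E}{2N_0}}\,\exp\left\{-\frac{E}{2N_0}\,\phi_1^2\right\}.$$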

That is to say, the probability density of $\phi_1$ is approximately Gaussian in the region where it has the most weight, namely, near the origin. Correspondingly, the joint distribution

$$q(\phi_1, \phi_2) = q(\phi_1)\,q(\phi_2)$$

is approximately circularly symmetric in the same region and, thus, for small $\phi_1'$ the probability density of $\phi_1'$ is approximately given by (79). Since the probability of error for the case $m = 2$ may be evaluated exactly as will be shown below, we shall only assume (79) to be a valid representation for the probability density of $\phi_1'$


when $m \ge 4$ or, correspondingly, by (76), when

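The probability of a correct decision when $m \ge 4$ is thus

$$1 - P_e \approx \frac{1}{\sqrt{\pi}}\sqrt{\frac{E}{2N_0}}\int_{-\pi/(\sqrt{2}\,m)}^{\pi/(\sqrt{2}\,m)}\cos\phi_1'\,\exp\left\{-\frac{E}{2N_0}\sin^2\phi_1'\right\}d\phi_1'. \tag{81}$$

Transposing (81) and introducing a new variable

$$u = \sqrt{\frac{E}{2N_0}}\,\sin\phi_1',$$

we get, finally, that for $m \ge 4$ the probability of error is approximately equal to

$$P_e \approx \frac{2}{\sqrt{\pi}}\int_{\sqrt{E/2N_0}\,\sin\left(\pi/(\sqrt{2}\,m)\right)}^{\infty} e^{-u^2}\,du. \tag{82}$$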
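The integral in (82) is the complementary error function, so the approximation can be evaluated directly. The sketch below assumes that reading of (82); the function name and the argument `snr` (standing for $E/2N_0$ as a plain ratio, not in decibels) are illustrative choices rather than anything defined in the paper.

```python
import math


def psk_incoherent_pe(snr, m):
    """Approximate waveform error probability (82) for m-level PSK incoherent.

    snr stands for E/(2*N0) as a linear ratio. The approximation is only
    claimed for m >= 4, so smaller m is rejected.
    """
    if m < 4:
        raise ValueError("approximation (82) is stated for m >= 4")
    lower_limit = math.sqrt(snr) * math.sin(math.pi / (math.sqrt(2.0) * m))
    # (2/sqrt(pi)) * integral from lower_limit to infinity of exp(-u^2) du
    return math.erfc(lower_limit)
```

For a fixed `snr`, calling the function with increasing `m` shows the error probability rising as the message points crowd together on the circle.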

We shall now consider the case $m = 2$. From our previous discussion it should be clear that the probability of error is dependent only on the angles $\phi_1$ and $\phi_2$ and not on the phase difference of $2\pi i/m$ introduced between successive signals at the transmitter. Accordingly, in examining the two-level case it is sufficient to calculate the probability of error for the particular case where the transmitter keeps sending the same waveform. That is to say, we shall consider the case for which the signal components of the successively received signal points $z_1$ and $z_2$ coincide. Assuming now that the angle $\phi_1$ is known and is equal to, say, $\phi_1^*$ and that the signal point $z_1$ has coordinates $(a_1, b_1)$, it may be deduced from Fig. 21 that an error will occur if, and only if, the noise component of $z_2$ in a direction parallel to the orientation of the stored vector $z_1$ exceeds $\sqrt{E}\cos\phi_1^*$.15 That is,

$$P[\text{error}/\phi_1 = \phi_1^*] = \frac{1}{\sqrt{2\pi N_0}}\int_{\sqrt{E}\cos\phi_1^*}^{\infty} e^{-x^2/2N_0}\,dx. \tag{83}$$

The average probability of making an error is, however, equal to

$$P_e = \int_{-\pi}^{\pi} P[\text{error}/\phi_1 = \phi_1^*]\,q(\phi_1^*)\,d\phi_1^*. \tag{84}$$

Substituting (77) and (83) into (84) thus yields

15 Ibid., this approach to calculating the probability of error has been suggested here. An alternate technique involving a reduction of (70) is presented in Fleck and Trabka, op. cit.

Fig. 21-Illustration of necessary and sufficient conditions for an error to occur in the 2-level PSK incoherent case, given that $\phi_1 = \phi_1^*$ and that $i = m$.

Defining the quantity

Now, by simple symmetry arguments it may be deduced that

The expression for the average probability of error, (87), thus reduces to

E. ASK Incoherent

In the ASK incoherent case, if $S_i(t)$ is transmitted, the received signal point $z$ has coordinates of the form [see (34a) and (34b)]

$$x = \int_0^T z(t)\,\varphi_1(t)\,dt = \sqrt{E_i}\,\cos\alpha + n_x$$

$$y = \int_0^T z(t)\,\varphi_2(t)\,dt = -\sqrt{E_i}\,\sin\alpha + n_y$$

where $n_x$ and $n_y$ are independent Gaussian random variables with zero mean and variance $N_0$ and $\alpha$ is a random variable uniformly distributed over a $2\pi$ interval.

The conditional joint density function of the random variables $x$ and $y$, given that $\alpha = A$ and that $S_i(t)$ was transmitted, is thus equal to

Correspondingly, averaging over all possible values of $A$, we get the conditional joint density function of $x$ and $y$, given only that $S_i(t)$ was transmitted, equal to

where $I_0$ is the modified Bessel function of the first kind of zero order.16

As previously noted [see discussion preceding and following (35)], the decision rule for this case is simply to round off the measured value of $|z| = \sqrt{x^2 + y^2}$ to the nearest value of $\sqrt{E_i}$ and guess that the corresponding signal was transmitted. Thus, if $S_i(t)$ is transmitted, the received signal will be interpreted correctly if, and only if,

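$$\frac{\sqrt{E_{i-1}} + \sqrt{E_i}}{2} < |z| < \frac{\sqrt{E_i} + \sqrt{E_{i+1}}}{2}, \qquad i = 1, 2, \ldots, m \tag{94}$$

where we define $\sqrt{E_0} = -\sqrt{E_1}$ and $\sqrt{E_{m+1}} = \infty$.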
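The round-off rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in the text (pick the amplitude nearest to the measured envelope); the function name, the 0-based indexing and the example amplitude list are assumptions introduced here.

```python
import math


def ask_incoherent_decide(x, y, amplitudes):
    """Return the 0-based index of the amplitude nearest to |z| = sqrt(x^2 + y^2).

    amplitudes is the list of sqrt(E_i) values, assumed sorted and non-negative.
    """
    envelope = math.hypot(x, y)
    return min(range(len(amplitudes)), key=lambda i: abs(envelope - amplitudes[i]))
```

With uniformly spaced amplitudes such as `[0.0, 1.0, 2.0, 3.0]`, every envelope beyond the largest amplitude maps to the last index, which mirrors the convention that the outermost detection zone extends to infinity.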

Now the probability that inequality (94) is satisfied, i.e., the probability that the received signal point $z$ falls into the detection zone $R_i$ when $S_i(t)$ is transmitted, may be expressed as

Ri H i

Transforming to polar coordinates with

we get,

$P[z \in R_i/S_i]$

16 F. Bowman, "Introduction to Bessel Functions," Dover Publications, Inc., New York, N. Y., pp. 41-42; 1958.

Since the region Ri is defined by the equations

Unfortunately, however, this integral is not simply solvable except for the particular case $\sqrt{E_i} = 0$ nor has it been tabulated over the regions of interest.17

It is possible, however, to bound integrals of this type from above and below by analyzing the integral geometrically and then modifying the regions of integration. Thus, for example, we might note that the integral in question represents the probability that a two-dimensional spherically symmetric Gaussian noise vector with mean zero and variance $N_0$, originating from some point lying on the circumference of a circle of radius $\sqrt{E_i}$, falls inside the ring determined by the circles (concentric with the first) of radius

respectively.

The probability of the noise vector falling inside the ring, however, is certainly larger than the probability of it falling inside a circle centered on the noise origin (see Fig. 22), whose radius is chosen as large as possible subject only to the constraint that the circle must lie in the ring.

On the other hand, the probability of landing in the ring is certainly less than the probability of landing in the shaded region of Fig. 22(c). Both these probabilities are easily evaluated.

The probability that the noise lands inside this circle [see Fig. 22(b)], which we shall denote by $\underline{P}(i)$, is equal to

17 Eq. (98) may be expressed as the difference of 2 Q functions. The Q function is defined in J. I. Marcum and P. Swerling, "Studies of Target Detection by Pulsed Radar," IRE TRANS. ON INFORMATION THEORY (Special Monograph Issue), vol. IT-6, p. 159; April, 1960. The Q function has been tabulated, though not within the ranges of interest of this report, by J. I. Marcum, "Table of Q Functions," Rand Corp., Santa Monica, Calif., Rept. RM-339; January 1, 1950.


Fig. 22-Geometric interpretation of (98). (a) The right-hand side of (98) is equal to the probability that the two-dimensional spherically symmetric Gaussian noise vector, with coordinates $n_1$, $n_2$, each having zero mean and variance $N_0$, lands inside the shaded region. (b) The probability of this event is certainly larger than the probability of landing in this shaded region. (c) The probability of this event is certainly smaller than the probability of landing in this shaded region.

The probability that the noise lands in the shaded region of Fig. 22(c), $\overline{P}(i)$, is equal to

P(i) = 1 - - e-" " ax. (100)

Note that the quantities a,, &, a3, &, which are defined in Fig. 22 [see also the remark following (94)], are functions of i, the parameter used to index the possibly transmitted signals.

We have established in (99) and (100) lower and upper bounds, respectively, to the probability of the event of interest as expressed by (98). That is to say,

- 4 2 T 1- ( m - f i l x ) / G

$$\underline{P}(i) \le P[z \in R_i/S_i] < \overline{P}(i). \tag{101}$$

Now if, in particular, we assume that the message points are uniformly spaced starting with zero, then, by (52),

$$\sqrt{E_i} = (i - 1)\Delta, \qquad i = 1, 2, \ldots, m$$

where, assuming the same average power constraints as in the ASK coherent case [see (58)],

$$\Delta = \sqrt{\frac{6E}{(m - 1)(2m - 1)}}.$$

It follows readily by direct substitution that the radius of the largest circle lying in the ring is $\Delta/2$ independently of the choice of $i$ and, therefore, that

$$\underline{P}(i) = 1 - e^{-\Delta^2/8N_0}. \tag{102}$$

The average probability of a correct decision is equal to

$$1 - P_e = \frac{1}{m}\sum_{i=1}^{m} P[z \in R_i/S_i] \tag{103}$$

and is, therefore, by (101) and (102), bounded from below by

$$1 - P_e > 1 - e^{-\Delta^2/8N_0}. \tag{104}$$

An upper bound to $1 - P_e$ can be obtained in a similar fashion by evaluating (100) for different values of $i$. The resultant expression is, however, rather cumbersome. A satisfactory upper bound can be obtained quite simply by noting that since

$$P[z \notin R_i/S_i] > 0,$$

it is certainly true that [see (39)]

That is,

$$P_e > \frac{1 - P[z \in R_1/S_1]}{m} \tag{105}$$

but, for the case $i = 1$, (98) reduces to

$$P[z \in R_1/S_1] = 1 - e^{-\Delta^2/8N_0}.$$

Substituting this latter result into (105) yields

$$P_e > \frac{1}{m}\,e^{-\Delta^2/8N_0}. \tag{106}$$

Finally combining (104), (106) and (58), we get the average probability of error bounded by

$$\frac{1}{m}\exp\left\{-\frac{6E}{8N_0(m - 1)(2m - 1)}\right\} < P_e < \exp\left\{-\frac{6E}{8N_0(m - 1)(2m - 1)}\right\}. \tag{107}$$

The bounds are valid for all $m \ge 2$.
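The two bounds in (107) differ only by the factor $1/m$, which is easy to see numerically. The sketch below evaluates them under the assumption that (107) reads as reconstructed above; the function name and the argument `snr` (standing for $E/2N_0$ as a linear ratio) are illustrative.

```python
import math


def ask_incoherent_bounds(snr, m):
    """Return (lower, upper) bounds on P_e from (107) for m-level ASK incoherent.

    snr stands for E/(2*N0), so E/N0 = 2*snr and the exponent
    6E / (8*N0*(m-1)*(2m-1)) becomes 6*snr / (4*(m-1)*(2m-1)).
    """
    if m < 2:
        raise ValueError("the bounds are stated for m >= 2")
    exponent = 6.0 * snr / (4.0 * (m - 1) * (2 * m - 1))
    upper = math.exp(-exponent)
    return upper / m, upper
```

On a logarithmic error scale the gap between the bounds is a constant $\log_{10} m$, which is one way to read why plotting both bounds still gives a usable curve.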

F. FSK Incoherent

In the FSK incoherent case, if $S_i(t)$ is transmitted, the received signal point $z$ has, by (37a) and (37b), coordinates of the form

$$x_j = \begin{cases} n_{xj}, & j \ne i \\ \sqrt{E}\,\cos\alpha + n_{xi}, & j = i \end{cases} \qquad y_j = \begin{cases} n_{yj}, & j \ne i \\ -\sqrt{E}\,\sin\alpha + n_{yi}, & j = i \end{cases}$$

where the $n_{xj}$ and $n_{yj}$, $j = 1, 2, \ldots, m$ are independent Gaussian random variables with mean zero and variance $N_0$ and $\alpha$ is a random variable uniformly distributed over a $2\pi$ interval.

A correct decision will be made by the detector when $S_i(t)$ is sent, if, and only if, for all $j \ne i$ the inequality

$$\sqrt{x_j^2 + y_j^2} < \sqrt{x_i^2 + y_i^2} \tag{108}$$


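is satisfied. The probability density of the random variable $\sqrt{x_j^2 + y_j^2}$, $j = 1, 2, \ldots, m$ may be established readily by converting the joint density of $x_j$ and $y_j$ to polar form and then integrating out the angular dependency. In particular, if $j = i$, the joint conditional density of $x_i$, $y_i$, given that $S_i(t)$ was transmitted and that $\alpha = A$, is equal to

$$p_i(x_i, y_i/\alpha = A) = \frac{1}{2\pi N_0}\exp\left\{-\frac{1}{2N_0}\left[(x_i - \sqrt{E}\cos A)^2 + (y_i + \sqrt{E}\sin A)^2\right]\right\}. \tag{109}$$

Averaging over all possible values of $A$ to get the joint conditional density of $x_i$ and $y_i$, given only that $S_i(t)$ was transmitted, yields

$$p_i(x_i, y_i) = \frac{1}{2\pi N_0}\exp\left\{-\frac{1}{2N_0}\left[x_i^2 + y_i^2 + E\right]\right\}\cdot\frac{1}{2\pi}\int_0^{2\pi}\exp\left\{\frac{x_i\sqrt{E}\cos A - y_i\sqrt{E}\sin A}{N_0}\right\}dA$$

$$= \frac{1}{2\pi N_0}\exp\left\{-\frac{1}{2N_0}\left[x_i^2 + y_i^2 + E\right]\right\}I_0\left(\frac{\sqrt{E(x_i^2 + y_i^2)}}{N_0}\right), \tag{110}$$

where $I_0$ is the modified Bessel function of the first kind of zero order.16

Letting $x_i = \nu_i\sqrt{N_0}\cos\phi_i$, $y_i = \nu_i\sqrt{N_0}\sin\phi_i$ and noting that the Jacobian of the transformation is equal to $\nu_i N_0$, the joint probability density of the random variables $\nu_i$ and $\phi_i$, $q(\nu_i, \phi_i)$, which is equal to13

$$q(\nu_i, \phi_i) = \nu_i N_0\cdot p\left(\nu_i\sqrt{N_0}\cos\phi_i,\ \nu_i\sqrt{N_0}\sin\phi_i\right),$$

may be written as

$$q(\nu_i, \phi_i) = \frac{\nu_i}{2\pi}\exp\left\{-\frac{1}{2}\left[\nu_i^2 + \frac{E}{N_0}\right]\right\}I_0\left(\nu_i\sqrt{\frac{E}{N_0}}\right).$$

Therefore, the probability density of $\nu_i$, $q(\nu_i)$, is equal to

$$q(\nu_i) = \nu_i\exp\left\{-\frac{1}{2}\left[\nu_i^2 + \frac{E}{N_0}\right]\right\}I_0\left(\nu_i\sqrt{\frac{E}{N_0}}\right). \tag{112}$$

If $j \ne i$, then the joint density of the random variables $x_j$ and $y_j$ is equal to the right-hand side of (110) with $E$ set equal to zero. Correspondingly, the probability density of the random variable $\nu_j$ is equal to the right-hand side of (112) with $E$ set equal to zero. That is,

$$q(\nu_j) = \nu_j e^{-\nu_j^2/2} \qquad (j \ne i). \tag{113}$$

Condition (108) is equivalent to requiring that

$$\nu_j < \nu_i \tag{114}$$

for all $j \ne i$. Since the $\nu_j$ are independent random variables, the probability that inequality (114) is satisfied under the condition that $\nu_i$ is known is simply

$$P[\nu_j < \nu_i,\ \text{all } j \ne i/\nu_i] = \prod_{j \ne i} P[\nu_j < \nu_i/\nu_i]. \tag{115}$$

But the probability that for any particular $j \ne i$ that $\nu_j < \nu_i$, given $\nu_i$, is equal to [by (113)]

$$P[\nu_j < \nu_i/\nu_i] = \int_0^{\nu_i}\nu_j e^{-\nu_j^2/2}\,d\nu_j = 1 - e^{-\nu_i^2/2}. \tag{116}$$

Substituting (116) into (115) yields

$$P[\nu_j < \nu_i,\ \text{all } j \ne i/\nu_i] = \left(1 - e^{-\nu_i^2/2}\right)^{m-1}. \tag{117}$$

Consequently, the probability of a correct decision when $S_i(t)$ is sent, which is given by

$$P[\nu_j < \nu_i,\ \text{all } j \ne i] = \int_0^{\infty} P[\nu_j < \nu_i,\ \text{all } j \ne i/\nu_i]\,q(\nu_i)\,d\nu_i,$$

is equal to [see (112) and (117)]

$$P[\nu_j < \nu_i,\ \text{all } j \ne i] = \int_0^{\infty}\nu_i\exp\left\{-\frac{1}{2}\left[\nu_i^2 + \frac{E}{N_0}\right]\right\}I_0\left(\nu_i\sqrt{\frac{E}{N_0}}\right)\left[1 - e^{-\nu_i^2/2}\right]^{m-1}d\nu_i. \tag{118}$$

Utilizing the binomial expansion, we can write

$$\left[1 - e^{-\nu_i^2/2}\right]^{m-1} = \sum_{k=0}^{m-1}(-1)^k\binom{m-1}{k}e^{-k\nu_i^2/2},$$

which result, when combined with (118), yields

$$P[\nu_j < \nu_i,\ \text{all } j \ne i] = \sum_{k=0}^{m-1}(-1)^k\binom{m-1}{k}\exp\left\{-\frac{E}{2N_0}\right\}\int_0^{\infty}\nu_i\exp\left\{-\frac{1}{2}(1 + k)\nu_i^2\right\}I_0\left(\nu_i\sqrt{\frac{E}{N_0}}\right)d\nu_i. \tag{119}$$

The integral appearing in the right-hand side of (119) is a standard form whose solution is18

$$\int_0^{\infty}\nu_i\exp\left\{-\frac{1}{2}(1 + k)\nu_i^2\right\}I_0\left(\nu_i\sqrt{\frac{E}{N_0}}\right)d\nu_i = \frac{1}{k + 1}\exp\left\{\frac{E}{2N_0(k + 1)}\right\}. \tag{120}$$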

18 G. N. Watson, “A Treatise on the Theory of Bessel Functions,” 2nd ed., Cambridge University Press, Cambridge, England, Sec. 13.3, p. 393, Eq. (1); 19513.


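Combining (119) and (120), we get the probability of a correct decision when $S_i(t)$ is transmitted equal to

$$P[\nu_j < \nu_i,\ \text{all } j \ne i] = \exp\left\{-\frac{E}{2N_0}\right\}\sum_{k=0}^{m-1}(-1)^k\binom{m-1}{k}\frac{1}{k + 1}\exp\left\{\frac{E}{2N_0(k + 1)}\right\}. \tag{121}$$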
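The finite sum obtained by combining (119) and (120) is straightforward to evaluate. The sketch below computes one minus that sum, merging the two exponentials into $\exp\{-(E/2N_0)\,k/(k+1)\}$ so that no large intermediate values arise. The function name and the argument `snr` (standing for $E/2N_0$ as a linear ratio) are illustrative assumptions.

```python
import math


def fsk_incoherent_pe(snr, m):
    """Waveform error probability for m-level orthogonal FSK incoherent.

    Computed as 1 minus the probability of a correct decision, with
    snr standing for E/(2*N0). The factor exp(-snr) * exp(snr/(k+1))
    is combined into exp(-snr * k / (k + 1)) to avoid overflow.
    """
    if m < 2:
        raise ValueError("at least two waveforms are required")
    p_correct = sum(
        (-1) ** k * math.comb(m - 1, k) / (k + 1) * math.exp(-snr * k / (k + 1))
        for k in range(m)
    )
    return 1.0 - p_correct
```

Two limiting cases are useful checks: for `m = 2` the sum collapses to a single exponential term, and for `snr = 0` the detector is guessing, so the result should be `1 - 1/m`.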

The right-hand side of (121) is, however, independent of the choice of i and is, therefore, in fact, equal to the average probability of a correct decision. The average probability of error for an m-level incoherent (orthogonal) FSK system is, thus, equal to

Noting that (122) can be written as

and that

we can, introducing a new summation index, q = k + 1, rewrite the expression for the average probability of error in final form as

VI. CONCLUSIONS

A. Review of Basic Assumptions

In the preceding sections of this paper, an approach to the problem of optimally detecting a set of known waveforms in a stationary white Gaussian environment has been presented. Utilizing this approach, three basic data transmission systems have been studied and a set of error characteristics has been derived. Curves wherein the probability of error (actually $\log_{10} P_e$) is plotted as a function of signal-to-noise ratio (in decibels), the number of levels $m$ appearing as a parameter, are presented in Figs. 23-28 (pages 358-359). Before proceeding to a detailed discussion of these curves, however, we wish to review explicitly the assumptions upon which they are based.

In particular we have assumed that:

1) Each signal waveform is transmitted with equal probability.

2) The transmitter is subject to an average power limitation, E / T (watts), where T is the duration of each transmitted signal waveform.

3) The received signal is the sum of the transmitted signal and a noise term, the noise being stationary, white, zero-mean Gaussian with double-sided spectral density $N_0$ (watts/cps). That is to say, the noise power passed by an ideal filter with unit gain and (positive) bandwidth $W$ is $2N_0W$ watts.

4) The receiver is in time synchronism with the transmitter, by which we mean to say that the receiver knows when to sample and when to quench the product integrators. When, in addition, it is assumed that the receiver is phase locked to the transmitter, the system is referred to as coherent.

5) The received signal is processed by a maximum-likelihood detector except in the ASK incoherent case. For reasons of simplicity, an approximation to the maximum-likelihood detector which approaches the true maximum-likelihood detector in a high signal-to-noise ratio environment was chosen for this case.

6) In the FSK case and the PSK case, the transmitted signal waveforms, which are sinusoidal pulses, each contain equal energy $E$ whereas in the ASK case the amplitudes (that is, the square root of the energy) of the transmitted pulses are uniformly spaced starting with zero. Furthermore, in the FSK cases the transmitted waveforms are orthogonal.

B. Physical Significance of $P_e$

It should be particularly noted that the calculated quantity designated as $P_e$ represents the average probability of misinterpreting the transmitted waveforms. That is to say, if, in a long period of time $KT$, $K$ waveforms are transmitted and, say, $L$ of them are misinterpreted by the detector, then $P_e$ will (almost always) be approximately equal to

$$\frac{L}{K},$$

the approximation becoming better as $K \to \infty$.

The point we wish to emphasize is that $P_e$, which is sometimes also referred to as the character error, is not, in general, equal to the probability that a single binary symbol is received incorrectly or that a binary sequence of some arbitrary length is received incorrectly. Thus, we might note, for example, that if, in an 8-level system the waveform corresponding to the binary sequence 001 is misinterpreted for the waveform corresponding to the binary sequence 011, a single character error has been made but two out of three of the binary digits have been received correctly. It is, of course, possible for a single character error to correspond to two or three binary errors (in the 8-level case). Whether or not such a character error is more serious than the other kind depends upon the particular coding scheme used and cannot be predicted in advance. In one coding scheme, for example, we might be interested in the probability that binary sequences of length 6 are received correctly (alpha-numeric code). If an 8-level system is utilized in this situation, each binary sequence of length 6 must be associated with a pair of waveforms which will then be transmitted in succession. Correspondingly, the sequence will be received correctly if, and only if, both the associated waveforms are received correctly. The probability of receiving such a sequence incorrectly is, therefore, $1 - (1 - P_e)^2$, which for small $P_e$ is approximately equal to $2P_e$.

Fig. 23-Probability of waveform error (m-level PSK coherent), assuming that the duration of each signal is fixed independently of m.

Fig. 24-Probability of waveform error (m-level ASK coherent), assuming that the duration of each signal is fixed independently of m.

Fig. 25-Probability of waveform error (m-level FSK coherent), assuming that the duration of each signal is fixed independently of m.

Fig. 26-Probability of waveform error (m-level PSK incoherent), assuming that the duration of each signal is fixed independently of m.

Fig. 28-Probability of waveform error (m-level FSK incoherent), assuming that the duration of each signal is fixed independently of m.

In the last cited example, it was a relatively simple task to derive an expression for the probability of the event of interest, namely, that a binary sequence of length 6 is received incorrectly, in terms of $P_e$. In general, however, the events of interest will not be simply related to $P_e$. In fact, it may be impossible to calculate the probability of certain events such as the probability of a binary error without resorting to a detailed analysis of the probability with which various types of waveform errors can occur19 (e.g., what is the probability that if waveform number one is sent it is interpreted as waveform number two, number three, etc.?). Nevertheless, a set of inequalities which adequately relate the probability of the event of interest to the probability of a character error may frequently be established quite easily. Thus, if we are interested in the probability of a binary error we might note that if $K$ waveforms of an $m$-level system are sent in a period of time $KT$, then $K\log_2 m$ (which is assumed to be an integer) binary symbols are sent in that time. If $L$ out of $K$ waveforms are misinterpreted by the receiver, then at least $L$ but no more than $L\log_2 m$ binary symbols are received incorrectly. Consequently, the ratio $r$ of the number of binary symbols received incorrectly to the number transmitted is bounded by

$$\frac{L}{K\log_2 m} \le r \le \frac{L\log_2 m}{K\log_2 m}.$$

Now, as $K$ becomes large $L/K$ approaches $P_e$, the probability of a waveform error, and $r$ approaches $P_b$, the probability of a binary error. In the limit, therefore, as $K$ approaches infinity we have

$$\frac{P_e}{\log_2 m} \le P_b \le P_e. \tag{124}$$

19 Some special cases have been considered by J. K. Wolf, "Comparison of N-ary Transmission Systems," Rome Air Dev. Ctr., Rome, N. Y., Rept. No. RADC-TN-60-210; December, 1960.
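The two relations discussed in this subsection, the bounds of (124) on the binary error probability and the sequence error probability of the 6-bit example, can be written as small helper functions. This is a minimal sketch; the function names are illustrative and `m` is assumed to be a power of two so that $\log_2 m$ is an integer, as the text requires.

```python
import math


def binary_error_bounds(p_e, m):
    """Return (lower, upper) bounds on the binary error probability per (124)."""
    bits_per_waveform = math.log2(m)
    return p_e / bits_per_waveform, p_e


def sequence_error(p_e, waveforms_per_sequence):
    """Probability that at least one waveform of a sequence is misread.

    Assumes waveform errors are independent, as in the 6-bit example
    where each sequence is carried by two 8-level waveforms.
    """
    return 1.0 - (1.0 - p_e) ** waveforms_per_sequence
```

For the 8-level, 6-bit example, `sequence_error(p_e, 2)` stays close to `2 * p_e` whenever `p_e` is small, which is the approximation quoted in the text.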


C. Accuracy of Presented Curves

Regarding the accuracy of the curves presented, it should be noted that in all cases except PSK incoherent ($m > 2$) either an exact expression for $P_e$ or an upper and lower bound to $P_e$ is plotted. Unfortunately, for the PSK incoherent case simple bounds could not be found with the exception of the 2-level case which was evaluated exactly. Correspondingly, the curves presented for the PSK incoherent case ($m > 2$) represent simply an approximation to the actual probability of error, the approximation being quite good for large values of $m$.

We might point out that the plotted curves cannot be used to sharply determine the probability of error for a particular modulation scheme, given $E/2N_0$ and $m$. The reader can verify this himself simply by trial and error, noting in particular that a small uncertainty in $\log P_e$ results in a much larger uncertainty in $P_e$. The inverse problem, which is perhaps the more natural, can be handled much more satisfactorily. That is to say, given a value of $P_e$ which it is desired to maintain and a value of $m$, the needed signal-to-noise ratio ($E/2N_0$) can be determined quite closely.

D. Comparisons Assuming Fixed Waveform Duration

Now, referring to the curves appearing on Figs. 23-28, there are a few general conclusions which can be drawn. In the first place, note that increasing $m$ (the number of waveforms which may be transmitted in any time $T$) tends to increase the probability of error whereas increasing the energy content in each transmitted signal (i.e., the signal-to-noise ratio) tends to decrease the probability of error. Furthermore, it should be noted that increasing $m$ introduces the most degradation in the ASK case, somewhat less degradation in the PSK case and comparatively little degradation in the FSK case. Geometrically, the reason for this is clear. In all cases, increasing $m$ increases the number of message points. In the ASK case, these points lie in a one-dimensional space; in the PSK case, they lie in a two-dimensional space whereas in the FSK case, the dimension of the space increases linearly with $m$. Correspondingly, in the first two cases the points (since we are subject to an average power limitation) become crowded together and the frequency of errors increases. In the FSK case, however, all message points may be maintained equidistant. Consequently, there is little deterioration in performance with increasing $m$. In fact, we might expect that if $m$ is sufficiently large, the average probability of error of an FSK system should be smaller than that of a PSK or an ASK system (if it is not so already). That this is indeed the case may readily be deduced from Figs. 29-31 wherein some cross plots of the error rates for various systems are presented for given values of $m$. For the sake of clarity only four curves are presented in each diagram rather than the full six. The selected curves are, however, representative, as the following comparison of the performance of coherent systems vs incoherent systems will demonstrate.

Fig. 30-Comparison of error performance (4-level systems).

Fig. 31-Comparison of error performance (8-level systems).
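The geometric argument above can be made concrete by computing the smallest distance between message points in each scheme. The sketch below rests on assumptions stated here rather than in the text: ASK points are taken as uniformly spaced amplitudes starting at zero with average energy $E$, PSK points lie on a circle of radius $\sqrt{E}$, and FSK points lie on orthogonal axes at distance $\sqrt{E}$ from the origin. The function name and the string labels are illustrative.

```python
import math


def min_distance(system, m, energy=1.0):
    """Smallest Euclidean distance between message points (illustrative geometry)."""
    if m < 2:
        raise ValueError("at least two message points are required")
    if system == "ASK":
        # Uniform amplitude spacing with average energy equal to `energy`.
        return math.sqrt(6.0 * energy / ((m - 1) * (2 * m - 1)))
    if system == "PSK":
        # Adjacent points on a circle of radius sqrt(energy).
        return 2.0 * math.sqrt(energy) * math.sin(math.pi / m)
    if system == "FSK":
        # Orthogonal points, each at distance sqrt(energy) from the origin.
        return math.sqrt(2.0 * energy)
    raise ValueError("system must be 'ASK', 'PSK' or 'FSK'")
```

Tabulating the three distances for `m` equal to 2, 4 and 8 shows the ASK spacing shrinking fastest, the PSK spacing shrinking more slowly, and the FSK spacing staying fixed, which matches the ordering of degradation described in the text.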

E. Coherent vs Incoherent

Of the three basic systems under study, the PSK system suffers the most degradation in performance due to lack of coherence. The calculations performed in Section V show that for $m > 2$ the probability of error for a PSK coherent system and a PSK incoherent system are approximately given by [see inequality (47) and the remark following it and (82)]

From these equations it may be seen that for large $m$ (i.e., where $\sin\pi/m \approx \pi/m$) the cost of incoherence is a 3-db degradation in signal-to-noise ratio. For small $m$ the degradation is somewhat less. In fact, for $m = 2$ the degradation approaches zero in the high signal-to-noise ratio case. This latter point may be verified by comparing the expressions for the probability of error when $m = 2$ [see (49) and (91)]. A graphical comparison is presented in Fig. 32.
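The 3-db figure can be reproduced with a short calculation. The sketch assumes that the two approximate error expressions depend on the signal-to-noise ratio only through $\sqrt{E/2N_0}\,\sin(\pi/m)$ in the coherent case and $\sqrt{E/2N_0}\,\sin(\pi/(\sqrt{2}\,m))$ in the incoherent case, so that equal error rates require the signal-to-noise ratios to differ by the squared ratio of the two sines. That reading, and the function name, are assumptions made for illustration.

```python
import math


def psk_incoherence_penalty_db(m):
    """Extra E/2N0 (in db) needed by PSK incoherent to match PSK coherent.

    Based on equating sqrt(snr) * sin(pi/m) in the coherent case with
    sqrt(snr') * sin(pi / (sqrt(2) * m)) in the incoherent case.
    """
    ratio = math.sin(math.pi / m) / math.sin(math.pi / (math.sqrt(2.0) * m))
    return 20.0 * math.log10(ratio)
```

Evaluating the function for increasing `m` shows the penalty rising toward $10\log_{10} 2$ db from below, in line with the remark that the degradation is somewhat less for small $m$.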

In the ASK and FSK cases effective comparisons between the analytic expressions for the probability of error in the coherent case and the incoherent case do not seem feasible. Yet, examination of Figs. 33 and 34 indicates that in those regions where the probability of error is small enough to be of interest (say, $P_e < 10^{-5}$) the degradation between coherent and incoherent is of the order of a decibel or less. Consequently, either the coherent curves or the incoherent curves may be taken as a representative set of curves for either of these two systems.

F. Comparisons Assuming Fixed Signalling Rate

Returning now to the discussion of degradation in performance with increasing m, we wish to emphasize the fact that any conclusions drawn regarding the relative performance of various systems must take into account the constraints under which the comparisons are made. In particular, it should be noted that the curves presented in Figs. 23-28 are drawn under the assumption that the duration of each transmitted signal is maintained at a fixed value, T, independent of the choice of m. The relative positions of these curves might change considerably if, instead, we considered the equally valid constraint of fixed rate R. (We are still assuming that the transmitter is average-power limited.)

Under this constraint we can allow a longer time duration for each waveform in a multilevel system and, hence, increase the energy content of the transmitted signals. Thus, for example, if each waveform employed in a 2-level scheme has a duration T, the waveforms employed in the corresponding 4-level version of the scheme can have a duration 2T. Doubling the allotted time duration doubles the energy content of the signal and, hence, is equivalent to a 3-db boost in signal-to-noise ratio $E/2N_0$. Now, in considering the general case let us attach the subscript m to the parameters E and T to indicate the number of levels which are under discussion. Assuming that the rate at which data is being transmitted,

$$R = \frac{\log_2 m}{T_m} \ \text{(bits/sec)}, \tag{125}$$

is maintained constant, it follows that

$$\frac{1}{T_2} = \frac{\log_2 m}{T_m} \tag{126}$$

and, therefore, that

$$T_m = T_2 \log_2 m. \tag{127}$$

Since the transmitter is average-power limited,

$$\frac{E_m}{T_m} = \frac{E_2}{T_2}. \tag{128}$$

Combining (127) and (128) yields

$$E_m = E_2 \log_2 m \tag{129}$$

from which it follows that

$$10 \log_{10}\left(\frac{E_m}{2N_0}\right) = 10 \log_{10}\left(\frac{E_2}{2N_0}\right) + 10 \log_{10}(\log_2 m). \tag{130}$$


That is to say, under the assumption of constant rate an m-level system has a signal-to-noise ratio advantage of $10 \log_{10}(\log_2 m)$ db over its 2-level counterpart. Making use of this fact, a set of error characteristics for the assumption of constant rate may be derived from the curves presented in Figs. 23-28 (which were drawn under the assumption that the duration time of each transmitted waveform was fixed) simply by translating the m-level characteristic to the left by $10 \log_{10}(\log_2 m)$ db. A sample set of constant rate curves which were so constructed is presented in Figs. 35-38. It may be noted from these curves that multilevel PSK and ASK systems are inferior in performance to their two-level counterparts whereas FSK systems seem to improve in performance with increasing m. In both the PSK and the ASK cases, however, the difference in performance between the multilevel and the corresponding two-level schemes is smaller under the present assumption of constant rate than under the previous assumption of constant time duration. These results are completely consistent with the geometric picture. In all cases, increasing m increases the number of message points. Under the restriction of constant rate, however, the energy content of the transmitted signals increases with m and, hence, the volume of the sphere within which the message points are constrained to lie also increases with m. This tends to partially counteract the crowding together of message points which occurs when m is increased in the PSK and ASK cases. In the FSK case, the message points are actually moved further apart as m increases, the result being an improvement in performance.
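As a small numerical illustration of the constant-rate shift $10 \log_{10}(\log_2 m)$ db, the following Python sketch tabulates the advantage for a few values of m. The function name `constant_rate_gain_db` is a hypothetical helper introduced here for illustration only; it is not part of the original paper.

```python
import math


def constant_rate_gain_db(m: int) -> float:
    """Signal-to-noise advantage (in db) of an m-level system over its
    2-level counterpart at fixed data rate: 10*log10(log2(m))."""
    if m < 2:
        raise ValueError("m must be at least 2")
    return 10.0 * math.log10(math.log2(m))


# Amount by which each m-level curve would be translated to the left.
for m in (2, 4, 8, 16, 32):
    print(m, round(constant_rate_gain_db(m), 2))
```

For m = 4 the shift is about 3 db, in agreement with the doubling-of-duration example given for the 4-level scheme.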


Fig. 35-Probability of waveform error (m-level PSK coherent), assuming that signal duration is adjusted to keep the data rate constant.


Fig. 36-Probability of waveform error (m-level PSK incoherent), assuming that signal duration is adjusted to keep the data rate constant.

Fig. 38-Probability of waveform error (m-level FSK incoherent), assuming that signal duration is adjusted to keep the data rate constant.


G. Discussion of Bandwidth

A significant factor in the comparison of different communication systems is the quality of the channel required by the system to maintain a “satisfactory” flow of information between the transmitter and receiver sites. To this point in the discussion, all comparisons have been made under the assumption that a distortionless “wide-band” Gaussian channel (which is completely specified by the spectral density of the noise $N_0$) is available. Unfortunately, channels of this type, while most amenable to analysis, are hard to come by in practice. In particular, the bandwidth allotted to any one transmitter is generally limited and, hence, the relative efficiency with which it uses the available bandwidth is of prime interest. As a measure of bandwidth efficiency we shall introduce the parameter r defined as the ratio of the rate at which information is being transmitted, $R = \log_2 m / T$ bits/sec, to the Nyquist rate of transmission, 2B bits/sec. That is,

$$r = \frac{R}{2B} = \frac{\log_2 m}{2BT} \tag{131}$$

where

m = the number of different waveforms which may be transmitted
T = the duration time of a waveform
B = bandwidth required to maintain satisfactory operation.

It is quite difficult to define B precisely. For the purposes of the present discussion it will be adequate to assume simply that a sinusoid of duration T and frequency $f_0$ will be passed with negligible distortion by an ideal filter with pass band 1.5/T centered around $f_0$. It follows, therefore, that for the PSK and ASK modulation schemes, wherein the frequency of the pulses sent in each time slot is fixed, the required transmitter bandwidth is $B \approx 1.5/T$ and, correspondingly, the bandwidth efficiency is

$$r \approx \frac{\log_2 m}{3}. \tag{132}$$

In the FSK case, however, assuming a separation of 1/T between adjacent tones, an m-level transmitter requires a bandwidth (see Fig. 39)

$$B \approx \frac{m + 0.5}{T},$$

in which case

$$r \approx \frac{\log_2 m}{2m + 1}. \tag{133}$$

In the FSK coherent case, however, adjacent signals need only be separated by a frequency difference of 1/2T to maintain orthogonality and, hence, the required bandwidth may be reduced to

$$B \approx \frac{m + 2}{2T},$$

resulting in a bandwidth efficiency of

$$r \approx \frac{\log_2 m}{m + 2}. \tag{134}$$

Eqs. (132), (133) and (134) are plotted for purposes of comparison in Fig. 40.

Fig. 39-Illustration of transmission bandwidth required by an m-level orthogonal FSK incoherent system.

Fig. 40-Plot of bandwidth efficiency as a function of the number of levels.
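The bandwidth efficiencies compared in Fig. 40 can be tabulated with a short Python sketch. It assumes the definition $r = \log_2 m / (2BT)$ together with the bandwidth estimates stated in the text: $B \approx 1.5/T$ for PSK and ASK, $B \approx (m + 0.5)/T$ for FSK with tone separation 1/T, and $B \approx (m + 2)/2T$ for coherent FSK with tone separation 1/2T. The function names are hypothetical and chosen only for this illustration.

```python
import math


def r_psk_ask(m: int) -> float:
    """Bandwidth efficiency for PSK/ASK, assuming B ~ 1.5/T."""
    return math.log2(m) / 3.0


def r_fsk_incoherent(m: int) -> float:
    """Bandwidth efficiency for FSK with tones 1/T apart, B ~ (m + 0.5)/T."""
    return math.log2(m) / (2.0 * m + 1.0)


def r_fsk_coherent(m: int) -> float:
    """Bandwidth efficiency for coherent FSK with tones 1/(2T) apart,
    B ~ (m + 2)/(2T)."""
    return math.log2(m) / (m + 2.0)


def bit_rate(r: float, bandwidth_hz: float) -> float:
    """Information rate R = 2*B*r in bits/sec."""
    return 2.0 * bandwidth_hz * r


for m in (2, 4, 8, 16):
    print(m, round(r_psk_ask(m), 3), round(r_fsk_incoherent(m), 3),
          round(r_fsk_coherent(m), 3))
```

Under these assumptions `r_psk_ask(8)` equals 1, which corresponds to operation at the Nyquist rate, and `bit_rate(0.68, 2000)` gives roughly 2700 bits/sec for a 2-kc channel.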

It is apparent from Fig. 40 that simple multilevel orthogonal FSK systems, even under idealized operating conditions, are inefficient users of bandwidth. Physically, the reason is clear; in any time period T only a fraction of the total system bandwidth, namely, that occupied by the particular tone transmitted, is utilized. Thus, we see that although the number of levels and, correspondingly, the rate (if we keep the duration time of each waveform T fixed) of such an FSK system may be increased with relatively little degradation in performance (see Figs. 25 and 28), there is, correspondingly, an increase in the bandwidth required by the system to operate. Consequently, the utility of simple multilevel orthogonal FSK systems is limited to situations where conservation of bandwidth is not of principal concern.

It is clear from Fig. 40 that multilevel PSK and ASK modulation schemes utilize bandwidth more efficiently than do the FSK modulation schemes. For these systems the bandwidth utilization factor r increases monotonically with the number of levels m. The probability of error, however, as may be deduced from Figs. 23, 24, 26 and 27, also increases with m. There is, thus, an effective upper limit to r beyond which the error rate becomes intolerable. This upper limit will be a function of the signal-to-noise ratio on the channel.

H. Numerical Examples

A feel for the types of tradeoff involved can perhaps best be gained by considering a numerical example. Let us, therefore, investigate the feasibility of operating one of the modulation schemes discussed above at the Nyquist rate, that is, at r = 1. We assume for the sake of definiteness that the maximum acceptable error rate is (characters per second) and that the signal-to-noise ratio available is 15 db. The curves of Fig. 40 indicate that a transmission rate corresponding to r = 1 limits our choice of systems to 8-level PSK or 8-level ASK. Furthermore, an examination of Fig. 23 reveals that if the signal-to-noise ratio is limited to 15 db, the error rate of an 8-level PSK coherent system is of the order of character/sec, which is unacceptable, whereas the error rates of PSK incoherent and ASK modulation schemes are even higher (see Fig. 31). It follows, therefore, that under the assumed constraints none of the systems considered will operate satisfactorily at the Nyquist rate. If, on the other hand, the available signal-to-noise ratio were increased to 22 db, both the PSK coherent and the PSK incoherent modulation schemes would meet the stated requirements.

As a second example let us consider in a semiqualitative way the type of performance we might expect from one particular modulation scheme, namely, 4-level PSK incoherent on a telephone channel. Assuming a usable bandwidth of about 2 kc, it follows from Fig. 40 that, for a 4-level PSK system, $r \approx 0.68$ and, thus, the rate at which information may be transmitted is of the order of $R \approx (4000)(0.68) \approx 2700$ bits/sec.

Correspondingly, from Fig. 26 we may deduce that this system will operate at a character error rate of with a signal-to-noise ratio of about 15 3 db and at a character error rate of about lo-” with a signal-to-noise ratio of 22 db.

I. Final Remarks

Since typically the signal-to-noise ratio on a telephone channel is well in excess of 22 db and yet the error rates of existing 4-phase systems are more nearly on the order of one can only conclude that the principal sources of errors on a telephone channel are non-Gaussian. Indeed, Kelly and Mercurio²⁰ have noted that the major sources of errors on a telephone channel appear to be impulse noise and dropouts.

There are, of course, additional factors which tend to limit the performance of actual communication systems which have been ignored in the present analysis. Among these are distortion in the received signal and intersymbol interference due to the nonlinear delay, band-limiting and gain fluctuation of the medium coupling the transmitter to the receiver. Furthermore, the received signal is processed in less than ideal fashion by the detector due to imperfections in the hardware and timing recovery. It is to be noted that all these factors ultimately manifest themselves at the detector simply as a perturbation in the position of the transmitted message point. As such, the effects are similar to those produced by the noise and may largely be compensated for by an additional margin of signal-to-noise ratio at the detector.²¹

20 J. P. Kelly and J. F. Mercurio, “Comparative Performance of Digital Data Modems,” Mitre Corp., Bedford, Mass., Tech. Memo., TM-3037, p. 4; April 14, 1961.

Conversely, we might, loosely speaking, say that some fraction of the total signal-to-noise ratio available at the detector is needed to compensate for effects of the type listed above which were not accounted for in the basic analysis. Consequently, only the remaining fraction of signal-to-noise ratio is available for combating Gaussian noise. It is to be expected, therefore, that any predictions of the probability of error based on estimates of the total signal-to-noise ratio available at the detector will be unduly optimistic.

APPENDIX I

The following is devoted to a proof of Theorem I. Within the course of the proof, a technique for calculating the orthonormal functions $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$ will be outlined.

Initially, let us note that a set of functions $f_1(t), f_2(t), \ldots, f_q(t)$ are said to be linearly dependent if there exists a set of constants, $a_1, a_2, \ldots, a_q$, not all equal to zero such that

$$a_1 f_1(t) + a_2 f_2(t) + \cdots + a_q f_q(t) \equiv 0. \tag{135}$$

If the set of functions is not linearly dependent, it is said to be linearly independent and vice versa.

Consider the given set of waveforms $S_1(t), S_2(t), \ldots, S_m(t)$. Either this set is linearly independent or it is not linearly independent. If not, then (by definition) there exists a set of constants, $b_1, b_2, \ldots, b_m$, not all equal to zero such that

$$b_1 S_1(t) + b_2 S_2(t) + \cdots + b_m S_m(t) = 0.$$

Suppose, in particular, that $b_m \neq 0$. Then,

$$S_m(t) = -\left(\frac{b_1}{b_m} S_1(t) + \frac{b_2}{b_m} S_2(t) + \cdots + \frac{b_{m-1}}{b_m} S_{m-1}(t)\right).$$

That is to say, $S_m(t)$ can be expressed in terms of the remaining (m - 1) waveforms.

21 An assessment of the effects of delay distortion in terms of a parameter related to signal-to-noise ratio for a particular data transmission scheme has been carried out by R. A. Gibby, “An evaluation of AM data system performance by computer simulation,” Bell Sys. Tech. J., vol. 39, pp. 675-704, Ref. 1; May, 1960.


Consider now the set of waveforms $S_1(t), S_2(t), \ldots, S_{m-1}(t)$. Either this set is linearly independent or it is not. If not, there exists a set of constants, $c_1, c_2, \ldots, c_{m-1}$, not all equal to zero such that

$$c_1 S_1(t) + c_2 S_2(t) + \cdots + c_{m-1} S_{m-1}(t) = 0.$$

Suppose that $c_{m-1} \neq 0$. Then,

$$S_{m-1}(t) = -\left(\frac{c_1}{c_{m-1}} S_1(t) + \frac{c_2}{c_{m-1}} S_2(t) + \cdots + \frac{c_{m-2}}{c_{m-1}} S_{m-2}(t)\right)$$

which implies that $S_{m-1}(t)$ can be expressed as a linear combination of the remaining (m - 2) waveforms. Now, examining the set of waveforms $S_1(t), S_2(t), \ldots, S_{m-2}(t)$ for linear independence and continuing in this fashion, it is clear that we will eventually end up with a linearly independent subset of the original set of waveforms, say,

$$S_1(t), S_2(t), \ldots, S_k(t), \quad k \le m.$$

(The indices of the given set of waveforms can always be permuted in such a fashion that the first k waveforms, $S_1(t), S_2(t), \ldots, S_k(t)$, will be linearly independent.) Note that each of the given waveforms $S_1(t), S_2(t), \ldots, S_m(t)$ may be expressed as a linear combination of these k waveforms.

We shall now, utilizing the Gram-Schmidt process, show that if the given waveforms are physically realizable (or, to be more precise, $L_2$ functions), which condition guarantees the existence of the integrals in question, it is possible to construct a set of k orthonormal waveforms, $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$, from the derived linearly independent waveforms $S_1(t), S_2(t), \ldots, S_k(t)$.

As a starting point set,

$$\varphi_1(t) = \frac{S_1(t)}{\sqrt{\int_0^T S_1^2(t)\,dt}}. \tag{136}$$

It is clear that

$$\int_0^T \varphi_1^2(t)\,dt = 1. \tag{137}$$

Now, define a new intermediate function,

$$h_2(t) = S_2(t) - x\varphi_1(t), \tag{138}$$

where x is some constant which is yet to be determined. Since, by (137) and (138),

$$\int_0^T h_2(t)\varphi_1(t)\,dt = \int_0^T S_2(t)\varphi_1(t)\,dt - x, \tag{139}$$

it is clear that if we set

$$x = \int_0^T S_2(t)\varphi_1(t)\,dt \tag{140}$$

and

$$\varphi_2(t) = \frac{h_2(t)}{\sqrt{\int_0^T h_2^2(t)\,dt}} \tag{141}$$

that

$$\int_0^T \varphi_1(t)\varphi_2(t)\,dt = 0$$

and

$$\int_0^T \varphi_2^2(t)\,dt = 1.$$

That is to say, $\varphi_1(t)$ and $\varphi_2(t)$ form an orthonormal set. Continuing in the same fashion, set

$$h_i(t) = S_i(t) - \sum_{j=1}^{i-1} y_{ij}\varphi_j(t) \tag{142}$$

and the constants $y_{ij}$, j = 1, 2, ..., i - 1 equal to

$$y_{ij} = \int_0^T S_i(t)\varphi_j(t)\,dt. \tag{143}$$

It follows readily that the set of functions

$$\varphi_i(t) = \frac{h_i(t)}{\sqrt{\int_0^T h_i^2(t)\,dt}}, \quad i = 1, 2, \ldots, k \tag{144}$$

form an orthonormal set.

Since each waveform of the derived subset $S_i(t)$, i = 1, 2, ..., k may be expressed as a linear combination of the $\varphi_i(t)$, i = 1, 2, ..., k, it follows that each of the originally given waveforms $S_i(t)$, i = 1, 2, ..., m may be expressed as a linear combination of the $\varphi_i(t)$, i = 1, 2, ..., k. That is to say, we can write

$$S_i(t) = \sum_{j=1}^{k} a_{ij}\varphi_j(t), \quad i = 1, 2, \ldots, m \tag{145}$$

where the $a_{ij}$ are constants. Furthermore, multiplying both sides of (145) by $\varphi_j(t)$ and integrating from 0 to T, we can deduce the fact that

$$a_{ij} = \int_0^T S_i(t)\varphi_j(t)\,dt. \tag{146}$$

We remark that the results of this appendix are abstracted from the general theory of vector spaces. The interested reader would do well to refer to some of the standard texts in the area.²²
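The construction of this appendix can be tried numerically. The Python sketch below is only an illustration under stated assumptions: waveforms are represented by equally spaced samples, integrals over (0, T) are replaced by Riemann sums, and linearly dependent waveforms are detected by a small energy threshold `tol` instead of the explicit elimination argument used above. The names `inner` and `gram_schmidt` are hypothetical.

```python
import math


def inner(f, g, dt):
    """Riemann-sum approximation of the integral of f(t)*g(t) over (0, T)."""
    return sum(a * b for a, b in zip(f, g)) * dt


def gram_schmidt(waveforms, dt, tol=1e-9):
    """Return sampled orthonormal functions spanning the given waveforms.

    Waveforms that are (numerically) linear combinations of earlier ones
    leave no residual energy and are skipped.
    """
    basis = []
    for s in waveforms:
        h = list(s)
        for phi in basis:
            y = inner(s, phi, dt)  # projection coefficient of s on phi
            h = [hv - y * pv for hv, pv in zip(h, phi)]
        energy = inner(h, h, dt)
        if energy > tol:
            norm = math.sqrt(energy)
            basis.append([hv / norm for hv in h])
    return basis


# Illustrative waveforms: a cosine, a sine and a combination of the two.
T, n = 1.0, 1000
dt = T / n
t = [(k + 0.5) * dt for k in range(n)]
s1 = [math.cos(2 * math.pi * 4 * tk) for tk in t]
s2 = [math.sin(2 * math.pi * 4 * tk) for tk in t]
s3 = [a + 2 * b for a, b in zip(s1, s2)]  # linearly dependent on s1, s2

phis = gram_schmidt([s1, s2, s3], dt)
print(len(phis))  # number of orthonormal functions found (k <= m)
```

In this example three waveforms are supplied but only two orthonormal functions result, which mirrors the statement that k may be smaller than m.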

APPENDIX II

In this appendix we wish to investigate the properties of the quantities $n_j$, j = 1, 2, ..., k defined in (13). In so doing, we shall use the notation of Davenport and Root’ and shall also make use of some of the results derived therein.

Now, by (13),

$$n_j = \int_0^T n(t)\varphi_j(t)\,dt, \quad j = 1, 2, \ldots, k$$

where the $\varphi_j(t)$ form an orthonormal set.

22 P. R. Halmos, “Finite Dimensional Vector Spaces,” D. Van Nostrand Co., Inc., Princeton, N. J., 1958; G. Birkhoff and S. MacLane, “A Survey of Modern Algebra,” The Macmillan Co., New York, N. Y.; 1960.


If n(t) is a Gaussian random process, then $n_j$ is a Gaussian random variable²³ and is thus characterized completely by its mean and variance. In particular, the mean of $n_j$ is equal to

$$\bar{n}_j = E[n_j] = \int_0^T E[n(t)]\varphi_j(t)\,dt \tag{147}$$

and the variance of $n_j$ is equal to

$$\sigma_j^2 = E[n_j^2] - \bar{n}_j^2 = \int_0^T dt \int_0^T ds\, E[n(t)n(s)]\varphi_j(t)\varphi_j(s) - \bar{n}_j^2.$$

If n(t) is a zero mean process, then

$$E[n(t)] = 0$$

which in turn, by (147), implies

$$\bar{n}_j = E[n_j] = 0.$$

By definition, the statistical autocorrelation function of the random process n(t) is equal to

R(t, s) = E[n(t)n(s)]. (151)

If n(t) is stationary, then the autocorrelation function is a function of the time difference t - s alone and does not depend on the particular choice of t and s per se.

Summarizing the results to this point we note that if n(t) is a stationary Gaussian random process with zero mean, then $n_j$ is a Gaussian random variable with zero mean and variance $\sigma_j^2$ equal to

$$\sigma_j^2 = \int_0^T dt \int_0^T ds\, R(t - s)\varphi_j(t)\varphi_j(s). \tag{152}$$

For a stationary random process, the spectral density and the statistical autocorrelation function form a Fourier Transform pair.²⁴ In particular,

$$R(\tau) = \int_{-\infty}^{\infty} W(f)e^{+i2\pi f\tau}\,df \tag{153}$$

where W(f) represents the spectral density of the stationary random process in question.

For white noise,

$$W(f) = N_0 \quad \text{for all } f$$

and, correspondingly,

$$R(\tau) = \int_{-\infty}^{\infty} N_0 e^{+i2\pi f\tau}\,df = N_0\delta(\tau) \tag{154}$$

where $\delta(\tau)$ is the unit impulse.²⁵

23 Davenport and Root, op. cit., pp. 155-156.
24 Davenport and Root, op. cit., p. 104.
25 Davenport and Root, op. cit., pp. 365-368.

Consequently, substituting this result into the expression for the variance, (152), we get

$$\sigma_j^2 = N_0 \int_0^T dt \int_0^T ds\, \delta(t - s)\varphi_j(t)\varphi_j(s) = N_0 \int_0^T dt\, \varphi_j^2(t) = N_0 \tag{155}$$

where we have utilized the sifting integral²⁶ and the fact that the $\varphi_j(t)$, j = 1, 2, ..., k form an orthonormal set.
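The result that each noise coordinate has variance $N_0$ can be checked by simulation. The Python sketch below is a rough illustration under assumptions not found in the paper: the white process is approximated by independent Gaussian samples of variance $N_0/\Delta t$ (a discrete stand-in for $R(\tau) = N_0\delta(\tau)$), the integral defining $n_j$ is replaced by a Riemann sum, and the two orthonormal functions, the seed and the number of trials are arbitrary choices. The cross term between the two coordinates is also estimated as an extra diagnostic.

```python
import math
import random


def project_noise(phis, dt, n0, rng):
    """Draw one sampled white-noise path and project it on each phi_j."""
    sigma = math.sqrt(n0 / dt)  # per-sample std approximating N0*delta(tau)
    noise = [rng.gauss(0.0, sigma) for _ in phis[0]]
    return [sum(nv * pv for nv, pv in zip(noise, phi)) * dt for phi in phis]


T, n, n0 = 1.0, 64, 0.5
dt = T / n
t = [(k + 0.5) * dt for k in range(n)]
# Two orthonormal functions on (0, T); an arbitrary illustrative choice.
phi1 = [math.sqrt(2 / T) * math.cos(2 * math.pi * 3 * tk / T) for tk in t]
phi2 = [math.sqrt(2 / T) * math.sin(2 * math.pi * 3 * tk / T) for tk in t]

rng = random.Random(1962)
trials = 4000
samples = [project_noise([phi1, phi2], dt, n0, rng) for _ in range(trials)]

var1 = sum(a * a for a, _ in samples) / trials
var2 = sum(b * b for _, b in samples) / trials
cov12 = sum(a * b for a, b in samples) / trials
print(var1, var2, cov12)  # both variances should be near N0, cov12 near 0
```

The estimates fluctuate from run to run if the seed is changed; only their closeness to $N_0$ and to zero, respectively, is of interest.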

It follows readily that if $j \neq k$,

$$E[n_j n_k] = N_0 \int_0^T dt\, \varphi_j(t)\varphi_k(t) = 0. \tag{156}$$

This suffices to prove that the Gaussian random variables $n_j$ and $n_k$ are independent if $j \neq k$.²⁷

We might further point out that successive noise outputs in time are independent. That is to say, if

$$E[nn^*] = 0. \tag{159}$$

An easy way to see this is to let

and then to note that (157) and (158) can be rewritten in the form

$$n = \int_0^{2T} n(t)q_1(t)\,dt \tag{162}$$

$$n^* = \int_0^{2T} n(t)q_2(t)\,dt. \tag{163}$$

Now, noting further that

$$\int_0^{2T} q_1(t)q_2(t)\,dt = 0,$$

it follows readily that

$$E[nn^*] = \int_0^{2T} dt \int_0^{2T} ds\, E[n(t)n(s)]\,q_1(t)q_2(s) = N_0 \int_0^{2T} q_1(t)q_2(t)\,dt = 0.$$

26 Davenport and Root, op. cit., pp. 365-368.
27 Davenport and Root, op. cit., pp. 55-58.

APPENDIX III

This appendix is devoted to a proof of Theorem III. The proof will proceed in two steps. Initially, we shall show that a maximum-likelihood detector minimizes the probability of error if each possible signal is transmitted with equal probability. Secondly, we shall show that if the transmitted signals are perturbed by additive stationary, white, zero-mean Gaussian noise, then, in the coherent case, maximum-likelihood detection is equivalent to picking the message point closest to the received signal point and guessing that the corresponding signal was transmitted.

Assume that in each time slot T one of the m possible signals $S_1(t), S_2(t), \ldots, S_m(t)$ is transmitted with equal probability, namely, 1/m. Assume further that, whenever a signal is transmitted, a point (or vector) y is observed at the detector. Denoting the set of all possibly observed y by Y, the observation space, we suppose that the conditional probability density of y [under the condition that $S_i(t)$ is sent], $p_i(y)$, i = 1, 2, ..., m, is defined on Y. Our objective is to establish a rule for partitioning the space Y into a set of disjoint regions, $Y_1, Y_2, \ldots, Y_m$, such that if we guess that $S_i(t)$ was transmitted whenever y lands in the region $Y_i$, i = 1, 2, ..., m, the probability of error is at a minimum.

Note initially that the probability that y lands in the region $Y_i$ when $S_i(t)$ is transmitted may be written in the following equivalent ways:

$$P[y \in Y_i \mid S_i] = \int_{Y_i} p_i(y)\,dy = 1 - P[y \notin Y_i \mid S_i].$$

Now, assuming that we do partition the observation space Y into a set of disjoint regions, $Y_1, Y_2, \ldots, Y_m$, and then guess that $S_i(t)$ was transmitted if y lands in the region $Y_i$, i = 1, 2, ..., m, the probability of incorrect decision $P_e$ is equal to

$$P_e = \sum_{i=1}^{m} \frac{1}{m} P[y \notin Y_i \mid S_i] = 1 - \frac{1}{m}\sum_{i=1}^{m} \int_{Y_i} p_i(y)\,dy.$$

It is clear that $P_e$ will be minimized if $\sum_{i=1}^{m} \int_{Y_i} p_i(y)\,dy$ is maximized. A little reflection, however, indicates that this latter sum will take on its maximum value if we set $Y_i$ equal to the set of points y in Y for which

$$p_i(y) > p_j(y) \quad \text{all } j \neq i.$$

(In case there is a point $y_0$ in Y for which, say,

$$p_1(y_0) = p_2(y_0) = p_3(y_0) > p_q(y_0), \quad q = 4, 5, \ldots, m,$$

the point $y_0$ can be assigned to either $Y_1$ or $Y_2$ or $Y_3$.)

The decision rule embodied in this partitioning of the observation space is equivalent to taking the observed point, say $y_0$, finding that value of i for which $p_i(y_0)$ is a maximum and then guessing that the corresponding signal $S_i(t)$ was transmitted. That is to say, this decision rule is equivalent to maximum-likelihood detection. We conclude, therefore, that if each signal is transmitted with equal probability, a maximum-likelihood detector will minimize the probability of error.

In the text it has been shown that, in the case of coherent detection, if $S_i(t)$ is transmitted, the received signal can be characterized by a point in a k-dimensional Euclidean space with coordinates [see (la)]

$$a_{ij} + n_j, \quad j = 1, 2, \ldots, k$$

where the $a_{ij}$ are the coordinates of the transmitted signal and the $n_j$ are independent Gaussian random variables with zero mean and variance $N_0$.

The decision as to what signal was sent will be based on the coordinates of the received point. That is to say, in this case the observation space Y is a k-dimensional Euclidean space. Accordingly, let us designate the k-dimensional random vector corresponding to the received signal by $\hat{y}$ (the symbol z was used in the paper). Since each coordinate of $\hat{y}$, namely, $y_j = a_{ij} + n_j$, j = 1, 2, ..., k, is an independent Gaussian random variable with mean $a_{ij}$ when $S_i(t)$ is transmitted and variance $N_0$, the conditional probability density function $p_i(\hat{y})$ is equal to

$$p_i(\hat{y}) = \prod_{j=1}^{k} \frac{1}{\sqrt{2\pi N_0}} \exp\left\{-\frac{(y_j - a_{ij})^2}{2N_0}\right\} = \frac{1}{(2\pi N_0)^{k/2}} \exp\left\{-\frac{d^2(\hat{y}, S_i)}{2N_0}\right\}$$

where $d(\hat{y}, S_i)$ is equal to the distance between the received point $\hat{y}$ and the transmitted point $S_i$.

Now, suppose a particular signal is observed at the receiver; that is to say, $\hat{y} = \hat{y}_0$. The maximum-likelihood detection rule is simply to choose that value of i for which

$$p_i(\hat{y}_0) = \frac{1}{(2\pi N_0)^{k/2}} \exp\left\{-\frac{d^2(\hat{y}_0, S_i)}{2N_0}\right\}$$

is a maximum and guess that the corresponding signal $S_i(t)$ was transmitted. This is, however, equivalent to selecting that value of i for which $d(\hat{y}_0, S_i)$ is a minimum or selecting the message point closest to the received signal point.
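The minimum-distance form of the coherent decision rule is easy to state in code. The Python sketch below is illustrative only: `ml_detect` is a hypothetical helper, indices are 0-based, and the four message points form an example 4-level PSK constellation with E = 1 whose orientation and sign convention are assumptions made here for simplicity.

```python
import math


def ml_detect(received, message_points):
    """Return the 0-based index of the message point closest (in Euclidean
    distance) to the received point."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(range(len(message_points)),
               key=lambda i: dist2(received, message_points[i]))


# Example: four equally spaced points on a circle of radius sqrt(E).
E = 1.0
points = [(math.sqrt(E) * math.cos(2 * math.pi * i / 4),
           math.sqrt(E) * math.sin(2 * math.pi * i / 4)) for i in range(4)]

print(ml_detect((0.1, 0.8), points))  # the nearest point is (0, 1), index 1
```

Because the exponential in the conditional density decreases monotonically with distance, choosing the smallest distance and choosing the largest likelihood give the same answer.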


APPENDIX IV

The following is devoted to discussing the decision rules which have been adopted for the incoherent systems. As shown in Appendix III, a maximum-likelihood detector will minimize the average probability of error if each possible message is transmitted with equal probability. Accordingly, we shall, in each case, derive the decision rule corresponding to maximum-likelihood detection but shall, for the purpose of simplicity, modify the decision rule derived for the ASK case.

A. PSK Incoherent

In the PSK incoherent case the decision as to what message was sent is based on the successive outputs of the two product integrators. Assuming, in particular, that the ith message has been sent, the decision will be based on the four quantities described by (30), namely,

$$x_1 = \sqrt{E}\cos\alpha + n_{11},$$

$$y_1 = -\sqrt{E}\sin\alpha + n_{12},$$

$$x_2 = \sqrt{E}\cos(\alpha + 2\pi i/m) + n_{21},$$

$$y_2 = -\sqrt{E}\sin(\alpha + 2\pi i/m) + n_{22}$$

where $n_{11}, n_{12}, n_{21}, n_{22}$ are independent Gaussian random variables with zero mean and variance $N_0$ and $\alpha$ is assumed to be uniformly distributed over a $2\pi$ interval. That is to say,

$$p(\alpha) = \begin{cases} 1/2\pi, & \theta \le \alpha < \theta + 2\pi \\ 0, & \text{elsewhere.} \end{cases} \tag{165}$$

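The conditional joint density function of the random variables $x_1, y_1, x_2, y_2$, given, say, that $\alpha = A$ and that the ith message was transmitted, may be written as

$$p_i(x_1, y_1, x_2, y_2 \mid \alpha = A) = \left(\frac{1}{2\pi N_0}\right)^2 \exp\left\{-\frac{1}{2N_0}\left[(x_1 - \sqrt{E}\cos A)^2 + (y_1 + \sqrt{E}\sin A)^2 + \left(x_2 - \sqrt{E}\cos\left(A + \frac{2\pi i}{m}\right)\right)^2 + \left(y_2 + \sqrt{E}\sin\left(A + \frac{2\pi i}{m}\right)\right)^2\right]\right\} = D\exp\{P\cos A + Q\sin A\} \tag{166}$$

where

$$D = \left(\frac{1}{2\pi N_0}\right)^2 \exp\left\{-\frac{1}{2N_0}\left(x_1^2 + y_1^2 + x_2^2 + y_2^2 + 2E\right)\right\} \tag{167}$$

$$P = \frac{\sqrt{E}}{N_0}\left[x_1 + x_2\cos\frac{2\pi i}{m} - y_2\sin\frac{2\pi i}{m}\right] \tag{168}$$

$$Q = \frac{\sqrt{E}}{N_0}\left[-y_1 - x_2\sin\frac{2\pi i}{m} - y_2\cos\frac{2\pi i}{m}\right]. \tag{169}$$

The joint conditional density for $x_1, y_1, x_2, y_2$, assuming only that the ith message was sent, is equal to

$$p_i(x_1, y_1, x_2, y_2) = \int_{\theta}^{\theta + 2\pi} p_i(x_1, y_1, x_2, y_2 \mid \alpha = A)\,p(A)\,dA \tag{170}$$

where we are averaging over all possible A.

Substituting (165) and (166) into (170) and performing the indicated integration, we get

$$p_i(x_1, y_1, x_2, y_2) = D\,I_0\left(\sqrt{P^2 + Q^2}\right) \tag{171}$$

where $I_0$ is the modified Bessel function of the first kind of zero order.''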

Now, in principle, to decide what signal was sent, the decision box of the detector should substitute the measured values of $x_1, y_1, x_2, y_2$ into the right-hand side of (171) and evaluate this expression for all values of i, i = 1, 2, ..., m. It should then select that value of i which yielded the largest value and assume that the corresponding signal was sent. However, since the quantity D is independent of the choice of i and the Bessel function increases monotonically with its argument, it is sufficient to find the value of i which maximizes the quantity

$$P^2 + Q^2.$$

By (168) and (169), we find that

$$P^2 + Q^2 = \frac{E}{N_0^2}\left[x_1^2 + y_1^2 + x_2^2 + y_2^2 + 2(x_1 x_2 + y_1 y_2)\cos\frac{2\pi i}{m} + 2(y_1 x_2 - x_1 y_2)\sin\frac{2\pi i}{m}\right]. \tag{172}$$

The right-hand side of (172) can be interpreted geometrically. Before doing so, however, we wish to point out that since [as can be deduced from (29a), (30a) and (30c)]

$$\int_0^T \varphi_2(t)\sqrt{\frac{2E}{T}}\cos(\omega_0 t + \alpha)\,dt = -\sqrt{E}\sin\alpha,$$

it is necessary to consider angular displacements in the clockwise direction to be positive if the output of the $\varphi_1(t)$ product integrator is interpreted as the x coordinate projection of the received signal and the output of the $\varphi_2(t)$ product integrator is interpreted as the y coordinate projection of the received signal. Under this convention, the transformation from polar coordinates to rectangular coordinates is given by

$$x = \rho\cos\theta, \qquad y = -\rho\sin\theta.$$

In Fig. 41, x, y, $\rho$ and $\theta$ are defined.

Now, let us consider a pair of successively received signal points with coordinates $(a_1, b_1)$ and $(a_2, b_2)$, respectively. That is to say, we are assuming that the random variables $x_1, y_1, x_2, y_2$ take on the particular values $a_1, b_1, a_2, b_2$, respectively. The two signal points are shown plotted in Fig. 42.

Fig. 41-Definition of polar and rectangular coordinate systems adopted in Section A, Appendix IV.

Fig. 42-Geometric interpretation of (173) and (177).

If we denote the point $(a_3, b_3)$ as the rotation of the point $(a_2, b_2)$ in the counterclockwise direction by $2\pi i/m$ radians, it is apparent, since

$$a_2 = \sqrt{a_2^2 + b_2^2}\cos\theta_2, \qquad b_2 = -\sqrt{a_2^2 + b_2^2}\sin\theta_2,$$

that

$$a_3 = \sqrt{a_2^2 + b_2^2}\cos\left(\theta_2 - \frac{2\pi i}{m}\right) = a_2\cos\frac{2\pi i}{m} - b_2\sin\frac{2\pi i}{m} \tag{173}$$

and

$$b_3 = -\sqrt{a_2^2 + b_2^2}\sin\left(\theta_2 - \frac{2\pi i}{m}\right) = b_2\cos\frac{2\pi i}{m} + a_2\sin\frac{2\pi i}{m}. \tag{174}$$

Furthermore, applying the law of cosines to the angle $\psi$ defined in Fig. 42, we get

$$(a_1 - a_3)^2 + (b_1 - b_3)^2 = a_1^2 + b_1^2 + a_3^2 + b_3^2 - 2\sqrt{a_1^2 + b_1^2}\sqrt{a_3^2 + b_3^2}\cos\psi. \tag{175}$$

Substituting (173) and (174) into (175) yields

$$\cos\psi = \frac{(a_1 a_2 + b_1 b_2)\cos\dfrac{2\pi i}{m} + (b_1 a_2 - a_1 b_2)\sin\dfrac{2\pi i}{m}}{\sqrt{a_1^2 + b_1^2}\sqrt{a_2^2 + b_2^2}}. \tag{176}$$

Substituting this result into (172) yields (recall we are considering that particular case where $x_1 = a_1$, $y_1 = b_1$, $x_2 = a_2$, $y_2 = b_2$)

$$P^2 + Q^2 = \frac{E}{N_0^2}\left\{a_1^2 + b_1^2 + a_2^2 + b_2^2 + 2\sqrt{a_1^2 + b_1^2}\sqrt{a_2^2 + b_2^2}\cos\psi\right\}. \tag{177}$$

The only term in the right-hand side of (177) which is dependent upon i, the index of the transmitted signal, is $\cos\psi$. Accordingly, that value of i, i = 1, 2, ..., m, which maximizes $\cos\psi$ will also maximize $(P^2 + Q^2)$. Equivalently, we can solve for that value of i which minimizes $|\psi|$. That is to say, the decision rule, as may be deduced from Fig. 42, reduces simply to measuring the phase difference between successively received signals, rounding off the measured value to the nearest value of $2\pi i/m$, i = 1, 2, ..., m, and guessing that the corresponding signal was transmitted.

B. ASK Incoherent

In the ASK incoherent case, if $S_i(t)$ is transmitted, the decision at the receiver as to what signal was transmitted is based on the random variables x and y described in (34a) and (34b), namely,

$$x = \sqrt{E_i}\cos\alpha + n_x$$

$$y = -\sqrt{E_i}\sin\alpha + n_y$$

where $n_x$, $n_y$ are independent Gaussian random variables with zero mean and variance $N_0$ and $\alpha$ is assumed to be uniformly distributed over a $2\pi$ interval.

The joint conditional probability density function of the random variables x and y, given that $\alpha = A$ and that $S_i(t)$ was transmitted, is equal to

$$p_i(x, y \mid \alpha = A) = \frac{1}{2\pi N_0}\exp\left\{-\frac{1}{2N_0}\left[(x - \sqrt{E_i}\cos A)^2 + (y + \sqrt{E_i}\sin A)^2\right]\right\}. \tag{178}$$

Therefore, averaging over all possible values of A, the conditional joint density function of the random variables x and y, given only that $S_i(t)$ was transmitted, is equal to

$$p_i(x, y) = \frac{1}{2\pi N_0}\exp\left\{-\frac{1}{2N_0}\left(x^2 + y^2 + E_i\right)\right\} I_0\left(\frac{\sqrt{E_i}\sqrt{x^2 + y^2}}{N_0}\right) \tag{179}$$

where $I_0$ is the modified Bessel function of the first kind of zero order.¹⁶

The optimum decision rule for the detector to follow is to substitute the measured value of x and y into the right-hand side of (179), find the value of i which maximizes the resultant expression and assume that the corresponding signal was transmitted. Unfortunately, this decision rule is rather complicated to instrument and we shall adopt, instead, its asymptotic form although we shall not always be operating in ranges where the resultant rule is optimum.

The asymptotic expansion of the Bessel function is given by2'

$$I_0(z) \sim \frac{e^z}{\sqrt{2\pi z}}\left(1 + \frac{1}{8z} + \cdots\right). \tag{180}$$

Substituting the first term of the expansion into the right-hand side of (179), we get

$$p_i(x, y) \approx \frac{1}{2\pi N_0}\left(\frac{N_0}{2\pi\sqrt{E_i}\sqrt{x^2 + y^2}}\right)^{1/2}\exp\left\{-\frac{\left(\sqrt{x^2 + y^2} - \sqrt{E_i}\right)^2}{2N_0}\right\}. \tag{181}$$

In the regions where this expansion is valid, the behavior of the right-hand side of (181) is dominated by the exponential term. It follows, therefore, that the set of (x, y) points for which $p_i(x, y) > p_j(x, y)$ all $j \neq i$ corresponds approximately to the set of (x, y) points for which

$$\left|\sqrt{x^2 + y^2} - \sqrt{E_i}\right| < \left|\sqrt{x^2 + y^2} - \sqrt{E_j}\right| \quad \text{all } j \neq i. \tag{182}$$

That is to say, it is approximately true that if, in particular, x = a and y = b, the value of i which maximizes $p_i(a, b)$ corresponds to the value of i which minimizes $\left|\sqrt{a^2 + b^2} - \sqrt{E_i}\right|$. The indicated decision rule is, therefore, to calculate the rms amplitude of the received signal, round off to the nearest value of $\sqrt{E_i}$, i = 1, 2, ..., m, and guess that the corresponding signal was transmitted. It should be noted that this decision rule, although seemingly a natural one to adopt, is only optimum in the region where it is legitimate to approximate the Bessel function by the first term of its asymptotic expansion, that is, for those values of x, y, $\sqrt{E_i}$ and $N_0$ for which $\sqrt{x^2 + y^2}\,\sqrt{E_i}/N_0 \gg 1$.

C. FSK Incoherent

In the FSK incoherent case, if $s_i(t)$ is transmitted, the detector will base its decision as to what was sent on the $2m$ random variables [see (37a) and (37b)]

where the $n_{x_j}$ and $n_{y_j}$ are independent Gaussian random variables, each having zero mean and variance $N_0$, and $\theta$ is a random variable uniformly distributed over a $2\pi$ interval.

The conditional joint probability density of the random variables $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$, given that $s_i(t)$

$^{28}$ Bowman, op. cit., p. 84.

where

Now, averaging over all possible values of $\theta$ to find the joint conditional probability density of the random variables $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$, given only that $s_i(t)$ was sent, we get

$p_i(x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m)$

where $I_0$ is the modified Bessel function of the first kind of zero order.$^{16}$

Clearly, since $L$ is independent of the choice of $i$ and $I_0$ is a monotonically increasing function of its argument, the set of points $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$ for which

$$p_i(x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m) > p_j(x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m) \quad \text{for all } j \neq i$$

is equal to the set of points for which

$$\sqrt{x_i^2 + y_i^2} > \sqrt{x_j^2 + y_j^2} \quad \text{for all } j \neq i. \qquad (186)$$

Thus, if, at some instant, the random variables $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$ take on the particular values $a_1, a_2, \ldots, a_m, b_1, b_2, \ldots, b_m$, the receiver should calculate the $m$ rms amplitudes $\sqrt{a_i^2 + b_i^2}$, $i = 1, 2, \ldots, m$ (each of which is associated with a particular frequency), select the largest one and guess that the corresponding signal was transmitted.
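The largest-envelope rule just stated is short enough to sketch directly. The function name `fsk_incoherent_decision` and the list-based interface are assumptions made for illustration and are not taken from the paper; indices start at 0 rather than 1.

```python
import math


def fsk_incoherent_decision(a, b):
    """Return the index i that maximizes the rms amplitude sqrt(a_i^2 + b_i^2).

    `a` and `b` hold the observed values of x_1..x_m and y_1..y_m
    (hypothetical interface; one pair per candidate frequency).
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    envelopes = [math.hypot(ai, bi) for ai, bi in zip(a, b)]
    # Pick the frequency whose envelope is largest.
    return max(range(len(envelopes)), key=envelopes.__getitem__)


if __name__ == "__main__":
    # Three candidate frequencies; the second has the largest envelope.
    print(fsk_incoherent_decision([0.1, -0.9, 0.2], [0.3, 0.8, -0.1]))
```

Because only the ordering of the envelopes matters, comparing the squared amplitudes $a_i^2 + b_i^2$ would give the same decision without the square root.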

APPENDIX V

This appendix is devoted to a proof of Theorem IV.

Recall that the decision rule for the maximum-likelihood detector in the coherent case is simply to choose the message point closest to the received signal point and to guess that the corresponding signal was transmitted. Accordingly, an error will occur if, and only if, when $s_i(t)$ is transmitted, the received signal point lies closer to one of the message points $S_j$ ($j \neq i$) than to the message point $S_i$. Denoting the distance between message


points $S_i$ and $S_j$ by $\rho_{ij}$ and the noise components originating at the point $S_i$ and directed towards the point $S_j$ by $n_{ij}$, it follows that the probability of this event, which equals the probability of an error, $P_{e_i}$, when $s_i(t)$ is transmitted, is equal to

$$P_{e_i} = P\left[\, n_{ij} > \tfrac{1}{2}\rho_{ij} \ \text{for at least one } j \neq i \,\right] \qquad (187)$$

where $n_{ij}$ is a Gaussian random variable with mean zero and variance $N_0$. Since the events $n_{ij} > \rho_{ij}/2$ are not necessarily mutually exclusive, $P_{e_i}$ is certainly less than or equal to

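$$\sum_{j \neq i} P\left[\, n_{ij} > \tfrac{1}{2}\rho_{ij} \,\right]. \qquad (188)$$

Now, noting that

$$P\left[\, n_{ij} > \tfrac{1}{2}\rho_{ij} \,\right] = \frac{1}{\sqrt{2\pi N_0}} \int_{\rho_{ij}/2}^{\infty} e^{-x^2/2N_0}\, dx \qquad (189)$$

and defining

$$H(\alpha) = \frac{1}{\sqrt{2\pi}} \int_{\alpha}^{\infty} e^{-x^2/2}\, dx, \qquad (190)$$

it follows, since $H(\rho_{ij}/2\sqrt{N_0})$ is a monotonically decreasing function of its argument, that

$$P_{e_i} \le \sum_{j \neq i} H\!\left(\frac{\rho_{ij}}{2\sqrt{N_0}}\right) \le (m - 1)\, H\!\left(\frac{\rho_i}{2\sqrt{N_0}}\right) \qquad (191)$$

where

$$\rho_i = \min_{j \neq i}\, [\rho_{ij}]. \qquad (192)$$

The quantity $\rho_i$ defined in (192) is equal to the distance between the signal point $S_i$ and its nearest neighbor. It is clear [by (187) and (192)] that

$$P_{e_i} \ge H\!\left(\frac{\rho_i}{2\sqrt{N_0}}\right) \qquad (193)$$

which result, when combined with (191), yields

$$H\!\left(\frac{\rho_i}{2\sqrt{N_0}}\right) \le P_{e_i} \le (m - 1)\, H\!\left(\frac{\rho_i}{2\sqrt{N_0}}\right). \qquad (194)$$

The average probability of error,

$$P_e = \frac{1}{m} \sum_{i=1}^{m} P_{e_i}, \qquad (195)$$

is, therefore, by (194), bounded by

$$\frac{1}{m} \sum_{i=1}^{m} H\!\left(\frac{\rho_i}{2\sqrt{N_0}}\right) \le P_e \le \frac{m - 1}{m} \sum_{i=1}^{m} H\!\left(\frac{\rho_i}{2\sqrt{N_0}}\right). \qquad (196)$$

Since the second derivative of $H(\rho_i/2\sqrt{N_0})$ (with respect to its argument) exists and is greater than or equal to zero if $\rho_i \ge 0$, it follows that $H(\rho_i/2\sqrt{N_0})$ is a convex function$^{29}$ for $\rho_i \ge 0$. That is to say,

$$\frac{1}{m} \sum_{i=1}^{m} H\!\left(\frac{\rho_i}{2\sqrt{N_0}}\right) \ge H\!\left(\frac{1}{m} \sum_{i=1}^{m} \frac{\rho_i}{2\sqrt{N_0}}\right). \qquad (197)$$

Defining the symbol $\rho$ as the average of $\rho_i$, we have

$$\rho = \frac{1}{m} \sum_{i=1}^{m} \rho_i \qquad (198)$$

and combining (197) and (198) with the left-hand inequality of (196), we get

$$P_e \ge H\!\left(\frac{\rho}{2\sqrt{N_0}}\right). \qquad (199)$$

Further, defining the symbol

$$\rho^* = \min_{i}\, [\rho_i] \qquad (200)$$

and noting that

$$H\!\left(\frac{\rho_i}{2\sqrt{N_0}}\right) \le H\!\left(\frac{\rho^*}{2\sqrt{N_0}}\right) \qquad (201)$$

we get, by combining (201) with the right-hand inequality of (196),

$$P_e \le (m - 1)\, H\!\left(\frac{\rho^*}{2\sqrt{N_0}}\right). \qquad (202)$$

Thus, by the inequalities of (199) and (202), we have

$$H\!\left(\frac{\rho}{2\sqrt{N_0}}\right) \le P_e \le (m - 1)\, H\!\left(\frac{\rho^*}{2\sqrt{N_0}}\right) \qquad (203)$$

which, by (190), may be rewritten in the form presented in the text, namely,

$$\frac{1}{\sqrt{2\pi}} \int_{\rho/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\, dx \le P_e \le \frac{m - 1}{\sqrt{2\pi}} \int_{\rho^*/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\, dx.$$

$^{29}$ G. H. Hardy, J. E. Littlewood and G. Polya, "Inequalities," Cambridge University Press, Cambridge, England; 1959. In particular, note Secs. 3.5 and 3.10.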
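As a numerical illustration of the bounds proved in this appendix, the sketch below evaluates $H(\rho/2\sqrt{N_0})$ and $(m-1)H(\rho^*/2\sqrt{N_0})$ for a given set of message points, where $\rho$ is the average and $\rho^*$ the minimum of the nearest-neighbor distances $\rho_i$. It assumes equally likely signals and noise components of variance $N_0$; the function names `gaussian_tail`, `error_bounds` and `psk_points` are hypothetical, and the PSK constellation is only an example input.

```python
import math


def gaussian_tail(alpha):
    """H(alpha) = (1/sqrt(2*pi)) * integral from alpha to infinity of exp(-x^2/2) dx."""
    return 0.5 * math.erfc(alpha / math.sqrt(2.0))


def error_bounds(points, n0):
    """Lower and upper bounds on P_e from nearest-neighbor distances (illustrative)."""
    m = len(points)
    # rho_i: distance from message point i to its nearest neighbor.
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    rho_avg = sum(nearest) / m   # average nearest-neighbor distance
    rho_min = min(nearest)       # smallest nearest-neighbor distance
    scale = 2.0 * math.sqrt(n0)
    lower = gaussian_tail(rho_avg / scale)
    upper = (m - 1) * gaussian_tail(rho_min / scale)
    return lower, upper


def psk_points(m, energy):
    """m message points equally spaced on a circle of radius sqrt(energy)."""
    r = math.sqrt(energy)
    return [
        (r * math.cos(2.0 * math.pi * i / m), r * math.sin(2.0 * math.pi * i / m))
        for i in range(m)
    ]


if __name__ == "__main__":
    print(error_bounds(psk_points(4, 8.0), 1.0))
```

For a symmetric constellation such as PSK all $\rho_i$ are equal, so the two bounds differ only by the factor $m - 1$.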