

Wasyl Wasylkiwskyj

Signals and Transforms in Linear Systems Analysis


Wasyl Wasylkiwskyj
Professor of Engineering and Applied Science
The George Washington University
Washington, DC, USA

ISBN 978-1-4614-3286-9        ISBN 978-1-4614-3287-6 (eBook)
DOI 10.1007/978-1-4614-3287-6
Springer New York Heidelberg Dordrecht London

Library of Congress Control Number: 2012956318

© Springer Science+Business Media, LLC 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


Preface

This book deals with aspects of mathematical techniques and models that constitute an important part of the foundation for the analysis of linear systems. The subject is classical and forms a significant component of linear systems theory. It includes the Fourier, Laplace, Z-, and related transforms, in both their continuous and discrete versions. The subject is an integral part of electrical engineering curricula and is covered in many excellent textbooks. In light of this, an additional book dealing with the same topics would appear superfluous. What distinguishes this book is that the same topics are viewed from a distinctly different perspective. Rather than dealing with the different transforms essentially in isolation, a methodology is developed that unifies the classical portion of the subject and permits the inclusion of topics that usually are not considered part of linear systems theory. The unifying principle here is the least mean square approximation, the normal equations, and their extensions to the continuum. This approach gives equal status to expansions in terms of special functions (which need not be orthogonal), Fourier series, Fourier integrals, and discrete transforms. As a by-product one also gains new insights. For example, the Gibbs phenomenon is a general property of LMS convergence at step discontinuities and is not limited to Fourier series.

This book is suitable for a first-year graduate course that provides a transition from the level at which the subject is presented in an undergraduate course in signals and systems to a level more appropriate as a prerequisite for graduate work in specialized fields. The material presented here is based in part on the notes used for a similar course taught by the author in the School of Electrical and Computer Engineering at The George Washington University. The six chapters can be covered in one semester, with sufficient flexibility in the choice of topics within each chapter. The exception is Chap. 1 which, in the spirit of the intended unity, sets the stage for the remainder of the book: it includes the mathematical foundation and the methodology applied in the chapters to follow.

The prerequisites for the course are an undergraduate course in signals and systems, elements of linear algebra, and the theory of functions of a complex variable. Recognizing that the preparation in the latter, if any, is frequently sketchy, the necessary material is presented in the Appendix.

Wasyl Wasylkiwskyj


Contents

1 Signals and Their Representations  1
  1.1 Signal Spaces and the Approximation Problem  1
  1.2 Inner Product, Norm and Representations by Finite Sums of Elementary Functions  4
    1.2.1 Inner Product and Norm  4
    1.2.2 Orthogonality and Linear Independence  7
    1.2.3 Representations by Sums of Orthogonal Functions  12
    1.2.4 Nonorthogonal Expansion Functions and Their Duals  14
    1.2.5 Orthogonalization Techniques  16
  1.3 The LMS Approximation and the Normal Equations  19
    1.3.1 The Projection Theorem  19
    1.3.2 The Normal Equations  21
    1.3.3 Generalizations of the Normal Equations  22
    1.3.4 LMS Approximation and Stochastic Processes*  25
  1.4 LMS Solutions via the Singular Value Decomposition  27
    1.4.1 Basic Theory Underlying the SVD  27
    1.4.2 Solutions of the Normal Equations Using the SVD  30
    1.4.3 Signal Extraction from Noisy Data  32
    1.4.4 The SVD for the Continuum  35
    1.4.5 Frames  37
    1.4.6 Total Least Squares  39
    1.4.7 Tikhonov Regularization  43
  1.5 Finite Sets of Orthogonal Functions  44
    1.5.1 LMS and Orthogonal Functions  44
    1.5.2 Trigonometric Functions  45
    1.5.3 Orthogonal Polynomials [1]  47
  1.6 Singularity Functions  52
    1.6.1 The Delta Function  52
    1.6.2 Higher Order Singularity Functions  59
    1.6.3 Idealized Signals  61
    1.6.4 Representation of Functions with Step Discontinuities  63
    1.6.5 Delta Function with Functions as Arguments  65
  1.7 Infinite Orthogonal Systems  66
    1.7.1 Deterministic Signals  66
    1.7.2 Stochastic Signals: Karhunen–Loeve Expansion*  68

2 Fourier Series and Integrals with Applications to Signal Analysis  75
  2.1 Fourier Series  75
    2.1.1 Pointwise Convergence at Interior Points for Smooth Functions  75
    2.1.2 Convergence at Step Discontinuities  78
    2.1.3 Convergence at Interval Endpoints  82
    2.1.4 Delta Function Representation  84
    2.1.5 The Fejer Summation Technique  86
    2.1.6 Fundamental Relationships Between the Frequency and Time Domain Representations  92
    2.1.7 Cosine and Sine Series  94
    2.1.8 Interpolation with Sinusoids  98
    2.1.9 Anharmonic Fourier Series  104
  2.2 The Fourier Integral  107
    2.2.1 LMS Approximation by Sinusoids Spanning a Continuum  107
    2.2.2 Transition to an Infinite Observation Interval: The Fourier Transform  108
    2.2.3 Completeness Relationship and Relation to Fourier Series  109
    2.2.4 Convergence and the Use of CPV Integrals  111
    2.2.5 Canonical Signals and Their Transforms  114
    2.2.6 Basic Properties of the FT  117
    2.2.7 Convergence at Discontinuities  128
    2.2.8 Fejer Summation  128
  2.3 Modulation and Analytic Signal Representation  132
    2.3.1 Analytic Signals  132
    2.3.2 Instantaneous Frequency and the Method of Stationary Phase  134
    2.3.3 Bandpass Representation  140
    2.3.4 Bandpass Representation of Random Signals*  143
  2.4 Fourier Transforms and Analytic Function Theory  148
    2.4.1 Analyticity of the FT of Causal Signals  148
    2.4.2 Hilbert Transforms and Analytic Functions  149
    2.4.3 Relationships Between Amplitude and Phase  152
    2.4.4 Evaluation of Inverse FT Using Complex Variable Theory  154
  2.5 Time-Frequency Analysis  159
    2.5.1 The Uncertainty Principle  159
    2.5.2 The Short-Time Fourier Transform  163
  2.6 Frequency Dispersion  168
    2.6.1 Phase and Group Delay  168
    2.6.2 Phase and Group Velocity  171
    2.6.3 Effects of Frequency Dispersion on Pulse Shape  173
    2.6.4 Another Look at the Propagation of a Gaussian Pulse When β′′′(ω₀) ≠ 0  180
    2.6.5 Effects of Finite Transmitter Spectral Line Width*  182
  2.7 Fourier Cosine and Sine Transforms  185

3 Linear Systems  191
  3.1 Fundamental Properties  191
    3.1.1 Single-valuedness, Reality, and Causality  191
    3.1.2 Impulse Response  193
    3.1.3 Step Response  196
    3.1.4 Stability  196
    3.1.5 Time-invariance  197
  3.2 Characterizations in terms of Input/Output Relationships  199
    3.2.1 LTI Systems  199
    3.2.2 Time-varying Systems  201
  3.3 Linear Systems Characterized by Ordinary Differential Equations  207
    3.3.1 First-Order Differential Equations  207
    3.3.2 Second-Order Differential Equations  213
    3.3.3 N-th Order Differential Equations  225

4 Laplace Transforms  235
  4.1 Single-Sided Laplace Transform  235
    4.1.1 Analytic Properties  235
    4.1.2 Singularity Functions  239
    4.1.3 Some Examples  240
    4.1.4 Inversion Formula  241
    4.1.5 Fundamental Theorems  243
    4.1.6 Evaluation of the Inverse LT  248
  4.2 Double-Sided Laplace Transform  260
    4.2.1 Definition and Analytic Properties  260
    4.2.2 Inversion Formula  261
    4.2.3 Relationships Between the FT and the Unilateral LT  267

5 Bandlimited Functions Sampling and the Discrete Fourier Transform  271
  5.1 Bandlimited Functions  271
    5.1.1 Fundamental Properties  271
    5.1.2 The Sampling Theorem  274
    5.1.3 Sampling Theorem for Stationary Random Processes*  276
  5.2 Signals Defined by a Finite Number of Samples  278
    5.2.1 Spectral Concentration of Bandlimited Signals  281
    5.2.2 Aliasing  283
  5.3 Sampling  286
    5.3.1 Impulse Sampling  286
    5.3.2 Zero-Order Hold Sampling and Reconstruction  287
    5.3.3 BandPass Sampling  290
    5.3.4 Sampling of Periodic Signals  293
  5.4 The Discrete Fourier Transform  297
    5.4.1 Fundamental Definitions  297
    5.4.2 Properties of the DFT  300

6 The Z-Transform and Discrete Signals  311
  6.1 The Z-Transform  311
    6.1.1 From FS to the Z-Transform  311
    6.1.2 Direct ZT of Some Sequences  316
    6.1.3 Properties  317
  6.2 Analytical Techniques in the Evaluation of the Inverse ZT  320
  6.3 Finite Difference Equations and Their Use in IIR and FIR Filter Design  327
  6.4 Amplitude and Phase Relations Using the Discrete Hilbert Transform  331
    6.4.1 Explicit Relationship Between Real and Imaginary Parts of the FT of a Causal Sequence  331
    6.4.2 Relationship Between Amplitude and Phase of a Transfer Function  333
    6.4.3 Application to Design of FIR Filters  334

A Introduction to Functions of a Complex Variable  337
  A.1 Complex Numbers and Complex Variables  337
    A.1.1 Complex Numbers  337
    A.1.2 Function of a Complex Variable  341
  A.2 Analytic Functions  342
    A.2.1 Differentiation and the Cauchy–Riemann Conditions  342
    A.2.2 Properties of Analytic Functions  344
    A.2.3 Integration  345
  A.3 Taylor and Laurent Series  349
    A.3.1 The Cauchy Integral Theorem  349
    A.3.2 The Taylor Series  351
    A.3.3 Laurent Series  354
  A.4 Singularities of Functions and the Calculus of Residues  356
    A.4.1 Classification of Singularities  356
    A.4.2 Calculus of Residues  361

Bibliography  369

Index  371

*Subsections marked with an asterisk are supplements and not part of the main text.


Introduction

Although the book's primary purpose is to serve as a text, the basic nature of the subject and the selection of topics should make it also of interest to a wider audience. The general idea behind the text is twofold. One is to close the gap that usually exists between the level of students' preparation in transform calculus acquired in undergraduate studies and the level needed as preparation for graduate work. The other is to broaden the student's intellectual horizon.

The approach adopted herein is to exclude many of the topics that are usually covered in undergraduate linear systems texts, select those that in the opinion of the author serve as the common denominator for virtually all electrical engineering disciplines, and present them within the unifying framework of the generalized normal equations. The selected topics include Fourier analysis, both in its discrete and continuous formats, its ramifications in time-frequency analysis and frequency dispersion, and its ties to linear systems theory, wherein equal status is accorded to LTI and time-varying systems. The Laplace and Z-transforms are presented with special emphasis on their connection with Fourier analysis.

The book begins within a rather abstract mathematical framework that could be discouraging for a beginner. Nevertheless, to pave the path to the material in the following chapters, I could not find a simpler approach. The introductory mathematics is largely contained in the first chapter. The following is the synopsis.

Starting on familiar ground, a signal is defined as a piecewise differentiable function of time, and the system input/output relation as a mapping by an operator from its domain onto its range. Along with the restriction to linear operators, the representation of a signal as a sum of canonical expansion functions is introduced. The brief discussion of error criteria, with focus on the LMS (least mean square) approximation, is followed by an examination of the basic linear algebra concepts: norm, inner product, linear independence, and orthogonality. To emphasize the conceptual unity of the subject, analogue signals and their discrete counterparts are given equal status.

The LMS problem is viewed from the standpoint of the normal equations. The regularization of ill-conditioned matrices is studied using the SVD (singular value decomposition). Its noise-suppression attributes are examined via numerical examples. The TLS (total least squares) solution technique is discussed briefly, as is Tikhonov regularization.


The normal equations present us with three algebraically distinct but conceptually identical representations. The first is the discrete form, which solves the problem of minimizing the MS error in the solution of an overdetermined system:

$$\boldsymbol{\phi}_k^H \mathbf{f} = \sum_{n=1}^{N} \boldsymbol{\phi}_k^H \boldsymbol{\phi}_n f_n, \quad k = 1, \ldots, N; \qquad \boldsymbol{\phi}_k = \left[ \phi_{1k}\ \phi_{2k}\ \ldots\ \phi_{Mk} \right]^T, \quad M > N \tag{a}$$

A representation wherein the expansion function set is discrete but the independent variable extends over a continuum is

$$\int_{T_1}^{T_2} \phi_k^*(t) f(t)\, dt = \sum_{n=1}^{N} \left[ \int_{T_1}^{T_2} \phi_k^*(t)\, \phi_n(t)\, dt \right] f_n, \quad k = 1, \ldots, N \tag{b}$$

A further generalization follows by letting both the summation index and the independent variable form a continuum. In that case one obtains

$$\int_{T_1}^{T_2} \phi^*(\omega, t) f(t)\, dt = \int_{-\Omega_1}^{\Omega_2} \left[ \int_{T_1}^{T_2} \phi^*(\omega, t)\, \phi(\eta, t)\, dt \right] f(\eta)\, d\eta, \quad -\Omega_1 < \omega < \Omega_2 \tag{c}$$

The formal identity of the first and the second forms follows when one interprets the independent variable in the second equation as a row index. In the last version the dependent variable is continuous as well, so that the sum is replaced by an integral.

In the formulation adopted herein the normal equations embody the essence of linear transform theory: the second form above leads to a representation as a sum of basis functions, whereas the last equation encompasses integral transforms such as the Fourier integral.
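As a concrete illustration of the discrete form (a), the following sketch (our own construction, not from the text; the monomial basis, sample grid, and test signal are arbitrary illustrative choices) assembles and solves the normal equations for an overdetermined system and checks the result against a library least-squares routine:

```python
# Sketch of the discrete normal equations (a): minimize the MS error of an
# overdetermined system by solving  Phi^H Phi f = Phi^H f_data.
# Basis, grid, and signal below are illustrative assumptions.
import numpy as np

M, N = 50, 4                                      # M samples, N functions, M > N
t = np.linspace(-1.0, 1.0, M)
Phi = np.stack([t**n for n in range(N)], axis=1)  # k-th column holds samples of phi_k
f_data = np.exp(t)                                # signal to be approximated

G = Phi.conj().T @ Phi                            # entries phi_k^H phi_n
b = Phi.conj().T @ f_data                         # entries phi_k^H f
coeffs = np.linalg.solve(G, b)                    # LMS expansion coefficients f_n

# Cross-check against a standard least-squares solver:
print(np.allclose(coeffs, np.linalg.lstsq(Phi, f_data, rcond=None)[0]))  # True
```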

Representations of signals by finite sums of orthogonal basis functions are illustrated using trigonometric functions and Legendre, Laguerre, and Hermite polynomials. The mathematical rigor needed to examine the limiting forms as the number of expansion functions approaches infinity is outside the scope of this text. Instead, as in most engineering-oriented works, we use the following semi-heuristic approach. The starting point is to use the analogy with linear algebra and argue that the LMS error is in general not equal to zero unless the given signal is in the subspace spanned by the basis functions. As the number of basis functions is allowed to approach infinity (assuming the series converges) the LMS error need not approach zero. If it does, the series will converge pointwise and the (infinite) set of basis functions is said to be complete. The first step in testing a given set of basis functions for completeness is to sum a truncated series of basis functions, multiply the resulting kernel by the specified function, and integrate the product. If the set is complete, the limit of this integral, as the number of basis functions approaches infinity, will converge to the given function on a dense set of points. This property defines a delta function, which is used liberally throughout the text.
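The completeness test just described is easy to mimic numerically. In the sketch below (our own illustration; the truncated Fourier kernel on (−π, π), the test function, and the evaluation point are all assumptions) the truncated kernel is integrated against a smooth periodic function, and the result approaches the function value as the truncation grows:

```python
# Numerical version of the completeness test: sum a truncated Fourier kernel,
# integrate it against a smooth 2*pi-periodic function, and watch the value
# at t0 emerge. Kernel, test function, and t0 are illustrative assumptions.
import numpy as np

tp = np.linspace(-np.pi, np.pi, 4001)     # integration grid
f = np.exp(np.sin(tp))                    # smooth periodic test function
t0 = 0.7

def kernel(N, tau):
    # Truncated kernel (1/2pi) * sum_{n=-N}^{N} exp(1j*n*tau)
    n = np.arange(-N, N + 1)
    return np.exp(1j * np.outer(tau, n)).sum(axis=1).real / (2 * np.pi)

for N in (2, 8, 32):
    print(N, np.trapz(kernel(N, t0 - tp) * f, tp))   # -> exp(sin(0.7)) ~ 1.9046
```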


Chapter 1

Signals and Their Representations

1.1 Signal Spaces and the Approximation Problem

We shall use the term signal to designate a function of time. Unless specified otherwise, the time shall be allowed to assume all real values. In order not to dwell on mathematical generalities that are of peripheral interest in engineering problems, the mathematical functions which we shall employ to represent these signals will be restricted to those that are piecewise differentiable. We shall also be interested in treating a collection of such signals as a single entity, in which case we shall define these signals as components of a vector. For example, with sₙ(t), n = 1, 2, 3, ..., N the given signal set, the corresponding vector shall be denoted by s(t) with components denoted by

$$\mathbf{s}(t) = \left[ s_1(t)\ s_2(t)\ s_3(t)\ \ldots\ s_N(t) \right]^T, \tag{1.1}$$

where T is the transpose. We shall be interested in developing a representational methodology that will facilitate the study of the transformation of such signals by systems, more specifically by an important class of systems known as linear systems. In the most general sense a system may be defined by the operation it performs on the signal set (1.1), taken as the input to the system, which operation gives rise to another signal set,

$$\mathbf{y}(t) = \left[ y_1(t)\ y_2(t)\ y_3(t)\ \ldots\ y_M(t) \right]^T, \tag{1.2}$$

which we designate as the output signal set. Note that the dimension of the output set may differ from the dimension of the input set. This "black box" description can be represented schematically as in Fig. 1.1.

We could also formally indicate this transformation by writing

$$\mathbf{y}(t) = T\{\mathbf{s}(t)\}, \tag{1.3}$$



Figure 1.1: Input/output schematization of a system

Figure 1.2: A system interpreted as a mapping

where the symbol T stands for an operator which defines the mathematical operation performed on the input. Another way of schematizing such transformations is to think of (1.3) as a mapping of a set of inputs (the domain of the operator) into a set of outputs (the range of the operator). This mapping notion can be conveyed pictorially as in Fig. 1.2. The mappings of interest to us shall be exclusively linear, by which we mean

$$T\{s_1(t) + s_2(t) + \cdots\} = T\{s_1(t)\} + T\{s_2(t)\} + \cdots$$

We shall return to this operator description and discuss it in detail in Chap. 3. Presently we shall focus entirely on alternative mathematical representations of signals.

It frequently happens that we would like to represent a given signal (time function) by a sum of other signals (functions). Such an approach may be desirable for several reasons. For example, we may be dealing with a system transformation which is especially simple in terms of a certain class of functions (as is exemplified by the use of exponentials for linear time-invariant systems), or a particular function class may render the transformation of the signal by the given system less susceptible to noise interference (which is routinely done through special choices of waveforms in communications and radar). Still other reasons may be related to certain desirable scaling properties possessed by the chosen function class (as, e.g., in the use of wavelet transforms in the interpretation of complex seismic signals). In any case, the general problem of signal representation may be phrased as follows. Given a function f(t) in the interval a ≤ t ≤ b, and a set of functions φ₁(t), φ₂(t), ..., φN(t) defined in the same interval, we should like to use a linear combination of these to approximate f(t), i.e.,

$$f(t) \sim \sum_{n=1}^{N} f_n \phi_n(t), \tag{1.4}$$

where the fₙ are expansion coefficients. In what sense the sum on the right of (1.4) is to approximate the given function remains to be specified. Whatever our criterion, there will in general be a residual error depending on the chosen functions and, of course, on N. Denoting this error by eN(t), we can rewrite (1.4) using the equality sign as follows:

$$f(t) = \sum_{n=1}^{N} f_n \phi_n(t) + e_N(t). \tag{1.5}$$

We should like to choose the coefficients fₙ such that the error eN(t) is small within the interval a ≤ t ≤ b in a sense that remains to be specified. For example, we could stipulate that the coefficients be chosen such that

$$\sup_{a \le t \le b} |e_N(t)| = \alpha, \tag{1.6}$$

where α is a specified positive constant which is to be minimized by a suitable choice of the coefficients. In this form the approximation problem can be solved explicitly, provided one constrains the expansion functions to polynomials. The solution can be phrased most succinctly in terms of Tschebytscheff polynomials, which find extensive application in the design of frequency-selective filters. For other classes of expansion functions, the approximation problem in this form can only be attempted numerically.

Another way in which the approximation problem can be phrased is to require that the error vanishes identically on a specific set of points, i.e.,

$$e_N(t_k) = 0; \quad k = 1, 2, 3, \ldots, N. \tag{1.7}$$

The solution for the unknown coefficients is then formally reduced to the solution of the set of algebraic equations, viz.,

$$f(t_k) = \sum_{n=1}^{N} f_n \phi_n(t_k) \tag{1.8}$$

for some specified set of φₙ(t). In this form the problem is usually referred to as the interpolation problem and the chosen functions φₙ(t) are referred to as interpolation functions. We shall touch on the interpolation problem only very briefly later. The approximation we shall focus on almost exclusively is the so-called least mean square (LMS) approximation. More than any other approximation technique it has become an indispensable tool in the most diverse applications and provides a unifying methodology with an amazingly broad scope. The basic idea had its relatively humble origins in the method of least squares introduced by Carl Friedrich Gauss almost two centuries ago. Since then it has been substantially expanded in scope and plays a significant role not only in modern functional analysis but also in a variety of applied disciplines. Indeed, it appears difficult today to name an area of mathematical physics, statistics, estimation theory, or data analysis where it does not play at least some role. Our use of the principle of least squares in this work will incline toward its humbler aspects. Thus, we shall relinquish any pretense to mathematical generality. For example, we shall avoid the use of the theory of Lebesgue integration but instead stick to the pedestrian notion of the Riemann integral. Additionally, the class of functions we shall be concerned with shall be no larger than the class of piecewise differentiable functions. Significant as these restrictions may appear from a purely mathematical standpoint, they do not materially restrict the range of engineering problems to which our results are applicable.

In our terminology, the least squares approximation problem shall be phrased as follows. Given N functions φₙ(t) and a function f(t) in the interval a ≤ t ≤ b, we seek N coefficients fₙ in (1.5) such that the mean squared error εN,

$$\varepsilon_N \equiv \int_a^b |e_N(t)|^2\, dt \tag{1.9}$$

is minimized, wherein f(t) and the expansion functions, and therefore the coefficients, may in general be complex. The formal solution to this problem is straightforward and can be obtained, e.g., by substituting (1.5) into (1.9), differentiating with respect to the coefficients, and setting the result to zero. Before actually carrying this out it will be convenient to digress from the main theme and introduce a notation which will not only simplify the bookkeeping but will also provide interesting and useful geometrical interpretations of the results.

1.2 Inner Product, Norm and Representations by Finite Sums of Elementary Functions

1.2.1 Inner Product and Norm

Given two complex functions f(t) and g(t) we define their inner product (f, g) by

$$(f, g) \equiv \int_a^b f^*(t)\, g(t)\, dt, \tag{1.10}$$

where ∗ denotes the complex conjugate. When f(t) = g(t), we introduce the special notation

$$(f, f) \equiv \|f\|^2, \tag{1.11}$$


where ‖f‖ is defined as the norm of f(t). Explicitly,

$$\|f\| = \sqrt{\int_a^b |f(t)|^2\, dt}, \tag{1.12}$$

which may be taken as the generalization of the length of a vector. This terminology may be justified by recalling that for a real M-dimensional vector x,

$$\mathbf{x}^T = \left[ x_1\ x_2\ x_3\ \ldots\ x_M \right], \tag{1.13}$$

where the superscript T defines the transpose, the length of x is defined by $\sqrt{\sum_{n=1}^{M} x_n^2}$, which is nothing more than the Pythagorean theorem in M dimensions. Alternatively, in matrix notation this can be written as $\sqrt{\mathbf{x}^T\mathbf{x}}$. For vectors with complex components the last two definitions generalize to $\sqrt{\sum_{n=1}^{M} |x_n|^2}$ and $\sqrt{\mathbf{x}^H\mathbf{x}}$, respectively, where the superscript H denotes the complex conjugate transpose. Evidently, if one interprets the integral in (1.12) as a limit of a sum of infinitesimals (a Riemann sum), definition (1.12) appears to be a natural extension of the concept of length of a vector to an infinite-dimensional function space. Actually such analogies can be dispensed with if one accepts an axiomatic definition of the norm. Thus the norm of a function or a finite- or even infinite-dimensional vector is defined by the following three postulates:

$$\|f\| > 0 \ \text{and}\ \|f\| = 0 \ \text{if and only if}\ f = 0, \tag{1.14a}$$

$$\|f\| + \|g\| \ge \|f + g\| \quad \text{(triangle inequality)}, \tag{1.14b}$$

$$\|\lambda f\| = |\lambda|\, \|f\|\ \text{for any complex constant}\ \lambda. \tag{1.14c}$$

Clearly (1.12) satisfies (1.14a) and (1.14c). To prove that it also satisfies (1.14b), consider two functions f and g, each not identically zero, and define

$$w = \alpha f + \beta g, \tag{1.15}$$

where α and β are constants. We form the inner product (f, w) = α(f, f) + β(f, g), and since (f, f) ≠ 0 we can always find an α such that (f, w) = 0, to wit,

$$\alpha = -\beta\, \frac{(f, g)}{(f, f)}. \tag{1.16}$$

Since (w, w) ≥ 0 we have

$$(\alpha f + \beta g,\ \alpha f + \beta g) = |\alpha|^2 (f, f) + \beta^* \alpha\, (g, f) + \alpha^* \beta\, (f, g) + |\beta|^2 (g, g) \ge 0.$$

Substituting for α from (1.16) we note that the first and second terms on the right of the equality sign mutually cancel, so that after multiplication by (f, f) the entire inequality may be replaced by $-|\beta|^2 |(f, g)|^2 + |\beta|^2 (f, f)(g, g) \ge 0$. After dividing both sides by |β|², moving the first term to the right of the inequality sign, and using the definition of the norm, we obtain

$$|(f, g)|^2 \le \|f\|^2 \|g\|^2, \tag{1.17}$$


where the equality sign applies if and only if w = αf + βg = 0, or, equivalently, f = λg, with λ a (complex) constant.

The inequality (1.17) is called the Schwarz (sometimes the Cauchy–Schwarz) inequality and is of considerable importance in its own right.¹ It implies the triangle inequality (1.14b), as can be seen in the following development. The distributive nature of the inner product permits us to write

$$\|f + g\|^2 = (f, f + g) + (g, f + g) = \|f\|^2 + \|g\|^2 + (f, g) + (g, f).$$

Since the last two terms are complex conjugates, we get the bound

$$\|f + g\|^2 \le \|f\|^2 + \|g\|^2 + 2\,|(f, g)|. \tag{1.18}$$

By the Schwarz inequality (1.17), |(f, g)| ≤ ‖f‖ ‖g‖, so that the right side is bounded by the square of the sum of the norms, i.e.,

$$\|f + g\|^2 \le \left( \|f\| + \|g\| \right)^2. \tag{1.19}$$

Upon taking the square root of both sides, (1.14b) follows.

It is by no means true that (1.12) is the only definition that satisfies the norm postulates. Thus the following definition of the norm,

$$\|f\| = \left[ \int_a^b |f(t)|^p\, dt \right]^{1/p}, \tag{1.20}$$

where p ≥ 1, can also be shown to satisfy (1.14) (see Problem 1.10). Equation (1.20) is said to define the p-norm. In accordance with this terminology, (1.12) can be said to correspond to the 2-norm. Our interest in norms for p other than two will be only peripheral. Unless specified otherwise we shall be dealing exclusively with 2-norms.
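For sampled functions, the Schwarz inequality (1.17) and the triangle inequality (1.14b) are easy to spot-check numerically. The sketch below (our own; the two complex functions and the grid are arbitrary illustrative choices) approximates the inner product (1.10) by a Riemann sum:

```python
# Numeric spot-check of the Schwarz inequality (1.17) and the triangle
# inequality (1.14b); the two complex functions and the grid are arbitrary
# illustrative choices.
import numpy as np

t = np.linspace(0.0, 1.0, 2001)
dt = t[1] - t[0]
f = t * np.exp(2j * np.pi * t)
g = np.cos(3 * t) + 1j * t**2

inner = np.sum(np.conj(f) * g) * dt                  # (f, g) as a Riemann sum
norm = lambda x: np.sqrt(np.sum(np.abs(x)**2) * dt)  # the 2-norm of (1.12)

print(abs(inner)**2 <= norm(f)**2 * norm(g)**2)      # True  (Schwarz)
print(norm(f + g) <= norm(f) + norm(g))              # True  (triangle)
```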

The concepts of inner product and norm apply equally well to finite-dimensional or denumerably infinite-dimensional vectors, with the integrals replaced by sums. In the course of the following chapters we shall be dealing with functions defined over a continuum as well as with their discrete counterparts defined over discrete times only. The latter shall be described by vectors with a finite or a denumerably infinite number of components. Although the formal properties of the norm and the inner product are identical in all these cases, so that, in principle, a uniform notation could be adopted, we shall not do so. Instead, for the inner product of finite-dimensional or denumerably infinite-dimensional vectors, matrix notation shall be employed. Thus for two M-dimensional vectors x and y the inner product, defined by the sum $\sum_{n=1}^{M} x_n^* y_n$, shall be written in general as $\mathbf{x}^H\mathbf{y}$, or as $\mathbf{x}^T\mathbf{y}$ for real vectors, and not (x, y); the latter symbol shall be reserved exclusively for functions defined on a continuous time interval. On the other hand, a common notation shall be used for the norm, e.g., ‖x‖ and ‖f‖. For example, the Schwarz inequality for the vectors x and y reads

$$\left| \mathbf{x}^H \mathbf{y} \right|^2 \le \|\mathbf{x}\|^2 \|\mathbf{y}\|^2. \tag{1.21}$$

¹There are many other proofs of the Schwarz inequality. A particularly simple one is the following. For any two functions we have the identity $\|f\|^2 \|g\|^2 - |(f, g)|^2 = \frac{1}{2} \iint |f(x)g(y) - f(y)g(x)|^2\, dx\, dy.$ Since the right side is nonnegative, (1.17) follows (from Leon Cohen, "Time-Frequency Analysis," Prentice Hall PTR, Englewood Cliffs, New Jersey (1995), p. 47).

The familiar interpretation of the inner product in three-dimensional Euclidean space as the product of the magnitudes of two vectors and the cosine of the angle between the vectors can be extended without difficulty to spaces with higher dimensions and to function spaces. Thus if the vectors x and y are real, we have

$$\mathbf{x}^T\mathbf{y} = \|\mathbf{x}\|\, \|\mathbf{y}\| \cos\theta. \tag{1.22}$$

Since the vectors are assumed real, the factor cos θ is necessarily real. Now the Schwarz inequality (1.21) guarantees that the magnitude of cos θ is less than unity, i.e., that the angle θ is real. Also, if f and g are real functions defined on a continuum of values a ≤ t ≤ b, (1.22) reads

$$(f, g) = \|f\|\, \|g\| \cos\theta. \tag{1.23}$$

Again the Schwarz inequality in (1.17) guarantees a real θ, even though the visualization of an angle between two functions may present a bit of a problem. With these geometrical analogies, we can extend the notion of orthogonality of two vectors from its familiar geometrical setting in three-dimensional Euclidean space to any number of dimensions or, for that matter, also to function space. For example, we can define two functions as orthogonal if θ = π/2 or, equivalently, if the inner product vanishes, i.e.,

$$(f, g) = 0. \tag{1.24}$$

When the functions or (and) the vectors are complex, (1.22) and (1.23) still apply except that the factor cos θ becomes a complex number, with magnitude less than unity, which, unfortunately, no longer admits the interpretation as the cosine of an angle. Nevertheless, orthogonality of two complex-valued functions is still defined by the condition that their inner product vanishes, as stipulated by (1.24).

1.2.2 Orthogonality and Linear Independence

The concept of orthogonality defined by (1.24) for two functions can be generalized to any finite or infinite set. Thus the set of functions φₙ(t) with n = 1, 2, ..., N will be called orthogonal in a ≤ t ≤ b if

$$(\phi_n, \phi_m) = \begin{cases} 0; & n \ne m, \\ Q_n; & n = m, \end{cases} \tag{1.25}$$


where Qₙ is a normalization constant. It is sometimes convenient to normalize the functions such that Qₙ = 1. In that case the functions shall be referred to as orthonormal. As a notational convenience the orthogonality condition shall be represented by

$$(\phi_n, \phi_m) = Q_n \delta_{nm}, \tag{1.26}$$

where δₙₘ is the so-called Kronecker symbol defined by

$$\delta_{nm} = \begin{cases} 0; & n \ne m, \\ 1; & n = m, \end{cases} \tag{1.27}$$

which is recognized as nothing more than an element of a unity matrix in N-dimensional space.
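A quick numerical check of (1.25)–(1.27) for a concrete set (our own choice, not one singled out by the text): the functions sin(nπt) are orthogonal on (0, 1) with Qₙ = 1/2.

```python
# Orthogonality in the sense of (1.25)-(1.26): (phi_n, phi_m) = Q_n*delta_nm
# with phi_n(t) = sin(n*pi*t) on (0, 1) and Q_n = 1/2. The grid size is an
# arbitrary illustrative choice.
import numpy as np

t = np.linspace(0.0, 1.0, 100001)
phi = [np.sin(n * np.pi * t) for n in range(1, 5)]

gram = np.array([[np.trapz(p * q, t) for q in phi] for p in phi])
print(np.round(gram, 6))    # 0.5 on the diagonal, 0 elsewhere
```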

A more general concept than that of orthogonality is that of linear independence. This concept is suggestive of an independent coordinate set or, to venture into the realm of communications theory, of the number of degrees of freedom of a signal. We say that the functions φₙ(t), n = 1, 2, ..., N are linearly independent in a ≤ t ≤ b if the relationship

$$\sum_{n=1}^{N} \alpha_n \phi_n(t) = 0 \tag{1.28}$$

implies αₙ = 0 for all n. Conversely, if (1.28) is satisfied with at least one of the αₙ ≠ 0, the functions are said to be linearly dependent. A necessary and sufficient condition for N functions φₙ(t) to be linearly dependent is that

$$\det\left[ (\phi_m, \phi_n) \right] = 0. \tag{1.29}$$

To prove this, suppose the functions are linearly dependent so that (1.28) holds with at least one αₙ not identically zero. Forming the inner product with φₘ(t) gives

$$\sum_{n=1}^{N} \alpha_n (\phi_m, \phi_n) = 0; \quad m = 1, 2, \ldots, N. \tag{1.30}$$

This system of equations can yield a set of nonzero αₙ only if the determinant of the system vanishes, i.e., when (1.29) holds. On the other hand, suppose (1.29) holds. Then the system (1.30) will yield at least one αₙ different from zero. Now multiply (1.30) by $\alpha_m^*$ and sum over m to obtain

$$\sum_{m=1}^{N} \sum_{n=1}^{N} \alpha_m^* \alpha_n\, (\phi_m, \phi_n) = 0. \tag{1.31}$$

Clearly this is equivalent to

$$\left\| \sum_{n=1}^{N} \alpha_n \phi_n(t) \right\|^2 = 0, \tag{1.32}$$


which implies (1.28). By analogy with finite-dimensional vector spaces we shall call a finite set of linearly independent functions a basis and the functions basis functions.

The determinant Γ ≡ det[(φₘ, φₙ)] in (1.29) is called the Gram determinant and the corresponding matrix

$$\mathbf{G} \equiv \begin{bmatrix} (\phi_1, \phi_1) & (\phi_1, \phi_2) & (\phi_1, \phi_3) & \cdots & (\phi_1, \phi_N) \\ (\phi_2, \phi_1) & (\phi_2, \phi_2) & (\phi_2, \phi_3) & \cdots & (\phi_2, \phi_N) \\ (\phi_3, \phi_1) & (\phi_3, \phi_2) & (\phi_3, \phi_3) & \cdots & (\phi_3, \phi_N) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ (\phi_N, \phi_1) & (\phi_N, \phi_2) & (\phi_N, \phi_3) & \cdots & (\phi_N, \phi_N) \end{bmatrix} \tag{1.33}$$

is known as the Gram matrix. From the definition of the inner product we see that $\mathbf{G} = \mathbf{G}^H$, i.e., the Gram matrix is Hermitian. As we have just shown, Γ ≠ 0 for a linearly independent set of functions. In fact, since

$$\sum_{m=1}^{N} \sum_{n=1}^{N} \alpha_m^* \alpha_n\, (\phi_m, \phi_n) = \left\| \sum_{n=1}^{N} \alpha_n \phi_n(t) \right\|^2 \ge 0, \tag{1.34}$$

the Gram matrix is non-negative definite and necessarily positive definite if the functions are linearly independent. Assuming linear independence, let us write (1.34) in "block matrix" form

$$\boldsymbol{\alpha}^H \mathbf{G}\, \boldsymbol{\alpha} > 0$$

with

$$\boldsymbol{\alpha} = \left[ \alpha_1\ \alpha_2\ \ldots\ \alpha_N \right]^T.$$

Since G is Hermitian, it can be diagonalized by a unitary transformation² U, so that $\mathbf{G} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^H$, with Λ a diagonal matrix comprised of Λₙ, n = 1, 2, ..., N, the N eigenvalues of G. Hence $\boldsymbol{\alpha}^H \mathbf{G} \boldsymbol{\alpha} = \boldsymbol{\alpha}^H \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^H \boldsymbol{\alpha} = \boldsymbol{\beta}^H \boldsymbol{\Lambda} \boldsymbol{\beta} > 0$ with $\boldsymbol{\beta} = \mathbf{U}^H \boldsymbol{\alpha}$. Because G is positive definite, all elements of Λ must be positive, so that

$$\Gamma = \det(\mathbf{G}) = \det\left( \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^H \right) = \prod_{n=1}^{N} \Lambda_n > 0.$$

²$\mathbf{U}^H\mathbf{U} = \mathbf{I}_{NN}$

Thus Γ is always positive for any linearly independent set of functions. We note in passing that Γ ≥ 0 for N > 2 may be taken as the generalization of the Schwarz inequality (1.17).

When the expansion functions are differentiable, the necessary and sufficient conditions for linear independence can also be phrased in terms of the derivatives. This alternative formulation is of fundamental importance in the theory of ordinary linear differential equations. To derive it, consider again (1.28) under the assumption that the functions are linearly dependent. Assume that the functions possess N − 1 derivatives and denote the k-th derivative of φₙ(t) by $\phi_n^{(k)}(t)$. A k-fold differentiation then gives

$$\sum_{n=1}^{N} \alpha_n \phi_n^{(k)}(t) = 0, \quad k = 0, 1, 2, \ldots, N-1; \quad a \le t \le b.$$

This system can have nontrivial solutions for αₙ if and only if

$$W[\phi_1, \phi_2, \ldots, \phi_N] \equiv \det \begin{bmatrix} \phi_1 & \phi_2 & \phi_3 & \cdots & \phi_N \\ \phi_1^{(1)} & \phi_2^{(1)} & \phi_3^{(1)} & \cdots & \phi_N^{(1)} \\ \phi_1^{(2)} & \phi_2^{(2)} & \phi_3^{(2)} & \cdots & \phi_N^{(2)} \\ \vdots & \vdots & \vdots & & \vdots \\ \phi_1^{(N-1)} & \phi_2^{(N-1)} & \phi_3^{(N-1)} & \cdots & \phi_N^{(N-1)} \end{bmatrix} = 0. \tag{1.35}$$

This determinant, known as the Wronskian, will be employed in Chap. 3 when we discuss systems described by linear differential equations. We note in passing that unlike the Gram determinant Γ, which depends on the values assumed by the functions throughout the entire interval, the Wronskian is a pointwise measure of linear independence. Consequently, it may vanish at isolated points within the interval even though Γ ≠ 0. Consider, for example, the functions t and t² in the interval (−1, 1). The corresponding Gram determinant is 4/15 while the Wronskian is t², which vanishes at zero.
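The example is easily reproduced symbolically; the following sketch (ours, using sympy) computes both the Gram determinant and the Wronskian of t and t² on (−1, 1):

```python
# Symbolic check of the example above: for t and t^2 on (-1, 1) the Gram
# determinant is 4/15 while the Wronskian is t^2, which vanishes at t = 0.
import sympy as sp

t = sp.symbols('t')
phi = [t, t**2]
ip = lambda f, g: sp.integrate(f * g, (t, -1, 1))     # real inner product

Gamma = sp.Matrix(2, 2, lambda i, j: ip(phi[i], phi[j])).det()
W = sp.Matrix([[phi[0], phi[1]],
               [sp.diff(phi[0], t), sp.diff(phi[1], t)]]).det()
print(Gamma, sp.expand(W))    # 4/15  t**2
```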

Even though mathematically the given function set is either linearly dependent or linearly independent, from a practical numerical standpoint one can speak of the degree of linear dependence. Thus, based on intuitive geometrical reasoning, we would consider vectors that are nearly parallel as nearly linearly dependent. In fact, our ability to decide numerically between linear dependence and independence will be vitiated in cases of nearly parallel vectors by the presence of noise in the form of round-off errors. On the other hand, we would expect greater immunity to noise interference in cases of nearly orthogonal vectors. It is therefore useful to have a numerical measure of the degree of linear independence. A possible measure of the degree of this linear independence, or, in effect, of the degree of singularity of the inverse of G, is the numerical value of Γ. Other measures, referred to generally as matrix condition numbers, are sometimes more appropriate and will be discussed later.

From the foregoing it should be intuitively obvious that orthogonality of a function set implies linear independence. A formal proof is not difficult to construct. Thus, suppose the N functions are linearly dependent and yet orthogonal. Since linear dependence means that (1.28) holds, we can form the inner product of its left side successively with φₘ(t) for m = 1, 2, ..., N, which then yields αₘ = 0; m = 1, 2, ..., N, thus contradicting the assumption of linear dependence.

The use of the language of three-dimensional Euclidean geometry in describing function spaces suggests that we also borrow the corresponding pictorial representation.

Figure 1.3: Projection vector

For example, suppose the function f(t) can be represented exactly by two orthonormal functions φ₁(t) and φ₂(t). We then have

$$f(t) = \hat f_1 \phi_1(t) + \hat f_2 \phi_2(t). \tag{1.36}$$

Since the two expansion functions are orthonormal, we can at once solve for the two expansion coefficients by multiplying (1.36) by $\phi_1^*(t)$ and $\phi_2^*(t)$ and integrating the result to obtain

$$\hat f_1 = (\phi_1, f), \tag{1.37a}$$

$$\hat f_2 = (\phi_2, f). \tag{1.37b}$$

Just as if we were dealing with finite-dimensional Euclidean vector space instead of function space, we can interpret the coefficients $\hat f_1$ and $\hat f_2$ in (1.37) as projections on a pair of unit basis vectors. Of course, formally these are projections of an infinite-dimensional vector f(t) along the directions of the two "orthogonal unit vectors" φ₁ and φ₂, which are themselves infinite dimensional. Nevertheless, if we use these projections to define the two-dimensional vector $\hat{\mathbf f}$,

$$\hat{\mathbf f} = \begin{bmatrix} \hat f_1 \\ \hat f_2 \end{bmatrix} \tag{1.38}$$

we can interpret its relationship to the orthonormal basis functions φ₁ and φ₂ geometrically as shown in Fig. 1.3 where, to facilitate the graphical representation, we assume that $\hat f_1$ and $\hat f_2$ are real numbers.

We note that $\hat{\mathbf f}$ and f(t) have the same norm. Depending on notational preferences this result can be phrased in the following alternative ways:

$$E = \int_a^b |f(t)|^2\, dt = \|f(t)\|^2 = \int_a^b \left| \hat f_1 \phi_1(t) + \hat f_2 \phi_2(t) \right|^2 dt = \left| \hat f_1 \right|^2 + \left| \hat f_2 \right|^2 = \hat{\mathbf f}^H \hat{\mathbf f} = \left\| \hat{\mathbf f} \right\|^2, \tag{1.39}$$


where E is termed the signal energy.³ The significance of (1.39) is that it shows that the signal energy can be computed indirectly by summing the squared magnitudes of the expansion coefficients. This is a special case of a more general result known as Parseval's theorem, which holds generally for orthogonal expansions.

1.2.3 Representations by Sums of Orthogonal Functions

The preceding geometrical interpretation of the projections of the signal in function space can be employed for any number of orthonormal expansion functions even though the actual graphical construction is obviously limited to N < 4. Thus for a signal f(t) that can be represented by N orthonormal functions, (1.36) generalizes to

$$f(t) = \sum_{n=1}^{N} \hat f_n \phi_n(t). \tag{1.40}$$

As in (1.36) we take advantage of the orthogonality of the φₙ(t) so that the expansion coefficients $\hat f_n$ (or, speaking geometrically, the projections of the signal on hypothetical orthogonal axes) are given by

$$\hat f_n = (\phi_n, f). \tag{1.41}$$

Collecting these into the N-dimensional vector $\hat{\mathbf f}$, we have

$$\hat{\mathbf f}^T = \left[ \hat f_1\ \hat f_2\ \ldots\ \hat f_N \right].$$

This vector furnishes a representation that is fully equivalent to the direct display of the signal as a function of time because the time domain representation can always be reconstructed by appending the known expansion functions. Evidently the square of the norm of this vector is again the signal energy E as in (1.39), so that Parseval's theorem now generalizes to

$$E = \int_a^b |f(t)|^2\, dt = \|f(t)\|^2 = \int_a^b \left| \sum_{n=1}^{N} \hat f_n \phi_n(t) \right|^2 dt = \sum_{n=1}^{N} \left| \hat f_n \right|^2 = \hat{\mathbf f}^H \hat{\mathbf f} = \left\| \hat{\mathbf f} \right\|^2, \tag{1.42}$$

³The term energy as used here is to be understood as synonymous with the square of the signal norm, which need not (and generally does not) have units of energy.

which is Parseval's theorem for general orthogonal functions. Similarly the inner product of two functions f(t) and g(t) can be written as the inner product of vectors with expansion coefficients for their components:

$$\int_a^b f^*(t)\, g(t)\, dt = \int_a^b \left[ \sum_{n=1}^{N} \hat f_n \phi_n(t) \right]^* \sum_{m=1}^{N} \hat g_m \phi_m(t)\, dt = \sum_{n=1}^{N} \hat f_n^* \hat g_n = \hat{\mathbf f}^H \hat{\mathbf g}. \tag{1.43}$$
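A numerical illustration of (1.42) (our own construction; the orthonormal set √2 sin(nπt) on (0, 1) and a signal lying in its span are illustrative assumptions):

```python
# Parseval's theorem (1.42): for a signal in the span of an orthonormal set,
# the energy equals the sum of squared coefficient magnitudes.
import numpy as np

t = np.linspace(0.0, 1.0, 100001)
phi = np.stack([np.sqrt(2) * np.sin(n * np.pi * t) for n in range(1, 6)])

c = np.array([1.0, -0.5, 0.0, 2.0, 0.3])        # chosen expansion coefficients
f = c @ phi                                      # f(t) = sum_n c_n * phi_n(t)

coeffs = np.trapz(phi * f, t, axis=1)            # recovered via (1.41)
E_time = np.trapz(f**2, t)                       # signal energy, directly
print(np.allclose(coeffs, c, atol=1e-6))         # True
print(np.isclose(E_time, np.sum(coeffs**2)))     # True -- Parseval (1.42)
```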

In case two or more signals are represented in terms of the same orthonormal set of functions, their addition and subtraction can be reduced to operations with vectors defined by their expansion coefficients. For example, given signals f₁(t) and f₂(t), their direct addition f₁(t) + f₂(t) = f₃(t) corresponds to the vector addition $\hat{\mathbf f}_1 + \hat{\mathbf f}_2 = \hat{\mathbf f}_3$.

In applications of signal analysis it is frequently important to have a measure of the difference between two signals. A measure frequently employed in detection theory is the "distance" between two signals, defined as follows:

$$d_{12} = \sqrt{\int_a^b |f_1(t) - f_2(t)|^2\, dt} = \sqrt{\|f_1\|^2 + \|f_2\|^2 - 2\,\mathrm{Re}\,(f_1, f_2)}. \tag{1.44}$$

If each of the functions is represented by an orthonormal expansion with coefficients $\hat f_{1n}$, $\hat f_{2n}$, the preceding definition is easily shown to be equivalent to

$$d_{12} = \sqrt{\sum_{n=1}^{N} \left| \hat f_{1n} - \hat f_{2n} \right|^2} = \left\| \hat{\mathbf f}_1 - \hat{\mathbf f}_2 \right\| = \sqrt{\left( \hat{\mathbf f}_1 - \hat{\mathbf f}_2 \right)^H \left( \hat{\mathbf f}_1 - \hat{\mathbf f}_2 \right)}. \tag{1.45}$$

Geometrically we can interpret this quantity as the (Euclidean) distance in N-dimensional space between points located by the position vectors $\hat{\mathbf f}_1$ and $\hat{\mathbf f}_2$, and think of it as a generalization of the 2-D construction shown in Fig. 1.4.

Figure 1.4: Vector representation of the distance between two signals


Clearly we can always represent the coefficients of signals that can be expressed as sums of N orthonormal functions as components of N-dimensional vectors, but graphical representations such as that in Fig. 1.4 are limited to real functions in at most three dimensions.

1.2.4 Nonorthogonal Expansion Functions and Their Duals

When the expansion functions are not orthogonal, each coefficient $\hat f_n$ in (1.40) depends not just on φₙ but on the entire set φ₁, φ₂, φ₃, ..., φN. We can see this directly from (1.40). After multiplying it from the left by $\phi_m^*(t)$ and integrating we get

$$(\phi_m, f) = \sum_{n=1}^{N} (\phi_m, \phi_n)\, \hat f_n \tag{1.46}$$

so that the computation of the expansion coefficients requires the inversion of the Gram matrix, which depends on the inner products of all the expansion functions. When the expansion functions are not independent, the Gram matrix is singular and the inverse does not exist. In the case of orthonormal expansion functions (φₘ, φₙ) = δₙₘ, i.e., the Gram matrix reduces to a unit matrix, in which case the expansion coefficients are again given by (1.41).
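A minimal numerical sketch of (1.46) (the monomial basis and the signal are our own illustrative choices, not taken from the text): the coefficients of a nonorthogonal expansion are obtained by solving the Gram system.

```python
# Sketch of (1.46): with a nonorthogonal basis, the expansion coefficients
# require inverting the Gram matrix. Basis (monomials) and signal are
# illustrative assumptions.
import numpy as np

t = np.linspace(0.0, 1.0, 100001)
phi = np.stack([t**n for n in range(4)])          # 1, t, t^2, t^3 (nonorthogonal)
f = np.sin(2 * t)                                 # signal to expand

G = np.array([[np.trapz(p * q, t) for q in phi] for p in phi])   # (phi_m, phi_n)
b = np.array([np.trapz(p * f, t) for p in phi])                  # (phi_m, f)
coeffs = np.linalg.solve(G, b)                    # solve the system (1.46)

print(np.max(np.abs(f - coeffs @ phi)))           # small LMS residual
```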

A similar formula can be obtained by introducing a new set of expansion functions, referred to as dual or reciprocal basis functions ψₙ(t), n = 1, 2, ..., N, constructed so as to be orthogonal to the φₙ(t). This construction starts with the expansion

$$\psi_n(t) = \sum_{l=1}^{N} \alpha_{nl}\, \phi_l(t), \quad n = 1, 2, \ldots, N. \tag{1.47}$$

Multiplying from the left by $\phi_k^*(t)$ and integrating we obtain

$$(\phi_k, \psi_n) = \sum_{l=1}^{N} \alpha_{nl}\, (\phi_k, \phi_l), \tag{1.48}$$

where k = 1, 2, ..., N. The sum over the index l is seen to be a product of two matrices: the transpose of the Gram matrix $\mathbf{G}^T$, with elements (φₖ, φₗ), and a matrix with elements αₙₗ. At this point the αₙₗ are at our disposal, and as long as they represent the elements of a nonsingular matrix we can construct the corresponding set of reciprocal basis functions ψₙ(t). If we choose the αₙₗ as elements of the inverse of $\mathbf{G}^T$, then the left side of (1.48) is necessarily the unit matrix, i.e.,

$$(\phi_k, \psi_n) = \delta_{kn}. \tag{1.49}$$

Thus the direct basis functions φₖ are orthogonal to the reciprocal basis functions ψₖ. They will also have unit norm if the norm of φₖ is unity. Collectively the ψₖ and φₖ are also referred to as a biorthogonal set. In the special case of orthonormal φₖ the Gram matrix degenerates into a unit matrix and, in accordance with (1.47), ψₖ = φₖ.

In view of (1.49) we can express the expansion coefficients in (1.40) as inner products of f(t) and ψₙ(t). We obtain this by multiplying (1.40) from the left by $\psi_n^*(t)$, integrating, and taking account of (1.49). This yields

$$\hat f_n = (\psi_n, f). \tag{1.50}$$
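The construction (1.47)–(1.50) can be carried out numerically. In the sketch below (our own choices of basis and signal) α is taken as the inverse of $\mathbf{G}^T$ and biorthogonality (1.49) is verified:

```python
# Construction of the reciprocal basis (1.47)-(1.50): the coefficient matrix
# alpha is the inverse of G^T, giving (phi_k, psi_n) = delta_kn. The monomial
# basis and the test signal are our own illustrative assumptions.
import numpy as np

t = np.linspace(0.0, 1.0, 100001)
phi = np.stack([t**n for n in range(3)])                   # direct basis 1, t, t^2
G = np.array([[np.trapz(p * q, t) for q in phi] for p in phi])

alpha = np.linalg.inv(G.T)                                 # alpha_nl of (1.47)
psi = alpha @ phi                                          # psi_n = sum_l alpha_nl phi_l

# Biorthogonality (1.49):
B = np.array([[np.trapz(pk * pn, t) for pn in psi] for pk in phi])
print(np.allclose(B, np.eye(3), atol=1e-8))                # True

# Coefficients via (1.50) for a signal in the span of the basis:
f = 1.0 + 2.0 * t - 0.5 * t**2
print(np.round([np.trapz(p * f, t) for p in psi], 6))      # [ 1.  2. -0.5]
```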

If a signal can be represented by a finite set of linearly independent functions, one can always construct the corresponding reciprocal basis and use it to represent the signal. Thus any given signal has two alternative representations, one in terms of the direct and the other in terms of the reciprocal basis functions. For a function represented in terms of the direct basis functions⁴

$$f(t) = \sum_{n=1}^{N} \hat f_n \phi_n(t) \tag{1.51}$$

the expansion in terms of the reciprocal basis functions will be represented by

$$f(t) = \sum_{n=1}^{N} \tilde f_n \psi_n(t). \tag{1.52}$$

The two sets of coefficients are related by the Gram matrix. This follows from equating (1.51) to (1.52) and using (1.50). We get

$$\tilde f_m = \sum_{n=1}^{N} (\phi_m, \phi_n)\, \hat f_n. \tag{1.53}$$

Using the dual representations (1.51) and (1.52) the energy of the signal is

$$E = \int_a^b |f(t)|^2\, dt = (f, f) = \sum_{n=1}^{N} \sum_{m=1}^{N} \hat f_n^* \tilde f_m\, (\phi_n, \psi_m) = \sum_{n=1}^{N} \hat f_n^* \tilde f_n, \tag{1.54}$$

which may be taken as an extension of Parseval's theorem (1.42) to biorthogonal functions. More generally, the inner product of two functions, one represented in the direct and the other in the reciprocal basis, reads

$$\int_a^b f^*(t)\, g(t)\, dt = (f, g) = \int_a^b \sum_{n=1}^{N} \hat f_n^* \phi_n^*(t) \sum_{m=1}^{N} \tilde g_m \psi_m(t)\, dt = \sum_{n=1}^{N} \sum_{m=1}^{N} \hat f_n^* \tilde g_m\, (\phi_n, \psi_m) = \sum_{n=1}^{N} \hat f_n^* \tilde g_n. \tag{1.55}$$

⁴Mathematically it is immaterial which set is designated as the direct and which as the reciprocal basis.


The representation of the inner product of two functions as an inner product of their expansion coefficients is an important attribute of orthogonal expansions. According to (1.55) this attribute is shared by biorthogonal expansions. Their drawback is the requirement of two sets of functions. Only one set of functions would be needed if the φₙ could be transformed into a symmetric orthogonal set. In the following we describe two techniques that provide such a transformation.

1.2.5 Orthogonalization Techniques

Orthogonalization via the Gram Matrix

A linear transformation of a nonorthogonal to an orthogonal basis can be constructed from eigenvectors of the Gram matrix. To this end we first solve the eigenvalue problem

$$\mathbf{G}\mathbf{v}_n = \lambda_n \mathbf{v}_n, \quad n = 1, 2, 3, \ldots, N \tag{1.56}$$

and take note of the fact that the Gram matrix is Hermitian, so that the eigenvalues are real and the eigenvectors orthogonal. We assign them unit norm so that

$$\mathbf{v}_n^H \mathbf{v}_m = \delta_{nm} \tag{1.57}$$

with

$$\mathbf{v}_n^H = \left[ v_{n1}^*\ v_{n2}^*\ v_{n3}^*\ \ldots\ v_{nN}^* \right]. \tag{1.58}$$

The orthonormal basis functions wₙ(t) are then given by

$$w_n(t) = \frac{1}{\sqrt{\lambda_n}} \sum_{k=1}^{N} v_{nk}\, \phi_k(t) \tag{1.59}$$

as can be demonstrated by computing the inner product

$$(w_n, w_m) = \frac{1}{\sqrt{\lambda_n \lambda_m}} \left( \sum_{k=1}^{N} v_{nk} \phi_k,\ \sum_{l=1}^{N} v_{ml} \phi_l \right) = \frac{1}{\sqrt{\lambda_n \lambda_m}} \sum_{k=1}^{N} \sum_{l=1}^{N} v_{nk}^* v_{ml}\, (\phi_k, \phi_l) = \frac{1}{\sqrt{\lambda_n \lambda_m}}\, \mathbf{v}_n^H \mathbf{G}\, \mathbf{v}_m.$$

With the aid of (1.56) and (1.57) we obtain the final result

$$\frac{1}{\sqrt{\lambda_n \lambda_m}}\, \mathbf{v}_n^H \mathbf{G}\, \mathbf{v}_m = \frac{\lambda_m}{\sqrt{\lambda_n \lambda_m}}\, \mathbf{v}_n^H \mathbf{v}_m = \delta_{nm}. \tag{1.60}$$

Since the key to this orthogonalization technique is the Hermitian structure of the Gram matrix, it applies equally well to the discrete case. Here we start with a general set of linearly independent M-dimensional column vectors $\mathbf{a}_n$, n = 1, 2, 3, ..., N and form the M × N matrix

$$\mathbf{A} = \left[ \mathbf{a}_1\ \mathbf{a}_2\ \mathbf{a}_3\ \ldots\ \mathbf{a}_N \right]. \tag{1.61}$$


The corresponding Gram matrix is

$$\mathbf{G} = \mathbf{A}^H \mathbf{A}$$

and (1.59) now reads

$$\mathbf{w}_n = \frac{1}{\sqrt{\lambda_n}} \sum_{k=1}^{N} v_{nk}\, \mathbf{a}_k. \tag{1.62}$$

The only difference in its derivation is the replacement of the inner products by $\mathbf{a}_n^H \mathbf{a}_m$ and $\mathbf{w}_n^H \mathbf{w}_m$, so that the orthogonality statement assumes the form

$$\mathbf{w}_n^H \mathbf{w}_m = \delta_{nm}. \tag{1.63}$$
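A short numerical sketch of the discrete case (1.61)–(1.63); the matrix A below is a random illustrative choice of ours:

```python
# Sketch of the discrete orthogonalization (1.61)-(1.63): columns of A are
# combined using unit-norm eigenvectors of G = A^H A. A itself is a random
# illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
M, N = 8, 4
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

G = A.conj().T @ A                         # Gram matrix, Hermitian
lam, V = np.linalg.eigh(G)                 # columns of V are the v_n
W = A @ V / np.sqrt(lam)                   # w_n = (1/sqrt(lam_n)) sum_k a_k v_nk

print(np.allclose(W.conj().T @ W, np.eye(N), atol=1e-10))   # w_n^H w_m = delta_nm
```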

The Gram–Schmidt Orthogonalization

One undesirable feature of the approach just presented is that every time the set of basis functions is enlarged, the construction of the new orthogonal set requires a new solution of the eigenvalue problem. An alternative and less computationally demanding technique is the Gram–Schmidt approach, where orthogonality is obtained by a successive subtraction of the projections of a vector (function) on a set of orthogonal subspaces. The geometrical interpretation of this process is best illustrated by starting with a construction in two dimensions. Thus for the two real vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ shown in Fig. 1.5 we denote their respective magnitudes by $a_1$ and $a_2$ and the included angle by θ. In this notation $\mathbf{a}_1/a_1$ is a vector of unit magnitude that points in the direction of $\mathbf{a}_1$, and $a_2 \cos\theta$ is the projection of $\mathbf{a}_2$ along $\mathbf{a}_1$. Thus

(a1/a1) a2 cos θ (1.64)

a1

a2

w2

⎟⎠

⎞⎟⎜⎜

⎛a2 cos θ

a1

a1

θ

Figure 1.5: Orthogonalization in 2 dimensions

Page 33: Signals and transforms in linear systems analysis

18 1 Signals and Their Representations

represents a component of the vector a2 in the direction of a1. If we definethe “dot” product in the usual way by a2· a1 = a1a2 cos θ, solve for cos θ andsubstitute into (1.64), we obtain

a2.a1a21

. (1.65)

If we now form a new vector w2 by subtracting the projection (1.65) from a2

w2 = a2 − a2·a1a21

a1. (1.66)

Forming the dot product of a1 with w2 we get orthogonality with w2. We alsosee directly from geometrical construction in Fig. 1.5 that w2.a1 = 0. If we nowdefine w1 ≡ a1, we can claim that we have succeeded in transforming a general(not necessarily orthogonal) basis a1, a2 into the basis w1,w2, which is orthogo-nal. This procedure, referred to as the Gram–Schmidt orthogonalization, worksin any number of dimensions. For example, suppose we have three noncoplanarvectors in three dimensions: a1, a2, and a3. We can choose any one of them asthe first member of the orthogonal set. For example, w1 = a1. We constructthe second vector w2 precisely as in (1.66). With the replacement of a1 by w1

this reads

w2 = a2 − a2.w1

w21

w1, (1.67)

where w1 is the magnitude of w1. In the next step we subtract from a3 itsprojection on the plane formed by the vectors w2 and w1. Thus

w3 = a3 − a3.w2

w22

w2 − a3.w1

w21

w1. (1.68)

Evidently the procedure can be generalized to any number of dimensions. IntheM -dimensional case it is customary to replace the dot product by the matrixinner product, i.e., write aT2 a1 instead of a2·a1. Also, the restriction to realvectors is not necessary as long as the Hermitian inner product aHk ak−1 is used.In fact the procedure is applicable as when the space is infinite dimensional.We merely have to employ the appropriate inner product. Thus given a set oflinearly independent functions φ1, φ2, φ3, . . . φN the corresponding orthogonalset w1, w2, w3, . . . wN is

wk = φk −(φk, wk−1)

(wk−1, wk−1)wk−1 − (φk, wk−2, )

(wk−2, wk−2)wk−2 − (φk, w1)

(w1, w1)w1 (1.69)

with k = 1, 2, . . . , N and where quantities with subscripts less than unity are tobe set to zero. Note that unlike the orthogonalization using the Gram matrixthe Gram–Schmidt construction provides an orthogonal but not a orthonormalset. This is easily remedied dividing each wk or ak by its norm.

Page 34: Signals and transforms in linear systems analysis

1.3 The LMS Approximation and the Normal Equations 19

Clearly the Gram–Schmidt orthogonalization procedure can also be appliedto functions that have been multiplied by a real weighting function, say Ψ(t).This is equivalent to generalizing the definition of the inner product to

(f, g) ≡∫ b

a

ψ2 (t) f∗ (t) g (t) dt, (1.70)

We shall discuss this generalized inner product definition in 1.3.3 in connectionwith the LMS error minimization. Using suitable weighing functions the Gram–Schmidt procedure can be employed to obtain special orthogonal functions usingthe polynomial set 1, t, t2, . . . , tN−1. Examples are the Legendre Polynomials [1](a = −1, b = 1, ψ = 1), the Laguerre Polynomials [1] (a = 0, b =∞, ψ = e−t/2),and the Hermit Polynomials [1] (a = −∞, b =∞, ψ = e−t

2/2) discussed in 1.5.3.For example, for Legendre Polynomials the first four terms are

w1 = 1, w2 = t− (1, t)

(1, 1)1 = t, w2 = t− (1, t)

(1, 1)1 = t, (1.71a)

w3 = t2 −(t, t2

)

(t, t)t−

(1, t2

)

(1, 1)1 = t2 − 1

3, (1.71b)

w4 = t3 − 3

5t (1.71c)

Comparing these with (1.225) we note the following correspondence: w1 =P0, w2 = P1, w3 = 2

3P2, w4 = 25P3. Evidently, the normalization in Gram–

Schmidt procedure differs from the standard normalization of the LegendrePolynomials.5

1.3 The LMS Approximationand the Normal Equations

1.3.1 The Projection Theorem

We now return to the minimization of the mean squared error (1.9), which wepresently rewrite using the notation developed in the preceding section as

εN = (eN , eN ) . (1.72)

Solving (1.5) for eN (t) we have

eN (t) = f (t)− fN (t) , (1.73)

where fN (t) is the partial sum

fN (t) =

N∑

n=1

fnφn (t) , (1.74)

5Another set of orthogonal polynomials can be constructed using the eigenvectors of theGram matix of tk . The orthogonal polynomials are then given by (1.62).

Page 35: Signals and transforms in linear systems analysis

20 1 Signals and Their Representations

εN =

(f −

N∑

n=1

fnφn, f −N∑

n=1

fnφn

). (1.75)

We now seek the N complex coefficients fn or, equivalently, 2N real coefficientsthat minimize (1.75). Alternatively, we can carry out the minimization with

respect to the 2N complex coefficients fn and f∗n. Using the latter approach we

differentiate (1.75) with respect to fm and f∗m and set the derivatives to zero to

obtain

∂εN

∂fm= − (eN , φm) = 0; m = 1, 2, . . .N, (1.76a)

∂εN

∂f∗m

= − (φm, eN ) = 0; m = 1, 2, . . .N. (1.76b)

We note that (1.76a) and (1.76b) are redundant since they are merely complexconjugates of each other. They both state that the minimization process leadsto a residual error function which is orthogonal to all members of the chosen setof expansion functions. This result is usually referred to as the Projection The-orem. Interpreted geometrically it states that the best an LMS approximationcan accomplish is to force the residual error function into a subspace orthogonalto that spanned by the given set of expansion functions. It is important to notethat the expansion functions themselves need not be orthogonal.

The essence of the Projection Theorem is best captured with the aid of ageometrical construction. Assuming N = 2, we start by again representing f (t)in terms of two expansion functions as in (1.36) but assume this time that theydo not suffice for an exact representation of f (t) and our objective is to minimizethe error by adjusting the projection coefficients along the φ1 and φ2 “axes.” Tosimplify matters suppose that the inclusion of one additional expansion function,say φ3 (t), orthogonal to both φ1 (t) and φ2 (t), would render the representationof f (t) exact. In that case, according to the Projection Theorem, the residualerror function e2 (t) ≡ e2min (t) that corresponds to the optimum choice ofthe coefficients must be parallel to φ3 (t), as illustrated by the construction as

shown in Fig. 1.6. Note that the vector f is comprised of three projections off (t) that are nonvanishing on all three basis functions. The projections of f (t)in the subspace spanned by the two basis functions φ1 and φ2 are represented

by the vector f (2) which represents f to within the error vector e2. As may beseen from the figure, the norm of this error vector will be minimized wheneverthe projections in the subspace spanned by φ1 and φ2 are adjusted such thatthe resulting error vector approaches e2min which lies entirely in the subspaceorthogonal to φ1 and φ2. The corresponding projections (coefficients of theexpansion of f (t)) yielding the best LMS approximation are then represented

by the vector fmin.

Page 36: Signals and transforms in linear systems analysis

1.3 The LMS Approximation and the Normal Equations 21

φ

φ

φ

f

f f

e e

90�� (2) �

min

2

�2min

�2

3

1

Figure 1.6: LMS error minimization and the orthogonality principle

1.3.2 The Normal Equations

Upon substituting (1.73) together with (1.74) into (1.76b) we obtain the sys-tem of linear algebraic equations that must be solved to find the projectioncoefficients fn that afford the LMS approximation:

(φm, f) =

N∑

n=1

(φm, φn) fn ; m = 1, 2, 3, . . .N . (1.77)

These equations are known as the normal equations for the unconstrained LMSminimization problem. We note that their formal solution requires the inversionof the Gram matrix. Clearly for a unique solution of (1.77) the Gram deter-minant must be nonvanishing, or, equivalently, the N expansion function mustbe linearly independent. The corresponding LMS error is found with the aid of(1.72) and (1.75) taking into account of (1.76b). We then obtain

εN min =

(f −

N∑

n=1

fnφn, eN

)= (f, eN) = (f, f)−

N∑

n=1

fn (f, φn) . (1.78)

Since εN min is nonnegative, we have the inequality

(f, f)−N∑

n=1

fn (f, φn) ≥ 0. (1.79)

As will be discussed in 1.7.1 under certain circumstances an infinite set of ex-pansion functions can be chosen that reduces the LMS to zero.

The normal equations are more commonly used with expansion functions(vectors) in discrete form. For an M × N matrix A we can identify thesevectors as columns and collect the N unknown expansion coefficients into anN -dimensional vector x, and attempt to solve the system Ax = y for x. As longas N < M (which is the usual case), such a solution will in general not exist.We can, however, always obtain an x that approximates y by Ax in the LMS

Page 37: Signals and transforms in linear systems analysis

22 1 Signals and Their Representations

sense. It is not hard to see from (1.77) that the vector x in this case satisfiesthe normal equations in the form

AHy = AHAx, (1.80)

where

AHA =

⎢⎢⎢⎢⎣

aH1aH2aH3.

aHN

⎥⎥⎥⎥⎦

[a1 a2 a3 . aN

]

=

⎢⎢⎢⎢⎣

aH1 a1 aH1 a2 aH1 a3 . aH1 aNaH2 a1 aH2 a2 aH2 a3 . aH2 aNaH3 a1 aH3 a2 aH3 a3 . aH3 aN. . . . .

aHNa1 aHNa2 aHNa3 . aHNaN

⎥⎥⎥⎥⎦. (1.81)

where AHA is the discrete form of the Gram matrix. Note that from a strictlyalgebraic viewpoint we could always append a finite number (M−N) of linearlyindependent vectors to the columns of A transforming it into a nonsingularM ×M matrix and thus ensuring an exact (rather than an LMS) solution ofAx = y for x.

1.3.3 Generalizations of the Normal Equations

In certain situations it is more meaningful to use as a measure of the goodnessof fit to a given function a weighted MS error in the form

εψ,N ≡∫ b

a

|ψ (t)|2 ∣∣f (t)− fN (t)∣∣2 dt = (ψeN , ψeN) , (1.82)

where ψ is a specified weighting function. Its choice could depend on achievinga desired amount of emphasis (or de-emphasis) of the contribution to the errorfrom different regions of the time interval, or, as discussed in 1.2.5, it couldserve as a means of achieving orthogonality among a selected set of expansionfunctions. Other reasons for the use of special weighting functions will be dis-cussed later in connection with techniques known as windowing. Note that(1.82) implies a representation of the function f (t) in the following form:

ψ (t) f (t) =N∑

n=1

fnψ (t)φn (t) + eψ,N (t) . (1.83)

The corresponding normal equations for the expansion coefficients follow from(1.77) through the replacements f → ψf and φn → ψφn:

(ψφm, ψf) =N∑

n=1

(ψφm, ψφn) fn ; m = 1, 2, 3, . . .N . (1.84)

Page 38: Signals and transforms in linear systems analysis

1.3 The LMS Approximation and the Normal Equations 23

In the discrete case, (1.80), the weighting factors appear as elements of anM -element diagonal matrix D so that the MS error becomes

εD,N = ‖D (y −Ax)‖2 (1.85)

while the corresponding normal equations are

AHDHDy = AHDHDAx. (1.86)

We can also generalize the LMS approximation problem to expansion func-tion sets defined over a “continuous summation index,” i.e., we replace the sumin (1.73) by an integral. Thus

f (t) =

ω∈SΩ

f (ω)φ (ω, t) dω + eΩ (t) , (1.87)

where the summation variable ω ranges over an interval whose extent is denotedsymbolically by SΩ, wherein Ω is proportional to the total length of the interval(e.g., SΩ = (−Ω,Ω)), the φ (ω, t) are the specified expansion functions, and the

f (ω) the unknown coefficients. The latter are to be chosen to minimize

εΩ ≡∫ b

a

|eΩ (t)|2 dt = (eΩ, eΩ) . (1.88)

Since the summation index in (1.87) ranges over a continuum, the approachwe used previously in (1.76) is not directly applicable. In such situations itis natural to use a variational approach. Before applying it to (1.88) we shallillustrate it by re-deriving (1.77) from (1.75).

To start with, we perturb each of the coefficients fn in (1.75) by the small

amount δfn. The corresponding change in the error δεN follows by takingdifferentials

δεN =

(f −

N∑

n=1

fnφn,−N∑

n=1

δfnφn

)+

(−

N∑

n=1

δfnφn, f −N∑

n=1

fnφn

). (1.89)

We now assume that the fn are precisely those that minimize εN . It is thennecessary for the left side in (1.89) to vanish. Upon expanding the two innerproducts on the right of (1.89) and relabeling the summation indices we obtain

0 =

N∑

m=1

αmδfm +

N∑

m=1

βmδf∗m, (1.90)

where

αm = − (f, φm) +

N∑

n=1

(φn, φm) f∗n, and (1.91a)

βm = − (φm, f) +

N∑

n=1

(φm, φn) fn. (1.91b)

Page 39: Signals and transforms in linear systems analysis

24 1 Signals and Their Representations

Now the δfm and δf∗m are completely arbitrary so that (1.90) can be satisfied

only by αm = βm = 0. Thus (1.90) again yields (1.76) and hence the normalequations (1.77).

We now use the same approach to minimize (1.88) which we first rewrite asfollows:

εΩ =(f − f (Ω), f − f (Ω)

), (1.92)

where f (Ω) (t) is the partial “sum”

f (Ω) (t) =

ω∈SΩ

f (ω)φ (ω, t) dω. (1.93)

Proceeding as in (1.89) we obtain

δεΩ =(f − f (Ω),−δf (Ω)

)+

(−δf (Ω), f − f (Ω)

). (1.94)

Upon setting δεΩ = 0 and expanding the inner products results in a formanalogous to (1.90) wherein the sums are replaced by integrals:

0 =

⎣∫

ω∈SΩ

α (ω) dω

⎦ δf (ω) +

⎣∫

ω∈SΩ

β (ω) dω

⎦ δf∗ (ω) , (1.95)

where

α (ω) = − (f, φ (ω)) +

ω′∈SΩ

(φ (ω′) , φ (ω)) f∗ (ω′) dω′, (1.96a)

β (ω) = − (φ (ω) , f) +

ω′∈SΩ

(φ (ω) , φ (ω′)) f (ω′) dω′. (1.96b)

As in (1.90) we argue that δf (ω) and δf∗ (ω) are arbitrary deviations. In fact,

we could even choose δf (ω) = α∗ (ω) so that in view of α (ω) = β∗ (ω) (1.95)becomes

0 =

ω∈SΩ

|α (ω)|2 dω, (1.97)

which clearly requires α (ω) = 0. From (1.96b) we then have our final result

(φ (ω) , f) =

ω′∈SΩ

(φ (ω) , φ (ω′)) f (ω′) dω′; ω ∈ SΩ ; a ≤ t ≤ b, (1.98)

as the desired generalization of the normal equations. In contrast to (1.77)which is simply a set of algebraic equations, (1.98) represents an infinite setof integral equations (one for each choice of the variable ω) for the unknown

functions f (ω). Writing out the inner products in (1.98) explicitly and setting

(φ (ω) , φ (ω′)) =∫ b

a

φ∗ (ω, t)φ (ω′, t) dt ≡ G (ω, ω′) , (1.99)

Page 40: Signals and transforms in linear systems analysis

1.3 The LMS Approximation and the Normal Equations 25

(1.98) assumes the form

∫ b

a

f (t)φ∗ (ω, t) dt =∫

ω′∈SΩ

G (ω, ω′) f (ω′) dω′ ; ω ∈ SΩ. (1.100)

Equation (1.100) will form the basis in our investigation of integral transforms ofsignals while (1.77) will provide the basis in the study of series representations.As a final generalization of the normal equations we consider expansions infunctions that depend on two discrete indices, i.e., of the form φnm (t), so that

f (t) =

N∑

n=1

M∑

m=1

fnmφnm (t) + eN,M (t) (1.101)

The dependence of two indices is merely a notational detail and does not affectthe general form of the Normal Equations. In fact these can be written downimmediately from (1.77) by simply taking account of the extra index. Thus

(φpq, f

)=

N∑

n=1

M∑

m=1

(φpq, φnm

)fnm ; p = 1, 2, . . .N ; q = 1, 2, . . .M. (1.102)

1.3.4 LMS Approximation and Stochastic Processes*

The LMS approximation plays an important role in probability theory and thetheory of stochastic processes and is treated in detail in standard works on thesubject. In the following we confine ourselves to showing the connection betweenthe LMS formulation just described and its extension to random variables.

The approximation problemmay then be phrased as follows. Given a randomvariable y we are asked to approximate it by a linear combination of random

variables an, n = 1, 2, ..N . The error eN is necessarily a random variable6

eN = y −N∑

n=1

xnan, (1.103)

which may be taken as the analogue of the deterministic statement (1.73). Thequantity to be minimized is the ensemble average (expectation) of the squaredmagnitude of the error eN

εN =< e∗NeN >=< |eN |2 >, (1.104)

6Here we distinguish a random variable by an underline and denote its expectation (orensemble average) by enclosing it in < >

Page 41: Signals and transforms in linear systems analysis

26 1 Signals and Their Representations

which may be interpreted as a generalization of the squared norm in the deter-ministic case (1.72). Substituting from (1.103) and expanding we get

εN = <

∣∣∣∣∣y −N∑

n=1

xnan

∣∣∣∣∣

2

>

= < y∗y > −N∑

n=1

< a∗ny > x∗n −N∑

n=1

< any∗ > xn

+

N∑

n=1

N∑

m=1

< ana∗m > xnx

∗m. (1.105)

Differentiating with respect to xn and x∗n and setting the derivatives to zeroresults in

< a∗meN >= 0 , m = 1, 2, . . .N. (1.106)

or, in expanded form,

< a∗my >=N∑

n=1

< a∗man > xn ,m = 1, 2, . . .N (1.107)

Equation (1.106) may be taken as an extension of the projection theorem(1.76a) and (1.107) of the normal equations (1.77). The fact that the algebraicforms of the deterministic and stochastic versions of the normal equations areidentical is actually not unexpected if one recalls that ensemble averages aremerely inner products with a weighting factor equal to the probability densityfunction [c.f. (1.82)].

Typical applications of LMS estimation arise in statistical time series analy-ses. For example, suppose we would like to predict the value of the n-th sampleof a random signal f (nΔt) at time t = nΔt from its past values at t = (n−�)Δt,� = 1, 2, . . .N . We can formulate this as an LMS estimation problem and seekN linear prediction filter coefficients x� , � = 1, 2, . . .N that minimize

<

∣∣∣∣∣f (nΔt)−N∑

�=1

x�f [(n− �)Δt]∣∣∣∣∣

2

> . (1.108)

If we assume samples from a stationary process with correlation coefficients

< f∗ (nΔt) f (mΔt) >= R [(n−m)Δt] , (1.109)

the normal equations (1.107) become

⎢⎢⎣

R (Δt)

R (2Δt)

.

R (NΔt)

⎥⎥⎦ =

⎢⎢⎣

R (0) R (−Δt) . R ((1 −N)Δt)

R (Δt) R (0) . R ((2 −N)Δt)

. . . .

R ((N − 1)Δt) R ((N − 2)Δt) . R (0)

⎥⎥⎦

⎢⎢⎣

x1

x2

.

xN

⎥⎥⎦

(1.110)

Page 42: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 27

A more general form of the normal equations arises in connection with multidi-mensional filtering. Here instead of a scalar random variable we approximate anM -dimensional vector y (comprised of M random variables) by a linear combi-nation of M -dimensional vector random variables an, n = 1, 2, . . .N . Then

eN = y −N∑

n=1

xnan (1.111)

and we find that the xn that minimize the < |eN |2 > satisfy the normalequations

< aHmy >=N∑

n=1

< aHman > xn ,m = 1, 2, . . .N (1.112)

1.4 LMS Solutions via the Singular

Value Decomposition

1.4.1 Basic Theory Underlying the SVD

In many applications where discrete representations leading to the normal equa-tions of the form (1.80) arise, the expansion vectors are not orthogonal. In factin many cases, especially when dealing with experimentally derived data, theycould be nearly linearly dependent. Solutions based on a direct inversion of theGram matrix could then be strongly corrupted by roundoff errors and turn outto be nearly useless. A much more powerful approach in such cases is to solvethe LMS problem with the aid of a matrix representation known as the singularvalue decomposition (SVD). In the following we provide a brief account of thistechnique.

With A a complex-valued M ×N matrix (M ≥ N) we form the (M +N)×(M +N) matrix

W =

[0MM AAH 0NN

]. (1.113)

It is easy to show that W is Hermitian for any MXN matrix it can be diago-nalized by a unitary transformation comprised of N +M vectors qj satisfyingthe eigenvalue problem

Wqj = σjqj , j = 1, 2, 3, . . .N +M, (1.114)

where the eigenvalues σj are necessarily real. If we partition qj to read

qj =

[ujvj

](1.115)

with uj and vj having dimensions M and N , respectively, (1.114) is equivalentto the following two matrix equations:

Avj = σjuj , (1.116a)

AHuj = σjvj . (1.116b)

Page 43: Signals and transforms in linear systems analysis

28 1 Signals and Their Representations

Multiplying both sides of (1.116a) from the left by AH and replacing AHujwith (1.116b) gives

AHAvj = σ2jvj . (1.117)

Similarly, multiplication of both sides of (1.116b) from the left by A and asubstitution from (1.116a) results in

AAHuj = σ2juj . (1.118)

Equations (1.117) and (1.118) are eigenvalue problems for the two nonnegativedefinite Hermitian matrices AAH and AHA. As is easy to see, these matriceshave the same rank which is just the number of linearly independent columnsof A. If we denote this number by R (R ≤ N), then the number of nonzeroeigenvalues σ2

j in (1.117) and (1.118) is precisely R. Because the matrices in(1.117) and (1.118) are Hermitian, the eigenvectors vj and uj can be chosen toform two orthonormal sets. Thus

vHj vk = δjk, j, k = 1, 2, 3, . . .N, (1.119)

anduHj uk = δjk, j, k = 1, 2, 3, . . .M. (1.120)

By virtue of (1.119) we can resolve an N × N unit matrix INN into a sum ofprojection operators vjv

Hj , i.e.,

N∑

j=1

vjvHj = INN . (1.121)

If we now multiply (1.116a) from the right by vHj , sum both sides over all j andtake account of (1.121), we obtain what is referred to as the SVD representationof A:

A =

R∑

j=1

σjujvHj . (1.122)

The σj are referred to as the singular values of A. By convention they arechosen as positive. We note that this can always be done in this representationby absorbing the sign in one of the transformation matrices. Since there can beonly R nonzero singular values, only R vectors of each set (1.119) and (1.120)enter into the SVD. Nevertheless, the SVD is sometimes written in block formthat includes the complete unitary transformations

U =[u1 u2 u3 . . . uM

](1.123)

andV =

[v1 v2 v3 . . . vN

], (1.124)

whose dimensions are, respectively,M ×M and N×N . In this notation (1.119)and (1.120) and (1.121) may be replaced by

UHU = UUH = IMM , VHV = VVH = INN . (1.125)

Page 44: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 29

We can then rewrite (1.122) as follows:

A = UΣVH , (1.126)

where the M ×N matrix Σ has the structure

Σ =

⎢⎢⎢⎢⎢⎢⎢⎢⎣

σ1 0 0 0 0 0 . 00 σ2 0 0 0 0 . 00 0 . 0 0 0 . 00 0 0 σR 0 0 . 00 0 0 0 0 0 . 0. . . . . . . .0 0 0 0 0 0 0 0

⎥⎥⎥⎥⎥⎥⎥⎥⎦

. (1.127)

The representation (1.122) is usually referred to as the reduced SVD form. Agenerally adopted convention is to order the singular values in accordance with

σ1 > σ2 > . . . > σR.

An alternative block matrix representation is obtained by eliding all but thefirst R columns in U and V. Thus if we denote the resulting matrices by UR

and VR (whose respective dimensions are M ×R and N ×R), we obtain

A = URΣRVHR , (1.128)

where

ΣR =

⎢⎢⎣

σ1 0 0 00 σ2 0 00 0 . 00 0 0 σR

⎥⎥⎦ . (1.129)

Note that since the columns of UR and VR are orthonormal, we have

UHRUR = IRR (1.130)

andVHRVR = IRR. (1.131)

On the other hand, URUHR is no longer a unit matrix but an M ×M (rank R)

projection operator into the range of A. Similarly, VRVHR is an N × N (rank

R) projection operator into the domain of A.It is perhaps worth noting at this juncture that the approach to the SVD

via the two eigenvalue problems (1.117) and (1.118) has been adopted hereinmainly for didactic reasons and is not recommended for actual numerical work.In practice the member matrices entering in (1.128) can be computed usingmuch more efficient algorithms. These are described in [8] and we shall notdiscuss them. For purposes of providing numerical examples we shall contentourselves in using the routines incorporated in MATLAB.

Page 45: Signals and transforms in linear systems analysis

30 1 Signals and Their Representations

1.4.2 Solutions of the Normal EquationsUsing the SVD

We now solve the normal equations (1.80) using the SVD. Substituting (1.128)for A in (1.80) we get

AHAx = VRΣRUHRURΣRV

HRx = VRΣ

2RV

HRx = VRΣRU

HRy. (1.132)

Multiplying from the left by Σ−2R VH

R and taking account of (1.131) yield for thelast two members of (1.132)

VHRx = Σ−1

R UHRy. (1.133)

In case R = N , we have VNVHN = INN , so that we can solve (1.133) for x by

simply multiplying both sides by VN . We then obtain the unique solution

x ≡ xLS= VNΣ−1N UH

Ny. (1.134)

When R < N , (1.133) possessesN−R+1 solutions. In many physical situationsit is reasonable to select the solution vector having the smallest norm. It is easilyshown that this is given by

x ≡ x0LS= VRΣ

−1R UH

Ry. (1.135)

To show this, we first note that x0LS satisfies (1.133). Thus substituting (1.135)

into the left side of (1.133) and taking account of (1.131) yield

VHRx0

LS = VHRVRΣ

−1R UH

Ry = Σ−1R UH

Ry. (1.136)

Second, let xLS be any other solution of (1.133) and set

xLS = x0LS + ξ. (1.137)

Again from (1.133), VHR

(xLS − x0

LS

)= 0, so that VH

R ξ =0, or, which is thesame thing,

ξHVR=0. (1.138)

Multiplying (1.135) from the left by ξH and taking account of (1.138) we findξHx0

LS = 0. Hence

‖xLS‖2 =∥∥x0

LS + ξ∥∥2

=∥∥x0

LS

∥∥2+ ‖ξ‖2 ≥ ∥∥x0

LS

∥∥2, (1.139)

which proves that x0LS is the solution of (1.133) with the smallest norm.

The matrix A˜1 ≡ VRΣ−1R UH

R is called the Moore–Penrose pseudoinverseof A. Introducing this notation into (1.135) we write

x0LS= A˜1y. (1.140)

Note thatA˜1A = VRV

HR , (1.141)

Page 46: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 31

so that the right side reduces to a unit matrix only if A˜1 has full rank, in whichcase A˜1 is the true left inverse of A. (A right inverse does not exist as longas M > N). The general solution of (1.133) is given by (1.140) plus any of theN −R possible solutions ξ� of the homogeneous set

VRVHR ξ� = 0, (1.142)

i.e.,xLS = A˜1y + ξ�, � = 1, 2, . . .N −R. (1.143)

Let us now represent the overdetermined system in the form

y = Ax+ eR, (1.144)

where by analogy with (1.73) eR represents the residual error. If x is chosen inaccordance with (1.143), then (1.144) becomes

y = UR

(UHRy

)+eR, (1.145)

i.e., we obtain an expansion of y in terms of R orthonormal basis vectors(columns of UR) plus the residual vector eR whose norm has been minimized.We find

‖eR‖2 = εN min = yHy − yHAxLS

= yHy − yHURΣRVHR

[VRΣ

−1R UH

Ry + ξ�]

= yHy − yHURUHRy, (1.146)

which is identical for all xLS in (1.143), and, in particular, for the minimumnorm solution is given directly by the Moore–Penrose pseudoinverse in (1.140).

The minimum norm least squares solution (1.140) ignores contributions fromthe vector y not in the subspace defined by the columns of UR. For if, yc issuch a vector, then UH

Ryc = 0. This is, of course, beneficial if yc happensto comprise only noise (of either numerical or physical origin) for then UR

acts as an ideal noise suppression filter. If, on the other hand, yc includes asubstantial signal component, we introduce errors. The only way to capture sucha signal component is to enlarge the set of basis vectors. Clearly, a net benefitwould accrue from such an enlargement only if the additional signal contributionexceeds the noise contribution. Thus, ideally, the signal subspace should beknown a priori. In practice this is rarely the case. The usual approach is toconstruct (on the basis of physical theory and/or empirical data) a nominallyfull rank matrix A and set to zero all but the first R dominant singular valuesso that R is taken as the estimate of the “effective” dimension of the signalsubspace and the columns of UR as its basis. This semi-empirical procedure isonly meaningful if the singular value plot versus the index exhibits a more orless sharp threshold, so that R can be identified with little ambiguity.

Page 47: Signals and transforms in linear systems analysis

32 1 Signals and Their Representations

1.4.3 Signal Extraction from Noisy Data

In the following we illustrate the use of the SVD in the extraction of signalsfrom data corrupted by additive random noise. We consider a model whereinthe signal to be detected consists of a weighted sum of known waveforms andthe objective is to determine the weighting factors. The measured data are thesum

y(tm) ≡ ym =

N∑

k=1

xk fk(tm) + n(tm) , m = 1, 2, . . .M,

wherein the fk(t) are known functions, xk the unknown weighting factors. n(t)represents the random noise and the tm the time instances at which the dataare recorded. The number of samplesM is larger (preferably much larger) thanN . For our purposes the sampling time interval tm − tm−1 is not importantbut in practice would probably be governed the signal bandwidth. We shallbase our estimate on a single record taken over some fixed time interval. Weuse the SVD to estimate xT = [x1 x2 x3 . . . xN ] using the measured data vectoryT = [y1 y2 y3 . . . xN ] and the M × N matrix Amk = fk(tm). The estimatedweighting factors are given by

x = A∼1y

taking into account only of the dominant singular values.We illustrate this estimation technique by the following example. Consider

the data vector y with components

y(m) = (m− 1)2 + n (m) ; 1 ≤ m ≤ 64,

where (m− 1)2 may be taken as the desired signal and n(m) is a random noisedisturbance. Let us model this disturbance by setting

n(m) = 3, 000rand(m),

wherein rand(m) is a pseudorandom sequence uniformly distributed between1 and −1. We shall attempt to extract the signal from the noisy data byapproximating the data vector in the LMS sense with the polynomial

yest(m) = a0 + a1m+ a2m2 ; 1 ≤ m ≤ 64.

To relate this to our previous notation, we have

A =

⎢⎢⎢⎢⎣

1 1 11 2 41 3 9. . .1 64 4096

⎥⎥⎥⎥⎦, x =

⎣a0a1a2

and we seek a vector x that provides the best LMS fit of Ax to y. The solutionis given at once by multiplying the Moore–Penrose pseudoinverse by the datavector as in (1.140). By using the MATLAB SVD routine we find

Page 48: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 33

0 10 20 30 40 50 60 70-3000

-2000

-1000

0

1000

2000

3000

4000

5000

6000

7000

*

**

***

* y(data)

***

**

yest3

y0=(t-1)2

yest3=1219.5-98.6*t+2.5*t.2

sigma=[14944 75 3]

Figure 1.7: LMS fit of polynomial to noisy data (R = 3)

Σ3 =

⎣14944 0 00 75 00 0 3

⎦ ,

while the pseudoinverse gives

xT =[1219.5 −98.6 2.5

].

The resulting polynomial fit, yest3, together with the original data and thesignal y0 are plotted in Fig. 1.7.

We observe that the estimated signal approximates the uncorrupted (noisefree) signal much better than does the original data. Still the reconstruction isnot very satisfactory, particularly for low indices. Nevertheless this is the bestone can achieve with a straightforward LMS procedure. It should be evidentthat a direct solution of the normal equations would yield identical results. Wehave, however, not yet exploited the full potential of the SVD approach. Wenote that the given A matrix is ill conditioned, the ratio of the maximum to theminimum singular value being 14, 944/3 ≈ 5, 000. Thus we should expect animproved signal estimate by removing the lowest singular value and employingthe truncated SVD with

Σ2 =

[14944 00 75

].

Page 49: Signals and transforms in linear systems analysis

34 1 Signals and Their Representations

0 10 20 30 40 50 60 70-3000

-2000

-1000

0

1000

2000

3000

4000

5000

6000

7000

sigma=[14944 75]

*** yest2

** y0=(t-1)2

* y(data)

***

**

*

yest2=1.4143-22.7615*t+1.5009*t.2

Figure 1.8: LMS fit of polynomial to noisy data (R = 2)

0 10 20 30 40 50 60 70-3000

-2000

-1000

0

1000

2000

3000

4000

5000

6000

7000

yest1=.0004+.0205*t+1.0588*t.2

*

*****

* y(data)

** y0=(t-1)2

*** yest1

sigma=[14944]

Figure 1.9: LMS fit of polynomial to noisy data (R = 1)

Page 50: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 35

The recomputation of the pseudoinverse with R = 2 yields

xT =[ −1.4143 −22.7615 1.5009

],

and the resulting polynomial fit, yest2, together with the original data and thesignal y0 are plotted in Fig. 1.8.

Clearly the removal of the lowest singular value resulted in an improvedsignal estimate. Indeed, we can improve the estimate even further by retainingonly the one dominant singular value. For this case the polynomial coefficientsare

xT =[.0004 .0205 1.0588

],

and the results are plotted in Fig. 1.9.

1.4.4 The SVD for the Continuum

Let us now apply the SVD representation to the LMS approximation problemwithin the realm of the continuum. We note that for any finite set of expansionfunctions the equivalent algebraic problem is overdetermined, since formallyM =∞. If we think of the expansion functions φn (t) as an infinite-dimensionalcolumn vector, the matrix A assumes the form

A =[φ1 (t) φ2 (t) . . . φN (t)

]. (1.147)

To obtain the SVD representation in this case we first solve the eigenvalueproblem for vj in (1.117). In view of (1.147) we have

AHA = G, (1.148)

where G is the Gram matrix defined by (1.33). Thus for the continuum (1.117)reads

Gvj = σ2jvj ; j = 1, 2, . . . R, (1.149)

where again R ≤ N . The (formally) infinite-dimensional vectors uj then followfrom (1.116a):

uj ≡ uj (t) = 1

σj

N∑

�=1

φ� (t) v�j ; j = 1, 2, . . . R, (1.150)

where v�j is the �-th element of vj . If the chosen φn (t) are strictly linearlyindependent, then, of course, R = N . However, it may turn out that some ofthe singular values σj are very small (indicative of quasi linear dependence).In such cases only R of the functions uj (t) are retained corresponding to theindices of the R dominant contributors. If the eigenvectors vj in (1.149) arenormalized to unity, the uj (t) are automatically orthonormal, just as in the

Page 51: Signals and transforms in linear systems analysis

36 1 Signals and Their Representations

discrete case in (1.120). We can show this directly using (1.150). Thus usingthe inner product notation for the continuum we have

(uj , uk) = (1

σj

N∑

�=1

φ�v�j ,1

σk

N∑

p=1

φpvpk)

=1

σjσk

N∑

�=1

v∗�jN∑

p=1

(φ�, φp

)vpk =

1

σjσk

N∑

�=1

v∗�jN∑

p=1

G�pvpk

=1

σjσk

N∑

�=1

σ2kv

∗�jv�k =

σkσj

N∑

�=1

v∗�jv�k

= δjk , j = 1, 2, . . . R. (1.151)

The SVD representation of A follows from (1.122). If we denote the matrixelements of A by Atn (1.122) is equivalent to

Atn ≡ φn (t) =R∑

j=1

σjuj (t) v∗nj ; a ≤ t ≤ b, 1 ≤ n ≤ N. (1.152)

An alternative form not involving the singular values explicitly follows froma replacement of uj (t) in the preceding by (1.150):

Atn =

N∑

�=1

R∑

j=1

φ� (t) v�jv∗nj ; a ≤ t ≤ b, 1 ≤ n ≤ N. (1.153)

The corresponding elements of the Moore–Penrose pseudoinverse are

A˜1nt =

R∑

j=1

σ−1j vnjuj (t)

∗; 1 ≤ n ≤ N, a ≤ t ≤ b . (1.154)

The LMS minimum norm solution to the normal equations (1.77) now reads

fn =

R∑

j=1

σ−1j vnj(uj , f). (1.155)

Replacing uj in this expression by (1.150) yields the following:

fn =

N∑

�=1

λ�n (φ�, f) , (1.156)

where

λ�n =

R∑

j=1

σ−2j vnjv

∗�j . (1.157)

Page 52: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 37

In the special case of N orthonormal expansion functions R = N , the σj areall unity so that λ�n = δ�n. As expected, in this case (1.156) reduces to (1.41)(with Qn = 1).

Using the coefficients in (1.155)

N∑

n=1

fnφn (t) =

N∑

n=1

⎝R∑

j=1

σ−1j vnj(uj , f)

⎠φn (t)

=

R∑

j=1

(uj , f)

N∑

n=1

σ−1j vnjφn (t)

=

R∑

j=1

(uj , f)uj (t) ,

wherein the last step we used (1.150). Hence

f (t) =

R∑

j=1

(uj, f)uj (t) + eN (t) , (1.158)

which is the expansion (1.145) for the continuum, wherein (uj, f) are the expansioncoefficients and eN (t) the residual error with minimum norm.

The basis set represented by (1.150) is particularly useful when one is facedwith the problem of constructing a basis for a set of functions φn (t) n =1, 2, 3, . . .N only a small number of which are linearly independent. This sit-uation is found, for example, in digital communications where it is frequentlyadvantageous to use modulation formats involving a set of linearly dependentwaveforms. It is usually easy to deduce an appropriate basis for the waveformsfor small N (e.g., by using symmetry arguments). For large N (and partic-ularly for N � R) or whenever symmetry arguments and simple deductionsfail, the algorithm embodied in (1.149) and (1.150) provides an alternative andsystematic approach.

1.4.5 Frames

We form the sum∑Nn=1 φn (t)φ

∗n (t

′) and substitute for each φn (t) and φ∗n (t

′)the corresponding expansion from (1.152). Upon taking account of the orthog-

onality∑N

n=1 v∗njvnk = δjk we obtain the following identity

N∑

n=1

φn (t)φ∗n (t

′) =R∑

j=1

σ2juj (t)u

∗j (t

′) . (1.159)

Multiplying both sides by f∗ (t) and f (t′) and integrating with respect to bothvariables give

N∑

n=1

|(f, φn)|2 =

R∑

j=1

σ2j |(f, uj)|2 .

Page 53: Signals and transforms in linear systems analysis

38 1 Signals and Their Representations

With A = min{σ2j

}and B = max

{σ2j

}; 1 ≤ j ≤ R the preceding implies the

following inequality:

AR∑

j=1

|(f, uj)|2 ≤N∑

n=1

|(f, φn)|2 ≤ BR∑

j=1

|(f, uj)|2 . (1.160)

Since the (f, uj) are coefficients in an orthonormal expansion, (1.41) applies

with the replacements Qn → 1,and fn → (un, f) and εN min → εRmin. Solv-

ing (1.158) for∑R

n=1 |(f, un)|2 and noting that εN min ≡ εRmin (in accordancewith the SVD decomposition the minimized LMS error obtained with the Nexpansions functions φn is the same as that with the R orthonormal un) andsubstituting into (1.160) we obtain

A[(f, f)− εN min] ≤N∑

n=1

|(f, φn)|2 ≤ B[(f, f)− εN min]. (1.161)

When the N expansion functions represent f (t) with zero LMS error they aresaid to constitute a frame. Thus for a frame the preceding inequality reads

A (f, f) ≤N∑

n=1

|(f, φn)|2 ≤ B (f, f) . (1.162)

We should note that thus far no assumptions have been made about the natureof the expansion functions. Let us now assume that they are of unit norm.If they are also orthogonal, then the Gram matrix is an N × N unit matrixand A = B = 1. A frame with A = B is said to be tight. Thus an orthogonalset of linearly independent functions constitutes a tight frame. However tightframes can result even when the functions are linearly dependent. For example,consider the following set of four functions: φ1, φ1, φ2,φ2 with (φ1, φ2) = ρ. Wehave N = 4 and R = 2. By assumption all functions have unit norm so thatthe Gram matrix is ⎡

⎢⎢⎣

1 1 ρ ρ1 1 ρ ρρ∗ ρ∗ 1 1ρ∗ ρ∗ 1 1

⎥⎥⎦ .

The two nonzero eigenvalues are found to be 2 (1 + |ρ|) ≡ B and 2 (1− |ρ|) ≡ A.We see that whenever φ1 and φ2 are orthogonal the frame becomes tight withA = B = 2. We might suspect that this factor of two is indicative of thefact that we have twice as many orthogonal functions at our disposal than theminimum required to span the space of dimension R = 2. This is actually thecase. To see that this rule applies generally consider the following set of Rgroups each comprised of K identical functions

φ1 . . . φ1φ2 . . . φ2φ3 . . . φ3 . . . . . . φR . . . φR,

that are assumed be orthonormal. It is not hard to see that the correspondingGram matrix is comprised entirely of R K ×K matrices of ones along the main

Page 54: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 39

diagonal and that a typical eigenvector corresponding to nonzero eigenvalueshas the structure

vT =[0 . . . 0 1 . . . 1 0 . . . 0

],

wherein the j-th block of K ones lines up with the j-th sub matrix of ones.As a result the R nonzero eigenvalues are all identical and numerically equalto K. Thus we have again a tight frame with A = B = K, indicating a K-thorder redundancy in our choice of the orthonormal basis. Note that both a tightframe and the numerical correspondence of the constant A with the redundancyrequire orthogonality of the linearly independent subset of the N expansionfunctions. Also a tight frame requires that the redundancy be uniformly ap-portioned among the basis functions. For example, the set φ1, φ2, φ2,φ2, φ3, φ3

is not a tight frame because the eigenvalues of the associated Gram matrix are1, 3, and 2. To transform this set into a frame we could, e.g., supplement it withtwo additional φ1 and one φ3 resulting in A = B = 3.

1.4.6 Total Least Squares

In the approach to the LMS minimization of ‖Ax− y‖2 using the SVD thereis a tacit assumption that the M × N data matrix A is known exactly. Inpractice this is not always the case. In the following we present a modified LMStechnique that takes account of possible uncertainties in the data matrix. Thedevelopment follows closely the ideas introduced by Golub and Van Loan [9].

The problem may be formulated as follows. Denoting by aHi the i-th rowof A, and by yi the i-th element of y we interpret the minimization of (1.146)as the problem of finding a vector x whose projection on ai is close to theprescribed yi for i = 1, 2 . . .M . We now seek an N × 1 vector ai such that thedistance between the 1× (N + 1) vector cTi defined by

cTi = [aHi aHi x] (1.163)

and the vector [aHi yi] defined entirely by the data, is minimized for a given i.This minimization allows for the possibility that both the data matrix A andthe data vector y may not be consistently specified (in the sense that yi maynot be well approximated by aHi x for any x) To find ai we set

ψi(x) = min[∥∥ aHi − aHi

∥∥2+

∣∣yi − aHi x∣∣2]

(1.164)

and equate the variational derivative to zero. Thus

δ[∥∥ aHi − aHi

∥∥2+

∣∣yi − aHi x∣∣2]

= − [aHi − aHi

]δai−δaHi [ ai − ai]

− [yi − aHi x

]xHδai−δaHi x

[y∗i − xH ai

]= 0 (1.165)

Since the increments δai and δaHi are arbitrary, the coefficients of each mustvanish independently. This yields for the coefficients of δaHi

ai − ai = x[y∗i − xH ai

](1.166)

Page 55: Signals and transforms in linear systems analysis

40 1 Signals and Their Representations

and a similar (but redundant equation) for the coefficients of δai. Moving theterms containing ai to the left of the equality sign gives

[INN + xxH

]ai = ai + y∗i x (1.167)

One can readily verify that

[INN + xxH

]−1= INN − xxH/ (1 + β) (1.168)

where β = xHx. With the aid of the above relationship one readily finds

ai=

[INN − xxH

1 + β

]ai +

y∗i1 + β

x (1.169)

Next we compute ψi(x) by using (1.169) in (1.164). This gives

‖ai − ai‖2 =1

(1 + β)2[− (

aHi x)xH + yix

H] [−x (

xHai)+ y∗i x

]

=1

(1 + β)2

[ ∣∣aHi x∣∣2 β − yiβ

(xHai

)− y∗i β(xHai

)∗

+β |yi|2]

(1.170)

∣∣yi − aHi x∣∣2 =

1

(1 + β)2[y∗i −

(xHai

)] [yi −

(xHai

)∗]

=1

(1 + β)2

[|yi|2 − yi

(xHai

)− y∗i(xHai

)∗

+∣∣aHi x

∣∣2

]. (1.171)

Adding (1.170) and (1.171) we obtain

ψi(x) =

∣∣aHi x−yi∣∣2

1 + β. (1.172)

The vector x is yet to be determined. It is the vector that minimizes∑M

i=1 ψi(x),or, equivalently

α = minx

{‖Ax− y‖21 + xHx

}. (1.173)

Taking the variational derivative with respect to x of the quantity in bracketsand setting it to zero we have

[Ax− y]HAδx−δxHAH [Ax− y]

1 + xHx− ‖Ax− y‖2 [δxHx+ xHδx

]

[1 + xHx]2 = 0

Equating the coefficients of δx to zero yields the following:

[Ax− y]H [

A (1 + β)− (Ax− y)xH]= 0. (1.174)

By taking the Hermitian transpose of both sides we get the equivalent form[AH (1 + β)− x

(xHAH−yH)]

[Ax− y] = 0. (1.175)

Page 56: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 41

Scalar multiplication of both sides from the left by xH gives rise to xHAHβas well as its negative so that these terms cancel and the resulting expressionsimplifies to [

xHAH + βyH][Ax− y] = 0. (1.176)

Taking the Hermitian transpose of this expression results in

[Ax− y]H[Ax+βy] = 0. (1.177)

We will now use the last result to modify (1.174) so that it can be solved for x.For this purpose we write the term Ax− y appearing within the square bracketin (1.174) as follows:

Ax− y = Ax+βy − (1 + β)y. (1.178)

This modifies (1.174) to read

[Ax− y]H[A (1 + β)− (Ax+βy)xH + (1 + β)yxH

]= 0. (1.179)

By virtue of (1.177) multiplication of the middle term in brackets by [Ax− y]H

gives zero and a subsequent division by 1 + β reduces (1.179) to

[Ax− y]H[A+ yxH

]= 0. (1.180)

We propose to solve the preceding equation for x using the SVD. For thispurpose we define an M × (N + 1) augmented data matrix by adjoining thedata vector y to the N columns of A as follows:

C ≡ [A y]. (1.181)

Equation (1.180) can then be written in the following partitioned form:

[xH −1 ]

CHC

[INNxH

]= 0. (1.182)

Using the SVD decomposition

C = UΣVH ,

whereU andV have, respectively, dimensionsM×(N +1) and (N+1)×(N+1),we have

CHC = VΣVH . (1.183)

To facilitate subsequent manipulations we partition V and Σ as follows:

V =

[V11 v12

v21 v22

], Σ =

[Σ1 00 σ22

](1.184)

where V11,v12,v21 are, respectively, N × N , N × 1, 1 × N , v22 and σ22 arescalars, and

Σ1 = diag [σ1 σ2 σ3 . . . σN ] . (1.185)

Page 57: Signals and transforms in linear systems analysis

42 1 Signals and Their Representations

Since V is unitary,

[VH

11 vH21vH12 v∗22

] [V11 v12

v21 v22

]=

[INN 00 1

](1.186)

which provides the following constraints:

VH11V11+ vH21v21 = INN , (1.187a)

vH12V11+ v∗22v21 = 0, (1.187b)

VH11v12+ vH21v22 = 0, (1.187c)

vH12v12+ v∗22v22 = 1. (1.187d)

Substituting (1.184) into (1.183) and then into (1.182) we obtain the following:

[xHV11 − v12

]Σ1

[VH

11 + vH21xH]+ σ22

[xHv12 − v22

] [vH12 + v∗22x

H]= 0.(1.188)

Let us assume that v22 = 0. Then (1.188) is satisfied by

x ≡ xTLS = −v12

v22(1.189)

since this reduces the last factor in (1.188) to zero while the first factor on theleft gives

xHV11 − v12 = −vH12V11

v∗22− v12 = − 1

v∗22

[VH

11v12+ vH21v22],

which vanishes by virtue of (1.187c).Equation (1.189) is the total least squares solution of Golub and Van Loan.

When v22 = 0, the TLS solution does not exist. When k singular values areidentical (say the last k) then, because the SVD of C would in that case notbe affected by a permutation of the corresponding columns of V, any of thelast k columns of V can serve as the vector

[vT12 v22

]for constructing the TLS

solution. In such cases, just as was done in the LMS solution, a reasonablechoice is a vector with the smallest norm. Because the matrix is unitary it isclear that this prescription is equivalent to choosing that column of V for which

maxN−k+2≤�≤N+1

|V (N + 1, �)| (1.190)

The residual error α in (1.173) when written in terms of the SVD of C reads

α =wHVΣVHw

wHw, (1.191)

where w = [xTLS − 1]. Using (1.189) and (1.187d) we obtain

wHw =1/ |v22|2 . (1.192)

Page 58: Signals and transforms in linear systems analysis

1.4 LMS Solutions via the Singular Value Decomposition 43

Also

VHw =

[VH

11 vH21vH12 v∗22

] [xTLS−1

]

=

[ −VH11v12/v22 −vH21

−vH12v12/v22 −v∗22

]=

[O

−1/v22]

(1.193)

where we have made use of (1.187b) and (1.187d). Taking account of the defi-nition of Σ in (1.184), (1.191) reduces to

α = σ22. (1.194)

Let us suppose that A has full rank N (and M > N). If the rank of Cis also N , then y is in the subspace spanned by the columns of A and thesystem Ax = y has a unique solution, which must be identical with the LMSsolution. Since the ranks of C and A are identical, σ22 = 0 so that the TLSsolution is also the same. A different situation arises when the rank of C isN + 1 so that σ22 = 0. Then y is not in the subspace spanned by the columnsof A and Ax = y has no solutions. However, as long as there is a nonzeroprojection of the vector y into the range of A there exists a nontrivial LMSsolution. The TLS solution also exists and is distinct from the LMS solution.Clearly, the two solutions must approach each other when σ22 → 0.

1.4.7 Tikhonov Regularization

We have shown that when the rank an MXN (M>N) matrix A is less thanN, the solution of the normal equations (1.80) for x can be obtained using theSVD. An alternative approach is the so-called Tikhonov regularization [24]. Thecorresponding LMS problem is formulated as follows: instead of seeking an xthat minimizes ε = ‖Ax− y‖2 one seeks instead an x that minimizes

ε = ‖Ax− y‖2 + α2 ‖x‖2 (1.195)

for a fixed real α (Tikhonov regularization parameter). Using the variationalapproach (as, e.g., in the preceding subsection) one finds that the required xmust satisfy (

AHA+α2IN)x = AHy, (1.196)

where IN is an NXN unit matrix. Evidently the inverse in (1.196) exists evenfor a singular AHA provided α = 0 so that

x =(AHA+α2IN

)−1AHy. (1.197)

Let us now represent A in terms of its SVD. Thus with rank(A) =R ≤ N ,we set

A = URΣRVHR

and substitute in (1.196) to obtain(VRΣ

2RV

HR +α2IN

)x = VRΣRU

HRy (1.198)

Page 59: Signals and transforms in linear systems analysis

44 1 Signals and Their Representations

To solve (1.198) for x we introduce a unitary NXN matrix V in which VR formsthe first R columns. Clearly (1.198) is then equivalent to

V

[Σ2R +α2IR 0

0 α2IN−R

]VHx = V

[ΣR

0

]UHRy, (1.199)

where IR and IN−R are, respectively, RXR and (N −R)X (N −R) unitmatrices. Multiplying (1.199) from the left first by VH , then by the inverse ofthe diagonal matrix, and finally by V, yields

x = VR

⎢⎣

σ1

σ21+α

2 0 0

0 . 00 0 σR

σ2R+α2

⎥⎦UHRy. (1.200)

Note that with α = 0 we obtain the usual SVD solution. As we know, forsufficiently small singular values (large matrix condition numbers) such a solu-tion could be noisy. As we have seen, the recommended remedy in such casesis to trim the matrix, i.e., eliminate sufficiently small singular values and thusreduce R. From (1.200) we observe that the Tikhonov approach provides analternative: instead of eliminating the troublesome small singular values, onecan introduce a suitable value of α to reduce their influence. This is equiva-lent to constructing a new matrix with a smaller condition number. Thus, ifthe condition number of A is 1/σmin, the condition number of the new matrixbecomes σmin /

(σ2min + α2

).

1.5 Finite Sets of Orthogonal Functions

1.5.1 LMS and Orthogonal Functions

When the expansion functions are orthogonal the Gram matrix in (1.33) isdiagonal so that the expansion coefficients are given by

fn = Q−1n (φn, f) ; n = 1, 2, 3, . . .N, (1.201)

where Qn is the normalization factor in (1.25). We note that once the coefficient

fn has been determined for some n, say n = k, it remains final, i.e., it doesnot need to be updated no matter how many more coefficients one decides tocalculate for n > k. This is definitely not the case for a nonorthogonal system.Thus, suppose we decided to increase our function set from N to N + 1. Thenfor a nonorthogonal system all the coefficients have to be updated, but not sofor an orthogonal system. This property has been referred to that of “finality,”and is closely related to what mathematicians refer to as completeness, to bediscussed in 1.7.1. Upon substituting (1.201) in (1.78) we obtain

εN min = (f, f)−N∑

n=1

Qn

∣∣∣fn∣∣∣2

, (1.202a)

Page 60: Signals and transforms in linear systems analysis

1.5 Finite Sets of Orthogonal Functions 45

and since εN min ≥ 0 we have

(f, f) ≥N∑

n=1

Qn

∣∣∣fn∣∣∣2

, (1.202b)

which is usually referred to as the Bessel inequality.7

1.5.2 Trigonometric Functions

Perhaps the best known orthogonal function set is comprised of the trigonometricfunctions cos(nt) and sin(nt);n = 0, 1, 2, . . .N . The LMS approximation to asignal f (t) in the interval 0 < t ≤ 2π in terms of these functions reads

f (t) ∼ f0 +N∑

n=1

f (e)n cos(nt) +

N∑

n=1

f (o)n sin(nt). (1.203)

Using the notation

φ(e)n (t) = cos(nt), (1.204a)

φ(o)n (t) = sin(nt), (1.204b)

for n ≥ 1, we have over the interval 0 ≤ t ≤ 2π(φ(e)n , φ(e)

m

)= πδnm, (1.205a)

(φ(o)n , φ(o)m

)= πδnm, (1.205b)

(φ(e)n , φ(o)m

)= 0, (1.205c)

as is easily demonstrated by direct integration. Thus the 2N functions (1.204)together with the constant 1 comprise an orthogonal set of 2N + 1 functionswithin the specified interval. In view of (1.201) we have Qn = π for n > 1 whileQ0 = 2π so that the expansion coefficients are

f0 = (2π)−1

(1, f) , (1.206a)

f (e)n = (π)

−1(φ(e)n , f

), (1.206b)

f (o)n = (π)

−1(φ(o)n , f

). (1.206c)

Since the sinusoids are periodic with period 2π, the orthogonality relationshipsin (1.205) hold over any interval of 2π duration. For example,

(φ(e)n , φ(e)m

)≡

∫ 2π+τ

τ

cos (nt) cos (mt) dt = πδnm, (1.207)

7Formulas (1.201) and (1.202) assume a more esthetically pleasing form if we assumean orthonormal set for then Qn = 1. Even though this can always be realized by simplydividing each expansion function by

√Qn, it is not customary in applied problems and we

shall generally honor this custom.

Page 61: Signals and transforms in linear systems analysis

46 1 Signals and Their Representations

wherein τ is an arbitrary reference value. Thus if a function f(t) were to bespecified over the interval τ < t ≤ 2π + τ , the expansion coefficients (1.206)would be given by inner products over this interval. For example,

f (e)n = (π)

−1(φ(e)n , f

)= (π)

−1∫ 2π+τ

τ

cos (nt) f (t) dt. (1.208)

We note that the right side of (1.203) is always periodic over any 2π interval.This does not imply that f (t) itself need to be periodic since the coefficients inthis expansion have been chosen to approximate the given function only withinthe specified finite interval; the behavior of the sum outside this interval neednot bear any relation to the function f (t). Thus the fact that the approximationto the function yields a periodic extension outside the specified interval is to betaken as a purely algebraic property of sinusoids. Of course, in special cases itmay turn out that the specified function is itself periodic over 2π. Under suchspecial circumstances the periodic extension of the approximation would alsoapproximate the given function and (1.205) would hold for |t| <∞.

The restriction to the period of length 2π is readily removed. For example,suppose we choose to approximate f (t) within the interval −T/2 < t ≤ T/2.Then the expansion functions (1.206) get modified by a replacement of theargument t by 2π/T , i.e.,

φ(e)n (t) = cos(2πnt/T ), (1.209a)

φ(o)n (t) = sin(2πnt/T ), (1.209b)

while the expansion coefficients are given by

f0 = (T )−1

(1, f) , (1.210a)

f (e)n = 2 (T )

−1(φ(e)n , f

), (1.210b)

f (o)n = 2 (T )

−1(φ(o)n , f

). (1.210c)

In applications it is frequently more convenient to employ instead of (1.203)an expansion in terms of the complex exponential functions

φn (t) = cos(2πnt/T ) + i sin(2πnt/T ) = ei2πnt/T . (1.211)

The LMS approximation to f(t) then assumes the symmetrical form

f(t) ∼n=N∑

n=−Nfnφn (t) . (1.212)

The three orthogonality statements (1.205) can now be merged into the singlerelationship

(φn, φm) ≡∫ T/2

−T/2φ∗n (t)φm (t) dt = Tδnm, (1.213)

Page 62: Signals and transforms in linear systems analysis

1.5 Finite Sets of Orthogonal Functions 47

and the expansion coefficients in (1.212) become

fn = (T )−1

(φn, f) . (1.214)

The Bessel inequality in (1.202b) in the present case reads

(f, f) =

∫ T/2

−T/2|f(t)|2 dt ≥ T

n=N∑

n=−N

∣∣∣fn∣∣∣2

. (1.215)

As will be shown in Sect. 1.2.1, for piecewise differentiable functions theLMS error can always be made to approach zero as N → ∞ in which case theinequality in (1.215) becomes an equality so that

(f, f) =

∫ T/2

−T/2|f(t)|2 dt = T

n=∞∑

n=−∞

∣∣∣fn∣∣∣2

. (1.216)

This relationship is usually referred to as Parseval theorem. The correspondinginfinite series is the Fourier series. Its convergence properties we shall study indetail in Chap. 2.

1.5.3 Orthogonal Polynomials [1]

As an additional illustration of LMS approximation of signals by orthogonalfunctions we shall consider orthogonal polynomials. Although they do not playas prominent a role in signal analysis as trigonometric functions they neverthe-less furnish excellent concrete illustrations of the general principles discussed inthe preceding. Orthogonal polynomials are of importance in a variety of areasof applied mathematics (for example in constructing algorithms for accuratenumerical integration) and are the subject of an extensive technical literature.In the following we shall limit ourselves to extremely brief accounts of only threetypes of orthogonal polynomials: those associated with the names of Legendre,Laguerre, and Hermit.

Legendre Polynomials

Legendre Polynomials are usually introduced in connection with the defining(Legendre) differential equation. An alternative, and for present purposes, moresuitable approach is to view them simply as polynomials that have been con-strained to form an orthogonal set in the interval −1 ≤ t ≤ 1. Using theGram–Schmidt orthogonalization procedure in 1.2.5 the linearly independentset

1, t, t2, . . . tN−1 (1.217)

can be transformed into the set of polynomials

Pn (t) =n∑

�=0

(−1)� [2 (n− �)]!2n�! (n− �)! (n− 2�)!

tn−2� , n = 0, 1, 2, . . .N − 1, (1.218)

Page 63: Signals and transforms in linear systems analysis

48 1 Signals and Their Representations

where

n =

{n/2 ; n even,(n− 1) /2 ; n odd,

with the orthogonality relationship

(Pn, Pm) ≡∫ 1

−1

Pn (t)Pm (t) dt =2

2n+ 1δnm. (1.219)

Thus the normalization factor in (1.26) is in this case

Qn = 2/(2n+ 1), (1.220)

so that the coefficients in the LMS approximation of f (t) by the firstN LegendrePolynomials are

fn = Q−1n (Pn, f) . (1.221)

The LMS approximation is then

f (t) ∼N−1∑

n=0

fnPn (t) . (1.222)

The Bessel inequality in (1.202b) now reads

(f, f) ≡∫ 1

−1

|f (t)|2 dt ≥N−1∑

n=0

Qn

∣∣∣f∣∣∣2

. (1.223)

Again as in the case of trigonometric functions the limiting form N →∞ leadsto a zero LMS error for a large class of functions, in particular for functions thatare piecewise differentiable within the expansion interval. Equation (1.223) isthen satisfied with equality. This relationship may be referred to as Parseval’stheorem by analogy with Fourier Series.

It should be noted that there is no need to restrict the expansion interval to−1 ≤ t ≤ 1 for the simple substitution

t =2

b− a[t′ − b+ a

2

]

transforms this interval into the interval a ≤ t′ ≤ b in terms of the new variablet′. Thus the expansion (1.222) can be adjusted to apply to any finite interval.

There are many different series and integral representations for LegendrePolynomials, the utility of which lies primarily in rather specialized analyticalinvestigations. Here we limit ourselves to one additional relationship, the so-called Rodrigue’s formula, which is

Pn (t) =1

2nn!

dn

dtn(t2 − 1

)n. (1.224)

Page 64: Signals and transforms in linear systems analysis

1.5 Finite Sets of Orthogonal Functions 49

-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1P0

P1

P2

P3P4

Figure 1.10: Plots of the first five Legendre polynomials

This recursion formula is frequently taken as the basic definition of LegendrePolynomials and represents a convenient way of generating these functions pro-vided n is not too large. For example, the first five of these polynomials are

P0 (t) = 1, (1.225a)

P1 (t) = t, (1.225b)

P2 (t) =3

2t2 − 1

2, (1.225c)

P3 (t) =5

2t3 − 3

2t, (1.225d)

P4 (t) =35

8t4 − 30

8t2 +

3

8. (1.225e)

Plots of (1.225) are shown in Fig. 1.10.As an example suppose we obtain the LMS approximation to the function

3/2 + sign(t) by a finite number of Legendre polynomials. Figure 1.11 showsthe manner in which the partial sum approximates the given function as thenumber of (odd) Legendre Polynomials is increased from 4 to 6 and to 11. Wenote that the approximation at t = 0 yields in all cases the arithmetic mean ofthe function at the discontinuity, i.e.,

f (0+) + f (0−)2

= 3/2.

Page 65: Signals and transforms in linear systems analysis

50 1 Signals and Their Representations

-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 10

0.5

1

1.5

2

2.5

3

K=4K=6K=11

* 3/2+sign(t)

Figure 1.11: LMS approximation of a step discontinuity by LegendrePolynomials

We shall show later that this is a general property of LMS convergence at a stepdiscontinuity: the partial sums always give the arithmetic mean of the functionvalues at the right and the left side of the discontinuity. In addition, we notethat as the number of expansion functions is increased, the oscillatory overshootdoes not appear to diminish. We shall meet this behavior again in connectionwith the study of Fourier series and Fourier integrals, where it is usually referredto as the Gibbs Phenomenon.

Laguerre Polynomials

Suppose we attempt to approximate the function f(t) with an orthogonal setof functions over a semi-infinite interval. It would appear that these expan-sion functions cannot be polynomials, since the square of a polynomial is notintegrable over a semi-infinite interval. Polynomials can still be used, if one iswilling to introduce a suitable weighting factor into the definition of the innerproduct, as described in 1.2.5, to render the inner product convergent. Thussuppose Ln(t) is an n-th order polynomial and we choose e−t as the weightingfactor. Clearly the norm defined by

‖Ln (t)‖ =√∫ ∞

0

e−tL2n (t) dt (1.226)

Page 66: Signals and transforms in linear systems analysis

1.5 Finite Sets of Orthogonal Functions 51

converges irrespective of the nature of the polynomial. Polynomials Ln(t)orthogonal over (0,∞) with an exponential weighting factor, viz.,

∫ ∞

0

e−tLn (t)Lm(t)dt = Qnδnm, (1.227)

are called Laguerre polynomials. With the Gram–Schmidt orthogonalizationsuch polynomials can be generated from the set 1, t, t2, . . . tN−1. The funda-mental definitions are as follows:

Ln (t) =

n∑

�=0

n!(−1)��!

(n

)t� (1.228)

andQn = (n!)

2. (1.229)

The equivalent of the Rodriguez formula for Laguerre polynomials reads

Ln (t) = etdn

dtn(tne−t

). (1.230)

Expansions in Laguerre polynomials follow from the normal equations with theinner product as in (1.82) with a = 0, b = ∞, φn = Ln, and ψ = e−t/2. Usingthe orthogonality properties (1.227) the expansion coefficients are given by

fn = Q−1n

(e−tLn, f

)(1.231)

and the expansion (1.83) reads

f (t) =

N−1∑

n=0

fnLn (t) + eψ,N−1et/2. (1.232)

This expression suggests that a reasonable alternative definition of the errormight be eψ,N−1e

t/2. However then

∫ ∞

0

∣∣∣eψ,N−1et/2

∣∣∣2

dt =

∫ ∞

0

∣∣∣f(t)− f (N) (t)∣∣∣2

dt, (1.233)

so that eψ,N−1 must decrease at infinity at least exponentially for the integraljust to converge. Indeed, the square of this renormalized MS error is alwaysgreater or equal to

‖ eψ,N−1 ‖2=∫ ∞

0

e−t∣∣∣f(t)− f (N) (t)

∣∣∣2

dt, (1.234)

which is just the definition of the norm employed in minimization process leadingto the normal equations. The Bessel inequality for these expansion functionsreads: ∫ ∞

0

e−t |f (t)|2 dt ≥N−1∑

n=0

Qn

∣∣∣fn∣∣∣2

. (1.235)

Page 67: Signals and transforms in linear systems analysis

52 1 Signals and Their Representations

Hermite Polynomials

When a function is to be approximated over an infinite time interval −∞ < t <∞ by polynomials a reasonable weighting factor is ψ = e−t

2/2. The polynomialsthat yield orthogonality in this case are called Hermite polynomials and aredefined by

Hn (t) = (−1)n et2 dn

dtne−t

2

. (1.236)

One can show that∫ ∞

−∞e−t

2

Hn (t)Hm(t)dt = Qnδnm, (1.237)

withQn = 2nn!

√π. (1.238)

In view of the orthogonality properties (1.237) the expansion coefficients are

fn = Q−1n

(e−t

2

Hn, f).

1.6 Singularity Functions

1.6.1 The Delta Function

Definition of the Delta Function

In dealing with finite-dimensional vectors the Kronecker symbol δnm, (1.27), isfrequently employed to designate a unit matrix. In that case the n-th componentof any M -dimensional vector x can be represented by

xn =

M∑

m=1

δnmxm. (1.239)

What we now seek is an analogue of (1.239) when the vector comprises a nonde-numerably infinite set of components, i.e., we wish to define an identity trans-formation in function space. Since the sum then becomes an integral, thisdesideratum is equivalent to requiring the existence of a function δ (t) with theproperty

f (t) =

∫ b

a

δ (t− t′) f (t′) dt′. (1.240)

In order to serve as an identity transformation, this expression should hold forall functions of interest to us that are defined over the interval a < t < b.Unfortunately it is not possible to reconcile the required properties of δ (t) withthe usual definition of a function and any known theory of integration for which(1.240) could be made meaningful. In other words, if we restrict ourselves toconventional definitions of a function, an identity transformation of the kindwe seek does not exist. A rigorous mathematical theory to justify (1.240) is

Page 68: Signals and transforms in linear systems analysis

1.6 Singularity Functions 53

t− ε / 2 t+ ε / 2t

1/εf(t)

t'ba

Figure 1.12: Geometrical relationship of f (t) and the pulse function

available in terms of the so-called theory of distributions. Even a brief discussionof this theory would take us too far afield. Instead, we shall follow the moreconventional approach and justify the notation (1.240) in terms of a limitingprocess. For this purpose consider the pulse function defined by

Kε (t) =

{1ε ; |t| ≤ ε/2,0; |t| > ε/2.

(1.241)

Let us next choose an f (t) that is continuous and differentiable everywhere ina < t < b and compute the integral

Iε (t) =

∫ b

a

Kε (t− t′) f (t′) dt′. (1.242)

It is not hard to see that this is equivalent to

Iε (t) =1

ε

∫ t+ε/2

t−ε/2f (t′) dt′ (1.243)

for a + ε/2 < t < b − ε/2. Thus the integral is just the sliding arithmeticaverage of f (t) over a continuum of intervals of length ε and may be given thegeometrical interpretation as shown in Fig. 1.12.

Clearly, for sufficiently small ε the f (t′) within the interval encompassed bythe pulse function can be approximated by f (t) as closely as desired. Henceformally we can write

f(t) = limε→0

∫ b

a

Kε (t− t′) f (t′) dt′. (1.244)

Page 69: Signals and transforms in linear systems analysis

54 1 Signals and Their Representations

Thus, instead of (1.240), we have established an identity transformation whichentails a limiting process. It is tempting to pull the limiting process into theintegrand of (1.244) and establish a formal identity with (1.244) by defining

δ (t− t′) = limε→0

Kε (t− t′) . (1.245)

a definition which does not actually represent a mathematically valid limitingprocess. Nevertheless, because of notational convenience and custom we shalluse it and similar expressions to “define” the Dirac delta function δ (t) butonly with the firm understanding that this is merely a mnemonic rule for alimiting process such as (1.244). Of course we could, as is sometimes done,consider (1.245) as representing a “function” which is everywhere zero withthe exception of a single point where it becomes infinite. Such a definitionis difficult to apply consistently in many physical situations where the deltafunction provides a very useful shorthand notational tool. Although we maystill think physically of the delta function as a sort of tall spike, it is analyticallymore fruitful to think in terms of a limiting process such as (1.244), especiallysince the physical significance of a delta function almost invariably manifestsitself under an integral sign so that there is no need to ask for the actual valuetaken on by the function.

The pulse function is not the only way in which to obtain limiting forms ofa delta function. In fact there is an infinite set of kernels KΩ (t, t′) with theproperty that for a continuous function integrable in a, b the following limitingform holds:

f(t) = limΩ→Ω0

∫ b

a

KΩ (t, t′) f (t′) dt′; a < t < b. (1.246)

The parameter Ω0 is usually either zero or infinity. Every kernel for which thepreceding limit exists will be said to define a delta function. Note that this limitis taken with the kernel under the integral sign. What about the limit of thekernel itself, i.e., limKΩ (t, t′) as Ω→ Ω0? It is not hard to see that as long ast = t′ this limit is zero. However for t = t′ the limit does not exist. Nevertheless,we shall employ the customary notation and define the delta function by

δ (t− t′) = limΩ→Ω0

KΩ (t, t′) , (1.247)

which is to be understood merely as a shorthand notation for the more elaboratestatement (1.246). Implicit in this definition is the interval a, b over which theidentity transformation (1.240) is defined.

A general property of delta function kernels is that they are symmetric, i.e.,

KΩ (t, t′) = KΩ (t′, t) . (1.248)

In view of (1.247) this means that δ (τ ) = δ (−τ) so that for symmetric kernelsthe delta function may be regarded as an even function. It may turn out that the

Page 70: Signals and transforms in linear systems analysis

1.6 Singularity Functions 55

-4 -3 -2 -1 0 1 2 3 40

0.5

1

1.5

2

2.5

3

3.5

t

MA

GN

ITU

DE

0.5

0.1

1.0

Figure 1.13: Delta function kernel in Eq. (1.249)

kernel itself is a function of the difference of t and t′, i.e., KΩ (t, t′) = KΩ (t− t′)as in the case for the pulse function in Fig. 1.12. Under these circumstances(1.247) may be replaced by

δ (t) = limΩ→Ω0

KΩ (t) .

Limiting Forms Leading to Delta Functions

Besides the pulse function another limiting form of the delta function is thefollowing:

δ (t) = limΩ→0

Ω

π [t2 +Ω2], (1.249)

which holds over the infinite time interval, i.e., a = −∞ and b = ∞, withΩ0 = 0. To prove this, we have to carry out the limiting process

I (t) ≡ limΩ→0

∫ ∞

−∞

Ω

π[(t− t′)2 +Ω2

]f (t′) dt′, (1.250)

and establish that I (t) = f (t). Plots of the kernel in (1.250) are shown inFig. 1.13 for several values of Ω.

We note that this kernel tends to zero everywhere except at the soli-tary point at which the integration variable t′ approaches t, where the

Page 71: Signals and transforms in linear systems analysis

56 1 Signals and Their Representations

ratio Ω/[(t− t′)2 +Ω2

]becomes indeterminate. In order to estimate the

contribution from the neighborhood of this special point we subdivide theintegration interval into three parts, as follows:

∫ t−ε/2

−∞{} dt′ +

∫ t+ε/2

t−ε/2{} dt′ +

∫ ∞

t+ε/2

{} dt′, (1.251)

where ε is a positive quantity which we may choose as small as desired. Sincet′ = t in the first and third integrals, their limiting values as Ω −→ 0 are zero.Therefore the entire contribution must come from the middle term alone, i.e.,

I (t) = limΩ→0

∫ t+ε/2

t−ε/2

Ω

π[(t− t′)2 +Ω2

]f (t′) dt′. (1.252)

Since f (t) is continuous, we can choose a sufficiently small ε so that f (t′) maybe regarded as a constant within the integration interval and set equal to f (t).Thus, upon factoring f (t) out of the integrand and changing the variable ofintegration to x = (t′ − t)/Ω, (1.252) leads to the final result

I = f (t) limΩ→0

∫ ε/2Ω

−ε/2Ω

dx

π [x2 + 1]= f (t)

∫ ∞

−∞

dx

π [x2 + 1]= f (t) , (1.253)

wherein the last step account has been taken of the indefinite integral∫ [1 + x2

]−1dx = arctan (x).

Yet another limiting form for the delta function, which will be shown to playthe central role in the theory of the Fourier integral, is the following:

δ (t) = limΩ→∞

sin(Ωt)

πt(1.254)

with t again defined over the entire real line. To prove (1.254) we need anintermediate result known as the Riemann–Lebesgue Lemma [16] (RLL) whichfor our purposes may be phrased as follows: Given a continuous and piecewisedifferentiable function g(t) we define

Ψ1 (ω, α) =

∫ ∞

α

g(t)e−iωtdt, (1.255a)

Ψ2 (ω, β) =

∫ β

−∞g(t)e−iωtdt, (1.255b)

where α and β are real constants with α > β. If

∫ ∞

α

|g(t)| dt < ∞ , and (1.256a)

∫ β

−∞|g(t)| dt < ∞, (1.256b)

Page 72: Signals and transforms in linear systems analysis

1.6 Singularity Functions 57

then the following limits hold:

limω→∞Ψ1 (ω, α) −→ 0, (1.257a)

limω→∞Ψ2 (ω, β) −→ 0. (1.257b)

To prove (1.257a) we choose a constant τ > α and write (1.255a) as the sumof two parts as follows:

Ψ1 (ω, α) = Ψ1 (ω, α, τ ) +

∫ ∞

τ

g(t)e−iωtdt, (1.258)

where Ψ1 (ω, α, τ) =∫ ταg(t)e−iωtdt. For any finite τ we have the bound

|Ψ1 (ω, α)| ≤∣∣∣Ψ1 (ω, α, τ)

∣∣∣+∫ ∞

τ

|g(t)| dt. (1.259)

Integrating∫ τα g(t)e

−iωtdt by parts, we get

Ψ1 (ω, α, τ) =g (τ ) e−iωτ

−iω − g (α) e−iωα

−iω − 1

−iω∫ τ

α

dg(t)

dte−iωtdt. (1.260)

Since g (t) is bounded, the first two terms on the right of (1.260) approach zerofor sufficiently large ω. Also, since g (t) is assumed continuous and piecewisedifferentiable, its derivative is integrable over any finite interval. Hence as ω →∞ last term on the right of (1.260) tends to zero as well so that

limω→∞Ψ1 (ω, α, τ) −→ 0. (1.261)

Since in accordance with (1.256a) g (t) is absolutely integrable the second termon the right of (1.259) can be made arbitrarily small by simply choosing asufficiently large τ . Thus since the right side of the inequality in (1.259) canbe made arbitrarily small, its left side must also approach zero. This provesassertion (1.257a). By an identical argument one can prove (1.257b).

From the preceding proof it should be evident that, in particular,

limω→∞

∫ b

a

g(t)e−iωt −→ 0, (1.262)

where the only requirement is that g(t) be finite and piecewise differentiableover the finite interval a, b. Also, it is easy to see that assertions (1.257) and(1.262) remain valid when e−iωt is replaced by sin (ωt) or cos (ωt).

Returning to the proof of (1.254), we have to evaluate

I (t) ≡ limΩ→∞

∫ ∞

−∞

sin [Ω (t− t′)]π (t− t′) f (t′) dt′ (1.263)

and show that I (t) = f (t). This is more readily accomplished by regarding theintegrand in (1.263) as a product of sin [Ω (t− t′)] and the function

g (t, t′) = f (t′) /π (t− t′) . (1.264)

Page 73: Signals and transforms in linear systems analysis

58 1 Signals and Their Representations

If we suppose that the latter is absolutely integrable with respect to t′ in thesense of (1.256) we can apply the RLL as long as we exclude the point t′ = t forthere g (t, t′) becomes infinite for f (t′) = 0. We again isolate this special pointby breaking up the integration interval into three nonoverlapping segments, justas we did in (1.251):

I (t) = limΩ→∞

∫ t−ε/2

−∞g (t, t′) sin [Ω (t− t′)] dt′

+ limΩ→∞

∫ t+ε/2

t−ε/2g (t, t′) sin [Ω (t− t′)] dt′

+ limΩ→∞

∫ ∞

t+ε/2

g (t, t′) sin [Ω (t− t′)] dt′. (1.265)

In view of (1.264) and (1.256) for functions f (t) satisfying

∫ ∞

α

|f(t′)|t′

dt′ <∞ ,

∫ β

−∞

|f(t′)|t′

dt′ <∞ (1.266)

the first and the third integrals on the right of (1.265) vanish in accordance withthe RLL for any ε > 0. Hence we need concern ourselves only with the middleintegral, i.e.,

I (t) = limΩ→∞

∫ t+ε/2

t−ε/2

sin [Ω (t− t′)]π (t− t′) f (t′) dt′. (1.267)

Again we choose a sufficiently small ε so that f (t′) may be approximated byf (t) as closely as desired and write

I (t) = f (t) limΩ→∞

∫ t+ε/2

t−ε/2

sin [Ω (t− t′)]π (t− t′) dt′.

The limit on the right is evaluated by first changing the variable of integrationfrom t′ to τ = Ω(t′ − t) to obtain

limΩ→∞

∫ t+ε/2

t−ε/2

sin [Ω (t− t′)]π (t− t′) dt′ = lim

Ω→∞

∫ Ωε/2

−Ωε/2

sin τ

πτdτ =

1

π

∫ ∞

−∞

sin τ

τdτ .

(1.268)The last integral converges to a known value, i.e.,

∫ ∞

−∞

sin τ

τdτ = π,

so that we obtain I (t) = f (t), thus proving (1.254), or, equivalently,

f(t) = limΩ→∞

∫ ∞

−∞

sin [Ω (t− t′)]π (t− t′) f (t′) dt′. (1.269)

There are other limiting forms that lead to delta functions which we shall discussin the sequel. From the foregoing we note that such limiting forms evidently re-quire not only the specification of the time interval in question but also the class

Page 74: Signals and transforms in linear systems analysis

1.6 Singularity Functions 59

of admissible functions. For example, in the three cases we have considered, thelimiting process required that the functions be continuous and sectionally dif-ferentiable throughout the time interval over which the delta function property(i.e., (1.246)) is to hold. The modification needed to include functions withstep discontinuities will be discussed in the next subsection. For the case of theFourier kernel, (1.254), an additional constraint on the growth of the function atinfinity, (1.266), had to be imposed. We shall discuss this and other constraintson the growth at infinity in connection with Fourier Transforms in Chap. 2.

1.6.2 Higher Order Singularity Function

Consider now a function f (t) with a sectionally continuous and differentiablefirst derivative. Then the fundamental limiting form (1.246) should apply tothe derivative and we may write

df(t)

dt= lim

Ω→Ω0

∫ b

a

KΩ (t, t′)df (t′)dt′

dt′; a < t < b. (1.270)

If we assume a differentiable kernel the preceding may be integrated by partsand written in the form

df(t)

dt= lim

Ω→Ω0

[KΩ (t, t′) f (t′)

∣∣ba +

∫ b

a

−dKΩ (t, t′)dt′

f (t′) dt′].

Now limΩ→Ω0

KΩ (t, t′) = 0 for t = t′ so that if t does not coincide with the inte-

gration limits the preceding is equivalent to

− df(t)

dt= lim

Ω→Ω0

∫ b

a

dKΩ (t, t′)dt′

f (t′) dt′. (1.271)

Just as we abbreviated the limiting process (1.246) by the introduction of thedelta function in (1.240) that selects f (t′) at a single point t′ = t we canabbreviate the limiting process in (1.271) by introducing the symbolic function

δ(1) (t) that selects the negative derivative of f (t′) at t′ = t, i.e.,

− df(t)

dt=

∫ b

a

δ(1) (t′ − t) f (t′) dt′. (1.272)

This new symbolic function, referred to as the “doublet”, can be formally con-sidered a derivative of the delta function. This interpretation follows from thefollowing integration by parts:

∫ b

a

dδ (t′ − t)dt′

f (t′) dt′ = δ (t′ − t) f (t′) ∣∣ba −∫ b

a

δ (t′ − t) df (t′)

dt′dt′ = −df (t)

dt.

(1.273)

Page 75: Signals and transforms in linear systems analysis

60 1 Signals and Their Representations

Comparing the first integral in the preceding with the integral in (1.272) we

readily identify dδ (t′ − t) /dt′ with δ(1) (t′ − t). Note that if instead of the delta

function we were to insert the doublet δ(1) (t′ − t) in the first integral in (1.273)the integration by parts would yield d2f(t)/dt2. This suggests, by analogy with

(1.272), that we interpret dδ(1) (t′ − t) /dt′ ≡ δ(2) (t′ − t) as a “higher order”singularity function which, when multiplied by f(t′), selects upon integrationits second derivative at t = t′. Clearly by repeating this procedure we cangenerate singularity functions of increasingly higher order and thus generalize(1.272) to8

(−1)k dkf(t)

dtk=

∫ b

a

δ(k) (t′ − t) f (t′) dt′ ; k = 1, 2, 3, . . . (1.274)

wherein the k-th order singularity function δ(k) (t) is defined (in a purely formalway) as the k-th order derivative of a delta function, i.e.,

δ(k) (t) =dkδ (t)

dtk. (1.275)

These singularity functions and the attendant operations are to be understood aspurely formal shorthand notation whose precise meaning, just as the meaningof the delta function itself, must be sought in the underlying limiting formsof suitable kernels. Fortunately in most applications these detailed limitingprocesses need not be considered explicitly and one is usually content with theformal statements (1.274) and (1.275).

As an example of a limiting form leading to a doublet consider the kerneldefined by the triangle function

KΩ (t− t′) =⎧⎨

⎩2Ω

(1− |t−t

′|Ω/2

); |t− t′| < Ω/2,

0; |t− t′| ≥ Ω/2.(1.276)

A plot of this function and its derivative is shown in Fig. 1.14By repeating the limiting arguments employed in conjunction with the pulse

function, (1.243), one readily finds that for a continuous and differentiablefunction f (t)

limΩ−→0

∫ ∞

−∞KΩ (t) f (t) dt = f (0)

so that (1.276) defines a delta function kernel. On the other hand, employingthe derivative depicted in Fig. 1.14 we obtain

limΩ−→0

∫ ∞

−∞

dKΩ (t)

dtf (t) dt = lim

Ω−→0{[f (−Ω/2)− f (Ω/2)] (Ω/2) (4/Ω2

)}

=f (−Ω/2)− f (Ω/2)

Ω/2= −df (t)

dt|t=0 (1.277)

in agreement with formula (1.271) (with Ω0 = 0 and t = 0).

8The reader may be puzzled by the change in the argument from t−t′, which we employedin the preceding subsection for the delta function, to t′ − t. We do this to avoid distinguishingbetween the derivative with respect to the argument and with respect to t′ in the definitionof δ(k) (t).

Page 76: Signals and transforms in linear systems analysis

1.6 Singularity Functions 61

Ω / 2

Ω / 2

− Ω / 2

− Ω / 2

2 / Ω

4 / Ω2

− 4 / Ω2

t

t

KΩ (t)

dt

dKΩ (t)

Figure 1.14: Triangle delta function kernel and its derivative

1.6.3 Idealized Signals

In this as well as in subsequent sections repeated use will be made of severalidealized signals. We list them here for ready reference.

1. The unit step function

U (t) =

{1; t > 0,0; t < 0.

(1.278a)

2. Rectangular pulse

pT (t) =

{1; |t| < T,0; |t| > T.

(1.278b)

3. The sign function

sign (t) =

{1; t > 0,−1 : t < 0.

(1.278c)

Page 77: Signals and transforms in linear systems analysis

62 1 Signals and Their Representations

-6 -4 -2 0 2 4 6-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

(1/p

i)*S

i(x/p

i)

x/pi

Figure 1.15: Sine integral function

4. The triangular pulse

qT (t) =

{1− |t|

T ; |t| < T,0; |t| > T.

(1.278d)

5. The sine integral function

Si(t) =

∫ t

0

sin τ

τdτ . (1.278e)

This function, plotted in Fig. 1.15, plays an important role in the theory ofFourier series and integrals. The damped oscillations reach their maximum(minimum) at x = ±π of ±1.859 and decay asymptotically at ±∞ whereSi (±∞) = ±π/2

6. Derivative of the unit step function

In the following we shall obtain the derivative of a unit step as a limitingform of the derivative of the function defined by

Uε (t) = [1− qε (t+ ε/2)]U (t+ ε/2)

shown plotted in Fig. 1.16.

From the figure we note that dUε (t) /dt = (1/ε) pε/2 (t) = Kε(t), as definedin (1.241). Consequently, for any function f (t) continuous and differen-tiable at the origin,

limε−→0

∫ ∞

−∞

dUε (t)

dtf (t) dt = f (0) ,

Page 78: Signals and transforms in linear systems analysis

1.6 Singularity Functions 63

− ε / 2 ε / 2

Uε (t)

1.0

t

Figure 1.16: Step function with finite rise time

so that we make the formal identification

dU (t)

dt= δ (t) . (1.279)

1.6.4 Representation of Functions with StepDiscontinuities

The most general analogue signal of interest to us is represented by a piece-wise differentiable function with a denumerable set of step discontinuities. Ifwe denote the smooth portion of the signal (i.e., defined as continuous and dif-ferentiable) by fs(t) and the points of discontinuity by tk, such a function f (t)may be represented as follows:

f (t) = fs (t) +∑

k

[f(t+k

)− f (t−k

)]U (t− tk) , (1.280)

where f(t+k

) − f(t−k

)represents the function increment (step) at the disconti-

nuity point tk as indicated in Fig. 1.17. Even though (1.280)

df (t)

dt=dfs (t)

dt+

k

[f(t+k

)− f (t−k

)]δ (t− tk) , (1.281)

where dfs(t)dt is an ordinary derivative of the smooth part of the function.

We have shown that limiting forms of certain kernels lead to delta functionproperties provided the signal is smooth (i.e., continuous and differentiable).With the aid of the representation (1.279) we can examine these limiting prop-erties for signals with step discontinuities. For example, consider the limit withthe Fourier kernel in (1.269). After substituting (1.279) in (1.269) we may takeaccount of the fact that the limiting process involving fs (t) follows along the

Page 79: Signals and transforms in linear systems analysis

64 1 Signals and Their Representations

f(t)

tt1 t2 t3

f(t1)

f(t1)

f(t2)

f(t−) f(t3)

f(t3)

2

+

+

+

Figure 1.17: A piecewise smooth signal

lines leading to (1.269), i.e., the corresponding limit is just fs (t). Hence thelimiting form for the discontinuous function can therefore be written as follows:

I (t) = fs (t) +∑

k

[f(t+k

)− f (t−k

)]lim

Ω→∞

∫ ∞

tk

sin [Ω (t− t′)]π (t− t′) dt′. (1.282)

Changing the integration variable to τ = Ω(t− t′), the last integral becomes

limΩ→∞

∫ ∞

tk

sin [Ω (t− t′)]π (t− t′) dt′ = lim

Ω→∞1

π

∫ Ω(t−tk)

−∞

sin τ

τdτ =

⎧⎨

1 ; t > tk,12 ; t = tk,0 ; t < tk.

(1.283)

In order to see what happens when the last result is substituted in (1.282) letus temporarily assume that we have only one discontinuity, say at t = t1. Then

I (t) =

⎧⎨

fs (t) + f(t+1

)− f (t−1

); t > t1,

fs (t) +12

[f(t+1

)− f (t−1

)]; t = t1,

fs (t) ; t < t1.(1.284)

Note that at the discontinuity I (t) = I (t1) = fs (t1) +12

[f(t+1

)− f (t−1

)]=

f(t−1

)+ 1

2

[f(t+1

)− f (t−1

)]= 1

2

[f(t+1

)+ f

(t−1

)], i.e., the representation con-

verges to the arithmetic mean of the two function values. On the other hand, forall other values of t, I (t) converges to f (t). Clearly the same argument appliesto any number of discontinuities. We can therefore write the general result asfollows:

limΩ→∞

∫ ∞

−∞

sin [Ω (t− t′)]π (t− t′) f (t′) dt′ =

1

2

[f(t+

)+ f

(t−

)], (1.285)

where the symbols t+ and t− represent time values “just after” and “justbefore” the discontinuity at t. When the function is continuous at t then

Page 80: Signals and transforms in linear systems analysis

1.6 Singularity Functions 65

f (t+) = f (t−) = f (t) and (1.285) reduces to the old formula (1.269). It turnsout that the limiting form (1.285) is not restricted to the Fourier kernel but isa general property of delta function kernels. Thus for any piecewise differen-tiable function with step discontinuities the fundamental delta function property(1.240) becomes

1

2

[f(t+

)+ f

(t−

)]=

∫ b

a

δ (t− t′) f (t′) dt′. (1.286)

1.6.5 Delta Function with Functions as Arguments

In applications the delta function appears frequently with an “ordinary” func-tion in the argument, e.g., δ [g (t)]. A typical case might involve another functionf (t) in the integrand in the form

I =

∫ b

a

f (t) δ [g (t)] dt. (1.287)

Assuming that g (t) is differentiable we can replace the argument of the deltafunction by a new variable τ = g (t) and follow the standard procedure forchanging the variable of integration to obtain

I =

∫ g(b)

g(a)

f[g−1 (τ )

] δ (τ )∣∣∣dg(t)dt

∣∣∣t=g−1(τ)

dτ , (1.288)

where symbol g−1 (τ ) represents the inversion of g (t), i.e., a solution of g (t) = τfor t. Clearly if g (t) = 0 anywhere within the interval of integration, then I = 0.On the other hand, if

g (tk) = 0 ; a < tk < b; k = 1, 2, 3, . . .L, (1.289)

the delta function δ (τ ) will give a contribution at each tk so that (1.288) eval-uates to

I =

L∑

k=1

f (tk)∣∣∣dg(t)dt

∣∣∣t=tk

. (1.290)

Clearly we obtain the same result if in (1.287) we simply set

δ [g (t)] =

L∑

k=1

δ (t− tk)∣∣∣dg(t)dt

∣∣∣t=tk

. (1.291)

Thus (1.290) may be taken as the formal definition of δ [g (t)].As an example, consider the integral

I =

∫ b

a

1

t4 + 1δ(t2 − 3t+ 2)dt. (1.292)

Page 81: Signals and transforms in linear systems analysis

66 1 Signals and Their Representations

The roots of t2 − 3t+ 2 are t1 = 1 and t2 = 2 so that (1.291) gives

δ(t2 − 3t+ 2) = δ (t− 1) + δ (t− 2) .

The value of I depends on whether none, one, or both roots fall within theintegration limits. Thus

I =

⎧⎪⎪⎨

⎪⎪⎩

0 ; a < 1, b < 1 or a > 2, b > 2 or 1 < a < 2, 1 < b < 2,12 ; a < 1, 1 < b < 2,117 ; 1 < a < 2, b > 2,117 + 1

2 ; a < 1, b > 2.

(1.293)

If a root happens to coincide with one of the integration limits, one obtainsa situation analogous to integrating a discontinuous function so that (1.287)applies. Thus in the preceding example with a = 1 and 1 < b < 2 one obtainsI = 1/4. Note that to obtain a nonzero contribution requires that g (t) havereal roots. For example,

I =

∫ b

a

1

t4 + 1δ(t2 + 1)dt = 0

for all (real) choices of a and b since t2 + 1 = 0 possesses no real roots.

1.7 Infinite Orthogonal Systems

1.7.1 Deterministic Signals

Thus far we considered the LMS approximation of signals only in terms of afinite number of expansion functions. In a finite, N -dimensional, vector spacewe know that the LMS error can always be made to approach zero by simplychoosing N linearly independent basis vectors. In fact in this case not onlywould the LMS error be zero but the actual error would also be zero. Theanalogous situation in an infinite-dimensional vector space requires an infinitenumber of expansion functions. However, unlike in the finite-dimensional space,the LMS error may approach zero without at the same time the actual errortending to zero. In the former case we speak of convergence in the mean, in thelatter of pointwise convergence. Thus

limN−→∞

∫ b

a

|ψ(t)|2 ∣∣f (t)− fN (t)∣∣2 dt = 0 (1.294)

for convergence in the mean and

limN−→∞

fN (t) = f (t) (1.295)

for pointwise convergence. While it is obvious that (1.295) implies (1.294), theconverse is in general not true. We shall call the infinite set of expansion func-tions φn (t) complete if for every piecewise differentiable function f (t) (1.294)holds while (1.295) holds at all points of continuity [5, 21].

Page 82: Signals and transforms in linear systems analysis

1.7 Infinite Orthogonal Systems 67

Assume that the set φn (t) is orthonormal in a, b with a weighting factor

w (t) = |ψ (t)|2, i.e.,

(φm, φn) =

∫ b

a

w (t)φ∗m (t)φn (t) dt = δnm. (1.296)

We now select the fn in

fN (t) =N∑

n=1

fnφn (t) (1.297)

such that(f − fN , f − fN)

is minimized. Recall that this is accomplished with

fn = (φn, f) =

∫ b

a

w (t′)φ∗n (t

′) f (t′) dt′. (1.298)

We now substitute this fn in (1.297) to obtain

fN (t) =

∫ b

a

f (t′)w (t′)N∑

n=1

φn (t)φ∗n (t

′) dt′. (1.299)

Now (1.295) implies that

limN−→∞

∫ b

a

f (t′)w (t′)N∑

n=1

φn (t)φ∗n (t

′) dt′ = f (t) . (1.300)

If we now refer to our fundamental definition of the delta function in (1.246),we identify in the above N = Ω, Ω0 =∞, and

KN (t, t′) = w (t′)N∑

n=1

φn (t)φ∗n (t

′) , (1.301)

then (1.300) implies that

limN−→∞

w (t′)N∑

n=1

φn (t)φ∗n (t

′) = δ (t− t′) , (1.302)

or, alternatively,

δ (t− t′)w (t′)

=

∞∑

n=1

φn (t)φ∗n (t

′) . (1.303)

which is the statement of completeness at points of continuity.In addition, whenever (1.294) holds (but not necessarily (1.295)), we have

∫ b

a

w (t) |f (t)|2 dt =∞∑

n=1

∣∣∣fn∣∣∣2

(1.304)

Page 83: Signals and transforms in linear systems analysis

68 1 Signals and Their Representations

a relationship known in the special case of Fourier analysis as Parseval’s formula.It follows either by direct substitution of (1.297) in (1.294) or by taking thelimiting form of the Bessel inequality (1.202b). When the expansion functionsare not of unit norm but instead (φm, φn) = Qnδnm, (1.304) becomes

∫ b

a

w (t) |f (t)|2 dt =∞∑

n=1

Qn

∣∣∣fn∣∣∣2

. (1.305)

From the foregoing we see that delta function kernels arise naturally in con-junction with expansions in orthonormal functions. Equation (1.303) (or theformally equivalent statements (1.300) or (1.302)) is referred to as the com-pleteness relation for the orthonormal set φn (t) with weighting factor w (t). Itcan be shown that at points of discontinuity the right side of (1.300) should bereplaced by 1

2 [f (t+) + f (t−)] with the LMS error tending to zero in accordance

with (1.294). In Chap. 2 we shall demonstrate this explicitly for Fourier seriesand the Fourier integral.

1.7.2 Stochastic Signals: Karhunen–Loeve Expansion∗

Suppose we approximate a stochastic process f (t) by the sum

fN (t) =

N∑

n=1

fnφn (t) , (1.306)

where the φn (t) are deterministic expansion functions and the coefficients fn

are random variables. The error in the approximation, eN (t) = f (t)− fN (t),is a stochastic process and will differ for each sample function (realization) of

f (t). Similarly the MS error (eN , eN) is a random variable. Selecting the fnso

as to minimize the MS error for each sample function of the stochastic processleads, just as in the deterministic case, to the projection theorem (φm, eN ) = 0

so that the fnsatisfy the stochastic normal equations

(φm, f

)=

N∑

n=1

(φm, φn) fn.

Denoting the inverse of the Gram matrix by {Gnm}−1the solution for the

random variables fnreads

fn=

N∑

m=1

{Gnm}−1 (φm, f

)

Page 84: Signals and transforms in linear systems analysis

1.7 Infinite Orthogonal Systems 69

For each sample function of f (t) the LMS error is

(eN , eN ) = (f, f)−N∑

n=1

fn

(f, φn

)

= (f, f)−N∑

n=1

N∑

m=1

{Gnm}−1 (φm, f) (f, φn

). (1.307)

Denoting by Rff (t, t′) =< f(t)f∗ (t′) > the autocorrelation function of the

process we compute the ensemble average of both sides of (1.307) to obtain

εN min = < (eN , eN ) >=

∫ b

a

Rff (t, t)dt

−N∑

n=1

N∑

m=1

{Gnm}−1∫ b

a

dt

∫ b

a

dt′φ∗m (t′)φn (t)Rff (t

′, t) . (1.308)

In the preceding no assumption other than linear independence has been madeabout the expansion functions. We now assume that they are eigenfunctions ofthe integral equation

∫ b

a

dt′Rff (t, t′)φn (t′) = λnφn (t) , n = 1, 2, . . . (1.309)

It can be shown that the properties of the autocorrelation function guaranteethe existence of an infinite set of orthogonal functions satisfying (1.309) and thecompleteness relationship

∞∑

n=1

φn (t)φ∗n (t

′) = δ (t− t′) . (1.310)

The eigenvalues λn are positive for all finite n and converge to zero as n tendsto infinity. To obtain an expansion of the autocorrelation function in terms ofits eigenfunctions we multiply both sides of (1.309) by φ∗n (t

′′) and sum over n.This yields a delta function under the integral sign so that upon integration weget (after replacing t′′ with t′) the expansion

Rff (t, t′) =

∞∑

n=1

λnφn (t)φ∗n (t

′) . (1.311)

The expansion functions are orthogonal so that with unit normalization theexpansion coefficients in (1.306) are

fn=

∫ b

a

φ∗n (t) f (t) dt. (1.312)

Page 85: Signals and transforms in linear systems analysis

70 1 Signals and Their Representations

The correlation matrix elements for these coefficients are given by the en-semble averages

< fnf∗m>=

∫ b

a

φ∗n (t) dt

∫ b

a

dt′Rff (t, t′)φm (t′) (1.313)

and using (1.309) we get

< fnf∗m>= λn

∫ b

a

φ∗n (t)φm (t) = λnδnm. (1.314)

This shows that when the expansion functions in (1.306) are chosen to be eigen-functions of the autocorrelation function the expansion coefficients are uncorre-lated.

Let us compute the average LMS error when a stochastic process is ap-proximated by such an expansion. For this purpose we may use (1.308) with

{Gnm}−1= δnm. Taking account of (1.309) we get as N →∞

ε∞min =

∫ b

a

Rff (t, t)dt−∞∑

n=1

λn. (1.315)

But in view of (1.311)∫ baRff (t, t)dt =

∑∞n=1 λn so that ε∞min = 0.

The LMS expansion of the stochastic process f (t)

f (t) ˜

∞∑

n=1

fnφn (t) (1.316)

in which the expansion functions are eigenfunctions of the autocorrelation func-tion is known as the Karhunen–Loeve [18] expansion and finds extensive appli-cations in statistical signal analysis. As shown above, it converges in the sensethat

limN→∞

<

∫ b

a

∣∣∣∣∣f (t)−N∑

n=1

fnφn (t)

∣∣∣∣∣

2

dt >→ 0. (1.317)

Problems

1. Prove that the p-norm defined by (1.20) satisfies postulates (1.14a),(1.14b) and (1.14c).

2. Show that the functions φ1(t) = t√

3/2 and φ2(t) = (3t2 − 1)√5/8 are

orthonormal over −1 ≤ t ≤ 1 and find the Euclidean distance betweenthe signals f (t) and g (t),

f (t) = 0.5φ1(t) + φ2(t),

g (t) = −φ1(t) + 0.5φ2(t).

Page 86: Signals and transforms in linear systems analysis

Problems 71

3. Approximate f(t) = sin t within the interval 0, π in the LMS sense by thefunction sets 1, t, t2 and 1, t, t2, t3, t4. Compute the corresponding LMSerrors and plot f3 (t) and f5 (t) together with f(t) on the same set ofaxes.

4. Generate a data set y [n] as follows:

y [n] = (2n+ 1)2+ 12000 ∗ rand (n) , 1 ≤ n ≤ 64,

where rand (n) is a random sequence of 64 numbers uniformly distributedbetween −1 and 1. Find the coefficients a0, a1, a2 such that

yest [n] ≡ a0 + a1n+ a2n2

represents an LMS fit to the data. Carry out the coefficient evaluationusing:

(a) The Normal Equations.

(b) The SVD retaining (i) all three singular values (ii) two dominantsingular values, and (iii) one dominant singular value.

(c) Plot the yest [n] obtained in (a) and (b) together with the data set

as well as the “signal” y0 [n] ≡ (2n+ 1)2on the same set of axes.

5. Prove that the three functions e−t, te−t, t2e−t are linearly independent inthe interval 0,∞ and transform them into an orthonormal set using

(a) The Gram–Schmidt procedure.

(b) The orthonormal functions defined by (1.150).

(c) Repeat (a) and (b) with the addition of the fourth function e−2t tothe set.

6. Find an orthonormal set defined in the interval 0, π corresponding to thefollowing three functions : sin t, t sin t, 2 sin t+ 3t sin t using:

(a) Eq. (1.56)

(b) The Gram–Schmidt orthogonalization procedure

7. For the N × 4 matrix (N > 4) A defined by

A = [a1a2a2a1] ,

where a1 and a2 are two N -dimensional orthogonal column vectors, withaH1 a1 = 2 and aH2 a2 = 3. Find

(a) The SVD of A

(b) The Moore–Penrose pseudoinverse of A

(c) Given an arbitrary N -dimensional column vector y use the result of(b) to solve Ax ∼ y in the LMS sense for x.

(d) Find the LMS error in (c)

Page 87: Signals and transforms in linear systems analysis

72 1 Signals and Their Representations

8. A set of data xn, yn, n = 1, 2, 3, . . .N is to fitted to the parabola

y = ax2 + b

in the LMS sense with the correspondence xn ⇔ x and yn ⇔ y.

(a) Write down the normal equations

(b) Solve for a and b

(c) Find the LMS error

9. The functionf(t) = 1− |t|

is to be approximated in the LMS sense in the interval −1 ≤ t ≤ 1 by thepolynomial

2∑

n=0

fntn

(a) Find the coefficients fn

(b) Find the LMS error

10. Prove the following (Hadamard) inequality:

det[(φn, φm)] ≤N

Πk=1

(φk, φk) .

11. Using the three M -dimensional column vectors b1,b2,b3, we form theMX9 matrix B,

B =[b1 b1 b1 b1 b2 b2 b2 b3 b3

],

Assuming M ≥ 9 and that b1,b2,b3 are orthonormal determine the SVDof B.

12. In Problem 11 show that the column vectors of B constitute a frame inthe three-dimensional subspace spanned by b1,b2,b3. What are the framebounds A and B? Using the three vectors b1,b2,b3 explain how one couldmodify the matrix B to obtain a tight frame.

13. The signalf(t) = cos(2πt/T )

is to be approximated in the interval 0 ≤ t ≤ T in the LMS sense by theN functions defined by

φn(t) = pδ (t− nδ) ,where δ = T/N and

pδ (t) =

{1/√δ; 0 ≤ t ≤ δ,

0 ; otherwise .

Page 88: Signals and transforms in linear systems analysis

Problems 73

(a) Denoting the partial sum by

fN (t) =

N−1∑

n=0

fnφn(t)

find the coefficients fn.

(b) Compute the LMS error as a function of T and N .

(c) Prove that the LMS error approaches 0 as N −→∞.

14. With

φ(e)n (t) = cos [2πnt/ (b− a)]φ(o)n (t) = sin [2πnt/ (b− a)]

and n = 1, 2, 3, . . . prove directly the orthogonality relations

∫ b

a

φ(e)n (t)φ(e)

m (t) =b− a2

δnm,

∫ b

a

φ(o)n (t)φ(o)m (t) =

b− a2

δnm,

∫ b

a

φ(e)n (t)φ(o)m (t) = 0.

15. Consider the set of polynomials φk = tk−1, k = 1, 2, . . .N . With the innerproduct defined by

(φk, φj

)=

∫ ∞

−∞φk (t)φj (t) e

−t2dt,

employ the Gram–Schmidt orthogonalization procedure to show that theresulting orthogonal functions are proportional to the Hermite polynomi-als in (1.236).

16. Prove the following:

(a) δ (t− t′) = lima−→∞

sin2[a(t−t′)]πa(t−t′)2 for all piecewise differentiable functions

satisfying (1.266)

(b) δ (t− t′) = lima−→0

1√2πa2

e−(t−t′)2

2a2 for all piecewise differentiable and

bounded functions.

Page 89: Signals and transforms in linear systems analysis

74 1 Signals and Their Representations

1Δ3

2

Δ3−

0 Δ 3Δ

KΔ(t)

t2Δ

17. Evaluate the following integrals

(a)∫ 4

1

δ(t2−2)t2+1 dt

(b)∫ 2π

−2πδ[sin(2t)]1+cos2(2t)dt

(c)∫ 1

−1

δ(t2+1)2+t dt

18. Prove that every real square integrable function (i.e., one satisfying∫ ∞−∞

f (t)2dt <∞) also satisfies (1.266).

19. For the kernel defined by KΔ(t) =1Δ3 pΔ/2(t−Δ/2)− 2

Δ3 pΔ/2(t−3Δ/2)+1Δ3 pΔ/2(t − 5Δ/2) and plotted in the following sketch prove that for apiecewise continuous function f (t) that is twice differentiable

limΔ−→0

∫ ∞

−∞KΔ (t) f (t) dt = f ′′ (0) ,

i.e., that limΔ−→0KΔ (t) = δ(2)(t)

Page 90: Signals and transforms in linear systems analysis

Chapter 2

Fourier Series and Integralswith Applications to SignalAnalysis

Perhaps the most important orthogonal functions in engineering applications aretrigonometric functions. These were briefly discussed in 1.5.2 as one exampleof LMS approximation by finite orthogonal function sets. In this chapter wereexamine the LMS approximation problem in terms of infinite trigonometricfunction sets. When the approximating sum converges to the given functionwe obtain a Fourier Series; in case of a continuous summation index (i.e., anintegral as in (1.92) the converging approximating integral is referred to as aFourier Integral.

2.1 Fourier Series

2.1.1 Pointwise Convergence at Interior Points for SmoothFunctions

We return to 1.5.2 and the LMS approximation of f (t) within the interval−T/2 < t < T/2 by 2N + 1 using complex exponentials as given by (1.211).The approximating sum reads

fN (t) =

N∑

n=−Nfne

i2πnt/T , (2.1)

while for the expansion coefficients we find from (1.214)

fn =1

T

∫ T/2

−T/2f (t) e−i2πnt/T dt. (2.2)

W. Wasylkiwskyj, Signals and Transforms in Linear Systems Analysis,DOI 10.1007/978-1-4614-3287-6 2, © Springer Science+Business Media, LLC 2013

75

Page 91: Signals and transforms in linear systems analysis

76 2 Fourier Series and Integrals with Applications to Signal Analysis

Upon substituting (2.2) in (2.1) and interchanging summation and integrationwe obtain

fN (t) =

∫ T/2

−T/2f (t′)KN (t− t′) dt′, (2.3)

where

KN (t− t′) = 1

T

N∑

n=−Nei2πn(t−t

′)/T =

N∑

n=−N

(1√Tei2πnt/T

)(1√Tei2πnt

′/T)∗

,

(2.4)and as shown in the following, approaches a delta function at points of con-tinuity as N approaches infinity. The last form highlights the fact that thiskernel can be represented as a sum of symmetric products of expansion func-tions in conformance with the general result in (1.301) and (1.302). Using thegeometrical series sum formula we readily obtain

KN (t− t′) = sin [2π (N + 1/2) (t− t′) /T ]T sin [π (t− t′) /T ] , (2.5)

which is known as the Fourier series kernel. As is evident from (2.4) this kernelis periodic with period T and is comprised of an infinite series of regularlyspaced peaks each similar to the a-periodic sinc function kernel encountered in(1.254). A plot of T KN (τ ) for N = 5 as a function of (t − t′)/T ≡ τ/T isshown in Fig. 2.1. The peak value attained by T KN (τ ) at τ/T = 0,±1,±2,

-2 -1.5 -1 -0.5 0 0.5 1 1.5 2-4

-2

0

2

4

6

8

10

12

tau/T

T*K

(tau

/T)

Figure 2.1: Fourier series kernel (N=5)

Page 92: Signals and transforms in linear systems analysis

2.1 Fourier Series 77

is in general 2N + 1, as may be verified directly with the aid of (2.4) or (2.5).As the peaks of these principal lobes grow in proportion with N their widthsdiminish with increasing N. In fact we readily find directly from (2.5) thatthe peak-to-first null lobe width is Δτ = T/(2N + 1). We note that ΔτKN

(±kT ) = 1, so that the areas under the principal lobes should be on the orderof unity for sufficiently large N. This suggests that the infinite series of peaksin Fig. 2.1 should tend to an infinite series of delta functions as the numberN increases without bound. This is in fact the case. To prove this we mustshow that for any piecewise differentiable function defined in any of the intervals(k − 1/2)T < t < (k + 1/2)T, k = 0,±1, . . . the limit

limN−→∞

∫ (k+1/2)T

(k−1/2)T

f (t′)KN (t− t′) dt′ = 1

2

[f(t+

)+ f

(t−

)](2.6)

holds. Of course, because of the periodicity of the Fourier Series kernel it sufficesif we prove (2.6) for k = 0 only. The proof employs steps very similar to thosefollowing (1.263) except that the constraint on the behavior of f (t) at infinity,(1.266), presently becomes superfluous since the integration interval is finite.Consequently the simpler form of the RLL given by (1.262) applies. As in(1.267) we designate the limit by I (t) and write

I (t) = limN−→∞

∫ T/2

−T/2g (t, t′) sin [2π (N + 1/2) (t− t′) /T ]dt′, (2.7)

where by analogy with (1.264) we have defined the function

g (t, t′) =f (t′)

T sin [π (t− t′) /T ] . (2.8)

In (2.7) we may identify the large parameter 2π (N + 1/2) /T with ω in (1.262)and apply the RLL provided we again exclude the point t = t′where g (t, t′)becomes infinite. We then proceed as in (1.265) to obtain

I (t) = limN−→∞

∫ t−ε/2

−T/2g (t, t′) sin [2π (N + 1/2) (t− t′) /T ]dt′

+ limN−→∞

∫ t+ε/2

t−ε/2g (t, t′) sin [2π (N + 1/2) (t− t′) /T ]dt′

+ limN−→∞

∫ T/2

t+ε/2

g (t, t′) sin [2π (N + 1/2) (t− t′) /T ]dt′, (2.9)

where ε is an arbitrarily small positive number. Let us first assume that f (t′) issmooth, i.e., piecewise differentiable and continuous. In that case then g (t, t′)has the same properties provided t = t′. This is true for the function in theintegrand of the first and third integral in (2.9). Hence by the RLL these vanishso that I(t) is determined solely by the middle integral

I (t) = limN−→∞

∫ t+ε/2

t−ε/2f (t′)

sin [2π (N + 1/2) (t− t′) /T ]T sin [π (t− t′) /T ] dt′. (2.10)

Page 93: Signals and transforms in linear systems analysis

78 2 Fourier Series and Integrals with Applications to Signal Analysis

Since ε is arbitrarily small, f (t′) can be approximated as closely as desiredby f (t) and therefore factored out of the integrand. Also for small ε thesin [π (t− t′) /T ] in the denominator can be replaced by its argument. Withthese changes (2.10) becomes

I (t) = f (t) limN−→∞

∫ t+ε/2

t−ε/2

sin [2π (N + 1/2) (t− t′) /T ]π (t− t′) dt′. (2.11)

The final evaluation becomes more transparent when the integration variable ischanged from t′ to x = 2π (N + 1/2) (t′ − t) /T which transforms (2.11) into

I (t) = f (t) limN−→∞

1

π

∫ επ(N+1/2)/T

−επ(N+1/2)/T

sinx

xdx = f (t)

1

π

∫ ∞

−∞

sinx

xdx = f (t) .

(2.12)

This establishes the delta function character of the Fourier series kernel. Equiv-alently, we have proven that for any smooth function f (t) the Fourier series

f (t) =

∞∑

n=−∞fne

i2πnt/T (2.13)

with coefficients given by (2.2) converges in the interval −T/2 < t < T/2.

2.1.2 Convergence at Step Discontinuities

Note that in the preceding limiting argument we have excluded the endpoints ofthe interval, i.e., we have shown convergence only in the open interval. In fact,as we shall shortly see, pointwise convergence can in general not be achieved by aFourier series at t = ±T/2 even for a function with smooth behavior in the openinterval. It turns out that convergence at the endpoints is intimately relatedto convergence at a step discontinuity, to which we now turn our attention.Thus suppose our function possesses a finite number of step discontinuities inthe (open) interval under consideration. We can then represent it as a sumcomprising a smooth function fs (t) and a sum of step functions as in (1.280).In order not to encumber the development with excessive notation we confinethe discussion to one typical discontinuity, say at t = t1, and write

f (t) = fs (t) +[f(t+1

)− f (t−1

)]U (t− t1) . (2.14)

The Fourier coefficients follow from (2.2) so that

fn =1

T

∫ T/2

−T/2fs (t) e

−i2πnt/Tdt+

[f(t+1

)− f (t−1

)]

T

∫ T/2

t1

e−i2πnt/Tdt (2.15)

and substitution in (2.1) yields the partial sum

fN (t) =

∫ T/2

−T/2fs (t

′)sin [2π (N + 1/2) (t− t′) /T ]

T sin [π (t− t′) /T ] dt′+[f(t+1

)− f (t−1

)]λN (t) ,

(2.16)

Page 94: Signals and transforms in linear systems analysis

2.1 Fourier Series 79

where

λN (t) =

∫ T/2

t1

sin [2π (N + 1/2) (t− t′) /T ]T sin [π (t− t′) /T ] dt′ . (2.17)

The limiting form of the first integral on the right of (2.16) as N −→ ∞ hasalready been considered so that

limN−→∞

fN (t) = fs (t) +[f(t+1

)− f (t−1

)]lim

N−→∞λN (t) (2.18)

and only the last limit introduces novel features. Confining our attention to thisterm we distinguish three cases: the interval −T/2 < t < t1, wherein t

′ = t sothat the RLL applies, the interval t1 < t < T/2, and the point of discontinuityt = t1. In the first case λN (t) approaches zero. In the second case we divide theintegration interval into three subintervals as in (2.9). Proceeding in identicalfashion we find that λN (t) approaches unity. For t = t1 we subdivide theintegration interval into two subintervals as follows:

λN (t1) =

∫ t1+ε/2

t1

sin [2π (N + 1/2) (t1 − t′) /T ]T sin [π (t1 − t′) /T ] dt′

+

∫ T/2

t1+ε/2

sin [2π (N + 1/2) (t1 − t′) /T ]T sin [π (t1 − t′) /T ] dt′, (2.19)

where again ε is an arbitrarily small positive quantity. In the second integralt′ = t1 so that again the RLL applies and we obtain zero in the limit. Hencethe limit is given by the first integral which we compute as follows:

limN−→∞

λN (t1) = limN−→∞

∫ t1+ε/2

t1

sin [2π (N + 1/2) (t1 − t′) /T ]T sin [π (t1 − t′) /T ] dt′

= limN−→∞

∫ π(2N+1)ε2T

0

sinx

π (2N + 1) sin x2N+1

dx

= limN−→∞

∫ π(2N+1)ε2T

0

sinx

πxdx =

∫ ∞

0

sinx

πxdx =

1

2. (2.20)

Summarizing the preceding results we have

limN−→∞

λN (t) =

⎧⎨

0 ; − T/2 < t < t1,1/2 ; t = t1,

1 ; t1 < t < T/2.(2.21)

Returning to (2.18) and taking account of the continuity of fs (t) we have thefinal result

limN−→∞

fN (t1) =1

2

[f(t+1

)+ f

(t−1

)]. (2.22)

Clearly this generalizes to any number of finite discontinuities within the ex-pansion interval. Thus, for a piecewise differentiable function with step discon-tinuities the Fourier series statement (2.13) should be replaced by

1

2

[f(t+

)+ f

(t−

)]=

∞∑

−∞fne

i2πnt/T . (2.23)

Page 95: Signals and transforms in linear systems analysis

80 2 Fourier Series and Integrals with Applications to Signal Analysis

Although the limiting form (2.23) tells us what happens when the numberof terms in the series is infinite, it does not shed any light on the behavior ofthe partial approximating sum for finite N. To assess the rate of convergencewe should examine (2.17) as a function of t with increasing N. For this purposelet us introduce the function

Si s(x,N) =

∫ x

0

sin[(N + 1/2)θ]

2 sin (θ/2)dθ (2.24)

so that the dimensionless parameter x is a measure of the distance fromthe step discontinuity (x = 0). The integrand in (2.24) is just the sum

(1/2)∑n=Nn=−N exp(−inθ) which we integrate term by term and obtain the

alternative form

Si s(x,N) =x

2+

N∑

n=1

sin(nx)

n. (2.25)

Note that for any N the preceding gives Si s(π,N) = π/2. As N → ∞ with0 < x < π this series converges to π/2. A plot of (2.25) for N = 10 and N = 20is shown in Fig. 2.2. For larger values of N the oscillatory behavior of Si s(y,N)

-4 -3 -2 -1 0 1 2 3 4-2

-1.5

-1

-0.5

0

0.5

1

1.5

2** *

* N=10

** N=20

x

Sis

(x,N

)

Figure 2.2: FS convergence at a step discontinuity for N=10 and N=20

damps out and the function approaches the asymptotes ±π/2 for y = 0. Notethat as N is increased the peak amplitude of the oscillations does not diminishbut migrates toward the location of the step discontinuity, i.e., y = 0. Thenumerical value of the overshoot is ±1.852 or about 18% above (below) thepositive (negative) asymptote. When expressed in terms of (2.25), (2.17) reads

λN (t) =1

πSi s[(T/2− t)2π/T,N ]− 1

πSi s[(t1 − t)2π/T,N ]. (2.26)

Page 96: Signals and transforms in linear systems analysis

2.1 Fourier Series 81

Taking account of the limiting forms of (2.25) we note that as long as t < T/2 inthe limit as N →∞ the contribution from the first term on the right of (2.26)approaches 1/2, while the second term tends to −1/2 for t < t1,1/2 for t >t1 and 0 for t = t1, in agreement with the limiting forms enumerated in (2.21).

Results of sample calculations of λN (t) (with t1 = 0) for N = 10, 20, and50 are plotted in Fig. 2.3. Examining these three curves we again observe thatincreasing N does not lead to a diminution of the maximum amplitude of the

-0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5-0.2

0

0.2

0.4

0.6

0.8

1

1.2

t/T

Lam

bdaN

(t)

******

* N=10** N=20

***N=50

Figure 2.3: Convergence at a step discontinuity

oscillations. On the contrary, except for a compression of the timescale, theoscillations for N = 50 have essentially the same peak amplitudes as those forN = 10 and in fact exhibit the same overshoot as in Fig. 2.2. Thus N appears toenter into the argument in (2.22) merely as scaling factor of the abscissa, so thatthe magnitude of the peak overshoot appears to persist no matter how large Nis chosen. The reason for this behavior can be demonstrated analytically byapproximating (2.24) for large N. We do this by first changing the variable ofintegration in (2.24) to y = (N + 1/2)θ to obtain

Si s(x,N) =

∫ (N+1/2)x

0

sin y

(2N + 1) sin[y/(2N + 1)]dy. (2.26*)

Before proceeding with the next algebraic step we note that as N → ∞ thenumerator in (2.24) will be a rapidly oscillating sinusoid so that its contributionsto the integral will mutually cancel except for those in the neighborhood ofsmall θ. In terms of the variables in (2.26*) this means that for large N theargument y/(2N + 1) of the sine function will remain small. In that case wemay replace the sine by its argument which leads to the asymptotic form

Si[s(N, x)] ∼ Si[(N + 1/2)x], (2.26**)

Page 97: Signals and transforms in linear systems analysis

82 2 Fourier Series and Integrals with Applications to Signal Analysis

where Si(z) is the sine integral function defined in (1.278e) and plotted inFig. 1.15. If we use this asymptotic form in (2.26), we get

λN (t) =1

πSi[(N + 1/2)(T/2− t)2π/T ]− 1

πSi[(N + 1/2)(t1 − t)2π/T ],

which shows directly that N enters as a scaling factor of the abscissa. Thusas the number of terms in the approximation becomes infinite the oscillatorybehavior in Fig. 2.3 compresses into two vanishingly small time intervals whichin the limit may be represented by a pair of infinitely thin spikes at t = 0+ andt = 0−. Since in the limit these spikes enclose zero area we have here a directdemonstration of convergence in the mean (i.e., the LMS error rather than theerror itself tending to zero with increasing N). This type of convergence, charac-terized by the appearance of an overshoot as a step discontinuity is approached,is referred to as the Gibbs phenomenon, in honor of Willard Gibbs, one of theAmerica’s greatest physicists. Gibbs phenomenon results whenever an LMS ap-proximation is employed for a function with step discontinuities and is by nomeans limited to approximations by sinusoids (i.e., Fourier series). In fact thenumerical example in Fig. 1.11 demonstrates it for Legendre Polynomials.

Another aspect of the Gibbs phenomenon worth mentioning is that it affords an example of nonuniform convergence. For, as we have seen, $\lim_{N\to\infty}\lambda_N(t_1) = 1/2$. On the other hand, the limit approached when N is allowed to approach infinity first, and the function is subsequently evaluated at t as it is made to approach t₁ (say, through positive values), is evidently unity. Expressed in symbols, these two alternative ways of approaching the limit are

$$\lim_{N\to\infty}\,\lim_{t\to t_1^+}\lambda_N(t) = 1/2, \tag{2.26***a}$$

$$\lim_{t\to t_1^+}\,\lim_{N\to\infty}\lambda_N(t) = 1. \tag{2.26***b}$$

In other words, the result of the limiting process depends on the order in whichthe limits are taken, a characteristic of nonuniform convergence. We canview (2.26***) as a detailed interpretation of the limiting processes impliedin the Fourier series at step discontinuities which the notation (2.23) does notmake explicit.
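The persistence of the overshoot is easy to observe numerically. The following Python sketch evaluates partial Fourier sums of a unit step at t₁ = 0 for increasing N; the closed-form sine coefficients of the step used here are an elementary computation assumed for this illustration, not taken from the text:

```python
import numpy as np

# Partial Fourier sum of a unit step at t1 = 0 on (-T/2, T/2),
# evaluated just to the right of the discontinuity for increasing N.
# The peak should stay near 0.5 + Si(pi)/pi ~ 1.0895, independent of N.
T = 1.0
t = np.linspace(1e-4, 0.05, 2000)   # grid just to the right of the step

for N in (10, 50, 250):
    n = np.arange(1, N + 1)
    # Fourier sine coefficients of U(t) - 1/2 (odd square wave, amplitude 1/2):
    # b_n = (1 - (-1)^n) / (pi * n)
    b = (1 - (-1.0) ** n) / (np.pi * n)
    partial = 0.5 + b @ np.sin(2 * np.pi * np.outer(n, t) / T)
    print(N, partial.max())          # overshoot persists as N grows
```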

2.1.3 Convergence at Interval Endpoints

The preceding discussion applies only to convergence properties of the Fourier series within the open interval. To complete the discussion of convergence we must still consider convergence at the interval endpoints ±T/2. We start with the approximate form (2.20) (cf. Fig. 2.3) which, together with the periodicity of λN(t) based on the exact form (2.17), gives

$$\lim_{N\to\infty}\lambda_N(\pm T/2) = 1/2.$$


Thus in view of (2.16) we have at the endpoints

$$\lim_{N\to\infty} f_N(\pm T/2) = \lim_{N\to\infty}\int_{-T/2}^{T/2} f_s(t')\,\frac{\sin\!\left[2\pi(N+1/2)(\pm T/2 - t')/T\right]}{T\sin\!\left[\pi(\pm T/2 - t')/T\right]}\,dt' + \frac{f(t_1^+) + f(t_1^-)}{2}. \tag{2.27}$$

Since the observation points ±T/2 coincide with the integration limits, the limiting procedure following (2.9) is not directly applicable. Rather than examining the limiting form of the integral in (2.27) directly, it is more instructive to infer the limit in the present case from (2.24) and the periodicity of the Fourier series kernel. This periodicity permits us to increment the integration limits in (2.27) by an arbitrary amount, say τ, provided we replace $f_s(t)$ by its periodic extension

$$f_s^{\mathrm{ext}}(t) = \sum_{n=-\infty}^{\infty} f_s(t - nT). \tag{2.28}$$

With this extension the endpoints ±T/2 become interior points in an infinite sequence of expansion intervals . . . (τ − 3T/2, τ − T/2), (τ − T/2, τ + T/2) . . .. These intervals are all of length T and may be viewed as centered at t = τ ± nT, as may be inferred from Fig. 2.4. We note that unless $f_s(T/2) = f_s(-T/2)$ the periodic extension of the originally smooth $f_s(t)$ will have a step discontinuity at the new interior points of the amount $f_s(-T/2) - f_s(T/2)$. Thus with a suitable shift of the expansion interval and the replacement of $f_s(t')$ by $f_s^{\mathrm{ext}}(t')$ in (2.28), we can mimic the limiting process employed following (2.17) without change.

[Figure 2.4: Step discontinuity introduced by a periodic extension of f_s(t). The t axis is marked at ±T/2, ±3T/2, 5T/2 and at τ ∓ T/2.]

Carrying this out we get an identical result at each endpoint, viz., $[f_s(-T/2) + f_s(T/2)]/2$. Clearly as far as any "real" discontinuity at an interior point of the original expansion interval is concerned, say at t = t₁, its contribution to the limit is obtainable by simply adding the last term in (2.27). Hence

$$\lim_{N\to\infty} f_N(\pm T/2) = \frac{f(-T/2) + f(T/2)}{2}. \tag{2.29}$$


Of course, as in the convergence at an interior discontinuity point, the limit (2.29) gives us only part of the story, since it sidesteps the very important issue of Gibbs oscillations for finite N. A representative example of what happens when the given function assumes different values at the two endpoints is provided by the Fourier expansion of e^{−t} shown in Fig. 2.5, where the expansion interval is (0, 1) and 21 terms (N = 10) are employed. Clearly the convergence at t = 0 and t = 1 is quite poor. This should be contrasted with the plot in Fig. 2.6, which shows the expansion of e^{−|t−1/2|} over the same interval and with the same number of expansion functions.

[Figure 2.5: Fourier series approximation of e^{−t} with 21 sinusoids.]

When the discontinuity occurs in the interior of the interval, the convergence is also marred by the Gibbs oscillations, as illustrated in Fig. 2.7 for the pulse $p_{.5}(t-.5)$, again using 21 sinusoids.

Fig. 2.8 shows a stem diagram of the magnitude of the Fourier coefficients $\hat f_n$ plotted as a function of m = n + 10, n = −10, −9, . . ., 10. Such Fourier coefficients are frequently referred to as (discrete) spectral lines and are intimately related to the concept of the frequency spectrum of a signal, as will be discussed in detail in connection with the Fourier integral.

2.1.4 Delta Function Representation

The convergence properties of Fourier series can be succinctly phrased in termsof delta functions. Thus the Fourier series kernel can be formally representedby the statement

$$\lim_{N\to\infty}\frac{\sin\!\left[2\pi(N+1/2)(t-t')/T\right]}{T\sin\!\left[\pi(t-t')/T\right]} = \sum_{k=-\infty}^{\infty}\delta(t - t' - kT). \tag{2.30}$$


[Figure 2.6: Fourier series approximation of e^{−|t−1/2|} with 21 sinusoids.]

[Figure 2.7: Fourier series approximation of a pulse using 21 terms.]


[Figure 2.8: Magnitude of Fourier series coefficients for the pulse in Fig. 2.7.]

Alternatively, we can replace the kernel by the original geometric series andwrite

$$\sum_{n=-\infty}^{\infty}\left(\frac{1}{\sqrt T}e^{i2\pi nt/T}\right)\left(\frac{1}{\sqrt T}e^{i2\pi nt'/T}\right)^{*} = \frac{1}{T}\sum_{n=-\infty}^{\infty} e^{i2\pi n(t-t')/T} = \sum_{k=-\infty}^{\infty}\delta(t-t'-kT). \tag{2.31}$$

These expressions, just as the corresponding completeness statements for generalorthogonal sets discussed in 1.7.1, are to be understood as formal notationaldevices invented for efficient analytical manipulations; their exact meaning isto be understood in terms of the limiting processes discussed in the precedingsubsection.

2.1.5 The Fejer Summation Technique

The poor convergence properties exhibited by Fourier series at step discontinuities due to the Gibbs phenomenon can be ameliorated if one is willing to modify the expansion coefficients (spectral lines) by suitable weighting factors. The technique, generally referred to as "windowing," involves the multiplication of the Fourier series coefficients by a suitable (spectral) "window" and summation of the new trigonometric sum having modified coefficients. In general, the new series will not necessarily converge to the original function over the entire interval. The potential practical utility of such a scheme rests on the fact that the approximating sum may represent certain features of the given function that


are of particular interest better than the original series. This broad subject istreated in detail in books specializing in spectral estimation. Here we merelyillustrate the technique with the so-called Fejer summation approach, whereinthe modified trigonometric sum actually does converge to the original function.In fact this representation converges uniformly to the given function and thuscompletely eliminates the Gibbs phenomenon.

The Fejer [16] summation approach is based on the following result from the theory of limits. Given a sequence $f_N$ such that $\lim_{N\to\infty} f_N = f$ exists, the arithmetic average

$$\sigma_M = \frac{1}{M+1}\sum_{N=0}^{M} f_N \tag{2.32}$$

approaches the same limit as $M\to\infty$, i.e.,

$$\lim_{M\to\infty}\sigma_M = f. \tag{2.33}$$

In the present case we take $f_N = f_N(t)$, i.e., the partial Fourier series sum. Thus if this partial sum approaches f(t) as N → ∞, the preceding theorem states that $\sigma_M = \sigma_M(t)$ will also converge to f(t). Since $f_N(t)$ is just a finite sum of sinusoids we should be able to find a closed-form expression for $\sigma_M(t)$ by a geometric series summation. Thus

$$\sigma_M(t) = \frac{1}{M+1}\Big\{\hat f_0 + \big[\hat f_0 + \hat f_1 e^{i2\pi t/T} + \hat f_{-1}e^{-i2\pi t/T}\big] + \big[\hat f_0 + \hat f_1 e^{i2\pi t/T} + \hat f_2 e^{i2(2\pi t/T)} + \hat f_{-1}e^{-i2\pi t/T} + \hat f_{-2}e^{-i2(2\pi t/T)}\big] + \cdots\Big\}.$$

This can be rewritten as follows:

$$\begin{aligned}\sigma_M(t) &= \frac{1}{M+1}\Big\{(M+1)\hat f_0 + M\big(\hat f_1 e^{i2\pi t/T} + \hat f_{-1}e^{-i2\pi t/T}\big) + (M-1)\big(\hat f_2 e^{i2(2\pi t/T)} + \hat f_{-2}e^{-i2(2\pi t/T)}\big) + \cdots\Big\}\\ &= \frac{1}{M+1}\Big\{(M+1)\hat f_0 + \sum_{k=1}^{M}\hat f_k (M-k+1)e^{ik(2\pi t/T)} + \sum_{k=1}^{M}\hat f_{-k}(M-k+1)e^{-ik(2\pi t/T)}\Big\}.\end{aligned}$$

After changing the summation index from k to −k in the last sum we get

$$\sigma_M(t) = \sum_{k=-M}^{M}\hat f_k\left(1 - \frac{|k|}{M+1}\right)e^{ik(2\pi t/T)}, \tag{2.34}$$

which we now identify as the expansion of the function $\sigma_M(t)$ in terms of 2M + 1 trigonometric (exponential) functions. We note that the expansion coefficients


are obtained by multiplying the Fourier series coefficients $\hat f_k$ by the triangular spectral window

$$w_k(M) = 1 - \frac{|k|}{M+1}, \qquad k = 0, \pm 1, \pm 2, \ldots, \pm M. \tag{2.35}$$

We can view (2.34) from another perspective if we substitute the integral repre-sentation (2.3) of the partial sum fN (t) into (2.32) and carry out the summationon the Fourier series kernel (2.5). Thus after setting ξ = 2π(t − t′)/T we getthe following alternative form:

$$\begin{aligned}\sigma_M(t) &= \frac{1}{M+1}\int_{-T/2}^{T/2}\sum_{N=0}^{M}\frac{\sin[(N+1/2)\xi]}{T\sin[\xi/2]}\,f(t')\,dt'\\ &= \frac{1}{M+1}\int_{-T/2}^{T/2}\frac{f(t')\,dt'}{T\sin(\xi/2)}\sum_{N=0}^{M}\left(\frac{e^{i(N+1/2)\xi}}{2i} - \frac{e^{-i(N+1/2)\xi}}{2i}\right). \tag{2.36}\end{aligned}$$

Using the formula

$$\sum_{N=0}^{M} e^{iN\xi} = e^{i\xi M/2}\,\frac{\sin[(M+1)\xi/2]}{\sin[\xi/2]}$$

to sum the two geometric series transforms (2.36) into

$$\sigma_M(t) = \int_{-T/2}^{T/2}\frac{\sin^2[(M+1)\pi(t-t')/T]}{T(M+1)\sin^2[\pi(t-t')/T]}\,f(t')\,dt'. \tag{2.37}$$

This representation of σM (t) is very much in the spirit of (2.3). Indeed in viewof (2.33) σM (t) must converge to the same limit as the associated Fourier series.The new kernel function

$$K_M(t-t') = \frac{\sin^2[(M+1)\pi(t-t')/T]}{T(M+1)\sin^2[\pi(t-t')/T]} \tag{2.38}$$

is called the Fejer kernel and (2.34) the Fejer sum. Just like the Fourier series kernel, the Fejer kernel is periodic with period T so that by virtue of (2.33) we may write

$$\lim_{M\to\infty}\frac{\sin^2[(M+1)\pi(t-t')/T]}{T(M+1)\sin^2[\pi(t-t')/T]} = \sum_{k=-\infty}^{\infty}\delta(t-t'-kT). \tag{2.39}$$

Alternatively with the aid of limiting arguments similar to those employedin (2.11) and (2.12) one can easily verify (2.39) directly by evaluating the limitin (2.37) as M →∞.

Figure 2.9 shows the approximation achieved with the Fejer sum (2.34) (orits equivalent (2.37)) for f (t) = U (t− 0.5) with 51 sinusoids (M = 25). Alsoshown for comparison is the partial Fourier series sum for the same value of M .


Note that in the Fejer sum the Gibbs oscillations are absent but that the approximation underestimates the magnitude of the jump at the discontinuity. In effect, to achieve a good fit to the "corners" at a jump discontinuity the penalty one pays with the Fejer sum is that more terms are needed than with a Fourier sum to approximate the smooth portions of the function. To get some idea of the rate of convergence to the "corners," plots of Fejer sums for M = 10, 25, 50, and 100 are shown in Fig. 2.10, where (for t > 0.5) $\sigma_{10}(t) < \sigma_{25}(t) < \sigma_{50}(t) < \sigma_{100}(t)$.

[Figure 2.9: Comparison of Fejer and Fourier convergence; the partial Fourier sum and the Fejer sum for the unit step are both shown.]
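A minimal numerical comparison along the lines of Fig. 2.9 can be set up as follows; the closed-form coefficients of U(t − 0.5) on (0, 1) used below are an elementary evaluation assumed for this sketch:

```python
import numpy as np

# Partial Fourier sum f_N(t) versus the Fejer sum sigma_M(t) of (2.34)
# for the step U(t - 0.5) on (0, 1); the window w_k = 1 - |k|/(M+1)
# multiplies the complex Fourier coefficients.
T, M = 1.0, 25
t = np.linspace(0, 1, 2001)
k = np.arange(-M, M + 1)

# f_k = integral_{0.5}^{1} exp(-i 2 pi k t) dt (guarding the k = 0 case)
fk = np.where(k == 0, 0.5,
              (np.exp(-1j * np.pi * k) - np.exp(-2j * np.pi * k))
              / (2j * np.pi * np.where(k == 0, 1, k)))
basis = np.exp(2j * np.pi * np.outer(k, t) / T)

fourier = (fk @ basis).real
fejer = ((fk * (1 - np.abs(k) / (M + 1))) @ basis).real
print(fourier.max(), fejer.max())   # Fourier overshoots 1; Fejer stays below
```

Because the Fejer kernel (2.38) is nonnegative, the Fejer sum of a function confined to [0, 1] remains in [0, 1], which is why no overshoot can occur.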

In passing we remark that the Fejer sum (2.34) is not a partial Fourier series sum because the expansion coefficients themselves, $\hat\sigma_k = w_k(M)\hat f_k$, are functions of M. Trigonometric sums of this type are not unique. In fact by forming the arithmetic mean of the Fejer sum itself,

$$\sigma_M^{(1)}(t) = \frac{1}{M+1}\sum_{N=0}^{M}\sigma_N(t), \tag{2.40}$$

we can again avail ourselves of the limit theorem in (2.32) and (2.33) and conclude that the partial sum $\sigma_M^{(1)}(t)$ must approach f(t) in the limit of large M, i.e.,

$$\lim_{M\to\infty}\sigma_M^{(1)}(t) = f(t). \tag{2.41}$$

For any finite M we may regard $\sigma_M^{(1)}(t)$ as the second-order Fejer approximation. Upon replacing M by N in (2.34) and substituting for $\sigma_N(t)$ we can easily carry


[Figure 2.10: Convergence of the Fejer approximation for M = 10, 25, 50, and 100.]

out one of the sums and write the final result in the form

$$\sigma_M^{(1)}(t) = \sum_{k=-M}^{M}\hat f_k\, w_k^{(1)}(M)\, e^{ik(2\pi t/T)}, \tag{2.42}$$

where

$$w_k^{(1)}(M) = \frac{1}{M+1}\sum_{n=1}^{M-|k|+1}\frac{n}{|k|+n}, \qquad k = 0, \pm 1, \pm 2, \ldots, \pm M \tag{2.43}$$

is the new spectral window. We see that we no longer have the simple linear taper that obtains for the first-order Fejer approximation. Unfortunately this sum does not appear to lend itself to further simplification. A plot of (2.43) in the form of a stem diagram is shown in Fig. 2.11 for M = 12. Figure 2.12 shows plots of the first- and second-order Fejer approximations for a rectangular pulse using M = 25. We see that the second-order approximation achieves a greater degree of smoothing but underestimates the pulse amplitude significantly more than does the first-order approximation. Apparently, to reduce the amplitude error to the same level as achieved with the first-order approximation, a much larger spectral width (larger values of M) is required. This is consistent with the concave nature of the spectral taper in Fig. 2.11 which, for the same bandwidth, will tend to remove more energy from the original signal spectrum than a linear taper.
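As a consistency check on (2.43), the window can also be generated directly by averaging first-order Fejer windows, which is how (2.40) defines the second-order sum. The following sketch verifies that the two computations agree (the max(0, ·) truncation expresses that σ_N contains no harmonics beyond |k| = N):

```python
import numpy as np

M = 12
k = np.arange(-M, M + 1)

# Average of first-order windows w_k(N) = max(0, 1 - |k|/(N+1)), N = 0..M
w1_avg = np.mean([np.maximum(0.0, 1 - np.abs(k) / (N + 1))
                  for N in range(M + 1)], axis=0)

# Direct evaluation of the closed-form sum (2.43)
w1_sum = np.array([sum(n / (abs(kk) + n) for n in range(1, M - abs(kk) + 2))
                   for kk in k]) / (M + 1)

print(np.allclose(w1_avg, w1_sum))   # True
```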

Clearly higher order Fejer approximations can be generated recursively withthe formula

$$\sigma_M^{(m)}(t) = \frac{1}{M+1}\sum_{k=0}^{M}\sigma_k^{(m-1)}(t), \tag{2.44a}$$


[Figure 2.11: Second-order Fejer spectral window (M = 12).]

[Figure 2.12: First- and second-order Fejer approximations.]

wherein $\sigma_k^{(0)}(t) \equiv \sigma_k(t)$. It should be noted that Fejer approximations of all orders obey the limiting property

$$\lim_{M\to\infty}\sigma_M^{(m-1)}(t) = \frac{1}{2}\left[f(t^+) + f(t^-)\right]; \qquad m = 1, 2, 3, \ldots, \tag{2.44b}$$

i.e., at step discontinuities the partial sums converge to the arithmetic averageof the given function, just like ordinary Fourier series. The advantage of higher


order Fejer approximations is that they provide for a greater degree of smoothingin the neighborhood of step discontinuities. This is achieved at the expense ofmore expansion terms (equivalently, requiring wider bandwidths) to reach agiven level of approximation accuracy.

2.1.6 Fundamental Relationships Between the Frequency and Time Domain Representations

Parseval Formula

Once all the Fourier coefficients of a given function are known they may be used, if desired, to reconstruct the original function. In fact, the specification of the coefficients and the time interval within which the function is defined is, in principle, equivalent to the specification of the function itself. Even though the $\hat f_n$ are components of the infinite-dimensional vector

$$\hat{\mathbf f} = [\ldots \hat f_n \ldots]^T, \tag{2.45}$$

we can still interpret them as the projections of the signal f(t) along the basis functions $e^{i2\pi nt/T}$ and think of them geometrically as in Fig. 1.3. Because each $\hat f_n$ is uniquely associated with a radian frequency of oscillation $\omega_n$, with $\omega_n/2\pi = n/T$ Hz, $\hat{\mathbf f}$ is said to constitute the frequency domain representation of the signal, and the elements of $\hat{\mathbf f}$ the signal (line) spectrum. A very important relationship between the frequency domain and the time domain representations of the signal is the Parseval formula

$$\frac{1}{T}\int_{-T/2}^{T/2}|f(t)|^2\,dt = \sum_{n=-\infty}^{\infty}\left|\hat f_n\right|^2. \tag{2.46}$$

This follows as a special case of (1.305) and is a direct consequence of the LMS error in the approximation tending to zero. With $\hat{\mathbf f}' \equiv \sqrt T\,\hat{\mathbf f}$ we can rewrite (2.46) using the notation

$$(f, f) = \|\hat{\mathbf f}'\|^2, \tag{2.47}$$

which states that the norm in the frequency domain is identical to that in thetime domain. Since physically the time average on the left of (2.46) may gener-ally be interpreted as the average signal power (or some quantity proportionalto it), Parseval formula in effect states that the average power in the time andfrequency domains is preserved.
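A quick numerical check of (2.46) is straightforward; in the following sketch the coefficients of e^{−t} on (−1/2, 1/2) are approximated by a rectangle-rule quadrature, and the truncation of the sum to |n| ≤ 200 is an assumption of the illustration:

```python
import numpy as np

# Check of (2.46) for f(t) = exp(-t) on (-T/2, T/2): the average power
# in the time domain equals the sum of |f_n|^2 over the line spectrum.
T, K = 1.0, 2000
t = np.linspace(-T / 2, T / 2, K, endpoint=False)
f = np.exp(-t)

n = np.arange(-200, 201)
fn = np.array([np.mean(f * np.exp(-2j * np.pi * nn * t / T)) for nn in n])

power_time = np.mean(np.abs(f) ** 2)    # (1/T) * integral of |f|^2
power_freq = np.sum(np.abs(fn) ** 2)
print(power_time, power_freq)            # agree to a fraction of a percent
```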

Given two functions f(t) and g(t) within the interval (−T/2, T/2) with Fourier coefficients $\hat f_n$ and $\hat g_n$, it is not hard to show (Problem 2-2) that (2.46) generalizes to

$$\frac{1}{T}\int_{-T/2}^{T/2} f(t)\,g^*(t)\,dt = \sum_{n=-\infty}^{\infty}\hat f_n\,\hat g_n^{\,*}. \tag{2.48}$$


Time and Frequency Domain Convolution

An important role in linear system analysis is played by the convolution integral.From the standpoint of Fourier series this integral is of the form

$$h(t) = \frac{1}{T}\int_{-T/2}^{T/2} f(\tau)\,g(t-\tau)\,d\tau. \tag{2.49}$$

We now suppose that the Fourier series coefficients $\hat f_n$ and $\hat g_n$ of f(t) and g(t), defined within (−T/2, T/2), are known. What will be the Fourier coefficients $\hat h_m$ of h(t) when expanded in the same interval? The answer is readily obtained when we represent f(τ) by its Fourier series (2.13) and similarly g(t − τ). Thus

$$\begin{aligned} h(t) &= \frac{1}{T}\int_{-T/2}^{T/2}\sum_{n=-\infty}^{\infty}\hat f_n e^{i2\pi n\tau/T}\sum_{m=-\infty}^{\infty}\hat g_m e^{i2\pi m(t-\tau)/T}\,d\tau\\ &= \frac{1}{T}\sum_{m=-\infty}^{\infty}\hat g_m e^{i2\pi mt/T}\sum_{n=-\infty}^{\infty}\hat f_n\int_{-T/2}^{T/2}e^{i2\pi(n-m)\tau/T}\,d\tau\\ &= \frac{1}{T}\sum_{m=-\infty}^{\infty}\hat g_m e^{i2\pi mt/T}\sum_{n=-\infty}^{\infty}\hat f_n\,T\,\delta_{nm}\\ &= \sum_{m=-\infty}^{\infty}\hat g_m\hat f_m e^{i2\pi mt/T} = \sum_{m=-\infty}^{\infty}\hat h_m e^{i2\pi mt/T}, \tag{2.50}\end{aligned}$$

from which we identify $\hat h_m = \hat g_m\hat f_m$. A dual situation frequently arises when we need the Fourier coefficients of the product of two functions, e.g., $q(t) \equiv f(t)\,g(t)$. Here we can proceed similarly:

$$\begin{aligned} q(t) \equiv f(t)\,g(t) &= \sum_{n=-\infty}^{\infty}\hat f_n e^{i2\pi nt/T}\sum_{m=-\infty}^{\infty}\hat g_m e^{i2\pi mt/T}\\ &= \sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}\hat f_n\hat g_m e^{i2\pi(n+m)t/T}\\ &= \sum_{n=-\infty}^{\infty}\sum_{k=-\infty}^{\infty}\hat f_n\hat g_{k-n}\,e^{i2\pi kt/T}\\ &= \sum_{n=-\infty}^{\infty}\left(\sum_{m=-\infty}^{\infty}\hat f_m\hat g_{n-m}\right)e^{i2\pi nt/T} = \sum_{n=-\infty}^{\infty}\hat q_n e^{i2\pi nt/T}, \tag{2.51}\end{aligned}$$

where in the last step we identify the Fourier coefficient of q(t) as $\hat q_n = \sum_{m=-\infty}^{\infty}\hat f_m\hat g_{n-m}$, which is a convolution sum formed with the Fourier coefficients of the two functions.
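Both identities are conveniently checked with sampled signals, where the integral (2.49) becomes a circular convolution and the coefficients are obtained (approximately, for sufficiently dense sampling) from the FFT; the test signals below are arbitrary choices for this sketch:

```python
import numpy as np

# Numerical check of h_m = g_m * f_m from (2.50). With K uniform samples
# over one period, np.fft.fft(x)/K approximates the Fourier coefficients
# (for |n| well below K/2), and circular convolution realizes (2.49).
T, K = 1.0, 1024
t = np.arange(K) * T / K
f = np.exp(-t)                                   # any T-periodic signals do
g = np.cos(2 * np.pi * t) + 0.3 * np.sin(4 * np.pi * t)

# (1/T) * integral f(tau) g(t - tau) dtau  ->  circular convolution / K
h = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g))) / K
print(np.allclose(np.fft.fft(h) / K,
                  (np.fft.fft(f) / K) * (np.fft.fft(g) / K)))   # True
```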


Symmetries

Frequently (but not always) the signal in the time domain will be real. In that case the formula for the coefficients gives

$$\hat f_{-n} = \hat f_n^{\,*}, \tag{2.52}$$

which means that the magnitude of the line spectrum is symmetrically disposedwith respect to the index n = 0. Simplifications also arise when the signal iseither an even or an odd function with respect to t = 0. In case of an evenfunction f (t) = f (−t) we obtain

$$\hat f_n = \frac{2}{T}\int_0^{T/2} f(t)\cos(2\pi nt/T)\,dt \tag{2.53}$$

and since $\hat f_{-n} = \hat f_n$ the Fourier series reads

$$f(t) = \hat f_0 + 2\sum_{n=1}^{\infty}\hat f_n\cos(2\pi nt/T). \tag{2.54}$$

In case of an odd function f (t) = −f (−t) the coefficients simplify to

$$\hat f_n = \frac{-i2}{T}\int_0^{T/2} f(t)\sin(2\pi nt/T)\,dt \tag{2.55}$$

and since $\hat f_{-n} = -\hat f_n$ we have for the Fourier series

$$f(t) = i2\sum_{n=1}^{\infty}\hat f_n\sin(2\pi nt/T). \tag{2.56}$$

It is worth noting that (2.53-2.54) hold for complex functions in general, inde-pendent of (2.52).

2.1.7 Cosine and Sine Series

In our discussion of convergence of Fourier series we noted that whenever a function assumes unequal values at the interval endpoints its Fourier series converges at either endpoint to the arithmetic mean of the two endpoint values. An illustration of how the approximation manifests itself when finite partial sums are involved may be seen from the plot in Fig. 2.5 for an exponential function. It turns out that these pathological convergence properties can actually be eliminated by a judicious choice of the expansion interval. The approach rests on the following considerations. Suppose the function f(t) to be expanded is defined in the interval (0, T) while the nature of its periodic extension is outside the domain of the problem of interest and, consequently, at our disposal. In that case we may artificially extend the expansion interval to (−T, T) and define a function over this new interval as f(|t|), as shown in Fig. 2.13. This function is continuous


[Figure 2.13: Extension of the function for the cosine series.]

at t = 0 and moreover assumes identical values at −T and T. Hence its periodic extension is also continuous at these endpoints, which means that its Fourier series will converge uniformly throughout the closed interval [−T, T] to f(|t|) and, in particular, to the prescribed function f(t) throughout the desired range 0 ≤ t ≤ T. Of course, since f(|t|) is even with respect to t = 0, this Fourier series contains only cosine terms. However, because the expansion interval is 2T rather than T, the arguments of the expansion functions are πnt/T rather than 2πnt/T. Hence

$$\hat f_n = \frac{1}{2T}\int_{-T}^{T} f(|t|)\cos(\pi nt/T)\,dt = \frac{1}{T}\int_0^{T} f(t)\cos(\pi nt/T)\,dt. \tag{2.57}$$

The Fourier cosine series reads

$$f(t) = \hat f_0 + 2\sum_{n=1}^{\infty}\hat f_n\cos(\pi nt/T) = \sum_{n=0}^{\infty}\hat f_n^{\,c}\cos(\pi nt/T), \tag{2.58}$$

where

$$\hat f_n^{\,c} = \begin{cases}\dfrac{1}{T}\displaystyle\int_0^T f(t)\,dt; & n = 0,\\[2ex] \dfrac{2}{T}\displaystyle\int_0^T f(t)\cos(\pi nt/T)\,dt; & n > 0.\end{cases} \tag{2.59}$$

The approximation to e−t using a cosine series comprised of 10 terms is plottedin Fig. 2.14. We note a significant improvement in the approximation over thatobtained with the conventional partial Fourier series sum in Fig. 2.5, where 21terms are employed to approximate the same function.
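For this particular example the coefficients (2.59) can be evaluated in closed form, so the quality of the cosine-series fit is easy to reproduce; the integral evaluation quoted in the comments is elementary and assumed for this sketch:

```python
import numpy as np

# Cosine series (2.58)-(2.59) for f(t) = exp(-t) on (0, T). Since
#   integral_0^T e^{-t} cos(pi n t / T) dt
#       = (1 - (-1)^n e^{-T}) / (1 + (pi n / T)^2),
# the coefficients are c_n = (eps_n / T)(1 - (-1)^n e^{-T})/(1 + (pi n/T)^2),
# with eps_n the Neumann symbol (1 for n = 0, 2 otherwise).
T, N = 1.0, 10
t = np.linspace(0, T, 501)

n = np.arange(N + 1)
eps = np.where(n == 0, 1.0, 2.0)
cn = eps / T * (1 - (-1.0) ** n * np.exp(-T)) / (1 + (np.pi * n / T) ** 2)
approx = cn @ np.cos(np.pi * np.outer(n, t) / T)

print(np.max(np.abs(approx - np.exp(-t))))   # small, even at the endpoints
```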

It should be noted that the coefficients of the cosine series (2.59) are nothing more than the solution to the normal equations for the LMS problem phrased in terms of the cosine functions

$$\phi_n^c(t) = \cos(\pi nt/T), \qquad n = 0, 1, 2, \ldots \tag{2.60}$$


[Figure 2.14: Cosine series approximation of e^{−t} (N = 10).]

As may be verified directly, they are orthogonal over the interval (0, T). In our compact notation this reads

$$(\phi_n^c, \phi_m^c) = (T/\varepsilon_n)\,\delta_{nm},$$

where we have introduced the abbreviation

$$\varepsilon_n = \begin{cases}1; & n = 0,\\ 2; & n > 0,\end{cases}$$

which is usually referred to as the Neumann symbol.

The convergence properties of the cosine series at points of continuity and at jump discontinuities within the interval are identical to those of the complete Fourier series from which, after all, the cosine series may be derived. The cosine expansion functions form a complete set in the space of piecewise differentiable functions whose derivatives must vanish at the interval endpoints. This additional restriction arises because of the vanishing of the derivative of cos(πnt/T) at t = 0 and t = T. In accordance with (1.303), the formal statement of completeness may be phrased in terms of an infinite series of products of the orthonormal expansion functions $\sqrt{\varepsilon_n/T}\,\phi_n^c(t)$ as follows:

$$\delta(t-t') = \sum_{n=0}^{\infty}\sqrt{\frac{\varepsilon_n}{T}}\cos(\pi nt/T)\,\sqrt{\frac{\varepsilon_n}{T}}\cos(\pi nt'/T). \tag{2.61}$$

Sine Series

If instead of an even extension of f(t) into the interval (−T, 0) as in Fig. 2.13 we employ an odd extension, as in Fig. 2.15, and expand the function f(|t|) sign(t) in a Fourier series within the interval (−T, T), we find that the cosine terms


[Figure 2.15: Function extension for the sine series.]

vanish and the resulting Fourier series is comprised entirely of sines. Within the original interval (0, T) it converges to the prescribed function f(t) and constitutes the so-called sine series expansion, to wit,

$$f(t) = \sum_{n=0}^{\infty}\hat f_n^{\,s}\sin(\pi nt/T), \tag{2.62}$$

where

$$\hat f_n^{\,s} = \frac{2}{T}\int_0^T f(t)\sin(\pi nt/T)\,dt. \tag{2.63}$$

Evidently because the sine functions vanish at the interval endpoints the sineseries will necessarily converge to zero there. Since at a discontinuity a Fourierseries always converges to the arithmetic mean of the left and right endpointvalues, we see from Fig. 2.15 that the convergence of the sine series to zero atthe endpoints does not require that the prescribed function also vanishes there.Of course, if this is not the case, only LMS convergence is guaranteed at theendpoints and an approximation by a finite number of terms will be vitiatedby the Gibbs effect. A representative illustration of the expected convergencebehavior in such cases can be had by referring to Fig. 2.5. For this reason thesine series is to be used only with functions that vanish at the interval endpoints.In such cases convergence properties very similar to those of cosine series areachieved. A case in point is the approximation shown in Fig. 2.6.

The sine expansion functions

$$\phi_n^s(t) = \sin(\pi nt/T), \qquad n = 1, 2, 3, \ldots \tag{2.64}$$

possess the orthogonality properties

$$(\phi_n^s, \phi_m^s) = (T/2)\,\delta_{nm}; \tag{2.65}$$


they form a complete set in the space of piecewise differentiable functions thatvanish at the interval endpoints. Again the formal statement of this complete-ness may be summarized by the delta function representation

$$\delta(t-t') = \sum_{n=0}^{\infty}\sqrt{\frac{2}{T}}\sin(\pi nt/T)\,\sqrt{\frac{2}{T}}\sin(\pi nt'/T). \tag{2.66}$$

2.1.8 Interpolation with Sinusoids

Interpolation Using Exponential Functions

Suppose f (t) can be represented exactly by the sum

$$f(t) = \sum_{n=-N}^{N} c_n\, e^{i\frac{2\pi nt}{T}}; \qquad 0 \le t \le T. \tag{2.67}$$

If f(t) is specified at M = 2N + 1 points within the given interval, (2.67) can be viewed as a system of M linear equations for the M unknown coefficients c_n. A particularly simple formula for the coefficients results if we suppose that the function is specified on uniformly spaced points within the interval. To derive it we first change the summation index in (2.67) from n to m = N + n to obtain

$$f(t) = \sum_{m=0}^{2N} c_{m-N}\, e^{i\frac{2\pi(m-N)t}{T}}. \tag{2.68}$$

With $t = \ell\Delta t$ and $\Delta t = T/M$, (2.68) becomes

$$f(\ell\Delta t) = \sum_{m=0}^{M-1} c_{m-N}\, e^{i\frac{2\pi(m-N)\ell}{M}}. \tag{2.69}$$

From the geometric series $\sum_{m=0}^{M-1} e^{im\alpha} = e^{i\alpha(M-1)/2}\sin(M\alpha/2)/\sin(\alpha/2)$ we readily establish the orthogonality relationship

$$\sum_{\ell=0}^{M-1} e^{i\frac{2\pi\ell(m-k)}{M}} = M\,\delta_{mk}. \tag{2.70}$$

Upon multiplying both sides of (2.69) by $e^{-i2\pi\ell k/M}$, summing on $\ell$, and using (2.70) we obtain the solution for the coefficients

$$c_{m-N} = \frac{1}{M}\sum_{\ell=0}^{M-1} f(\ell\Delta t)\, e^{-i\frac{2\pi\ell(m-N)}{M}}. \tag{2.71}$$

Reverting to the index n and M = 2N + 1 the preceding is equivalent to

$$c_n = \frac{1}{2N+1}\sum_{\ell=0}^{2N} f(\ell\Delta t)\, e^{-i\frac{2\pi\ell n}{2N+1}}. \tag{2.72}$$


On the other hand we know that the solution for $c_n$ in (2.67) is also given by the integral

$$c_n = \frac{1}{T}\int_0^T f(t)\, e^{-i\frac{2\pi nt}{T}}\,dt. \tag{2.73}$$

If in (2.72) we replace 1/ (2N + 1) by its equivalent Δt/T , we can interpret (2.71)as a Riemann sum approximation to (2.73). However we know from the fore-going that (2.72) is in fact an exact solution of (2.69). Thus whenever f (t)is comprised of a finite number of sinusoids the Riemann sum will representthe integral (2.73) exactly provided 2N+1 is chosen equal to or greater than thenumber of sinusoids. Evidently, if the number of sinusoids is exactly 2N+1, thecn as computed using either (2.73) or (2.72) must be identically zero whenever|n| > N. If f (t) is a general piecewise differentiable function, then (2.67) withthe coefficients determined by (2.72) provides an interpolation to f (t) in termsof sinusoids. In fact by substituting (2.72) into (2.67) and again summing ageometric series we obtain the following explicit interpolation formula:

$$f(t) = \sum_{\ell=0}^{M-1} f(\ell\Delta t)\,\frac{\sin\!\left[\pi\!\left(\frac{t}{\Delta t}-\ell\right)\right]}{M\sin\!\left[\frac{\pi}{M}\!\left(\frac{t}{\Delta t}-\ell\right)\right]}. \tag{2.74}$$

Unlike the LMS approximation problem underlying the classical Fourier series, the determination of the coefficients in the interpolation problem does not require the evaluation of integrals. This in itself is of considerable computational advantage. How do interpolation-type approximations compare with LMS approximations? Figure 2.16 shows the interpolation of e^{−t} achieved with 11 sinusoids while Fig. 2.17 shows the approximation with the same number of sinusoids using the LMS approximation. We note that the fit is comparable in the two cases except at the endpoints where, as we know, the LMS approximation necessarily converges to $(1 + e^{-1})/2$. As the number of terms in the interpolation is increased the fit within the interval improves. Nevertheless, the interpolated function continues to show considerable undamped oscillatory behavior near the endpoints, as shown by the plot in Fig. 2.18.
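The computational simplicity of the scheme is evident in a direct implementation of (2.72) and (2.67); the test function e^{−t} matches Figs. 2.16 and 2.18:

```python
import numpy as np

# Trigonometric interpolation: coefficients (2.72) from M = 2N+1 equally
# spaced samples, then evaluation of the sum (2.67) on a fine grid.
T, N = 1.0, 5
M = 2 * N + 1
dt = T / M
l = np.arange(M)
samples = np.exp(-l * dt)                   # f(l * dt) with f(t) = e^{-t}

n = np.arange(-N, N + 1)
c = np.exp(-2j * np.pi * np.outer(n, l) / M) @ samples / M   # (2.72)

t = np.linspace(0, T, 400, endpoint=False)
interp = np.real(c @ np.exp(2j * np.pi * np.outer(n, t) / T))
print(np.max(np.abs(interp - np.exp(-t))))  # largest near the endpoints

# The interpolant reproduces the samples exactly, by the orthogonality (2.70):
recon = np.real(c @ np.exp(2j * np.pi * np.outer(n, l * dt) / T))
print(np.allclose(recon, samples))          # True
```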

Interpolation Using Cosine Functions

Recalling the improvement in the LMS approximation achieved with the co-sine series over the complete Fourier expansion, we might expect a similar im-provement in case of interpolation. This turns out actually to be the case.As will be demonstrated, the oscillatory behavior near the endpoints in Fig. 2.18can be completely eliminated and a substantially better fit to the prescribedfunction achieved throughout the entire approximating interval using an alter-native interpolation that employs only cosine functions, i.e., an interpolationformula based on (2.58) rather than (2.67). In this case we set the interpolationinterval to

$$\Delta t = T/(M - 1/2) \tag{2.75}$$


[Figure 2.16: Interpolation of e^{−t} using 11 sinusoids.]

[Figure 2.17: LMS approximation to e^{−t} using 11 sinusoids.]

and with t = mΔt in (2.58) we obtain

$$f(m\Delta t) = \sum_{n=0}^{M-1} c_n^c\cos[\pi nm/(M-1/2)]; \qquad m = 0, 1, 2, \ldots, M-1, \tag{2.76}$$


[Figure 2.18: Interpolation of e^{−t} using 101 sinusoids.]

where the $c_n^c$ are the unknown coefficients. The solution for the $c_n^c$ is made somewhat easier if one first extends the definition of f(mΔt) to negative indices as in Fig. 2.13 and rewrites (2.76) in terms of complex exponentials. Thus

$$f(m\Delta t) = \sum_{n=-(M-1)}^{M-1} c_n^{\prime c}\, e^{i2\pi nm/(2M-1)}; \qquad m = 0, \pm 1, \pm 2, \ldots, \pm(M-1), \tag{2.77}$$

where in addition to f(mΔt) = f(−mΔt) we postulated that $c_n^c = c_{-n}^c$ and defined

$$c_n^{\prime c} = \begin{cases} c_0^c; & n = 0,\\ c_n^c/2; & n \ne 0.\end{cases} \tag{2.78}$$

Again using the geometric series sum formula we have the orthogonality

$$\sum_{n=-(M-1)}^{M-1} e^{i2\pi n(m-k)/(2M-1)} = \frac{\sin[\pi(m-k)]}{\sin[\pi(m-k)/(2M-1)]} \equiv (2M-1)\,\delta_{mk}, \tag{2.79}$$

with the aid of which the solution for the $c_n^{\prime c}$ in (2.78) follows at once:

$$\begin{aligned} c_n^{\prime c} &= \frac{1}{2M-1}\sum_{m=-(M-1)}^{M-1} f(m\Delta t)\, e^{-i2\pi nm/(2M-1)}\\ &= \frac{1}{2M-1}\sum_{m=0}^{M-1}\varepsilon_m\, f(m\Delta t)\cos[2\pi nm/(2M-1)]\\ &= \frac{1}{M-1/2}\sum_{m=0}^{M-1}(\varepsilon_m/2)\, f(m\Delta t)\cos[\pi nm/(M-1/2)].\end{aligned}$$

Taking account of (2.78) we obtain the final result

$$c_n^c = \frac{2}{M-1/2}\sum_{m=0}^{M-1}(\varepsilon_n\varepsilon_m/4)\, f(m\Delta t)\cos[\pi nm/(M-1/2)]; \qquad n = 0, 1, 2, \ldots, M-1. \tag{2.80}$$

The final interpolation formula now follows through a direct substitution of(2.80) into

$$f(t) = \sum_{n=0}^{M-1} c_n^c\cos(\pi nt/T). \tag{2.81}$$

After summation over n we obtain

$$f(t) = \frac{1}{M-1/2}\sum_{m=0}^{M-1}(\varepsilon_m/2)\, f(m\Delta t)\left\{1 + k_M(t/\Delta t - m) + k_M(t/\Delta t + m)\right\}, \tag{2.82}$$

where

$$k_M(t) = \cos\!\left(\frac{\pi M}{2M-1}\,t\right)\frac{\sin\!\left[\frac{\pi(M-1)}{2(M-1/2)}\,t\right]}{\sin\!\left[\frac{\pi}{2(M-1/2)}\,t\right]}. \tag{2.83}$$

Fig. 2.19 shows the interpolation of e^{−t} using 11 cosine functions. The improvement over the interpolation wherein both sines and cosines were employed, Fig. 2.16, is definitely noticeable. A more important issue with general sinusoids is the crowding of the oscillations toward the interval endpoints, as in Fig. 2.18. With the cosine interpolation these oscillations are completely eliminated, as may be seen from the plot in Fig. 2.20.

By choosing different distributions of the locations and sizes of the interpo-lation intervals the interpolation properties can be tailored to specific classes offunctions. Of course, a nonuniform distribution of interpolation intervals willin general not lead to analytically tractable forms of expansion coefficients andwill require a numerical matrix inversion. We shall not deal with nonuniformdistribution of intervals. There is, however, a slightly different way of specify-ing a uniform distribution of interpolation intervals from the one we have justconsidered which is worth mentioning since it leads to formulas for the so-calleddiscrete cosine transform commonly employed in data and image compressionwork. Using the seemingly innocuous modification of (2.75) to

$$\Delta t = \frac{T}{2M} \tag{2.84}$$


[Figure 2.19: Interpolation of e^{−t} with 11 cosine functions.]

[Figure 2.20: Interpolation of e^{−t} with 101 cosine functions.]

and forcing the first and the last step size to equal Δt/2 we replace (2.76) by

$$f[\Delta t(m+1/2)] = \sum_{n=0}^{M-1} c_n^c\cos[\pi n(2m+1)/2M]; \qquad m = 0, 1, 2, \ldots, M-1. \tag{2.85}$$


With the aid of the geometrical sum formula we can readily verify the orthogo-nality relationship

$$\sum_{m=0}^{M-1}\cos[\pi n(2m+1)/2M]\cos[\pi k(2m+1)/2M] = \frac{M}{\varepsilon_n}\,\delta_{nk}, \tag{2.86}$$

with the aid of which we solve for the coefficients in (2.85):

$$c_n^c = \frac{\varepsilon_n}{M}\sum_{m=0}^{M-1} f[\Delta t(m+1/2)]\cos[\pi n(2m+1)/2M]. \tag{2.87}$$

Replacing the $c_n^c$ in (2.81) by the $c_n^c$ of (2.87) yields the interpolation formula

$$f(t) = \frac{1}{M}\sum_{m=0}^{M-1} f[\Delta t(m+1/2)]\left[1 + k_M(\tau^+) + k_M(\tau^-)\right], \tag{2.88}$$

where

$$\tau^+ = t/\Delta t - (m+1/2), \tag{2.88a}$$
$$\tau^- = t/\Delta t + (m+1/2), \tag{2.88b}$$

and

$$k_M(t) = \cos(\pi t/2)\,\frac{\sin\!\left(\frac{\pi(M-1)}{2M}\,t\right)}{\sin\!\left(\frac{\pi}{2M}\,t\right)}. \tag{2.89}$$

Equation (2.85) together with (2.87) is usually referred to as the discrete cosinetransform pair. Here we have obtained it as a by-product along our route towarda particular interpolation formula comprised of cosine functions.
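A direct implementation of the pair makes the exact invertibility implied by the orthogonality (2.86) apparent; the test signal is an arbitrary choice, and up to normalization the coefficient formula (2.87) is what is commonly called the DCT-II:

```python
import numpy as np

# Discrete cosine transform pair (2.85)/(2.87) on the half-sample grid
# t_m = dt*(m + 1/2), dt = T/(2M).
T, M = 1.0, 16
m = np.arange(M)
n = np.arange(M)
samples = np.exp(-(m + 0.5) * T / (2 * M))       # f at the half-sample grid

C = np.cos(np.pi * np.outer(n, 2 * m + 1) / (2 * M))   # rows n, columns m
eps = np.where(n == 0, 1.0, 2.0)                 # Neumann symbol
coeff = eps / M * (C @ samples)                  # analysis step (2.87)
recon = coeff @ C                                # synthesis step (2.85)

print(np.allclose(recon, samples))               # True: exact inversion
```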

2.1.9 Anharmonic Fourier Series

Suppose we approximate the signal f(t) in the LMS sense by a sum of sinusoidswith radian frequencies μ1, μ2, . . . μN which are not necessarily harmonicallyrelated. Assuming the signal is specified in the interval a ≤ t ≤ b we write thisapproximating sum as follows:

$$f(t) \sim \sum_{n=1}^{N}\hat f_n\,\psi_n(t), \tag{2.90}$$

wherein

$$\psi_n(t) = A_n\sin\mu_n t + B_n\cos\mu_n t \tag{2.91}$$

and An and Bn are suitable normalization constants. It is not hard to showthat as long as all the μn are distinct the Gram matrix Γnm = (ψn, ψm) isnonsingular so that the normal equations yield a unique set of expansion co-efficients fn. Of course their computation would be significantly simplified if


it were possible to choose a set of radian frequencies μ_n such that the Gram matrix is diagonal, or, equivalently, that the ψ_n are orthogonal over the chosen interval. We know that this is always the case for harmonically related radian frequencies. It turns out that orthogonality also obtains when the radian frequencies are not harmonically related provided they are chosen such that for a given pair of real constants α and β the ψ_n(t) satisfy the following endpoint conditions:

$$\psi_n(a) = \alpha\,\psi_n'(a), \tag{2.92a}$$
$$\psi_n(b) = \beta\,\psi_n'(b). \tag{2.92b}$$

To prove orthogonality we first observe that the ψn (t) satisfy the differentialequation of the harmonic oscillator, i.e.,

$$\frac{d^2\psi_n}{dt^2} + \mu_n^2\psi_n = 0, \tag{2.93}$$

where we may regard $\psi_n$ as an eigenvector and $\mu_n^2$ as the eigenvalue of the differential operator $-d^2/dt^2$. Next we multiply (2.93) by $\psi_m$ and integrate the result over a ≤ t ≤ b to obtain

$$\psi_m\frac{d\psi_n}{dt}\Big|_a^b - \int_a^b\frac{d\psi_m}{dt}\frac{d\psi_n}{dt}\,dt + \mu_n^2\int_a^b\psi_m\psi_n\,dt = 0, \tag{2.94}$$

where the second derivative has been eliminated by an integration by parts. Aninterchange of indices in (2.94) gives

$$\psi_n\frac{d\psi_m}{dt}\Big|_a^b - \int_a^b\frac{d\psi_n}{dt}\frac{d\psi_m}{dt}\,dt + \mu_m^2\int_a^b\psi_n\psi_m\,dt = 0, \tag{2.95}$$

and subtraction of (2.95) from (2.94) yields

$$\psi_m\frac{d\psi_n}{dt}\Big|_a^b - \psi_n\frac{d\psi_m}{dt}\Big|_a^b = \left(\mu_n^2 - \mu_m^2\right)\int_a^b\psi_n\psi_m\,dt. \tag{2.96}$$

We now observe that substitution of the endpoint conditions (2.92) into the left side of (2.96) yields zero. This implies orthogonality provided we assume that for n ≠ m the eigenvalues μ_m and μ_n are distinct. For then

$$\int_a^b\psi_n\psi_m\,dt = 0; \qquad n \ne m. \tag{2.97}$$

The fact that the eigenvalues $\mu_n^2$ are distinct follows from a direct calculation. To compute the eigenvalues we first substitute (2.91) into (2.92), which yields the following set of homogeneous algebraic equations:

$$(\sin\mu_n a - \alpha\mu_n\cos\mu_n a)A_n + (\cos\mu_n a + \alpha\mu_n\sin\mu_n a)B_n = 0, \tag{2.98a}$$
$$(\sin\mu_n b - \beta\mu_n\cos\mu_n b)A_n + (\cos\mu_n b + \beta\mu_n\sin\mu_n b)B_n = 0. \tag{2.98b}$$


[Figure 2.21: Graphical solution of the transcendental equation −0.2x = tan x; the first three roots μ₁b, μ₂b, μ₃b lie at the intersections of the straight line with the branches of tan x.]

A nontrivial solution for $A_n$ and $B_n$ is only possible if the determinant of the coefficients vanishes. Computing this determinant and setting the result to zero yield the following equation for $\mu_n$:

$$(\beta-\alpha)\mu_n\cos[\mu_n(b-a)] - (1+\alpha\beta\mu_n^2)\sin[\mu_n(b-a)] = 0. \tag{2.99}$$

This transcendental equation possesses an infinite set of distinct positive simple zeros μ_n. For an arbitrary set of parameters these roots can only be determined numerically. Many standard root finding algorithms are available for this purpose. Generally these are iterative techniques that require a "good" first guess of the root. In case of (2.99) an approximate location to start the iteration can be obtained from a graphical construction. We illustrate it for α = 0 and a = 0, in which case (2.99) becomes $\beta\mu_n\cos\mu_n b - \sin\mu_n b = 0$, which is equivalent to

$$\beta\mu_n = \tan(\mu_n b). \tag{2.100}$$

Defining the nondimensional variable x = μ_n b in (2.100), we obtain the roots from the intersection of the straight line y = xβ/b with the curves defined by the various branches of y = tan x, as shown in Fig. 2.21 for β/b = −0.2. The first three roots, expressed in terms of the nondimensional quantities μ₁b, μ₂b, and μ₃b, may be read off the abscissa. When α = 0 and a = 0, (2.98a) requires that B_n = 0, so that the expansion functions that correspond to the solutions of (2.100) are

$$\psi_n(t) = A_n\sin\mu_n t. \tag{2.101}$$

Setting An = 1 we compute the normalization constant

$$Q_n = \int_0^b\sin^2\mu_n t\,dt = \int_0^b\frac{1-\cos 2\mu_n t}{2}\,dt = \frac{b}{2} - \frac{\sin 2\mu_n b}{4\mu_n} = \frac{b}{2}\left(1 - \frac{\sin\mu_n b\cos\mu_n b}{\mu_n b}\right).$$

The expansion coefficients in (2.90) for this case are

$$\hat f_n = \frac{2}{b}\left(1 - \frac{\sin\mu_n b\cos\mu_n b}{\mu_n b}\right)^{-1}\int_0^b f(t)\sin\mu_n t\,dt. \tag{2.102}$$

From Fig. 2.21 we note that as n increases the abscissas of the points where the straight line intersects the tangent curves approach (π/2)(2n − 1) ≈ nπ. Hence for large n the radian frequencies of the anharmonic expansion (2.90) are asymptotically harmonic, i.e.,

$$\mu_n \underset{n\to\infty}{\sim} n\pi/b. \tag{2.103}$$

Taking account of (2.103) in (2.102) we also observe that for large n formula (2.102) represents the expansion coefficient of a sine Fourier series (2.63). Thus the anharmonic character of the expansion appears to manifest itself only for a finite number of terms. Hence we would expect the convergence properties of anharmonic expansions to be essentially the same as those of harmonic Fourier series.
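For the case α = 0, a = 0 illustrated in Fig. 2.21, the roots of (2.100) are easily computed by bracketing each branch of tan x between successive asymptotes; the following sketch uses SciPy's brentq root finder with β/b = −0.2, as in the figure:

```python
import numpy as np
from scipy.optimize import brentq

# Roots of (2.100) in nondimensional form: (beta/b) x = tan(x), x = mu_n * b.
# Each branch of tan x between asymptotes (2n-1)pi/2 and (2n+1)pi/2
# contributes exactly one root.
slope = -0.2
g = lambda x: np.tan(x) - slope * x

roots = []
for nn in range(1, 6):
    lo = (2 * nn - 1) * np.pi / 2 + 1e-9    # just past the asymptote
    hi = (2 * nn + 1) * np.pi / 2 - 1e-9    # just before the next one
    roots.append(brentq(g, lo, hi))

print(np.array(roots))
# For growing n the roots crowd toward the asymptotes (2n-1)pi/2,
# so the spacing approaches pi, consistent with (2.103).
```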

An anharmonic series may be taken as a generalization of a Fourier series.For example, it reduces to the (harmonic) sine series in (2.62) when α = β = 0and when α = β →∞ to the (harmonic) cosine series (2.58), provided f (a) = 0and f (b) = 0. When the endpoint conditions (2.92) are replaced by a periodicitycondition we obtain the standard Fourier series.

2.2 The Fourier Integral

2.2.1 LMS Approximation by Sinusoids Spanning a Continuum

Instead of approximating f (t) by a sum of 2N + 1 sinusoids with discrete fre-quencies ωn = 2πn/T we now suppose that the frequencies ω span a continuumbetween −Ω and Ω. With

$$f_\Omega(t) = \int_{-\Omega}^{\Omega}\hat f(\omega)\, e^{i\omega t}\,d\omega \tag{2.104}$$

we seek a function $\hat f(\omega)$ such that the MS error

$$\varepsilon_\Omega(T) \equiv \int_{-T/2}^{T/2}\left|f(t) - f_\Omega(t)\right|^2 dt \tag{2.105}$$

is minimized. As we know, this minimization leads to the normal equation(1.99), where we identify φ (ω, t) = eiωt, a = −T/2, b = T/2, so that with theaid of (1.100) we obtain

$$\int_{-T/2}^{T/2} f(t)\, e^{-i\omega t}\,dt = \int_{-\Omega}^{\Omega}\hat f(\omega')\,\frac{2\sin[(\omega-\omega')T/2]}{\omega-\omega'}\,d\omega'. \tag{2.106}$$


Thus unlike in the case of a discrete set of sinusoids the unknown "coefficients" $\hat f(\omega')$ now span a continuum. In fact, according to (2.106), to find $\hat f(\omega')$ we must solve an integral equation.

2.2.2 Transition to an Infinite Observation Interval: The Fourier Transform

For any finite time interval T the solution of (2.106) for $\hat f(\omega')$ can be expressed in terms of spheroidal functions [23]. Here we confine our attention to the case of an infinite time interval, which is the conventional domain of the Fourier integral. In that case we can employ the limiting form of the Fourier kernel in (1.269) (with Ω replaced by T/2) so that the right side of (2.106) becomes

$$\lim_{T\to\infty}\int_{-\Omega}^{\Omega}\hat f(\omega')\,\frac{2\sin[(\omega-\omega')T/2]}{\omega-\omega'}\,d\omega' = 2\pi\hat f(\omega). \tag{2.107}$$

Hence as the expansion interval in the time domain is allowed to approachinfinity the solution of (2.106) reads

$$\int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\,dt = F(\omega), \tag{2.108}$$

where we have set $F(\omega) = 2\pi\hat f(\omega)$, which shall be referred to as the Fourier integral (or the Fourier transform) of f(t). Substituting this in (2.104) and integrating with respect to ω we get

$$f_\Omega(t) = \int_{-\infty}^{\infty} f(t')\,\frac{\sin[(t-t')\Omega]}{\pi(t-t')}\,dt'. \tag{2.109}$$

The corresponding LMS error εΩmin is

$$\varepsilon_{\Omega\min} = \left(f - f_\Omega,\, f - f_\Omega\right) = (f, f) - \left(f, f_\Omega\right) \ge 0, \tag{2.110}$$

where the inner products are taken over the infinite time domain and account hasbeen taken of the projection theorem (1.75). Substituting for fΩ from (2.104)the preceding is equivalent to

$$\begin{aligned}\varepsilon_{\Omega\min} &= \int_{-\infty}^{\infty}|f(t)|^2\,dt - \int_{-\infty}^{\infty} f^*(t)\,dt\int_{-\Omega}^{\Omega}\hat f(\omega)\,e^{i\omega t}\,d\omega\\ &= \int_{-\infty}^{\infty}|f(t)|^2\,dt - 2\pi\int_{-\Omega}^{\Omega}\left|\hat f(\omega)\right|^2 d\omega\\ &= \int_{-\infty}^{\infty}|f(t)|^2\,dt - \frac{1}{2\pi}\int_{-\Omega}^{\Omega}|F(\omega)|^2\,d\omega \ge 0, \tag{2.111}\end{aligned}$$

which is the Bessel inequality for the Fourier transform. As Ω −→ ∞ theintegrand in (2.109) approaches a delta function and in accordance with (1.285)


we have

$$\lim_{\Omega\to\infty} f_\Omega(t) = \frac{1}{2}\left[f(t^+) + f(t^-)\right] \tag{2.112}$$

or, equivalently, using (2.104) with $F(\omega) = 2\pi\hat f(\omega)$,

$$\lim_{\Omega\to\infty}\frac{1}{2\pi}\int_{-\Omega}^{\Omega} F(\omega)\,e^{i\omega t}\,d\omega = \frac{1}{2}\left[f(t^+) + f(t^-)\right]. \tag{2.113}$$

At the same time the MS error in (2.111) approaches zero and we obtain

$$\int_{-\infty}^{\infty}|f(t)|^2\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}|F(\omega)|^2\,d\omega, \tag{2.114}$$

which is the Parseval theorem for the Fourier transform. Equation (2.113) is usually written in the abbreviated form

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega \tag{2.115}$$

and is referred to as the inverse Fourier transform or the Fourier transforminversion formula. It will be frequently convenient to designate both (2.115)and the direct transform (2.108) by the concise statement

$$f(t) \stackrel{\mathcal F}{\Longleftrightarrow} F(\omega). \tag{2.116}$$

In addition, we shall at times find it useful to express the direct and inversetransform pair as

$$\mathcal F\{f(t)\} = F(\omega), \tag{2.117}$$

which is just an abbreviation of the statement “the Fourier transform of f (t) isF (ω).” We shall adhere to the convention of designating the time domain signalby a lowercase letter and its Fourier transform by the corresponding uppercaseletter.

2.2.3 Completeness Relationship and Relation to Fourier Series

Proceeding in a purely formal way we replace F (ω) in (2.115) by (2.108) andinterchange the order of integration and obtain

$$f(t) = \int_{-\infty}^{\infty} f(t')\left\{\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t-t')}\,d\omega\right\}dt'. \tag{2.118}$$

The quantity in braces can now be identified as the delta function

$$\delta(t-t') = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t-t')}\,d\omega, \tag{2.119}$$


which is a slightly disguised version of (1.254). To see this we merely have torewrite (2.119) as the limiting form

$$\lim_{\Omega\to\infty}\frac{1}{2\pi}\int_{-\Omega}^{\Omega} e^{i\omega(t-t')}\,d\omega$$

and note that for any finite Ω the integration yields $\sin[\Omega(t-t')]/\pi(t-t')$.

The representation (2.119) bears a formal resemblance to the completeness relationship for orthonormal discrete function sets, (1.302), and, more directly, to the completeness statement for Fourier series in (2.31). This resemblance can be highlighted by rewriting (2.119) to read

$$\delta(t-t') = \int_{-\infty}^{\infty}\left(\frac{1}{\sqrt{2\pi}}e^{i\omega t}\right)\left(\frac{1}{\sqrt{2\pi}}e^{i\omega t'}\right)^{*}d\omega, \tag{2.120}$$

so that a comparison with (2.31) shows that the functions $\phi_\omega(t) \equiv (1/\sqrt{2\pi})\exp(i\omega t)$ play a role analogous to that of the orthonormal functions $\phi_n(t) \equiv (1/\sqrt T)\exp(2\pi int/T)$ provided we view the continuous variable ω in (2.120) as proportional to a summation index. In fact a direct comparison of the variables between (2.31) and (2.120) gives the correspondence

$$\omega \longleftrightarrow \frac{2\pi n}{T}, \tag{2.121a}$$
$$d\omega \longleftrightarrow \frac{2\pi}{T}. \tag{2.121b}$$

Thus as the observation period T of the signal increases, the quantity 2π/T may be thought of as approaching the differential dω while the discrete spectral lines occurring at 2πn/T merge into a continuum corresponding to the frequency variable ω. Moreover the orthogonality over the finite interval (−T/2, T/2), as in (1.213), becomes in the limit as T → ∞

$$\delta(\omega-\omega') = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{it(\omega-\omega')}\,dt = \int_{-\infty}^{\infty}\left(\frac{1}{\sqrt{2\pi}}e^{it\omega}\right)\left(\frac{1}{\sqrt{2\pi}}e^{it\omega'}\right)^{*}dt, \tag{2.122}$$

i.e., the identity matrix represented by the Kronecker symbol δmn goes over intoa delta function, which is the proper identity transformation for the continuum.

A more direct but qualitative connection between the Fourier series and theFourier transform can be established if we suppose that the function f (t) isinitially truncated to |t| < T/2 in which case its Fourier transform is

$$F(\omega) = \int_{-T/2}^{T/2} f(t)\, e^{-i\omega t}\,dt. \tag{2.123}$$

The coefficients in the Fourier series that represents this function within the interval (−T/2, T/2) can now be expressed as $\hat f_n = F(2\pi n/T)/T$, so that

$$f(t) = \sum_{n=-\infty}^{\infty} F(2\pi n/T)\, e^{i2\pi nt/T}\left(\frac{1}{T}\right). \tag{2.124}$$


Thus in view of (2.121) we can regard the Fourier transform inversion formula (2.115) as a limiting form of (2.124) as T → ∞.

[Figure 2.22: Continuous and discrete spectra (amplitude plotted against ωτ).]

Figure 2.22 shows the close correspondence between the discrete spectrum defined by the Fourier series coefficients and the continuous spectrum represented by the Fourier transform. The time domain signal is the exponential $\exp(-2|t/\tau|)$. For the discrete spectrum the time interval is truncated to $-T/2 \le t \le T/2$ (with $T/2\tau = 2$) and the Fourier series coefficients $\left|(T/\tau)\hat f_n\right|$ (stem diagram) plotted as a function of $2\pi n\tau/T$. Superposed for comparison is the continuous spectrum represented by $4/[4+(\omega\tau)^2]$, the Fourier transform of $(1/\tau)\exp(-2|t/\tau|)$.

2.2.4 Convergence and the Use of CPV Integrals

The convergence properties of the Fourier integral are governed by the deltafunction kernel (2.109). In many respects they are qualitatively quite simi-lar to the convergence properties of Fourier series kernel (2.5). For example,as we shall show explicitly in 2.2.7, the convergence at points of discontinu-ity is again accompanied by the Gibbs oscillatory behavior. The one conver-gence issue that does not arise with Fourier series, but is unavoidable with theFourier Integral, relates to the behavior of the functions at infinity, a problemwe had already dealt with in Chap. 1 in order to arrive at the limit state-ment (1.269). There we found that it was sufficient to require that f (t) satisfy(1.266) which, in particular, is satisfied by square integrable functions (Prob-lem 1-18). Unfortunately this constraint does not apply to several idealizedsignals that have been found to be of great value in simplifying system analysis.To accommodate these signals, the convergence of the Fourier transform has to


be examined on a case-by-case basis. In certain cases this requires a special def-inition of the limiting process underlying the improper integrals that define theFourier transform and its inverse. In the following we provide a brief accountof this limiting process.

An improper integral of the form $\int_{-\infty}^{\infty} f(t)\,dt$, unless stated to the contrary, implies the limit (1.278c)

$$\lim_{T_1\to\infty}\lim_{T_2\to\infty}\int_{-T_1}^{T_2} f(t)\,dt, \tag{2.125}$$

which means that the integral converges when the upper and lower limits approach infinity independently. This definition turns out to be too restrictive in many situations of physical interest. An alternative and more encompassing definition is the following:

$$\lim_{T\to\infty}\int_{-T}^{T} f(t)\,dt. \tag{2.126}$$

Here we stipulate that the upper and lower limits must approach infinity at the same rate. It is obvious that convergence in the sense of (2.125) implies convergence in the sense of (2.126). The converse is, however, not true. The class of functions for which the integral exists in the sense of (2.126) is much larger than under definition (2.125). In particular, all (piecewise differentiable) bounded odd functions are integrable in the sense of (2.126) and the integral yields zero. Under these circumstances (2.125) would generally diverge, unless of course the growth of the function at infinity is suitably restricted. When the limit is taken symmetrically in accordance with (2.126) the integral is said to be defined in terms of the Cauchy Principal Value (CPV). We have in fact already employed this definition implicitly on several occasions, in particular in (2.113). A somewhat different form of the CPV limit is also of interest in Fourier transform theory. This form arises whenever the integral is improper in virtue of one or more simple pole singularities within the integration

interval. For example, the integral $\int_{-2}^{8}\frac{dt}{t-1}$ has a singularity at t = 1 where the integrand becomes infinite. The first inclination would be to consider this integral simply as divergent. On the other hand, since the integrand changes sign as one moves through the singularity, it is not unreasonable to seek a definition of a limiting process which would facilitate the mutual cancellation of the positive and negative infinite contributions. For example, suppose we define

$$I(\varepsilon_1, \varepsilon_2) = \int_{-2}^{1-\varepsilon_1}\frac{dt}{t-1} + \int_{1+\varepsilon_2}^{8}\frac{dt}{t-1},$$

where ε₁ and ε₂ are small positive numbers so that the integration is carried out up to and past the singularity. By direct calculation we find $I(\varepsilon_1, \varepsilon_2) = \ln(7\varepsilon_1/3\varepsilon_2)$. We see that if we let ε₁ and ε₂ approach zero independently the integral diverges. On the other hand, by setting ε₁ = ε₂ = ε the result is always finite. Apparently, approaching the singularity symmetrically from both sides


results in a cancellation of the positive and negative infinite contributions and yields a convergent integral. The formal expression for the limit is

$$\lim_{\varepsilon\to 0}\left\{\int_{-2}^{1-\varepsilon}\frac{dt}{t-1} + \int_{1+\varepsilon}^{8}\frac{dt}{t-1}\right\} = \ln(7/3).$$

This limiting procedure constitutes the CPV definition of the integral whenever the singularity falls within the integration interval. Frequently a special symbol is used to indicate a CPV evaluation. We shall indicate it by prefixing the letter P to the integration symbol. Thus $P\int_{-2}^{8}\frac{dt}{t-1} = \ln(7/3)$. When more than one singularity is involved the CPV limiting procedure must be applied to each. For example,

singularity is involved the CPV limiting procedure must be applied to each. Forexample,

I = P

∫ 9

−5

dt

(t− 1) (t− 2)

= limε→0

{∫ 1−ε

−5

dt

(t− 1) (t− 2)+

∫ 2−ε

1+ε

dt

(t− 1) (t− 2)

+

∫ 9

2+ε

dt

(t− 1) (t− 2)

}

= ln 3/4.

The following example illustrates the CPV evaluation of an integral with infinitelimits of integration:

$$\begin{aligned} I &= P\int_{-\infty}^{\infty}\frac{dt}{t-2} = \lim_{\substack{\varepsilon\to 0\\ T\to\infty}}\left\{\int_{-T}^{2-\varepsilon}\frac{dt}{t-2} + \int_{2+\varepsilon}^{T}\frac{dt}{t-2}\right\}\\ &= \lim_{\substack{\varepsilon\to 0\\ T\to\infty}}\ln\frac{(2-\varepsilon-2)(T-2)}{(-T-2)(2+\varepsilon-2)} = 0.\end{aligned}$$

Note that the symbol P in this case pertains to a CPV evaluation at t = −∞and t =∞. A generic form of an integral that is frequently encountered is

$$I = P\int_a^b\frac{f(t)}{t-q}\,dt, \tag{2.127}$$

where a < q < b and f(t) is a bounded function within (a, b) and differentiable at t = q. We can represent this integral as a sum of an integral of a bounded function and a CPV integral which can be evaluated in closed form as follows:

$$\begin{aligned} I &= P\int_a^b\frac{f(t) - f(q) + f(q)}{t-q}\,dt\\ &= \int_a^b\frac{f(t)-f(q)}{t-q}\,dt + f(q)\,P\int_a^b\frac{dt}{t-q}\\ &= \int_a^b\frac{f(t)-f(q)}{t-q}\,dt + f(q)\ln\frac{b-q}{q-a}. \tag{2.128}\end{aligned}$$


Note that the integrand in the first integral in the last expression is finite att = q so that the integral can be evaluated, if necessary, numerically usingstandard techniques.
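The subtraction device of (2.128) translates directly into a practical numerical recipe; the following sketch (using SciPy's quad for the bounded part) reproduces the closed-form value ln(7/3) for the earlier example and handles a generic f(t) in the same way:

```python
import numpy as np
from scipy.integrate import quad

def cpv(f, a, b, q):
    """CPV of integral f(t)/(t - q) dt over (a, b) via (2.128)."""
    # bounded part: (f(t) - f(q))/(t - q), which remains finite at t = q
    g = lambda t: (f(t) - f(q)) / (t - q) if t != q else 0.0
    regular, _ = quad(g, a, b, points=[q])
    return regular + f(q) * np.log((b - q) / (q - a))

# f(t) = 1, q = 1 on (-2, 8): should give ln(7/3)
print(cpv(lambda t: 1.0, -2.0, 8.0, 1.0), np.log(7.0 / 3.0))
print(cpv(np.cos, -1.0, 2.0, 0.5))   # a less trivial example
```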

Let us now apply the CPV procedure to the evaluation of the Fourier trans-form of f (t) = 1/t. Even though a signal of this sort might appear quite artificialit will be shown to play a pivotal role in the theory of the Fourier transform.Writing the transform as a CPV integral we have

$$F(\omega) = P\int_{-\infty}^{\infty}\frac{e^{-i\omega t}}{t}\,dt = P\int_{-\infty}^{\infty}\frac{\cos\omega t}{t}\,dt - iP\int_{-\infty}^{\infty}\frac{\sin\omega t}{t}\,dt.$$

Since $P\int_{-\infty}^{\infty}\frac{\cos\omega t}{t}\,dt = 0$, and $\frac{\sin\omega t}{t}$ is free of singularities, we have

$$F(\omega) = -i\int_{-\infty}^{\infty}\frac{\sin\omega t}{t}\,dt. \tag{2.129}$$

Recalling that $\int_{-\infty}^{\infty}\frac{\sin x}{x}\,dx = \pi$ we obtain, by setting ωt = x in (2.129),

$$\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin\omega t}{t}\,dt = \operatorname{sign}(\omega) = \begin{cases}1; & \omega > 0,\\ -1; & \omega < 0.\end{cases} \tag{2.130}$$

Thus we arrive at the transform pair

$$\frac{1}{\pi t} \stackrel{\mathcal F}{\Longleftrightarrow} -i\,\operatorname{sign}(\omega). \tag{2.131}$$

By using the same procedure for the inverse transform of 1/ω we arrive at thepair

$$\operatorname{sign}(t) \stackrel{\mathcal F}{\Longleftrightarrow} \frac{2}{i\omega}. \tag{2.132}$$

Several idealized signals may be termed canonical in that they form the essential building blocks in the development of analytical techniques for the evaluation of Fourier transforms and also play a fundamental role in the characterization of linear systems. One such canonical signal is the sign function just considered. We consider several others in turn.

2.2.5 Canonical Signals and Their Transforms

The Delta Function

That the Fourier transform of δ(t) equals 1 follows simply from the basic property of the delta function as an identity transformation. The consistency of this with the inversion formula follows from (2.119). Hence

$$\delta(t) \stackrel{\mathcal F}{\Longleftrightarrow} 1. \tag{2.133}$$

In identical fashion we get

$$1 \stackrel{\mathcal F}{\Longleftrightarrow} 2\pi\delta(\omega). \tag{2.134}$$


The Unit Step Function

From the identity $U(t) = \frac{1}{2}[1 + \operatorname{sign}(t)]$ we get, in conjunction with (2.132) and (2.134),

$$U(t) \stackrel{\mathcal F}{\Longleftrightarrow} \pi\delta(\omega) + \frac{1}{i\omega}. \tag{2.135}$$

The Rectangular Pulse Function

Using the definition for pT (t) in (1.6-40b) we obtain by direct integration thepair

$$p_T(t) \stackrel{\mathcal F}{\Longleftrightarrow} \frac{2\sin(\omega T)}{\omega}, \tag{2.136}$$

where we again find the familiar Fourier integral kernel. If, on the other hand,pΩ(ω) describes a rectangular frequency window, then a direct evaluation of theinverse transform yields

$$\frac{\sin(\Omega t)}{\pi t} \stackrel{\mathcal F}{\Longleftrightarrow} p_\Omega(\omega). \tag{2.137}$$

The transition of (2.137) to (2.133) as Ω → ∞ and of (2.136) to (2.134) asT →∞ should be evident.

Triangular Pulse Function

Another signal that we should like to add to our catalogue of canonical trans-forms is the triangular pulse qT (t) defined in (1.278d) for which we obtain thepair

$$q_T(t) \stackrel{\mathcal F}{\Longleftrightarrow} T\,\frac{\sin^2(\omega T/2)}{(\omega T/2)^2}. \tag{2.137*}$$

Exponential Functions

Since the Fourier transform is a representation of signals in terms of exponentialswe would expect exponential functions to play a special role in Fourier analysis.In the following we distinguish three cases: a purely imaginary argument, apurely real argument with the function truncated to the positive time axis, anda real exponential that decays symmetrically for both negative and positivetimes. In the first case we get from the definition of the delta function (2.119)and real ω0

$$e^{i\omega_0 t} \stackrel{\mathcal F}{\Longleftrightarrow} 2\pi\delta(\omega-\omega_0). \tag{2.138}$$

This result is in perfect consonance with the intuitive notion that a singletone, represented in the time domain by a unit amplitude sinusoidal oscillationof infinitely long duration, should correspond in the frequency domain to a sin-gle number, i.e., the frequency of oscillation, or, equivalently, by a spectrumconsisting of a single spectral line. Here this spectrum is represented symbol-ically by a delta function at ω = ω0. Such a single spectral line, just like the


corresponding tone of infinite duration, are convenient abstractions never realiz-able in practice. A more realistic model should consider a tone of finite duration,say −T < t < T. We can do this either by truncating the limits of integrationin the evaluation of the direct transform, or, equivalently, by specifying thistruncation in terms of the pulse function pT (t). The resulting transform pairthen reads

pT (t)eiω0t F⇐⇒ 2 sin [(ω − ω0)T ]

(ω − ω0), (2.139)

so that the form of the spectrum is the Fourier kernel (2.136) whose peak hasbeen shifted to ω0. One can show that slightly more than 90% of the energy iscontained within the frequency band defined by the first two nulls on either sideof the principal peak. It is therefore reasonable to take this bandwidth as thenominal spectral linewidth of the tone. Thus we see that a tone of duration 2Thas a spectral width of 2π/T which is sometimes referred to as the Rayleigh res-olution limit. This inverse relationship between the signal duration and spectralwidth is of fundamental importance in spectral analysis. Its generalization to awider class of signals is embodied in the so-called uncertainty principle discussedin 2.5.1.

With α > 0 and the exponential truncated to the nonnegative time axiswe get

e−αtU (t)F⇐⇒ 1

α+ iω. (2.140)

For the exponential e−α|t| defined over the entire real line the transform pairreads

e−α|t| F⇐⇒ 2α

α2 + ω2. (2.141)

Formula (2.140) also holds when α is replaced by the complex number p0 =α− iω0 where ω0 is real. A further generalization follows if we differentiate theright side of (2.140) n− 1 times with respect to ω. The result is

tn−1

(n− 1)!e−p0tU (t)

F⇐⇒ 1

(p0 + iω)n ; n ≥ 1. (2.142)

Using this formula in conjunction with the partial fraction expansion techniqueconstitutes one of the basic tools in the evaluation of inverse Fourier transformsof rational functions.

Gaussian Function

A rather important idealized signal is the Gaussian function

f(t) =1√2πσ2

t

e− t2

2σ2t ,

Page 132: Signals and transforms in linear systems analysis

2.2 The Fourier Integral 117

where we have adopted the normalization(√f,√f)= 1. We compute its FT

as follows:

F (ω) =1√2πσ2

t

∫ ∞

−∞e− t2

2σ2t e−iωtdt =

1√2πσ2

t

∫ ∞

−∞e− 1

2σ2t[t2+2iωσ2

t t]dt

=e−

12σ

2tω

2

√2πσ2

t

∫ ∞

−∞e− 1

2σ2t[t+iωσ2

t ]2

dt =e−

12σ

2tω

2

√2πσ2

t

∫ ∞+iωσ2t

−∞+iωσ2t

e− z2

2σ2t dz.

The last integral may be interpreted as an integral in the complex z planewith the path of integration running along the straight line with endpoints(−∞ + iωσ2

t ,∞ + iωσ2t ). Since the integrand is analytic in the entire finite z

plane we can shift this path to run along the axis of reals so that

∫ ∞+iωσ2t

−∞+iωσ2t

e− z2

2σ2t dz =

∫ ∞

−∞e− z2

2σ2t dz =

√2πσ2

t .

Thus we obtain the transform pair

1√2πσ2

t

e− t2

2σ2t

F⇐⇒ e−12σ

2tω

2

. (2.142*)

Note that except for a scale factor the Gaussian function is its own FT. Here wesee another illustration of the inverse relationship between the signal durationand bandwidth. If we take σt as the nominal duration of the pulse in the timedomain, then a similar definition for the effective bandwidths of F (ω) yieldsσω = 1/ σt.

2.2.6 Basic Properties of the FT

Linearity

The Fourier transform is a linear operator which means that for any set offunctions fn (t) n = 1, 2, . . .N and corresponding transforms Fn (ω) we have

F{

N∑

n=1

αnfn (t)

}=

N∑

n=1

αnFn (ω) ,

where the αn are constants. This property is referred to as the superpositionprinciple. We shall return to it in Chap. 3 when we discuss linear systems. Thissuperposition principle carries over to a continuous index. Thus if

f (ξ, t)F⇐⇒ F (ξ, ω)

holds for a continuum of values of ξ, then

F{∫

f (ξ, t) dξ

}=

∫F (ξ, ω) dξ.

Page 133: Signals and transforms in linear systems analysis

118 2 Fourier Series and Integrals with Applications to Signal Analysis

Symmetries

For any Fourier transform pair

f (t)F⇐⇒ F (ω)

we also have, by a simple substitution of variables,

F (t)F⇐⇒ 2πf (−ω) . (2.143)

For example, using this variable replacement in (2.141), we obtain

α

π (α2 + t2)

F⇐⇒ e−α|ω|. (2.143*)

The Fourier transform of the complex conjugate of a function follows throughthe variable replacement

f∗ (t) F⇐⇒ F ∗ (−ω) . (2.144)

Frequently we shall be interested in purely real signals. If f (t) is real, thepreceding requires

F ∗ (−ω) = F (ω) . (2.145)

If we decompose F (ω) into its real and imaginary parts

F (ω) = R (ω) + iX (ω) , (2.146)

we note that (2.145) is equivalent to

R (ω) = R (−ω) , (2.147a)

X (ω) = −X (−ω) , (2.147b)

so that for a real signal the real part of the Fourier transforms is even functionwhile the imaginary part an odd function of frequency. The even and oddsymmetries carry over to the amplitude and phase of the transform. Thuswriting

F (ω) = A (ω) eiθ(ω), (2.148)

wherein

A (ω) = |F (ω)| =√[R (ω)]

2+ [X (ω)]

2, (2.149a)

θ (ω) = tan−1 X (ω)

R (ω), (2.149b)

we have in view of (2.147)

A (ω) = A (−ω) (2.150a)

θ (ω) = −θ (−ω) . (2.150b)

Page 134: Signals and transforms in linear systems analysis

2.2 The Fourier Integral 119

As a result the inversion formula can be put into the form

f (t) =1

π

∫ ∞

0

A (ω) cos [ωt+ θ (ω)] dω

= e{

1

∫ ∞

0

2F (ω) eiωtdω

}. (2.151)

The last expression shows that a real physical signal can be represented asthe real part of a fictitious complex signal whose spectrum equals twice thespectrum of the real signal for positive frequencies but is identically zero fornegative frequencies. Such a complex signal is referred to as an analytic signal,a concept that finds extensive application in the study of modulation to bediscussed in 2.3.

Time Shift and Frequency Shift

For any real T we have

f (t− T ) F⇐⇒ F (ω) e−iωT (2.152)

and similarly for any real ω0

f (t) eiω0t F⇐⇒ F (ω − ω0) . (2.153)

The last formula is the quantification of the modulation of a high frequencyCW carrier by a baseband signal comprised of low frequency components.For example, for the carrier of A cos (ω0t+ θ0) and a baseband signal f (t) weget

f (t)A cos (ω0t+ θ0)F⇐⇒ A

2eiθ0F (ω − ω0) +

A

2e−iθ0F (ω + ω0) . (2.154)

If we suppose that F (ω) is negligible outside the band defined by |ω| < Ω, andalso assume that ω0 > 2Ω, the relationship among the spectra in (2.154) maybe represented schematically as in Fig. 2.23

ω0

ω0

−ω0 Ω−Ω

F

2F

Aeiq0

2Ae−iq0

ω

ω

ω − )( ω0F ω + )(

( )

Figure 2.23: Modulation by a CW carrier

Page 135: Signals and transforms in linear systems analysis

120 2 Fourier Series and Integrals with Applications to Signal Analysis

Differentiation

If f (t) is everywhere differentiable, then a simple integration by parts gives∫ ∞

−∞f ′ (t) e−iωtdt = f (t) e−iωt

∣∣∞−∞ + iω

∫ ∞

−∞f (t) e−iωtdt

= iωF (ω) . (2.155)

Clearly if f (t) is differentiable n times we obtain by repeated integration

f (n) (t)F⇐⇒ (iω)n F (ω) . (2.156)

Actually this formula may still be used even if f (t) is only piecewise dif-ferentiable and discontinuous with discontinuous first and even higher orderderivatives at a countable set of points. We merely have to replace f (n) (t) witha generalized derivative defined in terms of singularity functions, an approachwe have already employed for the first derivative in (1.280). For example, theFourier transform of (1.280) is

f ′ (t) F⇐⇒ iωF (ω) = F {f ′s (t)}+

k

[f(t+k

)− f (t−k

)]e−iωtk . (2.157)

In the special case of only one discontinuity at t = 0 and f (0−) = 0 (2.157)becomes

f ′s (t)

F⇐⇒ iωF (ω)− f (0+

). (2.158)

What about the Fourier transform of higher order derivatives? Clearly if thefirst derivative is continuous at t = 0, the Fourier transform of f ′′

s (t) may beobtained by simply multiplying the right side of (2.158) by iω. However in case ofa discontinuity in the first derivative the magnitude of the jump in the derivativemust be subtracted. Again assuming f ′ (0−) = 0 we have

f ′′s (t)

F⇐⇒ iω[iωF (ω)− f (

0+)]− f ′ (0+

). (2.159)

Higher order derivatives can be handled similarly.Since an n− th order derivative in the time domain transforms in the fre-

quency domain to a multiplication by (iω)n, the Fourier transform of any lineardifferential operator with constant coefficients is a polynomial in iω. This featuremakes the Fourier transform a natural tool for the solution of linear differen-tial equations with constant coefficients. For example, consider the followingdifferential equation:

x′′ (t) + 2x′ (t) + x (t) = 0. (2.160)

We seek a solution for x (t) for t ≥ 0 with initial conditions x (0+) = 2 andx′ (0+) = 6. Then

x′ (t) F⇐⇒ iωX (ω)− 2

x′′ (t) F⇐⇒ −ω2X (ω)− iω2− 6.

Page 136: Signals and transforms in linear systems analysis

2.2 The Fourier Integral 121

The solution for X (ω) reads

X (ω) =i2ω + 10

−ω2 + i2ω + 1,

while the signal x (t) is to be computed from

x (t) =1

∫ ∞

−∞

i2ω + 10

−ω2 + i2ω + 1eiωtdω. (2.161)

The integral can be evaluated by contour integration as will be shown in 2.4.4(see also (A.96) in the Appendix).

Inner Product Invariance

We compute the inner product of two functions in the time domain and with theaid of the inversion formulas transform it into an inner product in the frequencydomain as follows:

(f1, f2) =

∫ ∞

−∞f∗1 (t) f2 (t) dt

=

∫ ∞

−∞

{1

∫ ∞

−∞F ∗1 (ω) e−iωtdω

1

∫ ∞

−∞F2 (ω

′) eiω′tdω′

}dt

=1

∫ ∞

−∞

∫ ∞

−∞F ∗1 (ω)F2 (ω

′){

1

∫ ∞

−∞ei(ω

′−ω)tdt}dω′dω

=1

∫ ∞

−∞

∫ ∞

−∞F ∗1 (ω)F2 (ω

′) δ (ω − ω′) dω′dω

=1

∫ ∞

−∞F ∗1 (ω)F2 (ω) dω.

The final result may be summarized to read∫ ∞

−∞f∗1 (t) f2 (t) dt =

1

∫ ∞

−∞F ∗1 (ω)F2 (ω) dω, (2.162)

which is recognized as a generalization of Parseval’s formula.

Convolution

We have already encountered the convolution of two functions in connection withFourier series, (2.49). Since in the present case the time domain encompassesthe entire real line the appropriate definition is

h (t) =

∫ ∞

−∞f (τ ) g (t− τ ) dτ .

We shall frequently employ the abbreviated notation∫ ∞

−∞f (τ ) g (t− τ ) dτ = f ∗ g. (2.163)

Page 137: Signals and transforms in linear systems analysis

122 2 Fourier Series and Integrals with Applications to Signal Analysis

Note that∫ ∞

−∞f (τ) g (t− τ) dτ =

∫ ∞

−∞g (τ ) f (t− τ ) dτ

as one can readily convince oneself through a change of the variable of integra-tion. This can also be expressed as f ∗ g = g ∗ f , i.e., the convolution operation

is commutative. In view of (2.152) g (t− τ ) F⇐⇒ G (ω) e−iωτ so that

∫ ∞

−∞f (τ ) g (t− τ ) dτ F⇐⇒

∫ ∞

−∞f (τ )G (ω) e−iωτdτ = F (ω)G (ω) . (2.164)

In identical manner we establish that

f (t) g (t)F⇐⇒ 1

∫ ∞

−∞F (η)G (ω − η) dη =

1

2πF ∗G. (2.165)

Integration

When the Fourier transform is applied to integro-differential equations one some-times needs to evaluate the transform of the integral of a function. For examplewith g (t) =

∫ t−∞ f (τ ) dτ we would like to determine G (ω) in terms of F (ω) .

We can do this by first recognizing that∫ t−∞ f (τ ) dτ =

∫∞−∞ f (τ )U (t− τ ) dτ .

Using (2.164) and (2.135) we have

∫ ∞

−∞f (τ)U (t− τ ) dτ F⇐⇒ F (ω)

[πδ (ω) +

1

]

with the final result

∫ t

−∞f (τ ) dτ

F⇐⇒ πF (0) δ (ω) +F (ω)

iω= G (ω) . (2.166)

Note that the integral implies g′ (t) = f (t) so that

iωG (ω) = F (ω) . (2.167)

This is certainly compatible with (2.166) since ωδ (ω) = 0. However the solu-tion of (2.167) for G (ω) by simply dividing both sides by iω is in general notpermissible since G (ω) = F (ω) /iω unless F (0) = 0.

Causal Signals and the Hilbert Transform [16]

Let

fe (t) =f (t) + f (−t)

2, (2.168a)

fo (t) =f (t)− f (−t)

2, (2.168b)

Page 138: Signals and transforms in linear systems analysis

2.2 The Fourier Integral 123

so that f (t) = fe (t) + fo (t) for any signal. Since fe (t) = fe (−t) and fo (t) =−fo (−t) (2.168a) and (2.168b) are referred to as the even and odd parts of f (t),respectively. Now

F {fe (t)} =1

2

∫ ∞

−∞[f (t) + f (−t)] [cos (ωt)− i sin (ωt)] dt

=

∫ ∞

−∞f (t) cos (ωt) dt (2.169a)

and

F {fo (t)} =1

2

∫ ∞

−∞[f (t)− f (−t)] [cos (ωt)− i sin (ωt)] dt

= −i∫ ∞

−∞f (t) sin (ωt) dt. (2.169b)

In view of the definition (2.146), for a real f (t) (2.169a) and (2.169b) areequivalent to

fe (t)F⇐⇒ R (ω) , (2.170a)

fo (t)F⇐⇒ iX (ω) . (2.170b)

In the following we shall be concerned only with real signals.As will be discussed in Chap. 3, signals that vanish for negative values of the

argument play a special role in linear time-invariant systems. Such signals aresaid to be causal. Suppose f (t) is a causal signal. Then according to (2.168)

f (t) =

{2fe (t) = 2fo (t) ; t > 0,

0 ; t < 0.(2.171)

Evidently the even and odd parts are not independent for

fe (t) = fo (t) ; t > 0,

fe (t) = −fo (t) ; t < 0, .

which can be rephrased in more concise fashion with the aid of the sign functionas follows:

fo (t) = sign (t) fe (t) (2.172a)

fe (t) = sign (t) fo (t) . (2.172b)

Taking account of (2.170), (2.132), and (2.165) Fourier Transformation of bothsides of (2.172) results in the following pair of equations:

X (ω) = − 1

πP

∫ ∞

−∞

R (η) dη

ω − η , (2.173a)

R (ω) =1

πP

∫ ∞

−∞

X (η) dη

ω − η . (2.173b)

Page 139: Signals and transforms in linear systems analysis

124 2 Fourier Series and Integrals with Applications to Signal Analysis

These relations show explicitly that the real and imaginary parts of the Fouriertransform of a causal signal may not be prescribed independently. For exampleif we know R (ω), then X (ω) can be determined uniquely by (2.173a). SinceP

∫ ∞−∞

dηω−η = 0, an R (ω) that is constant for all frequencies gives a null result

for X (ω). Consequently, (2.173b) determines R (ω) from X (ω) only within aconstant.

The integral transform 1πP

∫∞−∞

R(η)dηω−η is known as the Hilbert Transform

which shall be denoted by H{R (ω)} . Using this notation we rewrite (2.173) as

X (ω) = −H{R (ω)} , (2.174a)

R (ω) = H{X (ω)} . (2.174b)

Since (2.174b) is the inverse of (2.174a) the inverse Hilbert transform is obtainedby a change in sign. As an example, suppose R (ω) = pΩ (ω) . Carrying out thesimple integration yields

X (ω) =1

πln

∣∣∣∣ω − Ω

ω +Ω

∣∣∣∣ , (2.175)

which is plotted in Fig. 2.24. The Hilbert Transform in the time domain isdefined similarly. Thus for a signal f (t)

H{f (t)} = 1

πP

∫ ∞

−∞

f (τ ) dτ

t− τ . (2.176)

-3 -2 -1 0 1 2 3-2.5

-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

2.5

ω/Ω

R

X

ω)(

ω)(

Figure 2.24: R (ω) and its Hilbert transform

Page 140: Signals and transforms in linear systems analysis

2.2 The Fourier Integral 125

Particularly simple results are obtained for Hilbert transforms of sinusoids.For example, with f (t) = cos (ωt) (with ω a real constant) we have

1

πP

∫ ∞

−∞

cos (ωτ ) dτ

t− τ =1

πP

∫ ∞

−∞

cos [ω (t− τ )] dττ

= cos (ωt)1

πP

∫ ∞

−∞

cos (ωτ) dτ

τ

+sin (ωt)1

πP

∫ ∞

−∞

sin (ωτ ) dτ

τ.

We note that the first of the two preceding integrals involves an odd function andtherefore vanishes, while in virtue of (2.132) the second integral yields sign (ω) .Hence

H{cos (ωt)} = sign (ω) sin (ωt) . (2.177)

In identical fashion we obtain

H{sin (ωt)} = −sign (ω) cos (ωt) . (2.178)

We shall have occasion to employ the last two formulas in connection withanalytic signal representations.

The Hilbert transform finds application in signal analysis, modulation the-ory, and spectral analysis. In practical situations the evaluation of the Hilberttransform must be carried out numerically for which purpose direct use of thedefining integral is not particularly efficient. The preferred approach is to carryout the actual calculations in terms of the Fourier transform which can be com-puted efficiently using the FFT algorithm. To see how this may be arranged,let us suppose that R (ω) is given and we wish to find X (ω) . By taking theinverse FT we first find fe (t) , in accordance with (2.170a). In view of (2.171),if we now multiply the result by 2, truncate it to nonnegative t, and take thedirect FT, we should obtain F (ω) . Thus

2fe (t)U (t)F⇐⇒ F (ω) (2.179)

and X (ω) follows by taking the imaginary part of F (ω) . In summary we have

H{R} = −�m{∫ ∞

0

2F−1 {R} e−iωtdt}

= −X (ω) , (2.180)

where

F−1 {R} ≡ 1

∫ ∞

−∞R (ω′) eiω

′tdω′.

Initial and Final Value Theorems

Again assume that f (t) is a causal signal and that it is piecewise differentiablefor all t > 0. Then

F {f ′ (t)} =

∫ ∞

0+f ′ (t) e−iωtdt = f (t) e−iωt |∞0+ + iω

∫ ∞

0+f (t) e−iωtdt

= iωF (ω)− f (0+

).

Page 141: Signals and transforms in linear systems analysis

126 2 Fourier Series and Integrals with Applications to Signal Analysis

Since by assumption f ′ (t) exists for t > 0, or, equivalently, f (t) is smooth,F {f ′ (t)} approaches zero as ω →∞ (c.f. (2.158)). Under these conditions thelast equation yields

limω→∞iωF (ω) = f

(0+

), (2.181)

a result known as the initial value theorem. Note that fe (0) ≡ f (0) but ac-cording to (2.171) 2 fe (0) = f (0+) . Hence

f (0) =1

2f(0+

), (2.182)

which is consistent with the fact that the FT converges to the arithmetic meanof the step discontinuity.

Consider now the limit

limω→0

[iωF (ω)− f (

0+)]

= limω→0

∫ ∞

0+f ′ (t) e−iωtdt

=

∫ ∞

0+f ′ (t) lim

ω→0

[e−iωt

]dt

= limt→∞f (t)− f

(0+

).

Upon cancelling f (0+) we get

limω→0

[iωF (ω)] = limt→∞f (t) , (2.183)

which is known as the final value theorem.

Fourier Series and the Poisson Sum Formula

Given a function f (t) within the finite interval −T/2, T/2 we can represent iteither as a Fourier integral, (2.123), comprised of a continuous spectrum of

( ) ( ) ωωπ

ω deFtf tj∫∞

−∞=

21

( ) Tn

j

n

eTTn

F

tfπ

π2

2

∑∞

−∞=

⎟⎠⎞⎜

⎝⎛

=

2/T− 2/T

2/T2/T−

Figure 2.25: Fourier integral and Fourier series representations

Page 142: Signals and transforms in linear systems analysis

2.2 The Fourier Integral 127

sinusoids, or as a Fourier series, (2.124), comprised of discrete harmonicallyrelated sinusoids. In the former case the representation converges to zero outsidethe interval in question while in the latter case we obtain a periodic repetition(extension) of the given function, as illustrated in Fig. 2.25. The significant pointto note here is that the Fourier series coefficients are given by the FT formula.Note also that the Fourier transform of f (t) and its periodic extension (takenover the entire real-time axis) is a infinite series comprised of delta functions, i.e.,

∞∑

n=−∞fne

i2πnt/T F⇐⇒ 2π

∞∑

n=−∞fnδ (ω − 2πn/T ) . (2.184)

In the following we present a generalization of (2.124), known as the Poissonsum formula wherein the function f (t) may assume nonzero values over theentire real line. We start by defining the function g (t) through the sum

g (t) =∞∑

n=−∞f (t− nT ) . (2.185)

It is easy to see that g (t) is periodic with period T. We take the FT to obtain

∞∑

n=−∞f (t− nT ) F⇐⇒

∞∑

n=−∞F (ω) e−iωnT .

In view of (2.31) the sum of exponentials can be replaced by a sum comprisedof delta functions. Thus

∞∑

n=−∞F (ω) e−iωnT =

∞∑

�=−∞F (ω) 2πδ (ωT − 2π�)

=2π

T

∞∑

�=−∞F (2π�/T ) δ (ω − 2π�/T ) .

Inverting the FT gives

∞∑

�=−∞F (2π�/T ) /Tei2π�t/T

F⇐⇒ 2π

T

∞∑

�=−∞F (2π�/T ) δ (ω − 2π�/T ) .

Since the left side in the last expression must be identical to (2.185) we arejustified in writing

∞∑

n=−∞f (t− nT ) =

∞∑

�=−∞F (2π�/T ) /Tei2π�t/T , (2.186)

which is the desired Poisson sum formula.As an example, suppose f (t) = 1/(1 + t2). Then F (ω) = πe−|ω| (see

(2.143*)) and with T = 2π we get

∞∑

n=−∞

1

1 + (t− 2πn)2=

1

2

∞∑

�=−∞e−(|�|−i�t) =

e2 − 1

2 [e2 − 2e cos(t) + 1]. (2.187)

Page 143: Signals and transforms in linear systems analysis

128 2 Fourier Series and Integrals with Applications to Signal Analysis

2.2.7 Convergence at Discontinuities

The convergence of the FT at a step discontinuity exhibits the Gibbs oscillatorybehavior similar to Fourier series. Thus suppose f (t) has step discontinuitiesat t = tk, k = 1, 2, . . . and we represent it as in (1.282). Then with fΩ (t) as in(2.109) we have

fΩ (t) = fΩs (t) +

k

[f(t+k

)− f (t−k

)] ∫ ∞

tk

sin [(t− t′) Ω]π (t− t′) dt′

= fΩs (t) +

k

[f(t+k

)− f (t−k

)] 1

π

∫ (t−tk)Ω

−∞

sinx

xdx

= fΩs (t) +

k

[f(t+k

)− f (t−k

)] [12+

1

πSi [(t− tk)Ω]

]. (2.188)

As Ω → ∞ the fΩs (t) tends uniformly to fs (t) whereas the convergence of

each member in the sum is characterized by the oscillatory behavior of the sine

-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1-0.2

0

0.2

0.4

0.6

0.8

1

1.2Ω= 10Ω=20Ω=50

Figure 2.26: Convergence of the Fourier transform at a step discontinuity

integral function. This is illustrated in Fig. 2.26 which shows a unit step togetherwith plots of 1/2 + (1/π) Si(Ωt) for Ω = 10, 20, and 50.

2.2.8 Fejer Summation

In 2.1.5 it was shown that the Gibbs oscillations at step discontinuities arisingin partial sums of Fourier series can be suppressed by employing the Fejer sum-mation technique. An analogous procedure works for the Fourier Integral where

Page 144: Signals and transforms in linear systems analysis

2.2 The Fourier Integral 129

instead of (2.32) we must resort to the following fundamental theorem from thetheory of limits. Given a function f(Ω) integrable over any finite interval 0,Ωwe define, by analogy with (2.135), the average σΩ by

σΩ =1

Ω

∫ Ω

0

f(Ω)dΩ. (2.189)

It can be shown that if limΩ−→∞

f(Ω) = f exists then so does limΩ−→∞

σΩ = f .

Presently for the function f(Ω) we take the partial “sum” f Ω(t) in (2.109)and denote the left side of (2.189) by σΩ (t). If we suppose that lim

Ω−→∞fΩ (t) =

12 [f(t

+) + f (t−)], then by the above limit theorem we also have

limΩ−→∞

σΩ (t) =1

2

[f(t+) + f

(t−

)]. (2.190)

By integrating the right side of (2.109) with respect to Ω and using (2.189) weobtain

σΩ (t) =

∫ ∞

−∞f (t′)

sin2 [(Ω/2) (t− t′)]π (Ω/2) (t− t′)2 dt′. (2.191)

Unlike the kernel (2.38) in the analogous formula for Fourier series in (2.37),the kernel

KΩ (t− t′) = sin2 [(Ω/2) (t− t′)]π (Ω/2) (t− t′)2 (2.192)

is not periodic. We leave it exercise to show that

limΩ−→∞

sin2 [(Ω/2) (t− t′)]π (Ω/2) (t− t′)2 = δ (t− t′) , (2.193)

which may be taken as a direct verification of (2.190). A plot of the Fejerkernel together with the Fourier integral kernel is shown in Fig. 2.27, wherethe maximum of each kernel has been normalized to unity. Note that the Fejerkernel is always nonnegative with a wider main lobe than the Fourier kernel andexhibits significantly lower sidelobes. One can readily show that

F{sin2 (Ω/2) t

π (Ω/2) t2

}=

{1− |ω|

Ω ; |ω| < Ω,0; |ω| > Ω.

(2.194)

Since the right side of (2.191) is a convolution in the time domain, its FTyields a product of the respective transforms. Therefore using (2.194) we canrewrite (2.191) as an inverse FT as follows:

σΩ (t) =1

∫ Ω

−Ω

F (ω)

(1− |ω|

Ω

)eiωtdω. (2.195)

Page 145: Signals and transforms in linear systems analysis

130 2 Fourier Series and Integrals with Applications to Signal Analysis

-20 -15 -10 -5 0 5 10 15 20-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Ωt

*Fejer

**Fourier

Figure 2.27: Fejer and Fourier integral kernels

We see that the Fejer “summation” (2.195) is equivalent to the multiplicationof the signal transform F (ω) by the triangular spectral window:

W (ω) =

(1− |ω|

Ω

)pΩ (ω) (2.196)

quite analogous to the discrete spectral weighting of the Fourier series coeffi-cients in (2.34). Figure 2.28 shows a rectangular pulse together with the Fejerand Fourier approximations using a spectral truncation of Ω = 40/T. Theseresults are seen to be very similar to those plotted in Fig. 2.9 for Fourier series.Just like for Fourier series, we can also introduce higher order Fejer approxima-

tions. For example, the second-order approximation σ(1)Ω (t) can be defined by

σ(1)Ω (t) = σΩ =

1

Ω

∫ Ω

0

σa (t) da (2.197)

again with the property

limΩ−→∞

σ(1)Ω (t) =

1

2

[f(t+) + f

(t−

)]. (2.198)

Substituting (2.191) with Ω replaced by the integration variable a into (2.197)one can show that

σ(1)Ω (t) =

∫ ∞

−∞f (t′)K(1)

Ω (t− t′) dt′, (2.199)

where

K(1)Ω (t) =

1

πt2

∫ Ωt

0

1− cosx

xdx. (2.200)

Page 146: Signals and transforms in linear systems analysis

2.2 The Fourier Integral 131

-2 -1.5 -1 -0.5 0 0.5 1 1.5 2-0.2

0

0.2

0.4

0.6

0.8

1

1.2

**

*

t/T

** Fourier

*Fejer

Figure 2.28: Comparison of Fejer and Fourier integral approximations

One can show directly that limΩ−→∞

K(1)Ω (t) = δ (t) , consistent with (2.198).

A plot of 4πK(1)Ω (t) /Ω2 as a function of Ωt together with the (first-order) Fejer

and Fourier kernels is shown in Fig. 2.29. Unlike the Fourier Integral and the

(first-order) Fejer kernels, K(1)Ω (t) decreases monotonically on both sides of the

-20 -15 -10 -5 0 5 10 15 20-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Ωt

*

**

***

*Fourier

**(first order) Fejer

***(second order) Fejer

Figure 2.29: Comparison of Fourier and Fejer kernels

Page 147: Signals and transforms in linear systems analysis

132 2 Fourier Series and Integrals with Applications to Signal Analysis

maximum, i.e., the functional form is free of sidelobes. At the same time itssingle lobe is wider than the main lobe of the other two kernels. It can be shownthat for large Ωt

K(1)Ω (t) ∼ ln(Ω |t| γ)

π (Ωt)2 , (2.201)

where lnγ = 0.577215 . . . is the Euler constant. Because of the presence ofthe logarithmic term (2.201) represents a decay rate somewhere between thatof the Fourier Integral kernel (1/Ωt) and that of the (first-order) Fejer kernel(1/(Ωt)2

).

The Fourier transform of K(1)Ω (t) furnishes the corresponding spectral win-

dow. An evaluation of the FT by directly transforming (2.200) is somewhatcumbersome. A simpler approach is the following:

F{K(1)Ω (t)} = F{ 1

Ω

∫ Ω

0

sin2 [(a/2) t]

π (a/2) t2da} = 1

Ω

∫ Ω

0

F{sin2 [(a/2) t]

π (a/2) t2

}da

=1

Ω

∫ Ω

0

(1−|ω|

a

)pa (ω) da =

{1Ω

∫ Ω

|ω|(1− |ω|

a

)da ; |ω| < Ω,

0 ; |ω| > Ω.

The last integral is easily evaluated with the final result

F{K(1)Ω (t)} ≡W (1) (ω) =

{1 +|ω|Ω

(ln|ω|Ω− 1

)}pΩ (ω) . (2.202)

A plot of this spectral window is shown in Fig. 2.30 which is seen to be quitesimilar to its discrete counterpart in Fig. 2.11.

2.3 Modulation and Analytic Signal

Representation

2.3.1 Analytic Signals

Suppose z (t) is a real signal with a Fourier transform Z (ω) = A (ω) eiθ(ω).According to (2.151) this signal can be expressed as a real part of the complexsignal whose Fourier transform vanishes for negative frequencies and equalstwice the transform of the given real signal for positive frequencies. Presentlywe denote this complex signal by w (t) so that

w (t) =1

∫ ∞

0

2Z (ω) eiωtdω, (2.203)

whence the real and imaginary parts are, respectively,

e {w (t)} = z (t) =1

π

∫ ∞

0

A(ω) cos[ωt+ θ (ω)]dω, (2.204a)

Page 148: Signals and transforms in linear systems analysis

2.3 Modulation and Analytic Signal Representation 133

-2 -1.5 -1 -0.5 0 0.5 1 1.5 20

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

ω/Ω

Figure 2.30: FT of second-order Fejer kernel

�m {w (t)} =1

π

∫ ∞

0

A(ω) sin[ωt+ θ (ω)]dω. (2.204b)

We claim that �m {w (t)}, which we presently denote by z (t) , is the Hilberttransform of z (t) , i.e.,

z (t) =1

πP

∫ ∞

−∞

z (τ )

t− τ dτ . (2.205)

Taking Hilbert transforms of both sides of (2.204a) and using trigonometric sumformulas together with (2.179) and (2.180) we obtain

z (t) = H{z (t)} = H{1

π

∫ ∞

0

A(ω) cos[ωt+ θ (ω)]dω

}

=1

π

∫ ∞

0

A(ω)H{cos[ωt+ θ (ω)]} dω

=1

π

∫ ∞

0

A(ω) (cos [θ (ω)]H{cos (ωt)} − sin [θ (ω)]H{sin (ωt)}) dω

=1

π

∫ ∞

0

A(ω) (cos [θ (ω)] sin (ωt) + sin [θ (ω)] cos(ωt)) dω

=1

π

∫ ∞

0

A(ω) sin[ωt+ θ (ω)]dω = �m {w (t)}

as was to be demonstrated. As a by-product of this derivation we see thatthe evaluation of the Hilbert transform of any signal can always be carried

Page 149: Signals and transforms in linear systems analysis

134 2 Fourier Series and Integrals with Applications to Signal Analysis

out entirely in terms of the FT, as already remarked in connection with thefrequency domain calculation in (2.180).

The complex function

w (t) = z (t) + iz (t) (2.206)

of a real variable t is referred to as an analytic signal.1 By construction theFourier transform of such a signal vanishes identically for negative frequen-cies. This can also be demonstrated directly by Fourier transforming both sidesof (2.206). This entails recognition of (2.205) as a convolution of z (t) with 1/πtand use of (2.164) and (2.131). As a result we get the transform pair

z (t)F⇐⇒ −i sign (ω) Z (ω) . (2.207)

Using this in the FT of (2.206) yields W (ω) = Z (ω) + i [−i sign (ω) Z (ω)]which is equivalent to

W (ω) =

{2Z (ω) ; ω > 0

0 ; ω < 0.(2.208)

In practical situations a signal will invariably have negligible energy above acertain frequency. It is frequently convenient to idealize this by assuming thatthe FT of the signal vanishes identically above a certain frequency. Such asignal is said to be bandlimited (or effectively bandlimited). For example if z (t)is bandlimited to |ω| < ωmax the magnitude of its FT may appear as shown inFig. 2.31a. In conformance with (2.208) the magnitude of the Fourier spectrumof the corresponding analytic signal then appears as in Fig. 2.31b. It is commonto refer to the spectrum in Fig. 2.31a as double sided and to that in Fig. 2.31b asthe single sided. In practical applications the use of the latter is more common.The energy balance between the time and the frequency domains follows fromParseval theorem

∫ ∞

−∞|w (t)|2 dt = 2

π

∫ ωmax

0

|Z (ω)|2 dω.

Because of (2.207) the energy of an analytic signal is shared equally by the realsignal and its Hilbert transform.

2.3.2 Instantaneous Frequency and the Methodof Stationary Phase

The analytic signal furnishes a means of quantifying the amplitude, phase,and frequency of signals directly in the time domain. We recall that theseconcepts have their primitive origins in oscillatory phenomena described by

1The term “analytic” refers to the fact that a signal whose Fourier transform vanishesfor real negative values of frequency, i.e., is represented by the integral (2.203), is an analyticfunction of t in the upper half of the complex t plane (i.e., �m (t) > 0). (See Appendix, pages341–348).

Page 150: Signals and transforms in linear systems analysis

2.3 Modulation and Analytic Signal Representation 135

a

b

ω

ω

ω

ω

ω

ω−

ω

ω

Z(

W(

max

max

max

max

)

)

Figure 2.31: Spectrum of z(t) and z(t) + iz(t)

sinusoids. Thus we say that the signal r cos (ωt+ ψ0) has amplitude r, fre-quency ω and a fixed phase reference ψ0, where for purposes of analysis wesometimes find it more convenient to deal directly with a fictitious complex sig-nal r exp [i (ωt+ ψ0)] with the tacit understanding that physical processes areto be associated only with the real part of this signal. A generalization of thisconstruct is an analytic signal. In addition to simplifying the algebra such com-plex notation also affords novel points of view. For example, the exponential ofmagnitude r and phase angle ψ (t) = ωt+ θ0 can be interpreted graphically asa phasor of length r rotating at the constant angular velocity ω = d

dt (ωt+ θ0) .Classically for a general nonsinusoidal (real) signal z (t) the concepts of fre-quency, amplitude, and phase are associated with each sinusoidal componentcomprising the signal Fourier spectrum, i.e., in this form these concepts appearto have meaning only when applied to each individual spectral component ofthe signal. On the other hand we can see intuitively that at least in specialcases the concept of frequency must bear a close relationship to the rate of zerocrossings of a real signal. For pure sinusoids this observation is trivial, e.g.,the number of zero crossings of the signal cos (10t) per unit time is twice thatof cos(5t). Suppose instead we take the signal cos

(10t2

). Here the number of

zero crossings varies linearly with time and the corresponding complex signal,as represented by the phasor exp

[i(10t2

)], rotates at the rate d

dt

(10t2

)= 20t

rps. Thus we conclude that the frequency of this signal varies linearly withtime. The new concept here is that of instantaneous frequency which is clearlynot identical with the frequency associated with each Fourier component of thesignal (except of course in case of a pure sinusoid). We extend this definition to

Page 151: Signals and transforms in linear systems analysis

136 2 Fourier Series and Integrals with Applications to Signal Analysis

arbitrary real signals z (t) through an analytic signal constructed in accordancewith (2.203). We write it presently in the form

w(t) = r (t) eiψ(t), (2.209)

where

r (t) =√z2 (t) + z2 (t) (2.210)

is the (real, nonnegative) time-varying amplitude, or envelope, ψ(t) the instan-taneous phase, and

ω (t) =dψ

dt(2.211)

the instantaneous frequency. Note also that the interpretation of ω (t) as azero crossing rate requires that it be nonnegative which is compatible with theanalytic signal having only positive frequency components. To deduce the rela-tionship between the instantaneous frequency and the signal Fourier spectrumlet us formulate an estimate of the spectrum of w(t) :

W (ω) =

∫ ∞

−∞r (t) ei[ψ(t)−ωt]dt. (2.212)

We can, of course, not “evaluate” this integral without knowing the specificsignal. However for signals characterized by a large time-bandwidth productwe can carry out an approximate evaluation utilizing the so-called principleof stationary phase. To illustrate the main ideas without getting sidetrackedby peripheral generalities consider the real part of the exponential in (2.212),i.e., cos [q(t)] with q(t) = ψ(t) − ωt. Figure 2.32 shows a plot of cos [q(t)] for

0 1 2 3 4 5 6 7 8 9 10-1.5

-1

-0.5

0

0.5

1

1.5

t

Figure 2.32: Plot of cos(5t2 − 50t) (stationary point at t = 5)

Page 152: Signals and transforms in linear systems analysis

2.3 Modulation and Analytic Signal Representation 137

the special choice ψ (t) = 5t2 and ω = 50. This function is seen to oscillaterapidly except in the neighborhood of t = t0 = 5 = ω/10 which point cor-responds to q′ (5) = 0. The value t0 = 5 in the neighborhood of which thephase varies slowly is referred to as the stationary point of q (t) (or a point ofstationary phase). If we suppose that the function r (t) is slowly varying rel-ative to these oscillations, we would expect the contributions to an integral ofthe form

∫∞−∞ r (t) cos

(5t2 − 50t

)dt from points not in the immediate vicinity

of t = 5 to mutually cancel. Consequently the dominant contributions to theintegral would arise only from the values of r (t) and ψ (t) in the immediateneighborhood of the point of stationary phase. We note in passing that in thisexample the product t0ω = 250 >>1. It is not hard to show that the larger thisdimensionless quantity (time bandwidth product) the narrower the time bandwithin which the phase is stationary and therefore the more nearly localized thecontribution to the overall integral. In the general case the stationary point isdetermined by

q′(t) = ψ′(t)− ω = 0, (2.213)

which coincides with the definition of the instantaneous frequency in (2.211).When we expand the argument of the exponential in a Taylor series about t = t0we obtain

q(t) = ψ(t0)− ωt0 + 1

2(t− t0)2ψ′′(t0) + . . . (2.214)

Similarly we have for r (t)

r (t) = r (t0) + (t− t0)r′ (t0) + . . . (2.215)

In accordance with the localization principle just discussed we expect, given asufficiently large ωt0, that in the exponential function only the first two Taylorseries terms need to be retained. Since r (t) is assumed to be relatively slowlyvarying it may be replaced by r (t0) . Therefore (2.212) may be approximated by

W (ω) ∼ r (t0) ei[ψ(t0)−ωt0]∫ ∞

−∞ei

12 (t−t0)2ψ′′(t0)dt. (2.216)

When the preceding standard Gaussian integral is evaluated we obtain the finalformula

W (ω)ωt0∼∞

∼ r (t0) ei[ψ(t0)−ωt0]√

2π∣∣ψ′′(t0)∣∣ei π4 sign[ψ

′′(t0)]. (2.217)

It should be noted that the variable t0 is to be expressed in terms of ω byinverting (2.213), a procedure that in general is far from trivial. When thisis done (2.217) provides an asymptotic approximation to the signal Fourierspectrum for large ωt0.

Page 153: Signals and transforms in linear systems analysis

138 2 Fourier Series and Integrals with Applications to Signal Analysis

To illustrate the relationship between the instantaneous frequency of a signaland its frequency content as defined by Fourier synthesis consider the signal

g(t) =

{A cos(at2 + βt) ; 0 ≤ t ≤ T,

0 elsewhere.

whose instantaneous frequency increases linearly from ωmin = β (β > 0) toωmax = 2aT +β rps. Based on this observation it appears reasonable to definethe nominal bandwidth of this signal byB = aT/π Hz. The relationship betweenB and the bandwidth as defined by the signal Fourier spectrum is more readilyclarified in terms of the dimensionless parametersM = 2BT (the nominal time-bandwidth product) and r = ωmin/ωmax < 1. Using these parameters we putthe signal in the form

g(t) =

{A cos

[(π2M

)(t/T )2 + π M

1−r(tT

)]; 0 ≤ t ≤ T,

0 elsewhere.(2.218)

The FT of (2.218) can be expressed in terms of Fresnel integrals whose standardforms read

C(x) =

∫ x

0

cos(π2ξ2

)dξ, (2.219a)

S(x) =

∫ x

0

sin(π2ξ2

)dξ. (2.219b)

One then finds

G(ω) =AT

2√M

[e−iπ2M

(r−f1−r

′)2

[C{√M

1− f1− r

}+ iS{√M

1− f1− r

}

−C{√Mr − f1− r

} − iS{√Mr − f1− r

}]

+eiπ2M

(r+f1−r

′)2

[C{√M

1 + f

1− r′

} − iS{√M

1 + f

1− r′

}

−C{√Mr + f

1− r′

}+ iS{√Mr + f

1− r′

}]], (2.220)

where we have introduced the normalized frequency variable f ′ = ω(1−r)/2πB.Using the asymptotic forms of the Fresnel integrals for large arguments, i.e.,C(±∞) = ±1/2 and S(±∞) = ±1/2, we find that as the nominal time-bandwidth product (M/2) approaches infinity, the rather cumbersome expres-sion (2.220) assumes the simple asymptotic form

G (ω)M∼∞

⎧⎪⎪⎪⎪⎨

⎪⎪⎪⎪⎩

AT2

√2M e

−iπ2M(

r−f1−r

′)2

; r < f ′ < 1,

AT2

√2M e

iπ2M

(r+f1−r

′)2

;−1 < f ′ < −r,0 ; |f ′| > 1 and |f ′| < r.

(2.221)

Page 154: Signals and transforms in linear systems analysis

2.3 Modulation and Analytic Signal Representation 139

From (2.221) we see that the FT of g(t) approaches the constant AT/2√2/M

within the frequency band r < |f ′| < 1 and vanishes outside this range, exceptat the band edges (i.e., f ′ = ±1 and ±r) where it equals one-half this constant.Since g(t) is of finite duration it is asymptotically simultaneously bandlimitedand timelimited. Even though for any finite M the signal spectrum will not bebandlimited this asymptotic form is actually consistent with Parseval theorem.For applying Parseval formula to (2.221) we get

1

∫ ∞

−∞|G (ω)|2 dω =

A2

4

T

B2B =

(A2/2

)T (2.222)

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.80

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

frequency*(1-r)/B

abs(

G)*

2*sq

rt(B

*T)/

(A*T

)

BT=5

BT=10

BT=1000

Figure 2.33: Magnitude of the FT of a linear FM pulse for different time-bandwidth products

On the other hand we recognize the last term as the asymptotic form (i.e., forlarge ω0T ) of the total energy of a sinusoid of fixed frequency, amplitude A,and duration T . Thus apparently if the time-bandwidth product is sufficientlylarge we may approximate the energy of a constant amplitude sinusoid withvariable phase by the same simple formula (A2/2)T. Indeed this result actuallygeneralizes to signals of the form A cos [φ(t)]. How large must M be for (2.221)to afford a reasonable approximation to the signal spectrum? Actually quitelarge, as is illustrated by the plots in Fig. 2.33 for BT = 5, 10, and 1, 000, wherethe lower (nominal) band edge is defined by r = 0.2 and the magnitude of theasymptotic spectrum equals unity within .2 < f ′ < 1 and 1/2 at f ′ = .2 andf ′ = 1.

Page 155: Signals and transforms in linear systems analysis

140 2 Fourier Series and Integrals with Applications to Signal Analysis

2.3.3 Bandpass Representation

The construct of an analytic signal affords a convenient tool for describing theprocess of modulation of a low frequency (baseband) signal by a high frequencycarrier as well as the demodulation of a transmitted bandpass signal down tobaseband frequencies. We have already encountered a modulated signal in sim-plified form in connection with the frequency shifting properties of the FT in(2.154). Adopting now a more general viewpoint we take an arbitrary real sig-nal z (t) together with its Hilbert transform z (t) and a positive constant ω0 todefine two functions x (t) and y (t) as follows:

x (t) = z (t) cos (ω0t) + z(t) sin (ω0t) , (2.223a)

y (t) = −z (t) sin (ω0t) + z(t) cos (ω0t) , (2.223b)

which are easily inverted to yield

z (t) = x (t) cos (ω0t)− y(t) sin (ω0t) (2.224a)

z(t) = x (t) sin (ω0t) + y(t) cos (ω0t) . (2.224b)

Equations (2.223) and (2.224) constitute a fundamental set of relations thatare useful in describing rather general modulation and demodulation processes.In fact the form of (2.224a) suggests an interpretation of z (t) as a signalmodulated by a carrier of frequency ω0 a special case of which is represented bythe left side of (2.154). Comparison with (2.224a) yields x (t) = Af(t) cos(θ0)and y (t) = Af(t) sin(θ0). We note that in this special case x (t) and y(t) arelinearly dependent which need not be true in general.

Let us now suppose that the only datum at our disposal is the signal z (t)and that the carrier frequency ω0 is left unspecified. As far as the mathematicalrepresentation (2.223) and (2.224) is concerned it is, of course, perfectly validand consistent for any choice of (real ) ω0. However, if x (t) and y(t) in (2.223) areto represent baseband signals at the receiver resulting from the demodulation ofz (t) by the injection of a local oscillator with frequency ω0, then the bandwidthof x (t) and y(t) (centered at ω = 0) should certainly be less than 2ω0. A moreprecise interrelation between the constraints on signal bandwidth and carrierfrequency ω0 is readily deduced from the FT of x (t) and y(t). Denoting these,respectively, by X (ω) and Y (ω), we obtain using (2.223) and (2.207)

X (ω) = U(ω0 − ω)Z (ω − ω0) + U(ω + ω0)Z (ω + ω0) , (2.225a)

Y (ω) = iU(ω0 − ω)Z (ω − ω0)− iU(ω + ω0)Z (ω + ω0) . (2.225b)

On purely physical grounds we would expect Z (ω) to be practically zero abovesome finite frequency, say ωmax. If the bandwidth of X(ω) and Y (ω) is tobe limited to |ω| ≤ ω0, then ωmax − ω0 may not exceed ω0. This followsdirectly from (2.225) or from the graphical superposition of the spectra shownin Fig. 2.34. In other words if x (t) and y(t) are to represent baseband signals,we must have

ω0 ≥ ωmax/2. (2.226)

Page 156: Signals and transforms in linear systems analysis

2.3 Modulation and Analytic Signal Representation 141

When this constraint is satisfied the spectrum of z (t) may in fact extend downto zero frequency (as, e.g., in Fig. 2.31a) so that theoretically the spectra ofx (t) and y(t) are allowed to occupy the entire bandwidth |ω| ≤ ω0. Howeverin practice there will generally also be a lower limit on the band occupancyof Z (ω), say ωmin. Thus the more common situation is that of a bandpassspectrum illustrated in Fig. 2.34 wherein the nonzero spectral energy of z (t)occupies the band ωmin < ω < ωmax for positive frequencies and the band-ωmax < ω < −ωmin for negative frequencies.

ω

0ω−

−−

0min ω0ωω

minω

ω

− −

0

0

ω minω

ω ω

0ωmin

ωmin ωmin

ω− 0ωmax

ωmax ωmax

ω

ω−

0ω maxω−

Z( U(

X(

ω

ω

ω

)

)0)−

ω

ω

ωZ(

Z(

0)+

ω

0 ω+U( )

)

ω

Figure 2.34: Bandpass and demodulated baseband spectra

In the case depicted ωmin < ω0 < ωmax and ω0 − ωmin > ωmax − ω0.The synthesis of X (ω) from the two frequency-shifted sidebands followsfrom (2.225a) resulting in a total band occupancy of 2 |ω0 − ωmin|. It iseasy to see from (2.225b) that Y (ω) must occupy the same bandwidth. Ob-serve that shifting ω0 closer to ωmin until ωmax − ω0 > ω0 − ωmin results in a

Page 157: Signals and transforms in linear systems analysis

142 2 Fourier Series and Integrals with Applications to Signal Analysis

total band occupancy of 2 |ωmax − ω0| and that the smallest possible basebandbandwidth is obtained by positioning ω0 midway between ωmax and ωmin.

The two real baseband signals x (t) and y(t) are referred to as the inphaseand quadrature signal components. It is convenient to combine them into thesingle complex baseband signal

b(t) = x(t) + iy(t). (2.227)

The analytic signal w(t) = z(t) + iz(t) follows from a substitution of (2.224a)and (2.224b)

w(t) = x (t) cos (ω0t)− y(t) sin (ω0t)

+i[x (t) sin (ω0t) + y(t) cos (ω0t)]

= [x(t) + iy(t)] eiω0t = b (t) eiω0t. (2.228)

The FT of (2.228) reads

W (ω) = X (ω − ω0) + iY (ω − ω0) = B (ω − ω0) (2.229)

or, solving for B (ω) ,

B (ω) =W (ω + ω0) = 2U (ω + ω0)Z(ω + ω0) = X (ω) + iY (ω) . (2.230)

In view of (2.224a) the real bandpass z(t) signal is given by the real partof (2.228), i.e.,

z(t) = e{b (t) eiω0t}. (2.228*)

Taking the FT we get

Z(ω) =1

2[B (ω − ω0) +B∗ (−ω − ω0)] , (2.228**)

which reconstructs the bandpass spectrum in terms of the baseband spectrum.As the preceding formulation indicates, given a bandpass signal z (t) , the

choice of ω0 at the receiver effectively defines the inphase and quadraturecomponents. Thus a different choice of (local oscillator) frequency, say ω1,ω1 = ω0 leads to the representation

z (t) = x1 (t) cos (ω1t)− y1 (t) sin (ω1t) , (2.229a)

z (t) = x1 (t) sin (ω1t) + y1 (t) cos (ω1t) , (2.229b)

wherein the x1 (t) and y1 (t) are the new inphase and quadrature components.The relationship between x (t) , y (t) and x1 (t) and y1 (t) follows upon equat-ing (2.229) to (2.224):[

cos (ω0t) − sin (ω0t)sin (ω0t) cos (ω0t)

] [x (t)y (t)

]=

[cos (ω1t) − sin (ω1t)sin (ω1t) cos (ω1t)

] [x1 (t)y1 (t)

],

which yields[x (t)y (t)

]=

[cos [(ω0 − ω1) t] sin [(ω0 − ω1) t]− sin [(ω0 − ω1) t] cos [(ω0 − ω1) t]

] [x1 (t)y1 (t)

]. (2.230)

Page 158: Signals and transforms in linear systems analysis

2.3 Modulation and Analytic Signal Representation 143

The linear transformations defined by the 2 × 2 matrices in (2.224), (2.229),and (2.230) are all orthogonal so that

z2 (t) + z2 (t) = x2 (t) + y2 (t) = x21 (t) + y21 (t) = |b (t)|2 = r2 (t) . (2.231)

This demonstrates directly that the analytic signal and the complex basebandsignal have the same envelope r (t) which is in fact independent of the frequencyof the reference carrier. We shall henceforth refer to r(t) as the signal enve-lope. Unlike the signal envelope, the phase of the complex baseband signal doesdepend on the carrier reference. Setting

θ (t) = tan−1 x (t)

y (t), θ1 (t) = tan−1 x1 (t)

y1 (t), (2.232)

we see that with a change in the reference carrier the analytic signal undergoesthe transformation

w (t) = r (t) eiθ(t)eiω0t

= r (t) eiθ1(t)eiω1t (2.233)

or, equivalently, that the two phase angles transform in accordance with

θ (t) + ω0t = θ1 (t) + ω1t. (2.234)

It should be noted that in general the real and imaginary parts of a complexbaseband signal need not be related by Hilbert transforms. In fact suppose x (t)and y (t) are two arbitrary real signals, bandlimited to |ω| < ωx and |ω| < ωy,respectively. Then, as may be readily verified, for any ω0 greater than ωx/2and ωy/2 the Hilbert transform of the bandpass signal z (t) defined by (2.224a)is given by (2.224b).

2.3.4 Bandpass Representation of Random Signals*

In the preceding discussion it was tacitly assumed that the signals are deter-ministic. The notion of an analytic signal is equally useful when dealing withstochastic signals. For example, if z (t) is a real wide-sense stationary stochas-tic process, we can always append its Hilbert transform to form the complexstochastic process

w (t) = z (t) + iz (t) . (2.235)

By analogy with (2.206) we shall refer to it as an analytic stochastic process. Aswe shall show in the sequel its power spectrum vanishes for negative frequencies.First we note that in accordance with (2.207) the magnitude of the transferfunction that transforms z (t) into z (t) is unity. Hence the power spectrum aswell as the autocorrelation function of z (t) are the same as that of z (t) , i.e.,

〈z (t+ τ ) z (t)〉 ≡ Rzz (τ ) = 〈z (t+ τ ) z (t)〉 = Rzz (τ ) . (2.236)

Page 159: Signals and transforms in linear systems analysis

144 2 Fourier Series and Integrals with Applications to Signal Analysis

The cross-correlation between z (t) and its Hilbert transform is then

Rzz (τ) = 〈z (t+ τ ) z (t)〉 = 1

π

∫ ∞

−∞

〈z(τ

′)z (t)〉

t+ τ − τ ′ dτ′

=1

π

∫ ∞

−∞

Rzz

′ − t)

t+ τ − τ ′ dτ′=

1

π

∫ ∞

−∞

Rzz (ξ)

τ − ξ dξ. (2.237)

The last expression states that the cross-correlation between a stationarystochastic process and its Hilbert transform is the Hilbert transform of theautocorrelation function of the process. In symbols

Rzz (τ ) = Rzz (τ ) . (2.238)

Recall that the Hilbert transform of an even function is an odd function andconversely. Thus, since the autocorrelation function of a real stochastic processis always even, Rzz (τ ) is odd. Therefore we have

Rzz (τ ) ≡ Rzz (−τ ) = −Rzz (τ) . (2.239)

The autocorrelation function of w (t) then becomes

Rww (τ ) = 〈w (t+ τ )∗w (t)〉 = 2 [Rzz (τ ) + iRzz (τ)]

= 2[Rzz (τ ) + iRzz (τ)

]. (2.240)

With

Rzz (τ )F⇐⇒ Szz (ω) (2.241)

we have in view of (2.238) and (2.207)

Rzz (τ) = Rzz (τ)F⇐⇒ −iSzz (ω) sign (ω) . (2.242)

Denoting the spectral density of w (t) by Sww (ω) , (2.240) together with (2.241)and (2.242) gives

Sww (ω) =

{4Szz (ω) ; ω > 0,

0; ω < 0.(2.243)

so that the spectral density of the analytic complex process has only positivefrequency content. The correlation functions of the baseband (inphase) x (t) and(quadrature) process y (t) follow from (2.223). By direct calculation we get

Rxx (τ) = 〈{z (t+ τ) cos [ω0 (t+ τ )] + z (t+ τ ) sin [ω0 (t+ τ)]}{z (t) cos (ω0t) + z (t) sin (ω0t)}〉

= Rzz (τ) cos (ω0τ) +Rzz (τ) sin (ω0τ ) (2.244a)

Ryy (τ) = 〈{−z (t+ τ ) sin [ω0 (t+ τ )] + z (t+ τ) cos [ω0 (t+ τ )]}{−z (t) sin (ω0t) + z (t) cos (ω0t)}〉

= Rzz (τ) cos (ω0τ) +Rzz (τ) sin (ω0τ ) = Rxx (τ) (2.244b)

Page 160: Signals and transforms in linear systems analysis

2.3 Modulation and Analytic Signal Representation 145

Rxy (τ) = 〈{z (t+ τ) cos [ω0 (t+ τ )] + z (t+ τ ) sin [ω0 (t+ τ)]}{−z (t) sin (ω0t) + z (t) cos (ω0t)}〉

= −Rzz (τ ) cos (ω0τ ) +Rzz (τ ) sin (ω0τ )

= −Rzz (τ ) cos (ω0τ ) +Rzz (τ ) sin (ω0τ ) . (2.244c)

Recall that for any two real stationary processes Rxy(τ ) = Ryx(−τ ). Usingthis relation in (2.244c) we get

Ryx(τ ) = −Rxy(τ ). (2.245)

Also according to (2.244a) and (2.244b) the autocorrelation functions of the in-phase and quadrature components of the stochastic baseband signal are identicaland consequently so are the corresponding power spectra. These are

Sxx (ω) = Syy (ω) =

1

2[1− sign (ω − ω0)]Szz (ω − ω0)

+1

2[1 + sign (ω + ω0)]Szz (ω + ω0) . (2.246)

From (2.244c) we note that Rxy(0) ≡ 0 but that in general Rxy(τ ) = 0 whenτ = 0. The FT of this quantity, i.e., the cross-spectrum, is

Sxy (ω) =i

2[sign (ω − ω0)− 1]Szz (ω − ω0)

+i

2[sign (ω + ω0) + 1]Szz (ω + ω0) . (2.247)

By constructing a mental picture of the relative spectral shifts dictated by(2.247) it is not hard to see that the cross spectrum vanishes identically (or,equivalently, Rxy(τ ) ≡ 0) when Szz (ω) , the spectrum of the band-pass process,is symmetric about ω0.

Next we compute the autocorrelation function Rbb(τ )of the complex stochas-tic baseband process b(t) = x(t) + iy(t). Taking account of Rxx(τ ) = Ryy (τ )and (2.245) we get

Rbb(τ ) = 〈b(t+ τ )b∗(t)〉 = 2[Rxx(τ ) + iRyx(τ )]. (2.248)

In view of (2.240) and (2.228) the autocorrelation function of the analyticbandpass stochastic process is

Rww(τ ) = 2[Rzz (τ ) + iRzz (τ )

]= Rbb(τ )e

iω0τ . (2.249)

The autocorrelation function of the real bandpass process can then be repre-sented in terms of the autocorrelation function of the complex baseband processas follows:

Rzz (τ ) =1

2e{Rbb(τ )eiω0τ

}. (2.250)

Page 161: Signals and transforms in linear systems analysis

146 2 Fourier Series and Integrals with Applications to Signal Analysis

With the definition

Rbb (τ )F⇐⇒ Sbb (ω)

the FT of (2.250) reads

Szz(ω) =1

4[Sbb(ω + ω0) + Sbb(−ω − ω0)] . (2.251)

Many sources of noise can be modeled (at least on a limited timescale) as sta-tionary stochastic processes. The spectral distribution of such noise is usuallyof interest only in a relatively narrow pass band centered about some frequency,say ω0. The measurement of the power spectrum within a predetermined passband can be accomplished by using a synchronous detector that separates theinphase and quadrature channels, as shown in Fig. 2.35.

Acos(w0 t)

−Asin(w0 t)

Ax(t) / 2

Ay(t) / 2

z(t)

z(t)

s(t) BPF

LPF

LPF

Figure 2.35: Synchronous detection

The signal s (t) is first bandpass filtered to the bandwidth of interest andthen split into two separate channels each of which is heterodyned with a localoscillator with a 90 degree relative phase shift. The inphase and quadraturecomponents are obtained after lowpass filtering to remove the second harmoniccontribution generated in each mixer. To determine the power spectral densityof the bandpass signal requires measurement of the auto and cross spectra ofx (t) and y (t). The power spectrum can then be computed with the aid of(2.246) and (2.247) which give

Szz (ω + ω0) =1

2[Sxx (ω)− iSxy (ω)] . (2.252)

This procedure assumes that the process s (t) is stationary so that Sxx (ω) =Syy (ω) . Unequal powers in the two channels would be an indication of non-stationarity on the measurement timescale. A rather common form of nonsta-tionarity is the presence of an additive deterministic signal within the bandpassprocess.

Page 162: Signals and transforms in linear systems analysis

2.3 Modulation and Analytic Signal Representation 147

A common model is a rectangular bandpass power spectral density. Assumingthat ω0 is chosen symmetrically disposed with respect to the bandpass powerspectrum, the power spectra corresponding to the analytic and basebandstochastic processes are shown in Fig. 2.36. In this case the baseband autocor-relation function is purely real and equal to

Rbb (τ) = N0sin(2πBτ)

πτ= 2Rxx (τ ) = 2Ryy (τ ) (2.253)

Szz

Sww

Sbb

ω

ω

ω0 Bπω0 2+Bπω 20 −ω0− Bπω0 2+−Bω0 2π−−

Bω0 2π− ω0 Bω0 2π+

B2πB2π−ω

4/0N

N0

N0

4/N0

Figure 2.36: Bandpass-to-baseband transformation for a symmetric power spec-trum

while Rxy (τ ) ≡ 0. What happens when the local oscillator frequency is setto ω = ω1 = ω0 − Δω, i.e., off the passband center by Δω? In that case thebaseband power spectral density will be displaced by Δω and equal to

Sbb (ω) =

{N0 ; − 2πB +Δω < ω < 2πB +Δω,

0 ; otherwise.(2.254)

The baseband autocorrelation function is now the complex quantity

Rbb (τ ) = eiτΔωN0sin(2πBτ)

πτ. (2.255)

In view of (2.248) the auto and crosscorrelation functions of the inphase andquadrature components are

Rxx (τ) = Ryy (τ ) = cos (τΔω)N0sin(2πBτ)

2πτ, (2.256a)

Ryx (τ) = sin (τΔω)N0sin(2πBτ)

2πτ. (2.256b)

Page 163: Signals and transforms in linear systems analysis

148 2 Fourier Series and Integrals with Applications to Signal Analysis

The corresponding spectrum Sxx (ω) = Syy (ω) occupies the band |ω| ≤ 2πB+Δω. Unlike in the symmetric case, the power spectrum is no longer flat butexhibits two steps caused by the spectral shifts engendered by cos (τΔω), asshown in Fig. 2.37.

SyySxx =

Δ+B2

ω

ω ω ωπ π π π ω

2/N0

4/N0

Δ−B2Δ−− B2 Δ+− B2

Figure 2.37: Baseband I&Q power spectra for assymmetric local oscillator fre-quency positioning

2.4 Fourier Transforms and Analytic FunctionTheory

2.4.1 Analyticity of the FT of Causal Signals

Even though both the direct and the inverse FT have been initially definedstrictly for functions of a real variables one can always formally replace t and(or) ω by complex numbers and, as long as the resulting integrals converge,define the signal f (t) and (or) the frequency spectrum F (ω) as functions of acomplex variable. Those unfamiliar with complex variable theory should consultthe Appendix, and in particular A.4.

Let us examine the analytic properties of the FT in the complex domain ofa causal signal. To this end we replace ω by the complex variable z = ω + iδand write

F (z) =

∫ ∞

0

f (t) e−iztdt, (2.257)

wherein F (ω) is F (z) evaluated on the axis of reals. Furthermore let us as-sume that ∫ ∞

0

|f (t)| dt <∞. (2.258)

To put the last statement into the context of a physical requirement let ussuppose that the signal f (t) is the impulse response of a linear time-invariantsystem. In that case, as will be shown in 3.1.4, absolute integrability in the senseof (2.258) is a requirement for system stability. Using (2.257) we obtain in viewof (2.258) for all Im z = δ ≤ 0 the bound

∣∣∣∣∫ ∞

0

f (t) e−iztdt∣∣∣∣ ≤

∫ ∞

0

|f (t)| eδtdt <∞. (2.259)

Page 164: Signals and transforms in linear systems analysis

2.4 Fourier Transforms and Analytic Function Theory 149

From this follows (see Appendix) that F (z) is an analytic function of the com-plex variable z in the closed lower half of the complex z plane, i.e., Im z ≤ 0.Moreover, for Im z ≤ 0,

lim|z|→∞

F (z)→ 0 (2.260)

as we see directly from (2.259) by letting δ approach −∞. In other words,the FT of the impulse response of a causal linear time-invariant system is ananalytic function of the complex frequency variable z in the closed lower halfplane. This feature is of fundamental importance in the design and analysis offrequency selective devices (filters).

2.4.2 Hilbert Transforms and Analytic Functions

A direct consequence of the analyticity of F (z) is that the real and imaginaryparts of F (ω) may not be specified independently. In fact we have alreadyestablished in 2.2.6 that for a causal signal they are linearly related through theHilbert transform. The properties of analytic functions afford an alternativederivation. For this purpose consider the contour integral

IR (ω0) =

ΓR

F (z)

z − ω0dz, (2.261)

ω

δ

R

R− Rεω0 − εω0 +

CR

θεθR

Figure 2.38: Integration contour ΓR for the derivation of the Hilbert transforms

wherein ω0 is real, taken in the clockwise direction along the closed path ΓR asshown in Fig. 2.38. We note that ΓR is comprised of the two linear segments(−R,ω0 − ε), (ω0 + ε,R) along the axis of reals, the semicircular contour cεofradius ε with the circle centered at ω = ω0, and the semicircular contour CR

Page 165: Signals and transforms in linear systems analysis

150 2 Fourier Series and Integrals with Applications to Signal Analysis

of radius R in the lower half plane with the circle centered at ω = 0. Sincethe integrand in (2.261) is analytic within ΓR, we have IR (ω0) ≡ 0, so thatintegrating along each of the path-segments indicated in Fig. 2.38 and addingthe results in the limit as ε→ 0 and R→∞, we obtain

0 = limε→0, R→∞

(∫ ω0−ε

−R

F (ω)

ω − ω0dω +

∫ R

ω0+ε

F (ω)

ω − ω0dω

)

+limε→0

F (z)

z − ω0dz + lim

R→∞

CR

F (z)

z − ω0dz. (2.262)

On CR we set z = ReiθR so that dz = iReiθRdθR and we have∣∣∣∣∣∣

CR

F (z)

z − ω0dz

∣∣∣∣∣∣=

∣∣∣∣∣

∫ −π

0

F(ReiθR

)

ReiθR − ω0RdθR

∣∣∣∣∣

so that in view of (2.260) in the limit of large R the last integral in (2.262) tendsto zero. On cε we set z − ω0 = εeiθε and substituting into the third integral in(2.262) evaluate it as follows:

limε→0

F (z)

z − ω0dz = lim

ε→0

∫ 0

−πF

(ω0 + εeiθε

)idθ = iπF (ω0) .

Now the limiting form of the first two integrals in (2.262) are recognized as thedefinition a CPV integral so that collecting our results we have

0 = P

∫ ∞

−∞

F (ω)

ω − ω0dω + iπF (ω0). (2.263)

By writing F (ω) = R(ω) + iX(ω) and similarly for F (ω0), substitutingin (2.263), and setting the real and the imaginary parts to zero we obtain

X(ω0) =1

πP

∫ ∞

−∞

R (ω)

ω − ω0dω, (2.264a)

R(ω0) = − 1

πP

∫ ∞

−∞

R (ω)

ω − ω0dω, (2.264b)

which, apart from a different labeling of the variables, are the Hilbert Transformsin (2.173a) and (2.173b). Because the real and imaginary parts of the FTevaluated on the real frequency axis are not independent it should be possibleto determine the analytic function F (z) either from R(ω) of from X (ω) . Toobtain such formulas let z0 be a point in the lower half plane (i.e., Im z0 < 0)and apply the Cauchy integral formula

F (z0) = − 1

2πi

ΓR

F (z)

z − z0 dz (2.265)

Page 166: Signals and transforms in linear systems analysis

2.4 Fourier Transforms and Analytic Function Theory 151

ω

δ

R− R

CR

•z0

Figure 2.39: Integration contour for the evaluation of Eq. (2.265)

taken in the counterclockwise direction over the contourΓR as shown in Fig. 2.39and comprised of the line segment (−R,R) and the semicircular contour CR ofradius R. Again because of (2.260) the contribution over CR vanishes as R isallowed to approach infinity so that (2.265) may be replaced by

F (z0) = − 1

2πi

∫ ∞

−∞

F (ω)

ω − z0 dω

= − 1

2πi

∫ ∞

−∞

R (ω)

ω − z0 dω −1

∫ ∞

−∞

X (ω)

ω − z0 dω. (2.266)

In the last integral we now substitute for X (ω) its Hilbert Transform from(2.264a) to obtain

− 1

∫ ∞

−∞

X (ω)

ω − z0 dω = − 1

2π2

∫ ∞

−∞dωP

∫ ∞

−∞

R (η)

(ω − z0) (η − ω)dη

=1

2π2

∫ ∞

−∞R (η) dηP

∫ ∞

−∞

(ω − z0) (ω − η) .(2.267)

The last CPV integral over ω is evaluated using the calculus of residues asfollows:

P

∫ ∞

−∞

(ω − z0) (ω − η) =

ΓR

dz

(z − z0) (z − η) − iπ1

η − z0 , (2.268)

where ΓR is the closed contour in Fig. 2.38 and where the location of the simplepole at ω0 is now designated by η. The contour integral in (2.268) is performedin the clockwise direction and the term −iπ/ (η − z0) is the negative of thecontribution from the integration over the semicircular contour cε. The only

Page 167: Signals and transforms in linear systems analysis

152 2 Fourier Series and Integrals with Applications to Signal Analysis

contribution to the contour integral arises from the simple pole at z = z0 whichequals −i2π/ (z0 − η) resulting in a net contribution in (2.268) of iπ/ (η − z0) .Substituting this into (2.267) and then into (2.266) gives the final result

F (z) =i

π

∫ ∞

−∞

R (η)

η − z dη, (2.269)

where we have replaced the dummy variable ω by η and z0 by z ≡ ω + iδ.Unlike (2.257), the integral (2.269) defines the analytic function F (z) only inthe open lower half plane, i.e., for Im z < 0. On the other hand, one wouldexpect that in the limit as δ → 0, F (z)→ F (ω) . Let us show that this limit isactually approached by the real part. Thus using (2.269) we get with z = ω+ iδ

ReF (z) = R(ω, δ) =

∫ ∞

−∞R (η)

−δπ[(η − ω)2 + δ2

]dη. (2.270)

The factor multiplying R (η) in integrand will be recognized as the delta functionkernel in (1.250) so that limR(ω, δ) as −δ → 0 is in fact R (ω) .

2.4.3 Relationships Between Amplitude and Phase

We again suppose that F (ω) is the FT of a causal signal. Presently we write itin terms of its amplitude and phase

F (ω) = A(ω)eiθ(ω) (2.271)

and setA(ω) = eα(ω). (2.272)

Taking logarithms we have

lnF (ω) = α (ω) + iθ (ω) . (2.273)

Based on the results of the preceding subsection it appears that if lnF (ω) canbe represented as an analytic function in the lower half plane one should beable to employ Hilbert Transforms to relate the phase to the log amplitude ofthe signal FT. From the nature of the logarithmic function we see that this isnot possible for an arbitrary FT of a causal signal but only for signals whoseFT, when continued analytically into the complex z-domain via formula (2.257)or (2.269), has no zeros in the lower half of the z-plane. Such transforms aresaid to be of the minimum-phaseshift type. If f (t) is real so that A(ω) andθ (ω) is, respectively, an even and an odd function of ω, we can express θ (ω) interms of α (ω) using contour integration, provided the FT decays at infinity inaccordance with

|F (ω)|ω→∞

∼ O(|ω|−k

)for some k > 0. (2.274)

Page 168: Signals and transforms in linear systems analysis

2.4 Fourier Transforms and Analytic Function Theory 153

ω

δ

R

R− Rεω0 − εω0 +

CR

+cε

εω0 −− εω0 +−• •ω0ω0−

−cε

Figure 2.40: Integration contour for relating amplitude to phase

For this purpose we consider the integral

IR =

ΓR

lnF (z)

ω20 − z2

dz (2.275)

taken in the clockwise direction over the closed contour ΓR comprised of thethree linear segments (−R,−ω0 − ε) , (−ω0 + ε, ω0 − ε) ,(ω0 + ε,R) , the twosemicircular arcs cε− and cε+ each with radius ε, and the semicircular arc CRwith radius R, as shown in Fig. 2.40. By assumption F (z) is free of zeros withinthe closed contour so that IR ≡ 0. In the limit as R→∞ and ε→ 0 the integralover the line segments approaches a CPV integral while the integrals cε and c

each approach iπ times the residue at the respective poles. The net result canthen be written as follows:

0 = P

∫ ∞

−∞

lnF (ω)

ω20 − ω2

dω + iπlnF (−ω0)

2ω0+ iπ

lnF (ω0)

−2ω0

+ limR→∞

CR

lnF (z)

ω20 − z2

dz. (2.276)

In view of (2.274) for sufficiently large R the last integral may be bounded asfollows:

∣∣∣∣∣∣

CR

lnF (z)

ω20 − z2

dz

∣∣∣∣∣∣≤ constant×

∫ π

0

k lnR

|ω20 −R2ei2θ|Rdθ. (2.277)

Since lnR < R for R > 1, the last integral approaches zero as R → ∞ so thatthe contribution from CR in (2.276) vanishes. Substituting from (2.273) into

Page 169: Signals and transforms in linear systems analysis

154 2 Fourier Series and Integrals with Applications to Signal Analysis

the first three terms on the right of (2.276) and taking account of the fact thatα (ω) is even while θ (ω) is odd, one obtains

0 = P

∫ ∞

−∞

α (ω) + iθ (ω)

ω20 − ω2

dω + iπα (ω0)− iθ (ω0)

2ω0+ iπ

α (ω0) + iθ (ω0)

−2ω0

Observe that the terms on the right involving α (ω0) cancel while the integrationinvolving θ (ω) vanishes identically. As a result we can solve for θ (ω0) with theresult

θ (ω0) =2ω0

πP

∫ ∞

0

α (ω)

ω2 − ω20

dω. (2.278)

Proceeding similarly with the aid of the contour integral

IR =

ΓR

lnF (z)

z (ω20 − z2)

dz (2.279)

one obtains the formula

α(ω0) = α(0)− 2ω20

πP

∫ ∞

0

θ (ω)

ω (ω2 − ω20)dω. (2.280)

It is worth noting that the assumed rate of decay at infinity in (2.274) iscrucial to the vanishing of the contribution over the semicircular contour CR inFig. 2.40 and hence the validity of (2.278). Indeed if the decay of the FT is toorapid the contribution from CR will not vanish and can in fact diverge as, e.g.,for A(ω) = exp(−ω2). Note that in this case (2.278) also diverges. This meansthat for an arbitrary A (ω) one cannot find a θ(ω) such that A (ω) exp−iθ(ω)has a causal inverse, i.e., an f(t) that vanishes for negative t. What propertiesmust A (ω) possess for this to be possible? An answer can be given if A (ω)is square integrable over (−∞,∞). In that case the necessary and sufficientcondition for a θ(ω) to exist is the convergence of the integral

∫ ∞

−∞

|lnA(ω)|1 + ω2

dω <∞,

which is termed the Paley–Wiener condition [15]. Note that it precludes A(ω)from being identically zero over any finite segment of the frequency axis.

2.4.4 Evaluation of Inverse FT Using Complex VariableTheory

The theory of functions of a complex variable provides a convenient tool forthe evaluation of inverse Fourier transforms. The evaluation is particularlystraightforward when the FT is a rational function. For example, let us evaluate

f(t) =1

∫ ∞

−∞

eiωtdω

ω2 + iω + 2. (2.281)

Page 170: Signals and transforms in linear systems analysis

2.4 Fourier Transforms and Analytic Function Theory 155

Ám

Âe

i

−2i•

w

w

P

Figure 2.41: Deformation of integration path within the strip of analyticity

The only singularities of F (ω) = 1/(ω2 + iω+2) in the complex ω plane arepoles corresponding to the two simple zeros of ω2+ iω+2 = (ω− i)(ω+2i) = 0,namely ω1 = i and ω2 = −2i. Therefore the integration path in (2.281) maybe deformed away from the real axis into any path P lying within the strip ofanalyticity bounded by−2 < Imω < 1, as depicted in Fig. 2.41. The exponentialmultiplying F (ω) decays for t > 0 in the upper half plane (Imω > 0) and fort < 0 in the lower half plane (Imω < 0) . For t > 0 we form the contour integral

IR =

∮eiωtF (ω)

2π(2.282)

taken in the counterclockwise direction over the closed path formed by the linearsegment (−R,R) along P and the circular contour CR+ lying in the upper halfplane, as shown in Fig. 2.42. The residue evaluation at the simple pole at ω = igives IR = e−t/3. As R is allowed to approach infinity the integral over thelinear segment becomes just f(t). Therefore

e−t/3 = f (t) + limR→∞

CR+

eiωtF (ω)dω

2π.

Since F (ω)→ 0 as ω →∞, and the exponential decays on CR+ Jordan lemma(see Appendix A) applies so that in the limit the integral over CR+ vanishesand we obtain f (t) = e−t/3 ; t > 0. When t < 0 the contour integral (2.282)is evaluated in the clockwise direction over the closed path in Fig. 2.43 with acircular path CR− in the lower half plane. The residue evaluation at the simplepole at ω = −2i now gives IR = e2t/3 so that

e2t/3 = f (t) + limR→∞

CR−

eiωtF (ω)dω

2π.

Page 171: Signals and transforms in linear systems analysis

156 2 Fourier Series and Integrals with Applications to Signal Analysis

R− R

Ám

CR+

Âe

•iw

w

2i−

Figure 2.42: Integration contour for t > 0

R- R

Ám

CR-

Âe

·

· iw

w

2i-

Figure 2.43: Integration contour for t < 0

Since now the exponential decays in the lower half plane, Jordan’s lemma againguarantees that the limit of the integral over CR− vanishes. Thus the final resultreads

f(t) =

{e−t/3 ; t ≥ 0,e2t/3 ; t ≤ 0.

(2.283)

This procedure is readily generalized to arbitrary rational functions. Thus sup-pose F (ω) = N (ω) /D(ω) with N(ω) and D (ω) polynomials in ω. We shallassume that2 degree N (ω) < degree D (ω) so that F (ω) vanishes at infinity,

2If N and D are of the same degree, then the FT contains a delta function which can beidentified by long division to obtain N/D =constant+N/D, with degree N <degree D. The

inverse FT then equals constant× δ (t) + F−1(N/D

).

Page 172: Signals and transforms in linear systems analysis

2.4 Fourier Transforms and Analytic Function Theory 157

as required by the Jordan lemma. If D (ω) has no real zeros, then proceedingas in the preceding example we find that the inverse FT is given by the residuesums

f (t) =

⎧⎨

⎩i∑

k;Imωk>0res

[N(ω)D(ω)e

iωt]

ω=ωk

; t ≥ 0,

−i∑k;Imωk<0res

[N(ω)D(ω)e

iωt]

ω=ωk

; t ≤ 0.(2.284)

For example, suppose F (ω) = i/(ω + 2i)2(ω − i) which function has a doublepole at ω = −2i and a simple pole at ω = i. For t ≥ 0 the contribution comesfrom the simple pole in the upper half plane and we get

f (t) = iie−t

(i + 2i)2=e−t

9; t ≥ 0.

For t ≤ 0 the double pole in the lower half plane contributes. Hence

f (t) = −i i ddω

eiωt

(ω − i) |ω=−2i =(ω − i) iteiωt − eiωt

(ω − i)2 |ω=−2i

=1− 3t

9e−2t ; t ≤ 0.

The case of D (ω) having real roots requires special consideration. First, ifthe order of any one of the zeros is greater than 1, the inverse FT does notexist.3 On the other hand, as will be shown in the sequel, if the zeros aresimple the inverse FT can computed by suitably modifying the residue formu-las (2.284). Before discussing the general case we illustrate the procedure bya specific example. For this purpose consider the time function given by theinversion formula

f(t) =1

2πP

∫ ∞

−∞

eiωt

(ω2 − 4)(ω2 + 1)dω, (2.285)

where F (ω) = 1/ (ω2 − 4)(ω2 + 1) has two simple zeros at ω = ±i and twoat ω = ±2 with the latter forcing a CPV interpretation of the integral. Beforecomplementing (2.285) with a suitable contour integral it may be instructive tomake the CPV form of (2.285) explicit. Thus

f(t) = limε→0,R→∞

IR,ε (2.286)

with

IR,ε =1

{∫ −2−ε

−R+

∫ 2−ε

−2+ε

+

∫ R

2+ε

}eiωt

(ω2 − 4)(ω2 + 1)dω. (2.287)

3The corresponding time functions are unbounded at infinity and are best handled usingLaplace transforms.

Page 173: Signals and transforms in linear systems analysis

158 2 Fourier Series and Integrals with Applications to Signal Analysis

To evaluate (2.286) by residues we define a contour integral

IR,ε =

Γ

eiωtF (ω)dω

2π(2.288)

over a closed path Γ that includes IR,ε as a partial contribution. For t > 0the contour Γ is closed with the semicircle of radius R and includes the twosemicircles cε+ and cε−of radius ε centered, respectively, at ω = 2 and ω = −2,as shown in Fig. 2.44.

R- R

Ám

CR+

Âe

·

· i

w

w

i-

· ·ee

e-2 e+2e--2 e+-2

Figure 2.44: Integration contour for CPV integral

Writing (2.287) out in terms of its individual contributors we have

IR,ε = IR,ε +

cε−

eiωtF (ω)dω

2π+

cε+

eiωtF (ω)dω

2π+

CR+

eiωtF (ω)dω

2π. (2.289)

Taking account of the residue contribution at ω = i we get for the integral overthe closed path

IR,ε = ieiωt

(ω2 − 4)(2ω)|ω=i = −e

−t

10.

As ε → 0 the integrals over cε− and cε− each contribute −2πi times one-halfthe residue at the respective simple pole (see Appendix A) and a R → ∞ theintegral over CR+ vanishes by the Jordan lemma. Thus taking the limits andsumming all the contributions in (2.289) we get

−e−t

10= f(t)− i

[1

2

eiωt

(2ω)(ω2 + 1)|ω=−2 +

1

2

eiωt

(2ω)(ω2 + 1)|ω=2

]

= f(t) +1

20sin(2t)

and solving for f(t),

f(t) = − 1

20sin(2t)− e−t

10; t > 0. (2.290)

Page 174: Signals and transforms in linear systems analysis

2.5 Time-Frequency Analysis 159

For t < 0 we close the integration path (−R,−2− ε) + cε− + (−2 + ε, 2− ε) +(2+ ε,R) in Fig. 2.44 with a semicircular path in the lower half plane and carryout the integration in the clockwise direction. Now Γ encloses in addition to thepole at the pole at ω = −i, the two poles at ω = ±2. Hence

IR,ε = −i eiωt

(ω2 − 4)(2ω)|ω=−i − i

[eiωt

(2ω)(ω2 + 1)|ω=−2 +

eiωt

(2ω)(ω2 + 1)|ω=2

]

= −e−t

10+

1

10sin(2t).

Summing the contributions as in (2.289) and taking limits we get

− et

10+

1

10sin(2t) = f(t)− 1

10sin(2t).

Solving for f(t) and combining with (2.290) we have for the final result

f(t) = − 1

20sin(2t) sign(t)− e−|t|

10. (2.291)

Note that we could also have used an integration contour with the semicirclescε− and cε+ in the lower half plane. In that case we would have picked up theresidue at ω = ±2 for t > 0.

Based on the preceding example it is not hard to guess how to generalize(2.284) when D(ω) has simple zeros for real ω. Clearly for every real zero at

ω = ωk we have to add the contribution sign(t) (i/2) res[N(ω)D(ω)e

iωt]|ω=ωk

.

Hence we need to replace (2.284) by

f (t) = (i/2) sign(t)∑

k;Imωk=0

res

[N (ω)

D (ω)eiωt

]|ω=ωk

+

⎧⎨

⎩i∑

k;Im ωk>0res

[N(ω)D(ω) e

iωt]

ω=ωk

; t ≥ 0,

−i∑k;Im ωk<0res

[N(ω)D(ω) e

iωt]

ω=ωk

; t ≤ 0.(2.292)

For example, for F (ω) = iω/(ω20 − ω2), the preceding formula yields f(t) =

12sign(t) cosω0t and setting ω0 = 0 we find that the FT of sign(t) is 2/iω, inagreement with our previous result.

2.5 Time-Frequency Analysis

2.5.1 The Uncertainty Principle

A common feature shared by simple idealized signals such as rectangular, tri-angular, or Gaussian pulses is the inverse scaling relationship between signalduration and its bandwidth. Qualitatively a relationship of this sort actuallyholds for a large class of signals but its quantitative formulation ultimately

Page 175: Signals and transforms in linear systems analysis

160 2 Fourier Series and Integrals with Applications to Signal Analysis

depends on the nature of the signal as well as on the definition of signalduration and bandwidth. A useful definition which also plays a prominent rolenot only in signal analysis but also in other areas where Fourier transforms arepart of the basic theoretical framework is the so-called rms signal duration σt,defined by

σ2t =

1

E

∫ ∞

−∞(t− < t >)2 |f (t)|2 dt, (2.293)

where

< t >=1

E

∫ ∞

−∞t |f (t)|2 dt (2.294)

and

E =

∫ ∞

−∞|f (t)|2 dt (2.295)

are the signal energies. We can accept this as a plausible measure of signalduration if we recall that σ2

t corresponds algebraically to the variance of a ran-

dom variable with probability density |f (t)|2 /E wherein the statistical meanhas been replaced by < t >. This quantity we may term “the average time ofsignal occurrence”.4 Although definition (2.295) holds formally for any signal(provided, of course, that the integral converges), it is most meaningful, justlike the corresponding concept of statistical average in probability theory, whenthe magnitude of the signal is unimodal. For example, using these parametersa real Gaussian pulse takes the form

f (t) =

√E

(2πσ2t )

1/4exp− (t− < t >)2

4σ2t

. (2.296)

To get an idea how the signal spectrum F (ω) affect the rms signal duration wefirst change the variables of integration in (2.293) from t to t′ = t− < t > andwrite it in the following alternative form:

σ2t =

1

E

∫ ∞

−∞t′2 |f (t′+ < t >)|2 dt′. (2.297)

Using the identities F {−itf(t)} = dF (ω)/dω and F {f (t+ < t >)} =F (ω) exp iω < t > we apply Parseval’s theorem to (2.297) to obtain

σ2t =

1

2πE

∫ ∞

−∞

∣∣∣∣d [F (ω) exp iω < t >]

∣∣∣∣2

=1

2πE

∫ ∞

−∞

∣∣∣∣dF (ω)

dω+ i < t > F (ω)

∣∣∣∣2

dω. (2.298)

This shows that the rms signal duration is a measure of the integrated fluctu-ations of the amplitude and phase of the signal spectrum. We can also express

4For a fuller discussion of this viewpoint see Chap. 3 in Leon Cohen, “Time-FrequencyAnalysis,” Prentice Hall PTR, Englewood Cliffs, New Jersey (1995).

Page 176: Signals and transforms in linear systems analysis

2.5 Time-Frequency Analysis 161

the average time of signal occurrence < t > in terms of the signal spectrumby first rewriting the integrand in (2.294) as the product tf(t)f(t)∗ and usingF {tf(t)} = idF (ω)/dω together with Parseval’s theorem. This yields

< t >=1

2πE

∫ ∞

−∞idF (ω)

dωF ∗ (ω) dω.

With F (ω) = A (ω) eiθ(ω) the preceding becomes

< t >=1

2πE

∫ ∞

−∞

[−θ′ (ω)] |F (ω)|2 dω, (2.299)

where θ′ (ω) = dθ (ω) /dω. In 2.6.1 we shall identify the quantity −θ′ (ω) as thesignal group delay. Equation then (2.299) states that the group delay, when

averaged with the “density” function |F (ω)|2 /2πE, is identical to the averagetime of signal occurrence.

We now apply the preceding definitions of spread and average location inthe frequency domain. Thus the rms bandwidth σω will be defined by

σ2ω =

1

2πE

∫ ∞

−∞(ω− < ω >)

2 |F (ω)|2 dω, (2.300)

where

< ω >=1

2πE

∫ ∞

−∞ω |F (ω)|2 dω. (2.301)

We can view < ω > as the center of mass of the amplitude of the frequencyspectrum. Clearly for real signals < ω >≡ 0. By analogy with (2.297) we changethe variable of integration in (2.300) to ω′ = ω− < ω > and rewrite it as follows:

σ2ω =

1

2πE

∫ ∞

−∞ω′2 |F (ω′+ < ω >)|2 dω (2.302)

F {df(t)/dt} = iωF (ω) and Parseval’s theorem obtain the dual to (2.295), viz.,

σ2ω =

1

E

∫ ∞

−∞

∣∣∣∣d [f(t) exp−i < ω > t]

dt

∣∣∣∣2

dt

=1

E

∫ ∞

−∞

∣∣∣∣df (t)

dt− i < ω > f (t)

∣∣∣∣2

dt. (2.303)

Thus the rms bandwidth increases in proportion to the norm of the rate ofchange of the signal. In other words, the more rapid the variation of the signalin a given time interval the greater the frequency band occupancy. This iscertainly compatible with the intuitive notion of frequency as a measure of thenumber of zero crossings per unit time as exemplified, for instance, by signalsof the form cos [ϕ (t)] .

Again using F {df(t)/dt} = iωF (ω) and Parseval’s theorem we trans-form (2.301) into

< ω >=1

E

∫ ∞

−∞−idf (t)

dtf∗ (t) dt.

Page 177: Signals and transforms in linear systems analysis

162 2 Fourier Series and Integrals with Applications to Signal Analysis

If f (t) = r (t) exp iψ (t) is an analytic signal, the preceding yields

< ω >=1

E

∫ ∞

−∞ψ′ (t) |f (t)|2 dt, (2.304)

where ψ′ (t) = dψ (t) /dt is the instantaneous frequency. This equation providesanother interpretation of < ω >, viz., as the average instantaneous frequencywith respect to the density |f (t)|2 /E, a result which may be considered a sortof dual to (2.299).

The rms signal duration and rms bandwidth obey a fundamental inequality,known as the uncertainty relationship, which we now proceed to derive. Forthis purpose let us apply the Schwarz inequality to the following two functions:(t− < t >) f(t) and df (t) /dt− i < ω > f (t) . Thus

∫ ∞

−∞(t− < t >)2 |f(t)|2 dt

∫ ∞

−∞

∣∣∣∣df (t)

dt− i < ω > f (t)

∣∣∣∣2

dt

≥∣∣∣∣∫ ∞

−∞(t− < t >) f∗(t)

[df (t)

dt− i < ω > f (t)

]dt

∣∣∣∣2

. (2.305)

Substituting for the first two integrals in (2.305) the σ2t and σ2

ω from (2.297)and (2.303), respectively, the preceding becomes

σ2tσ

2ωE

2 ≥∣∣∣∣∫ ∞

−∞(t− < t >) f∗(t)

[df (t)

dt− i < ω > f (t)

]dt

∣∣∣∣2

=

∣∣∣∣∫ ∞

−∞(t− < t >) f∗(t)

df(t)

dtdt

∣∣∣∣2

, (2.306)

where in view of (2.294) we have set∫∞−∞ (t− < t >) |f(t)|2 dt = 0. We now

integrate the last integral by parts as follows:

∫ ∞

−∞(t− < t >) f∗(t)

df(t)

dtdt

= (t− < t >) |f (t)|2 ∣∣∞−∞ −∫ ∞

−∞f(t)

d [(t− < t >) f∗(t)]dt

dt

= (t− < t >) |f (t)|2 ∣∣∞−∞ − E −∫ ∞

−∞(t− < t >) f(t)

df∗(t)dt

dt. (2.307)

Because f(t) has finite energy it must decay at infinity faster than 1/√t so that

(t− < t >) |f (t)|2 ∣∣∞−∞ = 0. Therefore after transposing the last term in (2.307)to the left of the equality sign we can rewrite (2.307) as follows:

Re

{∫ ∞

−∞(t− < t >) f∗(t)

df(t)

dtdt

}= −E/2. (2.308)

Page 178: Signals and transforms in linear systems analysis

2.5 Time-Frequency Analysis 163

Since the magnitude of a complex number is always grater or equal to the mag-nitude of its real part the right side of (2.306) equals at least E2/4. Cancellingof E2 and taking the square root of both sides result in

σtσω ≥ 1

2, (2.309)

which is the promised uncertainty relation. Basically it states that simultaneouslocalization of a signal in time and frequency is not achievable to within arbi-trary precision: the shorter the duration of the signal the greater its spectraloccupancy and conversely. We note that except for a constant factor on theright (viz., Planck’s constant ), (2.309) is identical to the Heisenberg uncer-tainty principle in quantum mechanics where t and ω stand for any two canoni-cally conjugate variables (e.g., particle position and particle momentum). Whendoes (2.309) hold with equality? The answer comes from the Schwarz inequal-ity (2.305) wherein equality can be achieved if and only if (t− < t >) f(t) anddf(t)dt − i < ω > f (t) are proportional. Calling this proportionality constant −α

results in the differential equation

df (t)

dt− i < ω > f (t) + α (t− < t >) f(t) = 0. (2.310)

This is easily solved for f(t) with the result

f(t) = A exp{−α2(t− < t >)

2+α

2< t >2 +i < ω > t

}, (2.311)

where A is a proportionality constant. Thus the optimum signal from the stand-point of simultaneous localization in time and frequency has the form of a Gaus-sian function. Taking account of the normalization (2.295) we obtain after asimple calculation

α = 1/2σ2t , A =√E/2πσ2

t exp{− < t >2 /2σ2t

}. (2.312)

2.5.2 The Short-Time Fourier Transform

Classical Fourier analysis draws a sharp distinction between the time and fre-quency domain representations of a signal. Recall that the FT of a signal ofduration T can be computed only after the signal has been observed in its en-tirety. The computed spectrum furnishes the relative amplitude concentrationswithin the frequency band and the relative phases but information as to thetimes at which the particular frequency components have been added to thespectrum is not provided. Asking for such information is of course not alwayssensible particularly in cases of simple and essentially single scale signals suchas isolated pulses. On the other hand for signals of long duration possessingcomplex structures such as speech, music, or time series of environmental pa-rameters the association of particular spectral features with the times of theirgeneration not only is meaningful but in fact also constitutes an essential step

Page 179: Signals and transforms in linear systems analysis

164 2 Fourier Series and Integrals with Applications to Signal Analysis

in data analysis. A possible approach to the frequency/time localization prob-lem is to multiply f(t), the signal to be analyzed, by a sliding window functiong (t− τ) and take the FT of the product. Thus

S (ω, τ) =

∫ ∞

−∞f(t)g (t− τ ) e−iωtdt (2.313)

whence in accordance with the FT inversion formula

f(t)g (t− τ ) = 1

∫ ∞

−∞S (ω, τ) eiωtdω. (2.314)

We can obtain an explicit formula for determining f(t) from S (ω, τ) by requiringthat the window function satisfies∫ ∞

−∞|g (t− τ)|2 dτ = 1 (2.315)

for all t. For if we now multiply both sides of (2.314) by g∗ (t− τ ) and integratewith respect to τ we obtain

f(t) =1

∫ ∞

−∞

∫ ∞

−∞S (ω, τ ) g∗ (t− τ ) eiωtdωdτ . (2.316)

The two-dimensional function S (ω, τ ) is referred to as the short-timeFourier transform5 (STFT) of f(t) and (2.316) the corresponding inversionformula. The STFT can be represented graphically in various ways. The mostcommon is the spectrogram, which is a two-dimensional plot of the magnitudeof S (ω, τ) in the τω plane. Such representations are commonly used as an aidin the analysis of speech and other complex signals.

Clearly the characteristics of the STFT will depend not only on the signalbut also on the choice of the window. In as much as the entire motivationfor the construction of the STFT arises from a desire to provide simultaneouslocalization in frequency and time it is natural to choose for the window functionthe Gaussian function since, as shown in the preceding, it affords the optimumlocalization properties. This choice was originally made by Gabor [6] and theSTFT with a Gaussian window is referred to as the Gabor transform. Here weadopt the following parameterization:

g (t) =21/4√se−

πt2

s2 . (2.317)

Reference to (2.311) and (2.312) shows that σt = s/ (2√π) . Using (2.142*) we

have for the FTG (ω) = 21/4

√se−s

2ω2/4π (2.318)

from which we obtain σω =√π/s so that σtσω = 1/2, as expected.

As an example, let us compute the Gabor transform of exp(αt2/2

). We

obtain

S (ω, τ) /√s = 21/4

√π

iαs2/2− π exp−(2πτs − iωs

)2

4 (iαs2/2− π) −πτ2

s2. (2.319)

5Also referred to as the sliding-window Fourier transform

Page 180: Signals and transforms in linear systems analysis

2.5 Time-Frequency Analysis 165

Figure 2.45: Magnitude of Gabor Transform of exp{i 12αt

2}

A relief map of the magnitude of S (ω, τ ) /√s (spectrogram) as a function of the

nondimensional variables τ/s (delay) and ωs (frequency) is shown in Fig. 2.45.In this plot the dimensionless parameter (1/2)αs2 equals 1/2. The map

shows a single ridge corresponding to a straight line ω = ατ correspondingto the instantaneous frequency at time τ . As expected only positive frequencycomponents are picked up by the transform. On the other hand, if insteadwe transform the real signal cos

{12αt

2}, we get a plot as in Fig. 2.46. Since

the cosine contains exponentials of both signs the relief map shows a secondridge running along the line ω = −ατ corresponding to negative instantaneousfrequencies.

As a final example consider the signal plotted in Fig. 2.47. Even thoughthis signal looks very much like a slightly corrupted sinusoid, it is actuallycomprised of a substantial band of frequencies with a rich spectral structure.This can be seen from Fig. 2.48 which shows a plot of the squared magnitude ofthe FT. From this spectral plot we can estimate the total signal energy and therelative contributions of the constitutive spectral components that make up thetotal signal but not their positions in the time domain. This information canbe inferred from the Gabor spectrogram whose contour map is represented inFig. 2.49. This spectrogram shows us that the spectral energy of the signal is infact confined to a narrow sinuous band in the time-frequency plane. The width ofthis band is governed by the resolution properties of the sliding Gaussian window(5 sec. widths in this example) and its centroid traces out approximately thelocus of the instantaneous frequency in the time-frequency plane.

Page 181: Signals and transforms in linear systems analysis

166 2 Fourier Series and Integrals with Applications to Signal Analysis

Figure 2.46: Magnitude of Gabor Transform of cos{

12αt

2}

0 10 20 30 40 50 60 70 80 90 100-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

Time (sec)

f(t)

Figure 2.47: Constant amplitude signal comprising multiple frequencies

Page 182: Signals and transforms in linear systems analysis

2.5 Time-Frequency Analysis 167

0 0.5 1 1.5 2 2.5 30

2000

4000

6000

8000

10000

12000

Frequency (Hz)

Sig

nal E

nerg

y/H

z

Figure 2.48: Squared magnitude of the FT of the signal in Fig. 2.47

100 200 300 400 500 600 700 800 900 1000

50

100

150

200

250

0.2

0.2

0.4

0.4

0.6

0.8

1

1.2

1.2

1.4

1.6

1.6

sec*10.24

Hz*

100

Contours of Squared Magnitude of Gabor Transform

Figure 2.49: Contour map of the Gabor Transform of the signal in Fig. 2.48

Page 183: Signals and transforms in linear systems analysis

168 2 Fourier Series and Integrals with Applications to Signal Analysis

2.6 Frequency Dispersion

2.6.1 Phase and Group Delay

In many physical transmission media the dominant effect on the transmittedsignal is the distortion caused by unequal time delays experienced by differentfrequency components. In the frequency domain one can characterize such atransmission medium by the transfer function e−iψ(ω) where ψ(ω) is real. TheFT F (ω) of the input signal f(t) is then transformed into the FT Y (ω) of theoutput signal y(t) in accordance with

Y (ω) = e−iψ(ω)F (ω) . (2.320)

The time domain representation of the output then reads

y(t) =1

∫ ∞

−∞eiωte−iψ(ω)F (ω) dω (2.321)

so that by Parseval’s theorem the total energy of the output signal is identicalto that of the input signal. However its spectral components are in generaldelayed by different amounts so that in the time domain the output appearsas a distorted version of the input. The exceptional case arises whenever thetransfer phase ψ(ω) is proportional to frequency for then with ψ(ω) = ωT theoutput is merely a time delayed version of the input:

y(t) = f (t− T ) . (2.322)

Such distortionless transmission is attainable in certain special situations, themost notable of which is EM propagation through empty space. It may also beapproached over limited frequency bands in certain transmission lines (coaxialcable, microstrip lines). In most practical transmission media however one hasto count on some degree of phase nonlinearity with frequency, particularly asthe signal bandwidth is increased. Clearly for any specific signal and transferphase the quantitative evaluation of signal distortion can proceed directly viaa numerical evaluation of (2.321). Nevertheless, guidance for such numericalinvestigations must be provided by a priori theoretical insights. For example,at the very minimum one should like to define and quantify measures of signaldistortion. Fortunately this can usually be accomplished using simplified andanalytically tractable models.

Let us first attempt to define the delay experienced by a typical signal.Because each spectral component of the signal will be affected by a differentamount, it is sensible to first attempt to quantify the delay experienced by atypical narrow spectral constituent of the signal. For this purpose we conceptu-ally subdivided the signal spectrum F (ω) into narrow bands, each of width Δω,as indicated in Fig. 2.50 (also shown is a representative plot of ψ(ω), usuallyreferred to as the medium dispersion curve). The contribution to the outputsignal from such a typical band (shown shaded in the figure) is

yn(t) = e {zn(t)} , (2.323)

Page 184: Signals and transforms in linear systems analysis

2.6 Frequency Dispersion 169

where

zn(t) =1

π

∫ ωn+Δω/2

ωn−Δω/2

eiωte−iψ(ω)F (ω) dω (2.324)

zn(t) is the corresponding analytic signal (assuming real f (t) and ψ(ω) =−ψ(−ω)) and the integration is carried out over the shaded band in Fig. 2.50.

wnwn-1 wn+1

· · ····

y¢ (wn)

y (w)F(

y (w),

w

w

w

D

)

F(w)

Figure 2.50: Group delay of signal component occupying a narrow frequencyband

Clearly the complete signal y (t) can be represented correctly by simply sum-ming over the totality of such non-overlapping frequency bands, i.e.,

y(t) =∑

n

yn(t) . (2.325)

For sufficiently small Δω/ωn the phase function within each band may beapproximated by

ψ(ω) ∼ ψ(ωn) + (ω − ωn)ψ′(ωn), (2.325*)

where ψ′(ωn) is the slope of the dispersion curve at the center of the band inFig. 2.50. If we also approximate the signal spectrum F (ω) by its value at theband center, (2.324) can be replaced by

zn(t) ∼ 1

πF (ωn)

∫ ωn+Δω/2

ωn−Δω/2

eiωte−i[ψ(ωn)+(ω−ωn)ψ′(ωn)]dω.

After changing the integration variable to η = ω − ωn this becomes

zn(t) ∼ 1

πF (ωn) e

i(ωnt−ψ(ωn))

∫ Δω/2

−Δω/2

eiη(t−ψ′(ωn))dη

Page 185: Signals and transforms in linear systems analysis

170 2 Fourier Series and Integrals with Applications to Signal Analysis

= 2iF (ωn) ei(ωnt−ψ(ωn))

sin[Δω/2

(t− ψ′(ωn)

)]

π(t− ψ′(ωn)

) (2.326)

and upon setting F (ωn) = A (ωn) eiθ(ωn) the real signal (2.323) assumes the

form

yn(t) ∼ A (ωn) sin [ωnt+ θ (ωn) + π − ψ(ωn)]sin

[Δω/2

(t− ψ′(ωn)

)]

π(t− ψ′(ωn)

) (2.327)

a representative plot of which is shown in Fig. 2.51. Equation (2.327) has theform of a sinusoidal carrier at frequency ωn that has been phase shifted by

-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1-4

-3

-2

-1

0

1

2

3

4

t-Tg

Figure 2.51: Plot of (2.327) for Δω/2θ′ (ωn) = 10, ωn = 200rps and A (ωn) = 1

ψ(ωn) radians. Note that the carrier is being modulated by an envelope in formof a sinc function delayed in time by ψ′(ωn). Evidently this envelope is the timedomain representation of the spectral components contained within the bandΔω all of which are undergoing the same time delay as a “group.” Accordinglyψ′(ωn) is referred to as the group delay (Tg ) while the time (epoch) delay of thecarrier θ (ωn) /ωn is referred to as the phase delay (Tϕ). One may employ theseconcepts to form a semi-quantitative picture of signal distortion by assigning toeach narrow band signal constituent in the sum in (2.325) its own phase andgroup delay. Evidently if the dispersion curve changes significantly over the

Page 186: Signals and transforms in linear systems analysis

2.6 Frequency Dispersion 171

signal bandwidth no single numerical measure of distortion is possible. Thus itis not surprising that the concept of group delay is primarily of value for signalshaving sufficiently narrow bandwidth. How narrow must Δω be chosen for therepresentation (2.327) to hold? Clearly in addition to Δω/ωn << 1 the nextterm in the Taylor expansion in (2.325*) must be negligible by comparison with(ω − ωn)ψ′(ωn). Since |ω − ωn| ≤ Δω/2 this additional constraint translatesinto

Δω �∣∣∣∣4ψ′ (ωn)ψ′′ (ωn)

∣∣∣∣ (2.328)

which evidently breaks down when ψ′ (ωn) = 0.

2.6.2 Phase and Group Velocity

Phase and group delay are closely linked to phase and group velocities associatedwith wave motion. To establish the relationship we start with the definition ofan the elementary wave

f (t, x) = f(t− x/v), (2.329)

where t is time x, represents space, and v a constant. Considered as a function of

-5 -4 -3 -2 -1 0 1 2 3 4 50

0.05

0.1

0.15

0.2

0.25

0.3

0.35

t1 t2 t3 t4 t5

x

t-x/

v

Figure 2.52: Self-preserving spatial pattern at successive instants of time (t1 <t2 < t3 < t4 < t5)

x which is sampled at discrete instances of time we can display it as in Fig. 2.52.Such a spatial display may be regarded as a sequence of snapshots of the functionf (ξ) which executes a continuous motion in the direction of the positive x-axis.

Page 187: Signals and transforms in linear systems analysis

172 2 Fourier Series and Integrals with Applications to Signal Analysis

Clearly the speed of this translation may be defined unambiguously by thecondition that the functional argument t− x/v be maintained constant in timefor a continuum of x. The derivative of the argument is then zero so that

dx

dt= v. (2.330)

We take (2.330) as the definition of the velocity of the wave. Note that thisdefinition is based entirely on the requirement that the functional form f (ξ)be preserved exactly. This characterizes what is usually designated as disper-sionless propagation. It is an idealization just as is distortionless transmissionmentioned in the preceding subsection. Evidently as long as x is fixed the twoconcepts are identical as we see by setting x/v = T in (2.322). In general thepreservation of the waveform is approached only by narrow band signals. Hencewe can again examine initially the propagation of a single sinusoid and appealto Fourier synthesis to formulate the general case. For a time-harmonic signalthe elementary wave function (2.329) reads

eiω(t−x/v(ω)) = eiωte−iβ(ω)x, (2.331)

wherein we now allow the speed of propagation vϕ(ω) to depend on frequency.Note, however, that even though mathematically the functional forms (2.329)and (2.331) are identical, (2.331) represents an infinitely long periodic patternso that we cannot really speak of the velocity of the translation of an identi-fiable space limited pattern (as, e.g., displayed in Fig. 2.52). Thus if we wantto associate vϕ(ω) with the motion of some identifiable portion of the spatialpattern, we have only a phase reference at our disposal. Quite aptly then vϕ(ω)is referred to as the phase velocity. The quantity β (ω) = ω/v(ω) in (2.331)represents the propagation constant and may be taken as a fundamental char-acteristic of the propagation medium. The time domain representation of ageneral signal with spectrum F (ω) that has propagated through a distance x isobtained by multiplying (2.331) by F (ω) and taking the inverse FT. Thus

y(t, x) =1

∫ ∞

−∞eiωte−iβ(ω)xF (ω) dω, (2.332)

which is just (2.321) with the phase shift relabeled as β (ω)x. Note that in thespecial case β (ω) = ω/v and with v a constant (2.332) reduces to (2.329), i.e.,the propagation is dispersionless. In the general case we proceed as in (2.326).After replacing ψ(ω) with β (ω)x in (2.327) we obtain

yn(t, x) ∼ A (ωn) sin [ωnt+ θ (ωn) + π − β (ωn)x]sin

[Δω/2

(t− β′ (ωn)x

)]

π(t− β′ (ωn)x

) .

(2.333)Unlike (2.327), (2.333) depends on both space and time. It does not, however,have the same simple interpretation as the wavefunction defined in (2.329) be-cause the speed of propagation of the carrier phase and the envelope differ. Thuswhile the carrier phase moves with the phase velocity

vϕn = ωn/β (ωn) (2.334)

Page 188: Signals and transforms in linear systems analysis

2.6 Frequency Dispersion 173

the envelope6 moves with velocity

vgn = 1/β′ (ωn) =dω

∣∣β=β(ωn). (2.335)

The latter is referred to as the group velocity and is the speed of propagationof the energy contained within the frequency band Δω in Fig. 2.50. By con-trast, the phase velocity has generally no connection with energy transport butrepresents merely the translation of a phase reference point.

2.6.3 Effects of Frequency Dispersion on Pulse Shape

Thus far we have not explicitly addressed quantitative measures of signal distor-tion. For this purpose consider a pulse with a (baseband) spectrum P (ω) mostof whose energy is confined to the nominal frequency band (−Ω,Ω) . The pulse,after having been modulated by a carrier of frequency ω0, propagates througha medium of length L characterized by the propagation constant β (ω) . Theoutput signal occupies the frequency band ω0 − Ω < ω < ω0 +Ω with and hasthe time domain representation

y(t, ω0) = e{2

∫ ω0+Ω

ω0−Ω

(1/2)P (ω − ω0) e−iβ(ω)Leiωt

}

= e{eiω0t

∫ Ω

−Ω

P (η) e−iβ(η+ω0)Leiηtdη

}, (2.336)

where we have assumed that the pulse is a real function. From the last expressionwe identify the complex baseband output signal as

s (t) =

∫ Ω

−Ω

P (η) e−iβ(η+ω0)Leiηtdη

2π. (2.337)

Irrespective of the nature of the pulse spectrum the frequencies at the bandcenter ω = ω0 will be delayed by the group delay β′ (ω0)L. In order to focus onpulse distortion (e.g., pulse broadening) it will be convenient to subtract thisdelay. We do this by initially adding and subtracting ηβ′ (ω0)L from the phaseof the integrand in (2.337) as follows:

s (t) =

∫ Ω

−Ω

P (η) e−i[β(η+ω0)−β′(ω0)η]Leiη[t−β′(ω0)L] dη

2π. (2.338)

Observe that this integral defines the time delayed version of s (t) defined by

s (t) = s[t− β′ (ω0)L

](2.339)

6When the emphasis is on wave propagation rather than signal analysis, it is customaryto represent the wavefunction (2.332) as a superposition of propagation constants β, in termsof the so-called wavenumber spectrum. In that case the envelope in (2.333) (usually referredto as a wavepacket) assumes the form

sin[Δβ/2(vgnt−x)]vgnπ(vgnt−x)

,

where Δβ is the range of propagation constants corresponding to the frequency band Δω.

Page 189: Signals and transforms in linear systems analysis

174 2 Fourier Series and Integrals with Applications to Signal Analysis

or, explicitly, by

s (t) =

∫ Ω

−Ω

P (η) e−i[β(η+ω0)−β′(ω0)η]Leiηtdη

2π. (2.340)

We shall obtain an approximation to this integral under the following twoassumptions:

ω0 � Ω, (2.341a)

Ω2β′′(ω0)L � 1. (2.341b)

The first of these is the conventional narrow band approximation while thesecond implies a long propagation path.7 Thus in view of (2.341a) we mayapproximate β (η + ω0) by

β (η + ω0) ∼ β (ω0) + β′ (ω0) η +1

2β′′ (ω0) η

2. (2.342)

Substituting this into (2.340) leads to the following series of algebraic steps:

s (t) ∼ e−iβ(ω0)L

∫ Ω

−Ω

P (η) e−iL2 β

′′(ω0)η2

eiηtdη

= e−iβ(ω0)L

∫ Ω

−Ω

P (η) e−iL2 β′′(ω0)Ω

2[( η

Ω )2−2( η

Ω )(

tΩβ′′(ω0)L

)]dη

= Ωe−iβ(ω0)L

∫ 1

−1

P (νΩ) e−iL2 β′′(ω0)Ω

2[ν2−2ν

(t

Ωβ′′(ω0)L

)]dν

= Ωe−iβ(ω0)Le−i t2

2Lβ′′(ω0)

∫ 1

−1

P (νΩ) e−iL2 β′′(ω0)Ω

2[ν− t

Ωβ′′(ω0)L

]2 dν

= Ωe−iβ(ω0)Le−i t2

2Lβ′′(ω0)

∫ 1−t/Ωβ′′(ω0)L

−1−t/Ωβ′′(ω0)L

P[xΩ + t/β′′ (ω0)L

]e−i

L2 β

′′(ω0)Ω2x2 dx

2π. (2.343)

Since we are interested primarily in assessing pulse distortion the range of thetime variable of interest is on the order of t ∼ 1/Ω we have in view of (2.341b)

t/Ωβ′′ (ω0)L� 1 . (2.344)

Consequently the limits in the last integral in (2.343) may be replaced by −1, 1.Again in view of (2.341b) we may evaluate this integral by appealing to theprinciple of stationary phase. Evidently the point of stationary phase is atx = 0 which leads to the asymptotic result

s (t) ∼ 1√2π

∣∣β′′ (ω0)∣∣Le−iπ/4sign[β

′′(ω0)]e−iβ(ω0)Le−i t2

2Lβ′′(ω0)P

(t

β′′ (ω0)L

).

(2.345)

7Note (2.341b) necessarily excludes the special case β′′(ω0) = 0.

Page 190: Signals and transforms in linear systems analysis

2.6 Frequency Dispersion 175

In many applications (e.g., intensity modulation in fiber optic communicationsystems) only the pulse envelope is of interest. In that case (2.345) assumes thecompact form

|s (t)|2 ∼ 1

2π∣∣β′′ (ω0)

∣∣L

∣∣∣∣P(

t

β′′ (ω0)L

)∣∣∣∣2

(2.346)

Parseval’s theorem tells us that the energies of the input and output signalsmust be identical. Is this still the case for the approximation (2.346)? Indeedit is as we verify by a direct calculation:

∫ ∞

−∞|s (t)|2 dt = (1/2π

∣∣β′′ (ω0)∣∣L)

∫ ∞

−∞

∣∣P(t/β′′ (ω0)L

)∣∣2 dt

=1

∫ ∞

−∞|P (ω)|2 dω ≡ 1

∫ Ω

−Ω

|P (ω)|2 dω.

Equation (2.346) states that the envelope of a pulse propagating over a suf-ficiently long path assumes the shape of its Fourier transform wherein thetimescale is determined only by the path length and the second derivative ofthe propagation constant at the band center. For example, for a pulse of unitamplitude and duration T we obtain

|s (t)|2 ∼ 4sin2

(tT

2β′′(ω0)L

)

(t

β′′(ω0)L

)2

giving a peak-to-first null pulsewidth of

TL =

∣∣∣∣2πβ′′ (ω0)L

T

∣∣∣∣ . (2.347)

In optical communications pulse broadening is usually described by the groupindex N(ω) defined as the ratio of the speed of light in free space to the groupvelocity in the medium:

N(ω) =c

vg(ω)= cβ′(ω). (2.348)

Expressed in terms of the group index the pulse width in (2.347) reads

TL =

∣∣∣∣2πL

cT

d

dωN(ω) |ω=ω0

∣∣∣∣ . (2.349)

In view of (2.341b) these results break down whenever β′′ (ω0) = 0, i.e., atthe inflection points (if they exist) of the dispersion curve. To include thecase of inflection points requires the retention of the third derivative in Taylorexpansion (2.342), i.e.,

β (η + ω0) ∼ β (ω0) + β′ (ω0) η +1

2β′′ (ω0) η

2 +1

6β′′′ (ω0) η

3 (2.350)

Page 191: Signals and transforms in linear systems analysis

176 2 Fourier Series and Integrals with Applications to Signal Analysis

so that

s (t) ∼ e−iβ(ω0)L

∫ Ω

−Ω

P (η) e−iL2 β

′′(ω0)η2−iL6 β′′′(ω0)η

3

eiηtdη

2π. (2.351)

We shall not evaluate (2.351) for general pulse shapes but confine our attentionto a Gaussian pulse. In that case we may replace the limits in (2.351) by ±∞and require only that (2.341a) hold but not necessarily (2.341b). Using theparameterization in (2.296) we have

p (t) =21/4√Te−

πt2

T2 , (2.352)

where we have relabeled the nominal pulse width s by T . The correspondingFT then reads

P (ω) = 21/4√Te−T

2ω2/4π (2.353)

so that (2.351) assumes the form

s (t) ∼ 21/4√Te−iβ(ω0)L

∫ ∞

−∞e−T

2η2/4πe−iL2 β

′′(ω0)η2−iL6 β′′′(ω0)η

3

eiηtdη

= 21/4√Te−iβ(ω0)L

∫ ∞

−∞e−i

Lβ′′′(ω0)6 [η3+Bη2−Cη] dη

2π, (2.354)

where

B =3β′′ (ω0)

β′′′ (ω0)− i 3T 2

2πLβ′′′ (ω0), (2.355a)

C =6t

Lβ′′′ (ω0). (2.355b)

Changing the variable of integration to z via η = z − B/3 eliminates thequadratic term in the polynomial in the exponential (2.354) resulting in

η3 +Bη2 − Cη = z3 − z (B2/3 + C)+ (2/27)B3 +BC/3.

Because of the analyticity of the integrand the integration limits in (2.354)may kept at ±∞. A subsequent change of the integration from z to w =[Lβ′′′ (ω0) /2

]1/3z transforms (2.354) into

s (t) ∼ 21/4√Te−iβ(ω0)Le−i

β′′′(ω0)L6 [(2/27)B3+BC/3]

{β′′′ (ω0)L

2

}−1/3

Ai

{−

[β′′′ (ω0)L

2

]2/.3 (B2/9 + C/3

)}, (2.356)

where Ai(x) is the Airy function defined by the integral

Ai(x) =1

∫ ∞

−∞e−i(w

3/3+xw)dw. (2.357)

Page 192: Signals and transforms in linear systems analysis

2.6 Frequency Dispersion 177

The interpretation of (2.356) will be facilitated if we introduce the followingdimensionless parameters:

q =β′′ (ω0)T

β′′′ (ω0), (2.358a)

p =β′′′ (ω0)L

T 3, (2.358b)

χ = 2qp =2β′′ (ω0)L

T 2. (2.358c)

Introducing these into (2.357) we obtain

s (t) ∼ 21/4(1/√T )e−iβ(ω0)Le

−i{

χq2

6 (1− iπχ )

[(1− i

πχ )2+ 6

χq (tT )

]}

(p/2)−1/3Ai

{−q2

(p2

)2/3[(

1− i

πχ

)2

+ 4t

qχT

]}. (2.359)

Let us first examine this expression for the case in which the third derivativeterm in (2.350) can be neglected. Clearly this is tantamount to dropping thecubic term in (2.351). The integral then represents the FT of a Gaussian func-tion and can be evaluated exactly. On the other hand, from the definition of qin (2.358a) we note that β′′′ (ω0) → 0 and β′′ (ω0) = 0 correspond to q → ∞.Hence we should be able to obtain the same result by evaluating (2.359) in thelimit as q → ∞. We do this with the aid of the first-order asymptotic form ofthe Airy function for large argument the necessary formula for which is givenin [1]. It reads

Ai(−z) ∼ π−1/2z−1/4 sin(ζ +π

4), (2.360)

where

ζ =2

3z3/2 ; |arg(z)| < π. (2.361)

Thus we obtain for8 |q| ∼ ∞

(p/2)−1/3

Ai

{−q2

(p2

)2/.3[(

1− i

πχ

)2

+ 4t

qχT

]}

∼ −i[πχ(1− i

πχ)

]−1/2

⎜⎜⎜⎜⎝

exp

{i (p/3) q3

[(1− i

πχ

)2

+ 4 tqχT

]3/2+ iπ4

}

− exp

{−i (p/3) q3

[(1− i

πχ

)2

+ 4 tqχT

]3/2− iπ4

}

⎟⎟⎟⎟⎠, (2.362)

8q is real but may be of either sign.

Page 193: Signals and transforms in linear systems analysis

178 2 Fourier Series and Integrals with Applications to Signal Analysis

where in the algebraic term corresponding to z−1/4 in (2.360) we have droppedthe term o(1/q). Next we expand the argument of the first exponentials termin (2.362) as follows:

i(χq2/6

)[(

1− i

πχ

)2

+ 4t

qχT

]3/2

= i(χq2/6

)(1− i

πχ

)3

⎢⎣1 + 4t

qχT(1− i

πχ

)2

⎥⎦

3/2

= i(χq2/6

)(1− i

πχ

)3

⎢⎣1 + 6t

qχT(1− i

πχ

)2 + 6t2

(qχT )2(1− i

πχ

)4 + o(1/q3)

⎥⎦

= i(χq2/6

)(1− i

πχ

)3

+ iqt

T

(1− i

πχ

)

+it2

χT 2

(1− i

πχ

)−1

+ o(1/q). (2.363)

In identical fashion we can expand the argument of the second exponential whichwould differ from (2.363) only by a minus sign. It is not hard to show that forsufficiently large |q| is real part will be negative provided

χ2 >1

3π2. (2.364)

In that case the second exponential in (2.362) asymptotes to zero and may beignored. Neglecting the terms o(1/q) in (2.363) we now substitute (2.362) into(2.359) and note that the first two terms in the last line of (2.363) cancel againstthe exponential in (2.359). The final result then reads

s (t) ∼ 21/4(1/√T )e−iβ(ω0)L

{−i

[πχ(1− i

πχ)

]−1/2}exp

{it2

χT 2

(1− i

πχ

)−1

+ iπ

4

}

= 21/4(1/√T )e−iβ(ω0)L (1 + iπχ)

−1/2exp−πt

2

T 2(1 + iπχ)

−1. (2.365)

For the squared magnitude of the pulse envelope we get

|s (t)|2 ∼√2

T

(1 + π2χ2

)−1/2exp− πt2

(T 2/2) (1 + π2χ2). (2.366)

The nominal duration of this Gaussian signal may be defined by (T/2√π)√

1 + π2χ2 so that χ plays the role of a pulse-stretching parameter. Whenχ� 1 (2.366) reduces to

Page 194: Signals and transforms in linear systems analysis

2.6 Frequency Dispersion 179

|s (t)|2 ∼√2

πχTexp− 2t2

πT 2χ2=

T

π√2β′′ (ω0)L

exp− T 2t2

2πβ′′ (ω0)L. (2.367)

The same result also follows more directly from the asymptotic form (2.346)as is readily verified by the substitution of the FT of the Gaussian pulse (2.353)into (2.346). Note that with χ = 0 in (2.366) we recover the squared magnitudeof the original (input) Gaussian pulse (2.352). Clearly this substitution violatesour original assumption |q| ∼ ∞ under which (2.366) was derived for in accor-dance with (2.358) χ = 0 implies q = 0. On the other hand if β′′′ (ω0) is taken tobe identically zero (2.366) is a valid representation of the pulse envelope for allvalues of χ. This turns out to be the usual assumption in the analysis of pulsedispersion effects in optical fibers. In that case formula (2.366) can be obtaineddirectly from (2.351) by simply completing the square in the exponential andintegrating the resulting Gaussian function. When β′′′ (ω0) = 0 with q arbitrarynumerical calculations of the output pulse can be carried out using (2.359). Forthis purpose it is more convenient to eliminate χ in favor of the parameters pand q. This alternative form reads

s (t) ∼ 21/4(1/√T )e−iβ(ω0)Le

−i{

p3 (q− i

2πp )[(q− i

2πp)2+ 3

p (tT )

]}

(|p| /2)−1/3Ai

{−

( |p|2

)2/3[(

q − i

2πp

)2

+ 2t

pT

]}. (2.368)

-4 -3 -2 -1 0 1 2 3 40

0.5

1

1.5

p=0

p=0.2

p=0.5

p=1.0p=-1.0

p=-0.5

p=-0.2

t/T

T*a

bs(s

)2

Figure 2.53: Distortion of Gaussian pulse envelope by cubic phase nonlinearitiesin the propagation constant

Page 195: Signals and transforms in linear systems analysis

180 2 Fourier Series and Integrals with Applications to Signal Analysis

To assess the influence of the third derivative of the phase on the pulse envelopewe set q = 0 and obtain the series of plots for several values of p as shownin Fig. 2.53. The center pulse labeled p = 0 corresponds to the undistortedGaussian pulse (χ = 0 in (2.366)). As p increases away from zero the pulseenvelope broadens with a progressive increase in time delay. For sufficientlylarge p the envelope will tend toward multimodal quasi-oscillatory behavior theonset of which is already noticeable for p as low as 0.2. For negative p the pulseshapes are seen to be a mirror images with respect to t = 0 of those for positivep so that pulse broadening is accompanied by a time advance.

2.6.4 Another Look at the Propagation of a GaussianPulse When β′′′ (ω0) = 0

As was pointed out above in the absence of cubic (and higher order) nonlinear-ities (2.366) is an exact representation of the pulse envelope. In fact we can alsoget the complete waveform in the time domain with the aid of (2.336), (2.339),and (2.365). Thus

y(t, ω0) = 21/4/√Te

⎧⎨

⎩eiω0

[t− π2χt2

(T2)(1+π2χ2)

]

e−i[β(ω0)L+(1/2) tan−1(πχ)](1 + π2χ2

)−1/4exp− πt2

(T 2)(1+π2χ2)

⎫⎬

⎭ ,

(2.369)

wheret = t− β′ (ω0)L. (2.370)

Note that the instantaneous frequency of this complex waveform varies linearlywith time, i.e.,

ω (t) = ω0 −2π2χ

(t− β′ (ω0)L

)

(T 2) (1 + π2χ2). (2.371)

In fiber optics such a pulse is referred to as a chirped pulse. This “chirping,”(or linear FM modulation) is just a manifestation of the fact that the pulsedistortion is due entirely to the quadratic nonlinearity in the phase rather thanin the amplitude of the effective transfer function. On the other hand, chirpingcan occur also due to intrinsic characteristics of the transmitter generating theinput pulse. We can capture this effect using the analytic form

p (t) = Ae− t2

2T20(1+iκ)

, (2.372)

where A is a constant, κ the so-called chirp factor, and 2T0 the nominal pulsewidth.9 Evidently when this pulse gets upconverted to the carrier frequencyω0 its instantaneous frequency becomes

9Note that T0 =(1/

√2π

)T where T represents the definition of pulse width in (2.352).

Also A = 21/4/√T .

Page 196: Signals and transforms in linear systems analysis

2.6 Frequency Dispersion 181

ω (t) = ω0

[1− κ

ω0T0

(t

T0

)](2.373)

so that over the nominal pulse interval −T0 ≤ t ≤ T0 the fractional change inthe instantaneous frequency is 2κ/ω0T0. Presently we view this chirping as theintrinsic drift in the carrier frequency during the formation of the pulse. Howdoes this intrinsic chirp affect pulse shape when this pulse has propagated overa transmission medium with transfer function exp− β (ω)L ? If we neglect theeffects of the third and higher order derivatives of the propagation constant theanswer is straightforward. We first compute the FT of (2.372) as follows:

P (ω) = A

∫ ∞

−∞e− t2

2T20(1+iκ)

e−iωtdt = A

∫ ∞

−∞e− (1+iκ)

2T20

[(t− iωT2

01+iκ

)2

+ω2T4

0(1+iκ)2

]

dt

= Ae−ω2T2

02(1+iκ)

∫ ∞

−∞e− (1+iκ)

2T20

(t− iωT2

01+iκ

)2

dt

= AT0

√2π

1 + iκe−

ω2T20

2(1+iκ) , (2.374)

where the last result follows from the formula for the Gaussian error func-tion with (complex) variance parameter T 2

0 /√1 + iκ. Next we substitute (2.374)

in (2.340) with Ω =∞ together with the approximation (2.342) to obtain

s (t) = e−iβ(ω0)L

∫ ∞

−∞P (η) e−i

12β

′′(ω0)η2Leiηt

2π(2.375)

Simplifying,

s(t) = AT0

√2π

1 + iκe−iβ(ω0)L

∫ ∞

−∞e−[

T20

2(1+iκ)+iLβ′′(ω0)

2

]η2

eiηtdη

2π. (2.376)

Setting Q = T 20 / [2 (1 + iκ)] + iLβ′′ (ω0) /2 we complete the square in the

exponential as follows:

e−Qη2+iηt = e

−Q[(η− it

2Q )2+ t2

4Q2

]

= e−t2

4Q e−Q(η−it2Q )

2

. (2.377)

From this we note that the complex variance parameter is 1/(2Q) so that (2.376)integrates to

s (t) =AT02π

√2π

1 + iκe−iβ(ω0)Le−

t2

4Q

√π

Q

=A√

1 + iκe−iβ(ω0)L

T0√T 20 + iβ′′ (ω0)L (1 + iκ)

exp− t2 (1 + iκ)

2[T 20 + iβ′′ (ω0)L (1 + iκ)

] .

(2.378)

Page 197: Signals and transforms in linear systems analysis

182 2 Fourier Series and Integrals with Applications to Signal Analysis

Expression for the pulse width and chirp is obtained by separating the argumentof the last exponential into real and imaginary parts as follows:

exp− t2 (1 + iκ)

2[T 20 + iβ′′ (ω0)L (1 + iκ)

]

= exp− T 20 t

2

2{[T 20 − β′′ (ω0)Lκ

]2+

[β′′ (ω0)L

]2} exp−iψ, (2.379)

where

ψ =κt2

[T 20 − β′′ (ω0)L(1 + κ)

]

2{[T 20 − β′′ (ω0)Lκ

]2+

[β′′ (ω0)L

]2} . (2.380)

Defining the magnitude of (2.379) as exp−t2/ (2T 2L

)we get for the pulse

length TL

TL = T0

√(1− β′′ (ω0)Lκ

T 20

)2

+

(β′′ (ω0)L

T 20

)2

. (2.381)

When the input pulse is unchirped κ = 0, and we get

TL =

T 20 +

(β′′ (ω0)L

T0

)2

. (2.382)

We see from (2.381) that when κ = 0, TL may be smaller or larger than the rightside of (2.382) depending on the sign of κ. and the magnitude of L. Note, how-ever, that for sufficiently large L, (2.381) is always larger than (2.382) regardlessof the sign of κ. The quantity

LD = T 20 /β

′′(ω0) (2.383)

is known as the dispersion length. Using this in (2.381) we have

TL = T0

√(1− L

LDκ

)2

+

(L

LD

)2

. (2.384)

The significance of LD is that with κ = 0 for L � LD the effect of dispersionmay be neglected.

2.6.5 Effects of Finite Transmitter Spectral Line Width*

In the preceding it was assumed that the carrier modulating the pulse ismonochromatic, i.e., an ideal single frequency sinusoid with constant phase.In practice this will not be the case. Instead the carrier will have a fluctuatingamplitude and phase which we may represent as

a(t) cos(ω0t+ φ(t)), (2.385)

Page 198: Signals and transforms in linear systems analysis

2.6 Frequency Dispersion 183

where a(t) and φ(t) are random functions of time and ω0 is the nominal carrierfrequency which itself has to be quantified as a statistical average. In the follow-ing we assume that only the phase is fluctuating and that the carrier amplitudeis fixed. Reverting to complex notation we then assume that the pulse p(t) uponmodulation is of the form

p(t)eiω0teiφ(t). (2.386)

If we denote the FT of eiφ(t) by the random function ˜X(ω), we get for the FTof (2.386)

∫ ∞

−∞p(t)eiω0teiφ(t)e−iωtdt =

1

∫ ∞

−∞P (ω − ξ − ω0) ˜X(ξ)dξ. (2.387)

To get the response that results after this random waveform has propagated overa transmission medium with transfer function exp−iβ(ω)L we have to replaceP (ω − ω0) in (2.336) by the right side of (2.387). Thus we obtain

y(t) = e{2

∫ ω0+Ω

ω0−Ω

(1/2)

{1

∫ ∞

−∞P (ω−ξ−ω0) ˜X(ξ)dξ

}e−iβ(ω)Leiωt

}

= e{eiω0t

∫ Ω

−Ω

{1

∫ ∞

−∞P (η − ξ) ˜X(ξ)dξ

}e−iβ(η+ω0)Leiηt

}

= e{eiω0ts(t− β′

(ω0)L},

where

s(t) =

∫ Ω

−Ω

{1

∫ ∞

−∞P (η − ξ) ˜X(ξ)dξ

}e−i

[β(η+ω0)−β′

(ω0)η]Leiηt

2π(2.388)

is the complex random envelope of the pulse. It is reasonable to characterizethis envelope by its statistical average which we denote by

|ENV |2 ≡ 〈|s (t)|2〉. (2.389)

In evaluating (2.389) we shall assume that eiφ(t) is a WSS process so that itsspectral components are uncorrelated, i.e.,

〈X(ξ)X∗(ξ′)〉 = 2πF (ξ) δ(ξ − ξ′) , (2.390)

where F (ξ) is the spectral power density of eiφ(t). If we approximate the propa-gation constant in (2.388) by the quadratic form (2.342), and substitute (2.388)into (2.389) we obtain with the aid of (2.390)

|ENV |2 =

∫ ∞

−∞F (ξ) dξ

1

(2π)3

∣∣∣∣∣

∫ Ω

−Ω

P (η − ξ)e−iβ′′(ω0)η2L/2eiηtdη

∣∣∣∣∣

2

. (2.391)

Page 199: Signals and transforms in linear systems analysis

184 2 Fourier Series and Integrals with Applications to Signal Analysis

Assuming a Gaussian pulse with the FT as in (2.374) the inner integral in (2.391)can be expressed in the following form:

1

(2π)3

∣∣∣∣∣

∫ Ω

−Ω

P (η − ξ)e−iβ′′(ω0)η2L/2eiηtdη

∣∣∣∣∣

2

=A2T 2

0 π

(2π)2√1 + κ2 |Q|f (ξ) , (2.392)

where Q = T 20 / [2 (1 + iκ)] + iLβ′′ (ω0) /2,

f (ξ) = e2e{Qb2}e− ξ2T20

2 [ 11+iκ+ 1

1−iκ ] (2.393)

and

b =ξT 2

0

2Q(1 + iκ)+

it

2Q. (2.394)

To complete the calculation of the average pulse envelope we need the functionalform of the power spectral density of the phase fluctuations. The form dependson the physical process responsible for these fluctuations. For example for highquality solid state laser sources the spectral line width is Lorenzian, i.e., of theform

F (ω − ω0) =2/W

1 +(ω−ω0

W

)2 . (2.395)

Unfortunately for this functional form the integration in (2.391) has to be carriedout numerically. On the other hand, an analytical expression is obtainable if weassume the Gaussian form

F (ω − ω0) =1√

2πW 2exp− (ω − ω0) /2W

2. (2.396)

After some algebra we get

|ENV |2 =A2T 2

0 π

(2π)2√

1 + κ2 |Q|

√(T 20−β′′ (ω0)Lκ

)2+

(β′′ (ω0)L

)2[(T 20−β′′ (ω0)Lκ

)2+(1+2W 2T 2

0 )(β′′(ω0)L

)2]

exp− t2T 20(

T 20 − β′′ (ω0)Lκ

)2+ (1 + 2W 2T 2

0 )(β′′ (ω0)L

)2 . (2.397)

Note that the preceding is the squared envelope so that to get the effectivepulse length of the envelope itself an additional factor of 2 needs to be inserted(see (2.381)). We then get

TL = T0

√(1− β′′ (ω0)Lκ

T 20

)2

+ (1 + 2W 2T 20 )

(β′′ (ω0)L

T 20

)2

. (2.398)

It should be noted that this expression is not valid when β′′ (ω0) = 0 as thenthe cubic phase term dominates. In that case the pulse is no longer Gaussian.The pulse width can then be defined as an r.m.s. duration. The result reads

TL = T0

√(1− β′′ (ω0)Lκ

T 20

)2

+ (1 + 2W 2T 20 )

(β′′ (ω0)L

T 20

)2

+ C, (2.399)

Page 200: Signals and transforms in linear systems analysis

2.7 Fourier Cosine and Sine Transforms 185

where

C = (1/4) (1 + κ2 + 2W 2T 20 )

(β′′′ (ω0)L

T 30

)2

. (2.400)

2.7 Fourier Cosine and Sine Transforms

In Chap. 2.2 we took as the starting point in our development of the FT theorythe LMS approximation of a function defined in (−T/2, T/2) in terms of sinu-soids with frequencies spanning the interval (−Ω,Ω). The formal solution canthen be phrased in terms of the integral equation (2.106) for the unknown coef-ficients (functions). For arbitrary finite intervals no simple analytical solutionsof the normal equation appears possible. On the other hand, when both theexpansion interval in the time domain and the range of admissible frequenciesare allowed to approach infinity the normal equations admit a simple solutionwhich we have identified with FT. As we shall see in the following a suitable setof normal equations can also be solved analytically when the expansion intervalsin the time domain and in the frequency domain are chosen as semi-infinite.

We suppose that f(t) is defined over (0, T ) and seek its LMS approximationin terms of cos(ωt) with ω in the interval (0,Ω) :

f(t) ∼∫ Ω

0

cos(ωt)fc (ω) dω = fΩc (t), (2.401)

where fc (ω) if the expansion (coefficient) function. In accordance with (1.100)the normal equation reads

∫ T

0

cos(ωt)f (t) dt =

∫ Ω

0

fc (ω′) dω′

∫ T

0

cos(ωt) cos(ω′t)dt. (2.402)

Using the identity cos(ωt) cos(ω′t) = (1/2){cos[t(ω − ω′)] + cos[t(ω + ω′)]} wecarry out the integration with respect to t to obtain

∫ T

0

cos(ωt)f (t) dt =π

2

∫ Ω

0

fc (ω′) dω′{ sin[(ω − ω

′)T ]π(ω − ω′)

+sin[(ω + ω′)T ]π(ω + ω′)

}.(2.403)

For arbitrary T this integral equation does not admit of simple analytical solu-tions. An exceptional case obtains when T is allowed to approach infinity forthen the two Fourier Integral kernels approach delta functions. Because Ω > 0only the first of these contributes. Assuming that fc (ω

′) is a smooth functionand we obtain in the limit

Fc(ω) =

∫ ∞

0

cos(ωt)f (t) dt, (2.404)

where we have definedFc(ω) =

π

2fc (ω) . (2.405)

Page 201: Signals and transforms in linear systems analysis

186 2 Fourier Series and Integrals with Applications to Signal Analysis

Inserting (2.404) into (2.401) the LMS approximation to f(t) reads

fΩc (t) =

2

π

∫ Ω

0

cos(ωt)

∫ ∞

0

cos(ωt′)f (t′) dt′dω

=

∫ ∞

0

f (t′) dt′2

π

∫ Ω

0

cos(ωt) cos(ωt′)dω

=

∫ ∞

0

f (t′) dt′(1/π)∫ Ω

0

{cos[ω(t− t′)] + cos[ω(t+ t′)]}dω

=

∫ ∞

0

f (t′) dt′{sin[(t− t′)Ω]π(t− t′) +

sin[(t+ t′)Ω]π(t+ t′)

}. (2.406)

Using the orthogonality principle the corresponding LMS error εΩmin is

εΩmin =

∫ ∞

0

|f (t)|2 dt−∫ ∞

0

f∗ (t) fΩc (t)dt

and using (2.401) and (2.404)

εΩmin =

∫ ∞

0

|f (t)|2 dt−∫ ∞

0

f∗ (t)∫ Ω

0

cos(ωt)fc (ω) dωdt

=

∫ ∞

0

|f (t)|2 dt− 2

π

∫ Ω

0

|Fc(ω)|2 dω ≥ 0. (2.407)

As Ω→∞ the two Fourier kernels yield the limiting form

limΩ→∞

fΩc (t) =

f (t+) + f(t−)2

. (2.408)

We may then write in lieu of (2.401)

f (t+) + f(t−)2

=2

π

∫ ∞

0

cos(ωt)Fc(ω)dω. (2.409)

At the same time limΩ→∞

εΩmin = 0 so that (2.407) gives the identity

∫ ∞

0

|f (t)|2 dt = 2

π

∫ ∞

0

|Fc(ω)|2 dω. (2.410)

When f(t) is a smooth function (2.409) may be replaced by

f(t) =2

π

∫ ∞

0

cos(ωt)Fc(ω)dω. (2.411)

The quantity Fc(ω) defined by (2.404) is the Fourier Cosine Transform (FCT)and (2.411) the corresponding inversion formula. Evidently (2.410) is the cor-responding Parseval formula. As in the case of the FT we can use the compactnotation

f (t)Fc⇐⇒ Fc (ω) . (2.412)

Page 202: Signals and transforms in linear systems analysis

Problems 187

Replacing Fc(ω) in (2.411) by (2.404) yields the identity

δ(t− t′) =∫ ∞

0

√2

πcos(ωt)

√2

πcos(ωt′)dω, (2.413)

which may be taken as the completeness relationship for the FCT.Note that the derivative of fΩ

c (t) at t = 0 vanishes identically. This meansthat pointwise convergence for the FCT is only possible for functions that pos-sess a zero derivative at t = 0. This is, of course, also implied by the fact thatthe completeness relationship (2.413) is comprised entirely of cosine functions.

What is the relationship between the FT and the FCT? Since the FCTinvolves the cosine kernel one would expect that the FCT can be expressed interms of the FT of an even function. This is actually the case. Thus supposef(t) is even then

F (ω) =

∫ ∞

0

2f(t) cos(ωt)dt (2.414)

so that F (ω) is also even. Therefore the inversion formula becomes

f (t) =1

π

∫ ∞

0

F (ω) cos(ωt)dω. (2.415)

Evidently with Fc(ω) = F (ω)/2 (2.414) and (2.415) correspond to (2.404)and (2.411), respectively.

In a similar manner, using the sine kernel, one can define the Fourier SineTransform (FST):

Fs(ω) =

∫ ∞

0

sin(ωt)f (t) dt. (2.416)

The corresponding inversion formula (which can be established either formallyin terms of the normal equation as above or derived directly from the FT rep-resentation of an odd function) reads

f (t) =2

π

∫ ∞

0

Fs (ω) sin(ωt)dω. (2.417)

Upon combining (2.416) and (2.417) we get the corresponding completenessrelationship

δ(t− t′) =∫ ∞

0

√2

πsin(ωt)

√2

πsin(ωt′)dω. (2.418)

Note that (2.417) and (2.418) require that f(0) = 0 so that only for suchfunctions a pointwise convergent FST representation is possible.

Problems

1. Using (2.37) compute the limit as M →∞, thereby verifying (2.39).

2. Prove (2.48).

Page 203: Signals and transforms in linear systems analysis

188 2 Fourier Series and Integrals with Applications to Signal Analysis

3. Derive the second-order Fejer sum (2.42).

4. For the periodic function shown in the following sketch:

••• •••

t

f(t)

2 4 60−2

4

t2

Figure P4: Periodic function with step discontinuities

(a) Compute the FS coefficients fn.

(b) Compute and plot the partial sum fN(t) for N = 5 and N = 20.Also compute the corresponding LMS errors.

(c) Repeat (b) for the first-order Fejer sum.

(d) Repeat (c) for the second-order Fejer sum.

5. Derive the interpolation formula (2.82)

6. Derive the interpolation formula (2.88)

7. Approximate the signal f(t) = te−t in the interval (0, 4) by the firstfive terms of a Fourier sine series and an anharmonic Fourier series withexpansion functions as in (2.101) assuming (a) β = −1/3 and (b) β = −1.Plot f5(t) for the three cases together with f(t) on the same set of axes.Account for the different values attained by the three approximating sumsat t = 4.

8. The integral

I = P

∫ 2

−2

tdt

(t− 1)(t2 + 1) sin t

is defined in the CPV sense. Evaluate it numerically.

9. Derive formulas (2.137)(2.137*)(2.141) and (2.142).

10. The differential equation

x′′(t) + x′(t) + 3x(t) = 0

is to be solved for t ≥ −2 using the FT. Assuming initial conditionsx(−2) = 3 and x′(−2) = 1 write down the general solution in terms of theFT inversion formula.

Page 204: Signals and transforms in linear systems analysis

Problems 189

11. Derive formula (2.73).

12. Prove that K(1)Ω (t) in (2.200) is a delta function kernel.

13. Prove the asymptotic form (2.201).

14. Compute the Fourier transform of the following signals:

a)(e−3t cos 4t

)U(t) b)e−4|t| sin 7t

c)(te−5t sin 4t

)U(t) d)

∞∑

n=0

4−nδ(t− nT )

e)

(sinat

at

)(sin 2a (t− 1)

a (t− 1)

)f)

∞∑

n=−∞e−|t−4n|

15. Compute the Fourier transform of the following signals:

(a) f(t) = sin atπt U(t)

(b) f(t) =∫∞−∞ g(t+ x)g∗(x)dx with g(t) = e−atU(t− 2)

(c) f(t) = w(t) with w(t) defined in the following sketch.

t

11

−2−3 2 3

w(t)

16. With the aid of Parseval’s theorem evaluate∫∞−∞

sin4 xx4 dx.

17. With F (ω) = R(ω) + iX (ω) the FT of a causal signal find X (ω) when a)

R(ω) = 1/(1 + ω2) b) R(ω) = sin2 2ωω2 .

18. Given the real signal 1/(1 + t2

)construct the corresponding analytic sig-

nal and its FT.

19. Derive (2.217).

20. For the signal z(t) = cos 5t1+t2 compute and plot the spectra of the inphase

and quadrature components x (t) and y (t) for ω0 = 5, 10, 20. Interpretyour results in view of the constraint (2.226).

21. The amplitude of a minimum phase FT is given by |F (ω)| = 11+ω2n , n > 1.

Compute the phase.

Page 205: Signals and transforms in linear systems analysis

Chapter 3

Linear Systems

3.1 Fundamental Properties

3.1.1 Single-valuedness, Reality, and Causality

At the beginning of Chap. 1 we defined a system as a mapping of an input signalinto an output signal, as expressed by (3.3). The nature of the mathematicaloperation or operations that the system performs on the input vector f (t) toyield the output vector y (t) defines the (system) operator T. Of course, of theinfinite variety of possible mathematical operations that could be carried out onthe input signal only certain classes could represent models of physical systems.For example, given a specific input signal f (t) a reasonable expectation is forthe output to be represented by a unique time function y (t) (rather than twoor more different time functions, say y1 (t) and y2 (t)). In other words, froma mathematical standpoint, we would like the operator T to be single valued.Another necessary attribute of a physical system is that it be real in the sensethat an input represented by a real f (t) should always result in a real y (t).

It should be emphasized that the concept of the systems operator T as usedherein requires the specification of the input for −∞ < t < ∞ enabling thedetermination of the output for −∞ < t <∞. Next let us consider the systemresponse (output) due to two possible inputs f1 (t) and f2 (t) related as follows:

f1 (t) = f2 (t) ; t < T,

f1 (t) = f2 (t) ; t ≥ T.

What we have postulated here are two inputs that are identical over the infinitepast up to t = T but differ subsequently. The corresponding outputs may berepresented symbolically as follows:

y1 (t) = T {f1 (t)} ,y2 (t) = T {f2 (t)} .

W. Wasylkiwskyj, Signals and Transforms in Linear Systems Analysis,DOI 10.1007/978-1-4614-3287-6 3, © Springer Science+Business Media, LLC 2013

191

Page 206: Signals and transforms in linear systems analysis

192 3 Linear Systems

T

Tt

t

f1(t) = f2(t)

y1(t) = y2(t)

f1(t)

y1(t)

f2(t)

y2(t)

Figure 3.1: Illustration of causality

What is the relationship between y1 (t) and y2 (t)? Clearly the assumption thaty1 (t) = y2 (t) for t < T with identical inputs would be untenable for it wouldmean that the system could predict future changes in the input. Were such aprediction possible the state of affairs would be bizarre indeed for we could neverpull the plug on such a system: it would anticipate our action! We say that sucha system is noncausal and physically unrealizable. The definition of causalitycan be phrased as follows. A system is said to be causal if, whenever twoinputs are identical for t < T the corresponding two outputs are also identicalfor t < T . The relationship between the input and output signals of a causalsystem is illustrated in Fig. 3.1.

It is important to keep in mind that the notion of causality is forced uponus only when the input/output functional domains involve real time. Systeminput/output models where the functional domains involve other variables (e.g.,space) are not governed by an equivalent constraint.

We now confine our attention to a very special class of systems: systemsthat are said to be linear. We define a system as linear if, given

y1 (t) = T {f1 (t)} ,y2 (t) = T {f2 (t)} ,

and any two constants α1 and α2, the following holds:

T {α1f1 (t) + α2f2 (t)} = α1y1 (t) + α2y2 (t) . (3.1)

Page 207: Signals and transforms in linear systems analysis

3.1 Fundamental Properties 193

This principle of superposition generalizes to any number of input vector func-tions. Thus if (3.1) holds then given any number of relations of the formyn (t) = T {fn (t)} and constants αn we have

T

{∑

n

αnfn (t)

}=

n

αnT {fn (t)}

=∑

n

αnyn (t) . (3.2)

These sums may extend over a finite or an infinite number of members. In thelatter case it is presupposed that the necessary convergence criteria hold. Infact if we interpret an integral as a limit of a sum (3.2) can even be extendedto integrals. Thus suppose

y (ξ, t) = T {f (ξ, t)} , (3.3)

wherein ξ takes on a continuum of values within some prescribed range. If thisrange encompasses the entire real number line, the extended superposition prin-ciple reads

T

{∫ ∞

−∞α (ξ) f (ξ, t) dξ

}=

∫ ∞

−∞α (ξ)T {f (ξ, t)} dξ

=

∫ ∞

−∞α (ξ)y (ξ, t) dξ. (3.4)

We shall find the last form of the superposition principle particularly useful inthe application of integral transform techniques to linear systems. A systemoperator T that obeys the superposition principle is called a linear operator.We recall that both the Fourier transform and the Hilbert transform are linearoperators.

3.1.2 Impulse Response

In the following we specialize to single input/single output systems so that

y (t) = T {f (t)} . (3.5)

Suppose the input is the impulse δ (t− τ). The corresponding output is calledthe system impulse response h (t, τ ). The linear operator in (3.5) provides theformal identity

h (t, τ ) = T {δ (t− τ)} . (3.6)

Note that the impulse response depends on two variables: the time of observa-tion t and the time τ at which the impulse is applied. Since

f (t) =

∫ ∞

−∞f (τ ) δ (t− τ ) dτ

Page 208: Signals and transforms in linear systems analysis

194 3 Linear Systems

we can substitute it into (3.5), and using the superposition principle, bring thelinear operator into the integrand to operate on the impulse function directly.Thus

y (t) = T

{∫ ∞

−∞f (τ) δ (t− τ ) dτ

}=

∫ ∞

−∞f (τ)T {δ (t− τ )} dτ .

Substituting (3.6) into the last integral gives

y (t) =

∫ ∞

−∞f (τ )h (t, τ) dτ , (3.7)

which is perhaps the most important single relationship in linear systems the-ory. It shows explicitly that the impulse response completely characterizes theinput/output properties of a linear system in the sense that if we know the im-pulse response the response due to any input f (t) is uniquely determined. It isimportant to note that in accordance with (3.7) this input must be known forall the infinite past and future. In practice this is of course neither possible nornecessary. We postpone until a later juncture the discussion of the importantpractical consequences of specifying f (t) only over a finite segment of the timeaxis.

Note that the linear operator T defined previously only in terms of its super-position properties now assumes the specific form of an integral operator withkernel h (t, τ ). The question may well be asked at this point whether all singleinput/single output linear systems are governed by integral operators. An affir-mative answer can be given provided h (t, τ ) is allowed to encompass singularityfunctions. For example, suppose

h (t, τ ) = a0δ (t− τ) + a1δ(1) (t− τ ) , (3.8)

where a0 and a1 are constants. Then (3.7) gives

y (t) = a0f (t) + a1df (t)

dt, (3.9)

so that in this case an alternative representation of the integral operator is

T {f (t)} ={a0 + a1

d

dt

}f (t) = a0f (t) + a1

df (t)

dt, (3.10)

i.e., a differential operator. An important feature of a differential operator notshared by an integral operator (with a non symbolic kernel) is that it is mem-oryless: the output at a specific time, say t = t1, is dependent entirely on theproperties of the input at t = t1 only. Clearly we can generate differential oper-ators of any order by appending terms with higher order singularity functionsto (3.8).

We have associated the notion of causality with physical realizability of anysystem. Let us now examine the implications of this concept with respect to

Page 209: Signals and transforms in linear systems analysis

3.1 Fundamental Properties 195

a linear system. Thus starting with two possible inputs f1 (t) and f2 (t) thecorresponding outputs are T {f1 (t)} = y1 (t) and T {f2 (t)} = y2 (t) we supposethat f1 (t) = f2 (t) for t < T . If the system is causal, then in accordance withthe previous definition y1 (t) = y2 (t) for t < T . However since the system isalso linear it follows by superposition that the input f1 (t)− f2 (t) = 0 for t < Tresults in the output y1 (t) − y2 (t) = 0 for t < T . In other words, for a linearsystem the definition of causality may be modified to read: a system is causal iffor any nontrivial input which is zero for t < T the output is also zero for t < T .

We now prove that a linear system represented by the linear operator (3.7)is causal if and only if

h (t, τ ) = 0 for t < τ. (3.11)

First suppose (3.11) holds. Then from (3.7) it follows that y (t) =∫ t−∞ f

(τ )h (t, τ) dτ from which we see directly that if f (t) = 0 for t < T thenalso y (t) = 0 for t < T . Hence the system is causal. On the other hand,suppose the system is causal. Then (3.7) gives with f (t) = 0 for t < Ty (t) =

∫∞T f (τ )h (t, τ) dτ = 0 for t < T . The last expression must be satisfied

for arbitrary f (τ) for τ > T . This is possible only if h (t, τ) = 0 for τ > T , butsince T > t this implies (3.11).

In summary the input/output relationship for a causal linear system is

y (t) =

∫ t

−∞f (τ )h (t, τ) dτ . (3.12)

The relationship between the position of the impulse in time and the responseof a causal linear system is shown in Fig. 3.2.

d (t−t )

tt

t

h(t,t )

t

Figure 3.2: Impulse response of a causal linear system

Even though only causal systems are physically realizable, it is frequentlyconvenient to use simplified system models that are not causal so that the generalform of the input/output relationship (3.7) is still of value.

Page 210: Signals and transforms in linear systems analysis

196 3 Linear Systems

3.1.3 Step Response

Certain features of a linear system are better represented by its response to aunit step rather than to a unit impulse. We define the step response a(t, τ ) asthe response to U (t− τ ), i.e.,

a(t, τ) = T {U (t− τ )} . (3.13)

In view of (3.7) we get the relationship

a(t, τ ) =

∫ ∞

−∞h(t, τ ′)U (τ ′ − τ ) dτ ′

=

∫ ∞

τ

h(t, τ ′)dτ ′. (3.14)

Differentiation yieldsda(t, τ )

dτ= −h(t, τ). (3.15)

Thus we can replace (3.7) by the equivalent form

y (t) = −∫ ∞

−∞f (τ )

da(t, τ )

dτdτ . (3.16)

When the system is causal a(t, τ ) = 0 for t < τ so that the preceding becomes

y (t) = −∫ t

−∞f (τ )

da(t, τ )

dτdτ . (3.17)

3.1.4 Stability

An important attribute of a system is that of stability, a term that generallyrefers to some measure of finiteness of the system response. Among the sev-eral possible definitions of stability we shall mention only one: the so-calledbounded input/bounded output (BIBO) stability. Its formal definition is as fol-lows. A system is said to be BIBO stable if for any bounded input f (t), i.e.,|f (t)| < B = constant, there exists a constant I such that |y (t)| < BI. Weshall prove the following [16]. A linear system is BIBO stable if and only if

∫ ∞

−∞|h (t, τ)| dτ ≤ I <∞ for all |t| <∞. (3.18)

To prove sufficiency we suppose that (3.18) holds. Then if |f (t)| < B we

have |y (t)| =∣∣∣∫∞−∞ f (τ)h (t, τ ) dτ

∣∣∣ ≤ B∫∞−∞ |h (t, τ )| dτ ≤ BI. To prove the

necessity consider an input of the form

f (ξ, t) =

{h(ξ,t)|h(ξ,t)| ;h (ξ, t) = 0,

1 ;h (ξ, t) = 0.

Page 211: Signals and transforms in linear systems analysis

3.1 Fundamental Properties 197

where ξ may be regarded as fixed parameter. Clearly f (ξ, t) is a bounded func-tion for all real ξ and the system output is

y (ξ, t) =

∫ ∞

−∞

h (ξ, τ )

|h (ξ, τ )|h (t, τ) dτ.

Now at t = ξ this becomes

y (ξ, ξ) =

∫ ∞

−∞

h2 (ξ, τ )

|h (ξ, τ)|dτ =

∫ ∞

−∞|h (ξ, τ )| dτ

so that for y (ξ, ξ) to be bounded requires the boundedness of the last integral.

3.1.5 Time-invariance

A very important special class of linear systems are those termed time-invariant.We say a linear system is time-invariant if the input/output relationship

y (t) = T {f (t)} (3.19a)

impliesy (t− T ) = T {f (t− T )} (3.19b)

for all real-time shifts T . In other words, the absolute time of initializationof the input has no effect on the relative relationship between the input andthe output. For example, if we were to subject a linear time-invariant (LTI)system to inputs with different time delays, we would observe a sequence suchas depicted in Fig. 3.3.

f (t) f (t−T1) f (t−T2)

y(t) y(t−T1) y(t−T2)

t

t

Figure 3.3: Input/output relationship for a time-invariant system

Physically the property of time-invariance means that the system input/output relationship is defined by a set of parameters that do not vary withtime. For example, as we shall see in the sequel, if the input/output relationshipis described by a differential equation, then time-invariance requires that thecoefficients of this differential equation be constants (i.e., not functions of time).

Page 212: Signals and transforms in linear systems analysis

198 3 Linear Systems

To see what constraint is imposed by time-invariance on the system impulseresponse we return to (3.7) and compute the output due to a time shifted input

y (t) =

∫ ∞

−∞f (τ − T )h (t, τ ) dτ . (3.20)

By time-invariance the same result must follow by simply displacing the timevariable t in (3.7) by T . Thus

y (t− T ) =∫ ∞

−∞f (τ)h (t− T, τ) dτ

=

∫ ∞

−∞f (τ − T )h (t− T, τ − T )dτ . (3.21)

Since (3.20) and (3.21) must be identical for arbitrary f (τ − T ) we haveh (t, τ) = h (t− T, τ − T ), independent of the choice of T . Clearly this impliesthat for a time-invariant system the impulse response is a function only of thedifference between the time of the application of the impulse and the time ofobservation. Therefore we can write for the impulse response

h (t, τ) = h (t− τ) . (3.22)

The property of time-invariance allows us to specify the system impulseresponse in terms of only a single independent variable. This follows from theobservation that h (t− τ ) can be determined for any τ from the specificationh (t, 0) ≡ h (t). In virtue of (3.22) the input/output relationship (3.7) can nowbe rewritten as the convolution

y (t) =

∫ ∞

−∞f (τ )h (t− τ ) dτ ≡

∫ ∞

−∞f (t− τ )h (τ ) dτ . (3.23)

Also the condition for BIBO stability (3.18) can be restated in terms of onlyone variable, viz., ∫ ∞

−∞|h (t)| dt <∞. (3.24)

Note that the concepts of causality and time invariance are independent ofeach other. If a time-invariant system is also causal, the added constraint (3.22)in (3.11) requires that the impulse response vanishes for negative arguments, i.e.,

h (t) = 0 for t < 0. (3.25)

Observe that for a causal LTI system the convolution (3.23) may be modifiedto read

y (t) =

∫ t

−∞f (τ )h (t− τ ) dτ

=

∫ ∞

0

f (t− τ )h (τ ) dτ . (3.26)

Page 213: Signals and transforms in linear systems analysis

3.2 Characterizations in terms of Input/Output Relationships 199

For an LTI system the step response (3.13) is likewise a function of t−τ andis completely defined in terms of the single variable t. Accordingly (3.14) nowbecomes

a(t) =

∫ t

−∞h (τ ) dτ , (3.26*)

which for a causal system reads

a(t) =

∫ t

0

h (τ ) dτ . (3.27)

Thus unless h(t) has a delta function at the origin, a(0) = 0. In view of (3.27)we have da(t)/dt = h(t) and if the LTI system is causal we may replace (3.17) by

y (t) =

∫ t

−∞f (τ) a′(t− τ )dτ , (3.28)

where the ′ denotes the derivative with respect to the argument. When theinput is zero for t < 0 y (t) exists only for positive t in which case we mayreplace (3.28) by

y (t) =

∫ t

0

f (τ) a′(t− τ )dτ . (3.29)

Formula (3.29) is referred to as the Duhamel integral and has played an impor-tant role in the older literature on linear systems (before the popularization ofthe delta function).

3.2 Characterizations in terms of Input/OutputRelationships

3.2.1 LTI Systems

The exponential function plays a unique role in the analysis of LTI systems. Thespecial feature that accounts for this prominence is the replication by an LTIsystem of an exponential input in the output. Even though this follows almostby inspection from the basic convolution representation (3.31) it is instructive toderive it by invoking explicitly the postulates of linearity and time-invariance.1

Consider the input eiωt where ω is a real parameter. Then

T{eiωt

}= q (t, ω)

and introducing a time delay τ we have by virtue of time-invariance

T{eiω(t−τ)

}= q (t− τ , ω)

1This proof is believed to have been first presented by Norbert Wiener in “The FourierIntegral and Some of its Applications,” Cambridge University Press, New York, 1933, wherethe LTI operator is referred to as “the operator of the closed cycle.”

Page 214: Signals and transforms in linear systems analysis

200 3 Linear Systems

Furthermore linearity permits us to factor out e−iωτ so that

T{eiωt

}= eiωτ q (t− τ , ω) .

But τ is arbitrary so that in the last expression we may set it equal to t result-ing in

T{eiωt

}= eiωtq (0, ω) . (3.30)

Equation (3.30) may be interpreted as an eigenvalue problem for the linear op-erator T wherein eiωt is an eigenfunction and q (0, ω) an eigenvalue. In otherwords the exponential eiωt is an eigenfunction of every LTI system. Interpretingthe real parameter ω as the frequency (3.30) states that an LTI system is inca-pable of generating frequencies at the output not present at the input. Clearly ifthe input is comprised of a sum of sinusoids say

∑αne

iωnt, then superpositiongives

T{∑

αneiωnt

}=

∑eiωntαnq (0, ωn) . (3.31)

Note that here and in (3.30) the sinusoidal inputs are presumed to be of infiniteduration so that (3.30) and (3.31) are not valid for inputs initiated at some finitetime in the past. Not all LTI systems can be characterized by a finite q (0, ω).As we shall see in the following for a finite exponential response to exist itis sufficient (but not necessary) that the system be BIBO stable. How is theconstant q (0, ω) related to the system impulse response? The answer is providedat once by (3.23) with f (t) = eiωt. After a simple change of variable we obtain

y (t) = eiωt∫ ∞

−∞h (τ ) e−iωτdτ = eiωtH (ω) (3.32)

so that q (0, ω) = H (ω) is identified as the Fourier Transform of the impulseresponse, or, using our compact notation

h (t)F⇐⇒ H (ω) . (3.33)

Hence a finite exponential response (3.30) will exist for any LTI system whoseimpulse response possesses a Fourier Transform. In virtue of (3.24) this is thecase for all BIBO stable systems. On the other hand we know that absoluteintegrability is not a necessary requirement for the existence of the FourierTransform. Thus even though the system with the impulse response

h (t) =sin at

πtU (t) (3.34)

is not BIBO stable its Fourier Transform exists. Indeed,

H (ω) =1

2pa (ω) + i

1

2πln

∣∣∣∣a− ωa+ ω

∣∣∣∣ . (3.35)

On the other hand the impulse response h (t) = eαtU (t) with α > 0 is neitherBIBO stable nor does it possess a Fourier Transform.

Page 215: Signals and transforms in linear systems analysis

3.2 Characterizations in terms of Input/Output Relationships 201

We shall refer to H (ω) as the transfer function of the LTI system, a termjustified by the frequency domain representation of the convolution (3.23):

Y (ω) = F (ω)H (ω) . (3.36)

Let us now assume an input of the form

f (t) = eiωtU (t− T ) ,

i.e., an exponential that has been initialized at some finite t = T . Assuming theexistence of the FT, the convolution integral for the response becomes

y (t) =

∫ ∞

−∞eiωτU (τ − T )h (t− τ) dτ

=

∫ ∞

T

eiωτh (t− τ) dτ

= eiωt∫ t−T

−∞e−iωτh (τ ) dτ

= eiωtH(ω)− eiωt∫ ∞

t−Th (τ) e−iωτdτ . (3.37)

The last expression may be interpreted as follows. The first term on the right isthe output to an exponential, initialized in the infinite past as in (3.32), whilethe second term may be considered a correction that accounts for the finiteinitialization time. Since this term vanishes as t − T −→ ∞ it is referred toas the transient response of the system. Thus after a sufficiently long time haselapsed the system output approaches eiωtH(ω) which is defined as the steadystate system response. Evidently if we know the transfer function then, bysuperposition, we can write down the steady response to any linear combinationof sinusoids. For example, the steady state response to A cos (ω0t− θ) is

T {A cos (ω0t− θ)} = A/2[H (ω0) ei(ω0t−θ) +H (−ω0) e

−i(ω0t−θ)], (3.38)

which is just another combination of sinusoids at the input frequency ω0. Sincewe know H(ω), we can determine h (t) from (3.33) so that the transient responsecould be determined using the second term on the right of (3.37), a calculationrequiring, of course, rather more effort.

3.2.2 Time-varying Systems

Relationship between the Impulse Response and the Responseto an Exponential Input

It is instructive to see how the preceding results are modified if the system istime-variant. Thus suppose we again attempt to characterize the system bymeans of the response to an exponential input. Such a response will, of course,

Page 216: Signals and transforms in linear systems analysis

202 3 Linear Systems

no longer be representable by the product of a time-independent parameterwith the exponential input but will in general be comprised of a time-varyingamplitude and phase. Resorting again to the linear operator symbolism, theinput/output relationship reads

T{eiωt

}= A (ω, t) eiθ(ω,t), (3.39)

wherein A (ω, t) and θ (ω, t) are, respectively, the modulus and argument of acomplex number and therefore real functions. As in case of an LTI systemit is sufficient for the existence of such a representation to assume that thesystem is BIBO stable. What can we say about the impulse response of such asystem? It is not hard to see that just like for an LTI system the response to anexponential defines the system impulse uniquely. Thus, since by definition theimpulse response is a linear operation on a time-delayed delta function, we writeT {δ (t− τ )} = h (t, τ ) and represent this delta function as a superposition ofexponentials (see 2.119 in 2.2). We linear operator symbol into the integrand.This results in

T {δ (t− τ)} = T

{1

∫ ∞

−∞eiω(t−τ)dω

}=

1

∫ ∞

−∞T{eiω(t−τ)

}dω

=1

∫ ∞

−∞T{eiωt

}e−iωτdω.

Finally substituting from (3.39) into the last integrand we obtain

h (t, τ) =1

∫ ∞

−∞A (ω, t) eiθ(ω,t)e−iωτdω. (3.40)

If the system is LTI, then A (ω, t) eiθ(ω,t) = A (ω) eiθ(ω)eiωt with A (ω) eiθ(ω) =H (ω) independent of t, so that, as expected, (3.40) gives h (t, τ) = h (t− τ).Thus, by analogy with an LTI system, we may regard the quantity

A (ω, t) ei[θ(ω,t)−ωt] (3.41)

as the transfer function of a time-varying system, provided we take note of thefact that unlike in the LTI system this transfer function is not the FT of theimpulse response.

As an example, suppose we want to determine the impulse response of thelinear system defined by

T{eiωt

}=

1

(αt)2+ iω

eiωt, (3.42)

where α is a constant. Clearly the system is not time-invariant because theoutput is not of the form H(ω)eiωt. By superposition the impulse response is

h (t, τ) =1

∫ ∞

−∞

eiω(t−τ)

(αt)2+ iω

dω = e−(αt)2(t−τ)U (t− τ ) . (3.43)

Page 217: Signals and transforms in linear systems analysis

3.2 Characterizations in terms of Input/Output Relationships 203

From this we note that the system is causal, and, in accordance with (3.18),BIBO stable for α = 0. On the other hand, if the exponential eiωt is initializedat t = 0 we get

∫ t

0

e−(αt)2(t−τ)eiωτdτ = − e−(αt)2t

(αt)2+ iω

+1

(αt)2+ iω

eiωt. (3.44)

The first term on the right is the transient response since it vanishes as t→∞while the second term represents the steady state response. As expected, it isidentical with the right side of (3.42). Note that this response is not purelysinusoidal but rather comprised of sinusoids with slowly varying amplitude andphase factors. Its deviation from the pure single spectral line of the input isbest appreciated by viewing its FT which is easily found to be

Y (ω) = − π

K1e−|1−ω/ω0|K2eiπ/4e−i|1−ω/ω0|K2 . (3.45)

We have relabeled the fixed input frequency in (3.42) by ω0 and introduced

the dimensionless parameters K1 = αω1/20 and K2 = α−1ω

3/20 /2. A plot of

magnitude of Y (ω) as a function of ω/ω0 is shown in Fig. 3.4.

0.98 0.985 0.99 0.995 1 1.005 1.01 1.015 1.02 1.0250

5

10

15

20

25

30

35

w 0

w

K1 = 10, K2 = 500

Figure 3.4: Plot of |Y (ω)| in Eq. (3.45)

Very often instead of the impulse response the transfer function (3.41) ofa system constitutes the given data (either experimentally or through an an-alytical model). It is then more efficient to use this transfer function directlyrather than going through the intermediate step of finding the system impulse

Page 218: Signals and transforms in linear systems analysis

204 3 Linear Systems

response. We get the required formula by putting (3.40) into (3.7) in Sect. 3.1and obtain

y (t) =1

∫ ∞

−∞A (ω, t) eiθ(ω,t)

{∫ ∞

−∞f (τ ) e−iωτdτ

}dω

=1

∫ ∞

−∞A (η, t) eiθ(η,t)F (η) dη, (3.46)

where F (η) is the FT of f (t). To emphasize that the indicated integration isnot being carried out directly over the frequency spectrum of the system outputwe have replaced the integration variable ω with η. In fact to find the spectrumof the output we must still compute the FT of the right side of (3.46). Theresult may be written as follows

Y (ω) =1

∫ ∞

−∞H (η, ω)F (η) dη, (3.47)

where we have introduced the transform pair

A (η, t) eiθ(η,t)F⇐⇒ H (η, ω) . (3.48)

Equation (3.47) highlights the fact that a time-varying system does not merelydeemphasize(attenuate) and (or) enhance the frequency components alreadypresent in the input signal (as in an LTI system) but can generate frequencycomponents not present in the input as we have already seen in the example inFig. 3.4. Evidently for an LTI system

H (η, ω) = 2πH (ω) δ(ω − η), (3.49)

where H (ω) = A (ω) eiθ(ω) so that we recover (3.32).There are also cases where the transfer function (3.41) does not provide a

useful characterization of a time-varying system. For example, for a system withthe impulse response

h (t, τ ) = cos [α (T − t) τ ]U (t− τ ) (3.50)

the transfer function does not exist. Nevertheless this impulse response is auseful approximation to a pulse compression circuit. Taking the rectangularpulse pT/2 (t) as the input we find

y (t) =

⎧⎪⎨

⎪⎩

0 ; t ≤ −T/2,sin[α(t−T )t]α(T−t) ; − T/2 ≤ t ≤ T/2,

2 sin[α2 (T−t)T ]α(T−t) ; t ≥ T/2.

(3.51)

A plot of (3.51) is shown in Fig. 3.5.The rectangular pulse of duration T is compressed to a “pulse” of nominal

duration 4π/αT . In view of the functional form of the output we see that thiscompression is accomplished by the system by effectively performing an FT onthe input.

Page 219: Signals and transforms in linear systems analysis

3.2 Characterizations in terms of Input/Output Relationships 205

−1 −0.5 0 0.5 1 1.5 2

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

1.2

aT4p

Ty(t)

Tt

pT/2(t)

Figure 3.5: Pulse compression

Communications Channel Modeling

Mathematical representations of time-varying linear system play a key role inthe modeling of so-called fading communications channels. This includes mo-bile radio communications, where signal fading is engendered by the motionof receivers and (or) transmitters relative to fixed scatterers and reflectors, andcommunications wherein the propagation paths between receivers and transmit-ters traverse intrinsically time-varying media as, e.g., in case of the ionosphereor the troposphere. In these applications it turns out to be more convenient touse a somewhat different form of the system response function than presentedin the preceding. In the following we present a brief account of this alternativeformulation.

To begin with, we introduce a new variable τ = t− τ and define a modifiedimpulse response function c(τ ; t) by

c(τ ; t) = h(t, t− τ ). (3.52)

Since t is the time of observation and τ the time at which the impulsive inputis applied, τ is just the time interval between the observation and the applica-tion of the impulse. Clearly if the system is LTI, then c(τ ; t) is only a functionof τ so that any explicit dependence of c(τ ; t) on the second variable (i.e., theobservation time t) is a direct measure of the extent to which the system charac-teristics deviate from an LTI model. In terms of the modified impulse responsethe system output in (3.7) becomes

y (t) =

∫ ∞

−∞c (τ ; t) f (t− τ ) dτ . (3.53)

Page 220: Signals and transforms in linear systems analysis

206 3 Linear Systems

Note also that from the definition of τ it follows that for a causal system

c(τ ; t) = 0 ; τ < 0 (3.54)

in which case the expression for the response is

y (t) =

∫ ∞

0

c (τ ; t) f (t− τ) dτ . (3.55)

The dependence of the modified impulse response on the first variable is a mea-sure of the frequency dispersive property of the channel whereas the dependenceon the second variable is a measure of the Doppler spread. For example, themodified impulse response for a mobile link with terrain reflections might be

c (τ ; t) = h0(τ ) + β (t) δ[τ − τ0(t)], (3.56)

where the first term on the right is the direct path contribution and the secondterm represents the time-varying multipath signal component resulting, e.g.,from terrain reflections. Inserting this into (3.55) we see that the response re-sulting from the first term is that of an LTI system with some fixed frequencydispersive properties (depending on the nature of the direct path). The contri-bution from the second term is β (t) f [t − τ0(t)] which is a combination of anamplitude fluctuation (amplitude fading) and a variable time delay resulting ina Doppler spread of the signal.

By evaluating the FT with respect to the time delay variable and the timeof observation the effects of dispersion and Doppler spread can be viewed intheir respective frequency domains. To represent the spectrum of the dispersiveeffects we employ (3.40)

c (τ ; t) =1

∫ ∞

−∞A (ω, t) eiθ(ω,t)−iωteiωτdω, (3.57)

which yields the transform pair

c (τ ; t)F⇐⇒

(τ ,ω)A (ω, t) exp {i [θ (ω, t)− ωt]} . (3.58)

The transform of (3.57) with respect to t is

c (τ ; t)F⇐⇒

(t,η)

1

∫ ∞

−∞C (ω, η) exp (iωτ) dω, (3.59)

where we have used (3.48) with an interchange of the variables ω and η so that

C (ω, η) = H(ω, ω + η) =

∫ ∞

−∞A (ω, t) eiθ(ω,t)e−iωte−iηtdt. (3.60)

Taking account of (3.58) we can rewrite (3.60) as a 2-D FT:

C (ω, η) =

∫ ∞

−∞

∫ ∞

−∞c (τ ; t) e−iωτ−iηtdτdt. (3.61)

Page 221: Signals and transforms in linear systems analysis

3.3 Linear Systems Characterized by Ordinary Differential Equations 207

As an example consider a moving source of radiation with a fixed wavepropagation speed v excited by the exponential input signal eiωt. The lineartransformation of this exponential into the RF signal at the antenna terminalsof a fixed receiver located at range r(t) can be represented as

T{eiωt

}=H0 (ω)G (ω, t)

r (t)e−i

ωr(t)v eiωt, (3.62)

where H0 (ω) is the transfer function of an equivalent stationary source (whichfor our purposes may be considered an LTI system) and the factor G (ω, t) isproportional to the square root of the product of the transmitter and receiverantenna gains at the instantaneous position of the trajectory defined by r (t).As in (3.43) we can find the impulse response by superposition with the result

c (τ ; t) =1

∫ ∞

−∞

H0 (ω)G (ω, t)

r (t)eiω(τ−

r(t)v )dω. (3.63)

Generally the variation of the gain with frequency is negligible within the signalinformation band so that we may set G (ω, t) ≈ G (t). In that case (3.63)becomes

c (τ ; t) =

(G (t)

r (t)

)1

∫ ∞

−∞H0 (ω) e

iω(τ− r(t)v )dω (3.64)

and the 2-D spectrum (3.61) is

C (ω, η) = H0 (ω)

∫ ∞

−∞

G (t)

r (t)e−i

ωr(t)v e−iηtdt. (3.65)

The integral gives the spectrum of the Doppler spread. For a stationary sourceit reduces to 2πH0 (ω) δ (η) [G/r] e

−iωrv

3.3 Linear Systems Characterized by Ordinary

Differential Equations

3.3.1 First-Order Differential Equations

In the preceding discussion we dealt only with the input/output characteriza-tion of a linear system. Such a description does not reveal anything about thephysical processes that give rise to the system impulse response. The relation-ship between the impulse response and the dynamical variables governing thesystem must be established with the aid of the underlying differential/integralequations. For example, the following first-order linear differential equation

dy (t)

dt+ a (t) y (t) = f (t) , (3.66)

where a (t) is a given function of time, characterizes a linear system with inputf (t) and output y (t). We can represent this system by the feedback network in

Page 222: Signals and transforms in linear systems analysis

208 3 Linear Systems

+

−f(t) y(t)

a(t)

Figure 3.6: Feeedback representation of first-order system

Fig. 3.6 consisting of an integrator, a time-varying transducer and a differencingnetwork.

Although much of the theory can be developed for quite general a (t) for ourpurposes it will be sufficient to assume that is a smooth bounded function oftime. Using our previous notation for a linear operator, y (t) = T {f (t)}. More-over if the input is prescribed for −∞ < t <∞ the output can be represented interms of the impulse response as in (3.7). Let us determine the impulse responseh (t, τ) of this system. Formally we are required to solve (3.66) when the inputequals δ (t− τ ). Thus

dh (t, τ )

dt+ a (t)h (t, τ ) = δ (t− τ) . (3.67)

This first-order equation must be solved subject to an initial condition. For ex-ample, if we impose the requirements that h (t, τ ) be causal, then the initialcondition is given by (3.11) in Sect. 3.1. When t = τ the right side of (3.67) iszero so that h (t, τ) satisfies the homogeneous equation whose solution is

h (t, τ ) = C (τ ) exp−∫ t

0

a (t′) dt′, (3.68)

where C (τ) is an integration constant that may depend on τ . To find thisconstant we first integrate both sides of (3.67) with respect to t between thelimits t = τ − ε ≡ τ−and t = τ + ε ≡ τ+, where ε is an arbitrarily small positivequantity. Thus

∫ τ+

τ−

dh (t, τ )

dtdt+

∫ τ+

τ−a (t)h (t, τ) dt =

∫ τ+

τ−δ (t− τ ) dt = 1 (3.69)

and integrating the first member yields

h(τ+, τ

)− h (τ−, τ

)+

∫ τ+

τ−a (t)h (t, τ) dt = 1. (3.70)

In virtue of the causality constraint h (τ−, τ) = 0, and since a (t) is bounded wehave in the limit of vanishingly small ε

limε→0

∫ τ+

τ−a (t)h (t, τ ) dt = 0.

Page 223: Signals and transforms in linear systems analysis

3.3 Linear Systems Characterized by Ordinary Differential Equations 209

Consequently (3.70) yieldsh(τ , τ) = 1. (3.71)

Setting t = τ in (3.68) and using (3.71) we find the constant C (τ )= exp∫ τ0

a (t′) dt′ and taking account of causality we obtain

h (t, τ) =

{exp− ∫ t

τ a (t′) dt′ ; t ≥ τ ,

0 ; t < τ .(3.72)

Alternatively and more compactly the preceding may be written

h (t, τ ) =

(exp−

∫ t

τ

a (t′) dt′)U (t− τ ) . (3.73)

Once the impulse response has been determined the response to an arbitraryinput f (t) is given in terms of the superposition integral (3.12 in Sect. 3.1). Fromthis superposition integral it would appear that to determine the response atany time t requires the knowledge of the input over the entire infinite past up tothe present instant t. This is more information than would be normally availablefor practically the input f (t) would be known only for t at and past some timet0. How does one determine the output under these circumstances? It turnsout that because we are dealing here with a causal system the knowledge ofthe input prior to t0 is actually superfluous provided we know the value of theoutput (initial condition) at t = t0. To show this explicitly let us select a pointt0, y (t0) and rewrite the system response (3.12 in Sect. 3.1) as a sum of twocontributions one dependent on the input prior to t0 and the other dependenton the input for t > t0. Thus we have for t ≥ t0

y (t) =

∫ t0

−∞f (τ)h (t, τ ) dτ +

∫ t

t0

f (τ )h (t, τ) dτ . (3.74)

Setting t = t0 this gives

y (t0) =

∫ t0

−∞f (τ )h (t0, τ ) dτ (3.75)

so that the initial condition, i.e., y (t0), is completely determined by thesummation of the input over the infinite past up to t = t0. Definingy (t)= exp− ∫ t

0 a (t′) dt′ we obtain the identity exp− ∫ t

t0a (t′) dt′ = y (t) [y (t0)]

−1=

h (t, t0) so that h (t, τ ) = h (t, t0)h (t0, τ ). Inserting this in the first integralin (3.74) and using (3.75) we get

y (t) = h (t, t0) y (t0) +

∫ t

t0

f (τ )h (t, τ) dτ ; t ≥ t0. (3.76)

Note thaty0i (t) ≡ h (t, t0) y (t0) (3.77)

Page 224: Signals and transforms in linear systems analysis

210 3 Linear Systems

i(t) R(t) C v (t)

+

Figure 3.7: Time-varying resistor circuit

satisfies the homogeneous differential equation and when evaluated at t = t0yields y (t0). In virtue of (3.75) the latter is determined by the portion of theinput signal defined for−∞ < t < t0. Thus for t ≥ t0 the net effect on the outputof this portion of the input is captured by the solution of the homogeneousequation subject to the initial condition y0i (t0) = y (t0). Because y0i (t) isobtained with f (t) = 0 for t ≥ t0 it is frequently referred to as the zero inputresponse.2 The output engendered by segment of f (t) for t ≥ t0 is

y0s (t) ≡∫ t

t0

f (τ )h (t, τ) dτ (3.78)

and corresponds to the output with zero initial condition at t = t0. It is usuallyreferred to as the zero state response. For an LTI system a(t) = a = constant sothat (3.73) takes on the form h(t− τ ) = e−a(t−τ)U(t− τ) which we may replaceit with the simpler statement

h(t) = e−atU(t). (3.79)

The representation of the solution in terms of a zero state and a zero inputresponse in (3.76) now reads

y (t) = e−a(t−t0)y (t0) +∫ t

t0

f (τ ) e−a(t−τ)dτ ; t ≥ t0. (3.80)

Example. As an illustration of a time-varying first-order linear system con-sider the circuit in Fig. 3.7 where input is the current i (t) supplied by an idealcurrent source and the output is the voltage v (t) taken across a fixed capacitorC in parallel with the time-varying resistor R(t).

Using Kirchhoff’s circuit laws we get

dv

dt+

1

R(t)Cv =

1

Ci (t) . (3.81)

2Note that the zero input response (3.77) can also be obtained by exciting the systemwith f(t) = y(t0)δ(t − t0). This is just a particular illustration of the general equivalenceprinciple applicable to linear differential equations permitting the replacement of a homoge-neous system with specified nonzero initial conditions by an inhomogeneous system with zeroinitial conditions using as inputs singularity functions whose coefficients incorporate the initialconditions of the original problem.

Page 225: Signals and transforms in linear systems analysis

3.3 Linear Systems Characterized by Ordinary Differential Equations 211

0 2 4 6 8 10 120

1

2

3

4

5

6

7

8

9

10

Figure 3.8: Plot of αCR (t) vs ω0t

We suppose that the resistor varies sinusoidally as follows:

1/R(t) ≡ G (t) = αC cos2 ω0t =αC

2[1 + cos 2ω0t], (3.82)

where α is a positive constant. Figure 3.8 shows a plot of αCR(t) as a functionof ω0t.

We see that the resistor acts approximately like an on–off switch, pre-senting an open circuit at t = (2n + 1)π/2ω0 and a resistance of 1/αC att = nπ/ω0, n = 0,±1,±2, . . .. Physically, a time-varying resistance of this sortcan be approximated by applying a sinusoidal excitation (local oscillator) to acircuit comprised of a suitable nonlinear device (e.g., a diode). If the magnitudeof the input current is sufficiently small, the relationship between v(t) and i(t)may be modeled as a linear system which is generally referred to as small sig-nal analysis of RF mixers or frequency converters. For example, in the contextof a heterodyne receiver we may consider the current i(t) as the informationbearing RF input signal in which case the voltage v(t) represents the desireddownconverted (baseband) signal, and (since the circuit in Fig. 3.7 includes noband-pass filter) spurious responses in form of beats and their harmonics.

To maintain consistency with the parameters introduced in (3.66) we setf (t) = i(t)/C. Upon substituting a (t) = 1/R(t)C = α cos2 ω0t in (3.73) weobtain the impulse response

h(t, τ ) = e−α2 (t−τ)e−

α4ω0

[sin 2ω0t−sin 2ω0τ ]U(t− τ ). (3.83)

If we specify the initial conditions at t = 0, the complete response for t ≥ 0reads

v(t) = v(0)e−α2 te

− α4ω0

sin 2ω0t

+1

C

∫ t

0

e−α2 (t−τ)e−

α4ω0

[sin 2ω0t−sin 2ω0τ ]i (τ ) dτ , (3.84)

Page 226: Signals and transforms in linear systems analysis

212 3 Linear Systems

wherein, in accordance with our previous designation, the first term on the rightis the zero input response and the second term the zero state response. The nu-merical evaluation of the integral in (3.84) is facilitated by the introduction ofthe Fourier series representation

e±α

4ω0sin 2ω0t =

∞∑

n=−∞q±n e

i2nω0t (3.85)

with3

q±n =ω0

π

∫ πω0

− πω0

e±α

4ω0sin 2ω0te−i2nω0tdt. (3.86)

Let us now consider a sinusoidal current of the form

i(t) = A cos(ω1t+ ϕ), (3.87)

which may be taken as the carrier of an RF signal (ϕ is a fixed phase reference).Substituting this in (3.84) and carrying out the simple integration term-by-termand letting αt ∼ ∞ we obtain the steady state response

v(t) ∼ A

2C

∞∑

�=−∞

{β+� e

i(2�ω0+ω1)t + β−� e

i(2�ω0−ω1)t}, (3.88)

where

β+� = eiϕ

∞∑

n=−∞

q−�−nq+n

2nω0 + ω1 + α/2, (3.89a)

β−� = e−iϕ

∞∑

n=−∞

q−�−nq+n

2nω0 − ω1 + α/2. (3.89b)

These coefficients give the amplitude and phase of the sinusoidal outputs atvarious frequencies. For example, for the fundamental (input) frequency wehave

v0 (t) =A

2C(β+

0 eiω1t + β−

0 e−iω1t). (3.90)

The amplitude and phase of the upper sideband (sum of the carrier and the LOfrequencies) is

v+1 (t) =A

2C(β+

1 ei(2ω0+ω1)t + β−

−1e−i(2ω0+ω1)t) (3.91)

and similarly for the lower sideband (difference of the carrier and the LO fre-quencies)

v−1 (t) =A

2C(β+

−1ei(−2ω0+ω1)t + β−

1 ei(2ω0−ω1)t). (3.92)

3These Fourier coefficients can be expressed in terms of the modified Bessel functions Inas follows: q±n = (±i)nIn(

α4ω0

).

Page 227: Signals and transforms in linear systems analysis

3.3 Linear Systems Characterized by Ordinary Differential Equations 213

An important design parameter of a frequency converter is its conversionefficiency. In the present model it is numerically equal to the squared magnitudeof the coefficients β±

� . For example, the conversion efficiency for the lower

sideband is∣∣β+

−1

∣∣2 =∣∣β−

1

∣∣2.In a practical circuit the converter output frequency (e.g., the lower side-

band) would be selected by means of a band-pass filter so that our analysisis only approximate. The more accurate approach requires the inclusion ofthe effects of the reactances introduced by the filter which in turn leads to adifferential equation of order higher than the first.

3.3.2 Second-Order Differential Equations

Time-varying Systems

Next we consider a linear system described by the second-order differential equa-tion

d2y (t)

dt2+ a(t)

dy (t)

dt+ b(t)y (t) = f(t), (3.93)

where we suppose that the coefficients a(t) and b(t) are continuous functionsof t. As in the first-order system f(t) is the input and y (t) the output. We canagain represent the system by a feedback network. To account for the additionalderivative requires two loops, as shown in Fig. 3.9.

+

−f (t)

a(t)

y(t)

b(t)

∫ ∫

++

Figure 3.9: Feedback representation of second-order system

Let us first consider the homogeneous equation

\frac{d^2 y(t)}{dt^2} + a(t)\frac{dy(t)}{dt} + b(t)\,y(t) = 0. \quad (3.94)

It will be demonstrated in (3.98) that (3.94) has two linearly independent solutions, which we presently denote by y_1(t) and y_2(t). A linear superposition of these can be used to construct the zero input response for t ≥ t_0, which we define as in the case of the first-order system. However, two constants are now required instead of one for its specification. Thus

y_{0i}(t) = \alpha(t_0)\, y_1(t) + \beta(t_0)\, y_2(t); \quad t \ge t_0, \quad (3.95)


where the two constants α(t_0) and β(t_0) are to be determined from the initial conditions y(t_0) and y'(t_0). We find

\alpha(t_0) = \frac{y_2'(t_0)\, y(t_0) - y_2(t_0)\, y'(t_0)}{W[y_1(t_0), y_2(t_0)]}, \quad (3.96a)

\beta(t_0) = \frac{-y_1'(t_0)\, y(t_0) + y_1(t_0)\, y'(t_0)}{W[y_1(t_0), y_2(t_0)]}, \quad (3.96b)

where W[y_1(t_0), y_2(t_0)] ≡ W(t_0) is the Wronskian (see 1.35 in Sect. 1.2) evaluated at t = t_0. It is not hard to show that the Wronskian W(t) satisfies the first-order differential equation

\frac{dW(t)}{dt} + a(t)\, W(t) = 0. \quad (3.97)

Hence it can be determined to within a multiplicative constant from the coefficient a(t), viz.,

W(t) = W(t_0)\, e^{-\int_{t_0}^{t} a(\tau)\, d\tau}. \quad (3.98)

As long as the coefficient a(t) is bounded, so that the integral exists for all finite values of t, (3.98) guarantees that if the Wronskian is not zero at one particular instant of time it will not be zero at any finite time. This means that if the two solutions of the homogeneous equation can be shown to be linearly independent at some fixed time t = t_0, they must be linearly independent for arbitrary finite t. Also, from the definition of W(t) it follows that given the Wronskian and one solution of the homogeneous equation (3.94), say y_1(t), we can find a second solution, linearly independent of the first, by solving the first-order differential equation y_2'(t) − [y_1'(t)/y_1(t)]\, y_2(t) = W(t)/y_1(t) for y_2(t). We get

y_2(t) = \frac{y_1(t)}{y_1(t_0)}\, y_2(t_0) + y_1(t)\int_{t_0}^{t} \frac{W(\tau)}{y_1^2(\tau)}\, d\tau. \quad (3.99)

As in the case of the first-order system we can construct the complete solution to (3.93) in terms of the impulse response. Presently we have

\frac{d^2 h(t,\tau)}{dt^2} + a(t)\frac{dh(t,\tau)}{dt} + b(t)\, h(t,\tau) = \delta(t-\tau), \quad (3.100)

which we solve subject to the causality constraint h(t,τ) = 0; t < τ. Now Eq. (3.100) is presumed to be valid for |t| < ∞ and, in particular, at t = τ. Therefore the delta function source appearing on the right side must be balanced by a delta function on the left side. Because the derivative of a delta function is a singularity function of a higher order (doublet), to achieve a balance the delta function on the left side can be contained only in the second derivative. It follows then that the first derivative of h(t,τ) has a step discontinuity at t = τ, so that h(t,τ) is a continuous function of t at t = τ. Coupled with the causality constraint this requires that

h(t,\tau) = 0; \quad t \le \tau. \quad (3.101)


Because h(t,τ) vanishes identically for t < τ we also have

\frac{dh(t,\tau)}{dt} = 0; \quad t < \tau. \quad (3.102)

For t ≠ τ the impulse response satisfies the homogeneous equation. Therefore we can represent it as a linear combination of y_1(t) and y_2(t). Since it is also continuous at t = τ, the representation valid for all time reads

h(t,\tau) = \begin{cases} A\, y_1(t) + B\, y_2(t); & t \ge \tau, \\ 0; & t \le \tau. \end{cases} \quad (3.103)

The coefficients A and B are functions of τ. One equation follows immediately from (3.101), for at t = τ we must have

A\, y_1(\tau) + B\, y_2(\tau) = 0. \quad (3.104)

The second equation follows from an integration of both sides of (3.100) between the limits t = τ + ε ≡ τ_+ and t = τ − ε ≡ τ_−, as in (3.69). We get

\int_{\tau_-}^{\tau_+} \frac{d^2 h(t,\tau)}{dt^2}\, dt + \int_{\tau_-}^{\tau_+} a(t)\frac{dh(t,\tau)}{dt}\, dt + \int_{\tau_-}^{\tau_+} b(t)\, h(t,\tau)\, dt = \int_{\tau_-}^{\tau_+} \delta(t-\tau)\, dt = 1. \quad (3.105)

Because h(t,τ) and its derivative are integrable over any finite interval we have

\lim_{\varepsilon\to 0}\int_{\tau_-}^{\tau_+} a(t)\frac{dh(t,\tau)}{dt}\, dt = 0, \qquad \lim_{\varepsilon\to 0}\int_{\tau_-}^{\tau_+} b(t)\, h(t,\tau)\, dt = 0,

and taking account of (3.102) the integration of the first member on the left of (3.105) is

\lim_{\varepsilon\to 0}\int_{\tau_-}^{\tau_+} \frac{d^2 h(t,\tau)}{dt^2}\, dt = \left.\frac{dh(t,\tau)}{dt}\right|_{t=\tau_+} = 1. \quad (3.106)

As a result the second relationship between A and B is

A\, y_1'(\tau) + B\, y_2'(\tau) = 1. \quad (3.107)

Solving (3.104) and (3.107) for A and B and substituting in (3.103) we obtain

h(t,\tau) = \frac{y_1(\tau)\, y_2(t) - y_2(\tau)\, y_1(t)}{W[y_1(\tau), y_2(\tau)]}\, U(t-\tau). \quad (3.108)

In parallel with (3.76) the complete solution for t ≥ t_0, comprising the zero input as well as the zero state response, is

y(t) = \alpha(t_0)\, y_1(t) + \beta(t_0)\, y_2(t) + \int_{t_0}^{t} h(t,\tau)\, f(\tau)\, d\tau, \quad (3.109)

where α (t0) and β (t0) are given by (3.96).


Example. With the choice of the coefficients a(t) = (t²+4t+2)(t²+3t+2)^{−1} and b(t) = t(t²+3t+2)^{−1} we can validate by direct substitution that y_1(t) = e^{−t} and y_2(t) = (t+2)^{−1} are solutions of (3.94). The corresponding Wronskian is W(t) = e^{−t}(t+2)^{−2}(t+1). We note that this Wronskian equals zero at t = −1. This is not surprising since the coefficient a(t) becomes infinite at this point, so that the integral in (3.98) will diverge whenever t = −1 falls within the integration interval. Using (3.108) we get the impulse response

h(t,\tau) = \frac{\tau+2}{\tau+1}\left[\frac{\tau+2}{t+2} - e^{-(t-\tau)}\right] U(t-\tau). \quad (3.110)

Although this impulse response is finite provided t and τ exceed −1, it is not BIBO stable since (3.110) fails to satisfy (3.18).
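A short numerical check of this example (a sketch, not part of the text) confirms both claims: that y_1 and y_2 satisfy (3.94), and the value of the impulse response obtained from (3.108):

```python
# Sketch: numerical check of the time-varying example above. We verify that
# y1 = exp(-t) and y2 = 1/(t+2) satisfy (3.94) with the stated a(t), b(t),
# and evaluate h(t,τ) of (3.108) using the stated Wronskian.
import numpy as np

a = lambda t: (t**2 + 4*t + 2) / (t**2 + 3*t + 2)
b = lambda t: t / (t**2 + 3*t + 2)

t = np.linspace(0.0, 5.0, 11)
y1, dy1, d2y1 = np.exp(-t), -np.exp(-t), np.exp(-t)
y2, dy2, d2y2 = 1/(t+2), -1/(t+2)**2, 2/(t+2)**3
print(np.max(np.abs(d2y1 + a(t)*dy1 + b(t)*y1)))   # ~0
print(np.max(np.abs(d2y2 + a(t)*dy2 + b(t)*y2)))   # ~0

def h(t, tau):  # (3.108), with W(τ) = e^{-τ}(τ+1)/(τ+2)²
    W = np.exp(-tau) * (tau + 1) / (tau + 2)**2
    num = np.exp(-tau)/(t + 2) - np.exp(-t)/(tau + 2)
    return np.where(t >= tau, num / W, 0.0)

print(h(3.0, 1.0))
```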

Equivalent First-Order Vector System. The second-order system (3.93) can also be reformulated as a two-dimensional first-order system by introducing the change of variables x_1 = y and x_2 = y'. Then (3.93) becomes x_2' + a x_2 + b x_1 = f, which together with x_1' = x_2 may be written in the following form:

\begin{bmatrix} x_1' \\ x_2' \end{bmatrix} + \begin{bmatrix} 0 & -1 \\ b & a \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ f \end{bmatrix}. \quad (3.111)

Setting

\mathbf{x} = [x_1 \;\; x_2]^T, \quad \mathbf{A} = \begin{bmatrix} 0 & -1 \\ b & a \end{bmatrix}, \quad \mathbf{f} = [0 \;\; f]^T \quad (3.112)

we achieve the compact matrix form

\frac{d\mathbf{x}}{dt} + \mathbf{A}\mathbf{x} = \mathbf{f}, \quad (3.113)

which constitutes the so-called state space representation of the differential equation (3.93). Let us first consider the homogeneous equation

\frac{d\mathbf{x}}{dt} + \mathbf{A}\mathbf{x} = \mathbf{0}, \quad (3.114)

which is, of course, completely equivalent to (3.94). Using the two linearly independent solutions y_1 and y_2 we construct the matrix

\mathbf{X}(t) = \begin{bmatrix} y_1(t) & y_2(t) \\ y_1'(t) & y_2'(t) \end{bmatrix}, \quad (3.115)

which necessarily satisfies (3.114). Thus

\frac{d\mathbf{X}(t)}{dt} + \mathbf{A}\mathbf{X}(t) = \mathbf{0}. \quad (3.116)


Since X(t) is nonsingular, X(t)X^{−1}(t) = I is satisfied for all t. Hence it can be differentiated to obtain

\frac{d\mathbf{X}}{dt}\mathbf{X}^{-1} + \mathbf{X}\frac{d\mathbf{X}^{-1}}{dt} = \mathbf{0}. \quad (3.117)

After substituting for the derivative from (3.116) and multiplying from the left by X^{−1} we get

\frac{d\mathbf{X}^{-1}}{dt} - \mathbf{X}^{-1}\mathbf{A} = \mathbf{0}. \quad (3.118)

This so-called adjoint equation can be used to solve the inhomogeneous system (3.113). This is done by first multiplying both sides of (3.113) from the left by X^{−1}(t) and (3.118) from the right by x and then adding. As a result we obtain

\mathbf{X}^{-1}\frac{d\mathbf{x}}{dt} + \frac{d\mathbf{X}^{-1}}{dt}\mathbf{x} + \mathbf{X}^{-1}\mathbf{A}\mathbf{x} - \mathbf{X}^{-1}\mathbf{A}\mathbf{x} = \mathbf{X}^{-1}\mathbf{f},

which is obviously equivalent to

\frac{d\left[\mathbf{X}^{-1}(t)\,\mathbf{x}(t)\right]}{dt} = \mathbf{X}^{-1}(t)\,\mathbf{f}(t).

Integrating between a fixed limit t_0 and a variable limit t gives, for t ≥ t_0,

\mathbf{X}^{-1}(t)\,\mathbf{x}(t) - \mathbf{X}^{-1}(t_0)\,\mathbf{x}(t_0) = \int_{t_0}^{t} \mathbf{X}^{-1}(\tau)\,\mathbf{f}(\tau)\, d\tau.

The general solution for x(t) now follows after multiplication from the left by X(t). We write it in the form

\mathbf{x}(t) = \mathbf{H}(t, t_0)\,\mathbf{x}(t_0) + \int_{t_0}^{t} \mathbf{H}(t,\tau)\,\mathbf{f}(\tau)\, d\tau, \quad (3.119)

where

\mathbf{H}(t,\tau) = \mathbf{X}(t)\,\mathbf{X}^{-1}(\tau). \quad (3.120)

Equation (3.119) is equivalent to (3.109). Structurally, however, there is a closer resemblance to (3.76). For example, it is not hard to see that H(t,τ) is the 2×2 matrix counterpart of h(t,τ) and is the impulse response satisfying

\frac{d\mathbf{H}(t,\tau)}{dt} + \mathbf{A}\mathbf{H}(t,\tau) = \mathbf{I}\,\delta(t-\tau). \quad (3.121)

In fact one can start with (3.121) and (3.116) and, imposing causality, obtain (3.120). Also, just as in the scalar first-order system, the zero input response can be related directly to the input prior to t_0, i.e., for t ≥ t_0

\mathbf{H}(t, t_0)\,\mathbf{x}(t_0) = \int_{-\infty}^{t_0} \mathbf{H}(t,\tau)\,\mathbf{f}(\tau)\, d\tau. \quad (3.122)


LTI System

Time Domain Response. For an LTI system the coefficients a and b in (3.93) are constants and the homogeneous equation (3.94) is satisfied by the exponential e^{λt}. Inserting this in (3.94) we get the characteristic equation

\lambda^2 + a\lambda + b = 0. \quad (3.123)

When the two roots λ_1 and λ_2 are distinct, the two linearly independent solutions of (3.94) are y_1 = e^{λ_1 t} and y_2 = e^{λ_2 t}. The corresponding Wronskian is

W(t) = (\lambda_2 - \lambda_1)\, e^{(\lambda_1+\lambda_2)t} \quad (3.124)

and from (3.108) we get for the impulse response

h(t) = \frac{e^{\lambda_2 t} - e^{\lambda_1 t}}{\lambda_2 - \lambda_1}\, U(t), \quad (3.125)

where, as usual for an LTI system, we have set the variable τ to zero. We note that when λ_1 and λ_2 have negative real parts the impulse response will decay to zero at infinity. It will then also be absolutely integrable so that, in accordance with (3.24) in Sect. 3.1, the system will be BIBO stable. Physically this implies net energy dissipation (rather than generation) within the system and requires that a > 0. In a mechanical system comprising a mass and a compliant spring this coefficient generally represents viscous damping, and in an electrical network ohmic resistance (or conductance). For a dissipative system three classes of solutions can be distinguished. When the damping is very strong both roots are real (a² > 4b) and the system is said to be overdamped. As the damping is reduced we reach the condition a² = 4b, which corresponds to the limiting case λ_2 → λ_1 in (3.125) and constitutes the so-called critically damped case. If we reduce the damping further, a² < 4b and both roots become complex. This is the so-called underdamped case, for which the impulse response assumes the form

h(t) = e^{-(a/2)t}\, \frac{\sin\left(\sqrt{b - a^2/4}\; t\right)}{\sqrt{b - a^2/4}}\, U(t). \quad (3.126)

The waveform is a damped sinusoid oscillating at the frequency \sqrt{b - a^2/4}. As a → 0 this frequency approaches \sqrt{b} \equiv \omega_0, which would be, e.g., the frequency of oscillation of an undamped mass/spring or inductor/capacitor combination. Clearly in this limiting case the system is no longer BIBO stable.

Using the formulas in (3.96) for the constants we can get an explicit form for the zero input response. Adding it to the zero state response we get, for t ≥ t_0,

y(t) = \frac{\lambda_2\, y(t_0) - y'(t_0)}{\lambda_2 - \lambda_1}\, e^{\lambda_1(t-t_0)} + \frac{-\lambda_1\, y(t_0) + y'(t_0)}{\lambda_2 - \lambda_1}\, e^{\lambda_2(t-t_0)} + \int_{t_0}^{t} \frac{e^{\lambda_2(t-\tau)} - e^{\lambda_1(t-\tau)}}{\lambda_2 - \lambda_1}\, f(\tau)\, d\tau. \quad (3.127)


We still have to examine the critically damped case λ_1 = λ_2 = λ. We can still choose y_1 = e^{λt} for one of the solutions and substitute in (3.99) to get the second linearly independent solution. Since λ_1 + λ_2 = 2λ, the Wronskian is W(t) = W(t_0)e^{−a(t−t_0)} = W(t_0)e^{2λ(t−t_0)}. We then obtain

y_2(t) = e^{\lambda(t-t_0)}\, y_2(t_0) + e^{\lambda t}\, W(t_0) \int_{t_0}^{t} \frac{e^{2\lambda(\tau - t_0)}}{e^{2\lambda\tau}}\, d\tau = e^{\lambda(t-t_0)}\, y_2(t_0) + (t - t_0)\, W(t_0)\, e^{\lambda t} e^{-2\lambda t_0}.

After setting t_0 = 0 and y_2(0) = 0 we get, upon dropping the constant W(0),

y_2(t) = t\, e^{\lambda t}. \quad (3.128)

With e^{λt} and te^{λt} as the two linearly independent solutions we get for the Wronskian

W(t) = e^{2\lambda t}. \quad (3.129)

Using (3.108) the impulse response is

h(t) = t\, e^{\lambda t}\, U(t), \quad (3.130)

while the complete response with initial conditions specified at t = t_0 becomes

y(t) = \left\{ y(t_0) + (t - t_0)\left[y'(t_0) - \lambda\, y(t_0)\right] \right\} e^{\lambda(t-t_0)} + \int_{t_0}^{t} (t-\tau)\, e^{\lambda(t-\tau)}\, f(\tau)\, d\tau. \quad (3.131)

Transfer Function. Recall that if the system is BIBO stable, then with f(t) = e^{iωt} the output y(t) = H(ω)e^{iωt} is identical with the steady state response when the same exponential is initiated at some finite time. If we know the impulse response, then its FT equals H(ω). However, if we have the governing differential equation we can get the transfer function H(ω) more directly by simply substituting f(t) and y(t) into the DE, cancelling the common exponential, and solving for H(ω). Carrying out this operation on (3.93) (with a and b constant) we get

H(\omega) = \frac{1}{-\omega^2 + i\omega a + b}. \quad (3.132)

Assuming b > 0 and setting α = −a/2, β = \sqrt{b - (a/2)^2}, ω_0 = \sqrt{b} = \sqrt{\beta^2 + \alpha^2}, Q = β/|α|, we write the magnitude of (3.132) in the following normalized form:

\omega_0^2\, |H(\omega)| = \frac{1}{\sqrt{\left[\left(\frac{\omega}{\omega_0}\right)^2 - 1\right]^2 + \frac{4}{1+Q^2}\left(\frac{\omega}{\omega_0}\right)^2}}. \quad (3.133)


[Figure 3.10: Plot of ω_0²|H(ω)| vs. ω/ω_0 for Q = 1, 1.5, 2, 3]

In this parameterization we assume α < 0 and ω_0² ≥ α², which covers the underdamped range up to critical damping at ω_0² = α². The parameter Q is a relative measure of the dissipation in the system (Q = 0 corresponds to critical damping, and the resonance peak disappears for Q ≤ 1) and of the width of the resonant peak, which occurs at ω = ω_res = ω_0\sqrt{(Q^2-1)/(Q^2+1)}. A plot of ω_0²|H(ω)| as a function of ω/ω_0 is shown in Fig. 3.10 for several values of Q.

The relationship among ω_res, ω_0, α, the position of the resonance peak along the frequency axis, and the position of the pole⁴ α ± iβ in the complex plane can be displayed geometrically as shown in Fig. 3.11.

Direct Solution Using the FT. An alternative technique for obtaining the general solution (3.127) or (3.131) of an LTI system is to apply the FT directly to the differential equation, an example of which we have already encountered in 2.2.6. Thus if the input signal f(t) in (3.93) is specified for t ≥ t_0 only and we are given the values y(t_0) and y'(t_0), the imposition of causality on the solution permits us to consider an equivalent problem wherein we set both f(t) and y(t) to zero for t < t_0. We then evaluate

F(\omega) = \int_{t_0}^{\infty} f(t)\, e^{-i\omega t}\, dt, \quad (3.134)

while the implied discontinuity of y(t) at t = t_0 forces us to interpret the derivatives in the differential equations as derivatives of "smooth" functions, as in (2.158) and (2.159) in Sect. 2.2.

⁴Here we adhere to the convention and define the pole in terms of the complex variable s = iω, so that the roots of the denominator of the transfer function are solutions of s² − 2αs + ω_0² = 0.


[Figure 3.11: Relationship between |H(ω)| and the pole positions α ± iβ]

y'(t) \overset{F}{\Longleftrightarrow} i\omega Y(\omega) - e^{-i\omega t_0}\, y(t_0), \quad (3.135a)

y''(t) \overset{F}{\Longleftrightarrow} i\omega\left[i\omega Y(\omega) - e^{-i\omega t_0}\, y(t_0)\right] - e^{-i\omega t_0}\, y'(t_0) = -\omega^2 Y(\omega) - i\omega e^{-i\omega t_0}\, y(t_0) - e^{-i\omega t_0}\, y'(t_0). \quad (3.135b)

Substituting (3.135a) and (3.135b) in (3.93), rearranging terms, and solving for Y(ω) we get

Y(\omega) = \frac{(a + i\omega)\, y(t_0) + y'(t_0)}{-\omega^2 + i\omega a + b}\, e^{-i\omega t_0} + \frac{F(\omega)}{-\omega^2 + i\omega a + b}. \quad (3.136)

Now −ω² + iωa + b = (iω − λ_1)(iω − λ_2) and, assuming for definiteness that these roots are distinct, the inverse FT of the first term on the right of (3.136) is easily shown (e.g., using a residue evaluation as in 3.41) to be identical with the zero input response in (3.127). Also, in virtue of (3.132) and the convolution theorem for the FT, the second term on the right of (3.136) is just the zero state response in (3.127).


[Figure 3.12: Feedforward representation of the right side of (3.137)]

Input Transformed by a Differential Operator. Consider now an LTI system represented by (3.93) wherein the right side is replaced by a differential operator on the input. For example,

\frac{d^2 y}{dt^2} + a\frac{dy}{dt} + b\, y = c\, f + g\, \frac{df}{dt}. \quad (3.137)

Clearly this modification of the input leaves the zero input response unaffected. We can find the impulse response by setting f = δ(t) and convolving (3.125) or (3.130), which we presently denote by h_0(t), with the input. We obtain

h(t) = \int_{-\infty}^{t} h_0(t-\tau)\left[c\,\delta(\tau) + g\,\delta^{(1)}(\tau)\right] d\tau = c\, h_0(t) + g\, h_0'(t). \quad (3.138)

In virtue of (3.102) and (3.106) this impulse response is discontinuous at t = 0, viz., h(0^+) = g. Alternatively, we can determine the impulse response by first finding the response of the system to e^{iωt}; we obtain the transfer function

H(\omega) = \frac{c + i\omega g}{-\omega^2 + i\omega a + b} \quad (3.139)

and then compute h(t) with the aid of the FT inversion formula. Note that in addition to the two poles at ω = −iλ_{1,2} this transfer function has a zero at ω = ic/g. This zero is a direct consequence of the differentiation operation on the input. This differentiation operation can also be modeled by adding to the feedback loop in Fig. 3.9, which represents the left side of (3.137), the feedforward loop shown in Fig. 3.12.

Here the coefficients c and g represent gain settings. Note that it is this loop that is responsible for the creation of the zero in the numerator of the transfer function. One realization of this system as an electrical network is shown in Fig. 3.13, where we identify the parameters in (3.137) as a = (R_1 + R_2)/L, b = 1/LC, c = 1/LC, and g = R_1/L. When one of the factors in the denominator matches the zero in the numerator we obtain a second realization, discussed in the following example.


[Figure 3.13: Electrical network realization of (3.137)]

Example. For the system

\frac{d^2 y}{dt^2} + 3\frac{dy}{dt} + 2y = 2f + 2\frac{df}{dt} \quad (3.140)

let us find the output when f = e^{−α|t|}, with α > 0. Since the input is specified for −∞ < t < ∞, initial conditions at the output are redundant. (In this case the zero input response is in fact zero!) One way of determining the output is to find the impulse response and convolve it with the input. From the characteristic equation λ² + 3λ + 2 = 0 we find λ_1 = −1, λ_2 = −2. Instead of substituting in the general formulas we have derived, let us determine the impulse response directly for the parameters of (3.140). First, with the right side of (3.140) set to δ(t), the response is h_0(t) = (Ae^{−t} + Be^{−2t})U(t). Setting h_0(0) = 0 and h_0'(0^+) = 1 we get A + B = 0 and −A − 2B = 1, so that A = 1 and B = −1. The system impulse response is then (see (3.138))

h(t) = 2\frac{dh_0(t)}{dt} + 2h_0(t) = 2e^{-2t}\, U(t). \quad (3.141)

Alternatively we can compute the impulse response from the transfer function. Because for the parameters at hand the zero at ω = i in the numerator coincides with the pole in the denominator, they mutually cancel and we get

H(\omega) = \frac{2}{i\omega + 2}. \quad (3.142)

Thus in this simple case we see that the inverse FT of (3.142) gives (3.141). The system response to the specified input is

y(t) = \int_{-\infty}^{t} 2e^{-2(t-\tau)}\, e^{-\alpha|\tau|}\, d\tau = \begin{cases} \dfrac{2e^{\alpha t}}{2+\alpha}; & t \le 0, \\[4pt] \dfrac{2e^{-2t}}{2+\alpha} + 2\,\dfrac{e^{-\alpha t} - e^{-2t}}{2-\alpha}; & t \ge 0. \end{cases} \quad (3.143)

Another way of getting this result is to multiply the transfer function by the FT of the input and invert the FT. Thus

y(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{4\alpha\, e^{i\omega t}}{(i\omega + 2)(\omega^2 + \alpha^2)}\, d\omega. \quad (3.144)

We leave it as an exercise to show that (3.144) yields (3.143).
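The result can at least be spot-checked numerically. The sketch below compares (3.143) with a direct quadrature of the convolution integral (α = 0.5 is an arbitrary choice):

```python
# Sketch: checking (3.143) against numerical convolution of the impulse
# response (3.141), h(t) = 2 e^{-2t} U(t), with the input e^{-α|t|}.
import numpy as np
from scipy.integrate import quad

alpha = 0.5
f = lambda t: np.exp(-alpha * abs(t))

def y_quad(t):   # ∫_{-∞}^{t} 2 e^{-2(t-τ)} f(τ) dτ, lower limit truncated
    val, _ = quad(lambda tau: 2*np.exp(-2*(t - tau))*f(tau), -50.0, t)
    return val

def y_closed(t):  # (3.143)
    if t <= 0:
        return 2*np.exp(alpha*t)/(2 + alpha)
    return (2*np.exp(-2*t)/(2 + alpha)
            + 2*(np.exp(-alpha*t) - np.exp(-2*t))/(2 - alpha))

for t in (-1.0, 0.0, 1.0, 3.0):
    print(t, y_quad(t), y_closed(t))   # the two columns should agree
```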


[Figure 3.14: Two electrical networks with identical transfer functions]

Note that (3.142) is the transfer function of a first-order system, whereas the differential equation (3.140) describes a second-order system. In fact this transfer function holds for both electrical circuits in Fig. 3.14.

Apparently if only the input and output are known one cannot distinguish between the two networks. As mentioned above, this is a direct result of the cancellation of factors in the numerator and the denominator. For differential equations of higher order the number of possible cancellations increases.

This indeterminacy applies strictly to measurements of the transfer function alone, which in the present example is a voltage ratio. If additional or alternative measurements are permitted (e.g., the input impedance), then in most cases alternative realizations of the same transfer function can be distinguished.

Transfer functions can be defined in various ways, for example, as an input voltage to current ratio, i.e., the input impedance. A famous problem in this class leading to indeterminacy is the so-called Slepian's Black Box Problem [22], for which the circuits are shown in Fig. 3.15.

The problem is to determine from the input impedance whether the "black box" contains the resistor R in (a) or the series arrangement of the parallel RL and RC circuits shown in (b). A simple calculation shows that when R = √(L/C) the input impedance looking into the terminals equals R independent of which circuit is in the "box." Unlike indeterminacies caused by the cancellation of common factors, which generally can be resolved by the addition of external circuit elements at the input or the output ports, here no such solutions are possible.⁵

⁵The problem has a long history. One of the early resolutions was based on the thermal noise generated by the resistor. See, e.g., [11].
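The "simple calculation" is easily reproduced symbolically. A minimal sketch, assuming circuit (b) is the series connection of a parallel RL section and a parallel RC section:

```python
# Sketch: symbolic check that circuit (b) of Fig. 3.15 has input impedance
# R when R = sqrt(L/C), so the two boxes cannot be told apart.
from sympy import symbols, sqrt, simplify

s, L, C = symbols('s L C', positive=True)
R = sqrt(L / C)
Z_RL = R*s*L / (R + s*L)              # parallel RL section
Z_RC = R*(1/(s*C)) / (R + 1/(s*C))    # parallel RC section
print(simplify(Z_RL + Z_RC - R))      # -> 0, i.e., Z_in = R for every s
```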


[Figure 3.15: The Slepian black box problem]

3.3.3 N-th Order Differential Equations

Time-Varying Systems

Standard Form. We now consider a linear system described by an N-th order differential equation. Represented as a general form of (3.93) that also includes the differentiation of the input, as in (3.137), it reads

\sum_{k=0}^{N} a_{N-k}(t)\, \frac{d^k}{dt^k}\, y(t) = \sum_{k=0}^{M} b_{M-k}(t)\, \frac{d^k}{dt^k}\, f(t), \quad (3.145)

where M < N. Here again f(t) is the input and y(t) the output. As in the two-dimensional case we will treat the differential operator on the right side of (3.145) as an equivalent input. If we wish to exclude singularity functions, the restrictions on f(t) are more stringent than in two dimensions since it must now possess derivatives up to order M. When N = 0 only this equivalent input remains. In this case it is more reasonable to regard the system as the differential operator

y(t) = \frac{1}{a_N(t)} \sum_{k=0}^{M} b_{M-k}(t)\, \frac{d^k f(t)}{dt^k}. \quad (3.146)

Its impulse response is a sum of singularity functions:

h(t,\tau) = \frac{1}{a_N(t)} \sum_{k=0}^{M} b_{M-k}(t)\, \delta^{(k)}(t-\tau). \quad (3.147)


Retaining the first term (k = 0) we have

\sum_{k=0}^{N} a_{N-k}(t)\, \frac{d^k y(t)}{dt^k} = b_M(t)\, f(t), \quad (3.148)

for which the impulse response is obtained by solving

\sum_{k=0}^{N} a_{N-k}(t)\, \frac{d^k h_0(t,\tau)}{dt^k} = b_M(t)\, \delta(t-\tau) \quad (3.149)

for h_0(t,τ), subject to the initial condition (causality)

h_0(t,\tau) = 0; \quad t < \tau. \quad (3.150)

It will be notationally more convenient to deal with (3.149) if the coefficient of the highest derivative is normalized to unity or, equivalently, both sides of (3.149) are divided by a_0(t). The restated problem is then

\frac{d^N h_0(t,\tau)}{dt^N} + \sum_{k=0}^{N-1} \alpha_{N-k}(t)\, \frac{d^k h_0(t,\tau)}{dt^k} = \beta_M(t)\, \delta(t-\tau), \quad (3.151)

where we have defined

\alpha_{N-k}(t) = \frac{a_{N-k}(t)}{a_0(t)}, \qquad \beta_M(t) = \frac{b_M(t)}{a_0(t)}. \quad (3.152)

We shall assume that N > 1 and that the α_{N−k}(t) and β_M(t) are continuous functions for all t. As a consequence, just as we have seen in the case of N = 2, (3.151) requires that at t = τ the N-th derivative contain a delta function, the (N−1)-st derivative a step discontinuity, and all lower order derivatives, including h_0(t,τ), be continuous. We now construct h_0(t,τ) from the N linearly independent solutions y_n(t), n = 1, 2, ..., N, of the homogeneous equation

\frac{d^N y_n(t)}{dt^N} + \sum_{k=0}^{N-1} \alpha_{N-k}(t)\, \frac{d^k y_n(t)}{dt^k} = 0; \quad n = 1, 2, \ldots, N. \quad (3.153)

Clearly for t > τ, h_0(t,τ) must be a linear superposition of the solutions of (3.153). Since by continuity of h_0(t,τ) the same superposition must also hold at t = τ, we may write

h_0(t,\tau) = \begin{cases} \sum_{n=1}^{N} A_n(\tau)\, y_n(t); & t \ge \tau, \\ 0; & t < \tau, \end{cases} \quad (3.154)

where we anticipate that the expansion coefficients A_n(τ) must be functions of τ. Employing the continuity of h_0(t,τ) and its derivatives of order lower than N − 1 at t = τ in (3.154) provides us with N − 1 equations for the coefficients A_n(τ). Thus

\sum_{n=1}^{N} A_n(\tau)\, y_n^{(m)}(\tau) = 0; \quad m = 0, 1, \ldots, N-2, \quad (3.155)

where y_n^{(m)}(\tau) \equiv d^m y_n(\tau)/d\tau^m and y_n^{(0)}(\tau) \equiv y_n(\tau). To solve for the A_n(τ) we need one more equation, which we obtain by integrating (3.151) between the limits t = τ − ε ≡ τ_− and t = τ + ε ≡ τ_+:

\left.\frac{d^{N-1} h_0(t,\tau)}{dt^{N-1}}\right|_{t=\tau_+} - \left.\frac{d^{N-1} h_0(t,\tau)}{dt^{N-1}}\right|_{t=\tau_-} + \sum_{k=0}^{N-1} \int_{\tau_-}^{\tau_+} \alpha_{N-k}(t)\, \frac{d^k h_0(t,\tau)}{dt^k}\, dt = \beta_M(\tau). \quad (3.156)

The causality condition (3.150) requires that d^{N-1} h_0(t,\tau)/dt^{N-1}\big|_{t=\tau_-} = 0, while by virtue of the continuity of the members in the integrands the contribution of the entire sum vanishes as ε → 0. Hence (3.156) is equivalent to

\left.\frac{d^{N-1} h_0(t,\tau)}{dt^{N-1}}\right|_{t=\tau_+} = \beta_M(\tau), \quad (3.157)

and using (3.154) we obtain

\sum_{n=1}^{N} A_n(\tau)\, y_n^{(N-1)}(\tau) = \beta_M(\tau). \quad (3.158)

Combining (3.155) and (3.158) into a single matrix equation gives

\begin{bmatrix} y_1(\tau) & y_2(\tau) & \ldots & y_N(\tau) \\ y_1^{(1)}(\tau) & y_2^{(1)}(\tau) & \ldots & y_N^{(1)}(\tau) \\ y_1^{(2)}(\tau) & y_2^{(2)}(\tau) & \ldots & y_N^{(2)}(\tau) \\ \vdots & \vdots & & \vdots \\ y_1^{(N-1)}(\tau) & y_2^{(N-1)}(\tau) & \ldots & y_N^{(N-1)}(\tau) \end{bmatrix} \begin{bmatrix} A_1(\tau) \\ A_2(\tau) \\ A_3(\tau) \\ \vdots \\ A_N(\tau) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ \beta_M(\tau) \end{bmatrix}. \quad (3.159)

We recognize the determinant of this matrix as the Wronskian of the linearly independent solutions of the DE, so that (3.159) always has a unique solution which, together with (3.154), yields the impulse response of (3.149).
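For constant coefficients the system (3.159) can be assembled and solved directly. A minimal numerical sketch for N = 3 with y_n(t) = e^{λ_n t}, β_M ≡ 1, and illustrative roots:

```python
# Sketch: solving the Wronskian system (3.159) for the A_n(τ) in the
# constant-coefficient case y_n(t) = exp(λ_n t), N = 3, β_M ≡ 1.
import numpy as np

lam = np.array([-1.0, -2.0, -4.0])   # illustrative distinct roots

def h0(t, tau):
    # rows m = 0..N-1 of (3.159): Σ_n A_n λ_n^m e^{λ_n τ} = (0, 0, 1)
    V = np.vander(lam, increasing=True).T * np.exp(lam * tau)
    A = np.linalg.solve(V, np.array([0.0, 0.0, 1.0]))
    return np.where(t >= tau, A @ np.exp(np.outer(lam, t)), 0.0)

print(h0(np.array([0.5, 1.0, 2.0]), 0.0))
```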

We now return to the general case (3.145). Again it will be convenient to normalize the coefficients b_{M−k}(t) as in (3.152):

\beta_{M-k}(t) = \frac{b_{M-k}(t)}{a_0(t)}; \quad k = 0, 1, 2, \ldots, M, \quad (3.160)


so that (3.145) becomes

\sum_{k=0}^{N} \alpha_{N-k}(t)\, \frac{d^k y(t)}{dt^k} = \sum_{k=0}^{M} \beta_{M-k}(t)\, \frac{d^k f(t)}{dt^k} = \beta_M(t) \left\{ \sum_{k=0}^{M} \frac{\beta_{M-k}(t)}{\beta_M(t)}\, \frac{d^k f(t)}{dt^k} \right\}. \quad (3.161)

Since the impulse response h_0(t,τ) is the causal solution of (3.161) when the sum in braces equals δ(t−τ), we may use superposition and solve for y(t) as follows:

y(t) = \int_{-\infty}^{t} d\tau'\, h_0(t,\tau') \sum_{k=0}^{M} \frac{\beta_{M-k}(\tau')}{\beta_M(\tau')}\, \frac{d^k f(\tau')}{d\tau'^k}. \quad (3.162)

The impulse response of the system, as viewed from the actual port (i.e., f(t)) rather than the equivalent port, can be obtained by replacing f(τ') in the preceding expression by δ(τ' − τ) and using the properties of the delta function and its derivatives. Thus we obtain

h(t,\tau) = h_0(t,\tau) + \sum_{k=1}^{M} (-1)^k\, \frac{d^k}{d\tau^k} \left\{ h_0(t,\tau)\, \frac{\beta_{M-k}(\tau)}{\beta_M(\tau)} \right\} \quad (3.163a)

and we can rewrite (3.162) in the standard form

y(t) = \int_{-\infty}^{t} h(t,\tau)\, f(\tau)\, d\tau. \quad (3.163b)

Note that (3.163a) assumes that β_{M−k}(τ)/β_M(τ) possesses derivatives up to at least order M. In that case h(t,τ) will be continuous at t = τ provided M < N − 1 and will contain singularity functions whenever M ≥ N.

Note also that in the preceding development the initial conditions were excluded (or, equivalently, were specified at −∞). Rather than exhibiting them explicitly as in (3.109), it is notationally simpler to include them in the state variable formulation discussed next.

Equivalent N-Dimensional First-Order Vector Form. The change of variables used to transform the second-order differential equation into the two-dimensional first-order system (3.111) generalizes in the N-dimensional case to

\mathbf{x} = \left[y \;\; y^{(1)} \;\; y^{(2)} \;\; \ldots \;\; y^{(N-2)} \;\; y^{(N-1)}\right]^T = \left[x_1 \;\; x_2 \;\; x_3 \;\; \ldots \;\; x_{N-1} \;\; x_N\right]^T, \quad (3.164)

where y^{(k)} = d^k y/dt^k. Next we set the right side of (3.145) to f_M and rewrite it in the compact form

\sum_{k=0}^{N} a_{N-k}\, y^{(k)} = f_M. \quad (3.165)


Using the correspondence between the derivatives y^{(k)} and the vector components x_k in (3.164) we establish the chain

x_2 = x_1^{(1)}, \quad x_3 = x_2^{(1)}, \quad x_4 = x_3^{(1)}, \quad \ldots, \quad x_N = x_{N-1}^{(1)},

from which we construct the N-dimensional state space representation

\begin{bmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \\ \vdots \\ x_{N-1}^{(1)} \\ x_N^{(1)} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & \ldots & 0 & 0 \\ 0 & 0 & 1 & \ldots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \ldots & 1 & 0 \\ 0 & 0 & 0 & \ldots & 0 & 1 \\ -a_N & -a_{N-1} & -a_{N-2} & \ldots & -a_2 & -a_1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_{N-1} \\ x_N \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ f_M \end{bmatrix}, \quad (3.166)

where we have set a_0 = 1. In block matrix notation (3.166) takes on the form

\mathbf{x}^{(1)} = \mathbf{A}\mathbf{x} + \mathbf{f}_M. \quad (3.167)

The procedure for solving (3.167) for the state vector x = x(t) is exactly the same as used in the two-dimensional case, (3.113) through (3.119), since it is independent of the dimension of X(t). Thus if we use

\mathbf{X}(t) = \left[\mathbf{x}_1 \;\; \mathbf{x}_2 \;\; \mathbf{x}_3 \;\; \ldots \;\; \mathbf{x}_N\right], \quad (3.168)

the solution of (3.167) is again given by (3.119). Note that (3.168) is identical with the matrix in (3.159). Its columns are the N linearly independent solutions of

\mathbf{x}_k^{(1)} = \mathbf{A}\mathbf{x}_k, \quad k = 1, 2, \ldots, N. \quad (3.169)
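For constant coefficients the matrix A in (3.166) is the familiar companion matrix. A short sketch (with illustrative coefficients) verifies that its eigenvalues coincide with the roots of the characteristic polynomial:

```python
# Sketch: the companion matrix of (3.166) for constant coefficients, a0 = 1.
import numpy as np

a = [6.0, 11.0, 6.0]    # a1, a2, a3 of λ³ + a1λ² + a2λ + a3 (illustrative)
N = len(a)
A = np.zeros((N, N))
A[:-1, 1:] = np.eye(N - 1)          # superdiagonal of ones
A[-1, :] = -np.array(a[::-1])       # bottom row: -a_N ... -a_1
print(np.linalg.eigvals(A))         # -> -1, -2, -3
print(np.roots([1.0] + a))          # same roots, from the polynomial
```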

Feedback Representation. The feedback diagram in Fig. 3.9, when generalized to the N-dimensional case, is shown in Fig. 3.16. That this arrangement is consistent with (3.145) when M = 0 can be established by noting that, starting with the highest derivative y^{(N)}(t), successive integrations bring us down to the output y(t). (Recall that y^{(0)}(t) ≡ y(t).) On the other hand, the weighted sum of the derivatives

-\sum_{k=1}^{N} a_k\, y^{(N-k)}(t),

when added to f(t), must match y^{(N)}(t). As a result we obtain (3.145). When M ≠ 0 the representation in Fig. 3.16 should be extended to include the dependence on the derivatives of f(t), as required by (3.162). In the following we discuss this extension for the time-invariant case.


[Figure 3.16: Feedback representation of a differential equation with time-varying coefficients]

Time-Invariance

In this case the coefficients a_{N−k} and b_{M−k} in (3.145) are constant and we can solve it using Fourier Transforms. With reference to (3.135), for initial conditions at t = 0^+ the Fourier Transform of the k-th derivative of y(t) is

F\left\{ y^{(k)}(t) \right\} = (i\omega)^k\, Y(\omega) - \sum_{\ell=1}^{k} (i\omega)^{k-\ell}\, y^{(\ell-1)}(0^+). \quad (3.170)

After substituting into (3.145) and rearranging terms we get

\sum_{k=0}^{N} a_{N-k}\, (i\omega)^k\, Y(\omega) = \sum_{k=0}^{N} a_{N-k} \sum_{\ell=1}^{k} (i\omega)^{k-\ell}\, y^{(\ell-1)}(0^+) + F(\omega) \sum_{k=0}^{M} b_{M-k}\, (i\omega)^k, \quad (3.171)

where the double summation on the right represents the total contribution from the initial conditions at t = 0 and the second term the contribution from the equivalent source. Here, as in the two-dimensional case, we can identify in (3.171) the Fourier Transform of the zero state response

Y_{0s}(\omega) = \frac{\sum_{k=0}^{M} b_{M-k}\, (i\omega)^k}{\sum_{k=0}^{N} a_{N-k}\, (i\omega)^k}\, F(\omega) \quad (3.172)

as well as the zero input response

Y_{0i}(\omega) = \frac{\sum_{k=0}^{N} a_{N-k} \sum_{\ell=1}^{k} (i\omega)^{k-\ell}\, y^{(\ell-1)}(0^+)}{\sum_{k=0}^{N} a_{N-k}\, (i\omega)^k}. \quad (3.173)

As we know, the former coincides with the time harmonic response, so that we may designate the ratio

H_{N,M}(\omega) = \frac{\sum_{k=0}^{M} b_{M-k}\, (i\omega)^k}{\sum_{k=0}^{N} a_{N-k}\, (i\omega)^k} \quad (3.174)


[Figure 3.17: Feedback/feedforward representation of a transfer function]

as the system transfer function. It can be generated by adding a feedforward loop to the feedback loop in Fig. 3.16, resulting in Fig. 3.17. To trace the signal from input to output we start, as in Fig. 3.16, on the right side of the lower part of the diagram and sum the weighted derivatives of q(t) ≡ q^{(0)} up to order q^{(N−1)}. Subtracting this from f(t) yields q^{(N)}. As a result we get

f(t) = \sum_{k=0}^{N} a_{N-k}\, q^{(k)}(t), \quad (3.175)

which is a differential equation for q(t). This corresponds to the feedback portion of the loop in Fig. 3.17, with y(t) of Fig. 3.16 replaced by q(t). Here q(t) is fed "forward" and y(t) is derived from the weighted sum

y(t) = \sum_{k=0}^{M} b_{M-k}\, q^{(k)}(t). \quad (3.176)

Evidently the solution for y(t) in terms of f(t) requires the elimination of q(t) between (3.175) and (3.176). As long as the coefficients a_{N−k} and b_{M−k} are constant this is easily done with Fourier Transforms. Thus setting F{q(t)} = Q(ω) we get from (3.176)

Y(\omega) = Q(\omega) \sum_{k=0}^{M} b_{M-k}\, (i\omega)^k \quad (3.177)

and from (3.175)

F(\omega) = Q(\omega) \sum_{k=0}^{N} a_{N-k}\, (i\omega)^k. \quad (3.178)

The ratio Y(ω)/F(ω) is then the transfer function (3.174).
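Numerically, (3.174) is just a ratio of two polynomials in iω. A minimal sketch, using the coefficients of the earlier example (3.140), for which the ratio collapses to 2/(iω + 2):

```python
# Sketch: evaluating H_{N,M}(ω) of (3.174) with numpy.polyval
# (coefficients listed highest degree first, i.e., a0 ... aN and b0 ... bM).
import numpy as np

a = [1.0, 3.0, 2.0]   # denominator: (iω)² + 3(iω) + 2   (N = 2)
b = [2.0, 2.0]        # numerator:   2(iω) + 2           (M = 1)

def H(w):
    iw = 1j * w
    return np.polyval(b, iw) / np.polyval(a, iw)

print(H(0.0), H(1.0))   # for these coefficients H(ω) = 2/(iω + 2)
```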


When the coefficients in (3.175) and (3.176) depend on time, a solution for y(t) in terms of f(t) may still exist, but it will in general not satisfy (3.145) unless the feedforward coefficients b_{M−k} are constant.

Problems

1. The input e^{iωt} to a linear system results in the output \frac{e^{i4\omega t}}{1 + 2i\omega} for all real ω and t.

(a) Is the system time-invariant? Justify your answer.

(b) Find the impulse response.

(c) Is the system causal?

(d) Is the system stable?

(e) Find the output when the input is

f(t) = \begin{cases} 1; & -1/2 \le t \le 1/2, \\ 0; & \text{otherwise.} \end{cases}

2. The input e^{iωt} to a linear system results in the output \frac{e^{i\omega(1-2v/c)t}}{a^2 + (vt)^2} for all real ω and t, where v, c, and a are constants such that 0 < 2v/c < 1.

(a) Find the impulse response.

(b) Is the system causal? Justify your answer.

(c) Find the output when the input is the pulse

p_T(t) = \begin{cases} 1; & -T \le t \le T, \\ 0; & \text{otherwise.} \end{cases}

3. A linear system is defined by the differential equation

\frac{dy(t)}{dt} + \left(\alpha + 3\beta t^2\right) y(t) = x(t),

where α and β are real nonnegative constants and where x(t) is the input and y(t) the output. Find the system impulse response. With x(t) = \cos(2t)\, e^{-\beta t^3} and initial condition y(−3) = 2, find the output for t ≥ −3. In your solution identify the zero state, the zero input, the transient, and the steady state responses.


4. A linear system is defined by the differential equation

\frac{d^2 y}{dt^2} + 2\frac{dy}{dt} + y = x + \frac{dx}{dt},

with x(t) the input and y(t) the output. Find

(a) The output when the input is 2 cos(6t).

(b) The impulse response.

(c) The output for t ≥ 0 with initial conditions y(0) = 1, y'(0) = −1 when the input is 2 cos(6t)U(t). Identify the zero state, the zero input, and the steady state responses.

5. A linear system is defined by the differential equation

\frac{d^2 y(t)}{dt^2} + 5\frac{dy(t)}{dt} + 6y(t) = 3x(t) + \frac{dx(t)}{dt},

where x(t) is the input and y(t) is the output.

(a) Find the system impulse response.

(b) Find the output when the input is 4 cos(3t).

(c) Find the output when the input is 2 sin(2t)U(t + 2).

(d) Find the output when the input is e^{−8|t|}.

6. The input e^{iωt}U(t) to a linear system results in the output

-\frac{e^{-t^3}}{t^2 + i\omega} + \frac{e^{i\omega t}}{t^2 + i\omega}

for all real ω and t.

(a) Find the system impulse response.

(b) Find the output when the input is

f(t) = \begin{cases} 1; & 0 \le t \le 1, \\ 0; & \text{otherwise.} \end{cases}

7. Obtain the inverse Fourier Transform of Y(ω) in (3.171) with initial conditions at t = t_0^+.


Chapter 4

Laplace Transforms

4.1 Single-Sided Laplace Transform

4.1.1 Analytic Properties

The analysis of LTI system performance sometimes calls for signal models for which the FT does not exist. For example, the system defined by the DE

\frac{d^2 y}{dt^2} - 4\frac{dy}{dt} + 4y = f(t) \quad (4.1)

has a causal impulse response te^{2t}U(t). This function has no FT, and since it is not absolutely integrable the system is not BIBO stable. Of course, an unstable system of this sort would generally not be the desired end product of a design effort but merely a pathological state of affairs requiring correction. On the other hand, to identify and apply design procedures that ensure stable system performance one must be able to quantify the unstable behavior of a system. This requires that the class of admissible signals include signals that can be generated by unstable systems. These include functions that can become unbounded at infinity, of the generic form t^n e^{qt} (q > 0), for which the FT does not exist. For such signals a convergent integral transform can be constructed by initially multiplying the signal by the exponential convergence factor e^{−σt} with σ > 0 and then computing the FT. Thus

\int_{-\infty}^{\infty} e^{-\sigma t}\, f(t)\, e^{-i\omega t}\, dt. \quad (4.2)

Clearly we can reduce the growth of the integrand as t → ∞ by choosing a sufficiently large positive σ. Unfortunately this same factor will contribute to an exponential growth of the integrand for t < 0, turning the convergence factor effectively into a "divergence" factor. However, if we confine ourselves to causal signals there is an easy remedy.


[Figure 4.1: Growth of a function of exponential order]

We simply truncate the integral to nonnegative values of t and define an integral transform by

F(s) = \int_0^{\infty} e^{-st}\, f(t)\, dt, \quad (4.3)

where we have set s = σ + iω. Equation (4.3) defines the so-called one-sided (or unilateral) Laplace transform (LT). As in the case of the FT we will at times find it convenient to use an abbreviation for the defining integral. Thus

L\{f(t)\} = F(s). \quad (4.4)

Having defined the transform, our next task is to identify the class of functions for which (4.3) converges and delineate the resulting properties of F(s). Even though we allow f(t) to grow at infinity, we cannot expect convergence for all possible functions that tend to infinity with t. In fact the use of the exponential convergence factor implies that our signals may not grow any faster than an exponential function. Formally we say that f(t) is required to be of exponential order. Expressed in mathematical terminology, this means that there exist real constants t_0, α, and M such that

|f(t)| < M e^{\alpha t}; \quad t > t_0. \quad (4.5)

Note that this definition does not imply that f(t) may not exceed Me^{αt} for some values of t. It merely means that it may not exceed this exponential for t > t_0. This is illustrated graphically in Fig. 4.1.

The following notation is commonly employed to designate a function of exponential order:

f(t) \underset{t\to\infty}{\sim} O\!\left(e^{\alpha t}\right). \quad (4.6)


Consistent with our previous assumption on the class of admissible signals, we shall suppose that f(t) is sectionally smooth. However, we shall relax our previous constraint on boundedness and replace it by the requirement that

g(t) \equiv \int_0^{t} f(\tau)\, d\tau \quad (4.7)

be finite for all finite t ≥ 0. For example, within this framework we include functions of the form t^{−q} with 0 < q < 1, which tend to infinity at t = 0. As we shall see, signals with singularities of this type arise in the study of the transient behavior of distributed systems (e.g., transmission lines).

We shall prove the following. The LT of a sectionally smooth function satisfying (4.5) and (4.7) has the properties

F(s) \text{ is an analytic function of } s \text{ for } \operatorname{Re} s > \alpha \quad (4.8a)

and

\lim_{|s|\to\infty} F(s) = 0 \text{ for } \operatorname{Re} s > \alpha. \quad (4.8b)

We first show that if f(t) is of exponential order then g(t) is exponentially bounded for all nonnegative t. We start with (4.7), which implies that for 0 ≤ t ≤ t_0

|g(t)| = \left|\int_0^{t} f(\tau)\, d\tau\right| \le Q(t_0). \quad (4.9)

For t > t_0 we write the bound as a sum of two members as follows:

|g(t)| = \left|\int_0^{t_0} f(\tau)\, d\tau + \int_{t_0}^{t} f(\tau)\, d\tau\right| \le Q(t_0) + \left|\int_{t_0}^{t} f(\tau)\, d\tau\right|. \quad (4.10)

In view of (4.5) we have \left|\int_{t_0}^{t} f(\tau)\, d\tau\right| < M(e^{\alpha t} - e^{\alpha t_0})/\alpha, so that the bound in (4.10) becomes

|g(t)| \le e^{\alpha t}\left[Q(t_0)\, e^{-\alpha t} + \frac{M}{\alpha}\left(1 - e^{-\alpha(t-t_0)}\right)\right] \le e^{\alpha t}\left[Q(t_0)\, e^{-\alpha t_0} + \frac{M}{\alpha}\right] = e^{\alpha t} P(t_0), \quad (4.11)

with P(t_0) = Q(t_0)e^{−αt_0} + M/α. If we now define N = max{Q(t_0), P(t_0)}, then, since for 0 ≤ t ≤ t_0 the left side of (4.9) is also bounded by Q(t_0)e^{αt}, we obtain in combination with (4.11)

|g(t)| \le N e^{\alpha t}; \quad 0 \le t < \infty. \quad (4.12)

Note that, unlike f(t) (see Fig. 4.1), g(t) is bounded by the exponential for all nonnegative t. Next we integrate (4.3) by parts to obtain

F(s) = e^{-st}\, g(t)\big|_0^{\infty} + s\int_0^{\infty} e^{-st}\, g(t)\, dt. \quad (4.13)


With s = σ + iω and choosing σ > α, the first term on the right of (4.13) vanishes at the upper limit. It also vanishes at the lower limit because g(0) = 0. Therefore, setting

G(s) = \int_0^{\infty} e^{-st}\, g(t)\, dt, \quad (4.14)

we obtain the relationship

F(s) = s\, G(s); \quad \operatorname{Re} s > \alpha. \quad (4.15)

We now utilize (4.12) to bound (4.14) as follows:

|G(s)| \le \int_0^{\infty} \left|e^{-st}\, g(t)\right| dt \le N \int_0^{\infty} e^{-(\sigma-\alpha)t}\, dt = \frac{N}{\sigma - \alpha}; \quad \sigma > \alpha. \quad (4.16)

The preceding states that G(s) exists and is bounded for Re s > α. We can similarly bound the derivative of G(s). Thus

|G'(s)| \le \int_0^{\infty} \left|-t\, e^{-st}\, g(t)\right| dt \le N \int_0^{\infty} t\, e^{-(\sigma-\alpha)t}\, dt = \frac{N}{(\sigma-\alpha)^2}; \quad \sigma > \alpha.

In fact we get for the n-th derivative

\left|G^{(n)}(s)\right| \le \int_0^{\infty} \left|(-t)^n e^{-st}\, g(t)\right| dt \le N \int_0^{\infty} t^n e^{-(\sigma-\alpha)t}\, dt = \frac{N\, n!}{(\sigma-\alpha)^{n+1}}; \quad \sigma > \alpha.

This shows that not only G(s) but also all its derivatives exist and are bounded for Re s > α. Consequently G(s) is an analytic function for Re s > α. Therefore, in view of (4.15) and (4.16), F(s) is also analytic for Re s > α with the possible exception of a simple pole at infinity. This shows that (4.8a) holds in the finite part of the complex plane. To prove that there is no pole at infinity, i.e., that in fact (4.8b) holds, we use the fact that the existence of the integral in (4.7) implies that given an ε > 0 there exists a δ such that |g(t)| < ε for all 0 ≤ t ≤ δ. We now write the transform of g(t) as the sum of two parts,

G(s) = \int_0^{\delta} e^{-st}\, g(t)\, dt + \int_{\delta}^{\infty} e^{-st}\, g(t)\, dt, \quad (4.17)

and bound each integral separately. For the second integral we use the exponential bound on g(t) in (4.12), so that

\left|\int_{\delta}^{\infty} e^{-st}\, g(t)\, dt\right| \le N \int_{\delta}^{\infty} e^{-(\sigma-\alpha)t}\, dt = \frac{N\, e^{-(\sigma-\alpha)\delta}}{\sigma - \alpha}; \quad \sigma > \alpha. \quad (4.18)

In the first integral we first change variables and write \int_0^{\delta} e^{-st}\, g(t)\, dt = \frac{1}{s}\int_0^{\delta s} e^{-x}\, g(x/s)\, dx and then, setting s = σ, use the fact that in view of (4.7) |g(t)| is bounded by ε. In this way we get

\left|\int_0^{\delta} e^{-\sigma t}\, g(t)\, dt\right| \le \frac{1}{\sigma}\left|\int_0^{\delta\sigma} e^{-x}\, g(x/\sigma)\, dx\right| \le \frac{\varepsilon}{\sigma}\left(1 - e^{-\delta\sigma}\right). \quad (4.19)


Finally we add (4.18) and (4.19), multiply the result by σ, and in view of (4.17) obtain

|F(\sigma)| = \sigma\, |G(\sigma)| \le \frac{N\sigma\, e^{-(\sigma-\alpha)\delta}}{\sigma - \alpha} + \varepsilon\left(1 - e^{-\delta\sigma}\right); \quad \sigma > \alpha. \quad (4.20)

Since ε may be chosen as small as desired, the preceding expression gives

\lim_{\sigma\to\infty} F(\sigma) = 0. \quad (4.21)

Because F(s) is analytic for Re s > α, (4.21) implies (4.8b).

The significance of (4.8) is that it forms the basis for the evaluation of the inverse transform, i.e., the determination of f(t) from F(s) using the techniques of the calculus of residues, to be discussed in Sect. 4.1.4.

4.1.2 Singularity Functions

The analytic properties of F(s) discussed in the preceding subsection are restricted to LTs of piecewise smooth functions satisfying (4.5) and (4.7). We can also accommodate within the framework of LT theory the delta function as well as higher order singularity functions. As we shall see, their transforms do not satisfy (4.8). On first thought the computation of the LT of a delta function appears to present a bit of a problem since δ(t), being defined in terms of limiting forms of kernels that are even in t, appears incompatible with the semi-infinite integration interval of the unilateral LT. This difficulty is easily removed by the artifice of permitting the lower limit of the LT integral to assume slightly negative values, i.e.,

F(s) = \int_{-\varepsilon}^{\infty} e^{-st}\, f(t)\, dt \quad (4.21*)

with ε an arbitrarily small positive quantity.¹ For ordinary functions the replacement of (4.3) by (4.21*) is inconsequential since such functions are defined as zero for negative t. However, as long as ε ≠ 0 the limit for any of the kernels studied in Chapter 2 reads

\lim_{\Omega\to\infty} \int_{-\varepsilon}^{\infty} K_{\Omega}(t)\, dt = 1,

so that also

\lim_{\Omega\to\infty} \int_{-\varepsilon}^{\infty} e^{-st}\, K_{\Omega}(t)\, dt = 1.

Thus we find

L\{\delta(t)\} = 1, \quad (4.22)

¹An alternative popular notation for (4.21*) is F(s) = \int_{0^-}^{\infty} e^{-st}\, f(t)\, dt.


[Figure 4.2: Region of analyticity of F(s) defined by Re(s) > α]

i.e., the LT of a delta function is unity, just like the corresponding FT. We see that in this case (4.8b) does not hold. Similarly, for the higher order singularity functions we obtain for n ≥ 1

L\left\{\delta^{(n)}(t)\right\} = s^n, \quad (4.23)

so that now both (4.8a) and (4.8b) are violated.

4.1.3 Some Examples

For piecewise smooth functions of exponential order the region of analyticity Re s > α in the complex s-plane may be identified with the cross-hatched region in Fig. 4.2.

For example, we easily find for Re s > 0 that

L {U (t)} = 1/s. (4.24)

The only singularity of this LT is a simple pole at s = 0, to the right of which 1/s is analytic. Thus in this case α = 0. Similarly, a simple integration yields the transform

L\left\{e^{qt}\right\} = \frac{1}{s - q} \quad (4.25)

with q an arbitrary complex number. This transform has a simple pole at s = q, so that α = Re q. As another example consider the function √t. Direct computation gives

L\left\{\sqrt{t}\right\} = \int_0^{\infty} e^{-st}\sqrt{t}\, dt = 2\int_0^{\infty} x^2 e^{-sx^2} dx = \frac{\sqrt{\pi}}{2}\, s^{-3/2}. \quad (4.26)


[Figure 4.3: Branch cut along the negative real axis rendering s^{−3/2} analytic in the right half-plane]

Unlike (4.24) and (4.25), this transform is not a rational function. The singularity at s = 0 is not a pole but a branch point, and the function is multivalued. Nevertheless we can still define an analytic function for Re s > ε, where α = ε with ε an arbitrarily small positive quantity. We do this by choosing a branch cut along the negative real axis, as shown in Fig. 4.3.

This renders s^{−3/2} analytic everywhere except along the branch cut. Another example of a function with a non-rational LT is 1/√t. Unlike in examples (4.25) and (4.26), the derivative of this function does not have an LT since it fails to satisfy (4.7). On the other hand, L{1/√t} exists and is given by

L\left\{1/\sqrt{t}\right\} = \int_0^{\infty} e^{-st}\left(1/\sqrt{t}\right) dt = 2\int_0^{\infty} e^{-sx^2} dx = \sqrt{\frac{\pi}{s}}. \quad (4.27)

The same branch cut as in Fig. 4.3 can be used to ensure that 1/√s is analytic in the right half plane. More generally, we can obtain a formula for the LT of t^ν, where ν is an arbitrary real number greater than −1, by using the definition of the Gamma function

\Gamma(\nu) = \int_0^{\infty} x^{\nu-1} e^{-x}\, dx.

The result is

L\left\{t^{\nu}\right\} = \frac{\Gamma(\nu+1)}{s^{\nu+1}}; \quad \nu > -1. \quad (4.27*)

4.1.4 Inversion Formula

Consider the function w(t) defined by the integral

w(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} \frac{F(s)\, e^{st}}{s^2}\, ds, \quad (4.28)

where the integration is performed along the straight line path intercepting the axis of reals at Re s = γ with γ > α, as shown in Fig. 4.4.


[Figure 4.4: Integration path in the complex s-plane along Re s = γ > α]

Next we substitute for F(s) in (4.28) the integral (4.3) and interchange the orders of integration. As a result we obtain

w(t) = \int_0^{\infty} f(\tau) \left\{ \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} \frac{e^{s(t-\tau)}}{s^2}\, ds \right\} d\tau. \quad (4.29)

We now evaluate the inner integral by residues. First consider the case t − τ < 0. Since the exponential decays in the right half of the complex plane, we form a closed contour by adding to the existing straight line path an infinite semicircular contour situated in the right half plane. Because 1/s² approaches zero at infinity, Jordan's lemma (see Appendix A) ensures that in the limit the contribution along the semicircular path vanishes. Therefore the integral along the straight line path in Fig. 4.4 may be equated to the negative sum of the residues at singularities located to the right of the integration path. Since, however, 1/s² is analytic to the right of γ, this integral equals zero. For t − τ > 0 the exponential decays in the left half of the complex plane. We therefore close the contour with a large semicircle in the left half plane. This contribution again vanishes in accordance with Jordan's lemma. The resulting integration along the path in Fig. 4.4 is then equal to the residue at the second-order pole located at s = 0. A simple differentiation of the exponential shows that this residue equals t − τ. Incorporating both results into a single expression we obtain

\frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} \frac{e^{s(t-\tau)}}{s^2}\, ds = (t-\tau)\, U(t-\tau). \quad (4.30)

When the inner integral in (4.29) is replaced by the right side of (4.30), the upper limit of the integral may be truncated to t, so that

w(t) = \int_0^{t} f(\tau)\, (t-\tau)\, d\tau. \quad (4.31)


Differentiating this we get w'(t) = \int_0^t f(\tau)\, d\tau, and with the aid of a second differentiation we find that w''(t) = f(t). Thus to obtain f(t) we merely need to differentiate both sides of (4.28) twice. The result is

f(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} F(s)\, e^{st}\, ds, \quad (4.32)

which is the LT inversion formula. Since the integral on the right represents the inverse operation of (4.3), it seems appropriate to use the notation

f(t) = L^{-1}\{F(s)\}. \quad (4.33)

As a compact notation for both the direct transform (4.3) and its inverse (4.32) we shall also at times use the symbol

f(t) \overset{L}{\Longleftrightarrow} F(s). \quad (4.34)

4.1.5 Fundamental Theorems

Properties

Transformation of Derivatives and Solution of DEs. Because of the exponential nature of the kernel, the LT, just like the FT, can be employed to convert a differential operator with constant coefficients into a polynomial in the transform variable. In (2.161) in Sect. 2.2 we have demonstrated how this property can be exploited in conjunction with the FT to solve linear differential equations with constant coefficients. We can accomplish the same result by using the LT. In fact, in this respect the LT has two advantages over the FT. The first of these is that given the initial conditions at, say, t = t_0 we can solve a DE for t ≥ t_0 without the need of explicitly imposing the constraint that the output signal be identically zero for t < t_0. Instead we accomplish the same result by truncating the integral defining the transform to t ≥ t_0. By convention in the (unilateral) LT t_0 is set equal to zero, so that causality becomes an implicit attribute of the transform. The other advantage of the LT over the FT is that the given DE may not admit solutions which are both causal and possess an FT. On the other hand, as long as the excitation (input) is of exponential order as t → ∞ we are assured of the existence of an LT.

We now assume that the derivatives up to and including order N of the function satisfying an N-th order DE exist for t > 0. Thus we exclude functions with step discontinuities except at t = 0 and functions that may become unbounded at t = 0, as, e.g., the function in (4.27). Since the function will generally possess a step discontinuity at t = 0, we must distinguish between its right side value, which we denote by f(0^+), and the value at t = 0^−, which by definition equals zero. We then define the LT of f'(t) with the lower integration limit set to t = 0^+. This avoids differentiation at the discontinuity and implicitly identifies


f'(t) as the derivative of the "smooth" part of the function. The Laplace integral is then transformed by an integration by parts as follows:

L\{f'(t)\} = \int_{0^+}^{\infty} e^{-st} f'(t)\, dt = e^{-st} f(t)\big|_{0^+}^{\infty} + s\int_{0^+}^{\infty} e^{-st} f(t)\, dt = s F(s) - f(0^+), \quad (4.35)

where it is assumed that Re s > α. Alternatively, we can get the same result by using an explicit decomposition of f(t) into the smooth (differentiable) part and a step, i.e.,

f(t) = f_s(t) + f(0^+)\, U(t).

Upon differentiation this becomes

f'(t) = f_s'(t) + f(0^+)\, \delta(t). \quad (4.36)

On the other hand, differentiating both sides of the inversion formula (4.32) gives L{f'(t)} = sF(s). This derivative is to be interpreted as a generalized derivative that may include singularity functions. Taking the LT of both sides of (4.36) and identifying the LT of the left side with sF(s) yields

L\{f_s'(t)\} = s F(s) - f(0^+). \quad (4.37)

The preceding is identical to (4.35) provided f'(t) is relabeled as f_s'(t). The distinction between f_s(t) and f(t) is usually not made explicit with the unilateral LT, and we shall adhere to this custom by dropping the subscript s. Nevertheless, to avoid apparent inconsistencies this distinction should be kept in mind. For example, using (4.35) with f(t) = U(t) we get on account of (4.24)

L\left\{\frac{dU(t)}{dt}\right\} = s\,\frac{1}{s} - 1 = 0,

which in view of (4.22) and dU(t)/dt = δ(t) should equal unity. This contradiction is only apparent, for formula (4.35) holds only for the smooth part of the function, which according to the decomposition (4.36) is in this case identically zero.

The LT of higher order derivatives follows from a repetition of the steps in (4.35). For example, for the second derivative we obtain

L\left\{\frac{d^2 f(t)}{dt^2}\right\} = s^2 F(s) - s f(0^+) - f^{(1)}(0^+), \quad (4.38)

where f^{(1)}(0^+) is the first derivative at t = 0^+. Finally, for a derivative of any order, we get

L\left\{\frac{d^n f(t)}{dt^n}\right\} = s^n F(s) - \sum_{\ell=1}^{n} s^{n-\ell}\, f^{(\ell-1)}(0^+). \quad (4.39)


We can use formula (4.39) to find the LT of the general solution of an N-th order linear differential equation with constant coefficients subject to initial conditions at t = 0. Thus we transform

\sum_{n=0}^{N} a_n \frac{d^n y(t)}{dt^n} = f(t) \quad (4.40)

as follows:

Y(s) \sum_{n=0}^{N} a_n s^n = \sum_{n=1}^{N} a_n \sum_{\ell=1}^{n} s^{n-\ell}\, y^{(\ell-1)}(0^+) + F(s), \quad (4.41)

where Y(s) = L{y(t)}. Solving for Y(s) we get

Y(s) = \frac{\sum_{n=1}^{N} a_n \sum_{\ell=1}^{n} s^{n-\ell}\, y^{(\ell-1)}(0^+)}{\sum_{n=0}^{N} a_n s^n} + \frac{F(s)}{\sum_{n=0}^{N} a_n s^n}. \quad (4.42)

Using the notation in (4.33) for the inverse transform, the complete solution of (4.40) for t ≥ 0 reads

y(t) = L^{-1}\left\{\frac{\sum_{n=1}^{N} a_n \sum_{\ell=1}^{n} s^{n-\ell}\, y^{(\ell-1)}(0^+)}{\sum_{n=0}^{N} a_n s^n}\right\} + L^{-1}\left\{\frac{F(s)}{\sum_{n=0}^{N} a_n s^n}\right\}. \quad (4.43)

We recognize the first term on the right as the zero input response and the second term as the zero state response of the corresponding LTI system. Also, since the LT of a delta function is unity, the system impulse response is

h(t) = L^{-1}\left\{\frac{1}{\sum_{n=0}^{N} a_n s^n}\right\}. \quad (4.44)

The utility of this formula and (4.43) depends on the ease with which the inverse transforms can be evaluated. We shall discuss techniques of evaluating the inverse LT in 4.1.6.
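As an illustration (not from the text), (4.44) can be evaluated with a computer algebra system; here for the polynomial s² + 3s + 2:

```python
# Sketch: the impulse response (4.44) for s² + 3s + 2, recovered with
# sympy's inverse Laplace transform and partial fractions.
from sympy import symbols, inverse_laplace_transform, apart

s, t = symbols('s t', positive=True)
H = 1 / (s**2 + 3*s + 2)
print(apart(H, s))                          # 1/(s+1) - 1/(s+2)
print(inverse_laplace_transform(H, s, t))   # exp(-t) - exp(-2t), t > 0
```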

Integration. In the analysis of linear systems one sometimes encounters integro-differential operators involving integrals of the form \int_0^t f(\tau)\, d\tau. We can readily find the corresponding LT using integration by parts. Thus, assuming Re s > α, we compute

L\left\{\int_0^{t} f(\tau)\, d\tau\right\} = \int_0^{\infty} e^{-st}\left\{\int_0^{t} f(\tau)\, d\tau\right\} dt = \left\{-\frac{e^{-st}}{s}\int_0^{t} f(\tau)\, d\tau\right\}\Bigg|_0^{\infty} + \frac{1}{s}\int_0^{\infty} f(t)\, e^{-st}\, dt.

Because \left|\int_0^t f(\tau)\, d\tau\right| < N e^{\alpha t}, the first term to the right of the second equality vanishes at the upper limit and we obtain

L\left\{\int_0^{t} f(\tau)\, d\tau\right\} = \frac{F(s)}{s}. \quad (4.45)


[Figure 4.5: Truncation of f(t − T) for T < 0]

Initial and Final Value Theorems. Initial and final value theorems similar to the ones we have obtained for the FT of causal functions ((2.181) and (2.183) in Sect. 2.2) hold also for the LT. Thus if the LT of the derivative exists, it must satisfy (4.8b). Hence using (4.35) we have

\lim_{\operatorname{Re} s > \alpha,\; |s|\to\infty} s F(s) = f(0^+), \quad (4.46)

which is the initial value theorem. For the final value theorem we can proceed as for the FT. Assuming \lim_{t\to\infty} f(t) = A exists, we have

\lim_{s\to 0} s F(s) - f(0^+) = \lim_{s\to 0} L\left\{\frac{df(t)}{dt}\right\} = \lim_{s\to 0}\int_{0^+}^{\infty} e^{-st}\, \frac{df(t)}{dt}\, dt = A - f(0^+),

and cancelling f(0^+) from both sides we get

\lim_{s\to 0} s F(s) = A. \quad (4.47)

Differentiation with Respect to the Transform Variable. Differentiating (4.3) n times with respect to s we obtain

L\{t^n f(t)\} = (-1)^n \frac{d^n F(s)}{ds^n}. \quad (4.48)

Time Shift. For a time-shifted signal f(t − T) we have

L\{f(t-T)\} = \int_0^{\infty} f(t-T)\, e^{-st}\, dt = e^{-sT}\int_{-T}^{\infty} f(\tau)\, e^{-s\tau}\, d\tau.

Because f(t) = 0 for t < 0, the last integral equals F(s) only if T ≥ 0, i.e., when the signal is delayed in time. When T < 0 this integral equals the LT of a function that is equal to f(t) only for t ≥ −T but identically zero in the interval 0 ≤ t < −T. This truncation is illustrated in Fig. 4.5.

We can write the final result as follows:

L\{f(t-T)\} = \begin{cases} e^{-sT} F(s); & T \ge 0, \\ e^{-sT}\int_{-T}^{\infty} f(\tau)\, e^{-s\tau}\, d\tau; & T < 0. \end{cases} \quad (4.49)


Multiplication by Exponential Functions. With q an arbitrary complex number and f(t) of exponential order we form the function y(t) = e^{qt} f(t). (Note that this modifies the growth rate at infinity to O(e^{(\alpha + \operatorname{Re} q)t}).) The corresponding LT is

Y(s) = L\left\{e^{qt} f(t)\right\} = F(s - q), \quad (4.50)

so that the region of analyticity of Y(s) is shifted to Re(s) > α + Re q.

Convolution. For two causal functions f_1(t) and f_2(t) the convolution integral becomes

\int_{-\infty}^{\infty} f_1(\tau)\, f_2(t-\tau)\, d\tau = \int_0^{t} f_1(\tau)\, f_2(t-\tau)\, d\tau. \quad (4.51)

For the LT we obtain

L\left\{\int_0^{t} f_1(\tau)\, f_2(t-\tau)\, d\tau\right\} = L\left\{\int_0^{\infty} f_1(\tau)\, f_2(t-\tau)\, U(t-\tau)\, d\tau\right\}

= \int_0^{\infty} f_1(\tau)\, L\{f_2(t-\tau)\, U(t-\tau)\}\, d\tau

= \int_0^{\infty} f_1(\tau)\left\{\int_0^{\infty} e^{-st} f_2(t-\tau)\, U(t-\tau)\, dt\right\} d\tau

= \int_0^{\infty} f_1(\tau)\, e^{-s\tau}\, d\tau \int_{-\tau}^{\infty} e^{-sx} f_2(x)\, U(x)\, dx

= \int_0^{\infty} f_1(\tau)\, e^{-s\tau}\, d\tau \int_0^{\infty} e^{-sx} f_2(x)\, dx

= F_1(s)\, F_2(s). \quad (4.52)

Thus, just as for the FT, convolution in the time domain yields a product in the transform domain.
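A crude numerical illustration of (4.52), a sketch with arbitrary exponential signals and one real s in the common region of convergence:

```python
# Sketch: checking (4.52) at a single real s for f1 = e^{-t}U(t),
# f2 = e^{-2t}U(t); both sides should equal 1/((s+1)(s+2)).
import numpy as np
from scipy.integrate import quad

s = 1.5

def L(f):   # one-sided Laplace transform evaluated at the chosen real s
    val, _ = quad(lambda t: np.exp(-s*t)*f(t), 0.0, np.inf)
    return val

f1 = lambda t: np.exp(-t)
f2 = lambda t: np.exp(-2.0*t)
conv = lambda t: quad(lambda tau: f1(tau)*f2(t - tau), 0.0, t)[0]

print(L(conv), L(f1) * L(f2))   # both ≈ 1/(2.5 * 3.5) ≈ 0.1143
```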

LT of Polynomial and Exponential Signals

The LT of signals comprised of polynomials and exponentials can be synthesized in terms of the preceding results without evaluating the transform integral directly. For example, using (4.24) and (4.48) we get

L\{t^n\} = \frac{n!}{s^{n+1}}. \quad (4.53)

Using (4.50) together with (4.53) we obtain

L{tneqt

}=

n!

(s− q)n+1 . (4.54)

Also with q = iω0 with ω0 real and f(t) = U(t) (4.50) yields

L{eiω0t

}=

1

s− iω0

Page 261: Signals and transforms in linear systems analysis

248 4 Laplace Transforms

and using appropriate superpositions of eiω0t we readily obtain the transforms

L {cosω0t} = s

s2 + ω20

, (4.55a)

L {sinω0t} = ω0

s2 + ω20

. (4.55b)

With α > 0 the LT of the damped sinusoid e^{-\alpha t}\cos\omega_0 t can be found from (4.55a) in conjunction with (4.50). Thus

\mathcal{L}\{e^{-\alpha t}\cos\omega_0 t\} = \frac{s+\alpha}{(s+\alpha)^2 + \omega_0^2}.   (4.56)

Note that as long as α > 0 the FT can be obtained from (4.56) by the substitution s = iω. Evidently this procedure fails when α = 0, for it cannot account for the presence of the delta functions in the FT (see (2.138)). For α < 0 the FT does not exist, but (4.56) retains its validity. We shall discuss the relationship between the FT and the LT in detail in Sect. 4.2.3.

Periodic Signals

Let f_0(t) be a signal of duration T, identically zero for t ≥ T and t < 0. In view of (4.5), f_0(t) is of exponential order with α = 0. The signal defined by the sum

f(t) = \sum_{n=0}^{\infty} f_0(t - nT)   (4.57)

is then periodic for t ≥ 0 with period T. Setting \mathcal{L}\{f_0(t)\} = F_0(s) and applying (4.49) to (4.57) we obtain

\mathcal{L}\{f(t)\} = F(s) = F_0(s)\sum_{n=0}^{\infty} e^{-sTn}.

This geometric series converges for Re s > 0, consistent with f(t) being of exponential order with α = 0. Summing the series we get

F(s) = \frac{F_0(s)}{1 - e^{-sT}}.   (4.58)

4.1.6 Evaluation of the Inverse LT

Rational Functions

The LT of the zero-input response of an LTI system governed by an ordinary DE is in general a ratio of two polynomials, i.e., a rational function. If the input is representable as a superposition of exponentials and polynomials, then the LT of the zero-state response is also a rational function. In that case we can represent the LT of the total output by

Y(s) = \frac{N(s)}{D(s)},   (4.59)

where N(s) and D(s) are coprime polynomials (i.e., without common factors) and degree N(s) ≤ degree D(s). If the two polynomials have identical degrees, we can carry out a long division to obtain

Y(s) = c + \tilde{Y}(s),

where c is a constant and

\tilde{Y}(s) = \frac{\tilde{N}(s)}{D(s)} = \frac{b_1 s^{N-1} + b_2 s^{N-2} + \cdots + b_N}{s^N + a_1 s^{N-1} + \cdots + a_N}.   (4.60)

The inverse transform of (4.59) is then

y(t) = c\,\delta(t) + \tilde{y}(t)

with

\tilde{y}(t) = \mathcal{L}^{-1}\{\tilde{Y}(s)\} = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \tilde{Y}(s)e^{st}\,ds.   (4.61)

The region of analyticity of \tilde{Y}(s) is defined by Re s > α = Re{s_max}, with s_max the zero of D(s) with the largest real part. Consequently, in (4.61) we must choose γ > Re{s_max}. Referring to Fig. 4.6, we now evaluate the integral of \tilde{Y}(s)e^{st} along a closed path in the counterclockwise direction over the contour consisting of the straight line and the circular path in the left half-plane. We choose the radius R of the circle large enough to enclose all the zeros of D(s) and obtain

\frac{1}{2\pi i}\left\{\int_{\gamma-i\sqrt{R^2-\gamma^2}}^{\gamma+i\sqrt{R^2-\gamma^2}} \tilde{Y}(s)e^{st}ds + \int_{C_R} \tilde{Y}(s)e^{st}ds\right\} = \sum_{n=1}^{K} r_n(t),   (4.62)

where the subscript C_R refers to the circular path, r_n(t) is the residue of \tilde{Y}(s)e^{st} at the pole located at s = s_n, and K ≤ N is the number of poles. If we let R approach infinity (while keeping γ fixed), the first integral on the left of (4.62) reduces to (4.61). Because \tilde{Y}(s) → 0 for Re s < 0 as R → ∞, the integral over the circular path approaches zero in accordance with Jordan's lemma (Appendix A). Hence

\tilde{y}(t) = \mathcal{L}^{-1}\{\tilde{Y}(s)\} = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \tilde{Y}(s)e^{st}\,ds = \sum_{n=1}^{K} r_n(t).   (4.63)

[Figure 4.6: Evaluation of the inverse LT of rational functions: the line Re s = γ closed by a circular arc of radius R in the left half-plane enclosing the poles s_n.]

For a pole of order m at s = s_n the residue is (see Appendix A)

r_n(t) = \frac{1}{(m-1)!}\,\frac{d^{m-1}}{ds^{m-1}}\left[(s-s_n)^m\,\frac{N(s)e^{st}}{D(s)}\right]_{s=s_n}.   (4.64)

When the pole is simple (m = 1) the following alternative formula (Appendix A) may be used:

r_n(t) = \left.\frac{N(s)e^{st}}{\frac{d}{ds}D(s)}\right|_{s=s_n}.   (4.65)

Example 1

F(s) = \frac{1}{s(s+2)}.   (4.66)

The function has simple poles at s = 0 and s = −2. We choose γ > 0 and in accordance with (4.63) and (4.64) obtain

\mathcal{L}^{-1}\{F(s)\} = \left.\frac{e^{st}}{s+2}\right|_{s=0} + \left.\frac{e^{st}}{s}\right|_{s=-2} = \frac{1}{2} - \frac{1}{2}e^{-2t}.   (4.67)

Example 2

F(s) = \frac{s}{(s+1)(s-3)^2}.   (4.68)

This function has a simple pole at s = −1 and a pole of order 2 at s = 3. Here γ > 3, so that

\mathcal{L}^{-1}\{F(s)\} = \left.\frac{se^{st}}{(s-3)^2}\right|_{s=-1} + \frac{d}{ds}\left.\frac{se^{st}}{s+1}\right|_{s=3}
= -\frac{1}{16}e^{-t} + \left.\frac{(s+1)(1+st)e^{st} - se^{st}}{(s+1)^2}\right|_{s=3}
= -\frac{1}{16}e^{-t} + \frac{1+12t}{16}e^{3t}.   (4.69)

Example 3

F(s) = \frac{s^2}{(s+2)(s+3)}.   (4.70)

In this case F(s) approaches a finite value at infinity, which must be subtracted before the residue formula can be applied. Since F(∞) = 1, we write

F(s) = 1 - 1 + F(s) = 1 - \frac{5s+6}{(s+2)(s+3)}.

The inverse of the first term on the right is a delta function, while the second term represents a transform that vanishes at infinity. To the second term we may apply the residue formula and obtain

\mathcal{L}^{-1}\{F(s)\} = \delta(t) - \left.\frac{(5s+6)e^{st}}{s+2}\right|_{s=-3} - \left.\frac{(5s+6)e^{st}}{s+3}\right|_{s=-2} = \delta(t) - 9e^{-3t} + 4e^{-2t}.   (4.71)

Example 4

F(s) = \frac{4s}{[(s+1)^2+4](s+2)}.   (4.72)

This function has two simple poles at s_{1,2} = −1 ± 2i as well as a simple pole at s = −2. We obtain

\mathcal{L}^{-1}\{F(s)\} = \left.\frac{4se^{st}}{(s+1)^2+4}\right|_{s=-2} + \left.\frac{4se^{st}}{2(s+1)(s+2)}\right|_{s=-1+2i} + \left.\frac{4se^{st}}{2(s+1)(s+2)}\right|_{s=-1-2i},

where for the complex poles we have used the differentiation formula (4.65). Completing the algebra in the preceding expression we get

\mathcal{L}^{-1}\{F(s)\} = -\frac{8}{5}e^{-2t} + e^{-t}\left(\frac{8}{5}\cos 2t + \frac{6}{5}\sin 2t\right).   (4.73)


Transcendental Functions

Meromorphic Functions. In many physical problems one encounters LTs that are not rational functions. A particularly important example of nonrational functions is the class referred to as meromorphic functions, whose only singularities in the finite part of the complex plane are poles. Meromorphic functions differ from rational functions only in that the number of poles is allowed to approach infinity. For example, referring to (4.58), we note that the zeros of the denominator are infinite in number, so that the LT of a periodic function will be meromorphic provided F_0(s) is itself either meromorphic or rational. Let us examine a specific case:

F(s) = \frac{e^{-s} - e^{-2s}}{s(1 - e^{-2s})}.   (4.74)

Here F_0(s) = (e^{-s} - e^{-2s})/s, so that f_0(t) is the rectangular pulse U(t−1) − U(t−2). Hence f(t) = \mathcal{L}^{-1}\{F(s)\} is the periodic function displayed in Fig. 4.7.

Alternatively, we can apply the inversion formula to (4.74). Since F(s) → 0 as |s| → ∞, the Jordan lemma applies, so that for t > 0, f(t) may be equated to the sum of the residues corresponding to the infinite number of simple poles at s = s_n = inπ, n = ±1, ±2, ..., and the pole at s = 0, where the denominator has a double zero. Setting G = (e^{-s} - e^{-2s})(1 - e^{-2s})^{-1}, the contribution from the latter is

\left.\frac{d}{ds}\left(Gse^{st}\right)\right|_{s=0} = e^{st}\left[(1+st)G + s\frac{dG}{ds}\right]_{s=0}.

As s approaches zero, G approaches a finite value, which we compute with the aid of Taylor expansions about s = 0 of the numerator and denominator as follows:

G \to \frac{1 - s + s^2/2 - \cdots - [1 - 2s + 2s^2 - \cdots]}{1 - [1 - 2s + 2s^2 - \cdots]} \to \frac{1}{2}.

Clearly \left.\frac{dG}{ds}\right|_{s=0} is also finite, so that we obtain

\left.\frac{d}{ds}\left(Gse^{st}\right)\right|_{s=0} = \frac{1}{2}.   (4.75)

For the residues at the infinite set of simple poles we get

\left.\frac{(e^{-s} - e^{-2s})e^{st}}{s\,\frac{d}{ds}(1 - e^{-2s})}\right|_{s=in\pi} = \frac{[(-1)^n - 1]\,e^{in\pi t}}{2in\pi} = \begin{cases} 0; & n \text{ even},\\ -e^{in\pi t}/(in\pi); & n \text{ odd}.\end{cases}   (4.76)

Summing over all positive and negative n, setting n = 2k+1, and adding (4.75) we get the final result

f(t) = \frac{1}{2} - 2\sum_{k=0}^{\infty} \frac{\sin[\pi(2k+1)t]}{\pi(2k+1)},   (4.77)

which is just the FS representation of the square wave in Fig. 4.7.
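The convergence of (4.77) to the square wave of Fig. 4.7 (zero on (0,1), unity on (1,2), and so on) is easily checked numerically. A minimal sketch, with NumPy assumed and the test points chosen arbitrarily away from the jumps:

```python
import numpy as np

def f77(t, K=20000):                       # partial sum of (4.77)
    n = 2 * np.arange(K) + 1
    return 0.5 - 2 * np.sum(np.sin(np.pi * n * t) / (np.pi * n))

print([round(f77(t), 3) for t in (0.5, 1.5, 2.5)])   # ~ [0.0, 1.0, 0.0]
```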

[Figure 4.7: Inverse LT of (4.74): the pulse f_0(t) = U(t−1) − U(t−2) repeated with period 2.]

Example 2

Next consider the LT

I(s) = \frac{1+\gamma}{2s}\,\frac{1 + e^{-s\tau}}{1 + \gamma e^{-s\tau}},   (4.78)

where 0 ≤ γ ≤ 1 and τ > 0. This function represents the LT of the electric current i(t) in the resistor R in the circuit shown in Fig. 4.8. The resistor is shunted by an open-circuited transmission line of length L and characteristic impedance Z_0, and the parallel combination is excited by the ideal current generator U(t). The transmission line is assumed lossless, with τ = L/v, v the phase velocity, and γ = (R − Z_0)(R + Z_0)^{-1}.

[Figure 4.8: Electric circuit interpretation of (4.78): current source U(t) driving R in parallel with an open-circuited line of length L and characteristic impedance Z_0.]

As in Example 1, the LT (4.78) approaches zero as |s| → ∞, so that i(t) is obtained by summing the residues. Presently we have a simple pole at s = 0 and, unless γ = 1, an infinite set of simple poles corresponding to 1 + γe^{-s\tau} = 0. Denoting these by s_n, we have

s_n = (\ln\gamma)/\tau + in\pi/\tau;\quad n = \pm 1, \pm 3, \pm 5, \ldots   (4.79)

Note that the real parts of the s_n are all identical, always negative, and tend to −∞ as γ approaches zero. Carrying out the residue calculation we get

i(t) = \left.\frac{1+\gamma}{2}\,\frac{1+e^{-s\tau}}{1+\gamma e^{-s\tau}}\,e^{st}\right|_{s=0} + \sum_n \left.\frac{1+\gamma}{2s}\,\frac{1+e^{-s\tau}}{\frac{d}{ds}(1+\gamma e^{-s\tau})}\,e^{st}\right|_{s=s_n}
= 1 - \frac{1-\gamma^2}{2\gamma}\,\gamma^{t/\tau}\sum_n \frac{e^{in\pi t/\tau}}{\ln\gamma + in\pi},   (4.80)

where the sum runs over the odd integers n of both signs. Changing the summation index to k with n = 2k+1, the last series assumes the form of a Fourier series:

\sum_n \frac{e^{in\pi t/\tau}}{\ln\gamma + in\pi} = 2\sum_{k=0}^{\infty} \frac{\ln\gamma\,\cos[\pi(2k+1)t/\tau] + \pi(2k+1)\sin[\pi(2k+1)t/\tau]}{\ln^2\gamma + \pi^2(2k+1)^2}.

Functions with Branch Point Singularities. In a variety of physical problems one encounters multivalued LTs. Examples include lossy transmission lines, waveguides and, more generally, electromagnetic radiating structures (e.g., antennas). Unlike for rational and meromorphic functions, analyticity of F(s) for sufficiently large |s| in the right half-plane is for such functions not automatic but must be imposed by confining the range of s to a particular (usually the top) Riemann sheet. This can be done through a judicious choice of branch cuts, as illustrated in the following representative examples.

Example 1

F(s) = \frac{b\sqrt{s}}{\sqrt{s} - a},   (4.81)

where a and b are real constants. Since F(∞) = b, we subtract b from F(s), yielding a function of s that tends to zero at infinity, and then add it back to maintain equality. Thus

F(s) = \frac{b\sqrt{s}}{\sqrt{s}-a} - b + b = b + \frac{ab}{\sqrt{s}-a}.

Hence

f(t) = \mathcal{L}^{-1}\{F(s)\} = b\,\delta(t) + ab\,\mathcal{L}^{-1}\{G(s)\},   (4.82)

where

G(s) = \frac{1}{\sqrt{s}-a}.   (4.83)

Evidently G(∞) = 0, which ensures the validity of the Jordan lemma for the inversion formula. We note that G(s) has a branch point at s = 0 and an isolated singularity at \sqrt{s} = a. Since (\sqrt{s}-a)^{-1} = (\sqrt{s}+a)(s-a^2)^{-1}, we see that this singularity is a simple pole at s = a^2. To apply the LT inversion formula we must construct a G(s) that is analytic for Re s > α. We can accomplish this by defining the analytic function \sqrt{s} by means of a branch cut along the negative real axis, with the result that Re\sqrt{s} > 0 on the top Riemann sheet.

[Figure 4.9: Integration path for the LT inversion formula in (4.84): branch cut along the negative real axis and the simple pole at s = a^2 to the left of the line P at Re s = γ.]

Consequently, the pole will lie on the top Riemann sheet only if a > 0. Assuming this to be the case, G(s) will be analytic for Re s > a^2 (α = a^2), so that the inversion formula reads

g(t) = \mathcal{L}^{-1}\left\{\frac{1}{\sqrt{s}-a}\right\} = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{e^{st}\,ds}{\sqrt{s}-a},   (4.84)

where γ > a^2, and the disposition of the integration path P relative to the singularities is as shown in Fig. 4.9.

We close the path of integration in (4.84) with a semicircle in the left half-plane and two integrals along the two sides of the branch cut, as shown in Fig. 4.10. The resulting integral around a closed path equals 2πi times the residue at the enclosed pole. Thus

\frac{1}{2\pi i}\left\{\int_{\gamma-iR\sin\psi}^{\gamma+iR\sin\psi}\frac{e^{st}ds}{\sqrt{s}-a} + \int_{\Gamma_{R+}}\frac{e^{st}ds}{\sqrt{s}-a} + \int_{BC_+}\frac{e^{st}ds}{\sqrt{s}-a} + \int_{BC_-}\frac{e^{st}ds}{\sqrt{s}-a} + \int_{\Gamma_{R-}}\frac{e^{st}ds}{\sqrt{s}-a}\right\} = 2ae^{a^2t}.   (4.85)

As R → ∞ the Jordan lemma ensures that for t > 0 the two integrals along the semicircular paths approach zero. Because Re\sqrt{s} > 0 on the top Riemann sheet, \sqrt{s} = i\sqrt{r} along the upper side of the branch cut and -i\sqrt{r} on the lower side. The integrations around the branch cut then yield the following:

\lim_{R\to\infty}\int_{BC_+}\frac{e^{st}ds}{\sqrt{s}-a} = \int_{\infty}^{0}\frac{-e^{-rt}\,dr}{i\sqrt{r}-a} = -2\int_0^{\infty} e^{-tx^2}\,\frac{x\,dx}{a-ix},

\lim_{R\to\infty}\int_{BC_-}\frac{e^{st}ds}{\sqrt{s}-a} = \int_0^{\infty}\frac{-e^{-rt}\,dr}{-i\sqrt{r}-a} = 2\int_0^{\infty} e^{-tx^2}\,\frac{x\,dx}{a+ix}.

[Figure 4.10: Deformation of the integration path in the evaluation of the inverse LT: semicircular arcs Γ_{R+}, Γ_{R−} of radius R and paths BC_+, BC_− along the two sides of the branch cut.]

Inserting these into (4.85) we get, in the limit as R → ∞,

g(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{e^{st}ds}{\sqrt{s}-a} = 2ae^{a^2t} + \frac{2}{\pi}Q(t),   (4.86)

where

Q(t) = \int_0^{\infty} e^{-tx^2}\,\frac{x^2\,dx}{a^2+x^2}.   (4.87)

The last integral can be expressed in terms of the error function by first setting P(t) = \int_0^{\infty} e^{-tx^2}(a^2+x^2)^{-1}dx and noting Q(t) = -P'(t). Then, since

\int_0^{\infty} e^{-tx^2}\,\frac{(x^2+a^2)\,dx}{a^2+x^2} = \int_0^{\infty} e^{-tx^2}dx = \frac{1}{2}\sqrt{\frac{\pi}{t}},

we get the differential equation

P'(t) - a^2P(t) = -\frac{1}{2}\sqrt{\frac{\pi}{t}}.

The solution for P(t) reads

P(t) = P(0)e^{a^2t} - \frac{\sqrt{\pi}}{2}\int_0^t e^{a^2(t-\tau)}\tau^{-1/2}d\tau,

where P(0) = \int_0^{\infty}(a^2+x^2)^{-1}dx = \pi/2a. Through a change of the integration variable to x = a\tau^{1/2} the preceding becomes

P(t) = (\pi/2a)e^{a^2t} - \frac{\sqrt{\pi}}{a}\,e^{a^2t}\int_0^{a\sqrt{t}} e^{-x^2}dx,

so that

Q(t) = \frac{1}{2}\sqrt{\frac{\pi}{t}} - \frac{a\pi}{2}e^{a^2t} + \frac{a\pi}{2}e^{a^2t}\,\mathrm{erf}(a\sqrt{t}),   (4.88)

where

\mathrm{erf}(a\sqrt{t}) = \frac{2}{\sqrt{\pi}}\int_0^{a\sqrt{t}} e^{-x^2}dx = 1 - \mathrm{erfc}(a\sqrt{t})   (4.89)

defines the error function and its complement. Collecting our results we get the final expression

f(t) = b\,\delta(t) + 2a^2be^{a^2t} - a^2be^{a^2t}\,\mathrm{erfc}(a\sqrt{t}) + \frac{ab}{\sqrt{\pi t}},   (4.90)

which requires that a > 0. When a < 0 the pole lies on the bottom Riemann sheet, so that the right side of (4.85) is zero. Proceeding as above we obtain

g(t) = \frac{1}{\sqrt{\pi t}} - |a|e^{a^2t}\,\mathrm{erfc}(|a|\sqrt{t}),   (4.91)

which then yields

f(t) = b\,\delta(t) + a^2be^{a^2t}\,\mathrm{erfc}(|a|\sqrt{t}) + \frac{ab}{\sqrt{\pi t}}.   (4.92)
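Transform pairs of this type are easy to cross-check with a numerical Bromwich inversion. A sketch for the a < 0 case (4.91), where G(s) = 1/(\sqrt{s}+|a|) has no pole on the top sheet, using mpmath's invertlaplace (Talbot method); the value |a| = 1 and the test times are arbitrary:

```python
import mpmath as mp

c = mp.mpf(1)                              # c = |a|, arbitrary test value
G = lambda s: 1 / (mp.sqrt(s) + c)         # branch cut on the negative axis
g = lambda t: (1 / mp.sqrt(mp.pi * t)
               - c * mp.exp(c**2 * t) * mp.erfc(c * mp.sqrt(t)))

for t in (mp.mpf('0.3'), mp.mpf(2)):
    print(mp.invertlaplace(G, t, method='talbot'), g(t))
```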

Example 2

In this example we compute the response of a homogeneous electrically conducting medium to a unit step excitation. A plane wave field component initialized by a unit step at x = 0 and t = 0 that has propagated a distance x into the conducting medium can be represented at time t by

A(t,x) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} e^{st}\,\frac{e^{-(sx/v)\sqrt{1+(s\tau)^{-1}}}}{s}\,ds,   (4.93)

where τ is the medium relaxation time and v the phase velocity. It will be convenient to introduce the nondimensional parameters \bar{t} = t/\tau, \bar{x} = x/v\tau and change the complex variable s to z = s\tau. With these changes (4.93) transforms into

A(t,x) \equiv \bar{A}(\bar{t},\bar{x}) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} e^{z\bar{t}}\,\frac{e^{-z\bar{x}\sqrt{1+z^{-1}}}}{z}\,dz = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} e^{z(\bar{t}-\bar{x})}\,\frac{e^{-\bar{x}\left(\sqrt{z(z+1)}-z\right)}}{z}\,dz.   (4.94)

[Figure 4.11: Disposition of singularities in the evaluation of the inverse LT (4.95): branch points at z = 0 and z = −1 joined by the branch cut, with distances r_1, r_2 and angles θ_1, θ_2 measured from them, a small circle C around z = 0, and paths BC_+, BC_− along the cut.]

Next we note that \sqrt{z(z+1)} - z remains bounded as |z| → ∞ (it tends to 1/2), so that p(z) = e^{-\bar{x}(\sqrt{z(z+1)}-z)}/z → 0 as |z| → ∞. Thus, if we can demonstrate that p(z) is analytic for Re z > 0 (which we do in the sequel by a particular choice of branch cut), then, because \exp[z(\bar{t}-\bar{x})] decays in the left half-plane whenever \bar{t} - \bar{x} < 0, it will follow from the Jordan lemma that A(t,x) = 0 for \bar{t} - \bar{x} < 0. This is equivalent to t < x/v, in agreement with our expectation that no signal can propagate faster than v. Since the signal evolution commences at \bar{t} = \bar{x}, it is sensible to shift the time origin to coincide with the time of first signal arrival. This amounts to replacing (4.94) by

a(\bar{t},\bar{x}) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} e^{z\bar{t}}\,\frac{e^{-\bar{x}\left(\sqrt{z(z+1)}-z\right)}}{z}\,dz.   (4.95)

The singularities in the integrand are the two branch points of \sqrt{z(z+1)}: one at z = 0, where the integrand approaches ∞, and the other at z = −1, where the integrand is finite. We can render the function

p(z) = e^{-\bar{x}\left(\sqrt{z(z+1)}-z\right)}/z   (4.96)

analytic for Re z > 0 by connecting the branch points with a branch cut along the segment (−1, 0) of the real axis, as shown in Fig. 4.11. This follows from the following definition of the square root:

\sqrt{z(z+1)} = \sqrt{r_1 r_2}\,e^{i(\theta_1+\theta_2)/2},   (4.97)

where the angles θ_1 and θ_2 shown in the figure are restricted to −π ≤ θ_{1,2} ≤ π. Using (4.97) we observe that p(z) in (4.96) is continuous everywhere except

possibly on the branch cut. This implies that p(z) is analytic outside every closed curve enclosing the branch cut. Turning our attention to the branch cut itself, we note that on its upper side, BC_+, the angles approach θ_1 = π, θ_2 = 0, which results in \sqrt{z(z+1)} = i\sqrt{r_1 r_2} = i\sqrt{r_1(1-r_1)}. On the lower side of the branch cut, BC_−, the angles evaluate to θ_1 = −π, θ_2 = 0, which gives \sqrt{z(z+1)} = -i\sqrt{r_1 r_2} = -i\sqrt{r_1(1-r_1)}. Thus in traversing the branch cut downward the square root undergoes a jump of 2i\sqrt{r_1(1-r_1)}.

We now have all the necessary machinery to evaluate (4.95). For \bar{t} < 0 the exponential decays in the right half-plane so that, invoking the Jordan lemma, a(\bar{t},\bar{x}) = 0, in agreement with our previous conjecture. For \bar{t} > 0 the exponential decays in the left half-plane. Again taking account of the Jordan lemma and the analyticity of p(z) at all points not on the branch cut, we may deform the original path of integration into the path enclosing the branch cut shown in Fig. 4.11. The three contributions can then be represented as follows:

a(\bar{t},\bar{x}) = \frac{1}{2\pi i}\left\{\int_C p(z)e^{z\bar{t}}dz + \int_{BC_+} p(z)e^{z\bar{t}}dz + \int_{BC_-} p(z)e^{z\bar{t}}dz\right\}.   (4.98)

Evaluating each contribution in turn we get

\frac{1}{2\pi i}\int_C p(z)e^{z\bar{t}}dz = 1,   (4.99a)

\frac{1}{2\pi i}\int_{BC_+} p(z)e^{z\bar{t}}dz = \frac{1}{2\pi i}\int_{0^-}^{1}\frac{-dr_1\,e^{-r_1\bar{t}}\,e^{-\bar{x}\left(i\sqrt{r_1(1-r_1)}+r_1\right)}}{-r_1},   (4.99b)

\frac{1}{2\pi i}\int_{BC_-} p(z)e^{z\bar{t}}dz = \frac{1}{2\pi i}\int_{1}^{0^-}\frac{-dr_1\,e^{-r_1\bar{t}}\,e^{-\bar{x}\left(-i\sqrt{r_1(1-r_1)}+r_1\right)}}{-r_1}.   (4.99c)

Upon summation and a change of the integration variable from r_1 to ξ via r_1 = ξ^2 we get

a(\bar{t},\bar{x}) = \left\{1 - \frac{2}{\pi}\int_0^1 e^{-\xi^2(\bar{t}+\bar{x})}\,\frac{\sin\left(\bar{x}\xi\sqrt{1-\xi^2}\right)}{\xi}\,d\xi\right\}U(\bar{t})   (4.100)

or, reverting to the A(t,x) in (4.94),

A(t,x) = a(\bar{t}-\bar{x},\bar{x}) = \left\{1 - \frac{2}{\pi}\int_0^1 e^{-\xi^2\bar{t}}\,\frac{\sin\left(\bar{x}\xi\sqrt{1-\xi^2}\right)}{\xi}\,d\xi\right\}U(\bar{t}-\bar{x}).   (4.101)

A plot of (4.101) is shown in Fig. 4.12 for three positions within the medium.

A plot of (4.101) is shown in Fig. 4.12 at three positions within the medium.

Page 273: Signals and transforms in linear systems analysis

260 4 Laplace Transforms

0 10 20 30 40 50 60 70 80 90 1000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

OBSERVATION TIME RELATIVE TO MEDIUMRELAXATION TIME (t /tau)

E-F

IELD

AT

x/(

v*ta

u) D

UE

TO

UN

ITIN

CID

EN

T E

-FIE

LD A

T x

=0

x/(v*tau)=5

x/(v*tau)=1

x/(v*tau)=0.1

Figure 4.12: Response of conducting medium to unit step excitation

4.2 Double-Sided Laplace Transform

4.2.1 Definition and Analytic Properties

By analogy with functions of exponential order at t ∼ ∞ we can define functions of exponential order at t ∼ −∞. Adapting our previous shorthand notation to this case we write

f(t) \sim O_{t\sim-\infty}\left(e^{\beta t}\right),   (4.102)

by which we shall mean that there exist constants t_0, β, and M such that |f(t)| < Me^{\beta t} for all t < t_0. Again by analogy with (4.3) in Sect. 4.1, we now define the transform over the negative t-axis by

\hat{F}(s) = \int_{-\infty}^{0} f(t)e^{-st}dt.   (4.103)

Clearly, with a trivial replacement of variables, all the steps in the proof of (4.8a) and (4.8b) in Sect. 4.1 retain their validity, so that

\hat{F}(s) is an analytic function of s for Re s < β   (4.104a)

and

\lim_{|s|\to\infty}\hat{F}(s) \to 0 for Re s < β.   (4.104b)

We can combine the unilateral transform F(s) and \hat{F}(s) into a single function

F_{II}(s) = F(s) + \hat{F}(s),   (4.105)

where \hat{F}(s) is analytic for Re s < β while F(s) is analytic for Re s > α. If β > α, then the two functions \hat{F}(s) and F(s) have a common region of analyticity, i.e., the vertical strip defined by

\alpha < \operatorname{Re} s < \beta,   (4.106)

as illustrated in Fig. 4.13.

[Figure 4.13: Strip of analyticity of a bilateral LT: \hat{F}(s) analytic for Re s < β, F(s) analytic for Re s > α.]

Under these circumstances we define F_{II}(s) as the double-sided (or bilateral) LT and write

F_{II}(s) = \int_{-\infty}^{\infty} f(t)e^{-st}dt,   (4.107)

which from the preceding discussion requires that f(t) be of exponential order at ±∞, i.e.,

f(t) \sim \begin{cases} O\left(e^{\beta t}\right), & t \sim -\infty,\\ O\left(e^{\alpha t}\right), & t \sim \infty.\end{cases}   (4.108)

4.2.2 Inversion Formula

To derive the inversion formula for the bilateral LT we let s = γ + iω with α < γ < β and rewrite (4.107) as follows:

F_{II}(\gamma+i\omega) = \int_{-\infty}^{\infty} f(t)e^{-\gamma t}e^{-i\omega t}dt.   (4.109)

Evidently F_{II}(γ + iω) is the FT of f(t)e^{-\gamma t}. Therefore we may employ the FT inversion formula and write

f(t)e^{-\gamma t} = \frac{1}{2\pi}\int_{-\infty}^{\infty} F_{II}(\gamma+i\omega)e^{i\omega t}d\omega.   (4.110)

[Figure 4.14: Integration path for the bilateral LT: the line Re s = γ inside the strip α < Re s < β.]

If we now multiply both sides by e^{\gamma t} and revert to the complex variable s = γ + iω, we get

f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F_{II}(s)e^{st}d\omega.

Since γ is fixed, ds = i\,dω. If we also express the limits of integration in terms of s, the preceding integral becomes

f(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} F_{II}(s)e^{st}ds,   (4.111)

which is the desired result. The integration path intercept γ may be located anywhere within the strip of analyticity, as shown in Fig. 4.14.

Equation (4.111) is identical in appearance to the formula for the unilateral LT. The distinction between the two lies in the choice of the integration paths as defined by the intercept γ. Recall that for the unilateral transform we required analyticity for Re s > α, which in accordance with (4.106) implies that β = ∞. In general a given F_{II}(s) may possess several strips of analyticity, so that the corresponding signal is not unique but depends on the prescription of γ. We illustrate this in the following example.

Example 1

F_{II}(s) = \frac{1}{(s-2)(s-1)}.

The two poles define the following three strips of analyticity:

−∞ < γ < 1,   (i)
1 < γ < 2,   (ii)
2 < γ < ∞.   (iii)

Let us evaluate

f(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{e^{st}}{(s-2)(s-1)}\,ds

for each of the three possible choices of integration paths.

Case (i). When t < 0 the exponential decays in the right half-plane (Re s > 0). Since F_{II}(s) decays at infinity, by Jordan's lemma an integral of F_{II}(s)e^{st} taken over a circular contour in the right half-plane vanishes as the radius of the circle is allowed to approach infinity. Therefore the line integral of F_{II}(s)e^{st} taken over the path γ is equivalent to an integral over a closed contour enclosing all the singularities of F_{II}(s) in the clockwise direction. Computing the (negative) residues at the poles s = 1 and s = 2 we obtain

f(t) = -\left.\frac{e^{st}}{s-2}\right|_{s=1} - \left.\frac{e^{st}}{s-1}\right|_{s=2} = e^t - e^{2t},\quad t < 0.

When t > 0 the exponential decays in the left half-plane (Re s < 0). In this case, again applying Jordan's lemma, the integral over a circular path in the left half-plane vanishes. The inversion integral is therefore equivalent to a closed contour integral enclosing all the singularities of F_{II}(s)e^{st} in the counterclockwise direction. Since in the present case there are no singularities to the left of γ, the result of the integration is identically zero. Hence

f(t) = 0,\quad t > 0.

Case (ii). Proceeding as above, for t < 0 we close the contour with a circular path in the right half-plane. The contributing pole is at s = 2 with a (negative) residue, giving

f(t) = -e^{2t},\quad t < 0.

For t > 0 we close the contour in the left half-plane. The contribution is a (positive) residue from the pole at s = 1, which is

f(t) = -e^{t},\quad t > 0.

Case (iii). Now there are no singularities to the right of γ, so that

f(t) = 0,\quad t < 0,

while for t > 0 both poles contribute positive residues, i.e.,

f(t) = -e^{t} + e^{2t},\quad t > 0.

Evidently case (iii) corresponds to the unilateral LT. Note that in case (i) γ = 0 is a permissible choice of integration path. Consequently, in this case, with the substitution s = iω the double-sided LT converts to the FT, viz.,

(e^t - e^{2t})U(-t) \overset{F}{\Longleftrightarrow} \frac{1}{(i\omega-2)(i\omega-1)}.
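The case (i) pair is readily verified by computing the FT of the anticausal signal directly. A sketch with SciPy assumed; the test frequency is arbitrary, and the lower limit −50 stands in for −∞:

```python
import numpy as np
from scipy.integrate import quad

w = 1.3                                    # arbitrary test frequency
f = lambda t: (np.exp(t) - np.exp(2 * t)) * np.exp(-1j * w * t)
val = (quad(lambda t: f(t).real, -50, 0)[0]
       + 1j * quad(lambda t: f(t).imag, -50, 0)[0])
print(val, 1 / ((1j * w - 2) * (1j * w - 1)))
```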

Example 2

The following transform has a pole of order 2 at s = −2 and a simple pole at s = 3:

F_{II}(s) = \frac{1}{(s+2)^2(s-3)}.

Again we have three strips of analyticity:

−∞ < γ < −2,   (i)
−2 < γ < 3,   (ii)
3 < γ < ∞.   (iii)

Proceeding as in Example 1 we get the following three functions.

Case (i):

f(t) = \begin{cases} -\dfrac{d}{ds}\left.\dfrac{e^{st}}{s-3}\right|_{s=-2} - \left.\dfrac{e^{st}}{(s+2)^2}\right|_{s=3} = \dfrac{e^{-2t}}{5}\left(t+\dfrac{1}{5}\right) - \dfrac{e^{3t}}{25}; & t < 0,\\[4pt] 0; & t > 0.\end{cases}

Case (ii):

f(t) = \begin{cases} -\dfrac{e^{3t}}{25}; & t < 0,\\[4pt] -\dfrac{e^{-2t}}{5}\left(t+\dfrac{1}{5}\right); & t > 0.\end{cases}

Case (iii):

f(t) = \begin{cases} 0; & t < 0,\\[4pt] -\dfrac{e^{-2t}}{5}\left(t+\dfrac{1}{5}\right) + \dfrac{e^{3t}}{25}; & t > 0.\end{cases}

Again case (iii) corresponds to the unilateral LT and case (ii) to the FT, F(ω) = F_{II}(iω). Hence in the second case we may also write the transform relationship

-\frac{e^{3t}}{25}U(-t) - \frac{e^{-2t}}{5}\left(t+\frac{1}{5}\right)U(t) \overset{F}{\Longleftrightarrow} \frac{1}{(i\omega+2)^2(i\omega-3)}.

Example 3

Next let us consider the transform

F_{II}(s) = \frac{1}{(s^2+4)(s+3)},

which has two simple poles at s = ±i2 on the imaginary axis and a simple pole at s = −3. Again the inverse can be one of three functions, corresponding to the strips

−∞ < γ < −3,   (i)
−3 < γ < 0,   (ii)
0 < γ < ∞.   (iii)

Clearly case (iii) corresponds to the unilateral LT, for which the inverse reads

f(t) = \left\{\left.\frac{e^{st}}{(2s)(s+3)}\right|_{s=i2} + \left.\frac{e^{st}}{(2s)(s+3)}\right|_{s=-i2} + \left.\frac{e^{st}}{s^2+4}\right|_{s=-3}\right\}U(t)
= \left(-\frac{1}{13}\cos 2t + \frac{3}{26}\sin 2t + \frac{e^{-3t}}{13}\right)U(t).

The FT of this function exists and we could compute it directly. Can we also determine it from F_{II}(s)? Since the strip of analyticity excludes the imaginary axis (γ = 0 is not permitted), the substitution s = iω employed in Examples 1 and 2 is no longer permissible. Our guidance here should be the inversion formula. Evidently the offending singularities are the two poles at s = ±i2, which we can circumnavigate with two semicircular contours while keeping the remaining paths along the imaginary axis γ = 0, as shown in Fig. 4.15.

We may then write the inversion formula as follows:

f(t) = \frac{1}{2\pi}\lim_{\varepsilon\to 0}\left(\int_{-\infty}^{-2-\varepsilon} + \int_{-2+\varepsilon}^{2-\varepsilon} + \int_{2+\varepsilon}^{\infty}\right)\frac{e^{i\omega t}\,d\omega}{(-\omega^2+4)(i\omega+3)}
+ \frac{1}{2\pi i}\lim_{\varepsilon\to 0}\int_{C_-}\frac{e^{st}\,ds}{(s^2+4)(s+3)} + \frac{1}{2\pi i}\lim_{\varepsilon\to 0}\int_{C_+}\frac{e^{st}\,ds}{(s^2+4)(s+3)}.   (4.112)

The integral along the straight-line path will be recognized as a sum of two CPV integrals, while each of the two integrals along the semicircular paths (including the 1/2πi prefactor) equals one-half the residue at the respective pole. Therefore (4.112) is equivalent to

f(t) = \frac{1}{2\pi}P\int_{-\infty}^{\infty}\frac{e^{i\omega t}\,d\omega}{(-\omega^2+4)(i\omega+3)} + \frac{e^{i2t}}{8(-2+i3)} + \frac{e^{-i2t}}{8(-2-i3)}.   (4.113)

[Figure 4.15: Deformation of the LT integration path around the simple poles at s = ±i2 on the imaginary axis, with semicircular indentations C_± of radius ε; the pole at s = −3 lies to the left.]

Note that consistency with the FT inversion formula is obtained by recognizing that the last two terms in (4.113) result from a pair of delta functions. The FT is then

F(\omega) = \frac{1}{(-\omega^2+4)(i\omega+3)} + \pi\,\frac{\delta(\omega-2)}{4(-2+i3)} + \pi\,\frac{\delta(\omega+2)}{4(-2-i3)}.   (4.114)

To verify that the two terms involving delta functions yield the correct signal in the time domain we evaluate

\frac{1}{2\pi}\int_{-\infty}^{\infty}\left\{\pi\,\frac{\delta(\omega-2)}{4(-2+i3)} + \pi\,\frac{\delta(\omega+2)}{4(-2-i3)}\right\}e^{i\omega t}\,d\omega,

which yields the last two terms in (4.113). It is worth noting that the noncausal signal corresponding to case (ii) (−3 < γ < 0) also possesses a Fourier transform. In this case the path of integration in Fig. 4.15 will approach the imaginary axis from the left, and the integrals taken over the two semicircular indentations will each yield −iπ times the residue at each pole. As a result we get

F(\omega) = \frac{1}{(-\omega^2+4)(i\omega+3)} - \pi\,\frac{\delta(\omega-2)}{4(-2+i3)} - \pi\,\frac{\delta(\omega+2)}{4(-2-i3)}.   (4.115)

Example 4

Consider now the transform

F_{II}(s) = \frac{s^3}{(s^2+4)(s+3)}.   (4.116)

Unlike in Example 3, this transform does not vanish at infinity, so the Jordan lemma does not apply. However, by subtracting its value at infinity we can represent F_{II}(s) as the sum F_{II}(s) = F_{II}(\infty) + \tilde{F}_{II}(s), where \tilde{F}_{II}(\infty) = 0. Thus

F_{II}(s) = 1 + \left\{\frac{s^3}{(s^2+4)(s+3)} - 1\right\} = 1 - \frac{3s^2+4s+12}{(s^2+4)(s+3)}.   (4.117)

The inverse transform of the constant yields a delta function, while the second term can represent three possible functions, depending on the choice of the intercept γ, as in Example 3. In particular, the Fourier transform of the causal signal (γ > 0) is

F(\omega) = 1 - \frac{-3\omega^2+i4\omega+12}{(-\omega^2+4)(i\omega+3)} - 2\pi\left\{\frac{\delta(\omega-2)}{3+i2} + \frac{\delta(\omega+2)}{3-i2}\right\}.   (4.118)

4.2.3 Relationships Between the FT and the Unilateral LT

Determination of the FT from the Unilateral Laplace Transform

As evidenced by the preceding discussion, the Fourier transform F(ω) of a signal with a unilateral LT F_I(s) that is analytic for Re s ≥ 0 is F_I(iω). On the other hand, as shown in Examples 3 and 4, if F_I(s) is analytic for Re s > 0 but fails to be analytic on the imaginary axis because of simple pole singularities at s = iω_k, the Fourier transform assumes the form

F(\omega) = F_I(i\omega) + \pi\sum_k \mathrm{res}\{F_I(i\omega_k)\}\,\delta(\omega-\omega_k),   (4.119)

where res{F_I(iω_k)} is the residue of F_I(s) at s = iω_k. For poles of higher order (or for essential singularities) on the imaginary axis the Fourier transform does not exist, since in that case the integration about a semicircular path (as, e.g., in Fig. 4.15) enclosing the singularity tends to infinity as the radius of the semicircle approaches zero. This is consistent with the observation that inverse LTs of functions with poles on the imaginary axis of order higher than unity yield signals that are unbounded at infinity (e.g., 1/s^2). Cases for which F_I(s) is analytic for Re s > 0 with branch point singularities on the imaginary axis generally do not yield delta functions. However, the replacement of s by iω in such cases must be understood in the limit as the imaginary axis is approached from the right half of the complex plane, i.e.,

F(\omega) = \lim_{\sigma\to 0^+} F_I(\sigma+i\omega).   (4.120)

Determination of the Unilateral Laplace Transform from the FT

It would appear that the solution to the converse problem, i.e., that of determining the unilateral LT from the FT of a causal signal, amounts to the replacement of variables F(s/i) and, if necessary, making the adjustment for the presence of simple poles as prescribed by (4.119). This approach always works for rational functions. A formal approach that works in all cases can be constructed by noting that with s = iω + σ

F_I(s) = \int_0^{\infty} f(t)e^{-(i\omega+\sigma)t}dt = \int_{-\infty}^{\infty} f(t)e^{-(i\omega+\sigma)t}U(t)\,dt   (4.121)

for σ > 0 and any f(t) whose FT is F(ω). Since

\int_{-\infty}^{\infty} f(t)e^{-(i\omega+\sigma)t}U(t)\,dt = \int_{-\infty}^{\infty}\left[f(t)e^{-\sigma t}U(t)\right]e^{-i\omega t}dt,   (4.122)

we are dealing with the FT of the product of two functions, so that we can use the frequency convolution theorem to obtain

\int_{-\infty}^{\infty}\left[f(t)e^{-\sigma t}U(t)\right]e^{-i\omega t}dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{F(\eta)\,d\eta}{\sigma+i(\omega-\eta)}.   (4.123)

Hence with s = iω + σ the preceding, combined with (4.121), yields

F_I(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{F(\eta)\,d\eta}{s-i\eta},   (4.124)

which is an analytic function for Re s > 0. Note that (4.124) is valid irrespective of whether F(ω) is the FT of a causal function. Thus, in case the inverse of F(ω) is not causal, the inverse of F_I(s) is simply the inverse of F(ω) truncated to nonnegative time values. For example, for F(ω) = p_a(ω) we obtain

F_I(s) = \frac{i}{2\pi}\ln\left(\frac{s-ia}{s+ia}\right).   (4.125)

This function has two logarithmic branch points at s = ±ia. A branch cut represented by a straight line connecting these branch points can be used to ensure analyticity of F_I(s) everywhere on the top Riemann sheet, and in particular for Re s > 0. It can be shown by a direct application of the inversion formula that

\frac{\sin at}{\pi t}\,U(t) \overset{L}{\Longleftrightarrow} \frac{i}{2\pi}\ln\left(\frac{s-ia}{s+ia}\right).   (4.126)

To find the FT of this causal signal requires applying the limiting form (4.120) to (4.125), which can be shown to yield the FT pair given in (3.35) in Sect. 3.2.

Problems

1. Solve the following differential equation using LTs:

   \frac{d^2y(t)}{dt^2} + 3\frac{dy(t)}{dt} + 2y(t) = f(t)

   when

   (a) f(t) = 0, y(0^+) = 0, y'(0^+) = 1; t ≥ 0
   (b) f(t) = e^{-t}U(t), y(0^+) = y'(0^+) = 0; t ≥ 0
   (c) f(t) = \cos 4t\,U(t+3); y(-3^+) = y'(-3^+) = 0; t ≥ −3

2. Find the LT of the following functions:

   (a) \frac{e^{-3(t-1)}}{\sqrt{2t}}\,U(t)
   (b) t^{3/2}e^{-5t}U(t)
   (c) \frac{e^{-2t}}{\sqrt{t-2}}\,U(t-2)
   (d) e^{-4t}t^{5/2}\sin 2t\,U(t)

3. Find the causal inverse LT of the following functions:

   (a) \frac{s^2}{s^2+4}
   (b) \frac{e^{-s}}{\sqrt{s+2}}
   (c) \frac{1}{\sqrt{s}\,(1-e^{-2s})}
   (d) \frac{s^2+1}{(s+2)^2}

4. For a linear system defined by the differential equation

   \frac{d^2y(t)}{dt^2} - \frac{dy(t)}{dt} - 6y(t) = x(t),

   where y(t) is the output and x(t) the input, find the impulse response h(t) for each of the following cases:

   (a) The system is causal.
   (b) The system is stable.
   (c) The system is neither causal nor stable.

5. The inverse of the LT

   F(s) = \frac{s-1}{(s+2)(s+3)(s^2+s+1)}

   can represent several functions of time. Find the functions.

6. The inverse of the LT F(s),

   F(s) = \frac{s+3}{(s^2+9)(s-2)},

   can represent several functions of time.

   (a) Find the functions.
   (b) Identify the function that has a single-sided LT.
   (c) Identify the function that has a Fourier transform. Find the Fourier transform.

7. A causal LTI system is defined by the system function

   H(s) = \frac{s+2}{s^2+2s+2}.

   Find the output when the input is e^{-2|t|}.

8. Find the causal inverse LT of

   F(s) = \frac{e^{-a\sqrt{s-1}}}{s-1},

   where a > 0.

where a > 0.

Chapter 5

Bandlimited Functions, Sampling, and the Discrete Fourier Transform

5.1 Bandlimited Functions

5.1.1 Fundamental Properties

In the preceding chapters we have dealt exclusively with signals and their transformations defined on a continuous time interval. Such signals can be referred to as analogue signals. In this chapter we introduce the important topic of discrete signals, i.e., signals that are represented by sequences of numbers. Sometimes these numbers can be interpreted as samples of a continuous signal taken at discrete time intervals. Such an interpretation is, however, not always possible or, for that matter, even necessary. Nevertheless, the subject is best introduced by starting with the idea of sampling. For this purpose we shall initially restrict the class of analogue signals which we propose to sample to so-called bandlimited functions. We shall call a function (analogue signal) f(t) bandlimited if its Fourier transform F(ω) vanishes outside a finite frequency interval, i.e.,

|F(\omega)| = 0;\quad |\omega| > \Omega.   (5.1)

No essential loss of generality is incurred if we simplify matters and also assume that¹

\int_{-\Omega}^{\Omega} |F(\omega)|\,d\omega < \infty.   (5.2)

¹In particular, this excludes functions of the type (1/i\omega)\,p_\Omega(\omega).

In view of (5.1), f(t) can be represented by

f(t) = \frac{1}{2\pi}\int_{-\Omega}^{\Omega} F(\omega)e^{i\omega t}d\omega,   (5.3)

while the direct transform may be computed in the usual way from

F(\omega) = \int_{-\infty}^{\infty} f(t)e^{-i\omega t}dt.   (5.4)

Using (5.3), the n-th derivative of f(t) is

\frac{d^n}{dt^n}f(t) = \frac{1}{2\pi}\int_{-\Omega}^{\Omega}(i\omega)^n F(\omega)e^{i\omega t}d\omega,

which in view of (5.2) converges for all n and all finite real and complex values of t. This shows that f(t) possesses derivatives of all orders for all finite t. We recall that this property defines an analytic function. Consequently f(t) is analytic for all finite t (an entire function). A fundamental property of an analytic function is that it may not vanish over any finite segment of its independent variable without vanishing identically. In particular, this means that f(t) cannot be identically zero on any finite segment of the real t axis. Thus a bandlimited function cannot be simultaneously timelimited (i.e., be truncated to a finite time segment, say −T/2 < t < T/2, while its Fourier transform satisfies (5.1) and (5.2)). Hence all bandlimited signals are necessarily of infinite duration. Since we know that all signals in practice must be of finite duration, the concept of a bandlimited signal may initially strike one as hopelessly artificial. As will be discussed in the sequel, the practical utility of this concept arises largely from the fact that the actual theoretical duration of a bandlimited analogue signal is less important than the number of discrete samples in terms of which the signal may be represented. In practice the latter are always finite in number.

An important property of a bandlimited function is that the Fourier integral (5.4) may be replaced by a sum taken over discrete samples of the function. To obtain such a representation we expand the Fourier transform F(ω) in Fig. 5.1 in a Fourier series within the interval (−Ω, Ω):

[Figure 5.1: Fourier transform limited to a finite band and its periodic extension.]

F(\omega) = \sum_{n=-\infty}^{\infty} c_n e^{-in\pi\omega/\Omega},   (5.5)

where the Fourier series expansion coefficients c_n may be computed in the usual way, i.e.,

c_n = \frac{1}{2\Omega}\int_{-\Omega}^{\Omega} F(\omega)e^{in\pi\omega/\Omega}d\omega.   (5.6)

The series (5.5) converges in the mean to the prescribed function within the closed interval (−Ω, Ω) and to the periodic extension of F(ω) outside this interval, as indicated in Fig. 5.1. Presently we consider the latter an artifact of no direct interest to us. Comparing the last formula with (5.3), we readily make the identification

c_n = \Delta t\,f(n\Delta t),   (5.7)

where

\Delta t = \frac{\pi}{\Omega},   (5.8)

so that the Fourier series expansion coefficients of the Fourier transform are proportional to the samples f(nΔt) of the function taken at uniform intervals Δt. We make this explicit by putting (5.5) into the form

F(\omega) = \sum_{n=-\infty}^{\infty} f(n\Delta t)e^{-i\omega n\Delta t}\Delta t.   (5.9)

Thus for a bandlimited function the Fourier transform (5.4) has an alternative representation in the form of a series with coefficients proportional to the samples of the function spaced Δt apart. Upon closer examination we note that this series also has the form of a Riemann sum approximation to (5.4), which should converge to the integral only in the limit as Δt approaches zero. That for bandlimited functions convergence appears possible also for finite sampling intervals may initially seem a startling result. It is, however, perfectly consistent with the fact that Δt f(nΔt) is a Fourier series coefficient. Equation (5.9) can be used to represent F(ω) provided Δt ≤ π/Ω. This follows from the observation that when Δt′ ≤ Δt, any function bandlimited to π/Δt is also bandlimited to π/Δt′.

From another perspective, (5.9) as a Fourier series is subject to the same convergence constraints as the Fourier series in the time domain discussed in 2.1. The only difference lies in the interpretation. In Sect. 2.1 we dealt with the expansion of a waveform limited in time to the interval (−T, T). Here we use the same representation for a Fourier transform limited to the frequency band (−Ω, Ω) and interpret the expansion coefficients as products of signal samples and the sampling interval. In fact, one can conceive of a dual situation wherein the expansion coefficients of a time-limited waveform are interpreted as products of samples of the frequency spectrum and the frequency sampling interval.

Recall that improvements of resolution in the time domain generally require an increase in the number of Fourier series coefficients. Similarly, to obtain better spectral resolution requires an increase in the number of signal samples. Since we are dealing with uniformly spaced samples, any increase in the number of samples requires a longer set of data. This is illustrated in Fig. 5.2, which shows the magnitude of a spectral density tending to a pronounced peak as the signal duration is increased from 16Δt to 64Δt. It turns out that in this particular example nearly full resolution is reached with 64 samples, so that further increases in signal duration will not alter the spectral density.

[Figure 5.2: Spectral resolution and signal duration: spectra for N = 16, 32, and 64 samples.]

A somewhat different situation arises when the spectral density changes significantly within a narrow band. An extreme example of this is the functional form

F(\omega) = \begin{cases} 3; & |\omega| < 0.5,\\ 1; & 0.5 < |\omega| < 1.\end{cases}

A step change in this pure form is clearly not physically realizable. Nevertheless, idealizations of this sort are frequently of value in preliminary assessments of the performance of bandpass and band-suppression filters. Unlike for the "smooth" spectral forms in Fig. 5.2, convergence of the FS at step discontinuities is nonuniform. Figure 5.3 shows the spectra for signal durations of 16Δt and 64Δt. As in the mathematically equivalent case in the time domain, we see already for N = 64 the beginning of the Gibbs phenomenon's characteristic oscillatory convergence.

5.1.2 The Sampling Theorem

Since (5.9) represents a Fourier transform, we are entitled to substitute it in (5.3). This gives us a representation of the signal on a continuum of t. With this substitution we obtain

f(t) = \frac{1}{2\pi}\int_{-\Omega}^{\Omega}\left\{\sum_{n=-\infty}^{\infty} f(n\Delta t)e^{-in\Delta t\,\omega}\Delta t\right\}e^{i\omega t}d\omega.

[Figure 5.3: Spectrum with step discontinuities: FS approximations for N = 16 and N = 64.]

Integrating the exponential yields

f(t) = \sum_{n=-\infty}^{\infty} f(n\Delta t)\,\frac{\sin(\Omega t - n\pi)}{\Omega t - n\pi},   (5.10)

which is the famous Shannon sampling theorem. An alternative form, wherein Ω is replaced by π/Δt, is

f(t) = \sum_{n=-\infty}^{\infty} f(n\Delta t)\,\frac{\sin[(t/\Delta t - n)\pi]}{(t/\Delta t - n)\pi},   (5.11)

which highlights the interpolation aspect of this representation. In fact, the interpolation formula for sinusoids, (2.74), reduces to the Shannon sampling representation for sufficiently large M. The sampling interval Δt is generally referred to as the Nyquist interval and its reciprocal as the Nyquist sampling rate, usually expressed in Hz. If we denote the (single-sided) bandwidth of the signal by B = Ω/2π, then Δt = 1/2B. As already noted in conjunction with (5.9), a signal bandlimited to B is also bandlimited to B′ > B, so that (5.11) remains valid for any 1/2B′ = Δt ≤ 1/2B. A signal sampled at a rate greater than the Nyquist rate is said to be oversampled. We will postpone the discussion of the consequences of sampling at a rate lower than the Nyquist rate (i.e., undersampling) to Sect. 5.3.

which highlights the interpolation aspect of this representation. In fact the in-terpolation formula for sinusoids, 2.74, reduces to the Shannon sampling repre-sentation for sufficiently large M. The sampling interval Δt is generally referredto as the Nyquist interval and its reciprocal as the Nyquist sampling rate, usu-ally expressed in Hz. If we denote the (single-sided) bandwidth of the signalby B = Ω/2π, then Δt = 1/2B. As already noted in conjunction with (5.9) asignal bandlimited to B is also bandlimited to B′ > B, so that (5.11) remainsvalid for any 1/2B′ = Δt ≤ 1/2B. A signal sampled at a rate greater thanthe Nyquist rate is said to be oversampled. We will postpone the discussionof the consequences of sampling at a rate lower than the Nyquist rate (i.e.,undersampling) to Sect. 5.3.

Observe that for any integer k substitution of t = kΔt into the argumentof the sin(x)/x function in (5.11) gives zero for n = k and unity when n = k,thus verifying directly the convergence of the sum to f (kΔt). Since the ex-pansion coefficients of (5.11) are the signal samples and the sum reproducesthe function at these sample values exactly, (5.11) may be considered an in-terpolation formula. It is actually more than an interpolation formula sinceunlike interpolation formulas in 2.1.8, it reproduces the given function exactly

Page 289: Signals and transforms in linear systems analysis

276 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

at all intermediate values of t. In fact (5.11) may also be viewed as an LMSrepresentation of f (t) in terms of the expansion functions

φn (t/Δt) = sin [(t/Δt− n)π] / [(t/Δt− n)π] . (5.12)

It is not hard to show that these functions are orthogonal over −∞,∞:

∫ ∞

−∞φn (t/Δt)φm (t/Δt) dt = Δtδnm. (5.13)

Since the series converges pointwise, the LMS error must, of course, be zero. Ifin (5.11) we set f (nΔt) = fn we can apply Parseval’s formula (1.126) to obtain

∫ ∞

−∞|f (t)|2 dt =

∞∑

n=−∞|f (nΔt)|2 Δt, (5.14)

This result could also have been obtained by first applying Parseval’s theoremto the Fourier series (5.9) and then to the Fourier transform, i.e., using

1

∫ Ω

−Ω

|F (ω)|2 dω =

∫ ∞

−∞|f (t)|2 dt.

Eq. (5.14) states that the energy in the bandlimited analogue signal may beviewed as if it were distributed among its sampled values. The right side of(5.14), just as the Fourier series (5.9), looks like a Riemann sum approximationbut is in fact an exact representation of the integral to its left. As in (5.9) wehave the option of choosing for Δt any value less or equal to 1/2B.

5.1.3 Sampling Theorem for Stationary Random Processes*

Frequently it is more appropriate to model signals as random processes rather than as definite deterministic functions. A particularly important class of random processes are stationary processes, one of whose attributes is that they extend over an infinite time interval. Here we examine the meaning of bandlimiting for such processes as well as their representation in terms of sampled values. Unlike deterministic signals, sample functions of stationary random processes do not possess Fourier transforms, so that the definition of a bandlimited random function as in (5.1) and (5.2) cannot be used directly. One way out of this difficulty is not to focus on the properties of individual sample functions but to define bandlimiting in terms of a statistical (ensemble) average. A suitable average is the autocorrelation function, the FT of which is the power spectrum of the process. Thus a bandlimited stationary random process may be defined as a process whose power spectrum S(ω) vanishes identically for |ω| > Ω, i.e.,

S(\omega) = \int_{-\infty}^{\infty} R(\tau)e^{-i\omega\tau}d\tau = 0;\quad |\omega| > \Omega,   (5.15)

where

R(\tau) = \langle x(t+\tau)x^*(t)\rangle   (5.16)

is the autocorrelation function and x(t) is a sample function of the process. We will show that for such a sample function

\lim_{K\to\infty}\left\langle\left|x(t) - \sum_{\ell=-K}^{K} x(\ell\Delta t)\,\varphi_\ell(t/\Delta t)\right|^2\right\rangle = 0,   (5.17)

where φ_ℓ(t/Δt) is given by (5.12) and Δt ≤ π/Ω. Equation (5.17) states that the Shannon sampling theorem is also valid for stationary random processes provided one interprets the convergence in the (statistical) mean squared sense.

To prove (5.17) we first expand the squared magnitude in (5.17) and perform the statistical average term by term. Thus

\left\langle\left|x(t) - \sum_{\ell=-K}^{K} x(\ell\Delta t)\,\varphi_\ell(t/\Delta t)\right|^2\right\rangle = R(0) - \sum_{\ell=-K}^{K} R(\ell\Delta t - t)\,\varphi_\ell(t/\Delta t) - \sum_{\ell=-K}^{K} R(t - \ell\Delta t)\,\varphi_\ell(t/\Delta t) + \sum_{n=-K}^{K}\sum_{\ell=-K}^{K} R[(\ell-n)\Delta t]\,\varphi_\ell(t/\Delta t)\,\varphi_n(t/\Delta t).   (5.18)

Next we substitute for the correlation function in (5.18) its spectral representation (5.15) to obtain

\left\langle\left|x(t) - \sum_{\ell=-K}^{K} x(\ell\Delta t)\,\varphi_\ell(t/\Delta t)\right|^2\right\rangle
= \frac{1}{2\pi}\int_{-\Omega}^{\Omega} d\omega\,S(\omega)\left[1 - \sum_{\ell=-K}^{K} e^{i\omega(\ell\Delta t - t)}\varphi_\ell(t/\Delta t) - \sum_{\ell=-K}^{K} e^{-i\omega(\ell\Delta t - t)}\varphi_\ell(t/\Delta t) + \sum_{n=-K}^{K}\sum_{\ell=-K}^{K} e^{i\omega(\ell-n)\Delta t}\varphi_\ell(t/\Delta t)\,\varphi_n(t/\Delta t)\right]
= \frac{1}{2\pi}\int_{-\Omega}^{\Omega} d\omega\,S(\omega)\left|e^{i\omega t} - \sum_{\ell=-K}^{K} e^{i\omega\ell\Delta t}\varphi_\ell(t/\Delta t)\right|^2.   (5.19)

On the other hand, a direct development of e^{iωt} in a Fourier series in ω in the interval |ω| ≤ π/Δt, with t as a parameter, gives

e^{i\omega t} = \sum_{\ell=-\infty}^{\infty} e^{i\omega\ell\Delta t}\varphi_\ell(t/\Delta t).   (5.20)

Hence in the limit as K → ∞ the squared magnitude in the integrand of (5.19) vanishes, thus proving (5.17).

5.2 Signals Defined by a Finite Number of Samples

As we have seen, one fundamental attribute of bandlimiting is that the corresponding signal is necessarily of infinite duration. No incompatibility with the bandlimited nature of the signal is implied, however, if we assume that the number of nonzero samples is finite. For definiteness, assume that these correspond to indices n = 0, 1, 2, ..., N−1. Then (5.11) becomes

f(t) = \sum_{n=0}^{N-1} f(n\Delta t)\,\frac{\sin[(t/\Delta t - n)\pi]}{(t/\Delta t - n)\pi}   (5.21)

and the corresponding FT reads

F(\omega) = \sum_{n=0}^{N-1} f(n\Delta t)e^{-in\Delta t\,\omega}\Delta t;\quad |\omega| \le \Omega.   (5.22)

Note that the signal itself is of infinite duration, as required by the bandlimited nature of the spectrum. For example, a bandlimited signal of infinite duration that can be represented by only one nonzero Nyquist sample is

f(t) = f(0)\,\frac{\sin[(t/\Delta t)\pi]}{(t/\Delta t)\pi},

for which the spectrum is the rectangle \Delta t\,f(0)\,p_{\pi/\Delta t}(\omega). A bandlimited signal comprised of the seven nonzero samples 6, 4, 7, 6, 6, 4, 5 is plotted in Fig. 5.4.

[Figure 5.4: Bandlimited signal comprised of seven nonzero samples.]

The sampling interval equals 1 s, so that these samples are numerically equal to the expansion coefficients of the Fourier series expansion of the signal spectrum, which is

F(\omega) = \begin{cases}\sum_{n=17}^{23} f(n)e^{-i\omega n}; & |\omega| \le \pi,\\ 0; & |\omega| > \pi.\end{cases}

A plot of |F(ω)| is shown in Fig. 5.5.

[Figure 5.5: Magnitude of the FT of the function in Fig. 5.4.]
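The spectrum behind Fig. 5.5 is a one-line computation. A sketch with NumPy assumed, using the seven samples given above:

```python
import numpy as np

fn = np.array([6, 4, 7, 6, 6, 4, 5], dtype=float)
n = np.arange(17, 24)                       # sample indices, Delta t = 1 s
w = np.linspace(-np.pi, np.pi, 1001)
F = np.exp(-1j * np.outer(w, n)) @ fn       # the finite sum above
print(np.abs(F).max())                      # 38 = sum of samples, at w = 0
```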

In accordance with (5.14), the total energy in this signal is simply the sum of the squares of the seven samples. Thus, even though the analogue signal itself occupies an infinite time interval, it is reasonable to assign it an effective duration of only 7 s. In general, when the nonzero samples span the time interval 0 ≤ t ≤ T with T = NΔt, as in (5.22), the signal energy is

E = \int_{-\infty}^{\infty} |f(t)|^2 dt = \sum_{n=0}^{N-1} |f(n\Delta t)|^2\,\Delta t.   (5.23)

If f(0) and f[(N−1)Δt] are not zero, T may be taken as the effective signal duration. Of course, this definition does not preclude that one or more samples in the interior of the interval equal zero, but only that all samples falling outside 0 ≤ t ≤ T are zero.

The product of the (double-sided) signal bandwidth 2B Hz and the (effective) signal duration T, i.e., (2Ω/2π)T = 2BT ≡ N, is an important dimensionless parameter in signal analysis. It represents the number of sinusoids comprising the signal spectrum or, in view of the Shannon representation (5.21), the number of sinc functions comprising the signal itself. In the spirit of our discussion of alternative representations in Chap. 2, we may think of these expansion functions as independent coordinates and regard N as the geometrical dimensionality of the signal space or, to borrow a term from mechanics, as the number of degrees of freedom of the signal. In the general case the dimensionality of a bandlimited signal can be infinite, and truncation to a finite number of Nyquist samples must lead to errors. The nature of these errors can be assessed by recalling the one-to-one correspondence between the Nyquist samples and the coefficients in the Fourier series representation of the spectrum of the signal. As will be recalled from the examples discussed in Chap. 2, truncation of a Fourier series representing a signal in the time domain results in a loss of detail (resolution). Because (5.9) is a Fourier expansion in the frequency domain, its truncation results in a reduction of resolution in the frequency domain. For example, consider the signal comprised of two pure tones

f(t) = \cos[(\Omega/4)t] + \cos[(5\Omega/16)t].   (5.24)

Because the higher of the two frequencies is 5Ω/16 rps, the maximum permissible sampling interval may not exceed 16π/5Ω s. Choosing Δt = π/Ω gives

f(n\Delta t) = \cos(n\pi/4) + \cos(5n\pi/16),   (5.25)

with n = 0, 1, ..., N−1. Employing (5.22), the magnitude of the Fourier transform based on 16 samples is plotted in Fig. 5.6.

[Figure 5.6: Spectrum of the signal in (5.24) using 16 samples, plotted against ωΔt.]

Only two peaks appear, symmetrically disposed with respect to zero frequency, which is consistent with the presence of a single tone. Evidently the two frequencies present in (5.24) are not resolved. This is not surprising, for based on the effective signal duration of T = 16π/Ω s the Rayleigh resolution limit is 2π/T = Ω/8 rps, whereas the frequency separation of the two tones is only Ω/16 rps. Increasing the number of samples to 32 doubles the signal duration, so that the Rayleigh limit just matches the frequency separation. As confirmed by the plot in Fig. 5.7, the two tones are just beginning to be resolved.

[Figure 5.7: Spectrum of the signal in (5.24) using 32 samples, plotted against ωΔt.]
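The spectra of Figs. 5.6 and 5.7 can be regenerated from (5.22) and (5.25) directly. A sketch with NumPy assumed; the peak-picking threshold of 60% is an arbitrary choice, and the tones sit at ωΔt = π/4 ≈ 0.785 and 5π/16 ≈ 0.982, so the number of listed maxima reflects whether the tones are resolved:

```python
import numpy as np

w = np.linspace(0, np.pi, 4001)             # positive frequencies suffice
for N in (16, 32):
    n = np.arange(N)
    f = np.cos(n * np.pi / 4) + np.cos(5 * n * np.pi / 16)
    S = np.abs(np.exp(-1j * np.outer(w, n)) @ f)        # |F| up to Delta t
    m = (S[1:-1] > S[:-2]) & (S[1:-1] > S[2:]) & (S[1:-1] > 0.6 * S.max())
    print(N, np.round(w[1:-1][m], 3))       # prominent local maxima
```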

Use of more samples would further reduce the width of the spikes representing the individual frequency components, eventually approaching two pairs of delta functions as the number of samples tends to infinity.

5.2.1 Spectral Concentration of Bandlimited Signals

Suppose we have the freedom to assign arbitrary weights to N signal samples. We would like to find the set of weights that maximizes the concentration of energy within a prescribed sub-band (−Ω_0, Ω_0) of a signal with fixed energy bandlimited to (−Ω, Ω). To find such an optimum sequence we again represent the Fourier transform of the signal in terms of its N samples, i.e.,

F_N(\omega) = \sum_{n=0}^{N-1} f(n\Delta t)e^{-i\omega n\Delta t}\Delta t.   (5.26)

The corresponding total energy is then

E_\Omega = \frac{1}{2\pi}\int_{-\Omega}^{\Omega}\left|\sum_{n=0}^{N-1} f(n\Delta t)e^{-i\omega n\Delta t}\Delta t\right|^2 d\omega
= \frac{\Delta t^2}{2\pi}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1} f(n\Delta t)f^*(m\Delta t)\int_{-\Omega}^{\Omega} e^{-i\omega(n-m)\Delta t}d\omega
= \sum_{n=0}^{N-1} |f(n\Delta t)|^2\,\Delta t,   (5.27)

where in the final step we made use of (5.8). Next we carry out a similar calculation for the total energy within (−Ω_0, Ω_0). Again using (5.26) we get

E_{\Omega_0} = \frac{1}{2\pi}\int_{-\Omega_0}^{\Omega_0}\left|\sum_{n=0}^{N-1} f(n\Delta t)e^{-i\omega n\Delta t}\Delta t\right|^2 d\omega
= \frac{\Delta t^2}{2\pi}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1} f(n\Delta t)f^*(m\Delta t)\int_{-\Omega_0}^{\Omega_0} e^{-i\omega(n-m)\Delta t}d\omega
= \sum_{n=0}^{N-1}\sum_{m=0}^{N-1}\frac{\sin\left[\pi\frac{\Omega_0}{\Omega}(n-m)\right]}{\pi(n-m)}\,f(n\Delta t)f^*(m\Delta t)\,\Delta t,   (5.28)

where the n = m terms take the limiting value Ω_0/Ω.

Our objective is to maximize the ratio

\rho = \frac{E_{\Omega_0}}{E_\Omega}   (5.29)

for a constant E_Ω. The necessary manipulations are simplified if we rewrite (5.29) in matrix form. Accordingly we define the column vector

\mathbf{f} = [f(0)\ f(\Delta t)\ f(2\Delta t)\ f(3\Delta t)\ \ldots\ f((N-1)\Delta t)]^T   (5.30)

and the matrix A with elements

A_{nm} = \frac{\sin\left[\pi\frac{\Omega_0}{\Omega}(n-m)\right]}{\pi(n-m)},\qquad A_{nn} = \frac{\Omega_0}{\Omega}.   (5.31)

As a result we get

\rho = \frac{\mathbf{f}^T A\mathbf{f}}{\mathbf{f}^T\mathbf{f}}.   (5.32)

At the maximum the variation of ρ must vanish, which we express by

\delta\left(\frac{\mathbf{f}^T A\mathbf{f}}{\mathbf{f}^T\mathbf{f}}\right) = 0.   (5.33)

Since \mathbf{f}^T\mathbf{f} is held constant, the preceding is equivalent to

\delta\left(\mathbf{f}^T A\mathbf{f}\right) = 0.   (5.34)

The \mathbf{f} that satisfies (5.34) maximizes ρ and therefore the in-band energy, so that at the maximum

\rho = \frac{\mathbf{f}^T A\mathbf{f}}{\mathbf{f}^T\mathbf{f}} = \frac{E_{\Omega_0}}{E_\Omega}.   (5.35)

Carrying out the constrained variation (a Lagrange-multiplier argument) and taking account of the fact that A is a symmetric matrix gives us the final result

A\mathbf{f} = \left(\frac{E_{\Omega_0}}{E_\Omega}\right)\mathbf{f}.   (5.36)

Thus the maximum relative concentration of energy within the prescribed fractional band Ω_0/Ω is given by E_{\Omega_0}/E_\Omega = λ_max, the maximum eigenvalue of A. The samples of the signal whose spectrum provides this maximum are the elements of the corresponding eigenvector (5.30). It can be shown that the N eigenvalues of A are positive. They are usually arranged to form a monotonically decreasing sequence, which makes λ_max the first eigenvalue. The eigenvectors of A form an orthogonal set and represent a special case of discrete prolate spheroidal wave functions, which have been studied extensively and have a variety of applications.²

²One of the early works on this subject is D. Slepian and H.O. Pollak, "Prolate Spheroidal Wave Functions, Fourier Analysis and Uncertainty," Bell System Technical J., 40, no. 1, pp. 43–84, January 1961.
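The eigenproblem (5.36) is a few lines in any numerical linear algebra package. A sketch with NumPy assumed; N and the fractional band are illustrative values, much smaller than the N = 1024 used for Figs. 5.8 and 5.9:

```python
import numpy as np

N, ratio = 128, 0.02                   # N and Omega0/Omega are illustrative
d = np.subtract.outer(np.arange(N), np.arange(N))
with np.errstate(invalid='ignore'):
    A = np.sin(np.pi * ratio * d) / (np.pi * d)   # Eq. (5.31), off-diagonal
A[np.diag_indices(N)] = ratio          # limiting value on the diagonal
lam, vec = np.linalg.eigh(A)           # eigenvalues in ascending order
print(lam[-1])                         # lambda_max = peak in-band fraction
```

The column vec[:, -1] is the optimum sample sequence, a discrete prolate spheroidal wave function as in Fig. 5.9.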

A useful normalization of the fractional bandwidth is obtained when it is expressed in terms of the time-bandwidth product of the output signal. Setting Δt = 1/2B we get Ω_0/Ω = 2πB_0/2πB = 2B_0Δt = 2B_0T/N, so that 2B_0T = N(B_0/B). In Fig. 5.8 λ_max is plotted as a function of 2B_0T for N = 1024. Figure 5.9 shows plots of the components of three eigenvectors as functions of the normalized variable k/N (with N = 1024); the corresponding eigenvalues are shown on the plot. The fractional bandwidths for these cases (not shown on the plot) are .004, .0015, and .7938. The corresponding values of 2B_0T can be found from Fig. 5.9, except for the first value, for which 2B_0T is 4.096.

[Figure 5.8: Spectral concentration: maximum eigenvalue λ_max as a function of 2B_0T.]

5.2.2 Aliasing

We now suppose that f(t) is not bandlimited. Let us pick, quite arbitrarily, a sampling interval Δt and inquire what happens when we attempt to compute the FT using the sum (5.9). In anticipation that presently this sum may not converge to F(ω), we denote it by \hat{F}(\omega) and write

\hat{F}(\omega) = \sum_{n=-\infty}^{\infty} f(n\Delta t)e^{-in\Delta t\,\omega}\Delta t.   (5.37)

[Figure 5.9: Selected eigenvectors (discrete prolate spheroidal wave functions) of A, for eigenvalues λ_max = 1.0000, .9363, and .7938, plotted against k/N.]

Upon substituting the FT inversion formula for f (nΔt) we obtain

F (ω) =Δt

∞∑

n=−∞

∫ ∞

−∞F

′)einΔt

′−ω)

dω′

=Δt

∫ ∞

−∞F

′)dω

′∞∑

n=−∞einΔt

′−ω)

. (5.38)

Taking account of (2.31) we have

∞∑

n=−∞einΔt

′−ω)

= 2π∞∑

k=−∞δ[Δt

′ − ω)+ 2πk

]

=2π

Δt

∞∑

k=−∞δ

′ − ω +2πk

Δt

].

Substituting the last expression into (5.22) and integrating with respect to ω′

we obtain

F (ω) =∞∑

k=−∞F

(ω − 2πk

Δt

). (5.39)

Thus while the term corresponding to k = 0 in this series gives the correct FTit is embedded in an infinite set of frequency shifted transforms which we shall

Page 298: Signals and transforms in linear systems analysis

5.2 Signals Defined by a Finite Number of Samples 285

refer to as the images of the spectrum. When the function is bandlimited to|ω| < π/Δt = Ω the images do not overlap and we are back to the picturein Fig. 5.1: the images coincide with what we had previously referred to asperiodic extensions of the spectrum. When the signal spectrum extends outsidethe bandwidth defined by π/Δt the images overlap and we have the situationdepicted in Fig. 5.10. The individual image spectra together with F (ω) areshown in Fig. 5.10b where for the sake of simplicity we suppose that the signal

ω

ω

ω

−6p Δt −4p Δt −2p Δt 2p Δt 4p Δt 6p Δt

p Δt−p Δt

−p Δt p Δt

F (w)

0

a

b

c

p Δt−p Δt 0

0

••••••

F (w)ˆ

Figure 5.10: Signal spectrum corrupted by aliasing

spectrum is purely real. Note that the overlap (shown shaded in the diagram)with F (ω) of the two image spectra corresponding to k = ±1 can be viewedgeometrically as a folding of F (ω) along the two vertical lines defined by ω =±π/Δt whereby the high frequencies |ω| > π/Δt are mapped or “folded” intothe baseband |ω| < π/Δt. For this reason π/Δt is sometimes referred to asthe folding frequency. Such fold lines correspond to all frequencies that aremultiples of ±π/Δt, as illustrated in more detail in Fig. 5.11. In effect thefrequencies 2π/Δt− ω, 4π/Δt− ω, . . . ω − 3π/Δt, ω − 5π/Δt, . . . all appear as

Page 299: Signals and transforms in linear systems analysis

286 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

0

fold lines fold lines

−4p Δt −p Δt p Δt−2p Δt 2p Δt 4p Δtω

F(w)

Figure 5.11: Aliasing in terms of spectral folding

ω within the band 0 ≤ ω ≤ π/Δt and similarly for negative ω. This mappingof the high frequencies of the image spectra into the lower frequency basebandregime is referred to as aliasing. When the signal spectrum extends to infinity(e.g., as in the case of a Gaussian function) the number of aliased images istheoretically infinite. In the example in Fig. 5.10 only the two image spectra oneither side of F (ω) contribute so that the signal spectrum with the superposedaliases appears as in Fig. 5.10c. In practice alias-induced spectral distortion canbe reduced by increasing the sampling rate or (and) bandlimiting the signalto |ω| < π/Δt. For this purpose a low-pass filter (generally referred to as ananti aliasing filter) is employed. Since perfect bandlimiting can only be achievedwith filters having infinitely sharp cutoff aliasing effects can never be completelyeliminated but only reduced below some tolerable level.

5.3 Sampling

5.3.1 Impulse Sampling

The process of sampling may be visualized as a multiplication of the analogue sig-nal by a sequence of uniformly spaced pulses sufficiently narrow so that thesignal remains practically constant during each pulse period. As an extremeidealization of this process we replace these pulses by delta functions and referto the sampling as impulse sampling. The impulse train

P (t) ≡ Δt

∞∑

n=−∞δ (t− nΔt) (5.40)

Page 300: Signals and transforms in linear systems analysis

5.3 Sampling 287

when multiplied by the analogue signal f (t) will be denote fp (t). The propertiesof delta functions permit us to write

fp (t) =

∞∑

n=−∞f (nΔt)Δtδ (t− nΔt) . (5.41)

With

fp (t)F⇐⇒ Fp (ω) (5.42)

we have

Fp (ω) =

∞∑

n=−∞f (nΔt) e−inΔtωΔt . (5.43)

We can mechanize this idealized sampling operation as shown in Fig. 5.12 wherethe block labeled I/D represents the conversion of the impulse sampled function

I/Df (t)

∞d (t − nΔt)∑

n=−∞P(t) = Δt

f (nΔt)Δtfp(t)

Figure 5.12: Impulse sampling

fp (t) to the sequence of samples f (nΔt)Δt. The reverse process that of convert-ing fp (t) to the original analogue signal f (t) can be accomplished by passingfp (t) through an ideal low-pass filter with transfer function H (ω) = pπ/Δt (ω).This follows directly by convolving the impulse response sin (πt/Δt) /πt with(5.22):

sin (πt/Δt) /πt ∗ fp (t) =∞∑

n=−∞f (nΔt)

sin [(t/Δt− n)π](t/Δt− n)π

which is just the Shannon sampling representation (5.11). Thus, as expected, weget a perfect reconstruction only if f (t) is perfectly bandlimited to |ω| < π/Δt.This reconstruction is summarized in Fig. 5.13 where the block labeled D/Irepresents the conversion of the discrete sequence f (nΔt)Δt to impulse samples.

5.3.2 Zero-Order Hold Sampling and Reconstruction

One difficulty with the implementation of the reconstruction scheme in Fig. 5.13is that it requires a train of delta functions. However, this is not a fundamen-tal flaw since impulses can always be approximated by pulses of sufficiently

Page 301: Signals and transforms in linear systems analysis

288 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

f(t)f (nΔt)Δt

−p Δt/ p Δt/

H(ω)n=−∞

f (nΔt)Δtd(t − nΔt)∑

D/I

Figure 5.13: Reconstruction of analogue signal from sampled values

Δtt t

f (t)a bf (nΔt)h0 (t−nΔt)

n=−∞∑∞

Figure 5.14: Zero-order hold reconstruction

short duration. A more fundamental difficulty is that the process is noncausal.Practically this means that before reconstruction can be attempted one has toobserve the entire data record (past and future). In other words a real-time im-plementation of impulse sampling is not possible. One technique which does notsuffer from this drawback is the so-called zero-order hold method. This schemeemploys a circuit whose response to an impulse is constant for the duration ofthe sampling interval and subsequently resets to zero. The resulting transfor-mation of a typical analogue signal is illustrated in Fig. 5.14. The net effect isto replace the signal in Fig. 5.14a by the staircase approximation in Fig. 5.14b.Mathematically the output is represented by

y (t) =∞∑

n=−∞f (nΔt)Δt h0 (t− nΔt) , (5.44)

where

h0 (t) =

{1/Δt ; 0 ≤ t ≤ Δt,

0 otherwise(5.45)

Page 302: Signals and transforms in linear systems analysis

5.3 Sampling 289

is the impulse response of the zero-order hold circuit. The zero-order hold sam-pling representation (5.44) can be mechanized as shown in Fig. 5.15. To examinethe faithfulness of such a reconstruction we take the FT of (5.44) to obtain

Y (ω) =2 sin (ωΔt/2)

ωΔte−iωΔt/2

∞∑

n=−∞f (nΔt) e−inΔtωΔt

=2 sin (ωΔt/2)

ωΔte−iωΔt/2

∞∑

k=−∞F

(ω − 2πk

Δt

). (5.46)

Thus the reconstructed spectrum differs from the ideal case obtained with im-pulse sampling by the factor

H (ω) =2 sin (ωΔt/2)

ωΔte−iωΔt/2, (5.47)

i.e., the transfer function of the zero-order hold circuit. In addition to the phaseshift causing a signal delay of Δt/2 s, which in itself is relatively unimportant,the magnitude of the transfer function decreases from unity at ω = 0 to 2/π≈ .6366 at the folding frequency, representing an amplitude reduction of about4 dB, i.e., 20log(.636). Amplitude tapers of this magnitude are generally notacceptable so that compensating techniques must be employed.

f(t)fp(t)

y(t)h0(t)

d (t − nΔt)∑n=−∞

∞P(t) = Δt

Figure 5.15: Zero-hold sampling

One approach is to increase the sampling rate. For example, if the actualsignal bandwidth is 2π/Δt we choose the sampling interval equal to Δ

′t = Δt/r

where r > 1 may be defined as the order of oversampling. The magnitude of thetransfer function at the actual band edges of the signal spectrum then becomes|H (π/Δt)| = 2r sin (π/2r) /π. For example, if we oversample by a factor of 2the total amplitude taper is seen to be 2

√2/π which is only about to 0.9 dB. The

geometrical relationship between the signal spectrum and the transfer functionsfor the Nyquist and double Nyquist sampling rates is displayed in Fig. 5.16.

Alternatively one can attempt to compensate for the amplitude taper of thezero hold filter with an additional filter with transfer function magnitude pro-portional to the reciprocal of |H (ω)| . Unfortunately such filters can be realizedonly approximately so that a variety of compromises are generally employed inpractice usually involving both filtering and oversampling.

Page 303: Signals and transforms in linear systems analysis

290 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

1~0.9dB

~4dB

−2p Δt/ −p Δt/ 2p Δt/p Δt/

/2sin(wΔt 4)/wΔt 2

H(w ) =

/2sin(wΔt 2)wΔt

H(w ) =

/F(w) F(0)

0

Figure 5.16: Reduction in the spectral amplitude taper by oversampling

5.3.3 BandPass Sampling

In many applications an analogue signal that is to be sampled will be modulatedby a carrier (as, e.g., in a wireless communications link) or for other reasonsmay contain significant energy only within a relatively narrow band arounda center frequency. Such a signal can of course be treated formally as beingbandlimited to its maximum frequency, say ωmax, and, in principle, could besampled without alias-induced degradations at ωmax/πHz. We say in principlesince such an approach would generally lead to unacceptably high samplingrates. An acceptable solution would be to be able to sample at or near thesampling rate dictated by the much narrower bandwidth of the baseband signal.On first thought it might appear that this cannot be done without serious signaldegradation due to aliasing. Surprisingly, it turns out that we can actually takeadvantage of this aliasing and reduce the sampling rate substantially belowωmax/π. We accomplish this by forcing the ratio of the carrier frequency tothe modulation bandwidth to be an integer. As a result an aliased band canbe made to correspond exactly to the low-pass equivalent of the modulationspectrum. To see how this comes about consider a bandpass signal whose FTis represented by

Z (ω) = Z− (ω) + Z+ (ω) , (5.48)

where Z− (ω) and Z+ (ω) are bandpass spectra, corresponding to the negativeand positive frequency ranges, respectively, as illustrated in Fig. 5.17. The spec-trum of the signal sampled at intervals Δt is the periodic function

Z(ω) =∞∑

n=−∞z (nΔt) e−iωnΔtΔt =

∞∑

n=−∞Z

(ω − 2πn

Δt

). (5.49)

Page 304: Signals and transforms in linear systems analysis

5.3 Sampling 291

−w2 −w0 −w1 w1 w0 w2

w

Z(w)

Δw ΔwZ−(w) Z+(w)

Figure 5.17: Bandpass spectrum

Assuming the carrier frequency ω0 is fixed and not at our disposal we adjustthe nominal modulation bandwidth Δω (by simply adding extra guard bandsaround the given modulation band) such that

ω0/Δω = K = integer . (5.50)

We choose K to be the largest possible integer compatible with the given mod-ulation band (i.e., Δω is nearly identical with the “true” modulation band).Setting

Δt =2π

Δω=

2πK

ω0(5.51)

the Z(ω) becomes the periodic repetition of the inphase component of the base-band spectrum (c.f. 2.225a)

Z− (ω − ω0) + Z+ (ω + ω0) = X(ω) (5.52)

illustrated in Fig. 5.18. Hence we can unambiguously extract X(ω) from anyfrequency segment having a total bandwidth 2π/Δt. For example, we coulduse the interval symmetrically disposed with respect to the zero frequency high-lighted in Fig. 5.18. Note that this spectrum is actually an alias of the original(undersampled) waveform. An alternative (and perhaps more direct) interpre-tation of this result can be had if we recall that Z− (ω − ω0) + Z+ (ω + ω0) isjust the FT of the inphase component of the signal, i.e., x(t) cos(ω0t), so that itssampled value is x(nΔt) cos(ω0Δtn). Evidently choosing the sampling intervalsuch that ω0Δt = 2πK reduces this sampled value to x(nΔt). Thus we areable to demodulate the inphase component of the bandpass signal by simplysampling it at an appropriate rate. There are, however, significant practicallimitations in implementing such a demodulation scheme, one of which is therequirement that the time-aperture (duration of the sampling pulse) of the A/Dconverter must be much less than the period of the carrier. To assess the effectof a finite time-aperture let us assume a sampling pulse duration δ � Δt sothat the estimated sampled value of x(t) cos(ω0t) is

x(nΔt) =

∫ ∞

−∞

1

δpδ/2(t− nΔt)x(t) cos(ω0t)dt

Page 305: Signals and transforms in linear systems analysis

292 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

Δt− π

Δtπ−w2 −w0 −w1 w 1 w 0 w 2

w

Δw ΔwZ−(w) Z+(w)

Z−(w −w 0) + Z+(w +w 0)Z(w)

Figure 5.18: Spectrum of sampled bandpass signal

=1

δ

∫ δ/2

−δ/2x(ξ + nΔt) cos [ω0 (ξ + nΔt)] dξ

≈ x(nΔt)1

δ

∫ δ/2

−δ/2cos [ω0 (ξ + nΔt)] dξ

= x(nΔt)1

δ

∫ δ/2

−δ/2cos(ω0ξ)dξ = x(nΔt)

sin (ω0δ/2)

ω0δ/2

= x(nΔt)sin (πδ/T )

πδ/T, (5.53)

where T is the carrier period. Thus the error incurred as a result of a finitepulse width is

x(nΔt) − x(nΔt) = x(nΔt)

[1− sin (πδ/T )

πδ/T

].

From this we infer that for a given δ/T the maximum possible resolution thatcan be achieved with an A/D converter is approximately

N = − log2

[1− sin (πδ/T )

πδ/T

]bits, (5.54)

which is plotted in Fig. 5.19. We observe that to achieve a 12-bit resolutionrequires pulse widths approximately 1% of the carrier period.

To achieve digital demodulation of the complete bandpass signal x(t)cos(ω0t) − y(t) sin(ω0t) requires that the carrier of the quadrature componentbe phase shifted by 90◦, as illustrated in Fig. 5.20. Note that this schemeassumes that the phase shift has no effect on the signal envelope. This isnot difficult to achieve with communication signals at microwave frequencieswhere the bandwidth of the envelope is usually much smaller than the car-rier frequency. (A simple way of implementing such a phase shift is to use aquadrature hybrid.)

Page 306: Signals and transforms in linear systems analysis

5.3 Sampling 293

0 0.02 0.04 0.06 0.08 0.15

6

7

8

9

10

11

12

13

14

15

A/D time-aperture (fraction of carrier period)

Max

imum

ach

ieva

ble

reso

lutio

n(bi

ts)

Figure 5.19: Time-aperture requirements for digital demodulation of bandpasssignals

x(t)cosw 0t − y(t)sinw 0t

x(t)cosw 0t − y(t)sinw 0t

x(t)sinw 0t + y(t)cosw 0t

x(nΔt) y(nΔt)

A/D A/D

−90

w 0Δt = 2pK

Figure 5.20: Demodulation using bandpass sampling

5.3.4 Sampling of Periodic Signals

In the following we discuss aliasing effects engendered by sampling of periodicsignals. The interesting cases arise when we assume that the periodic functioncan be represented by a truncated FS. This is nearly always true in practice.The significance of this assumption is that a signal comprised of a finite numberof FS terms is bandlimited and therefore representable in terms of its samplesexactly. Let us first consider a simple sinusoid such as cos (ω0t) and parameterizethe sampling interval as follows:

Δt =π

ω0+ ξ.

Page 307: Signals and transforms in linear systems analysis

294 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

As long as − πω0

< ξ < 0 aliasing is avoided and the series defining the sampledspectrum gives

∞∑

n=−∞cos (ω0nΔt)Δte

−iωnΔt = πδ (ω − ω0) + πδ (ω + ω0) , (5.55)

wherein ω0 < π/Δt. Moreover the Shannon sampling theorem gives

cos (ω0t) =

∞∑

n=−∞cos (ω0nΔt)

sin [π(t/Δt− n)]π (t/Δt− n) . (5.56)

A plot of (5.56) using 13 samples with ω0ξ = −π/2 (twice the Nyquist rate) isshown in Fig. 5.21. When 0 < ξ the sampled spectrum is aliased since then ω0

> π/Δt. Restricting ξ to ξ < πω0

we get for |ω| < π/Δt = ω0/ (1 + ω0ξ/π)

∞∑

n=−∞cos (ω0nΔt)Δte

−iωnΔt = πδ

{ω + ω0

[π/ω0 − ξπ/ω0 + ξ

]}

+πδ

{ω − ω0

[π/ω0 − ξπ/ω0 + ξ

]}. (5.57)

The inverse FT of the right side is cos{ω0

[π/ω0−ξπ/ω0+ξ

]t}. Thus undersampling

of a pure tone produces a pure tone of lower frequency which can be adjustedfrom ω0 down to zero by varying ξ. In fact (5.56) can in this case be replacedby

0 10 20 30 40 50 60 70−1.5

−1

−0.5

0

0.5

1

1.5

Figure 5.21: Representation of cos(ω0t) using 13 samples (sampled at twice theNyquist rate)

cos

{ω0

[π/ω0 − ξπ/ω0 + ξ

]t

}=

∞∑

n=−∞cos (ω0nΔt)

sin [π(t/Δt− n)]π (t/Δt− n) . (5.58)

Page 308: Signals and transforms in linear systems analysis

5.3 Sampling 295

Plots of (5.58) together with stem diagrams indicating the sampled values of thesignal are shown in Fig. 5.22 for ω0 = 1π × 100 rps , ξ = 0 (Nyquist sampling,Δt = 5ms) and ξ = .001667 (Δt = 6.667ms). As is evident from the figure thefrequency of the undersampled and reconstructed signal is exactly one half ofthe original analogue signal.

The reduction of the signal frequency through aliasing finds application inthe sampling oscilloscope. The idea is to increase the timescale (reduce thefrequency) of a rapidly varying waveform to facilitate its observation. Here theperiodicity of the signal plays an essential role. Thus suppose x(t) is the rapidlyvarying waveform whose FS representation reads

x(t) =M∑

m=−Mxme

i2πmt/T . (5.59)

Let us sample this signal at intervals Δt = T+τ with τ a positive time incrementto be specified. The sampled (and possibly aliased) spectrum

0.05 0.055 0.06 0.065 0.07 0.075 0.08 0.085 0.09 0.095 0.1−1.5

−1

−0.5

0

0.5

1

1.5

Time (sec)

* **

* Sampling Interval= 5ms (Nyquist)** Sampling Interval= 6.667 ms

Figure 5.22: Reduction in frequency by undersampling is, by definition

X(ω) = (T + τ )∞∑

n=−∞x [n (T + τ)] e−inω(T+τ)

= (T + τ )∞∑

n=−∞x [nτ ] e−inω(T+τ), (5.60)

where in the last expression we have taken advantage of the periodicity in Tand set x [n (T + τ )] = x [nτ ] . Next we evaluate (5.59) at t = nτ and substitutein (5.60) to obtain

Page 309: Signals and transforms in linear systems analysis

296 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

X(ω) = (T + τ)

M∑

m=−Mxm

∞∑

n=−∞ei2πmnτ/T e−inω(T+τ)

= 2π (T + τ )

M∑

m=−Mxm

∞∑

�=−∞δ

[2πmτ

T− ω (T + τ ) + 2π�

]

= 2π

M∑

m=−Mxm

∞∑

�=−∞δ

[ω − 2πmτ

T (T + τ )− 2π�

T + τ

], (5.61)

where we have employed the delta function representation of the infinite sum ofsinusoids. Alternatively, we may rewrite the preceding as follows:

X(ω) =

∞∑

�=−∞X(ω − 2π�

T + τ), (5.62)

where

X(ω) = 2π

M∑

m=−Mxmδ

[ω − 2πmτ

T (T + τ )

]. (5.63)

Since the bandwidth of X(ω) is limited to |ω| ≤ 2πMτ/T (T + τ ), and thesampling is done at intervals T + τ the frequency components 2π�/(T + τ ) in(5.62) for � = 0 represent aliases. To avoid aliasing (see Fig. 5.10) we choose τsuch that 2πMτ/T (T + τ ) < π/ (T + τ ) or, equivalently,

τ < T/2M. (5.64)

In that case the expression for the spectrum (5.61) simplifies to

X(ω) = 2πM∑

m=−Mxmδ

[ω − 2πmτ

T (T + τ )

](5.65)

and taking the inverse transform we get

x (t) =M∑

m=−Mxme

i 2πmτT(T+τ) t.

Comparing this with (5.59) yields

x (t) = x

(tτ

T + τ

)(5.66)

so that the sampled and reconstructed signal is just the original periodic signalscaled in time by the factor τ/ (T + τ ). This technique is illustrated in Fig. 5.23which shows 10 cycles of a periodic signal with period of 4 s comprised of afundamental frequency and a third harmonic. Superposed is a period of thetime-scaled signal reconstructed from samples spaced 4.4 s apart. Here M = 3so that (5.64) is satisfied since τ = 0.4 < 2/3.

Page 310: Signals and transforms in linear systems analysis

5.4 The Discrete Fourier Transform 297

0 5 10 15 20 25 30 35 40−1.5

−1

−0.5

0

0.5

1

1.5

Time (sec)

Figure 5.23: Stretching the timescale of a periodic waveform by undersampling

5.4 The Discrete Fourier Transform

5.4.1 Fundamental Definitions

Let us now revisit the model of a bandlimited function whose spectrum canbe represented by a FS comprised of a finite number of terms. The FT is thengiven by (5.22) and the periodic extensions outside the baseband may be ignoredsince they exist where the actual transform is defined as identically zero. Onthe other hand, it turns out that for reasons of mathematical symmetry it isadvantageous to incorporate a periodic extension of the FT into the definition ofthe Discrete Fourier Transform (DFT). The relationship between this extensionand the actual transform is best illustrated with the following simple example.Consider a bandlimited function defined by the two non-zero samples f (0) =f (Δt) = 1. Its FT is

F (ω) =

{ (1 + e−iωΔt

)Δt ; |ω| ≤ π/Δt

0 ; |ω| > π/Δt.(5.67)

A plot of the magnitude and phase of F (ω) / Δt within the baseband is shownin Fig. 5.24. On the other hand, the identical information can be extracted fromFig. 5.25 where the magnitude and phase of 1+ e−iωΔt are plotted in the range0 ≤ ω < 2π/Δt provided of course one recognizes that the folding frequency nowfalls in the center of the plot, while the negative frequencies have been mappedinto the region π/Δt < ω < 2π/Δt. The impetus for this convention was appar-ently driven by the insistence that the time interval (which is normally takenas the positive segment of the time axis with zero as the reference) occupiedby the signal map into a corresponding positive segment of the frequency axis.

Page 311: Signals and transforms in linear systems analysis

298 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

−3 −2 −1 0 1 2 30

0.5

1

1.5

2

2.5

3

ωΔt

−3 −2 −1 0 1 2 3ωΔt

MAGNITUDE

−80−60−40−20

020406080

PHASE

Figure 5.24: Magnitude and phase of the FT in (5.67)

Despite the somewhat anomalous situation of finding negative frequencies onthe second half of the positive frequency axis this convention has found almostuniversal acceptance and is incorporated in the standard definition of the DFT.To derive it we first rewrite (5.22) with a small notational change:

F (ω) =

N−1∑

n=0

f [n] e−inΔtω (5.68)

f [n] = f (nΔt)Δt. (5.69)

An inversion formula for the recovery of the sample values f [n] in terms of F (ω)is, of course, available in the form of the FS integral (5.6). On the other hand,(5.68) also states that F (ω) is represented exactly by N sinusoids. We recognizethis as an interpolation problem analogous to the one already dealt with in 2.1.8in the time domain. This suggests that an alternative to evaluating the integral(5.6) is to specify F (ω) at N discrete frequencies and solve the resulting NXNsystem of algebraic equations for f [n]. Although, in principle, these samplepoints need not be uniformly distributed only a uniform distribution leads toan algebraically simple inversion formula. Accordingly we divide the frequencyinterval 0, 2Ω into N bins each of width Δω = 2Ω/N = 2π/NΔt, and setω = mΔω.3 We introduce these into (5.68) to obtain

F [m] =

N−1∑

n=0

f [n] e−i2πnm/N ;m = 0, 1, 2, . . .N, (5.70)

3Note that with the choice 0, 2Ω for the baseband rather than −Ω,Ω we are adopting theperiodic extension definition of Fig. 5.18.

Page 312: Signals and transforms in linear systems analysis

5.4 The Discrete Fourier Transform 299

0 1 2 3 4 5 60

0.5

1

1.5

2

2.5

3

ωΔt

0 1 2 3 4 5 6ωΔt

MAGNITUDE

−80−60−40−20

020406080

PHASE

Figure 5.25: Plots of periodic extension of FT in (5.67)

where for the sake of notational consistency with (5.69) we denote the frequencysamples by

F [m] = F (mΔω) . (5.71)

The final step is to use the orthogonality relationship (2.70) to solve for f [n].The result is

f [n] =1

N

N−1∑

m=0

F [m] ei2πnm/N ;n = 0, 1, 2, . . .N. (5.72)

The transform pairs (5.70) and (5.72) are, respectively, the standard forms ofthe direct and the inverse DFT. In addition it is customary to set

WN ≡ e−i2π/N (5.73)

which recasts (5.70) and (5.72) into the following form:

F [m] =

N−1∑

n=0

WmnN f [n] ; m = 0, 1, 2, . . .N (5.74a)

f [n] =1

N

N−1∑

m=0

W−nmN F [m] ;n = 0, 1, 2, . . .N. (5.74b)

Although both transformations involve only discrete samples the underlyingbandlimited time waveform and the frequency spectrum can be reconstructedwith suitable interpolation formulas. We already know that in the time domain

Page 313: Signals and transforms in linear systems analysis

300 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

f (t) can be recovered via the Shannon sampling formula (5.11). An interpo-lation formula for F (ω) can be obtained by substituting f [n] from (5.72) into(5.68) and summing the geometric series with respect to n. The result is

F (ω) =

N−1∑

k=0

F [k] eiπ(N−1N )(k−ω/Δω) sin [π (k − ω/Δω)]

N sin[π 1N (k − ω/Δω)] , (5.75)

which is very similar to the time domain interpolation formula for Fourier seriesin (2.74). An alternative interpolation technique is the so-called “zero padding”described on page 308.

5.4.2 Properties of the DFT

Even though we have derived the DFT with reference to bandlimited functionsand the continuum this transform may be viewed in its own right as a linearoperation on discrete sequences. In fact it constitutes a significant processingtool in diverse DSP (Discrete Signal Processing) applications quite independentfrom any relationship to analogue waveforms. In the following we investigatesome of its fundamental properties.

Periodicity

Since WNkN = 1 for any integer k, every sequence f [n] of length N transforms

into a periodic sequence F [m] of period N i.e.,

F [m+Nk] = F [m] ; k = ±1,±2, . . .In this sense the situation is very similar to Fourier series so that we may regardF [m] for |m| > N as a periodic extension. Similarly every sequence F [m] oflength N transforms into a periodic sequence f [n] of period N . Even thoughnormally the specification of a data sequence f [n] is restricted to the indexrange 0 ≤ n ≤ N − 1 negative indexes can be accommodated through periodicextensions. For example, the sequence defined by

f [n] =

{1− |n|

10 ; |n| ≤ 100 otherwise

shown plotted in Fig. 5.26a with respect to the index range −32 ≤ n < 32appears as in Fig. 5.26b when plotted with respect to 0 ≤ n ≤ 63

Orthogonality

According to (2.70) the DFT matrix WmnN in (5.74) is comprised of orthogonal

columns and rows. Because of periodicity of WmnN the orthogonality general-

izes to

N−1∑

n=0

WmnN W−nk

N =

N−1∑

n=0

WmnN W

∗nkN = Nδm,k+N� ; � = 0,±1,±2 . . . (5.76)

Page 314: Signals and transforms in linear systems analysis

5.4 The Discrete Fourier Transform 301

−40 −30 −20 −10 0 10 20 30 400

0.2

0.4

0.6

0.8

a

b

1

0 10 20 30 40 50 60 700

0.2

0.4

0.6

0.8

1

Figure 5.26: Periodic extensions for the DFT

Matrix Properties

To emphasize the matrix properties of the DFT we rewrite (5.74a) as follows:

F = WN f (5.77)

where WN is the NXN matrix with elements WmnN and

f =[f [0] f [1] f [2] . . . f [N − 1]

]T, (5.78a)

F =[F [0] F [1] F [2] . . . F [N − 1]

]T. (5.78b)

The DFT matrix is symmetric which we may write formally as

WN = WTN . (5.79)

Its structure is best appreciated with the aid of the display

WN =

⎢⎢⎢⎢⎣

1 1 1 . 1

1 W 1N W 2

N . WN−1N

1 W 2N W 4

N . W2(N−1)N

. . . . .

1 WN−1N W

2(N−1)N . W

(N−1)(N−1)N

⎥⎥⎥⎥⎦. (5.80)

In view of (5.73) the elements of WN in (5.80) are powers of the N -th roots ofunity. Hence the elements of a typical column of WN , say wk,

wk =[1 W k

N W k2N . . . W

k(N−1)N

]T

Page 315: Signals and transforms in linear systems analysis

302 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

when plotted in the complex plane will be uniformly spaced on a unit circle.Moreover the next vector wk+1 is generated by successive rotations of the com-plex numbers representing the elements of wk through the angles

0, 2π/N, 2 (2π/N) , 4 (2π/N) , . . . (N − 1) (2π/N)

radians. The direct DFT (5.77) may be viewed as an expansion of F in termsof the orthogonal basis vectors represented by the columns of WN . In terms ofthese basis vectors the orthogonality (5.76) may be expressed as

wHk wn = Nδkn (5.81)

or equivalently in block matrix form

WNWHN = WNW∗

N = NI. (5.82)

With the aid (5.81) the inverse DFT follows at once as

f =1

NW∗

NF. (5.83)

Recall that a matrix U with the property UUH = I is said to be unitary. Thusin virtue of (5.92) the matrix

UN =1√N

WN (5.84)

is unitary and according to (5.79) it is also symmetric. Thus except for anormalization factor the DFT is a unitary symmetric transformation.4

Consider now the eigenvalue problem of the normalized DFT matrix (5.83):

UNek = λkek; k = 1, 2, . . . , N. (5.85)

We recall that a unitary and symmetric matrix can be diagonalized a real or-thogonal transformation. Hence all the ek are real and

eTk em = δkm. (5.86)

Therefore an arbitrary N -dimensional sequence can be represented by the su-perposition

f =

N−1∑

k=0

fkek, (5.87)

where fk = eTk f . An interesting feature of this basis is that each ek is propor-tional to its own DFT. Thus if we take the DFT of (5.87) we find

F =√N

N−1∑

k=0

fkλkek (5.88)

from which the DFT of ek follows as√Nλkek.

4Recall that an analogous result holds for FT which, except for the factor 1/2π, is also aunitary transformation.

Page 316: Signals and transforms in linear systems analysis

5.4 The Discrete Fourier Transform 303

By virtue of the special symmetry of the DFT matrix we can find the eigen-values λk without having to solve directly for the roots of the characteristicpolynomial. To see this let us first evaluate the elements of the matrix Q ≡ U2.We have

Qkm =1

N

N−1∑

n=0

W knN Wnm

N = δk,−m+�N ; � = 0,±1,±2 . . . (5.89)

Since the indices k,m are restricted to run from 0 to N − 1 the right side of(5.89) is zero unless � = 0,m = 0 or � = 1 and m = �N − k, in which case itequals unity. Therefore we get for Q the elementary matrix

Q =

⎢⎢⎢⎢⎢⎢⎢⎢⎣

1 0 0 . 0 0 00 0 0 . 0 0 10 0 0 . 0 1 0. . . . . . .0 0 0 . 0 0 00 0 1 . 0 0 00 1 0 . 0 0 0

⎥⎥⎥⎥⎥⎥⎥⎥⎦

.

It is not hard to see that Q2 = U4 = I so that in view of (5.85) for any ek

U4ek = λ4kek = ek

and λ4k = 1. From this we see that the N eigenvalues λk are restricted to the fourroots of unity which are ±1 and ±i. Consequently for large N the eigenvalueproblem for the DFT matrix is highly degenerate (many eigenvectors correspondto the same eigenvalue). Let us tag each of the four groups of eigenvectors withthe corresponding eigenvalue and rewrite the expansion (5.88) as follows:

F =√N

N(1)

f(1)k e

(1)k −

√N

N(−1)

f(−1)k e

(−1)k

+i√N

N(i)

f(i)k e

(i)k − i

√N

N(i)

f(−i)k e

(−i)k , (5.90)

where N = N (1) + N (−1) + N (i) + N (−i). From this we see that a generalsequence represented by the vector f projects into four orthogonal subspaces

spanned by the vectors e(1)k , e

(−1)k , e

(i)k e

(−i)k . Except for a constant factor, each

of these projections is seen to be its own transform. The content of (5.90) maytherefore be restated as follows. Any sequence f [n] can be written uniquely asa sum of four orthogonal sequences each of which is proportional to its DFT.When N is chosen such that log2N is an integer one can show that each of thefour subspaces has the following dimension: N (1) = N

4 + 1, N (−1) = N4 , N

(i) =N4 − 1, N (−i) = N/4. Thus for large N the number of linearly independenteigenvectors in each subspace is essentially identical.

Page 317: Signals and transforms in linear systems analysis

304 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

Parseval Identities

Consider two sequences represented by the vectors f and g and their respectiveDFTs by F and G. We compute the inner product

fHg =1

N

(WH

NF)H 1

NWH

NG =1

N2FHWNWH

NG

and using (5.82) in the last term obtain

fHg =1

NFHG. (5.91)

This is one form of the Parseval identity for the DFT. If one again ignoresthe asymmetry caused by N, (5.91) is a statement of the preservation of theinner product under a unitary transformation. In the special case f = g thisreduces to

‖f‖2 =1

N‖F‖2 . (5.92)

If we are dealing with bandlimited analogue signals the preceding can also bewritten in terms of integrals as follows:

‖f‖2 = Δt

∫ ∞

−∞|f (t)|2 dt = Δt

∫ Ω

−Ω

|F (ω)|2 dω =1

N‖F‖2 (5.93)

Alternatively from last equality we get

1

∫ Ω

−Ω

|F (ω)|2 dω =1

N−1∑

n=0

|F (nΔω)|2 Δω, (5.94)

which highlights the fact that for bandlimited signals a finite Riemann sumrepresents the energy integral exactly also in the spectral domain.

Time and Frequency Shift

As we know, the special attribute of the FT that renders it particularly usefulin the analysis of LTI systems is the transformation of a time-delayed signalf (t− T ) into F (ω) e−iωt. This relationship is also maintained between thesamples and the FT of the time-delayed version of a bandlimited function. Thisfollows from

f (t− T ) F⇐⇒∞∑

n=−∞f (nΔt−KΔt) e−iωnΔtΔt

= e−iωKΔt∞∑

n=−∞f (�Δt) e−iωnΔtΔt = e−iωKΔtF (ω) , (5.95)

Page 318: Signals and transforms in linear systems analysis

5.4 The Discrete Fourier Transform 305

where T = KΔt, and K an integer. Using the notation (5.69) we get

∞∑

n=−∞f [n−K] e−iωnΔt = e−iωKΔt

∞∑

�=−∞f [�] e−iω�Δt. (5.96)

Unfortunately an analogous relationship cannot hold without restriction for theDFT since it is defined with reference to the fixed data window of length N . Tosee this explicitly let us introduce the notation for the DFT pair

f [n]DF⇐⇒ F [m] (5.97)

and then compute

f [n−K]DF⇐⇒

N−1∑

n=0

WmnN f [n−K] =WmK

N

N−K−1∑

�=−KWm�N f [�] . (5.98)

Clearly the preceding sum is in general not equal to F [m]. There are twoconditions when this will be the case. First, if the sequence f [n] is identicallyzero outside the data window and is of length L ≤ N provided N − K ≥ L.Second, if f [n] is either periodic with principal period N or is defined outsidethe data window by its periodic extension. If one of these conditions holds, wemay write

f [n−K]DF⇐⇒ F [m]WmK

N . (5.99)

By symmetry, if the same constraints are imposed on F [m] we have

f [n]W−nKN

DF⇐⇒ F [m−K] . (5.100)

Convolution

The convolution operation for discrete sequences is defined in a manner analo-gous to analogue signals. Thus

y [n] =

∞∑

m=−∞f [n−m]h [m] =

∞∑

m=−∞h [n−m] f [m] . (5.101)

When each of the sequences is of finite length, say f [n] is of length Nf andh [n] of length Nh, (5.101) can be modified by taking explicit account of thetruncations of the sequences. For example, the first of the sums in (5.101)transforms into

y [n] =

{ ∑nm=0 f [n−m]h [m] ; 0 ≤ n ≤ Nh − 1,∑Nh−1

m=0 f [n−m]h [m] ;Nh − 1 < n ≤ Nh +Nf − 2.(5.102)

Page 319: Signals and transforms in linear systems analysis

306 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

The sequence y [n] is of length Nf + Nh − 1, as may also be seen from thefollowing matrix representation of (5.102) (assuming Nf > Nh) :

⎢⎢⎢⎢⎢⎢⎣

y [0]y [1].

y [Nh].

y [Nh +Nf − 2]

⎥⎥⎥⎥⎥⎥⎦=

⎢⎢⎢⎢⎢⎢⎣

f [0] 0 . 0f [1] f [0] . 0. . . .

f [Nh − 1] f [Nh − 2] . f [0]. . . .0 0 . f [Nf − 1]

⎥⎥⎥⎥⎥⎥⎦

⎢⎢⎣

h [0]h [1].

h [Nh − 1]

⎥⎥⎦ . (5.103)

We shall abbreviate the convolution by writing

y =conv (f ,h) ≡ conv (h, f) . (5.104)

It is worth noting at this juncture that the elements of the vector y can be inter-preted as the coefficients of a polynomial that results from the multiplicationof two polynomials whose coefficients are given by the vectors f and h. To see

this let PNf(z) =

∑Nf−1�=0 f [n] z� and PNh

(z) =∑Nh−1

k=0 h [k] zk. Multiplicationgives

PNf(z)PNh

(z) =

Nf−1∑

�=0

Nh−1∑

k=0

f [�]h [k] z�+k

=

Nh+Nf−2∑

n=0

(Nh−1∑

k=0

f [n− k]h [k])zn. (5.105)

The sum

y [n] =

Nh−1∑

k=0

f [n− k]h [k] (5.106)

with the upper limit set to n whenever n < Nh − 1 is just the convolution(5.102). It is also the n-th coefficient of an Nh + Nf − 1th order polynomial.Clearly (5.105) is independent of the nature of z. In fact if we replace z by Wm

N

we can identify the two polynomials with DFTs provided the data window N ischosen sufficiently long to accommodate both sequences, i.e., N ≥ Nh+Nf −1,and we extend each of the sequences to length N by appending N − Nf andN −Nh zeros to f [n] and h [n] , respectively. For such zero-padded sequences(5.105) with z = Wm

N is equivalent to the transform pair

conv (f ,h)DF⇐⇒ F [m]H [m] ; N ≥ Nh +Nf − 1. (5.107)

Page 320: Signals and transforms in linear systems analysis

5.4 The Discrete Fourier Transform 307

Similarly, by replacing the sequences in (5.105) with their zero-padded trans-forms F and H, we deduce in almost identical fashion the frequency domainconvolution formula5

f [n]h [n]DF⇐⇒ 1

Nconv (F,H) ; N ≥ NH +NF − 1, (5.108)

where the vectors F and H have lengths NF and NH .The zero-padding of the sequences in (5.107) may be viewed as a device to

avoid the cumbersome notation in (5.102) resulting from the truncation of thesequences. A systematic approach to this is to introduce the so-called circularconvolution that employs the periodic extension of one of the sequences to fillthe convolution matrix. This type of convolution is defined by

circonv (f ,h) =

N−1∑

k=0

f [(n− k)modN ]h [k] , (5.109)

where N is the data window as well as the nominal length of each of the se-quences. The symmetry of the matrix in (5.109) may be clarified by the display

⎢⎢⎢⎢⎢⎢⎣

y [0]y [1]y [2].

y [N − 2]y [N − 1]

⎥⎥⎥⎥⎥⎥⎦=

⎢⎢⎢⎢⎢⎢⎣

f [0] f [N − 1] . f [1]f [1] f [0] . f [2].f [2] f [1] . .f [3]. . . .

f [N − 2] f [N − 3] . f [N − 1]f [N − 1] f [N − 2] . f [0]

⎥⎥⎥⎥⎥⎥⎦

⎢⎢⎣

h [0]h [1].

h [N − 1]

⎥⎥⎦ .

(5.110)

The relationship between the elements in adjacent columns shows that the entirematrix can be generated starting with a uniform arrangement of the elementscomprising the first column along the circumference of a circle as shown inFig. 5.27.

Each succeeding column vector is then generated by a clockwise rotationthrough one element position (hence the name circular convolution). By sub-stituting the DFT representation for each sequence on the right of (5.109) it isnot hard to show that

circonv (f ,h)DF⇐⇒ F [m]H [m] (5.111)

and similarly for the inverse transform

f [n]h [n]DF⇐⇒ 1

Ncirconv (F,H) . (5.112)

5It is important to note an essential difference between zero-padding of the (time-domain)sequences and their transforms. While zero-padding a sequence can be done by simply ap-pending any number of zero without changing the form of the signal, zero-padding in thefrequency domain requires that the symmetry relation between the spectral components oneither side of the folding frequency be maintained.

Page 321: Signals and transforms in linear systems analysis

308 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

• ••• •

•••

f [7]f [6]

f [5]

f [4]

f [3]

f [2]

f [0]

...

f [1]

f [N −7]

f [N −6]

f [N −5]

f [N −4]

f [N −3]

f [N −2]

f [N −1]

Figure 5.27: Matrix elements for circular convolution

This shows that for circular convolution the direct and inverse DFT factorizationformulas (5.111) and (5.112) hold without any restrictions on the sequencelengths. This is, of course, entirely due to our use of the periodic extensionof one of the sequences. By comparing (5.110) with (5.103) it is not hard to seethat a circular convolution of two sequences which have been zero-padded toN ≥ Nh+Nf −1 reduces to an ordinary convolution. One utility of the circularconvolution is that it provides, albeit indirectly, a numerically more efficientfor the evaluation of ordinary convolutions than does the direct summation for-mula (5.102). The procedure is to zero-pad the sequences to N = Nh +Nf − 1and compute their DFT. The inverse DFT of the product of these two DFTsthen computed which, by virtue of (5.111), is formally equal to the circularconvolution but because the initial zero-padding is the same as the ordinaryconvolution.

Zero-padding can also be used as an interpolation technique. Recall thatgiven a DFT at N frequencies we can interpolate to any frequency using (5.75).An alternative scheme is to append K zeros to the original sequence of lengthN thereby creating a DFT of length N +K. Since this new DFT extends overthe entire frequency band (appending zeros does not change the spectrum ofthe signal) we obtain a denser sampling of the signal spectrum by the factor(N +K) /K. As an illustration consider the sequence

g(n) = cos[(n− 1−N/2) π

N

]+ sin

[(n− 1−N/2) π

N

]; n = 1, 2, . . . N

(5.113)and its DFT

F (m) =

M∑

n=0

WnmM g(n), (5.114)

where M ≥ N .Figure 5.28 shows stem diagrams of the sequence (5.113) on the left and

the magnitude of its DFT for M = N = 64 on the right.

Page 322: Signals and transforms in linear systems analysis

5.4 The Discrete Fourier Transform 309

0 10 20 30 40 50 60 70−1

−0.5

0

0.5

1

1.5

0 10 20 30 40 50 60 700

5

10

15

20

25

30

35

40

Figure 5.28: Stem diagrams of sequence (5.113) and the magnitude of its DFT

0 10 20 30 40 50 60 700

5

10

15

20

25

30

35

40

0 200 400 600 800 1000 120005

101520253035404550

Figure 5.29: Resolution improvement by zero-padding fromN = 64 toN = 1024

When the sequence is zero-padded to M = 1024 the resolution improvessignificantly. This is demonstrated by the plots in Fig. 5.29 which compare theDFT magnitudes before and after zero-padding.

In our discussion of the DFT we have not addressed the problem of its actualnumerical evaluation. It is not hard to see that a direct computation requiresN2 multiplications plus additions. This placed a severe damper on the use ofDFTs in signal processing until the invention of numerically efficient algorithms.The first of these was the Cooley and Tukey (1965) FFT (Fast Fourier Trans-form) which reduced the number of operations from N2 to O(N log2N) therebyrevolutionizing digital signal processing. Today there is a large number of “fastalgorithms” for DFT evaluation. The presentation of their structure and theirrelative merits belongs in courses on digital signal processing and numericalanalysis. A discussion of the classic Cooley and Tukey radix 2 algorithm can befound in virtually every introductory text on digital signal processing and wewill not repeat it. For our purposes the MATLAB implementations of the FFTare sufficient.

Page 323: Signals and transforms in linear systems analysis

310 5 Bandlimited Functions Sampling and the Discrete Fourier Transform

Problems

1. The Fourier transform F (ω) of a bandlimited signal f (t) is given by

F (ω) =

{cos2

(πω2Ω

)e−i

πωΩ ; |ω| ≤ Ω,

0 ; |ω| > Ω.

(a) The signal is sampled at intervals Δt = π/Ω. Find the samples.

(b) Using the sampled values find f (t).

(c) Suppose the signal is sampled at intervals Δt′ = 2π/Ω and its Fouriertransform approximated by

F (ω) = Δt′∞∑

n=−∞f (nΔt′) e−iωnΔt

′.

Compute and sketch F (ω).

2. A bandlimited signal f(t) is represented by the finite number of samplesf(0), f( Δt), f (2Δt) , . . . f ((N − 1)Δt). Show that the ratio of the signalenergy within the time interval 0 < t < NΔt to the total energy is givenby the largest eigenvalue of the N ×N matrix A with elements

Amn =

∫ N

0

sin [π(s− n)] sin [π(s−m)]

π(s− n)π(s−m)ds.

3. Derive (5.75).

4. Use (5.79) with N = 64 to interpolate the sequence (5.113). Interpolatethis sequence using zero padding with M = 256. Compare the resolutionimprovement obtained with the two techniques.

Page 324: Signals and transforms in linear systems analysis

Chapter 6

The Z-Transformand Discrete Signals

6.1 The Z-Transform

6.1.1 From FS to the Z-Transform

We return to the FT of a bandlimited function as given by (5.9) in Sect. 5.1.Setting ωΔt = θ we have the FS

F (ω) = F (eiθ) =

∞∑

n=−∞f [n]e−inθ (6.1)

with the coefficients f [n] = f(nΔt)Δt computed in the usual way, viz.,

f [n] =1

∫ π

−πF (eiθ)einθdθ. (6.2)

As we have seen in Chap. 5, for discrete systems (6.1) can also be interpretedas DFT of the sequence f [n] without any requirement that this sequence cor-responds in any way to samples of a bandlimited analogue signal. Howeverregardless whether we are dealing with samples of an analogue signal or sim-ply with the DFT of a sequence of numbers mathematically formulas (6.1) and(6.2) represent, respectively, nothing more than the classic FS and its inverseand hence subject to all the restrictions and convergence properties of FS dis-cussed in Chap. 2. In particular, for infinite sequences (6.1) can converge onlyif f [n] decays sufficiently rapidly as n → ±∞. The convergence issues hereare similar to those we encountered in connection with the theory of the FT ofanalogue signals where we had to impose restrictions on the growth of the timefunction as t → ±∞ for its FT to exist. In Chap. 4 we also found that by ap-pending a convergence factor to the imaginary exponential of the FT we couldenlarge the class of admissible signals and obtain convergent integral transforms

W. Wasylkiwskyj, Signals and Transforms in Linear Systems Analysis,DOI 10.1007/978-1-4614-3287-6 6, © Springer Science+Business Media, LLC 2013

311

Page 325: Signals and transforms in linear systems analysis

312 6 The Z-Transform and Discrete Signals

for function that may grow at infinity. This suggests that with the aid of a suit-able convergence factor in (6.1) we should be able to obtain DFTs of divergentinfinite sequences. Thus suppose f [n] grows as ρnmin for n → ∞. By analogywith the notation employed in connection with the Laplace transform theorywe shall denote this by

f [n] ∼ O(ρnminn∼∞

). (6.3)

To simplify matters let us first suppose that f [n] = 0 for n < 0. We shall refer tosuch a sequence as a causal sequence. If we introduce the real positive numberρ, the series

∞∑

n=0

f [n]ρ−ne−inθ

will converge provided ρ > ρmin. Since the left side of (6.1) is a function of eiθ

the introduction of the convergence factor permits us to write the preceding asfollows:

F+(ρeiθ) =∞∑

n=0

f [n](ρeiθ)−n, (6.4)

where we have added the superscript to draw attention to the causal natureof the corresponding sequence. Of course, (6.4) is still an FS with coefficientsf [n]ρ−n so that the usual inversion formula applies which gives

f [n]ρ−n =1

∫ π

−πF+(ρeiθ)einθdθ.

Clearly no substantive change will result if we simply multiply both sides by ρn

and write

f [n] =1

∫ π

−πF+(ρeiθ)(ρeiθ)ndθ. (6.5)

As a notational convenience we now introduce the complex variable z = ρeiθ in(6.4) in which case the series assumes the form

F+(z) =

∞∑

n=0

f [n]z−n. (6.6)

In view of the convergence factor just introduced this series converges for allcomplex z such that |z| = ρ > ρmin, i.e., outside a circle with radius ρmin.Equation (6.6) will be recognized as the principal part of the Laurent se-ries expansion of an analytic function (see Appendix) in the annular regionρmin < |z| < ∞. The formula for the FS coefficients (6.5) can now be inter-preted as a contour integral carried out in the counterclockwise direction alonga circular path of radius ρ. We see this directly by changing the variable of in-tegration from θ to z = ρeiθ. Computing dz = ρieiθdθ = izdθ and substitutingin (6.5) give

f [n] =1

2πi

|z|=ρ

F+(z)zn−1dz, (6.7)

Page 326: Signals and transforms in linear systems analysis

6.1 The Z-Transform 313

where ρ > ρmin. The integration contour and the region of analyticity of F+ (z)are shown in Fig. 6.1. Equations (6.6) and (6.7) represent, respectively, the

ℑm(z)

ℜe(z)rmin

r

Figure 6.1: Region of analyticity and inversion contour for the unilateral Z-transform F+ (z)

(single-sided or unilateral) Z-transform and its inverse. We shall denote it by

f [n]Z⇐⇒ F+ (z) . (6.8)

Just like the analogous inversion formula for the unilateral Laplace trans-form, (6.8) yields zero for negative times, i.e., for n < 0. This is automaticallyguaranteed by the analyticity of F (z) outside on the integration contour. Wecan see this directly by applying the Cauchy residue theorem to a closed con-tour formed by adding to the circular contour in Fig. 6.1 a circle at infinity.The contribution from the integral along the latter vanishes as we see from thefollowing bounding argument:

1

∣∣∣∣∣∣∣

|z|=ρ

F+(z)zn−1dz

∣∣∣∣∣∣∣≤ 1

∫ π

−π

∣∣F+(ρeiθ)∣∣ ρndθ

≤ ∣∣F+(ρeiθ)∣∣max

ρn. (6.9)

Because F+(z) is finite as ρ→∞ the last term tends to zero for negative n.When ρmin < 1 the integration path in Fig. 6.1 can be chosen to coincide

with the unit circle. In that case the inversion formula (6.7) may be replacedby

f [n] =1

∫ π

−πF+(eiθ)einθdθ (6.10)

Page 327: Signals and transforms in linear systems analysis

314 6 The Z-Transform and Discrete Signals

and we may interpret F+(eiθ) either as the DFT of a causal sequence or as theFT of a bandlimited analogue signal whose sample values on the negative timeaxis are zero.

The restriction to causal sequences is not always convenient. To encompassnegative indexes we suppose, by analogy with (6.3), that

f [n] ∼ O(ρnmaxn∼−∞

). (6.11)

Let us suppose that this sequence is zero for nonnegative indexes. The transformfor such an “anti-causal” sequence may be defined by

F−(z) =−1∑

n=−∞f [n]z−n. (6.12)

With z = ρeiθ we see that this series converges for 0 ≤ ρ < ρmax and may beinterpreted as an FS, i.e.,

F−(ρeiθ) =−1∑

n=−∞f [n]ρ−ne−inθ, (6.13)

wherein the coefficients are given by

f [n]ρ−n =1

∫ π

−πF−(ρeiθ)einθdθ. (6.14)

Changing the integration variable to z transforms (6.14) into the contour integral

f [n] =1

2πi

|z|=ρ

F−(z)zn−1dz, (6.15)

where the radius of the integration contour ρ < ρmax lies in the shaded regionof Fig. 6.2, corresponding to the region of analyticity of F−(z). Note that thetransforms F+(z) and F−(z) are analytic functions in their respective regions ofanalyticity. In general these regions would be disjoint. If, however, ρmax > ρmin,the regions of analyticity of F+(z) and F−(z) overlap, the overlap being theannular region

ρmin < |z| < ρmax (6.16)

shown in Fig. 6.3. In this case F (z),

F (z) = F+(z) + F−(z) (6.17)

is an analytic function within this annulus (6.16) and defines the bilateral Z-transform of the sequence f [n] represented by the series

F (z) =∞∑

n=−∞f [n]z−n. (6.18)

Page 328: Signals and transforms in linear systems analysis

6.1 The Z-Transform 315

ℑm(z)

ℜe(z)

r

rmax

Figure 6.2: Region of analyticity and inversion contour for F− (z)

1

ℑm z

ℜe z

rmax

rmin

Figure 6.3: Typical annular region of analyticity of the doublesided Z-transformF (z)

The annular analyticity region is also the region of convergence (ROC) of (6.18).This convergence is guaranteed as long as the sequence f [n] exhibits the asymp-totic behavior specified by (6.3) and (6.11). Equation (6.18) will be recognizedas the Laurent series (see Appendix) expansion of F (z) about z = 0. The inver-sion formula is now given by the sum of (6.7) and (6.15), i.e.,

f [n] =1

2πi

|z|=ρ

F (z)zn−1dz (6.19)

Page 329: Signals and transforms in linear systems analysis

316 6 The Z-Transform and Discrete Signals

with ρ lying within the shaded annulus in Fig. 6.3. Of course due to analyticityof the integrand we may use any closed path lying within the annular region ofanalyticity. If the ROC includes the unit circle as, e.g., in Fig. 6.4, the inversionformula (6.19) may be replaced by (6.2) in which case F (z) = F (eiθ) reduces tothe FT.

1 rmax

rmin

ℑm z

ℜe z

Figure 6.4: Evaluation of the inversion integral of the bilateral Z-transformalong the unit circle

In summary (6.18) and (6.19) constitute, respectively, the direct and inversebilateral Z-transform (ZT) of a sequence. As is customary, we shall use inthe sequel the common symbol to denote both the bilateral and the unilateralZ-transform. Thus our compact notation for both transforms will be

f [n]Z⇐⇒ F (z) . (6.20)

Other notations that will be employed in various contexts are : F (z) = Z{f [n]}and f [n] = Z−1{ F (z)}.

6.1.2 Direct ZT of Some Sequences

The principal tool in the evaluation of the direct ZT is the geometric series sum-mation formula. For example, for the exponential sequence anu[n], we obtain

F (z) =

∞∑

n=0

anz−n =

∞∑

n=0

(a

z)n =

z

z − a (6.21)

provided |z| > |a| = ρmin. This ZT has only one singularity, viz., a simple poleat z = a. Using the notation (6.20) we write

anu[n]Z⇐⇒ z

z − a. (6.22)

Page 330: Signals and transforms in linear systems analysis

6.1 The Z-Transform 317

For the special case of the unit step we get

u[n]Z⇐⇒ z

z − 1. (6.23)

Differentiating both sides of (6.21) k times with respect to a we get

∞∑

n=0

[n(n− 1) . . . (n− k + 1)]an−kz−n =zk!

(z − a)k+1.

Dividing both sides by k! and using the binomial symbol results in the transformpair (

nk

)an−k Z⇐⇒ z

(z − a)k+1(6.24)

a ZT with a k + 1-th order pole at z = a. Another canonical signal of interestin discrete analysis is the unit impulse, denoted either by the Kronecker symbolδmn or by its equivalent δ[n−m]. Formula (6.18) gives the corresponding ZT as

δ[n−m]Z⇐⇒ z−m. (6.25)

For m > 0 this ZT is analytic everywhere except for a pole of order m at z = 0which for m < 0 becomes a pole of the same order at z = ∞.

An example of a sequence that leads to a nonrational ZT is the sequencef [n] = u[n− 1]/n

u[n− 1]/nZ⇐⇒ − ln

z − 1

z, (6.26)

which may be obtained from a Taylor expansion of ln(1−w) for |w| < 1. This ZTmay be defined as analytic function for |z| > 1 by connecting the two logarithmicsingularities at z = 1 and z = 0 with the branch cut shown in Fig. 6.5.

As in case of the LT derivation of more complex transform pairs is facilitatedby the application of several fundamental properties of the ZT which we discussin the sequel.

6.1.3 Properties

Time Shift

Forward. We are given F (z) = Z{f [n]} and wish to compute Z{f [n+k]}where k ≥ 0. Here we must distinguish between the unilateral and the bilateraltransforms. In the latter case we have

Z{f [n+ k]} =

∞∑

n=−∞f [n+ k]z−n = zk

∞∑

m=−∞f [m]z−m

= zkF (z). (6.27)

Page 331: Signals and transforms in linear systems analysis

318 6 The Z-Transform and Discrete Signals

1

BRANCH CUT

1+i00

ℑmz

ℜez

Figure 6.5: Branch cut that renders − ln [(z − 1)/z] analytic outside the unitcircle

On the other hand for the unilateral transform we obtain

Z{f [n+ k]} =∞∑

n=0

f [n+ k]z−n = zk∞∑

m=k

f [m]z−m. (6.28)

Since f [m] = 0 for m < 0 then whenever k ≤ 0 we may replace the lower limit ofthe last sum by zero in which case we again obtain (6.27). A different situationarises when k > 0 for then the last sum in (6.28) omits the first k − 1 values inthe sequence. Note that this is similar to the signal truncation for the unilateralLT in (4.49) in 4.1. We can still express (6.28) in terms of F (z) by adding and

subtracting the series∑k−1

m=0 f [m]z−m. We then obtain

Z{f [n+ k]} = zk[F (z)−k−1∑

m=0

f [m]z−m]. (6.29a)

The last result is particularly useful in the solution of finite difference equationswith constant coefficients with specified initial conditions.

Backward Shift. For k ≥ 0 we also compute Z{f [n− k], referred to as abackward shift. Instead of (6.28) we get

Z{f [n− k]} =∞∑

n=0

f [n− k]z−n = z−k∞∑

m=−kf [m]z−m. (6.29b)

Page 332: Signals and transforms in linear systems analysis

6.1 The Z-Transform 319

Note that in this case the initial conditions are specified for negative indices.When the sequence is causal1 these vanish and we get

Z{f [n− k]} = z−kF (z) . (6.29c)

Time Convolution

By direct calculation we get

Z{∞∑

k=−∞f [n− k]g[k]} =

∞∑

k=−∞Z{f [n− k]}g[k]

=

∞∑

k=−∞F (z)z−kg[k]

= F (z)G(z). (6.30)

Since∑∞

k=−∞ u[n − k]f [k] ≡ ∑nk=−∞ u[n − k]f [k] ≡ ∑n

k=−∞ f [k] we get as aby-product the formula

n∑

k=−∞f [k]

Z⇐⇒ zF (z)

z − 1, (6.31)

which is useful in evaluating the ZT of sequences defined by sums. For example,applying to (6.26) results in

n∑

k=1

1

k

Z⇐⇒ −z lnz−1z

z − 1. (6.32)

Frequency Convolution

The ZT of the product of two sequences is

Z{f [n]g[n]} =

∞∑

n=−∞f [n]g[n]z−n

=

∞∑

n=−∞

1

2πi

|z|=ρ′F (z′)z′n−1dz′g[n]z−n

=1

2πi

|z|=ρ′F (z′)G(z/z′)z′−1dz

′. (6.33)

When the inversion contour is the unit circle the preceding becomes

Z{f [n}g[n]} = 1

∫ π

−πF (θ′)G(θ − θ′)dθ′′ . (6.34)

1This does not imply that the underlying system is not causal. The shift of initial condi-tions to negative time is just a convenient way to handle forward differences.

Page 333: Signals and transforms in linear systems analysis

320 6 The Z-Transform and Discrete Signals

Initial Value Theorem

For a causal sequence we have

lim|z|→∞

F (z) = f [0], (6.35)

which follows directly from (6.18) and analyticity of F (z) for |z| > ρmin.

Differentiation

Differentiating (6.18) we get the transform pair

Z{nf [n]} = −z dF (z)dz

. (6.36)

For example, starting with u[n] = f [n} and applying formula (6.36) twice weget

n2u[n]Z⇐⇒ z(z + 1)

(z − 1)3 . (6.37)

6.2 Analytical Techniques in the Evaluationof the Inverse ZT

From the preceding discussion we see that the ZT bears the same relationship tothe FS as the LT to the FT. Just like the LT inversion formula, the ZT inversionformula (6.54) yields signals whose characteristics depend on the choice of theintegration contour. Whereas in case of the LT this nonuniqueness is due to thepossibility of the existence of several strips of analyticity within each of whichwe may place a linear integration path, in case of the ZT we may have severalannular regions of analyticity wherein we may locate the circular integrationcontour. We illustrate this with several examples.

Example 1 Let us return to the simple case of a ZT whose only singularityis a simple pole at z = a, i.e.,

F (z) =z

z − a . (6.38)

There are two annular regions of analyticity (i) |a| < ρ < ∞ (ρmin = |a| andρmax = ∞) and (ii) 0 ≤ ρ < |a| (ρmin = 0 and ρmax = |a|). In accordance with(6.19) we have to compute

f [n] =1

2πi

|z|=ρ

zn

z − adz. (6.39)

Consider first case (i). For n < 0 the integrand decays as |z| → ∞. In fact(6.39) taken over a circle of infinite radius vanishes. To see this let z = R eiθ sothat dz = R eiθidθ and the integral may be bounded as follows

Page 334: Signals and transforms in linear systems analysis

6.2 Analytical Techniques in the Evaluation of the Inverse ZT 321

∣∣∣∣∣∣∣

|z|=R

zn

z − adz

∣∣∣∣∣∣∣≤

∫ π

−π

R

|R eiθ − a|Rndθ. (6.40)

Since n < 0 the right side of the preceding expression approaches zero asR→∞.(It is not hard to see that the limit will also be zero for any F (z) that approachesa constant as |z| → ∞ a result we shall rely on repeatedly in the sequel.)Because there are no intervening singularities between the circle |z| = ρ and ∞this integral vanishes. For n ≥ 0 the integrand in (6.39) is analytic except atz = a and a residue evaluation gives an. In summary, for case (i) we have thecausal sequence

f [n] =

{0 ; n < 0,an ; n ≥ 0.

(6.41)

so that (6.38) together with the choice of contour represents the unilateral trans-form. For case (ii) the pole at z = a lies between the inversion contour and ∞.Because the integral over the infinite circular contour vanishes for n < 0 equa-tion (6.39) gives the negative residue at the pole, i.e., the integration aroundthe pole is, in effect, being carried out in the clockwise direction. This may bevisualized in terms of the composite closed contour shown in Fig. 6.6 comprisedof the circle with radius ρ, the circle at infinity, and two linear segments alongwhich the individual contributions mutually cancel. As we proceed along thecomposite contour we see that the pole is being enclosed in the clockwise direc-tion. For n ≥ 0 the integrand is analytic within the integration contour and weget zero. Thus, in summary, for case (ii) we obtain the anti-causal sequence

R

x a

ℑm z

ℜe zr

Figure 6.6: Residue evaluation using an auxiliary contour

Page 335: Signals and transforms in linear systems analysis

322 6 The Z-Transform and Discrete Signals

f [n] =

{ −an ; n < 0,0 ; n ≥ 0.

(6.42)

From the preceding example we see that if |a| > 1, i.e., the pole is outside theunit circle, the unilateral ZT (case (i)) corresponds to a sequence that growswith increasing n. Clearly for this sequence the FT does not exist. On the otherhand for case (ii) |a| > 1 results in a sequence that decays with large negativen. Since in this case the inversion contour can be chosen to coincide with theunit circle the FT exists and equals

F (ω) =eiωΔt

eiωΔt − a . (6.43)

For |a| < 1 one obtains the reverse situation: the unilateral ZT corresponds toa decaying sequence whereas the anti-causal sequence in case (ii) grows. Thefour possibilities are illustrated in Fig. 6.7.

-20 -15 -10 -5 0

case(I)

case(II)

5 10 15 200

1

2

3

4

5abs(a)>1

abs(a)<1

-20 -15 -10 -5 0 5 10 15 200

1

2

3

4

5

6abs(a)<1

abs(a)>1

Figure 6.7: Sequences corresponding to a ZT with a simple pole

Note, however, as long as |a| = 1 formula (6.43) remains valid and representsthe FT of either the causal or the anti-causal decaying sequence for |a| < 1 and|a| > 1, respectively. What happens when |a| → 1? To answer this questionwe cannot simply substitute the limiting form of a in (6.43) for (6.43) is notvalid for a on the unit circle. Rather we must approach the limit by evaluating(6.19) along a contour just inside or just outside the unit circle. The answer,not surprisingly, depends on which of the two options we choose. Suppose westart with a contour just outside the unit circle. Since we are looking for theFT we should like our integration contour to follow the path along the unit

Page 336: Signals and transforms in linear systems analysis

6.2 Analytical Techniques in the Evaluation of the Inverse ZT 323

circle as much as possible. Our only obstacle is the pole at z = eiθ0 . To remainoutside the circle we circumnavigate the pole with a small semicircle of radius εresulting in the composite integration path shown in Fig. 6.8. The contributionto the integral along the circular contour is represented by a CPV integral.For nonnegative n, the sum of the two contributions equals the residue at theenclosed pole, and vanishes for negative n. This is, of course, identical to (6.41)

1•

ℑm z

ℜe zq0

Figure 6.8: Integration path along unit circle in presence of a simple pole

with a = eiθ0 . Thus summing the two contributions along the closed contourwe have

1

2einθ0 +

1

2πCPV

∫ π

−π

einθ

1− e−i(θ−θ0) dθ ≡ u [n] einθ0 = f [n] . (6.44)

Using the identity

1

1− e−i(θ−θ0) =1

2− i

2cot [(θ − θ0)/2]

we observe that (6.44) implies the transform pair

u [n] einθ0F⇐⇒ πδ (θ − θ0) + 1

2− i

2cot [(θ − θ0)/2] . (6.45)

We leave it as an exercise to find the FT when the unit circle is approachedfrom the interior.

Page 337: Signals and transforms in linear systems analysis

324 6 The Z-Transform and Discrete Signals

Example 2 As another example consider the ZT with two simple poles

F (z) =z2

(z − 1/2)(z − 2). (6.46)

In this case there are three annular regions of analyticity: (i) 2 < ρ1, (ii)1/2 < ρ2 < 2, and (iii) 0 ≤ ρ3 < 1/2. We distinguish the three sequences bythe superscript k and evaluate

f (k)[n] =1

2πi

|z|=ρk

zn+1

(z − 1/2)(z − 2)dz, k = 1, 2, 3.

As in the preceding example F (∞) is finite so that the contribution to theinversion integral for n < 0 from a circle at ∞ is zero. The three sequences arethen found by a residue evaluation as follows.

case (i)When n ≥ 0 we have

f (1) [n] =zn+1

(z − 1/2)|z=2 +

zn+1

(z − 2)

∣∣z=1/2 =

4

32n − 1

32−n (6.47)

so that

f (1) [n] =

{0 ; n < 0,

432n − 1

32−n ; n ≥ 0.

(6.48)

case (ii)For n < 0 the negative of the residue at z = 2 contributes and for n ≥ 0 the

positive residue as in (6.47). Thus we obtain

f (2) [n] =

{ − 432n ; n < 0,

− 132

−n ; n ≥ 0.(6.49)

case (iii)No singularities are enclosed for n ≥ 0 so that f (3) [n] = 0. For n < 0 the

negatives of the two residues in (6.48) contribute, so that the final result reads

f (3) [n] =

{ − 432n + 1

32−n ; n < 0,

0 ; n ≥ 0.(6.50)

We note that only the sequence f (2) [n] possesses an FT, which is

F (ω) =e2iωΔt

(eiωΔt − 1/2)(eiωΔt − 2). (6.51)

This sequence is, however, not causal. Note that this is due to the presence of apole outside the unit circle. This same pole gives rise to the exponential growthwith n of the sequence in case (i). This sequence is causal and correspondsto the single-sided ZT. Clearly to obtain a sequence which is both causal andstable requires that all the poles of the ZT lie within the unit circle.

Page 338: Signals and transforms in linear systems analysis

6.2 Analytical Techniques in the Evaluation of the Inverse ZT 325

Example 3 In this next example

F (z) =z

(z + 1/3) (z + 3)2 (6.52)

we again have two poles one of which is a double pole. We have again threeannular regions of analyticity and we evaluate

f (k)[n] =1

2πi

|z|=ρk

zn

(z + 1/3) (z + 3)2dz, k = 1, 2, 3 (6.53)

on each of three circular contours defined by 0 ≤ ρ1 < 1/3, 1/3 < ρ2 < 3, 3 < ρ3.The residue evaluation gives

f (1) [n] =

{0 ;n ≥ 0,

− 964 (−1/3)n + (9−8n)(−3)n

64 ;n < 0,(6.54a)

f (2) [n] =

⎧⎨

(−1/3)n

(−1/3+3)2= 9

64 (−1/3)n ;n ≥ 0,

− ddz

{zn

z+1/3

}|z=−3 = (9−8n)(−3)n

64 ;n < 0,(6.54b)

f (3) [n] =

{964 (−1/3)n − (9−8n)(−3)n

64 ;n ≥ 0,0 ;n < 0.

(6.54c)

Of the three sequences only f (2) [n] possesses an FT which is

F (ω) =eiωΔt

(eiωΔt + 1/3) (eiωΔt + 3)2 . (6.55)

Recall that the sample values of the corresponding analogue signal are f (2) [n] /Δtso that the reconstruction of this signal via the Shannon sampling theorem reads

f (t) =1

Δt

−1∑

n=−∞

(9− 8n) (−3)n64

sin [π (t/Δt− n)] / [π (t/Δt− n)]

+1

Δt

∞∑

n=0

9

64(−1/3)n sin [π (t/Δt− n)] / [π (t/Δt− n)] . (6.56)

Example 4 In the preceding examples the ZT was a proper rational function.If this is not the case, a causal inverse does not exist. Consider, for example,

F (z) =z4

z2 − 1/4. (6.57)

In addition to the two simple poles at z = ±1/2 there is a second order pole atinfinity. Accordingly two analyticity regions are 1/2 < ρ1 < ∞ and 0 ≤ ρ2 <1/2. The sequence corresponding to the first region is

f (1)[n] =1

2πi

|z|=ρ1

zn+3

z2 − 1/4dz. (6.58)

Page 339: Signals and transforms in linear systems analysis

326 6 The Z-Transform and Discrete Signals

The integral over the circle at infinity vanishes provided n + 3 ≤ 0 so thatf (1)[n] = 0 for n ≤ −3. For n > −3 the only singularities within |z| < ρ1 arethe two simple poles of F (z). Summing the two residues gives

f (1)[n] =1

8

[(1

2

)n+

(−1

2

)n]u [n+ 2] . (6.59)

For the second sequence

f (2)[n] =1

2πi

|z|=ρ2

zn+3

z2 − 1/4dz (6.60)

the integration yields a null result for n+3 ≥ 0. For n+3 < 0 the integral overthe infinite circle again vanishes so that we can sum the residues of the poleslying outside the integration contour resulting in

f (2)[n] = −1

8

[(1

2

)n+

(−1

2

)n]u [−n− 4] . (6.61)

An alternative approach to dealing with an improper rational function is toemploy long division and reduce the given function to a sum of a polynomialand a proper rational function. The inverse ZT of the polynomial is then just asum of Kronecker deltas while the inverse ZT of the proper rational function isevaluated by residues. Using this approach in the present example we have

z4

z2 − 1/4= z2 +

1

4

z2

z2 − 1/4. (6.62)

The inverse of z2 is δ [n+ 2] while the residue evaluation involving the secondterm yields either a causal or anticausal sequence. In the former case we get forthe final result

f (1)[n] = δ [n+ 2] +1

8

[(1

2

)n+

(−1

2

)n]u [n] , (6.63)

which is easily seen as just an alternative way of writing (6.59).

Example 5 Let us find the FT of the causal sequence whose ZT is

F (z) =z − 2

(z − 1/2)(z − eiθ0)(z − e−iθ0) . (6.64)

This function has two simple poles on the unit circle and one interior pole.Consequently the inversion contour for the FT along the unit circle will haveto be modified by two small semi-circles surrounding the poles (instead of oneas in Fig. 6.8). The integration along each of these semi-circles will contribute

Page 340: Signals and transforms in linear systems analysis

6.3 Finite Difference Equations and Their Use in IIR and FIR Filter Design 327

one-half the residue at the respective pole while the integral along the unit circlemust be defined as a CPV integral. As a result we obtain

f [n] =1

2πCPV

∫ π

−πF

(eiθ

)einθdθ +

+1

2

eiθ0(n−1)(eiθ0 − 2

)

(eiθ0 − 1/2)(eiθ0 − e−iθ0) +1

2

e−iθ0(n−1)(e−iθ0 − 2

)

(e−iθ0 − 1/2)(e−iθ0 − eiθ0) . (6.65)

By absorbing the two residue contributions as multipliers of delta functions wecan write the complete FT as follows:

F (θ) = F(eiθ

)+ π

e−iθ0(eiθ0 − 2

)

(eiθ0 − 1/2)(eiθ0 − e−iθ0)δ (θ − θ0)

+πeiθ0

(e−iθ0 − 2

)

(e−iθ0 − 1/2)(e−iθ0 − eiθ0)δ (θ + θ0) . (6.66)

6.3 Finite Difference Equations and Their

Use in IIR and FIR Filter Design

The “method of finite differences” generally refers to the approximation ofderivatives in a differential equation using finite increments of the independentvariable. The approximate solution for the dependent variable is then found byalgebraic means. The finite difference approximation can be of the forward orof the backward type. Thus if the finite difference is defined as

y (t+Δt)− y(t)

Δt(6.67)

it is of the forward type. If it is defined by

y(t)− y (t−Δt)

Δt(6.68)

it is referred to as a backward difference. Whereas (6.67) is more commonwhen dealing directly with numerical solutions of differential equations (6.68)is generally preferred in digital signal processing mainly because y (t−Δt) hasdirect physical interpretation of a step in time delay. To illustrate the connectionbetween a differential equation and the associated difference equation considerthe simple case of the first-order equation

dy(t)

dt+ a0 y(t) = f (t) . (6.69)

With the forward difference approximation we get

y (t+Δt)− y(t)

Δt+ a0 y(t) ≈ f (t) . (6.70)

Page 341: Signals and transforms in linear systems analysis

328 6 The Z-Transform and Discrete Signals

If we are interested in y (t) only at discrete time intervals we can set t = nΔt sothat y (t+Δt) = y [Δt (n+ 1)] ≡ y [n+ 1] , y(t) = y (nΔt) ≡ y [n] and Δtf (t) =Δtf (nΔt) ≡ f [n]. Making these changes transforms (6.70) into the differenceequation

y [n+ 1] + (a0Δt− 1) y [n] = f [n] . (6.71)

With Z{y [n]} = Y (z) , Z{f [n]} = F (z) we get

Y (z) =F (z) + z y [0]

z + a0Δt− 1=

F (z)

z + a0Δt− 1+

z y [0]

z + a0Δt− 1

and upon inversion the solution of (6.71):

y [n] = Z−1{ F (z)

z + a0Δt− 1}+ (a0Δt− 1)

ny [0] . (6.72)

As far as the difference equation (6.71) is concerned (6.72) is its exact solution.The actual solution to the differential equation (the analogue problem) has tobe gotten via a limiting process. For simplicity we do this when F (z) = 0 andlook for the limit of

y [n] = (1− a0Δt)n y [0] (6.73)

as Δt → 0. This is easily done by noting that discretization tells us that t maybe replaced by nΔt. If in addition we replace a0Δt by − δ, (6.73) becomes

y [n] = y [0] (1 + δ)−(1/δ)a0t. (6.74)

Recalling the definition of e we get

limδ→0

y [n] = (1 + δ)−(1/δ) → e−1

so that (6.73) approaches y [0] e−a0t which is the solution of (6.69). On theother hand, the physical problem of interest may have been formulated ab initioas a difference equation (6.71) where a finite increment Δt has a direct physicalsignificance. In that case the limiting form would constitute a wrong answer.

The application of the forward difference operation to an N -th order dif-ferential equation with constant coefficients (3.145) in 3.3 leads, after someunpleasant algebra, to the N -th order difference equation

N∑

k=0

ak y [n− k] = f [n] . (6.75)

Taking the Z-Transform yields

N∑

k=0

akzkY (z) = F (z) +

N∑

k=0

ak

k−1∑

�=0

f [�] zk−�, (6.76)

Page 342: Signals and transforms in linear systems analysis

6.3 Finite Difference Equations and Their Use in IIR and FIR Filter Design 329

where the excitation on the right includes, in addition to the forced excitationF (z), the contribution from the initial conditions as given by (6.29a). Solvingfor the transform of the output

Y (z) =F (z)

∑Nk=0 akz

k+

∑Nk=0 ak

∑k−1�=0 y [�] z

k−�∑N

k=0 akzk

(6.77)

we identify the quantity

H+ (z) =1

∑Nk=0 akz

k(6.78)

as the system transfer function.2 If all poles are within the unit circle then, inaccordance with the results in the preceding section, the application of the in-version integral to (6.78) along the unit circle yields a causal and stable sequenceh [n]. Evaluation of (6.78) on the unit circle gives the FT

H (θ) =1

∑Nk=0 ake

ikθ= A (θ) eiψ(θ), (6.79)

where A (θ) is the amplitude and ψ (θ) the phase. The amplitude is an even andthe phase an odd function of frequency just like for continuous signals. An im-portant class of transfer functions is characterized by the absence of zeros outsidethe unit circle. They are called minimum phase-shift functions. For functionsof this type the phase can be determined from the amplitude (see 6.4.2).

Similar to the filter structures in Fig. 3.16 and 3.17 that were derived fromdifferential equations, difference equations lead to topologically similar repre-sentations. They play an important role in the design of DSP algorithms. Herewe will switch from the representation based on forward differencing we usedto derive (6.78) to backward differencing which is more common in DSP appli-cations. Figure 6.9 shows a feedback-type structure similar to that in Fig. 3.16.

The integrators have been replaced by unit delay elements (denoted by −1within the upper circles). For each delay element an input y [n] gives an outputy [n− 1] consistent with backward differencing (6.68). Referring to the figure,if we subtract the sum of the outputs from the difference operators from theinput f [n ] and multiply the result by 1/a0 we get

N∑

k=0

ak y [n− k] = f [n ] . (6.80)

Assuming y [n] = 0 for n < 0 we have for the Z-transform

Y (z)

N∑

k=0

ak z−k = F (z) (6.81)

2The subscript (+) identifies that it is based on forward differencing.

Page 343: Signals and transforms in linear systems analysis

330 6 The Z-Transform and Discrete Signals

−aN

++

+

++

+

−a2−a1

y[n−N]y[n−2]y[n−1]

f

y[n]

1/a0

+

• • • • •

• • • • •

−1 −1

[n]

Figure 6.9: Infinite impulse response filter (IIR)

and for the transfer function

H− (z) =1

∑Nk=0 akz

−k . (6.82)

Since (6.82) is based on backward differencing it does not agree with (6.78).We can see the relationship between the two by changing the summation indexfrom k to m = N − k with the result

H− (z) =zN

∑Nm=0 aN−mzm

. (6.83)

The denominator has the form of (6.78) but the polynomial coefficients havebeen interchanged so that the pole positions are different.

The transfer function (6.82) forms the basis for the design of infinite impulseresponse (IIR) filters. The name derives from the property that with poleswithin the unit circle (the usual case) the impulse response is of infinite duration.This is also true for all pole analogue filters. In fact design procedures for digitalIIR filters are essentially the same as for analogue filters. A second class offilters is finite impulse response (FIR) filters. A representative structure of anFIR filter is shown in Fig. 6.10.

[n]f [n]f

−1 −1

b0b0

[n]fb0 [n-1]fb1

b1 b2 bN

[n]y+

Figure 6.10: Finite impulse response filter (FIR)

Page 344: Signals and transforms in linear systems analysis

6.4 Amplitude and Phase Relations Using the Discrete Hilbert Transform 331

Adding the tapped and in unit steps delayed input to the input that hasbeen delayed and multiplied by the last tap we get

b0f [n] + b1f [n] + b2f [n− 2] + . . . bN−1f [n− (N − 1)] + bNf [n−N ] = y [n]

or, in compact, notation

y [n] =

N∑

k=0

bk f [n− k] . (6.84)

Here the impulse response is the finite length sequence

h [n] = bn ;n = 0, 1, 2 . . .N (6.85)

with the transfer function

HFIR (z) =

N∑

k=0

bk z−k (6.86)

and an FT3

HFIR (θ) =

N∑

k=0

bk e−ikθ. (6.87)

Since this is also an FS we can easily compute the coefficients for a prescribedH (θ) and hence the filter parameters in Fig. 6.10. The practical problem hereis that finite length sequences cannot be contained within a finite bandwidth.Generally the biggest offenders here are steep changes in the frequency spectrumas would be the case e.g., for band-pass filters with steep skirts. These problemscan be in part alleviated by tapering the sequence (i.e., in the time domain).However, generally for this and other reasons FIR filters require many taps.

A general filter structure in DSP applications combines IIR and FIR transferfunctions into the form

H(z) =

∑Nk=0 bk z

−k∑Mk=0 akz

−k . (6.88)

Such transfer functions can be realized either by using FIR and IIR filters intandem or combining them into a single structure similar to that in Fig. 3.17.

6.4 Amplitude and Phase Relations Usingthe Discrete Hilbert Transform

6.4.1 Explicit Relationship Between Real and ImaginaryParts of the FT of a Causal Sequence

We recall from 2.2.6 that the real and imaginary parts of the FT of a real causalanalogue signal are related by the Hilbert transform. In 2.4.2 this relationship

3Here and in the entire discussion of difference equations we have increased the sequencelength from that used with the DFT in Chap. 5 from N to N + 1. Consistency is easilyrestored by setting the Nth term to zero.

Page 345: Signals and transforms in linear systems analysis

332 6 The Z-Transform and Discrete Signals

is reestablished and is shown to be a direct consequence of the analytic proper-ties of the FT of causal signals; its extension to amplitude and phase of transferfunctions is discussed in 2.4.3. In the following we show that a similar set of rela-tionships holds also for causal discrete signals, or, equivalently, for bandlimitedfunctions whose Nyquist samples are identically zero for negative indices.

As in the analogue case, in (2.168) in 2.2 we start directly from the definitionof a causal signal, in this case a real sequence f [n] which we decompose into itseven and odd parts as follows:

f [n] = fe [n] + fo [n] , (6.89)

where

fe [n] =f [n] + f [−n]

2, (6.90a)

fo [n] =f [n]− f [−n]

2. (6.90b)

Withf [n]

F⇐⇒ F (θ) = R (θ) + iX (θ) , (6.91)

wherein R (θ) and X (θ) are real and imaginary parts of F (θ) it is not hard toshow that

fe [n]F⇐⇒ R (θ) , (6.92a)

fo [n]F⇐⇒ iX (θ) , (6.92b)

i.e., just as in the case of analogue signals, the odd and even parts are defined,respectively, by the real and the imaginary parts of the FT of the sequence. Inorder to carry out the transformations analogous to in (2.172) 2.2 we need theFT of the discrete sign function sign [n] defined as +1 for all positive integers,−1 for negative integers and zero for n = 0. We can find this transform from(6.45) by first expressing the unit step as follows:

u [n] =1

2(sign [n] + δ [n] + 1) , (6.93)

where δ [n] = 1 for n = 0 and 0 otherwise. With the aid of (6.45) we then obtain

sign [n]F⇐⇒ −i cot (θ/2) . (6.94)

Suppose the sequence f [n] is causal. Then in view of (6.89) and (6.90) we have

f [n] = 2fe [n]u [n] (6.95)

and also

fo [n] = sign [n] fe [n] (6.96a)

fe [n] = f [0] δ [n] + sign [n] fo [n] . (6.96b)

Page 346: Signals and transforms in linear systems analysis

6.4 Amplitude and Phase Relations Using the Discrete Hilbert Transform 333

Taking account of (6.92) and (6.94) and applying the frequency convolutiontheorem to (6.96) yield

X (θ) = − 1

2πCPV

∫ π

−πR

(θ′)cot

[(θ − θ′) /2]dθ′, (6.97a)

R (θ) = f [0] +1

2πCPV

∫ π

−πX

(θ′)cot

[(θ − θ′) /2] dθ′. (6.97b)

These relations are usually referred to as the discrete Hilbert transforms (DHT).

6.4.2 Relationship Between Amplitude and Phaseof a Transfer Function

We now suppose that A (θ) is the amplitude of the transfer function of a causaldigital filter with a real unit sample response. This means that there exists aphase function ψ (θ) such that the FT A (θ) eiψ(θ) = H (θ) has a causal inverse.The problem before us is to find ψ (θ) given A (θ). As we shall demonstrate inthe sequel, a ψ (θ) that results in a causal transfer function can always be foundprovided lnA (θ) can be expanded in a convergent Fourier series in (−π, π). Asin the corresponding analogue case the solution is not unique for we can alwaysmultiply the resulting transfer function by an all pass factor of the form eiψ0(θ)

which introduces an additional time delay (and hence does not affect causality)but leaves the amplitude response unaltered.

To find ψ (θ) we proceed as follows. Assuming that the FS expansion

lnA (θ) =

∞∑

n=−∞w [n] e−inθ (6.98)

exists we form the function

Q (θ) = lnH(θ) = lnA (θ) + iψ (θ) . (6.99)

Since the unit sample response is real, A (θ) is an even function of θ so that wemay regard lnA(θ) as the real part of the FT of a causal sequence q [n] withψ (θ) the corresponding imaginary part. In that case ψ (θ) can be expressedin terms of lnA (θ) by the DHT (6.97a) so that the solution to our problemappears in the form

ψ (θ) = − 1

2πCPV

∫ π

−πlnA

(θ′)cot

[(θ − θ′) /2] dθ′. (6.100)

The evaluation of this integral is numerically much less efficient than the fol-lowing alternative approach. It is based on the observation that for the causalsequence q [n] the FT of its even part is just the given log amplitude lnA (θ),i.e.,

w [n] = qe [n] =q [n] + q [−n]

2

F⇐⇒ lnA (θ) . (6.101)

Page 347: Signals and transforms in linear systems analysis

334 6 The Z-Transform and Discrete Signals

But in view of (6.95) we can write

q [n] = 2qe [n]u [n]F⇐⇒ lnA (θ) + iψ (θ) . (6.102)

Thus to find ψ (θ) from A (θ) we first take the inverse FT of lnA (θ), truncatethe sequence to positive n, multiply the result by 2, and compute the FT of thenew sequence. The desired phase function is then given by the imaginary partof this FT. Expressed in symbols the procedure reads:

ψ (θ) = ImF {2F−1{lnA (θ)}u [n]} . (6.103)

The complete transfer function is then

H (θ) = A (θ) ei ImF{2F−1{2 lnA(θ)}u[n]}. (6.104)

To prove that this H (θ) has a causal inverse we note that, by construction,

lnH(eiθ

)=

∞∑

n=0

q [n] e−inθ (6.105)

so that∑∞

n=0 q [n] z−n is analytic in |z| > 1. But then lnH (z) must also be

analytic outside the unit circle. In particular, this means that H (z) cannothave any poles or zeros outside the unit circle. Since the exponential is ananalytic function

H (z) = e∑∞

n=0 q[n]z−n

(6.106)

is necessarily analytic in |z| > 1 and hence has a causal inverse.Analytic functions devoid of zeros outside the unit circle are referred to as

minimum phase-shift functions, a terminology shared with Laplace transformsof analogue signals that are analytic in the right half of the s plane and free ofzeros there. Despite this common terminology minimum phase-shift functionsof sampled causal signals are endowed with an important feature not sharedby transfer functions of causal analogue signals, viz., the absence of zeros aswell as analyticity outside the unit circle also ensures the analyticity of 1/H (z)outside the unit circle. Hence the reciprocal of minimum phase-shift functionalso possesses a causal inverse, a feature of great importance in the design offeedback control systems.

6.4.3 Application to Design of FIR Filters

The preceding procedure can be used to synthesize a minimum phase FIR filterhaving a prescribed amplitude response. With A [m], m = 0, 1, . . .N − 1 theprescribed amplitude function we compute the transfer function via (6.104)

H [m] = A [m] ei ImF{2F−1{2 lnA(k)}u[n]}. (6.107)

The unit sample response h [n] is then given by

h [n] =1

N

N−1∑

m=0

H [m] ei2πnm

N ; n = 0, 1, . . .N − 1. (6.108)

Page 348: Signals and transforms in linear systems analysis

Problems 335

Equations (6.108) and (6.109) are exact but unfortunately the required numberof filter coefficients (taps) equals the number of samples, which would normallybe impractically large. It turns out, however, that for many practically usefulfilter functions the h [n] decay rapidly with increasing n so that the number oftaps can be chosen much less than N. Thus instead of the exact form of H [m]we shall use the first M values of h [n] in (6.108) and define

H(M) [m] =M−1∑

n=0

h [n] ei2πnm

N ;m = 0, 1.2 . . .N − 1. (6.109)

The degree of permissible truncation depends on the specified performance level(e.g., stopband attenuation, relative bandwidth) and is strongly dependent onthe functional form of the chosen amplitude function. The recommended proce-dure is to start with (6.108) and progressively reduce the number of taps untilthe observed amplitude spectrum begins to show significant deviations from theprescribe amplitude response.

Problems

1. The Z-transformF (z) =

z

(z2 + 1/2) (z + 16)

can represent several sequences.

(a) Find all the sequences.

(b) One of the sequences represents the coefficients of the Fourier seriesexpansion of the Fourier transform of a band-limited function. Iden-tify the sequence and find the Fourier transform of the correspondingbandlimited function.

2. The Fourier transform of a bandlimited function f(t) is given by

F (ω) =

{ 15/4−cos(πω/Ω) ; |ω| ≤ Ω

0; |ω| > Ω

(a) The function is sampled at intervals Δt = π/Ω. Find the Z-transformof the sequence f [n] = f(n Δt)Δt.

(b) Find the n-th sampled value f(n Δt) of f(t).

(c) Suppose the function is sampled at intervals Δt′ = 2π/Ω and itsFourier transform F (ω) is reconstructed using the formula.

F (ω) = Δt′∞∑

n=−∞f(nΔt′)e−iωnΔt

Compute F (ω) and sketch∣∣∣F (ω)

∣∣∣ in the interval |ω| ≤ Ω.

Page 349: Signals and transforms in linear systems analysis

336 6 The Z-Transform and Discrete Signals

3. Using Z-transforms solve the following difference equation:

y[n+ 2] + 2y [n+ 1] +1

4y [n] = (1/2)

n, n ≥ 0, y [0] = 1, y [1] = −1.

4. The Fourier transform F (ω) of a bandlimited signal f (t) is given by

F (ω) =

{sin4

(πωΩ

); |ω| ≤ Ω,

0 ; |ω| > Ω.

(a) The signal is sampled at intervals Δt = π/Ω. Find the samples.

(b) Find the Z-transform of the sampled sequence f [n] = f(nΔt)Δt.

(c) Find f (t).

(d) Suppose the signal is sampled at intervals Δt′ = 2π/Ω and its Fouriertransform is approximated by

F (ω) = Δt′∞∑

n=−∞f (nΔt′) e−iωnΔt

′.

Compute and sketch F (ω) within the band |ω| ≤ Ω.

5. The sequence y [n] satisfies the following difference equation:

y[n+ 2] +1

6y [n+ 1]− 1

6y [n] = u [n] ,

where

u [n] =

{1 ; n ≥ 0,0 ; n < 0.

(a) Assuming y [0] = 0 and y [1] = 0 find the causal solution of thedifference equation using Z transforms.

(b) Suppose the y(nΔt) = y [n] /Δt represent samples of a bandlimitedsignal with a bandwidth of 10Hz. Find the Fourier transform of thesignal.

6. Using Z-transforms solve the following difference equation:

y[n+ 2] + y [n+ 1] +1

4y [n] = (1/3)

n, n ≥ 0, y [0] = 1, y [1] = 2.

Page 350: Signals and transforms in linear systems analysis

Appendix A

Introduction to Functionsof a Complex Variable

A.1 Complex Numbers and Complex Variables

A.1.1 Complex Numbers

A complex number z is defined as a combination of two real numbers x and yand the imaginary unit i =

√−1 as follows: z = x + iy. The x and y are thenreferred to, respectively, as the real and imaginary parts of z. In symbols, wewrite x = e (z) and y = �m (z). When z = iy the complex number is saidto be imaginary (sometimes pure imaginary) and when z = x it is said to bepurely real. It is customary to represent a complex number graphically as apoint in a rectangular coordinate system wherein the real part is placed alongthe abscissa and the imaginary part along the ordinate. This representationis sometimes referred to as the Argand1 representation and the plane as theArgand plane. We shall refer to it simply as the complex plane, which is themore common designation. For example, the three points in Fig.A.1 representthe complex numbers 3 + i4 and 2 − i3 and 4 − i2. Of course, we could justas well have identified the three points by the coordinate pairs (3, 4), (2,−3),(4,−2) so that this geometrical representation appears indistinguishable fromthat of a of two-dimensional real Cartesian vector with x-component x = e (z)and y-component y = �m (z). The magnitude (length) of such a vector is the

nonnegative quantity√x2 + y2 represented by an arrow drawn for the coordi-

nate origin while its direction may be specified by the angle arctan(y/x). Byanalogy with real two-dimensional vectors we associate with the complex num-ber z = x+ iy the magnitude r ≡

√x2 + y2 ≡ |z| and an angle θ =arctan(y/x)

1after Jean Robert Argand (1768–1822)

W. Wasylkiwskyj, Signals and Transforms in Linear Systems Analysis,DOI 10.1007/978-1-4614-3287-6, © Springer Science+Business Media, LLC 2013

337

Page 351: Signals and transforms in linear systems analysis

338 Introduction to Functions of a Complex Variable

x

y

• −3+i2

• 3+i4

• 4−i2

Figure A.1: Complex number representation in the Argand plane

θ

r

x

yz

Figure A.2: Polar form of a complex number

with the latter referred to either as the argument of z (and written θ = arg(z))or as the phase angle of z. Taking account of these definitions we have

z = x+ iy = r cos θ + ir sin θ = reiθ, (A.1)

where we have used the Euler formula2

eiθ = cos θ + i sin θ. (A.2)

The last term in (A.1) is referred to as the polar form of a complex number whichcan be represented as in Fig.A.2. The addition of complex numbers is carriedout by adding their real and imaginary parts and appending the imaginary uniti to the imaginary part. Thus with z1 = x1 + iy1 and z2 = x2 + iy2 the sum issimply z3 = z1 + z2 = x1 + x2 + i(y1 +y2) so that e (z3) = x1 + x2 = x3 and

2The usual proof of this remarkable formula proceeds by showing equality of the Taylorseries expansions of eiθ and cos θ + i sin θ. A more elementary proof is the following. De-fine a function f(θ) = (cos θ + i sin θ) e−iθ. Differentiating the right side with respect to θyields (− sin θ + i cos θ) e−iθ −i (cos θ + i sin θ) e−iθ = 0 for all θ. Hence f(θ) is a constant,independent of θ. Since f(0) = 1, this constant equals unity and Euler’s formula follows.

Page 352: Signals and transforms in linear systems analysis

A.1 Complex Numbers and Complex Variables 339

y

z1

z2

z3

x2 x1xx3

y3

y2

y1

Figure A.3: Parallelogram law of addition

�m (z3) = y1 +y2 = y3. This leads to the parallelogram construction illustratedin Fig.A.3 just as for real two-dimensional vectors. Despite this similarity com-plex numbers are not vectors since they do not share all properties of vectors.Nevertheless the term vector had been applied to complex numbers particularlyin the older Engineering literature dealing with electric circuit analysis. In ac-cordance with current terminology in electric circuit analysis a representationof parallelogram addition of complex numbers as in Fig.A.3 would be referredto as a phasor diagram.

Multiplication of complex numbers follows from the defining relation of theimaginary unit i. Thus i2 = −1, (−i)2 = −1, (−i) (i) = 1. Also, since in ac-cordance with (A.2) i = eiπ/2, we obtain in = eiπn/2 = cosnπ/2 + i sinnπ/2and i−n = e−iπn/2 = cosnπ/2− i sinnπ/2. For n > 0 these correspond, respec-tively, to counterclockwise and clockwise rotations by πn/2 along the unit circlein the complex plane. Thus multiplication of two arbitrary complex numbersz1 = x1 + iy1, z1 = x1 + iy1 results in

z1z2 = (x1 + iy1) (x2 + iy2) = x1x2 − y1y2 + i (y1x2 + y2x1) . (A.3)

A simple geometrical interpretation can be had by writing z1 and z2 in polarform. Thus with z1 = |z1| eiθ1 and z2 = |z2| eiθ2 we get

z1z2 = |z1| |z2| ei(θ1+θ2)

so that magnitudes get multiplied and the angles (arguments) are added, asshown in Fig. A.4. A similar geometrical interpretation obtains for divisionwhere the magnitudes are divided and the angles subtracted. Thus

z1z2

=|z1||z2|e

i(θ1−θ2)

.

Page 353: Signals and transforms in linear systems analysis

340 Introduction to Functions of a Complex Variable

z1z2

z2

q1+q2 q1q2 x

y

z1

Figure A.4: Multiplication of complex numbers

The conjugate of z = x + iy is obtained by changing i to −i. For this purposewe use the symbol ∗. Thus z∗ = x− iy. Obviously (z∗)∗ = z. Multiplication ofa complex number by its conjugate results in

zz∗ = (x+ iy) (x− iy) = x2 + y2 = r2 = |z|2 .

The operation conjugation is a useful tool when division is done in terms of the(Cartesian) component rather than the polar form and a separation in real andimaginary parts is desired. This is facilitated by a multiplication of numeratorand denominator by the conjugate of the denominator. Thus

z1z2

=x1 + iy1x2 + iy2

=(x1 + iy1) (x2 − iy2)(x2 + iy2) (x2 − iy2) =

x1x2 + y1y2 + i (y1x2 − y2x1)x22 + y22

so thate (z1/z2) = (x1x2 + y1y2) /

(x22 + y22

)

and�m (z1/z2) = (y1x2 − y2x1) /

(x22 + y22

).

Exponentiation is best performed by using the polar form of the complexnumber. Thus

zw = |z|w eiwθ

= ew ln|z|eiwθ

with w = u + iv an arbitrary complex number. Of particular interest is theinverse of the above problem with w = N a positive integer, i.e., that of findingz when the right side is known. Stated differently, we want to solve

zN = q

Page 354: Signals and transforms in linear systems analysis

A.1 Complex Numbers and Complex Variables 341

v y

u x

f(z)

Figure A.5: Representation of f (z) as a mapping from the z to the w-plane

for z for a given (possibly complex) q = |q| eiθ. Since this is an N -th orderequation, by the fundamental theorem of algebra it must have N roots. We findthese roots by noting that

q = qei2πk ; k = 0,±1,±2, . . .

so that

z ≡ zk = q1/Nei2πk/N = |q|1/N eiθ/Nei2πk/N ; k = 0, 1, 2, . . .N − 1.

Also, since the coefficient of the N−1 power in the algebraic equation zN−q = 0is zero the sum of the roots is likewise zero, i.e.,

N−1∑

k=0

ei2πk/N = 0. (A.4)

A.1.2 Function of a Complex Variable

In general, a function of a complex variable z may be defined as a mapping of aset of complex numbers z = x+ iy in a region the complex z-plane (the domainof the function) into another set complex of numbers w = u + iv falling into aregion of the complex w-plane (the range of the function) in accordance with arule denoted symbolically by f(z). We represent this mapping analytically bywriting

w = f(z) = u (x, y) + iv (x, y) (A.5)

and may be thought of geometrically as shown in Fig.A.5. We shall be primarilyinterested in functions whose domain is the entire z-plane as, e.g.,

w = u+ iv = f(z) = z2 = (x+ iy)2= x2 − y2 + i2xy

so that u = x2 − y2 and v = 2xy.

Page 355: Signals and transforms in linear systems analysis

342 Introduction to Functions of a Complex Variable

A.2 Analytic Functions

A.2.1 Differentiation and the Cauchy–RiemannConditions

By analogy with real variables, it is reasonable to define the derivative of afunction of a complex variable at z = z0 as the limit

limΔz−→0

f (z0 +Δz)− f (z0)Δz

. (A.6)

However, unlike for a real variable such a limit is not unique for it may dependon the direction (i.e., argument) assumed by Δz as its magnitude tends to zero.Functions

This potential lack of uniqueness can be seen from the phasor diagram inFig.A.6 where Δz may approach zero from any initial position on the rim of inthe circular region with radius |Δz| and for each such position the limits maydiffer. Evidently for a derivative of f(z) to possess a unique value at z = z0requires that the limit (A.6) be independent of the direction (argument) of theincrement Δz. Functions with this property are referred to analytic functions.

We can readily establish the necessary conditions for the existence of aderivative at a point z0 = x0 + iy0 by first rewriting the limit (A.6) in terms ofreal and imaginary parts of f (z). Thus with

f(x+ iy) = u(x, y) + iv(x, y)

Eq. (A.6) reads

limΔz→0

f (z0 +Δz)− f (z0)Δz

= limΔz→0

u(x0 +Δx, y0 +Δy) + iv(x0 +Δx, y0 +Δy)− u(x0, y0)− iv(x0, y0)Δx+ iΔy

.

(A.7)

Δz

x

y

z 0 + Δz

z 0

Figure A.6: Directions though which the increment Δz can approach zero

Page 356: Signals and transforms in linear systems analysis

A.2 Analytic Functions 343

A necessary condition for the existence of the derivative is that the approachto the limit in (A.7), when evaluated for any two distinct directions in Fig.A.6,remains the same. For example, suppose we evaluate the limits for Δz = Δxand for Δz = iΔy. In the first instance we obtain

limΔx→0

f (z0 +Δx)− f (z0)Δx

= limΔx→0

u(x0 +Δx, y0)− u(x0, y0) + iv(x0 +Δx, y0)− iv(x0, y0)Δx

= limΔx→0

Δu+ iΔv

Δx=∂u

∂x+ i

∂v

∂x. (A.8a)

Next, taking the increment along y, we get

limΔv→0

f (z0 + iΔy)− f (z0)iΔy

= limΔy→0

u(x0, y0 +Δy)− u(x0, y0) + iv(x0, y0 +Δy)− iv(x0, y0)iΔy

= limΔy→0

Δu+ iΔv

iΔy= −i∂u

∂y+∂v

∂y. (A.8b)

Thus, equating (A.8a) to (A.8b) provides us with the necessary set of conditionsfor the existence of a derivative at z = z0 :

∂u

∂x=

∂v

∂y, (A.9a)

∂u

∂y= −∂v

∂x. (A.9b)

Equations (A.9a) and (A.9b) are known as the Cauchy–Riemann (CR) condi-tions. It turns out that they are also sufficient conditions for the existence ofa derivative. To prove sufficiency we take an arbitrary increment Δf (z) andwrite it in terms of the partial derivatives of u and v as follows:

Δf (z) = Δu(x, y) + iΔv(x, y)

=∂u

∂xΔx+

∂u

∂yΔy + i

[∂v

∂xΔx+

∂v

∂yΔy

].

Next we substitute the CR conditions from (A.9) to obtain

Δf (z) =∂u

∂xΔx− ∂v

∂xΔy + i

[∂v

∂xΔx+

∂u

∂xΔy

]

=∂u

∂x(Δx+ iΔy) +

∂v

∂x(−Δy + iΔx)

=

(∂u

∂x+ i

∂v

∂x

)(Δx+ iΔy) =

(∂u

∂x+ i

∂v

∂x

)Δz.

Page 357: Signals and transforms in linear systems analysis

344 Introduction to Functions of a Complex Variable

From the last relation we note that the increment Δz is arbitrary. Hence weare entitled to the assertion that the direction along which Δz approaches 0 isimmaterial, and write

limΔz→0

Δf (z)

Δz=∂u

∂x+ i

∂v

∂x=∂v

∂y− i∂u

∂y=df (z)

dz= f

′(z) (A.10)

A.2.2 Properties of Analytic Functions

A function satisfying the CR conditions at a point is said to be analytic at thatpoint. A function which possesses a derivative at all points within a region of the complex plane is said to be analytic in . It is easy to see that a sum ofanalytic functions is analytic. Likewise the product of two analytic functions isanalytic. To show this, consider two functions f1 (z) and f2(z) both analytic in. We decompose these into their respective real and imaginary parts

f1 (z) = u1 + iv1 (A.11a)

f2(z) = u2 + iv2 (A.11b)

and, as a notational convenience, denote the partial derivatives of the real andimaginary parts by subscripts. The CR conditions for each function then assumethe form

u1x = v1y, u1y = −v1x, (A.12a)

u2x = v21y, u2y = −v2x. (A.12b)

Withg(z) = f1 (z) f2(z) = U + iV

we then obtain via the substitution of (A.11)

U = u1u2 − v1v2, (A.13a)

V = v1u2 + u1v2. (A.13b)

Analyticity of g(z) requires

Ux = Vy , (A.14a)

Uy = −Vx . (A.14b)

By direct calculation we obtain

Ux = u1u2x + u1xu2 − v1v2x − v1xv2, (A.15a)

Vy = v1u2y + v1yu2 + u1v2y + u1yv2, (A.15b)

Uy = u1u2y + u1yu2 − v1v2y − v1yv2, (A.15c)

Vx = v1u2x + v1xu2 + u1v2x + u1xv2. (A.15d)

Page 358: Signals and transforms in linear systems analysis

A.2 Analytic Functions 345

Using (A.12) in (A.15b) we set u2y = −v2x, v1y = u1x, v2y = u2x, u1y = −v1xand 14a follows. Similarly we can verify (A.14b).

The simplest analytic function is a constant. On the next level of complexityis f (z) = z which clearly is analytic for all finite values of z. This is also thecase for f(z) = z2 as can be seen by writing

z2 = x2 − y2 + i2xy = u+ iv

and showing that the CR conditions are satisfied. Thus

∂u

∂x= 2x,

∂v

∂y= 2x =

∂u

∂x,∂u

∂y= −2y, ∂v

∂x= 2y = −∂u

∂y.

From this and the fact that a product of two analytic functions is analytic followsthat zn is analytic for any nonnegative integer. Thus a polynomial, being a sumof powers of z, is an analytic function for all |z| <∞.

Examples of analytic functions that are not polynomials are the trigono-metric functions sin z and cos z. By decomposing them into real and imagi-nary parts with the aid of the trigonometric addition formulas one can showthat the CR conditions are satisfied in the entire finite complex plane and thatd (sin z) /dz = cos z and d (cos z) /dz = − sin z. Thus not only are the sinusoidsanalytic for all |z| <∞, but all the derivatives are analytic as well.3

A.2.3 Integration

The formal definition of an integral of a function of a complex variable is quitesimilar to that of a line integral in a plane for real variables. The integrationis carried out over a curve C connecting two points in the complex plane, say,z1 = x1 + iy1 and z2 = x2 + iy2, as shown in Fig. A.7, and the integral definedby the usual Riemann sums applied to the real and imaginary constituents:

I =

∫ z2

z1

f (z)dz =

∫ (x2,y2)

(x1,y1)

[u (x, y) + iv(x, y)] (dx + idy)

=

C

{u [x, y (x)] dx− v [x, y (x)] dy}+ i

C

{u [x, y (x)] dx+ v [x, y (x)] dy}(A.16)

In general, the value of the integral I will depend on the choice of the curveconnecting z1 and z2. However, if the path of integration is confined entirely tothe region of analyticity of f (z), then I will be independent of the chosen curveC. The proof of this assertion requires an auxiliary result, known as Green’stheorem.

3In fact we shall show in the sequel that this is a general property of analytic functions,i.e., an analytic function possesses derivatives of all orders.

Page 359: Signals and transforms in linear systems analysis

346 Introduction to Functions of a Complex Variable

(x1, y1)

(x2, y2)

y(x)C

x

y

Figure A.7: Integration path for line integral in the complex plane

C

x

y

a b

c

d

g2(x)

g1(x)

f2(y)

f1(y) dy

Figure A.8: Closed integration contour

Green’s Theorem. Any two real functions P (x, y) and Q(x, y) with contin-uous partial derivatives satisfy

C

(Pdx+Qdy) =

∫∫

A

(∂Q

∂x− ∂P

∂y

)dxdy, (A.17)

where the line integral is taken in a counterclockwise direction along a closedplanar curve C and double integral over the area A enclosed by the curve, asshown in Fig. A.8.

Page 360: Signals and transforms in linear systems analysis

A.2 Analytic Functions 347

We note from the figure that the closed contour C can be traced using thetwo functions f2 (y) and f1 (y) and that the elementary contribution to the areaA is

dA = [f2 (y)− f1 (y)] dy.Hence the integral of ∂Q/∂x over the area enclosed by the closed contour is

∫∫

A

∂Q

∂xdxdy =

∫ d

c

dy

∫ f2(y)

f1(y)

∂Q

∂xdx =

∫ d

c

dy {Q [f2 (y) , y]−Q [f1 (y) , y]} .

(A.18)Evidently the last integral can be interpreted as a line integral taken in the

counterclockwise direction along the close contour C encompassing the area A.Accordingly, we rewrite it in the notation used in (A.17):

C

Qdy =

∫∫

A

∂Q

∂xdxdy. (A.19)

Alternatively, we can trace out the closed contour using the functions g2 (x)and g1 (x) so that the elementary area contribution becomes

dA = [g2 (x)− g1 (x)] dx.

Hence

∫∫∂P

∂yA

dxdy =

∫ b

a

dx

∫ g2(x)

g1(x)

∂P

∂ydy =

∫ b

a

dx {P [x, g2 (x)]− P [x, g1 (x)]}

= −∮

C

Pdx (A.20)

so that ∮

C

Pdx = −∫∫

∂P

∂yA

dxdy. (A.21)

Adding (A.19) and (A.21) we obtain (A.17).Q.E.D.

We the aid of Green’s theorem we can now establish a fundamental propertyof analytic functions.

Theorem 1 Let f (z) be an analytic function in . Then

C

f (z)dz = 0 (A.22)

for any closed curve lying entirely in .

Page 361: Signals and transforms in linear systems analysis

348 Introduction to Functions of a Complex Variable

Γ1

Γ2z1

z2

x

y

Figure A.9: Integrals along two different paths

Proof.∮

C

f (z)dz =

C

(u+ iv) (dx+ idy) =

C

(udx− vdy) + i

C

(vdx + udy) . (A.23)

Using Green’s theorem

C

(udx− vdy) =

∫∫

A

(−∂v∂x− ∂u

∂y

)dxdy, (A.24a)

C

(vdx + udy) =

∫∫

A

(∂u

∂x− ∂v

∂y

)dxdy. (A.24b)

In accordance with the CR conditions the right side of (A.24a) and (A.24b) isidentically zero validating (A.22). �

One corollary of the preceding theorem is that an integral of an analyticfunction taken along a path connecting two fixed points is dependent on thepath as long as the curve connecting the points lies entirely within the regionof analyticity of the function. Thus assuming that the closed curve in Fig. A.9lies within the region of analyticity of f (z) we have

∮f(z)dz = 0. (A.25)

Alternatively,4 we can write this as a sum of two integrals over paths Γ1 and Γ2

∮f(z)dz =

∫ z2

z1(Γ2)

f (z)dz −∫ z2

z1(Γ1)

f (z) dz = 0 (A.26)

4Here and in the following the symbol

∮denotes an integral carried out over a closed

path in the counterclockwise direction.

Page 362: Signals and transforms in linear systems analysis

A.3 Taylor and Laurent Series 349

so that ∫ z2

z1(Γ1)

f (z)dz =

∫ z2

z1(Γ2)

f (z)dz (A.27)

over two arbitrary curves Γ1 and Γ2.

A.3 Taylor and Laurent Series

A.3.1 The Cauchy Integral Theorem

We have shown that the product of two analytic functions is analytic. Thus forany analytic function f(z), the function g(z) defined by the product

g(z) =f(z)

z − z0 (A.28)

will be analytic in the same region as f(z) with a possible exception of thepoint z = z0. Therefore an integral of g(z) carried out along any closed pathlying entirely within the region of analyticity must yield zero, and, in particular,along the closed path shown in Fig. A.10. Tracing out this path in the directionindicated by the arrows, we first traverse the circular segment C0 in the clockwisedirection and then, moving along the linear segment L1, reach the outer contourC. Continuing in the counterclockwise direction along C we connect to pathsegment L2 which brings us back to C0. The corresponding contributions to theclosed path integral are

−∮

C0

g(z)dz +

L1

g(z)dz +

C

g(z)dz +

L2

g(z)dz = 0. (A.29)

x

q

yL1 L2

r

z0C C0

Figure A.10: Integration path

Page 363: Signals and transforms in linear systems analysis

350 Introduction to Functions of a Complex Variable

In the limit as the two line segments approach coincidence the contributionsfrom L1 and L2 cancel so that (A.29) reduces to

C

f(z)

z − z0dz =

C0

f(z)

z − z0 dz. (A.30)

On the circular contour C0 we set z − z0 = reiθ and obtain

−∮

C0

f(z)

z − z0 dz = −∫ −2π

0

ireiθf(z0 + reiθ)

reiθdθ = i

∫ 2π

0

f(z0 + re−iθ)dθ. (A.31)

Using this in (A.30) we have5

C

f(z)

z − z0 dz = i

∫ 2π

0

f(z0 + re−iθ)dθ. (A.32)

After setting r = 0 we solve for f(z0) and obtain the Cauchy integral formula

f(z0) =1

2πi

C

f(z)

z − z0 dz. (A.33)

As indicated in Fig. A.10, the integration is to be carried out in the counter-clockwise direction along a contour enclosing the point z0 but can otherwisebe arbitrary. If the contour does not enclose z0, we are integrating an analyticfunction and the integral vanishes. This result can be rephrased more formallyas follows. Let f(z) be an analytic function in a region and C a closed contourwithin . Let C represent a region within entirely enclosed by C. Then

1

2πi

C

f(z)

z − z0dz =

{0, when z0 /∈ C ,f(z0),when z0 ∈ C . (A.34)

Since for z0 = z the integrand in (A.33) is an analytic function of z0 we areentitled to differentiation with respect to z0 under the integral sign. We thenobtain

f (1)(z0) =1

2πi

C

f(z)

(z − z0)2dz. (A.35)

The same argument applies also to an n-fold differentiation. This leads to theformula

f (n)(z0)

n!=

1

2πi

C

f(z)

(z − z0)n+1 dz, (A.36)

5Note that according to (A.31) the radius of the circle C0 along which we are integratingcan be arbitrarily small.

Page 364: Signals and transforms in linear systems analysis

A.3 Taylor and Laurent Series 351

ς

rmax

C

z

z0

Figure A.11: Integration paths in the derivation of the Taylor series

where the integral again vanishes when the contour fails to enclose z0. Formula(A.36) embodies an extremely important property of analytic functions. Itstates that an analytic function possesses derivatives of all orders and that allof these derivatives are also analytic functions. This result forms the basis for theTaylor as well as the Laurent expansions to be discussed in the sequel. Anotherimportant consequence of the Cauchy integral formula is that a function analyticand bounded for all z must be a constant, a result referred to a Liouville’sTheorem. Its proof is quite straightforward. Thus if f (z) is analytic for any z0and bounded i.e., |f (z0)| < M , then with |z − z0| = r

∣∣∣f (1)(z0)∣∣∣ =

∣∣∣∣∣∣∣

1

2πi

|z−z0|

f(z)

(z − z0)2dz

∣∣∣∣∣∣∣≤ 1

∣∣∣∣∣∣∣

|z−z0|

f(z)

(z − z0)2dz

∣∣∣∣∣∣∣≤ M

r.

Since M is independent of r, we can set r = ∞. Then∣∣f (1)(z0)

∣∣ = 0 for all z0so that f (z0) is a constant. Functions that are analytic for all finite z are calledentire (or integral) functions. Examples are polynomials, ez, sin z, cos z, Jn (z)(with n an integer).

A.3.2 The Taylor Series

Let f(z) be an analytic function within the region enclosed by C as shown inFig.A.11. We apply the Cauchy integral theorem (A.33) to the closed contourC where we denote the variable of integration by ς. The function f(z) is thengiven by

f(z) =1

2πi

C

f(ς)

ς − z dς. (A.37)

Page 365: Signals and transforms in linear systems analysis

352 Introduction to Functions of a Complex Variable

Recall that z, which in the notation of (A.34) now plays the role of z0, mustlie within the contour C. Presently the point z0 shown in the figure is just anarbitrary point within C. We use it to rewrite the denominator of (A.37) bysetting

ζ − z = ζ − z0 + z0 − z = (ζ − z0)[1− z − z0

ζ − z0

](A.38)

1

ς − z =1

ς − z0

[1

1− z−z0ς−z0

]. (A.39)

In the region ∣∣∣∣z − z0ς − z0

∣∣∣∣ < 1 (A.40)

(A.39) can be represented by the convergent geometric series

1

ς − z =1

ς − z0∞∑

n=0

(z − z0ς − z0

)n, (A.41)

which we substitute into the Cauchy integral formula (A.37) and obtain

f(z) =

∞∑

n=0

⎢⎣1

2πi

r+max

f(ς)

(ς − z0)n+1 dς

⎥⎦ (z − z0)n , (A.42)

where in view of (A.40) we have shifted the integration contour from C to rmax.Comparing the terms in braces with (A.36) we see that (A.42) is equivalent to

f(z) =

∞∑

n=0

f (n) (z0)

n!(z − z0)n , (A.43)

which is the Taylor series expansion of f(z) about the point z = z0. Its region ofconvergence is defined by (A.40) and represented geometrically in Fig. A.11 bythe circle with center at z0 and radius rmax. From the geometrical constructionin the figure it is evident that rmax is the shortest distance from z0 to theintegration contour C. The radius of convergence can be increased by moving z0further away from the integration contour provided the circle remains entirelywithin region of analyticity of f(z) as prescribed by (A.40). Thus for f(z)analytic in the region shown in Fig.A.12 every point can serve as an expansionpoint of a Taylor series. These expansions are alternative representations ofthe same function where each representation has its own circle of convergencewhose maximum radius is governed by the proximity of the expansion point tothe boundary of the region of analyticity. For example, if we were to choosethe three points z1, z2, and z3 as shown in the figure as Taylor series expansionpoints we would obtain convergence only within the circles with radii r1, r2, andr3, respectively. When the convergence regions corresponding to two or moreexpansion points overlap one has the choice of several Taylor representations

Page 366: Signals and transforms in linear systems analysis

A.3 Taylor and Laurent Series 353

r2z2

z1

z3r3

r1

Figure A.12: Convergence regions for Taylor series

all converging within the same region. For example, the common region ofconvergence for expansions about z1 and z2 corresponds to the shaded area inFig.A.12. In this case f(z) can be represented by two different series both ofwhich converge within the shaded region. When the circles of convergence aredisjoint, as, for example, z1 and z3, the two respective Taylor series still representthe same function. This identification is referred to as analytic continuation.

In Fig. A.12 the boundary between the region where a function is analyticand where it fails to be analytic is represented by a continuous curve. Morefrequently a function will fail to be analytic only at a set of discrete points as,e.g., the function in (A.28). As a special case let us set f(z) = 1 and determineTaylor expansions of

g(z) =1

z − 2. (A.44)

Evidently this function is analytic for all finite z with the exception of thepoint z = 2. Hence we should be able to find a convergent Taylor series aboutany point z = 2. For example, let us pick z = 0. Dividing the numerator andthe denominator in (A.44) by 2 and using the geometric series expansion in z/2,we get

g(z) =∞∑

n=0

(− 1

2n+1

)zn (A.45)

a Taylor series that converges for |z − 1| < 2. Suppose we expand g(z) aboutthe point z = 1. We again use the geometric series approach and obtain6

g(z) =1

z − 2= − 1

1− (z − 1)=

∞∑

n=0

− (z − 1)n, (A.46)

which converges for |z − 1| < 1. The circles of convergence of (A.45) and (A.46)are shown in Fig. A.13.

6We could have also used (A.36) but in this particular case the indirect approach is simpler.

Page 367: Signals and transforms in linear systems analysis

354 Introduction to Functions of a Complex Variable

•••1 20

Figure A.13: Circles of convergence for series (A.45) and (A.46)

i

i

i

C+b

C−b

z

V +

V −

z0

r+max

r−min

Figure A.14: Convergence regions for the Laurent series

A.3.3 Laurent Series

Just like the Taylor series the Laurent series it is based on the Cauchy integralformula. We start by positioning the expansion point z0 within the regionbounded by the closed curve C−

b shown in Fig. A.14 where f(z) is not necessarilyanalytic. The region of analyticity currently of interest to us lies between thetwo closed boundaries: the inner boundary C−

b and the outer boundary C+b .

The evaluation of the Cauchy integral proceeds along the path in the directionindicated by the arrows. To close the path we introduce, in addition to C+

b

and C−b , two straight closely spaced line segments. The integrations along these

segments are carried out in opposite directions so that their contributions cancel.

Page 368: Signals and transforms in linear systems analysis

A.3 Taylor and Laurent Series 355

Thus the curves C+b and C−

b alone enclose the region of analyticity of f(z) sothat ∮

C+b

f(z)dz −∮

C−b

f(z)dz = 0. (A.47)

Applying the Cauchy integral formula to the same closed path we have

f(z) =1

2πi

C+b

f(ς+)dς+

ς+ − z − 1

2πi

C−b

f(ς−)dς−

ς− − z . (A.48)

As in the Taylor series we set

\frac{1}{\varsigma^+ - z} = \frac{1}{\varsigma^+ - z_0 - (z - z_0)} = \frac{1}{(\varsigma^+ - z_0)\left[ 1 - \frac{z - z_0}{\varsigma^+ - z_0} \right]} = \frac{1}{\varsigma^+ - z_0} \sum_{n=0}^{\infty} \left( \frac{z - z_0}{\varsigma^+ - z_0} \right)^n,    (A.49)

which converges for |z − z_0| < |ς^+ − z_0|. In Fig. A.14 this corresponds to the interior of the circle with radius r_max^+. Thus we can represent the first of the two integrals in (A.48) by a series that converges within the same circle, provided we shift the path of integration in each term of the series from the original contour C_b^+ to the circle of radius r_max^+. We then get

\frac{1}{2\pi i} \oint_{C_b^+} \frac{f(\varsigma^+)\,d\varsigma^+}{\varsigma^+ - z} = \sum_{n=0}^{\infty} a_n (z - z_0)^n,    (A.50)

where

a_n = \frac{1}{2\pi i} \oint_{r_{\max}^+} \frac{f(\varsigma^+)\,d\varsigma^+}{(\varsigma^+ - z_0)^{n+1}}.    (A.51)

Eq. (A.50) converges within the circle |z − z_0| < r_max^+. We note that (A.50) appears to be identical with the Taylor series expansion in (A.36). This identity is, however, purely formal because, despite the fact that (A.50) converges within the same region as (A.36), it does not converge to f(z) but only to the function represented by the first of the two integrals in (A.48). This function is referred to as the regular part of f(z).

To develop a series representation for the second integral in (A.48) we againutilize the geometric series. Thus

\frac{1}{\varsigma^- - z} = -\frac{1}{(z - z_0)\left[ 1 - \frac{\varsigma^- - z_0}{z - z_0} \right]} = -\frac{1}{z - z_0} \sum_{n=0}^{\infty} \left( \frac{\varsigma^- - z_0}{z - z_0} \right)^n,    (A.52)


which converges for |ς^- − z_0| < |z − z_0|. In Fig. A.14 this corresponds to points outside of the circle with radius r_min^-. With the aid of (A.52) we now construct a series representation of the second integral in (A.48):

-\frac{1}{2\pi i} \oint_{C_b^-} \frac{f(\varsigma^-)\,d\varsigma^-}{\varsigma^- - z} = \sum_{n=-\infty}^{-1} a_n (z - z_0)^n,    (A.53)

where

a_n = \frac{1}{2\pi i} \oint_{r_{\min}^-} (\varsigma^- - z_0)^{-n-1} f(\varsigma^-)\,d\varsigma^-.    (A.54)

The function represented by (A.53) is called the principal part of f(z). The coefficient a_{-1} plays a special role in complex variable theory and is referred to as the residue of f(z). The sum of (A.50) and (A.53) is the Laurent series

f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n,    (A.55)

which converges within the annular region r_min^- < |z − z_0| < r_max^+ to the analytic function f(z). By virtue of the analyticity of f(z) the expansion coefficients a_n, represented by the two separate expressions (A.51) and (A.54), can be combined into the single formula

a_n = \frac{1}{2\pi i} \oint (\varsigma - z_0)^{-n-1} f(\varsigma)\,d\varsigma,    (A.56)

where the integral may be evaluated over any closed path within the annular region of convergence of (A.55). In particular, the residue of f(z) is

a_{-1} = \frac{1}{2\pi i} \oint f(\varsigma)\,d\varsigma.    (A.57)
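Formula (A.56) lends itself to a direct numerical check: sampling the contour integral on a circle inside the annulus recovers the coefficients. The sketch below is an illustration, not from the text; it assumes numpy and uses f(z) = 1/((z − 1)z), whose expansion about z_0 = 0 is worked out in (A.62) below.

```python
# Numerical evaluation of the Laurent coefficients (A.56): integrate
# (ς - z0)^(-n-1) f(ς) around a circle lying inside the annulus of
# convergence, here |z| = 1/2 inside the annulus 0 < |z| < 1.
import numpy as np

def laurent_coeff(f, n, z0=0.0, radius=0.5, samples=4096):
    theta = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    zeta = z0 + radius * np.exp(1j * theta)
    dzeta_dtheta = 1j * radius * np.exp(1j * theta)   # dς/dθ on the circle
    integrand = (zeta - z0) ** (-n - 1) * f(zeta) * dzeta_dtheta
    integral = integrand.mean() * 2.0 * np.pi         # ∮ ... dς
    return integral / (2j * np.pi)

f = lambda z: 1.0 / ((z - 1.0) * z)
print([laurent_coeff(f, n).real.round(6) for n in (-2, -1, 0, 1)])
# [0.0, -1.0, -1.0, -1.0], i.e. f(z) = -1/z - 1 - z - ... as in (A.62)
```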

A.4 Singularities of Functions and the Calculus of Residues

A.4.1 Classification of Singularities

Isolated Singularities

Definitions. In regions where a function fails to be analytic it is termed singular. A function can be singular on a discrete set of points or over a continuum (i.e., on an uncountable set of points). Singularities at discrete points will be referred to as isolated, those over a continuum as extended. Isolated singularities can be classified as poles or as essential singularities. A function f(z) has a pole of order N at z = z_0 if

f(z)(z - z_0)^n


is analytic for n ≥ N but fails to be analytic for n < N. For example,

f(z) = \frac{z^2}{z - 2}    (A.58)

has a pole of order 1 (also called a simple pole) at z = 2 and

f(z) = \frac{z}{(z - 1)^2}    (A.59)

has a pole of order 2 at z = 1. One can generalize this definition to encompass poles at infinity. In that case z, z^2, and z^3 represent, respectively, poles of order one, two, and three. In accordance with this definition (A.58) has, in addition to the pole at z = 2, a simple pole at infinity. We note that an N-th order pole at z = z_0 can be removed by multiplying the function by (z − z_0)^N. We shall call a singularity that can be removed by multiplying the function by a polynomial a removable singularity. For poles at infinity we will widen this definition by including multiplications by z^{-n}. A function can have multiple removable singularities. For a ratio of polynomials these are all the zeros of the denominator. For example, f(z) = z^5/[(z − 1)(z − 2)^3] has a simple pole at z = 1, a pole of order 3 at z = 2, and a simple pole at infinity. A function with removable singularities need not be a ratio of polynomials; for example, sin z/z^2 has a removable simple pole at z = 0. An isolated singularity that cannot be removed by multiplying the function by a polynomial is referred to as an essential singularity. For example, the function e^{1/z} fails to be analytic only at z = 0. This singularity cannot be removed by multiplying the function by z^n, as is apparent from the Laurent expansion

e^{1/z} = 1 + \frac{1}{z} + \frac{1}{2!\,z^2} + \frac{1}{3!\,z^3} + \frac{1}{4!\,z^4} + \ldots    (A.60)

Replacing 1/z by z transforms (A.60) into the Taylor series for e^z, which has an essential singularity at infinity but is otherwise analytic. Functions that have no singularities other than essential singularities at infinity are called entire functions. Many elementary functions fall into this category; examples are sin z, cos z, and the Bessel function J_n(z).

One necessary characteristic of entire functions is that they can be represented by an infinite series of ascending powers of z, i.e., a Taylor expansion about z = 0. This is, however, not a sufficient property, for such expansions can also be obtained for ratios of polynomials, where the singularities can be removed. Such removal is not possible for essential singularities in general, and for the singularities of entire functions at infinity in particular.

A function can have both removable and essential singularities. For example,cos z/z has a simple pole at z = 0 and an essential singularity at infinity.

Laurent Series for Functions with Isolated Singularities. In Sect. A.3 the type of singularities outside the regions of convergence for the Taylor and Laurent series was left unspecified. In the following we assume



Figure A.15: Convergence regions for Laurent expansions of 1/z(z − 1)

that they are poles. In this case the Taylor and Laurent series coefficients can generally be found without the explicit use of (A.56). For example, consider the function

f(z) = \frac{1}{(z - 1)z}    (A.61)

with its two simple poles shown in Fig. A.15. There are four possible Laurent series. The two for the pole at z = 0 converge in the annular regions⁷ defined by 0 < |z| < 1 and 1 < |z| < ∞, whose common boundary is the circle centered at the origin in Fig. A.15. For the pole at z = 1 the boundary between the convergence regions 0 < |z − 1| < 1 and 1 < |z − 1| < ∞ is the circle centered at z = 1.

For z = 0 we obtain the first of the two Laurent series by expanding 1/(1 − z) in a geometric series and dividing each term by −z. This yields

f(z) = -\frac{1}{z} - 1 - z - z^2 - z^3 - \ldots,    (A.62)

which converges within the unit circle |z| < 1. We note that the residue, as defined by (A.56), equals −1. To obtain the second series we divide the numerator and the denominator of (A.61) by z^2 so that

f(z) = \frac{1}{z^2} \, \frac{1}{1 - \frac{1}{z}}    (A.63)

and expand 1/(1 − z^{-1}) in a geometric series. The result is the Laurent expansion

f(z) = \frac{1}{z^2} + \frac{1}{z^3} + \frac{1}{z^4} + \ldots,    (A.64)

which converges for |z| > 1.

⁷ In all four cases the inner circle of the annulus has a vanishingly small radius and encloses the pole.


For the pole at z = 1 we can again rearrange f(z) and use the geometric series. The series converging for 0 < |z − 1| < 1 is

f(z) = \frac{1}{(z - 1 + 1)(z - 1)} = \frac{1}{z - 1}\left[ 1 - (z - 1) + (z - 1)^2 - (z - 1)^3 \ldots \right]
     = \frac{1}{z - 1} - 1 + (z - 1) - (z - 1)^2 + (z - 1)^3 \ldots    (A.65)

and for 1 < |z − 1| < ∞

f(z) = \frac{1}{(z - 1 + 1)(z - 1)} = \frac{1}{(z - 1)^2} \, \frac{1}{1 + \frac{1}{z - 1}}
     = \frac{1}{(z - 1)^2} - \frac{1}{(z - 1)^3} + \frac{1}{(z - 1)^4} - \frac{1}{(z - 1)^5} \ldots    (A.66)
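The expansions valid outside the unit circles can again be checked numerically; the following sketch (an illustration assuming numpy, not part of the text) sums (A.64) and (A.66) at a point in their common region of convergence.

```python
# Check the Laurent expansions (A.64) (valid |z| > 1) and (A.66)
# (valid |z - 1| > 1) of f(z) = 1/((z - 1)z) against f itself.
import numpy as np

def f(z):
    return 1.0 / ((z - 1.0) * z)

def series_A64(z, terms=120):
    k = np.arange(2, terms)          # 1/z^2 + 1/z^3 + ...
    return np.sum(z ** (-k))

def series_A66(z, terms=120):
    k = np.arange(2, terms)          # alternating powers of 1/(z - 1)
    return np.sum((-1.0) ** k * (z - 1.0) ** (-k))

z_out = 2.5 + 1.0j                   # satisfies |z| > 1 and |z - 1| > 1
print(f(z_out), series_A64(z_out), series_A66(z_out))   # values agree
```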

Multivalued Functions and Extended Singularities. Functions such as f^*(z) and |z| that are not analytic anywhere on the complex plane are only of peripheral interest in analytic function theory. Here we will be dealing with functions that fail to be analytic only along a line or curve in the complex plane. They generally result from the imposition of constraints on multivalued functions. As an example consider the function

f(z) = \sqrt{z}.    (A.67)

Using the polar form z = r e^{i\theta},

f(z) = r^{1/2} e^{i\theta/2},    (A.68)

where r^{1/2} will be defined as nonnegative. Along the positive segment of the x-axis θ = 0 and x = r, so that f(z) = r^{1/2} > 0. When θ is incremented by 2π we return to the same point x = r e^{i2π} = r but not to the same value of f(z). Instead, it equals −r^{1/2}. To return to r^{1/2} requires an additional rotation of 2π. Thus at any point of the complex plane \sqrt{z} equals either r^{1/2} e^{iθ/2} or −r^{1/2} e^{iθ/2}.

To return to the same value of the function, smaller fractional powers require more than two rotations. For example, z^{1/3} requires 3 rotations. As a result, at the point z_1 = r_1 e^{iθ_1} = r_1 e^{iθ_1} e^{i2πk}, k = 0, 1, 2, \ldots, z^{1/3} can assume one of 3 possible values, i.e.,

z_1^{1/3} = \begin{cases} r_1^{1/3} e^{i\theta_1/3}, \\ r_1^{1/3} e^{i\theta_1/3} e^{i2\pi/3}, \\ r_1^{1/3} e^{i\theta_1/3} e^{i4\pi/3}. \end{cases}    (A.69)

Each of these values can be plotted on a separate complex plane, referred to as a Riemann sheet. For example, for \sqrt{z} we can identify two Riemann sheets, for z^{1/3} three, and for z^{1/n}, n. With this construction, each branch of the function is single valued on the corresponding Riemann sheet. Points at which all branches of a function have the same value are referred to as branch points. For z^{1/n} the



Figure A.16: Values of z^{1/2} and z^{1/3} on both sides of the branch cut on the top Riemann sheet

branch point is at the coordinate origin. We still have to define the boundaries among the different Riemann sheets, i.e., the domains of the branches of the function. This is done by introducing a branch cut, which in the present examples can be a straight line running from the branch point to infinity. Upon crossing this branch cut one winds up on the next Riemann sheet. Continuing, one eventually returns to the top sheet. Any direction of the line is permitted, but the specification of a particular direction defines a specific set of Riemann sheets. For example, a branch cut at θ = θ_0 restricts the angle on the top Riemann sheet to θ_0 ≤ θ < 2π + θ_0, on the second Riemann sheet to 2π + θ_0 ≤ θ < 4π + θ_0, and on the n-th Riemann sheet to 2π(n − 1) + θ_0 ≤ θ < 2πn + θ_0. For z^{1/3}

and θ_1 > θ_0 the three values in (A.69) lie, respectively, on the first, second, and third Riemann sheet. Crossing a branch cut provides a smooth transition to the next Riemann sheet. This is not the case if the domain of the function is restricted to only one of the Riemann sheets, since this forces the function to assume different values on the two sides of the branch cut. In contrast to the point singularities discussed in the previous subsection, here we have a step discontinuity across the branch cut and continuity along it. We shall call this type of singularity an extended singularity to distinguish it from the point singularities discussed in the preceding subsection. Figure A.16 shows the branch cut at θ = θ_0 as well as the values on the top Riemann sheet on both sides of the branch cut for z^{1/2} and z^{1/3}. Across the branch cut the discontinuities for z^{1/2} and z^{1/3} are, respectively, 2r^{1/2} e^{iθ_0/2} and r^{1/3}\sqrt{3}\, e^{iθ_0/3}. Excluding points directly on the branch cut, these functions are analytic on the entire top Riemann sheet. They are also analytic at points arbitrarily close to the branch cut, as can be demonstrated by a direct application of the limiting forms of the CR conditions.
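The jump across the cut is easy to exhibit numerically. The sketch below (an illustration assuming numpy, not from the text) evaluates the top-sheet branch of z^{1/2}, defined by restricting θ to [0, 2π), just above and just below a cut along the positive x-axis (θ_0 = 0).

```python
# Top-sheet branch of sqrt(z) for a branch cut at θ0 = 0:
# the polar angle is forced into [0, 2π).
import numpy as np

def sqrt_top_sheet(z):
    theta = np.angle(z) % (2.0 * np.pi)      # angle in [0, 2π)
    return np.sqrt(np.abs(z)) * np.exp(1j * theta / 2.0)

r, eps = 4.0, 1e-12
above = sqrt_top_sheet(r + 1j * eps)         # θ ≈ 0:  value +2
below = sqrt_top_sheet(r - 1j * eps)         # θ ≈ 2π: value -2
print(above, below, above - below)           # jump ≈ 2 r^(1/2) e^{iθ0/2} = 4
```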

One novelty not encountered with single-valued functions is that path deformations and, in particular, the construction of closed paths may require integration around branch cuts. As an example, consider the function z^{-1/2},



Figure A.17: Integration around a branch cut

again with its branch point at the origin. We use the branch cut in Fig. A.16 with θ_0 = 0 and evaluate the integral

I = \oint \frac{dz}{\sqrt{z}}    (A.70)

on the top Riemann sheet over the closed path shown in Fig. A.17. Integrating in the counterclockwise direction, the contribution from the two straight line segments L^- and L^+ for sufficiently small ε is

\int_1^0 \left( -z^{-1/2} \right) dz + \int_0^1 z^{-1/2}\,dz = 4    (A.71)

and around the circle C we get

\int_0^{2\pi} i e^{i\theta} e^{-i\theta/2}\,d\theta = -4.    (A.72)

Finally, the contribution of the semicircle c with center at the branch point and radius ε is

\int_{3\pi/2}^{\pi/2} i\varepsilon\, e^{i\theta}\, \varepsilon^{-1/2} e^{-i\theta/2}\,d\theta,    (A.73)

which vanishes as ε approaches 0. Adding the three contributions we get I = 0, as expected of an analytic function.
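The circle contribution (A.72) is easily confirmed by quadrature; the short check below is a sketch assuming numpy.

```python
# Numerical check of (A.72): on the unit circle with the θ ∈ [0, 2π)
# branch, the integrand i e^{iθ} e^{-iθ/2} = i e^{iθ/2} integrates to -4.
import numpy as np

theta = np.linspace(0.0, 2.0 * np.pi, 20001)
y = 1j * np.exp(1j * theta / 2.0)
h = theta[1] - theta[0]
integral = h * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)   # trapezoidal rule
print(integral)                                          # ≈ -4 + 0j
```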

A.4.2 Calculus of Residues

The Residue Theorem

Integration of a function along a contour enclosing isolated singularities plays an important role in the application of complex variables. In the following we present the underlying theory.



Figure A.18: Equivalence between the integration along the contour C and the circles enclosing the isolated singularities

Let f(z) be an analytic function within the closed curve C except at N isolated singularities at z_ℓ, ℓ = 1, 2, \ldots, N, as shown in Fig. A.18. We enclose each of the N singularities within C by circles C_ℓ. Since an integral of an analytic function along a closed path is zero, we can deform the curve C so that the integral along C equals the sum of the integrals along the N circles within C. This is expressed by

\oint_C f(z)\,dz = \sum_{\ell=1}^{N} \oint_{C_\ell} f_\ell(z)\,dz,    (A.74)

where f_\ell(z) is the local value of f(z) surrounding the ℓ-th singularity and where all the integrations run in the counterclockwise direction. Because the singularities are isolated there is a finite circular region where f_\ell(z) can be represented by the Laurent series

f_\ell(z) = \sum_{n=-\infty}^{\infty} a_{n\ell} (z - z_\ell)^n,    (A.75)

where a_{n\ell} is the n-th coefficient of the Laurent series within the ℓ-th circle, ℓ = 1, 2, \ldots, N. Integrating over C_\ell as indicated in (A.74),

\oint_{C_\ell} f_\ell(z)\,dz = \sum_{n=-\infty}^{\infty} a_{n\ell} \oint_{C_\ell} (z - z_\ell)^n\,dz.    (A.76)

The values of the integrals within the sum are independent of the paths enclosing the singularities. Since we have chosen circles, the integrations result in

\oint_{C_\ell} (z - z_\ell)^n\,dz = \int_0^{2\pi} r_\ell^n e^{in\theta_\ell}\, i r_\ell e^{i\theta_\ell}\,d\theta_\ell = \begin{cases} 2\pi i, & n = -1, \\ 0, & n \neq -1. \end{cases}    (A.77)


Substituting into (A.76) and then into (A.74) we obtain

\frac{1}{2\pi i} \oint f(z)\,dz = \sum_{\ell=1}^{N} a_{-1\ell}.    (A.78)

In the special case in (A.57) the a_{-1} coefficient was referred to as the residue of f(z). Formula (A.78) states that the integral of a function having only isolated singularities along a closed path equals 2πi times the sum of the enclosed residues. It should be noted that the preceding derivation does not assume that the singularities are poles. For example, for the function e^{1/z}

\oint e^{1/z}\,dz = 2\pi i,

which is in agreement with the coefficient of the 1/z term of the Laurent series in (A.60).
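Because e^{1/z} has an essential singularity rather than a pole, it makes a good test case for a numerical confirmation; the sketch below (assuming numpy, not from the text) integrates e^{1/z} around the unit circle.

```python
# Numerical check: ∮ e^{1/z} dz around |z| = 1 equals 2πi, i.e. the
# residue of e^{1/z} (the 1/z coefficient in (A.60)) times 2πi.
import numpy as np

theta = np.linspace(0.0, 2.0 * np.pi, 20000, endpoint=False)
z = np.exp(1j * theta)
integral = np.mean(np.exp(1.0 / z) * 1j * z) * 2.0 * np.pi   # dz = iz dθ
print(integral, 2j * np.pi)        # both ≈ 6.2832j
```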

Computation of Residues

Suppose the function f(z) has an M-th order pole at z = z_\ell. Since this is a removable singularity, the function g(z) = (z - z_\ell)^M f(z) is analytic at z = z_\ell and has the Taylor expansion

g(z) = (z - z_\ell)^M f(z) = \sum_{k=0}^{\infty} \frac{g^{(k)}(z_\ell)}{k!} (z - z_\ell)^k.    (A.79)

On the other hand, the Laurent expansion for f (z) is

f(z) = \frac{a_{-M}}{(z - z_\ell)^M} + \frac{a_{-M+1}}{(z - z_\ell)^{M-1}} + \cdots + \frac{a_{-1}}{z - z_\ell} + a_0 + a_1 (z - z_\ell) + \cdots    (A.80)

Multiplying (A.80) by (z − z_\ell)^M we get

g(z) = a_{-M} + a_{-M+1}(z - z_\ell) + \cdots + a_{-1}(z - z_\ell)^{M-1} + a_0 (z - z_\ell)^M + a_1 (z - z_\ell)^{M+1} + \cdots

Comparing the coefficient of (z - z_\ell)^{M-1} with that in (A.79), we identify the residue as

a_{-1} = \frac{1}{(M-1)!} \left. \frac{d^{M-1}}{dz^{M-1}} \left[ (z - z_\ell)^M f(z) \right] \right|_{z = z_\ell}.    (A.81)

For a simple pole this formula reduces to

a_{-1} = \lim_{z \to z_\ell} (z - z_\ell) f(z).    (A.82)

An alternative expression to (A.82) can be found by taking explicit account of the generic form of f(z), i.e.,

f(z) = \frac{g(z)}{h(z)}.    (A.83)



Figure A.19: Integration path for evaluation of I

Since h(z) has a simple zero at z = z_\ell, its Taylor expansion is

h(z) = (z - z_\ell)\, h^{(1)}(z_\ell) + (z - z_\ell)^2\, h^{(2)}(z_\ell)/2 + \cdots

With the Taylor expansion for g(z) we form the ratio

f(z) = \frac{g(z_\ell) + (z - z_\ell)\, g^{(1)}(z_\ell) + (z - z_\ell)^2\, g^{(2)}(z_\ell)/2 + \cdots}{(z - z_\ell)\, h^{(1)}(z_\ell) + (z - z_\ell)^2\, h^{(2)}(z_\ell)/2 + \cdots},    (A.84)

where g(z_\ell) \neq 0. Long division gives the Laurent expansion

f(z) = \frac{g(z_\ell)}{h^{(1)}(z_\ell)} \, \frac{1}{z - z_\ell} + \text{Taylor series,}

so that

a_{-1} = \frac{g(z_\ell)}{h^{(1)}(z_\ell)}.    (A.85)

This formula avoids the limiting process in (A.82) and is generally preferred.
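For routine checks of (A.81), (A.82), and (A.85), a computer algebra system is convenient; the following sketch (assuming sympy, not part of the text) recomputes the residues of the two examples (A.58) and (A.59).

```python
# Residues of the examples (A.58) and (A.59) via sympy.
import sympy as sp

z = sp.symbols('z')
print(sp.residue(z**2 / (z - 2), z, 2))    # 4: (A.85) with g = z^2, h' = 1
print(sp.residue(z / (z - 1)**2, z, 1))    # 1: (A.81) with M = 2
```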

Evaluation of Integrals. One of the applications of residue calculus is the evaluation of improper integrals (integrals with infinite limits). For example, consider the integral

I = \int_0^{\infty} \frac{dx}{1 + x^4} = \frac{1}{2} \int_{-\infty}^{\infty} \frac{dx}{1 + x^4} = \frac{1}{2} \lim_{R \to \infty} \int_{-R}^{R} \frac{dz}{1 + z^4}.    (A.86)

The denominator is a product of four linear factors with zeros at z_k = e^{i\pi(2k+1)/4}, k = 0, 1, 2, 3, that correspond to the four simple poles of 1/(1 + z^4). We enclose the poles in the upper half plane, i.e., z_0 and z_1, by the path consisting of the segment −R < x < R of the real axis and a semicircle of radius R in the upper half plane, as shown in Fig. A.19.

Integrating 1/(1 + z^4) in the counterclockwise direction along the semicircle,

I_R = \int_0^{\pi} \frac{i e^{i\theta} R\,d\theta}{1 + R^4 e^{i4\theta}},


we get the bound

|I_R| \le \left| \int_0^{\pi} \frac{i e^{i\theta} R\,d\theta}{1 + R^4 e^{i4\theta}} \right| \le \int_0^{\pi} \left| \frac{i e^{i\theta} R}{1 + R^4 e^{i4\theta}} \right| d\theta < \frac{\pi}{R^3},    (A.87)

so that

\lim_{R \to \infty} I_R = 0.

Thus for sufficiently large R the integration along the x-axis in (A.86) is equivalent to the integration of 1/(1 + z^4) along the closed path in Fig. A.19.

In accordance with (A.78) this equals 2πi times the sum of the residues at the two enclosed poles. Since the poles are simple we are entitled to use (A.85). A simple calculation yields

2\pi i \left\{ \left. \frac{1}{4z^3} \right|_{z = e^{i\pi/4}} + \left. \frac{1}{4z^3} \right|_{z = e^{i3\pi/4}} \right\} = \frac{\pi}{\sqrt{2}},

and taking account of the factor of two in (A.86) we get the final result I = \pi/(2\sqrt{2}). This evaluation procedure can be generalized to integrals of the form

I_N = \int_{-\infty}^{\infty} \frac{P_{N-2}(x)}{Q_N(x)}\,dx,    (A.88)

where P_{N−2}(x) and Q_N(x) are polynomials of order N − 2 and N, respectively, and Q_N(x) has no real zeros. Evaluating the residues at the poles enclosed in the upper half plane then gives

I_N = 2\pi i \sum_{\ell} a_{-1\ell}.    (A.89)
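Before moving on, the specific example above can be cross-checked by direct quadrature; the following sketch (assuming scipy, not part of the text) confirms I = π/(2√2) ≈ 1.1107.

```python
# Quadrature check of (A.86): ∫₀^∞ dx/(1 + x⁴) = π/(2√2).
import numpy as np
from scipy.integrate import quad

val, err = quad(lambda x: 1.0 / (1.0 + x**4), 0.0, np.inf)
print(val, np.pi / (2.0 * np.sqrt(2.0)))   # both ≈ 1.1107
```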

A different situation arises for integrals of the form

I = \int_{-\infty}^{\infty} f(x)\, e^{ixt}\,dx,    (A.90)

where t is a real parameter. In this case the vanishing of the contribution from the integration along the semicircle (as in Fig. A.19) as its radius tends to infinity depends on f(x) and on the sign of t. Conditions under which this contribution vanishes are given by the Jordan lemma, which for our purposes may be stated as follows:

Referring to Fig. A.20, let f(z) be an analytic function in the region |z| > R_0^- (or |z| > R_0^+) that includes the semicircular contour C^- (C^+). If on C^- (C^+)

\lim_{R \to \infty} f(R e^{i\theta}) \to 0,    (A.91)

then for t < 0 (t > 0)

\lim_{R \to \infty} \int_{C^-(C^+)} f(z)\, e^{izt}\,dz \to 0.    (A.92)

Figure A.20: Coordinates in the proof of the Jordan lemma

Proof. We first deal with C^- and denote the corresponding integral by I_R:

I_R = \int_{C^-} f(z)\, e^{izt}\,dz = \int_0^{-\pi} f(R e^{i\theta})\, e^{itR e^{i\theta}}\, iR e^{i\theta}\,d\theta
    = \int_0^{-\pi} e^{iRt\cos\theta}\, f(R e^{i\theta})\, e^{-Rt\sin\theta}\, iR e^{i\theta}\,d\theta
    = -\int_0^{\pi} e^{iRt\cos\theta}\, f(R e^{-i\theta})\, e^{Rt\sin\theta}\, iR e^{-i\theta}\,d\theta.    (A.93)

We bound it as follows:

|I_R| \le \int_0^{\pi} \left| f(R e^{-i\theta}) \right| e^{Rt\sin\theta} R\,d\theta \le 2 \max_{\theta} \left| f(R e^{-i\theta}) \right| \int_0^{\pi/2} e^{Rt\sin\theta} R\,d\theta.    (A.94)

For 0 < θ < π/2, sin θ > 2θ/π. Since the argument of the exponential in (A.94) is negative (recall t < 0), we can only increase the integral by replacing sin θ with 2θ/π. Making this replacement we get

\int_0^{\pi/2} e^{-R|t|\sin\theta} R\,d\theta < \int_0^{\pi/2} e^{-2|t|\theta R/\pi} R\,d\theta = \frac{\pi}{2|t|}\left( 1 - e^{-R|t|} \right) \le \frac{\pi}{2|t|}.    (A.95)

Substituting (A.95) in (A.94) we obtain the final result:

|I_R| \le \frac{\pi}{|t|} \max_{\theta} \left| f(R e^{-i\theta}) \right|.    (A.96)

In view of (A.91), |I_R| approaches zero, which proves the lemma for t < 0. The steps in the proof for t > 0 are identical. What distinguishes the two cases is that implicit in the proof for t < 0 is the requirement of analyticity of f(z) in the lower half plane, whereas for t > 0 it is the upper half plane. ∎
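The mechanism of the lemma can also be observed numerically; the sketch below (assuming numpy, not from the text) evaluates the C^- contribution for f(z) = 1/(1 + z²) and t = −1 at increasing radii.

```python
# Decay of the semicircle contribution in the Jordan lemma: integrate
# f(z) e^{izt} over the lower semicircle z = R e^{iθ}, θ from 0 to -π,
# for f(z) = 1/(1 + z²) and t = -1.
import numpy as np

def ctrapz(y, x):
    # simple trapezoidal rule (keeps the sketch numpy-version independent)
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

def lower_semicircle(R, t, samples=200001):
    theta = np.linspace(0.0, -np.pi, samples)
    z = R * np.exp(1j * theta)
    integrand = np.exp(1j * z * t) / (1.0 + z**2) * 1j * z   # dz = iz dθ
    return ctrapz(integrand, theta)

for R in (10.0, 100.0, 1000.0):
    print(R, abs(lower_semicircle(R, t=-1.0)))    # tends to 0 as R grows
```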

In general, to evaluate an integral by residue calculus one must be able to identify in the complex plane an integration path that can be closed by adding to it the integration range of the integral to be evaluated without changing its value. In the case of (A.86), the path was closed by the addition of a semicircle along which the integration was reduced to zero. This reduction was achieved entirely by the algebraic decay of the integrand which, when expressed


in polar coordinates, reduces to 1/R^3. As indicated in (A.88), for integrands that are ratios of polynomials, the degree of the denominator must exceed that of the numerator by at least 2. Multiplication by the polar coordinate differential on the circle introduces an additional factor of R, so that the net decay rate is only 1/R.

Consider now the integral

I(t) = \int_{-\infty}^{\infty} \frac{-ix}{1 + x^2}\, e^{ixt}\,dx.    (A.97)

Here, as in (A.88), we can form a closed path by supplementing the integration range with a semicircle and letting its radius approach infinity. The decay of the integrand is now exponential, and in accordance with the Jordan lemma the decay rate of the multiplier of the exponential is immaterial as long as it approaches zero at infinity. The integrand in (A.97) satisfies this criterion. The corresponding closed path can be constructed using either C^+ or C^-, depending on the sign of t. Continuing the integrand into the complex domain and closing the path with one of the semicircles, we write

I(t) = \begin{cases} \oint_{C^+} \dfrac{-iz}{1 + z^2}\, e^{izt}\,dz, & t > 0, \\[1ex] \oint_{C^-} \dfrac{-iz}{1 + z^2}\, e^{izt}\,dz, & t < 0, \end{cases}    (A.98)

where, in accordance with Fig. A.20, the integral on C^+ runs in the counterclockwise direction (positive residue summation) and on C^- in the clockwise direction (negative residue summation). The integrand has two simple poles, one at z = i and the other at z = −i. For t > 0 the exponential decays in the upper half plane. In accordance with the Jordan lemma the contribution from C^+ vanishes, so that we can equate (A.97) to the first integral in (A.98) and hence to 2πi times the enclosed residue. We get I(t) = πe^{−t}. For t < 0 the integrand decays on C^-. Closing the path of integration in the lower half plane, we pick up the residue at z = −i with a negative sign (clockwise traversal) and obtain I(t) = −πe^{t}. Both results can be included in the compact form

I(t) = \operatorname{sgn}(t)\, \pi e^{-|t|}.    (A.99)
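Since the sign conventions here are easy to trip over, a numerical verification is worthwhile. The sketch below (assuming scipy, not part of the text) reduces (A.97) to 2∫₀^∞ x sin(xt)/(1 + x²) dx (the remaining part of the integrand is odd and integrates to zero) and compares the result with (A.99).

```python
# Check of (A.99): I(t) = ∫ -ix e^{ixt}/(1+x²) dx = sgn(t) π e^{-|t|}.
# By symmetry I(t) = 2 ∫₀^∞ x sin(xt)/(1+x²) dx, evaluated with the
# oscillatory (QAWF) weight option of scipy's quad.
import numpy as np
from scipy.integrate import quad

def I_numeric(t):
    val, _ = quad(lambda x: x / (1.0 + x**2), 0.0, np.inf,
                  weight='sin', wvar=abs(t))
    return 2.0 * np.sign(t) * val

for t in (2.0, 0.5, -0.5, -2.0):
    print(t, I_numeric(t), np.sign(t) * np.pi * np.exp(-abs(t)))
```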

One important application of the Jordan lemma is in the inversion of Fourier and Laplace transforms. Because the direct Laplace transform is defined in terms of the real parameter s, the lower half plane and its association with the negative time range as used herein correspond to the right half plane in the Laplace transform. Thus to rephrase the statement of the Jordan lemma in terms of the Laplace transform variable one should replace z by −is.



Index

A
Airy function, 186
  for large arguments, 187
Amplitude fading, 216
Analytic functions, 352
Analyticity of FT, 158
Anharmonic Fourier series, 114

B
Bandlimited functions, 281
Bandlimited signal defined by infinite number of samples, 288
Bandpass, 150–158
  and demodulated baseband spectra, 151
  process, 155
  representation, 150
Baseband I&Q power spectra, 158
Bessel inequality, 55
BIBO stability, 206
Bilateral Z-transform, 324
Biorthogonal set, 24
Branch cut, 370

C
Calculus of residues, 371
Causality, 202
Cauchy principal value (CPV) integrals, 121
Cauchy residue theorem, 323
Chirped pulse, 190
Communications channel modeling, 215
Completeness definition, 76, 77
Complex baseband signal, 152, 153
Convergence
  at interval endpoints, 92
  in mean, 76
  nonuniform, 92
  pointwise, 76
  at step discontinuities, 60, 68, 88, 93, 138
Convolution
  Fourier transform, 131
  Laplace transform, 257
  Z-transform, 328
Cubic phase nonlinearities, 189

D
Definition and analytic properties, 270
Delta function, limiting forms, 65, 66, 68, 111
Denumerable set of step discontinuities, 73
Differential equations
  general theory, 235–242
  LTI and time-varying, 217–235
Differential operator, 232
Differentiation with respect to transform variable, 256
Discrete form of Gram matrix, 233
Discrete Fourier transform (DFT)
  finite impulse response filter (FIR), 340
  IIR and FIR transfer functions, 341
  infinite impulse response filter (IIR), 340
  matrix properties, 311
  periodic extensions, 311
  resolution improvement by 0-padding, 319
Dispersion curve, 180


Dispersion effects in optical fibers, 189
Distortionless transmission, 178
Domain of operator, 12
Doppler spread, 216
Duhamel integral, 209

E
Effects of finite transmitter spectral line width, 192
Eigenvalue problems, 38
Electrical network realization, 233
Elementary wave function, 182
Entire function, 282
Equivalent first-order vector system, 226
Evaluation of inverse FT, 164
Evaluation of inverse LT, 258
  of rational functions, 260
Expansion functions, 22, 76
  and their duals, 24

F
Feedback representation, 218, 239
Fejer summation
  comparison of Fejer and Fourier convergence, 99
  Fejer and Fourier integral approximations, 141
  Fejer and Fourier integral kernels, 140
  Fejer summation technique and Fourier series, 96
  higher order Fejer approximations, 100
First order, 217
Folding frequency, 295
Fourier series (FS)
  anharmonic, 114
  completeness
    for cosine series, 104–106
    for exponential FS, 86
    for sine series, 106–108
  and delta function representation, 94
  extension of a function for cosine series, 105
  Fourier series, LMS approximation and normal equations, 85
  interpolation with Fourier series
    cosine, 109
    exponential, 108
  kernel, 86
  and Poisson sum formula, 136
Fourier transform (FT) integral
  amplitude and phase (Paley–Wiener condition), 164
  analytic signals, 142
  basic properties of FT, 127
  comparison of Fourier and Fejer kernels, 141
  completeness
    for exponential FT, 120
    for sine and cosine series, 197
  continuous and discrete Fourier spectra, 121
  convergence and CPV (Cauchy principal value) integrals, 121
  determination of the unilateral Laplace transform from FT, 278
  discrete Fourier transform, 307
  Fejer and Fourier integral approximations, 141
  Fejer and Fourier integral kernels, 140
  Fourier cosine and sine transforms and normal equations, 195
  Fourier integral, 117
  Fourier transforms and normal equations, 118
  instantaneous frequency, 144
  relationship between FT and FCT (Fourier cosine transform), 197
  relationships between FT and unilateral LT, 277
  short-time Fourier transform (STFT), 174
  sliding-window Fourier transform (Gabor transform), 174, 175
  time-frequency analysis, 169, 170


Frame, 48
Frequency and time domain representations, 102
Frequency dispersion, 183
Fresnel integrals, 148
Function of exponential order, 246
Function spaces, 20
Functions with branch point singularities, 264

G
Gabor spectrogram, 175
Gaussian window, 174
Gibbs phenomenon, 92
Gram determinant, 19
Gram matrix, 19
Gram–Schmidt orthogonalization, 27
Group delay, 179

H
Heisenberg uncertainty principle, 173
Hermitian matrix, 19
Higher order singularity function, 69
Hilbert transforms, 162
  and analytic functions, 159
  causal signals, 132
  discrete Hilbert transforms (DHT), 343

I
Immunity to noise interference, 20
Improper integral, 122
Impulse response, 203
Infinite impulse response filter (IIR), 340
Infinite orthogonal systems, 76
Initial value theorems
  Fourier transform, 135–136
  Laplace transform, 253–257
  Z-transform, 321, 326, 330
Inner product, 14
  invariance, 131
  of vectors, 23
Inphase and quadrature components, 152, 155, 156
Integral transform techniques, 203
Inversion formula (LT), 251, 271
Inversion of Gram matrix, 31
Isolated singularities, 366

J
Jordan lemma, 165, 375

K
Karhunen–Loeve expansion, 78

L
Laguerre polynomials, 60
Laplace transforms (LT), 257
  determination of the single-sided Laplace transform from the FT, 278
  double-sided, 270
  single-sided, 246
Laurent series, 322, 325, 364
Least mean square (LMS) approximation, 14
  by sinusoids spanning a continuum, 117
Legendre polynomials, 29, 57
Linear independence, 17
Linear operator, 203
Linear systems, 203
Linear time-invariant (LTI) systems, 158, 207
  response to exponentials, 209
LMS approximation. See Least mean square (LMS) approximation
LTI systems. See Linear time-invariant (LTI) systems

M
Matrix condition numbers, 20
Mean squared error, 14
Meromorphic functions, 262
Method of least squares, 14
Method of stationary phase, 144
Minimum phase-shift functions, 344
Moore–Penrose pseudoinverse, 40
Moving source of radiation, 217
Multivalued functions, 369


N
Norm
  axiomatic definition, 165
  triangle inequality, 165
Normal equations, 31
  and continuous summation index, 33
  in discrete form, 31
  generalizations, 32
  and integral transforms, 35
  LMS approximation and the normal equations, 29
  LMS error minimization, 30
  LMS minimization problem, 31
  orthonormal functions, 22
  and SVD, 40

O
Open interval, 88
Orthogonality, 17
Orthogonalization techniques, 26
Orthogonalization via Gram matrix, 26
Orthogonal polynomials
  Hermite, 62
  Laguerre, 60, 61
  Legendre, 29, 57

P
Paley–Wiener condition, 164
Parseval's theorem
  analytic signals, 142
  arbitrary orthogonal functions, 23, 29
  Fourier series, 102
  Fourier transform, 119
Phase and group delay, 178
Phase and group velocity, 181
Piecewise differentiable function, 75
Power spectrum, 156
Principle of superposition, 203
Projections of infinite-dimensional vector, 21
Projections of signal, 22
Projection theorem, 29
Prolate Spheroidal Wave Functions, 293
Pulse compression, 215

Q
Quasi linear dependence, 45

R
Random variables, 35
Rational function, 164
Reciprocal basis functions, 24
Region of analyticity of F(s), 250
Relationships between amplitude and phase, 162
Residual error, 13
Residue evaluation, 165
Response of homogeneous electrically conducting medium to a unit step excitation, 267, 270
Response to an exponential input, 211
Riemann–Lebesgue lemma (RLL), 66
Riemann sheet, 371
Riemann sum, 15
  approximation, 109
rms signal duration, 170
Rodriguez formula, 58
Round off errors, 20

S
Sampling
  bandpass, 300
  demodulation using bandpass sampling, 303
  impulse, 296
  sampling theorem for stationary random processes, 286
  Shannon sampling theorem, 285, 335
  stretching timescale of a periodic waveform by undersampling, 307
Schwarz inequality, 17
Signals
  analytic signals, 142
  canonical signals, 124
  causal signals, 158


    and Hilbert transform, 132
  degrees of freedom, 289
  digital demodulation of bandpass signals, 303
  distortion, 178
  duration, 170
    and Rayleigh limit, 290
  envelope, 153
  idealized
    rectangular pulse, 71
    sign function, 71
    sine integral function, 72
    triangular pulse, 72
    unit step function, 71
  LT of polynomial and exponential signals, 257
  periodic signals, 258
  random signals, 153
  spaces, 11
  spectral concentration of bandlimited signals, 291
  spectrum corrupted by aliasing, 295
  stochastic signals, 78
  subspace, 41
Simple pole, 165
Singular value decomposition (SVD), 37
  for continuum, 45
  and data corrupted by noise, 42
  minimum norm least square solution, 41
  minimum norm solution for continuum, 46
  reduced form, 39
  singular values, 38
  solutions of normal equation, 40
Slepian black box problem, 235
Spectral amplitude taper, 300
Spectral concentration, 289
  of bandlimited signals, 291
Spectral resolution and signal duration, 284
Spectrogram, 175
Spectrum with step discontinuities, 285
Spheroidal functions, 118
Stability, 206
Statistic processes and LMS approximation, 35
Steady state system response, 211
Step discontinuities, 73, 93
Step response, 206
SVD. See Singular value decomposition (SVD)
Synchronous detection, 156

T
Taylor series, 361
Tight frame, 48
Tikhonov regularization, 53
Time-invariance. See Linear time-invariant (LTI) systems
Time shift, 256
  and frequency shift, 129
Time-varying first-order linear system, 220
Time-varying multipath signal, 216
Time-varying resistor circuit, 220
Time-varying systems, 211
Total least squares, 49
Transcendental functions, 262
Transfer function of LTI system, 211
Transient response, 211
Two electrical networks with identical transfer functions, 234

U
Uncertainty principle, 169
Uncertainty relation, 173
Unitary transformation, 37

W
Wavepacket, 183
Wronskian, 20, 224

Z
Zero-order hold reconstruction, 298