
An Invitation to Mathematical Physics

and Its History

Jont B. Allen

Copyright © 2016,17,18 Jont Allen

University of Illinois at Urbana-Champaign

http://jontalle.web.engr.illinois.edu/uploads/298/

Thursday 14th March, 2019 @ 18:24; Version 0.97.08


Contents

1 Introduction   19
  1.1 Early Science and Mathematics   19
    1.1.1 Lecture 1: (Week 1) Three Streams from the Pythagorean theorem   20
    1.1.2 What is mathematics?   22
    1.1.3 Early physics as mathematics   23
  1.2 Modern mathematics is born   24
    1.2.1 Science meets mathematics   24
    1.2.2 The Three Pythagorean Streams   29
    1.2.3 Exercises NS-1   30

2 Stream 1: Number Systems (9 Lectures)   35
  2.1 Lecture 2: The taxonomy of numbers: P, N, Z, Q, F, I, R, C   36
    2.1.1 Numerical taxonomy   39
    2.1.2 Applications of integers   40
  2.2 Lecture 3: (Week 2) The role of physics in mathematics   43
    2.2.1 The three streams in mathematics   44
    2.2.2 Stream 1: Prime number theorems   45
    2.2.3 Stream 2: Fundamental theorem of algebra   45
    2.2.4 Stream 3: Fundamental theorems of calculus   45
    2.2.5 Other key theorems in mathematics   46
    2.2.6 Exercises NS-2   47
  2.3 Lecture 4: (Week 3) Applications of prime numbers   52
    2.3.1 Lecture 5: The Euclidean algorithm (GCD)   54
    2.3.2 Lecture 6: Continued fractions   57
    2.3.3 Lecture 7: Pythagorean triplets (Euclid's formula)   61
    2.3.4 Lecture 8: Pell's equation   63
    2.3.5 Lecture 9: Fibonacci sequence   65
    2.3.6 Exercises NS-3   67

3 Stream 2: Algebraic Equations (12 Lectures)   73
  3.1 Lecture 10: Algebra and geometry as physics   74
    3.1.1 The first algebra   76
    3.1.2 Finding roots of polynomials   76
    3.1.3 Matrix formulation of the polynomial   83
    3.1.4 Working with polynomials in Matlab/Octave   84
  3.2 Lecture 11: (Week 4) Eigenanalysis   86
    3.2.1 Eigenvalues of a matrix   86
    3.2.2 Taylor series   89
    3.2.3 Analytic functions   93
    3.2.4 Exercises AE-1   97
  3.3 Lecture 12: (Week 5) Root classification   103
    3.3.1 Convolution of monomials   103
    3.3.2 Residue expansions of rational functions   105
  3.4 Lecture 13: Introduction to Analytic Geometry   107
    3.4.1 Development of Analytic Geometry   112
    3.4.2 Lecture 14: Applications of scalar products   113
    3.4.3 Lecture 15: (Week 6) Gaussian Elimination   115
    3.4.4 Exercises AE-2   117
  3.5 Lecture 16: Transmission (ABCD) matrix composition method   123
    3.5.1 Thevenin transfer impedance   124
    3.5.2 Lecture 17: Impedance matrix   124
    3.5.3 Lecture 18: Power relations and impedance   125
    3.5.4 Power vs. power series, linear vs. nonlinear   125
  3.6 Lecture 19: Signals: Fourier transforms   128
  3.7 Lecture 20: Systems: Laplace transforms   131
    3.7.1 Lecture 21: (Week 8) System (Network) Postulates   133
  3.8 Lecture 22: (Week 7) Complex analytic mappings (domain-coloring)   135
    3.8.1 The Riemann Sphere   138
    3.8.2 Bilinear transformation   139
    3.8.3 Exercises AE-3   141

4 Stream 3a: Scalar Calculus (11 Lectures)   145
  4.1 Modern mathematics   145
  4.2 Lecture 23: Fundamental theorems of scalar calculus   148
    4.2.1 The fundamental theorems of complex calculus   148
    4.2.2 Lecture 24: (Week 9) Cauchy-Riemann conditions   149
    4.2.3 Exercises DE-1   152
  4.3 Lecture 25: Complex Analytic functions and Brune Impedance   156
    4.3.1 Complex impedance functions   156
    4.3.2 Lecture 26: Branch cuts, Riemann Sheets   160
  4.4 Lecture 27: (Week 10) Three complex integration theorems I   165
    4.4.1 Cauchy's theorems for integration in the complex plane   165
    4.4.2 Exercises DE-2   168
    4.4.3 Lecture 28: Three complex integration theorems II, III   177
  4.5 Lecture 29: Inverse Laplace transform (t < 0): Cauchy residue theorem   177
    4.5.1 Lecture 30: (Week 11) Inverse Laplace transform (t > 0)   178
    4.5.2 Lecture 31: Properties of the LT   179
    4.5.3 Solving differential equations   180
    4.5.4 Exercises DE-3   182

5 Stream 3b: Vector Calculus (10 Lectures)   187
  5.1 Lecture 32: (Week 12) Properties of Fields and potentials   187
    5.1.1 Lecture 33: Gradient ∇, divergence ∇·, curl ∇×, and Laplacian ∇²   189
    5.1.2 Laplacian operator in N dimensions   193
  5.2 Lecture 34: Partial differential equations from physics   198
  5.3 Lecture 35: Scalar Wave Equations   201
  5.4 Lecture 36: (Week 13) The Webster horn equation   202
  5.5 Lecture 36b: Webster horn equation: Derivation   204
    5.5.1 Matrix formulation of the WHEN   206
  5.6 Lecture 37: d'Alembert's superposition principle   207
    5.6.1 Complex analytic nature of Γ(s)   208
    5.6.2 Exercises VC-1   211
    5.6.3 Finite length horns   217

Page 5: University Of Illinois · Contents 1 Introduction 19 1.1 Early Science and Mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 1.1.1 Lecture 1: (Week 1) Three

Chapter 1

Introduction

Much of early mathematics, dating to before 1600 BCE, centered on the love of art and music, due to the sensations of light and sound. Our psychological senses of color and pitch are determined by the frequencies (i.e., wavelengths) of light and sound. The Chinese, and later the Pythagoreans, are well known for their early contributions to music theory. We are largely ignorant of exactly what the Chinese scholars knew. The best record comes from Euclid, who lived in the 3rd century BCE, after Pythagoras. Thus we can trace early mathematics back only to the Pythagoreans of the 6th century BCE (580-500 BCE), centered around the Pythagorean theorem and early music theory.

Pythagoras strongly believed that “all is number,” meaning that every number, and every mathematical and physical concept, could be explained by integral (integer) relationships, mostly based on either ratios or the Pythagorean theorem. It is likely that his belief was based on Chinese mathematics from thousands of years earlier. It is also believed that his ideas about the importance of integers followed from the theory of music. The musical notes (pitches) obey natural integral ratio relationships, based on the octave (a factor of two in frequency). The western 12-tone scale breaks the octave into 12 ratios. Today this has been rationalized as the 12th root of 2 (2^{1/12}), which is approximately equal to 18/17. This innate sense of frequency ratios comes from the physiology of the auditory organ (the cochlea), which represents a fixed distance along the Organ of Corti, the sensory organ of the inner ear.

As acknowledged by Stillwell (2010, p. 16), the Pythagorean view is relevant today:

“With the digital computer, digital audio, and digital video coding everything, at least approximately, into sequences of whole numbers, we are closer than ever to a world in which ‘all is number.’ ”
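The semitone approximation mentioned above can be checked directly; this is a quick numerical sketch, not from the text:

```python
# Compare the equal-tempered semitone ratio 2^(1/12) with the
# integer approximation 18/17 mentioned in the text.
semitone = 2 ** (1 / 12)   # about 1.05946
approx = 18 / 17           # about 1.05882
print(semitone, approx, abs(semitone - approx))
```

The two ratios agree to about three decimal places, which is why 18/17 works as a practical fret-spacing rule.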

1.1 Early Science and Mathematics

While early Asian mathematics has been lost, it clearly defined the course of mathematics for at least several thousand years. The first recorded mathematics was that of the Chinese (5000-1200 BCE) and the Egyptians (3300 BCE). Some of the best early records were left by the people of Mesopotamia (Iraq, 1800 BCE).1 The first 5,000 years of mathematics are not well documented, but the basic record is clear, as outlined in Fig. 1.1 (p. 21).

Thanks to Euclid, and later Diophantus (c250 CE), we have some basic (but vague) understanding of Chinese mathematics. For example, Euclid's formula (Eq. 2.11, p. 61) provides a method for computing Pythagorean triplets, a formula believed to be due to the Chinese (Stillwell, 2010, pp. 4-9).2
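Euclid's formula itself is stated later in the text (Eq. 2.11); in its standard form it generates a triplet (a, b, c) = (m² − n², 2mn, m² + n²) from integers m > n > 0. A minimal sketch, assuming that standard form:

```python
def euclid_triplet(m: int, n: int) -> tuple:
    """Pythagorean triplet from integers m > n > 0 (standard Euclid formula)."""
    assert m > n > 0
    return (m * m - n * n, 2 * m * n, m * m + n * n)

a, b, c = euclid_triplet(2, 1)   # the classic [3, 4, 5] triplet
assert a * a + b * b == c * c
print(a, b, c)
```

Varying m and n produces every primitive triplet (up to ordering), which is the sense in which the formula "computes" them.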

Chinese bells and stringed musical instruments were exquisitely developed, with a tonal quality documented by ancient physical artifacts (Fletcher and Rossing, 2008). In fact this development was so rich that one must ask why the Chinese failed to initiate the industrial revolution. Specifically, why

1See Fig. 2.8, p. 62.
2One might reasonably view Euclid's role as that of a mathematical messenger.


did Europe eventually dominate with its innovation when it was the Chinese who did the extensive early invention?

It could have been for the wrong reasons, but perhaps our best insight into the scientific history of China comes from Joseph Needham, a British biochemist and scholar who learned to speak Chinese after falling in love with a Chinese scientist, and ended up researching early Chinese science and technology for the British government.

According to Lin (1995), this is known as the Needham question:

“Why did modern science, the mathematization of hypotheses about Nature, with all its implications for advanced technology, take its meteoric rise only in the West at the time of Galileo[, but] had not developed in Chinese civilization or Indian civilization?”

Needham cites the many developments in China:3

“Gunpowder, the magnetic compass, and paper and printing, which Francis Bacon considered as the three most important inventions facilitating the West's transformation from the Dark Ages to the modern world, were invented in China.” (Lin, 1995)

“Needham's works attribute significant weight to the impact of Confucianism and Taoism on the pace of Chinese scientific discovery, and emphasize what it describes as the ‘diffusionist’ approach of Chinese science as opposed to a perceived independent inventiveness in the western world. Needham held that the notion that the Chinese script had inhibited scientific thought was ‘grossly overrated.’ ” (Grosswiler, 2004)

Lin was focused on military applications, missing the importance of non-military contributions. A large fraction of mathematics was developed to better understand the solar system, acoustics, musical instruments, and the theory of sound and light. Eventually the universe became a popular topic, as it still is today.

Regarding the “Needham question,” I suspect the resolution is now clear. In the end, China withdrew from its several earlier expansions, based on internal politics (Menzies, 2004, 2008).

Chronological history, pre 16th century

20th BCE: Chinese (primes; quadratic equation; Euclidean algorithm (GCD))
18th BCE: Babylonia (Mesopotamia/Iraq) (quadratic equation)
6th BCE: Pythagoras (Thales) and the Pythagorean “tribe”
4th-3rd BCE: Euclid (quadratic equation); Archimedes
3rd BCE: Eratosthenes (276-194 BCE)
3rd CE: Diophantus (c250 CE)
4th CE: Library of Alexandria destroyed (391 CE)
7th CE: Brahmagupta (negative numbers; quadratic equation) (598-670 CE)
9th-11th CE: al-Khwarizmi (algebra) (c830 CE); Hasan Ibn al-Haytham (Alhazen) (965-1040 CE)
12th-13th CE: Bhaskara (calculus) (1114-1183); Marco Polo (1254-1324)
15th-16th CE: Leonardo da Vinci (1452-1519); Michelangelo (1475-1564); Copernicus (1473-1543)
16th CE: Tartaglia (cubic equations); Bombelli (1526-1572); Galileo Galilei (1564-1642)

1.1.1 Lecture 1: The Pythagorean theorem

Thanks to Euclid's Elements (c300 BCE) we have an historical record, tracing the progress in geometry, as established by the Pythagorean theorem, which states that for any right triangle having sides of

3https://en.wikipedia.org/wiki/Joseph_Needham#cite_note-11


[Figure 1.1: timeline from 1500 BCE to 1650 CE, marking the Chinese, Babylonia, the Pythagoreans, Euclid, Archimedes, Diophantus, Brahmagupta, al-Khwarizmi, Bhaskara, Marco Polo, Bombelli, Leonardo, and Copernicus.]

Figure 1.1: Mathematical timeline between 1500 BCE and 1650 CE. The western renaissance is considered to have occurred between the 15th and 17th centuries. However, the Asian “renaissance” was likely well before the 1st century (i.e., 1500 BCE). There is significant evidence that a Chinese ‘treasure ship’ visited Italy in 1434, initiating the Italian renaissance (Menzies, 2008). This was not the first encounter between the Italians and the Chinese, as documented in ‘The travels of Marco Polo’ (c1300 CE).

lengths a, b, c that are either positive real numbers or, more interestingly, integers a, b, c ∈ N with c > a, b, such that a + b > c:

c² = a² + b².     (1.1)

Early integer solutions were likely found by trial and error rather than by an algorithm.

If a, b, c are lengths, then a², b², c² are each the area of a square. Equation 1.1 says that the area a² plus the area b² equals the area c². Today a simple way to prove this is to compute the magnitude of the complex number c = a + ib, which forces the right angle:

|c|² = (a + ib)(a − ib) = a² + b².     (1.2)

However, complex arithmetic was not an option for the Greek mathematicians, since complex numbers and algebra had yet to be discovered.
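The identity |c|² = (a + ib)(a − ib) = a² + b² can be verified numerically; a sketch with arbitrary sample lengths:

```python
# Verify |a + ib|^2 = (a + ib)(a - ib) = a^2 + b^2 for sample side lengths.
a, b = 3.0, 4.0
c = complex(a, b)
prod = (c * c.conjugate()).real   # (a + ib)(a - ib), imaginary part is zero
assert abs(c) ** 2 == prod == a * a + b * b
print(abs(c))  # 5.0
```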

Almost 700 years after Euclid's Elements, the Library of Alexandria was destroyed by fire (391 CE), taking with it much of the accumulated Greek knowledge. As a result, one of the best technical records remaining is Euclid's Elements, along with some sparse mathematics due to Archimedes (c250 BCE) on geometrical series, computing the volume of a sphere, the area of the parabola, and elementary hydrostatics. In c1572 a copy of Diophantus's Arithmetic was discovered by Bombelli in the Vatican library (Stillwell, 2010, p. 51). This book became an inspiration for Galileo, Descartes, Fermat, and Newton.

Early number theory: Well before Pythagoras, the Babylonians (c1800 BCE) had tables of triplets of integers [a, b, c] that obey Eq. 1.1, such as [3, 4, 5]. However, many of the Babylonian triplets were much larger numbers, the largest being a = 12709, c = 18541. A stone tablet (Plimpton-322) dating back to 1800 BCE (Fig. 2.8, p. 62) was found with integers for [a, c]. Each such pair determines a third positive integer b = √(c² − a²) (b = 13500 for the largest pair). This table is more than convincing that the Babylonians were well aware of Pythagorean triplets (PTs), but less convincing that they had access to Euclid's formula for PTs (Eq. 2.11, p. 61).
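The claim above is easy to check: for the largest Plimpton-322 pair [a, c] = [12709, 18541], b = √(c² − a²) is exactly the integer 13,500. A quick verification sketch:

```python
import math

a, c = 12709, 18541            # largest [a, c] pair cited from Plimpton-322
b = math.isqrt(c * c - a * a)  # integer square root of c^2 - a^2
assert b * b == c * c - a * a  # confirms b is exactly an integer
assert a * a + b * b == c * c  # the triplet satisfies Eq. 1.1
print(b)  # 13500
```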

It seems likely that Euclid's Elements was largely the source of the fruitful era due to the Greek mathematician Diophantus (215-285 CE) (Fig. 1.1), who developed the field of discrete mathematics now known as Diophantine analysis. The term means that the solution, not the equation, is integer. The work of Diophantus was followed by fundamental change in mathematics, possibly leading to the development of algebra, but at least including the discovery of

1. negative numbers,

2. the quadratic equation (Brahmagupta, 7th century CE),

3. algebra (al-Khwarizmi, 9th century CE), and

4. complex arithmetic (Bombelli, 16th century CE).


These discoveries overlapped with the European middle (aka, dark) ages. While Europe went “dark,”

presumably European intellectuals did not stop working during these many centuries.4

1.1.2 What is mathematics?

It seems strange when people complain that they “can't learn math,”5 but then claim to be good at languages. Pre-high-school students tend to confuse arithmetic with math. One does not need to be good at arithmetic to be good at math (but it doesn't hurt).

Math is a language, with its symbols taken from various languages, and is not so different from other languages. Today's mathematics is a written language with an emphasis on symbols and glyphs, biased toward Greek letters, obviously due to the popularity of Euclid's Elements. The specific evolution of these symbols is interesting (Mazur, 2014). Each symbol is dynamically assigned a meaning, appropriate for the problem being described. These symbols are then assembled to make sentences. It is similar to Chinese in that the spoken and written versions are different across dialects. Like Chinese, the sentences may be read out loud in any language (dialect), while the mathematical sentence (like Chinese characters) is universal.

Learning languages is an advanced social skill. However, the social outcomes of learning a language and learning math are very different. First, learning a new language is fun because it opens doors to other cultures. Second, math is different due to the rigor of the rules of the language, along with the way it is taught (e.g., not as a language). A third difference between math and language is that math evolved from physics, with important technical applications.

Like any language, the more mathematics you learn, the easier it is to understand, because mathematics is built from the bottom up. It is a continuous set of concepts, much like the construction of a house. If you try to learn calculus and differential equations while skipping simple number theory, the lessons will be more difficult to understand. You will end up memorizing instead of understanding, and as a result you will likely forget it. When you truly understand something, it can never be forgotten. A nice example is the solution of a quadratic equation: if you learn how to complete the square (p. 77), you will never forget the quadratic formula.
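The point is concrete: completing the square on ax² + bx + c = 0 (a ≠ 0) reproduces the quadratic formula in a few reversible steps. This derivation is the standard one, not taken from p. 77 of the text:

```latex
\begin{align*}
ax^2 + bx + c &= 0, \qquad a \neq 0 \\
x^2 + \frac{b}{a}\,x &= -\frac{c}{a} \\
\left(x + \frac{b}{2a}\right)^2 &= \frac{b^2}{4a^2} - \frac{c}{a}
  = \frac{b^2 - 4ac}{4a^2} \\
x &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\end{align*}
```

Once the middle step (adding (b/2a)² to both sides) is understood, the formula can always be rebuilt on demand.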

The topics need to be learned in order, just as in the case of building a house. You can't build a house if you don't know about screws or cement (plaster). Likewise in mathematics, you will not learn to integrate if you have failed to understand the difference between integers, complex numbers, polynomials, and their roots.

A short list of topics for mathematics is numbers (N, Z, Q, I, C), algebra, derivatives, anti-derivatives (i.e., integration), differential equations, vectors and the spaces they define, matrices, matrix algebra, eigenvalues and eigenvectors, solutions of systems of equations, and matrix differential equations and their eigensolutions. Learning is about understanding, not memorizing.

The rules of mathematics are formally defined by algebra. For example, the sentence a = b means that the number a has the same value as the number b. The sentence is spoken as “a equals b.” The numbers are nouns, and the equal sign says they are equivalent, playing the role of a verb, or action symbol. Following the rules of algebra, this sentence may be rewritten as a − b = 0. Here the symbols for minus and equals indicate two types of actions (verbs).

Sentences can become arbitrarily complex, such as the definition of the integral of a function, or a differential equation. But in each case, the mathematical sentence is written down, may be read out loud, has a well-defined meaning, and may be manipulated into equivalent forms following the rules of algebra and calculus. This language of mathematics is powerful, with deep consequences, first known as algorithms, but eventually as theorems.

The writer of an equation should always translate it (explicitly summarize the meaning of the expression), so the reader will not miss the main point; this is a simple matter of clear writing.

4It would be interesting to search the archives of the monasteries, where the records were kept, to determine exactly what happened during this religious blackout.
5“It looks like Greek to me.”


Just as math is a language, so language may be thought of as mathematics. To write correct English it is necessary to understand the construction of the sentence. It is important to identify the subject, verb, object, and various types of modifying phrases. Look up the interesting distinction between that and which.6 Thus, like math, language has rules. Most individuals use what “sounds right,” but if you are learning English as a second language, it is necessary to understand the rules, which are arguably easier to master than the foreign speech sounds.

Context is everything, and the most important context is physics. Without a physical problem to solve, there can be no mathematics. People wanted to navigate the earth and to weigh things. This required an understanding of gravity. Many questions about gravity were deep, such as “Where is the center of the universe?” Actually this answer is simple: ask the Pope and he will tell you. But church dogma only goes so far. Mathematics, along with a heavy dose of physics, finally answered this huge question. Someone needed to perfect the telescope, put satellites into space, and view the cosmos. Without mathematics none of this would have happened.

1.1.3 Early physics as mathematics: Back to Pythagoras

There is a second answer to the question What is mathematics? The answer comes from studying its history, which starts with the earliest records. This chronological view starts, of course, with the study of numbers. First there is the taxonomy of numbers. It took thousands of years to realize that numbers are more than the counting numbers N, to create a symbol for nothing (i.e., zero), and to invent negative numbers. With the invention of the abacus, a memory aid for the manipulation of integers, one could do very detailed calculations. But this required the discovery of algorithms (procedures) to add, subtract, multiply (many adds of the same number), and divide (many subtracts of the same number), such as the Euclidean algorithm. Eventually it became clear to the experts (early mathematicians) that there were natural rules to be discovered, and thus books (e.g., Euclid's Elements) were written to summarize this knowledge.
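The Euclidean algorithm mentioned above (treated in Lecture 5) is a good example of such a discovered procedure: it computes the greatest common divisor by repeatedly replacing the larger number with a remainder. A minimal sketch in the modern remainder form (the book's own treatment may differ):

```python
def gcd(a: int, b: int) -> int:
    """Euclidean algorithm: replace (a, b) by (b, a mod b) until b = 0."""
    while b:
        a, b = b, a % b
    return a

print(gcd(48, 18))  # 6
```

Because each step strictly shrinks the pair, the procedure always terminates, which is exactly the kind of "natural rule" the early mathematicians were formalizing.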

The role of mathematics is to summarize algorithms (i.e., sets of rules) and formalize each idea as a theorem. Pythagoras and his followers, the Pythagoreans, believed that there was a fundamental relationship between mathematics and the physical world. The Pythagoreans may have been the first to capitalize on the relationship between science and mathematics, using mathematics to design things for profit.7 This may have been the beginning of capitalizing technology (i.e., engineering), based on the relationship between physics and math. This impacted commerce in many ways, such as map making, tools, implements of war (the wheel, gunpowder), art (music), water transport, sanitation, secure communication, food, etc. Of course the Chinese were the first to master many of these technologies.

Why is Eq. 1.1 called a theorem? Theorems require a proof. What exactly needs to be proved? We do not need to prove that (a, b, c) obey this relationship, since this is a condition that is observed. We do not need to prove that a² is the area of a square, as this is the definition of an area. What needs to be proved is that the relation c² = a² + b² holds if, and only if, the angle between the two shorter sides is 90°. The Pythagorean theorem (Eq. 1.1) did not begin with Euclid or Pythagoras; rather, they appreciated its importance and documented its proof.

In the end the Pythagoreans, who instilled fear in the neighborhood, were burned out and murdered, likely the fate of mixing technology with politics:

“Whether the complete rule of number (integers) is wise remains to be seen. It is said that when the Pythagoreans tried to extend their influence into politics they met with popular resistance. Pythagoras fled, but he was murdered in nearby Metapontum in 497 BCE.”

–Stillwell (2010, p. 16)

6https://en.oxforddictionaries.com/usage/that-or-which

7It seems likely that the Chinese and Egyptians also did this, but it is more difficult to document.


1.2 Modern mathematics is born

Modern mathematics (what we practice today) was born in the 15th-16th centuries, in the minds of Leonardo da Vinci, Bombelli, Galileo, Descartes, Fermat, and many others (Stillwell, 2010). Many of these early masters were, like the Pythagoreans, secretive to the extreme about how they solved problems. This soon changed thanks to Galileo, Mersenne, Descartes, and Newton, causing mathematics to blossom. The developments during this time may seem hectic and disconnected. But this is a wrong impression: the development depended on new technologies such as the telescope (optics) and more accurate time and frequency measurements, due to Galileo's studies of the pendulum, and on a better understanding of the relation fλ = co between frequency f, wavelength λ, and the wave speed co.

Chronological history: 16th to 19th centuries

[Figure 1.2: timeline from 1525 to 1855, marking Bombelli, Galileo, Descartes, Fermat, Newton, the Bernoullis (Jacob, Johann, Daniel), Euler, d'Alembert, and Gauss, with Mozart and Beethoven as landmarks.]

Figure 1.2: Timeline for the development of the theory of calculus, between the 16th and 19th centuries CE. Given the timeline, it seems likely that Bombelli's discovery of Diophantus's book “Arithmetic” in the Vatican library triggered many of the ideas presented by Galileo, Descartes and Fermat, followed by others (i.e., Newton). Bombelli's discovery might be considered as a magic moment in mathematics. The vertical red lines indicate mentor-student relationships. Mozart and Beethoven are shown as landmarks. Related timelines include (see index): Fig. 1.1 (p. 21), which gives the timeline from 1500 BCE to 1650 CE; Fig. 3.1 (p. 73), which presents the Newton-Einstein view from 1640-1950; Fig. 3.5 (p. 107), which presents the period from Descartes to Gauss and Cauchy; Fig. 4.1 (p. 146), which presents the period from Bombelli to Einstein (1525-1925).

1.2.1 Science meets mathematics

Galileo: In 1589 Galileo famously conceptualized an experiment in which he proposed dropping two different weights from the Leaning Tower of Pisa, arguing that they must take the same time to hit the ground.

Conceptually this is a mathematically sophisticated experiment, driven by a mathematical argument in which he considered the two weights to be connected by an elastic cord (a spring). His studies resulted in the concept of conservation of energy, one of the cornerstones of physical theory since that time.

Being joined by an elastic cord, the masses become one. If the velocity were proportional to the mass, as believed by Archimedes, the sum of the two weights would necessarily fall even faster. This results in a logical fallacy: how can two masses together fall faster than either one alone? It also violates the concept of conservation of energy, as the total energy of the two masses would be greater than that of the parts. In fact, Galileo's argument may have been the first time that the principle of conservation of energy was clearly stated.

It seems likely that Galileo was attracted to this model of two masses connected by a spring because he was also interested in planetary motion, which consists of masses (sun, earth, moon) that are mutually attracted by gravity (i.e., the spring).

Galileo also performed related experiments on pendulums, where he varied the length l, mass m, and angle θ of the swing. By measuring the period of the swing he was able to formulate precise rules relating the variables. This experiment also measured the force exerted by gravity, so the two sets of experiments were related, but in very different ways. The pendulum served as the ideal clock, as it needed very little energy to keep it going, due to its very low friction (energy loss).8

8http://muse.tau.ac.il/museum/galileo/pendulum.html


1.2. MODERN MATHEMATICS IS BORN 25

[Figure: two masses, joined by a spring, falling together; shown at t = 0 and t = 1.]

Figure 1.3: Depiction of the argument of Galileo (from his unpublished book of 1638) as to why weights of different masses must fall with the same velocity, contrary to what Archimedes had proposed c. 250 BCE.

In a related experiment, Galileo measured the duration of a day by counting the number of swings of the pendulum in 24 hours, measured precisely by the daily period of a star as it crossed the tip of a church steeple. The number of seconds in a day is 24 × 60 × 60 = 86 400 = 2^7 · 3^3 · 5^2 [s/day]. Since 86 400 is a product of powers of the first three primes, it is highly composite, and thus may be subdivided in many equivalent ways. For example, the day can be divided evenly into 3, 5 or 6 parts, and remain exact in terms of the number of seconds that transpire. Factoring the number of days in a year (365 = 5 · 73) is not useful, since it may not be decomposed into many small primes.9 Galileo also extended work on the relationship of wavelength and frequency of a sound wave in musical instruments. On top of these impressive accomplishments, Galileo greatly improved the telescope, which he needed for his observations of the planets.
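The factorization 86 400 = 2^7 · 3^3 · 5^2 is easy to check by trial division. A minimal Python sketch (an illustration added here, not from the text):

```python
def prime_factors(n):
    """Factor n by trial division, returning a {prime: exponent} map."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:          # whatever remains is itself a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors

print(prime_factors(24 * 60 * 60))  # {2: 7, 3: 3, 5: 2}
print(prime_factors(365))           # {5: 1, 73: 1}
```

The second call confirms why 365 is a poor candidate for subdivision: its only factors are 5 and 73.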

Many of Galileo's contributions resulted in new mathematics, leading to Newton's discovery of the wave equation (c. 1687), followed 60 years later by its one-dimensional general solution by d'Alembert (c. 1747).

Mersenne: Marin Mersenne (1588–1648) also contributed to our understanding of the relationship between the wavelength and the dimensions of musical instruments, and is said to be the first to measure the speed of sound. At first Mersenne strongly rejected Galileo's views, partially due to errors in Galileo's reports of his results. But once Mersenne saw the significance of Galileo's conclusion, he became Galileo's strongest advocate, helping to spread the word (Palmerino, 1999).

This incident raises an important question about the nature of data: Is data more like bread or wine? The answer is, it depends on the data. Galileo's original experiments on pendulums and balls falling down slopes were flawed by inaccurate data, likely because he didn't have good clocks. But he soon solved that problem and the data became more accurate. We don't know if Mersenne repeated Galileo's experiments and then appreciated his theory, or if he communicated with Galileo. But the resolution was that the early data were like bread (it rots); once the experimental method was improved, with a better clock, the corrected data were like wine (which gets better with age). Galileo claimed that the distance the mass dropped was exactly proportional to the square of the elapsed time. This expression led to F = ma: one follows from the other. If the mass varies, one obtains Newton's second law of motion (Eq. 3.2, p. 74).

He was also a decent mathematician, defining (1644) the Mersenne primes (MP) π_m, of the form

π_m = 2^{π_k} − 1,

where π_k (k < m) denotes the kth prime (p. 36). As of Dec. 2018, 51 MPs are known.10 The first MP is 3 = π_2 = 2^{π_1} − 1 = 2^2 − 1, and the largest known prime is an MP. Note that 127 = π_31 = 2^7 − 1 is the MP of the MP 7 = 2^3 − 1.
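Not every number of the form 2^p − 1 with p prime is itself prime (e.g., 2^11 − 1 = 2047 = 23 · 89). A short Python sketch (an illustration added here, not the author's code) lists the small Mersenne primes:

```python
def is_prime(n):
    """Naive primality test by trial division (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Mersenne primes 2**p - 1 for prime exponents p < 20
mersenne = [2**p - 1 for p in range(2, 20)
            if is_prime(p) and is_prime(2**p - 1)]
print(mersenne)  # [3, 7, 31, 127, 8191, 131071, 524287]
```

Note that 2047 = 2^11 − 1 is absent, even though 11 is prime.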

9 For example, if the year were 364 = 2^2 · 7 · 13 days, it would make for shorter years (by 1 day), 13 months per year (e.g., a 28 = 2 · 2 · 7 day vacation per year?), perfect quarters, and exactly 52 weeks. Every holiday would always fall on the same day, every year.
10 http://mathworld.wolfram.com/MersennePrime.html


26 CHAPTER 1. INTRODUCTION

Newton: With the closure of Cambridge University due to the plague of 1665, Newton returned home

to Woolsthorpe-by-Colsterworth (95 [mi] north of London), to work by himself for over a year. It was

during this solitary time that he did his most creative work.

While Newton (1642–1726) may be best known for his studies on light, he was the first to predict the speed of sound. However, his theory was in error11 by a factor of √(c_p/c_v) = √1.4 ≈ 1.183. This famous error would not be resolved for 140 years, awaiting the formulation of thermodynamics and the equipartition theorem by Laplace in 1816.12

Just 11 years prior to Newton’s 1687 Principia, there was a basic understanding that sound and light

traveled at very different speeds, due to the experiments of Ole Rømer.13 14

Ole Rømer first demonstrated in 1676 that light travels at a finite speed (as opposed to instantaneously) by studying the apparent motion of Jupiter's moon Io. In 1865, James Clerk Maxwell proposed that light was an electromagnetic wave, and therefore traveled at the speed c_o appearing in his theory of electromagnetism.15

The idea behind Rømer's discovery was that, due to the large distance between Earth and Io, there was a difference between the observed period of the moon when Jupiter was closest to Earth versus when it was farthest away. This difference in distance caused a delay or advance in the observed eclipse of Io as it went behind Jupiter, shifted by the travel time corresponding to the difference in distance. It is like watching a video of a clock, delayed or sped up: when the video is delayed, the indicated time will be inaccurate (it will show an earlier time).

Studies of vision and hearing: Since light and sound (music) played such a key role in the development of early science, it was important to fully understand the mechanisms of our perception of light and sound. There are many outstanding examples where physiology impacted mathematics. Leonardo da Vinci (1452–1519) is well known for his early studies of the human body. Exploring our physiological senses requires a scientific understanding of the physical processes of vision and hearing, first considered by Newton (1643–1727) in 1687, but first properly researched much later by Helmholtz (Stillwell, 2010, p. 261). Helmholtz's (1821–1894) studies and theories of music and the perception of sound are fundamental scientific contributions (Helmholtz, 1863a). His best-known mathematical contribution is today known as the fundamental theorem of vector calculus, or simply the Helmholtz theorem.

The amazing Bernoulli family: The first individual who seems to have openly recognized the importance of mathematics, enough to actually teach it, was Jacob Bernoulli (1654–1705) (Fig. 1.4). Jacob worked on what is now viewed as the standard package of analytic "circular" (i.e., periodic) functions: sin(x), cos(x), exp(x), log(x).16 Eventually the full details were developed (for real variables) by Euler (Sections 3.1, p. 95 and R.1, p. 336).

From Fig. 1.2 we see that Jacob was contemporary with Descartes, Fermat, and Newton. Thus it seems likely that he was strongly influenced by Newton, who in turn was influenced by Descartes,17 Viète and Wallis (Stillwell, 2010, p. 175).

Jacob Bernoulli, like all successful mathematicians of the day, was largely self-taught. Yet Jacob was in a new category of mathematicians because he was an effective teacher. Jacob taught his sibling Johann (1667–1748), who then taught his son Daniel (1700–1782). But most importantly, Johann taught Leonhard Euler (1707–1783) (Figs. 1.2 and 1.4), the most prolific (thus influential) of all mathematicians. This teaching resulted in an explosion of new ideas and understanding. It is most significant

11 The square root of the ratio of the specific heat capacity at constant pressure c_p to that at constant volume c_v.
12 https://en.wikipedia.org/wiki/Pierre-Simon_Laplace#Speed_of_sound
13 https://www.youtube.com/watch?v=b9F8Wn4vf5Y
14 https://www.youtube.com/watch?v=rZ0wx3uD2wo
15 https://en.wikipedia.org/wiki/Speed_of_light
16 The log and tan functions are related by Eq. 4.2, p. 146.
17 https://en.wikipedia.org/wiki/Early_life_of_Isaac_Newton


Figure 1.4: Above: Jakob (1655–1705) and Johann (1667–1748) Bernoulli. Below: Leonhard Euler (1707–1783) and Jean le Rond d'Alembert (1717–1783). Euler was blind in his right eye, hence the left portrait view. The figure numbers are from Stillwell (2010).


that all four mathematicians published their methods and findings. Much later, Jacob studied with students of Descartes18 (Stillwell, 2010, pp. 268–9).

Euler: Euler went far beyond all the Bernoulli family (Stillwell, 2010, p. 315). A special strength of Euler was the degree to which he published. First he would master a topic, and then he would publish. His papers continued to appear long after his death (Calinger, 2015). It is also interesting that Leonhard Euler was a contemporary of Mozart (and James Clerk Maxwell of Abraham Lincoln).

Bach's mathematics: It turns out (though I need to further document this) that J. S. Bach did a mathematical analysis of music. This seems to be about the same time that Euler was writing a book on music theory. Someone please help to fill in these details.19 (Needs more research.)

d'Alembert: Another individual of that time of special note, who also published extensively, was d'Alembert (Fig. 1.4). Some of the most creative ideas were first proposed by d'Alembert. Unfortunately, and perhaps somewhat unfairly, his rigor was criticized by Euler, and later by Gauss (Stillwell, 2010). But once the tools of mathematics were finally openly published, largely by Euler, mathematics grew exponentially.20

Gauss: Figure 4.1 (p. 146) shows the timeline of the most famous mathematicians. It was one of the most creative times in mathematics. Gauss was born at the end of Euler's long and productive life. I suspect that Gauss owed a great debt to Euler: surely he must have been a scholar of Euler. One of Gauss's most important achievements may have been his contribution to solving the open question about the density of prime numbers.21

Table 1.1: Three streams followed from the Pythagorean theorem: number systems (Stream 1), geometry (Stream 2) and infinity (Stream 3).

• The Pythagorean theorem is the mathematical spring which bore the three streams.

• Several centuries per stream:

1) Numbers:
   6th BCE: N counting numbers, Q rationals, P primes
   5th BCE: Z common integers, I irrationals
   7th CE: zero ∈ Z

2) Geometry: (e.g., lines, circles, spheres, toroids, . . . )
   17th CE: Composition of polynomials (Descartes, Fermat); Euclid's geometry & algebra ⇒ analytic geometry
   18th CE: Fundamental theorem of algebra

3) Infinity: (∞ → Sets)
   17–18th CE: Taylor series, functions, calculus (Newton, Leibniz)
   19th CE: R real, C complex (1851)
   20th CE: Set theory

18 It seems clear that Descartes was also a teacher.
19 https://www.youtube.com/watch?v=6b5MM4x-3aQ, https://www.uchicago.edu/features/combining_math_and_music/
20 There are at least three useful exponential scales: factors of 2, factors of e ≈ 2.7, and factors of 10. The decibel uses factors of 2 (6 [dB]) and factors of 10 (20 [dB]). Information theory uses powers of 2 (1 [bit]) and 4 (2 [bits]). Circuit theory uses factors of both e and 10.
21 http://www-history.mcs.st-andrews.ac.uk/Biographies/Gauss.html


1.2.2 Three Streams from the Pythagorean theorem

From the outset of his presentation, Stillwell (2010, p. 1) defines “three great streams of mathematical

thought: Numbers, Geometry and Infinity” that flow from the Pythagorean theorem, as summarized in

Table 1.1. This is a useful concept, based on reasoning that is not as obvious as one might think. Many factors are in play here. One of these is the strongly held opinion of Pythagoras that all mathematics should be based on integers.22 The rest are tied up in the long, necessarily complex history of mathematics, as

best summarized by the fundamental theorems (Table 2.1, p. 44), each of which is discussed in detail in

a relevant chapter. Stillwell’s concept of three streams, following from the Pythagorean theorem, is the

organizing principle behind this book:

1. Introduction Section 1 is intended to be a self-contained survey of basic pre-college mathematics, as a detailed overview of the fundamentals, presented as three main streams:

2 Number systems: Stream 1 (p. 35)

3 Algebraic equations: Stream 2 (p. 73)

4 Scalar calculus: Stream 3a (p. 145)

5 Vector calculus: Stream 3b (p. 187)

If you’re a student, stick to these sections.

Sections 2-5 are disorganized research ideas that have not yet found a home. Students, please stay

out of these later chapters.

2. Number Systems (Stream 1, p. 319) Some uncertain ideas of number systems, starting with prime numbers, through complex numbers, vectors and matrices.

3. Algebraic Equations (Stream 2, p. 335) Algebra and its development, as we know it today. The theory of real and complex equations and functions of real and complex variables. Complex impedance Z(s) of complex frequency s = σ + jω is covered with some care, developing the topic as needed for engineering mathematics.

4. Scalar Calculus (Stream 3a) Ordinary differential equations. Integral theorems. Acoustics.

5. Vector Calculus (Stream 3b) Vector partial differential equations. Gradient, divergence and curl differential operators. Stokes's and Green's theorems. Maxwell's equations.

22https://www.youtube.com/watch?v=gHSv94aTTnk


54 CHAPTER 2. STREAM 1: NUMBER SYSTEMS (9 LECTURES)

Thus one could factor a product of primes N = π_k π_l by doing M divisions, where M is the number of primes less than N. This assumes the list of primes less than N is known. However, most integers are not a simple product of two primes.

But the utility has to do with the sheer density of primes (the prime number theorem, i.e., Gauss's hypothesis). If we were simply looking up a few numbers from a short list of primes, it would be easy, but the density of primes among the integers is logarithmic (≫ 1 per octave).

This brings us to the prime number theorem (PNT). The security problem is the reason why these two theorems are so important: 1) every integer has a unique representation as a product of primes, and 2) the density of primes is large (see the discussions on p. 45 and Section O.2, p. 319). Thus security reduces to the "needle in the haystack" problem, due to the cost of a search.
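The M-division strategy described above can be sketched in a few lines of Python. This is a toy illustration (the helper names are mine) of why factoring a semiprime by search is costly: every prime below N may have to be tried.

```python
def primes_below(n):
    """Sieve of Eratosthenes: all primes < n."""
    sieve = [True] * max(n, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, ok in enumerate(sieve) if ok]

def factor_semiprime(N, primes):
    """Factor N = p*q by (at most M) trial divisions over a known prime list."""
    for p in primes:
        if N % p == 0:
            return p, N // p
    return None

print(factor_semiprime(89 * 97, primes_below(100)))  # (89, 97)
```

For cryptographic-scale N the prime list is astronomically long, which is the "needle in the haystack" cost.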

Examples of the GCD: k = gcd(m, n)

• Examples (m, n, k ∈ Z):

– 5 = gcd(13 · 5, 11 · 5). The GCD is the common factor 5.
– gcd(13 · 10, 11 · 10) = 10 (gcd(130, 110) = 10 = 2 · 5, which is not prime)
– gcd(1234, 1024) = 2 (1234 = 2 · 617, 1024 = 2^10)
– gcd(π_k π_m, π_k π_n) = π_k
– k = gcd(m, n) is the part that cancels in the fraction m/n ∈ F
– m/gcd(m, n) ∈ Z

• Coprimes (m ⊥ n) are numbers with no common factors other than 1: i.e., gcd(m, n) = 1

– The GCD of two distinct primes is always 1: gcd(13, 11) = 1, gcd(π_m, π_n) = 1 (m ≠ n)
– m = 7 · 13, n = 5 · 19 ⇒ (7 · 13) ⊥ (5 · 19)
– If m ⊥ n then gcd(m, n) = 1
– If gcd(m, n) = 1 then m ⊥ n

• The GCD may be extended to polynomials: e.g., gcd(ax^2 + bx + c, αx^2 + βx + γ)

– gcd((x − 3)(x − 4), (x − 3)(x − 5)) = (x − 3)
– gcd(x^2 − 7x + 12, 3(x^2 − 8x + 15)) = 3(x − 3)
– gcd(x^2 − 7x + 12, 3x^2 − 24x + 45) = 3(x − 3)
– gcd((x − 2π)(x − 4), (x − 2π)(x − 5)) = (x − 2π) (needs long division)

Figure 2.3: The Euclidean algorithm for finding the GCD of two numbers is one of the oldest algorithms in mathematics, and is highly relevant today. It is both powerful and simple. It was used by the Chinese during the Han dynasty (Stillwell, 2010, p. 70) for reducing fractions. It may be used to find pairs of integers that are coprime (their GCD must be 1). The GCD may be used to identify factors of polynomials by long division. It has an important sister algorithm, the continued fraction algorithm (CFA), that generalizes the Euclidean algorithm.

2.3.1 Lec 5: Greatest common divisor (Euclidean algorithm)

The Euclidean algorithm is a method to find the greatest common divisor (GCD) k of two integers n, m, denoted k = gcd(n, m), where n, m, k ∈ N (Graham et al., 1994). For example, 15 = gcd(30, 105) since, when factored, (30, 105) = (2 · 3 · 5, 7 · 3 · 5) = 3 · 5 · (2, 7) = 15 · (2, 7). The Euclidean algorithm finds the largest common divisor without factoring the two numbers. It was known to the Chinese (i.e., not discovered by Euclid) (Stillwell, 2010, p. 41). The GCD may be used to find coprimes, as described below in Fig. 2.3.

There are six closely related numeric operators which return integers ∈ Z (Sect. A.2, p. 248). In Matlab/Octave: floor(x), ceil(x), round(x) take x ∈ R, and mod(n,m) (modulo), rem(n,m) (remainder), gcd(n,m) (greatest common divisor) take n, m ∈ N. Rounding schemes are used in continued fractions, and the others are widely used when doing arithmetic, such as division.

The remainder function (modulo) for real arguments is closely related to the phase of a pure delay, since the phase of the Laplace transform of a delay [Sect. I.2 (p. 282)] (a delayed Dirac) is given by the


2.3. LECTURE 4: (WEEK 3) APPLICATIONS OF PRIME NUMBER 55

remainder (modular) function (mod 2π), since

∠e^{−jωT} = rem(ωT, 2π).

Interestingly, in this instance the arguments of the remainder function are real (∈ R). Modular arithmetic may alternatively be written as mod(t, T) = ((t))_T. Thus the modular functions are the group of periodic functions. In Matlab/Octave, unwrap() is the inverse of mod().
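The phase-wrapping relation can be checked numerically. In this Python sketch (my illustration; for positive arguments, Matlab's rem corresponds to math.fmod), the phase of the delay e^{−jωT} is compared with the remainder of ωT mod 2π:

```python
import cmath
import math

def delay_phase(wT):
    """Phase of a pure delay e^{-j w T}: minus the remainder of wT mod 2*pi."""
    return -math.fmod(wT, 2 * math.pi)

# Compare against the principal-value phase computed directly from e^{-j w T}.
# (Values chosen with rem(wT, 2*pi) < pi, so both lie on the same branch.)
for wT in (0.5, 1.0, 2 * math.pi + 1.0):
    direct = cmath.phase(cmath.exp(-1j * wT))
    assert abs(delay_phase(wT) - direct) < 1e-9
```

The unwrap step mentioned above would undo exactly this mod-2π wrapping when the phase is sampled densely enough in ω.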

The Euclidean algorithm: The algorithm is best explained by a trivial example: let the two numbers be 6 and 9. At each step the smaller number (6) is subtracted from the larger (9), and the smaller number and the difference (the remainder) are saved. This process continues until the two resulting numbers are equal, which is the GCD. For our example: 9 − 6 = 3, leaving the smaller number 6 and the difference 3. Repeating this we get 6 − 3 = 3, leaving the smaller number 3 and the difference 3. Since these two numbers are the same, we are done, and the GCD = 3. We can easily verify this result by factoring [e.g., (9, 6) = 3(3, 2)]. The value may also be numerically verified using the Matlab/Octave command gcd(6, 9), which returns 3.
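The subtract-and-swap procedure just described translates directly into code. A Python sketch (mirroring the steps above; not the author's implementation):

```python
def gcd_subtract(m, n):
    """Euclidean GCD by repeated subtraction, as in the worked example."""
    while m != n:
        # subtract the smaller from the larger; keep the smaller and the difference
        m, n = max(m, n) - min(m, n), min(m, n)
    return m

print(gcd_subtract(9, 6))     # 3
print(gcd_subtract(30, 105))  # 15
```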

Direct matrix method: The GCD may be written as a matrix recursion, given the starting vector (m_0, n_0)^T. The recursion is then

\[
\begin{bmatrix} m_{k+1} \\ n_{k+1} \end{bmatrix}
=
\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} m_k \\ n_k \end{bmatrix}.
\tag{2.7}
\]

This recursion continues until m_{k+1} < n_{k+1}, at which point m and n must be swapped. The process is repeated until m = n, which equals the GCD. The direct method is inefficient because it recursively subtracts n many times until the resulting m is less than n, as shown in Fig. 2.4. It also must test for m ≤ n at each iteration, and then swap m and n when needed.

Since this matrix may be inverted, the process may be reversed, which is closely related to the

continued fraction algorithm (CFA) (p. 58).

Why is the GCD important? The utility of the GCD algorithm arises directly from the fundamental difficulty in factoring large integers. Computing the GCD, using the Euclidean algorithm, is low cost, compared to finding the coprime factors by factoring, which is extremely expensive. This utility surfaces when the two numbers are composed of very large primes.

When two integers have no common factors they are said to be coprime, and their GCD is 1. The ratio of two integers which are coprime is automatically in reduced form (they have no common factors). For example, 4/2 ∈ Q is not reduced, since 2 = gcd(4, 2) (with no remainder). Canceling out the common factor 2 gives the reduced form 2/1 ∈ F. Thus if we wish to form the ratio of two integers, first compute the GCD, then remove it from the two numbers to form the ratio. This assures the rational number is in its reduced form (∈ F rather than ∈ Q). If the GCD were 10^3 digits long, it is obvious that any common factor would need to be removed, greatly simplifying further computation. This makes a huge difference when using IEEE-754 floating point.
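Reducing a ratio by first removing the GCD, as described above, is one line in Python (math.gcd is the standard-library routine; the fractions module performs the same reduction internally):

```python
from fractions import Fraction
from math import gcd

def reduce_ratio(m, n):
    """Remove the GCD so that m/n is in reduced form (coprime numerator/denominator)."""
    k = gcd(m, n)
    return m // k, n // k

print(reduce_ratio(4, 2))      # (2, 1)
print(reduce_ratio(873, 582))  # (3, 2)
print(Fraction(873, 582))      # 3/2 -- Fraction reduces automatically
```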

The floor function and the GCD are related in an important way, as discussed next.

Exercise: What does it mean to reach the turning point when using the Euclidean algorithm? Solution: When the remainder m − n⌊m/n⌋ < n, we have reached a turning point. When the remainder is zero (i.e., m − n⌊m/n⌋ = 0), we have reached the GCD.

The GCD and long division: When one learns how to divide a smaller number into a larger one, one must learn how to use the GCD. For example, suppose we wish to divide 6 into 110. One starts by finding how many times 6 goes into 11. Since 6 × 2 = 12, which is larger than 11, the answer is 1. We then subtract 6 from 11 to find the remainder 5. This is, of course, the floor function (e.g., ⌊11/6⌋ = 1, ⌊11/5⌋ = 2).


Example: Divide 10 into 99. Solution: The floor function (floor(99/10)) must be used first, followed by the remainder function (rem(99,10)). When we divide a smaller number into a larger one, we first find the floor, then the remainder. For example, 99/10 = 9 + 9/10 has a floor of 9 and a remainder of 9/10.

Example: Start with the two integers [873, 582]. In factored form these are [π_25 · 3^2, π_25 · 3 · 2]. Given the factors, we see that the largest common factor is π_25 · 3 = 291 (π_25 = 97). When we take the ratio of the two numbers, this common factor cancels:

\[
\frac{873}{582} = \frac{\pi_{25} \cdot 3 \cdot 3}{\pi_{25} \cdot 3 \cdot 2} = \frac{3}{2} = 1.5.
\]

Of course, if we divide 582 into 873 we will numerically obtain the answer 1.5 ∈ F.

Exercise: Show that in Matlab/Octave rat(873/582) = 2 + 1/(−2), rather than the expected form 1 + 1/2. Solution: Since

factor(873) = 3 · 3 · 97 and factor(582) = 2 · 3 · 97,

the reduced ratio is 3/2 = 1 + 1/2, which is the correct answer. But due to rounding methods, the result is not returned in that form: in Matlab/Octave, rat(3/2) = 2 + 1/(−2). Matlab's rat() command uses rounding rather than the floor function, which explains the difference. When the rat() function produces negative numbers, rounding is being employed.

[Figure: a descending staircase of values m, m − n, m − 2n, . . . , m − kn plotted against k, crossing the abscissa between k = 6 and k = 7.]

Figure 2.4: The Euclidean algorithm recursively subtracts n from m until the remainder m − kn is either less than n or zero. Note that this recursion is the same as mod(m, n). Thus the GCD recursively computes mod(m, n) until the remainder is less than n; it then swaps m and n so that n < m, and repeats until it terminates on the GCD. For the case depicted here, k = 6 = ⌊m/n⌋ renders the remainder m − 6n < n. If one more step were taken (k = 7), the remainder would become negative. By linear interpolation we can find where m − αn = 0, with α ∈ R the point where the solid line crosses the abscissa (here α ≈ 6.5). This truncation is nonlinear arithmetic, a property related to the GCD. The floor function finds the turning point, where we swap the two numbers, since by definition m > n.

Graphical description of the GCD: The Euclidean algorithm is very simple when viewed graphically. In Fig. 2.4 we show what happens as one approaches the threshold. After reaching the threshold, the two numbers must be swapped, which is addressed by the upper row of Eq. 2.8.

Below is simple Matlab/Octave code to find k = gcd(m,n), based on the Stillwell (2010) definition of the EA:

function k = gcd2(m, n)  % renamed to avoid shadowing the built-in gcd()
  while m ~= 0
    A = m; B = n;
    m = max(A, B); n = min(A, B);  % ensure m >= n
    m = m - n;                     % subtract the smaller from the larger
  end
  k = n;                           % when m reaches 0, n holds the GCD
end

This program loops until m = 0, at which point n holds the GCD.

Multiplication is simply recursive addition, and finding the GCD takes advantage of this fact. Let's take a trivial example, (9, 6). Taking the difference 9 − 6 (the smaller from the larger), and writing the multiplications as sums, one can see what is going on. The difference may be written two equivalent ways:

9 − 6 = (3 + 3 + 3) − (3 + 3) = 0 + 0 + 3 = 3,

and

9 − 6 = (3 + 3 + 3) − (2 + 2 + 2) = 1 + 1 + 1 = 3.

Written the first way, it is 3, because subtracting (3 + 3) from (3 + 3 + 3) leaves 3. Written the second way, each 3 is matched with a −2, leaving 3 ones, which add to 3. Of course the two decompositions must yield the same result, because 2 · 3 = 3 · 2. Thus finding the remainder of the larger number minus the smaller yields the GCD of the two numbers.

Coprimes: When the GCD of two integers is 1, the only common factor is 1. This is of key importance when trying to find common factors between two integers. When 1 = gcd(m, n), they are said to be coprime, or "relatively prime," which is frequently written as m ⊥ n. By definition, the largest common factor of coprimes is 1; and since 1 is not a prime (π_1 = 2), coprimes have no common primes. It can be shown (Stillwell, 2010, pp. 41–42) that when a ⊥ b, there exist m, n ∈ Z such that

am + bn = gcd(a, b) = 1.
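The identity am + bn = gcd(a, b) (Bézout's identity) can be verified with the extended Euclidean algorithm. A Python sketch (the standard technique, added here as an illustration):

```python
def extended_gcd(a, b):
    """Return (g, m, n) such that a*m + b*n == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, m1, n1 = extended_gcd(b, a % b)
    return g, n1, m1 - (a // b) * n1

# The coprime pair 7*13 = 91 and 5*19 = 95 from the examples above
g, m, n = extended_gcd(91, 95)
print(g, 91 * m + 95 * n)  # 1 1
```

For coprime inputs the returned g is 1, so m and n are exactly the Bézout coefficients of the displayed identity.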

A much more efficient method uses the floor() function, which is called division with rounding. Specifically, one full step of the GCD may be written as

\[
\begin{bmatrix} m \\ n \end{bmatrix}_{k+1}
=
\begin{bmatrix} 0 & 1 \\ 1 & -\lfloor m/n \rfloor \end{bmatrix}
\begin{bmatrix} m \\ n \end{bmatrix}_{k}.
\tag{2.8}
\]

This matrix is simply Eq. 2.7 raised to the power ⌊m/n⌋, followed by swapping the inputs (the smaller must always be on the bottom).
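The combined divide-and-swap step of Eq. 2.8 is the familiar modulo form of the Euclidean algorithm: one step per iteration. A Python sketch (my illustration of the equation, not the text's code):

```python
def gcd_floor(m, n):
    """GCD via the floor-based step of Eq. 2.8: (m, n) -> (n, m - floor(m/n)*n)."""
    while n != 0:
        m, n = n, m - (m // n) * n   # swap, then subtract floor(m/n) copies of n
    return m

print(gcd_floor(873, 582))  # 291
```

Compare with the direct subtraction method: here each iteration removes all ⌊m/n⌋ copies of n at once, so the run time is logarithmic rather than linear in m/n.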

2.3.2 Lec 6: Continued fraction algorithm

As shown in Fig. 2.5, the continued fraction algorithm (CFA) starts from a single real decimal number x_o ∈ R, and recursively expands it as a fraction x ∈ F (Graham et al., 1994). Thus the CFA may be used for forming rational approximations to any real number. For example, π ≈ 22/7, an excellent approximation, well known to Chinese mathematicians.

The Euclidean algorithm (i.e., GCD), on the other hand, operates on a pair of integers m, n ∈ N and returns their greatest common divisor k ∈ N, such that m/k, n/k ∈ F are coprime, thus reducing the ratio to its irreducible form (i.e., m/k ⊥ n/k). Note this is done without factoring m, n.

Despite this seemingly irreconcilable difference between the GCD and CFA, the two are closely related, so close that Gauss referred to the Euclidean algorithm for finding the GCD as the continued fraction algorithm (CFA) (Stillwell, 2010, p. 48). At first glance it is not clear why Gauss would be so "confused." One is forced to assume that Gauss had some deeper insight into this relationship. If so, that insight would be valuable to understand.22

22In some sense, to be explored, the GCD and CFA are inverses.

Page 20: University Of Illinois · Contents 1 Introduction 19 1.1 Early Science and Mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 1.1.1 Lecture 1: (Week 1) Three

58 CHAPTER 2. STREAM 1: NUMBER SYSTEMS (9 LECTURES)

Definition of the CFA

1. Start with n = 0 and the positive input target x_o ∈ R+.  [n = 0, m_0 = 0, x_o = π]
2. Rounding: let m_n = ⌊x_n⌉ ∈ N.  [m_0 = ⌊π⌉ = 3]
3. The input vector is then [x_n, m_n]^T.  [[π, 3]^T_0]
4. Remainder: r_n = x_n − m_n (−0.5 ≤ r_n ≤ 0.5).  [r_0 = π − 3 ≈ 0.14159]
5. Reciprocate: x_{n+1} = 1/r_n and return to step 2 if r_n ≠ 0; terminate if r_n = 0.  [x_1 = 1/0.14159 = 7.062646]
6. The output vector is [x_{n+1}, m_n]^T.  [[7.062646, 3]^T]

Figure 2.5: The CFA of any positive number x_o ∈ R+ is defined in this figure. Numerical values for n = 0, x_o = π, m_0 = 0 are given in blue on the far right. For n = 1 the input vector is [x_1, m_1]^T = [7.062646, 3]^T. If at any step the remainder is zero, the algorithm terminates (Step 5). Convergence is guaranteed. The recursion may continue to any desired accuracy. Alternative rounding schemes are given on p. 248.
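The six steps of Fig. 2.5 can be transcribed almost directly. A Python sketch (round-half-up is used for ⌊x⌉; this is an illustration, not the text's code):

```python
import math

def cfa(x, n_terms):
    """CFA of Fig. 2.5: round to nearest, reciprocate the remainder, repeat."""
    terms = []
    for _ in range(n_terms):
        m = math.floor(x + 0.5)   # step 2: m_n = round(x_n)
        terms.append(m)
        r = x - m                 # step 4: remainder, -0.5 <= r <= 0.5
        if r == 0:                # step 5: rational target -- terminate
            break
        x = 1.0 / r               # step 5: reciprocate and repeat
    return terms

print(cfa(math.pi, 3))  # [3, 7, 16]
print(cfa(3.25, 10))    # [3, 4]
```

The second call terminates early because 3.25 = 3 + 1/4 is rational.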

Notation: Writing out all the fractions can become tedious. For example, expanding e = 2.7183 · · · using the Matlab/Octave command rat(exp(1)) gives the approximation

exp(1) = 3 + 1/(−4 + 1/(2 + 1/(5 + 1/(−2 + 1/(−7))))) − o(1.75 × 10^{−6})
       = [3; −4, 2, 5, −2, −7] − o(1.75 × 10^{−6}).

Here we use a compact bracket notation, e_6 ≈ [3; −4, 2, 5, −2, −7], where o(·) indicates the error of the CFA expansion.

Since many entries are negative, we may deduce that rounding arithmetic is being used by Matlab/Octave (though this is not documented). Note that the leading integer part may be delimited by an optional semicolon.23 If the process is carried further, the values of a_n ∈ N give increasingly accurate rational approximations.

Example: Let x_o ≡ π ≈ 3.14159 . . . . Then a_0 = 3, r_0 = 0.14159, x_1 = 7.065 ≈ 1/r_0, and a_1 = 7. If we were to stop here we would have

\[
\pi_2 \approx 3 + \cfrac{1}{7 + 0.0625\ldots} \approx 3 + \cfrac{1}{7} = \frac{22}{7}.
\tag{2.9}
\]

This approximation of π ≈ 22/7 has a relative error of 0.04%:

\[
\frac{22/7 - \pi}{\pi} \approx 4 \times 10^{-4}.
\]

Example: For a second level of approximation we continue by reciprocating the remainder, 1/0.0625 ≈ 15.9966, which rounds to 16, giving a negative remainder of ≈ −1/300:

\[
\pi_3 \approx 3 + \cfrac{1}{7 + \cfrac{1}{16}} = 3 + \frac{16}{7 \cdot 16 + 1} = 3 + \frac{16}{113} = \frac{355}{113}.
\]

23Unfortunately Matlab/Octave does not support the bracket notation.

Page 21: University Of Illinois · Contents 1 Introduction 19 1.1 Early Science and Mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 1.1.1 Lecture 1: (Week 1) Three

2.3. LECTURE 4: (WEEK 3) APPLICATIONS OF PRIME NUMBER 59

With rounding, the remainder is −0.0034, resulting in much more rapid convergence. If floor rounding is used (15.9966 = 15 + 0.9966), the remainder is positive and close to 1, resulting in a much less accurate rational approximation for the same number of terms. It follows that there can be a dramatic difference depending on the rounding scheme, which, for clarity, should be specified, not inferred.
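The difference between the two rounding schemes is easy to quantify: evaluate the truncated expansion and compare with the target. A Python sketch (my comparison, keeping three terms in each case):

```python
import math

def cf_terms(x, n, rounder):
    """n-term continued fraction expansion of x using the given rounding rule."""
    terms = []
    for _ in range(n):
        m = rounder(x)
        terms.append(m)
        r = x - m
        if r == 0:
            break
        x = 1.0 / r
    return terms

def cf_eval(terms):
    """Evaluate [a0; a1, ..., ak] from the innermost term outward."""
    v = float(terms[-1])
    for t in reversed(terms[:-1]):
        v = t + 1.0 / v
    return v

nearest = lambda x: math.floor(x + 0.5)
err_floor = abs(cf_eval(cf_terms(math.pi, 3, math.floor)) - math.pi)  # [3, 7, 15]
err_round = abs(cf_eval(cf_terms(math.pi, 3, nearest)) - math.pi)     # [3, 7, 16]
print(err_round < err_floor)  # True: 355/113 beats 3 + 15/106
```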

Rational approximation examples

π_2 = 22/7 = [3; 7] ≈ π + o(1.3 × 10^{−3})
π_3 = 355/113 = [3; 7, 16] ≈ π − o(2.7 × 10^{−7})
π_4 = 104348/33215 = [3; 7, 16, −249] ≈ π + o(3.3 × 10^{−10})

Figure 2.6: The expansion of π to various orders using the CFA, along with the order of the error of each rational approximation, with rounding. For example, π_2 = 22/7 has an absolute error (|22/7 − π|) of about 0.13%.

CFA as a matrix recursion: Start with n₀ = 0 ∈ N and x₀ ∈ I, and set n₁ = ⌊x₀⌋ ∈ N, r₁ = x₀ − ⌊x₀⌋ ∈ I, and x₁ = 1/r₁ ∈ I, assuming rₙ ≠ 0.

In matrix notation, with k ∈ N,

    [n; x]ₖ₊₁ = [0, ⌊xₖ⌋; 0, 1/(xₖ − ⌊xₖ⌋)] [n; x]ₖ,

where [a; b] denotes a column vector and the semicolon inside the matrix separates its rows. Since the matrix entries depend on xₖ, this is only formally a linear recursion. Each step generates the next CFA coefficient nₖ₊₁ = ⌊xₖ⌋, along with xₖ₊₁ = 1/rₖ₊₁, from nₖ and xₖ.

Example: If x₀ = π, then n₁ = ⌊π⌋ = 3, r₁ = π − n₁ = 0.14159265···, and x₁ = 1/r₁ = 7.06251331···:

    [3; 7.06251331]₁ = [0, ⌊π⌋; 0, 1/(π − ⌊π⌋)] [0; π]₀,

and for k = 2

    [7; 15.99659]₂ = [0, 7; 0, 1/(7.0625 − 7)] [3; 7.06251331]₁.

For n = 3, π₃ = [n₁; n₂, n₃] = [3; 7, 15] (using the floor function). Continuing, n₄ = ⌊1.003418⌋ = 1 and n₅ = 292.

Exercise: Find the CFA of π using the floor function, to 12th order.

Solution: π₁₂ = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1].

Example: Matlab/Octave's rat(pi,1e-20) gives

    3 + 1/(7 + 1/(16 + 1/(-294 + 1/(3 + 1/(-4 + 1/(5 + 1/(-15 + 1/(-3)))))))).

In bracket notation,

    π₉ = [3; 7, 16, −294, 3, −4, 5, −15, −3].

Because some of the coefficients are negative, it is clear that Matlab/Octave uses rounding rather than the floor function.
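The dependence on the rounding scheme is easy to explore numerically. Below is a minimal Python sketch (the helper name cfa() is mine, and Python stands in for the text's Matlab/Octave); with nearest-integer rounding it reproduces the signed coefficients of rat(pi), while floor rounding gives the classical all-positive expansion.

```python
import math

def cfa(x, n_terms, rounding=round):
    """Continued fraction coefficients of x, using the given rounding rule."""
    coeffs = []
    for _ in range(n_terms):
        a = int(rounding(x))
        coeffs.append(a)
        r = x - a                 # remainder; its sign depends on the rounding
        if abs(r) < 1e-12:        # (numerically) zero remainder: x is rational
            break
        x = 1.0 / r               # reciprocate the remainder and repeat
    return coeffs

print(cfa(math.pi, 6))                       # nearest rounding: [3, 7, 16, -294, 3, -4]
print(cfa(math.pi, 6, rounding=math.floor))  # floor: [3, 7, 15, 1, 292, 1]
```

Note how the rounded expansion's large coefficient (−294) signals its faster convergence, as discussed above.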


CHAPTER 2. STREAM 1: NUMBER SYSTEMS (9 LECTURES)

Exercise: Based on several examples, which rounding scheme is the most accurate? Explain why.

Solution: Rounding (to the nearest integer) leaves a smaller remainder at each iteration, thus a smaller net error and improved convergence.

When the CFA is applied and the expansion terminates (rₙ = 0), the target is rational. When the expansion does not terminate (which is not always easy to determine, as the remainder may be ill-conditioned due to small numerical rounding errors), the number is irrational. Thus the CFA has important theoretical applications regarding irrational numbers. You may explore this using Matlab's rats(pi) command.

Besides these five basic rounding schemes, there are two other important R → N functions (i.e., mappings), which will be needed later: mod(x,y) and rem(x,y), with x, y ∈ R. The base-10 digits may be generated from the counting numbers using y=mod(x,10).

Exercise:

1. Show how to generate a base-10 real number y ∈ R from the counting numbers N using m = mod(n,10) + 10k, with n, k ∈ N. Solution: Every time n reaches a multiple of 10, m is reset to 0 and the next digit to the left is increased by 1, by adding 1 to k, generating the digit pair km. Thus the mod() function forms the underlying theory behind decimal notation.

2. How would you generate binary numbers (base 2) using the mod(x,b) function? Solution: Use the same method as in the first example above, but with b = 2.

3. How would you generate hexadecimal numbers (base 16) using the mod(x,b) function? Solution: Use the same method as in the first example above, but with b = 16.

4. Write out the first 19 numbers in hex notation, starting from zero. Solution: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10, 11, 12. Recall that 10₁₆ = 16₁₀, thus 12₁₆ = 18₁₀, resulting in a total of 19 numbers if we include 0.

5. What is FF₁₆ in decimal notation? Solution: hex2dec('ff') = 255₁₀.
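The digit-generation idea behind these exercises can be sketched in a few lines of Python (the helper name digits() is mine): the least-significant digit is mod(n, b), and integer division by b shifts to the next digit.

```python
def digits(n, b):
    """Digits of n >= 0 in base b (most significant first), via repeated mod."""
    if n == 0:
        return [0]
    out = []
    while n > 0:
        out.append(n % b)  # mod(n, b): the least-significant base-b digit
        n //= b            # shift right by one base-b place
    return out[::-1]

print(digits(18, 16))   # [1, 2]    -> 12 in hex, as in item 4
print(digits(255, 16))  # [15, 15]  -> FF, as in item 5
print(digits(6, 2))     # [1, 1, 0] -> binary, as in item 2
```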

Symmetry: A continued fraction expansion can have a high degree of "recursive symmetry." For example, the CFA of the golden ratio is

    R₁ ≡ (1 + √5)/2 = 1 + 1/(1 + 1/(1 + ···)) = 1.618033988749895···.    (2.10)

Here every CFA coefficient aₙ is 1 (R₁ ≡ [1; 1, 1, ···]); thus the sequence cannot terminate, proving that √5 ∈ I. A related example is R₂ ≡ 1 + √2, for which rat(1+sqrt(2)) gives R₂ = [2; 2, 2, 2, ···].

When expanding a target irrational number (x₀ ∈ I), and the CFA is truncated, the resulting rational fraction approximates the irrational target. For the example above, if we truncate at three coefficients ([1; 1, 1]) we obtain

    1 + 1/(1 + 1/(1 + 0)) = 1 + 1/2 = 3/2 = 1.5 = (1 + √5)/2 − 0.118···.

Truncation after six steps gives

    [1; 1, 1, 1, 1, 1] = 13/8 = 1.6250 = (1 + √5)/2 + 0.0070···.

Because all the coefficients are 1, this example converges very slowly. When the coefficients are large

(i.e., remainder small), the convergence will be faster. The expansion of π is an example of faster

convergence.
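The truncated expansions above can be evaluated exactly with Python's fractions module (the helper name cf_eval() is mine, used here for illustration):

```python
from fractions import Fraction

def cf_eval(coeffs):
    """Exactly evaluate a truncated continued fraction [a0; a1, a2, ...]."""
    x = Fraction(coeffs[-1])
    for a in reversed(coeffs[:-1]):
        x = a + 1 / x       # fold the fraction from the innermost term outward
    return x

print(cf_eval([1, 1, 1]))   # 3/2, the three-coefficient truncation above
print(cf_eval([1] * 6))     # 13/8, the six-step truncation
print(cf_eval([3, 7, 16]))  # 355/113, the rapidly converging pi example
```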


Exercise: Discuss the relationship between the CFA and the transmission line modeling method of

Sect. 3.5 (p. 123). Solution: The solution is detailed in Appendix P (p. 323).

Discussion: 1/(1 − z^{−πk}) = 1 + z^{−πk} + z^{−2πk} + ···

In summary: Every rational number m/n ∈ F, with m > n > 1, may be uniquely expanded as a continued fraction, with coefficients aₖ determined using the CFA. When the target number is irrational (x₀ ∈ I), the CFA does not terminate; thus each step produces a more accurate rational approximation, converging in the limit as n → ∞.

Thus the CFA expansion is an algorithm that can, in theory, determine when the target is rational, but with an important caveat: one must determine whether the expansion terminates. This may not be obvious. The fraction 1/3 = 0.33333··· is an example of such a target: its CFA terminates (1/3 = [0; 3]), yet its decimal expansion repeats. It must be that

    1/3 = 3×10⁻¹ + 3×10⁻² + 3×10⁻³ + ···.

Here 3 × 3 = 9, i.e., 1/3 = 3/9. As a second example,²⁴

    1/7 = 0.142857 142857 142857 ··· = 142857×10⁻⁶ + 142857×10⁻¹² + ···.

There are several notations for repeating decimals, such as an overline on the repeating block, or parentheses: 1/7 = 0.(142857). Note that 142857 = 999999/7. Related identities include 1/11 = 0.090909··· and 11 × 0.090909··· = 0.999999···. When the sequence of digits repeats, the sequence is predictable, and the number must be rational. But it is impossible to be sure that it repeats, because the length of the repeat period can be arbitrarily long.
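The repeating block of a rational number can be found by tracking the remainders of long division: the digits repeat as soon as a remainder recurs. A short Python sketch (the helper name repetend() is mine):

```python
def repetend(n, d):
    """Repeating digit block of n/d in base 10, found via long division."""
    seen, digits, r = {}, [], n % d
    while r and r not in seen:
        seen[r] = len(digits)     # remember where this remainder first appeared
        r *= 10
        digits.append(r // d)     # next decimal digit
        r %= d
    return digits[seen[r]:] if r else []   # empty list: the expansion terminates

print(repetend(1, 7))   # [1, 4, 2, 8, 5, 7]
print(repetend(1, 11))  # [0, 9]
print(repetend(1, 3))   # [3]
```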

One of the many useful things about the procedure is its generalization to the expansion of transmission lines, as complex functions of the Laplace complex frequency s. In Assignment 4.5.4 [see Fig. 4.10 (p. 183)], a transmission line is expanded, in analogy to the CFA, but in terms of residues (p. 84) in the Laplace frequency s rather than integers. This method, called Cauer synthesis, is based on the ABCD transmission-line method of Sec. 3.5 (p. 123).

2.3.3 Lec 7: Pythagorean triplets (Euclid’s formula)

Euclid’s formula is a method for finding three integer lengths [a, b, c] ∈ N that satisfy Eq. 1.1. It is

important to ask: From which set are the lengths [a, b, c] drawn? There is a huge difference, both practical and theoretical, depending on whether they are drawn from the real numbers R or from the counting numbers N. Given p, q ∈ N with

p > q, the three lengths [a, b, c] ∈ N of Eq. 1.1 are given by

a = p2 − q2, b = 2pq, c = p2 + q2. (2.11)

Figure 2.7: Beads on a string form perfect right triangles when the number of unit lengths between beads foreach side satisfies Eq. 1.1. For example, when p = 2, q = 1, the sides are [3, 4, 5].

This result may be directly verified, since

[p2 + q2]2 = [p2 − q2]2 + [2pq]2

24Taking the Fourier transform of the target number, represented as a sequence, could help to identify an underlying periodic

component. The number 1/7 ↔ [[1, 4, 2, 8, 5, 7]]6 has a 50 [dB] notch at 0.8π [rad] due to its 6 digit periodicity, carried to 15

digits (Matlab/Octave maximum precision), Hamming windowed, and zero padded to 1024 samples.


or

    p⁴ + q⁴ + 2p²q² = p⁴ + q⁴ − 2p²q² + 4p²q².

Thus, Eq. 2.11 is easily proven, once given. Deriving Euclid’s formula is obviously much more difficult,

and is shown in Fig. 3.4 (p. 101).
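Eq. 2.11 is also easy to check numerically. The Python sketch below (the helper name euclid_triple() is mine) reproduces two of the Plimpton-322 entries listed in Table 2.2.

```python
def euclid_triple(p, q):
    """Euclid's formula (Eq. 2.11): a Pythagorean triple from p > q >= 1."""
    return p * p - q * q, 2 * p * q, p * p + q * q

for p, q in [(2, 1), (3, 2), (12, 5), (125, 54)]:
    a, b, c = euclid_triple(p, q)
    assert a * a + b * b == c * c   # Eq. 1.1 holds identically
    print((p, q), "->", (a, b, c))
```

For example, (p, q) = (2, 1) gives the [3, 4, 5] triangle of Fig. 2.7, and (125, 54) gives the large Plimpton-322 entry [12709, 13500, 18541].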

A well-known example is the right triangle depicted in Fig. 2.7 (p. 61), defined by the integers [3, 4, 5] ∈ N, having angles [0.54, 0.65, π/2] [rad], which satisfies Eq. 1.1 (p. 21). As quantified by Euclid's formula (Eq. 2.11), there are an infinite number of Pythagorean triplets (PTs). Furthermore the seemingly simple triangle, having angles of [30, 60, 90] ∈ N [deg] (i.e., [π/6, π/3, π/2] ∈ I [rad]), has one irrational (I) side length ([1, √3, 2]).

Figure 2.8: "Plimpton-322" is a stone tablet from 1800 [BCE], displaying the a and c values of Pythagorean triplets [a, b, c], with the property b = √(c² − a²) ∈ N. Several of the c values are primes, but not the a values. The stone is item 322 (item 3 from 1922) in the collection of George A. Plimpton.

Table 2.2: Pythagorean triples [a, b, c] computed from Euclid's formula (Eq. 2.11) for various [p, q]. The last three columns are the first, fourth and next-to-last entries of "Plimpton-322," along with their corresponding [p, q]. In all cases c² = a² + b² and p = q + l, where l = √(c − b) ∈ N.

    q:  1   1   1   2   2   2   3   3   3    5     54    27
    p:  2   3   4   3   4   5   4   5   6   12    125    50
    a:  3   8  15   5  12  21   7  16  27  119  12709  1771
    b:  4   6   8  12  16  20  24  30  36  120  13500  2700
    c:  5  10  17  13  20  29  25  34  45  169  18541  3229

The technique for proving Euclid’s formula for PTs [a, b, c] ∈ Q, derived in Fig. 3.4 (p. 101) is much

more interesting than the PTs themselves.

The set from which the lengths [a, b, c] are drawn was not missed by the Indians, Chinese, Egyptians, Mesopotamians, Greeks, etc. Any equation whose solution is based on integers is called a Diophantine equation, named after the Greek mathematician Diophantus of Alexandria (c250 CE) (Fig. 1.1, p. 21). A stone tablet with these numbers engraved on it, shown in Fig. 2.8, was discovered in Mesopotamia, dates from the 19th century [BCE], and was cataloged in 1922 by George Plimpton.²⁵ ²⁶ These numbers are the a and c pairs from PTs [a, b, c]. Given this discovery, it is clear that the Pythagoreans were following those who

25http://www.nytimes.com/2010/11/27/arts/design/27tablets.html

26https://en.wikipedia.org/wiki/Plimpton_322


came long before them. Recently a second similar stone, dating between 350 and 50 [BCE] has been

reported, that indicates early calculus on the orbit of Jupiter’s moons, the very same moons that Rømer

observed to show that the speed of light was finite (p. 26).27

2.3.4 Lec 8: Pell’s equation

Pell's equation,

    xₙ² − N yₙ² = (xₙ − √N yₙ)(xₙ + √N yₙ) = 1,    (2.12)

with non-square N ∈ N specified and xₙ, yₙ ∈ N unknown, has a venerable history in both physics (p. 35) and mathematics. Given its factored form, it is obvious that every solution (xₙ, yₙ) has the asymptotic property

    xₙ/yₙ → ±√N  as  n → ∞.

It is believed that Pell’s equation is directly related to the Pythagorean theorem, and therefore to the 5

Euclidean algorithm, since they are both simple binomials having integer coefficients (Stillwell, 2010,

48), with Pell’s equation being the hyperbolic version of Eq. 1.1. For example, with N = 2, a solution

is x = 17, y = 12 (i.e., 172 − 2 · 122 = 1).

A 2×2 matrix recursion algorithm, likely due to the Chinese and used by the Pythagoreans to investigate √N, is

    [x; y]ₙ₊₁ = [1, N; 1, 1] [x; y]ₙ,    (2.13)

where the index is indicated outside the vectors.

Starting with the trivial solution [x₀, y₀]ᵀ = [1, 0]ᵀ (i.e., x₀² − N y₀² = 1), additional solutions of Pell's equation are determined, having the property that the rationals xₙ/yₙ ∈ F approach √N (Stillwell, 2010, p. 44). This approach was motivated by the Euclidean algorithm (GCD, pp. 54, 323), since xₙ/yₙ → √2 (Stillwell, 2010, pp. 37, 55).

Note that Eq. 2.13 is a 2×2 matrix composition method (see p. 105), since the output of one matrix multiplication is the input to the next. Why Eqs. 2.13 and 2.12 are equivalent is explained in Section 3.2.3 (p. 94).

Asian solutions: The first solution of Pell's equation was published by Brahmagupta (c628), who independently discovered the equation (Stillwell, 2010, p. 46). Brahmagupta's novel solution also used the composition method, but one different from Eq. 2.13 (Stillwell, 2010, p. 69). Then in 1150 CE, Bhāskara II independently obtained solutions using Eq. 2.13 (Stillwell, 2010, p. 69). This is the composition method we shall explore here, as summarized in Table E.1 (p. 268).

The best way to see how this recursion results in solutions to Pell's equation is by example.

²⁷http://www.nytimes.com/2016/01/29/science/babylonians-clay-tablets-geometry-astronomy-jupiter.html

Initializing the recursion with the trivial solution [x₀, y₀]ᵀ = [1, 0]ᵀ gives

    [x₁; y₁] = [1; 1]   = [1, 2; 1, 1] [1; 0],     1² − 2·1² = −1
    [x₂; y₂] = [3; 2]   = [1, 2; 1, 1] [1; 1],     3² − 2·2² = +1
    [x₃; y₃] = [7; 5]   = [1, 2; 1, 1] [3; 2],     7² − 2·5² = −1
    [x₄; y₄] = [17; 12] = [1, 2; 1, 1] [7; 5],    17² − 2·12² = +1
    [x₅; y₅] = [41; 29] = [1, 2; 1, 1] [17; 12],  41² − 2·29² = −1

Thus the recursion results in a modified version of Pell's equation,

    xₙ² − 2yₙ² = (−1)ⁿ,    (2.14)

for which only even values of n yield solutions of Eq. 2.12. This sign alternation had no effect on the Pythagoreans' goal, since they only cared about the ratio xₙ/yₙ → ±√2.

Modified recursion: The following summarizes the solution (∈ C) of Pell's equation for N = 2 using a slightly modified matrix recursion. Note that xₙ/yₙ → √2 as n → ∞, which is what the Pythagoreans were pursuing, and that the odd solutions are purely imaginary.

As shown below, the recursion may be slightly modified to fix the (−1)ⁿ problem, by multiplying the 2×2 matrix by ȷ = √−1. This results in Pell's equation being satisfied at every step of the recursion:

    [x; y]₁ = ȷ[1; 1]    = ȷ[1, 2; 1, 1] [1; 0]₀,       (ȷ1)² − 2(ȷ1)² = 1
    [x; y]₂ = ȷ²[3; 2]   = ȷ[1, 2; 1, 1] ȷ[1; 1]₁,       3² − 2·2² = 1
    [x; y]₃ = ȷ³[7; 5]   = ȷ[1, 2; 1, 1] ȷ²[3; 2]₂,      (ȷ7)² − 2(ȷ5)² = 1
    [x; y]₄ = ȷ⁴[17; 12] = ȷ[1, 2; 1, 1] ȷ³[7; 5]₃,      17² − 2·12² = 1
    [x; y]₅ = ȷ⁵[41; 29] = ȷ[1, 2; 1, 1] ȷ⁴[17; 12]₄,    (ȷ41)² − 2(ȷ29)² = 1

Solution to Pell's equation: By multiplying the matrix by ȷ, all the solutions (xₖ ∈ C) to Pell's equation are determined. The ȷ factor corrects the alternation in sign, so every iteration yields a solution. Fig. E.1 (p. 268) shows that every output of this slightly modified matrix recursion gives solutions to Pell's equation. For n = 0 (the trivial solution), [x₀, y₀] = [1, 0]; [x₁, y₁] = ȷ[1, 1]; and [x₂, y₂] = −[3, 2]. These are easily checked using this recursion.

At each iteration, the ratio xₙ/yₙ approaches √2 with increasing accuracy, coupling it to the CFA, which may also be used to find approximations to √N. The value 41/29 ≈ √2 has a relative error of <0.03%. The solution for N = 3 is given in Appendix E.1.1 (p. 268).
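The ȷ-modified recursion can be checked with Python's complex arithmetic (the helper name modified_pell() is mine): every step now satisfies x² − 2y² = 1, and the odd steps are purely imaginary.

```python
def modified_pell(steps):
    """Multiply the 2x2 matrix [1, 2; 1, 1] by 1j, so each step solves Pell's equation."""
    x, y, out = 1 + 0j, 0 + 0j, []
    for _ in range(steps):
        x, y = 1j * (x + 2 * y), 1j * (x + y)
        out.append((x, y, x * x - 2 * y * y))
    return out

for x, y, err in modified_pell(4):
    print(x, y, err.real)   # err = 1 at every step; odd steps are purely imaginary
```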


2.3.5 Lec 9: Fibonacci sequence

Another classic problem, also formulated by the Chinese, was the Fibonacci sequence, generated by the

relation

fn+1 = fn + fn−1. (2.15)

Here the next number fₙ₊₁ ∈ N is the sum of the previous two. If we start from [0, 1], this difference equation leads to the Fibonacci sequence fₙ = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...]. Alternatively, if we define yₙ₊₁ = xₙ, then Eq. 2.15 may be compactly represented by the 2×2 companion-matrix recursion (see Exercise on p. 70)

    [x; y]ₙ₊₁ = [1, 1; 1, 0] [x; y]ₙ,    (2.16)

which has eigenvalues (1 ± √5)/2.

The correspondence of Eqs. 2.15 and 2.16 is easily verified. Starting with [x, y]₀ᵀ = [0, 1]ᵀ, we obtain for the first few steps

    [1; 0]₁ = [1, 1; 1, 0] [0; 1]₀,   [1; 1]₂ = [1, 1; 1, 0] [1; 0]₁,
    [2; 1]₃ = [1, 1; 1, 0] [1; 1]₂,   [3; 2]₄ = [1, 1; 1, 0] [2; 1]₃,  ···.

From the above, xₙ = [0, 1, 1, 2, 3, 5, ...] is the Fibonacci sequence, since the next xₙ is the sum of the previous two, and the next yₙ is xₙ.

Figure 2.9: This construction is called the Fibonacci spiral. Note how it is constructed out of squares whose side lengths are the Fibonacci numbers. In this way the spiral is smooth, and the radius increases as the Fibonacci numbers (e.g., 8 = 3 + 5, 13 = 5 + 8, etc.). (Adapted from https://en.wikipedia.org/wiki/Golden_spiral.)

Exercise: Use the Octave/Matlab command compan(c) to find the companion matrix of the polynomial coefficients defined by Eq. 2.15.

Solution: The Matlab/Octave code c=[1, -1, -1]; C=compan(c); returns

    C = [1, 1; 1, 0].    (2.17)

Exercise: Find the eigenvalues of the matrix C.

Solution: The characteristic equation is

    det [1 − λ, 1; 1, −λ] = 0,

or λ² − λ − 1 = (λ − 1/2)² − 1/4 − 1 = 0, which has roots λ± = (1 ± √5)/2 ≈ 1.618, −0.618.

The mean-Fibonacci sequence: Suppose the Fibonacci recursion is replaced by the mean of the last two values, namely

    fₙ₊₁ = (fₙ + fₙ₋₁)/2.    (2.18)

This seems like a small change. But how does the solution differ? To answer this question, it is helpful to look at the corresponding 2×2 matrix.


Exercise: Find the 2×2 matrix corresponding to Eq. 2.18. The matrix may be found using the companion-matrix method (p. 83).

Solution: The Matlab/Octave code

    A=[1, -1/2, -1/2];
    C=compan(A);

returns

    C = (1/2)[1, 1; 2, 0].

Exercise: Find the sequence generated by the mean-Fibonacci recursion, starting from [1, 0]₀.

Solution: By inspection, one steady-state solution is [1, 1]∞. What is the second steady-state solution?
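Iterating Eq. 2.18 numerically suggests the answer. The Python sketch below (the helper name mean_fib() and the starting values f₀ = 0, f₁ = 1 are my illustrative choices): since the eigenvalues are 1 and −1/2, the sequence decays geometrically onto the constant (f₀ + 2f₁)/3.

```python
def mean_fib(steps, f0=0.0, f1=1.0):
    """Iterate the mean recursion f_{n+1} = (f_n + f_{n-1}) / 2 (Eq. 2.18)."""
    a, b = f0, f1
    for _ in range(steps):
        a, b = b, (a + b) / 2   # eigenvalues 1 and -1/2: geometric decay to a constant
    return b

print(mean_fib(50))   # converges to (f0 + 2*f1)/3 = 2/3 for this start
```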

Relations to digital signal processing: Today we recognize Eq. 2.15 as a difference equation, which is a pre-limit (pre Stream 3) form of a differential equation. The 2×2 matrix form (Eq. 2.16) is an early precursor to the 17th and 18th century developments in linear algebra. Thus the Greeks' recursive solution for √2 and Bhāskara's (c1150 CE) solution of Pell's equation are early precursors to discrete-time signal processing, as well as to calculus.

There are strong similarities between Pell's equation and the Pythagorean theorem. As we shall see in Appendix O, Pell's equation is related to the geometry of a hyperbola, just as the Pythagorean equation is related to the geometry of a circle. We shall show, as one might assume, that there is a Euclidean formula for the case of Pell's equation, since these are all conic sections with closely related conic geometry. As we have seen, the solutions involve √−1. The derivation is a trivial extension of that for the Euclidean algorithm. Brahmagupta's early solution was not related to such a simple formula; rather, it was based on a composition recursion (Stillwell, 2010).

It makes one wonder whether Eq. 1.1, written as xₙ² + yₙ² = zₙ², with xₙ, yₙ, zₙ ∈ N, has a 2×2 matrix composition solution.

Exercise: Work out the details of this conjecture. Hint: x² = z² − y² = (z − y)(z + y). Solution: The solution may be found in Allen (2019).


Chapter 3

Algebraic Equations: Stream 2

The era from 1640 to 1900 (Fig. 3.1) produced a continuous stream of fundamental theorems. A few of the individuals who played notable roles in this development, in chronological (birth) order, include Galileo, Mersenne, Descartes, Fermat, Huygens, Newton, d'Alembert and Helmholtz. These individuals were some of the first to develop the basic ideas, in various forms, that were later reworked into the proofs that today we recognize as the fundamental theorems of mathematics.

Chronological history from the 17th century

17th: Newton 1642–1727; Johann Bernoulli 1667–1748
18th: Daniel Bernoulli 1700–1782; Euler 1707–1783; d'Alembert 1717–1783; Gauss 1777–1855; Cauchy 1789–1857
19th: Helmholtz 1821–1894; Kirchhoff 1824–1887; Kelvin 1824–1907; Riemann 1826–1866; Maxwell 1831–1879; Rayleigh 1842–1919; Heaviside 1850–1925; Poincaré 1854–1912
20th: Einstein 1879–1955; Harvey Fletcher 1884–1981; Arnold Sommerfeld 1868–1951; Léon Brillouin 1889–1969; Harry Nyquist 1889–1976; Hendrik Wade Bode 1905–1982; ...

Time Line: [Timeline graphic spanning 1640–1950, showing Newton, Johann Bernoulli, Daniel Bernoulli, Euler, d'Alembert, Gauss, Cauchy, Kirchhoff, Kelvin, Stokes, Helmholtz, Riemann, Maxwell, Rayleigh, Heaviside, Poincaré, Sommerfeld, Einstein and Brillouin, with Mozart and Beethoven shown for reference.]

Figure 3.1: Timeline of the three centuries from the mid-17th to the 20th CE, one of the most productive periods of all, because mathematicians were sharing information. Figure 1.2 (p. 24) (Bombelli–Gauss) provides a closer look at the 16th–19th centuries, and Fig. 4.1 (p. 146) (Bombelli–Einstein) provides the full view, 16th–20th centuries.


3.1 Lec 10: The physics behind nonlinear Algebra (Euclidean geometry)

Following Stillwell's history of mathematics, Stream 2 is geometry, which led to the merging of Euclid's geometrical methods and the 9th-century development of algebra by al-Khwarizmi (c830 CE). This integration of ideas led Descartes and Fermat to develop analytic geometry.

The mathematics up to the time of the Greeks, documented and formalized by Euclid, served students of mathematics for more than two thousand years. Algebra and geometry were, at first, independent lines of thought. When merged, the focus returned to the Pythagorean theorem. Algebra generalized the analytic conic section, greatly extending the geometrical approach as taught in Euclid's Elements. With the introduction of algebra, numbers, rather than lines, could be used to represent a geometrical length. Thus the appreciation for geometry grew, given the addition of rigorous analysis using numbers.

Physics inspires algebraic mathematics: The Chinese used music, art, and navigation to drive mathematics. Unfortunately, much of their knowledge has been handed down either as artifacts, such as musical bells and tools, or as mathematical relationships documented, but not created, by scholars such as Euclid, Archimedes, Diophantus, and perhaps Brahmagupta. With the invention of algebra by al-Khwarizmi (c830 CE), mathematics became more powerful, and blossomed. During the 16th and 17th centuries it became clear that differential equations (DEs), such as the wave equation, can characterize a law of nature at a single point in space and time. This principle was not obvious. A desire to understand the motions of objects and planets precipitated many new discoveries. This period is illustrated in Fig. 1.2 (p. 24).

As previously described, the law of gravity was first formulated by Galileo using the concept of conservation of energy, which determines how masses are accelerated when friction is not considered and the mass is constant. Kepler investigated the motion of the planets. While Kepler was the first to observe that the orbits of the planets are described by ellipses, it seems he under-appreciated the significance of his finding, and continued working on his incorrect epicycle planetary model. Following up on Galileo's work (Galileo, 1638), Newton (c1687) went on to show that there must be a gravitational potential between two masses (m₁, m₂) of the form

    φ_New(r(t)) ∝ m₁m₂/r(t),    (3.1)

where r(t) = |x₁ − x₂| is the Euclidean distance between the two point masses at locations x₁ and x₂. Using algebra and his calculus, Newton formalized the equations of gravity, forces and motion (Newton's three laws), the most important being

    f(t) = d[M v(t)]/dt,    (3.2)

and showed that Kepler's discovery of planetary elliptical motion follows naturally from these laws (see p. 11). With the discovery of Uranus (1781), "Kepler's theory was ruined" (i.e., proven wrong) (Stillwell, 2010, p. 23).

Once Newton proposed the basic laws of gravity (Eq. 3.2) and resolved the elliptical motion of the planets, he proceeded to calculate, for the first time, the speed of sound. This required some form of the pressure wave equation,

    ∂²p(x, t)/∂x² = (1/c₀²) ∂²p(x, t)/∂t²,    (3.3)

a key equation in mathematical physics. The speed of sound is

    c₀ = √(η₀P₀/ρ₀) = 343 [m/s],


which is a function of the density ρ₀ ≈ 1.2 [kg/m³] and the dynamic stiffness η₀P₀ of air.¹ In other units the speed of sound is 767 [mph], 1125.3 [ft/s], 1.125 [ft/ms] and 1234.8 [km/h]. One of the earliest measurements, by Marin Mersenne in 1630, gave 1380 [Parisian feet per second]. One Paris foot equals 1.06575 English feet.

Exercise: How much shorter is a foot from London, England, than one from Paris, France? Solution: The English foot is about 94% of the Paris foot (1/1.06575 ≈ 0.938), i.e., about 6% shorter.
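Newton's corrected formula can be evaluated directly; below is a minimal Python check (the constants are the text's nominal values, except that P₀ is taken as 1 atm = 101325 Pa rather than the rounded 10⁵ Pa):

```python
import math

eta0 = 1.4       # Cp/Cv, ratio of specific heats of air
P0 = 101325.0    # static pressure [Pa]: 1 [atm]
rho0 = 1.2       # density of air [kg/m^3]

c0 = math.sqrt(eta0 * P0 / rho0)   # speed of sound: dynamic stiffness over density
print(round(c0, 1), "[m/s]")       # close to the nominal 343 [m/s]
print(round(c0 * 3.6, 1), "[km/h]")
```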

If we substitute for the pressure the traveling wave

    p(x, t) = e^{ȷ(ωt ± kx)},    (3.4)

where t is time and x is position, we find that k = 2π/λ = 2πf/c₀, because fλ = c₀.

Case of dispersive wave propagation: This classic relation λf = c is deceptively simple, thus confusing, because k = 2π/λ becomes complex (has both real and imaginary parts) in dispersive media (e.g., acoustic waves in tubes) when losses are considered (Kirchhoff, 1868; Mason, 1928). A more important example is the case of electron waves in silicon crystals (Brillouin, 1953). In these more general cases, the wave number k(f) = 2πf/c is replaced by the complex analytic function we call the propagation function κ(s) of the Laplace frequency s, i.e., the wave becomes the eigenfunction of the wave equation

    p±(x, t) = P₀(s) e^{st} e^{±κ(s)x}.    (3.5)

In these more general cases the wave number κ(s) must be a complex analytic function of the Laplace frequency s = σ + ȷω, as used with the Laplace transform. This is because electron "waves" in a dispersive semiconductor, such as silicon, are "filtered" in both magnitude and phase. Silicon is a highly dispersive "wave filter," forcing the wavelength to be a complex function of frequency. This view is elegantly explained by Brillouin (1953, Chap. 1) in his historic text. Modern acoustics contains a rich source of related examples (Morse, 1948; Beranek, 1954; Fletcher and Rossing, 2008; Ramo et al., 1965).

Newton's Principia was finally published in 1687, and the general solution to Newton's wave equation [i.e., p(x, t) = G(t ± x/c), where G is any function] was first published 60 years later by d'Alembert (c1747), who showed that for single-frequency sounds the wavelength λ and frequency f are related by

    fλ = c.    (3.6)

Today d'Alembert's single-frequency analytic wave solution must be written as Eq. 3.5, having a complex wave number κ(s) = 2π/λ(s) [m⁻¹]. This formulation led to the frequency-domain concept of Laplace analysis, based on the linearity (i.e., superposition) property of the wave equation (Postulate P2: Lec. 3.7.1, p. 133).

Newton's value for the speed of sound in air c₀ was incorrect by the thermodynamic constant √η₀, a problem that would take 200+ years to resolve. What was needed was the concept of an adiabatic process (constant heat). At audio frequencies (0.02–20 [kHz]), the small temperature gradients cannot diffuse the distance of a wavelength in one cycle (Pierce, 1981; Boyer and Merzbach, 2011), "trapping" the heat energy in the wave.² The eventual fix to Newton's formula for the sound speed was to define the dynamic stiffness of air η₀P₀, where P₀ (1 [atm], or 10⁵ [Pa]) is the static stiffness of air. But this fix still ignores losses (Kirchhoff, 1868).

Newton's success was important because it quantified the physics behind the speed of sound and demonstrated that momentum (mv), not mass m, is transported by the wave. His concept was correct, and his formulation using algebra and calculus represented a milestone in science. In air, assuming no visco-elastic losses, the speed is constant (i.e., c₀ = √(η₀P₀/ρ₀)). When losses are included, the wave number

¹η₀ = Cp/Cv = 1.4 is the ratio of two thermodynamic constants, and P₀ = 10⁵ [Pa] is the barometric pressure of air.
²There were other physical enigmas, such as the observation that sound disappears in a vacuum, or that a vacuum cannot draw water up a column by more than 34 feet.


becomes a complex function of frequency, leading to Eq. 3.5. In periodic structures the wave number again becomes complex, due to diffraction, as commonly observed in optics (e.g., diffraction gratings) and acoustics. Thus Eq. 3.6 only holds for the simplest cases; in general the wave number must be treated as a complex analytic function of s, namely κ(s) as in Eq. 3.5.

The corresponding discovery of the formula for the speed of light was made 174 years after Principia, by Maxwell (c1861). Maxwell's formulation also required great ingenuity, as it was necessary to hypothesize an experimentally unmeasured term in his equations to get the mathematics to correctly predict the speed of light.

It is somewhat amazing that to this day we have failed to understand gravity significantly better than Newton did. Perhaps this is too harsh, given the 1905 work of Einstein.³

3.1.1 The first Algebra

Prior to the invention of algebra, people worked out problems as sentences, using an obtuse description of the problem (Stillwell, 2010, p. 93). Algebra changed this approach, resulting in a compact language of mathematics, where numbers are represented as abstract symbols (e.g., x and α). The problem to be solved could be formulated in terms of sums of powers of smaller terms, the most common being powers of some independent variable (i.e., time or frequency). If we set aₙ = 1, then

    Pₙ(z) ≡ zⁿ + aₙ₋₁zⁿ⁻¹ + ··· + a₀z⁰ = zⁿ + Σ_{k=0}^{n−1} aₖzᵏ = Π_{k=1}^{n} (z − zₖ)    (3.7)

is called a monic polynomial. The coefficient aₙ cannot be zero, or the polynomial would not be of degree n. The resolution is to force aₙ = 1, since this simplifies the expression and does not change the roots.

The key question is: What values of $z = z_k$ result in $P_n(z_k) = 0$? In other words, what are the roots $z_k$ of the polynomial? Answering this question consumed thousands of years, with intense efforts by many aspiring mathematicians. In the earliest attempts, it was a competition to evaluate mathematical acumen. Results were held secret to the death bed. It would be fair to view this effort as an obsession. Today the roots of any polynomial may be found to high accuracy by numerical methods, limited only by the numerical precision of the representation, namely IEEE-754 (p. 41). There are also a number of important theorems.

Of particular interest is composing a circle with a line that does not touch the circle, and finding the intersection (root). This problem has no solution within geometry; its resolution is addressed in the assignments below.

3.1.2 Finding roots of polynomials

The problem of factoring polynomials has a history of more than a millennium. While the quadratic (degree N = 2) was solved by the time of the Babylonians (i.e., the earliest recorded history of mathematics), the cubic solution was not published until Cardano in 1545. The same year, Cardano's student solved the quartic (N = 4). In 1826 (281 years later) it was proved that the quintic (N = 5) cannot be factored by analytic methods.

As a concrete example we begin with the important but trivial case of the quadratic

$$P_2(x) = ax^2 + bx + c. \qquad (3.8)$$

First note that if $a = 0$, the quadratic reduces to the monomial $P_1(x) = bx + c$. Thus we have the necessary condition $a \neq 0$. The best way to proceed is to divide out $a$ and work directly with the normalized quadratic $\frac{1}{a}P_2(x)$. In this way we do not need to worry about the exception $a = 0$.

3Gravitational waves were experimentally measured for the first time while I was formulating Section R.


3.1. LECTURE 10: ALGEBRA AND GEOMETRY AS PHYSICS 77

The roots are those values of $x$ such that $P_2(x_k) = 0$. One of the first results (recorded by the Babylonians, c2000 BCE) was the factoring of this equation by completing the square (Stillwell, 2010, p. 93). One may isolate $x$ by rewriting Eq. 3.8 as

$$\frac{1}{a}P_2(x) = (x + b/2a)^2 - (b/2a)^2 + c/a. \qquad (3.9)$$

This is easily verified by expanding the squared term and canceling $(b/2a)^2$:

$$\frac{1}{a}P_2(x) = \left[x^2 + (b/a)x + (b/2a)^2\right] - (b/2a)^2 + c/a.$$

Setting Eq. 3.9 to zero and solving for the two roots $x_\pm$ gives the quadratic formula

$$x_\pm = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \qquad (3.10)$$

The roots of $P_2(x)$, with $a = 1$, simplify to

$$x_\pm = -\frac{b}{2} \pm \sqrt{(b/2)^2 - c}. \qquad (3.11)$$

This can be further simplified. The term $(b/2)^2 - c$ under the square root is called the discriminant, and in physics and engineering problems it is negative more than 99.9% of the time. Finally $b/2 \ll \sqrt{c}$; thus the most natural way (i.e., corresponding to the most common physical cases) of writing the solution is4

$$x_\pm = -\frac{b}{2} \pm \jmath\sqrt{c - (b/2)^2} \approx -b/2 \pm \jmath\sqrt{c}. \qquad (3.12)$$

This form separates the real and imaginary parts of the solution in a natural way. The term $b/2$ is called the damping, which accounts for losses in a resonant circuit, while the term $\sqrt{c}$, for mechanical, acoustical and electrical networks, is called the resonant frequency, typically written as $\omega_o$. The last approximation drops the (typically minor) correction to the resonant frequency, which in engineering practice is almost always ignored. This formula highlights that the correction exists, making one aware of it.

It is not required that $a, b, c \in \mathbb{R}$, but for physical problems of interest this is almost always (>99.9% of the time) true.

Summary: The quadratic equation and its solution are ubiquitous in physics and engineering. It seems obvious that instead of memorizing the meaningless Eq. 3.10, one should learn the physically meaningful solution, Eq. 3.12, obtained via Eq. 3.9 with $a = 1$. Arguably, the factored and normalized form (Eq. 3.9) is easier to remember as a method (completing the square) than as a formula to be memorized.

Additionally, the real ($b/2$) and imaginary ($\pm\sqrt{c}$) parts of the two roots have physical significance as the damping and resonant frequency; Eq. 3.10 has none.

No insight is gained by memorizing the quadratic formula. To the contrary, an important concept is gained by learning how to complete the square, which is typically easier than identifying $a, b, c$ and blindly substituting them into Eq. 3.10. Thus it is worth learning the alternate solution (Eq. 3.12), since it is more common in practice and requires less algebra to interpret the final answer.

4This is the case for mechanical and electrical circuits having small damping. Physically $b > 0$ is the damping coefficient and $\sqrt{c} > 0$ is the resonant frequency. For $b \ll \sqrt{c}$ one may then approximately factor the form as $x^2 + 2bx + c^2 \approx (x + b + \jmath c)(x + b - \jmath c)$.
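The damping/resonance form of Eq. 3.12 is easy to compute directly. Below is a minimal sketch in Python (the book's own examples use Octave/Matlab; `quadratic_roots` is a hypothetical helper name, not from the text):

```python
import cmath

def quadratic_roots(b, c):
    """Roots of x^2 + b x + c (a = 1), in the form of Eq. 3.12:
    damping -b/2 plus or minus the (usually imaginary) resonance term."""
    disc = (b / 2) ** 2 - c       # discriminant of the normalized quadratic
    s = cmath.sqrt(disc)          # imaginary when (b/2)^2 < c, the common physical case
    return (-b / 2 + s, -b / 2 - s)

# Lightly damped resonator: b = 0.2, c = 1 gives roots near -0.1 +/- 1j.
r1, r2 = quadratic_roots(0.2, 1.0)
print(r1, r2)
```

Note that the product of the roots recovers c and their sum recovers −b, as expected from the factored form.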


78 CHAPTER 3. STREAM 2: ALGEBRAIC EQUATIONS (12 LECTURES)

Exercise: By direct substitution demonstrate that Eq. 3.10 is the solution of Eq. 3.8. Hint: Work with $P_2(x)$. Solution: Setting $a = 1$, the quadratic formula may be written

$$x_\pm = \frac{-b \pm \jmath\sqrt{4c - b^2}}{2} = \frac{-b \pm \sqrt{b^2 - 4c}}{2}.$$

Substituting this into $P_2(x)$ gives

$$P_2(x_\pm) = x_\pm^2 + b x_\pm + c = \left(\frac{-b \pm \sqrt{b^2 - 4c}}{2}\right)^2 + b\left(\frac{-b \pm \sqrt{b^2 - 4c}}{2}\right) + c$$
$$= \frac{1}{4}\left(b^2 \mp 2b\sqrt{b^2 - 4c} + (b^2 - 4c)\right) + \frac{1}{4}\left(-2b^2 \pm 2b\sqrt{b^2 - 4c}\right) + c = 0.$$

Exercise: By direct substitution, show that Eq. 3.12 satisfies Eq. 3.8 (with $a = 1$). Solution: Writing $P_2(x)$ in factored form using the roots of Eq. 3.12 gives

$$P_2(x) = (x - x_+)(x - x_-) = \left(x + \frac{b}{2} + \jmath\sqrt{c - (b/2)^2}\right)\left(x + \frac{b}{2} - \jmath\sqrt{c - (b/2)^2}\right)$$
$$= \left(x + \frac{b}{2}\right)^2 + \left(c - b^2/4\right) = \left(x^2 + bx + b^2/4\right) + \left(c - b^2/4\right) = x^2 + bx + c.$$

In third grade I learned the times-table trick for 9:

$$9 \cdot n = (n - 1)\cdot 10 + (10 - n). \qquad (3.13)$$

With this simple rule I did not need to depend on my memory for the 9 times table.5 Learning an algorithm is much more powerful than memorizing the table. Note that the two digits, $(n - 1)$ and $(10 - n)$, add to 9. How one thinks about a problem can have great impact.
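The trick of Eq. 3.13 is easy to verify exhaustively, sketched here in Python as an illustration:

```python
# Check Eq. 3.13: the tens digit of 9*n is n-1 and the units digit is 10-n.
table = {n: (n - 1) * 10 + (10 - n) for n in range(1, 11)}
for n, v in table.items():
    print(f"9*{n} = {v}")
```

Every entry matches the ordinary product, confirming the identity 9n = 10n − 10 + 10 − n.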

Newton's method for finding roots of $P_n(s)$: Newton is well known for an approximate but efficient method to find the roots of a polynomial.6 Consider the polynomial $P_n(s) \in \mathbb{C}$ for $s \in \mathbb{C}$,

$$P_n(s) = c_n(s - s_0)^n + c_{n-1}(s - s_0)^{n-1} + \cdots + c_1(s - s_0) + c_0,$$

where we may use Taylor's formula (p. 89) to determine the coefficients

$$c_n = \frac{1}{n!}\left.\frac{d^n}{ds^n} P_n(s)\right|_{s = s_0}.$$

5E.g.: $9 \cdot 7 = (7 - 1)\cdot 10 + (10 - 7) = 60 + 3$ and $9 \cdot 3 = (3 - 1)\cdot 10 + (10 - 3) = 20 + 7$. By expanding, one can see why it works: $9n = (n - 1)10 + (10 - n) = 10n - 10 + 10 - n = n(10 - 1)$.
6https://en.wikipedia.org/wiki/Newton's_method


If an initial guess $s_0$ is close to a root, then $|s - s_0|^n \ll |s - s_0|$ for $n \ge 2$, $n \in \mathbb{N}$. Thus we may truncate $P_n(s)$ to its linear term:

$$P_n(s) \approx (s - s_0)\left.\frac{d}{ds}P_n(s)\right|_{s=s_0} + P_n(s_0) = (s - s_0)P_n'(s_0) + P_n(s_0),$$

where $P_n'(s)$ is shorthand for $dP_n(s)/ds$.

The root may be found by iteration, defining a sequence of estimates $s_n$ such that $P_n(s_n) \to 0$ as $n \to \infty$. Replacing $s$ with $s_n$ and $s_0$ with $s_{n-1}$, the linearized formula becomes a recursion for the root:

$$(s_n - s_{n-1})P_n'(s_{n-1}) + P_n(s_{n-1}) = P_n(s_n) \to 0$$

as $n \to \infty$. Setting the left side to zero and solving for $s_n$ gives Newton's famous root-finding formula, based on the reciprocal of the logarithmic derivative $P_n'/P_n$:

$$s_n = s_{n-1} - \frac{P_n(s_{n-1})}{P_n'(s_{n-1})}. \qquad (3.14)$$

Here $s_{n-1}$ is the expansion point and $s_n$ is the $n$th approximation to the root. With every step the expansion point moves closer to the root, and as it does, the linearity assumption becomes more accurate, ultimately resulting in convergence to the root.

This expression is related to the log-derivative $d \log P(x)/dx = P'(x)/P(x)$. It follows that even for cases where fractional derivatives of roots are involved (p. 164),7 Newton's method should converge, since the log-derivative linearizes the equation.8
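The iteration of Eq. 3.14 can be sketched in a few lines of Python (the text's numerical examples use Octave/Matlab; `horner` and `newton` are hypothetical helper names). Horner's rule evaluates $P_n$ and $P_n'$ in one pass:

```python
def horner(coeffs, s):
    """Evaluate P(s) and P'(s) with one pass over [c_n, ..., c_1, c_0]."""
    p, dp = 0j, 0j
    for c in coeffs:
        dp = dp * s + p        # the derivative accumulates the partial polynomial
        p = p * s + c
    return p, dp

def newton(coeffs, s0, tol=1e-12, max_iter=100):
    s = complex(s0)
    for _ in range(max_iter):
        p, dp = horner(coeffs, s)
        step = p / dp
        s = s - step           # Eq. 3.14: s_n = s_{n-1} - P(s_{n-1}) / P'(s_{n-1})
        if abs(step) < tol:
            break
    return s

# Same quartic as Fig. 3.2 (left): roots [1, 2, 3, 4].
root = newton([1, -10, 35, -50, 24], 1.3)
print(root)   # converges to the root at 1.0
```

Working in complex arithmetic (`0j`, `complex(s0)`) is what allows the same routine to find complex roots from a complex starting point.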

Figure 3.2: Newton's method applied to a polynomial having the real roots [1, 2, 3, 4] (left) and 5 complex roots (right). A random starting point was chosen, and each curve shows the values of $s_n$ as Newton's method converges to a root. Different random starting points converge to different roots. The method always results in convergence to a root. Claims to the contrary (Stewart, 2012, p. 347) are a result of forcing the roots to be real. For convergence one must work with $s_n \in \mathbb{C}$.

Newton's view: Newton believed that imaginary roots and numbers have no meaning (p. 147), thus he sought only real roots (Stillwell, 2010, p. 119). In this case Newton's relation may be explored as a graph, which puts Newton's method in the realm of analytic geometry. The function $P_n'(x)$ is the slope of the polynomial $P_n(x)$ at $x_n$. The ratio $P_n(x)/P_n'(x)$ has poles at the roots of $P_n'$ and zeros at the roots we seek.

7This seems like an obvious way toward understanding fractional, even irrational, roots.
8https://en.wikipedia.org/wiki/Argument_principle


Example: When the polynomial is $P_2(x) = 1 - x^2$, $P_2'(x) = -2x$ and Newton's iteration becomes

$$x_{n+1} = x_n + \frac{1 - x_n^2}{2x_n}.$$

(Note the roots are $x_\pm = \pm 1$.) For the case $N = 2$ the root of $P_2'(x)$ is always the average of the roots of $P_2(x)$.

To start the iteration ($n = 0$) we need an initial guess $x_0$, a random guess of where a root might be. For $P_2(x) = 1 - x^2$,

$$x_1 = x_0 + \frac{1 - x_0^2}{2x_0} = x_0 + \frac{1}{2}\left(-x_0 + 1/x_0\right).$$

1. Let $P_2(x) = 1 - x^2$ and $x_0 = 1/2$. Draw a graph describing the first step of the iteration. Solution: Start with an $(x, y)$ coordinate system and put points at $x_0 = (1/2, 0)$ and at the vertex of $P_2(x)$, i.e., $(0, 1)$ (since $P_2(0) = 1$). Then draw $1 - x^2$, along with a line from $x_0$ to $x_1$.

2. Calculate $x_1$ and $x_2$. What number is the algorithm approaching? Is it a root of $P_2$? Solution: First we must find $P_2'(x) = -2x$. Thus the equation we will iterate is

$$x_{n+1} = x_n + \frac{1 - x_n^2}{2x_n} = \frac{x_n^2 + 1}{2x_n}.$$

By hand:

$$x_0 = 1/2$$
$$x_1 = \frac{(1/2)^2 + 1}{2(1/2)} = \frac{1}{4} + 1 = 5/4 = 1.25$$
$$x_2 = \frac{(5/4)^2 + 1}{2(5/4)} = \frac{(25/16) + 1}{10/4} = \frac{41}{40} = 1.025.$$

These estimates rapidly approach the positive real root $x = 1$. Note that if one starts at the root of $P_2'(x) = 0$ (i.e., $x_0 = 0$), the first step is indeterminate.

3. Write an Octave/Matlab script to check your answer for part (a). Solution:

x=1/2;
for n = 1:4
  x = x+(1-x*x)/(2*x);
end

(a) For $n = 4$, what is the absolute difference between the root and the estimate, $|x_r - x_4|$? Solution: 4.6E-8 (very small!)

(b) What happens if $x_0 = -1/2$? Solution: The iteration converges to the negative root, $x = -1$.

4. Does Newton's method work for $P_2(x) = 1 + x^2$? Why?9 Hint: What are the roots in this case? Solution: In this case $P_2'(x) = +2x$, thus the iteration gives

$$x_{n+1} = x_n - \frac{1 + x_n^2}{2x_n}.$$

In this case the roots are $x_\pm = \pm\jmath$, namely purely imaginary. Newton's method fails because there is no way for the answer to become complex: starting from a real $x_0$, every iterate remains real (real in, real out). If, like Newton, you didn't believe in complex numbers, your method would fail to converge to the complex roots. Octave/Matlab likewise keeps $x$ real unless it is initialized as a complex number.

9https://en.wikipedia.org/wiki/Newton’s method#Complex functions


5. What if you let $x_0 = (1 + \jmath)/2$ for the case of $P_2(x) = 1 + x^2$? Solution: By starting with a complex initial value, we fix the real-in, real-out problem.
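A quick numerical illustration of "real in, real out," sketched in Python as an assumed example (not from the text):

```python
def step(x):
    return x - (1 + x * x) / (2 * x)    # Newton update for P(x) = 1 + x^2

x = 0.5                                  # real start: iterates wander, never converge
for _ in range(6):
    x = step(x)
print(x)                                 # still real, far from either root +/-1j

z = (1 + 1j) / 2                         # complex start: converges to the root +1j
for _ in range(20):
    z = step(z)
print(z)
```

The real orbit hops chaotically along the real axis, while the complex start converges quadratically to $+\jmath$.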

Fractal diagrams: When the initial guess is real but the nearest root is complex, Newton's iteration fails, as discussed in the exercises above. This failure to converge gives rise to fractal patterns, related to the Mandelbrot set.10

Example: Assume that $P_3(s) = (s - a)^2/(s - b)^\pi$. Then

$$\ln P_3(s) = 2\ln(s - a) - \pi\ln(s - b)$$

and

$$\frac{d}{ds}\ln P_3(s) = \frac{2}{s - a} - \frac{\pi}{s - b}.$$

Reduction by the logarithmic derivative to simple poles: As shown by the above trivial example, any polynomial, having zeros of arbitrary degree (i.e., π in the example), may be reduced to the ratio of two polynomials by taking the logarithmic derivative, since

$$L_N(s) = \frac{N(s)}{D(s)} = \frac{d}{ds}\ln P_n(s) = \frac{P_n'(s)}{P_n(s)}. \qquad (3.15)$$

Here the starting polynomial is the denominator $D(s) = P_n(s)$, while the numerator $N(s) = P_n'(s)$ is the derivative of $D(s)$. Thus the logarithmic derivative can play a key role in the analysis of complex analytic functions, as it reduces higher-order poles, even those of irrational degree, to simple poles.

The logarithmic derivative $L_N(s)$ has a number of special properties:

1. $L_N(s)$ has simple poles $s_p$ and zeros $s_z$.

2. The poles of $L_N(s)$ are the zeros of $P_n(s)$.

3. The zeros of $L_N(s)$ (i.e., $P_n'(s_z) = 0$) are the zeros of $P_n'(s)$.

4. $L_N(s)$ is analytic everywhere other than at its poles.

5. Since the zeros of $P_n(s)$ are simple (no second-order poles), the zeros of $L_N(s)$ always lie close to the line connecting the poles. One may easily demonstrate this numerically, and it has been quantified by the Gauss–Lucas theorem, which specifies the relationship between the roots of a polynomial and those of its derivative: the roots of $P_n'$ (of degree $n - 1$) lie inside the convex hull of the roots of $P_n$.

6. Newton's method may be expressed in terms of the reciprocal of the logarithmic derivative, since

$$s_{k+1} = s_k - \epsilon_o / L_N(s_k),$$

where $\epsilon_o$ is called the step size, used to control the rate of convergence of the algorithm. If the step size is too large, the root-finding path may jump to a different domain of convergence, and thus to a different root of $P_n(s)$.

7. Not surprisingly, given all these special properties, $L_N(x)$ plays a key role in mathematical physics.

10https://en.wikipedia.org/wiki/Mandelbrot_set
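The pole-reduction property of Eq. 3.15 can be checked numerically; here is a Python sketch for the assumed example $P(s) = (s-1)^2(s-3)$ (not from the text):

```python
def P(s):
    return (s - 1) ** 2 * (s - 3)

def dP(s):
    return 2 * (s - 1) * (s - 3) + (s - 1) ** 2   # product rule

# The double zero at s = 1 becomes a simple pole of P'/P with residue 2,
# and the simple zero at s = 3 a simple pole with residue 1.
s = 2.5 + 0.5j
lhs = dP(s) / P(s)
rhs = 2 / (s - 1) + 1 / (s - 3)
print(abs(lhs - rhs))    # ~0: the logarithmic derivative has only simple poles
```

The residue at each pole equals the degree of the corresponding zero, which is why the logarithmic derivative "flattens" higher-order zeros.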


Euler's product formula: Counting may be written as a recursion, simply by adding 1 to the previous value, starting from 0. The even numbers may be generated by adding 2, starting from 0. Multiples of 3 may be similarly generated by adding 3 to the previous value, starting from 0. Such recursions are fundamentally related to the prime numbers $n \in \mathbb{P}$, as first shown by Euler, starting from the counting numbers $n \in \mathbb{N}$. The logic is based on the properties of the sieve (Lec. 2.3, p. 52). The basic idea is simple, taking almost everyone by surprise, likely even Euler. It is related to the old idea that the integers may be generated by the geometric series, when viewed as a recursion.

Example: Let's look at counting modulo prime numbers. For example,

$$n = k \bmod 2 \cdot k \bmod 3 \cdot k \bmod 5.$$

Figure 3.3: This feedback network is described by a time-domain difference equation with delay M [s]; it has an all-pole transfer function, given by Eq. R.17.

To see this, define the step function $u_n = 0$ for $n < 0$, $u_n = 1$ for $n \ge 0$, and the counting-number function $N_n = 0$ for $n < 0$. The counting numbers may be generated from the recursion

$$N_{n+1} = N_{n-M} + u_n, \qquad (3.16)$$

which for $M = 1$ gives $N_n = n$. For $M = 2$, $N_n = 0, 2, 4, \cdots$ are the even numbers.
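The stride idea above can be sketched in Python (one plausible reading of the recursion, with α = 1 and $u_n$ the unit step; this is an illustration, not the book's code):

```python
def generate(stride, count):
    """N_{n+1} = N_n + stride * u_n, starting from N_0 = 0."""
    seq, N = [], 0
    for _ in range(count):
        seq.append(N)
        N += stride
    return seq

print(generate(1, 6))   # [0, 1, 2, 3, 4, 5]   counting numbers
print(generate(2, 6))   # [0, 2, 4, 6, 8, 10]  even numbers
print(generate(3, 6))   # [0, 3, 6, 9, 12, 15] multiples of 3
```

Each stride produces exactly the multiples that a sieve pass must remove, which is the link to the prime sieve discussed next.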

As first published by Euler in 1737, one may recursively factor out the leading prime term, resulting in Euler's product formula. Based on the argument given in the discussion of the sieve (Lec. 2.3, p. 52), one may automate the process and create a recursive procedure: identify the multiples of the first item on the list, then remove the multiples of that prime. The lowest number remaining on the list is the next prime. One may then recursively generate all the multiples of this new prime and remove them from the list. Any numbers that remain are candidates for primes.

The observation that this procedure may be automated with a recursive filter, such as that shown in Fig. 3.3, implies that it may be transformed into the frequency domain and described in terms of its poles and zeros. Euler was successful in finding the function that describes these poles, which are related to the primes. For example, the poles of the filter shown in Fig. 3.3 may be determined by taking the z-transform of the recursion equation and solving for the roots of the resulting polynomial. The recursion equation is the time-domain equivalent of Riemann's zeta function ζ(s), which is the frequency-domain equivalent representation.

Exercise: Show that $N_n = n$ follows from the above recursion. Solution: For $n = -1$ we have $N_{-1} = 0$ and $u_{-1} = 0$. For $n = 0$ the recursion gives $N_1 = N_0 + u_0 = 0 + 1 = 1$. When $n = 1$ we have $N_2 = N_1 + 1 = 2$. For $n = 2$ the recursion gives $N_3 = N_2 + 1 = 3$. Continuing the recursion we find that $N_n = n$. Today we denote recursions of this form as digital filters. The state diagram for $N_n$ is given in Fig. 3.3.

To start the recursion, $u_n = 0$ for $n < 0$ and $u_n = 1$ for $n \ge 0$; thus $u_0 = 1$, and the counting numbers follow from the recursion.


Exercise: Write an Octave/Matlab program that removes all the even numbers, to give the odds $N_n = 1, 0, 3, 0, 5, 0, 7, 0, 9, \cdots$. Solution:

M=50; N=(0:M-1);
u=ones(1,M); u(1)=0;
Dem=[1 0 -1]; Num=[0 0 1];
n=filter(Num,Dem,u);
y2=n.*N; F1=N-y2

which gives F1 = [0, 1, 0, 3, 0, 5, 0, 7, 0, · · · ].

Exercise: Write a program to recursively down-sample $N_n$ by 2:1. Solution:

N=[1 0 3 0 5 0 7 0 9 0 11 0 13 0 15]
k=1;
while 2*k-1 < length(N)
  M(k)=N(2*k+1);
  k=k+1;
end
M

which gives M = [3, 5, 7, 9, 11, 13, 15].

Here is a one-line code that does a similar thing:

N=1:15; M=N(1:2:15)

which returns the odd numbers between 1 and 15 (i.e., 1:2:15).

For the next step in the sieve, generate all the multiples of 3 (the second prime) and subtract these from the list. This will either zero out these numbers from the list, or create negative entries, which are then removed. A number comes out negative when it has already been removed because it has a second factor; for example, 6 is a multiple of 2, and thus was already removed with the multiples of the prime 2.
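The sieve step just described can be sketched in Python (an illustration; the book's exercises use Octave):

```python
# Start from the list with multiples of 2 already zeroed out (the "odds" list),
# then zero out the multiples of the next remaining prime, 3.
odds = [n if n % 2 == 1 else 0 for n in range(16)]
after3 = [0 if (n > 3 and n % 3 == 0) else n for n in odds]
print(after3)   # 9 and 15 removed; 5, 7, 11, 13 survive as prime candidates
```

Entries such as 6 and 12 were already zero from the first pass, which is the "second factor" case described above.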

3.1.3 Matrix formulation of the polynomial

There is a one-to-one relationship between every constant-coefficient differential equation, its characteristic polynomial, and the equivalent matrix form of that differential equation, defined by the companion matrix (Horn and Johnson, 1988, p. 146).

The companion matrix: The N × N companion matrix

$$C_N = \begin{bmatrix}
0 & & & & & -c_0 \\
1 & 0 & & & & -c_1 \\
0 & 1 & 0 & & & -c_2 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & & & 1 & 0 & -c_{N-2} \\
0 & & & & 1 & -c_{N-1}
\end{bmatrix}_{N \times N} \qquad (3.17)$$


is derived from the monic polynomial of degree N,

$$P_N(s) = s^N + c_{N-1}s^{N-1} + \cdots + c_2 s^2 + c_1 s + c_0 = s^N + \sum_{n=0}^{N-1} c_n s^n,$$

having coefficient vector

$$\mathbf{c}^T = [1, c_{N-1}, c_{N-2}, \cdots, c_0].$$

The roots of the monic polynomial are the eigenvalues of the companion matrix $C_N$. The degrees of freedom (DoF) of a polynomial, and thus of its companion matrix, are the number of non-zero eigenvalues.

Exercise: Show that the eigenvalues of the 3x3 companion matrix are the same as the roots of $P_3(s)$. Solution: Expanding the determinant of $C_3 - sI_3$ along the right-most column:

$$\begin{vmatrix}
-s & 0 & -c_0 \\
1 & -s & -c_1 \\
0 & 1 & -(c_2 + s)
\end{vmatrix}
= -c_0 - c_1 s - (c_2 + s)s^2 = -(s^3 + c_2 s^2 + c_1 s + c_0).$$

This is the negative of the characteristic polynomial, $-P_3(s)$, which has the same roots.

Exercise: Find the companion matrix for the Fibonacci sequence, defined by the recursion (i.e., difference equation)

$$f_{n+1} = f_n + f_{n-1},$$

initialized with $f_n = 0$ for $n < 0$ and $f_0 = 1$. Solution: Taking the "z-transform" gives $z^1 - z^0 - z^{-1} = 0$ (i.e., $z^2 - z - 1 = 0$), having coefficient vector Z = [1, −1, −1] and resulting in the Fibonacci companion matrix

$$C = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix},$$

as discussed in Section 2.3.5 on page 65.
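That the companion matrix and the recursion are equivalent is easy to check by iterating C on the state vector, sketched here in Python as an illustration:

```python
C = [[1, 1], [1, 0]]              # Fibonacci companion matrix

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

v = [1, 0]                        # state [f_0, f_{-1}] = [1, 0]
fibs = []
for _ in range(8):
    fibs.append(v[0])
    v = matvec(C, v)              # advances the state to [f_{n+1}, f_n]
print(fibs)   # [1, 1, 2, 3, 5, 8, 13, 21]
```

Each matrix multiply performs one step of $f_{n+1} = f_n + f_{n-1}$ while shifting the old value into the second slot.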

Example: Matlab/Octave: A polynomial is represented in Matlab/Octave in terms of its coefficient vector. When the polynomial vector for the poles of a differential equation is

$$\mathbf{c}_N = [1, c_{N-1}, c_{N-2}, \cdots, c_0]^T,$$

the leading coefficient is 1. This normalization guarantees that the leading term is not zero, so the number of roots (N) is equal to the degree of the monic polynomial.

3.1.4 Working with polynomials in Matlab/Octave

In Matlab/Octave there are eight related functions you must become familiar with:

1. R=roots(A): Vector A = $[a_N, a_{N-1}, \ldots, a_0] \in \mathbb{C}$ holds the complex coefficients of the polynomial $P_N(z) = \sum_{n=0}^{N} a_n z^n$, where $N \in \mathbb{N}$ is the degree of the polynomial. It is convenient to force $a_N = 1$, by dividing the polynomial by this value when it is not 1, guaranteeing it cannot be zero. R is the vector of roots $[z_1, z_2, \cdots, z_N] \in \mathbb{C}$ such that polyval(A,z_k)=0.

Example: roots([1, -1])=1.

Page 41: University Of Illinois · Contents 1 Introduction 19 1.1 Early Science and Mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 1.1.1 Lecture 1: (Week 1) Three

3.1. LECTURE 10: ALGEBRA AND GEOMETRY AS PHYSICS 85

2. y=polyval(A,x): This evaluates the polynomial defined by vector A ∈ ℂ^N at x ∈ ℂ, returning vector y(x) ∈ ℂ.

Example: polyval([1 -1],1)=0, polyval([1, 1],3)=4.

3. P=poly(R): This is the inverse of roots(), returning a vector of polynomial coefficients P ∈ ℂ^N of the corresponding characteristic polynomial, starting from either a vector of roots R or a matrix A (for example, one with the roots on its diagonal). The characteristic polynomial is defined as the determinant $|A - \lambda I| = 0$, having roots R.

Example: poly([1])=[1, -1], poly([1,2])=[1,-3,2].

Due to IEEE-754 scaling issues, this can give strange results that are numerically correct, but only within the limits of IEEE-754 accuracy.

4. R=polyder(C): This routine takes the N coefficients of polynomial C and returns the N − 1 coefficients of the derivative of the polynomial. This is useful when working with Newton's method, where each step is proportional to $P_N(x)/P_N'(x)$.

Example: polyder([1,1])=[1].

5. [K,R]=residue(N,D): Given the ratio of two polynomials N, D, residue(N,D) returns vectors K, R such that

$$\frac{N(s)}{D(s)} = \sum_k \frac{K_k}{s - s_k}, \qquad (3.18)$$

where $s_k \in \mathbb{C}$ are the roots of the denominator polynomial D and K ∈ ℂ is a vector of residues, which characterize the roots of the numerator polynomial N(s). The use of residue() will be discussed in Sect. 4.4 (p. 166) and Appendix I (p. 284). This is one of the most valuable time-saving routines I know of.

Example: residue(2, [1 0 -1])=[1 -1].

6. C=conv(A,B): Vector C ∈ ℂ^{N+M−1} contains the polynomial coefficients of the convolution of the coefficient vectors of the two polynomials A ∈ ℂ^N and B ∈ ℂ^M.

Example: [1, 2, 1]=conv([1, 1], [1, 1]).

7. [C,R]=deconv(N,D): Vectors C, N, D ∈ ℂ. This operation uses long division of polynomials to find C(s) = N(s)/D(s) with remainder R(s), where N = conv(D,C)+R, namely

$$C = \frac{N}{D} \ \text{ with remainder } R. \qquad (3.19)$$

Example: Defining the coefficients of two polynomials as $A = [1, a_1, a_2, a_3]$ and $B = [1, b_1, b_2]$, one may find the coefficients of the product from C=conv(A,B), and recover B from C with B=deconv(C,A).

8. A=compan(D): Vector D = $[1, d_{N-1}, d_{N-2}, \cdots, d_0]^T \in \mathbb{C}$ contains the coefficients of the monic polynomial

$$D(s) = s^N + \sum_{k=1}^{N} d_{N-k}\, s^{N-k},$$

and A is the companion matrix of vector D (Eq. I.3, p. 284). The eigenvalues of A are the roots of the monic polynomial D(s).

Example: compan([1 -1 -1])=[1 1; 1 0].
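For intuition, here are minimal pure-Python sketches of three of these routines (assumed translations for illustration, not the Octave built-ins themselves):

```python
def polyval(A, x):
    """Horner evaluation; A lists coefficients from highest power down."""
    y = 0
    for a in A:
        y = y * x + a
    return y

def conv(A, B):
    """Coefficients of the polynomial product."""
    C = [0] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            C[i + j] += a * b
    return C

def deconv(N, D):
    """Long division: returns (C, R) with N = conv(D, C) + R."""
    N = list(N)
    C = []
    for _ in range(len(N) - len(D) + 1):
        q = N[0] / D[0]
        C.append(q)
        for i, d in enumerate(D):
            N[i] -= q * d
        N.pop(0)
    return C, N

print(polyval([1, -1], 1))            # 0
print(conv([1, 1], [1, 1]))           # [1, 2, 1]
print(deconv([1, 2, 1], [1, 1]))      # ([1.0, 1.0], [0.0])
```

The `deconv` sketch assumes a monic divisor, matching the normalization convention used throughout this section.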


Exercise: Practice the use of Matlab's/Octave's related functions, which manipulate roots, polynomials and residues: roots(), conv(), deconv(), poly(), polyval(), polyder(), residue(), compan(). Solution: Try Newton's method for various polynomials. Use N=poly(R) to provide the coefficients of a polynomial given the roots R. Then use roots() to factor the resulting polynomial. Finally, use Newton's method and show that the iteration converges to the nearest root.11 Write more examples similar to the ones provided above, and manually verify them.

Exercise: Write an Octave/Matlab script that finds the companion matrix of the Fibonacci recursion formula, and verify it manually. Solution: Z=[1, -1, -1]; C=compan(Z)

3.2 Lec 11: Eigenanalysis I: eigenvalues of a matrix

At this point we turn a corner in the discussion, to the important topic of eigenanalysis, which starts with the computation of the eigenvalues and eigenvectors of a matrix. As briefly discussed in Lecture 2.2 (p. 43), eigenvectors are mathematical generalizations of resonances, or modes, naturally found in physical systems.

When you pluck the string of a violin or guitar, or hammer a bell or tuning fork, there are natural resonances that occur. These are the eigenmodes of the instrument. The frequency of each mode is related to the eigenvalue, which in physical terms is the frequency of the mode. But this idea goes way beyond simple acoustical instruments: wave-guides and atoms can also be resonant systems. The resonances of the hydrogen atom are called the Lyman series,12 a special case of the Rydberg series and the Rydberg atom (p. 352).

Thus this topic runs deep in both physics and, eventually, mathematics. In some real sense, eigenanalysis was what the Pythagoreans were seeking to understand. This relationship is rarely spoken of in the open literature, but once you see it, it can never be forgotten, as it colors your entire view of all aspects of modern physics.

3.2.1 Eigenvalues of a matrix

The method for finding eigenvalues is best described by an example.13 Starting from the matrix of Eq. 2.3.5 (p. 66), the eigenvalues are defined by the eigenmatrix equation

$$\frac{1}{2}\begin{bmatrix} 1 & 1 \\ 2 & 0 \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \lambda \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}.$$

The unknowns here are the eigenvalue λ and the eigenvector $\mathbf{e} = [e_1, e_2]^T$. First find λ by subtracting the right side from the left:

$$\frac{1}{2}\begin{bmatrix} 1 & 1 \\ 2 & 0 \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} - \lambda\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 - 2\lambda & 1 \\ 2 & -2\lambda \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = 0. \qquad (3.20)$$

The only way this equation for $\mathbf{e}$ can have a nontrivial solution is if the matrix is singular.

Example: The determinant of the above 2x2 example is the product of the diagonal elements minus the product of the off-diagonal elements, which results in the quadratic equation

$$-2\lambda(1 - 2\lambda) - 2 = 4\lambda^2 - 2\lambda - 2 = 0.$$

11A Matlab/Octave program that does this may be downloaded from http://jontalle.web.engr.illinois.edu/uploads/493/M/NewtonJPD.m.
12https://en.wikipedia.org/wiki/Lyman_series
13Appendix C (p. 259) is an introduction to the topics of eigenanalysis for 2x2 matrices.


3.2. LECTURE 11: (WEEK 4) EIGENANALYSIS 87

Completing the square gives

$$(\lambda - 1/4)^2 - (1/4)^2 - 1/2 = 0, \qquad (3.21)$$

thus the roots (i.e., eigenvalues) are $\lambda_\pm = \frac{1 \pm 3}{4} = 1, -1/2$.

Exercise: Expand Eq. 3.21 and recover the quadratic equation. Solution:

$$(\lambda - 1/4)^2 - (1/4)^2 - 1/2 = \lambda^2 - \lambda/2 + (1/4)^2 - (1/4)^2 - 1/2 = \lambda^2 - \lambda/2 - 1/2 = 0.$$

Thus completing the square is equivalent to the original equation.

Exercise: Find the eigenvalues of matrix Eq. 2.12. (Hint: see p. 260.) Solution: This is a minor variation on the previous example. Briefly,

$$\det\begin{bmatrix} 1 - \lambda & N \\ 1 & 1 - \lambda \end{bmatrix} = (1 - \lambda)^2 - N = 0.$$

Thus $\lambda_\pm = 1 \pm \sqrt{N}$.

Exercise: Starting from $[x_n, y_n]^T = [1, 0]^T$, compute the first several values of $[x_n, y_n]^T$. Solution: Here is Matlab/Octave code for computing $x_n$:

x(1:2,1)=[1;0];
A=[1 1;2 0]/2;
for k=1:10; x(:,k+1)=A*x(:,k); end

which gives the rational ($x_n \in \mathbb{Q}$) sequence $1, 1/2, 3/4, 5/8, 11/2^4, 21/2^5, 43/2^6, 85/2^7, 171/2^8, 341/2^9, 683/2^{10}, \cdots$.
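The same recursion in exact arithmetic, sketched with Python's `fractions` module (an illustration; the text uses Octave):

```python
from fractions import Fraction as F

x, y = F(1), F(0)                 # [x_0, y_0] = [1, 0]
seq = [x]
for _ in range(10):
    x, y = (x + y) / 2, x         # [x, y] <- (1/2) [[1, 1], [2, 0]] [x, y]
    seq.append(x)
print(seq)   # 1, 1/2, 3/4, 5/8, 11/16, 21/32, 43/64, ...
```

Exact rationals make the powers-of-two denominators of the sequence explicit.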

Exercise: Show that the solution to Eq. 2.18 is bounded, unlike that of the divergent Fibonacci sequence. Explain what is going on. Solution: Because the next value is the mean of the previous two, the sequence is bounded. To see why, compute the eigenvalues of the matrix of Eq. 2.3.5 (p. 66).

Eigenanalysis: The key to the analysis of such equations is the eigenanalysis, or modal-analysis, method. Eigenmodes are also known as resonant modes in the physics literature; they describe the naturally occurring "ringing" found in physical wave-dominated boundary-value problems. Each mode's eigenvalue quantifies the mode's natural frequency. Complex eigenvalues result in damped modes, which decay in time due to energy losses. Common examples include tuning forks, pendulums, bells, and the strings of musical instruments, all of which have a characteristic frequency.

Two modes with the same frequency are said to be degenerate. This is a very special condition, with a high degree of symmetry.

Cauchy's residue theorem is used to find the time-domain response of each frequency-domain complex eigenmode. Thus the eigenanalysis and the eigenmodes of physics are the same thing (see Sect. 4.3, p. 156), but are described using different (i.e., mutually unrecognizable) notational methods.14 The "eigen method" is summarized in Appendix D, p. 263.

Taking the simple example of a 2x2 matrix $T \in \mathbb{C}$, we start from the definition of the two eigen-equations

$$T\,\mathbf{e}_\pm = \lambda_\pm \mathbf{e}_\pm, \qquad (3.22)$$

corresponding to two eigenvalues $\lambda_\pm \in \mathbb{C}$ and two 2x1 eigenvectors $\mathbf{e}_\pm \in \mathbb{C}^2$.

14During the discovery or creation of quantum mechanics, two alternatives were developed: Schrödinger's differential-equation method and Heisenberg's matrix method. Eventually it was realized the two were equivalent.


Example: Assume that T is the Fibonacci matrix, Eq. 2.16.

The eigenvalues $\lambda_\pm$ may be merged into a 2x2 diagonal eigenvalue matrix

$$\Lambda = \begin{bmatrix} \lambda_+ & 0 \\ 0 & \lambda_- \end{bmatrix},$$

while the two eigenvectors $\mathbf{e}_+$ and $\mathbf{e}_-$ are merged into a 2x2 eigenvector matrix

$$E = [\mathbf{e}_+, \mathbf{e}_-] = \begin{bmatrix} e_1^+ & e_1^- \\ e_2^+ & e_2^- \end{bmatrix}, \qquad (3.23)$$

corresponding to the two eigenvalues. Using matrix notation, this may be compactly written as

$$T E = E\Lambda. \qquad (3.24)$$

Note that while the scalars $\lambda_\pm$ commute with the eigenvectors, $E\Lambda \neq \Lambda E$.

From Eq. 3.24 we may obtain two very important forms:

1. the diagonalization of T:

$$\Lambda = E^{-1} T E, \qquad (3.25)$$

2. the eigen-expansion of T:

$$T = E\Lambda E^{-1}, \qquad (3.26)$$

which is useful for computing powers of T (e.g., $T^{100} = E\Lambda^{100}E^{-1}$).

Example: If we take

$$T = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},$$

then the eigenvalues are given by $(1 - \lambda_\pm)(1 + \lambda_\pm) = -1$, thus $\lambda_\pm = \pm\sqrt{2}$. This method of eigenanalysis is discussed in Section Q.2 (p. 328) and Appendix E (p. 267).
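A quick numerical check of this example (a sketch): with $\lambda = \sqrt{2}$ the eigenvector is $\mathbf{e} = [1, \lambda - 1]^T$, since $T\mathbf{e} = [\lambda, 2 - \lambda]^T = \lambda\mathbf{e}$ (because $\lambda^2 = 2$).

```python
import math

lam = math.sqrt(2)
e = [1.0, lam - 1.0]                       # eigenvector for lambda = +sqrt(2)
Te = [e[0] + e[1], e[0] - e[1]]            # T = [[1, 1], [1, -1]] applied to e
err = max(abs(Te[i] - lam * e[i]) for i in range(2))
print(err)   # ~0: T e = lambda e
```

The same check with $\lambda = -\sqrt{2}$ verifies the second eigenpair.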

Exercise: Use the formula for the generalized diagonalization of a matrix to find the general solution of the mean-Fibonacci sequence. Solution: The eigenvalues are given by the roots of

0 = −λ(1/2 − λ) − 1/2 = (λ − 1/4)^2 − 9/16,

which are [1, −1/2]. By studying the eigenvalues of Eq. 2.3.5 one finds that the steady-state solution approaches 1. Namely fn → 1 = (fn−1 + fn−2)/2 is the solution, as n → ∞: the average of the last two values must approach 1 for large n.
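The eigenvalue calculation above can be sketched numerically. This is a minimal stdlib-only sketch, assuming the mean-Fibonacci recursion is written as the 2x2 update [fn, fn−1]^T = A [fn−1, fn−2]^T with A = [[1/2, 1/2], [1, 0]] (the matrix itself is not printed in this section, so that form is an assumption):

```python
import math

# Characteristic polynomial of the assumed update matrix A = [[1/2, 1/2], [1, 0]]:
#   lambda^2 - (1/2)*lambda - 1/2 = 0, matching the roots [1, -1/2] above.
b, c = -0.5, -0.5                  # lambda^2 + b*lambda + c = 0
disc = math.sqrt(b * b - 4 * c)
lam_plus = (-b + disc) / 2         # expect 1.0
lam_minus = (-b - disc) / 2        # expect -0.5

# Iterate the recursion: since |lam_minus| < 1, f[n] converges to a constant
# (the lambda = 1 mode); for initial values f0, f1 the limit is (f0 + 2*f1)/3.
f0, f1 = 0.0, 1.0
for _ in range(60):
    f0, f1 = f1, (f1 + f0) / 2
print(lam_plus, lam_minus, f1)
```

With f0 = f1 = 1 the limit is 1, matching the steady state claimed in the solution.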

Exercise: Show that the geometric series formula holds for 2x2 matrices. Starting with the 2x2 identity matrix I2 and a ∈ C, with |a| < 1, show that

I2 (I2 − a I2)^{-1} = I2 + a I2 + a^2 I2^2 + a^3 I2^3 + · · · .

Solution: Since a^k I2^k = a^k I2, we may multiply both sides by I2 − a I2 to obtain

I2 = I2 + a I2 + a^2 I2 + a^3 I2 + · · · − a I2 (I2 + a I2 + a^2 I2 + · · · )
   = [1 + (a + a^2 + a^3 + · · · ) − (a + a^2 + a^3 + a^4 + · · · )] I2
   = I2.


3.2. LECTURE 11: (WEEK 4) EIGENANALYSIS 89

Exercise: Show that when T is not a square matrix, Eq. 3.22 can be generalized to

T_{m,n} = U_{m,m} Λ_{m,n} V†_{n,n}.

This important generalization of eigenanalysis is called a singular value decomposition (SVD).
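The SVD factorization can be checked numerically. This is a small sketch assuming numpy is available; the 3x2 matrix is an arbitrary example, not one from the text:

```python
import numpy as np

# An arbitrary non-square 3x2 matrix: it has no eigendecomposition,
# but it always has an SVD, T = U Lambda V^dagger.
T = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
U, s, Vh = np.linalg.svd(T, full_matrices=True)  # U: 3x3, Vh: 2x2

# Rebuild the 3x2 "diagonal" singular-value matrix Lambda_{m,n}
# and verify the factorization.
S = np.zeros_like(T)
S[:2, :2] = np.diag(s)
assert np.allclose(U @ S @ Vh, T)
print(np.round(s, 4))
```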

Summary: The GCD (Euclidean algorithm), Pell’s equation and the Fibonacci sequence may all be written as compositions of 2x2 matrices. Thus Pell’s equation and the Fibonacci sequence are special cases of the 2x2 matrix recursion

[x; y]_{n+1} = [a b; c d] [x; y]_n.

This is an important and common thread of these early mathematical findings. It will turn out that this 2x2 matrix recursion plays a special role in physics, mathematics and engineering, because such equations are solved using the eigenanalysis method. More than several thousand years of mathematical trial and error set the stage for this breakthrough, but it took even longer to be fully appreciated.

The key idea of the 2x2 matrix solution, widely used in modern engineering, can be traced back to Brahmagupta’s solution of Pell’s equation, for arbitrary N. Brahmagupta’s recursion, identical to that of the Pythagoreans’ N = 2 case (Eq. 2.13), eventually led to the concept of linear algebra, defined by the simultaneous solutions of many linear equations. The recursion by the Pythagoreans (sixth century BCE) predated the creation of algebra by al-Khwarizmi (ninth century CE) (Fig. 1.1) (Stillwell, 2010, p. 88).

3.2.2 Taylor series

An analytic function is one that

1. may be expanded in a Taylor series

P(x) = Σ_{n=0}^∞ cn (x − xo)^n,   (3.27)

2. converges for |x − xo| < RoC, the radius of convergence, with coefficients cn that are determined by the derivatives of P(x), evaluated at the expansion point x = xo.

3. The Taylor series coefficients cn are defined by taking derivatives of P(x) and evaluating them at the expansion point xo, namely

cn = (1/n!) (d^n/dx^n) P(x) |_{x=xo}.   (3.28)

Properties: The Taylor formula is a prescription for how to uniquely define the coefficients cn. Without the Taylor series formula, we would have no way of determining cn. The proof of the Taylor formula is transparent: simply take successive derivatives of Eq. 3.27, and then evaluate the result at the expansion point. If P(x) is analytic then this procedure will always work. If P(x) fails to have a derivative of any order, then the function is not analytic and Eq. 3.27 is not valid for P(x). For example, if P(x) has a pole at xo then it is not analytic at that point.

The Taylor series representation of P(x) has special applications for solving differential equations because

1. it is single valued,

2. all its derivatives and integrals are uniquely defined, and

3. it may be trivially continued into the complex plane by extending x ∈ C.


Analytic continuation: A limitation of the Taylor series expansion is that it is not valid outside of its RoC. One method for avoiding this limitation is to move the expansion point. This is called analytic continuation. However, analytic continuation is a non-trivial operation because it (1) requires manipulating an infinite number of derivatives of P(x), (2) at the new expansion point xo, where (3) P(x) may not have derivatives, due to possible singularities. (4) Thus one needs to know where the singularities of P(s) are in the complex s plane. Due to these many problems, analytic continuation is rarely used, other than as an important theoretical concept.

Example: The trivial (corner) case is the geometric series P(x) = 1/(1 − x) about the expansion point x = 0. The function P(x) is defined everywhere, except at the singular point x = 1, whereas the geometric series is only valid for |x| < 1.

Exercise: Verify that c0 and c1 of Eq. 3.27 follow from Eq. 3.28. Solution: To obtain c0, for n = 0, there is no derivative (d^0/dx^0 indicates no derivative is taken), so we must simply evaluate P(x) = c0 + c1(x − xo) + · · · at x = xo, leaving c0. To find c1 we take one derivative, which results in P′(x) = c1 + 2c2(x − xo) + · · · . Evaluating this at x = xo leaves c1. Each time we take a derivative we reduce the degree of the series by one, leaving the next constant term.

Exercise: Suppose we truncate the Taylor series expansion to N terms. What is the name of such functions and what are their properties? Solution: When an infinite series is truncated, the resulting function is called an Nth degree polynomial

PN(x) = Σ_{n=0}^N cn(x − xo)^n = c0 + c1(x − xo) + c2(x − xo)^2 + · · · + cN(x − xo)^N.

We can find c0 by evaluating PN(x) at the expansion point xo, since from the above formula PN(xo) = c0. From the Taylor formula c1 = P′N(x)|_{x=xo}.

Exercise: How many roots do PN(x) and P′N(x) have? Solution: According to the fundamental theorem of algebra PN(x) has N roots. P′N(x) has N − 1 roots. The Gauss-Lucas theorem states that the N − 1 roots of P′N(x) lie inside the convex hull of the N roots of PN(x).

To understand the meaning of the convex hull, consider the following construction: If stakes are placed at each of the N roots of PN(x), and a string is then wrapped around the stakes, with all the stakes inside the string, the convex hull is then the closed set inside the string. One can then begin to imagine how the N − 1 roots of the derivative must evolve, with each set inside the convex hull of the previous set. This concept may be recursed to N − 2, and below.
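The Gauss-Lucas containment can be checked on a small example. This is a stdlib-only sketch using the cubic P(x) = (x − 1)(x − 2)(x − 3), a hypothetical example chosen because its roots are real, so the convex hull is just the interval [1, 3]:

```python
import math

# P(x) = x^3 - 6x^2 + 11x - 6 has roots [1, 2, 3]; the convex hull of the
# roots is the interval [1, 3]. Its derivative is P'(x) = 3x^2 - 12x + 11,
# whose two roots must lie inside that interval (Gauss-Lucas).
a, b, c = 3.0, -12.0, 11.0
disc = math.sqrt(b * b - 4 * a * c)   # sqrt(144 - 132) = sqrt(12)
r1 = (-b - disc) / (2 * a)
r2 = (-b + disc) / (2 * a)
print(r1, r2)                          # roughly 1.42 and 2.58
assert 1.0 < r1 < 3.0 and 1.0 < r2 < 3.0
```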

Exercise: Would it be possible to find an inverse Gauss-Lucas theorem that states where the roots of the integral of a polynomial might be? Solution: With each integral there is a new degree of freedom that must be taken into account, thus this problem is much more difficult. But since there is only one extra degree of freedom, it does not seem impossible. To solve this problem more constraints would be needed.

Role of the Taylor series: The Taylor series plays a major role in the mathematics of differential equations and their solution, as the coefficients of the series uniquely determine the analytic series representation via its derivatives. The implications and limitations of the power series representation are very specific: if the series fails to converge (i.e., outside the RoC), it is essentially meaningless.

A very important fact about the RoC: it is only relevant to the series, not the function being expanded. Typically the function has a pole on the boundary of the RoC, where the series fails to converge; however, the function being expanded is valid everywhere other than at the pole. This point has been poorly explained in many texts. Beyond the RoC lies the region of divergence (RoD), the RoC’s complement.


The Taylor series does not need to be infinite to converge to the function it represents, since it obviously works for any polynomial PN(x) of degree N. In the finite case (N < ∞), the RoC is infinite, and the series equals the function PN(x) exactly, everywhere. Of course PN(x) is called a polynomial of degree N. When N → ∞, the Taylor series is only valid within the RoC, and it is (typically) the representation of the reciprocal of a polynomial.

These properties are both the curse and the blessing of the analytic function. On the positive side, analytic functions are the ideal starting point for solving differential equations, which is exactly how they were used by Newton and others. Analytic functions are “smooth” since they are infinitely differentiable, with coefficients given by Eq. 3.28. They are single valued, so there can be no ambiguity in their interpretation.

Two well-known analytic functions are the geometric series (|x| < 1)

1/(1 − x) = 1 + x + x^2 + x^3 + · · · = Σ_{n=0}^∞ x^n   (3.29)

and the exponential series (|x| < ∞)

e^x = 1 + x + x^2/2 + x^3/(3·2) + x^4/(4·3·2) + · · · = Σ_{n=0}^∞ x^n/n!.   (3.30)
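Both series are easy to sample numerically. This is a minimal stdlib sketch comparing partial sums of Eqs. 3.29 and 3.30 against the closed forms; the cutoff N = 60 is an arbitrary choice:

```python
import math

# Partial sums of the two series: the geometric series requires |x| < 1,
# while the exponential series converges for every finite x.
def geom(x, N=60):
    return sum(x**n for n in range(N))

def expo(x, N=60):
    return sum(x**n / math.factorial(n) for n in range(N))

assert abs(geom(0.5) - 1 / (1 - 0.5)) < 1e-12
assert abs(expo(3.0) - math.exp(3.0)) < 1e-9
print(geom(0.5), expo(3.0))
```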

Exercise: Relate the Taylor series expression for Eq. 3.29 to the following functions:

1. F1(x) = ∫^x dx/(1 − x)   (3.31)

Solution: x + x^2/2 + x^3/3 + · · · .

2. F2(x) = (d/dx) 1/(1 − x)   (3.32)

Solution: 1 + 2x + 3x^2 + · · · .

3. F3(x) = ln 1/(1 − x)   (3.33)

Solution: x + x^2/2 + x^3/3 + · · · (the same series as F1, since F3′(x) = 1/(1 − x)).

4. F4(x) = (d/dx) ln 1/(1 − x)   (3.34)

Solution: 1 + x + x^2 + x^3 + · · · .

Exercise: Using symbolic manipulation (Matlab, Octave, Mathematica), expand the given function F(s) in a Taylor series, and find the recurrence relation between the Taylor coefficients cn, cn−1, cn−2. Assume a ∈ C and T ∈ R.

1. F(s) = e^{as}

Solution: A Google search on “octave syms taylor” is useful for answering this question. The Matlab/Octave code to expand this in a Taylor series is

syms s a
taylor(exp(a*s),s,0,’order’,10)
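A non-symbolic cross-check is also possible. This is a stdlib sketch, assuming the standard result that the Taylor coefficients of e^{as} about s = 0 are cn = a^n/n!, which yields the two-term recurrence cn = (a/n) cn−1:

```python
import math

# Taylor coefficients of F(s) = exp(a*s) about s = 0: c_n = a^n / n!.
# They satisfy c_n = (a / n) * c_{n-1}, a two-term recurrence.
a = 1.5                                   # arbitrary sample value of a
coeffs = [a**n / math.factorial(n) for n in range(10)]
for n in range(1, 10):
    assert abs(coeffs[n] - (a / n) * coeffs[n - 1]) < 1e-12
print([round(cn, 4) for cn in coeffs[:4]])
```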


Exercise: Find the coefficients of the following functions by the method of Eq. 3.28, and give the RoC.

1. w(x) = 1/(1 − jx). Solution: From a straightforward expansion we know the coefficients are

1/(1 − jx) = 1 + jx + (jx)^2 + (jx)^3 + · · · = 1 + jx − x^2 − jx^3 + · · · .

Working this out using Eq. 3.28 is more work:

c0 = (1/0!) w|_0 = 1;
c1 = (1/1!) dw/dx|_0 = j/(1 − jx)^2 |_{x=0} = j;
c2 = (1/2!) d^2w/dx^2 |_0 = (1/2!) 2j^2/(1 − jx)^3 |_0 = −1;
c3 = (1/3!) d^3w/dx^3 |_0 = (1/3!) 6j^3/(1 − jx)^4 |_0 = −j.

However, if we take derivatives of the series expansion it is much easier, and one can even figure out the term for cn:

c0 = 1;
c1 = (d/dx) Σ (jx)^n |_0 = j;
c2 = (1/2!) (d^2/dx^2) Σ (jx)^n |_0 = j^2;
c3 = (1/3!) (d^3/dx^3) Σ (jx)^n |_0 = j^3 = −j;
· · · ;
cn = (1/n!) j^n n! = j^n.

The RoC is |xj| = |x| < 1.

2. w(x) = e^x. Solution: cn = 1/n!. The RoC is |x| < ∞. Functions with an ∞ RoC are called entire.

Brune impedances: A third special family of functions is formed from the ratio of two polynomials Z(s) = N(s)/D(s), commonly used to define an impedance Z(s), denoted the Brune impedance. Impedance functions are a very special class of complex analytic functions because they must have a non-negative real part

ℜZ(s) = ℜ N(s)/D(s) ≥ 0,

so as to obey conservation of energy. A physical Brune impedance cannot have a negative resistance (the real part); otherwise it would act like a power source, violating conservation of energy. Most impedances used in engineering applications are in the class of Brune impedances, defined by the ratio of two polynomials, of degrees M and N

Z_Brune(s) = P_M(s)/P_N(s) = (s^M + a_1 s^{M−1} + · · · + a_0)/(s^N + b_1 s^{N−1} + · · · + b_0),   (3.35)

where M = N ± 1 (i.e., N = M ∓ 1). This ratio of polynomials is sometimes known as a “Padé approximation,” but more specifically this ratio is a Brune impedance, with poles and zeros, defined as the complex roots of the two polynomials. The key property of the Brune impedance is that the real part of the impedance is non-negative (positive or zero) in the right s half-plane

ℜZ(s) = ℜ[R(σ, ω) + jX(σ, ω)] = R(σ, ω) ≥ 0 for ℜs = σ ≥ 0.   (3.36)

Since s = σ + jω, the complex frequency (s) right half-plane (RHP) corresponds to ℜs = σ ≥ 0. This condition defines the class of positive-real functions, also known as the Brune condition, which is frequently written in the abbreviated form

ℜZ(ℜs ≥ 0) ≥ 0.

As a result of this positive-real constraint on impedance functions, the subset of Brune impedances (those given by Eq. 3.35 satisfying Eq. 3.36) must be complex analytic in the entire right s half-plane. This is a powerful constraint that places strict limitations on the locations of both the poles and the zeros of every positive-real Brune impedance.
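The positive-real condition can be sampled on a grid. This is a brute-force stdlib sketch for the hypothetical Brune impedance Z(s) = (s + 1)/(s^2 + s + 1), chosen (not from the text) because M = 1, N = 2 satisfies N = M + 1 and both denominator roots lie in the left half-plane:

```python
# Sample Re Z over part of the closed right half-plane: sigma in [0, 3],
# omega in [-3, 3]. For a positive-real function the minimum sampled
# real part must not be negative (Eq. 3.36).
def Z(s):
    return (s + 1) / (s * s + s + 1)

worst = min(Z(complex(sigma / 10.0, omega / 10.0)).real
            for sigma in range(0, 31)
            for omega in range(-30, 31))
print(worst)
assert worst >= -1e-12
```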


Exercise: Show that Z(s) = 1/√s is positive real, but not a Brune impedance. Solution: Since it may not be written as the ratio of two polynomials, it is not in the Brune impedance class. Writing Z(s) = |Z(s)|e^{jφ} in polar coordinates, since −π/4 ≤ φ ≤ π/4 when |∠s| < π/2, Z(s) satisfies the Brune condition, thus is positive real.

Determining the region of convergence (RoC): Determining the RoC for a given analytic function is quite important, and may not always be obvious. In general the RoC is a circle whose radius extends from the expansion point out to the nearest pole. Thus when the expansion point is moved, the RoC changes, since the location of the pole is fixed.

Example: For the geometric series (Eq. 3.29), the expansion point is xo = 0, and the RoC is |x| < 1, since 1/(1 − x) has a pole at x = 1. We may move the expansion point by a linear transformation, for example, by replacing x with z + 3. Then the series becomes 1/((z + 3) − 1) = 1/(z + 2), so the RoC becomes |z| < 2, because in the z plane the pole has moved to z = −2.

Example: A second important example is the function 1/(x^2 + 1), which has the same RoC as the geometric series, since it may be expressed in terms of its residue expansion (aka partial fraction expansion)

1/(x^2 + 1) = 1/((x + j)(x − j)) = (1/2j) [1/(x − j) − 1/(x + j)].

Each term has an RoC of |x| < |j| = 1. The amplitude of each pole is called the residue, defined in Section 4.4, Eq. 4.36, p. 166. The residue for the pole at x = j is 1/2j.

In summary, the function 1/(x^2 + 1) is the sum of two geometric series, with poles at ±j, which are not as obvious because the roots are complex conjugates. Once factored, it becomes clear what is going on.

Exercise: Verify the above expression is correct, and show that the residues are ±1/2j. Solution: Cross-multiply and cancel, leaving 1, as required. The residue is the coefficient on the pole. Thus the residue of the pole at x = j is 1/2j.

Exercise: Find the residue of (d/dz) z^π. Solution: Taking the derivative gives πz^{π−1}, which has a pole at z = 0. Applying the formula for the residue (Eq. 4.36, p. 166) we find

c−1 = lim_{z→0} z · πz^{π−1} = lim_{z→0} πz^π = 0.

Thus the residue is zero. The exponential series converges for every finite value of x ∈ R (the RoC is the entire real line), thus the exponential is called an entire function.

Exercise: Find the residue of g(z) = (d/dz)(z − 1)^{−π}. Solution: g(z) = −π/(z − 1)^{π+1}; apply the definition of the residue (Eq. 4.36).

3.2.3 Analytic functions

Any function that has a Taylor series expansion is called an analytic function. Within the RoC, the series expansion defines a single-valued function. Polynomials, 1/(1 − x) and e^x are examples of analytic functions that are real functions of their real argument x. This is not the entire story. Because analytic functions are easily manipulated term by term, they may be used to find solutions of differential equations, since the derivatives of a series are uniquely determined within the RoC, due to Eq. 3.28.


Every analytic function has a corresponding differential equation, which is determined by the coefficients ak of the analytic power series. An example is the exponential, which has the property that it is the eigen-function of the derivative operation

(d/dx) e^{ax} = a e^{ax},

which may be verified using Eq. 3.30. This relationship is a common definition of the exponential function, which is very special because it is the eigen-function of the derivative.

The complex analytic power series (i.e., complex analytic functions) may also be integrated, term by term, since

∫^x f(x) dx = Σ_k [a_k/(k + 1)] x^{k+1}.   (3.37)

Newton took full advantage of this property of the analytic function and used the analytic series (Taylor series) to solve analytic problems, especially for working out integrals, allowing him to solve differential equations. To fully understand the theory of differential equations, one must master single-valued analytic functions and their analytic power series.

Single- vs. multi-valued functions: Polynomials and their ∞-degree extensions (analytic functions) are single valued: for each x there is a single value for PN(x). The roles of the domain and codomain may be swapped to obtain an inverse function, with properties that can be different from those of the function. For example, y(x) = x^2 + 1 has the inverse x = ±√(y − 1), which is double valued, and complex when y < 1. Periodic functions such as y(x) = sin(x) are even more “exotic,” since x(y) = arcsin(y) = sin^{−1}(y) has an infinite number of x(y) values for each y. This problem was first addressed in Riemann’s 1851 PhD thesis, written while he was working with Gauss.

Exercise: Let y(x) = sin(x). Then dy/dx = cos(x). Show that dx/dy = −1/√(1 − x^2). Hint: x(y) = cos^{−1}(y) = arccos(y). Solution: See the implicit function theorem (D’Angelo, 2017, p. 104).

Exercise: Let y(x) = sin(x). Then dy/dx = cos(x). Show that dx/dy = −1/√(1 + x^2). Solution: Add solution.

Exercise: Find the Taylor series coefficients of y = sin(x) and x = sin^{−1}(y). Solution: Add solution.
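As a numeric aid for the last exercise, here is a stdlib-only sketch that checks the leading terms of both expansions, assuming the standard series sin(x) = x − x^3/6 + x^5/120 − · · · and arcsin(y) = y + y^3/6 + 3y^5/40 + · · · :

```python
import math

# Compare the leading Taylor terms of sin and arcsin against the library
# functions at a small argument, where truncation error is tiny.
x = 0.1
sin_series = x - x**3 / 6 + x**5 / 120
asin_series = x + x**3 / 6 + 3 * x**5 / 40
assert abs(sin_series - math.sin(x)) < 1e-8
assert abs(asin_series - math.asin(x)) < 1e-8
print(sin_series, asin_series)
```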

Complex analytic functions: When the argument of an analytic function F(x) is complex, that is, x ∈ R is replaced by s = σ + jω ∈ C (recall that R ⊂ C),

F(s) = Σ_{n=0}^∞ cn(s − so)^n,   (3.38)

with cn ∈ C, that function is said to be complex analytic.

For example, when the argument of the exponential becomes complex, it is periodic on the ω axis, since

e^{st} = e^{(σ+jω)t} = e^{σt} e^{jωt} = e^{σt} [cos(ωt) + j sin(ωt)].   (3.39)

Taking the real part gives

ℜe^{st} = e^{σt} (e^{jωt} + e^{−jωt})/2 = e^{σt} cos(ωt),

and ℑe^{st} = e^{σt} sin(ωt). Once the argument is allowed to be complex, it becomes obvious that the exponential and circular functions are fundamentally related. This exposes the family of entire circular functions [i.e., e^s, sin(s), cos(s), tan(s), cosh(s), sinh(s)] and their inverses [ln(s), arcsin(s), arccos(s), arctan(s), cosh^{−1}(s), sinh^{−1}(s)], first fully elucidated by Euler (c. 1750) (Stillwell, 2010, p. 315). Note that because a function, such as sin(ωt), is periodic, its inverse must be multi-valued. What is needed is some systematic way to account for this multi-valued property. Euler also studied the Gamma function Γ(s), which generalizes the factorial operation for real, and eventually (by Gauss) for complex, arguments (s, Γ(s) ∈ C):

s! = Γ(s + 1).

Given a complex analytic function of a complex variable, one must resort to the extended complex plane, Riemann sheets and branch cuts, as discussed in Section 3.8.1 (p. 138). The extended complex plane is a tool that extends the domain of complex analytic functions to include the point at infinity. This topic is critically important in engineering mathematics and will be discussed at length in Sections 3.7-3.8.1 (pp. 131-138).

Definition of the Taylor series of a complex analytic function: However, there is a fundamental problem: we cannot formally define the Taylor series for the coefficients ck until we have defined the derivative with respect to the complex variable, dF(s)/ds with s ∈ C. Thus simply substituting s for x in an analytic function leaves a major hole in one’s understanding of the complex analytic function.

To gain a feeling for the nature of the problem, we may take derivatives of a function with respect to various variables. For example,

(d/dt) e^{st} = s e^{st},

e^{jωt} (d/dσ) e^{σt} = t e^{st},

and

e^{σt} (d/dω) e^{jωt} = jt e^{st}.

It was Cauchy (1814) (Fig. 3.1) who uncovered much deeper relationships within complex analytic functions (Sect. 3.8, p. 135) by defining differentiation and integration in the complex plane, leading to several fundamental theorems of complex calculus, including the fundamental theorem of complex integration and Cauchy’s formula. We shall explore these and several other fundamental theorems in Sect. 4.2 (p. 148).

There seems to be some disagreement as to the status of multi-valued functions: Are they functions, or is a function strictly single valued? If the latter, then we are missing out on a host of interesting possibilities, including all the inverses of nearly every complex analytic function. For example, the inverse of a complex analytic function is a complex analytic function (e.g., e^s and log(s)).

Impact of complex analytic mathematics on physics: It seems likely, if not obvious, that the success of Newton lay in his ability to describe physics by the use of mathematics. He was inventing new mathematics at the same time he was explaining new physics. The same might be said for Galileo. It seems likely that Newton was extending the successful techniques and results of Galileo’s work on gravity (Galileo, 1638).15 Galileo died on Jan 8, 1642, and Newton was born Jan 4, 1643, just short of one year later. Certainly Newton was well aware of Galileo’s great success, and naturally would have been influenced by it (see p. 24).

The application of complex analytic functions to physics was dramatic, as may be seen in the six volumes on physics by Arnold Sommerfeld (1868-1951), and from the productivity of his many (36) students (e.g., Debye, Lenz, Ewald, Pauli, Guillemin, Bethe, Heisenberg, and Seebach, to name a few), notable coworkers (e.g., Leon Brillouin) and others (e.g., John Bardeen) upon whom Sommerfeld had a strong influence. Sommerfeld is famous for training many students who were awarded the Nobel Prize in Physics, yet he never won a Nobel Prize himself (the prize is not awarded in mathematics). Sommerfeld brought mathematical physics (the merging of physical and experimental principles via mathematics) to a new

15http://muse.tau.ac.il/museum/galileo/galileo_low_of_fall.html


level with the use of complex integration of analytic functions to solve otherwise difficult problems, thus

following the lead of Newton who used real integration of Taylor series to solve differential equations

(Brillouin, 1960, Ch. 3 by Sommerfeld, A.).


3.3. LECTURE 12: (WEEK 5) ROOT CLASSIFICATION 103

3.3 Lec 12: Root classification by convolution

Following the exploration of algebraic relationships by Fermat and Descartes, the first theorem was being formulated by d’Alembert. The idea behind this theorem is that every polynomial of degree N (Eq. 3.7) has at least one root, and so may be written as the product of a linear factor containing that root and a second polynomial of degree N − 1. By the recursive application of this concept, it is clear that every polynomial of degree N has N roots. Today this result is known as the fundamental theorem of algebra:

Every polynomial equation P(z) = 0 has a solution in the complex numbers. As Descartes observed, a solution z = a implies that P(z) has a factor z − a. The quotient

Q(z) = P(z)/(z − a) = P(z)/a [1 + z/a + (z/a)^2 + (z/a)^3 + · · ·]   (3.10)

is then a polynomial of one lower degree. . . . We can go on to factorize P(z) into n linear factors.

—Stillwell (2010, p. 285).

The ultimate expression of this theorem is given by Eq. 3.7 (p. 76), which indirectly states that an nth degree polynomial has n roots. We shall use the term degree when speaking of polynomials and the term order when speaking of differential equations. A general rule is that order applies to the time domain and degree to the frequency domain, since the Laplace transform of a differential equation, having constant coefficients, of order N, is a polynomial of degree N in Laplace frequency s.

Today this theorem is so widely accepted that we fail to appreciate it. Certainly about the time you learned the quadratic formula, you were prepared to understand the concept of polynomials having roots. The simple quadratic case may be extended to a higher degree polynomial. The Octave/Matlab command roots([1, a2, a1, a0]) provides the roots [s1, s2, s3] of the cubic equation defined by the coefficient vector [1, a2, a1, a0]. The command poly([s1, s2, s3]) returns the coefficient vector. I don’t know the largest degree that can be accurately factored by Matlab/Octave, but I’m sure it’s well over N = 10^3. Today, finding the roots numerically is a solved problem.
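The roots/poly round trip described above can be sketched with numpy’s equivalents (numpy assumed available; the cubic is the worked example from the next section):

```python
import numpy as np

# Round trip between coefficients and roots, mirroring Octave's
# roots() and poly() commands.
coeffs = [1, -6, 11, -6]          # x^3 - 6x^2 + 11x - 6
r = np.roots(coeffs)              # numerically factored roots
print(np.sort(r))                 # close to [1, 2, 3]
back = np.poly(r)                 # rebuild the coefficient vector
assert np.allclose(np.sort(r), [1.0, 2.0, 3.0])
assert np.allclose(back, coeffs)
```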

Factorization versus convolution: The best way to gain insight into the polynomial factorization problem is through the inverse operation, multiplication of monomials. Given the roots xk, there is a simple algorithm for computing the coefficients ak of PN(x) for any N, no matter how large. This method is called convolution. Convolution is said to be a trap-door function since it is easy, while the inverse, factoring (deconvolution), is hard, and analytically intractable for degree N ≥ 5 (Stillwell, 2010, p. 102).

3.3.1 Convolution of monomials

As outlined by Eq. 3.7, a polynomial has two descriptions, first as a series with coefficients an and second in terms of its roots xr. The question is: What is the relationship between the coefficients and the roots? The simple answer is that they are related by convolution.

Let us start with the quadratic

(x + a)(x + b) = x^2 + (a + b)x + ab,   (3.11)

where in vector notation [−a, −b] are the roots and [1, a + b, ab] are the coefficients.

To see how the result generalizes, we may work out the coefficients for the cubic (N = 3). Multiplying the following three factors gives

(x − 1)(x − 2)(x − 3) = (x^2 − 3x + 2)(x − 3) = x(x^2 − 3x + 2) − 3(x^2 − 3x + 2) = x^3 − 6x^2 + 11x − 6.


When the roots are [1, 2, 3] the coefficients of the polynomial are [1, −6, 11, −6]. To verify, substitute the roots into the polynomial, and show that they give zero. For example, r1 = 1 is a root since P3(1) = 1 − 6 + 11 − 6 = 0.

As the degree increases, the algebra becomes more difficult. Imagine trying to work out the coefficients for N = 100. What is needed is a simple way of finding the coefficients from the roots. Fortunately, convolution keeps track of the book-keeping, by formalizing the procedure, along with Newton’s deconvolution method for finding roots of polynomials (p. 79).

Convolution of two vectors: To get the coefficients by convolution, write the roots as two vectors [1, a] and [1, b]. To find the coefficients we must convolve the vectors, indicated by [1, a] ⋆ [1, b], where ⋆ denotes convolution. Convolution is a recursive operation. The convolution of [1, a] ⋆ [1, b] is done as follows: reverse one of the two monomials, padding unused elements with zeros. Next slide one monomial against the other, forming the local scalar product (element-wise multiply and add):

    a 1 0 0
    0 0 1 b   →  0

    a 1 0
    0 1 b     →  x^2

    a 1 0
    1 b 0     →  (a + b)x

    0 a 1
    1 b 0     →  ab x^0

    0 0 a 1
    1 b 0 0   →  0

resulting in coefficients [· · · , 0, 0, 1, a + b, ab, 0, 0, · · · ].

By reversing one of the polynomials, and then taking successive scalar products, all the terms in the sum of the scalar product correspond to the same power of x. This explains why convolution of the coefficients gives the same answer as the product of the polynomials.

As seen by the above example, the positions of the first monomial coefficients are reversed, and then slid across the second set of coefficients, the dot-product is computed, and the result placed in the output vector. Outside the range shown, all the elements are zero. In summary,

[1, −1] ⋆ [1, −2] = [1, −1 − 2, 2] = [1, −3, 2].

In general,

[a, b] ⋆ [c, d] = [ac, bc + ad, bd].

Convolving a third term [1, −3] with [1, −3, 2] gives (p. 339)

[1, −3] ⋆ [1, −3, 2] = [1, −3 − 3, 9 + 2, −6] = [1, −6, 11, −6],

which is identical to the cubic example, found by the algebraic method.

By convolving one monomial factor at a time, the overlap is always two elements, thus it is never necessary to compute more than two multiplies and an add for each output coefficient. This greatly simplifies the operations (i.e., they are easily done in your head), and thus the final result is more likely to be correct. Comparing this to the algebraic method, convolution has the clear advantage.
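The sliding scalar-product procedure can be sketched as a plain-Python version of Octave’s conv(), applied one monomial at a time to rebuild the cubic above:

```python
# Coefficient convolution: out[i+j] accumulates p[i]*q[j], so each output
# index collects all products contributing to the same power of x.
def conv(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

c = conv([1, -1], [1, -2])    # [1, -3, 2], i.e. (x-1)(x-2)
c = conv(c, [1, -3])          # [1, -6, 11, -6], i.e. times (x-3)
print(c)
assert c == [1, -6, 11, -6]
```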

Exercise:

1. What are the three nonlinear equations that one would need to solve to find the roots of a cubic? Solution: From our formula for the convolution of three monomials we may find the nonlinear “deconvolution” relations between the roots19 [−a, −b, −c] and the cubic’s coefficients [1, α, β, γ]:

(x + a) ⋆ (x + b) ⋆ (x + c) = (x + c) ⋆ (x^2 + (a + b)x + ab)
    = x · (x^2 + (a + b)x + ab) + c · (x^2 + (a + b)x + ab)
    = x^3 + (a + b + c)x^2 + (ab + ac + cb)x + abc
    = [1, a + b + c, ab + ac + cb, abc].

19By working with the negative roots we may avoid an unnecessary and messy alternating sign problem.


It follows that the nonlinear equations must be

α = a + b + c
β = ab + ac + bc
γ = abc.

These may be solved by the classic cubic solution, which therefore is a deconvolution problem, also known as long division of polynomials. It follows that the following long division of polynomials must be true:

(x^3 + (a + b + c)x^2 + (ab + ac + bc)x + abc) / (x + a) = x^2 + (b + c)x + bc.
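The long division by a single monomial (x + a) can be sketched as synthetic division, the inverse of the convolution step; the cubic below is the a, b, c = 1, 2, 3 case of the identity above:

```python
# Synthetic division of a monic polynomial by (x + a): each quotient
# coefficient is the current dividend coefficient minus a times the
# previous quotient coefficient; the last step yields the remainder.
def deconv_monomial(coeffs, a):
    quotient = [coeffs[0]]
    for cc in coeffs[1:-1]:
        quotient.append(cc - a * quotient[-1])
    remainder = coeffs[-1] - a * quotient[-1]
    return quotient, remainder

# x^3 + 6x^2 + 11x + 6 (roots -1, -2, -3) divided by (x + 1):
q, r = deconv_monomial([1, 6, 11, 6], 1.0)
print(q, r)                    # x^2 + 5x + 6, remainder 0
assert q == [1, 5.0, 6.0] and r == 0.0
```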

The product of monomial P1(x) with a polynomial PN (x) gives PN+1(x): This statement is an-

other way of stating the fundamental theorem of algebra. Each time we convolve a monomial with a

polynomial of degree N , we obtain a polynomial of degree N + 1. The convolution of two monomials

results in a quadratic (degree 2 polynomial). The convolution of three monomials gives a cubic (degree 5

3). In general, the degree k of the product of two polynomials of degree n,m is the sum of the degrees

(k = n+m). For example, if the degrees are each 5 (n = m = 5), then the resulting degree is 10.

While we all know this theorem from high school algebra class, it is important to explicitly identify

the fundamental theorem of algebra.

Note that the degree of a polynomial is one less than the length of its vector of coefficients. Since the leading term of the polynomial cannot be zero (else the polynomial would not have degree N), when looking for roots the leading coefficient can, and always should, be normalized to 1.

In summary, the product of two polynomials of degrees m and n, having m and n roots, gives a polynomial of degree m + n. Multiplying polynomials thus merges them into a single polynomial, implemented by convolution of their coefficients.

Composition of polynomials: Convolution is not the only important operation between two polynomials. Another is composition, which may be defined for two functions f(z), g(z) as c(z) = f(z) ∘ g(z) = f(g(z)). As a specific example, suppose f(z) = 1 + z + z^2 and g(z) = e^{2z}. With these definitions,

f(z) ∘ g(z) = 1 + e^{2z} + (e^{2z})^2 = 1 + e^{2z} + e^{4z}.

Note that f(z) ∘ g(z) ≠ g(z) ∘ f(z).

Exercise: Find g(z) ∘ f(z). Solution: g(z) ∘ f(z) = e^{2f(z)} = e^{2(1+z+z^2)} = e^2 e^{2z} e^{2z^2}.

3.3.2 Residue expansions of rational functions

As discussed in Section 3.1.4, p. 84, there are at least seven important Matlab/Octave routines that are closely related: conv(), deconv(), poly(), polyder(), polyval(), residue(), roots(). Several of these are complements of each other, or perform a similar operation in a slightly different way. Routines conv() and poly() build polynomials from the roots, while roots() solves for the roots given the polynomial coefficients. The routine residue() takes the ratio of two polynomials and expands it in a partial fraction expansion, with poles and residues.
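For simple poles, the residue at a root s_k of D(s) is K_k = N(s_k)/D'(s_k). A minimal NumPy sketch of that computation, using an assumed example ratio N(s)/D(s) = (s + 3)/((s + 1)(s + 2)):

```python
import numpy as np

# Assumed example: N(s)/D(s) = (s + 3) / ((s + 1)(s + 2))
N = [1, 3]        # N(s) = s + 3
D = [1, 3, 2]     # D(s) = s^2 + 3s + 2

poles = np.roots(D)   # the simple roots s_k of D(s)
# For a simple pole s_k, the residue is K_k = N(s_k) / D'(s_k)
residues = np.polyval(N, poles) / np.polyval(np.polyder(D), poles)
# poles are -1 and -2, with residues 2 and -1 respectively
```

The same poles and residues are what residue() in Matlab/Octave (or scipy.signal.residue) would report for this ratio.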

When lines and planes are defined, the equations are said to be linear in the independent variables.

In keeping with this definition of linear, we say that the equations are non-linear when the equations

have degree greater than 1 in the independent variables. The term bilinear has a special meaning, in that

both the domain and codomain are linearly related by lines (or planes). As an example, impedance is


defined in frequency as the ratio of the voltage over the current, but it frequently has a representation as

the ratio of two polynomials N(s) and D(s)

Z(s) = N(s)/D(s) = s L_o + R_o + Σ_{k=0}^{K} K_k/(s − s_k).   (3.12)

Here Z(s) is the impedance and V and I are the voltage and current at radian frequency ω.20

Such an impedance is typically specified as a rational or bilinear function, namely the ratio of two polynomials, P_N(s) = N(s) = [a_N, a_{N−1}, ..., a_0] and P_K(s) = D(s) = [b_K, b_{K−1}, ..., b_0], of degrees N, K ∈ N, as functions of complex Laplace frequency s = σ + jω, having simple roots. Most impedances are rational functions, since they may be written as D(s)V = N(s)I. Since D(s) and N(s) are both polynomials in s, rational functions are also called bilinear transformations or, in the mathematical literature, Möbius transformations. The rational function comes from a corresponding scalar differential equation, of the form

Σ_{k=0}^{K} b_k (d^k/dt^k) i(t) = Σ_{n=0}^{N} a_n (d^n/dt^n) v(t)  ↔  I(ω) Σ_{k=0}^{K} b_k s^k = V(ω) Σ_{n=0}^{N} a_n s^n.   (3.13)

This construction is also known as the ABCD method in the engineering literature (Eq. 3.35, p. 123).

This equation, as well as 3.12, follows from the Laplace transform (see Section 3.7, p. 131) of the

differential equation (on left), by forming the impedance Z(s) = V/I = A(s)/B(s). This form of the

differential equation follows from Kirchhoff's voltage and current laws (KCL, KVL) or from Newton's laws (in the case of mechanics).

The physical properties of an impedance: Based on d’Alembert’s observation that the solution to

the wave equation is the sum of forward and backward traveling waves, the impedance may be rewritten

in terms of forward and backward traveling waves (see p. 207)

Z(s) = V/I = (V^+ + V^−)/(I^+ − I^−) = r_o (1 + Γ(s))/(1 − Γ(s)),   (3.14)

where ro = P+/I+ is called the surge impedance of the transmission line (e.g., wire) connected to

the load impedance Z(s), and Γ(s) = P−/P+ = I−/I+ is the reflection coefficient corresponding to

Z(s). Any impedance of this type is called a Brune impedance due to its special properties (discussed

on p. 363) (Brune, 1931a). Like Z(s), Γ(s) is causal and complex analytic. The impedance and the reflectance function must both be complex analytic, since they are related by the bilinear transformation,

which assures the mutual complex analytic properties.

Due to the bilinear transformation, the physical properties of Z(s) and Γ(s) are very different.

Specifically, the real part of the load impedance must be non-negative (ℜZ(ω) ≥ 0) if and only if |Γ(s)| ≤ 1. In the time domain, the impedance z(t) ↔ Z(s) must have a value of r_o at t = 0. Correspondingly, the time-domain reflectance γ(t) ↔ Γ(s) must be zero at t = 0.

This is the basis of conservation of energy, which may be traced back to the properties of the re-

flectance Γ(s).

Exercise: Show that if ℜZ(s) ≥ 0 then |Γ(s)| ≤ 1. Solution: Taking the real part of Eq. 3.14, which must be ≥ 0, we find

ℜZ(s) = (r_o/2) [ (1 + Γ(s))/(1 − Γ(s)) + (1 + Γ*(s))/(1 − Γ*(s)) ] = r_o (1 − |Γ(s)|^2)/|1 − Γ(s)|^2 ≥ 0.

Thus |Γ| ≤ 1.
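The bound can be spot-checked numerically by inverting Eq. 3.14, Γ = (Z − r_o)/(Z + r_o). In this sketch the surge impedance and the series R-L-C load values are illustrative, chosen so that ℜZ ≥ 0 on the jω axis:

```python
import numpy as np

ro = 50.0                          # assumed surge impedance
omega = np.logspace(1, 5, 50)      # illustrative radian frequencies
s = 1j * omega                     # points on the s = j*omega axis

# Series R-L-C load (illustrative values); on the axis, Re Z = R = 10 >= 0
Z = s * 1e-3 + 10.0 + 1.0 / (s * 1e-6)

# Invert Eq. 3.14: Gamma = (Z - ro) / (Z + ro)
Gamma = (Z - ro) / (Z + ro)
# Since Re Z >= 0, |Gamma| <= 1 at every frequency
```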

20Note that the relationship between the impedance and the residues Kk is a linear one, best solved by setting up a linear

system of equations in the unknown residues. Methods for doing this will be discussed in Appendix I (p. 284).


3.4 Lec 13: Introduction to Analytic Geometry

Analytic geometry came about with the merging of Euclid’s geometry with algebra. The combination

of Euclid’s (323 BCE) geometry and al-Khwarizmi’s (830 CE) algebra resulted in a totally new and

powerful tool, analytic geometry, independently worked out by Descartes and Fermat (Stillwell, 2010).

The addition of matrix algebra during the 18th century enabled analysis in more than three dimensions,

which today is one of the most powerful tools used in artificial intelligence, data science and machine

learning. The utility and importance of these new tools cannot be overstated. The timeline for this period

is provided in Fig. 3.5.

Figure 3.5: Expanded timeline of Fig. 3.1 (p. 73), covering the two centuries from 1596 CE to 1855 CE that saw the development of the modern theories of analytic geometry, calculus, differential equations and linear algebra. The timeline shows Descartes, Fermat, Newton, Johann and Daniel Bernoulli, Euler, d'Alembert, Lagrange, Gauss and Cauchy. The vertical red lines indicate mentor-student relationships. Lagrange had a role in the development of linear algebra. Note the overlap between Newton and Johann and his son Daniel Bernoulli, and Euler. Gauss had the advantage of input from Newton, Euler, d'Alembert and Lagrange. Likely Cauchy had an influence on Gauss as well.

There are many important relationships between Euclidean geometry and 16th century algebra. An

attempt at a detailed comparison is summarized in Table 3.1. Important similarities include vectors, their

Pythagorean lengths [a, b, c],

c = √((x2 − x1)^2 + (y2 − y1)^2),   (3.15)

with a = x2 − x1 and b = y2 − y1, and the angles. Euclid's geometry had lengths and angles, but no concept of coordinates, and thus no vectors. One of the main innovations of analytic geometry is that one may compute with real, and soon after, complex numbers.

There are several new concepts that came with the development of analytic geometry:

1. Composition of functions: If y = f(x) and z = g(y), then the composition of functions f and g is denoted z(x) = g ∘ f(x) = g(f(x)).

2. Elimination: Given two functions f(x, y) and g(x, y), elimination removes either x or y. This procedure, known to the Chinese, is called Gaussian elimination.

3. Intersection: One may speak of the intersection of two lines to give a point, or of two planes to give a line. (The term intersection is also an important but very different concept in set theory.) Geometric intersection is a special case of elimination, when the functions f(x, y), g(x, y) are linear in their arguments.

4. Vectors: Analytic geometry provides the concept of a vector (see Appendix A), a line having length and orientation (i.e., direction). Analytic geometry defines vectors in any number of dimensions, as ordered sets of points.

5. Analytic geometry extends the ideas of Euclidean geometry with the introduction of the scalar (dot) product of two vectors f·g and the vector (cross) product f×g (see Fig. 3.6, p. 109).

6. All of these exist independently of any coordinate system, and thus are more general than analytic geometry.


Table 3.1: An ad-hoc comparison between Euclidean geometry and analytic geometry. I am uncertain as to the classification of the items in the third column.

Euclidean geometry: R^3 — (a) Proof; (b) Line length; (c) Line intersection; (d) Point; (e) Projection (e.g. scalar product); (f) Line direction; (g) Vector (sort of); (h) Conic section; (i) Square roots (e.g., spiral of Theodorus).

Analytic geometry: R^n — (a) Numbers; (b) Algebra; (c) Power series; (d) Analytic functions; (e) Complex analytic functions: e.g., sin θ, cos θ, e^θ, log z; (f) Composition; (g) Elimination; (h) Integration; (i) Derivatives; (j) Calculus; (k) Polynomial ∈ C; (l) Fund. thm. algebra; (m) Normed vector spaces.

Uncertain — (a) Cross product (R^3); (b) Recursion; (c) Iteration ∈ C^2 (e.g., Newton's method); (d) Iteration ∈ R^n.

What algebra also added to geometry was the ability to compute with complex numbers. For example, the length of a line (Eq. 3.15) was measured in geometry with a compass: numbers played no role. Once algebra was available, the line's Euclidean length could be computed numerically, directly from the coordinates of the two ends, defined by the 3-vector

e = x x̂ + y ŷ + z ẑ,

which represents a point at (x, y, z) ∈ R^3 ⊂ C^3 in three dimensions, having direction, from the origin (0, 0, 0) to (x, y, z). An alternative matrix notation is e = [x, y, z]^T, a column vector of three numbers. These two notations are different ways of representing exactly the same thing.

By defining the vector, analytic geometry allows Euclidean geometry to become quantitative, beyond the physical drawing of an object (e.g., a sphere, triangle or line). With analytic geometry we retain the Euclidean concept of a vector, a line having a magnitude (length) and direction, but now defined in terms of physical coordinates (i.e., numbers). The difference between two vectors defines a third vector, a concept already present in Euclidean geometry. For the first time, complex numbers were allowed into geometry (but rarely used until Cauchy and Riemann).

As shown in Fig. 3.6, there are two types of products: 1) the scalar product A · B and 2) the vector product A × B.

Scalar product of two vectors: When using algebra, many concepts, obvious with Euclid’s geometry,

may be made precise. There are many examples of how algebra extends Euclidean geometry, the most

basic being the scalar product (aka dot product) between vectors x, κ ∈ C^3:

x · κ = (x x̂ + y ŷ + z ẑ) · (α x̂ + β ŷ + γ ẑ) = αx + βy + γz ∈ C.


Figure 3.6: Vectors A, B, C are used to define the scalar product A · B, the vector (cross) product A × B, and the triple product C · (A × B). The scalar product A · B = ||A|| ||B|| cos θ lies in the AB plane, while the vector product C = A × B = ||A|| ||B|| sin θ ẑ is perpendicular to it. The two vector products complement each other, one proportional to the sin θ of the angle θ between them and the other to the cos θ. It follows that |A · B|^2 + ||A × B||^2 = ||A||^2 ||B||^2. The scalar product computes the projection of one vector on the other. The vector product A × B computes the area of the trapezoid formed by the two vectors, while the triple product C · (A × B) defines the volume of the parallelepiped (i.e., prism) they form. When all the angles are 90°, the volume becomes a cuboid (see p. 211).

In matrix notation the scalar product is written as (p. 211)

x · κ = [x, y, z] [α, β, γ]^T = αx + βy + γz.   (3.16)

The scalar product takes the character of κ. For example, if κ(s) ∈ C is a function of complex frequency s, then the scalar product is complex. If κ ∈ R is real, then the scalar product is real.

Norm (length) of a vector: The norm of a vector

||e|| ≡ +√(e · e) ≥ 0

is defined as the positive square root of the scalar product of the vector with itself. This is a generalization

of the length, in any number of dimensions, forcing the sign of the square-root to be non-negative. The

length is a concept of Euclidean geometry, and it must always be positive and real. A complex (or

negative) length is not physically meaningful. More generally, the Euclidean length of a line is given as

the norm of the difference between two real vectors e1, e2 ∈ R^3:

||e1 − e2||^2 = (e1 − e2) · (e1 − e2) = (x1 − x2)^2 + (y1 − y2)^2 + (z1 − z2)^2 ≥ 0.

From this formula we see that the norm of the difference of two vectors is simply a compact expression

for the Euclidean length. A zero-length vector, i.e., a point, results from the fact that

||x − x||^2 = (x − x) · (x − x) = 0.
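The scalar product, the norm, and the Euclidean length of Eq. 3.16 and the formulas above can be sketched numerically (all vector values below are illustrative; NumPy assumed):

```python
import numpy as np

# Scalar product of Eq. 3.16 with illustrative coefficients
x = np.array([1.0, 2.0, 3.0])        # the vector [x, y, z]
kappa = np.array([4.0, -1.0, 2.0])   # the vector [alpha, beta, gamma]
dot = x @ kappa                      # alpha*x + beta*y + gamma*z = 8

# Norm = positive square root of the scalar product of a vector with itself
norm_x = np.sqrt(x @ x)              # sqrt(1 + 4 + 9)

# Euclidean length of a line = norm of the difference of its two end points
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([4.0, 4.0, 0.0])
length = np.sqrt((e1 - e2) @ (e1 - e2))   # sqrt(9 + 16) = 5
```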

Integral definition of a scalar product: Up to this point, following Euclid, we have only considered a vector to be a set of elements x_n ∈ R, indexed over n ∈ N, defining a linear vector space with scalar product x · y, the scalar product defining the norm or length of the vector ||x|| = √(x · x). Given the scalar product, the norm naturally follows.


At this point an obvious question presents itself: Can we extend our definition of vectors to differ-

entiable functions (i.e., f(t) and g(t)), indexed over t ∈ R, with coefficients labeled by t ∈ R, rather

than by n ∈ N? Clearly, if the functions are analytic, there is no obvious reason why this should be a

problem, since analytic functions may be represented by a convergent series having Taylor coefficients,

thus are integrable term by term.

Specifically, under certain conditions, the function f(t) may be thought of as a vector, defining a

normed vector space. This intuitive and somewhat obvious idea is powerful. In this case the scalar

product can be defined in terms of the integral

f(t) · g(t) = ∫_t f(t) g(t) dt,   (3.17)

integrated over t ∈ R rather than summed over n ∈ N.
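Eq. 3.17 can be approximated on a dense grid. In this sketch the grid and the functions are illustrative: f(t) = sin t and g(t) = cos t turn out to be orthogonal "vectors" over one period, while ||f||^2 = π:

```python
import numpy as np

# Dense grid on one period; the Riemann sum approximates the integral
t = np.linspace(0.0, 2.0 * np.pi, 100001)
dt = t[1] - t[0]
f = np.sin(t)
g = np.cos(t)

inner_fg = np.sum(f * g) * dt          # ~ 0: f and g are orthogonal
norm_f = np.sqrt(np.sum(f * f) * dt)   # ||f|| = sqrt(pi)
```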

This definition of the vector scalar product allows for a significant but straightforward generalization

of our vector space, which will turn out to be both useful and an important extension of the concept of

a normed vector space. In this space we can define the derivative of a norm with respect to t, which is not possible in the discrete case, indexed over n. The distinction introduces the concept of continuity in the index t, which does not exist for the discrete index n ∈ N.

Pythagorean theorem and the Schwarz inequality: Regarding Fig. 3.6, suppose we compute the difference between the vectors A ∈ R^3 and αB ∈ R^3 as L = ||A − αB|| ∈ R, where α ∈ R is a scalar that modifies the length of B. We seek the value of α, denoted α∗, that minimizes the length L. From simple geometrical considerations, L(α) will be minimum when the difference vector is perpendicular to B, as shown in the figure by the dashed line from the tip of A, perpendicular to B.

To show this algebraically we write out the expression for L(α) and take the derivative with respect

to α, and set it to zero, which gives the formula for α∗. The argument does not change, but the algebra

greatly simplifies, if we normalize A,B to be unit vectors a = A/||A|| and b = B/||B||, which have

norm = 1.

L^2 = (a − αb) · (a − αb) = 1 − 2α a · b + α^2.   (3.18)

Thus the length is shortest (L = L∗, as shown in Fig. 3.6) when

(d/dα) L^2 |_{α∗} = −2 a · b + 2α∗ = 0.

Solving for α∗ we find α∗ = a · b. Since L∗ > 0 (a ≠ b), Eq. 3.18 becomes

1 − 2|a · b|^2 + |a · b|^2 = 1 − |a · b|^2 > 0.

In conclusion, cos θ ≡ |a · b| < 1. In terms of A, B this is |A · B| = ||A|| ||B|| cos θ < ||A|| ||B||, as shown next to B in Fig. 3.6. Thus the scalar product between two unit vectors is their direction cosine. Furthermore, since this forms a right triangle, the Pythagorean theorem must hold. The triangle inequality says that the sum of the lengths of the two sides must be greater than the hypotenuse. Note that θ ∈ R, not C.

This derivation is an abbreviated version of a related discussion in Section R.1.3 (p. 114).

Vector cross (×) and wedge (∧) product of two vectors: The vector product (aka, cross-product)

A × B and exterior product (aka, wedge-product) A ∧ B are the second and third types of vector

products. As shown in Fig. 3.6,

C = A × B = (a1 x̂ + a2 ŷ + a3 ẑ) × (b1 x̂ + b2 ŷ + b3 ẑ) = | x̂ ŷ ẑ ; a1 a2 a3 ; b1 b2 b3 |

is ⊥ to the plane defined by A and B. The cross product is limited to two input vectors A and B, taken from three real dimensions (i.e., R^3).


The exterior wedge product generalizes the cross product, since it may be defined for vectors A, B taken from n dimensions (C^n), with output in C^1. Its output is a signed complex scalar, with

A ∧ B = −B ∧ A.

The difference between the two is that the cross product is a vector (typically in R^3), while the wedge product is a signed complex scalar ∈ C.

Example: If we define A = 3x− 2y + 0z and B = 1x + 1y + 0z then the cross product is

1.

A × B = | x̂ ŷ ẑ ; 3 −2 0 ; 1 1 0 | = (3 + 2)ẑ.

Since a1 ∈ C, this example violates the common assumption that A ∈ R3.

2. The wedge product A ∧ B is defined by the 2x2 determinant

A ∧ B = | 3 1 ; −2 1 | = (3x̂ − 2ŷ) ∧ (x̂ + ŷ) = 3 x̂ ∧ x̂ + 3 x̂ ∧ ŷ − 2 ŷ ∧ x̂ − 2 ŷ ∧ ŷ = (3 + 2) x̂ ∧ ŷ,

since x̂ ∧ x̂ = ŷ ∧ ŷ = 0 and ŷ ∧ x̂ = −x̂ ∧ ŷ.

From the above example we see that the absolute value of the wedge product equals the magnitude of the cross product, |a ∧ b| = ||a × b||; for example, |(a2 ŷ + a3 ẑ) ∧ (b2 ŷ + b3 ẑ)| = ||a × b||. The wedge product is especially useful because it is zero when the two vectors are co-linear: x̂ ∧ x̂ = 0, while x̂ ∧ ŷ = 1. The advantage of the wedge product is that it is valid in n > 3 dimensions, since it is defined for any two vectors in any number of dimensions. Like the cross product, the magnitude of the wedge product is equal to the area of the trapezoid formed by the two vectors.

Scalar triple product: The triple product of a third vector C with the vector product A × B is

C · (A × B) = | c1 c2 c3 ; a1 a2 a3 ; b1 b2 b3 | ∈ R,

which equals the volume of a parallelepiped.
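These three products can be checked numerically with the example vectors from the text, A = 3x̂ − 2ŷ and B = x̂ + ŷ (the third vector C is an assumed value, added for illustration; NumPy assumed):

```python
import numpy as np

A = np.array([3.0, -2.0, 0.0])   # A = 3x - 2y (from the text)
B = np.array([1.0, 1.0, 0.0])    # B = x + y   (from the text)
C = np.array([0.0, 0.0, 2.0])    # assumed third vector C = 2z

cross = np.cross(A, B)           # (3 + 2)z = [0, 0, 5]

# 2x2 wedge product = the determinant |a1 b1; a2 b2|
wedge = np.linalg.det(np.array([[3.0, 1.0],
                                [-2.0, 1.0]]))   # = 5 = ||A x B||

# Triple product as a 3x3 determinant: the parallelepiped volume
volume = np.linalg.det(np.vstack([C, A, B]))     # C . (A x B) = 10
```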

Impact of Analytic Geometry: The most obvious impact of analytic geometry was its detailed analysis of the conic sections, using algebra rather than drawings made with a compass and ruler. An important example is the composition of the line and circle, a venerable construction, presumably going back to before Diophantus (250 CE). Once algebra was invented, the composition could be done using formulas. With this analysis came complex numbers.

The first two mathematicians to appreciate this mixture of Euclid's geometry and the new algebra were Fermat and Descartes (Stillwell, 2010, pp. 111-115); soon Newton contributed to this effort by the addition of physics (e.g., calculations in acoustics, orbits of the planets, and the theory of gravity and light, significant concepts for 1687 (Stillwell, 2010, pp. 115-117)).

Given these new methods, many new solutions to problems emerged. The complex roots of polynomials continued to appear, without any obvious physical meaning. Complex numbers seem to have been viewed as more of an inconvenience than a problem. Newton's solution to this dilemma was to simply ignore the imaginary cases (Stillwell, 2010, p. 119).


3.4.1 Development of Analytic Geometry

Intersection and Gaussian elimination: The first “algebra” (al-jabr) is credited to al-Khwarizmi (830

CE). Its invention advanced the theory of polynomial equations in one variable, Taylor series, and com-

position versus intersections of curves. The solution of the quadratic equation had been worked out

thousands of years earlier, but with algebra a general solution could be defined. The Chinese had found

the way to solve several equations in several unknowns, for example, finding the values of the intersec-

tion of two circles. With the invention of algebra by al-Khwarizmi, a powerful tool became available to

solve the difficult problems.

Composition and Elimination: In algebra there are two contrasting operations on functions: composition and elimination.

Composition: Composition is the merging of functions, by feeding one into the other. If the two functions are f, g, then their composition is indicated by g ∘ f, meaning the function y = f(x) is substituted into the function z = g(y), giving z = g(f(x)).

Composition is not limited to linear equations, even though that is where it is most frequently ap-

plied. To compose two functions, one must substitute one equation into the other. That requires solving15

for that substitution variable, which is not always possible in the case of nonlinear equations. However,

many tricks are available that may work around this restriction. For example, if one equation is in x^2 and the other in x^3 or √x, it may be possible to multiply the first by x or square the second. The point is that one of the variables must be isolated so that, when it is substituted into the other equation, that variable is removed from the mix.

Examples: Let y = f(x) = x^2 − 2 and z = g(y) = y + 1. Then

g ∘ f = g(f(x)) = (x^2 − 2) + 1 = x^2 − 1.   (3.19)

In general composition does not commute (i.e., f ∘ g ≠ g ∘ f), as is easily demonstrated. Swapping the order of composition for our example gives

f ∘ g = f(g(y)) = (y + 1)^2 − 2 = y^2 + 2y − 1.   (3.20)
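The two compositions of Eqs. 3.19-3.20 can be checked directly; a minimal sketch:

```python
# f and g from Eqs. 3.19-3.20
def f(x):
    return x * x - 2

def g(y):
    return y + 1

def g_of_f(x):      # g(f(x)) = x^2 - 1
    return g(f(x))

def f_of_g(y):      # f(g(y)) = (y + 1)^2 - 2 = y^2 + 2y - 1
    return f(g(y))

# Composition does not commute: g(f(2)) = 3, while f(g(2)) = 7
```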

Intersection: Complementary to composition is intersection (i.e., decomposition) (Stillwell, 2010,

pp. 119,149). For example, the intersection of two lines is defined as the point where they meet. This is

not to be confused with finding roots. A polynomial of degree N has N roots, but the points where two

polynomials intersect has nothing to do with the roots of the polynomials. The intersection is a function

(equation) of lower degree, implemented with Gaussian elimination.

Intersection of two lines: Unless they are parallel, two lines meet at a point. In terms of linear algebra this may be written as two linear equations21 (on the left), along with the intersection point [x1, x2]^T, given by the inverse of the 2x2 set of equations (on the right):

[a b; c d][x1; x2] = [y1; y2],   [x1; x2] = (1/∆)[d −b; −c a][y1; y2].   (3.21)

By substituting the expression for the intersection point [x1,x2]T into the original equation, we see that

it satisfies the equations. Thus the equation on the right is the solution to the equation on the left.

21When writing the equation Ax = y in matrix format, the two equations are ax1 + bx2 = y1 and dx1 + ex2 = y2 with

unknowns (x1, x2), whereas in the original equations ay + bx = c and dy + ex = f , they were y, x. Thus in matrix format,

the names are changed. The first time you see this scrambling of variables, it can be confusing.


Note the structure of the inverse: 1) The diagonal values (a, d) are swapped, 2) the off-diagonal

values (b, c) are negated and 3) the 2x2 matrix is divided by the determinant ∆ = ad − bc. If ∆ = 0,

there is no solution. When the determinant is zero (∆ = 0), the slopes of the two lines

slope = dx2/dx1 = b/a = d/c

are equal, thus the lines are parallel. Only if the slopes differ can there be a unique solution.

Exercise: Show that the equation on the right is the solution of the equation on the left. Solution: By a direct substitution of the right equation into the left equation,

[a b; c d] · (1/∆)[d −b; −c a] = (1/∆)[ad − bc, −ab + ab; cd − cd, −cb + ad] = (1/∆)[∆ 0; 0 ∆],

which gives the identity matrix.
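The swap-the-diagonal, negate-the-off-diagonal, divide-by-the-determinant recipe is easy to verify numerically (coefficients below are illustrative; NumPy assumed):

```python
import numpy as np

# Illustrative coefficients with nonzero determinant
a, b, c, d = 1.0, 2.0, 3.0, 4.0
A = np.array([[a, b],
              [c, d]])
Delta = a * d - b * c            # determinant = -2

# Swap the diagonal, negate the off-diagonal, divide by the determinant
A_inv = np.array([[d, -b],
                  [-c, a]]) / Delta

# A_inv @ A is the 2x2 identity matrix
```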

Algebra will give the solution when geometry cannot. When the two curves fail to intersect on the

real plane, the solution still exists, but is complex valued. In such cases, geometry, which only considers

the real solutions, fails. For example, when the coefficients [a, b, c, d] are complex, the solution exists,

but the determinant can be complex. Thus algebra is much more general than geometry. Geometry fails

when the solution has a complex intersection.

A system of linear equations Ax = y has many interpretations, and one should not be biased by the notation. As engineers we are trained to view x as the input and y as the output, in which case y = Ax seems natural, much like the functional relation y = f(x). But what does the linear relation

x = Ay mean, when x is the input? The obvious answer is that y = A−1x. But when working with

systems of equations, there are many uses of equations, and we need to become more flexible in our

interpretation. For example y = A2x has a useful meaning, and in fact we saw this type of relationship

when working with Pell's equation (p. 63) and the Fibonacci sequence (p. 65). As another example, consider

[z1; z2] = [a_{1x} a_{1y}; a_{2x} a_{2y}][x; y],

which is reminiscent of a three-dimensional surface z = f(x, y). We shall find that such generalizations

are much more than a curiosity.

3.4.2 Lec 14: Applications of scalar products

Another important example of algebraic expressions in mathematics is Hilbert’s generalization of Eq. 1.1,

known as the Schwarz inequality, shown in Fig. 3.7. What is special about this generalization is that it

proves that when the vertex is 90°, the length of the leg is minimum.

Vectors may be generalized to have ∞ dimensions: U, V = [v1, v2, · · · , v∞]. The Euclidean inner product (i.e., scalar product) between two such vectors generalizes the finite-dimensional case

U · V = Σ_{k=1}^{∞} u_k v_k.

As with the finite case, the norm ||U|| = √(U · U) = √(Σ u_k^2) is the square root of the scalar product of the vector with itself, defining the length of the infinite-component vector. Obviously there is an issue of convergence, if the norm of the vector is to have a finite length.
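The convergence requirement is easy to illustrate. For the components u_k = 1/k (an assumed example) the norm is finite, since Σ 1/k^2 converges to π^2/6:

```python
import math

# Partial sums of sum 1/k^2 approach pi^2/6, so the vector u_k = 1/k
# has a finite norm even with infinitely many components.
norm_sq = sum(1.0 / (k * k) for k in range(1, 100001))
norm = math.sqrt(norm_sq)    # approaches pi/sqrt(6)
```

By contrast, u_k = 1/√k would give Σ 1/k, which diverges, so that vector has no finite length.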

It is a somewhat arbitrary requirement that a, b, c ∈ R for the Pythagorean theorem (Eq. 1.1). This

seems natural enough since the sides are lengths. But, what if they are taken from the complex numbers,

as for the lossy vector wave equation, or the lengths of vectors in C^n? Then the equation generalizes to

c · c = ||c||^2 = Σ_{k=1}^{n} |c_k|^2,


where ||c||^2 = (c, c) is the inner (dot) product of the vector c with itself, and |c_k| is the magnitude of the complex c_k. As before, ||c|| = √(||c||^2) is the norm of vector c, akin to a length.

Figure 3.7: The Schwarz inequality is related to the shortest distance (length of a line) between the ends of the two vectors V and αU, with ||U|| = √(U · U) the scalar product of that vector with itself. The difference vector E(α) = V − αU is shortest at α = α∗, where E(α∗) = V − α∗U is perpendicular to U and has length |V · U|/||V||.

Schwarz inequality: The Schwarz inequality22 says that the magnitude of the inner product of two vectors is less than or equal to the product of their lengths:

|U · V| ≤ ||U|| ||V||.

This may be simplified by normalizing the vectors to have unit length (Û = U/||U||, V̂ = V/||V||), in which case −1 < Û · V̂ ≤ 1. Another simplification is to define the scalar product in terms of the direction cosine,

cos θ = |Û · V̂| ≤ 1.

A proof of the Schwarz inequality is as follows: From these definitions we may define the minimum difference between the two vectors as the perpendicular from the end of the first to the intersection with the second. As shown in Fig. 3.7, U ⊥ V may be found by minimizing the length of the vector difference:

min_α ||V − αU||^2 = ||V||^2 − 2α V · U + α^2 ||U||^2 > 0

0 = ∂_α (V − αU) · (V − αU) = −2(V · U − α∗||U||^2)

∴ α∗ = V · U / ||U||^2.

The Schwarz inequality follows:

I_min = ||V − α∗U||^2 = ||V||^2 − |U · V|^2/||U||^2 > 0,

thus

0 ≤ |U · V| ≤ ||U|| ||V||.

An important example of such a vector space includes the definition of the Fourier transform, where

we may set

U(ω) = e^{−jω_0 t},   V(ω) = e^{jωt},   U · V = ∫_t e^{jωt} e^{−jω_0 t} dt/2π = δ(ω − ω_0).

It seems that the Fourier transform is a result that follows from a minimization, unlike the Laplace

transform that follows from causal system parameters. This explains the important differences between

the two, in terms of their properties (unlike the LT, the FT is not complex analytic). We further explore this topic in Lecture 3.6 (p. 128).
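The minimization that yields the Schwarz inequality can be verified numerically. A sketch with randomly chosen vectors (dimension and seed are arbitrary; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.standard_normal(5)
V = rng.standard_normal(5)

# alpha* = V.U / ||U||^2 minimizes ||V - alpha U||
alpha_star = (V @ U) / (U @ U)
L_min = np.linalg.norm(V - alpha_star * U)

# The minimizing difference vector is perpendicular to U,
perp = (V - alpha_star * U) @ U      # ~ 0
# and the Schwarz inequality |U.V| <= ||U|| ||V|| holds.
```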

22A simplified derivation is provided in Sect. 3.4 (p. 107).


3.4.3 Lec 15: Gaussian Elimination

The method for finding the intersection of equations is based on the recursive elimination of all the

variables but one. This method, known as Gaussian elimination (Appendix B, p. 251), works across a

broad range of cases, but may be defined as a systematic algorithm when the equations are linear in the

variables.23 Rarely do we even attempt to solve problems in several variables of degree greater than 1. But Gaussian elimination may still work in such cases (Stillwell, 2010, p. 90).

In Appendix B.2.1 (p. 257) the inverse of a 2x2 linear system of equations is derived. Even for

a 2x2 case, the general solution requires a great deal of algebra. Working out a numeric example of

Gaussian elimination is more instructive. For example, suppose we wish to find the intersection of the

two equations

x− y = 3

2x+ y = 2.

This 2x2 system of equations is so simple that you may immediately visualize the solution: by adding the two equations, y is eliminated, leaving 3x = 5. But doing it this way takes advantage of the specific example; for larger systems of equations we need a generalized (algorithmic) approach. This general approach is called Gaussian elimination.

Start by writing the equations in matrix form (note this is not of the form Ax = y)

\[ \begin{bmatrix} 1 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}. \quad (3.22) \]

Next, eliminate the lower-left term (2x) using a scaled version of the upper-left term (x). Specifically, multiply the first equation by −2, add it to the second equation, and replace the second equation with the result. This gives

\[ \begin{bmatrix} 1 & -1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 \\ 2 - 2\cdot 3 \end{bmatrix} = \begin{bmatrix} 3 \\ -4 \end{bmatrix}. \quad (3.23) \]

Note that the top equation did not change. Once the matrix is “upper triangular” (zero below the diagonal) you have the solution. Starting from the bottom equation, y = −4/3. Then the upper equation gives x − (−4/3) = 3, or x = 3 − 4/3 = 5/3.
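The elimination-then-back-substitution recipe just described can be sketched as a short routine. This is our illustration, not the book's code; it omits the row pivoting a robust solver would add:

```python
# Sketch of Gaussian elimination with back-substitution (no pivoting),
# applied to the 2x2 example x - y = 3, 2x + y = 2.
def gaussian_solve(A, b):
    """Solve A x = b by forward elimination, then back-substitution."""
    n = len(b)
    A = [row[:] for row in A]          # work on copies
    b = b[:]
    for col in range(n - 1):           # zero out entries below the diagonal
        for row in range(col + 1, n):
            m = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= m * A[col][k]
            b[row] -= m * b[col]
    x = [0.0] * n                      # back-substitute, bottom row up
    for row in range(n - 1, -1, -1):
        s = sum(A[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (b[row] - s) / A[row][row]
    return x

print(gaussian_solve([[1.0, -1.0], [2.0, 1.0]], [3.0, 2.0]))  # x = 5/3, y = -4/3
```

The same routine works unchanged for larger systems, as long as no zero pivot is encountered.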

In principle, Gaussian elimination is easy, but if you make a calculation mistake along the way, it is very difficult to find your error. The method requires a lot of mental labor, with a high probability of making a mistake. Thus you do not want to apply this method every time. For example, suppose the elements are complex numbers, or polynomials in some other variable such as frequency. Once the coefficients become more complicated, the seemingly trivial problem becomes highly error prone. There is a much better way, easily verified, which puts all the numerics at the end, in a single step.

The above operations may be automated by finding a carefully chosen upper-diagonalization matrix U that performs the same operation. For example, let

\[ U = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}. \quad (3.24) \]

Multiplying Eq. 3.22 by U, we find

\[ \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 \\ -4 \end{bmatrix}, \quad (3.25) \]

thus we obtain Eq. 3.23. At this point we can either back-substitute and obtain the solution, as we did above, or find a matrix L that finishes the job by removing elements above the diagonal.

23https://en.wikipedia.org/wiki/System_of_linear_equations.
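The two-matrix procedure can be checked numerically. Here U is the matrix of Eq. 3.24; the matrix L below is our choice (not from the text) for clearing the entry above the diagonal, after which the diagonal system yields the solution directly:

```python
import numpy as np

# Elimination as matrix products: U makes A upper triangular, L then
# clears the entry above the diagonal, so L @ U @ A is diagonal.
A = np.array([[1.0, -1.0], [2.0, 1.0]])
y = np.array([3.0, 2.0])

U = np.array([[1.0, 0.0], [-2.0, 1.0]])        # row2 <- row2 - 2*row1
UA = U @ A                                      # [[1, -1], [0, 3]]

L = np.array([[1.0, 1.0 / 3.0], [0.0, 1.0]])   # row1 <- row1 + (1/3)*row2
D = L @ UA                                      # diagonal: [[1, 0], [0, 3]]

x = (L @ U @ y) / np.diag(D)                   # divide out the diagonal
print(x)                                        # x = 5/3, y = -4/3
```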


Exercise: Show that det(U) = 1, and thus det(UA) = det(A). Solution: Since

\[ \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc, \]

we have det(U) = 1 − 0 = 1, and since det(UA) = det(U) · det(A), the result follows naturally. Common notation is to let det(A) = |A|.

In Appendix B.2.1 the inverse of a general 2x2 matrix is summarized in terms of three steps: 1) swap the diagonal elements, 2) reverse the signs of the off-diagonal elements, and 3) divide by the determinant ∆ = ad − bc. Specifically,

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{\Delta} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \quad (3.26) \]

There are very few things that you must memorize, but the inverse of a 2x2 is one of them. It needs to

be in your tool-bag, like the quadratic formula.

While it is difficult to compute the inverse matrix from scratch (Appendix B), it takes only a few seconds to verify it (steps 1 and 2):

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \begin{bmatrix} ad - bc & -ab + ab \\ cd - cd & -bc + ad \end{bmatrix} = \begin{bmatrix} \Delta & 0 \\ 0 & \Delta \end{bmatrix}. \quad (3.27) \]

Thus dividing by the determinant gives the 2x2 identity matrix. A good strategy, when you don’t trust your memory, is to write down the inverse as best you can, and then verify it.

Using the 2x2 matrix inverse on our example (Eq. 3.22), we find

\[ \begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{1+2} \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \frac{1}{3} \begin{bmatrix} 5 \\ -6+2 \end{bmatrix} = \begin{bmatrix} 5/3 \\ -4/3 \end{bmatrix}. \quad (3.28) \]

If you use this method, you will rarely (never) make a mistake, and the solution is easily verified. Either

you can check the numbers in the inverse, as was done in Eq. 3.27, or you can substitute the solution

back into the original equation.
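The three-step inverse recipe of Eq. 3.26 and the verification of Eq. 3.27 take only a few lines to check numerically (the helper name `inv2x2` is ours):

```python
import numpy as np

# Verify the 2x2 inverse recipe: swap the diagonal, negate the
# off-diagonal, divide by the determinant (Eq. 3.26).
def inv2x2(a, b, c, d):
    delta = a * d - b * c                       # determinant; must be nonzero
    return np.array([[d, -b], [-c, a]]) / delta

A = np.array([[1.0, -1.0], [2.0, 1.0]])
Ainv = inv2x2(1, -1, 2, 1)
assert np.allclose(Ainv @ A, np.eye(2))         # the check of Eq. 3.27
print(Ainv @ np.array([3.0, 2.0]))              # x = 5/3, y = -4/3 (Eq. 3.28)
```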


3.5 Lec 16: Transmission (ABCD) matrix composition method

Matrix composition: Matrix multiplication represents a composition of 2x2 matrices, because the input to the second matrix is the output of the first (this follows from the definition of composition: f ◦ g(x) = f(g(x))). Thus the ABCD matrix method is also known as the transmission matrix method, or occasionally the chain matrix method. The general expression for a transmission matrix T(s) is

\[ \begin{bmatrix} V_1 \\ I_1 \end{bmatrix} = \begin{bmatrix} A(s) & B(s) \\ C(s) & D(s) \end{bmatrix} \begin{bmatrix} V_2 \\ -I_2 \end{bmatrix}. \quad (3.35) \]

The four coefficients A(s), B(s), C(s), D(s) are all complex functions of the Laplace frequency s = σ + jω (p. 132). Typically they are polynomials in s, e.g., C(s) = s² + 1.

It is a standard convention to define every current as flowing into its node. Since the output current on the right is I2 flowing out of the network, the negative sign on I2 is required to meet the convention of current into every node. When transmission matrices are cascaded, all the signs then match.

We have already used 2x2 matrix composition in representing complex numbers (p. 38), and for computing the gcd(m,n) of m,n ∈ N (p. 54), Pell’s equation (p. 63) and the Fibonacci sequence (p. 65). It would appear that 2x2 matrices have high utility, yet are simple to work with.

Definitions of A, B, C, D: By writing out the equations corresponding to Eq. 3.35, it is trivial to show that

\[ A(s) = \left.\frac{V_1}{V_2}\right|_{I_2=0}, \quad B(s) = \left.\frac{-V_1}{I_2}\right|_{V_2=0}, \quad C(s) = \left.\frac{I_1}{V_2}\right|_{I_2=0}, \quad D(s) = \left.\frac{-I_1}{I_2}\right|_{V_2=0}. \quad (3.36) \]

Each equation has a physical interpretation and a corresponding name. Functions A and C are said to be blocked, because the output current I2 is zero. Functions B and D are said to be short-circuited, because the output voltage V2 is zero. These two terms (blocked, short-circuited) are electrical-engineering centric, arbitrary, and fail to generalize to other cases, so they should be avoided. For example, in a mechanical system blocked would correspond to an isometric output (no length change), i.e., an output velocity of zero. In mechanics the isometric force is defined as the maximum applied force, conditioned on zero velocity (aka, the blocked force). Thus the short-circuited force (aka B) would correspond to the force when the force is zero, which is nonsense. Thus these engineering-centric terms do not gracefully generalize, so better terminology is needed. Much of this was sorted out by Thevenin25 c1883 (Johnson, 2003) and (Kennelly, 1893).

A, D are called voltage (force) and current (velocity) transfer functions, since they are ratios of voltages and currents, whereas B, C are known as the transfer impedance and transfer admittance. For example, the unloaded (blocked, I2 = 0) output voltage is V2 = I1/C, which would correspond to the isometric force in mechanics. In this way each term expresses an output (port 2) in terms of an input (port 1), for a given load condition.

Exercise: Explain why C is given as above. Solution: Writing out the lower equation gives I1 = CV2 − DI2. Setting I2 = 0, we obtain C = I1/V2|_{I2=0}.

Exercise: Can C = 0? Solution: Yes: if I2 = 0 and I1 = I2, then C = 0. For C ≠ 0 there needs to be a finite “shunt” impedance across V1, so that I1 ≠ I2 = 0.

25https://en.wikipedia.org/wiki/Leon_Charles_Thevenin


3.5.1 Thevenin parameters of a source:

An important concept in circuit theory is that of the Thevenin parameters, the open-circuit voltage and the short-circuit current, the ratio of which defines the Thevenin impedance (Johnson, 2003). The open-circuit voltage is defined as the voltage V2 when the load current I2 = 0, which was shown in the previous exercise to give V2/I1|_{I2=0} = 1/C.

Thevenin Voltage: The definition of the Thevenin voltage VThev is the same as that for A; thus

\[ \left.\frac{V_{\rm Thev}}{V_1}\right|_{I_2=0} = \frac{1}{A}. \quad (3.37) \]

Thevenin impedance: The Thevenin impedance is the impedance looking into port 2 with V1 = 0; thus

\[ Z_{\rm Thev} = \left.\frac{V_2}{I_2}\right|_{V_1=0}. \]

From the upper equation of Eq. 3.35, with V1 = 0, we obtain AV2 = BI2; thus

\[ Z_{\rm Thev} = \frac{B}{A}. \]
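Both Thevenin relations can be read directly off a numeric ABCD matrix. The example cell below (a series impedance Zl followed by a shunt admittance Yc) and its element values are our choices, not from the text:

```python
import numpy as np

# Thevenin parameters from an ABCD matrix: open-circuit voltage gain
# is 1/A (Eq. 3.37), and Z_Thev = B/A.
Zl, Yc = 2.0, 0.5
T = np.array([[1 + Zl * Yc, Zl],     # [[A, B],
              [Yc,          1.0]])   #  [C, D]]
A, B = T[0]
C, D = T[1]

v_thev_over_v1 = 1 / A     # V_Thev / V1 with I2 = 0
z_thev = B / A             # impedance into port 2 with V1 = 0
print(v_thev_over_v1, z_thev)   # 0.5 1.0
```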

Properties of the transmission matrix: The transmission matrix is always constructed from the product of elemental matrices of the form

\[ \begin{bmatrix} 1 & Z(s) \\ 0 & 1 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 1 & 0 \\ Y(s) & 1 \end{bmatrix}. \]

Thus for the case of reciprocal systems (P6, p. 134),

\[ \Delta_T = \det \begin{bmatrix} A(s) & B(s) \\ C(s) & D(s) \end{bmatrix} = 1, \]

since the determinant of each elemental matrix is 1, and thus the determinant of their product is 1. An anti-reciprocal system may be synthesized by the use of a gyrator, and for such cases ∆T = −1.

The eigenvalue and eigenvector equations for a T matrix are summarized on p. 183 and discussed in Appendix D (p. 263).

3.5.2 Lec 17: The impedance matrix

With a bit of algebra, one may find the impedance matrix in terms of A, B, C, D (Van Valkenburg, 1964a, p. 310):

\[ \begin{bmatrix} V_1 \\ V_2 \end{bmatrix} = \begin{bmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{bmatrix} \begin{bmatrix} I_1 \\ I_2 \end{bmatrix} = \frac{1}{C} \begin{bmatrix} A & \Delta_T \\ 1 & D \end{bmatrix} \begin{bmatrix} I_1 \\ I_2 \end{bmatrix}. \quad (3.38) \]

Note that if C = 0, the impedance matrix does not exist.

Definitions of z11(s), z12(s), z21(s), z22(s): The definitions of the matrix elements are easily read off of the equation, as

\[ z_{11} = \left.\frac{V_1}{I_1}\right|_{I_2=0}, \quad z_{12} = \left.\frac{V_1}{I_2}\right|_{I_1=0}, \quad z_{21} = \left.\frac{V_2}{I_1}\right|_{I_2=0}, \quad z_{22} = \left.\frac{V_2}{I_2}\right|_{I_1=0}. \quad (3.39) \]

These definitions follow trivially from Eq. 3.38, and each element has a physical interpretation. For example, the unloaded (I2 = 0, aka blocked or isometric) input impedance is z11(s) = A(s)/C(s), while the unloaded transfer impedance is z21(s) = 1/C(s). For reciprocal systems (P6, p. 134) z12 = z21, since ∆T = 1. For anti-reciprocal systems, such as dynamic (aka magnetic) loudspeakers and microphones (Kim and Allen, 2013), ∆T = −1; thus z21 = −z12 = 1/C. Finally, z22 is the impedance looking into port 2 with port 1 open/blocked (I1 = 0).
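The conversion of Eq. 3.38 is mechanical enough to automate. The helper name and the example values below are ours; the C = 0 guard reflects the remark above that the impedance matrix does not exist in that case:

```python
import numpy as np

# Sketch of Eq. 3.38: convert an ABCD (transmission) matrix into the
# impedance matrix (1/C) [[A, dT], [1, D]], with dT = det T.
def abcd_to_z(T):
    (A, B), (C, D) = T
    if C == 0:
        raise ValueError("impedance matrix does not exist when C = 0")
    dT = A * D - B * C                       # det T; 1 for reciprocal networks
    return np.array([[A, dT], [1.0, D]]) / C

T = np.array([[2.0, 2.0], [0.5, 1.0]])       # A=2, B=2, C=0.5, D=1 (det = 1)
Z = abcd_to_z(T)
assert np.isclose(Z[0, 1], Z[1, 0])          # z12 = z21, since det T = 1
print(Z)
```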


3.5.3 Lec 18: Network power relations

Impedance is a very general concept, closely tied to the definition of power P(t) (and energy). Power is defined as the product of the effort (force) and the flow (current). As described in Table 3.2, these concepts are very general, applying to mechanics, electrical circuits, acoustics, thermal circuits, or any other case where conservation of energy applies. Two basic variables are defined, generalized force and generalized flow, also called conjugate variables. The product of the conjugate variables is the power, and the ratio is the impedance. For example, for the case of voltage and current,

\[ \mathcal{P}(t) \equiv v(t)\, i(t), \qquad Z(s) \equiv \frac{V(s)}{I(s)}. \]

3.5.4 Power vs. power series, linear vs. nonlinear

Another place where equations of second degree appear in physical applications is in energy and power calculations. The electrical power is given by the product of the voltage v(t) and current i(t) (or in mechanics as the force times the velocity). For example, if we define P(t) = v(t)i(t) to be the power [watts], then the total energy [joules] at time t is (Van Valkenburg, 1964a, Chapter 14)

\[ E(t) = \int_0^t v(t)\, i(t)\, dt. \]

From this we observe that the power is the rate of change of the total energy,

\[ \mathcal{P}(t) = \frac{d}{dt} E(t), \]

reminiscent of the fundamental theorem of calculus [Eq. 4.7 (p. 148)].

Ohm’s law and impedance: The ratio of the voltage over the current is called the impedance, which has units of [ohms]. For example, given a resistor of R = 10 [ohms],

\[ v(t) = R\, i(t). \]

Namely, 1 [amp] flowing through the resistor would give 10 [volts] across it. Merging the linear relation due to Ohm’s law with the definition of power shows that the instantaneous power in a resistor is quadratic in voltage and current:

\[ \mathcal{P}(t) = v(t)^2/R = i(t)^2 R. \quad (3.40) \]

Note that Ohm’s law is linear in its relation between voltage and current, whereas the power and energy are nonlinear.
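The nonlinearity is easy to exhibit numerically: the power of Eq. 3.40 fails superposition, while scaling the current scales the power quadratically. The resistor value below is our choice:

```python
# The power dissipated in a resistor (Eq. 3.40) is quadratic in the
# current, so it does not obey superposition.
R = 10.0                          # [ohm]

def power(i):
    return i * i * R              # P = i^2 R  [watts]

assert power(2.0) == 4 * power(1.0)                  # quadratic scaling
assert power(1.0 + 2.0) != power(1.0) + power(2.0)   # superposition fails
print(power(1.0))                 # 10.0 [W] for 1 [amp]
```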

Ohm’s law generalizes in a very important way, allowing the impedance (e.g., resistance) to be a linear complex analytic function of complex frequency s = σ + jω (Kennelly, 1893; Brune, 1931a). Impedance is a fundamental concept in many fields of engineering. For example:26 Newton’s second law F = ma obeys Ohm’s law, with mechanical impedance Z(s) = sm. Hooke’s law F = kx for a spring is described by a mechanical impedance Z(s) = k/s. In mechanics a “resistor” is called a dash-pot, and its impedance is a positive real constant.27

26In acoustics the pressure is a potential, like voltage. The force per unit area is given by f = −∇p, thus F = −∫∇p dS. Velocity is analogous to a current. In terms of the velocity potential, the velocity per unit area is v = −∇φ.
27https://en.wikipedia.org/wiki/Impedance_analogy


Kirchhoff’s laws KCL, KVL: The laws of electricity and mechanics may be written down using Kirchhoff’s current and voltage laws (KCL, KVL), which lead to linear systems of equations in the currents and voltages (velocities and forces) of the system under study, with complex coefficients having positive real parts.

A major point of confusion is a number of terms that are misused, and overused, in the fields of mathematics, physics and engineering. Some of the most obviously abused terms are linear/nonlinear, energy, power, and power series. These have multiple meanings, which can be, and often are, fundamentally in conflict.

Transfer functions (transfer matrix): The only method that seems to work to sort this out is to cite the relevant physical application, in specific contexts. The most common standard reference is a physical system that has an input x(t) and an output y(t). If the system is linear, then it may be represented by its impulse response h(t). In such cases the system equation is

\[ y(t) = h(t) \star x(t) \;\leftrightarrow\; Y(\omega) = \left. H(s)\right|_{\sigma=0} X(\omega); \]

namely, the convolution of the input with the impulse response gives the output. From Fourier analysis this relation may be written in the real frequency domain as a product of the Laplace transform of the impulse response, evaluated on the jω axis, and the Fourier transform of the input X(ω) ↔ x(t), giving the output Y(ω) ↔ y(t).

If the system is nonlinear, then the output is not given by a convolution, and the Fourier and Laplace transforms have no obvious meaning.

The question that must be addressed is why the power is nonlinear, whereas a power series of H(s) is linear: both have powers of the underlying variables. This is naturally confusing, and rarely, if ever, addressed. The quick answer is that powers of the Laplace frequency s correspond to derivatives, which are linear, whereas the product of the voltage v(t) and current i(t) is nonlinear. This important and interesting question will be addressed in Section 3.7.1 (p. 133), in terms of the system postulates of physical systems.

The domain variables must be separated from the codomain variables. In our example, the voltage and current are multiplied together, resulting in a nonlinear output, the power. If the frequency is squared, this describes the degree of a polynomial. This is not nonlinear, because it does not impact the signal output; rather, it characterizes the Laplace transform of the system response.

Table 3.2: The generalized impedance is defined as the ratio of a force over a flow, a concept that also holds in mechanics and acoustics. In mechanics, the ‘force’ is equal to the mechanical force on an element (e.g. a mass, dash-pot, or spring), and the ‘flow’ is the velocity. In acoustics, the ‘force’ is pressure, and the ‘flow’ is the volume velocity or particle velocity of air molecules.

Case       | Force              | Flow                  | Impedance | Units
Electrical | voltage (V)        | current (I)           | Z = V/I   | [Ω]
Mechanics  | force (F)          | velocity (U)          | Z = F/U   | mechanical [Ω]
Acoustics  | pressure (P)       | particle velocity (V) | Z = P/V   | specific [Ω]
Acoustics  | mean pressure (P)  | volume velocity (V)   | Z = P/V   | acoustic [Ω]
Thermal    | temperature (T)    | heat-flux (J)         | Z = T/J   | thermal [Ω]

Ohm’s law: In general, impedance is defined as the ratio of a force over a flow. For electrical circuits (Table 3.2), the voltage is the ‘force’ and the current is the ‘flow.’ Ohm’s law states that the voltage across and the current through a circuit element are linearly related by the impedance of that element (which is typically a complex function of the complex Laplace frequency s = σ + jω). For resistors, the ratio of the voltage over the current is called the resistance, and is a constant (e.g., the simplest case, V/I = R). For inductors and capacitors, the impedance depends on the Laplace frequency s [e.g., V/I = Z(s)].


As summarized in Table 3.2, the impedance concept also holds for mechanics and acoustics. In mechanics, the ‘force’ is equal to the mechanical force on an element (e.g. a mass, dash-pot, or spring), and the ‘flow’ is the velocity. In acoustics, the ‘force’ is pressure, and the ‘flow’ is the volume velocity or particle velocity of air molecules.

In this section we shall derive the method of linear composition of systems, also known as the ABCD transmission matrix method, or in the mathematical literature as the Möbius (bilinear) transformation. Using the method of matrix composition, a linear system of 2x2 matrices can represent a large and important family of networks. By the application of Ohm’s law to the circuit shown in Fig. 3.2, we can model a cascade of such cells.

Figure 3.2: A single LC cell of the LC transmission line (see Fig. Q.1, p. 328), with a series inductor L = ρA(x) between node voltages V1 and V2 (currents I1, I2) and a shunt capacitor C = ηPA(x). Every cell of any transmission line may be modeled by the ABCD method, as the product of two matrices. For the example shown here, the inductance L of the coil and the capacitance C of the capacitor are in units of [henry/m] and [farad/m]; thus they depend on the length ∆x [m] that the cell represents. Note the flows are always defined as into the + node.

Example of the use of ABCD matrix composition: In Fig. 3.2 we see that the network is composed of a series inductor (mass) having an impedance Zl = sL, and a shunt capacitor (compliance) having an impedance Zc = 1/sC, where s ∈ C. As determined by Ohm’s law, each impedance is described by a linear relation between the current and the voltage. For the inductive impedance, applying Ohm’s law, we find

\[ V_1 - V_2 = Z_l I_1, \]

where Zl is the complex impedance of the inductor. For the capacitive impedance, applying Ohm’s law, we find

\[ V_2 = (I_1 + I_2) Z_c, \]

where Zc is the complex impedance of the capacitor.

Each of these linear impedance relations may be written in matrix form. The series inductor equation gives (note I1 = −I2)

\[ \begin{bmatrix} V_1 \\ I_1 \end{bmatrix} = \begin{bmatrix} 1 & Z_l \\ 0 & 1 \end{bmatrix} \begin{bmatrix} V_2 \\ -I_2 \end{bmatrix}, \quad (3.41) \]

while the shunt capacitor equation yields (note V2 = V3)

\[ \begin{bmatrix} V_2 \\ I_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ Y_c & 1 \end{bmatrix} \begin{bmatrix} V_3 \\ -I_3 \end{bmatrix}, \quad (3.42) \]

where Yc = 1/Zc is called the admittance.

When the second matrix equation for the shunt admittance (Eq. 3.42) is substituted into the series impedance equation (Eq. 3.41), we find the ABCD matrix (T1T2) for the cell is simply the product of two matrices:

\[ \begin{bmatrix} V_1 \\ I_1 \end{bmatrix} = \begin{bmatrix} 1 & Z_l \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ Y_c & 1 \end{bmatrix} \begin{bmatrix} V_3 \\ -I_3 \end{bmatrix} = \begin{bmatrix} 1 + Z_l Y_c & Z_l \\ Y_c & 1 \end{bmatrix} \begin{bmatrix} V_3 \\ -I_3 \end{bmatrix}. \quad (3.43) \]

Note that the determinant of the matrix is ∆ = AD − BC = 1. This is not an accident, since the determinants of the two matrices are each 1; thus the determinant of their product is 1. Every cascade of series and shunt elements will always have ∆ = 1.
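The cascade of Eq. 3.43 and its unit determinant can be checked numerically at a single Laplace frequency; the element values and frequency below are our choices:

```python
import numpy as np

# The LC cell of Fig. 3.2 / Eq. 3.43: a series-impedance matrix times a
# shunt-admittance matrix, evaluated at one frequency s = jw.
s = 1j * 2 * np.pi * 1e3      # 1 kHz on the jw axis
L, C = 1e-3, 1e-6             # [henry], [farad]
Zl, Yc = s * L, s * C

T_series = np.array([[1, Zl], [0, 1]])
T_shunt  = np.array([[1, 0], [Yc, 1]])
T = T_series @ T_shunt

assert np.isclose(T[0, 0], 1 + Zl * Yc)      # A(s) = 1 + s^2 LC
assert np.isclose(np.linalg.det(T), 1.0)     # det = 1 for series/shunt cascades
```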


Thus for Fig. 3.2, Eq. 3.43 has A(s) = 1 + s²LC, B(s) = sL, C(s) = sC and D(s) = 1. This equation characterizes every possible relation between the input and output voltage and current of the cell. For example, the ratio of the output over the input voltage, with the output unloaded, is

\[ \left.\frac{V_3}{V_1}\right|_{I_3=0} = \frac{1}{A(s)} = \frac{1}{1 + Z_l Y_c} = \frac{1}{1 + Z_l/Z_c} = \frac{Z_c}{Z_c + Z_l}. \]

This relation is known as the voltage divider relation.

To derive the formula for the current divider equation, use the lower equation and set V3 = 0, giving

\[ \left.\frac{-I_3}{I_1}\right|_{V_3=0} = \frac{1}{D} = 1. \]

3.6 Lec 19: Signals: Fourier transforms

Two basic transformations in engineering mathematics are the Fourier and the Laplace transforms, which deal with time–frequency analysis. One of the best books on this topic is Papoulis (1962).

Notation: The Fourier transform takes a time-domain signal f(t) ∈ R and transforms it to the frequency domain ω ∈ R, where it is complex (F(ω) ∈ C). For the Fourier transform, both the time −∞ < t < ∞ and the frequency −∞ < ω < ∞ are strictly real. The relationship between f(t) and its transform F(ω) is indicated by the double-arrow symbol

f(t) ↔ F(ω).

Since the FT obeys superposition, it is possible to define the FT of a complex time function f(t) ∈ C, t ∈ R. This is useful in certain engineering applications (e.g., the Hilbert envelope and Hilbert transforms). It is accepted in the engineering and physics literature to use the case of the variable to indicate the type of argument. A time function is f(t), where t has units of seconds [s] and is in lower case, whereas its Fourier transform, a function of frequency having units of either hertz [Hz] or radians per second [2πHz], is written using upper case F(ω). This helps the reader parse the type of variable under consideration. This is a helpful piece of notation, but not entirely in agreement with notation used in mathematics.

Definition of the Fourier transform: The forward transform takes f(t) to F(ω), while the inverse transform takes F(ω) to f̃(t). The tilde symbol indicates that, in general, the recovered inverse-transform signal can be slightly different from f(t). We give examples of this in Table 3.3.

\[ F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \qquad \tilde f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega \quad (3.44) \]

\[ F(\omega) \leftrightarrow f(t) \qquad \tilde f(t) \leftrightarrow F(\omega). \quad (3.45) \]

Properties of Fourier Transforms:

1. Both time t and frequency ω are real.

2. For the forward transform (time to frequency), the sign of the exponential is negative.

3. The limits on the integrals in both the forward and inverse FTs are [−∞, ∞].

4. When taking the inverse Fourier transform, the scale factor of 1/2π is required to cancel the 2π in the differential dω.


Table 3.3: Basic (Level I) Fourier transforms. Note a > 0 ∈ R has units [rad/s]. To flag this necessary condition, we use |a| to assure this condition will be met. The other constant To ∈ R [s] has no restrictions, other than being real. Complex constants may not appear as the argument of a delta function, since complex numbers do not have the order property.

f(t) ↔ F(ω)                                                | Name
δ(t) ↔ 1(ω) ≡ 1 ∀ω                                        | Dirac
1(t) ≡ 1 ∀t ↔ 2πδ(ω)                                      | Dirac
sgn(t) = t/|t| ↔ 2/(jω)                                    |
u(t) = [1(t) + sgn(t)]/2 ↔ πδ(ω) + 1/(jω) ≡ U(ω)           | step
δ(t − To) ↔ e^{−jωTo}                                      | delay
δ(t − To) ⋆ f(t) ↔ F(ω) e^{−jωTo}                          | delay
u(t) e^{−|a|t} ↔ 1/(jω + |a|)                              | exp
rec(t) = (1/To)[u(t) − u(t − To)] ↔ (1 − e^{−jωTo})/(jωTo) | pulse
u(t) ⋆ u(t) ↔ δ²(ω) not defined                            | NaN

5. The Fourier step function may be defined by the use of superposition of 1(t) and sgn(t) = t/|t| as

\[ u(t) \equiv \frac{1 + \mathrm{sgn}(t)}{2} = \begin{cases} 1 & t > 0 \\ 1/2 & t = 0 \\ 0 & t < 0. \end{cases} \]

Taking the FT of a delayed step function gives

\[ u(t - T_o) \leftrightarrow \frac{1}{2}\int_{-\infty}^{\infty} \left[1 + \mathrm{sgn}(t - T_o)\right] e^{-j\omega t}\, dt = \pi\delta(\omega) + \frac{e^{-j\omega T_o}}{j\omega}. \]

Thus the FT of the step function has the term πδ(ω) due to the 1 in the definition of the Fourier step. This term introduces a serious flaw in the FT of the step function: while it appears to be causal, it is not.

6. The convolution u(t) ⋆ u(t) is not defined, because both 1 ⋆ 1 and δ²(ω) do not exist (and cannot be defined).

7. The inverse FT has convergence problems whenever there is a discontinuity in the time response. This we indicate with a tilde over the reconstructed time response. The error between the target time function and the reconstruction is zero in the mean-squared sense, but not point-wise.

Specifically, at the discontinuity point of the Fourier step function (t = 0), ũ(t) ≠ u(t), yet ∫|ũ(t) − u(t)|²dt = 0. At the point of the discontinuity the reconstructed function displays Gibbs ringing (it oscillates around the step, and hence does not converge at the jump).28 The LT does not exhibit Gibbs ringing.

28https://en.wikipedia.org/wiki/Gibbs_phenomenon


8. The FT is not always analytic in ω, as in this example of the step function. The step function cannot be expanded in a Taylor series about ω = 0, because δ(ω) is not analytic in ω.

9. The Fourier delta function is denoted δ̃(t), to differentiate it from the Laplace delta function δ(t). They differ because the step functions differ, due to the convergence problem described above.

10. One may define

\[ \tilde u(t) = \int_{-\infty}^{t} \tilde\delta(t)\, dt, \]

and define the somewhat questionable notation

\[ \tilde\delta(t) = \frac{d}{dt}\tilde u(t), \]

since the Fourier step function is not analytic.

11. The rec(t) function is defined as

\[ \mathrm{rec}(t) = \frac{u(t) - u(t - T_o)}{T_o} = \begin{cases} 0 & t < 0 \\ 1/T_o & 0 < t < T_o \\ 0 & t > T_o. \end{cases} \]

It follows that δ̃(t) = lim_{To→0} rec(t). Like δ(t), rec(t) has unit area.
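A transform pair from Table 3.3 can be checked numerically: a trapezoid sum over t ≥ 0 stands in for the Fourier integral of u(t)e^{−|a|t}, whose transform is 1/(jω + |a|). The values of a, ω and the grid below are our choices:

```python
import numpy as np

# Numerical check of the exponential Fourier pair:
#   u(t) e^{-a t}  <->  1/(jw + a),  a > 0.
a, omega = 2.0, 5.0
t = np.linspace(0.0, 20.0, 200001)        # u(t): integrate t >= 0 only
dt = t[1] - t[0]
g = np.exp(-a * t) * np.exp(-1j * omega * t)
F_num = np.sum((g[:-1] + g[1:]) / 2) * dt  # trapezoid rule
F_exact = 1 / (1j * omega + a)
assert abs(F_num - F_exact) < 1e-6
print(F_num)
```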

Table 3.4: Summary of key properties of FTs.

FT Properties
d/dt v(t) ↔ jω V(ω)              | deriv
f(t) ⋆ g(t) ↔ F(ω) G(ω)          | conv
f(t) g(t) ↔ (1/2π) F(ω) ⋆ G(ω)   | conv
f(at) ↔ (1/a) F(ω/a)             | scaling

Periodic signals: Besides these two basic types of time–frequency transforms, there are several variants that depend on the symmetry in time and frequency. For example, when the time signal is sampled (discrete in time), the frequency response becomes periodic, leading to the discrete-time Fourier transform (DTFT). When a time response is periodic, the frequency response is sampled (discrete in frequency), leading to the Fourier series. These two symmetries may be simply characterized as periodic in time ⇒ discrete in frequency, and periodic in frequency ⇒ discrete in time. When a function is discrete both in time and frequency, it is necessarily periodic in time and frequency, leading to the discrete Fourier transform (DFT). The DFT is typically computed with an algorithm called the fast Fourier transform (FFT), which can dramatically speed up the calculation when the data length is a power of 2.

An important symmetry occurs for functions that are causal (in time) and periodic (in frequency). The best-known example is the class of signals that have z transforms, which are causal (one-sided in


time) discrete-time signals. The harmonic series (Eq. 3.29, p. 91) is the z-transform of the discrete-time step function, and is thus, due to symmetry, analytic within the RoC in the complex frequency (z) domain.

The double brackets on f((t))To indicate that f(t) is periodic in t with period To, i.e., f(t) = f(t + kTo) for all k ∈ N. Integrating over one period and dividing by To gives the average value of f(t).

Exercise: Consider the FT as a scalar (dot) product (Eq. 3.16, p. 109) between “vectors” f((t))To and e^{−jωkt}:

\[ F(\omega_k) = f((t))_{T_o} \cdot e^{-j\omega_k t} \equiv \frac{1}{T_o}\int_0^{T_o} f(t)\, e^{-j\omega_k t}\, dt, \]

where ωo = 2π/To, f(t) has period To, i.e., f(t) = f(t + nTo) = e^{jωnt} with n ∈ N, and ωk = kωo. What is the value of the scalar product? Solution: Evaluating the scalar product, we find

\[ e^{j\omega_n t} \cdot e^{-j\omega_k t} = \frac{1}{T_o}\int_0^{T_o} e^{j\omega_n t}\, e^{-j\omega_k t}\, dt = \frac{1}{T_o}\int_0^{T_o} e^{j2\pi(n-k)t/T_o}\, dt = \begin{cases} 1 & n = k \\ 0 & n \ne k. \end{cases} \]

The two signals (vectors) are orthogonal.
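The orthogonality claim is easy to confirm numerically with a trapezoid sum over one period; the period and grid below are our choices:

```python
import numpy as np

# Numerical check that e^{jw_n t} and e^{jw_k t} are orthogonal under
# the scalar product (1/To) * integral over one period To.
To = 2.0
omega0 = 2 * np.pi / To
t = np.linspace(0.0, To, 100001)
dt = t[1] - t[0]

def dot(n, k):
    g = np.exp(1j * omega0 * n * t) * np.exp(-1j * omega0 * k * t)
    return np.sum((g[:-1] + g[1:]) / 2) * dt / To   # trapezoid rule

assert abs(dot(3, 3) - 1) < 1e-9    # n = k: unit norm
assert abs(dot(3, 5)) < 1e-9        # n != k: orthogonal
```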

Exercise: Consider the discrete-time FT (DTFT) as a scalar (dot) product (Eq. 3.16) between “vectors” fn = f(t)|tn and e^{−jωtn}, where tn = nTs and Ts = 1/2Fmax is the sample period. The scalar product over n ∈ Z is

\[ F((\omega))_{2\pi} = f_n \cdot e^{-j\omega t_n} \equiv \sum_{n=-\infty}^{\infty} f_n\, e^{-j\omega t_n}, \]

where ωo = 2π/To and ωk = kωo. If fn = e^{jωnt}, what is the value of the scalar product? Solution: Evaluating the scalar product, we find

\[ e^{j\omega_n t} \cdot e^{-j\omega_k t} = \frac{1}{T_o}\int_0^{T_o} e^{j\omega_n t}\, e^{-j\omega_k t}\, dt = \frac{1}{T_o}\int_0^{T_o} e^{j2\pi(n-k)t/T_o}\, dt = \begin{cases} 1 & n = k \\ 0 & n \ne k. \end{cases} \]

The two signals (vectors) are orthogonal.


3.7 Lec 20: Systems: Laplace transforms

The Laplace transform takes real signals f(t) ∈ R, as a function of real time t ∈ R, that are causal, i.e., strictly zero for negative time (f(t) = 0 for t < 0), and transforms them into complex functions F(s) ∈ C of complex frequency s = σ + jω. As for the Fourier transform, there is the notation

f(t) ↔ F(s).

When a signal is zero for negative time, f(t < 0) = 0, it is said to be causal, and the resulting transform F(s) is then complex analytic over significant regions of the s plane. For a function of time to be causal, time must be real (t ∈ R), since if it were complex, it would lose the order property (and thus could not be causal). It is helpful to emphasize the causal nature of f(t) by writing it as f(t)u(t), using the Heaviside step function u(t).


Any restriction on a function (e.g., real, causal, periodic, positive real part, etc.) is called a symmetry

property. There are many forms of symmetry (Section 3.7.1, p. 133). The concept of symmetry is very

general and widely used in both mathematics and physics, where it is more generally known as group

theory. We shall restrict ourselves to only a few very basic symmetries.

Definition of the Laplace transform: The forward and inverse Laplace transforms are

\[ F(s) = \int_{0^-}^{\infty} f(t)\, e^{-st}\, dt \qquad f(t) = \frac{1}{2\pi j}\int_{\sigma_o - j\infty}^{\sigma_o + j\infty} F(s)\, e^{st}\, ds \quad (3.46) \]

\[ F(s) \leftrightarrow f(t) \qquad f(t) \leftrightarrow F(s) \quad (3.47) \]

Here s = σ + jω ∈ C [2πHz] is the complex Laplace frequency in radians and t ∈ R [s] is the time in seconds.

When dealing with engineering problems, it is convenient to separate the signals we use from the systems that process them. We do this by treating signals, such as a music signal, differently from systems, such as filters. In general, signals may start and end at any time; the concept of causality has no mathematical meaning in signal space. Systems, on the other hand, obey very rigid rules (to assure that they remain physical). These physical restrictions are described in terms of the network postulates, which are discussed in Sect. 3.7.1 (p. 133). There is a question as to why postulates are needed, and which ones are the best choices.29 There may be no answers to these questions, but having a set of postulates is a useful way of thinking about physics.30

As discussed in Section 4.5.1 (p. 178), we must use the Cauchy residue theorem (CRT), requiring closure of the contour C at ω → ±j∞:

\[ \oint_C = \int_{\sigma_0 - j\infty}^{\sigma_0 + j\infty} + \int_{\subset_\infty}, \]

where the path represented by ‘⊂∞’ is a semicircle of infinite radius. For a causal, ‘stable’ (e.g., doesn’t “blow up” in time) signal, all of the poles of F(s) must lie inside of the Laplace contour, in the left-half s-plane.

Figure 3.3: Three-element mechanical resonant circuit consisting of a spring (compliance C), mass M and dash-pot (resistance R, e.g., viscous fluid), attached to a rigid wall and driven by force f(t) with displacement x(t).

Hooke’s law for a spring states that the force f(t) is proportional to the displacement x(t), i.e., f(t) = Kx(t). The formula for a dash-pot is f(t) = Rv(t), and Newton’s famous formula for mass is f(t) = d[Mv(t)]/dt, which for constant M is f(t) = M dv/dt.

The equation of motion for the mechanical oscillator in Fig. 3.3 is given by Newton’s second law; the sum of the forces must balance to zero:

    M d²x(t)/dt² + R dx(t)/dt + K x(t) = f(t). (3.48)

These three constants, the mass M, resistance R and stiffness K (∈ R ≥ 0), are real and non-negative. The dynamical variables are the driving force f(t) ↔ F(s), the position of the mass x(t) ↔ X(s) and its velocity v(t) ↔ V(s), with v(t) = dx(t)/dt ↔ V(s) = sX(s).

[29] https://www.youtube.com/watch?v=JXAfEBbaz_4
[30] https://www.youtube.com/watch?v=YaUlqXRPMmY


3.7. LECTURE 20: SYSTEMS: LAPLACE TRANSFORMS 133

Newton’s second law (c1650) is the mechanical equivalent of Kirchhoff’s (c1850) voltage law

(KVL), which states that the sum of the voltages around a loop must be zero. The gradient of the

voltage results in a force on a charge (i.e., F = qE).

Equation 3.48 may be re-expressed in terms of impedances, the ratio of the force to velocity, once it is transformed into the Laplace frequency domain.
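To make this concrete: with V(s) = sX(s), transforming Eq. 3.48 gives F(s) = (Ms + R + K/s)V(s), so the mechanical impedance is Z(s) = F/V = Ms + R + K/s. A small numerical check (a Python sketch; the element values are hypothetical, not from the text) confirms that at the resonant frequency ω₀ = √(K/M) the mass and stiffness reactances cancel, leaving Z(jω₀) = R:

```python
import math

# Impedance implied by Eq. 3.48: Z(s) = F(s)/V(s) = Ms + R + K/s.
M, R, K = 2.0, 0.5, 8.0   # hypothetical element values

def Z(s):
    return M * s + R + K / s

w0 = math.sqrt(K / M)     # resonance: reactances cancel at w0 = sqrt(K/M)
Zr = Z(1j * w0)
assert abs(Zr.imag) < 1e-12          # reactance vanishes at resonance
assert abs(Zr.real - R) < 1e-12      # only the damping R remains
```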

Example: The divergent series

    e^t u(t) = Σ_{n=0}^{∞} tⁿ/n! ↔ 1/(s − 1)

is a valid description of e^t u(t), with an unstable pole at s = 1. For values of |x − x₀| < 1 (x ∈ R), the analytic function P(x) is said to have a region of convergence (RoC). For cases where the argument is complex (s ∈ C), this is called the radius of convergence (RoC). We call the region |s − s₀| > 1 the region of divergence (RoD), and |s − s₀| = 1 the singular circle. Typically the underlying function P(s), defined by the series, has a pole on the singular circle for s ∈ C. One may isolate such poles by moving the expansion point s₀ until the RoC approaches zero. Outside the circle the Taylor series is divergent and inside it is convergent.
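The RoC/RoD behavior is easy to observe with the geometric series for 1/(1 − x), whose singular circle is |x| = 1. A quick numerical sketch (Python; the sample points are arbitrary):

```python
# The geometric series 1/(1-x) = sum x^n has RoC |x| < 1: partial sums
# converge inside the unit circle and diverge outside it.
def partial_sum(x, n):
    return sum(x**k for k in range(n))

inside, outside = 0.5, 1.5
# Inside the RoC the partial sums approach 1/(1-x) = 2.
assert abs(partial_sum(inside, 60) - 1 / (1 - inside)) < 1e-12
# Outside (in the RoD) the partial sums grow without bound.
assert partial_sum(outside, 60) > 1e6
```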

There seems to be a conflict with the time response f(t) = e^{at}u(t), which has a divergent series (unstable pole). I’m not sure how to explain this conflict, other than to point out that t ∈ R: thus, the series expansion of the diverging exponential is real-analytic, not complex analytic. First, f(t) has a Laplace transform with a pole at s = 1, in agreement with its unstable nature. Second, every analytic function must be single valued. This follows from the fact that each term in Eq. 3.27 is single valued. Third, analytic functions are “smooth” since they may be differentiated an infinite number of times and the series still converges. There can be no jumps or kinks in such functions.

The key idea that every impedance must be complex analytic, with non-negative real part (Re Z(s) ≥ 0) for σ > 0, was first proposed by Otto Brune in his PhD thesis at MIT, supervised by Ernst A. Guillemin, an MIT electrical engineering professor who played an important role in the development of circuit theory and was a student of Arnold Sommerfeld.[31] Other MIT advisers were Norbert Wiener and Vannevar Bush. Brune’s primary, but non-MIT, advisor was W. Cauer, also perhaps a student of Sommerfeld, who was well trained in 19th-century German mathematics (Brune, 1931b).

Summary: While the definitions of the Fourier (FT) and Laplace (LT) transforms may appear similar, they are not. The key difference is that the time response of the Laplace transform is causal, leading to a complex analytic frequency response. While the frequency response of the Fourier transform is complex, it is not complex analytic since the frequency ω is real. Fourier transforms do not have poles. These are not superficial differences.

The concept of symmetry is helpful in understanding the many different types of time-frequency

transforms. Two fundamental types of symmetry are causality and periodicity.

The Fourier transform FT characterizes the steady-state response, while the Laplace transform LT characterizes both the transient and steady-state response. Given a causal system response H(s) ↔ h(t) with input x(t), the output is

    y(t) = h(t) ⋆ x(t) ↔ Y(ω) = H(s)|_{s=jω} X(ω).
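The convolution relation above has a discrete-time analogue that is easy to test. The sketch below (Python, with made-up coefficients; not code from this text) convolves a short causal FIR response h with an input x and verifies, via a direct DFT, that the transform of the convolution equals the product of the transforms:

```python
import cmath

def dft(x, N):
    # N-point DFT with implicit zero padding of x to length N.
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(len(x))) for k in range(N)]

def conv(h, x):
    # Direct (linear) convolution y = h * x.
    y = [0.0] * (len(h) + len(x) - 1)
    for n, hn in enumerate(h):
        for m, xm in enumerate(x):
            y[n + m] += hn * xm
    return y

h = [1.0, 0.5, 0.25]        # hypothetical causal FIR system
x = [1.0, -1.0, 2.0, 0.5]   # hypothetical input signal
y = conv(h, x)

N = len(y)                  # N >= len(h)+len(x)-1 avoids circular aliasing
Y, H, X = dft(y, N), dft(h, N), dft(x, N)
assert all(abs(Y[k] - H[k] * X[k]) < 1e-9 for k in range(N))
```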

3.7.1 Lec 21: Ten system postulates

Solutions of differential equations, such as the wave equation, are conveniently described in terms of mathematical properties, which we present here in terms of 10 network postulates:

[31] It must be noted that University of Illinois Prof. ‘Mac’ Van Valkenburg was arguably more influential in circuit theory during the same period. Mac’s books are certainly more accessible, but perhaps less widely cited.



(P1) Causality (non-causal/acausal): Causal systems respond only after they are acted upon. Virtually all physical systems obey causality. An example of a causal system is an integrator, whose impulse response is the step function u(t). Filters are also examples of causal systems. Signals represent acausal responses. They do not have a clear beginning or end, such as the sound of the wind or traffic noise. A causal linear system is typically complex analytic. A nonlinear system can be causal, but not complex analytic.

(P2) Linearity (nonlinear): Linear systems obey superposition. Let two signals x(t) and y(t) be the inputs to a linear system, producing outputs x′(t) and y′(t). When the inputs are presented together as ax(t) + by(t) with weights a, b ∈ C, the output will be ax′(t) + by′(t). If either a or b is zero, the corresponding signal is removed from the output.

Nonlinear systems mix the two inputs, thereby producing signals not present in the input. For example, if the inputs to a nonlinear system are two sine waves, the output will contain distortion components, having frequencies not present at the input. An example of a nonlinear system is one that multiplies the two inputs. A second is a diode, which rectifies a signal, letting current flow in only one direction. Most physical systems have some degree of nonlinear response, but this is not always desired. Other systems are designed to be nonlinear, such as the diode example.

(P3) Passive (active): An active system has a power source, such as a battery, while a passive system

has no power source. While you may consider a transistor amplifier to be active, it is only so when

connected to a power source.

(P4) Real (complex) time response: Typically systems are “real in, real out.” They do not naturally have complex responses (real and imaginary parts). While a Fourier transform takes real inputs and produces complex outputs, this is not an example of a complex time response. P4 is a characterization of the input signal, not its Fourier transform.

(P5) Time-invariant (time varying): For a system to be time varying, the output must depend on when the input signal starts or stops. If the output, relative to the input, is independent of the starting time, then the system is said to be time-invariant (static).

(P6) Reciprocal (non- or anti-reciprocal): In many ways this is the most difficult property to characterize and thus to understand. It is best characterized by the ABCD matrix (Section 3.5, p. 123). If B = C, the system is said to be reciprocal. If B = −C, it is said to be anti-reciprocal. The impedance matrix is reciprocal, while a loudspeaker is anti-reciprocal and modeled by the gyrator rather than the transformer. Non-reciprocal systems are modeled by gyrators, which may be thought of as transformers that swap the force and flow variables. For example, the input impedance of a gyrator terminated by an inductor is that of a capacitor. For an expanded discussion on reciprocity, see Sect. K, p. 289.

(P7) Reversibility (non-reversible): If swapping the input and output of a system leaves the system invariant, it is said to be reversible. When A = D, the system is reversible. Note the similarity and differences between reversible and reciprocal.

(P8) Space-invariant (space-variant): If a system operates independently of where it physically sits in space, then it is space-invariant. When the parameters that characterize the system depend on position, it is space-variant.

(P9) Deterministic (random): Given the wave equation along with the boundary conditions, the system’s solution may be deterministic, or not, depending on its extent. Consider a radar or sonar wave propagating out into uncharted territory. When the wave hits an object, the reflection can return waves that are not predicted, due to unknown objects. This is an example where the boundary condition is not known in advance.


3.8. LECTURE 22: (WEEK 7) COMPLEX ANALYTIC MAPPINGS (DOMAIN-COLORING) 135

(P10) Quasi-static (ka < 1): Quasi-statics follows the Nyquist sampling theorem, for systems that have dimensions that are small compared to the local wavelength. This assumption fails when the frequency is raised (the wavelength becomes short). Thus this is also known as the long-wavelength approximation. Quasi-statics is typically stated as ka < 1, where k = 2π/λ = ω/c₀ and a is the smallest dimension of the system. See p. 235 for a detailed discussion of the role of quasi-statics in acoustic horn wave propagation.

Postulate (P10) is closely related to the Feynman lecture on the “underlying unity” of nature, where Feynman asks (Feynman, 1970c, Ch. 12-7): Why do we need to treat the fields as smooth? His answer is related to the wavelength of the probing signal relative to the dimensions of the object being probed. This raises the fundamental question: Are Maxwell’s equations a band-limited approximation to reality? Today we have no answer to this question.

Here is one helpful discussion:[32]

The Lorentz force formula and Maxwell’s equations are two distinct physical laws, yet the two methods yield the same results.

Why the two results coincide was not known. In other words, the flux rule consists of two physically different laws in classical theories. Interestingly, this problem was also a motivation behind the development of the theory of relativity by Albert Einstein. In 1905, Einstein wrote in the opening paragraph of his first paper on relativity theory, “It is known that Maxwell’s electrodynamics – as usually understood at the present time – when applied to moving bodies, leads to asymmetries which do not appear to be inherent in the phenomena.” But Einstein’s argument moved away from this problem and formulated the special theory of relativity, thus the problem was not solved.

Richard Feynman once described this situation in his famous lecture (The Feynman Lectures on Physics, Vol. II, 1964), “we know of no other place in physics where such a simple and accurate general principle requires for its real understanding an analysis in terms of two different phenomena. Usually such a beautiful generalization is found to stem from a single deep underlying principle. · · · We have to understand the ‘rule’ as the combined effects of two quite separate phenomena.”

Summary discussion of the 10 network postulates: Each postulate has at least two categories. For example, (P1) is either causal, non-causal or acausal, while (P2) is either linear or nonlinear. (P6) and (P7) only apply to 2-port algebraic networks (those having an input and an output). The others apply to both 2- and 1-port networks (e.g., an impedance is a 1-port). An interesting example is the anti-reciprocal transmission matrix of a loudspeaker (Allen, 2019, Lec. 21).

Related forms of these postulates may be found in the network theory literature (Van Valkenburg, 1964a,b; Ramo et al., 1965). Postulates (P1–P6) were introduced by Carlin and Giordano (1964) and (P7–P9) were added by Kim et al. (2016). While linearity (P2), passivity (P3), realness (P4) and time-invariance (P5) are independent, causality (P1) is a consequence of linearity (P2) and passivity (P3) (Carlin and Giordano, 1964, p. 5).
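Postulate (P2) lends itself to a direct numerical test: feed a weighted sum of two inputs into the system and compare with the weighted sum of the individual responses. The sketch below (Python; both "systems" are hypothetical toy examples, not from the text) shows a moving average passing the superposition test while a squarer fails it:

```python
# Superposition test for (P2): a system S is linear iff
# S(a*x + b*y) == a*S(x) + b*S(y) for all weights a, b.
def movavg(x):
    # causal 3-point moving average (a linear system)
    return [(x[n] + x[n-1] + x[n-2]) / 3 if n >= 2 else 0.0
            for n in range(len(x))]

def squarer(x):
    # memoryless squarer (a nonlinear system)
    return [v * v for v in x]

x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, -1.0, 1.0, 2.0]
a, b = 2.0, -3.0
mix = [a * xi + b * yi for xi, yi in zip(x, y)]

# Linear system: both sides agree sample by sample.
lhs = movavg(mix)
rhs = [a * u + b * v for u, v in zip(movavg(x), movavg(y))]
assert all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs))

# Nonlinear system: superposition fails.
lhs2 = squarer(mix)
rhs2 = [a * u + b * v for u, v in zip(squarer(x), squarer(y))]
assert any(abs(p - q) > 1e-9 for p, q in zip(lhs2, rhs2))
```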

3.8 Lec 22: Complex analytic mappings (domain-coloring)

One of the most difficult aspects of complex functions of a complex variable is visualizing the mappings from the z = x + iy plane to the w(z) = u + iv plane. For example, w(z) = sin(z) is trivial when z = x + iy is real (i.e., y = 0), because then sin(x) is real. Likewise for the case where x = 0,

    sin(iy) = (e^{−y} − e^{y})/(2i) = i sinh(y)

[32] https://www.sciencedaily.com/releases/2017/09/170926085958.htm



is pure imaginary. However, the more general case

    w(z) = sin(z) ∈ C

is not easily visualized. And when u(x, y) and v(x, y) are less well known functions, w(z) can be even more difficult to visualize. For example, if w(z) = J₀(z) then u(x, y) and v(x, y) are the real and imaginary parts of the Bessel function.

A software solution: Fortunately with computer software today, this problem can be solved by adding color to the chart. An Octave/Matlab script[33] zviz.m has been used to make the charts shown in Fig. 3.4. Such charts are known as domain-coloring.[34]

Figure 3.4: Left: Domain-colorized map showing the complex mapping from the s = σ + jω plane to the w(s) = u(σ, ω) + jv(σ, ω) plane. This mapping may be visualized by the use of intensity (light/dark) to indicate magnitude, and color (hue) to indicate angle (phase) of the mapping. Right: This shows the w(z) = z − √j plane, shifted to the right and up by √2/2 ≈ 0.707. The white and black lines are the iso-real and iso-imaginary contours of the mapping.

Rather than plotting u(x, y) and v(x, y) separately, domain-coloring allows us to display the entire function on one color chart (i.e., colorized plot). For this visualization we see the complex polar form of w(s) = |w|e^{j∠w}, rather than the four-dimensional Cartesian graph w(x + jy) = u(x, y) + jv(x, y). On the left is the reference condition, the identity mapping (w = s), and on the right the origin has been shifted to the right and up by √2/2.

Mathematicians typically use the abstract (i.e., non-physical) notation w(z), where w = u + vı and z = x + yı. Engineers prefer to think in terms of a physical complex impedance Z(s) = R(s) + jX(s), having resistance R(s) and reactance X(s) [ohms], as a function of the complex Laplace radian frequency s = σ + jω [rad], as used, for example, with the Laplace transform (Sect. 3.7, p. 131).

In Fig. 3.4 we use both notations, with Z(s) = s on the left and w(z) = z − √j on the right, where we show this color code as a domain-coloring graph. Intensity (dark to light) represents the magnitude of the function, while hue (color) represents the phase, where (see Fig. 3.4) red is 0°, sea-green is 90°, blue-green is 135°, blue is 180°, and violet is −90° (or 270°).[35]

The function w = s = |s|e^{jθ} has a dark spot (a zero) at s = 0, and becomes brighter away from the origin. On the right is w(z) = z − √j, which shifts the zero to z = √j. Thus domain-coloring gives the full complex analytic function mapping w(x, y) = u(x, y) + jv(x, y) in colorized polar coordinates.
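The colorization rule itself is simple. The sketch below (Python, using the standard-library colorsys module; the particular hue/intensity formulas are one plausible choice for illustration, not the ones in zviz.m) maps each complex value w to an RGB color, with hue encoding phase and intensity encoding magnitude:

```python
import cmath, colorsys

def color(w):
    # Hue encodes the phase of w (red at phase 0);
    # value (dark -> light) encodes |w|: black at a zero, -> white as |w| grows.
    hue = (cmath.phase(w) / (2 * cmath.pi)) % 1.0
    val = 1.0 - 2.0 ** (-abs(w))
    return colorsys.hsv_to_rgb(hue, 1.0, val)

r, g, b = color(1.0 + 0.0j)      # positive real value: reddish
assert r >= g and r >= b
assert color(0.0 + 0.0j) == (0.0, 0.0, 0.0)   # zeros plot as black
```

Evaluating `color` on a grid of z values and painting each pixel reproduces the style of chart shown in Fig. 3.4.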

Visualizing complex functions: The mapping from z = x + ıy to w(z) = u(x, y) + ıv(x, y) is difficult to visualize because for each point in the domain z, we would like to represent both the magnitude and phase (or real and imaginary parts) of w(z). A good way to visualize these mappings is to use color (hue) to represent the phase and intensity (dark to light) to represent the magnitude. The Octave/Matlab program zviz.m has been provided to do this (see Lecture 17 on the class website).

Figure 3.5: Plot of sin(0.5π(z − i)).

To use the program in Matlab/Octave, use the syntax zviz <function of z> (for example, type zviz z.ˆ2). Note the period between z and ˆ2. This will render a domain-coloring (aka colorized) version of the function. Examples you can render with

[33] http://jontalle.web.engr.illinois.edu/uploads/298/zviz.zip
[34] https://en.wikipedia.org/wiki/Domain_coloring
[35] Colors vary depending on both the display medium and the eye.



zviz are given in the comments at the top of the zviz.m program. A good example for testing is zviz z-sqrt(j), which will show a dark spot (a zero) at (1 + j)/√2 = 0.707(1 + j).

Example: Figure 3.5 shows a colorized plot of w(z) = sin(π(z − i)/2) resulting from the Matlab/Octave command zviz sin(pi*(z-i)/2). The abscissa (horizontal axis) is the real x axis and the ordinate (vertical axis) is the imaginary iy axis. The graph is offset along the ordinate axis by 1i, since the argument z − i causes a shift of the sine function by 1 in the positive imaginary direction. The visible zeros of w(z) appear as dark regions at (−2, 1), (0, 1), (2, 1). As a function of x, w(x + i) oscillates between red (phase 0°), meaning the function is positive and real, and blue (phase 180°), meaning the function is negative and real.

Along the vertical axis, the function is either a cosh(y) or sinh(y), depending on x. The intensity becomes lighter as |w| increases, and white as w → ∞. The intensity becomes darker as |w| decreases, and black as w → 0.

Figure 3.6: This domain-color map allows one to visualize complex mappings by the use of intensity (light/dark) to indicate magnitude, and color (hue) to indicate angle (phase). The white and black lines are the iso-real and iso-imaginary contours of the mapping. Left: The domain-color map for the complex mapping from the z = x + jy plane to the w(z) = u + jv = e^{x+jy} plane, which goes to zero as x → −∞, causing the map to become dark for x < −2. The white and black lines are always perpendicular because e^z is complex analytic everywhere. Right: The principal value of the inverse function u(x, y) + jv(x, y) = log(x + jy), which has a zero (dark) at z = 1, since there log(1) = 0 (the imaginary part is also zero). Note the branch cut from x = [0, −∞). On branches other than the one shown there are no zeros, since the phase (∠z = 2πn, n ∈ Z) is not zero; n is called the branch index.

Example: Two additional examples are given in Fig. 3.6, using the notation w(z) = u(x, y) + jv(x, y), showing the two complex mappings w = e^s (left) and its inverse s = ln(w) (right). The exponential is relatively easy to understand because |w(s)| = |e^σ e^{jω}| = e^σ.

The red region is where ω ≈ 0, in which case w ≈ e^σ. As σ becomes large and negative, w → 0, thus the entire field becomes dark on the left. The field becomes light on the right, where w = e^σ → ∞. If we let σ = 0 and look along the ω axis, we see that the function is changing phase: sea-green (90°) at the top and violet (−90°) at the bottom.

In the right panel note the zero of ln(w) = ln |w| + jω at w = 1. The root of the log function is log(w_r) = 0, i.e., w_r = 1, φ = 0, since log(1) = 0. More generally, the log of w = |w|e^{jφ} is s = ln |w| + jφ. Thus s(w) can be zero only when the angle of w is zero.

The ln(w) function has a branch cut along the φ = 180° axis. As one crosses over the cut, the phase goes above 180°, and the plane changes to the next sheet of the log function. The only sheet with a zero is the principal value, as shown. For all other sheets the log function is either increasing or decreasing monotonically and there is no zero, as seen for sheet 0 (the one shown in Fig. 3.6).

3.8.1 Riemann Sphere: 3d extension of chord and tangent method

Once algebra was formulated, c830 CE, mathematicians were able to expand beyond the limits set by geometry on the real plane, and beyond the verbose descriptions of each problem in prose (Stillwell, 2010, p. 93). The geometry of Euclid’s Elements had paved the way, but after 2000 years, the addition of the language of algebra changed everything. The analytic function was a key development, heavily used by both Newton and Euler. The investigations of Cauchy also made important headway with his work on complex variables. Of special note was integration and differentiation in the complex plane of complex analytic functions, which is the topic of stream 3.

It was Riemann, working with Gauss in the final years of Gauss’s life, who made the breakthrough, with the concept of the extended complex plane.[36] This concept was based on the composition of a line with the sphere, similar to the derivation of Euclid’s formula for Pythagorean triplets (Fig. 3.4, p. 101). While the importance of the extended complex plane was unforeseen, it changed analytic mathematics forever, along with the physics it supported. It unified and thus simplified many important integrals, to the extreme. The basic idea is captured by the fundamental theorem of complex integral calculus (Table 4.1, p. 149).

Figure 3.7: The left panel shows how the real line may be composed with the circle. Each real x value maps to a corresponding point x′ on the unit circle. The point x → ∞ maps to the north pole N. This simple idea may be extended with the composition of the complex plane with the unit sphere, thus mapping the plane onto the sphere. As with the circle, the point on the complex plane z → ∞ maps onto the north pole N. This construction is important because while the plane is open (does not include z → ∞), the sphere is analytic at the north pole. Thus the sphere defines the closed extended plane. Figure adapted from Stillwell (2010, pp. 299-300).

The idea is outlined in Fig. 3.7. On the left is a circle and a line. The difference between this case and the derivation of the Pythagorean triplets is that the line starts at the north pole, and ends on the real x ∈ R axis at point x. At point x′, the line cuts through the circle. Thus the mapping from x to x′ takes every point on R to a point on the circle. For example, the point x = 0 maps to the south pole (not indicated). To express x′ in terms of x one must compose the line and the circle, similar to the composition used in Fig. 3.4 (p. 101). The points on the circle, indicated here by x′, require a traditional polar coordinate system, having a unit radius and an angle defined between the radius and a vertical line passing through the north pole. When x → ∞ the point x′ → N, known as the point at infinity. But this idea goes much further, as shown on the right half of Fig. 3.7.
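Composing the line with the circle gives an explicit formula: the chord from N = (0, 1) to (x, 0) cuts the unit circle at x′ = (2x/(x² + 1), (x² − 1)/(x² + 1)). A quick numerical check (a Python sketch of the left panel of Fig. 3.7; the sample points are arbitrary):

```python
# Map a real x to the point x' where the chord from the north pole
# N = (0, 1) through (x, 0) cuts the unit circle.
def to_circle(x):
    d = x * x + 1.0
    return (2.0 * x / d, (x * x - 1.0) / d)

assert to_circle(0.0) == (0.0, -1.0)      # x = 0 maps to the south pole

px, py = to_circle(1e9)                   # large x approaches N = (0, 1)
assert abs(px) < 1e-6 and abs(py - 1.0) < 1e-6

# Every image point really lies on the unit circle.
assert all(abs(u*u + v*v - 1.0) < 1e-9
           for u, v in map(to_circle, (-3.0, 0.7, 42.0)))
```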

Here the real tangent line is replaced by a tangent complex plane z ∈ C, and the complex puncture point z′ ∈ C, in this case on the complex sphere, called the extended complex plane. This is a natural extension of the chord/tangent method on the left, but with significant consequences. The main difference between the complex plane z and the extended complex plane, other than the coordinate system, is

[36] “Gauss did lecture to Riemann but he was only giving elementary courses and there is no evidence that at this time he recognized Riemann’s genius.” Then “In 1849 he [Riemann] returned to Göttingen and his Ph.D. thesis, supervised by Gauss, was submitted in 1851.” http://www-groups.dcs.st-and.ac.uk/˜history/Biographies/Riemann.html



what happens at the north pole. The point at |z| =∞ is not defined on the plane, whereas on the sphere,

the point at the north pole is simply another point, like every other point on the sphere.

Open vs. closed sets: Mathematically the plane is said to be an open set, since the limit z → ∞ is not defined, whereas on the sphere, the point z′ is a member of a closed set, since the north pole is defined. The distinction between an open and closed set is important, because the closed set allows the function to be complex analytic at the north pole, which it cannot be on the plane (since the point at infinity is not defined).

The z plane may be replaced with another plane, say the w = F(z) ∈ C plane, where w is some function F of z ∈ C. For the moment we shall limit ourselves to complex analytic functions of z, namely

    w = F(z) = u(x, y) + jv(x, y) = Σ_{n=0}^{∞} cₙzⁿ.

In summary, given a point z = x + jy on the open complex plane, we map it to w = F(z) ∈ C, the complex w = u + jv plane, and from there to the closed extended complex plane w′(z). The point of doing this is that it allows the function w′(z) to be analytic at the north pole, meaning it can have a convergent Taylor series at the point at infinity z → ∞. Since we have not yet defined dw(z)/dz, the concept of a complex Taylor series remains undefined.

3.8.2 Bilinear transformation

In mathematics the bilinear transformation has special importance because it is linear in its action on both the input and output variables. Since we are engineers we shall stick with the engineering terminology. But if you wish to read about this on the internet, be sure to also search for the mathematical term, Möbius transformation.

When a point on the complex plane z = x + jy is composed with the bilinear transform (a, b, c, d ∈ C), the result is w(z) = u(x, y) + jv(x, y), with

    w = (az + b)/(cz + d). (3.49)

The transformation from z → w is a cascade of four independent compositions:

1. translation (w = z + b: a = 1, b ∈ C, c = 0, d = 1),

2. scaling (w = |a|z: a ∈ R, b = 0, c = 0, d = 1),

3. rotation (w = (a/|a|)z: a ∈ C, b = 0, c = 0, d = |a|) and

4. inversion (w = 1/z: a = 0, b = 1, c = 1, d = 0).

Each of these transformations is a special case of Eq. 3.49, with the inversion the most complicated. I highly recommend a video showing the effect of the bilinear (Möbius) transformation on the plane.[37]
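For c ≠ 0 the cascade can be written out explicitly, since (az + b)/(cz + d) = a/c + (bc − ad)/(c(cz + d)): a scaling/rotation plus translation, an inversion, another scaling/rotation, and a final translation. A numerical check of this factorization (a Python sketch with arbitrary coefficients):

```python
# Möbius (bilinear) map and its factorization into elementary steps (c != 0).
def mobius(a, b, c, d, z):
    return (a * z + b) / (c * z + d)

def composed(a, b, c, d, z):
    w = c * z + d                  # scale/rotate, then translate
    w = 1 / w                      # inversion
    w = (b * c - a * d) / c * w    # scale/rotate by (bc - ad)/c
    return w + a / c               # final translation

a, b, c, d = 1 + 2j, 3.0, 2 - 1j, 0.5j   # arbitrary test coefficients
for z in (0.3 + 0.4j, -1.0, 2j):
    assert abs(mobius(a, b, c, d, z) - composed(a, b, c, d, z)) < 1e-12
```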

The bilinear transformation is the most general way to move the expansion point in a complex analytic expansion. For example, starting from the harmonic series, the bilinear transform gives

    1/(1 − w) = 1/(1 − (az + b)/(cz + d)) = (cz + d)/((c − a)z + (d − b)).

The RoC is transformed from |w| < 1 to |(az + b)/(cz + d)| < 1. An interesting application might be in moving the expansion point until it is on top of the nearest pole, so that the RoC goes to zero. This might be a useful way of finding a pole, for example.

When the extended plane (Riemann sphere) is analytic at z = ∞, one may take the derivatives there, defining the Taylor series with the expansion point at ∞. When the bilinear transformation rotates the Riemann sphere, the point at infinity is translated to a finite point on the complex plane, revealing the analytic nature at infinity. A second way to transform the point at infinity is by the bilinear transformation ζ = 1/z, mapping a zero (or pole) at z = ∞ to a pole (or zero) at ζ = 0. Thus this construction of the Riemann sphere and the Möbius (bilinear) transformation allows us to understand the point at infinity, and treat it like any other point. If you felt that you never understood the meaning of the point at ∞ (likely), this should help.

[37] https://www.youtube.com/watch?v=0z1fIsUNhO4


Chapter 4

Stream 3a: Scalar (i.e., Ordinary)

Differential Equations

Stream 3 is ∞, a concept which typically means unbounded (immeasurably large); but in the case of calculus, ∞ means infinitesimal (immeasurably small), since taking a limit requires small numbers. Taking a limit means you may never reach the target, a concept that the Greeks called Zeno’s paradox (Stillwell, 2010, p. 76).

When speaking of the class of ordinary (versus vector) differential equations, the term scalar is preferable, since the term “ordinary” is vague, if not a meaningless label. There is a special subset of fundamental theorems for scalar calculus, all of which are about integration, as summarized in Table 4.1 (p. 149), starting with Leibniz’s theorem. These will be discussed below, and more extensively in Lec. 4.2 (p. 148) and Sections 4 (p. 145) and 5 (p. 145).

Chronological history post 16th century

16th: Bombelli 1526–1572.

17th: Galileo 1564–1642, Kepler 1571–1630, Newton 1642–1727 (Principia 1687); Mersenne; Huygens; Pascal; Fermat; Descartes (analytic geometry); and the three Bernoullis (Jakob, Johann & son Daniel).

18th: Euler 1707–1783 (student of Johann Bernoulli), d’Alembert 1717–1783, Lagrange 1736–1813, Laplace 1749–1827, Fourier 1768–1830, Gauss 1777–1855, Cauchy 1789–1857.

19th: Helmholtz 1821–1894, Kelvin 1824–1907, Kirchhoff 1824–1887, Riemann 1826–1866, Maxwell 1831–1879, Rayleigh 1842–1919, Heaviside 1850–1925.

20th: Hilbert 1862–1943, Einstein 1879–1955.

Following the integral theorems on scalar calculus are those on vector calculus, without which there could be no understanding of Maxwell’s equations. Of these, the fundamental theorem of vector calculus (aka the Helmholtz decomposition), Gauss’s law and Stokes’s theorem form the cornerstone of modern vector field analysis. These theorems allow one to connect the differential (point) and macroscopic (integral) relationships. For example, Maxwell’s equations may be written as either vector differential equations, as shown by Heaviside (along with Gibbs and Hertz),[1] or in integral form. It is helpful to place these two forms side-by-side, to fully appreciate their significance. To understand the differential (microscopic) view, one must understand the integral (macroscopic) view. These are presented in Section 5.7 (p. 223) and Fig. 5.7 (p. 226).

4.1 The beginning of modern mathematics

As outlined in Fig. 4.1, mathematics as we know it today began in the 16th to 18th centuries, arguably starting with Galileo, Descartes, Fermat, Newton, the Bernoulli family, and Euler. Galileo

[1] https://en.wikipedia.org/wiki/History_of_Maxwell%27s_equations


146 CHAPTER 4. STREAM 3A: SCALAR CALCULUS (11 LECTURES)

Figure 4.1: Final overview of the timeline for the four centuries from the 16th to the 20th CE, covering Bombelli to Einstein. Mozart and the US Civil War are indicated along the bottom, for orientation.

was formidable, due to his fame, fortune, and his "successful" stance against the powerful Catholic establishment. His creativity in scientific circles was certainly well known due to his many skills and accomplishments. Descartes and Fermat were at the forefront of merging algebra and geometry. While Fermat kept meticulous notebooks, he did not publish, and tended to be secretive. Thus Descartes's contributions were more widely acknowledged, but not necessarily deeper.

Regarding the development of calculus, much was yet to be done by Newton and Leibniz, using term-by-term integration of functions based on their Taylor series representation. This was a powerful technique, but as stated earlier, incomplete, because the Taylor series can only represent single-valued functions, within the RoC. But more importantly, Newton (and others) failed to recognize the powerful generalization to complex analytic functions. The first major breakthrough was Newton's publication of Principia (1687), and the second was Riemann (1851), advised by Gauss.

Following Newton's lead, the secretive and introverted behavior of the typical mathematician dramatically changed with the Bernoulli family (Fig. 4.1). The oldest brother Jacob taught his much younger brother Johann, who then taught his son Daniel. But Johann's star pupil was Euler. Euler first mastered all the tools and then published, with a prolificacy previously unknown.

Euler and the circular functions: The first major task was to understand the family of analytic circular functions, e^x, sin(x), cos(x), and log(x), a task begun by the Bernoulli family, but mastered by Euler. Euler sought relations between these many functions, some of which might not be thought of as being related, such as the log and sin functions. The connection that may "easily" be made is through their complex Taylor series representations (Eq. 3.28, p. 89). By the manipulation of the analytic series representations, the relationship between e^x and sin(x), cos(x) was precisely captured with the relation

    e^{jω} = cos(ω) + j sin(ω),    (4.1)

and its analytic inverse (Greenberg, 1988, p. 1135)

    tan^{−1}(z) = (1/2j) ln((1 + jz)/(1 − jz)).    (4.2)

The phase is φ(ω) = ∠z, from which one may compute the group delay

    τ_g(ω) = −∂φ(ω)/∂ω.
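The group delay definition is easy to verify numerically. A minimal sketch, assuming the hypothetical one-pole system H(jω) = 1/(1 + jω), whose phase φ(ω) = −tan^{−1}(ω) gives the closed-form group delay 1/(1 + ω²):

```python
import cmath

# Sketch: group delay tau_g = -d(phi)/d(omega) for the assumed one-pole
# system H(jw) = 1/(1 + jw); its phase is -atan(w), so tau_g = 1/(1 + w^2).
def phase(w):
    return cmath.phase(1.0 / complex(1.0, w))

def group_delay(w, dw=1e-6):
    # centered finite difference of -d(phi)/dw
    return -(phase(w + dw) - phase(w - dw)) / (2.0 * dw)

for w in (0.0, 0.5, 1.0, 2.0):
    # numerical estimate vs. closed form 1/(1 + w^2)
    print(w, group_delay(w), 1.0 / (1.0 + w * w))
```

The finite-difference estimate agrees with the closed form to many digits.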


Exercise: Starting from Eq. 4.1, derive Eq. 4.2. Solution: Let z(ω) = tan ω; then

    z(ω) = sin ω / cos ω = tan(ω) = −j (e^{jω} − e^{−jω})/(e^{jω} + e^{−jω}) = −j (e^{2jω} − 1)/(e^{2jω} + 1).    (4.3)

Solving this for e^{2jω} gives

    e^{2jω} = (1 + jz)/(1 − jz).    (4.4)

Taking the square root results in the analytic relationship between e^{jω} and tan ω:

    e^{jω} = ±√((1 + jz)/(1 − jz)).

Taking the log, and using the definition ω(z) = tan^{−1}(z), we obtain Eq. 4.2.

These equations are the basis of transmission lines (TL) and the Smith chart. Here z(ω) is the TL's input impedance and Eq. 4.4 is the reflectance.
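The log form of the arctangent can be spot-checked numerically. A sketch (the sample points z are arbitrary), confirming that applying tan to the right-hand side of Eq. 4.2 recovers z:

```python
import cmath

# Sketch: check arctan(z) = (1/2j) ln((1 + jz)/(1 - jz)) by verifying
# that tan() applied to the right-hand side recovers z.
def atan_via_log(z):
    return (1.0 / 2j) * cmath.log((1 + 1j * z) / (1 - 1j * z))

for z in (0.25, 0.5 + 0.25j, -0.3j, 1.0 - 0.5j):
    w = atan_via_log(z)
    print(z, "->", w, "tan(w) =", cmath.tan(w))  # tan(w) recovers z
```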

Figure 4.2: Colorized plots of ω(z) = tan^{−1}(z) (left panel, u + jv = atan(x + jy)) and ω(z) = (i/2) ln((1 − iz)/(1 + iz)) (right panel), over −2 ≤ x, y ≤ 2, verifying they are the same complex analytic function.

While many high school students memorize Euler's relation, it seems unlikely they appreciate the significance of complex analytic functions (Eq. 3.39, p. 94).

A brief history of complex analytic functions: Newton famously ignored imaginary numbers, and called them imaginary in a disparaging (pejorative) way. Given Newton's prominence, his view certainly must have keenly attenuated interest in complex algebra, even though it had been previously quantified by Bombelli in 1572, likely based on his serendipitous finding of Diophantus's book Arithmetic in the Vatican library.

Euler derived his relationships using real power series (i.e., real analytic functions). While Euler was fluent with j = √(−1), he did not consider functions to be complex analytic. That concept was first explored by Cauchy almost a century later. The missing link to the concept of complex analytic is the definition of the derivative with respect to the complex argument

    F′(s) = dF(s)/ds,    (4.5)

where s = σ + jω, without which the complex analytic Taylor coefficients may not be defined.

Euler did not appreciate the role of complex analytic functions because these were first discovered well after his death (1783) by Augustin-Louis Cauchy (1789–1857), and extended by Riemann in 1851 (p. 138).


4.2 Lec 23: Fundamental theorems of scalar calculus

History of the fundamental theorem of scalar calculus: In some sense, the story of calculus begins with the fundamental theorem of calculus (FTC), also known generically as Leibniz's formula. The simplest integral is the length of a line, L = ∫₀^L dx. If we label a point on a line as x = 0 and wish to measure the distance to any other point x, we form the line integral between the two points. If the line is straight, this integral is simply the Euclidean length, given by the difference between the two ends (Eq. 3.4, p. 109).

If F(χ) ∈ R describes a height above the line χ ∈ R, then

    f(x) − f(0) = ∫₀^x F(χ) dχ    (4.6)

may be viewed as defining the anti-derivative of F(χ). Here χ is a dummy variable of integration. Thus the area under F(χ) only depends on the difference of the anti-derivative evaluated at the end points. It makes intuitive sense to view f(x) as the anti-derivative of F(χ).

This property of the area as an integral over an interval, depending only on the end points, has important consequences in physics in terms of conservation of energy, allowing for important generalizations. For example, as long as χ ∈ R, one may let F(χ) ∈ C with no loss of generality, due to the linear property (P1, p. 133) of the integral.
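Equation 4.6 is easily demonstrated numerically. A minimal sketch, assuming F(χ) = cos(χ), so that the anti-derivative is f = sin:

```python
import math

# Sketch of Eq. 4.6: the area under F(chi) = cos(chi) from 0 to x equals
# f(x) - f(0), where f = sin is the anti-derivative of F.
def integrate(F, a, b, n=10000):
    # trapezoidal rule on [a, b]
    h = (b - a) / n
    return h * (0.5 * (F(a) + F(b)) + sum(F(a + k * h) for k in range(1, n)))

x = 1.3
print(integrate(math.cos, 0.0, x), math.sin(x) - math.sin(0.0))  # the two agree
```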

If f(x) is analytic (Eq. 3.27, p. 89), then

    F(x) = d/dx f(x)    (4.7)

is an exact real differential. It follows that F(x) is analytic. This is known as the fundamental theorem of (real) calculus (FTC). Thus Eq. 4.7 may be viewed as an exact real differential. This is easily shown by evaluating

    d/dx f(x) = lim_{δ→0} [f(x + δ) − f(x)]/δ = F(x),

starting from the anti-derivative, Eq. 4.6. If f(x) is not analytic then the limit may not exist, so this is a necessary condition.

There are many important variations on this very basic theorem (i.e., Sect. 4, p. 145). For example, the limits could depend on time. Also, when taking Fourier transforms, the integrand depends on both time t ∈ R and frequency ω ∈ R via a complex exponential "kernel" function e^{±jωt} ∈ C, which is analytic in both t and ω.

4.2.1 The fundamental theorems of complex calculus

The fundamental theorem of complex calculus (FTCC) states (Greenberg, 1988, p. 1197) that for any complex analytic function F(s) ∈ C with s = σ + jω ∈ C,

    f(s) − f(s_o) = ∫_{s_o}^{s} F(ζ) dζ.    (4.8)

Equations 4.6 and 4.8 differ because the path of the integral is complex: the line integral is over s ∈ C rather than a real integral over χ ∈ R. The FTCC states that the integral only depends on the end points, since

    F(s) = d/ds f(s).    (4.9)

Comparing the exact differentials Eq. 4.5 (FTCC) and Eq. 4.7 (FTC), we see that f(s) ∈ C must be complex analytic, and have a Taylor series in powers of s ∈ C. It follows that F(s) is also complex analytic.
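The path independence claimed by the FTCC can be spot-checked numerically. A sketch, assuming the analytic integrand F(s) = s² and two hypothetical paths from 0 to 1 + 1j; both must equal f(1 + 1j) − f(0), with anti-derivative f(s) = s³/3:

```python
# Sketch of the FTCC (Eq. 4.8): integrating the analytic F(s) = s^2 along
# two different paths from 0 to 1+1j gives the same answer, (1+1j)^3 / 3.
def line_integral(F, pts, n=1000):
    # midpoint rule along a piecewise-linear path through pts
    total = 0j
    for a, b in zip(pts[:-1], pts[1:]):
        h = (b - a) / n
        total += h * sum(F(a + (k + 0.5) * h) for k in range(n))
    return total

F = lambda s: s * s
path1 = [0, 1, 1 + 1j]     # along the real axis, then up
path2 = [0, 1j, 1 + 1j]    # up the imaginary axis, then across
print(line_integral(F, path1), line_integral(F, path2), (1 + 1j) ** 3 / 3)
```

Both numerical results match the anti-derivative difference, as the FTCC requires.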


Complex analytic functions: The definition of a complex analytic function F(s) of s ∈ C is that the function may be expanded in a Taylor series (Eq. 3.38, p. 94) about an expansion point s_o ∈ C. This definition follows the same logic as the FTC. Thus we need a definition for the coefficients c_n ∈ C, which most naturally follow from Taylor's formula

    c_n = (1/n!) dⁿF(s)/dsⁿ |_{s=s_o}.    (4.10)

The requirement that F(s) have a Taylor series naturally follows by taking derivatives with respect to s at s_o. The problem is that both integration and differentiation of functions of the complex Laplace frequency s = σ + jω have not yet been defined.

Thus the question is: What does it mean to take the derivative of a function F(s) ∈ C, s = σ + jω ∈ C, with respect to s, where s defines a plane rather than a real line? We learned how to form the derivative on the real line. Can the same derivative concept be extended to the complex plane?

The answer is affirmative. The question may be resolved by applying the rules of the real derivative when defining the derivative in the complex plane. However, for the complex case, there is an issue regarding direction. Given any analytic function F(s), is the partial derivative with respect to σ different from the partial derivative with respect to ω? For complex analytic functions, the FTCC states that the integral is independent of the path in the s plane. Based on the chain rule, the derivative must also be independent of direction at s_o: if the integral of a function of a complex variable is to be independent of the path, the derivative of that function with respect to its complex argument must be independent of the direction. This follows from Taylor's formula, Eq. 4.10, for the coefficients of the complex analytic series.

The Cauchy-Riemann conditions: The FTC defines the area as an integral over a real differential (dx ∈ R), while the FTCC relates an integral over a complex function F(s) ∈ C, along a complex interval (i.e., path) (ds ∈ C). For the FTC, the area under the curve only depends on the end points of the anti-derivative f(x). But what is the meaning of an "area" along a complex path? The Cauchy-Riemann conditions provide the answer.

Table 4.1: Summary of the fundamental theorems of integral calculus, each of which deals with integration. There are at least two main theorems related to scalar calculus, and three more for vector calculus.

Name | Mapping | p. | Description
Leibniz (FTC) | R¹ → R⁰ | 148 | Area under a real curve
Cauchy (FTCC) | C¹ → R⁰ | 148 | Area under a complex curve
Cauchy's theorem | C¹ → C⁰ | 165 | Closed integral over an analytic region is zero
Cauchy's integral formula | C¹ → C⁰ | 166 | Fundamental theorem of complex integral calculus
Residue theorem | C¹ → C⁰ | 166 | Residue integration

4.2.2 Lec 24: Cauchy-Riemann conditions

For the integral of Z(s) = R(σ, ω) + jX(σ, ω) to be independent of the path, the derivative of Z(s) must be independent of the direction of the derivative. As we show next, this leads to a pair of equations known as the Cauchy-Riemann conditions. This is an important generalization of Eq. 1.1 (p. 21), which goes from real integration (x ∈ R) to complex integration (s ∈ C) based on lengths, thus on area.

To define

    d/ds Z(s) = d/ds [R(σ, ω) + jX(σ, ω)],

take partial derivatives of Z(s) with respect to σ and jω, and equate them:

    ∂Z/∂σ = ∂R/∂σ + j ∂X/∂σ ≡ ∂Z/∂(jω) = (1/j) ∂R/∂ω + ∂X/∂ω.


This says that a horizontal derivative, with respect to σ, is equivalent to a vertical derivative, with respect to jω. Taking the real and imaginary parts gives the two equations

    CR-1: ∂R(σ, ω)/∂σ = ∂X(σ, ω)/∂ω  and  CR-2: ∂R(σ, ω)/∂ω = −∂X(σ, ω)/∂σ,    (4.11)

known as the Cauchy-Riemann (CR) conditions. The j cancels in CR-1, but introduces a j² = −1 in CR-2. They may also be written in polar coordinates (s = re^{jθ}) as

    ∂R/∂r = (1/r) ∂X/∂θ  and  ∂X/∂r = −(1/r) ∂R/∂θ.

If you are wondering what would happen if we took a derivative at 45 degrees, we need only multiply the function by e^{jπ/4}. But doing so will not change the derivative. Thus we may take the derivative in any direction by multiplying by e^{jθ}, and the CR conditions will not change.
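The CR conditions may be verified by finite differences. A sketch, assuming the analytic test function Z(s) = s² + 1 and an arbitrary evaluation point:

```python
# Sketch of the Cauchy-Riemann conditions (Eq. 4.11):
# CR-1: dR/dsigma = dX/domega, CR-2: dR/domega = -dX/dsigma,
# checked by centered finite differences for Z(s) = s^2 + 1.
def Z(sigma, omega):
    s = complex(sigma, omega)
    w = s * s + 1
    return w.real, w.imag      # R(sigma, omega), X(sigma, omega)

d = 1e-6
sg, om = 0.7, 1.2              # arbitrary evaluation point
dR_dsig = (Z(sg + d, om)[0] - Z(sg - d, om)[0]) / (2 * d)
dX_dome = (Z(sg, om + d)[1] - Z(sg, om - d)[1]) / (2 * d)
dR_dome = (Z(sg, om + d)[0] - Z(sg, om - d)[0]) / (2 * d)
dX_dsig = (Z(sg + d, om)[1] - Z(sg - d, om)[1]) / (2 * d)

print(dR_dsig, dX_dome)        # CR-1: the two agree
print(dR_dome, -dX_dsig)       # CR-2: the two agree
```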

The CR conditions are necessary conditions that the integral of Z(s), and thus its derivative, be independent of the path, expressed in terms of conditions on the real and imaginary parts of Z. This is a very strong condition on Z(s), which follows assuming that Z(s) may be written as a Taylor series in s:

    Z(s) = Z_0 + Z_1 s + (1/2) Z_2 s² + ···,    (4.12)

where Z_n ∈ C are complex constants given by the Taylor series formula (Eq. 4.10, p. 149). As with the real Taylor series, there is the convergence condition |s| < 1, called the radius of convergence (RoC). This is an important generalization of the region of convergence (RoC) for real s = x.

Every function that may be expressed as a Taylor series in s − s_o about point s_o ∈ C is said to be complex analytic at s_o. This series, which must be single valued, is said to converge within a radius of convergence (RoC). This highly restrictive condition has significant physical consequences. For example, every impedance function Z(s) obeys the CR conditions over large regions of the s plane, including the entire right half-plane (RHP) (σ > 0). This condition is summarized by the Brune condition ℜZ(σ > 0) ≥ 0 (Section 4.3, Eq. 4.26, p. 158). When the CR condition is generalized to volume integrals, it is called Green's theorem, used heavily in the solution of boundary value problems in engineering and physics (Kusse and Westwig, 2010). Sections 4 (p. 145) and 5 (p. 187) depend heavily on these concepts.

We may merge these equations into a pair of second-order equations by taking a second round of partials. Specifically, eliminating the real part R(σ, ω) of Eq. 4.11 gives

    ∂²R(σ, ω)/∂σ∂ω = ∂²X(σ, ω)/∂ω² = −∂²X(σ, ω)/∂σ²,    (4.13)

which may be compactly written as ∇²X(σ, ω) = 0. Eliminating the imaginary part gives

    ∂²X(σ, ω)/∂ω∂σ = ∂²R(σ, ω)/∂σ² = −∂²R(σ, ω)/∂ω²,    (4.14)

which may be written as ∇²R(σ, ω) = 0.

In summary, for a function Z(s) to be complex analytic, the derivative dZ/ds must be independent of direction (path), which requires that the real and imaginary parts of the function obey Laplace's equation, i.e.,

    CR-3: ∇²R(σ, ω) = 0  and  CR-4: ∇²X(σ, ω) = 0.    (4.15)

The CR equations are easier to work with because they are first-order, but the physical intuition is best understood by noting two facts: (1) the derivative of a complex analytic function is independent of its direction, and (2) the real and imaginary parts of the function both obey Laplace's equation. Such functions are known as harmonic functions.2

2When the function is the ratio of two polynomials, as in the case of the Brune impedance, it is also related to Möbius transformations, also known as bi-harmonic operators.
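CR-3 and CR-4 may likewise be checked numerically; a sketch, assuming the analytic test function Z(s) = e^s, whose parts are R = e^σ cos(ω) and X = e^σ sin(ω):

```python
import cmath

# Sketch of CR-3 and CR-4: for the analytic function Z(s) = exp(s), both
# R = Re{Z} and X = Im{Z} satisfy Laplace's equation.
def part(sigma, omega, which):
    w = cmath.exp(complex(sigma, omega))
    return w.real if which == "R" else w.imag

def laplacian(which, sigma, omega, h=1e-4):
    # five-point stencil for (d2/dsigma2 + d2/domega2)
    f = lambda a, b: part(a, b, which)
    return (f(sigma + h, omega) + f(sigma - h, omega)
            + f(sigma, omega + h) + f(sigma, omega - h)
            - 4.0 * f(sigma, omega)) / (h * h)

print(laplacian("R", 0.3, 0.9), laplacian("X", 0.3, 0.9))  # both ~ 0
```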


As we shall see in the next few lectures, complex analytic functions must be smooth, since every analytic function may be differentiated an infinite number of times within the RoC. The magnitude must attain its maximum and minimum on the boundary. For example, when you stretch a rubber sheet over a jagged frame, the height of the rubber sheet obeys Laplace's equation. Nowhere can the height of the sheet rise above or below its value at the boundary.

Harmonic functions define conservative fields, which means that energy (like a volume or area) is conserved. The work done in moving a mass from a to b in such a field is conserved. If you return the mass from b back to a, the energy is retrieved, and zero net work has been done.


4.3 Lec 25: Complex Analytic functions and Brune Impedance

It is rarely stated that the variable that we are integrating over, either x (space) or t (time), is real (x, t ∈ R), since that fact is implicit, due to the physical nature of the formulation of the integral. But this intuition must be refined once complex numbers are included, with s ∈ C, where s = σ + jω.

That time and space are real variables is more than an assumption: it is a requirement that follows from the real order property. Real numbers have order. For example, if t = 0 is now (the present), then t < 0 is the past and t > 0 is the future. The order property of time and space allows one to order these along a real axis. To have time travel, time and space would need to be complex (they are not), since if the space axis were complex, as in frequency s, the order property would be invalid. It follows that if we require order, time and space must be real (t, x ∈ R). Interestingly, it was shown by d'Alembert (1747) that time and space are related by the pure delay due to the wave speed c_o. To obtain a solution to the governing wave equation, which d'Alembert first proposed for sound waves, x and t may be combined as

    ζ_± = t ± x/c_o,

where c_o [m/s] is the phase velocity of the waves. The d'Alembert solution to the wave equation, describing waves on a string under tension, is

    u(x, t) = f(t − x/c_o) + g(t + x/c_o),    (4.22)

which describes the transverse velocity (or displacement) of two independent waves f(ζ_−), g(ζ_+) on the string, which represent forward and backward traveling waves.3 For example, starting with a string at rest, if one displaces the left end, at x = 0, by a step function u(t), then that step displacement will propagate to the right as u(t − x/c_o), arriving at location x_o [m] at time t_o = x_o/c_o [s]. Before this time, the string will not move to the right of the wave-front at x_o [m], and after t_o [s] it will have displacement 1. Since the wave equation obeys superposition (postulate P1, p. 133), it follows that the "plane-wave" eigen-functions of the wave equation for x, k ∈ R³ are given by

    ψ_±(x, t) = δ(t ∓ k·x) ↔ e^{st±k·x},    (4.23)

where |k| = 2π/|λ| = ω/c_o is the wave number, |λ| is the wavelength, and s = σ + jω the Laplace frequency.
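Equation 4.22 may be checked by finite differences. A sketch, with arbitrary pulse shapes f, g and wave speed, verifying that the d'Alembert form satisfies the wave equation u_tt = c² u_xx:

```python
import math

# Sketch: d'Alembert's solution u(x,t) = f(t - x/c) + g(t + x/c) satisfies
# the wave equation u_tt = c^2 u_xx; f, g, and c below are arbitrary choices.
c = 343.0
f = lambda z: math.exp(-z * z)     # forward-traveling pulse
g = lambda z: math.sin(z)          # backward-traveling wave

def u(x, t):
    return f(t - x / c) + g(t + x / c)

h = 1e-3
x, t = 1.0, 0.5
u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / (h * h)
u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
print(u_tt, c * c * u_xx)          # the two sides agree
```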

When propagation losses are considered, we must replace the wave number k with a complex analytic vector wave number κ(s) = k_r(s) + jk_i(s), denoted either the complex propagation function, the propagation function, or the dispersion relation. The vector complex propagation function is a key generalization of the scalar wave number k = 2π/λ.

Forms of loss include viscosity and radiation. Important examples include acoustic plane waves, electromagnetic wave propagation, antenna theory, and the more complex case of 3D electron waves propagating in crystals (i.e., silicon), where electrons and EM waves are in a state of equilibrium. Electrons and photons are simply different EM states, where κ(x, s) describes the crystal's dispersion relations as functions of both frequency and direction, famously known as Brillouin zones. Dispersion is a property of the medium such that the wave velocity is a function of frequency and direction, as in silicon.4 Informed discussions on the history of this topic may be found in Brillouin (1953).

4.3.1 Complex impedance functions

Conservation of energy (or power) is a cornerstone of modern physics. It may first have been under consideration by Galileo Galilei (1564–1642) and Marin Mersenne (1588–1648). Today the question is not whether it is true, but why. Specifically, what is the physics behind conservation of energy?

3d'Alembert's solution is valid even for functions that are not differentiable, such as δ(t − x/c_o).
4In case you missed it, I'm suggesting that photons (propagating waves) and electrons (evanescent waves) are different wave "states." The difference is the medium, which determines the dispersion relation (Papasimakis et al., 2018).


Surprisingly, the answer is straightforward, based on its definition and the properties of impedance. Recall that the power is the product of the force and the flow, and the impedance is their ratio.

The power is given by the product of two variables, sometimes called conjugate variables: the force and the flow. In electrical terms these are voltage (force) (v(t) ↔ V(ω)) and current (flow) (i(t) ↔ I(ω)); thus the electrical power at any instant of time is

    P(t) = v(t) i(t).    (4.24)

The total energy E(t) is the integral of the power, since P(t) = dE/dt. Thus if we start with all the elements at rest (no currents or voltages), then the energy as a function of time is always positive:

    E(t) = ∫₀ᵗ P(τ) dτ ≥ 0,    (4.25)

and is simply the total energy applied to the network (Van Valkenburg, 1964a, p. 376). Since the voltage and current are related by either an impedance or an admittance, conservation of energy depends on the properties of the impedance. From Ohm's law and P1 (every impedance is causal),

    v(t) = z(t) ⋆ i(t) = ∫_{τ=0}^{t} z(τ) i(t − τ) dτ ↔ V(s) = Z(s) I(s).

Needs work!

Example: Let i(t) = δ(t). Then |w|²(τ) = i(t) ⋆ i(t) = δ(τ). Thus

    I_xx(t) = ∫_{τ=0}^{t} z(τ) |w|²(τ) dτ = ∫_{τ=0}^{t} z(τ) δ(τ) dτ = ∫₀ᵗ z(τ) dτ.

The Brune impedance always has the form z(t) = r_o δ(t) + ζ(t). The surge impedance is defined as

    r_o = ∫_{0⁻}^{∞} z(t) dt.

The integral of the reactive part ζ(t) is always zero, since the reactive part stores, but cannot dissipate, energy.

Perhaps easier to visualize is working in the frequency domain, where the total energy, equal to the integral of the real part of the power, is

    (1/s) ℜ{V I*} = (1/2s)(V*I + V I*) = (1/2s)(Z*I*I + Z I I*) = (1/s) ℜ{Z(s)} |I|² ≥ 0.

Formally this is related to a positive definite operator, where the positive resistance forces the definiteness, being sandwiched between the current and its conjugate.
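The non-negativity of Eq. 4.25 for a passive network can be demonstrated by direct simulation. A sketch, assuming a series R-C branch (arbitrary element values) driven by i(t) = sin(t), where the running energy splits into dissipation plus stored energy q²/2C ≥ 0:

```python
import math

# Sketch: for a passive series R-C branch driven by i(t) = sin(t), the
# running energy E(t) = integral of v(t) i(t) dt never goes negative.
# Element values R, C are arbitrary for this check.
R, C, dt = 1.0, 0.5, 1e-3
q = 0.0            # capacitor charge
E = 0.0            # energy delivered to the network so far
min_E = 0.0
t = 0.0
while t < 20.0:
    i = math.sin(t)
    v = R * i + q / C          # v = R i + (1/C) * integral of i
    E += v * i * dt            # accumulate P(t) dt
    min_E = min(min_E, E)
    q += i * dt
    t += dt
print(E, min_E)    # E > 0, and min over time never dips below zero
```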

In conclusion, conservation of energy is totally dependent on the properties of the impedance. Thus

one of the most important and obvious applications of complex functions of a complex variable is the

impedance function. This seems to be the ultimate example of the FTC, applied to z(t), in the name of

conservation of energy.

Poles and zeros of PR functions must be first degree: We conjecture that this proof also requires that the poles and the zeros of the impedance function be simple (only first degree). Second degree poles would have a reactive "secular" response of the form h(t) = t sin(ω_k t + φ)u(t), and these terms would not average to zero, depending on the phase, as is required of an impedance. As a result, only first degree poles would be possible.5 Furthermore, when the impedance is the ratio of two polynomials, where the lower degree polynomial is the derivative of the higher degree one, the poles and zeros must alternate. This is a well-known property of the Brune impedance that has never been adequately

5Secular terms result from second degree poles since u(t) ⋆ u(t) = t u(t) ↔ 1/s².


explained except for very special cases (Van Valkenburg, 1964b, p. 104-107). I suspect that no one has ever reported an impedance having second degree poles and zeros, as that would be a rare impedance. Network analysis books never report 2nd degree poles and zeros in their impedance functions. Nor has there ever been any guidance as to where the poles and zeros might lie in the left-hand s plane. Understanding the exact relationships between pairs of poles and zeros, to assure that the real part of the impedance is non-negative, would resolve a longstanding unsolved problem (Van Valkenburg, 1964b).

Every impedance must obey conservation of energy (P3): The impedance function Z(s) has resistance R and reactance X as functions of complex frequency s = σ + jω. From the causality postulate (P1) of Sections 3.7.1 and K (p. 289), z(t < 0) = 0. Every impedance is defined by a Laplace transform pair

    z(t) ↔ Z(s) = R(σ, ω) + jX(σ, ω),

with R, X ∈ R.

According to Postulate P3 (Sect. 3.7.1, p. 133), a system is passive if it does not contain a power source. Drawing power from an impedance violates conservation of energy. This property is also called positive-real, which was defined by Brune (1931b,a):

    ℜZ(σ ≥ 0) ≥ 0.    (4.26)

Positive-real systems cannot draw more power than is stored in the impedance.6 The region σ ≤ 0 is called the left half s plane (LHP), and the complementary region σ > 0 is called the right half s plane (RHP). According to the Brune condition, the real part of every impedance must be non-negative in the RHP.

It is easy to construct examples of second-order poles or zeros in the RHP such that P3 is violated. Thus P3 implies that the impedance may not have more than simple (first-order) poles and zeros, strictly in the LHP. But there is yet more: these poles and zeros in the LHP must have order, to meet the minimum phase condition. This minimum phase condition is easily stated,

    ∠Z(s) < ∠s,

but difficult to prove.7 There seems to be no proof that second-order poles and zeros (e.g., second-order roots) are not allowed. However such roots must violate the requirement that the poles and zeros alternate on the σ = 0 axis, which follows from P3. In the complex plane the concept of "alternate" is not defined (complex numbers cannot be ordered). What has been proved (i.e., Foster's reactance theorem8) is that if the poles are on the real or imaginary axis, they must alternate, leading to simple poles and zeros (Van Valkenburg, 1964a). The restriction on poles is sufficient, but not necessary, since Z(s) = 1/√s is a positive-real (PR) impedance, but is less than a first-degree pole (Kim and Allen, 2013). The corresponding condition in the LHP, and its proof, remains elusive (Van Valkenburg, 1964a).

As an example, a series resistor R_o and capacitor C_o have an impedance given by (Table I.2, p. 282)

    Z(s) = R_o + 1/(sC_o) ↔ R_o δ(t) + (1/C_o) u(t) = z(t),    (4.27)

with constants R_o, C_o ∈ R > 0. In mechanics an impedance composed of a dash-pot (damper) and a spring has the same form. A resonant system has an inductor, a resistor and a capacitor, with an impedance given by (Table I.1, p. 281)

    Z(s) = sC_o / (1 + sC_oR_o + s²C_oM_o) ↔ C_o (d/dt)(c_+ e^{s_+ t} + c_− e^{s_− t}) = z(t),    (4.28)

6Does this condition hold for the LHP σ < 0? It does for Eq. 4.28.
7As best I know, this is an open problem in network theory (Brune, 1931a; Van Valkenburg, 1964a).
8https://en.wikipedia.org/wiki/Foster%27s_reactance_theorem


whose denominator is a second degree polynomial with two complex resonant frequencies s_±. When R_o > 0 these roots are in the left half s plane, with z(t) ↔ Z(s).

Systems (networks) containing many elements, and transmission lines, can be much more complicated, yet still have a simple frequency domain representation. This is the key to understanding how these physical systems work, as will be described below.
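The location of the roots s_± is easily checked. A sketch, with hypothetical element values, solving the quadratic denominator of Eq. 4.28:

```python
import cmath

# Sketch: the resonant frequencies s+- of Eq. 4.28 are the roots of
# 1 + s*Co*Ro + s^2*Co*Mo = 0; for Ro > 0 both lie in the left half
# s plane. The element values below are hypothetical.
Co, Ro, Mo = 1e-6, 50.0, 1e-3
a, b, c = Co * Mo, Co * Ro, 1.0
disc = cmath.sqrt(b * b - 4.0 * a * c)
s_plus = (-b + disc) / (2.0 * a)
s_minus = (-b - disc) / (2.0 * a)
print(s_plus, s_minus)   # both roots have negative real parts
```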

Complex analytic functions: To solve a differential equation, or to integrate a function, Newton used the Taylor series to integrate one term at a time. However, he only used real functions of a real variable, due to a fundamental lack of appreciation of the complex analytic function. This same method is how one finds solutions to scalar differential equations today, but using an approach that makes the solution method less obvious. Rather than working directly with the Taylor series, today we use the complex exponential, since the complex exponential is an eigen-function of the derivative:

    d/dt e^{st} = s e^{st}.

Since e^{st} may be expressed as a Taylor series, having coefficients c_n = 1/n!, in some real sense the modern approach is a compact way of doing what Newton did. Thus every linear constant-coefficient differential equation in time may be simply transformed into a polynomial in complex Laplace frequency s, by looking for solutions of the form A(s)e^{st}, transforming the differential equation into a polynomial A(s) in complex frequency. For example,

    d/dt f(t) + a f(t) ↔ (s + a) F(s).

The root of A(s_r) = s_r + a = 0 is the eigenvalue of the differential equation. A powerful tool for understanding the solutions of differential equations, both scalar and vector, is to work in the Laplace frequency domain. The Taylor series has been replaced by e^{st}, transforming Newton's real Taylor series into the complex exponential eigen-function. In some sense, these are the same method, since

    e^{st} = Σ_{n=0}^{∞} (st)ⁿ/n!.    (4.29)
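Both the series identity of Eq. 4.29 and the eigen-function property can be confirmed numerically. A sketch, with an arbitrary sample frequency s:

```python
import cmath, math

# Sketch of Eq. 4.29: partial sums of sum_n (st)^n / n! converge to exp(st),
# Newton's term-by-term picture of the modern eigen-function e^{st}.
s = complex(-1.0, 2.0)    # an arbitrary Laplace frequency sigma + j*omega
t = 0.8
partial = sum((s * t) ** n / math.factorial(n) for n in range(30))
print(partial, cmath.exp(s * t))   # the two agree

# And e^{st} is an eigen-function of d/dt, with eigenvalue s:
h = 1e-6
deriv = (cmath.exp(s * (t + h)) - cmath.exp(s * (t - h))) / (2 * h)
print(deriv / cmath.exp(s * t))    # ratio recovers s
```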

Taking the derivative with respect to time gives

    d/dt e^{st} = s e^{st} = s Σ_{n=0}^{∞} (st)ⁿ/n!,    (4.30)

which is also complex analytic. Thus if the series for F(s) is valid (i.e., it converges), then its derivative is also valid. This was a very powerful concept, exploited by Newton for real functions of a real variable, and later by Cauchy and Riemann for complex functions of a complex variable. The key question here is: Where does the series fail to converge? If it does not converge, the representation fails. This is the main message behind the FTCC (Eq. 4.8).

The FTCC (Eq. 4.5) is formally the same as the FTC (Eq. 4.7) (Leibniz's formula), the key (and significant) difference being that the argument of the integrand is s ∈ C. Thus this integration is a line integral in the complex plane. One would naturally assume that the value of the integral depends on the path of integration.

But, according to the FTCC, it does not. In this regard the FTCC is indistinguishable from the FTC, and the reasoning is the same. If F(s) = df(s)/ds is complex analytic (i.e., has a power series f(s) = Σ_k c_k s^k, with f(s), c_k, s ∈ C), then it may be integrated, and the integral does not depend on the path. At first blush, this is sort of amazing. The key is that F(s) and f(s) must be complex analytic, which means they are differentiable. This all follows from the Taylor series formula, Eq. 4.10 (p. 149), for the coefficients of the complex analytic series. For Eq. 4.8 to hold, the derivatives must be independent of the direction, as discussed in Section 4.2.2. The concept of a complex analytic function therefore


has eminent consequences, in the form of several key theorems on complex integration discovered by

Cauchy (c1820).

The use of the complex Taylor series generalizes the functions it describes, with unpredictable con-

sequences, as nicely shown by the domain coloring diagrams presented in Section 3.8 (p. 135). Cauchy’s

tools of complex integration were first exploited in physics by Sommerfeld (1952) to explain the onset

transients in waves, as explained in detail in Brillouin (1960, Chap. 3).

Up to 1910, when Sommerfeld first published his results using complex analytic signals and saddle

point integration in the complex plane, there was a poor understanding of the implications of the causal

wave-front. It would be reasonable to say that his insights changed our understanding of wave propaga-

tion, for both light and sound. Sadly this insight has not been fully appreciated, even to this day. If you

question my summary, please read Brillouin (1960, Chap. 1).

The full power of the complex analytic function was first appreciated by Bernhard Riemann (1826-
1866) in his University of Göttingen PhD thesis of 1851, under the tutelage of Carl Friedrich Gauss

(1777-1855), and drawing heavily on the work of Cauchy.

The key definition of a complex analytic function is that it has a Taylor series representation over a

region of the complex frequency plane s = σ + jω, that converges in a region of convergence (RoC)

about the expansion point, with a radius determined by the nearest pole of the function. A further

surprising feature of all analytic functions is that within the RoC, wherever the derivative is nonzero, the inverse of that function also has a complex analytic expansion. Thus given w(s), one may also determine s(w) to any desired accuracy, critically depending on the RoC.
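As a minimal illustration of an RoC (my example, not from the text): expanding about s = 0, the geometric series

```latex
\frac{1}{1+s} \;=\; \sum_{n=0}^{\infty} (-s)^n \;=\; 1 - s + s^2 - s^3 + \cdots, \qquad |s| < 1,
```

converges only inside the disk |s| < 1, whose radius is set by the pole at s = −1.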

4.3.2 Lec 26: Multi-valued functions

In the field of mathematics there seems to have been a tug-of-war regarding the basic definition of the

concept of a function. The accepted definition today seems to be a single-valued mapping from the

domain to the codomain (or range). This makes the discussion of multi-valued functions somewhat

tedious. In 1851 Riemann (working with Gauss) seems to have resolved this problem for the natural set

of multi-valued functions by introducing the concept of single-valued sheets, delineated by branch-cuts.

Two simple yet important examples of multi-valued functions are the circle z² = x² + y² and

w = log(z). For example, assuming z is the radius of the circle, solving for y(x) gives the double-

valued function

y(x) = \pm\sqrt{z^2 - x^2}.

If we accept the modern definition of a function as the mapping from one set to a second, then y(x) is not

a function, or even two functions. For example, what if x > z? Or worse, what if z = 2 with |x| < 1?

Riemann's construction, using branch cuts for multivalued functions, resolves all these difficulties (as
best I know).
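A small Python sketch of the two branches (my names; `cmath.sqrt` returns the principal value):

```python
import cmath

z = 2.0  # the radius in x^2 + y^2 = z^2

def y_branches(x):
    """The two values y(x) = +/- sqrt(z^2 - x^2), as complex numbers."""
    w = cmath.sqrt(z * z - x * x)
    return +w, -w

# Inside the circle (|x| < z) the two branches are real and distinct:
yp, ym = y_branches(1.0)
assert abs(yp - 3**0.5) < 1e-12 and ym == -yp

# For x > z there is no real solution: the two values become pure imaginary.
yp, ym = y_branches(3.0)
assert abs(yp.real) < 1e-12 and abs(yp.imag**2 - 5.0) < 1e-12
```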

To proceed, we need definitions and classifications of the various types of complex singularities:

1. Poles of degree 1 are called simple poles. Their amplitude is called the residue (e.g. α/s has

residue α). Simple poles are special (Eq. 4.35, p. 166)9 as they play a key role in mathematical

physics, since their inverse Laplace transform defines a causal eigen-function.

2. When the numerator and denominator of a rational function (i.e., ratio of two polynomials) have

a common root (i.e., factor), that root is said to be removable.

3. A singularity that is not (1) removable, (2) a pole or (3) a branch point is called essential.

4. A complex analytic function (except for isolated poles) is called meromorphic.10 Meromorphic

functions can have any number of poles, even an infinite number. The poles need not be simple.

9https://en.wikipedia.org/wiki/Pole_(complex_analysis)

10https://en.wikipedia.org/wiki/Meromorphic_function


4.3. LECTURE 25: COMPLEX ANALYTIC FUNCTIONS AND BRUNE IMPEDANCE 161

5. When the first derivative of a function Z(s) has a simple pole at a, then a is said to be a branch point of Z(s). An important example is the logarithmic derivative

\frac{d}{ds} \ln(s-a)^{\alpha} = \frac{\alpha}{s-a}, \qquad \alpha \in I.

However, the converse does not necessarily hold.

6. I am not clear about the interesting case of an irrational pole (α ∈ I). In some cases this may be simplified via the logarithmic derivative operation, as mentioned above.

More complex topologies are being researched today, and progress is expected to accelerate due to

modern computing technology.12 It is helpful to identify the physical meaning of these more complex

surfaces, to guide us in their interpretation and possible applications.13

Branch cuts: Up to this point we have only considered poles of degree α ∈ N, of the form 1/s^α.

The concept of a branch cut allows one to manipulate (and visualize) multi-valued functions, for which

α ∈ F. This is done by breaking each region into single-valued sheets, as shown in Fig. 4.3 (right). The

branch cut is a curve ∈ C that separates the various single-valued sheets of a multi-valued function. The

concepts of branch cuts, sheets and the extended plane were first devised by Riemann, working with

Gauss (1777-1855), as first described in his thesis of 1851. It was these three mathematical and geomet-

rical constructions that provided the deep insight into complex analytic functions, greatly extending the

important earlier work of Cauchy (1789-1857) on the calculus of complex analytic functions. For alter-

native helpful discussion of Riemann sheets and branch cuts, see Boas (1987, Section 29, pp. 221-225)

and (Kusse and Westwig, 2010).

Multi-valued functions: The best way to explore the complex mapping s → w(s) is to master the single-valued function s = w² and its double-valued inverse w(s) = √s, the square root function (Fig. 4.3). Single-valued functions such as w(s) = s² are relatively straightforward. Multi-valued functions require the concept of a branch cut, which is a technique to render the multiple values as single valued on each of several sheets. The sheets are labeled in the domain (s) plane by a sheet index k ∈ Z, while branch points and cuts are defined in the image (w) plane (a.k.a., the range). It is important to understand that the path of a branch cut is not unique, and may be moved; however, branch points are unique and thus not movable.

The multi-valued nature of w(s) = √s is best understood by working with the function in polar coordinates. Let

s_k = r e^{j\theta} e^{j2\pi k},    (4.31)

where r = |s|, θ = ∠s ∈ R, and k ∈ Z is the sheet index. Furthermore let

w = \rho e^{j\psi} = \sqrt{r}\, e^{j\theta/2} e^{j\pi k},    (4.32)

where ρ = |w| and ψ = ∠w ∈ R.

Figure 4.3: Domain-coloring maps using Cartesian coordinates in the domain s = σ + jω and polar coordinates for the range w(σ, ω) = |w|e^{jψ}. The color intensity indicates the magnitude |s|, with black being |s| = 0 and bright (eventually white) indicating |s| → ∞. Left: the mapping w(s) = s² (u+jv = s.^2). Right: the mapping of the principal branch from −s to w(s) = −√−s (u+jv = -sqrt(-s)), i.e., the rotated inverse of s², generated by rotating w by 180°. The branch cut is on the ψ = 0 axis, with branch points at |w| = 0 and ∞.

The generic Cartesian coordinates are s = σ + jω and w(s) = u(σ, ω) + jv(σ, ω). For single-valued functions such as w(s) = s² (Fig. 4.3, left) there is no branch


12https://www.maths.ox.ac.uk/about-us/departmental-art/theory

13https://www.quantamagazine.org/secret-link-uncovered-between-pure-math-and-physics-20171201


cut, since ψ = 2θ. Note how the red color (θ = 0) appears twice in this mapping. For multi-valued functions a branch cut is required, typically along the negative v(σ, ω) axis (i.e., ψ = π), but it may be freely distorted, as seen by comparing the right panel of Fig. 4.3 with the right panel of Fig. 4.5.
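The location of the principal branch cut can also be seen numerically; a small Python sketch (`cmath` follows the common convention of a cut along the negative real axis):

```python
import cmath

# cmath.sqrt returns the principal value, with its branch cut along the
# negative real axis: crossing the cut flips the sign of w = sqrt(s).
eps = 1e-9
assert abs(cmath.sqrt(-1 + eps * 1j) - 1j) < 1e-6   # just above the cut
assert abs(cmath.sqrt(-1 - eps * 1j) + 1j) < 1e-6   # just below the cut

# Rotating the argument moves the cut: s -> sqrt(-s) is instead
# discontinuous across the POSITIVE real axis.
assert abs(cmath.sqrt(-(1 + eps * 1j)) + 1j) < 1e-6
assert abs(cmath.sqrt(-(1 - eps * 1j)) - 1j) < 1e-6
```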

Figure 4.4: Here we show the mapping for the square root function z = ±√x which has two single-

valued sheets of the x plane, corresponding to the two signs of the square root. The branch cut runs along the real axis, from z = 0 to ∞. A more comprehensive explanation is presented in the text. Figure from: https://en.wikipedia.org/wiki/Riemann_surface

Hue: By studying the output of zviz.m (p. 136) we may understand domain coloring. The domain angles ∠s run over −π < θ < π, with θ = 0 being red and ±π being green (between yellow and blue). The angle-to-hue map is shown in the left panel of Fig. 4.5. For w(s) = s² the angle ∠s is expanded by a factor of 2, since ψ = 2θ. For w(s) = √s the angle ∠s is compressed by a factor of 2, since ψ = θ/2. Thus the principal angle (k = 0, −180° < θ < 180°) maps to half the w plane (−90° < ψ < 90°), from purple to yellow, while the k = 1 branch maps to 90° < ψ < 270°. Note how the panel on the right of Fig. 4.3 matches the right half of the s plane (purple = −90°, yellow/green = +90°), while the middle panel above comes from the left side of s (green to purple). The center panel is green at −180° and purple at +180°, which matches the left panel at ±180°, respectively (i.e., e^{jπ}√s).

The principal (i.e., first) Riemann sheet of √s, corresponding to −π < θ < π (i.e., k = 0), is shown in the center of Fig. 4.5. This differs from the neighboring sheet (k = 1), shown on the right. Thus w(s) = √s is a multi-valued function of s, having two single-valued sheets.

The location of the branch cut may be moved by rotating the z coordinate system. For example, w(z) = ±√z and w(z) = ±√−z have different branch cuts, as may be easily verified using the Matlab/Octave commands j*zviz(z) and zviz(-z), as shown in Fig. 4.5. Since the cut may be moved, the function is analytic on the branch cut: if a Taylor series is formed on the branch cut, it will describe the function on the two different sheets. Thus the complex analytic series (i.e., the Taylor formula, Eq. 4.10) does not depend on the location of a branch cut, since it only describes the function uniquely (as a single-valued function), valid in its local region of convergence.

The second sheet (k = 1) (Fig. 4.5) picks up at θ = π and continues on to π + 2π = 3π. The first sheet maps the angle of w (i.e., ψ = ∠w = θ/2) from −π/2 < ψ < π/2 (w = √r e^{jθ/2}). This corresponds to u = ℜw(s) > 0. The second sheet maps π/2 < ψ < 3π/2 (i.e., 90° to 270°), which is ℜw = u < 0. In summary, twice around the s plane is once around the w(s) plane, because the angle is halved by the √s. This describes the multi-valued nature of the square root function.
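A Python sketch of the two sheets defined by Eqs. 4.31 and 4.32 (the helper name `sheet_sqrt` is mine):

```python
import cmath, math

def sheet_sqrt(s, k):
    """w_k = sqrt(r) e^{j theta/2} e^{j pi k}, per Eqs. 4.31-4.32."""
    r, theta = abs(s), cmath.phase(s)
    return math.sqrt(r) * cmath.exp(1j * (theta / 2 + math.pi * k))

s = 1 + 1j
w0 = sheet_sqrt(s, 0)   # principal sheet (k = 0), Re w > 0
w1 = sheet_sqrt(s, 1)   # second sheet    (k = 1), Re w < 0

# Both sheets square back to the same s ...
assert abs(w0**2 - s) < 1e-12 and abs(w1**2 - s) < 1e-12
# ... and differ by a sign, i.e., a 180-degree rotation of w:
assert abs(w0 + w1) < 1e-12
assert w0.real > 0 and w1.real < 0
```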


Figure 4.5: Colorized plots of s_k = |s|e^{jθ}e^{j2πk} and w_k(s) = √s_k = √|s| e^{jθ/2} e^{jπk}, as defined in polar coordinates by Eqs. 4.31 and 4.32. Left: color-map hue reference plane, u+jv = s. Center: sheet index k = 0 (u+jv = sqrt(s)), w(s) = √|s| e^{jθ/2} for −π < θ < +π and ψ = ∠w(s) = θ/2 between ±π/2 [rad]. Right: sheet index k = 1 (u+jv = exp(j*pi).*sqrt(s)), w(s) = e^{jπ} e^{jθ/2} √|s| for π < θ < 3π and π/2 < ψ < 3π/2 [rad]. The branch cut is at θ = ∠s = ±180° (π [rad]), where the hue of w changes abruptly from green to purple (center) and from blue back to green (right). Note how the hue matches between the center and right panels at the branch cut: in the center panel purple runs along −180° and along +180° on the right; likewise, green runs along +180° in the center and along −180° on the right. Thus w(s) = √s is analytic on the branch cut connecting the two sheets (k = 0 → k = 1).

This concept of analytic inverses becomes important only when the function is multi-valued. For

example, if w(s) = s² then s(w) = ±√w is multivalued. Riemann dealt with such extensions with the

concept of a branch-cut with multiple sheets, labeled by a sheet number. Each sheet describes an analytic

function (Taylor series) that converges within some RoC, having a radius out to the nearest pole. Thus

Riemann's branch cuts and sheets explicitly deal with the need to define unique single-valued inverses

of multi-valued functions. Since the square root function has two overlapping regions, corresponding to

the ± due to the radical, there must be two connected regions, sort of like mathematical Siamese twins:

distinct, yet the same.

Figure 4.3 shows the single-valued function w(s) = s² (left), and Fig. 4.5 (right) its inverse, the double-valued mapping s(w) = ±√w.

Branch cuts emanate from and terminate at branch points, defined as singularities (poles) that can even have fractional degree, as for example 1/√s, and terminate at one of the matching roots, which includes the possibility of ∞.14 For example, suppose that in the neighborhood of the pole at s_o the function is

f(s) = \frac{w(s)}{(s - s_o)^{k}},

where w, s, s_o ∈ C and k ∈ Q. When k = 1, s_o = σ_o + jω_o is a first-degree "simple pole," having degree 1 in the s plane, with residue w(s_o).15 Typically the order and degree are positive integers, but fractional degrees and orders are not uncommon in modern engineering applications (Kirchhoff, 1868; Lighthill, 1978). Here we shall allow both the degree and order to be fractional (∈ F). When k ∈ F ⊂ R, k = n/m is a real reduced fraction (namely GCD(n, m) = 1, i.e., n ⊥ m); this defines the degree of a fractional pole. In such cases there must be two sets of branch cuts, of degree n and m. For example, if k = 1/2, the singularity (branch cut) is of degree 1/2, and there are two Riemann sheets, as shown in Fig. 4.3 (p. 161). The analytic inverse of this function is s = w², as shown in the same figure.

Fractional-order Bessel function: An important example is the Bessel function LT pair

\delta(t) + \frac{1}{t} J_1(t)\, u(t) \leftrightarrow \sqrt{s^2 + 1},

14 This presumes that poles appear in pairs, one of which may be at ∞.
15 We shall refer to the order of a derivative, or differential equation, and the degree of a polynomial, as commonly used in engineering applications (https://en.wikipedia.org/wiki/Differential_equation#Equation_order).


which is related to the solution to the wave equation in two-dimensional cylindrical coordinates (Table

I.2, p. 282). Bessel functions are the solutions (i.e., eigen-functions) of guided acoustic waves in round

pipes, or surface waves on the earth (seismic waves), or waves on the surface of a pond (Table 5.2,

p. 207).

Figure 4.6: Colorized plots of two LT pairs. Left: u+jv = sqrt(pi./s), i.e., √(π/s) ↔ u(t)/√t. Right: u+jv = sqrt(s.^2+1), i.e., √(s² + 1) ↔ δ(t) + (1/t)J_1(t)u(t).

It is important to understand that the function is analytic on the branch cut, but not at the branch

point. One is free to move the branch cut (at will). It does not need to be on a line: it could be cut in

almost any connected manner, such as a spiral. The only rule is that it must start and stop at the matching

branch points, or at∞, which must have the same degree.

There are a limited number of possibilities for the degree: k ∈ Z or k ∈ F. If the degree is drawn from R but not from F, the pole cannot have a residue. According to the definition of the residue, a fractional degree k ∈ F (k ≠ 1) will not give a residue. But there remains open the possibility of generalizing the concept of the Riemann integral theorem to include k ∈ F. One way to do this is to use the logarithmic derivative, which renders fractional poles as simple poles with fractional residues.

If the singularity has an irrational degree (k ∈ I), the branch cut has the same “irrational degree.”

Accordingly there would be an infinite number of Riemann sheets, as in the case of the log function. An

example is k = π, for which

F(s) = \frac{1}{s^{\pi}} = e^{-\log(s^{\pi})} = e^{-\pi \log(s)} = e^{-\pi \log \rho}\, e^{-j\pi\theta},

where the domain is expressed in polar coordinates, s = ρe^{jθ}. When k ∈ F it may be maximally close to irrational (e.g., k = π_{152}/π_{153} = π_{152}/(π_{152} + 2) = 881/883 ≈ 0.9977, or its reciprocal ≈ 1.0023).16 The branch cut could be very subtle (it could even go unnoticed), but it would have a significant impact on the function, and on its inverse Laplace transform.
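A Python sketch of the sheets of 1/s^π (my helper name `F_sheet`); with an irrational exponent, no two sheets agree:

```python
import cmath, math

def F_sheet(rho, theta, k):
    """F(s) = 1/s**pi = e^{-pi log s}, on sheet k of s = rho e^{j theta}."""
    log_s = math.log(rho) + 1j * (theta + 2 * math.pi * k)
    return cmath.exp(-math.pi * log_s)

# On the unit circle (rho = 1) the sheets differ only in phase, and with an
# irrational exponent no two sheets ever agree (infinitely many sheets),
# unlike k = 1/2, where the sheets repeat with period two:
v0, v1, v2 = (F_sheet(1.0, 0.3, k) for k in range(3))
assert abs(v0 - v1) > 1e-3 and abs(v1 - v2) > 1e-3 and abs(v0 - v2) > 1e-3
```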

Example: Find the poles, zeros, and residues of F(s).

1.
F(s) = \frac{d}{ds} \ln \frac{s + e}{s + \pi},

Solution:
F(s) = \frac{d}{ds}\left[\ln(s + e) - \ln(s + \pi)\right] = \frac{1}{s + e} - \frac{1}{s + \pi}.

The poles are at s_1 = -e and s_2 = -\pi, with respective residues of ±1.

16 Since there are no even primes other than π_1 = 2, the minimum difference is 2. Out of 10⁶ primes, only 5 have a minimum difference of more than 80, with these 5 having a uniform distribution on a log scale.


4.4. LECTURE 27: (WEEK 10) THREE COMPLEX INTEGRATION THEOREMS I 165

2.
F(s) = \frac{d}{ds} \ln \frac{(s + 3)^{e}}{(s + \ )^{-\pi}},

Solution: There is a very important take-home message here regarding the utility of the logarithmic derivative.

3.
F(s) = e^{\pi \ln s},

Solution: Add solution.

4.
F(s) = \pi^{-s}.

Solution: Add solution.
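Returning to Example 1, the residues ±1 can be checked numerically via the defining limit; a Python sketch (my code, not the text's):

```python
import math

# Example 1: F(s) = d/ds ln[(s+e)/(s+pi)] = 1/(s+e) - 1/(s+pi),
# with poles at s = -e and s = -pi.
def F(s):
    return 1 / (s + math.e) - 1 / (s + math.pi)

eps = 1e-8
assert abs(eps * F(-math.e + eps) - 1) < 1e-5    # residue at s = -e  is +1
assert abs(eps * F(-math.pi + eps) + 1) < 1e-5   # residue at s = -pi is -1
```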

Multivalued functions: The two basic functions we review, to answer the questions about multivalued functions and branch cuts, are w(s) = √s and w(s) = log(s), along with their inverse functions w(s) = s² and w(s) = e^s. For uniformity we shall refer to the complex abscissa (s = σ + jω) and the complex ordinate w(s) = u + jv. When the abscissa and ordinate are swapped, by taking the inverse of a function, multi-valued functions are a common consequence. For example, f(t) = sin(t) is single valued, and analytic in t, thus has a Taylor series. The inverse function t(f) is multivalued.

The modern terminology is the domain and range, or alternatively the co-domain.17

Log function: Next we discuss the multi-valued nature of the log function. In this case there are an infinite number of Riemann sheets, not well captured by Fig. 3.6 (p. 137), which only displays the principal sheet. However, if we look at the formula for the log function, its nature is easily discerned. The abscissa s may be defined as multi-valued since

s_k = r e^{j\theta} e^{j2\pi k}.

Here we have extended the angle of s by 2πk, where k ∈ Z is the sheet index. Taking the log,

\log(s_k) = \log(r) + j(\theta + 2\pi k).

When k = 0 we have the principal-value sheet, which is zero when s = 1. For any other value of k, w(s) ≠ 0, even when r = 1, since the angle is not zero, except on the k = 0 sheet.
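A Python sketch of the log sheets (`log_sheet` is my name):

```python
import cmath, math

def log_sheet(s, k):
    """log on sheet k: log(r) + j(theta + 2*pi*k), theta the principal angle."""
    return cmath.log(s) + 2j * math.pi * k

# Principal sheet (k = 0): log(1) = 0.
assert log_sheet(1 + 0j, 0) == 0

# On any other sheet, w(1) = 2*pi*j*k is nonzero, even though r = 1.
assert abs(log_sheet(1 + 0j, 1) - 2j * math.pi) < 1e-15

# Every sheet still inverts through the exponential: e^{log_k(s)} = s.
for k in range(-2, 3):
    assert abs(cmath.exp(log_sheet(2 + 3j, k)) - (2 + 3j)) < 1e-12
```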

4.4 Lec 27: Three Cauchy integral theorems

4.4.1 Cauchy's theorems for integration in the complex plane

There are three basic definitions related to Cauchy’s integral formula. They are closely related, and can

greatly simplify integration in the complex plane. The choice of names seems unfortunate, if not totally

confusing.

1. Cauchy's (Integral) Theorem (CIT):

\oint_C F(s)\, ds = 0,    (4.33)

if and only if F(s) is complex analytic inside of a simple closed curve C (Boas, 1987, p. 45); (Stillwell, 2010, p. 319). The FTCC (Eq. 4.8) says that the integral only depends on the end points if F(s) is complex analytic. By closing the path (contour C) the end points are the same, thus the integral must be zero, as long as F(s) is complex analytic.

17The best way to create confusion is to rename something. The confusion grows geometrically with each renaming. I

suspect that everyone who cares knows the terms abscissa and ordinate, and some fraction know the equivalent terms domain

and range.


2. Cauchy's Integral Formula (CIF):

\frac{1}{2\pi j} \oint_B \frac{F(s)}{s - s_o}\, ds =
\begin{cases}
F(s_o), & s_o \in \mathbb{C} \text{ inside } B \\
0, & s_o \in \mathbb{C} \text{ outside } B.
\end{cases}    (4.34)

Here F (s) is required to be analytic everywhere within (and on) the boundary B of integration

(Greenberg, 1988, p. 1200); (Boas, 1987, p. 51); (Stillwell, 2010, p. 220). When the point so ∈ C

is within the boundary the value F (so) ∈ C is the residue of the pole so of F (s)/(s− so). When

the point so lies outside the boundary, the integral is zero.

3. The (Cauchy) Residue Theorem (CRT) (Greenberg, 1988, p. 1241); (Boas, 1987, p. 73):

\oint_C F(s)\, ds = 2\pi j \sum_{k=1}^{K} c_k = \sum_{k=1}^{K} \oint \frac{F(s)}{s - s_k}\, ds,    (4.35)

where the residues c_k ∈ C correspond to the K poles of F(s) at s = s_k enclosed by the contour C. By the use of Cauchy's integral formula, the right-most form of the residue theorem is equivalent to the CIF.18
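All three theorems can be checked numerically by discretizing the contour. A minimal Python sketch (my helper names, not from the text), integrating around the unit circle:

```python
import cmath, math

def contour_integral(F, N=8000):
    """Riemann-sum approximation of the closed integral of F(s) around |s| = 1."""
    total = 0j
    for n in range(N):
        s = cmath.exp(2j * math.pi * n / N)
        ds = 1j * s * (2 * math.pi / N)   # ds = j*s*d(theta)
        total += F(s) * ds
    return total

# CIT (Eq. 4.33): complex analytic inside the contour -> the integral is zero.
assert abs(contour_integral(lambda s: s**2 + 3*s + 1)) < 1e-9

# CIF (Eq. 4.34): the integral picks out F(so) when so is inside, 0 outside.
val = contour_integral(lambda s: cmath.exp(s) / (s - 0.3)) / (2j * math.pi)
assert abs(val - cmath.exp(0.3)) < 1e-9
val = contour_integral(lambda s: cmath.exp(s) / (s - 2.0)) / (2j * math.pi)
assert abs(val) < 1e-9

# CRT (Eq. 4.35): sum of the enclosed residues (here 1 + 2).
val = contour_integral(lambda s: 1/(s - 0.5) + 2/(s + 0.5j))
assert abs(val - 2j * math.pi * 3) < 1e-9
```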

How to calculate the residue: The case of first degree poles has special significance, because the

Brune impedance only allows simple poles and zeros, increasing its utility. The residues for simple poles

are F(s_k), which is complex analytic in the neighborhood of the pole, but not at the pole.

Consider the function f(s) = F(s)/(s − s_k), where we have factored f(s) to isolate the first-order pole at s = s_k, with F(s) analytic at s_k. Then the residue of the pole is c_k = F(s_k). This coefficient is computed by removing the singularity, placing a zero at the pole frequency, and taking the limit as s → s_k, namely

c_k = \lim_{s \to s_k} \left[(s - s_k)\, f(s)\right]    (4.36)

(Greenberg, 1988, p. 1242), (Boas, 1987, p. 72).

When the pole is of Nth degree, the procedure is much more complicated, requiring N − 1 derivatives of f(s) followed by the limit process (Greenberg, 1988, p. 1242). Higher-degree poles are rarely encountered; thus it is good to know that this formula exists, but perhaps it is not worth the effort to memorize it.
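For reference, the general formula (see, e.g., Greenberg, 1988, p. 1242) for a pole of degree N at s = s_k is

```latex
c_k = \frac{1}{(N-1)!} \lim_{s \to s_k} \frac{d^{N-1}}{ds^{N-1}}
      \Big[ (s - s_k)^{N} f(s) \Big],
```

which reduces to Eq. 4.36 when N = 1.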

Summary and examples: These three theorems, all attributed to Cauchy, collectively are related to

the fundamental theorems of calculus. Because the names of the three theorems are so similar, they are

easily confused.

1. In general it makes no sense (nor is there any need) to integrate through a pole, thus the poles (or

other singularities) must not lie on C.

2. The Cauchy integral theorem (Eq. 4.33) follows trivially from the fundamental theorem of com-

plex calculus (Eq. 4.8, p. 148), since if the integral is independent of the path, and the path returns

to the starting point, the closed integral must be zero. Thus Eq. 4.33 holds when F (s) is complex

analytic within C.

3. Since the real and imaginary parts of every complex analytic function obey Laplace’s equation

(Eq. 4.15, p. 150), it follows that every closed integral over a Laplace field, i.e., one defined by

Laplace’s equation, must be zero. In fact this is the property of a conservative system, corre-

sponding to many physical systems. If a closed box has fixed potentials on the walls, with any

distribution whatsoever, and a point charge (i.e., an electron) is placed in the box, then a force

equal to F = qE is required to move that charge, and thus work is done. However, if the point is

returned to its starting location, the net work done is zero.

18This theorem is the same as a 2D version of Stokes’s theorem (Boas, 1987).


4. Work is done in charging a capacitor, and energy is stored. However, when the capacitor is

discharged, all of the energy is returned to the load.

5. Soap bubbles and rubber sheets on a wire frame obey Laplace’s equation.

6. These are all cases where the fields are Laplacian, thus closed line integrals must be zero. Lapla-

cian fields are commonly observed because they are so basic.

7. We have presented the impedance as the primary example of a complex analytic function. Phys-

ically, every impedance has an associated stored energy, and every system having stored energy

has an associated impedance. This impedance is usually defined in the frequency s domain, as a

force over a flow (i.e., voltage over current). The power P(t) is defined as the force times the flow

and the energy E(t) as the time integral of the power,

E(t) = \int_{-\infty}^{t} P(\tau)\, d\tau,    (4.37)

which is similar to Eq. 4.6 (p. 148) [see Section 3.5.3, Eq. 3.40 (p. 125)]. In summary, impedance, power, and energy are all fundamentally related.


4.5. LECTURE 29: INVERSE LAPLACE TRANSFORM (T < 0): CAUCHY RESIDUE THEOREM 177

4.4.3 Lec 28: Cauchy Integral Formula & Residue Theorem

The Cauchy integral formula (Eq. 4.34) is an important extension of the Cauchy integral theorem

(Eq. 4.33) in that a pole has been explicitly injected into the integrand at s = so. If the pole loca-

tion is outside of the curve C, the result of the integral is zero, in keeping with Eq. 4.33. When the pole

is inside of C, the integrand is no longer complex analytic at the enclosed pole. When this pole is simple, the residue theorem applies. By a manipulation of the contour in Eq. 4.34, the pole can be isolated with a small circle around it; taking the radius of that circle to zero in the limit isolates the pole.

For the related Cauchy residue theorem (Eq. 4.35) the same result holds, except it is assumed that

there are K simple poles in the function F(s). This requires the repeated application of Eq. 4.34, K

times, so it represents a minor extension of Eq. 4.34. The function F (s) may be written as f(s)/PK(s),

where f(s) is analytic in C and PK(s) is a polynomial of degree K, with all of its roots sk ∈ C.

Non-integral degree singularities: The key point is that this theorem applies for non-integral degrees, including fractional n ∈ F and irrational n ∈ I, but for these cases the residue is always zero, since by definition the residue is the amplitude of the 1/s term (Boas, 1987, p. 73).

Examples:

1. When n ∈ F (e.g., n = 2/3), the residue is, by definition, zero.

2. The function 1/√s has a zero residue (apply the definition of the residue, Eq. 4.36).

3. When n ≠ 1 with n ∈ I, the residue is, by definition, zero.

4. When n = 1, the residue is given by Eq. 4.36.

5. This method is necessary when computing the inverse Laplace transform.
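A numerical sketch of these cases in Python (`residue_limit` is my name): the defining limit of Eq. 4.36 gives 1 for a simple pole and 0 for a fractional one.

```python
import cmath

def residue_limit(func, so, eps=1e-9):
    """Approximate Eq. 4.36: c = lim_{s->so} (s - so) * func(s)."""
    s = so + eps                      # approach the pole along the real axis
    return (s - so) * func(s)

# Simple pole (n = 1): 1/s has residue 1.
assert abs(residue_limit(lambda s: 1 / s, 0) - 1) < 1e-6

# Fractional pole (n = 1/2): 1/sqrt(s) has residue 0,
# since (s) * s**(-1/2) = s**(1/2) -> 0 as s -> 0.
assert abs(residue_limit(lambda s: 1 / cmath.sqrt(s), 0)) < 1e-4
```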

4.5 Lec 29: Inverse Laplace transform (t < 0): Cauchy residue theorem

The inverse Laplace transform (Eq. 3.47) transforms a function of complex frequency F(s) and returns a causal function of time f(t),

f(t) \leftrightarrow F(s),

where f(t) = 0 for t < 0. Examples are provided in Table I.2 (p. 282). We next discuss the details of finding the inverse transform by use of the Cauchy residue theorem, and how the causal requirement f(t < 0) = 0 comes about.

The integrand of the inverse transform is F(s)e^{st}, and the limits of integration are σ_o ∓ j∞. To find the inverse we must close the curve at infinity, and show that the contribution from the closure vanishes as ω → ∞. There are two ways to close these limits – to the right σ > 0 (RHP), and to the left σ < 0 (LHP) – but there needs to be some logical reason for this choice. That logic is the sign of t. For the integral to converge, the term e^{st} must go to zero as ω → ∞. In terms of the real and imaginary parts of s = σ + jω, the exponential may be rewritten as e^{σt}e^{jωt}. Note that both t and ω go to ∞. So it is the interaction between these two limits that determines how we pick the closure, RHP vs. LHP.

Case for causality (t < 0): Let us first consider negative time, including t → −∞. If we were to

close C in the left half-plane (σ < 0), then the product σt is positive (σ < 0, t < 0, thus σt > 0). In this case the integrand diverges along the closure as ω → ∞, i.e., as |s| → ∞. Thus we may not close in the LHP for negative time. If we close in the RHP (σ > 0), then the product σt < 0 and e^{st} will go to zero as ω → ∞.

This then justifies closing the contour, allowing for the use of the Cauchy theorems.

If F (s) is analytic in the RHP, the FTCC applies, and the resulting f(t) must be zero, and the inverse

Laplace transform must be causal. This argument holds for any F (s) that is analytic in the RHP (σ > 0).
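The closure argument can be verified by brute force: evaluate the inversion integral numerically along the line s = σ₀ + jω for a pair whose answer is known. A minimal Python sketch (my names; the text uses Matlab/Octave), using F(s) = 1/(s + 1)² ↔ t e^{−t} u(t):

```python
import cmath, math

def inv_lt(F, t, sigma0=0.0, W=300.0, dw=0.01):
    """Riemann-sum approximation of f(t) = (1/2*pi*j) Int F(s) e^{st} ds,
    taken along the vertical line s = sigma0 + j*omega, |omega| <= W."""
    N = int(round(2 * W / dw))
    total = 0j
    for n in range(N + 1):
        s = sigma0 + 1j * (-W + n * dw)
        total += F(s) * cmath.exp(s * t) * 1j * dw
    return (total / (2j * math.pi)).real

F = lambda s: 1 / (s + 1)**2            # <-> t e^{-t} u(t); pole in the LHP

assert abs(inv_lt(F, 1.0) - math.exp(-1)) < 1e-2   # t > 0: f(1) = 1*e^{-1}
assert abs(inv_lt(F, -1.0)) < 1e-2                 # t < 0: causal, f(t) = 0
```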


Case of unstable poles: An important but subtle point arises: If F (s) has a pole in the RHP, then the

above argument still applies if we pick σo to be to the right of the RHP pole. This means that the inverse

transform may still be applied to unstable poles (those in the RHP). This explains the need for the σo in

the limits. If F (s) has no RHP poles, then σo = 0 is adequate, and this factor may be ignored.

Case for zero time (t = 0): When time is zero, the integral does not, in general, converge, leaving

f(t) undefined. This is most clear in the case of the step function u(t) ↔ 1/s, where the integral may

not be closed, because the convergence factor e^{st} = 1 is lost for t = 0.

The fact that u(t) does not exist at t = 0 explains the Gibbs phenomenon in the inverse Fourier

transform. At times where a jump occurs, the derivative of the function does not exist, and thus the time

response function is not analytic. The Fourier expansion cannot converge at places where the function

is not analytic. A low-pass filter may be used to smooth the function, but at the cost of temporal reso-

lution. Forcing the function to be analytic at the discontinuity, by smoothing the jumps, is an important

computational method.

4.5.1 Lec 30: Inverse Laplace transform (t > 0)

Case of t > 0: Next we investigate the convergence of the integral for positive time t > 0. In this case we must close the integral in the LHP (σ < 0) for convergence, so that σt < 0 (σ < 0 and t > 0). When

there are poles on the ω = 0 axis, σo > 0 assures convergence by keeping the on-axis poles inside

the contour. At this point the Cauchy residue theorem (Eq. 4.35) is relevant. If we restrict ourselves to

simple poles (as required for a Brune impedance), the residue theorem may be directly applied.

The simplest example is the step function, for which F(s) = 1/s, and thus

u(t) = \oint_{LHP} \frac{e^{st}}{s} \frac{ds}{2\pi j} \leftrightarrow \frac{1}{s},

which is a direct application of the Cauchy residue theorem, Eq. 4.35 (p. 166). The forward transform

of u(t) is straightforward, as discussed in Section 3.7 (p. 131). This is true of most if not all of the

elementary forward Laplace transforms. In these cases, causality is built into the integral by the limits,

so it is not a result, as it must be in the inverse transform. An interesting problem is to prove that u(t) is

not defined at t = 0.
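As a numerical sanity check (a Python sketch rather than the book's Matlab/Octave), the Bromwich inversion of F(s) = 1/s can be evaluated with mpmath's `invertlaplace`, which deforms the contour (Talbot's method). For t > 0 the result should be very close to 1, the unit step:

```python
import mpmath as mp

# F(s) = 1/s, whose inverse Laplace transform is the unit step u(t).
F = lambda s: 1 / s

# Numerically invert along a deformed Bromwich contour (Talbot's method)
# at a few positive times; each value should be very close to 1.
for t in [0.5, 1.0, 2.0]:
    ft = mp.invertlaplace(F, t, method='talbot')
    print(t, ft)
```

Note that no such numerical check is meaningful at t = 0 itself, consistent with u(t) being undefined there.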

[Figure: two colorized plots, u+jv = besselj(0,pi*(x+jy)) (left) and u+jv = cos(pi*(x+jy)) (right), over −2 ≤ x, y ≤ 2.]

Figure 4.9: Left: Colorized plot of w(z) = J₀(πz). The first zero is at 2.405, and thus appears at 0.7655 = 2.405/π, somewhat larger than the corresponding root of cos(πz) at z = 1/2. Right: Note the similarity to w(z) = sin(z). The LTs have similar characteristics, as documented in Table I.2 (p. 282).

The inverse Laplace transform of F (s) = 1/(s + 1) has a residue of 1 at s = −1, thus that is the



only contribution to the integral. More demanding cases are Laplace transform pairs

(1/√t) u(t) ↔ √(π/s)   and   J₀(t) u(t) ↔ 1/√(s² + 1),

with more given on p. 282. Many of these are easily proved in the forward direction, but are much more difficult in the inverse direction, due to the properties at t = 0, unless of course the residue theorem (Eq. 4.35, p. 166) is invoked. The last LT pair helps us understand the basic nature of the Bessel functions J₀(z) and H₀⁽¹⁾(z), with a branch cut along the negative axis (Allen, 2019).

The form √s is called a semi-inductor (Kim and Allen, 2013), also known as the skin effect in EM theory. The form 1/√s is a semi-capacitor.
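The Bessel pair quoted above is easy to spot-check in the forward direction. A minimal Python sketch (scipy standing in for the Matlab/Octave used elsewhere in this book) evaluates the forward integral at s = 1 and compares it with the closed form 1/√(s² + 1):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0

# Forward Laplace transform of J0(t), evaluated numerically at s = 1,
# compared against the closed form 1/sqrt(s^2 + 1) = 1/sqrt(2).
s = 1.0
val, err = quad(lambda t: j0(t) * np.exp(-s * t), 0, np.inf)
print(val, 1 / np.sqrt(s**2 + 1))
```

The damped, slowly oscillating integrand converges quickly, so ordinary quadrature suffices here; the inverse direction is the hard one, as the text notes.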

Some open questions: Without the use of the CRT (Eq. 4.35) it is difficult to see how to evaluate the inverse Laplace transform of 1/s directly. For example, how does one show that the above integral is zero for negative time (or that it is 1 for positive time)? The CRT neatly solves this difficult problem via the convergence of the integral for negative and positive time. Clearly the continuity of the integral as ω → ∞ plays an important role. Perhaps the Riemann sphere plays a role in this that has not yet been explored.

4.5.2 Lec 31: Properties of the LT (e.g., linearity, convolution, time-shift, modulation, etc.)

As shown in the table of Laplace transforms, there are integral (i.e., integration, not integer) relationships, or properties, that are helpful to identify. The first of these is a definition, not a property:

f(t) ↔ F(s).

When taking the LT, the time response is given in lower case (e.g., f(t)) and the frequency domain transform is denoted in upper case (e.g., F(s)). It is required, but not always explicitly specified, that f(t < 0) = 0, that is, the time function must be causal (P1: Section 3.7.1).

Linearity: A key property so basic that it is almost forgotten is the linearity property of the LT, summarized in P2 of Section 3.7.1 (p. 134).

Convolution property: One of the most basic and useful properties is that the product of two LTs in frequency results in convolution in time:

f(t) ⋆ g(t) = ∫₀ᵗ f(τ) g(t − τ) dτ ↔ F(s)G(s),

where we use the ⋆ operator as a shorthand for the convolution of two time functions.

A key application of convolution is filtering, which takes many forms. The most basic filter is the moving average: the moving sum of data samples, normalized by the number of samples. Such a filter has very poor performance. It also introduces a delay of half the length of the average, which may or may not constitute a problem, depending on the application. Another important example is a low-pass filter that removes high-frequency noise, or a notch filter that removes line noise (i.e., 60 [Hz] in the US, and its 2nd and 3rd harmonics, 120 and 180 [Hz]). Such noise is typically a result of poor grounding and ground loops. It is better to solve the problem at its root than to remove it with a notch filter. Still, filters are very important in engineering.
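The moving average just described can be sketched in a few lines (Python/numpy here, standing in for Matlab/Octave). A length-M box is convolved with a step; the output edge is smeared over M samples, and the half-length delay mentioned above is where that smearing is centered:

```python
import numpy as np

# Moving-average filter: convolve with a box of height 1/M.
M = 5
h = np.ones(M) / M                               # box impulse response
x = np.concatenate([np.zeros(10), np.ones(10)])  # a clean step input
y = np.convolve(x, h)                            # output, length len(x)+M-1

# The step edge at sample 10 is smeared over M samples (0.2, 0.4, ..., 1.0),
# centered M//2 samples later than the input edge.
print(y[9:16])
```

This makes the text's two points concrete: the smoothing (poor frequency selectivity) and the built-in delay of roughly M/2 samples.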



By taking the LT of the convolution we can derive this relationship:

∫₀^∞ [f(t) ⋆ g(t)] e^{−st} dt = ∫_{t=0}^∞ [ ∫₀ᵗ f(τ) g(t − τ) dτ ] e^{−st} dt
                              = ∫_{τ=0}^∞ f(τ) ( ∫_{t=τ}^∞ g(t − τ) e^{−st} dt ) dτ
                              = ∫_{τ=0}^∞ f(τ) ( e^{−sτ} ∫_{t′=0}^∞ g(t′) e^{−st′} dt′ ) dτ
                              = G(s) ∫₀^∞ f(τ) e^{−sτ} dτ
                              = G(s) F(s).

We first encountered this relationship in Section ?? (p. ??) in the context of multiplying polynomials,

which was the same as convolving their coefficients. Hopefully the parallel is obvious. In the case of

polynomials, the convolution was discrete in the coefficients, and here it is continuous in time. But the

relationships are the same.
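The discrete (polynomial) case of this parallel is a one-liner to demonstrate (a Python sketch; numpy stands in for the book's Matlab/Octave): convolving coefficient vectors is the same as multiplying the polynomials.

```python
import numpy as np

# Multiplying polynomials is discrete convolution of their coefficients:
# (z + 2)(z + 3) = z^2 + 5z + 6.
p = np.array([1, 2])      # z + 2  (descending powers)
q = np.array([1, 3])      # z + 3
print(np.convolve(p, q))  # coefficients of z^2 + 5z + 6
```

The continuous-time convolution property above is the limiting form of exactly this coefficient arithmetic.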

Time-shift property: When a function is time-shifted by To, the LT is multiplied by e^{−sTo}, leading to the property

f(t − To) ↔ e^{−sTo} F(s).

This is easily shown by applying the definition of the LT to a delayed time function.

Time derivative: The key to the eigenfunction analysis provided by the LT is the transformation of a time derivative on a time function, that is,

(d/dt) f(t) ↔ sF(s).

Here s is the eigenvalue corresponding to the time derivative of e^{st}. Given the definition of the derivative of e^{st} with respect to time, this property seems trivial. Yet it was not obvious to Euler, as it needed to be extended to the space of complex analytic functions e^{st}, which did not happen until at least Riemann (1851).

Given a differential equation of order K, the LT results in a polynomial in s of degree K. It follows that this LT property is the cornerstone of why the LT is so important to scalar differential equations, as it was to the early analysis of Pell's equation and the Fibonacci sequence, as presented in earlier chapters. This property was first uncovered by Euler. It is not clear if he fully appreciated its significance, but by the time of his death it certainly would have been clear to him. Who first coined the terms eigenvalue and eigenfunction? The German word eigen means "own" or "characteristic."
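The derivative property, including the initial-condition term L{f′} = sF(s) − f(0) that appears when f(0) ≠ 0, can be verified symbolically. A short sympy sketch (Python, in place of the symbolic Matlab/Octave used later in this book), with f(t) = e^{−2t} as an arbitrary test function:

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)
f = sp.exp(-2*t)                      # a sample causal time function

F = sp.laplace_transform(f, t, s, noconds=True)           # F(s) = 1/(s+2)
dF = sp.laplace_transform(sp.diff(f, t), t, s, noconds=True)

# Derivative property: L{f'} = s F(s) - f(0); the difference must vanish.
print(sp.simplify(dF - (s*F - f.subs(t, 0))))
```

For a causal f with f(0⁻) = 0 the extra term drops out, recovering the bare d/dt ↔ s eigenvalue relation above.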

Initial and final value theorems: There are much more subtle relations between f(t) and F(s) that characterize f(0⁺) and f(t → ∞). While these properties can be very important in certain applications, they are beyond the scope of the present treatment. These relate to so-called initial value theorems. If the system under investigation has potential energy at t = 0, then the voltage (velocity) need not be zero for negative time. An example is a charged capacitor or a moving mass. These are important situations, but better explored in a more in-depth treatment.

4.5.3 Solving differential equations: Method of Frobenius

Many differential equations may be solved by assuming a power series (i.e., Taylor series) solution of the form

y(x) = x^r Σ_{n=0}^∞ c_n x^n   (4.49)

with coefficients c_n ∈ C. The method of Frobenius is quite general (Greenberg, 1988, p. 193).



Example: When a solution of this form is substituted into the differential equation, a recursion relation in the coefficients results. For example, if the equation is

y′(x) = λ y(x),

the recursion is c_n = λ c_{n−1}/n. The resulting solution is

y(x) = e^{λx} = x⁰ Σ_{n=0}^∞ (λⁿ/n!) xⁿ,

namely c_n = λⁿ/n!, thus n c_n = λⁿ/(n − 1)! = λ c_{n−1}.

Exercise: Find the recursion relation for Bessel's equation of order ν,

x² y″(x) + x y′(x) + (x² − ν²) y(x) = 0.

Solution: If we assume a complex analytic solution of the form Eq. 4.49, we find the Bessel recursion relation for the coefficients c_k to be

c_k = −c_{k−2} / (k (k + 2ν))

(Greenberg, 1988, p. 231).
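This recursion is easy to exercise numerically. A Python sketch (scipy in place of Matlab/Octave) builds the even Taylor coefficients of J₀ (ν = 0, c₀ = 1; the odd coefficients vanish), sums the series at x = 1, and compares against scipy's j0:

```python
from scipy.special import j0

# Taylor coefficients of J0 from the Frobenius recursion
# c_k = -c_{k-2} / (k (k + 2 nu)) with nu = 0 and c_0 = 1.
nu = 0
c = {0: 1.0}
for k in range(2, 20, 2):
    c[k] = -c[k - 2] / (k * (k + 2 * nu))

x = 1.0
series = sum(ck * x**k for k, ck in c.items())
print(series, j0(x))   # the truncated series matches j0(1) closely
```

Ten even terms already agree with the library value to near machine precision, illustrating how fast the Frobenius series converges for moderate x.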


Appendix A

Notation

A.1 Number systems

The notation used in this book is defined in this appendix so that it may be quickly accessed.¹ Where the definition is sketchy, page numbers are provided where these concepts are fully explained, along with many other important and useful definitions. For example, a discussion of N may be found on page 36. Math symbols such as N may be found at the top of the index, since they are difficult to alphabetize.

A.1.1 Units

Strangely, or not, classical mathematics, as taught today in schools, does not seem to acknowledge the concept of physical units. Units seem to have been abstracted away. This makes mathematics distinct from physics, where almost everything has units. Presumably this makes mathematics more general (i.e., abstract). But for the engineering mind this is not ideal, or worse, as it necessarily means that important physical meaning, by design, has been surgically removed. We shall use SI units whenever possible, which means this book is not a typical book on mathematics. Spatial coordinates are quoted in meters [m], and time in seconds [s]. Angles in degrees have no units, whereas radian frequency has units of inverse seconds [rad/s]. For a complete list of SI units, refer to https://physics.nist.gov/cuu/pdf/sp811.pdf.

A.1.2 Symbols and functions

For more on basic math notation, see Graham et al. (1994). We use ln as the log function base e, log as base 2, and π_k to indicate the kth prime (e.g., π₁ = 2, π₂ = 3).

When working with Fourier (FT) and Laplace (LT) transforms, lower-case symbols are in the time domain while upper case indicates the frequency domain, as f(t) ↔ F(ω). An important exception is Maxwell's equations, because they are so widely written with upper-case bold letters (e.g., E(x, ω)). It seems logical to change this to conform to lower case, with e(x, t) ↔ E(x, ω) as the preferred notation.

A.1.3 Common mathematical symbols:

There are many pre-defined symbols in mathematics, too many to summarize here. We shall only use a small subset, defined here.

• A set is a collection of objects that have a common property, and is defined using braces. For example, if the set P = {a, b, c} is such that a² + b² = c², then members of P obey the Pythagorean theorem. Thus we could say that (1, 1, √2) ∈ P.

1https://en.wikipedia.org/wiki/List_of_mathematical_symbols_by_subject#Definition_symbols




• Number sets: N, P, Z, Q, F, I, R, C are briefly discussed below, and in greater detail in Section 2.1 (pp. 36–37).

• One can define sets of sets and subsets of sets, and this is prone (in my experience) to error. For example, what is the difference between the number 0 and the null set ∅ = {}? Is 0 ∈ ∅? Ask a mathematician. This seems a lackluster construction in the world of engineering.

• Define a vector as an n-tuple.

• The symbol ⊥ is used in many different ways to indicate two things are perpendicular, orthogonal, or in disjoint sets. In set theory, A ⊥ B is equivalent to A ∩ B = ∅. If two vectors E, H are perpendicular (E ⊥ H), then their inner product is zero (E · H = 0). One must infer the meaning of ⊥ from its context.

Table A.1: List of all upper and lower case Greek letters used in the text.

A.1.4 Greek letters

Frequently Greek letters, as provided in Table A.1, are associated in engineering and physics with a specific physical meaning. For example, ω [rad/s] is the radian frequency 2πf, and ρ [kg/m³] is commonly the density. φ, ψ are commonly used to indicate angles of a triangle, and ζ(s) is the Riemann zeta function. Many of these are so well established that it makes no sense to define new terms, so we adopt these common terms (and define them).

Likely you do not know all of these Greek letters, commonly used in mathematics. Some of them are pronounced in strange ways. The symbol ξ is pronounced "see," ζ is "zeta," β is "beta," and χ is "kie" (rhymes with pie and sky). I will assume you know how to pronounce the others, which are more phonetic in English. One advantage of learning LaTeX is that all of these math symbols are built in, and thus more easily learned, once you have adopted this powerful open-source math-oriented word-processing system (e.g., used to write this book).

A.1.5 Double-Bold notation

Table A.2 indicates the symbol, followed by a page number indicating where it is discussed, and the genus (class) of the number type. For example, N > 0 indicates the infinite set of counting numbers 1, 2, 3, …, not including zero. Starting from any counting number, you get the next one by adding 1. Counting numbers are also known as the cardinal numbers.

We say that a number is in a set with the notation 3 ∈ N ⊂ R, which is read as "3 is in the set of counting numbers, which in turn is in the set of real numbers," or in vernacular language, "3 is a real counting number."



Table A.2: Double-bold notation for the types of numbers. (#) is a page number. A symbol with an exponent denotes the dimensionality; thus R² represents the real plane. An exponent of 0 denotes a point (e.g., ∈ C⁰).

Symbol (p. #)  Genus              Examples                   Counterexamples
N (36)         Counting           1, 2, 17, 3, 10²⁰          −5, 0, π, −10, 5j
P (36)         Prime              2, 17, 3                   0, 1, 4, 3², 12, −5
Z (36)         Integer            −1, 0, 17, −10²⁰           1/2, π, √5, 5j
Q (37)         Rational           2/1, 3/2, 1.5, 1.14        √2, 3^(−1/3), π
F (37)         Fractional         1/2, 7/22                  1/√2
I (37)         Irrational         √2, 3^(−1/3), π, e         vectors
R (37)         Reals              √2, 3^(−1/3), π            2πj
C (37)         Complex            1, √2 j, 3^(−j/3), πj      vectors
G              Gaussian integers  3 − 2j                     complex non-integers

Prime numbers (P ⊂ N) are taken from the counting numbers, but do not include 1.

The signed integers Z include 0 and the negative integers. Rational numbers Q are historically defined to include Z, a somewhat inconvenient definition, since the more interesting class is the fractionals F, a subset of the rationals (F ⊂ Q) that excludes the integers (i.e., F ⊥ Z). This is a useful definition because the rationals Q = Z ∪ F are formed from the union of the integers and fractionals.

The rationals may be defined, using set notation (a very sloppy notation with an incomprehensible syntax), as

Q = {p/q : q ≠ 0 & p, q ∈ Z},

which may be read as "the set ('{· · ·}') of all p/q such that (':') q ≠ 0, and ('&') p, q ∈ Z." The translation of each symbol is given in single quotes.

Irrational numbers I are very special: They are formed by taking a limit of fractionals, as the numerator and denominator → ∞, and approach a limit point. It follows that irrational numbers must be approximated by fractionals.

The reals (R) include complex numbers (C) having a zero imaginary part (i.e., R ⊂ C).

The size of a set is denoted by taking the absolute value (e.g., |N|). Normally in mathematics this

symbol indicates the cardinality, so we are defining it differently from the standard notation.

Classification of numbers: From the above definitions there exists a natural hierarchical structure of numbers:

P ⊂ N,  Z = −N ∪ {0} ∪ N,  F ⊥ Z,  Q = Z ∪ F,  R = Q ∪ I ⊂ C.

1. The primes are a subset of the counting numbers: P ⊂ N.

2. The signed integers Z are composed of ±N and 0, thus N ⊂ Z.

3. The fractionals F do not include the signed integers Z.

4. The rationals Q = Z ∪ F are the union of the signed integers and fractionals.

5. Irrational numbers I have the special property I ⊥ Q.

6. The reals R = Q ∪ I are the union of the rationals and irrationals.

7. The reals R may be defined as the subset of complex numbers C having zero imaginary part.



A.2 Rounding schemes

In Matlab/Octave there are five different rounding schemes (i.e., mappings): round(x), fix(x), floor(x), ceil(x), roundb(x), with x ∈ R. The function round(π) = 3 (i.e., 3 = ⌊π⌉) rounds to the nearest integer, whereas floor(π) (i.e., 3 = ⌊π⌋) rounds down and ceil(π) rounds up (i.e., 4 = ⌈π⌉).

Note that round() ≡ ⌊·⌉ introduces negative remainders whenever a number rounds up. The same bracket notation is used for the continued fraction expansion (p. 57).
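A Python sketch of these maps may help readers without Matlab/Octave at hand; the correspondences assumed here are floor ↔ math.floor, ceil ↔ math.ceil, fix ↔ math.trunc (truncation toward zero), and roundb ↔ Python's built-in round (round-half-to-even, i.e., banker's rounding):

```python
import math

# Python analogs of the Matlab/Octave rounding maps (a sketch).
x = math.pi
print(math.floor(x), math.ceil(x), math.trunc(x), round(x))   # 3 4 3 3

# fix vs floor differ for negative arguments:
print(math.trunc(-x), math.floor(-x))                         # -3 -4

# round-half-to-even: ties go to the even neighbor.
print(round(2.5), round(3.5))                                 # 2 4
```

The negative-argument case shows why fix and floor are distinct maps, and the tie-breaking line shows the "banker's" behavior.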

A.3 Periodic functions

The Fourier series tells us that periodic functions are discrete in frequency, with frequencies given by n/T, where T is the period (for sampled signals, Ts = 1/(2Fmax) is the sample period and Fmin = Fmax/N_FT).

This concept is captured by the Fourier series, which is a frequency expansion of a periodic function. The concept is quite general: Periodic in frequency implies discrete in time, and periodic and discrete in time requires periodic and discrete in frequency (the case of the DFT). The modulo function m mod n is periodic with period n.

A periodic function may be conveniently indicated using double-parentheses notation; this is sometimes known as modular arithmetic. For example,

f((t))_T = f(t) = f(t ± kT)

is periodic on t, T ∈ R with a period of T and k ∈ Z. This notation is useful when dealing with Fourier series of periodic functions.

When a discrete-valued sequence (e.g., time index n ∈ N) is periodic with period N ∈ Z, we use square brackets:

f[[n]]_N = f[n] = f[n ± kN],

with k ∈ Z. This notation will be used with discrete-time signals that are periodic, such as the case of

the DFT.
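The double-bracket notation is exactly what the modulo operator implements. A minimal Python sketch (the `%` operator plays the role of m mod n above):

```python
# Periodic (modular) indexing f[[n]]_N via the % operator:
# any integer index n maps into 0..N-1, so f[n] = f[n +/- k N].
N = 8
f = list(range(N))          # one period of a discrete sequence

def f_periodic(n):
    return f[n % N]         # f[[n]]_N

print(f_periodic(3), f_periodic(3 + N), f_periodic(3 - 2 * N))  # all equal
```

Note that Python's `%` always returns a value in 0..N−1 even for negative n, which is what periodic (DFT-style) indexing requires.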

A.4 Differential equations vs. polynomials

A polynomial has degree N, defined by the largest power. A quadratic equation has degree 2, and a cubic has degree 3. We shall indicate a polynomial by the notation

P_N(z) = z^N + a_{N−1} z^{N−1} + · · · + a₀.

It is good practice to normalize the polynomial so that a_N = 1; this will not change the roots, defined by Eq. 3.7 (p. 76). The coefficient on z^{N−1} is always the negative of the sum of the roots z_n (a_{N−1} = −Σₙ z_n), and a₀ is their product up to sign (a₀ = (−1)^N Πₙ z_n).
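These root–coefficient (Vieta) relations, signs included, are easy to spot-check numerically (a Python sketch; numpy stands in for the Matlab used elsewhere in the book):

```python
import numpy as np

# Vieta check for a monic polynomial: P(z) = z^2 - 5z + 6 = (z-2)(z-3).
# The roots sum to -a_{N-1} and multiply to (-1)^N a_0.
coeffs = [1, -5, 6]                   # descending powers, a_N = 1
r = np.roots(coeffs)

print(sorted(r.real))                 # the roots: 2 and 3
print(r.sum().real, r.prod().real)    # 5 = -(-5) and 6 = (-1)^2 * 6
```

For N = 2 the product sign (−1)^N is +1, which is why a₀ = 6 equals the product directly here.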

Differential equations have order (polynomials have degree). If a second-order differential equation

is Laplace transformed (Lec. 3.7, p. 131), one is left with a degree 2 polynomial.

Example:

(d²/dt²) y(t) + b (d/dt) y(t) + c y(t) = α [ (d/dt) x(t) + β x(t) ] ↔   (A.1)

(s² + bs + c) Y(s) = α (s + β) X(s).   (A.2)

Y(s)/X(s) = α (s + β)/(s² + bs + c) ≡ H(s) ↔ h(t).   (A.3)



Using the same argument as for polynomials, the lead coefficient must always be 1. The coefficient α ∈ R is called the gain, and the complex variable s is the Laplace frequency.

The ratio of the output Y(s) over the input X(s) is called the system transfer function H(s). When H(s) is the ratio of two polynomials in s, the transfer function is said to be bilinear, since it is linear in both the input and output. The roots of the numerator are called the zeros and those of the denominator, the poles. The inverse Laplace transform of the transfer function is called the system impulse response, which describes the system's output signal y(t) for any given input signal x(t), via convolution (i.e., y(t) = h(t) ⋆ x(t)).
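A transfer function of the form Eq. A.3 is directly representable with scipy's signal toolbox (a Python sketch; the parameter values α = 1, β = 0.5, b = 3, c = 2 are illustrative assumptions, not taken from the text):

```python
import numpy as np
from scipy import signal

# H(s) = alpha (s + beta) / (s^2 + b s + c), as in Eq. A.3.
alpha, beta, b, c = 1.0, 0.5, 3.0, 2.0
H = signal.TransferFunction([alpha, alpha * beta], [1.0, b, c])

print(H.zeros)            # zero at s = -beta
print(H.poles)            # poles = roots of s^2 + 3s + 2, i.e., -1 and -2
t, h = signal.impulse(H)  # h(t): the system impulse response
```

The zeros/poles attributes make the numerator/denominator root terminology concrete, and `signal.impulse` numerically realizes the inverse LT of H(s).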


Appendix D

Symbolic analysis of T E = EΛ

D.1 General case

Here we assume

T = [A B; C D]

with ΔT = 1.

The eigenvectors e± of T are

e± = [ (1/2C) ( (A − D) ∓ √((A − D)² + 4BC) ) ;  1 ]   (D.1)

and the eigenvalues are

λ± = (1/2) ( (A + D) ∓ √((A − D)² + 4BC) ).   (D.2)

D.1.1 Case of the transmission matrix

For the case of the ABCD matrix, the eigenvalues depend on reciprocity, since ΔT = 1 if T(s) is reciprocal and ΔT = −1 if it is anti-reciprocal. Thus it is helpful to display the eigenvectors and eigenvalues in terms of ΔT, so this distinction is explicit.

The term under the radical (i.e., the discriminant) may be rewritten in terms of the determinant of T (i.e., ΔT = AD − BC). Since

(A − D)² = (A + D)² − 4AD,

the expression under the radical becomes

(A − D)² + 4BC = (A + D)² − 4AD + 4BC = (A + D)² − 4ΔT.

Rewriting the eigenvectors and eigenvalues in terms of ΔT, we find

e± = [ (1/2C) ( (A − D) ∓ √((A + D)² − 4ΔT) ) ;  1 ]   (D.3)

and the eigenvalues are

λ± = (1/2) ( (A + D) ∓ √((A + D)² − 4ΔT) ).   (D.4)




For the reversible transmission matrix: When A = D, the transmission matrix is said to be reversible, thus

λ± = A ∓ √(BC).   (D.5)

Example: An application of the above results may be found in the Train problem (p. 183).

D.2 Special cases having symmetry

Each 2×2 matrix has four entries, each of which can be complex. This leads to 4 × 2 = 8 possible special symmetries (an eight-fold way), discussed next in quasi-order of their importance. Each symmetry is related to properties of T. For example, if T = T†, the matrix is said to have Hermitian symmetry. When T = Tᵀ the matrix is symmetric, and when T = −Tᵀ it is said to be skew-symmetric. Each of these eight-fold symmetries corresponds to some sort of physical constraint, as discussed below.

D.2.1 Reversible systems

When the values of T on its diagonal are equal (A = D), the matrix is called reversible, and the eigenvectors and eigenvalues greatly simplify to

E = [ −√(B/C)  +√(B/C) ;  1  1 ],   Λ = [ A − √(BC)  0 ;  0  A + √(BC) ].   (D.6)

Reversibility (P7) is a common symmetry. Its utility is in describing transmission lines, as discussed in Sections 3.5 (p. 123), ?? and 4.5.4 (pp. ??–182).
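The reversible case is easy to verify numerically. A Python sketch (numpy in place of the symbolic Matlab/Octave used later in this appendix) checks that for T = [A B; C A] the eigenvalues are A ∓ √(BC) and the upper eigenvector entries are ∓√(B/C), once the lower entries are normalized to 1:

```python
import numpy as np

# Reversible (equal-diagonal) transmission matrix, with illustrative
# values A = 2, B = C = 3 (an assumption, chosen so sqrt(B/C) = 1).
A, B, C = 2.0, 3.0, 3.0
T = np.array([[A, B], [C, A]])
lam, E = np.linalg.eig(T)

print(sorted(lam.real))   # A - sqrt(BC) = -1 and A + sqrt(BC) = 5
E = E / E[1, :]           # normalize each column's lower element to 1
print(E[0, :])            # upper entries: -/+ sqrt(B/C) = -/+ 1
```

Dividing the eigenvector matrix by its second row mirrors the normalization convention (lower element equal to 1) used in the Octave/Matlab comparison below in Section D.3.3.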

D.2.2 Reciprocal systems

When the matrix is symmetric (B = C), the corresponding system is said to be reciprocal. Most physical systems are reciprocal. The determinant of the transmission matrix of a reciprocal network is ΔT = AD − BC = 1. For example, electrical networks composed of inductors, capacitors and resistors are always reciprocal. It follows that the complex impedance matrix is symmetric (Van Valkenburg, 1964a). Magnetic systems such as dynamic loudspeakers are anti-reciprocal, and correspondingly ΔT = −1. The impedance matrix of a loudspeaker is skew-symmetric (Kim and Allen, 2013). All impedance matrices are either symmetric or anti-symmetric, depending on whether they are reciprocal (LRC networks) or anti-reciprocal (magnetic networks). These systems have complex eigenvalues with negative real parts, corresponding to lossy systems. In some sense all of this follows from conservation of energy, but the precise general case is waiting for enlightenment. The impedance matrix is never Hermitian. It is easily proved that Hermitian matrices have real eigenvalues, which correspond to lossless networks. Any physical system of equations that has any type of loss cannot be Hermitian.

In summary, given a reciprocal system, the T matrix has ΔT = 1, and the corresponding impedance matrix is symmetric (not Hermitian).

D.3 Impedance matrix

As previously discussed in Section 3.5 (p. 123), the T matrix corresponding to an impedance matrix is

[V₁ ; V₂] = Z(s) [I₁ ; I₂] = (1/C) [A  ΔT ; 1  D] [I₁ ; I₂].



Reciprocal systems have symmetric impedance matrices, namely z₁₂ = z₂₁. When the system is both reversible (A = D) and reciprocal, the impedance matrix simplifies to

Z(s) = (1/C) [A  1 ; 1  A].

For such systems there are only two degrees of freedom, A and C. As discussed previously in Sect. 3.5

(p. 123), each of these has a physical meaning: 1/A is the Thevenin source voltage given a voltage drive

and B/C is the Thevenin impedance.

D.3.1 Transmission matrices and symmetry

The transmission matrix fully characterizes a two-port network (Sect. 3.5, p. 123).

D.3.2 Hermitian symmetry

When a system is Hermitian, its matrix is conjugate symmetric:

Z(s) = Z†(s),

a stronger condition than reciprocity, but not the plain symmetry of a Brune impedance matrix. An impedance matrix is not Hermitian unless its diagonal elements are real, but it is symmetric.

In the case of a Hermitian matrix, the eigenvalues are always real. To show this, start from the impedance eigen-equation (V is a vector of voltages, I a current vector, and Z = Z† a square conjugate-symmetric impedance matrix):

V = ZI = λI;   P = I†V = λ I†I,

where Z, I, V, λ ∈ C, Z† is the conjugate (Hermitian) transpose (see Appendix A, p. 253), and P is the power. Since I†ZI and I†I are both real when Z is Hermitian, the eigenvalue λ = I†ZI/I†I must be real.

D.3.3 Double roots

For the 2×2 case of double roots, the matrix has the Jordan form

T = [λ  1 ; 0  λ].

Then

Tⁿ = [λⁿ  nλⁿ⁻¹ ; 0  λⁿ].

This generalizes to n×n matrices having arbitrary combinations of degeneracies (multiple roots), as in symmetric (square) drums, for example.

Octave vs. Matlab seeming conflict

Using Octave (version 4.2.2, June 21, 2018):

e₁,₁ = −b / ( a/2 − d/2 + (1/2)√(a² − 2ad + 4bc + d²) )
     = −2b / ( (a − d) + √((a − d)² + 4bc) ).

This looks very different from the Matlab output (above). Here is what I get with the same command, from Matlab (June 21, 2018):

e₁,₁ = ( a/2 + d/2 − √(a² − 2ad + d² + 4bc)/2 − d ) / c
     = (1/2c) ( (a − d) − √((a − d)² + 4bc) ).



Here we derive the eigenmatrix E and eigenvalue matrix Λ, given a 2×2 transmission matrix

T = [A B; C D],

such that T E = EΛ, using symbolic algebra methods, given by the Matlab/Octave script

syms A B C D T E L   %use symbolic Matlab/Octave
T=[A B;C D]          %given matrix T
[E,L]=eig(T)         %find eigenvector matrix E and
                     %eigenvalue matrix L

These results have been numerically verified to be the same using CkEig.m. Conclusions from the numerical experiment with A = [1, 2; 3, 5] are:

1. The Octave and Matlab symbolic formulas give the same numeric results for the eigenmatrix and eigenvalues.

2. The symbolic and numeric results correctly diagonalize the matrix, that is, Λ = E⁻¹AE.

3. The numeric eigenvectors are normalized to 1: norm(e±) = 1.

4. The symbolic eigenvectors are not normalized to 1; rather E(2, :) = 1.

5. When the numeric result is normalized by the lower element (making it 1), all results agree.
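The same experiment can be reproduced without a symbolic toolbox (this is an independent Python sketch, not the author's CkEig.m): compute numeric eigenvectors of A = [1, 2; 3, 5], normalize each by its lower element, and compare against the closed-form entry e₁,₁ = [(a − d) − √((a − d)² + 4bc)]/(2c):

```python
import numpy as np

# The text's example matrix A = [1 2; 3 5].
a, b, c, d = 1.0, 2.0, 3.0, 5.0
M = np.array([[a, b], [c, d]])
lam, E = np.linalg.eig(M)
E = E / E[1, :]                 # force the lower element of each column to 1

# Closed-form upper entry for the minus-branch eigenvector:
e11 = ((a - d) - np.sqrt((a - d)**2 + 4*b*c)) / (2*c)
print(e11, E[0, :])             # e11 appears among the normalized upper entries
```

This confirms conclusion 5: once the lower elements are scaled to 1, the numeric and closed-form eigenvectors agree.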


Appendix I

Tables of Laplace Transforms

Properties of the Laplace Transform

1. Time t ∈ R [s] and the Laplace frequency [rad/s] are defined via s = σ + jω ∈ C.

2. Given a Laplace transform (LT) pair f(t) ↔ F(s), in the engineering literature the time domain is always lower case [f(t)] and causal (i.e., f(t < 0) = 0), and the frequency domain is upper case [e.g., F(s)]. Maxwell's venerable equations are the unfortunate exception to this otherwise universal rule.

3. The target time function f(t < 0) = 0 (i.e., it must be causal). The time limits are 0⁻ < t < ∞. Thus the integral must start from slightly below t = 0 to integrate over a delta function at t = 0. For example, if f(t) = δ(t), the integral must include both sides of the impulse. If you wish to include non-causal functions such as δ(t + 1), it is necessary to extend the lower time limit. In such cases simply set the lower limit of the integral to −∞, and let the integrand (f(t)) determine the limits.

4. When taking the forward transform (t → s), the sign of the exponential is negative. This is necessary to assure that the integral converges when the integrand f(t) → ∞ as t → ∞. For example, if f(t) = eᵗu(t), then without the negative exponent (i.e., for σ ≤ 1) the integral does not converge.

5. The limits on the integral of the forward transform are t : (0⁻, ∞) ∈ R, and those of the inverse LT are σₒ − j∞ to σₒ + j∞ ∈ C. These limits will be further discussed in Section 4.5.2 (p. 179).

6. When taking the inverse Laplace transform, the normalization factor of 1/(2πj) is required to cancel the 2πj that arises from the contour integration.

7. The frequency for the LT must be complex, and in general F(s) is complex analytic for σ > σₒ. It follows that the real and imaginary parts of F(s) are related: given ℜF(s) it is possible to find ℑF(s) (Boas, 1987). More on this in Section 3.7 (p. 131).

8. To take the inverse Laplace transform, we must learn how to integrate in the complex s plane. This will be explained in Sections 4.4–4.5.2 (pp. 165–179).

9. The Laplace step function is defined as

u(t) = ∫_{−∞}^t δ(τ) dτ = { 1 if t > 0;  NaN if t = 0;  0 if t < 0 }.

Alternatively one could define δ(t) = du(t)/dt.




10. It is easily shown that u(t) ↔ 1/s by direct integration:

F(s) = ∫₀^∞ u(t) e^{−st} dt = −(e^{−st}/s) |₀^∞ = 1/s.

With the LT step u(t) there is no Gibbs ringing effect.

11. In many physical applications the Laplace transform takes the form of a ratio of two polynomials. In such cases the roots of the numerator polynomial are called the zeros, while the roots of the denominator polynomial are called the poles. For example, the LT pair u(t) ↔ 1/s has a pole at s = 0, which represents integration, since

u(t) ⋆ f(t) = ∫_{−∞}^t f(τ) dτ ↔ F(s)/s.

12. The LT is quite different from the FT in terms of its analytic properties. For example, the step function u(t) ↔ 1/s is complex analytic everywhere, except at s = 0. The FT pair 1 ↔ 2πδ(ω) is not analytic anywhere.

13. Dilated step function (a ∈ R):

u(at) ↔ ∫_{−∞}^∞ u(at) e^{−st} dt = (1/a) ∫_{−∞}^∞ u(τ) e^{−(s/a)τ} dτ = (a/|a|)(1/s) = ±1/s,

where we have made the change of variables τ = at. The only effect that a has on u(at) is the sign of t, since u(t) = u(2t). However u(−t) ≠ u(t), since u(t) · u(−t) = 0 and u(t) + u(−t) = 1, except at t = 0, where it is not defined.

Once complex integration in the complex plane has been defined (Section 4.2.2, p. 149), we can justify the definition of the inverse LT (Eq. 3.46).¹

1https://en.wikipedia.org/wiki/Laplace_transform#Table_of_selected_Laplace_transforms



Table I.1: Functional relationships between Laplace transforms.

LT functional properties

f(t) ⋆ g(t) = ∫₀ᵗ f(t − τ) g(τ) dτ ↔ F(s)G(s)               convolution
u(t) ⋆ f(t) = ∫_{0⁻}^t f(τ) dτ ↔ F(s)/s                     convolution (integration)
f(at) u(at) ↔ (1/a) F(s/a),  a ∈ R, a ≠ 0                   scaling
f(t) e^{−at} u(t) ↔ F(s + a)                                damped
f(t − T) e^{−a(t−T)} u(t − T) ↔ e^{−sT} F(s + a)            damped and delayed
f(−t) u(−t) ↔ F(−s)                                         reverse time
f(−t) e^{−at} u(−t) ↔ F(a − s)                              time-reversed & damped
(d/dt) f(t) = δ′(t) ⋆ f(t) ↔ sF(s)                          derivative

Additional transforms

sin(t) u(t)/t ↔ tan⁻¹(1/s)                                  half-sync


282 APPENDIX I. TABLES OF LAPLACE TRANSFORMS

Table I.2: Laplace transforms, with δ(t), u(t), rect(t); T₀, p, e ∈ R and F(s), G(s), s, a ∈ C. Given a Laplace transform (LT) pair f(t) ↔ F(s), the frequency domain is always upper-case [e.g., F(s)] and the time domain lower-case [f(t)] and causal (i.e., f(t < 0) = 0). An extended table of transforms is given in Table I.3.

f(t) ↔ F(s), t ∈ R; s, F(s) ∈ C   Name

δ(t) ↔ 1   Dirac
δ(|a|t) ↔ 1/|a|, a ≠ 0   time-scaled Dirac
δ(t − T₀) ↔ e^{−sT₀}   delayed Dirac
δ(t − T₀) ⋆ f(t) ↔ F(s) e^{−sT₀}   –
∑_{n=0}^{∞} δ(t − nT₀) = 1/(1 − δ(t − T₀)) ↔ 1/(1 − e^{−sT₀}) = ∑_{n=0}^{∞} e^{−snT₀}   one-sided impulse train
u(t) ↔ 1/s   step
u(−t) ↔ −1/s   anti-causal step
u(at) ↔ (a/|a|)(1/s), a ≠ 0 ∈ R   dilated or reversed step
e^{−at} u(t) ↔ 1/(s + a), a > 0 ∈ R   damped step
cos(at) u(t) ↔ ½ (1/(s − ja) + 1/(s + ja)) = s/(s² + a²), a ∈ R   cos
sin(at) u(t) ↔ (1/2j)(1/(s − ja) − 1/(s + ja)) = a/(s² + a²), a ∈ C   "damped" sin
u(t − T₀) ↔ (1/s) e^{−sT₀}, T₀ > 0 ∈ R   time delay
rect(t) = (1/T₀)[u(t) − u(t − T₀)] ↔ (1/(T₀ s))(1 − e^{−sT₀})   rect-pulse
u(t) ⋆ u(t) = t u(t) ↔ 1/s²   ramp
u(t) ⋆ u(t) ⋆ u(t) = ½ t² u(t) ↔ 1/s³   double ramp
(1/√t) u(t) ↔ √(π/s)
t^p u(t) ↔ Γ(p + 1)/s^{p+1}, ℜp > −1, p ∈ C
J_n(ω₀t) u(t) ↔ (√(s² + ω₀²) − s)^n / (ω₀ⁿ √(s² + ω₀²))
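Individual rows of the table are easy to verify by numerical quadrature (a sketch assuming SciPy); here the damped-step and cosine pairs:

```python
import numpy as np
from scipy.integrate import quad

s, a = 2.0, 3.0

# Damped step: e^{-at} u(t) <-> 1/(s + a)
F1, _ = quad(lambda t: np.exp(-a * t) * np.exp(-s * t), 0.0, 50.0)
print(F1, 1.0 / (s + a))

# Cosine: cos(at) u(t) <-> s/(s^2 + a^2)
F2, _ = quad(lambda t: np.cos(a * t) * np.exp(-s * t), 0.0, 50.0)
print(F2, s / (s**2 + a**2))
```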


Table I.3: An extended table of Laplace transforms. J₀ and K₁ are Bessel functions of the first and second kind.

f(t) ↔ F(s)   Name

δ(at) ↔ 1/|a|, a ≠ 0   time-scaled Dirac
δ(t + T₀) ↔ e^{sT₀}   negative delay
u(at) ↔ (a/|a|)(1/s), a ≠ 0   dilate
u(−t) ↔ −1/s   anti-causal step
e^{at} u(−t) ↔ 1/(a − s)   anti-causal damped step
(d^{1/2}/dt^{1/2}) f(t) u(t) ↔ √s F(s)   "half" derivative
(d^{1/2}/dt^{1/2}) u(t) ↔ √s (1/s) = 1/√s   "half" derivative of the step
(1/√(πt)) u(t) ↔ 1/√s   "half" integration
erfc(α/√t) ↔ (1/s) e^{−2α√s}, α > 0 (Morse-Feshbach-II, p. 1582)   erfc
J₀(at) u(t) ↔ 1/√(s² + a²)   Bessel
J₁(t) u(t)/t ↔ √(s² + 1) − s
2δ′(t) + J₁(t) u(t)/t ↔ √(s² + 1) + s = e^{sinh⁻¹(s)}
δ′(t) + J₁(t) u(t)/t ↔ √(s² + 1)
I₀(t) u(t) ↔ 1/√(s² − 1)
u(t)/√(t + 1) ↔ e^s √(π/s) erfc(√s)
√t u(t) ⋆ √(1 + t) u(t) ↔ e^{s/2} K₁(s/2)/(2s)
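The "half integration" pair can also be checked numerically. Substituting t = x² removes the integrable singularity of 1/√(πt) at t = 0, so ordinary quadrature applies (a sketch assuming SciPy):

```python
import numpy as np
from scipy.integrate import quad

s = 4.0
# LT of u(t)/sqrt(pi*t); with t = x^2 the integrand becomes (2/sqrt(pi)) e^{-s x^2}.
F, _ = quad(lambda x: (2.0 / np.sqrt(np.pi)) * np.exp(-s * x**2), 0.0, 10.0)
print(F, 1.0 / np.sqrt(s))
```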


Appendix L: Time Lines

Figure L.1: Page 21. [Time line, 1500 BCE to 1650 CE: Chinese and Babylonian mathematics; Pythagoreans, Euclid, Archimedes, Diophantus; Brahmagupta, al-Khawarizmi, Bhaskara; Marco Polo, Leonardo, Bombelli, Copernicus; Christ at 0 CE.]

Figure L.2: Page 24. [Chronological history, 16th to 19th centuries (1525–1855): Bombelli, Descartes, Fermat, Galileo, Newton, the Bernoullis (Jacob, Johann, Daniel), Euler, d'Alembert, Mozart, Beethoven, Gauss.]

Figure L.3: Page 73. [Chronological history, 1640–1950: Newton, Johann and Daniel Bernoulli, Euler, d'Alembert, Gauss, Cauchy, Helmholtz, Kirchhoff, Riemann, Maxwell, Kelvin, Stokes, Rayleigh, Poincaré, Heaviside, Sommerfeld, Einstein, Brillouin; Mozart and Beethoven for reference.]


Figure L.4: Page 107. [Chronological history, 1596–1800: Descartes, Fermat, Newton, Johann and Daniel Bernoulli, Euler, d'Alembert, Lagrange, Gauss, Cauchy.]

Figure L.5: Page 146. [Chronological history, 1525–1925: Bombelli, Mersenne, Galileo, Descartes, Fermat, Huygens, the Bernoullis (Johann, Jacob, Daniel), Newton, Euler, d'Alembert, Gauss, Cauchy, Helmholtz, Riemann, Maxwell, Rayleigh, Hilbert, Einstein; US Civil War marker.]


Appendix M: Color Plates

Figure M.1: Page 10. [Diagram relating MATHEMATICS, ENGINEERING, and PHYSICS.]

Figure M.2: Page 65. [Color plate.]

Figure M.3: Page 79. [Two colorized plots on the region 0 ≤ ℜx ≤ 5, −2 ≤ ℑx ≤ 2.]


Figure M.4: Page 136. [Colorized plots of u+jv = s and u+jv = (x+jy)-sqrt(i) on |x| ≤ 2, |y| ≤ 2.]

Figure M.5: Page 136. [Colorized plot of u+jv = sin(0.5*pi*((x+jy)-i)).]

Figure M.6: Page 137. [Colorized plots of u+jv = exp((x+jy)) and u+jv = log((x+jy)).]


Figure M.7: Page 147. [Colorized plots of u+jv = atan((x+jy)) and u+jv = (i/2)*log((1-i*(x+jy))./(1+i*(x+jy))).]

Figure M.8: Page 161. [Colorized plots of u+jv = s.^2 and u+jv = -sqrt(-s).]

Figure M.9: Page 162. [Color plate.]


Figure M.10: Page 163. [Colorized plots of u+jv = s, u+jv = sqrt(s), and u+jv = exp(j*pi).*sqrt(s).]

Figure M.11: Page 164. [Colorized plots of u+jv = sqrt(pi./s) and u+jv = sqrt(s.^2+1).]

Figure M.12: Page 178. [Colorized plots of u+jv = besselj(0,pi*(x+jy)) and u+jv = cos(pi*(x+jy)).]


Bibliography

Allen, J. (2019), An Invitation to Mathematical Physics – Companion (Springer).
Allen, J. B. (2014), "The relation between tympanic membrane higher order modes and standing waves," in Proceedings of the 12th International Mechanics of Hearing Workshop, edited by K. K. Corey DP (AIP, American Institute of Physics, Melville, NY), in press.
Batchelor, G. (1967), An Introduction to Fluid Dynamics (Cambridge University Press, Cambridge, United Kingdom).
Beranek, L. L. (1954), Acoustics (McGraw-Hill Book Company, Inc., 451 pages, New York).
Beranek, L. L. and Mellow, T. J. (2012), Acoustics: Sound Fields and Transducers (Academic Press - Elsevier Inc., Waltham, MA).
Boas, R. (1987), An Invitation to Complex Analysis (Random House, Birkhäuser Mathematics Series).
Bode, H. (1945), Network Analysis and Feedback Amplifier Design (Van Nostrand, New York).
Bombieri, E. (2000), "Problems of the millennium: The Riemann hypothesis," RiemannZeta-ClayDesc.pdf, 1–11, date unknown.
Boyer, C. and Merzbach, U. (2011), A History of Mathematics (Wiley), URL https://books.google.com/books?id=BokVHiuIk9UC.
Brillouin, L. (1953), Wave Propagation in Periodic Structures (Dover, London), updated 1946 edition with corrections and added appendix.
Brillouin, L. (1960), Wave Propagation and Group Velocity (Academic Press), Pure and Applied Physics, v. 8, 154 pages.
Brune, O. (1931a), "Synthesis of a finite two-terminal network whose driving point impedance is a prescribed function of frequency," J. Math. and Phys. 10, 191–236.
Brune, O. (1931b), "Synthesis of a finite two-terminal network whose driving point impedance is a prescribed function of frequency," Ph.D. thesis, MIT, https://dspace.mit.edu/handle/1721.1/10661.
Calinger, R. S. (2015), Leonhard Euler: Mathematical Genius in the Enlightenment (Princeton University Press, Princeton, NJ, USA).
Campbell, G. (1903a), "On Loaded Lines in Telephonic Transmission," Phil. Mag. 5, 313–331. See Campbell22a (footnote 2): in that discussion, ... it is tacitly assumed that the line is either an actual line with resistance, or the limit of such with R = 0.
Campbell, G. (1922), "Physical Theory of the Electric Wave Filter," Bell System Tech. Jol. 1(2), 1–32.
Campbell, G. (1937), The Collected Papers of George Ashley Campbell (AT&T Archives, 5 Reiman Rd, Warren, NJ), foreword by Vannevar Bush; introduction by E. H. Colpitts.


Campbell, G. A. (Mar 1903b), "On loaded lines in telephonic transmission," Philosophical Magazine Series 6 5(27), 313–330.
Campbell, G. A. (1910), "Telephonic Intelligibility," Phil. Mag. 19(6), 152–9.
Carlin, H. J. and Giordano, A. B. (1964), Network Theory: An Introduction to Reciprocal and Nonreciprocal Circuits (Prentice-Hall, Englewood Cliffs, NJ).
Cauer, E. and Mathis, W. (1995), "Wilhelm Cauer (1900–1945)," Archiv für Elektronik und Übertragungstechnik 49(5), 244–251.
Cauer, E., Mathis, W., and Pauli, R. (2000), "Life and work of Wilhelm Cauer (1900–1945)," in Proceedings of the 14th International Symposium on Mathematical Theory of Networks and Systems, Perpignan, France.
Cauer, W. (1958), Synthesis of Linear Communication Networks: Vols. 1 and 2 (McGraw-Hill).
Cauer, W., Klein, W., Pelz, F., Knausenberger, G., and Warfield, J. (1958), Synthesis of Linear Communication Networks (McGraw-Hill, New York).
Condon, E. and Morse, P. (1929), Quantum Mechanics (McGraw-Hill, New York).
Crandall, I. B. (1926), Theory of Vibrating Systems and Sound (D. Van Nostrand Company, Inc., 272 pages, New York). The book contains extensive bibliographic information about applied acoustics and speech and hearing; Crandall was the first to analyze speech sounds in a quantitative way.
D'Angelo, J. P. (2017), Linear and Complex Analysis for Applications (CRC Press, Taylor & Francis Group).
Einstein, A. (1905), "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen," Annalen der Physik 322(8), 549–560.
Fagen, M. (Ed.) (1975), A History of Engineering and Science in the Bell System: The Early Years (1875–1925) (AT&T Bell Laboratories).
Feynman, R. (1970a), Feynman Lectures on Physics (Addison-Wesley Pub. Co.).
Feynman, R. (1970b), Feynman Lectures on Physics: Vol. I (Addison-Wesley Pub. Co.).
Feynman, R. (1970c), Feynman Lectures on Physics: Vol. II (Addison-Wesley Pub. Co.).
Feynman, R. (1970d), Feynman Lectures on Physics: Vol. III (Addison-Wesley Pub. Co.).
Fine (2007), Number Theory: An Introduction via the Distribution of Primes (Birkhäuser Boston, Boston, MA), chap. 4, pp. 133–196, URL http://dx.doi.org/10.1007/978-0-8176-4541-0_4.
Flanders, H. (June–July 1973), "Differentiation under the integral sign," American Mathematical Monthly 80(6), 615–627.
Fletcher, N. and Rossing, T. (2008), The Physics of Musical Instruments (Springer, New York), URL https://books.google.com/books?id=9CRSRYQlRLkC.
Fry, T. (1928), Probability and Its Engineering Uses (D. Van Nostrand Co. Inc., Princeton, NJ).
Galileo (1638), Two New Sciences (1914, Macmillan & Co., New York/Toronto), http://files.libertyfund.org/files/753/0416 Bk.pdf.


Goldsmith, A. N. and Minton, J. (1924), "The performance and theory of loud speaker horns," Proceedings of the Institute of Radio Engineers 12(4), 423–478.
Goldstein, L. (Jun–Jul 1973), "A history of the prime number theorem," The American Mathematical Monthly 80(6), 599–615.
Graham, R. L., Knuth, D. E., and Patashnik, O. (1994), Concrete Mathematics: A Foundation for Computer Science, 2nd ed. (Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA).
Greenberg, M. D. (1988), Advanced Engineering Mathematics (Prentice Hall, Upper Saddle River, NJ, 07458), URL http://jontalle.web.engr.illinois.edu/uploads/493/Greenberg.pdf.
Grosswiler, P. (2004), "Dispelling the Alphabet Effect," Canadian Journal of Communication 29(2).
Gustavsen, B. and Semlyen, A. (1999), "Rational approximation of frequency domain responses by vector fitting," IEEE Transactions on Power Delivery 14(3), 1052–1061.
Hahn, W. (1941), "A new method for the calculation of cavity resonators," JAP 12, 62–68.
Hamming, R. (Mar 1986), "You and your research," unpublished talk at Bell Communication Research Colloquium Seminar, https://en.wikiquote.org/wiki/Richard_Hamming, 1–25, URL http://www.csee.usf.edu/~zheng/hamming-research.pdf, Jim F. Kaiser.
Hanna, C. R. and Slepian, J. (1924), "The function and design of horns for loud speakers," Transactions of the American Institute of Electrical Engineers XLIII, 393–411.
Heaviside, O. (1892), "On resistance and conductance operators, and their derivatives inductance and permittance, especially in connection with electric and magnetic energy," Electrical Papers 2, 355–374.
Helmholtz, H. (Jan 1858), "Ueber Integrale der hydrodynamischen Gleichungen, welche den Wirbelbewegungen entsprechen (On integrals of the hydrodynamic equations that correspond to vortex motions)," Journal für die reine und angewandte Mathematik 55, 25–55.
Helmholtz, H. (1863a), On the Sensations of Tone (Dover (1954), 300 pages, New York), Dover edition of 1954.
Helmholtz, H. (Feb 1863b), "Ueber den Einfluss der Reibung in der Luft auf die Schallbewegung (On the influence of friction in the air on the sound movement)," Verhandlungen des naturhistorisch-medicinischen Vereins zu Heidelberg III(17), 16–20, Sitzung vom 27.
Helmholtz, H. (1978), "On integrals of the hydrodynamic equations that correspond to vortex motions," International Journal of Fusion Energy 1(3–4), 41–68, URL http://www.wlym.com/archive/fusion/fusionarchive_ijfe.html. English translation of Helmholtz's 1858 paper by Uwe Parpart (1978). The image is on page 41.
Heras, R. (2016), "The Helmholtz theorem and retarded fields," European Journal of Physics 37(6), 065204, URL http://stacks.iop.org/0143-0807/37/i=6/a=065204.
Hill, J. M. (Mar 2007), "Laplace transforms and the Riemann zeta function," Integral Transforms and Special Functions 18(3), 193–205.
Horn, R. and Johnson, C. (1988), Matrix Analysis (Cambridge University Press, Cambridge, England).
Hunt, F. V. (1952), Electroacoustics: The Analysis of Transduction, and Its Historical Background (The Acoustical Society of America, 260 pages, Woodbury, NY 11797).


Johnson, D. H. (2003), "Origins of the equivalent circuit concept: the voltage-source equivalent," Proceedings of the IEEE 91(4), 636–640, URL http://www.ece.rice.edu/~dhj/paper1.pdf.
Karal, F. C. (1953), "The analogous acoustical impedance for discontinuities and constrictions of circular cross section," J. Acoust. Soc. Am. 25(2), 327–334.
Kennelly, A. E. (1893), "Impedance," Transactions of the American Institute of Electrical Engineers 10, 172–232.
Kim, N. and Allen, J. B. (2013), "Two-port network analysis and modeling of a balanced armature receiver," Hearing Research 301, 156–167.
Kim, N., Yoon, Y.-J., and Allen, J. B. (2016), "Generalized metamaterials: Definitions and taxonomy," J. Acoust. Soc. Am. 139, 3412–3418.
Kirchhoff, G. (1868), "On the influence of heat conduction in a gas on sound propagation," Ann. Phys. Chem. 134, 177–193.
Kirchhoff, G. (1974), "On the influence of heat conduction in a gas on sound propagation (English)," in Physical Acoustics, Benchmark Papers in Acoustics Volume 4, edited by B. Lindsay (Dowden, Hutchinson & Ross, Inc., Stroudsburg, PA), pp. 7–19.
Kleiner, M. (2013), Electroacoustics (CRC Press, Taylor & Francis Group, Boca Raton, FL, USA), https://www.crcpress.com/authors/i546-mendel-kleiner.
Kusse, B. R. and Westwig, E. A. (2010), Mathematical Physics: Applied Mathematics for Scientists and Engineers (John Wiley & Sons).
Lamb, H. (1932), Hydrodynamics (Dover Publications, New York).
Lighthill, S. M. J. (1978), Waves in Fluids (Cambridge University Press, England).
Lin, J. (Jan 1995), "The Needham puzzle: Why the Industrial Revolution did not originate in China," in Economic Behavior, Game Theory, and Technology in Emerging Markets, edited by B. Christiansen (PNAS and IGI Global, 701 E Chocolate Ave., Hershey, PA), vol. 43(2), chap. Economic Development and Cultural Change, pp. 269–292, DOI:10.1073/pnas.0900943106, PMID:19131517, DOI:10.1086/452150.
Mason, W. (Apr 1927), "A study of the regular combination of acoustic elements, with applications to recurrent acoustic filters, tapered acoustic filters, and horns," Bell System Tech. Jol. 6(2), 258–294.
Mason, W. (Feb 1928), "The propagation characteristics of sound tubes and acoustic filters," Phys. Rev. 31, 283–295.
Mawardi, O. K. (Jul 1949), "Generalized solutions of Webster's horn theory," J. Acoust. Soc. Am. 21(4), 323–30.
Mazur, J. (2014), Enlightening Symbols: A Short History of Mathematical Notation and Its Hidden Powers (Princeton University Press, Princeton, NJ).
McMillan, E. M. (1946), "Violation of the reciprocity theorem in linear passive electromechanical systems," Journal of the Acoustical Society of America 18, 344–347.
Menzies, G. (2004), 1421: The Year China Discovered America (HarperCollins, NYC).


Menzies, G. (2008), 1434: The Year a Magnificent Chinese Fleet Sailed to Italy and Ignited the Renaissance (HarperCollins, NYC).
Miles, J. W. (Sept 1948), "The coupling of a cylindrical tube to a half-infinite space," J. Acoust. Soc. Am. 20(5), 652–664.
Miller, D. A. (Sep 1991), "Huygens's wave propagation principle corrected," Optics Letters 16(18), 1370–2.
Montgomery, C., Dicke, R., and Purcell, E. (1948), Principles of Microwave Circuits (McGraw-Hill, Inc., New York).
Morse, P. and Feshbach, H. (1953), Methods of Theoretical Physics, Vols. I and II (McGraw-Hill, Inc., New York).
Morse, P. M. (1948), Vibration and Sound (McGraw-Hill, 468 pages, NYC, NY).
Newton, I. (1687), Philosophiae Naturalis Principia Mathematica (Reg. Soc. Press).
Olson, H. F. (1947), Elements of Acoustical Engineering, 2nd edition (D. Van Nostrand Company, 539 pages).
Palmerino, C. (1999), "Infinite degrees of speed: Marin Mersenne and the debate over Galileo's law of free fall," Early Science and Medicine 4(4), 269–328, www.jstor.org/stable/4130144.
Papasimakis, N., Raybould, T., Fedotov, V. A., Tsai, D. P., Youngs, I., and Zheludev, N. I. (May 2018), "Pulse generation scheme for flying electromagnetic doughnuts," Phys. Rev. B 97, 201409, URL https://link.aps.org/doi/10.1103/PhysRevB.97.201409.
Papoulis, A. (1962), The Fourier Integral and Its Applications (McGraw-Hill Book Co., New York).
Parent, P. and Allen, J. B. (2010), "Wave model of the human tympanic membrane," Hearing Research 263, 152–167.
Pierce, A. D. (1981), Acoustics: An Introduction to Its Physical Principles and Applications (McGraw-Hill, 678 pages, New York).
Pipes, L. A. (1958), Applied Mathematics for Engineers and Physicists (McGraw-Hill, NYC, NY).
Ramo, S., Whinnery, J. R., and Van Duzer, T. (1965), Fields and Waves in Communication Electronics (John Wiley & Sons, Inc., New York).
Rayleigh, J. W. (1896), Theory of Sound, Vol. I (480 pages) and Vol. II (504 pages) (Dover, New York).
Riemann, B. (1851), "Grundlagen für eine allgemeine Theorie der Funktionen einer veränderlichen complexen Grösse [An alternative basis for a general theory of the functions of complex variables]," Ph.D. thesis, University of Göttingen.
Riemann, B. (Nov 1859), "On the number of primes less than a given magnitude," Monatsberichte der Berliner Akademie, 1–10, translated by David R. Wilkins (1998).
Robinson, S. R. (2017), "Effects of ear-canal geometry and middle-ear pressure on wideband acoustic reflectance," Ph.D. thesis, Univ. of IL, Urbana, IL.
Salmon, V. (1946a), "Generalized plane wave horn theory," J. Acoust. Soc. Am. 17(3), 199–211.
Salmon, V. (1946b), "A new family of horns," J. Acoust. Soc. Am. 17(3), 212–218.


Schelkunoff, S. (1943), Electromagnetic Waves, 6th edition (Van Nostrand Co., Inc., Toronto, New York and London).
Schwinger, J. S. and Saxon, D. S. (1968), Discontinuities in Waveguides: Notes on Lectures by Julian Schwinger (Gordon and Breach, New York).
Shaw, E. (1970), "Acoustic horns with spatially varying density or elasticity," J. Acoust. Soc. Am. 50(3), 830–840.
Sommerfeld, A. (1949), Partial Differential Equations in Physics, Lectures on Theoretical Physics, Vol. I (Academic Press Inc., New York).
Sommerfeld, A. (1952), Electrodynamics, Lectures on Theoretical Physics, Vol. III (Academic Press Inc., New York).
Sondhi, M. and Gopinath, B. (1971), "Determination of vocal-tract shape from impulse responses at the lips," J. Acoust. Soc. Am. 49(6:2), 1867–73.
Stewart, J. (2012), Essential Calculus: Early Transcendentals (Cengage Learning, Boston, MA), https://books.google.com/books?id=AcQJAAAAQBAJ.
Stillwell, J. (2002), Mathematics and Its History, Undergraduate Texts in Mathematics, 2nd edition (Springer, New York).
Stillwell, J. (2010), Mathematics and Its History, Undergraduate Texts in Mathematics, 3rd edition (Springer, New York).
Van Valkenburg, M. (1960), Introduction to Modern Network Synthesis (John Wiley and Sons, Inc., New York).
Van Valkenburg, M. (1964a), Network Analysis, 2nd edition (Prentice-Hall, Englewood Cliffs, NJ).
Van Valkenburg, M. E. (1964b), Modern Network Synthesis (John Wiley & Sons, Inc., New York, NY).
Vanderkooy, J. (1989), "A model of loudspeaker driver impedance incorporating eddy currents in the pole structure," JAES 37(3), 119–128.
Webster, A. G. (1919), "Acoustical impedance, and the theory of horns and of the phonograph," Proc. Nat. Acad. Sci. 5, 275–282.
Youla, D. (Sep 1964), "Analysis and synthesis of arbitrarily terminated lossless nonuniform lines," IEEE Transactions on Circuit Theory 11(3), 363–372.



Index

∇×(): see curl, 189
∇·(): see divergence, 189
F: see fractional numbers, 37
Γ(s): reflectance, 208
∇(): see gradient, 189, 200
∇²(): see Laplacian, 46, 189
N: see counting numbers, 36, 245
P: see primes, 36
Q: see rational numbers, 37
T(s): see transmission matrix, 127
∇²(): see vector Laplacian, 189
Y: see characteristic admittance, 218
Z: see integers, 36
Z: see characteristic impedance, 218
A‖, A⊥, 212
δ(t): Dirac impulse, 283
δk: Dirac impulse, 345
E, D, H, B: definitions, 230
E, D, H, B: units, 231
κ, propagation function, 75
⊥: see perp p. 246, 36
ρπ(N): density of primes, 269, 319
tan(s), inverse, 146
tan−1(s), 146
τg(ω): group delay, 146
ζ(s) function: see zeta function, 343
ζ(s), Riemann, 52
ζ(x), Euler, 52
ka < 1, 236
u(t): Heaviside step, 283
u(t): step function, 279
uk: Heaviside function, 345
zviz.m, see: plots, colorized, 342
ceil(x), 248
compan, 85
conv(A,B), 85, 105
deconv(A,B), 85
fix(x), 248
floor(x), 57, 248
mod(x,y), 60
poly(R), 85
polyder(), 85
polyval(), 84, 105
rem(x,y), 60
residue(), 105
residue(N,D), 85
root(), 103, 105
round(x), 248
roundb(x), 248
1–port, 135
2–port, 135
abacus, 35, 238
ABCD: see transmission matrix, 127
abscissa, see domain, 165
active/passive, 291
admittance, poles and zeros, 208
admittance: see impedance, 127
algebraic network, see transmission matrix, 132
analytic continuation, 89
analytic functions, 42, 93, 146
analytic geometry, 107, 111
analytic, complex, 76, 95, 133
analytic, evolution operator, 75
anti-reciprocal, see reciprocity, 134
Assignments: AE-1, 97
Assignments: AE-2, 117
Assignments: AE-3, 141
Assignments: DE-1, 152
Assignments: DE-2, 168
Assignments: DE-3, 182
Assignments: NS-1, 30
Assignments: NS-2, 47
Assignments: NS-3, 66
Assignments: VC-1, 211
Assignments: VC-2, 240
Bach, J. S., 28
Bardeen, John, 96
Beranek, Leo, 244
Bernoulli, Daniel, 73
Bernoulli, Jakob, 27
Bernoulli, Johann, 27
Bessel function, 163, 178, 179, 181
Bessel function, roots, 203
Bethe, 96
Bhaskara II, early calculus, 11


Bhaskara II, 63
bi-harmonic, 150
bilinear transformation, 105, 139, 249, 291
Boas, 244
Bode, 244
Bombelli, 21, 24, 39, 147
Brahmagupta, 21, 35, 63, 89, 239
branch cut, 137, 160–162, 341
branch index, 137
branch points, 161
Brillouin zones, 156, 338
Brillouin, Leon, 73, 75, 96, 203, 222, 244
Brune condition, 92, 158, 363
Brune impedance, 106, 150, 157, 166, 178, 362, 363
Brune impedance, properties, 363
Campbell, George Ashley, 203, 205, 222
Cantor, 37
Cardano, 76
Cauchy, 46, 73, 95, 138, 160
Cauchy integral formula, 177
Cauchy integral theorem, 46, 165, 177
Cauchy residue theorem, 87, 177
Cauchy-Riemann conditions, 149
Cauer synthesis, 61, 277, 327
causality, 128, 289, 290
CFA, 55
CFA matrix, 274, 324
CFA: see continued fractions, 57
characteristic admittance, 218
characteristic impedance, 185
Chebyshev, 45
Chinese four stone problem: see puzzles, 43
chord–tangent methods, 138, 139, 329, 332
CoC, 227
codomain, 95
CoG, 227
colorized plots: see plots, colorized, 136
companion matrix, 83, 85, 284
completing the square, 77
complex analytic, 146, 159, 160, 162, 341
complex analytic function, 75, 76, 94, 95, 131, 134, 137–139, 147–150, 156–161, 163–167, 177–180, 196, 198, 201, 222
complex analytic functions, 94
complex analytic: see analytic, 94
complex analytic: see causal, 134
complex frequency, 156
complex frequency plane, 158
complex frequency, see Laplace frequency, 177
complex numbers, 37
complex numbers, Newton's view, 147
complex numbers: history, 39
complex numbers: polar representation, 38
complex propagation function, see propagation function, 75, 156
composition, 45, 63, 105, 107, 108, 111, 123, 127, 138, 139
composition: line and sphere, 138
compressible, 229
conjugate of a complex number, 39
conjugate variables, 125, 167
conservation of energy, 24, 74, 92, 106, 125, 148, 157, 158
conservative field, 151, 197, 229
continued fractions, 50, 54, 57, 60, 61, 64
continued fractions, circuit theory, 277, 327
convex hull, 81, 90
convolution, 103, 104, 129, 179
coprimes, 36, 37, 54, 55, 57, 269, 319
counting numbers, 36
cryptography, 42
cubic, 76
curl, 189
curl ∇×(), 46, 192, 200, 227
d'Alembert, 25, 27, 28, 73, 75, 103, 146, 156, 193, 208, 218, 222, 232
d'Alembert's superposition principle, 208
day: seconds in, 24
deconv(), 105
deconvolution, 85, 105
degree vs. order, 163
degree, fractional, 249
degree, polynomials, 76, 103, 163, 248
degrees of freedom, 84, 229
delay, see group delay, 146
derivative, fractional, 163
derivative, half, 283
derivative, order, 163
Descartes, 24, 26, 103, 146, 335
determinant, 112, 116, 191, 252, 253
DFT, 128
diffusion equation, 198–200, 222
Diophantus, 19, 21, 62
Dirac impulse, 345
dispersion relation, 156
dispersion relation, see propagation function, 156
dispersive wave propagation, 75
divergence, 189
divergence ∇·(), 46, 192, 200, 227
division with rounding, 57


DoC, 227
DoF: see degrees of freedom, 84
DoG, 227
domain, 95
domain-coloring: see plots, colorized, 136
driving-point impedance, 290
DTFT, 128
dynamic stiffness of air, 75
eardrum standing waves, 236
eigen-function of d/dt, 180
eigen-space, see vector space, 203
eigenanalysis, 22
eigenanalysis of reversible systems, 264
eigenanalysis: 2x2 matrix, 87, 263
eigenanalysis: Fibonacci, 88
eigenanalysis: Pell's equation, 64
eigenfunction, 208
eigenfunction expansion, 334
eigenmodes, 43
eigenmodes: physical explanation, 195
eigenvalues, 65, 84–87
eigenvalues, Fibonacci sequence, 87
eigenvalues, Pell's equation, 87, 260
Einstein, 76, 285
Einstein, Albert, 73
elimination, 107, 108
elliptic differential equation, 198
energy, 24, 25, 74, 125, 167, 291, 292
entire function, 93, 207
entire functions, 92, 208
entropy, 54
epicycle, 74
equations, linear, 89
equi-partition theorem, 76
Eratosthenes, 52, 344
error function, 283
essential singularity, 160, 161
Euclid, 37
Euclid's formula, 21, 62, 101, 329
Euclidean algorithm, 54, 55, 269, 319
Euclidean length, 109
Euler, 26–28, 73, 138, 335, 343
Euler product formula, 82
Euler product formula: see Riemann ζ function, 82
Euler's equation, 202
Euler's functional zeta equation, 347
evanescent modes, 221
evolution, 199
evolution operator κ(s), 75
Ewald, 96
extended complex plane, 138, 139
factoring, 76
Fermat, 24, 26, 63, 103, 146, 335
Feshbach, 244
Feynman, 188, 244
FFT, 128
Fibonacci, 333, 334
Fibonacci recursion formula, 35
Fibonacci sequence, 65, 70, 84, 88, 113, 123
Fibonacci sequence eigenanalysis, 88
Fibonacci sequence eigen-decomposition, 70
Fletcher, Harvey, 73
Fourier Series, 128
Fourier transform, 128
Fourier transform properties, 130
Fourier transform table, 129
fractional numbers, 37
fractional, degree, 163
fractional-order, 163
frequency domain, 75, 128, 159
frequency, complex, 177
frequency, complex: Laplace frequency, 75
Fry, 244
FTC, 149
FTCC, 148
FTVC, 197
function, circular, 95
function, inverse, 95
function, periodic, 95
functions, causal, 133
functions, periodic, 133
functions: Γ(s), 347
functions: see complex analytic, 94
fundamental theorems: algebra, 45, 46, 103, 105
fundamental theorems: arithmetic, 37, 40, 45, 46
fundamental theorems: calculus, 46, 148
fundamental theorems: complex calculus, 46, 138, 145, 148
fundamental theorems: complex calculus (Cauchy), 153
fundamental theorems: prime number, 45, 271, 321
fundamental theorems: real calculus (Leibniz), 153
fundamental theorems: scalar vs. vector, 145
fundamental theorems: table of, 149
fundamental theorems: vector calculus, 26, 46, 189, 197, 228
Galileo, 8, 24, 74, 95, 146, 195, 198
Gamma function, 347
Gauss, 46, 73, 160, 269, 319
Gauss's law, 231
Gauss, CFA vs. GCD, 57
Gauss, GCD vs. CFA, 275, 325


Gauss-Lucas theorem, 81, 337

Gauss-Lucas theorem: inverse, 90

Gaussian elimination, 115

GCD, 55

GCD: see greatest common divisor, 43

generalized flow, 125

generalized force, 125

generalized impedance, 125, 126, 201

geometric series, 91

Gibbs phenomenon, 129, 280

GoD, 227

gOd, 227

gOd: Little-god, 228

gradient, 189

gradient ∇(), 46, 192, 200, 227

gravity, 25, 74, 76, 111, 190, 198

gravity waves, 76

greatest common divisor, 43, 54–57, 273, 323

Green, 244

Green’s theorem, 150

group delay, 146, 355

gyrator, 124, 134, 135, 289

gyroscope, 191

Halley, 10, 11

Hamming, 15

Hankel function, 207

Heaviside, 73

Heaviside step function, 279

Heisenberg, 96

Heisenberg, Werner, 63, 87

Helmholtz, 73, 146, 228, 244, 285

Helmholtz portrait, 228

Helmholtz Theorem, 26

Helmholtz’s decomposition theorem, 189

Helmholtz’s decomposition, see fundamental theorem of vector calculus, 228

Hilbert space, see vector space, 203

Hilbert transform, 128

Homework, see Assignments, 30

horn equation, 193, 203, 205, 218, 224

horn, conical, 220

horn, exponential, 220

horn, uniform, 218

Hunt, 244

Huygens, 146, 202

Hydrogen spectrum, 356

hyperbolic differential equation, 198

IEEE-754, 37, 41, 76, 85

impedance, 106, 127, 132, 135, 150, 156, 157, 167, 177, 178

impedance, Brune, 106, 363

impedance, generalized, 201

impedance, poles and zeros, 208

impulse function, 283

impulse response, 249

incompressible, 229

inner product: see scalar product, 109

integers, 36

integers, utility of, 40

integration by parts, 206

integration, half, 283

intensity, 156

internet security, 42, 43, 270, 271, 320, 321

intersection, 107

intersection of curves, 107, 112, 115, 329, 330, 332, 335, 341

intersection of sets, 107

intersection point, 112

irrational poles, 161

irrotational, 191, 229, 230

Kelvin, Lord: William Thomson, 73

Kepler, 74

key encryption, 270, 320

Kirchhoff, 46, 73, 285

Kirchhoff’s laws, 126, 201

Kirchhoff’s portrait, 228

Klein, Felix, 9

Kleppner, Daniel, 238

Lagrange, 63

Lamb, 244

Laplace and the speed of sound, 26

Laplace frequency, 75, 127, 128, 158, 159, 177, 279

Laplace frequency plane, 158

Laplace transform, 132, 280

Laplace transform, impedance, 158

Laplace transforms: Properties I, 279, 281

Laplace transforms: Table II, 282

Laplace transforms: Table III, 283

Laplacian, 189

Laplacian ∇2, 167, 191, 198, 200, 203, 227

Laplacian N dimensions, 193

Laplacian, scalar ∇2, 46

Laplacian, vector ∇2, 46, 228

lattice vector, 338

laws of gravity, 74

Lec20: zeta function, product formula, 344

Lec20: zeta function poles, 345

Lec20: zeta function RoC, 345

Lec20: zeta function, Euler, 343


Lec20: zeta function, Riemann, 343

Lec20: zeta function: definition, 344

Leibniz, 11

length, Euclidean, 109

Leonardo da Vinci, 24

light, speed of, 230, 235

Lighthill, 244

linear equations, 103

linearity, 289, 290

logarithmic derivative, 79, 81, 161, 164, 337

logarithmic derivative: definition, 79

logarithmic integral, 269, 319

Long, Kehan, 332

Lord Kelvin, 73

Lord Rayleigh, 73

Möbius transformation, see bilinear, 106

Mason, Warren, 187

Mason, Warren P., 285

matrix algebra, 107

Maxwell, 46, 73

Maxwell’s equation: matrix form, 231

Maxwell’s equations, 29, 46, 145, 230, 236

Maxwell’s equations: d’Alembert solution, 232

Maxwell’s equations: names, 230

Maxwell’s equations: units, 231

Maxwell’s matrix wave equation, 232

meromorphic, 161

meromorphic function, 161

Mersenne, 24, 146, 156, 195

Mersenne and the speed of sound, 75

Mersenne primes, 25

modal damping, 87

modal-analysis, 87, 203

modulo, 248

monic polynomial, 45, 76, 84, 85, 337

Morse, 244

Morse, Phillip, 244

music, 19, 20, 23, 43, 74, 239

Natural numbers: see counting numbers, 36

Needham, Joseph, 20

network postulates, 158, 289

network postulates, summary, 135

network, algebraic, see Transmission matrix, 132

networks, algebraic, 290

Newton, 11, 24, 26, 63, 73–76, 79, 80, 95, 125, 138, 146, 198, 202, 239, 335

Newton and the speed of sound, 75

Newton’s laws, 201

Newton’s laws of motion, 74

Newton’s method, 78–81, 85

Newton’s root finding method fails, 80

Newton, root finding method, 79

non-reciprocal, see reciprocity, 134

nonlinear algebra, 134

norm, see scalar product, 109

Nyquist approximation, 185

Nyquist, H., 73, 135

Oakley, Barbara, 15

octave, 45

Ohm’s law, 125–127, 135, 201

Olson, 244

order vs. degree, 163

order, derivatives, 163, 248

order, fractional, 163

ordinate, see range, 165

parabolic differential equation, 198

partial fraction expansion, 93, 163, 166, 363, 364

passive, 289, 291

passwords, 43

Pauli, 96

Pell’s equation, 63, 334

Pell’s equation, Heisenberg’s view, 35, 63

Pipes, 244

Plimpton-322, 62

plots, colorized: e^(π√s), 163

plots, colorized: atan(z), 147

plots, colorized: 1/(1 − e^(−sπ)), 346

plots, colorized: log((1−z)/(1+z)), 147

plots, colorized: ±√s, 163

plots, colorized: log(), 137

plots, colorized: tan(), arctan(), 342

plots, colorized: √(s^2 + 1), 164

plots, colorized: J0(), I0(), H0(), 342

plots, colorized: J0(πz) vs. cos(πz), 178

plots, colorized: s^2, √s, 161

plots, colorized: w(z) = log z, 137

plots, colorized: w(z) = e^z, 137

plots, colorized: reference condition, 136

PNT: see prime number theorem, 36

Poincaré, 73

point at infinity, 138–140

Poisson, 198, 199

poles, 249, 280, 363

poles, degree, 163

poles, fractional, 163, 249

poles, simple, 166

poly(), 105

polynomial, 76

polynomial, long division, 105

polynomials, 22


positive-definite, 291

positive-real, 92, 158, 207, 291

postulates, 133, 289

postulates, Feynman’s view, 132

postulates, of horns, 206

postulates, of systems, 75, 132, 133

potential, 189, 196, 224

power, 125, 156, 157, 167, 291

Poynting vector, 232

prime, 343

prime number theorem, 36, 44, 45, 54, 269, 319

prime number theorem: see fundamental theorems,

271, 321

prime sieve, 52, 53

prime, density, 269, 319

primes, 36, 42, 45, 239, 269–271, 319–321

primes, Eratosthenes, 53

primes, Mersenne, 25

primes, utility of, 270, 320

Principia, 11, 26, 75, 146

probability: average, 143

probability: event, 143

probability: expected value, 143

probability: experiment, 143

probability: observable, 143

probability: random variable, 143

probability: sampling noise, 144

probability: the mean of many trials, 143

probability: the number of events, 143

probability: trial, 143

probability: uncertainty, 143

propagation function, 75, 127, 156, 338

propagation function, conical horn, 220

propagation function, exponential horn, 220

propagation function, uniform horn, 218, 361

propagation function, with losses, 285, 287

public-key security, 43

public-key, see cryptography, 42

public-private keys, 270, 320

puzzles, 43, 120

Pythagoras, 19

Pythagorean comma, 239

Pythagorean triplets, 21, 35

Pythagoreans, 19, 24, 37, 40, 42, 62, 63, 89, 332

Pythagoreans, fate, 23

quadratic, 103

quadratic equation, solution, 77

quadratic, solution, 77

quantum mechanics, 43, 87, 191, 207, 237, 238

quartic, 76

quasi-statics, 135, 235, 236, 285, 289, 293, 363

quintic, 76

Ramo, 244

rational function, 105, 106, 160, 208

rational numbers, 37

Rayleigh, 244, 285

Rayleigh reciprocity, see reciprocity, 134

Rayleigh, Lord: John William Strutt, 73

reciprocal, see reciprocity, 134

reciprocity, 124, 134, 289, 292

reflectance, 106, 208, 209, 217, 218, 362

region of convergence: see RoC, 92, 133

relatively prime, 36

removable singularity, 160

residue, 85, 93, 161, 163, 166, 177, 363

residue theorem, 178

residue, impedance, 166

residue-expanded, 61, 84

reversibility, 289, 293

reversible system, eigenanalysis, 264

reversible systems, 134

RHP, 150

Riemann zeta function: mod(n, πk), 343

Riemann, 73, 160, 180

Riemann ζ(s): definition, 101

Riemann sheet, 160, 162, 341

Riemann sphere, 138

right half-plane, see RHP, 150

RoC, 42, 90, 92, 93, 131, 146, 150, 160

RoC: definition, 90, 133

RoD: definition, 90, 133

roots, 76

roots, Bessel function, 203

RSA, 270, 271, 320, 321

Rydberg atom, 86, 195, 356

Rømer, Ole, 26, 63

Sagan, Carl, 235

Salmon, Vince, 43

sampling theorem, 135

scalar field, 189

scalar product, 104, 109, 110, 131, 235

scalar product, see norm, 109

Schrödinger’s equation, 43

Schrödinger, Erwin, 43, 87

Schwarz inequality, 110, 113, 114

seconds in a day, 24

semitone, 19, 239

separation of variables, 221

set theory, 37

set theory symbols defined, 245

Shannon entropy, 54


sheet-index, 161

sieve, 53, 344

signal processing, digital, 66

signals, 128

simple pole, 81

singular circular: see RoC definition, 133

Singular value decomposition, 89

singularity, 160

Slater, J.C., 244

Smith chart, 147

solenoidal, see incompressible, 228

solenoidal, see irrotational, 230

Sommerfeld, 73, 96, 160, 235, 244, 362

Sommerfeld, Arnold, 73

spatial invariant, 289, 293

spectrum: see eigenvalues, 195

speed of light, 26, 63, 76, 230

speed of light, Einstein’s view, 235

speed of light, first measurements, 26

speed of sound: higher order modes, 236

speed of sound: Mersenne measures of, 75

speed of sound: Newton first calculates it, 75

spiral of Fibonacci, 65, 297

spiral of Theodorus, 42

step function, 283, 345

step function, Heaviside, 279

Stokes, 46, 244

Stokes theorem, 231

Stokes’s law, 231

stream 1, 29, 35, 44, 45

stream 2, 29, 44–46, 73, 74

stream 3, 29, 44–46, 66

stream 3a, 145

stream 3b, 187

Sturm-Liouville, 206, 221

SVD, 89

symmetry, 128, 135

systems, 128

Taylor series, 89, 91, 93, 94, 159, 160

Taylor series formula, 149

telegraph equation, 203

telephone equation, 203

teslas, 230

Thévenin parameters, 123, 124, 209, 231, 290

thermal conductivity, 200, 362

time domain, 128

Time Line: 16 BCE–17 CE, 21

Time Line: 16–20 CE, 146

Time Line: 16–18 CE, 24

Time Line: 17–20 CE, 73

Time Line: 17–19 CE, 107

time-invariant, 289, 292

top, 191

train problem, 183, 185, 186

train problem solution, 184

transmission line, 106, 127, 147

transmission line, generalization, 202

transmission lines, 203

transmission lines, CFA, 278, 328

transmission matrix, 123, 127, 134, 218, 289, 290, 294, 364

trap-door function, 43, 270, 320

triangle inequality, 110

triple product, 109

triple product, see vector triple product, 111

Units: E,D,H,B, 231

Vandermonde, 337

vector cross product, 110

vector cross product: see vector product, 111

vector identity mnemonics: DoG, GoD, DoC, CoG, CoC, 227

vector identity: CoC, 235

vector identity: CoG, 234

vector identity: DoC, 234

vector identity: DoG, 190

vector identity: GoD, 190

vector identity: gOd, 228

vector identity: Laplacian, 235

vector inner product: see scalar product, 111

vector Laplacian, 189

vector product, 109, 110, 235

vector space, 108, 109, 235

vector space, Fourier transform, 114

vector triple product, 111

vector wave equation, 338

vector wedge product, 110

vector, three, 111

vector, unit, 111

vectors, 107

visco-thermal loss, 187, 285, 287

wave equation, 74, 202, 338

wave number: see propagation function, 75

wavelength, 75, 156

webers, 230

Webster horn equation, see horn equation, 203, 204

Webster Laplacian, 203

wedge product: see vector wedge product, 110

year: days in, 24

year: improved definition, 25


z-transform, 65, 128

Zeno’s paradox, 145

zeros, 249, 280

zeta function RoC, 101

zeta function, Euler, 52

zeta function, poles, 102

zeta function, product formula, 101

zeta function, Riemann, 52, 101

Zhang, Yiming, 274, 324

zviz.m, see plots, colorized, 136