Math/CCM 573 - Graduate Assignment 4

This is only for students taking Math/CCM 573. It is NOT for students taking Math/CCM 473.

Due: Thursday, April 12. Nothing accepted after Tuesday, April 17. This is worth 15 points. 10% off for being late. Please work by yourself. See me if you need help.

Read the following material from the text. (It is at the end in case you do not have the text.)

1. The material on The Continuous Least Squares Problem on pages 245 - 247.

2. The material on Systems of Differential Equations on pages 289 - 295.

Then do the following. Do all your work by hand showing your work.

1. (3 points) Find the polynomial $\phi(x) = a + bx$ that best approximates $f(x) = x^2$ in the least squares sense on $[0, 1]$.

2. (1 point) Make a plot of $\phi(x)$ and $f(x)$ on the same graph. (You can do this using mathematical software.)

3. (2 points) Calculate $\|f - \phi\|$.

4. (5 points) Find the solution to the following system of differential equations

$$\frac{dx}{dt} = x + 3y$$

$$\frac{dy}{dt} = 4x + 2y$$

that satisfies the initial conditions $x(0) = 7$ and $y(0) = 0$.

5. (4 points) Find the solution to the following system of differential equations

$$\frac{dx}{dt} = x + 3y + 10$$

$$\frac{dy}{dt} = 4x + 2y$$

that satisfies the initial conditions $x(0) = 7$ and $y(0) = 0$.


The Continuous Least Squares Problem

We introduced the discrete least squares problem by considering the problem of approximating a discrete point set $\{(t_i, y_i) \mid i = 1, \ldots, n\}$ by a simple curve such as a straight line. In the continuous least squares problem, the discrete point set is replaced by continuous data $\{(t, f(t)) \mid t \in [a, b]\}$. Thus, given a function $f$ defined on some interval $[a, b]$, we seek a simple function $\phi$ (e.g. a linear polynomial) such that $\phi$ approximates $f$ well on $[a, b]$. The goodness of approximation is measured, not by calculating a sum of squares, but by calculating the integral of the square of the error:

$$\int_a^b |f(t) - \phi(t)|^2\,dt. \tag{3.5.30}$$

The continuous least squares problem is solved by minimizing this integral over whichever set of functions $\phi$ we are allowing as approximations of $f$. For example, if the approximating function is to be a first-degree polynomial, we minimize (3.5.30) over the set of functions $\{\phi \mid \phi(t) = a_0 + a_1 t;\ a_0, a_1 \in \mathbb{R}\}$. More generally, if the approximating function is to be a polynomial of degree less than $m$, we minimize (3.5.30) over the set of functions

$$\mathcal{P}_{m-1} = \{\phi \mid \phi(t) = a_0 + a_1 t + \cdots + a_{m-1}t^{m-1};\ a_0, \ldots, a_{m-1} \in \mathbb{R}\}.$$


The set $\mathcal{P}_{m-1}$ is an $m$-dimensional vector space of functions. Still more generally we can let $S$ be any $m$-dimensional vector space of functions defined on $[a, b]$ and minimize (3.5.30) over $S$.

We can solve the continuous least squares problem by introducing an inner product and a norm for functions and utilizing the geometric ideas introduced in this section. The inner product of two functions $f$ and $g$ on $[a, b]$ is defined by

$$\langle f, g\rangle = \int_a^b f(x)g(x)\,dx.$$

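This inner product enjoys the same algebraic properties as the inner product on $\mathbb{R}^n$. For example, $\langle f, g\rangle = \langle g, f\rangle$, and $\langle f_1 + f_2, g\rangle = \langle f_1, g\rangle + \langle f_2, g\rangle$. Two functions $f$ and $g$ are said to be orthogonal if $\langle f, g\rangle = 0$. The norm of $f$ is defined by

$$\|f\| = \left(\int_a^b |f(x)|^2\,dx\right)^{1/2}.$$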

Notice that the norm and inner product are related by $\|f\| = \sqrt{\langle f, f\rangle}$. Furthermore, (3.5.30) can be expressed in terms of the norm as $\|f - \phi\|^2$. Thus the continuous least squares problem is to find $\phi \in S$ such that

$$\|f - \phi\| = \min_{\psi \in S} \|f - \psi\|. \tag{3.5.31}$$
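The inner product and norm defined above can be approximated numerically. The following is a minimal Python sketch, not part of the text: the function names `inner` and `norm`, the default `n=1000`, and the choice of composite Simpson's rule as the quadrature method are all assumptions made for illustration.

```python
import math


def inner(f, g, a, b, n=1000):
    """Approximate <f, g> = integral from a to b of f(x) g(x) dx.

    Uses composite Simpson's rule with n subintervals (n must be even).
    The quadrature rule and the default n are illustrative choices.
    """
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a) / n
    total = f(a) * g(a) + f(b) * g(b)
    for k in range(1, n):
        x = a + k * h
        weight = 4 if k % 2 else 2  # odd nodes get 4, even interior nodes get 2
        total += weight * f(x) * g(x)
    return total * h / 3


def norm(f, a, b, n=1000):
    """Approximate ||f|| = sqrt(<f, f>)."""
    return math.sqrt(inner(f, f, a, b, n))


if __name__ == "__main__":
    # 1 and t - 1/2 are orthogonal on [0, 1]; the norm of t on [0, 1] is sqrt(1/3).
    print(inner(lambda t: 1.0, lambda t: t - 0.5, 0.0, 1.0))
    print(norm(lambda t: t, 0.0, 1.0), math.sqrt(1.0 / 3.0))
```

Simpson's rule integrates polynomials up to degree three exactly (apart from rounding), so both checks in the usage example should agree with the exact values to near machine precision.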

We proved Theorem 3.5.15 only for subspaces of $\mathbb{R}^n$, but it is valid for functions as well. See, for example, [90, Theorem 11.7.2]. Thus the continuous least squares problem (3.5.31) has a unique solution, which is characterized by $f - \phi \in S^\perp$; that is,

$$\langle f - \phi, \psi\rangle = 0 \quad \text{for all } \psi \in S. \tag{3.5.32}$$

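Let $\phi_1, \ldots, \phi_m$ be a basis for $S$. Then $\phi = \sum_{j=1}^m x_j\phi_j$ for some unknown coefficients $x_1, \ldots, x_m$. Substituting this expression for $\phi$ into (3.5.32), setting $\psi = \phi_i$, and applying some of the basic properties of inner products, we obtain

$$\sum_{j=1}^m \langle \phi_j, \phi_i\rangle x_j = \langle f, \phi_i\rangle, \qquad i = 1, \ldots, m. \tag{3.5.33}$$

This is a system of $m$ linear equations in $m$ unknowns, which can be written as a matrix equation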

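$$Cx = d, \tag{3.5.34}$$

where $C$ is the $m \times m$ matrix with entries $c_{ij} = \langle \phi_j, \phi_i\rangle$, $x = [x_1 \ \cdots \ x_m]^T$, and $d$ is the vector with entries $d_i = \langle f, \phi_i\rangle$.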

The matrix $C$ is clearly symmetric. In fact, it is positive definite, so (3.5.34) can be solved by Cholesky's method to yield the solution of the continuous least squares problem.
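To make the system $Cx = d$ concrete, here is a small Python sketch for the monomial basis $1, t, \ldots, t^{m-1}$ on $[0, 1]$, where $\langle t^j, t^i\rangle = 1/(i+j+1)$ with zero-based exponents. It is an illustration under stated assumptions: the target function $f(t) = t^3$ is chosen here only as an example, the helper names are hypothetical, and the system is solved by plain Gaussian elimination in exact rational arithmetic rather than by the Cholesky method the text mentions.

```python
from fractions import Fraction


def gram_monomials(m):
    """C[i][j] = <t^j, t^i> = 1/(i + j + 1) on [0, 1], zero-based exponents."""
    return [[Fraction(1, i + j + 1) for j in range(m)] for i in range(m)]


def rhs_power(p, m):
    """d[i] = <t^p, t^i> = 1/(p + i + 1) on [0, 1], i.e. f(t) = t^p."""
    return [Fraction(1, p + i + 1) for i in range(m)]


def solve(C, d):
    """Solve C x = d exactly by Gaussian elimination (assumes C is nonsingular)."""
    n = len(d)
    M = [row[:] + [d[i]] for i, row in enumerate(C)]  # augmented matrix
    for k in range(n):
        piv = next(r for r in range(k, n) if M[r][k] != 0)
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            M[r] = [mr - factor * mk for mr, mk in zip(M[r], M[k])]
    x = [Fraction(0)] * n
    for k in reversed(range(n)):  # back substitution
        s = sum(M[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


# Best linear approximation x1 + x2*t to the example function f(t) = t^3 on [0, 1].
coeffs = solve(gram_monomials(2), rhs_power(3, 2))

if __name__ == "__main__":
    print(coeffs)  # [Fraction(-1, 5), Fraction(9, 10)]
```

The result, $\phi(t) = -\tfrac{1}{5} + \tfrac{9}{10}t$, can be checked against (3.5.32): both $\int_0^1 (t^3 - \phi(t))\,dt$ and $\int_0^1 t\,(t^3 - \phi(t))\,dt$ vanish.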

Exercise 3.5.35 With each nonzero $y = [y_1 \ \cdots \ y_m]^T \in \mathbb{R}^m$ associate the nonzero function $\psi = \sum_{j=1}^m y_j\phi_j$. Prove that $y^TCy = \langle \psi, \psi\rangle$. Combining this with the fact that $\langle \psi, \psi\rangle > 0$, we conclude that $C$ is positive definite. □

The next exercise shows that the equations (3.5.34) are analogous to the normal equations of the discrete least squares problem.

Exercise 3.5.36 Find $v_1, \ldots, v_m \in \mathbb{R}^m$ for which the normal equations (3.5.22) have the form

$$\sum_{j=1}^m \langle v_j, v_i\rangle x_j = \langle b, v_i\rangle, \qquad i = 1, \ldots, m.$$

Thus the normal equations have the same general form as (3.5.33). □

Exercise 3.5.37 Find the polynomial $\phi(t) = x_1 + x_2t$ of degree 1 that best approximates $f(t) = t^2$ in the least squares sense on $[0, 1]$. Check your answer by verifying that $f - \phi$ is orthogonal to $S = \{a_0 + a_1t \mid a_0, a_1 \in \mathbb{R}\}$. □

Exercise 3.5.38 Let $[a, b] = [0, 1]$, $S = \mathcal{P}_{m-1}$, and let $\phi_1, \ldots, \phi_m$ be the basis of $\mathcal{P}_{m-1}$ defined by $\phi_1(t) = 1$, $\phi_2(t) = t$, $\phi_3(t) = t^2$, ..., $\phi_m(t) = t^{m-1}$. Calculate the matrix $C$ that would appear in (3.5.34) in this case. Note that $C$ is just the $m \times m$ member of the family of Hilbert matrices introduced in Section 2.2. This is the context in which the Hilbert matrices first arose. □

In Section 2.2 the Hilbert matrices served as an example of a family of ill-conditioned matrices $H_m \in \mathbb{R}^{m \times m}$, $m = 1, 2, 3, \ldots$, whose condition numbers get rapidly worse with increasing $m$. Now we can observe, at least intuitively, that the ill conditioning originates with the basis $\phi_1(t) = 1$, $\phi_2(t) = t$, $\phi_3(t) = t^2$, .... If you plot the functions $t^k$ on $[0, 1]$, you will see that with increasing $k$ they look more and more alike. That is, they come closer and closer to being linearly dependent. Thus the basis $\phi_1, \ldots, \phi_m$ is (in some sense) ill conditioned and becomes increasingly ill conditioned as $m$ is increased. This ill conditioning is inherited by the Hilbert matrices.
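The growth of the condition numbers can be observed directly. The sketch below is an illustration only: it computes $\kappa_\infty(H_m) = \|H_m\|_\infty \|H_m^{-1}\|_\infty$ exactly in rational arithmetic, and the use of the infinity norm is an assumption made for convenience (Section 2.2 may well use a different norm). The helper names are hypothetical.

```python
from fractions import Fraction


def hilbert(m):
    """The m x m Hilbert matrix with entries 1/(i + j + 1), zero-based i, j."""
    return [[Fraction(1, i + j + 1) for j in range(m)] for i in range(m)]


def inverse(A):
    """Exact inverse by Gauss-Jordan elimination on [A | I] (A nonsingular)."""
    n = len(A)
    M = [row[:] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for k in range(n):
        piv = next(r for r in range(k, n) if M[r][k] != 0)
        M[k], M[piv] = M[piv], M[k]
        pivot = M[k][k]
        M[k] = [v / pivot for v in M[k]]
        for r in range(n):
            if r != k:
                factor = M[r][k]
                M[r] = [vr - factor * vk for vr, vk in zip(M[r], M[k])]
    return [row[n:] for row in M]


def norm_inf(A):
    """Matrix infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)


def cond_inf(m):
    """Infinity-norm condition number of the m x m Hilbert matrix."""
    H = hilbert(m)
    return norm_inf(H) * norm_inf(inverse(H))


if __name__ == "__main__":
    for m in range(2, 7):
        print(m, cond_inf(m))  # m = 2 gives 27, m = 3 gives 748
```

Because the arithmetic is exact, the printed values are not contaminated by the very rounding errors that ill conditioning amplifies.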

CHAPTER 5

EIGENVALUES AND EIGENVECTORS I

Eigenvalues and eigenvectors turn up in stability theory, theory of vibrations, quantum mechanics, statistical analysis, and many other fields. It is therefore important to have efficient, reliable methods for computing these objects. The main business of this chapter is to develop such algorithms, culminating in the powerful and elegant Francis algorithm, also known as the implicitly-shifted QR algorithm.

Before we embark on the development of algorithms, we take the time to illustrate (in Section 5.1) how eigenvalues and eigenvectors arise in the analysis of systems of differential equations. The material is placed here entirely for motivational purposes. It is intended to convince you, the student, that eigenvalues are important. Section 5.1 is not, strictly speaking, a prerequisite for the rest of the chapter.

Section 5.1 also provides an opportunity to introduce MATLAB's eig command. When you use eig to compute the eigenvalues and eigenvectors of a matrix, you are using Francis's algorithm.

5.1 SYSTEMS OF DIFFERENTIAL EQUATIONS

Many applications of eigenvalues and eigenvectors arise from the study of systems of differential equations.

Fundamentals of Matrix Computations, Third Edition. By David S. Watkins. Copyright © 2010 John Wiley & Sons, Inc.


Figure 5.1 Solve for the time-varying loop currents. [Circuit diagram; legible labels: 6 V, 2 Ω, 1 H]

Example 5.1.1 The electrical circuit in Figure 5.1 is the same as the one that was featured in Example 1.2.8, except that two inductors and a switch have been added. Whereas resistors resist current, inductors resist changes in current. If we are studying constant, unvarying currents, as in Example 1.2.8, we can ignore the inductors, since their effect is felt only when the currents are changing. However, if the currents are varying in time, we must take the inductances into account.

Once the switch in the circuit is closed, current will begin to flow. The loop currents are functions of time: $x_i = x_i(t)$. Just as before, an equation for each loop can be obtained by applying Kirchhoff's voltage law: the sum of the voltage drops around the loop must be zero. The voltage drop across an inductor is proportional to the rate of change of the current. The constant of proportionality is the inductance. Thus if the current flowing through an inductor is $x(t)$ amps at time $t$ seconds, and the inductance is $m$ henries, the voltage drop across the inductor at time $t$ is $m\dot{x}(t)$ volts, where $\dot{x}$ denotes the time derivative $dx/dt$ (amperes/second). Because of the presence of derivative terms, the resulting loop equations are now differential equations. Thus we have a system of two differential equations (one for each loop) in two unknowns (the loop currents).

Let us write down the two equations. First consider the first loop. As you will recall, the voltage drop across the 5 Ω resistor in the direction indicated by the arrow for the first loop is $5(x_1 - x_2)$ volts. The voltage drop across the 1 henry inductor is $1\dot{x}_1$ volts. Summing these voltage drops, together with the voltage drops across the other resistors in loop 1, we obtain the equation

Similarly, in loop 2,

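These are exactly the same as the equations we obtained in Example 1.2.8, except for the derivative terms. Rearranging these equations and employing matrix notation, we can rewrite the system as

$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} -9 & 5 \\ 5 & -8 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 6 \end{bmatrix}. \tag{5.1.2}$$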

Suppose the switch is closed at time zero. Then we can find the resulting currents by solving the system (5.1.2) subject to the initial conditions

$$\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{5.1.3}$$

As we shall see, we can solve this problem with the help of eigenvalues and eigenvectors. □

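The system of differential equations in Example 5.1.1 has the general form $\dot{x} = Ax - b$, where

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad A = \begin{bmatrix} -9 & 5 \\ 5 & -8 \end{bmatrix}, \qquad b = \begin{bmatrix} 0 \\ -6 \end{bmatrix}.$$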

A larger electrical circuit will lead to a larger system of equations of the same or similar form. For example, in Exercise 5.1.21 we consider a system of eight differential equations. The form is again $\dot{x} = Ax - b$, but now $A$ is an $8 \times 8$ matrix. In general we will consider systems of $n$ differential equations, in which the matrix $A$ is $n \times n$.

Homogeneous Systems

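An important step in solving a system of the form $\dot{x} = Ax - b$ is to solve the related homogeneous system $\dot{x} = Ax$, obtained by dropping the forcing term $b$ (corresponding to the battery in Example 5.1.1). Thus we consider the problem of solving

$$\frac{dx}{dt} = Ax, \tag{5.1.4}$$

where $A$ is an $n \times n$ matrix. Equation (5.1.4) is shorthand for

$$\begin{bmatrix} dx_1/dt \\ dx_2/dt \\ \vdots \\ dx_n/dt \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.$$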

A common approach to solving linear differential equations is to begin by looking for solutions of a particularly simple form. Therefore let us seek solutions of the form

$$x(t) = g(t)v, \tag{5.1.5}$$

where $g(t)$ is a nonzero scalar (real or complex valued) function of $t$, and $v$ is a nonzero constant vector. The time-varying nature of $x(t)$ is expressed by $g(t)$, while the vector nature of $x(t)$ is expressed by $v$. Substituting form (5.1.5) into (5.1.4), we obtain the equation

$$\dot{g}(t)v = g(t)Av,$$

or equivalently

$$\frac{\dot{g}(t)}{g(t)}\,v = Av. \tag{5.1.6}$$

Since $v$ and $Av$ are constant vectors, (5.1.6) implies that $\dot{g}(t)/g(t)$ must be constant. That is, there exists a (real or complex) constant $\lambda$ such that

$$\dot{g}(t) = \lambda g(t). \tag{5.1.7}$$

In addition (5.1.6) implies $Av = \lambda v$.

A nonzero vector $v$ for which there exists a $\lambda$ such that $Av = \lambda v$ is called an eigenvector of $A$. The number $\lambda$ is called the eigenvalue of $A$ associated with $v$. So far we have shown that if $x(t)$ is a solution of (5.1.4) of the form (5.1.5), then $v$ must be an eigenvector of $A$, and $g(t)$ must satisfy the differential equation (5.1.7), where $\lambda$ is the eigenvalue of $A$ associated with $v$. The general solution of the scalar differential equation (5.1.7) is $g(t) = ce^{\lambda t}$, where $c$ is an arbitrary constant. Conversely, if $v$ is an eigenvector of $A$ with associated eigenvalue $\lambda$, then

$$x(t) = e^{\lambda t}v$$

is a solution of (5.1.4), as you can easily verify. Thus each eigenvector of $A$ gives rise to a solution of (5.1.4). If $A$ has enough eigenvectors, then every solution of (5.1.4) can be realized as a linear combination of these simple solutions. Specifically, suppose $A$ (which is $n \times n$) has a set of $n$ linearly independent eigenvectors $v_1, \ldots, v_n$ with associated eigenvalues $\lambda_1, \ldots, \lambda_n$. Then for any constants $c_1, \ldots, c_n$,

$$x(t) = c_1e^{\lambda_1 t}v_1 + c_2e^{\lambda_2 t}v_2 + \cdots + c_ne^{\lambda_n t}v_n \tag{5.1.8}$$

is a solution of (5.1.4). Since $v_1, \ldots, v_n$ are linearly independent, (5.1.8) turns out to be the general solution of (5.1.4); that is, every solution of (5.1.4) has the form (5.1.8). See Exercise 5.1.26.
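The verification that $e^{\lambda t}v$ solves $\dot{x} = Ax$ is short. As a sketch, using only $Av = \lambda v$ and the fact that $v$ does not depend on $t$:

```latex
\frac{d}{dt}\bigl(e^{\lambda t}v\bigr)
  = \lambda e^{\lambda t}v
  = e^{\lambda t}(\lambda v)
  = e^{\lambda t}Av
  = A\bigl(e^{\lambda t}v\bigr).
```

Since differentiation and multiplication by $A$ are both linear, any linear combination $c_1e^{\lambda_1 t}v_1 + \cdots + c_ne^{\lambda_n t}v_n$ of such solutions satisfies the same equation term by term.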

An $n \times n$ matrix that possesses a set of $n$ linearly independent eigenvectors is said to be a semisimple or nondefective matrix. In the next section we will see that in some sense "most" matrices are semisimple. Thus for "most" systems of the form (5.1.4), the general solution is (5.1.8).

It is easy to show (Section 5.2) that $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is a solution of the characteristic equation $\det(\lambda I - A) = 0$. For each eigenvalue $\lambda$ a corresponding eigenvector can be found by solving the equation $(\lambda I - A)v = 0$. For small enough (e.g. 2-by-2) systems of differential equations it is possible to solve the characteristic equation exactly and thereby solve the system.

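Example 5.1.9 Find the general solution of

$$\frac{dx_1}{dt} = 2x_1 + 3x_2, \qquad \frac{dx_2}{dt} = x_1 + 4x_2.$$

The coefficient matrix is $A = \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}$, whose characteristic equation is

$$0 = \det(\lambda I - A) = \det\begin{bmatrix} \lambda - 2 & -3 \\ -1 & \lambda - 4 \end{bmatrix} = (\lambda - 2)(\lambda - 4) - 3 = \lambda^2 - 6\lambda + 5 = (\lambda - 1)(\lambda - 5).$$

Therefore the eigenvalues are $\lambda_1 = 1$ and $\lambda_2 = 5$. Solving the equation $(\lambda I - A)v = 0$ twice, once with $\lambda = \lambda_1$ and once with $\lambda = \lambda_2$, we find (nonunique) solutions

$$v_1 = \begin{bmatrix} 3 \\ -1 \end{bmatrix} \quad \text{and} \quad v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$

Since these two ($= n$) vectors are obviously linearly independent, we conclude that the general solution of the system is

$$x(t) = c_1e^{t}\begin{bmatrix} 3 \\ -1 \end{bmatrix} + c_2e^{5t}\begin{bmatrix} 1 \\ 1 \end{bmatrix}. \tag{5.1.10}$$

□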

Exercise 5.1.11 Check the details of Example 5.1.9. □
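The hand computation for a 2-by-2 system can be mirrored in a few lines of Python. This is a minimal sketch, not the method the chapter goes on to develop: the function names `eig2x2` and `solution` are hypothetical, only real eigenvalues are handled, and the matrix used in the example, $A = \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}$, is the one whose characteristic polynomial is $\lambda^2 - 6\lambda + 5$.

```python
import math


def eig2x2(a, b, c, d):
    """Eigenpairs (lambda, v) of [[a, b], [c, d]] via the characteristic equation.

    Sketch only: assumes real eigenvalues. For a repeated eigenvalue of a
    nondiagonal matrix the two returned vectors coincide (defective case).
    """
    if b == 0 and c == 0:  # diagonal matrix: standard basis vectors
        return [(float(a), (1.0, 0.0)), (float(d), (0.0, 1.0))]
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    pairs = []
    for lam in ((tr - root) / 2, (tr + root) / 2):
        # Any nonzero solution of (lambda*I - A) v = 0 will do.
        v = (b, lam - a) if b != 0 else (lam - d, c)
        pairs.append((lam, v))
    return pairs


def solution(pairs, consts, t):
    """Evaluate x(t) = sum_k c_k * exp(lambda_k * t) * v_k."""
    return tuple(
        sum(ck * math.exp(lam * t) * v[i] for ck, (lam, v) in zip(consts, pairs))
        for i in range(2)
    )


if __name__ == "__main__":
    pairs = eig2x2(2, 3, 1, 4)
    print(pairs)  # eigenvalues 1 and 5; eigenvectors (3, -1) and (3, 3)
    print(solution(pairs, (1.0, 1.0), 0.0))
```

The second eigenvector comes out as $(3, 3)$ rather than $(1, 1)$; eigenvectors are determined only up to a nonzero scalar multiple, so the two describe the same solution family.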

The computations in Example 5.1.9 were quite easy because the matrix was only $2 \times 2$, and the eigenvalues turned out to be integers. In a more realistic problem the system would be bigger, and the eigenvalues would be irrational numbers that we would have to approximate. Imagine trying to solve a system of the form $\dot{x} = Ax$ with, say, $n = 200$. We would need to find the 200 eigenvalues of a $200 \times 200$ matrix. Just trying to determine the characteristic equation (a polynomial equation of degree 200) is a daunting task, and then we have to solve it. What is more, the problem of finding the zeros of a polynomial, given the coefficients, turns out to be ill conditioned whenever there are clustered roots, even when the underlying eigenvalue problem is well conditioned (Exercises 5.1.24 and 5.1.25). Clearly we need a better method. A major task of this chapter is to develop one.
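The sensitivity of clustered roots can be seen already in degree two. The following sketch is an illustration chosen here, not taken from Exercises 5.1.24 or 5.1.25: it perturbs the constant coefficient of $(x - 1)^2 = x^2 - 2x + 1$ by $\varepsilon = 10^{-10}$ and measures how far the roots move. The helper name `quadratic_roots` is hypothetical.

```python
import math


def quadratic_roots(b, c):
    """Real roots of x^2 + b*x + c (assumes a nonnegative discriminant)."""
    root = math.sqrt(b * b - 4 * c)
    return (-b - root) / 2, (-b + root) / 2


eps = 1e-10
exact = quadratic_roots(-2.0, 1.0)            # (x - 1)^2: double root at 1
perturbed = quadratic_roots(-2.0, 1.0 - eps)  # roots 1 - sqrt(eps), 1 + sqrt(eps)
shift = perturbed[1] - exact[1]               # about sqrt(eps) = 1e-5

if __name__ == "__main__":
    print("coefficient change:", eps)
    print("root change:       ", shift)
    print("amplification:     ", shift / eps)
```

A change of $10^{-10}$ in one coefficient moves the roots by about $10^{-5}$, an amplification of roughly $10^5$. For comparison, the identity matrix has this same characteristic polynomial, and the eigenvalues of a symmetric matrix move by no more than the 2-norm of a symmetric perturbation.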


Nonhomogeneous Systems

We now consider the problem of solving the nonhomogeneous system

$$\frac{dx}{dt} = Ax - b, \tag{5.1.12}$$

where $b$ is a nonzero constant vector. Since $b$ is invariant in time, it is reasonable to try to find a solution that is invariant in time: $x(t) = z$, where $z$ is a constant vector. Substituting this form into (5.1.12), we have $dx/dt = 0$, so $0 = Az - b$, i.e. $Az = b$. Assuming $A$ is nonsingular, we can solve the equation $Az = b$ to obtain a unique time-invariant solution $z$.

Once we have $z$ in hand, we can make the following simple observation. If $y(t)$ is any solution of the homogeneous problem $\dot{x} = Ax$, then the sum $x(t) = y(t) + z$ is a solution of the nonhomogeneous system. Indeed $\dot{x} = \dot{y} + 0 = Ay = A(x - z) = Ax - Az = Ax - b$, so $\dot{x} = Ax - b$. Moreover, if $y(t)$ is the general solution of the homogeneous problem, then $x(t) = y(t) + z$ is the general solution of the nonhomogeneous problem.

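Example 5.1.13 Let us solve the nonhomogeneous differential equation

$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} -9 & 5 \\ 5 & -8 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 6 \end{bmatrix}.$$

This is just (5.1.2) from Example 5.1.1. The two components of the solution, subject to the initial conditions (5.1.3), represent the two loop currents in the electrical circuit in Figure 5.1.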

We must find a time-invariant solution and the general solution of the homogeneous problem, then add them together. First of all, the time-invariant solution must satisfy

$$\begin{bmatrix} 9 & -5 \\ -5 & 8 \end{bmatrix}\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 6 \end{bmatrix}.$$

This is exactly the system we had to solve in Example 1.2.8. The solution is $z_1 = 30/47 \approx 0.6383$ and $z_2 = 54/47 \approx 1.1489$. Recall that these numbers represent the loop currents in the circuit if inductances are ignored.

Now let us solve the homogeneous system

$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} -9 & 5 \\ 5 & -8 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.$$

We can proceed as in Example 5.1.9 and form the characteristic equation $0 = \det(\lambda I - A) = \lambda^2 + 17\lambda + 47$, but this time the solutions are not integers. Applying the quadratic formula, we find that the solutions are $\lambda = (-17 \pm \sqrt{101})/2$. The rounded eigenvalues are $\lambda_1 = -13.5249$ and $\lambda_2 = -3.4751$. We can now substitute these values back into the equation $(\lambda I - A)v = 0$ to obtain eigenvectors. This is tedious by hand because the numbers are long decimals.


An easier approach is to let MATLAB do the work. Using MATLAB's eig command, we obtain the (rounded) eigenvectors

$$v_1 = \begin{bmatrix} 0.7415 \\ -0.6710 \end{bmatrix} \quad \text{and} \quad v_2 = \begin{bmatrix} 0.6710 \\ 0.7415 \end{bmatrix}.$$

Thus the general solution to the homogeneous problem is

$$c_1e^{-13.5249t}\begin{bmatrix} 0.7415 \\ -0.6710 \end{bmatrix} + c_2e^{-3.4751t}\begin{bmatrix} 0.6710 \\ 0.7415 \end{bmatrix}.$$

Adding this to the time-invariant solution of the nonhomogeneous problem, we obtain the general solution of the nonhomogeneous problem:

$$x(t) = \begin{bmatrix} 0.6383 \\ 1.1489 \end{bmatrix} + c_1e^{-13.5249t}\begin{bmatrix} 0.7415 \\ -0.6710 \end{bmatrix} + c_2e^{-3.4751t}\begin{bmatrix} 0.6710 \\ 0.7415 \end{bmatrix}. \tag{5.1.14}$$

To obtain the particular solution that gives the loop currents in the circuit, assuming the switch is closed at time zero, we must apply the initial condition (5.1.3), $x(0) = 0$. Making this substitution in (5.1.14), we obtain

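$$0 = \begin{bmatrix} 0.6383 \\ 1.1489 \end{bmatrix} + c_1\begin{bmatrix} 0.7415 \\ -0.6710 \end{bmatrix} + c_2\begin{bmatrix} 0.6710 \\ 0.7415 \end{bmatrix},$$

which can be seen to be a system of two linear equations that we can solve for $c_1$ and $c_2$. Doing so (using MATLAB) we obtain

$$c_1 = 0.2977 \quad \text{and} \quad c_2 = -1.2802.$$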

Using this solution we can determine what the loop currents will be at any time.

We bring this example to a close by considering the long-term behavior of the circuit. Examining (5.1.14), we see that the exponential functions $e^{-13.5249t}$ and $e^{-3.4751t}$ both tend to zero as $t \to \infty$, since the exponents are negative. This is a consequence of both eigenvalues $\lambda_1 = -13.5249$ and $\lambda_2 = -3.4751$ being negative. In fact the convergence to zero is quite rapid in this case. After a short time, the exponential functions become negligible and the solution is essentially constant: $x(t) \approx \begin{bmatrix} 0.6383 \\ 1.1489 \end{bmatrix}$. In other words, after a brief transient phase, the circuit settles down to its steady state. Figure 5.2 shows a plot of the loop currents as a function of time during the first second after the switch has been closed. Notice that after just one second the currents are already quite close to their steady-state values. □
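The numbers in Example 5.1.13 can be reproduced without MATLAB. The Python sketch below is a stand-in under stated assumptions: it uses the quadratic formula instead of MATLAB's eig (workable only because the matrix is 2-by-2), it takes $b = (0, -6)$, which is inferred here from $Az = b$ with $z = (30/47, 54/47)$, and its helper names are hypothetical. Eigenvectors are scaled to unit length so that they can be compared with the rounded values quoted in the example.

```python
import math

A = ((-9.0, 5.0), (5.0, -8.0))
b = (0.0, -6.0)  # assumption: inferred from A z = b with z = (30/47, 54/47)


def solve2(M, r):
    """Solve the 2x2 linear system M u = r by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return ((r[0] * M[1][1] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - r[0] * M[1][0]) / det)


# Time-invariant solution: A z = b.
z = solve2(A, b)

# Eigenvalues from the characteristic equation lambda^2 - tr*lambda + det = 0.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = math.sqrt(tr * tr - 4 * det)
lams = ((tr - root) / 2, (tr + root) / 2)


def unit_eigvec(lam):
    """Unit-length solution of (lambda*I - A) v = 0; valid because A[0][1] != 0."""
    v = (A[0][1], lam - A[0][0])
    length = math.hypot(*v)
    return (v[0] / length, v[1] / length)


vecs = tuple(unit_eigvec(lam) for lam in lams)

# Initial condition x(0) = 0 means c1*v1 + c2*v2 = -z.
V = ((vecs[0][0], vecs[1][0]), (vecs[0][1], vecs[1][1]))
c = solve2(V, (-z[0], -z[1]))


def x(t):
    """Loop currents at time t: steady state plus decaying transient."""
    return tuple(
        z[i] + sum(c[k] * math.exp(lams[k] * t) * vecs[k][i] for k in range(2))
        for i in range(2)
    )


if __name__ == "__main__":
    print("z    =", z)
    print("lams =", lams)
    print("vecs =", vecs)
    print("c    =", c)
    print("x(1) =", x(1.0))
```

Evaluating `x(t)` for a few values of `t` between 0 and 1 shows the transient dying out and the currents approaching `z`.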

A larger circuit is studied in Exercise 5.1.21. More interesting circuits can be built by inserting some capacitors and thereby obtaining solutions that oscillate in time. Of course, mass-spring systems also exhibit oscillatory behavior.
