E. T. S. I. Caminos, Canales y Puertos 1
EngineeringComputation
Lecture 7
E. T. S. I. Caminos, Canales y Puertos 2
System of Equations
Errors in Solutions to Systems of Linear Equations
Objective: Solve [A]{x} = {b}
Problem:Round-off errors may accumulate and even be
exaggerated by the solution procedure. Errors are often exaggerated if the system is ill-conditioned
Possible remedies to minimize this effect:1. Partial or complete pivoting2. Work in double precision 3. Transform the problem into an equivalent system of
linear equations by scaling or equilibrating
Errors in Solutions to Systems of Linear Equations
E. T. S. I. Caminos, Canales y Puertos 3
Ill-conditioning• A system of equations is singular if det|A| = 0• If a system of equations is nearly singular it is ill-conditioned.
Systems which are ill-conditioned are extremely sensitive to small changes in coefficients of [A] and {b}. These systems are inherently sensitive to round-off errors.
Question:Can we develop a means for detecting these situations?
Errors in Solutions to Systems of Linear Equations
E. T. S. I. Caminos, Canales y Puertos 4
Ill-conditioning of [A]{x} = {b}:
Consider the graphical interpretation for a 2-equation system:
11 12 1 1
21 22 2 2
a a x b (1) =
a a x b (2)
We can plot the two linear equations on a graph of x1 vs. x2.
x1
x2
b2/a22
b1/a11
b1/a12
a11x1+ a12x2 = b1
a21x1+ a22x2 = b2
b2/a21
Errors in Solutions to Systems of Linear Equations
E. T. S. I. Caminos, Canales y Puertos 5
Ill-conditioning of [A]{x} = {b}:
Consider the graphical interpretation for a 2-equation system:
11 12 1 1
21 22 2 2
a a x b (1) =
a a x b (2)
We can plot the two linear equations on a graph of x1 vs. x2.
Well-conditioned
x1
x2
Uncertaintyin x2
Ill-conditioned
x1
x2Uncertainty
in x2
Errors in Solutions to Systems of Linear Equations
E. T. S. I. Caminos, Canales y Puertos 6
Ways to detect ill-conditioning:
1. Calculate {x}, make small change in [A] or {b} and determine the change in the solution {x}.
2. After forward elimination, examine diagonal of upper triangular matrix. If aii << ajj, i.e. there is a relatively small value on diagonal, then this may indicate ill-conditioning.
3. Compare {x}single with {x}double
4. Estimate the "condition number" for A.
Substituting the calculated {x} into [A]{x} and checking this against {b} will not always work!!!
Errors in Solutions to Systems of Linear Equations
E. T. S. I. Caminos, Canales y Puertos 7
Ways to detect ill-conditioning:
• If det|A| = 0 the matrix is singular ==> the determinant may be an indicator of conditioning
• If det|A| is near zero is the matrix ill-conditioned?
• Consider:
After scaling:
==> det|A| will provide an estimate of conditioning if it is normalized by the "magnitude" of the matrix.
60.01 0 0
det 0 0.01 0 100 0 0.01
1 0 0det 0 1 0 1
0 0 1
Errors in Solutions to Systems of Linear Equations
E. T. S. I. Caminos, Canales y Puertos 8
Norms and the Condition Number
We need a quantitative measure of ill-conditioning.
This measure will then directly reflect the possible magnitude of round off effects.
To do this we need to understand norms:
Norm: Scalar measure of the magnitude of a matrix or vector ("how big" a vector is).
Not to be confused with the dimension of a matrix.
Norms
E. T. S. I. Caminos, Canales y Puertos 9
Here are some vector norms for n x 1 vectors {x} with typical elements xi.
Each is in the general form of a p norm defined by the general relationship:
1. Sum of the magnitudes:
2. Magnitude of largest element:(infinity norm)
3. Length or Euclidean norm:
1/ pnp
ipi 1
x x
n
i1i 1
x x
1/ 2n2
i2i 1
x x
ii
x max x
Vector Norms: Scalar measure of the magnitude of a vector
Vector Norms
E. T. S. I. Caminos, Canales y Puertos 10
Vector Norms
Required Properties of vector norm:
1. ||x|| 0 and ||x|| = 0 if and only if [x]=0
2 ||kx|| = k ||x|| where k is any positive scalar
3. ||x+y|| ||x|| + ||y|| Triangle Inequality
For the Euclidean vector norm we also have
4. ||x•y|| ||x|| ||y||
because the dot product or inner product property satisfies:
||xy|| = ||x||•||y|| |cos()| ||x|| • ||y||.
Norms
E. T. S. I. Caminos, Canales y Puertos 11
1. Largest column sum:
(column sum norm)
2. Largest row sum: (row sum norm)
(infinity norm)
p px 1A max A x
n
1 ijj
i 1
A max a
n
iji
j 1
A max a
Matrix Norms: Scalar measure of the magnitude of a matrix.
Matrix norms corresponding to vector norms above are defined by the general relationship:
Matrix Norms
E. T. S. I. Caminos, Canales y Puertos 12
3. Spectral norm: ||A|| 2 = (µmax)1/2
where µmax is the largest eigenvalue of [A]T[A]
If [A] is symmetric, (µmax)1/2 = max , is the largest eigenvalue of [A].
(Note: this is not the same as the Euclidean or Frobenius norm, seldom used:
2
1 1
n n
ijei j
A A
Matrix norms
E. T. S. I. Caminos, Canales y Puertos 13
Matrix NormsFor matrix norms to be useful we require that
0. || Ax || || A || ||x ||General properties of any matrix norm:
1. || A || 0 and || A || = 0 iff [A] = 0
2. || k A || = k || A || where k is any positive scalar
3. || A + B || || A || + || B || "Triangle Inequality"
4. || A B || || A || || B ||
Why are norms important? Norms permit us to express the accuracy of the solution {x} in
terms of || x || Norms allow us to bound the magnitude of the product [ A ] {x}
and the associated errors.
Matrix norms
E. T. S. I. Caminos, Canales y Puertos 14
Forward and backward error analysis can estimate the effect of truncation and roundoff errors on the precision of a result. The two approaches are alternative views:
1. Forward (a priori) error analysis tries to trace the accumulation of error through each process of the algorithm, comparing the calculated and exact values at every stage.
2. Backward (a posteriori) error analysis views the final solution as the exact solution to a perturbed problem. One can consider how different the perturbed problem is from the original problem.
Here we use the condition number of a matrix [A] to specify the amount by which relative errors in [A] and/or {b} due to input, truncation, and rounding can be amplified by the linear system in the computation of {x}.
Error Analysis
E. T. S. I. Caminos, Canales y Puertos 15
Backward Error Analysis of [A]{x} = {b} for errors in {b}
Suppose the coefficients {b} are not precisely represented. What might
be the effect on the calculated value for {x + dx}?
Lemma: [A]{x} = {b} yields ||A|| ||x|| ||b|| or
Now an error in {b} yields a corresponding error in {x}:
[A ]{x + dx} = {b + db}
[A]{x} + [A]{ dx} = {b} + {db}
Subtracting [A]{x} = {b} yields:
[A]{dx} = {db} ––> {dx} = [A]-1{db}
b A1x
A x b
Error Analysis
E. T. S. I. Caminos, Canales y Puertos 16
Define the condition number ask = cond [A] ||A-1|| ||A|| 1
If k 1 or k is small, the system is well-conditioned
If k >> 1, system is ill conditioned.
1 = || I || = || A-1A || || A-1 || || A || = k = Cond(A)
1x A b
1 A
x b
1x b bA A
x b b
Taking norms we have:
And using the lemma:
we then have :
Backward Error Analysis of [A]{x} = {b} for errors in {b}
Error Analysis
E. T. S. I. Caminos, Canales y Puertos 17
Backward Error Analysis of [A]{x} = {b} for errors in [A]If the coefficients in [A] are not precisely represented, what might be effect on the calculated value of {x+ dx}?
[A + dA ]{x + dx} = {b}
[A]{x} + [A]{ dx} + [dA]{x+dx} = {b}
Subtracting [A]{x} = {b} yields:
[A]{ dx} = – [dA]{x+dx} or
{dx} = – [A]-1 [dA] {x+dx}
Taking norms and multiplying by || A || / || A || yields :
1x A AA A
x x A A
Error Analysis
E. T. S. I. Caminos, Canales y Puertos 18
Estimate of Loss of Significance:Consider the possible impact of errors [dA] on the precision of {x}.
pA AIf ~ 10 , then
A A
x
x x
s s p~ 10 , then 10 10
x
x x
Error Analysis
implies that if
Or, taking log of both sides: s > p - log10()
• log10() is the loss in decimal precision; i.e., we start with p decimal figures and end-up with s decimal figures.
• It is not always necessary to find [A]-1 to estimate k = cond[A]. Instead, use an estimate based upon iteration of inverse matrix using LU decomposition.
E. T. S. I. Caminos, Canales y Puertos 19
Impetus for Iterative Schemes:1. May be more rapid if coefficient matrix is "sparse"
2. May be more economical with respect to memory
3. May also be applied to solve nonlinear systems
Disadvantages:1. May not converge or may converge slowly 2. Not appropriate for all systems
Error bounds apply to solutions obtained by direct and iterative methods because they address the specification of [dA] and {db}.
Iterative Solution Methods
E. T. S. I. Caminos, Canales y Puertos 20
Basic Mechanics:Starting with:
a11x1 + a12x2 + a13x3 + ... + a1nxn = b1
a21x1 + a22x2 + a23x3 + ... + a2nxn = b2
a31x1 + a32x2 + a33x3 + ... + a3nxn = b3
: :an1x1 + an2x2 + an3x3 + ... + annxn = bn
Solve each equation for one variable:x1 = [b1 – (a12x2 + a13x3 + ... + a1nxn )} / a11
x2 = [b2 – (a21x1 + a23x3 + ... + a2nxn )} / a22
x3 = [b3 – (a31x1 + a32x2 + ... + a3nxn )} / a33
:
xn = [bn – (an1x2 + an2x3 + ... + an,n-1xn-1 )} / ann
Iterative Solution Methods
E. T. S. I. Caminos, Canales y Puertos 21
Start with initial estimate of {x}0.
Substitute into the right-hand side of all the equations.
Generate new approximation {x}1.
This is a multivariate one-point iteration: {x}j+1 = {g({x}j)}
Repeat process until the maximum number of iterations is reached or until:
|| xj+1 – xj || d + e || xj+1 ||
Iterative Solution Methods
E. T. S. I. Caminos, Canales y Puertos 22
To solve [A]{x} = {b}Separate [A] into: [A] = [Lo] + [D] + [Uo] [D] = diagonal (aii)
[Lo] = lower triangular with 0's on diagonal[Uo] = upper triangular with 0's on diagonal
Rewrite system:[A]{x} = ( [Lo] + [D] + [Uo] ){x} = {b}
[D]{x} + ( [Lo] + [Uo] ){x} = {b}Iterate:
[D]{x}j+1 = {b} – ( [Lo]+[Uo] ) {x}j
{x}j+1 = [D]-1{b} – [D]-1 ( [Lo]+[Uo] ) {x}j
Iterations converge if:|| [D]-1 ( [Lo] + [Uo] ) || < 1(sufficient if equations are diagonally dominant)
Convergence
E. T. S. I. Caminos, Canales y Puertos 23
j 1 j j j1 12 13 1n n 111 2 3
j 1 j j j2 21 23 2n n 222 1 3
j j jj 1n n n1 n2 n,n 1 nn1 2 n 1
x b a x a x a x a
x b a x a x a x a
x b a x a x a x a
Iterative Solution Methods – the Jacobi Method
E. T. S. I. Caminos, Canales y Puertos 24
In most cases using the newest values within the right-hand side equations will provide better estimates of the next value. If this is done, then we are using the Gauss-Seidel Method:
( [Lo]+[D] ){x}j+1 = {b} – [Uo] {x}j
or explicitly:
j 1 j 1 j j2 21 23 2n n 222 1 3
j 1 j 1 j 1 j3 31 32 3n n 333 1 2
j 1 j 1 j 1j 1n n n1 n2 n,n 1 nn1 2 n 1
x b a x a x a x a
x b a x a x a x a
x b a x a x a x a
If this is done, then we are using the Gauss-Seidel Method
Iterative Solution Methods -- Gauss-Seidel
E. T. S. I. Caminos, Canales y Puertos 25
If either method is going to converge,Gauss-Seidel will converge
faster than Jacobi.
Why use Jacobi at all?
Because you can separate the n-equations into n independent tasks, it is very well suited computers with parallel processors.
Iterative Solution Methods -- Gauss-Seidel
E. T. S. I. Caminos, Canales y Puertos 26
Rewrite given system: [A]{x} = { [B] + [E] } {x} = {b}
where [B] is diagonal, or triangular so we can solve
[B]{y} = {g} quickly. Thus,
[B] {x}j+1= {b}– [E] {x}j
which is effectively: {x}j+1 = [B]-1 ({b} – [E] {x}j )
True solution {x}c satisfies:{x}c = [B]-1 ({b} – [E] {x}c)
Subtracting yields: {x}c – {x}j+1= – [B]-1 [E] [{x}c – {x}j]
So ||{x}c – {x}j+1 || ||[B]-1 [E]|| ||{x}c – {x}j ||
Iterations converge linearly if || [B]-1 [E] || < 1
=> || ([D] + [Lo])-1 [Uo] || < 1 For Gauss-Seidel
=> || [D] -1 ([Lo] + [Uo]) || < 1 For Jacobi
Convergence of Iterative Solution Methods
E. T. S. I. Caminos, Canales y Puertos 27
Iterative methods will not converge for all systems of equations, nor for all possible rearrangements.
If the system is diagonally dominant,
i.e., | aii | > | aij | where i j then
i i1 i2 ini 1 2 n
ii ii ii ii
b a a ax x x x
a a a a
with all < 1.0, i.e., small slopes.ij
ii
a
a
Convergence of Iterative Solution Methods
E. T. S. I. Caminos, Canales y Puertos 28
A sufficient condition for convergence exists:
n
ii ij
j 1j i
a a
Notes:
1. If the above does not hold, still may converge.
2. This looks similar to infinity norm of [A]
Convergence of Iterative Solution Methods
E. T. S. I. Caminos, Canales y Puertos 29
Relaxation Schemes:
where 0.0 < 2.0(Usually the value of l is close to 1)
Underrelaxation ( 0.0 < < 1.0 )More weight is placed on the previous value. Often used to: - make non-convergent system convergent or - to expedite convergence by damping out oscillations.
Overrelaxation ( 1.0 < 2.0 )More weight is placed on the new value. Assumes that the new value is heading in right direction, and hence pushes new value close to true solution.
The choice of is highly problem-dependent and is empirical, so relaxation is usually only used for often repeated calculations of a particular class.
new trial oldi i ix = x + (1- ) x
Improving Rate of Convergence of G-S Iteration
E. T. S. I. Caminos, Canales y Puertos 30
We often need to solve [A]{x} = {b} where n = 1000's• Description of a building or airframe,• Finite-Difference approximations to PDE's. Most of A's
elements will be zero; a finite-difference approximation to Laplace's equation will have five aij0 in each row of A.
Direct method (Gaussian elimination)
• Requires n3/3 flops (say n = 5000; n3/3 = 4 x 1010 flops) • Fills in many of n2-5n zero elements of A
Iterative methods (Jacobi or Gauss-Seidel)• Never store [A] (say n = 5000; [A] would need 4n2 = 100 Mb)• Only need to compute [A-B] {x}; and to solve [B]{xt+1} = {b}
Why Iterative Solutions?
E. T. S. I. Caminos, Canales y Puertos 31
• Effort:
Suppose [B] is diagonal,
solving [B] {v} = {b} n flops
Computing [A-B] x 4n flops
For m iterations 5mn flops
For n = m = 5000, 5mn = 1.25x108 At worst O(n2).
Why Iterative Solutions?