Introduction to computational methods in science and engineering using MATLAB

Dr. rer. nat. Hans-Georg Matuttis
University of Electro-Communications,
Department of Mechanical and Control Engineering
Chofu Chofugaoka 1-5-1
Tokyo 182-8585
Japan
http://www.matuttis.mce.uec.ac.jp/

1. When something is unclear, ask me, not your neighbor, who is busy himself. Ask as many questions as you need.

2. The script will be downloadable from http://www.matuttis.mce.uec.ac.jp/ or from the E-Learning system. You can read it online or print it out. If you print out more than 100 pages, you have to submit an application (signed by me) for more printout pages. Reading the script does not replace attending the lecture.

3. The homework is an exercise to learn programming; you cannot learn programming by reading the script.

4. Learn to use the online-help of MATLAB.

5. For the credit: presence in the lecture, and the number of points in the programming homework and the E-Learning system.

Getting Started

0.1 Why ”Computational Methods”

Most problems in science and engineering differ from undergraduate problems in the respect that no "closed solutions" exist: Whereas there is a closed solution (solution function) for the harmonic oscillator with viscous damping,

$$\ddot{x} + \underbrace{2\gamma\dot{x}}_{\text{Viscous damping}} + \omega_0^2 x = 0,$$

$$x(t) = A \exp(-\gamma t)\, \exp\left(\pm i \sqrt{\omega_0^2 - \gamma^2}\; t\right),$$

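there is no closed solution for the harmonic oscillator with sliding friction (see graphics to the right)

$$m\ddot{x} + \underbrace{\operatorname{sgn}(v)\,\mu F_N}_{\text{Sliding friction}} + kx = 0, \qquad \operatorname{sgn}(v) = \frac{v}{|v|},$$

but solutions can only be given piecewise for ranges where the friction is constant:

$$x(t) = -(\hat{x} - x_0)\sin\left(\omega t + \frac{\pi}{2}\right) - x_0 \quad \text{for } 0 \le t \le \frac{T}{2},$$

$$x(t) = -(\hat{x} - 3x_0)\sin\left(\omega t + \frac{\pi}{2}\right) + x_0 \quad \text{for } \frac{T}{2} \le t \le T, \qquad x_0 = \frac{\mu F_N}{k}.$$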
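The two half-period formulas for the oscillator with sliding friction can be evaluated numerically to check that they join continuously at t = T/2. The following is only a minimal sketch: the values chosen for x_hat, x0 and omega are arbitrary assumptions for illustration and are not taken from the lecture.

```matlab
% Piecewise solution of the oscillator with sliding friction
% (all parameter values are illustrative assumptions)
x_hat = 1.0;          % initial amplitude
x0    = 0.1;          % x0 = mu*F_N/k
omega = 2*pi;         % angular frequency
T     = 2*pi/omega;   % period

t1 = linspace(0, T/2, 101);   % first half period
t2 = linspace(T/2, T, 101);   % second half period
x1 = -(x_hat -   x0)*sin(omega*t1 + pi/2) - x0;
x2 = -(x_hat - 3*x0)*sin(omega*t2 + pi/2) + x0;

% both pieces should agree at t = T/2 (up to roundoff)
jump = abs(x1(end) - x2(1));
disp(jump)
```

With plot([t1 t2],[x1 x2]) one can see that the distance of the turning points from the origin shrinks by 2*x0 in every half period, in contrast to the exponential decay for viscous damping.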

For forces which are not linear in x and its derivatives, in general not even piecewise solutions can be given. Other problems for which no closed solutions exist are problems with many degrees of freedom (e.g. planet systems), or flow problems.

For the technically important fields of structural analysis and fluid mechanics, most results are nowadays obtained by computer simulations.


Fluid mechanics: Flow around a sphere with increasing Reynolds number / flow speed: Analytical solutions exist only for the Stokes flow problem.

[Figure panels, with increasing Reynolds number: Stokes, Vortices, Vortex Street, Turbulence]

0.2 New MAC-Installation

Since 4/2006, the exercises-room is equipped with MAC-computers instead of the old SUN-Unix-Workstation-Terminals. Software can either be started via the Window-Icons in the Applications-Directory, or via the command-line terminal, which gets started by clicking on the X-Icon. MATLAB can be started by clicking on the MATLAB-Icon. It is recommended for the course to use the EMACS-Editor. Because the current MAC-OSX-Operating-System is based on the Unix-Operating system, the following comments on Unix are useful, not least because UNIX-commands (for directory listings, previewing of graphics, removal of unneeded data etc.) can be used from the MATLAB-prompt via ! as escape-sequence.

WARNING! The new MAC-Installation allows the teacher to view the screen and the currently active programs on each student terminal. Applications which are not needed for the lesson can be terminated from the teacher-console.

0.3 UNIX-Workstations

The MAC-computers in the terminal-room cannot be used remotely. If students want to log in remotely from outside the terminal-room, access is possible to the SUN-cluster with the name sun.edu.cc.uec.ac.jp, which consists neither of MACs nor of PCs, but of UNIX-Workstations. The login (also possible from the MAC-computers) has to be via the secure shell, and a login with X-forwarding (so that windows are displayed on the local screen) is possible as

ssh -X sun.edu.cc.uec.ac.jp

or

ssh -Y sun.edu.cc.uec.ac.jp

(whether the option -Y or -X is needed depends on the version of the operating system one is logging in from). UNIX was originally written to be used from a command-prompt window, not from GUI/Window systems. It is advisable to be able to use the original UNIX-commands. If you like UNIX and would like a UNIX-like environment on your PC, install the free CYGWIN-package, www.cygwin.com.

Some survival-UNIX-commands:

cp source destination          copy the file source to the file destination; if destination exists, overwrite it with source
cp -r source destination       like cp, -r means recursive, works also with directories
pwd                            display current directory
mv source destination          rename the file source to the file destination
cd                             change directory
ps x                           display existing jobs and their job-id
kill job-id                    kills the job with the number job-id
ls                             list directory content
ls -d */                       list all directories in the current directory
ls -lrt                        list all files with information about their size, in the order in which they have last been modified, the newest ones at the end
ls [a-c]*                      list all files in this directory with names beginning with a, b or c
find . -name thisname -print   look for the file or directory thisname in the current directory and in all the subdirectories
find . -name '*arg*' -print    display all files and directories in the current directory and in all the subdirectories which contain the string arg in their name
fgrep asdf *.txt               look for all lines which contain the string asdf in all files which have the extension .txt

UNIX is a multi-user multi-process operating system, so several users can run commands at the same time on a single computer. It is also possible to move jobs into the background when starting them, so that they don't block the command prompt, by appending the ampersand &. If a program was started in the foreground and blocks the prompt, it can be suspended via CTRL-Z, which interrupts the execution. If bg is then typed in the same terminal window, the execution is resumed in the background. (Of course, this does not work with programs which have an input prompt in the foreground, like MATLAB.)

Some special directories:
./      the current directory
../     the parent directory of the current directory
~/      the user's login directory

Some special characters:
*       any string
?       any single character
[a-f]   any one of the characters a,b,c,d,e,f

The recommendation for this course is to work in a directory which is dedicated to MATLAB alone, and which is not the root-directory of the account. The name ml should do just fine:

mkdir ml
cd ml


0.4 MATLAB

0.4.1 Introduction: Interpreters and Compilers

In general, for programming projects with high numerical complexity, it will be best to develop the algorithms in MATLAB. MATLAB is, like BASIC or symbolic computer languages like MAPLE, MATHEMATICA, MACSYMA and REDUCE, an interpreter language, i.e. the language commands are translated into processor instructions one by one while the program is running. Nevertheless, MATLAB is not a symbolic language, but performs all calculations numerically, i.e. with floating point numbers.1 The language can be used either from a command prompt or as a functional (or object-oriented) programming language. In compiler languages like FORTRAN, C or PASCAL, the program is fully translated into processor instructions before execution. If errors occur at runtime, the memory content is difficult to analyze, usually only with the help of a debugger, which may alter the program execution and memory layout up to the point that some errors cannot be reproduced. The properties of debuggers vary much more than the languages themselves. In MATLAB, after a program crash, the data are still accessible in MATLAB's memory and can be analyzed using the commands of the MATLAB-language itself.

Interpreters allow fast program development. As a rule, their execution times are higher than those of compiler languages, but during program development, the compile time is usually more costly than the actual runtime. In MATLAB, when complex builtin functions are invoked via short commands, like a matrix inversion, very often the advantage in speed for the compiler languages is negligible.

Many programming languages have a whole zoo of data types. MATLAB's elementary data type is the complex matrix. (Recently, MATLAB also offers more kinds of data types, but we will not use them in this course.) Variables can still be processed when they take a complex value. Variables which are used as indices must nevertheless have an integer value.

Because it is not possible to declare variables in MATLAB, it refuses to process variables which are not initialized. In FORTRAN77, for example, it was possible to use variables which were neither declared nor initialized, and which assumed the value 0 at the moment they were used.
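The statements about complex values, integer indices and uninitialized variables can be tried out directly. The following is only a small sketch; the names v, k and undefined_variable are hypothetical and chosen for illustration.

```matlab
a = sqrt(-4);          % the result is complex, the computation simply continues
disp(a)

v = [10 20 30];
k = 2;                 % a variable used as index must have an integer value
disp(v(k))

% a variable which has never been assigned cannot be processed
try
    c = undefined_variable + 1;   % hypothetical name, never assigned
catch
    disp('error: variable was not initialized')
end
was_refused = ~exist('c', 'var');  % c was never created
```

The try/catch construction is used here only so that the example runs through; without it, MATLAB stops with an error message at the offending line.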

0.4.2 Getting started

The MATLAB-Interpreter is started on our installation by typing

matlab

at the command prompt, which starts the MATLAB-desktop. If you are busy and you don't want to see the splash-screen (MATLAB-Commercial) at the program start, use

matlab -nosplash

1The symbolic package available with MATLAB is basically MAPLE with a MATLAB-Interface.

Basic Commands:
edit        starts the MATLAB-Editor with syntax-highlighting of MATLAB-commands. You can use any editor you like to write MATLAB-files, but the line-end may vary between operating systems and may lead to trouble
clear       empties the memory
clear a     clears the variable a from the memory
who         displays the variables which have been assigned
help        gives help concerning a specific topic
help help   tells you how to use the help function
lookfor     looks for a word in the help files, useful if you are looking for a command according to context, but are not sure about the command name
disp(a)     displays the value of the variable a
disp('a')   displays just the string a
rand        random number generator, will be used a lot to initialize data
format      formats the output, format compact suppresses the output of empty lines, format short forces the rounding of the output to four digits after the decimal point, but the computations are still performed with full precision
%           comment sign
ls          lists the content of the current working directory of MATLAB, i.e. the directory for which MATLAB can access the files directly
cd          changes the current working directory of MATLAB

The MATLAB-desktop is written in JAVA (another interpreter-based programming language), which has still some stability problems2, so the desktop crashes relatively often. If you don't want to work with the desktop to avoid unnecessary crashes, but want to write the programs in a Unix-editor you know, you can also start MATLAB with the command-prompt only as

matlab -nosplash -nodesktop

2To get an idea why the JAVA-Interface of MATLAB crashes so often, see the internal memo fromSUN from http://www.internalmemos.com/memos/memodetails.php?memo id=1321


Special Characters:
!       escape sequence, allows to use UNIX-Commands like cd, pwd from the MATLAB-prompt
[...]   1. vector brackets referring to the value of the entries, [1 2 3] is a vector with the entries 1, 2 and 3.
        2. brackets referring to the output arguments of functions.
(...)   1. brackets referring to the indices of a vector, a(3) is the third element of the vector a.
        2. brackets referring to the input arguments of a function.
...     three dots mark the end of a line which is continued in the next line
;       has no syntactical function like in C, but is only used to suppress the output of the operation
i, j    stand for the imaginary unit $\sqrt{-1}$, but can also be overwritten for other uses.
pi      is indeed 3.1415....
,       separates commands, when several commands should be written in the same editor line
:       separates the bounds of loop variables, lower_bound:stepwidth:upper_bound. WARNING! lower_bound,stepwidth,upper_bound only displays the variables lower_bound, stepwidth, upper_bound
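Some of the special characters from the table can be seen at work in a few lines. This is only a sketch with arbitrarily chosen values; the variable names are taken from the table or made up for illustration.

```matlab
lower_bound = 1; stepwidth = 2; upper_bound = 9;   % comma-free: ; separates and suppresses output

r = lower_bound:stepwidth:upper_bound;   % colon syntax: the row vector [1 3 5 7 9]

z = 3 + 4i;     % 4i is always imaginary, even if the variable i was overwritten

long_sum = 1 + 2 + ...
           3 + 4;          % three dots continue the line

disp(r)
disp(abs(z))
```

Writing 4i (or 1i) instead of 4*i is a defensive choice for programs in which i is also used as loop variable.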

As a first reference, Kermit Sigmon's MATLAB primer at http://math.ucsd.edu/~driver/21d-s99/matlab-primer.html can be recommended. It gives a short overview over the available commands, but it is a good idea to get used to the builtin help-function of MATLAB (just type help at the prompt). For most purposes, the internal help is sufficient. Manuals for MATLAB are available, but there is not much information which one needs on a daily basis beyond the builtin help command, except the references to the algorithms used. This is a huge difference to e.g. MATHEMATICA, where the algorithms are "secret". Beware, in contrast e.g. to FORTRAN, MATLAB is case sensitive, ABC is not the same as abc. If you use the same variable names in lower case and upper case in the same program, you will run into trouble anyway. Information about a public-domain clone of MATLAB, OCTAVE, can be found at www.octave.org.

Control statements are usually terminated via the end command, no matter whether it is an if statement or a for loop:

for i=1:10
   i
end

a=2
b=3
if (a>b)
   disp('a>b')
else
   disp('a<=b')
end
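The same end command also closes other control statements and nested constructions. As a small sketch (the variable names are made up for illustration), a while loop with a nested if statement:

```matlab
n_even = 0;
k = 0;
while (k < 10)             % a while loop is closed by end as well
    k = k + 1;
    if (mod(k,2) == 0)     % nested if statement, closed by its own end
        n_even = n_even + 1;
    end
end
disp(n_even)
```

Indenting the body of every control statement makes it easy to see which end belongs to which statement.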


0.4.3 Matrix Processing

MATLAB was started by Cleve Moler, a famous researcher in numerical linear algebra, as a MATRIX LABORATORY for his students, which should allow fast, safe and easy development of algorithms for numerical matrix analysis.

MATLAB has evolved into a general purpose language with specialized applications in many fields. Many books in the meantime use MATLAB either as a formal language or for the programming examples, have a look at http://www.mathworks.com/support/books/index.jsp.

Matrix Syntax:
*              multiplies two matrices according to the conventions of inner/outer/matrix product
.*             multiplies two matrices elementwise
a(2:4)         elements of vector a from the second to the fourth element
end            the last element in a row/column of a vector/matrix
a(2:end)       elements of vector a from the second to the last element
b=c(2:3,2:6)   assign b the values in the matrix c from rows 2 and 3, columns 2 to 6
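The difference between the matrix product and the elementwise product, and the index ranges from the table, can be checked with small test matrices. The values below are arbitrary and only meant as an illustration.

```matlab
A = [1 2; 3 4];
B = [5 6; 7 8];
C_matrix  = A*B;       % matrix product
C_element = A.*B;      % elementwise product

a = [10 20 30 40 50];
a_mid  = a(2:4);       % second to fourth element
a_tail = a(2:end);     % second to last element

c = reshape(1:36, 6, 6);   % some 6x6 test matrix
b = c(2:3, 2:6);           % rows 2 and 3, columns 2 to 6 of c

disp(C_matrix)
disp(C_element)
disp(size(b))
```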

With the matrix syntax and the proper use of brackets, many operations can be simplified without the use of loops:

for i=1:20
   a(i)=i/2
end

is equivalent to

a=[0.5:0.5:10]

or

a=linspace(0.5,10,20)
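That the explicit loop and the two loop-free variants produce the same vector can be verified numerically. This is a minimal sketch; the names a_loop, a_colon and a_linspace are made up to keep the three results apart.

```matlab
a_loop = zeros(1,20);             % allocate the vector before the loop
for i=1:20
    a_loop(i) = i/2;
end

a_colon    = 0.5:0.5:10;          % implicit loop with the colon syntax
a_linspace = linspace(0.5,10,20); % 20 equidistant entries from 0.5 to 10

max_diff = max(abs(a_loop - a_colon)) + max(abs(a_loop - a_linspace));
disp(max_diff)
```

The comparison uses a difference instead of a test for equality, because results of floating point computations should in general not be compared with ==.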

Many functions either operate on vectors and matrices elementwise, or they are matrix functions in the sense that they operate on the matrix as a whole.

Matrix/Vector Functions:
length            gives the longest dimension of a matrix, or the length of a vector
size              gives the dimensions of a matrix
linspace(a,b,m)   makes a vector with m equidistant entries between a and b
rand(n,m)         sets up a random matrix with n rows and m columns
exp               exponential function, works elementwise on a matrix
expm              matrix exponential function, works on the eigenvalues of a matrix and can only be used for square matrices
eig               eigenvalue decomposition
inv               matrix inversion
norm              matrix/vector norm
det               determinant
svd               singular value decomposition
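The difference between an elementwise function and a matrix function becomes obvious for exp and expm. The sketch below uses a matrix A with A*A = 0, chosen as an assumption for illustration because its matrix exponential is simply the unit matrix plus A.

```matlab
A = [0 1; 0 0];       % A*A is the zero matrix
E_elem = exp(A);      % elementwise: every entry is exponentiated
E_mat  = expm(A);     % matrix exponential: here unit matrix + A

M = rand(3,5);        % random matrix with 3 rows and 5 columns
disp(size(M))         % both dimensions
disp(length(M))       % longest dimension

disp(E_elem)
disp(E_mat)
```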


0.4.4 User-defined Functions (m-files)

User-defined functions can be written as ASCII-files with the extension .m. A function my_function would contain in the file my_function.m

function [out_arg1,out_arg2,arg3]=my_function(in_arg1,in_arg2,arg3)
% function [out_arg1,out_arg2,arg3]=my_function(in_arg1,in_arg2,arg3)
% The first comment after the function declaration is
% displayed if "help my_function" is typed to write
% self-documenting functions.
...
...
return

It is advisable always to end a function with a return statement, and also the main program.

For input arguments, MATLAB-functions use "call by value", which means that modifications of the input-arguments (in round brackets) inside the function have no effect on the variables of the calling program. Only the output-arguments (in []-brackets) are passed back by the called function. If an argument is to be used as an input-argument and an output-argument, it must appear in the round brackets and in the []-brackets, like arg3 in the above example.

Global variables can be defined with the statement global in a similar way as variables are declared in other programming languages. The same declaration must then be used in the functions which use the variable. Global variables can also be overwritten in the functions, they are "call by reference" variables.

FORTRAN uses call by reference for all input variables of subroutines. C uses call by value for scalars and call by reference for arrays, so that a pointer to a variable must be used if a scalar is to be modified in the functions.

Functions can be overloaded for different numbers of input parameters and for scalar and matrix functions. If the operations used in the function allow an interpretation in the matrix-sense, the function can automatically be used for matrices.
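The difference between a pure input-argument and an argument which is both input and output can be illustrated with a small function file. The function halve_and_count and its argument names are hypothetical and only serve as an example; the file would have to be saved as halve_and_count.m.

```matlab
function [half, n_calls] = halve_and_count(value, n_calls)
% [half, n_calls] = halve_and_count(value, n_calls)
% Hypothetical example: returns value/2 and increases a call counter.
% value   : input only, changes inside the function stay local
% n_calls : input and output, so it appears in both bracket lists
value   = value/2;       % modifies only the local copy
half    = value;
n_calls = n_calls + 1;   % passed back via the output list
return
```

After x=8; n=0; [h,n]=halve_and_count(x,n) the variable x in the calling program is still 8, while h is 4 and n is 1.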


Exercises

1. Set up a vector with the entries (1, 2, 3, 4, . . . n) once using a for-loop, the second time using an implicit loop.

2. Multiply every second element with a constant a, once using a for-loop, once using an implicit loop.

3. Write a program which finds out which elements of a vector are even.

4. See what happens if you set up ones(L), ones(L,1), ones(1,L), and what happens when you try to multiply these objects with each other.

5. a) What do you expect the following program to do?

clear
step=2
upper_bound=10

for i=1,step,upper_bound
   disp(i)
end
return

b) What does the program really do? c) How do you have to rewrite the program so that it does what you expected it to do in a)?

6. Write a program which computes the factorial n! of an integer number n,

n    n!
1    1
2    1 · 2
3    1 · 2 · 3
4    1 · 2 · 3 · 4

7. Rewrite the factorial-program as a subroutine.

8. Rewrite the factorial-subroutine so that the input-arguments are checked, so that only "proper" input arguments are accepted.

9. Use the help-function of MATLAB to find out the relation between the built-in Γ-function gamma and the factorial.

Chapter 1

How to write better programs

In this chapter, I will discuss the basics of programming style for numerical computing.1

Everything seems to be a matter of course, and during several courses, some students who considered themselves "experienced programmers" skipped these lessons. Usually, after two weeks of homework, they ran into exactly those pitfalls, problems and errors which I discuss in these pages, and usually wasted several hours which could have been spent productively. My usual comment was: "We had this two weeks ago when you didn't attend . . . "

1.1 Programming Style

1.1.1 Choosing variable names

Of course, nobody would use variable names in scientific computing which have no scientific meaning, like linda,charly,taro when there is no documentation what the variables mean. Variables which are difficult to spell, like asdtfgl or such-like should better be avoided, except if there is a convention how to compose such variable names. Some variable names in scientific programming are self-explaining, like

x,y,z,vx,vy,vz,omega

etc. It is very easy to over-do the self-explanation by choosing too long variable names, as I saw once in the programs in a master's thesis:

this_is_the_coordinate_of_x, this_is_the_coordinate_of_y,

etc. Of course, meaningless "short" variable names, like

xx,xxx,xxxx,xxxxxy

should be avoided.

1 I will use the terminology "Computational Physics", "Computational Engineering", "Scientific Computing", "Scientific Programming" pretty much as synonyms. "Numerical methods", "numerical mathematics", "numerical algorithms" I will use when I want to emphasize mathematical techniques to handle floating point computations, minimize roundoff-errors, control discretization errors etc. "Numerical physics" I will use if I want to emphasize that the techniques for the "computational physics" require an understanding of the floating point computations involved.

If one wants to express the order of the derivative, e.g. the time derivative of a coordinate (the Gear predictor-corrector method uses up to the 5th time derivative ....), it may be good practice to use

x_0, x_1, x_2, x_3 . . . . .

for the coordinate, first derivative, second and so on. A convention like

dx_f, dy_f, dx2_f, dx_dy_f .....

is also not a bad idea.

The Fortran77-standard allowed only variable (and subroutine) names of six characters, so I once spent a happy week in 1989 rewriting my longer variable names into shorter ones. That was before the coming of the Fortran90-standard; nowadays, all Fortran-Compilers accept longer variable names, but you may come across programs in the old convention. I am not sure about the variable-lengths for C++-compilers, but be aware that internally the compiler will expand the variable name variablename in structure structurename in the object objectname to something like

objectname_structurename_variablename

and when these names become too long, this may also cause trouble. A colleague's program once refused to compile because the internal name representation was longer than 256 characters, and debugging tools also have problems if e.g. subroutine names in objects or modules become too long. As far as too long variable names are concerned, one may run into similar problems with new C++ compilers as one did with Fortran77 compilers decades ago.

Be aware that similar variable names can be easily confused, especially if they make use of uppercase/lowercase letters and the underscore, like

Variablename, variablename, Variable_name

so the use of all three in the same program will certainly cause problems. It is a good convention to use variable names which sound different. At one point in one's programming career, one should decide whether to write composite variable names with an underscore, Variable_name, or not, as Variablename, or as VariableName. More considerations about the conventions and choices for variable names can be found in "Code Complete".

It is practical to reserve i,j,k for loop variables for short loops, which increases readability, especially if the original mathematical formulae e.g. for vector operations use i, j, k as indices. In scientific applications, n is very often reserved for particle numbers, and l,lx,ly,lz for system sizes. Using short variable names increases the readability when the usage of the variable is clear, e.g. when one just implements a formula according to a text book. Of course, it is not a bad idea to include the reference (page and title) to the text book in a comment line.

Original formula:

$$a = v/t, \qquad x = \frac{1}{2} a t^2$$

Good:

% A.B. Meier, Mechanics,
% p. 15, eq. 4
a=v/t
x=.5*a*t*t

Not so good:

%
acceleration=velocity/time
position=.5*acceleration*time^2

For long, complex programs, the usage of e.g. n as particle number or m as number of timesteps becomes increasingly cumbersome, especially if one tries to recycle the code and the variable names have already been used e.g. as loop-variables. It is better to append some information, so if one has to treat walls and particles in a program, one should define n_wall, n_particle, and for the computation of e.g. the mass, the program would best be written like

for i_particle=1:n_particle
   m_particle(i_particle)=r_particle^2*pi
end

and in the same way for the walls.

1.1.2 Readability of code and Computational effort

Try to write your code as readable as possible. One definition of a programming Guru2 is: He still understands his programs after he has not looked at them for ten years. If you consider yourself a Guru, try to read your programs from ten years ago. There is a world championship in writing the most unreadable C-Program, the infamous "International Obfuscated C Code Contest"3, and one of the winners wrote the following:

/*
 * Program to compute an approximation of pi
 * by Brian Westley, 1988
 * (requires pcc macro concatenation; try gcc -traditional-cpp)
 */

#define _ -F<00||--F-OO--;
int F=00,OO=00;main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()
{
            _-_-_-_
       _-_-_-_-_-_-_-_-_
    _-_-_-_-_-_-_-_-_-_-_-_
  _-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
  _-_-_-_-_-_-_-_-_-_-_-_-_-_
    _-_-_-_-_-_-_-_-_-_-_-_
        _-_-_-_-_-_-_-_
            _-_-_-_
}

2 Further information on how to write good programs can be found in: Code Complete, Steve McConnell, Microsoft Press, Paperback 1993, also available in Japanese

3 Homepage at http://www.ioccc.org/

As the purpose of science is clarity, the purpose of writing scientific code is also clarity and readability. Unreadable code is code which is hard to debug, and errors in scientific computing, where you may only get the second digit of your result wrong, are much more difficult to detect than in the case of commercial software, where you can always tell by messages like "segmentation violation". Moreover, commercial software vendors can make money by selling software updates, whereas in scientific computing, people who wrote buggy code will have trouble in their career. Unreadable code is not the fault of the programming language, though some programming languages attract chaotic programmers more than others. The advantage of "restrictive" programming languages like ADA is that you cannot make certain classes of errors.

What is in a line

Be aware that identical operations in a computer are not speeded up by cramming everything in the same line,

a=2*b
a=a^2
a=a/c+d

will take the same computer time as

a=((2*b)^2)/c+d

What is more readable depends on the implemented formulae. There are tricks called "performance optimization" which actually allow faster program execution due to the style in which the code is written, but this has nothing to do with cramming many commands in a single line, and can only be discussed in a later chapter.
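That the three-line version and the one-line version compute the same result can be checked with arbitrary test values (the values for b, c and d below are assumptions for illustration only):

```matlab
b = 3; c = 4; d = 5;              % arbitrary test values

a = 2*b;                          % version in three lines
a = a^2;
a = a/c + d;

a_one_line = ((2*b)^2)/c + d;     % version in one line

same_result = (a == a_one_line);
disp(same_result)
```

The sketch only compares the results; it makes no statement about the execution time, which would have to be measured for a realistic problem size.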

Coherence

If you are not sure which lines in the code should be grouped together, it is best to stick to the concept of "coherence", writing operations which affect the same variables in consecutive lines. Instead of

a1=b1*c1
a2=b2+c2
a3=b3/c3
a1=a1-d1/e1
a2=(a1+a2)/2
a3=a3*a2

it is better to write

a1=b1*c1
a1=a1-d1/e1

a2=b2+c2
a2=(a1+a2)/2

a3=b3/c3
a3=a3*a2

Once I had to find the error in the program of a student. The result was correct, except that it was 10 orders of magnitude wrong. He should have divided the result by a timestep dt. The student knew that if one has to do many divisions by the same number dt, it is faster to compute the inverse i_dt=1/dt and multiply with i_dt. And he thought that he could save programming time by not defining a new variable i_dt, so his program looked like

dt=10^-5
dt=1/dt
......
...... (one page of code)
......
result=preliminary_result/dt

A perfect interaction of a stupid choice of variable names (the name of the variable at the end did not match the meaning), a code which was longer than one page, so one could not read it in a single window, and an incoherent way of using the variable dt.
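A version in which every variable keeps its meaning could look like the following sketch. The value of preliminary_result is an arbitrary assumption, and the variable wrong only reproduces the mistake described in the text for comparison.

```matlab
dt   = 1e-5;                 % the timestep keeps its meaning in the whole program
i_dt = 1/dt;                 % the inverse timestep gets its own name

preliminary_result = 3;      % arbitrary test value

result       = preliminary_result*i_dt;   % multiplication with the inverse
result_check = preliminary_result/dt;     % same result via division
rel_diff     = abs(result - result_check)/abs(result_check);

wrong  = preliminary_result/i_dt;         % the mistake: dividing by the inverse
orders = log10(result/wrong);             % error in orders of magnitude
disp(orders)
```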

Line length and Subroutine length

Fortran77 and also some other programming languages limited the line length to 72 characters. Many C-Programmers consider it as advanced programming style to indent their code so much that they cannot even get the full line length on a 19inch screen, and they have to scroll their windows to the left and right. Be aware that what you cannot see at a single glance, but only after repeated scrolling, can easily cause errors, as you have no overview over your code. This applies to vertical as well as horizontal scrolling, so keep the number of lines of a subroutine below a certain limit (two A4 pages may already be too long), and also the number of columns should not be more than maybe 80 characters. Every program which needs more should have more subroutines.


1.2 Safety first

The most important aspect of scientific programming is the safety of the programs. "Never in the history of mankind has it been possible to produce so many wrong answers so fast"4.

1.2.1 Check Input variables

Always check the input variables of your subroutines. You may know with which parameters the subroutine must be used, but there may be somebody else who does not know it, usually the next student who uses the program after you, who will produce a lot of numerical garbage. So even if you have a simulation of a mechanical system, which should be used with positive timestep and positive masses, you better check at the beginning of the program whether the timestep and the masses are larger than 0. Moreover, errors in passing arguments to the subroutine can be detected more easily like that. If you find a wrong input parameter, don't replace the input with a default value, but stop the program good and hard:

mass=input('mass ')
if (mass<=0)
   error('mass should be larger than 0')
end

For general software, it may be a good idea to define a default value. For most numerical applications (except for accuracy thresholds), specifying a default input may be a very bad idea.
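Inside a subroutine, the same kind of check can be placed right after the function declaration. The function kinetic_energy below is a hypothetical example (to be saved as kinetic_energy.m) and only a sketch of how such checks might look; which conditions are "proper" depends on the application.

```matlab
function e_kin = kinetic_energy(mass, velocity)
% e_kin = kinetic_energy(mass, velocity)
% Hypothetical example of input checking: the function stops with an
% error instead of silently replacing bad input with a default value.
if (~isnumeric(mass) || ~isscalar(mass) || ~isreal(mass) || ~(mass > 0))
    error('mass should be a real scalar larger than 0')
end
if (~isnumeric(velocity) || ~isreal(velocity))
    error('velocity should be real')
end
e_kin = 0.5*mass*velocity.^2;
return
```

The condition is written as ~(mass > 0) instead of mass <= 0 so that an input which is NaN (not a number) is rejected as well.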

1.2.2 Operator precedence

For analytic arithmetic expressions, the order of the arithmetic operations is usually well defined, so that $a + b \cdot c^d$ is automatically evaluated as $a + (b \cdot (c^d))$. Usually, the order of the operations is equally clear with logical expressions, but with numerical code, it is a priori not clear whether, with the logical operators not, and, or written as ~,&,|, the expression

(~a<b*c&d==0)

is evaluated as ((~(a<b*c))&(d==0)), or as (~((a<b*c)&(d==0))), or whether the logical operations can indeed be applied bitwise to the integer-values as ~(10101010)=(01010101) and then be used as numbers of the respective type. So if anything occurs which is more ambiguous than addition and multiplication, one should use brackets.
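How risky it is to guess can be tried out with test values. In MATLAB's documented operator precedence, the logical negation ~ binds more strongly than multiplication and comparison, so the expression without brackets corresponds to neither of the two readings given in the text. The following sketch uses arbitrary test values, chosen as an assumption so that the readings give different results:

```matlab
a = 0.2; b = 0.5; c = 1; d = 0;         % arbitrary test values

r_plain  = (~a<b*c&d==0);               % no additional brackets
r_first  = ((~(a<b*c))&(d==0));         % first reading from the text
r_second = (~((a<b*c)&(d==0)));         % second reading from the text
r_actual = (((~a)<(b*c))&(d==0));       % grouping according to the precedence rules

disp([r_plain r_first r_second r_actual])
```

For these values, the expression without brackets agrees with r_actual, but with neither r_first nor r_second, which is a good reason to write the brackets explicitly.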

1.3 Program documentation

Always document your program. The best method is to write the explanations within the code; if they are elsewhere, they will get lost over the years. I will reject any project which is not well documented.

4Carl-Erik Froberg


1.3.1 Stupid comments

There are useful ways and stupid ways to write comments. When I once emphasized the importance of comments for computer programs, in the next exercise lesson one student wrote the following comment:

% here is a comment

When I asked why he wrote such a comment, he said: Because you said we should write comments. But he had not written in his program what the program should do, and during one hour of programming he actually forgot what he should program... Another stupid comment would be

% Divide by c
a=b/c

Of course, the division is self-explaining, but for the same short line a comment like

% c from function XXYYZZ, not yet checked whether c becomes 0
a=b/c

may help a lot in debugging the code. Generally, focus on what the code is doing, not how, because how it is done can be read from the programmed lines.

1.3.2 Comments

Usually, every line which contains information which is not self-explaining, like

volume=lx*lx*lz
mass=volume*rho

should better be documented. Of course, the amount of comments necessary grows with the number of people who are supposed to use the code, with the number of functions and lines in the code and with the complexity. If you are not sure who will use the code, then better write your program documentation in English. It is generally a good idea to formalize one's documentation, especially at the beginning of functions/subroutines:

%PURPOSE: What the program is supposed to do
%USAGE: When and how the program is used
%AUTHOR: Who wrote the program
%DATE: Date when the program was written
%ALGORITHM: If the algorithm used is more complicated
% than what you can document in the body of the subroutine, you
% better explain the algorithm here
%LITERATURE: If you have used a complicated algorithm e.g. for
% matrix inversion etc, write from which book or article the algorithm
% comes, usually you have also used the naming conventions, and anybody
% who wants to understand the algorithm (maybe you after ten years) better
% reads the literature first.
%CAVEATS: If you have programmed
%TODO: How to improve the algorithm the next time you have time
%REVISION HISTORY: Write the date when you modified the algorithm,


The above example is "easy to maintain", to modify or to add to. What is not "easy to maintain" would be something like

% PURPOSE:
% +-------------+

and so on and so on. The simpler you design your comments, the more likely it is that you really write them in the way they should be written. If any of the above points does not apply, leave it out. If the routine is complete and runs as it should, don't write an empty TODO point. If your routine-name is my_asin (my arcsine), then you don't have to do much in principle. But if the routine actually computes the arcsine in a non-standard way by polynomial approximation, you better write where you have it from in the literature. If the routine is vectorized, this should be stated in the PURPOSE. If the vectorization works only if a vectorized division is available, this should be written in the CAVEATS. If you write a routine for the first time, you don't have to write a REVISION HISTORY, the date is enough.

And when you change the routine, also change the comments! Nothing is more confusing than working with a routine whose comments say my_sinus, but which correctly calculates a cosine.

Exercises

1. Check whether the MATLAB programs you wrote up to now are in accordance with the above ideas.

2. Write a program which creates a matrix where the first column contains equally spaced x-values between -5 and 5, and the second column contains the values of the second-order polynomial $y = ax^2 + bx + c$.

3. Write a program which creates a matrix where the first column contains equally spaced x-values between -5 and 5, and the second column contains the values of the function $y = 1/(1 + x)$.

4. Write a program which can detect whether the result of a mathematical computation has complex parts.

Chapter 2

Stochastic methods I

Stochastic methods use concepts from probability theory. Knowledge about stochastic methods is important in every field of science and engineering, because each data series contains a certain element of chance or a certain scattering of the data.

2.1 Random Number Generators

In computer simulations, the element of chance is usually simulated by so-called random numbers, or pseudo-random numbers. A random number generator is a function which should generate a sequence of numbers which are distributed according to certain probability rules. In the case of equally distributed random numbers, the numbers are usually between 0 and 1, and all values can be obtained with the same probability. The random number generator in MATLAB is called rand, and it can be called with arguments so that the result is not just a single random number but a vector or matrix:

clear , format compact , format short
rand          % output a random number
a=rand(1,4)   % output a 1x4 (row) vector of random numbers
b=rand(4)     % output a 4x4 matrix of random numbers

This program, which uses the function rand for equally distributed random numbers, gives the following output:

>> showrand
ans =
    0.9501
a =
    0.2311 0.6068 0.4860 0.8913
b =
    0.7621 0.4447 0.7382 0.9169
    0.4565 0.6154 0.1763 0.4103
    0.0185 0.7919 0.4057 0.8936
    0.8214 0.9218 0.9355 0.0579


2.1.1 Mean and Variance

Standard quantities which characterise statistical properties of a set $a_1, a_2, \ldots, a_n$ of $n$ numbers are the mean

$$\mu = \langle a \rangle = \frac{1}{n} \sum_{i=1}^{n} a_i$$

and the variance

$$\sigma^2 = \mathrm{Var}(a) = \frac{1}{n-1} \sum_{i=1}^{n} \left( a_i - \langle a \rangle \right)^2,$$

the mean of the squares of the differences between the respective samples and their mean (normalized with $n-1$ instead of $n$). The square root of the variance is called the standard deviation $\sigma$.

% PURPOSE: Calculate mean and variance
% for the MATLAB random number generator
clear
format compact
format short
n_rn=10000
rn_vec=rand(n_rn,1);
% rand(n_rn,1) gives a column vector of length 10000,
% rand(1,n_rn) gives a row vector of length 10000,
% rand(n_rn) gives a square matrix with 10000^2 entries and may exhaust the memory
mean_rn=mean(rn_vec)
var_rn=var(rn_vec)
return

Exercise: Calculate by hand the theoretical mean and variance for random numbers equally distributed between 0 and 1.

Another random number generator in MATLAB is randn, which creates random numbers according to the Gauss distribution

$$G(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x - x_m)^2}{2\sigma^2} \right),$$

and the "normally distributed" random numbers from randn have mean $x_m = 0$ and standard deviation $\sigma = 1$.

Exercise 2: Estimate how the statistical error of a sequence of random numbers depends on the number of random numbers used, by comparing the theoretical variance for the randn random number generator with the actually measured variance.
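Normally distributed random numbers with another mean and standard deviation can be obtained by shifting and scaling the output of randn. The following lines are a minimal sketch; the variable names and the values of x_m, sigma and n_rn are chosen only for illustration and do not come from the lecture.

```matlab
% Sketch: normally distributed random numbers with chosen mean and width.
x_m   = 3;        % desired mean (illustrative value)
sigma = 2;        % desired standard deviation (illustrative value)
n_rn  = 100000;   % number of random numbers (illustrative value)
g = x_m + sigma*randn(n_rn,1);   % shift and scale the randn output
mean_g = mean(g)   % should be close to x_m
std_g  = std(g)    % should be close to sigma
```

The measured values fluctuate from run to run; the deviation from x_m and sigma shrinks when n_rn is increased.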

2.1.2 Distributions and tests of random numbers

A simple visualization for random numbers is the histogram: one counts how many random numbers fall into each interval ΔX. These intervals are called the "bins" of the histogram, and the collection of data in the histogram is often called "binning". For a certain number of bins in the histogram, the distribution of the random numbers can be studied. For the following program, the output and the resulting histogram are given below:


clear
format compact
format short
a=rand(1,4)
hist(a)

>> randhist
a =
    0.3423 0.3544 0.7965 0.5617

[Figure: histogram of the four random numbers in a; x-axis from 0.3 to 0.8, counts from 0 to 2.]

If "many" bins and "few" random numbers are used, the histogram is "rough"; if more random numbers are used, the histogram becomes smooth:

clear
format compact
format short
a=rand(1,50);
subplot(4,2,1), hist(a)
set(gca,'Xticklabel','')
title('50 random numbers,10 bins')
axis tight

b=rand(1,500);
subplot(4,2,3), hist(b)
set(gca,'Xticklabel','')
title('500 random numbers,10 bins')
axis tight

d=rand(1,50000);
subplot(4,2,5), hist(d)
set(gca,'Xticklabel','')
title('50000 random numbers,10 bins')
axis tight

% same data with 100 bins instead of the default 10
subplot(4,2,7), hist(d,100)
title('50000 numbers,100 bins')
axis tight

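[Figure: four histograms stacked vertically: 50 random numbers with 10 bins, 500 random numbers with 10 bins, 50000 random numbers with 10 bins, and 50000 random numbers with the finer binning of the last hist call; x-axis from 0 to 1.]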

Exercise 3: Estimate the dependence of the statistical fluctuations, i.e. of the differences in the number of entries between the bins of the histogram, on the number of entries.

A basic test for random numbers is whether the number of random numbers in each bin is the same "within the statistical fluctuations". Much more sophisticated tests for random numbers can be found in Knuth1. Nevertheless, to evaluate the usability of a random number algorithm for a given problem, one should not rely on theoretically available algorithms, but one should test the algorithm for a problem with an unknown solution with a related problem for which one knows the solution.

1 Donald Knuth, The Art of Computer Programming, Addison-Wesley 1998

Another visual way of controlling random number sequences is to plot one sequence as the x- and the other as the y-coordinate:

clear
format compact
n_rn=100;
a=rand(n_rn,1);
b=rand(n_rn,1);
plot(a,b,'.')
axis image

[Figure: scatter plot of the points (a,b) in the unit square, both axes from 0 to 1.]

2.2 Usage of random numbers

2.2.1 Initializing the Seed

Random numbers are used to verify statistical hypotheses, or to initialize simulations in an arbitrary way. Statistical hypotheses should be tested with several independent sequences of random numbers. Nevertheless, during program development, it is advantageous to test and debug the program always with the same random number sequence. The start value which determines the sequence is called the "seed", which is set by

rand("seed",X)

where X should be a numerical value. For general random number generators, very oftenprime numbers have to be used as seed, so always read the documentation first.
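The effect of a fixed seed can be checked directly. The sketch below assumes a newer MATLAB version (R2011a or later) in which the function rng is available for setting the seed; in the old versions for which the rand("seed",X) syntax above was written, rng does not exist. The seed value 42 is arbitrary.

```matlab
% Sketch: the same seed reproduces the same random number sequence.
% Assumes that rng is available (newer MATLAB versions).
rng(42);              % 42 is an arbitrary seed value
a = rand(1,3);
rng(42);              % reset the generator to the same seed
b = rand(1,3);
same_sequence = isequal(a,b)   % the two sequences should be identical
```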

2.2.2 Monte Carlo Method: Calculate π by random numbers

Before random numbers could be generated easily and fast with computer algorithms, mathematicians used tabulated random numbers2, similar to the way tabulated values for integrals are still used today. Some of these random number tables had been compiled using roulette results from the Casino of Monte Carlo in Monaco, and so Monte Carlo methods got their name. In recent years, it has become fashionable in computer science to name some methods Las Vegas methods instead of Monte Carlo methods, but the difference is purely academic.

2 See e.g. Random numbers in uniform and normal distribution: with indices for subsets / compiled by Charles E. Clark, Chandler Pub. Co, 1966


To calculate π with random numbers, let us consider a quarter circle of radius 1 and area $a_\circ = \pi/4$ in a square of side length 1 and area $a_\square = 1$. If we choose a point randomly inside the square, the probability $P$ that it is inside the quarter circle is

$$P = \frac{a_\circ}{a_\square} = \frac{\pi/4}{1} \approx \frac{N(\text{in circle})}{N(\text{in square})}, \qquad N(\text{in square}) = N_\text{total},$$

where the right-hand side is the relative frequency with which points are found inside the quarter circle. Therefore, π can be computed via the relative frequencies as

$$\pi \approx 4\,\frac{N(\text{in circle})}{N_\text{total}}.$$

A program which does this computation is given below.

Exercise: Try to understand the convergence behavior by plotting the difference to the exact result in different scales (logarithmically, double logarithmically).

[Figure: quarter circle of radius r = 1 inside the unit square.]

clear
format compact, format short
mc_step=10000
n_inside=zeros(mc_step,1);   % number of points inside after each step
n_try=zeros(mc_step,1);      % number of points tried after each step
i_inside=0;
for i_mc=1:mc_step
  x=rand;
  y=rand;
  r2=x*x+y*y;
  if r2<=1
    i_inside=i_inside+1;
  end
  n_try(i_mc)=i_mc;
  n_inside(i_mc)=i_inside;
end
4*i_inside/mc_step
return
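The same computation can also be written without a loop. The following lines are a sketch, not the program of the lecture: they assume that vectorized comparison and cumsum are acceptable here, and the names inside, pi_est and err are chosen for illustration. The running estimate after every step is kept, which is what is needed for the convergence exercise above.

```matlab
% Sketch: vectorized Monte Carlo estimate of pi with running estimate.
mc_step = 100000;                    % number of Monte Carlo steps (illustrative)
x = rand(mc_step,1);
y = rand(mc_step,1);
inside = (x.^2 + y.^2 <= 1);         % 1 if the point lies in the quarter circle
n_try  = (1:mc_step)';               % number of points tried so far
pi_est = 4*cumsum(inside)./n_try;    % estimate of pi after each step
err    = abs(pi_est - pi);           % deviation from the exact value
final_estimate = pi_est(end)
```

Plotting err against n_try, e.g. with loglog(n_try,err,'.'), shows how slowly the deviation decreases with the number of steps.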

2.2.3 Simulation of Stochastic processes

Random numbers allow to simulate in a stochastic way processes which are often considered to be deterministic. Let us in the following consider a league of teams; which sport (baseball, soccer, basketball ...) does not matter. Each of the six teams has a certain "game strength" $S_i$. Let us define the probability for a team A to win against the other team B as

$$P_{AB} = \frac{S_A}{S_B} \cdot \frac{\max_i (S_i)}{S_A + S_B}.$$

In the following program, the game strength (team_quality) is the same for each team; nevertheless, you will find that usually one team wins. For the run which is shown after the listing, the percentage of wins for all the "teams" is plotted. One can see that "team 2", which leads in the beginning, finishes as the last team, whereas "team 6", which also leads in the beginning, "wins" the "championship". In real life, sports reporters waste a lot of time and energy on explaining such developments, but in our simulation we can see that such narrow outcomes are just a result of chance. For stock exchange fluctuations, the same reasoning applies.

clear , format compact, format short
n_team=6, n_game=100   % six teams, as in the text and in the plot
for i=1:n_team
  team_quality(i)=11
end
n_games_played(1:n_team)=0
n_games_won(1:n_team)=0
for i_game=1:n_game
  for i_team=1:n_team
    for j_team=i_team+1:n_team
      n_games_played(i_team)=n_games_played(i_team)+1;
      n_games_played(j_team)=n_games_played(j_team)+1;
      % relative probability, multiplied by the normalization
      win_probability=...
        (team_quality(i_team)/team_quality(j_team))*...
        max(team_quality)/(team_quality(i_team)+team_quality(j_team));
      % assign a winner according to the probability
      if (win_probability>rand)
        n_games_won(i_team)=n_games_won(i_team)+1;
      else
        n_games_won(j_team)=n_games_won(j_team)+1;
      end
      score(n_games_played(i_team),i_team)=n_games_won(i_team);
      score(n_games_played(j_team),j_team)=n_games_won(j_team);
    end
  end
end

% normalize the number of games won to a winning probability
normalization=ones(size(score));
normalization=cumsum(normalization(:,1));

plot(normalization,score(:,1)./normalization,'--',...
     normalization,score(:,2)./normalization,'.',...
     normalization,score(:,3)./normalization,'+',...
     normalization,score(:,4)./normalization,'-.',...
     normalization,score(:,5)./normalization,':',...
     normalization,score(:,6)./normalization,'-')
legend('team 1','team 2','team 3','team 4','team 5','team 6')


[Figure: fraction of games won (0 to 1) versus number of games played (0 to 100) for team 1 to team 6.]

Exercise: Modify the "quality" of one team and see how its winning probability changes. Find out how strongly you have to modify the quality so that this team "wins" in all test runs.

2.2.4 Time averages and ensemble averages

In the example of calculating π with random numbers, better statistics can be obtained by using more Monte-Carlo steps. As an alternative, one can also run the program with few Monte-Carlo steps several times with different seeds, save the results and average them. This will reduce the noise in the data. For statistically independent data, as in our calculation of π, both approaches are equivalent.

The law of large numbers states that the actual probabilities are only realized after infinitely many tries. For a finite number of realizations, the fluctuations in the system can clearly be felt.

An approach like the one above, where successive Monte-Carlo data are obtained independently from each other, is called "simple sampling". If the Monte-Carlo data are chosen depending on the previous data, this procedure is called "importance sampling".
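The averaging over several short runs can be sketched as follows. The number of runs n_run, the number of steps per run and all variable names are assumptions made for this illustration. Because the generator is not reset between the runs, each run automatically uses a different part of the random number sequence.

```matlab
% Sketch: ensemble average over several short Monte Carlo runs for pi.
n_run   = 20;     % number of independent runs (illustrative value)
mc_step = 1000;   % Monte Carlo steps per run (illustrative value)
pi_run  = zeros(n_run,1);
for i_run = 1:n_run
  x = rand(mc_step,1);
  y = rand(mc_step,1);
  pi_run(i_run) = 4*sum(x.^2 + y.^2 <= 1)/mc_step;   % result of one run
end
pi_ensemble     = mean(pi_run)   % average over the ensemble of runs
scatter_of_runs = std(pi_run)    % fluctuation between the single runs
```

The scatter between the runs gives a direct impression of the statistical error of a single short run.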

Homework 1: The obsolete why-function

Implement the old "why" function (which does not exist any more) from MATLAB Version 5. When you typed why, you got one of the possible answers:

'why not?'
'don''t ask!'
'it''s your karma.'
'stupid question!'
'how should I know?'
'can you rephrase that?'
'it should be obvious.'
'the devil made me do it.'
'the computer did it.'
'the customer is always right.'
'in the beginning, God created the heavens and the earth...'
'don''t you have something better to do?'
'because you deserve it'

or a combination of one of the names

'Cleve' / 'Jack' / 'Bill' / 'Joe' / 'Pete'

with one of

'insisted on it'
'suggested it'
'told me to'
'wanted it'
'knew it was a good idea'
'wanted it that way'

Write a program which randomly gives one of the answers, with equal probability.

Homework 2: The goat quiz

At the end of a quiz show, the winner has to choose his prize behind one of three doors. Behind two of the doors there is a goat; if the winner chooses one of these doors, he gets nothing.

First the winner chooses one door. Then the show-master opens one of the remaining doors, which has a goat behind it.

The winner is now allowed to switch his choice to the third, remaining door, or to stick to the door he has chosen.

If his choice was successful, he gets the prize.

Write a program which allows you to find out whether it is better for the winner to switch the door after the show-master shows the goat, or whether it is better to stick with the first choice.

Think about what is better before you write the program, but don't manipulate the program outcome to obtain your conjectured result.

Chapter 3

Numerical Analysis I

3.1 Data types: Integers

Integers are represented according to the number representation the computer uses internally. For example, in the binary representation, integers are represented as combinations of 0 and 1; in the hexadecimal (Greek-Latin for 16) representation, integers are represented as combinations of the digits 0 to F, see Tab. 3.1. If you need the conversion from decimal to binary, hexadecimal to decimal or whatever, you can always use the MATLAB functions dec2hex, hex2dec, dec2bin and bin2dec. The "2" is pronounced as "to", like in "decimal to hexadecimal". The same naming logic is applied in num2str, the conversion from "numeric to string".

decimal  binary  hexadecimal     decimal  binary  hexadecimal
00       00000   00              10       01010   0A
01       00001   01              11       01011   0B
02       00010   02              12       01100   0C
03       00011   03              13       01101   0D
04       00100   04              14       01110   0E
05       00101   05              15       01111   0F
06       00110   06              16       10000   10
07       00111   07              17       10001   11
08       01000   08              18       10010   12
09       01001   09              19       10011   13

Table 3.1: Integers from 0 to 19 in decimal, binary and hexadecimal representation.

The difference between one integer and the next larger representable integer is always one, and integers in different representations are always the "same" integers.

Integers in FORTRAN are also sometimes declared as INTEGER*4, because 4 Byte = 4 · 8 Bit = 32 binary digits are used to represent these "standard" integers. As one bit is reserved for the sign of the integer, the largest representable integer is something like $2^{31}-1$, the smallest $-2^{31}+1$. As extensions to Standard FORTRAN, there exist in some compilers also the INTEGER*8 type (8-Byte integers, from $-2^{63}+1$ to $2^{63}-1$) and the INTEGER*2 type (2-Byte integers). INTEGER*8 is convenient when large integer values have to be expressed without the rounding occurring in floating point computations, whereas INTEGER*2 is convenient if large arrays of integers must be stored where the integers can only take very few values. The danger of using the non-standard integer types is that if one changes the compiler (or the computer one works on), these data types may not be available any more, and one has to rewrite the whole program.

The C/C++ standards do not define the absolute accuracy of their data types, but provide the types int and long int, where the long int has possibly the larger number of digits (but may have the same number as int). Additionally, there is the unsigned data type, which allows to represent a largest number which is about twice as large as in the signed data type.
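The conversion functions mentioned above can be tried out directly. The following lines are a small sketch; the variable names are chosen for illustration. The binary and hexadecimal representations are returned as character strings. The functions intmax and intmin, which are not mentioned in the text above, return the limits of MATLAB's own 32-bit integer type.

```matlab
% Sketch: conversion between number representations.
b  = dec2bin(10)          % character string '1010'
h  = dec2hex(255)         % character string 'FF'
d1 = bin2dec('10011')     % 19
d2 = hex2dec('13')        % 19
s  = num2str(19)          % character string '19'
% limits of the 32-bit integer type of MATLAB
i_max = intmax('int32')   % 2^31-1
i_min = intmin('int32')   % -2^31
```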

3.1.1 Fixed point numbers

Fixed-point numbers are created from integers by renormalizing the integer with a prefactor. Fixed-point numbers are needed in environments where a constant absolute precision is needed, for example in the banking sector, where the result of an operation always must be rounded to a certain digit, e.g. 1/10000 $, and this accuracy must be maintained over the whole data range, from the smallest transactions of a few dollars to billions of dollars.
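The difference to floating point numbers can be illustrated with a small sketch. It assumes a resolution of 1/10000 dollar, as in the example above, and stores an amount as an integer number of such units; the variable names are hypothetical.

```matlab
% Sketch: adding 0.1 dollar ten times, in floating point and in fixed point.
unit = 10000;          % one unit is 1/10000 dollar (assumed resolution)
amount_float = 0;      % amount in dollars as floating point number
amount_fixed = 0;      % amount as integer number of units
for i = 1:10
  amount_float = amount_float + 0.1;              % 0.1 is not exactly representable
  amount_fixed = amount_fixed + round(0.1*unit);  % exactly 1000 units
end
float_is_exact = (amount_float == 1)        % false: rounding errors have accumulated
fixed_is_exact = (amount_fixed == 1*unit)   % true: integer arithmetic is exact
```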

3.2 Data types: Floating point numbers

In technical and scientific applications, the orders of magnitude used are much larger than e.g. in banking or administration. Trillions of dollars ($10^{12}$) are a lot of money, but trillions of molecules are something rather microscopic. Therefore, the preferred data type in scientific computations is the floating point number, where the numbers are spaced out irregularly, with more numbers in smaller intervals, so that the relative accuracy of operations is constant, not the "absolute accuracy" as with integer numbers.

MATLAB performs all operations in floating point numbers (actually, in complex floating point numbers). In contrast, many standard programming languages like C, C++ and FORTRAN do not perform type conversion during an arithmetic operation, but only at the time of the assignment of the result. That means that an integer division of a number by a larger number gives 0, and depending on the data type, the results have different accuracies, as in the following example in FORTRAN90:

program test_implicit
implicit none
write(*,*) 3/7        ! = 0 Integer-division
write(*,*) 3./7.      ! = 0.428571 REAL*4-Division
write(*,*) 3.d0/7.d0  ! = 0.428571428571429 REAL*8 Division
stop
end


3.2.1 Error

For the following sections, it will be convenient to define the "numerical error" of an operation, the difference between the outcome of an "exact" operation $\circ$ using real numbers and the "numerical" operation $\odot$ using numbers as they are stored in the computer:

exact            $A \circ B = C$
numerical        $A \odot B = \tilde{C}$
absolute error   $\epsilon_\text{absolute} = |C - \tilde{C}|$
relative error   $\epsilon_\text{relative} = |C - \tilde{C}| / |C|$

With respect to representing mathematical real numbers, e.g. multiples of 5./9., on the computer, integers have a constant absolute error, on average the error is of the "order of 1", whereas floating point numbers have a constant relative error, as can be seen in the following table.

Floating point number                              Integer
Operation                     Error                Operation     Error
50./9.=5.555555555555556      < O(10^-14)          50/9=5        < O(1)
500./9.=55.55555555555556     < O(10^-13)          500/9=55      < O(1)
relative error constant                            absolute error constant
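The two error definitions can be evaluated for the example 50/9. The sketch below takes the double precision result as the "exact" reference, which is itself only an approximation, and compares it with a single precision division and with the truncated result of an integer division; the variable names are chosen for illustration.

```matlab
% Sketch: absolute and relative error for 50/9.
C_exact  = 50/9;                           % double precision, used as reference
C_single = double(single(50)/single(9));   % the same division in single precision
C_int    = floor(50/9);                    % result of an integer division
abs_err_single = abs(C_exact - C_single)
rel_err_single = abs(C_exact - C_single)/abs(C_exact)
abs_err_int    = abs(C_exact - C_int)
rel_err_int    = abs(C_exact - C_int)/abs(C_exact)
```

Repeating the computation with 500/9 changes the absolute error of the floating point result, but not the order of magnitude of its relative error.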

3.2.2 Usage

Floating point numbers are the only numbers on a computer with which fast, "numerical" computations are possible over a large range of possible values. The number of floating point operations per second, FLOPS, is usually given as the benchmark for computers, and currently (at the time of writing) the fastest computer in the world, the Earth Simulator near Yokohama, can do about 40 Tera-Flops. The precision of the declared variables is usually expressed in the declaration statement: in the FORTRAN77 standard, REAL*4/REAL*8 (or DOUBLE PRECISION) expresses that 4/8 Byte are used to represent the data.

3.2.3 Data-Layout

In floating point numbers, mantissa and exponent are stored in such a way that the number is represented as a sum of powers of the base $\beta$, with precision $t$ and lower and upper bounds for the exponent $e$, $L \le e \le U$. A floating point number $x$ can then be represented as

$$x = \pm \left( \frac{d_1}{\beta} + \frac{d_2}{\beta^2} + \ldots + \frac{d_t}{\beta^t} \right) \cdot \beta^e$$

with

$$0 \le d_i \le \beta - 1, \qquad (i = 1, \ldots, t).$$

The usual real numbers in a higher programming language like C or FORTRAN have the following characteristics:

Kind     Byte/Bit   mantissa/exponent   Range                          valid digits
Real     4/32       23/8                8.43·10^-37 – 3.37·10^38       6-7
Double   8/64       52/11               4.19·10^-307 – 1.67·10^308     15-16


3.2.4 Example

The above representation does not give equidistant numbers, as can be seen if the distribution of numbers is plotted for $\beta = 2$, $-1 \le e \le 2$, $t = 3$:

[Figure: positions of the representable numbers on the number line from -4 to 4.]
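The numbers of this small floating point system can also be listed with a few lines of code. The sketch below follows the representation formula literally, i.e. it assumes that all digits may take the values 0 to β-1 without normalization of the leading digit; the variable names are chosen for illustration.

```matlab
% Sketch: all numbers of the floating point system beta=2, t=3, -1<=e<=2.
beta = 2; t = 3; e = -1:2;           % parameters of the example
d = 0:(beta^t-1);                    % all digit combinations d1 d2 d3 as integers
mantissa = d(:)/beta^t;              % d1/beta + d2/beta^2 + d3/beta^3
x_pos = unique(mantissa*beta.^e);    % all non-negative numbers, sorted
x_all = unique([-x_pos; x_pos]);     % add the negative numbers
spacing = diff(x_pos);               % distance between neighbouring numbers
smallest_spacing = min(spacing)
largest_spacing  = max(spacing)
largest_number   = max(x_all)
```

The spacing grows with the size of the numbers, which is the non-equidistant distribution described above.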

As can be seen in the above graph, floating point numbers have as many numbers between 1 and 10 as between 10 and 100, whereas integers and fixed point numbers have as many numbers in the interval from 0 to 1 as from 1 to 2. In other words, if numbers are rounded to fixed point numbers, there is a constant absolute error over the whole range of numbers, whereas for floating point numbers, there is a constant relative error over the whole range of available numbers.

The built-in function in MATLAB to find the relative spacing between successive floating point numbers is eps, the distance from 1 to the next larger floating point number. This function depends on the implementation of MATLAB as well as on the hardware and will give different results on different processors. If you use a computer language other than MATLAB or FORTRAN90, where these functions are built in, you can use the following algorithm:

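% program eval_myeps
clear
format compact
% compute machine-epsilon
myeps=1.
myepsp1=myeps+1.
while (myepsp1>1)
  myeps=0.5*myeps;
  myepsp1=1+myeps;
end
% the loop stops at the first myeps for which 1+myeps is not larger than 1,
% so the last value with 1+myeps>1 is twice as large
myeps=2*myeps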

Other built-in functions which are convenient to get ideas about the feasibility of some numerical algorithm are realmax, the largest representable floating point number, and realmin, the smallest floating point number which is larger than 0. All these functions eps, realmin, realmax are implementation dependent, i.e. their result may be different on different computer models, because the mathematical operations are "wired" in a different way on the chip.

The actual number of valid digits of mantissa and exponent is usually not defined in the language standards. The IEEE standard (IEEE = Institute of Electrical and Electronics Engineers) uses for double precision

S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
0 1        11 12                                                 63

with the sign S, the exponent digits E and the mantissa digits F, whereas CRAY used something like

S EEEEEEEEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
0 1              17 18                                         63

which, due to the lower accuracy and other idiosyncrasies in rounding, has now totally vanished. For most numerical computations, double precision is sufficient, whereas the errors with single precision computations will be too large. In addition to double precision, many manufacturers offered Real*16/quadruple precision, which usually will be considerably slower than double precision.

Modern compilers and processors, like the Pentium4 and the G4/G5, allow faster computations if the compiler options allow rounding for double precision, so that the results will be considerably less accurate than double precision/16 digits, but still more accurate than single precision/8 digits.

That 4/8 Byte are used does not mean that all compiler functions which operate on these data types are computed with the full accuracy, correct "up to the last bit". The IEEE standard defined that all results have to be given in such a way that only the last/least significant bit is rounded. Because this can become quite costly, one can usually choose between compiler options which offer higher accuracy but not so good performance, or faster but less accurate code. Self-implemented routines may suffer from additional errors, which will be discussed in the next sections.

3.3 Checking for Equality

Concerning what has been said in this section about accuracy, there are some things one can do with integers which one should not do with floating point numbers. For a start, do not check floating point numbers for equality

if (a==b)

but check for equality within a certain error ε. Be sure whether you need the absolute error

n=10
epsilon=10^(-n)
if (abs(a-b)<epsilon)

or the relative error, which for two values a, b ≠ 0 can be defined as

n=10
epsilon=10^(-n)
if (abs(a-b)<epsilon*max(abs(a),abs(b)))


The last example is much safer than writing it as

n=10
epsilon=10^(-n)
if (abs(a-b)/max(abs(a),abs(b))<epsilon) % don't do this !!!!

which leads to the division 0/0 in case that a = b = 0 (a crash in many compiled languages, a NaN in MATLAB). Moreover, a multiplication can be executed faster than a division, so if the if-condition is inside an often executed loop, the division can slow down the execution of the loop considerably.
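The three kinds of comparison can be tried with two numbers which are mathematically equal but differ in their floating point representation. The following sketch uses the values 0.1+0.2 and 0.3, which are chosen only for illustration, together with the tolerance from the examples above.

```matlab
% Sketch: exact, absolute and relative comparison of floating point numbers.
a = 0.1 + 0.2;        % mathematically equal to 0.3
b = 0.3;
epsilon = 10^(-10);   % tolerance as in the examples above
equal_exact    = (a == b)                                   % false
equal_absolute = (abs(a-b) < epsilon)                       % true
equal_relative = (abs(a-b) < epsilon*max(abs(a),abs(b)))    % true
```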

3.4 Impossible Numbers

Several operations are not well defined in mathematics, like dividing by 0, or computing the real value of the asin of a number with an absolute value larger than 1. The computer has to do something if the operation is mathematically undefined or meaningless. Programs in compiler languages like C and FORTRAN usually crash, and leave it to the programmer to find out where the error occurred.

If a variable is defined in FORTRAN as real, the result must also be real, so expressions like sqrt(-1) or asin(1.5) crash the program. As the elementary data type in MATLAB is the complex array, such operations give in MATLAB the "correct" result, e.g.

> sqrt(-1)
ans = 0 + 1i
> asin(1.5)
ans = 1.57080 - 0.96242i

This may become a problem if the expected result is indeed real, but very near the undefined value, e.g. if the result without rounding error should be 1, but due to rounding it is e.g. 1.000000000001, and the asin computed from it is

> asin(1.000000000001)
ans = 1.5708e+00 - 1.4143e-06i

so that the computation will be continued with a complex part. In such cases, the input should always be checked with an if-statement whether it conforms to the expectations.

There is also an IEEE standard which defines such "exceptions", e.g. what should be done if a number is divided by 0. The result is stored in a bit pattern which is output as NaN, "Not a Number". MATLAB is a bit more sophisticated. For a start, it gives the "correct" result for the division, ±∞:

> 4/0
warning: division by zero
ans = Inf
> -2/0
warning: division by zero
ans = -Inf

For Inf, the usual "rules" apply, but some cases are different:


> Inf+3
ans = Inf
> Inf+Inf
ans = Inf
> Inf/Inf
ans = NaN
> Inf-Inf
ans = NaN

When tested for equality via the == operator, one idiosyncrasy is that Infinity is always equal to Infinity in MATLAB, but NaN is always unequal to NaN:

> 4==4
ans = 1
> Inf==Inf
ans = 1
> NaN==NaN
ans = 0

and for tests for NaN, the isnan function must be used:

> isnan(4)
ans = 0
> isnan(NaN)
ans = 1

To test which numbers are the largest and smallest, the MATLAB functions realmin and realmax can be used. Because Inf, -Inf and NaN are represented as floating point bit patterns in MATLAB, there are about three to four bit patterns less available in MATLAB than in compilers for e.g. FORTRAN or C which don't use Inf and NaN. Because the bit patterns of the largest numbers are used, the largest represented floating point number is smaller than in the compilers.
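The input check recommended above for arguments near the limit of the domain can look as in the following sketch. The tolerance tol and the decision to clip the argument to ±1 are assumptions made for this illustration; whether clipping is appropriate depends on the problem. The built-in functions isreal and isnan are used to inspect the results.

```matlab
% Sketch: guard the argument of asin against small rounding errors.
x   = 1.000000000001;   % argument slightly above 1 due to rounding (example value)
tol = 1e-10;            % accepted size of the rounding error (assumed value)
if abs(x) > 1 && abs(x) - 1 < tol
  x = sign(x);          % clip to the nearest valid argument
end
y = asin(x);
result_is_real = isreal(y)   % true: no complex part is carried along
z = 0/0;                     % mathematically undefined, gives NaN
nan_found = isnan(z)         % true
```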

3.5 Errors

As we have seen in the previous chapter, the representation of real numbers as floating point approximation leads intrinsically to rounding errors. In the following, we will treat additional sources of error which occur in the evaluation of algebraic equations.

3.5.1 Truncation error

Function evaluation

Many mathematical expressions are defined as an infinite process, for example the exponential function is

$$\exp(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots \qquad (3.1)$$


The error which results when e.g. the infinite series is instead computed with only a finite number of operations, i.e. truncated after a finite step, is called the truncation error. In fact, if in a given interval a function $f(x)$ is given by the infinite polynomial series with series coefficients $a_0, a_1, a_2, \ldots$, and the series is truncated after $n$ steps, an approximation with modified coefficients $a'_i$

$$f(x) = \sum_{i=0}^{n} a'_i x^i + O(x^{n+1}) \qquad (3.2)$$

can be found for the given interval which has a smaller error than the truncated series using the coefficients $a_i$ of the infinite series. Such a series is called an "n-th order approximation of $f(x)$", which often makes use of the expansion of the function in terms of orthogonal polynomials1.

Whereas the exponential function $\exp(-x)$ is defined by the infinite series

$$\exp(-x) = \sum_{n=0}^{\infty} \frac{(-x)^n}{n!},$$

the best finite approximation to $\exp(-x)$ in the interval $0 \le x \le \ln 2$ with 10 digits is

$$\exp(-x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + a_6 x^6 + a_7 x^7 + \epsilon(x)$$

with $|\epsilon(x)| \le 2 \cdot 10^{-10}$, and the coefficients are given in Tab. 3.2. Be aware that, to minimize the error, the coefficients for the truncated polynomial approximation depend on the interval for which the approximation should be used.

n   a_n               (-1)^n/n!
0    1.00000 00000     1.000000000000000000
1   -0.99999 99995    -1.000000000000000000
2    0.49999 99206     0.500000000000000000
3   -0.16666 53019    -0.166666666666666657
4    0.04165 73475     0.041666666666666664
5   -0.00830 13598    -0.008333333333333333
6    0.00132 98820     0.001388888888888889
7   -0.00014 13161    -0.000198412698412698

Table 3.2: Coefficients for the polynomial approximation of exp(-x) in the interval 0 ≤ x ≤ ln 2 (middle column) and the corresponding coefficients of the infinite Taylor series (right column).
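The difference between the two sets of coefficients can be checked numerically. The following sketch evaluates both seventh-order polynomials on the interval from 0 to ln 2 and compares them with the built-in exp function. The coefficients are transcribed from Tab. 3.2; the last digits of a_7 are taken here as 13161, which is an assumption about the table entry. The number of sample points is arbitrary.

```matlab
% Sketch: fitted polynomial versus truncated Taylor series for exp(-x).
a = [ 1.0000000000, -0.9999999995,  0.4999999206, -0.1666653019, ...
      0.0416573475, -0.0083013598,  0.0013298820, -0.0001413161];
n = 0:7;
c_taylor = (-1).^n./factorial(n);          % Taylor coefficients of exp(-x)
x = linspace(0, log(2), 1001);             % sample points in the interval
p_fit    = polyval(fliplr(a), x);          % polyval expects highest order first
p_taylor = polyval(fliplr(c_taylor), x);
err_fit    = max(abs(p_fit - exp(-x)))     % maximal error of the fitted polynomial
err_taylor = max(abs(p_taylor - exp(-x)))  % maximal error of the Taylor polynomial
```

The error of the truncated Taylor series is largest at the upper end of the interval, whereas the fitted coefficients distribute the error over the whole interval.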

In practice, many transcendental functions $f(x)$ which are introduced in elementary mathematics classes are numerically better approximated by other approximations than polynomial approximations, e.g. by making explicit use of divisions, which can themselves mimic operations of infinite order in $x$, either via Padé approximation (quotient of two polynomial expressions) or via continued fractions.

An effective strategy, especially with periodic functions, is argument reduction, so that one does not have to compute the Taylor series for large $x$, but for a small $x$ near the origin, by either shifting the argument of periodic functions like sin, cos into the interval $[0, \pi/4]$, or by decomposing the argument into the sum of an integer part and a non-integer part, like in the case of the exponential function, where one computes

$$\exp(x) = \exp(m + f) = \exp(m) \cdot \exp(f), \qquad m \text{ integer}, \; |f| < 1.$$

1 Chap. 22, Handbook of Mathematical Functions, M. Abramowitz, I. Stegun, National Bureau of Standards.
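The decomposition for the exponential function can be sketched as follows. In this illustration, the integer part is obtained with round, so that |f| ≤ 0.5, the factor exp(m) is computed as an integer power of e = exp(1) taken from the built-in function, and only exp(f) is summed as a Taylor series. The argument x and the number of terms are example values.

```matlab
% Sketch: argument reduction for the exponential function.
x = 7.3;              % example argument
m = round(x);         % integer part
f = x - m;            % remainder, |f| <= 0.5
exp_m = exp(1)^m;     % integer power of e (e taken from the built-in exp)
exp_f = 0; term = 1;
for i = 0:19
  exp_f = exp_f + term;    % add f^i/i!
  term  = term*f/(i+1);    % next term of the Taylor series
end
myexp = exp_m*exp_f;
rel_err = abs(myexp - exp(x))/exp(x)
```

Because |f| is small, few terms of the series are sufficient and all terms are small compared to the sum.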

Many approximations of transcendental functions can be found in Abramowitz/Stegun2. Other examples for truncation errors can be found in other series expansion methods, e.g. Fourier series truncated after a certain number of coefficients, or Padé approximations, where an analytical function

$$f(x) = \frac{\sum_{i=1}^{\infty} a_i x^i}{\sum_{i=1}^{\infty} b_i x^i}$$

is approximated by the truncated Padé approximation

$$f(x) \approx \frac{\sum_{i=1}^{n} a_i x^i}{\sum_{i=1}^{m} b_i x^i}.$$

3.5.2 Rounding error

Because we have only a finite number of digits available, when we try e.g. in Octave to compute 5/9, we get

> format long
> 5/9
ans = 0.555555555555556
>

So, first of all, it is not necessary to input 5./9. like in FORTRAN when one wants to use floating point numbers. On the other hand, one sees that the periodic fraction which is the result must be rounded to 16 decimal digits.

Therefore, when we compute the exponential function in the following program,

% Example for rounding error in computing transcendental functions
clear
format compact
format long
x=-20.5
n_iter=100
myexp=0.
for i=0:n_iter-1
  % Compute the Taylor-series for the exp-function
  % x!=gamma(x+1)
  myexp=myexp+x^i/gamma(i+1)
end
exp(x)
return

2 Handbook of Mathematical Functions, M. Abramowitz, I. Stegun, National Bureau of Standards.


we obtain
myexp=-4.422614950123058e-07
as a result, instead of the correct
exp(x) = 1.250152866386743e-09.

As we see, the result is so "very" wrong that not even the sign of the result is correct: we get a negative value for a computation which should always give positive values. The problem is also not the range of the numbers, because the smallest number representable in MATLAB precision is about $10^{-300}$, much smaller than the correct result of order $10^{-9}$. The problem is also not a truncation error, as we are still trying to add Taylor contributions even though the result does not change any more after the 95th iteration. The problem is that we try to add something which is smaller than the last digit of the summation: the largest terms of the alternating series are several times $10^7$, so the rounding errors in their last digits are already larger than the correct result.

There are possibilities to circumvent such kinds of problems which will be explained later in the lecture.
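One simple possibility, given here only as a sketch and not necessarily the method treated later in the lecture, is to avoid the alternating signs: the series is summed for the positive argument |x|, where all terms are positive, and the result is inverted at the end. The terms are computed by a recursion instead of with the gamma function.

```matlab
% Sketch: exp(x) for negative x via 1/exp(|x|).
x = -20.5;
n_iter = 100;
s = 0; term = 1;
for i = 0:n_iter-1
  s = s + term;                % add |x|^i/i!, all terms are positive
  term = term*abs(x)/(i+1);    % next term by recursion
end
myexp = 1/s;                   % exp(x) = 1/exp(|x|) for negative x
rel_err = abs(myexp - exp(x))/exp(x)
```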

3.5.3 Catastrophic cancellation

Even if a floating point function evaluation in double precision is correct up to the last bit, which gives about 16 digits accuracy, the 17th digit is of course wrong. The subtraction of numbers of nearly equal size shifts these invalid digits to the front, so that the result has only few valid digits left. For expressions like

a=cos(x)^2-sin(x)^2

this gives dubious results whenever the argument x is near an odd multiple of π/4, where both terms are nearly equal, with an arbitrary number of canceled digits. The problem can simply be circumvented by using the trigonometric identity $\cos(2x) = \cos(x)^2 - \sin(x)^2$, so that

$$a = \cos(2x) \qquad (3.3)$$

always gives the result with the accuracy of the compiler's evaluation of the cos function.
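The loss of digits can be made visible by evaluating both expressions slightly beside π/4. In the following sketch, the offset delta is an arbitrary example value; the absolute difference between the two results is tiny, but compared to the small result itself it is large.

```matlab
% Sketch: cancellation in cos(x)^2-sin(x)^2 near x=pi/4.
delta = 1e-9;                 % small offset from pi/4 (example value)
x  = pi/4 + delta;
a1 = cos(x)^2 - sin(x)^2;     % difference of two numbers near 0.5
a2 = cos(2*x);                % same quantity without subtraction
abs_diff = abs(a1 - a2)            % of the order of the rounding error
rel_diff = abs(a1 - a2)/abs(a2)    % much larger than the machine precision allows for a2 alone
```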

3.6 Good and bad ”directions” in Numerics

The following integral is positive, because the integrand is positive in the whole integration interval [0,1]:

\[ E_n = \int_0^1 x^n e^{x-1}\,dx, \quad n = 1, 2, \ldots \]

From partial integration we obtain a relation between $E_n$ and $E_{n-1}$, which can be used to compute $E_n$ iteratively if we have $E_0$ given:

\[ \int_0^1 x^n e^{x-1}\,dx = x^n e^{x-1}\Big|_0^1 - \int_0^1 n x^{n-1} e^{x-1}\,dx = 1 - n \int_0^1 x^{n-1} e^{x-1}\,dx \]

\[ E_n = 1 - nE_{n-1}, \quad n = 2, \ldots \]

In a REAL*8 implementation, we obtain with $E_1 = \exp(-1)$ the following result:

 n    E_n
 1    0.367879441171442
 2    0.264241117657115
 3    0.207276647028654
 4    0.170893411885384
 5    0.145532940573080
 6    0.126802356561519
 7    0.112383504069363
 8    0.100931967445092
 9    0.09161229299417073
10    0.08387707005829270
11    0.07735222935878028
12    0.07177324769463667
13    0.06694777996972334
14    0.06273108042387321
15    0.05903379364190187
16    0.05545930172957014
17    0.05719187059730757
18   -0.02945367075153627
19    1.559619744279189
20  -30.19239488558378

We can be sure that at least $E_{18}$, and therefore all the following solutions, are wrong, because the result should not become negative in the first place. Because the $E_i$ should fall monotonically, the term $nE_{n-1}$ in the iteration $1 - nE_{n-1}$ approaches 1, and the correct information is quickly annihilated, so that only the erroneous "last digits" survive. If we use instead of

\[ E_n = 1 - nE_{n-1}, \quad n = 2, \ldots \tag{3.4} \]

the reordered equation

\[ E_{n-1} = \frac{1 - E_n}{n}, \quad n = \ldots, 3, 2 \tag{3.5} \]

we can approximate the starting value:

\[ E_n = \int_0^1 x^n e^{x-1}\,dx \tag{3.6} \]
\[ \le \int_0^1 x^n\,dx \tag{3.7} \]
\[ = \frac{x^{n+1}}{n+1}\Big|_0^1 \tag{3.8} \]
\[ = \frac{1}{n+1}, \tag{3.9} \]

which shows that $E_n$ is very small for very large $n$, so that one can take $E_{21} \approx 0$ as an educated guess. The iteration shown here is started from the even cruder value $E_{20} = 0.5$. The output for this iteration $E_{20} \to E_0$ is given below, together with the output of the iteration $E_0 \to E_{20}$ for comparison.

 n    iteration E_20 -> E_0      iteration E_0 -> E_20
 1    0.3678794411714423         0.3678794411714423
 2    0.2642411176571154         0.2642411176571153
 3    0.2072766470286539         0.207276647028654
 4    0.1708934118853843         0.170893411885384
 5    0.1455329405730786         0.1455329405730801
 6    0.1268023565615286         0.1268023565615195
 7    0.1123835040692999         0.1123835040693635
 8    0.1009319674456008         0.1009319674450921
 9    0.09161229298959281        0.09161229299417073
10    0.083877070104072          0.0838770700582927
11    0.07735222885520793        0.07735222935878028
12    0.0717732537375049         0.07177324769463667
13    0.06694770141243632        0.06694777996972334
14    0.06273218022589153        0.06273108042387321
15    0.05901729661162711        0.05903379364190187
16    0.05572325421396629        0.05545930172957014
17    0.0527046783625731         0.05719187059730757
18    0.05131578947368421       -0.02945367075153627
19    0.025                      1.559619744279189
20    0.5                      -30.19239488558378


As can be seen, the second iteration with the "wrong" starting value converges to the correct end value $\exp(-1)$, whereas the first iteration with the "right" starting value converges to a wrong result. This shows the "art" of numerical computing, which is to obtain a correct end result with a "good" routine and a wrong starting value, instead of obtaining a wrong end result with a correct starting value but a "bad" routine.
It will later become obvious in this course that integration is always the "good" direction in numerical computing, which can decrease initial errors, whereas differentiation is the "bad" direction, which can increase initial errors. This is in contrast to manual calculation, where differentiation is easier to treat than integration.
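Both directions of the iteration can be compared in a few lines. The following is a minimal sketch; the array names Ef and Eb are assumptions of this sketch, and the starting guess 0.5 for $E_{20}$ is the one used in the table.

```matlab
% Sketch: forward (unstable) and backward (stable) iteration for E_n
format long
nmax = 20;
Ef = zeros(1,nmax);             % forward iteration, Eq. (3.4)
Ef(1) = exp(-1);                % "right" starting value
for n = 2:nmax
    Ef(n) = 1 - n*Ef(n-1);      % the error is multiplied by n in each step
end
Eb = zeros(1,nmax);             % backward iteration, Eq. (3.5)
Eb(nmax) = 0.5;                 % "wrong" starting value
for n = nmax:-1:2
    Eb(n-1) = (1 - Eb(n))/n;    % the error is divided by n in each step
end
[Ef' Eb']
```

In the forward direction the initial rounding error of $E_1$ is amplified by a factor of the order of $20!$, in the backward direction the error of the starting guess is reduced by the same factor.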

3.7 Calculus and order of Methods

3.7.1 Taylor Approximation revisited

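Many functions can be approximated by a polynomial series

\[ f(x) = \sum_{i=0}^{\infty} a_i x^i \]

in such a way that around the point $x_0$, with the $\nu$-th derivative $f^{(\nu)}(x)$, we have

\[ f(x) = \sum_{\nu=0}^{\infty} \frac{f^{(\nu)}(x_0)}{\nu!}(x - x_0)^\nu . \]

For the functions exp(t), sin(t), cos(t), the Taylor series is given below:

\[ \exp(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!} = 1 + \frac{t}{1!} + \frac{t^2}{2!} + \frac{t^3}{3!} + \ldots \]

\[ \sin(t) = \sum_{n=0}^{\infty} (-1)^n \frac{t^{2n+1}}{(2n+1)!} = t - \frac{t^3}{3!} + \frac{t^5}{5!} - \ldots \]

\[ \cos(t) = \sum_{n=0}^{\infty} (-1)^n \frac{t^{2n}}{(2n)!} = 1 - \frac{t^2}{2!} + \frac{t^4}{4!} - \frac{t^6}{6!} + \ldots \]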

[Figure: Taylor approximations of sin(x) (1st, 3rd, 5th and 7th order), cos(x) (0th, 2nd, 4th and 6th order) and exp(x) (0th to 3rd order), plotted for x between -5 and 5.]

If we truncate the series (which is infinite for transcendental functions) after a finite number of terms, we obtain the Taylor approximation. The evaluation of a Taylor approximation, e.g. of fourth order with the coefficients a, b, c, d, e

\[ f(x) = a + bx + cx^2 + dx^3 + ex^4 \]

can be done in an efficient and in an inefficient way. Using the above formula directly, we can write

f(x)=a+b*x+c*x*x+d*x*x*x+e*x*x*x*x

so that we need four additions and ten multiplications. If we place brackets around the expression in a skilled way, four additions and four multiplications are sufficient:

f(x)=a+(b+(c+(d+e*x)*x)*x)*x

It is easy to write down the derivative of the above polynomial as

df(x)=b+(2*c+(3*d+4*e*x)*x)*x

In MATLAB, the evaluation of polynomials is implemented with the function polyval, the derivative with polyder, but the order of the coefficients is the opposite of the above example (the coefficient of the highest power comes first). The following program produces the graph shown in the figure:

clear, format compact
P=[1 0 -1]                  % coefficients of x^2-1, highest power first
x=linspace(-3,3,100);
y=polyval(P,x);
P_deriv=polyder(P);
y_deriv=polyval(P_deriv,x);
plot(x,y,'-',x,y_deriv,'--')
legend('f(x)=x^2-1','d/dx f(x)=2x')
grid
axis image
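The bracketing shown above for the fourth-order polynomial can be written as a loop for a polynomial of arbitrary degree. The following is a minimal sketch of this Horner scheme; the coefficient vector P and the evaluation point x are assumed example values, stored with the highest power first as polyval expects.

```matlab
% Sketch: Horner scheme for a polynomial of arbitrary degree
P = [2 -3 0 5 1];       % assumed example: 2x^4 - 3x^3 + 5x + 1
x = 1.5;                % assumed evaluation point
f = P(1);
for k = 2:length(P)
    f = f*x + P(k);     % one multiplication and one addition per coefficient
end
f_ref = polyval(P,x);   % reference value from the builtin function
[f f_ref]
```

For a polynomial of degree n, the loop needs n multiplications and n additions.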

Typical functions which cannot be approximated by Taylor series are functions with a "jump", like the sign function sign(x) = x/|x|.
Because the Taylor series is an infinite series, one needs comparatively many terms to obtain a good approximation. If convergence is sought only on a finite interval, the Chebyshev approximation, which minimizes the error over a finite interval, is usually a much better approximation for the same number of terms.

[Figure: f(x)=x^2-1 and d/dx f(x)=2x, plotted for x between -3 and 3.]

3.7.2 Integration I

In the same way that many transcendental functions can be represented by an infinite Taylor series but approximated by a finite polynomial in x, integrals and derivatives can be approximated by replacing the infinitely small differential $dx$ by the finite difference $\Delta x$, and the error can be expressed as a power of $\Delta x$, as in the approximation of transcendental functions by finite power series. The simplest method to compute an integral

\[ I = \int_a^b f(x)\,dx \tag{3.10} \]

numerically consists of the simple evaluation of the corresponding Riemann sum as

\[ I^{(1)} = \Delta x \sum_i f(x_i), \quad x_i \in \{a, a + \Delta x, \ldots, b - 2\Delta x, b - \Delta x\}, \tag{3.11} \]

where $b - a$ is a multiple of $\Delta x$, and the integration points are spaced equidistantly (see footnote 3). $I^{(1)}$ means that the method is of first order in $\Delta x$; the error of a single interval is of the order of $\Delta x^2$.
Numerical integration is sometimes called "quadrature", maybe from the time when the integral was approximated numerically by drawing squares under the graph, and this box counting was the first "non-analytical" quadrature. As an example, let us compute the integral

\[ \int_a^b \exp(-x^2)\,dx = \frac{\sqrt{\pi}\,\mathrm{erf}(b)}{2} - \frac{\sqrt{\pi}\,\mathrm{erf}(a)}{2}, \]

which is a bit unintuitive because it needs the error function erf to be represented analytically. With the integration bounds [0, 1], the integral is, with about 15 digits accuracy,

\[ \int_0^1 \exp(-x^2)\,dx = 0.7468241328124270. \]

Now let us approximate this integral with the rectangle midpoint rule, where we replace the integral with a Riemann sum with n corner points, and we evaluate the function in the middle of the n - 1 intervals of equal width h instead of at the left or right end of each interval:

clear
format long
n=101 % n odd !
dx=1/(n-1) % stepsize
xrect=[dx/2:dx:1-dx/2];
yrect=exp(-xrect.*xrect);
sum(yrect)*dx

[Figure: Integration with Rectangle Midpoint Rule]

\[ \int_a^b f(x)\,dx \approx h \cdot \left[ f(x_0) + f(x_1) + f(x_2) + \ldots + f(x_n) \right] \]

Here the $x_i$ are the midpoints of the intervals. For 100 intervals / 101 corner points, the result 0.74682719849232 is correct up to 5 digits for the rectangle midpoint rule, which is amazing, as we only used 100 points. In our Monte-Carlo evaluation of $\pi$, we needed of the order of 10000 points when we wanted an accuracy of only two digits.
Very often, numerical integration methods are introduced not by using the rectangle midpoint rule, but the trapeze rule, which is slightly more complicated than the midpoint rule, as each integration interval must be approximated as a trapeze:

\[ \int_a^b f(x)\,dx \approx \frac{h}{2} \cdot \left[ f(x_0) + 2f(x_1) + 2f(x_2) + \ldots + f(x_n) \right] \]

3 There are methods which don't choose the points equidistantly, but optimize the choice of points so that the most accurate approximation is obtained with the minimum number of points.

[Figure: Integration with Trapeze Rule]

Instead of evaluating the left and right bound of each interval, we will count the function values between the upper and lower integration bounds once, and the function values at the upper and lower bound only half. This corresponds to the summation over the trapeze areas:

clear
format long
n=101 % n odd !
dx=1/(n-1) % stepsize
xtrap=[0:dx:1];
ytrap=exp(-xtrap.*xtrap);
(sum(ytrap)-.5*(ytrap(1)+ytrap(n)))*dx

Surprisingly, the result of 0.74681800146797 is one digit less accurate than the result with the midpoint rule, though the program was more complicated, because we had to think about a proper way to implement the trapeze shape for each interval. If we think about a graph with mostly negative curvature, the trapeze rule will end with an approximation which is constantly below the true value. The rectangles of the midpoint rule, in contrast, are partly above, partly below the graph, so that there is error compensation already within one interval.
In the rectangle midpoint rule, we have chosen the quadrature points in the middle of the interval. If we had chosen the values for the function evaluation at the left/right boundary of each interval, we would have obtained 0.74997860426211 / 0.74365739867383, considerably less accurate than the rectangle midpoint rule.

[Figure: Integral over convex curve, overestimated by the rectangle midpoint rule and underestimated by the trapeze rule.]

It can be shown (see footnote 4) that the midpoint rule has an accuracy of

\[ \frac{1}{24} h_i^3 f''(x_i), \tag{3.12} \]

better than the trapeze rule, which has an accuracy of

\[ \frac{1}{12} h_i^3 f''(x_i), \tag{3.13} \]

so it is surprising that textbooks usually introduce numerical quadrature via the trapeze rule. Because both formulae are correct up to the second power of $h_i$, and the error is of the third power, they are called formulae of "second order".
More accurate, namely accurate to third order, is the composite Simpson rule S(f), which combines the rectangle midpoint rule R(f) and the trapeze rule T(f). When we compare the integral for the midpoint rule and for the trapeze rule, we see that in our integration interval with a convex function, the trapeze rule always gives a too small result, and the midpoint rule always gives a too large result. Therefore, if we average R(f) and T(f), we will get a better result than with R(f) or T(f) alone. Because the error for T(f) (Eqn. 3.13) is twice as large as the error for R(f) (Eqn. 3.12), we should not take the direct average $\frac{1}{2}R(f) + \frac{1}{2}T(f)$, but the weighted average, so that the errors of both rules cancel:

\[ S(f) = \frac{2}{3}R(f) + \frac{1}{3}T(f). \]

4 G.E. Forsythe, M. Malcolm, C. Moler, Computer Methods for Mathematical Computations, Prentice Hall 1977

Its error can be shown (see footnote 5) to be of the order of

\[ \frac{1}{2880} h_i^4 f''''(x_i). \]
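The weighted average can be tried out directly with the two programs given above. The following is a minimal sketch; the variable names R, T and S follow the notation of the text, and the reference value is taken from the erf representation of the integral.

```matlab
% Sketch: Simpson result as weighted average of midpoint and trapeze rule
format long
n  = 101;                       % number of corner points, as in the text
dx = 1/(n-1);
xm = dx/2:dx:1-dx/2;            % interval midpoints
xt = 0:dx:1;                    % corner points
R  = sum(exp(-xm.^2))*dx;       % rectangle midpoint rule
yt = exp(-xt.^2);
T  = (sum(yt) - 0.5*(yt(1)+yt(n)))*dx;   % trapeze rule
S  = 2/3*R + 1/3*T;             % weighted average
I_exact = sqrt(pi)/2*erf(1);
err = [R T S] - I_exact
```

The errors of R and T have opposite signs, and the error of T is about twice as large, so that both nearly cancel in S.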

For our example with the integral from 0 to 1 over $\exp(-x^2)$, we obtain 0.746824132817537 as the result, instead of the exact 0.7468241328124270....
In our second-order formulae, we tried to approximate the graph with straight lines and integrated the area below them. A parabola is determined by 3 points, and therefore one can also try to approximate the graph via a parabola instead of a straight line, to obtain a Simpson rule directly by supplying three integration points for each interval.
It is therefore necessary to have an odd number of integration points. The direct derivation of the Simpson rule can be done for an integration interval of length 2h by inserting the Taylor expansion of the function f(x) with the $\nu$-th derivatives $f^{(\nu)}$ around the point $x_0$,

\[ f(x) = \sum_{\nu=0}^{\infty} \frac{f^{(\nu)}(x_0)}{\nu!}(x - x_0)^\nu , \]

instead of the function f(x) itself. With $x_0 = h$ and the derivatives replaced by finite differences, this yields:

\[ \int_0^{2h} f(x)\,dx = \int_0^{2h} \left( f(h) + f'(h)\,(x-h) + \frac{f''(h)}{2}\,(x-h)^2 + \ldots \right) dx \]

\[ \approx \int_0^{2h} \left( f(h) + \frac{f(2h) - f(0)}{2h}\,(x-h) + \frac{f(0) - 2f(h) + f(2h)}{2h^2}\,(x-h)^2 \right) dx \]

\[ = \frac{h}{3}\left( f(0) + 4f(h) + f(2h) \right) \]

[Figure: Simpson rule, showing f(x) and g(x) over the points 0, h, 2h, 3h, 4h.]

Using this formula for the single interval of length 2h, we can compose the formula for the whole integration over [a, b] with the integration points from $x_0 = a$ to $x_{2n} = b$:

\[ \int_a^b f(x)\,dx \approx \frac{h}{3} \left[ f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \ldots + f(x_{2n}) \right] \]

The MATLAB program is

clear
format long, format compact
n=101 % n odd !
dx=1/(n-1) % Stepsize
xsimp=[0:dx:1];
ysimp=exp(-xsimp.*xsimp);
(4*sum(ysimp(2:2:n))+2*sum(ysimp(3:2:n-1))+ysimp(1)+ysimp(n))*dx/3

5 G.E. Forsythe, M. Malcolm, C. Moler, Computer Methods for Mathematical Computations, Prentice Hall 1977

which gives the result 0.74682413289418, slightly worse than the composite Simpson rule.
In the following table, we compare the errors of the different orders by the number of correct digits and introduce the Big-O notation:

Method                       Integral of exp(-x^2) from 0 to 1   Correct digits   Order
Exact                        0.7468241328124270...                                O(h^infinity)
Rectangle, left endpoint     0.74997860426211                    2                O(h)
Rectangle, right endpoint    0.74365739867383                    2                O(h)
Rectangle, midpoint          0.74682719849232                    5                O(h^2)
Trapeze rule                 0.74681800146797                    4                O(h^2)
Simpson                      0.74682413289418                    10               O(h^3)
Composite Simpson            0.746824132817537                   11               O(h^3)

Several conclusions can be drawn from the above table, which hold also for other numerical methods which have an intrinsic truncation error:

1. If a formula is of n-th order and a discretization of 1/100 of the interval length is used, then for a first order implementation the error is about 1/100 = 1 %, for a second order method it will be about 1/10000, and for a third order method it will be about 1/1000000. (Of course, the prefactors in the order also have to be taken into consideration.)

2. Therefore, it is not always necessary to increase the number of discretization steps to obtain a more accurate result. The change from first order to second order in the rectangle rule resulted from just shifting the integration points by h/2.

3. If the theoretical accuracy cannot be reached, it is necessary
   a) to consider whether the function under consideration does not fulfill the necessary criteria (smoothness etc.), or
   b) to search for the error in the program resulting from incorrect prefactors, intervals with incorrect bounds etc. If a formula of second order gives results with an error proportional to 1/(number of points), then the interval bounds are usually determined wrongly.

4. Be aware that it is not possible to integrate functions numerically if their integral does not exist, e.g. due to divergence.
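The order of a method can be checked numerically by doubling the number of intervals: for a method of order p, the error should decrease by a factor of about 2^p. The following is a minimal sketch of such a check for the example integral; the interval numbers 100 and 200 and the variable names are assumptions of this sketch.

```matlab
% Sketch: estimate the order from the errors for two step sizes
I_exact = sqrt(pi)/2*erf(1);
nint = [100 200];                  % assumed numbers of intervals
err = zeros(3,2);                  % rows: left endpoint, midpoint, trapeze
for k = 1:2
    dx = 1/nint(k);
    xt = linspace(0,1,nint(k)+1);  % corner points
    xm = xt(1:end-1) + dx/2;       % interval midpoints
    yt = exp(-xt.^2);
    err(1,k) = abs(sum(yt(1:end-1))*dx - I_exact);
    err(2,k) = abs(sum(exp(-xm.^2))*dx - I_exact);
    err(3,k) = abs((sum(yt) - 0.5*(yt(1)+yt(end)))*dx - I_exact);
end
order = log2(err(:,1)./err(:,2))   % close to 1, 2 and 2
```

If such a check gives an order of 1 for a formula which should be of second order, this points to the kind of programming error described in item 3.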

In this section, we have discussed the error resulting from the integration over a whole interval. This is also called a global error, in contrast to the local error which occurs in the approximation of a single interval. Numerical methods suffering from truncation error differ in whether the global error is the same as the local error, whether the global error is larger than the local error (many solvers for differential equations which do not conserve energy), or whether the global error is smaller than the local error (error compensation as in the case of the Simpson integration).

[Figure: Relative Accuracy for Integrating exp(-x*x) between 0 and 1, plotted in double logarithmic scale over the Number of Integration Points (10^0 to 10^5, accuracy from 10^0 down to 10^-16). Curves: Rectangular left and right boundary, Trapeze Rule, Rectangular Midpoint Rule, Simpson.]

One generally should be very careful in using a method with low order accuracy and a small step size. First of all, for many problems, such a method can become quite time consuming. Furthermore, the more function evaluations occur, the more rounding errors are accumulated. The figure shows the cost-performance diagram, the accuracy plotted with respect to the number of steps. Cost-performance diagrams vary depending on the evaluated functions. As can be seen, beyond 1000 integration points, the accuracy of the Simpson method has already reached the limit of 16 digits of double precision, and therefore the accuracy of the integral cannot be improved further by increasing the number of integration points.

There are integration formulas which are easier to use than the numerical approximations of the Riemann sum we introduced here; they are called Newton-Cotes formulae. The midpoint rule is called an "open Newton-Cotes formula", because the endpoints of the integration interval are not evaluated; the formulae for which the endpoints must be evaluated are called "closed Newton-Cotes formulae". The following table shows the Taylor expansion for a single interval of length h for Newton-Cotes formulae of different order with the corresponding error term, where $f_i = f(x_i)$:

Name           Integral Formula                                                                                                                Error Term
Midpoint
Trapeze rule   $\int_{x_1}^{x_2} f(x)\,dx = h\,[\frac{1}{2}f_1 + \frac{1}{2}f_2]$                                                              $+O(h^3 f'')$
Simpson        $\int_{x_1}^{x_3} f(x)\,dx = h\,[\frac{1}{3}f_1 + \frac{4}{3}f_2 + \frac{1}{3}f_3]$                                             $+O(h^5 f^{(4)})$
Simpson 3/8    $\int_{x_1}^{x_4} f(x)\,dx = h\,[\frac{3}{8}f_1 + \frac{9}{8}f_2 + \frac{9}{8}f_3 + \frac{3}{8}f_4]$                            $+O(h^5 f^{(4)})$
Bode           $\int_{x_1}^{x_5} f(x)\,dx = h\,[\frac{14}{45}f_1 + \frac{64}{45}f_2 + \frac{24}{45}f_3 + \frac{64}{45}f_4 + \frac{14}{45}f_5]$  $+O(h^7 f^{(6)})$

Carrying the error compensation in formulas with truncation error further to higher orders, by combining low order methods as in the case of the composite Simpson rule so that a higher-order method results, is called "Romberg integration". If the limit for "infinitely high orders" is taken, this is called Richardson extrapolation, and these ideas can also be applied to differentiation and to the numerical solution of differential equations.


3.7.3 Differentiation I

In the same way as one can derive the Newton-Cotes formulae for integrals from their Taylor expansion in the previous section, one can derive formulae for the derivatives using the Taylor expansion (see footnote 6). Such approximations are often called finite difference formulas, as they approximate the differential with the finite difference. For data which take the values $f_{i-2}, f_{i-1}, f_i, f_{i+1}, f_{i+2}$ at equidistant points, we get the following finite difference schemes for first order derivatives:

Name                  Finite difference scheme                                      Leading error
Forward difference    $(f_{i+1} - f_i)/\Delta x$                                    $\Delta x\, f''(x)/2$
Backward difference   $(f_i - f_{i-1})/\Delta x$                                    $-\Delta x\, f''(x)/2$
3-point symmetric     $(f_{i+1} - f_{i-1})/(2\Delta x)$                             $\Delta x^2 f'''(x)/6$
3-point asymmetric    $(-1.5 f_i + 2 f_{i+1} - 0.5 f_{i+2})/\Delta x$               $-\Delta x^2 f'''(x)/3$
5-point symmetric     $(f_{i-2} - 8f_{i-1} + 8f_{i+1} - f_{i+2})/(12\Delta x)$      $-\Delta x^4 f^{(5)}(x)/30$

Note that the coefficients in front of the $f_{i-2}, f_{i-1}, f_i, f_{i+1}, f_{i+2}$ have to add up to 0.
For second order derivatives, similar schemes are written down in the following table, and again the coefficients add up to 0:

Name                  Finite difference scheme                                                Leading error
3-point symmetric     $(f_{i-1} - 2f_i + f_{i+1})/\Delta x^2$                                 $\Delta x^2 f''''(x)/12$
3-point asymmetric    $(f_i - 2f_{i+1} + f_{i+2})/\Delta x^2$                                 $\Delta x\, f'''(x)$
5-point symmetric     $(-f_{i-2} + 16f_{i-1} - 30f_i + 16f_{i+1} - f_{i+2})/(12\Delta x^2)$   $-\Delta x^4 f^{(6)}(x)/90$
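The leading errors of the schemes can be verified for a function with known derivative. The following is a minimal sketch for the forward difference and the 3-point symmetric scheme applied to sin(x); the evaluation point x0 and the two step sizes are assumed example values.

```matlab
% Sketch: error of forward and 3-point symmetric difference for f = sin
format long
x0 = 1;                         % assumed evaluation point
dx = [1e-2 5e-3];               % assumed step sizes, the second is half the first
d_fwd = (sin(x0+dx) - sin(x0))./dx;
d_sym = (sin(x0+dx) - sin(x0-dx))./(2*dx);
err_fwd = abs(d_fwd - cos(x0));
err_sym = abs(d_sym - cos(x0));
order_fwd = log2(err_fwd(1)/err_fwd(2))   % close to 1
order_sym = log2(err_sym(1)/err_sym(2))   % close to 2
```

Halving the step size halves the error of the forward difference, but reduces the error of the symmetric scheme by a factor of about four, in agreement with the powers of the step size in the column "Leading error".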

In contrast to numerical integration, which smoothes out errors via error compensation, numerical differentiation "roughens up" the solution. If high accuracy is desired, there are usually better solutions than computing the derivatives directly via finite difference schemes. The figure shows the numerical integral of

\[ 1 - \int_0^x \sin(y)\,dy = \cos(x) \]

and the numerical differential of sin(x),

\[ \frac{d}{dx}\sin(x) = \cos(x), \]

with some additional noise of 1 % in sin(x). The graph was produced with the following program:

clear
format compact
nstep=200
x=linspace(0,4*pi,nstep);
dx=mean(diff(x));
idx=1/dx;
% sin(x) with noise, shifted by half a step
y=sin(x-dx/2)+0.01*(rand(size(x))-.5);
subplot(3,1,1)
plot(x,y,'-.',x(1:nstep-1),diff(y)*idx,...
     x(1:nstep-1),cos(x(1:nstep-1)),':',...
     x(1:nstep-1),1-cumsum(y(1:nstep-1))*dx,'--')
axis tight
legend('sin(x)+-0.005*rand','d/dx sin(x)','cos(x)','1-int(sin(x))')
subplot(3,1,2)
plot(x(1:nstep-1),diff(y)*idx-cos(x(1:nstep-1)),...
     x(1:nstep-1),1-cumsum(y(1:nstep-1))*dx-cos(x(1:nstep-1)),':')
legend('cos(x)-d/dx*sin(x)','cos(x)-1+int(sin(x))')
title('absolute error')
axis tight
subplot(3,1,3)
plot(x(1:nstep-1),((diff(y)*idx-cos(x(1:nstep-1)))./cos(x(1:nstep-1))),...
     x(1:nstep-1),(1-cumsum(y(1:nstep-1))*dx-cos(x(1:nstep-1)))...
     ./cos(x(1:nstep-1)),':')
legend('(cos(x)-d/dx*sin(x))/cos(x)','(cos(x)-1+int(sin(x)))/cos(x)')
axis tight
title('relative error')
return

6 Clive A.J. Fletcher, Computational Techniques for Fluid Dynamics, Vol. 1, 2nd ed., Springer 1990

[Figure, three panels over x from 0 to 12: top, sin(x)+-0.005*rand, d/dx sin(x), cos(x) and 1-int(sin(x)); middle, absolute error; bottom, relative error.]

Both the differential and the integral should give cos(x), but the differential is so noisy that the result deviates visibly from the exact solution. The integral over the noisy data nevertheless gives a smooth curve. This is again a case of a "good" and a "bad" direction of numerical computing, as we encountered before when rewriting the iterative computation of the equation

\[ E_n = 1 - nE_{n-1} \quad \text{(numerically unstable)} \]

into

\[ E_{n-1} = \frac{1 - E_n}{n} \quad \text{(numerically stable)}. \]

As can be seen, for differentiation and its inverse operation, integration, differentiation is the "bad" direction in numerical analysis and integration is the "good" direction. In numerical analysis, integrals, also of higher order, can usually be computed with sufficient precision, in contrast to derivatives, whereas in analytical calculations, it is nearly always possible to compute differentials, but very often the computation of closed forms for integrals is problematic.

Exercises:
1. Write a program which produces floating point numbers for base 2 and mantissa 4, as well as for base 4 with mantissa 2.
   a) Choose the exponent so that both number systems are roughly comparable.
   b) Plot the position of the numbers.
   c) Compare both number systems: Which number system can be supposed to have the better roundoff properties?
2. Write a program which computes the exponential function exp(x) using the Taylor series, and one program which computes the exponential function by evaluating the integer part of x using powers of the Euler number e and the non-integer part using the Taylor series. For which size of the arguments become the

Chapter 4

Graphics

4.0.4 Initializing and manipulating vectors

Instead of using for loops for setting up vectors and matrices, it is convenient in MATLAB to use the implicit loops provided by the colon operator : and the brackets of the array constructor []

>> a=[3:6]
a =
     3     4     5     6

For step sizes different from one, the step size can be specified as [lower_bound:stepsize:upper_bound], like in

>> a=[3:.5:6]
a =
    3.0000    3.5000    4.0000    4.5000    5.0000    5.5000    6.0000

This is different from loops in FORTRAN and C, where the step size is added as the third argument of a loop statement. Whereas the colon operator constructs a vector with a given lower and upper bound for a given step size, it is more convenient to use the linspace function if the number of points is known instead of the step size

>> a=linspace(3,6,7)
a =
    3.0000    3.5000    4.0000    4.5000    5.0000    5.5000    6.0000

There is also a function which gives vectors in logarithmic spacing. Its first two arguments are the exponents of the lower and upper bound, so that too large exponents overflow to Inf

>> b=logspace(1,1000,3)
b =
    10   Inf   Inf
>> b=logspace(1,4,4)
b =
          10         100        1000       10000


If several vectors should be concatenated, this can be done with the brackets of the array constructor []

>> c=[1 3]
c =
     1     3
>> c=[4 c b]
c =
  Columns 1 through 6
           4           1           3          10         100        1000
  Column 7
       10000

After a lot of vector operations, one usually also needs functions which give information about the vectors used. The most elementary function, which displays information about variables, is

>> who

Your variables are:

a    ans  b    c

The length of a vector is displayed by

>> length(a)
ans =
     7

but this function makes no difference between column and row vectors. For information on higher dimensions, one has to use the function

>> size(a)
ans =
     1     7

Vector elements can be accessed either via for loops like in other programming languages, as in

>> for i=1:length(b)
f(i)=2*b(i)
end
f =
    20
f =
    20   200
f =
          20         200        2000
f =
          20         200        2000       20000


or via the colon notation with : and round brackets, so that for a vector

>> c=.2:.2:1.2
c =
    0.2000    0.4000    0.6000    0.8000    1.0000    1.2000

the assignment of the second to the fourth element to a vector g can be written as

>> g=c(2:4)
g =
    0.4000    0.6000    0.8000

The whole of a vector can be assigned without specifying the bounds, which gives a column vector, like in

>> h=c(:)
h =
    0.2000
    0.4000
    0.6000
    0.8000
    1.0000
    1.2000

If the vector from a lower bound up to the end should be assigned, this can be done via the end statement in round brackets together with the colon operator :

>> v=c(4:end)
v =
    0.8000    1.0000    1.2000

Functions which operate on vectors are usually defined in the "canonical" way, that means in the way in which one expects the function to work. The functions prod and sum acting on a vector give as a result the product and the sum of the vector elements. Whereas prod and sum act on vectors and give a scalar as a result, the functions cumsum and cumprod, which compute the cumulated sum and the cumulated product, give a vector as a result

>> cumsum(1:5)
ans =
     1     3     6    10    15

One must be careful with the use of the "multiplicative" operators *, / and ^, which in MATLAB are in general interpreted in the sense of numerical linear algebra, so that column and row dimensions must match. If one wants to use these operators elementwise, one should use their "elementwise" variants, which are preceded by a ".", as in .*, ./ and .^.
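The difference between the two kinds of operators is easiest to see for two short row vectors. The following is a minimal sketch; the vectors u and v are assumed example values.

```matlab
% Sketch: matrix operators versus elementwise operators
u = [1 2 3];
v = [4 5 6];
w_elem  = u.*v      % elementwise product, again a 1x3 row vector
s_inner = u*v'      % row times column: inner product, a scalar
M_outer = u'*v      % column times row: outer product, a 3x3 matrix
q_elem  = u.^2      % elementwise square
% u*v would stop with an error, because the dimensions 1x3 and 1x3 do not match
```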


4.1 Setting up and manipulating Matrices

Matrices can be manipulated in the same way as vectors, preferably with the colon operator : and the brackets of the array constructor []. Some elementary builtin MATLAB matrix functions will be explained here because they make matrix construction easier. The ones function sets up a matrix with ones as every element; as usual in MATLAB, a single argument sets up a square two-dimensional matrix

>> ones(2)
ans =
     1     1
     1     1

For non-square matrices, two sizes have to be specified, where the first is the number of rows and the second is the number of columns, for example

>> ones(3,2)
ans =
     1     1
     1     1
     1     1

The zeros function behaves in the same way as the ones function, only that it sets up matrices with 0 as every element

>> zeros(2,3)
ans =
     0     0     0
     0     0     0

In linear algebra, the identity matrix is very important, and therefore the unit matrix in MATLAB is named eye ("eyedentity" / identity)

>> eye(3)
ans =
     1     0     0
     0     1     0
     0     0     1

It may be surprising, but the identity matrix is also defined for non-square matrices, as the following example shows

>> eye(2,5)
ans =
     1     0     0     0     0
     0     1     0     0     0

Another important matrix function is the constructor for the random matrix


>> rand(2,4)
ans =
    0.8214    0.6154    0.9218    0.1763
    0.4447    0.7919    0.7382    0.4057

Matrices can then be constructed via matrix functions alone, like in

>> c=ones(2)-eye(2)
c =
     0     1
     1     0

or with the help of the matrix constructor brackets [], so that

>> b=zeros(2)
b =
     0     0
     0     0

>> d=[2 3
4 5]
d =
     2     3
     4     5

>> e=[c b
d c]
e =
     0     1     0     0
     1     0     0     0
     2     3     0     1
     4     5     1     0

A very convenient function, similar to linspace in one dimension, which can be used to set up arguments for functions in higher dimensions, is the meshgrid function, which works as follows:

>> [X,Y] = meshgrid(1:3,10:14)
X =
     1     2     3
     1     2     3
     1     2     3
     1     2     3
     1     2     3

Y =
    10    10    10
    11    11    11
    12    12    12
    13    13    13
    14    14    14
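The two matrices returned by meshgrid can be inserted directly into an expression with elementwise operators to evaluate a function of two variables on the whole grid. The following is a minimal sketch; the function $x^2 + y$ is an assumed example.

```matlab
% Sketch: evaluate z = x^2 + y on the grid from the example above
[X,Y] = meshgrid(1:3,10:14);
Z = X.^2 + Y;        % elementwise operators act on all grid points at once
size(Z)              % 5 rows (y values) and 3 columns (x values)
Z(2,3)               % value at x = 3, y = 11
```

The first index of Z runs over the y values and the second over the x values.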


4.2 Graphs and Visualization

4.2.1 Visualizing Vectors

The elementary command in MATLAB for plotting functions etc. is the plot command. Plots can be shown in the plotting window either alone or as one of many sub-plots, like in the following example

>> x=[.1:.1:.5]
x =
    0.1000    0.2000    0.3000    0.4000    0.5000
>> y=[20:-4:1]
y =
    20    16    12     8     4

>> subplot(2,2,1)
>> plot(y)

>> subplot(2,2,2)
>> plot(x,y)

which displays on the screen (note the different scale on the x-axis)

[Figure: two sub-plots of y, on the left plotted against the index 1 to 5, on the right plotted against x from 0.1 to 0.5.]

Plots of vectors can be done either by plotting the vector directly or by specifying two vectors, of which the first will be taken as the x-axis. If the vector lengths do not match, MATLAB issues an error message and stops the program execution. The plots are automatically done in the sub-plot which has been called last.
A subtle way of plotting is the plot of a vector of complex numbers. If you have a complex vector c, you can get the real part x and the imaginary part y via

x=real(c)
y=imag(c)

The command plot(c) then has the same effect as plot(x,y), which means that the imaginary part is plotted versus the real part.
If a new plotting window should be opened, this can be done via the figure command. The first window is created by the figure(1) command, which is automatically executed if no plotting window is open; figure(2) opens a second plotting window, and so on. Plots are done in the window for which the figure command was called last.
There is a wide variety of ways to influence graph annotation in MATLAB

% example for graph annotation
subplot(2,2,1)
x=[.1:.1:.5]
plot(x,x.*x,x,x.*x.*log(x))
xlabel('Xaxis')
ylabel('Yaxis')
title('Plot annotations')
text(.14,.2,'any label here')
legend('x^2','x^2*log(x)')

[Figure: plot of x^2 and x^2*log(x) for x from 0.1 to 0.5, with the axis labels "Xaxis" and "Yaxis", the text "any label here" and the legend.]

The legend created by legend can be moved with the mouse. MATLAB graphics can be saved in various formats (Postscript, encapsulated Postscript, JPEG, ...) via the print command.
The line style (full lines, dotted lines, symbols) can be changed via the arguments in the plot command

plot(x,log(x),...
     x,x,':',...
     x,x.^2,'+',...
     x,x.^3,'*-')

To look at a drawing in higher resolution, use the zoom command, aim with the mouse pointer at the region which should be zoomed and click the left mouse button (the right mouse button unzooms the region again).

[Figure: log(x), x, x^2 and x^3 plotted with different line styles for x from 0.1 to 0.5.]

4.3 Visualizing Arrays

As an example in this section, we will use the Rosser matrix

>> rosser
ans =
   611   196  -192   407    -8   -52   -49    29
   196   899   113  -192   -71   -43    -8   -44
  -192   113   899   196    61    49     8    52
   407  -192   196   611     8    44    59   -23
    -8   -71    61     8   411  -599   208   208
   -52   -43    49    44  -599   411   208   208
   -49    -8     8    59   208   208    99  -911
    29   -44    52   -23   208   208  -911    99

If a matrix is displayed with the plot command, each column is plotted as a vector, as in the example for plot(rosser) in the figure.
Arrays are plotted as arrays with the mesh command, which plots the data in a wire-frame type of graph, as shown in the figure.
The view command can be used to set a different viewing angle for three-dimensional plots. It is also possible to change the viewpoint interactively via the rotate3d command by pointing with the mouse at the frame and pulling the frame of the 3D graph.

[Figure: plot(rosser) as line plot and mesh(rosser) as wire-frame plot, with values between -1000 and 1000.]

4.4 Analyzing systems via plotting

In the following, a linear function, an exponential function, a hyperbola, a logarithm and an inverse square root are plotted via the following program, in linear, x- and y-logarithmic as well as double logarithmic scale:

clear
format compact

x=[linspace(0.1,10,100)];

subplot(2,2,1)
plot(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0 10 -3 20 ])
title('linear plot')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)')

% 'Location','NorthWest' replaces the old position number 2 (upper left)
subplot(2,2,2)
semilogx(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0.1 10 -3 20 ])
title('semilogarithmic in x-direction')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)','Location','NorthWest')

subplot(2,2,3)
semilogy(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0.1 10 .01 20000 ])
title('semilogarithmic in y-direction')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)','Location','NorthWest')

subplot(2,2,4)
loglog(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0.1 10 .01 20000 ])
title('logarithmic ')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)','Location','NorthWest')

[Figure: the four panels produced by the program, with the titles "linear plot", "semilogarithmic in x-direction", "semilogarithmic in y-direction" and "logarithmic"; each panel shows the curves x, exp(x), 1/x, log(x) and 1/sqrt(x)]


Many systems in science and mathematics can be better understood by just plotting typical properties in different scales. Logarithmic, linear, exponential and power laws can be found in nature, and are easily identifiable by plotting the data in different scales.


4.4.1 Linear Plots

Typical "linear" plots result from linear "response functions"; the simplest is probably Hooke's law, which is sketched in the drawing. It is a "linear" law, and in the dynamical situation, when the spring is pulled with a force of a certain frequency, the elongation changes with the same frequency.

Such a linear response is not a matter of course. There are "nonlinear systems" which respond with e.g. "frequency doubling" to an external stimulation, as in the case where a high-intensity red laser beam going into a target comes out as a blue laser beam (blue light with twice the frequency of the red light).

[Figure: a red beam from a laser enters a crystal target and leaves it as a blue beam]

4.4.2 Logarithmic Plots

If the y-axis of a plot is chosen logarithmically, exponential curves appear as straight lines, so that logarithmic plots allow one to identify exponential behavior. Typical examples for exponential behavior are time evolution plots. Radioactive decay and the increase of the GDP in economies are examples for such a time evolution. Below, the increase in the Dow Jones Industrial Stock Index is shown. The curves in the linear plot are bent, but more or less straight in a logarithmic plot. It is a matter of ongoing debate whether this reflects the exponential increase in the strength of the US economy or just exponential inflationary effects.

If instead of the y-axis the x-axis is chosen in logarithmic scale, logarithmic curves become straight lines. Logarithmic curves grow more slowly than linear curves. Typical examples for logarithmic behavior are animal senses. Light and sound are perceived on a logarithmic scale, i.e. sound is not "twice as loud" if the pressure of the sound wave is "twice as high", but "$10^2$ times as high".
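The straight-line property of exponential curves on a logarithmic y-axis can also be used numerically. The following is a minimal sketch, assuming synthetic decay data; the values of y0 and lambda are made up for illustration. A straight-line fit to log(y) recovers the decay constant.

```matlab
% Synthetic exponential decay y = y0*exp(-lambda*t) (illustrative values)
y0 = 5;
lambda = 0.3;
t = linspace(0,10,50);
y = y0*exp(-lambda*t);

% On a logarithmic y-axis this is the straight line log(y) = log(y0) - lambda*t,
% so a polynomial fit of degree 1 to log(y) gives slope and intercept
p = polyfit(t,log(y),1);
lambda_fit = -p(1)      % estimated decay constant
y0_fit = exp(p(2))      % estimated prefactor
```

For measured data with noise, the fitted values will only approximate the true parameters.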

4.4.3 Double logarithmic Plots

Double logarithmic plots allow one to identify "power laws", functions of the form $x^r$, where r does not necessarily have to be an integer. Power laws are usually found in nature when systems suffer from "finite size effects".

The Gutenberg-Richter law, which states that the probability of earthquakes of magnitude x is proportional to $1/x^r$, is an example where a system (the tectonic plates) creates earthquakes up to a maximum size, the size of the plate itself.

The curves of a power law can be written as a superposition of exponential curves, as in the following program

clear
format compact
a=linspace(.0,100,1000);
la=length(a)

x=linspace(0.1,1000,100);

y=zeros(size(x));
for i=1:la
  y=y+exp(-.1*a(i).*x);
end
loglog(x,y)

This means that a power law is found in a system if there are many different scales which contribute to an exponential phenomenon, and on each size scale (variable a in the above example) there is a different prefactor in the exponential law.

In the same way as the curve of a power law can be written as a superposition of exponential curves, a Lorentzian can be written as a superposition of Gaussian curves.

Another example for power laws is the 1/f noise in many technical applications. It is very often found in systems where a seemingly continuous process is the result of discrete processes, and the deviation from the mean causes the noisy fluctuations.
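The exponent of a power law is the slope of the straight line in the double logarithmic plot. As a rough sketch, the superposition of exponentials from the program above can be fitted with a straight line in log-log coordinates; the restriction to the range 1 <= x <= 10, where the curve is close to a straight line, is an assumption made for this illustration.

```matlab
a = linspace(0,100,1000);           % decay constants, as in the program above
x = logspace(0,1,20);               % 20 points with 1 <= x <= 10 (assumed range)
y = sum(exp(-0.1*a(:)*x),1);        % superposition of the 1000 exponentials

% Straight-line fit in double logarithmic coordinates
p = polyfit(log10(x),log10(y),1);
r = p(1)                            % estimated exponent, close to -1
```

Outside this range the curve bends, so the fitted exponent depends on the chosen interval.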


4.5 Specialized Plots and specialized styles

4.5.1 Graph Properties

The axes of a graph can be easily modified with the axis command.

axis image

defines the same length unit for the x- and y-axis.

axis([xmin xmax ymin ymax])

defines the minimal and maximal coordinates for the x- and y-axis respectively. Because MATLAB usually chooses the axes so that they end at multiples of 1, 10, 100, it is sometimes necessary to set

axis tight

so that the axes terminate at the extremal values of the plot. Apart from the axes, a grid can be specified by using grid. Sometimes it is necessary to modify picture properties like the axis labels etc. from the default values chosen by MATLAB. For the program

clear
format compact
x=linspace(0,2*pi,10);
y=sin(x);
h=plot(x,y)
g=gca   % handle of the current axes (a bare "axes" call would create a new, empty axes)
get(h)
get(g)

the plot and its axes are assigned to variables, and the properties of these variables can be displayed with the get command. The entries for h and g can then be directly modified using the set command, by specifying the object name (h or g), the property to modify, e.g. Color, and the new value, e.g.

set(g,'Color',[.5 .5 .5])

Another possible usage, if one already knows the property name, is e.g.

set(gca,'XTickLabel',{'One';'Two';'Three';'Four'})

which labels the first four tick marks on the x-axis and then reuses the labels until all ticks are labeled. The labels can be positioned like

set(gca,'XTickLabel',{'1';'10';'100'})


4.5.2 Including Images

The command

image

puts a default image (in GIF format) on the graphics screen. In general, graphics of nearly any format can be read and displayed using

name=imread('name.gif')
image(name)

The data in the variable name can then be manipulated like a usual MATLAB array.

4.6 MATLAB-output into a file

Very often, one wants to save some output of MATLAB to a file in order to include it in other documents. For short output, it is simplest under a window system to copy the desired lines with the mouse into an editor. If the output becomes too long, one can use the command

diary on

and MATLAB will then output not only to the screen, but also to the file diary. If one wants to redirect the output to a file with a special name, one can use the command

diary('special_filename')

To end the output in the diary, use the command

diary off

If you want to include the output in a LaTeX document and preserve the "computer-output look", you can use the \begin{verbatim} ... \end{verbatim} style; all the program examples in this script are produced in such a way.

4.7 Graphics to include into documents

If you want to save the graphics on the MATLAB graphics screen as a file that should be included in a text (e.g. for LaTeX or Word), you have to use the print command, with the syntax

print -dFORMAT FILENAME


The following table introduces some graphics formats:

print -dps2 name.ps      Black-and-white PostScript output, graphics which can be directly plotted on a (PostScript) printer. All graphics are on a single page in the paper format.
print -dpsc2 name.ps     Like the above, but in color.
print -deps2 name.eps    Black-and-white encapsulated PostScript output, can be included in word-processing programs like LaTeX.
print -depsc2 name.eps   Like the above, but in color.
print -djpeg name.jpg    Output in JPEG format, which can be included in word processors like Word or in Internet pages.

If the orientation should be changed, e.g. to portrait, one can use the MATLAB command

orient portrait

4.8 Including Encapsulated Postscript in LaTeX

If you want to include JPEG graphics in GUI-based word processors, you can use the corresponding menus and/or the mouse.

LaTeX, probably the most widely used word-processing program in the sciences, is not GUI-based. A (not so) short introduction to its commands, available in various languages, can be found at ftp://ctan.tug.org/tex-archive/info/lshort/. LaTeX is rather a "text-programming language", which converts a file name.tex (the "program") into a file name.dvi (the device-independent file), which can then be converted into general formats like PostScript via

dvips -o name.ps name

which produces a PostScript file. This PostScript file can then be transformed into PDF format (Adobe Acrobat), e.g. with the command ps2pdf. If you want to prepare a LaTeX document with graphics, you have to load a "package" with the software for including graphics. A widely used package is the epsfig package, so whereas for a conventional LaTeX report, the header looks like

\documentclass[twoside,12pt]{report}

the header must contain

\documentclass[twoside,12pt]{report}
\usepackage{epsfig}

if PostScript graphics should be included. Graphics can then be included with \epsfig{file=filename,width=??,height=??,angle=??}, where either width or height must be given; the angle can be omitted. Here are some examples:

\epsfig{file=graphiken/circle_square.eps,width=2cm}

\epsfig{file=graphiken/circle_square.eps,height=2cm}

\epsfig{file=graphiken/circle_square.eps,height=2cm,angle=-90}

\epsfig{file=graphiken/circle_square.eps,height=2cm,width=4cm}

In principle, all PostScript files should be includable in LaTeX, but some programs produce PostScript output which is not compatible with LaTeX. Under UNIX, one can use the command

ps2epsi name.ps name.epsi

to convert a file name.ps to a file name.epsi which corresponds to the encapsulated postscriptinterchange format.

4.9 Processing Graphics

If you have graphics in a format other than PostScript, like *.jpeg or *.gif files, which you want to include in LaTeX, you have to convert them into PostScript with some other software. One of the most widely used programs for this task under UNIX is xv, which allows one to load graphics in one format and to save them in another format, for example

xv name.jpg

will load the image name.jpg. Pressing the right button of the mouse will make a menu appear, and the graphics can be saved as PostScript by choosing the appropriate menu entries (SAVE, then FORMAT, then POSTSCRIPT).

Chapter 5

Linear Algebra

Usually, one learns about linear algebra in the first year of study, but often, one needs it much later, when one has forgotten most of it already. MATLAB means MATrix LABoratory, and its first version was written by Cleve Moler so that his students could learn linear algebra more easily.

General documentation of MATLAB can also be found at http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.shtml

5.1 Matrix Manipulation

5.1.1 Matrix commands

The diagonal of a matrix can be extracted with the diag command in the following way:

A =
0.520109 0.340012 0.470293
0.510104 0.326988 0.636776
0.010375 0.782090 0.900370

> diag(A)
ans =
0.52011
0.32699
0.90037

If the input of the diag-command is a vector, diag constructs a matrix with the vector onthe diagonal, a typical example how commands are overloaded in MATLAB:

> b=[3 5 7]
b =
3 5 7

> A=diag(b)
A =
3 0 0
0 5 0
0 0 7

The operator for the matrix transpose is the apostrophe ':

> A=rand(2)
A =
0.66166 0.48661
0.69184 0.39113

> B=A'
B =
0.66166 0.69184
0.48661 0.39113

Because MATLAB knows the difference between column and row vectors, the transpose operator ' can also be used to transform column vectors into row vectors and vice versa:

> v=[1 2 3 4 5]
v =
1 2 3 4 5

> u=v'
u =
1
2
3
4
5

For complex-valued matrices, the ' operator gives the Hermitian conjugate matrix:

> H=rand(3)+sqrt(-1)*rand(3)
H =
0.59574 + 0.89043i 0.91601 + 0.87663i 0.19920 + 0.74066i
0.71691 + 0.73996i 0.31324 + 0.44034i 0.19254 + 0.85119i
0.38660 + 0.13756i 0.33661 + 0.71527i 0.29184 + 0.58186i

> G=H'
G =
0.59574 - 0.89043i 0.71691 - 0.73996i 0.38660 - 0.13756i
0.91601 - 0.87663i 0.31324 - 0.44034i 0.33661 - 0.71527i
0.19920 - 0.74066i 0.19254 - 0.85119i 0.29184 - 0.58186i

The commands which extract the upper/lower triangular matrix are triu/tril:

> A
A =
0.951650 0.084814 0.208357
0.109170 0.585341 0.562931
0.667123 0.528991 0.860920

> tril(A)
ans =
0.95165 0.00000 0.00000
0.10917 0.58534 0.00000
0.66712 0.52899 0.86092

> triu(A)
ans =
0.95165 0.08481 0.20836
0.00000 0.58534 0.56293
0.00000 0.00000 0.86092

If the columns or rows should be flipped, i.e. if their order should be inverted, this can be done with the commands flipud and fliplr, "flip up down" and "flip left right" (shown here for a 2 x 2 matrix A):

> fliplr(A)
ans =
0.73180 0.40541
0.55208 0.79014

> flipud(A)
ans =
0.79014 0.55208
0.40541 0.73180

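Applied together, these two commands rotate a matrix by 180 degrees, so that every entry moves to the opposite position; for a complex matrix, no complex conjugation takes place. Note that this is not a transposition: the transpose without complex conjugation (which is not the Hermitian conjugate) is obtained with the operator .' instead.

> A=rand(2)+sqrt(-1)*rand(2)
A =
0.839504 + 0.572899i 0.466803 + 0.675260i
0.086815 + 0.252680i 0.132638 + 0.086518i

> B=fliplr(flipud(A))
B =
0.132638 + 0.086518i 0.086815 + 0.252680i
0.466803 + 0.675260i 0.839504 + 0.572899i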
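To see what the combination of the two flips does, it can be compared with a 180-degree rotation and with the two transpose operators. This is a small sketch with a made-up complex test matrix; rot90(A,2) rotates by 180 degrees, .' transposes without conjugation and ' gives the Hermitian conjugate.

```matlab
A = [1+2i 3+4i
     5+6i 7+8i];          % illustrative test matrix

B = fliplr(flipud(A));    % both flips together
R = rot90(A,2);           % rotation by 180 degrees
T = A.';                  % transpose without complex conjugation
H = A';                   % Hermitian conjugate

same_as_rotation  = isequal(B,R)        % the flips are a 180-degree rotation
same_as_transpose = isequal(B,T)        % ... and not a transposition
T_is_conj_of_H    = isequal(T,conj(H))  % .' and ' differ only by conjugation
```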

5.2 Matrix Products

For matrices and vectors, there are a lot of ways in which products can be computed. There is no difference between vectors and matrices; a vector is just a matrix with only one row or column. The simplest form is the elementwise product, which uses the operator .*:

> u=[1 2 3 4]
u =
1 2 3 4

> v=[5 6 7 8]
v =
5 6 7 8

> u.*v
ans =
5 12 21 32

The inner product for a row vector u and a column vector w is computed with the operator *:

> u=[1 2 3 4]
u =
1 2 3 4

> w=[1 1 2 2]'
w =
1
1
2
2

> u*w
ans = 17

If instead of u*w we compute w*u, the result is the outer product

> w*u
ans =
1 2 3 4
1 2 3 4
2 4 6 8
2 4 6 8

Matrices can be treated in the same way as vectors with elementwise multiplication .* ormultiplication in the sense of linear algebra:

> A=[1 2
> 3 4]
A =
1 2
3 4

> B=[1 -1
> -2 2]
B =
1 -1
-2 2

> A*B
ans =
-3 3
-5 5

> A.*B
ans =
1 -2
-6 8

A matrix-vector product is performed like this:

> A=[1 2
> 3 4]
A =
1 2
3 4

> v=[1
> 2]
v =
1
2

> A*v
ans =
5
11

MATLAB also has the Kronecker product as a built-in function,

> u=[1 2 3 4]
u =
1 2 3 4

> v=[5 6 7 8]
v =
5 6 7 8

> kron(u,v)
ans =
5 6 7 8 10 12 14 16 15 18 21 24 20 24 28 32

> kron(u,v')
ans =
5 10 15 20
6 12 18 24
7 14 21 28
8 16 24 32

Whereas the elementwise matrix product computed with .* is commutative, the matrix product computed with * is of course not commutative.
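The difference in commutativity can be checked directly with the 2 x 2 matrices A and B used above; the variable names for the results are chosen here only for illustration.

```matlab
A = [1 2; 3 4];
B = [1 -1; -2 2];

% Elementwise product: the order of the factors does not matter
elementwise_commutes = isequal(A.*B,B.*A)

% Matrix product in the sense of linear algebra: the order matters
AB = A*B
BA = B*A
matrix_product_commutes = isequal(AB,BA)
```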

5.3 Repetition of elementary linear Algebra

The angle $\theta$ between two vectors v and w of any finite dimension can be computed via their inner (scalar) product $\cdot$ as

\[ \cos\theta = \frac{|v \cdot w|}{\sqrt{v \cdot v}\,\sqrt{w \cdot w}}. \]

For the inner (scalar) product, we have the Cauchy-Schwarz inequality

\[ |v \cdot w| \le \sqrt{v \cdot v}\,\sqrt{w \cdot w}. \]

Vectors for which the scalar product is 0 are called orthogonal. Whereas orthogonality of two vectors v and w can be defined in "theoretical" mathematics as the property that their scalar product is zero, $v \cdot w = 0$, in numerical mathematics it is necessary to define orthogonality in a way so that possible rounding errors are taken into account, as the following example shows

> w=[sqrt(3) sqrt(3)]
w =
1.73205080756888 1.73205080756888

> v=[sqrt(3) -sqrt(3)]
v =
1.73205080756888 -1.73205080756888

> v*w'
ans = -9.64636952420157e-17

Obviously the last result should be "exactly" zero, but due to the rounding errors in the computation, there is a finite error. How the definition of orthogonality can be applied in such a way that rounding errors are taken into account can be seen in the next section about the rank of matrices.
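One simple way to take rounding errors into account, independent of the rank-based criterion, is to compare the scalar product with a small threshold relative to the lengths of the vectors. The following is only a sketch: the threshold tol = 10*eps is a hypothetical choice that has to be adapted to the problem. The second part evaluates the angle formula from the beginning of this section for two made-up vectors.

```matlab
v = [sqrt(3) -sqrt(3)];
w = [sqrt(3)  sqrt(3)];
tol = 10*eps;        % hypothetical threshold, not a MATLAB default

% "Numerically orthogonal": scalar product small compared to the vector lengths
is_orth = abs(v*w') <= tol*norm(v)*norm(w)

% Two vectors which are clearly not orthogonal
u = [1 0];
z = [1 1];
not_orth = abs(u*z') <= tol*norm(u)*norm(z)

% Angle between u and z in radians (45 degrees, i.e. pi/4)
theta = acos(abs(u*z')/(sqrt(u*u')*sqrt(z*z')))
```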

5.3.1 Size and Rank of Matrices

The size of a matrix A can be computed with the size-command:

> size(A)
ans =
2 2

size gives a row vector with two entries as an answer: the number of rows is size(A,1), the number of columns is size(A,2). Accordingly, the length of a column vector v can be computed with size(v,1), and the length of a row vector with size(v,2).

The rank of a matrix is in "theoretical" linear algebra the number of linearly independent rows/columns. Because linear independence is closely related to orthogonality (non-zero orthogonal vectors are always linearly independent), we will use the rank computation as the criterion for orthogonality. The rank of a matrix can be computed with MATLAB's rank command. How the rank command works will be explained later in the section about the singular value decomposition, along with how one should choose the optional threshold in MATLAB's rank command. First let us review some theorems about the rank of matrices.1

5.3.2 Some Theorems on the rank of matrices

• The outer product of two non-zero vectors always gives a matrix of rank 1.

• Random matrices have "nearly always" full rank, i.e. the rank of a square matrix constructed with rand is the same as the number of its columns/rows, and if the number of rows differs from the number of columns, we have rank(A)=min(size(A)):

> A=rand(3,4)
A =
0.23382 0.43570 0.42862 0.97961
0.79868 0.34546 0.69142 0.74305
0.66927 0.71192 0.15419 0.11667

> rank(A)
ans = 3

• Square matrices which have a rank smaller than their number of columns/rows are called singular. They cannot be inverted, and systems of linear equations where the equations form a singular matrix cannot be solved. Their determinant vanishes.

• The Rank of a matrix does not change through transposition, complex or hermitianconjugation.

• Multiplication with a non-singular matrix does not change the rank of a matrix:

> A=rand(3,4)
A =
0.476924 0.068071 0.420827 0.883968
0.061165 0.885041 0.027155 0.417966
0.139677 0.708093 0.489577 0.978820

> B=rand(3)
B =
0.908288 0.703948 0.363589
0.245781 0.950685 0.097344
0.942011 0.726192 0.064962

> C=B*A
C =
0.52703 0.94231 0.57935 1.45301
0.18896 0.92705 0.17690 0.70990
0.50276 0.75283 0.44795 1.19982

> rank(A)
ans = 3
> rank(B)
ans = 3
> rank(C)
ans = 3

1 Roger A. Horn, Charles R. Johnson, Matrix Analysis, Cambridge University Press 1991

• For rank-deficient matrices, the rank of the product matrix is at most that of the factor with the lowest rank (here it is equal, because the other factor is non-singular):

> A=rand(2,4)
A =
0.200421 0.795092 0.896583 0.454798
0.838726 0.220597 0.018236 0.018493

> A(3,:)=A(2,:)
A =
0.200421 0.795092 0.896583 0.454798
0.838726 0.220597 0.018236 0.018493
0.838726 0.220597 0.018236 0.018493

> B=rand(3)
B =
0.94359 0.31700 0.20635
0.50896 0.36833 0.40063
0.48172 0.19705 0.42594

> C=B*A
C =
0.62806 0.86569 0.85555 0.43882
0.74695 0.57430 0.47035 0.24569
0.61907 0.52044 0.44326 0.23061

> rank(A)
ans = 2
> rank(B)
ans = 3
> rank(C)
ans = 2
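Two of the statements in the list above can be checked in a few lines. This is a small sketch with made-up vectors and a made-up matrix: the outer product of two non-zero vectors has rank 1, and the rank does not change under transposition.

```matlab
% Outer product of a column vector and a row vector
w = [1 1 2 2]';
u = [1 2 3 4];
M = w*u;               % 4 x 4 matrix, every row is a multiple of u
r_outer = rank(M)      % rank 1

% The rank is not changed by transposition
A = [1 2 3; 4 5 6];
r_A  = rank(A)
r_At = rank(A')
```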

5.3.3 Rank-Inequalities

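• For $A \in M_{m,n}$ we have $\mathrm{rank}\,A \le \min(m,n)$

• When a column or a row of a matrix is deleted, the rank of the resulting matrix cannot be larger than the rank of the original matrix.

• For $A \in M_{m,k}$, $B \in M_{k,n}$ we have

\[ \mathrm{rank}\,A + \mathrm{rank}\,B - k \le \mathrm{rank}\,AB \le \min(\mathrm{rank}\,A, \mathrm{rank}\,B) \]

• For $A, B \in M_{m,k}$, $\mathrm{rank}(A + B) \le \mathrm{rank}\,A + \mathrm{rank}\,B$

• For $A \in M_{m,k}$, $B \in M_{k,p}$, $C \in M_{p,n}$ we have

\[ \mathrm{rank}\,AB + \mathrm{rank}\,BC \le \mathrm{rank}\,B + \mathrm{rank}\,ABC. \]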
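The inequalities for the product and the sum can be tried out numerically. The following sketch uses small integer matrices chosen only for illustration (A has rank 1, B and D have rank 2); it is a check for one example, not a proof.

```matlab
A = [1 2; 2 4; 3 6];     % 3 x 2, rank 1 (second column = 2 * first column)
B = [1 0 1; 0 1 1];      % 2 x 3, rank 2
k = size(A,2);           % inner dimension of the product A*B

rA  = rank(A);
rB  = rank(B);
rAB = rank(A*B);

% rank A + rank B - k <= rank AB <= min(rank A, rank B)
lower_ok = (rA + rB - k) <= rAB
upper_ok = rAB <= min(rA,rB)

% rank(A + D) <= rank A + rank D for matrices of the same size
D = [1 0; 0 1; 0 0];     % 3 x 2, rank 2
sum_ok = rank(A+D) <= rA + rank(D)
```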

5.3.4 Norms of a Matrix

Every matrix norm can also be used as a vector norm, but not vice versa. Therefore, we explain here only definitions for matrix norms. Analogous to real and complex scalars, one wants to use something like an absolute value also for matrices. Something which behaves like an "absolute value" under addition is the norm of a matrix.

Properties of Matrix–Norms

• $||A|| \ge 0$ (non-negativity)

• $||A|| = 0$ if and only if $A = 0$

• $||cA|| = |c| \cdot ||A||$ for all real and complex c (homogeneity)

• $||A + B|| \le ||A|| + ||B||$ (triangle inequality)

• $||AB|| \le ||A|| \cdot ||B||$ (sub-multiplicativity)


Definitions for Norms

In MATLAB, all of the following norms exist, and they can be computed via the function norm(x), if necessary with further arguments. Here $A^*$ denotes the Hermitian conjugate of A.

Name                 Symbol            Definition                                         MATLAB function
1. Spectral norm     $||A||_2$         square root of the maximal eigenvalue of $A^*A$    norm(A), norm(A,2)
2. 1-norm            $||A||_1$         maximal column sum of the absolute values of A     norm(A,1)
3. $\infty$-norm     $||A||_\infty$    maximal row sum of the absolute values of A        norm(A,inf)
4. Frobenius norm    $||A||_{Fro}$     $\sqrt{\sum_{i,j} A_{i,j} A^*_{j,i}}$               norm(A,'fro')
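The definitions can be verified by computing each norm "by hand" and comparing with the result of norm. This is a minimal sketch for a made-up 2 x 2 matrix; the variable names are chosen for illustration only.

```matlab
A = [1 -2; 3 4];

n1   = norm(A,1);        % 1-norm
ninf = norm(A,inf);      % infinity-norm
nfro = norm(A,'fro');    % Frobenius norm
n2   = norm(A);          % spectral norm

% The same quantities from the definitions
n1_def   = max(sum(abs(A),1));      % largest column sum of absolute values
ninf_def = max(sum(abs(A),2));      % largest row sum of absolute values
nfro_def = sqrt(sum(abs(A(:)).^2)); % root of the sum of squared entries
n2_def   = sqrt(max(eig(A'*A)));    % root of the largest eigenvalue of A'*A

[n1 n1_def; ninf ninf_def; nfro nfro_def; n2 n2_def]
```

For this matrix, the first column of the displayed result agrees with the second up to rounding errors.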

5.3.5 Determinant of a Matrix

The norm only fulfills sub-multiplicativity, i.e. the norm of a matrix product is equal to or smaller than the product of the norms of the factors. An "absolute value" which fulfills the multiplicativity is the determinant, which can be computed in MATLAB via det(A):

\[ \det A \cdot \det B = \det(A \cdot B) \]

Further properties of the determinant are:

• The exchange of two adjacent columns/rows inverts the sign of the determinant:

> A=rand(3)
A =
0.209224 0.413728 0.212479
0.106481 0.192283 0.074438
0.291095 0.436435 0.508115

> det(A)
ans = -0.0017939

> B=[A(:,2) A(:,1) A(:,3)]
B =
0.413728 0.209224 0.212479
0.192283 0.106481 0.074438
0.436435 0.291095 0.508115

> det(B)
ans = 0.0017939

• The determinant of the identity matrix is one, independent of its dimension.

• Never use the Cramer rule or the Jacobi expansion for the computation of a determinant; it is wasteful and numerically unstable.

• The numerically most suitable computation method for determinants is the so-called LU-decomposition, where the matrix A is decomposed as a product of a lower triangular matrix L with 1's on the diagonal and an upper triangular matrix U as

\[ L \cdot U = A. \]

The determinant of A is therefore the product of the diagonal entries of U (up to a change of sign for every exchange of rows). Row and column permutations, so-called pivoting, increase the numerical accuracy of the decomposition; for details, see [Gol89]. The MATLAB command which computes the matrix determinant via LU-decomposition is det.
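The relation between the LU-decomposition and the determinant can be sketched as follows, assuming the three-output form of the lu command, which also returns the permutation matrix P of the row exchanges so that P*A = L*U. The test matrix is made up; the determinant of P is +1 or -1 and supplies the sign.

```matlab
A = [2 1 1
     4 3 3
     8 7 9];

[L,U,P] = lu(A);                 % P*A = L*U, L has 1's on the diagonal

% det(L) = 1, so det(A) is the product of the diagonal of U,
% with the sign given by the row permutation
d_lu = det(P)*prod(diag(U))
d    = det(A)                    % built-in determinant for comparison
```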

5.3.6 Matrix inverses

Nonsingular square matrices are inverted by the inv command. The elementwise division of one matrix by another in MATLAB is written as A./B, where all entries of the divisor matrix must be $\neq 0$ to avoid divisions by zero. This is totally different from the matrix division A/B, which corresponds to the multiplication of matrix A with the inverse of matrix B:

> A=rand(2)
A =
0.29975 0.85007
0.88812 0.33290

> B=rand(2)
B =
0.89979 0.72370
0.53648 0.97567

> A/B
ans =
-0.33410 1.11909
1.40492 -0.70090

> C=inv(B)
C =
1.9926 -1.4780
-1.0956 1.8376

> A*C
ans =
-0.33410 1.11909
1.40492 -0.70090

Because the product C*A does not necessarily give the same result as the product A*C, there is also the "left division" of a matrix, B\A, which with the above matrices gives

> C*A
ans =
-0.71537 1.20181
1.30362 -0.31963

> B\A
ans =
-0.71537 1.20181
1.30362 -0.31963

If one tries to invert a singular matrix, MATLAB gives a result (usually wrong), and issues a warning:

> A=[1 1
> 1 1]
A =
1 1
1 1

> inv(A)
warning: inverse: matrix singular to machine precision, rcond = 0
ans =
1 1
1 0

> B=inv(A)
warning: inverse: matrix singular to machine precision, rcond = 0
B =
1 1
1 0

> B*A
ans =
2 2
1 1
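To avoid working with such a meaningless inverse, one can test the matrix before inverting it. The following is a sketch of one possible check: rcond returns an estimate of the reciprocal condition number (the quantity reported in the warning above), and the comparison with eps is an assumed threshold, not a fixed rule.

```matlab
A = [1 1
     1 1];

r = rcond(A)             % reciprocal condition estimate, 0 for this matrix
k = rank(A)              % 1, smaller than the number of columns

is_singular = r < eps    % assumed criterion: treat A as numerically singular
if ~is_singular
  B = inv(A);            % only invert when the test is passed
end
```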

5.4 How many matrix products are possible

A matrix product is computed using three indices i, j, k,

\[ a_{ij} = \sum_k b_{ik} c_{kj} \]

Therefore, there are 6 possible orders in which to program the loops, but basically, there are only two possibilities:

clear
format long
n=20
b=randn(n).*10.^(16*randn(n));
c=randn(n).*10.^(16*randn(n));

tic
% Version 1: Dot-Product
a1=zeros(n);
for j=1:n
  for i=1:n
    for k=1:n
      a1(i,j)=a1(i,j)+b(i,k)*c(k,j);
    end
  end
end
toc

tic
% Equivalent to
a2=zeros(n);
for j=1:n
  for i=1:n
    a2(i,j)=b(i,:)*c(:,j);
  end
end
toc

tic
% Version 2: Daxpy-Product
a3=zeros(n);
for j=1:n
  for k=1:n
    for i=1:n
      a3(i,j)=a3(i,j)+b(i,k)*c(k,j);
    end
  end
end
toc

tic
% equivalent to
a4=zeros(n);
for j=1:n
  for k=1:n
    a4(:,j)=a4(:,j)+c(k,j)*b(:,k);
  end
end
toc

return


We have also included the tic and toc commands to profile the time used for a matrix multiplication. It can be seen that MATLAB performs much faster if the inner loop is evaluated using the :-notation.

The first version of the matrix-matrix multiplication has an inner vector product as a "kernel", the inner part of the routine. The second version of the matrix multiplication has a kernel which can be written as

\[ \vec{y} = a\vec{x} + \vec{y}, \]

an operation where the right-hand side in words is "A X Plus Y", for which often the acronym SAXPY or DAXPY (S for single, D for double precision) is in use.

It turns out that both operations are numerically equivalent, and both need $2l^3$ floating point operations (multiplications and additions) for $l \times l$ matrices.

It is common to give the speed of computers by how many Floating Point Operations Per Second (Flops) they can perform. Modern PCs are in the range of a few hundred MFlops, workstations are nowadays in the GFlops range, and the Earth Simulator, a supercomputer near Yokohama, can do about 4 TeraFLOPS. Using programs to test the speed of computers is called benchmarking.

5.5 Matrix Inverses again

5.5.1 How to solve a linear system by hand

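The standard way to solve a linear system of k equations with k unknowns,

\[ Ax = b, \]

with the unknowns in the vector x and the right-hand side b,

\[
\begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,k} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,k} \\
\vdots & \vdots & \ddots & \vdots \\
a_{k,1} & a_{k,2} & \cdots & a_{k,k}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_k \end{pmatrix},
\]

is to rewrite the system in "augmented form"

\[
\left(\begin{array}{cccc|c}
a_{1,1} & a_{1,2} & \cdots & a_{1,k} & b_1 \\
a_{2,1} & a_{2,2} & \cdots & a_{2,k} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{k,1} & a_{k,2} & \cdots & a_{k,k} & b_k
\end{array}\right)
\]

and transform the matrix and the right-hand-side vector b via elementary row operations (subtracting multiples of some rows from other rows) to upper triangular form, where all the elements below the diagonal are 0 (the symbols $a_{i,j}$ and $b_i$ now denote the transformed entries):

\[
\left(\begin{array}{cccc|c}
a_{1,1} & a_{1,2} & \cdots & a_{1,k} & b_1 \\
0 & a_{2,2} & \cdots & a_{2,k} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & a_{k,k} & b_k
\end{array}\right)
\]

The solutions for $x_1, x_2, \ldots, x_k$ can then be computed via back-substitution as

\[ x_k = b_k / a_{k,k} \]
\[ x_{k-1} = \left( b_{k-1} - a_{k-1,k} x_k \right) / a_{k-1,k-1} \]
\[ x_{k-2} = \left( b_{k-2} - a_{k-2,k-1} x_{k-1} - a_{k-2,k} x_k \right) / a_{k-2,k-2} \]
\[ x_i = \frac{1}{a_{i,i}} \left( b_i - \sum_{j=i+1}^{k} a_{i,j} x_j \right) \]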

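This scheme of eliminating elements so that a triangular coefficient matrix survives, for which the unknowns can be computed in a trivial way, is called Gaussian elimination. As an example, consider

\[
\begin{pmatrix} 9 & 3 & 4 \\ 4 & 3 & 4 \\ 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} 7 \\ 8 \\ 3 \end{pmatrix},
\]

in augmented form

\[
\left(\begin{array}{ccc|c}
9 & 3 & 4 & 7 \\
4 & 3 & 4 & 8 \\
1 & 1 & 1 & 3
\end{array}\right).
\]

We start by interchanging the first and the last row:

\[
\left(\begin{array}{ccc|c}
1 & 1 & 1 & 3 \\
4 & 3 & 4 & 8 \\
9 & 3 & 4 & 7
\end{array}\right).
\]

Next, we subtract 4 times the first row from the second row and 9 times the first row from the last row:

\[
\left(\begin{array}{ccc|c}
1 & 1 & 1 & 3 \\
0 & -1 & 0 & -4 \\
0 & -6 & -5 & -20
\end{array}\right).
\]

Finally, we add -6 times the second row to the last row, and obtain the triangular system

\[
\left(\begin{array}{ccc|c}
1 & 1 & 1 & 3 \\
0 & -1 & 0 & -4 \\
0 & 0 & -5 & 4
\end{array}\right),
\]

from which we can compute the unknowns successively as $x_3 = -4/5$, $x_2 = 4$ and $x_1 = -1/5$.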
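The elimination and back-substitution steps for this example can be sketched as a short program. This is an illustration under stated assumptions, not the routine MATLAB uses internally: the variable names are made up, and the row exchange is chosen by partial pivoting (largest absolute value in the column), so the intermediate matrices differ from the hand calculation above while the solution is the same.

```matlab
A = [9 3 4; 4 3 4; 1 1 1];
b = [7; 8; 3];
n = length(b);
M = [A b];                          % augmented matrix

% Elimination step: reduce M to upper triangular form
for j = 1:n-1
  [pmax,p] = max(abs(M(j:n,j)));    % pivot search in column j
  p = p+j-1;
  M([j p],:) = M([p j],:);          % exchange rows j and p
  for i = j+1:n
    M(i,:) = M(i,:) - M(i,j)/M(j,j)*M(j,:);
  end
end

% Solution step: back-substitution, starting with the last unknown
x = zeros(n,1);
for i = n:-1:1
  x(i) = (M(i,n+1) - M(i,i+1:n)*x(i+1:n))/M(i,i);
end
x
```

For practical work, x=A\b should be preferred over such a hand-written loop.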

5.5.2 The numerical variants: LU-decomposition

For numerical purposes, the two steps, reduction to a triangular system (elimination step) and backward substitution (solution step), are often split up into two routines. A common collection of subroutines for numerical linear algebra is the LINPACK package, which includes matrix inversions and orthogonalization methods for real and complex matrices. MATLAB's routines for linear algebra are basically routines from LINPACK, and Cleve Moler, the inventor of MATLAB, was also a co-author of LINPACK.


Numerically, the Gaussian elimination is usually implemented as an LU-decomposition, a factorization of the matrix A of the system

\[ Ax = b \]

into an upper triangular matrix U and a lower triangular matrix L, so that

\[ LU = A \]

and the solution can again be computed in a trivial way. In MATLAB, the LU-factorization can be computed via the lu command, for example as

> a
a =
-1.0688456296920776 0.5834664106369019 -0.0174380335956812
0.0473232455551624 -0.6955339908599854 -0.2883380949497223
-0.5952438712120056 -0.0617007017135620 -1.1060823202133179

> [l,u]=lu(a)
l =
1.000000000000000 0.000000000000000 0.000000000000000
-0.044275098518011 1.000000000000000 0.000000000000000
0.556903499136249 0.577325122175704 1.000000000000000

u =
-1.068845629692078 0.583466410636902 -0.017438033595681
0.000000000000000 -0.669700958047086 -0.289110165605131
0.000000000000000 0.000000000000000 -0.929460456605607

The solution of a linear system Ax = b can be computed in MATLAB with the backslash operator \, which is the division "from the left", not only for scalars,

> 5/4
ans = 1.25000000000000
> 5\4
ans = 0.800000000000000

but also for matrices. The algebraic meaning is

\[ A\backslash B = A^{-1} \cdot B, \qquad A/B = A \cdot B^{-1}, \]

and for matrices (remember that MATLAB means MATrix LABoratory), this is not necessarily the same. The solution for Ax = b can be obtained by formally dividing by A from the left,

\[ Ax = b \;\Rightarrow\; A\backslash Ax = A\backslash b \;\Rightarrow\; x = A\backslash b. \]

The solution of the linear system, including a test whether Ax is really equal to b, can then be programmed in the following way:

> A=rand(3)
A =
0.63356 0.25786 0.71159
0.98480 0.13788 0.62761
0.60858 0.76457 0.90059

> b=rand(3,1)
b =
0.81931
0.61835
0.14195

> x=A\b
x =
-0.74296
-2.36996
2.67169

> A*x
ans =
0.81931
0.61835
0.14195
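The split into a factoring step and a solution step can also be written out explicitly. The following sketch uses a made-up, well-conditioned test matrix and the three-output form of lu, which returns the row permutation P with P*A = L*U; the two triangular systems are then solved with the backslash operator.

```matlab
A = [4 1 0
     1 3 1
     0 1 2];
b = [1; 2; 3];

[L,U,P] = lu(A);     % factoring step: P*A = L*U

y = L\(P*b);         % forward substitution with the lower triangular L
x = U\y;             % backward substitution with the upper triangular U

res = norm(A*x - b)  % residual, should be of the order of the rounding errors
x_direct = A\b;      % direct solution for comparison
```

Once L, U and P are known, further right-hand sides can be treated with the two substitution steps alone.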

In LINPACK, the elimination step is called factoring (because the LU-decomposition produces two factors, L and U), and the Double precision GEneral matrix FActoring is therefore called DGEFA. The solution/substitution step for the system is DGESL, SL for solution.

There also exists a LINPACK benchmark, which sets up matrices in a well-defined way and solves the corresponding linear system, then determines the number of floating point operations and the time, and from these computes the Flop rate. In this way, the speed of computers has been evaluated for decades.2

5.5.3 Matrix inversion

A matrix inversion can be computed in the same way as the solution of a linear system, which we see if we write the problem as

\[
A A^{-1} = E, \quad \text{with the identity matrix } E =
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix},
\]

where the columns of the identity matrix E have the role of the right-hand side b and the columns of $A^{-1}$ are the unknowns x. It is now clear why it is advantageous to use the LU-decomposition, as it allows the simultaneous solution of systems with arbitrarily many columns on the right-hand side. After the factoring is completed, the solution step for computing the inverse of an $l \times l$ matrix takes l times as many steps as the solution of the system for a single-column right-hand side. We can see this by using the flops command, which was available with old versions of MATLAB (before version 6) and which measured the number of floating point operations, and the following example program:

clear
format compact
n=150
A=randn(n);
b=randn(n,1);

flops(0)
x1=A\b;
flops
flops(0)
x2=inv(A)*b;
flops
return

>>
n =
150
ans =
2419042
ans =
6907967

For 150 x 150 matrices, the number of floating point operations necessary for the solution of the linear system is 2419042, for the computation of the matrix inverse it is 6907967. This means that the number of operations required for the matrix inversion is about three times as large as for the solution of the linear system. For the solution of the linear system, the highest computational cost is actually the factoring, not the backward substitution, and we can see that in the matrix inversion, the backward substitution for all columns takes about twice as many operations as the factoring itself.

2 http://www.netlib.org/benchmark/linpackd in FORTRAN, but also available in other languages, the results in http://www.netlib.org/benchmark/linpackd/performance.ps
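Since the flops command is no longer available, the operation counts cannot be reproduced in current versions, but the idea that the inverse is the solution of a linear system with the identity matrix as right-hand side can still be checked. This is a minimal sketch with a made-up, well-conditioned matrix.

```matlab
A = [4 1 0
     1 3 1
     0 1 2];
n = size(A,1);
E = eye(n);              % identity matrix, its columns are the right-hand sides

X = A\E;                 % solve A*X = E for all n columns at once

err = norm(X - inv(A))   % difference to the result of inv
res = norm(A*X - E)      % residual of the defining equation A*inv(A) = E
```

If only the solution of A*x = b is needed, x = A\b avoids the extra work of forming the inverse.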

5.5.4 Accuracy of the matrix inversion

Up to now, we have not discussed the error in matrix inversions. As we have not used any order of approximation, it is clear that there will be neither truncation error nor discretization error, and only rounding errors have to be taken into consideration. As a test case for the matrix inversion, let us consider the matrix

\[ A = \begin{pmatrix} 1 & 1 \\ 1 - \epsilon & 1 \end{pmatrix} \]

which has the inverse

\[ A^{-1} = \begin{pmatrix} 1/\epsilon & -1/\epsilon \\ -(1 - \epsilon)/\epsilon & 1/\epsilon \end{pmatrix}. \]

If we compute the inverse in double precision for $\epsilon = 10^{-8}$, we obtain for

A =
1.000000000000000 1.000000000000000
0.999999990000000 1.000000000000000

the inverse

99999999.4975241 -99999999.4975241
-99999998.4975241 99999999.4975241

instead of the expected result

\[ A^{-1} = \begin{pmatrix} 10^8 & -10^8 \\ -99999999 & 10^8 \end{pmatrix}. \]

What went wrong? The numerical parameter which describes how accurate a matrix inversioncan be computed, or a linear system can be solved, is the condition number ), which isimplemented in MATLAB’s cond-function. The condition number for a matrix A is defined asthe norm of the Matrix |A| divided by the norm of the inverse matrix |A#1|, or, if |A|/|A#1| <1, then as the inverse ) = |A#1|/|A|.There is a heuristic, which says that if the condition-number ) of a matrix A is - 10k, for amatrix inversion about k digits will be lost in accuracy. four our above matrix with # = 10#8,the condition number is ) = 4 · 10#8. as the error is about 0.5 for a matrix for which theentries are of the order of 108, we see that the predictions of the Heuristic are quite accurate.We have discussed that there are two possible implementations of the matrix-multiplication,the DOT and the DAXPY-product. The LU-decomposition can formally written as an op-erator O acting on the original matrix A, so that formally

O · A = L · U.

In other words, the LU-decomposition is a very special matrix-matrix multiplication, and therefore there are two variants. The conventional DAXPY variant, which is widely treated in textbooks on numerical analysis, is implemented in MATLAB and in packages like LINPACK, LAPACK, NAG and Visual Numerics. The rather rarely mentioned DOT variant, for which, depending on the implementation, names like Doolittle, Crout or Crout-Doolittle are used, is basically only used in NUMERICAL RECIPES, a compendium of numerical routines where none of the authors has a background in analysis. A DDOT routine in MATLAB (not shown, because I don't want anybody to use it) for the above problem produced the result

    1.99999999   -1.00000000
   -0.99999999    1.00000000

and the error was not in the eighth digit, but already in the first digit! The scalar product as a kernel introduces rounding errors which cannot be predicted with the conventional formula using the condition number.
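The loss of digits for the test matrix can be checked directly. The following is a minimal sketch, assuming the matrix with $\epsilon = 10^{-8}$ in the lower left entry and the 2-norm condition number that cond uses by default; the variable names are chosen for this illustration only.

```matlab
% Sketch: accuracy of inv() for the nearly singular 2x2 test matrix.
format long
epsilon = 1e-8;
A = [1 1; 1-epsilon 1];

% Analytic inverse: det(A) = epsilon
A_inv_exact = [1 -1; -(1-epsilon) 1] / epsilon;
A_inv_num   = inv(A);

kappa   = cond(A)                      % 2-norm condition number, about 4e8
rel_err = max(max(abs(A_inv_num - A_inv_exact))) / max(max(abs(A_inv_exact)))
digits_lost_heuristic = log10(kappa)   % heuristic: number of digits lost
```

The relative error should come out many orders of magnitude above the double precision rounding level of about $10^{-16}$, in line with the heuristic.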

5.6 Eigenvalues

The eigenvalues can be computed in MATLAB via the eig command. For a random matrix A, we obtain the eigenvalues as

>> A=randn(2)
A =
    0.5181   -1.2274
    0.8397    0.1920

>> eig(A)
ans =
   0.3551 + 1.0020i
   0.3551 - 1.0020i

so one can see that the eigenvalues of a real square matrix are in general not real. For a symmetric matrix, we see that

>> A=A+A'
A =
    1.0363   -0.3876
   -0.3876    0.3840

>> eig(A)
ans =
    1.2167
    0.2035

we obtain real eigenvalues. Formally, the eigenvalues $\lambda_i$ of a matrix $A$ are often introduced as the roots of the characteristic polynomial of $A$,

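$$\det(A - \lambda E) = 0, \qquad E = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$$

For a diagonal $2 \times 2$ matrix,

$$\det\left( \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} - \lambda \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right) = (a - \lambda)(b - \lambda) = 0,$$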

we see that the solutions for $\lambda$ are exactly $a$ and $b$. In other words, the eigenvalues of a diagonal matrix are the diagonal matrix entries themselves. As we have seen above, the eigenvalues of a real symmetric (here: diagonal) matrix are real. If we look at the characteristic polynomial, we see that for an upper triangular matrix, the off-diagonal elements vanish in the characteristic polynomial, so also for a triangular matrix the eigenvalues are exactly the diagonal elements:

A =
    1.0363   -0.3876
         0    0.3840
>> eig(A)
ans =
    1.0363
    0.3840

5.6.1 What to do with eigenvalues

For a non-diagonal matrix A, the "action" of the non-diagonal matrix can usually be replaced "somehow" by the "action" of one or more eigenvalues. One example for such an "action" is the multiplication of a matrix with a vector. If we multiply a vector iteratively with a matrix A and compute the norm of the vector, for example with the following program, we find that after several iterations the length of the iterated vector becomes the absolute value of the largest eigenvalue of the matrix:

clear
format compact
format long

v=[1
   1];
v=v/norm(v);
A=[1 2
   2 3];
eig(A)

for i=1:8
  v=A*v;
  norm_of_v=norm(v)
  v=v/norm(v);
end

ans =
  -0.23606797749979
   4.23606797749979

norm_of_v = 4.12310562561766
norm_of_v = 4.23570259468110
norm_of_v = 4.23606684261261
norm_of_v = 4.23606797397526
norm_of_v = 4.23606797748884
norm_of_v = 4.23606797749976
norm_of_v = 4.23606797749979
norm_of_v = 4.23606797749979

5.6.2 Diagonalization and Eigenvectors

Now, if we have a matrix $A = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$, we can call eig not only to compute the eigenvalues, but, using two output arguments in constructor brackets [], we can also obtain the eigenvectors as

>> [u,l]=eig(A)
u =
   0.85065080835204   0.52573111211913
  -0.52573111211913   0.85065080835204

l =
  -0.23606797749979                  0
                  0   4.23606797749979


(the eigenvalues l are then output not as a vector, but as a diagonal matrix). In our above example, where we iteratively multiplied the vector v with the matrix A, the end result for v is

>> v
v =
   0.52573111213781
   0.85065080834049

which is the right column of u, and therefore the eigenvector to the larger eigenvalue $\lambda_2 = 4.23606797749979$. In other words, our iterative multiplication of a matrix onto a vector is a way to find the largest eigenvalue and the eigenvector corresponding to this largest eigenvalue. In the literature, this method is often called the power method, because it corresponds to multiplying a power of $A$ onto $v$ (with the result normalized to unit length):

$$A^n v \rightarrow \vec{u}_{\max}.$$

The matrix u which contains the eigenvectors is at the same time the transformation which transforms A onto diagonal form, so that

$$u^{\dagger} A u = \begin{pmatrix} -0.23606797749979 & 0 \\ 0 & 4.23606797749979 \end{pmatrix}.$$

The matrix u is called a unitary transformation.
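Both statements, the diagonalization by u and the convergence of the iterated vector, can be verified numerically. The following is a small sketch for the matrix of this section; the names of the defect variables are assumptions made for this illustration.

```matlab
% Sketch: check the diagonalization and the power method for A = [1 2; 2 3].
A = [1 2; 2 3];
[u, l] = eig(A);

% u is orthogonal (unitary), and u'*A*u reproduces the diagonal matrix l
orthogonality_defect   = norm(u'*u - eye(2))
diagonalization_defect = norm(u'*A*u - l)

% Power method: repeated multiplication and normalization
v = [1; 1] / norm([1; 1]);
for i = 1:8
    w = A*v;
    lambda_est = norm(w);   % estimate of the largest |eigenvalue|
    v = w / lambda_est;
end

[lambda_max, idx] = max(abs(diag(l)));
lambda_err = abs(lambda_est - lambda_max)
alignment  = abs(v' * u(:,idx))   % 1 if v is parallel to the eigenvector
```

The sign of an eigenvector is arbitrary, which is why the alignment is measured with the absolute value of the scalar product.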

5.6.3 Computing the characteristic polynomial

As we have used Newton-Raphson iteration for the computation of roots of polynomials, one could think that this would also be a good method to compute the eigenvalues from the characteristic polynomial

$$\det(A - \lambda E) = 0.$$

Actually, this is not the case. The numerical algorithm for the computation of the eigenvalues, which will not be elaborated here, makes use of all of the $l \times l$ matrix entries, whereas the solution of the characteristic polynomial only makes use of $l$ coefficients computed from the $l \times l$ matrix entries, so again we lose significant information, as in the example for the intersection computation of ellipses via the fourth order polynomial.

On the contrary, instead of computing eigenvalues via roots, it is usually feasible to compute the roots of a polynomial by rewriting it as the corresponding eigenvalue problem. First let us divide the polynomial $P(x)$

$$P(x) = a_k x^k + a_{k-1} x^{k-1} + \dots + a_0 = 0$$

by the leading coefficient $a_k$, so that our equation (with the rescaled coefficients again called $a_i$) looks like

$$P(x) = x^k + a_{k-1} x^{k-1} + \dots + a_0 = 0.$$

Then one can set up the so-called companion matrix $C_P$ for $P(x)$, e.g. as

$$C_P = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{k-1} \end{pmatrix},$$

and the eigenvalues of $C_P$ are the roots of the polynomial $P(x)$. For example, the polynomial

$$P(x) = x^3 - 2x^2 - 5x + 6 = 0$$

has the roots $x_1 = 1$, $x_2 = -2$, $x_3 = 3$. If we set up the companion matrix as

C =
     0     1     0
     0     0     1
    -6     5     2

we obtain as the eigenvalues of the companion matrix C

>> eig(C)
ans =
    3.0000
    1.0000
   -2.0000
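The construction of the companion matrix can be automated for any monic polynomial. The following sketch assumes that the coefficients are stored with the highest power first, as MATLAB's roots function expects; the variable names are chosen for this illustration.

```matlab
% Sketch: roots of x^3 - 2x^2 - 5x + 6 via the companion matrix.
p = [1 -2 -5 6];            % coefficients, highest power first, leading 1
k = length(p) - 1;          % degree of the polynomial

% Shifted identity on top, last row = -[a_0 a_1 ... a_(k-1)]
C = [zeros(k-1,1) eye(k-1); -fliplr(p(2:end))]

r_companion = sort(eig(C))
r_builtin   = sort(roots(p))   % built-in polynomial root finder for comparison
```

Both results should agree with the roots -2, 1 and 3 up to rounding errors.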

5.6.4 Stability Analysis

Eigenvalues play an important role in stability analysis, i.e. in the analysis whether a numerical problem is "stable" or not. Instability usually results from eigenvalues whose absolute value is larger than 1 (or, in some cases, different from 1). As an example of how eigenvalues enter in the solution of problems, let us look at the example problem for the ordinary differential equations:

function dydt = f(t,y)
% necessary parameters as global variables
global D
global omega0
% velocity component
dydt(1)=-omega0^2 * y(2)- 2*D*y(1) ;
% position component
dydt(2)=y(1);
return

This can also be written in matrix notation as

function dydt = f(t,y)
% necessary parameters as global variables
global D
global omega0
% first row: velocity component, second row: position component
dydt=[- 2*D -omega0^2
       1     0       ]* y;
return

Obviously, the matrix $A = \begin{pmatrix} -2D & -\omega_0^2 \\ 1 & 0 \end{pmatrix}$ has eigenvalues, and the integration step is given by $A\,dt$, so in fact errors in the time integration can be analyzed by analyzing the eigenvalues of $A\,dt$. For this harmonic oscillator, the eigenvalues of $A\,dt$ are $\left(-D \pm \sqrt{D^2 - \omega_0^2}\right) dt$; they are constant in time, so the problem can also be analyzed purely analytically. For more complicated problems, like the ordinary differential equations of the Lorenz attractor

$$\begin{aligned}
\frac{dx}{dt} &= \sigma (y - x) \\
\frac{dy}{dt} &= rx - y - xz \\
\frac{dz}{dt} &= xy - bz,
\end{aligned}$$

with real constants $\sigma, r, b$, the matrix of the ordinary differential equation is obviously a non-linear function, because the time evolution $\left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right)$ cannot be written as a product of a matrix $A$ independent of $x, y, z$ and the vector $(x, y, z)$, like in the case of the harmonic oscillator. The classical way to analyze the stability of such a system is to "linearize" the matrix, usually a risky business, because the linearized matrix is not guaranteed to reproduce the full behavior of the non-linear system. The "modern" approach is to simply perform the time integration and output representative values for the eigenvalues of the matrix

$$B = \begin{pmatrix} -\sigma & \sigma & 0 \\ r & -1 & -x \\ 0 & x & -b \end{pmatrix}.$$

Now let us solve the Lorenz model with constant stepsize using the Euler method and let us plot the eigenvalues of the matrix. We know already that the Euler method is bad, so our solution will be inaccurate, but it will be much more interesting to implement the Euler method in two different ways and see how the two solutions diverge from each other.

% Compute the Lorenz model
clear, format compact
n=20;
r=60;
b=8/3;
sigma=10;
t_max=1.3
dt=0.01 % diverges with this timestep: dt=0.011;
ndt=round(t_max/dt);
x=zeros(ndt,1);
mat_eig=zeros(ndt,1);
x(1)=1;
y=x; z=x;

bild=0;
k=[1 1 1]';
k(:,2:ndt)=zeros(3,ndt-1);
prop=[ -sigma sigma  0
        0     -1     0
        0      0    -b];

% direct solution of the ODE
for i=1:ndt-1
  dx=sigma*(y(i)-x(i))*dt;
  dy=(x(i)*(r-z(i))-y(i))*dt;
  dz=(x(i)*y(i)-b*z(i))*dt;
  x(i+1)=x(i)+dx;
  y(i+1)=y(i)+dy;
  z(i+1)=z(i)+dz;
end
% solution of the ODE with matrix-vector multiplications
for i=1:ndt-1
  prop(2,1)=(r-k(3,i));
  prop(3,1)=(k(2,i));
  k(:,i+1)=dt*(prop*k(:,i))+k(:,i);
  mat_eig(i+1)=max(abs(eig(prop*dt)));
end
subplot(4,1,1)
plot3(x(1:ndt),y(1:ndt),z(1:ndt));
subplot(4,1,2)
plot3(k(1,1:ndt),k(2,1:ndt),k(3,1:ndt));
subplot(4,1,3)
plot3(k(1,1:ndt)-x(1:ndt)',...
      k(2,1:ndt)-y(1:ndt)',...
      k(3,1:ndt)-z(1:ndt)');
subplot(4,1,4)
plot(mat_eig)

This is the first surprise: two implementations of the Euler method don't give "numerically identical results", and the difference increases if we increase the maximal time. The next surprise comes when we increase the timestep from dt=0.01 to dt=0.013. We can see that then the solution and the maximal eigenvalues start to diverge, and if the maximal time is taken longer, the program even crashes because it reaches infinity. Here we have found the property of the solution of differential equations that the eigenvalues of the corresponding matrix times the time step may not become larger than 1, or the solution does not converge any more.

The eigenvalue spectrum obtained from the Euler method is also representative for the eigenvalues which we would obtain from higher-order methods like Runge-Kutta, which are themselves only a "sophisticated" concatenation of Euler steps with different step sizes.
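The role of the product of eigenvalue and timestep can be illustrated with the simplest possible case. The following sketch is an assumption-laden illustration, not part of the Lorenz program: it applies the Euler method to the scalar linear test equation $dy/dt = \lambda y$, for which one Euler step multiplies $y$ by $1 + \lambda\,dt$, and compares a stable and an unstable timestep. The values of lambda and dt are chosen only for this example.

```matlab
% Sketch: Euler method for the linear test equation dy/dt = lambda*y.
% One Euler step multiplies y by (1 + lambda*dt); the iteration decays
% only if the absolute value of this factor is smaller than 1.
lambda = -100;           % assumed (real, negative) eigenvalue
t_end  = 1;
dts    = [0.015 0.025];  % first timestep stable, second unstable

final_abs     = zeros(size(dts));
amplification = zeros(size(dts));
for j = 1:length(dts)
    dt = dts(j);
    N  = round(t_end/dt);
    y  = 1;
    for n = 1:N
        y = y + dt*lambda*y;   % Euler step
    end
    final_abs(j)     = abs(y);
    amplification(j) = abs(1 + lambda*dt);
end
amplification
final_abs
```

Although the exact solution $\exp(\lambda t)$ decays for both timesteps, the numerical solution grows without bound as soon as the amplification factor exceeds 1.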

5.6.5 Eigenvalue condition number

As in the case of the matrix inversion, there is a parameter which tells how accurately the computation of the eigenvalues can be performed. In MATLAB, the function which gives the eigenvalue condition numbers (different from the condition number cond for matrix inverses) is called condeig; it returns, for each eigenvalue, the reciprocal of the cosine of the angle between the corresponding left and right eigenvectors of the matrix.
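A short sketch of how condeig behaves, assuming two test matrices chosen for this illustration: for a symmetric matrix the left and right eigenvectors coincide, so the eigenvalue condition numbers are 1, whereas a strongly non-symmetric triangular matrix gives large values.

```matlab
% Sketch: eigenvalue condition numbers with condeig.
A_sym = [1 2; 2 3];              % symmetric: perfectly conditioned eigenvalues
c_sym = condeig(A_sym)

A_nonnormal = [1 1000; 0 2];     % triangular with a large off-diagonal entry
c_nonnormal = condeig(A_nonnormal)
```

A large eigenvalue condition number means that small perturbations of the matrix entries can change the corresponding eigenvalue considerably.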

Chapter 6

Ordinary differential equations

For ordinary differential equations there is a closed theory about which solution method should be applied in which case. In the case of ordinary differential equations, the total differential imposes additional constraints on the solution, so that the numerical equations can be satisfied more easily. In contrast, partial differential equations are much more difficult to treat numerically, because the boundary conditions impose certain constraints on the solution method, so that in the case of nonlinear equations, the optimal choice of a solution strategy is far from obvious.

6.1 Reference Example

6.1.1 Newton’s equations of motion

Ordinary differential equations play an important role in science and engineering, and maybe the most central equation is Newton's equation of motion, which relates a time-, velocity- and position-dependent force $F(x, \dot{x}, t)$, the mass $m$ and the acceleration $a$:

$$F(x, \dot{x}, t) = ma.$$

Rewriting the equation with the second derivative of the position $x$, we get

$$F(x, \dot{x}, t) = m\ddot{x},$$

which due to the second derivative of $x$ is called an ordinary differential equation of second order. In general, it can be shown that $n$ ordinary differential equations of order $m$ can be rewritten into $n \cdot m$ coupled differential equations of first order. For the case of Newton's equations of motion, this can be done by introducing the velocity $v$ as the derivative of $x$, so that

$$F(x, \dot{x}, t) = m\dot{v}, \qquad v = \dot{x}.$$

Because standard texts in numerical analysis prefer to deal with first order differential equations, it is important to understand the latter form.

6.1.2 Linear oscillator

For simplicity, we set the mass $m = 1$. If the force takes the form $-2D\dot{x} - \omega_0^2 x$, which corresponds to a linear spring with linear damping, the equation of motion takes the form

$$-2D\dot{x} - \omega_0^2 x = \ddot{x}.$$

The solution of this equation with the damping term $2D$ and the frequency of the undamped oscillation $\omega_0$ is

$$x(t) = x_0 \exp(-Dt) \cos(\omega_D t), \qquad \omega_D = \sqrt{\omega_0^2 - D^2}.$$

Though there are solution schemes to solve second order equations directly, it is usually simpler to solve equations of second order by reducing them to a system of coupled first order equations. For our problem, we introduce the velocity $v$ and its time derivative $\dot{v} = a$, so this leads to the system of first order equations

$$\dot{v} = -2D\dot{x} - \omega_0^2 x, \qquad \dot{x} = v.$$

It is customary in the mathematical community to introduce the vector $y = \begin{pmatrix} v \\ x \end{pmatrix}$ (without the vector symbol) and to rewrite the equation as

$$\frac{d}{dt} y = F(y, t),$$

where $F(y, t)$ becomes a vector-valued function with the time $t$ and the vector $y$ as arguments.

6.2 The Euler Method

When faced with a differential equation for a numerical implementation, intuitively one first wants to replace the differential operator $d$ by the finite difference $\Delta$ as

$$\frac{dy}{dt} \approx \frac{y(t + \Delta t) - y(t)}{\Delta t},$$

so that the first order approximation for the step from one time $t_i$ with value $y(t_i)$ and the function value $F_i = F(y_i, t_i)$ to the next is

$$y(t_i + \Delta t) \approx y(t_i) + F_i \Delta t,$$

which is called the Euler method.

clear
format compact

D=.2 , omega0=1 % Damping and Force constant
x0=1 , v0=0 %Initial conditions
dt=0.1, t0=0, t_max=30 % time-step, start-time, end time

y(1,1)=v0
y(1,2)=x0
t(1)=t0
n=1
while (t(n)<t_max)
% velocity
  y(n+1,1)=y(n,1)+dt*(-omega0^2 * y(n,2)- 2*D*y(n,1) );
% position
  y(n+1,2)=y(n,2)+dt*y(n,1);
% current time
  t(n+1)=t(n)+dt;
  n=n+1;
end

% exact solution
omega_d=sqrt(omega0^2-D^2);
y_ex=(x0*exp(-D*t).*cos(omega_d*t))';
subplot(2,2,1)
plot(t,y(:,2),'-',t,y_ex,':')
legend('Euler, dt=0.1','Exact')
axis tight

[Figure: Strategy of the Euler method: evaluate the value at the right side of the interval via the starting value at the left side and the tangent at the left side of the interval. Shown are the exact solution and the computed solution between t0 and t0+dt, starting from y0.]

[Figure: Result of the Euler method (dt=0.1) for the damped harmonic oscillator compared with the exact solution for t from 0 to 30: the period and the amplitude are wrong.]

By construction, the Euler method is a first-order method, because we only retained the terms proportional to $\Delta t$ in the expansion. Of course, if we plot the absolute error for our exponentially vanishing solution, the error also vanishes exponentially; therefore we don't draw the error here.

6.2.1 Discussion of the Error

Geometrically speaking, the Euler method chooses the starting value and the tangent at the same point for the integration, which is correct only in the infinitesimal limit. It can be seen that the results obtained from the Euler method are far from satisfying for our timestep of dt = 0.1. If we decrease the step size, we can reduce the discretization error for the time step, but there is a limit, which is reached for the above differential equation with dt = 0.001: using 1/10 or 1/100 of this timestep does not change the result significantly any more, although it costs 10 times and 100 times the computational effort.

[Figure: Error of the Euler method for t from 0 to 30 on a logarithmic scale from 10^-7 to 10^-1, for dt = 10^-2, 10^-3, 10^-4 and 10^-5.]

For some ordinary differential equations, decreasing the timestep also increases the rounding error (from adding each new timestep) for integrating over the same time interval, so in the limit dt → 0 we won't obtain the correct result with the Euler method. Therefore there is one thing about the Euler method which should be kept in mind: NEVER USE THE EULER METHOD IN A SERIOUS APPLICATION¹

¹ Except for stochastic differential equations, where the stochastic noise destroys the systematic error, but even then there may be better choices ....
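How the discretization error of the Euler method scales with the timestep can be measured directly. The following sketch makes two assumptions for the illustration: it uses the initial velocity $v_0 = -D x_0$, for which $x_0 \exp(-Dt)\cos(\omega_D t)$ is the exact solution, and it stays in the range of timesteps where the discretization error dominates over the rounding error.

```matlab
% Sketch: first-order convergence of the Euler method for the damped
% harmonic oscillator; the maximal error should halve when dt is halved.
D = 0.2; omega0 = 1; x0 = 1;
omega_d = sqrt(omega0^2 - D^2);
v0 = -D*x0;        % velocity matching x0*exp(-D*t)*cos(omega_d*t) at t=0
t_max = 5;
dts = [0.01 0.005 0.0025];

max_err = zeros(size(dts));
for j = 1:length(dts)
    dt = dts(j);
    N  = round(t_max/dt);
    v = v0; x = x0;
    for n = 1:N
        v_new = v + dt*(-omega0^2*x - 2*D*v);   % velocity, old values only
        x     = x + dt*v;                        % position, old velocity
        v     = v_new;
        t     = n*dt;
        max_err(j) = max(max_err(j), abs(x - x0*exp(-D*t)*cos(omega_d*t)));
    end
end
max_err
ratio = max_err(1:end-1) ./ max_err(2:end)   % close to 2 for a first-order method
```

Halving the error thus costs twice the computational effort, which is why the accuracy of the Euler method improves so slowly.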


6.2.2 Modified Euler Method

There are several strategies which can be used to reduce the error in the Euler method. One possibility is to use for the timestep from $t_0$ to $t_0 + dt$ the value of $F(y, t_0 + dt/2)$ in the middle of the interval, instead of $F(y, t_0)$ on the left of the interval, which results in a second order method. This is similar to the midpoint method in numerical quadrature, which gives a second order method, whereas the rectangular method with the value at the end of the interval gives only a quadrature rule of first order.

[Figure: Strategy of the modified Euler method: evaluate the value at the right side of the interval via the starting value at the left side of the interval and the tangent in the middle of the interval. Shown are the exact solution and the computed solution between t0, t0+dt/2 and t0+dt, starting from y0.]

[Figure: Result of the modified Euler method for the damped harmonic oscillator compared with the exact solution for t from 0 to 30: the period and the amplitude are computed much more accurately than for the Euler method.]
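The script does not list a program for the modified Euler method, so the following is only a minimal sketch of one common realization. It assumes that the value of $y$ in the middle of the interval is obtained by an Euler half step, that the right-hand side is passed as an anonymous function f (a name chosen for this illustration), and that the initial velocity is $v_0 = -D x_0$, so that the cosine solution is exact.

```matlab
% Sketch: modified Euler (midpoint) method versus Euler method for the
% damped harmonic oscillator, state vector y = [v; x].
D = 0.2; omega0 = 1; x0 = 1;
omega_d = sqrt(omega0^2 - D^2);
f = @(t,y) [-omega0^2*y(2) - 2*D*y(1); y(1)];   % right-hand side dy/dt

dt = 0.1; t_max = 30;
N = round(t_max/dt);
t = (0:N)*dt;

y_eul = zeros(2,N+1);
y_mid = zeros(2,N+1);
y_eul(:,1) = [-D*x0; x0];
y_mid(:,1) = [-D*x0; x0];
for n = 1:N
    % Euler: tangent at the left side of the interval
    y_eul(:,n+1) = y_eul(:,n) + dt*f(t(n), y_eul(:,n));
    % modified Euler: Euler half step, then tangent in the middle
    y_half       = y_mid(:,n) + 0.5*dt*f(t(n), y_mid(:,n));
    y_mid(:,n+1) = y_mid(:,n) + dt*f(t(n) + 0.5*dt, y_half);
end

x_exact = x0*exp(-D*t).*cos(omega_d*t);
err_euler    = max(abs(y_eul(2,:) - x_exact))
err_midpoint = max(abs(y_mid(2,:) - x_exact))
```

The price for the smaller error is a second evaluation of the right-hand side in every timestep.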

6.2.3 Heun’s method

Heun's method uses the value $y_0$ and the tangent $F(y_0, t_0)$ at the left of the interval to compute the Euler step $y_0 + dt\,F(y_0, t_0)$ to the right of the interval as an estimate (prediction) of the value at the right of the interval. Then it uses as "corrected" value for $F(y, t)$ the average of $F(y_0, t_0)$ at the left-hand side of the interval and $F(y_0 + dt\,F(y_0, t_0), t_0 + dt)$ at the right-hand side.

Heun's method is a second order method, and there is a certain structural similarity to the trapezoidal rule in quadrature, where also the left-hand value and the right-hand value are used.

Nevertheless, there are some new perspectives in this method which allow us to develop a new class of integration methods for ordinary differential equations which are of higher than second order:

1. The idea of first advancing the time integration in a "predictor step", then modifying the result in a "corrector step", is the basis of the so-called predictor-corrector methods.

2. The idea to use more than a single value of $F(y, t)$ within a single time interval dt is the basis of the Runge-Kutta-type methods.

clear ; format compact

D=.2 , omega0=1 % Damping and Force constant
x0=1 , v0=0 %Initial conditions
dt=0.1, t0=0, t_max=30 % time-step, start-time, end time

y(1,1)=v0
y(1,2)=x0
t(1)=t0
n=1
while (t(n)<t_max)
% predictor step (Euler step over the full interval)
% velocity
  y_pred1=y(n,1)+dt*(-omega0^2 * y(n,2)- 2*D*y(n,1) );
% position
  y_pred2=y(n,2)+dt*y(n,1);
% corrector step (average of left and predicted right values)
% velocity
  y(n+1,1)=y(n,1)+dt*(-omega0^2*.5*(y(n,2)+y_pred2)- 2*D*.5*(y(n,1)+y_pred1));
% position
  y(n+1,2)=y(n,2)+dt*.5*(y(n,1)+y_pred1);
% current time
  t(n+1)=t(n)+dt;
  n=n+1;
end
% exact solution
omega_d=sqrt(omega0^2-D^2);
y_ex=(x0*exp(-D*t).*cos(omega_d*t))';
subplot(2,2,1)
plot(t,y(:,2),'-',t,y_ex,':')
legend('Heun, dt=0.1','Exact')
axis tight


[Figure: Strategy of Heun's method: evaluate the values and tangents at the left and right side of the interval as "predicted values" and take the average as "corrected value". Shown are the exact solution, the predicted solutions and the computed corrected solution between t0 and t0+dt.]

[Figure: Result of Heun's method (dt=0.1) for the damped harmonic oscillator compared with the exact solution for t from 0 to 30: the period and the amplitude are computed much more accurately than for the Euler method.]

6.2.4 Stability

Up to now we have focused in our investigations purely on "accuracy", in the meaning that a numerical solution will have some finite error in comparison to the exact solution of the problem, but will have more or less the same "shape". Actually, a more fundamental problem in numerical analysis is "stability", loosely speaking, the mathematical problem whether a numerical solution has the "same shape" as the exact solution at all. Let us look at the following first order differential equation

$$\frac{d}{dt} y(t) = 1 - t\,y^{1/3},$$

whose solution for $y(0) = 1$ is strictly real in the interval [0,5]. The numerical solution overshoots for too large time-steps, as is shown in the following graphs for the numerical solution with the Euler method, so that $y(t)$ becomes negative, and MATLAB therefore delivers the complex roots of the negative values of $y(t)$. The result of the numerical integration for too large time-steps shows a totally different shape than the exact solution, and is therefore called "unstable". It is therefore a primary aim to choose numerical methods and time-steps so that the solution is stable; accuracy is "only" a secondary concern.

Regrettably, some methods which give very high accuracy for some problems give very little stability with other problems. It is often advisable to check the stability of a method by using different time steps to see if the numerical solution changes or not. If a small change of the time-step leads to only a small change in the solution, the solution is stable. The mathematical definition of stability is that a solution undergoes only a small change for a small change of the initial conditions, and in this respect, the time-step represents something like an "initial condition".
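The overshoot into complex values can be reproduced with a few lines. The following sketch applies the Euler method to the equation above with a large and a small timestep taken from the legend of the graphs; the flag became_complex is a name introduced only for this illustration.

```matlab
% Sketch: Euler method for dy/dt = 1 - t*y^(1/3), y(0) = 1, on [0,5].
% For too large timesteps y becomes negative, and y^(1/3) turns complex.
t_end = 5;
dts   = [0.5 0.05];
became_complex = false(size(dts));
for j = 1:length(dts)
    dt = dts(j);
    N  = round(t_end/dt);
    y  = zeros(1,N+1);
    y(1) = 1;
    for n = 1:N
        t = (n-1)*dt;
        y(n+1) = y(n) + dt*(1 - t*y(n)^(1/3));   % Euler step
    end
    became_complex(j) = any(imag(y) ~= 0);
end
became_complex
```

With dt=0.5 the numerical solution leaves the real axis after a few steps, whereas with dt=0.05 it stays real and positive over the whole interval.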


[Figure: Real component (upper panel) and imaginary component (lower panel) of y over t from 0 to 5 for the Euler method with dt=0.5, dt=0.25, dt=0.125 and dt=0.05, compared with the exact solution.]

6.3 Programming ordinary differential equations

6.3.1 Readability

Before we proceed to higher order formulae, we should improve the readability of the code. Here is a good opportunity to introduce MATLAB functions. For the Euler method, we had

% velocity
y(n+1,1)=y(n,1)+dt*(-omega0^2 * y(n,2)- 2*D*y(n,1) );
% position
y(n+1,2)=y(n,2)+dt*y(n,1);

As was emphasized in the first chapter, readability is paramount in programming, and whereas the Euler algorithm was still quite readable, for Heun's method we had to treat

% velocity
y_pred1=y(n,1)+dt*(-omega0^2 * y(n,2)- 2*D*y(n,1) );
% position
y_pred2=y(n,2)+dt*y(n,1);
% velocity
y(n+1,1)=y(n,1)+dt*(-omega0^2*.5*(y(n,2)+y_pred2)- 2*D*.5*(y(n,1)+y_pred1));
% position
y(n+1,2)=y(n,2)+dt*.5*(y(n,1)+y_pred1);

and this is certainly not readable any more. One insight is that the force law for the time integration of the spring was entered twice, once for the corrector step and once for the predictor step, so it would be a good idea to implement the force law evaluation as a MATLAB function.

6.3.2 Global variables

Most ordinary differential equations do not only need the time as an input parameter, which we specified in the above example, but they need other input parameters as well. One of the simplest ways to incorporate such input parameters is via MATLAB's global attribute, which allows the specification of global variables, and which is of course not limited in its use to functions for ordinary differential equations.

6.3.3 MATLAB functions

A typical MATLAB function is the following code, where the constructor brackets [] have to be used for the output arguments and the round brackets () for the input arguments:

function [output_arg1,output_arg2]=function_name(input_arg1,input_arg2);
% Comment following the function declaration; This comment will be displayed
% when you type
% "help function_name"
% from the MATLAB prompt.
global a % a global variable, which must be declared as global
         % somewhere else and initialized

output_arg1=input_arg1+input_arg2*a

output_arg2=input_arg1*input_arg2

return % end of function

The function is called for example as

[out1,out2]=function_name(25,24)

If not all input arguments are supplied in the function call, like

[out1,out2]=function_name(25)

MATLAB terminates with an error message as soon as the missing argument is used. MATLAB functions cannot override input arguments: in the following example,

function [output_arg1,output_arg2]=function_name(input_arg1,input_arg2);

output_arg1=input_arg1+input_arg2

output_arg2=input_arg1*input_arg2

input_arg1=15
return % end of function

the line

input_arg1=15

does not have any effect on the calling program, because only output arguments (in constructor brackets []) are copied back to the calling program. If not all output or input arguments are assigned, MATLAB terminates with an error message. Overloading, i.e. the use of a variable number of input arguments, is possible, and in this case one has to query the number of input arguments with the MATLAB function nargin and the number of output arguments with nargout. We will not treat overloading here, but it is easy to find examples for overloaded methods by looking at some MATLAB functions which exist as MATLAB code (most MATLAB functions are written in MATLAB) in the toolbox directory.

which histo

will display the directory in which the toolbox MATLAB function histo can be found, and it is possible to load the function into the editor and view the usage of the overloading.

6.3.4 MATLAB-functions for ODE-Solvers

It is customary to write MATLAB-functions for ODE-Solvers with a header like

function dydt = f(t,y)

with the time t and the generalized coordinates y as input and the first order derivative of the generalized coordinates dydt as output. We can rewrite Heun's method

% velocity
y_pred1=y(n,1)+dt*(-omega0^2 * y(n,2)- 2*D*y(n,1) );
% position
y_pred2=y(n,2)+dt*y(n,1);
% velocity
y(n+1,1)=y(n,1)+dt*(-omega0^2*.5*(y(n,2)+y_pred2)- 2*D*.5*(y(n,1)+y_pred1));
% position
y(n+1,2)=y(n,2)+dt*.5*(y(n,1)+y_pred1);

using the MATLAB function (we will retain the time as function argument, though there is no explicit time dependence in our force law)

function dydt = f(t,y)
% necessary parameters as global variables
global D
global omega0
% velocity component
dydt(1)=-omega0^2 * y(2)- 2*D*y(1) ;
% position component
dydt(2)=y(1);
return

as

clear ; format compact

global D omega0
D=.2 , omega0=1 % Damping and Force constant
x0=1 , v0=0 %Initial conditions
dt=0.1, t0=0, t_max=30 % time-step, start-time, end time

y(1,1)=v0
y(1,2)=x0
t(1)=t0
n=1
while (t(n)<t_max)
% predicted value
  y_pred=y(n,:)+dt*f(t(n),y(n,:));
% corrected value
  y(n+1,:)=y(n,:)+.5*dt*(f(t(n),y(n,:))+f(t(n)+dt,y_pred));
  t(n+1)=t(n)+dt;
  n=n+1;
end
% exact solution
omega_d=sqrt(omega0^2-D^2);
y_ex=(x0*exp(-D*t).*cos(omega_d*t))';
subplot(2,2,1)
plot(t,y(:,2),'-',t,y_ex,':')
legend('Heun, dt=0.1','Exact')
axis tight

which is much more readable than the original. This also allows us to see how many function evaluations are necessary for a given timestep: whereas for the original Euler method we used one function evaluation per timestep, for Heun's method we need two function evaluations per timestep. Therefore, Heun's method is not only more accurate, but also more costly. Higher order methods will need even more function evaluations. Therefore we rewrite Heun's method as a function, along with the computation of the number of steps and a check of the reasonableness of the input parameters. Moreover, the feval command of MATLAB is used, so that the name of the MATLAB file which contains the differential equation can be passed as an argument. Finally, we have initialized tout and yout so that we don't lose time by allocating new memory space when a new element is added to these vectors in each timestep:

function [tout,yout] = heun(yinit,tstart,tend,dt,f)
% Heun's Method:
% Runge-Kutta integrator (2nd order)
% Input arguments -
%   yinit  = initial value of the dependent variable
%   tstart = start time, tend = end time
%   dt     = step size (usually timestep)
%   f      = right hand side of the ODE; f is the
%            name of the function which returns dy/dt
%            Calling format f(y,t).
% Output arguments -
%   tout = times of all steps
%   yout = values of y at the times tout (one column per step)
nsteps=ceil((tend-tstart)/dt)
if nsteps<0
  tstart
  tend
  dt
  error('tend-tstart is not a positive multiple of dt')
end
if (abs(nsteps*dt-(tend-tstart))>1e-6*dt)
  disp('warning: time interval not a multiple of timestep')
  disp('inputed timestep:')
  dt
  dt=(tend-tstart)/nsteps
  disp('use instead')
  dt
end

if (size(yinit,1)==1)
  yinit=yinit';
end

% allocate necessary memory to save time:
yout=zeros(length(yinit),nsteps+1);
tout=zeros(1,nsteps+1);

yout(:,1)=yinit;
y=yinit;
tout(1)=tstart;

n=1;
for k=1:nsteps
  F1 = feval(f,y,tout(n));
  t_full = tout(n) + dt;
  ytemp = y + dt*F1;
  F2 = feval(f,ytemp,t_full);
  n=n+1;
  y= y + .5*dt*(F1 + F2);
  yout(:,n)=y;
  tout(n)=t_full;
end
return

This program can then be called from a driver routine (a routine which does nothing else than call a specific function) in such a way:

clear ;
format compact
global D omega0
D=.2 , omega0=1 % Damping and Force constant
omega_d=sqrt(omega0^2-D^2);
dt=0.1, t0=0, t_max=20 % time-step, start-time, end time
x0=1 %Initial conditions
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);

[t,y]=heun([v0;x0],t0,t_max,dt,'harm_osc');

% exact solution
y_ex=(x0*exp(-D*t).*cos(omega_d*t))';
subplot(2,2,1)
plot(t,(y(2,:)-y_ex')./y_ex',':')
legend('Heun, dt=0.1, relative error')
axis tight
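The function harm_osc which the driver passes to heun is not listed in the script. A minimal sketch of what such a file harm_osc.m could look like is given below; it assumes the calling format f(y,t) stated in the header comment of heun, the state vector y = [v; x] as a column, and the global parameters D and omega0 set by the driver.

```matlab
function dydt = harm_osc(y,t)
% Sketch (assumed, not from the script): right-hand side of the damped
% harmonic oscillator for use with heun and rk4_class.
% Calling format f(y,t); y = [v; x] is a column vector.
global D
global omega0
dydt = [-omega0^2*y(2) - 2*D*y(1)    % velocity component
         y(1)];                      % position component
end
```

Returning a column vector matters here, because heun adds dt*F1 to the column vector y.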

6.4 The classical Runge-Kutta formula

6.4.1 The Idea

The idea to evaluate the right-hand side not only at a single point, but to compute within a single timestep several evaluation points from previously computed auxiliary steps, is realized in the so-called Runge-Kutta algorithm. The formulae for the so-called "classical" Runge-Kutta method are

$$\begin{aligned}
y_{i+1} &= y_i + \frac{dt}{6}\,(k_1 + 2k_2 + 2k_3 + k_4) \\
k_1 &= f(t_i, y_i) \\
k_2 &= f(t_i + dt/2,\; y_i + k_1\,dt/2) \\
k_3 &= f(t_i + dt/2,\; y_i + k_2\,dt/2) \\
k_4 &= f(t_i + dt,\; y_i + k_3\,dt)
\end{aligned}$$

It uses four evaluations F1, F2, F3, F4 (the $k_1, \dots, k_4$ above) at the intermediate times $t_0$, $t_0 + dt/2$, $t_0 + dt/2$, $t_0 + dt$. F2 is computed from F1, F3 from F2 and F4 from F3. Afterwards, the new y is computed from a weighted average of F1, F2, F3, F4. The function then looks like this

function [tout,yout] = rk4_class(yinit,tstart,tend,dt,f)
% Classical Runge-Kutta integrator (4th order)
% Input arguments -
%   yinit  = initial value of the dependent variable
%   tstart = start time, tend = end time
%   dt     = step size (usually timestep)
%   f      = right hand side of the ODE; f is the
%            name of the function which returns dy/dt
%            Calling format f(y,t).
% Output arguments -
%   tout = times of all steps
%   yout = values of y at the times tout (one column per step)
nsteps=ceil((tend-tstart)/dt)
if nsteps<0
  tstart
  tend
  dt
  error('tend-tstart is not a positive multiple of dt')
end
if (abs(nsteps*dt-(tend-tstart))>1e-6*dt)
  disp('warning: time interval not a multiple of timestep')
  disp('inputed timestep:')
  dt
  dt=(tend-tstart)/nsteps
  disp('use instead')
  dt
end

if (size(yinit,1)==1)
  yinit=yinit';
end

yout=zeros(length(yinit),nsteps+1);
tout=zeros(1,nsteps+1);

yout(:,1)=yinit;
y=yinit;
tout(1)=tstart;

n=1;

half_dt = 0.5*dt;
dt_6=dt/6;
for k=1:nsteps
  F1 = feval(f,y,tout(n));
  t_half = tout(n) + half_dt;
  ytemp = y + half_dt*F1;
  F2 = feval(f,ytemp,t_half);
  ytemp = y + half_dt*F2;
  F3 = feval(f,ytemp,t_half);
  t_full = tout(n) + dt;
  ytemp = y + dt*F3;
  F4 = feval(f,ytemp,t_full);
  y = y + dt_6*(F1 + F4 + 2.*(F2+F3));
  n=n+1;
  yout(:,n) = y;
  tout(n)=t_full;
end

return;

The same driver as for Heun's program above can be used, just with the line

[t,y]=heun([v0;x0],t0,t_max,dt,'harm_osc');

replaced by

[t,y]=rk4_class([v0;x0],t0,t_max,dt,'harm_osc');

6.4.2 The importance of the initial condition

In the sections on Euler's and Heun's method, we used as initial condition $x_0 = 1$, $v_0 = 0$ to compare the numerical result with the exact solution

$$y_{ex} = x_0 \exp(-Dt) \cos(\omega_d t).$$

Actually, this solution is not the exact solution for the initial value problem with $v_0 = 0$, but for the initial value problem with

$$v_0 = -D \exp(-Dt_0) \cos(\omega_d t_0) - \omega_d \exp(-Dt_0) \sin(\omega_d t_0).$$

With the wrong initial value, the improvement of the Runge-Kutta method compared to Heun's method would not have been visible, because the initial value for the integration is so far off that the numerical solution is quite wrong. Whenever one computes numerical solutions to compare them with exact solutions, one should be sure that they are the solutions for the identical problem.
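For reference, the dependence of the exact solution on both initial values can be written down explicitly. The following formula is a standard result for the underdamped oscillator ($D < \omega_0$) with start time $t_0 = 0$; it is added here as a check and is not taken from the script:

$$x(t) = \exp(-Dt) \left( x_0 \cos(\omega_d t) + \frac{v_0 + D x_0}{\omega_d} \sin(\omega_d t) \right), \qquad \omega_d = \sqrt{\omega_0^2 - D^2}.$$

Differentiating at $t = 0$ gives $\dot{x}(0) = -D x_0 + (v_0 + D x_0) = v_0$, as required. The sine term vanishes only for $v_0 = -D x_0$, which is the value used in the driver routine for $x_0 = 1$; for $v_0 = 0$ the exact solution contains the additional term $(D x_0/\omega_d) \exp(-Dt) \sin(\omega_d t)$.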


[Figure: Numerical solution with the correct initial condition, exact solution, and numerical solution with the incorrect initial condition for t from 0 to 9.]

6.4.3 Accuracy

Now that we have outlined several algorithms with di!erent order (di!erent truncation errorwith respect to the Taylor expansion of dt), we should compare the above methods withrespect to their cost and accuracy. As has been mentioned already above, the cost of aRunge-Kutta step is four function evaluations per timestep, in contrast to a single functionevaluation for Euler and two function evaluations for Heun. Let us compare the accuracyof the three methods, once for the absolute accuracy ycomputed ! yexact, and once forthe relative accuracy (ycomputed ! yexact)/yexact. For the Euler method, we obtain anexponentially decaying error due to the fact that the solution decays exponentially, and therelative error increases exponentially. The absolute error starts at the order of 10#2, whichis the square of the order of the timestep (dt = 0.1)2, as was expected.

[Figure: "Euler, dt=0.1, absolute error" (left) and "Euler, dt=0.1, relative error" (right), logarithmic scale, plotted against time.]

For Heun's method, the absolute error starts at the order of 10^-3, which is the third power of the timestep, (dt = 0.1)^3, which was also expected. The relative error is constant for a certain time, and then increases exponentially.


[Figure: "Heun, dt=0.1, absolute error" (left) and "Heun, dt=0.1, relative error" (right), logarithmic scale, plotted against time.]

For the classical method by Runge and Kutta, the absolute error starts at the order of 10^-6, which is by sheer luck one order more accurate than the fifth power of the timestep, (dt = 0.1)^5, which was expected as the absolute error. Again, as in Heun's method, the relative error is constant for a certain time, and then diverges exponentially.

[Figure: "Class. Runge-Kutta, dt=0.1, absolute error" (left) and "Class. Runge-Kutta, dt=0.1, relative error" (right), logarithmic scale, plotted against time.]

The last investigations have reviewed some old concepts and shown some important new concepts for error analysis:

1. The orders of the Euler, Heun and Runge-Kutta methods are 1, 2 and 4, respectively; therefore the absolute error at the beginning of the integration process is of the order +1 of the timestep, i.e. dt^2, dt^3 and dt^5, for an initial amplitude of the order of 1.

2. The local error is the error for a single timestep, and the local absolute error at the beginning of the integration is the same as the local relative error.

3. The behavior of the relative error is a bit more complicated: as can be seen, the relative error increases during the integration process, but not monotonically. The error at the end of the integration process is called the global error, and it can be seen that the global relative error is much larger than the local absolute error. Whenever one performs a time-integration of ordinary differential equations, one should know which error is actually permitted, and this is determined by the physical problem.
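The relation between the order of a method and its local error can be checked with a single timestep. The following is a sketch under assumptions: it uses the damped oscillator with D = 0.2, omega0 = 1 and the consistent initial velocity v0 = -D, takes one step with dt = 0.1 and one with dt = 0.05, and estimates the exponent from the ratio of the two errors. The variable names are chosen here for illustration.

```matlab
% Sketch: local error of Euler, Heun and classical Runge-Kutta after ONE step
D = 0.2; omega0 = 1; omega_d = sqrt(omega0^2 - D^2);   % assumed parameters
f = @(y,t) [-omega0^2*y(2) - 2*D*y(1); y(1)];          % y = [velocity; position]
x_ex = @(t) exp(-D*t).*cos(omega_d*t);                 % exact position for x0 = 1
y0 = [-D; 1];                                          % consistent initial condition

dts = [0.1 0.05];
err = zeros(3, numel(dts));                            % rows: Euler, Heun, RK4
for j = 1:numel(dts)
    dt = dts(j);
    % Euler step
    yE = y0 + dt*f(y0,0);
    % Heun step
    F1 = f(y0,0);
    F2 = f(y0 + dt*F1, dt);
    yH = y0 + 0.5*dt*(F1 + F2);
    % classical Runge-Kutta step
    K1 = f(y0, 0);
    K2 = f(y0 + 0.5*dt*K1, 0.5*dt);
    K3 = f(y0 + 0.5*dt*K2, 0.5*dt);
    K4 = f(y0 + dt*K3, dt);
    yR = y0 + dt/6*(K1 + 2*K2 + 2*K3 + K4);
    % absolute error of the position after one step
    err(:,j) = abs([yE(2); yH(2); yR(2)] - x_ex(dt));
end
p = log2(err(:,1)./err(:,2));     % observed exponent of dt in the local error
disp([err p])
```

Halving the timestep should reduce the local error by roughly 2^(order+1), so the last column is expected to be close to 2, 3 and 5.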


6.5 Adaptive Stepsize Control

It is possible to construct Runge-Kutta schemes using redundant evaluations of F(y, t) so that the timestep can be computed in order n and in order n + 1 simultaneously. One of the first such methods was proposed by Fehlberg for fourth and fifth order; another, currently very popular method of the same order is the more stable scheme by Prince and Dormand. The knowledge about both orders allows one to estimate the error of the solution, and one can therefore devise strategies to reduce the timestep if the error is too large, or to increase the timestep if the solution is more accurate than desired (and therefore takes too much computer time). Such Runge-Kutta methods are built into MATLAB as ode23 and ode45, and with the driver

clear ;
format compact
global D, D=.2 ,
global omega0, omega0=1 % Damping and Force constant
omega_d=sqrt(omega0^2-D^2);

dt=0.1, t0=0, t_max=50 % time-step, start-time, end time
x0=1 %Initial conditions
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);

[t,y]=ode23('harm_osc2',[t0 t_max],[v0 x0]);

y_ex=(x0*exp(-D*t).*cos(omega_d*t))';
subplot(2,2,1)
semilogy(t,abs(y(:,2)-y_ex'),':')
title('ode23, dt=0.1, absolute error')
axis tight
subplot(2,2,2)
semilogy(t,abs((y(:,2)-y_ex')./y_ex'),':')
title('ode23, dt=0.1, relative error')
axis tight

and the file for the differential equation harm_osc2

function dydt=harm_osc2(t,y)
format compact
global D
global omega0
% d velocity/dt
dydt(1,1)=-omega0^2 * y(2)- 2*D*y(1);

% d position/dt
dydt(2,1)=y(1);

return

This file is different from our previously used harm_osc.m file, as the order of the input parameters t and y is exchanged. The following solution for our damped harmonic oscillator has been computed:


[Figure: computed solution (top) and timestep chosen by ode23 (bottom), both plotted against time.]

Above the solution was plotted, below we see the timestep. The time-adaption algorithm changed the timestep depending on whether the oscillation was at a relative minimum or in a straight motion. The accuracy of the time integration can be set by the input parameters of the ode23 function, see help ode23. The accuracy diagram for the default accuracy is the following:

[Figure: "ode23, dt=0.1, absolute error" (left) and "ode23, dt=0.1, relative error" (right), logarithmic scale, plotted against time.]

The same plots can be made for the ode45 algorithm, which gives the following accuracy diagram


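[Figure: accuracy diagram for ode45, absolute error (left) and relative error (right), logarithmic scale, plotted against time. The panel titles as printed read "ode23, dt=0.1, absolute error" and "ode23, dt=0.1, relative error".]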

The computed solution and the timestep are

[Figure: computed solution (top) and timestep chosen by ode45 (bottom), both plotted against time from 0 to 50.]

and it can be seen that MATLAB starts with a very small timestep and then increases the timestep significantly to reach the default accuracy of the time integrator. The advantages of these adaptive methods for "reasonable" ordinary differential equations are:

• One can specify the relative and absolute errors on input, and obtain a solution which is guaranteed to be inside the specified errors.

• The performance is optimal, i.e. for the given method there will be no solution which will be computable with fewer timesteps/less computer time.

• Without knowing anything about the system, and the relation between the timestep and the error resulting from the timestep for the given set of equations, one obtains a correct solution.

Therefore, it is always a good idea to start an investigation of a problem with the above method. But there are some caveats for the case of "unreasonable" differential equations; these are systems which are often encountered in daily life, and they are treated in the next subsection.
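The principle of adaptive stepsize control can be sketched with the two simplest methods of this chapter. The following is not MATLAB's internal algorithm but a minimal illustration under assumptions: Euler (order 1) and Heun (order 2) share the evaluation F1, their difference serves as error estimate, and the tolerance tol, the safety factor 0.9 and the limits 0.2 and 5 for the change of the timestep are illustrative choices.

```matlab
% Sketch: adaptive stepsize control with an embedded Euler/Heun pair
D = 0.2; omega0 = 1;                              % assumed parameters
f = @(y,t) [-omega0^2*y(2) - 2*D*y(1); y(1)];     % y = [velocity; position]
t = 0; t_max = 50; y = [-D; 1];                   % consistent initial condition
dt = 0.1; tol = 1e-4;                             % trial timestep, error tolerance
tout = t; yout = y; dtout = [];
while t < t_max
    dt = min(dt, t_max - t);            % do not step past t_max
    F1 = f(y, t);
    F2 = f(y + dt*F1, t + dt);
    y_low  = y + dt*F1;                 % Euler result, order 1
    y_high = y + 0.5*dt*(F1 + F2);      % Heun result, order 2 (reuses F1)
    err = norm(y_high - y_low, inf);    % error estimate for the low-order result
    if err <= tol                       % accept the step, continue with y_high
        t = t + dt;
        y = y_high;
        tout(end+1) = t; yout(:,end+1) = y; dtout(end+1) = dt;
    end
    % the estimate scales like dt^2: rescale dt, limited to a factor 0.2 ... 5
    dt = dt*min(5, max(0.2, 0.9*sqrt(tol/max(err, eps))));
end
subplot(2,1,1), plot(tout, yout(2,:)), ylabel('x')
subplot(2,1,2), semilogy(tout(2:end), dtout), ylabel('dt'), xlabel('t')
```

A rejected step costs two function evaluations without advancing the time, which is the price for not having to choose dt by hand.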

6.5.1 Problems with Adaptive stepsize control

Adaptive stepsize control needs some assumptions about the smoothness of the treated differential equations. There are some notorious physical situations which lead to non-smooth problems:

• Coulomb friction: If an ordinary differential equation contains terms which contain the sign of a function, like in the case of Coulomb friction,

$$F_{Coul} = -\mu \, \mathrm{sign}(v),$$

it may happen that the solution for the equation is not smooth enough, so that even a reduction of the time step does not lead to the same solution for different orders of the function evaluation of the ODE-solver. In that case, the solver stops, or it continues only with very small time steps so that the solution is not finished within finite time.

• Bouncing balls: If an object flies in free motion in a gravitational field, its trajectories are parabolic. If it hits a target, the motion is suddenly reversed. For the numerical time integration, the free motion allows a very large timestep, whereas in the moment where the target is hit, the timestep has to be drastically reduced. It is possible that numerical solvers with adaptive stepsize control are not able to reduce the step-size appropriately, and in the simulation the impacting particle may not be reflected, but may fly through the target. The risk for such a mishap is higher for higher order solvers, e.g. for 8th order.

Because the adaptive stepsize control needs some information about how the timestep must be reduced, MATLAB allows one to specify the way in which the timestep should be changed via the options argument of the solvers (see help odeset).
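As a sketch of how such options are passed, the following example sets the tolerances and an upper limit for the timestep with odeset and hands the result to ode23. The numerical values are illustrative assumptions, and the right-hand side is written as an anonymous function instead of a separate file.

```matlab
% Sketch: tolerances and maximal timestep for ode23 via odeset
D = 0.2; omega0 = 1; omega_d = sqrt(omega0^2 - D^2);   % assumed parameters
rhs = @(t,y) [-omega0^2*y(2) - 2*D*y(1); y(1)];        % y = [velocity; position]

% RelTol/AbsTol: permitted relative/absolute error per step,
% MaxStep: upper bound for the timestep (values chosen for illustration)
opts = odeset('RelTol',1e-6,'AbsTol',1e-9,'MaxStep',0.05);
[t,y] = ode23(rhs,[0 50],[-D; 1],opts);

x_ex = exp(-D*t).*cos(omega_d*t);                      % exact position
fprintf('steps: %d, largest step: %g, max. abs. error: %g\n', ...
        numel(t)-1, max(diff(t)), max(abs(y(:,2)-x_ex)));
```

Limiting MaxStep is one way to reduce the risk that a large timestep carries a particle through a target, at the cost of more steps during the free motion.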

6.5.2 Coulomb Friction

In contrast to the friction of and in fluids, which is for small velocities v proportional to the velocity, Coulomb friction, the friction of solid on solid surfaces, depends only on the sign of the velocity. Using the Coulomb friction coefficient µ and the normal force f_n, we can write the Coulomb friction as

$$F_{Coul} = -\mu f_n \, \mathrm{sign}(v).$$

Obviously, this force law has a jump at v = 0, and we know from physics that the friction F_Coul can take any value from -µ f_n to µ f_n for v = 0. Actually, there is a method to solve such an "undetermined" problem in a "numerically exact" way (footnote 2), but we will just try to use the adaptive stepsize control in the hope that we get a reasonable solution by decreasing the stepsize. Using the ordinary differential equation

$$\ddot{y} = -y - 2D \, \mathrm{sign}(\dot{y}),$$

let us check the output for different values of D using the programs listed below, and let us look at the timestep:

[Figure: left panel "d/dx^2 x + 2 D *sign(d/dx x) + x = 0", solution for D=0.05 and D=0.1 plotted against time; right panel: timestep dt (logarithmic scale) for D=0.05 and D=0.1.]

It can be seen that for D = 0.05, as long as the oscillation resembles the oscillation of the damped harmonic oscillator, the timestep is comparable to the one expected for the harmonic oscillator. For D = 0.1, where the amplitude 0 is reached, the timestep goes down by several orders of magnitude to guarantee the vanishing of the amplitude, and the integration is slowed down by several orders of magnitude in comparison with the damped harmonic oscillator.

clear
format compact
global D

tmax=7

D=0.05
[t1,y1]=ode23('lin_coul_osc',[0 tmax],[1 0]);
t1_plot=linspace(0,max(t1),2*length(t1));
y1_plot=interp1(t1,y1,t1_plot,'spline');

D=0.1
[t2,y2]=ode23('lin_coul_osc',[0 tmax],[1 0]);
t2_plot=linspace(0,max(t2),2*length(t2));
y2_plot=interp1(t2,y2,t2_plot,'spline');
t2_plot=[t2_plot tmax];
y2_plot=[y2_plot ; [0,1]];
subplot(2,2,1)
plot(t1_plot,y1_plot(:,1),...
     t2_plot,y2_plot(:,1),':')
axis tight
legend('D=0.05','D=0.1')
title('d/dx^2 x + 2 D *sign(d/dx x) + x = 0')

subplot(2,2,2)
semilogy(t1(2:end),diff(t1),t2(2:end),diff(t2),':')
ylabel('dt')
xlabel('timestep')
legend('D=0.05','D=0.1')
axis([0 7 1e-5 1])

return

function dydt = f(t,y)
% lin_coul_osc.m
global D
dydt = [y(2); -y(1)-2*D*sign(y(2))];
return

2 Hairer et al., Solving Ordinary Differential Equations I, Springer


6.6 Stiff differential equations

There is a class of differential equations which are called "stiff". They usually involve two different timescales/periods for the oscillation of a scientific phenomenon, like in the case of the Van der Pol equation

$$y_1' = y_2$$

$$y_2' = \mu (1 - y_1^2) y_2 - y_1$$

which for values of µ of the order of 1 is an absolutely "ordinary" set of differential equations, but for increasing µ, usual integrators need very small timesteps for the integration process. For µ = 500, we have computed the solution using the standard ode23 integrator and the "stiff" variant ode23s, and it can be seen that the stiff integrator uses much longer timesteps to obtain the same result, and is ten times as fast.

MATLAB offers several stiff solvers, among them ode23s, which is of "Runge-Kutta type", and ode15s, which has several options for the choice of the solution method. Currently, there is no clear definition of when an ordinary differential equation is stiff, because it is not always possible to identify "timescales" in the system. The current heuristic definition is: "A stiff differential equation is a differential equation for which a stiff solver works much better than an ordinary solver".

If very high accuracy is necessary for the solution of the system, the timestep of the stiff solver is reduced to the timestep for the "ordinary" solver.

clear
format compact
global mu, mu=500

t_max=1.5
D=0.2
tic
[t1,y1]=ode23('vanderpol',[0 t_max],[2 0]);
t1_plot=linspace(0,max(t1),2*length(t1));
y1_plot=interp1(t1,y1,t1_plot,'spline');
toc

tic
[t2,y2]=ode23s('vanderpol',[0 t_max],[2 0]);
t2_plot=linspace(0,max(t2),2*length(t2));
y2_plot=interp1(t2,y2,t2_plot,'spline');
toc

subplot(2,2,1)
plot(t1,y1(:,1),'*',t1_plot,y1_plot(:,1))
title('vandermode-ODE, ode23')
subplot(2,2,2)
semilogy(diff(t1))
ylabel('timestep')

subplot(2,2,3)
plot(t2,y2(:,1),'*',t2_plot,y2_plot(:,1))
title('vandermode-ODE, ode23s')
subplot(2,2,4)
semilogy(diff(t2))
ylabel('timestep')

function dydt = vanderpol(t,y)
% vanderpol.m
global mu

dydt=[y(2); mu*(1-y(1)^2)*y(2)-y(1)];
return

For the example of the previous section for the Coulomb friction problem, the solution using a stiff solver is not better than for the ordinary integrator. On the contrary, for some parameters it may happen that the solver reduces the timestep to numerically 0 and the solution process terminates with an error message.
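The difference between the two solvers for the Van der Pol equation can also be expressed in numbers instead of plots. The following sketch counts the timesteps for µ = 500; it uses an anonymous function (named vdp here for illustration) instead of the file vanderpol.m and the global variable, and the default tolerances of both solvers.

```matlab
% Sketch: number of timesteps of ode23 and ode23s for the Van der Pol equation
mu = 500;                                           % stiffness parameter from the text
vdp = @(t,y) [y(2); mu*(1 - y(1)^2)*y(2) - y(1)];   % same right-hand side as vanderpol.m

[t1,y1] = ode23 (vdp,[0 1.5],[2 0]);                % ordinary (explicit) solver
[t2,y2] = ode23s(vdp,[0 1.5],[2 0]);                % stiff solver

fprintf('ode23 : %d steps, smallest timestep %g\n', numel(t1)-1, min(diff(t1)));
fprintf('ode23s: %d steps, smallest timestep %g\n', numel(t2)-1, min(diff(t2)));
fprintf('difference of y1 at t = 1.5: %g\n', abs(y1(end,1)-y2(end,1)));
```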

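[Figure: Van der Pol equation for µ = 500. Top row: solution "vandermode-ODE, ode23" for t between 0 and 1.5, and the timestep of ode23 (logarithmic scale) against the step number, which runs up to about 1000. Bottom row: solution "vandermode-ODE, ode23s" and the timestep of ode23s against the step number, which runs up to about 30.]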

6.7 Symplectic differential equations

The examples we have discussed up to now were implementations of Newton's equation of motion where the system underwent continuous energy loss. Actually, there is a huge class of systems which obey Newton's equation of motion for which no energy loss occurs; these are, for example, systems of atoms and molecules. These systems are called symplectic, and can be written via "canonical equations" (using generalized coordinates and generalized momenta).

6.7.1 Störmer-Verlet Method

In the previous examples, we always included some damping in the system and rewrote the ordinary differential equation as a system of coupled first order differential equations. For the most widely used symplectic (energy-conserving) time integrator, it is not necessary to rewrite the differential equation from second to first order; on the contrary, this method is not able to handle velocity-dependent (first order) terms at all. Using the acceleration a (= force/mass), we can write the Verlet method for the coordinate x as

$$x_{i+1} = 2x_i - x_{i-1} + a_i \, dt^2.$$
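The formula follows from adding the Taylor expansions of the coordinate forward and backward in time. The following short derivation is a standard argument added here as a sketch; v_i denotes the velocity and b_i the third time derivative of x at timestep i (symbols introduced only for this derivation).

```latex
\begin{align*}
x_{i+1} &= x_i + v_i\,dt + \tfrac{1}{2} a_i\,dt^2 + \tfrac{1}{6} b_i\,dt^3 + O(dt^4) \\
x_{i-1} &= x_i - v_i\,dt + \tfrac{1}{2} a_i\,dt^2 - \tfrac{1}{6} b_i\,dt^3 + O(dt^4) \\
x_{i+1} + x_{i-1} &= 2x_i + a_i\,dt^2 + O(dt^4) \\
x_{i+1} &= 2x_i - x_{i-1} + a_i\,dt^2 + O(dt^4)
\end{align*}
```

The terms with odd powers of dt cancel, which is why the velocity does not appear in the formula.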


As one can see, this method uses not only the information of the current timestep (x_i and a_i), but also the information from the previous timestep (x_{i-1}), and is therefore a so-called multistep method, a method which uses information from several steps. The previously described Runge-Kutta methods are members of the class of the so-called one-step methods.

A possible implementation of the Verlet method with computation of the number of timesteps is shown below. As can be seen from the formula for the Verlet algorithm, when we start with timestep 0, we also have to compute the coordinate at the time -dt, which is also done in the following example program before the loop, in an approximate and unsatisfying way. Because the multistep methods don't allow by themselves the computation of the previous timesteps, they are called "non-self-starting", in contrast to "one-step methods", which are called "self-starting". In "conventional" implementations of "multistep methods", usually at the beginning a "self-starting method" is used to compute the previous timestep.

Because Verlet-type algorithms are mostly used for molecular simulations, where the details of the initial conditions don't matter, at least not up to an error of the order of dt, for practical applications it is no problem that the Verlet method is not self-starting.

function [tout,yout] = verlet(yinit,tstart,tend,dt,f)
% Stoermer-Verlet Method:
% Symplectic Method of (2nd order)
% Input arguments -
% yinit = initial values [velocity; position]
% tstart, tend = start and end of the time interval
% dt = step size (usually timestep)
% f = right hand side of the ODE; f is the
% name of the function which returns the acceleration
% Calling format f(x).
% Output arguments -
% tout = times, yout(2,:) = positions at these times
nsteps=ceil((tend-tstart)/dt)
dt
if nsteps<0
  tstart
  tend
  dt
  error ('tend-tstart not positive multiple of dt')
end
if (abs(nsteps*dt-(tend-tstart))>1e-6*dt)
  disp('warning: time interval not')
  disp('a multiple of timestep')
  disp('inputed timestep:')
  dt
  dt=(tend-tstart)/nsteps
  disp('use instead')
  dt
end
dt

yout=zeros(length(yinit),nsteps);
tout=zeros(1,nsteps);

y=yinit(:); % column vector, so that y(2,1) is the position
yout(:,1)=y;
tout(1)=tstart;

n=1;
dt2=dt*dt;
% compute timestep before initial timestep,
% this implementation BAD, as it is one order less
% accurate than verlet itself !!!!!!!!!
% (F1 and F2 are identical, so the two force terms cancel
% and the first step leaves the position unchanged)
F1 = feval(f,yout(2,1));
y_mdt=y(2,1)-F1*dt2;
F2 = feval(f,yout(2,1));
yout(2,2)=2*y(2,1)-y_mdt-F2*dt2;
tout(2)=tstart+dt;

for k=2:nsteps-1
  F1 = feval(f,yout(2,k));
  t_full = tout(k) + dt;
  yout(2,k+1)=2*yout(2,k)-yout(2,k-1)+F1*dt2;
  tout(k+1)=t_full;
end

return;
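The listing above flags its own starting procedure as inaccurate. One common alternative, shown here as a sketch and not as the method of the listing, is to obtain the coordinate at the time -dt from a Taylor expansion around the initial condition, x(-dt) = x0 - v0*dt + a0*dt^2/2. The example is self-contained for the undamped oscillator; the names acc and x_mdt and the parameter values are illustrative assumptions.

```matlab
% Sketch: Verlet for the undamped oscillator with a Taylor-expanded start
omega0 = 1; x0 = 1; v0 = 0;               % assumed parameters and initial condition
dt = 0.01; nsteps = 1000;
acc = @(x) -omega0^2*x;                   % acceleration a = force/mass

t = (0:nsteps)*dt;
x = zeros(1,nsteps+1);
x(1) = x0;
x_mdt = x0 - v0*dt + 0.5*acc(x0)*dt^2;    % coordinate at time -dt (Taylor expansion)
x(2) = 2*x(1) - x_mdt + acc(x(1))*dt^2;   % first Verlet step uses x_mdt as x_{i-1}
for k = 2:nsteps
    x(k+1) = 2*x(k) - x(k-1) + acc(x(k))*dt^2;
end

x_ex = x0*cos(omega0*t) + v0/omega0*sin(omega0*t);   % exact solution
err = max(abs(x - x_ex));
fprintf('maximal error for 0 <= t <= %g: %g\n', t(end), err);
```

With this start, the first step reproduces x0 + v0*dt + a0*dt^2/2, so the initial velocity actually enters the computation.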

6.7.2 Precision

We compare the numerical solution for the harmonic oscillator without damping

function out=verlet_lin_osc(in)
% verlet-lin-osc
% linear oscillator with frequency omega
% for use with verlet-type integrator
global omega0
out=-omega0^2*in;
return

using the main program

clear ;
format compact
global D, D=0 ,
global omega0, omega0=1 % Damping and Force constant
omega_d=omega0;

dt=0.01, t0=0, t_max=300 % time-step, start-time, end time
x0=1 %Initial conditions
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);

tic
[t,y]=verlet([v0 x0],t0,t_max,dt,'verlet_lin_osc');
y_ex=(x0*cos(omega_d*t*1))'; % exact solution

[rkt2,rky2]=ode23('lin_osc',[t0 t_max],[v0 x0]);
y_rk2=(x0*exp(-D*rkt2).*cos(omega_d*rkt2))'; % exact solution


[rkt4,rky4]=ode45('lin_osc',[t0 t_max],[v0 x0]);
y_rk4=(x0*exp(-D*rkt4).*cos(omega_d*rkt4))'; % exact solution

subplot(3,1,1)
semilogy(t,abs((y(2,:)-y_ex')'))
title('Error for verlet')
axis tight

subplot(3,1,2)
semilogy(rkt2,abs(rky2(:,2)-y_rk2'))
title('Error for ode23')
axis tight

subplot(3,1,3)
semilogy(rkt4,abs(rky4(:,2)-y_rk4'))
title('Error for ode45')
axis tight
return

[Figure: three panels, "Error for verlet", "Error for ode23" and "Error for ode45", absolute error on a logarithmic scale plotted against time up to t = 300.]

It can be seen that the Verlet algorithm has a larger error for the initial timesteps, due to our choice of the earliest timestep in first order. Nevertheless, the error bound is constant over the whole integration interval. The remarkable property of the Verlet method is that its global error is the same as its local error.
Though ode23 and ode45 from MATLAB start with a much smaller error, their global error is proportional to the integration time, i.e. it grows with time beyond all bounds. Physically, that means that if non-symplectic algorithms like Runge-Kutta are used for energy-conserving systems, the energy will drift significantly over the integration interval. Verlet-type algorithms are stable even for millions of integration steps.
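The energy drift can be made visible directly. The following sketch compares the energy of the undamped oscillator for the Verlet method and for Heun's method with the same fixed timestep. Heun's method is used here only as a simple stand-in for a non-symplectic Runge-Kutta method, the Verlet velocities are estimated with central differences, and the parameter values are illustrative assumptions.

```matlab
% Sketch: energy of the undamped oscillator, Verlet versus Heun
omega0 = 1; dt = 0.1; nsteps = 3000;        % assumed values
acc = @(x) -omega0^2*x;                     % acceleration a = force/mass

% Verlet with x0 = 1, v0 = 0; first step from a Taylor expansion
x = zeros(1,nsteps+1);
x(1) = 1;
x(2) = x(1) + 0.5*acc(x(1))*dt^2;
for k = 2:nsteps
    x(k+1) = 2*x(k) - x(k-1) + acc(x(k))*dt^2;
end
v = (x(3:end) - x(1:end-2))/(2*dt);         % central-difference velocities
E_verlet = 0.5*v.^2 + 0.5*omega0^2*x(2:end-1).^2;

% Heun with the same timestep, y = [velocity; position]
f = @(y) [acc(y(2)); y(1)];
y = [0; 1];
E_heun = zeros(1,nsteps+1);
E_heun(1) = 0.5*y(1)^2 + 0.5*omega0^2*y(2)^2;
for k = 1:nsteps
    F1 = f(y);
    F2 = f(y + dt*F1);
    y = y + 0.5*dt*(F1 + F2);
    E_heun(k+1) = 0.5*y(1)^2 + 0.5*omega0^2*y(2)^2;
end

fprintf('Verlet: largest deviation from the initial energy: %g\n', ...
        max(abs(E_verlet - 0.5)));
fprintf('Heun  : energy change at the end of the run: %g\n', E_heun(end) - 0.5);
```

The Verlet energy only oscillates around its initial value, whereas the Heun energy changes systematically with every step.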

6.7.3 Velocities

The Verlet method only makes use of the coordinates, not of the velocities. Because the velocities don't occur in the equations, they can only be estimated using the relation

$$v_i = (r_{i+1} - r_{i-1}) / (2\,dt),$$

so that the velocities of a timestep are only known after the completion of the following timestep. Therefore, it is not possible to incorporate velocity-dependent interactions in the Verlet scheme.

Often, it is not clear how large a timestep should be chosen for a given dissipative problem. There are some people who advocate the following procedure: Run the problem without dissipation and fix the timestep so that the change in energy during the simulation is negligible, then use this timestep for the dissipative system. Our exploration of the symplectic integrator shows that such a strategy is meaningless. The non-dissipative systems are a totally different class than the dissipative systems; even the "best" non-symplectic integrators cannot compete with quite mediocre symplectic integrators. Conversely, symplectic integrators cannot be used with dissipation: in the above Störmer-Verlet integration there is no possibility to implement velocity-dependent forces, because at the time the forces must be computed, the velocity is not yet known. The same is true for modifications like the "velocity-Verlet scheme", where one knows the velocity "half a timestep" too late; using the velocity from the previous timestep introduces errors which are of the order of the Verlet scheme itself.
