
Analysis and Dynamic Modelling of Complex Systems

INAUGURAL DISSERTATION

for the attainment of the doctoral degree
of the Faculty of Mathematics and Physics
of the Albert-Ludwigs-Universität Freiburg im Breisgau

presented by

DANIEL FALLER

from Freiburg im Breisgau

2003


Dean: Prof. Dr. R. Schneider

Thesis supervisor: HD Dr. J. Timmer

Referee: HD Dr. J. Timmer

Co-referee: Prof. Dr. J. J. v. d. Bij

Date of announcement of the examination result: 21.07.2003


Publications

1. Non-Markovian spectral broadening in interacting continuous-wave atom lasers. H. P. Breuer, D. Faller, B. Kappler and F. Petruccione. Europhys. Lett., 54(1), pp. 14-20 (2001).

2. Non-Markovian dynamics in continuous-wave atom lasers. H. P. Breuer, D. Faller, B. Kappler and F. Petruccione. Macroscopic Quantum Coherence and Quantum Computing, Kluwer Academic/Plenum Publishers, New York, 2001, pp. 367-380.

3. High-Throughput Evaluation of Olefin Copolymer Composition by Means of Attenuated Total Reflection Fourier Transform Infrared Spectroscopy. A. Tuchbreiter, J. Marquardt, J. Zimmermann, P. Walter, R. Mülhaupt, B. Kappler, D. Faller, T. Roths and J. Honerkamp. J. Comp. Chem., 2001, 3, 598-603.

4. A master equation approach to option pricing. D. Faller and F. Petruccione. Physica A, 2003, 319, 519-534.

5. Normalization of DNA-microarray data by non-linear correlation maximization. D. Faller, H. U. Voss, J. Timmer and U. Hobohm. To appear in Journal of Computational Biology.

6. Real-Time Monitoring of Ethene/1-Hexene Copolymerizations: Determination of Catalyst Activity, Copolymer Composition and Copolymerization Parameters. B. Kappler, A. Tuchbreiter, D. Faller, P. Liebetraut, W. Horbelt, J. Timmer, J. Honerkamp and R. Mülhaupt. Submitted.

7. Expression profiling on chronically rejected transplant kidneys. J. Donauer, B. Rumberger, M. Klein, D. Faller, J. Wilpert, T. Sparna, G. Schieren, R. Rohrbach, J. Timmer, P. Pisarski, G. Kirste and G. Walz. To appear in Transplantation.

8. Impact of the steady state assumption on model identification of VLDL/IDL apoB metabolism. T. G. Müller, D. Faller, J. Timmer, M. W. Baumstark and K. Winkler. Submitted to Journal of Lipid Research.

9. Simulation Methods for Optimal Experimental Design in Systems Biology. D. Faller, U. Klingmüller and J. Timmer. Submitted to Simulation.

10. Tests for cycling in a signalling pathway. T. G. Müller, D. Faller, J. Timmer, I. Swameye, O. Sandra and U. Klingmüller. Submitted to Royal Statistical Society C.

11. Oxidative stress response and production of collagen are normal adaptive changes after renal ablation. P. Gerke, B. Rumberger, O. Vonend, J. Wilpert, M. Bek, J. Donauer, D. Faller, T. Sparna, M. Klein, K. Amann, R. Rohrbach, H. Pavenstädt, J. Timmer and G. Walz. Submitted.

Talks and Posters

1. Workshop: Proteomics, Bioinformatics and Genomics, Bernried 2001. "Optimal" normalization of DNA-microarray data.

2. ESGLD 2001. From Genotype to Phenotype: Modelling the Gene Knockout for Different Cathepsins.

3. Workshop: Biometrical Analysis of Molecular Markers, Heidelberg 2001. "Optimal" normalization of DNA-microarray experiments.

4. J. Am. Soc. Nephrol., 2002. Expression profiling segregates chronic transplant nephropathy from other chronic diseases.

5. International Conference on Systems Biology, Stockholm 2002. "Robustness vs. identifiability in biological systems: Pitfalls and Prospects".


Contents

Preface

I  Analysis and dynamical modelling of biological systems

1  Functionality of biological systems
   1.1  Organisation of biological systems
        1.1.1  DNA
        1.1.2  RNA
        1.1.3  Proteins
   1.2  Transcriptomics
   1.3  Proteomics
        1.3.1  Data management facilities

2  Analysis of microarray experiments
   2.1  Selection of interesting genes
   2.2  Standard data normalisation techniques
        2.2.1  Global normalisation
        2.2.2  Non-linear normalisation
   2.3  Testing for differential expression
        2.3.1  Controlling the familywise error rate
        2.3.2  Controlling the false discovery rate
   2.4  Application to DNA-microarray experiments
        2.4.1  Chronic transplant nephropathy
        2.4.2  Mouse nephrectomy

3  Optimal transformations for normalisation of microarray experiments
   3.1  Optimal transformations
        3.1.1  Maximal correlation
        3.1.2  Alternating Conditional Expectation algorithm
   3.2  Normalisation method
        3.2.1  Normalisation by non-linear correlation maximisation
   3.3  Application to experimental data
        3.3.1  Experimental settings
        3.3.2  Analysis of the method
        3.3.3  False-positive rate
        3.3.4  False-negative rate
   3.4  Generalisation of the method
        3.4.1  Variance stabilisation
        3.4.2  Least trimmed squares regression

4  Optimal Experimental Design in Systems Biology
   4.1  Introduction
   4.2  System identification
        4.2.1  Identifiability
        4.2.2  Parameter estimation
        4.2.3  Local identifiability
        4.2.4  Optimal experimental design
   4.3  Application to MAP-Kinase
        4.3.1  The MAP-Kinase signalling pathway
        4.3.2  Measurement method and error model
        4.3.3  Simulation study
   4.4  Results
        4.4.1  Parameter identifiability
        4.4.2  Optimal experimental design
   4.5  Conclusion

5  Model selection in robust systems
   5.1  Robustness versus Identifiability
        5.1.1  Robustness
   5.2  Application to perfect adaption
        5.2.1  Bacterial chemotaxis
        5.2.2  Perfect adaption
        5.2.3  Robustness: Pitfalls and Prospects

6  Analysis of the dynamics of the lipoprotein metabolism
   6.1  The lipoprotein metabolism
        6.1.1  Experimental design
        6.1.2  Experimental methods
        6.1.3  Modelling the lipoprotein metabolism
   6.2  Modelling the large lipoproteins: VLDL and IDL
        6.2.1  Experiments
        6.2.2  Compartment models
        6.2.3  Model selection
        6.2.4  Results
   6.3  Modelling the small lipoproteins: LDL
        6.3.1  Mathematical model
        6.3.2  Observation function
        6.3.3  Results
   6.4  Conclusion

II  Option pricing theory

7  Introduction to option pricing
   7.1  Derivatives
   7.2  Standard pricing theory
        7.2.1  Assumptions
        7.2.2  Black-Scholes equation

8  A master equation approach to option pricing
   8.1  Introduction
   8.2  Black-Scholes equation from a piecewise deterministic process (PDP)
   8.3  General theory
   8.4  Master equation formulation of the Black-Scholes equation
        8.4.1  Stochastic process with uniform discretisation
        8.4.2  Stochastic process with non-uniform discretisation
        8.4.3  Fast stochastic process with non-uniform discretisation
        8.4.4  Analysis of the algorithm
        8.4.5  American options
   8.5  Summary and outlook

9  Summary

A  Bacterial Chemotaxis

References


Preface

[The universe] cannot be read until we have learnt the language and become familiar with the characters in which it is written. It is written in mathematical language, and the letters are triangles, circles and other geometrical figures, without which means it is humanly impossible to comprehend a single word.

Galileo Galilei, Opere Il Saggiatore

The foundation of physics consists of mathematical equations describing the observed environment. The success of theories such as Newton's classical mechanics, quantum mechanics or Einstein's equations of gravitational theory is based on their ability to predict, although sometimes only in a probabilistic sense, the observed behaviour of a multitude of systems.

This theoretical description of a physical system in terms of fundamental equations not only allows for a compact description of observed behaviour. In addition, this structured approach is essential in order to gain insights into the phenomenon under consideration, surpassing a pure description. For example, Planck's radiation formula was not only a perfect description of experimental observations; it also gave rise to a completely new theory that helped to understand microscopic phenomena.

These theoretical concepts have been successfully applied to a variety of complex physical systems, ranging from turbulence to open quantum systems. One common feature of these complex dynamical systems is that they involve a large number of degrees of freedom. Theoretical descriptions of these systems have to be based on mesoscopic or even macroscopic equations, and their mathematical descriptions rest on different types of equations, ranging from ordinary or partial differential equations to stochastic differential equations.

One of the challenges for physics in the 21st century is to widen the horizon and to spread the use of its basic concepts to other sciences. One prospective area is the so-called life sciences: biology and medicine. While Darwin's insights into the evolution of animal species first gave rise to 19th-century biology as a descriptive science, during the 20th century biology matured into an experimental science that is now understood in terms of fundamental physical and chemical principles.



This change in biology was mainly caused by the foundation of molecular biology, which is based also on the work of physicists such as Max Delbrück. The change is captured in the following citation from Max Delbrück's Nobel lecture of 1969:

"... what else could genes be but molecules? However, in the mid-thirties, this was not a trivial statement. Genes at that time were algebraic units of the combinatorial science of genetics, and it was anything but clear that these units were molecules analysable in terms of structural chemistry. They could have turned out to be sub-microscopic steady state systems, or they could have turned out to be something unanalysable in terms of chemistry, as first suggested by [Niels] Bohr [1] ..."

With the recent advent of rapid genome sequencing and other high-throughput measurement techniques, the biological sciences are entering an unprecedented age of discovery. Biology is faced with a wealth of facts that need an explanation. But while physics uses mathematics to represent the laws of nature, biology so far relies mainly on words and cartoon-like diagrams [2]. Of course there are already many physicists working on a wide range of biological systems, such as enzyme kinetics or protein folding, but typically these are rather isolated research topics at the borderline between physics and biology. In order to make sense of the huge amount of data generated by the recent experimental advances, a reduction to the basic laws underlying biological organisms is necessary. Here physics, with its rich experience in modelling complex systems, could play a much more important role than that of an ancillary science [3, 4]. Following a recent Nature editorial entitled "Can physics deliver another biological revolution?", the present challenge in the life sciences can be phrased as

"... tackling important biological questions in biology, using the tools, both physical and mental, of physics."

Nature Editorial, 397 (1999)

In the following, different biological systems are investigated in order to reach a model-based understanding. This mathematical description then helps in achieving insights into the functional coherences of the system. To this end, quantitative dynamic measurements are made; mathematical models are then proposed and validated using this experimental data. Since all experimental measurements are disturbed by observational noise, and often not all components are observable, a typical question encountered is the identifiability of parameters or of a model structure. Successful modelling efforts depend on a balance between a too simplistic and a too detailed model. Hence system identification, the inference of the true model structure from experimental data, has to be performed using statistical tests. In the first part of this thesis, these questions are addressed in the context of the specific biological system under investigation.


In Part II another prospective area for the application of concepts from physics, besides the life sciences, is presented: financial theory. Modern financial markets have invented a large variety of so-called derivatives (options), whose value is derived from a so-called underlying, e.g. stock values. Based on the early findings of Bachelier, the fair value of an option is usually computed from a dynamics of the underlying asset in terms of a geometric Brownian motion [5]. However, in real markets the elegance of the perfect hedge of the Black-Scholes approach to option pricing is generally lost. By introducing a mesoscopic stochastic option price variable, the standard approach to option pricing is generalised to a piecewise deterministic process. This allows for a description of the Black-Scholes equation in terms of a master equation. The main advantages of a master equation formulation are that existing powerful numerical algorithms can be used for option pricing and that this description simplifies the inclusion of jump processes present in realistic markets, since one has to deal with only one type of stochastic process.

This thesis is structured as follows. The first chapter introduces the basic building blocks of biological systems and some of the emerging high-throughput measurement techniques. In Chapters 2 and 3 a detailed investigation of DNA-microarray experiments, which measure the expression of thousands of genes in parallel, and of their statistical analysis is presented. Here a new non-parametric standardisation algorithm for DNA-microarray data is proposed and applied to experimental data. To maximise the information content of dynamic measurements, optimal experimental design criteria can be used. Chapter 4 introduces and evaluates such criteria in the context of a typical signal transduction pathway. In Chapter 5 the restrictions on inference from experimental data due to the robustness of a biological system are investigated. Chapter 6 exemplifies the model building procedure using the lipoprotein metabolism as an example. Starting from a reaction-diffusion system, the model structure is simplified to a set of distinct compartments. The dynamic behaviour of these compartments can then be described by a set of coupled ordinary differential equations. This model can then be used to investigate the effects and working principles of new drugs, and hence enables insights not available without the mathematical description.

In the second part of this thesis, Chapter 7 briefly reviews the standard Black-Scholes theory of option pricing. In Chapter 8, this approach is then generalised to a mesoscopic stochastic option price variable. It is demonstrated that this approach offers both numerical and conceptual advantages compared to the standard option pricing theory.


Part I

Analysis and dynamical modelling of biological systems



Chapter 1

Functionality of biological systems

Measure what is measurable, and make measurable what is not so.

Galileo Galilei

In the following chapters the dynamic behaviour of different biological systems is investigated. This chapter introduces the basic structure of these systems and the notations used. In addition, the available measurement techniques are briefly reviewed.

1.1 Organisation of biological systems

Biological systems are organised at several different scales. Biological information is stored in DNA (deoxyribonucleic acid), transcribed into mRNA (messenger ribonucleic acid), and then translated into proteins [6, 7]; see Figure 1.1 for an overview. Proteins are responsible for the regulation of cell function and are thus the main target for drug development.

1.1.1 DNA

All information about the organisation of a biological system is stored in the DNA. The DNA is composed of double-stranded polymers, called chromosomes, which themselves are built from four bases: adenine, guanine, cytosine and thymine. While simple organisms such as Escherichia coli contain only one chromosome, there are 23 human chromosomes consisting of altogether about 3 × 10^9 bases. The two strands of the DNA consist of complementary sequences of bases (adenine/thymine and cytosine/guanine) which interact by hydrogen bonding. The process during which two complementary strands of DNA bind together and form a stable double helix is known as hybridization. An identical copy of the DNA is present in every cell, and when cells divide this DNA is replicated, such that each successor cell



Figure 1.1: The basic principle of cell function. Information stored in exons is transcribed into mRNA, which is then translated into proteins. Proteins are the main building blocks of cell regulation.

contains one copy. The DNA consists of coding and non-coding regions known as exons and introns. Only about 3% of the human genome is assumed to consist of coding regions, and only about 0.1% of the genome is expected to differ between individuals of the same species.

1.1.2 RNA

The coding regions of the genome of an organism can be divided into different functional units known as genes. The human genome is estimated to consist of 50000-100000 genes. Each gene can be transcribed into so-called mRNA; this gene expression creates an exact copy of a subset of the coding region of the corresponding gene. Because of, e.g., alternative splicing, each gene can be transcribed into different mRNA molecules. The mRNA levels, and thus the gene expression, differ between cells. Hence different cell types which share exactly the same construction plan, the DNA, express different genes (i.e. the mRNA a gene codes for) and can thus fulfil different tasks in the organism. The gene expression of a cell is not only determined by genetic factors but also by environmental influences.



Figure 1.2: Basic principle of DNA-microarray experiments. Fluorescently labelled cDNA is hybridized to corresponding DNA strands attached to a glass surface. By detecting the fluorescence activity of the different spots, the gene expression can be measured.

1.1.3 Proteins

Each protein is synthesised from a unique mRNA sequence; three nucleotides of the mRNA molecule are translated into one amino acid of the polypeptide chain. Afterwards the proteins fold into three-dimensional structures. Proteins are the main building blocks of functional regulation in cells. During the biochemical processes responsible for functional regulation, proteins are further modified, e.g. by covalent binding of phosphate groups or by changes of their three-dimensional structure, which alter the biochemical activity.

1.2 Transcriptomics

The transcriptome of an organism is defined as the entirety of mRNA molecules actually present in a cell. Measurement of the transcriptome thus reveals the gene expression. There are several distinct experimental techniques to measure gene expression, but the common principle is to make use of the hydrogen bonding between two complementary DNA strands. In the following we will concentrate on a two-colour fluorescent labelling technique using glass slides. As shown in Figure 1.2, the mRNA of the two biological probes under consideration is extracted. By the use of reverse transcriptase the RNA is enzymatically transformed into complementary DNA (cDNA), and two different fluorescent nucleotides are inserted. On the other hand, short DNA sequences, typically 300-3000 nucleotides, uniquely coding for different genes, are spotted on a glass slide to obtain a so-called DNA-


microarray. Now the two labelled cDNA samples are competitively hybridized on the microarray. After some time the leftover cDNA is washed off and the fluorescent activity of the two different colours in each spotted DNA sequence is measured. Since the hybridization is assumed to be competitive, the ratio of the fluorescent intensities corresponds to the gene expression ratio between the two probes.

Depending on the technology used to label the cDNA (fluorescent or radioactive marker) and the surface on which the DNA sequences are spotted, one can achieve a very high density, such that a microarray experiment measures the gene expression of thousands of genes in parallel. Figure 1.3 shows a small part of a DNA microarray experiment: a 4 × 3 grid of hybridized cDNA blocks, where each block consists of 484 unique cDNA spots. The total microarray consists of 32 such blocks, and thus one can measure the expression of 15488 genes of the mouse genome in parallel. Since the two hybridized probes P1 and P2 were fluorescently labelled green and red, respectively, a green spot shows that the corresponding gene is more highly expressed in probe P1, while a red spot indicates a higher expression in probe P2. Qualitatively, a similar gene expression in the two probes yields a yellow spot.

1.3 Proteomics

The proteome of an organism is the entirety of proteins present in a cell, which themselves are synthesised from the available RNA sequences. Historically, measurements of the proteins present are much older than measurements at the transcription level. The main technique used is so-called 2D-gel electrophoresis [8, 9]. Proteins are absorbed by gels carrying an immobilised pH gradient. By subjecting the first dimension of the gel to a strong electric field, acidic proteins at the alkaline side of the gel dissociate and become negatively charged. Because of the electric field, these proteins migrate towards the positive side of the gel and reach a point where they are neutralised by the gel, lose their net charge and do not migrate further. In the second dimension, proteins are separated according to their molecular mass. Hence, assuming that all proteins have different charge and molecular mass, they can be separated on a two-dimensional gel. A scanned image of such a gel is shown in Figure 1.4. By cutting out a spot of such a gel, the corresponding protein can be identified by MALDI-TOF mass spectrometry. The analysis of 2D-gel experiments is much more difficult than that of a microarray experiment. The main difficulties are that (i) the spots are not located on a rectangular grid, (ii) the gels are flexible, so the location of the same spot varies between gels, and (iii) each protein has to be analysed separately by mass spectrometry. The construction of a proteome chip in analogy to microarray experiments is complicated, since the three-dimensional structure of the proteins, which is important for binding, has to be preserved when they are printed onto a surface. However, first experiments with protein chips [10, 11] are promising.


Figure 1.3: Detail of the scanned image of a DNA-microarray measuring the expression of 15488 genes of the mouse genome in parallel.


Figure 1.4: A scanned image of a 2D-gel electrophoresis experiment separating proteins according to their mass and charge in the two dimensions.

1.3.1 Data management facilities

The vast quantity of data generated by 2D-gel experiments requires integrated solutions for data management and analysis which not only allow easy access to the measured protein abundances but link these values with information about the biological samples and experimental conditions used to generate the data. Thus, a data management and analysis software aimed at facilitating the interpretation of 2D-gel experiments by providing all the information needed in a standardised and extensible way had to be developed. This solution aims to replace the traditionally used lab book and additionally offers basic data analysis facilities.

The developed data management solution evolved from the need of a local working group to store their 2D-gel experiments. The two main design goals were simple usability for laboratory staff and the ability to adapt the stored information to one's own needs without having to change the structure of the database. Hence, this software can easily be adapted to fit the needs of different laboratories. In order to facilitate portability, a client-server architecture was used: the client computer just needs a browser and internet access. The communication with the server uses an encrypted HTTP connection and is handled by Java servlets. Currently the user interface supports two databases, Oracle 9i and Postgres, the latter freely available under the Postgres license; extensions to other databases should be straightforward.

The information stored in the database is organised in different experiments. Each experiment consists of two main building blocks, biological samples and 2D-gels. Several biological samples with certain properties, such as the genotype or the number of replications made, are created within one experiment. Then a corresponding number


Figure 1.5: Screenshot of the developed data management and analysis software. Shown is the main window, where biological probes and 2D-gels can be added to the database.

of 2D-gels is added. The properties of the 2D-gels are organised in categories such as first and second dimension. These categories and their properties (e.g. pH gradient, voltage, ...) can be added and changed interactively by the laboratory staff without having to change the database layout. In addition to the description of the biological sample and the experimental conditions used to create the 2D-gels, the scanned images and the computed spot intensities



Figure 1.6: Qualitative analysis of a 2D-gel experiment. At three different time points, t_0, t_1 and t_2, the protein levels of two biological samples are measured. (a) The colour coded protein concentration relative to the concentration at time t_0 for both samples, and (b) the correlation between the time evolution of the two samples.

are stored in the database. Once this information, together with the assignment of which 2D-gel corresponds to which biological sample, has been entered, the analysis of the experiments can be performed. To this end, the output of the standard image analysis software used to detect and match the spots on the different gels can be uploaded to the database. At the moment two types of analysis are possible, but further tools can be added easily. Matched spots can be selected according to their spot intensities and their detection frequency in different repeats; for example, a query like "all spots with intensity higher than 10 which were detected at least 3 times within 5 repeats" is possible. In addition, if a set of spot intensities with different experimental conditions is selected, a two-sample t-test can be performed. This is done by a call to the external statistics package R, and hence it is straightforward to implement other statistical tools into the user interface by using the already available facilities of R. A sketch of how such a selection could look is given below.
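To make the selection step concrete, the following minimal sketch shows how such a query and a subsequent two-sample t-test could look. It is an illustration only: the table layout (a hypothetical spot_intensity table with columns spot_id, repeat_no, condition and intensity) is assumed, and the actual software performs the test through R rather than SciPy.

```python
import sqlite3
from scipy import stats

conn = sqlite3.connect("gels.db")

# "All spots with intensity higher than 10 which were detected
# at least 3 times within 5 repeats" -- hypothetical schema.
rows = conn.execute("""
    SELECT spot_id
    FROM spot_intensity
    WHERE intensity > 10
    GROUP BY spot_id
    HAVING COUNT(DISTINCT repeat_no) >= 3
""").fetchall()
selected = [r[0] for r in rows]

def intensities(spot_id, condition):
    """All measured intensities of one spot under one condition."""
    cur = conn.execute(
        "SELECT intensity FROM spot_intensity "
        "WHERE spot_id = ? AND condition = ?", (spot_id, condition))
    return [r[0] for r in cur]

# Two-sample t-test comparing one spot between two conditions.
t, p = stats.ttest_ind(intensities(selected[0], "wildtype"),
                       intensities(selected[0], "knockout"))
```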

Figure 1.6 shows a qualitative analysis of a 2D-gel experiment in which a time course of the protein levels of two biological samples was measured. Figure 1.6(a) shows the colour coded protein levels measured at two different time points relative to the initial measurement; the time courses of the two biological samples are plotted to the right and left of the black bar, respectively. Figure 1.6(b) shows the correlation between the two time courses.


Chapter 2

Analysis of microarray experiments

If your experiment needs statistics, you ought to have done a better experiment.

Ernest Rutherford

The complexity of DNA-microarray experiments complicates the analysis of the resulting data. The biological entities to be investigated have to be defined, and samples of them have to be extracted, amplified and labelled. On the other hand, the genes of the organism under consideration have to be sequenced, and short DNA sequences which uniquely probe for a gene have to be selected and spotted before the hybridization can take place. Afterwards, image analysis software is used to detect the spots and to measure their fluorescent intensity. In addition to the biological variability which is of course present in the gene expression of the two selected entities, each of the experimental steps above possibly introduces systematic errors and observational noise into the experimental data. A reasonable data analysis should be designed to take this into account. In the following, a systematic approach to the analysis of DNA-microarray experiments is demonstrated, from the selection of the genes to appropriate data normalisation techniques and reasonable tests for differential expression; it is then applied to two microarray experiments.

2.1 Selection of interesting genes

Often the amount of biological material from which the labelled cDNA is prepared is limited. Hence, reducing the number of genes measured, and thus the number of spots on a microarray, is an important prerequisite for successful DNA-microarray experiments. Also from the analysis point of view, a reduced number of spots, and thus a



Figure 2.1: Colour coded signal intensity of a radioactively labelled UniGene filter from theResource Center and Primary Database (RZPD). Note the clearly visible spatialgradients at the edges of the filter.

reduced number of hypotheses of differential expression which have to be tested, simplifies controlling the false positive rate, see Section 2.3. In summary, in order to save biological material and to add value to the analysis of microarray experiments, it is crucial to use a reasonable algorithm to select a subset of genes which are interesting in the context of the given question.

The basic idea for selecting interesting genes, whose expression is measured afterwards in more detail, is to use just a few samples to get a rough idea of the gene expression; genes which are obviously not expressed can then be sorted out. Figure 2.1 shows the colour coded gene expression measured with a UniGene filter from the Resource Center and Primary Database (RZPD) in Germany. The experimental technique of these filters is based on radioactively labelled cDNA sequences spotted on nylon filters. The UniGene set consists of three such filters and altogether contains about 75000 different gene sequences from the human genome, spotted in duplicate. The spatial gradients which can be seen at the edges of the filter already illustrate the need for a data normalisation strategy, which will be discussed in the next section.

The duplicate spots of the filter can be used for quality control of the experiment. The relative difference expression of the duplicates for gene i is defined as

d_i = \log_{10} \left[ \frac{| I_{i,1} - I_{i,2} |}{\min(I_{i,1}, I_{i,2})} \right] ,    (2.1)

where I_{i,j} denotes the background-corrected intensity of the j-th measurement of gene i. Figure 2.2(a) shows the distribution of the relative difference d_i. It can clearly be seen that two different distributions are present. This behaviour can be explained by so-called overshining: since radioactivity is used to label the cDNA, a very bright spot can contribute to the signal of a weaker spot located next to it.
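As a minimal sketch (with illustrative arrays), the quality-control quantity of Eq. (2.1) and the intensity filter used for Figure 2.2(b) can be computed as follows:

```python
import numpy as np

def relative_difference(I1, I2):
    """Relative difference d_i of duplicate spots, Eq. (2.1)."""
    return np.log10(np.abs(I1 - I2) / np.minimum(I1, I2))

# Background-corrected duplicate intensities (illustrative values).
I1 = np.array([520.0, 1300.0, 80.0, 950.0])
I2 = np.array([480.0, 1100.0, 310.0, 900.0])

d_all = relative_difference(I1, I2)

# As in Figure 2.2(b): keep only duplicates above 400 intensity units.
mask = (I1 > 400) & (I2 > 400)
d_bright = relative_difference(I1[mask], I2[mask])
```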



Figure 2.2: Background-corrected relative difference gene expression d_i, see Eq. (2.1), of two spots from the same filter coding for the same gene. In (a) all spots are used, while in (b) only spots with a background-corrected intensity higher than 400 were used.

If the two different distributions seen in Figure 2.2(a) really originate from this effect, one would expect the narrower peak on the right to disappear if low-intensity spots are removed from the histogram of d_i. To verify this, Figure 2.2(b) shows the distribution of d_i if only spots with a background-corrected intensity higher than 400, in arbitrary units, are used to compute the density.

Since this first step in a microarray experiment is only intended as a very broad screening for interesting genes, one now needs a ranking of the genes according to their possible importance. After deciding how many genes can be put on the smaller and cheaper microarrays used afterwards, this number of genes is taken according to the ranking. For this ranking, mainly two approaches are possible. Since probably only genes which are expressed at all are interesting, one can rank the genes according to their intensity. Alternatively, if the filter experiments have been performed for two different biological entities, one can apply a test for differential expression and rank the genes correspondingly. Since the design and first production of the smaller microarrays is also very expensive and time consuming, the resulting specialised microarray should be usable in different contexts. Hence we decided to rank genes according to their intensity, since a selection according to a test of differential expression limits the use of the chip to measurements of the biological entities used in the initial screening.

The approach presented above was used to create a specialised DNA-microarray for


Figure 2.3: Cumulative distribution of the log-ratio r_i, see Eq. (2.2), from 15 different microarrays (a) and after globally adjusting to zero median and unit variance (b).

experiments concerning the human liver. In addition to the genes selected in this way, important genes known from database enquiries to be expressed in the liver were included.

2.2 Standard data normalisation techniques

For a reasonable analysis of DNA-microarray experiments it is crucial to persistently quality control the experiments. This not only means ensuring that laboratory staff does its best at keeping a high quality standard, but also concerns the data analysis part. As we will see later on, carefully inspecting the experimental data provides important feedback to the laboratory about where to optimise the experimental protocol. Additionally, by detecting and possibly correcting for systematic errors present in the data, one adds value to the results obtained from analysing the experimental data [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22].

In the following, the data from several microarray experiments is inspected to discover systematic errors present in the experimental data. The data resulting from two-colour microarray experiments, see Section 1.2, consists of four numbers per spotted sequence: the foreground and background values for each of the two colours. The whole analysis done afterwards is based on the log-ratio of the two channels for sequence i, defined by

r_i = \log_2 \left[ \frac{fg_{cy3,i} - bg_{cy3,i}}{fg_{cy5,i} - bg_{cy5,i}} \right] ,    (2.2)



Figure 2.4: Cumulative distribution of the log-ratios for (a) the different needles used for the spotting of the microarray and (b) correspondingly for the log-ratios separated according to the PCR plate used for amplification of the mRNA.

where cy3, cy5 denote the two channels and fg, bg stand for the foreground and background, respectively. If one of the background values is higher than the corresponding foreground value, r_i is treated as not available.
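A minimal sketch of Eq. (2.2), with the four background and foreground arrays assumed to come from the image analysis software; spots where the background exceeds the foreground are set to NaN, i.e. treated as not available:

```python
import numpy as np

def log_ratio(fg_cy3, bg_cy3, fg_cy5, bg_cy5):
    """Per-spot log-ratio r_i of Eq. (2.2); NaN where a background
    value exceeds the corresponding foreground value."""
    s3 = fg_cy3 - bg_cy3
    s5 = fg_cy5 - bg_cy5
    r = np.full(s3.shape, np.nan)
    valid = (s3 > 0) & (s5 > 0)
    r[valid] = np.log2(s3[valid] / s5[valid])
    return r
```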

2.2.1 Global normalisation

Figure 2.3(a) shows the cumulative distribution of the r_i for 15 different microarrays. These microarrays were hybridised from samples of mouse kidney at different time points after a 5/6 nephrectomy. It can be seen that there are large differences in the mean intensity of different microarrays. Of course one would expect the regulation of genes to change dramatically after a 5/6 nephrectomy, but this should only affect a rather small subset of the 15488 different genes present on the microarray and thus cannot cause such a dramatic shift of the mean log-ratio.

The explanation of this effect becomes clear if one looks at the expression of so-called house-keeping genes. These genes are expected not to change their expression under a wide range of experimental conditions, hence their name. However, between the different microarray experiments from Figure 2.3(a) the measured log-ratios for these genes do change significantly. This can be traced back to a global intensity shift in one of the two colours caused by, e.g., unequal amounts of labelled cDNA used for the hybridization. This global intensity shift can be accounted for by adjusting the data from each microarray to zero median and unit variance; see Figure 2.3(b) for the resulting cumulative distributions.
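A minimal sketch of this global normalisation (the array of log-ratios is illustrative; NaN marks unavailable spots):

```python
import numpy as np

def global_normalise(r):
    """Adjust the log-ratios of one microarray to zero median
    and unit variance, ignoring unavailable (NaN) spots."""
    r = np.asarray(r, dtype=float)
    centred = r - np.nanmedian(r)
    return centred / np.nanstd(centred)

r_array = np.array([0.4, -1.2, np.nan, 2.0, 0.1])
r_norm = global_normalise(r_array)
```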


Figure 2.5: (a) Colour coded log-ratio r_i versus intensity I_i together with a smooth least squares fit (solid line), and (b) the results of smooth least squares fits of the gene expression of single microarrays against their common mean.

experimental conditions under investigation. Alternatively the measured values of thehouse-keeping genes can be used to estimate global transformations which are then ap-plied to the measured log–ratio. However, usually the opposite approach of adjustingto zero median and unit variance is followed, and the house–keeping genes are thenused to assess the appropriateness of this approach.There are other systematic artifacts present in microarray data which stem from the

production process. Figure 2.4(a) shows the cumulative distribution of the log–ratiosfrom one array for different needles. One would expect the genes showing differentialexpression to be uniformly distributed over the array and, hence, the spotting nee-dles [16]. This would lead to approximately the same distribution of the log–ratios fordifferent needles. The same argument of course holds for the 384 well PCR (Poly-merase Chain Reaction) plates used for amplification of the mRNA. Correspondingdistributions obtained by splitting the genes by PCR plate are shown in Figure 2.4(b).In order to really quantify these effects and to be able to reasonably correct for them,one would need either a large number of sequences spotted in duplicates or the samenumber of house-keeping genes. If there are such duplicates available which are e.g.spotted by different needles such systematic errors introduced by the spotting processcould be accounted for. Unfortunately such a number of redundant spots is usually notavailable since it is just too expensive.

2.2.2 Non–linear normalisation

The normalisation method presented above globally adjusts the measured log-ratios by a linear transformation to zero median and unit variance. By basing the normalisation algorithm only on the log-ratios, the information about the intensity of the two channels is not used.


Figure 2.5(a) shows the measured log-ratios r_i versus the intensity

I_i = \log_2 \left[ fg_{cy3,i} - bg_{cy3,i} \right] + \log_2 \left[ fg_{cy5,i} - bg_{cy5,i} \right] .    (2.3)

The black line shows a smooth least squares fit, see Ref. [23], to the data. A mild downward trend is present in the data. As proposed in Ref. [16], it is reasonable to assume that the regulation of gene expression, and hence the measured log-ratio, should be independent of the intensity of the corresponding gene expression. Hence, the non-linear least squares fit from Figure 2.5(a) can be used to normalise the data by adjusting it to a mean log-ratio of zero, independently of the measured intensity. This non-linear within-slide normalisation technique can be combined with the following between-slide normalisation [14]. The gene expression measured on every microarray used in one experiment is plotted against the common mean gene expression. Then, as before, a smooth least squares fit is used to correct the gene expression of each microarray. Typical results of these fits are shown in Figure 2.5(b). The assumption underlying this approach, which is the same as for the global normalisation, is that most of the genes do not change their expression, and hence a mean gene expression can be used to normalise DNA-microarray data. A generalisation of this approach is presented in Chapter 3.
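A minimal sketch of such an intensity-dependent within-slide correction, using a LOWESS smoother from statsmodels as a stand-in for the smooth least squares fit of Ref. [23] (the simulated data are illustrative):

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def intensity_normalise(r, I, frac=0.3):
    """Subtract a smooth fit of r against I, so that the mean
    log-ratio is zero at every intensity."""
    fit = lowess(r, I, frac=frac, return_sorted=False)
    return r - fit

rng = np.random.default_rng(0)
I = rng.uniform(4, 14, 500)                    # intensities I_i
r = 0.1 * (10 - I) + rng.normal(0, 0.3, 500)   # mild downward trend
r_corrected = intensity_normalise(r, I)
```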

2.3 Testing for differential expression

A microarray experiment typically involves two different biological entities whose gene expression is to be compared. After a proper data normalisation the basic question is: which genes are significantly differentially expressed? To answer this question there is a huge number of statistical tests which can be applied [24]. Depending on the error model used, one may favour standard parametric or non-parametric tests, see Ref. [15]. However, no matter which test is used, one will encounter the problem of multiple testing [25]: since microarray experiments typically measure the expression of thousands of genes in parallel, this number of statistical tests has to be performed in order to find differentially expressed genes.

2.3.1 Controlling the familywise error rate

In order to decide which genes are differentially expressed, statistical hypothesis tests are performed [24]. Based on a sample of measured gene expressions, the null hypothesis of no differential expression is tested. To this end the actually observed value of a test statistic is compared to the distribution of the test statistic under the null hypothesis. The so-called p-value gives the quantile of the observed value of the test statistic, computed from the distribution of the test statistic under the null hypothesis. Hence, the smaller this p-value, the more likely the null hypothesis is wrong. In a multiple testing situation the p-values resulting from a statistical hypothesis test

Page 30: Analysis and Dynamic Modelling of Complex Systems

22 Analysis of microarray experiments

have to be corrected to account for the number of tests done in parallel. The familywise error rate (FWER) is defined as the probability of at least one falsely rejected hypothesis. There exists a variety of approaches which control this error rate in a multiple testing situation, see Ref. [25] for a recent overview. The simplest approaches are based on the first-order Bonferroni inequality,

\tilde{p}_i = \begin{cases} p_i \times N & \text{Bonferroni correction,} \\ p_i \times (N - i + 1) & \text{Holm correction.} \end{cases}    (2.4)

Here the ordered p-values are either multiplied by the number N of tests done in parallel (Bonferroni correction), or by N − i + 1 (Holm's method), where i denotes the index of the ordered p-value. It can be shown that these procedures control the FWER. However, they reduce the average power of the test: if the number of hypotheses tested in parallel increases, the corrected p-values increase and fewer hypotheses are rejected. Hence, the power of the test is decreased.
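A minimal sketch of the two corrections of Eq. (2.4), applied to an array of raw p-values (the capping at 1 and the monotonicity step for Holm's method are standard details not spelled out in the text):

```python
import numpy as np

def bonferroni(p):
    """Bonferroni-corrected p-values, capped at 1."""
    p = np.asarray(p, dtype=float)
    return np.minimum(p * len(p), 1.0)

def holm(p):
    """Holm's step-down correction applied to the ordered p-values."""
    p = np.asarray(p, dtype=float)
    N = len(p)
    order = np.argsort(p)
    adjusted = np.empty(N)
    running_max = 0.0
    for rank, idx in enumerate(order):             # rank = i - 1
        corrected = p[idx] * (N - rank)            # p_(i) * (N - i + 1)
        running_max = max(running_max, corrected)  # keep monotone
        adjusted[idx] = min(running_max, 1.0)
    return adjusted
```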

2.3.2 Controlling the false discovery rate

In practical applications to microarray data one is usually not so much interested in controlling the FWER, but rather in controlling the false discovery rate (FDR). The false discovery rate is defined as the expected fraction of falsely rejected hypotheses among all rejected hypotheses [26]. In the setting of microarrays, where typically hundreds of hypotheses are rejected, this quantity is a much more realistic estimate of the cost of type I errors. In order to control the FDR, Benjamini & Hochberg [26] proposed the following method. First the p-values are sorted in ascending order and the index

i_0 = \max \left\{ j \in 1, \ldots, N \;\middle|\; p_{(j)} \leq \alpha \frac{j}{N} \right\}    (2.5)

is computed. Rejecting all hypotheses with p-values smaller than or equal to p_{(i_0)} then controls the FDR at the level α. Inspired by this classical method of controlling the FDR, Storey, see Refs. [27, 28], developed an approach with a potentially higher power, which fixes the rejection region in advance and then estimates the FDR from the available data.
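A minimal sketch of the Benjamini-Hochberg procedure of Eq. (2.5):

```python
import numpy as np

def benjamini_hochberg(p, alpha=0.05):
    """Boolean mask of rejected hypotheses, FDR controlled at level alpha."""
    p = np.asarray(p, dtype=float)
    N = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, N + 1) / N
    reject = np.zeros(N, dtype=bool)
    if below.any():
        i0 = np.max(np.nonzero(below)[0])  # largest j with p_(j) <= alpha*j/N
        reject[order[:i0 + 1]] = True
    return reject
```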

2.4 Application to DNA-microarray experiments

In the following, the methods presented above are applied to two DNA-microarray experiments. DNA-microarray experiments are based on a competitive hybridization of the differently labelled cDNA probes, see Section 1.2. Since in the following the gene expression of a large number of samples is compared, a common reference cDNA is used in every experiment for the competitive hybridization. This enables a direct comparison of the measured gene expression of different probes.



Figure 2.6: (a) Gene expression measured for one patient and the corresponding dye swapmeasurement and (b) the normalised gene expression of two different patients.

2.4.1 Chronic transplant nephropathy

Chronic transplant nephropathy remains a poorly defined inflammatory process that limits the survival rate of most renal transplants. To identify candidate genes that characterise chronic transplant nephropathy, the gene profile of thirteen chronically rejected kidney transplants was determined using a 7k human DNA microarray. In addition, the gene profiles of sixteen normal kidneys and twelve end-stage polycystic kidneys (PKD) were measured to distinguish genes present in normal renal tissue or specific for end-stage renal failure [29].

Data analysis

The experimental design includes a colour reversal experiment for every tissue sample to correct for dye-specific effects. Initially, the log-ratio of the measured cy3/cy5 values obtained from the image analysis software is computed. Then, global normalisation of the expression values is performed by adjusting the data to zero median and unit variance, in order to obtain an identical distribution of overall gene expression. Taking the mean of the expression values of the dye swap experiments allows correcting for dye-specific effects. Following an approach proposed by Dudoit et al. [16], the computed expression ratios should not depend on the intensity of the spots. Hence, a smooth non-linear least squares fit is computed to correct for an intensity-dependent bias. Figure 2.6(a) shows the gene expression measured in a typical dye swap experiment. The deviation of the observed values from the black line is due to observational noise and different labelling efficiencies for the two colours. Taking the mean of the two measurements for each patient corrects for multiplicative dye-specific effects.

Figure 2.7: Colour coded expression of 571 genes separating the 16 normal kidneys (N) from the 12 end-stage PKD kidneys (ZN) and the 13 chronically rejected kidney transplants (TN). Note the two subgroups (TN2, TN7, TN14, TN15, TN17 and TN4, TN5, TN6, TN8, TN9, TN10, TN11, TN16).

After this normalisation, the gene expression for two different patients shown in Figure 2.6(b) is obtained. To exclude artifacts near the background range, all genes whose signal was less than 3-fold over background in at least 80% of the specimens within a group were eliminated. Applying these criteria, 2190 genes were subjected to statistical analysis. A two-sample t-test was used for the statistical analysis of differentially expressed genes. We performed a two-dimensional hierarchical cluster analysis utilising genes with significant differences between transplants and normal kidney tissue (p < 0.0003), or transplant and PKD tissue (p < 0.05). A total of 571 genes fulfilled these criteria and were included in the cluster analysis. Hierarchical clustering is based on grouping elements according to their similarity, in this case given by the correlation coefficient. Starting with all genes separated, in each step of the clustering algorithm the two most similar groups of genes are combined; the similarity between two groups of genes is given by the average value of all pairwise correlation coefficients. Two-dimensional clustering was necessary since both the genes and the patients had to be grouped according to their similarity. Hierarchical clustering based on these genes differentiates normal and transplant tissue as well as transplant and PKD tissue. A majority of these genes encode proteins involved in cellular metabolism, transport, signalling, transcriptional activation, adhesion, and the immune response. Seventeen percent of these genes are regulated more than three-fold at the level of messenger RNA expression. Interestingly, comprehensive gene profiling of chronically rejected kidneys reveals two distinct subsets of chronically rejected transplants, see Figure 2.7. Neither clinical data nor histology could explain this genetic heterogeneity, reflecting the multifactorial etiology of chronic rejection. Thus, microarray analysis of rejected kidneys may help to define different entities of transplant nephropathy.

2.4.2 Mouse nephrectomy

Using a mouse chip with 15488 different gene sequences, genes involved in the recovery after kidney operations were identified [30]. To this end, the gene expression in mice was measured before and at 10 different time points after a 5/6 nephrectomy. At each time point the gene expression of two different mice was measured in a dye-swap experiment, using a common reference mRNA for the competitive hybridization.

Data analysis

After normalisation of the experimental microarray data, see Section 2.2, hierarchical average linkage clustering was used to identify genes with a common expression profile over the 11 different points in time. Since the expression of each gene is measured relative to a reference, the correlation of the expression profiles is used as similarity measure. Figure 2.8 shows two distinct clusters that were found.
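A minimal sketch of such correlation-based average linkage clustering with SciPy (the expression matrix and the cut-off are illustrative; SciPy's "correlation" metric is exactly one minus the correlation coefficient):

```python
import numpy as np
from scipy.cluster.hierarchy import average, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
expr = rng.normal(size=(200, 11))        # 200 genes, 11 time points

dist = pdist(expr, metric="correlation") # d(x, y) = 1 - corr(x, y)
linkage = average(dist)                  # average linkage clustering
labels = fcluster(linkage, t=0.5, criterion="distance")
```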

Figure 2.8: Two gene clusters found by hierarchical average linkage clustering using the correlation as metric; the clusters contain 174 (a) and 72 (b) different genes, measured at days 0, 1, 2, 5, 10, 14, 20, 25, 42, 180 and 270. In addition to the cluster profile, the genes of each cluster are separated according to their function: (a) 92 unknown, 26 signal transduction, 25 matrix/structural proteins, 8 protein synthesis/translational control, 7 energy/metabolism, 5 transcription/chromatin, 3 cell cycle, 3 DNA replication, 3 heat shock/stress, 2 apoptosis; (b) 42 unknown, 8 transcription/chromatin, 6 matrix/structural proteins, 5 signal transduction, 4 energy/metabolism, 3 protein synthesis/translational control, 2 heat shock/stress, 1 apoptosis, 1 cell cycle.

The function of most of the 15488 genes is unknown; for the genes with known function, their number is given in the figure. Note that the expression profiles of the two clusters shown are very dissimilar, but by construction of hierarchical clustering algorithms, some of the resulting clusters are very similar. Figure 2.9 visualises the similarity of the genes contained in the seven largest clusters. The numbers denote the cluster in which a gene is included, and the Euclidean distance between two genes corresponds to the similarity of the expression profiles of the two genes: the more similar the gene expression, the smaller the distance between the two numbers.

Figure 2.9: Visualisation of the within- and between-cluster similarities for the seven largest clusters, found using multidimensional scaling. The Euclidean distance between the genes in this figure corresponds to the multidimensional distance in the 11-dimensional space of measured gene expressions.



Figure 2.10: The measured mean gene expression for the two largest clusters together with the model prediction using estimated parameters according to Eq. (2.7).

This two-dimensional visualisation of the similarity measure is calculated by multidimensional scaling, see Ref. [31]. If the gene expressions x and y of two genes at 11 points in time are measured, the distance between these two expression vectors can be defined by d(x, y) = 1 − ρ(x, y), where ρ(x, y) denotes the correlation of the two genes. Two-dimensional multidimensional scaling then corresponds to reducing the 11-dimensional feature vectors x, y to two-dimensional vectors x̂, ŷ such that

    E = [d(x, y) − d̂(x̂, ŷ)]² / d(x, y)² ,   (2.6)

is minimised. Note that d̂ denotes the Euclidean distance. The significance of the clusters found can be assessed by their size. Even if there were no genes with a common profile due to their function, clusters of genes sharing a common profile by chance will occur, since the expression of 15488 genes is measured. The size of such chance clusters can be assessed by randomly permuting the time ordering of the measured gene expression and applying the hierarchical clustering algorithm to these data; the resulting clusters are expected to be much smaller. For this data set, these chance clusters have a mean size of 10 genes with a standard deviation of 0.5, and hence clusters with fewer than 10 genes found in the real data are likely to have been found by chance. Note that the exemplary clusters shown in Figure 2.8 are composed of 174 and 72 genes and are thus much larger than expected by chance alone.

The ultimate goal of microarray experiments is to reconstruct genetic networks from the experimental data [32]. These functional networks are the key to an understanding of processes controlled on the genetic scale. Due to the large number of genes, approximately 15000 in this case, and the short length of the time series, 11 time points in the present case, this reconstruction is challenging.
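The permutation argument can be sketched as follows; the expression matrix, the cluster-cutting threshold and the number of permutations are illustrative assumptions (the thesis clustered 15488 genes, a smaller matrix keeps the sketch fast):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
expr = rng.normal(size=(2000, 11))      # hypothetical genes x time points

def largest_cluster_size(data, cut=0.5):
    # Average-linkage clustering with correlation distance, cut at an
    # illustrative distance threshold.
    Z = linkage(data, method="average", metric="correlation")
    labels = fcluster(Z, t=cut, criterion="distance")
    return np.bincount(labels)[1:].max()

# Null distribution: permute the time ordering of each gene's profile.
null_sizes = [largest_cluster_size(np.apply_along_axis(rng.permutation, 1, expr))
              for _ in range(20)]
print(np.mean(null_sizes), np.std(null_sizes))
# Real clusters much larger than the permuted ones (a mean size of
# about 10 genes in the thesis data) are unlikely to arise by chance.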



Figure 2.11: Graphical representation of a genetic regulatory network for the clusters 1, 3, 5 and 6 shown in Figure 2.9. All regulations with a numerical value smaller than 5 have been discarded.

In order to reduce the number of genes involved in the regulatory network, a mathematical model not for single genes but rather for clusters of genes can be developed [33]. The basic assumption of this approach is that genes from the same cluster share a similar expression profile and thus, presumably, a common function. Since the real structure of the genetic network is unknown, the following phenomenological model for the dynamics of the clustered gene expression is used:

    ġ_i(t) = −a_i g_i(t) + f( b_i + Σ_j w_ij g_j(t) ) .   (2.7)

Here the mean gene expression of cluster i is denoted by g_i(t), and the regulation of each cluster is separated into the effect of the cluster under consideration, −a_i g_i(t), and that of the gene network, f(·). The function f(·) is a sigmoidal function given by

    f(x) = (1 + e^{−x})^{−1} ,   (2.8)

and w_ij describes the regulatory effect of gene cluster j on gene cluster i. In order to apply this model, the four largest mutually dissimilar clusters from Figure 2.9 are used: 1, 3, 5 and 6. Hence the model consists of 24 parameters to be estimated. Altogether there are 11 time points at which measurements take place, and thus 44 data points. But since the time points are distributed over 270 days, only the first 7 measurements, covering the first 20 days, are used in the following. Hence the 24 parameters have to be estimated from 28 available measurements. For reliable parameter estimates many more measurements would be necessary. Although such measurements are not available, the parameters are nevertheless estimated in order to demonstrate the potential of the method, but one has to be careful concerning their interpretation.
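A minimal simulation sketch of the cluster model of Eqs. (2.7) and (2.8) is given below; the parameter values and initial conditions are illustrative, not the estimates obtained in this section:

import numpy as np
from scipy.integrate import odeint

def sigmoid(x):
    # Eq. (2.8)
    return 1.0 / (1.0 + np.exp(-x))

def cluster_model(g, t, a, b, W):
    # Eq. (2.7): dg_i/dt = -a_i g_i + f(b_i + sum_j w_ij g_j)
    return -a * g + sigmoid(b + W @ g)

# Illustrative parameters for the four clusters (not fitted values).
rng = np.random.default_rng(2)
a = np.full(4, 0.5)
b = rng.normal(size=4)
W = rng.normal(scale=2.0, size=(4, 4))
g0 = np.ones(4)

# The first 7 measurement times (days), as used in the text.
t = np.array([0.0, 1.0, 2.0, 5.0, 10.0, 14.0, 20.0])
g = odeint(cluster_model, g0, t, args=(a, b, W))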


Figure 2.10 shows the mean expression profile for the first two clusters together with the model prediction. It can be seen that the model is able to capture the observed dynamics, but also that considerably more observations are necessary. Hence we refrain from presenting parameter estimation errors or applying the model selection strategies discussed in Chapter 6. Nevertheless, to demonstrate the potential of such an approach, Figure 2.11 shows a graphical representation of the estimated genetic network, where all parameter values less than 5 have been discarded. Such a model enables quantitative insights into the regulation of the genetic network; it can be seen, for example, that the genes from cluster 6 repress the expression of the genes present in cluster 5. However, for a detailed analysis of genetic networks two prerequisites are necessary. Firstly, as already mentioned, many more measurements are needed; only then are reliable parameter estimates including estimation errors possible, and advanced model selection strategies as discussed in Chapter 6 can be applied. Secondly, for the interpretation of the resulting graphical model, at least a rough idea about the biological function of a cluster would be very helpful; in our case, however, see Figure 2.8, the function of the majority of the genes in each cluster is unknown. If these two prerequisites are fulfilled, an analysis as presented above can play a vital role in investigating the functional coherences of genetic networks.


Chapter 3

Optimal transformations for normalisation of microarray experiments

Signal data from DNA-microarray experiments contain systematic artifacts from the production process as well as observational noise. It is becoming more and more recognised that a sufficient number of chip replicates has to be made in order to measure gene expression reliably [21, 34, 35]. To reduce the systematic part of the noise, resulting from pipetting errors, from different treatment of the chips during hybridization and from chip-to-chip manufacturing variability, normalisation schemes are employed. Some widely used normalisation techniques have been presented in Section 2.2. In the following, an iterative non-parametric non-linear normalisation scheme called simultaneous Alternating Conditional Expectation (sACE) is presented, which is designed to maximise the correlation between chip repeats.

3.1 Optimal transformations

Since the normalisation method presented in the following is based on optimal transformations, these transformations and their practical computation with the ACE algorithm are briefly introduced in the next section.

3.1.1 Maximal-correlation

Given two random variables X1, X2, the Rényi maximal-correlation [36] is defined as

    Ψ(X₁, X₂) = sup_{f,g} |R(f(X₁), g(X₂))| ,   (3.1)

where R denotes the usual correlation coefficient and the supremum is computed over the set of all Borel measurable functions with finite and positive variance. The main properties of the maximal-correlation can be summarised as follows:


• defined if X₁, X₂ ≠ const

• symmetric, normalised, 0 ≤ Ψ ≤ 1

• Ψ = 0 if and only if X₁, X₂ are independent

• Ψ = 1 if X₁, X₂ are fully dependent

• p(X₁, X₂) = N(µ₁, µ₂, σ₁, σ₂) ⇒ Ψ = R

The functions f*, g* maximising Ψ(X₁, X₂),

    |R(f*(X₁), g*(X₂))| = Ψ(X₁, X₂) ,   (3.2)

are the so-called optimal transformations. Optimal transformations can also be introduced from a regression point of view: it can be shown that the transformations defined to maximise the correlation also minimise the unexplained variance of a regression in the following sense. Consider the general multiple regression model

    Φ₀(Y) = Σ_{j=1}^{d} Φ_j(X_j) + ε ,   (3.3)

where the explanatory variables X_j are assumed to be independent of the error term ε. In this regression setting one now looks for functions minimising the fraction of variance not explained by the regression,

    e²(Φ₀, …, Φ_d) = E[ {Φ₀(Y) − Σ_{j=1}^{d} Φ_j(X_j)}² ] / E[Φ₀²(Y)] .   (3.4)

It can be shown that in the case of two dimensions the functions Φ₀, Φ₁ that minimise the regression also maximise the correlation between X₁ and X₂, and are thus the optimal transformations [37].

3.1.2 Alternating Conditional Expectation algorithm

In order to compute transformations minimising the regression coefficient e² from Eq. (3.4), an iterative algorithm known as Alternating Conditional Expectation (ACE) is used. Without loss of generality assume

    E[Φ₀²(Y)] = 1   and   E[Φ_j(X_j)] = 0 .   (3.5)

Then, if the additive model is correct [38], one obtains

    E[ Φ₀(Y) − Σ_{j≠k} Φ_j(X_j) | X_k ] = Φ_k(X_k) .   (3.6)


Hence, if the functions Φ_j, j ≠ k, are known, the function Φ_k can be estimated by a univariate regression fit based on the measurements Y and X_j. By iteratively exploiting the above equation, so-called backfitting, all functions Φ_k can be estimated. This is the basic idea of the ACE algorithm, which works as follows:

Initialize: Φ₀(Y) = Y/‖Y‖ and Φ_j(X_j) = 0 for j = 1, …, d
while e²(Φ₀, …, Φ_d) decreases do
    while e²(Φ₀, …, Φ_d) decreases do
        Φ_k(X_k) = E[ Φ₀(Y) − Σ_{j≠k} Φ_j(X_j) | X_k ]   for k = 1, …, d
    end while
    Φ₀(Y) = E[ Σ_{j=1}^{d} Φ_j(X_j) | Y ] / ‖ E[ Σ_{j=1}^{d} Φ_j(X_j) | Y ] ‖
end while

Here ‖·‖ denotes the L₂ norm. In practical applications of this method the conditional expectations involved can be calculated by kernel estimation. A more detailed discussion of the algorithm and applications to experimental data can be found in Refs. [37, 38, 39].
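A compact Python sketch of the bivariate case illustrates the backfitting idea; a running-mean smoother stands in for the kernel estimate of the conditional expectations, and the function names and window fraction are our own illustrative choices:

import numpy as np

def smooth(x, z, frac=0.1):
    # Estimate E[z | x] with a running mean over the frac*n nearest
    # neighbours in the ordering of x (a crude kernel estimate).
    n = len(x)
    k = max(2, int(frac * n))
    order = np.argsort(x)
    zs = z[order]
    csum = np.concatenate(([0.0], np.cumsum(zs)))
    out = np.empty(n)
    for rank, idx in enumerate(order):
        lo = max(0, rank - k // 2)
        hi = min(n, rank + k // 2 + 1)
        out[idx] = (csum[hi] - csum[lo]) / (hi - lo)
    return out

def ace(x, y, n_iter=50):
    # Bivariate ACE: alternate the conditional expectations; phi is
    # kept at zero mean and unit variance after every iteration.
    phi = (y - y.mean()) / y.std()
    for _ in range(n_iter):
        theta = smooth(x, phi)          # theta(x) = E[phi(Y) | X]
        phi = smooth(y, theta)          # phi(y)   = E[theta(X) | Y]
        phi = (phi - phi.mean()) / phi.std()
    return theta, phi

# Example: a noisy cubic relation is nearly linearised by ACE.
rng = np.random.default_rng(3)
x = rng.uniform(-2.0, 2.0, 2000)
y = x**3 + 0.3 * rng.normal(size=2000)
theta, phi = ace(x, y)
print(np.corrcoef(theta, phi)[0, 1])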

3.2 Normalisation Method

The simplest normalisation procedures are linear and global (LG), such as the median normalisation presented in Section 2.2. Here one tries to adapt the mean or median of different repeats by multiplying each repeat with a constant factor [20]; see Ref. [22] for a more sophisticated version. In order to design an advanced inter-chip normalisation method for DNA-microarrays, two cases have to be distinguished: first, the normalisation of repeats of the same biological entity, and second, the standardisation of microarrays from different biological entities, where, of course, differences in the gene expression are expected. Since in the first case every deviation between the repeats is either a systematic artifact of the production process or observational noise, caused by biological variability or the laboratory, maximising the correlation is a reasonable design goal. In the second case, if microarrays from different biological entities are to be standardised, maximising the correlation is in principle not a good idea, since it could diminish the very effects in the data one is looking for. However, if one assumes, as most standard normalisation algorithms do, that most of the genes do not change their expression between the two entities, then it is again reasonable to maximise the correlation.

3.2.1 Normalisation by non–linear correlation maximisation

Now the concept of optimal transformations is applied to DNA-microarray data in order to construct a normalisation scheme based on maximisation of the correlation between different microarray experiments [12].


Suppose we have a microarray experiment with d repeats, and denote the expression level of gene g in repetition i by X_ig. A straightforward application of the ACE algorithm presented in the previous section would, e.g., construct transformations Φ_i(X_ig), i = 1, …, d, which minimise

    e² = Σ_g [ Φ₁(X_1g) − Σ_{k=2}^{d} Φ_k(X_kg) ]² / Σ_g Φ₁²(X_1g) .   (3.7)

This common approach has to be generalised for the present setting. Since there is no single predictor variable Y as before, now the pairwise residuals given by

    e² = Σ_{i<j} Σ_g [ Φ_i(X_ig) − Φ_j(X_jg) ]² / Σ_g Φ_i²(X_ig) ,   (3.8)

have to be minimised. To compute these transformations, a modification of ACE is needed, which will be presented now. The main idea is to apply the standard ACE algorithm simultaneously to all pairs of repeats. This leads to d − 1 different transformations for each repeat; thus, after every iterative step the transformations for one repeat are initialised to their mean computed from the previous iteration step. This results in the following algorithm, called simultaneous ACE (sACE) in the following:

Initialize: Φ_i(X_ig) = X_ig/‖X_ig‖
while e²(Φ₁, …, Φ_d) decreases do
    for all pairwise comparisons i < j do
        Φ_j(X_jg) = E[ Φ_i(X_ig) | X_jg ]
        Φ_i(X_ig) = E[ Φ_j(X_jg) | X_ig ] / ‖ E[ Φ_j(X_jg) | X_ig ] ‖
    end for
    Compute the mean of all Φ_i belonging to the same repeat i
end while

Here E[·|·] denotes the conditional expectation value. The conditional expectation values, e.g. E[Φ_i(X_ig)|X_jg], are estimated by smoothing the scatter plot of Φ_i(X_ig) versus X_jg using a triangular window over n neighbouring genes,

    Φ_j(X_jg) = E[ Φ_i(X_ig) | X_jg ] = Σ_{l∈I} w_lg Φ_i(X_il) / Σ_{l∈I} w_lg ,   (3.9)

with

    I = {g} ∪ {indices of the 2n nearest neighbours of X_jg} ,   (3.10)

and

    w_lg = 1 − |l − g|/n .   (3.11)


The parameter n plays the role of a regularisation parameter and controls the smoothness of the resulting transformation. In each iterative step one obtains several different transformations for one replication, Φ_i|same repeat; after each iterative step their average value is used to initialise Φ_i for the next iterative step. Before sACE is applied, the experimental data is rank ordered [40]. The calculation of the conditional expectation in the above scheme is thus independent of the original distribution or of a monotonic transformation of the raw data, e.g. a logarithmic transformation. The "optimally" transformed data is finally transformed again with a joint transformation so as to have a distribution similar to that of the original data. This joint transformation is constructed by mapping the averaged optimal data to the averaged mean raw data; it is then applied to every "optimally" normalised dataset. Raw and transformed data are thereby directly comparable, which simplifies the biological interpretation but may not be necessary if one is interested in statistical tests of significance only. It is important to note that any systematic part of the experimental noise generates statistical dependency, and hence DNA-microarray experiments may produce data which are not statistically independent. It can be shown that in two dimensions any statistical dependence of different repeats is detected and corrected for by the ACE algorithm [36]. The proposed normalisation algorithm is designed to find smooth functions of the measurement intensity which correct for this type of error.

3.3 Application to experimental data

In the following, the algorithm presented above is applied to data from microarray experiments performed at Roche, Basel. These experiments involved so-called Affymetrix chips, which use a slightly different measurement technique than the two-colour DNA-microarrays discussed in the previous sections. From a normalisation point of view, however, a detailed understanding of this technology is not needed, and hence we refrain from going into too much technical detail.

3.3.1 Experimental settings

Microarray data from 158 Affymetrix chips have been used to assess the noise-reduction capability of the non-parametric non-linear normalisation procedure (sACE). The chips were from 12 different experimental conditions. For four conditions (C1, C2, C3, C4) the entire Hu42K chipset with about 42000 genes, most of them from so-called EST sequences, was used. The Hu42K chipset consists of five chip subtypes (Hu6800, Hu35KsubA, Hu35KsubB, Hu35KsubC, Hu35KsubD); the five chip types with four conditions result in 20 repeat groups. Biological samples were from cell cultures of human macrophages, where each chip represented one cell culture flask. In two conditions (EM, EMI) only the Hu6800 chip was used. In six other experimental conditions (C, NF, HF, FED, VV7, VV8) the rat RgU34A chip was used; here the biological samples were tissues from individual rats.


Exp.  # repeats  chip type   # genes  # genes used  rSD (LG)  rSD (sACE)
C1    8          Hu35KsubA   8907     7347          0.68      0.50
C2    4          Hu35KsubA   8907     7199          0.70      0.45
C3    8          Hu35KsubA   8907     7116          0.76      0.52
C4    4          Hu35KsubA   8907     7075          0.66      0.47
C1    8          Hu35KsubB   8924     5919          0.67      0.43
C2    4          Hu35KsubB   8924     6285          0.67      0.51
C3    7          Hu35KsubB   8924     5882          0.67      0.47
C4    4          Hu35KsubB   8924     6035          0.66      0.40
C1    8          Hu35KsubC   8928     6602          0.78      0.55
C2    4          Hu35KsubC   8928     6282          0.73      0.52
C3    8          Hu35KsubC   8928     5753          0.67      0.37
C4    4          Hu35KsubC   8928     5752          0.64      0.43
C1    8          Hu35KsubD   8928     6318          0.62      0.37
C2    4          Hu35KsubD   8928     6810          0.80      0.57
C3    8          Hu35KsubD   8928     6339          0.63      0.37
C4    4          Hu35KsubD   8928     6947          0.78      0.55
C1    8          Hu6800(E)   7129     5240          0.54      0.28
C2    4          Hu6800(E)   7129     5095          0.47      0.33
C3    9          Hu6800(E)   7129     5120          0.49      0.29
C4    4          Hu6800(E)   7129     5043          0.44      0.27
EM    6          Hu6800(E)   7129     5115          0.42      0.24
EMI   6          Hu6800(E)   7129     5122          0.36      0.23
VV7   4          RgU34A      8798     6361          0.37      0.32
VV8   4          RgU34A      8798     6428          0.41      0.32
C     5          RgU34A      8798     6336          0.35      0.27
FED   4          RgU34A      8798     6296          0.36      0.33
HF    5          RgU34A      8798     6311          0.36      0.30
NF    4          RgU34A      8798     6380          0.35      0.26

Table 3.1: Noise reduction of sACE normalisation compared to linear global (LG) normalisation, expressed as the average relative standard deviation (rSD). Only genes with a mean raw signal between 20 and 5000 (mean calculated over repeats) were used to calculate the rSD, representing the bulk of genes with reliable signals.

Altogether 28 repeat groups (22 human, 6 rat) were used to assess normalisation, with 4-9 repeated chips per group, see Table 3.1. Chip hybridization, washing and staining with a streptavidin-phycoerythrin conjugate were performed using Affymetrix instrumentation according to the company's recommended protocols. Per-gene signals were calculated from the 40 sub-signals of the individual cDNA probes using the standard algorithm of the Affymetrix GeneChip software, called ADI ("average difference intensity"). ADI may generate negative signals, because the so-called mismatch oligo probes, which are aimed at representing the cross-hybridization portion of the signal, sometimes show a higher signal than the so-called positive match oligo probes. Since negative signals of mRNA expression do not make sense biologically, all signals below 10 were adjusted to 10 prior to normalisation.


Figure 3.1: Signal distribution of raw signals for two exemplary experiments.


3.3.2 Analysis of the method

The human chips used represent the first generation of Affymetrix chips, where the 40 oligo probes belonging to one gene are in close spatial proximity, while the rat chips used are from the second generation of Affymetrix chips, where the 40 oligo probes belonging to one gene are distributed over the chip. Distributing the oligos makes the per-gene signals less vulnerable to local defects and intensity gradients; gradients are not accounted for by standard linear normalisation schemes. Hybridization on the human chips was done about one year before the rat chips, when laboratory protocols were still under improvement. Both the improvements in chip design and in wet-lab procedures led to a significantly lower noise for the rat chips, as expressed by the lower average rSD of the non-normalised signals.

Figure 3.2: Signal distribution after normalisation with LG (a) and with sACE (b), exemplified by the two chip repeats C4R1E and C4R2E.


Figure 3.3: Dependency of signal variability on signal intensity. Gene signals are binned into windows of size 20 (with the mean raw signal used as basis for the window binning) and the rSD over gene repeats is averaged within each window. Plotted values are smoothed over 9 consecutive windows. rSD of signals calculated without normalisation (blue), with linear global normalisation (green) and with sACE normalisation (black). The cumulated number of genes is shown as a red line.

Since the biological variability within rat individuals is higher than that of cell cultures (data not shown), the technological improvement with the new chips is even larger than reflected by the rSD differences. Figure 3.1 shows the raw data for two experiments; this clearly demonstrates the need for a non-linear normalisation method.

By construction, both normalisation methods (linear global (LG) and sACE) generated signals in the same range (between 10 and 40000) and with a similar distribution, see Figure 3.2. While the average rSD calculated over all genes represents a rough estimate of normalisation efficiency, a more detailed look at the rSD as a function of signal intensity (Figure 3.3) shows that sACE is particularly efficient for low signals, without falling behind LG for higher signals. For instance, for the experiment C3-E (Hu6800 chip) sACE generates a considerably lower rSD than LG for genes in the mean raw signal range 100-800, which represent about 30% of all genes on the chip, see Figure 3.3(a). A similar behaviour for low-signal genes can be observed in the other experiments. Over the 22 experiments with first-generation chips (chip type Hu*) and comparatively low biological variability (samples from cell culture), sACE reduced the average rSD by 24%-48%; over the six experiments with second-generation chips (chip type RgU34A) and presumably higher biological variability (samples from different individuals), sACE reduced the average rSD by 8%-26%, see Table 3.1. The average rSD values obtained with the two normalisation methods are 0.57 ± 0.15 for LG normalisation and 0.39 ± 0.11 for sACE normalisation.


Exp.  # repeats  # genes used  # false pos. LG  (passing t-test)  # false pos. sACE  (passing t-test)
C1    8          4166          70               13                45                 5
C3    8          2646          104              6                 115                4
C1    8          2043          59               18                24                 9
C3    7          1856          31               1                 24                 0
C1    8          2016          145              6                 169                4
C3    8          1372          8                1                 5                  0
C1    8          2855          53               11                21                 7
C3    8          2840          38               8                 23                 8
C1    8          3531          157              28                26                 1
C3    9          3477          103              23                31                 10
EM    6          4047          65               7                 25                 3
EMI   6          4162          55               4                 19                 3

Table 3.2: Reduction of the false-positive rate by normalisation. False positives here are genes with a 2-fold expression difference after splitting a set of chip repeats into two different artificial conditions. Only genes with a mean raw signal between 100 and 5000 were considered for the calculation of the false-positive rate. The t-test was applied with an error probability of less than 0.05 to the genes which showed at least a 2-fold expression difference.

3.3.3 False-positive rate

To determine the false-positive rate, we split experiments with six or more repeats into two groups (a minimum of three repeats per condition is needed to apply the t-test), each group artificially representing a different biological condition. Genes were called false positive if they showed an expression difference of more than two-fold (up or down) between the conditions. Chip repeats should not show any differential expression; because of experimental artifacts, however, it is realistic that some genes are falsely reported as differentially expressed. The number of genes with such chance differential expression, see Table 3.2, was similar in magnitude to the numbers found by Golub et al. [41] (173/6817 and 136/6817, respectively) using a different method. Only genes with a mean raw signal between 100 and 5000 were considered here. In 10 out of 12 such comparisons, sACE reduced the number of false positives compared to LG normalisation. If false positives were additionally filtered by t-test (p < 0.05), sACE produced the same number of false positives in one of twelve comparisons and reduced the number of false positives in eleven out of twelve comparisons (Table 3.2), with an overall reduction in the number of false positives of 57%.

3.3.4 False-negative rate

The aim of normalisation is a reduction of the variability over repeats. However, this reduction must not go too far and eliminate the effects one is looking for. This would result in false negatives, i.e. genes which are up- or down-regulated but not recognised as such.


                       Time 6                                 Time 24
Gene        LG            sACE plus LG   TaqMan     LG            sACE plus LG   TaqMan
            FC   t-test   FC   t-test    FC         FC   t-test   FC   t-test    FC
31998_at    3.5  0.0001   3.3  0.00001   7.0        3.0  0.0006   2.8  0.00724   4.9
32855_at    1.1  0.81647  1.1  0.72167   1.6        2.0  0.0048   2.0  0.00277   2.1
34776_at    1.5  0.17305  1.4  0.21118   1.8        1.5  0.18498  1.4  0.12129   1.9
36873_at    1.5  0.06182  1.3  0.11863   1.6        4.0  0.00177  3.5  0.00001   6.7
39498_at    1.9  0.35989  1.6  0.19476   1.3        3.1  0.05013  2.0  0.01185   0.9
39950_at    3.0  0.00099  2.8  0.00003   5.2        3.8  0.00045  3.9  0.0002    5.2
41362_at    7.4  0.00178  6.1  0.00014   67.1       9.0  0        8.0  0.00039   31.7
41764_at    1.5  0.00238  1.4  0.00092   1.7        2.9  0.00002  2.7  0.00018   4.4
608_at      1.2  0.06042  1.2  0.05873   1.1        1.7  0.03654  1.5  0.02067   2.2
649_s_at    1.2  0.04974  1.2  0.02752   3.0        2.1  0.00672  2.2  0.01645   3.2

Table 3.3: Comparison between TaqMan, linear global normalisation alone (LG) and sACE plus LG normalisation with respect to false negatives. The normalisation method sACE plus LG corresponds to applying a linear global normalisation after the sACE normalisation. For 10 genes in two experimental control vs. treatment settings (Time 6 and Time 24) the up- or down-regulation fold change (FC) was measured both by a DNA-microarray (Affymetrix Hu95A chip) and by two TaqMan experiments. Using a 5% test level, no false negative (i.e. a gene which is reported as differentially expressed by LG alone and TaqMan, but not by sACE plus LG) was found in these 20 cases.

The false-negative rate can only be estimated by comparison with an independent method of measuring mRNA expression levels, like Northern blot, TaqMan or qPCR. To do this for thousands or even just hundreds of genes is extremely laborious and outside the scope of this work. From principal considerations we expect the noise-reduction capability of sACE to be effective on both sides, i.e. that sACE reduces the number of false negatives by a similar order of magnitude as the number of false positives, since noise reduction implies significance improvement, and significance improvement propagates directly into treatment vs. control settings. We have tested this assumption on 10 genes in two different control vs. treatment settings and find no increase in false negatives with sACE, while significance is improved (the t-test p-values are overall smaller for sACE plus LG normalised data than for LG normalised data; see Table 3.3).

3.4 Generalisation of the method

In the following, two generalisations of the normalisation approach presented above are discussed. The use of variance stabilisation techniques avoids problems which can occur, e.g., if the distribution of the raw data is organised in clusters. Additionally, it is shown that the use of least trimmed squares regression enlarges the applicability of the normalisation method to situations where one cannot assume that most of the genes do not change their expression.



Figure 3.4: Sketch of a possible optimal transformation if the data is organised in clusters (a), and (b) the mean-dependent variance of raw microarray data (black) and after variance stabilisation (red).


3.4.1 Variance stabilisation

The standard ACE algorithm, which was the basis for the presented sACE normalisation algorithm, has difficulties in estimating the optimal transformation if the data is organised in distinct clusters. As shown in Figure 3.4(a), the optimal transformation becomes a step function for this type of data: since the ACE algorithm looks for transformations with zero mean and unit variance, a step function like the one shown in Figure 3.4(a) may indeed be a valid optimal transformation. In order to overcome this difficulty one can resort to the concept of variance stabilising transformations [38]. If an arbitrary random variable X with a mean-dependent variance Var(µ) is transformed by

    h(t) = ∫₀ᵗ du / √(Var(u)) ,   (3.12)

then the resulting random variable has approximately constant variance. Figure 3.4(b) shows the variance of data resulting from a microarray experiment, which clearly rises for larger intensities; after applying a variance stabilising transformation according to Eq. (3.12), the variance is approximately constant. This concept can be introduced into the sACE algorithm by applying such a variance stabilising transformation after every iterative step. Figure 3.5(a) shows the resulting variance against the current mean for a typical microarray experiment. There is only a very small decrease of the variance after applying the generalised sACE algorithm.
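A numerical sketch of the transformation Eq. (3.12), with the mean-dependent variance supplied on a grid and the integral evaluated by the trapezoidal rule; the data and the variance model are invented for illustration:

import numpy as np

def variance_stabilise(x, mu, var):
    # Transform x by h(t) = int_0^t Var(u)^(-1/2) du, where Var(u) is
    # interpolated from an empirical estimate on the grid 'mu'.
    grid = np.linspace(0.0, x.max(), 2000)
    v = np.interp(grid, mu, var)
    integrand = 1.0 / np.sqrt(np.maximum(v, 1e-12))
    steps = np.diff(grid) * 0.5 * (integrand[1:] + integrand[:-1])
    h = np.concatenate(([0.0], np.cumsum(steps)))
    return np.interp(x, grid, h)       # h(x) by table lookup

# Example with the variance growing linearly with the mean; the
# transformed data has approximately constant variance.
rng = np.random.default_rng(4)
mu = np.linspace(1.0, 100.0, 50)
x = np.concatenate([rng.normal(m, np.sqrt(m), 200) for m in mu])
transformed = variance_stabilise(x, mu, var=mu)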



Figure 3.5: Microarray data after variance stabilisation (black) and after normalisation with the enhanced sACE algorithm including variance stabilisation (red) for a real experiment (a), and (b) for the same microarray data with an artificial systematic error.

This suggests, of course, that there are practically no systematic intensity-dependent errors present in this data set. However, if such errors are introduced artificially, as done in Figure 3.5(b), sACE detects and corrects for them.

3.4.2 Least trimmed squares regression

If the sACE algorithm as presented in Section 3.2 is applied to data from different biological entities, one relies on the assumption that most of the genes do not change their expression. These genes can then be used to compute the optimal transformations, while the small number of effects present does not disturb the estimation of the transformations. This assumption can be relaxed if least trimmed squares are used for the estimation of the conditional expectation values. Least trimmed squares regression [42] minimises

    r² = Σ_{i=1}^{N/2} r²_{i:N} ,   (3.13)

where r_{i:N} denotes the residuals ordered in ascending order. Hence, in principle it suffices to assume that at least 50% of the genes do not change their expression. Least trimmed squares regression then ignores the remaining half of the data, which in particular contains the effects one is looking for, when estimating the optimal transformations. One practical drawback of this approach is that the computation of the least trimmed squares estimate is quite expensive [43].
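A minimal sketch of the least trimmed squares idea for a simple linear fit; the data, the shifted subpopulation playing the role of differentially expressed genes, and the use of a Nelder-Mead search are illustrative choices:

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
x = rng.uniform(0.0, 10.0, 400)
y = 2.0 * x + rng.normal(0.0, 0.5, 400)
y[:100] += 8.0                       # "effects": a shifted subpopulation

def lts_loss(beta):
    # Eq. (3.13): only the N/2 smallest squared residuals enter the loss.
    r2 = (y - beta[0] - beta[1] * x) ** 2
    return np.sort(r2)[: len(r2) // 2].sum()

fit = minimize(lts_loss, x0=[0.0, 1.0], method="Nelder-Mead")
print(fit.x)   # close to (0, 2): the outlying subpopulation is trimmed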


Chapter 4

Optimal Experimental Design in Systems Biology

Cell function is based on complex signalling pathways that control information processing. Complexity arises from the large number of different molecules involved and the variety of interactions between them. In order to understand these signalling pathways at a system level, these dynamic interactions have to be examined simultaneously [44, 45]. To obtain such a system-level understanding of a biological system, a quantitative data-based approach is essential. To this end quantitative dynamic experiments are made, from which the system structure and the parameters have to be deduced [44, 45]. Since biological systems have to cope with different environmental conditions, certain properties are often robust with respect to variations in some of the parameters. Hence it is important to use optimal experimental design considerations in advance of the experiments to improve the information content of the measurements. Using the MAP-Kinase pathway as an example, we present a simulation study investigating the application of different optimality criteria. It is demonstrated that experimental design significantly improves the parameter estimation accuracy and additionally reveals difficulties in parameter estimation due to robustness [46].

4.1 Introduction

In order to successfully identify the model structure and reliably estimate the system parameters of a biological system, these have to be identifiable given the current experimental design. If this is not the case, the experimental design has to be improved. Moreover, even if identifiability is given, an enhanced experimental design may drastically improve the estimation accuracy. Since quantitative time-resolved measurements are time- and cost-intensive, these improvements also reduce the experimental costs needed to achieve a pre-specified accuracy.


In the following we discuss how optimality of an experimental design can be defined and calculated. Then a simulation study using the MAP-Kinase signalling pathway, which is known to be a robust pathway involved in a variety of different regulatory systems, is presented. The main aim of this simulation study is to investigate how computer simulations in advance of the experiments can be used to improve the experimental design. Experimental design can be used to optimise the selection of the time points at which the measurements are recorded and to calculate optimal stimulations of the system. Here we are mainly concerned with the optimisation of the input to the MAP-Kinase pathway; all measurements are recorded on a pre-specified uniform grid of time points. Using this pathway as an example, the advantages and disadvantages of different optimality criteria are discussed. This chapter is structured as follows. Section 4.2 introduces the basic system identification theory and briefly reviews some concepts of optimal experimental design. After a short introduction of the MAP-Kinase pathway in Section 4.3, the results of the simulation study are presented in Section 4.4.

4.2 System identification

Dynamical biological systems can be described by a variety of different mathematical models, ranging from partial or ordinary differential equations to stochastic differential equations. In the following we restrict ourselves to models defined in terms of ordinary differential equations. In this case the time evolution of the system state x(t) ∈ ℝ^S is given by

    ẋ(t) = f(x(t), θ, u(t)) ,   (4.1)

where θ ∈ ℝ^P denotes the parameters of the system and u(t) the input to the system. Note that θ may include not only dynamic parameters such as reaction rates, but also unknown initial conditions of the system state vector x(0). For the sake of simplicity the input function u(t) is assumed to be scalar, but the methods presented in the following do not depend on this assumption. Often not all components of the system can be measured directly. The observation function g describes which properties y^M(t, θ) ∈ ℝ^L of the system can be measured,

    y^M(t_i, θ, u) = g(x(t_i, θ, u)) ,   i = 1, …, N .   (4.2)

The observation function g and the input function u(t), together with the specification of the time points t_i at which the measurements are recorded, completely specify the experimental design used. The observations y^D(t_i) ∈ ℝ^L are given by

    y^D(t_i) = y^M(t_i, θ₀, u) + ε_i ,   i = 1, …, N .   (4.3)

The true parameter vector is denoted by θ₀, and ε_i ∈ ℝ^L describes the observation error at time t_i. Most often the observation error is assumed to be distributed according to

    ε_ij ∼ N(0, σ²_ij) ,   i = 1, …, N ,  j = 1, …, L .   (4.4)


The variances σ_ij can be estimated from repetitions of the experiments. Knowledge of the system structure, Eq. (4.1), and of the experimental design, Eq. (4.2), allows for the evaluation of the identifiability of the system structure and the parameters. This is a prerequisite for the inference of system properties, which is the final goal of systems biology.

4.2.1 Identifiability

One distinguishes between structural, local and practical identifiability. Structural identifiability of parameters is a theoretical property of the model structure, depending only on the observation function g and the input function u(t). It does not depend on the observational noise or the number of data points measured, but rather is an asymptotic property in the limit of an infinite number of observations. The parameters θ of a model are structurally identifiable if

    ∀ θ₁, θ₂ ∈ ℝ^P, θ₁ ≠ θ₂ ⇒ ∃ t with g(x(t, θ₁, u)) ≠ g(x(t, θ₂, u)) .   (4.5)

This definition of structural identifiability is rather strict. One can easily imagine realistic situations where the parameters are not identifiable according to this definition but would nevertheless be identifiable on a reasonably restricted set of all possible parameters. Hence, for all practical purposes, in order to decide whether parameters can be identified with the current experimental protocol, it is crucial to restrict the set of possible parameters. This leads to the definition of local identifiability: the parameters θ of a model are locally identifiable in an ε-neighbourhood of a parameter θ₀ if

    ∀ θ₁, θ₂ ∈ {θ ∈ ℝ^P | ‖θ − θ₀‖ < ε}, θ₁ ≠ θ₂ ⇒ ∃ t with g(x(t, θ₁, u)) ≠ g(x(t, θ₂, u)) .   (4.6)

In contrast to the theoretical properties of structural and local identifiability, the practical identifiability of parameters is limited by the finite amount of data and the observational noise. Hence, if there are large observation errors or few data, so that no reliable estimate of the parameters is possible, these parameters are called practically non-identifiable. There are different analytical approaches to prove structural identifiability of parameters, see Refs. [47, 48, 49], or Ref. [50] for an overview. However, for large non-linear systems these analytical approaches are quite complicated. For the purposes of the following, local identifiability near the true parameter value is sufficient; to prove local identifiability we used an approach based on the parameter estimation accuracy [51], which will be discussed in Section 4.2.3. Identifiability of parameters has to be distinguished from model selection, where one tries to discriminate between different possible models. If the set of all models under investigation is given by M = {f | possibly true model}, the true model is structurally identifiable if



    ∀ M₁(θ₁), M₂(θ₂) ∈ M, M₁ ≠ M₂ ⇒ ∃ t with g(x(t, θ₁, u)) ≠ g(x(t, θ₂, u)) .   (4.7)

4.2.2 Parameter estimation

In all practical applications the parameters of the system have to be estimated, since they cannot be measured directly. In addition, the measured values, see Eq. (4.3), contain observational errors. Due to its superior statistical properties, the most popular approach to parameter estimation is the maximum likelihood method [52, 53, 54]. In the case of Gaussian observational noise, maximum likelihood estimation corresponds to a minimisation of the weighted residual sum of squares

    χ²(θ) = Σ_{j=1}^{M} Σ_{i=1}^{N} ( [y^D_j(t_i) − y^M_j(t_i, θ, u)] / σ_ij )² .   (4.8)

The asymptotic distribution of the least squares estimate θ̂ can be computed analytically. To this end one assumes that in the limiting case of an infinite number of observations the deviation ∆θ between the true and the estimated parameters is small. Hence the observation function can be expanded in a Taylor series, yielding

    y^M_j(t_i, θ, u) = y^M_j(t_i, θ₀, u) + ∇_θ y_j |_{t_i, θ₀} (θ − θ₀)   (4.9)
                     = y^M_j(t_i, θ₀, u) + { ∇_θ g_j + ∇_x g_j ( [∇_θ x₁]ᵀ, …, [∇_θ x_S]ᵀ )ᵀ } |_{t_i, θ₀} (θ − θ₀) .

After inserting this result into the minimisation functional one obtains

    χ²(θ) = Σ_{j=1}^{M} Σ_{i=1}^{N} [ ε²_ij/σ²_ij − 2 (ε_ij/σ²_ij) ∇_θ y_j(t_i, θ₀) ∆θ
            + ∆θᵀ ( (1/σ²_ij) [∇_θ y_j]ᵀ [∇_θ y_j] ) |_{t_i, θ₀} ∆θ ] .   (4.10)

The minimisation of χ²(θ) with respect to θ yields the following equation for the estimated deviation of the parameter vector ∆θ,

    { Σ_{j=1}^{M} Σ_{i=1}^{N} (1/σ²_ij) [∇_θ y_j]ᵀ [∇_θ y_j] } ∆θ =: F ∆θ = Σ_{j=1}^{M} Σ_{i=1}^{N} (ε_ij/σ²_ij) [∇_θ y_j]ᵀ ,   (4.11)

where the so-called Fisher information matrix F has been introduced [52]. The above equation can be solved, and one obtains

    ∆θ = F⁻¹ Σ_{j=1}^{M} Σ_{i=1}^{N} (ε_ij/σ²_ij) [∇_θ y_j]ᵀ .   (4.12)


Finally, the covariance matrix Σ of the estimated parameter vector is computed as

    Σ = ⟨∆θ ∆θᵀ⟩ = F⁻¹ .   (4.13)

In order to evaluate this covariance matrix, the derivatives of the observation function with respect to the parameters, ∇_θ y_j(t_i), are needed. Taking into account Eq. (4.9), one hence needs the derivatives of g with respect to θ and x. Additionally, the derivatives ∇_θ x_k(t_i) have to be computed from the system of ordinary differential equations

    ∂_t(∇_θ x_k) = Σ_{r=1}^{S} ( ∂f_k(x, θ)/∂x_r ) ∇_θ x_r + ∇_θ f_k(x, θ) ,   (4.14)

with the initial conditions (∇_θ x_k)(0) = ∇_θ x_k(0). If the Fisher information matrix, and thus the covariance matrix of the estimated parameters, is known, the asymptotic confidence intervals for the estimates can be computed from the multivariate normal distribution

    p(θ) = ( √(Det F) / (2π)^{P/2} ) exp( −½ θᵀ F θ ) .   (4.15)

4.2.3 Local identifiability

Local identifiability of the parameters in a small neighbourhood of the true parameter values can be assessed using the parameter estimation accuracy [51]. If at least one of the parameters is not identifiable, this results in a joint probability distribution p(θ) which is not a multivariate normal distribution. Technically, such a non-identifiability results in a covariance matrix which does not have full rank; hence the condition number, the ratio of the largest to the smallest eigenvalue, asymptotically tends to infinity. Therefore the asymptotic behaviour of the condition number can be used to assess local parameter identifiability. Note that if there are large differences in the size of the parameters to be estimated, it is advantageous to use relative estimation errors; in such a setting the correlation matrix is used instead of the covariance matrix.

4.2.4 Optimal experimental design

Basically, the information content of a measurement can be quantified by the covariance matrix Σ of the estimated parameters: simply speaking, the smaller the joint confidence intervals for the estimated parameters, the more information is contained in the experiment. Mostly, four measures of the information content are distinguished [55, 56, 57, 58]:

• A-optimal design: max(Tr(F))

• D-optimal design: min(det(Σ))

• E-optimal design: min(λ_max(Σ))

• modified E-optimal design: min(λ_max(Σ)/λ_min(Σ))



Figure 4.1: The last step of the MAPK cascade. Activated MAPK kinase (Mek∗∗) catalyses the activation of MAPK (Erk), resulting in the activated form Erk∗∗. The deactivation of the active form is catalysed by the phosphatase (Pase).


The A-optimal design tries to maximise the trace of the Fisher information matrix F. However, this criterion is rarely used, since it can lead to non-informative experiments [59] with a covariance matrix which is not positive definite. A D-optimal design minimises the determinant of the covariance matrix Σ and can thus be interpreted as minimising the geometric mean of the errors in the parameters. The largest error is minimised by the E-optimal design, which corresponds to a minimisation of the largest eigenvalue of the covariance matrix. The modified E-optimal design minimises the ratio of the largest to the smallest eigenvalue and thus optimises the shape of the confidence ellipsoid.
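For a given Fisher information matrix these four criteria are straightforward to evaluate; the following sketch uses an arbitrary illustrative matrix:

import numpy as np

F = np.array([[40.0, 12.0],
              [12.0,  9.0]])           # illustrative Fisher information
cov = np.linalg.inv(F)                 # covariance matrix, Eq. (4.13)
eig = np.linalg.eigvalsh(cov)

criteria = {
    "A (maximise)": np.trace(F),
    "D (minimise)": np.linalg.det(cov),
    "E (minimise)": eig.max(),
    "modified E (minimise)": eig.max() / eig.min(),
}
for name, value in criteria.items():
    print(f"{name:>22}: {value:.4g}")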

4.3 Application to MAP–Kinase

Upon external stimulation, cell surface receptors initiate a network of internal signalling pathways which are essential to cell function. One of these signalling pathways, the mitogen-activated protein kinase (MAPK) signal cascade, is a highly conserved pathway found in a variety of eucaryotic organisms. Therefore, cumulative efforts over the past decade have been carried out to explore the functionality and properties of this module [60, 61, 62, 63, 64, 65, 66]; see Refs. [67, 68, 69] for an overview.

4.3.1 The MAP-Kinase signalling pathway

The MAP-Kinase cascade is assumed to be composed of three kinases: MAPK kinase kinase (Raf), MAPK kinase (Mek) and MAPK (Erk). These kinases can be activated by phosphorylation, and the activated form then catalyses the activation of the downstream kinase. In the following we concentrate on the last step of the MAPK cascade, which is shown in Figure 4.1.



Figure 4.2: Western Blot measurement of the MAP-Kinase pathway (a); the black spots correspond to different proteins, separated vertically by size. (b): The measurement error of the quantification of the amount of proteins for different signal intensities.

The reactions occurring in this last step of the signalling cascade are

    [Erk] + [Mek∗∗]   ⇌(a₁, b₁)   [Erk−Mek∗∗]   →(c₁)   [Erk∗] + [Mek∗∗] ,
    [Erk∗] + [Mek∗∗]  ⇌(a₃, b₃)   [Erk∗−Mek∗∗]  →(c₃)   [Erk∗∗] + [Mek∗∗] ,
    [Erk∗∗] + [Pase]  ⇌(a₄, b₄)   [Erk∗∗−Pase]  →(c₄)   [Erk∗] + [Pase] ,
    [Erk∗] + [Pase]   ⇌(a₂, b₂)   [Erk∗−Pase]   →(c₂)   [Erk] + [Pase] ,

where aᵢ and bᵢ denote the forward and backward binding rates and cᵢ the catalytic rate of reaction i.

Assuming mass-action kinetics for each of these reactions, a system of coupled ordinary differential equations for the involved proteins is derived. Altogether 14 dynamical parameters are involved: the 12 reaction rates aᵢ, bᵢ, cᵢ, i ∈ {1, …, 4}, and the total phosphatase (Pase) and kinase (Erk) concentrations. For this system Mek∗∗ serves as the input, while Erk∗∗ can be regarded as the output of the system.
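A minimal mass-action sketch of these four reactions is given below; the state ordering, the constant input and the variable names are our own choices, while the rate values and total concentrations anticipate Eq. (4.16) of the simulation study:

import numpy as np
from scipy.integrate import odeint

a = np.full(4, 0.5); b = np.full(4, 0.6); c = np.full(4, 0.9)

def u(t):
    return 2.0                         # input Mek**(t), illustrative constant

def rhs(x, t):
    erk, erk1, erk2, c1x, c3x, c4x, c2x, pase = x
    mek = u(t)                         # Mek** is clamped as external input
    v1 = a[0]*erk*mek   - b[0]*c1x     # Erk   + Mek** <-> Erk-Mek**
    v3 = a[2]*erk1*mek  - b[2]*c3x     # Erk*  + Mek** <-> Erk*-Mek**
    v4 = a[3]*erk2*pase - b[3]*c4x     # Erk** + Pase  <-> Erk**-Pase
    v2 = a[1]*erk1*pase - b[1]*c2x     # Erk*  + Pase  <-> Erk*-Pase
    return [
        -v1 + c[1]*c2x,                        # Erk
        c[0]*c1x - v3 + c[3]*c4x - v2,         # Erk*
        c[2]*c3x - v4,                         # Erk**
        v1 - c[0]*c1x,                         # Erk-Mek**
        v3 - c[2]*c3x,                         # Erk*-Mek**
        v4 - c[3]*c4x,                         # Erk**-Pase
        v2 - c[1]*c2x,                         # Erk*-Pase
        -v4 - v2 + c[3]*c4x + c[1]*c2x,        # free Pase
    ]

x0 = [50.0, 0, 0, 0, 0, 0, 0, 20.0]    # Erk_tot = 50, Pase_tot = 20
t = np.linspace(0.0, 30.0, 200)
x = odeint(rhs, x0, t)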

4.3.2 Measurement method and error model

For the quantification of the amount of protein present in a cell, Western Blots are used. First the proteins are separated by their size using SDS-PAGE, which stands for sodium dodecyl sulfate polyacrylamide gel electrophoresis. Basically, the SDS detergent binds quantitatively to the proteins, giving them linearity and a uniform charge per length. The proteins can then be separated solely on the basis of their size using the 2D gel electrophoresis already presented in Section 1.3. Afterwards the proteins are transferred to a nitrocellulose membrane, and a specific antibody targeting the protein under consideration is added. Without going into too much detail, this antibody is needed


to finally visualise the proteins. Figure 4.2(a) shows the result of such an experiment; each black spot corresponds to a different protein. The protein concentration can then be quantified via the darkness of each spot. Since this measurement process mainly consists of a high-intensity counting process, one would expect the error distribution to be Gaussian with a standard deviation proportional to √I, where I denotes the measured signal intensity. Figure 4.2(b) shows the measurement error estimated from 8 repeated experiments for different signal intensities. As expected, a power-law dependence is clearly seen. Initial investigations of this error structure can now be used to attach error bars to each single observation, without the need to repeat all measurements.

4.3.3 Simulation study

The dynamic behaviour of the system, and thus the parameter estimation accuracy, depends on the actual parameter values. To demonstrate the prospects of experimental design considerations, the following parameters have been chosen for the purposes of the presented simulation study:

    aᵢ = 0.5, bᵢ = 0.6, cᵢ = 0.9, i = 1, …, 4,   Pase_tot = 20, Erk_tot = 50 .   (4.16)

A typical time evolution of the system response, i.e. of the two measured components Erk and Erk∗∗, subject to a stimulus with a quartic input function, see Eq. (4.17), and including Gaussian observational noise (σ = 0.3), is shown in Fig. 4.3(a). The optimisation of the different optimality criteria can be quite intensive in terms of computing time. For the computation of the optimal stimulation of the MAP-Kinase pathway in Section 4.3, a polynomial parametrisation of the input function is used; a minimisation algorithm based on a sequential quadratic programming method, implemented in the NAG numerical libraries [70], is then used to obtain the optimal stimulation.

4.4 Results

To understand the functionality of the MAPK signalling pathway, a quantitative description of the signal dynamics is necessary [65]. The crucial questions for such a systems-level approach are how the information content of measurements can be quantified, and how experiments can be optimised in order to yield a maximal amount of information about the observed system. To answer these questions, a simulation study using the last step of the MAPK cascade as an example is performed. It is assumed that the dynamical behaviour of both the inactive (Erk) and the active (Erk∗∗) form of MAPK can be measured, subject to a selectable stimulation of the activated MAPKK (Mek∗∗), see Figure 4.1. In order to simplify the discussion of the results, we restrict ourselves to the estimation of two parameters of the first reaction in Figure 4.1.



Figure 4.3: (a): Typical time evolution of the measured activated and inactivated MAPK concentrations, Erk and Erk∗∗, for a quartic input signal, see Eq. (4.17), and (b) the asymptotic behaviour of the condition number of the covariance matrix when the rate constants a₁ and b₁ are estimated.

We will discuss the estimation of a₁, b₁ and of a₁, c₁, since these two cases showed quite different behaviour. All other parameters were fixed. However, the concepts discussed can be generalised to a larger parameter space.

4.4.1 Parameter identifiability

Before one tries to estimate these parameters, one must ensure that the parameters are identifiable at all. To investigate identifiability, we simulated data according to the model and added observational noise. Minimisation of χ²(θ), see Eq. (4.8), then yielded an estimate of the parameters and the covariance matrix. As discussed in Section 4.2.3, the asymptotic behaviour of the condition number of this covariance matrix can be used to assess the identifiability of the parameters [51]. Figure 4.3(b) shows the behaviour of the condition number if the parameters a₁ and b₁ are estimated and c₁ is fixed. It can be seen that the condition number does not tend to infinity with an increasing number N of data points; hence, in this setting the parameters are locally identifiable. However, the large values of the condition number already indicate that there will be large estimation errors, which probably limit the practical identifiability. Estimation of a₁ and c₁ showed the same qualitative behaviour, but the values of the condition number were much smaller.



Figure 4.4: The 95% confidence intervals for the parameters a₁ and b₁ in the case of a modified E-optimal design for the best linear (a) and quartic (b) input function, together with 200 (a) and 500 (b) samples of estimated parameters.

4.4.2 Optimal experimental design

Designing optimal experiments typically involves decisions about which components of the system are measured at which points in time; additionally, the stimulation of the system has to be chosen. For our simulation study we assume that the time evolution of the inactive (Erk) and activated (Erk∗∗) MAPK is measured at a given time resolution, and we try to find the most informative input profile for Mek∗∗. To this end we use a polynomial parameterisation of the input function,

    Mek∗∗(t) = 2 + Σ_{k=1}^{d} p_k tᵏ ,   2 ≤ Mek∗∗(t) ≤ 10  ∀ t ∈ [0, 30] ,   (4.17)

where the maximal and minimal Mek∗∗ concentrations are bounded. Using this parametrisation of the input, we optimised the different design criteria discussed in Section 4.2.4 to obtain the optimal input for different degrees d of the input function. The confidence intervals calculated according to Eq. (4.15) are valid in the asymptotic case of an infinite number of observations. In order to assess the validity of this assumption in the present case, observational noise was added to simulated data and the parameters were estimated. Figure 4.4 shows the 95% confidence ellipsoids calculated according to Eq. (4.15) for a modified E-optimal design, together with 200 samples of estimated parameters for the optimal linear and 500 samples for the optimal quartic input function. The shape and size of the computed confidence ellipsoids resemble the distribution of the estimated parameters, while the estimated parameters are slightly shifted towards higher parameter values. This indicates that in the current setting the asymptotic covariance matrix can be used to optimise the experimental design. Note that even if this assumption were not valid, one could still use the estimated parameters to reconstruct the confidence region and thus to optimise the experimental design; however, the numerical effort is then much larger.



Figure 4.5: The 95% confidence intervals for the parameters a₁ and b₁ in the case of a modified E-optimal design for the best linear and quartic input function.

Figure 4.5 shows the shape of the 95% confidence ellipsoid calculated according to Eq. (4.15) for an estimation of the parameters a₁ and b₁, using two optimal input functions of different degree d. The modified E-optimal design was used to find the optimal linear and quartic input functions. There is a substantial improvement in parameter estimation accuracy in moving from the optimal linear input to the optimal quartic input function; in this case the estimation error of the parameter a₁ was reduced by approximately 60%.

Figure 4.6(a) shows the 95% confidence intervals for the estimation of a₁, b₁ for the optimal D-, E- and modified E-designs, using the largest degree of the input function.


Figure 4.6: (a): The 95% confidence intervals for the estimation of a₁, b₁ and (b) of a₁, c₁ for different design criteria, using an optimal quartic input function.



Figure 4.7: The optimal input functions for the E–optimal design (a), the modified E–optimaldesign (b) and the optimal D-design (c) for different degrees of the input functionpolynom.

There is a strong correlation between the estimates of these two parameters. These correlations can lead to practical non–identifiability for less optimal input functions, despite the structural identifiability which is present in this case. Especially in the case of correlated estimates it is important to be aware of the multi–dimensionality of the confidence region. For example, the estimation error of a1, if the parameter b1 were fixed, is about ten times smaller than in the case where both parameters are estimated. The estimation error of a1 for fixed b1 can be seen in Fig. 4.6(a) as the projection onto the a1 axis of the intersection of the line b1 = 0.6 with the confidence ellipsoid.
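The factor of ten quoted above is a generic consequence of strongly correlated Gaussian estimates. A small sketch with a hypothetical covariance matrix (correlation 0.995, chosen so that the ratio comes out near ten):

```python
import numpy as np

# Hypothetical covariance of the estimates (a1, b1), correlation ~0.995:
C = np.array([[0.0100, 0.0199],
              [0.0199, 0.0400]])

sigma_marginal = np.sqrt(C[0, 0])
# For jointly Gaussian estimates, fixing b1 leaves the conditional variance
# var(a1 | b1) = C_aa - C_ab**2 / C_bb:
sigma_conditional = np.sqrt(C[0, 0] - C[0, 1]**2 / C[1, 1])
print(sigma_marginal / sigma_conditional)  # approximately 10
```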

Figure 4.6(b) shows the confidence regions for an estimation of a1 and c1. Only a small correlation between these two parameters is present, and hence they are better suited for a discussion of the different design criteria. As expected, the D-optimal design results in the confidence region with the smallest volume in parameter space, while the E-optimal design minimises the largest principal axis of the confidence ellipsoid. In contrast to these criteria, the modified E–optimal design, which optimises the shape of the confidence region, results in larger estimation errors, especially for the parameter a1.



Figure 4.7 shows the optimal input functions for different degrees of the polynomial from Eq. (4.17) for the E–optimal, the modified E–optimal and the D–optimal design.

4.5 Conclusion

Systems Biology is based on quantitative dynamic measurements of biological systems, from which the system structure and the parameters have to be deduced. Using the MAP–Kinase signalling cascade, a highly conserved regulatory module, as an example, the impact of improved experimental designs on parameter estimation accuracy was demonstrated. It was shown that by moving from a linear input function to a specific quartic input function the parameter estimation error could be reduced by 60%.
In addition to improving the estimation accuracy, computer simulations can also help to detect possible practical non–identifiabilities in the parameter space. Practical non–identifiability will result in a large condition number of the estimated covariance matrix, see Fig. 4.5 for an example. This example illustrates that, e.g. for an analysis of robustness, where the sensitivity of the output to variations in the parameters is studied, it does not suffice to use variations in single parameters. In general, combinations of parameters are the most prospective candidates for non–identifiabilities.
Since the modified E–optimal design resulted in larger confidence intervals, this simulation study favours the E– and D–optimal experimental designs. These designs resulted in confidence regions of roughly similar shape and size.


Chapter 5

Model selection in robust systems

Widely used technical or biological systems are composed of a large number of subsystems. It would be very harmful if the failure of a single component led to a catastrophic failure of the whole system. In order to avoid such failures of the whole system, a major design goal for complex systems is robustness towards the failure of a single component or changing environmental conditions. A typical example of a robust technical device is the operational amplifier, where changes of the internal components only weakly affect the operation characteristics, see Ref. [71]. In the following, however, we will concentrate on biological systems.
As discussed in the previous chapter, a quantitative model based description is essential for the understanding of signal processing in biological systems. Since these systems have to cope with changing environments, “robustness” is one of the design goals [44, 72]. In the following we will investigate how this “robustness” affects the information about the biological system that can be obtained by measurements.

5.1 Robustness versus Identifiability

In the following the differences between the rigorously defined identifiability of a system and the more qualitative property robustness are discussed, and the basic notation used afterwards is introduced.

5.1.1 Robustness

The abstract property robustness is hard to define rigorously. At least the following three types present in biological systems have to be distinguished, according to Kitano [44]:

• Adaption: Certain features of the system do not, or only weakly, depend on a subset of environmental conditions.




• Parameter insensitivity: There is only a weak or even no dependence of a system response on changes in the kinetic parameters of the system.

• Graceful degradation: After damage the system degrades slowly rather than failing catastrophically.

In the following we concentrate on the first two definitions of robustness. The main difference between them is that adaption concerns external environmental conditions, while parameter insensitivity concerns internal parameters. The internal parameters may be of primary interest, but since the system interacts with the environment, the external parameters have to be estimated accordingly. From a mathematical point of view, the question of identifiability arises. In contrast to robustness, identifiability is mathematically rigorously defined, see Section 4.2.1. Given a certain measurement protocol g(.), it is possible that the system structure, f(.), or the parameters, θ, of the system cannot be estimated from the measurements and, hence, are non-identifiable.

5.2 Application to perfect adaption

Nowadays, with the advances in quantitative measurements available, it is quite popular to use the parameter estimation and model selection techniques presented in Section 4.2.2 to obtain a system level understanding of the biological system under investigation, see e.g. [73, 74, 75]. However, the systems studied are quite large, containing hundreds of differential equations and a similar number of parameters to be estimated. In the following, bacterial chemotaxis is used as a well known example of a robust system showing perfect adaption, to critically demonstrate the pitfalls and prospects of applying parameter estimation techniques to large biological networks.

5.2.1 Bacterial chemotaxis

Bacterial chemotaxis [76, 77] is probably one of the best known biological examples showing perfect adaption. The movement of a swimming bacterium such as Escherichia coli is composed of a smooth motion interrupted by events of ’tumbling’, where a new direction of motion is chosen randomly. By changing this tumbling frequency depending on the current environment of the bacterium, a directed movement, e.g. towards attractants such as food, is possible. Measurements show that the steady state tumbling frequency in a homogeneous environment does not depend on the actual concentration of the chemical attractants influencing the tumbling frequency. This property, known as perfect adaption, ensures that the bacterium is able to maintain its sensitivity towards attractants over a wide range of concentrations.



The proposed model for bacterial chemotaxis [76] is based on a receptor complex which is assumed to be either in an active or an inactive state, with probabilities depending on whether the complex is occupied by a ligand, the attractant, or not. This complex $e^{u/o}_m$, where $u/o$ denotes the occupied or unoccupied state of the receptor, also occurs in different methylation levels $m = 0, \dots, 4$. The basic reactions occurring between these different states are

$$e^{u/o}_m + b \;\underset{d_b}{\overset{a_b}{\rightleftharpoons}}\; [e^{u/o}_m b] \;\xrightarrow{k_b}\; e^{u/o}_{m-1} + b\,, \tag{5.1a}$$

$$e^{u/o}_m + r \;\underset{d_r}{\overset{a_r}{\rightleftharpoons}}\; [e^{u/o}_m r] \;\xrightarrow{k_r}\; e^{u/o}_{m+1} + r\,, \tag{5.1b}$$

$$e^{u}_m + l \;\underset{k_{-l}}{\overset{k_l}{\rightleftharpoons}}\; e^{o}_m\,, \tag{5.1c}$$

where $l$ denotes the ligand, and $r$ and $b$ mediate the methylation and demethylation reactions. Each form of the complex can be in an active state with a probability $\alpha^{u/o}_m$, depending on the methylation level and on whether it is occupied by a ligand or not. A more detailed explanation of the model can be found in Ref. [76].
Assuming the mass action law, the chemical reactions (5.1) can be translated into differential equations, and one obtains e.g. for the complex $[e^u_m b]$

$$\partial_t [e^u_m b](t) = (1 - \delta_{m0}) \left\{ -k_l\, l(t)\, [e^u_m b](t) + k_{-l}\, [e^o_m b](t) - (k_b + d_b)\, [e^u_m b](t) + a_b\, \alpha^u_m\, e^u_m(t)\, b(t) \right\}. \tag{5.2}$$

The full set of equations and a short description of the parameters and their numerical values used in the following are given in Appendix A.

5.2.2 Perfect adaption

According to the model of bacterial chemotaxis discussed above, the measured output of the system, the tumbling frequency, is connected to the system activity

$$A = \sum_{m=0}^{M} \left( \alpha^u_m\, e^u_m + \alpha^o_m\, e^o_m \right). \tag{5.3}$$

It can be shown that under some weak assumptions the steady state activity $A_{st.}$ does not depend on the ligand concentration. Moreover, this property is not a consequence of a finely tuned set of parameter values but rather a robust property of the biological model, which is invariant under moderate changes in the parameter values [76, 78]. The basic assumptions underlying the model are that

• the methylation reaction works at saturation,

• only active complexes are demethylated.



Figure 5.1: System activity of bacterial chemotaxis according to model (5.1) subject to a step-like change of the free ligand concentration. Note the different size of the response after addition and withdrawal of the ligand, and the different adaption times.

Given these prerequisites the perfect adaption property can be derived if, for the sake of simplicity, one assumes that only the methylated complexes can be in the active state, $\alpha^{u,o}_0 = 0$, and that the complex can be methylated only once, $M = 1$. Then, using Michaelis–Menten kinetics for the reactions (5.1a, 5.1b),

$$\partial_t e^{u/o}_1(t) = V_r - V_b\, \frac{\alpha^{u/o}_1 e^{u/o}_1}{k_b + \alpha^{u/o}_1 e^{u/o}_1}\,, \tag{5.4}$$

where $V_r$, $V_b$ are the maximal reaction rates according to the Michaelis–Menten kinetics and $r$ works at saturation. The steady state activity is thus given by

$$A_{st.} = \frac{k_b V_r}{V_b - V_r}\,, \tag{5.5}$$

which is independent of the ligand concentration. It can be checked that this steady state is stable, such that small perturbations of the state result in movements of the system back towards this steady state.
The property that this steady state value is independent of the ligand concentration is known as perfect adaption. The main assumption responsible for this behaviour is that only active complexes are demethylated. This constitutes a feedback of the output of the system back into the system, which then leads to adaption [78]. Figure 5.1 shows the activity of a system subject to a step-like change of the ligand concentration. Note the different time scales at which adaption takes place after addition or withdrawal of free ligand.
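The independence of the steady state from the activity probabilities, and hence from the ligand, can be checked numerically. The sketch below integrates the simplified dynamics (5.4) for two different values of α (through which the ligand concentration enters); the rate values are hypothetical and only need to satisfy Vb > Vr.

```python
import numpy as np
from scipy.integrate import solve_ivp

Vr, Vb, kb = 1.0, 3.0, 0.5   # hypothetical rates with Vb > Vr

def rhs(t, e1, alpha):
    """Simplified methylation dynamics of Eq. (5.4); the activity is alpha*e1."""
    a = alpha * e1[0]
    return [Vr - Vb * a / (kb + a)]

# The ligand concentration enters only through alpha; the steady state
# activity should equal kb*Vr/(Vb - Vr) = 0.25 in both cases, cf. Eq. (5.5).
for alpha in (0.2, 0.8):
    sol = solve_ivp(rhs, (0.0, 200.0), [1.0], args=(alpha,), rtol=1e-8)
    print(alpha * sol.y[0, -1])
```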

5.2.3 Robustness: Pitfalls and Prospects

In the simplified model which was used to demonstrate the perfect adaption property of bacterial chemotaxis, the steady state activity (5.5) did not depend on the ligand concentration.



Figure 5.2: Dependence of the adaption property on changes in the parameter values. (a) Relative change of the steady state value after an increase in the ligand concentration for different values of the system parameters B, E and kr. (b) Dependence of the absolute activity value on changes in the parameters without a change of the ligand concentration.

This is, however, not strictly true for the real model for all possible parameter values. Figure 5.2(a) shows how the relative change of the system activity after addition of ligand depends on the parameter values used. One can clearly see that the parameters used in Ref. [76] yield perfect adaption, while a change in the parameters only weakly affects the adaption property. Figure 5.2(b) shows the dependence of the absolute steady state activity on the actual values of three different parameters for fixed ligand concentrations. Hence, the absolute activity is definitely not a robust property.
What are the implications of robust properties such as the system activity for parameter estimation and model selection? In the current literature one finds statements like: “Parameter estimation accuracy additionally gives a quantitative measure for robustness ...”, see Ref. [75]. While robustness definitely induces large parameter estimation errors and, hence, can lead to practical non-identifiabilities, the converse conclusion from parameter estimation accuracy to robustness is not straightforward. In order to explore the connection between parameter estimation accuracy and robustness, data according to the presented model of bacterial chemotaxis are simulated and used to estimate the parameter values. The model used consists of 24 differential equations and 21 parameters and thus has approximately the same size as other popular biological pathways currently investigated [74, 75].
The perfect adaption of the steady state system activity Ast with respect to the ligand concentration l reflects the fact that Ast is independent of l. From a mathematical point of view, if only Ast is observed, l and most of the other parameters cannot be identified. This situation changes if the dynamics of A(l(t)) resulting from a stimulation with the ligand l is measured, see Figure 5.1.



Figure 5.3: Parameter estimation error of the probability α3 of being in one of the active states (a) and of the dissociation rate dr (b), based on a measurement of A(l(t)) for different numbers N of observation time points. Asymptotically the expected square root dependence is found.

It can be shown that under this condition all parameters of the original model [76] can be identified.
Using this large model, and measuring only the time dependent system activity, the parameter estimation reveals interesting results. Figure 5.3 shows the parameter estimation accuracy of two estimated parameters, α3 and dr, depending on the number N of observations. From the maximum likelihood approach to parameter estimation, one would expect a square root dependence of the estimation error on the number of data points used. This dependence can clearly be seen in Figure 5.3(b). However, the parameter estimation accuracy of α3 shows a quite different behaviour, see Figure 5.3(a). There is only a modest increase in estimation accuracy at the beginning, but if more observations are used one reaches the expected asymptotic behaviour.
This demonstrates that, although robustness is present in the system, the parameters can be accurately estimated. Difficulties in parameter estimation are, as in this case, most often caused by an insufficient number of measured data points and cannot be traced back to robustness.
From a mathematical point of view, the investigation of robustness of a biological system induces an observation function g(.). This observation function g(.) determines the identifiability of the parameters. Thus, robustness can render parameters non-identifiable, leading to huge estimation errors. However, this argument cannot be reversed: huge estimation errors can also be caused by a too small number of measured data points.
By choosing a well designed observation function g(.), including properties of the system which are not robust with respect to changes in the parameters, it is possible to estimate the parameters reliably.
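The asymptotic square root law itself is easy to reproduce in a toy setting. The following sketch uses a simple mean estimator, not the chemotaxis model, purely to illustrate the expected scaling of the estimation error with N.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, theta_true = 0.1, 0.5   # hypothetical noise level and parameter

for N in (10, 100, 1000, 10000):
    # Monte Carlo estimate of the estimation error for N observations:
    estimates = [np.mean(theta_true + sigma * rng.standard_normal(N))
                 for _ in range(2000)]
    print(N, np.std(estimates))   # shrinks like sigma / sqrt(N)
```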



In contrast to the steady state activity in the case of bacterial chemotaxis, the time dependent system activity A(t) crucially depends on the exact parameter values. Hence, using measurements of A(t), it was possible to estimate the parameters reliably.
Since in general robustness affects only a small subset of the features of a biological system, one would expect that in most cases carefully designed experiments will be able to identify the biological pathway and allow for an accurate estimation of the parameters involved. Simulation studies as discussed in Chapter 4, in advance of the experiments, are important to identify such observation functions and, hence, to improve the experimental design.


Chapter 6

Analysis of the dynamics of the lipoprotein metabolism

Nature uses as little as possible of anything.

Johannes Kepler

The lipoprotein metabolism basically acts as a transport system, delivering lipoproteins assembled in the liver through the blood into the cells. Since it is well known that disorders of this metabolic pathway dramatically increase the risk of coronary heart disease, this pathway is a main target for drug development. Hence, a system level understanding of this pathway is important in order to design or improve drugs aimed at reducing this risk.
In the following, the working principle of the lipoprotein metabolism is investigated by stable–isotope experiments, and the resulting data are used to validate and improve currently available models.

6.1 The lipoprotein metabolism

In the liver, lipoproteins from the digestive system are assembled into large apolipoprotein B (apoB) molecules. These carrier molecules are then distributed by the blood system. If the apoB molecule binds to a cell, it partly delivers its cargo. Since the lipoproteins delivered usually have a smaller mass density than the carrier molecule, the density of the resulting apoB molecule increases during this process.
Historically, the apoB molecules are distinguished by their mass density according to Table 6.1 [79]. However, in principle the apoB molecules can transfer a variable amount of cargo into the cells.




acronym   name                               density range
VLDL      Very low density lipoprotein       0.9 – 1.006 g/ml
IDL       Intermediate density lipoprotein   1.006 – 1.019 g/ml
LDL       Low density lipoprotein            1.019 – 1.063 g/ml

Table 6.1: Distinction of apoB molecules transferring lipids from the liver into the cells according to their mass density.

The importance of this metabolism stems from the fact that especially the high-density LDL molecules can increase the risk of atherosclerosis up to six-fold [80]. Thus this metabolic pathway is a main target of drug development.

6.1.1 Experimental design

Usually it is assumed that the lipoprotein metabolism as discussed above works in a steady state. However, an increased activity of the digestive system produces more apoB carrier molecules, and the delipidation cascade from VLDL to IDL and LDL is stimulated. Modelling the lipoprotein metabolism based on such an approach would be quite difficult, since such a model would have to include the effects of the digestive system. Hence, it was decided to perform the experiments in the fasting state.
Since the apoB molecule is produced in the liver, it is much easier to insert labelled amino acids such as leucine, phenylalanine or arginine into the body, which are then used during the assembly of apoB. The resulting labelled apoB molecules can hence be used to quantitatively observe the delipidation cascade. The first experiments used radioactively marked amino acids, while nowadays stable–isotope marked amino acids are preferred for medical reasons. If only a small amount of these labelled amino acids is used, the assumed steady state of the lipoprotein metabolism is not disturbed.
This experimental approach offers the possibility to use different markers at the same time. Hence, it is possible to study the effect of different stimulations with two or more markers at the same time in one patient. Figure 6.1 shows the quantitative measurements of the two labelled amino acid concentrations aa(t). The markers were given as a bolus injection for phenylalanine, Figure 6.1(a), and as a continuous injection over 8 hours for arginine, Figure 6.1(b).

6.1.2 Experimental methods

To obtain quantitative measurements of the amount of labelled and unlabelled apoB molecules, the following experimental steps are taken: blood samples are collected from the study participants between 0 and 12 hours after intravenous administration of the labelled amino acids. Ultracentrifugation is then used to separate molecules with different mass densities. Since VLDL and IDL molecules possess narrow mass density distributions, they can be separated relatively easily. However, the distributions of the three different LDL molecules overlap.



Figure 6.1: Stimulation of the lipid metabolism. Shown are the time courses used to administer the two differently labelled amino acids aa(t): (a) phenylalanine with a bolus injection and (b) arginine administered continuously for 8 hours.

Therefore the density range for the separation of the LDL was divided into six subranges, see [81]. For every density range at every time point the total concentration of apoB, the so-called apoB pool value, was determined by nephelometry (Wako 30R analyser, Wako Chemicals, Japan). It is important to note that this technique yields the total mass of apoB molecules in a specific mass density range, i.e. this method integrates the mass density distribution over such a mass density subrange.
To obtain the time course of the labelled apoB molecules within the different density ranges, gas chromatography–mass spectrometry (GC-MS) was performed with a Hewlett Packard 5890 gas chromatograph connected to a Hewlett Packard 5972 mass spectrometer. Ion masses 282 and 285 for leucine (three H-atoms replaced by D-atoms) and 148 and 154 for phenylalanine (six 12C-atoms replaced by 13C-atoms) were determined. Using these relative values in combination with the absolute concentration of apoB yielded absolute concentrations of labelled and unlabelled apoB molecules.
The level of observational noise of the experimental data results from a combination of the experimental measurement techniques involved, i.e. ultracentrifugation, nephelometry and GC-MS. The technical GC-MS data variability is accounted for by a Poisson distribution, due to the counting process involved. The GC-MS yields the fraction of labelled and unlabelled apoB molecules. To obtain the absolute apoB concentrations, these values have to be multiplied by the total apoB concentration. The error of the total apoB concentration due to technical and biological variability is determined by Savitzky-Golay filtering of the measured time dependent total apoB concentrations [50].
A typical data set is shown in Fig. 6.2.



Note that for each of the two tracers, the GC-MS yields the fraction of labelled and unlabelled apoB. These values are then multiplied with the total apoB concentration to obtain the concentrations of the labelled and unlabelled apoB molecules. Hence, there are two measurements of the unlabelled VLDL/IDL, which both enter the analysis.

6.1.3 Modelling the lipoprotein metabolism

In principle the amount of lipids contained in an apoB molecule, and hence the mass density of the lipoproteins, is a continuous valued variable. Due to the large particle numbers of assembled apoB molecules, stochastic effects can be neglected, and thus the description of the lipid metabolism can be based on a general reaction diffusion system

$$\partial_t n(\rho, t) = -\partial_\rho D(n, \rho, t)\, \partial_\rho n(\rho, t) + F(n, \rho, t). \tag{6.1}$$

Here the particle number n(ρ, t) dρ denotes the number of apoB molecules with a density between ρ and ρ + dρ. The coefficient D(n, ρ, t) describes the diffusion of the particle density n(ρ, t), while F(n, ρ, t) describes reactions of apoB molecules resulting in a loss of lipoproteins, which are transferred to the cells.
In general the reaction and diffusion terms in Eq. (6.1) are unknown and have to be estimated from the dynamic measurements taken. Since these measurements integrate the particle density n(ρ, t) over different density regions for VLDL, IDL and LDL and result in a total concentration, practically all information about the mass density distribution is lost. Thus a model based on the full reaction diffusion system is not possible with the current experimental setting.
However, there is experimental evidence that the release of cargo causes a conformational alteration of the apoB molecule, which results in a change of the binding activity towards the cells. This assumption justifies building a mathematical model based on distinct lipoprotein compartments having different mass density distributions. Hence, such a model can be formulated in terms of ordinary differential equations

$$\dot{x}(t) = f(x(t), \theta). \tag{6.2}$$

Here the system state vector is denoted by $x(t) \in \mathbb{R}^S$, and $\theta \in \mathbb{R}^P$ denotes both the parameters of the system and the unknown initial conditions x(0). Note that only the total amount of the different lipoproteins, xi(t), is modelled. Such a model does not include the lipoprotein mass density distributions; this is the basic idea of using a compartment model.
The mass density distribution for VLDL and IDL can be assessed experimentally, see [82]. It can be shown that there are two independent VLDL subfractions and one IDL subfraction. Additionally, the VLDL and IDL density distributions are well separated. Hence, both the sum of the two VLDL subfractions and the total IDL concentration can be measured directly. In the LDL density range, three different lipoproteins are found experimentally, see [83]. Their mass density distributions are highly overlapping and, hence, they cannot be measured separately. A mathematical model of the lipoprotein metabolism in the LDL mass density range therefore has to incorporate a parametrisation of the three LDL mass density distributions.



The six different LDL measurements, see Section 6.1.1, are the total lipoprotein concentrations in small density subranges and, hence, contain contributions from all three LDL lipoproteins. Thus the reconstruction of the LDL mass density distribution and of the time evolution of the LDL subfractions is a highly non-trivial problem.

6.2 Modelling the large lipoproteins: VLDL and IDL

The lipoproteins are assumed not to interact with each other. Thus the rate at which lipoproteins are delivered to the cell should only depend on the current lipoprotein concentration and, hence, linear ordinary differential equations are used in the following. Since linear equations are used, and the reactions are such that lipoproteins are only removed from the apoB molecules, the modelling of the lipoprotein metabolism can be separated into two steps. In the first step a model of the large VLDL and IDL molecules is derived and validated using experimental data [84]. Afterwards, the three smaller low density lipoproteins are included in this model.

6.2.1 Experiments

Figure 6.2 shows the dynamic behaviour of the correspondingly labelled VLDL and IDL concentrations. It can be seen that the concentrations of the labelled molecules are quite small compared to the unlabelled ones. However, it is quite obvious that the system is not in a steady state, since especially the concentration of the unlabelled VLDL changes significantly.
During the experiments blood is taken from the patients, and in order to keep the total volume constant an infusion is given. This artificially lowers the lipoprotein concentration, and the experimental data have to be corrected for this effect. Figure 6.3(b) shows the measured cholesterol concentration of five different patients. It can be assumed that cholesterol is not produced during the experiments, and hence the cholesterol time course can be used to account for the dilution effect.
Figure 6.3(a) shows the corrected unlabelled VLDL pool concentrations for the same five patients. It can be seen that for some patients the VLDL concentrations increase at about 7.5 hours. This can only happen if the production rate of the liver is increased, which clearly conflicts with a steady state assumption for the whole experiment.

6.2.2 Compartment models

Several different compartment models of the lipid metabolism have been proposed during the last years [85, 86, 87, 88]. However, all these models used the steady state assumption which, as the measurements discussed in Section 6.2.1 show, is violated. Hence, in the following a model not based on the steady state assumption is proposed and compared with a model from Packard et al. using this assumption, see Ref. [86].



Figure 6.2: Time evolution of phenylalanine (obs, p) and arginine (obs, a) labelled and unlabelled (obs) VLDL and IDL molecules after injection of marked amino acids. The corresponding time courses of the labelled amino acids were shown in Figures 6.1(a) (phenylalanine) and 6.1(b) (arginine).



Figure 6.3: Normalised time evolution of the unlabelled VLDL for five different patients (a) and typical measurements of the cholesterol concentration during the experiment (b).

Figure 6.4(b) shows the model Mst from Packard et al., composed of six different VLDL compartments and one IDL compartment. The LDL part of the model is not shown here, since in this section we restrict ourselves to a model of the VLDL and IDL measurements. In the setting of linear compartment models, the LDL parts can then be added afterwards without any modification of the previous part. In the original model the liver produced not only the first VLDL compartment but also later ones, which is not included here. There are two reasons for this: first, this would introduce non-identifiable parameters into the system, see Ref. [86], and second, these rates were much smaller than the production rate of the first VLDL.
Figure 6.4(a) shows the proposed model Mnst, consisting of only two VLDL compartments. The variable production rate of the liver is accounted for by a function of the general form

$$k_{01}(t) = \alpha_0 + \alpha_1\, t + \alpha_2 \max(0, t - 7.5)\,, \tag{6.3}$$

where t is measured in hours. The term containing α2 describes an increased production rate setting in after approximately 7.5 hours due to digestion, a behaviour present in some of the VLDL measurements shown in Figure 6.3(a).
In comparing the two models it is important to note that the Packard model makes use of six VLDL compartments, of which only the sums of the first three and the last three are measured, respectively. In contrast to this, the newly proposed model includes only two VLDL compartments, and their sum is measured. As already discussed, it is a priori not known where the VLDL density range has to be split in order to obtain the two different VLDL compartments, which can then be described by an ordinary differential equation.
Summarising, the proposed model has three main advantages: it includes a non–stationary VLDL pool, uses a reduced number of compartments and does not introduce a priori unknown splits in the VLDL density range. A reduced number of compartments is of course only an advantage if the experimental data are explained as well as with a larger model.
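As a small illustration, the piecewise-linear production rate of Eq. (6.3) can be written directly in code; the coefficient values below are placeholders, not estimates from the data.

```python
def k01(t, alpha0, alpha1, alpha2):
    """Time-dependent liver production rate of Eq. (6.3); t in hours.
    The max(0, t - 7.5) term switches on an additional linear increase
    after approximately 7.5 hours."""
    return alpha0 + alpha1 * t + alpha2 * max(0.0, t - 7.5)

# Hypothetical coefficients: a slow decrease, then an increase after 7.5 h.
print([round(k01(t, 1.0, -0.05, 0.2), 3) for t in (0, 5, 7.5, 10, 12)])
```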



Figure 6.4: Schematic sketch of the non-stationary, Mnst, (a) and stationary, Mst, (b) compartment models used. The dashed boxes show which components are combined in one observation.


6.2.3 Model selection

Besides the estimation of parameters, another very important topic is how to discriminate between different possible models [89]. In the simplest setting these models are nested, meaning that there is one largest model which includes all others if an appropriate number of parameters is eliminated.
In the case of nested models, the likelihood ratio test, see Refs. [24, 90], can be used to eliminate unnecessary parameters starting from the largest model. For the case of non-nested models see Refs. [91, 92]. The likelihood ratio test procedure is based on the likelihood ratio

$$lr(M_1, M_2) = 2\, [L(M_1) - L(M_2)]\,, \tag{6.4}$$

between two models. Here L(.) denotes the logarithm of the likelihood of the corresponding model, M1 is the full model and M2 is the restricted model. If one assumes that the true model is nested within model M2, then the likelihood ratio test statistic, which can be evaluated as

$$lr(M_1, M_2) = \chi^2(M_2) - \chi^2(M_1)\,, \tag{6.5}$$

is distributed according to a $\chi^2(p_1 - p_2)$ distribution, where $p_1$, $p_2$ denote the numbers of estimated parameters.



Figure 6.5: Experimental data and estimated trajectories for one patient with two different tracers. Since only the sum of both VLDL compartments is measured, the time evolution of these two compartments is also shown.



Figure 6.6: Measurements of two VLDL sub–compartments within the VLDL mass density range.

In all practical cases the χ2 values for the two models are of course not known and have to be computed based on estimates of the parameters.
In the following a bootstrap approach to model selection is discussed [93, 94]. Basically, model selection can be done by backward–, forward–, or stepwise–selection. Backward–elimination starts with the largest model, and in each step the χ2 values of all models with one parameter less are investigated to decide which one is the most probable, based on the likelihood–ratio test. Accordingly, forward–selection starts with the smallest model and adds one parameter in each step. Stepwise selection combines the two approaches: in each step parameters can be added to or removed from the model. However, none of these approaches is guaranteed to find the right model when applied to data. As an additional problem, these procedures are based on multiple likelihood–ratio tests, and hence the problem of multiple testing, which was already discussed in Section 2.3, arises.
In order to assess the stability of these model selection strategies, a bootstrap estimate of the model selection result can be used [95]. In the present setting, a non–parametric bootstrap sample is constructed by sampling with replacement from the 25 available experiments. The model selection is then carried out with each of the bootstrap samples, which gives insight into the stability of the model selection procedure. Computationally this model selection strategy is very favourable, since the parameters for each model have to be estimated only once for each dataset. The bootstrap procedure, which then operates only on the resulting χ2 values, is hence quite fast.
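A skeleton of such a resampling loop is sketched below. Pooling the precomputed per-experiment likelihood-ratio statistics and testing the pooled value against a χ2 distribution with one degree of freedom per resampled experiment is one illustrative reading of the procedure, not necessarily the exact implementation used for Table 6.2.

```python
import numpy as np
from scipy.stats import chi2

def bootstrap_selection(dchi2_m1, dchi2_m2, n_boot=1000, level=0.05, seed=1):
    """Non-parametric bootstrap over experiments: resample the precomputed
    likelihood-ratio statistics of Eq. (6.5) with replacement and record how
    often each reduced model survives the test against the full model."""
    dchi2_m1, dchi2_m2 = np.asarray(dchi2_m1), np.asarray(dchi2_m2)
    rng = np.random.default_rng(seed)
    n = len(dchi2_m1)
    counts = {"Mnst_1": 0, "Mnst_2": 0, "Mnst": 0}
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # resample experiments
        p1 = chi2.sf(dchi2_m1[idx].sum(), df=n)    # pooled LR test, reduction 1
        p2 = chi2.sf(dchi2_m2[idx].sum(), df=n)    # pooled LR test, reduction 2
        if p1 < level and p2 < level:
            counts["Mnst"] += 1                    # both reductions rejected
        elif p1 >= p2:
            counts["Mnst_1"] += 1                  # keep the better-supported reduction
        else:
            counts["Mnst_2"] += 1
    return counts

# Usage: pass the two ∆χ2 columns of Table 6.2 for one patient group.
```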

6.2.4 Results

An exemplary trajectory according to model Mnst, simulated with the estimated parameters, is shown in Figure 6.5. It can be seen that this model explains the measured data satisfactorily. Since only the sum of the two VLDL compartments is measured, their individual dynamic behaviour is unknown. Nevertheless, the predictions of the model for these compartments can be computed and are shown in the figure.
Note that the mass density ranges of these compartments are unknown and that they can hence not be measured.



Figure 6.6 shows the measurements for two VLDL compartments using a split motivated by a priori knowledge about the chemical configurations of the apoB molecule. Qualitatively, the model predictions from Figure 6.5 resemble the time evolution of the measured compartments. However, it was not possible to obtain reasonable fits when these measurements were included. This is most likely due to the overlapping mass density distributions of the two VLDL species. Hence, VLDL1 and VLDL2 cannot be separated experimentally.
The results of the parameter estimation for all available patients using the two models Mst and Mnst are shown in Table 6.2. The obtained χ2 values should be distributed according to a χ2 distribution with N − p degrees of freedom, where N denotes the number of measured data points and p parameters are estimated. Table 6.2 shows that using the 5% quantile, model Mst is rejected in 20 of 25 cases, while model Mnst is rejected only once. This clearly demonstrates that a model assuming a stationary VLDL pool with six VLDL sub-compartments fails to quantitatively describe the lipid metabolism. In contrast, the proposed model using only two VLDL sub-compartments and a transfer function according to Eq. (6.3) describes the measured data satisfactorily.

Given the fact that model Mnst describes the lipoprotein metabolism much better than Mst, the bootstrap model selection procedure described above is used in the following to investigate whether a smaller model would be sufficient. The patients from Table 6.2 stem from three different studies. The first 10 experiments were performed with post-menopausal women, experiments 11 to 21 consist of male patients with combined hyperlipidemia, and the last four experiments were done with healthy laboratory staff. Since from a medical point of view one would expect these groups to behave differently, they are analysed separately.
In principle there are nine different models nested within Mnst with one parameter less. In Table 6.2 only the differences χ2(Mnst_i) − χ2(Mnst) for the two smaller models

• Mnst_1: kv1 = 0

• Mnst_2: α2 = 0

are given, since it turned out that each of the other models was selected in less than 5% of the bootstrap samples.
If only the first group of experiments is used, about 71% of the samples discard the larger model in favour of model Mnst_1, and in 28% of the samples both smaller models are discarded. These 28% for the larger model are mainly caused by experiment 8; if this experiment is left out, model Mnst_1 is chosen in 95% of all cases.
The second experimental group shows a quite different behaviour. Here model Mnst_1 is chosen in only 10% of all cases, while model Mnst_2 and the full model Mnst are each selected in 44% of the cases. Hence this group of patients seems to be more heterogeneous and there is no unique smaller model for all patients.
The first two groups (post-menopausal women and men with combined hyperlipidemia) thus behave quite differently. The model selection process revealed that for the first group the decay of one VLDL compartment can be neglected, which nevertheless is important for the second group.



             Mst              Mnst             Mnst_1           Mnst_2
patient    dof    χ2        dof    χ2        ∆χ2      p       ∆χ2      p
  1         65   112.4∗     134    53.5      0.01    0.92     8.63   < 0.05
  2         45   130.9∗     117    52.1      0.01    0.92     0.91     0.34
  3         65    43.8      135    35.4      0       1        2.95     0.09
  4         49   123.7∗     114    71.2      0       1        1.1      0.29
  5         43     8.7       55    47.0      0.58    0.44     0.83     0.36
  6         33    79.2∗      99    63.5      0.76    0.38    13.71   < 0.05
  7         33    79.7∗      93    39.4      0       1        2.22     0.14
  8         43   169.2∗     109    77.2      7.99  < 0.05     0.39     0.53
  9         42   150.9∗     110    77.8      0       1       59.21   < 0.05
 10         67   424.5∗     139   185.4∗     5.4   < 0.05    10.6    < 0.05
 11         62    37.8      121    36.5      1.14  < 0.05     0        1
 12         41   207.7∗     112   130.8      0       1       12.9    < 0.05
 13         33   101.0∗      99    81.2      5.79  < 0.05     0        1
 14         34    83.1∗     102    67.3      7.98  < 0.05     0.18     0.66
 15         42   126.8∗     113    67.8      2.69    0.1      0        1
 16         66   107.6∗     137   112.2      7     < 0.05     0        1
 17         42    60.0∗     114    68.9      0.87    0.35     0.08     0.77
 18         42   154.3∗     111    63.6      0       1        0.19     0.66
 19         43    87.6∗     114    58.5      0.01    0.92    12.38   < 0.05
 20         39   116.7∗     109    21.5      0       1        0.02     1
 21         65    41.4      135    23.7      9.82  < 0.05     0.56     0.45
 22         64   164.7∗     125   119.3     29.6   < 0.05    23.5    < 0.05
 23         43   263.8∗     115    91.6      5.7   < 0.05     0        1
 24         40    45.4      106    51.1      1.55    0.21     4.2    < 0.05
 25         43   156.5∗     113   123.5      0       1        0.2      0.65

Table 6.2: The χ2 values obtained from the parameter estimation for the two models Mst and Mnst. A star (∗) denotes that the corresponding model is rejected at the 5% level. Additionally, the differences in the χ2 values for the models Mnst_1 and Mnst_2, which are nested within Mnst, and the p-values from the likelihood–ratio test are given.

On the contrary, the lipoprotein metabolism of the men showed that the increase of the VLDL production rate determined by the parameter α2 can be discarded in approximately half of the cases, while this parameter is definitely needed in the first patient group.
The third group of patients was not analysed by this bootstrap procedure, since only four experiments are available.



Figure 6.7: Schematic sketch of the lipoprotein density distribution n(ρ). Shown are the density distributions of VLDL1, VLDL2, IDL, LDL1, LDL2 and LDL3 (from left to right). Observed are the integral particle numbers within the 8 density ranges g1, . . . , g8.

6.3 Modelling the small lipoproteins: LDL

In the following the mathematical model developed for the larger lipoproteins, VLDL and IDL, is generalised to include the smaller low density lipoproteins.

6.3.1 Mathematical model

The system state vector x(t) is composed of the six lipoproteins in the following way:

$$x(t) = (\mathrm{VLDL}_1(t), \mathrm{VLDL}_2(t), \mathrm{IDL}(t), \mathrm{LDL}_1(t), \mathrm{LDL}_2(t), \mathrm{LDL}_3(t))^T. \tag{6.6}$$

In order to successfully model the lipid metabolism under fasting conditions, it is necessary to include a time dependent production rate of VLDL molecules, see Section 6.2.1. Typical production rates are known to decrease at the beginning and to increase after approximately 7.5 hours due to digestion, see Section 6.2.2. Hence, the parametrisation of the production rate

$$k_{01}(t) = \alpha_0 + \alpha_1\, t + \alpha_2 \max(0, t - 7.5)\,, \tag{6.7}$$

which was already introduced in Eq. (6.3), is used in the following.



The largest linear ordinary differential equation model for the lipid metabolism investigated in the following is given by

$$\dot{x}_j(t) = k_{0j}\, aa(t - \tau) + \sum_{i=1}^{j-1} k_{ij}\, x_i(t) - \sum_{i=j}^{6} k_{ji}\, x_j(t) - k_{vj}\, x_j(t)\,, \qquad j = 1, \dots, 6\,. \tag{6.8}$$

Here the parameters kij denote the transfer rates of apoB molecules from compartment xi to xj. The rates k0j describe the direct production of xj by the liver, and kvj denotes the degradation rate of the corresponding compartment. The function aa(t) denotes the concentration of the markers, see Figure 6.1, and τ denotes the delay until the first labelled apoB molecules produced by the liver appear.
This model includes all possible decays of lipoprotein compartments. The only two assumptions made in the following are that the second VLDL compartment x2 and the IDL compartment x3 are not produced directly by the liver: k02 = k03 = 0. Furthermore, since the direct production rates of the LDL compartments are expected to be much smaller than k01(t), it is reasonable to assume a constant direct production rate for these components.
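To make the structure of Eq. (6.8) concrete, the following sketch integrates the six-compartment system for hypothetical rate values and a bolus-like tracer input; none of the numbers are estimates from the data.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lipo_rhs(t, x, K, k0, kv, aa, tau):
    """Right-hand side of Eq. (6.8). K[i, j] is the transfer rate k_ij from
    compartment i to compartment j, k0 the direct production rates and kv
    the degradation rates; aa(t) is the labelled amino acid input."""
    dx = np.empty(6)
    for j in range(6):
        inflow = sum(K[i, j] * x[i] for i in range(j))
        outflow = K[j, j:].sum() * x[j]
        dx[j] = k0[j] * aa(t - tau) + inflow - outflow - kv[j] * x[j]
    return dx

# Hypothetical rates, for illustration only: a simple delipidation chain.
K = np.zeros((6, 6))
for j in range(5):
    K[j, j + 1] = 0.5
k0 = np.array([1.0, 0.0, 0.0, 0.05, 0.05, 0.05])    # k02 = k03 = 0, see text
kv = np.full(6, 0.2)
aa = lambda t: float(np.exp(-t)) if t > 0 else 0.0   # bolus-like tracer input
sol = solve_ivp(lipo_rhs, (0.0, 12.0), np.zeros(6),
                args=(K, k0, kv, aa, 0.5), max_step=0.1)
print(sol.y[:, -1])  # labelled compartment concentrations after 12 h
```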

6.3.2 Observation function

The measurements resulting from the stable isotope experiments correspond to integrals over small density ranges of the lipoprotein distribution. Figure 6.7 shows a sketch of the lipoprotein density distribution, together with the 8 density regions in which the total lipoprotein concentrations are measured. Note that the first measured quantity, total VLDL, corresponds to the sum of x1 and x2, while the second quantity measured, IDL, is x3. Hence the first two components of the observation function g(x) are given by

$$g_1(x) = x_1(t) + x_2(t), \tag{6.9a}$$

$$g_2(x) = x_3(t). \tag{6.9b}$$

Since the three LDL distributions overlap strongly, each of the quantities x4, x5 and x6 contributes to the six measurements taken in this density region. In order to obtain a functional relationship between the measured quantities and the sizes of the LDL subfractions, the density distribution of each LDL compartment has to be parametrised. To this end we assumed a Gaussian distribution with independent mean, ρj, and variance, σ2_j, for each LDL compartment. Thus the measurements in the LDL density range are given by

$$g_j(x) = x_4(t)\, h_1(\rho_{j-2}, \rho_{j-1}) + x_5(t)\, h_2(\rho_{j-2}, \rho_{j-1}) + x_6(t)\, h_3(\rho_{j-2}, \rho_{j-1})\,, \qquad j = 3, \dots, 8\,, \tag{6.9c}$$



where the LDL density range was split experimentally according to the vector

$$\rho^{exp} = (1.019, 1.031, 1.034, 1.037, 1.040, 1.044, 1.063)\,. \tag{6.10}$$

The contributions of the three LDL subfractions to each of the measurements can be calculated as follows:

$$h_j(\rho^{exp}_1, \rho^{exp}_2) = \int_{\rho^{exp}_1}^{\rho^{exp}_2} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left( -\frac{(\rho - \rho_j)^2}{2\sigma_j^2} \right) d\rho\,. \tag{6.11}$$

Since the parameters ρj and σj are unknown, they are included in the parameter vector θ and are thus estimated together with the dynamic parameters and the starting values of the system.
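The LDL part of the observation function, Eqs. (6.9c)–(6.11), amounts to weighting each compartment by the Gaussian probability mass falling into the measured density range. A minimal sketch, with hypothetical compartment sizes and density-distribution parameters:

```python
import numpy as np
from scipy.stats import norm

# Experimental density grid of Eq. (6.10):
rho_exp = np.array([1.019, 1.031, 1.034, 1.037, 1.040, 1.044, 1.063])

def h(rho_lo, rho_hi, mu, sigma):
    """Fraction of a Gaussian LDL density distribution, Eq. (6.11),
    falling into the density range [rho_lo, rho_hi]."""
    return norm.cdf(rho_hi, mu, sigma) - norm.cdf(rho_lo, mu, sigma)

def ldl_observation(x_ldl, mu, sigma):
    """The six LDL measurements of Eq. (6.9c): each is a weighted sum of the
    three LDL compartments, weighted by their overlap with the density range."""
    return [sum(x_ldl[k] * h(rho_exp[j], rho_exp[j + 1], mu[k], sigma[k])
                for k in range(3))
            for j in range(6)]

# Hypothetical compartment sizes and density-distribution parameters:
print(ldl_observation(x_ldl=[10.0, 8.0, 12.0],
                      mu=[1.028, 1.036, 1.045], sigma=[0.004, 0.004, 0.005]))
```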

6.3.3 Results

In all stable isotope experiments the lipid metabolism was stimulated by administering labelled amino acids with time courses similar to the ones shown in Figure 6.1. These amino acids are then used during the formation of the apoB molecules, resulting in labelled lipoproteins. As discussed in Section 6.1.1, each experiment consists of four time courses resulting from the two labelled and the corresponding unlabelled lipoprotein concentrations. Since the labelling by stable isotopes does not influence the lipid metabolism, the dynamic parameters of the system are identical for these time courses. Hence, these time courses are used simultaneously in order to estimate the parameters of the system by minimising the residual sum of squares, see Eq. (4.8).
The model parameters were estimated separately for each patient. This includes the parameters ρj, σj describing the density distributions of the three LDL compartments. In principle it would be possible that the location parameters of these distributions are the same for all patients, or at least for a subgroup of them sharing a common type of lipoprotein metabolism. However, in a first step we are mainly interested in investigating whether there exists one common model for all patients using the assumptions discussed above. Hence these parameters were estimated anew for every patient.
Figure 6.8 shows the experimental data for a stable isotope experiment using phenylalanine as marker, together with the predictions of the model, see Eq. (6.8), for the estimated parameters. It can be seen that the model used is indeed able to explain the experimental data, despite the simplification of the reaction diffusion equation to a linear ordinary differential equation.

The χ2 values for the estimated parameters should be distributed according to a χ2(N − P) distribution, where N denotes the number of available measurements and P parameters are estimated. Figure 6.9(a) shows the distribution of the corresponding p-values computed from the estimated χ2 values for the different stable isotope experiments. Ideally, if the model used for the parameter estimation is correct, these values should be uniformly distributed. That this is not the case is due to the error structure of the measurements.



Figure 6.8: Experimental data of a stable isotope experiment for a bolus injection of phenylalanine labelled amino acids, together with the predictions of the model (6.8). Measured are the sums of the two labelled, VLDLobs,p, and unlabelled, VLDLobs, VLDL compartments, the corresponding IDL compartments, and six labelled, LDLobs,p_i, and unlabelled, LDLobs_i, total LDL concentrations.



Figure 6.9: (a) Distribution of the probability of the estimated χ2(θ) values for all available stable isotope experiments. (b) Dynamic behaviour of the two unobserved VLDL compartments, their sum and the measured total VLDL concentration.

As discussed in Section 6.1.2, there are two sources of observational noise. The technical GC-MS data variability is accounted for by a Poisson distribution, due to the high intensity counting process involved. The fraction of labelled and unlabelled apoB is then multiplied by the total apoB concentration. Since this total concentration is only measured once, the errors introduced at the different time points in this step are not independent. Hence, the resulting χ2 values are smaller than in the case of independent observational noise.

Figure 6.10: (a) Dynamic behaviour of the three independent LDL compartments x1, x2 and x3 and (b) the corresponding particle density distributions nxi(ρ) for these compartments, normalised to unit area.



Due to the complex structure of the observation function g, which only measures the sum of both VLDL compartments, the temporal evolution of these components is unknown. However, using the estimated parameters, the mathematical model can be used to reconstruct the dynamics of these components. Figure 6.9(b) shows the time evolution of the two VLDL components as predicted by the model, their sum, and the measured sum of the two compartments. As expected, the first VLDL compartment is stimulated much faster than the second one, which basically leads to a slower decrease of the combined signal.
The observation function in the LDL density region is even more complex: here, integrals over three highly overlapping density distributions are measured. Figure 6.10(a) shows the reconstructed dynamics of the three independent LDL compartments. The estimated density distributions of the three LDL compartments for this patient are shown in Figure 6.10(b). It can be seen that the three compartments have a large overlap and can thus not be separated experimentally.

6.4 Conclusion

We presented a parsimonious model of the lipoprotein metabolism, which was able to explain the experimental data from stable isotope experiments. This model was based on six different lipoproteins, constituting the delipidation cascade. Note that earlier modelling efforts used between two and six VLDL compartments, one or two IDL compartments and up to seven LDL compartments.
The reduction of the number of VLDL compartments was based on avoiding the steady state assumption. By using a time dependent apoB production rate of the liver, a model with only two VLDL compartments explained the experimental data better than a model with six VLDL compartments based on this assumption.
The reduction of the number of LDL compartments was possible due to a parametric model of the LDL density distribution. Because of experimental limitations, only the total lipoprotein concentration in several distinct mass density ranges can be measured. This results in a complicated observation function, which integrates over the three LDL density distributions. Nevertheless, it was possible to reconstruct both the LDL density distributions and the temporal evolution of the three LDL compartments using time resolved measurements of the delipidation cascade.
In the future this model will be further simplified for specific subgroups of patients sharing a common type of lipoprotein metabolism. The estimated parameters in these models can then be used to investigate the effects of newly developed drugs on the lipoprotein metabolism.


Part II

Option pricing theory



Chapter 7

Introduction to option pricing

In the following, some of the basic concepts of option pricing theory are briefly reviewed to introduce the notation needed in the next chapter, where a new approach to option pricing based on a master equation is developed.

7.1 Derivatives

Current financial markets use a large variety of so-called derivatives. The value of these financial instruments is based on the value of the underlying, e.g. stock values. In the following we will concentrate on options; for a review of other derivatives see Ref. [96].
Basically, an option is an agreement between two parties allowing the purchaser to buy or sell a specified quantity of the underlying at a fixed price at a specified time or time range in the future. If the buyer of an option has the right to sell the underlying, the option is called a put option, while a call option gives the buyer the right to buy the underlying. Since the buyer has the right to exercise the option, the buyer of an option is in the so-called long position, while the provider of the option is in the so-called short position. European options can only be exercised at a given point in time, while for American options the buyer can exercise the option within a pre-specified time range. Since the option gives the buyer the right to decide whether the option is exercised, the seller of an option charges a price for this right, the option price.
There are two main uses for options: speculation and hedging. If someone speculates that, e.g., a stock will rise by 10% within the next year, there are two possibilities: either buying the stock now and holding it for one year, or buying a call option with a strike equal to the current price of the stock and a maturity of one year. Buying the option is much more favourable, since the option is much cheaper than the stock and hence the percentage return is much larger. In addition, if the assumption is wrong, the possible loss is limited by the option price. On the other hand, options can be used as insurance; this is known as hedging. If a company is in need of, e.g., 1000 pork bellies next month, a call option for this quantity can be used to limit the maximal amount of money needed to buy them.

85

Page 94: Analysis and Dynamic Modelling of Complex Systems

86 Introduction to option pricing

buy them. However, if the price is below the exercise price the option is not exercised.In both cases the option price can be regarded as assurance fee.
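As a purely hypothetical numerical illustration of the leverage involved (the figures are not from the original text): suppose a stock trades at 100 and a one-year at-the-money call costs 5. If the stock indeed rises by 10%, the stockholder gains 10%, while the option pays off 10 on a stake of 5, a return of 100%; if the stock falls instead, the option buyer loses at most the 5 paid for the option.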

7.2 Standard pricing theory

This section deals with the determination of the fair value of the option price. After briefly revisiting the assumptions underlying the standard option pricing theory, the Black–Scholes equation for the fair value of the option price is derived.

7.2.1 Assumptions

Usually at least the following assumptions are made in the derivation of the fair option price.

1. No arbitrage:
If there were a possibility for arbitrage, it would be exploited immediately until the markets are in equilibrium.

2. Markets are liquid:
It is always possible to buy and sell an arbitrary amount of all financial instruments.

3. Markets are efficient:
All available information concerning an asset is immediately incorporated into its price.

4. No transaction costs.

The fourth assumption is of a more technical nature and simplifies the mathematical treatment, but it can be dropped in principle. However the first three assumptions constitute the heart of the option pricing theory presented below. Especially the efficient market hypothesis, according to which future expectations of a stock are instantaneously priced into the stock value, implies that the stock value follows some sort of random walk [97, 98]. Interestingly this behaviour was already found in 1900 by Louis Bachelier in his PhD thesis [5], before Einstein and Wiener introduced Brownian motion.

7.2.2 Black–Scholes equation

The standard textbook derivation of the Black–Scholes equation assumes a geometric Brownian motion for the underlying asset, which is described by the stochastic differential equation

ds = µs dt + σs dW . (7.1)

Page 95: Analysis and Dynamic Modelling of Complex Systems

7.2 Standard pricing theory 87

Here s is the asset value, and µs and σ²s² denote the drift and variance of the random walk. Since dW is the increment of a Wiener process the resulting trajectory will be continuous. For more realistic models, instantaneous jumps of the underlying, described by additional stochastic terms based on Poisson processes [99, 100], may be included in (7.1). These so called jump diffusion models are used e.g. to incorporate the effect of information about stocks which arrives at random times [101].

The option price at time t, denoted by V(S, t), will thus also be a stochastic variable. Using Ito's lemma one obtains from Eq. (7.1)

dV = \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \frac{\partial V}{\partial S}\, dS . (7.2)

The basic idea of the Black-Scholes pricing theory [102] is to construct a risk-free portfolio containing this option and a variable amount of the underlying. To this end the portfolio

\Pi = V(S, t) + \Delta S , (7.3)

is introduced, where \Delta denotes the amount of the underlying in the portfolio. The change of the value of this portfolio is given by

d\Pi = \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \left( \frac{\partial V}{\partial S} + \Delta \right) dS . (7.4)

Now, by choosing

\Delta = -\frac{\partial V}{\partial S} , (7.5)

the random term from equation (7.4) vanishes and thus the portfolio is risk-free. In order to avoid arbitrage the change of the portfolio must then equal the return from a risk-free investment yielding the risk-free interest rate r

dΠ = r Π dt . (7.6)

Together with Eq. (7.4) one thus obtains a partial differential equation for the option price. The above derivation assumed a non-dividend paying stock; this assumption can be dropped to obtain the Black–Scholes equation for a dividend paying stock [101, 103]

\frac{\partial V(S, t)}{\partial t} = -\frac{\sigma^2 S^2}{2} \frac{\partial^2 V(S, t)}{\partial S^2} - (r - q)\, S\, \frac{\partial V(S, t)}{\partial S} + r\, V(S, t) . (7.7)

Here the continuous dividend yield is given by q. In order to numerically solve this partial differential equation the relevant variables are made dimensionless, i.e. one introduces

\bar{S} = \ln(S/E), \qquad \bar{V}(\bar{S}, \tau) = V(S, t)/E , (7.8)

and the time direction is reversed,

\tau = \frac{\sigma^2}{2}(T - t), (7.9)


so that the payoff function can be used as initial condition. In this way one obtains

\frac{\partial \bar{V}(\bar{S}, \tau)}{\partial \tau} = \frac{\partial^2 \bar{V}(\bar{S}, \tau)}{\partial \bar{S}^2} + (k_q - 1)\, \frac{\partial \bar{V}(\bar{S}, \tau)}{\partial \bar{S}} - k_r\, \bar{V}(\bar{S}, \tau) , (7.10)

where k_q = 2(r - q)/\sigma^2 and k_r = 2r/\sigma^2 have been introduced. With the transformation

u(\bar{S}, \tau) = \exp\left[ \frac{(k_q - 1)}{2}\,\bar{S} + \frac{(k_q - 1)^2}{4}\,\tau \right] \exp\left[ k_r \tau \right] \bar{V}(\bar{S}, \tau) , (7.11)

the above equation can be further simplified to a diffusion equation

\frac{\partial u(\bar{S}, \tau)}{\partial \tau} = \frac{\partial^2 u(\bar{S}, \tau)}{\partial \bar{S}^2} . (7.12)

If the parameters of the Black–Scholes equation (7.7) are assumed to be constant in time, then the known general solution of the above equation can be used to construct an analytic solution of the Black–Scholes equation, see e.g. [101]. Using the boundary condition

V_c(S, T) = \max(0, S(T) - E), (7.13)

for the value of a European call option with exercise price E at maturity T, one obtains the fair value of the option

V_c(S, t) = S\,N(d_1) - E\,e^{-r(T-t)}\,N(d_2). (7.14)

Here N(\cdot) denotes the cumulative normal distribution, and the parameters d_1 and d_2 are given by

d_1 = \frac{\log(S/E) + \left(r + \frac{1}{2}\sigma^2\right)(T - t)}{\sigma\sqrt{T - t}} , (7.15)

d_2 = d_1 - \sigma\sqrt{T - t} . (7.16)


Chapter 8

A master equation approach to option pricing

In this chapter a master equation approach to the numerical solution of option pricing models is developed. The basic idea of the approach is to consider the Black–Scholes equation as the macroscopic equation of an underlying mesoscopic stochastic option price variable. The dynamics of the latter is constructed and formulated in terms of a master equation. The numerical efficiency of the approach is demonstrated by means of stochastic simulation of the mesoscopic process for both European and American options [104].

8.1 Introduction

For simple cases, e.g. constant interest rate and volatility, explicit analytical solutions to the Black–Scholes equation are known [105]. However, for more involved cases one has to rely upon numerical methods. Here we are not concerned with deterministic methods, e.g. finite differences, but rather with Monte Carlo methods [106].

The idea behind the canonical Monte Carlo method is to exploit the fact that the fair value of an option is given by the present value of the expected payoff at expiry under the risk neutral measure. Thus the standard Monte Carlo approach is based upon the simulation of a geometric Brownian motion for the underlying asset until expiry. Then the payoff is computed and discounted back to the current time. By averaging over different realizations of this stochastic process the current option price is estimated [107].

Monte Carlo methods offer easy to understand numerical algorithms which can be easily applied to quite complicated, e.g. path dependent or correlated multi-asset, options [101]. In addition they allow for a straightforward inclusion of stochastic terms such as the interest rate or volatility. From a numerical point of view they are especially suited to problems with many degrees of freedom, and the algorithm can easily be run in parallel.

However, Monte Carlo simulations of the Black-Scholes equation are usually slower than comparable finite difference solutions of the partial differential equation. Another disadvantage of the standard Monte Carlo approach is that while the application to European options is straightforward, the valuation of American options is more involved. Using a generalisation of the canonical Monte Carlo method one has to assure that the early exercise condition, stating that the option price is always above the current payoff, is not violated sometime in the future. If this happens and the option value is below the payoff, the option is exercised and the option value is given by the payoff function. A Monte Carlo simulation based on the dynamics of the underlying asset has to keep track of all these points in asset time space, which makes the algorithm ineffective. Nevertheless more advanced Monte Carlo algorithms are available [108, 109, 110, 111].

Since the Monte Carlo methods described above are based on a model of the underlying dynamics of asset values, this approach could be named microscopic. In contrast we are now going to follow a different strategy, which we would like to name mesoscopic, and which has already been applied with success to simulations of turbulence in fluid dynamics [112, 113, 114], to the investigation of hydrodynamic fluctuations [115] as well as to magnetohydrodynamics [116]. The same approach has been shown to lead to fast Monte Carlo algorithms for the balance equations of nonequilibrium thermodynamics [117], for the simulation of chemical reactions [118] and reaction–diffusion processes [119].

The basic idea of the mesoscopic approach [120] is to regard the value of the option, say V, as the expectation value of an underlying fluctuating (mesoscopic) stochastic option value, say θ. A master equation for θ is constructed in such a way that the expectation value of the multivariate stochastic variable θ satisfies the macroscopic Black–Scholes equation. The stochastic process defined in terms of a master equation is easily simulated and allows for a Monte Carlo algorithm for the direct simulation of the Black–Scholes equation.
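A minimal sketch of the canonical (microscopic) Monte Carlo method described above, assuming the exact log-normal solution of the risk-neutral geometric Brownian motion (parameter values are again only illustrative):

```python
import numpy as np

def mc_call(S0, E, r, sigma, T, n_paths=100_000, seed=None):
    """Canonical Monte Carlo price of a European call: simulate the
    risk-neutral geometric Brownian motion to expiry in one exact
    log-normal step, then discount the average payoff back to t = 0."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
    payoff = np.maximum(ST - E, 0.0)          # call payoff at expiry
    return np.exp(-r * T) * payoff.mean()

# should agree with the Black-Scholes value up to the sampling error
print(mc_call(S0=70.0, E=60.0, r=0.06, sigma=0.2, T=10.0))
```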

8.2 Black–Scholes equation from a piecewise deterministic process (PDP)

In contrast to the standard derivation of the Black–Scholes equation based on a model of the underlying in terms of a Wiener process (7.1), this section demonstrates how the derivation can be paralleled by modelling the underlying in terms of a piecewise deterministic process [121]. This is possible since a diffusion process can be represented as the continuous limit of an appropriate jump process. This approach, despite describing the same dynamics as the standard approach, has the advantage of clearly demonstrating that it is quite natural to interpret the option value as a stochastic variable, and thus to interpret the Black–Scholes equation as the macroscopic expectation value of a mesoscopic stochastic option value. In addition the formulation in terms of piecewise deterministic processes (PDPs) simplifies the inclusion of additional jump processes, since one has to deal with only one type of stochastic process. Another important point is that for such processes one can easily proceed to a master equation formulation, for which powerful numerical algorithms exist, as will be shown below.

Here we describe the stochastic process of the underlying asset by a stochastic differential equation of the general form [122]

dX = g(X(t))\,dt + \sum_\alpha \left[ z_\alpha(X(t)) - X(t) \right] dN_\alpha . (8.1)

The function g(X(t)) describes the deterministic change of the random variable X(t), and the Poisson increment dN_α introduces instantaneous jumps X(t) → z_α(X(t)). The Poisson process is defined by the transition rate γ_α, which determines the expectation value 〈dN_α〉 = γ_α dt, and dN_α dN_β = δ_{α,β} dN_α. The above stochastic process describes a deterministic continuous dynamics interrupted by stochastic jumps and is thus also known as a piecewise deterministic process.

In order to derive the Black–Scholes equation in this setting, the corresponding Ito equation for PDPs for the change of an arbitrary function f(X(t), t) of this PDP, see [122],

df = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,g(X(t))\,dt + \sum_\alpha \left[ f(z_\alpha(X(t))) - f(X(t)) \right] dN_\alpha , (8.2)

has to be used. Consider now the stochastic differential equation

ds = sµ dt + δs dN+ − δs dN− , (8.3)

with 〈dN_±〉 = s²σ²/(2δs²) dt. The stochastic increments dN_± are either 0, in which case the deterministic evolution takes place, or 1, which describes an instantaneous jump of size ±δs. In the limit δs → 0 this stochastic differential equation corresponds to the Fokker–Planck equation for a geometric Brownian motion with drift sµ and variance s²σ² [123]. Hence in this limit the above equation generates the same dynamics as Eq. (7.1), but is formulated as a PDP.

For the derivation of the Black–Scholes equation one introduces a portfolio π consisting of a long position in the option v and a variable quantity ∆ of a short position in the underlying s, π = v − ∆s. The change of the portfolio π thus becomes

dπ = dv −∆ds . (8.4)

Here dv denotes the change of the option price v(s(t), t), which can be computed with the help of Eq. (8.2); one obtains

dv = \frac{\partial v}{\partial t}\,dt + \mu s\,\frac{\partial v}{\partial s}\,dt + dN_+ \left[ v(s + \delta s) - v(s) \right] + dN_- \left[ v(s - \delta s) - v(s) \right] . (8.5)

For the portfolio π to become riskless the terms containing dN_± in Eq. (8.4) have to vanish separately. Taking into account Eqs. (8.3, 8.5) this leads to the following hedging strategy

\Delta_\pm = \pm\frac{v(s \pm \delta s) - v(s)}{\delta s} \approx \frac{\partial v}{\partial s} , (8.6)


which is, up to first order in δs, the well known Black–Scholes hedge. Since the portfolio is now riskless, its return must equal the return of a risk-free investment with the same value

dπ = rπ dt = r(v − ∆s) dt , (8.7)

where r is the risk-free interest rate. Inserting Eqs. (8.5, 8.7) into Eq. (8.4) and taking the expectation value in the limit δs → 0 one obtains

\left\langle \frac{\partial v(s, t)}{\partial t} \right\rangle = -\frac{s^2\sigma^2}{2} \left\langle \frac{\partial^2 v(s, t)}{\partial s^2} \right\rangle - r s \left\langle \frac{\partial v(s, t)}{\partial s} \right\rangle + r \left\langle v(s, t) \right\rangle . (8.8)

If the expectation value V(S, t) = 〈v(s, t)〉 is identified with the macroscopic option price, then one obtains the Black–Scholes equation for a European option on a non-dividend paying asset. Hence the traditional Black-Scholes equation can be interpreted as the expectation value of a stochastic process given by Eq. (8.5). Additional non-infinitesimal jump processes can now easily be included, but this will change the hedging strategy [99, 100], and the expectation value of the option price will then follow equations similar to the Black–Scholes equation.
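To make the PDP picture concrete, the following sketch (not from the original text) simulates the underlying according to Eq. (8.3): deterministic growth ds/dt = µs interrupted by jumps of size ±δs, with total jump rate s²σ²/δs². Holding the rate fixed between events is an additional small approximation for µ ≠ 0; for δs → 0, histograms of the end values approach those of the geometric Brownian motion (7.1).

```python
import numpy as np

def pdp_path(s0=70.0, mu=0.06, sigma=0.2, delta_s=1.0, T=1.0, seed=None):
    """One realization of the piecewise deterministic process (8.3)."""
    rng = np.random.default_rng(seed)
    s, t = s0, 0.0
    while True:
        w = s**2 * sigma**2 / delta_s**2       # total jump rate, both channels
        tau = rng.exponential(1.0 / w)         # waiting time to the next jump
        if t + tau > T:
            return s * np.exp(mu * (T - t))    # drift up to the horizon
        s *= np.exp(mu * tau)                  # deterministic drift phase
        s += delta_s if rng.random() < 0.5 else -delta_s   # jump dN+ or dN-
        t += tau
```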

8.3 General Theory

A general approach in statistical physics is to derive macroscopic equations from a known microscopic dynamics. Here the opposite approach is followed: given a partial differential equation for a macroscopic observable, we construct a mesoscopic stochastic process such that the expectation value of the stochastic variable is governed by the original partial differential equation. In this section the general theory needed for the construction of a stochastic process underlying a partial differential equation is presented, which in the following section is then applied to the Black-Scholes equation. For a review see [124, 125].

In order to define the underlying stochastic process the state space of the system has to be given. To this end the space variable x is discretized such that θ_λ denotes the state of the system at the discrete position x_λ, λ = 1, …, n. Hence the state space Γ of the system is given by Γ = {θ | θ ∈ R^n}. The stochastic process is defined by the joint probability distribution P = P({θ_λ}, t), giving the probability of finding the values {θ_λ} at time t. The time development of the probability density P is given by a master equation of the general form

∂_t P(θ, t) = A P(θ, t). (8.9)

The time evolution operator A is assumed to generate a multivariate stochastic Markov process,

\partial_t P(\{\theta_\lambda\}, t) = \sum_{\{\nu_\lambda\}} A(\{\theta_\lambda\}, \{\nu_\lambda\})\, P(\{\nu_\lambda\}, t). (8.10)


Time dependent expectation values of a function F : Γ → R for a known P(θ, t) are given by

\langle F \rangle = \int \mathcal{D}\theta\, F(\theta)\, P(\theta, t) . (8.11)

The operator A is defined in terms of diffeomorphic maps b acting on the state space, b : Γ → Γ, and corresponding operators B_b acting on the functions introduced above in the following way:

B_b F(\theta) = F(b^{-1}(\theta)). (8.12)

The most general operator A needed is of the form

A = c \sum_\mu [\mathrm{Det}(b_\mu)]^{-1} B_{b_\mu} - I, (8.13)

where Det(b_µ) denotes the determinant of the Jacobian matrix of the map b_µ and I denotes the identity. In order to generate a valid stochastic process the operator A has to fulfil two conditions: (i) all transition probabilities w are positive, and (ii) 〈AF〉 = 0 in order to preserve the normalisation of the probability density.

With these definitions the macroscopic equation of motion for a general observable F can now be computed. From Eq. (8.11) and 〈AF〉 = 0 one obtains

∂_t〈F〉 = 〈[F, A]〉. (8.14)

Since A is a sum of operators B_b one has to compute

[F, B_b](\theta) = B_b \left\{ B_b^{-1} F B_b(\theta) - F(\theta) \right\} = B_b \left\{ B_{b^{-1}} F B_b(\theta) - F(\theta) \right\} = B_b \left\{ F(b(\theta)) - F(\theta) \right\} . (8.15)

Inserting Eq. (8.13) into (8.14) one thus obtains

\partial_t \langle F \rangle = c \sum_\mu \left\langle [\mathrm{Det}(b_\mu)]^{-1} B_{b_\mu} \left\{ F(b_\mu(\theta)) - F(\theta) \right\} \right\rangle, (8.16)

which can be further simplified by a change of variables in the integral computing the expectation value on the right hand side of the equation. Finally one obtains

\partial_t \langle F \rangle = c \sum_\mu \left\langle F(b_\mu(\theta)) - F(\theta) \right\rangle. (8.17)

The above equation enables the computation of the macroscopic expectation value of a general observable F given an arbitrary multivariate Markov process defined by the time evolution operator A.


For a master equation formulation of the stochastic process one needs the transition probability w(\bar{\theta}, \theta), which can be computed starting from Eq. (8.9):

\partial_t P(\theta, t) = c \sum_\mu \left\{ [\mathrm{Det}(b_\mu)]^{-1} P(b_\mu^{-1}(\theta), t) - P(\theta, t) \right\}
= c \sum_\mu \int \mathcal{D}\bar{\theta} \left\{ [\mathrm{Det}(b_\mu)]^{-1} \delta(b_\mu^{-1}(\theta), \bar{\theta}) P(\bar{\theta}, t) - \delta(\bar{\theta}, b_\mu(\theta)) P(\theta, t) \right\}
= c \sum_\mu \int \mathcal{D}\bar{\theta} \left\{ \delta(\theta, b_\mu(\bar{\theta})) P(\bar{\theta}, t) - \delta(\bar{\theta}, b_\mu(\theta)) P(\theta, t) \right\} .

Introducing

w(\bar{\theta}, \theta) = c \sum_\mu \delta(\bar{\theta}, b_\mu(\theta)), (8.18)

the canonical form of the master equation is obtained

\partial_t P(\theta, t) = \int \mathcal{D}\bar{\theta} \left\{ w(\theta, \bar{\theta}) P(\bar{\theta}, t) - w(\bar{\theta}, \theta) P(\theta, t) \right\} . (8.19)

As a simple example for the general theory presented above, consider the diffusion equation (7.12). To solve this equation using a mesoscopic stochastic process, the state θ of the system is discretized such that 〈θ_λ〉 is the value of u at position x_λ = λ δl, λ = 1, …, n. Consider a stochastic process defined by the maps

a_\mu^\pm = \begin{cases} \theta_\mu \to \theta_\mu - \alpha\,\theta_\mu \\ \theta_{\mu\pm 1} \to \theta_{\mu\pm 1} + \alpha\,\theta_\mu \end{cases} , (8.20)

with arbitrary parameter α and the time evolution operator

A = \frac{1}{\alpha\,\delta l^2} \sum_\mu \frac{1}{1-\alpha} \left[ A^+_{a_\mu} + A^-_{a_\mu} \right] - 2I. (8.21)

Here the operators A^\pm_{a_\mu} are defined through the maps a_\mu^\pm by Eq. (8.12).

In order to compute the expectation value 〈θ_λ〉 one defines the projection operator F_λ by F_λ(θ) = θ_λ. With the help of the general formula (8.17) one obtains for the expectation value of F_λ

\partial_t \langle \theta_\lambda \rangle = \partial_t \langle F_\lambda \rangle = \frac{\langle \theta_{\lambda+1} \rangle - 2\langle \theta_\lambda \rangle + \langle \theta_{\lambda-1} \rangle}{\delta l^2} , (8.22)

which in the continuum limit corresponds to a discretization of the diffusion equation (7.12). Hence the partial differential equation can be solved by simulation of the stochastic process defined through the generator (8.21). This is usually done with the help of the following algorithm [126]:


Figure 8.1: Simulation of the diffusion equation (7.12) by means of the stochastic process (8.21) for two different values of α (α = 0.4 and α = 0.01). In both cases 5 realizations of the stochastic process have been averaged. The initial distribution was assumed to be a sharp Gaussian shaped peak centred at x = 0.5.

• initialise θ_λ at t = 0

• while t < t_end:

– compute a random time step τ until the next jump occurs from an exponential distribution with mean value 〈τ〉 = 1/w = α δl²/n

– apply one of the transitions a_µ^±, selected according to their relative probability

By repeating the above algorithm, different realizations of θ are generated, and thus 〈θ_λ〉 can be estimated from a sample of realizations. The results of such a simulation of (7.12) are shown in Fig. 8.1.

This stochastic process has some interesting features. As can be seen from Fig. 8.1, the parameter α can be used to control the size of the fluctuations. In earlier applications of the general theory presented above to balance equations of non-equilibrium statistical mechanics, the parameter α could be interpreted as the temperature of the physical system [124, 127]. The larger the parameter α, the larger are the fluctuations in Fig. 8.1; hence it is intuitively clear that, e.g. in a thermodynamic setting, α plays essentially the role of the temperature.

From a numerical point of view the total transition probability w = n/(α δl²), where n denotes the dimension of the state space, is constant. Thus the error made by taking a constant time step τ = 1/w in the numerical simulation vanishes as O(δt). This significantly reduces the number of random variables needed. In addition all transitions have the same probability 1/(α δl²), which does not depend on the current state of the system. This makes the random selection of a transition very efficient; otherwise algorithms as discussed in [128] have to be used.
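A minimal Python sketch of the algorithm above (not part of the original text); the uniform site selection, the constant mean time step 〈τ〉 = α δl²/n and the restriction to interior sites (a boundary-handling assumption not spelled out in the text) are the only ingredients:

```python
import numpy as np

def simulate_diffusion(n=100, dl=0.01, alpha=0.4, tau_end=2e-3,
                       n_real=5, seed=None):
    """Monte Carlo solution of the diffusion equation (7.12) via the
    jump process (8.20)/(8.21), averaged over n_real realizations."""
    rng = np.random.default_rng(seed)
    tau = alpha * dl**2 / n                  # constant time step <tau> = 1/w
    steps = int(tau_end / tau)
    acc = np.zeros(n)
    for _ in range(n_real):
        theta = np.zeros(n)
        theta[n // 2] = 1.0 / dl             # sharp initial peak, cf. Fig. 8.1
        for _ in range(steps):
            mu = rng.integers(1, n - 1)      # all transitions equally likely
            nb = mu + (1 if rng.random() < 0.5 else -1)   # a_mu^+ or a_mu^-
            jump = alpha * theta[mu]         # amount moved by the map
            theta[mu] -= jump
            theta[nb] += jump
        acc += theta
    return acc / n_real                      # estimate of <theta_lambda>
```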


8.4 Master Equation formulation of the Black-Scholes Equation

As seen in Section 8.2, the Black–Scholes equation can be interpreted as a deterministic equation which governs the expectation value of a stochastic option price. In this section we will describe how the stochastic dynamics of the option price can be formulated in terms of a master equation of the type considered in Section 8.3. For the sake of simplicity we will consider the Black-Scholes equation for a European call in the dimensionless form of Eq. (7.10). The generalisation to a put is straightforward, and the application to American options is discussed in Section 8.4.5.

The interesting price range of the underlying x = S̄ is divided into n discrete points x_λ, λ = 1, …, n, with distances ∆_λ = x_λ − x_{λ−1}, λ = 2, …, n.

8.4.1 Stochastic process with uniform discretization

The easiest possibility to construct the stochastic process is to use a uniform discretization of the state space, ∆_λ = δl. The Black-Scholes equation (7.10) consists of a diffusive part, a convective part and a part corresponding to a chemical reaction. The stochastic process underlying the Black-Scholes equation will also consist of three parts corresponding to these terms in the partial differential equation. The total time evolution operator A is thus defined by

A_{BS} = A_{diff} + A_{conv} + A_{chem}. (8.23)

It has already been shown that the diffusive part can be generated by a stochastic process defined by the maps a_µ^± introduced in Eq. (8.20) and the corresponding operators A_µ^±. For the Black-Scholes equation one needs in addition maps for the two remaining parts. The following map b_µ,

b_\mu = \begin{cases} \theta_\mu \to \theta_\mu - \alpha\,\theta_\mu \\ \theta_{\mu-1} \to \theta_{\mu-1} + \alpha\,\theta_\mu \end{cases} , (8.24)

corresponds to the convective part, while

c_\mu = \left\{\, \theta_\mu \to \theta_\mu - \alpha\,\theta_\mu \,\right\} , (8.25)

corresponds to the chemical reaction term. Hence the time evolution operators for the three parts of the stochastic process are given by

A_{diff} = \frac{1}{\alpha\,\delta l^2} \sum_\mu \frac{1}{1-\alpha} \left\{ A_\mu^+ + A_\mu^- \right\} - 2I, (8.26)

A_{conv} = \frac{k - 1}{\alpha\,\delta l} \sum_\mu \frac{1}{1-\alpha}\, B_\mu - I, (8.27)

A_{chem} = \frac{k}{\alpha} \sum_\mu \frac{1}{1-\alpha}\, C_\mu - I. (8.28)


Here the operators A_µ^±, B_µ, C_µ are defined according to the general definition (8.12) through their maps a_µ^±, b_µ, c_µ. In order to prove that the expectation value of the stochastic process whose generator is given by (8.23) really solves the Black-Scholes equation, one uses Eq. (8.17) and the projection operator F_λ(θ) = θ_λ. One then immediately obtains

\partial_t \langle \theta_\lambda \rangle = \frac{\langle \theta_{\lambda+1} \rangle + \langle \theta_{\lambda-1} \rangle - 2\langle \theta_\lambda \rangle}{\delta l^2} + (k - 1)\,\frac{\langle \theta_{\lambda+1} \rangle - \langle \theta_\lambda \rangle}{\delta l} - k\,\langle \theta_\lambda \rangle, (8.29)

which in the continuum limit δl → 0 converges towards the dimensionless Black-Scholes equation (7.10). With these definitions the transition probability w(\bar{θ}, θ) becomes

w(\bar{\theta}, \theta) = \frac{1}{\alpha\,\delta l^2} \sum_\mu \left[ \delta(\bar{\theta}, a_\mu^+(\theta)) + \delta(\bar{\theta}, a_\mu^-(\theta)) \right] + \frac{k-1}{\alpha\,\delta l} \sum_\mu \delta(\bar{\theta}, b_\mu(\theta)) + \frac{k}{\alpha} \sum_\mu \delta(\bar{\theta}, c_\mu(\theta)). (8.30)

Summarising, the Black-Scholes equation (7.10) can be numerically solved by simulating the stochastic process

\partial_t P(\theta, t) = A_{BS} P(\theta, t) = \int \mathcal{D}\bar{\theta} \left\{ w(\theta, \bar{\theta}) P(\bar{\theta}, t) - w(\bar{\theta}, \theta) P(\theta, t) \right\} . (8.31)

The only change to the algorithm from the previous section is that now in every time step one of the four possible transition channels has to be selected according to their relative probability. These probabilities are still constant in time and thus allow for a very fast selection of the time step and transition probability, as described in the previous section.

Figure 8.2 shows a typical result of a simulation averaged over 10 realizations of the stochastic process with α = 0.005. Naturally the simulation results fluctuate around the analytic solution, but these fluctuations are very small if one takes into account that only 10 realizations are averaged. The reason for this is the parameter α: the smaller α, the smaller are the fluctuations of the process and the smaller is the number of realizations needed.

One drawback of this stochastic process is that it is restricted to a uniform discretisation of the dimensionless variable S̄ from Eq. (7.10). As can be seen in Fig. 8.2, after transforming back to the original variables V and S this leads to an exponential distribution of the discrete values of the underlying, see Eqs. (7.8, 7.9). Hence most of the points are at the left border of the integration region. Since the total transition rate scales as 1/δl³, the length of the time step during the simulation and thus the overall computing time depends strongly on the discretisation of the underlying.
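One jump of this uniform-grid process can be sketched as follows (a simplified illustration, not code from the original text; boundary sites are skipped, which is an assumption). The point to note is that the channel weights from Eq. (8.30) are state independent:

```python
import numpy as np

def bs_jump_step(theta, dl, k, alpha, rng):
    """One jump of the process (8.23)/(8.30) on a uniform grid.  The four
    channels a+, a-, b (convective) and c (reactive) carry the constant
    per-site weights 1/(alpha dl^2), 1/(alpha dl^2), (k-1)/(alpha dl)
    and k/alpha, so the selection never depends on the current state."""
    n = len(theta)
    w = np.array([1 / (alpha * dl**2), 1 / (alpha * dl**2),
                  (k - 1) / (alpha * dl), k / alpha])
    channel = rng.choice(4, p=w / w.sum())   # pick a transition channel
    mu = rng.integers(1, n - 1)              # pick an interior site uniformly
    jump = alpha * theta[mu]
    theta[mu] -= jump                        # every map removes alpha*theta_mu
    if channel == 0:
        theta[mu + 1] += jump                # a_mu^+ : diffusion to the right
    elif channel == 1:
        theta[mu - 1] += jump                # a_mu^- : diffusion to the left
    elif channel == 2:
        theta[mu - 1] += jump                # b_mu : the convective map
    # channel == 3 (c_mu): pure loss, the chemical-reaction term
    return 1.0 / (w.sum() * n)               # estimate of the mean time step
```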


Figure 8.2: Time evolution of the analytic solution (continuous line) and results of a direct stochastic simulation (squares) according to Eq. (8.23) of the Black-Scholes equation for a European call with exercise price 60, σ = 0.2, r = 0.06 for five different points in time. The solution of the master equation was estimated by averaging 10 realizations with α = 0.005 at times T = 10, 7.5, 5, 2.5, 0.05.

A numerically more efficient algorithm thus has to use a uniform discretization of the underlying S, which requires far fewer points. This results in a non-uniform discretization of the dimensionless variable S̄ in Eq. (7.10).

8.4.2 Stochastic process with non-uniform discretization

To enable a non-uniform discretization of the Black-Scholes equation (7.10) one uses an arbitrary distance ∆_λ between two neighbouring points. The time evolution operator A_{chem} from the previous section does not depend on the discretization and is thus not altered. But for the diffusive and convective parts of the Black-Scholes equation new stochastic processes A^{n.u.}_{diff} and A^{n.u.}_{conv}, taking into account the non-uniform (n.u.) discretisation, have to be defined. With the definitions

r_\mu^+ = \begin{cases} \theta_\mu \to \theta_\mu - \alpha_1 \dfrac{\theta_\mu}{\Delta_{\mu-1}\Delta_\mu} \\[1ex] \theta_{\mu+1} \to \theta_{\mu+1} + \alpha_1 \dfrac{2\,\theta_\mu\,\Delta_{\mu+1}}{(\Delta_{\mu+1}+\Delta_\mu)\,\Delta_\mu\,\Delta_{\mu+1}} \end{cases} , (8.32)

r_\mu^- = \begin{cases} \theta_\mu \to \theta_\mu - \alpha_1 \dfrac{\theta_\mu}{\Delta_{\mu-1}\Delta_\mu} \\[1ex] \theta_{\mu-1} \to \theta_{\mu-1} + \alpha_1 \dfrac{2\,\theta_\mu\,\Delta_{\mu-2}}{(\Delta_{\mu-1}+\Delta_{\mu-2})\,\Delta_{\mu-2}\,\Delta_{\mu-1}} \end{cases} , (8.33)

for the diffusive and

s_\mu = \begin{cases} \theta_\mu \to \theta_\mu - \alpha_2 \dfrac{\theta_\mu}{\Delta_\mu} \\[1ex] \theta_{\mu-1} \to \theta_{\mu-1} + \alpha_2 \dfrac{\theta_\mu}{\Delta_{\mu-1}} \end{cases} , (8.34)


Figure 8.3: Three exemplary realizations of the stochastic process from Eq. (8.43) (squares) and the corresponding exact solution (continuous line) of the Black–Scholes equation with exercise price 60, σ = 0.2, r = 0.06 and T = 10.

for the convective part one obtains the two new time evolution operators

A^{n.u.}_{diff} = \frac{1}{\alpha_1} \sum_\mu \left[ \frac{1}{\mathrm{Det}(r_\mu^+)}\, R_\mu^+ + \frac{1}{\mathrm{Det}(r_\mu^-)}\, R_\mu^- \right] - 2I, (8.35)

A^{n.u.}_{conv} = \frac{k - 1}{\alpha_2} \sum_\mu \frac{1}{\mathrm{Det}(s_\mu)}\, S_\mu - I. (8.36)

The total time evolution operator A^{n.u.}_{BS} is now given by

A^{n.u.}_{BS} = A^{n.u.}_{diff} + A^{n.u.}_{conv} + A_{chem}. (8.37)

In order to prove that the expectation value 〈θ_λ〉 of the stochastic process above really solves the Black-Scholes equation, one again introduces the projection operator F_λ and makes use of the general theorem (8.17) to obtain

\partial_t \langle \theta_\lambda \rangle = \frac{\frac{2\Delta_{\lambda-1}}{\Delta_\lambda+\Delta_{\lambda-1}}\langle \theta_{\lambda+1} \rangle + \frac{2\Delta_\lambda}{\Delta_\lambda+\Delta_{\lambda-1}}\langle \theta_{\lambda-1} \rangle - 2\langle \theta_\lambda \rangle}{\Delta_{\lambda-1}\Delta_\lambda} + (k-1)\,\frac{\langle \theta_{\lambda+1} \rangle - \langle \theta_\lambda \rangle}{\Delta_{\lambda+1}} - k\,\langle \theta_\lambda \rangle. (8.38)

In the continuum limit the expectation value 〈θ_λ〉 of this stochastic process thus again solves the Black-Scholes equation, but now on a non-uniform grid. The transition probability w(\bar{θ}, θ) becomes

w(\bar{\theta}, \theta) = \frac{1}{\alpha_1} \sum_\mu \left[ \delta(\bar{\theta}, r_\mu^+(\theta)) + \delta(\bar{\theta}, r_\mu^-(\theta)) \right] + \frac{k-1}{\alpha_2} \sum_\mu \delta(\bar{\theta}, s_\mu(\theta)) + \frac{k}{\alpha} \sum_\mu \delta(\bar{\theta}, c_\mu(\theta)). (8.39)


Figure 8.4: Time evolution of the analytic solution (line) and results of a direct stochastic simulation (squares) according to Eq. (8.43), averaged over 200 realizations, of the Black-Scholes equation for a European call with exercise price 60, σ = 0.2, r = 0.06 at the time points T = 10, 7.5, 5, 2.5, 0.05.

The parameters α_i are chosen such that the relative size of the transitions is smaller than 1; hence they fulfil α < 1, α_2/∆_µ < 1 and 2α_1/∆_µ² < 1 for every possible transition µ. The smallest α_i, which is α_1, determines the scaling of the total transition probability

w = \frac{2n}{\alpha_1} + (k - 1)\,\frac{n}{\alpha_2} + \frac{k}{\alpha}. (8.40)

Since α_1 ∼ O(∆_µ²) and the total transition probability scales as O(n/α_1), the numerical effort to simulate the stochastic process scales according to O(n/∆_µ²). This is the same dependence on the discretisation as in the previous section with a stochastic process on a uniform grid. But since now the grid can be chosen appropriately, one needs far fewer grid points and the algorithm is much faster.

8.4.3 Fast stochastic process with non-uniform discretisation

The use of a non-uniform grid reduces the number of grid points to be used in a numerical simulation. But there is still an O(n/∆_µ²) ≈ O(1/∆_µ³) dependence of the total transition rate, which makes the algorithm slow. To get rid of this dependence the stochastic process corresponding to the diffusive part is again improved. In analogy to equilibrium Monte Carlo simulations of lattice systems [129, 130], where one generates a new configuration of the whole lattice in a single sweep, one defines

A^{sweep}_{diff} = \frac{1}{\alpha_1} \left[ \frac{1}{\mathrm{Det}(t^+)}\, T^+ + \frac{1}{\mathrm{Det}(t^-)}\, T^- - 2I \right], (8.41)

with the maps t^± given by

t^\pm = \prod_\mu r_\mu^\pm . (8.42)


Figure 8.5: Numerical mean square error ε of the solution of the Black–Scholes equation for a European call with exercise price 60, σ = 0.2, r = 0.06 according to the master equation (8.43) for three different ratios α_i/N. The parameter values for the simulation are α₁/N = 10⁻⁷, α₂/N = 10⁻⁵ and α₃/N = 10⁻⁴.

This process thus updates every θ_λ at once. The total time evolution operator is now given by

A^{sweep}_{BS} = A^{sweep}_{diff} + A^{n.u.}_{conv} + A_{chem}. (8.43)

For the transition probability one obtains

w(\bar{\theta}, \theta) = \frac{1}{\alpha_1}\,\delta(\bar{\theta}, t^+(\theta)) + \frac{1}{\alpha_1}\,\delta(\bar{\theta}, t^-(\theta)) + \frac{k-1}{\alpha_2} \sum_\mu \delta(\bar{\theta}, s_\mu(\theta)) + \frac{k}{\alpha} \sum_\mu \delta(\bar{\theta}, c_\mu(\theta)) . (8.44)

Since α_1 again scales as O(∆_µ²), the total transition probability now scales according to O(1/∆_µ²). This of course makes the algorithm significantly faster.

8.4.4 Analysis of the algorithm

Figure 8.3 shows three exemplary solutions of the stochastic process defined in (8.43), which corresponds to a non-uniform discretisation of the dimensionless underlying S̄. In addition to the expected random fluctuations around the exact solution, single realizations also show systematic deviations from the exact solution for a wide range of values of the underlying asset. These systematic and random fluctuations can be traced back to the two different types of stochastic processes entering Eq. (8.43). The generators A^{n.u.}_{conv} and A_{chem} describe single jump processes, where the jump process changes only one option value in asset time space; this of course causes the random fluctuations. In contrast, the systematic deviations stem from A^{sweep}_{diff}, which describes a jump process changing all option values at once.


Figure 8.6: Comparison of the computation time in seconds needed for the pricing of an option by simulation of the stochastic dynamics of the underlying asset (random walk) and by a master equation formulation of the Black–Scholes equation. Shown is the computation time vs. the numerical mean square error ε.

The numerical estimation of the option price from a sample of realizations of the stochastic process is of course not affected by this behaviour of single realizations. By averaging over several realizations the estimated option price converges towards the analytic solution, as is shown in Fig. 8.4.

Concerning the numerical error ε of the Monte Carlo simulation of the Black–Scholes equation, one has to distinguish two types of parameters entering the numerical algorithm derived in the previous section: the number of samples N used to estimate the solution, and the parameters α_i which control the size of the fluctuations. Figure 8.5 shows the mean squared error ε of a Monte Carlo simulation for three different values of the ratio α_i/N. Different simulations with the same ratio were obtained e.g. by decreasing N and all three parameters α_i by the same factor. This figure clearly shows that the error only depends on this ratio α_i/N. In addition it can be seen that the error scales with the square root of this ratio. This of course is the dependence expected for a Monte Carlo simulation.

In order to assess the numerical performance of the proposed Monte Carlo method, it is compared with the standard Monte Carlo approach based on the simulation of the dynamics of the underlying asset. The numerical error in a Monte Carlo solution has two sources. The systematic part of the error stems from the finite discretisation of the stochastic differential equation in the case of the simulation of the geometric Brownian motion, and from the finite discretisation of the state space when one solves the master equation. In addition there is a random error from the averaging over different solutions, which decays in both cases as N^(−0.5) (if α is fixed in the master equation formulation). Figure 8.6 shows the time required to achieve a given precision for the two methods in a parameter range where the systematic parts of the numerical error are comparable. As can easily be seen, the solution of the master equation is significantly faster than the standard approach.


8.4.5 American options

Up to now only European options have been considered. How can the concept of mesoscopic Monte Carlo simulations be generalised to American options? A straightforward generalisation of the algorithm simulating the time evolution of the underlying asset is quite involved, since the early exercise condition has to be fulfilled. Whenever the option price falls below the payoff function the option is exercised, and the option value is thus given by the payoff. Hence, in order to assure that the early exercise condition is not violated sometime in the future, a straightforward generalisation of this Monte Carlo simulation would have to keep track of all these points in asset time space, which would make the algorithm very ineffective. Of course, there are more advanced techniques to generalise the Monte Carlo approach to American options, see [111] for an overview.

The proposed mesoscopic approach, based on a master equation whose expectation value solves the original Black–Scholes equation, can be generalised to the valuation of American options. Since the time direction in the dimensionless Eq. (7.10) has been reversed and the payoff function is used as initial condition, the generalisation to American options is straightforward. One just has to simulate one time step according to the master equation (8.43) and average over the number of samples used to estimate the option price. Wherever this option price is below the payoff function it is replaced with the payoff, and the next time step is simulated. Hence one assures that the early exercise condition is fulfilled.

For this approach to work it is critical that the samples are averaged first and that the early exercise condition is applied thereafter. Applying the early exercise condition before averaging would introduce a bias towards higher option prices: every time a stochastic realization is by chance below the payoff function, the option value is increased. But since the option value is never decreased this way, it is intuitively clear that one would obtain higher option prices.

Fig. 8.7 shows the result of such a simulation for an American call with exercise price 60, interest rate 0.07, volatility 0.2 and a continuous dividend yield of 0.10.
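Schematically, the recipe reads as follows (a sketch; `propagate` is a hypothetical stand-in for one time step of the master equation (8.43), not an interface defined in the text):

```python
import numpy as np

def price_american(propagate, payoff, n_steps, n_samples):
    """Valuation loop for an American option as described above: evolve
    n_samples realizations by one time step, average them first, and
    only then lift the averaged price to the payoff where necessary.
    Projecting each sample before averaging would bias the price upwards."""
    v = payoff.copy()                        # payoff as initial condition
    for _ in range(n_steps):
        samples = [propagate(v.copy()) for _ in range(n_samples)]
        v = np.mean(samples, axis=0)         # first average the samples ...
        v = np.maximum(v, payoff)            # ... then enforce early exercise
    return v
```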

8.5 Summary and Outlook

In real markets the elegance of the perfect hedge of the Black–Scholes approach to option pricing is generally lost. Already a very simple model of the dynamics of the underlying stock in terms of piecewise deterministic processes shows that the option price itself may be regarded as a stochastic variable. However, the dynamics of the expectation value of this stochastic option price is governed by the Black–Scholes equation. This consideration is the motivation for the proposed master equation approach to option pricing.

The essential point of this approach is that the Black–Scholes equation is interpreted as the macroscopic equation of an underlying mesoscopic stochastic process for the stochastic option price variable.


Figure 8.7: Time evolution for an American call with exercise price 60, interest rate 0.07 and dividend yield 0.10 according to the stochastic simulation (squares) and the solution of the corresponding partial differential equation (continuous line) for the time points T = 10, 7.5, 5.

By using PDPs one can then proceed to a master equation formulation of the option pricing problem. In contrast to this, the usually used microscopic approach describes the dynamics of the underlying asset by a stochastic differential equation.

The master equation formulation of the option pricing theory offers several advantages over the standard approach. This formulation provides a general setting in which also other kinds of jump processes may be easily embedded, without altering the character of the equations. One may also include additional stochastic processes for the volatility and the interest rate and then arrive at stochastic volatility and interest rate models [131, 132]. This of course would lead to different hedging strategies. One possibility to extend the proposed approach beyond the Black–Scholes equation is to use a hedging strategy proposed in Refs. [133, 134], which is based on minimising the risk defined in terms of the variance of a wealth function.

In addition the master equation formulation also allows for the use of advanced simulation algorithms. It was shown that it is possible to construct a master equation whose transition probability is constant in time and does not depend on the current state of the system. As shown in Section 8.3, this allows for fast numerical algorithms for the computation of the option price.

Using the standard Black–Scholes equation as an example we have demonstrated how to construct such numerically efficient stochastic processes underlying the partial differential equation. As can be seen from the derivation of the general theory in Section 8.3, it is clear that this approach is not restricted to the standard Black–Scholes equation but can be applied to generalisations of the latter.

The master equation formulation of the Black–Scholes equation is numerically about a factor of two faster than the standard Monte Carlo approach. This is mainly due to the fact that the proposed mesoscopic approach does not simulate sample trajectories of the underlying asset which, after averaging, result in the option price for the initial asset value at a given time. In contrast to this, the mesoscopic approach works in the whole asset time space and generates sample option prices in the whole state space during one realization, and is hence much faster.

Another advantage of the master equation formulation of the Black–Scholes equation is that it allows for a straightforward generalisation to price American options in the same framework, by just comparing the average option price with the payoff function.

Summarising, the proposed master equation approach is located in between the standard Monte Carlo approach for the simulation of the underlying asset and the finite difference solution of the partial differential equation. This approach tries to combine the numerical efficiency of the solution of the partial differential equation with the advantages of the Monte Carlo approach. The main advantage of the Monte Carlo approach is that it can be applied to price options using also advanced stochastic models for the underlying.

Concluding, the mesoscopic approach we have formulated has been exploited to implement a master equation approach to option pricing. Of course, it will be of great interest to investigate in further work whether features of real markets correspond to the additional freedom to choose the size of the fluctuations contained in the proposed master equation formulation. One possible approach would be to introduce a volatility of the option price. Then the parameter α can be used to calibrate the stochastic process using market data. This will eventually lead to a better understanding of option pricing in real markets.


Chapter 9

Summary

The aim of this thesis is to apply the tools of physics to the analysis and dynamic modelling of complex systems in two different fields: life sciences in Part I and financial theory in Part II.

The advent of new research fields, such as molecular biology, transformed the life sciences, biology and medicine, from descriptive to experimental sciences. Despite these advances, the life sciences still largely rely on a qualitative understanding of their experiments. This is reflected e.g. in the mostly cartoon-like representations of biological pathways. However, for a deeper understanding of the fundamental principles underlying these systems, a quantitative understanding in terms of dynamic models is essential.

In the first part of this thesis, different aspects involved in a model based understanding of biological systems are discussed. In Chapter 1 the basic building blocks of biological systems and the measurement techniques used are introduced. The information stored in the DNA, which is the same in each cell, is transcribed into the precursor of proteins, the mRNA. The gene expression, i.e. the mRNA abundance, and thus the protein concentrations are different for different cell types. Proteins are the main building blocks for the functional organisation of biological pathways. High-throughput measurement techniques are available for the DNA and mRNA levels, while measurements at the protein level are still laborious, but new techniques evolve rapidly.

In Chapters 2 and 3 the statistical analysis of DNA-microarray experiments, measuring the expression of thousands of genes in parallel, is discussed. The main attention is directed towards normalisation and standardisation of the experimental data from different experiments and statistical hypothesis testing. The main sources of systematic errors and observational noise contained in the experimental data, and statistical tests for differential gene expression, are discussed in Chapter 2. DNA-microarray experiments are an important tool for screening a large number of genes for possible hypotheses. However, due to experimental limitations the inferences are of a more exploratory nature. Analysing gene expression data from 41 different kidneys, distinct known medical subgroups are validated and, additionally, indications for a previously unknown separation of 13 chronically rejected kidney transplants into two distinct entities are found. In a second experiment, the measurements of the gene expression at different time points after a 5/6 nephrectomy are used to scan for functionally related genes involved in kidney recovery of mice.

In Chapter 3 a new normalisation algorithm for DNA-microarray data is developed. Using a non-parametric approach based on so called optimal transformations, the correlation between different repeats of an experiment is maximised, and hence systematic experimental artifacts are removed. The algorithm is validated using DNA-microarray experiments and its performance is compared to other standard normalisation schemes. In addition, generalisations of the method, e.g. using least trimmed squares regression to improve its robustness, are discussed.

For a model level understanding of a biological system, time resolved quantitative measurements are performed, from which the model structure and the parameter values thereof have to be deduced. Since all measurements contain observational noise and, in general, not all components are observable, the parameters have to be estimated. In Chapter 4 different criteria for the information content of an experiment regarding the parameters to be estimated are discussed. Using the MAP-Kinase pathway, which is known to be involved in a variety of different regulatory modules, as an example, the prospects of optimal experimental design considerations are demonstrated. It is shown that an improved experimental design drastically reduces the parameter estimation error.

Since biological systems have to cope with changing environments and noisy input signals, the functionality of critical components often depends only weakly on the precise tuning of reaction rates. This behaviour, the robustness of biological systems, limits the amount of information about the system present in an observation. Bacterial chemotaxis, a mechanism used to produce a directed movement of a bacterium towards a chemical attractant, is well known for perfect adaptation. Here the steady state tumbling frequency of the system does not depend on the chemical concentration of the attractant, and hence only gradients are sensed. Using bacterial chemotaxis as an example it is shown that, although robustness is present, the parameters can be estimated reliably if the response of the steady state tumbling frequency to a change in the concentration of the chemical attractant is measured. Hence, by changing the observation function, e.g. measuring the time dependent tumbling frequency instead of the steady state tumbling frequency in this example, the parameters can be reliably estimated since, in general, only a small subset of the system properties shows such a robust behaviour.

Chapter 6 exemplifies the model building procedure using the lipoprotein metabolism. Lipoproteins produced in the liver are assembled into large apoB molecules which bind to the cell surface and partly deliver their cargo, reducing their size. This delipidation cascade from the large VLDL to the medium sized IDL and the small LDL apoB molecules is observed using stable isotope experiments. It is demonstrated that a model used up to now fails to explain the experimental data. A generalised parsimonious model, including a time dependent production rate of the liver, is derived. Although this model uses fewer compartments than the previously available models of the lipoprotein metabolism, it describes the experimental data satisfactorily.


The small LDL apoB molecules cannot be observed directly, since the particle size distributions of the three metabolically different LDL apoB subgroups overlap strongly. The measurements taken in the LDL density range thus contain contributions from different LDL subgroups. Using a parametrisation of the LDL particle size distribution, it was possible to reconstruct both the unknown parameters of the particle size distribution and the dynamics of the three different LDL species. These insights are only possible with the help of the developed mathematical model, since experimentally a separation of the three LDL species is not possible. These results are now used to assess the working principle of newly developed drugs.

In Part II a generalisation of the standard option pricing theory is developed. Options are financial instruments based on the value of a so called underlying. An option gives the buyer the right to buy or sell a certain amount of the underlying at a pre-specified price at some point in the future. Options are important financial tools, widely used for speculation or hedging. Hedging is used to secure future financial transactions by paying an assurance fee, the option price. Usually the fair value of the option is calculated using the Black–Scholes equation introduced in Chapter 7. Monte Carlo methods for the option price based on the microscopic dynamics of the underlying are available, but the application of this algorithm to American options is difficult.

In Chapter 8 a mesoscopic approach to option pricing theory based on a master equation for a stochastic option price variable is developed. It is demonstrated that this approach is numerically about a factor of two faster than the standard Monte Carlo method. In addition this approach offers conceptual advantages, since the inclusion of additional jump processes, describing the effects of information about stocks arriving at discrete times, is straightforward and can be done without altering the character of the equations.

The mesoscopic stochastic process is designed such that the expectation value of the option price is governed by the Black–Scholes equation. There is no information about higher order moments of the option price available. However, the proposed process contains an additional degree of freedom, which in a physical setting would correspond to the temperature of the system. It will be of great interest to investigate in further work whether features of real markets correspond to this additional freedom to choose the size of the fluctuations. This will eventually lead to a better understanding of option pricing in real markets.


Appendix A

Bacterial Chemotaxis

The complex e^{u/o}_m occurs in five different methylation levels, m = 0, …, 4, and can be occupied (o) or unoccupied (u) by a ligand. The 30 reaction equations for bacterial chemotaxis read as follows:

\partial_t e^u_m(t) = -k_l\, l(t)\, e^u_m(t) + k_{-l}\, e^o_m(t) + (1-\delta_{mM}) \left\{ -a_r \alpha_m e^u_m(t) r(t) - a^*_r (1-\alpha_m) e^u_m(t) r(t) + d_r [e^u_m r](t) + k_b [e^u_{m+1} b](t) \right\} + (1-\delta_{m0}) \left\{ -a_b \alpha_m e^u_m(t) b(t) + d_b [e^u_m b](t) + k_r [e^u_{m-1} r](t) \right\}, (A.1)

\partial_t e^o_m(t) = k_l\, l(t)\, e^u_m(t) - k_{-l}\, e^o_m(t) + (1-\delta_{mM}) \left\{ -a_r \alpha_m e^o_m(t) r(t) - a^*_r (1-\alpha_m) e^o_m(t) r(t) + d_r [e^o_m r](t) + k_b [e^o_{m+1} b](t) \right\} + (1-\delta_{m0}) \left\{ -a_b \alpha_m e^o_m(t) b(t) + d_b [e^o_m b](t) + k_r [e^o_{m-1} r](t) \right\}, (A.2)

\partial_t [e^u_m b](t) = (1-\delta_{m0}) \left\{ -k_l\, l(t)\, [e^u_m b](t) + k_{-l} [e^o_m b](t) - (k_b + d_b) [e^u_m b](t) + a_b \alpha_m e^u_m(t) b(t) \right\}, (A.3)

\partial_t [e^o_m b](t) = (1-\delta_{m0}) \left\{ k_l\, l(t)\, [e^u_m b](t) - k_{-l} [e^o_m b](t) - (k_b + d_b) [e^o_m b](t) + a_b \alpha_m e^o_m(t) b(t) \right\}, (A.4)

\partial_t [e^u_m r](t) = (1-\delta_{mM}) \left\{ -k_l\, l(t)\, [e^u_m r](t) + k_{-l} [e^o_m r](t) - (k_r + d_r) [e^u_m r](t) + a_r \alpha_m e^u_m(t) r(t) + a^*_r (1-\alpha_m) e^u_m(t) r(t) \right\}, (A.5)

\partial_t [e^o_m r](t) = (1-\delta_{mM}) \left\{ k_l\, l(t)\, [e^u_m r](t) - k_{-l} [e^o_m r](t) - (k_r + d_r) [e^o_m r](t) + a_r \alpha_m e^o_m(t) r(t) + a^*_r (1-\alpha_m) e^o_m(t) r(t) \right\}. (A.6)



The following parameters are used during the simulations, see Ref. [77].

k_l = 1 [ms µM]⁻¹            k_{−l} = 1 [ms]⁻¹
a_r = a*_r = 80 [s µM]⁻¹     d_r = 100 s⁻¹
k_r = 0.1 s⁻¹                d_b = 1000 s⁻¹
k_b = 0.1 s⁻¹                a_b = 800 [s µM]⁻¹
E_T = 10⁴ molecules/cell     R_T = 200 molecules/cell
B_T = 2000 molecules/cell    N = 7.14 × 10¹⁴ cells/liter

α^u_0 = 0    α^u_1 = 0.1    α^u_2 = 0.5    α^u_3 = 0.75    α^u_4 = 1
α^o_0 = 0    α^o_1 = 0      α^o_2 = 0.1    α^o_3 = 0.5     α^o_4 = 1
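For illustration, the right-hand side of Eqs. (A.1)–(A.6) can be assembled compactly. The sketch below (not part of the original text) assumes that α_m in the equations stands for α^u_m or α^o_m according to the occupation state, and that the free concentrations l(t), r(t) and b(t) are supplied externally:

```python
import numpy as np

M = 4  # highest methylation level

# rate constants from the table above (units as given there)
P = dict(kl=1.0, k_l=1.0, ar=80.0, ars=80.0, dr=100.0, kr=0.1,
         db=1000.0, kb=0.1, ab=800.0,
         au=[0.0, 0.1, 0.5, 0.75, 1.0],     # alpha^u_m
         ao=[0.0, 0.0, 0.1, 0.5, 1.0])      # alpha^o_m

def chemotaxis_rhs(eu, eo, eub, eob, eur, eor, l, r, b, p=P):
    """Time derivatives of all 30 species of Eqs. (A.1)-(A.6); the six
    species arguments are arrays indexed by the methylation level m."""
    deu, deo = np.zeros(M + 1), np.zeros(M + 1)
    deub, deob = np.zeros(M + 1), np.zeros(M + 1)
    deur, deor = np.zeros(M + 1), np.zeros(M + 1)
    for m in range(M + 1):
        # ligand binding and unbinding of the free complexes
        deu[m] += -p['kl'] * l * eu[m] + p['k_l'] * eo[m]
        deo[m] += +p['kl'] * l * eu[m] - p['k_l'] * eo[m]
        if m < M:  # (1 - delta_mM) terms: CheR binding, return from m+1
            cu = p['ar'] * p['au'][m] + p['ars'] * (1 - p['au'][m])
            co = p['ar'] * p['ao'][m] + p['ars'] * (1 - p['ao'][m])
            deu[m] += -cu * eu[m] * r + p['dr'] * eur[m] + p['kb'] * eub[m + 1]
            deo[m] += -co * eo[m] * r + p['dr'] * eor[m] + p['kb'] * eob[m + 1]
            # CheR-bound complexes, Eqs. (A.5), (A.6)
            deur[m] += (-p['kl'] * l * eur[m] + p['k_l'] * eor[m]
                        - (p['kr'] + p['dr']) * eur[m] + cu * eu[m] * r)
            deor[m] += (+p['kl'] * l * eur[m] - p['k_l'] * eor[m]
                        - (p['kr'] + p['dr']) * eor[m] + co * eo[m] * r)
        if m > 0:  # (1 - delta_m0) terms: CheB binding, return from m-1
            deu[m] += (-p['ab'] * p['au'][m] * eu[m] * b
                       + p['db'] * eub[m] + p['kr'] * eur[m - 1])
            deo[m] += (-p['ab'] * p['ao'][m] * eo[m] * b
                       + p['db'] * eob[m] + p['kr'] * eor[m - 1])
            # CheB-bound complexes, Eqs. (A.3), (A.4)
            deub[m] += (-p['kl'] * l * eub[m] + p['k_l'] * eob[m]
                        - (p['kb'] + p['db']) * eub[m]
                        + p['ab'] * p['au'][m] * eu[m] * b)
            deob[m] += (+p['kl'] * l * eub[m] - p['k_l'] * eob[m]
                        - (p['kb'] + p['db']) * eob[m]
                        + p['ab'] * p['ao'][m] * eo[m] * b)
    return deu, deo, deub, deob, deur, deor
```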


Acknowledgements

I am sincerely thankful to HD Dr. Jens Timmer for the experience of a dynamic research environment, for many encouraging discussions and his supervision of this thesis.

I would like to thank Prof. Dr. J. Honerkamp for giving me the opportunity to undertake the research for this thesis in his group.

For fruitful collaboration I would like to thank, in chronological order

• Dr. Bernd Kappler for the work done concerning materials research

• Dr. Thomas Reinheckel for the work concerning analysis of 2DE-gels

• Dr. Brigitta Rumberger, Dr. Johannes Donauer, Dr. Jochen Wilpert and Titus Sparna for the work done concerning DNA-microarray experiments

• Dr. W. Huber for interesting discussions concerning the analysis of DNA-microarray experiments.

• Prof. Christoph Borner, Lotti Egger and Prof. Thomas Dandekar for the work concerning the FAS-pathway of apoptosis.

• Dr. H. U. Voss for the work concerning normalisation of microarray experiments

• PD Dr. Francesco Petruccione for the research concerning option pricing theory

• Dr. Thorsten Müller, Dr. Karl Winkler and Dr. Manfred Baumstark concerning the investigation of the lipoprotein metabolism

• Dr. Ursula Klingmüller and Marcel Schilling for the analysis of the MAP-Kinase pathway

I am grateful to the graduate school 'Nonlinear differential equations' of the German Research Foundation (DFG) for financial support.

I wish to thank Yu-Kai The and Felix Dreher for proof-reading this thesis.

A very special thank you goes to my parents, without whom I would never have got this far.


Bibliography

[1] N. Bohr. Light and Life. Nature, 131:421–457, 1933.

[2] J. Knight. Bridging the culture gap. Nature, 419:244–246, 2002.

[3] Can physics deliver another biological revolution? Nature, 397:89, 1999.

[4] Verschläft die Physik ihre Zukunft? Physikalische Blätter, 57(10):3, 2001.

[5] L. Bachelier. Théorie de la Spéculation. PhD thesis, Annales de l'Ecole Normale Supérieure, 1900.

[6] M. Schena, editor. DNA Microarrays: A Practical Approach. Oxford University Press, 2000.

[7] J. M. Bower and H. Bolouri, editors. Computational Methods in Molecular Biology. MIT Press, 2000.

[8] P. H. O'Farrell. High resolution two-dimensional electrophoresis of proteins. J. Biol. Chem., 250(10):4007–4021, 1975.

[9] J. Klose. Protein mapping by combined isoelectric focusing and electrophoresis of mouse tissues. A novel approach to testing for induced point mutations in mammals. Humangenetik, 26(3):231–243, 1975.

[10] G. MacBeath and S. L. Schreiber. Printing proteins as microarrays for high-throughput function determination. Science, 289:1760–1763, 2000.

[11] H. Zhu, J. F. Klemic, S. Chang, P. Bertone, A. Casamayor, K. G. Klemic, D. Smith, M. Gerstein, M. A. Reed, and M. Snyder. Analysis of yeast protein kinases using protein chips. Nature Genetics, 26:283–289, 2000.

[12] D. Faller, H. U. Voss, J. Timmer, and U. Hobohm. Normalization of DNA-microarray data by non-linear correlation maximization. Journal of Computational Biology, 2003.

[13] H. Herzel, D. Beule, S. Kielbasa, J. Korbel, C. Sers, A. Malik, H. Eickhoff, H. Lehrach, and J. Schuchhardt. Extracting information from cDNA Arrays. Chaos, 11:98–107, 2001.

[14] D. Amaratunga and J. Cabrera. Analysis of data from viral DNA microchips. Journal of the American Statistical Association, 96(456):1161–1170, 2001.


[15] S. Dudoit, Y. H. Yang, M. J. Callow, and T. P. Speed. Statistical methods for identifyingdifferentially expressed genes in replicated cDNA microarray experiments. StatisticaSinica, 12:111–139, 2002.

[16] Y. H. Yang, S. Dudoit, P. Luu, and T. P. Speed. Normalization of cDNA microarray data.Technical report, Department of Statistics, UC-Berkeley, 2000.

[17] E. E. Schadt, C. Li, B. Ellis, and W. H. Wong. Feature extraction and normalization al-gorithms for high-density oligonucleotide gene expression array data. Technical Report303, Department of Statistics, UCLA, 1999.

[18] G. C. Tseng, M.-K. Oh, L. Rohlin, J. C. Liao, and W. H. Wong. Issues in cDNA mi-croarray analysis: quality filtering, channel normalization, models of variations and as-sessment of gene effects. Nuclear Acids Research, 29 (12):2549–2557, 2001.

[19] J. Schuchhardt, D. Beule, A. Malik, E. Wolski, H. Eickhoff, H. Lehrach, and H. Herzel. Normalization strategies for cDNA microarrays. Nucleic Acids Research, 28(10):e47, 2000.

[20] U. Alon, N. Barkai, D. A. Notterman, K. Gish, S. Ybarra, D. Mack, and A. J. Levine. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci., 96(12):6745–6750, 1999.

[21] M. K. Kerr and G. A. Churchill. Experimental design for gene expression microarrays. Biostatistics, 2:183–201, 2001.

[22] A. Zien, T. Aigner, R. Zimmer, and T. Lengauer. Centralization: a new method for the normalization of gene expression data. Bioinformatics, 17:323–331, 2001.

[23] W. S. Cleveland. Robust locally weighted regression and smoothing scatterplots. J. Amer. Statist. Assoc., 74:829–836, 1979.

[24] D. R. Cox and D. V. Hinkley. Theoretical Statistics. Chapman & Hall, 1994.

[25] J. P. Shaffer. Multiple hypothesis testing. Annu. Rev. Psychol., 46:561–584, 1995.

[26] Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Royal Stat. Soc. B, 57(1):289–300, 1995.

[27] J. D. Storey. False Discovery Rates: Theory and Applications to DNA Microarrays. PhD thesis, Department of Statistics, Stanford University, 2002.

[28] J. D. Storey. A direct approach to false discovery rates. J. R. Statist. Soc. B, 64:479–498, 2002.

[29] J. Donauer, B. Rumberger, M. Klein, D. Faller, J. Wilpert, T. Sparna, G. Schieren, R. Rohrbach, J. Timmer, P. Pisarski, G. Kirste, and G. Walz. Expression profiling on chronically rejected transplant kidneys. Transplantation, 2003.


[30] P. Gerke, B. Rumberger, O. Vonend, J. Wilpert, M. Bek, J. Donauer, D. Faller, T. Sparna, M. Klein, K. Amann, R. Rohrbach, H. Pavenstädt, J. Timmer, and G. Walz. Oxidative stress response and production of collagen are normal adaptive changes after renal ablation. Submitted, 2003.

[31] T. F. Cox and M. A. A. Cox. Multidimensional Scaling. Chapman & Hall, 1994.

[32] P. D’Haeseleer, S. Liang, and R. Somogyi. Genetic network inference: From co-expression clustering to reverse engineering. Bioinformatics, 16:707–726, 2000.

[33] M. Wahde and J. Hertz. Modeling genetic regulatory dynamics in neural development. Journal of Computational Biology, 8:429–442, 2001.

[34] V. G. Tusher, R. Tibshirani, and G. Chu. Significance analysis of microarrays applied to the ionizing radiation responses. Proc. Natl. Acad. Sci., 98:5116–5121, 2001.

[35] T. Ideker, V. Thorsson, J. A. Ranish, R. Christmas, J. Buhler, R. Bumgarner, R. Aebersold, and L. Hood. Integrated genomic and proteomic analysis of a systematically perturbed metabolic network. Science, 292:929–934, 2001.

[36] A. Rényi. On measures of dependence. Acta Math. Acad. Sci. Hungar., 10:441–451, 1959.

[37] L. Breiman and J. H. Friedman. Estimating optimal transformations for multiple regression and correlation. J. Am. Stat. Assoc., 80:580–598, 1985.

[38] M. G. Schimek, editor. Smoothing and Regression: Approaches, Computation and Application. Wiley & Sons, New York, 2000.

[39] W. Härdle. Applied Nonparametric Regression. Cambridge University Press, New York, 1990.

[40] H. U. Voss. Analyzing Nonlinear Dynamical Systems with Nonparametric Regression. In A. I. Mees, editor, Nonlinear Dynamics and Statistics, pages 413–434. Birkhäuser, Boston, 2001.

[41] T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. P. Mesirov, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, C. D. Bloomfield, and E. S. Lander. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286:531–537, 1999.

[42] P. J. Rousseeuw and A. M. Leroy. Robust regression and outlier detection. Wiley series in probability and mathematical statistics. John Wiley and Sons, 1987.

[43] P. J. Rousseeuw and K. Van Driessen. Computing LTS Regression for Large Data Sets. Technical report, University of Antwerp, 1998.

[44] H. Kitano. Systems Biology: A Brief Overview. Science, 295:1662–1664, 2002.

[45] I. Swameye, T. G. Müller, J. Timmer, O. Sandra, and U. Klingmüller. Identification of nucleocytoplasmic cycling as a remote sensor in cellular signaling by data-based modeling. Proc. Natl. Acad. Sci., 100:1028–1033, 2003.


[46] D. Faller, U. Klingmüller, and J. Timmer. Simulation methods for optimal experimental design in systems biology. Submitted, 2003.

[47] H. Pohjanpalo. System identifiability based on power series expansion of the solution. Math. Biosci., 41:21–33, 1978.

[48] J. F. Ritt. Differential equations from the algebraic standpoint. American Mathematical Society, 1932.

[49] W. T. Wu. On zeros of algebraic equations - an application of Ritt principle. Kexue Tongbao, 31:1–5, 1986.

[50] T. G. Müller. Modeling complex systems with differential equations. PhD thesis, Albert–Ludwigs Universität Freiburg, 2002. http://www.freidok.uni-freiburg.de/volltexte/556/.

[51] T. G. Müller, N. Noykova, M. Gyllenberg, and J. Timmer. Parameter identification in dynamical models of anaerobic wastewater treatment. Mathematical Biosciences, 177–178:147–160, 2002.

[52] R. A. Fisher. On an absolute criterion for fitting frequency curves. Messenger of Mathematics, 41:155–160, 1912.

[53] K. Schittkowski. Parameter estimation in systems of nonlinear equations. Numerische Mathematik, 68:129–142, 1994.

[54] L. Ljung. System Identification. Prentice Hall, 1999.

[55] G. E. P. Box and W. J. Hill. Discrimination among mechanistic models. Technometrics, 9:57–71, 1967.

[56] A. Munack. Optimization of sampling. In K. Schügerl, editor, Biotechnology, a multi-volume Comprehensive Treatise, volume 4 of Measuring, Modeling and Control, pages 251–264. VCH, Weinheim, 1991.

[57] A. Munack. Some improvements in the identification of bioprocesses. In M. N. Karim and G. Stephanopoulos, editors, Modeling and Control of Biotechnical Processes, pages 89–94. 2nd IFAC Symposium, 1992.

[58] M. Baltes, R. Schneider, C. Sturm, and M. Reuss. Optimal experimental design for parameter estimation in unstructured growth models. Biotechnol. Prog., pages 480–488, 1994.

[59] G. C. Goodwin. Identification: Experiment design. In Systems and Control Encyclopedia, volume 4, pages 2257–2264. 1987.

[60] D. E. Koshland. Switches, thresholds and ultrasensitivity. Trends in Biochemical Sciences, 12:225–229, 1987.

[61] C.-Y. F. Huang and J. E. Ferrell. Ultrasensitivity in the mitogen–activated protein kinase cascade. Proc. Natl. Acad. Sci., 93:10078–10083, 1996.


[62] A. R. Asthagiri and D. A. Lauffenburger. A computational study of feedback effects on signal dynamics in a mitogen–activated protein kinase (MAPK) pathway model. Biotechnol. Progr., 17:227–239, 2001.

[63] R. Heinrich, B. G. Neel, and T. A. Rapoport. Mathematical models of protein kinase signal transduction. Molecular Cell, 9:957–970, 2002.

[64] U. S. Bhalla, P. T. Ram, and R. Iyengar. MAP kinase phosphatase as a locus of flexibility in a mitogen–activated protein kinase signalling network. Science, 297:1018–1023, 2002.

[65] N. T. Ingolia and A. W. Murray. History matters. Science, 297:948–949, 2002.

[66] B. N. Kholodenko. MAP kinase cascade signaling and endocytic trafficking: a marriage of convenience. Trends in Cell Biology, 12(4):173–177, 2002.

[67] N. Blüthgen and H. Herzel. MAP-Kinase-Cascade: Switch, Amplifier or Feedback Controller. In R. Gauges, C. van Gend, and U. Kummer, editors, 2nd Workshop on Computation of Biochemical Pathways and Genetic Networks, pages 55–62. Logos Verlag, 2001.

[68] G. Pearson, F. Robinson, T. B. Gibson, B. Xu, M. Karandikar, K. Berman, and M. H. Cobb. Mitogen-Activated Protein (MAP) Kinase Pathways: Regulation and Physiological Functions. Endocrine Reviews, 22:153–183, 2001.

[69] J. Pouysségur, V. Volmat, and P. Lenormand. Fidelity and spatio–temporal control in MAP kinase (ERKs) signalling. Biochemical Pharmacology, 64:755–763, 2002.

[70] Numerical Algorithms Group. The NAG Fortran Library Manual, Mark 20, 2002.

[71] E. O. Doebelin. Dynamic Analysis and Feedback Control. McGraw-Hill, 1962.

[72] M. E. Csete and J. C. Doyle. Reverse engineering of biological complexity. Science, 295:1664–1669, 2002.

[73] V. Periwal and Z. Szallasi. Trading "wet–work" for network. Nature Biotechnology, 20:345, 2002.

[74] B. Schoeberl, C. Eichler-Jonsson, E. D. Gilles, and G. Müller. Computational modeling of the dynamics of the MAP kinase cascade activated by surface and internalized EGF receptors. Nature Biotechnology, 20:370–375, 2002.

[75] J. Stelling and E. D. Gilles. Robustness vs. identifiability of regulatory modules? In T. M. Yi et al., editors, Proc. 2nd International Conference on Systems Biology, pages 181–190, 2001.

[76] N. Barkai and S. Leibler. Robustness in simple biochemical networks. Nature, 387:913–917, 1997.

[77] U. Alon, M. G. Surette, N. Barkai, and S. Leibler. Robustness in bacterial chemotaxis. Nature, 397:168–171, 1999.


[78] T.-M. Yi, Y. Huang, M. I. Simon, and J. Doyle. Robust perfect adaptation in bacterial chemotaxis through integral feedback control. Proc. Natl. Acad. Sci., 97:4649–4653, 2000.

[79] W. R. Fisher. Heterogeneity of plasma low density lipoproteins: manifestations of the physiologic phenomenon in man. Metabolism, 32(3):283–291, 1983.

[80] M. A. Austin and R. M. Krauss. LDL density and atherosclerosis. J. Am. Med. Assoc., 273(2):115–121, 1995.

[81] M. W. Baumstark, W. Kreutz, A. Berg, I. Frey, and J. Keul. Structure of human low-density lipoprotein subfractions, determined by X-ray small-angle scattering. BBA-Mol Cell Biol L, 1037:48–57, 1990.

[82] C. J. Packard, T. Demant, J. P. Stewart, D. Bedford, M. J. Caslake, G. Schwertfeger, A. Bedynek, J. Shepherd, and D. Seidel. Apolipoprotein B metabolism and the distribution of VLDL and LDL subfractions. Journal of Lipid Research, 41:305–317, 2000.

[83] B. A. Griffin, M. J. Caslake, B. Yip, G. W. Tait, C. J. Packard, and J. Shepherd. Rapid isolation of low density lipoprotein (LDL) subfractions from plasma by density gradient ultracentrifugation. Atherosclerosis, 83:59–67, 1990.

[84] T. G. Müller, D. Faller, J. Timmer, M. W. Baumstark, and K. Winkler. Impact of the steady state assumption on model identification of VLDL/IDL apoB metabolism. Submitted, 2003.

[85] K. G. Parhofer, P. H. R. Barrett, J. Dunn, and G. Schonfeld. Effect of pravastatin on metabolic parameters of apolipoprotein B in patients with mixed hyperlipoproteinemia. Clin. Invest. Med., 71:939–946, 1993.

[86] C. J. Packard, T. Demant, J. P. Stewart, D. Bedford, M. J. Caslake, G. Schwertfeger, A. Bedynek, J. Shepherd, and D. Seidel. Apolipoprotein B metabolism and the distribution of VLDL and LDL subfractions. Journal of Lipid Research, 41:305–317, 2000.

[87] F. Pont, L. Duvillard, B. Vergés, and P. Gambert. Development of compartmental models in stable–isotope experiments. Arterioscler. Thromb. Vasc. Biol., 18:853–860, 1998.

[88] T. Demant, C. J. Packard, H. Demmelmair, P. Stewart, A. Bedynek, D. Bedford, D. Seidel, and J. Shepherd. Sensitive methods to study human apolipoprotein B metabolism using stable isotope–labeled amino acids. Am. J. Physiol., 270:1022–1036, 1996.

[89] T. G. Müller, D. Faller, J. Timmer, I. Swameye, O. Sandra, and U. Klingmüller. Tests for cycling in a signalling pathway. Submitted, 2003.

[90] A. Buse. The Likelihood Ratio, Wald, and Lagrange Multiplier Tests: An Expository Note. The American Statistician, 36(3):153–157, 1982.

[91] J. Wahrendorf, H. Becher, and C. C. Brown. Bootstrap comparison of non-nested generalized linear models: Applications in survival analysis and epidemiology. Appl. Statist., 36(1):72–81, 1987.


[92] Q. H. Vuong. Likelihood Ratio tests for model selection and non-nested hypotheses. Econometrica, 57(2):307–333, 1989.

[93] P. Hall and S. R. Wilson. Two guidelines for bootstrap hypothesis testing. Biometrics, 47:757–762, 1991.

[94] J. Shao. Bootstrap model selection. Journal of the American Statistical Association, 91(434):655–665, 1996.

[95] W. Sauerbrei and M. Schumacher. A bootstrap resampling procedure for model building: application to the Cox regression model. Statistics in Medicine, 11:2093–2109, 1992.

[96] H.-P. Deutsch. Derivate und interne Modelle: Modernes Risikomanagement. Schäffer-Poeschel, 2001.

[97] P. A. Samuelson. Proof that properly anticipated prices fluctuate randomly. Industrial Management Review, 1965.

[98] B. B. Mandelbrot. Forecasts of future prices, unbiased markets and martingale models. Journal of Business, 1966.

[99] R. C. Merton. Option Pricing when underlying stock returns are discontinuous. Journal of Financial Economics, 3:125–144, 1976.

[100] J. C. Cox and S. A. Ross. The valuation of options for alternative stochastic processes. Journal of Financial Economics, 3:145–166, 1976.

[101] P. Wilmott. Derivatives, The Theory and Practice of Financial Engineering. John Wiley & Sons, Chichester, 1998.

[102] F. Black and M. Scholes. The pricing of options and corporate liabilities. Journal of Political Economy, 81(3):637–654, 1973.

[103] P. Wilmott, S. Howison, and J. Dewynne. The Mathematics of Financial Derivatives, A Student Introduction. Cambridge University Press, Cambridge, 1995.

[104] D. Faller and F. Petruccione. A master equation approach to option pricing. Physica A, 319:519–534, 2003.

[105] J. C. Hull. Options, Futures, and other Derivatives. Prentice Hall International, London, 1997.

[106] B. Dupire. Monte Carlo: Methodologies and Applications for Pricing and Risk Management. Risk Books, London, 1998.

[107] P. Boyle. Options: A Monte Carlo Approach. Journal of Financial Economics, 4:323–338, 1977.

[108] F. A. Longstaff and E. S. Schwartz. Valuing American options by simulation: a simple least-squares approach. Rev. Financ. Stud., 14(1):113–147, 2001.


[109] M. Broadie and P. Glasserman. Pricing American-style Securities using Simulation. Technical report, Columbia University, 1997.

[110] P. Bossaerts. Simulation Estimators of Optimal Early Exercise. Technical report, Carnegie Mellon University, 1989.

[111] P. Boyle, M. Broadie, and P. Glasserman. Monte Carlo Methods for Security Pricing. Journal of Economic Dynamics & Control, 21:1267–1321, 1997.

[112] H.-P. Breuer and F. Petruccione. Burgers’s turbulence model as a stochastic dynamical system: Master equation and simulation. Phys. Rev. E, 47:1803–1814, 1993.

[113] H.-P. Breuer and F. Petruccione. Stochastic simulations of high-Reynolds-number turbulence in two dimensions. Phys. Rev. E, 50:2795–2801, 1994.

[114] P. Biechele, H.-P. Breuer, and F. Petruccione. Non-equilibrium Monte Carlo simulationof a decaying Navier-Stokes turbulence. Phys. Lett. A, 256:147–152, 1999.

[115] H.-P. Breuer and F. Petruccione. A master equation description of fluctuating hydrodynamics. Physica A, 192:569–588, 1993.

[116] R. Grauer and C. Marliani. Analytical and numerical approaches to structure functions in magnetohydrodynamic turbulence. Physica Scripta, T67:38–42, 1996.

[117] H.-P. Breuer, W. Huber, and F. Petruccione. Fast Monte Carlo algorithm for nonequilibrium systems. Phys. Rev. E, 53:4232–4235, 1996.

[118] H.-P. Breuer, J. Honerkamp, and F. Petruccione. A stochastic approach to complex chemical reactions. Chem. Phys. Lett., 190:199–201, 1992.

[119] H.-P. Breuer, W. Huber, and F. Petruccione. Fluctuation effects on wave propagation in a reaction–diffusion process. Physica D, 73:259–273, 1994.

[120] N. G. van Kampen. Stochastic Processes in Physics and Chemistry. North-Holland, Amsterdam, 1992.

[121] M. H. A. Davis. Markov Models and Optimization. Chapman & Hall, London, 1993.

[122] H.-P. Breuer and F. Petruccione. The Theory of Open Quantum Systems. Oxford University Press, Oxford, 2002.

[123] C. W. Gardiner. Handbook of Stochastic Methods. Springer Verlag, Berlin, 1990.

[124] H.-P. Breuer and F. Petruccione. How to build master equations for complex systems. Continuum Mech. Thermodyn., 7:439–473, 1995.

[125] H.-P. Breuer and F. Petruccione. Mesoscopic modelling and stochastic simulation of turbulent flows. In T. Funaki and W. A. Woyczynski, editors, Nonlinear stochastic PDEs: Hydrodynamic limit and Burgers’s turbulence, volume 77 of The IMA Volumes in Mathematics and its Applications, pages 261–291. Springer–Verlag, New York, 1996.


[126] D. T. Gillespie. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys., 22:403–434, 1976.

[127] H.-P. Breuer and F. Petruccione. Thermostochastics: Heat conduction and temperature fluctuations. Physica A, 209:83–95, 1994.

[128] M. Fricke and J. Schnakenberg. Monte-Carlo simulation of an inhomogeneous reaction-diffusion system in the biophysics of receptor cells. Z. Phys. B, 83:277–284, 1991.

[129] K. Binder. The Monte Carlo Method in Condensed Matter Physics. Springer Verlag,Berlin, 1995.

[130] D. P. Landau and K. Binder. A Guide to Monte Carlo Simulations in Statistical Physics. Cambridge University Press, Cambridge, 2000.

[131] J. Hull and A. White. The Pricing of Options on Assets with Stochastic Volatilities. The Journal of Finance, 42(2):281–300, 1987.

[132] L. O. Scott. Option Pricing when the Variance Changes Randomly: Theory, Estimation, and an Application. Journal of Financial and Quantitative Analysis, 22(4):419–438, 1987.

[133] J. P. Bouchaud and M. Potters. Theory of financial risks. Cambridge University Press, Cambridge, 2000.

[134] J. P. Bouchaud, M. Potters, and D. Sestovic. Hedged Monte–Carlo: low variance derivative pricing with objective probabilities. Physica A, 289:517–525, 2001.