Modeling and Simulation of Biological Networks, Volume 64

AMS SHORT COURSE LECTURE NOTES Introductory Survey Lectures

published as a subseries of Proceedings of Symposia in Applied Mathematics

http://dx.doi.org/10.1090/psapm/064

Proceedings of Symposia in

APPLIED MATHEMATICS

Volume 64

Modeling and Simulation of Biological Networks

American Mathematical Society Short Course January 10-11, 2006 San Antonio, Texas

Reinhard C. Laubenbacher, Editor

American Mathematical Society, Providence, Rhode Island

Editorial Board

Mary Pugh Lenya Ryzhik Eitan Tadmor (Chair)

LECTURE NOTES PREPARED FOR THE AMERICAN MATHEMATICAL SOCIETY SHORT COURSE

MODELING AND SIMULATION OF BIOLOGICAL NETWORKS HELD IN SAN ANTONIO, TEXAS

JANUARY 10-11, 2006

The AMS Short Course Series is sponsored by the Society's Program Committee for National Meetings. The series is under the direction of the Short Course Subcommittee of the Program Committee for National Meetings.

2000 Mathematics Subject Classification. Primary 92B05.

Library of Congress Cataloging-in-Publication Data

American Mathematical Society. Short Course, Modeling and Simulation of Biological Networks (2006 : San Antonio, Tex.)

Modeling and simulation of biological networks / Reinhard C. Laubenbacher, editor. p. cm.—(Proceedings of symposia in applied mathematics ; v. 64)

Includes bibliographical references and index. ISBN 978-0-8218-3964-5 (alk. paper) 1. Biology—Mathematical models—Congresses. 2. Computational biology—Congresses.

I. Laubenbacher, Reinhard. II. Title.

QH323.5 .A45 2006 570.1'5118—dc22 2007060770

Copying and reprinting. Material in this book may be reproduced by any means for educational and scientific purposes without fee or permission with the exception of reproduction by services that collect fees for delivery of documents and provided that the customary acknowledgment of the source is given. This consent does not extend to other kinds of copying for general distribution, for advertising or promotional purposes, or for resale. Requests for permission for commercial use of material should be addressed to the Acquisitions Department, American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294, USA. Requests can also be made by e-mail to [email protected].

Excluded from these provisions is material in articles for which the author holds copyright. In such cases, requests for permission to use or reprint should be addressed directly to the author(s). (Copyright ownership is indicated in the notice in the lower right-hand corner of the first page of each article.)

© 2007 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights except those granted to the United States Government.
Copyright of individual articles may revert to the public domain 28 years after publication. Contact the AMS for copyright status of individual articles.
Printed in the United States of America.

The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.

Visit the AMS home page at http://www.ams.org/

10 9 8 7 6 5 4 3 2 1 12 11 10 09 08 07

Contents

Introduction to Modeling and Simulation of Biological Networks vii

An introduction to reconstructing ancestral genomes LIOR PACHTER 1

Phylogenetics ELIZABETH S. ALLMAN AND JOHN A. RHODES 21

Polynomial dynamical systems in systems biology BRANDILYN STIGLER 53

An introduction to optimal control applied to immunology problems SUZANNE LENHART AND JOHN T. WORKMAN 85

Modeling and simulation of large biological, information and socio-technical systems: An interaction-based approach CHRISTOPHER L. BARRETT, KEITH BISSET, STEPHEN EUBANK, MADHAV V. MARATHE, V. S. ANIL KUMAR, AND HENNING S. MORTVEIT 101

Index 149

Introduction to Modeling and Simulation of Biological Networks

"All processes in organisms, from the interaction of molecules to the complex functions of the brain and other whole organs, strictly obey these physical laws. Where organisms differ from inanimate matter is in the organization of their systems and especially in the possession of coded information [M]."

It is the task of computational biology to help elucidate those differences. This process has barely begun, and many researchers are testing computational tools that have been used successfully in other fields for their efficacy in helping to understand biological systems. Mathematical and statistical network modeling is an important step toward uncovering the organizational principles and dynamic behavior of biological networks. At present, the workhorse of this effort comprises the standard tools of applied mathematics, which have proven successful for many problems. Undoubtedly, however, new mathematical tools will be needed to meet this challenge, and areas of mathematics not traditionally considered applicable are already contributing powerful tools.

The advent of "digital biology" has provided a rich application area for discrete mathematics. One type of problem faced by life scientists is computational: organize data into models that are explanatory and predictive. But another type of problem is conceptual. To make biological phenomena treatable with quantitative methods, we need a language to express concepts such as "self-organization" or "robustness" of biological systems. There is no telling which mathematical specialty has the right tools for this task. Both the National Science Foundation and the National Institutes of Health are investing heavily in fostering the synthesis between biology and mathematics. Progress can be made only through close collaboration between life scientists and quantitative scientists, in particular mathematicians. Altogether, the mathematical sciences face an exciting and stimulating challenge. In [C] the author argues that the relationship between mathematics and biology in the twenty-first century might rival that of mathematics and physics in the twentieth. In [S] we can find examples of new theorems inspired by biological problems.

The AMS Short Course Modeling and Simulation of Biological Networks at the 2006 Joint Annual Meetings, on which this volume is based, was intended to introduce this topic to a broad mathematical audience. The aim of the course and of this volume is to explain some of the biology and the computational and mathematical challenges we are facing. The different chapters provide examples of how these challenges are met, with particular emphasis on nontraditional mathematical approaches. The volume features a broad spectrum of networks across scales, ranging from biochemical networks within a single cell to epidemiological networks encompassing whole cities. Also, the volume is broad in the range of mathematical tools used in solving problems involving these networks.

Chapters: The first two chapters, one by Elizabeth Allman and John Rhodes, and the other by Lior Pachter, focus on the "coded information" that Mayr refers to above. They discuss mathematical tools that help analyze genome-level information, locate genes in newly sequenced genomes, and organize evolutionary information. The mathematical areas involved include statistics, discrete mathematics and algebraic geometry.

The third chapter, by Brandilyn Stigler, discusses the biochemical networks that translate genome-level information into cellular metabolism, using the example of gene regulatory networks. Constructing network-level mathematical models poses unique challenges and constitutes one of the frontiers of research in mathematical biology. The main mathematical areas discussed in this chapter are dynamical systems theory and computational algebra.

Cells assemble to form organisms, and organisms assemble to form populations. The fourth chapter, by Suzanne Lenhart and John Workman, describes a control-theoretic approach to problems in immunology, such as drug delivery. The final chapter, by Christopher Barrett, Keith Bisset, Stephen Eubank, Madhav Marathe, V. S. Anil Kumar, and Henning Mortveit, discusses an interaction-based approach to modeling in population biology and epidemiology, as well as mathematical problems associated with this modeling paradigm.

The Short Course lectures were complemented by two panel discussions:

1. The New Face of Mathematical Biology. Over the last decade mathematical biology has dramatically changed, in particular with the advent of high-throughput genomics and the need for mathematical and statistical methods to align and annotate the large number of complete genomes that are becoming available. This discussion focused on new areas of research and the central role that mathematics, including pure mathematics, can play in the life sciences.

2. Opportunities in Mathematical Biology. This discussion focused on new opportunities for mathematics students and researchers in this field. If the mathematics community embraces the fact that the twenty-first century clearly will be the century of biology, then the life sciences can play a role in twenty-first-century mathematics similar to the role of physics in the twentieth.

Acknowledgement. The editor thanks the AMS staff involved in the production of this volume and the organization of the AMS Short Course preceding it for their extremely helpful and professional services.

References

[C] J. E. Cohen, Mathematics is biology's next microscope, only better; biology is mathematics' next physics, only better, PLoS Biology 2 (12), 2004.

[M] E. Mayr, Toward a new philosophy of biology, Harvard Univ. Press, Cambridge, MA, 1988.

[S] B. Sturmfels, Can biology lead to new theorems?, Clay Mathematics Institute 2005 Annual Report.

Reinhard Laubenbacher
VIRGINIA BIOINFORMATICS INSTITUTE AT VIRGINIA TECH

Index

acyclic orientations, 118
affine variety, 60
agents, 102, 130
algebraic geometry, 36, 38, 43, 44
algebraic statistics, 7
alignment
    global multiple, 8
    multiple sequence, 3
    poset, 8
    reversal, 16
    tree, 9
ancestral genomes, 1

back substitutions, 24
base substitution, 22
    models, 27
binary hypercube, 116
Boolean model, 68
Boolean system, 59
bottom-up modeling, 54
branch length, 6
Buchberger-Möller algorithm, 62, 83
Burnside's lemma, 119

Cavender-Farris model, 5
Chinese Remainder Theorem, 60
chronic myelogenous leukemia, 95
clustering, 126
codons, 33
comparative genomics, 1
complete graph, 116
complex network, 126
complex systems, 124
complexity, 63, 67
    computational, 105
computational algebra, 60
computational universality, 116, 120
computer simulation, 101
conjugation relation, 119
coupled probabilistic timed transition system, 134
covarion model, 48
covering map, 117

data fitting, 60
data set
    knockout, 66
    wildtype, 66
degree sequence, 127
dependency graph, 57, 103, 107
discretization, 59
disease models, 133
dissimilarity, 25
    map, 14, 26
distance methods, 25
distributed systems, 105
    large-scale, 102
dynamical systems
    discrete, 104
    finite, 54, 56
    monomial, 59
    polynomial, 56, 59
        minimal, 62
    Sequential (SDS) (see also Sequential Dynamical System (SDS)), 57, 101, 103, 104, 106
    stochastic, 111
dynamics, 58, 60

edge expansion, 127
elimination ordering, 65
energy function, 111
epidemiology, 102
equivalence, 113, 118
expected pattern frequencies, 28

Felsenstein, 24, 38
    algorithm, 7
Fitch-Hartigan algorithm, 23
fixed point, 56, 107
    existence problem, 110
Flat_e(P), 41-43
Flat_v(P), 45
flattening
    edge, 42, 48
    of a joint distribution P, 42
    vertex, 45-48
forward engineering, 54

functional genomics, 111
functional linkage networks, 111

Garden of Eden, 107
    existence problem, 110
gene annotation, 111
general time reversible models (GTR), 31, 32
generalized Lagrange interpolation, 56
genetic code, 33
GM, 43, 45, 48
    model, 38, 41
    +I, 48
graph morphism, 122
Gröbner bases, 11, 61, 79, 81
    reduced, 61
group-based models, 30
GTR, 35
    +Γ, 35
    +I, 34, 35, 48
    +I+Γ, 34, 48

Hadamard conjugation, 31
hidden mutations, 24
host, 133

I_T, 37
ideal, 60
    of the points, 60
    phylogenetic, 37, 43
    zero-dimensional, 62
identifiability, 48
    of tree topologies, 48
indels, 10
independent set, 119
indirect effect, 72
input points, 60
instantaneous rate matrix, 31
interaction, 124
    graph, 103
invariable sites, 33
invariants, 40, 47, 48
    for e, 43
    edge, 43
    linear, 40
    phylogenetic, 37, 38, 40, 46, 47
    topologically informative, 40

joint distribution P of states at the leaves a_1, a_2, ..., a_n, 28
Jukes-Cantor distance, 35
Jukes-Cantor model, 5, 30, 39

Kimura 2-parameter model, 30
Kimura 3-parameter model, 31
knockout experiments, 66
    indexed time series, 66

leading term, 61
lexicographic, 65

limit cycle, 56
local interactions, 104
    composition of, 104
local rules, 103
local vertex functions, 103

malaria, 135
Markov model, 6
    chain, on a tree, 6
    general (GM), 27, 41, 43, 45-48
    pair hidden, 9
maximum likelihood, 35, 36, 47
    tree, 36
median problem, 16
minors, 42
    matrix, 45
mixture model, 32, 33
ML, 48
    inference, 48
molecular clock, 26
monomial ordering, 79
MrBayes, 49

Needleman-Wunsch algorithm, 8
neighbor joining, 14, 35
    algorithm, 26
network structure, 123
normal form, 61, 65

optimal control, 87
orbit graph G, 109
out group, 30

parsimony, 23
    score, 24
people-location graph, 125
periodic orbit, 107
permutation, 105
    existence problem, 110
phase space, 105, 107
PHYLIP, 49
phylogenetic distances, 34
phylogenetic inference
    Bayesian approaches to, 36, 48
phylogenetic trees, 21
phylogenetic variety, 37, 46
phylogenetic X-tree, 22
phylogenetics, 21
phylogenies, 21
phylogeny reconstruction, 13
population mobility, 131
Portland, Oregon, 125
power-law distribution, 127
predecessor existence problem, 110
probabilistic timed transition system (PTTS), 134
protein-signalling network, 66
proto-population, 131

random graph models, 124
rank 1 tensors, 44
rate matrix, 6
rate variation across sites, 40
rates-across-sites model, 34
reachability problem, 109
reverse engineering, 55, 59, 60

scaling problems, 129
SDS. See Sequential Dynamical System (SDS)
secant variety, 44, 45
    higher, 45, 46
segment polarity genes, 68
Segre product, 44
Segre variety, 43
Sequential Dynamical System (SDS), 57, 101, 103, 104, 106
    computational (cSDS), 105
    invertible, 107
    -morphism, 122
    phase space of, 105
    reduction of, 116
    stochastic, 134
shattering, 128
SimDemics, 103, 133
Simfrastructure, 103, 110, 130
singular value decomposition, 47
16S ribosomal RNA, 2
smallpox, 133
social network, 124
    contact, 127
soft polytomies, 23
Splits Equivalence Theorem, 41
standard monomials, 61
state set, 56
state space, 70
    graph, 58
state transition, 56
states, 56
stationary model, 32
structural measures, 126
structure, 60
support, 57
synonymous substitutions, 7
synthetic household, 132
synthetic individuals, 125
systems biology, 53, 54

telecommunication network, 125
tensor, 28
    rank, 44, 45
term order, 61, 65, 79
threshold functions, 108
    k-, 109
    inverted, 109
time series, 59
    knockout, 70
        indexed, 66
time-reversibility, 32
top-down modeling, 55
trajectory, 56
transient point, 107
TRANSIMS, 110
    micro-simulator, 110
    router, 110
transition function, 56
transport networks, 127
transportation systems, 102
transposable elements, 13
tree metric, 25

ultra-conserved elements, 4
unitary semi-ring, 121
Unweighted Pair Group Method using Arithmetic Means (UPGMA), 25
update order, 103
update sequence, 103
urban infrastructure, 130, 131
    commodity markets, 130
    interdependencies, 130
    public health, 130
    telecommunication, 130
    transportation, 130

V_T, 37, 47
validation, 129
vanishing function, 60
variable order, 61, 65
variable ordering, 71
vertex expansion, 128
virus population, 85

wildtype experiment, 66, 69, 70
wiring diagram, 55
word, 105

Y-local function, 106