11
Review A systems biology view of cancer Reinhard Laubenbacher a,f, , Valerie Hower b , Abdul Jarrah a , Suzy V. Torti c,d , Vladimir Shulaev a,f , Pedro Mendes a,e,f , Frank M. Torti d,f , Steven Akman d,f a Virginia Bioinformatics Institute, Washington St. (0477), Blacksburg, VA 24061, USA b School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332-0160, USA c Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA d Comprehensive Cancer Center, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA e School of Computer Science, University of Manchester, Manchester, England f Department of Cancer Biology, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA abstract article info Article history: Received 20 December 2008 Received in revised form 20 May 2009 Accepted 1 June 2009 Available online 6 June 2009 Keywords: Systems biology Cancer Mathematical modeling In order to understand how a cancer cell is functionally different from a normal cell it is necessary to assess the complex network of pathways involving gene regulation, signaling, and cell metabolism, and the alterations in its dynamics caused by the several different types of mutations leading to malignancy. Since the network is typically complex, with multiple connections between pathways and important feedback loops, it is crucial to represent it in the form of a computational model that can be used for a rigorous analysis. This is the approach of systems biology, made possible by new -omics data generation technologies. The goal of this review is to illustrate this approach and its utility for our understanding of cancer. After a discussion of recent progress using a network-centric approach, three case studies related to diagnostics, therapy, and drug development are presented in detail. They focus on breast cancer, B-cell lymphomas, and colorectal cancer. The discussion is centered on key mathematical and computational tools common to a systems biology approach. © 2009 Elsevier B.V. All rights reserved. Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 2. Cancer is a systems biology disease . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 3. The systems biology approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 4. A systems biology view of the hallmarks of cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 4.1. Independence from external growth signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 4.2. Insensitivity to antigrowth signaling and evasion of apoptosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 4.3. Limitless replicative potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 4.4. Sustained angiogenesis and metastasis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 5. Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 5.1. Network-based classication of breast cancer metastasis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 5.2. Prediction of oncogenes and molecular perturbation targets in B-cell lymphomas . . . . . . . . . . . . . . . . . . . . . . . . . . 136 5.2.1. The BCI interactome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 5.3. A multi-scale mathematical model of cancer, and its use in designing radiation therapies . . . . . . . . . . . . . . . . . . . . . . 137 5.3.1. Radiation therapy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 5.3.2. Simulation results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 5.3.3. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 6. Outlook and conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 Biochimica et Biophysica Acta 1796 (2009) 129139 Corresponding author. Virginia Bioinformatics Institute, Washington St. (0477), Blacksburg, VA 24061, USA. Fax: +1540 231 2606. E-mail address: [email protected] (R. Laubenbacher). 0304-419X/$ see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.bbcan.2009.06.001 Contents lists available at ScienceDirect Biochimica et Biophysica Acta journal homepage: www.elsevier.com/locate/bbacan

A Systems Biology View of Cancer

Embed Size (px)

DESCRIPTION

A review of Systems Biology application on cancer research

Citation preview

Page 1: A Systems Biology View of Cancer

Biochimica et Biophysica Acta 1796 (2009) 129–139

Contents lists available at ScienceDirect

Biochimica et Biophysica Acta

j ourna l homepage: www.e lsev ie r.com/ locate /bbacan

Review

A systems biology view of cancer

Reinhard Laubenbacher a,f,⁎, Valerie Hower b, Abdul Jarrah a, Suzy V. Torti c,d, Vladimir Shulaev a,f,Pedro Mendes a,e,f, Frank M. Torti d,f, Steven Akman d,f

a Virginia Bioinformatics Institute, Washington St. (0477), Blacksburg, VA 24061, USAb School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332-0160, USAc Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USAd Comprehensive Cancer Center, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USAe School of Computer Science, University of Manchester, Manchester, Englandf Department of Cancer Biology, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA

⁎ Corresponding author. Virginia Bioinformatics InstitE-mail address: [email protected] (R. Laubenbach

0304-419X/$ – see front matter © 2009 Elsevier B.V. Adoi:10.1016/j.bbcan.2009.06.001

a b s t r a c t

a r t i c l e i n f o

Article history:Received 20 December 2008Received in revised form 20 May 2009Accepted 1 June 2009Available online 6 June 2009

Keywords:Systems biologyCancerMathematical modeling

In order to understand how a cancer cell is functionally different from a normal cell it is necessary to assessthe complex network of pathways involving gene regulation, signaling, and cell metabolism, and thealterations in its dynamics caused by the several different types of mutations leading to malignancy. Since thenetwork is typically complex, with multiple connections between pathways and important feedback loops, itis crucial to represent it in the form of a computational model that can be used for a rigorous analysis. This isthe approach of systems biology, made possible by new -omics data generation technologies. The goal of thisreview is to illustrate this approach and its utility for our understanding of cancer. After a discussion of recentprogress using a network-centric approach, three case studies related to diagnostics, therapy, and drugdevelopment are presented in detail. They focus on breast cancer, B-cell lymphomas, and colorectal cancer.The discussion is centered on key mathematical and computational tools common to a systems biologyapproach.

© 2009 Elsevier B.V. All rights reserved.

Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1302. Cancer is a systems biology disease . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1303. The systems biology approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1324. A systems biology view of the hallmarks of cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

4.1. Independence from external growth signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1334.2. Insensitivity to antigrowth signaling and evasion of apoptosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1344.3. Limitless replicative potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1344.4. Sustained angiogenesis and metastasis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

5. Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1345.1. Network-based classification of breast cancer metastasis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1355.2. Prediction of oncogenes and molecular perturbation targets in B-cell lymphomas . . . . . . . . . . . . . . . . . . . . . . . . . . 136

5.2.1. The BCI interactome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1365.3. A multi-scale mathematical model of cancer, and its use in designing radiation therapies . . . . . . . . . . . . . . . . . . . . . . 137

5.3.1. Radiation therapy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1385.3.2. Simulation results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1385.3.3. Summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

6. Outlook and conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

ute, Washington St. (0477), Blacksburg, VA 24061, USA. Fax: +1 540 231 2606.er).

ll rights reserved.

Page 2: A Systems Biology View of Cancer

130 R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

1. Introduction

The insight that cancer has a strong genetic component has focusedattention on the search for oncogenes [1] and the pathways in whichthey participate [2]. It is well appreciated that several different types ofmutations must occur in a multi-stage process before a cell becomesmalignant, thereby altering different molecular pathways involving theaffected genes. These pathways represent a variety of molecularmechanisms relating to gene regulation, signaling, and cell metabo-lism. In order to understand how a cancer cell is functionally differentfrom a normal cell it is necessary to assess the overall network and itschanged dynamics rather than the individual pathways. Since thenetwork tends to be complex, with extensive cross-talk betweenpathways as well as important feedback loops, it is crucial to representit in the form of a computational model that can be used for a rigorousanalysis. As for most complex diseases, an additional complicationarises from the fact that for many aspects of cancer it is important tointegrate several different scales in addition to the molecular one. Forexample, for certain therapies, it might be important to also considerthe cellular scale, e.g. the cell cycle, and the tissue scale, e.g. certaineffects of angiogenesis. What emerges is a view of cancer as anexceedingly complex array of components that operate at differentspatial and temporal scales, all working together as a highly non-lineardynamic system. Recognizing dynamic signatures of this system canhelp improve diagnostics, and an understanding of the mechanismscan help with the design of sophisticated perturbations of this systemthat disrupt cancer progression.

The case for a systems biology approach to complex diseases suchas cancer or diabetes has been made eloquently in [3] and in [4,5]. Asthe authors point out in [4], “Reductionism pervades the medicalsciences and affects the way we diagnose, treat, and prevent diseases.While it has been responsible for tremendous successes in modernmedicine, there are limits to reductionism, and an alternativeexplanation must be sought to complement it.” This alternativeexplanation lies in a systems-oriented view. It is only in the last twodecades that technological advances in molecular data generation,imaging, and computing have come together to allow the realizationof such a view in a practical way.

This review is intended to highlight the usefulness of a systemsbiology approach to cancer for the clinic.We beginwith a discussion ofwhy systems biology tools are an essential part of the repertoire ofapproaches to this complex disease, using the Akt signaling pathwayas an example. We then discuss in detail three case studies related tocancer diagnostics, therapy, and drug development, providing anoverview of different mathematical and computational approaches toproblems in the framework of these case studies.

2. Cancer is a systems biology disease

Systems biology has emerged during the last decade as a powerfulnew paradigm for research in the life sciences. It is based on thepremise that the properties of complex systems consisting of manycomponents that interact with each other in non-linear, nonadditiveways cannot be understood solely by focusing on the components. Thesystem as a whole has emergent properties that are not visible at theparts level. The availability of high-throughput data generationtechnologies, such as DNA microarrays, has made it possible toapply this paradigm in molecular biology and biomedicine. Transcrip-tional data from a large part or all of the genome allows a shift in focusfrom individual genes in linear pathways to large-scale networks ofinteracting genes. In many cases, it is the properties of the aggregatenetwork, possibly interacting with the extracellular environment,which are at the heart of pathological changes, such as cancer.Moreover, the system properties of interest often are embodied in itsdynamics. For instance, changes in the structure of the networkthrough mutations or epigenetic effects can lead to changes in

network dynamics that result in different physiological properties. Inmany cases, in particular when it comes to cancer, phenomena atdifferent scales are connected, such as the molecular and tissue levelscales in tumor growth. The natural framework for the study of thestructure and dynamics of networks is through the use of mathema-tical models. This is perhaps the defining characteristic of the systemsbiology paradigm.

As an example, we discuss the Akt signaling pathway. Akt is aserine/threonine kinase whose signaling is increased in numeroustypes of cancer. Akt occurs downstream of P13 K (phosphatidylinositol3-kinase) and can be altered by many different mechanisms inmalignant cells, including PTEN (phosphatase and tensin homolog)mutation [6] and activity of a number of oncogenes including Ras [7]and Her2/neu [8]. When unstimulated, the Akt protein resides in thecytosol. Akt can translocate to the plasma membrane and issubsequently phosphorylated. This process activates Akt as a kinase,and Akt can then affect many biological processes occurring withinthe cell. For instance, Akt promotes cell survival by inhibiting pro-apoptotic proteins such as Bad (Bcl2-antagonist of cell death) [9],caspase-9 [10], and FOXO1 (forkhead box O1) [11] and by activatinganti-apoptotic proteins like Mdm2 (transformed mouse 3 T3 celldouble minute 2) [12]. The tumor suppressor TSC2 (tuberous sclerosis2) is phosphorylated by Akt, which activates the mTOR kinase [13].This causes an increase in both translation and protein synthesis whilepromoting cell growth. Metabolically, data suggests that Akt serves asregulator of glycolysis in cancer cells. Akt activity increases glucoseconsumption and causes cancer cells to rely on glucose for survival[14]. A common way to represent these interactions is in a simpleschematic cartoon, as shown in Fig. 1A.

However, the Akt pathway cannot be considered in isolation. Thereis in fact cross-talk between this pathway and several othersimplicated in different types of cancers. For example, NF-κBtranscription factors are induced by Akt signaling through itsregulation of IκB [15], thus linking the Akt pathway to NF-κB signaling[16]. Akt also cross-talks with the CCR2 (chemokine (C–C motif)receptor 2) [17] and COX2 (PTGS2) pathways [18]. Recent softwarepackages have been designed to depict the additional complexitiesrepresented by these interconnected networks, whose nodes repre-sent molecular species. An example derived using the IngenuityPathways software package is shown in Fig. 1B. It becomes clear thatan analysis and understanding of this complex network is difficult orimpossible without analytical tools.

While this graphical representation of the network providesadditional information not available by considering the individualpathways separately, it is of course still a vast simplification, since it ismerely a static representation of several dynamic processes happen-ing at the same time, with several intertwined feedback loops. Theonly way to effectively study the effect of either mutations ortherapeutic interventions is to create a quantitative model of thisnetwork which integrates the dynamics of the individual pathwaysand their interconnections andwhich can be simulated on a computer.Representing aspects of an in silico cell, the model can then be used toexplore a variety of questions. One can use it to identify network nodesthat have a particularly strong effect on network dynamics, or one cansimulate the compound effect of amulti-target drug. As an example, in[19] a mathematical model was constructed of HRG induced signalingcascades induced by HRG (histidine-rich glycoprotein) and mediatedby ErbB tyrosine kinase receptors involving the Akt and MAPKpathways and their cross-talk. These receptors play an important rolein cell proliferation and differentiation, and altered expression ormutation has been implicated in several human cancers. Themathematical model allows a detailed simulation of several differentdynamic events, resulting in temporal concentration curves. Forexample, the graph shown in Fig. 1C produced by dynamic modelingreveals that the extent of Raf-Akt cross-talk can profoundly affect themagnitude and duration of the response of ErbB to receptor

Page 3: A Systems Biology View of Cancer

Fig. 1. Different views of the Akt pathway/network. Subpanel A represents a schematic view of Akt and the other types of molecules it interacts with, based on information available in the literature. Subpanel B was constructed using IngenuityPathway and data from [60,61]. Subpanel C was obtained using an implementation of the differential equations model in [19]. The simulated concentration curves of Akt (…), ERK (__), and MEK (–) for the first 30 min after the stimulation by ahigh dose of heregulin (HRG).

131R.Laubenbacher

etal./

Biochimica

etBiophysica

Acta

1796(2009)

129–139

Page 4: A Systems Biology View of Cancer

Fig. 2. The systems biology strategy. The goal of a systems biology approach is to construct predictive mathematical models of biological systems. This figure depicts the generalstrategy to be followed. The starting point is existing biological knowledge about the system in question. It is used to construct a general mathematical framework for a model. Themodel might contain several unknown biological parameters, such as kinetic rate constants or diffusion rates. The model is refined through known parameter values andexperimental data, to which the model can be fitted through simulation. At this stage, the model is used to generate biological hypotheses that can be tested experimentally. Throughan iterative process of hypothesis and experiment, the model is refined until it accurately captures the key biological features of the system to be modeled.

132 R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

stimulation. Such an insight can clearly not be gleaned from thesimple cartoon shown in Fig. 1A or even the complex networkdepicted in Fig. 1B.

This detailed example demonstrates how one is naturally led toconsider quantitative models of complex molecular networks if onewants to comprehend the net effect of a mutation or the application ofa therapeutic. This path is even more essential if one wants to linkintracellular processes with events at the multi-cellular level, such asangiogenesis. A key feature of cancer development is encapsulated inthe interaction between intracellular networks and the extracellularenvironment. Ultimately, a comprehensive understanding of cancerhas to link events at the molecular level with those at the tissue level,organism level, and even the environment. This can only be doneusing multi-scale mathematical models that integrate quantitativeinformation at different scales. This is precisely the approach ofsystems biology. Section 4 and Case Study 3 in Section 5 illustrate thisview by placing intracellular networks in the context of theextracellular environment. Before reviewing systems biology resultsspecific to cancer, we provide a general introduction to the systemsbiology paradigm and methodological “toolbox”.

3. The systems biology approach

The ability to collect data for a large number of molecular speciesfrom the same sample, including data measuring gene expression andthe levels of many proteins and metabolites, opens the possibility ofobtaining network-level data. This type of data also makes possible amore discovery-oriented approach. Since the ultimate goal of asystems biology approach to cancer is to understand a relevantbiological network with the help of a mathematical model derivedfrom a data-mining effort on heterogeneous, system-level data,systems biology projects require a specialized experimental design,several steps for data analysis, and an iterative approach alternatingbetween computational prediction and experimental verification. Thisprocess is schematically represented in Fig. 2.

The fundamental premise of systems biology is that a completeunderstanding of a molecular network requires not just knowledge ofits parts but also of their dynamic interactions. In order to constructmeaningful system-level models, quantitative whole-cell measure-ments of the network's components as they change over time inresponse to one or more perturbations are needed. They form thebasis of a computational model of the network. The modelincorporates one or more hypotheses about the mechanisms thatare responsible for system dynamics and provides a theoreticalexplanation of the observations. Model validation with furtherexperiments tests these hypotheses. A validated model can thenserve as a tool for the generation of further hypotheses. As such, the

mathematical model represents a tool for hypothesis generation abouta system that is typically too large and complex to understand on thebasis of a static wiring diagram. Essential features of molecularnetworks, such as feedback loops and synergistic effects of differentpathways, cannot be discovered and understood without thetheoretical basis provided by the model.

Constructing dynamic mathematical models of networks is oftennot possible, however, due to the lack of appropriate time course datasets. In these cases it is still beneficial and often possible to constructand analyze static networks representing, for example, all possibleprotein–protein interactions in a cell type. Many data sets fromfunctional genomics are available for this purpose. In Section 6wewilldescribe two such examples.

Another important tool set in the systems biology repertoire isnetwork inference from system-level data, typically DNA microarraydata measuring gene transcription levels, 2D-gel, or large-scale massspectrometry data. Inference methods allow the construction ofphenomenological networks solely based on information available inthe data, without the bias of prior biological knowledge or preconceivednotions of the network structure. These phenomenological models canthen be used to focus attention on certain features and provideconstraints for building mechanistic models like the one in [19].

Sometimes network inference methods are referred to as “top-down” modeling and mechanistic models as “bottom-up.” As theseterms have begun to enter into general use, it is worth commenting ontheir distinction. The availability of high-throughput data sets such asDNA microarrays or large-scale protein and metabolite profiles hasopened up the possibility to create system-wide models of networksthrough inference of relationships between network nodes from datasets. This approach, sometimes referred to as top-down modeling, is incontrast to the more traditional approach of constructing mathematicalor statistical models by first identifying a list of molecular species andtheir interactions and then constructing a model that encodes theserelationships. Model parameters are then frequently estimated byfitting the model to available data. This approach might be calledbottom-up modeling. (An example is provided in the next case study.)The fundamental difference between the two approaches is that in thebottom-up approach one begins with a determination of whichmolecules are important and which interactions are to be considered.In the top-down approach the aim is to reverse-engineer thisinformation in an unbiased way from the system-level data. Networkinference has emerged as a central problem in systems biology [20]. Avariety of algorithms have been developed for this purpose, whoseoutput ranges from statistical models such as ARACNE and Bayesiannetworks models [21], providing correlation measures of networknodes, to discrete models, such as Boolean networks [22] and systemsof ordinary differential equations[23], giving full dynamic network

Page 5: A Systems Biology View of Cancer

133R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

models. In most cases, top-down models are phenomenological ratherthan mechanistic, like bottom-up models. The ultimate goal is tocombine the two approaches, with top-down models providingconstraints for the bottom-up approach.

Systems biology approaches to the study of cancer to date havemostly focused on the construction and understanding of themolecular networks being altered by malignant transformations. Butseveral studies have been undertaken that move the application ofsystems biology techniques closer to the clinic. While still far from thepersonalized medicine paradigm suggested by the genomics revolu-tion it is feasible to envision some of the ways in which parameters inmathematical models might depend on characteristics of individualsand determine the nature of dynamic processes related to cancerprogression or diagnosis and treatment. Relevant questions impactingclinical practice are:

• How can we use network-level analysis to improve prognosticforecasting? Case Study 1 in Section 5. suggests an answer withrespect to the prediction of the metastatic potential of breasttumors. Using a combination of DNA microarray data and protein–protein interaction information, the study proposes subnetworksof the protein–protein interaction network as predictors, ratherthan collections of individual genes.

• How can we use alterations of the molecular network in malignantcells rather than differential gene expression in order to identifyoncogenes? Case Study 2 focuses on this question in the context ofB-cell lymphomas. Using a variety of data sets and statisticaltechniques, a network is established that integrates a range ofmolecular interactions. Oncogenes in different lymphoma pheno-types are identified by the changes in their network neighborhoodrather than by differential expression.

• How can we use multi-scale dynamic models spanning the molecularand the tissue level in order to improve treatment methods? Thisquestion is addressed in Case Study 3, focused on improvedprotocols for radiation therapy. Using a dynamic mathematicalmodel that integrates the molecular, cellular, and tissue levelscales, one can study strategies for optimal timing of radiationtreatment sessions.

Table 1Systems biology studies related to cancer.

Reference Input Method

1 Data from human/mouse studies ODE2 Gene expression/healthy human Statistics3 Microarray data metastatic and

non-metastatic prostate tumorsMode-of-action network iden

4 Gene expression from both healthyand diseased pancreas tissue

Network inference using signgenes/interactions

5 Kinetics and interactions obtained fromthe literature

A multi-scale model, involvindynamic models

6 Kinetics and interactions obtained fromthe literature

Probabilistic model for geneti

7 Microarray data from breast cancer patients Network motif selection using

8 Kinetics and interactions obtained fromthe literature

ODE-based model of reaction

9 Microarray data from multiple phenotypesof lymphoma

Network inference using statitheory

10 Telomere maintenance genes Statistics/graph theory11 modeling/kinetics/biology of p53 gene

regulatory networksODE model of p53 gene regul

12 Experimental data from MCF7 breast cancercell line

ODE-based model of ErbB sig

13 Kinetics and interactions ODE/agent-based multi-scale14 MS-based phosphoproteomics of glioblastoma

cell linesNetwork inference using clus

15 Microarray data from lung tissue of c-Myctransgenic mice

Statistical network inferenceweight matrices

This table highlights several different systems biology approaches that have been used in the rsuch studies.

Before presenting the case studies that address specific instancesof these questions, we first review briefly the systems biologyliterature as it pertains to cancer. This review is not intended to becomprehensive, but rather illustrative of the systems biologyapproach.

4. A systems biology view of the hallmarks of cancer

In their much-cited review [24], Hanahan and Weinberg identifyseveral crucial acquired traits that are necessary for a cell to becomemalignant. These include the ability to mimic normal growthsignaling, the insensitivity to antigrowth signals, the ability to evadeapoptosis, limitless replicative potential, sustained angiogenesis, andtissue invasion and metastasis. Although these hallmarks focus on thetumor cell itself and do not encompass contributions of theextracellular environment, they have formed a useful framework formany systems-based approaches to the study of cancer. We discusshere progress made in understanding these hallmarks from a systemsbiology point of view. Specific studies illustrating a range of differentapproaches are presented in Table 1.

4.1. Independence from external growth signaling

As suggested in [24], there is ample evidence that most or allcancers display dysregulation of several signaling pathways resultingin the cells' acquired independence of external growth factors. Acommon alteration is the upregulation of growth factor receptors. Inparticular, the pathway of the epidermal growth factor receptor EGFRis hyperactivated in approximately 30% of all cancers, includingmalignancies of the colon, lung, pancreas, ovary, kidney, as well assome leukemias [34]. Several detailed mathematical models of thispathway have been constructed, e.g., [32,35], and analyzed.

The point is made in [24] that to properly understand thephenomenon of becoming independent of external growth signalswill require an understanding of how mutated cells interact withother cells in their microenvironment, in addition to intracellularevents. During the last decade so-called agent-based or individual-based models have become increasingly popular which provide an

Kinetic? Output

Yes Response to chemo/immunotherapyNo links HMMR expression and breast cancer

tification No Genes active in the advancement of prostate cancer

ificant No Collection of key players in angiogenesisincluding PPARδ

g different Yes Tumor characteristics are important in predictingthe efficacy of radiation therapy

c mutations Yes Probability for obtaining 2 mutations in a tumorcell varies greatly across tumors

statistical analysis No Subnetworks that discriminate between metastaticand non-metastatic breast cancer

s in a cancer cell Yes Tool in drug discovery and development

stics/information No Phenotype-specific collections of dysregulated genesand interactions

No Network involved in telomere maintenanceatory network Yes Simulation of DSB generation/repair, ATM activation,

and p53-Mdm2 feedbacknaling Yes Kinetic relationship of EGF and HRG act. of ErbB

receptors to ERK and Akt signalingmodel Yes EGFR affects brain tumor dynamicstering No EGFRvIII activated pathways and c-Met receptor as

a target in glioblastoma.using in positional No Transcriptional regulation occurring in c-Myc

overexpression

ecent literature, and provide examples of the kinds of information that can be derived in

Page 6: A Systems Biology View of Cancer

1 Agent-based models represent individuals, or agents, in a population explicitly, forinstance, individual cells in a tumor. Each agent is equipped with a set of rules thatdescribe how it interacts with its environment and with other agents. Such models arebeing used increasingly in population biology, social science, and computationalbiology.

134 R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

excellent modeling paradigm for such a multi-scale approach. Severalmathematical models of the cellular signaling pathways involving thefamily of Her receptor proteins have been constructed; see, e.g.,[19,30,36,37].

In [31], a multi-scalemodel of malignant glioma is constructed thatrepresents the interior of each cell as a system of ordinary differentialequations representing mass balance equations of molecular speciesrelated to the EGFR (Her1) interaction network. Multiple cells resideon a 2-dimensional grid on which they proliferate and move betweengrid points, based on a collection of rules. The rules incorporate cell–cell signaling and the sensing of environmental conditions such asnutrient sources. One interesting conclusion suggested by modelsimulation is that tumor dynamics cannot be explained throughmolecular processes inside the cells alone. Instead, cells with a higherreceptor density show an early increase in the switching betweenproliferative and migratory traits, suggesting a more active protein-level interaction between tumor cells.

Downstream pathways that process signals from these and otherreceptors can also be altered in malignant cells, with the SOS-Ras-Raf-MAPK cascade playing a central role. Here too a systems biologyapproach has resulted in mathematical models focusing on differentaspects of this network [38–40]. Altogether, these efforts are creatingamore andmore extensive and detailed quantitative representation ofthe inter- and intracellular signaling network that helps mutated cellsacquire self-sufficiency in growth signals. The complexity of thisnetwork renders traditional “cartoon-level” representations ofincreasingly limited value.

4.2. Insensitivity to antigrowth signaling and evasion of apoptosis

Cell growth can be arrested through either a temporary orpermanent interruption of the cell cycle, with cells being directed tothe quiescent (G0) state or to senescence. At the extreme, cells aredirected to programmed cell death, or apoptosis. For example, manyantiproliferative signals involve the retinoblastoma protein pRb andits relatives. Disruption of the pRb pathway through hypophosporyla-tion prevents antigrowth factors such as TGFβ from blocking advancethrough G1 phase. The cell cycle has been studied extensively from thepoint of view of systems biology. While there are mathematicalmodels of the cell cycle in model organisms, e.g., yeast [41], onlylimited progress has been made on a systems biology description ofthe mammalian cell cycle. However, a mathematical model was usedin [42] to study, among other things, pRb dynamics during G1 phase inmammalian cells.

Preventing terminal cell differentiation is frequently accomplishedby directly affecting the c-myc oncogene. A recent transcriptionalprofiling study has made significant progress toward identifying thetranscriptional network that regulates c-myc expression [33], usingstatistical network inference algorithms. See the review [43] for furtherprogress on understanding the function of c-myc in the epidermis.

There are several possible pathways that activate the apoptosisprogram in a cell. The most common mechanism involves the tumorsuppressor protein p53, one of the most extensively studiedmammalian proteins. Enough information and data is available tomake the p53 network a good candidate for mathematical modeling.In recent years several systems biology models have appeared. Forinstance, the papers [44,45] study the oscillatory properties of thenegative feedback loop between p53 and the oncogene Mdm2. Thisfeedback loop was investigated in [29] as a possible tool in the designof optimal radiotherapy protocols.

4.3. Limitless replicative potential

Telomere maintenance is a feature in virtually all cancer cells andis regulated by a complex cellular network. A first step towardunderstanding this network and its role in carcinogenesis in

mammalian cells is to elucidate the network topology, that is, thepattern of the interactions between the molecular species involved,and eventually its dynamics. Here too, mathematical and statisticalmethods have aided in making progress. Several authors have studiedthe network. Most recently, in [28] a general method was developedto infer a draft of the topology of the telomere length-controlnetwork from high-throughput phenotypical data. The method wasthen applied to data from gene knockout yeast strains. The nodes ofthe network are proteins and an edge indicates a protein–proteininteraction. (Networks are typically represented through the math-ematical notion of a graph. The nodes, or vertices, of the graphrepresent the molecular species involved in the network. There is adirected or undirected edge between two nodes if they interact insome way. For instance, a directed edge from gene A to gene B mightindicate that gene A (through its protein) activates gene B. Anundirected edge between two proteins might indicate a protein–protein interaction.)

4.4. Sustained angiogenesis and metastasis

To understand the ability of solid tumors to invade neighboringtissues and to stimulate the growth of blood vessels to assure nutrientsupply, it is important to understand the interplay between molecularevents inside tumor cells, such as the upregulation of the vascularendothelial growth factor (VEGF) and the nature of the angiogenicswitch, and events in the microenvironment surrounding the tumor,such as communication with neighboring cells, flow characteristics inthe nearby vasculature, and other factors favoring or discouragingtissue invasion. These different scales represent a significant challengeto a mathematically formulated systems biology view. Fortunately, inthe last decade, several modeling frameworks have been developedthat allow a multi-scale modeling approach. In particular, thecombination of continuous models, using systems of differentialequations to capture gradient flows of nutrients or signal cascades,and discrete, so-called agent-based1, models to represent collectionsof individual cells and their individual or communal actions based ontheir specific environment appears to be a very promising approach;see, e.g., [46–48]. Increased computational power is making it possiblein the near future to simulate three-dimensional multi-scale modelsof tumor growth. Preliminary results are already providing intriguinginsights and new hypotheses, such as the possibility that a tumor'smicroenvironment might have more of a selective rather than aregulatory effect on invading cancer cells [49], which departs from theconventional view [50].

5. Case studies

We now present three exemplary case studies that demonstratehow a systems-level view can provide insights of direct relevance tothe problem of human cancer. They are chosen to be representative ofdifferent clinical issues, different mathematical methods, and differ-ent cancers. The first study uses a combination of gene expression dataand a protein–protein interaction network to discover network motifsthat can be used for the prediction of developing distant metastases inbreast cancer patients. The quantitative methodology consists of astatistical analysis. The second uses a more complex network ofmolecular interactions instead of just gene expression data to identifyoncogenes in certain types of B-cell lymphomas. In addition tostatistical tools, a network inference methods based on informationtheory is used to construct the network. Finally, the third study

Page 7: A Systems Biology View of Cancer

135R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

constructs a detailedmechanisticmathematical model of the effects ofradiation therapy on cell proliferation. It uses three different coupledmodels to represent a genetic network (at the lowest level) thatregulates the cell cycle (at themiddle level), which, in turn, affects cellproliferation in the microenvironment (top level). Furthermore,conditions in this microenvironment feed back to control the generegulation network.

5.1. Network-based classification of breast cancer metastasis

Fatalities from breast cancer, the most common malignant diseasein Western women, result primarily from metastases at distant sitesrather than from the primary tumor. As described in [51], more than80% of patients currently receive adjuvant chemotherapy. Onlyapproximately 10–15% of breast cancer patients develop distantmetastases within 3 years of initial diagnosis, however, it is notuncommon that metastases appear after 10 years or more from initialdiagnosis. The availability of accurate prognostic markers would allowa more targeted application of adjuvant therapy, thereby avoidingmany unnecessary chemotherapy treatments. However, traditionalclinical markers provide an accurate prognosis for only about 30% ofpatients. With the availability of large-scale data collection throughDNA microarrays, the question was investigated whether one couldassemble a profile consisting of a relatively small number of genes thatcould be used to predict the likelihood for a given patient to developdistant metastases.

The study [52] used data from 295 patients to identify a profileconsisting of 70 genes whose differential expression was a goodpredictor of disease outcome in a group of patients 52 years oryounger, tumor-negative lymph nodes, and no previous history ofcancer. Another study, [53], identified a profile of 76 genes with goodpredictive properties in patients with lymph-node-negative breastcancer. Interestingly, the overlap between the profiles from the twostudies consisted of only 3 genes. In order to improve the predictivevalue of gene expression for this purpose it is reasonable to groupgenes according to the pathways they belong to and attempt to findpathway profiles instead. However, there are several obstacles to thisstrategy. For one, our knowledge of the many relevant molecularpathways is still very incomplete. Furthermore, most genes have notbeen assigned to individual pathways yet. And, as we have seen, thenotion of pathway is not clearly defined in the first place, and onemight better focus on networks; but these are even less wellunderstood at this time.

In this section we focus on a study [26] that proposes as analternative the combination of DNA microarray data and ourincreasingly good knowledge of the human protein–protein interac-tion network. Overlaying gene expression on top of the proteins in thisnetwork one can then search for collections of subnetworks thatprovide better discrimination between patients with good prognosisand those with poor prognosis. Chuang et al. tested this idea on thedata from the two studies [52,53], with promising results.

The mathematical model in [40] is a human protein–proteininteraction network N. Its nodes consist of a collection of 11,203human proteins. An edge between two nodes/proteins representsevidence of a physical interaction between the two proteins. Suchevidence can be obtained from the literature. In [54], for instance, aliterature mining effort resulted in 31,609 interactions between 7748human proteins. In addition, in [55] a high-throughput yeast two-hybrid system was used to obtain approximately 2800 interactionsbetween human proteins. Using a variety of literature sources, thenetwork in [26] contains 57,235 interactions.

The role of model simulation for a given set of experimental data isthe computation of certain types of subnetworks of N. The input hereis a gene expression data set consisting of a collection of DNAmicroarrays from patient samples, classified as either metastatic ornon-metastatic, which is superimposed on the nodes of the subnet-

works of the network N. A subnetwork M of N is simply a connectedsubgraph, that is, a collection of nodes and edges in N, so that any twonodes in this collection are connected by a path. We can assign a scoreS(M) to a subnetwork M through the following steps:

• After a suitable transformation, compute the average of theexpression values of the genes corresponding to all proteins inM, for each DNA microarray in the data set. We obtain a collectionX of values. We can view X as a discrete random variable.

• Let Y be the discrete randomvariable that takes on the values 1 or 2(defined by a probability distribution on these values), dependingon whether a given DNA microarray corresponds to a patient withmetastatic or non-metastatic breast cancer.

• Compute the mutual information between the random variables Xand Y to obtain S(M).

The mutual information between two random variables X and Y isused in information theory and measures the mutual dependence ofthe two variables. It is computed as follows:

MI X; Yð Þ = ΣxΣy p x; yð Þlog p x; yð Þ= p xð Þp yð Þð Þ:

Here, the sums extend over all x in X, respectively y in Y, p(x,y) is thejoint probability distribution function of X and Y, and log is thelogarithm function to the base 2. In our case, the mutual informationmeasure computes essentially how much information about Y weobtain from knowing X, that is, how much we know about theclassification of the samples if we know the activity score of thesubnetwork with respect to the given DNA microarray data set.

The approach in [26] is to now identify a collection of subnet-works that can serve as markers to discriminate between metastaticand non-metastatic cancer. This is accomplished with a heuristicsearch algorithm. For the microarray data set S1 in [52] a total of 149such networks, including a total of 618 genes, were identified asdiscriminative, and 243 were found for the data set S2 in [53], totaling906 genes. For a new expression profile, the subnetworks from S1predicted outcome with 70.1% accuracy while the subnetworks fromS2 had 72.2% accuracy. This compares favorably to the rate of 62% forsingle gene marker classification (that is, use of differential expres-sion of each of a collection of genes) of S1 in [52] and 63% for S2 in[53]. The subnetworks derived from S1 classified data from S2 with48.8% accuracy and the subnetworks from S2 classified data from S1with 55.8% accuracy. In comparison, the differential expression-basedtests in [52] and [53] could classify the other data set with 45.3% and41.5% accuracy, respectively. This shows a significant shortcoming ofthis and other methods, which seem to be dependent on the data setused to derive the subnetworks. Across data sets the subnetworkmarkers seems to perform only marginally better than single genemarkers, even though there was a significantly larger overlap insubnetworks from the two data sets S1 and S2 (12.7%) than inindividual genes (1.7%).

As an example of an interesting biological conclusion to be drawnfrom the subnetwork markers, while thioredoxin was not differen-tially expressed in S2, it instead connected many nodes of differen-tially expressed genes involved in DNA replication and cell mobility.The corresponding subnetworks were predictive of the S1 data. TheRas-related subnetwork and RAD54L-related proteasome from S1were predictive of the S2 data. While many well-establishedprognostic markers for breast cancer disease outcome, such as HER-2/neu (ERBB2) or Myc, were not present in single gene markeranalysis alone [56], they played a central role in the discriminativesubnetworks by interconnecting many expression-responsive genes.We will observe a similar phenomenon in the next case study.

This paper represents a novel approach to integrate transcript andprotein-level information into networks to identify markers fordisease outcome. While this approach may improve predictive

Page 8: A Systems Biology View of Cancer

136 R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

performance, it also raises questions. For example, how good asubstitute are protein–protein interaction networks for more detailedrealistic signaling networks involving transcripts, proteins, and avariety of small molecules? Undoubtedly, the performance of markersbased on them might improve method performance. With improveddata collection and improved mathematical and computationalalgorithms, network-based markers for disease outcome might soonbecome much more sophisticated, and [26] is a step in this direction.

5.2. Prediction of oncogenes and molecular perturbation targets inB-cell lymphomas

A network-centric approach, as in [26], can also be used todistinguish normal from malignant cells and identify targets andeffectors of specific biochemical perturbations, potentially useful forthe identification of drug targets. This approach is taken in [27] with afocus on B-cell lymphomas. The central tool in this approach is the B-cell interactome (BCI), a genome-wide cellular network that incorpo-rates several different molecular interaction types in a human B-cell,including transcriptional and signaling interactions, as well ascomplex formation. The BCI was assembled using information froma variety of sources, such as the literature and several databases, inaddition to information obtained through the use of statisticalnetwork inference algorithms applied to DNA microarray data. Itconsists of 64,649 unique pairwise interactions and represents an“average” set of molecular interactions supported by themajority of B-cell samples from several stages of normal development (naïve,memory, germinal center) as well as from several tumor phenotypes.

This network is used as a background against which to test DNAmicroarray data from different tumor phenotypes. For a givenphenotype, an edge in the BCI may disappear (loss of correlation,LoC) in this phenotype or may appear (gain of correlation, GoC),based on an information-theoretic test. One can then rank individualgenes based on the statistical significance of the LoC/GoC enrichmentamong the interactions in which they directly participate. Note thatthis is quite different from a differential-expression-based ranking ofgenes. Here, the ranking is made on the basis of the extent to whichthe BCI network neighborhood of a gene changes in differentphenotypes. The method was validated using three well-character-ized B-cell tumor phenotypes, follicular lymphoma, Burkitt's lym-phoma, and mantle cell lymphoma, as well as a set of biochemicalperturbation assays.

5.2.1. The BCI interactomeThe B-cell interactome (BCI) used in the study, based on an earlier

version published in [57], integrates information from severaldifferent sources into a network of protein–protein, protein–DNA,and modulatory interactions in mammalian B-cells. This network canbe represented as a graph with three types of edges. The nodes of thegraph represent gene products. An undirected edge P—Q betweentwo proteins P and Q represents a protein–protein interaction,whereas a directed edge P→G from a protein P to a gene G representsa protein–DNA interaction. Finally, a directed edge from a node to anedge between nodes represents a modulatory interaction, arisingfrom a protein, such as an activating kinase, modulating theregulation of a target by a transcription factor. As mentioned earlier,it represents a “background” network that can be used as a controlagainst which to evaluate different phenotypes.

Whether a specific potential interaction does in fact exist isdetermined by a Bayesian evidence integration model (that is, aprobabilistic model that weights the reliability of different types ofinput data in computing likelihood), which evaluates several different“clues” about the interaction. Specifically, the posterior odds Opost iscomputed as the product of the prior odds Oprior and a likelihood ratioLR. Here, the prior odds are computed as the average probability thattwo random gene products interact, based on the number of

interactions expected in a specific cellular context. The likelihoodratio of an interaction between gene products x and y, given acollection of clues c1, …, cN is computed as

LR x; y; c1; N ; cNð Þ = P c1 N cN jx and y interactð Þ= P c1 N cN jx and ydo not interactð Þ;

where P(X|Y) denotes the conditional probability of event X happen-ing, given the occurrence of event Y. In order to compute theseprobabilities one can use a positive, respectively negative, gold standardset of interactions, respectively non-interactions and calculate howmany times a specific clue is observed in the positive and negativegold standard set. The gold standard sets should include only geneproduct pairs for which it is knownwhether they do or do not interact.

The positive gold standard set for protein–protein interactions(GSP) was constructed by extracting human protein–protein interac-tions from several existing data bases, which were obtained throughlow-throughput, high quality experiments. The resulting networkincludes 34,842 unique protein–protein interactions involving 7323genes. Since negative interactions are less well documented in theliterature, the negative gold standard set (GSN) was constructed usingan indirect method. Proteins were classified into four subcellularcompartments, including the cell periphery and exocytic pathway,cytoplasm, mitochondria, and nucleus. For each pair of compartments,the enrichment of interactions involving proteins from the twocompartments was computed, which resulted in compartment pairsthat were likely to contain proteins involved in an interaction. TheGSN was then taken to contain all pairs of proteins contained in cellcompartment pairs not enriched for interactions, resulting in18,359,948 negative interactions. Since many proteins are known tomigrate between intracellular compartments, this approach may beproblematic, however.

For protein–DNA interactions the amount of experimentallyvalidated data is significantly smaller. Here, the approach taken in[57] was to focus on thewell-studied transcription factor MYC. The GSPwas taken to include a set of 1,041 B-cell specific genes known to beregulated by MYC. The GSN was taken to be its genomic complement.

The resulting four gold standard sets were then used to assignlevels of confidence to evidence for interactions obtained usingseveral other methods. For protein–protein interactions, interactiondata from the model organisms Caenorhabditis elegans, Drosophilamelanogaster, Mus musculus, and Saccharomyces cerevisiae were usedby mapping model organism genes to human genes based onpublished information [58]. In addition, interactions from two yeasttwo-hybrid data sets were included. Secondly, interactions wereobtained by applying a literature data-mining algorithm, based on acollection of keywords. These were supplemented by informationfrom the Gene Ontology (GO) database, motivated by the observationthat interacting proteins tend to share the same biological process.Finally, a large set of human gene expression profiles were used toextract interaction information based on gene co-expression. This wasdone by computing the pairwise mutual information (introduced inthe first case study) of over 6000 genes.

For protein–DNA interactions information was extracted fromdatabases of interactions in themouse, mapped to human interactionsvia [58]. The largest contribution to the set of protein–DNAinteractions, however, came from the application of the statisticalnetwork inference algorithm ARACNE [59] to the above-mentionedcollection of DNA microarray profiles. ARACNE also relies on themutual information measure to establish correlations between genes.These different methods for establishing pairwise interactionsrepresent the evidence, or clues, that are then used to compute thelikelihood ratio needed to compute the odds that a given pair of geneproducts interact, as described earlier. The final assembly of thisinformation comprises the B-cell interactome, used in [27] as thebackground, against which different phenotypes of B-cells are tested.

Page 9: A Systems Biology View of Cancer

137R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

The first step is to apply the mutual information measureintroduced earlier to a large collection of DNA microarray geneprofiles in B-cells comprising three different cancer phenotypes, totest, for each pair of gene products in the BCI, whether its relationship(interaction/non-interaction) is consistent with the mutual informa-tionmeasure obtained for this pair. Accordingly, the edge between thetwo is classified as Loss of Correlation (LoC) if the edge is present inthe BCI but not in the new phenotype, or as Gain of Correlation (GoC)if the edge is absent in the BCI but present in the new phenotype. Thenext step is to rank genes according to the statistical significance of theLoC/GoC enrichment among the interactions in which they directlyparticipate. This provides a measure of dysregulation for a given genein a given cancer phenotype which is based on the changes in thegene's network neighborhood rather than just its individual differ-ential expression. The approach can identify small network modulesthat are dysregulated, providing clues to the mechanistic processesunderlying oncogenesis.

Three cancer phenotypes are represented in the DNA microarrayset. Follicular lymphoma, one of the most common B-cell non-Hodgkin's lymphomas, exhibited a relatively small amount ofdysregulation, with only 192 LoC/GoC interactions. Two genes, BCL2(B-cell CLL/lymphoma 2) and SMADI, were ranked very high in thisanalysis (first and sixth). BCL2 is involved in 8 of the 192 interactions,and SMAD1 has been shown to be involved in abnormal pathwayactivation in follicular lymphoma, mediated by tumor-transforminggrowth factor-β. By comparison, neither gene is ranked highly by adifferential expression analysis. For Burkitt's lymphoma the transcrip-tion factorMYC is known to play an important regulatory role [18]. Thenetwork dysregulation analysis ranked MYC in place 15, whereas adifferential expression analysis ranks it 32. Furthermore, a key effectorof MYC, MTA1 (metastasis associated 1), was ranked third, eventhough a differential expression analysis does not place it among thetop 1000 genes. For mantle cell lymphoma, the method also identifiedseveral relevant genes, in agreement with differential expressionanalysis.

In summary, the systems biology method developed in [27]identifies genes whose function within the cellular network of geneproducts in a given B-cell lymphoma phenotype is significantlyaltered. The viewpoint underlying the method is that the networkneighborhood of a gene is more predictive of altered function thanmere differential expression of the gene alone, and provides moreinformation about the possible mechanisms underlying the pheno-typic changes. The method has been validated in different cancerphenotypes, including Burkitt's lymphoma, follicular lymphoma, andmantle cell lymphoma, andwith a validation experiment, consisting ofa perturbation of the CD40 (TNF receptor superfamily member 5)pathway, with promising results. This is despite the fact that themethod requires a number of assumptions to compensate for the lackof qualitatively and quantitatively adequate data to establish the B-cellinteractome and estimate likelihood ratios. It can be expected thatwith the improved availability of data, the statistical accuracy ofmethods such as this one will improve.

5.3. A multi-scale mathematical model of cancer, and its use in designingradiation therapies

Currently, the three principal cancer treatments are surgery,chemotherapy, and radiotherapy. The latter is administered to themajority of cancer patients today, possibly in conjunction with othertreatments. Treatment protocols typically include daily doses ofionizing radiation over the course of several weeks. A major limitationof the efficacy of radiotherapy is that only proliferative cells in a tumorare susceptible to DNA damage from radiation. If solid tumors outgrowtheir blood supply, then many of their cells become quiescent due tohypoxia. Furthermore, sensitivity of cells to ionizing radiation alsodepends on what phase of the cell cycle they are in. In order to study

the optimal timing of a radiation dose it is therefore important tounderstand the cumulative effect of these factors. Since the cellularresponse to hypoxia is controlled at the molecular level through acomplex signaling network, a full understanding of the cumulativeeffect of environmental conditions and molecular events on cellsensitivity to ionizing radiation must integrate dynamic events atseveral spatial and temporal scales. The paper [25] carries out thisintegration in the case of colorectal cancer through a complex multi-scale mathematical model, which can be used to test the efficacy ofdifferent radiation protocols. Application of this model is used tosuggest an alternative protocol that administers radiation doses attimes coinciding with a maximal proportion of proliferative cells.

The model integrates events at three different scales, including agenetic signaling network that responds to environmental conditionsand controls the proliferative state of a cell, the progression of a cellthrough the different stages of the cell cycle, as well as tumor growthat the tissue level and environmental conditions such as hypoxia andoverpopulation. The 2-dimensional computational domain of themodel is an 8 cm square area of tissue which contains five smallcircular tumormasses, one in the center and the other four toward thecorners, with two sources of oxygen to the right and left of the centraltumor. The oxygen concentration in the tissue is described by adiffusion equation (that is, a partial differential equation whichdescribes density fluctuations in a material undergoing diffusion)with a diffusion coefficient that remains constant throughout thetissue (so that the diffusion equation becomes linear).

The propagation of tumor cells through the tissue is modeled usinga macroscopic continuous fluid dynamics model, which describes theflow of tumor cells in the extracellular matrix:

v = − kjp:

Here, v represents the flux (discharge per unit area), k is a constantthat indicates media permeability, and▿p is the gradient vector of thepressure field p, indicating direction and magnitude of its change. Letnϕ(x,y,t) denote the density of cells at time t at position (x,y) which arein cell cycle phase ϕ, where ϕ is one of the phases G1, S, G2M, G0 (=quiescent), Apop. Then nϕ can be described by the advection equation

∂n/ = ∂t + j � vn/

� �= P/;

where Pϕ is the cell density proliferation term at time t. (Here, ∂denotes the partial derivative of the multivariate function nϕ withrespect to variable t, and ▿ denotes the gradient vector of itsargument function.) Advection equations are standard tools todescribe transport of a substance in a moving fluid. Summing up theequations for the different cell cycle phases we obtain the global age-structured equation describing cell density:

−j � kjpð Þ = Σ/P/:

As time passes in this dynamic model, cells at different locations (x,y)in the computational space progress through the various stages of thecell cycle, which is implemented through a discrete mathematicalmodel at the cellular scale. The cell cycle is represented as three stages,G1, S, and G2-M, together with a stage G0 for quiescent cells and Apopfor cells that have been targeted for apoptosis. The different stageshave different lengths and are divided into discrete time intervals. Agiven cell progresses through the time intervals in discrete steps atconstant speed. At the end of the G1 phase there is a checkpoint atwhich two tests are performed, one for DNA damage and the other fortwo environmental conditions, hypoxia and overpopulation. In thecase of DNA damage, the cell is targeted for apoptosis (Apop) throughactivation of p53, if it is not mutated. Similarly, in the event of hypoxiaor overpopulation in the location (x,y), the cell is directed to thequiescent state G0 through activation of the APC or SMAD pathway

Page 10: A Systems Biology View of Cancer

138 R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

(see below). When environmental conditions change the cell canreturn to the proliferative state and progress through the cell cycle. Ifthe cell passes this checkpoint it progresses through the S and G2-Mphases of the cell cycle to produce offspring.

As mentioned, at the molecular level of each cell a network ofsignaling pathways is active that reacts to environmental conditionsthe cell finds itself in and regulates progression through the cell cycle.This network consists of four interconnected pathways, which processsignals from overpopulation (APC), growth factor signals (RAS),hypoxia (SMAD), and DNA damage (p53). It is implemented as aBoolean networkmodel inwhich a given gene/protein in the networkis either ON or OFF. The determination of the state of a given networknode at time t+1 is determined using a logical expression involvingthe states at time t of all the nodes that provide input to it in thenetwork. For instance, the logical expression for c-myc is:

c−myct + 1 = RASt1B−catt1 I SMADt:

This formula is to be interpreted as saying that c-myc at time t+1will be ON precisely if RAS and B-cat are ON at time t and SMAD isOFF at time t. There are similar formulas for the other nodes in thenetwork. Mutation of the genes SMAD and APC are implemented inthe model by setting SMADt+1=0=APCt+1. Using this Booleannetwork model one can compute the cell cycle signal resulting from agiven set of environmental conditions. If hypoxia occurs at time t,then SMADt+1=1; similarly for overpopulation.

To simulate themodel, one needs tomake several assumptions. Forthe discrete cell cycle and gene regulation models the time step ischosen to be 1 h, and simulations are run for 320 h. The model isinitialized with a distribution of normal and malignant (i.e., at leastone mutation) cells at different points of the cell cycle. First, hypoxiaand overpopulation conditions are checked, using model parameters,and the genetic model is run for each cell with these inputs until itreaches a steady state. Then the cell cycle model is run for one timestep (= 1 h) and new values are computed for the cell densities n atdifferent cell cycle stages and proliferation terms P. Finally, theadvection equations are solved with these new values.

5.3.1. Radiation therapyAs mentioned earlier, it is assumed that only proliferative cells are

sensitive to radiation treatment. Furthermore, DNA damage ismeasured as the number of double strand breaks, given by theproduct of the radiosensitivity and the radiation dose. DNA damage isimplemented in the model by setting a threshold for the number ofdouble strand breaks. If the number of breaks exceeds this threshold,p53 is activated and the cell is subsequently directed to apoptosis. Thestandard radiotherapy protocol used in the simulations consists of a 2Gγ dose per day, five days a week, repeated for several weeks. In themodel the dose is assumed to be uniformly distributed over the spatialdomain. Thus, a treatment is implemented in the simulation bycomputing for each cell its susceptibility and the damage it sustains,and activating p53 as appropriate. Then, at the next time step, thegenetic model will target the cell for apoptosis.

5.3.2. Simulation resultsThe goal of the paper is to assess the efficacy of the standard

radiotherapy protocol, which is done by counting the number of cellsdirected to apoptosis after repeated applications of a radiation dose.The simulation results show clearly that during 10 treatments on thestandard schedule an increasingly large number of cells in the interiorof the tumors are in a quiescent state, due to the effect of increasinghypoxia. As a result, repeated treatments on a 24-hour schedulebecome increasingly less effective since the majority of cells are notsensitive. In the paper, an alternative treatment schedule is explored,which takes into account the length of the cell cycle. Here, after aninitial treatment, the follow-up treatments are timed to coincide with

a maximum number of mitotic cells and are delivered before hypoxiasets in. The result is an increase by a factor of 20 in the number of cellsmarked for apoptosis.

5.3.3. SummaryThe model described here integrates relevant phenomena at

several spatial and temporal scales into a heterogeneous mathema-tical model of tumor growth and the effect of radiotherapy. The keycharacteristic of this systems biology model is that it takes intoaccount a complex network of relevant factors whose aggregatecontribution can only be understood with the help of sophisticatedmathematical modeling tools. It allows an in-depth exploration ofdifferent variants of the standard treatment protocol scheduling,which can provide the basis for the design of appropriate clinical trials.It also allows the exploration of the effects of different mutations ontreatment response. At the tissue level, a fluid dynamicsmodel is used,working with cell densities rather than numbers of individual cells.Using individual-based simulation techniques to create a fully discretemodel might improve its accuracy. Also, constructing a true three-dimensional model might bring to light important spatial effects. Atthe same time, both of these improvements would come at substantialcomputational cost. Ultimately, as depicted in Fig. 2, comparingsimulated outcomes with experimental results will point to gaps orshortfalls that can be addressed through additional modeling.

6. Outlook and conclusion

In order to obtain a useful understanding of cancer, it is necessaryto take a systemic view of the disease, and to utilize mathematicalmodeling techniques. We have described in detail three examples theutility of such a view. The first two examples use a graphical depictionof a large-scale intracellular network as a mathematical tool,representing interactions between molecular species. Changes in thenetwork structure, rather than changes in individual genes, then formthe basis for an analysis of the differences between healthy andmalignant cells. In the first case this leads to prognostic markers forthe propensity to metastasis in breast cancer patients, in the secondcase to a new method for identifying oncogenes. The third examplerepresents an attempt to bridge different spatial and temporal scalesto understand the effect of different environmental parameters on cellproliferation, integrating the molecular, cellular, and tissue scales. Itcan then be used to study a variety of different treatment approachesat each of these scales. The example here focuses on the effect ofradiation therapy.

In order to use suchmodels effectively they need to be informed byappropriate high quality data. As we have seen in the discussion of thecase studies, compromises had to bemade due to the lack of data, theirpoor quality, or both. This applies in particular to detailed dynamicmodels like the one in [25]. In order to calibrate such modelsaccurately, detailed time course observations with coherent measure-ments at different scales are needed, but are currently not available inmost or all cases. It can be expected that with technological progressdata collection will become cheaper, more abundant, and of higherquality. New imaging techniques at the molecular and tissue level willallow a representation of events at a much higher scale of resolution,which can inform increasingly detailed and complex models. Increas-ing computing power will enable us to simulate and analyze complexmodels in a short enough time frame to be useful as a clinical tool.

In [24] the authors lay out a vision for the future: “Two decadesfrom now, having fully charted the wiring diagrams of every cellularsignaling pathway, it will be possible to lay out the complete“integrated circuit of the cell.” We will then be able to apply thetools of mathematical modeling to explain how specific geneticlesions serve to reprogram this integrated circuit in each of theconstituent cell types so as to manifest cancer.”We have attempted inthis review to show that in the field of cancer, the rapid progress of

Page 11: A Systems Biology View of Cancer

139R. Laubenbacher et al. / Biochimica et Biophysica Acta 1796 (2009) 129–139

systems biology and its embrace of diverse mathematical andstatistical modeling methods has brought this vision within grasp.Even more, quantitative tools are helping to translate insights aboutparts of this integrated circuit to clinical practice in the form of newdiagnostic tools, modified or new treatments, and potential new drugtargets for cancer treatment. The role of systems biology intranslational cancer medicine is bound to increase, driven byintegrated teams of mathematical modelers, cancer biologists, andclinicians. By working together in a transdisciplinary fashion, theneeds of the clinic can directly inspire needed advances infundamental understanding, enabled by quantitative tools.

Acknowledgements

This work was supported in part by grants T32 CA079448 (V.H.),R37 DK042412 (F.M.T.), R01CA120170 (V.S), and R01 GM080219(P. M.). P. Mendes also acknowledges financial support from the U.K.BBSRC and EPSRC. The content is solely the responsibility of theauthors and does not necessarily represent the official views of theNational Cancer Institute or the National Institutes of Health. Thanksare due to M. Hatakeyama, the lead author of [19], for providingMathematica code to generate Fig. 1C.

References

[1] L.D. Wood, et al., The genomic landscapes of human breast and colorectal cancers,Science 318 (5853) (2007) 1108–1113.

[2] B. Vogelstein, K.W. Kinzler, Cancer genes and the pathways they control, Nat. Med.10 (8) (2004) 789–799.

[3] J.J. Hornberg, et al., Cancer: a systems biology disease, Biosystems 83 (2–3) (2006)81–90.

[4] A.C. Ahn, et al., The limits of reductionism inmedicine: could systems biology offeran alternative? PLoS Medicine 3 (6) (2006) e208.

[5] A.C. Ahn, et al., The clinical applications of a systems approach, PLoS Medicine 3(7) (2006) e209.

[6] B. Stiles, et al., Essential role of AKT-1/protein kinase B{alpha} in PTEN-controlledtumorigenesis, Mol. Cell. Biol. 22 (11) (2002) 3842–3851.

[7] P. Rodriguez-Viciana, et al., Activation of phosphoinositide 3-kinase by interactionwith Ras and by point mutation, EMBO J. 15 (10) (1996) 2442–2451.

[8] E. Tokunaga, et al., Akt is frequently activated in HER2/neu-positive breast cancersand associated with poor prognosis among hormone-treated patients, Interna-tional Journal of Cancer 118 (2) (2006) 284–289.

[9] S.R. Datta, et al., Akt phosphorylation of BAD couples survival signals to the cell-intrinsic death machinery, Cell 91 (2) (1997) 231–241.

[10] M.H. Cardone, et al., Regulation of cell death protease caspase-9 by phosphoryla-tion, Science 282 (5392) (1998) 1318–1321.

[11] A. Brunet, et al., Akt promotes cell survival by phosphorylating and inhibiting aforkhead transcription factor, Cell 96 (6) (1999) 857–868.

[12] L.D. Mayo, D.B. Donner, A phosphatidylinositol 3-kinase/Akt pathway promotestranslocation of Mdm2 from the cytoplasm to the nucleus, Proceedings of theNational Academy of Sciences 98 (20) (2001) 11598–11603.

[13] K. Inoki, et al., TSC2 is phosphorylated and inhibited by Akt and suppresses mTORsignalling, Nat. Cell Biol. 4 (9) (2002) 648–657.

[14] R.L. Elstrom, et al., Akt stimulates aerobic glycolysis in cancer cells, Cancer Res. 64(11) (2004) 3892–3899.

[15] L.P. Kane, et al., Induction of NF-[kappa]B by the Akt/PKB kinase, Curr. Biol. 9 (11)(1999) 601–605.

[16] F. Lerebours, et al., NF-kappa B genes have a major role in inflammatory breastcancer, BMC Cancer 8 (2008).

[17] S. Futagami, et al., COX-2 and CCR2 induced by CD40 ligand and MCP-1 are linkedto VEGF production in endothelial cells, Prostaglandins, Leukotrienes andEssential Fatty Acids 78 (2) (2008) 137–146.

[18] Z. Li, et al., A global transcriptional regulatory role for c-Myc in Burkitt'slymphoma cells, PNAS 100 (2003) 8164–8169.

[19] M. Hatakeyama, et al., A computational model on the modulation of mitogen-activated protein kinase (MAPK) and Akt pathways in heregulin-induced ErbBsignalling, Biochem. J. 373 (2) (2003) 451–463.

[20] G. Stolovitzky, D. Monroe, A. Califano, Dialogue on reverse-engineering assess-ment and methods: the DREAM of high-throughput pathway inference, Ann. N. Y.Acad. Sci. 1115 (2007) 1–22.

[21] A.J. Hartemink, Reverse engineering gene regulatory networks, Nat. Biotechnol. 23(5) (2005) 554–555.

[22] R. Laubenbacher, B. Stigler, A computational algebra approach to the reverseengineering of gene regulatory networks, J. Theor. Biol. (2004).

[23] T.S. Gardner, et al., Inferring genetic networks and identifying compound mode ofaction via expression profiling, Science 301 (5629) (2003) 102–105.

[24] D. Hanahan, R.A. Weinberg, The hallmarks of cancer, Cell 100 (2000) 57–70.

[25] B. Ribba, T. Colin, S. Schnell, A multiscale mathematical model of cancer, and itsuse in analyzing irratiation therapies, Theor. Biol. Med. Modelling 3 (2006) 7.

[26] H.Y. Chuang, et al., Network-based classification of breast cancer metastasis, Mol.Sys. Biol. 3 (2007).

[27] K.M. Mani, et al., A systems biology approach to prediction of oncogenes andmolecular perturbation targets in B-cell lymphomas, Mol. Syst. Biol. 4 (2008).

[28] R. Shachar, et al., A systems-level approach to mapping the telomere lengthmaintenance gene circuitry, Mol. Syst. Biol. 4 (2008) 172.

[29] J.P. Qi, et al., A mathematical model of P53 gene regulatory networks underradiotherapy, Biosystems 90 (3) (2007) 698–706.

[30] M.R. Birtwistle, et al., Ligand-dependent responses of the ErbB signaling network:experimental and modeling analyses, Mol. Syst. Biol. (2007) EMBO and NaturePublishing Group.

[31] C.A. Athale, T.S. Deisboeck, The effects of EGF-receptor density on multiscaletumor growth patterns, J. Theor. Biol. 238 (4) (2006) 771–779.

[32] P.H. Huang, et al., Quantitative analysis of EGFRvIII cellular signaling networksreveals a combinatorial therapeutic strategy for glioblastoma, Proc. Nat. Acad. Sci.104 (31) (2007) 12867–12872.

[33] S. Reymann, J. Borlak, Transcription profiling of lung adenocarinomas of c-myc-transgenic mice: identification of the c-myc regulatory gene network, B.M.C. Sys.Biol. 2 (2008) 46.

[34] W. Kolch, Ras/Raf signalling and emerging pharmacotherapeutic targets, Exp.Opin. Pharmacother. 3 (2002) 709–718.

[35] B.N. Kholodenko, et al., Quantification of Short Term Signaling by the EpidermalGrowth Factor Receptor, 1999, pp. 30169–30181.

[36] B.S. Hendriks, H.S. Wiley, D. Lauffenburger, HER2-mediated effects on EGFRendosomal sorting: analysis of biophysical mechanisms, Biophys. J. 85 (4) (2003)2732–2745.

[37] N. Kumar, et al., Modeling HER2 effects on cell behavior from mass spectrometryphosphotyrosine data. PLoS Computational Biology 3 (1) (2007) e4.

[38] G. Castoria, et al., Integrating signals between cAMP andMAPK pathways in breastcancer, Front. Biosci. 13 (2008) 1318–1327.

[39] A. Kiyatkin, et al., Scaffolding protein Grb2-associated binder 1 sustains epidermalgrowth factor-induced mitogenic and survival signaling by multiple positivefeedback loops, J. Biol. Chem. 281 (29) (2006) 19925–19938.

[40] E.C. Stites, et al., Network analysis of oncogenic Ras activation in cancer, Science318 (5849) (2007) 463–467.

[41] A. Sveiczer, J. Tyson, B. Novak, Modelling the fission yeast cell cycle, Brief Funct.Genomic Proteomic 2 (4) (2004) 298–307.

[42] T. Haberichter, et al., A systems biology dynamical model of mammalian G1 cellcycle progression, Mol. Syst. Biol. 3 (2007) 84.

[43] F. Watt, M. Frye, S. Benitah, MYC in mammalian epidermis: how can an oncogenestimulate differentiation? Nat. Rev., Cancer 8 (3) (2008) 234–242.

[44] S. Bottani, B. Grammaticos, Analysis of a minimal model for p53 oscillations, J.Theor. Biol. 249 (2) (2007) 235–245.

[45] N. Geva-Zatorsky, et al., Oscillations and variability in the p53 systems, Mol. Syst.Biol. 2006 (2) (2006) 0033.

[46] A.R. Anderson, et al., Tumor morphology and phenotypic evolution driven byselective pressure from the microenvironment, Cell 127 (2006) 905–915.

[47] Y. Jiang, et al., A multiscale model for avascular tumor growth, Biophys. J. 89 (6)(2005) 3884–3894.

[48] L. Zhang, C.A. Athale, T.S. Deisboeck, Development of a three-dimensionalagent-based tumor model: simulating gene–protein interaction profiles, cellphenotypes, and multicellular patterns in brain cancer, J. Theor. Biol. 244(2007) 96–107.

[49] A.R. Anderson, V. Quaranta, Integrative mathematical oncology, Nat. Rev., Cancer 8(2008) 227–234.

[50] M.M. Mueller, N.E. Fusenig, Friends or foes—bipolar effects of the tumor stroma incancer, Nat. Rev., Cancer 4 (2004) 839–849.

[51] B. Weigelt, J.L. Peterse, L.J. van't Veer, Breast cancer metastasis: markers andmodels, Nat. Rev. Cancer 5 (8) (2005) 591–602.

[52] M.J. van de Vijver, et al., A gene-expression signature as a predictor of survival inbreast cancer, N. Engl. J. Med. 347 (25) (2002) 1999–2009.

[53] Y. Wang, et al., Gene-expression profiles to predict distant metastasis oflymph-node-negative primary breast cancer, Lancet 365 (9460) (2005)671–679.

[54] A.K. Ramani, et al., Consolidating the set of known human protein–proteininteractions in preparation for large-scale mapping of the human interactome,Genome Biol. 6 (5) (2005) R40.

[55] J.F. Rual, et al., Towards a proteome-scale map of the human protein–proteininteraction network, Nature 437 (7062) (2005) 1173–1178.

[56] L.J. van 't Veer, et al., Gene expression profiling predicts clinical outcome of breastcancer, Nature 415 (6871) (2002) 530–536.

[57] C. Lefebvre, et al., A context-specific network of protein-DNA and protein–proteininteractions reveals new regulatory motifs in human B-cells, Lecture Notes inBioinf 4532 (2007) 42–56.

[58] K. O'Brian, M. Remm, E. Sonnhammer, Inparanoid: a comprehensive database ofeukaryotic orthologs, Nucleic Acids Res. 33 (Database issue) (2005) D476–D480.

[59] A. Margolin, et al., ARACNE: an algorithm for the reconstruction of gene regulatorynetworks in a mammalian cellular context, B.M.C. Bioinformatics 7 (Suppl. 1)(2006) S1–S7.

[60] M. Djavani, et al., Early blood profiles of virus infection in a monkey model forLassa Fever, J. Virology 81 (15) (2007) 7960–7973.

[61] M. Djavani, et al., Gene expression in primate liver during viral hemorrhagic fever,J. Virology (2008). under review.