Journal of Theoretical Biology 245 (2007) 433–448
www.elsevier.com/locate/yjtbi

Robustness and evolvability in genetic regulatory networks

Maximino Aldana a,d,*, Enrique Balleza a,d, Stuart Kauffman b, Osbaldo Resendiz c,d

a Centro de Ciencias Físicas, Universidad Nacional Autónoma de México, C.P. 62251, Apartado Postal 48-3, Cuernavaca, Morelos, Mexico
b Institute of Biocomplexity and Informatics, University of Calgary, Calgary, Alta., Canada
c Centro de Ciencias Genómicas, UNAM, Cuernavaca, Morelos, Mexico
d Consortium for the Americas for Interdisciplinary Science, UNM, Albuquerque, NM, USA

* Corresponding author. Tel.: +52 555 622 7787; fax: +52 555 622 7775. E-mail address: max@fis.unam.mx (M. Aldana). URL: http://www.fis.unam.mx/~max.

Received 29 March 2006; received in revised form 17 October 2006; accepted 25 October 2006
Available online 1 November 2006

0022-5193/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jtbi.2006.10.027

Abstract

Living organisms are robust to a great variety of genetic changes. Gene regulation networks and metabolic pathways self-organize and reaccommodate to make the organism perform with stability and reliability under many point mutations, gene duplications and gene deletions. At the same time, living organisms are evolvable, which means that these kinds of genetic perturbations can eventually make the organism acquire new functions and adapt to new environments. It is still an open problem to determine how robustness and evolvability blend together at the genetic level to produce stable organisms that can nevertheless change and evolve. Here we address this problem by studying the robustness and evolvability of the attractor landscape of genetic regulatory network models under the process of gene duplication followed by divergence. We show that an intrinsic property of this kind of network is that, after the divergence of the parent and duplicate genes, with a high probability the previous phenotypes, encoded in the attractor landscape of the network, are preserved and new ones might appear. The above is true in a variety of network topologies and even for the case of extreme divergence in which the duplicate gene bears almost no relation to its parent. Our results indicate that networks operating close to the so-called "critical regime" exhibit the maximum robustness and evolvability simultaneously.

© 2006 Elsevier Ltd. All rights reserved.

MSC: primary 98.8.c

Keywords: Gene regulatory networks; Robustness; Evolvability; Gene divergence; Gene duplication

1. Introduction

Robustness and evolvability are two central properties of biological systems (Stelling et al., 2004; de Visser et al., 2003; Kirschner and Gerhart, 1998; Nehaniv, 2003; Poole et al., 2003; Wagner, 2005a). Living organisms are robust since they can maintain performance under a broad range of random perturbations, ranging from temporary chemical or physical changes in the environment, to permanent genetic mutations. They are also evolvable since organisms eventually do change as a result of changes in their genetic material, acquiring new functions and adapting to new environments. Robustness and evolvability have been observed to occur at different levels of biological organization, going from gene regulation to organismal fitness. However, despite the central importance of these two concepts to the understanding of the functioning and evolution of biological systems, it is not yet clear which structural and dynamical mechanisms generate complex structures that are both robust and evolvable. Furthermore, neither robustness nor evolvability has been defined unambiguously. Therefore, before addressing the problem of how robustness and evolvability emerge in genetic regulatory networks (GRN), we must start by defining them in this context.

Several definitions have been given depending upon the context and level of organization under consideration. Here we follow de Visser et al. (2003) and define robustness as the invariance of phenotypes in the face of perturbation. In this definition the word "perturbation" means anything



¹ "Inputs" and "outputs" is the common terminology in the literature on complex networks. However, in the biological literature it is more common to refer to the inputs and outputs as "regulator genes" and "target genes", respectively. Here we will use these two terminologies interchangeably.

that drives the system away from its wild-type state. However, throughout this work we will use perturbation as a synonym for permanent genetic change (i.e. mutations). On the other hand, we adopt the following definition for evolvability (Wagner, 2005b): A biological system is evolvable if it can acquire novel functions (phenotypes) through genetic change (perturbations), functions that help the organism survive and reproduce. Accordingly, in this work we seek to devise models for genetic networks that are robust in the sense that phenotypes are preserved in the presence of perturbations, and at the same time are evolvable in the sense that under such perturbations new phenotypes may also emerge. Note that phenotype and perturbation are two essential elements involved in the previous definitions. It is therefore important to specify what phenotypes are in our models and what kind of perturbations will be considered.

With regard to phenotypes, it has been a long-standing hypothesis that the dynamical attractors of the genetic network correspond to cellular types or cellular fates (Kauffman, 1969, 1993). This hypothesis has been partially confirmed, both numerically and experimentally, in recent work where patterns of gene expression of real organisms have been identified with the dynamical attractors of properly constructed genetic network models (Albert and Othmer, 2003; Mendoza and Alvarez-Buylla, 2000; Espinosa-Soto et al., 2004; Huang and Ingber, 2000; Huang et al., 2005; Gardner et al., 2003). In the next sections we present the GRN model we will be working with and explain the occurrence of dynamical attractors and their biological significance. Here it suffices to mention that a dynamical attractor in this context can be considered as the stationary gene expression profile which the genetic network falls into after a transient time, starting out from a given initial condition (such as a heat shock). The work cited above provides evidence that some phenotypic traits, such as the cellular type or the cellular fate (apoptosis, quiescence, proliferation, differentiation), can be viewed as end programs encoded in the dynamical attractors of the GRN.

The attractor landscape of a genetic network (namely, the set of all the attractors and their basins of attraction) is an emergent property that depends on the structural and dynamical organization of the entire network in the same way as phenotypes are emergent properties determined to a large extent by the organization of the underlying genetic network. Therefore, given the biological significance of the attractors, the problem of the robustness and evolvability of phenotypes can be addressed by studying the conservation and transformation of the attractor landscape of the GRN under perturbations. This takes us to the second important element involved in the definitions of robustness and evolvability: what kind of perturbations will be considered?

The only perturbation considered in this work is gene duplication and divergence. Nowadays it is widely accepted that one of the main mechanisms of genome growth and evolution is gene duplication followed by genetic divergence (Lynch and Conery, 2000; Lynch, 2002; Lynch and Katju, 2004; Teichmann and Babu, 2004; Zhang, 2003). Susumu Ohno was among the first to point out the importance of gene duplication, for it constitutes a remarkable source of material for functional gene novelty in organisms (Ohno, 1995). Soon after gene duplication, the gradual accumulation of mutations in one copy makes the parent and duplicate genes diverge (Lynch and Conery, 2000; Teichmann and Babu, 2004; Zhang, 2003). This divergence might consist of (i) non-functionalization, in which one of the copies becomes silenced; (ii) neofunctionalization, in which one copy develops a new function, whereas the other copy retains its original function; (iii) subfunctionalization, where the two copies acquire complementary functions that, added together, carry out the original function. In any case, "changes of gene expression after gene duplication appear to be a general rule rather than exception, and these changes often occur quickly after gene duplication." (Zhang, 2003). Indeed, the results presented here indicate that the duplication and divergence of a single gene can change the entire attractor landscape. This change may consist not only in the emergence of new phenotypes (attractors), but also in the reconfiguration of differentiation and gene expression pathways.

2. Genetic network models

The dynamics of GRN can be modeled using different approaches (De Jong, 2002; Smolen et al., 2000; Mason et al., 2004). In this work, as a test-bed for the study of robustness and evolvability in GRN, we choose to model gene activities by random Boolean networks (RBN) with different topologies. Since their proposal in 1969 (Kauffman, 1969), RBN have successfully described in a qualitative way several important aspects of the gene regulation and cell differentiation processes (Kauffman, 1993, 1995). The model consists of a set of $N$ binary variables, $\sigma_1, \sigma_2, \ldots, \sigma_N$, each acquiring the values 0 or 1 corresponding to the two states of gene expression ("off" and "on," respectively). The state of each gene $\sigma_n$ is regulated by a set of $k_n$ other genes. In turn, $\sigma_n$ can regulate the expression of $l_n$ other genes. Note that the network is directed since, if $\sigma_m$ regulates the expression of $\sigma_n$, the opposite does not necessarily occur. We will call the set of $k_n$ genes that regulate the expression of $\sigma_n$ the inputs or regulators of $\sigma_n$. Analogously, the set of $l_n$ genes for which $\sigma_n$ is an input will be referred to as the outputs or targets of $\sigma_n$.¹

As in every directed network, the topology of the input connections need not be the same as the topology of the output connections. Denoting as $p_i(k)$ and $p_o(l)$ the probability distribution functions that an arbitrary gene has $k$ inputs and $l$ outputs, respectively, we consider the following two different topologies:

• Homogeneous random topology²: Each gene has exactly $K$ inputs (regulators) randomly chosen from anywhere in the system. Therefore, the input probability distribution is $p_i(k) = \delta_{k,K}$, where $\delta_{k,K}$ is the Kronecker delta function. Since the $K$ regulators of each gene are chosen randomly, it follows that the output probability distribution is Poissonian with average $K$, i.e. $p_o(l) = e^{-K} K^l / l!$.

• Scale-free output topology: The number of outputs (targets) $l_n$ of each $\sigma_n$ is a random variable that follows a power law distribution $p_o(l) = C l^{-\gamma}$. Choosing the $l_n$ outputs of each gene randomly from anywhere in the system, the number of inputs $k_n$ turns out to be a random variable that follows a Poisson distribution $p_i(k) = e^{-K} K^k / k!$ whose average $K$ is determined by the scale-free exponent $\gamma$ (and $N$, the number of genes in the network).

Genetic network models with homogeneous random topology have been extensively studied (Derrida and Stauffer, 1986; Kauffman, 1993, 1995; Aldana et al., 2003). However, recent analyses indicate that this topology, although easy to implement numerically and study analytically, is unrealistic for the modeling of genetic networks. The scale-free topology mentioned above seems to fit better the transcriptional regulatory networks observed experimentally for several organisms (Albert, 2005; Guelzim et al., 2002; Babu et al., 2004).³ Nonetheless, we include the homogeneous random topology in this work because several analytic results are known about the Boolean dynamics of networks with this topology. Therefore, such networks serve as a reference point to compare the results obtained for other topologies.

² Another topology which is equivalent, from a dynamical point of view, to the homogeneous random topology described here is the one in which $p_i(k)$ and $p_o(l)$ are both Poisson distributions with the same average $K$.

³ It has been argued that gene duplication processes followed by divergence produce protein interaction networks with scale-free topology (Pastor-Satorras et al., 2003; Rzhetsky and Gomez, 2001; Bhan et al., 2002; Hughes and Friedman, 2005; Chung et al., 2003; Raval, 2003). However, the authors are not aware of a model that accounts for the architecture observed in GRN, namely, exponential or Poisson distributions for the inputs and scale-free distributions for the outputs.

Once each gene $\sigma_n$ has been provided with a set of inputs, $\{\sigma_{n_1}, \ldots, \sigma_{n_{k_n}}\}$, the network dynamics are then given by the simultaneous updating of all the genes of the network according to

$$\sigma_n(t+1) = f_n\left(\sigma_{n_1}(t), \ldots, \sigma_{n_{k_n}}(t)\right). \tag{1}$$

In the above expression, $f_n$ is a Boolean rule randomly chosen from the ensemble of all possible Boolean rules such that, for each of the $2^{k_n}$ configurations of the $k_n$ inputs of $\sigma_n$, $f_n = 1$ with probability $p$ and $f_n = 0$ with probability $1 - p$. In the context of GRN, the parameter $p$ is the probability of gene expression. A Boolean rule is drawn from this ensemble for each gene in the network. It is worth emphasizing that both the inputs and the Boolean rule of each gene are chosen only once and remain fixed throughout the temporal evolution of the network. This situation is known in the literature as the quenched model.

We will denote as $S_t$ the dynamical configuration of the network at time $t$:

$$S_t = \{\sigma_1(t), \sigma_2(t), \ldots, \sigma_N(t)\}.$$

Starting out from an initial configuration $S_0$, the system will trace out a trajectory in time

$$S_0 \to S_1 \to S_2 \to \cdots \to S_t \cdots$$

determined by Eq. (1). By analysing the temporal evolution of two distinct trajectories that start out from two slightly different configurations, $S_0$ and $\tilde{S}_0$, it is possible to show analytically that RBN with homogeneous random topology (as well as with scale-free input topology) exhibit a continuous (second order) phase transition from an ordered regime where the two trajectories typically converge after some transient time, to a chaotic regime where the system is extremely sensitive to small changes in the initial condition and the two trajectories typically diverge from each other (Derrida and Stauffer, 1986; Aldana and Cluzel, 2003; Aldana, 2003). The phase transition is governed by the value of the so-called average expected sensitivity, defined as $S = 2p(1-p)K$, where $K$ is the average network connectivity and $p$ the probability of gene expression (Shmulevich and Kauffman, 2004). For $S > 1$ the network is in the chaotic regime and for $S < 1$ it is in the ordered regime. The phase transition occurs at $S = 1$, which is called the critical regime. Throughout this work we analyse networks with $p = 0.5$, for which the phase transition occurs at $K = 2$.

It has been hypothesized that RBN operating in the critical (or near the critical) regime are good candidates for the modeling of real GRN (Kauffman, 1993). This is so because networks in the ordered regime exhibit extremely stable dynamics in the sense that small variations in the initial configuration $S_0$ typically die out over time. On the other hand, networks operating in the chaotic regime are extremely sensitive to small changes in the initial configuration $S_0$, and hence they are not stable. The compromise between "frozen" stability and chaotic behavior is achieved close to the ordered regime, where the network dynamics are neither frozen nor chaotic. Recent work shows evidence that eukaryotic cells seem to operate in the ordered or critical regimes, but not in the chaotic one (Shmulevich et al., 2005; Ramo et al., 2006). As we will see in the next sections, our results give further support to the hypothesis that the GRN of living organisms should operate close to the critical phase, since in such a case one obtains the maximum robustness and evolvability simultaneously within the same system.
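To make the model of this section concrete, the following is a minimal sketch of a quenched RBN with homogeneous random topology, the synchronous update of Eq. (1), and the sensitivity $S = 2p(1-p)K$. The function names (`make_rbn`, `step`, `sensitivity`) are hypothetical, and two details are assumptions not fixed by the text: the $K$ regulators of a gene are taken to be distinct, and the first regulator is used as the most significant bit when indexing a truth table.

```python
import random

def make_rbn(n_genes, k, p=0.5, seed=0):
    """Build a quenched RBN in which every gene has exactly k inputs."""
    rng = random.Random(seed)
    # Assumption: the k regulators of a gene are distinct genes.
    inputs = [rng.sample(range(n_genes), k) for _ in range(n_genes)]
    # One truth table per gene: each of the 2**k entries is 1 with probability p.
    rules = [[1 if rng.random() < p else 0 for _ in range(2 ** k)]
             for _ in range(n_genes)]
    return inputs, rules

def step(state, inputs, rules):
    """Synchronous update of Eq. (1): all genes read the state at time t."""
    new_state = []
    for regulators, table in zip(inputs, rules):
        index = 0
        for r in regulators:
            # Assumption: first regulator is the most significant bit.
            index = (index << 1) | state[r]
        new_state.append(table[index])
    return tuple(new_state)

def sensitivity(p, k):
    """Average expected sensitivity S = 2 p (1 - p) K."""
    return 2 * p * (1 - p) * k

inputs, rules = make_rbn(n_genes=20, k=2)
state = tuple([0] * 20)
next_state = step(state, inputs, rules)
```

With $p = 0.5$ the sensitivity reduces to $K/2$, so `sensitivity(0.5, 2)` evaluates to 1, the critical value quoted in the text.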

Fig. 1. Graphical representation of the attractor landscape of a Boolean network with $N = 15$ genes and average connectivity $K = 2$. In this figure each configuration $S_t$ is represented by a small circle, whereas the edges correspond to discrete time steps. Two configurations are connected if one is the successor of the other under the dynamics given by Eq. (1). The arrows indicate the direction of the dynamical flow. All the configurations that have the same successor are painted with the same color. In this particular realization the configuration space organizes into two dynamical attractors and their corresponding basins of attraction.

3. Attractor landscape in GRN

One of the main characteristics of RBN is the occurrence of dynamical attractors. Since the network dynamics are deterministic, and the total number $\Omega = 2^N$ of dynamical configurations of a network with $N$ genes is finite, after a transient time the network will inevitably fall into a previously visited configuration, from which the network will trace out a cyclic trajectory. This cyclic trajectory is called, as noted, an attractor. There may be more than one attractor, but at least one must exist. All the configurations that after a transient time fall into the same attractor constitute its basin of attraction. Therefore, all the $\Omega = 2^N$ possible dynamical configurations of the network organize into disjoint classes, each corresponding to a dynamical attractor and its respective basin of attraction. We will refer to the set of disjoint classes which the configuration space breaks into as the attractor landscape.

Wuensche introduced a graphical representation to visualize the attractor landscape, as shown in Fig. 1 (Wuensche and Lesser, 1992; Wuensche, 2004). In this representation, each of the $\Omega = 2^N$ dynamical configurations of the network is represented with a solid dot. Two successive configurations $S_t$ and $S_{t+1}$ are connected with a line. In the attractor landscape depicted in Fig. 1, which corresponds to a Boolean network with $N = 15$, $K = 2$ and $p = 0.5$, all the $\Omega = 2^{15}$ configurations organize into two basins of attraction with their respective attractors. Note that the Boolean network model is dissipative in the sense that multiple different configurations can map into one, so that information of the initial condition is lost. This can be seen in Fig. 1 as the fan-like structures, where multiple different configurations are connected to a single one. All the possible trajectories that lead to the same attractor can be considered as the gene expression pathways that produce the particular phenotype encoded in that attractor.
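Because the configuration space is finite, the attractor landscape of a small network can be obtained by exhaustive enumeration. The sketch below follows every one of the $2^N$ configurations until it reaches a known state or closes a cycle. The helper name `attractor_landscape` and the three-gene toy rule `toy_step` are illustrative assumptions, not part of the original model.

```python
from itertools import product

def attractor_landscape(step, n_genes):
    """Return {attractor: basin}, visiting all 2**n_genes configurations."""
    attractor_of = {}  # configuration -> attractor (frozenset of states)
    for start in product((0, 1), repeat=n_genes):
        path, position = [], {}
        state = start
        # Walk forward until we meet a classified state or revisit our own path.
        while state not in attractor_of and state not in position:
            position[state] = len(path)
            path.append(state)
            state = step(state)
        if state in attractor_of:
            attractor = attractor_of[state]
        else:
            # The revisited state closes a cycle: that cycle is the attractor.
            attractor = frozenset(path[position[state]:])
        for visited in path:
            attractor_of[visited] = attractor
    basins = {}
    for configuration, attractor in attractor_of.items():
        basins.setdefault(attractor, set()).add(configuration)
    return basins

def toy_step(state):
    """Hypothetical 3-gene network: genes 1 and 2 swap, gene 3 is their AND."""
    a, b, c = state
    return (b, a, a & b)

landscape = attractor_landscape(toy_step, 3)
```

For this toy rule the eight configurations split into three basins: two fixed points, (0, 0, 0) and (1, 1, 1), each with a basin of two configurations, and one cycle of length two with a basin of four. The same routine applies to any deterministic successor map, but its cost grows as $2^N$, which is why only small networks can be sampled completely.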

As mentioned in the Introduction, there is numerical and experimental evidence supporting the hypothesis that the dynamical attractors of the genetic network correspond to different cell types or cell fates (such as apoptosis, quiescence, proliferation, differentiation, etc.) (Albert and Othmer, 2003; Mendoza and Alvarez-Buylla, 2000; Espinosa-Soto et al., 2004; Huang and Ingber, 2000; Huang et al., 2005; Gardner et al., 2003). Therefore, given the biological significance of attractors, the problem of robustness and evolvability of GRN can be addressed by studying the robustness and evolvability of the attractor landscape in our RBN models, notably the conservation of attractors and the emergence of new ones under perturbation of the network structure. Within this framework, changes in the attractor landscape of the network due to mutations in the network structure (in particular, gene duplication and divergence) would correspond to changes in cellular phenotypes or accessible cellular fates. This point of view allows us to give the following operational definition of robustness and evolvability:

We will say that an RBN with $n$ attractors is robust under gene duplication and divergence if all of its $n$ attractors are conserved after the duplication and divergence of one gene. Additionally, the network is evolvable if, as a result of these mutations, new attractors emerge.

Since the attractor landscape is an emergent property of the entire network, it is clear that the robustness and evolvability defined previously, which entail the conservation of all the attractors of the network and the emergence of new ones, are not associated with a single gene or a subset of genes. Rather, they are dynamical properties emerging from the collective behavior of the whole network. In other words, these kinds of robustness and evolvability are distributed as defined by Wagner (2004), or degenerate as defined by Edelman and Gally (2001).
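The operational definition above can be phrased as two set predicates. This is only a sketch under stated assumptions: each attractor is represented as a frozenset of configurations, the attractors of the mutated network are assumed to be already expressed in terms of the original $N$ genes, and "conserved" is read as exact equality of the state sets. The function names and the example data are hypothetical.

```python
def is_robust(original, mutated):
    """True if every original attractor reappears in the mutated network."""
    return all(attractor in mutated for attractor in original)

def is_evolvable(original, mutated):
    """True if the mutated network has at least one attractor that is new."""
    return any(attractor not in original for attractor in mutated)

# Made-up attractors, written as strings of gene states for brevity.
original = {frozenset({"000"}), frozenset({"010", "101"})}
mutated = {frozenset({"000"}), frozenset({"010", "101"}), frozenset({"111"})}

robust = is_robust(original, mutated)        # both original attractors survive
evolvable = is_evolvable(original, mutated)  # {"111"} is a new attractor
```

In this example the network is both robust and evolvable, since the two properties are not mutually exclusive under this definition.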


4. Gene duplication and divergence

The process of gene duplication followed by divergence is implemented in the network model as follows. We start with a RBN composed of $N$ genes, which we will refer to as the original network. Then we randomly choose one of its $N$ genes and duplicate it, generating a new network with $N + 1$ genes (see Fig. 2). Divergence is simulated by implementing one or more of the following "mutations" in the new network: (i) the input connections of the duplicate gene are randomly rewired; (ii) the output connections of the duplicate gene are randomly rewired; (iii) the Boolean rule of the duplicate gene is randomly changed; (iv) the Boolean rules of the outputs of the duplicate gene (namely, the Boolean rules of the genes for which the duplicate gene is an input) are randomly changed. Note that in mutations (i) and (ii) neither the number of input connections nor the number of output connections is changed. In both cases the connections are rewired but the total number of them remains the same.

Fig. 2. Gene duplication and divergence. (a) We start with an original network with $N$ genes. Each gene has an associated set of inputs, a set of outputs and a Boolean function $f$. We randomly choose one of these genes for duplication (for instance the gene $\sigma_n$ emphasized in gray). (b) We make a copy of $\sigma_n$, with the same inputs, outputs and Boolean function, and add it to the network as the gene $\sigma_{N+1}$. (c) Some of the inputs and outputs of the new gene are randomly rewired, and its Boolean functions randomly mutated. The network we end up with contains an extra gene, which is a mutated copy of one of the genes of the original network.

Mutation (iv) requires further explanation. Once the duplicate gene is added to the network, each of its targets will have an extra regulator, which is the duplicate gene itself. Therefore, the Boolean rules of these targets will need to be extended to accept the duplicate gene as an extra input. This extension is carried out in such a way that, when the duplicate gene is off (it is equal to 0) the Boolean rules of its targets will have the same values as they had before duplication. However, when the duplicate gene is on (it is equal to 1), the Boolean rules of its targets may have the same values as before or they can mutate and acquire new values. The situation is illustrated in Table 1.

Mutations (i) and (ii) correspond to structural divergence, whereas mutations (iii) and (iv) correspond to divergence in functionality. These two types of divergence have been observed to occur in duplicate genes of real organisms (Teichmann and Babu, 2004). We will refer to the new network obtained by this process as the mutated network.

Motivated by the hypothesis that the different cellular fates or cellular types are encoded in the dynamical attractors of the network, and the experimental evidence supporting it, our goal is to determine the probability $p(q)$ that a given percentage $q$ of the attractors of the original network are conserved in the mutated network. In order to do so, we consider the case of extreme divergence in which all four types of mutation mentioned above are carried out. The above is justified by the fact that the rate of divergence after the gene duplication event is between one and two orders of magnitude larger than the rate of duplication (Gu et al., 2005). Therefore, immediately after the gene duplication in our model, we implement the four types of mutation described above. Presumably, the

Table 1
Extension of the Boolean rule of the genes for which the duplicate gene $\sigma_{N+1}$ is an input

(a)
$\sigma_{m_1}$   $\sigma_{m_2}$   $f_m$
0   0   0
0   1   1
1   0   0
1   1   1

(b)
$\sigma_{m_1}$   $\sigma_{m_2}$   $\sigma_{N+1}$   $f_m$
0   0   0   0
0   1   0   1
1   0   0   0
1   1   0   1
0   0   1   0
0   1   1   1
1   0   1   0
1   1   1   0

In the original network the gene $\sigma_m$ has only two inputs, $\sigma_{m_1}$ and $\sigma_{m_2}$. Hence, its Boolean rule $f_m$ consists of four values, as shown in (a). After duplication and divergence, it might be possible that the new gene $\sigma_{N+1}$ in the mutated network has $\sigma_m$ as one of its outputs. In that case, the Boolean rule $f_m$ has to be extended to accept $\sigma_{N+1}$ as a new input. The extension of $f_m$ is carried out in such a way that $f_m$ will acquire the same value it acquired before duplication whenever $\sigma_{N+1} = 0$, as in the upper half of (b). The part of $f_m$ corresponding to $\sigma_{N+1} = 1$ is randomly assigned again, as in the lower half of (b).
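The rule extension of Table 1 can be sketched as a list operation on the truth table. The function name `extend_rule` is hypothetical, and drawing the new entries with the same bias $p$ used for the original rules is an assumption. Rows are ordered as in Table 1, so the first half of the extended table corresponds to $\sigma_{N+1} = 0$ and the second half to $\sigma_{N+1} = 1$.

```python
import random

def extend_rule(table, p=0.5, rng=None):
    """Extend a truth table so that it accepts the duplicate gene as an input."""
    rng = rng or random.Random()
    # Upper half (duplicate gene off): same values as before duplication.
    upper = list(table)
    # Lower half (duplicate gene on): every entry is assigned again at random.
    lower = [1 if rng.random() < p else 0 for _ in table]
    return upper + lower

# Rule (a) of Table 1, rows ordered 00, 01, 10, 11.
f_m = [0, 1, 0, 1]
extended = extend_rule(f_m, rng=random.Random(1))
```

Whatever the random draw, `extended[:4]` equals the original rule, which is the property that keeps the targets unchanged while the duplicate gene is off.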


Table 2
Typical outcome of the attractors of the original and mutated networks

Original network      Mutated network        Classification

Attractor $A_1$       Attractor $B_1$
*010100011011011      *0101000110110111
*000111000101110      *0001110001011101      Identity
*110011001101010      *1100110011010100

Attractor $A_2$       Attractor $B_2$
*101010111011101      *1010101110111011
*010101101101011      *0101011011010110
                      0001000001000101       Expansion
*110001011010001      *1100010110100011
                      1000001100110100

Attractor $A_3$       Attractor $B_3$
*000001100010011      *0000011000100110
*000111011010010      *0001110110100100
111110011011101                              Contraction
*111111011011101      *1111110110111011
111001100101101

                      Attractor $B_4$
                      01000001001100101
                      00100000110101111
                      10000100110100101      Innovation
                      10010100110100000

In this table we show an example for the case $N = 15$, $K = 2$ and $p = 1/2$. After duplication and divergence, the original attractor landscape, consisting of the attractors $A_1$, $A_2$, $A_3$, changes and becomes the transformed attractor landscape, consisting of the attractors $B_1$, $B_2$, $B_3$, $B_4$. This table shows the following transformations of the individual attractors: identity, $A_1 = B_1$; expansion, $A_2 \subset B_2$; contraction, $A_3 \supset B_3$; and innovation, in which a new attractor $B_4$ emerges. The configurations marked with a star are identical in both sets. Since the original network has $N$ genes and the mutated network has $N + 1$, in order to compare the two sets of attractors we have ignored the state of the duplicate gene $\sigma_{N+1}$ in the mutated network.
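The comparison summarized in Table 2 can be sketched as follows. This is a hedged illustration rather than the procedure actually used for the simulations: attractors are represented as sets of bit strings, the duplicate gene is assumed to be the last character of each mutated configuration, and the names `project` and `classify` as well as the extra label "lost" (an original attractor with no counterpart) are introduced here for convenience.

```python
def project(attractor):
    """Drop the state of the duplicate gene (assumed to be the last bit)."""
    return frozenset(config[:-1] for config in attractor)


def classify(original, mutated):
    """Label every original attractor and list the innovations.

    original: list of sets of N-character bit strings
    mutated:  list of sets of (N+1)-character bit strings
    Returns (labels, innovations): one label per original attractor, and the
    indices of mutated attractors sharing no configuration with any original.
    """
    originals = [frozenset(a) for a in original]
    projected = [project(b) for b in mutated]

    labels = []
    for a in originals:
        label = "lost"                    # no related transformed attractor found
        for b in projected:
            if a == b:
                label = "identity"
            elif a < b:                   # proper subset: old states kept, new ones added
                label = "expansion"
            elif a > b:                   # proper superset: some old states disappeared
                label = "contraction"
            else:
                continue
            break
        labels.append(label)

    innovations = [j for j, b in enumerate(projected)
                   if all(b.isdisjoint(a) for a in originals)]
    return labels, innovations


# Toy example with N = 2 (not data from the paper).
labels, new = classify([{"00", "01"}, {"10"}],
                       [{"000", "011"}, {"100", "111"}])
```

In this toy case the first original attractor is identically conserved and the second one is expanded, with no innovation.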

⁵ For simplicity, we will refer to the attractors of the original and mutated networks as the original and transformed attractors, respectively.


robustness observed in this case is a lower bound for the one obtained in less extreme cases of divergence.

It is important to distinguish between changes in the attractors and changes in their basins of attraction. As we will see later, even if all the attractors of the original network are identical to those of the mutated network, their respective basins of attraction may be different. Such a situation would correspond to changes in differentiation and gene expression pathways rather than in phenotypes.

5. Results for homogeneous random networks

In this section we present numerical results for networks with homogeneous random topologies. In the numerical simulations, the original network has $N = 20$ genes and the mutated network has $N + 1 = 21$ (except for Table 2, in which $N = 15$).⁴ The probability of gene expression used to generate the Boolean rules is $p = 1/2$, for which the average sensitivity is $S = K/2$. By varying $K$ we can make the system transit from the ordered to the chaotic regime, with the phase transition occurring at $K = 2$. To obtain good statistics we average the relevant quantities presented here over 20,000 network realizations.

⁴ The reason for working with such small networks is that in this way we can sample all the configuration space and find all the attractors.

Our numerical simulations show that the original attractors can be transformed in many different ways after duplication and divergence.⁵ Since the mutated network has $N + 1$ genes, whereas the original network has only $N$ genes, in order to compare the original and the transformed attractors we ignore the value of the duplicate gene in the mutated network, comparing the values of only those genes that are common to both networks. Table 2 shows a typical outcome of the attractors of the original and mutated networks. In this particular realization, the original network has three attractors, labeled $A_1, A_2, A_3$, whereas the mutated network has four attractors, denoted as $B_1, B_2, B_3, B_4$. From Table 2 the following transformations between attractors are apparent:

• Identity, in which an original attractor is identical to one of the transformed attractors (such as $A_1 = B_1$).
• Expansion, in which one of the original attractors acquires more configurations in the mutated network and preserves all its old ones (such as $A_2 \subset B_2$).
• Contraction, where one of the original attractors loses some of its configurations in the mutated network (e.g. $A_3 \supset B_3$).
• And, interestingly, we observe innovation, which consists in the emergence of a fully new attractor in the mutated network, such as $B_4$.

It is important to distinguish again between the transformations undergone by the individual attractors (identity, expansion, contraction) and the transformation undergone by the entire attractor landscape. It is clear that expansion and contraction involve transformations in both the individual attractors and the attractor landscape. However, it is possible that all the original attractors are identically conserved in the mutated network and nonetheless the attractor landscape changes, either because of the emergence of new attractors (innovation), because the basins of attraction changed, or both. From a biological point of view it is desirable for the mutated network to preserve all of the phenotypes encoded in the attractors of the original network and to acquire new ones. Therefore, we consider as biologically relevant only the transformations given by identities, expansions and innovations. We will generically refer to the first two kinds of transformations, i.e. identities and expansions, as conservation of attractors. Additionally, we demand that the mutated network has at least the same number of attractors as the original one (it can have more). In this way, we allow the possibility for the mutated network to develop new phenotypes and at the same time to preserve the old ones. This condition is not as restrictive as one might think, since the number of attractors in RBN grows, on average, with the size of the network (Aldana et al., 2003).

⁵ (continued) Analogously, we will refer to the attractor landscape of the original network as the original attractor landscape, and to that of the mutated network as the transformed attractor landscape.

Indeed, Fig. 3 shows the probability $p_{\le}(K)$ that the number $n_o$ of original attractors is less than or equal to the number $n_t$ of transformed attractors, as a function of $K$ (solid line). Note that this probability, a decreasing function of $K$, is higher for ordered and critical networks ($K = 1, 2$) than for chaotic networks ($K > 2$). Note also that $p_{\le}(K) > 1/2$ for any value of $K$, which means that most likely the mutated network will have at least the same number of attractors as the original network.

On the other hand, Fig. 4 shows the probability $p(q)$ that a percentage $q$ of the original attractors are conserved in the mutated network, for different values of the connectivity $K$. Note that for $K = 1$ and 2 the maximum of $p(q)$ occurs at $q = 100\%$. This result indicates that, with the highest probability, ordered and critical networks conserve all their attractors after gene duplication and divergence. We should recall that the results displayed in Fig. 4 were obtained for the case of extreme divergence, where every feature of the duplicate gene was randomly mutated (Boolean rules and input and output connections). Our numerical simulations indicate that, as $K$ increases, the peak at $q = 100\%$ decreases and almost vanishes for $K \ge 4$. For networks in the chaotic regime the maximum of $p(q)$ is attained at $q = 0\%$, which indicates that the property of conserving attractors is lost as we enter the chaotic region.

It is worth mentioning that the genes in a RBN can be classified in two categories: frozen and non-frozen. The frozen genes reach a constant value after a transient time and remain constant thereafter, whereas the non-frozen genes keep changing in a pattern of cyclic activity. As the average network sensitivity $S = 2p(1 - p)K$ increases, the fraction of frozen genes decreases and eventually vanishes for $S \to \infty$ (Kauffman, 1993). To some extent, the existence of frozen genes may account for the dynamical stability of networks with low sensitivities. However, we have observed that the large peaks of $p(q)$ at $q = 100\%$ obtained for ordered and critical networks cannot be completely attributed to the existence of frozen genes. For example, the shaded histogram in Fig. 4 for $K = 2$ is the probability of conservation of attractors given that all the inputs of the duplicate gene in the mutated network are frozen. The large difference between the gray and black histograms in this figure reveals that, even when the inputs of the duplicate gene are non-frozen, all the original attractors are conserved with a high probability.

Fig. 3. Probability $p_{\le}(K)$ that $n_o \le n_t$, where $n_o$ and $n_t$ are the numbers of original and transformed attractors, respectively (solid line). $p_{\le}(K)$ can be decomposed into the probability $p_{=}(K)$ that $n_o = n_t$ (dashed line), and the probability $p_{<}(K)$ that $n_o < n_t$ (dotted-dashed line). Note that for $K \le 2$ the probability $p_{=}(K)$ is smaller than $p_{<}(K)$, whereas for $K > 2$ the opposite happens. The crossover occurs at some intermediate point between $K = 2$ and 3.

Fig. 4. Probability $p(q)$ that a given percentage $q$ of the original attractors are conserved (identities or expansions) after duplication and divergence. This probability is shown for homogeneous random networks operating in the ordered ($K = 1$), critical ($K = 2$) and chaotic ($K \ge 3$) regimes. Note the large peak at $q = 100\%$ for $K = 1$ and 2, which reveals that ordered and critical networks conserve all their attractors with a high probability. As we can see, this property is lost as the network enters the chaotic region. The shaded histogram shown for $K = 2$ is the probability of conservation of attractors given that all the inputs of the duplicate gene are frozen. The simulations were carried out for networks with $N = 20$ and 20,000 network realizations.

Fig. 6. Probability $p_e(K)$ for the transformed attractor landscape to contain all the original attractors and at least a new one, as a function of the network connectivity $K$. This graph reveals that critical networks have the largest potential to acquire new attractors while preserving all the old ones. The simulations were carried out for networks with $N = 20$ and 20,000 network realizations.
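As a small worked illustration of the sensitivity formula $S = 2p(1 - p)K$ quoted above, the sketch below evaluates $S$ and reports the dynamical regime. It relies on the standard criterion that the transition sits at $S = 1$, which is consistent with the statement that for $p = 1/2$ the transition occurs at $K = 2$. The function names and the tolerance `tol` are assumptions of this sketch.

```python
def average_sensitivity(K, p=0.5):
    """Average network sensitivity S = 2 p (1 - p) K."""
    return 2.0 * p * (1.0 - p) * K


def regime(K, p=0.5, tol=1e-9):
    """Classify a network as ordered (S < 1), critical (S = 1) or chaotic (S > 1)."""
    S = average_sensitivity(K, p)
    if abs(S - 1.0) <= tol:
        return "critical"
    return "ordered" if S < 1.0 else "chaotic"


# Regimes for the connectivities used in Fig. 4, with p = 1/2.
regimes = {K: regime(K) for K in (1, 2, 3, 4)}
```

With $p = 1/2$ this reproduces the classification used in the text: $K = 1$ ordered, $K = 2$ critical, $K \ge 3$ chaotic.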

So far we have generically considered both identities and expansions of the original attractors as conservation of attractors. It is clear that the information of the original network encoded in its attractors will be conserved under these two kinds of transformations. However, it is important to determine to what extent the original attractors are identically conserved in the mutated network, and to what extent they are expanded. As Fig. 5 reveals, the probability of identities is considerably larger than the probability of expansions. Therefore, when an original attractor is conserved, most likely it is identically conserved.

Another important quantity is the probability $p_e(K)$ for the mutated network to have new attractors (innovations) given that all the original attractors were conserved. This probability is shown in Fig. 6 as a function of the network connectivity $K$. As can be seen from this figure, the maximum probability occurs at $K = 2$, which indicates that networks operating in the critical regime have simultaneously the maximum robustness (all the original attractors are conserved) and the maximum evolvability (new attractors appear).

Fig. 5. Probability of conservation (identities or expansions) of attractors as a function of the network connectivity $K$. The dashed and dotted-dashed curves correspond to the probability for an original attractor to be identically conserved or to be expanded, respectively. The solid curve gives the probability of conservation of attractors, namely, the probability of either identities or expansions. The simulations were carried out for networks with $N = 20$ and 20,000 network realizations.

6. Results for scale-free networks

In this section we consider networks with Poisson input topology and scale-free output topology. According to some recent studies, this architecture is similar to the one observed in gene transcription regulatory networks of real organisms (Albert, 2005; Guelzim et al., 2002; Babu et al., 2004). To see that this is indeed the case, in Fig. 7 we show the input distributions $p_i(k)$ and in Fig. 8 the output distributions $p_o(l)$ for the transcriptional regulatory networks of E. coli, B. subtilis and S. cerevisiae. The data used to generate these distributions are in the public domain (Martínez-Antonio and Collado-Vides, 2003; Makita et al., 2004; Luscombe et al., 2004). As can be seen from these figures, the input distribution for E. coli is well approximated by a Poisson distribution with average $K \approx 2.16$, whereas the output distribution has a long tail that fits a power law with exponent $\gamma \approx 1.636$. Analogously, the input distribution for B. subtilis can also be approximated by a Poisson distribution with average $K \approx 1.3$, while the output distribution fits a power law with exponent $\gamma \approx 1.673$. Finally, the input distribution for S. cerevisiae, rather than being Poissonian, is better approximated by an exponential $e^{-\alpha k}$ with $\alpha \approx 0.5$, whereas the output distribution follows a power law with exponent $\gamma \approx 0.984$. It is worth mentioning that the average input connectivity of the transcription regulatory network of S. cerevisiae, as computed from the data shown in Fig. 7(c), turns out to be $K \approx 2.05$.

Fig. 7. Input connectivity distribution $p_i(k)$ for (a) E. coli, (b) B. subtilis and (c) S. cerevisiae. In all the graphs the black solid line corresponds to the experimental data. The dashed curves in panels (a) and (b) correspond to Poisson distributions with averages $K = 2.16$ and 1.63, respectively, whereas the dashed curve in (c) is the best exponential distribution that fits the data, with an average $K = 2.05$. All the graphs are in log-linear scales to better appreciate the differences between the experimental data and the best analytic fit.

Fig. 8. Output connectivity distribution $p_o(l)$ for (a) E. coli, (b) B. subtilis and (c) S. cerevisiae in log–log scale. We use logarithmic bins to generate the probability distribution from the experimental data. The dashed line in each graph is the plot of the power law that best fits the data, yielding the scale-free exponents $\gamma = 1.636$ for E. coli, $\gamma = 1.673$ for B. subtilis and $\gamma = 0.984$ for S. cerevisiae.

The above results show examples of real genetic networks that, to the best of the current data available, have input distributions with short tails (whether Poissonian or exponential) and output distributions with long tails that can be approximated by power laws. Therefore, from a theoretical point of view it is well justified to analyse the robustness of the attractors of RBN with Poisson input and scale-free output topologies. Such networks are easily generated in the computer by first assigning to each $\sigma_n$ its number of outputs $l_n$, where $l_n$ is a random variable drawn from a power-law distribution, and then choosing these $l_n$ outputs at random from anywhere in the system. By this process the input distribution will automatically be Poissonian.
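The construction just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the simulations: the out-degree is drawn from $P(l) \propto l^{-\gamma}$ on $l = 1, \dots, N$, the $l_n$ targets of a gene are taken to be distinct, self-regulation is allowed, and all function names are hypothetical.

```python
import random


def power_law_outdegrees(n_genes, gamma, rng):
    """Draw one out-degree per gene from P(l) proportional to l**(-gamma), l = 1..n_genes."""
    support = list(range(1, n_genes + 1))
    weights = [l ** (-gamma) for l in support]
    return rng.choices(support, weights=weights, k=n_genes)


def scale_free_output_network(n_genes, gamma, seed=None):
    """Build the wiring of a network with scale-free output topology.

    Returns (out_degrees, inputs), where inputs[t] lists the regulators of
    gene t. Because targets are picked uniformly at random, the resulting
    in-degrees are approximately Poisson distributed.
    """
    rng = random.Random(seed)
    out_degrees = power_law_outdegrees(n_genes, gamma, rng)
    inputs = [[] for _ in range(n_genes)]
    for source, l in enumerate(out_degrees):
        # l distinct targets chosen from anywhere in the system
        for target in rng.sample(range(n_genes), l):
            inputs[target].append(source)
    return out_degrees, inputs


out_degrees, inputs = scale_free_output_network(20, gamma=2.16, seed=1)
mean_in_degree = sum(len(regs) for regs in inputs) / len(inputs)
```

By construction the total number of inputs equals the total number of outputs, so the mean in-degree of a single realization equals its mean out-degree.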

We start with an original network with $N = 20$ genes. Then, we duplicate one of these genes and change its Boolean rule, its input and output connections, and the extension of the Boolean rules of its target genes. As in the previous section, we consider the case of extreme divergence in which all of the above properties are randomly changed. We end up again with a mutated network with $N + 1 = 21$ genes in which the last gene is a mutated copy of one of the original genes. Then we proceed to compare the set of original attractors with the set of transformed attractors in order to compute the probability of conservations and the probability of innovations. Everything seems to be as in the previous section. There is an important difference, though. Since in this case we are considering networks with scale-free output topology, the genes are no longer statistically equivalent. Such networks are characterized by a high heterogeneity in the connectivity of the individual genes. Therefore, care must be taken in the way we choose the gene to be duplicated, as we do not expect, a priori, the robustness of the attractors to be the same if we systematically duplicate genes with a small number of outputs as if we systematically duplicate genes with many outputs. However, to the best of our knowledge, there is no experimental evidence for the duplication rate of highly connected genes to be particularly different from that of poorly connected genes (Yanai et al., 2000; Evangelisti and Wagner, 2004; Teichmann and Babu, 2004). Hence, the approach adopted here is to randomly choose a gene and duplicate it, regardless of its connectivity. Since in a scale-free network the majority of the genes have just a few connections, most likely the duplicate gene will be poorly connected and only occasionally will a highly connected gene be chosen for duplication.

Fig. 9 shows the probability $p(q)$ that a fraction $q$ of the original attractors are conserved in the mutated network for the case in which the duplicated gene is chosen randomly from anywhere in the system. The different graphs in this figure correspond to different values of the scale-free exponent $\gamma$, chosen in such a way that the average network connectivity $K$ acquires the values 1–4.⁶ From Fig. 9 it can be observed that the maximum of $p(q)$ for ordered and critical networks is attained at $q = 100\%$ and therefore, in the ordered and critical regimes all the attractors are conserved with the maximum probability. As we enter the chaotic region the value of $p(100)$ decreases. However, by comparing the results displayed in Figs. 4 and 9 we can see that $p(100)$, i.e. the probability of conservation of all the attractors, is considerably larger for scale-free networks than for homogeneous random networks, even for $K > 2$ (which is the chaotic regime for networks with homogeneous random topology). This result is a consequence of the fact that the gene to be duplicated is randomly chosen from anywhere in the system. Therefore, in scale-free networks the duplicate gene is poorly connected in the majority of the duplication events, and only occasionally is a highly connected gene duplicated. Contrary to this, in homogeneous random networks the connectivity of every duplicated gene is nearly the same as the average connectivity of the entire network.

Intuitively, one would expect that duplicated genes with a large number of connections would be more likely to change the dynamical attractors of the network than duplicated genes with a small number of connections. To see that this is indeed the case, we analyse the behavior of the probability $p(100)$ as a function of the output connectivity $l$ of the duplicated gene. To do so, we systematically duplicate and mutate genes with exactly $l$ target genes and then vary $l$ from 1 up to $N$. Fig. 10 shows the results of this analysis for scale-free networks with average connectivity $K = 2$. As we can see from this figure, $p(100)$ decreases rapidly with $l$ and becomes more or less constant for $l \ge 5$. The above result indicates that networks with scale-free topology are more robust than homogeneous random networks in the sense that, when randomly chosen genes are duplicated, the probability of conservation of all the dynamical attractors is larger for scale-free networks than for homogeneous random ones.

We conclude this section by presenting in Fig. 11 the probability $p_e(K)$ for the mutated network to conserve all the original attractors and to have new ones (innovations), as a function of the average network connectivity $K$. Again, we observe that this probability is maximum for critical networks with average connectivity $K = 2$.

Fig. 9. Probability $p(q)$ for a percentage $q$ of the original attractors to be conserved after the duplication and divergence of a randomly chosen gene of the original network, for networks with Poisson input topology and scale-free output topology. The scale-free exponent has been adjusted to produce average connectivities $K = 1$ ($\gamma = 8.00$), $K = 2$ ($\gamma = 2.16$), $K = 3$ ($\gamma = 1.66$) and $K = 4$ ($\gamma = 1.35$). Note that $p(100)$ is larger in this case than for homogeneous random networks with the same connectivities (see Fig. 4), indicating that scale-free networks are less prone to behave chaotically than homogeneous random networks. The simulations were carried out for networks with $N = 20$ and 20,000 network realizations.

⁶ The average network connectivity $K$ of a scale-free network with $N$ genes and exponent $\gamma$ is given by $K = \left( \sum_{k=1}^{N} k^{-\gamma} \right)^{-1} \sum_{k=1}^{N} k^{1-\gamma}$. The exponent $\gamma$ can thus be adjusted to obtain any desired connectivity $K$.
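The adjustment of $\gamma$ mentioned in footnote 6 can be sketched numerically. The snippet below evaluates the formula for $K$ and inverts it by bisection, using the fact that $K$ decreases monotonically as $\gamma$ grows. The function names, the bracketing interval `[lo, hi]` and the tolerance are assumptions of this sketch; the paper does not state how the exponents were actually obtained.

```python
def mean_connectivity(gamma, n_genes):
    """K = (sum_k k^-gamma)^-1 * sum_k k^(1-gamma), with k = 1..n_genes."""
    ks = range(1, n_genes + 1)
    norm = sum(k ** (-gamma) for k in ks)
    first_moment = sum(k ** (1.0 - gamma) for k in ks)
    return first_moment / norm


def gamma_for_connectivity(target_k, n_genes, lo=0.0, hi=50.0, tol=1e-10):
    """Find the exponent gamma in [lo, hi] whose mean connectivity equals target_k."""
    if not mean_connectivity(hi, n_genes) < target_k < mean_connectivity(lo, n_genes):
        raise ValueError("target connectivity is outside the reachable range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_connectivity(mid, n_genes) > target_k:
            lo = mid      # connectivity still too large: increase gamma
        else:
            hi = mid      # connectivity too small: decrease gamma
    return 0.5 * (lo + hi)


gamma_critical = gamma_for_connectivity(2.0, n_genes=20)
```

For $N = 20$ and $K = 2$ the result should land close to the value $\gamma = 2.16$ quoted in the caption of Fig. 9. Note that $K = 1$ is only approached asymptotically as $\gamma$ grows, so the function rejects targets at or below that limit.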

Fig. 10. Probability $p(100)$ of conservation of all the original attractors as a function of the output connectivity $l$ of the duplicate gene. Note that $p(100)$ decreases rapidly with $l$ and becomes more or less constant for $l \ge 5$. Therefore, duplicate genes with many outputs are more likely to change the attractor landscape of the original network than poorly connected duplicate genes. The simulations were carried out for networks with $N = 20$ and 20,000 network realizations.

Fig. 11. Probability $p_e(K)$ for the mutated network to have new attractors given that all the original ones were conserved, for the case in which the output topology of the network is scale-free. The simulation was carried out for networks with $N = 20$ and 20,000 network realizations.

7. Results for large networks

In the work presented in the previous sections we have considered small networks with $N = 20$ genes ($N = 21$ for the mutated network). The reason, as explained before, is that in this way we can explore all the configuration space (consisting of $\Omega = 2^{20}$ = 1,048,576 and $\Omega = 2^{21}$ = 2,097,152 configurations for the original and mutated networks, respectively), which allows us to find all the attractors of the networks. The caveat is that small networks typically have a small number of attractors. The conservation of just a few attractors, one might argue, is more likely to occur than the conservation of a large number of attractors. Therefore, the question arises as to whether or not the results obtained for small networks would still be valid for larger networks with a larger number of attractors. In this section we go one order of magnitude upwards and present results for networks with $N = 200$ genes. However, due to limitations in both computer memory and computing time, it is not feasible to explore all the configuration space for such large networks. One necessarily has to sample just a tiny subset of the configuration space, with the hope that most, if not all, of the attractors are found.

For Boolean networks with $N = 200$ genes, the configuration space consists of $\Omega = 2^{200} \approx 10^{60}$ configurations, out of which we sample only a randomly chosen subset of $\Omega_s = 10^6$. The attractors with the largest basins of attraction will come out as a result of this random sampling, and then the probability of conservation for those attractors will be determined. We restrict our analysis to networks in the critical regime, i.e. networks with average connectivity $K = 2$, for two reasons. On the one hand, networks with $N = 200$ operating in the ordered regime still have a small number of attractors. In our numerical simulations we found that the probability for the network to have just one attractor is twice as large for $K = 1$ as for $K = 2$. Therefore, the results for networks with $K = 1$ would not differ substantially from the results presented before. On the other hand, networks with $N = 200$ operating in the chaotic regime ($K > 2$) typically have extremely long transient times and long attractor lengths, which makes it practically impossible to find their attractors in a reasonable time. The existence of extremely long transients and long cycles is typical for networks in the chaotic regime, becoming worse as the value of $K$ increases. Therefore, we restrict our analysis to networks with $K = 2$.

Fig. 12 shows the probability $p(q)$ that a percentage $q$ of the original attractors are conserved in the mutated network, for networks with $N = 200$ and $K = 2$. The data plotted in Fig. 12(a) correspond to homogeneous random networks, whereas those plotted in Fig. 12(b) correspond to networks with scale-free output topology. Since we are sampling only a small fraction of the configuration space, this figure shows the probability of conservation only for those attractors that were found. So, the peak at $q = 100\%$ refers to the conservation of 100% of the attractors that came out with the random sampling. There might be some attractors, with small basins of attraction, that did not come out and therefore are not taken into account in Fig. 12. However, we can see from this figure that the probability of conservation $p(q)$ for networks with $N = 200$ genes is remarkably similar to that plotted in Figs. 4 and 9 for small networks with $N = 20$ and $K = 2$, where all the configuration space was probed and all the attractors were found. This similarity suggests that almost all the attractors, if not all, are being found and taken into account in the data reported in Fig. 12.

Fig. 12. Probability $p(q)$ that a percentage $q$ of the original attractors are conserved in the mutated network, for large networks with $N = 200$. The two graphs correspond to (a) homogeneous random topology ($K = 2$) and (b) scale-free topology ($\gamma = 2.38$). In both cases the average network connectivity is $K = 2$. To compute these graphs we sampled $10^6$ configurations for each network realization, and then averaged the results over 20,000 network realizations.

Fig. 13. (a) Probability $p_a(n)$ for the original network to have $n$ attractors in any arbitrary network realization, for the case $N = 200$ and $K = 2$. Note that this probability can be approximated by a power law, $p_a(n) \sim n^{-\beta}$, where $\beta \approx 2.2$ (dashed line). (b) Conditional probability $p(100|n)$ that 100% of the original attractors are conserved given that there were $n$ of them. Note that this probability is a slowly decreasing function of $n$. The data were computed using 20,000 network realizations.
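The random-sampling strategy described above can be sketched as follows. This is a minimal sketch under explicit assumptions rather than the authors' implementation: the network is a homogeneous random Boolean network in which every gene has exactly `k` distinct inputs, rules are biased with probability `p`, updating is synchronous, and all function names as well as the `max_steps` cut-off are hypothetical.

```python
import random


def random_network(n, k, p=0.5, seed=None):
    """Homogeneous RBN: every gene gets k random inputs and a random rule."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    rules = [[1 if rng.random() < p else 0 for _ in range(2 ** k)]
             for _ in range(n)]
    return inputs, rules


def step(state, inputs, rules):
    """Synchronous update of all genes."""
    nxt = []
    for regulators, rule in zip(inputs, rules):
        index = 0
        for i in regulators:              # encode the input values as an integer
            index = (index << 1) | state[i]
        nxt.append(rule[index])
    return tuple(nxt)


def find_attractor(state, inputs, rules, max_steps=10_000):
    """Iterate from `state` until a configuration repeats; return the cycle."""
    seen = {}
    t = 0
    while state not in seen:
        if t >= max_steps:
            return None                   # transient too long: give up
        seen[state] = t
        state = step(state, inputs, rules)
        t += 1
    start = seen[state]                   # time at which the cycle was entered
    return frozenset(s for s, when in seen.items() if when >= start)


def sample_attractors(inputs, rules, n_samples, seed=None):
    """Estimate the attractors and their relative basin sizes by random sampling."""
    rng = random.Random(seed)
    n = len(inputs)
    counts = {}
    for _ in range(n_samples):
        start = tuple(rng.randint(0, 1) for _ in range(n))
        attractor = find_attractor(start, inputs, rules)
        if attractor is not None:
            counts[attractor] = counts.get(attractor, 0) + 1
    return counts


inputs, rules = random_network(n=10, k=2, seed=3)
basin_counts = sample_attractors(inputs, rules, n_samples=200, seed=4)
```

Each count divided by the number of samples estimates the fraction of configuration space in the corresponding basin, so attractors with large basins are found first, as the text notes. For small $N$ the same functions can be run over all $2^N$ initial configurations instead of a random subset; a pure-Python loop of this kind would likely be too slow for $N = 200$ with $10^6$ samples and many realizations.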

The two prominent peaks in Fig. 12 (and in Figs. 4 and 9), one at $q = 100\%$ and the other at $q = 0\%$, indicate that, for critical networks, either all of the original attractors are conserved, or none of them is conserved. Intermediate cases occur with very low probability. This kind of all-or-none behavior could be trivially satisfied if there were only one attractor in the original network. Then the one attractor would be either conserved or not conserved in the mutated network. Nonetheless, the distribution of the number of original attractors has a long tail, as shown in Fig. 13(a), where the probability $p_a(n)$ for the original network to have $n$ attractors is plotted as a function of $n$. In other words, with a high probability the original network has more than just one attractor. Therefore, it is necessary to analyse the behavior of the probability of conservation of attractors $p(q)$ as a function of the number of original attractors. The results of this analysis are reported in Fig. 13(b) for networks with scale-free output topology ($N = 200$, $K = 2$). In this figure we plot the conditional probability $p(100|n)$ that 100% of the original attractors are conserved given that there were $n$ original attractors. It is apparent from this figure that $p(100|n)$ is a slowly decaying function of $n$ and that the probability of conservation of just one attractor ($\approx 0.6$) is only twice as large as the probability of conservation of 35 attractors ($\approx 0.3$). Consequently, the all-or-none behavior observed in Fig. 12 is only weakly related to the number of original attractors.

It is worth mentioning that it might not be biologically relevant to analyse even larger networks with more than a couple of hundred genes. It has been pointed out in recent studies that the genome is heterogeneous in terms of connectivity of individual genes: a large portion of the genes are effector genes (whose products are structural proteins, metabolic enzymes, tRNAs, etc.) that do not directly control the expression of other genes. These effector genes are enslaved by a core of regulatory genes that have direct outputs to the effector genes (via signaling molecules, transcription factors, cofactors, etc.). Thus, the entire genome of an organism can be seen as a "medusa network" with a regulatory head and many "tentacles" of enslaved effector genes. The dynamical properties of the network are determined by the head of the medusa network. Hence, considering just the head, which enslaves the effector genes, is sufficient for the model. The size of this head of regulatory genes is not precisely known, but is certainly substantially smaller than the total number of genes of a typical genome. For instance, in E. coli the head of the medusa network as computed from the Regulon Data Base consists of fewer than 80 genes (Martínez-Antonio and Collado-Vides, 2003).

⁸ The acquisition of new programs encoded in a set of genes can be more important than the acquisition of new genes in order to achieve higher organization in an individual. A striking example of this is found in the genomes of Drosophila melanogaster and Caenorhabditis elegans. The fly has fewer genes than the worm (about 14,000 as compared to 19,000). Additionally, the fly has almost twice as much non-coding DNA per gene (about 10,000 nucleotides on average, as compared with about 5000).

8. Summary and discussion

RBN are useful tools to understand the processes of cell differentiation and gene regulation. In spite of their simplicity, it has been possible to identify numerically and experimentally the dynamical attractors of Boolean networks with the gene expression patterns of some real GRN. This identification makes it possible to gain insight about the robustness and evolvability of GRN by analysing the robustness and evolvability of their attractor landscapes. Additionally, the analysis presented in this work allows us to give a first answer to an important question about the relevance of RBN for the modeling of biological systems due to the possible fragility of their dynamical attractors against perturbations (Aldana et al., 2003).

Robustness of the attractor landscape in RBN: The results reported in Figs. 4, 9 and 12 show that networks operating in the ordered and critical regimes are intrinsically robust against the duplication and full divergence of one gene.⁷ This robustness is reflected in the large peak obtained for the probability p(q) of conservation of attractors at q = 100%. For K = 1 and 2, p(100) is considerably larger than p(0), indicating that all the attractors of ordered and critical networks are conserved with a high probability. The situation changes drastically in the chaotic regime for homogeneous networks, where the probability p(100) becomes almost negligible compared with p(0) for K ≥ 4. The same change is less drastic for scale-free networks due to the existence of a large number of genes with low connectivity. Altogether, homogeneous random and scale-free networks operating in the chaotic regime are less robust because, with a high probability, their attractors are not conserved under the gene duplication and divergence processes. From a biological point of view, the total or partial loss of the network phenotypes encoded in its gene

⁷ The same is true for networks that operate near the critical regime, for instance, networks with average connectivity K = 2.1. The validity of this statement is grounded on the fact that the phase transition from the ordered to the chaotic regime in the RBN model is continuous.

expression patterns (attractors) is not compatible with the robustness observed in real GRN. Therefore, the above results give further support to the hypothesis that real GRN should be critical or near-critical, but not totally chaotic or ordered (Shmulevich et al., 2005).
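The conservation percentage q discussed above can be illustrated with a minimal Python sketch for small networks. It is not the protocol used in this work: the function names, the exhaustive enumeration of all 2^N configurations, and in particular the `duplicate_and_diverge` rule (a new gene with random inputs and a random function, plus one rewired regulatory input) are simplifying assumptions made only for illustration. Attractors of the mutated network are compared with the original ones after dropping the value of the duplicate gene.

```python
import itertools
import random


def random_network(n, k, rng):
    """N genes, each with K distinct random inputs and a random truth table."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every gene from the current configuration."""
    nxt = []
    for inp, tab in zip(inputs, tables):
        idx = 0
        for j in inp:
            idx = (idx << 1) | state[j]
        nxt.append(tab[idx])
    return tuple(nxt)


def attractors(inputs, tables):
    """Follow every one of the 2^N configurations (small N only) to its cycle."""
    found = set()
    for start in itertools.product((0, 1), repeat=len(inputs)):
        seen, path, s = {}, [], start
        while s not in seen:
            seen[s] = len(path)
            path.append(s)
            s = step(s, inputs, tables)
        found.add(frozenset(path[seen[s]:]))
    return found


def duplicate_and_diverge(inputs, tables, k, rng):
    """ASSUMED rule (hypothetical): append a gene with random inputs and a random
    function, then rewire one random input slot of one old gene to read it."""
    n = len(inputs)
    new_inputs = [list(i) for i in inputs] + [rng.sample(range(n), k)]
    new_tables = [list(t) for t in tables]
    new_tables.append([rng.randint(0, 1) for _ in range(2 ** k)])
    target = rng.randrange(n)
    new_inputs[target][rng.randrange(k)] = n
    return new_inputs, new_tables


def conservation(original, mutated):
    """Return (q, number of new attractors), ignoring the duplicate (last) gene."""
    projected = {frozenset(s[:-1] for s in att) for att in mutated}
    q = 100.0 * len(original & projected) / len(original)
    return q, len(projected - original)


if __name__ == "__main__":
    rng = random.Random(0)
    net = random_network(8, 2, rng)
    mut = duplicate_and_diverge(*net, 2, rng)
    print(conservation(attractors(*net), attractors(*mut)))
```

Repeating the last four lines over many network realizations and histogramming q would give an empirical estimate of p(q) under these assumptions; the exhaustive search limits this sketch to small N.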

Evolvability of the attractor landscape: In addition to the conservation of the original attractors, we observe the emergence of totally new ones in the mutated network. Therefore, not only does the duplicate gene acquire a new function after divergence (e.g. a new signaling molecule or a new structural protein), but with a high probability the whole network acquires novel phenotypes or gene regulatory programs encoded in the new attractors, while preserving the old ones.⁸ From the biological point of view, the emergence of new attractors allows for the possibility of evolving. Some of the phenotypes encoded in the new attractors might be beneficial for the adaptation of the organism to new environments. Some others might not be. In either case, the new attractors serve as the raw material on which natural selection would act. Within this framework, the results reported in Figs. 6 and 11 indicate that critical networks exhibit the maximum potential to evolve.

Robustness and evolvability: The coexistence of robustness and evolvability within the same system is illustrated schematically in Fig. 14, which shows the original and transformed attractor landscapes for a typical network realization with K = 2 and N = 15. The original attractor landscape consists of the three attractors A1, A2 and A3, which are identically conserved and become the attractors B1, B2 and B3, respectively, of the mutated network. In addition to these three attractors, the mutated network also has three new ones, B4, B5 and B6, which did not exist in the original network. On the one hand, the dynamical behavior of the network is robust since all three original attractors are conserved. On the other hand, it is evolvable since three new attractors appeared. This is typically the case for networks operating near the critical regime.

As mentioned before, the conservation of all the original attractors and the emergence of fully new ones entail a reconfiguration of the attractor landscape. This is because configurations that belong to the basin of attraction of a given attractor in the original network can belong to the basin of attraction of a different attractor (or attractors) in the mutated network. From a biological point

Even though there are fewer genes in the fly than in the worm, the former seems to be more complex than the latter. Thus, as Alberts et al. explain, “the molecular construction kit has fewer types of parts, but the assembly instructions – as specified by the regulatory sequences in the non-coding DNA – seem to be more voluminous” (Alberts et al., 2002).


Fig. 14. Schematic representation of the coexistence of robustness and evolvability within the same system. All the original attractors A1, A2 and A3 are identically conserved in the mutated network, where they become B1, B2 and B3, respectively. In this sense, the original network is robust. Additionally, the mutated network has three new attractors, B4, B5 and B6, which represent new phenotypes and new possibilities for the system to evolve.

Fig. 15. Restructuring of the attractor landscape after the duplication and divergence of one of the genes of the original network. Even though all the original attractors are conserved in this particular example (and a new one emerges), their basins of attraction “exchange” configurations in the mutated network, thus changing the gene expression and differentiation pathways of the network.


of view, this corresponds to changes in gene expression and differentiation pathways. To illustrate this point we present in Fig. 15 another typical example for a network with N = 15 and K = 2 in which all the original attractors (A1, A2, A3 and A4) are identically conserved in the mutated network (they become the attractors B1, B2, B3 and B4). Additionally, a new attractor has emerged in the mutated network (B5). We use the same color to label all the configurations that belong to the same basin of attraction in the original network (red for A1, blue for A2, green for A3 and violet for A4), and respect the same color code to label the configurations of the basins of attraction of the mutated network. In this way we can identify which


configurations in the basins of attraction of the mutated network come from which basin of attraction of the original network.⁹

From Fig. 15 it can be observed that, although each of the basins of attraction of the mutated network is dominated by one color, they also contain configurations of different colors. Thus, the basin of attraction of B1 contains configurations from the basins of attraction of A1, A2, A3 and A4; the basin of attraction of B2 contains configurations from the basins of attraction of A2, A3 and A4; etc. This indicates that the duplication and divergence of a single gene not only produces new attractors, but can also change the differentiation and gene expression pathways of the entire network (Wagner, 2004).
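The basin “exchange” described above can be made concrete with a small Python sketch. It is an illustrative assumption, not the procedure used to produce Fig. 15: every configuration is labeled by the attractor it reaches (here, hypothetically, by the smallest state of that attractor), and configurations of the mutated network are matched to the original ones by neglecting the value of the duplicate gene, taken to be the last one. The two toy networks in the usage lines are hand-made examples, not networks from this work.

```python
from collections import Counter
from itertools import product


def step(state, inputs, tables):
    """Synchronously update every gene from the current configuration."""
    nxt = []
    for inp, tab in zip(inputs, tables):
        idx = 0
        for j in inp:
            idx = (idx << 1) | state[j]
        nxt.append(tab[idx])
    return tuple(nxt)


def basin_labels(inputs, tables):
    """Map each configuration to a label of the attractor it falls into.
    The label is the smallest state of the attractor (an arbitrary choice)."""
    label = {}
    for start in product((0, 1), repeat=len(inputs)):
        path, on_path, s = [], set(), start
        while s not in label and s not in on_path:
            on_path.add(s)
            path.append(s)
            s = step(s, inputs, tables)
        if s in label:
            lab = label[s]                 # ran into an already labeled basin
        else:
            lab = min(path[path.index(s):])  # closed a new cycle
        for p in path:
            label[p] = lab
    return label


def basin_exchange(original, mutated):
    """Count configurations by (original basin, mutated basin).
    The duplicate gene is assumed to be the LAST gene of the mutated network."""
    old = basin_labels(*original)
    new = basin_labels(*mutated)
    counts = Counter()
    for config, new_lab in new.items():
        counts[(old[config[:-1]], new_lab)] += 1
    return counts


if __name__ == "__main__":
    # Toy example: two genes that copy each other, plus a duplicate (gene 2).
    original = ([[1], [0]], [[0, 1], [0, 1]])
    mutated = ([[2], [0], [0]], [[0, 1], [0, 1], [0, 0]])
    for (old_basin, new_basin), n in sorted(basin_exchange(original, mutated).items()):
        print(old_basin, "->", new_basin, ":", n)
```

A mutated basin that receives configurations from several original basins corresponds to a basin containing several colors in the color-coding scheme described in the text.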

Finally, it is important to differentiate between the new function acquired by the duplicate gene after divergence, and the new phenotypes acquired by the entire network. In the current literature on gene duplication, it is mostly stressed that the duplicate gene can eventually develop a new function with the gradual accumulation of mutations in its coding sequence (see for instance Hughes (2002)). However, in this work we have shown that this kind of “relaxed selection”, where one gene is duplicated, mutated and retained, can also change the functioning of the whole organism, which is reflected in the emergence of new gene expression patterns (attractors) and new differentiation pathways.

In conclusion, we have shown that RBN operating near the critical regime are both robust and evolvable under the processes of gene duplication and divergence. Our results support the observation that the robustness of biological systems is not necessarily attributable to redundancy of any kind. They also provide a mechanism to understand how robustness and evolvability can coexist, giving rise to the great variety of stable living forms we observe around us.

Acknowledgments

We thank Hernan Larralde, Gustavo Martínez-Mekler, Ilya Shmulevich, Leo P. Kadanoff and Reviewer #1 for useful comments and suggestions. E. Balleza also acknowledges CONACyT-Mexico for a Ph.D. scholarship. This work was partially supported by CONACyT grant P47836-F, NSF grant PHY-0417660 and NIH grant GM070600-01.

References

Albert, R., 2005. Scale-free networks in cell biology. J. Cell Sci. 118 (21), 4947–4957.

Albert, R., Othmer, H.G., 2003. The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. J. Theor. Biol. 223, 1–18.

⁹ To establish this correspondence, we neglected the value of the duplicate gene σ_{N+1} of the mutated network, comparing only the values of the N genes that are common to both networks.

Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., Walter, P., 2002. Molecular Biology of the Cell, fourth ed. Garland.

Aldana, M., 2003. Dynamics of Boolean networks with scale-free topology. Physica D 185 (1), 45–66.

Aldana, M., Cluzel, P., 2003. A natural class of robust networks. Proc. Nat. Acad. Sci. USA 100 (15), 8710–8714.

Aldana, M., Coppersmith, S., Kadanoff, L.P., 2003. Perspectives and Problems in Nonlinear Science. In: Kaplan, E., Marsden, J.E., Sreenivasan, K.R. (Eds.). Springer, New York.

Babu, M.M., Luscombe, N.M., Aravind, L., Gerstein, M., Teichmann, S.A., 2004. Structure and evolution of transcriptional regulatory networks. Curr. Opinions Struct. Biol. 14, 283–291.

Bhan, A., Galas, D.J., Dewey, T.G., 2002. A duplication growth model of gene expression networks. Bioinformatics 18 (11), 1486–1493.

Chung, F., Lu, L.Y., Dewey, T.G., Galas, D.J., 2003. Duplication models for biological networks. J. Comput. Biol. 10 (5), 677–687.

De Jong, H., 2002. Modeling and simulation of genetic regulatory systems: a literature review. J. Comput. Biol. 9 (1), 67.

de Visser, J., Hermisson, J., Wagner, G.P., Meyers, L.A., Bagheri-Chaichian, H., Blanchard, J.L., Chao, L., 2003. Perspective: evolution and detection of genetic robustness. Evolution 57 (9), 1959–1972.

Derrida, B., Stauffer, D., 1986. Phase transitions in two dimensional Kauffman cellular automata. Europhys. Lett. 2, 739–745.

Edelman, G.M., Gally, J.A., 2001. Degeneracy and complexity in biological systems. Proc. Nat. Acad. Sci. USA 98 (24), 13763–13768.

Espinosa-Soto, C., Padilla-Longoria, P., Alvarez-Buylla, E.R., 2004. A gene regulatory network model for cell-fate determination during Arabidopsis thaliana flower development that is robust and recovers experimental gene expression profiles. Plant Cell 16, 2923–2939.

Evangelisti, A.M., Wagner, A., 2004. Molecular evolution in the yeast transcriptional regulation network. J. Exp. Zool. 302B, 392–411.

Gardner, T.S., di Bernardo, D., Lorenz, D., Collins, J.J., 2003. Inferring genetic networks and identifying compound mode of action via expression profiling. Science 301, 102–105.

Gu, X., Zhang, Z., Huang, W., 2005. Rapid evolution of expression and regulatory divergences after yeast gene duplication. Proc. Nat. Acad. Sci. USA 102 (3), 707–712.

Guelzim, N., Bottani, S., Bourgine, P., Kepes, F., 2002. Topological and causal structure of the yeast transcriptional regulatory network. Nat. Genet. 31, 60–63.

Huang, S., Ingber, D.E., 2000. Shape-dependent control of cell growth, differentiation and apoptosis: switching between attractors in cell regulatory networks. Exp. Cell Res. 261, 91–103.

Huang, S., Eichler, G., Bar-Yam, Y., Ingber, D.E., 2005. Cell fates as high-dimensional attractor states of a complex gene regulatory network. Phys. Rev. Lett. 94, 128701.

Hughes, A., 2002. Adaptive evolution after gene duplication. Trends Genet. 18 (9), 433–434.

Hughes, A.L., Friedman, R., 2005. Gene duplication and the properties of biological networks. J. Mol. Evol. 61, 758–764.

Kauffman, S.A., 1969. Metabolic stability and epigenesis in randomly constructed genetic nets. J. Theor. Biol. 22, 437–467.

Kauffman, S.A., 1993. Origins of Order: Self-organization and Selection in Evolution. Oxford University Press, New York.

Kauffman, S.A., 1995. At Home in the Universe. Oxford University Press, New York.

Kirschner, M., Gerhart, J., 1998. Evolvability. Proc. Nat. Acad. Sci. USA 95, 8420–8427.

Luscombe, N., Babu, M.M., Yu, H., Snyder, M., Teichmann, S., Gerstein, M., 2004. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 431, 308–312 <URL: http://sandy.topnet.gersteinlab.org/>.

Lynch, M., 2002. Gene duplication and evolution. Science 297, 945–947.

Lynch, M., Conery, J.S., 2000. The evolutionary fate and consequences of duplicate genes. Science 290, 1151–1155.

Lynch, M., Katju, V., 2004. The altered evolutionary trajectories of gene duplicates. Trends Genet. 20, 544–549.


Makita, Y., Nakao, M., Ogasawara, N., Nakai, K., 2004. DBTBS: database of transcriptional regulation in Bacillus subtilis and its contribution to comparative genomics. Nucleic Acids Res. 32, D75–D77 <URL: http://dbtbs.hgc.jp/>.

Martínez-Antonio, A., Collado-Vides, J., 2003. Identifying global regulators in transcriptional regulatory networks in bacteria. Curr. Opinion Microbiol. 6, 482–489 <URL: http://regulondb.ccg.unam.mx/>.

Mason, J., Linsay, P.S., Collins, J.J., Glass, L., 2004. Evolving complex dynamics in electronic models of genetic networks. Chaos 14 (3), 707–715.

Mendoza, L., Alvarez-Buylla, E.R., 2000. Genetic regulation of root hair development in Arabidopsis thaliana: a network model. J. Theor. Biol. 204, 311–326.

Nehaniv, C., 2003. Evolvability. BioSystems 69, 77–81.

Ohno, S., 1995. Evolution by Gene Duplication. Springer, New York.

Pastor-Satorras, R., Smith, E., Sole, R.V., 2003. Evolving gene interaction networks through gene duplication. J. Theor. Biol. 222, 199–210.

Poole, A., Phillips, M., Penny, D., 2003. Prokaryote and eukaryote evolvability. BioSystems 69, 163–185.

Ramo, P., Kesseli, J., Yli-Harja, O., 2006. Perturbation avalanches and criticality in gene regulatory networks. J. Theor. Biol. 242, 164–170.

Raval, A., 2003. Some asymptotic properties of duplication graphs. Phys. Rev. E 68 (6), 066119.

Rzhetsky, A., Gomez, S.M., 2001. Birth of scale-free molecular networks and the number of distinct DNA and protein domains per genome. Bioinformatics 17 (10), 988–996.

Shmulevich, I., Kauffman, S.A., 2004. Activities and sensitivities in Boolean network models. Phys. Rev. Lett. 93 (4), 048701.

Shmulevich, I., Kauffman, S.A., Aldana, M., 2005. Eukaryotic cells are dynamically ordered or critical but not chaotic. Proc. Nat. Acad. Sci. USA 102 (38), 13439–13444.

Smolen, P., Baxter, D., Byrne, J., 2000. Modeling transcriptional control in gene networks – methods, recent results, and future directions. Bull. Math. Biol. 62, 247.

Stelling, J., Sauer, U., Szallasi, Z., Doyle, F.J. III, Doyle, J., 2004. Robustness of cellular functions. Cell 118, 675–685.

Teichmann, S.A., Babu, M.M., 2004. Gene regulatory network growth by duplication. Nat. Genet. 36, 492–496.

Wagner, A., 2004. Distributed robustness versus redundancy as causes of mutational robustness. BioEssays 27 (2), 176–188.

Wagner, A., 2005a. Robustness and evolvability in living systems. Princeton Studies in Complexity. Princeton University Press, Princeton, NJ.

Wagner, A., 2005b. Robustness evolvability and neutrality. FEBS Lett. 579, 1772–1778.

Wuensche, A., 2004. Basins of attraction in network dynamics: a conceptual framework for biomolecular networks. In: Schlosser, G., Wagner, G.P. (Eds.), Modularity in Development and Evolution. Chicago University Press, Chicago (Chapter 13), p. 288.

Wuensche, A., Lesser, M.J., 1992. The Global Dynamics of Cellular Automata; An Atlas of Basin of Attraction Fields of One-Dimensional Cellular Automata. Addison-Wesley, Reading, MA.

Yanai, I., Camacho, C.J., DeLisi, C., 2000. Predictions of gene family distributions in microbial genomes: evolution by gene duplication and modification. Phys. Rev. Lett. 85 (12), 2641–2644.

Zhang, J., 2003. Evolution by gene duplication: an update. Trends Ecol. Evol. 18, 292–298.