Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
Mapping the Fitness Landscape of Gene ExpressionUncovers the cause of Antagonism and Sign Epistasisbetween Adaptive MutationsHsin-Hung Chou1,2¤a, Nigel F. Delaney1¤b, Jeremy A. Draghi3,4, Christopher J. Marx1,5*¤c
1 Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America, 2 Institute of Molecular Systems Biology,
ETH Zurich, Zurich, Switzerland, 3 Department of Zoology, University of British Columbia, Vancouver, Canada, 4 Department of Biology, University of Pennsylvania,
Philadelphia, Pennsylvania, United States of America, 5 Faculty of Arts and Sciences Center for Systems Biology, Harvard University, Cambridge, Massachusetts, United
States of America
Abstract
How do adapting populations navigate the tensions between the costs of gene expression and the benefits of geneproducts to optimize the levels of many genes at once? Here we combined independently-arising beneficial mutations thataltered enzyme levels in the central metabolism of Methylobacterium extorquens to uncover the fitness landscape defined bygene expression levels. We found strong antagonism and sign epistasis between these beneficial mutations. Mutations withthe largest individual benefit interacted the most antagonistically with other mutations, a trend we also uncovered throughanalyses of datasets from other model systems. However, these beneficial mutations interacted multiplicatively (i.e., noepistasis) at the level of enzyme expression. By generating a model that predicts fitness from enzyme levels we couldexplain the observed sign epistasis as a result of overshooting the optimum defined by a balance between enzyme catalysisbenefits and fitness costs. Knowledge of the phenotypic landscape also illuminated that, although the fitness peak wasphenotypically far from the ancestral state, it was not genetically distant. Single beneficial mutations jumped straighttoward the global optimum rather than being constrained to change the expression phenotypes in the correlated fashionexpected by the genetic architecture. Given that adaptation in nature often results from optimizing gene expression, theseconclusions can be widely applicable to other organisms and selective conditions. Poor interactions between individuallybeneficial alleles affecting gene expression may thus compromise the benefit of sex during adaptation and promote geneticdifferentiation.
Citation: Chou H-H, Delaney NF, Draghi JA, Marx CJ (2014) Mapping the Fitness Landscape of Gene Expression Uncovers the cause of Antagonism and SignEpistasis between Adaptive Mutations. PLoS Genet 10(2): e1004149. doi:10.1371/journal.pgen.1004149
Editor: Mark Achtman, Warwick Medical School, United Kingdom
Received September 24, 2013; Accepted December 16, 2013; Published February 27, 2014
Copyright: � 2014 Chou et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permitsunrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: HC acknowledges support by an EMBO Long-Term Fellowship (ALTF 132-2010). This work was funded by an NIH award to CJM (GM078209).The fundershad no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: [email protected]
¤a Current address: Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom.¤b Current address: Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts, United States of America.¤c Current address: Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America.
Introduction
The concept of a fitness landscape unites the three levels of
evolutionary change – genotype, phenotype, and fitness – into a
mathematical picture of the potential for, and constraints upon,
adaptive evolution. By mapping genotypes to a measure of fitness,
fitness landscapes guide our understanding of how epistasis –
nonlinear interactions between the fitness effects of mutations –
shapes evolution. Strong epistasis implies that landscapes are
rugged, with many peaks, or locally optimally genotypes [1,2]. The
magnitude and form of epistasis is predicted to determine the
number of evolutionary trajectories [3,4], the rate and repeatabil-
ity of adaptation [5–7], and the benefit of sex [8]. Recent
experimental work with a wide variety of model organisms has
revealed diminishing returns as a general trend of adaptation [9–
13], with relatively few cases of synergy [11,14] or sign epistasis
[15] (i.e., the same mutation being beneficial or deleterious in
different contexts [16]). Antagonism between adaptive mutations
might imply that these populations are summiting peaks in their
fitness landscapes with just a handful of genetic changes. This
explanation might lead to further trends, such as a negative
relationship between the initial selective coefficient of a mutation
and its epistatic interactions that could prove to be a useful
predictor of a saturating process of adaptation [17]. In order to
definitely link diminishing returns to the ascent of local peaks, as
well as to understand the existence of the peaks themselves, we
must understand the phenotypes that link genotype and fitness in
the adaptive landscape. Mathematically convenient formulations
such as Fisher’s geometric model for adaptation near a single peak
[18] have been used to interpret the trend toward antagonism
[19]. This approach assumes stabilizing selection a priori. What
remains unclear is what types of physiological interactions give rise
to fitness landscapes of varying shape and form, as well the
constraints upon mutational changes to underlying phenotypes.
Models of metabolic pathways have been amongst the most
successful in translating underlying biochemical phenotypes to
PLOS Genetics | www.plosgenetics.org 1 February 2014 | Volume 10 | Issue 2 | e1004149
fitness. The contribution of enzyme activities upon metabolic flux
has been formalized via Metabolic Control Analysis (MCA)
[20,21]. The ability of this approach to predict the fitness
consequences of changes in enzyme properties has been verified
in experimental systems that vary from Escherichia coli in lactose-
limited chemostats to the flight properties of butterflies (reviewed
in [22]). Turning to multiple enzymes, MCA theory has suggested
a general trend toward synergistic interactions between activity-
increasing mutations in a metabolic pathway [23,24]. A major
limitation, however, has been that the costs of enzyme expression
[25] have not been included in classical MCA. Whereas the
dependence of flux through a metabolic pathway saturates with
increasing levels of a given enzyme, the costs will continue to
accumulate. The balance of these two selective factors will
generate an intermediate optimum, and thus stabilizing selection.
Inclusion of expression costs to MCA has enabled predictions of
the optimum levels of a single enzyme [26], and was used to
compare the differential utility of alternate, degenerate pathways
[27]. An open question, however, is how the balance between
catalytic benefits and expression costs plays out to optimize
enzyme expression across many enzymes simultaneously.
In order to study how evolution would simultaneously optimize
expression of multiple genes, we have developed a model system of
an engineered Methylobacterium extorquens AM1 (EM) in which we
altered its central metabolism to be dependent upon a foreign
pathway (Figure 1A for details). M. extorquens grows on methanol
by oxidizing it first to formaldehyde, and then through a series of
steps to formate, which is either fully oxidized to CO2 or
incorporated into biomass [28–32]. In the EM strain we removed
the endogenous pathway for formaldehyde oxidation in wild-type
(WT) [28] and replaced it with two genes encoding a foreign
pathway that oxidizes formaldehyde via glutathione (GSH)
derivatives [29]. Eight populations dependent upon this intro-
duced metabolic pathway evolved in methanol-containing medi-
um via serial transfers for 900 generations [10,33].
Adaptation of the unfit EM strain to grow on methanol
consistently involved beneficial mutations that altered expression
of the foreign GSH pathway (Figure 1B). When the GSH pathway
was introduced, the two enzymes were cloned together on a single
mRNA transcript behind a strong native promoter present on a
medium copy plasmid (,9 cell21) [10,29,34,35]. As such, the costs
of expression outweighed the catalytic benefits, and among the
targets of adaptation we identified by resequencing strains evolved
in separate populations, we universally obtained beneficial
mutations that decreased expression of these enzymes [10,34].
These mutations reduced expression of the GSH pathway through
three classes of underlying mechanisms: Class A decreased
expression per gene copy, Class B reduced gene dosage by
lowering plasmid copy number, and Class C integrated the
introduced pathway into the host genome, which also reduced
plasmid copy number [33,34] (Figure 1B). In terms of epistasis,
mutations in multiple genes along a single adaptive trajectory –
including one mutation (here ‘A1’) reducing expression of the
GSH pathway – have been shown to exhibit a general trend of
diminishing returns that was devoid of sign epistasis [10].
However, here we are interested in uncovering the trends and
mechanisms underlying epistatic interactions between mutations
that arose in separate adapting lineages and affect expression of
the same metabolic pathway.
We combined independently-arising beneficial mutations af-
fecting gene expression of this two-enzyme metabolic pathway and
report strong antagonism and sign epistasis for fitness. These
interactions were increasingly antagonistic for larger benefit
mutations. Such strong antagonism did not stem from the effects
of mutational combinations upon enzyme levels, but rather from
the nonlinear mapping between enzyme expression and organis-
mal fitness. By developing a quantitative model that relates
expression cost and catalytic benefit to fitness, we characterized
the overall shape of this fitness landscape and revealed that some
of these single mutations can optimize multiple phenotypes
simultaneously, leading to a big jump toward the single, global
optimum.
Results
Interactions between mutations affecting expression of atwo-enzyme metabolic pathway exhibit strongantagonism and sign epistasis
To explore the pattern of epistatic interactions between
beneficial mutations affecting expression of the GSH-dependent
pathway, we combined beneficial plasmid mutations that emerged
during experimental evolution and affected distinct traits [34]. We
focused upon Class A (decreased expression per copy) and B
(reduced gene dosage) mutations because of their genetic
tractability, and the prediction that these represent orthogonal
mechanisms to achieve lower expression. We hypothesized that
mutational combinations between these classes would result in
enzyme levels that would be the product of the individual
perturbations (Figure 1C). Three class A mutations, A1–A3, and
one class B mutation, B5, occurred independently, whereas B2 and
B3 were isolated together from the same plasmid. We generated
12 plasmids that paired each Class A mutation with each one from
Class B, as well as with the B2–B3 pair, and measured their
relative fitness via competitions with a fluorescently labeled
ancestor [34] (Tables S1, S2). The observed fitness values for
the mutational combinations were substantially less than expected
based upon a simple multiplicative null model incorporating the
single mutant effects (i.e., Wij = Wi6Wj; R2 = 0.53, adj-R2 = 0.32;
Figure 2A).
The increasingly strong antagonism for higher expected fitness
values suggested a potential negative relationship between the
selective coefficients observed for each mutation and the average
epistasis that mutation exhibited with other mutations. We
observed that the individually most beneficial mutations (large s)
engendered the greatest antagonism (e,0) when combined with
Author Summary
The pace and outcome of a series of adaptive steps in anevolving lineage depends upon how well differentbeneficial mutations stack on top of each other. We foundthat independent beneficial mutations that affected geneexpression for a metabolic pathway did not work welltogether, and were often jointly deleterious. The mostbeneficial mutations interacted the most poorly withothers, which was a trend we found common in otherbiological systems. Through generating a model thataccounted for enzymatic benefits and expression costs,we uncovered that this antagonism was caused by aphenotype to fitness mapping that had an intermediatepeak. This allowed us to predict the fitness effect of doublemutants and to uncover that the single winning mutationstended to move straight to the peak in a single step. Thesefindings demonstrate the importance of considering thephenotypic changes that cause nonlinear interactionsbetween mutations upon fitness, and thus influence howpopulations evolve.
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 2 February 2014 | Volume 10 | Issue 2 | e1004149
Figure 1. Adaptive mutations that optimized the expression of the GSH-linked pathway. A) The GSH-linked formaldehyde oxidationpathway in the EM strain. The enzymes of the GSH-linked pathway are indicated with yellow arrows. MDH, methanol dehydrogenase; spont.,spontaneous reaction of formaldehyde and GSH; FlhA, S-hydroxymethyl GSH dehydrogenase; FghA, S-formyl-GSH hydrolase; FDHs, formatedehydrogenases. B) Adaptive mutations identified on the introduced plasmid (pCM410) expressing the introduced formaldehyde pathway. Mutationsoccurring in the predicted ribosome-binding site (bold text) of fghA, or its upstream region are shown in magenta (Class A). Mutations occurring inregions that control plasmid replication are shown in blue (Class B). The fghA start codon is underlined. trfA, a gene encoding the TrfA proteinessential to plasmid replication; PmxaF, promoter of the flhA-fghA gene cassette; oriV, origin of replication recognized by the TrfA protein; oriT, originof transfer. ISMex4 and ISMex25, two insertion sequences native to M. extorquens AM1. C) Diagram of orthogonal mechanisms of Class A and Bmutations on gene expression.doi:10.1371/journal.pgen.1004149.g001
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 3 February 2014 | Volume 10 | Issue 2 | e1004149
other mutations, including several examples of sign epistasis
(Figure 2B).
The observed relationship between selective coefficientand average epistasis is observed for other biologicalsystems
Several theoretical arguments suggest that the geometry of
fitness landscapes might induce correlations between the size of a
mutation and the strength and direction of epistasis. Epistasis has
been observed to be coupled to the mean fitness effect of mutations
[36,37]. A single beneficial mutation of large effect may
appreciably change both the mean fitness of subsequent mutations
and the remaining distance to the optimum, potentially skewing its
own epistatic coefficients.
Given the emerging empirical consensus and theoretical
arguments for antagonistic epistatic interactions among beneficial
mutations, we analyzed several other datasets to ask whether the
strength and form of the relationship between selective effect and
average epistatic effect held for intragenic and intergenic datasets.
For this comparison we analyzed the relationship between s and efor previous datasets from M. extorquens and E. coli where the
beneficial mutations occurred consecutively in a variety of genes
across the genome of a single adapting lineage [10,11], combina-
tions of mutations from two genes of the bacteriophage ID11 [12],
and two datasets of within-protein interactions for b-lactamase
[17,38]. These datasets also displayed signs of a correlation
between s and increasingly negative e (as noted in [37]), with the
exception of the intragenic data for b-lactamase (Figure S1).
Negative trends in the relationship between initial selective
coefficient and epistasis may seem like obvious evidence for
diminishing returns. However, recent theoretical work has shown
that in models where mutations have random effects and no
tendency to be either synergistic or antagonistic, a pattern of
diminishing returns occurs between mutations if they are selected
conditioned on being beneficial in the ancestral background [39].
In Supplementary Text S1 we show this behavior in a simple
model of evolution on fitness landscapes with no mean epistatic
tendency and show how it leads to a pattern of diminishing returns
between beneficial mutations as a form of regression to the mean.
This analysis suggests that genotype-fitness data alone, without
knowledge of the phenotypic effects of mutations or the
physiological causes for trade-offs, might be insufficient to infer
the mechanism underlying a pattern of epistasis.
Pairs of orthogonal perturbations to gene expression actindependently upon enzyme levels
What physiological factors underlie the strong antagonism
observed between mutations affecting expression of the foreign
GSH pathway? A first possibility is that mutational combinations
lead to smaller changes in protein expression than expected from
the single mutants and that such antagonistic behavior at the level
of expression phenotypes merely propagated through as observed
antagonism at the level of fitness. Because we used combinations
that largely derived from pairing mutations that reduced
expression per copy (Class A) with those that decreased plasmid
copy number (Class B), our null hypothesis was that these
mechanisms should act independently to alter expression, such
that the expression level of an A+B mutant pair would simply be
the product of these two values. Consistent with this prediction,
enzyme levels were well described by the null model of
multiplicative independence between paired perturbations
(Figure 3A, Table S1). A simple linear model of log-transformed
changes in enzyme levels as a function of the presence of the single
mutations with no interaction terms explains much of the variation
for both FlhA and FghA (adjusted-R2 = 0.85 (FlhA) and 0.86
(FghA)). This predictability can be seen in the high correlation
between observed expression phenotypes for paired perturbations
and those expected based upon the single changes.
Figure 2. Epistasis at the level of expression phenotypes and fitness relative to independence or kinetic model. A) Fitness values ofmutation combinations are consistently lower than expected by multiplicative independence. B) Mutations with larger with selective advantages (s)when tested alone in the ancestor have more negative e values in combination with other mutations.doi:10.1371/journal.pgen.1004149.g002
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 4 February 2014 | Volume 10 | Issue 2 | e1004149
A fitness landscape model that incorporates benefits andcosts predicts the fitness of mutational combinations
Since mutational combinations did not introduce epistatic
interactions at the level of gene expression, we built a model of the
fitness landscape based upon enzyme levels in order to ask how its
shape would contribute to antagonism. Building upon earlier work
on single enzymes [26] (Supplementary Text S2), we generated a
model of the fitness landscape that calculates fitness as flux through
the pathway above a threshold, minus the sum of two costs Cið Þ:
W~VMax
:FlhA
FlhAzEh
{Threshold{C1:FlhA{C2
:FghA
The hyperbolic expression for catalysis has been used before [40]
to effectively describe the dependence of steady-state flux to the
levels of a single enzyme and incorporates a ‘‘Vmax’’ term for the
pathway, and an Eh half-maximal enzyme level term. We only
model FlhA concentration as beneficial to fitness even though
FghA is absolutely required for growth on methanol [34]. This is
because, over the parameter range of our perturbations, fitness
appeared to rise monotonically with decreasing levels of FghA.
This suggests FghA is a typical enzyme that has a low metabolic
‘‘control coefficient’’ [20,21] and that it only limits catalysis at
exceptionally low levels. None of our perturbations pushed FghA
levels below 2%, and for comparison b-galactosidase levels in
lactose-limited chemostats only impacted fitness significantly if
they decreased activity to #1% [41].
The threshold flux term was added to the model to capture an
unusual right-shift of the typical relationship between enzyme
concentration and fitness observed with FlhA in these data, such
that fitness approached zero even in the presence of measurable
concentrations of functioning enzyme. We have observed similar
behavior when manipulating levels of the analogous enzyme in the
endogenous, tetrahydromethanopterin-dependent pathway for
formaldehyde oxidation in WT (SM Carroll, CJM, unpublished).
As both of these enzymes occur directly downstream of
formaldehyde production, this threshold phenomenon may be
explained by toxic effects of elevated steady-state formaldehyde
concentrations at low enzyme levels. Finally, there are two cost
terms for FlhA and FghA. The cost per molecule for each enzyme
was treated as a linear function, consistent with prior work [26,42].
The six parameters of this benefit - costs model were fit using
the data from the EM ancestor, single mutants, as well as strains
with inducible promoter plasmids (27 data points; Table S3). The
inducible promoter plasmids contained a cumate-responsive
repressor to modulate the levels of flhA-fghA from ancestral levels
to lower values (Table S1). These data were critical for capturing
the steep decline of the fitness landscape at low values of FlhA.
The resulting benefit - costs model captured the curvature of the
fitness landscape (Figure 4) and, unlike the simple multiplicative
model, it was able to predict the 17 combinations of mutations that
were not used for model fitting with high precision (R2 = 0.98)
(Figure 3B). From the perspective of the ancestral genotype, in the
model fitness rises gently with decreased expression of either
enzyme, but then declines rapidly upon reaching catalytically-
limiting levels of FlhA. A similar cliff exists for low values of FghA
[34], but at enzyme levels beyond the range of our dataset and
below the detection threshold of our enzyme assay method (see
Methods).
Beneficial mutations moved directly toward the globalexpression optimum rather than in the locally steepestdirection on the fitness landscape
Our fitness landscape model that precisely maps phenotypes to
fitness allowed us to explore how much local topography may have
influenced the direction of phenotypic change during evolution by
Figure 3. Mutations exert independent effects upon enzyme expression but a mechanistic model of benefits and costs of enzymesis required to predict combined effects on fitness. A) Expression levels (in mU) are well predicted by independent, multiplicative effects of eachmutation on the FlhA and FghA enzymes. B) Using a kinetic model of catalysis and costs parameterized with data from the single mutants andinducible promoter constructs predicts the mutational combinations very well (R2 = 0.98).doi:10.1371/journal.pgen.1004149.g003
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 5 February 2014 | Volume 10 | Issue 2 | e1004149
de novo mutations (Figure 5). We compared the changes in enzyme
expression caused by single beneficial mutations relative to three
factors: 1) the local gradient in the fitness landscape for the
ancestor (greater decreases of FlhA versus FghA because the
former is more costly), 2) the direct vector pointing to the global
fitness optimum and 3) equal proportional changes between the
enzymes which might be expected due to the physical constraint of
their being expression from a single transcript. All mutations
moved toward the global optimum rather than ascend in the
phenotypically steepest direction on the local fitness landscape.
Mutations B2 and B5 affected copy number but, through
mechanisms we do not currently understand, led to greater
decreases in FghA than FlhA. In contrast, B3 was directly along
the line of equivalent change in both enzymes. This mutation was
identified along with B2 as a plasmid haplotype, and this B2–B3
combination allowed this lineage to accomplish a similar
phenotypic (and fitness) change as the other mutants.
Discussion
We found that combining adaptive mutations that optimize
expression of a two-enzyme pathway exhibited strong antagonistic
interactions and sign epistasis. Fitness values of mutational
combinations were generally less than expected relative to a null
model of independent, multiplicative effects upon fitness. We
further observed a negative relationship between s and e for
individual mutations. Other datasets of intragenic epistasis
revealed similar trends between s and e; however, this overall
trend of epistasis (e.g., antagonism) does not imply a specific
connection with properties of the individual mutations, such as s.
For example, this trend will also arise as a consequence of
regression to the mean when the beneficial mutations assayed are
conditioned to be beneficial in the ancestral background and the
effect of a mutation has a component that is independently
distributed on each possible genetic background. Therefore, to
extract biological insight from the quantitative relationship
between s and e , we must interrogate the mechanisms that lead
to antagonistic epistasis.
The first possible explanation for antagonism in our data would
be non-linearities in the way mutations combined to affect enzyme
expression. However, as expected from having chosen combina-
tions that combined class A mutations with those from class B,
these orthogonal mechanistic effects resulted in independent
effects on enzyme expression that were jointly well predicted with
a simple, multiplicative model. As in this system, many ecologically
relevant genes are encoded on plasmids whose regulation and gene
dosage may both be effected by separate sets of mutations. More
broadly, mutations that influence different traits that make joint
contribution to a higher phenotype such as fitness are common. At
the level of individual genes, for example, catalytic improvement
of an enzyme often results from the joint contribution of mutations
that improve protein stability and those that enhance kinetic
parameters [43–45].
The second factor that could generate antagonism is the
curvature of the underlying fitness landscape for gene expression.
Recent theory has shown that almost any formulation of fitness
based upon multiple underlying phenotypes will generate epistasis
at the level of fitness, even when the mutations – as we observed
here – do not interact epistatically on the underlying trait
phenotypes [46]. Previous models have formulated fitness as a
function of gene expression correctly predicted the evolution of
optimal levels of gene expression [26]. Here we extended this
model framework to multiple enzymes and used it for the first time
to interpret beneficial mutational effects from phenotype to fitness,
which we characterized both individually and in combination.
Our fitness landscape model was able to predict the fitness values
of the mutation combinations with high precision (R2 = 0.98). The
asymmetry in the curvature of this fitness landscape results from
the relatively gentle effects of expression costs relative to the sharp
transition in fitness effects due to rate limitation upon catalysis
[41]. This observed selection to maintain an intermediate
optimum of enzyme levels is distinct from the selective neutrality
Figure 4. Fitness landscape of the GSH-linked pathway. A) A two-dimensional heatmap and B) three-dimensional surface showing the shapeof the fitness landscape predicted by the model. The x and y axes are FlhA and FghA levels relative to ancestor (set at 100). Though the plasmid costis included in the model, this fourth dimension was corrected for to allow a three-dimensional visualization (see Methods). Experimental data pointsindicate the ancestor (asterisk), single mutants (grey circles), mutational combinations (white squares), and inducible expression vectors (blackcircles).doi:10.1371/journal.pgen.1004149.g004
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 6 February 2014 | Volume 10 | Issue 2 | e1004149
on a catalytic plateau that was predicted by classical MCA
analyses that did not incorporate expression costs [40].
Knowledge of the underlying fitness landscape allows us to
understand aspects of the epistatic interactions not evident from
fitness values alone. For example, we observed that the A3 and B5
mutations had fairly comparable individual fitness values
(1.42060.032 vs. 1.45760.033; mean and 95% CI), but the
former had three-fold more antagonistic epistasis than the latter
(average = 20.46 vs. 21.34; t-test, p = 0.025; Figure 6). The
modeled fitness landscape illuminates the underlying reason for
this difference. Both mutations rest near the peak value of enzyme
expression, but on opposite sides (B5 has 70% the level of FghA as
A3). This poises B5 such that it is much more sensitive to further
reductions in expression than A3. Thus, although these two
mutations are essentially equivalent if one only considers their
fitness values, their locations in phenotypic space change their
likelihood for antagonism and sign epistasis.
One consequence of sign epistasis between mutations affecting
the phenotypes like gene expression is a reduction of the benefit
caused by recombination bringing beneficial mutations together
into the same genome (i.e., Fisher-Muller model). This tradeoff
between benefits and costs is inherent to gene expression, and thus
results in stabilizing selection. These patterns of epistasis are likely
very common, given the apparent ubiquity of stabilizing selection
upon gene expression from microbes to primates [47–50].
With a complete fitness landscape defined by biochemical
phenotypes, we can now interpret the genetic landscape in terms
of what was accessible to individual mutations. In contrast to how
selection acts upon standing genetic variation, the de novo
mutations fixed in experimental populations ignored the pheno-
typically-local best direction of change and ‘‘jumped’’ towards the
global optimum. This highlights that the classic quantitative
genetics intuition of climbing in the direction of the steepest
selection gradient [51], which is appropriate for small populations
containing standing genetic variation, fails to capture the
phenotypic potential of a sizeable pool of de novo mutations
arising from large populations. In our case, it was not the large
magnitude of expression change that was surprising, per se, but
change in the ratio of their expression. Given that flhA and fghA are
encoded on the same transcript, it was notable that all but one of
the single mutations down-regulated FghA to a large extent while
only cutting FlhA levels in half. Whereas this system allowed
Figure 5. Adaptive mutations tended to move phenotypes toward the global optimum. The direction of phenotypic movements forstrains with single mutations (red lines; enzyme levels of ancestor set to 100) are compared to the vectors indicating of the locally steepest fitnessgradient for the ancestor (blue), complete 1:1 correlation between phenotypes (black), and the global optimum (green). The direction of phenotypicmovement for all single mutations was closer to the vector for the global optimum relative to the local gradient around the ancestor.doi:10.1371/journal.pgen.1004149.g005
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 7 February 2014 | Volume 10 | Issue 2 | e1004149
individual mutations to reach near-optimality, a multi-step
trajectory was required for the directed evolution of the LacI
repressor to reverse its regulatory logic [52]. In that system, the
first round mutation simply broke the old logic to become
constitutive, which in combination with two latter mutations
allowed the ‘‘anti-LacI’’ phenotype to emerge and locate the
fitness peak they predicted from a computational model. Our
results suggest that relatively large moves in multi-phenotype space
can emerge as winners, provided the genetic architecture at least
allows rare mutations to achieve this possibility.
Finally, the near optimal expression levels of these beneficial
mutations becomes even more remarkable when considering that
this optimization did not happen in isolation, but in adapting
populations that contained many beneficial mutations simulta-
neously [33,34,53]. In varying environments, such diversity in
large microbial populations may lead to genetically complex
adaptation such as stable polymorphisms [54]. In a stable
environment, this diversity leads to ‘clonal interference’ [55], a
type of serial fixation that effectively sorts for mutations of the
greatest effect amongst what was possible. This would have
impacted the mutations that affected GSH pathway expression in
two ways. Firstly, there were many different genetic solutions to
reducing expression of these enzymes [34]. One type in particular
- the Class C mutations that resulted from integration of the
introduced plasmid into the host chromosome – occur at very high
rates and emerged to detectable levels repeatedly, up to 17 times
per population [33]. These mutations confer ,O the benefit of
the Class A and B mutations [34], however, and were only found
to rise to fixation in three of eight populations despite more than
100 observed occurrences [33]. In this regard, clonal interference
aids finding optimal solutions by allowing only the best individual
mutations to fix. Recently, however, it has been shown that
fixation probability of contending mutations is only partly
dependent upon their individual effect because they commonly
hitchhike with other beneficial mutations present [56–58]. This
leads to a second effect of clonal interference, which is competition
between lineages carrying beneficial mutations affecting distinct
phenotypic processes. Indeed, the ancestral genotype faced a
variety of phenotypic challenges besides just optimizing expression
of the GSH pathway [14,59,60]. Some of these mutations in other
loci had beneficial effects up to 36 larger than those described
here [10] and were segregating at the same time as mutations
affecting expression of the GSH pathway [33,53]. Even with so
much turmoil in the populations, the eventual winners discovered
nearly optimal solutions to this local, two-enzyme expression
optimization in order to win the battle for fixation. Population size
thus contributed to the fixation of optimal solutions by both
increasing the number of mutations occurring and escaping drift in
the first place, and by facilitating competition between multiple
potential solutions. These factors conspired to allow selection to
reward – when mutationally possible – lineages that made long-
range, lucky jumps to distant peaks on the phenotypic landscape.
Materials and Methods
Experimental evolution and growth conditionsThe EM strain was generated previously by deleting the mptG gene
of M. extorquens AM1 in the white strain WT CM502 [61] lacking
carotenoid pigments due to an unmarked mutation in crtI (encoding
phytoene desaturase) [62], followed by introduction of pCM410
[10]. Eight replicate populations seeded by the EM strain were
grown in 9.6 ml methanol (15 mM) minimal media incubated in a
30uC shaking incubator at 225 rpm. Populations were transferred to
fresh media at a 1/64 dilution rate (thus six generations per growth
cycle, Nfinal<109) and propagated for 600 generations. One liter of
minimal media consists of 100 ml of phosphate buffer (25.3 g of
K2HPO4 and 22.5 g of NaH2PO4 in 1 liter of deionized water),
100 ml of sulfate solution (5 g of (NH4)2SO4 and 0.98 g of MgSO4 in
1 liter of deionized water), 799 ml of deionized water, and 1 ml of
trace metal solution. One liter of the trace metal solution consists of
100 ml of 179.5 mM FeSO4, 800 ml of premixed metal mix
(12.738 g of EDTA disodium salt dihydrate, 4.4 g of ZnSO4?7H2O,
1.466 g of CaCl2?2H2O, 1.012 g of MnCl2?4H2O, 0.22 g of
(NH4)6Mo7O24?4H2O, 0.314 g of CuSO4?5H2O, and 0.322 g of
CoCl2?6H2O in 1 liter of deionized water, pH 5), and 100 ml of
deionized water [14].
Plasmid and strain constructionAll strains and plasmids used are indicated in Table S2. All
plasmids constructed in this study were maintained in E. coli 10-
beta strain (New England Biolabs) and were transferred to M.
extorquens via electroporation [63] or tri-parental mating with the
helper strain pRK2073 [64]. Plasmid DNA in E. coli was extracted
using the QIAprep Spin MiniPrep Kit (Qiagen). The PmxaF
expression vector pCM160 [65], its variant pCM410 in the EM
strain expressing the flhA-fghA cassette [10], and the cumate-
Figure 6. Phenotype and not just fitness value determinesepistatic interactions. Mutations (A) A3 and (B) B5 had similar fitnessvalues, but resided on opposite flanks of the optimal phenotype (shownhere for simplicity as FlhA enzyme levels). In interactions with secondarybeneficial mutations (B2 and A2, respectively), B5 fares worse due tobeing closer to the fitness cliff that occurs when FlhA catalysis becomeslimiting.doi:10.1371/journal.pgen.1004149.g006
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 8 February 2014 | Volume 10 | Issue 2 | e1004149
inducible vector pHC112 expressing the flhA-fghA cassette [34]
have been described previously. In order to combine Class A and
B mutations which accumulated on separate pCM410 derivatives
during experimental evolution of the EM strain (Figure 1B, Table
S1), Class B mutations (from pCM410B2B3, pCM410B2, pCM410B3,
and pCM410B5) were moved to pCM410 derivatives bearing Class
A mutations (pCM410A1, pCM410A2, and pCM410A3) through the
procedures delineated below. Fragments containing B2–B3, B2,
and B3 mutations were first obtained by digesting pCM410B2B3,
pCM410B2, or pCM410B3 with SfiI and NheI. These were then
ligated into the plasmid backbone of pCM410A1, pCM410A2, or
pCM410A3 cut with the same enzymes. A fragment containing the
B5 mutation was obtained by digesting of pCM410B5 with SfiI and
SexAI and then ligated into the plasmid backbone pCM410A1,
pCM410A2, or pCM410A3 cut by the same enzymes. The above
procedures were also applied to introduce B2–B3, B2, B3, and B5
mutations into pHC112 in order to generate pHC112 derivatives
that vary in their plasmid copy number.
Fitness assaysFitness assays were performed by a previously described
procedure [34]. Strains were first physiologically acclimated
through one 4-day growth cycle in 9.6 ml of minimal media
supplemented with 15 mM methanol. In addition, for strains
bearing cumate-inducible promoter plasmids (pHC112 deriva-
tives), different concentrations of cumate (Table S1) were added to
growth media to modulate the expression of FlhA and FghA
enzymes. After this acclimation phase, each of these strains was
mixed with a fluorescent variant (CM1232) of the EM ancestor
[10] by a 1:1 volume ratio, diluted 1/64 into 9.6 ml of fresh
growth media, and incubated in a 30uC shaking incubator at
225 rpm. The ratios of the two populations before (R0) and after
(R1) competitive growth were quantified by a LSR II flow
cytometer (BD Biosciences) for at least 50000 cell counts per
sample. The forward scatter threshold of LSRII was adjusted to
300 to ensure unbiased detection of the test and reference strains
despite their potential differences in cell size. Fitness values (W)
relative to the reference strain were calculated by a previously
described equation assuming an average of 64-fold size expansion
of mixed populations during competitive growth [35]:
W~logR1:64
R0
� �=log
1{R1ð Þ:64
1{R0ð Þ
� �
In order to convert to absolute differences in growth rate, the EM
ancestor grows under these conditions with a growth rate of
0.0654+/20.0016 h21 [10].
Enzyme assaysThe activities of FlhA [66] and FghA [67] were assayed in three
replicates as described using cells harvested from mid-exponential
phase cultures. Cells were collected through centrifugation at
10,0006 g for 10 min, frozen at 280uC, and used for enzyme
assays within a week. Right before assays frozen cell pellets were
suspended in 50 mM Tris-HCl buffer (pH 7.5) and physically
disrupted in tubes containing Lysing Matrix B and shaken at speed
6.0 m/s on a FastPrepH-24 bead beater (MP Biomedicals) for
40 seconds. Insoluble debris in the cell lysate was removed by
centrifugation at 13,0006 g, 4uC for 15 min. The total protein
concentration of the cell lysate was quantified using the Bradford
method [68]. Kinetic analysis of FlhA and FghA activities over
10 min at 30uC was performed in 200 ml reaction mixtures using a
SpectraMax M5 Plate Reader (Molecular Devices).
Quantification of plasmid copy numbersThe copy number of pCM410 derivatives in M. extorquens was
quantified by a real-time PCR approach described previously [34].
Briefly genomic DNA of M. extorquens from mid-exponential phase
cultures was extracted by an alkaline lysis method [69]. Detection
of plasmid DNA was targeted at the kan gene using primers
HC410p18 (59-GAAAACTCACCGAGGCAGTTCCATAG-39)
and HC410p19 (59-TCAGTCGTCACTCATGGTGATTT-
CTCA-39). Detection of chromosomal DNA was targeted at the
rpsB gene (encoding the 30S ribosomal protein S2) in the
chromosome META1 using primers HCAM111 (59-TGAC-
CAACTGGAAGACCATCTCC-39) and HCAM113 (59-TTG-
GTGTCGATCACGAACAGCAG-39). Real-time PCR experi-
ments were performed in three replicates with the PerfeCTa
SYBR Green SuperMix (Quanta Biosciences) on a DNA Engine
Opticon2 (MJ Research), and the average threshold cycle (Ct) of
each PCR reaction was determined using the Opticon Monitor v.
2.02 software (MJ Research). Each real-time PCR reaction
contained 25 ng of genomic DNA extracted from various strains
and kan- or rpsB-specific primers. To establish a standard curve
(SC) of plasmid copy numbers, 1, 0.1, 0.01, and 0.001 ng of
pCM410 (equivalent to 9.096107, 9.096106, 9.096105, and
9.096104 plasmid molecules, respectively) were mixed with 25 ng
of genomic DNA (equivalent to 3.036106 genome copies) of the
plasmid-less, white WT M. extorquens (CM502) [61]. The standard
curve is a plot of DCt (i.e. Ctkan–CtrpsB) versus plasmid molecules
on a log2 scale. For each strain, by interpolating its DCt value
against the SC the absolute quantity of plasmid DNA can be
estimated using the following equation:
Plasmid copy number
genome~2
DCt{SCinterceptSCintercept 7 3:03|106
� �
Model fitting and comparisonMultiplicative models predicting fitness or gene expression were
fit and assessed as standard linear models following a log
transformation of the response variable. The model for the fitness
landscape was fit using a non-linear routine in Matlab. The raw
data as well as commented code in Matlab and R that completely
recreates the analysis and figures has been deposited at www.
datadryad.org (doi:10.5061/dryad.8hb23).
Supporting Information
Figure S1 Trends of epistasis as a function of selective
coefficient. The epistasis values on fitness were calculated for
every mutational combination a mutation with a given value of s
was found in. The datasets represent: intragenic epistasis during
single adaptive trajectories of (A) M. extorquens [10] or (B) E. coli
[11], (C) intragenic and intergenic combinations of independent
mutations from the bacteriophage ID11 [12], or (D,E) two datasets
of intragenic combinations of E. coli b-lactamase alleles [17,38].
Note that ID11 data were given as doublings per hour, and we
therefore exponentiated with base two. These transformed data, as
well as the other four data sets, were treated as multiplicative
fitnesses; these values were normalized by dividing by the ancestor
fitness, and the epistasis of mutation i in background g was
calculated as eig = Wig2Wi6Wg.
(DOCX)
Text S1 Analysis of bias in epistasis as a function of selection
coefficient.
(DOCX)
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 9 February 2014 | Volume 10 | Issue 2 | e1004149
Text S2 Derivation of the fitness model.
(DOCX)
Table S1 Genotypes and phenotypes of pCM410, pHC112, and
their derivatives.
(DOCX)
Table S2 Bacterial strains and plasmids.
(DOCX)
Table S3 The value of fitted parameters in the fitness landscape
model.
(DOCX)
Acknowledgments
We would like to thank Michael Desai, Sergey Kryazhimskiy, Daniel
Segre, Olivier Tenaillon, members of the Marx lab and two anonymous
reviewers for useful comments on the manuscript.
Author Contributions
Conceived and designed the experiments: HHC CJM. Performed the
experiments: HHC. Analyzed the data: HHC NFD JAD CJM.
Contributed reagents/materials/analysis tools: HHC NFD JAD CJM.
Wrote the paper: HHC NFD JAD CJM.
References
1. Wright S (1932) The roles of mutation, inbreeding, crossbreeding, and selection
in evolution. Proc VIth Int Congress of Genetics 1: 356–366.
2. Phillips PC (2008) Epistasis – the essential role of gene interactions in the
structure and evolution of genetic systems. Nat Rev Genet 9: 855–867.
3. Franke J, Klozer A, de Visser JAGM, Krug J (2011) Evolutionary accessibility of
mutational pathways. PLoS Comput Biol 7: e1002134.
4. Tenaillon O, Rodrıguez-Verdugo A, Gaut RL, McDonald P, Bennett AF, et al.
(2012) The molecular diversity of adaptive convergence. Science 335: 457–461.
doi:10.1126/science.1212986.
5. Kryazhimskiy S, Tkacik G, Plotkin JB (2009) The dynamics of adaptation on
correlated fitness landscapes. Proc Natl Acad Sci USA 106: 18638–18643.
doi:10.1073/pnas.0905497106.
6. Draghi JA, Parsons TL, Plotkin JB (2011) Epistasis increases the rate of
conditionally neutral substitution in an adapting population. Genetics 187:
1139–1152. doi:10.1534/genetics.110.125997.
7. Szendro IG, Franke J, de Visser JAGM, Krug J (2013) Predictability of evolution
depends nonmonotonically on population size. Proc Natl Acad Sci USA 110:
571–576. doi:10.1073/pnas.1213613110.
8. deVisser JAGM, Park SC, Krug J (2009) Exploring the effect of sex on empirical
fitness landscapes. Am Nat 174: S15–30. doi:10.1086/599081.
9. Sanjuan R, Moya A, Elena SF (2004) The contribution of epistasis to the
architecture of fitness in an RNA virus. Proc Natl Acad Sci USA 101: 15376–
15379.
10. Chou H-H, Chiu H-C, Delaney NF, Segre D, Marx CJ (2011) Diminishing
returns epistasis among beneficial mutations decelerates adaptation. Science
332: 1190–1192. doi:10.1126/science.1203799
11. Khan AI, Dinh DM, Schneider D, Lenski RE, Cooper TF (2011) Negative
epistasis between beneficial mutations in an evolving bacterial population.
Science 332: 1193–1196.
12. Rokyta DR, Joyce P, Caudle SB, Miller C, Beisel CJ, et al. (2011) Epistasis
between beneficial mutations and the phenotype-to-fitness map for a ssDNA
virus. PLoS Genet 7: e1002075. doi:10.1371/journal.pgen.1002075.
13. Maharjan RP, Ferenci T (2013) Epistatic interactions determine the mutational
pathways and coexistence of lineages in clonal Escherichia coli populations.
Evolution 67: 2762–2768 doi:10.1111/evo.12137.
14. Chou H-H, Berthet J, Marx CJ (2009) Fast growth increases the selective
advantage of a mutation arising recurrently during evolution under metal
limitation. PLoS Genet 5: e1000652. doi:10.1371/journal.pgen.1000652.
15. Kvitek DJ, Sherlock G (2011) Reciprocal sign epistasis between frequently
experimentally evolved adaptive mutations causes a rugged fitness landscape.
PLoS Genet 7: e1002056 doi:10.1371/journal.pgen.1002056.
16. Weinreich DM, Watson RA, Chao L (2005) Perspective: Sign epistasis and
genetic constraint on evolutionary trajectories. Evolution 59: 1165–1174.
17. Schenk MF, Szendro IG, Salverda ML, Krug J, de Visser JAGM (2013) Patterns
of epistasis between beneficial mutations in an antibiotic resistance gene. Mol
Biol Evol 30: 1779–1787. doi:10.1093/molbev/mst096.
18. Fisher RA (1930) The genetical theory of natural selection. Clarendon Press,
Oxford, UK.
19. Martin G, Elena SF, Lenormand T (2007) Distributions of epistasis in microbes
fit predictions from a fitness landscape model. 39: 555–560.
20. Kascer H, Burns JA (1973) The control of flux. Symp Soc Exp Biol 27: 65–104.
21. Heinrich R, Rapoport TA (1974) A linear steady-state treatment of enzymatic
chains. General properties, control and effector strength 42: 89–95.
22. Watt WB, Dean AM (2000) Molecular-functional studies of adaptive genetic
variation in prokaryotes and eukaryotes. Annu Rev Genet 34: 593–622.
23. Szathmary E (1993) Do deleterious mutations act synergistically? Metabolic
control theory provides a partial answer. Genetics 133: 127–132.
24. Keightley PD (1996) Metabolic models of selection response. J Theor Biol 182:
311–316.
25. Stoebel DM, Dean AM, Dykhuizen DE (2008) The cost of expression of
Escherichia coli lac operon proteins in in the process, not in the products. Genetics
178: 1653–1660. doi:10.1534/genetics.107.085399.
26. Dekel E, Alon U (2005) Optimality and evolutionary tuning of the expression
level of a protein. Nature 436: 588–592. doi:10.1038/nature03842.
27. Flamholz A, Noor E, Bar-Even A, Liebermeister W, Milo R (2013) Glycolytic
strategy as a tradeoff between energy yield and protein cost. Proc Natl Acad Sci
USA 110: 10039–10044. doi:10.1073/pnas.1215283110.
28. Chistoserdova L, Vorholt JA, Thauer RK, Lidstrom ME (1998) C1 transfer
enzymes and coenzymes linking methylotrophic bacteria and methanogenic
Archaea. Science 281: 99–102.
29. Marx CJ, Chistoserdova L, Lidstrom ME (2003) Formaldehyde-detoxifying role
of the tetrahydromethanopterin-linked pathway in Methylobacterium extorquens
AM1. J Bacteriol 185: 7160–7168.
30. Marx CJ, Laukel M, Vorholt JA, Lidstrom ME (2003) Purification of the
formate-tetrahydrofolate ligase from Methylobacterium extorquens AM1 and
demonstration of its requirement for methylotrophic growth. J Bacteriol 185:7169–7175.
31. Marx CJ, Van Dien SJ, Lidstrom ME (2005) Flux analysis uncovers the key role
of functional redundancy in formaldehyde metabolism. PLoS Biol 3: e16.
32. Crowther GJ, Kosaly G, Lidstrom ME (2008) Formate as the main branch point
for methylotrophic metabolism in Methylobacterium extorquens AM1. J Bacteriol
190: 5057–5062. doi:10.1128/JB.oo228-08.
33. Lee M-C, Marx CJ (2013) Synchronous waves of failed soft sweeps in the
laboratory: remarkably rampant clonal interference of alleles at a single locus.
Genetics 193: 943–952. doi:1534/genetics.112.148502.
34. Chou H-H, Marx CJ (2012) Optimization of gene expression through divergent
mutational paths. Cell Rep 1: 133–140. doi:10.1016/j.celrep.2011.12.003.
35. Lee M-C, Chou H-H, Marx CJ (2009) Asymmetric, bimodal trade-offs duringadaptation of Methylobacterium to distinct growth substrates. Evolution 63: 2816–
2830. doi:10.1111/j.1558-5646.2009.00757.x.
36. Wilke CO, Adami C (2001) Interaction between directional epistasis and
average mutational effects. Proc Biol Sci 268: 1469–1474.
37. Gros PA, Le Nagard H, Tenaillon O (2009) The evolution of epistasis and its
links with genetic robustness, complexity and drift in a phenotypic model of
adaptation. Genetics 182: 277–293. doi:10.1534/genetics.108.099127
38. Weinreich DM, Delaney NF, Depristo MA, Hartl DL (2006) Darwinian
evolution can follow only very few mutational paths to fitter proteins. Science
312: 111–114.
39. Draghi JA, Plotkin J (2013) Selection biases the prevalence and type of epistasis
among beneficial substitutions. Evolution 67: 3120–3131. doi:10.1111/
evo.12192.
40. Hartl DL, Dykhuizen DE, Dean AM (1985) Limits of adaptation: the evolution
of selective neutrality. Genetics 111: 655–674.
41. Dykhuizen DE, Dean AM, Hartl DL (1987) Metabolic flux and fitness. Genetics115: 25–31.
42. Dong H, Nilsson L, Kurland CG (1995) Gratuitous overexpression of genes in
Escherichia coli leads to growth inhibition and ribosome destruction. J Bacteriol177: 1497–1504.
43. Lunzer M, Miller SP, Felsheim R, Dean AM (2005) The biochemical
architecture of an ancient adaptive landscape. Science 310: 499–501.
44. Giver L, Gershenson A, Freskgard P-O, Arnold FH (1998) Directed evolution ofa thermostable esterase. Proc Natl Acad Sci USA 95: 12809–12813.
doi:10.1073/pnas.95.22.12809.
45. DePristo MA, Weinreich DM, Hartl DL (2005) Missense meanderings insequence space: a biophysical view of protein evolution. Nat Rev Genet 6: 678–
687.
46. Chiu H-C, Marx CJ, Segre D (2012) Epistasis from functional dependence offitness on underlying traits. Proc Biol Sci 279: 4156–4164. doi:10.1098/
rspb.2012.1449.
47. Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, et al. (2005) Thetranscriptional consequences of mutation and natural selection in Caenorhabditis
elegans. Nat Genet 37: 544–548.
48. Lemos B, Meiklejohn CD, Caceres M, Hartl DL (2005) Rates of divergence ingene expression profiles of primates, mice, and flies: stabilizing selection and
variability among functional categories. Evolution 59: 126–137.
49. Wagner A (2005) Energy constraints on the evolution of gene expression. MolBiol Evol 22: 1365–1374.
50. Bedford T, Hartl DL (2009) Optimization of gene expression by natural
selection. Proc Natl Acad Sci USA 106: 1133–1138. doi:10.1073/pnas.0812009106.
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 10 February 2014 | Volume 10 | Issue 2 | e1004149
51. Lande R, Arnold SJ (1983) The measurement of selection on correlated
characters. Evolution 37: 1210–1226.52. Poelwijk FJ, de Vos MG, Tans SJ (2011) Tradeoffs and optimality in the
evolution of gene regulation. Cell 146: 462–470. doi:10.1016/j.cell.2011.06.035.
53. Chubiz LM, Lee M-C, Delaney NF, Marx CJ (2012) FREQ-Seq: a rapid, cost-effective, sequencing-based method to determine allele frequencies directly from
mixed populations. PLoS One 7: e47959. doi:10.1371/journal.pone.0047959.54. Yi X, Dean AM (2013) Bounded population sizes, fluctuating selection and the
tempo and mode of coexistence. Proc Natl Acad Sci USA 110:16945–16950.
doi:10.1073/pnas.1309830110.55. Gerrish PJ, Lenski RE (1998) The fate of competing beneficial mutations in an
asexual population. Genetica 102–103: 127–144.56. Woods RJ, Barrick JE, Cooper TF, Shrestha U, Kauth MR, et al. (2011)
Second-order selection for evolvability in a large Escherichia coli population.Science 331:1433–1436. doi:10.1126/science.1198914.
57. Lang GI, Botstein D, Desai MM (2011) Genetic variation and the fate of
beneficial mutations in asexual populations. Genetics 188:647–661.doi:10.1534/genetics.111.128942.
58. Lang GI, Rice DP, Hickman MJ, Sodergren E, Weinstock GM, et al. (2013)Pervasive genetic hitchhiking and clonal interference in forty evolving yeast
populations. Nature 500(7464):571–4 doi:10.1038/nature12344.
59. Lee M-C, Marx CJ (2012) Repeated, selection-driven genome reduction ofaccessory genes in experimental populations. PLoS Genet 8: e1002651.
doi:10.1371/journal.pgen.1002651.60. Carroll SM, Marx CJ (2013) Evolution after introduction of a novel metabolic
pathway consistently leads to restoration of wild-type physiology. PLoS Genet 9:e1003427. doi:10.1371/journal.pgen.1003427.
61. Marx CJ (2008) Development of a broad-host-range sacB-based vector for
unmarked allelic exchange. BMC Res Notes 1: 1. doi:10.1186/1756-0500-1-1.
62. Van Dien SJ, Marx CJ, O’Brien BN, Lidstrom ME (2003) Genetic
characterization of the carotenoid biosynthetic pathway in Methylobacterium
extorquens AM1 and isolation of a colorless mutant. Appl Environ Microbiol 69:
7563–7566.
63. Marx CJ, Lidstrom ME (2004) Development of an insertional expression vectorsystem for Methylobacterium extorquens AM1 and generation of null mutants lacking
mtdA and/or fch. Microbiol 150: 9–19.64. Chistoserdov AY, Chistoserdova LV, McIntire WS, Lidstrom ME (1994)
Genetic organization of the mau gene cluster in Methylobacterium extorquens AM1:
complete nucleotide sequence and generation and characteristics of mau
mutants. J Bacteriol 176: 4052–4065.
65. Marx CJ, Lidstrom ME (2001) Development of improved versatile broad-host-range vectors for use in methylotrophs and other Gram-negative bacteria.
Microbiol 147: 2065–2075.66. Ras J, Van Ophem PW, Reijnders WN, Van Spanning RJ, Duine JA, et al.
(1995) Isolation, sequencing, and mutagenesis of the gene encoding NAD- and
glutathione-dependent formaldehyde dehydrogenase (GD-FALDH) from Para-
coccus denitrificans, in which GD-FALDH is essential for methylotrophic growth.
J Bacteriol 177: 247–251.67. Harms N, Ras J, Reijnders WN, van Spanning RJ, Stouthamer AH (1996) S-
formylglutathione hydrolase of Paracoccus denitrificans is homologous to human
esterase D: a universal pathway for formaldehyde detoxification? J Bacteriol 178:6296–6299.
68. Bradford MM (1976) Rapid and sensitive method for the quantification ofmicrogram quantities of protein utilizing the principle of protein-dye binding.
Anal Biochem 72: 248–254. doi:10.1016/0003-2697(76)90527-3.69. Lee CL, Ow DS, Oh SK (2006) Quantitative real-time polymerase chain
reaction for determination of plasmid copy number in bacteria. J Microbiol
Methods 65: 258–267.
Antagonism on a Gene Expression Fitness Landscape
PLOS Genetics | www.plosgenetics.org 11 February 2014 | Volume 10 | Issue 2 | e1004149
Figure S1. Trends of epistasis as a function of selective coefficient.
0.1 0.2 0.3 0.4 0.5
-0.20
-0.15
-0.10
-0.05
0.00
s
Epistasis
A
0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14
-0.10
-0.05
0.00
0.05
0.10
s
Epistasis
B
0 10 20 30 40 50
-1000
-800
-600
-400
-200
0
s
Epistasis
C
0 5 10 15
-1e+05
-5e+04
0e+00
5e+04
s
Epistasis
D
0 5 10 15 20 25
-1500
-1000
-500
0
s
Epistasis
E
Text S1. Analysis of bias in epistasis as a function of selection coefficient.
The slope of the relationship between the selective coefficient of a mutation in the ancestral background,
and its epistatic interactions with other beneficial mutations, seems intuitively sufficient to reveal a trend toward
increasing synergistic or antagonistic epistasis. If mutations with larger beneficial effects in the ancestor combine
more poorly with other mutations, this slope will be negative; such negative relationships appear consistent with
an explanation of diminishing returns, where the benefit of a mutation decreases monotonically with the fitness of
the genetic background in which it is assayed. However, recent work in a computational model of fitness
landscapes has suggested that ascertainment bias in evolution experiments can produce statistical artifacts that
may be hard to distinguish from the patterns expected with diminishing returns [39]. Here we explore a simple,
illustrative model to show the fitness landscapes without diminishing returns can show spurious negative
correlations between the benefit of mutations in the ancestor and their fitness effects in combination.
Our simplified fitness landscape model starts with L sites, each of which permits a single alternative allele.
By assumption, each of these mutations has an inherent, or mean effect, which is independent of the genetic
background, and a random, or epistatic, effect, which reflects the influence of the genetic background on that
allele’s fitness contribution. The inherent effect of a locus i, µi, is drawn independently and identically from a
Gaussian distribution with mean zero and standard deviation msd. Random effects have a mean of zero and a
standard deviation which is specific for each locus; these standard deviations, σi, are drawn independently and
identically from an exponential with mean rsd. The random effects for loci i in background g are then drawn from
a Gaussian with mean zero and standard deviation σi. Let N(a, b) stand for a Gaussian draw with mean a and
standard deviation b and E (c) for an exponential draw with mean c; then, the inherent and random effects of the
mutation at locus i are:
µi = N(0, msd) Eq. 1
σi = E(rsd) Eq. 2
If the ancestor has a fitness of one, then the fitness of the genotype with a mutation at locus i is:
wi = 1 + µi + N(0, σi) = 1 + µi + ri Eq. 3
where ri is the random effect in the ancestor. Assuming that fitness is calculated on a multiplicative scale, the
fitness of the genotype with mutations at both loci i and j is:
wij = (1 + µi + rij)(1 + µj + rji) Eq. 4
where rij is the random effect of mutation i in the background containing j, and rji is the random effect of mutation
j with i. We can then calculate epistasis between mutations at loci i and j by subtracting the expected fitness from
the of the double mutant from its actual fitness, as calculated with Eq. 4.
εij = wij - wiwj Eq. 5
Epistasis, as represented in this model, is not biased toward antagonism at higher fitnesses; we therefore
do not expect to see signatures of diminishing returns from experiments conducted on these model landscapes. If a
trend toward negative regressions of ε on s of mutations is observed for beneficial mutations in this model, then
we can conclude that the process of selection makes these regressions vulnerable to spurious indications of
diminishing returns.
To generate each set of beneficial mutations, we first assign fitness effects in the ancestor using Eq. 3.
Then, we select a group of n beneficial mutations with a simplified model of an evolutionary process. This
selection step reflects an ascertainment bias: mutations that escape genetic drift and reach fixation are likely to be
very beneficial on the background of the ancestor, and are therefore likely to exhibit a regression to the mean
when tested on other genetic backgrounds. To illustrate the effects of this potential bias, we consider two
procedures for selecting beneficial mutations. In the first, which we will call weak selection, all beneficial
mutations are equally likely to be chosen. In the second, labeled strong selection, beneficial mutations are chosen
proportionally to their selective coefficients.
In each simulated experiment, six unique beneficial mutations are chosen according to either the weak or
strong selection procedure, and the slope of the relationship between s and ε is determined by linear regression.
These experiments are replicated 10,000 times for each parameter values, and we report the mean slope and the
fraction of slopes that are negative. Because of the nature of selection in our model, only the relative magnitudes
of selective coefficients are important; we therefore only consider the ratio msd:rsd.
Mean regression slopes, and the fraction of slopes that are negative, for strong selection simulations. Dashed
lines indicate the unbiased expectations. Parameter values are msd = 0.01, rsd = 0.05 for the ratio 0.2, msd = 0.05,
rsd = 0.05 for the ratio 1, and msd = 0.05, rsd = 0.01 for the ratio 5.
The above figure summarizes the bias toward negative slopes between selective coefficients in the
ancestor and epistasis in pairs for strong selection. For all parameter values, regression slopes are more likely to
be negative. This reflects the sorting effects of selection, where fixed mutations are likely to be unusually good on
the background in which they were selected, relative to other genetic backgrounds. This bias is more severe when
the number of loci is large and the magnitude of random effects is large compared to mean effects. The results for
the weak selection model are similar with two differences: biases are slightly smaller, and the effect of L is
negligible (data not shown). These complex interactions between parameters, which depend on aspects of both the
fitness landscape itself and the population process of selecting the mutations, highlight the difficulty in attempting
to correct for these biases.
0.2 1 5
L = 20L = 100
Ratio of Mean to Epistatic Effects
Mea
n S
lope
-1.0
-0.8
-0.6
-0.4
-0.2
0.0
0.2
0.2 1 5
L = 20L = 100
Ratio of Mean to Epistatic Effects
Frac
tion
of N
egat
ive
Slo
pes
0.0
0.2
0.4
0.6
0.8
1.0
1
Supplementary Text S2. Derivation of the fitness model.
Comparison of current benefit – cost model to previous analysis in M. extorquens.
The present formulation of fitness as a function of benefits and costs bears some
similarities to a prior phenomenological model for M. extorquens [10], but there are substantial
differences. In order to analyze epistasis amongst four beneficial mutations that occurred in a
single evolving lineage of the EM strain, we had noted that enzyme expression cost was a
substantial component of adaptation and built a simple model of fitness around considering the
impact cost would have. Three of the four alleles identified in an evolved (EVO) strain appeared
to increase fitness at least partly by reducing this cost. Therefore, like models for a single enzyme
[26], we partitioned fitness into a ‘benefit’ component b0, analogous to a single conglomerate
‘enzyme activity’ that sets the rate of energy extracted from the substrate to generate biomass;
and a cost c0, encompassing a fixed amount of energy diverted to deal with over-expression of
the foreign pathway. Thus the fitness of the ancestral strain can be written, as W0 = b0-c0 = 1. We
hypothesized that new alleles could modify benefit and cost by multiplicative factors (λi and θi,
respectively), giving rise to a fitness Wi = λib0-θic0. A successive allele j, on top of the
background of mutant i, is similarly assumed to act multiplicatively on the benefit and cost
components, yielding a fitness Wij = λiλjb0-θiθjc0.
There are three primary differences between the model described above and the one
developed in the current work. First, we measured changes in the enzyme levels themselves and
using these as an intermediate level phenotype that is the basis of calculating fitness. This
contrasts with the above model that only considered the fitness values of various strains and
abnormal cellular morphology as a proxy for costs. Second, we distinguish between the two
enzymes rather than lumping them together, for breaking the original correlation in their
expression was critical for adaptation [34]. Third, we explicitly describe the pleiotropy that links
the consistent, predictable changes in benefit and costs that occur when expression of FlhA is
altered. The above model contains the potential for pleiotropy in that each mutation could have
values of λi and θi that both differ from 1, but here that shape is established based upon a
simplification of MCA [40]. It is because of these linkages to the underlying physiology that the
current model can generate accurate predictions of fitness for mutational combinations not used
in the training set (R2 = 0.98).
2
Mechanistic derivation of the fitness model from Metabolic Control Analysis
Our fitness model is a biologically inspired but ultimately empirical description of the effect of
the measured enzyme levels on fitness. However, it is also possible to directly derive this model
from an underlying mechanistic model, as explained in this supplemental text. We would like to
gratefully acknowledge that this derivation was originally provided by an anonymous reviewer of
this manuscript.
Consider a model where fitness (𝑊) was proportional to the flux through a pathway, (𝐽), with
linear penalties for the concentration of a toxic metabolic intermediate (𝑆!) as well as the
concentration of the enzymes 𝐹𝑙ℎ𝐴 and 𝐹𝑔ℎ𝐴 along this pathway, such that:
𝑊 = 𝛽!𝐽 − 𝛽!𝑆! − 𝛽!𝐹𝑙ℎ𝐴 − 𝛽!𝐹𝑔ℎ𝐴
In our conception of this model, the toxic metabolite is formaldehyde, which appears upstream of
the enzymatic reactions catalyzed by FlhA and FghA in the pathway of methanol metabolism in
this organism.
Additionally, imagine that the pathway that determines fitness is a linear pathway that contains
the toxic intermediate 𝑆! as an intermediate step. This pathway begins at a substrate 𝑆! whose
concentration is fixed, and ends at the irreversible reaction catalyzed by 𝐸!!!. Further, assume
that this pathway is governed by mass action kinetics and that the kinetic rates of the forward and
reverse reactions do not change during the evolution experiment.
3
One can show that this model can be reduced to our model for fitness. This derivation depends on
many classic results from metabolic control theory, and if these are unfamiliar to the reader they
are encouraged to consult a reference book before examining this result (or Appendix C in [20]).
First make the following definitions:
𝑘! = 𝐹𝑜𝑟𝑤𝑎𝑟𝑑 𝑟𝑎𝑡𝑒 𝑡ℎ𝑟𝑜𝑢𝑔ℎ 𝑡ℎ𝑒 𝑥𝑡ℎ 𝑟𝑒𝑎𝑐𝑡𝑖𝑜𝑛
𝑘!! = 𝑅𝑒𝑣𝑒𝑟𝑠𝑒 𝑟𝑎𝑡𝑒 𝑡ℎ𝑟𝑜𝑢𝑔ℎ 𝑡ℎ𝑒 𝑥𝑡ℎ 𝑟𝑒𝑎𝑐𝑡𝑖𝑜𝑛
𝑞! =𝑘!𝑘!!
= 𝑟𝑎𝑡𝑖𝑜 𝑜𝑓 𝑟𝑎𝑡𝑒𝑠
𝐾!,! = 𝑞!
!
!!!
𝑒! = 𝐶𝑜𝑛𝑐𝑒𝑛𝑡𝑟𝑎𝑡𝑖𝑜𝑛 𝑜𝑓 𝑒𝑛𝑧𝑦𝑚𝑒 𝑋
𝐸! = 𝑒!𝑘! = 𝑒𝑛𝑧𝑦𝑚𝑒 𝑡𝑖𝑚𝑒𝑠 𝑓𝑜𝑟𝑤𝑎𝑟𝑑 𝑟𝑎𝑡𝑒 𝑐𝑜𝑛𝑠𝑡𝑎𝑛𝑡
Then for a linear pathway of length 𝑛 reactions, the flux through the pathway will be equal to:
𝐽 =𝑆!1
𝐾!,!𝐸!!!!!
Because the flux through the entire pathway is equal to the flux from the toxic intermediate as
well, we can set these two fluxes to be equal and solve for 𝑆!.
𝐽 =𝑆!1
𝐾!,!𝐸!!!!!
=𝑆!1
𝐾!,!𝐸!!!!!
𝑆! =𝑆!
1𝐾!,!𝐸!
!!!!
1𝐾!,!𝐸!
!!!!
4
Substituting back in to our original equation for fitness we obtain:
𝑊 = 𝛽!𝐽 − 𝛽!𝑆! − 𝛽!𝑒!"!! − 𝛽!𝑒!"!!
𝑊 = 𝛽!𝑆!1
𝐾!,!𝐸!!!!!
− 𝛽!𝑆!
1𝐾!,!𝐸!
!!!!
1𝐾!,!𝐸!
!!!!
− 𝛽!𝑒!"!! − 𝛽!𝑒!"!!
𝑊 = 𝑆!𝛽! − 𝛽!
1𝐾!,!𝐸!
!!!!
1𝐾!,!𝐸!
!!!!
− 𝛽!𝑒!"!! − 𝛽!𝑒!"!!
Note that: !!!,!
= !!,!!!,!
and that the control coefficient for an enzyme in a pathway are equal to:
𝐶! =
𝑒!𝐸!𝐾!,!
𝑒!𝐸!𝐾!,!
!!!!
As before 𝑒! is the concentration of enzyme for the 𝑥𝑡ℎ step. By rearrangement one can then
obtain:
𝑊 = 𝑆!𝛽! − 𝛽!
𝐾!,!𝐸!𝐾!,!
!!!!
1𝐾!,!𝐸!
!!!!
− 𝛽!𝑒!"!! − 𝛽!𝑒!"!!
𝑊 = 𝑆!𝛽!
1𝐾!,!𝐸!
!!!!
− 𝛽!𝐾!,!
1𝐾!,!𝐸!
1𝐾!,!𝐸!
!!!!
!
!!!
− 𝛽!𝑒!"!! − 𝛽!𝑒!"!!
5
𝑊 = 𝑆!𝛽!
1𝐾!,!𝐸!
!!!!
− 𝛽!𝐾!,!𝐶!𝑒!
!
!!!
− 𝛽!𝑒!"!! − 𝛽!𝑒!"!!
We now scale all the enzyme concentrations except FlhA and FghA to be equal to 1 (as we are
modeling their concentrations for different mutational combinations as effectively constant), and
define two constants for notational ease that are related to the control coefficients as:
𝑎 =𝐶!"!!𝑒!"!!
+𝐶!"!!𝑒!"!!
+ 𝐶!!!!;!!!"!!;!!!"!!
𝑏 =1
𝐾!,! 𝐸!!!!"!!;!!!"!!
Fitness can then be rearranged as:
𝑊 = 𝑆!𝛽!
𝑏 + 1𝐾!,!"!! 𝐸!"!!
+ 1𝐾!,!"!! 𝐸!"!!
− 𝛽!𝐾!,! 𝑎 +𝐶!"!!𝑒!"!!
+𝐶!"!!𝑒!"!!
− 𝛽!𝑒!"!!
− 𝛽!𝑒!"!!
Assume 𝑐!"!! is much smaller than 𝑐!"!! (justifiable given the observation that this enzyme
could be dropped almost to negligible concentrations relative to wild-type). This implies that: !
!!"!!!!"!!≫ !
!!"!!!!"!!, and justifies ignoring these FghA terms leading to:
𝑊 = 𝑆!𝛽!
𝑏 + 1𝐾!,!"!! 𝐸!"!!
− 𝛽!𝐾!,! 𝑎 +𝐶!"!!𝑒!"!!
− 𝛽!𝑒!"!! − 𝛽!𝑒!"!!
6
𝑊 =𝑆!𝛽!
𝑏 + 1𝐾!,!"!! 𝐸!"!!
− 𝑆!𝛽!𝐾!,!𝑎 +𝑆!𝛽!𝐾!,!𝐶!"!!
𝑒!"!!− 𝛽!𝑒!"!! − 𝛽!𝑒!"!!
𝑊 =𝑆!𝛽!
𝑏 + 1𝐾!,!"!! 𝐸!"!!
− 𝑆!𝛽!𝐾!,!𝑎 − 𝑒!"!! 𝛽! +𝑆!𝛽!𝐾!,!𝐶!"!!
𝑒!"!! ! − 𝛽!𝑒!"!!
𝑊 =𝑒!"!!𝑆!𝛽!
𝑏𝑒!"!! +
1𝑘!"!!𝐾!,!"!!𝑏
− 𝑆!𝛽!𝐾!,!𝑎 − 𝑒!"!! 𝛽! +𝑆!𝛽!𝐾!,!𝐶!"!!
𝑒!"!! ! − 𝛽!𝑒!"!!
Renaming the constants:
𝑊 =𝑒!"!!𝛾!𝑒!"!! + 𝛾!
− 𝛾! − 𝑒!"!! 𝛾! +𝛾!
𝑒!"!! ! − 𝛾!𝑒!"!!
And finally, as our model is designed to explore data where the enzyme concentration of FlhA is
decreasing relative to the wildtype (𝑒!"!! < 1.0) we can regard the term !!!!"!! ! as relatively less
important over the parameter range we are fitting (or that the penalty term in parenthesis on the
right can be approximated by a simple linear penalty), leading to our model for fitness:
𝑊 =𝑒!"!!𝛾!𝑒!"!! + 𝛾!
− 𝛾! − 𝛾!𝑒!"!! − 𝛾!𝑒!"!!
Table S1. Genotypes and phenotypes of pCM410, pHC112, and their derivatives. aThe nucleotide positions of mutations on pCM410 (Genbank Accession no. FJ389188)
are indicated in parenthesis. bThe concentrations of cumate (µM) applied to induce expression of the flhA-fghA
cassette in pHC112 and its derivatives. cPhenotypes during growth in methanol (15 mM) minimal media supplemented. Data are
reported as means and 95% confidence intervals of three independent measurements.
Plasmids were quantified as copies per genome. Enzyme activities of FlhA and FghA
were measured as the amount of substrate being catalyzed per milligram of cellular
protein per second (mM sec-1 mg-1).
Plasmid
pCM410
pHC112
aMutations bInducer cPhenotypeIndex Description Plasmid copy FlhA FghA Fitness0 ancestral plasmid 0 8.875 ± 1.091 0.669 ± 0.045 1.562 ± 0.119 1.007 ± 0.009A1 Δ(3791–3801) 0 9.081 ± 0.912 0.487 ± 0.018 0.301 ± 0.029 1.300 ± 0.027A2 flhA...ISMex4 (3771/ 3772) 0 8.866 ± 0.891 0.477 ± 0.073 0.332 ± 0.066 1.353 ± 0.022A3 T→C (3804) 0 9.451 ± 1.652 0.442 ± 0.016 0.045 ± 0.007 1.420 ± 0.032B2 Δ(9141–113) 0 6.177 ± 0.791 0.460 ± 0.010 1.076 ± 0.123 1.326 ± 0.032B3 T→G (9136) 0 7.211 ± 0.558 0.639 ± 0.007 1.351 ± 0.054 1.075 ± 0.012B2B3 Δ(9141–113), T→G (9136) 0 5.807 ± 0.664 0.427 ± 0.013 0.932 ± 0.117 1.323 ± 0.026B5 trfA::ISMex25(8302/8303) 0 2.199 ± 0.563 0.318 ± 0.011 0.093 ± 0.013 1.457 ± 0.033A1B3 Δ(3791–3801), T→G (9136) 0 8.015 ± 1.150 0.484 ± 0.037 0.315 ± 0.068 1.402 ± 0.034A1B2 Δ(3791–3801), Δ(9141–113) 0 6.156 ± 1.087 0.366 ± 0.019 0.199 ± 0.026 1.478 ± 0.069A1B2B3 Δ(3791–3801), Δ(9141–113), T→G (9136) 0 5.914 ± 0.779 0.341 ± 0.016 0.201 ± 0.040 1.492 ± 0.036A1B5 Δ(3791–3801), trfA ::ISMex25(8302/8303) 0 1.853 ± 0.341 0.108 ± 0.018 0.100 ± 0.022 0.365 ± 0.037A2B3 flhA...ISMex4 (3771/ 3772), T→G (9136) 0 8.553 ± 1.421 0.491 ± 0.040 0.311 ± 0.049 1.366 ± 0.042A2B2 flhA...ISMex4 (3771/ 3772), Δ(9141–113) 0 5.897 ± 0.606 0.423 ± 0.023 0.190 ± 0.021 1.372 ± 0.064A2B2B3 flhA...ISMex4 (3771/ 3772), Δ(9141–113), T→G (9136) 0 5.186 ± 0.418 0.392 ± 0.029 0.174 ± 0.033 1.488 ± 0.015A2B5 flhA...ISMex4 (3771/ 3772), trfA::ISMex25(8302/8303) 0 2.115 ± 0.539 0.113 ± 0.030 0.104 ± 0.029 0.448 ± 0.047A3B3 T→C (3804), T→G (9136) 0 7.880 ± 0.793 0.419 ± 0.067 0.052 ± 0.004 1.502 ± 0.024A3B2 T→C (3804), Δ(9141–113) 0 5.089 ± 0.728 0.298 ± 0.039 0.041 ± 0.008 1.483 ± 0.101A3B2B3 T→C (3804), Δ(9141–113), T→G (9136) 0 4.620 ± 1.058 0.292 ± 0.020 0.043 ± 0.002 1.428 ± 0.088A3B5 T→C (3804), trfA::ISMex25(8302/8303) 0 1.907 ± 0.252 0.171 ± 0.010 0.038 ± 0.004 1.097 ± 0.081R cumate-inducible expression plasmid 0 8.645 ± 1.402 0.131 ± 0.014 0.902 ± 0.141 0.584 ± 0.074
0.05 9.347 ± 1.359 0.143 ± 0.019 1.052 ± 0.223 0.804 ± 0.0080.1 9.002 ± 1.652 0.177 ± 0.018 1.129 ± 0.087 1.051 ± 0.0260.15 9.045 ± 1.838 0.281 ± 0.046 1.139 ± 0.200 1.141 ± 0.0160.25 9.375 ± 1.832 0.311 ± 0.026 1.190 ± 0.152 1.192 ± 0.0060.5 8.714 ± 1.007 0.356 ± 0.043 1.350 ± 0.178 1.170 ± 0.0021 9.665 ± 0.927 0.466 ± 0.074 1.474 ± 0.139 1.141 ± 0.00115 7.206 ± 1.391 0.627 ± 0.094 1.715 ± 0.172 1.065 ± 0.027
RB2 cumate-inducible expression, Δ(9141–113) 0 6.350 ± 1.380 0.112 ± 0.026 0.653 ± 0.078 0.275 ± 0.0400.05 0.625 ± 0.0430.1 6.255 ± 1.347 0.132 ± 0.021 0.752 ± 0.023 0.883 ± 0.0570.15 1.225 ± 0.0550.25 5.957 ± 1.082 0.248 ± 0.033 0.789 ± 0.026 1.224 ± 0.0270.5 1.226 ± 0.0761 1.228 ± 0.02315 6.053 ± 0.659 0.419 ± 0.105 1.372 ± 0.226 1.227 ± 0.089
RB3 cumate-inducible expression, T→G (9136) 0 7.355 ± 0.887 0.128 ± 0.028 0.836 ± 0.068 0.463 ± 0.0450.05 0.793 ± 0.0520.1 7.065 ± 0.539 0.172 ± 0.032 0.983 ± 0.088 1.040 ± 0.0110.15 1.138 ± 0.0390.25 7.101 ± 0.608 0.332 ± 0.039 1.182 ± 0.045 1.210 ± 0.0640.5 1.194 ± 0.0031 1.118 ± 0.01515 6.793 ± 0.191 0.619 ± 0.101 1.641 ± 0.158 1.085 ± 0.014
RB2B3 cumate-inducible expression, Δ(9141–113), T→G (9136) 0 4.804 ± 0.696 0.107 ± 0.010 0.548 ± 0.114 0.357 ± 0.0330.05 0.692 ± 0.0420.1 5.311 ± 0.553 0.142 ± 0.015 0.619 ± 0.165 0.856 ± 0.0070.15 1.190 ± 0.0770.25 5.551 ± 1.162 0.234 ± 0.034 0.743 ± 0.022 1.226 ± 0.0530.5 1.263 ± 0.0261 1.252 ± 0.02415 5.714 ± 0.270 0.421 ± 0.030 1.430 ± 0.276 1.264 ± 0.017
RB5 cumate-inducible expression, trfA::ISMex25(8302/8303) 0 1.786 ± 0.575 0.108 ± 0.023 0.555 ± 0.099 0.307 ± 0.0290.05 0.582 ± 0.0200.1 2.191 ± 0.612 0.122 ± 0.013 0.609 ± 0.052 0.641 ± 0.0340.15 1.142 ± 0.0390.25 1.943 ± 0.568 0.268 ± 0.057 0.753 ± 0.152 1.248 ± 0.0330.5 1.267 ± 0.0321 1.287 ± 0.04315 2.527 ± 0.859 0.434 ± 0.025 1.305 ± 0.160 1.330 ± 0.024
Table S2. Bacterial strains and plasmids.
Strain or plasmid Description Source or reference Strains CM502 Wild-type, crtI502 [61] CM624 crtI502, ΔmptG [10] CM701 ΔmptG, pCM410 [10] CM702 crtI502, ΔmptG, pCM410 [10] CM1232 crtI502, katA::(loxP-trrnB-PtacA-mCherry-tT7), ΔmptG, pCM410 [10] Plasmids pCM410 pCM160 bearing the flhA-fghA cassette downstream of PmxaF [10] pHC112 pCM410 bearing Plac-cymR and a CymR binding site (CuOcmt)
between PmxaF and the flhA-fghA cassette [34]
bpCM410Index Derivatives of pCM410 This study bpHC112Index Derivatives of pCM112 This study aKmr, kanamycin resistance bIndex, plasmid derivatives of pCM410, pCM160, pHC112 carrying mutations that occurred on pCM410
during experimental evolution of the EM strain. See Figure 1B and Table S1 for details.
Table S3. The value of fitted parameters in the fitness landscape model.
Parameter Value Vmax 175.841 Eh 0.186 Threshold 173.230 C1 0.011 C2 0.002