Upload
others
View
10
Download
0
Embed Size (px)
Citation preview
Automated design of synthetic microbial communities
Behzad D. Karkaria,1 Alex J. H. Fedorec,1 and Chris P. Barnes1,2∗
1Department of Cell & Developmental Biology, University College London, London, WC1E 6BT, UK
2UCL Genetics Institute, University College London, London, WC1E 6BT, UK
∗To whom correspondence should be addressed;
E-mail: [email protected]
Abstract
In naturally occurring microbial systems, species rarely exist in isolation. There is strong eco-
logical evidence for a positive relationship between species diversity and the functional output
of communities. The pervasiveness of these communities in nature highlights that there may be
advantages for engineered strains to exist in cocultures as well. Building synthetic microbial com-
munities allows us to create distributed systems that mitigates issues often found in engineering
a monoculture, especially when functional complexity is increasing. Here, we demonstrate a
methodology for designing robust synthetic communities that use quorum sensing to control
amensal bacteriocin interactions in a chemostat environment. We explore model spaces for two
and three strain systems, using Bayesian methods to perform model selection, and identify the
most robust candidates for producing stable steady state communities. Our findings highlight
important interaction motifs that provide stability, and identify requirements for selecting genetic
parts and tuning the community composition.
Introduction
Traditionally, in biotechnology and synthetic biology, a microbe is engineered and grown as a mono-
culture to perform a particular function. Novel functionality is imparted by introducing heterologous
genetic processes that would not normally be found in the organism. Unintended interactions be-
tween the introduced heterologous processes can cause the engineered function to behave in an
1
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
unintended manner [1, 2, 3]. Metabolic burden imposed by heterologous expression can significantly
slow growth rates and encourage selection of mutants [4]. The limited cellular resource availability
can cause circuits to behave differently when expressed alongside one another [5, 6]. Opportunities
for unintended interactions will increase exponentially with the number of heterologous processes
introduced into a host strain. These limitations are potentially circumvented with the use of microbial
communities, which can be used to implement distributed systems. A distributed system would al-
low us to allocate functional components between subpopulations of cells, creating physical barriers
that insulate processes from one another and distributing the burden of heterologous expression [7].
This allows us to scale complexity in a manner that could not be achieved under the limitations of a
monoculture. In natural environments, we observe mixed-species microbial communities that exhibit
competitive advantages over monocultures in productivity, resource efficiency, metabolic complex-
ity and resistance to invasion [8, 9]. Being able to predictably and reproducibly construct microbial
communities for synthetic biology or biotechnology applications would allow us to harness these
advantages.
While the potential advantages of building distributed systems for synthetic biology applications
are clear, the maintenance and control of microbial communities comes with its own challenges. The
competitive exclusion principle states that when multiple populations compete for a single limiting
resource (in the absence of other interactions), a single population with the highest fitness will drive
the others to extinction [10]. Evidence from natural microbial systems and ecological studies have
shown us that stability can arise through feedback between subpopulations. Both cooperative and
competitive interactions are important for integrating feedback that can stabilise communities by
manipulating growth or fitness of the subpopulations [11, 12, 13, 14, 15, 16].
Communication between members of a microbial community is important for the dynamic regula-
tion of its behaviour. Quorum sensing (QS) systems are a key set of tools that enable us to engineer
communication between and within subpopulations of a community. In natural systems, they can be
used to regulate genetic processes as a function of population density [17]. QS systems consist of
a biosynthetic module that produces small, freely diffusible molecules. These molecules bind reg-
ulatory proteins that can activate or repress gene expression at specific promoters [17]. Synthetic
microbial communities have been built using QS to regulate processes that manipulate the growth
rate or fitness of a population. Fitness can be manipulated by the expression of lysis proteins,
2
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
metabolic enzymes, toxins and anti-microbial peptides (AMPs) [14, 18, 19, 20, 21, 22, 23, 24]. In
this study we focus on exploring the use of bacteriocins to manipulate subpopulation growth rates.
Bacteriocins are gene-encoded AMPs that can be used to directly suppress the growth rate of a
sensitive population [25]. They are exported into the extracellular environment, and generally use
“Trojan horse” strategies to enter and kill sensitive strains. One of the most well studied bacteriocins
is microcin V (MccV), a broad spectrum class IIa bacteriocin [25]. It is exported by the cell, and
takes advantage of iron-binding receptors displayed extracellularly to enter a sensitive cell, having
an amensal effect on the target [25]. Expression of immunity genes provide protection against the
bacteriocin and these can be expressed separately or in conjunction with the bacteriocin [26]. Bac-
teriocins present themselves as a versatile actuator of population growth. Their expression by a
single strain can affect any number of subpopulations in a synthetic community. This is in contrast to
lysis proteins and intracellular toxins which affect individual cells. Previously, we have demonstrated
the use of MccV expression to improve plasmid maintenance in a population [27] and for building
stable cocultures that overcome competitive exclusion [28]. Other bacteriocins, such as nisin, have
also been used to produce stable communities [24].
Microbial communities exhibit diverse temporal dynamics including stable steady state, oscil-
lations and deterministic chaos [29, 30]. An important part of designing synthetic communities is
being able to define the temporal dynamics that suit the intended application. In this work we set an
objective of producing robust stable steady state communities in a chemostat environment. Stable
communities allow multiple subpopulations to contribute in a specialised manner to an aggregate
community function. We envisage that through the expression of QS and bacteriocins we can build
’plug-and-play’ stabilising systems. These consist of a set of genetic parts that when expressed
by the appropriate subpopulations will produce a stable steady state community. The design chal-
lenge lies in identifying the most robust combinations of QS and bacteriocin expression, giving us
the highest probability of producing stable steady state in a chemostat environment. Given a small
set of genetic parts, the number of ways they can be combined to produce a candidate system can
become overwhelming; system design by intuition alone becomes increasingly challenging when
dealing with multi-level interactions. Predicting how a system will behave before implementation is
essential for the efficient use of lab resources and fully understanding the interactions that occur
[31]. Models allow us to make data driven decisions concerning wet lab implementation. Model
3
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
selection is a process of identifying good performing models from a selection of candidates [32].
Approximate Bayesian computation with sequential Monte Carlo sampling (ABC SMC) is a model
selection and parameterisation method that has been applied to synthetic biology systems [33]. We
have previously used ABC SMC to identify robust genetic oscillators in two and three-gene networks
[34], and to identify parameter regions that give rise to multistable genetic switches [35]. ABC SMC
also allows us to derive design principles that are key to producing the desired behaviour [34, 35].
Similarly, Yeoh et al. developed software that compared the ability of genetic parts to produce logic
gate behaviours, performing model selection using Akaike information criterion (AIC) [36]. Work-
flows have been developed to design regulatory networks from databases of characterised parts
with defined behaviours [37, 38]. Automated circuit design has the potential to greatly improve the
engineering process in synthetic biology.
Here, we present the first instance of automated synthetic community design. Our workflow
automatically generates candidate systems from a set of parts which can be used to engineer a
community. We use ABC SMC to perform model selection, identifying candidate systems that have
the highest probability of producing stable communities in a chemostat bioreactor. Using these meth-
ods we reveal the optimal designs for two- and three-strain systems. This workflow also allows us to
derive fundamental design principles for building stable communities and reveal critical parameters
to control the community composition.
Results
Automated synthetic microbial Community Designer (AutoCD) workflow
Figure 1 illustrates AutoCD, the workflow developed and applied in this study. First, we set the avail-
able parts which can be used to build a stabilising system in a chemostat environment. This consists
of the number of strains (N), bacteriocins (B), and QS systems (A). Any A in a system can regulate
the expression of any B in the system by induction or repression. Strains in all models are depen-
dent upon a single nutrient resource (S), which is consumed by strains and replenished through
dilution of the chemostat with fresh media. Importantly, all models therefore include nutrient based
competition between subpopulations. Uniform prior distributions are set, describing each part and
their interactions with one another (Table 2). The uniform priors we use span the expected ranges
4
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
these parameters could take. Importantly, in scenarios where the particular parts have already been
selected and characterised, the prior parameters can be constrained. The available parts and prior
parameter distributions serve as inputs to the Model Space Generator, which conducts a series
of combinatorial steps to produce all possible genetic circuits. The Model Space Generator then
builds unique combinations of strains expressing different genetic circuits, where each combination
is a candidate stabilising system. Filtering steps remove inviable, redundant and mirror systems,
yielding a set of unique candidates to be assessed (Methods). The Model Space Generator pro-
duces an ordinary differential equation (ODE) model for each system in the context of the chemostat
environment, and these models form our prior model space.
The final required input is a mathematical description of the objective population behaviour, a
stable steady state. We use three distance functions (d1, d2, d3) to describe how far away a simu-
lation is from the objective stable steady state (Eq 1). d1 is the final gradient of Nx , capturing the
most fundamental characteristic of stable steady state where the population level of a strain is un-
changing. d2 is the standard deviation of a population, chosen to detect unstable behaviours such
as oscillations, favouring simulations that reach stable steady state quickly. d3 is the reciprocal of the
strain population at the end of the simulation, allowing us to define a minimum population density.
Given the three distances, εF defines thresholds below which a simulation meets the requirements
of our stable steady state objective. The distances of all strain populations in a simulation must be
below these thresholds to satisfy the objective behaviour. εF1 was chosen to match the error toler-
ance of the ODE solver and εF2 threshold was chosen through qualitative assessment of simulation
data to define a practical threshold for what stable steady state simulations should look like. εF3 is
set to ensure all populations have a minimum final OD of 0.001. The posterior distribution is made
up of simulations where the distances for each strain population are less than the εF thresholds
(Eq 2). ABC SMC performs model selection on the model space for the objective defined by these
distance functions and εF . A particle is a sampled model and sampled parameters. ABC SMC ini-
tially samples particles from the prior distributions with an unbounded distance threshold. Particles
are propagated through intermediate distributions, gradually reducing the distance thresholds until
they equal εF (Methods). ABC SMC provides an estimation of model and parameter space posterior
probabilities for the given prior distributions and the objective behaviour. The output of ABC SMC is
an approximation of the posterior distribution which we can use to help us design synthetic commu-
5
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
nities and chemostat settings in the lab.
Distance functions:
d1(Nx) = |∆Nx(t − 1)|
d2(Nx) = σ(Nx)
d3(Nx) =1
Nx(t − 1)
(1)
Distance thresholds
εF = {1e−9, 0.001, 1000}
d1 < εF1
d2 < εF2
d3 < εF3
(2)
Designing two strain cocultures that achieve steady state
Here we apply AutoCD to the design of a stable steady state coculture containing two strains. In Fig-
ure 2 we define a model space consisting of two strains (N1, N2), two bacteriocins (B1, B2) and two
QS systems (A1, A2). We set model space limits to enable feasible experimental implementation,
allowing expression of up to one QS per strain and expression of up to one bacteriocin per strain.
Each strain can be sensitive to up to one bacteriocin. Given these conditions, the Model Space Gen-
erator yields 69 unique two strain models (m0,m1...m68). These 69 models serve as a uniform prior
model space upon which we perform model selection using ABC SMC. From the available genetic
parts, there are 17 possible interaction options that could exist between species in each candidate
model. We perform hierarchical clustering on the interactions present in each model, grouping mod-
els based on the similarity of their interactions. This clustering is visualised as a dendrogram in
(Figure 2A). ABC SMC approximates the posterior probability of each model for the stable steady
state objective, indicating how effective the candidate system is in producing stable steady state.
m62 has the highest posterior probability, and is therefore the system which most robustly produces
stable steady state (Figure 2A). m62 consists of two strains exhibiting a cross-protection mutualism
relationship [30]. Each strain expresses an orthogonal QS molecule that represses the expression of
a self-limiting bacteriocin in the opposing strain (Figure 2B). This is an interdependence between the
two strains where the extinction of one strain would result in the extinction of the other. The exclusion
of one strain reduces the fitness of the other, creating a closed feedback loop which overcomes the
competitive exclusion principle.
When designing new systems, minimising the number of genetic parts will reduce the number
of experimental variables, improving the ease of construction and optimisation of a system. We
6
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
subset the model space by the number of expressed species in the system (maximum two QS and
two bacteriocin), yielding subsets containing candidate models with two, three and four expressed
species (low complexity to high complexity). We identify the candidates with the highest posterior
probability in each subset (Figure 2B). We see the posterior probability increasing despite the larger
parameter spaces, which is important because ABC SMC will naturally favour models which yield
stable steady state with the smallest possible number of parameters (Occam’s razor) [39]. We
see that all three models have self-limiting motifs, where a strain is sensitive to the bacteriocin
it produces. All three models are devoid of other-limiting motifs, where a strain is sensitive to a
bacteriocin produced by another strain.
The Bayes Factor (BF ) is a ratio between the marginal likelihoods of two models, giving a quan-
tification of support for one model compared with another. BF > 3.0 indicates evidence of a notable
difference between the two models, while BF < 3.0 suggests insubstantial evidence [40] (Table 1).
The BF of m66 compared with m48 suggests substantial improvement in the posterior probability
can be made by increasing complexity. However, the BF of m48 compared with m62 suggests insub-
stantial evidence behind this improvement in posterior probability. These diminishing returns when
increasing system complexity hold important ramifications for system design. The introduction of an
additional QS part to move from m48 to m62 may not be efficient for the minor improvement in steady
state robustness.
Model selection has identified the best performing designs for producing stable communities.
However, the parts used in the design may require specific characteristics or chemostat experi-
mental setup. ABC SMC also produces posterior parameter distributions for each model, giving us
information about the parameter values necessary to yield stable steady state. Figure 2C shows the
posterior distributions of several tunable parameters in m66 and m62, showing the parameter values
that give rise to stable communities. The dilution rate of the chemostat (D) is a directly tunable
parameter and the maximal expression rate of the bacteriocin (KBmax ) can be tuned through choice
of promoter and ribosome binding-site [41]. The growth rates (µmax ) can be tuned through choice of
base strains or auxotrophic dependencies [42, 43]. The lower model posterior probability of m66 is
reflected in the parameter density distributions, which are generally more constrained that those of
m62. The distributions of D in both systems show a lower dilution rate is important for stable steady
state. KBmax for all bacteriocins is tightly constrained to high maximal bacteriocin expression rates.
7
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
For m66, the correlation coefficients for the two growth rates (µmax1,µmax2) show the parameters are
loosely correlated. Additionally, we see that N1 has a higher maximal growth rate (µmax1) than that
of N2 (µmax2). The faster maximal growth rate of N1 is necessary to counteract self-limitation that
is negatively regulated by the population of N2. Conversely, m62 shows a wider distribution of strain
growth rates at stable steady states and a low correlation coefficient, indicating these parameters
are not as critical to produce the objective behaviour.
Self-limiting motifs stabilise two strain systems
The dendrogram of Figure 2A highlights a cluster of high performing models that are closely related.
This suggests underlying interactions of the model space exist that are important for producing
communities with stable steady state.
Non negative matrix factorization (NMF) is an unsupervised machine learning method we can
use to reduce the dimensionality of the interaction space [44]. We can use NMF to help us under-
stand the underlying motifs and how they affect community stability. We represent each model by the
interactions present in the system (Figure 2A). NMF takes these interactions and learns a number of
clusters (K ), models can be rebuilt by a weighted sum of these clusters. In our case, these clusters
can be represented as interaction motifs. We set K = 4, in order to give us a digestible summary
of the model space. Figure 3A shows the learned motifs that can be used to represent the entire
model space. Figure 3B shows the component weights for each model, defining the membership
each model has for each motif. The models are shown in descending order of posterior probability,
we can see that K2 is heavily weighted in the top performing models. The motif K2 refers to self-
limiting (SL) only interactions where the strain is sensitive to the bacteriocin it produces (Figure 3A,
B). The top models are consistently assigned low weights for K4 (Figure 3A, B), a motif which refers
to other-limiting (OL) only interactions, where the strain is sensitive to a bacteriocin produced by the
other strain (Figure 3A).
The continuous learned motifs from NMF can be hard to interpret. We use the indications pro-
duced by NMF to curate our own discrete motifs. K2 and K4 show us the direction of bacteriocin
sensitivity is an important feature and we proceed to investigate this further. All models can be built
by combining eight fundamental motifs which can be categorised as either SL or OL, based on the
direction of bacteriocin sensitivity (Figure 3C). Within each category, motifs are differentiated by the
8
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
way bacteriocin expression is regulated (Figure 3C). For example, m66 = SL2, m48 = SL4 + SL2
and m62 = SL2 + SL2. In order to assess the importance of each motif for producing stable com-
munities we perform a motif impact analysis (Figure 3D). For each model we identify the nearest
neighbours in the model space that can be built by adding each motif. By calculating the change
in posterior probability for each neighbour, we can summarise the contribution each motif brings to
the model posterior probability (Figure 3D). By repeating this across the entire model space, we are
able to quantify whether a motif is stabilising or destabilising (Figure 3E). The lower quartiles of SL
motifs all show lower negative change magnitudes compared with the lower quartiles of OL motifs.
The upper quartiles of SL motifs show a higher positive change magnitude than that of OL motifs.
Together these show the addition of SL motifs more often result in an improved posterior probability,
whereas addition of OL motifs are more often resulting in decreased posterior probability. The upper
quartile of SL2 shows the motif has the most stabilising effect, closely followed by SL4. We see these
findings are reflected by top models identified in Figure 2B, where all models are constructed with
SL2 and SL4 motifs.
The total output of bacteriocin by a population will be a function of the population’s density. All
SL motifs therefore possess a fundamental negative feedback relationship between growth rate and
density, augmented by the mode of QS regulation. Conversely, the population density and growth
rate of a strain in OL motifs are decoupled. This lack of feedback is a clear explanation as to why
we see SL motifs as positive contributors to stability while OL motifs have a destabilising effect.
Designing three strain communities that achieve steady state
While several studies have demonstrated the ability to establish synthetic two strain systems [45,
46, 23, 47, 48, 49, 20, 50, 51, 52, 53, 54], efforts with three strains are sparser [55, 24, 56]. Having
demonstrated the automated design of two strain systems, we next tackle the far larger challenge
of designing stable three strain communities. The addition of a single strain significantly increases
the parameter space, engineering options and possible interactions. We define our available parts
consisting of three strains (N1, N2, N3), three bacteriocins (B1, B2, B3) and two orthogonal QS
systems (A1, A2). We maintain the same strain engineering restrictions, allowing up to one QS
expression and up to one bacteriocin expression per strain. Each strain can be sensitive to up to
one bacteriocin. Given the available parts and engineering limits, the Model Space Generator yields
9
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
4,182 unique models. Due to the much greater number of models, we group models based upon
the interactions in each model by hierarchical clustering for up to five levels. The average posterior
probabilities of each cluster are shown (Figure 4A). 3,289 models have a posterior probability of zero,
highlighting how much more difficult this design scenario is. ABC SMC identifies m4119 as the system
with the highest posterior probability for producing stable steady state. m4119 consists of a two QS
molecules; A1 is produced by N2, A2 is produced by N3 (Figure 4B). The QS molecules repress
the expression of self-limiting bacteriocins produced by each population. Using the minimal motifs
defined in Figure 3C, m4119 can be summarised as m4119 = 3×SL2. We subset the model space on
the counts of heterologous expression in the system, yielding subsets containing candidate models
with two, three and four expressed parts (low complexity to high complexity) (Figure 4B). Models with
two heterologously expressed parts all had a posterior probability of 0.0 and are not shown. Again,
we see a diminishing increase in posterior probability that comes with increasing complexity. m3938 is
the more complicated neighbour of m4119, the difference being N1 is also contributing with production
of A1, resulting in a fall in the posterior probability. The increase in posterior probability that occurs
when moving from m4125 to m4119 has BF < 3.0, indicating the difference between the posterior
probability of the two models is not substantial. These system comparisons highlight the trade-
off between increasing complexity and improving system performance. In a similar fashion to the
two strain model space, the top performing models are dominated by self-limiting only interactions
(Supplementary Figure 1).
Multiple engineered bacteriocins are more important than multiple orthogonal QS
systems
Our results have identified top performing models in the two and three strain model spaces. We have
also highlighted the diminishing returns that occur with increasing model complexity in top performing
models. Next we aim to summarise the importance of different parts and their contribution to the
stable steady state objective behaviour, further enabling us to triage genetic parts for construction in
the lab.
Figure 5 shows a summary of the parts used to construct three strain systems and the aver-
age posterior probabilities they yield. This gives us important information to form heuristic rules
in the design of three strain systems. Figure 5A shows the very similar posterior probability when
10
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
comparing two QS systems rather than one. Figure 5B demonstrates the substantial advantage of
repressive QS regulation of bacteriocin production over inducible systems. Figure 5C shows very
strong evidence in favour of using three bacteriocins to produce stable steady state in three strain
systems. These three statistics suggest that on average there is little advantage to be gained in the
use of two QS systems, and priority should be given to the use of a single repressive QS to regulate
three bacteriocin systems, such as we see in m4125.
Defining stable steady state population ratios in three strain systems
Synthetic communities can be applied to improving yields and efficiency via the distribution of biopro-
duction pathways between subpopulations [46, 53]. The density of a subpopulation will determine
its productive output capacity for the function it conducts. Being able to define the steady state
composition of a synthetic community is therefore an important feature. Here we demonstrate that
secondary objectives can be applied to the output of ABC SMC without further simulation time,
identifying key parameters that enable fine tuning of stable steady state population densities.
The εF3 threshold value ensures all simulations in the final population have an OD > 0.001.
Figure 6A shows the community composition distribution of m4119. The majority of accepted par-
ticles show a final community composition that is dominated by a single strain. By using the final
population distances from ABC SMC we can subset for a secondary objective and identify how
the system can be tuned to produce a more evenly distributed community composition. We set a
secondary objective, stipulating that all strains must be of OD > 0.1 (pink) (Figure 6B). Therefore
strains that do not meet the secondary objective have 0.001 < OD < 0.1 (blue) (Figure 6B). From
these two subsets we generate separate parameter distributions and calculate the divergence using
Kolmogorov-Smirnov (KS). Parameter distributions that show the greatest divergence are important
for changing the system behaviour from one that is dominated by a single strain, to one that has
a more even distribution of strain densities. The distributions of four parameters that exhibit great-
est divergence are shown in Figure 6C. A higher dilution rate (D) and lower maximal bacteriocin
expression rates (KBmax1, KBmax2, KBmax3) are associated with producing an more evenly distributed
community composition. Importantly, all three parameters are realistically tunable. The dilution rate
can be controlled directly through the chemostat device, while bacteriocin expression rates can be
changed through the choice of promoters and ribosome binding sites.
11
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
Discussion
Synthetic communities built to date have employed the use of QS, metabolic dependencies, intra-
cellular lysis proteins, toxins and extracellular AMPs to engineer interactions that enable community
formation [55, 54, 24]. Studies often incorporate mathematical modelling to inform experimental de-
sign and identify the conditions that will produce the expected population dynamic. The design of the
fundamental interactions of the system itself is often directed by mimicking ecological interactions
found in nature, or by rational judgement. As the possible types of engineered interaction increases,
so does the need for comprehensive assessment of the vast model spaces. The modelling and sta-
tistical framework demonstrated here addresses this design problem. With our examples we have
highlighted important design features and heuristic rules for building synthetic steady state commu-
nities. Comparative coculture studies have been used to identify stable partnerships in microbial
ecosystems [57, 58]. These methods could be used to validate our findings in the wet lab by con-
structing a subset of strains from the model space and comparing the ability of pairs to produce
stable steady state.
We have identified optimal system designs using bacteriocins and QS for stable steady state in
two and three strain communities. m62, the top model of the two strain model space uses a cross-
protection mutualism, whereby the density of each subpopulation inhibits the self-limitation of the
other. Cross-protection mutualism has previously been incorporated in synthetic microbial commu-
nities via the mutual degradation of externally supplied antibiotics [49]. Metabolic interdependencies
can also be employed to engineer mutualism [50, 51]. Similarly, in m4119 of the three strain model
space (Figure 4B) we see pair-wise cross-protection mutualism between two subpopulations and a
dependent subpopulation. All top performing models used SL interactions to produce stable steady
state dynamics. Self-limitation is observed in many natural biological communities, normally in a
response to stress [59, 60]. These processes, while detrimental to the individual, provide a net
benefit to the community through release of a public good - they are altruistic processes [61]. Al-
truistic cell death is conserved throughout different species implying a competitive advantages in
natural environments [62]. SL interactions have previously been employed for building synthetic co-
cultures. Scott et al. demonstrated the use of QS regulated lysis protein expression to implement a
self-limiting interactions in two strain systems, overcoming competitive exclusion in a batch culture.
12
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
The robustness of SL interactions can be explained by the feedback loops involved. Total bac-
teriocin output by a subpopulation is heavily dependent upon its population density; low population
density will naturally have a low output of bacteriocin [63], making QS a secondary level of regula-
tion. This is supported by both two and three strain scenarios where we observe the diminishing
returns that come with increasing complexity. Figure 5 shows that increasing the number of bacte-
riocin in a system yields greater increases in stability than increasing the number of QS systems.
A closed feedback loop exists between the bacteriocin expression rate and the population density.
This is an important reason why we see all SL motifs generally show positive contribution to stability.
Conversely, in OL motifs the population expressing the bacteriocin will not be negatively affected
and therefore a closed feedback-loop does not exist. It should be emphasised that while the design
rules we have identified hold true for a stable steady state objective, it may not be the case for other
objective population dynamics, such as oscillations. However, new objectives can be investigated
by changing the distance functions which describe the population dynamic.
Finally, we showed that the posterior parameter distribution from ABC SMC can be used to make
decisions on part characteristics and experimental conditions (Figure 2C and Figure 6C). Our results
show the dilution rate (D) is an important experimental parameter for producing stable steady state,
and tuning the community composition. The rate of removal of molecules from the environment can
produce very different population dynamics. This result is supported by previously conducted work
where the dilution rate has regularly been demonstrated as important for determining the popula-
tion dynamic [23, 49, 6]. We also show our methodology can identify systems that are robust to
differences in growth rate, highlighted by the comparison of m66 and m62 in Figure 2C. Together
these draw attention to important part characteristics that should be considered when constructing
a stable community. Furthermore, it should be emphasised that if the parts are already selected and
characterised, their parameters can be fixed in the prior distribution reflecting the known behaviour.
The framework we have developed offers a natural entry point to the design-build-test cycle, pro-
viding a data informed roadmap for building a robust synthetic community with a desired behaviour.
We have revealed stable steady state systems in a two and three strain model space, and generated
impactful rules and heuristics for their construction. The flexibility of this framework enables us to
quickly redefine population level behaviours depending on the required application.
13
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
Methods
Model space generator
Models are generated from a set of parts which are expressed by different strains in the system. We
represent an expression configuration through a set of options. We define the options for expres-
sion of A in each strain, where the options are not expressed, expression of A1, and expression of
A2 (0, 1 and 2). We define the options for expression of bacteriocin, which for the two strain model
space includes no expression, expression of B1 or expression of B2 (0, 1, and 2). For the three strain
model space, this includes includes no expression, expression of B1, expression of B2 or expression
of B3 (0, 1, 2 and 3 respectively). Lastly we define the mode of regulation for the bacteriocin, which
can be either induced or repressed (0 and 1). This is redundant if a bacteriocin is not expressed.
Two strain:
A = {0, 1, 2}
B = {0, 1, 2}
R = {0, 1}
Three strain:
A = {0, 1, 2}
B = {0, 1, 2, 3}
R = {0, 1}
This enables us to build possible part combinations that can be expressed by a population. Let
Pc be a family of sets, where each set is a unique combination of parts.
PC = A× B × R
Each strain in a system can be sensitive to up to one bacteriocin. Let I represent the options
for strain sensitivity. In the two strain model space, the options are insensitive, sensitive to B1 or
sensitive to B2 (0, 1, and2 respectively). In the three strain model space, where the options are in-
sensitive, sensitive to B1, sensitive to B2 or sensitive to B3 (0, 1, 2 and 3 respectively).
Two strain:
I = {0, 1, 2}
Three strain:
I = {0, 1, 2, 3}
Each strain is defined by it’s sensitivities, and expression of parts. Let PE be all unique engineered
14
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
strains:
PE = I × PC
Which can be combined to form a model, yielding unique combinations in two strains and three
strains:
Two strain:
PM = PE × PE
Three strain:
PM = PE × PE × PE
Finally, we use a series of rules to remove redundant models. A system is removed if:
1. Two or more strains are identical, concerning bacteriocin sensitivity and combination of ex-
pressed parts.
2. The QS regulating a bacteriocin is not present in the system.
3. A strain is sensitive to a bacteriocin that does not exist in the system.
4. A bacteriocin exists that no strain is sensitive to.
This cleanup yields the options which are used to generate ODE equations for system.
System equations
s in each system are rescaled to improve speed of obtaining numerical approximations.
N ′x = NxCN
B ′z = BzCB
A′y = AyCA
Each model is represented as sets of species
15
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
N = {1, 2...x}
B = {1, 2...z}
A = {1, 2...y}
Species are represented as differential equations
dNx
dt= µx(S)− Nx
B∑z=1
ω(B ′z)− NxD
dS
dt= D(S0 − S)−
N∑x=1
µxN′x
γx
dBz
dt=
N∑x=1
(kBx ,zN′x)
CB− DBz
dAy
dt=
N∑x=1
kAx ,yN′x
CA− DAy
Growth modelled by Monod’s equation for growth limiting nutrient (S)
µx(S) =µxmaxS
KX + S
Killing by bacteriocin, where ωmax = 0 if strain is insensitive
ω(Bz) = ωmaxBnωz
Knωω + Bnω
z
Induction or repression of bacteriocin expression by QS species, Ay
kB(z , y) = kBmax
A′nzy
KnzBz
+ A′nzy
kB(z , y) = kBmax
KnzBz
KnzBz
+ A′nzy
16
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
Simulations were conducted for 1000hrs, the final 100hrs were to calculate the summary statistics
and were stopped early if the population of any strain fell below 1e−10 (extinction event). Simulations
with an extinction event have distances set to maximum. This was done to prevent excessive time
spent solving ODEs of collapsed populations.
Bayesian inference
Let θ ∈ Θ be a sampled parameter vector with a prior π(θ). Given an objective of x0, where x0
exists in the solution space, x0 ∈ D. We define the likelihood function for the objective behaviour as
f (x0|θ). Bayes’ theorem gives us the posterior distribution of θ that exists for the objective x0.
π(θ|x0) =f (x0|θ)π(θ)
π(x0)
We can rewrite π(x0) where a and b represent the lower and upper bounds of the parameter value:
π(x0) =
∫ b
af (x0, θ)dθ =
∫ b
af (x0|θ)π(θ)dθ
The posterior distribution informs us of the parameter distribution that gives rise to the objective.
π(θ|x0) =f (x0|θ)π(θ)∫ b
a f (x0|θ)π(θ)dθ
Let m be a model from a vector of competing models, M , such that m ∈ M = {m0,m2...mn}. Each
model has its own parameter space, allowing us to define a joint space, (m, θ) ∈ M ×ΘM .
We can write Bayes’ theorem in the context of a model space.
π(m|x0) =f (x0|m)π(m)∫
M f (x0|m′)π(m′)dm′
17
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
Since the M is discrete, we can rewrite this as
π(m|x0) =f (x0|m)π(m)∑M f (x0|m′)π(m′)
The marginal likelihood of the model, f (x0|m), considers the joint parameter and model space
f (x0|m) =
∫ΘM
π(θ|m)f (x0|θ,m)dθ
Approximate Bayesian computation
Writing the likelihood function, f (x0|θ), in terms of summary statistics can be difficult. We bypass
this and approximate the posterior by generating data from a model. We can sample a parameter
vector from the prior, θ∗ ∼ π(θ), which is simulated to yield a data vector, x∗. This can be written as
a conditional, x∗ ∼ f (x |θ∗), which also gives the joint density, π(θ, x).
In order to obtain the posterior distribution that satisfies our objective behaviour, x0, we apply a
conditional to define whether a generated data vector, x∗ belongs to the objective x0.
If x = x0
π(θ|x , x0) =π(θ)f (x |θ)
π(θ)f (x |θ)dxdθ
Else
π(θ|x , x0) = 0
Let ρ(x , x0) be a distance function that compares a simulation to the objective. Using distance
thresholds (ε), we can define values below which the distance is acceptably small. We can redefine
π(θ|x , x0) in the context of thresholds to obtain an approximation of the posterior.
18
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
If ρ(x , x0) < ε
πε(θ|x , x0) =π(θ)f (x |θ)
π(θ)f (x |θ)dxdθ
Else
πε(θ|x , x0) = 0
The smaller ε is and the larger the number of simulations conducted, the more accurate the repre-
sentation of the true posterior will be. We can write this marginal posterior distribution as
π(θ∗|ρ(x∗, x0)) ≤ ε ≈ π(θ|x0)
Model selection with ABC SMC
The most basic ABC algorithm is the ABC rejection algorithm. Let ε be the distance threshold the
defining the necessary level of agreement between the objective, x0, and a given simulation, x∗. In
this paper we use a variant of ABC, ABC Sequential Monte Carlo (ABC SMC) [39]. Particles are
sampled from the prior distributions, each particle represents a sampled model and sampled param-
eters for that model. ABC SMC evolves particles sampled from the prior distribution through a series
of intermediate distributions. Particles sampled from an intermediate distribution are perturbed and
and given an importance weighting to define their sample probability for the next distribution. The
distance threshold (ε) is decreased between distributions, moving the acceptance criteria closer to
the objective. These features aim to improve the acceptance rate of particles while maintaining a
good approximation of the posterior distribution.
19
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
Algorithm 1: Model selection with ABC SMC
1 Set population indicator, t = 02 Set initial epsilon, εt = inf3 Set final epsilon, εF = [x , y , z ]4 Set particle indicator, i = 05 if t = 0 then6 Sample m∗ from π(m)7 Sample θ∗∗ from π(θ(m∗))
8 else if t > 0 then9 Sample particle θ∗ from previous population {θ(m∗)it−1} with weights w(m∗)t−1
10 Perturb θ∗ to obtain θ∗∗ ∼ Kt(θ|θ∗)11 if π(θ∗∗) = 0 then12 go to 513
14 Simulate, x∗ ∼ f (x |θ∗∗,m∗)15 if d(x∗, x0) > εt then16 go to 517
18 Set mit = m∗
19 Set θit = θ∗∗
20 Calculate particle weight, w it
21 if t = 0 then22 w i
t = 123 else24 w i
t = π(θ∗∗)∑Nj=1 w
jt−1Kt(θit−1|θ∗∗)
25 if i < N then26 Set i = i + 127 go to 528
29 Normalise weights for every m.30 if εt 6= εT then31 Update population number, t = t + 132 Update ε according to accepted particle distances, εt = fε()33 go to 5
20
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
The Bayes factor can be used to help us interpret how much better (or worse) one model is than the
other. Given two models, m1 and m2, the Bayes factor is calculated as:
BF =P(m1|x)/P(m2|x)
P(m1)/P(m2)
P(mi ) is the prior, and P(mi |x) is the posterior probability. Given uniform priors, P(mi ) = 1/M ,
where M is the number of models. Therefore we can simplify to:
BF =P(m1|x)
P(m2|x)
The Bayes factor is a measure the support for m2 relative to m1. It accounts for the number
of parameters, or complexity of the two models. The Bayes factor allows us to directly compare
the weight of evidence for and against the two models and has the advantage that it can be used to
compare non nested models. Two Bayes factors can be compared directly, since they both represent
evidence in favour of the hypothesis [39, 40]. We therefore use Bayes factors to directly compare
the ability of two models to represent the objective population behaviour. The following table allows
us to interpret BF .
Bayes factor (BF )value
Evidence against m2
(in favour of m1)1 to 3 Very weak
3 to 20 Positive20 to 150 Strong> 150 Very strong
Table 1: Bayes factor categorisation to describe evidence in favour of m1, compared with m2.
Software packages and simulation settings
Repository of AutoCD can be found at https://github.com/ucl-cssb/AutoCD/. The repository includes
configuration files for the two- and three- strain experiments. ABC SMC model selection algorithm
was written in python using Numpy, Pandas and Scipy. ODE simulations were conducted in C++
with a Rosenbrock 4 stepper from the Boost library. All simulations use an absolute error tolerance
21
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
of 1e−9, and relative error tolerance of 1e − 4. Non negative matrix factorisation was conducted
using Scikit Learn. Dendrograms were made from Scipy, using the unweighted pair group method
with arithmetic mean (UPGMA) clustering algorithm. Ternary diagrams were made using python
package python-ternary [64]. Parameter distribution plots were made in R using ggplot2.
Prior distributions
Prior distributions for both two and three strain systems are sampled from uniformly between the
min and max values listed below. Constant parameters have the same min and max value.
Parameter /Species
Description Prior(min)
Prior(max)
Units Citation
ParametersCN OD to cell number scaling factor 1e9 1e9 None N/ACB Microcin scaling factor 1e−9 1e−9 None N/ACA QS scaling factor 1e−9 1e−9 None N/AD Dilution rate 0.01 0.2 h−1 N/AKAyBz Half maximal QS promoter activa-
tion/repression from Ay to Bz
1e−9 1e−6 M [65]
Kx Monod’s half saturation constant 3.9e−5 3.9e−5 M [66]Kω Half saturation killing constant 1e−7 1e−6 M [67, 68]S0 Substrate concentration of input media
(0.4% glucose)0.02 0.02 M M9 media
γ E. coli substrate yield 1e11 1e11 cell M−1 [69]kAy Production rate of AHL per cell 1e−22 1e−15 M h−1 [70]KBmaxz Maximal expression rate of microcin 1e−22 1e−15 M h−1 [6]µxmax Maximum growth rate 0.4 3 h−1 [42, 43]nz Hill coefficient AHL induced expression 1 2 M [65]nω Hill coefficient for killing 1 2 M [65]ωmax Maximum rate of bacteriocin killing 0.5 2.0 M−1 h−1 [71, 67,
68]Initial species
N OD of strain 0.01 0.5 OD N/AS 0.4% glucose concentration 0.02 0.02 M N/AB Microcin concentration 1e−81 1e−81 M N/AA QS concentration 1e−10 1e−10 M N/A
Table 2: Uniform priors used to for parameter sampling and initial species values.
22
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
OutputsInputs Software
Set availableparts and limits
Set prior parameter distribution
Set desired behaviour
Model space generator
ABC SMC
O.D
600
Parameter inference
βα
γδ
ε
α
π(Ө)
β γ δ
ε ζ λη
Motif impact analysis
Community composition control
B1
A1N1 N2
Strain
N2
N1
Bacteriocin
B1
QS
A1
N1ƒ (,( N2 B1
A1
Identify high probabilitycandidates
B1
A1N1 N2
Figure 1: Overview of AutoCD pipeline. Model selection workflow begins with definition of availableparts and prior parameter distributions, used to generate system models from all the possible inter-actions. We use ABC SMC to perform model selection for the desired population behaviour. Theoutputs of ABC SMC provide us with community designs, insight into underlying motifs, parameterrequirements and information on composition tunability.
23
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
0.12
0.08
0.04
0.00
0.00
0.05
0.10
Posterior probability
Mod
els
InteractionsB2
A1
N1 N2
B1
A2
B.
BF =
5.5
BF =
4.5
BF =1.2
Increasing model complexity
N1 N2
m66
N1 N2
N1
N2
m48
N1 N2
m62m66
m48
m62
Bacteriocin QS Strains
A.
D
kB_max_1
mu_max_1
mu_max_2
D
kB_max_1
kB_max_2
mu_max_1
mu_max_2
KBmax1
D
KBmax2
�max1
�max2
�max1
�max2
D
KBmax1
D Dilution rate
KBmaxMaximal expression rate of bacteriocin
�maxMax growth rate of strain m66 m62
Key parametersC.
0.19 -0.03 0.15
-0.15
0.59
0.02
0.09 0.03 0.10 0.03
-0.34 0.33 -0.33
-0.31 0.31
0.22
Figure 2: Output of AutoCD for the two strain stable steady state objective A.) Dendrogram is gen-erated by hierarchical clustering of the adjacency matrices for each model in the two strain modelspace. All possible interactions are shown in the network illustration. Each column of the heatmaprepresents a possible interaction between species, where green indicates the interaction exists forthe model and black indicates absence of the interaction. The bar chart shows the posterior prob-ability of each model. B) Shows the models with highest posterior probability when subsetted fornumber of parts expressed, in order of increasing complexity (2, 3 and 4 parts). Bayes factors (BF)are shown for pairwise comparison of the three models and error bars show the standard deviationacross three repeat experiments. Model schematics show the interactions between strains (blue andgreen), bacteriocin (red) and QS molecules (purple). C) Posterior parameter distributions of severaltunable parameters in m66 and m62. The top and left plots show 1D density distributions of eachparameter, central distributions are 2D density distributions for each pair of parameters. Pearsoncorrelation coefficients are shown on the top side of the diagonal for each parameter pair.
24
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
B2
A1
N1 N2
B1
A2
A1
N1 N2
B1
K1
K2
K3
K4
K1
K2
K3
K4
B2
A1
N1 N2
B1
A2
A1
N1 N2
B1
A2
Component interaction weights
Models
Non-negative matrix factorisation
Post
erio
rPr
obab
ility
-1.0 0.0 1.0
A. B.
Calculate changes in posterior probability for all models in M
Where mx is a model in M:
Find nearest neighbours that are reached by adding each motif
Mod
el p
oste
rior
prob
abilit
y
Models
Other-limiting motifsSelf-limiting motifs
∆ normalised posterior probability
A1
N1 N2
B1
A1
N1 N2
B1
A1
N1 N2
B1
A1
N1 N2
B1
A1
N1
B1
A1
N1 N2
B1
A1
N1
B1
SL1
A1
N1 N2
B1
SL2
SL3
SL4
SL1
SL3
SL2
SL4
OL1
OL3
OL2
OL4
OL1
OL3
OL2
OL4
mx -
C.
D. E.
Figure 3: Contribution of network motifs to stability. A, B) Non negative matrix factorisation (NMF)analysis to learn motifs in model space using four components (K=4). A) Four components learnedby NMF, the line opacity indicates the coefficient of the interaction. B) Visualisation of the componentweights for each model. Each column is a model, with light colours corresponding to high weight anddark colours low weight. C, D, E) Manually curated minimal motifs capture interaction importance.C) Minimal motifs split into self-limiting (SL) and other-limiting (OL) by the direction of bacteriocinkilling. D) Illustration of the algorithm used to generate each datapoint in E). Moving from a model tothe nearest neighbours that can be built by adding a motif will produce a change in model posteriorprobability. E) Boxplots and scatter plot showing the change in posterior probability when addingeach motif to a model. The boxplots show the median, first and third quartile. The lower and upperwhiskers mark the 5th and 95th percentiles respectively.
25
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
Mod
els
B.
BF =
6.7
BF =
7.4
BF = 1.1 BF = 1.1
N1
N2 N3
N1
N2 N3
N1
N2 N3
i) m4128 ii) m4125 iii) m4119 iv) m3938
N1
N2 N3
m4128
m4128
m4124m4119m3938
{
i) ii)
iii) iv)
Posterior probability10
-1110
-8
10-2
10-5
A.
Interactions
0.000
0.005
0.010
0.015
A2A1
N1
B2B1
N2N3
Bacteriocin
Strains
QS
Model space parts:
B3
Figure 4: Output of AutoCD for three strain stable steady state objective A.) Dendrogram is gener-ated by hierarchical clustering of the adjacency matrices for each model in the three strain modelspace. We set the limit of number of levels to 5, in order to show high level groups. Each columnof the heatmap represents a possible interaction between species, where green indicates the inter-action exists for the model and black indicates absence of the interaction. The posterior probabilityplot shows the average posterior probability within each group of models. B) Shows the models withhighest posterior probability when subsetted for number of parts expressed, in order of increasingcomplexity (3, 4, 5 and 6 expressed parts). Bayes factors (BF) are shown for pairwise comparisonof the three models and error bars show the standard deviation between three repeat experiments.Model schematics show the interactions between strains (blue, green and red), bacteriocin (red) andQS molecules (purple). Models with two parts showed posterior probability 0.0 and are not shown.
26
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
Mea
n po
ster
ior p
roba
bilit
y
One QS Two QS One B Two B Three B0.0
0.001
0.0005
+ve - ve +ve-ve
A B. C.
B2B1 B1 B2B1
B3
A2A1A1
Number of QS Modes of regulation Number of bacteriocin
BF =
15.
8
BF =
3.4
BF =
1.8
Figure 5: Summary of parts and by the mean posterior probabilities of subsets from three strainmodel space. A) Comparing systems with one and two orthogonal QS parts. B) Comparing modesof bacteriocin regulation in the system by subsetting for systems with induction (+ve), repression(-ve) or both (+ve, -ve). C) Comparing systems with one, two and three bacteriocin.
D
-50 -35
0.0 0.6 -50 -35
-50 -35
0.25
0
5.0
0.0
log( KBmax1 )
log( KBmax2 ) log( KBmax3 )
0.25
0
0.25
0Low
Particle density
High
N1 N2
N3
N1 N2
N3All strains
All strainsAny strain
OD > 0.001OD < 0.1
OD > 0.1
A. B. C.
Figure 6: Distribution of population densities in Model 4119. Axes of ternary diagrams (A and B)show percentage composition of the community. A) Heat map showing the community compositionat stable steady state in Model 4119. B) Scatter plot of the stable steady state systems, highlightingthe secondary objective where all strains have OD > 0.1 (red), and the primary objective only whereany strain has 0.001 < OD < 0.1. C) Density plots comparing the parameter distributions of fourparameters that show the greatest divergence to produce the secondary objective: Dilution rate (D)and maximal bacteriocin expression rates (KBmax1, KBmax2, KBmax3). The Kolmogorov-Smirnovvalues for the two objectives are 0.25, 0.12, 0.13 and 0.12 respectively.
27
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
1 References
[1] Libertad Pantoja-Hernández and Juan Carlos Martínez-García. Retroactivity in the Context of
Modularly Structured Biomolecular Systems. Frontiers in Bioengineering and Biotechnology,
3(85), jun 2015.
[2] Shridhar Jayanthi and Domitilla Del Vecchio. Retroactivity Attenuation in Bio-Molecular Sys-
tems Based on Timescale Separation. IEEE Transactions on Automatic Control, 56(4):748–
761, apr 2011.
[3] Andras Gyorgy, José I. Jiménez, John Yazbek, Hsin Ho Huang, Hattie Chung, Ron Weiss,
and Domitilla Del Vecchio. Isocost Lines Describe the Cellular Economy of Genetic Circuits.
Biophysical Journal, 109(3):639–646, aug 2015.
[4] D Summers. The kinetics of plasmid loss. Trends in Biotechnology, 9(1):273–278, jan 1991.
[5] Deepak Mishra, Phillip M Rivera, Allen Lin, Domitilla Del Vecchio, and Ron Weiss. A load driver
device for engineering modularity in biological networks. Nature Biotechnology, 2014.
[6] Andrea Y Weiße, Diego A Oyarzún, Vincent Danos, and Peter S Swain. Mechanistic links be-
tween cellular trade-offs, gene expression, and growth. Proceedings of the National Academy
of Sciences of the United States of America, 112(9):E1038–1047, 2015.
[7] Katie Brenner, Lingchong You, and Frances H. Arnold. Engineering microbial consortia: a new
frontier in synthetic biology. Trends in Biotechnology, 26(9):483–489, 2008.
[8] Theodore A. Kennedy, Shahid Naeem, Katherine M. Howe, Johannes M. H. Knops, David
Tilman, and Peter Reich. Biodiversity as a barrier to ecological invasion. Nature,
417(6889):636–638, jun 2002.
[9] Doruk Beyter, Pei-Zhong Tang, Scott Becker, Tony Hoang, Damla Bilgin, Yan Wei Lim, Todd C
Peterson, Stephen Mayfield, Farzad Haerizadeh, Jonathan B Shurin, Vineet Bafna, and Robert
McBride. Diversity, Productivity, and Stability of an Industrial Microbial Ecosystem. Applied and
environmental microbiology, 82(8):2494–2505, apr 2016.
28
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
[10] G. J. Butler and G. S. K. Wolkowicz. A Mathematical Model of the Chemostat with a Gen-
eral Class of Functions Describing Nutrient Uptake. SIAM Journal on Applied Mathematics,
45(1):138–151, feb 1985.
[11] Kevin R. Foster and Thomas Bell. Competition, Not Cooperation, Dominates Interactions
among Culturable Microbial Species. Current Biology, 22(19):1845–1850, oct 2012.
[12] Michael E. Hibbing, Clay Fuqua, Matthew R. Parsek, and S. Brook Peterson. Bacterial compe-
tition: Surviving and thriving in the microbial jungle, jan 2010.
[13] Shiri Freilich, Raphy Zarecki, Omer Eilam, Ella Shtifman Segal, Christopher S. Henry, Martin
Kupiec, Uri Gophna, Roded Sharan, and Eytan Ruppin. Competitive and cooperative metabolic
interactions in bacterial communities. Nature Communications, 2(1), 2011.
[14] Aleksej Zelezniak, Sergej Andrejev, Olga Ponomarova, Daniel R. Mende, Peer Bork, and Ki-
ran Raosaheb Patil. Metabolic dependencies drive species co-occurrence in diverse microbial
communities. Proceedings of the National Academy of Sciences, 112(20):6449–6454, may
2015.
[15] Alexander May, Shrinath Narayanan, Joe Alcock, Arvind Varsani, Carlo Maley, and Athena
Aktipis. Kombucha: a novel model system for cooperation and conflict in a complex multi-
species microbial ecosystem. PeerJ, 7(9):e7565, sep 2019.
[16] T. L. Czaran, R. F. Hoekstra, and L. Pagie. Chemical warfare between microbes promotes
biodiversity. Proceedings of the National Academy of Sciences, 99(2):786–790, jan 2002.
[17] Melissa B. Miller and Bonnie L. Bassler. Quorum Sensing in Bacteria. Annual Review of
Microbiology, 55(1):165–199, oct 2001.
[18] Christina V. Dinh, Xingyu Chen, and Kristala L. J. Prather. Development of a Quorum-Sensing
Based Circuit for Control of Coculture Population Composition in a Naringenin Production Sys-
tem. ACS Synthetic Biology, page acssynbio.9b00451, feb 2020.
[19] Kristina Stephens, Maria Pozo, Chen-Yu Tsao, Pricila Hauk, and William E. Bentley. Bacte-
rial co-culture with cell signaling translator and growth controller modules for autonomously
regulated culture composition. Nature Communications, 10(1):4129, dec 2019.
29
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
[20] Feng Liu, Junwen Mao, Ting Lu, and Qiang Hua. Synthetic, Context-Dependent Microbial
Consortium of Predator and Prey. ACS Synthetic Biology, 8(8):1713–1722, aug 2019.
[21] Apoorv Gupta, Irene M Brockman Reizman, Christopher R Reisch, and Kristala L J Prather.
Dynamic regulation of metabolic flux in engineered bacteria using a pathway-independent
quorum-sensing circuit. Nature Biotechnology, 35(3):273–279, mar 2017.
[22] Spencer R. Scott and Jeff Hasty. Quorum Sensing Communication Modules for Microbial Con-
sortia. ACS Synthetic Biology, 5(9):969–977, sep 2016.
[23] Frederick K Balagaddé, Hao Song, Jun Ozaki, Cynthia H Collins, Matthew Barnet, Frances H
Arnold, Stephen R Quake, and Lingchong You. A synthetic Escherichia coli predator-prey
ecosystem. Molecular systems biology, 4(187):187, 2008.
[24] Wentao Kong, David R. Meldgin, James J. Collins, and Ting Lu. Designing microbial consortia
with defined social interactions. Nature Chemical Biology, 14(8):821–829, aug 2018.
[25] Sylvie Rebuffat. Microcins. In Handbook of Biologically Active Peptides, pages 129–137. Else-
vier, 2013.
[26] Kathryn Geldart, Brittany Forkus, Evelyn McChesney, Madeline McCue, and Yiannis Kaznes-
sis. pMPES: A Modular Peptide Expression System for the Delivery of Antimicrobial Peptides
to the Site of Gastrointestinal Infections Using Probiotics. Pharmaceuticals, 9(4):60, oct 2016.
[27] Alex J.H. Fedorec, Tanel Ozdemir, Anjali Doshi, Yan-Kay Ho, Luca Rosa, Jack Rutter, Os-
car Velazquez, Vitor B. Pinheiro, Tal Danino, and Chris P. Barnes. Two New Plasmid Post-
segregational Killing Mechanisms for the Implementation of Synthetic Gene Networks in Es-
cherichia coli. iScience, 14:323–334, apr 2019.
[28] Alex J H Fedorec, Behzad D Karkaria, Michael Sulu, and Chris P Barnes. Killing in response
to competition stabilises synthetic microbial consortia. bioRxiv, pages 1–20, dec 2019.
[29] Lutz Becks, Frank M. Hilker, Horst Malchow, Klaus Jürgens, and Hartmut Arndt. Experimental
demonstration of chaos in a microbial food web. Nature, 435(7046):1226–1229, jun 2005.
[30] Jonathan Friedman and Jeff Gore. Ecological systems biology: The dynamics of interacting
populations, feb 2017.
30
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
[31] James T. MacDonald, Chris Barnes, Richard I. Kitney, Paul S. Freemont, and Guy-Bart V. Stan.
Computational design approaches and tools for synthetic biology. Integrative Biology, 3(2):97,
feb 2011.
[32] Paul Kirk, Thomas Thorne, and Michael P.H. Stumpf. Model selection in systems and synthetic
biology, aug 2013.
[33] Chris P Barnes, Daniel Silk, Xia Sheng, and Michael P H Stumpf. Bayesian design of synthetic
biological systems. Proceedings of the National Academy of Sciences of the United States of
America, 108(37):15190–5, sep 2011.
[34] Mae L. Woods, Miriam Leon, Ruben Perez-Carrasco, and Chris P. Barnes. A Statistical Ap-
proach Reveals Designs for the Most Robust Stochastic Gene Oscillators. ACS Synthetic Biol-
ogy, 5(6):459–470, jun 2016.
[35] Miriam Leon, Mae L. Woods, Alex J. H. Fedorec, and Chris P. Barnes. A computational method
for the investigation of multistable systems and its application to genetic switches. BMC Sys-
tems Biology, 10(1):130, dec 2016.
[36] Jing Wui Yeoh, Kai Boon Ivan Ng, Ai Ying Teh, Jing Yun Zhang, Wai Kit David Chee, and
Chueh Loo Poh. An Automated Biomodel Selection System (BMSS) for Gene Circuit Designs.
ACS Synthetic Biology, 2019.
[37] Jacob Beal, Ron Weiss, Douglas Densmore, Aaron Adler, Evan Appleton, Jonathan Babb,
Swapnil Bhatia, Noah Davidsohn, Traci Haddock, Joseph Loyall, Richard Schantz, Viktor
Vasilev, and Fusun Yaman. An end-to-End workflow for engineering of biological networks
from high-level specifications. ACS Synthetic Biology, 1(8):317–331, aug 2012.
[38] Guillermo Rodrigo and Alfonso Jaramillo. AutoBioCAD: Full biodesign automation of genetic
circuits. ACS Synthetic Biology, 2(5):230–236, may 2013.
[39] Tina Toni, David Welch, Natalja Strelkowa, Andreas Ipsen, and Michael P H Stumpf. Approxi-
mate Bayesian computation scheme for parameter inference and model selection in dynamical
systems. Journal of the Royal Society, Interface, 6(31):187–202, feb 2009.
31
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
[40] Robert E. Kass and Adrian E. Raftery. Bayes Factors. Journal of the American Statistical
Association, 90(430):773–795, jun 1995.
[41] Howard M. Salis, Ethan A. Mirsky, and Christopher A. Voigt. Automated design of synthetic
ribosome binding sites to control protein expression. Nature Biotechnology, 2009.
[42] Karoline Marisch, Karl Bayer, Theresa Scharl, Juergen Mairhofer, Peter M. Krempl, Karin
Hummel, Ebrahim Razzazi-Fazeli, and Gerald Striedner. A Comparative Analysis of Indus-
trial Escherichia coli K-12 and B Strains in High-Glucose Batch Cultivations on Process-,
Transcriptome- and Proteome Level. PLoS ONE, 2013.
[43] Neythen J. Treloar, Alex J. H. Fedorec, Brian Ingalls, and Chris P. Barnes. Deep reinforcement
learning for the control of microbial co-cultures in bioreactors. PLOS Computational Biology,
16(4):e1007783, apr 2020.
[44] Daniel D. Lee and H. Sebastian Seung. Learning the parts of objects by non-negative matrix
factorization. Nature, 401(6755):788–791, oct 1999.
[45] Alissa Kerner, Jihyang Park, Audra Williams, and Xiaoxia Nina Lin. A Programmable Es-
cherichia coli Consortium via Tunable Symbiosis. PLoS ONE, 7(3):e34032, mar 2012.
[46] Kang Zhou, Kangjian Qiao, Steven Edgar, and Gregory Stephanopoulos. Distributing a
metabolic pathway among a microbial consortium enhances production of natural products.
Nature biotechnology, 33(4):377–83, apr 2015.
[47] Wenying Shou, Sri Ram, and J. M. G. Vilar. Synthetic cooperation in engineered yeast popula-
tions. Proceedings of the National Academy of Sciences, 104(6):1877–1882, feb 2007.
[48] Samay Pande, Holger Merker, Katrin Bohl, Michael Reichelt, Stefan Schuster, Luís F
de Figueiredo, Christoph Kaleta, and Christian Kost. Fitness and stability of obligate cross-
feeding interactions that emerge upon gene loss in bacteria. The ISME Journal, 8(5):953–962,
may 2014.
[49] Eugene Anatoly Yurtsev, Arolyn Conwill, and Jeff Gore. Oscillatory dynamics in a bacterial
cross-protection mutualism. Proceedings of the National Academy of Sciences, 113(22):6236–
6241, may 2016.
32
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
[50] Kazufumi Hosoda, Shingo Suzuki, Yoshinori Yamauchi, Yasunori Shiroguchi, Akiko Kashiwagi,
Naoaki Ono, Kotaro Mori, and Tetsuya Yomo. Cooperative Adaptation to Establishment of a
Synthetic Bacterial Mutualism. PLoS ONE, 6(2):e17105, feb 2011.
[51] Xiaolin Zhang and Jennifer L. Reed. Adaptive Evolution of Synthetic Cooperating Communities
Improves Growth Performance. PLoS ONE, 9(10):e108297, oct 2014.
[52] Ye Chen, Jae Kyoung Kim, Andrew J. Hirning, K. Josi, and Matthew R. Bennett. Emergent
genetic oscillations in a synthetic microbial consortium. Science, 349(6251):986–989, aug
2015.
[53] Hans C Bernstein, Steven D Paulson, and Ross P Carlson. Synthetic Escherichia coli consortia
engineered for syntrophy demonstrate enhanced biomass productivity. Journal of biotechnol-
ogy, 157(1):159–66, jan 2012.
[54] Spencer R. Scott, M. Omar Din, Philip Bittihn, Liyang Xiong, Lev S. Tsimring, and Jeff Hasty. A
stabilized microbial ecosystem of self-limiting bacteria using synthetic quorum-regulated lysis.
Nature Microbiology, 2(8):17083, aug 2017.
[55] Marika Ziesack, Travis Gibson, John K. W. Oliver, Andrew M. Shumaker, Bryan B. Hsu, David T.
Riglar, Tobias W. Giessen, Nicholas V. DiBenedetto, Lynn Bry, Jeffrey C. Way, Pamela A. Silver,
and Georg K. Gerber. Engineered Interspecies Amino Acid Cross-Feeding Increases Popula-
tion Evenness in a Synthetic Bacterial Consortium. mSystems, 4(4):e00352–19, aug 2019.
[56] Michael J. Liao, M. Omar Din, Lev Tsimring, and Jeff Hasty. Rock-paper-scissors: Engineered
population dynamics increase genetic stability. Science, 365(6457):1045–1049, sep 2019.
[57] Jared Kehe, Anthony Kulesa, Anthony Ortiz, Cheri M. Ackerman, Sri Gowtham Thakku, Daniel
Sellers, Seppe Kuehn, Jeff Gore, Jonathan Friedman, and Paul C. Blainey. Massively par-
allel screening of synthetic microbial communities. Proceedings of the National Academy of
Sciences, 116(26):12804–12809, jun 2019.
[58] Ryan H. Hsu, Ryan L. Clark, Jin Wen Tan, John C. Ahn, Sonali Gupta, Philip A. Romero, and
Ophelia S. Venturelli. Microbial Interaction Network Inference in Microfluidic Droplets. Cell
Systems, 9(3):229–242.e4, sep 2019.
33
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
[59] Carlos Carmona-Fontaine and Joao B. Xavier. Altruistic cell death and collective drug resis-
tance, 2012.
[60] Yu Tanouchi, Anand Pai, Nicolas E. Buchler, and Lingchong You. Programming stress-induced
altruistic death in engineered bacteria. Molecular Systems Biology, 2012.
[61] Martin Ackermann, Bärbel Stecher, Nikki E. Freed, Pascal Songhet, Wolf Dietrich Hardt, and
Michael Doebeli. Self-destructive cooperation mediated by phenotypic noise. Nature, 2008.
[62] Gwyn T. Williams. Programmed cell death: a fundamental protective response to pathogens.
Trends in Microbiology, 1994.
[63] Hilje M. Doekes, Rob J. De Boer, and Rutger Hermsen. Toxin production spontaneously be-
comes regulated by local cell density in evolving bacterial populations. PLoS Computational
Biology, 15(8):e1007333, 2019.
[64] Marc Harper, Bryan Weinstein, Tgwoodcock, Cory Simon, Chebee7i, Wiley Morgan, Vince
Knight, Nick Swanson-Hysell, Matthew Evans, Jl-bernal, ZGainsforth, The Gitter Badger, Sax-
onAnglo, Maximiliano Greco, and Guido Zuidhof. python-ternary: Ternary Plots in Python.
Zenodo 10.5281/zenodo.594435, apr 2019.
[65] Nicolas Kylilis, Zoltan A. Tuza, Guy Bart Stan, and Karen M. Polizzi. Tools for engineering
coordinated system behaviour in synthetic microbial consortia. Nature Communications, 9(1),
dec 2018.
[66] Heinrich Senn, Urs Lendenmann, Mario Snozzi, Geoffrey Hamer, and Thomas Egli. The growth
of Escherichia coli in glucose-limited chemostat cultures: a re-examination of the kinetics. BBA
- General Subjects, 1994.
[67] Delphine Destoumieux-Garzón, Sophie Duquesne, Jean Peduzzi, Christophe Goulard, Michel
Desmadril, Lucienne Letellier, Sylvie Rebuffat, and Pascale Boulanger. The iron-siderophore
transporter FhuA is the receptor for the antimicrobial peptide microcin J25: Role of the microcin
Val11-Pro16 β-hairpin region in the recognition mechanism. Biochemical Journal, 2005.
34
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint
[68] Kamaljit Kaur, Oxana Tarassova, Ramana Venkata Dangeti, Sarfuddin Azmi, David Wishart,
Lynn McMullen, and Michael Stiles. Characterization of a highly potent antimicrobial peptide
microcin N from uropathogenic Escherichia coli, jun 2016.
[69] K. B. Andersen and K. Von Meyenburg. Are growth rates of Escherichia coli in batch cultures
limited by respiration? Journal of Bacteriology, 1980.
[70] Mattia Marenda, Marina Zanardo, Antonio Trovato, Flavio Seno, and Andrea Squartini. Model-
ing quorum sensing trade-offs between bacterial cell density and system extension from open
boundaries. Scientific Reports, 2016.
[71] Delphine Destoumieux-Garzón, Xavier Thomas, Mónica Santamaria, Christophe Goulard,
Michel Barthélémy, Bénédicte Boscher, Yannick Bessin, Gérard Molle, Anne Marie Pons, Lu-
cienne Letellier, Jean Peduzzi, and Sylvie Rebuffat. Microcin E492 antibacterial activity: Evi-
dence for a TonB-dependent inner membrane permeabilization on Escherichia coli. Molecular
Microbiology, 2003.
35
.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint