Automated design of synthetic microbial communities · 6/30/2020 · Automated design of synthetic microbial communities Behzad D. Karkaria, 1Alex J. H. Fedorec, and Chris P. Barnes1,2

Automated design of synthetic microbial communities

Behzad D. Karkaria,1 Alex J. H. Fedorec,1 and Chris P. Barnes1,2∗

1Department of Cell & Developmental Biology, University College London, London, WC1E 6BT, UK

2UCL Genetics Institute, University College London, London, WC1E 6BT, UK

∗To whom correspondence should be addressed;

E-mail: [email protected]

Abstract

In naturally occurring microbial systems, species rarely exist in isolation. There is strong eco-

logical evidence for a positive relationship between species diversity and the functional output

of communities. The pervasiveness of these communities in nature highlights that there may be

advantages for engineered strains to exist in cocultures as well. Building synthetic microbial com-

munities allows us to create distributed systems that mitigates issues often found in engineering

a monoculture, especially when functional complexity is increasing. Here, we demonstrate a

methodology for designing robust synthetic communities that use quorum sensing to control

amensal bacteriocin interactions in a chemostat environment. We explore model spaces for two

and three strain systems, using Bayesian methods to perform model selection, and identify the

most robust candidates for producing stable steady state communities. Our findings highlight

important interaction motifs that provide stability, and identify requirements for selecting genetic

parts and tuning the community composition.

Introduction

Traditionally, in biotechnology and synthetic biology, a microbe is engineered and grown as a mono-

culture to perform a particular function. Novel functionality is imparted by introducing heterologous

genetic processes that would not normally be found in the organism. Unintended interactions be-

tween the introduced heterologous processes can cause the engineered function to behave in an

1

.CC-BY-NC 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprintthis version posted July 1, 2020. . https://doi.org/10.1101/2020.06.30.180281doi: bioRxiv preprint

https://doi.org/10.1101/2020.06.30.180281

http://creativecommons.org/licenses/by-nc/4.0/

unintended manner [1, 2, 3]. Metabolic burden imposed by heterologous expression can significantly

slow growth rates and encourage selection of mutants [4]. The limited cellular resource availability

can cause circuits to behave differently when expressed alongside one another [5, 6]. Opportunities

for unintended interactions will increase exponentially with the number of heterologous processes

introduced into a host strain. These limitations are potentially circumvented with the use of microbial

communities, which can be used to implement distributed systems. A distributed system would al-

low us to allocate functional components between subpopulations of cells, creating physical barriers

that insulate processes from one another and distributing the burden of heterologous expression [7].

This allows us to scale complexity in a manner that could not be achieved under the limitations of a

monoculture. In natural environments, we observe mixed-species microbial communities that exhibit

competitive advantages over monocultures in productivity, resource efficiency, metabolic complex-

ity and resistance to invasion [8, 9]. Being able to predictably and reproducibly construct microbial

communities for synthetic biology or biotechnology applications would allow us to harness these

advantages.

While the potential advantages of building distributed systems for synthetic biology applications

are clear, the maintenance and control of microbial communities comes with its own challenges. The

competitive exclusion principle states that when multiple populations compete for a single limiting

resource (in the absence of other interactions), a single population with the highest fitness will drive

the others to extinction [10]. Evidence from natural microbial systems and ecological studies have

shown us that stability can arise through feedback between subpopulations. Both cooperative and

competitive interactions are important for integrating feedback that can stabilise communities by

manipulating growth or fitness of the subpopulations [11, 12, 13, 14, 15, 16].

Communication between members of a microbial community is important for the dynamic regula-

tion of its behaviour. Quorum sensing (QS) systems are a key set of tools that enable us to engineer

communication between and within subpopulations of a community. In natural systems, they can be

used to regulate genetic processes as a function of population density [17]. QS systems consist of

a biosynthetic module that produces small, freely diffusible molecules. These molecules bind reg-

ulatory proteins that can activate or repress gene expression at specific promoters [17]. Synthetic

microbial communities have been built using QS to regulate processes that manipulate the growth

rate or fitness of a population. Fitness can be manipulated by the expression of lysis proteins,

2


https://doi.org/10.1101/2020.06.30.180281


metabolic enzymes, toxins and anti-microbial peptides (AMPs) [14, 18, 19, 20, 21, 22, 23, 24]. In

this study we focus on exploring the use of bacteriocins to manipulate subpopulation growth rates.

Bacteriocins are gene-encoded AMPs that can be used to directly suppress the growth rate of a

sensitive population [25]. They are exported into the extracellular environment, and generally use

“Trojan horse” strategies to enter and kill sensitive strains. One of the most well studied bacteriocins

is microcin V (MccV), a broad spectrum class IIa bacteriocin [25]. It is exported by the cell, and

takes advantage of iron-binding receptors displayed extracellularly to enter a sensitive cell, having

an amensal effect on the target [25]. Expression of immunity genes provide protection against the

bacteriocin and these can be expressed separately or in conjunction with the bacteriocin [26]. Bac-

teriocins present themselves as a versatile actuator of population growth. Their expression by a

single strain can affect any number of subpopulations in a synthetic community. This is in contrast to

lysis proteins and intracellular toxins which affect individual cells. Previously, we have demonstrated

the use of MccV expression to improve plasmid maintenance in a population [27] and for building

stable cocultures that overcome competitive exclusion [28]. Other bacteriocins, such as nisin, have

also been used to produce stable communities [24].

Microbial communities exhibit diverse temporal dynamics including stable steady state, oscil-

lations and deterministic chaos [29, 30]. An important part of designing synthetic communities is

being able to define the temporal dynamics that suit the intended application. In this work we set an

objective of producing robust stable steady state communities in a chemostat environment. Stable

communities allow multiple subpopulations to contribute in a specialised manner to an aggregate

community function. We envisage that through the expression of QS and bacteriocins we can build

’plug-and-play’ stabilising systems. These consist of a set of genetic parts that when expressed

by the appropriate subpopulations will produce a stable steady state community. The design chal-

lenge lies in identifying the most robust combinations of QS and bacteriocin expression, giving us

the highest probability of producing stable steady state in a chemostat environment. Given a small

set of genetic parts, the number of ways they can be combined to produce a candidate system can

become overwhelming; system design by intuition alone becomes increasingly challenging when

dealing with multi-level interactions. Predicting how a system will behave before implementation is

essential for the efficient use of lab resources and fully understanding the interactions that occur

[31]. Models allow us to make data driven decisions concerning wet lab implementation. Model

3


https://doi.org/10.1101/2020.06.30.180281


selection is a process of identifying good performing models from a selection of candidates [32].

Approximate Bayesian computation with sequential Monte Carlo sampling (ABC SMC) is a model

selection and parameterisation method that has been applied to synthetic biology systems [33]. We

have previously used ABC SMC to identify robust genetic oscillators in two and three-gene networks

[34], and to identify parameter regions that give rise to multistable genetic switches [35]. ABC SMC

also allows us to derive design principles that are key to producing the desired behaviour [34, 35].

Similarly, Yeoh et al. developed software that compared the ability of genetic parts to produce logic

gate behaviours, performing model selection using Akaike information criterion (AIC) [36]. Work-

flows have been developed to design regulatory networks from databases of characterised parts

with defined behaviours [37, 38]. Automated circuit design has the potential to greatly improve the

engineering process in synthetic biology.

Here, we present the first instance of automated synthetic community design. Our workflow

automatically generates candidate systems from a set of parts which can be used to engineer a

community. We use ABC SMC to perform model selection, identifying candidate systems that have

the highest probability of producing stable communities in a chemostat bioreactor. Using these meth-

ods we reveal the optimal designs for two- and three-strain systems. This workflow also allows us to

derive fundamental design principles for building stable communities and reveal critical parameters

to control the community composition.

Results

Automated synthetic microbial Community Designer (AutoCD) workflow

Figure 1 illustrates AutoCD, the workflow developed and applied in this study. First, we set the avail-

able parts which can be used to build a stabilising system in a chemostat environment. This consists

of the number of strains (N), bacteriocins (B), and QS systems (A). Any A in a system can regulate

the expression of any B in the system by induction or repression. Strains in all models are depen-

dent upon a single nutrient resource (S), which is consumed by strains and replenished through

dilution of the chemostat with fresh media. Importantly, all models therefore include nutrient based

competition between subpopulations. Uniform prior distributions are set, describing each part and

their interactions with one another (Table 2). The uniform priors we use span the expected ranges

4


https://doi.org/10.1101/2020.06.30.180281


these parameters could take. Importantly, in scenarios where the particular parts have already been

selected and characterised, the prior parameters can be constrained. The available parts and prior

parameter distributions serve as inputs to the Model Space Generator, which conducts a series

of combinatorial steps to produce all possible genetic circuits. The Model Space Generator then

builds unique combinations of strains expressing different genetic circuits, where each combination

is a candidate stabilising system. Filtering steps remove inviable, redundant and mirror systems,

yielding a set of unique candidates to be assessed (Methods). The Model Space Generator pro-

duces an ordinary differential equation (ODE) model for each system in the context of the chemostat

environment, and these models form our prior model space.

The final required input is a mathematical description of the objective population behaviour, a

stable steady state. We use three distance functions (d1, d2, d3) to describe how far away a simu-

lation is from the objective stable steady state (Eq 1). d1 is the final gradient of Nx , capturing the

most fundamental characteristic of stable steady state where the population level of a strain is un-

changing. d2 is the standard deviation of a population, chosen to detect unstable behaviours such

as oscillations, favouring simulations that reach stable steady state quickly. d3 is the reciprocal of the

strain population at the end of the simulation, allowing us to define a minimum population density.

Given the three distances, εF defines thresholds below which a simulation meets the requirements

of our stable steady state objective. The distances of all strain populations in a simulation must be

below these thresholds to satisfy the objective behaviour. εF1 was chosen to match the error toler-

ance of the ODE solver and εF2 threshold was chosen through qualitative assessment of simulation

data to define a practical threshold for what stable steady state simulations should look like. εF3 is

set to ensure all populations have a minimum final OD of 0.001. The posterior distribution is made

up of simulations where the distances for each strain population are less than the εF thresholds

(Eq 2). ABC SMC performs model selection on the model space for the objective defined by these

distance functions and εF . A particle is a sampled model and sampled parameters. ABC SMC ini-

tially samples particles from the prior distributions with an unbounded distance threshold. Particles

are propagated through intermediate distributions, gradually reducing the distance thresholds until

they equal εF (Methods). ABC SMC provides an estimation of model and parameter space posterior

probabilities for the given prior distributions and the objective behaviour. The output of ABC SMC is

an approximation of the posterior distribution which we can use to help us design synthetic commu-

5


https://doi.org/10.1101/2020.06.30.180281


nities and chemostat settings in the lab.

Distance functions:

d1(Nx) = |∆Nx(t − 1)|

d2(Nx) = σ(Nx)

d3(Nx) =1

Nx(t − 1)

(1)

Distance thresholds

εF = {1e−9, 0.001, 1000}

d1 < εF1

d2 < εF2

d3 < εF3

(2)

Designing two strain cocultures that achieve steady state

Here we apply AutoCD to the design of a stable steady state coculture containing two strains. In Fig-

ure 2 we define a model space consisting of two strains (N1, N2), two bacteriocins (B1, B2) and two

QS systems (A1, A2). We set model space limits to enable feasible experimental implementation,

allowing expression of up to one QS per strain and expression of up to one bacteriocin per strain.

Each strain can be sensitive to up to one bacteriocin. Given these conditions, the Model Space Gen-

erator yields 69 unique two strain models (m0,m1...m68). These 69 models serve as a uniform prior

model space upon which we perform model selection using ABC SMC. From the available genetic

parts, there are 17 possible interaction options that could exist between species in each candidate

model. We perform hierarchical clustering on the interactions present in each model, grouping mod-

els based on the similarity of their interactions. This clustering is visualised as a dendrogram in

(Figure 2A). ABC SMC approximates the posterior probability of each model for the stable steady

state objective, indicating how effective the candidate system is in producing stable steady state.

m62 has the highest posterior probability, and is therefore the system which most robustly produces

stable steady state (Figure 2A). m62 consists of two strains exhibiting a cross-protection mutualism

relationship [30]. Each strain expresses an orthogonal QS molecule that represses the expression of

a self-limiting bacteriocin in the opposing strain (Figure 2B). This is an interdependence between the

two strains where the extinction of one strain would result in the extinction of the other. The exclusion

of one strain reduces the fitness of the other, creating a closed feedback loop which overcomes the

competitive exclusion principle.

When designing new systems, minimising the number of genetic parts will reduce the number

of experimental variables, improving the ease of construction and optimisation of a system. We

6


https://doi.org/10.1101/2020.06.30.180281


subset the model space by the number of expressed species in the system (maximum two QS and

two bacteriocin), yielding subsets containing candidate models with two, three and four expressed

species (low complexity to high complexity). We identify the candidates with the highest posterior

probability in each subset (Figure 2B). We see the posterior probability increasing despite the larger

parameter spaces, which is important because ABC SMC will naturally favour models which yield

stable steady state with the smallest possible number of parameters (Occam’s razor) [39]. We

see that all three models have self-limiting motifs, where a strain is sensitive to the bacteriocin

it produces. All three models are devoid of other-limiting motifs, where a strain is sensitive to a

bacteriocin produced by another strain.

The Bayes Factor (BF ) is a ratio between the marginal likelihoods of two models, giving a quan-

tification of support for one model compared with another. BF > 3.0 indicates evidence of a notable

difference between the two models, while BF < 3.0 suggests insubstantial evidence [40] (Table 1).

The BF of m66 compared with m48 suggests substantial improvement in the posterior probability

can be made by increasing complexity. However, the BF of m48 compared with m62 suggests insub-

stantial evidence behind this improvement in posterior probability. These diminishing returns when

increasing system complexity hold important ramifications for system design. The introduction of an

additional QS part to move from m48 to m62 may not be efficient for the minor improvement in steady

state robustness.

Model selection has identified the best performing designs for producing stable communities.

However, the parts used in the design may require specific characteristics or chemostat experi-

mental setup. ABC SMC also produces posterior parameter distributions for each model, giving us

information about the parameter values necessary to yield stable steady state. Figure 2C shows the

posterior distributions of several tunable parameters in m66 and m62, showing the parameter values

that give rise to stable communities. The dilution rate of the chemostat (D) is a directly tunable

parameter and the maximal expression rate of the bacteriocin (KBmax ) can be tuned through choice

of promoter and ribosome binding-site [41]. The growth rates (µmax ) can be tuned through choice of

base strains or auxotrophic dependencies [42, 43]. The lower model posterior probability of m66 is

reflected in the parameter density distributions, which are generally more constrained that those of

m62. The distributions of D in both systems show a lower dilution rate is important for stable steady

state. KBmax for all bacteriocins is tightly constrained to high maximal bacteriocin expression rates.

7


https://doi.org/10.1101/2020.06.30.180281


For m66, the correlation coefficients for the two growth rates (µmax1,µmax2) show the parameters are

loosely correlated. Additionally, we see that N1 has a higher maximal growth rate (µmax1) than that

of N2 (µmax2). The faster maximal growth rate of N1 is necessary to counteract self-limitation that

is negatively regulated by the population of N2. Conversely, m62 shows a wider distribution of strain

growth rates at stable steady states and a low correlation coefficient, indicating these parameters

are not as critical to produce the objective behaviour.

Self-limiting motifs stabilise two strain systems

The dendrogram of Figure 2A highlights a cluster of high performing models that are closely related.

This suggests underlying interactions of the model space exist that are important for producing

communities with stable steady state.

Non negative matrix factorization (NMF) is an unsupervised machine learning method we can

use to reduce the dimensionality of the interaction space [44]. We can use NMF to help us under-

stand the underlying motifs and how they affect community stability. We represent each model by the

interactions present in the system (Figure 2A). NMF takes these interactions and learns a number of

clusters (K ), models can be rebuilt by a weighted sum of these clusters. In our case, these clusters

can be represented as interaction motifs. We set K = 4, in order to give us a digestible summary

of the model space. Figure 3A shows the learned motifs that can be used to represent the entire

model space. Figure 3B shows the component weights for each model, defining the membership

each model has for each motif. The models are shown in descending order of posterior probability,

we can see that K2 is heavily weighted in the top performing models. The motif K2 refers to self-

limiting (SL) only interactions where the strain is sensitive to the bacteriocin it produces (Figure 3A,

B). The top models are consistently assigned low weights for K4 (Figure 3A, B), a motif which refers

to other-limiting (OL) only interactions, where the strain is sensitive to a bacteriocin produced by the

other strain (Figure 3A).

The continuous learned motifs from NMF can be hard to interpret. We use the indications pro-

duced by NMF to curate our own discrete motifs. K2 and K4 show us the direction of bacteriocin

sensitivity is an important feature and we proceed to investigate this further. All models can be built

by combining eight fundamental motifs which can be categorised as either SL or OL, based on the

direction of bacteriocin sensitivity (Figure 3C). Within each category, motifs are differentiated by the

8


https://doi.org/10.1101/2020.06.30.180281


way bacteriocin expression is regulated (Figure 3C). For example, m66 = SL2, m48 = SL4 + SL2

and m62 = SL2 + SL2. In order to assess the importance of each motif for producing stable com-

munities we perform a motif impact analysis (Figure 3D). For each model we identify the nearest

neighbours in the model space that can be built by adding each motif. By calculating the change

in posterior probability for each neighbour, we can summarise the contribution each motif brings to

the model posterior probability (Figure 3D). By repeating this across the entire model space, we are

able to quantify whether a motif is stabilising or destabilising (Figure 3E). The lower quartiles of SL

motifs all show lower negative change magnitudes compared with the lower quartiles of OL motifs.

The upper quartiles of SL motifs show a higher positive change magnitude than that of OL motifs.

Together these show the addition of SL motifs more often result in an improved posterior probability,

whereas addition of OL motifs are more often resulting in decreased posterior probability. The upper

quartile of SL2 shows the motif has the most stabilising effect, closely followed by SL4. We see these

findings are reflected by top models identified in Figure 2B, where all models are constructed with

SL2 and SL4 motifs.

The total output of bacteriocin by a population will be a function of the population’s density. All

SL motifs therefore possess a fundamental negative feedback relationship between growth rate and

density, augmented by the mode of QS regulation. Conversely, the population density and growth

rate of a strain in OL motifs are decoupled. This lack of feedback is a clear explanation as to why

we see SL motifs as positive contributors to stability while OL motifs have a destabilising effect.

Designing three strain communities that achieve steady state

While several studies have demonstrated the ability to establish synthetic two strain systems [45,

46, 23, 47, 48, 49, 20, 50, 51, 52, 53, 54], efforts with three strains are sparser [55, 24, 56]. Having

demonstrated the automated design of two strain systems, we next tackle the far larger challenge

of designing stable three strain communities. The addition of a single strain significantly increases

the parameter space, engineering options and possible interactions. We define our available parts

consisting of three strains (N1, N2, N3), three bacteriocins (B1, B2, B3) and two orthogonal QS

systems (A1, A2). We maintain the same strain engineering restrictions, allowing up to one QS

expression and up to one bacteriocin expression per strain. Each strain can be sensitive to up to

one bacteriocin. Given the available parts and engineering limits, the Model Space Generator yields

9


https://doi.org/10.1101/2020.06.30.180281


4,182 unique models. Due to the much greater number of models, we group models based upon

the interactions in each model by hierarchical clustering for up to five levels. The average posterior

probabilities of each cluster are shown (Figure 4A). 3,289 models have a posterior probability of zero,

highlighting how much more difficult this design scenario is. ABC SMC identifies m4119 as the system

with the highest posterior probability for producing stable steady state. m4119 consists of a two QS

molecules; A1 is produced by N2, A2 is produced by N3 (Figure 4B). The QS molecules repress

the expression of self-limiting bacteriocins produced by each population. Using the minimal motifs

defined in Figure 3C, m4119 can be summarised as m4119 = 3×SL2. We subset the model space on

the counts of heterologous expression in the system, yielding subsets containing candidate models

with two, three and four expressed parts (low complexity to high complexity) (Figure 4B). Models with

two heterologously expressed parts all had a posterior probability of 0.0 and are not shown. Again,

we see a diminishing increase in posterior probability that comes with increasing complexity. m3938 is

the more complicated neighbour of m4119, the difference being N1 is also contributing with production

of A1, resulting in a fall in the posterior probability. The increase in posterior probability that occurs

when moving from m4125 to m4119 has BF < 3.0, indicating the difference between the posterior

probability of the two models is not substantial. These system comparisons highlight the trade-

off between increasing complexity and improving system performance. In a similar fashion to the

two strain model space, the top performing models are dominated by self-limiting only interactions

(Supplementary Figure 1).

Multiple engineered bacteriocins are more important than multiple orthogonal QS

systems

Our results have identified top performing models in the two and three strain model spaces. We have

also highlighted the diminishing returns that occur with increasing model complexity in top performing

models. Next we aim to summarise the importance of different parts and their contribution to the

stable steady state objective behaviour, further enabling us to triage genetic parts for construction in

the lab.

Figure 5 shows a summary of the parts used to construct three strain systems and the aver-

age posterior probabilities they yield. This gives us important information to form heuristic rules

in the design of three strain systems. Figure 5A shows the very similar posterior probability when

10


https://doi.org/10.1101/2020.06.30.180281


comparing two QS systems rather than one. Figure 5B demonstrates the substantial advantage of

repressive QS regulation of bacteriocin production over inducible systems. Figure 5C shows very

strong evidence in favour of using three bacteriocins to produce stable steady state in three strain

systems. These three statistics suggest that on average there is little advantage to be gained in the

use of two QS systems, and priority should be given to the use of a single repressive QS to regulate

three bacteriocin systems, such as we see in m4125.

Defining stable steady state population ratios in three strain systems

Synthetic communities can be applied to improving yields and efficiency via the distribution of biopro-

duction pathways between subpopulations [46, 53]. The density of a subpopulation will determine

its productive output capacity for the function it conducts. Being able to define the steady state

composition of a synthetic community is therefore an important feature. Here we demonstrate that

secondary objectives can be applied to the output of ABC SMC without further simulation time,

identifying key parameters that enable fine tuning of stable steady state population densities.

The εF3 threshold value ensures all simulations in the final population have an OD > 0.001.

Figure 6A shows the community composition distribution of m4119. The majority of accepted par-

ticles show a final community composition that is dominated by a single strain. By using the final

population distances from ABC SMC we can subset for a secondary objective and identify how

the system can be tuned to produce a more evenly distributed community composition. We set a

secondary objective, stipulating that all strains must be of OD > 0.1 (pink) (Figure 6B). Therefore

strains that do not meet the secondary objective have 0.001 < OD < 0.1 (blue) (Figure 6B). From

these two subsets we generate separate parameter distributions and calculate the divergence using

Kolmogorov-Smirnov (KS). Parameter distributions that show the greatest divergence are important

for changing the system behaviour from one that is dominated by a single strain, to one that has

a more even distribution of strain densities. The distributions of four parameters that exhibit great-

est divergence are shown in Figure 6C. A higher dilution rate (D) and lower maximal bacteriocin

expression rates (KBmax1, KBmax2, KBmax3) are associated with producing an more evenly distributed

community composition. Importantly, all three parameters are realistically tunable. The dilution rate

can be controlled directly through the chemostat device, while bacteriocin expression rates can be

changed through the choice of promoters and ribosome binding sites.

11


https://doi.org/10.1101/2020.06.30.180281


Discussion

Synthetic communities built to date have employed the use of QS, metabolic dependencies, intra-

cellular lysis proteins, toxins and extracellular AMPs to engineer interactions that enable community

formation [55, 54, 24]. Studies often incorporate mathematical modelling to inform experimental de-

sign and identify the conditions that will produce the expected population dynamic. The design of the

fundamental interactions of the system itself is often directed by mimicking ecological interactions

found in nature, or by rational judgement. As the possible types of engineered interaction increases,

so does the need for comprehensive assessment of the vast model spaces. The modelling and sta-

tistical framework demonstrated here addresses this design problem. With our examples we have

highlighted important design features and heuristic rules for building synthetic steady state commu-

nities. Comparative coculture studies have been used to identify stable partnerships in microbial

ecosystems [57, 58]. These methods could be used to validate our findings in the wet lab by con-

structing a subset of strains from the model space and comparing the ability of pairs to produce

stable steady state.

We have identified optimal system designs using bacteriocins and QS for stable steady state in

two and three strain communities. m62, the top model of the two strain model space uses a cross-

protection mutualism, whereby the density of each subpopulation inhibits the self-limitation of the

other. Cross-protection mutualism has previously been incorporated in synthetic microbial commu-

nities via the mutual degradation of externally supplied antibiotics [49]. Metabolic interdependencies

can also be employed to engineer mutualism [50, 51]. Similarly, in m4119 of the three strain model

space (Figure 4B) we see pair-wise cross-protection mutualism between two subpopulations and a

dependent subpopulation. All top performing models used SL interactions to produce stable steady

state dynamics. Self-limitation is observed in many natural biological communities, normally in a

response to stress [59, 60]. These processes, while detrimental to the individual, provide a net

benefit to the community through release of a public good - they are altruistic processes [61]. Al-

truistic cell death is conserved throughout different species implying a competitive advantages in

natural environments [62]. SL interactions have previously been employed for building synthetic co-

cultures. Scott et al. demonstrated the use of QS regulated lysis protein expression to implement a

self-limiting interactions in two strain systems, overcoming competitive exclusion in a batch culture.

12


https://doi.org/10.1101/2020.06.30.180281


The robustness of SL interactions can be explained by the feedback loops involved. Total bac-

teriocin output by a subpopulation is heavily dependent upon its population density; low population

density will naturally have a low output of bacteriocin [63], making QS a secondary level of regula-

tion. This is supported by both two and three strain scenarios where we observe the diminishing

returns that come with increasing complexity. Figure 5 shows that increasing the number of bacte-

riocin in a system yields greater increases in stability than increasing the number of QS systems.

A closed feedback loop exists between the bacteriocin expression rate and the population density.

This is an important reason why we see all SL motifs generally show positive contribution to stability.

Conversely, in OL motifs the population expressing the bacteriocin will not be negatively affected

and therefore a closed feedback-loop does not exist. It should be emphasised that while the design

rules we have identified hold true for a stable steady state objective, it may not be the case for other

objective population dynamics, such as oscillations. However, new objectives can be investigated

by changing the distance functions which describe the population dynamic.

Finally, we showed that the posterior parameter distribution from ABC SMC can be used to make

decisions on part characteristics and experimental conditions (Figure 2C and Figure 6C). Our results

show the dilution rate (D) is an important experimental parameter for producing stable steady state,

and tuning the community composition. The rate of removal of molecules from the environment can

produce very different population dynamics. This result is supported by previously conducted work

where the dilution rate has regularly been demonstrated as important for determining the popula-

tion dynamic [23, 49, 6]. We also show our methodology can identify systems that are robust to

differences in growth rate, highlighted by the comparison of m66 and m62 in Figure 2C. Together

these draw attention to important part characteristics that should be considered when constructing

a stable community. Furthermore, it should be emphasised that if the parts are already selected and

characterised, their parameters can be fixed in the prior distribution reflecting the known behaviour.

The framework we have developed offers a natural entry point to the design-build-test cycle, pro-

viding a data informed roadmap for building a robust synthetic community with a desired behaviour.

We have revealed stable steady state systems in a two and three strain model space, and generated

impactful rules and heuristics for their construction. The flexibility of this framework enables us to

quickly redefine population level behaviours depending on the required application.

13


https://doi.org/10.1101/2020.06.30.180281


Methods

Model space generator

Models are generated from a set of parts which are expressed by different strains in the system. We

represent an expression configuration through a set of options. We define the options for expres-

sion of A in each strain, where the options are not expressed, expression of A1, and expression of

A2 (0, 1 and 2). We define the options for expression of bacteriocin, which for the two strain model

space includes no expression, expression of B1 or expression of B2 (0, 1, and 2). For the three strain

model space, this includes includes no expression, expression of B1, expression of B2 or expression

of B3 (0, 1, 2 and 3 respectively). Lastly we define the mode of regulation for the bacteriocin, which

can be either induced or repressed (0 and 1). This is redundant if a bacteriocin is not expressed.

Two strain:

A = {0, 1, 2}

B = {0, 1, 2}

R = {0, 1}

Three strain:

A = {0, 1, 2}

B = {0, 1, 2, 3}

R = {0, 1}

This enables us to build possible part combinations that can be expressed by a population. Let

Pc be a family of sets, where each set is a unique combination of parts.

PC = A× B × R

Each strain in a system can be sensitive to up to one bacteriocin. Let I represent the options

for strain sensitivity. In the two strain model space, the options are insensitive, sensitive to B1 or

sensitive to B2 (0, 1, and2 respectively). In the three strain model space, where the options are in-

sensitive, sensitive to B1, sensitive to B2 or sensitive to B3 (0, 1, 2 and 3 respectively).

Two strain:

I = {0, 1, 2}

Three strain:

I = {0, 1, 2, 3}

Each strain is defined by it’s sensitivities, and expression of parts. Let PE be all unique engineered

14


https://doi.org/10.1101/2020.06.30.180281


strains:

PE = I × PC

Which can be combined to form a model, yielding unique combinations in two strains and three

strains:

Two strain:

PM = PE × PE

Three strain:

PM = PE × PE × PE

Finally, we use a series of rules to remove redundant models. A system is removed if:

1. Two or more strains are identical, concerning bacteriocin sensitivity and combination of ex-

pressed parts.

2. The QS regulating a bacteriocin is not present in the system.

3. A strain is sensitive to a bacteriocin that does not exist in the system.

4. A bacteriocin exists that no strain is sensitive to.

This cleanup yields the options which are used to generate ODE equations for system.

System equations

s in each system are rescaled to improve speed of obtaining numerical approximations.

N ′x = NxCN

B ′z = BzCB

A′y = AyCA

Each model is represented as sets of species

15


https://doi.org/10.1101/2020.06.30.180281


N = {1, 2...x}

B = {1, 2...z}

A = {1, 2...y}

Species are represented as differential equations

dNx

dt= µx(S)− Nx

B∑z=1

ω(B ′z)− NxD

dS

dt= D(S0 − S)−

N∑x=1

µxN′x

γx

dBz

dt=

N∑x=1

(kBx ,zN′x)

CB− DBz

dAy

dt=

N∑x=1

kAx ,yN′x

CA− DAy

Growth modelled by Monod’s equation for growth limiting nutrient (S)

µx(S) =µxmaxS

KX + S

Killing by bacteriocin, where ωmax = 0 if strain is insensitive

ω(Bz) = ωmaxBnωz

Knωω + Bnω

z

Induction or repression of bacteriocin expression by QS species, Ay

kB(z , y) = kBmax

A′nzy

KnzBz

+ A′nzy

kB(z , y) = kBmax

KnzBz

KnzBz

+ A′nzy

16


https://doi.org/10.1101/2020.06.30.180281


Simulations were conducted for 1000hrs, the final 100hrs were to calculate the summary statistics

and were stopped early if the population of any strain fell below 1e−10 (extinction event). Simulations

with an extinction event have distances set to maximum. This was done to prevent excessive time

spent solving ODEs of collapsed populations.

Bayesian inference

Let θ ∈ Θ be a sampled parameter vector with a prior π(θ). Given an objective of x0, where x0

exists in the solution space, x0 ∈ D. We define the likelihood function for the objective behaviour as

f (x0|θ). Bayes’ theorem gives us the posterior distribution of θ that exists for the objective x0.

π(θ|x0) =f (x0|θ)π(θ)

π(x0)

We can rewrite π(x0) where a and b represent the lower and upper bounds of the parameter value:

π(x0) =

∫ b

af (x0, θ)dθ =

∫ b

af (x0|θ)π(θ)dθ

The posterior distribution informs us of the parameter distribution that gives rise to the objective.

π(θ|x0) =f (x0|θ)π(θ)∫ b

a f (x0|θ)π(θ)dθ

Let m be a model from a vector of competing models, M , such that m ∈ M = {m0,m2...mn}. Each

model has its own parameter space, allowing us to define a joint space, (m, θ) ∈ M ×ΘM .

We can write Bayes’ theorem in the context of a model space.

π(m|x0) =f (x0|m)π(m)∫

M f (x0|m′)π(m′)dm′

17


https://doi.org/10.1101/2020.06.30.180281


Since the M is discrete, we can rewrite this as

π(m|x0) =f (x0|m)π(m)∑M f (x0|m′)π(m′)

The marginal likelihood of the model, f (x0|m), considers the joint parameter and model space

f (x0|m) =

∫ΘM

π(θ|m)f (x0|θ,m)dθ

Approximate Bayesian computation

Writing the likelihood function, f (x0|θ), in terms of summary statistics can be difficult. We bypass

this and approximate the posterior by generating data from a model. We can sample a parameter

vector from the prior, θ∗ ∼ π(θ), which is simulated to yield a data vector, x∗. This can be written as

a conditional, x∗ ∼ f (x |θ∗), which also gives the joint density, π(θ, x).

In order to obtain the posterior distribution that satisfies our objective behaviour, x0, we apply a

conditional to define whether a generated data vector, x∗ belongs to the objective x0.

If x = x0

π(θ|x , x0) =π(θ)f (x |θ)

π(θ)f (x |θ)dxdθ

Else

π(θ|x , x0) = 0

Let ρ(x , x0) be a distance function that compares a simulation to the objective. Using distance

thresholds (ε), we can define values below which the distance is acceptably small. We can redefine

π(θ|x , x0) in the context of thresholds to obtain an approximation of the posterior.

18


https://doi.org/10.1101/2020.06.30.180281


If ρ(x , x0) < ε

πε(θ|x , x0) =π(θ)f (x |θ)

π(θ)f (x |θ)dxdθ

Else

πε(θ|x , x0) = 0

The smaller ε is and the larger the number of simulations conducted, the more accurate the repre-

sentation of the true posterior will be. We can write this marginal posterior distribution as

π(θ∗|ρ(x∗, x0)) ≤ ε ≈ π(θ|x0)

Model selection with ABC SMC

The most basic ABC algorithm is the ABC rejection algorithm. Let ε be the distance threshold the

defining the necessary level of agreement between the objective, x0, and a given simulation, x∗. In

this paper we use a variant of ABC, ABC Sequential Monte Carlo (ABC SMC) [39]. Particles are

sampled from the prior distributions, each particle represents a sampled model and sampled param-

eters for that model. ABC SMC evolves particles sampled from the prior distribution through a series

of intermediate distributions. Particles sampled from an intermediate distribution are perturbed and

and given an importance weighting to define their sample probability for the next distribution. The

distance threshold (ε) is decreased between distributions, moving the acceptance criteria closer to

the objective. These features aim to improve the acceptance rate of particles while maintaining a

good approximation of the posterior distribution.

19


https://doi.org/10.1101/2020.06.30.180281


Algorithm 1: Model selection with ABC SMC

1 Set population indicator, t = 02 Set initial epsilon, εt = inf3 Set final epsilon, εF = [x , y , z ]4 Set particle indicator, i = 05 if t = 0 then6 Sample m∗ from π(m)7 Sample θ∗∗ from π(θ(m∗))

8 else if t > 0 then9 Sample particle θ∗ from previous population {θ(m∗)it−1} with weights w(m∗)t−1

10 Perturb θ∗ to obtain θ∗∗ ∼ Kt(θ|θ∗)11 if π(θ∗∗) = 0 then12 go to 513

14 Simulate, x∗ ∼ f (x |θ∗∗,m∗)15 if d(x∗, x0) > εt then16 go to 517

18 Set mit = m∗

19 Set θit = θ∗∗

20 Calculate particle weight, w it

21 if t = 0 then22 w i

t = 123 else24 w i

t = π(θ∗∗)∑Nj=1 w

jt−1Kt(θit−1|θ∗∗)

25 if i < N then26 Set i = i + 127 go to 528

29 Normalise weights for every m.30 if εt 6= εT then31 Update population number, t = t + 132 Update ε according to accepted particle distances, εt = fε()33 go to 5

20


https://doi.org/10.1101/2020.06.30.180281


The Bayes factor can be used to help us interpret how much better (or worse) one model is than the

other. Given two models, m1 and m2, the Bayes factor is calculated as:

BF =P(m1|x)/P(m2|x)

P(m1)/P(m2)

P(mi ) is the prior, and P(mi |x) is the posterior probability. Given uniform priors, P(mi ) = 1/M ,

where M is the number of models. Therefore we can simplify to:

BF =P(m1|x)

P(m2|x)

The Bayes factor is a measure the support for m2 relative to m1. It accounts for the number

of parameters, or complexity of the two models. The Bayes factor allows us to directly compare

the weight of evidence for and against the two models and has the advantage that it can be used to

compare non nested models. Two Bayes factors can be compared directly, since they both represent

evidence in favour of the hypothesis [39, 40]. We therefore use Bayes factors to directly compare

the ability of two models to represent the objective population behaviour. The following table allows

us to interpret BF .

Bayes factor (BF )value

Evidence against m2

(in favour of m1)1 to 3 Very weak

3 to 20 Positive20 to 150 Strong> 150 Very strong

Table 1: Bayes factor categorisation to describe evidence in favour of m1, compared with m2.

Software packages and simulation settings

Repository of AutoCD can be found at https://github.com/ucl-cssb/AutoCD/. The repository includes

configuration files for the two- and three- strain experiments. ABC SMC model selection algorithm

was written in python using Numpy, Pandas and Scipy. ODE simulations were conducted in C++

with a Rosenbrock 4 stepper from the Boost library. All simulations use an absolute error tolerance

21


https://doi.org/10.1101/2020.06.30.180281


of 1e−9, and relative error tolerance of 1e − 4. Non negative matrix factorisation was conducted

using Scikit Learn. Dendrograms were made from Scipy, using the unweighted pair group method

with arithmetic mean (UPGMA) clustering algorithm. Ternary diagrams were made using python

package python-ternary [64]. Parameter distribution plots were made in R using ggplot2.

Prior distributions

Prior distributions for both two and three strain systems are sampled from uniformly between the

min and max values listed below. Constant parameters have the same min and max value.

Parameter /Species

Description Prior(min)

Prior(max)

Units Citation

ParametersCN OD to cell number scaling factor 1e9 1e9 None N/ACB Microcin scaling factor 1e−9 1e−9 None N/ACA QS scaling factor 1e−9 1e−9 None N/AD Dilution rate 0.01 0.2 h−1 N/AKAyBz Half maximal QS promoter activa-

tion/repression from Ay to Bz

1e−9 1e−6 M [65]

Kx Monod’s half saturation constant 3.9e−5 3.9e−5 M [66]Kω Half saturation killing constant 1e−7 1e−6 M [67, 68]S0 Substrate concentration of input media

(0.4% glucose)0.02 0.02 M M9 media

γ E. coli substrate yield 1e11 1e11 cell M−1 [69]kAy Production rate of AHL per cell 1e−22 1e−15 M h−1 [70]KBmaxz Maximal expression rate of microcin 1e−22 1e−15 M h−1 [6]µxmax Maximum growth rate 0.4 3 h−1 [42, 43]nz Hill coefficient AHL induced expression 1 2 M [65]nω Hill coefficient for killing 1 2 M [65]ωmax Maximum rate of bacteriocin killing 0.5 2.0 M−1 h−1 [71, 67,

68]Initial species

N OD of strain 0.01 0.5 OD N/AS 0.4% glucose concentration 0.02 0.02 M N/AB Microcin concentration 1e−81 1e−81 M N/AA QS concentration 1e−10 1e−10 M N/A

Table 2: Uniform priors used to for parameter sampling and initial species values.

22


https://doi.org/10.1101/2020.06.30.180281


OutputsInputs Software

Set availableparts and limits

Set prior parameter distribution

Set desired behaviour

Model space generator

ABC SMC

O.D

600

Parameter inference

βα

γδ

ε

α

π(Ө)

β γ δ

ε ζ λη

Motif impact analysis

Community composition control

B1

A1N1 N2

Strain

N2

N1

Bacteriocin

B1

QS

A1

N1ƒ (,( N2 B1

A1

Identify high probabilitycandidates

B1

A1N1 N2

Figure 1: Overview of AutoCD pipeline. Model selection workflow begins with definition of availableparts and prior parameter distributions, used to generate system models from all the possible inter-actions. We use ABC SMC to perform model selection for the desired population behaviour. Theoutputs of ABC SMC provide us with community designs, insight into underlying motifs, parameterrequirements and information on composition tunability.

23


https://doi.org/10.1101/2020.06.30.180281


0.12

0.08

0.04

0.00

0.00

0.05

0.10

Posterior probability

Mod

els

InteractionsB2

A1

N1 N2

B1

A2

B.

BF =

5.5

BF =

4.5

BF =1.2

Increasing model complexity

N1 N2

m66

N1 N2

N1

N2

m48

N1 N2

m62m66

m48

m62

Bacteriocin QS Strains

A.

D

kB_max_1

mu_max_1

mu_max_2

D

kB_max_1

kB_max_2

mu_max_1

mu_max_2

KBmax1

D

KBmax2

�max1

�max2

�max1

�max2

D

KBmax1

D Dilution rate

KBmaxMaximal expression rate of bacteriocin

�maxMax growth rate of strain m66 m62

Key parametersC.

0.19 -0.03 0.15

-0.15

0.59

0.02

0.09 0.03 0.10 0.03

-0.34 0.33 -0.33

-0.31 0.31

0.22

Figure 2: Output of AutoCD for the two strain stable steady state objective A.) Dendrogram is gen-erated by hierarchical clustering of the adjacency matrices for each model in the two strain modelspace. All possible interactions are shown in the network illustration. Each column of the heatmaprepresents a possible interaction between species, where green indicates the interaction exists forthe model and black indicates absence of the interaction. The bar chart shows the posterior prob-ability of each model. B) Shows the models with highest posterior probability when subsetted fornumber of parts expressed, in order of increasing complexity (2, 3 and 4 parts). Bayes factors (BF)are shown for pairwise comparison of the three models and error bars show the standard deviationacross three repeat experiments. Model schematics show the interactions between strains (blue andgreen), bacteriocin (red) and QS molecules (purple). C) Posterior parameter distributions of severaltunable parameters in m66 and m62. The top and left plots show 1D density distributions of eachparameter, central distributions are 2D density distributions for each pair of parameters. Pearsoncorrelation coefficients are shown on the top side of the diagonal for each parameter pair.

24


https://doi.org/10.1101/2020.06.30.180281


B2

A1

N1 N2

B1

A2

A1

N1 N2

B1

K1

K2

K3

K4

K1

K2

K3

K4

B2

A1

N1 N2

B1

A2

A1

N1 N2

B1

A2

Component interaction weights

Models

Non-negative matrix factorisation

Post

erio

rPr

obab

ility

-1.0 0.0 1.0

A. B.

Calculate changes in posterior probability for all models in M

Where mx is a model in M:

Find nearest neighbours that are reached by adding each motif

Mod

el p

oste

rior

prob

abilit

y

Models

Other-limiting motifsSelf-limiting motifs

∆ normalised posterior probability

A1

N1 N2

B1

A1

N1 N2

B1

A1

N1 N2

B1

A1

N1 N2

B1

A1

N1

B1

A1

N1 N2

B1

A1

N1

B1

SL1

A1

N1 N2

B1

SL2

SL3

SL4

SL1

SL3

SL2

SL4

OL1

OL3

OL2

OL4

OL1

OL3

OL2

OL4

mx -

C.

D. E.

Figure 3: Contribution of network motifs to stability. A, B) Non negative matrix factorisation (NMF)analysis to learn motifs in model space using four components (K=4). A) Four components learnedby NMF, the line opacity indicates the coefficient of the interaction. B) Visualisation of the componentweights for each model. Each column is a model, with light colours corresponding to high weight anddark colours low weight. C, D, E) Manually curated minimal motifs capture interaction importance.C) Minimal motifs split into self-limiting (SL) and other-limiting (OL) by the direction of bacteriocinkilling. D) Illustration of the algorithm used to generate each datapoint in E). Moving from a model tothe nearest neighbours that can be built by adding a motif will produce a change in model posteriorprobability. E) Boxplots and scatter plot showing the change in posterior probability when addingeach motif to a model. The boxplots show the median, first and third quartile. The lower and upperwhiskers mark the 5th and 95th percentiles respectively.

25


https://doi.org/10.1101/2020.06.30.180281


Mod

els

B.

BF =

6.7

BF =

7.4

BF = 1.1 BF = 1.1

N1

N2 N3

N1

N2 N3

N1

N2 N3

i) m4128 ii) m4125 iii) m4119 iv) m3938

N1

N2 N3

m4128

m4128

m4124m4119m3938

{

i) ii)

iii) iv)

Posterior probability10

-1110

-8

10-2

10-5

A.

Interactions

0.000

0.005

0.010

0.015

A2A1

N1

B2B1

N2N3

Bacteriocin

Strains

QS

Model space parts:

B3

Figure 4: Output of AutoCD for three strain stable steady state objective A.) Dendrogram is gener-ated by hierarchical clustering of the adjacency matrices for each model in the three strain modelspace. We set the limit of number of levels to 5, in order to show high level groups. Each columnof the heatmap represents a possible interaction between species, where green indicates the inter-action exists for the model and black indicates absence of the interaction. The posterior probabilityplot shows the average posterior probability within each group of models. B) Shows the models withhighest posterior probability when subsetted for number of parts expressed, in order of increasingcomplexity (3, 4, 5 and 6 expressed parts). Bayes factors (BF) are shown for pairwise comparisonof the three models and error bars show the standard deviation between three repeat experiments.Model schematics show the interactions between strains (blue, green and red), bacteriocin (red) andQS molecules (purple). Models with two parts showed posterior probability 0.0 and are not shown.

26


https://doi.org/10.1101/2020.06.30.180281


Mea

n po

ster

ior p

roba

bilit

y

One QS Two QS One B Two B Three B0.0

0.001

0.0005

+ve - ve +ve-ve

A B. C.

B2B1 B1 B2B1

B3

A2A1A1

Number of QS Modes of regulation Number of bacteriocin

BF =

15.

8

BF =

3.4

BF =

1.8

Figure 5: Summary of parts and by the mean posterior probabilities of subsets from three strainmodel space. A) Comparing systems with one and two orthogonal QS parts. B) Comparing modesof bacteriocin regulation in the system by subsetting for systems with induction (+ve), repression(-ve) or both (+ve, -ve). C) Comparing systems with one, two and three bacteriocin.

D

-50 -35

0.0 0.6 -50 -35

-50 -35

0.25

0

5.0

0.0

log( KBmax1 )

log( KBmax2 ) log( KBmax3 )

0.25

0

0.25

0Low

Particle density

High

N1 N2

N3

N1 N2

N3All strains

All strainsAny strain

OD > 0.001OD < 0.1

OD > 0.1

A. B. C.

Figure 6: Distribution of population densities in Model 4119. Axes of ternary diagrams (A and B)show percentage composition of the community. A) Heat map showing the community compositionat stable steady state in Model 4119. B) Scatter plot of the stable steady state systems, highlightingthe secondary objective where all strains have OD > 0.1 (red), and the primary objective only whereany strain has 0.001 < OD < 0.1. C) Density plots comparing the parameter distributions of fourparameters that show the greatest divergence to produce the secondary objective: Dilution rate (D)and maximal bacteriocin expression rates (KBmax1, KBmax2, KBmax3). The Kolmogorov-Smirnovvalues for the two objectives are 0.25, 0.12, 0.13 and 0.12 respectively.

27


https://doi.org/10.1101/2020.06.30.180281


1 References

[1] Libertad Pantoja-Hernández and Juan Carlos Martínez-García. Retroactivity in the Context of

Modularly Structured Biomolecular Systems. Frontiers in Bioengineering and Biotechnology,

3(85), jun 2015.

[2] Shridhar Jayanthi and Domitilla Del Vecchio. Retroactivity Attenuation in Bio-Molecular Sys-

tems Based on Timescale Separation. IEEE Transactions on Automatic Control, 56(4):748–

761, apr 2011.

[3] Andras Gyorgy, José I. Jiménez, John Yazbek, Hsin Ho Huang, Hattie Chung, Ron Weiss,

and Domitilla Del Vecchio. Isocost Lines Describe the Cellular Economy of Genetic Circuits.

Biophysical Journal, 109(3):639–646, aug 2015.

[4] D Summers. The kinetics of plasmid loss. Trends in Biotechnology, 9(1):273–278, jan 1991.

[5] Deepak Mishra, Phillip M Rivera, Allen Lin, Domitilla Del Vecchio, and Ron Weiss. A load driver

device for engineering modularity in biological networks. Nature Biotechnology, 2014.

[6] Andrea Y Weiße, Diego A Oyarzún, Vincent Danos, and Peter S Swain. Mechanistic links be-

tween cellular trade-offs, gene expression, and growth. Proceedings of the National Academy

of Sciences of the United States of America, 112(9):E1038–1047, 2015.

[7] Katie Brenner, Lingchong You, and Frances H. Arnold. Engineering microbial consortia: a new

frontier in synthetic biology. Trends in Biotechnology, 26(9):483–489, 2008.

[8] Theodore A. Kennedy, Shahid Naeem, Katherine M. Howe, Johannes M. H. Knops, David

Tilman, and Peter Reich. Biodiversity as a barrier to ecological invasion. Nature,

417(6889):636–638, jun 2002.

[9] Doruk Beyter, Pei-Zhong Tang, Scott Becker, Tony Hoang, Damla Bilgin, Yan Wei Lim, Todd C

Peterson, Stephen Mayfield, Farzad Haerizadeh, Jonathan B Shurin, Vineet Bafna, and Robert

McBride. Diversity, Productivity, and Stability of an Industrial Microbial Ecosystem. Applied and

environmental microbiology, 82(8):2494–2505, apr 2016.

28


https://doi.org/10.1101/2020.06.30.180281


[10] G. J. Butler and G. S. K. Wolkowicz. A Mathematical Model of the Chemostat with a Gen-

eral Class of Functions Describing Nutrient Uptake. SIAM Journal on Applied Mathematics,

45(1):138–151, feb 1985.

[11] Kevin R. Foster and Thomas Bell. Competition, Not Cooperation, Dominates Interactions

among Culturable Microbial Species. Current Biology, 22(19):1845–1850, oct 2012.

[12] Michael E. Hibbing, Clay Fuqua, Matthew R. Parsek, and S. Brook Peterson. Bacterial compe-

tition: Surviving and thriving in the microbial jungle, jan 2010.

[13] Shiri Freilich, Raphy Zarecki, Omer Eilam, Ella Shtifman Segal, Christopher S. Henry, Martin

Kupiec, Uri Gophna, Roded Sharan, and Eytan Ruppin. Competitive and cooperative metabolic

interactions in bacterial communities. Nature Communications, 2(1), 2011.

[14] Aleksej Zelezniak, Sergej Andrejev, Olga Ponomarova, Daniel R. Mende, Peer Bork, and Ki-

ran Raosaheb Patil. Metabolic dependencies drive species co-occurrence in diverse microbial

communities. Proceedings of the National Academy of Sciences, 112(20):6449–6454, may

2015.

[15] Alexander May, Shrinath Narayanan, Joe Alcock, Arvind Varsani, Carlo Maley, and Athena

Aktipis. Kombucha: a novel model system for cooperation and conflict in a complex multi-

species microbial ecosystem. PeerJ, 7(9):e7565, sep 2019.

[16] T. L. Czaran, R. F. Hoekstra, and L. Pagie. Chemical warfare between microbes promotes

biodiversity. Proceedings of the National Academy of Sciences, 99(2):786–790, jan 2002.

[17] Melissa B. Miller and Bonnie L. Bassler. Quorum Sensing in Bacteria. Annual Review of

Microbiology, 55(1):165–199, oct 2001.

[18] Christina V. Dinh, Xingyu Chen, and Kristala L. J. Prather. Development of a Quorum-Sensing

Based Circuit for Control of Coculture Population Composition in a Naringenin Production Sys-

tem. ACS Synthetic Biology, page acssynbio.9b00451, feb 2020.

[19] Kristina Stephens, Maria Pozo, Chen-Yu Tsao, Pricila Hauk, and William E. Bentley. Bacte-

rial co-culture with cell signaling translator and growth controller modules for autonomously

regulated culture composition. Nature Communications, 10(1):4129, dec 2019.

29


https://doi.org/10.1101/2020.06.30.180281


[20] Feng Liu, Junwen Mao, Ting Lu, and Qiang Hua. Synthetic, Context-Dependent Microbial

Consortium of Predator and Prey. ACS Synthetic Biology, 8(8):1713–1722, aug 2019.

[21] Apoorv Gupta, Irene M Brockman Reizman, Christopher R Reisch, and Kristala L J Prather.

Dynamic regulation of metabolic flux in engineered bacteria using a pathway-independent

quorum-sensing circuit. Nature Biotechnology, 35(3):273–279, mar 2017.

[22] Spencer R. Scott and Jeff Hasty. Quorum Sensing Communication Modules for Microbial Con-

sortia. ACS Synthetic Biology, 5(9):969–977, sep 2016.

[23] Frederick K Balagaddé, Hao Song, Jun Ozaki, Cynthia H Collins, Matthew Barnet, Frances H

Arnold, Stephen R Quake, and Lingchong You. A synthetic Escherichia coli predator-prey

ecosystem. Molecular systems biology, 4(187):187, 2008.

[24] Wentao Kong, David R. Meldgin, James J. Collins, and Ting Lu. Designing microbial consortia

with defined social interactions. Nature Chemical Biology, 14(8):821–829, aug 2018.

[25] Sylvie Rebuffat. Microcins. In Handbook of Biologically Active Peptides, pages 129–137. Else-

vier, 2013.

[26] Kathryn Geldart, Brittany Forkus, Evelyn McChesney, Madeline McCue, and Yiannis Kaznes-

sis. pMPES: A Modular Peptide Expression System for the Delivery of Antimicrobial Peptides

to the Site of Gastrointestinal Infections Using Probiotics. Pharmaceuticals, 9(4):60, oct 2016.

[27] Alex J.H. Fedorec, Tanel Ozdemir, Anjali Doshi, Yan-Kay Ho, Luca Rosa, Jack Rutter, Os-

car Velazquez, Vitor B. Pinheiro, Tal Danino, and Chris P. Barnes. Two New Plasmid Post-

segregational Killing Mechanisms for the Implementation of Synthetic Gene Networks in Es-

cherichia coli. iScience, 14:323–334, apr 2019.

[28] Alex J H Fedorec, Behzad D Karkaria, Michael Sulu, and Chris P Barnes. Killing in response

to competition stabilises synthetic microbial consortia. bioRxiv, pages 1–20, dec 2019.

[29] Lutz Becks, Frank M. Hilker, Horst Malchow, Klaus Jürgens, and Hartmut Arndt. Experimental

demonstration of chaos in a microbial food web. Nature, 435(7046):1226–1229, jun 2005.

[30] Jonathan Friedman and Jeff Gore. Ecological systems biology: The dynamics of interacting

populations, feb 2017.

30


https://doi.org/10.1101/2020.06.30.180281


[31] James T. MacDonald, Chris Barnes, Richard I. Kitney, Paul S. Freemont, and Guy-Bart V. Stan.

Computational design approaches and tools for synthetic biology. Integrative Biology, 3(2):97,

feb 2011.

[32] Paul Kirk, Thomas Thorne, and Michael P.H. Stumpf. Model selection in systems and synthetic

biology, aug 2013.

[33] Chris P Barnes, Daniel Silk, Xia Sheng, and Michael P H Stumpf. Bayesian design of synthetic

biological systems. Proceedings of the National Academy of Sciences of the United States of

America, 108(37):15190–5, sep 2011.

[34] Mae L. Woods, Miriam Leon, Ruben Perez-Carrasco, and Chris P. Barnes. A Statistical Ap-

proach Reveals Designs for the Most Robust Stochastic Gene Oscillators. ACS Synthetic Biol-

ogy, 5(6):459–470, jun 2016.

[35] Miriam Leon, Mae L. Woods, Alex J. H. Fedorec, and Chris P. Barnes. A computational method

for the investigation of multistable systems and its application to genetic switches. BMC Sys-

tems Biology, 10(1):130, dec 2016.

[36] Jing Wui Yeoh, Kai Boon Ivan Ng, Ai Ying Teh, Jing Yun Zhang, Wai Kit David Chee, and

Chueh Loo Poh. An Automated Biomodel Selection System (BMSS) for Gene Circuit Designs.

ACS Synthetic Biology, 2019.

[37] Jacob Beal, Ron Weiss, Douglas Densmore, Aaron Adler, Evan Appleton, Jonathan Babb,

Swapnil Bhatia, Noah Davidsohn, Traci Haddock, Joseph Loyall, Richard Schantz, Viktor

Vasilev, and Fusun Yaman. An end-to-End workflow for engineering of biological networks

from high-level specifications. ACS Synthetic Biology, 1(8):317–331, aug 2012.

[38] Guillermo Rodrigo and Alfonso Jaramillo. AutoBioCAD: Full biodesign automation of genetic

circuits. ACS Synthetic Biology, 2(5):230–236, may 2013.

[39] Tina Toni, David Welch, Natalja Strelkowa, Andreas Ipsen, and Michael P H Stumpf. Approxi-

mate Bayesian computation scheme for parameter inference and model selection in dynamical

systems. Journal of the Royal Society, Interface, 6(31):187–202, feb 2009.

31


https://doi.org/10.1101/2020.06.30.180281


[40] Robert E. Kass and Adrian E. Raftery. Bayes Factors. Journal of the American Statistical

Association, 90(430):773–795, jun 1995.

[41] Howard M. Salis, Ethan A. Mirsky, and Christopher A. Voigt. Automated design of synthetic

ribosome binding sites to control protein expression. Nature Biotechnology, 2009.

[42] Karoline Marisch, Karl Bayer, Theresa Scharl, Juergen Mairhofer, Peter M. Krempl, Karin

Hummel, Ebrahim Razzazi-Fazeli, and Gerald Striedner. A Comparative Analysis of Indus-

trial Escherichia coli K-12 and B Strains in High-Glucose Batch Cultivations on Process-,

Transcriptome- and Proteome Level. PLoS ONE, 2013.

[43] Neythen J. Treloar, Alex J. H. Fedorec, Brian Ingalls, and Chris P. Barnes. Deep reinforcement

learning for the control of microbial co-cultures in bioreactors. PLOS Computational Biology,

16(4):e1007783, apr 2020.

[44] Daniel D. Lee and H. Sebastian Seung. Learning the parts of objects by non-negative matrix

factorization. Nature, 401(6755):788–791, oct 1999.

[45] Alissa Kerner, Jihyang Park, Audra Williams, and Xiaoxia Nina Lin. A Programmable Es-

cherichia coli Consortium via Tunable Symbiosis. PLoS ONE, 7(3):e34032, mar 2012.

[46] Kang Zhou, Kangjian Qiao, Steven Edgar, and Gregory Stephanopoulos. Distributing a

metabolic pathway among a microbial consortium enhances production of natural products.

Nature biotechnology, 33(4):377–83, apr 2015.

[47] Wenying Shou, Sri Ram, and J. M. G. Vilar. Synthetic cooperation in engineered yeast popula-

tions. Proceedings of the National Academy of Sciences, 104(6):1877–1882, feb 2007.

[48] Samay Pande, Holger Merker, Katrin Bohl, Michael Reichelt, Stefan Schuster, Luís F

de Figueiredo, Christoph Kaleta, and Christian Kost. Fitness and stability of obligate cross-

feeding interactions that emerge upon gene loss in bacteria. The ISME Journal, 8(5):953–962,

may 2014.

[49] Eugene Anatoly Yurtsev, Arolyn Conwill, and Jeff Gore. Oscillatory dynamics in a bacterial

cross-protection mutualism. Proceedings of the National Academy of Sciences, 113(22):6236–

6241, may 2016.

32


https://doi.org/10.1101/2020.06.30.180281


[50] Kazufumi Hosoda, Shingo Suzuki, Yoshinori Yamauchi, Yasunori Shiroguchi, Akiko Kashiwagi,

Naoaki Ono, Kotaro Mori, and Tetsuya Yomo. Cooperative Adaptation to Establishment of a

Synthetic Bacterial Mutualism. PLoS ONE, 6(2):e17105, feb 2011.

[51] Xiaolin Zhang and Jennifer L. Reed. Adaptive Evolution of Synthetic Cooperating Communities

Improves Growth Performance. PLoS ONE, 9(10):e108297, oct 2014.

[52] Ye Chen, Jae Kyoung Kim, Andrew J. Hirning, K. Josi, and Matthew R. Bennett. Emergent

genetic oscillations in a synthetic microbial consortium. Science, 349(6251):986–989, aug

2015.

[53] Hans C Bernstein, Steven D Paulson, and Ross P Carlson. Synthetic Escherichia coli consortia

engineered for syntrophy demonstrate enhanced biomass productivity. Journal of biotechnol-

ogy, 157(1):159–66, jan 2012.

[54] Spencer R. Scott, M. Omar Din, Philip Bittihn, Liyang Xiong, Lev S. Tsimring, and Jeff Hasty. A

stabilized microbial ecosystem of self-limiting bacteria using synthetic quorum-regulated lysis.

Nature Microbiology, 2(8):17083, aug 2017.

[55] Marika Ziesack, Travis Gibson, John K. W. Oliver, Andrew M. Shumaker, Bryan B. Hsu, David T.

Riglar, Tobias W. Giessen, Nicholas V. DiBenedetto, Lynn Bry, Jeffrey C. Way, Pamela A. Silver,

and Georg K. Gerber. Engineered Interspecies Amino Acid Cross-Feeding Increases Popula-

tion Evenness in a Synthetic Bacterial Consortium. mSystems, 4(4):e00352–19, aug 2019.

[56] Michael J. Liao, M. Omar Din, Lev Tsimring, and Jeff Hasty. Rock-paper-scissors: Engineered

population dynamics increase genetic stability. Science, 365(6457):1045–1049, sep 2019.

[57] Jared Kehe, Anthony Kulesa, Anthony Ortiz, Cheri M. Ackerman, Sri Gowtham Thakku, Daniel

Sellers, Seppe Kuehn, Jeff Gore, Jonathan Friedman, and Paul C. Blainey. Massively par-

allel screening of synthetic microbial communities. Proceedings of the National Academy of

Sciences, 116(26):12804–12809, jun 2019.

[58] Ryan H. Hsu, Ryan L. Clark, Jin Wen Tan, John C. Ahn, Sonali Gupta, Philip A. Romero, and

Ophelia S. Venturelli. Microbial Interaction Network Inference in Microfluidic Droplets. Cell

Systems, 9(3):229–242.e4, sep 2019.

33


https://doi.org/10.1101/2020.06.30.180281


[59] Carlos Carmona-Fontaine and Joao B. Xavier. Altruistic cell death and collective drug resis-

tance, 2012.

[60] Yu Tanouchi, Anand Pai, Nicolas E. Buchler, and Lingchong You. Programming stress-induced

altruistic death in engineered bacteria. Molecular Systems Biology, 2012.

[61] Martin Ackermann, Bärbel Stecher, Nikki E. Freed, Pascal Songhet, Wolf Dietrich Hardt, and

Michael Doebeli. Self-destructive cooperation mediated by phenotypic noise. Nature, 2008.

[62] Gwyn T. Williams. Programmed cell death: a fundamental protective response to pathogens.

Trends in Microbiology, 1994.

[63] Hilje M. Doekes, Rob J. De Boer, and Rutger Hermsen. Toxin production spontaneously be-

comes regulated by local cell density in evolving bacterial populations. PLoS Computational

Biology, 15(8):e1007333, 2019.

[64] Marc Harper, Bryan Weinstein, Tgwoodcock, Cory Simon, Chebee7i, Wiley Morgan, Vince

Knight, Nick Swanson-Hysell, Matthew Evans, Jl-bernal, ZGainsforth, The Gitter Badger, Sax-

onAnglo, Maximiliano Greco, and Guido Zuidhof. python-ternary: Ternary Plots in Python.

Zenodo 10.5281/zenodo.594435, apr 2019.

[65] Nicolas Kylilis, Zoltan A. Tuza, Guy Bart Stan, and Karen M. Polizzi. Tools for engineering

coordinated system behaviour in synthetic microbial consortia. Nature Communications, 9(1),

dec 2018.

[66] Heinrich Senn, Urs Lendenmann, Mario Snozzi, Geoffrey Hamer, and Thomas Egli. The growth

of Escherichia coli in glucose-limited chemostat cultures: a re-examination of the kinetics. BBA

- General Subjects, 1994.

[67] Delphine Destoumieux-Garzón, Sophie Duquesne, Jean Peduzzi, Christophe Goulard, Michel

Desmadril, Lucienne Letellier, Sylvie Rebuffat, and Pascale Boulanger. The iron-siderophore

transporter FhuA is the receptor for the antimicrobial peptide microcin J25: Role of the microcin

Val11-Pro16 β-hairpin region in the recognition mechanism. Biochemical Journal, 2005.

34


https://doi.org/10.1101/2020.06.30.180281


[68] Kamaljit Kaur, Oxana Tarassova, Ramana Venkata Dangeti, Sarfuddin Azmi, David Wishart,

Lynn McMullen, and Michael Stiles. Characterization of a highly potent antimicrobial peptide

microcin N from uropathogenic Escherichia coli, jun 2016.

[69] K. B. Andersen and K. Von Meyenburg. Are growth rates of Escherichia coli in batch cultures

limited by respiration? Journal of Bacteriology, 1980.

[70] Mattia Marenda, Marina Zanardo, Antonio Trovato, Flavio Seno, and Andrea Squartini. Model-

ing quorum sensing trade-offs between bacterial cell density and system extension from open

boundaries. Scientific Reports, 2016.

[71] Delphine Destoumieux-Garzón, Xavier Thomas, Mónica Santamaria, Christophe Goulard,

Michel Barthélémy, Bénédicte Boscher, Yannick Bessin, Gérard Molle, Anne Marie Pons, Lu-

cienne Letellier, Jean Peduzzi, and Sylvie Rebuffat. Microcin E492 antibacterial activity: Evi-

dence for a TonB-dependent inner membrane permeabilization on Escherichia coli. Molecular

Microbiology, 2003.

35


https://doi.org/10.1101/2020.06.30.180281


Documents

Automated design of synthetic microbial communities · 6/30/2020 · Automated design of synthetic microbial communities Behzad D. Karkaria, 1Alex J. H. Fedorec, and Chris P. Barnes1,2