The novel bacterial phylum Calditrichaeota is diverse, widespread and abundant in
marine sediments and has the capacity to degrade detrital proteins
This is the accepted version of the following article: Marshall IPG, Starnawski P, Cupit C,
Cáceres EF, Ettema TJG, Schramm A, et al. (2017). The novel bacterial phylum Calditrichaeota
is diverse, widespread and abundant in marine sediments and has the capacity to degrade
detrital proteins. Environmental Microbiology Reports. e-pub ahead of print, doi:
10.1111/1758-2229.12544., which has been published in final form at
http://onlinelibrary.wiley.com/doi/10.1111/1758-2229.12544/abstract. This article may be
used for non-commercial purposes in accordance with the Wiley Self-Archiving Policy
[ http://olabout.wiley.com/WileyCDA/Section/id-828039.html ].
Running head: Expansion of Calditrichaeota diversity
Ian P. G. Marshall1*, Piotr Starnawski1§*, Carina Cupit1, Eva Fernández Cáceres2, Thijs J. G.
Ettema2, Andreas Schramm1, Kasper U. Kjeldsen1
1. Center for Geomicrobiology, Section for Microbiology, Department of Bioscience,
Aarhus University, Denmark
2. Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala
University, Sweden
§Current address. Department of Molecular Medicine, Aarhus University Hospital, Denmark
* These authors contributed equally to this manuscript
Corresponding author:
Ian P.G. Marshall
Center for Geomicrobiology
Section for Microbiology, Department of Bioscience, Aarhus University
Ny Munkegade 114
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
8000 Aarhus C
Denmark
E-mail: [email protected]
Telephone: +45 8715 6553
Fax: +45 8715 4326
Summary
Calditrichaeota is a recently recognised bacterial phylum with three cultured representatives,
isolated from hydrothermal vents. Here we expand the phylogeny and ecology of this novel
phylum with metagenome-derived and single-cell genomes from six uncultivated bacteria
previously not recognised as members of Calditrichaeota. Using 16S rRNA gene sequences
from these genomes, we then identified 322 16S rRNA gene sequences from cultivation-
independent studies that can now be classified as Calditrichaeota for the first time. This
dataset was used to re-analyse a collection of 16S rRNA gene amplicon datasets from marine
sediments showing that the Calditrichaeota are globally distributed in the seabed at high
abundance, making up to 6.7% of the total bacterial community. This wide distribution and
high abundance of Calditrichaeota in cold marine sediment has gone unrecognised until now.
All Calditrichaeota genomes show indications of a chemoorganoheterotrophic metabolism
with the potential to degrade detrital proteins through the use of extracellular peptidases.
Most of the genomes contain genes encoding proteins that confer O2 tolerance, consistent with
the relatively high abundance of Calditrichaeota in surficial bioturbated part of the seabed
and, together with the genes encoding extracellular peptidases, suggestive of a general
ecophysiological niche for this newly recognised phylum in marine sediment.
Introduction
The bacterial phylum Calditrichaeota has been recently recognised as an independent
phylum-level clade (Kublanov et al., 2017) with three cultured representatives in the genera
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Caldithrix (Miroshnichenko et al., 2003; 2010) and Calorithrix (Kompantseva et al., 2016).
These bacteria are anaerobic organoheterotrophic thermophiles isolated from marine
hydrothermal vents with a shared ability to degrade proteinaceous substrates. The 14-year
lag in between the phylum’s first isolate and a formal description of the phylum has meant
that Calditrichaeota has gone unrecognised in in the taxonomic classification of DNA
sequences from uncultivated microbes submitted to online databases, including
Genbank/ENA/DDBJ and the Silva small subunit ribosomal RNA (SSU rRNA) database (Quast
et al., 2013). This means that Calditrichaeota’s global distribution and abundance are
unknown and no unified view of the phylum’s unifying physiological characteristics exists
Here we used phylogenomic methods to clearly delineate the phylum Calditrichaeota and to
re-examine the taxonomic placement of several uncultured bacteria that were related to but
not previously identified as Calditrichaeota, based on genomes derived from single-cell
sequencing and metagenomic binning. We then used the 16S rRNA genes from isolate- and
uncultivated-source genomes to identify sequences in the Silva SSU rRNA database that ought
to be classified as Calditrichaeota. Since many of these were derived from marine sediments
we furthermore analysed NGS amplicon datasets from 18 different marine sediment sites to
determine the prevalence of Calditrichaeota in marine sediment. Finally, we aimed to
determine how widely the physiological traits of the three Calditrichaeota isolates are shared
with their uncultivated relatives, and identify the functions that typify this novel phylum.
Results and Discussion
Calditrichaeota was recently recognised as a bacterial phylum within the Fibrobacteres-
Chlorobi-Bacteroidetes (FCB) superphylum, with the closest relative phylum being
Ignavibacteria (Kublanov et al., 2017). Following other researchers’ approaches for defining
phyla based on genomic data (Yarza et al., 2014; Anantharaman et al., 2016), we set out to
determine whether several relatives of the genera Caldithrix and Calorithrix belonged to the
Calditrichaeota based on two criteria: (1) 16S ribosomal RNA sequence identity between
phylum members should be 75% or greater, and (2) the phylum should constitute a
monophyletic clade. Our analyses reveal that both of these criteria are fulfilled, supporting the
hypothesis that the Caldithrix and Calorithrix, alongside the genomes of two uncultivated
bacteria derived from metagenomes (RBG_16_48_16 from a terrestrial aquifer
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
(Anantharaman et al., 2016) and SM23_31 from marine sediment (Baker et al., 2015)), belong
to an independent phylum based on 16S rRNA gene phylogeny, albeit with very poor
bootstrap support (48% bootstrap support, Figure 1A – see supplemental materials and
methods for a complete description). We therefore constructed a phylogenetic tree based on a
concatenated alignment of 17 single-copy orthologous protein sequences from 55 bacterial
genomes (Figure 1B – see supplemental methods for the full criteria and methods used to
choose these protein sequences). This protein-sequence-derived phylogenetic tree places
Caldithrix and six related metagenome- and single-cell-derived genomes (Table 1) in a clade
with 100% bootstrap support. Attempting to form a phylum-level clade from a more deeply
branching point in the tree would include the phylum Bacteroidetes, which contains
representatives with less than 75% 16S rRNA gene sequence identity to Caldithrix abyssi. This
means that, according to our defined criteria, six uncultured bacterial species belong to the
phylum Calditrichaeota and have not been recognised as such prior to this study.
The previous lack of recognition for the Calditrichaeota means that there are large numbers of
16S ribosomal RNA sequences obtained through cultivation-independent sequencing that
have been misassigned to phyla other than Calditrichaeota. We identified some of these by
taking all 16S rRNA gene sequences from the Silva SSU REF database v. 128 (Quast et al.,
2013) that were greater than 80% identical to the Caldithrix abyssi sequence (71051
sequences total) and constructing a FastTree2 maximum likelihood phylogenetic tree from
these sequences. From this tree we could define a set of 322 sequences that belong within the
phylum Calditrichaeota (Supplemental Figure SF1A, Supplemental Table ST1). Silva taxonomy
placed almost all of these sequences within the phyla Fibrobacteres, Deferribacteres, and LCP-
89, while Greengenes taxonomy mostly showed provisional recognition of the “Caldithrix”
phylum and a number of unclassified sequences (Supplemental Figure SF1B, Supplemental
Table ST1). In contrast to the source of Caldithrix and Calorithrix isolates, the majority of these
sequences were derived from non-hydrothermal marine sediment. We therefore identified
reads from 18 different seabed sites accessible through publicly available next-generation 16S
rRNA gene amplicon community analysis studies of diffusive marine sediment environments
that were 95% identical or greater to one of the 322 representative Calditrichaeota
sequences. This analysis showed that Calditrichaeota is globally distributed and highly
abundant in marine sediment (Supplemental Table ST2, Figure 2A), comprising up to 6.7% of
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
16S rRNA gene amplicon libraries. The sample with the highest fraction of Calditrichaeota was
a sediment sample from a depth of 0-4cm in the Caspian Sea (Mahmoudi et al., 2015).
Calditrichaeota appear to be more abundant in shallow sediments (<1 meters below seafloor
[mbsf]) than in the deep subsurface, even though Calditrichaeota sequences were found down
to 141 mbsf (Figure 2B, Supplemental Table ST2).
We next examined what physiological features might be common across the phylum (see
Supplemental Table ST3 for an overview of inferred physiological characteristics). The three
cultured representatives of the Calditrichaeota, Caldithrix abyssi, Caldithrix palaeochoryensis,
and Calorithrix insularis, share certain features: they are all anaerobic thermophiles capable of
chemoorganoheterotrophic growth, in particular the fermentation of proteinaceous
compounds and the production of H2. All of the seven Calditrichaeota genomes have genes
encoding extracellular peptidases (Figure 3, Supplemental Table ST4), with such genes
constituting 8-75% (median 29%) of their complement of genes that encode proteins with a
predicted extracellular localization. The most common extracellular peptidase family is the
S8A (subtilisin) family, involved in the extracellular breakdown of proteins for use as a
substrate by the cell (Rawlings et al., 2014). All genomes also contain genes encoding proteins
for peptide or amino acid uptake (Figure 3, Supplemental Table ST6). Six of the seven
Calditrichaeota genomes have genes encoding NiFe hydrogenases, suggesting a capacity for
either H2 production during fermentation or the use of H2 as an electron donor. The presence
of genes encoding extracellular peptidases and amino acid/oligopeptide transport in all seven
Calditrichaeota genomes, and the presence of hydrogenases in six of the genomes, suggest
that the ability to ferment proteinaceous substrates and produce H2 is a broad defining
feature of the phylum Calditrichaeota beyond the two genera isolated to date.
Genes encoding cytochrome c oxidase were present in four of the genomes, including that of
the strictly anaerobic C. abyssi. This suggests that these enzymes could be involved in either
protection from oxidation by O2, as has been observed in other strict anaerobes (Lamrabet et
al., 2011; Ramel et al., 2013; 2015), or in aerobic metabolism. Genes encoding catalase (2/7
genomes), peroxidase (4/7 genomes), and superoxide dismutase (4/7 genomes) were also
widespread, with all but one genome (AABM5.125.24) encoding some mechanism for O2
protection (Supplemental Table ST3). An O2-tolerant or aerobic lifestyle is consistent with the
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
highest abundances of Calditrichaeota found in sediment at depths shallower than 1 mbsf
(Figure 2B), as O2 can penetrate surficial sediment by bioturbation (Kristensen et al., 2012).
Interestingly, the only genome in our analysis without any genes encoding O2-protection
enzymes or cytochrome C oxidase was from the deepest marine sediment depth that
environmental genomes were obtained from (1.25 mbsf, Table 1) and therefore the least
likely to be exposed to O2.
There are also certain characteristics that differentiate the three cultivated Calditrichaeota,
and the other genomes in this phylum reflect this metabolic diversity. Only C. abyssi and C.
insularis are capable of respiratory growth with nitrate as electron acceptor and
chemolithoheterotrophic growth on H2, while only C. palaeochoryensis can ferment
polysaccharides (and produce a wider range of fermentation products) (Miroshnichenko et
al., 2010; Kompantseva et al., 2016). A gene encoding nitrate reductase was only identified in
C. abyssi, suggesting that nitrate reduction is not necessarily a widespread trait within
Calditrichaeota, although a nitrite reductase was also identified in Caldithrix sp. RBG 13_44_9
(Supplemental Table ST3). Extracellular glycoside hydrolases were identified in four of the
seven genomes (Supplemental Table ST5).
This study has revealed the heretofore hidden extent of the novel phylum Calditrichaeota in
marine sediment by critical re-analysis of previously sequenced genomes and 16S rRNA
sequences from uncultivated bacteria. A re-classification of these databases is necessary for
accurate taxonomic assignment of metagenomic and metataxonomic sequence data in the
future. Our re-examination of 16S rRNA gene surveys from marine sediment shows that this
phylum is surprisingly abundant and globally distributed, and therefore likely an
unrecognised but significant player in geochemical element cycling. Based on the genomes
and isolates obtained so far, members of the phylum are likely to be O2-tolerant, protein-
fermenting anaerobes, but further cultivation-dependent and independent studies will be
necessary to corroborate these initial conclusions.
Accession Numbers
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
Whole Genome Shotgun projects have been deposited at DDBJ/ENA/GenBank under the
accession numbers MWJR00000000 (AABM5.125.24) and MWJS00000000 (AABM5.25.91).
The versions described in this paper are versions MWJR01000000 and MWJS01000000.
Table Caption
Table 1
Summary of genome statistics for genomes belonging to Calditrichaeota examined in this
study. Completeness statistics were generated using CheckM (Parks et al., 2015).
Figure Captions
Figure 1
Phylogenetic trees showing the candidate phylum Calditrichaeota. (A) PHYML maximum
likelihood tree based on 16S rRNA gene sequences. 100X bootstrapped - branch labels show
bootstrap percentages. Green-labeled nodes indicate sequences obtained from metagenome-
derived genome bins and single-cell genomes. (B) IQ-TREE phylogenetic tree based on a
concatenated alignment of 17 single-copy orthologous protein sequences taken from
sequenced genomes (preprotein translocase subunit SecY, 30S ribosomal proteins S13, S7,
S12, S10, S19, S3, S8, 50S ribosomal proteins L1, L2, L3, L5, L6, L11, L14, L16, and L27). The
tree is 1000X bootstrapped with branch labels showing bootstrap percentages. Numbers in
each grouped clade show the number of genomes within that clade. Labels in blue indicate
incomplete genomes without a gene encoding 16S rRNA. Currently genome sequences are not
available for Caldithrix palaeochoryensis and Calorithrix insularis and these species are
therefore not included in the tree. All sequences in red clades or with red-coloured text are
75% or more identical to the Caldithrix abyssi sequence. Clades coloured pink contain some
sequences greater than 75% identity and some sequences less than 75% identity (the number
of sequences >75% identity are shown coloured red in brackets). Sequences marked with
black-coloured text are less than 75% identical to Caldithrix abyssi.
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
Figure 2
Analysis of published 16S rRNA amplicon datasets from marine and brackish sediment. (A)
World map showing sampling locations and Calditrichaeota percentage for the sample
(depth) with the highest value. At some locations, several independent cores were analysed –
215
216
217
218
219
for exact coordinates see Supplemental Table ST2. (B) Calditrichaeota percentage plotted
against depth below seafloor for all datasets shown in A.
Figure 3
Protein degradation and uptake enzymes inferred from genomes in this study. Coloured
shapes show predicted proteins or protein complexes. Numbers adjacent to each protein
show the number of genomes (out of seven total genomes analysed) with genes encoding the
predicted protein or protein complex. Yellow indicates serine peptidases, green indicates
metallo peptidases, and red indicates cysteine peptidases. Orange indicates amino acid
transporters, turquoise indicates oligopeptide transporters, and blue indicates ABC
220
221
222
223
224
225
226
227
228
229
transporters. “Sec” indicates the Sec secretion system. See supplemental tables ST4 and ST6
for details.
Acknowledgments
All sequencing was performed by the National Genomics Infrastructure sequencing platforms
at the Science for Life Laboratory at Uppsala University, a national infrastructure supported
by the Swedish Research Council (VR-RFI) and the Knut and Alice Wallenberg Foundation.
The work was supported by the Danish National Research Foundation, by grants of the
230
231
232
233
234
235
236
237
Swedish Research Council and the Swedish Foundation for Strategic Research (grant no. 621-
2009-4813 and SSF-FFL5, respectively, both awarded to TJGE), and by grants from the
European Union Seventh Framework Program: an ERC Advanced Grant, MICROENERGY
[project no. 294200], an ERC Starting grant, PUZZLE_CELL [project no. 310039 to TJGE], and a
Marie Curie International Incoming Fellowship awarded to IPGM [ATP_adapt_energy]. We
would like to thank Susanne Nielsen, Trine Bech Søgaard and Lina Juzokaite for technical
assistance in the lab.
Conflict of Interest
The authors declare no conflict of interest.
References
Anantharaman, K., Brown, C.T., Hug, L.A., Sharon, I., Castelle, C.J., Probst, A.J., et al. (2016) Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Nat Comms 7: 13219.
Baker, B.J., Lazar, C.S., Teske, A.P., and Dick, G.J. (2015) Genomic resolution of linkages in carbon, nitrogen, and sulfur cycling among widespread estuary sediment bacteria. Microbiome 3: 14.
Kompantseva, E.I., Kublanov, I.V., Perevalova, A.A., Chernyh, N.A., Toshchakov, S.V., Litti, Y.V., et al. (2016) Calorithrix insularis gen. nov., sp. nov., a novel representative of the phylum Calditrichaeota. Int J Syst Evol Microbiol.
Kristensen, E., Penha-Lopes, G., Delefosse, M., Valdemarsen, T., Quintana, C.O., and Banta, G.T. (2012) What is bioturbation? The need for a precise definition for fauna in aquatic sciences. Marine Ecology Progress Series 446: 285–302.
Kublanov, I.V., Sigalova, O., Gavrilov, S.N., Lebedinsky, A.V., Rinke, C., Kovaleva, O., et al. (2017) Genomic analysis of Caldithrix abyssi, the thermophilic anaerobic bacterium of the novel bacterial phylum Calditrichaeota. Front. Microbio. 1–41.
Lamrabet, O., Pieulle, L., Aubert, C., Mouhamar, F., Stocker, P., Dolla, A., and Brasseur, G. (2011) Oxygen reduction in the strict anaerobe Desulfovibrio vulgaris Hildenborough: characterization of two membrane-bound oxygen reductases. Microbiology 157: 2720–2732.
Mahmoudi, N., Robeson, M.S., Castro, H.F., Fortney, J.L., Techtmann, S.M., Joyner, D.C., et al. (2015) Microbial community composition and diversity in Caspian Sea sediments. FEMS Microbiol Ecol 91: 1–11.
Miroshnichenko, M.L., Kolganova, T.V., Spring, S., Chernyh, N., and Bonch-Osmolovskaya, E.A. (2010) Caldithrix palaeochoryensis sp. nov., a thermophilic, anaerobic, chemo-
238
239
240
241
242
243
244
245
246
247
248
249
250251252253254255256257258259260261262263264265266267268269270271272273
organotrophic bacterium from a geothermally heated sediment, and emended description of the genus Caldithrix. Int J Syst Evol Microbiol 60: 2120–2123.
Miroshnichenko, M.L., Kostrikina, N.A., Chernyh, N.A., Pimenov, N.V., Tourova, T.P., Antipov, A.N., et al. (2003) Caldithrix abyssi gen. nov., sp. nov., a nitrate-reducing, thermophilic, anaerobic bacterium isolated from a Mid-Atlantic Ridge hydrothermal vent, represents a novel bacterial lineage. Int J Syst Evol Microbiol 53: 323–329.
Parks, D.H., Imelfort, M., Skennerton, C.T., Hugenholtz, P., and Tyson, G.W. (2015) CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25: 1043–1055.
Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., Schweer, T., Yarza, P., et al. (2013) The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 41: D590–6.
Ramel, F., Amrani, A., Pieulle, L., Lamrabet, O., Voordouw, G., Seddiki, N., et al. (2013) Membrane-bound oxygen reductases of the anaerobic sulfate-reducing Desulfovibrio vulgaris Hildenborough: roles in oxygen defence and electron link with periplasmic hydrogen oxidation. Microbiology 159: 2663–2673.
Ramel, F., Brasseur, G., Pieulle, L., Valette, O., Hirschler-Réa, A., Fardeau, M.-L., and Dolla, A. (2015) Growth of the obligate anaerobe Desulfovibrio vulgaris Hildenborough under continuous low oxygen concentration sparging: impact of the membrane-bound oxygen reductases. PLoS ONE 10: e0123455.
Rawlings, N.D., Waller, M., Barrett, A.J., and Bateman, A. (2014) MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res 42: D503–9.
Yarza, P., Yilmaz, P., Pruesse, E., Glöckner, F.O., Ludwig, W., Schleifer, K.-H., et al. (2014) Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences. Nat Rev Microbiol 12: 635–645.
274275276277278279280281282283284285286287288289290291292293294295296297298299