The University of Manchester Research
Sputum microbiome temporal variability and dysbiosis in chronic obstructive pulmonary disease exacerbations: an analysis of the COPDMAP study
DOI: 10.1136/thoraxjnl-2017-210741
Document Version: Accepted author manuscript
Link to publication record in Manchester Research Explorer
Citation for published version (APA): Kolsum, U., Singh, D., & COPDMAP (2017). Sputum microbiome temporal variability and dysbiosis in chronic obstructive pulmonary disease exacerbations: an analysis of the COPDMAP study. Thorax. https://doi.org/10.1136/thoraxjnl-2017-210741
Published in: Thorax
Sputum microbiome temporal variability and dysbiosis in chronic obstructive pulmonary disease exacerbations: an analysis of the COPDMAP study

Zhang Wang1†, Richa Singh4†, Bruce E. Miller2, Ruth Tal-Singer2, Stephanie Van Horn3, Lynn Tomsho3, Alexander Mackay4, James P. Allinson4, Adam J. Webb5, Anthony J. Brookes5, Leena M. George6, Bethan Barker6, Umme Kolsum7, Louise E Donnelly4, Kylie Belchamber4, Peter J. Barnes4, Dave Singh7, Christopher E. Brightling6, Gavin C. Donaldson4, Jadwiga A. Wedzicha4, James R. Brown1* on behalf of COPDMAP

1 Computational Biology, Target Sciences, Research and Development (R&D), GlaxoSmithKline (GSK), Collegeville, PA 19426, USA
2 Respiratory Therapy Area Unit, R&D, GSK, King of Prussia, PA 19406, USA
3 Target and Pathway Validation, Target Sciences, R&D, GSK, Collegeville, PA 19426, USA
4 National Heart and Lung Institute, Imperial College London, London, SW3 6NP, UK
5 Department of Genetics, University of Leicester, Leicester, LE1 7RH, UK
6 Institute for Lung Health, University of Leicester, Leicester, LE3 9QP, UK
7 University of Manchester and University Hospital of South Manchester, Manchester, M23 9QZ, UK

† These authors contributed equally to the manuscript.

* Correspondence to:
James R. Brown ([email protected])
1250 S. Collegeville Road, Collegeville, Pennsylvania, 19426-0989, United States
Mobile: +16102478580
Tel: +16109176374

Word count abstract: 240
Word count text body: 3500

Abstract
Background
Recent studies suggest that lung microbiome dysbiosis, the disease-associated disruption of the lung microbial community, might play a key role in chronic obstructive pulmonary disease (COPD) exacerbations. However, characterizing temporal variability of the microbiome from large longitudinal COPD cohorts is needed to better understand this phenomenon.

Methods
We performed a 16S ribosomal RNA survey of the microbiome on 716 sputum samples collected longitudinally at baseline and exacerbations from 281 COPD subjects at three UK clinical centers as part of the COPDMAP consortium.

Results
The microbiome composition was similar among centers and between stable state and exacerbations, except for a small but significant decrease of Veillonella at exacerbations. The abundance of Moraxella was negatively associated with bacterial alpha diversity. Microbiomes were distinct between exacerbations associated with bacteria versus eosinophilic airway inflammation. Dysbiosis at exacerbations, measured as significant within-subject deviation of microbial composition relative to baseline, was present in 41% of exacerbations. Dysbiosis was associated with increased exacerbation severity, indicated by a greater fall in FEV1 and FVC and a greater increase in CAT score, particularly in exacerbations with concurrent eosinophilic inflammation. There was a significant difference in temporal variability of microbial alpha and beta diversity among centers. The variation of beta diversity was significantly lower in subjects with frequent exacerbations.

Conclusions
Microbial dysbiosis is a feature of some exacerbations and its presence, especially in concert with eosinophilic inflammation, is associated with more severe exacerbations indicated by a greater fall in lung function.

Key messages:
What is the key question?
How does the lung microbial community vary over time within COPD subjects and how is microbial dysbiosis in exacerbations implicated in disease characteristics?

What is the bottom line?
Dysbiosis of the sputum microbiome in COPD exacerbations, particularly in concert with eosinophilic inflammation, is associated with a greater decline in lung capacity during the exacerbation event.

Why read on?
The presented study comprises the largest COPD sputum microbiome cohort to date, with multiple study centers, and aims at an in-depth examination of microbial temporal variability, dysbiosis, and disease phenotypes.
Introduction
Chronic obstructive pulmonary disease (COPD) is characterized by persistent symptoms and impaired lung function as a consequence of small airway obliteration and alveolar destruction, and is associated with chronic lung inflammation 1-3. Acute exacerbations of COPD are a sudden onset of sustained worsening of these symptoms. Bacteria potentially play a key role in COPD pathogenesis 4 5, with respiratory bacterial pathogens such as Haemophilus influenzae, Moraxella catarrhalis and Streptococcus pneumoniae capable of driving host inflammatory responses 6-9. Since bacteria frequently interact with each other and respond to altered environmental conditions, the consortium of the lung microbial community, known as the lung microbiome, could be important in the crosstalk between respiratory tract pathogens and host response 10 11.

Emerging studies collectively suggest that the lung microbiome differs between stable state and exacerbations in COPD (11-15, for review see 16). For example, Molyneaux et al. found an increased representation of pathogenic Proteobacteria, in particular Haemophilus, in exacerbations following rhinovirus infection 12. Huang et al. observed an increase of Proteobacteria during exacerbations with a predicted loss of function in maintenance of microbial homeostasis 13. Recently, several of us published a longitudinal analysis of the sputum microbiome from 87 subjects from the BEAT-COPD cohort 11. Our analysis revealed an increase of Proteobacteria relative to Firmicutes during exacerbations. In addition, we found distinct microbiome composition between bacterial and eosinophilic exacerbations. In light of the heterogeneous nature of COPD exacerbations, the lung microbiome has potential as a biomarker to assist in precision medicine treatments for specific COPD patient subpopulations.

Although insightful, results from these previous studies have limitations in terms of understanding microbial dysbiosis during exacerbations, as most of the studies comparing the microbiome at stable state and exacerbations involved only a single sampling point of each state. The lung microbiome is temporally dynamic and can vary even in stable state 10. Thus the microbial changes during exacerbations are a mixture of both the disease-associated disruption of the microbial community, or dysbiosis, and the regular temporal perturbations of the lung microbial composition. Therefore, examining the baseline variation of the lung microbiome is an important first step to more precisely assess the extent of microbial dysbiosis during exacerbations. In addition, understanding temporal variability of the lung microbiome within individuals is important for understanding the disease itself. Disruption of the temporal balance of the microbial ecosystem in the respiratory tract could trigger a dysregulated host immune response that results in negative effects on host biology 10. Linking microbial temporal variation to disease characteristics and host inflammatory profiles could potentially lead to monitoring and manipulating the stability of airway microbial composition as a therapeutic strategy for COPD.

A finer-grained longitudinal sampling of the microbiome at multiple stable and exacerbation visits is necessary to quantitatively measure temporal variability of the microbiome and assess the significance of microbial dysbiosis during exacerbations. Here we describe a longitudinal 16S ribosomal RNA (rRNA) gene based microbiome survey on 716 sputum samples collected sequentially at baseline and exacerbations over a period of up to two years from 281 COPD subjects at three UK centers as part of the COPDMAP consortium. This represents one of the largest COPD sputum microbiome cohorts to date aimed at in-depth examination of temporal variability of the microbiome. We provide new insights into temporal changes of the microbiome and its potential implication in disease progression.

Materials and Methods
Subjects and samples
Full information on subject inclusion/exclusion criteria, sputum sample collection, microbiome and statistical analyses is provided in the online supplementary appendix. Briefly, sputum samples were collected at multiple longitudinal baseline and exacerbation visits from COPD subjects at three clinical centers, Imperial College London, University of Leicester and University Hospital of South Manchester (hereafter referred to as London, Leicester and Manchester, respectively) as part of the COPDMAP consortium (www.copdmap.org). All sputum samples were immediately stored at -80°C and shipped frozen in batches for analysis. Exacerbations were treated with corticosteroids and antibiotics according to guidelines 17. The protocol summary is available at https://clinicaltrials.gov/ (Identifier: NCT01620645).

Microbiome analysis
For quality control purposes, all DNA extractions, sequencing and data analyses were performed in a single, centralized lab at the GSK R&D facility in Collegeville, Pennsylvania, USA. Bacterial genomic DNA was extracted from frozen sputum samples using the Qiagen DNA Mini kit (Qiagen, CA, USA) as per the manufacturer's protocol. The V4 hypervariable region of the 16S rRNA gene was PCR amplified and sequenced on a multiplexed Illumina MiSeq platform with the proper controls against reagent contamination as described previously 11. Sequencing reads were processed using the QIIME pipeline version 1.9 18. The default set of criteria was used to remove low quality and chimeric reads. The remaining reads were subject to closed-reference OTU picking (97% identity cutoff). Sequence data are deposited at the National Center for Biotechnology Information Sequence Read Archive (SRP102480).

Statistical analysis
Exacerbation phenotypes were defined using microbiological and clinical criteria as established previously 12. Phenotypes of 146 exacerbation samples were undetermined due to missing data. Partial Least Squares Discriminant Analysis (PLS-DA) was performed on exacerbation phenotypes and microbiome and/or clinical data using SIMCA-P (Umetrics, Stockholm, Sweden) 19. Dysbiosis at exacerbations was measured as the deviation (Z-score) of the first Principal Coordinate (PC1) of the weighted UniFrac distance for exacerbation samples relative to all baseline PC1s from the same subject. Temporal variability of microbial alpha and beta diversity was measured using the metrics described by Flores et al. 20. A general linear model (GLM) was constructed between demographic and baseline clinical variables and temporal variability of alpha and beta diversity among subjects. The model was optimized with a stepwise algorithm using the "step" function in the R stats package 21. The false discovery rate (FDR) method was used to adjust P-values for multiple testing wherever applicable 22.
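
As a minimal sketch of the dysbiosis measure described above, the following Python snippet computes a Z-score for one exacerbation sample's PC1 value against the baseline PC1 values of the same subject. The function names, the use of the sample standard deviation (ddof=1) and the example numbers are assumptions for illustration only; the |Z| > 2 cut-off and the requirement of at least two baseline samples follow the criteria reported in the Results, and the exact computation is described only in the online supplementary appendix.

```python
import numpy as np


def dysbiosis_z_score(baseline_pc1, exacerbation_pc1):
    """Z-score of one exacerbation PC1 value relative to a subject's baseline PC1 values.

    Assumption: the spread of the baseline values is estimated with the
    sample standard deviation (ddof=1); the manuscript does not state this.
    """
    baseline = np.asarray(baseline_pc1, dtype=float)
    if baseline.size < 2:
        raise ValueError("at least two baseline samples are required")
    sd = baseline.std(ddof=1)
    if sd == 0:
        raise ValueError("baseline PC1 values are identical; Z-score is undefined")
    return (float(exacerbation_pc1) - baseline.mean()) / sd


def is_dysbiotic(z_score, cutoff=2.0):
    """Flag an exacerbation as dysbiotic when |Z| exceeds the cut-off."""
    return abs(z_score) > cutoff


# Hypothetical PC1 values for one subject (not data from the study).
z = dysbiosis_z_score([0.1, 0.2, 0.3], 0.5)
print(round(z, 2), is_dysbiotic(z))
```

Because the Z-score is computed within each subject, an exacerbation is only called dysbiotic when it departs from that subject's own baseline spread, not from a cohort-wide average.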

Results
Overview of the COPDMAP sputum microbiome
Microbial composition was determined for 716 sputum samples collected at baseline and exacerbations from 281 COPDMAP subjects at three centers. The number of samples varied from one to nine per subject (Fig. S1). Demographic and baseline clinical data were recorded for subjects at initiation of sample collection (Table 1, Table S1). A set of 16 clinical and biochemical characteristics was further collected longitudinally (Table 2, Table S2). From DNA sequences of the V4 hypervariable region of the 16S rRNA gene, a total of 3,784 operational taxonomic units (OTUs) were identified using a 97% identity cut-off after rarefaction.

Table 1. Major demographic and baseline clinical features of all subjects.
Demographic and baseline features | All subjects (N=281) *
Gender † | Male: 187 (70.3%), Female: 79 (29.7%)
Age ‡ | 70 (8.1)
BMI | 27.2 (5.4)
Baseline GOLD status | 1: 30 (11.4%), 2: 132 (50.2%), 3: 78 (29.7%), 4: 23 (8.7%)
Treatment # | Antibiotics: 38 (15.3%), Steroids: 9 (3.6%), Both: 202 (81.1%)
Number of cigarette packs per year ¹ | 47 (30)
Number of exacerbations per year ¹ | 1.1 (1.6)
Baseline FEV1 | 1.5 (0.6)
Baseline FEV1% | 56.3 (18.9)
Baseline FEV1 predicted | 2.6 (0.5)
Baseline FVC | 2.9 (1.0)
Baseline FEV1/FVC ratio | 0.5 (0.1)
CAT score | 18.7 (7.3)
CES-D score ¹ | 10 (13)
SGRQ total score | 47.4 (18.2)
† Categorical data are presented as number (proportion).
‡ Continuous data are presented as mean (SD) unless stated below.
¹ Median (interquartile range).
* 15 subjects were missing any demographic or clinical data.
# The numbers represent exacerbation events, and thus include subjects with more than one exacerbation.

Table 2. Major longitudinal clinical features at baseline and exacerbations of all samples.
Longitudinal features | All samples (N=716) | Baseline visits (N=446) | Exacerbation visits (N=270) | P-value ‡
FEV1 † | 1.4 (0.5) | 1.5 (0.5) | 1.2 (0.5) | <0.001
FVC | 2.8 (0.9) | 3.0 (0.8) | 2.5 (0.9) | <0.001
FEV1/FVC ratio | 0.5 (0.2) | 0.5 (0.2) | 0.5 (0.2) | 0.26
CAT score | 21.1 (7.4) | 19.6 (7.1) | 24.2 (7.0) | <0.001
C-reactive protein (CRP) ¹ | 5.0 (11.0) | 3.0 (5.0) | 10.0 (27.0) | <0.001 ²
Blood neutrophil count (×10^9 cells/L) | 5.5 (2.3) | 4.9 (1.7) | 6.2 (2.7) | <0.001
Blood lymphocyte count (×10^9 cells/L) | 1.8 (0.7) | 1.8 (0.6) | 1.8 (0.7) | 0.49
Blood monocyte count (×10^9 cells/L) | 0.7 (0.3) | 0.6 (0.2) | 0.7 (0.3) | <0.001
Blood eosinophil count (×10^9 cells/L) ¹ | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.18 ²
Blood basophil count (×10^9 cells/L) | 0.0 (0.0) | 0.1 (0.0) | 0.0 (0.0) | 0.01
Sputum neutrophil count % ¹ | 78.8 (33.8) | 75.1 (34.0) | 84.2 (28.5) | <0.001 ²
Sputum lymphocyte count % ¹ | 0.0 (0.5) | 0.0 (0.3) | 0.2 (1.0) | 0.028 ²
Sputum eosinophil count % ¹ | 0.8 (2.0) | 0.8 (2.2) | 0.5 (2.0) | 0.07 ²
Sputum macrophage count % ¹ | 13.0 (21.2) | 14.5 (23.2) | 8.5 (19.0) | <0.001 ²
Sputum epithelial count % ¹ | 3.2 (8.0) | 4.0 (9.8) | 2.0 (4.8) | <0.001 ²
† Data are presented as mean (SD) unless stated below.
¹ Median (interquartile range).
‡ P-value was calculated for the comparison of baseline and exacerbations using the T-test unless stated below.
² Mann-Whitney-Wilcoxon Test.

Similar to other sputum or lung microbiome studies 11-15 23-26, the vast majority of OTUs belonged to Proteobacteria (52.3%), Firmicutes (28.7%), Bacteroidetes (15.0%) and Actinobacteria (1.9%) at the phylum level (Table S3, Fig. S2). At the genus level, Haemophilus (25.8%) was most abundant across all samples, followed by Veillonella (15.8%) and Prevotella (13.2%). Other common genera in the airway such as Streptococcus (4.4%) and Moraxella (4.0%) were also among the most abundant genera identified. As a quality control for sample processing and sequence analyses, an additional sputum aliquot was collected as a duplicate for 11 samples, from the same subject at the same visit. All duplicates had low UniFrac distances and were highly similar in microbial composition (Fig. S3).

Overall, the microbiome composition was similar between baseline and exacerbation samples, with a small but significant decrease of Veillonella at exacerbations (repeated measures ANOVA, FDR-adjusted (adj.) P=0.042) (Fig. 1A). The microbiome composition was similar among centers, with a significantly higher alpha diversity in the London cohort (Fig. S4A). Within each center, there was a significant decrease of alpha diversity (Shannon, repeated measures ANOVA, P=1.1e-4) and increase of Moraxella at Leicester (Fig. S4B). A strong negative correlation was found between the abundance of Moraxella and alpha diversity for all samples (Shannon, R=-0.445, adj. P<9.6e-14, Fig. 1B).

As previously observed 11, distinct microbial populations were found in bacterial and eosinophilic exacerbations, with a significantly decreased alpha diversity (Shannon, T-test P=0.008) and a significantly increased proportion of Proteobacteria (T-test, adj. P=0.001) versus Bacteroidetes (T-test, adj. P=0.002) in bacterial exacerbations compared to eosinophilic exacerbations (Fig. 1C-D, Fig. S5A). According to PLS-DA, combining the clinical and microbiome datasets improved prediction of the two phenotypes compared with using the clinical data only (Fig. S5B). Within individual centers, this trend was more pronounced for Leicester samples than for those of London or Manchester (Fig. S5C).

We performed multivariate analysis to identify clinical factors significantly associated with microbial alpha and beta diversity. Among all clinical variables, C-reactive protein (CRP), a known inflammatory marker for COPD prognosis 27, was the most significant factor correlated with both alpha diversity (Shannon, P<0.01, Fig. S6) and beta diversity at the phylum level among all samples (Table S4). No factors significantly predicted variation at the genus and OTU levels. CRP was also significantly associated with alpha and beta diversity of the predicted functional profiles of the sputum microbiome using the software PICRUSt 28 (Table S5).

Increased disease severity in exacerbations with dysbiosis
Longitudinal sampling of a large cohort over a two-year period enables temporal variability and dysbiosis of the sputum microbiome to be quantitatively measured. To explore variation of the sputum microbiome over time, we plotted the first Principal Coordinate (PC1) of the weighted UniFrac distance for all samples within each subject as a proxy for their microbial compositions, as it explains 49.0% of the total beta diversity (Fig. 2A). Only subjects with at least two baseline samples and one exacerbation sample were included. Visual inspection of the plot revealed a deviation of PC1 for many exacerbation samples relative to baseline samples from the same subject (Fig. 2A, Fig. S7A), indicating that specific exacerbations were particularly susceptible to alteration of microbial composition, or dysbiosis. In comparison, the sputum microbiome was much less variable among baseline samples. This is supported by a significantly increased within-subject standard deviation of PC1 (paired T-test, P=6.7e-4) when combining baseline and exacerbation samples compared to baseline samples only, with the most profound changes at the Leicester center (Fig. S7B).

Having assessed temporal variability of the sputum microbiome at baseline, we measured the dysbiosis of each exacerbation as a Z-score that quantifies how much its PC1 deviated from all baseline PC1s from the same individual. A total of 49 exacerbations (out of 119 exacerbations with a Z-score, 41.2%) were identified as being in a significant dysbiosis state, with an absolute Z-score greater than 2 (P<0.05, Fig. 2A, Fig. S7C). In most of these exacerbations, the sputum microbiome shifted from a balanced composition to a more biased one dominated by one or a few taxa, with a decreased alpha diversity (Fig. S8). The bacterial genera Veillonella, Cronobacter and Haemophilus were among the key taxa associated with the dysbiosis (Fig. S9). Across all exacerbation subtypes, bacterial exacerbations had the highest number of dysbiosis events (Fig. S7C), with the caveat that the phenotype could not be defined for 21 of the 49 exacerbations due to missing data.

For exacerbations with or without significant dysbiosis, we compared the exacerbation severity determined by change in lung function and symptoms relative to the last baseline measurement. We found a non-significantly greater decrease in FEV1 and FVC and a greater increase in CAT score for exacerbations with dysbiosis compared to those without (Fig. 2B). Such trends were overall consistent within each center, except for a reversed trend of FVC in Manchester, which has a smaller sample size (Fig. S10A). Also, the exacerbation Z-score was positively correlated, albeit non-significantly, with changes of FEV1 and FVC, and negatively correlated with change of CAT score (Fig. S10B), suggesting that the more dysbiotic the exacerbation was, the more severe the clinical outcome could possibly be. As eosinophil abundance is another important factor for COPD exacerbations, we reclassified exacerbations according to both the dysbiosis and blood eosinophil indices. Doing so revealed four subgroups of exacerbations in which dysbiosis and/or a high blood eosinophil level (>3 × 10^8 cells/L) is the predominant feature. Exacerbations with both dysbiosis and a high eosinophil level had the greatest changes of FEV1 (statistically significant, ANOVA P=0.02), FVC and CAT score, whereas exacerbations with neither dysbiosis nor a high eosinophil level were associated with the smallest such changes (Fig. 2B).
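
The four-way reclassification described above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the function name and subgroup labels are hypothetical, and the only values taken from the text are the |Z| > 2 dysbiosis criterion and the blood eosinophil cut-off of 3 × 10^8 cells/L (0.3 when expressed in units of 10^9 cells/L, as in Table 2).

```python
def classify_exacerbation(z_score, blood_eos, z_cutoff=2.0, eos_cutoff=0.3):
    """Assign an exacerbation to one of four dysbiosis/eosinophil subgroups.

    z_score   : within-subject dysbiosis Z-score of the exacerbation
    blood_eos : blood eosinophil count in units of 10^9 cells/L
                (so 0.3 corresponds to 3 x 10^8 cells/L)
    The returned labels are hypothetical names chosen for this sketch.
    """
    dysbiosis = abs(z_score) > z_cutoff
    high_eos = blood_eos > eos_cutoff
    if dysbiosis and high_eos:
        return "dysbiosis and high eosinophils"
    if dysbiosis:
        return "dysbiosis only"
    if high_eos:
        return "high eosinophils only"
    return "neither"


# Hypothetical exacerbation records: (Z-score, blood eosinophils in 10^9 cells/L).
for z, eos in [(3.1, 0.45), (-2.4, 0.10), (0.8, 0.50), (1.2, 0.20)]:
    print(z, eos, "->", classify_exacerbation(z, eos))
```

Severity measures such as the change in FEV1, FVC and CAT score can then be compared across the four resulting labels.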

Exacerbation frequency associated with temporal variability of the sputum microbiome
We next sought to quantify temporal variability of the sputum microbiome within subjects using the metrics described by Flores et al. 20. Only subjects with at least three samples were included. The variability of microbial alpha diversity was defined as the coefficient of variation of the Shannon index for samples within each subject. The variability of beta diversity was calculated as the median of pairwise UniFrac distances for samples within each subject. A wide range of temporal variability of alpha and beta diversity was observed across subjects (Fig. 3A). We noted that there was a significantly lower variation of alpha and beta diversity among London subjects than among Leicester or Manchester subjects (T-test, P<0.001). As expected, the variation of both alpha and beta diversity was significantly higher in subjects with dysbiosis exacerbations than in those without (T-test, P<0.01).
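
The two within-subject variability metrics defined above can be sketched in Python as follows. The function names, the dictionary-based distance lookup and the example values are assumptions made for illustration; the definitions (coefficient of variation of Shannon diversity, median of pairwise UniFrac distances, at least three samples per subject) are taken from the text, while the sample standard deviation is assumed for the coefficient of variation.

```python
from itertools import combinations
from statistics import mean, median, stdev


def alpha_variability(shannon_values):
    """Coefficient of variation (SD / mean) of Shannon diversity within one subject.

    Assumption: the sample standard deviation is used.
    """
    if len(shannon_values) < 3:
        raise ValueError("at least three samples per subject are required")
    return stdev(shannon_values) / mean(shannon_values)


def beta_variability(sample_ids, distances):
    """Median of all pairwise distances among one subject's samples.

    distances maps frozenset({id_a, id_b}) to the UniFrac distance between
    the two samples (a hypothetical layout chosen for this sketch).
    """
    if len(sample_ids) < 3:
        raise ValueError("at least three samples per subject are required")
    return median(distances[frozenset(pair)] for pair in combinations(sample_ids, 2))


# Hypothetical values for one subject with three samples.
shannon = [2.0, 3.0, 4.0]
unifrac = {
    frozenset({"v1", "v2"}): 0.2,
    frozenset({"v1", "v3"}): 0.4,
    frozenset({"v2", "v3"}): 0.6,
}
print(alpha_variability(shannon), beta_variability(["v1", "v2", "v3"], unifrac))
```

Using the median rather than the mean of pairwise distances keeps a single outlying visit from dominating a subject's beta diversity variability.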

We constructed a generalized linear model (GLM) to look for clinical characteristics associated with temporal variability of the sputum microbiome. A set of 14 demographic and baseline clinical variables was included for each subject. Center, FEV1/FVC ratio and number of exacerbations per year (prior to the sampling visits) were significant factors for the variation of alpha diversity across all subjects (Table 3). When the GLM was reconstructed for each center, number of exacerbations per year was significant for London and Leicester subjects. In addition, center and number of exacerbations per year were significantly associated with beta diversity variation across all subjects (Table 3). A continuous decreasing trend in the number of exacerbations per year was observed toward subjects with greater variation of beta diversity (Fig. 3B). Likewise, a continuous decreasing trend in beta diversity variation was observed toward subjects with a higher number of exacerbations per year (ANOVA, P<0.05, Fig. 3C). Similar trends were observed within each center (Fig. S11) and for temporal variability of baseline microbiomes only (Fig. S12).

Table 3. List of demographic and baseline clinical variables significantly associated with temporal variability of microbial alpha and beta diversity among subjects. P-values are indicated for significant variables in the model.
Temporal variability | Alpha diversity (Shannon) | | | | Beta diversity (Weighted UniFrac distance) | | |
Variable | All | London | Leicester | Manchester | All | London | Leicester | Manchester
Number of exacerbations per year | 0.01 | 0.01 | 3E-4 | # | 0.01 | 0.03 | # | #
BMI | 0.11 ‡ | # | 0.01 | # | # | # | 0.04 | #
CES-D score | # | 0.03 | # | # | # | # | # | 0.01
Packs of cigarettes per year | # | # | # | # | 0.07 ‡ | # | 0.02 | #
FEV1 | 0.03 | # | # | # | # | # | 0.17 ‡ | #
FEV1/FVC ratio | # | # | 0.01 | # | # | # | # | #
Age | # | 0.03 | # | # | # | # | # | #
SGRQ total score | # | # | # | # | # | # | # | 0.03
Center | 2.6E-12 | NA | NA | NA | 1.8E-10 | NA | NA | NA
# P≥0.05 and absent in the model.
‡ Variables not statistically significant but present in the model.
Discussion
Culture-independent analyses have uncovered a previously unappreciated complexity of the lung bacterial community that has reshaped our understanding of COPD etiology 11-15 23-26. Our study reveals a diverse sputum microbiome among the COPDMAP subjects and further validates the association of the microbiome with specific exacerbation phenotypes. We also characterize in depth the temporal variation of the sputum microbial community within subjects and identify potential new relationships between microbiome variation and patient disease progression.

One advantage of our study is the longitudinal sampling at multiple baseline and exacerbation visits, compared to most previous studies in which a single snapshot of exacerbations was taken. It is well appreciated that the lung microbiome is inherently variable, shaped by the balance of ecological factors like microbial immigration and elimination 10. During exacerbations, the balance goes awry, with a dysregulated host immune response and inflammation leading to further microbial changes or dysbiosis. Therefore, to explicitly determine the extent of disease-associated dysbiosis, one would first need to account for the normal perturbations of microbial composition. Our study underscores the importance of considering temporal variability of the microbiome in understanding the significance of microbial dysbiosis in COPD exacerbations.

By assessing temporal variation of the sputum microbiome, we identified a subset of exacerbation events in which significant dysbiosis is a feature. In these exacerbations, the microbiome composition shifted from a highly diverse microbial community to a less diverse one characterized by the predominance of only one or a few genera. These dysbiosis exacerbations appear to be the main source of microbial temporal variation and are associated with a greater worsening of health status and decrease of lung capacity. To our knowledge, this is the first evidence to suggest that respiratory dysbiosis is associated with increased exacerbation severity in COPD, although the strength of this association is weak and needs to be further validated in additional cohorts and by other measures of disease severity. Altered environmental conditions in exacerbations could disturb the composition of the lung microbial community 29 30, which in turn elicits a dysregulated host immune response through bacterial metabolites and virulence factors, resulting in a sustained damage cycle with an accelerated decline in lung function 31. Whether dysbiosis is the cause or consequence of the increased exacerbation severity and how this imbalance is implicated in host inflammatory pathways are new questions that will affect how we understand and treat COPD exacerbations.

It has been recently emphasized that not all COPD exacerbations are the same 32. Our results suggest the existence of subgroups of exacerbations associated with or without significant microbial dysbiosis or increased eosinophilia. Importantly, these subgroups likely reflect fundamental differences in the immuno-pathogenesis driving the exacerbations, and therefore might require alternative therapeutic approaches. Interestingly, the most severe exacerbations were observed in the small subgroup that had evidence of bacterial dysbiosis in concert with eosinophilic inflammation. It is possible that this group might require interventions such as antibiotics and steroids (e.g. prednisolone) to target both bacteria and eosinophilic inflammation, whereas those with neither bacterial dysbiosis nor eosinophilic inflammation might not require these therapies. Our results perhaps establish a new paradigm in stratifying COPD exacerbations according to both dysbiosis and eosinophil measurements, which could be informative in guiding future personalized therapies. Further efforts in identifying biomarkers for these subgroups in larger populations could help refine exacerbation subtypes toward phenotype-specific clinical management.

We found a significantly decreased exacerbation frequency in subjects with higher temporal variation of the microbial beta diversity. In COPD there is a subset of frequent exacerbators that are particularly susceptible to recurrent exacerbations independent of other risk factors 33 34. Thus low temporal variability of the sputum microbiome might serve as a predictive factor for the frequent exacerbator phenotype. Frequently recurrent exacerbations are often associated with a persistent propensity to inflammation, with high levels of airway biomarkers such as IL-6 and IL-8 35 36. It is possible that elevated airway inflammation in frequent exacerbators maintains the microbiome in a sustained dysbiotic state and prevents it from fluctuating regularly.

An important novelty of our study is that it included three distinct study centers. All samples were processed in a central lab, which minimizes microbiome variation due to differences in experimental protocols. Interestingly, there was a significant difference in temporal variability of the microbiome among subjects in the three centers, even though their overall microbiome profiles were highly similar. Factors accounting for the among-center variation could include differences in the frequency of clinical visits and compliance with medications, although we lack the comparative data across centers to suggest specific causes. Our study suggests that the impact of demographics and clinical procedures on the lung microbial community needs to be broadly considered in future studies.

Our study has several caveats. First, only a portion of the bacterial 16S rRNA gene was sequenced to characterize the microbial population, both here and in previous lung microbiome studies 11-14. Thus the resolution is insufficient for species-level characterization of the microbiome, whereas ecological and functional interactions of individual species or strains could be important in the underlying disease etiology. Second, despite a large cohort size, longitudinal sampling remains relatively sparse for many subjects, with variation in the timing of their sampling visits. Further efforts to characterize respiratory tract metagenomes in a more regularly and intensively followed patient cohort, together with host multi-omics profiling, would promise a more comprehensive picture of the intrinsic variability of the lung microbiome and its implications in disease heterogeneity.

In summary, our study revealed a temporally dynamic sputum microbiome in COPD subjects in which microbial dysbiosis in exacerbations, particularly in concert with eosinophilic inflammation, was associated with increased exacerbation severity. Our findings underscore the importance of considering temporal variability of the sputum microbiome in COPD heterogeneity and its potential as a biomarker toward more precise treatment of COPD.

Figure legends

Figure 1. Baseline and exacerbation microbiome profiles across centers. A) Alpha diversity (Shannon) and composition of major phyla and genera in samples at baseline and exacerbations. B) Correlation between alpha diversity (Shannon) and relative abundance of Moraxella. Each dot represents a sample colored by baseline or exacerbation. C) Alpha diversity (Shannon) and composition of major phyla and genera in exacerbation samples of different exacerbation phenotypes. D) Principal Coordinate Analysis (PCoA) showing distinct clustering of samples with bacterial and eosinophilic exacerbations. The number of samples is indicated in parentheses under each subgroup in the bar chart. B: bacterial; V: viral; E: eosinophilic; BE: bacterial and eosinophilic; BV: bacterial and viral; and Pauci: pauci-inflammatory. Error bars extend to data within 1.5 times the interquartile range of the upper and lower quartiles. *** adj. P<0.001; ** adj. P<0.01; * adj. P<0.05.

Figure 2. Dysbiosis of the sputum microbiome. A) Scatter plot of the first Principal Coordinate (PC1) of all samples within each subject at each center. Only subjects with at least two baseline samples and one exacerbation sample were included. Exacerbation samples are highlighted in red. Box-whisker plots indicate the distribution of baseline PC1s within each subject. The confidence bands indicate the 95% confidence interval for the mean baseline PC1s within each subject. Exacerbations outside the confidence bands are the ones with significant dysbiosis (absolute Z-score>2, P<0.05). B) Box-whisker plots showing changes of FEV1, FVC and CAT score between dysbiosis and non-dysbiosis exacerbations, and among four subgroups of exacerbations classified by dysbiosis and blood eosinophil level. Error bars extend to data within 1.5 times the interquartile range of the upper and lower quartiles.

Figure 3. Temporal variability of the sputum microbiome. A) Temporal variability of microbial alpha (coefficient of variation of Shannon) and beta diversity (median of pairwise weighted UniFrac distances) for each subject. Only subjects with at least three samples were included. Subjects with dysbiosis exacerbations are highlighted in yellow. B) Box-whisker plots showing exacerbation frequency of subjects within different quartile groups of temporal variability of alpha and beta diversity, with the first quartile defined as ‘low’, the second and third quartiles as ‘medium’ and the fourth quartile as ‘high’. C) Box-whisker plots showing temporal variability of alpha and beta diversity in subjects with different classes of exacerbation frequency. The number of samples is indicated in parentheses under each subgroup in the box-whisker plot. Error bars extend to data within 1.5 times the interquartile range of the upper and lower quartiles. ANOVA test for temporal variability of alpha and beta diversity: *** adj. P<0.001; ** adj. P<0.01; * adj. P<0.05.

References
1. Lopez AD, Shibuya K, Rao C, et al. Chronic obstructive pulmonary disease: current burden and future projections. The European respiratory journal 2006;27(2):397-412. doi: 10.1183/09031936.06.00025805
2. Franklin W, Lowell FC, Michelson AL, et al. Chronic obstructive pulmonary emphysema; a disease of smokers. Annals of internal medicine 1956;45(2):268-74.
3. Taraseviciene-Stewart L, Douglas IS, Nana-Sinkam PS, et al. Is alveolar destruction and emphysema in chronic obstructive pulmonary disease an immune disease? Proceedings of the American Thoracic Society 2006;3(8):687-90. doi: 10.1513/pats.200605-105SF
4. Erkan L, Uzun O, Findik S, et al. Role of bacteria in acute exacerbations of chronic obstructive pulmonary disease. International journal of chronic obstructive pulmonary disease 2008;3(3):463-7.
5. Sethi S, Murphy TF. Bacterial infection in chronic obstructive pulmonary disease in 2000: a state-of-the-art review. Clinical microbiology reviews 2001;14(2):336-63. doi: 10.1128/CMR.14.2.336-363.2001
6. Miravitlles M, Espinosa C, Fernandez-Laso E, et al. Relationship between bacterial flora in sputum and functional impairment in patients with acute exacerbations of COPD. Study Group of Bacterial Infection in COPD. Chest 1999;116(1):40-6.
7. Ball P. Epidemiology and treatment of chronic bronchitis and its exacerbations. Chest 1995;108(2 Suppl):43S-52S.
8. Soler N, Torres A, Ewig S, et al. Bronchial microbial patterns in severe exacerbations of chronic obstructive pulmonary disease (COPD) requiring mechanical ventilation. American journal of respiratory and critical care medicine 1998;157(5 Pt 1):1498-505. doi: 10.1164/ajrccm.157.5.9711044
9. Monso E, Ruiz J, Rosell A, et al. Bacterial infection in chronic obstructive pulmonary disease. A study of stable and exacerbated outpatients using the protected specimen brush. American journal of respiratory and critical care medicine 1995;152(4 Pt 1):1316-20. doi: 10.1164/ajrccm.152.4.7551388
10. Dickson RP, Martinez FJ, Huffnagle GB. The role of the microbiome in exacerbations of chronic lung diseases. Lancet 2014;384(9944):691-702. doi: 10.1016/S0140-6736(14)61136-3
11. Wang Z, Bafadhel M, Haldar K, et al. Lung microbiome dynamics in chronic obstructive pulmonary disease exacerbations. The European respiratory journal 2016 doi: 10.1183/13993003.01406-2015
12. Molyneaux PL, Mallia P, Cox MJ, et al. Outgrowth of the bacterial airway microbiome after rhinovirus exacerbation of chronic obstructive pulmonary disease. American journal of respiratory and critical care medicine 2013;188(10):1224-31. doi: 10.1164/rccm.201302-0341OC
13. Huang YJ, Sethi S, Murphy T, et al. Airway microbiome dynamics in exacerbations of chronic obstructive pulmonary disease. Journal of clinical microbiology 2014;52(8):2813-23. doi: 10.1128/JCM.00035-14
14. Millares L, Ferrari R, Gallego M, et al. Bronchial microbiome of severe COPD patients colonised by Pseudomonas aeruginosa. European journal of clinical microbiology & infectious diseases : official publication of the European Society of Clinical Microbiology 2014;33(7):1101-11. doi: 10.1007/s10096-013-2044-0
15. Huang YJ, Kim E, Cox MJ, et al. A persistent and diverse airway microbiota present during chronic obstructive pulmonary disease exacerbations. Omics : a journal of integrative biology 2010;14(1):9-59. doi: 10.1089/omi.2009.0100
16. Huang YJ, Erb-Downward JR, Dickson RP, et al. Understanding the role of the microbiome in chronic obstructive pulmonary disease: principles, challenges, and future directions. Translational research : the journal of laboratory and clinical medicine 2017;179:71-83. doi: 10.1016/j.trsl.2016.06.007
17. National Institute for Health and Clinical Excellence. Chronic obstructive pulmonary disease: management of chronic obstructive pulmonary disease in adults in primary and secondary care. London: National Clinical Guideline Centre: Available from: http://guidance.nice.org.uk/CG101/Guidance/pdf/English2010. 2010
18. Caporaso JG, Kuczynski J, Stombaugh J, et al. QIIME allows analysis of high-throughput community sequencing data. Nature methods 2010;7(5):335-6. doi: 10.1038/nmeth.f.303
19. Eriksson L JE, Kettaneh-Wold N. Multi- and Megavariate Data Analysis, Part 2, Advanced Applications and Method Extensions. MKS Umetrics AB 2006
20. Flores GE, Caporaso JG, Henley JB, et al. Temporal variability is a personalized feature of the human microbiome. Genome biology 2014;15(12):531. doi: 10.1186/s13059-014-0531-y
21. R Core Team (2014) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/
22. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol 1995;57:289-300.
23. Hilty M, Burke C, Pedro H, et al. Disordered microbial communities in asthmatic airways. PloS one 2010;5(1):e8578. doi: 10.1371/journal.pone.0008578
24. Erb-Downward JR, Thompson DL, Han MK, et al. Analysis of the lung microbiome in the "healthy" smoker and in COPD. PloS one 2011;6(2):e16384. doi: 10.1371/journal.pone.0016384
25. Zakharkina T, Heinzel E, Koczulla RA, et al. Analysis of the airway microbiota of healthy individuals and patients with chronic obstructive pulmonary disease by T-RFLP and clone sequencing. PloS one 2013;8(7):e68302. doi: 10.1371/journal.pone.0068302
26. Pragman AA, Kim HB, Reilly CS, et al. The lung microbiome in moderate and severe chronic obstructive pulmonary disease. PloS one 2012;7(10):e47305. doi: 10.1371/journal.pone.0047305
27. Dahl M, Vestbo J, Lange P, et al. C-reactive protein as a predictor of prognosis in chronic obstructive pulmonary disease. American journal of respiratory and critical care medicine 2007;175(3):250-5. doi: 10.1164/rccm.200605-713OC
28. Langille MG, Zaneveld J, Caporaso JG, et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature biotechnology 2013;31(9):814-21. doi: 10.1038/nbt.2676
29. Worlitzsch D, Tarran R, Ulrich M, et al. Effects of reduced mucus oxygen concentration in airway Pseudomonas infections of cystic fibrosis patients. The Journal of clinical investigation 2002;109(3):317-25. doi: 10.1172/JCI13870
30. Schmidt A, Belaaouaj A, Bissinger R, et al. Neutrophil elastase-mediated increase in airway temperature during inflammation. Journal of cystic fibrosis : official journal of the European Cystic Fibrosis Society 2014;13(6):623-31. doi: 10.1016/j.jcf.2014.03.004
31. Marsland BJ, Gollwitzer ES. Host-microorganism interactions in lung diseases. Nature reviews Immunology 2014;14(12):827-35. doi: 10.1038/nri3769
32. Lopez-Campos JL, Agusti A. Heterogeneity of chronic obstructive pulmonary disease exacerbations: a two-axes classification proposal. The Lancet Respiratory medicine 2015;3(9):729-34. doi: 10.1016/S2213-2600(15)00242-8
33. Wedzicha JA, Brill SE, Allinson JP, et al. Mechanisms and impact of the frequent exacerbator phenotype in chronic obstructive pulmonary disease. BMC medicine 2013;11:181. doi: 10.1186/1741-7015-11-181
34. Hurst JR, Vestbo J, Anzueto A, et al. Susceptibility to exacerbation in chronic obstructive pulmonary disease. The New England journal of medicine 2010;363(12):1128-38. doi: 10.1056/NEJMoa0909883
35. Bhowmik A, Seemungal TA, Sapsford RJ, et al. Relation of sputum inflammatory markers to symptoms and lung function changes in COPD exacerbations. Thorax 2000;55(2):114-20.
36. Perera WR, Hurst JR, Wilkinson TM, et al. Inflammatory changes, recovery and recurrence at COPD exacerbation. The European respiratory journal 2007;29(3):527-34. doi: 10.1183/09031936.00092506

Sputum microbiome temporal variability and dysbiosis in chronic obstructive pulmonary disease exacerbations: an analysis of the COPDMAP study

Zhang Wang, Richa Singh, Bruce E. Miller, Ruth Tal-Singer, Stephanie Van Horn, Lynn Tomsho, Alexander Mackay, James P. Allinson, Adam J. Webb, Anthony J. Brookes, Leena M. George, Bethan Barker, Umme Kolsum, Louise E Donnelly, Kylie Belchamber, Peter J. Barnes, Dave Singh, Christopher E. Brightling, Gavin C. Donaldson, Jadwiga A. Wedzicha, James R. Brown on behalf of COPDMAP

SUPPLEMENTARY MATERIAL

Materials and Methods

Study subjects and sample collection

COPDMAP was conducted in accordance with the Declaration of Helsinki 1 and Good Clinical Practice 2, and was approved by the Imperial College London, University of Leicester and University of Manchester Research Ethics Committees. All participants provided written informed consent. Subjects with a physician diagnosis of COPD were recruited from three clinical centers at Imperial College London, University of Leicester and University Hospital of South Manchester, and through local advertising to enter studies investigating biomarkers in COPD as previously described 3 4. The Imperial samples were collected at Royal Free Hospital of University College London at the time of the study. Subjects with asthma, significant respiratory disease other than COPD, or the inability to produce sputum after sputum induction were excluded from the study. Sputum samples from COPD subjects were collected at multiple longitudinal visits including both baseline (defined as no evidence of symptom-defined exacerbations in the preceding four weeks and the subsequent two weeks post-clinic visit) and exacerbations (defined according to Anthonisen criteria 5 and/or healthcare utilization 6). All exacerbation sputum samples were collected prior to the institution of any exacerbation treatment. Demographic, baseline and longitudinal clinical data were recorded for samples. Fifteen subjects with missing demographic or clinical data were excluded from the biostatistical analysis.

16S rRNA sequencing

For quality control purposes, all DNA extractions, sequencing and data analyses were performed in a single, centralized lab at the GSK R&D facility in Collegeville, Pennsylvania, USA. Frozen sputum samples homogenized in sterile 1x sputasol (0.1% DTT) were thawed completely on the bench. Bacterial genomic DNA was extracted from sputum samples using the Qiagen DNA Mini kit (Qiagen, CA, USA) according to the manufacturer’s protocol. The V4 hypervariable region of the 16S rRNA gene was PCR amplified using specific primers (515F: 5’-GTGCCAGCMGCCGCGGTAA-3’, 806R: 5’-GGACTACHVGGGTWTCTAAT-3’), including Illumina sequencing adapters 7. The reverse amplification primer contained a 12 bp error-correcting Golay barcode sequence, allowing multiple samples to be pooled in the same flowcell 8. Negative controls for extraction (no sputum material) and PCR amplification (no template, Qiagen Elution Buffer only) were included in each experiment. The extraction negative control for each experiment was subsequently sequenced to identify any potential contaminating bacterial species.

The amplification mix (25 μl) contained 4 μl sputum DNA, 2 μl (0.2 µM) each of forward and reverse primers (Integrated DNA Technologies, Coralville, IA), 12.5 μl of 2x KAPA HiFi HotStart Ready Mix (KK2602, Kapa Biosystems, Boston, MA), and 4.5 μl RNase-free water. PCR amplification was performed on an ABI 9700 thermocycler using the following cycling protocol: initial denaturation at 95°C for 3 min, followed by 35 cycles of 98°C for 20 sec, 66°C for 15 sec, and 72°C for 15 sec, with a final hold at 72°C for 1 min. Aliquots of the reaction mixture (3 µl each) were analyzed on a 2% agarose gel (2% Egel, Invitrogen), with samples containing a band of approximately 385 bp considered ‘PCR positive’. Samples with no visible amplified product were considered ‘PCR negative’. Unincorporated nucleotides and remaining primers were removed using Agencourt AMPure XP-PCR clean up (A63882, Beckman Coulter, Pasadena, CA), according to the manufacturer’s protocol. The DNA concentration of the eluted product was quantified using the KAPA Library Quantification Kit for Illumina platform (KK4835, Kapa Biosystems, Boston, MA). PCR products were normalized to 10 nM, quantified again using the KAPA Library Quantification Kit and pooled into equimolar 4 nM pools.

The amplified PCR products were sequenced in five runs on an Illumina MiSeq sequencer (Illumina, San Diego, CA). Following cluster formation on the MiSeq instrument, the amplicons were sequenced using primers complementary to the V4 region and designed for paired-end sequencing. A third sequencing primer was used for reading the barcodes. To check for proper cluster density and sample normalization, a MiSeq single-end 26 bp + 12 bp index sequencing run was performed. The pool was mixed with a PhiX library (Illumina, San Diego, CA) at a ratio of 1:9 in order to increase the entropy of the library. A final MiSeq 2 x 150 bp + 12 bp index sequencing run was performed on the pooled samples.

Although negative reagent controls were included for all DNA isolation, extraction and PCR amplification steps, we performed further analyses to ensure that potential contamination risks were minimized. We compared our results against the 92 contaminant genera detected in sequenced negative ‘blank’ controls by Salter et al. 9. We did not detect 42 of the 92 contaminant genera in our dataset (Table S6). Of the remaining genera that were found in our data, none had an average relative abundance greater than 0.002 or a relative abundance greater than 0.1 in any particular sample, except for Pseudomonas and Streptococcus, which are known lung pathogens (Table S6).

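The abundance thresholds used in the contamination check above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `flag_abundant_genera`, the sample-by-genus layout of the input, and the toy values are assumptions made for illustration only; only the two cutoffs (0.002 and 0.1) come from the text.

```python
import numpy as np

def flag_abundant_genera(rel_abund, genera, mean_cutoff=0.002, max_cutoff=0.1):
    """Return genera whose mean relative abundance exceeds mean_cutoff
    or whose relative abundance exceeds max_cutoff in at least one sample.

    rel_abund: rows are samples, columns are genera (assumed layout).
    """
    rel_abund = np.asarray(rel_abund, dtype=float)
    mean_ab = rel_abund.mean(axis=0)   # average abundance per genus
    max_ab = rel_abund.max(axis=0)     # highest abundance in any sample
    exceeds = (mean_ab > mean_cutoff) | (max_ab > max_cutoff)
    return [g for g, hit in zip(genera, exceeds) if hit]

# Toy data (hypothetical values, not taken from the study)
genera = ["Pseudomonas", "Ralstonia", "Bradyrhizobium"]
rel_abund = [
    [0.150, 0.0005, 0.0000],
    [0.020, 0.0010, 0.0001],
    [0.000, 0.0000, 0.0002],
]
flagged = flag_abundant_genera(rel_abund, genera)
print(flagged)
```

Genera returned by such a screen would then be reviewed individually, as the text does for Pseudomonas and Streptococcus.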
16S rRNA sequence analysis and OTU classification

First, all reads mapping to the PhiX reference sequence (GenBank: NC_001422.1) using bowtie v1.0.1 10 were removed from the analysis. The remaining paired-end reads were merged using pear v0.9.5-64 11, discarding all reads containing ambiguous bases (option ‘-u 0’). A paired-end read was discarded if any of the following conditions was met: overlap < 10 bp, assembly length < 50 bp or p-value of alignment > 0.01. Sequencing reads were processed using the QIIME pipeline version 1.9 12. Eukaryotic, mitochondrial and chloroplast sequences were filtered by BLASTN against the SILVA database [5]. Chimeric reads were identified with UCHIME using both de novo and reference-based methods with default parameters 13. A total of 68,643,967 reads were generated, and 55,786,582 reads were retained after filtering. The remaining reads were subjected to closed-reference OTU picking at a 97% identity cutoff using the UCLUST method 14 against the August 2013 edition of the Greengenes 16S rRNA database (v13_8) 15. OTU clustering was performed on each run separately and the resulting OTU tables were merged afterwards. OTUs containing a single read (singleton OTUs) were excluded to remove potential sequencing artifacts. All 716 samples were rarefied to 46,056 reads, the minimum number of aligned reads across all samples. The rarefied OTU table was used for assessing alpha and beta diversity and for subsequent statistical analyses. Alpha and beta diversity were also calculated from functional predictions of microbial gene families and pathways using the software PICRUSt 16.

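The rarefaction and alpha diversity steps described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the QIIME implementation used in the study: the function names `rarefy` and `shannon` are hypothetical, subsampling is done without replacement through NumPy's multivariate hypergeometric sampler, and the Shannon index is computed with a base-2 logarithm, which is an assumption about the convention used.

```python
import numpy as np

def rarefy(counts, depth, rng):
    """Subsample one sample's OTU counts to a fixed depth without replacement."""
    counts = np.asarray(counts, dtype=np.int64)
    if counts.sum() < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    return rng.multivariate_hypergeometric(counts, depth)

def shannon(counts, base=2.0):
    """Shannon diversity index of a count vector (log base is an assumption)."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()   # drop zero counts before taking logs
    return float(-(p * np.log(p)).sum() / np.log(base))

rng = np.random.default_rng(0)
sample = [500, 300, 150, 50]               # toy OTU counts, not study data
rarefied = rarefy(sample, depth=200, rng=rng)
print(rarefied, shannon(rarefied))
```

In the study the depth was 46,056 reads, the minimum across samples, so no sample had to be discarded at the rarefaction step.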
Statistical analysis

PLS-DA between the sputum microbiome and exacerbation phenotypes

Exacerbation phenotypes were defined using slightly modified microbiological and clinical criteria as established previously 4 17. In particular, the total bacterial load was estimated by the qPCR copy number of Haemophilus influenzae normalized by the proportion of the species in the sputum microbiome, and a bacterial exacerbation was defined as a total bacterial load ≥ 10^7 cells 4 17. A viral exacerbation was defined as a positive sputum viral detection by PCR. An eosinophilic exacerbation was defined as the presence of more than 3% eosinophils among non-squamous cells in sputum. Samples satisfying multiple criteria were classified as the corresponding combination of phenotypes. The remaining samples, associated with limited changes in the inflammatory profile, were classified as ‘pauci-inflammatory’. The phenotype could not be defined for 146 exacerbation samples due to missing data. A total of 25 clinical variables were included in the analysis. PLS-DA was performed using SIMCA-P (Umetrics, Stockholm, Sweden) 18 as previously described 17. For subjects with multiple exacerbations, only the initial exacerbation sample was included in the analysis to meet the independence assumption for PLS-DA.

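The phenotype rules above amount to a small decision function, sketched here in Python. The function and argument names are hypothetical, the eosinophil input is assumed to be the percentage of eosinophils among non-squamous sputum cells, and the label order for combined phenotypes is an assumption; combinations other than BE and BV (for example eosinophilic plus viral) are not named in the source and are labeled here only for completeness.

```python
from typing import Optional

def classify_exacerbation(bacterial_load: Optional[float],
                          virus_pcr_positive: Optional[bool],
                          sputum_eosinophil_pct: Optional[float]) -> Optional[str]:
    """Assign an exacerbation phenotype label from three criteria.

    Returns None when any input is missing, mirroring samples whose
    phenotype could not be defined.
    """
    if (bacterial_load is None or virus_pcr_positive is None
            or sputum_eosinophil_pct is None):
        return None
    label = ""
    if bacterial_load >= 1e7:          # bacterial: total load >= 10^7 cells
        label += "B"
    if sputum_eosinophil_pct > 3.0:    # eosinophilic: more than 3% (assumed input)
        label += "E"
    if virus_pcr_positive:             # viral: positive sputum PCR
        label += "V"
    return label or "Pauci"            # no criterion met: pauci-inflammatory

print(classify_exacerbation(2e7, False, 5.0))
```

Returning a concatenated label keeps the combined phenotypes (BE, BV) consistent with the abbreviations used in the figure legends.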
Dysbiosis and temporal variability of the sputum microbiome

A total of 64 subjects with at least two baseline samples and one exacerbation sample were included in the dysbiosis analysis. We measured the dysbiosis of exacerbations relative to baseline samples of the same subject using PC1 of the weighted UniFrac distance. Assuming a normal distribution of the baseline PC1s within each subject, the mean and standard deviation of the baseline PC1s were calculated. A Z-score was then calculated for each exacerbation as:

$$Z_e = \frac{PC1_e - \mu_b}{\sigma_b}$$

where $PC1_e$ is the exacerbation PC1, and $\mu_b$ and $\sigma_b$ are the mean and standard deviation of all baseline PC1s of the same subject, respectively. An absolute Z-score greater than 2 was used as the cutoff for dysbiosis, which corresponds to a probability of approximately 0.05 of observing the exacerbation PC1 from the subject under the distribution of its baseline PC1s.

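The Z-score cutoff described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `dysbiosis_z` and the toy PC1 values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
import statistics

def dysbiosis_z(baseline_pc1, exacerbation_pc1, cutoff=2.0):
    """Z-score of an exacerbation PC1 against the subject's own baseline PC1s.

    Returns (z, is_dysbiosis), where is_dysbiosis is True when |z| > cutoff.
    """
    if len(baseline_pc1) < 2:
        raise ValueError("at least two baseline samples are required")
    mu = statistics.mean(baseline_pc1)
    sigma = statistics.stdev(baseline_pc1)   # sample SD; an assumption
    if sigma == 0:
        raise ValueError("baseline PC1 values have zero variance")
    z = (exacerbation_pc1 - mu) / sigma
    return z, abs(z) > cutoff

# Toy values for one subject (not study data)
z, is_dysbiosis = dysbiosis_z([0.10, 0.30], 0.60)
print(round(z, 3), is_dysbiosis)
```

With only two or three baseline samples per subject the standard deviation is estimated imprecisely, which is one reason the minimum of two baseline samples is a lower bound rather than a comfortable sample size.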
A total of 126 subjects with at least three samples were included in the analysis of microbial temporal variability. We adopted the metrics from Flores et al. 19 to assess temporal variability of microbial alpha and beta diversity. For the variability of alpha diversity, we calculated the coefficient of variation (CV), the standard deviation normalized by the mean, of the Shannon index across all samples within each subject. For the variability of beta diversity, we calculated the median of the pairwise weighted UniFrac distances of all samples within each subject. Higher values of these measurements represent a more variable microbial community. Subjects were then divided into quartiles based on the CV of Shannon and the median of pairwise weighted UniFrac distances, where the first quartile was defined as ‘low’, the second and third quartiles as ‘medium’ and the fourth quartile as ‘high’ for each measurement of temporal variability.

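The two temporal variability metrics and the quartile grouping described above can be sketched as follows. The function names are hypothetical, the sample standard deviation is assumed for the CV, and the handling of values that fall exactly on a quartile boundary is an assumption; the distance matrix is taken as a precomputed symmetric matrix of weighted UniFrac distances for one subject.

```python
import numpy as np

def shannon_cv(shannon_values):
    """Coefficient of variation of Shannon diversity within one subject."""
    v = np.asarray(shannon_values, dtype=float)
    return float(v.std(ddof=1) / v.mean())     # sample SD is an assumption

def median_pairwise_distance(dist):
    """Median of the pairwise distances among one subject's samples."""
    d = np.asarray(dist, dtype=float)
    upper = np.triu_indices(d.shape[0], k=1)   # each pair counted once
    return float(np.median(d[upper]))

def quartile_groups(values):
    """Label each subject 'low' (Q1), 'medium' (Q2 and Q3) or 'high' (Q4)."""
    v = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(v, [25, 75])
    return ["low" if x <= q1 else "high" if x > q3 else "medium" for x in v]

# Toy distance matrix for one subject with three samples (not study data)
dist = [[0.0, 0.2, 0.4],
        [0.2, 0.0, 0.6],
        [0.4, 0.6, 0.0]]
print(median_pairwise_distance(dist))
print(quartile_groups([1, 2, 3, 4, 5, 6, 7, 8]))
```

Using the median rather than the mean of pairwise distances limits the influence of a single outlying sample, such as one dysbiotic exacerbation, on a subject's variability score.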
Clinical predictors of the sputum microbiome diversity and temporal variability

Multivariate models were constructed to assess associations of patient demographic and clinical variables with alpha and beta diversity and their temporal variability. Both the microbiome and clinical datasets were pre-processed as described previously 17. To identify clinical predictors of temporal variability of alpha and beta diversity, a general linear model (GLM) was constructed between demographic/baseline clinical variables and each measurement respectively, for all subjects and for subjects within each center. A set of 14 demographic and baseline clinical variables was included for each subject. A total of 113 subjects with complete measurements of these data were included in the analysis. The 113 subjects were not significantly different from the remaining subjects in terms of study center distribution, major demographic and baseline clinical variables (age, FEV1, FVC, CAT score, etc.) and microbiome profiles. As each subject was associated with a single measurement, the independence assumption was met. The model was optimized in terms of Akaike information criterion (AIC) through backward elimination of non-significant effects in a stepwise algorithm using the “step” function in the R stats package 20.

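The study used R's "step" function for this optimization. As a rough Python illustration of the same idea, the sketch below performs backward elimination on an ordinary least-squares model, using AIC computed up to an additive constant as n log(RSS/n) + 2k. The function names and the simulated data are hypothetical, and the sketch assumes a Gaussian linear model with numeric predictors only; it is not a reimplementation of R's routine.

```python
import numpy as np

def ols_aic(X, y):
    """AIC (up to a constant) of an OLS fit with intercept: n*log(RSS/n) + 2k."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])        # intercept plus predictors
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(((y - design @ beta) ** 2).sum())
    return n * np.log(rss / n) + 2 * design.shape[1]

def backward_eliminate(X, y):
    """Drop one predictor at a time while doing so lowers the AIC."""
    selected = list(range(X.shape[1]))
    best = ols_aic(X[:, np.array(selected, dtype=int)], y)
    while selected:
        trials = []
        for d in selected:
            cols = np.array([j for j in selected if j != d], dtype=int)
            trials.append((ols_aic(X[:, cols], y), d))
        aic, drop = min(trials)                      # best single removal
        if aic >= best:                              # no removal improves AIC
            break
        best = aic
        selected.remove(drop)
    return selected, best

# Simulated data: only column 0 truly predicts y (hypothetical example)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=100)
selected, best_aic = backward_eliminate(X, y)
print(selected)
```

Because AIC penalizes each retained parameter by a fixed amount, predictors with weak effects can still survive elimination, so the retained set should be read as a candidate model rather than a list of confirmed associations.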
To identify clinical predictors of microbial alpha diversity, a general linear mixed model (GLMM) was constructed between clinical variables and Shannon for all samples as well as for samples within each center. The subject ID of each sample was used as the random factor to adjust for the dependency of repeated measurements on the same subject. A set of 22 demographic, baseline and longitudinal clinical variables was included for all samples. A total of 391 samples with complete measurements of these data were included in the analysis. The 391 samples were not significantly different from the remaining samples in terms of study center distribution, major clinical variables (FEV1, FVC, CAT score, etc.) and microbiome profiles. The model was optimized in terms of Akaike information criterion (AIC) through backward elimination of non-significant effects in a stepwise algorithm using the “step” function in the R lmerTest package 21. A GLMM was also constructed between clinical variables and Shannon for the predicted metagenome from microbial taxa.

To identify clinical predictors of beta diversity, we carried out a canonical correspondence analysis (CCA) using the R vegan package 22. To meet the independence assumption, only the initial baseline sample from each subject was included in the analysis. The same set of 22 demographic, baseline and longitudinal clinical variables was included in the analysis. A total of 129 initial baseline samples with complete measurements of these data were included in the analysis. CCA was performed on clinical variables and the relative abundance of taxa at each of the phylum, genus and OTU levels, for all samples as well as for samples within each center. At each level, taxa present in at least 10% of samples were included in the analysis. The model was optimized in terms of Akaike information criterion (AIC) in a stepwise algorithm using the “step” function in the R stats package 20. The statistical significance of each clinical variable was obtained by permutation test. CCA was also performed between clinical variables and L2 and L3 functional categories for the predicted metagenome from microbial taxa.

Supplementary Results

Figure S1. Time points of sample collection from subjects at London, Leicester and Manchester. Each set of connected dots represents samples collected from the same subject at different visits. The X axis represents the sample collection date in days relative to the earliest collection date of all samples at each center. Subjects were first grouped by the number of samples and then ordered by the initial sample collection date.

Figure S2. Overview of the sputum microbiome across all 716 samples. Each column represents one sample. The Y axis represents relative abundance of major phyla and genera. Samples were clustered by UPGMA clustering based on the weighted UniFrac distances.

Figure S3. Highly similar microbial profiles between duplicate samples collected from the same individual on the same date. Duplicate samples were grouped by subject. Genus-level microbial composition is shown for each sample. Genera with average relative abundance greater than 0.005 were included.

Figure S4. Alpha diversity (Shannon) and composition of major phyla and genera in samples A) at London, Leicester and Manchester, and B) at baseline and exacerbations within each center.

Figure S5. Microbiome distinguished bacterial and eosinophilic exacerbations. A) Unweighted pair group method with arithmetic mean clustering showing distinct clustering of samples with bacterial and eosinophilic exacerbations. B) PLS-DA classification of bacterial and eosinophilic exacerbations using clinical, microbiome and their combined variables at phylum (L2), genus (L6) and OTU levels. The models were evaluated in terms of area under the Receiver Operating Characteristic curve (AUC), R2 and Q2 scores. C) Alpha diversity (Shannon) and composition of major phyla and genera in exacerbation samples of different exacerbation phenotypes at London, Leicester and Manchester. The number of samples is indicated in parentheses under each subgroup in the bar chart. Error bars extend to data within 1.5 times the interquartile range of the upper and lower quartiles. B: bacterial; V: viral; E: eosinophilic; BE: bacterial and eosinophilic; BV: bacterial and viral; and Pauci: pauci-inflammatory. *** adj. P<0.001; ** adj. P<0.01; * adj. P<0.05.

Figure S6. Significant negative correlation between CRP and alpha diversity (Shannon). Each dot represents a sample colored by center.

Figure S7. Dysbiosis of the sputum microbiome. A) Scatter plot of the first Principal Coordinate (PC1) of all samples within each subject at each center. Only subjects with at least two baseline samples and one exacerbation sample were included. Exacerbation samples were colored by exacerbation phenotype. Most deviations of PC1 occurred at exacerbations. B) Box-whisker plots showing significantly increased standard deviation of PC1 when combining baseline and exacerbation samples within each subject compared to baseline samples only. C) Scatter plot of absolute Z-score for exacerbations at each center. Exacerbations were colored by exacerbation phenotype. Error bars extend to data within 1.5 times the interquartile range of the upper and lower quartiles. B: bacterial; V: viral; E: eosinophilic; BE: bacterial and eosinophilic; BV: bacterial and viral; and Pauci: pauci-inflammatory. *** adj. P<0.001; ** adj. P<0.01; * adj. P<0.05.

Figure S8. Temporal dynamics of the sputum microbiome in subjects with dysbiosis exacerbations at London, Leicester and Manchester. Each horizontal bar represents alpha diversity (Shannon) and genus-level microbial composition of one sample. Samples were grouped by subject and ordered by collection date from bottom to top. Baseline or exacerbation samples are indicated at the left of the horizontal bars. Exacerbations with dysbiosis are marked with asterisks.

Figure S9. Bacterial genera associated with dysbiosis in exacerbations. Box-whisker plots showing the changes in relative abundance of bacterial genera relative to the last baseline measurement, compared between exacerbations with and without dysbiosis. The genera were ordered by the FDR-adjusted P-value from the t-test. Error bars extend to data within 1.5 times the interquartile range of the upper and lower quartiles.

Figure S10. A) Box-whisker plots showing changes of FEV1, FVC and CAT score between dysbiosis and non-dysbiosis exacerbations at London, Leicester and Manchester. No CAT score change is available for dysbiosis exacerbations at London because of missing data. B) Correlations between the absolute Z-score measuring exacerbation dysbiosis and changes of FEV1, FVC and CAT score from baseline. Error bars extend to data within 1.5 times the interquartile range of the upper and lower quartiles.

Figure S11. A) Box-whisker plots showing the exacerbation frequency of subjects within different quartile groups of temporal variability of alpha and beta diversity at each center, with the first quartile defined as ‘low’, the second and third quartiles as ‘medium’ and the fourth quartile as ‘high’. B) Box-whisker plots showing the temporal variability of alpha and beta diversity in subjects within different classes of exacerbation frequency at each center. The number of samples is given in parentheses under each subgroup in the box-whisker plots. Error bars extend to within 1.5 times the interquartile range of the upper and lower quartiles. *** adj. P<0.001; ** adj. P<0.01; * adj. P<0.05.
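
The quartile grouping described for Figure S11 can be illustrated with a short sketch. The function name `variability_groups`, the example values, the quantile method (the default of Python's `statistics.quantiles`) and the handling of values that fall exactly on a quartile boundary are all assumptions; the caption does not specify them.

```python
import statistics


def variability_groups(values):
    """Label each per-subject variability value as 'low', 'medium' or 'high'.

    First quartile -> 'low', second and third quartiles -> 'medium',
    fourth quartile -> 'high'. Boundary handling is an assumption.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    groups = []
    for v in values:
        if v <= q1:
            groups.append("low")
        elif v > q3:
            groups.append("high")
        else:
            groups.append("medium")
    return groups


# Hypothetical per-subject temporal variability values (not study data).
variability = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08]
print(variability_groups(variability))
```

With this grouping, roughly a quarter of subjects fall in each of the 'low' and 'high' groups and about half in 'medium'.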
Figure S12. A) Box-whisker plots showing the exacerbation frequency of subjects within different quartile groups of temporal variability of alpha and beta diversity for baseline samples only, with the first quartile defined as ‘low’, the second and third quartiles as ‘medium’ and the fourth quartile as ‘high’. B) Box-whisker plots showing the temporal variability of alpha and beta diversity for baseline samples only in subjects within different classes of exacerbation frequency. Each comparison was performed on all samples and on Leicester samples alone; the sample sizes for London and Manchester were too small to support meaningful conclusions. The number of samples is given in parentheses under each subgroup in the box-whisker plots. Error bars extend to within 1.5 times the interquartile range of the upper and lower quartiles.
Table S1. Major demographic and baseline clinical features of all subjects and of subjects at each center.

| Features | All subjects (N=281) * | London (N=128) | Leicester (N=100) | Manchester (N=53) |
|---|---|---|---|---|
| Gender † | Male: 187 (70.3%), Female: 79 (29.7%) | Male: 75 (64.7%), Female: 41 (35.3%) | Male: 76 (76.8%), Female: 23 (23.2%) | Male: 36 (70.6%), Female: 15 (29.4%) |
| Age ‡ | 70 (8.1) | 71 (8.6) | 69 (7.6) | 67 (7.4) |
| BMI | 27.2 (5.4) | 26.7 (5.7) | 27.8 (5.0) | 26.9 (5.1) |
| Baseline GOLD status | 1: 30 (11.4%), 2: 132 (50.2%), 3: 78 (29.7%), 4: 23 (8.7%) | 1: 9 (7.8%), 2: 62 (53.9%), 3: 33 (28.7%), 4: 11 (9.6%) | 1: 8 (8.1%), 2: 51 (51.5%), 3: 32 (32.3%), 4: 8 (8.1%) | 1: 13 (26.5%), 2: 19 (38.8%), 3: 13 (26.5%), 4: 4 (8.2%) |
| Treatment # | Antibiotics: 38 (15.3%), Steroids: 9 (3.6%), Both: 202 (81.1%) | Antibiotics: 22 (13.9%), Steroids: 1 (0.6%), Both: 135 (85.4%) | Antibiotics: 11 (15.1%), Steroids: 8 (11.0%), Both: 54 (74.0%) | Antibiotics: 5 (27.8%), Steroids: 0 (0.0%), Both: 13 (72.2%) |
| Number of cigarette packs per year ¹ | 47 (30) | 45 (34) | 47 (28) | 49 (32) |
| Number of exacerbations per year ¹ | 1.1 (1.6) | 1.6 (1.8) | 1 (1.7) | 0 (0.7) |
| Baseline FEV1 ‡ | 1.5 (0.6) | 1.3 (0.5) | 1.4 (0.6) | 1.7 (0.6) |
| Baseline FEV1% | 56.3 (18.9) | 54.6 (17.3) | 54.5 (17.4) | 63.7 (23.5) |
| Baseline FEV1 predicted | 2.6 (0.5) | 2.5 (0.5) | 2.7 (0.6) | 2.7 (0.5) |
| Baseline FVC | 2.9 (1.0) | 2.8 (1.0) | 2.7 (0.8) | 3.6 (0.8) |
| Baseline FEV1/FVC ratio | 0.5 (0.1) | 0.5 (0.1) | 0.5 (0.1) | 0.5 (0.1) |
| CAT score | 18.7 (7.3) | 16.7 (7.5) | 20.0 (6.4) | 19.5 (8.1) |
| CES-D score ¹ | 10 (13) | 10 (12) | 10 (13) | 13 (16) |
| SGRQ total score | 47.4 (18.2) | 45.3 (15.3) | 48.7 (18.7) | 48.8 (22.3) |

† Categorical data are presented as number (proportion).
‡ Continuous data are presented as mean (SD) unless stated below.
¹ Median (IQR).
* 15 subjects had no demographic or clinical data available.
# The numbers represent exacerbation events and thus include subjects with more than one exacerbation.
Table S2. Major longitudinal clinical features at baseline and exacerbations of all samples and of samples at each center. Column headers give center / visit type (Base: baseline; Exac: exacerbation).

| Features | All / All (N=716) | All / Base (N=446) | All / Exac (N=270) | London / All (N=301) | London / Base (N=132) | London / Exac (N=169) | Leicester / All (N=303) | Leicester / Base (N=221) | Leicester / Exac (N=82) | Manchester / All (N=112) | Manchester / Base (N=93) | Manchester / Exac (N=19) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FEV1 | 1.4 (0.5) | 1.5 (0.5) | 1.2 (0.5) | 1.2 (0.5) | 1.3 (0.5) | 1.2 (0.5) | 1.4 (0.5) | 1.5 (0.5) | 1.2 (0.4) | 1.6 (0.6) | 1.7 (0.6) | 1.0 (0.3) |
| FVC | 2.8 (0.9) | 3.0 (0.8) | 2.5 (0.9) | 2.7 (1.0) | 3.0 (1.0) | 2.6 (1.0) | 2.7 (0.7) | 2.7 (0.7) | 2.5 (0.6) | 3.3 (0.9) | 3.4 (0.8) | 2.1 (0.9) |
| FEV1/FVC ratio | 0.5 (0.2) | 0.5 (0.2) | 0.5 (0.2) | 0.5 (0.2) | 0.5 (0.1) | 0.5 (0.2) | 0.5 (0.1) | 0.5 (0.1) | 0.5 (0.1) | 0.5 (0.4) | 0.5 (0.4) | 0.4 (0.2) |
| CAT score | 21.1 (7.4) | 19.6 (7.1) | 24.2 (7.0) | 19.9 (7.8) | 17.3 (6.9) | 22.7 (7.8) | 21.9 (6.8) | 20.5 (6.7) | 25.7 (5.5) | 21.1 (8.0) | 20.0 (8.0) | 26.4 (6.6) |
| C-reactive protein (CRP) ¹ | 5.0 (11.0) | 3.0 (5.0) | 10.0 (27.0) | 6.0 (16.0) | 4.0 (6.0) | 9.0 (25.0) | 5.0 (10.0) | 3.0 (5.0) | 10.0 (31.0) | 5.0 (8.0) | 4.0 (5.0) | 13.0 (28.0) |
| Blood neutrophil count (×10⁹ cells/L) | 5.5 (2.3) | 4.9 (1.7) | 6.2 (2.7) | 5.9 (2.5) | 5.5 (1.9) | 6.2 (2.8) | 5.3 (2.1) | 4.9 (1.7) | 6.3 (2.6) | 4.8 (1.7) | 4.5 (1.4) | 6.3 (2.2) |
| Blood lymphocyte count (×10⁹ cells/L) | 1.8 (0.7) | 1.8 (0.6) | 1.8 (0.7) | 1.8 (0.7) | 2.0 (0.7) | 1.7 (0.7) | 1.9 (0.7) | 1.8 (0.6) | 2.0 (0.7) | 1.7 (0.6) | 1.8 (0.6) | 1.6 (0.8) |
| Blood monocyte count (×10⁹ cells/L) | 0.7 (0.3) | 0.6 (0.2) | 0.7 (0.3) | 0.8 (0.3) | 0.8 (0.2) | 0.8 (0.3) | 0.5 (0.2) | 0.5 (0.1) | 0.6 (0.2) | 0.6 (0.2) | 0.6 (0.2) | 0.7 (0.3) |
| Blood eosinophil count (×10⁹ cells/L) ¹ | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) | 0.2 (0.2) |
| Blood basophil count (×10⁹ cells/L) | 0.0 (0.0) | 0.1 (0.0) | 0.0 (0.0) | 0.0 (0.0) | 0.0 (0.0) | 0.0 (0.0) | 0.1 (0.0) | 0.1 (0.0) | 0.1 (0.0) | 0.0 (0.0) | 0.0 (0.0) | 0.0 (0.0) |
| Sputum neutrophil count % ¹ | 78.8 (33.8) | 75.1 (34.0) | 84.2 (28.5) | 73.7 (34.0) | 73.0 (29.0) | 73.7 (36.0) | 77.4 (34.2) | 73.0 (34.2) | 89.5 (22.2) | 88.2 (13.5) | 88.0 (17.5) | 89.2 (7.8) |
| Sputum lymphocyte count % ¹ | 0.0 (0.5) | 0.0 (0.3) | 0.2 (1.0) | 0.0 (2.0) | 0.0 (0.0) | 2.0 (3.0) | 0.2 (0.4) | 0.2 (0.5) | 0.0 (0.2) | 0.0 (0.1) | 0.0 (0.2) | 0.0 (0.0) |
| Sputum eosinophil count % ¹ | 0.8 (2.0) | 0.8 (2.2) | 0.5 (2.0) | 0.0 (2.0) | 0.0 (1.0) | 1.0 (2.0) | 0.8 (2.2) | 0.8 (2.8) | 0.5 (1.5) | 1.1 (3.0) | 1.2 (3.0) | 1.0 (8.2) |
| Sputum macrophage count % ¹ | 13.0 (21.2) | 14.5 (23.2) | 8.5 (19.0) | 20.0 (27.0) | 22.0 (22.0) | 20.0 (29.0) | 10.8 (19.2) | 14.2 (23.2) | 5.2 (9.5) | 7.0 (10.8) | 8.0 (12.9) | 5.5 (9.5) |
| Sputum epithelial cell count % ¹ | 3.2 (8.0) | 4.0 (9.8) | 2.0 (4.8) | 0.0 (1.0) | 0.0 (0.0) | 0.0 (2.0) | 4.8 (11.8) | 6.0 (12.2) | 3.5 (7.8) | 1.9 (3.2) | 1.5 (5.2) | 3.0 (2.5) |

Data are presented as mean (SD) unless stated below.
¹ Median (IQR).
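Table S3. The relative abundances of major phyla and genera (average relative abundance > 1%) in the sputum microbiome. Phylum-level taxa are listed first, followed by genus-level taxa. Column headers give center / visit type (Base: baseline; Exac: exacerbation).

| Taxa | All / All (N=716) | All / Base (N=446) | All / Exac (N=270) | London / All (N=301) | London / Base (N=132) | London / Exac (N=169) | Leicester / All (N=303) | Leicester / Base (N=221) | Leicester / Exac (N=82) | Manchester / All (N=112) | Manchester / Base (N=93) | Manchester / Exac (N=19) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Phylum level | | | | | | | | | | | | |
| Proteobacteria | 52.3 | 51.5 | 53.7 | 53.0 | 51.4 | 54.2 | 52.1 | 51.8 | 52.8 | 51.3 | 50.9 | 53.3 |
| Firmicutes | 28.7 | 29.0 | 28.1 | 27.2 | 28.0 | 26.5 | 30.0 | 29.4 | 31.8 | 29.3 | 29.8 | 26.8 |
| Bacteroidetes | 15.0 | 15.3 | 14.5 | 15.5 | 16.1 | 15.0 | 14.4 | 14.9 | 12.9 | 15.5 | 15.2 | 16.8 |
| Actinobacteria | 1.9 | 2.0 | 1.8 | 2.1 | 2.2 | 2.0 | 1.7 | 1.9 | 1.3 | 1.9 | 2.0 | 1.4 |
| Fusobacteria | 1.4 | 1.5 | 1.3 | 1.6 | 1.6 | 1.6 | 1.3 | 1.4 | 0.9 | 1.3 | 1.4 | 1.0 |
| Genus level | | | | | | | | | | | | |
| Haemophilus | 25.8 | 25.6 | 26.1 | 27.0 | 26.5 | 27.3 | 25.2 | 26.1 | 22.7 | 24.0 | 23.0 | 29.3 |
| Veillonella | 15.8 | 16.1 | 15.3 | 13.7 | 14.0 | 13.5 | 17.3 | 16.7 | 18.9 | 17.4 | 17.7 | 16.3 |
| Prevotella | 13.2 | 13.5 | 12.6 | 13.6 | 14.3 | 13.1 | 12.6 | 13.2 | 11.2 | 13.4 | 13.1 | 15.0 |
| Erwinia | 7.0 | 7.3 | 6.6 | 7.8 | 8.2 | 7.5 | 6.1 | 6.6 | 4.8 | 7.4 | 7.6 | 6.3 |
| Granulicatella | 6.8 | 6.8 | 6.9 | 7.5 | 8.0 | 7.1 | 6.9 | 6.9 | 6.9 | 4.8 | 4.7 | 5.4 |
| Cronobacter | 6.4 | 6.5 | 6.3 | 6.5 | 5.8 | 7.0 | 6.0 | 6.4 | 4.8 | 7.3 | 7.4 | 6.3 |
| Streptococcus | 4.4 | 4.5 | 4.3 | 4.3 | 4.4 | 4.3 | 4.3 | 4.2 | 4.5 | 5.1 | 5.4 | 3.7 |
| Moraxella | 4.0 | 3.6 | 4.8 | 3.1 | 2.7 | 3.4 | 4.8 | 3.7 | 7.8 | 4.4 | 4.5 | 4.1 |
| Actinomyces | 1.0 | 1.1 | 0.9 | 1.1 | 1.2 | 1.1 | 0.9 | 1.0 | 0.6 | 1.1 | 1.2 | 0.7 |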
Table S4. List of clinical variables significantly associated with microbial alpha diversity and phylum-level beta diversity among samples. P-values are indicated for significant variables in the model.

| Clinical variable | Alpha (Shannon): All | Alpha: London | Alpha: Leicester | Alpha: Manchester | Beta (phylum-level abundance): All | Beta: London | Beta: Leicester | Beta: Manchester |
|---|---|---|---|---|---|---|---|---|
| C-reactive protein (CRP) | 0.02 | 0.005 | 0.04 | 0.02 | 0.04 | # | # | 0.01 |
| FEV1/FVC ratio | 0.003 | # | 0.03 | 0.003 | 0.04 | # | # | # |
| Age | 0.03 | # | # | # | 0.02 | # | # | 0.01 |
| Number of exacerbations per year | # | # | 0.04 | # | # | # | 0.01 | # |
| White blood cell | # | 0.02 | # | 0.01 | # | # | # | # |
| Blood neutrophil count | # | 0.03 | # | 0.01 | # | # | # | # |
| Blood lymphocyte count | # | 0.01 | # | 0.01 | # | # | # | # |
| Blood basophil count | # | 0.01 | # | # | # | # | # | 0.11 ‡ |
| FEV1 | # | # | # | 0.04 | # | # | # | # |
| FVC | # | # | # | 0.01 | # | # | # | # |
| CAT score | # | # | 0.02 | # | # | # | # | # |
| BMI | # | # | # | # | # | # | 0.07 ‡ | # |
| Visit type | <1E-7 | 0.001 | <1E-7 | # | NA a | NA | NA | NA |
| Center | <1E-7 | NA | NA | NA | 0.001 | NA | NA | NA |

# P≥0.05 and absent in the model.
‡ Variables not statistically significant but present in the model.
a Only initial baseline samples were used.
Table S5. List of clinical variables significantly associated with alpha and beta diversity of the metagenomic profiles of the sputum microbiome inferred by PICRUSt [16] among samples. P-values are indicated for significant variables in the model.

| Inferred functional profile | Alpha (Shannon): All | Alpha: London | Alpha: Leicester | Alpha: Manchester | Beta (L1): All | Beta (L1): London | Beta (L1): Leicester | Beta (L1): Manchester | Beta (L2): All | Beta (L2): London | Beta (L2): Leicester | Beta (L2): Manchester |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CRP | 0.05 | <1E-7 | # | 0.01 | 0.03 | 0.02 | # | 0.09 ‡ | 0.04 | 0.14 ‡ | # | # |
| Basophil count | # | # | 0.01 | # | # | 0.15 ‡ | # | 0.03 | # | # | # | 0.02 |
| FEV1 | 0.002 | # | 0.02 | # | # | # | # | 0.11 ‡ | # | # | # | 0.10 ‡ |
| Age | # | # | # | # | 0.11 ‡ | # | # | 0.11 ‡ | # | # | # | 0.11 ‡ |
| FEV1/FVC ratio | 0.002 | # | 0.005 | # | # | # | # | # | # | # | # | # |
| FVC | 0.005 | # | 0.03 | # | # | # | # | # | # | # | # | # |
| Lymphocyte count | # | # | # | # | # | 0.02 | # | # | # | 0.05 | # | # |
| Years smoked | # | # | # | # | # | 0.07 ‡ | # | # | # | 0.01 | # | # |
| White blood cells | # | 0.01 | # | # | # | # | # | # | # | # | # | # |
| CAT score | # | # | 0.05 | # | # | # | # | # | # | # | # | # |
| Number of exacerbations per year | # | # | # | # | 0.11 ‡ | # | # | # | # | # | # | # |
| Visit type | <1E-7 | 0.01 | <1E-7 | # | NA a | NA | NA | NA | NA | NA | NA | NA |
| Center | <1E-7 | NA | NA | NA | # | NA | NA | NA | # | NA | NA | NA |

# P≥0.05 and absent in the model.
‡ Variables not statistically significant but present in the model.
a Only initial baseline samples were used.
Table S6. Occurrence and average relative abundance, in the COPDMAP dataset, of contaminant genera detected in sequenced negative ‘blank’ controls by Salter et al. [9]. The first column (occurrence, rel abundance > 0) is the fraction of samples in which each genus has an abundance greater than 0. The second column (occurrence, rel abundance > 0.1) is the fraction of samples in which each genus has an abundance greater than 0.1. The third column is the average relative abundance of each genus across all samples.

Columns: genus, occurrence (rel abundance > 0), occurrence (rel abundance > 0.1), average rel abundance. Names of higher-level taxonomic groups appear on their own lines.
Alphaproteobacteria
Afipia 0 0 0
Aquabacterium 0 0 0
Asticcacaulis 0.002793296 0 6.06E-08
Aurantimonas 0.025139665 0 6.97E-07
Beijerinckia 0 0 0
Bosea 0.001396648 0 6.06E-08
Bradyrhizobium 0 0 0
Brevundimonas 0.044692737 0 1.97E-05
Caulobacter 0.001396648 0 3.03E-08
Craurococcus 0 0 0
Devosia 0.011173184 0 4.25E-07
Hoeflea 0 0 0
Mesorhizobium 0 0 0
Methylobacterium 0.118715084 0 4.52E-06
Novosphingobium 0 0 0
Ochrobactrum 0.698324022 0 4.58E-05
Paracoccus 0.086592179 0 2.30E-06
Pedomicrobium 0 0 0
Phyllobacterium 0.009776536 0 3.34E-07
Rhizobium 0.005586592 0 1.21E-07
Roseomonas 0 0 0
Sphingobium 0.036312849 0 6.22E-06
Sphingomonas 0.160614525 0 3.02E-05
Sphingopyxis 0.018156425 0 8.49E-07
Betaproteobacteria
Acidovorax 0.019553073 0 7.28E-07
Azoarcus 0 0 0
Azospira 0 0 0
Burkholderia 0 0 0
Comamonas 0.008379888 0 1.82E-07
Cupriavidus 0.001396648 0 9.10E-08
Curvibacter 0 0 0
Delftia 0.005586592 0 1.52E-07
Duganella 0 0 0
Herbaspirillum 0 0 0
Janthinobacterium 0.002793296 0 1.52E-07
Kingella 0.995810056 0 0.000529353
Leptothrix 0 0 0
Limnobacter 0 0 0
Massilia 0 0 0
Methylophilus 0 0 0
Methyloversatilis 0 0 0
Oxalobacter 0.025139665 0 1.27E-06
Pelomonas 0 0 0
Polaromonas 0 0 0
Ralstonia 0.005586592 0 1.21E-07
Schlegelella 0 0 0
Sulfuritalea 0 0 0
Undibacterium 0 0 0
Variovorax 0.906424581 0 0.000159964
Gammaproteobacteria
Acinetobacter 1 0 0.001896829
Enhydrobacter 0.083798883 0 3.49E-06
Enterobacter 0.997206704 0 0.000509339
Escherichia 0.048882682 0 1.09E-06
Nevskia 0.001396648 0 6.06E-08
Pseudomonas 0.995810056 0.004189944 0.004687729
Pseudoxanthomonas 0.019553073 0 8.79E-07
Psychrobacter 0 0 0
Stenotrophomonas 0.780726257 0 0.000139556
Xanthomonas 0 0 0
Actinobacteria
Aeromicrobium 0 0 0
Arthrobacter 0.005586592 0 2.73E-07
Beutenbergia 0 0 0
Brevibacterium 0.027932961 0 1.27E-06
Corynebacterium 1 0 0.001012521
Curtobacterium 0 0 0
Dietzia 0.002793296 0 2.12E-07
Geodermatophilus 0 0 0
Janibacter 0.018156425 0 4.25E-07
Kocuria 0 0 0
Microbacterium 0.159217877 0 6.22E-06
Micrococcus 0.054469274 0 2.03E-06
Microlunatus 0.002793296 0 9.10E-08
Patulibacter 0 0 0
Propionibacterium 0 0 0
Rhodococcus 0.909217877 0 0.000154081
Tsukamurella 0 0 0
Firmicutes
Abiotrophia 0 0 0
Bacillus 0.048882682 0 2.03E-06
Brevibacillus 0 0 0
Brochothrix 0.002793296 0 9.10E-08
Facklamia 0.002793296 0 9.10E-08
Paenibacillus 0.060055866 0 2.30E-06
Streptococcus 1 0.032122905 0.044335024
Bacteroidetes
Chryseobacterium 0.205307263 0 1.11E-05
Dyadobacter 0.002793296 0 6.06E-08
Flavobacterium 0.026536313 0 1.46E-06
Hydrotalea 0 0 0
Niastella 0 0 0
Olivibacter 0 0 0
Pedobacter 0.009776536 0 4.25E-07
Wautersiella 0 0 0
Deinococcus-Thermus
Deinococcus 0.124301676 0 5.34E-06
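
The three columns of Table S6 follow directly from a per-sample relative abundance table. The sketch below shows one way to compute them. The function name `contaminant_summary`, the list-of-dicts input layout and the example abundances are hypothetical, and relative abundances are assumed to be expressed as fractions between 0 and 1.

```python
def contaminant_summary(samples, genus, threshold=0.1):
    """Summarise one genus across samples.

    `samples` is a list of dicts mapping genus name to relative abundance
    (as a fraction); a genus missing from a sample counts as abundance 0.
    Returns (occurrence > 0, occurrence > threshold, average abundance).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    values = [sample.get(genus, 0.0) for sample in samples]
    occurrence_any = sum(v > 0 for v in values) / n
    occurrence_above = sum(v > threshold for v in values) / n
    average = sum(values) / n
    return occurrence_any, occurrence_above, average


# Hypothetical abundances for four samples (not study data).
samples = [
    {"Pseudomonas": 0.2},
    {"Pseudomonas": 0.0},
    {"Pseudomonas": 0.05},
    {},  # genus not detected in this sample
]
print(contaminant_summary(samples, "Pseudomonas"))
```

The occurrence values in the table are consistent with fractions of the 716 samples: 0.001396648 corresponds to 1 of 716 samples and 0.002793296 to 2 of 716.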
References
1. General Assembly of the World Medical A. World Medical Association Declaration of Helsinki: ethical principles for medical research involving human subjects. The Journal of the American College of Dentists 2014;81(3):14-8.
2. International Conference on Harmonisation of technical requirements for registration of pharmaceuticals for human u. ICH harmonized tripartite guideline: Guideline for Good Clinical Practice. Journal of postgraduate medicine 2001;47(1):45-50.
3. Bafadhel M, McKenna S, Terry S, et al. Blood eosinophils to direct corticosteroid treatment of exacerbations of chronic obstructive pulmonary disease: a randomized placebo-controlled trial. American journal of respiratory and critical care medicine 2012;186(1):48-55. doi: 10.1164/rccm.201108-1553OC
4. Bafadhel M, McKenna S, Terry S, et al. Acute exacerbations of chronic obstructive pulmonary disease: identification of biologic clusters and their biomarkers. American journal of respiratory and critical care medicine 2011;184(6):662-71. doi: 10.1164/rccm.201104-0597OC
5. Anthonisen NR, Manfreda J, Warren CP, et al. Antibiotic therapy in exacerbations of chronic obstructive pulmonary disease. Annals of internal medicine 1987;106(2):196-204.
6. Rodriguez-Roisin R. Toward a consensus definition for COPD exacerbations. Chest 2000;117(5 Suppl 2):398S-401S.
7. Caporaso JG, Lauber CL, Walters WA, et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences of the United States of America 2011;108 Suppl 1:4516-22. doi: 10.1073/pnas.1000080107
8. Caporaso JG, Lauber CL, Walters WA, et al. Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. The ISME journal 2012;6(8):1621-4. doi: 10.1038/ismej.2012.8
9. Salter SJ, Cox MJ, Turek EM, et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC biology 2014;12:87. doi: 10.1186/s12915-014-0087-z
10. Langmead B, Trapnell C, Pop M, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome biology 2009;10(3):R25. doi: 10.1186/gb-2009-10-3-r25
11. Zhang J, Kobert K, Flouri T, et al. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 2014;30(5):614-20. doi: 10.1093/bioinformatics/btt593
12. Caporaso JG, Kuczynski J, Stombaugh J, et al. QIIME allows analysis of high-throughput community sequencing data. Nature methods 2010;7(5):335-6. doi: 10.1038/nmeth.f.303
13. Edgar RC, Haas BJ, Clemente JC, et al. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 2011;27(16):2194-200. doi: 10.1093/bioinformatics/btr381
14. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 2010;26(19):2460-1. doi: 10.1093/bioinformatics/btq461
15. McDonald D, Price MN, Goodrich J, et al. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. The ISME journal 2012;6(3):610-8. doi: 10.1038/ismej.2011.139
16. Langille MG, Zaneveld J, Caporaso JG, et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature biotechnology 2013;31(9):814-21. doi: 10.1038/nbt.2676
17. Wang Z, Bafadhel M, Haldar K, et al. Lung microbiome dynamics in chronic obstructive pulmonary disease exacerbations. The European respiratory journal 2016 doi: 10.1183/13993003.01406-2015
18. Eriksson L JE, Kettaneh-Wold N. Multi- and Megavariate Data Analysis, Part 2, Advanced Applications and Method Extensions. MKS Umetrics AB 2006
19. Flores GE, Caporaso JG, Henley JB, et al. Temporal variability is a personalized feature of the human microbiome. Genome biology 2014;15(12):531. doi: 10.1186/s13059-014-0531-y
20. R Core Team (2014) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/
21. Kuznetsova A BPaCR. lmerTest: Tests in Linear Mixed Effects Models. 2014
22. Oksanen J BF, Kindt R, Legendre P, Minchin PR, O'Hara RB, Simpson GL, Solymos P, Stevens MHH and Wagner H. vegan: Community Ecology Package. 2013
[Figure panels, text recovered from images. Panels A to D: relative abundance (%) of taxa in baseline (446) and exacerbation (270) samples and in exacerbation subtypes B (37), E (13), V (38), Pauci (27), BE (3) and BV (6); Shannon diversity in baseline and exacerbation samples and by subtype; Shannon diversity against Moraxella relative abundance (FDR P < 9.6e-14, R=-0.445); ordination axes PC1 (60.23%), PC2 (13.56%), PC3 (8.24%). Legend: Proteobacteria Haemophilus, Proteobacteria Moraxella, Proteobacteria Others, Firmicutes Veillonella, Firmicutes Streptococcus, Firmicutes Others, Bacteroidetes, Actinobacteria, Fusobacteria, Others.]
[Figure panels, text recovered from images. Principal Coordinate 1 (49.0%) per subject for baseline and exacerbation samples at Leicester, London and Manchester. Exacerbation changes from baseline in FEV1, FVC and CAT score by group. Dys: dysbiosis; NonDys: non-dysbiosis; EB: eosinophil + dysbiosis; eB: non-eosinophil + dysbiosis; Eb: eosinophil + non-dysbiosis; eb: non-eosinophil + non-dysbiosis. † 0.05 < P < 0.1; * P < 0.05.]
[Figure panels, text recovered from images. Panels A and B: CV of Shannon and median of paired weighted UniFrac per subject at London, Leicester and Manchester.]
[Figure panels, text recovered from images. Panel A: number of exacerbations per year in subjects with dysbiosis exacerbations (**). Panels B and C: variation of alpha diversity (*) and of beta diversity (***) by number of exacerbations per year, classes 0-1 (46), 1-2 (32), 2-3 (21) and >3 (24), and by variability group Low (31), Medium (60) and High (32).]