link.springer.com · Web viewleaves from the data analysis was not a viable solution. This approach would have removed a large proportion of potentially important metabolites, e.g.,

Supplemental material 1

Optimization of the extraction solvent

The following extraction solvents/solvent mixtures were tested on 10 randomly chosen leaves:

methanol, dichloromethane, heptane, 1:1 methanol:dichloromethane, 3:1:1

methanol:chloroform:water, 3:3:2 acetonitrile:2-propanol:water, 5:2:2 methanol:chloroform:water,

and 2:2:0.5 methanol:chloroform:water. Based on GC-TOF (method parameters as stated in the

manuscript) profiles of the test samples, 3:3:2 acetonitrile:2-propanol:water was used for sample

extraction. This solvent mixture extracted a high number of metabolites and provided very

reproducible results.

Supplemental material 2

Quality control of the GC-TOF data

Before acquiring the experimental data, six calibration curve samples spanning one order of

dynamic range and consisting of 29 pure reference compounds, comprising a variety of different

metabolites, e.g., amino acids, sugars and organic acids, were recorded as quality control (QC)

samples to ensure instrument performance. Together with the QC samples, one blank sample

prepared in the same manner was recorded to control for laboratory contamination. Furthermore,

daily QC samples were used. These samples comprised one instrument blank and one method blank

in addition to one freshly prepared QC sample (highest calibration level). A set of these QC samples

was run for every 10 experimental samples and evaluated daily. The evaluation and basic principles

of the QC scheme employed followed the principles outlined by Fiehn et al. (2008).

To evaluate the quality of the GC-TOF data, principal component analysis (PCA) was applied to all

samples injected to ensure that the blank and QC samples were well separated from the biological

samples (supplemental figure 2.1). Furthermore, the data were investigated for batch effects using a

PCA of all the biological samples by coloring the scores plot according to the injection number

and/or the batch number of the samples. No groupings of the samples could be observed, indicating

that there were no batch effects (supplemental figure 2.2). Additionally, variations in the peak

heights of the fatty acid methyl ester (FAME) retention index markers across the entire series of

injections were evaluated and found to have an acceptable relative standard deviation (RSD%) of

approximately 15%, with the exception of the C16 and C10 FAMEs. These had RSD% values of 18.5%

and 17.2%, respectively, which were also considered acceptable because these markers were located inside

clusters of several larger, closely eluting peaks.
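
The RSD% check on the FAME marker peak heights can be illustrated with a short sketch. It assumes RSD% is the sample standard deviation divided by the mean, times 100; the peak heights below are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def rsd_percent(peak_heights):
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(peak_heights) / mean(peak_heights)


# Hypothetical peak heights of one FAME marker over five injections.
example_heights = [1000.0, 1100.0, 900.0, 1050.0, 950.0]
example_rsd = rsd_percent(example_heights)
print(f"RSD% = {example_rsd:.1f}")
```

Applied to each marker across the whole injection series, a function like this would give one RSD% value per FAME to compare against the acceptance level.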

[Figure omitted: PCA scores plot, title "PCA Scores [Model 1]"; x-axis "Scores PC#1 (32.667%)"; y-axis "Scores PC#2 (8.881%)"; legend: leaves from ant plants, leaves from control plants, blank samples, QC samples.]

Supplemental Figure 2.1 - PCA scores plot showing the clear separation of biological samples (leaves from ant-hosting plants (red), leaves from control plants (green)) from method blanks (blue) and quality control samples (pink).

[Figure omitted: PCA scores plot, title "PCA Scores [Model 3: Ant and control plants]"; x-axis "Scores PC#1 (13.130%)"; y-axis "Scores PC#2 (10.652%)"; color scale: Batch.]

Supplemental Figure 2.2 - PCA scores plot colored according to the batch number of the samples.
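
The batch-effect check was made by visual inspection of the colored scores plot. As a rough numerical counterpart, the sketch below computes scores on the first principal component and averages them per batch; clearly different batch means would suggest a batch effect. The function names, the power-iteration approach, and the data are illustrative assumptions, not the software or data used in the study.

```python
import math


def first_pc_scores(rows, iterations=200):
    """Scores of each sample on the first principal component (power iteration)."""
    n, p = len(rows), len(rows[0])
    col_means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - col_means[j] for j in range(p)] for row in rows]
    # Non-uniform start vector; a start exactly orthogonal to PC1 would fail.
    v = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        # Multiply by X^T to complete one step of power iteration on X^T X.
        v = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in centered]


def mean_score_by_batch(scores, batches):
    """Average PC1 score per batch label."""
    grouped = {}
    for score, batch in zip(scores, batches):
        grouped.setdefault(batch, []).append(score)
    return {batch: sum(vals) / len(vals) for batch, vals in grouped.items()}


# Hypothetical data: four samples, two features, with an artificial batch shift.
data = [[1.0, 1.0], [2.0, 2.1], [10.0, 10.2], [11.0, 11.0]]
batches = [1, 1, 2, 2]
scores = first_pc_scores(data)
batch_means = mean_score_by_batch(scores, batches)
print(batch_means)
```

In this artificial example the two batches end up on opposite sides of the first component, which is the kind of grouping that was not observed in the real data.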

Supplemental material 3

Initial analysis of the GC-TOF data

A total of 624 metabolite peaks were found in the C. arabica leaf extracts. Of these, 96 were

annotated, whereas the remaining 528 were kept in the dataset as unknowns and identified only by

their unique Binbase number. Inspection of the Binbase results for replaced values indicative of

missing peaks, together with an overlay of the chromatograms, indicated that the differences between

ant-hosting and control plants were primarily quantitative because the same peaks were generally

present in all the sample extracts. When the sampling times (1-14) were visualized individually by

PCA, clear separations between ant-hosting and control plants were observed at all sampling times.

However, when all sampling times were analyzed together, no clear separation of the plants was

found. When this PCA scores plot was colored according to the number of ant-manure (AM) spots on the

individual leaves (data not shown), it was evident that leaves with a high number of spots were

grouped more distantly from the other samples. According to the analysis of pure AM (manuscript in

preparation), AM contains a range of small common primary metabolites, such as amino acids,

sugars, and organic acids, that are also found in C. arabica leaves. The fact that these compounds

are being deposited on the leaf surface, and thus being extracted along with the leaf, complicates the

distinction between an actual metabolic response and compounds simply present on the leaf surface.

Removing all compounds present in both AM and C. arabica leaves from the data analysis was not

a viable solution. This approach would have removed a large proportion of potentially important

metabolites, e.g., several amino acids. Therefore, the separation of leaves with a high number of

AM spots observed in the initial PCA warranted further scrutiny. To identify compounds that were

positively correlated with the number of AM spots on the leaf, a PLS-DA model with the number of

AM spots as the Y-variable was prepared. Details regarding the validation of this model are

described in supplemental material 4.1. In the PLS-DA model scores plot colored according to the

number of AM spots (supplemental figure 3.1), it was observed that more spots present on the leaf

surface created greater distance from the leaves of control plants and ant-hosting plants with no

spots. To identify the metabolites responsible for this separation, the metabolites with the 50 highest

loadings on the first latent variable (LV) were selected because the primary separation was along

this direction in the scores plot. From the list of these 50 variables, non-annotated metabolites were

removed, and eight annotated metabolites remained: β-alanine, glycine, urea, valine, citrulline,

isoleucine, 1-kestose, and leucine.
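
The selection step described above (rank by loading on the first LV, keep the 50 highest, then drop non-annotated entries) can be sketched as follows. The function name, the identifiers, and the loading values are hypothetical; only the procedure follows the text.

```python
def top_annotated_by_loading(loadings, annotated, n_top=50):
    """Return annotated metabolites among the n_top highest LV1 loadings.

    loadings:  dict mapping metabolite identifier -> loading on LV1
    annotated: set of identifiers that carry an annotation
    """
    ranked = sorted(loadings, key=loadings.get, reverse=True)
    return [name for name in ranked[:n_top] if name in annotated]


# Hypothetical loadings; "unknown_1" and "unknown_2" stand in for non-annotated bins.
example_loadings = {
    "glycine": 0.12,
    "unknown_1": 0.11,
    "urea": 0.09,
    "unknown_2": 0.08,
    "sucrose": -0.02,
}
example_annotated = {"glycine", "urea", "sucrose"}
selected = top_annotated_by_loading(example_loadings, example_annotated, n_top=4)
print(selected)
```

With `n_top=4` the sketch keeps glycine and urea; sucrose is annotated but falls outside the top four loadings.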

[Figure omitted: PLS-DA scores plot; x-axis "LV 1 (11.16%)"; y-axis "LV 2 (7.73%)"; legend: 0 AM spots, 1-10 AM spots, 11-20 AM spots, 21-40 AM spots.]

Supplemental Figure 3.1 - PLS-DA model of leaves from ant-hosting plants with AM spots using the number of AM spots as the Y-variable. The scores plot, latent variable (LV) 1 versus 2, has been colored according to the number of AM spots on the leaf surface.

From the analysis of pure AM (data not shown), it was evident that all of these metabolites were

also abundant in pure AM, and when plotting the levels of these compounds in C. arabica leaves

against the number of AM spots, clear positive correlations were observed (data not shown). Other

compounds were present in similar levels in pure AM but showed very little or no positive

correlation between the number of AM spots on the leaf surface and the levels in the C. arabica

leaves. At the time of writing, no explanations for these observations existed; thus, all leaves with

AM spots were excluded from the remaining data analysis. This step was performed to prevent

compounds deposited in the AM spots from confounding the metabolic response of ant-hosting C.

arabica plants. Thus, the following univariate and multivariate data analyses were performed on a

reduced data set consisting of all leaves from control plants and only the leaves from ant-hosting

plants without AM spots (23 samples removed). One exception was the Pearson correlation

matrices, in which correlations for leaves from ant-hosting plants with AM spots were also

investigated.

Supplemental material 4

Validation of the PLS-DA models

PLS-DA models have a tendency to be over-fitted and provide overly optimistic results, so we

needed to rigorously validate these models (Westerhuis et al. 2008). The present study included a

relatively large number of samples, which reduced the risk of over-fitting that is more likely to

occur when modeling few samples and hundreds of variables (Westerhuis et al. 2008). Close

attention was still paid to over-fitting by the careful validation and evaluation of the models.

Validation was performed by random repeated cross-validation using 1,000 repetitions and either 28

segments for the model using the number of AM spots on the leaf surface as the Y-variable or 20

segments for the model using the treatment of the plants (ants/control) as the Y-variable. With this

type of validation, six samples were randomly chosen as test set samples in each repetition. A

calibration model was then built using the remaining samples and tested using the test set. This

process was repeated 1,000 times, and the resulting model was an average of all of the repetitions.

When aiming for a rigorous validation, it is important to remove test set samples for validation that

are independent of the samples used for calibration. Samples such as technical or biological

replicates cannot be considered independent. When repeating the random selection of test set

samples 1,000 times, in some cases, independent samples were selected, and in others, they were

not. Thus, the validation was sometimes weak and sometimes rigorous. However, due to the high

number of repetitions, we believe that the validation as a whole can be considered rigorous. The

evaluation of the validated models was performed by inspecting the root mean square error (RMSE)

and standard error of prediction (SEP) plotted against the number of latent variables (LVs) for both

the calibrated and the validated models as well as the scores plots, the variance explained in the X

matrix, and the r² values.
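
The repeated random cross-validation loop can be sketched as below: in each repetition six samples are drawn at random as the test set, a model is fitted to the remaining samples, and the prediction error on the test set is recorded. This is only an illustration under stated assumptions. A one-variable least-squares line stands in for the PLS-DA model, the data are synthetic, and the sketch averages the test-set RMSE over repetitions, which may differ from how the software used in the study combines the repetitions.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares line; a stand-in for the actual PLS-DA calibration."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_line(model, x_value):
    slope, intercept = model
    return slope * x_value + intercept


def repeated_random_cv(x, y, fit, predict, test_size=6, repetitions=1000, seed=0):
    """Average test-set RMSE over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    rmse_values = []
    for _ in range(repetitions):
        test_idx = set(rng.sample(indices, test_size))
        train_idx = [i for i in indices if i not in test_idx]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        errors = [predict(model, x[i]) - y[i] for i in test_idx]
        rmse_values.append(math.sqrt(sum(e * e for e in errors) / len(errors)))
    return sum(rmse_values) / len(rmse_values)


# Synthetic data: a straight line with alternating +/-0.5 deviations.
x_values = [float(i) for i in range(30)]
y_values = [2.0 * xi + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, xi in enumerate(x_values)]
avg_rmse = repeated_random_cv(x_values, y_values, fit_line, predict_line)
print(f"average validation RMSE: {avg_rmse:.3f}")
```

Because the test samples are drawn individually, replicates of one leaf can end up on both sides of a split, which is the source of the occasionally weak validation discussed above. Drawing whole replicate groups instead would be one way to keep the test set independent.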

4.1 Validation of PLS-DA model with number of AM spots as Y-variable

The validated model (random repeated cross-validation using 1,000 repetitions and 28 segments)

was evaluated by the inspection of RMSE and SEP with an increasing number of LVs

(supplemental figure 4.1). Curves for both the calibrated and the validated model were similar and

decreased smoothly with the number of LVs, which indicates a robust model. As expected, the

validated RMSE and SEP were slightly higher, but they followed the same trend as the calibration

model curves. The r² values for the calibrated and the validated model were 0.9996 and 0.9683,

respectively, which indicated good separation. The explained variance in the X-matrix increased

smoothly with the number of LVs used in the model (supplemental figure 4.2), which also indicates

a robust model. The optimal number of LVs was found to be 19, which explained 65% of the total

variation. Increasing the number of LVs by one only explained approximately 1% additional

variation. Fifty LVs explained 86% of the variation.
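
For readers unfamiliar with the two error statistics, the sketch below uses the definitions common in chemometrics: RMSE is the square root of the mean squared prediction error, and SEP is the standard deviation of the prediction errors after removing their mean (the bias). These definitions are an assumption here; the software used in the study may define the statistics slightly differently. The numbers are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error of the predictions."""
    errors = [p - o for o, p in zip(observed, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def sep(observed, predicted):
    """Standard error of prediction: bias-corrected spread of the errors."""
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / len(errors)
    return math.sqrt(sum((e - bias) ** 2 for e in errors) / (len(errors) - 1))


# Hypothetical predictions that are all too high by 0.5.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]
example_rmse = rmse(obs, pred)
example_sep = sep(obs, pred)
print(example_rmse, example_sep)
```

In this example every prediction is offset by the same amount, so RMSE is 0.5 while SEP is zero. A gap between the two statistics therefore points to systematic bias rather than random scatter.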

[Figure omitted: line plot, title "PLS Model Statistics (AM spots) [Model 3: PLS AM spot CV: Ran rep 28 1000]"; x-axis "PC #" (0 to 50); y-axis "Statistic"; legend: [val] avg_RMSE, [val] avg_SEP, [cal] RMSE, [cal] SEP.]

Supplemental Figure 4.1 - Validation of the PLS-DA model of leaves from ant-hosting plants with AM spots using the number of ant-manure (AM) spots on the leaf surface as Y-variable. Root mean square error of prediction (RMSE) and standard error of prediction (SEP) plotted against the number of latent variables for both the validated model (val) and the calibration model (cal).

[Figure omitted: line plot, title "PLS Model Statistics (AM spots) [Model 3: PLS AM spot CV: Ran rep 28 1000]"; x-axis "PC #" (0 to 50); y-axis "Statistic"; legend: [cal] exp.var.X.]

Supplemental Figure 4.2 - Validation of the PLS-DA model of leaves from ant-hosting plants with AM spots using the number of ant-manure (AM) spots on the leaf surface as Y-variable. Explained variance in the X-matrix plotted against the number of latent variables.

4.2 Validation of PLS-DA model using the treatment of the plants (with/without ants) as dummy Y-variable

The validated model was evaluated by inspecting plots of RMSE and SEP versus the number of

LVs (supplemental figure 4.3). The curves for both the calibrated and the validated models were similar

and decreased smoothly with the number of LVs, which indicated a robust model. As expected, the

validated RMSE and SEP were slightly higher, but they followed the same trend as the calibration

model curves. The r² values for the calibrated and the validated models were 0.9974 and 0.9058,

respectively, which indicated a good separation of the groups. The explained variance in the X-

matrix increased smoothly with the number of LVs used in the model (supplemental figure 4.4),

which also indicates a robust model. The optimal number of LVs was 13, which explained 59% of

the total variation. By including an additional LV, the explained variance increased less than 2%.

Ninety percent of the variation was explained with 50 LVs.

[Figure omitted: line plot, title "PLS Model Statistics (Treatment) [Model 2: PLS Treatment CV: Ran rep 20 1000]"; x-axis "PC #" (0 to 50); y-axis "Statistic"; legend: [val] avg_RMSE, [val] avg_SEP, [cal] RMSE, [cal] SEP.]

Supplemental Figure 4.3 - Validation of the PLS-DA model of leaves from control plants and ant-hosting plants without AM spots using the treatment of the plants (with/without ants) as dummy Y-variable. Root mean square error of prediction (RMSE) and standard error of prediction (SEP) plotted against the number of latent variables for both the validated model (val) and the calibration model (cal).

[Figure omitted: line plot, title "PLS Model Statistics (Treatment) [Model 2: PLS Treatment CV: Ran rep 20 1000]"; x-axis "PC #" (0 to 50); y-axis "Statistic"; legend: [cal] exp.var.X.]

Supplemental Figure 4.4 - Validation of the PLS-DA model of leaves from control plants and ant-hosting plants without AM spots using the treatment of the plants (with/without ants) as dummy Y-variable. Explained variance in the X-matrix plotted against the number of latent variables.

Supplemental material 5

Supplemental Table 5.1 - Metabolite identifiers for annotated metabolites with significantly different levels in the leaves of Coffea arabica as a result of the treatment of the plants (with/without ants). Only metabolites that have an LV1 loading among the 100 highest (up-regulated in ant-hosting plants) or the 100 lowest (up-regulated in control plants) in the PLS-DA model and that also have p<0.05 and FC>1.4 (fold change, calculated as ant-hosting plants versus control plants) are shown.

| Metabolite ID | p | FC | LV1 loading | FAME retention index | Characteristic ion m/z |
|---|---|---|---|---|---|
| Valine | 0.0191 | 3.1225 | 0.0839 | 314553 | 144 |
| Isoleucine | 0.0323 | 2.8360 | 0.0765 | 360071 | 158 |
| Phenylalanine | 0.0215 | 2.3929 | 0.0567 | 537507;502040 | 218;120 |
| Threonine | 0.0007 | 2.3670 | 0.0956 | 410607;361026 | 218;130 |
| Serine | 0.0074 | 2.2944 | 0.0808 | 396155;339438 | 204;116 |
| Citrulline | 0.0362 | 2.2622 | 0.0700 | 622683 | 157 |
| Glycine | 0.0048 | 2.0909 | 0.0687 | 368800 | 174 |
| Alanine | 0.0252 | 1.9317 | 0.0549 | 245337 | 116 |
| Aspartate | 0.0105 | 1.6963 | 0.0622 | 433407 | 160 |
| Glutamate | 0.0029 | 1.6208 | 0.0721 | 530204 | 246 |
| Oxoproline | 0.0205 | 1.5422 | 0.0564 | 485692 | 156 |
| β-alanine | 0.0151 | 1.4920 | 0.0582 | 435969 | 174 |
| Linoleic acid | 0.0130 | 1.7259 | 0.0825 | 776982 | 337 |
| Oleic acid | 0.0066 | 1.6704 | 0.0739 | 778854 | 117 |
| Palmitic acid | 0.0006 | 1.4233 | 0.0823 | 714075 | 117 |
| Catechin | 0.0316 | 4.6644 | 0.0785 | 987442 | 368 |
| 2,3-Dimethylquinoxaline | 0.0076 | 2.7319 | 0.0748 | 828765 | 158 |
| Epicatechin | 0.0455 | 1.6127 | 0.0599 | 981994 | 368 |
| Caffeine | 0.0017 | 1.6018 | 0.0760 | 644775 | 109 |
| 1-Kestose | 0.0160 | 3.7472 | 0.0878 | 1123718 | 230 |
| Ribose | 0.0029 | 1.6926 | 0.0853 | 554970 | 103 |
| β-gentiobiose | 0.0070 | 1.6735 | 0.0657 | 969142;975821 | 204;160 |
| Glycerol | 0.0154 | 1.6398 | 0.0748 | 346242 | 117 |
| Ribitol | 0.0022 | 0.5755 | -0.0573 | 577209 | 217 |
| Myo-inositol | 0.0022 | 0.4970 | -0.0540 | 730336 | 305 |
| N-acetyl-D-mannosamine | 0.0305 | 0.6681 | -0.0534 | 735610 | 319 |
| Pyrrole-2-carboxylic acid | 0.0436 | 3.1467 | 0.0711 | 394622 | 240 |
| Cytidine-5'-diphosphate degr. product | 0.0314 | 2.2642 | 0.0599 | 860212 | 217 |
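
The three selection criteria from the table caption can be combined as in the sketch below. The p, FC, and loading values are copied from Supplemental Tables 5.1 and 5.3, but the function name is hypothetical and the loading cutoff is shrunk from 100 to 2 so that the five-row example is meaningful. Treating fold change symmetrically (FC > 1.4 or FC < 1/1.4) is an assumption made here because the tables list fold changes below 1 for metabolites that are higher in control plants.

```python
def select_significant(records, p_max=0.05, fc_min=1.4, n_loadings=100):
    """Names of metabolites that pass the p, fold-change, and LV1-loading criteria."""
    by_loading = sorted(records, key=lambda r: r["loading"])
    lowest = {r["name"] for r in by_loading[:n_loadings]}
    highest = {r["name"] for r in by_loading[-n_loadings:]}
    selected = []
    for r in records:
        extreme_loading = r["name"] in highest or r["name"] in lowest
        # Assumption: a decrease counts when FC is below the reciprocal cutoff.
        large_change = r["fc"] > fc_min or r["fc"] < 1.0 / fc_min
        if r["p"] < p_max and large_change and extreme_loading:
            selected.append(r["name"])
    return selected


example_records = [
    {"name": "Valine", "p": 0.0191, "fc": 3.1225, "loading": 0.0839},
    {"name": "Ribitol", "p": 0.0022, "fc": 0.5755, "loading": -0.0573},
    {"name": "Benzoic acid", "p": 0.0052, "fc": 1.2296, "loading": 0.0652},
    {"name": "Leucine", "p": 0.0760, "fc": 6.0842, "loading": 0.0622},
    {"name": "Urea", "p": 0.7660, "fc": 0.7129, "loading": -0.0073},
]
# n_loadings=2 is for illustration only; the study used the 100 highest and lowest.
significant = select_significant(example_records, n_loadings=2)
print(significant)
```

In this small example valine and ribitol pass all three criteria. Benzoic acid fails on fold change, and leucine and urea fail on the p-value.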

Supplemental Table 5.2 - Non-annotated (unknown) metabolites with significantly different levels in the leaves of Coffea arabica as a result of the treatment of the plants (with/without ants). Only metabolites that have an LV1 loading among the 100 highest (up-regulated in ant-hosting plants) or the 100 lowest (up-regulated in control plants) in the PLS-DA model and that also have p<0.05 and FC>1.4 (fold change, calculated as ant-hosting plants vs. control plants) are shown.

| Binbase ID | p | FC | LV1 loading | FAME retention index | Characteristic ion m/z |
|---|---|---|---|---|---|
| 642982 | 0.0083 | 10.2600 | 0.0895 | 934466 | 217 |
| 706781 | 0.0066 | 9.1527 | 0.0918 | 935035 | 217 |
| 702622 | 0.0090 | 8.3791 | 0.0884 | 929537 | 217 |
| 648015 | 0.0468 | 5.4337 | 0.0565 | 533744 | 188 |
| 644764 | 0.0385 | 3.9787 | 0.0684 | 1186583 | 169 |
| 650050 | 0.0329 | 3.6255 | 0.0696 | 1255158 | 361 |
| 650104 | 0.0033 | 3.3083 | 0.0935 | 812874 | 144 |
| 702626 | 0.0289 | 3.0860 | 0.0734 | 940158 | 361 |
| 652424 | 0.0024 | 3.0526 | 0.0911 | 757794 | 174 |
| 643592 | 0.0349 | 2.8180 | 0.0702 | 870730 | 172 |
| 642725 | 0.0003 | 2.7820 | 0.1075 | 806751 | 204 |
| 651429 | 0.0204 | 2.6935 | 0.0758 | 1216662 | 368 |
| 642800 | 0.0131 | 2.6465 | 0.0946 | 329656 | 211 |
| 643579 | 0.0401 | 2.5565 | 0.0646 | 897868 | 186 |
| 642929 | 0.0068 | 2.5078 | 0.0838 | 873587 | 156 |
| 644737 | 0.0044 | 2.4557 | 0.0906 | 1062648 | 204 |
| 644384 | 0.0011 | 2.3289 | 0.0973 | 1043439 | 307 |
| 649763 | 0.0087 | 2.3215 | 0.0829 | 865183 | 446 |
| 642854 | 0.0320 | 2.2756 | 0.0654 | 522559 | 159 |
| 643610 | 0.0156 | 2.1734 | 0.0788 | 886168 | 103 |
| 643556 | 0.0179 | 2.1726 | 0.0752 | 821701 | 144 |
| 643478 | 0.0144 | 2.0955 | 0.0729 | 716136 | 189 |
| 676244 | 0.0296 | 1.9952 | 0.0661 | 522900 | 159 |
| 644770 | 0.0185 | 1.9946 | 0.0691 | 606037 | 199 |
| 642993 | 0.0377 | 1.9607 | 0.0693 | 544001 | 204 |
| 643037 | 0.0087 | 1.9578 | 0.0984 | 243095 | 154 |
| 647770 | 0.0183 | 1.9194 | 0.0686 | 898931 | 217 |
| 654330 | 0.0141 | 1.8696 | 0.0811 | 361035 | 86 |
| 642853 | 0.0298 | 1.8067 | 0.0610 | 590685 | 231 |
| 643133 | 0.0068 | 1.7480 | 0.0732 | 1048666 | 204 |
| 647676 | 0.0131 | 1.7413 | 0.0650 | 257225 | 102 |
| 652613 | 0.0009 | 1.7345 | 0.0807 | 445402 | 128 |
| 648982 | 0.0289 | 1.7087 | 0.0711 | 345567 | 174 |
| 680901 | 0.0055 | 1.6679 | 0.0754 | 1049167 | 204 |
| 650963 | 0.0193 | 1.6513 | 0.0508 | 959971 | 169 |
| 642756 | 0.0004 | 1.6369 | 0.0779 | 1071213 | 307 |
| 643055 | 0.0349 | 1.5899 | 0.0671 | 318475 | 117 |
| 642772 | 0.0236 | 1.5699 | 0.0669 | 520411 | 210 |
| 643207 | 0.0199 | 1.5658 | 0.0503 | 722243 | 255 |
| 651461 | 0.0137 | 1.5479 | 0.0650 | 956180 | 319 |
| 642744 | 0.0285 | 1.5164 | 0.0647 | 481441 | 210 |
| 642946 | 0.0017 | 1.4985 | 0.0634 | 287799 | 126 |
| 646257 | 0.0223 | 1.4969 | 0.0673 | 464235 | 210 |
| 642836 | 0.0022 | 1.4766 | 0.0836 | 592619 | 299 |
| 643244 | 0.0157 | 1.4190 | 0.0706 | 302503 | 188 |
| 645601 | 0.0328 | 1.4165 | 0.0624 | 519392 | 103 |
| 644436 | 0.0219 | 1.4103 | 0.0511 | 523603 | 217 |
| 650426 | 0.0189 | 1.3126 | 0.0584 | 845126 | 225 |
| 642862 | 0.0192 | 0.6990 | -0.0551 | 867290 | 204 |
| 642992 | 0.0033 | 0.6702 | -0.0698 | 452959 | 172 |
| 644449 | 0.0174 | 0.6574 | -0.0502 | 844679 | 204 |
| 645035 | 0.0097 | 0.6552 | -0.0601 | 842293 | 204 |
| 644113 | 0.0227 | 0.6168 | -0.0463 | 725049 | 289 |
| 643074 | 0.0018 | 0.5910 | -0.0743 | 458177 | 172 |
| 645383 | 0.0000 | 0.5764 | -0.0644 | 1099460 | 309 |
| 644051 | 0.0078 | 0.5655 | -0.0630 | 1010048 | 297 |
| 643099 | 0.0031 | 0.5652 | -0.0525 | 512420 | 117 |
| 644383 | 0.0006 | 0.5597 | -0.0585 | 746733 | 319 |
| 642990 | 0.0076 | 0.5549 | -0.0467 | 1020541 | 249 |
| 644013 | 0.0063 | 0.5468 | -0.0463 | 704139 | 241 |
| 644109 | 0.0144 | 0.5463 | -0.0563 | 878402 | 204 |
| 642866 | 0.0435 | 0.5445 | -0.0479 | 922064 | 204 |
| 642877 | 0.0249 | 0.5386 | -0.0498 | 585149 | 217 |
| 646027 | 0.0001 | 0.5341 | -0.0646 | 667067 | 130 |
| 643630 | 0.0017 | 0.5317 | -0.0578 | 962017 | 183 |
| 642963 | 0.0282 | 0.5025 | -0.0495 | 549286 | 245 |
| 642828 | 0.0001 | 0.4928 | -0.0712 | 744276 | 319 |
| 644475 | 0.0136 | 0.4239 | -0.0445 | 974084 | 191 |
| 642814 | 0.0019 | 0.4193 | -0.0503 | 669235 | 289 |
| 644145 | 0.0234 | 0.3439 | -0.0535 | 1025629 | 223 |
| 644081 | 0.0041 | 0.2176 | -0.0476 | 1001042 | 105 |

Supplemental Table 5.3 - Annotated metabolites found significant in either the univariate or the multivariate data analysis, but not both, and hence excluded from the final list of metabolites with significantly different levels in the leaves of Coffea arabica as a result of the treatment of the plants (with/without ants). Exclusion was based on metabolites meeting one or two of the following criteria: p>0.05, FC<1.4, or an LV1 loading not among the 100 highest (up-regulated in ant-hosting plants) or the 100 lowest (up-regulated in control plants) in the PLS-DA model (indicated by *). FC (fold change) was calculated as ant-hosting plants vs. control plants.

| Metabolite ID | p | FC | LV1 loading | FAME retention index | Characteristic ion m/z |
|---|---|---|---|---|---|
| 3,4-dihydroxybenzoic acid | 0.0175 | 0.8197 | -0.0421 | 621690 | 193 |
| 4-hydroxybenzoic acid | 0.1517 | 1.4066 | 0.0372* | 537976 | 223 |
| Alpha-tocopherol | 0.0762 | 0.5619 | -0.0420 | 1064327 | 237 |
| Asparagine | 0.0930 | 6.6308 | 0.0690 | 475819;554034 | 100;116 |
| Benzoic acid | 0.0052 | 1.2296 | 0.0652 | 337942 | 179 |
| Citramalic acid | 0.0242 | 1.2869 | 0.0518* | 457703 | 247 |
| Cyanoalanine | 0.0613 | 4.1556 | 0.0681 | 404175 | 141 |
| Fructose | 0.0554 | 0.7401 | -0.0474 | 640502;644307 | 103;103 |
| Galactinol | 0.4161 | 0.8769 | -0.0216 | 1018120;1020192 | 204;204 |
| Glutamine | 0.1441 | 2.5316 | 0.0764 | 600653 | 156 |
| Guanosine | 0.0518 | 3.6561 | 0.0715 | 956964 | 324 |
| Leucine | 0.0760 | 6.0842 | 0.0622 | 347153 | 158 |
| Linolenic acid | 0.0722 | 1.7817 | 0.0605 | 779400 | 108 |
| Lyxosylamine | 0.1719 | 0.7616 | -0.0275 | 541792 | 103 |
| Nicotinic acid | 0.0160 | 1.2672 | 0.0602 | 365117 | 180 |
| Phosphate | 0.2978 | 1.4078 | 0.0409* | 345765 | 299 |
| Pipecolinic acid | 0.1273 | 2.8485 | 0.0495* | 403451 | 156 |
| Proline | 0.0586 | 3.8504 | 0.0717 | 364708 | 142 |
| Putrescine | 0.0918 | 2.6831 | 0.0599 | 588551 | 174 |
| Salicylaldehyde | 0.0899 | 1.9899 | 0.0388* | 405428 | 119 |
| Stearic acid | 0.0045 | 1.1887 | 0.0624 | 787569 | 117 |
| Sucrose | 0.2811 | 0.8051 | -0.0282 | 915714 | 361 |
| Tyramine | 0.0598 | 1.7337 | 0.0497* | 664017 | 174 |
| Urea | 0.7660 | 0.7129 | -0.0073* | 327621 | 171 |

Supplemental material 6

[Figure omitted: Pearson correlation matrix heat map; color scale from -1 to 1. Both axes list the annotated compounds, grouped as amino acids, N-containing compounds, organic acids, phenolics and flavonoids, sugars, sugar alcohols and sugar acids, fatty acids, and other compounds.]

Compounds in axis order: Serine; Glycine; Phenylalanine; Tyrosine; Threonine; Asparagine; Aspartate; Isoleucine; Glutamate; Glutamine; Citrulline; Proline; Oxoproline; Leucine; Valine; Alanine; Beta-alanine; Cyanoalanine; Pipecolinic acid; Nicotinic acid; Tyramine; Lyxosylamine; N-acetyl-D-mannosamine; Putrescine; Urea; 2-Hydroxypyridine; Pyrrole-2-carboxylic acid; Guanosine; Caffeine; 2,3-Dimethylquinoxaline; Fumaric acid; 2-Ketoadipic acid; Citric acid; Malic acid; Succinic acid; Citramalic acid; Maleic acid; Quinic acid; Mucic acid; Pelargonic acid; Glyceric acid; Glycolic acid; Chlorogenic acid; Caffeic acid; Salicylic acid; 4-Hydroxybenzoic acid; Shikimic acid; 3,4-Dihydroxybenzoic acid; Benzoic acid; 3-Hydroxybenzoic acid; Catechin; Epicatechin; Ribose; Beta-gentiobiose; Xylose; Fucose; Glucose; Tagatose; Talose or galactose; Fructose; Maltose; Raffinose; 1-Kestose; Sucrose; Glycerol; Mannitol; Ribitol; Galactinol; Galactonic acid; Threonic acid; Linoleic acid; Palmitic acid; Linolenic acid; Arachidic acid; Stearic acid; Oleic acid; Capric acid; Behenic acid; Cytidine-5-diphosphate degr. prod.; Sitosterol; Phytol; Phosphate; Alpha-tocopherol; Salicylaldehyde; Succinate semialdehyde; Myo-inositol.

Supplemental Figure 6.1 - Pearson correlation matrix heat map of annotated compounds in leaves from ant-hosting plants with AM spots
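
A Pearson correlation matrix of the kind shown in the heat map can be computed as in the sketch below, where each compound has one level per leaf sample. The compound names and levels are hypothetical placeholders, and the function names are not from the study.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(levels):
    """Nested dict of pairwise correlations; levels maps compound -> values per sample."""
    names = list(levels)
    return {a: {b: pearson(levels[a], levels[b]) for b in names} for a in names}


# Hypothetical levels of three compounds in four samples.
example_levels = {
    "compound_A": [1.0, 2.0, 3.0, 4.0],
    "compound_B": [2.0, 4.0, 6.0, 8.0],
    "compound_C": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(example_levels)
print(matrix["compound_A"]["compound_B"], matrix["compound_A"]["compound_C"])
```

Values near 1 and -1 correspond to the two ends of the color scale in the heat map. A compound with constant levels would give a zero variance and needs separate handling, which this sketch does not include.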