13
ARTICLE Received 14 Jul 2014 | Accepted 7 Oct 2014 | Published 20 Nov 2014 Prediction and quantification of bioactive microbiota metabolites in the mouse gut Gautham V. Sridharan 1, *, Kyungoh Choi 2, *, Cory Klemashevich 2 , Charmian Wu 1 , Darshan Prabakaran 2 , Long Bin Pan 1 , Shelby Steinmeyer 3 , Carrie Mueller 3 , Mona Yousofshahi 4 , Robert C. Alaniz 3,w , Kyongbum Lee 1,w & Arul Jayaraman 2,3,w Metabolites produced by the intestinal microbiota are potentially important physiological modulators. Here we present a metabolomics strategy that models microbiota metabolism as a reaction network and utilizes pathway analysis to facilitate identification and character- ization of microbiota metabolites. Of the 2,409 reactions in the model, B53% do not occur in the host, and thus represent functions dependent on the microbiota. The largest group of such reactions involves amino-acid metabolism. Focusing on aromatic amino acids, we predict metabolic products that can be derived from these sources, while discriminating between microbiota- and host-dependent derivatives. We confirm the presence of 26 out of 49 predicted metabolites, and quantify their levels in the caecum of control and germ-free mice using two independent mass spectrometry methods. We further investigate the bioactivity of the confirmed metabolites, and identify two microbiota-generated metabolites (5-hydroxy-L-tryptophan and salicylate) as activators of the aryl hydrocarbon receptor. DOI: 10.1038/ncomms6492 1 Department of Chemical and Biological Engineering, Tufts University, Medford, Massachusetts 02155, USA. 2 Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843, USA. 3 Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA. 4 Department of Computer Science, Tufts University, Medford, Massachusetts 02155, USA. *These authors contributed equally to this work. w These authors jointly supervised this work. Correspondence and requests for materials should be addressed to K.L. (email: [email protected]) or to A.J. (email: [email protected]). NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications 1 & 2014 Macmillan Publishers Limited. All rights reserved.

Prediction and quantification of bioactive microbiota ...€¦ · 1Department of Chemical and Biological Engineering, Tufts University, Medford, ... Isolating and culturing individual

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

  • ARTICLE

    Received 14 Jul 2014 | Accepted 7 Oct 2014 | Published 20 Nov 2014

    Prediction and quantification of bioactivemicrobiota metabolites in the mouse gutGautham V. Sridharan1,*, Kyungoh Choi2,*, Cory Klemashevich2, Charmian Wu1, Darshan Prabakaran2,

    Long Bin Pan1, Shelby Steinmeyer3, Carrie Mueller3, Mona Yousofshahi4, Robert C. Alaniz3,w,

    Kyongbum Lee1,w & Arul Jayaraman2,3,w

    Metabolites produced by the intestinal microbiota are potentially important physiological

    modulators. Here we present a metabolomics strategy that models microbiota metabolism as

    a reaction network and utilizes pathway analysis to facilitate identification and character-

    ization of microbiota metabolites. Of the 2,409 reactions in the model, B53% do not occur in

    the host, and thus represent functions dependent on the microbiota. The largest group

    of such reactions involves amino-acid metabolism. Focusing on aromatic amino acids, we

    predict metabolic products that can be derived from these sources, while discriminating

    between microbiota- and host-dependent derivatives. We confirm the presence of 26 out of

    49 predicted metabolites, and quantify their levels in the caecum of control and germ-free

    mice using two independent mass spectrometry methods. We further investigate the

    bioactivity of the confirmed metabolites, and identify two microbiota-generated metabolites

    (5-hydroxy-L-tryptophan and salicylate) as activators of the aryl hydrocarbon receptor.

    DOI: 10.1038/ncomms6492

    1 Department of Chemical and Biological Engineering, Tufts University, Medford, Massachusetts 02155, USA. 2 Artie McFerrin Department of ChemicalEngineering, Texas A&M University, College Station, Texas 77843, USA. 3 Department of Microbial Pathogenesis and Immunology, Texas A&M HealthScience Center, College Station, Texas 77843, USA. 4 Department of Computer Science, Tufts University, Medford, Massachusetts 02155, USA. * Theseauthors contributed equally to this work. wThese authors jointly supervised this work. Correspondence and requests for materials should be addressed toK.L. (email: [email protected]) or to A.J. (email: [email protected]).

    NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications 1

    & 2014 Macmillan Publishers Limited. All rights reserved.

    mailto:[email protected]:[email protected]://www.nature.com/naturecommunications

  • The human gastrointestinal (GI) tract is colonized by B1014

    microorganisms that are collectively termed the micro-biota. Disruptions in the microbiota composition (dysbio-

    sis) are increasingly correlated to not only gut diseases1, but alsoobesity, insulin resistance and type 2 diabetes2,3. There isincreasing evidence that the functional outputs of themicrobiota, that is, the metabolites they produce, are importantmodulators of host physiology in the GI tract. Work from ourlaboratory4 and another group5 demonstrated that thetryptophan (Trp)-derived bacterial metabolite indole attenuatesindicators of inflammation and improves tight junctionproperties in intestinal epithelial cells in vitro and in vivo.Fukuda et al.6 reported that acetate produced by intestinalbacteria inhibits translocation of E. coli O157:H7 Shiga toxinfrom the gut lumen to systemic circulation. Other short-chainfatty acids such as butyrate and propionate have been shown toinduce the differentiation of naive T cells into anti-inflammatoryregulatory T cells (Treg)7, and into Th1 and Th17 T cells that alsoproduce interleukin (IL)-10 (ref. 8).

    Despite a high level of interest, only a handful of bioactivemicrobiota metabolites in the GI tract have been identified. Onemajor challenge is that the spectrum of metabolites present in theGI tract is extremely complex, as the microbiota can carry out adiverse range of biotransformation reactions, including those thatare not present in the mammalian host9. Isolating and culturingindividual bacterial species to identify the metabolites producedin these cultures remain challenging, as many intestinal bacteriacannot be cultured under standard laboratory conditions.Moreover, this approach does not account for community-levelinteractions between the microorganisms as metabolitesproduced by one microorganism can be utilized or modified byother microorganisms. Another challenge lies in classifying ametabolite as either microbiota- or host-derived, as manymetabolites are present in both microorganisms and mammalsbecause of the high degree of conservation of metabolic pathwaysacross organisms10.

    Metabolomics of faecal or body fluid samples has beenincreasingly used to explore the metabolite profiles of the GItract, and to compare these profiles under different physiologicalor disease conditions. Mass spectrometry (MS)-based untargetedapproaches have been especially useful in analysing a broadspectrum of metabolites in a high throughput manner11. Whilethis approach offers the benefit of potential for discovery, it alsohas drawbacks. Owing to the complexity of the mass spectraobtained from full-scan experiments, metabolite identificationcan be difficult, especially if neither high purity standards nordatabase entries are available for the metabolites of interest. High-resolution time-of-flight (TOF) mass spectrometers cansomewhat alleviate this problem12, as chemical identities can beestablished based on exact mass as well as isotope and MS/MSfragmentation patterns. Alternatively, targeted analysis of an apriori selected set of metabolites affords custom optimization ofMS parameters for individual metabolites to enable sensitivedetection using quantitative methods such as multiple reactionmonitoring (MRM). The obvious drawback is that the discoverypotential can be limited. Importantly, neither untargeted nortargeted metabolomics can determine whether a gut metabolite isthe product of host or microbiota metabolism as a standaloneapproach, as many metabolites can be produced in bothmammalian and bacterial cells.

    In this work, we present a targeted approach that addresses thediscovery limitation by integrating an in silico prediction step intothe metabolomics workflow. To date, bioinformatics tools havebeen utilized in metabolomics generally for post hoc analysis toprocess data13 or perform statistical comparisons14. Recently,Greenblum et al.15 presented an elegant metagenomic study that

    places obesity- or inflammatory bowel disease (IBD)-associatedvariations in human gut microbiota gene abundances in thecontext of a microbial community-level metabolic network. Thepresent study similarly models microbiota metabolism as areaction network and uses this model to computationally explorethe products of microbiota metabolism from aromatic amino acids(AAAs). We thus exploit efficient algorithms for network analysisand the growing catalogue of annotated microbial genomes toconduct in silico discovery experiments. Specifically, we utilize aprobabilistic pathway construction algorithm to identify potentialderivatives of AAAs while discriminating between microbiota andhost contributions to the formation of the derivatives.

    To validate our methodology, we predict and experimentallyanalyse both bacteria- and host-derived products of aromaticamino acids (AAAs). We utilize two independent MS methods toquantify the levels of the predicted metabolites in caecum contentsfrom conventionally raised specific pathogen-free (SPF) and germ-free (GF) mice. On the basis of recent studies suggesting thatAAA-derived metabolites could be potent aryl hydrocarbonreceptor (AhR) ligands in the context of host immune systemfunction16, mucosal reactivity17 and oxidative stress defense18, weuse a Gaussia luciferase (GLuc) reporter system to monitor theability of the identified metabolites to activate the AhR. Theworkflow described in this study maps the identified metabolitesto specific metabolic pathways, which can then be traced to thecorresponding genes and species harbouring the genes, therebyfacilitating the elucidation of functional contributions by themicrobiota to the complex metabolite profile of the gut.

    ResultsDiversity and uniqueness of microbiota metabolic functions. Acombined host–microbiota reaction network model of gutmicrobiota metabolism was constructed based on genomeannotation information on select bacteria reported to be presentin the human GI tract19. The model comprises 3,449 distinctreactions, of which 940 are unique to the host, 1,267 are unique tothe gut microbiota and 1,142 are present in both (Fig. 1)20.Certain metabolic functions, for example, lipopolysaccharidebiosynthesis, are dominated by reactions only present in themicrobiota (Fig. 1, i). In contrast, other functions, such as steroidbiosynthesis, are performed by the host without the involvementof bacterial species in the microbiota (Fig. 1, ii). Overall, this typeof exclusivity is the exception, rather than the rule, as the majorityof functions (61 out of 82 categories) include both host- andmicrobiota-specific reactions.

    To investigate the distribution of metabolic functions acrossdifferent subsets of the microbiota, a phylum score was computedfor each reaction characterizing its prevalence in different phyla.Out of the 2,409 reactions in the microbiota model, only 286reactions belong to a single phylum (Fig. 2a). The most conservedfunction is translation, with 83% of the reactions present in allfive bacterial phyla comprising the microbiota model. Interest-ingly, the least conserved function category is energy metabolism,with only 13% of the reactions present in all five phyla. Thenumber of reactions in each function category also variedsubstantially depending on the function. The largest number ofreactions (493 out of 2,409) belongs to a category designated as‘unclassified’ by KEGG. The largest category with an assignedfunction is amino-acid metabolism, accounting for 16% of allreactions in the microbiota model. Proteobacteria possess thebroadest coverage of amino-acid reactions, expressing genes for349 of the 392 amino-acid reactions in the microbiota model.Approximately 9% of the amino-acid reactions are unique to thisphylum, which is greater than the number of amino-acidreactions unique to the other four phyla combined.

    ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492

    2 NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • Biotransformation of aromatic amino acids. We focused onaromatic amino acids (AAAs), as several previous studies, includ-ing our own work4, suggested that these are putative precursors ofbioactive bacterial metabolites. Out of the 169 AAA reactions in thecombined host–microbiota model, the gut microbiota harbours 99

    reactions, with 66 reactions not encoded by the host genome.A majority of these reactions are contributed by Proteobacteria(Fig. 2b), particularly Enterobacter and Escherichia.

    The KEGG function categories, while useful for broadassessment of metabolic capabilities, can be ambiguous. For

    Lipopolysaccharide biosynthesis (25)Peptidoglycan biosynthesis (19)Xylene degradation (17)Valine, leucine and isoleucine biosynthesis (25)Lysine biosynthesis (32)Benoate degradation (27)Fluorobenzoate degradation (13)Phenylananine, tyrosine and tryptophan biosynthesis (32)C5-branched dibasic acid metabolism (11)Napthalene degradation (10)Biotin metabolism (21)Pentose and glucuronate interconversions (45)Phosphonate and phosphinate metabolism (13)Thiamine metabolism (16)Aminobenzoate degradation (19)Riboflavin metabolism (19)Geraniol degradation (15)Phenylalanine metabolism (38)Porphyrin and chlorophyll metabolism (58)Nitrogen metabolism (29)Methane metabolism (57)Phenylpropanoid biosynthesis (11)Ascorbate and aldarate metabolism (23)Cyanoamino acid metabolism (12)Butanoate metabolism (38)Chlorocyclohexane and chlorobenzene metabolism (13)Fructose and mannose metabolism (47)Sulfur metabolism (27)Starch and sucrose metabolism (41)Terpenoid backbone biosynthesis (31)Taurine and hypotaurine metabolism (15)Glyoxylate and dicarboxylate metabolism (45)Cysteine and methionine metabolism (58)Amino sugar and nucleotide sugar metabolism (77)Chloroalkane and chloroalkene degradation (11)Histidine metabolism (31)Pyruvate metabolism (51)Propanoate metabolism (51)Ubiquinone and other terpenoid-quinone biosynthesis (41)Carbon fixation pathways in prokaryotes (36)Arginine and proline metabolism (86)Pantothenate and CoA biosynthesis (26)Vitamin B6 metabolism (19)Galactose metabolism (39)Lysine degradation (31)Pentose phosphate pathway (34)Folate biosynthesis (30)Carbon fixation in photosynthetic organisms (27)Glycerolipid metabolism (28)Glycerophospholipid metabolism (64)Selenocompound metabolism (64)Glycolysis/gluconeogenesis (44)Purine metabolism (121)Pyrimidine metabolism (99)Glycine, serine and threonine metabolism (50)Tyrosine metabolism (66)Inositol phosphate metabolism (48)Nicotinate and nicotinamide metabolism (29)Alanine, aspartate and glutamate metabolism (36)TCA cycle (28)Fatty acid biosynthesis (49)Valine, leucine and isoleucine degradation (44)Beta-alanine metabolism (25)Aminoacyl-tRNA biosynthesis (29)Fatty acid metabolism (42)Tryptophan metabolism (46)Primary bile acid biosynthesis (48)Glutathione metabolism (28)One carbon pool by folate (28)Ether lipid metabolism (16)Steroid hormone biosynthesis (105)Biosynthesis of unsaturated fatty acids (30)Fatty acid elongation (33)Steroid biosynthesis (37)Arachidonic acid metabolism (37)Alpha-linolenic acid metabolism (15)Sphingolipid metabolism (31)Retinol metabolism (22)Isoquinoline alkaloid biosynthesis (11)Metabolism of xenobiotics by cytochrome P450 (67)Drug metabolism — cytochrome P450 (25)Drug metabolism — other enzymes (27)

    Fraction of reactions that are host specificFraction of reactions in both host and microbiomeFraction of reactions that are microbiome specific

    KEGG pathway (Num. rxns in combined model)

    (i)

    (ii)

    Figure 1 | Function categories of reactions in the combined microbiota–host model. Colours indicate whether the reactions are host-specific,

    microbiome-specific or are present in both. The length of each coloured segment is proportional to the fraction contributed by the corresponding system

    (host, microbiota or both).

    NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492 ARTICLE

    NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications 3

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • example, tryptophan transaminase (EC no. 2.6.1.27) is assigned totryptophan (Trp) metabolism, whereas indole-2-monooxygenase(EC no. 1.14.13.137) is assigned to benzoxazinoid biosynthesis,even though both enzymes are two steps from Trp. Visualinspection of the current KEGG map for Trp metabolism suggeststhat indole can only be converted to either indoxyl or2-formylamino-benzaldehyde. However, indole can also beconverted to indole-2-dione through the aforementioned mono-oxygenase. This example illustrates the need for a moresystematic analysis. To this end, we searched for metabolitesthat are biochemically related to AAAs by constructing possiblebiotransformation pathways for Phe, Trp or Tyr, while dis-criminating between microbiota and host reactions. Pathwayconstruction proceeded by selectively adding reactions onlypresent in the microbiota, and terminated when no such reactioncould be added. This ensured that none of the intermediates,except the terminal metabolite in the pathway, could be formedthrough a host reaction. Owing to the probabilistic nature of thepathway construction algorithm, the results could vary with thenumber of iterations. Therefore, we performed a series ofsimulations where we varied the iteration numbers until we didnot observe any further increase in the number of uniquepathways (Supplementary Fig. 1).

    For Trp, this probabilistic search returned three pathwayscomposed of strictly bacterial enzymes (Fig. 3). The remainingsingle-step ‘pathways’ represent the termination steps enforced bythe algorithm’s stopping criterion. The total number of Trpderivatives identified in this way is 10. Of these, the followingfour compounds only participate in microbiota metabolism,per KEGG’s annotation: indole, indoleglycerol phosphate,1-(2-carboxyphenylamino)-1-deoxy-D-ribulose 5-phosphate and

    N-(5-phospho-D-ribosyl) anthranilate. The remaining metabo-lites participate in both microbiota and host metabolism. Forphenylalanine (Phe), the search returned 12 distinct pathwayscomposed of strictly bacterial enzymes, and three pathwayscomposed of enzymes expressed in both the microbiota and host(Supplementary Fig. 2). Of a total of 33 predicted derivatives, 21participate only in microbiota metabolism, whereas 11 participatein microbiota and host reactions (Supplementary Table 1).Finally, the search on tyrosine (Tyr) returned only one bacterialpathway (Supplementary Fig. 3), as all other reactions directlyconnected to Tyr were present in the host.

    Metabolite analysis. We used a targeted metabolomics approachto measure the predicted panel of metabolites, including deriva-tives that form through reactions present in the host. Of the49 predicted derivatives, a subset of 19 metabolites was analysedusing MRM, taking into account availability of pure standardsand ease of ionization and fragmentation. To broaden the scopeof metabolic profiling, the predicted metabolites were also ana-lysed using a second MS method, information-dependent acqui-sition (IDA). This method allowed detection and identification ofadditional metabolites based on accurate mass even when high-purity standards were unavailable. On the other hand, we foundthat the MRM method offered greater dynamic response anddetection sensitivity for some metabolites (Supplementary Fig. 4).The final panel of metabolites targeted for MRM and IDA ana-lyses is listed in Supplementary Table 1.

    For MRM analysis, metabolite identification was performedbased on both chromatographic retention time and masssignature, as we found that even an optimized MRM transition

    Unclassified

    Amino-acid metabolism

    Carbohydrate metabolism

    Metabolism of cofactors and vitamins

    Lipid metabolism

    Nucleotide metabolism

    Xenobiotics biodegradation and metabolism

    Metabolism of other amino acids

    Energy metabolism

    Metabolism of terpenoids and polyketides

    Biosynthesis of other secondary metabolites

    Glycan biosynthesis and metabolism

    Translation

    0 50 100 150 200 250 300 350 400 450 500Number of reactions

    16%6%

    13%

    65%

    ActinobacteriaFirmicutesProteobacteriaMiscellaneous

    ActinobacteriaBacteroidetesFirmicutesProteobacteriaMiscellaneousShared by two phylaShared by three phylaShared by four phylaShared by all phyla

    a

    b

    Figure 2 | Distribution of functional categories across microbiota. (a) Function categories of reactions in the gut microbiota model and their uniqueness

    determined at the phylum level. Uniqueness was determined based on the phyla score (see Methods). (b) Distribution of unique reactions involved in

    aromatic amino-acid metabolism.

    ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492

    4 NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • (precursor–product ion pair) did not always uniquely identify ametabolite in a complex biological sample (Supplementary Fig. 5).For IDA analysis, the identity of a detected metabolite wasdetermined based on accurate mass and isotope pattern, and,whenever possible, confirmed based on measured MS/MSfragmentation patterns. As pure chemical standards were notused for the IDA experiments, the metabolite concentrations arereported in terms of total ion counts normalized to mass of caecalcontents.

    Combined, the two methods detected 26 of the 49 predictedmetabolites in the caecal contents of (conventionally raised) SPFor GF mice (Supplementary Table 1). Of the detected metabolites,16 metabolites only participate in microbiota-specific reactionsaccording to KEGG’s annotation, whereas the remaining10 participate in reactions present in both the microbiota andhost (Fig. 4). We did not detect any metabolites that onlyparticipate in host-specific reactions. In the subset of 23metabolites that was not detected, the number of microbiota-and host-specific metabolites was eight and four, respectively,with the remaining eleven participating in reactions present inboth the microbiota and host. Taken together, these trendssuggest that a host-specific AAA derivative is less likely to bepresent in the caecal contents than a derivative that could formthrough reactions present in the microbiota. Furthermore, wefound that more than half (15/26) of the detected metabolites

    were either significantly reduced or absent in GF mice (Fig. 5),corroborating the contribution of intestinal microbes to thepresence of these metabolites.

    Only a small fraction (3/25) of the detected metabolites—shikimate, tryptamine and tyramine—could be analysed by bothMRM and IDA, with the latter method providing broadercoverage (24 versus 6 metabolites detected by IDA and MRM,

    R00678

    Indolepyruvate

    R00677

    5-Hydroxy-L-tryptophan

    R01814L-formylkynurenine

    R07213

    1-(2-Carboxyphenylamino)-1-deoxy-D-ribulose-5-phosphate

    R03509

    N-(5-Phospho-D-ribosyl)anthranilate

    R03508

    R01073

    Anthranilate

    R00685R03664

    Tryptamine

    R00674

    Indoleglycerol-phosphate

    R02722

    R00673

    Indole

    R02340L-tryptophanyl-tRNA(Trp)

    L-tryptophan

    Reaction

    Metabolite

    Gut microbiota

    Host

    Both

    Start metabolite

    Figure 3 | Metabolic derivatives of tryptophan (Trp) identified from probabilistic pathway construction. Starting with Trp, pathways were constructed

    by recursively adding a reaction and then checking whether the reaction is present in the host. The tree diagram shown represents the union of all

    pathways identified in this way. As constrained by the algorithm, each branch of the tree terminates at a metabolite that is present in the host (blue or

    purple). Reaction numbers (for example, R00674) refer to KEGG IDs.

    16

    108

    11

    4

    Microbiota

    Both

    N/D — Microbiota

    N/D — Both

    N/D — Host

    Figure 4 | Summary of predicted and detected metabolites. A metabolite

    was classified as microbiota, host or both based on whether a reaction involving

    the metabolite is microbiota-specific, host-specific or expressed in both the

    microbiota and host per KEGG annotation. Solid and shaded pies represent

    metabolites that were detected and not detected (N/D), respectively.

    NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492 ARTICLE

    NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications 5

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • respectively). For the three metabolites detected by bothmethods, the results were consistent. Shikimate was dramaticallyreduced, whereas tryptamine and tyramine were signi-ficantly increased in SPF samples compared with GF samples(Supplementary Fig. 6).

    For metabolites detected in both SPF and GF samples andsignificantly reduced in the GF condition, the fold-changesranged from ca. 2 to 300. For Trp, two of the four intermediatescomprising the longest microbiota pathway (Fig. 3) were reducedor altogether absent in GF samples. Indole was reduced 2.2-fold,whereas 1-(2-Carboxyphenylamino)-1-deoxy-D-ribulose-5-phos-phate was not detected in GF samples. For Phe, major branchpoints of the microbiota pathways occur at chorismate andisochorismate. The intermediates upstream of these branch pointsare, in order, arogenate and prephenate (Supplementary Fig. 2).All four microbiota metabolites were reduced (ca. 3.6-fold in thecase of arogenate) or not detected in GF samples. Downstream ofthese two branch points, the trends were mixed, as someintermediates, notably those of the shikimate branch, were onlydetected in GF samples. For Tyr, all three detected metaboliteswere significantly reduced in GF samples (Fig. 5).

    Activation of AhR by microbiota metabolites. In order to linkthe in vivo presence of the AAA-derived metabolites to the reg-ulation of host function, we investigated whether the metabolitesdetected in the caecal contents could activate an eukaryotic sig-nalling pathway. As we were interested in establishing thebioactivity of specific metabolites, we confined the analysis tonine metabolites (5-hydroxy-L-tryptophan, indole, indolepyr-uvate, salicylate, shikimate, tryptamine, chorismate, 3,4-dihy-droxyl-L-phenylalanine and tyramine) that could be obtained ashigh-purity chemicals from a commercial source. We focused onthe AhR as recent studies have highlighted the importance of thisreceptor in regulating gut physiology21–23. Furthermore, previousstudies24 have shown that endogenous Trp-derived metabolitessuch as kynurenine (host-derived) and 6-formylindolo[3,2-b]carbazole (an ultraviolet-exposed Trp degradation product)are potent ligands for AhR25. Therefore, we investigated whethermicrobiota metabolites derived from AAAs are ligands for theAhR. We used MCF-7 (Michigan Cancer Foundation-7) humanbreast cancer cells as the model cell line, as prior work has shownhigh levels of AhR responsiveness in these cells26. MCF-7 cellswith a stably integrated GLuc reporter plasmid for AhR-binding

    0

    10,000

    20,000

    30,000

    40,000

    Cou

    nts

    per

    g sa

    mpl

    ePhenethylamine Arogenate

    0

    20,000

    40,000

    60,000

    Cou

    nts

    per

    g sa

    mpl

    e0.E+00

    2.E+04

    4.E+04

    6.E+04

    8.E+04

    1.E+05

    Cou

    nts

    pjer

    g s

    ampl

    e

    Iso/chorismate/prephenate

    0.E+00

    5.E+03

    1.E+04

    2.E+04

    2.E+04

    3.E+04

    Cou

    nts

    per

    g sa

    mpl

    e

    O5(1-Carboxyvinyl)-3-phosphoshikimate

    4-Amino-4-deoxychorismate

    0

    5,000

    10,000

    15,000

    Cou

    nts

    per

    g sa

    mpl

    e

    0.E+00

    2.E+05

    4.E+05

    6.E+05

    8.E+05

    Cou

    nts

    per

    g sa

    mpl

    e

    4-Aminobenzoate

    0

    5,000

    10,000

    15,000

    20,000

    25,000

    Cou

    nts

    per

    g sa

    mpl

    e

    (1R,6R)-6-Hydroxy-2-succinylcyclohexa-2,4-diene-1-carboxylate

    0

    10,000

    20,000

    30,000

    40,000

    Cou

    nts

    per

    g sa

    mpl

    e

    Dihydropteroate

    0

    500

    1,000

    1,500

    2,000

    Cou

    nts

    per

    g sa

    mpl

    e

    (2S,3S)-2,3-Dihydro-2,3-dihydroxybenzoate

    0.E+00

    1.E+05

    2.E+05

    3.E+05

    4.E+05

    5.E+05C

    ount

    s pe

    r g

    sam

    ple

    3-Dehydroquinate

    2-Dehydro-3-deoxy-D-arabino-heptonate-7-

    phosphate

    0.E+00

    2.E+06

    4.E+06

    6.E+06

    Cou

    nts

    per

    g sa

    mpl

    e

    0

    20,000

    40,000

    60,000

    80,000

    Cou

    nts

    per

    g sa

    mpl

    e

    5-Hydroxy-L-tryptophan Indolepyruvate

    0

    5,000

    10,000

    15,000

    Cou

    nts

    per

    g sa

    mpl

    e

    Indole

    0

    10

    20

    30

    40Tryptamine

    0.00

    0.02

    0.04

    0.06

    0.08

    Con

    c., µ

    M

    Con

    c., µ

    M

    0

    5,000

    10,000

    15,000

    20,000

    Cou

    nts

    per

    g sa

    mpl

    e

    Indoleglycerolphosphate-

    Carboxyphenylamino-1-deoxy-D-ribulose-5-

    phosphate

    0

    2,000

    4,000

    6,000

    8,000

    10,000

    Cou

    nts

    per

    g sa

    mpl

    e

    N-(5-Phospho-D-ribosyl)anthranilate

    0

    500

    1,000

    1,500

    2,000

    2,500

    Cou

    nts

    per

    g sa

    mpl

    e

    Anthranilate

    0.E+00

    2.E+05

    4.E+05

    6.E+05

    8.E+05

    Cou

    nts

    per

    g sa

    mpl

    e

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    Con

    c., µ

    M

    Tyramine

    0.E+00

    2.E+05

    4.E+05

    6.E+05

    Cou

    nts

    per

    g sa

    mpl

    e

    3,4-Dihydroxy-L-phenylalanine

    a b c

    Salicylate

    0.0

    0.1

    0.2

    0.3

    0.4

    0.5

    Con

    c., µ

    M

    0.E+00

    2.E+04

    4.E+04

    6.E+04

    8.E+04

    Cou

    nts

    per

    g sa

    mpl

    e

    Phenol

    N/D

    N/D

    N/D

    N/D

    N/D N/D

    N/D

    Shikimate

    0

    10

    20

    30

    40

    Con

    c., µ

    M

    **

    *

    *

    *

    *

    *

    *

    *

    *

    *

    *

    Figure 5 | Comparison of metabolite concentrations in caecum luminal contents from SP and GF mice determined using MRM and IDA MS. Metabolite

    concentrations (MRM) and counts (IDA) were normalized to the corresponding wet weight of caecal contents. Circles (a), triangles (b) and squares

    (c) denote Phe, Trp and Tyr derivatives, respectively. Colours indicate whether the metabolite participates only in microbiota metabolism (red) or both host

    and microbiota metabolism (purple). Host-specific metabolites were not detected. Closed and open symbols denote samples from SPF and GF mice,

    respectively. Isochorismate, chorismate and prephenate are all represented in the same plot as they have identical exact masses. An asterisk (*) indicates a

    statistically significant difference between the GF and SPF groups (two-sided Mann–Whitney U-test, Po0.05). N/D: not detected.

    ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492

    6 NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • activity were exposed to microbiota metabolites or 20 nM 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD; xenobiotic-positive controlthat is an AhR ligand) for 48 h, and luciferase activity in culturesupernatants was measured. Kynurenine, an endogenous AhRligand derived from Trp by host indoleamine 2,3-dioxygenaseactivity27, was used as an additional positive control. Exposure to20 nM TCDD and 100 mM kynurenine resulted in a 3.9- and 2-fold increase in the rate of AhR-driven luciferase activity (relativelight unit (RLU)/h/relative fluorescence unit), respectively, ascompared with the solvent control (Fig. 6). Of the ninemetabolites tested, five metabolites (5-hydroxy-L-tryptophan,indolepyruvate, tryptamine, salicylate and shikimate) inducedAhR activity in MCF-7 cells (Fig. 6). Depending on themetabolite, the lowest concentration that could stimulatesignificant AhR activity varied over several orders ofmagnitude. For example, 5-hydroxy-L-tryptophan induced AhRactivity at 0.01 mM, whereas indolepyruvate required aconcentration greater than 100 mM to show significant activity.The minimum concentration needed to activate the AhR for theother metabolites fell within this range, with salicylate andshikimate inducing activity at 10 mM and tryptamine at 100mM.

    DiscussionWe present here a methodology for the identification of gutmicrobiota metabolites that integrates computational pathwayanalysis into a targeted metabolomics workflow. We predictedand profiled possible biotransformation products of AAAs (Trp,Phe and Tyr), and detected a majority (26 out of 49) of thepredicted metabolites in caecal contents from mice. All of thedetected metabolites could be formed through bacterial reactionsfrom one of the source amino acids, none of these metaboliteswere host-specific, and a majority (15 out of 26) of thedetected metabolites were significantly reduced in GF micerelative to SPF mice.

    The effects of physiological or pathological perturbations onthe intestinal microbiota have been investigated using metage-nomic analyses characterizing the composition of the gutmicrobial community, the enrichment (or depletion) of bacterialgenes or the expression levels of genes for specific metabolicpathways28–30. A limitation of these analyses is that they do notprovide direct information on which molecules are formed fromwhich bacterial biotransformation reactions and which metabolicproducts are increased or decreased under different conditions.Thus, the ability to unambiguously identify and quantify bacterialmetabolites is expected to have a significant impact on the studyof human gut microbiome function.

    One obvious qualification in interpreting the in silico analysis isthat the reliability of our model is predicated on the accuracy ofthe annotation in the reaction database (in our case KEGG).While the KEGG database is continually updated, it is certainlypossible that there are missing entries or incorrect annotations,which could explain why the predictions were not 100% accurate.For example, Wikoff et al.31 identified indole-3-propionate as aTrp-derived metabolite that is produced by the gut microbiota.However, at the time of completion of this work, this metabolitewas not listed in the KEGG Compound database, and thus couldnot be identified by our algorithm. Similarly, the discriminationof bacterial metabolites from metabolites that could also beproduced by host cells depends on an accurate model of hostmetabolism. The most relevant host genomes, for example, mouseand human, have been sequenced and largely annotated, andseveral published genome-scale model reconstructions exist.However, prior studies suggest that several iterations may berequired32 until a published model can be considered a stable,consensus reconstruction33.

    Currently, there is no consensus model of gut microbiotametabolism. One approach has been to build species-specificgenome-scale models, and apply constrained optimizationmethods such as flux balance analysis to characterize the

    *

    **

    # #0

    0.5

    1

    1.5

    2

    DMF 0.1 0.1 1 10 100 1,000

    Fol

    d of

    (R

    LU/h

    /RF

    U)

    5-hydroxy-L-tryptophan

    * **

    0

    0.5

    1

    1.5

    2

    DMF 1 10 100 1,000

    Fol

    d of

    (R

    LU/h

    /RF

    U)

    Salicylate

    * *

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    1.4

    1.6

    DMF 1 10 100 1,000

    Fol

    d of

    (R

    LU/h

    /RF

    U)

    Shikimate

    *

    0

    0.5

    1

    1.5

    2

    DMF 1 10 100 500

    Fol

    d of

    (R

    LU/h

    /RF

    U)

    Tryptamine *

    0

    0.5

    1

    1.5

    2

    2.5

    DMF 1 10 100 500

    Fol

    d of

    (R

    LU/h

    /RF

    U)

    Indole-3-pyruvate

    *

    *

    0

    0.5

    1

    1.5

    2

    2.5

    3

    3.5

    4

    4.5

    DMF Kynurenine100 µM

    TCDD20 nM

    Fol

    d of

    (R

    LU/h

    /RF

    U)

    Positive control

    µM

    µM

    µM µM

    µM

    Figure 6 | Dose-dependent activation of the AhR by AAA-derived metabolites. AhR activity is reported as the rate of luciferase activity normalized to red

    fluorescence of constitutively expressed RFP (RLU/h/relative fluorescence unit (RFU)). Positive controls were 20 nM TCDD and 100mM kynurenine,and the negative control was 0.1% (v/v) N, N-dimethylformamide. Data shown are mean±s.d. from four replicate experiments. Asterisk (*) indicatesstatistical significance at Po0.05 using the Student’s t-test. Results for 100mM and 1000mM 5-hydroxy-L-tryptophan are marked with # as cell death wasobserved at these concentrations.

    NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492 ARTICLE

    NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications 7

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • metabolic capacities of the microbe. For example, Heinkenet al.34 utilized a genome-scale metabolic reconstruction ofBacteroides thetaiotaomicron in conjunction with flux balanceanalysis to explore the co-metabolism between a commensalmicrobe and its murine host, reporting that the microbecould rescue a potentially lethal loss of enzymatic functionin the host. More recently, Shoaie et al.35 assembled meta-bolic reconstructions of three species, Eubacterium rectale,Methanobrevibacter smithii and B. thetaiotaomicron, asrepresentatives of three main phyla in the human gut, whichwere used to simulate the metabolic interactions between thespecies under varying nutrient settings.

    An alternative approach is to treat the microbiota as a single‘super’ organism and model the metabolic capability of themicrobiota at the community level. This approach has thedrawback that interactions between particular species cannot beexamined in detail. However, such community-level models canbe directly applied to the growing volume of metagenomic data tocomprehensively analyse the diversity of metabolic functionscollectively encoded by the GI microbiome36,37. Recently,Greenblum et al.15 assembled a community-level model ofhuman GI tract microbiota to find significant alterations in thefunctional organization of the microbiota metabolic network inobesity and IBD. We also modelled the microbiota as a unifiedsystem, abstracting the union of all enzymes collectively encodedby the different species as part of a single reaction network. Asone of our goals was to discriminate between microbiota and hostproducts of metabolism, we extended this community-levelanalysis by also assembling a host metabolic model, andutilizing this model to identify the metabolic functions thatdepend on the microbiota.

    Although our study uses the mouse as the model hostorganism, the microbiota model was constructed from micro-biome data on the human intestine because the list ofdocumented intestinal bacteria is more extensive for humansthan mice. However, a recent meta-analysis showed that mice andhuman gut microbiota share a very strong similarity (90% and89% of bacterial phyla and genera, respectively)38, and that themost abundant bacterial species are common to human andmurine gut microbiota. Therefore, we assumed that a sampling ofthe most common species found in the human GI tract shouldreasonably approximate the biochemical diversity of the murinegut microbiota.

    We evaluated the pathway analysis results from the combinedmicrobiota–host model by performing targeted metabolomicsexperiments, focusing on AAAs. These amino acids can beendogenously transformed into a variety of bioactive derivativesformed by commensal bacteria in the intestine (for example,indole from Trp4). In the present study, we used a probabilisticpathway construction algorithm to identify metabolic derivativesthat can be produced from Trp, Phe or Tyr via enzymaticreactions, while also mapping the enzymes involved to the host ormicrobiota metabolic network. An alternative approach would beto exhaustively search through the combined microbiota–hostnetwork to build a subnetwork comprising all reactions andmetabolites that connect to a source metabolite, similar to the‘scope’ analysis described in ref. 39. This approach would confer aspeed advantage over pathway construction, if the goals were tosimply enumerate all possible metabolites that are connected tothe source metabolite via enzymatic reactions. However, thisapproach does not provide information on the composition ofpathways that connect a particular biotransformation product toa specific source metabolite. This information is necessary todetermine whether the formation of the biotransformationproduct requires any microbial enzymes. Moreover, the set ofmetabolites resulting from network expansion may not resemble

    the starting metabolite in a meaningful way. For example, Trp canbe converted to glucose and ketone bodies, which in turn can befurther oxidized through the main pathways of central carbonmetabolism. Therefore, a pathway analysis-based approach thatavoids these issues is desirable.

    The algorithm used in this study can efficiently samplepathways connected to a source metabolite of interest, whileshaping pathway construction through the reaction selectioncriteria. Directing the algorithm to only select reactions that areabsent in the host when constructing a biotransformationpathway ensured that the intermediates would not be connectedto other metabolites in the host metabolic network, and thus canbe unambiguously designated as products of microbiota metabo-lism. This search strategy also avoided the enumeration ofcommon hub metabolites that are the intermediates of centralcarbon pathways such as TCA cycle and glycolysis, which arefound in all living cells. A potential drawback, however, is thatthis type of pathway construction cannot fully reflect host–microbiota co-metabolism; that is, certain biotransformationproducts that require multiple reactions expressed in bothmicrobiota and host cannot be identified in this way. One wayto address this issue is to simply restart pathway construction at aterminal metabolite node. For example, restarting the search atindolepyruvate extends this branch by adding a bacterial reactionproducing indole-3-acetaldehyde, before terminating with areaction present in both host and microbiota that producesindole-3-acetate (Supplementary Fig. 7a). Since formation ofindolepyruvate requires a host-specific enzyme, indole-3-acet-aldehyde and indole-3-acetate can be considered products ofhost–microbiota co-metabolism. Using MRM analysis, we foundthat GF caecum samples contained 30-fold less indole-3-acetatecompared with SPF samples (Supplementary Fig. 7b), consistentwith our analysis that production of this metabolite depends onthe microbiota, and highlighting the utility of our approach fordetermining host–microbiota co-metabolism.

    A second way to explore host–microbiota co-metabolism is tosearch exhaustively, where pathway construction can utilize allreactions in the combined host–microbiota model, and then postprocess the search results using annotation data to identifymetabolic products that are downstream of at least onemicrobiota reaction. We explored this idea by limiting thepathway length to four reactions from the source metabolite tokeep computational runtime to within hours. While we found64,741 distinct paths for Trp, many of them do not describe ameaningful biotransformation of the source metabolite. Forexample, this analysis predicts chorismate as a Trp-derivedmetabolite; however, this pathway involves serine and pyruvate asintermediates, casting doubt that chorismate can be meaningfullyclassified as a Trp-derived product. It should be possible tocircumvent this issue by pruning the results of the exhaustivesearch by filtering for a functional group that is characteristic ofthe source metabolite of interest. For example, imposing theconstraint that every intermediary metabolite along a pathwaymust possess a six-carbon aromatic ring eliminates all but 87paths. However, this type of filtering may not be generallyapplicable, since other classes of metabolites, for example, organicacids, may lack a distinctive characteristic functional group.Moreover, exhaustive pathway enumeration has an exponentialruntime of kl, where k is the average number of reactionsconnected to a metabolite and l scales with the maximal pathwaylength. Increasing the length limit to more than four reactionswould lead to runtimes that are on the order of days using atypical workstation. This demonstrates the value of a sampling-based approach for constructing pathways of arbitrary length,which involves a more manageable polynomial runtime withrespect to the number of iterations.

    ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492

    8 NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • We found that both MRM and IDA have advantages anddisadvantages for experimental evaluation of the computationalanalysis. An important benefit of MRM is that when high-puritystandards are available, the selective detection of specificprecursor–product ion pairs can enhance sensitivity and improveLOD. Instrument-specific parameters can be tailored andoptimized for each individual MRM transition, whereas full-scanmethods use a particular set of parameters for all analytes, whichmay be suboptimal for a subset of the metabolites. In practice, wefound that even optimized MRM transitions may not represent aunique mass signature for a metabolite, as there are cases wheremultiple analytes present in a biological sample share the sametransition with the highest intensity. For example, indole 3-acetamide shows a strong signal for the 175-130 transition, asdoes arginine, a highly abundant amino acid. Similarly,metabolites possessing thermally labile bonds can decompose atthe ion source. For example, a pure sample of Trp showed astrong MRM signal for the indole transition (118-91) at theexpected retention time for Trp (Supplementary Fig. 5b), whichwas presumably because of partial decomposition of Trp intoindole by the heated electrospray ionization. In this study, weutilized high-resolution MS to identify metabolites based onaccurate mass as an alternative to MRM MS when high-puritystandards are not available. Specifically, we performed IDAexperiments to collect full-scan MS data with very high massaccuracy, while monitoring the MS/MS spectra for all ionsmeeting a specified count threshold whenever fragmentationcould be achieved.

    Differences in instrument design also contributed to adifference in sensitivity between the MRM and IDA experiments.A comparison of the signal versus concentration plot for Trpshowed that IDA provides approximately one-tenth of thesensitivity of MRM (Supplementary Fig. 4). In fact, certainmetabolites such as indole and salicylate could not be detected atall in a biological sample using IDA, whereas they could bequantified using MRM. These examples suggest that the choicebetween MRM and IDA will have to balance a tradeoff betweensensitivity and resolving power.

    To our knowledge, this is the first targeted metabolomics studyto quantitatively estimate the physiological concentration ofseveral microbiota-produced metabolites present in caecumcontents. While the literature on absolute concentrations ofmicrobiota metabolites is relatively sparse, we found goodagreement between our results and previously reported values.In an early study, Whitt and Demoss40 used an enzymatic assayto determine an indole concentration of B40 nmol g� 1 tissue inmurine caecum, which is comparable to our results (10–40 nmol g� 1 sample wet weight in SPF caecum). In addition, wedetected and identified a number of metabolites using IDA forwhich pure standards were unavailable. While we could not obtainabsolute concentrations for these metabolites, it was possible todetermine fold-changes across GF and SPF samples (Fig. 5).

    Our results on metabolite differences between SPF and GFmice are comparable to other previous studies. Zheng et al.11

    profiled urinary and faecal metabolites of Wistar rats exposed tob-lactam antibiotics and detected tryptamine, indole-3-acetate,indole, shikimate and phenol in urine, and tryptamine andtyramine in faeces, which we detected in the caecum in thepresent study (Supplementary Table 1). They also observed thatthe levels of indole-3-acetate, indole and phenol were lower in theantibiotic-treated mice, presumably because of the effect of theantibiotic on the gut microbiota. These results are in agreementwith the reduction in these metabolites we observed in GF caecalcontents (Fig. 5). Interestingly, Zheng et al.11 reported thatantibiotic exposure increased the level of shikimate relative to thecontrol group, which is also consistent with our data.

    The trends in our data are also similar to those reported byWikoff et al.31, who compared plasma metabolites from GF andconventionally raised (CONV) mice. This comparison found thatindoxyl sulfate and indole-3-propionic acid were present in theplasma of CONV mice, but were absent in GF mice, which isconsistent with our observations that indole is reduced in GFsamples (Fig. 5). Likewise, phenyl sulfate, a xenobiotictransformation product of phenol, was only detected in CONVmice, also in agreement with our results showing that phenol isabsent in GF mice. Similarly, our observation that tyramine levelsare reduced in GF caecal contents is consistent with thedifferences in the colonic luminal contents from GF micerelative to Ex-GF mice inoculated with a faecal suspension fromSPF mice41. However, unlike our study, this comparison of GFand Ex-GF mice did not attempt to attribute changes in themetabolites to specific bacterial pathways. Instead, metaboliteswere grouped into host- or microbiota-contributed productssolely based on their relative amounts in GF and Ex-GF samples.

    The AhR is a ligand-activated transcription factor that plays animportant role in the mucosal immune system42. Earlier work hasshown that the plant secondary metabolite indole-3-carbinol canbind and activate the AhR43. In our study, we observed that fourof the Trp derivatives (5-hydroxy-L-tryptophan, indole-3-acetate,tryptamine and indolepyruvate) and two of the Phe derivatives(salicylate and shikimate) could activate the AhR in a dose-dependent manner (Fig. 6 and Supplementary Fig. 7c). To ourknowledge, this is the first report identifying 5-hydroxy-L-tryptophan, salicylate and shikimate as non-host-derivedactivators for the AhR. Our results are also consistent with aprevious study by Heath-Pagliuso et al.24, who showed thattryptamine and indole-3-acetate are AhR ligands.

    We found that five metabolites predicted to be microbiota-specific are significantly elevated in the GF samples, which issomewhat counterintuitive (Fig. 5). Three of these metabolites,shikimate, 3-dehydroquinate and O5(1-carboxyvinyl)-3-phos-phoshikimate, are intermediates of the shikimate pathway(Supplementary Fig. 2). In order to determine the source ofshikimate, we investigated whether it was present in the micechow. We found that the chow indeed contains shikimate, andthat the levels in the chow fed to SPF and GF mice arecomparable (46 and 48 mM in SPF and GF mice, respectively).Further, IDA MS confirmed that the other metabolites of theshikimate pathway were not present in the chow. This suggeststhat the intestinal bacteria could be utilizing shikimate anddepleting it from the caecum. On the other hand, the presence ofO5(1-Carboxyvinyl)-3-phosphoshikimate in the GF samplessuggests that the host organism may also express enzymes thatcan catalyse the conversion of shikimate, and that these enzymesare missing in the present annotation of the mouse genome.Similarly, we detected significant amounts of 4-amino-4-deox-ychorismate in the GF samples, which could have been producedfrom 4-aminobenzoate via a host enzyme that is missing from thegenome annotation. In contrast, chorismate, prephenate andarogenate are depleted in the GF samples, consistent with the insilico prediction, suggesting that any errors in the annotationinvolve enzymes that are more distal from Phe than chorismate.The physiological significance of shikimate in intestinal home-ostasis warrants further investigation, since we observe shikimateto also be an activator for AhR.

    A logical extension of this work is to predict and identifymolecules generated by the intestinal microbiota from othersource metabolites under different physiologic conditions. Forexample, the methodology described here can be applied toidentify metabolic products of emerging contaminants such asbisphenol A and phthalates, which could give rise to derivativeswith either increased activity or potentially different spectrum of

    NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492 ARTICLE

    NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications 9

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • activity44. Another example is to identify metabolites derivedfrom nutrients thought to provide benefits for gut health, such ascomplex carbohydrates in vegetables. In addition, predictivebiomarkers could be identified for disease states such as obesity,IBD and cancer that are characterized by alterations in the GItract microbiota45. A second possible extension of this work is toincorporate genomic data into the pathway analysis. In this study,we did not differentiate reactions based on their gene abundanceor expression level. Consequently, each candidate reaction has anequal likelihood of selection, which is unlikely to reflect the trueengagements of metabolic reactions in the gut microbiota.Metabolites produced by highly abundant organisms and/orhighly expressed enzymes should more likely be present atquantifiable levels compared with the products of depletedorganisms or minimally expressed enzymes. In this regard,there is an exciting opportunity to further enhance the in silicoprediction step by incorporating metagenomic data, for example,from RNA-seq experiments. Our pathway construction algorithmalready accepts user-specified selection weights and could beextended in a straightforward manner to explore a microbiotametabolic network weighted by relative gene abundances orexpression levels. Prospectively, this type of data integration couldaddress fundamental questions regarding not only who or what ispresent in the gut microbiota but also who is contributing to whatfunction.

    MethodsMaterials. All chemicals including HPLC-grade solvents and high-puritymetabolite standards were purchased from Sigma-Aldrich (St Louis, MO) unlessnoted otherwise. Cell culture reagents were purchased from Life Technologies(Carlsbad, CA).

    Microbiota metabolic network model. A reaction network model of gut micro-biota metabolism was constructed based on genome annotation information onselect species reported to be present in the human GI tract19. The rationale forusing human, as opposed to murine, microbiome data was that the list ofdocumented species was more extensive. Moreover, a recent study showed thathuman and murine gut microbiota share a very strong similarity (90% and 89% ofbacterial phyla and genera, respectively)38. The study by Qin et al.19 reported atotal of 194 strains with annotated genomes available in the HMP Data Analysisand Coordination Center, MetaHIT or GenBank. An organism from this list wasincluded in the model if an exact or species-level match was found among theannotated organisms listed in KEGG because we referenced this database to mapgenes to enzymes to reactions. If multiple strains were listed in KEGG for amatching species, all of the strains were included, with the exception of deadlypathogens. In the case a species-level match could not be found, an organism fromQin et al.19 was included in the microbiota model if a match was found at the genuslevel. In this case, the organism from Qin et al.19 was substituted with all specieslisted in KEGG that belong to the same genus, provided that the species is not adeadly pathogen. Substituting for organisms with a member of the same genusensured that the composition of the model did not significantly deviate from thecurrent consensus on the most abundant phyla in the intestine46. Microbes thatwere matched more than once (as a result of the substitutions) were eliminated toprevent multiple entries in the final model. Out of the 194 annotated bacterialgenomes referenced by Qin et al.19, 176 could be matched to an entry in KEGG atleast at the level of genus it not species or strain. The final list of bacteria in themicrobiota model comprised a total of 149 organisms, including different strains ofthe same species. The full list of organisms included in the microbiota model isprovided in Supplementary Data 1.

    A schematic outlining the steps used to assemble the microbiota reactionnetwork model is shown in Supplementary Fig. 8. Using a script written inMATLAB (MathWorks, Natick, MA), the KEGG enzyme database was searchedfor entries encoded by the genomes of the strains in the microbiota model. Each ofthe 6,043 enzymes in the data file was flagged as present or absent in the 149microbiota model strains, resulting in a binary matrix E, where element eij denotesthe presence (‘1’) or absence (‘0’) of an enzyme i in species j. A similar search wasthen conducted through the KEGG orthology data file to generate a matrix K,where element kij denotes the presence (‘1’) or absence (‘0’) of an orthologue groupi in species j. This additional search was necessary because genome annotation inKEGG sometimes associates a gene product with a reaction without assigning anenzyme commission number (EC no.) for the reaction. On the basis of the enzyme(E) and orthology (K) matrices, corresponding reaction matrices RE and RK weregenerated using a text search through the KEGG reaction database. In thesematrices, element rEij or rKij was set to 1 if enzyme or ortholog group j catalyses

    reaction i, and to 0 otherwise. Multiplying the enzyme or orthology matrix (E or K)with the corresponding reaction matrix (RE or RK), followed by reconciling the twomatrix products (to remove duplicate entries) resulted in a matrix S specifying themetabolic reaction set available to each organism, where element sij is a non-zerovalue if reaction i is catalysed by an enzyme present in organism j, and 0 otherwise.Supplementary Data 2 list the full set of microbiota model reactions and theirdefinitions. On the basis of the reaction definitions, a corresponding set ofmetabolites was compiled from the KEGG compound database. The total numberof reactions and metabolites in the microbiota model was 2,409 and 2,274,respectively.

    Uniqueness and classification of microbiota reactions. The species in themicrobiota model were hierarchically clustered based on the metabolic reaction setsencoded by their genomes as specified in the S matrix. For the purpose of clus-tering, all non-zero entries were first set to 1 to indicate the presence of thecorresponding reaction in a given species. The dendrograms and heatmap (Fig. 7)were generated using a built-in function (clustergram) of the Bioinformaticstoolbox in MATLAB. The similarity metric and linkage type were correlationdistance and average linkage, respectively. The resulting clusters of species indi-cated that there were metabolic functions (groups of reactions) unique to thespecies in a particular phylum. To determine which metabolic functions are uniqueto a particular subset of species or common to all species, a ‘phylum score’ wascalculated as follows.

    Pij¼nijnj

    In the above equation, nj is the total number of species in phylum j, and nij isthe number of species expressing reaction i in the phylum. A score of 1 indicatesthat every species in a given phylum catalyses the reaction, whereas a score of 0indicates that no species in the phylum catalyses the reaction. On the basis of thesescores, each reaction in the microbiota model was classified as ‘unique’ or‘common.’ A reaction was designated as common if the corresponding scores were40 for more than one phylum group, otherwise designated as unique. Finally, toassociate the uniqueness of reactions with metabolic function, the reactions weresorted into functional groups based on their KEGG pathway module assignments.In the case where a reaction was associated with more than one KEGG pathwaymodules, we used the primary assignment.

    Pathway analysis. We used computational pathway analysis to identify possiblebiotransformation products of AAAs that depend on microbiota metabolism. Apreviously published genome-scale metabolic model of the mouse47 was used torepresent host metabolism, consistent with the experimental model used in thisstudy. The published mouse model was manually proofread to account for anydiscrepancies with the most recent version of the KEGG Reaction database. Afterproofreading and eliminating generic reactions, the final number of uniquereactions and metabolites in the mouse model were 2,182 and 2,119, respectively.This mouse model was combined with the microbiota model described above. Aftereliminating duplicate entries, the combined model consisted of 3,449 reactions and3,076 metabolites, with 1,142 reactions and 1,317 metabolites present in both themouse and microbiota.

    The pathway analysis used in this study built on a pathway constructionalgorithm previously developed to explore novel synthesis pathways for both nativeand non-native metabolites in a microbial metabolic engineering host48. For thisstudy, we modified this algorithm to define the search space in terms of reactions,rather than metabolites. The algorithm recursively constructs a tree, starting from auser-specified source metabolite as the root of the tree. A single reaction israndomly selected from a list of candidate reactions that involve the sourcemetabolite as a main reactant. Candidate reactions for pathway construction weredrawn from the combined microbiota and host model, which represents theuniverse of reactions that could be expressed in the murine gut, if the models areassumed to be complete. The selected reaction is then added to the tree andrepresented by an edge. This edge expands the tree by attaching new nodesrepresenting the product metabolites and cofactors of the selected reaction. Theconstruction thus proceeds in a depth-first manner. Each of these nodes is a newroot for the recursion, unless (a) the metabolite or cofactor was previously added tothe tree, or (b) all reactions consuming or producing the metabolite or cofactor arepresent in the host (that is, mouse model).

    To achieve reasonable runtimes (on the order of a few minutes for a run ofseveral thousand iterations), the size of the search space is further constrained byplacing an upper limit on the number of reactions that can be used to construct apathway. In this study, the upper limit was varied from 20 to 50, which had noobservable impact on the number of the unique pathways and metabolitespredicted by the algorithm. When the addition of a reaction to the tree violates theupper limit, the algorithm backtracks and proceeds by adding to the tree anotherreaction that has not been previously explored, effectively identifying an alternativepathway. If none of these alternative routes satisfy the pathway length limit, thealgorithm further backtracks and continues from there. The algorithm finisheswhen all permitted-length branches of the tree terminate in a metabolite that isnative to the host organism. Owing to the probabilistic nature of selecting thereactions, the completed tree does not exhaustively enumerate all possible

    ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492

    10 NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • pathways. Rather, each tree represents a single pathway from the source metaboliteto one or more product metabolites that are native to the host organism. Therefore,the search is iterated many times until no more unique trees can be constructed. Inour previous work, we found that the probabilistic search matches an exhaustivesearch in terms of sampling diversity, and dramatically outperforms the exhaustivesearch in terms of computational efficiency48.

    Sample collection. Female C57BL/6 SPF and C57BL/6 GF mice at 6 weeks of agewere purchased from Taconic (Albany, NY). The sample size (n¼ 7) was selectedto achieve a power of at least 0.80. The power analysis was performed a priori andassumed that the s.d. was 40% of the group mean. The GF mice were shipped in aGF transport container. All mice were weighed and killed immediately upon arrivalat the animal facility at Texas A&M Health Science Center. Caecum contents werecollected, weighed, flash-frozen and stored at � 80 �C before metabolite extraction.All animals were handled in accordance with the Texas A&M University HealthSciences Center Institutional Animal Care and Use Committee guidelines under anapproved animal use protocol.

    Metabolite extraction. Metabolites were extracted from caecum luminal contentsusing a solvent-based method49 with minor modifications. Briefly, 1.5 ml of ice-cold methanol/chloroform (2:1, v/v) was added to a sample tube containing a pre-weighed luminal content or faecal sample. After homogenization on ice, the sampletube was centrifuged under refrigeration (4 �C) at 15,000 g for 10 min. Thesupernatant was then transferred to a new sample tube through a (70-mm) cellstrainer. After adding 0.6 ml of ice-cold water, the sample tube was vortexedvigorously and centrifuged under refrigeration (4 �C) at 15,000 g for 5 min to obtainphase separation. The upper and lower phases were separately collected into freshsample tubes with a syringe, taking care not to disturb the interface. To improvesignal intensity for MS, 400ml of the polar phase was concentrated by solventevaporation in an Eppendorf speedvac concentrator (Eppendorf, Hauppauge, NY),and then reconstituted in 40 ml of methanol/water (1:1, v/v) for subsequentanalysis. Extracted metabolites were stored at � 80 �C until analysis.

    Multiple reaction monitoring. In the case where pure chemical standards wereavailable for purchase, metabolites were analysed using MRM to obtain absolutequantitation. We found that this MS method could provide greater sensitivity forselected metabolites compared with information-dependent acquisition (IDA) asassessed by the dynamic range of the standard curves (Supplementary Fig. 4).Before sample analysis, MS parameters were optimized for each target metaboliteto identify the MRM transition (precursor/product fragment ion pair) with thehighest intensity under direct injection at 10 ml min� 1. The target metabolites insamples were detected and quantified on a triple quadrupole linear ion trap mass

    spectrometer (3200 QTRAP, AB SCIEX, Foster City, CA) coupled to a binarypump HPLC (Prominence LC-20, Shimazu, Concord, Ontario, Canada). Sampleswere maintained at 4 �C on an autosampler before injection. Chromatographicseparation was achieved on a hydrophilic interaction column (Luna 5 mm NH2100 Å 250� 2 mm, Phenomenex, Torrance, CA) using a solvent gradientmethod50. Solvent A was an ammonium acetate (20 mM) solution in water with 5%acetonitrile (v/v). The pH of solvent A was adjusted to 9.5 immediately beforeanalysis using ammonium hydroxide. Solvent B was pure acetonitrile. Injectionvolume was 10ml. The gradient method is shown in Supplementary Table 2.

    Peak identification and integration were performed using the Analyst software(version 5, Agilent, Foster City, CA) to calculate the area under curve (AUC) foreach metabolite identified in the polar phase. The total moles of each metabolitewas calculated from standard curves and normalized to the mass of luminalcontents. This quantity was divided by the density of the caecal contents (assumedto be the same as water) to determine the concentration (mM) of each metabolite incaecal luminal contents based on the fraction partitioned into the polar phase. Themeasured concentration is a conservative lower estimate as it does not account foreither the partition of the metabolite into the non-polar phase or matrix effectsresulting in ion suppression. As such, these concentration values should beinterpreted as an order of magnitude estimate and was used primarily forsemiquantitative comparisons between the two experimental groups.

    Information-dependent acquisition. Pure standards could not be obtained for alarge number of the metabolic derivatives identified in the pathway analysis. Thesemetabolites were analysed using IDA experiments performed on a triple quadru-pole TOF mass spectrometer (TripleTOF 5600þ , AB SCIEX) coupled to a binarypump HPLC system (1260 Infinity, Agilent Technologies, Santa Clara, CA).Chromatographic separation was performed as described for the MRM experi-ments. Samples eluting from the column were injected into the mass analyser via aDuoSpray ion source (TurboIonSpray probe, AB SCIEX). Each sample was runtwice, with the mass analyser operated in either positive or negative ion mode. TheIDA method for both polarities (cycle time 650 ms) included a TOF MS (survey)scan (accumulation time: 250 ms; collision energy (CE): ±10 V) and four depen-dent (triggered), high-resolution MS/MS (product ion) scans (accumulation time:100 ms each: CE, ±45 V). The TOF MS and product ion scans all had mass rangesof m/z 50 to 1,000 for both polarities. The IDA method included an automaticcalibration step performed after every five samples using a polypropylene glycolsolution. The identity of a detected metabolite was determined based on exact massand isotope pattern, and, whenever possible, confirmed by comparing the calcu-lated and measured MS/MS fragmentation patterns using PeakView (version 1.2,AB SCIEX). A difference of 30 p.p.m. was used as the tolerance threshold betweenthe measured exact mass and the corresponding theoretical value calculated fromthe molecular formula. For each metabolite with a confirmed chemical identity, the

    Figure 7 | Heat map and dendrograms show hierarchical clustering of organisms in the microbiota model based on the similarity of reaction sets

    encoded by their genomes. Clustering was performed on the reaction-organism matrix (S, see Methods for details), with the data standardized

    along the rows. Rows (reactions), and then columns (organisms), were clustered using the average linkage method based on correlation distance as the

    similarity metric. Colours indicate standardized row values relative to the mean. Red, black and green indicate values above, at and below the mean.

    NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492 ARTICLE

    NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications 11

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • corresponding peak in the chromatogram was integrated using MultiQuant (ver-sion 2.1, AB SCIEX) to determine the AUC. Fold changes in metabolite levelsbetween different samples were calculated based on the AUC values normalized tothe corresponding sample (wet caecum content) weights.

    Cell culture. MCF-7 human breast cancer cells were obtained from ATCC(Manassas, VA). Cells were cultured at 37 �C with 5% CO2 in RPMI 1640 medium(MP Biomedicals, Solon, OH) supplemented with 10% (v/v) fetal bovine serum,glucose (2.5 g l� 1), HEPES (10 mM), sodium pyruvate (1 mM), sodium bicarbo-nate (2 g l� 1), penicillin (100 U ml� 1) and streptomycin (100 mg ml� 1).

    Construction of GLuc reporter plasmid for AhR activation. A lentiviral reporterplasmid for monitoring activation of AhR was constructed as described below. AhRresponse elements in target promoter were identified using the TRANSFACdatabase 7.0 Public. An oligonucleotide containing three repeats of the bindingsequence (CTGAGGCTAGCGTGCGT) separated by four to six bases (spacersequence) was chemically synthesized with two restriction enzyme (EcoRI andAfeI) cleavage sites at the ends. The RE oligonucleotide was cloned into a lentiviralvector51 in which expression of the GLuc gene is under the control of a minimalcytomegalovirus promoter and red fluorescent protein (RFP) is constitutivelyexpressed. Expression of GLuc is induced when ligand-activated AhR binds to itsRE. Clones containing the correct RE were identified by multiple restrictionenzyme digests and verified by sequencing.

    Generation of a stable MCF-7 AhR reporter cell line. A stable MCF-7 AhRreporter cell line (MCF-7/AhR-GLuc) was generated by lentiviral transduction. Toproduce lentiviral particles, AhR reporter plasmid and packaging plasmids psPAX(plasmid 12260, Addgene, MA) and pMD2.G (plasmid 12259, Addgene) were co-transfected into 293T/17 cells using the calcium phosphate transfection method52.After 24 h following the transfection, the medium was replenished and 5 mM ofsodium butyrate was added. After an additional 24 h of incubation, culturesupernatants containing viral particles were collected, pooled, filtered with 0.45 mmfilters and centrifuged for 2 h at 4 �C at 48,000� g. The viral titre was measuredusing a Lenti-X qRT-PCR titration kit (Clontech, Palo Alto, CA). To transduceMCF-7 cells, a concentrated aliquot of virus particles (B1� 108 IFU) was added tothe cells in presence of Polybrene (hexadimethrine bromide). After 4 h ofincubation with the virus particles, the medium was replenished.

    AhR activation studies. MCF-7/AhR-GLuc reporter cells were seeded in 24-welltissue culture plates and grown to 70% confluence. Cells were treated with indi-cated concentrations of target metabolites (5-hydroxy-L-tryptophan, indole, indo-lepyruvate, salicylate, shikimate, tryptamine, chorismate, tyramine, 3, 4-dihydroxy-L-phenylalanine and indole-3-acetate). The negative control was 0.1% (v/v) N, N-dimethylformamide and the positive controls were 20 nM TCDD and 100 mMkynurenine. For assays using the MCF-7 reporter cells, 20 ml of culture supernatantwas collected at 48 h post treatment. The MCF-7 supernatant samples were storedat � 20 �C until the secreted luciferase activity was measured. The luciferaseactivity (RLUs) was used to calculate the rate of GLuc production (RLU divided bythe time over which GLuc was secreted). To account for differences in cell densitybetween different experiments, the GLuc production rate was normalized by theintensity (relative fluorescence units) of the constitutively expressed RFP measuredat 550/600 nm excitation/emission.

    Statistical analysis. Comparisons of the medians between the metabolite levels ofSPF and GF mice were performed with the non-parametric two-sided Mann–Whitney U-test. The null hypothesis that the two medians are the same wasrejected for Po0.05. Comparisons of the means for the reporter experiments wereperformed using the Student’s t-test. The null hypothesis that the two means arethe same was rejected for Po0.05.

    References1. Chassaing, B. & Darfeuille-Michaud, A. The commensal microbiota and

    enteropathogens in the pathogenesis of inflammatory bowel diseases.Gastroenterology 140, 1720–1728 (2011).

    2. Burcelin, R., Serino, M., Chabo, C., Blasco-Baque, V. & Amar, J. Gut microbiotaand diabetes: from pathogenesis to therapeutic perspective. Acta Diabetol. 48,257–273 (2011).

    3. Turnbaugh, P. J. & Gordon, J. I. The core gut microbiome, energy balance andobesity. J. Physiol. 587, 4153–4158 (2009).

    4. Bansal, T., Alaniz, R. C., Wood, T. K. & Jayaraman, A. The bacterial signalindole increases epithelial-cell tight-junction resistance and attenuatesindicators of inflammation. Proc. Natl Acad. Sci. USA 107, 228–233 (2010).

    5. Shimada, Y. et al. Commensal bacteria-dependent indole production enhancesepithelial barrier function in the colon. PLoS ONE 8, e80604 (2013).

    6. Fukuda, S. et al. Bifidobacteria can protect from enteropathogenic infectionthrough production of acetate. Nature 469, 543–547 (2011).

    7. Arpaia, N. et al. Metabolites produced by commensal bacteria promoteperipheral regulatory T-cell generation. Nature 504, 451–455 (2013).

    8. Park, J. et al. Short-chain fatty acids induce both effector and regulatory T cellsby suppression of histone deacetylases and regulation of the mTOR-S6Kpathway. Mucosal Immunol. doi: 10.1038/mi.2014.44 (2014).

    9. van Duynhoven, J. et al. Metabolic fate of polyphenols in the humansuperorganism. Proc. Natl Acad. Sci. USA 108, 4531–4538 (2011).

    10. Peregrin-Alvarez, J. M., Sanford, C. & Parkinson, J. The conservation andevolutionary modularity of metabolism. Genome Biol. 10, R63 (2009).

    11. Zheng, X. et al. The footprints of gut microbial-Mammalian co-metabolism.J. Proteome Res. 10, 5512–5522 (2011).

    12. Lu, W., Bennett, B. D. & Rabinowitz, J. D. Analytical strategies for LC-MS-based targeted metabolomics. J. Chromatogr. B Analyt. Technol. Biomed. LifeSci. 871, 236–242 (2008).

    13. Brown, M. et al. Automated workflows for accurate mass-based putativemetabolite identification in LC/MS-derived metabolomic datasets.Bioinformatics 27, 1108–1112 (2011).

    14. Martin, F. P. et al. Dietary modulation of gut functional ecology studied by fecalmetabonomics. J. Proteome Res. 9, 5284–5295 (2010).

    15. Greenblum, S., Turnbaugh, P. J. & Borenstein, E. Metagenomic systems biologyof the human gut microbiome reveals topological shifts associated with obesityand inflammatory bowel disease. Proc. Natl Acad. Sci. USA 109, 594–599(2012).

    16. Stockinger, B., Hirota, K., Duarte, J. & Veldhoen, M. External influences on theimmune system via activation of the aryl hydrocarbon receptor. Semin.Immunol. 23, 99–105 (2011).

    17. Zelante, T. et al. Tryptophan catabolites from microbiota engage arylhydrocarbon receptor and balance mucosal reactivity via interleukin-22.Immunity 39, 372–385 (2013).

    18. Jaichander, P., Selvarajan, K., Garelnabi, M. & Parthasarathy, S. Induction ofparaoxonase 1 and apolipoprotein A-I gene expression by aspirin. J. Lipid Res.49, 2142–2148 (2008).

    19. Qin, J. et al. A human gut microbial gene catalogue established by metagenomicsequencing. Nature 464, 59–65 (2010).

    20. Tanabe, M. & Kanehisa, M. Using the KEGG database resource. CurrentProtocols in Bioinformatics/Editoral Board, Andreas D Baxevanis [et al]Chapter 1 (2012).

    21. Qiu, J. et al. The aryl hydrocarbon receptor regulates gut immunity throughmodulation of innate lymphoid cells. Immunity 36, 92–104 (2012).

    22. Monteleone, I. et al. Aryl hydrocarbon receptor-induced signals up-regulateIL-22 production and inhibit inflammation in the gastrointestinal tract.Gastroenterology 141, 237–248 (2011).

    23. Monteleone, I., MacDonald, T. T., Pallone, F. & Monteleone, G. The arylhydrocarbon receptor in inflammatory bowel disease: linking the environmentto disease pathogenesis. Curr. Opin. Gastroenterol. 28, 310–313 (2012).

    24. Heath-Pagliuso, S. et al. Activation of the Ah receptor by tryptophan andtryptophan metabolites. Biochemistry 37, 11508–11515 (1998).

    25. Mezrich, J. D. et al. An interaction between kynurenine and the arylhydrocarbon receptor can generate regulatory T cells. J. Immunol. 185,3190–3198 (2010).

    26. Holmes, J. L. & Pollenz, R. S. Determination of aryl hydrocarbon receptornuclear translocator protein concentration and subcellular localization inhepatic and nonhepatic cell culture lines: development of quantitative Westernblotting protocols for calculation of aryl hydrocarbon receptor and arylhydrocarbon receptor nuclear translocator protein in total cell lysates. Mol.Pharmacol. 52, 202–211 (1997).

    27. El-Zaatari, M. et al. Tryptophan catabolism restricts IFN-gamma-expressingneutrophils and clostridium difficile immunopathology. J. Immunol. 193,807–816 (2014).

    28. Handelsman, J. Metagenomics: application of genomics to unculturedmicroorganisms. Microbiol. Mol. Biol. Rev. 68, 669–685 (2004).

    29. Jones, B. V., Begley, M., Hill, C., Gahan, C. G. M. & Marchesi, J. R.Functional and comparative metagenomic analysis of bile salt hydrolase activityin the human gut microbiome. Proc. Natl Acad. Sci. USA 105, 13580–13585(2008).

    30. Turnbaugh, P. J. et al. An obesity-associated gut microbiome with increasedcapacity for energy harvest. Nature 444, 1027–1031 (2006).

    31. Wikoff, W. R. et al. Metabolomics analysis reveals large effects of gut microfloraon mammalian blood metabolites. Proc. Natl Acad. Sci. USA 106, 3698–3703(2009).

    32. Gille, C. et al. HepatoNet1: a comprehensive metabolic reconstruction of thehuman hepatocyte for the analysis of liver physiology. Mol. Syst. Biol. 6, 411(2010).

    33. Aziz, R. K. et al. SEED servers: high-performance access to the SEED genomes,annotations, and metabolic models. PLoS ONE 7, e48053 (2012).

    34. Heinken, A., Sahoo, S., Fleming, R. M. & Thiele, I. Systems-levelcharacterization of a host-microbe metabolic symbiosis in the mammalian gut.Gut Microbes 4, 28–40 (2013).

    ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492

    12 NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunications

  • 35. Shoaie, S. et al. Understanding the interactions between bacteria in the humangut through metabolic modeling. Sci. Rep. 3, 2532 (2013).

    36. Jiao, D., Ye, Y. & Tang, H. Probabilistic inference of biochemical reactions inmicrobial communities from metagenomic sequences. PLoS Comput. Biol. 9,e1002981 (2013).

    37. Marcobal, A. et al. A metabolomic view of how the human gut microbiotaimpacts the host metabolome using humanized and gnotobiotic mice. ISME J.7, 1933–1943 (2013).

    38. Krych, L., Hansen, C. H., Hansen, A. K., van den Berg, F. W. & Nielsen, D. S.Quantitatively different, yet qualitatively alike: a meta-analysis of the mousecore gut microbiome with a view towards the human gut microbiome. PLoSONE 8, e62578 (2013).

    39. Handorf, T., Ebenhoh, O. & Heinrich, R. Expanding metabolic networks:scopes of compounds, robustness, and evolution. J. Mol. Evol. 61, 498–512(2005).

    40. Whitt, D. D. & Demoss, R. D. Effect of microflora on the free amino aciddistribution in various regions of the mouse gastrointestinal tract. Appl.Microbiol. 30, 609–615 (1975).

    41. Matsumoto, M. et al. Impact of intestinal microbiota on intestinal luminalmetabolome. Sci. Rep. 2, 233 (2012).

    42. Li, Y. et al. Exogenous stimuli maintain intraepithelial lymphocytes via arylhydrocarbon receptor activation. Cell 147, 629–640 (2011).

    43. Bjeldanes, L. F., Kim, J. Y., Grose, K. R., Bartholomew, J. C. & Bradfield, C. A.Aromatic hydrocarbon responsiveness-receptor agonists generated fromindole-3-carbinol in vitro and in vivo: comparisons with 2,3,7,8-tetrachlorodibenzo-p-dioxin. Proc. Natl Acad. Sci. USA 88, 9543–9547 (1991).

    44. Van de Wiele, T. et al. Human colon microbiota transform polycyclic aromatichydrocarbons to estrogenic metabolites. Environ. Health Perspect. 113, 6–10(2005).

    45. Arthur, J. C. et al. Intestinal inflammation targets cancer-inducing activity ofthe microbiota. Science 338, 120–123 (2012).

    46. Eckburg, P. B. et al. Diversity of the human intestinal microbial flora. Science308, 1635–1638 (2005).

    47. Quek, L. E. & Nielsen, L. K. On the reconstruction of the Mus musculusgenome-scale metabolic network model. Genome Inform. 21, 89–100 (2008).

    48. Yousofshahi, M., Lee, K. & Hassoun, S. Probabilistic pathway construction.Metab. Eng. 13, 435–444 (2011).

    49. Sellick, C. A. et al. Evaluation of extraction processes for intracellularmetabolite profiling of mammalian cells: matching extraction approaches to celltype and metabolite targets. Metabolomics 6, 427–438 (2010).

    50. Bajad, S. U. et al. Separation and quantitation of water soluble cellularmetabolites by hydrophilic interaction chromatography-tandem massspectrometry. J. Chromatogr. A 1125, 76–88 (2006).

    51. Tian, J., Alimperti, S., Lei, P. & Andreadis, S. T. Lentiviral microarrays forreal-time monitoring of gene expression dynamics. Lab. Chip 10, 1967–1975(2010).

    52. Carlotti, F. et al. Lentiviral vectors efficiently transduce quiescent mature3T3-L1 adipocytes. Mol. Ther. 9, 209–217 (2004).

    AcknowledgementsThis work was partially supported by grants from the NSF (CBET 0846453 to A.J., CBET1264502 to K.L. and A.J., CBET 1337760 and CBET 0812381 to K.L.) and the NIH(1R21GM106251 to K.L. and A.J.; 1R21AI095788 to A.J. and RCA; 1R56DK081768 toK.L. and A.J.).

    Author contributionsG.V.S., R.C.A., K.L. and A.J. designed experiments. G.V.S., K.C., C.K., C.W., D.P., S.S.and C.M. carried out experiments. G.V.S., K.C., C.K. and C.W. analysed the data. G.V.S.,K.C., K.L. and A.J. wrote the manuscript.

    Additional informationSupplementary Information accompanies this paper at http://www.nature.com/naturecommunications

    Competing financial interests: The authors declare no competing financial interest.

    Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/

    How to cite this article: Sridharan, G. V. et al. Prediction and quantificationof bioactive microbiota metabolites in the mouse gut. Nat. Commun. 5:5492doi: 10.1038/ncomms6492 (2014).

    NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6492 ARTICLE

    NATURE COMMUNICATIONS | 5:5492 | DOI: 10.1038/ncomms6492 | www.nature.com/naturecommunications 13

    & 2014 Macmillan Publishers Limited. All rights reserved.

    http://www.nature.com/naturecommunicationshttp://www.nature.com/naturecommunicationshttp://npg.nature.com/reprintsandpermissions/http://npg.nature.com/reprintsandpermissions/http://www.nature.com/naturecommunications

    title_linkResultsDiversity and uniqueness of microbiota metabolic functionsBiotransformation of aromatic amino acids

    Figure™1Function categories of reactions in the combined microbiota-host model.Colours indicate whether the reactions are host-specific, microbiome-specific or are present in both. The length of each coloured segment is proportional to the fraction contriMetabolite analysis

    Figure™2Distribution of functional categories across microbiota.(a) Function categories of reactions in the gut microbiota model and their uniqueness determined at the phylum level. Uniqueness was determined based on the phyla score (see Methods). (b) DisFigure™3Metabolic derivatives of tryptophan (Trp) identified from probabilistic pathway construction.Starting with Trp, pathways were constructed by recursively adding a reaction and then checking whether the reaction is present in the host. The tree diagFigure™4Summary of predicted and detected metabolites.A metabolite was classified as microbiota, host or both based on whether a reaction involving the metabolite is microbiota-specific, host-specific or expressed in both the microbiota and host per KEGG Activation of AhR by microbiota metabolites

    Figure™5Comparison of metabolite concentrations in caecum luminal contents from SP and GF mice determined using MRM and IDA MS.Metabolite concentrations (MRM) and counts (IDA) were normalized to the corresponding wet weight of caecal contents. Circles (a)DiscussionFigure™6Dose-dependent activation of the AhR by AAA-derived metabolites.AhR activity is reported as the rate of luciferase activity normalized to red fluorescence of constitutively expressed RFP (RLUsolhsolrelative fluorescence unit (RFU)). Positive contrMethodsMaterialsMicrobiota metabolic network modelUniqueness and classification of microbiota reactionsPathway analysisSample collectionMetabolite extractionMultiple reaction monitoringInformation-dependent acquisition

    Figure™7Heat map and dendrograms show hierarchical clustering of organisms in the microbiota model based on the similarity of reaction sets encoded by their genomes.Clustering was performed on the reaction-organism matrix (S, see Methods for details), witCell cultureConstruction of GLuc reporter plasmid for AhR activationGeneration of a stable MCF-7 AhR reporter cell lineAhR activation studiesStatistical analysis

    ChassaingB.Darfeuille-MichaudA.The commensal microbiota and enteropathogens in the pathogenesis of inflammatory bowel diseasesGastroenterology140172017282011BurcelinR.SerinoM.ChaboC.Blasco-BaqueV.AmarJ.Gut microbiota and diabetes: from pathogenesis to theThis work was partially supported by grants from the NSF (CBET 0846453 to A.J., CBET 1264502 to K.L. and A.J., CBET 1337760 and CBET 0812381 to K.L.) and the NIH