Genome Sequence of the Model Medicinal

    ARTICLE

    NATURE COMMUNICATIONS | 3:913 | DOI: 10.1038/ncomms1923 | www.nature.com/naturecommunications

    © 2012 Macmillan Publishers Limited. All rights reserved.

    Received 2 Feb 2012 | Accepted 22 May 2012 | Published 26 Jun 2012 | DOI: 10.1038/ncomms1923

    Ganoderma lucidum is a widely used medicinal macrofungus in traditional Chinese medicine

    that creates a diverse set of bioactive compounds. Here we report its 43.3-Mb genome,

    encoding 16,113 predicted genes, obtained using next-generation sequencing and optical

    mapping approaches. The sequence analysis reveals an impressive array of genes encoding

    cytochrome P450s (CYPs), transporters and regulatory proteins that cooperate in secondary

    metabolism. The genome also encodes one of the richest sets of wood degradation enzymes

    among all of the sequenced basidiomycetes. In all, 24 physical CYP gene clusters are identified. Moreover, 78 CYP genes are coexpressed with lanosterol synthase, and 16 of these show high

    similarity to fungal CYPs that specifically hydroxylate testosterone, suggesting their possible

    roles in triterpenoid biosynthesis. The elucidation of the G. lucidum genome makes this organism

    a potential model system for the study of secondary metabolic pathways and their regulation

    in medicinal fungi.

    1Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China. 2Department of

    Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, Tennessee 38163, USA. 3Laboratory for Molecular

    and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison,

    Wisconsin 53706, USA. 4Architecture et Fonction des Macromolécules Biologiques, Aix-Marseille Université, CNRS UMR 7257, 163 Avenue de Luminy,

    13288 Marseille Cedex 09, France. 5INRA, UMR1163 de Biotechnologie des Champignons Filamenteux, Aix-Marseille Université, ESIL, 163 Avenue de

    Luminy, CP 925, 13288 Marseille Cedex 09, France. 6National Evolutionary Synthesis Center, 2024 W. Main Street A200, Durham, North Carolina 27705,

    USA. 7China Academy of Chinese Medical Sciences, Beijing 100700, China. *These authors contributed equally to this work. Correspondence and requests

    for materials should be addressed to S.C. (email: [email protected]) or to C.S. (email: [email protected]).

    Genome sequence of the model medicinal

    mushroom Ganoderma lucidum

    Shilin Chen1,*, Jiang Xu1,*, Chang Liu1,*, Yingjie Zhu1, David R. Nelson2, Shiguo Zhou3, Chunfang Li1,

    Lizhi Wang1, Xu Guo1, Yongzhen Sun1, Hongmei Luo1, Ying Li1, Jingyuan Song1, Bernard Henrissat4,

    Anthony Levasseur5, Jun Qian1, Jianqin Li1, Xiang Luo1, Linchun Shi1, Liu He1, Li Xiang1, Xiaolan Xu1,

    Yunyun Niu1, Qiushi Li1, Mira V. Han6, Haixia Yan1, Jin Zhang1, Haimei Chen1, Aiping Lv7, Zhen Wang1,

    Mingzhu Liu1, David C. Schwartz3 & Chao Sun1

    Ganoderma lucidum, also known as the mushroom of immortality and the symbol of traditional Chinese medicine, is one of the best-known medicinal macrofungi in the world. Its pharmacological activities are widely recognized, as indicated by its inclusion in the American Herbal Pharmacopoeia and Therapeutic Compendium [1]. Modern pharmacological research has demonstrated that G. lucidum exhibits multiple therapeutic activities, including antitumour, antihypertensive, antiviral and immunomodulatory activities [2]. G. lucidum produces a large reservoir of bioactive compounds; thus far, more than 400 different compounds have been identified [3], making this fungus a virtual cellular factory for biologically useful compounds. Triterpenoids and polysaccharides are the two major categories of pharmacologically active compounds in G. lucidum. In addition to producing these bioactive chemical compounds, G. lucidum, like other white rot basidiomycetes, secretes enzymes that can effectively decompose both cellulose and lignin. Such enzyme activities may prove useful for biomass utilization, fibre bleaching and organo-pollutant degradation [4].

    Our understanding of G. lucidum biology is limited despite its venerable role in traditional Chinese medicine and its impressive arsenal of bioactive compounds. Here, we report the complete genome sequence of monokaryotic G. lucidum strain 260125-1, and we identify a large set of genes and potential gene clusters involved in secondary metabolism and its regulation. This genomic information helps elucidate the molecular mechanisms underlying the synthesis of diverse secondary metabolites in medicinal fungi. The genome sequence will make it possible to realize the full potential of G. lucidum as a source of pharmacologically active compounds and industrial enzymes.

    Results

    Genome sequence assembly and annotation. We sequenced the genome of the haploid G. lucidum strain 260125-1 (Supplementary Note 1 and Supplementary Fig. S1) using a whole-genome shotgun sequencing strategy. A 43.3-Mb genome sequence was obtained by assembling approximately 218 million Roche 454 and Illumina reads (~440× coverage) (Table 1 and Supplementary Table S1). This genome sequence assembly consisted of 82 scaffolds (Supplementary Table S2), which were ordered and oriented onto 13 chromosome-wide optical maps (Fig. 1, Supplementary Table S3 and Supplementary Fig. S2). A comparison of the sequence scaffolds and optical maps showed greater than 86% congruency, indicating the high quality of the genome sequence assembly. In total, 16,113 gene models were predicted, with an average sequence length of 1,556 bp (Supplementary Table S4), comparable to the genomes of other filamentous fungi [5–7]. On average, each predicted gene contains 4.7 exons, with 85.4% of the genes containing introns. The overall GC content is approximately 55.9% (59.0% for exons, 52.2% for introns and 53.7% for intergenic regions). Repetitive sequences represent approximately 8.15% of the genome. The majority of the repeats are LTR/Gypsy (3.92% of the genome; Supplementary Note 2 and Supplementary Table S5). Approximately 70% of the genes were annotated by similarity searches against homologous sequences and protein domains (Supplementary Table S6).
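The quoted sequencing depth can be sanity-checked with the usual relation depth = (number of reads × mean read length) / genome size. The sketch below is an illustration, not the authors' calculation: the function names are hypothetical, and because the 218 million reads mix Roche 454 and Illumina data, the mean read length it backs out is only an implied average, not a figure reported in the text.

```python
def sequencing_depth(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Fold coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_len / genome_size


def implied_mean_read_length(n_reads: int, depth: float, genome_size: int) -> float:
    """Invert the depth relation to get the average read length it implies."""
    return depth * genome_size / n_reads


# Figures quoted in the text: ~218 million reads, ~440x depth, 43.3-Mb assembly.
N_READS = 218_000_000
DEPTH = 440.0
GENOME_SIZE = 43_300_000

mean_len = implied_mean_read_length(N_READS, DEPTH, GENOME_SIZE)
print(f"implied mean read length: {mean_len:.1f} bp")
```

Under these rounded inputs the implied average comes out a little under 90 bp, which is plausible for a data set dominated by short Illumina reads; the exact value depends on how the read count and depth were rounded.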

    Comparisons with other fungal genomes. The predicted proteome of G. lucidum was compared with those of 14 other sequenced fungi. OrthoMCL analysis revealed that 4.5% of the predicted proteins in G. lucidum have orthologues in all other species, whereas 43.8% of the proteins are unique to G. lucidum; approximately 35.3% of the unique proteins have at least one paralogue (Supplementary Data 1). To illuminate the evolutionary history of G. lucidum, a phylogenetic tree was constructed using 296 single-copy orthologous genes conserved in these 15 fungi (Supplementary Fig. S3). The topology of the tree is consistent with the taxonomic classification of these species.

    The proteome of G. lucidum was also described by the protein family (PFAM) representation (Supplementary Data 2 and 3). The evolution and expansion of single-protein families were examined using CAFE [8]. Several protein families were found to have undergone expansion, including families with functions related to anabolism, wood degradation and development (Supplementary Table S7). Noteworthy examples include the expansion of the cytochrome P450 (CYP) family and the major facilitator superfamily (MFS) transporter family. Because these two families have important roles in the biosynthesis and transportation of metabolites, their expansion might well contribute to the diversity of G. lucidum metabolites [9,10].

    A total of 250 syntenic blocks were identified on the basis of the conserved gene order between G. lucidum and Phanerochaete chrysosporium [11], corresponding to 3,008 genes and 2,986 genes, respectively, in each genome. On average, each block in the G. lucidum genome includes 12 genes. In all, 92 blocks contain more than

    Table 1 | General features of the G. lucidum genome.

    Number of chromosomes 13

    Length of genome assembly (Mb) 43.3

    GC content (%) 55.9

    Number of protein-coding genes 16,113

    Average gene length (bp) 1,556

    GC content of protein-coding genes (%) 59.3

    Average number of exons per gene 4.7

    Average exon size (bp) 268

    Average coding sequence size (bp) 1,188

    Average intron size (bp) 87

    Average size of intergenic regions (bp) 1,206

    [Figure 1 image: ideogram of chromosomes 1–13 with tracks a–d; gene density scale in genes per 100 kb (bin = 1 Mb).]

    Figure 1 | An ideogram showing the genomic features of G. lucidum.

    (a) GC content was calculated as the percentage of G + C in 100-kb

    non-overlapping windows. (b) Gene density is represented as the

    number of genes in 100-kb non-overlapping windows. The intensity of

    the blue colour correlates with gene density. (c) Pseudochromosome:

    the diagram represents 13 G. lucidum pseudochromosomes. (d) Genome

    duplication: regions sharing more than 90% sequence similarity over

    5 kb are connected by grey lines; those with more than 90% similarity

    over 10 kb are connected by orange lines (Supplementary Note 6).
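The GC track in panel (a) is a windowed statistic, and the idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name is hypothetical, ambiguous bases such as N are left out of the denominator, and a final window shorter than the window size is still reported.

```python
def gc_percent_windows(seq: str, window: int = 100_000) -> list[float]:
    """Return GC% for each consecutive, non-overlapping window of `seq`.

    A trailing window shorter than `window` is still reported; bases other
    than A/C/G/T (for example N) are excluded from the denominator.
    """
    values = []
    seq = seq.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        at = chunk.count("A") + chunk.count("T")
        total = gc + at
        values.append(100.0 * gc / total if total else 0.0)
    return values


# Toy sequence with a 4-bp window so the result is easy to check by eye.
toy = "GGCCATATGCGN"
print(gc_percent_windows(toy, window=4))
```

For a real pseudochromosome the default 100-kb window matches the caption; the toy call uses a 4-bp window only to keep the example readable.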

    The expression profiles of these genes were highly correlated with that of LSS (correlation coefficient (r) > 0.9) (Fig. 3a and Supplementary Data 5). Furthermore, their expression profiles are positively correlated with triterpenoid content profiles during development (Fig. 2d), suggesting that some of these 78 CYP genes might be involved in triterpenoid biosynthesis. Of these genes, 28 were classified into novel families unique to G. lucidum, and 38 genes were classified into novel subfamilies of previously known families. The remaining genes belong to subfamilies also found in P. chrysosporium and Postia placenta. On the basis of previous reports on CYPs from P. chrysosporium and P. placenta, we know that some enzymes from the CYP512 and CYP5144 families can only effectively modify an animal steroid hormone, testosterone, from among more than ten potential substrates tested [9,15]. Considering the structural similarity of testosterone to the triterpenoids produced by G. lucidum, 15 CYP512 genes and one CYP5144 gene coexpressed with LSS are likely to be involved in triterpenoid biosynthesis (Fig. 3b and Supplementary Fig. S6). The exact roles of these CYPs will be investigated further.
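The coexpression criterion used here, a correlation coefficient above 0.9 against the lanosterol synthase (LSS) profile, can be illustrated with a short sketch. It assumes a Pearson correlation, which the text does not state explicitly, and all gene names and expression values below are hypothetical.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def coexpressed_with(reference, profiles, threshold=0.9):
    """Names of genes whose profile correlates with `reference` above `threshold`."""
    return sorted(
        name for name, profile in profiles.items()
        if pearson_r(reference, profile) > threshold
    )


# Hypothetical relative expression in mycelia, primordia and fruiting bodies.
lss = [1.0, 4.0, 2.0]
cyp_profiles = {
    "cyp_a": [2.0, 8.5, 4.1],   # tracks LSS closely
    "cyp_b": [5.0, 1.0, 3.0],   # moves in the opposite direction
    "cyp_c": [1.0, 2.0, 3.0],   # rises steadily, unlike LSS
}
print(coexpressed_with(lss, cyp_profiles))
```

Only the profile that rises and falls with LSS passes the threshold; the anticorrelated and steadily rising profiles are filtered out.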

    In filamentous fungi, evolution has favoured the clustering of genes involved in the biosynthesis of particular secondary metabolites [16]. According to the proposed biosynthetic pathway, at least three CYPs are involved in lanosterol modification in G. lucidum (Supplementary Fig. S5). Therefore, to further characterize the putative gene cluster involved in triterpenoid biosynthesis, we examined the physical clustering of CYP genes in the G. lucidum genome and found 24 clusters containing three or more CYP genes (Fig. 4 and Supplementary Data 6). Of these clusters, two have CYPs that were coexpressed with LSS (average correlation coefficient > 0.9). However, ten genes in close proximity to LSS on chromosome 6 did not exhibit strong coexpression with LSS (average correlation coefficient = 0.64) (Supplementary Fig. S7), indicating the need for further examination of the organization of the genes involved in triterpenoid biosynthesis in G. lucidum.
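Physical clustering of this kind can be sketched as a single pass over genes sorted by position. The text gives only the "three or more CYP genes" rule, so the distance cutoff below (`max_gap`) is an assumption chosen for illustration, as are the function name and the toy coordinates.

```python
from collections import defaultdict


def find_gene_clusters(genes, max_gap=50_000, min_genes=3):
    """Group genes on the same chromosome into runs of close neighbours.

    `genes` is an iterable of (name, chromosome, start) tuples. Consecutive
    genes (sorted by start) stay in one run while the distance between their
    starts is at most `max_gap`; runs with fewer than `min_genes` members are
    discarded.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start in genes:
        by_chrom[chrom].append((start, name))

    clusters = []
    for chrom in sorted(by_chrom):
        run = []
        for start, name in sorted(by_chrom[chrom]):
            if run and start - run[-1][0] > max_gap:
                if len(run) >= min_genes:
                    clusters.append((chrom, [n for _, n in run]))
                run = []
            run.append((start, name))
        if len(run) >= min_genes:
            clusters.append((chrom, [n for _, n in run]))
    return clusters


# Hypothetical coordinates, not taken from the G. lucidum annotation.
toy_genes = [
    ("cyp1", "chr1", 10_000), ("cyp2", "chr1", 30_000), ("cyp3", "chr1", 55_000),
    ("cyp4", "chr1", 900_000),
    ("cyp5", "chr2", 5_000), ("cyp6", "chr2", 20_000),
]
print(find_gene_clusters(toy_genes))
```

In the toy data only the three neighbouring genes on chr1 form a cluster; the isolated gene and the pair on chr2 fall below the three-gene minimum. A different gap cutoff would change which runs qualify, so the choice should follow whatever definition the analysis actually uses.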

    The biosynthesis of other bioactive compounds in G. lucidum. Polysaccharides are another major group of bioactive compounds found in G. lucidum. Among the polysaccharides, the water-soluble 1,3-β- and 1,6-β-glucans are the most active as immunomodulatory compounds [2,17] (Supplementary Fig. S8). G. lucidum encodes two 1,3-β-glucan synthases and seven β-glucan biosynthesis-associated proteins containing an SKN1 domain (PF03935); such genes are known to have key roles in the biosynthesis of 1,6-β-glucans in Saccharomyces cerevisiae (Supplementary Table S9). These proteins

    [Figure 3 image: (a) heat map of CYP gene expression across mycelia (M), primordia (P) and fruiting bodies (F); (b) phylogenetic tree of CYP families with labels suffixed GL, PC or PP. Legend: CYPs co-expressed with lanosterol synthase in G. lucidum; CYPs involved in the hydroxylation of testosterone in P. chrysosporium; CYPs involved in the hydroxylation of testosterone in P. placenta.]

    Figure 3 | CYP gene expression at different developmental stages and phylogenetic analysis of CYPs coexpressed with LSS. (a) Two-way clustering of gene-expression profiles for the CYP genes expressed across three developmental stages: mycelia (M), primordia (P) and fruiting bodies (F), as quantified by real-time PCR. The relative expression level of each gene was centred on the mean and then unit scaled across the developmental stages. The floor (shown in green) and ceiling (shown in red) of the expression levels were set as twice the s.d. (b) The phylogenetic analysis of CYPs coexpressed with LSS in G. lucidum (GL) and their homologues in Polyporales. A total of 78 coexpressed CYPs from 21 families in G. lucidum and CYPs from the same families in P. placenta (PP) and P. chrysosporium (PC) were included in the tree. The minimum evolution tree was generated with a heuristic search using the Close-Neighbour-Interchange (CNI) algorithm in MEGA (version 5.05). Bootstrap values based on 1,000 replications are shown for the range 50 to 100, with branch colours running from blue through black to red across that range. Genes from the same family or subfamily were collapsed and are shown as triangles.
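The scaling described for panel (a) can be sketched as follows. This is one reading of the caption, not the authors' code: it assumes the sample standard deviation and interprets the floor and ceiling as clipping the scaled values at −2 and +2; the function name and input values are hypothetical.

```python
from statistics import mean, stdev


def scale_profile(values, limit=2.0):
    """Mean-centre a profile, scale it to unit s.d., then clip to +/- `limit`."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        return [0.0 for _ in values]
    return [max(-limit, min(limit, (v - mu) / sd)) for v in values]


# Hypothetical relative expression for one gene in mycelia, primordia, fruiting bodies.
print(scale_profile([1.0, 2.0, 3.0]))
```

Scaling each gene separately puts all profiles on a common colour scale, so the heat map compares the shape of expression changes across stages and not absolute abundance.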

    are well conserved in G. lucidum, S. cerevisiae, P. chrysosporium and P. placenta, suggesting their importance in fungal polysaccharide biosynthesis (Supplementary Note 4) [18,19].

    LZ-8, the first member of the fungal immunomodulatory protein family, was isolated from G. lucidum in 1989 [20,21]. Pharmacological experiments indicated that LZ-8 has anti-tumour and immunomodulation activities [22–24]. All fungal immunomodulatory proteins contain an Fve domain (PF09259.5). Two genes (GL18769 and GL18770) encoding proteins with an Fve domain (PF09259.5) were found in the G. lucidum genome. GL18770 was found to encode a known LZ-8 protein, and GL18769 encodes a protein with 73% identity to LZ-8. The function of GL18769 requires further study. In contrast, no genes encoding Fve domain-containing proteins were found in the P. chrysosporium and P. placenta genomes, suggesting that LZ-8 may be unique to G. lucidum.

    The G. lucidum genome encodes one non-ribosomal peptide synthase (NRPS) and five polyketide synthase (PKS) genes, including four reducing-type PKSs and one non-reducing-type PKS. Domain analysis indicated these may be functional enzymes (Supplementary Fig. S9). Compared with other fungi, G. lucidum has fewer NRPSs and PKSs, suggesting that G. lucidum may not produce non-ribosomal peptides (NRP) and polyketides (PK) as prolifically as other fungi. Indeed, no NRPs or PKs have been isolated from this species to date; thus, special conditions may be needed to trigger NRPS or PKS gene expression.

    The terpene synthase family is a mid-sized family responsible for the biosynthesis of monoterpene, sesquiterpene and diterpene backbones [25]. A total of 12 terpene synthase genes were identified in the G. lucidum genome, though triterpenoids are the only type that has been isolated from G. lucidum thus far. Phylogenetic analysis indicated that at least five terpene synthases exhibit high similarity to the characterized terpene synthases from Coprinus cinereus [26]; these synthases are named Cop1 (germacrene A synthase) (GL22353, 54.5%), Cop2 (germacrene A synthase) (GL25909, 50.3%), Cop3 (α-muurolene synthase) (GL24515, 65.3%) and Cop4 (δ-cadinene synthase) (GL20244, 55.6%; GL25830, 50.6%) (Supplementary Fig. S10). With the exception of GL22395, all of these genes encode proteins less than 400 amino acids in length and are closely related on the phylogenetic tree, suggesting that all of them may encode sesquiterpene synthases.

    Some genes encoding tailoring enzymes and transporters were found in the vicinity of the PKS, NRPS and terpene synthase genes, suggesting that the biosynthetic gene cluster paradigm may hold true for G. lucidum in the same way that it does in ascomycetes [27]. Recently, more basidiomycete fungi have been sequenced [28–30], facilitating the understanding of the organization of genes involved in the biosynthesis of secondary metabolites in basidiomycetes.

    Transporters. Transporters have multiple functions, such as the uptake and redistribution of synthesized metabolic end products in the organism, and they are classified into three types: ATP-dependent transporters, ion channels and secondary transporters [31]. A total of 1,063 transport proteins belonging to 134 families were identified in G. lucidum (Supplementary Data 7). Among these transporters, 248 are ATP-dependent transporters, 29 are ion channels and 321 are secondary transporters; the remainder are incompletely characterized transporters. In general, the MFS transporters participate in secondary metabolism, and the ATP-binding cassette (ABC) transporters are involved in the transport of polysaccharides and lipids [32]. In the G. lucidum genome, secondary transporters (321) are the most abundant, with the majority belonging to the MFS family (170), whereas 49 ATP-binding cassette transporters were identified. Some MFS transporters were found in the CYP clusters or other clusters identified using the antiSMASH software [33], suggesting their possible roles in the biosynthesis of secondary metabolites.

    Regulation of secondary metabolism. Secondary metabolite production and fungal development are regulated in response to environmental conditions. One of the best-known regulatory protein families is the velvet family, and these velvet-domain-containing proteins were also identified in the G. lucidum genome. Two of these proteins, VeA and VelB, are located on the same sequence scaffold. These two proteins interact with the methyltransferase-domain-containing protein LaeA and regulate secondary metabolism and development in Aspergillus. Considering their regulatory roles in previous studies [34,35], we propose a coordinated pathway for secondary metabolism and development in G. lucidum (Supplementary Fig. S11).

    More than 600 regulatory proteins have been identified in G. lucidum (Supplementary Data 8). A total of 249 predicted regulatory proteins are found in regions that are syntenic with P. chrysosporium or S. commune, implying that part of the gene regulatory network of G. lucidum may be conserved. Zinc-finger-family proteins are reportedly involved in the pathway-specific regulation of fungal secondary metabolites [36]. Among the predicted regulators, 117 CCHC-containing proteins, 81 C2H2-containing proteins

    [Figure 4 image: chromosomal fragments on Chr1–Chr12 carrying the putative CYP gene clusters, each labelled with its start and end coordinates.]

    Figure 4 | Putative CYP gene clusters found in the G. lucidum genome. The genes are represented by lines on the chromosomal fragments. The colours

    of the lines indicate whether the genes are in the forward (blue) or reverse (red) orientation. The beginning and end of each cluster is shown to the left,

    and each cluster is labelled according to the CYP genes it contains. The chromosome numbers are shown at the top.

    and 73 Zn2Cys6-containing proteins have been identified. Six of these zinc-related proteins were found in clusters predicted using antiSMASH [33] or SMURF [37] (Supplementary Data 9 and 10). Epigenetic modifiers also have important roles in the regulation of secondary metabolism [38]. A total of 33 GCN5-related proteins, 15 PHD-related proteins, 19 SET-related proteins and 8 HDAC-related proteins were identified in the G. lucidum genome. The participation of these predicted proteins in fungal secondary metabolism remains to be experimentally verified.

    The digestion of wood and other polysaccharides. A total of 417 G. lucidum genes could be assigned to carbohydrate-active enzyme (CAZymes) families as defined in the CAZy database [39] (Supplementary Table S10), making this fungus one of the richest basidiomycetes examined so far in terms of the number of CAZymes (Supplementary Table S11). In particular, the genome encodes candidate enzymes for the digestion of the three major classes of plant cell wall polysaccharides: cellulose, hemicelluloses and pectin. Interestingly, although G. lucidum is the richest basidiomycete examined so far in terms of genes encoding enzymes for pectin digestion, its strategy for pectin breakdown relies solely on hydrolytic enzymes; the genome does not encode any pectin/pectate lyases (Supplementary Data 11). In addition, this fungal genome is particularly rich in enzymes that catalyse the decomposition of chitin, with 40 genes assignable to CAZy family GH18, the highest number among known basidiomycetes.

    Unlike the hydrolysis of polysaccharides, lignin digestion is considered an enzymatic combustion process, involving several oxidoreductases such as laccases, ligninolytic peroxidases and peroxide-generating oxidases [40]. Annotation of the candidate ligninolytic enzymes encoded in the G. lucidum genome revealed a set of 36 ligninolytic oxidoreductases (Supplementary Table S12). Interestingly, compared with model white-rot fungi, such as P. chrysosporium [11] and S. commune [12] (Supplementary Table S11), G. lucidum possesses a large and complete set of ligninolytic peroxidases along with laccases and a cellobiose dehydrogenase. The presence of these enzymes suggests that G. lucidum may exploit different strategies for the breakdown of lignin, including oxidation by hydrogen peroxide in a reaction catalysed by class-II peroxidases. In addition, G. lucidum laccases may degrade recalcitrant lignin compounds in the presence of redox mediators, or they may generate lignocellulose-degrading hydroxyl radicals via Fenton chemistry [41]. In agreement with the presence of candidate class-II peroxidases, several peroxide-generating oxidases were identified in the G. lucidum genome, particularly in the copper-radical oxidase family. Therefore, the distribution of its lignocellulolytic gene families classifies G. lucidum as a particularly versatile white-rot fungus equipped with a remarkable enzymatic arsenal able to degrade all components of wood (Supplementary Note 5).

    Discussion

    As one of the most famous traditional Chinese medicines, G. lucidum has a long track record of safe use, and many pharmaceutical compounds have been found in this medicinal macrofungus. However, the understanding of the basic biology of G. lucidum is still very limited. Here, we present the genome sequence of G. lucidum generated by next-generation sequencing (NGS) and optical mapping technologies. The high accuracy of the genome sequence was validated using two fosmid sequences obtained by Sanger sequencing technology. With the help of the chromosome-wide optical map for each G. lucidum chromosome, the sequence scaffolds assembled from NGS reads were effectively ordered and oriented onto the optical map scaffolds, which has greatly facilitated the construction of chromosome-wide sequence pseudomolecules. Thus, the combination of optical mapping and NGS represents an effective approach for de novo whole-genome sequencing without cloning or genetic mapping.

    Our genome sequence analysis revealed a large assortment of genes and gene clusters potentially involved in secondary metabolism and its regulation. In particular, the G. lucidum genome contains one of the richest sets (both in abundance and diversity) of CYP genes known among the sequenced fungal genomes. CYPs generally have important roles in primary and secondary metabolism. Among other Polyporales genomes, P. chrysosporium has 148 CYP genes and 10 CYP pseudogenes in 33 families, and Postia placenta has 186 CYP genes and 5 CYP pseudogenes in 42 families. G. lucidum has 22 families in common with P. chrysosporium and 28 in common with P. placenta. Some of the CYP families are expanded. For example, the CYP512 family has 23 genes in G. lucidum compared with 14 in P. chrysosporium and 14 in P. placenta (Supplementary Table S13). In addition, 11 lineage-specific CYP families were identified in G. lucidum. The expansion of common shared CYP families and the emergence of new CYP families indicate the expansion of the biochemical functions of CYPs in G. lucidum. The discovery of new CYP families seems to accompany the completion of each new fungal genome sequence except in those genomes with unusually small numbers of CYPs. Even among those filamentous fungi that have been highly sampled for sequencing, such as Fusarium and Aspergillus species, newly sequenced genomes continue to reveal novel CYP families. This phenomenon may be due to a larger pool of CYP genes present in a common ancestor, with subsequent gene loss in some species. In addition, lateral gene transfer may occur among fungi. A third possibility involves the evolution of CYPs from an existing family and their rapid divergence accompanied by neofunctionalization [42].

    Triterpenoids are a highly diverse group of natural products that are widely distributed in eukaryotes, and many triterpenoids have beneficial properties for human health. To our knowledge, G. lucidum has the most diverse and abundant triterpenoid content of all examined fungi. All triterpenoids isolated from G. lucidum to date are derived from the same lanosterol skeleton. Therefore, the triterpenoid diversity observed in G. lucidum likely originates from different modifications and/or the low substrate specificity of several tailoring enzymes in this pathway. G. lucidum triterpenoids are synthesized via the MVA pathway, which is conserved in all eukaryotes (Supplementary Fig. S5). Compared with the well-studied upstream catalytic steps, little is known about how lanosterol is modified to yield the diverse triterpenoids found in G. lucidum. CYPs have central roles in lanosterol modifications in the proposed triterpenoid biosynthetic pathway. Real-time PCR analysis demonstrated that 78 CYP genes are coexpressed with LSS, suggesting their possible roles in triterpenoid biosynthesis. Recently, a comprehensive functional analysis of CYPs from P. chrysosporium and P. placenta was carried out using a wide variety of compounds as substrates [9,15]. Interestingly, we found that multiple CYPs can catalyse the hydroxylation of testosterone, suggesting that their natural substrates are structurally related to the steroids. These CYPs include CYP512 (C1, E1, F1, G2), CYP5136 (A1, A3), CYP5141C1, CYP5144J1, CYP5147A3 and CYP5150A2 in P. chrysosporium and CYP512 (N6, P1, P2), CYP5139D2 and CYP5150D1 in P. placenta. However, of these CYPs, CYP5136 (A1, A3), CYP5141C1, CYP5147A3, CYP5150 (A2, D1) and CYP5139D2 show low substrate specificity, as they can effectively modify several other compounds, such as biphenyl and carbazole. Therefore, these enzymes are less likely to use steroids as their natural substrates. In contrast, the enzymes from the CYP512 and CYP5144 families are most likely involved in steroid modification in the two species. In G. lucidum, we found 15 genes from the CYP512 family and one gene from the CYP5144 family that are coexpressed with LSS (Supplementary Fig. S6). On the basis of structural similarities between steroids and the G. lucidum triterpenoids, these enzymes are more likely to catalyse hydroxylation reactions on the cyclic skeletons of the triterpenoids in G. lucidum.

    Interestingly, in addition to the genes involved in the biosynthesis of triterpenoids and polysaccharides, we found genes that may be involved in the biosynthesis of NRPs, PKs and other kinds of terpenes. These compounds have not previously been isolated from G. lucidum, suggesting that their synthesis might be tightly regulated. This example shows that genome analyses can provide insight into the complete chemical profile of an organism. Some of the synthetic pathways encoded in this genome might contribute to the therapeutic activities of G. lucidum.

    In summary, the elucidation of the G. lucidum genome makes it a compelling model system for studying the biosynthesis of the pharmacologically active compounds produced by medicinal fungi. The identification of numerous lignin degradation enzymes will accelerate the discovery of complete lignin degradation pathways necessary for the strategic exploitation of these enzymes in industrial settings. Therefore, the comprehensive understanding of the G. lucidum genome will pave the way for its future roles in pharmacological and industrial applications.

    Methods

    Strain and culture conditions. G. lucidum is a species complex that shows tremendous intra-species diversity⁴³. The G. lucidum dikaryotic strain CGMCC5.0026, belonging to the G. lucidum Asian group, was obtained from the China General Microbiological Culture Collection Center (Beijing, China) and is one of the most widely used isolates for the production of G. lucidum medicinal material in China. The monokaryotic strain G.260125-1 used for whole-genome sequencing was derived from the strain CGMCC5.0026 by protoplasting. Vegetative mycelia were grown on potato dextrose medium in the dark at 28 °C. Liquid cultures were shaken at 50 r.p.m. The primordia and fruiting bodies of the strain CGMCC5.0026 used for transcriptomic analyses were cultivated on Quercus variabilis Blume logs at Huitao Pharmaceutical Company (Luotian, Hubei Province, China). All strains are available on request.

    Construction of an optical map. Protoplasts of the monokaryotic strain G.260125-1 were collected by centrifugation at 1,000g for 10 min and were then diluted to a final concentration of 2 × 10⁹ cells ml⁻¹. A solution of 1.2% low-melting-point agarose in 0.125 M EDTA (pH 7.5) was heated to 45 °C in a water bath and was then added to the protoplast suspension. The mixture was pipetted thoroughly using a wide-bore tip and was then placed at 4 °C to solidify. The solidified gel was sliced into pieces and incubated in 50 ml of digestion buffer (0.5 M EDTA, 7.5% β-mercaptoethanol) at 37 °C overnight. Then, the buffer was replaced with NDSK buffer (0.5 M EDTA, 1% (v/v) N-lauroylsarcosine, 1 mg ml⁻¹ proteinase K)⁴⁴. DNA samples for optical mapping were obtained by melting DNA gel inserts at 70 °C for 7 min, and then digesting with β-agarase (New England Biolabs, USA) at 42 °C for 2 h. The optimal concentration for mapping was determined by performing serial dilutions in TE buffer, and wide-bore pipette tips were used for the liquid transfers. T7 DNA (Yorkshire Bioscience, UK) at a concentration of 30 pg µl⁻¹ was added to TE and mixed by pipetting up and down using a wide-bore pipette tip before the addition of genomic DNA. DNA solutions were loaded into the silastic microchannel device, and the DNA molecules were stretched and mounted onto the optical mapping surfaces through capillary action and the electrostatic binding of DNA molecules to the positively charged optical mapping surfaces. Mounted DNA molecules were digested by the restriction endonuclease SpeI in NEB Buffer 2 (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl₂, 1 mM dithiothreitol, pH 7.9; New England Biolabs) without BSA, and Triton X-100 was added to the digested DNA at a final concentration of 0.02%. Digested DNA molecules were then stained with 12 µl of 0.2 µM YOYO-1 solution (5% YOYO-1 in TE containing 20% β-mercaptoethanol; Eugene, USA). Fully automated imaging workstations were used to generate single-molecule optical data sets for whole-genome map construction.

    Genome sequencing and assembly. The genomic DNA of G. lucidum was sequenced using the Roche 454 GS FLX (Roche, USA) and Illumina GAII (Illumina, USA) NGS platforms. Following pre-processing, the Roche 454 reads were assembled into a primary assembly using CABOG⁴⁵ and then scaffolded with Illumina paired-end and mate-pair reads using SSPACE version 1.1⁴⁶. Finally, the short reads were subjected to error correction and gap filling using Nesoni (version 0.49) and SOAP GapCloser⁴⁷, respectively. The finished chromosome-wide sequence pseudomolecules were constructed by anchoring and orienting the final sequence scaffolds onto the whole-genome physical maps of G. lucidum generated by an optical mapping system⁴⁸,⁴⁹.

    Gene prediction and annotation. Gene models were predicted using the MAKER pipeline⁵⁰. The repeat sequences were masked throughout the genome using RepeatMasker (version 3.2.9) and the RepBase library (version 16.08). Gene structures were predicted with a combination of ab initio Fgenesh⁵¹,⁵², SNAP⁵³ and Augustus⁵⁴; comparisons with protein sequences were performed using BLASTX and exonerate:protein2genome. In parallel, comparisons with Roche 454 EST sequences were performed using BLASTN and exonerate:est2genome against a set of Roche 454 EST contigs from three different developmental phases: mycelia, primordia and fruiting bodies. A set of 16,113 predicted gene models was obtained, and more than 1,600 genes were manually curated using Apollo software. All of the predicted gene models were functionally annotated by their sequence similarity to genes and proteins in the NCBI nucleotide (Nt), non-redundant and UniProt/Swiss-Prot protein databases. The gene models were also annotated by their protein domains using InterProScan. All genes were classified according to Gene Ontology (GO), eukaryotic orthologous groups and KEGG metabolic pathways.

    Repeat content. The REPET package (version 1.4)⁵⁵,⁵⁶ was used to detect and annotate transposable elements in G. lucidum. Satellites, simple repeats and low-complexity sequences were annotated separately using RepeatMasker.

    Transcriptome sequencing and analyses. For the Roche 454 sequencing, complementary DNA was synthesized using SMART technology as previously described⁵⁷ and sequenced using the standard GS FLX Titanium RL sequencing protocol (Roche). The Roche 454 reads were assembled using the GS De Novo Assembler (version 2.5.3). An RNA-Seq analysis was performed according to the protocol recommended by the manufacturer (Illumina). The reads from different phases were mapped to the whole-genome assembly using BLAT (version 0.33)⁵⁸ with the following settings: 90% minimum identity and 100-bp max intron length. The statistical models for maximum likelihood and maximum a posteriori implemented in Cufflinks (version 1.1.0) were used for expression quantification and differential analysis⁵⁹. The abundances are reported as normalized fragments per kb of transcript per million mapped reads. A gene is considered significantly differentially expressed if its expression differs between any two samples from the three stages with a fold change > 2 and a P-value < 0.05 as calculated by Cufflinks.
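
As a minimal sketch of the abundance unit and the significance thresholds described above, the Python below computes fragments per kb of transcript per million mapped reads and applies the fold change > 2 and P-value < 0.05 criteria to one pairwise comparison. The function names, the example counts and the handling of zero abundance are illustrative assumptions. The P-value is taken as an input, because the statistical model that produces it belongs to Cufflinks and is not reproduced here.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


def is_differentially_expressed(fpkm_a: float, fpkm_b: float, p_value: float,
                                min_fold: float = 2.0, max_p: float = 0.05) -> bool:
    """Apply the fold-change and P-value thresholds to one pair of samples.

    The fold change is taken in whichever direction is larger. Treating a
    zero-versus-nonzero comparison as an infinite fold change is an
    assumption of this sketch, not something stated in the text.
    """
    low, high = sorted((fpkm_a, fpkm_b))
    if low == 0:
        fold = float("inf") if high > 0 else 1.0
    else:
        fold = high / low
    return fold > min_fold and p_value < max_p


# Hypothetical example: 500 fragments on a 2-kb transcript, 10 million mapped fragments.
example_abundance = fpkm(500, 2_000, 10_000_000)
```

Under the stated criterion a gene would be flagged if this check passes for any pair among the three developmental stages.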

    Quantitative PCR. Following digestion with DNase, the total RNA was reverse transcribed into single-stranded complementary DNA. Quantitative PCR was performed three times for each sample using SYBR green (Life Technologies, USA) on an ABI PRISM 7500 Real-Time PCR System (Life Technologies). The expression data for the CYPs were normalized against an internal reference gene, glyceraldehyde-3-phosphate dehydrogenase (GAPDH). The relative expression levels were calculated by comparing the cycle threshold (Ct) of each target gene with the housekeeping gene GAPDH using the 2^−ΔΔCt method.
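
The relative-quantification arithmetic can be sketched as follows, assuming the standard comparative Ct calculation with roughly 100% amplification efficiency. The choice of calibrator sample is not stated in the text, so the calibrator arguments and all Ct values here are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_reference_sample: float,
                        ct_target_calibrator: float, ct_reference_calibrator: float) -> float:
    """Relative expression by the comparative Ct calculation.

    Each target Ct is first normalized to the reference gene (GAPDH in the
    text) within the same sample, then the normalized value is compared with
    that of a calibrator sample.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# (relative to GAPDH) in the sample than in the calibrator.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
```

With three technical replicates per sample, the mean Ct of each gene would normally be computed before this step.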

    CYP annotation and analysis. The reference CYP sequences were downloaded from http://drnelson.uthsc.edu/P450seqs.dbs.html. All predicted proteins were then used to search the reference CYP data set using the BLASTP program with a cutoff E-value < 1e−5. The selected proteins were manually curated and named according to the standard sequence homology criteria for CYP nomenclature. A physical CYP gene cluster was defined as three or more CYPs present within a 100-kb sliding window of the genomic sequence or fewer than 10 genes between CYPs after they had been sorted into groups along the chromosomes. If two adjacent clusters overlapped, they were merged to form one larger cluster⁶⁰.
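
The window-based part of this cluster definition can be illustrated with a short sketch. It covers only the criterion of three or more CYPs within 100 kb and the merging of overlapping clusters. The alternative criterion based on the number of intervening genes is omitted, gene start coordinates are used as positions by assumption, and the coordinates in the example are hypothetical.

```python
from typing import List


def find_cyp_clusters(positions: List[int], window: int = 100_000,
                      min_members: int = 3) -> List[List[int]]:
    """Group CYP gene positions on one chromosome into physical clusters.

    A window anchored at each gene collects every gene that starts within
    `window` bp downstream. Windows holding at least `min_members` genes are
    candidate clusters, and candidates that share genes are merged.
    """
    pos = sorted(positions)
    clusters: List[List[int]] = []  # each entry is [first_index, last_index]
    j = 0
    for i in range(len(pos)):
        j = max(j, i)
        # Extend the window end as far as the distance limit allows.
        while j + 1 < len(pos) and pos[j + 1] - pos[i] <= window:
            j += 1
        if j - i + 1 >= min_members:
            if clusters and i <= clusters[-1][1]:
                # Overlaps the previous candidate, so merge the two.
                clusters[-1][1] = max(clusters[-1][1], j)
            else:
                clusters.append([i, j])
    return [pos[a:b + 1] for a, b in clusters]


# Hypothetical CYP start coordinates (bp) on a single chromosome.
example = find_cyp_clusters([10_000, 50_000, 90_000, 150_000, 400_000, 900_000, 950_000])
```

Because overlapping candidates are merged, a reported cluster can span more than 100 kb, as in the example, where two overlapping windows combine into one four-gene cluster.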

    Carbohydrate-active enzyme annotation. All putative G. lucidum proteins were searched against entries in the CAZy³⁹ database using BLAST. The proteins with e-values smaller than 0.1 were further screened by a combination of BLAST searches against individual protein modules belonging to the GH, GT, PL, CE and CBM classes, and CBM and HMMer (version 3.0) were used to query against a collection of custom-made hidden Markov model (HMM) profiles constructed for each CAZy family. All identified proteins were then manually curated.

    Lignin digestion enzyme annotation. HMM profiles were constructed for each family of lignin-digesting enzymes and were used to classify G. lucidum genes into sequence-based families, termed AA1 to AA8 (Levasseur and Henrissat, unpublished data). All identified proteins were then manually curated.

    Data availability. More detailed descriptions of the methods are provided in Supplementary Methods. All of the data generated in this project, including those related to genome assembly, gene prediction, gene functional annotations and transcriptomic data, may be downloaded from our interactive web portal at http://www.herbalgenomics.org/galu.

    References

    1. Sanodiya, B. S., Thakur, G. S., Baghel, R. K., Prasad, G. B. & Bisen, P. S. Ganoderma lucidum: a potent pharmacological macrofungus. Curr. Pharm. Biotechnol. 10, 717–742 (2009).

    2. Boh, B., Berovic, M., Zhang, J. & Zhi-Bin, L. Ganoderma lucidum and its pharmaceutically active compounds. Biotechnol. Annu. Rev. 13, 265–301 (2007).

    3. Shiao, M. S. Natural products of the medicinal fungus Ganoderma lucidum: occurrence, biological activities, and pharmacological functions. Chem. Rec. 3, 172–180 (2003).


    4. Ko, E. M., Leem, Y. E. & Choi, H. T. Purification and characterization of laccase isozymes from the white-rot basidiomycete Ganoderma lucidum. Appl. Microbiol. Biotechnol. 57, 98–102 (2001).

    5. Pel, H. J. et al. Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. Nat. Biotechnol. 25, 221–231 (2007).

    6. Jeffries, T. W. et al. Genome sequence of the lignocellulose-bioconverting and xylose-fermenting yeast Pichia stipitis. Nat. Biotechnol. 25, 319–326 (2007).

    7. De Schutter, K. et al. Genome sequence of the recombinant protein production host Pichia pastoris. Nat. Biotechnol. 27, 561–566 (2009).

    8. De Bie, T., Cristianini, N., Demuth, J. P. & Hahn, M. W. CAFE: a computational tool for the study of gene family evolution. Bioinformatics 22, 1269–1271 (2006).

    9. Hirosue, S. et al. Insight into functional diversity of cytochrome P450 in the white-rot basidiomycete Phanerochaete chrysosporium: involvement of versatile monooxygenase. Biochem. Biophys. Res. Commun. 407, 118–123 (2011).

    10. Law, C. J., Maloney, P. C. & Wang, D. N. Ins and outs of major facilitator superfamily antiporters. Annu. Rev. Microbiol. 62, 289–305 (2008).

    11. Martinez, D. et al. Genome sequence of the lignocellulose degrading fungus Phanerochaete chrysosporium strain RP78. Nat. Biotechnol. 22, 695–700 (2004).

    12. Ohm, R. A. et al. Genome sequence of the model mushroom Schizophyllum commune. Nat. Biotechnol. 28, 957–963 (2010).

    13. Shi, L., Ren, A., Mu, D. & Zhao, M. Current progress in the study on biosynthesis and regulation of ganoderic acids. Appl. Microbiol. Biotechnol. 88, 1243–1251 (2010).

    14. Benveniste, P. Biosynthesis and accumulation of sterols. Annu. Rev. Plant Biol. 55, 429–457 (2004).

    15. Ide, M., Ichinose, H. & Wariishi, H. Molecular identification and functional characterization of cytochrome P450 monooxygenases from the brown-rot basidiomycete Postia placenta. Arch. Microbiol. 194, 243–253 (2012).

    16. Yu, J. H. & Keller, N. Regulation of secondary metabolism in filamentous fungi. Annu. Rev. Phytopathol. 43, 437–458 (2005).

    17. Xu, Z., Chen, X., Zhong, Z., Chen, L. & Wang, Y. Ganoderma lucidum polysaccharides: immunomodulation and potential anti-tumor activities. Am. J. Chin. Med. 39, 15–27 (2011).

    18. Douglas, C. M. Fungal beta(1,3)-D-glucan synthesis. Med. Mycol. 39, 55–66 (2001).

    19. Shahinian, S. & Bussey, H. β-1,6-Glucan synthesis in Saccharomyces cerevisiae. Mol. Microbiol. 35, 477–489 (2000).

    20. Kino, K. et al. Isolation and characterization of a new immunomodulatory protein, ling zhi-8 (LZ-8), from Ganoderma lucidium. J. Biol. Chem. 264, 472–478 (1989).

    21. Tanaka, S. et al. Complete amino acid sequence of an immunomodulatory protein, ling zhi-8 (LZ-8). J. Biol. Chem. 264, 16372–16377 (1989).

    22. Wu, C. et al. Ling Zhi-8 mediates p53-dependent growth arrest of lung cancer cells proliferation via the ribosomal protein S7-MDM2-p53 pathway. Carcinogenesis 32, 1890–1896 (2011).

    23. Lin, Y. et al. An immunomodulatory protein, Ling Zhi-8, induced activation and maturation of human monocyte-derived dendritic cells by the NF-κB and MAPK pathways. J. Leukoc. Biol. 86, 877–889 (2009).

    24. Kino, K. et al. An immunomodulating protein, Ling Zhi-8 (LZ-8) prevents insulitis in non-obese diabetic mice. Diabetologia 33, 713–718 (1990).

    25. Chen, F., Tholl, D., Bohlmann, J. & Pichersky, E. The family of terpene synthases in plants: a mid-size family of genes for specialized metabolism that is highly diversified throughout the kingdom. Plant J. 66, 212–229 (2011).

    26. Agger, S., Lopez-Gallego, F. & Schmidt-Dannert, C. Diversity of sesquiterpene synthases in the basidiomycete Coprinus cinereus. Mol. Microbiol. 72, 1181–1195 (2009).

    27. Reverberi, M., Ricelli, A., Zjalic, S., Fabbri, A. & Fanelli, C. Natural functions of mycotoxins and control of their biosynthesis in fungi. Appl. Microbiol. Biotechnol. 87, 899–911 (2010).

    28. Olson, Å. et al. Insight into trade-off between wood decay and parasitism from the genome of a fungal forest pathogen. New Phytol. 194, 1001–1013 (2012).

    29. Fernandez-Fueyo, E. et al. Comparative genomics of Ceriporiopsis subvermispora and Phanerochaete chrysosporium provide insight into selective ligninolysis. Proc. Natl Acad. Sci. USA 109, 5458–5463 (2012).

    30. Eastwood, D. C. et al. The plant cell wall-decomposing machinery underlies the functional diversity of forest fungi. Science 333, 762–765 (2011).

    31. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).

    32. Platta, H. W. & Erdmann, R. The peroxisomal protein import machinery. FEBS Lett. 581, 2811–2819 (2007).

    33. Medema, M. H. et al. AntiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res. 39, W339–W346 (2011).

    34. Bayram, Ö. et al. VelB/VeA/LaeA complex coordinates light signal with fungal development and secondary metabolism. Science 320, 1504–1506 (2008).

    35. Bayram, Ö. & Braus, G. H. Coordination of secondary metabolism and development in fungi: the velvet family of regulatory proteins. FEMS Microbiol. Rev. 36, 1–24 (2012).

    36. MacPherson, S., Larochelle, M. & Turcotte, B. A fungal family of transcriptional regulators: the zinc cluster proteins. Microbiol. Mol. Biol. Rev. 70, 583–604 (2006).

    37. Khaldi, N. et al. SMURF: genomic mapping of fungal secondary metabolite clusters. Fungal Genet. Biol. 47, 736–741 (2010).

    38. Williams, R. B., Henrikson, J. C., Hoover, A. R., Lee, A. E. & Cichewicz, R. H. Epigenetic remodeling of the fungal secondary metabolome. Org. Biomol. Chem. 6, 1895–1897 (2008).

    39. Cantarel, B. L. et al. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res. 37, D233–D238 (2009).

    40. Ruiz-Dueñas, F. J. & Martínez, Á. T. Microbial degradation of lignin: how a bulky recalcitrant polymer is efficiently recycled in nature and how we can take advantage of this. Microbiol. Biotechnol. 2, 164–177 (2009).

    41. Guillén, F., Gómez-Toribio, V., Martinez, M. J. & Martinez, A. T. Production of hydroxyl radical by the synergistic action of fungal laccase and aryl alcohol oxidase. Arch. Biochem. Biophys. 383, 142–147 (2000).

    42. Nelson, D. R. Progress in tracing the evolutionary paths of cytochrome P450. Biochim. Biophys. Acta 1814, 14–18 (2011).

    43. Hong, S. G. & Jung, H. S. Phylogenetic analysis of Ganoderma based on nearly complete mitochondrial small-subunit ribosomal DNA sequences. Mycologia 96, 742–755 (2004).

    44. Herschleb, J., Ananiev, G. & Schwartz, D. C. Pulsed-field gel electrophoresis. Nat. Protoc. 2, 677–684 (2007).

    45. Miller, J. R. et al. Aggressive assembly of pyrosequencing reads with mates. Bioinformatics 24, 2818–2824 (2008).

    46. Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D. & Pirovano, W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27, 578–579 (2011).

    47. Li, R. et al. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25, 1966–1967 (2009).

    48. Zhou, S. et al. A single molecule scaffold for the maize genome. PLoS Genet. 5, e1000711 (2009).

    49. Zhou, S. et al. Whole-genome shotgun optical mapping of Rhodobacter sphaeroides strain 2.4.1 and its use for whole-genome shotgun sequence assembly. Genome Res. 13, 2142–2151 (2003).

    50. Cantarel, B. L. et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 18, 188–196 (2008).

    51. Salamov, A. A. & Solovyev, V. V. Ab initio gene finding in Drosophila genomic DNA. Genome Res. 10, 516–522 (2000).

    52. Radakovits, R. et al. Draft genome sequence and genetic transformation of the oleaginous alga Nannochloropis gaditana. Nat. Commun. 3, 686 (2012).

    53. Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).

    54. Stanke, M. & Waack, S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19, ii215–ii225 (2003).

    55. Flutre, T., Duprat, E., Feuillet, C. & Quesneville, H. Considering transposable element diversification in de novo annotation approaches. PLoS One 6, e16526 (2011).

    56. Rouxel, T. et al. Effector diversification within compartments of the Leptosphaeria maculans genome affected by Repeat-Induced Point mutations. Nat. Commun. 2, 202 (2011).

    57. Sun, C. et al. De novo sequencing and analysis of the American ginseng root transcriptome using a GS FLX Titanium platform to discover putative genes involved in ginsenoside biosynthesis. BMC Genomics 11, 262 (2010).

    58. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

    59. Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).

    60. Kelly, D. E., Kraševec, N., Mullins, J. & Nelson, D. R. The CYPome (Cytochrome P450 complement) of Aspergillus nidulans. Fungal Genet. Biol. 46, S53–S61 (2009).

    Acknowledgements

    This work was supported by the Key National Natural Science Foundation of China (Grant no. 81130069) and the Program for Changjiang Scholars and Innovative Research Team in University of Ministry of Education of China (Grant no. IRT1150).

    Author contributions

    S.C. initiated the G. lucidum genome project; S.C., C.S. and C.Liu designed and coordinated the project; J.X., C.Li., L.W., X.G. and J.Q. sequenced the genome; S.Z. and D.C.S. performed the optical mapping; Y.Z. and Y.L. assembled the genome; Y.Z. and C.Liu coordinated the annotation process; D.R.N., J.X., X.L., L.S., L.H., L.X., X.X., Y.N., Q.L., H.Y., J.Z. and H.C. annotated and curated the genes; Y.S., H.L., Z.W. and M.L. performed the experiments; J.S., D.R.N., B.H., A. Levasseur, M.V.H. and A. Lv. interpreted the experiments; C.S., C.Liu, J.S., J.X., Y.J., B.H., A. Levasseur and D.R.N. wrote the paper; S.C. and J.S. coordinated the writing of the paper.


    Additional information

    Accession codes: This whole-genome sequencing project has been deposited at Genome under the project accession number PRJNA71455. Sequence reads have been deposited in the short-read archive at GenBank under the following accession numbers: SRA043914 contains the Roche 454-generated genomic reads, SRA048014 contains the Illumina GA-generated genomic data, SRA048974 contains the Roche 454-generated transcriptome reads and SRA048015 contains the Illumina RNA-Seq reads.

    Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications

    Competing financial interests: The authors declare no competing financial interests.

    Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/

    How to cite this article: Chen, S. et al. Genome sequence of the model medicinal mushroom Ganoderma lucidum. Nat. Commun. 3:913 doi: 10.1038/ncomms1923 (2012).

    License: This work is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/