

Available online at www.sciencedirect.com

www.elsevier.com/locate/yexpr

Experimental Parasitology 117 (2007) 229–235

Minireview

The Schistosoma mansoni transcriptome: An update

Guilherme Oliveira *

Laboratory of Cellular and Molecular Parasitology, Centro de Pesquisas René Rachou, Fundação Oswaldo Cruz, Av. Augusto de Lima 1715, Belo Horizonte, MG 30190-002, Brazil

Programa de Pós-Graduação e Pesquisa, Santa Casa de Belo Horizonte, Av. Francisco Sales 1111, 9° andar, Ala C, Belo Horizonte, MG 30150-221, Brazil

Received 1 February 2007; received in revised form 29 May 2007; accepted 1 June 2007

Available online 12 June 2007

Abstract

Large scale EST sequencing projects have been carried out for Schistosoma mansoni and Schistosoma japonicum. This update will briefly review the most recent accomplishments in the area and discuss the use of EST data for the purposes of gene discovery, gene model development, genome annotation and SNP analysis. In addition, the use of ESTs for studying other features of the transcriptome, such as splice site and transcription initiation variants, will be discussed, as well as approaches to assigning function to unknown transcripts. Although EST sequencing has contributed much to schistosome research, other data mining possibilities exist, including the identification of putative drug and vaccine targets.

© 2007 Elsevier Inc. All rights reserved.

Index Descriptors and Abbreviations: Trematode; Schistosomiasis; Expressed sequence tags; Transcriptome; cDNA, complementary DNA; mRNA, messenger RNA; STS, sequence-tagged sites; ORESTES, Open Reading Frame Expressed Sequence Tags; RGMG, Minas Gerais Genome Network; CDD, Conserved Domain Database; KEGG, Kyoto Encyclopedia of Genes and Genomes; RNA, ribonucleic acid; DNA, deoxyribonucleic acid; SNP, single nucleotide polymorphism; MR4, Malaria Research and Reference Reagent Resource; SR3, Schistosome Related Reagent Repository; SAGE, Serial Analysis of Gene Expression

0014-4894/$ - see front matter © 2007 Elsevier Inc. All rights reserved.

doi:10.1016/j.exppara.2007.06.001

* Address: Laboratory of Cellular and Molecular Parasitology, Centro de Pesquisas René Rachou, Fundação Oswaldo Cruz, Av. Augusto de Lima 1715, Belo Horizonte, MG 30190-002, Brazil. Fax: +55 31 3295 3115.

E-mail address: oliveira@cpqrr.fiocruz.br

The transcriptome can be defined as the collection of genes transcribed in an organism, tissue, or cell. Transcriptomic information has been extremely useful for gene discovery, the elaboration of gene models, the training of gene finding algorithms (once a genome is available), and the design of microarrays, and has tremendously facilitated the cloning of genes of interest through the availability of cDNA clones. However, mRNAs are not expressed in equal amounts, even in a single cell, and they also differ in length. Therefore, experimental efforts towards obtaining the transcriptome of an organism will generally provide partial mRNA sequences and will be enriched for genes that are more highly transcribed. The partial cDNA sequences generated are called expressed sequence tags (ESTs). Despite some technical difficulties in producing full length cDNAs and the incomplete view of the transcriptome provided, the approach has proven to be a powerful tool for the study of genes and their expression patterns.

The original and main motivation for the study of transcriptomes was the possibility of discovering the gene content of an organism and generating STSs that pointed to a gene, without the need to obtain the complete genome sequence (Adams et al., 1991). Although the unfinished genome sequences of Schistosoma mansoni and Schistosoma japonicum have been produced (El-Sayed et al., 2004; McManus et al., 2004) and are currently in the annotation stage, the transcriptomes of both species have been studied on a large scale and have contributed to the discovery of genes and to genome annotation, among other uses that will be further discussed in this update (genome sequencing will be discussed in this issue). Currently, GenBank contains information for over 650 species with at least 1000 ESTs (Table 1).

There are two main approaches for the large scale production of cDNA sequences. The conventional method is based on reverse-transcribing mRNA, cloning, and sequencing the extremities of the inserted cDNA (Fulton et al., 1995). The alternative approach, named ORESTES, is to sequence short, randomly amplified and cloned cDNAs (Dias et al., 2000) (Fig. 1). Both methods can potentially result in large numbers of redundant cDNA sequences. The sequences need to be assembled so that transcripts from one gene are grouped together. The assembly process yields, in many cases, full length virtual cDNA sequences. The methods are complementary. ORESTES sequences tend to accumulate at the central region of the original mRNA, and conventional ESTs at the 5′ and 3′ ends. Therefore, by combining both methods, the chance of assembling a full virtual cDNA sequence increases. The ORESTES method allows sampling of a larger number of small cDNA libraries, which is especially useful when mRNA is difficult to obtain. The method also normalizes the mini cDNA libraries, making the relative number of ESTs similar irrespective of the different gene transcription levels in the sampled tissue. Normalization permits the identification of rare transcripts. The conventional method, on the other hand, permits an in-depth sampling of much larger cDNA libraries. In addition, because its libraries are not normalized, the conventional method permits the estimation of the relative numbers of each transcript. Also importantly, the conventional cDNA sequencing method produces frozen stocks of clones, many of which are complete or near complete cDNAs. These cDNA clones are an important resource to the research community.

Fig. 1. Methods for producing expressed sequence tags (ESTs). The complete cDNA clone is represented by the central line. (A) ORESTES. (B) Conventional cDNA sequencing. Conventional cDNAs are usually 100–500 bp sequences produced from cDNA cloned and sequenced directionally in respect to the 5′ or 3′ end, and tend to accumulate at the end of the representing mRNA. On the other hand, ORESTES are shorter sequences that tend to accumulate on the central region of the representing mRNA.

Table 1. Online resources related to Schistosoma ESTs

Resource | Web site | Reference
ORESTES and annotation | http://verjo18.iq.usp.br/schisto/ | Verjovski-Almeida et al. (2003)
TIGR gene index | http://compbio.dfci.harvard.edu/tgi/cgi-bin/tgi/gimain.pl?gudb=s_mansoni | Merrick et al. (2003)
ESTs and KEGG pathways | http://rgmg.cpqrr.fiocruz.br/ | Oliveira et al., unpublished
EST of S. mansoni, S. japonicum and S. haematobium | www.sanger.ac.uk/Projects/S_mansoni/ | Wellcome Trust Sanger Institute, unpublished
S. mansoni SNPs | http://bioinfo.cpqrr.fiocruz.br/snp/ | Simoes et al. (2007)
Schistosome Related Reagent Repository (SR3) | www.afbr-bri.com/SR3/ | Unpublished
Genome sequence of S. mansoni | www.genedb.org/genedb/smansoni/ | Unpublished
Genome database for S. mansoni | www.shistodb.net | Unpublished, under construction
Genome sequence of S. japonicum | http://lifecenter.sgst.cn/sj.do# | Unpublished
CDD database | http://130.14.29.110/Structure/cdd/cdd.shtml | Marchler-Bauer et al. (2007)
Pfam database | www.sanger.ac.uk/Software/Pfam/ | Finn et al. (2006)
PIR and UNIPROT | http://pir.georgetown.edu/ | Wu et al. (2003, 2006a)
Swiss-Prot | www.expasy.org/sprot/ | Wu et al. (2006a)
InterPro | www.ebi.ac.uk/interpro/ | Mulder et al. (2002)
KEGG | www.genome.jp/kegg/ | Kanehisa et al. (2004)

The study of the transcriptome can provide many different types of information about an organism. The main type of information generated is the discovery of new genes, which can be accomplished efficiently and quickly with a transcriptomic approach, especially for organisms with large genomes, such as schistosomes. In the pre-genome era of schistosome research, this approach was used and yielded a large number of new genes (Franco et al., 1995; Oliveira, 2001). cDNA clones were also widely distributed to the research community. More recently, larger scale projects on S. mansoni have immensely expanded the knowledge and made mining the information much more effective (Oliveira and Johnston, 2001; Hu et al., 2003; Verjovski-Almeida et al., 2003; Verjovski-Almeida et al., 2004; Oliveira, unpublished effort of the RGMG). For S. japonicum one large project was published (Hu et al., 2003). The Wellcome Trust Sanger Institute also maintains at its ftp site EST sequences of S. mansoni and S. haematobium (Table 1).
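The assembly step mentioned for both EST methods, in which redundant reads from one gene are grouped into a virtual cDNA, can be illustrated with a toy greedy overlap merger. This is only a minimal sketch under simplifying assumptions (error-free reads, a single strand, exact suffix/prefix overlaps); the sequences and the function names `overlap` and `greedy_assemble` are hypothetical, and production EST assemblers are considerably more elaborate.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def drop_contained(seqs):
    """Remove exact duplicates and reads fully contained in another read."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != o and s in o for o in unique)]


def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the pair of sequences with the longest exact overlap."""
    contigs = drop_contained(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no remaining overlaps: these are the virtual cDNAs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        rest = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])


# Hypothetical transcript and three reads: two "conventional" end reads
# and one ORESTES-like read from the central region.
mrna = "ATGGCGTACGTTAGCCTAGGATCCGATTAA"
five_prime = mrna[:14]
central = mrna[8:24]
three_prime = mrna[18:]

contigs = greedy_assemble([five_prime, central, three_prime, central])
```

In this toy case neither end read overlaps the other, so the full-length virtual cDNA is only recovered because the central read bridges them, which mirrors the argument for combining the two methods.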

The annotation, or attribution of possible function, of the sequences can be carried out with each individual read and with assembled sequences. Generally, the methods involve a search for similarities between the EST or contig and a database of DNA or protein sequences with functional annotation. This type of analysis will initially yield information about which genes are known and which are unknown. In order to annotate gene function using similarity search, curated databases of proteins are preferably used. Some of the interesting databases are CDD (Marchler-Bauer et al., 2007) or Pfam (Finn et al., 2006) for conserved protein domains and families, and PIR (Wu et al., 2003), SwissProt or UniProt (Wu et al., 2006a) for annotated proteins, among many others. InterPro contains several databases with information on protein families, domains and functional sites (Mulder et al., 2002), which can be queried with InterProScan (Quevillon et al., 2005). One of the problems with annotating a gene or a protein is the need for a structured vocabulary shared among different organisms. For this reason, it is also useful to annotate a sequence with a Gene Ontology term (Ashburner et al., 2000). One example of an annotated transcriptome database is the Gene Index (Merrick et al., 2003), but others are also available (Table 1).
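As an illustration of the first annotation step, separating sequences with a database match from those without, the sketch below parses similarity-search results in the standard 12-column BLAST tabular layout and keeps the best hit per query. The contig and protein identifiers, the example rows, and the E-value cutoff are hypothetical choices for illustration, not values taken from the projects cited above.

```python
import csv
import io

# Column order of the default BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query: (subject, bitscore)} for the highest-scoring significant hit."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(BLAST_COLUMNS, row))
        evalue, bits = float(hit["evalue"]), float(hit["bitscore"])
        if evalue > max_evalue:
            continue  # too weak to support an annotation
        query = hit["qseqid"]
        if query not in best or bits > best[query][1]:
            best[query] = (hit["sseqid"], bits)
    return best


def classify(query_ids, tabular_text, max_evalue=1e-5):
    """Label each query with its best subject, or 'unknown' when nothing passes the cutoff."""
    hits = best_hits(tabular_text, max_evalue)
    return {q: hits[q][0] if q in hits else "unknown" for q in query_ids}


# Hypothetical search results for three assembled contigs.
example = "\n".join([
    "contig1\tprotA\t78.5\t200\t43\t0\t1\t600\t10\t209\t1e-50\t180.0",
    "contig1\tprotB\t60.0\t150\t60\t0\t1\t450\t5\t154\t1e-20\t95.0",
    "contig2\tprotC\t35.0\t40\t26\t0\t1\t120\t1\t40\t0.5\t28.0",
])
annotation = classify(["contig1", "contig2", "contig3"], example)
```

Here `contig2` has only a weak hit and `contig3` has none, so both fall into the "unknown" class that the text describes; the best subject identifier would then be used to transfer functional annotation or Gene Ontology terms.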

The results obtained with schistosome EST sequences indicate that, under the Molecular Function category, most of the ESTs belong to the binding (nucleic acid binding) and enzyme (mainly hydrolases and transferases) categories (Hu et al., 2003; Verjovski-Almeida et al., 2003). Under Biological Process, the most prevalent categories were metabolism or transport. The finding of these categories as the most prevalent was expected, as metabolism is usually the major physiological activity of an organism. Some interesting features were observed, such as the presence of glucose importers, proteins involved in the uptake of amino acids and lipids, and storage of lipids. The identification of the metabolic pathways in which parasite gene products participate will contribute to the understanding of their metabolic processes. This investigation may also contribute to the identification of possible drug targets (Fairlamb, 2002). One of the possibilities is to use KEGG, which provides biochemical pathways, among other resources (Kanehisa et al., 2004). A visualization of the KEGG pathways obtained with the use of the transcriptome of S. mansoni is available (Oliveira, unpublished; Table 1).

After annotation, lists of genes possibly involved with a certain physiological function in the organism are usually produced. One example is genes involved in sex differentiation. This group of genes is of interest because interfering with sex differentiation or maturation may be one way of preventing the disease by interrupting egg production by the female parasite, as the eggs cause most of the pathology (LoVerde, 2002). In addition, sex differentiation in schistosomes should be an interesting model for study, since most platyhelminths are hermaphrodites (Mone and Boissier, 2004). Several homologues of genes involved in sex differentiation, such as fox-1, mog-1, tra-2 and fem-1, were observed (Verjovski-Almeida et al., 2003). Hu et al. (2003) focused on genes differentially expressed between sexes and obtained results comparable to those of microarray experiments (Hoffmann et al., 2002). Among the differentially expressed genes are those coding for egg shell proteins, maleless, epididymal secretory protein E1 and transformer-2b.

One interesting approach for analyzing the EST content is the comparison of the frequency of ESTs of one organism with those of model organisms. This will provide information on conserved genes (Hu et al., 2003), but can also be informative in relation to the expected number of ESTs, indicating genes that follow unusual patterns of expression in comparison with other organisms (Mudado and Ortega, 2006).

Interestingly, from the first descriptions with fewer sequences (Franco et al., 1995) to the larger projects (Hu et al., 2003; Verjovski-Almeida et al., 2003), at least 50% of all transcripts identified have no similar transcripts in GenBank. The number may decrease as the clusters grow in size and as better gene models become available with the genomic data. Nevertheless, schistosomes clearly contain a large set of unique genes. One of the main challenges is the characterization of their function.

Functional genomics will contribute to describing gene function. In this issue, recent developments in knocking down gene expression by RNA interference and in overexpression by the introduction of exogenous DNA are discussed. However, there is a need for observable phenotypes in order to see an effect resulting from knock-out, knock-down or transgene approaches. The function of a gene product may sometimes also be inferred by a guilt-by-association approach (although experimental evidence is necessary to corroborate the indications). Microarray experiments, for example, can reveal genes that are co-expressed in different situations and may, therefore, work in conjunction or interact to generate a certain phenotype (Quackenbush, 2003; Voy et al., 2006). Proteomics results may also be analyzed in this fashion (Shin et al., 2005). Novel computational approaches may prove to be powerful tools in assigning function to an unknown gene product (Wolfe et al., 2005; Wu et al., 2006b). Microarray (Hoffmann et al., 2001; Hoffmann et al., 2002; Fitzpatrick et al., 2004; Fitzpatrick et al., 2005; Chai et al., 2006; Dillon et al., 2006; Gobert et al., 2006; Moertel et al., 2006; Vermeire et al., 2006) and proteomic (Curwen et al., 2004; Cheng et al., 2005; Knudsen et al., 2005; van Balkom et al., 2005; Braschi et al., 2006; Liu et al., 2006; Van Hellemond et al., 2006) data have been produced for schistosomes, but have not yet been fully explored with powerful computational methods.
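The guilt-by-association idea can be sketched as a co-expression search: an uncharacterized transcript is linked to annotated genes whose expression profiles across conditions correlate strongly with its own. The following is a minimal sketch using the Pearson correlation; the gene names, expression values and the 0.9 threshold are hypothetical, and, as noted above, any association found this way is only a hypothesis that needs experimental support.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpressed_partners(profiles, query, min_r=0.9):
    """Genes whose profile correlates with the query at or above min_r."""
    return sorted(
        gene for gene, values in profiles.items()
        if gene != query and pearson(profiles[query], values) >= min_r
    )


# Hypothetical expression values across four conditions (e.g. life-cycle stages).
profiles = {
    "unknown_1": [1.0, 2.0, 3.0, 4.0],
    "eggshell_like": [2.0, 4.0, 6.0, 8.5],
    "housekeeping": [5.0, 5.1, 4.9, 5.0],
    "inverse": [4.0, 3.0, 2.0, 1.0],
}
partners = coexpressed_partners(profiles, "unknown_1")
```

With real microarray or proteomic data one would also want many more conditions and some correction for multiple comparisons, since a handful of points can correlate by chance.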

The identification of genes in a genome is still an arduous task for large and complex genomes. ESTs have also been very useful, providing experimental evidence for the elaboration of gene models. EST evidence has been incorporated into various algorithms for gene identification (Stanke et al., 2006). New ESTs may also reveal previously unknown genes. It has been shown for the human genome, for example, that new EST information is useful for identifying transcribed regions in the genome even with the availability of an already large cDNA dataset (de Souza et al., 2000). In the case of schistosomes, for which a finished complete genome sequence is not available, ESTs will also provide evidence for genes that may not have been sequenced.

As previously stated, some effort has been dedicated to producing full length cDNA clones from existing libraries for S. mansoni and S. japonicum. There are several reasons for obtaining full length cDNA clones. Full length clones can provide extremely useful information for the production of accurate gene models and provide better annotation (Castelli et al., 2004). Distinct intron/exon boundaries, transcription start sites and antisense RNA are some examples of the richness of the genome that can be further explored (Miura et al., 2006). The existing ESTs and those made available in the future will greatly contribute towards the full understanding of many features of the genome. Full length clones have also been used in novel global approaches towards the identification of candidate antigens for the development of malaria vaccines using a DNA vaccine approach (Shibui et al., 2005). Towards the goal of providing the research community with a set of full length cDNA clones, Faria-Campos et al. (2006) identified, within the S. mansoni cDNA clones from the RGMG work, a set of clones that potentially contained the initiating methionine. A similar approach was undertaken for S. japonicum (Hu et al., 2003). The nature of the selection process, however, yielded mostly short sequences. An effort toward the production of a large number of long full length cDNAs would benefit from the construction of capped cDNA libraries (Suzuki et al., 1997). As well as providing full length cDNAs, a finishing approach to the transcriptome should target EST pairs from longer transcripts or gene models that are not fully covered (Sogayar et al., 2004). Full length cDNAs are the gold standard for defining a transcript and will enhance genome characterization and gene annotation.
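One simple way to flag 5′ reads that potentially contain the initiating methionine is to look for an ATG that has an in-frame stop codon upstream of it and an open reading frame downstream. The sketch below implements only that heuristic; it is an assumption made for illustration and is not claimed to be the selection procedure used by Faria-Campos et al. (2006) or Hu et al. (2003). The function name, the `min_codons` parameter and the test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def candidate_start(seq, min_codons=10):
    """Return the 0-based position of an ATG that follows an in-frame stop codon
    and opens at least min_codons stop-free codons, or None if there is none."""
    seq = seq.upper()
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        seen_stop = False
        for k, codon in enumerate(codons):
            if codon in STOP_CODONS:
                seen_stop = True
            elif codon == "ATG" and seen_stop:
                downstream = codons[k:k + min_codons]
                if len(downstream) == min_codons and not STOP_CODONS & set(downstream):
                    return frame + 3 * k
    return None


# Hypothetical 5' read: short UTR with an in-frame stop, then ATG and an open frame.
with_start = "CC" + "TAA" + "GCC" + "ATG" + "GCT" * 12
# Hypothetical read whose ATG has no upstream in-frame stop (start not confirmed).
unconfirmed = "GCT" * 5 + "ATG" + "GCT" * 12

position = candidate_start(with_start)
```

The upstream stop is what distinguishes a likely initiation codon from an internal methionine in a truncated clone; reads without it are not necessarily incomplete, only unconfirmed.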

EST clones have been an invaluable resource for the research community. Obtaining cDNA clones for a gene of interest will not always be trivial. Several efforts towards providing the research community with already cloned cDNAs have made the cloning effort much easier and cheaper. Some of the efforts for human cDNA clones are, for example, I.M.A.G.E. clones from ATCC or the German Resource Center for Genome Research (Lennon et al., 1996), the Japanese NITE and Riken Resource Centers that distribute full length human cDNA clones, TAIR for Arabidopsis full length cDNA clones (Rhee et al., 2003) and several commercial suppliers. For parasite material there is the MR4, which provides a variety of types of clones (Wu et al., 2001). With the idea of making resources readily available and inexpensive for the research community, the SR3 was recently created and will need community support for its success (Table 1).

The study of the transcriptome using microarrays has been carried out in Schistosoma and is discussed in depth in this issue. In addition, SAGE analyses have also been conducted to some extent in Schistosoma (Verjovski-Almeida et al., 2003; Williams, see this issue of EP). Recently it was shown by long-SAGE that in vitro exposure to nitric oxide modulates the expression of several genes, among them the upregulation of superoxide dismutase (Messerli et al., 2006). EST data have been used for the design of microarrays, both of the cDNA and oligonucleotide types (Hoffmann et al., 2001, 2002; Fitzpatrick et al., 2004, 2005; Chai et al., 2006; Dillon et al., 2006; Gobert et al., 2006; Moertel et al., 2006; Vermeire et al., 2006). The identification of SAGE tags will enhance the knowledge of transcription patterns as well as corroborate experimental evidence for transcribed genomic regions.

In addition to finding and annotating genes, ESTs can be mined for other types of information. Simoes et al. (in press) have used EST and base call quality information made available by one of the published projects (Verjovski-Almeida et al., 2003) to identify SNPs. The authors identified 15,615 putative SNPs, of which 1832 resulted in non-synonymous amino acid substitutions. Many of the known antigens and vaccine candidates contained amino acid polymorphisms. This kind of approach will provide valuable information for researchers interested in vaccine development and the identification of drug targets.
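The general idea of mining SNPs from redundant ESTs with base call quality can be sketched as follows: within a cluster of aligned reads, a column is a putative SNP when at least two different bases are each supported by several high-quality calls, and each variant is then classified as synonymous or non-synonymous using the genetic code. This is a simplified illustration, not the pipeline of Simoes et al.; the reads, quality values, thresholds and function names are hypothetical, and the reads are assumed to be pre-aligned, of equal length and in a known reading frame.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def putative_snps(reads, quals, min_qual=20, min_reads=2):
    """Map alignment column -> alleles that each have >= min_reads high-quality calls."""
    snps = {}
    for pos in range(len(reads[0])):
        counts = Counter(
            read[pos] for read, qual in zip(reads, quals)
            if read[pos] in "ACGT" and qual[pos] >= min_qual
        )
        alleles = sorted(base for base, n in counts.items() if n >= min_reads)
        if len(alleles) >= 2:
            snps[pos] = alleles
    return snps


def is_nonsynonymous(consensus, pos, alt, frame=0):
    """True if replacing consensus[pos] with alt changes the encoded amino acid."""
    start = pos - (pos - frame) % 3
    codon = consensus[start:start + 3]
    offset = pos - start
    mutated = codon[:offset] + alt + codon[offset + 1:]
    return CODON_TABLE[codon] != CODON_TABLE[mutated]


# Hypothetical cluster of five aligned ESTs covering the codons ATG GCT GAA.
consensus = "ATGGCTGAA"
reads = ["ATGGCTGAA", "ATGGCTGAA", "ATGGCCAAA", "ATGGCCAAA", "ATGGCTGAT"]
quals = [[30] * 9 for _ in reads]
quals[4][8] = 5  # the lone T in the last read is a low-quality call

snps = putative_snps(reads, quals)
```

Requiring each allele to appear in more than one high-quality read is what keeps isolated sequencing errors, such as the low-quality final base in the last read, from being reported as polymorphisms.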

In addition to SNPs, differences in splicing can be observed by investigating EST data. Differentially spliced genes, linked to development or sex, for example, are increasingly being observed in schistosomes (Shoemaker et al., 1992; Hamdan and Ribeiro, 1998; De Mendonca et al., 2002; Ram et al., 2004; DeMarco et al., 2006; Bahia et al., 2006). Alternative transcription initiation and polyadenylation sites could also be explored with the use of EST data (Kan et al., 2001; Zavolan et al., 2003; Le et al., 2006; Seoighe et al., 2006; Tian et al., in press).
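Once ESTs are aligned to genomic sequence, splicing differences show up as disagreements between the exon blocks of different reads from the same gene. The sketch below detects one kind of event, exon skipping, by checking whether an intron inferred from one EST spans an internal exon seen in another. The coordinates, EST names and functions are hypothetical, and the alignments are assumed to be given as sorted lists of (start, end) exon blocks.

```python
def introns(exons):
    """Introns implied by consecutive exon blocks: (end of exon i, start of exon i+1)."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def skipped_exons(alignments):
    """Return sorted (exon, skipping_est) pairs where an intron of one EST
    fully spans an internal exon observed in another EST."""
    events = set()
    for name_a, exons_a in alignments.items():
        for name_b, exons_b in alignments.items():
            if name_a == name_b:
                continue
            for donor, acceptor in introns(exons_b):
                # Terminal blocks may be truncated by the read ends, so only
                # internal exons of the other EST are considered.
                for start, end in exons_a[1:-1]:
                    if donor < start and end < acceptor:
                        events.add(((start, end), name_b))
    return sorted(events)


# Hypothetical genomic alignments of two ESTs from the same gene.
alignments = {
    "est_full": [(100, 200), (300, 380), (500, 600)],
    "est_skip": [(120, 200), (500, 580)],
}
events = skipped_exons(alignments)
```

The same intron lists could be compared for shared donors with different acceptors (or the reverse) to find alternative splice sites, and the outermost block boundaries give a first look at alternative initiation and polyadenylation sites.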

In conclusion, ESTs have been produced on a large scale for S. mansoni and S. japonicum and have contributed significantly to the discovery of new genes, annotation of the genome and production of microarrays. There are, however, several other uses that have been less explored, such as the identification of polymorphisms, alternative splice sites and transcription initiation sites, among others.


Acknowledgments

G.O. receives financial support from NIH Grants 5D43TW007012-03 and 5R03TW007358-02; CNPq/FIOCRUZ 400315/2006-8; FAPEMIG REDE-2829/05 and EDT 17001/01.

References

Adams, M.D., Kelley, J.M., Gocayne, J.D., Dubnick, M., Polymeropoulos, M.H., Xiao, H., Merril, C.R., Wu, A., Olde, B., Moreno, R.F., 1991. Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252, 1651–1656.

Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M., Sherlock, G., 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics 25, 25–29.

Bahia, D., Avelar, L., Mortara, R.A., Khayath, N., Yan, Y., Noel, C., Capron, M., Dissous, C., Pierce, R.J., Oliveira, G., 2006. SmPKC1, a new protein kinase C identified in the platyhelminth parasite Schistosoma mansoni. Biochemical and Biophysical Research Communications 345, 1138–1148.

Braschi, S., Curwen, R.S., Ashton, P.D., Verjovski-Almeida, S., Wilson, A., 2006. The tegument surface membranes of the human blood parasite Schistosoma mansoni: a proteomic analysis after differential extraction. Proteomics 6, 1471–1482.

Castelli, V., Aury, J.M., Jaillon, O., Wincker, P., Clepet, C., Menard, M., Cruaud, C., Quetier, F., Scarpelli, C., Schachter, V., Temple, G., Caboche, M., Weissenbach, J., Salanoubat, M., 2004. Whole genome sequence comparisons and "full-length" cDNA sequences: a combined approach to evaluate and improve Arabidopsis genome annotation. Genome Research 14, 406–413.

Chai, M., McManus, D.P., McInnes, R., Moertel, L., Tran, M., Loukas, A., Jones, M.K., Gobert, G.N., 2006. Transcriptome profiling of lung schistosomula, in vitro cultured schistosomula and adult Schistosoma japonicum. Cellular and Molecular Life Sciences 63, 919–929.

Cheng, G.F., Lin, J.J., Feng, X.G., Fu, Z.Q., Jin, Y.M., Yuan, C.X., Zhou, Y.C., Cai, Y.M., 2005. Proteomic analysis of differentially expressed proteins between the male and female worm of Schistosoma japonicum after pairing. Proteomics 5, 511–521.

Curwen, R.S., Ashton, P.D., Johnston, D.A., Wilson, R.A., 2004. The Schistosoma mansoni soluble proteome: a comparison across four life-cycle stages. Molecular and Biochemical Parasitology 138, 57–66.

De Mendonca, R.L., Bouton, D., Bertin, B., Escriva, H., Noel, C., Vanacker, J.M., Cornette, J., Laudet, V., Pierce, R.J., 2002. A functionally conserved member of the FTZ-F1 nuclear receptor family from Schistosoma mansoni. European Journal of Biochemistry 269, 5700–5711.

de Souza, S.J., Camargo, A.A., Briones, M.R., Costa, F.F., Nagai, M.A., Verjovski-Almeida, S., Zago, M.A., Andrade, L.E., Carrer, H., El-Dorry, H.F., Espreafico, E.M., Habr-Gama, A., Giannella-Neto, D., Goldman, G.H., Gruber, A., Hackel, C., Kimura, E.T., Maciel, R.M., Marie, S.K., Martins, E.A., Nobrega, M.P., Paco-Larson, M.L., Pardini, M.I., Pereira, G.G., Pesquero, J.B., Rodrigues, V., Rogatto, S.R., da, S.I., Sogayar, M.C., de Fatima, S.M., Tajara, E.H., Valentini, S.R., Acencio, M., Alberto, F.L., Amaral, M.E., Aneas, I., Bengtson, M.H., Carraro, D.M., Carvalho, A.F., Carvalho, L.H., Cerutti, J.M., Correa, M.L., Costa, M.C., Curcio, C., Gushiken, T., Ho, P.L., Kimura, E., Leite, L.C., Maia, G., Majumder, P., Marins, M., Matsukuma, A., Melo, A.S., Mestriner, C.A., Miracca, E.C., Miranda, D.C., Nascimento, A.N., Nobrega, F.G., Ojopi, E.P., Pandolfi, J.R., Pessoa, L.G., Rahal, P., Rainho, C.A., da, R.N., de Sa, R.G., Sales, M.M., da Silva, N.P., Silva, T.C., da Jr., S.W., Simao, D.F., Sousa, J.F., Stecconi, D., Tsukumo, F., Valente, V., Zalcbeg, H., Brentani, R.R., Reis, F.L., as-Neto, E., Simpson, A.J., 2000. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags. Proceedings of the National Academy of Sciences USA 97, 12690–12693.

DeMarco, R., Oliveira, K.C., Venancio, T.M., Verjovski-Almeida, S.,2006. Gender biased differential alternative splicing patterns of thetranscriptional cofactor CA150 gene in Schistosoma mansoni. Molec-ular and Biochemical Parasitology 150, 123–131.

Dias, N.E., Correa, R.G., Verjovski-Almeida, S., Briones, M.R., Nagai,M.A., da Jr., S.W., Zago, M.A., Bordin, S., Costa, F.F., Goldman,G.H., Carvalho, A.F., Matsukuma, A., Baia, G.S., Simpson, D.H.,Brunstein, A., de Oliveira, P.S., Bucher, P., Jongeneel, C.V., O’Hare,M.J., Soares, F., Brentani, R.R., Reis, L.F., de Souza, S.J., Simpson,A.J., 2000. Shotgun sequencing of the human transcriptome with ORFexpressed sequence tags. Proceedings of the National Academy ofSciences USA 97, 3491–3496.

Dillon, G.P., Feltwell, T., Skelton, J.P., Ashton, P.D., Coulson, P.S.,Quail, M.A., Nikolaidou-Katsaridou, N., Wilson, R.A., Ivens, A.C.,2006. Microarray analysis identifies genes preferentially expressed inthe lung schistosomulum of Schistosoma mansoni. InternationalJournal for Parasitology 36, 1–8.

El-Sayed, N.M., Bartholomeu, D., Ivens, A., Johnston, D.A., LoVerde,P.T., 2004. Advances in schistosome genomics. Trends in Parasitology20, 154–157.

Fairlamb, A.H., 2002. Metabolic pathway analysis in trypanosomes andmalaria parasites. Philosophical Transactions: Biological Sciences 357,101–107.

Faria-Campos, A.C., Moratelli, F.S., Mendes, I.K., Ortolani, P.L.,Oliveira, G.C., Campos, S.V.A., Ortega, J.M., Franco, G.R., 2006.Production of full-length cDNA sequences by sequencing and analysisof expressed sequence tags from Schistosoma mansoni.. Memorias doInstituto Oswaldo Cruz 101, 161–165.

Finn, R.D., Mistry, J., Schuster-Bockler, B., Griffiths-Jones, S., Hollich,V., Lassmann, T., Moxon, S., Marshall, M., Khanna, A., Durbin, R.,Eddy, S.R., Sonnhammer, E.L., Bateman, A., 2006. Pfam: clans, webtools and services. Nucleic Acids Research 34, D247–D251.

Fitzpatrick, J.M., Johansen, M.V., Johnston, D.A., Dunne, D.W.,Hoffmann, K.F., 2004. Gender-associated gene expression in tworelated strains of Schistosoma japonicum. Molecular and BiochemicalParasitology 136, 191–209.

Fitzpatrick, J.M., Johnston, D.A., Williams, G.W., Williams, D.J.,Freeman, T.C., Dunne, D.W., Hoffmann, K.F., 2005. An oligonu-cleotide microarray for transcriptome analysis of Schistosoma mansoni

and its application/use to investigate gender-associated gene expres-sion. Molecular and Biochemical Parasitology 141, 1–13.

Franco, G.R., Adams, M.D., Soares, M.B., Simpson, A.J., Venter, J.C.,Pena, S.D., 1995. Identification of new Schistosoma mansoni genes bythe EST strategy using a directional cDNA library. Gene 152, 141–147.

Fulton, L.L., Hillier, L.D., Wilson, R.K., 1995. Large-scale complemen-tary DNA sequencing methods. Methods in Cell Biology 48, 571–582.

Gobert, G.N., McInnes, R., Moertel, L., Nelson, C., Jones, M.K., Hu, W., McManus, D.P., 2006. Transcriptomics tool for the human Schistosoma blood flukes using microarray gene expression profiling. Experimental Parasitology 114, 160–172.

Hamdan, F.F., Ribeiro, P., 1998. Cloning and sequence analysis of a lysophospholipase homologue from Schistosoma mansoni. Parasitology Research 84, 839–842.

Hoffmann, K.F., Johnston, D.A., Dunne, D.W., 2002. Identification of Schistosoma mansoni gender-associated gene transcripts by cDNA microarray profiling. Genome Biology 3, RESEARCH0041.

Hoffmann, K.F., McCarty, T.C., Segal, D.H., Chiaramonte, M., Hesse, M., Davis, E.M., Cheever, A.W., Meltzer, P.S., Morse III, H.C., Wynn, T.A., 2001. Disease fingerprinting with cDNA microarrays reveals distinct gene expression profiles in lethal type 1 and type 2 cytokine-mediated inflammatory reactions. FASEB Journal 15, 2545–2547.

Hu, W., Yan, Q., Shen, D.K., Liu, F., Zhu, Z.D., Song, H.D., Xu, X.R., Wang, Z.J., Rong, Y.P., Zeng, L.C., Wu, J., Zhang, X., Wang, J.J., Xu, X.N., Wang, S.Y., Fu, G., Zhang, X.L., Wang, Z.Q., Brindley, P.J., McManus, D.P., Xue, C.L., Feng, Z., Chen, Z., Han, Z.G., 2003. Evolutionary and biomedical implications of a Schistosoma japonicum complementary DNA resource. Nature Genetics 35, 139–147.

Kan, Z., Rouchka, E.C., Gish, W.R., States, D.J., 2001. Gene structure prediction and alternative splicing analysis using genomically aligned ESTs. Genome Research 11, 889–900.

Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., Hattori, M., 2004. The KEGG resource for deciphering the genome. Nucleic Acids Research 32, D277–D280.

Knudsen, G.M., Medzihradszky, K.F., Lim, K.C., Hansell, E., McKerrow, J.H., 2005. Proteomic analysis of Schistosoma mansoni cercarial secretions. Molecular and Cellular Proteomics 4, 1862–1875.

Le, T.V., Riethoven, J.J., Kumanduri, V., Gopalakrishnan, C., Lopez, F., Gautheret, D., Thanaraj, T.A., 2006. AltTrans: transcript pattern variants annotated for both alternative splicing and alternative polyadenylation. BMC Bioinformatics 7, 169.

Lennon, G., Auffray, C., Polymeropoulos, M., Soares, M.B., 1996. The I.M.A.G.E. Consortium: an integrated molecular analysis of genomes and their expression. Genomics 33, 151–152.

Liu, F., Lu, J., Hu, W., Wang, S.Y., Cui, S.J., Chi, M., Yan, Q., Wang, X.R., Song, H.D., Xu, X.N., Wang, J.J., Zhang, X.L., Zhang, X., Wang, Z.Q., Xue, C.L., Brindley, P.J., McManus, D.P., Yang, P.Y., Feng, Z., Chen, Z., Han, Z.G., 2006. New perspectives on host-parasite interplay by comparative transcriptomic and proteomic analyses of Schistosoma japonicum. Public Library of Science Pathogens 2, e29.

LoVerde, P.T., 2002. Presidential address. Sex and schistosomes: an interesting biological interplay with control implications. Journal of Parasitology 88, 3–13.

Marchler-Bauer, A., Anderson, J.B., Derbyshire, M.K., Weese-Scott, C., Gonzales, N.R., Gwadz, M., Hao, L., He, S., Hurwitz, D.I., Jackson, J.D., Ke, Z., Krylov, D., Lanczycki, C.J., Liebert, C.A., Liu, C., Lu, F., Lu, S., Marchler, G.H., Mullokandov, M., Song, J.S., Thanki, N., Yamashita, R.A., Yin, J.J., Zhang, D., Bryant, S.H., 2007. CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Research 35, D237–D240.

McManus, D.P., Le, T.H., Blair, D., 2004. Genomics of parasitic flatworms. International Journal for Parasitology 34, 153–158.

Merrick, J.M., Osman, A., Tsai, J., Quackenbush, J., LoVerde, P.T., Lee, N.H., 2003. The Schistosoma mansoni Gene Index: gene discovery and biology by reconstruction and analysis of expressed gene sequences. Journal of Parasitology 89, 261–269.

Messerli, S.M., Morgan, W., Birkeland, S.R., Bernier, J., Cipriano, M.J., McArthur, A.G., Greenberg, R.M., 2006. Nitric oxide-dependent changes in Schistosoma mansoni gene expression. Molecular and Biochemical Parasitology 150, 367–370.

Miura, F., Kawaguchi, N., Sese, J., Toyoda, A., Hattori, M., Morishita, S., Ito, T., 2006. A large-scale full-length cDNA analysis to explore the budding yeast transcriptome. Proceedings of the National Academy of Sciences USA 103, 17846–17851.

Moertel, L., McManus, D.P., Piva, T.J., Young, L., McInnes, R.L., Gobert, G.N., 2006. Oligonucleotide microarray analysis of strain- and gender-associated gene expression in the human blood fluke, Schistosoma japonicum. Molecular and Cellular Probes 20, 280–289.

Mone, H., Boissier, J., 2004. Sexual biology of schistosomes. Advances in Parasitology 57, 89–189.

Mudado, M.A., Ortega, J.M., 2006. A picture of gene sampling/expression in model organisms using ESTs and KOG proteins. Genetics and Molecular Research, 242–253.

Mulder, N.J., Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Binns, D., Biswas, M., Bradley, P., Bork, P., Bucher, P., Copley, R., Courcelle, E., Durbin, R., Falquet, L., Fleischmann, W., Gouzy, J., Griffith-Jones, S., Haft, D., Hermjakob, H., Hulo, N., Kahn, D., Kanapin, A., Krestyaninova, M., Lopez, R., Letunic, I., Orchard, S., Pagni, M., Peyruc, D., Ponting, C.P., Servant, F., Sigrist, C.J., 2002. InterPro: an integrated documentation resource for protein families, domains and functional sites. Briefings in Bioinformatics 3, 225–235.

Oliveira, G., 2001. Schistosoma gene discovery project update. Trends in Parasitology 17, 108–109.

Oliveira, G., Johnston, D.A., 2001. Mining the schistosome DNA sequence database. Trends in Parasitology 17, 501–503.

Quackenbush, J., 2003. Genomics. Microarrays-guilt by association. Science 302, 240–241.

Quevillon, E., Silventoinen, V., Pillai, S., Harte, N., Mulder, N., Apweiler, R., Lopez, R., 2005. InterProScan: protein domains identifier. Nucleic Acids Research 33, W116–W120.

Ram, D., Ziv, E., Lantner, F., Lardans, V., Schechter, I., 2004. Stage-specific alternative splicing of the heat-shock transcription factor during the life-cycle of Schistosoma mansoni. Parasitology 129, 587–596.

Rhee, S.Y., Beavis, W., Berardini, T.Z., Chen, G., Dixon, D., Doyle, A., Garcia-Hernandez, M., Huala, E., Lander, G., Montoya, M., Miller, N., Mueller, L.A., Mundodi, S., Reiser, L., Tacklind, J., Weems, D.C., Wu, Y., Xu, I., Yoo, D., Yoon, J., Zhang, P., 2003. The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Research 31, 224–228.

Seoighe, C., Nembaware, V., Scheffler, K., 2006. Maximum likelihood inference of imprinting and allele-specific expression from EST data. Bioinformatics 22, 3032–3039.

Shibui, A., Shiibashi, T., Nogami, S., Sugano, S., Watanabe, J., 2005. A novel method for development of malaria vaccines using full-length cDNA libraries. Vaccine 23, 4359–4366.

Shin, H., Sheu, B., Markey, M.K., 2005. Guilt-By-Association feature selection applied to simulated proteomic data. American Medical Informatics Association Annual Symposium Proceedings 1114.

Shoemaker, C.B., Ramachandran, H., Landa, A., dos Reis, M.G., Stein, L.D., 1992. Alternative splicing of the Schistosoma mansoni gene encoding a homologue of epidermal growth factor receptor. Molecular and Biochemical Parasitology 53, 17–32.

Simoes, M., Bahia, D., Zerlotini, A., Torres, K., Artiguenave, F., Neshich, G., Kuser, P., Oliveira, G., in press. Single nucleotide polymorphisms identification in expressed genes of Schistosoma mansoni. Molecular and Biochemical Parasitology.

Sogayar, M.C., Camargo, A.A., Bettoni, F., Carraro, D.M., Pires, L.C., Parmigiani, R.B., Ferreira, E.N., de Sa, M.E., do Rosario, D.d., Simpson, A.J., Cruz, L.O., Degaki, T.L., Festa, F., Massirer, K.B., Sogayar, M.C., Filho, F.C., Camargo, L.P., Cunha, M.A., de Souza, S.J., Faria Jr., M., Giuliatti, S., Kopp, L., de Oliveira, P.S., Paiva, P.B., Pereira, A.A., Pinheiro, D.G., Puga, R.D., de Souza, J.E., Albuquerque, D.M., Andrade, L.E., Baia, G.S., Briones, M.R., Cavaleiro-Luna, A.M., Cerutti, J.M., Costa, F.F., Costanzi-Strauss, E., Espreafico, E.M., Ferrasi, A.C., Ferro, E.S., Fortes, M.A., Furchi, J.R., Giannella-Neto, D., Goldman, G.H., Goldman, M.H., Gruber, A., Guimaraes, G.S., Hackel, C., Henrique-Silva, F., Kimura, E.T., Leoni, S.G., Macedo, C., Malnic, B., Manzini, B.C., Marie, S.K., Martinez-Rossi, N.M., Menossi, M., Miracca, E.C., Nagai, M.A., Nobrega, F.G., Nobrega, M.P., Oba-Shinjo, S.M., Oliveira, M.K., Orabona, G.M., Otsuka, A.Y., Paco-Larson, M.L., Paixao, B.M., Pandolfi, J.R., Pardini, M.I., Passos Bueno, M.R., Passos, G.A., Pesquero, J.B., Pessoa, J.G., Rahal, P., Rainho, C.A., Reis, C.P., Ricca, T.I., Rodrigues, V., Rogatto, S.R., Romano, C.M., Romeiro, J.G., Rossi, A., Sa, R.G., Sales, M.M., Sant’Anna, S.C., Santarosa, P.L., Segato, F., Silva Jr., W.A., Silva, I.D., Silva, N.P., Soares-Costa, A., Sonati, M.F., Strauss, B.E., Tajara, E.H., Valentini, S.R., Villanova, F.E., Ward, L.S., Zanette, D.L., 2004. A transcript finishing initiative for closing gaps in the human transcriptome. Genome Research 14, 1413–1423.

Stanke, M., Schoffmann, O., Morgenstern, B., Waack, S., 2006. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7, 62.

Suzuki, Y., Yoshitomo-Nakagawa, K., Maruyama, K., Suyama, A., Sugano, S., 1997. Construction and characterization of a full length-enriched and a 5'-end-enriched cDNA library. Gene 200, 149–156.

Tian, B., Pan, Z., Lee, J.Y., in press. Widespread mRNA polyadenylation events in introns indicate dynamic interplay between polyadenylation and splicing. Genome Research.

van Balkom, B.W., van Gestel, R.A., Brouwers, J.F., Krijgsveld, J., Tielens, A.G., Heck, A.J., Van Hellemond, J.J., 2005. Mass spectrometric analysis of the Schistosoma mansoni tegumental sub-proteome. Journal of Proteome Research 4, 958–966.

Van Hellemond, J.J., Retra, K., Brouwers, J.F., van Balkom, B.W., Yazdanbakhsh, M., Shoemaker, C.B., Tielens, A.G., 2006. Functions of the tegument of schistosomes: clues from the proteome and lipidome. International Journal for Parasitology 36, 691–699.

Verjovski-Almeida, S., DeMarco, R., Martins, E.A., Guimaraes, P.E., Ojopi, E.P., Paquola, A.C., Piazza, J.P., Nishiyama Jr., M.Y., Kitajima, J.P., Adamson, R.E., Ashton, P.D., Bonaldo, M.F., Coulson, P.S., Dillon, G.P., Farias, L.P., Gregorio, S.P., Ho, P.L., Leite, R.A., Malaquias, L.C., Marques, R.C., Miyasato, P.A., Nascimento, A.L., Ohlweiler, F.P., Reis, E.M., Ribeiro, M.A., Sa, R.G., Stukart, G.C., Soares, M.B., Gargioni, C., Kawano, T., Rodrigues, V., Madeira, A.M., Wilson, R.A., Menck, C.F., Setubal, J.C., Leite, L.C., Dias-Neto, E., 2003. Transcriptome analysis of the acoelomate human parasite Schistosoma mansoni. Nature Genetics 35, 148–157.

Verjovski-Almeida, S., Leite, L.C., Dias-Neto, E., Menck, C.F., Wilson, R.A., 2004. Schistosome transcriptome: insights and perspectives for functional genomics. Trends in Parasitology 20, 304–308.

Vermeire, J.J., Taft, A.S., Hoffmann, K.F., Fitzpatrick, J.M., Yoshino, T.P., 2006. Schistosoma mansoni: DNA microarray gene expression profiling during the miracidium-to-mother sporocyst transformation. Molecular and Biochemical Parasitology 147, 39–47.

Voy, B.H., Scharff, J.A., Perkins, A.D., Saxton, A.M., Borate, B., Chesler, E.J., Branstetter, L.K., Langston, M.A., 2006. Extracting gene networks for low-dose radiation using graph theoretical algorithms. Public Library of Science Computational Biology 2, e89.

Wolfe, C.J., Kohane, I.S., Butte, A.J., 2005. Systematic survey reveals general applicability of “guilt-by-association” within gene coexpression networks. BMC Bioinformatics 6, 227.

Wu, C.H., Apweiler, R., Bairoch, A., Natale, D.A., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., 2006a. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Research 34, D10–D15.

Wu, C.H., Yeh, L.S., Huang, H., Arminski, L., Castro-Alvear, J., Chen, Y., Hu, Z., Kourtesis, P., Ledley, R.S., Suzek, B.E., Vinayaka, C.R., Zhang, J., Barker, W.C., 2003. The Protein Information Resource. Nucleic Acids Research 31, 345–347.

Wu, J., Hu, Z., DeLisi, C., 2006b. Gene annotation and network inference by phylogenetic profiling. BMC Bioinformatics 7, 80.

Wu, Y., Fairfield, A.S., Oduola, A., Cypess, R.H., 2001. The Malaria Research and Reference Reagent Resource (MR4) Center-creating African opportunities. African Journal of Medicine and Medical Science 30 (Suppl.), 52–54.

Zavolan, M., Kondo, S., Schonbach, C., Adachi, J., Hume, D.A., Hayashizaki, Y., Gaasterland, T., 2003. Impact of Alternative Initiation, Splicing, and Termination on the Diversity of the mRNA Transcripts Encoded by the Mouse Transcriptome. Genome Research 13, 1290–1300.