Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Dataset PaperSubtracted Transcriptome Profile of Tiger Shrimp(Penaeus monodon) That Survived WSSV Challenge
Benedict A Maralit1 Minerva Fatimae H Ventolero1 Mary Beth B Maningas2
Edgar C Amar3 and Mudjekeewis D Santos1
1 Genetic Fingerprinting Laboratory National Fisheries Research and Development Institute Corporate 101 BuildingMother Ignacia Avenue 1103 Quezon City Philippines
2 Department of Biological Science College of ScienceResearch Center for the Natural Sciences Thomas Aquinas Research CenterUniversity of Santo Tomas Espana Boulevard 1008 Manila Philippines
3 Aquaculture Department Southeast Asian Fisheries Development Center (SEAFDEC AQD) Tigbauan Iloilo Philippines
Correspondence should be addressed to Mudjekeewis D Santos mudjiesantosyahoocom
Received 15 July 2013 Accepted 5 September 2013 Published 8 May 2014
Academic Editors A Hannoufa Y Okazaki and B Padhukasahasram
Copyright copy 2014 Benedict A Maralit et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited
There is increased interest in the development of virus-resistant or improved shrimp stock because production is currently hinderedby outbreaks and limited understanding of shrimp defense Recent advancement now allows for high-throughputmolecular studieson shrimp immunity We used next-generation sequencing (NGS) coupled with suppression subtractive hybridization (SSH) togenerate a transcriptome database of genes from tiger shrimp that survived White spot syndrome virus (WSSV) challenge A totalof 9597 unique sequences were uploaded to NCBI Sequence Read Archive with accession number SRR577080 Sixty-five uniquesequences 6 of the total were homologous to genes of Penaeus monodon Genes that were initially related to bacterial infectionand environmental stress such as 14-3-3 gene heat shock protein 90 and calreticulinwere also found including a few full-length genesequences Initial analysis of the expression of some genes was done Hemocyanin ferritin and fortilin-binding protein exhibiteddifferential expression between survivor and control tiger shrimps Furthermore candidate microsatellite markers for brood stockselection were mined and tested Four trinucleotide and one dinucleotide microsatellites were successfully amplified The studyhighlights the advantage of the NGS platform coupled with SSH in terms of gene discovery and marker generation
1 Introduction
Although a lot of bestmanagement practices detectionmeth-ods and disease intervention have been developed shrimpproduction is still continuously affected by viral infectionsBecause of this greater attention has now been placed onmolecular studies for application on genetic breeding pro-grams As seen in recent years there is an increase in thenumber of studies that aim to elucidate the function ofseveral previously unknown genes and their viral responsepathways Expression and characterization studies of one ora group of similar genes are currently gaining pace Likewiselarge databases from which immune-related genes may bedetermined are now publicly available and are continuouslygenerated
Some notable genes that have been recently describedfrom Expressed Sequence Tags (EST) databases are hemo-cyanin ferritin and fortilin-binding protein [1ndash5] Manysubtraction libraries have also been used to associate minedgenes from EST to specific roles in immunity For instancecathepsins [1 6] and penaeidins [7ndash9] were proven to beimmune related based on several separate cDNA librarystudies [10] Using the same large-scale sequencing approachgenes currently related to bacterial infection and environ-mental stress such as heat shock proteins [11] hemocytehomeostasis-associated protein [12] 14-3-3 gene and calreti-culin are also suggested to play important roles inWhite spotsyndrome virus (WSSV) infection or resistance
As seen large-scale transcriptome analysis and EST data-bases have established many immune related genes which
Hindawi Publishing CorporationDataset Papers in ScienceVolume 2014 Article ID 807806 11 pageshttpdxdoiorg1011552014807806
2 Dataset Papers in Science
ldquoSurvivorrdquo ldquoApparently healthyrdquo
PM04
PM05
PM06
PM07
PM08
PM25
PM26
PM27
PM28
PM29
16S
WSSV
Figure 1 PCR-based detection of White spot syndrome virus in allPenaeus monodon samples Top internal controls bottom lane 1ladder lane 2 negative control lane 3 positive control lanes 4ndash8survivor samples lanes 9ndash13 control shrimp samples
were further characterized by expression or functional ana-lyses [1 8 9 12 13] Indeed its usefulness has beenproven in many aspects of shrimp health Recent transcrip-tome approaches such as microarrays and next-generationsequencing (NGS) have also provided efficient applicationsfor rapid determination of quantitative trait loci and geneticmarkers from immune-related genes [1 14ndash16] because theseapproaches provide longer average sequence reads of 600base pairs (bp) than traditional ones The markers generatedfrom these platforms may be used for a lot of geneticstudies integral to taxonomy or aquaculture In fact thelatest breeding technologies in plants make use of next-generation sequencing data for rapid marker development[17] In shrimp traditional selective breeding programs havealready been useful for producing WSSV resistant Penaeusvannamei [1 18ndash21] However this classical approach entailshigh-input requirements of capital time and research effortAs seen in plants marker-assisted selection [22] a modifiedapproach in traditional selective breeding proves successfulIt is thus the goal of this study to create a database of genesfrom which candidate markers may be mined and used inmarker-assisted selection (MAS) of WSSV-resistant or toler-ant P monodon Suppression subtractive hybridization (SSH)coupled with NGS was used in this study to generate thedesired database We show that this platform can be used tofurther study genes that may have relevant function inWSSVimmunity Likewise we identified resistance-related markerssuch as microsatellites that may be amplified and analyzedfor downstream genetic applications such as semiquantitativeexpression analysis
2 Methodology
Sample collection was as follows Shrimps for challengeexperiments were collected from a farm located in NegrosOccidental Philippines Shrimp fry used to stock the farmcame from a hatchery that practices biosecurity monitoringBefore they were transported to a biosecure facility inSEAFDEC AQD samples were tested for WSSV and otherviruses Subsequently they were quarantined in this facilityfor one month as they were being acclimated to tank condi-tions
For WSSV challenge an in vivo titration test by intra-muscular injection of the viral inoculum was conducted todetermine the dose that resulted in 50 mortality (SID
50)
of test shrimps (155 days old) To do this gills frommoribund shrimp were collected and homogenized in PBSThe homogenate was centrifuged at 5000 rpm The result-ing supernatant was filtered in 045 120583m sterile membranefilter serially diluted (10minus3 to 10minus9) in PBS and 100 120583L wasinjected into the abdominal muscle region of each shrimpMortality was observed for 15 days The computed SID
50
was 10minus645mLminus01 and the titer of the undiluted supernatantwas 10745 SID
50mLminus1 The challenge test was conducted by
injecting each shrimp with 1 SID50of the inoculum following
the same protocol used in the in vivo titration test Survivingshrimps from this challenge were kept for genetic analysisTissue sampling from surviving shrimps (survivor) was doneafter 30 days of challenge test termination
DNA extraction and WSSV detection were as followsA total of 10 representative P monodon samples were col-lected Five shrimps that survived viral challenge experiments(30 days after infection) were labeled as ldquosurvivorrdquo whilefive apparently healthy shrimps (no gross signs of disease)were labeled as ldquocontrolrdquo Extraction of DNA from gillswas performed using CTAB extraction buffer as describedby Santos et al [23] Spectrophotometry of DNA extractswas performed before proceeding to amplification (PCR)experiments A total of 10 120583L of reaction mixture was usedcontaining 2 120583L of DNA template 02mM dNTP 2 120583M ofeach primer [24 25] and 1x buffer already containing 15mMMgCl
2and amplified under the following conditions initial
denaturation at 95∘C for 5 minutes followed by 30 cycles ofdenaturation at 95∘C for 30 seconds annealing at 52∘C for30 seconds and extension at 72∘C for another 30 secondsand a final extension for 5 minutes and storage at 4∘CAll DNA extracts were confirmed to contain DNA by PCRamplification using universal 16Sar primers [26] Based on thePCR detection method described none of the samples waspositive for WSSV (Figure 1)
Library preparation was as follows Suppression subtrac-tive hybridization (SSH) is one of the most powerful andpopular methods for generating subtracted cDNA librariesbecause the probability of obtaining low-abundance dif-ferentially expressed cDNA fragments is very high It isa suppression PCR technique that uses normalization andsubtraction in a single procedure It was used in this studybecause downstream analyses such as filtering after next-generation sequencing of the subtracted library are simplifiedin this platform
RNA extracted from gills using Nucleospin RNA IIkit (Machery-Nagel) was subjected to spectrophotometryfor the determination of purity and concentration cDNAwas synthesized from the RNA extracts using ClonetechSMARTer cDNA synthesis kit Immediately after cDNA wasamplified using Clonetech Advantage II PCR kit throughlong distance-PCR (ld-PCR) following the recommendedprotocol Incubation time was increased from 3 minutes to6 minutes to ensure synthesis and amplification of longercDNA transcripts The cDNA templates with highest qualitywere used in the subtraction experiment
Dataset Papers in Science 3
1 2 3 4 5
L FU FS RU RS
Figure 2 Evaluation of the subtraction efficiency of the subtractedcDNAs PCRs with 120573-actin-specific primers were performed usingforward (lanes 2 and 3) and reverse (lanes 4 and 5) SSH productsLane 1 ladder lanes 2 and 4 unsubtracted lanes 3 and 5 subtracted
cDNA subtraction was carried out using Clonetech PCR-Select cDNA Subtraction Kit in accordance with the manu-facturerrsquos protocol For forward subtraction cDNA synthe-sized from shrimp sample that survived viral challenge (sur-vivor) was used as driver while cDNA from control shrimpwas used as tester Forward and reverse SSH reactions wereperformed simultaneously Tester and driver cDNA popula-tionswere briefly digestedwithRsaI and ligated eachwith oneof the two adaptors (adaptor 1 and adaptor 2R) included inthe kit After denaturation at 98∘C for 90 s tester cDNAswereligated with the adaptor 1 and adaptor 2R separately and thenhybridized with an excess of driver cDNAs at 68∘C for 8 hAfter the first hybridization the two products (adaptor 1 andadaptor 2R)weremixed together and immediately hybridizedagain with an excess of freshly denatured driver cDNA for20 h at 68∘C Differentially expressed products were thenselectively amplified by two rounds of PCR using ClonetechAdvantage II PCR kit with primers specific to the twoadaptors resulting in suppression of genes common to thetwo cDNA populations and enrichment of the differentiallyexpressed genes The efficiency of the subtraction was finallyevaluated by comparing the abundance of the constitutivelyexpressed-actin in the subtracted and unsubtracted productsby PCR as described by Zhao et al [1] Figure 2 shows thatafter subtraction the constitutive abundance of 120573-actin wasefficiently reduced after SSH
Next-generation sequencing and bioinformatics were asfollows The forward subtracted product from SSH was sentto Macrogen Inc Korea (httpwwwmacrogencomeng)for 454 sequencing of 14 plate De novo assembly of theacquired reads was conducted using GS De Novo Assem-bler v 26 (Newbler) with cDNAtranscriptome optionNewbler parameters such as base calling quality calling andassembly quality filters were set to default Singletons werecleaned by SeqClean (httpsourceforgenetprojectsseq-clean) and further assessed and controlled by Lucy (httplucysourceforgenet) with minimum length filter of 100 bpA complete workflow of the study is exhibited in Figure 3
As seen in Table 2 a total of 76889928 bases whichcomprised 240616 reads were generated fromNGS Approx-imately 52 of the reads were assembled to form 125598sequences with an average read length of 319 ranging from
100 to 600 base pairs De novo assembly generated 4090contigs with N50 of 504 Likewise 9526 singletons wereproduced leaving out 10108 outliers and 2650 too shortsequences Two phases of sequence cleaning using SeqCleanand Lucy generated 5507 singletons completing a totalof 9597 This number of unique sequences is comparableto those stored in Penaeus monodon Genome Database(httpsysbioiissinicaedutwpageindexphp) and the Pen-aeus monodon gene discovery project [27] The genome sizeof tiger shrimp is not yet fully determined but the 1119862 nuclearDNA content of P monodon was estimated to be sim722 ofthe human genome that is sim253 pg DNA per nucleus or217times10
9 bp per haploid genome [28] In Table 2 the first partpresents Reads Status where the number of reads counts thereads used in the assembly computation number of bases thereadrsquos bases used in the assembly computation assembled thenumber of reads that are fully incorporated into the assemblypartial the number of reads with only a part being includedin the assembly singleton the number of reads that didnot overlap with any other reads in the input repeat thenumber of reads deemed to be from repeat regions outlierthe number of reads that were identified by the GS De NovoAssembler as problematic and too short the number of readsthat were too short to be used in the computationThe secondpart of the table presents the frequency information onisogroups or collection of contigs containing reads that implyconnections between them The number of isogroups corre-sponds to the number of isogroups identified average contigcount the average count of contigs in the isogroups largestcontig count the largest count of contigs in the isogroupsnumber with one contig the number of isogroups assembledby one contig average isotig count the average count ofisotigs in the isogroups largest isotig count the largest countof isotigs in the isogroups number with one isotig thenumber of isogroups assembled by one isotig The third partpresents the frequency information of isotigs (analogous toan individual transcript) The number of isotigs correspondto the number of contigs identified average contig countthe average count of contigs in the isotigs largest contigcount the largest count of contigs in the isotigs number withone contig the number of isotigs assembled by one contignumber of bases the total number of bases in the isotigaverage isotig size the average isotig size N50 isotig size theN50 isotig size (an N50 means that half of all bases reside inisotigs of this size or longer) largest isotig size the size ofthe largest isotig The last part in the table presents the fre-quency information of singletons The number of singletonscorresponds to the number of singletons identified valid thenumber of valid singletons after SeqClean short the numberof sequences less than 100 bp LowQual undetermined basegt3 in clear range dust standalone by low complexity (dust)filter lt40 nt ShortQ trimmed sequence less than 100 bpvalid (after second cleaning) the number of valid singletonsafter Lucy trashed the number of low-quality sequences
Processed reads and generated contigs were aligned to thesequences publicly available in the NCBI GenBank databaseSeparate BLAST analyses were done for isotigs (Dataset Item1 (Table)) and singletons (Dataset Item 2 (Table)) Of thesequences generated from NGS 598 (1462) had significant
4 Dataset Papers in Science
Viral challenge
RNA extraction
cDNA synthesis
Raw images
Image processing
(CWF file)
Signal processing
(SFF file)
Raw data(SFF file)
Assembly
(cDNA option)
Singleton
cleaning
BLAST GO andother analyses
Databasearchiving and
upload
RT-PCR
Gel
electrophoresis
Suppressionsubtractive
hybridization
454 sequencing
Library
preparation
Data
processingAnalysis
Marker
amplification
Figure 3 Experiment workflow
matches (le10119864 minus 3) to the nonredundant (nr) nucleotidedatabase of GenBank after BLAST analysis with defaultoptions Twenty-six (26) sequences (425) of those withBLASTn hits had 119864-values of le10119864 minus 100 and thus wereconsidered to have highly significant homology [29] About296 sequences (4950) had homology values between 10119864minus99 and 10119864 minus 20 which were considered as moderatelysimilar to the sequences in GenBank The remaining 276isotigs (4615) had weak homologies or similarities (119864-values between 10119864minus19 and 10119864minus3) Meanwhile out of the5507 cleaned singletons 414 had significant hits to GenBanksequences One hundred sixty-one (161) (3889) of thoseannotated had moderate similarity to the gene sequences inGenBank Out of the combined 1012 BLASTn homologoussequences 9 were similar to Daphnia pulex nucleotidesequences (Figure 4)
The remaining majority had significant homologies toseveral shrimp species such as Penaeus monodon (6)Litopenaeus vannamei (3) Marsupenaeus japonicus (3)and Fenneropenaeus chinensis (2) Dataset Item 3 (Table)lists the database sequences with homology to P monodongenes in the GenBank nucleotide database Resulting geneontology (GO) assignment showed that 88 of the totalisotigs (Dataset Item 4 (Table)) and 94 of the singletons(Dataset Item 5 (Table)) had no significant hits to the Gen-bank GO database Tables 3 and 4 show the isotigs and sin-gletons GO analysis results respectively 488 isotigs and 336singletons were assigned and distributed to different cate-gories based on gene ontology Complete lists of isotigand singleton BLASTx or GO assignments are exhibited in
Dataset Item 6 (Table) and Dataset Item 7 (Table) respect-ively
In terms of similarities to model organisms the majorityof the 824 GO assigned sequences (40) had significanthits to genes of Drosophila melanogaster Genes from Daniorerio also had homologies to 23 of the sequences Theremaining sequences were significantly homologous to thegenes of Rattus norvegicus (14) Mus musculus (10) andother species (Figure 5)
Microsatellite mining was as follows Using msatcom-mander with search options for trinucleotide repeats orhigher 25 microsatellite markers were identified in DatasetItem 8 (Table) Five (20) of these have BLASTn matchesThe homologous genes however were putative or predictedSome of these were amplified using RT-PCR with the fol-lowing mixture 120583L 1 cDNA template in 1x PCR buffer with2mM MgCl
2 02mM dNTPs 05120583M forward and reverse
primers 1 unit Taq and distilled water The PCR mix wasrun in a thermal cycler with the following conditions 95∘Cfor 5 minutes 30 cycles of 95∘C for 30 seconds 52∘C for 30seconds 72∘C for 30 seconds and a final extension of 72∘Cfor 5 minutes Successfully amplified microsatellite markersare shown in Figure 6
The successful amplification of these microsatellites indi-cates that repetitive segments may also be found in thetranscriptome and they may be used as genetic markersrelating to desired phenotype Once confirmed to be poly-morphic and after further field testing using genomic DNAthese microsatellite markers may eventually be established asmarkers for WSSV resistance in shrimp
Dataset Papers in Science 5
224 other species 74es 74
Penaeus monodon 6Litopenaeus vannamei 3
Marsupenaeus japonicus 2Fenneropenaeus chinensis 2
Callinectes sapidus 2
Daphnia pulex 9
Procambarus clarkii 1
Figure 4 Organisms with most nucleotide hits the proportion of unique sequences with BLASTn matches (119864-value le10minus3) in thenonredundant nucleotide database classified according to the organism with the ldquotop hitrdquo nucleotide sequence
Semiquantitative expression analysis was as follows Theconcentration and purity of the RNA extracts were deter-mined by spectrophotometry prior to cDNA synthesis andRT-PCR All extracts were found to contain quality RNA (OD260280 of 18 to 20) Gene expression was analyzed usingsemiquantitative PCR PCRmix was prepared using the sameconcentrations as described in microsatellite amplificationRunning conditions however were modified into 25 cycleswith a 50∘C cycle annealing temperature After the run 2 120583Lof each ampliconwasmixedwith 2 120583Lof loading dye and thenelectrophoresed in 1 agarose gel Relative expression wasdetermined based on the fluorescence of 2 120583L DNA ladderloaded on the same gel and computed using BioDocAnalyzeSoftware bundled with the Biometra Gel Documentation sys-tem Ferritin and fortilin-binding protein whose full-lengthcoding regions were found in the library were successfullyamplified Differential expression of these genes was clearlyobserved in the semiquantitative PCR profiles (Figure 7)120573-Actin was used as negative control [1] while hemo-
cyanin a known differentially expressed gene in virus resis-tant shrimps [13] was used as a positive control It can beseen that fortilin-binding protein is significantly upregulatedin ldquosurvivorrdquo shrimps Primers and annealing temperaturesused were listed in Table 1
3 Dataset Description
The dataset associated with this Dataset Paper consists of 8items which are described as follows
Dataset Item 1 (Table) A list of isotigs and BLAST resultsThe column Query Name presents the isotig number desig-nated by the assembly program Query Length the lengthof isotig sequence in base pairs Query Start the start ofalignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered
by alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignmentbit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal Identities
6 Dataset Papers in Science
Others
5
Drosophila
melanogaster 40
Danio rerio 23
Rattus norvegicus 14
Mus musculus 10
Caenorhabditis elegans 3Candida albicans 3Arabidopsis thaliana 1
Saccharomyces cerevisiae 1
Figure 5Organismswithmost protein hits the proportion of putative transcriptswith BLASTxmatches (119864-valuele10minus3) in the nonredundantprotein database classified according to the organism with the ldquotop hitrdquo protein sequence
1 2 3 4 5 6 7 8
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
isotig01639
isotig01684
isotig00854
isotig01256
Control
HGUVU5Q01ANPJ4
ldquoSurvivorrdquo
Figure 6 Successful amplification of microsatellites in P monodon samples Leftmost marker lengths lane 1 ladder lanes 2 to 4 controlshrimps lanes 5 to 7 ldquosurvivorrdquo shrimps lanes 8 negative control rightmost microsatellite identifier
Dataset Papers in Science 7
Table 1 Primers used in RT-PCR
Product Identifier 51015840 to 31015840 sequence Melting temp
170 isotig001639 Forward TCAAAGTTAGCTCCTAGACCA 52Reverse AGCTCCTACTCCTGTTTCTGT 55
146 isotig001684 Forward GCCTCCTATAAACGCTAAAAT 50Reverse ATTACCCAGAATTCACTGGTT 51
325 isotig00854 Forward CTGCAGGTCATGCTCAACG 60Reverse AGCTCCCACACATCAAGGG 60
314 isotig01256 Forward ACGACTCGAAGGAAGGCTC 60Reverse TCCCGAATGTCCGTTCTCC 60
154 HGUVU5Q01ANPJ4 Forward TCGCGACGAGTACATGGG 60Reverse GTCTTCACTGTATTCAATTGGCTG 60
268 Hemocyanin Forward ACAGTCTCCCTCCATACACCGTGG 63Reverse AGGCCAGGCGAAAACGCGAA 64
279 Ferritin Forward GCCCGGGCAGGTACTCTCCA 66Reverse TGGCCGCATCGTTCTCCAGC 64
223 Fortilin-binding protein Forward GCTCGCAGTTAACCTCCTCTGCC 63Reverse GGCGTCCACCGTGCAAGAAGT 64
Table 2 A list of database reads sequences and other components with their corresponding frequencies
Number ofreads
Number ofbases Assembled Partial Singleton Repeat Outlier Too short
240616 76099104 125598 91137 9526 1597 10108 2650Number ofisogroups
Averagecontig count
Largestcontig count
Number withone contig
Averageisotig count
Largest isotigcount
Number withone isotig
3724 2 2813 3573 1 13 3577Number ofisotigs
Averagecontig count
Largestcontig count
Number withone contig
Number ofbases
Averageisotig size
N50 isotigsize Largest isotig size
4090 1 6 3732 1635969 400 504 1921Number ofsingletons Valid Short LowQual Dust ShortQ Valid (after
2nd cleaning) Trashed
9526 5771 3489 1 73 192 5507 264
Table 3 Isotigs GO analysis result
GO analysis resultBiological process 1686118225Cellular component 1276988942Molecular function 1916892833No hits 3602
Column 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Item 2 (Table) A list of singletons and BLASTresults The column Query Name presents the singleton codedesignated by the assembly program Query Length the
Table 4 Singletons GO analysis result
GO analysis resultBiological process 1014409221Cellular component 9316145113Molecular function 1413976268No hits 5171
length of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignmentin query Query Coverage the percentage of query coveredby alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignment
8 Dataset Papers in Science
Control
Hemocyanin
Ferritin
Fortilin-bindingprotein
lowast
lowast
0
500
0
500
0
500
0
500
Control
Control
Control
Control
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
ldquoSurvivorrdquo
ldquoSurvivorrdquo
ldquoSurvivorrdquoldquoSurvivorrdquo
ldquoSurvivorrdquo
120573-Actin
Figure 7 Successful amplification of immune-related genes in P monodon samples Lane 1 ladder lanes 2 to 4 control shrimps lanes 5 to7 ldquosurvivorrdquo shrimps lane 8 negative control Asterisk indicates significant difference between 120573-actin normalized fluorescence scores ofcontrol and ldquosurvivorrdquo shrimps based on 119905-test at 95 confidence interval
bit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject Description
Column 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal IdentitiesColumn 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
2 Dataset Papers in Science
ldquoSurvivorrdquo ldquoApparently healthyrdquo
PM04
PM05
PM06
PM07
PM08
PM25
PM26
PM27
PM28
PM29
16S
WSSV
Figure 1 PCR-based detection of White spot syndrome virus in allPenaeus monodon samples Top internal controls bottom lane 1ladder lane 2 negative control lane 3 positive control lanes 4ndash8survivor samples lanes 9ndash13 control shrimp samples
were further characterized by expression or functional ana-lyses [1 8 9 12 13] Indeed its usefulness has beenproven in many aspects of shrimp health Recent transcrip-tome approaches such as microarrays and next-generationsequencing (NGS) have also provided efficient applicationsfor rapid determination of quantitative trait loci and geneticmarkers from immune-related genes [1 14ndash16] because theseapproaches provide longer average sequence reads of 600base pairs (bp) than traditional ones The markers generatedfrom these platforms may be used for a lot of geneticstudies integral to taxonomy or aquaculture In fact thelatest breeding technologies in plants make use of next-generation sequencing data for rapid marker development[17] In shrimp traditional selective breeding programs havealready been useful for producing WSSV resistant Penaeusvannamei [1 18ndash21] However this classical approach entailshigh-input requirements of capital time and research effortAs seen in plants marker-assisted selection [22] a modifiedapproach in traditional selective breeding proves successfulIt is thus the goal of this study to create a database of genesfrom which candidate markers may be mined and used inmarker-assisted selection (MAS) of WSSV-resistant or toler-ant P monodon Suppression subtractive hybridization (SSH)coupled with NGS was used in this study to generate thedesired database We show that this platform can be used tofurther study genes that may have relevant function inWSSVimmunity Likewise we identified resistance-related markerssuch as microsatellites that may be amplified and analyzedfor downstream genetic applications such as semiquantitativeexpression analysis
2 Methodology
Sample collection was as follows Shrimps for challengeexperiments were collected from a farm located in NegrosOccidental Philippines Shrimp fry used to stock the farmcame from a hatchery that practices biosecurity monitoringBefore they were transported to a biosecure facility inSEAFDEC AQD samples were tested for WSSV and otherviruses Subsequently they were quarantined in this facilityfor one month as they were being acclimated to tank condi-tions
For WSSV challenge an in vivo titration test by intra-muscular injection of the viral inoculum was conducted todetermine the dose that resulted in 50 mortality (SID
50)
of test shrimps (155 days old) To do this gills frommoribund shrimp were collected and homogenized in PBSThe homogenate was centrifuged at 5000 rpm The result-ing supernatant was filtered in 045 120583m sterile membranefilter serially diluted (10minus3 to 10minus9) in PBS and 100 120583L wasinjected into the abdominal muscle region of each shrimpMortality was observed for 15 days The computed SID
50
was 10minus645mLminus01 and the titer of the undiluted supernatantwas 10745 SID
50mLminus1 The challenge test was conducted by
injecting each shrimp with 1 SID50of the inoculum following
the same protocol used in the in vivo titration test Survivingshrimps from this challenge were kept for genetic analysisTissue sampling from surviving shrimps (survivor) was doneafter 30 days of challenge test termination
DNA extraction and WSSV detection were as followsA total of 10 representative P monodon samples were col-lected Five shrimps that survived viral challenge experiments(30 days after infection) were labeled as ldquosurvivorrdquo whilefive apparently healthy shrimps (no gross signs of disease)were labeled as ldquocontrolrdquo Extraction of DNA from gillswas performed using CTAB extraction buffer as describedby Santos et al [23] Spectrophotometry of DNA extractswas performed before proceeding to amplification (PCR)experiments A total of 10 120583L of reaction mixture was usedcontaining 2 120583L of DNA template 02mM dNTP 2 120583M ofeach primer [24 25] and 1x buffer already containing 15mMMgCl
2and amplified under the following conditions initial
denaturation at 95∘C for 5 minutes followed by 30 cycles ofdenaturation at 95∘C for 30 seconds annealing at 52∘C for30 seconds and extension at 72∘C for another 30 secondsand a final extension for 5 minutes and storage at 4∘CAll DNA extracts were confirmed to contain DNA by PCRamplification using universal 16Sar primers [26] Based on thePCR detection method described none of the samples waspositive for WSSV (Figure 1)
Library preparation was as follows Suppression subtrac-tive hybridization (SSH) is one of the most powerful andpopular methods for generating subtracted cDNA librariesbecause the probability of obtaining low-abundance dif-ferentially expressed cDNA fragments is very high It isa suppression PCR technique that uses normalization andsubtraction in a single procedure It was used in this studybecause downstream analyses such as filtering after next-generation sequencing of the subtracted library are simplifiedin this platform
RNA extracted from gills using Nucleospin RNA IIkit (Machery-Nagel) was subjected to spectrophotometryfor the determination of purity and concentration cDNAwas synthesized from the RNA extracts using ClonetechSMARTer cDNA synthesis kit Immediately after cDNA wasamplified using Clonetech Advantage II PCR kit throughlong distance-PCR (ld-PCR) following the recommendedprotocol Incubation time was increased from 3 minutes to6 minutes to ensure synthesis and amplification of longercDNA transcripts The cDNA templates with highest qualitywere used in the subtraction experiment
Dataset Papers in Science 3
1 2 3 4 5
L FU FS RU RS
Figure 2 Evaluation of the subtraction efficiency of the subtractedcDNAs PCRs with 120573-actin-specific primers were performed usingforward (lanes 2 and 3) and reverse (lanes 4 and 5) SSH productsLane 1 ladder lanes 2 and 4 unsubtracted lanes 3 and 5 subtracted
cDNA subtraction was carried out using Clonetech PCR-Select cDNA Subtraction Kit in accordance with the manu-facturerrsquos protocol For forward subtraction cDNA synthe-sized from shrimp sample that survived viral challenge (sur-vivor) was used as driver while cDNA from control shrimpwas used as tester Forward and reverse SSH reactions wereperformed simultaneously Tester and driver cDNA popula-tionswere briefly digestedwithRsaI and ligated eachwith oneof the two adaptors (adaptor 1 and adaptor 2R) included inthe kit After denaturation at 98∘C for 90 s tester cDNAswereligated with the adaptor 1 and adaptor 2R separately and thenhybridized with an excess of driver cDNAs at 68∘C for 8 hAfter the first hybridization the two products (adaptor 1 andadaptor 2R)weremixed together and immediately hybridizedagain with an excess of freshly denatured driver cDNA for20 h at 68∘C Differentially expressed products were thenselectively amplified by two rounds of PCR using ClonetechAdvantage II PCR kit with primers specific to the twoadaptors resulting in suppression of genes common to thetwo cDNA populations and enrichment of the differentiallyexpressed genes The efficiency of the subtraction was finallyevaluated by comparing the abundance of the constitutivelyexpressed-actin in the subtracted and unsubtracted productsby PCR as described by Zhao et al [1] Figure 2 shows thatafter subtraction the constitutive abundance of 120573-actin wasefficiently reduced after SSH
Next-generation sequencing and bioinformatics were asfollows The forward subtracted product from SSH was sentto Macrogen Inc Korea (httpwwwmacrogencomeng)for 454 sequencing of 14 plate De novo assembly of theacquired reads was conducted using GS De Novo Assem-bler v 26 (Newbler) with cDNAtranscriptome optionNewbler parameters such as base calling quality calling andassembly quality filters were set to default Singletons werecleaned by SeqClean (httpsourceforgenetprojectsseq-clean) and further assessed and controlled by Lucy (httplucysourceforgenet) with minimum length filter of 100 bpA complete workflow of the study is exhibited in Figure 3
As seen in Table 2 a total of 76889928 bases whichcomprised 240616 reads were generated fromNGS Approx-imately 52 of the reads were assembled to form 125598sequences with an average read length of 319 ranging from
100 to 600 base pairs De novo assembly generated 4090contigs with N50 of 504 Likewise 9526 singletons wereproduced leaving out 10108 outliers and 2650 too shortsequences Two phases of sequence cleaning using SeqCleanand Lucy generated 5507 singletons completing a totalof 9597 This number of unique sequences is comparableto those stored in Penaeus monodon Genome Database(httpsysbioiissinicaedutwpageindexphp) and the Pen-aeus monodon gene discovery project [27] The genome sizeof tiger shrimp is not yet fully determined but the 1119862 nuclearDNA content of P monodon was estimated to be sim722 ofthe human genome that is sim253 pg DNA per nucleus or217times10
9 bp per haploid genome [28] In Table 2 the first partpresents Reads Status where the number of reads counts thereads used in the assembly computation number of bases thereadrsquos bases used in the assembly computation assembled thenumber of reads that are fully incorporated into the assemblypartial the number of reads with only a part being includedin the assembly singleton the number of reads that didnot overlap with any other reads in the input repeat thenumber of reads deemed to be from repeat regions outlierthe number of reads that were identified by the GS De NovoAssembler as problematic and too short the number of readsthat were too short to be used in the computationThe secondpart of the table presents the frequency information onisogroups or collection of contigs containing reads that implyconnections between them The number of isogroups corre-sponds to the number of isogroups identified average contigcount the average count of contigs in the isogroups largestcontig count the largest count of contigs in the isogroupsnumber with one contig the number of isogroups assembledby one contig average isotig count the average count ofisotigs in the isogroups largest isotig count the largest countof isotigs in the isogroups number with one isotig thenumber of isogroups assembled by one isotig The third partpresents the frequency information of isotigs (analogous toan individual transcript) The number of isotigs correspondto the number of contigs identified average contig countthe average count of contigs in the isotigs largest contigcount the largest count of contigs in the isotigs number withone contig the number of isotigs assembled by one contignumber of bases the total number of bases in the isotigaverage isotig size the average isotig size N50 isotig size theN50 isotig size (an N50 means that half of all bases reside inisotigs of this size or longer) largest isotig size the size ofthe largest isotig The last part in the table presents the fre-quency information of singletons The number of singletonscorresponds to the number of singletons identified valid thenumber of valid singletons after SeqClean short the numberof sequences less than 100 bp LowQual undetermined basegt3 in clear range dust standalone by low complexity (dust)filter lt40 nt ShortQ trimmed sequence less than 100 bpvalid (after second cleaning) the number of valid singletonsafter Lucy trashed the number of low-quality sequences
Processed reads and generated contigs were aligned to thesequences publicly available in the NCBI GenBank databaseSeparate BLAST analyses were done for isotigs (Dataset Item1 (Table)) and singletons (Dataset Item 2 (Table)) Of thesequences generated from NGS 598 (1462) had significant
4 Dataset Papers in Science
Viral challenge
RNA extraction
cDNA synthesis
Raw images
Image processing
(CWF file)
Signal processing
(SFF file)
Raw data(SFF file)
Assembly
(cDNA option)
Singleton
cleaning
BLAST GO andother analyses
Databasearchiving and
upload
RT-PCR
Gel
electrophoresis
Suppressionsubtractive
hybridization
454 sequencing
Library
preparation
Data
processingAnalysis
Marker
amplification
Figure 3 Experiment workflow
matches (le10119864 minus 3) to the nonredundant (nr) nucleotidedatabase of GenBank after BLAST analysis with defaultoptions Twenty-six (26) sequences (425) of those withBLASTn hits had 119864-values of le10119864 minus 100 and thus wereconsidered to have highly significant homology [29] About296 sequences (4950) had homology values between 10119864minus99 and 10119864 minus 20 which were considered as moderatelysimilar to the sequences in GenBank The remaining 276isotigs (4615) had weak homologies or similarities (119864-values between 10119864minus19 and 10119864minus3) Meanwhile out of the5507 cleaned singletons 414 had significant hits to GenBanksequences One hundred sixty-one (161) (3889) of thoseannotated had moderate similarity to the gene sequences inGenBank Out of the combined 1012 BLASTn homologoussequences 9 were similar to Daphnia pulex nucleotidesequences (Figure 4)
The remaining majority had significant homologies toseveral shrimp species such as Penaeus monodon (6)Litopenaeus vannamei (3) Marsupenaeus japonicus (3)and Fenneropenaeus chinensis (2) Dataset Item 3 (Table)lists the database sequences with homology to P monodongenes in the GenBank nucleotide database Resulting geneontology (GO) assignment showed that 88 of the totalisotigs (Dataset Item 4 (Table)) and 94 of the singletons(Dataset Item 5 (Table)) had no significant hits to the Gen-bank GO database Tables 3 and 4 show the isotigs and sin-gletons GO analysis results respectively 488 isotigs and 336singletons were assigned and distributed to different cate-gories based on gene ontology Complete lists of isotigand singleton BLASTx or GO assignments are exhibited in
Dataset Item 6 (Table) and Dataset Item 7 (Table) respect-ively
In terms of similarities to model organisms the majorityof the 824 GO assigned sequences (40) had significanthits to genes of Drosophila melanogaster Genes from Daniorerio also had homologies to 23 of the sequences Theremaining sequences were significantly homologous to thegenes of Rattus norvegicus (14) Mus musculus (10) andother species (Figure 5)
Microsatellite mining was as follows Using msatcom-mander with search options for trinucleotide repeats orhigher 25 microsatellite markers were identified in DatasetItem 8 (Table) Five (20) of these have BLASTn matchesThe homologous genes however were putative or predictedSome of these were amplified using RT-PCR with the fol-lowing mixture 120583L 1 cDNA template in 1x PCR buffer with2mM MgCl
2 02mM dNTPs 05120583M forward and reverse
primers 1 unit Taq and distilled water The PCR mix wasrun in a thermal cycler with the following conditions 95∘Cfor 5 minutes 30 cycles of 95∘C for 30 seconds 52∘C for 30seconds 72∘C for 30 seconds and a final extension of 72∘Cfor 5 minutes Successfully amplified microsatellite markersare shown in Figure 6
The successful amplification of these microsatellites indi-cates that repetitive segments may also be found in thetranscriptome and they may be used as genetic markersrelating to desired phenotype Once confirmed to be poly-morphic and after further field testing using genomic DNAthese microsatellite markers may eventually be established asmarkers for WSSV resistance in shrimp
Dataset Papers in Science 5
224 other species 74es 74
Penaeus monodon 6Litopenaeus vannamei 3
Marsupenaeus japonicus 2Fenneropenaeus chinensis 2
Callinectes sapidus 2
Daphnia pulex 9
Procambarus clarkii 1
Figure 4 Organisms with most nucleotide hits the proportion of unique sequences with BLASTn matches (119864-value le10minus3) in thenonredundant nucleotide database classified according to the organism with the ldquotop hitrdquo nucleotide sequence
Semiquantitative expression analysis was as follows Theconcentration and purity of the RNA extracts were deter-mined by spectrophotometry prior to cDNA synthesis andRT-PCR All extracts were found to contain quality RNA (OD260280 of 18 to 20) Gene expression was analyzed usingsemiquantitative PCR PCRmix was prepared using the sameconcentrations as described in microsatellite amplificationRunning conditions however were modified into 25 cycleswith a 50∘C cycle annealing temperature After the run 2 120583Lof each ampliconwasmixedwith 2 120583Lof loading dye and thenelectrophoresed in 1 agarose gel Relative expression wasdetermined based on the fluorescence of 2 120583L DNA ladderloaded on the same gel and computed using BioDocAnalyzeSoftware bundled with the Biometra Gel Documentation sys-tem Ferritin and fortilin-binding protein whose full-lengthcoding regions were found in the library were successfullyamplified Differential expression of these genes was clearlyobserved in the semiquantitative PCR profiles (Figure 7)120573-Actin was used as negative control [1] while hemo-
cyanin a known differentially expressed gene in virus resis-tant shrimps [13] was used as a positive control It can beseen that fortilin-binding protein is significantly upregulatedin ldquosurvivorrdquo shrimps Primers and annealing temperaturesused were listed in Table 1
3 Dataset Description
The dataset associated with this Dataset Paper consists of 8items which are described as follows
Dataset Item 1 (Table) A list of isotigs and BLAST resultsThe column Query Name presents the isotig number desig-nated by the assembly program Query Length the lengthof isotig sequence in base pairs Query Start the start ofalignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered
by alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignmentbit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal Identities
6 Dataset Papers in Science
Others
5
Drosophila
melanogaster 40
Danio rerio 23
Rattus norvegicus 14
Mus musculus 10
Caenorhabditis elegans 3Candida albicans 3Arabidopsis thaliana 1
Saccharomyces cerevisiae 1
Figure 5Organismswithmost protein hits the proportion of putative transcriptswith BLASTxmatches (119864-valuele10minus3) in the nonredundantprotein database classified according to the organism with the ldquotop hitrdquo protein sequence
1 2 3 4 5 6 7 8
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
isotig01639
isotig01684
isotig00854
isotig01256
Control
HGUVU5Q01ANPJ4
ldquoSurvivorrdquo
Figure 6 Successful amplification of microsatellites in P monodon samples Leftmost marker lengths lane 1 ladder lanes 2 to 4 controlshrimps lanes 5 to 7 ldquosurvivorrdquo shrimps lanes 8 negative control rightmost microsatellite identifier
Dataset Papers in Science 7
Table 1 Primers used in RT-PCR
Product Identifier 51015840 to 31015840 sequence Melting temp
170 isotig001639 Forward TCAAAGTTAGCTCCTAGACCA 52Reverse AGCTCCTACTCCTGTTTCTGT 55
146 isotig001684 Forward GCCTCCTATAAACGCTAAAAT 50Reverse ATTACCCAGAATTCACTGGTT 51
325 isotig00854 Forward CTGCAGGTCATGCTCAACG 60Reverse AGCTCCCACACATCAAGGG 60
314 isotig01256 Forward ACGACTCGAAGGAAGGCTC 60Reverse TCCCGAATGTCCGTTCTCC 60
154 HGUVU5Q01ANPJ4 Forward TCGCGACGAGTACATGGG 60Reverse GTCTTCACTGTATTCAATTGGCTG 60
268 Hemocyanin Forward ACAGTCTCCCTCCATACACCGTGG 63Reverse AGGCCAGGCGAAAACGCGAA 64
279 Ferritin Forward GCCCGGGCAGGTACTCTCCA 66Reverse TGGCCGCATCGTTCTCCAGC 64
223 Fortilin-binding protein Forward GCTCGCAGTTAACCTCCTCTGCC 63Reverse GGCGTCCACCGTGCAAGAAGT 64
Table 2 A list of database reads sequences and other components with their corresponding frequencies
Number ofreads
Number ofbases Assembled Partial Singleton Repeat Outlier Too short
240616 76099104 125598 91137 9526 1597 10108 2650Number ofisogroups
Averagecontig count
Largestcontig count
Number withone contig
Averageisotig count
Largest isotigcount
Number withone isotig
3724 2 2813 3573 1 13 3577Number ofisotigs
Averagecontig count
Largestcontig count
Number withone contig
Number ofbases
Averageisotig size
N50 isotigsize Largest isotig size
4090 1 6 3732 1635969 400 504 1921Number ofsingletons Valid Short LowQual Dust ShortQ Valid (after
2nd cleaning) Trashed
9526 5771 3489 1 73 192 5507 264
Table 3 Isotigs GO analysis result
GO analysis resultBiological process 1686118225Cellular component 1276988942Molecular function 1916892833No hits 3602
Column 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Item 2 (Table) A list of singletons and BLASTresults The column Query Name presents the singleton codedesignated by the assembly program Query Length the
Table 4 Singletons GO analysis result
GO analysis resultBiological process 1014409221Cellular component 9316145113Molecular function 1413976268No hits 5171
length of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignmentin query Query Coverage the percentage of query coveredby alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignment
8 Dataset Papers in Science
Control
Hemocyanin
Ferritin
Fortilin-bindingprotein
lowast
lowast
0
500
0
500
0
500
0
500
Control
Control
Control
Control
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
ldquoSurvivorrdquo
ldquoSurvivorrdquo
ldquoSurvivorrdquoldquoSurvivorrdquo
ldquoSurvivorrdquo
120573-Actin
Figure 7 Successful amplification of immune-related genes in P monodon samples Lane 1 ladder lanes 2 to 4 control shrimps lanes 5 to7 ldquosurvivorrdquo shrimps lane 8 negative control Asterisk indicates significant difference between 120573-actin normalized fluorescence scores ofcontrol and ldquosurvivorrdquo shrimps based on 119905-test at 95 confidence interval
bit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject Description
Column 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal IdentitiesColumn 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
Dataset Papers in Science 3
1 2 3 4 5
L FU FS RU RS
Figure 2 Evaluation of the subtraction efficiency of the subtractedcDNAs PCRs with 120573-actin-specific primers were performed usingforward (lanes 2 and 3) and reverse (lanes 4 and 5) SSH productsLane 1 ladder lanes 2 and 4 unsubtracted lanes 3 and 5 subtracted
cDNA subtraction was carried out using Clonetech PCR-Select cDNA Subtraction Kit in accordance with the manu-facturerrsquos protocol For forward subtraction cDNA synthe-sized from shrimp sample that survived viral challenge (sur-vivor) was used as driver while cDNA from control shrimpwas used as tester Forward and reverse SSH reactions wereperformed simultaneously Tester and driver cDNA popula-tionswere briefly digestedwithRsaI and ligated eachwith oneof the two adaptors (adaptor 1 and adaptor 2R) included inthe kit After denaturation at 98∘C for 90 s tester cDNAswereligated with the adaptor 1 and adaptor 2R separately and thenhybridized with an excess of driver cDNAs at 68∘C for 8 hAfter the first hybridization the two products (adaptor 1 andadaptor 2R)weremixed together and immediately hybridizedagain with an excess of freshly denatured driver cDNA for20 h at 68∘C Differentially expressed products were thenselectively amplified by two rounds of PCR using ClonetechAdvantage II PCR kit with primers specific to the twoadaptors resulting in suppression of genes common to thetwo cDNA populations and enrichment of the differentiallyexpressed genes The efficiency of the subtraction was finallyevaluated by comparing the abundance of the constitutivelyexpressed-actin in the subtracted and unsubtracted productsby PCR as described by Zhao et al [1] Figure 2 shows thatafter subtraction the constitutive abundance of 120573-actin wasefficiently reduced after SSH
Next-generation sequencing and bioinformatics were asfollows The forward subtracted product from SSH was sentto Macrogen Inc Korea (httpwwwmacrogencomeng)for 454 sequencing of 14 plate De novo assembly of theacquired reads was conducted using GS De Novo Assem-bler v 26 (Newbler) with cDNAtranscriptome optionNewbler parameters such as base calling quality calling andassembly quality filters were set to default Singletons werecleaned by SeqClean (httpsourceforgenetprojectsseq-clean) and further assessed and controlled by Lucy (httplucysourceforgenet) with minimum length filter of 100 bpA complete workflow of the study is exhibited in Figure 3
As seen in Table 2 a total of 76889928 bases whichcomprised 240616 reads were generated fromNGS Approx-imately 52 of the reads were assembled to form 125598sequences with an average read length of 319 ranging from
100 to 600 base pairs De novo assembly generated 4090contigs with N50 of 504 Likewise 9526 singletons wereproduced leaving out 10108 outliers and 2650 too shortsequences Two phases of sequence cleaning using SeqCleanand Lucy generated 5507 singletons completing a totalof 9597 This number of unique sequences is comparableto those stored in Penaeus monodon Genome Database(httpsysbioiissinicaedutwpageindexphp) and the Pen-aeus monodon gene discovery project [27] The genome sizeof tiger shrimp is not yet fully determined but the 1119862 nuclearDNA content of P monodon was estimated to be sim722 ofthe human genome that is sim253 pg DNA per nucleus or217times10
9 bp per haploid genome [28] In Table 2 the first partpresents Reads Status where the number of reads counts thereads used in the assembly computation number of bases thereadrsquos bases used in the assembly computation assembled thenumber of reads that are fully incorporated into the assemblypartial the number of reads with only a part being includedin the assembly singleton the number of reads that didnot overlap with any other reads in the input repeat thenumber of reads deemed to be from repeat regions outlierthe number of reads that were identified by the GS De NovoAssembler as problematic and too short the number of readsthat were too short to be used in the computationThe secondpart of the table presents the frequency information onisogroups or collection of contigs containing reads that implyconnections between them The number of isogroups corre-sponds to the number of isogroups identified average contigcount the average count of contigs in the isogroups largestcontig count the largest count of contigs in the isogroupsnumber with one contig the number of isogroups assembledby one contig average isotig count the average count ofisotigs in the isogroups largest isotig count the largest countof isotigs in the isogroups number with one isotig thenumber of isogroups assembled by one isotig The third partpresents the frequency information of isotigs (analogous toan individual transcript) The number of isotigs correspondto the number of contigs identified average contig countthe average count of contigs in the isotigs largest contigcount the largest count of contigs in the isotigs number withone contig the number of isotigs assembled by one contignumber of bases the total number of bases in the isotigaverage isotig size the average isotig size N50 isotig size theN50 isotig size (an N50 means that half of all bases reside inisotigs of this size or longer) largest isotig size the size ofthe largest isotig The last part in the table presents the fre-quency information of singletons The number of singletonscorresponds to the number of singletons identified valid thenumber of valid singletons after SeqClean short the numberof sequences less than 100 bp LowQual undetermined basegt3 in clear range dust standalone by low complexity (dust)filter lt40 nt ShortQ trimmed sequence less than 100 bpvalid (after second cleaning) the number of valid singletonsafter Lucy trashed the number of low-quality sequences
Processed reads and generated contigs were aligned to thesequences publicly available in the NCBI GenBank databaseSeparate BLAST analyses were done for isotigs (Dataset Item1 (Table)) and singletons (Dataset Item 2 (Table)) Of thesequences generated from NGS 598 (1462) had significant
4 Dataset Papers in Science
Viral challenge
RNA extraction
cDNA synthesis
Raw images
Image processing
(CWF file)
Signal processing
(SFF file)
Raw data(SFF file)
Assembly
(cDNA option)
Singleton
cleaning
BLAST GO andother analyses
Databasearchiving and
upload
RT-PCR
Gel
electrophoresis
Suppressionsubtractive
hybridization
454 sequencing
Library
preparation
Data
processingAnalysis
Marker
amplification
Figure 3 Experiment workflow
matches (le10119864 minus 3) to the nonredundant (nr) nucleotidedatabase of GenBank after BLAST analysis with defaultoptions Twenty-six (26) sequences (425) of those withBLASTn hits had 119864-values of le10119864 minus 100 and thus wereconsidered to have highly significant homology [29] About296 sequences (4950) had homology values between 10119864minus99 and 10119864 minus 20 which were considered as moderatelysimilar to the sequences in GenBank The remaining 276isotigs (4615) had weak homologies or similarities (119864-values between 10119864minus19 and 10119864minus3) Meanwhile out of the5507 cleaned singletons 414 had significant hits to GenBanksequences One hundred sixty-one (161) (3889) of thoseannotated had moderate similarity to the gene sequences inGenBank Out of the combined 1012 BLASTn homologoussequences 9 were similar to Daphnia pulex nucleotidesequences (Figure 4)
The remaining majority had significant homologies toseveral shrimp species such as Penaeus monodon (6)Litopenaeus vannamei (3) Marsupenaeus japonicus (3)and Fenneropenaeus chinensis (2) Dataset Item 3 (Table)lists the database sequences with homology to P monodongenes in the GenBank nucleotide database Resulting geneontology (GO) assignment showed that 88 of the totalisotigs (Dataset Item 4 (Table)) and 94 of the singletons(Dataset Item 5 (Table)) had no significant hits to the Gen-bank GO database Tables 3 and 4 show the isotigs and sin-gletons GO analysis results respectively 488 isotigs and 336singletons were assigned and distributed to different cate-gories based on gene ontology Complete lists of isotigand singleton BLASTx or GO assignments are exhibited in
Dataset Item 6 (Table) and Dataset Item 7 (Table) respect-ively
In terms of similarities to model organisms the majorityof the 824 GO assigned sequences (40) had significanthits to genes of Drosophila melanogaster Genes from Daniorerio also had homologies to 23 of the sequences Theremaining sequences were significantly homologous to thegenes of Rattus norvegicus (14) Mus musculus (10) andother species (Figure 5)
Microsatellite mining was as follows Using msatcom-mander with search options for trinucleotide repeats orhigher 25 microsatellite markers were identified in DatasetItem 8 (Table) Five (20) of these have BLASTn matchesThe homologous genes however were putative or predictedSome of these were amplified using RT-PCR with the fol-lowing mixture 120583L 1 cDNA template in 1x PCR buffer with2mM MgCl
2 02mM dNTPs 05120583M forward and reverse
primers 1 unit Taq and distilled water The PCR mix wasrun in a thermal cycler with the following conditions 95∘Cfor 5 minutes 30 cycles of 95∘C for 30 seconds 52∘C for 30seconds 72∘C for 30 seconds and a final extension of 72∘Cfor 5 minutes Successfully amplified microsatellite markersare shown in Figure 6
The successful amplification of these microsatellites indi-cates that repetitive segments may also be found in thetranscriptome and they may be used as genetic markersrelating to desired phenotype Once confirmed to be poly-morphic and after further field testing using genomic DNAthese microsatellite markers may eventually be established asmarkers for WSSV resistance in shrimp
Dataset Papers in Science 5
224 other species 74es 74
Penaeus monodon 6Litopenaeus vannamei 3
Marsupenaeus japonicus 2Fenneropenaeus chinensis 2
Callinectes sapidus 2
Daphnia pulex 9
Procambarus clarkii 1
Figure 4 Organisms with most nucleotide hits the proportion of unique sequences with BLASTn matches (119864-value le10minus3) in thenonredundant nucleotide database classified according to the organism with the ldquotop hitrdquo nucleotide sequence
Semiquantitative expression analysis was as follows Theconcentration and purity of the RNA extracts were deter-mined by spectrophotometry prior to cDNA synthesis andRT-PCR All extracts were found to contain quality RNA (OD260280 of 18 to 20) Gene expression was analyzed usingsemiquantitative PCR PCRmix was prepared using the sameconcentrations as described in microsatellite amplificationRunning conditions however were modified into 25 cycleswith a 50∘C cycle annealing temperature After the run 2 120583Lof each ampliconwasmixedwith 2 120583Lof loading dye and thenelectrophoresed in 1 agarose gel Relative expression wasdetermined based on the fluorescence of 2 120583L DNA ladderloaded on the same gel and computed using BioDocAnalyzeSoftware bundled with the Biometra Gel Documentation sys-tem Ferritin and fortilin-binding protein whose full-lengthcoding regions were found in the library were successfullyamplified Differential expression of these genes was clearlyobserved in the semiquantitative PCR profiles (Figure 7)120573-Actin was used as negative control [1] while hemo-
cyanin a known differentially expressed gene in virus resis-tant shrimps [13] was used as a positive control It can beseen that fortilin-binding protein is significantly upregulatedin ldquosurvivorrdquo shrimps Primers and annealing temperaturesused were listed in Table 1
3 Dataset Description
The dataset associated with this Dataset Paper consists of 8items which are described as follows
Dataset Item 1 (Table) A list of isotigs and BLAST resultsThe column Query Name presents the isotig number desig-nated by the assembly program Query Length the lengthof isotig sequence in base pairs Query Start the start ofalignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered
by alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignmentbit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal Identities
6 Dataset Papers in Science
Others
5
Drosophila
melanogaster 40
Danio rerio 23
Rattus norvegicus 14
Mus musculus 10
Caenorhabditis elegans 3Candida albicans 3Arabidopsis thaliana 1
Saccharomyces cerevisiae 1
Figure 5Organismswithmost protein hits the proportion of putative transcriptswith BLASTxmatches (119864-valuele10minus3) in the nonredundantprotein database classified according to the organism with the ldquotop hitrdquo protein sequence
1 2 3 4 5 6 7 8
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
isotig01639
isotig01684
isotig00854
isotig01256
Control
HGUVU5Q01ANPJ4
ldquoSurvivorrdquo
Figure 6 Successful amplification of microsatellites in P monodon samples Leftmost marker lengths lane 1 ladder lanes 2 to 4 controlshrimps lanes 5 to 7 ldquosurvivorrdquo shrimps lanes 8 negative control rightmost microsatellite identifier
Dataset Papers in Science 7
Table 1 Primers used in RT-PCR
Product Identifier 51015840 to 31015840 sequence Melting temp
170 isotig001639 Forward TCAAAGTTAGCTCCTAGACCA 52Reverse AGCTCCTACTCCTGTTTCTGT 55
146 isotig001684 Forward GCCTCCTATAAACGCTAAAAT 50Reverse ATTACCCAGAATTCACTGGTT 51
325 isotig00854 Forward CTGCAGGTCATGCTCAACG 60Reverse AGCTCCCACACATCAAGGG 60
314 isotig01256 Forward ACGACTCGAAGGAAGGCTC 60Reverse TCCCGAATGTCCGTTCTCC 60
154 HGUVU5Q01ANPJ4 Forward TCGCGACGAGTACATGGG 60Reverse GTCTTCACTGTATTCAATTGGCTG 60
268 Hemocyanin Forward ACAGTCTCCCTCCATACACCGTGG 63Reverse AGGCCAGGCGAAAACGCGAA 64
279 Ferritin Forward GCCCGGGCAGGTACTCTCCA 66Reverse TGGCCGCATCGTTCTCCAGC 64
223 Fortilin-binding protein Forward GCTCGCAGTTAACCTCCTCTGCC 63Reverse GGCGTCCACCGTGCAAGAAGT 64
Table 2 A list of database reads sequences and other components with their corresponding frequencies
Number ofreads
Number ofbases Assembled Partial Singleton Repeat Outlier Too short
240616 76099104 125598 91137 9526 1597 10108 2650Number ofisogroups
Averagecontig count
Largestcontig count
Number withone contig
Averageisotig count
Largest isotigcount
Number withone isotig
3724 2 2813 3573 1 13 3577Number ofisotigs
Averagecontig count
Largestcontig count
Number withone contig
Number ofbases
Averageisotig size
N50 isotigsize Largest isotig size
4090 1 6 3732 1635969 400 504 1921Number ofsingletons Valid Short LowQual Dust ShortQ Valid (after
2nd cleaning) Trashed
9526 5771 3489 1 73 192 5507 264
Table 3 Isotigs GO analysis result
GO analysis resultBiological process 1686118225Cellular component 1276988942Molecular function 1916892833No hits 3602
Column 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Item 2 (Table) A list of singletons and BLASTresults The column Query Name presents the singleton codedesignated by the assembly program Query Length the
Table 4 Singletons GO analysis result
GO analysis resultBiological process 1014409221Cellular component 9316145113Molecular function 1413976268No hits 5171
length of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignmentin query Query Coverage the percentage of query coveredby alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignment
8 Dataset Papers in Science
Control
Hemocyanin
Ferritin
Fortilin-bindingprotein
lowast
lowast
0
500
0
500
0
500
0
500
Control
Control
Control
Control
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
ldquoSurvivorrdquo
ldquoSurvivorrdquo
ldquoSurvivorrdquoldquoSurvivorrdquo
ldquoSurvivorrdquo
120573-Actin
Figure 7 Successful amplification of immune-related genes in P monodon samples Lane 1 ladder lanes 2 to 4 control shrimps lanes 5 to7 ldquosurvivorrdquo shrimps lane 8 negative control Asterisk indicates significant difference between 120573-actin normalized fluorescence scores ofcontrol and ldquosurvivorrdquo shrimps based on 119905-test at 95 confidence interval
bit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject Description
Column 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal IdentitiesColumn 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
4 Dataset Papers in Science
Viral challenge
RNA extraction
cDNA synthesis
Raw images
Image processing
(CWF file)
Signal processing
(SFF file)
Raw data(SFF file)
Assembly
(cDNA option)
Singleton
cleaning
BLAST GO andother analyses
Databasearchiving and
upload
RT-PCR
Gel
electrophoresis
Suppressionsubtractive
hybridization
454 sequencing
Library
preparation
Data
processingAnalysis
Marker
amplification
Figure 3 Experiment workflow
matches (le10119864 minus 3) to the nonredundant (nr) nucleotidedatabase of GenBank after BLAST analysis with defaultoptions Twenty-six (26) sequences (425) of those withBLASTn hits had 119864-values of le10119864 minus 100 and thus wereconsidered to have highly significant homology [29] About296 sequences (4950) had homology values between 10119864minus99 and 10119864 minus 20 which were considered as moderatelysimilar to the sequences in GenBank The remaining 276isotigs (4615) had weak homologies or similarities (119864-values between 10119864minus19 and 10119864minus3) Meanwhile out of the5507 cleaned singletons 414 had significant hits to GenBanksequences One hundred sixty-one (161) (3889) of thoseannotated had moderate similarity to the gene sequences inGenBank Out of the combined 1012 BLASTn homologoussequences 9 were similar to Daphnia pulex nucleotidesequences (Figure 4)
The remaining majority had significant homologies toseveral shrimp species such as Penaeus monodon (6)Litopenaeus vannamei (3) Marsupenaeus japonicus (3)and Fenneropenaeus chinensis (2) Dataset Item 3 (Table)lists the database sequences with homology to P monodongenes in the GenBank nucleotide database Resulting geneontology (GO) assignment showed that 88 of the totalisotigs (Dataset Item 4 (Table)) and 94 of the singletons(Dataset Item 5 (Table)) had no significant hits to the Gen-bank GO database Tables 3 and 4 show the isotigs and sin-gletons GO analysis results respectively 488 isotigs and 336singletons were assigned and distributed to different cate-gories based on gene ontology Complete lists of isotigand singleton BLASTx or GO assignments are exhibited in
Dataset Item 6 (Table) and Dataset Item 7 (Table) respect-ively
In terms of similarities to model organisms the majorityof the 824 GO assigned sequences (40) had significanthits to genes of Drosophila melanogaster Genes from Daniorerio also had homologies to 23 of the sequences Theremaining sequences were significantly homologous to thegenes of Rattus norvegicus (14) Mus musculus (10) andother species (Figure 5)
Microsatellite mining was as follows Using msatcom-mander with search options for trinucleotide repeats orhigher 25 microsatellite markers were identified in DatasetItem 8 (Table) Five (20) of these have BLASTn matchesThe homologous genes however were putative or predictedSome of these were amplified using RT-PCR with the fol-lowing mixture 120583L 1 cDNA template in 1x PCR buffer with2mM MgCl
2 02mM dNTPs 05120583M forward and reverse
primers 1 unit Taq and distilled water The PCR mix wasrun in a thermal cycler with the following conditions 95∘Cfor 5 minutes 30 cycles of 95∘C for 30 seconds 52∘C for 30seconds 72∘C for 30 seconds and a final extension of 72∘Cfor 5 minutes Successfully amplified microsatellite markersare shown in Figure 6
The successful amplification of these microsatellites indi-cates that repetitive segments may also be found in thetranscriptome and they may be used as genetic markersrelating to desired phenotype Once confirmed to be poly-morphic and after further field testing using genomic DNAthese microsatellite markers may eventually be established asmarkers for WSSV resistance in shrimp
Dataset Papers in Science 5
224 other species 74es 74
Penaeus monodon 6Litopenaeus vannamei 3
Marsupenaeus japonicus 2Fenneropenaeus chinensis 2
Callinectes sapidus 2
Daphnia pulex 9
Procambarus clarkii 1
Figure 4 Organisms with most nucleotide hits the proportion of unique sequences with BLASTn matches (119864-value le10minus3) in thenonredundant nucleotide database classified according to the organism with the ldquotop hitrdquo nucleotide sequence
Semiquantitative expression analysis was as follows Theconcentration and purity of the RNA extracts were deter-mined by spectrophotometry prior to cDNA synthesis andRT-PCR All extracts were found to contain quality RNA (OD260280 of 18 to 20) Gene expression was analyzed usingsemiquantitative PCR PCRmix was prepared using the sameconcentrations as described in microsatellite amplificationRunning conditions however were modified into 25 cycleswith a 50∘C cycle annealing temperature After the run 2 120583Lof each ampliconwasmixedwith 2 120583Lof loading dye and thenelectrophoresed in 1 agarose gel Relative expression wasdetermined based on the fluorescence of 2 120583L DNA ladderloaded on the same gel and computed using BioDocAnalyzeSoftware bundled with the Biometra Gel Documentation sys-tem Ferritin and fortilin-binding protein whose full-lengthcoding regions were found in the library were successfullyamplified Differential expression of these genes was clearlyobserved in the semiquantitative PCR profiles (Figure 7)120573-Actin was used as negative control [1] while hemo-
cyanin a known differentially expressed gene in virus resis-tant shrimps [13] was used as a positive control It can beseen that fortilin-binding protein is significantly upregulatedin ldquosurvivorrdquo shrimps Primers and annealing temperaturesused were listed in Table 1
3 Dataset Description
The dataset associated with this Dataset Paper consists of 8items which are described as follows
Dataset Item 1 (Table) A list of isotigs and BLAST resultsThe column Query Name presents the isotig number desig-nated by the assembly program Query Length the lengthof isotig sequence in base pairs Query Start the start ofalignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered
by alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignmentbit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal Identities
6 Dataset Papers in Science
Others
5
Drosophila
melanogaster 40
Danio rerio 23
Rattus norvegicus 14
Mus musculus 10
Caenorhabditis elegans 3Candida albicans 3Arabidopsis thaliana 1
Saccharomyces cerevisiae 1
Figure 5Organismswithmost protein hits the proportion of putative transcriptswith BLASTxmatches (119864-valuele10minus3) in the nonredundantprotein database classified according to the organism with the ldquotop hitrdquo protein sequence
1 2 3 4 5 6 7 8
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
isotig01639
isotig01684
isotig00854
isotig01256
Control
HGUVU5Q01ANPJ4
ldquoSurvivorrdquo
Figure 6 Successful amplification of microsatellites in P monodon samples Leftmost marker lengths lane 1 ladder lanes 2 to 4 controlshrimps lanes 5 to 7 ldquosurvivorrdquo shrimps lanes 8 negative control rightmost microsatellite identifier
Dataset Papers in Science 7
Table 1 Primers used in RT-PCR
Product Identifier 51015840 to 31015840 sequence Melting temp
170 isotig001639 Forward TCAAAGTTAGCTCCTAGACCA 52Reverse AGCTCCTACTCCTGTTTCTGT 55
146 isotig001684 Forward GCCTCCTATAAACGCTAAAAT 50Reverse ATTACCCAGAATTCACTGGTT 51
325 isotig00854 Forward CTGCAGGTCATGCTCAACG 60Reverse AGCTCCCACACATCAAGGG 60
314 isotig01256 Forward ACGACTCGAAGGAAGGCTC 60Reverse TCCCGAATGTCCGTTCTCC 60
154 HGUVU5Q01ANPJ4 Forward TCGCGACGAGTACATGGG 60Reverse GTCTTCACTGTATTCAATTGGCTG 60
268 Hemocyanin Forward ACAGTCTCCCTCCATACACCGTGG 63Reverse AGGCCAGGCGAAAACGCGAA 64
279 Ferritin Forward GCCCGGGCAGGTACTCTCCA 66Reverse TGGCCGCATCGTTCTCCAGC 64
223 Fortilin-binding protein Forward GCTCGCAGTTAACCTCCTCTGCC 63Reverse GGCGTCCACCGTGCAAGAAGT 64
Table 2 A list of database reads sequences and other components with their corresponding frequencies
Number ofreads
Number ofbases Assembled Partial Singleton Repeat Outlier Too short
240616 76099104 125598 91137 9526 1597 10108 2650Number ofisogroups
Averagecontig count
Largestcontig count
Number withone contig
Averageisotig count
Largest isotigcount
Number withone isotig
3724 2 2813 3573 1 13 3577Number ofisotigs
Averagecontig count
Largestcontig count
Number withone contig
Number ofbases
Averageisotig size
N50 isotigsize Largest isotig size
4090 1 6 3732 1635969 400 504 1921Number ofsingletons Valid Short LowQual Dust ShortQ Valid (after
2nd cleaning) Trashed
9526 5771 3489 1 73 192 5507 264
Table 3 Isotigs GO analysis result
GO analysis resultBiological process 1686118225Cellular component 1276988942Molecular function 1916892833No hits 3602
Column 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Item 2 (Table) A list of singletons and BLASTresults The column Query Name presents the singleton codedesignated by the assembly program Query Length the
Table 4 Singletons GO analysis result
GO analysis resultBiological process 1014409221Cellular component 9316145113Molecular function 1413976268No hits 5171
length of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignmentin query Query Coverage the percentage of query coveredby alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignment
8 Dataset Papers in Science
Control
Hemocyanin
Ferritin
Fortilin-bindingprotein
lowast
lowast
0
500
0
500
0
500
0
500
Control
Control
Control
Control
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
ldquoSurvivorrdquo
ldquoSurvivorrdquo
ldquoSurvivorrdquoldquoSurvivorrdquo
ldquoSurvivorrdquo
120573-Actin
Figure 7 Successful amplification of immune-related genes in P monodon samples Lane 1 ladder lanes 2 to 4 control shrimps lanes 5 to7 ldquosurvivorrdquo shrimps lane 8 negative control Asterisk indicates significant difference between 120573-actin normalized fluorescence scores ofcontrol and ldquosurvivorrdquo shrimps based on 119905-test at 95 confidence interval
bit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject Description
Column 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal IdentitiesColumn 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
Dataset Papers in Science 5
224 other species 74es 74
Penaeus monodon 6Litopenaeus vannamei 3
Marsupenaeus japonicus 2Fenneropenaeus chinensis 2
Callinectes sapidus 2
Daphnia pulex 9
Procambarus clarkii 1
Figure 4 Organisms with most nucleotide hits the proportion of unique sequences with BLASTn matches (119864-value le10minus3) in thenonredundant nucleotide database classified according to the organism with the ldquotop hitrdquo nucleotide sequence
Semiquantitative expression analysis was as follows Theconcentration and purity of the RNA extracts were deter-mined by spectrophotometry prior to cDNA synthesis andRT-PCR All extracts were found to contain quality RNA (OD260280 of 18 to 20) Gene expression was analyzed usingsemiquantitative PCR PCRmix was prepared using the sameconcentrations as described in microsatellite amplificationRunning conditions however were modified into 25 cycleswith a 50∘C cycle annealing temperature After the run 2 120583Lof each ampliconwasmixedwith 2 120583Lof loading dye and thenelectrophoresed in 1 agarose gel Relative expression wasdetermined based on the fluorescence of 2 120583L DNA ladderloaded on the same gel and computed using BioDocAnalyzeSoftware bundled with the Biometra Gel Documentation sys-tem Ferritin and fortilin-binding protein whose full-lengthcoding regions were found in the library were successfullyamplified Differential expression of these genes was clearlyobserved in the semiquantitative PCR profiles (Figure 7)120573-Actin was used as negative control [1] while hemo-
cyanin a known differentially expressed gene in virus resis-tant shrimps [13] was used as a positive control It can beseen that fortilin-binding protein is significantly upregulatedin ldquosurvivorrdquo shrimps Primers and annealing temperaturesused were listed in Table 1
3 Dataset Description
The dataset associated with this Dataset Paper consists of 8items which are described as follows
Dataset Item 1 (Table) A list of isotigs and BLAST resultsThe column Query Name presents the isotig number desig-nated by the assembly program Query Length the lengthof isotig sequence in base pairs Query Start the start ofalignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered
by alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignmentbit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal Identities
6 Dataset Papers in Science
Others
5
Drosophila
melanogaster 40
Danio rerio 23
Rattus norvegicus 14
Mus musculus 10
Caenorhabditis elegans 3Candida albicans 3Arabidopsis thaliana 1
Saccharomyces cerevisiae 1
Figure 5Organismswithmost protein hits the proportion of putative transcriptswith BLASTxmatches (119864-valuele10minus3) in the nonredundantprotein database classified according to the organism with the ldquotop hitrdquo protein sequence
1 2 3 4 5 6 7 8
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
isotig01639
isotig01684
isotig00854
isotig01256
Control
HGUVU5Q01ANPJ4
ldquoSurvivorrdquo
Figure 6 Successful amplification of microsatellites in P monodon samples Leftmost marker lengths lane 1 ladder lanes 2 to 4 controlshrimps lanes 5 to 7 ldquosurvivorrdquo shrimps lanes 8 negative control rightmost microsatellite identifier
Dataset Papers in Science 7
Table 1 Primers used in RT-PCR
Product Identifier 51015840 to 31015840 sequence Melting temp
170 isotig001639 Forward TCAAAGTTAGCTCCTAGACCA 52Reverse AGCTCCTACTCCTGTTTCTGT 55
146 isotig001684 Forward GCCTCCTATAAACGCTAAAAT 50Reverse ATTACCCAGAATTCACTGGTT 51
325 isotig00854 Forward CTGCAGGTCATGCTCAACG 60Reverse AGCTCCCACACATCAAGGG 60
314 isotig01256 Forward ACGACTCGAAGGAAGGCTC 60Reverse TCCCGAATGTCCGTTCTCC 60
154 HGUVU5Q01ANPJ4 Forward TCGCGACGAGTACATGGG 60Reverse GTCTTCACTGTATTCAATTGGCTG 60
268 Hemocyanin Forward ACAGTCTCCCTCCATACACCGTGG 63Reverse AGGCCAGGCGAAAACGCGAA 64
279 Ferritin Forward GCCCGGGCAGGTACTCTCCA 66Reverse TGGCCGCATCGTTCTCCAGC 64
223 Fortilin-binding protein Forward GCTCGCAGTTAACCTCCTCTGCC 63Reverse GGCGTCCACCGTGCAAGAAGT 64
Table 2 A list of database reads sequences and other components with their corresponding frequencies
Number ofreads
Number ofbases Assembled Partial Singleton Repeat Outlier Too short
240616 76099104 125598 91137 9526 1597 10108 2650Number ofisogroups
Averagecontig count
Largestcontig count
Number withone contig
Averageisotig count
Largest isotigcount
Number withone isotig
3724 2 2813 3573 1 13 3577Number ofisotigs
Averagecontig count
Largestcontig count
Number withone contig
Number ofbases
Averageisotig size
N50 isotigsize Largest isotig size
4090 1 6 3732 1635969 400 504 1921Number ofsingletons Valid Short LowQual Dust ShortQ Valid (after
2nd cleaning) Trashed
9526 5771 3489 1 73 192 5507 264
Table 3 Isotigs GO analysis result
GO analysis resultBiological process 1686118225Cellular component 1276988942Molecular function 1916892833No hits 3602
Column 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Item 2 (Table) A list of singletons and BLASTresults The column Query Name presents the singleton codedesignated by the assembly program Query Length the
Table 4 Singletons GO analysis result
GO analysis resultBiological process 1014409221Cellular component 9316145113Molecular function 1413976268No hits 5171
length of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignmentin query Query Coverage the percentage of query coveredby alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignment
8 Dataset Papers in Science
Control
Hemocyanin
Ferritin
Fortilin-bindingprotein
lowast
lowast
0
500
0
500
0
500
0
500
Control
Control
Control
Control
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
ldquoSurvivorrdquo
ldquoSurvivorrdquo
ldquoSurvivorrdquoldquoSurvivorrdquo
ldquoSurvivorrdquo
120573-Actin
Figure 7 Successful amplification of immune-related genes in P monodon samples Lane 1 ladder lanes 2 to 4 control shrimps lanes 5 to7 ldquosurvivorrdquo shrimps lane 8 negative control Asterisk indicates significant difference between 120573-actin normalized fluorescence scores ofcontrol and ldquosurvivorrdquo shrimps based on 119905-test at 95 confidence interval
bit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject Description
Column 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal IdentitiesColumn 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
6 Dataset Papers in Science
Others
5
Drosophila
melanogaster 40
Danio rerio 23
Rattus norvegicus 14
Mus musculus 10
Caenorhabditis elegans 3Candida albicans 3Arabidopsis thaliana 1
Saccharomyces cerevisiae 1
Figure 5Organismswithmost protein hits the proportion of putative transcriptswith BLASTxmatches (119864-valuele10minus3) in the nonredundantprotein database classified according to the organism with the ldquotop hitrdquo protein sequence
1 2 3 4 5 6 7 8
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
2000 bp1200 bp800 bp400bp
isotig01639
isotig01684
isotig00854
isotig01256
Control
HGUVU5Q01ANPJ4
ldquoSurvivorrdquo
Figure 6 Successful amplification of microsatellites in P monodon samples Leftmost marker lengths lane 1 ladder lanes 2 to 4 controlshrimps lanes 5 to 7 ldquosurvivorrdquo shrimps lanes 8 negative control rightmost microsatellite identifier
Dataset Papers in Science 7
Table 1 Primers used in RT-PCR
Product Identifier 51015840 to 31015840 sequence Melting temp
170 isotig001639 Forward TCAAAGTTAGCTCCTAGACCA 52Reverse AGCTCCTACTCCTGTTTCTGT 55
146 isotig001684 Forward GCCTCCTATAAACGCTAAAAT 50Reverse ATTACCCAGAATTCACTGGTT 51
325 isotig00854 Forward CTGCAGGTCATGCTCAACG 60Reverse AGCTCCCACACATCAAGGG 60
314 isotig01256 Forward ACGACTCGAAGGAAGGCTC 60Reverse TCCCGAATGTCCGTTCTCC 60
154 HGUVU5Q01ANPJ4 Forward TCGCGACGAGTACATGGG 60Reverse GTCTTCACTGTATTCAATTGGCTG 60
268 Hemocyanin Forward ACAGTCTCCCTCCATACACCGTGG 63Reverse AGGCCAGGCGAAAACGCGAA 64
279 Ferritin Forward GCCCGGGCAGGTACTCTCCA 66Reverse TGGCCGCATCGTTCTCCAGC 64
223 Fortilin-binding protein Forward GCTCGCAGTTAACCTCCTCTGCC 63Reverse GGCGTCCACCGTGCAAGAAGT 64
Table 2 A list of database reads sequences and other components with their corresponding frequencies
Number ofreads
Number ofbases Assembled Partial Singleton Repeat Outlier Too short
240616 76099104 125598 91137 9526 1597 10108 2650Number ofisogroups
Averagecontig count
Largestcontig count
Number withone contig
Averageisotig count
Largest isotigcount
Number withone isotig
3724 2 2813 3573 1 13 3577Number ofisotigs
Averagecontig count
Largestcontig count
Number withone contig
Number ofbases
Averageisotig size
N50 isotigsize Largest isotig size
4090 1 6 3732 1635969 400 504 1921Number ofsingletons Valid Short LowQual Dust ShortQ Valid (after
2nd cleaning) Trashed
9526 5771 3489 1 73 192 5507 264
Table 3 Isotigs GO analysis result
GO analysis resultBiological process 1686118225Cellular component 1276988942Molecular function 1916892833No hits 3602
Column 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Item 2 (Table) A list of singletons and BLASTresults The column Query Name presents the singleton codedesignated by the assembly program Query Length the
Table 4 Singletons GO analysis result
GO analysis resultBiological process 1014409221Cellular component 9316145113Molecular function 1413976268No hits 5171
length of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignmentin query Query Coverage the percentage of query coveredby alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignment
8 Dataset Papers in Science
Control
Hemocyanin
Ferritin
Fortilin-bindingprotein
lowast
lowast
0
500
0
500
0
500
0
500
Control
Control
Control
Control
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
ldquoSurvivorrdquo
ldquoSurvivorrdquo
ldquoSurvivorrdquoldquoSurvivorrdquo
ldquoSurvivorrdquo
120573-Actin
Figure 7 Successful amplification of immune-related genes in P monodon samples Lane 1 ladder lanes 2 to 4 control shrimps lanes 5 to7 ldquosurvivorrdquo shrimps lane 8 negative control Asterisk indicates significant difference between 120573-actin normalized fluorescence scores ofcontrol and ldquosurvivorrdquo shrimps based on 119905-test at 95 confidence interval
bit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject Description
Column 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal IdentitiesColumn 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
Dataset Papers in Science 7
Table 1 Primers used in RT-PCR
Product Identifier 51015840 to 31015840 sequence Melting temp
170 isotig001639 Forward TCAAAGTTAGCTCCTAGACCA 52Reverse AGCTCCTACTCCTGTTTCTGT 55
146 isotig001684 Forward GCCTCCTATAAACGCTAAAAT 50Reverse ATTACCCAGAATTCACTGGTT 51
325 isotig00854 Forward CTGCAGGTCATGCTCAACG 60Reverse AGCTCCCACACATCAAGGG 60
314 isotig01256 Forward ACGACTCGAAGGAAGGCTC 60Reverse TCCCGAATGTCCGTTCTCC 60
154 HGUVU5Q01ANPJ4 Forward TCGCGACGAGTACATGGG 60Reverse GTCTTCACTGTATTCAATTGGCTG 60
268 Hemocyanin Forward ACAGTCTCCCTCCATACACCGTGG 63Reverse AGGCCAGGCGAAAACGCGAA 64
279 Ferritin Forward GCCCGGGCAGGTACTCTCCA 66Reverse TGGCCGCATCGTTCTCCAGC 64
223 Fortilin-binding protein Forward GCTCGCAGTTAACCTCCTCTGCC 63Reverse GGCGTCCACCGTGCAAGAAGT 64
Table 2 A list of database reads sequences and other components with their corresponding frequencies
Number ofreads
Number ofbases Assembled Partial Singleton Repeat Outlier Too short
240616 76099104 125598 91137 9526 1597 10108 2650Number ofisogroups
Averagecontig count
Largestcontig count
Number withone contig
Averageisotig count
Largest isotigcount
Number withone isotig
3724 2 2813 3573 1 13 3577Number ofisotigs
Averagecontig count
Largestcontig count
Number withone contig
Number ofbases
Averageisotig size
N50 isotigsize Largest isotig size
4090 1 6 3732 1635969 400 504 1921Number ofsingletons Valid Short LowQual Dust ShortQ Valid (after
2nd cleaning) Trashed
9526 5771 3489 1 73 192 5507 264
Table 3 Isotigs GO analysis result
GO analysis resultBiological process 1686118225Cellular component 1276988942Molecular function 1916892833No hits 3602
Column 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Item 2 (Table) A list of singletons and BLASTresults The column Query Name presents the singleton codedesignated by the assembly program Query Length the
Table 4 Singletons GO analysis result
GO analysis resultBiological process 1014409221Cellular component 9316145113Molecular function 1413976268No hits 5171
length of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignmentin query Query Coverage the percentage of query coveredby alignment to the database sequence Subject Descriptionthe description or title of the matched database sequenceSubject Accession the accession of the matched databasesequence Subject Length the length of the matched databasesequence in base pairs Subject Start the start of alignment insubject Subject End the end of alignment in subject SubjectCoverage the percentage of database sequence covered byalignment to the query sequence Bit Score the alignment
8 Dataset Papers in Science
Control
Hemocyanin
Ferritin
Fortilin-bindingprotein
lowast
lowast
0
500
0
500
0
500
0
500
Control
Control
Control
Control
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
ldquoSurvivorrdquo
ldquoSurvivorrdquo
ldquoSurvivorrdquoldquoSurvivorrdquo
ldquoSurvivorrdquo
120573-Actin
Figure 7 Successful amplification of immune-related genes in P monodon samples Lane 1 ladder lanes 2 to 4 control shrimps lanes 5 to7 ldquosurvivorrdquo shrimps lane 8 negative control Asterisk indicates significant difference between 120573-actin normalized fluorescence scores ofcontrol and ldquosurvivorrdquo shrimps based on 119905-test at 95 confidence interval
bit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject Description
Column 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal IdentitiesColumn 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
8 Dataset Papers in Science
Control
Hemocyanin
Ferritin
Fortilin-bindingprotein
lowast
lowast
0
500
0
500
0
500
0
500
Control
Control
Control
Control
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
200120583g120 120583g80 120583g40 120583g
ldquoSurvivorrdquo
ldquoSurvivorrdquo
ldquoSurvivorrdquoldquoSurvivorrdquo
ldquoSurvivorrdquo
120573-Actin
Figure 7 Successful amplification of immune-related genes in P monodon samples Lane 1 ladder lanes 2 to 4 control shrimps lanes 5 to7 ldquosurvivorrdquo shrimps lane 8 negative control Asterisk indicates significant difference between 120573-actin normalized fluorescence scores ofcontrol and ldquosurvivorrdquo shrimps based on 119905-test at 95 confidence interval
bit score 119864-Value Score the alignment expect value fromthe database sequence MatchTotal Identities the numberof identical matches Identities Percentage the percentageof identical matches MatchTotal Positives the number ofpositive scoringmatches Positives Percentage the percentageof positive scoring matches MatchTotal Gaps the numberof gaps in the alignment Gaps Percentage the percentage ofgaps in the alignment Query Frame the reading frame of thequery sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject Description
Column 7 Subject AccessionColumn 8 Subject LengthColumn 9 Subject StartColumn 10 Subject EndColumn 11 Subject CoverageColumn 12 Score BitColumn 13 119864-Value ScoreColumn 14 MatchTotal IdentitiesColumn 15 Identities Percentage ()Column 16 MatchTotal PositivesColumn 17 Positives Percentage ()Column 18 MatchTotal GapsColumn 19 Gaps Percentage ()Column 20 Query Frame
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
Dataset Papers in Science 9
Dataset Item 3 (Table) A list of dataset sequence names(Name) their corresponding annotation according to theGenbank nonredundant nucleotide database (NR Annota-tion) and related organisms (Related Organism)
Column 1 NameColumn 2 NR AnnotationColumn 3 Related Organism
Dataset Item 4 (Table) A list of isotig frequencies of majorgene ontology (GO) categories and subcategories Since agene product could be assigned to more than one GO termthe percentages in each main category do not add up toexactly 100 percent Supplementary information is includedwhere the isotig number isogroup number length andnumber of contigs comprising the contig are listed per GOterm
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 5 (Table) A list of singleton frequencies ofmajor gene ontology (GO) categories and subcategoriesSince a gene product could be assigned to more than oneGO term the percentages in each main category do not addup to exactly 100 percent Supplementary information is alsoincluded where the singleton code is listed per GO term
Column 1 GO CategoryColumn 2 GO SubcategoryColumn 3 Frequency Percentage ()Column 4 Supplementary Information
Dataset Item 6 (Table) A list of isotigs and BLAST2GOresults The column Query Name presents the isotig numberdesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotal
Gaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query EndColumn 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 7 (Table) A list of singletons and BLAST2GOresults The column Query Name presents the singleton codedesignated by the assembly program Query Length thelength of isotig sequence in base pairs Query Start the startof alignment in query Query End the end of alignment inquery Query Coverage the percentage of query covered byalignment to the database sequence Subject Description thedescription or title of the matched database sequence withspecific details that contains hit ID sequence symbol speciesnumber and name GO number specific GO term andevidence source Subject Length the length of the matcheddatabase sequence in base pairs Subject Start the start ofalignment in subject Subject End the end of alignmentin subject Subject Coverage the percentage of databasesequence covered by alignment to the query sequence BitScore the alignment bit score 119864-Value Score the alignmentexpect value from the database sequence MatchTotal Iden-tities the number of identical matches Identities Percentagethe percentage of identical matches MatchTotal Positivesthe number of positive scoringmatches Positives Percentagethe percentage of positive scoring matches MatchTotalGaps the number of gaps in the alignment Gaps Percentagethe percentage of gaps in the alignment Query Frame thereading frame of the query sequence
Column 1 Query NameColumn 2 Query LengthColumn 3 Query StartColumn 4 Query End
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
10 Dataset Papers in Science
Column 5 Query CoverageColumn 6 Subject DescriptionColumn 7 Subject LengthColumn 8 Subject StartColumn 9 Subject EndColumn 10 Subject CoverageColumn 11 Score BitColumn 12 119864-Value ScoreColumn 13 MatchTotal IdentitiesColumn 14 Identities Percentage ()Column 15 MatchTotal PositivesColumn 16 Positives Percentage ()Column 17 MatchTotal GapsColumn 18 Gaps Percentage ()Column 19 Query Frame
Dataset Item 8 (Table) A list dataset sequences withmicrosatellites The column Sequence identifier presents theisotig number or singleton code Length the sequence lengthin base pairs Repeats the number andmotif of microsatelliterepeats Left Start Primer the position of forward primeralong the sequence Right Start Primer the position of reverseprimer along the sequence Blastn Match the annotationbased on Blastn
Column 1 Sequence IdentifierColumn 2 LengthColumn 3 RepeatsColumn 4 Left Start PrimerColumn 5 Right Start PrimerColumn 6 Blastn Match
4 Concluding Remarks
The transcriptome assembly described by this paper is tobe used in conjunction with Dataset Items 1ndash8 (Tables) tofacilitate easy and rapid identification of any gene of interestfrom the ldquosurvivorrdquo tiger shrimp transcriptome This datasetrepresents a large pool of sequences that needs to be furthercharacterized for additional understanding of WSSV P mon-odon interaction Once validated candidate markers minedfrom the library including the generated microsatellites maybe used for genetic linkage mapping
Dataset Availability
The dataset associated with this Dataset Paper is dedi-cated to the public domain using the CC0 waiver and isavailable at httpdxdoiorg1011552014807806dataset Inaddition the sequence database has been uploaded to NCBISequence Read Archive with accession number SRR577080(httpwwwncbinlmnihgovsra)
Conflict of Interests
The authors declare that there is no conflict of interests
Acknowledgments
The authors would like to thank Dr Adelaida T Calpe ofthe Philippine Council for Agriculture Aquatic and Natu-ral Resources Research and Development (PCAARRD) forguiding the projectrsquos implementation Mr Kenneth Kim theBiochemistry Laboratory of the Thomas Aquinas ResearchComplex Roselyn Aguila Sweedy Kay Perez and the restof the staff of Genetic Fingerprinting Laboratory and theNFRDI administration
References
[1] Z-Y Zhao Z-X Yin S-P Weng et al ldquoProfiling of differ-entially expressed genes in hepatopancreas of white spot syn-drome virus-resistant shrimp (Litopenaeus vannamei) by sup-pression subtractive hybridisationrdquo Fish and Shellfish Immunol-ogy vol 22 no 5 pp 520ndash534 2007
[2] D Pan N He Z Yang H Liu and X Xu ldquoDifferential geneexpression profile in hepatopancreas of WSSV-resistant shrimp(Penaeus japonicus) by suppression subtractive hybridizationrdquoDevelopmental and Comparative Immunology vol 29 no 2 pp103ndash112 2005
[3] M Tonganunt B NupanM Saengsakda et al ldquoThe role of Pm-fortilin in protecting shrimp from white spot syndrome virus(WSSV) infectionrdquo Fish and Shellfish Immunology vol 25 no5 pp 633ndash637 2008
[4] T Panrat P Sinthujaroen B Nupan W Wanna M T Tammiand A Phongdara ldquoCharacterization of a novel binding proteinfor FortilinTCTPmdashcomponent of a defensemechanismagainstviral infection in Penaeus monodonrdquo PLoS ONE vol 7 no 3Article ID e33291 2012
[5] A Clavero-Salas R R Sotelo-Mundo T Gollas-Galvan etal ldquoTranscriptome analysis of gills from the white shrimpLitopenaeus vannamei infected with White Spot SyndromeVirusrdquo Fish and Shellfish Immunology vol 23 no 2 pp 459ndash472 2007
[6] J Robalino J S Almeida D McKillen et al ldquoInsights into theimmune transcriptome of the shrimp Litopenaeus vannameitissue-specific expression profiles and transcriptomic responsesto immune challengerdquo Physiological Genomics vol 29 no 1 pp44ndash56 2007
[7] B J Cuthbertson L J Deterding J GWilliams et al ldquoDiversityin penaeidin antimicrobial peptide form and functionrdquo Devel-opmental and Comparative Immunology vol 32 no 3 pp 167ndash181 2008
[8] J C Garcıa A Reyes M Salazar and C B Granja ldquoDifferentialgene expression in White Spot Syndrome Virus (WSSV)-infected naıve and previously challenged Pacific white shrimpPenaeus (Litopenaeus) vannameirdquo Aquaculture vol 289 no 3-4 pp 253ndash258 2009
[9] E de la Vega B M Degnan M R Hall and K J WilsonldquoDifferential expression of immune-related genes and trans-posable elements in black tiger shrimp (Penaeus monodon)exposed to a range of environmental stressorsrdquo Fish andShellfish Immunology vol 23 no 5 pp 1072ndash1088 2007
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
Dataset Papers in Science 11
[10] S Pongsomboon S Tang S Boonda T Aoki I Hirono andA Tassanakajon ldquoA cDNA microarray approach for analyzingtranscriptional changes in Penaeus monodon after infection bypathogensrdquo Fish and Shellfish Immunology vol 30 no 1 pp439ndash446 2011
[11] T Aoki H-C Wang S Unajak M D Santos H Kondo andI Hirono ldquoMicroarray analyses of shrimp immune responsesrdquoMarine Biotechnology vol 13 no 4 pp 629ndash638 2011
[12] A Prapavorarat S Pongsomboon and A Tassanakajon ldquoIden-tification of genes expressed in response to yellow headvirus infection in the black tiger shrimp Penaeus monodonby suppression subtractive hybridizationrdquo Developmental andComparative Immunology vol 34 no 6 pp 611ndash617 2010
[13] S Nayak S K Singh N Ramaiah and R A Sreepada ldquoIdenti-fication of upregulated immune-related genes in Vibrio harveyichallenged Penaeus monodon postlarvaerdquo Fish and ShellfishImmunology vol 29 no 3 pp 544ndash549 2010
[14] J Zhang R Chiodini A Badr and G Zhang ldquoThe impact ofnext-generation sequencing on genomicsrdquo Journal of Geneticsand Genomics vol 38 no 3 pp 95ndash109 2011
[15] F Bontems L Baerlocher S Mehenni I Bahechar L Farinelliand R Dosch ldquoEfficient mutation identification in zebrafishby microarray capturing and next generation sequencingrdquo Bio-chemical and Biophysical Research Communications vol 405no 3 pp 373ndash376 2011
[16] S G Bavykin V MMikhailovich V M Zakharyev et al ldquoDis-crimination of Bacillus anthracis and closely related microor-ganisms by analysis of 16S and 23S rRNA with oligonucleotidemicroarrayrdquo Chemico-Biological Interactions vol 171 no 2 pp212ndash235 2008
[17] H Yang Y Tao Z Zheng C Li M W Sweetingham and JG Howieson ldquoApplication of next-generation sequencing forrapid marker development in molecular plant breeding a casestudy on anthracnose disease resistance in Lupinus angustifoliusLrdquo BMC Genomics vol 13 p 318 2012
[18] J Cuellar-Anjel R Chamoro B White-Noble P Schofield andD Lightner ldquoTesting finds resistance to WSSV in shrimp fromPanamanian breeding programrdquo Global Aquaculture Advocatepp 65ndash66 2011
[19] Z Huang Z Yin S Weng J He and S Li ldquoSelective breedingand preliminary commercial performance of Penaeus vannameifor resistance to white spot syndrome virus (WSSV)rdquo Aquacul-ture vol 364-365 pp 111ndash117 2012
[20] Y Huang A Hua-shui Z Yin et al ldquoStudies onWSSV resistantand immune characteristics of the 4th generation selectivebreeding families for resistance to the white syndrome virus(WSSV) of Litopenaeus vannameirdquo Journal of Fisheries of Chinavol 34 no 10 pp 1550ndash1558 2010
[21] T Gitterle R Salte B Gjerde et al ldquoGenetic (co)variation inresistance to White Spot Syndrome Virus (WSSV) and harvestweight in Penaeus (Litopenaeus vannamei)rdquo Aquaculture vol246 no 1-4 pp 139ndash149 2005
[22] X-S Hu ldquoA general framework for marker-assisted selectionrdquoTheoretical Population Biology vol 71 no 4 pp 524ndash542 2007
[23] M D Santos G V Lopez and N C Barut ldquoA pilot study onthe genetic variation of eastern little tuna (Euthynnus affinis) inSoutheast Asiardquo Philippine Journal of Science vol 139 no 1 pp43ndash50 2010
[24] B A Maralit C M A Caipang M D Santos and M BB Maningas ldquoPCR detection of white spot syndrome virus
(WSSV) from farmed pacific white shrimp (Litopenaeus van-namei) in selected sites of the Philippinesrdquo AACL Bioflux vol4 no 4 pp 474ndash480 2011
[25] L M Tapay E C B Nadala Jr and P C Loh ldquoA polymerasechain reaction protocol for the detection of various geographi-cal isolates of white spot virusrdquo Journal of Virological Methodsvol 82 no 1 pp 39ndash43 1999
[26] A Meyer J M Morrissey and M Schartl ldquoRecurrent origin ofa sexually selected trait in Xiphophorus fishes inferred from amolecular phylogenyrdquo Nature vol 368 no 6471 pp 539ndash5421994
[27] A Tassanakajon S Klinbunga N Paunglarp et al ldquoPenaeusmonodon gene discovery project the generation of an ESTcollection and establishment of a databaserdquo Gene vol 384 no1-2 pp 104ndash112 2006
[28] S-W Huang Y-Y Lin E-M You et al ldquoFosmid library endsequencing reveals a rarely known genome structure of marineshrimp Penaeus monodonrdquo BMC Genomics vol 12 article 2422011
[29] J Quilang S Wang P Li et al ldquoGeneration and analysis ofESTs from the eastern oyster Crassostrea virginica Gmelin andidentification of microsatellite and SNP markersrdquo BMC Geno-mics vol 8 article 157 2007
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Anatomy Research International
PeptidesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
International Journal of
Volume 2014
Zoology
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Molecular Biology International
GenomicsInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioinformaticsAdvances in
Marine BiologyJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Signal TransductionJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
BioMed Research International
Evolutionary BiologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Biochemistry Research International
ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Genetics Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Virolog y
Hindawi Publishing Corporationhttpwwwhindawicom
Nucleic AcidsJournal of
Volume 2014
Stem CellsInternational
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Enzyme Research
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Microbiology