Created by the ViPR/IRD team and licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License
Section C. Comparative Genomics Analysis of West Nile Virus
Objective
Upon completion of this exercise, you will be able to use the Virus Pathogen Resource (ViPR; http://www.viprbrc.org/) to:
• Search for virus sequences and view genome annotations in ViPR
• Save selected sequences as a working set in your private Workbench space
• Build and visualize a phylogenetic tree on a set of sequences to infer their evolutionary relationships
• Predict genotype and detect recombination in virus genomes
• Annotate virus genome sequences
I. Search for sequences and save matching sequences into working sets
a. Go to the ViPR homepage (http://www.viprbrc.org/) and click “Flaviviridae” to get to the family homepage.
b. Mouse-over “Search Data” in the grey navigation bar and click “Genomes”.
c. The Genome Search page allows you to search for sequences based on taxonomy, collection year, sample location, host selection, complete genome or not, etc. A dynamic number of matching search results is displayed at the top of the page to help you search more efficiently.
d. For this exercise, we are going to search for West Nile viruses isolated from 1999-2001 in the US and South Africa. Select the following criteria and click the “Search” button to run the query.
• Virus: West Nile virus (Flaviviridae->Flavivirus->West Nile virus)
• Complete Genome: Complete Genome Only
• Collection Year: 1999-2001
• Geographic Grouping: Africa, North America
• Country: South Africa, USA
This project is funded by the National Institute of Allergy and Infectious Diseases (NIH / DHHS) under Contract No. HHSN272200900041C and is a collaboration between NorthropGrumman Health IT, J. Craig Venter Institute and Vecna Technologies. Virus images courtesy of CDC Public Health Image Library, Wellcome Images, U.S. Department of Veterans Affairs ,Science of the Invisible and ViralZone, Swiss Institute of Bioinformatics.
Genome Search: Search for virus genomic sequences and related information. You can search for the whole virus family or search for a specified genus, species, etc. You can also find your strain or genome record if you have its information, such as strain name or accession.
Genome searches for Dengue virus or Hepatitis C virus can be augmented with clinical metadata criteria. Selecting the appropriate nodes in the taxonomy browser (Flavivirus, Dengue virus, Hepacivirus, Hepatitis C virus) will add metadata search panels and enable you to include these criteria. Some sequences have more metadata fields defined than others. Queries based on metadata only retrieve sequences for which those fields are defined.
e. The Search Results page will be displayed. Here you can:
i. Save the search query to your Workbench and rerun the search later.
ii. Download the sequences (genome, CDS, protein) by clicking “Download”.
iii. Store selected sequences as a working set in the Workbench so that you can run various analyses on the working set.
iv. View the details for any item in the results table by clicking on “View” next to any row.
f. On the Genome Details page, you will find the strain information, genome information, genome image map, and mature peptide annotations generated by ViPR.
g. Click “View” for a protein to load the Mature Peptide Details page. Here you will find annotations of the mature peptide including: genomic locations, HMM/Pfam domains, related protein structures, predicted and experimentally determined immune epitopes, etc.
h. Return to the Genome Search Results page by clicking “Results” in the breadcrumb.
Genome Search Result: your search returned 45 genomes, sorted by Species Name, Strain Name, and GenBank Accession in ascending order. The rows captured from the results page are:

| Strain Name | Species Name | GenBank Accession | Sequence Length | Collection Date | Host | GenBank Host | Country | Mol Type |
|---|---|---|---|---|---|---|---|---|
| 3356.2.1.1(JEV) | West Nile virus | EF530047 | 11029 | 2000 | Crow | American crow | USA | genomic RNA |
| 3356K VP2 | West Nile virus | EF657887 | 11029 | 2000 | Crow | American crow | USA | genomic RNA |
| FL2001 crow 67030 | West Nile virus | GQ379156 | 11029 | 07/2001 | Crow | crow | USA | genomic RNA |
| LSU-AR01 | West Nile virus | FJ527738 | 11029 | 2001 | Blue Jay | blue jay | USA | genomic RNA |
| New York 99 | West Nile virus | HQ596519 | 11029 | 1999 | Crow | crow | USA | genomic RNA |
| NY 2001 Suffolk | West Nile virus | DQ164194 | 11029 | 2001 | Crow | American crow | USA | genomic RNA |
| NY99-crow-V76/1 | West Nile virus | FJ151394 | 11029 | 1999 | Crow | crow | USA | genomic RNA |
| WNV-1/US/BID-V4186/1999 | West Nile virus | HM488125 | 10516 | 1999 | Crow | Corvus brachyrhynchos | USA | genomic RNA |
| WNV-1/US/BID-V4187/1999 | West Nile virus | HM488126 | 10598 | 1999 | Crow | Corvus brachyrhynchos | USA | genomic RNA |
| WNV-1/US/BID-V4188/1999 | West Nile virus | HM488127 | 10625 | 1999 | Crow | Corvus brachyrhynchos | USA | genomic RNA |
| WNV-1/US/BID-V4189/1999 | West Nile virus | HM488128 | 10616 | 1999 | Crow | Corvus brachyrhynchos | USA | genomic RNA |
| WNV-1/US/BID-V4191/2000 | West Nile virus | HM488129 | 10617 | 2000 | Mosquito | Culex salinarius | USA | genomic RNA |
| WNV-1/US/BID-V4192/2000 | West Nile virus | HM488130 | 10621 | 2000 | Mosquito | Culex salinarius | USA | genomic RNA |
| WNV-1/US/BID-V4193/2000 | West Nile virus | HM488131 | 10620 | 2000 | Mosquito | Culex pipiens | USA | genomic RNA |
| WNV-1/US/BID-V4194/2000 | West Nile virus | HM488132 | 10621 | 2000 | Mosquito | Culiseta melanura | USA | genomic RNA |
| WNV-1/US/BID-V4195/2001 | West Nile virus | HM488133 | 10621 | 2001 | Mosquito | Culex pipiens | USA | genomic RNA |
| WNV-1/US/BID-V4196/2001 | West Nile virus | HQ671696 | 10618 | 2001 | Mosquito | Culex salinarius | USA | genomic RNA |
| WNV-1/US/BID-V4197/2001 | West Nile virus | HQ671697 | 10621 | 2001 | Mosquito | Aedes vexans | USA | genomic RNA |
| WNV-1/US/BID-V4198/2001 | West Nile virus | HM488134 | 10513 | 2001 | Mosquito | Ochlerotatus sollicitans | USA | genomic RNA |
| WNV-1/US/BID-V4199/2001 | West Nile virus | HM488135 | 10621 | 2001 | Mosquito | Ochlerotatus cantator | USA | genomic RNA |
| WNV-1/US/BID-V4200/2001 | West Nile virus | HM488136 | 10621 | 2001 | Mosquito | Culex restuans | USA | genomic RNA |
| WNV-1/US/BID-V4689/2001 | West Nile virus | HM488246 | 10533 | 2001 | Avian | Corvus brachyrhynchos | USA | genomic RNA |
| WNV-1/US/BID-V4691/2001 | West Nile virus | HM488247 | 10612 | 2001 | Avian | Corvus brachyrhynchos | USA | genomic RNA |
Strain Details for West Nile virus Strain 3356.2.1.1(JEV)
Strain Information
Strain Name: 3356.2.1.1
Organism: West Nile virus
Taxonomy: Flaviviridae -> Flavivirus -> West Nile virus -> Type JEV
GenBank Host: American crow
Host: Crow
Isolation Country: USA
Collection Date: 2000
Genome: EF530047
GenBank Definition: West Nile virus strain 3356.2.1.1, complete genome.
Authors: Jia,Y., Dupuis,A.P. II, Jerzak,G.V.S., Maffei,J.G. and Kramer,L.D.; Jia,Y., Moudy,R.M., Dupuis,A.P. II, Ngo,K.A., Maffei,J.G., Jerzak,G.V., Franke,M.A., Kauffman,E.B. and Kramer,L.D.
GenBank Sequence Accession: EF530047
Sequence Length: 11029
Sequence Status: Complete
Number of Proteins: 14
Isolation Source: kidney
GenBank Note: small plaque phenotype; plaque purified virus variant with small plaque morphology; derived from WN NY 2000-crow3356 VP1
Mol Type: genomic RNA
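Protein Information

| Source | Gene Symbol | Protein Product Name | ViPR Locus ID | CDS Start | CDS End | NCBI Gene ID | Locus Name |
|---|---|---|---|---|---|---|---|
| GenBank | -N/A- | polyprotein | WNV-1 | 97 | 10398 | -N/A- | -N/A- |
| ViPR-generated | ancC | anchored core protein C | ancC | 97 | 465 | -N/A- | -N/A- |
| ViPR-generated | C | core protein C | C | 97 | 411 | -N/A- | -N/A- |
| ViPR-generated | preM | PreM protein | preM | 466 | 966 | -N/A- | -N/A- |
| ViPR-generated | M | matrix protein M | M | 742 | 966 | -N/A- | -N/A- |
| ViPR-generated | E | envelope protein | E | 967 | 2469 | -N/A- | -N/A- |
| ViPR-generated | NS1 | non-structural protein NS1 | NS1 | 2470 | 3525 | -N/A- | -N/A- |
| ViPR-generated | NS2a | non-structural protein NS2a | NS2a | 3526 | 4218 | -N/A- | -N/A- |
| ViPR-generated | NS2b | non-structural protein NS2b | NS2b | 4219 | 4611 | -N/A- | -N/A- |
| ViPR-generated | NS3 | non-structural protein NS3 | NS3 | 4612 | 6468 | -N/A- | -N/A- |
| ViPR-generated | NS4a | non-structural protein NS4a | NS4a | 6469 | 6846 | -N/A- | -N/A- |
| ViPR-generated | 2k | 2K protein | 2k | 6847 | 6915 | -N/A- | -N/A- |
| ViPR-generated | NS4b | non-structural protein NS4b | NS4b | 6916 | 7680 | -N/A- | -N/A- |
| ViPR-generated | NS5 | RNA-dependent RNA polymerase | NS5 | 7681 | 10395 | -N/A- | -N/A- |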
i. To analyze these sequences, we will select all records by ticking the checkbox above the table and add them to a working set by clicking the “Add to Working Set” button. This way, we will be able to retrieve the data from the Workbench later and run various analyses on the same data set.
j. You’ll be prompted to log in to your Workbench account in order to save data to a working set. If you don’t have an account already, simply register for an account for free by choosing the “Register for a new account” option and following the prompts.
k. An “Add to Working Set” lightbox will pop up. Create a new working set and name it “WNV 1999-2001 US & S Africa complete genomes”. Click “Add to Working Set” to save the sequences to the working set.
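The CDS coordinates listed in the Protein Information table for strain 3356.2.1.1 (GenBank EF530047) can be sanity-checked with a few lines of Python. The sketch below is only an illustration: it assumes the coordinates are 1-based and inclusive, and the helper names (`length_aa`, `is_contiguous`) are made up for this example rather than taken from ViPR.

```python
# CDS coordinates (start, end) copied from the Protein Information table
# for strain 3356.2.1.1; assumed to be 1-based and inclusive.
POLYPROTEIN = (97, 10398)
PEPTIDES = [
    ("ancC", 97, 465), ("C", 97, 411), ("preM", 466, 966), ("M", 742, 966),
    ("E", 967, 2469), ("NS1", 2470, 3525), ("NS2a", 3526, 4218),
    ("NS2b", 4219, 4611), ("NS3", 4612, 6468), ("NS4a", 6469, 6846),
    ("2k", 6847, 6915), ("NS4b", 6916, 7680), ("NS5", 7681, 10395),
]


def length_aa(start, end):
    """Number of codons spanned by a 1-based, inclusive nucleotide range."""
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return n_nt // 3


def is_contiguous(peptides):
    """True if each peptide starts one nucleotide after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(peptides, peptides[1:]))


lengths = {name: length_aa(start, end) for name, start, end in PEPTIDES}

# C and M lie inside ancC and preM, so they are left out of the tiling check.
tiling = [p for p in PEPTIDES if p[0] not in ("C", "M")]

for name, start, end in PEPTIDES:
    print(f"{name:5s} {start:6d}-{end:<6d} {lengths[name]:4d} aa")
print("peptides tile the polyprotein without gaps:", is_contiguous(tiling))
print("polyprotein codons:", length_aa(*POLYPROTEIN))
```

With these coordinates the mature peptides cover 3433 codons, while the polyprotein range spans 3434; the one extra codon at positions 10396-10398 is presumably the stop codon, although the table does not say so.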
II. Construct and visualize a genome phylogenetic tree
a. Click “Workbench” in the grey navigation bar to access your Workbench area.
b. On the Workbench page, click “View” next to the saved WNV working set.
c. The Working Set Details page displays the sequence records saved in the working set. Select all records by clicking the checkbox above the table. Mouse over “Run Analysis” and click “Generate Phylogenetic Tree”.
d. On the Tree setting page, select “Quick Tree”, choose strain name and date as tree tip label, and click “Build Tree”.
e. While the analysis is running, you can save the analysis to your Workbench by entering a name and then clicking “Save to Workbench”. Once it is saved, you can come back to the Workbench at any time to retrieve the analysis results.
f. After the analysis is finished, the View Phylogenetic Tree page will load. Here you can save the tree file in Newick or PhyloXML format to your computer. Click “View Tree” to load the Archaeopteryx Tree Viewer window.
Generate Phylogenetic Tree: The "Quick Tree" option uses the FastME algorithm [Desper, R., Gascuel, O. (2002) Journal of Computational Biology 9(5), pp. 687-705]. This algorithm uses a fast, distance-based approach and is good for 1) datasets containing more than 1,000 sequences of short or medium length, 2) datasets containing more than 100 very long sequences, or 3) reconstructing a "quick and dirty" tree. The "Custom Tree" option incorporates the PhyML [Guindon, S. and Gascuel, O. (2003) Syst Biol. 52: 696-704] or RAxML [Stamatakis, A. et al. (2005) Bioinformatics 21:456-463] algorithms. User-defined settings are required for either. PhyML infers a more evolutionarily accurate phylogenetic topology by applying a substitution model to the nucleotide sequences. This algorithm is best applied to datasets containing 1) fewer than 100 very long sequences or 2) between 100 and 1,000 sequences of small or medium length. When large datasets are input, ViPR will automatically use the RAxML algorithm.
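To make the "distance-based approach" behind Quick Tree more concrete, the sketch below computes pairwise p-distances (the proportion of differing sites) from a toy alignment. It only illustrates the kind of distance matrix that a distance method works from; it is not FastME itself and not necessarily the distance measure ViPR uses. The sequence names and sequences are invented for the example.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap or N."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # ignore columns that cannot be compared
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(aligned):
    """Return {(name_i, name_j): p-distance} for every unordered pair."""
    return {
        (i, j): p_distance(aligned[i], aligned[j])
        for i, j in combinations(aligned, 2)
    }


# Toy aligned sequences (hypothetical, equal length).
toy_alignment = {
    "seq_A": "ACGTACGTAC",
    "seq_B": "ACGTACGTAT",
    "seq_C": "ACATACCTAT",
}

matrix = distance_matrix(toy_alignment)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A tree-building method would then cluster the sequences so that pairs with small distances (here seq_A and seq_B) end up close together on the tree.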
g. A Tree Viewer window will pop up. Many tree customization options exist including: change tree type, reroot the tree, collapse/expand/display subtree, swap descendants, decorate (color) the tree leaves by any associated metadata (e.g., country, year of isolation, host, virus type, sequence position, etc.), resize the tree, change the font size, etc.
i. Color the tree leaves by country by selecting “Country” in the Basic Decoration Options section.
ii. Re-root the tree based on the South African sequences. To do so, make sure “Root/Reroot” is selected in the Tree Manipulations section, and then click the node next to the two South African sequences.
iii. Our previous Meta-CATS analysis identified amino acid position 522 in the polyprotein (232 in the E protein) as significantly different between the lineage 1a viruses and lineage 2 viruses, with V more likely in lineage 1a viruses and T more likely in lineage 2 viruses (p-value = 1.3E-128). Now color-code the tree by the corresponding nucleotide position, 1667.
This project is funded by the National Institute of Allergy and Infectious Diseases (NIH / DHHS) under Contract No. HHSN266200400041C and is a collaboration between NorthropGrumman Health IT, J. Craig Venter Institute , Vecna Technologies, SAGE Analytica and Los Alamos National Laboratory.
The IRD Tree Decorator is a custom enhancement of Archaeopteryx. The original FORESTER/ATV library is freely available from SourceForge. Credits: Zmasek C.M. and Eddy S.R. (2001) ATV: display and manipulation of annotated phylogenetic trees. Bioinformatics, 17, 383-384.
Click the "View Tree" button to launch the tree viewer software in a new window. If you prefer other viewing software, the tree data is available for download in Newick or PhyloXml format.
Due to security concerns, certain browsers (e.g. Safari and Firefox) have disabled Java plug-ins by default. If the Tree Viewer takes a long time to load, please test your browser's Java plug-in to make sure it can display Java Applets properly. Safari has recently tightened the security settings on Java Applets, which may affect image export functions of the Tree Viewer.
Enhanced Tree Viewer: The IRD team provides software that allows 'decoration' of your tree by features such as host species, year, country, and subtype. This custom software is based on Archaeopteryx. In the tree viewer, use the drop-down menu for basic decoration or advanced decoration to select the feature for coloring. The decorated tree and corresponding legend can be exported using options in the File drop-down menu.
iv. Click on the “Advanced Decoration” button and select the “Sequence Position” option from the drop-down menu.
v. Enter position 1667 into the textbox to highlight this nucleotide position.
vi. The tree shows that the South African sequences (lineage 2) both have A at this position while the American strains (lineage 1a) all have G at this position. This position, therefore, can be used to color-code the tree by taxonomic lineage.
vii. The default colors may or may not be ideal for your purpose. You can change the colors by using “Advanced Decoration”. In the Advanced Decoration Options dialog box, select “Sequence Position”, click the Manual Decoration checkbox, and click “Go”.
viii. Check A and choose red in the color palette, then click “Apply”. Now strains with A at position 1667 are colored in red.
ix. You can save the tree image by clicking the “File” menu and then choosing a file format.
h. Return to the Tree Results page. Save the tree analysis to your Workbench by clicking “Save Analysis”. Rename the analysis so that you can recognize it later, for example, “WNV 1999-2001 USA S Africa phylogeny”. Then click “Save”.
i. Go to your Workbench. The tree is listed at the top of the Workbench table. Click “View” to retrieve the tree analysis result. The parameters used to generate the tree are also saved.
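Step g.iii relates amino acid position 522 in the polyprotein to position 232 in the E protein. That relationship can be checked from the CDS start coordinates in the Protein Information table for strain 3356.2.1.1 (polyprotein starts at nucleotide 97, E at 967). The sketch below assumes 1-based coordinates; the function names are invented for this example.

```python
POLYPROTEIN_START = 97  # first nucleotide of the polyprotein CDS (from the table)
E_START = 967           # first nucleotide of the E protein CDS (from the table)


def aa_index_of(nt_pos, cds_start):
    """1-based amino acid index of the codon that contains nt_pos."""
    if nt_pos < cds_start:
        raise ValueError("position lies upstream of the CDS")
    return (nt_pos - cds_start) // 3 + 1


def codon_range(aa_index, cds_start):
    """(first, last) nucleotide of the codon for a 1-based amino acid index."""
    first = cds_start + 3 * (aa_index - 1)
    return first, first + 2


def polyprotein_to_protein(aa_index, protein_start,
                           polyprotein_start=POLYPROTEIN_START):
    """Convert a polyprotein amino acid index to an index within one protein."""
    first_aa_of_protein = aa_index_of(protein_start, polyprotein_start)
    return aa_index - first_aa_of_protein + 1


print("E begins at polyprotein residue", aa_index_of(E_START, POLYPROTEIN_START))
print("polyprotein 522 -> E", polyprotein_to_protein(522, E_START))
print("codon for polyprotein 522:", codon_range(522, POLYPROTEIN_START))
```

In the coordinates of this one genome the codon for residue 522 spans nucleotides 1660-1662, not 1667. The tutorial does not explain the difference; a plausible reason, stated here only as an assumption, is that the tree viewer numbers positions along the alignment of all 45 genomes, where inserted gap columns shift the numbering.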
III. Genotype & Recombination Detection
a. Download a WNV genome sequence to your computer from: http://tinyurl.com/l4dbg29.
b. Mouse-over “Analyze & Visualize” in the grey navigation bar and click “Genotype-Recombination Detection”.
c. When the landing page for this tool loads, either:
• Select the “Paste sequences in fasta format” option and paste the contents of the downloaded genome into the textbox, OR
• Select the “Upload a file containing my sequences in fasta format” option and find the correct file on your own computer.
d. Next, select the “West Nile Virus” species from the drop-down list and click the “Run” button.
e. The results should be displayed in a table, with summary information for each analyzed strain in a separate row. To view detailed results for the chimera strain, click the “View” link in the first column of the table.
f. On the Genotype Report page, you can:
• View the predicted genotype and recombination type (if applicable).
• Download a spreadsheet listing the detailed results of recombination determination.
• View the genotyping results in graphical format.
• Download or view the alignment of your sequence with representative sequences from each taxon selected by ViPR.
• Download or view the phylogenetic tree based on the alignment of your sequence with representative sequences from each taxon selected by ViPR.
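Before pasting or uploading a sequence in step c, it can help to confirm that the text really is a single FASTA record, since the tool needs only one sequence and uses the defline as the display label. The following is a minimal sketch of such a check; the function names are invented, the accepted alphabet (IUPAC nucleotide codes plus gaps) is an assumption, and the sample record is shortened to the first 32 bases of the chimera sequence.

```python
def parse_fasta(text):
    """Return a list of (defline, sequence) tuples from FASTA-formatted text."""
    records = []
    defline, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if defline is not None:
                records.append((defline, "".join(chunks)))
            defline, chunks = line[1:].strip(), []
        elif defline is None:
            raise ValueError("sequence data found before the first '>' defline")
        else:
            chunks.append(line)
    if defline is not None:
        records.append((defline, "".join(chunks)))
    return records


def check_single_nucleotide_record(text):
    """Return (defline, length) if text holds exactly one nucleotide record."""
    records = parse_fasta(text)
    if len(records) != 1:
        raise ValueError(f"expected 1 sequence, found {len(records)}")
    defline, sequence = records[0]
    # Assumed alphabet: IUPAC nucleotide codes plus the gap character.
    unexpected = set(sequence.upper()) - set("ACGTUNRYKMSWBDHV-")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return defline, len(sequence)


# Shortened sample: only the first 32 bases of the chimera sequence.
SAMPLE = ">WNV-Chimera|New York 99|SA93/01\nAGTAGTTCGCCTGTGTGAGC\nTGACAAACTTAG\n"

label, n_bases = check_single_nucleotide_record(SAMPLE)
print(label, n_bases)
```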
Input form for the Genotype Determination and Recombination Detection tool:
• Source of sequences to be analyzed: upload a file containing your sequences in FASTA format, paste sequences in FASTA format, or use working sets. Sequences can also be selected from search results or a working set in your Workbench.
• Only 1 sequence is needed. The defline in your FASTA file will be used to label the display.
• Select species: West Nile Virus
The sequence as it appears in the paste box:
>WNV-Chimera|New York 99|SA93/01
AGTAGTTCGCCTGTGTGAGCTGACAAACTTAGTAGTGTTTGTGAGGATTAACAACAATTAACACAGTGCGAGCTGTTTCTTAGCACGAAGATCTCGATGTCTAAGAAACCAGGAGGGCCCGGCAAGAGCCGGGCTGTCAATATGCTAAAACGCGGAATGCCCCGCGTGTTGTCCTTGATTGGACTGAAGAGGGCTATGTTGAGCCTGATCGACGGCAAGGGGCCAATACGATTTGTGTTGGCTCTCTTGGCGTTCTTCAGGTTCACAGCAATTGCTCCGACCCGAGCAGTGCTGGATCGATGGAGAGGTGTGAACAAACAAACAGCGATGAAACACCTTCTGAGTTTTAA
Genotype Determination and Recombination Detection: This annotation pipeline takes an alignment of sequences containing at least two representatives from each taxon. This reference alignment is then used to construct a distance-based tree, which is then parsed in order to find the closest relatives for any query sequence using a Branch Indexing method. By incorporating a static window size, this pipeline can also identify any recombinant query sequence. When the analysis is completed, a graphical representation of the score corresponding to the genotype classification for each region of the "sliding window" will be shown. A spreadsheet file with the results will also be available for download. This tool is based on the Genotype Determination Tool developed by Carla Kuiken's group at Los Alamos National Laboratory for the HCV database.
Genotype Recombination Analysis Result (your analysis contains 1 record):

| Defline | Species | Status (Genotype) | BI (Genotype) | Genotype | Status (Recombination) | Recombination | Comment |
|---|---|---|---|---|---|---|---|
| WNV_Chime | WESTNILE | Success | 0.354 | 2 | Success | 1A,2 | -N/A- |
Note that this strain was artificially made to be a recombinant, which was detected by this algorithm. Strains that have natural recombination will be detected in a similar way.
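The idea of scanning a query with a fixed-size window and genotyping each window separately can be sketched as follows. This is a deliberately simplified stand-in: it assigns each window to the reference with the smallest proportion of mismatches, whereas the ViPR pipeline builds a distance-based tree and applies a Branch Indexing method. The window size, step, reference sequences, and query below are all toy values chosen for illustration.

```python
def p_distance(a, b):
    """Fraction of mismatched positions between two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("segments must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def genotype_by_window(query, references, window, step):
    """Return [(window_start, closest_genotype)] along an aligned query."""
    calls = []
    for start in range(0, len(query) - window + 1, step):
        segment = query[start:start + window]
        best = min(
            references,
            key=lambda name: p_distance(segment, references[name][start:start + window]),
        )
        calls.append((start, best))
    return calls


def recombination_type(calls):
    """Genotypes seen across the windows, in order of first appearance."""
    seen = []
    for _, genotype in calls:
        if genotype not in seen:
            seen.append(genotype)
    return seen


# Toy data: two artificial references and a query that switches from one
# to the other after position 10, mimicking an artificial chimera.
references = {"1A": "A" * 24, "2": "C" * 24}
query = "A" * 10 + "C" * 14

calls = genotype_by_window(query, references, window=8, step=4)
print(calls)
print("genotypes along the genome:", recombination_type(calls))
```

A query whose windows all match the same reference yields a single genotype; more than one genotype across windows, as in the "1A,2" result above, is the signature of a recombinant.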
IV. Genome Annotation
a. Mouse-over “Analyze & Visualize” in the grey navigation bar and click “Genome Annotator (GATU)”.
b. To annotate your own sequence, you need to select a previously annotated reference sequence. If you already have an annotated reference sequence in .gb format, click “Launch GATU” to proceed directly. If not, you can use ViPR BLAST to search for a closely-related annotated sequence as your reference.
i. If you have your own sequence, prepare the sequence in FASTA format, save it in plain
text and use .fasta as the file extension. FASTA file example:
2/20/14 3:40 PMVirus Pathogen Database and Analysis Resource (ViPR) - Flaviviridae - Genotype Recombination Details
Page 1 of 2http://www.viprbrc.org/brc/genotypeRecombination.do?decorator=flavi&method=ShowDetails&ticketNumber=GR_566099818252&shortName=_seq001
Loading Virus Pathogen Database and Analysis Resource (ViPR)...
Genotype
Download
The genotype results include a tab separated file listing the sequence name, a single consensus genotype result for the entire genome, and the confidence metric.
Recombination
Download
This is a tab separated file listing the results for all windows for the sequence.
Genotyping results in graphical format
Alignment
Download Aligned Fasta Visualize Aligned Sequences
This is the multiple sequence alignment of your sequence with a ViPR reference sequence alignment that consists of at least 2 representatives from each taxon
Tree
View phylogentic treeDownload Newick File
This is the tree generated by PAUP based on the input alignment for the whole genome
Genotype Report
Genotype InformationWhole Genome Genotype prediction: 2
Whole Genome Recombination Type: 1A,2
Run Analysis
ViPR Home Flaviviridae Home Genotype determination and Recombination detection Results Details
SEARCH DATA ANALYZE & VISUALIZE WORKBENCH SUBMIT DATA VIRUS FAMILIES You are logged in as [email protected]
FlaviviridaeAbout Us Community Announcements Links Resources Support
2/14/14 Virus Pathogen Database and Analysis Resource (ViPR) - Flaviviridae - GATU
www.viprbrc.org/brc/gatuStart.do?decorator=flavi 1/1
Loading Virus Pathogen Database and Analysis Resource (ViPR)...
Cite ViPR Tutorials Report a Bug Request Web Training Contact Us Release Date: Feb 11, 2014
This project is funded by the National Institute of Allergy and Infectious Diseases (NIH / DHHS) under Contract No. HHSN272200900041C and is a collaboration between NorthropGrumman Health IT, J. Craig Venter Institute and Vecna Technologies. Virus images courtesy of CDC Public Health Image Library, Wellcome Images, U.S. Department of Veterans Affairs ,Science of the Invisible and ViralZone, Swiss Institute of Bioinformatics.
Go
REFERENCE SEQUENCE
To use GATU, you will need to select a reference sequence. If you already have an appropriate GenBank file, proceed directly to Launch GATU. If not,you can use ViPR Blast to search for one. Browse to your target sequence in a FASTA format, then click on Go. Run a Blast search, pick a referencesequence file, and download it to your directory in GenBank format. Then click Launch GATU and upload your reference and target sequences using therespective controls.
File Path:
No file chosenChoose File
Launch GATU
Go
FILE FORMAT CONVERSION
The GATU-produced annotation file can be modified for submission to GenBank by using the file format conversion tool. This tool will convert the GATU-produced (GenBank format) annotation file into a Fasta sequence file and a tab-delimited 'Feature Table' file. Both files can then be used for submittingnew sequences to GenBank with the Sequin and tbl2asn tools.
File Path:
No file chosenChoose File
ANALYSIS NAME
Genome Annotator (GATU)GATU, a Genome Annotation Transfer Utility (Tcherepanov, et al., BMC Genomics 2006, 7:150 PubMed: 16772042) is an initial-stage tool to transfer annotations from apreviously annotated reference to a new, closely-related target genome. ViPR users should ensure that their system has Java 1.6 or higher. The GATU interface providescontrols for uploading a reference .gb file of the relevant viral family, along with the target genome in .gb or Fasta format. When done, a table summarizes the similarities oftransferred annotations and provides users with checkbox control over which to accept. GATU also detects ORFs in the target and bioinformatics tools to assess if theseshould be annotated. The annotated target genome can be saved in multiple file formats.
Originally developed at the University of Victoria, GATU was adapted for use with ViPR.
SEARCH DATA ANALYZE & VISUALIZE WORKBENCH SUBMIT DATA VIRUS FAMILIES You are logged in as [email protected]
FlaviviridaeAbout Us Community Announcements Links Resources Support
Launch GATU directly if you have a reference
sequence in .gb format.
Use your target sequence to BLAST for a closely-related annotated
sequence as your reference.
Created by the ViPR/IRD team and licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License 26
>gb:EF657887|Organism:West Nile virus 3356K VP2|Subtype:null|Host:Crow
AGTAGTTCGCCTGTGTGAGCTGACAAACTTAGTAGTGTTTGTGAGGATTAACAACAATTAACACAGTGCG
AGCTGTTTCTTAGCACGAAGATCTCGATGTCTAAGAAACCAGGAGGGCCCGGCAAGAGCCGGGCTGTCAA
Otherwise, you can use a sample sequence from: https://tinyurl.com/mz62l34
ii. Click “Browse”, find the target sequence file on your computer, and click “Go” to run a BLAST search against annotated WNV reference sequences in ViPR.
iii. After BLAST is finished, a list of recommended reference sequences will be displayed. Choose a closely-related sequence and download its GenBank file to your computer.
c. Now, click “Launch GATU” to run the GATU application. A dialog box will pop up. Click “Allow” to allow the GATU applet to be loaded on your computer.
d. In the GATU window, upload your .gb file as the “Reference Genome” and your target genome FASTA file as the “Genome to Annotate”.
e. Click “Annotate” to run the annotation process. When done, a table is displayed which summarizes the similarities of transferred annotations and provides users with checkbox control over which to accept.
f. Click “Save” to save the annotated target genome in GenBank, EMBL, or XML file formats.
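Before uploading a target file in the steps above, it can help to confirm that it is well-formed FASTA. The Python sketch below reads FASTA text, splits a ViPR-style header (such as the sample header shown earlier) into its pipe-delimited fields, and reports any characters that are not nucleotide codes. The function names and the set of allowed characters are assumptions made for illustration; they are not part of ViPR or GATU.

```python
def read_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def parse_vipr_header(header):
    """Split a header like 'gb:EF657887|Organism:...|Host:Crow' into a dict."""
    fields = {}
    for part in header.lstrip(">").split("|"):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def invalid_bases(sequence, allowed="ACGTUNRYKMSWBDHV"):
    """Return the sorted characters in `sequence` that are not in `allowed`."""
    return sorted(set(sequence.upper()) - set(allowed))


if __name__ == "__main__":
    sample = (
        ">gb:EF657887|Organism:West Nile virus 3356K VP2|Subtype:null|Host:Crow\n"
        "AGTAGTTCGCCTGTGTGAGCTGACAAACTTAGTAGTGTTTGTGAGGATTAACAACAATTAACACAGTGCG\n"
    )
    for header, seq in read_fasta(sample):
        print(parse_vipr_header(header), len(seq), invalid_bases(seq))
```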
REFERENCE SEQUENCE
Here are some recommended Reference Sequences. Select one, click the link Download GenBank File, and save the file to your local machine. Now click Launch GATU. Under Genome Selection > Reference Genome, click Upload Genome File and browse to the saved Reference.
Download GenBank File  Sequence header                                                                       Bit Score  E Value
EXT853727              >gi|853727| Country:USA| West Nile virus, complete genome.|gb|158516887               21740      0.0
EXT870105              >gi|870105| Country:| West Nile virus, complete genome.|gb|11528013                   2553       0.0
EXT849462              >gi|849462| Country:| Murray Valley encephalitis virus, complete genome.|gb|9633622   204        5.0E-51
EXT391627              >gi|391627| Country:USA| St. Louis encephalitis virus, complete genome.|gb|123205971  153        2.0E-35
EXT844430              >gi|844430| Country:Austria| Usutu virus, complete genome.|gb|56692441                145        4.0E-33
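One way to think about choosing a reference from hits like these is to keep only hits from the same virus species and take the one with the highest bit score. The Python sketch below applies that rule to the five rows shown above. The `BlastHit` type, the `pick_reference` function, and the E-value cutoff are hypothetical and serve only as an illustration; ViPR does not necessarily rank its recommendations this way.

```python
from collections import namedtuple

# link_id is the value shown in the "Download GenBank File" column.
BlastHit = namedtuple("BlastHit", "link_id description bit_score e_value")

HITS = [
    BlastHit("EXT853727", "West Nile virus, complete genome.", 21740.0, 0.0),
    BlastHit("EXT870105", "West Nile virus, complete genome.", 2553.0, 0.0),
    BlastHit("EXT849462", "Murray Valley encephalitis virus, complete genome.", 204.0, 5.0e-51),
    BlastHit("EXT391627", "St. Louis encephalitis virus, complete genome.", 153.0, 2.0e-35),
    BlastHit("EXT844430", "Usutu virus, complete genome.", 145.0, 4.0e-33),
]


def pick_reference(hits, organism="West Nile virus", max_e_value=1e-10):
    """Return the same-species hit with the highest bit score, or None."""
    candidates = [
        hit for hit in hits
        if hit.description.startswith(organism) and hit.e_value <= max_e_value
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda hit: hit.bit_score)


if __name__ == "__main__":
    best = pick_reference(HITS)
    print(best.link_id, best.bit_score)  # EXT853727 21740.0
```

Here the top West Nile virus hit scores far above the second one, so the choice is clear. With closer scores, the annotation quality of each candidate GenBank record would also be worth checking.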