1
Virus Pathogen Resource (ViPR): an Open Bioinformatics Database and Analysis Resource for Human Virus Pathogen Research Yun Zhang 1 , Brett Pickett 1 , Eva Rab 1 , Jyothi Noronha 1 , R. Burke Squires 1 , Victoria Hunt 1 , Mengya Liu 2 , Liwei Zhou 3 , Chris Larson 4 , Jonathan Dietrich 3 , Edward B. Klem 3 , Richard H. Scheuermann 1,5 1 Department of Pathology, 5 Division of Biomedical Informatics, Univ. of Texas Southwestern Medical Center, Dallas, TX; 2 Southern Methodist Univ., Dallas, TX; 3 Northrop Grumman Health Solutions, Rockville MD; 4 Vecna Technologies, Greenbelt MD. Introduction Figure 2: A screenshot of the Sequence Feature Details page. The details page displays strain information, Sequence Feature information, and a table containing all Variant Types for the selected Sequence Feature. 1 Pickett, B.E., et al. (2012) ViPR: an open bioinformatics database and analysis resource for virology research. Nucl. Acids Res. 40(D1): D593-D598 2 UniProt Consortium (2011) Ongoing and future developments at the Universal Protein Resource. Nucleic acids research, 39, D214-219. 3 Sayers, E.W., et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic acids research, 39, D38-51. 4 Vita, R., et al. (2010) The immune epitope database 2.0. Nucleic acids research, 38, D854-862. Edgar, R.C., 2004. 5 Darling, A.C.E., et al. (2004) Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements. Genome Res., 14: 1394-1403 6 Zmasek, C.M. and Eddy, S.R. (2001) ATV: display and manipulation of annotated phylogenetic trees. Bioinformatics, 17, 383-384. 7 Hanson, R. (2010) Jmol - a paradigm shift in crystallographic visualization. Journal of Applied Crystallography, 43, 1250-1260. We would like to thank the primary data providers for the data that was used throughout this study. We also recognize the scientific and technical personnel responsible for supporting and developing ViPR, which has been wholly supported with federal funds from the NIH/NIAID (N01AI2008038 to R.H.S.). Figure 5: 3D Protein Structure Viewer 7 in ViPR. A display of a Vaccinia Virus N1 protein 3D structure with ligands highlighted (PDB ID: 2UXE). ViPR combines the strength of a relational database with a suite of bioinformatics integrated tools to support everything from basic sequence and structural analyses to genotype-phenotype studies and host-virus interaction studies. The uniqueness of ViPR lies in: • integrating data from various sources • capturing unique data on host responses to virus infections • encouraging the analysis of the comprehensive data contained within the system • combining the available tools to quickly perform complex analytical workflows • facilitating rapid hypothesis generation using bioinformatics methods for subsequent experimental testing • allowing data sharing and storage with collaborators Figure 1: A screenshot of the ViPR homepage. The ViPR homepage is the portal used to access the various types of data and advanced functionality within the system. The Virus Pathogen Database and Analysis Resource (ViPR, www.viprbrc.org ), sponsored by the National Institute of Allergy and Infectious Diseases serves as a single publicly-accessible repository of integrated datasets and analysis tools for 14 different virus families to support wet-bench virology researchers focusing on the development of diagnostics, prophylactics, vaccines, and treatments for these pathogens 1 . ViPR Supports 14 Virus Families Arenaviridae, Bunyaviridae, Caliciviridae, Coronaviridae, Filoviridae, Flaviviridae, Hepeviridae, Herpesviridae, Paramyxoviridae, Picornaviridae, Poxviridae, Reoviridae, Rhabdoviridae, and Togaviridae. ViPR Intergrates Data from Many Sources • GenBank sequence records, gene annotations, and strain metadata Gene Ontology (GO) classifications UniProtKB protein annotations • Protein Databank (PDB) 3D protein structures • Immune epitopes from the Immune Epitope Database (IEDB) Clinical data • Host-Pathogen Interaction Data generated from the NIAID Systems Virology projects and the ViPR-funded Driving Biological Projects • Additional data derived from computational algorithms ViPR Provides Analysis and Visualization Tools • Genome Annotator • BLAST Sequence Similarity Search • Multiple Sequence Alignment • Phylogenetic Tree Construction • 3D Protein Structures with Sequence Feature or Epitope Highlights • Sequence Feature Variant Type (SFVT) Analysis • Metadata-driven Comparative Genomics Analysis • SNP Analysis ViPR enables you to store and share data and results through the ViPR Workbench Figure 3: Comparative Genomic Analytical tools in ViPR (A) A multiple sequence alignment of Vaccinia genome sequences calculated with Mauve 5 . (B) A Metadata-driven Comparative Analysis Tool to identify individual positions that correlate with a metadata attribute. (C) A phylogenetic tree visualized using the Archaeopteryx 6 tool showing the relationship between Vaccinia viruses. A C B Multiple Sequence Alignment, Phylogenetic Tree and Metadata-driven Comparative Analysis Tool 3D Protein Structure Viewer Summary Acknowledgements References Sequence Feature Variant Type (SFVT) Sequence Features (SFs): characterized structural, functional, immune epitope, or sequence alteration regions of a protein manually curated from UniProt 2 , GenBank 3 , and the Immune Epitope Database 4 and then validated by expert researchers. Variant Type (VT): Polymorphisms in each Sequence Feature are identified as “Variant Types” of the Sequence Feature. • Available for hepatitis C virus subtype 1a, Dengue virus type 1-4, and Orthopoxvirus (Vaccinia) in ViPR. • Enables researchers to quickly query and analyze the genotypic changes for all sequence records that could be associated with a given phenotype. Host-virus Interaction Data Figure 4: Host Factor Data in ViPR. An example of a host factor experiment result summary showing differentially expressed genes in human cells infected with H5N1.

Virus Pathogen Resource ( ViPR ) :

Embed Size (px)

DESCRIPTION

Virus Pathogen Resource ( ViPR ) : an Open Bioinformatics Database and Analysis Resource for Human Virus Pathogen Research Yun Zhang 1 , Brett Pickett 1 , Eva Rab 1 , Jyothi Noronha 1 , R. Burke Squires 1 , Victoria Hunt 1 , Mengya Liu 2 , Liwei Zhou 3 , - PowerPoint PPT Presentation

Citation preview

Page 1: Virus Pathogen  Resource ( ViPR ) :

Virus Pathogen Resource (ViPR):an Open Bioinformatics Database and Analysis Resource for Human Virus Pathogen Research

Yun Zhang1, Brett Pickett1, Eva Rab1, Jyothi Noronha1, R. Burke Squires1, Victoria Hunt1, Mengya Liu2, Liwei Zhou3, Chris Larson4, Jonathan Dietrich3, Edward B. Klem3, Richard H. Scheuermann1,5

1Department of Pathology, 5Division of Biomedical Informatics, Univ. of Texas Southwestern Medical Center, Dallas, TX; 2Southern Methodist Univ., Dallas, TX; 3Northrop Grumman Health Solutions, Rockville MD; 4Vecna Technologies, Greenbelt MD.

Introduction

Figure 2: A screenshot of the Sequence Feature Details page. The details page displays strain information, Sequence Feature information, and a table containing all Variant Types for the selected Sequence Feature.

1 Pickett, B.E., et al. (2012) ViPR: an open bioinformatics database and analysis resource for virology research. Nucl. Acids Res. 40(D1): D593-D598

2UniProt Consortium (2011) Ongoing and future developments at the Universal Protein Resource. Nucleic acids research, 39, D214-219. 3 Sayers, E.W., et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic acids research, 39, D38-51.4Vita, R., et al. (2010) The immune epitope database 2.0. Nucleic acids research, 38, D854-862. Edgar, R.C., 2004. 5Darling, A.C.E., et al. (2004) Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements. Genome Res., 14: 1394-

14036Zmasek, C.M. and Eddy, S.R. (2001) ATV: display and manipulation of annotated phylogenetic trees. Bioinformatics, 17, 383-384.7Hanson, R. (2010) Jmol - a paradigm shift in crystallographic visualization. Journal of Applied Crystallography, 43, 1250-1260.

We would like to thank the primary data providers for the data that was used throughout this study. We also recognize the scientific and technical personnel responsible for supporting and developing ViPR, which has been wholly supported with federal funds from the NIH/NIAID (N01AI2008038 to R.H.S.).

Figure 5: 3D Protein Structure Viewer7 in ViPR.A display of a Vaccinia Virus N1 protein 3D structure with ligands highlighted (PDB ID: 2UXE).

ViPR combines the strength of a relational database with a suite of bioinformatics integrated tools to support everything from basic sequence and structural analyses to genotype-phenotype studies and host-virus interaction studies. The uniqueness of ViPR lies in:

• integrating data from various sources• capturing unique data on host responses to virus infections• encouraging the analysis of the comprehensive data

contained within the system• combining the available tools to quickly perform complex

analytical workflows• facilitating rapid hypothesis generation using bioinformatics

methods for subsequent experimental testing• allowing data sharing and storage with collaborators

Figure 1: A screenshot of the ViPR homepage. The ViPR homepage is the portal used to access the various types of data and advanced functionality within the system.

The Virus Pathogen Database and Analysis Resource (ViPR, www.viprbrc.org), sponsored by the National Institute of Allergy and Infectious Diseases serves as a single publicly-accessible repository of integrated datasets and analysis tools for 14 different virus families to support wet-bench virology researchers focusing on the development of diagnostics, prophylactics, vaccines, and treatments for these pathogens1.

ViPR Supports 14 Virus FamiliesArenaviridae, Bunyaviridae, Caliciviridae, Coronaviridae,

Filoviridae, Flaviviridae, Hepeviridae, Herpesviridae, Paramyxoviridae, Picornaviridae, Poxviridae, Reoviridae, Rhabdoviridae, and Togaviridae.

ViPR Intergrates Data from Many Sources• GenBank sequence records, gene annotations, and strain

metadata • Gene Ontology (GO) classifications• UniProtKB protein annotations• Protein Databank (PDB) 3D protein structures• Immune epitopes from the Immune Epitope Database

(IEDB)• Clinical data• Host-Pathogen Interaction Data generated from the NIAID

Systems Virology projects and the ViPR-funded Driving Biological Projects

• Additional data derived from computational algorithms

ViPR Provides Analysis and Visualization Tools• Genome Annotator• BLAST Sequence Similarity Search• Multiple Sequence Alignment• Phylogenetic Tree Construction• 3D Protein Structures with Sequence Feature or Epitope

Highlights• Sequence Feature Variant Type (SFVT) Analysis• Metadata-driven Comparative Genomics Analysis• SNP Analysis

ViPR enables you to store and share data and results through the ViPR Workbench

Figure 3: Comparative Genomic Analytical tools in ViPR(A) A multiple sequence alignment of Vaccinia genome sequences calculated with Mauve5. (B) A Metadata-driven Comparative Analysis Tool to identify individual positions that correlate with a metadata attribute. (C) A phylogenetic tree visualized using the Archaeopteryx6 tool showing the relationship between Vaccinia viruses.

A

C

B

Multiple Sequence Alignment,Phylogenetic Tree and Metadata-driven

Comparative Analysis Tool

3D Protein Structure Viewer

Summary

Acknowledgements

References

Sequence Feature Variant Type (SFVT)• Sequence Features (SFs): characterized structural, functional,

immune epitope, or sequence alteration regions of a protein manually curated from UniProt2, GenBank3, and the Immune Epitope Database4 and then validated by expert researchers.

• Variant Type (VT): Polymorphisms in each Sequence Feature are identified as “Variant Types” of the Sequence Feature.

• Available for hepatitis C virus subtype 1a, Dengue virus type 1-4, and Orthopoxvirus (Vaccinia) in ViPR.

• Enables researchers to quickly query and analyze the genotypic changes for all sequence records that could be associated with a given phenotype.

Host-virus Interaction Data

Figure 4: Host Factor Data in ViPR.An example of a host factor experiment result summary showing differentially expressed genes in human cells infected with H5N1.