www.fludb.org Sequence Feature Variant Type and Evolutionary Trajectory Analysis using the Influenza Research Database (IRD) 19 July 2011 Richard H. Scheuermann, Ph.D. Department of Pathology U.T. Southwestern Medical Center


Outline

• Brief overview of NIAID-Sponsored Influenza Research Database (IRD)
  – Comprehensive integrated database
  – Analysis and visualization tools
  – U.S. NIH-funded, free access, open to all
  – Developed by a team of research scientists, bioinformaticians and professional software developers
  – www.fludb.org
  – www.viprbrc.org for other human viral pathogens

• Novel approach to genotype-phenotype association studies – Sequence Feature Variant Type (SFVT) analysis

• Evolutionary Trajectory analysis of the pandemic (H1N1) 2009 strain

Public Health Impact of Influenza

• Seasonal flu epidemics occur yearly during the fall and winter months and result in 3-5 million cases of severe illness worldwide.

• More than 200,000 people are hospitalized each year with seasonal flu-related complications in the U.S.

• Approximately 36,000 deaths occur due to seasonal flu each year in the U.S.

• Populations at highest risk are children under age 2, adults age 65 and older, and groups with other comorbidities.

• Pandemics
  – 1918 Spanish flu (H1N1); 20 - 100 million deaths
  – 1957 Asian flu (H2N2); 1 - 1.5 million deaths
  – 1968 Hong Kong flu (H3N2); 750,000 - 1 million deaths
  – 2009 Swine origin (H1N1); > 16,000 deaths as of March 2010

Source: World Health Organization - http://www.who.int/mediacentre/factsheets/fs211/en/index.html

Influenza Virus

Orthomyxoviridae family
Negative-strand RNA
Segmented
Enveloped

8 RNA segments encode 11 proteins
Classified based on serology of HA and NA

IRD Overview

www.fludb.org

Search Access to Data

Data Types

Core Query Attributes

Advanced Query Options

Segment search results

Analysis and Visualization

Analysis and Visualization Tools

Workbench Access

My Private Workbench

www.viprbrc.org

IRD Summary

• Funded by U.S. National Institute of Allergy and Infectious Diseases (NIAID)
• Free and open access with no use restrictions
• Developed by a team of research scientists, bioinformaticians and professional software developers
• Comprehensive collection of public data
• Novel derived data, novel analytical tools, unique functions
• Integration – Integration – Integration
• www.fludb.org
• www.viprbrc.org


NOVEL APPROACH TO GENOTYPE-PHENOTYPE ASSOCIATION STUDIES – SEQUENCE FEATURE VARIANT TYPE (SFVT) ANALYSIS

Limitations to Phylogenetics

• Traditional virus phylogenetics focuses on comparative analysis of whole genome/genome segments, and is most useful to understand virus evolution

• However, the genetic determinants of important viral phenotypes, e.g. virulence, host range, replication efficiency, immune response evasion, etc., are determined by focused functional regions of viral proteins

• Therefore, specific genotype-phenotype association can be masked by other evolutionary factors that contribute to traditional phylogenetic analysis

SFVT approach

Influenza A_NS1_nuclear-export-signal_137(10)

VT-1   I F D R L E T L I L
VT-2   I F N R L E T L I L
VT-3   I F D R L E T I V L
VT-4   L F D Q L E T L V S
VT-5   I F D R L E N L T L
VT-6   I F N R L E A L I L
VT-7   I Y D R L E T L I L
VT-8   I F D R L E T L V L
VT-9   I F D R L E N I V L
VT-10  I F E R L E T L I L
VT-11  L F D Q M E T L V S

• Identify regions of protein/gene with known structural or functional properties – Sequence Features (SF)
  – an alpha-helical region, the binding site for another protein, an enzyme active site, an immune epitope
• Determine the extent of sequence variation for each SF by defining each unique sequence as a Variant Type (VT)
• High-level, comprehensive grouping of all virus strains by VT membership for each SF independently
• Genotype-phenotype association statistical analysis, e.g. genetic determinants of host range, virulence, replication rate

Influenza A_NS1_alpha-helix_171(17)
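The VT definition on this slide (each unique sequence within an SF is one Variant Type) can be sketched in a few lines of Python. This is an illustration only, not IRD's implementation: the strain names are hypothetical, and numbering VTs by decreasing frequency is an assumption, so the VT numbers produced here do not reproduce the numbering in the table on the slide.

```python
from collections import Counter


def assign_variant_types(sf_sequences):
    """Group strains by the exact sequence they carry in one Sequence Feature.

    sf_sequences maps strain name -> amino acid string for the SF region.
    VTs are numbered from 1 by decreasing frequency (an assumption made for
    this sketch); ties keep the order in which sequences were first seen.
    """
    counts = Counter(sf_sequences.values())
    vt_sequences = {
        f"VT-{i}": seq
        for i, (seq, _) in enumerate(counts.most_common(), start=1)
    }
    vt_by_sequence = {seq: vt for vt, seq in vt_sequences.items()}
    vt_of_strain = {
        strain: vt_by_sequence[seq] for strain, seq in sf_sequences.items()
    }
    return vt_of_strain, vt_sequences


# Hypothetical strains carrying three of the NS1 nuclear export signal
# sequences listed on the slide.
example_strains = {
    "strain_a": "IFDRLETLIL",
    "strain_b": "IFDRLETLIL",
    "strain_c": "IFNRLETLIL",
    "strain_d": "LFDQLETLVS",
    "strain_e": "IFDRLETLIL",
}
vt_of_strain, vt_sequences = assign_variant_types(example_strains)
```

Because each SF is processed independently, the same function would be called once per SF, giving every strain one VT label per feature.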

SF definition

• Based on experimentation reported in the literature and 3D protein structures (PDB records)
• Captured by manual curation
• Defined by the specific amino acid positions in the polypeptide chain
• Annotated with the known structural or functional properties


Influenza A Sequence Features as of 18JUL2011

4128 SFs total

NS1 Sequence Features

SF8 (nuclear export signal)

VT for SF8 (nuclear export signal)

VT-1 strains

DO VARIATIONS IN NS1 SEQUENCE FEATURES INFLUENCE INFLUENZA VIRUS HOST RANGE?

NS1 Sequence Features

VT for SF8 (nuclear export signal)

VT distribution by host

Causes of apparent NS1 VT-associated host range restriction

• Virus spread - capability + opportunity
  – Phenotypic property of the virus – limited capacity
  – Restricted founder effect – limited opportunity
• Restricted spatial-temporal distribution
• Sampling bias – assumption of random sampling
  – Oversampling – avian H5N1 in Asia; 2009 H1N1
  – Undersampling – large and domestic cats
• Linkage to causative variant
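One way to quantify an apparent VT-host association is a Pearson chi-square statistic on the VT-by-host contingency table. The slides do not say which test was used, so the choice of test, the function name and the counts below are all assumptions made for illustration. Note that the test assumes random sampling, which is exactly the assumption the sampling-bias bullet above calls into question.

```python
from collections import Counter


def chi_square_statistic(pairs):
    """Pearson chi-square statistic for a VT-by-host contingency table.

    pairs is a list of (variant_type, host) tuples, one per strain.
    Returns the statistic and its degrees of freedom.
    """
    observed = Counter(pairs)
    vt_totals = Counter(vt for vt, _ in pairs)
    host_totals = Counter(host for _, host in pairs)
    n = len(pairs)
    statistic = 0.0
    for vt, vt_n in vt_totals.items():
        for host, host_n in host_totals.items():
            # Expected count if VT and host were independent.
            expected = vt_n * host_n / n
            statistic += (observed[(vt, host)] - expected) ** 2 / expected
    dof = (len(vt_totals) - 1) * (len(host_totals) - 1)
    return statistic, dof


# Made-up counts, not taken from IRD.
example_pairs = (
    [("VT-1", "Human")] * 8
    + [("VT-1", "Avian")] * 2
    + [("VT-4", "Human")] * 1
    + [("VT-4", "Avian")] * 9
)
statistic, dof = chi_square_statistic(example_pairs)
```

A large statistic only shows that VT and host are not independent in the sample; it cannot distinguish a true phenotypic restriction from founder effects, sampling bias or linkage to a causative variant elsewhere in the genome.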

VT-11 strains

VT for SF8 (nuclear export signal)

VT lineages

VT-4 lineage

VT-4 lineage = B allele/group

VT-16 & VT-9 lineages

VT-7 lineage

EVOLUTIONARY TRAJECTORY ANALYSIS OF THE PANDEMIC (H1N1) 2009 STRAIN

Phylogenetic Analysis

• Evolutionary origin
  – Select a representative pandemic (H1N1) 2009 sequence from the IRD database
  – BLAST to identify most similar sequences
  – Assess phylogenetic relationships

Pandemic (H1N1) 2009 selection

BLAST Result

Segment 1 phylogenetic tree

[Tree labels: Swine/Ohio/2004; Duck/USA/2000s; Human/USA/2007 (seasonal); Swine/USA/1990s; Pandemic (H1N1) 2009]

Temporal component

• Reference strain
  – A/California/04/2009
• BLAST
  – Return top 1000 results
• Normalize data
• Graph nucleotide differences versus isolation year differences
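The last step above, plotting nucleotide differences against isolation year differences, can be sketched as follows. This is a minimal illustration under stated assumptions: the hit sequences are already aligned to the reference and of equal length, the year difference is taken as an absolute value, and the toy sequences and hit names are invented. The "Normalize data" step is not specified on the slide and is not reproduced here.

```python
def count_differences(reference, other):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(reference, other) if a != b)


def trajectory_points(ref_seq, ref_year, hits):
    """Return (isolation year difference, nucleotide differences) per hit.

    hits is a list of (strain_name, isolation_year, aligned_sequence).
    """
    return [
        (abs(ref_year - year), count_differences(ref_seq, seq))
        for _, year, seq in hits
    ]


# Toy data standing in for a reference segment and its BLAST hits.
reference_sequence = "ATGGCAAGC"
example_hits = [
    ("hit_1", 2005, "ATGGCAAGT"),
    ("hit_2", 1990, "ATGACTAGT"),
]
points = trajectory_points(reference_sequence, 2009, example_hits)
```

Each resulting pair is one point on the per-segment charts that follow, with year difference on the x-axis and nucleotide differences on the y-axis.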

NP chart

NS chart

HA chart

[Figure labels: Group 1, Group 3, Group 2]

[Figure: NS blue cluster (G1); Cali/04/09 marked]

[Figure: NS green cluster (G2); Cali/04/09 marked]

Phylogenetic Trees Quantification

• Analysis method
  – Build tree for Group 1 and Group 2 strains separately
  – Analyze branch lengths of trees
• Results
  – Avg. Group 1 Branch Length: 0.0034 (S.D. 0.0062)
  – Avg. Group 2 Branch Length: 0.0075 (S.D. 0.0118)
  – T-test (2 sample, unequal variance): 3.22 × 10^-5
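A two-sample t-test with unequal variances is commonly known as Welch's t-test. The sketch below computes the Welch statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. The branch lengths are invented for illustration; the slide reports only means and standard deviations, not sample sizes, so its result cannot be recomputed from the figures shown.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1)
    )
    return t, dof


# Hypothetical branch lengths for two groups of strains.
group_1_lengths = [0.001, 0.002, 0.003, 0.004, 0.005]
group_2_lengths = [0.002, 0.004, 0.006, 0.008, 0.010]
t_value, dof = welch_t(group_1_lengths, group_2_lengths)
```

Turning the statistic into a p-value needs the t-distribution, which the standard library does not provide; `scipy.stats.ttest_ind` with `equal_var=False` is one option if SciPy is available.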

[Figure labels: Group 1, Group 3, Group 2]

HA trendline

Evolutionary Trajectory Slopes vs. Mutation Rate

Segment   Group 1 Slope   Group 2 Slope   Mutation Rate
PB2       6.8             24.9            4.3
PB1       7.6             26.9
PA        5.9             23.2
HA        5.5             28.8            5.7
NP        2.9             18.2            3.6
NA        3.8             23.1            3.2
M         1.3             5.6             1.5
NS        2.0             12.5            1.6

Units: substitutions/segment/year
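A slope in substitutions/segment/year can be read off a trendline fitted to the (isolation year difference, nucleotide differences) points of one group. The slides do not state how the trendlines were fitted, so the sketch below assumes ordinary least squares, and the points are invented rather than taken from the charts.

```python
def least_squares_line(points):
    """Slope and intercept of the ordinary least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (year difference, nucleotide differences) points for one group.
example_points = [(0, 0), (1, 6), (2, 11), (3, 17)]
slope, intercept = least_squares_line(example_points)
```

Comparing a fitted slope with the segment's mutation rate is the comparison the table makes: a Group 1 slope close to the mutation rate is what the Evolutionary Trajectory label on the following slides refers to, while the much steeper Group 2 slopes correspond to the Similar but Distantly Related label.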


Evolutionary Trajectory (E.T.)

Similar but Distantly Related (SDR)

[Figure: Garten, et al. Science 2009]

[Figure: Garten, et al. Science 2009]

[Figure: ET; Cali/04/09 marked]

[Figure: SDR; Cali/04/09 marked]

North American H1N1 Lineage - HA

H1N1 2009
American Swine, 2000’s
North American H1N1 Lineage HA – Group 1
American Swine, 90’s
American Swine, 80’s
American Swine, 70’s
American Swine, 40 - 60’s

Evolutionary Trajectory Plots

Evolutionary Trajectory of a strain, with candidates displayed.

Summary

• The Influenza Research Database (IRD) provides a comprehensive resource of data, analysis and visualization tools about influenza virus – www.fludb.org

• SFVT represents a novel tool that can be used to better understand genotype-phenotype relationships for flu

• Use of IRD to illuminate the viral origins of the pandemic (H1N1) 2009 virus

• IRD is continually evolving to capture and integrate additional data and analytical tools to support the needs of the influenza research community

Acknowledgments

• U.T. Southwestern
  – Richard Scheuermann (PI)
  – Burke Squires
  – Jyothi Noronha
  – Victoria Hunt
  – Shubhada Godbole
  – Brett Pickett
  – Yun Zhang
• MSSM
  – Adolfo Garcia-Sastre
  – Eric Bortz
  – Gina Conenello
  – Peter Palese
• Vecna
  – Chris Larsen
  – Al Ramsey
• LANL
  – Catherine Macken
  – Mira Dimitrijevic
• U.C. Davis
  – Nicole Baumgarth
• Northrop Grumman
  – Ed Klem
  – Mike Atassi
  – Kevin Biersack
  – Jon Dietrich
  – Wenjie Hua
  – Wei Jen
  – Sanjeev Kumar
  – Xiaomei Li
  – Zaigang Liu
  – Jason Lucas
  – Michelle Lu
  – Bruce Quesenberry
  – Barbara Rotchford
  – Hongbo Su
  – Bryan Walters
  – Jianjun Wang
  – Sam Zaremba
  – Liwei Zhou
• IRD SWG
  – Gillian Air, OMRF
  – Carol Cardona, Univ. Minnesota
  – Adolfo Garcia-Sastre, Mt Sinai
  – Elodie Ghedin, Univ. Pittsburgh
  – Martha Nelson, Fogarty
  – Daniel Perez, Univ. Maryland
  – Gavin Smith, Duke Singapore
  – David Spiro, JCVI
  – Dave Stallknecht, Univ. Georgia
  – David Topham, Rochester
  – Richard Webby, St Jude
• USDA
  – David Suarez
• Sage Analytica
  – Robert Taylor
  – Lone Simonsen
• CEIRS Centers

N01AI40041

Segment 6 (NA) By Host

[Chart: Nucleotide Differences (y-axis, 0 to 300) versus Isolation Year Differences (x-axis, 0 to 100); series by host: swine, turkey, duck, chicken, human]