Evolution as a Confounding Factor in Genetic Association Studies 14 December 2011 Richard H. Scheuermann, Ph.D. Department of Pathology U.T. Southwestern Medical Center

  • Slide 1
  • Evolution as a Confounding Factor in Genetic Association Studies 14 December 2011 Richard H. Scheuermann, Ph.D. Department of Pathology U.T. Southwestern Medical Center
  • Slide 2
  • Current projects
  • Slide 3
  • Outline
      • Hypothesis that evolution should be considered a confounding factor in genetic association studies
      • HLA-mediated autoimmune disease predisposition analysis using SFVT
      • Identifying genetic determinants of influenza species jump events based on convergent evolution
      • Novel general strategies for formally controlling for evolution as a confounding factor in genetic association studies
  • Slide 4
  • EVOLUTION AS A CONFOUNDING FACTOR IN GENETIC ASSOCIATION STUDIES
  • Slide 5
  • Population-based genetic association
      • Many diseases exhibit evidence of genetic predispositions
      • Genotype-phenotype association studies
      • Diagnostic biomarker
      • Molecular underpinnings of disease pathology
      • GWAS and linkage disequilibrium
      • Co-inheritance of linked genetic markers
      • Advantage of using SNPs to detect causal variants
      • NGS could obviate the need for using linked SNPs
  • Slide 6
  • Statistical assumptions
      • Independence (confounding)
      • Random sampling (bias)
      • Population has reached equilibrium
      • Test sample represents a random sampling of the equilibrium population
  • Slide 7
  • HLA-MEDIATED AUTOIMMUNE DISEASE PREDISPOSITION ANALYSIS USING SFVT
  • Slide 8
  • Class I and II Peptide Sources
  • Slide 9
  • HLA and autoimmune disease
      • Disease | HLA allele | Relative risk
      • Ankylosing spondylitis | B27 | 87.4
      • Postgonococcal arthritis | B27 | 14.0
      • Acute anterior uveitis | B27 | 14.6
      • Rheumatoid arthritis | DR4 | 5.8
      • Chronic active hepatitis | DR3 | 13.9
      • Sjogren syndrome | DR3 | 9.7
      • Insulin-dependent diabetes | DR3/DR4 | 14.3
      • 21-Hydroxylase deficiency | BW47 | 15.0
      • Source: Robbins Pathologic Basis of Disease, 6th Edition (1999)
  • Slide 10
  • HLA and infectious disease
      • Correlation between HLA genotype and HIV viral burden and progression to AIDS
      • M Dean, M Carrington and SJ O'Brien, Annual Review of Genomics and Human Genetics, Vol. 3: 263-292 (2002)
  • Slide 11
  • HLA and drug sensitivity
      • HLA allele | Drug sensitivity | Association | Prevalence
      • B*1502 | carbamazepine (epilepsy) | p = 3 x 10^-27 | high in Chinese, absent in Caucasians
      • B*5701 | abacavir (HIV) | p = 5 x 10^-20 | high in Caucasians, absent in Africans, Hispanics
      • B*5801 | allopurinol (gout) | p = 5 x 10^-24 | high in Chinese
      • Source: P. Parham
  • Slide 12
  • Number of HLA Alleles (IMGT HLA, October 2008)
      • HLA-A: 697 (24)
      • HLA-B: 1109 (49)
      • HLA-C: 381 (9)
      • HLA-DRB: 690 (20)
      • HLA-DQA1: 34
      • HLA-DQB1: 95 (7)
      • HLA-DPA1: 27
      • HLA-DPB1: 131
      • MICA: 65
      • MICB: 30
      • TAP: 11
      • Figures in parentheses indicate the number of serologically defined antigens at each locus.
      • 500 new submissions each year.
  • Slide 13
  • HLA Allele Nomenclature
      • Example: HLA-A*24020101
      • HLA-A: locus
      • *: asterisk
      • 24: allele family (serological where possible)
      • 02: amino acid difference
      • 01: non-coding (silent) polymorphism
      • 01: intron, 3' or 5' polymorphism
      • Optional expression suffix: N = null, L = low, S = secreted, A = aberrant, Q = questionable
      • Example with suffix: HLA-A*24020102L
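The field layout on this slide can be sketched as a small parser. This is a minimal illustration, assuming the pre-colon, two-digits-per-field naming shown on the slide; the names `HlaAllele`, `parse_allele`, and the field names are hypothetical and chosen only to mirror the slide labels.

```python
import re
from typing import NamedTuple, Optional

# Expression suffixes from the slide, spelled out.
SUFFIXES = {"N": "null", "L": "low", "S": "secreted",
            "A": "aberrant", "Q": "questionable"}


class HlaAllele(NamedTuple):
    locus: str                   # e.g. "A" or "DRB1"
    family: str                  # digits 1-2: allele family
    amino_acid: Optional[str]    # digits 3-4: amino acid difference
    silent: Optional[str]        # digits 5-6: silent polymorphism
    intron_utr: Optional[str]    # digits 7-8: intron, 3' or 5' polymorphism
    expression: Optional[str]    # decoded suffix, if any


# Optional "HLA-" prefix, locus, asterisk, one to four 2-digit fields, suffix.
_PATTERN = re.compile(
    r"^(?:HLA-)?([A-Z0-9]+)\*(\d{2})(\d{2})?(\d{2})?(\d{2})?([NLSAQ])?$"
)


def parse_allele(name: str) -> HlaAllele:
    """Split an allele name such as 'HLA-A*24020102L' into its fields."""
    compact = re.sub(r"\s+", "", name.upper())  # tolerate 'HLA - A * 24 02 ...'
    match = _PATTERN.match(compact)
    if match is None:
        raise ValueError(f"unrecognized allele name: {name!r}")
    locus, family, amino_acid, silent, intron_utr, suffix = match.groups()
    return HlaAllele(locus, family, amino_acid, silent, intron_utr,
                     SUFFIXES.get(suffix))


example = parse_allele("HLA - A * 24 02 01 02 L")
print(example)
```

A 4-digit name such as `DRB1*1104` parses with the last two fields left as `None`, which matches the 4-digit typing resolution used later in the deck.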
  • Slide 14
  • DRB1 phylogeny (figure; labeled groups: DRB1*07, DRB1*09, DRB1*10, DRB1*04, DRB1*16, DRB1*15)
  • Slide 15
  • DRB1 phylogeny (figure; labeled group: DRB1*13)
  • Slide 16
  • DRB1 phylogeny (figure; labeled groups: DRB1*07, DRB1*09, DRB1*10, DRB1*04, DRB1*16, DRB1*15)
  • Slide 17
  • DRB1 alignment (figure; pairwise comparisons: 07/15, 07/09, 09/15)
  • Slide 18
  • Limitations with traditional HLA allele-based association studies
      • Treats the entire allele as a single unit and therefore includes both causative and passenger variations
      • Doesn't take into account structural relationships between alleles
      • The syntax of the HLA nomenclature was designed to capture some of the structural relationships between alleles, but there are several exceptions
  • Slide 19
  • HLA-mediated disease predisposition
      • Hypothesis: While the allelic/haplotypic structures reflect the evolutionary history of the locus, it is the focused regions in the HLA genes/proteins that affect gene expression, protein structure and/or protein function that are responsible for enhanced disease risk
  • Slide 20
  • An alternative approach
      • DAIT-Data Interoperability Steering Committee/HLA Working Group members
      • HLA Nomenclature: WHO/IMGT HLA/Anthony Nolan Research Institute
      • NCBI - dbMHC
      • Biomedical ontology people
  • Slide 21
  • Summary of SFVT approach
      • Define individual sequence features (SF) in HLA proteins (genes)
      • Determine the extent of polymorphism for each sequence feature by defining the observed variant types (VT)
      • Re-annotate HLA typing information with complete list of VT for each SF
      • Examine the association between every sequence feature variant type and disease or other phenotype
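The first three steps of this summary can be sketched in a few lines. This is only an illustration under stated assumptions: the allele names, the 10-residue sequences, and the feature positions below are invented toy data, and `annotate` is a hypothetical helper, not the SFVT implementation described in the talk.

```python
from typing import Dict, List


def annotate(alleles: Dict[str, str],
             features: Dict[str, List[int]]) -> Dict[str, Dict[str, str]]:
    """Re-annotate each allele with one variant type (VT) per sequence feature (SF).

    alleles:  allele name -> aligned protein sequence
    features: SF name -> 1-based residue positions that make up the feature
    """
    # For each SF, map every distinct residue motif to a VT number,
    # assigned in order of first appearance.
    vt_index: Dict[str, Dict[str, int]] = {sf: {} for sf in features}
    annotation: Dict[str, Dict[str, str]] = {}
    for allele in sorted(alleles):
        seq = alleles[allele]
        annotation[allele] = {}
        for sf, positions in features.items():
            motif = "".join(seq[p - 1] for p in positions)
            vt = vt_index[sf].setdefault(motif, len(vt_index[sf]) + 1)
            annotation[allele][sf] = f"VT{vt}"
    return annotation


# Toy data (hypothetical, not real HLA sequences or feature definitions).
toy_alleles = {
    "X*0101": "ACDEFGHIKL",
    "X*0102": "ACDEFGHIKV",
    "X*0201": "ACNEFGHIKL",
}
toy_features = {"SF1": [3, 4], "SF2": [9, 10], "SF3": [1, 2]}

for name, vts in annotate(toy_alleles, toy_features).items():
    print(name, vts)
```

The point of the design is visible even in the toy data: two alleles with different names can share a variant type for one feature while differing at another, so the association test in the final step operates on each feature separately rather than on the whole allele.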
  • Slide 22
  • Representative Sequence Features
  • Slide 23
  • A*0201 - peptide binding SF
  • Slide 24
  • A*0201 - peptide binding pocket B
  • Slide 25
  • A*0201 - CD8 binding & TCR binding SF (figure; panels: CD8 binding, TCR binding)
  • Slide 26
  • Summary of SFs defined: 1775 total
  • Slide 27
  • Variant Types for Hsa_HLA-DRB1_beta-strand 2_peptide antigen binding
  • Slide 28
  • Representative Sequence Features Variant Types
  • Slide 29
  • HLA SFVT Association with Systemic Sclerosis
      • Summary of data set
      • Systemic sclerosis (SSc, scleroderma) is a chronic condition characterized by altered immune reactivity, thickened skin, endothelial dysfunction, interstitial fibrosis, gangrene, pulmonary hypertension, gastrointestinal tract dysmotility, and renal arteriolar dysfunction.
      • A large cohort of ~1300 SSc patients and ~1000 healthy controls has been assembled by Drs. Frank C. Arnett, John Reveille and colleagues at the University of Texas Health Science Center at Houston.
      • Information on autoantibody reactivity for over 15 nuclear antigens is available.
      • 4-digit typing has been done for DRB1, DQA1, and DQB1 in all individuals.
      • Initial re-annotation of 4-digit DRB1 typing data
      • DRB1*1104 => SF1_VT43; SF2_VT4; SF3_VT12
      • Statistical analysis
      • Split data set into two pseudo-replicates
      • 2 x n contingency table for every SF (286), where n = number of VT
      • Chi-squared or Fisher's exact test analysis
      • Select SF with adjusted p-value
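The 2 x n contingency analysis listed under the statistical analysis can be sketched with the standard library alone. This is a minimal sketch, not the analysis code used in the study: the counts in the example are invented, and because the slide does not say which p-value adjustment was applied, the Bonferroni correction over 286 features shown here is an assumption.

```python
import math
from typing import Sequence, Tuple


def chi2_sf(stat: float, dof: int) -> float:
    """Upper-tail probability of a chi-squared distribution (integer dof).

    Uses the recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1),
    starting from Q(1/2, x) = erfc(sqrt(x)) or Q(1, x) = exp(-x).
    """
    x = stat / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < dof / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_2xn(cases: Sequence[int],
                   controls: Sequence[int]) -> Tuple[float, int, float]:
    """Pearson chi-squared test on a 2 x n table (one column per variant type)."""
    # Drop variant types observed in neither group.
    cols = [(a, b) for a, b in zip(cases, controls) if a + b > 0]
    n_cases = sum(a for a, _ in cols)
    n_controls = sum(b for _, b in cols)
    if len(cols) < 2 or n_cases == 0 or n_controls == 0:
        raise ValueError("need at least two observed variant types and both groups")
    total = n_cases + n_controls
    stat = 0.0
    for a, b in cols:
        col_total = a + b
        for observed, row_total in ((a, n_cases), (b, n_controls)):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    dof = len(cols) - 1  # (2 - 1) * (n - 1)
    return stat, dof, chi2_sf(stat, dof)


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value (an assumed choice of adjustment)."""
    return min(1.0, p_value * n_tests)


# Invented counts for one sequence feature with two variant types.
stat, dof, p = chi_square_2xn(cases=[10, 20], controls=[20, 10])
print(stat, dof, p, bonferroni(p, 286))
```

For sparse tables with small expected counts, Fisher's exact test, which the slide names as the alternative, would be the more appropriate choice; it is omitted here to keep the sketch short.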
  • Flu pandemics of the 20th and 21st centuries initiated by species jump events
      • 1918 flu pandemic (Spanish flu): subtype H1N1 (avian origin); estimated to have claimed between 2.5% and 5.0% of the world's population (20 to >100 million deaths)
      • Asian flu (1957-1958): subtype H2N2 (avian origin); 1 to 1.5 million deaths
      • Hong Kong flu (1968-1969): subtype H3N2 (avian origin); between 750,000 and 1 million deaths
      • 2009 H1N1: subtype H1N1 (swine origin); ~16,000 deaths as of March 2010
  • Slide 44
  • Pandemic stages Adaptive drivers
  • Slide 45
  • Basic reproductive number ($R_0$)
      • Total number of secondary cases per case
      • Reasonable surrogate of fitness
      • Characteristics of pandemic viruses: $R_0^H > 1$, and in genetic neighborhood of viruses with $R_0^R > 1$ and $R_0^H < 1$
      • Pandemic viruses ($R_0^H > 1$)
      • Stuttering viruses ($R_0^R > 1$ and $R_0^H < 1$)
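The distinction between stuttering and sustained transmission follows from treating the basic reproductive number as the mean number of secondary cases per case. The sketch below assumes a simple branching-process model, which is not stated on the slide; under that assumption the expected number of cases in generation g is R0 raised to the power g, and for R0 below 1 the expected total chain size is 1 / (1 - R0). The function names are hypothetical.

```python
from typing import List


def expected_cases_by_generation(r0: float, generations: int) -> List[float]:
    """Expected cases in generations 0..generations, starting from one case."""
    return [r0 ** g for g in range(generations + 1)]


def expected_chain_size(r0: float) -> float:
    """Expected total cases from one introduction (geometric series sum)."""
    if r0 < 0:
        raise ValueError("r0 must be non-negative")
    if r0 >= 1:
        return float("inf")  # expected size diverges at or above threshold
    return 1.0 / (1.0 - r0)


print(expected_cases_by_generation(0.5, 3))
print(expected_chain_size(0.5))
print(expected_chain_size(1.2))
```

With a human-to-human value below 1, each introduction from the reservoir is expected to produce a short chain that dies out, which is the stuttering pattern; a value above 1 removes that bound on the expected chain size.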
  • Summary
      • Human influenza pandemics are initiated by species jump events followed by sustained human-to-human transmission ($R_0^H > 1$)
      • Multiple independent occurrences of the same mutation during stuttering transmission are evidence of convergent evolution of adaptive drivers; these provide hypotheses for experimental testing
      • Surveillance for adaptive drivers in reservoir species could help anticipate the next pandemic
      • N01AI40041
  • Slide 66
  • TOWARD A GENERAL STRATEGY FOR CONTROLLING FOR EVOLUTION AS A CONFOUNDING FACTOR IN GENETIC ASSOCIATION STUDIES
  • Slide 67
  • Slide 68
  • HLA SFVT Acknowledgements
      • BISC ImmPort Team: David Karp (UTSW), Nishanth Marthandan (UTSW), Paula Guidry (UTSW), Frank C. Arnett (UTH), John Reveille (UTH), Chul Ahn (UTSW), Glenys Thompson (Berkeley), Tom Smith (NG), Jeff Wiser (NG)
      • DAIT HLA Working Group: David DeLuca (Hannover), Raymond Dunivin (NCBI), Michael Feolo (NCBI), Wolfgang Helmberg (Graz), Steven G. E. Marsh (ANRI), David Parrish (ITN), Bjoern Peters (LIAI), Effie Petersdorf (FHCRC), Matthew J. Waller (ANRI)
      • Sequence Ontology WG: Michael Ashburner (Cambridge), Lindsay Cowell (Duke), Alexander D. Diehl (Jackson), Karen Eilbeck (Utah), Suzanna Lewis (LBNL), Chris Mungall (LBNL), Darren A. Natale (Georgetown), Barry Smith (Buffalo)
      • With support from NIAID N01AI40076
  • Slide 69
  • Acknowledgments
      • U.T. Southwestern: Richard Scheuermann (PI), Burke Squires, Jyothi Noronha, Victoria Hunt, Shubhada Godbole, Brett Pickett, Yun Zhang, Haizhou Liu
      • MSSM: Adolfo Garcia-Sastre, Eric Bortz, Gina Conenello, Peter Palese
      • Vecna: Chris Larsen, Al Ramsey
      • LANL: Catherine Macken, Mira Dimitrijevic
      • U.C. Davis: Nicole Baumgarth
      • Northrop Grumman: Ed Klem, Mike Atassi, Kevin Biersack, Jon Dietrich, Wenjie Hua, Wei Jen, Sanjeev Kumar, Xiaomei Li, Zaigang Liu, Jason Lucas, Michelle Lu, Bruce Quesenberry, Barbara Rotchford, Hongbo Su, Bryan Walters, Jianjun Wang, Sam Zaremba, Liwei Zhou
      • IRD SWG: Gillian Air (OMRF), Carol Cardona (Univ. Minnesota), Adolfo Garcia-Sastre (Mt Sinai), Elodie Ghedin (Univ. Pittsburgh), Martha Nelson (Fogarty), Daniel Perez (Univ. Maryland), Gavin Smith (Duke Singapore), David Spiro (JCVI), Dave Stallknecht (Univ. Georgia), David Topham (Rochester), Richard Webby (St Jude)
      • USDA: David Suarez
      • Sage Analytica: Robert Taylor, Lone Simonsen
      • CEIRS Centers
      • N01AI40041