Evolution as a Confounding Factor in Genetic Association Studies 14 December 2011 Richard H. Scheuermann, Ph.D. Department of Pathology U.T. Southwestern Medical Center
Slide 2
Current projects
Slide 3
Outline
  Hypothesis that evolution should be considered a confounding factor in genetic association studies
  HLA-mediated autoimmune disease predisposition analysis using SFVT
  Identifying genetic determinants of influenza species jump events based on convergent evolution
  Novel general strategies for formally controlling for evolution as a confounding factor in genetic association studies
Slide 4
EVOLUTION AS A CONFOUNDING FACTOR IN GENETIC ASSOCIATION
STUDIES
Slide 5
Population-based genetic association
  Many diseases exhibit evidence of genetic predispositions
  Genotype-phenotype association studies
    Diagnostic biomarker
    Molecular underpinnings of disease pathology
  GWAS and linkage disequilibrium
    Co-inheritance of linked genetic markers
    Advantage of using SNPs to detect causal variants
    NGS could obviate the need for using linked SNPs
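
The co-inheritance of linked markers mentioned on this slide is usually quantified with the linkage disequilibrium coefficient D and the squared correlation r^2. The slide itself gives no formula, so the following is only a minimal sketch of the textbook definitions for two biallelic loci; the function and parameter names are illustrative assumptions.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1
          and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    (All names are illustrative; they do not come from the slides.)
    """
    # D measures how far the haplotype frequency departs from the
    # value expected if the two loci were inherited independently.
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    # r^2 rescales D^2 to the range 0..1.
    return d, d * d / denom
```

An r^2 close to 1 means a typed SNP is a good proxy for an untyped causal variant; an r^2 close to 0 means the SNP carries almost no information about it.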
Slide 6
Statistical assumptions
  Independence (confounding)
  Random sampling (bias)
    Population has reached equilibrium
    Test sample represents a random sampling of the equilibrium population
Slide 7
HLA-MEDIATED AUTOIMMUNE DISEASE PREDISPOSITION ANALYSIS USING
SFVT
Slide 10
HLA and infectious disease
  Correlation between HLA genotype and HIV viral burden and progression to AIDS
  M Dean, M Carrington and SJ O'Brien. Annual Review of Genomics and Human Genetics 3: 263-292 (2002)
Slide 11
HLA and drug sensitivity

HLA allele   Drug sensitivity           Association      Prevalence
B*1502       carbamazepine (epilepsy)   p = 3 x 10^-27   high in Chinese; absent in Caucasians
B*5701       abacavir (HIV)             p = 5 x 10^-20   high in Caucasians; absent in Africans, Hispanics
B*5801       allopurinol (gout)         p = 5 x 10^-24   high in Chinese

P. Parham
Slide 12
Number of HLA Alleles
IMGT HLA - October 2008

Locus      Alleles
HLA-A      697 (24)
HLA-B      1109 (49)
HLA-C      381 (9)
HLA-DRB    690 (20)
HLA-DQA1   34
HLA-DQB1   95 (7)
HLA-DPA1   27
HLA-DPB1   131
MICA       65
MICB       30
TAP        11

Figures in parentheses indicate the number of serologically defined antigens at each locus.
500 new submissions each year.
Slide 13
HLA Allele Nomenclature
  Example: HLA-A*24 02 01 01
    HLA-A: locus
    *: asterisk
    24: allele family (serological where possible)
    02: amino acid difference
    01: non-coding (silent) polymorphism
    01: intron, 3' or 5' polymorphism
  Optional suffix: N = null, L = low, S = secreted, A = aberrant, Q = questionable
  Example with suffix: HLA-A*24 02 01 02 L
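
The field layout on this slide can be turned into a small parser. This is a minimal sketch that assumes the two-digits-per-field format of the slide's own example (HLA-A*24 02 01 01); the function name, field names, and regular expression are illustrative assumptions and are not part of any official HLA tooling.

```python
import re

# One two-digit group per field, following the slide's example.
# Field names mirror the slide's labels; they are illustrative only.
_HLA_NAME = re.compile(
    r"^HLA-(?P<locus>[A-Z0-9]+)\*"
    r"(?P<family>\d{2})"        # allele family
    r"(?P<protein>\d{2})?"      # amino acid difference
    r"(?P<silent>\d{2})?"       # non-coding (silent) polymorphism
    r"(?P<noncoding>\d{2})?"    # intron, 3' or 5' polymorphism
    r"(?P<suffix>[NLSAQ])?$"    # optional expression suffix
)

EXPRESSION = {
    "N": "null",
    "L": "low",
    "S": "secreted",
    "A": "aberrant",
    "Q": "questionable",
}


def parse_hla_name(name):
    """Split an allele name such as 'HLA-A*24020102L' into its fields."""
    compact = name.replace(" ", "")  # tolerate the spaced form on the slide
    match = _HLA_NAME.match(compact)
    if match is None:
        raise ValueError(f"unrecognized HLA allele name: {name!r}")
    fields = match.groupdict()
    # Replace the one-letter suffix by its meaning (None when absent).
    fields["expression"] = EXPRESSION.get(fields.pop("suffix"))
    return fields
```

For example, `parse_hla_name("HLA-DRB1*1104")` fills only the locus, allele family, and amino acid fields, which corresponds to the 4-digit typing resolution.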
Limitations with traditional HLA allele-based association studies
  Treats the entire allele as a single unit and therefore includes both causative and passenger variations
  Doesn't take into account structural relationships between alleles
  Syntax of the HLA nomenclature was designed to capture some of the structural relationships between alleles, but there are several exceptions
Slide 19
HLA-mediated disease predisposition
  Hypothesis: While the allelic/haplotypic structures reflect the evolutionary history of the locus, it is the focused regions in the HLA genes/proteins that affect gene expression, protein structure and/or protein function that are responsible for enhanced disease risk
Slide 20
An alternative approach
  DAIT Data Interoperability Steering Committee/HLA Working Group members
    HLA Nomenclature: WHO/IMGT HLA/Anthony Nolan Research Institute
    NCBI: dbMHC
    Biomedical ontology people
Slide 21
Summary of SFVT approach
  Define individual sequence features (SF) in HLA proteins (genes)
  Determine the extent of polymorphism for each sequence feature by defining the observed variant types (VT)
  Re-annotate HLA typing information with complete list of VT for each SF
  Examine the association between every sequence feature variant type and disease or other phenotype
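
The first three steps of the SFVT approach can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: it assumes aligned protein sequences and 0-based feature positions, numbers variant types in order of first appearance (the real VT numbers come from the project's own catalog), and uses made-up sequences and hypothetical allele names.

```python
def reannotate(allele_sequences, features):
    """Re-annotate each allele as a list of 'SF_VT' labels.

    allele_sequences: dict of allele name -> aligned protein sequence
    features: dict of sequence feature (SF) name -> list of 0-based
              alignment positions that make up the feature
    Both inputs are hypothetical structures chosen for this sketch.
    """
    annotation = {allele: [] for allele in allele_sequences}
    for sf_name, positions in features.items():
        vt_number = {}  # residue motif -> variant type number
        for allele in sorted(allele_sequences):
            sequence = allele_sequences[allele]
            # The variant type is defined by the residues observed
            # at the positions belonging to this sequence feature.
            motif = "".join(sequence[i] for i in positions)
            vt = vt_number.setdefault(motif, len(vt_number) + 1)
            annotation[allele].append(f"{sf_name}_VT{vt}")
    return annotation


# Toy data: three made-up alleles and three made-up features.
toy_sequences = {
    "allele_1": "MKTAYL",
    "allele_2": "MKSAYL",
    "allele_3": "MKTAYV",
}
toy_features = {"SF1": [2], "SF2": [4, 5], "SF3": [0, 1]}
toy_annotation = reannotate(toy_sequences, toy_features)
```

Two alleles that differ overall can still share a variant type for a given feature, which is what lets the association test in the last step focus on one region at a time.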
Variant Types for Hsa_HLA-DRB1_beta-strand 2_peptide antigen
binding
Slide 28
Representative Sequence Features Variant Types
Slide 29
HLA SFVT Association with Systemic Sclerosis
Summary of data set
  Systemic sclerosis (SSc, scleroderma) is a chronic condition characterized by altered immune reactivity, thickened skin, endothelial dysfunction, interstitial fibrosis, gangrene, pulmonary hypertension, gastrointestinal tract dysmotility, and renal arteriolar dysfunction.
  A large cohort of ~1300 SSc patients and ~1000 healthy controls has been assembled by Drs. Frank C. Arnett, John Reveille and colleagues at the University of Texas Health Science Center at Houston.
  Information on autoantibody reactivity for over 15 nuclear antigens is available.
  4-digit typing has been done for DRB1, DQA1, and DQB1 in all individuals.
Initial re-annotation of 4-digit DRB1 typing data
  DRB1*1104 => SF1_VT43; SF2_VT4; SF3_VT12
Statistical analysis
  Split data set into two pseudo-replicates
  2 x n contingency table for every SF (286), where n = number of VT
  Chi-squared or Fisher's exact test analysis
  Select SF with adjusted p-value
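
The 2 x n contingency analysis can be illustrated with a Pearson chi-squared test written in plain Python. This is a sketch under stated assumptions: the slide does not say which multiple-testing adjustment was used, so the Bonferroni correction here is an assumption, and the p-value routine is a simple series expansion that is adequate for illustration but loses accuracy for very small p-values. In practice a library routine such as `scipy.stats.chi2_contingency` or `scipy.stats.fisher_exact` would be the usual choice.

```python
import math


def chi2_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom.

    table: list of rows (e.g. [cases, controls]), each a list of counts
    per variant type. Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution.

    Uses the series for the regularized lower incomplete gamma
    function; illustrative only, not tuned for extreme values.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    denom = a
    for _ in range(10000):
        denom += 1.0
        term *= z / denom
        total += term
        if term < total * 1e-15:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value (the adjustment method is assumed)."""
    return min(1.0, p_value * n_tests)


# Made-up counts for one sequence feature with three variant types.
toy_table = [
    [30, 50, 20],  # cases
    [20, 50, 30],  # controls
]
toy_stat, toy_df = chi2_statistic(toy_table)
toy_p = chi2_sf(toy_stat, toy_df)
toy_adjusted = bonferroni(toy_p, 286)  # 286 sequence features tested
```

With 286 sequence features tested, a raw p-value has to be very small to survive adjustment, which is why the slide selects features on the adjusted value rather than the raw one.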
Flu pandemics of the 20th and 21st centuries initiated by species jump events
  1918 flu pandemic (Spanish flu): subtype H1N1 (avian origin); estimated to have claimed between 2.5% and 5.0% of the world's population (20 to >100 million deaths)
  Asian flu (1957-1958): subtype H2N2 (avian origin); 1 to 1.5 million deaths
  Hong Kong flu (1968-1969): subtype H3N2 (avian origin); between 750,000 and 1 million deaths
  2009 H1N1: subtype H1N1 (swine origin); ~16,000 deaths as of March 2010
Slide 44
Pandemic stages Adaptive drivers
Slide 45
Basic reproductive number (R0)
  Total number of secondary cases per case
  Reasonable surrogate of fitness
Characteristics of pandemic viruses:
  R0^H > 1, and
  In genetic neighborhood of viruses with R0^R > 1 and R0^H < 1
Pandemic viruses (R0^H > 1)
Stuttering viruses (R0^R > 1 and R0^H < 1 ...)
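
The threshold at R0 = 1 can be made concrete with a simple branching-process calculation. The model is an assumption added for illustration and is not taken from the slides: if each case causes R0 secondary cases on average, generation g is expected to contain R0^g cases, and a chain started by one case is expected to total 1 / (1 - R0) cases when R0 < 1.

```python
def expected_cases_by_generation(r0, generations):
    """Expected number of cases in generations 0..generations,
    starting from a single index case (generation 0)."""
    return [r0 ** g for g in range(generations + 1)]


def expected_outbreak_size(r0):
    """Expected total number of cases in a transmission chain.

    Finite (a short, self-limiting chain) when r0 < 1;
    unbounded (sustained transmission) when r0 >= 1.
    """
    if r0 < 0:
        raise ValueError("r0 must be non-negative")
    return 1.0 / (1.0 - r0) if r0 < 1.0 else float("inf")
```

For example, a virus with R0 = 0.5 in humans is expected to infect about two people per introduction before the chain dies out, whereas any value above 1 allows a chain to keep growing.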
Summary
  Human influenza pandemics are initiated by species jump events followed by sustained human-to-human transmission (R0^H > 1)
  Multiple independent occurrences of the same mutation during stuttering transmission are evidence of convergent evolution of adaptive drivers, providing hypotheses for experimental testing
  Surveillance for adaptive drivers in reservoir species could help anticipate the next pandemic
N01AI40041
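
The idea of counting independent occurrences of the same mutation can be sketched as follows. This is a hypothetical illustration, not the analysis used in the talk: it assumes the species jump events have already been shown to be independent (for example from a phylogeny), and the event identifiers and mutation labels are made-up placeholders.

```python
from collections import Counter


def recurrent_mutations(events, min_events=2):
    """List mutations seen in at least `min_events` independent events.

    events: dict of event id -> iterable of mutation labels observed in
            that species jump event, relative to the reservoir virus.
    Returns (mutation, number_of_events) pairs, most frequent first,
    with ties broken alphabetically.
    """
    counts = Counter()
    for mutations in events.values():
        # Count each mutation once per event, however often it was seen.
        counts.update(set(mutations))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(mutation, n) for mutation, n in ranked if n >= min_events]


# Placeholder data: three hypothetical independent events.
toy_events = {
    "event_1": ["mut_A", "mut_B"],
    "event_2": ["mut_A"],
    "event_3": ["mut_A", "mut_B", "mut_C"],
}
toy_candidates = recurrent_mutations(toy_events)
```

Mutations that recur across independent events are candidates for adaptive drivers; the count alone does not establish adaptation, which is why the slide frames them as hypotheses for experimental testing.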
Slide 66
TOWARD A GENERAL STRATEGY FOR CONTROLLING FOR EVOLUTION AS A
CONFOUNDING FACTOR IN GENETIC ASSOCIATION STUDIES
Slide 67
Slide 68
HLA SFVT Acknowledgements
  BISC ImmPort Team: David Karp (UTSW), Nishanth Marthandan (UTSW), Paula Guidry (UTSW), Frank C. Arnett (UTH), John Reveille (UTH), Chul Ahn (UTSW), Glenys Thompson (Berkeley), Tom Smith (NG), Jeff Wiser (NG)
  DAIT HLA Working Group: David DeLuca (Hannover), Raymond Dunivin (NCBI), Michael Feolo (NCBI), Wolfgang Helmberg (Graz), Steven G. E. Marsh (ANRI), David Parrish (ITN), Bjoern Peters (LIAI), Effie Petersdorf (FHCRC), Matthew J. Waller (ANRI)
  Sequence Ontology WG: Michael Ashburner (Cambridge), Lindsay Cowell (Duke), Alexander D. Diehl (Jackson), Karen Eilbeck (Utah), Suzanna Lewis (LBNL), Chris Mungall (LBNL), Darren A. Natale (Georgetown), Barry Smith (Buffalo)
  With support from NIAID N01AI40076
Slide 69
Acknowledgments
  U.T. Southwestern: Richard Scheuermann (PI), Burke Squires, Jyothi Noronha, Victoria Hunt, Shubhada Godbole, Brett Pickett, Yun Zhang, Haizhou Liu
  MSSM: Adolfo Garcia-Sastre, Eric Bortz, Gina Conenello, Peter Palese
  Vecna: Chris Larsen, Al Ramsey
  LANL: Catherine Macken, Mira Dimitrijevic
  U.C. Davis: Nicole Baumgarth
  Northrop Grumman: Ed Klem, Mike Atassi, Kevin Biersack, Jon Dietrich, Wenjie Hua, Wei Jen, Sanjeev Kumar, Xiaomei Li, Zaigang Liu, Jason Lucas, Michelle Lu, Bruce Quesenberry, Barbara Rotchford, Hongbo Su, Bryan Walters, Jianjun Wang, Sam Zaremba, Liwei Zhou
  IRD SWG: Gillian Air (OMRF), Carol Cardona (Univ. Minnesota), Adolfo Garcia-Sastre (Mt Sinai), Elodie Ghedin (Univ. Pittsburgh), Martha Nelson (Fogarty), Daniel Perez (Univ. Maryland), Gavin Smith (Duke Singapore), David Spiro (JCVI), Dave Stallknecht (Univ. Georgia), David Topham (Rochester), Richard Webby (St Jude)
  USDA: David Suarez
  Sage Analytica: Robert Taylor, Lone Simonsen
  CEIRS Centers
  N01AI40041