An Environment-Wide Association Study (EWAS) on Type 2 Diabetes Mellitus
Chirag J. Patel et al.,PLoS One, May 2010
First, some context...
• Hypothesis-driven vs. data-driven research
• Tension between these two forms of research crosses all scientific disciplines
• In recent years, there has been an explosion of data-driven research...why?
Kell, 2003
Computational Speed Over Time
Kurzweil, 2010
Genome Sequencing Cost
NHGRI, 2012
The Age of Big Data
Lohr, 2012
Genome-Wide Association Study (GWAS)
• Typically case-control study design
• Examine associations of single-nucleotide polymorphisms (SNPs) with disease state
• NCBI’s SNP Database lists 187,852,828 SNPs identified in human genome (June 2012)
• GWAS typically examines hundreds of thousands of SNPs through use of DNA microarrays
Bush et al., PLOS Computational Biology, 2012
GWAS of systemic sclerosis
Radstake et al., 2010
On to the paper!
T2D Prevalence in US
Figure: estimated prevalence (%) by year and age. CDC, 2011
T2D Incidence in US
Figure: estimated incidence (per 1000 people) by year, 1980 to 2010, for age groups 18-44, 45-64, and 65-79. CDC, 2012
Introduction
• Type 2 Diabetes (T2D) has complex etiology, involving genetics, lifestyle, and environment
• GWAS identified multiple SNPs associated with T2D, but these don’t explain T2D trends
• Standard environmental epidemiology approaches limited by narrow focus
• Patel et al. propose the first “Environment-Wide Association Study” (EWAS) to examine T2D using a large, nationally representative dataset
Methods
• Combined four NHANES datasets (1999-2006)
• Rich cross-sectional data on demographics, chemical toxicants, pollutants, allergens, nutrients, fasting blood sugar, and self-reported medical history
• By using NHANES weighting, results can be generalized to US population
Methods: Environment Scan
• Omitted environmental factors with low variability (>90% of observations below detection limit). Also omitted factors only affecting specific subsets of population
• Across all four NHANES cohorts: 543 environmental factors
• 266 unique factors in total, with 157 factors found in more than one cohort
• Log-transformed factors when necessary. Used z-score transformations to allow comparisons between factors
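The transformation step can be illustrated with a minimal sketch. It assumes a plain (unweighted) log transform followed by standardization; the slides do not say whether survey weights entered this step, and the `serum` values are hypothetical.

```python
import math
from statistics import mean, stdev

def log_zscore(values, log_transform=True):
    """Optionally log-transform positive measurements, then convert to z-scores."""
    xs = [math.log(v) for v in values] if log_transform else list(values)
    mu = mean(xs)
    sd = stdev(xs)  # sample standard deviation (n - 1 denominator)
    return [(x - mu) / sd for x in xs]

# Hypothetical right-skewed serum concentrations (arbitrary units).
serum = [0.5, 1.0, 2.0, 4.0, 8.0]
z = log_zscore(serum)
```

After this step every factor is on the same scale, so an odds ratio can be read as the change in odds per 1 SD of the (log) exposure, whatever its original units.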
Methods: Case definition
• Based on ADA guidelines: fasting blood glucose level ≥ 126 mg/dL
• Did not distinguish T1D from T2D
• Did not consider medication use or medical history
ADA, 2009
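As a sketch of the case definition above, the rule reduces to a single threshold on fasting blood glucose. The participant records and field names here are hypothetical.

```python
# ADA fasting blood glucose cutoff cited on the slide, in mg/dL.
DIABETES_FBG_THRESHOLD_MG_DL = 126

def is_case(fasting_glucose_mg_dl):
    """Return True when fasting blood glucose meets the >= 126 mg/dL cutoff."""
    return fasting_glucose_mg_dl >= DIABETES_FBG_THRESHOLD_MG_DL

# Hypothetical participants; "fbg" is fasting blood glucose in mg/dL.
participants = [
    {"id": "A", "fbg": 98},
    {"id": "B", "fbg": 126},
    {"id": "C", "fbg": 141},
]
cases = [p["id"] for p in participants if is_case(p["fbg"])]
```

Because the rule looks at glucose alone, a participant whose glucose is controlled by medication would be counted as a non-case under this definition.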
Methods: Primary Analysis
• Logistic regression (accounting for NHANES weighting) to estimate associations of 266 unique environmental factors with case status
• Estimated prevalence odds ratios
• Ran regressions for each individual NHANES cohort and with data of all combined cohorts
• Covariates: age, sex, BMI, ethnicity, and income/poverty ratio
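To show how a prevalence odds ratio comes out of a weighted logistic regression, here is a minimal sketch, not the authors' code. It assumes a single binary exposure with no covariates so the result can be checked by hand, treats the weights as simple frequency weights, and uses hypothetical counts. A real NHANES analysis would also need the survey design (strata and sampling units) for valid standard errors.

```python
import math

def weighted_logistic_or(x, y, w, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by weighted Newton-Raphson; return exp(b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            r = wi * (yi - p)        # weighted score contribution
            v = wi * p * (1.0 - p)   # weighted information contribution
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return math.exp(b1)

# Hypothetical weighted 2x2 table: each row is (exposed?, case?, weight).
x = [1, 1, 0, 0]
y = [1, 0, 1, 0]
w = [30, 20, 10, 40]
odds_ratio = weighted_logistic_or(x, y, w)
```

With a binary exposure the fitted odds ratio equals the weighted cross-product ratio, here (30 × 40) / (20 × 10) = 6. With a z-scored continuous factor, exp(b1) would instead be the odds ratio per 1 SD.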
Methods: False Discovery Rate (FDR)
• Accounted for multiple hypothesis testing
• FDR = proportion of "discoveries" (significant results) that are actually false positives
• Less stringent than Bonferroni correction
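The contrast with Bonferroni can be sketched on a small set of p-values. The slides do not say which FDR procedure the paper used; the Benjamini-Hochberg step-up procedure is swapped in here as one common choice, and the p-values are hypothetical.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject when p <= alpha / m, controlling the family-wise error rate."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def benjamini_hochberg_reject(pvalues, q=0.05):
    """Reject the k smallest p-values, k being the largest rank with p_(k) <= (k/m) * q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * q:
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject

# Hypothetical p-values from ten tests.
pvals = [0.001, 0.008, 0.012, 0.019, 0.04, 0.30, 0.55, 0.70, 0.85, 0.95]
n_bonferroni = sum(bonferroni_reject(pvals))
n_bh = sum(benjamini_hochberg_reject(pvals))
```

On these values Bonferroni keeps one result and Benjamini-Hochberg keeps four, which is the sense in which FDR control is less stringent.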
Methods: False Discovery Rate (FDR)
Alpha level: $\alpha = \frac{5 \text{ false discoveries}}{100 \text{ total tests}} = 0.05$
FDR: $\mathrm{FDR} = \frac{5 \text{ false discoveries}}{100 \text{ significant results}} = 0.05$
Shaffer, 1995
Methods: Primary Analysis
1) First phase: Used two-sided alpha level of 0.02 to pick factors associated with T2D in individual NHANES cohorts
2) Second phase: Determined how many of the 37 factors selected in the first phase are associated with T2D in two or more cohorts (two-sided alpha level of 0.02)
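The two phases can be sketched as a filter over per-cohort p-values. The factor names and p-values are hypothetical, and the cohort labels assume the four two-year NHANES cycles spanning 1999-2006.

```python
def replicated_factors(pvalues_by_cohort, alpha=0.02, min_cohorts=2):
    """Return factors significant at `alpha` in at least `min_cohorts` cohorts."""
    counts = {}
    for cohort_pvalues in pvalues_by_cohort.values():
        for factor, p in cohort_pvalues.items():
            if p < alpha:
                counts[factor] = counts.get(factor, 0) + 1
    return sorted(f for f, n in counts.items() if n >= min_cohorts)

# Hypothetical p-values; a factor may be measured in only some cohorts.
pvalues_by_cohort = {
    "1999-2000": {"factor_a": 0.001, "factor_b": 0.015, "factor_c": 0.40},
    "2001-2002": {"factor_a": 0.010, "factor_b": 0.200},
    "2003-2004": {"factor_a": 0.300, "factor_c": 0.010},
    "2005-2006": {"factor_b": 0.019, "factor_c": 0.500},
}
first_phase = replicated_factors(pvalues_by_cohort, min_cohorts=1)
second_phase = replicated_factors(pvalues_by_cohort, min_cohorts=2)
```

Requiring significance in two or more cohorts acts as an internal replication step: `factor_c` passes the first phase in one cohort but is dropped in the second.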
Methods: secondary/sensitivity analyses
1) Reverse causality test: re-ran analysis only among people who did not report a doctor diagnosis of T2D
2) Lipophilic chemicals: adjusted for total triglycerides and cholesterol
3) Recent diet: adjusted for diet and supplement use
Results: first phase
Identified 37 unique factors (FDR = 10-30%)
• Dioxins
• Furans
• Heavy metals
• Nutrients/vitamins
• Organochlorine pesticides
• Polychlorinated biphenyls
• Viruses
Results: second phase
Identified 5 unique factors (overall FDR = 2%)
• Cis-β-carotene
• Trans-β-carotene
• γ-tocopherol
• Heptachlor Epoxide
• PCB170: 2,2',3,3',4,4',5-heptachlorobiphenyl
Results: reverse causality?
Prevalence OR (95% CI)
Factor                Primary Analysis    Secondary Analysis
Cis-β-carotene        0.6 (0.5 – 0.7)     0.6 (0.5 – 0.7)
Trans-β-carotene      0.6 (0.5 – 0.7)     0.7 (0.5 – 0.8)
γ-tocopherol          1.5 (1.3 – 1.7)     1.8 (1.3 – 2.2)
Heptachlor Epoxide    1.7 (1.3 – 2.1)     1.6 (1.1 – 2.1)
PCB170                2.2 (1.6 – 3.2)     2.1 (1.2 – 3.9)
Results: confounding by lipid levels?
Prevalence OR (95% CI)
Factor                Primary Analysis    Secondary Analysis
Cis-β-carotene        0.6 (0.5 – 0.7)     0.7 (0.6 – 0.8)
Trans-β-carotene      0.6 (0.5 – 0.7)     0.7 (0.6 – 0.8)
γ-tocopherol          1.5 (1.3 – 1.7)     1.4 (1.2 – 1.6)
Heptachlor Epoxide    1.7 (1.3 – 2.1)     1.6 (1.3 – 2.0)
PCB170                2.2 (1.6 – 3.2)     2.3 (1.4 – 3.7)
Results: adjusting for diet/supplements?
Prevalence OR (95% CI)
Factor                Primary Analysis    Secondary Analysis
Cis-β-carotene        0.6 (0.5 – 0.7)     0.7 (0.6 – 0.8)
Trans-β-carotene      0.6 (0.5 – 0.7)     0.7 (0.6 – 0.8)
γ-tocopherol          1.5 (1.3 – 1.7)     1.3 (1.1 – 1.5)
Heptachlor Epoxide    1.7 (1.3 – 2.1)     1.6 (1.3 – 2.1)
PCB170                2.2 (1.6 – 3.2)     2.2 (1.4 – 3.5)
Discussion
• EWAS confirmed previous findings (carotenes and PCB) and provided novel associations (heptachlor epoxide and γ-tocopherol)
• Limitations and strengths?
• Dawning of age of “enviromics”?
• Next steps?
  o e.g. cumulative exposure?