20
Population-specific causal disease effect sizes in functionally important regions impacted by selection ASHG 2019 October 17, 2019 Huwenbo Shi Alkes Price Group Harvard University & Broad Institute 1 (Shi et al. 2019 bioRxiv)

Population-specific causal disease effect sizes at loci

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Population-specific causal disease effect sizes in functionally important regions impacted by selection

ASHG 2019

October 17, 2019

Huwenbo Shi

Alkes Price Group

Harvard University & Broad Institute

1

(Shi et al. 2019 bioRxiv)

Most GWAS samples are still European

2

GWAS is becoming more diverse

โ€ข 2009: 0.1 million

โ€ข 2016: 7.0 million

(Popejoy & Fullerton 2016 Nature)

Number of non-European GWAS samples

70x

European80%

non-European 20%

(Sirugo, Williams, Tishkoff 2019 Cell)

GWAS samples by ethnicity

Trans-ethnic genetic risk prediction has limited accuracy

(Martin et al. 2019 Nat Genet)

Factors impacting prediction accuracy

โ€ข Different LD patterns

โ€ข Population-specific variantsโ€ฆโ€ฆ

(Marquez-Luna et al. 2017 Genet Epidemiol)

3

Relative to Europeans, trans-ethnic genetic risk prediction is less accurate for non-Europeans

โ€ข Different causal effect sizes?

Trans-ethnic genetic correlations are less than 1

โ€ข Schizophrenia: 0.6 (AFR American โ€“ EUR)(de Candia et al. 2013 AJHG)

โ€ข Prostate cancer: 0.9 (EAS โ€“ EUR)(Mancuso et al. 2016 Nat Genet) 0.6 (AFR American โ€“ EUR)

โ€ข Rheumatoid arthritis: 0.5 (EAS โ€“ EUR)(Brown et al. 2016 AJHG;Galinsky et al. 2019 Genet Epidemiol)

4

Why is trans-ethnic genetic correlation imperfect?

โ€ข Average across 22 traits: 0.8 (EAS โ€“ EUR)(Martin et al. 2019 Nat Genet)

Stratifying trans-ethnic genetic correlation

5

causal effect size

Pop1

Pop2

Coding Conserved Enhancer

Stratifying heritability

(Finucane et al. 2015 Nat Genet)

โ„Ž๐‘”2(Coding)

โ„Ž๐‘”2(Enhancer)

โ„Ž๐‘”2(Conserved)

+

+

+

(Shi et al. 2019 bioRxiv)

๐‘Ÿ๐‘”(Coding)

๐‘Ÿ๐‘”(Enhancer)

๐‘Ÿ๐‘”(Conserved)

Stratifying trans-ethnic genetic correlation (๐ซ๐ )

ร—

ร—

ร—

Pop1

Pop2

Pop1

Pop2

Pop1

Pop2

Outline

โ€ข Method to stratify trans-ethnic genetic correlation (S-LDXR)

โ€ข Analysis of 30 complex diseases and traits

โ€ข Interpretation: potential role of positive selection

6

Outline

7

โ€ข Method to stratify trans-ethnic genetic correlation (S-LDXR)

โ€ข Analysis of 30 complex diseases and traits

โ€ข Interpretation: potential role of positive selection

S-LDXR stratifies squared trans-ethnic genetic correlation from GWAS summary statistics

8

S-LDXR with baseline-LD-X model(62 genomic annotations)

Pop2Pop1

reference LD (1000 GP)(in 2 populations)

enrichment of

trans-ethnic ๐’“๐’ˆ๐Ÿ in

annotation C๐€๐Ÿ ๐ถ =

๐’“๐’ˆ๐Ÿ(๐ถ)

๐’“๐’ˆ๐Ÿ

(Finucane et al. 2015 Nat Genet; Gazal et al. 2017 Nat Genet; Gazal et al. 2019 Nat Genet)

PGC 2014 Nature

Pop2

GWAS summary statistics(in 2 populations)

Pop1

Genomic annotations(binary and continuous)

SNP

Coding

Conserved

โ€ฆโ€ฆ

B Stat

S-LDXR models per-allele effect sizes, accounting

for MAF differences, different from S-LDSC

use shrinkage to reduce noise

Product of Z-scores is higher for SNPs tagging annotations enriched for trans-ethnic genetic covariance

9

SNP enriched for trans-ethnic genetic covariance

j high ๐‘1๐‘—๐‘2๐‘—

LD in pop 1

LD in pop 2

low ๐‘1๐‘—๐‘2๐‘—j

LD in pop 1

LD in pop 2

(Finucane et al. 2015 Nat Genet; Gazal et al. 2017 Nat Genet; Gazal et al. 2019 Nat Genet)

๐ธ ๐‘1๐‘—๐‘2๐‘— = ๐‘1๐‘2

๐ถ

๐‘™ร—๐‘—(๐ถ)๐œƒ๐ถ

Z-score of SNP j in population 1 and 2

GWAS sample sizes

trans-ethnic LD score

net contribution of annotation C

S-LDXR is unbiased in null simulations

โ€ข 28 binary functional annotations(Finucane et al. 2015 Nat Genet)

โ€ข Quintiles of continuous-valued annotations(Gazal et al. 2017 Nat Genet)

simulated ๐€๐Ÿ ๐ถ = 1 for all annotations

10

๐€๐Ÿ ๐ถ : enrichment of squared trans-ethnic genetic correlation in annotation C

S-LDXR is conservative in causal simulations

11

simulated ๐€๐Ÿ ๐ถ informed by real traits

estimates are shrunk towards the null: ๐€๐Ÿ ๐ถ =1

โ€ข Quintiles of continuous-valued annotations(Gazal et al. 2017 Nat Genet)

โ€ข 28 binary functional annotations(Finucane et al. 2015 Nat Genet)

estimates are shrunk towards the null: ๐€๐Ÿ ๐ถ =1

๐€๐Ÿ ๐ถ : enrichment of squared trans-ethnic genetic correlation in annotation C

Outline

12

โ€ข Method to stratify trans-ethnic genetic correlation (S-LDXR)

โ€ข Analysis of 30 complex diseases and traits

โ€ข Interpretation: potential role of positive selection

13

Analysis of 30 complex diseases and traits

โ€ข We obtained GWAS Summary statistics from

Biobank Japan (Kanai et al. 2018 Nat Genet)

UK Biobank (Bycroft et al. 2018 Nature)

East Asian(N โ‰ˆ 93K)

European(N โ‰ˆ 274K)

The method can be generalized to analyze other populations.

๐€๐Ÿ ๐ถ : enrichment of squared trans-ethnic genetic correlation in annotation C

14

โ€ข 8 continuous-valued annotations (Gazal et al. 2017 Nat Genet)

existing methods assume all SNPs have the same ๐‘Ÿ๐‘”2 (orange line)

(Lee et al. 2012 Bioinformatics; Brown et al. 2016 AJHG; Galinsky et al. 2018 Genet Epidemiol)

Trans-ethnic genetic correlation is depleted in top quintile of background selection statistic

Trans-ethnic genetic correlation is depleted in top quintile of background selection statistic

15

โ€ข Other continuous-valued annotations mirror background selection statistics

(Gazal et al. 2017 Nat Genet)

๐€๐Ÿ ๐ถ : enrichment of squared trans-ethnic genetic correlation in annotation C

๐’“๐’ˆ๐Ÿ in top quintile of

background selection statistic

= 0.8 ร— genome-wide ๐’“๐’ˆ๐Ÿ

Trans-ethnic genetic correlation is depleted in functionally important regions

16

๐€๐Ÿ ๐ถ < 1 for most functional annotations

โ€ข Positively correlate with background selection statistic, enriched for heritability(Finucane et al. 2015 Nat Genet; Gazal et al. 2017 Nat Genet)

๐€๐Ÿ ๐ถ : enrichment of squared trans-ethnic genetic correlation in annotation C

Trans-ethnic genetic correlation is most depleted near skin and immune genes and least depleted near brain genes

17

โ€ข Top 10% genes (ยฑ 100 kb) specifically expressed in 53 tissues (Finucane et al. 2018 Nat Genet)

๐€๐Ÿ ๐ถ : enrichment of squared trans-ethnic genetic correlation in annotation C

Skin Brain (CNS)

negative selection(Oโ€™Connor et al. 2019 AJHG)

positive selection(Sabeti et al. 2006 Science)

Immune 53 tissues

Outline

18

โ€ข Method to stratify trans-ethnic genetic correlation (S-LDXR)

โ€ข Analysis of 30 complex diseases and traits

โ€ข Interpretation: potential role of positive selection

Interpretation: potential role of positive selection

19

โ€ข Trans-ethnic genetic correlation is depleted in functionally important regions.

โ€ข This is likely due to stronger G x E. (Robinson et al. 2017 Nat

Genet ) Dominance and G x G are less likely. (Hill et al. 2008

Plos Genet; Zhu et al. 2015 AJHG)

โ€ข Results for immune and skin genes suggest G x E may be linked to positive selection.

Acknowledgments

20

Alkes PriceSteven Gazal

Masahiro KanaiEvan Koch

Armin SchoechSamuel Kim

Yang LuoTiffany AmariutaYukinori Okada

Soumya RaychaudhuriShamil Sunyaev

Also thanks to:Luke Oโ€™ConnorHilary FinucaneDaniel Kassler

Benjamin NealeRaymond Walters

Alicia MartinNick PattersonSwapan Mallick

Katherine SiewertBrielin Brown

Bogdan PasaniucMargaux Hujoel

Farhad Hormozdiari

Twitter:@shihuwenbo

(Shi et al. 2019 bioRxiv)

(journal club retreat, May 2019)