Finding risk genes in
psychiatric disorders
Mark Daly, PhD
Chief, Analytic and Translational Genetics Unit, Massachusetts General Hospital
& Institute Member, Broad Institute and
Stanley Center for Psychiatric Research
Why Genetics?
• Correlation = causation
• Unique insights into biological causes
– Particular potential for mental health where direct biological measures, blood tests, etc. not available
• Durable foundation for rational therapeutic development
Despite diagnostic challenges, common
psychiatric diseases are extremely heritable
Sullivan, Daly, O’Donovan 2012
Gottesman 1991
Cowan, Kopnisky, Hyman 2002
Traditional approach to
gene discovery
First Century Genetics
1860s: Mendel’s laws of inheritance – discrete, transmissible units of inherited variation resulting in phenotypic differences
A B C
1910s: Sturtevant and Morgan create the first genetic map
1940s-1950s: Principles of linkage
analysis developed
‘Mendelian’ diseases travel predictably
and consistently in families
[Figure: pedigree with genotypes Aa and AA illustrating dominant transmission]
Thousands of diseases or traits caused by mutations in a single gene
(e.g., Huntington’s, CF, muscular dystrophy)
Family-based linkage analysis
[Figure: pedigree in which marker genotypes (AC, AA) co-segregate with an A/C disease gene]
Saw dramatic successes in the 1980s-90s
for the localization of genes underlying
countless Mendelian disorders:
Huntington’s, CF, DMD, early-onset forms
of breast cancer, Alzheimer’s, diabetes…
Tracking ‘co-segregation’ of DNA
polymorphisms with disease
status permits identification of
region containing responsible
gene and mutations
Glazier, Nadeau and Aitman, Science 2002
Wildly successful for rare diseases, this simply
does not work for common ones…
Dark Ages of complex trait genetics
If not Mendelian, what is the genetic
architecture of traits that are
1. highly heritable &
2. very common
1900-1925: the Dawn of Polygenicity
Biometricians recognize many traits are highly heritable but do not apparently adhere to Mendel’s laws…violently opposed by Mendelians
Key experiments in plants, flies demonstrate that large phenotypic differences can arise from the sum of many contributors
Fisher synthesizes model wherein large number of small ‘Mendelian’ factors can explain high heritability of continuous traits
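Fisher's argument can be illustrated with a small simulation. This is only a sketch: the number of loci, the allele frequency of 0.5, and the equal additive effect per allele are assumptions chosen for the example, not values from the talk.

```python
import random
import statistics

def simulate_trait(n_individuals, n_loci, allele_freq=0.5, seed=0):
    """Simulate an additive trait under assumed equal-effect loci.

    Each individual carries 0, 1 or 2 copies of the 'increasing' allele
    at each locus; the trait value is the total number of copies.
    """
    rng = random.Random(seed)
    traits = []
    for _ in range(n_individuals):
        total = 0
        for _ in range(n_loci):
            # Two independent allele draws per locus (Hardy-Weinberg assumption).
            total += (rng.random() < allele_freq) + (rng.random() < allele_freq)
        traits.append(total)
    return traits

# One locus gives a few discrete 'Mendelian' classes.
one_locus = simulate_trait(2000, n_loci=1)
# Many loci give a near-continuous, bell-shaped distribution.
many_loci = simulate_trait(2000, n_loci=100)

print("distinct values, 1 locus:", len(set(one_locus)))
print("distinct values, 100 loci:", len(set(many_loci)))
print("mean and SD, 100 loci:", statistics.mean(many_loci), statistics.stdev(many_loci))
```

With one locus the trait falls into at most three classes; with 100 loci the same discrete factors add up to something that looks continuous.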
Failure of linkage not difficult to
understand
disease state
phenotype1
phenotype2
phenotype3
phenotype4 phenotype5
Exposures / environment
genotype
other
genes
“We suggest that evolutionary changes in anatomy and way of life are more often based on changes in the mechanisms controlling the expression of genes than on sequence changes in proteins. We therefore propose that regulatory mutations account for the major biological differences between humans and chimpanzees.” –King & Wilson. Science. April, 1975.
Many genes vs. 1
Incomplete/low penetrance
Progress has required many
fundamental paradigm shifts
Understanding the genome and the fundamental nature of human variation
Dramatic technological advances in our ability to access genomes
If instead of one gene, there are
hundreds of contributors, how do
we proceed?
Study all common variation to find weak,
often regulatory risk factors? (CVAS/GWAS)
- OR -
Sequence a limited number of cases and controls
to find rare, high-impact variants? (RVAS)
Genomewide Association to common DNA variants
2005-present
10M or so common variants:
typically shared across populations
Gabriel et al, Science 2002 Rosenberg et al, Science 2002
The vast majority of genetic differences between individuals reside in common variants
(Lewontin 1972)
Therefore, most genetic variation in common traits should be explained by common variation
Really not a new idea
Observation: Extremely high heritability across cultures, backgrounds and relative pairs
Model: given frequency, heritability and lack of Mendelian segregation, polygenic inheritance of “constitutional predisposition” or “liability”, in the terminology of Falconer’s then-recent extension of quantitative genetics models to inherited risk of ‘all or none’ traits.
50 years ago…
GWAS 2005-present: the primary driver of discovery in common disease
First efforts in psychiatry were not successful
Genomewide association in schizophrenia with 3500 cases
International Schizophrenia Consortium 2009
Why did GWAS seem to work more readily in immune-mediated and
cardio/metabolic disease?
Another contributor: Natural Selection
From: Fecundity of Patients With Schizophrenia, Autism, Bipolar Disorder,
Depression, Anorexia Nervosa, or Substance Abuse vs Their Unaffected Siblings
JAMA Psychiatry. 2013;70(1):22-30. doi:10.1001/jamapsychiatry.2013.268
Ramifications:
Common and low frequency
variants can be plentiful at very
low effect sizes (OR < 1.1)
Large effect alleles must be
extremely rare
DRAMATICALLY REDUCED FITNESS IN SCHIZOPHRENIA AND AUTISM
Implication of
PNAS Dec. 2013
Modest ORs (2-10)
- Sweet spot for lipids,
CVD, Alzheimers, AMD,
immune disease
- De novo studies will not
flag these (most are
inherited and found in
unaffecteds)
- Selection keeps them
almost impossibly rare
to detect
High OR =
de novo detection
only
High frequency =
GWAS detection
Conceptually, a polygenic
model could fit per Gottesman
& Shields, but if so it would
require a much larger scale to
gain access to the individual
components
PGC statistical analysis group
Stephan Ripke
Ben Neale
Naomi Wray
Frank Dudbridge
Peter Holmans
Danyu Lin
Edwin van den Oord
Shaun Purcell
Sarah Medland
Nick Craddock
Danielle Posthuma
Ken Kendler
PGC Schizophrenia group
Michael O'Donovan
Pamela Sklar
Patrick Sullivan
Doug Levinson
Ed Scolnick
Pablo Gejman
Aiden Corvin
Anil Malhotra
Ayman Fanous
D Blackwood
Hugh Gurling
Kenneth Kendler
Michael Gill
Michael Owen
Leena Peltonen
Ole Andreassen
Roel Ophoff
David St. Clair
Sven Cichon
Thomas Schulze
Peter Holmans
Thomas Lehner
Aarno Palotie
Tonu Esko
Alan Sanders
Thomas Werge
Dan Rujescu
Bryan Mowry
Mathew Keller
Fundamental Shift:
Collaboration rather than competition
is the key
Psychiatric Genomics Consortium (PGC)
300+ investigators, 80 institutions, 20 countries
SCZ - Ancient times – 2009 (ISC)
2601 cases, 3345 controls
0 genome-wide significant sites
PGC - The Past – 2011
9394 cases, 12462 controls
5 genome-wide significant sites
More than 100 distinct regions
associated with schizophrenia!
PGC SCZ v2: Genomewide association in schizophrenia with 37,000 cases
July 22, 2014
DRD2
C4
SLC39A8
Common variants can abound, but only at
extremely modest ORs permitted by this
selective pressure
Odds-ratio vs risk allele frequency, PGC-schizophrenia 2014 results
(N ~ 35,000 cases, n=128 genome-wide significant variants)
GWAS: 100s of biological clues
available, >20,000 cases required to
start harvesting them
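Why so many cases are needed can be seen from a rough power calculation. The sketch below uses a normal approximation for a two-sided allelic case-control test at the conventional genome-wide threshold of 5e-8. The odds ratio of 1.1, the risk-allele frequency of 0.3, and the balanced case and control counts are illustrative assumptions, not figures from the PGC analyses.

```python
from statistics import NormalDist

def gwas_power(odds_ratio, control_freq, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided allelic test (normal approximation).

    Assumes the control allele frequency stands in for the population
    frequency and that alleles are independent (2 per individual).
    """
    p0 = control_freq
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    # Pooled allele frequency across cases and controls.
    p_bar = (p1 * n_cases + p0 * n_controls) / (n_cases + n_controls)
    # Standard error of the frequency difference under the null.
    se = (p_bar * (1 - p_bar) * (1 / (2 * n_cases) + 1 / (2 * n_controls))) ** 0.5
    shift = abs(p1 - p0) / se
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Hypothetical balanced studies of increasing size.
for n in (3500, 10000, 37000):
    print(n, round(gwas_power(1.1, 0.3, n, n), 4))
```

Under these assumptions power is negligible at a few thousand cases and approaches 1 only at tens of thousands, which matches the qualitative experience described on the slides.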
Biology begins to emerge from
the shadows…
Discovery: Alleles of C4 shape schizophrenia risk in
proportion to their effects on expression of C4A
Genetic result: (n=62,000, p < 10^-20)
(from brain tissue, n=100, p < 10^-4)
[Figure: association plot, x-axis Chromosome, y-axis -log10(p)]
Steve McCarroll, Aswin Sekar
Sekar et al, Nature, Feb 11 2016
C4 shapes the extent of
synaptic pruning
Allison Bialas, Matt Baum, Mike Carroll, Beth Stevens
WT C4 +/- C4 -/-
In C4 -/- mice, TRNs retain
multiple, overlapping inputs
C4 is expressed by RGCs during
“critical period” for pruning
A potential piece of the puzzle...
Huttenlocher, 1979
Excessive synaptic
pruning may play a role
Schizophrenia patient
Control
Schizophrenia patient
Birth Child Adult
Glantz & Lewis
Arch Gen Psychiatry
2000
Another example: SLC39A8
• Zn and Mn transporter
• A functional allele (A391T) corresponding to lower serum
Mn levels is convincingly associated to schizophrenia
risk
• Mendelian recessive deficiency of this gene recently
described as disorder of glycosylation
• Important biological clue and potentially a public health
intervention – 15% of Europeans carry this low
functioning, high-risk variant
Other emerging lessons
• Nearly all associated variants have the same effect across global populations
• Very few have unusual modes of association
– Effect differences by sex, parent-of-origin
– Epistasis (gene-gene interactions)
• Many/most are shared across diagnostic boundaries
– Considerable shared genetics across behavioral and cognitive traits/diagnoses
• Polygenic risk provides many insights
– Degree of shared genetic predisposition between traits
– Enrichment highlights specific cells/tissues of relevance
– Causal relationships between biomarkers and disease can be evaluated via Mendelian Randomization
All these observations are consistent with what is seen in other disease areas (e.g., immune-mediated diseases)
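As a reminder of how Mendelian Randomization turns association statistics into a causal estimate, here is a minimal sketch of the Wald ratio for single variants and an inverse-variance-weighted (IVW) combination. The effect sizes are hypothetical numbers invented for illustration, and the standard error uses a first-order approximation that ignores uncertainty in the variant-biomarker effect.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from one variant: outcome effect / exposure effect.

    The standard error is a first-order approximation that treats
    beta_exposure as known without error.
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se

def ivw(estimates):
    """Inverse-variance-weighted mean of (estimate, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se

# Hypothetical per-variant summary statistics:
# (effect on biomarker, effect on disease log-odds, SE of disease effect)
variants = [(0.20, 0.10, 0.02), (0.10, 0.03, 0.01)]
per_variant = [wald_ratio(bx, by, se) for bx, by, se in variants]
print(per_variant)
print(ivw(per_variant))
```

The estimate is only interpretable as causal if the variants affect disease solely through the biomarker, which is an assumption of the method rather than something the code checks.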
Measuring heritability
ISC, Nature 2009
Yang et al. 2010 Nat Genet
Lee et al. 2011 AJHG
Heritability explained by genome-wide significant SNPs < SNP heritability < Total narrow-sense heritability
SNP heritability = heritability explained by all SNPs genotyped on a standard array.
Hilary Finucane, Brendan Bulik-Sullivan, Ben Neale
Polygenic risk scores
Use largest GWAS meta-analysis
For each individual, calculate weighted sum of their risk alleles across the entire genome
Single predictive measurement describing the risk carried by that genetic background
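The weighted sum described above can be sketched in a few lines. The variant identifiers, odds ratios, and genotypes below are hypothetical, and using the log odds ratio as the per-allele weight is one common convention assumed here rather than something stated on the slide.

```python
import math

def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts for one individual.

    dosages: variant id -> number of risk alleles carried (0, 1 or 2,
             or a fractional imputed dosage)
    weights: variant id -> per-allele effect size from the GWAS
    Variants without a genotype for this individual are skipped.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)

# Hypothetical GWAS odds ratios converted to log-odds weights.
weights = {
    "variant_a": math.log(1.10),
    "variant_b": math.log(1.05),
    "variant_c": math.log(1.20),
}
# Hypothetical risk-allele counts for one person.
person = {"variant_a": 2, "variant_b": 0, "variant_c": 1}

print(polygenic_risk_score(person, weights))
```

In practice the sum runs over many thousands of variants, and the score is usually standardized against a reference sample before it is interpreted.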
Polygenic risk scores
[Figure: variance in SCZ risk explained by PRS, y-axis from 0.00 to 0.22]
Roughly 20% of the variance in SCZ risk in a new sample is captured by 2016 PRS
(Previous meetings: 3% (2009), 8% (2011), 18% (2014))
Polygenic risk scores
Established genomewide significant hit regions explain only 20-25% of this!
Many more pointers to biology will become definitively established as we expand GWAS
[Figure: variance in SCZ risk explained by PRS, y-axis from 0.00 to 0.22]
Many uses of PRS / MR / LD score regression
• Overlap between diseases helps to clarify shared pathways
• Enrichment of gene expression and epigenetic marks clarifies relevant cell types
• Understanding the relevance and relationships of continuous traits across population and prodromal phenotypes to disease
• Understanding the relevance of variation in functional assays and models by linkage to the heritable biological variation of disease
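For reference, LD score regression rests on a single relationship between a SNP's expected association statistic and how much variation it tags. It is written here in the conventional notation, which is an assumption of this note and not taken from the slides: N is the sample size, M the number of SNPs, h^2 the SNP heritability, \ell_j the LD score of SNP j, and a a term capturing confounding such as population stratification.

```latex
\mathbb{E}\left[\chi^2_j\right] = \frac{N h^2}{M}\,\ell_j + N a + 1,
\qquad
\ell_j = \sum_{k} r^2_{jk}
```

Regressing observed \chi^2 statistics on LD scores therefore gives a slope proportional to heritability and an intercept that absorbs confounding.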
Common polygenic variation –
even of weak effect – can
provide critical insights into
the root biological causes of
mental illness
Gottesman & Shields 1967
Strengths & limitations of common polygenic
risk
• Enables recognition of:
– Etiologic overlap
between diseases
– Evaluating causal
relationships between
biomarkers and disease
– Points to specific
cells/tissues relevant
– Can now resolve many
to single/few variants
45 best-resolved associations to IBD
Coding
TFBS
Epigenetics
eQTL
• Interpretation is
challenging
Despite gut and immune cells
being accessible and well-studied,
more than half of the non-coding
associations map to no known
enhancers, promoters, etc.
Effect sizes in complex disease
Associations discovered by GWAS
(IBD pictured here)
- Most discoveries (~80%) are non-coding
- Those that are coding have
dramatically larger range of effects
OR = 1.25
LDLR, APOB, LPL, APOA5, LPA,
PCSK9…
Myocardial infarction
RNF186, CARD9, NOD2, IL23R, …
Inflammatory Bowel Disease
SLC30A8, HNF1A, PAX4, …
Type 2 diabetes
APOE, ABCA7, TREM2, …
Alzheimer’s
CFH, CFI, C3, C9
AMD
These types of variants, at frequencies we have power to
detect
cannot in theory and
do not in empirical data
exist in autism, schizophrenia and traits with similar
selection
Natural selection prevents strong alleles from achieving
any measurable population frequency, and therefore
meaningful contribution to heritability –
de novo mutations, however, are exempt
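The argument can be made concrete with the classical mutation-selection balance approximation from population genetics. The notation is an assumption of this note: \mu is the per-generation mutation rate producing the allele, s the selection coefficient against it, h its dominance in heterozygotes, and \hat{q} the equilibrium allele frequency.

```latex
\hat{q} \approx \frac{\mu}{h s}
```

For illustration only, with an assumed \mu = 10^{-6} and hs = 0.1, the equilibrium frequency is about 10^{-5} (0.001%), two orders of magnitude below a 0.1% polymorphism. A de novo mutation has not yet been exposed to selection, so its effect size is not constrained in this way.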
[Figure: trio in which both parents are AA and the child is AC, a de novo mutation]
Rare variant studies seem
hopelessly underpowered – even
OR=2 has no chance to become
even a 0.1% polymorphism
Beneficial exception: de novo
mutations!
• Can have any effect size
• Easy to find – low background
rate
De novo variation and autism
[Figure: trio in which both parents are AA and the child is AB]
4000 ASD trios with deep exome sequence
compiled to date…
Autism Sequencing Consortium (ASC) founded to leverage further emerging sequence data collaboratively
Key Investigators: Buxbaum, Daly, Devlin, Roeder, State, Barrett, Cutler, Palotie, Scherer, Sanders, Talkowski, Walsh, Zwick
[Figure: Rate of de novo truncating mutations per exome (y-axis 0.00 to 0.22) in ASD, unaffected ASD siblings, and ID/DD. Panels labeled: All; Class 2 & Class 1 pLI ≥ 0.9; Class 1 pLI < 0.9. Rate ratios shown: RR = 1.55, RR = 3.24, RR = 6.70, RR = 2.15, NS. * P < 0.01, ** P < 0.001, *** P < 0.0001 (compared to unaffected ASD siblings).]
de novo mutations contribute to ASD & ID
Most reliable signal to date in
ASD – and much more strongly
in ID/DD and epileptic
encephalopathy – has been
excess of de novo truncating
mutations and CNVs
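A rate ratio of this kind, and a simple test of whether it exceeds 1, can be sketched as follows. The counts are hypothetical, and the exact conditional binomial test is one standard way to compare two Poisson rates, assumed here for illustration; it is not necessarily the test used in the studies cited.

```python
from math import comb

def rate_ratio(count_a, n_a, count_b, n_b):
    """Ratio of per-individual event rates between two groups."""
    return (count_a / n_a) / (count_b / n_b)

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def compare_de_novo_rates(count_cases, n_cases, count_sibs, n_sibs):
    """Rate ratio and one-sided p-value for an excess in cases.

    If both groups share one Poisson rate, then conditional on the total
    number of events, the number falling in cases is binomial with
    success probability n_cases / (n_cases + n_sibs).
    """
    total = count_cases + count_sibs
    p_null = n_cases / (n_cases + n_sibs)
    rr = rate_ratio(count_cases, n_cases, count_sibs, n_sibs)
    return rr, binomial_upper_tail(count_cases, total, p_null)

# Hypothetical counts: 60 events in 400 cases, 30 events in 400 siblings.
rr, p_value = compare_de_novo_rates(60, 400, 30, 400)
print(rr, p_value)
```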
ExAC reference database
critical to interpretation
[Figure: Sample size (N) and ancestral diversity of 1000 Genomes, ESP (N=6500), and ExAC (N=60,706). Bars show individuals with exome sequence data (axis 0 to 60,000) by ancestry: East Asian, South Asian, European, Middle Eastern, African, Latino / Native American ancestry, Diverse Other. A final bar shows world population proportions scaled to ExAC height.]
Daniel MacArthur, Monkol Lek, Kaitlin Samocha
Enables recognition of 20% of
genes that do not tolerate
heterozygous truncating
mutations – see also Cassa et al
(2017, in press, Nat Gen);
Petrovski/Goldstein RVIS papers
Lek et al. Nature 536:
285-291 (2016)
[Figure: Rate per exome (y-axis 0.00 to 0.22) of de novo variants in ASD, unaffected ASD siblings, and ID/DD. Panels labeled: All; mutations in tolerant genes; mutations in intolerant genes. Rate ratios shown: RR = 1.55, RR = 3.24, RR = 6.70, RR = 2.15, NS. * P < 0.01, ** P < 0.001, *** P < 0.0001 compared to unaffected ASD siblings.]
Zeroing in on the critical de novo mutations
Jack Kosmicki
pLI separates signal from noise in
schizophrenia exome study
Exome sequencing was performed on ~6,000 controls and
~5,000 cases with schizophrenia from Sweden. 0 genes
discovered. Slight signal for enrichment of singleton LoF
mutations in the cases.
Set                 All genes (n=19,131)    Haploinsufficient (pLI ≥ 0.95, n=2651)    Not haploinsufficient (pLI < 0.95, n=16480)
Synonymous          p=NA, OR=0.97           p=0.81, OR=0.96                           p=NA, OR=0.98
Missense            p=NA, OR=0.97           p=0.13, OR=0.98                           p=NA, OR=0.97
Loss-of-function    p=0.0025, OR=1.022      p=2.7e-13, OR=1.42                        p=0.60, OR=0.99
Giulio Genovese, Kaitlin Samocha, Steve McCarroll, Pat Sullivan, Pam Sklar, Christina Hultman
Next Steps in Gene Discovery
Additional Gene Discovery
Fine-Mapping and Functional Studies
Exome/Genome Meta Analysis
Manhattan plot, EAS
Combining EUR and EAS

Sample size
            # cases    # controls
EAS          13,305        16,244
EUR          33,640        43,456
COMBINED     46,945        59,700

Loci and SNPs associated
            # Loci     # SNPs
EAS             12         12
EUR            108        128
COMBINED       140        181
A lot more to be learned from
GWAS...
Specific loci and significantly
improved polygenic instruments
Beyond GWAS: Fine-mapping
Example from Inflammatory Bowel Disease
Immunochip: Specialized genotyping reagent focused on immune-mediated diseases:
– Type 1 Diabetes (T1DGC)
– Autoimmune thyroid disease
– Ankylosing spondylitis
– Crohn’s disease
– Celiac disease
– IgA deficiency
– Multiple sclerosis
– Primary biliary cirrhosis
– Psoriasis
– Rheumatoid arthritis
– Systemic lupus erythematosus
– Ulcerative colitis
• Nearly 50,000 Cases in GWAS & Immunochip
• Huang, Jostins, Fang et al. (BioRxiv October 2015; Nature June 2017)
Crohn’s & Ulcerative Colitis (IBD)
18 associations refined to a SINGLE VARIANT with greater than 95% posterior probability
• 200 bp into intron of TNFRSF6B
• 5 kb downstream of GPR35
• 500 bp from TSS of JAK2, massive ENCODE peak
• intronic to IL2RA (MEF2A/MEF2C binding site)
• RELA/NFKB binding site 40 kb upstream of IKZF1
• 10 kb from TSS of NKX2-3, also LINC01475 in between SNP and NKX2-3
• Intron of LRRK2
• 5 kb downstream from HNF4A
• 4 kb from TSS of PRDM1
• Intron of NOD2
• NOD2 insC
• NOD2 R702W
• NOD2 G908R
• NOD2 N289S
• IL23R V362I
• CARD9 splice variant
• IFIH1 I923V
• SMAD3 I65V
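What "refined to a single variant with greater than 95% posterior probability" means can be sketched with a simple credible-set calculation. The sketch assumes exactly one causal variant per region and equal prior probability for every variant, which is a common simplification and not necessarily the model used in the IBD fine-mapping study. The variant names and log Bayes factors are hypothetical.

```python
import math

def credible_set(log_bayes_factors, coverage=0.95):
    """Posterior probabilities and the smallest set reaching the coverage.

    log_bayes_factors: variant id -> natural-log Bayes factor for
    association. Assumes one causal variant and equal priors, so each
    posterior is that variant's Bayes factor divided by the regional sum.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    max_lbf = max(log_bayes_factors.values())
    unnormalized = {v: math.exp(lbf - max_lbf) for v, lbf in log_bayes_factors.items()}
    total = sum(unnormalized.values())
    posteriors = {v: u / total for v, u in unnormalized.items()}

    # Add variants in order of decreasing posterior until coverage is reached.
    chosen, cumulative = [], 0.0
    for variant, prob in sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(variant)
        cumulative += prob
        if cumulative >= coverage:
            break
    return posteriors, chosen

# Hypothetical region where one variant dominates the evidence.
posteriors, chosen = credible_set({"var1": 12.0, "var2": 6.0, "var3": 2.0})
print(posteriors)
print(chosen)
```

When one variant carries nearly all of the evidence, the 95% credible set shrinks to that single variant, which is the situation described for the 18 associations listed above.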
Finding the ultra-rare high-impact variants
Not as easy in case-control data
[Figure: Cumulative number of individuals sequenced by year, 2010-2017, for schizophrenia cases and controls (y-axis 0 to 30,000). Data labels for the two series: 265, 1271, 3722, 8100, 11853, 16372, 31682, 34165 and 266, 977, 3099, 5273, 7293, 9564, 21115, 23562.]
With UK10K done at Sanger, total now past 25K cases!
Like the early years of GWAS, many individual exome studies have been completed; later this year, for the first time, these will be assembled into a meta-analysis and the results made immediately available.
How are we getting along without trios?
[Figure: bar chart, y-axis 0 to .30, comparing ASD and control rates in SSC+ASC and in the Danish sample]
SSC+ASC: De novo PTVs
- high pLI genes only
- not in ExAC
Danish cases and control rates: all PTVs
- high pLI genes only
- not in ExAC
- Rate difference significant p < 10^-13
Inherited variants carry modest signal - but are also rare so do not
wash out de novo signal
[Figure: bar chart, y-axis 0 to .30, comparing ASD and control rates in SSC+ASC and in the Danish sample]
SSC+ASC: inherited PTVs
- high pLI genes only
- not in ExAC
- 18.5%-16% (p<.005)
Danish cases and control rates: all PTVs
- high pLI genes only
- not in ExAC
- Rate difference significant p < 10^-13
By focusing on damaging variants in intolerant genes, we ensure that variants are either de novo or VERY young
Acknowledgments
Elise Robinson
- Dan Weiner
- Emilie Wigdor
Hailiang Huang
Kaitlin Samocha
Christine Stevens
Jack Kosmicki
Stephan Ripke
Kyle Satterstrom
Ben Neale & Lab
Daniel MacArthur & Lab
Aarno Palotie & Lab
ExAC team
Hail team
ALSPAC
Beate St. Pourcain
George Davey Smith
Support from NIMH (Thomas Lehner), NHGRI, Simons Foundation, Stanley Center
iPSYCH team
Preben Mortensen, Anders Børglum
Thomas Werge, Merete Nordentoft, Ole Mors, David Hougaard, Mads Hollegaard, Jonas Grauholm
Jakob Grove, Ditte Demontis
Autism Sequencing Consortium
Joe Buxbaum
Matt State
Jeff Barrett
Ed Cook
Dave Cutler
Bernie Devlin
Aarno Palotie
Kathryn Roeder
Silvia de Rubeis
Stephan Sanders
Mike Talkowski
Mike Zwick
Inspiration:
Steve Hyman
Ed Scolnick
Families and clinicians contributing to:
PGC schizophrenia studies worldwide
Simons Simplex Collection
SVIP, AGRE, AGP, Autism Consortium
PGC-ASD
Ric Anney
Dan Arking
Bernie Devlin
Stephan Ripke
and many more…
Jakob Grove
Anders Børglum
iPSYCH+PGC ASD GWAS 2017 (17K cases, 1KG Phase 3 imputation)
Genome-wide significant
GWS after replication
+5 additional significant loci in MTAG analyses w/ Educational Attainment, SCZ & MDD
Appendix 1: Key references and figures regarding
polygenicity 1900-1925
Biometric / Mendelian debate
Karl Pearson William Bateson
East 1915: Inheritance of Corolla
Length in Nicotiana longiflora
Demonstration of polygenic inheritance through selective breeding
The first complex trait mapping:
Altenburg and Muller (1920)
Hermann Muller
Many Mendelian traits can sum
to a continuous distribution
Ronald Fisher (1918)