Genetic predisposition to papillary thyroid cancer
Albert de la Chapelle The Ohio State University
Fagin and Wells NEJM, 2016
Heritability of selected cancers
Site               Family risk ratio          Twin study [3]
                   Utah [1]    Sweden [2]     (proportion of variance)
Thyroid            8.48        12.42
Testis             8.57        8.5
Multiple myeloma   4.29        5.62
Prostate           2.21        9.41           0.42
Colorectum         2.54        4.41           0.35
Breast             1.83        2.01           0.27
Lung               2.55        3.16           0.26
All cancers        2.15        3.53
1. Goldgar et al. 1994; 2. Dong and Hemminki 2001; 3. Lichtenstein et al. 2000. Adapted from Risch 2001.
Odds ratio (OR)
Odds ratio = odds of acquiring the phenotype (e.g. PTC) if the marker is present, relative to the odds when it is absent
OR >1 is “predisposing”
OR <1 is “protective”
In the last few years the odds ratio has increasingly been called “effect size”
OR and effect size are used here as rough equivalents of “penetrance”
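As a minimal sketch of the definition above, the odds ratio can be computed from a 2x2 table of marker carriers among cases and controls. The function names and the counts are hypothetical and chosen only for illustration; the confidence interval uses the common log-OR standard-error approximation, which is an assumption and not a method stated on the slide.

```python
import math

def odds_ratio(case_with, case_without, control_with, control_without):
    """Odds of carrying the marker in cases divided by the odds in controls."""
    return (case_with * control_without) / (case_without * control_with)

def or_confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% CI from the standard error of log(OR)."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts: 300 of 1000 cases and 200 of 1000 controls carry the marker.
or_value = odds_ratio(300, 700, 200, 800)
low, high = or_confidence_interval(300, 700, 200, 800)
print(f"OR = {or_value:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An OR above 1 with a confidence interval that excludes 1 would be read as "predisposing" in the sense used here.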
Searching for predisposing genes
• Loss of heterozygosity
• Association studies
• Next generation sequencing
• Linkage analysis in families
Linkage in PTC: Literature
• Between 1997 and 2006 at least 5 loci were proposed: 1p21, 2q21, 8p23, 14q32, 19p13
• Despite vigorous efforts no genes have been found
• These data suggest genetic heterogeneity, multigenic and multifactorial inheritance, probably low penetrance.
The main reason for the failure of linkage analysis in PTC is overdiagnosis (?)
Unfortunately, next generation sequencing suffers from the same problem.
• Definition of overdiagnosis: “Diagnosis of those that would not, if left alone, result in symptoms or death”
• “Overdiagnosis accounting for thyroid cancer in women:
  South Korea 90%
  USA, Italy, France 70-80%
  Japan, Nordic countries, UK 50%”
• “There is no evidence of new risk factors or increased exposure”
NEJM 2016
Ahn et al. NEJM, 2014
[Figure: two versions of a pedigree showing mutation carriers (M) and wild-type individuals (WT). First version: LOD score 2.05. Second version: LOD score 0.79.]
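To make the LOD scores above concrete, here is a simplified sketch of a two-point LOD calculation for fully informative, phase-known meioses. It is an illustration under stated assumptions, not the method used to obtain the 2.05 and 0.79 values on the slides; the function name and the example counts are hypothetical.

```python
import math

def lod_score(n_meioses, n_recombinants, theta):
    """log10 of the likelihood at recombination fraction theta
    relative to the likelihood at theta = 0.5 (no linkage)."""
    if not 0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    n, k = n_meioses, n_recombinants
    if theta == 0:
        # Any recombinant is impossible at theta = 0.
        return float("-inf") if k > 0 else n * math.log10(2)
    log_likelihood = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    return log_likelihood - n * math.log10(0.5)

# Hypothetical family: 7 informative meioses, no apparent recombinants.
clean = lod_score(7, 0, 0.0)
# Same family if one affected member is actually a non-carrier
# (for example an overdiagnosed case), scored as one recombinant.
with_phenocopy = lod_score(7, 1, 1 / 7)
print(clean, with_phenocopy)
```

In this toy model a single misclassified individual lowers the score noticeably, which is the kind of effect the overdiagnosis argument relies on.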
Summary of candidate genes found by linkage (OSU)
Chromosome   Gene description                   No. families   Ref
8q24         A lncRNA inside the TG gene        7              1
12q14        SRGAP1                             4              2
4q32         An enhancer of unknown function    1              3
1. He et al. Cancer Res 2009
2. He et al. JCEM 2013
3. He et al. PLoS One 2013
He et al. JCEM 2013
Whole genome linkage analysis of 38 PTC families
Plot of the genome-wide linkage scan with the posterior probability of linkage (PPL) from 38 families.
He et al. JCEM 2013 (V. Vieland)
SRGAP1, Slit-Robo Rho GTPase activating protein 1 gene
• Located in 12q14; found by linkage
• Different missense mutations segregate with PTC in 3 families
• Missense mutations occur in sporadic cases and controls (OR 1.21, p=0.0008)
• Missense variants impair the inactivation of CDC42, a key function
• SRGAP1 is a candidate gene for PTC susceptibility and has low-medium penetrance
A>C enhancer mutation in gene desert found by targeted deep sequencing
He et al. PLoS One 2013
Long-range enhancer mutation 4q32 A>C
• Enhancer element is highly conserved
• ChIP assay confirms enrichment of enhancer marks, e.g. H3K4me1
• Enhancer binds TFs POU2F and YY1
• Risk allele (C) greatly impairs TF binding
• Enhancer RNA is greatly downregulated in thyroid tumors
Enhancer A>C mutation in 4q32 is ultra rare
• Found in 11 affected individuals of one large non-medullary thyroid carcinoma family
• Not found in 38 other families
• Not found in 2676 sporadic cases
• Not found in 2470 controls
• Target genes not yet found
This suggests an ultra-rare, high-penetrance mutation
Enhancer mutation in 4q32: Counseling
• Initial pedigree had 11 affected individuals
• Extensive counseling has resulted in a much larger pedigree
• Testing for mutation: positive 34/68, negative 34/68
• Sex ratio in mutation-positive individuals: males n=17, females n=17
Hypothesis
Most of the heritability in PTC is due to (common?) low-penetrance genes
The way to approach these is the genome-wide association study (GWAS)
Genome-wide association studies, GWAS
• Principle: search for marker that is more common in cases than in controls
• Discovery test: type many (e.g. 1 million) SNPs in e.g. 1000 cases, 1000 controls
• Validation test: type top SNPs (e.g. 50) in e.g. a further 2000 cases, 2000 controls
• Replication test: type top SNPs (e.g. 5) in further cases and controls from different populations
• Because of multiple testing, apply rigorous significance standards (e.g. p < 10^-8)
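The discovery step and the multiple-testing point can be sketched as follows. This is an illustrative example only: it assumes a Bonferroni correction (one common choice, which for 1 million SNPs at alpha 0.05 gives 5 x 10^-8, the same order as the threshold quoted above) and a simple allele-count chi-square test with 1 degree of freedom. The allele counts are hypothetical.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error near alpha."""
    return alpha / n_tests

def allelic_test(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical SNP typed in 1000 cases and 1000 controls (2000 alleles each):
# risk allele frequency 0.40 in cases and 0.30 in controls.
chi2, p_value = allelic_test(800, 1200, 600, 1400)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}, significant: {p_value < threshold}")
```

Only SNPs passing a threshold of this kind would normally be carried forward to the validation and replication steps.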
GWAS-generated loci for PTC
Summary, first two GWAS
• 9q22 (OR 1.8) intergenic, apparently related to one or two lincRNA genes (data shown). FOXE1 nearby
• 14q13 (OR 1.37) intergenic, risk allele affects thyroid specific lincRNA (data shown). NKX2-1 nearby
• 2q35 (OR 1.34) in DIRC3 gene
• 8p12 (OR 1.36) in first intron of the neuregulin 1 (NRG1) gene. Risk allele lowers gene expression
• 14q13 (OR 2.09) intergenic, close to but independent of first 14q13 locus
Putative lincRNA transcripts in 9q22
PTCSC = papillary thyroid cancer susceptibility candidate
The unspliced transcript of PTCSC2 (>60 kb) spans the genomic region containing SNP rs965513.
[Figure: map of the 9q22 region showing the shared haplotype with SNPs rs965513, rs1877431, rs10983700, rs1561960, rs7871887 and rs1867277, and the FOXE1, PTCSC2-unspliced and PTCSC2-spliced transcripts.]
Three enhancers and 4 functional variants in a ~33 kb block in 9q22
He et al. PNAS 2015
Two GWAS SNPs in chromosome 14q13.3
[Figure: map of the region showing PTCSC3, rs944289, MBIP, rs116909374, SFTA3 and NKX2-1, with the lincRNA (TCONS_00022711); scale labels 89 kb, 247 kb and 30 kb; expression panels for PTCSC3 and GAPDH.]
PTCSC3 in 14q13 locus
[Figure: NKX2.1 expression, plotted as 2^-(Delta Ct), by rs944289 genotype. Panel "Normal, NKX2.1 vs rs944289": CC n=11, CT n=33, TT n=28. Panel "Tumor, NKX2.1 vs rs944289": CC n=11, CT n=31, TT n=29. A second pair of panels for rs944289[T], "Adjacent unaffected" and "Tumor", each lists CC n=11, CT n=31, TT n=29.]
The risk allele [T] of the SNP increases the expression of NKX2-1 (TTF1) in thyroid tissue
Kruskal test p-value = 0.0899; pairwise Wilcoxon test: TT vs CC, p = 0.14; TT vs CT, p = 0.046
Kruskal test p-value = 0.0200; pairwise Wilcoxon test: TT vs CC, p = 0.074; TT vs CT, p = 0.0074
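The expression values in these plots are given as 2^-(Delta Ct), the usual relative measure from quantitative PCR. A minimal sketch of that calculation follows; the Ct values are hypothetical, and the choice of reference gene (GAPDH is assumed here because it appears as a control in the PTCSC3 panels) is an assumption, not something the slides state for NKX2-1.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference):
    """2^-(Delta Ct), where Delta Ct = Ct(target gene) - Ct(reference gene)."""
    return 2 ** -(ct_target - ct_reference)

# Hypothetical (target Ct, reference Ct) pairs grouped by rs944289 genotype.
samples = {
    "CC": [(26.0, 24.0), (27.0, 24.0)],
    "CT": [(25.5, 24.0), (26.0, 24.5)],
    "TT": [(25.0, 24.0), (25.0, 24.5)],
}

group_means = {
    genotype: mean(relative_expression(t, r) for t, r in pairs)
    for genotype, pairs in samples.items()
}
for genotype, value in group_means.items():
    print(genotype, round(value, 3))
```

A lower Delta Ct means more target transcript, so higher values on this scale correspond to higher expression.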
PTC: Examples of clinical association at GWAS loci
Data from genotyping 1216 cases and 1416 controls
• rs965513 associates with larger tumor size (p=0.025) and extrathyroidal extension (OR 1.29, p=0.045)
• rs2439302 associates with lymph node metastasis (OR 1.24, p=0.016) and multifocality of the tumor (OR 1.24, p=0.012)
Much more to come…
Jendrzejewski et al. Thyroid 2016
Predictive power of GWAS loci
Towards the development of a risk panel
• 5 loci described so far
• ORs range from ~1.4 to ~2.1
• Are these ratios additive?
2 large cohorts of cases and controls genotyped for the 5 loci
Ohio: 747 cases, 1047 controls; Warsaw: 1795 cases, 2090 controls
Additive risks sought
Liyanarachchi et al. Thyroid 2013
Cumulative odds ratios relative to number of risk alleles for 5 GWAS SNPs
Liyanarachchi et al. Thyroid 2013
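One way to look at cumulative risk is to count risk alleles per individual across the SNP panel and compute an odds ratio for each count relative to a reference count. The sketch below is a hedged illustration of that idea, not the analysis in the cited paper; the SNP names, risk alleles and all counts are hypothetical placeholders.

```python
from collections import Counter

def risk_allele_count(genotypes, risk_alleles):
    """Number of risk alleles carried (0 to 2 per SNP).
    genotypes: SNP -> two-letter genotype such as 'CT'.
    risk_alleles: SNP -> risk allele letter."""
    return sum(genotypes[snp].count(allele) for snp, allele in risk_alleles.items())

def odds_ratios_by_count(case_counts, control_counts, reference):
    """Odds ratio for each risk-allele count relative to the reference count."""
    cases, controls = Counter(case_counts), Counter(control_counts)
    reference_odds = cases[reference] / controls[reference]
    return {
        k: (cases[k] / controls[k]) / reference_odds
        for k in sorted(cases)
        if controls[k] > 0
    }

# Hypothetical two-SNP panel and one individual's genotypes.
panel = {"snp_a": "T", "snp_b": "A"}
print(risk_allele_count({"snp_a": "TT", "snp_b": "AG"}, panel))

# Hypothetical cohorts summarized by risk-allele count per individual.
case_counts = [0] * 10 + [1] * 20 + [2] * 30
control_counts = [0] * 20 + [1] * 20 + [2] * 10
cumulative = odds_ratios_by_count(case_counts, control_counts, reference=0)
print(cumulative)
```

If the per-count odds ratios rise steadily with the number of risk alleles, that is consistent with the loci contributing risk jointly.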
Third GWAS deCODE + OSU
Paper submitted, Gudmundsson et al. 2016
• Previous 5 loci confirmed
• 5 new loci detected
• Involvement of coding genes observed
Next generation sequencing (NGS)
* Whole exome sequencing (WES)
* Whole genome sequencing (WGS)
• In principle, predisposing genes can be found in individual patients by whole genome sequencing and perfect bioinformatics
• In practice this does not work
• Power of resolution can be improved by studying families and searching for variants shared by affected members
• When families are reasonably large, linkage can help focus the search for relevant variants
• Discrimination power can be enhanced by haplotyping
• NB: overdiagnosis of PTC
WES of PTC
• Our first NGS experiment
• Study 7 PTC families with > 4 affected
• Do WES on 2 affected/family
• Present results: positive finding in 2
Whole exome sequencing in 7 PTC families
[Figure: pedigrees of the 7 PTC families, with symbols marking individuals who underwent whole exome sequencing and, with *, individuals whose genotyping was used for linkage analysis.]
Filtering principles and results
Conditions and filtering                              No. of variants
• Variants detected                                   ~1 million/individual
• Quality filtering                                   ~200,000/individual
• Elimination of common variants (>0.01)              ~10,000/individual
• Variant shared by the 2 affecteds                   ~2000/family
• Found in one family only                            ~400/family
• Deleterious by nature of variant & conservation     ~100/family
• Expressed in thyroid                                ~40/family
Akagi, Symer et al.
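The filtering cascade above can be sketched as a sequence of predicates applied in order, recording how many variants survive each step. This is a minimal illustration, not the pipeline actually used; the `Variant` fields, the function name and the example records are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    name: str
    passes_quality: bool
    population_frequency: float
    carriers: frozenset            # sequenced affected individuals carrying it
    seen_in_other_families: bool
    predicted_deleterious: bool
    expressed_in_thyroid: bool

def filter_family_variants(variants, sequenced_affecteds, max_frequency=0.01):
    """Apply the filters in slide order; return survivors and per-step counts."""
    steps = [
        ("quality filtering", lambda v: v.passes_quality),
        ("common variants removed", lambda v: v.population_frequency <= max_frequency),
        ("shared by the sequenced affecteds", lambda v: sequenced_affecteds <= v.carriers),
        ("found in one family only", lambda v: not v.seen_in_other_families),
        ("predicted deleterious", lambda v: v.predicted_deleterious),
        ("expressed in thyroid", lambda v: v.expressed_in_thyroid),
    ]
    remaining = list(variants)
    counts = []
    for label, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        counts.append((label, len(remaining)))
    return remaining, counts

# Hypothetical family with two sequenced affected members, "A1" and "A2".
affecteds = frozenset({"A1", "A2"})
example = [
    Variant("v1", True, 0.0005, affecteds, False, True, True),
    Variant("v2", True, 0.20, affecteds, False, True, True),           # too common
    Variant("v3", True, 0.001, frozenset({"A1"}), False, True, True),  # not shared
    Variant("v4", False, 0.001, affecteds, False, True, True),         # low quality
]
candidates, counts = filter_family_variants(example, affecteds)
print([v.name for v in candidates], counts)
```

Keeping the per-step counts makes it easy to compare a run against the approximate numbers in the table.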
How to filter the remaining candidates (n=40)
• Validate mutation by Sanger
• Literature (cancer involvement?)
• Databases (same mutation seen?)
• Cosegregation in the family
• Genotyping results from deCODE
• Linkage (peak or valley)
• Haplotype sharing
• Population occurrence
Segregation of SRRM2 c.1037 (S346F) variant in Family 7
Haplotype sharing
Typing of PTC cases and controls:
7/1170 sporadic cases
0/1404 controls
OR = 8.14; p-value = 0.049
0/138 familial cases
Haplotype sharing: a good filter to eliminate candidate variants
(Final 21 candidates from Family 7)
Linkage analysis of Family 7
SRRM2 (chr 16)
SRm300 aka SRRM2 Gene (Serine/arginine repetitive matrix protein 2)
NM_016333: 9379 nt in 15 exons; Ser346Phe (S346F)
human        339 KDKDKKEKSATRPSPSPERSSTGPEPPAPTPLLAERHGGSPQPLAT 383
mouse        241 KD--KKEKSAVRPSPSPERSSTGPELPAPTPLLVEQHVDSPRPLAA 285
chimpanzee   339 KDKDKKEKSATRPSPSPERSSTGPEPPAPTPLLAERHGGSPQPLAT 383
pig          339 KGKDKKEKSAVRSSPSPERSSTGPEPPAPTPLLAEQHGGSPQPLAT 383
dog          339 KDKE-KEKSGIRPSPSPERSSTGPEPPAPTPLLAEQHGGSPQPLAT 381
cat          263 KDKD-KEKSAIRPSPSPERSSAGPEPPAPTPLLAEQHGGSPQPLAT 307
cattle       338 KD--KKEKSAVRPSPSPERSNTGPEPPAPTLLLAEQHGGSPQPLAT 381
sperm whale  355 KDKDKKEKSAVQPSPSPERSSTGPELPAPTPLLAEQYGGSPQPLAT 400
[Figure: SRRM2 protein schematic, residues 1 to 2752, with domains RSD-1 and RSD-2.]
Protein: 2752 aa, 300 kDa
Heat map of 1642 alternative splicing events
RNA-Seq data: alternatively spliced transcripts in the cases were differentially expressed when compared to controls
PSI: the ratio of the expression level of the “included” isoform to the sum of the expression levels of both spliced isoforms.
Yellow: higher PSI. Blue: lower PSI.
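The PSI definition above translates directly into a small calculation. The sketch below assumes expression is summarized as read counts for the included and the excluded (skipped) isoform; the function name and the counts are hypothetical.

```python
def psi(included, excluded):
    """Included-isoform expression divided by the sum of both isoforms.
    Returns None when neither isoform is detected."""
    total = included + excluded
    if total == 0:
        return None
    return included / total

# Hypothetical read counts for one splicing event in a case and a control.
case_psi = psi(included=30, excluded=10)
control_psi = psi(included=10, excluded=30)
print(case_psi, control_psi, case_psi - control_psi)
```

In a heat map like the one described, each cell would hold one such value per event and sample.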
Main problems
• Only coding DNA typed
• Unexpectedly common mutation (>1%; >3% etc.) would be filtered out
• Only mutations classified as “pathogenic” are considered
[Figure: pedigrees labeled 121781, 144961, 150004, 128705, 89281, 75700, 69238 and 20778. Key: * PTC; ** PTC & melanoma; $ whole genome sequence; further symbols for melanoma, goiter or nodules, other benign thyroid disease, spherocytosis and other malignancy.]
Whole genome sequencing performed in 8 families
FILTERING
Whole genome sequencing in PTC families
Summary of results in 8 families
• Filtering excluded everything except coding variants of genes expressed in thyroid tissue and with predicted pathogenicity
• The median number of remaining candidate genes per family was 27 (range 14-83)
• Efforts to identify the correct gene(s) are underway
So What?
Consequences of gene discoveries
• Improved molecular insight; pathways? Yes, but slow
• Diagnosis? Yes, but mainly in families
• Clinical stratification? Promising, but so far modest impact