
Pr. Thierry Soussi

SESHAT: AN INNOVATIVE TOOL TO HANDLE SOMATIC AND GERMLINE TP53 VARIANTS

PITFALLS IN TP53 STATUS ANALYSIS

November 2017

TP53 VARIANT PATHOGENICITY

HUNTING NOVEL TP53 SNP

SESHAT

Input: amino acid substitution and protein ID

Alignments

Sequence

Structure

Conservation score

Frequencies

Physicochemical properties

3D structure

Secondary structure

Surface area properties

Output Predictions

TYPICAL PIPELINE FOR IDENTIFICATION OF DELETERIOUS VARIANTS IN CODING REGIONS

SIFT takes a query sequence and uses multiple alignment information to predict tolerated and deleterious substitutions for every position of the query sequence.

Deleterious / Borderline / Nondeleterious
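The alignment-based idea can be illustrated with a toy sketch. This is not SIFT's actual algorithm: the alignment is made up, and the scoring rule (residue frequency in the alignment column, scaled by the most common residue, with a 0.05 cutoff) is a simplifying assumption chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy alignment: a query sequence followed by three homologs.
ALIGNMENT = ["MEEPQ", "MEEPQ", "MDEPQ", "MEESQ"]


def column_frequencies(alignment, position):
    """Relative residue frequencies at a 0-based alignment column, gaps ignored."""
    residues = [seq[position] for seq in alignment if seq[position] != "-"]
    if not residues:
        raise ValueError("column contains only gaps")
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


def classify_substitution(alignment, position, new_residue, threshold=0.05):
    """Label a substitution from how often the new residue is seen at that column."""
    freqs = column_frequencies(alignment, position)
    # Scale by the most frequent residue so a fully conserved residue scores 1.0.
    score = freqs.get(new_residue, 0.0) / max(freqs.values())
    label = "tolerated" if score >= threshold else "deleterious"
    return label, score


for residue in ("D", "K"):
    print(residue, classify_substitution(ALIGNMENT, 1, residue))
```

A residue already observed in homologs at that position (D) is labeled tolerated, while one never observed (K) is labeled deleterious. Such a label describes the protein, not the disease.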

PREDICTING DELETERIOUS AMINO ACID SUBSTITUTIONS IS NOT PREDICTING PATHOGENICITY.

PREDICTING PATHOGENICITY MUST BE (WILL BE) GENE SPECIFIC

IDENTIFICATION OF PATHOGENIC VARIANTS IN CODING REGIONS

WHETHER IT WILL ALSO BE TUMOR TYPE SPECIFIC IS AN OPEN QUESTION

UMD-TP53

83 004 TP53 records

75 448 tumors*

6 916 TP53 variants

Sequencing artifacts have been identified and tagged

Main problem: duplicate entries

http://p53.fr

* The difference between the number of tumors and the number of records is due to the fact that some tumors have multiple TP53 mutations

[Figure: cumulated number of novel TP53 variants per year, 1989 to 2017 (axis 0 to 2000). Series: Cumulated SNV 5-8; Cumulated Ins / Del 5-8; TCGA.]

WE HAVE IDENTIFIED MOST POTENTIAL TP53 SNV

Cumulated TP53 novelty since 1989. Since 2008, the number of novel TP53 variants detected in human tumors has plateaued, whereas the rate of detection of novel frameshift mutations has remained constant for more than 20 years.

Legend: Indel, SNV, TCGA data

[Figure: the codon CGA with the alternative bases possible at each of its three positions.]

A single codon can sustain 9 different single nucleotide substitutions
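The count of 9 follows from 3 positions times 3 alternative bases. The short sketch below enumerates them for the codon CGA shown on the slide and classifies each with the standard genetic code. The function names are illustrative, and the total for the whole coding sequence assumes 393 codons, as suggested by the 1 to 393 axis used elsewhere in the deck.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each of the three codon positions.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_substitutions(codon):
    """Return the 9 codons that differ from `codon` at exactly one position."""
    variants = []
    for i, ref_base in enumerate(codon):
        for alt in BASES:
            if alt != ref_base:
                variants.append(codon[:i] + alt + codon[i + 1:])
    return variants


def classify(ref_codon, alt_codon):
    """Classify a codon change as synonymous, nonsense or missense."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


variants = single_nucleotide_substitutions("CGA")
for alt in variants:
    print("CGA >", alt, CODON_TABLE[alt], classify("CGA", alt))

summary = Counter(classify("CGA", alt) for alt in variants)
print(dict(summary))
print("possible coding SNV over 393 codons:", 393 * 9)
```

Because the space of possible substitutions is finite and small, it is plausible for a large database to approach saturation for SNV, which is the point made by the slide title.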

[Figure: number of substitutions per codon along the TP53 coding sequence (codons 1 to 393, scale -9 to 9), with codons 40, 114 and 175 marked. Legend: missense mutation only; found in human cancer; never found in human cancer. Annotations: Mdm2 binding site; hotspot 170-175; 183; 114; regulatory domain (uncertain function).]

FREQUENCY OF TP53 VARIANT IN THE UMD TP53 DATABASE

[Figure: axes labeled "Frequency in the database" (ticks 0 to 4000) and "Variant occurrence in the database" (ticks 1 to 10000).]

p.R175H (3,500x)

p.G302W (1x)

507 variants have been found only once

ASSESSING TP53 VARIANTS PATHOGENICITY

UMD TP53: 82 000 MUTATIONS

Missense mutations: 73%
Nonsense mutations: 11%
Synonymous mutations: 2%
Splice mutations: 3%
Indel: 11%

1% are pathogenic

chr17:g.7578552C>G

c.378C>G

p.Y126* 41

Evgeny M. Makarov et al. 2017, PLoS ONE

p.Y126del
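The step from c.378C>G to p.Y126* is plain codon arithmetic, sketched below. The reference codon is an assumption: codon 126 is taken to be TAC, the only tyrosine codon ending in C. The helper names are illustrative. The sketch translates one codon and nothing else, so it cannot produce a description such as p.Y126del.

```python
import re
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each of the three codon positions.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_substitution(hgvs_c):
    """Split a simple coding substitution such as 'c.378C>G' into its parts."""
    match = re.fullmatch(r"c\.(\d+)([ACGT])>([ACGT])", hgvs_c)
    if match is None:
        raise ValueError(f"not a simple coding substitution: {hgvs_c}")
    return int(match.group(1)), match.group(2), match.group(3)


def codon_location(cdna_position):
    """Return (codon number, 0-based offset in the codon) for a 1-based coding position."""
    return (cdna_position - 1) // 3 + 1, (cdna_position - 1) % 3


def protein_consequence(hgvs_c, ref_codon):
    """Predict the codon-level consequence of a substitution, ignoring splicing."""
    position, ref, alt = parse_substitution(hgvs_c)
    codon_number, offset = codon_location(position)
    if ref_codon[offset] != ref:
        raise ValueError("reference base does not match the supplied codon")
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    return f"p.{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"


# Assumed reference codon for Tyr126: TAC.
print(protein_consequence("c.378C>G", "TAC"))  # p.Y126*
```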

Is there a TP53 null genotype?

All these cell lines have a TP53 nonsense mutation

Gain of function

Increase metastatic potential

HUNTING NOVEL TP53 SNP

TP53 AND RARE SNP

R72P, P36P, P47S, R213R

THE TP53 GENE HAS BEEN SEQUENCED IN MORE THAN 150 000 INDIVIDUALS

IN MOST CASES MATCHED DNA FROM THE SAME PATIENT IS NOT SEQUENCED

gnomAD (or previously ExAC): 138,632 individuals; only germline variants

dbSNP: includes pathogenic variants; low number of individuals

Variant    Allele count
p.R72P     183,709
p.P36P     3,520
p.R213R    3,407
p.P47S     433
p.D21D     136

Variant    Allele count
p.R175H    1
p.R248Q    5
p.R273H    4
p.R273C    3
p.R248W    1
p.Y220C    2
…

The gnomAD database includes multiple pathogenic TP53 variants

What is the true frequency of pathogenic TP53 variants in the normal population?
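As a rough order of magnitude, the allele counts quoted above can be converted to allele frequencies. This is a sketch under a strong assumption: every one of the 138,632 individuals is taken to be genotyped at each site (2 alleles per individual), whereas the real number of genotyped alleles varies from site to site. The pathogenic list on the slide is truncated, so its sum is only a lower bound.

```python
INDIVIDUALS = 138_632  # gnomAD cohort size quoted on the slide

COMMON_SNP_COUNTS = {
    "p.R72P": 183_709,
    "p.P36P": 3_520,
    "p.R213R": 3_407,
    "p.P47S": 433,
    "p.D21D": 136,
}
PATHOGENIC_COUNTS = {
    "p.R175H": 1,
    "p.R248Q": 5,
    "p.R273H": 4,
    "p.R273C": 3,
    "p.R248W": 1,
    "p.Y220C": 2,
}


def allele_frequency(allele_count, individuals=INDIVIDUALS):
    """Allele count divided by the number of chromosomes (two per individual)."""
    return allele_count / (2 * individuals)


for variant, count in {**COMMON_SNP_COUNTS, **PATHOGENIC_COUNTS}.items():
    print(f"{variant}: {allele_frequency(count):.2e}")

# Lower bound only: the slide's pathogenic list ends with an ellipsis.
listed_pathogenic = sum(PATHOGENIC_COUNTS.values())
print("listed pathogenic alleles:", listed_pathogenic)
print(f"combined frequency: {allele_frequency(listed_pathogenic):.2e}")
```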

National genome databases (Finland, Sweden, Netherlands, Japan, …)

gnomAD

FLOSSIES

Extraction of all coding variants

Remove well-known non-pathogenic SNP

48 remaining SNP

Remove pathogenic variants

UMD TP53 database

Germline variants found at a frequency higher than expected

23 remaining SNP

Functional analysis
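The two removal steps amount to set filtering, which can be sketched as follows. The variant lists are illustrative: the well-known SNP and the pathogenic variants are names taken from earlier slides, and `p.EXAMPLE1` / `p.EXAMPLE2` are hypothetical placeholders, not real variants. The exact order of the steps and the criterion that reduces 48 SNP to 23 are not spelled out in the slides, so they are not modeled here.

```python
def filter_candidate_snps(coding_variants, known_benign, known_pathogenic):
    """Keep variants that are neither well-known SNP nor known pathogenic variants."""
    excluded = set(known_benign) | set(known_pathogenic)
    remaining = []
    for variant in coding_variants:
        # Preserve input order and drop duplicates coming from several databases.
        if variant not in excluded and variant not in remaining:
            remaining.append(variant)
    return remaining


# Coding variants pooled from population databases (hypothetical example).
coding_variants = [
    "p.R72P", "p.R175H", "p.EXAMPLE1", "p.P47S", "p.EXAMPLE2", "p.EXAMPLE1",
]
# Treated here as well-known SNP, for illustration only.
known_benign = ["p.R72P", "p.P36P", "p.R213R", "p.P47S"]
# Pathogenic variants, for example as catalogued in a tumor mutation database.
known_pathogenic = ["p.R175H", "p.R248Q", "p.R273H"]

candidates = filter_candidate_snps(coding_variants, known_benign, known_pathogenic)
print(candidates)
```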

[Figure: functional analysis of the candidate SNP (M1 to M9) compared with Wt TP53 and R175H. Panels: transactivation (scale 0 to 350000); growth arrest; apoptosis (annexin / PI; Wt TP53, Mut TP53, SNP M1, SNP M2, control; apoptosis versus no apoptosis).]

Sequencing primers and intronic SNP

Several intronic primers frequently used for sequencing overlap SNP

SESHAT: A WEB PORTAL FOR TP53 MUTATION ANALYSIS

Annotation accuracy

The 3' rule: a misapplied guideline
Most TP53 insertions are duplications
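Both annotation points can be made concrete with a small sketch: shift an insertion to its most 3' equivalent position, then check whether it copies the sequence immediately before it, in which case it is described as a duplication. The function name and the toy sequences are illustrative, positions in the output are 1-based, and insertions at the very start or end of the reference are not handled. The 3' direction here is that of the sequence passed in, so a transcript-oriented description needs the coding-strand sequence.

```python
def describe_insertion(reference, index, inserted):
    """Describe inserting `inserted` before reference[index] (0-based index).

    Returns a position-only description such as '6dup', '4_5dup' or '3_4insC'.
    """
    if not inserted:
        raise ValueError("inserted sequence must not be empty")
    # 3' rule: while the next reference base equals the first inserted base,
    # the same result is obtained one position further 3' with a rotated insert.
    while index < len(reference) and reference[index] == inserted[0]:
        inserted = inserted[1:] + inserted[0]
        index += 1
    n = len(inserted)
    # An insert identical to the n bases just before it is a duplication.
    if index >= n and reference[index - n:index] == inserted:
        start, end = index - n + 1, index
        return f"{start}dup" if n == 1 else f"{start}_{end}dup"
    return f"{index}_{index + 1}ins{inserted}"


print(describe_insertion("ACGTTTGA", 3, "T"))   # 6dup
print(describe_insertion("ACGTGA", 2, "GT"))    # 4_5dup
print(describe_insertion("ACGTTTGA", 3, "C"))   # 3_4insC
```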

Using a correct reference

p.R43H: a new mutation hotspot?

35 kDa

53 kDa

[Diagram: SESHAT workflow]

BAM
Users
Single mutant analysis
Batch analysis
Seshat: transformation
Mutalyzer: checking
Seshat: analysis
Batch analysis outputs
Single mutant analysis outputs
PatientID: test1
cDNA_variant: c.743G>A
HG19_Variant: chr17:g.7577538G>A
NG_017013.2_Variant: NG_017013.2:g.18331G>A
SNP_ID: rs11540652
Transcript t1: NM_000546.5 c.743G>A
Transcript t2: NM_001126112.2 c.743G>A
Transcript t3: NM_001126114.2 c.743G>A
Transcript t4: NM_001126113.2 c.743G>A
Transcript t5: NM_001126115.1 c.347G>A
Transcript t6: NM_001126116.1 c.347G>A
Transcript t7: NM_001126117.1 c.347G>A
Transcript t8: NM_001126118.1 c.626G>A
Protein P1 (TP53_alpha): LRG_321p1:p.R248Q
Protein P3 (TP53_beta): LRG_321p3:p.R248Q
Protein P4 (TP53_gamma): LRG_321p4:p.R248Q
Protein P8 (Delta40_TP53_alpha): LRG_321p8:p.R209Q
Protein P9 (Delta40_TP53_beta): LRG_321p9:p.R209Q
Protein P10 (Delta40_TP53_gamma): LRG_321p10:p.R209Q
Protein P5 (Delta133_TP53_alpha): LRG_321p5:p.R116Q
Protein P6 (Delta133_TP53_beta): LRG_321p6:p.R116Q
Protein P7 (Delta133_TP53_gamma): LRG_321p7:p.R116Q
Protein P11 (Delta160_TP53_alpha): LRG_321p11:p.R89Q
Protein P12 (Delta160_TP53_beta): LRG_321p12:p.R89Q
Protein P13 (Delta160_TP53_gamma): LRG_321p12:p.R89Q
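The isoform numbering in this output follows a fixed offset per N-terminal truncation, which the sketch below reproduces. It assumes that each Delta isoform starts at the codon named in its label (40, 133, 160), an assumption that is at least consistent with the output above (248 becomes 209, 116 and 89). It handles simple missense or nonsense notation only and ignores the C-terminal differences between alpha, beta and gamma forms. The same arithmetic turns p.R175H into p.R43H in Delta133 numbering, which may be what the "p.R43H: a new mutation hotspot?" slide alludes to.

```python
import re

# Assumed first codon of each N-terminal variant, in full-length numbering.
ISOFORM_START_CODON = {
    "TP53": 1,
    "Delta40_TP53": 40,
    "Delta133_TP53": 133,
    "Delta160_TP53": 160,
}


def renumber_variant(variant, isoform):
    """Renumber a variant such as 'p.R248Q' for an N-terminally truncated isoform.

    Returns None when the residue lies upstream of the isoform start.
    """
    match = re.fullmatch(r"p\.([A-Z])(\d+)([A-Z*])", variant)
    if match is None:
        raise ValueError(f"unsupported variant notation: {variant}")
    ref, position, alt = match.group(1), int(match.group(2)), match.group(3)
    start = ISOFORM_START_CODON[isoform]
    if position < start:
        return None
    return f"p.{ref}{position - start + 1}{alt}"


for isoform in ISOFORM_START_CODON:
    print(isoform, renumber_variant("p.R248Q", isoform))

print(renumber_variant("p.R175H", "Delta133_TP53"))  # p.R43H
```

Reporting the transcript or protein reference together with the variant avoids this kind of ambiguity.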

PatientID: test1
Records_Number: 2562
Variant_Classification: Missense_Mutation
Variant_Type: SNP
Comment_1_Frequency: This single nucleotide variant is very frequent
Comment_2_Activity: Inactive
Comment_3_Isoforms: This substitution targets 12 TP53 isoforms
Comment_4prediction: Damaging
Comment_5_Outliers: -
Comment_6_Splicing: No splice defect predicted
Comment_7_Sequence: -
Comment_9_SNP: Both germline and somatic variants have been described; see comment 10 for population data
Comment_10_population: rs11540652; listed in the ExAC database (without the TCGA cohort) with allele count = 7
Pathogenicity: Pathogenic
Final_comment: Published research as well as database analysis provide sufficient evidence for classification of this variant as pathogenic.
mutalyzer_comment:

PatientID: test1
Sift_Prediction: Damaging
Sift_Score: 0.006
Polyphen-2_HumVar: Probably damaging
Polyphen-2_HumDiv: Probably damaging
Mutassessor_prediction: Medium
Mutassessor_score: 2.970
Provean_prediction: Deleterious
Provean_Score: -3.915
Condel: Deleterious
Condel_Score: 0.905
MutPred_Splice_General_Score: 0.181
MutPred_Splice_Prediction_Label: Splice Neutral Variant (SNV)
MutPred_Splice_Confident_Hypotheses: Not relevant
SIFT_converted_rankscore: 0.620
Polyphen2_HDIV_rankscore: 0.899
Polyphen2_HVAR_rankscore: 0.916
LRT_converted_rankscore: 0.856
MutationTaster_converted_rankscore: 0.708
MutationAssessor_rankscore: 0.886
FATHMM_rankscore: 0.998
MetaSVM_rankscore: 0.968
MetaLR_rankscore: 0.998
VEST3_rankscore: 0.860
PROVEAN_converted_rankscore: 0.728
phyloP46way_primate_rankscore: 0.782
phyloP46way_placental_rankscore: 0.950
GERP_RS_rankscore: 0.970
SiPhy_29way_logOdds_rankscore: 0.973
Evol_Score_II: 78.950
RFscore_II: 0.992
phyloP100way_vertebrate_rankscore: 0.996
phastCons46way_primate_rankscore: 0.973
phastCons46way_placental_rankscore: 0.992
phastCons100way_vertebrate_rankscore: 0.988
CADD_raw_rankscore: 0.999
DANN_rank_score: 0.999

What future?

Version 1: data, analysis. Data are not stored.

Version 2.0: data, analysis. Data can be stored in a private database, available to specific consortia.

THANK YOU
