Toxicological Relationships Between Proteins Obtained From
a Molecular Spam Filter
Florian Nigsch & John Mitchell
F. Nigsch, et al., J. Chem. Inf. Model., 48, 306-318 (2008)
F. Nigsch, et al., Toxicology and Applied Pharmacology, 231, 225-234 (2008)
F. Nigsch, et al., J. Chem. Inf. Model., 48, 2313-2325 (2008)
Florian Nigsch: now at Novartis Institutes, Boston
John Mitchell: soon moving to University of St Andrews
Spam
• Unsolicited (commercial) email
• Approx. 90% of all email traffic is spam
• Where are the legitimate messages?
• Filtering
Analogy to Drug Discovery
• Huge number of possible candidates
• Virtual screening to help in selection process
Properties of Drugs
• High affinity to protein target
• Soluble
• Permeable
• Absorbable
• High bioavailability
• Specific rate of metabolism
• Renal/hepatic clearance?
• Volume of distribution?
• Low toxicity
• Plasma protein binding?
• Blood-Brain-Barrier penetration?
• Dosage (once/twice daily?)
• Synthetic accessibility
• Formulation (important in development)
Multiobjective Optimisation
• Bioactivity
• Synthetic accessibility
• Permeability
• Toxicity
• Metabolism
• Solubility
Huge number of candidates … most of which are useless!
Winnow Algorithm
• Invented in late 1980s by Nick Littlestone to learn Boolean functions
• Name from the verb “to winnow”
– High-dimensional input data
• Natural Language Processing (NLP), text classification, bioinformatics
• Different varieties (regularised, Sparse Network Of Winnow - SNOW, …)
• Error-driven, linear threshold, online algorithm
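The "error-driven, linear threshold, online" behaviour can be illustrated with a minimal sketch of a Winnow-style learner. It assumes the multiplicative promote/demote variant (often called Winnow2), Boolean features, a promotion factor `alpha = 2` and a threshold of `n/2`; the target concept `x0 OR x3` is a toy assumption and not the molecular model from the talk.

```python
from itertools import product

def winnow_predict(weights, x, theta):
    """Linear threshold unit: predict 1 if the summed weight of active features reaches theta."""
    return 1 if sum(w for w, xi in zip(weights, x) if xi) >= theta else 0

def winnow_train(examples, n_features, alpha=2.0, theta=None, max_epochs=100):
    """Train online; weights change only when a prediction is wrong (error-driven)."""
    if theta is None:
        theta = n_features / 2.0  # assumed threshold choice
    weights = [1.0] * n_features
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            if winnow_predict(weights, x, theta) == y:
                continue
            mistakes += 1
            # False negative: promote active features; false positive: demote them.
            factor = alpha if y == 1 else 1.0 / alpha
            weights = [w * factor if xi else w for w, xi in zip(weights, x)]
        if mistakes == 0:  # a full pass without errors: stop
            break
    return weights, theta

# Toy target: the label is 1 when feature 0 or feature 3 is present.
n = 6
examples = [(x, 1 if (x[0] or x[3]) else 0) for x in product((0, 1), repeat=n)]
weights, theta = winnow_train(examples, n)
```

Because updates are multiplicative, the weights of irrelevant features shrink quickly, which is why this family of algorithms suits high-dimensional, sparse inputs.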
Feature Space - Chemical Space
m = (f1, f2, …, fn)
Feature spaces of high dimensionality
[Figure: feature space with axes f1, f2, f3; labelled classes: COX2, DHFR, CDK1, CDK2]
Combinations of Features
Combinations of molecular features to account for synergies.
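One simple way to let a linear model see synergies is to add a combined feature for every pair of features present in a molecule. The sketch below assumes pairwise conjunctions and uses made-up feature names; the slides do not specify how the combinations were actually formed.

```python
from itertools import combinations

def add_pair_features(features):
    """Return the sorted single features followed by one combined feature per unordered pair."""
    singles = sorted(set(features))
    pairs = [f"{a}&{b}" for a, b in combinations(singles, 2)]
    return singles + pairs

# Hypothetical feature names, for illustration only.
molecule_features = ["aromatic_ring", "carboxylic_acid", "amine"]
expanded = add_pair_features(molecule_features)
```

With n features per molecule this adds n(n-1)/2 pair features, so the feature space grows quickly, which fits the high-dimensional setting described earlier.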
Features of Molecules
Based on circular fingerprints
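The idea behind circular fingerprints can be sketched on a hand-written molecular graph: each atom starts with an identifier for itself, and the identifier is repeatedly re-hashed together with those of its neighbours, so that it describes a growing circular environment. This is a simplified toy (element symbols only, no bond orders, no duplicate-environment removal), and `circular_features` is a hypothetical helper, not the fingerprint implementation used in the talk.

```python
import hashlib

def _stable_hash(text):
    """Deterministic 32-bit identifier (Python's built-in hash() is salted per run)."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest()[:8], 16)

def circular_features(atoms, bonds, radius=2):
    """Collect atom-environment identifiers for radii 0..radius.

    atoms: list of element symbols; bonds: list of (i, j) atom-index pairs.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    # Radius 0: each atom is described by its element symbol alone.
    ids = [_stable_hash(symbol) for symbol in atoms]
    features = set(ids)
    for _ in range(radius):
        # Grow every environment by one bond: combine an atom's identifier
        # with the sorted identifiers of its neighbours.
        ids = [
            _stable_hash(f"{ids[i]}|{sorted(ids[j] for j in neighbours[i])}")
            for i in range(len(atoms))
        ]
        features.update(ids)
    return features

# Ethanol heavy atoms, C-C-O (hydrogens omitted).
ethanol = circular_features(["C", "C", "O"], [(0, 1), (1, 2)])
# Dimethyl ether, C-O-C: same atoms, different connectivity.
ether = circular_features(["C", "O", "C"], [(0, 1), (1, 2)])
```

The two molecules share their radius-0 features but differ at larger radii, which is what lets such features distinguish isomers.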
Training Example
Workflow
For predicting protein targets
Protein Target Prediction
• Which protein does a given molecule bind to?
• Virtual Screening
• Multiple endpoint drugs - polypharmacology
• New targets for existing drugs
• Prediction of adverse drug reactions (ADR)
– Computational toxicology
Predicted Protein Targets
• Selection of 233 classes from the MDL Drug Data Report
• ~90,000 molecules
• 15 independent 50%/50% splits into training/test set
Predicted Protein Targets
Cumulative probability of correct prediction within the three top-ranking predictions: 82.1% (±0.5%)
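A figure of this kind can be computed as a top-3 hit rate per test split, summarised across splits. The sketch below uses two tiny hypothetical splits with invented scores; reading the reported spread as a standard deviation over splits is an assumption.

```python
from statistics import mean, stdev

def top_k_hit(scores, true_class, k=3):
    """True if true_class is among the k highest-scoring classes."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return true_class in ranked[:k]

def top_k_accuracy(predictions, k=3):
    """predictions: list of (scores_by_class, true_class) pairs."""
    hits = sum(top_k_hit(scores, true_class, k) for scores, true_class in predictions)
    return hits / len(predictions)

# Hypothetical scores for two tiny test splits (values are illustrative).
split_a = [
    ({"COX2": 0.9, "DHFR": 0.4, "CDK1": 0.2, "CDK2": 0.1}, "COX2"),
    ({"COX2": 0.1, "DHFR": 0.3, "CDK1": 0.8, "CDK2": 0.7}, "COX2"),
]
split_b = [
    ({"COX2": 0.2, "DHFR": 0.9, "CDK1": 0.3, "CDK2": 0.1}, "DHFR"),
    ({"COX2": 0.5, "DHFR": 0.1, "CDK1": 0.6, "CDK2": 0.4}, "CDK2"),
]
accuracies = [top_k_accuracy(split) for split in (split_a, split_b)]
summary = (mean(accuracies), stdev(accuracies))  # (mean, spread across splits)
```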
Computational Toxicology
• Model for target prediction
• Annotated library of toxic molecules
– MDL Toxicity database
– ~150,000 molecules
– Standardisation
– MySQL database
• For each molecule we predict the likely target
• Correlations between predicted protein targets and known toxicity codes
– Canonical (23)
– Full (490)
Toxicological Relationships Outline (1)
• Protein target prediction allows us to link (predictively) 150,000 toxic organic molecules to 233 specific protein targets
• Each target is treated as a single protein, although it may be a set of related proteins
• Toxicological databases link (experimentally) these 150,000 molecules to 23 toxicity classes
• Combining these two sources of data matches the 233 proteins with the 23 toxicity classes
Toxicological Relationships Outline (2)
• For each protein target, we have a profile of association with the 23 toxicity classes
• Proteins with similar profiles are clustered together
• We demonstrate that these clusters of proteins can be physiologically meaningful.
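Grouping proteins by the similarity of their toxicity profiles can be sketched as follows. The example assumes Pearson correlation as the similarity measure and a simple single-linkage grouping above a threshold; the protein names, the four-class profiles and the threshold of 0.9 are hypothetical, and the clustering method actually used is not stated in the slides.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation of two equal-length profiles (undefined for constant profiles)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)

def cluster_by_correlation(profiles, threshold=0.9):
    """Merge proteins into one group whenever a pair correlates at or above threshold."""
    names = list(profiles)
    cluster_of = {name: {name} for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            linked = pearson(profiles[a], profiles[b]) >= threshold
            if linked and cluster_of[a] is not cluster_of[b]:
                merged = cluster_of[a] | cluster_of[b]
                for member in merged:
                    cluster_of[member] = merged
    unique = {id(c): c for c in cluster_of.values()}
    return sorted(sorted(c) for c in unique.values())

# Hypothetical association counts over four toxicity classes.
profiles = {
    "protein_A": [9, 1, 0, 2],
    "protein_B": [8, 2, 1, 2],
    "protein_C": [0, 7, 9, 1],
    "protein_D": [1, 6, 8, 0],
}
clusters = cluster_by_correlation(profiles)
```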
Predictions Obtained
L70 - Changes in liver weight < Liver
Y07 - Hepatic microsomal oxidase < Enzyme inhibition
M30 - Other changes < Kidney, Ureter, and Bladder
L30 - Other changes < Liver
Target Prediction
Highest ranking class IS predicted protein target
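Protein code j
Toxicity codes i
Result matrix $R = (r_{ij})$; $r_{ij}$ is incremented for each prediction. Rows are toxicity codes, columns are protein targets.
$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots \\ r_{21} & \cdots & \\ \vdots & & \end{pmatrix}$$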
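Filling the result matrix amounts to counting, for every molecule, each pairing of its toxicity codes with its top-ranked predicted target. In the sketch below the pairing of codes with targets is hypothetical, and `build_result_matrix` is an illustrative name rather than anything from the original work.

```python
from collections import defaultdict

def build_result_matrix(records):
    """records: iterable of (predicted_protein, toxicity_codes), one per molecule.

    Returns nested counts r[toxicity_code][protein].
    """
    r = defaultdict(lambda: defaultdict(int))
    for protein, tox_codes in records:
        for code in tox_codes:
            r[code][protein] += 1  # increment r_ij for this prediction
    return r

# Hypothetical molecules: top-ranked predicted target plus annotated codes.
records = [
    ("Aromatase Inhibitor", ["L70", "Y07"]),
    ("Aromatase Inhibitor", ["L70"]),
    ("DHFR Inhibitor", ["M30", "L30"]),
]
R = build_result_matrix(records)
```

A sparse, dictionary-based layout is used here because most toxicity-code and protein pairs would never co-occur; a dense array would work equally well at 23 by 233.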
Toxicity Annotations
CANONICAL TOXICITY CODES (23)
FULL TOXICITY CODES (490)
Y41 : Glycolytic < Metabolism (intermediary) < Biochemical
Proteins by Toxicity
• Cardiac - G
1. Kainic acid receptor
2. Adrenergic alpha2
3. Phosphodiesterase III
4. cAMP Phosphodiesterase
5. O6-Alkylguanine-DNA alkyltransferase
• Vascular - H
1. Angiotensin II AT2
2. Dopamine (D2)
3. Bombesin
4. Adrenergic alpha2
5. 5-HT antagonist
Top 5 Proteins by Toxicity
68 distinct proteins for 23 toxicity classes, i.e., 3.0 proteins per canonical toxicity code.
• Lanosterol 14alpha-Methyl Demethylase: 5
• Glucose-6-phosphate Translocase: 4
• IL-6: 4
• Benzodiazepine Antagonist: 3
• Kainic Acid Receptor: 3
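The "distinct proteins per toxicity class" figure can be reproduced in outline by taking the top-ranked proteins in each class and counting the distinct names. The sketch below uses invented counts and a top 2 instead of a top 5 to keep the example small.

```python
def top_proteins(counts_by_class, n=5):
    """For each toxicity class, return the n proteins with the highest counts."""
    return {
        tox: sorted(counts, key=counts.get, reverse=True)[:n]
        for tox, counts in counts_by_class.items()
    }

# Hypothetical counts for two toxicity classes.
counts_by_class = {
    "Cardiac": {"Kainic acid receptor": 12, "Adrenergic alpha2": 9, "Bombesin": 1},
    "Vascular": {"Angiotensin II AT2": 10, "Adrenergic alpha2": 7, "Bombesin": 8},
}
top = top_proteins(counts_by_class, n=2)
distinct = {protein for proteins in top.values() for protein in proteins}
proteins_per_class = len(distinct) / len(top)
```

Applied to the numbers on the slide, 68 distinct proteins over 23 classes gives 68 / 23 ≈ 3.0.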
Proteins and their connectivities
Clustering of Toxicity Classes
Clustering of toxicity classes: based on predicted protein associations from the result matrix
Correlation Between Toxicity Classes
Correlations between toxicity classes: 23 by 23 correlation matrix
Correlation Between Proteins
Correlations between proteins: 233 by 233 correlation matrix
Cluster 1 (proteins 6-11)
Correlation Between Proteins
We will look at two specific clusters, which are called Cluster 1 and Cluster 4.
Cluster 1
• Carbonic Anhydrase Inhibitor
• Estrogen Receptor Modulator
• LHRH Agonist
• Aromatase Inhibitor
• Cysteine Protease Inhibitor
• DHFR Inhibitor
• Cluster 1 (proteins 6-11)
• Within-cluster correlation (without auto-correlation): r = 0.95
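One reading of "within-cluster correlation (without auto-correlation)" is the mean pairwise correlation between the profiles of the cluster members, leaving out each profile's trivial correlation of 1 with itself. The sketch below follows that assumption with three hypothetical profiles; it is not a reconstruction of the reported r = 0.95.

```python
from itertools import combinations
from math import sqrt

def pearson(u, v):
    """Pearson correlation of two equal-length profiles."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)

def within_cluster_correlation(profiles):
    """Mean Pearson correlation over all distinct pairs (self-pairs excluded)."""
    pairs = list(combinations(profiles, 2))
    return sum(pearson(u, v) for u, v in pairs) / len(pairs)

# Hypothetical toxicity profiles of three proteins in one cluster.
cluster = [
    [5, 1, 0, 3],
    [10, 2, 0, 6],
    [4, 1, 1, 3],
]
r_mean = within_cluster_correlation(cluster)
```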
Proteins involved in breast cancer
Breast Cancer Proteins
Literature-based links between these proteins
[Figure: CA, ER, LHRH, Aromatase, Cysteine Prot. and DHFR, linked by the literature below]
• Tissue-specific transcripts of human steroid sulfatase are under control of estrogen signaling pathways in breast carcinoma, Zaichuk 2007
• “aim of this study was to characterize carbonic anhydrase II (CA2), as novel estrogen responsive gene”, Caldarelli 2005
• The Transactivation Domain AF-2 but Not the DNA-Binding Domain of the Estrogen Receptor Is Required to Inhibit Differentiation of Avian Erythroid Progenitors, Marieke von Lindern 1998
– This led to premature expression of CAII, a possible explanation for the toxic effects of overexpressed ER.
• Cathepsin L Gene Expression and Promoter Activation in Rodent Granulosa Cells, Sriraman 2004
– showed that cathepsin L expression in granulosa cells of small, growing follicles increased in periovulatory follicles after human chorionic gonadotropin stimulation.
• Controversies of adjuvant endocrine treatment for breast cancer and recommendations of the 2007 St Gallen conference, Rabaglio 2007
• Merchenthaler 2005
• Summary of aromatase inhibitor trials: The past and future, Goss 2007
• Regulation of collagenolytic cysteine protease synthesis by estrogen in osteoclasts, Furuyama 2000
• Antimalarials?
• Induction by estrogens of methotrexate resistance in MCF-7 breast cancer cells, Thibodeau 1998
and now Cluster 4 …
Cluster 4
This cluster links treatment of stomach ulcers to loss of bone mass!
Proton Pump Inhibitors etc.
Correlation above 0.99
Correlation above 0.98
• Proton pump inhibitors used to limit production of gastric acid
• PTH is important in the development/regulation of osteoclasts (cells for bone resorption)
• PTH controls levels of Ca2+ in the blood; increased PTH levels are associated with age-related decrease of bone mass
Recent clinical studies showed increased risk of hip fractures resulting from long-term use of proton pump inhibitors. Hence link between PTH and proton pump inhibitors.
PTH = Parathyroid hormone (84 aa mini-protein)
Conclusions
• Successful adaptation of an algorithm not previously used in this area
• Benchmark confirms usability, speed & memory requirements
• Can find correct protein targets for molecules
• Hence link proteins together via ligand-binding properties and associations of ligands with toxicities
• Identify toxicological relationships between proteins
Acknowledgements
• Jos Tissen
• Bernd van Buuren
• Silvia Miret
» Unilever
» Cambridge
• Andreas Bender
• Hamse Mussa
• Jeremy Jenkins
Funding - Unilever