www.guidetopharmacology.org
IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb): Concise mapping for the triples of chemistry, data, and protein target classifications
Christopher Southan, Adam J. Pawson, Joanna L. Sharman, Elena Faccenda, Simon Harding, Jamie Davis
IUPHAR/BPS Guide to PHARMACOLOGY, Centre for Integrative Physiology, University of Edinburgh
ACS, Wed, Mar 16, CINF 140: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science, 1:30 PM - 4:45 PM, Room 25B, 1:35 PM - 2:00 PM
http://www.slideshare.net/cdsouthan/southan-nciuphar-acssandiego-59444512
Abstract (will be skipped for presentation)
The International Union of Basic and Clinical Pharmacology Committee on Receptor
Nomenclature and Drug Classification (NC-IUPHAR) provides authoritative reports on
G protein-coupled receptors (GPCRs), nuclear hormone receptors and ion channels
as pharmacology-based classifications. Although these recommendations have appeared as
Pharmacological Reviews papers (i.e. unstructured) since the 1990s, they were
already underpinning the protein tables in GtoPdb's predecessor, IUPHAR-DB, by
2003. By 2012 this hierarchical data structure had expanded into the GtoPdb schema,
covering essentially all target classes for pharmacology, drug discovery and chemical
biology. As of August 2015 the expert-curated relationship capture from the literature
covers 1505 target-to-ligand mappings, of which 1228 human protein IDs have
quantitative interaction data recorded against 5860 chemical structures. The
motivation, evolutionary trajectory, the need for community engagement to fill data
gaps and future directions of the resource will be outlined. Descriptions will cover the
challenges of cross-referencing alternative gene/protein hierarchies, each of which has
different navigational utilities and linkages to chemistry in GtoPdb. These now extend
beyond receptors to enzymes and include NC-IUPHAR, HGNC, UniProt, Ensembl,
InterPro, Gene Ontology and E.C. numbers. The adaptation of our classifications to
encompass a new immunopharmacology project will also be discussed.
Outline
• Introduction to NC-IUPHAR
• Evolution of IUPHAR-DB to GtoPdb
• Relationship statistics
• Target hierarchy and navigation
• Triple challenges with taxol
• Protein mapping and data gaps
• Introducing Guide to Immunopharmacology
• Conclusions and plans
International Union of Basic and Clinical Pharmacology Committee on Receptor Nomenclature and Drug Classification (NC-IUPHAR)
• Section within IUPHAR umbrella organisation since 1987
• Issuing guidelines for the nomenclature and classification of human biological targets of current and future medicines
• Facilitating the interface between the Human Genome Project entities as functional units and potential drug targets
• Designating pharmacologically important polymorphisms
• Developing an authoritative, freely available, global online resource: the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb)
• Establishment of target-specific subcommittees (650 members)
• Associated with over 90 PubMed entries since 1995
• Co-applicant on UK Wellcome Trust grants for the Edinburgh University-based GtoPdb and GtoImmPdb projects
NC-IUPHAR 2015 output
NC-IUPHAR – HUGO Gene Nomenclature Committee (HGNC) collaboration
IUPHAR-DB launched in 2009: unique model of committee-underpinned annotation
2012 to 2016: evolution of GtoPdb with major expansion
(Chart series: human targets, ligands)
GtoPdb relationship statistics (Jan 2016)
Our basic “triple”:
• 6,149 ligands
• 1,786 Swiss-Prot IDs
• 14,117 affinity values
• 15,000 PubMed IDs
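As an illustration only, the basic triple could be modelled as one record per curated measurement, with the four headline counts derived from a collection of such records. The class name, field names and placeholder values below are assumptions made for this sketch; they are not the GtoPdb schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionTriple:
    """One curated ligand -> affinity -> target relationship (hypothetical model)."""
    ligand_id: str         # placeholder for a ligand identifier
    target_accession: str  # placeholder for a Swiss-Prot accession
    affinity_type: str     # e.g. "pKi" or "pIC50"
    affinity_value: float
    pubmed_id: str         # provenance of the measurement


def summarise(triples):
    """Count distinct ligands, targets and papers, plus total affinity values."""
    triples = list(triples)
    return {
        "ligands": len({t.ligand_id for t in triples}),
        "targets": len({t.target_accession for t in triples}),
        "affinity_values": len(triples),
        "pubmed_ids": len({t.pubmed_id for t in triples}),
    }


# Placeholder records, not real GtoPdb data.
example = [
    InteractionTriple("LIGAND_A", "TARGET_1", "pKi", 8.2, "PMID_X"),
    InteractionTriple("LIGAND_A", "TARGET_2", "pIC50", 6.5, "PMID_X"),
    InteractionTriple("LIGAND_B", "TARGET_1", "pKi", 7.1, "PMID_Y"),
]
print(summarise(example))
```

Keeping the PubMed ID on every record means each affinity value stays traceable to its source paper, which is the point of the curated triple.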
Top Level NC-IUPHAR target classification
• NC-IUPHAR underpinned but largely HGNC-concordant
• Defers to target-class nomenclature outside NC-IUPHAR domains (e.g. MEROPS for proteases, ESTHER for α/β-hydrolases)
• Includes pharmacologist-preferred NC-IUPHAR naming (e.g. Calcium-activated potassium channel KCa1.1 = KCNMA1)
• 65 “Quaternary Structure Subunit” annotations
• NC-IUPHAR use of lower-case and symbols can be problematic
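The naming points above can be sketched as a small synonym lookup. Only the KCa1.1 = KCNMA1 pair is taken from the slide; the table, function names and the case-folding rule are assumptions for illustration, and case-folding would wrongly merge any two names that differ only in capitalisation.

```python
# Hypothetical synonym table; only this one pair comes from the slide.
NC_IUPHAR_TO_HGNC = {"KCa1.1": "KCNMA1"}


def lookup_key(name: str) -> str:
    """Strip whitespace and fold case so 'kca1.1' and 'KCa1.1' share a key."""
    return name.strip().casefold()


# Index both the NC-IUPHAR names and the HGNC symbols themselves.
_INDEX = {lookup_key(k): v for k, v in NC_IUPHAR_TO_HGNC.items()}
_INDEX.update({lookup_key(v): v for v in NC_IUPHAR_TO_HGNC.values()})


def to_hgnc(name: str):
    """Return the HGNC symbol for a name, or None if it is not in the table."""
    return _INDEX.get(lookup_key(name))


print(to_hgnc("kca1.1"))
print(to_hgnc("not-a-target"))
```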
Navigation: ligands > primary target
Navigation: paper > chemistry > target > affinity data
Trouble with triples (I): so which taxol drug structure is it?
• Probably not the virtual D52
• 12 CIDs include CAS 33069-62-4
Trouble with triples (II): so which is the molecular target?
Trouble with triples (III): so which structure > activity > target?
• 22 CIDs share 4842 PubChem BioAssay results
• 89% are aligned against CID 36314
• 12 record actives
• None of the mixtures have results
GtoPdb: parsimonious annotation
• We curate selectively and with high stringency
• This results in minimal rather than maximal triples coverage
More trouble with triples: which targets are real and which IDs cross-map 1:1?
UniProt, human = 151,569
UniProt, human, Swiss-Prot = 20,198
+ neXtProt = 20,040
+ HGNC = 19,836
+ Ensembl = 18,933
+ CCDS = 18,286
+ Entrez Gene ID = 18,245
+ RefSeq = 18,244
+ Evidence at protein level = 14,065
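One way to ask the 1:1 question of any pair of identifier sets is sketched below. The function name and the placeholder IDs are assumptions for illustration; this is not how the counts above were produced.

```python
from collections import defaultdict


def one_to_one(pairs):
    """Return the (a, b) pairs where a maps only to b and b maps only to a."""
    a_to_b, b_to_a = defaultdict(set), defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    return sorted(
        (a, next(iter(bs)))
        for a, bs in a_to_b.items()
        if len(bs) == 1 and len(b_to_a[next(iter(bs))]) == 1
    )


# Placeholder protein (P_*) and gene (G_*) IDs, not real accessions.
pairs = [
    ("P_1", "G_1"),  # clean 1:1
    ("P_2", "G_2"),  # P_2 and P_3 share a gene: many:1
    ("P_3", "G_2"),
    ("P_4", "G_3"),  # P_4 maps to two genes: 1:many
    ("P_4", "G_4"),
]
print(one_to_one(pairs))  # [('P_1', 'G_1')]
```

Applying such a check at each added cross-reference shrinks the surviving set, which matches the way the counts fall with each filter in the list above.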
Even more trouble with triples: prodrugs and data gaps
• Data gaps could be experimentally filled with established assays
• For example, some early ACE inhibitors have no purified human protein results (only rat, rabbit or hamster)
• Prodrugs may have no recorded activity – so cannot be target mapped
• On a good day we can get Ki and IC50 from the same paper
• How do we convince/entice folk to fill the gaps?
Utility of different target hierarchies
• NC-IUPHAR <> Swiss-Prot <> HGNC
• HGNC families and stems
• InterPro (includes Pfam)
• Gene Ontology (GO)
• EC numbers for enzymes
• Protein Ontology
• ChEMBL groupings
• Pathways (systems pharmacology)
• UniProt key words and cross-references
• Terminology for oligomeric complexes and splice variants is problematic
Introducing the Guide to Immunopharmacology
• Wellcome Trust funded project initiated 4Q15
• Abbreviation will be GtoImmPdb.
• Homepage portal providing an immunological perspective onto the database.
• Will use same schema as GtoPdb but extended to integrate GtoImmPdb data.
• Search via biological processes and target annotations to terms in the Gene Ontology (GO)
• Mapping to a simplified specific process list
• Provide search options via the Cell Ontology.
Intersects between GO immunology, GO inflammation and GtoPdb targets with quantitative ligand interactions
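The intersects can be sketched as plain set operations over target identifiers. The function name, the keys of the returned dictionary and the placeholder IDs are assumptions for illustration; no real counts are implied.

```python
def intersect_counts(go_immunology, go_inflammation, gtopdb_quantitative):
    """Count overlaps between two GO-derived target sets and GtoPdb targets."""
    a, b, c = map(set, (go_immunology, go_inflammation, gtopdb_quantitative))
    return {
        "immunology_and_gtopdb": len(a & c),
        "inflammation_and_gtopdb": len(b & c),
        "all_three": len(a & b & c),
        "gtopdb_only": len(c - a - b),
    }


# Placeholder target IDs, not real annotations.
immunology = {"T1", "T2", "T3"}
inflammation = {"T2", "T3", "T4"}
gtopdb = {"T3", "T4", "T5"}
print(intersect_counts(immunology, inflammation, gtopdb))
```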
Conclusions and plans
• Resolving triples across the bioactivity big data landscape is difficult
• Our approach is concise “small data” relationship mapping
• NC-IUPHAR > new nomenclature engagements
• Consolidate GtoPdb (< 2000 stringent target mappings)
• Instantiate GtoImmPdb
• RDF-ise GtoPdb for Open PHACTS
• PubChem BioAssay submission (target class splits)
• PubChem SID splits (e.g. approved drugs)
• Fill in legacy data gaps
• Expand (flexible) rules and relationship handling, e.g. protein interaction inhibitors, hybrid therapeutics, ligands with unknown molecular mechanism
• Work on chemistry mapping retrieval
References, acknowledgments and questions
http://www.ncbi.nlm.nih.gov/pubmed/24234439
• Please visit us at http://www.guidetopharmacology.org/
• Curation rules are outlined in our FAQ, the 2014 and 2016 NAR papers and blogposts
• Funders are acknowledged in the title slide
• To retrieve NC-IUPHAR's 95 Pharmacological Reviews nomenclature publications in PubMed: (International[Title] AND Union[Title] AND Pharmacology[Title] AND "Pharmacol Rev"[Journal])
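The PubMed query can also be run programmatically. The sketch below only builds a request URL for the NCBI E-utilities esearch endpoint and makes no network call; the function name and the retmax value are assumptions for illustration.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string as given on the slide.
QUERY = (
    '(International[Title] AND Union[Title] AND '
    'Pharmacology[Title] AND "Pharmacol Rev"[Journal])'
)


def esearch_url(term, retmax=200):
    """Build an esearch URL that would return matching PubMed IDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


print(esearch_url(QUERY))
```

Fetching the printed URL with any HTTP client should return the matching PubMed IDs; the number returned may differ from 95 as new papers are indexed.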