OpenTox Virtual Seminar presentation of 2010-04-01.
The CDK, Bioclipse, and RDF
Egon Willighagen <http://chem-bla-ics.blogspot.com/>
Bioclipse & Proteochemometric Group (Prof. Wikberg)
Department of Pharmaceutical Biosciences
Uppsala University
2010-04-01
Problem
Building Blocks
Solution
Application
Conclusion
Who am I?
http://www.citeulike.org/user/egonw/tag/papers
http://chem-bla-ics.blogspot.com
http://egonw.github.com
waveto:
2010-04-01 Bioclipse & Proteochemometric Group - 2 - Egon Willighagen | chem-bla-ics.blogspot.com
The Problem...
Solanum lycopersicum...
We model our world, but ...
Life is not uni- or bivariate
Knowledge is not either
But we think of it as such
Information Loss!
Names...
benzene
3-[4-[3-(1-methyl-7-oxo-3-propyl-4H-pyrazolo[4,3-d]pyrimidin-5-yl)-4-propoxyphenyl]sulfonylpiperazin-1-yl]propanoic acid
InChI=1S/C25H34N6O6S/c1-4-6-19-22-23(29(3)28-19)25(34)27-24(26-22)18-16-17(7-8-20(18)37-15-5-2)38(35,36)31-13-11-30(12-14-31)10-9-21(32)33/h7-8,16H,4-6,9-15H2,1-3H3,(H,32,33)(H,26,27,34)
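Of the names on this slide, the InChI is the one built from explicit, slash-separated layers. As a minimal sketch (not part of the original talk, and assuming a single-component standard InChI), those layers and the atom counts of the formula layer can be pulled apart with the Python standard library alone. The helper names `layers` and `atom_counts` are hypothetical.

```python
import re
from typing import Dict

# The InChI shown on the slide, split over several literals for readability.
INCHI = (
    "InChI=1S/C25H34N6O6S/"
    "c1-4-6-19-22-23(29(3)28-19)25(34)27-24(26-22)18-16-17(7-8-20(18)37-15-5-2)"
    "38(35,36)31-13-11-30(12-14-31)10-9-21(32)33/"
    "h7-8,16H,4-6,9-15H2,1-3H3,(H,32,33)(H,26,27,34)"
)


def layers(inchi: str) -> Dict[str, str]:
    """Split an InChI into version, formula, and single-letter-prefixed layers.

    Assumption: one component only, so the formula layer has no '.' separators.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    version, formula, *rest = inchi[len("InChI="):].split("/")
    result = {"version": version, "formula": formula}
    for layer in rest:
        # Each remaining layer starts with a one-letter tag, e.g. 'c' or 'h'.
        result[layer[0]] = layer[1:]
    return result


def atom_counts(formula: str) -> Dict[str, int]:
    """Count atoms per element in a simple formula such as 'C25H34N6O6S'."""
    return {
        element: int(count or 1)
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    }


if __name__ == "__main__":
    parts = layers(INCHI)
    print(parts["formula"], atom_counts(parts["formula"]))
```

The trivial name and the systematic name carry the same identity, but a program needs a dedicated parser to get anything out of them; here a string split is enough to recover the composition.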
... Molecular reality...
1 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 (10^60)
... and Numbers
Knowledge Representation: Information Loss
Data Analysis
Proteochemometrics
Main Theme
How do we navigate dimensionality space?
How do we include prior knowledge?
While minimizing information loss?
With optimal knowledge extraction?
And maximizing interpretability?
Without ending up in random correlation?
The Setting...
1998: Organic chemistry... beautiful science!
But ... why, how, what, ...
PJJA Buijnsters et al., Eur.J.Org.Chem, 2002, 1397–1406
Knowledge Representation...
What are the organic normal conditions?
The Problem: Reproducibility...
Where reproducibility is severely hampered:
  recalculate basic atom and bond properties
  access to QSAR/QSPR data
  well-defined algorithms
  publications destroy information
Solutions...
Openness
  license that allows modification and redistribution
  hiding behind public domain is not helpful
Semantic Web
  be explicit in what you mean
  both in facts and in algorithms
Reproducibility needs ODOSOS
Open Data
  No Intellectual Monopoly
Open Source
  algorithms are complex
  implementations even more
  strong interaction with representation
Open Standards
  Semantic Web
  formats
  unique identifiers
http://en.wikipedia.org/wiki/Glyn_Moody
Jmol
Started in 1997 by Dan Gezelter (Notre Dame)
Leaders: Bradley Smith, me, Miguel Howard, Bob Hanson
E.L. Willighagen, M. Howard, Nature Precedings, 2005
http://www.jmol.org/
The Chemistry Development Kit
A Family of Projects
  CDK-Taverna (chemoinformatics workflows)
  JChemPaint (semantic 2D editor)
  ChemoJava (GPL-ed extension)
Goals
  library of cheminformatics algorithms
  educational
Usage
  CDK 2003: 75+ times cited in literature
  Bioclipse, KNIME, Jumbo (CML), AMBIT, ...
C. Steinbeck et al., J.Chem.Inf.Comput.Sci, 2003
C. Steinbeck et al., Curr.Pharm.Design, 2006
CDK: an Open Project
Features
  open mailing list and bug tracker
  open source repository
  release soon, release often
Offer Review
  senior developers review patches
Bioclipse
O. Spjuth et al., BMC Bioinformatics 2007, 8:59
Integration
Services
  databases: PubChem
  web services
  Google Spreadsheets
  MyExperiment.org: Bioclipse Scripting Language
  Twitter, ...
  journals, ...
Techniques
  SOAP, REST, XMPP, ...
  Resource Description Framework
  dedicated APIs
Resource Description Framework
Facts as Triples
  subject
  predicate (relation)
  object
Examples
  wp:Benzene chem:hasSMILES "c1ccccc1"
  wp:Benzene owl:sameAs chemspider:123
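The subject, predicate, object structure of these two example facts can be sketched in plain Python. This is an illustration only, not how Bioclipse or any RDF library stores triples: the `Triple` class and `by_subject` helper are hypothetical names, and the prefixed names (`wp:`, `chem:`, `chemspider:`) are kept as opaque strings rather than expanded to full URIs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style fact: subject, predicate (relation), object."""
    subject: str
    predicate: str
    obj: str  # named 'obj' to avoid confusion with Python's built-in 'object'


# The two example facts from the slide. The SMILES is a literal, so it keeps
# its quotes; the other terms are prefixed names.
FACTS = [
    Triple("wp:Benzene", "chem:hasSMILES", '"c1ccccc1"'),
    Triple("wp:Benzene", "owl:sameAs", "chemspider:123"),
]


def by_subject(facts: Iterable[Triple]) -> Dict[str, Dict[str, List[str]]]:
    """Group facts as {subject: {predicate: [objects]}}."""
    index: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for fact in facts:
        index[fact.subject][fact.predicate].append(fact.obj)
    return index


if __name__ == "__main__":
    for predicate, objects in by_subject(FACTS)["wp:Benzene"].items():
        print(predicate, objects)
```

Because both facts share the subject `wp:Benzene`, grouping by subject collects everything known about that one resource, which is the property that lets separately published facts be merged.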
OpenMolecules RDF: dereferenceable URI
http://rdf.openmolecules.net/
OpenMolecules RDF: linked data
http://rdf.openmolecules.net/
Bioclipse-RDF
local RDF storage
read/write RDF/XML, N3
run SPARQL queries (local and remote)
extract RDF from XHTML/RDFa
Thanks to Jena and Pellet.
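To give a feel for what running a SPARQL query against a local store involves, here is a minimal sketch of matching a basic graph pattern in plain Python. It is not the Bioclipse-RDF, Jena, or Pellet API, and it covers only a tiny part of SPARQL: the `solve` function, the in-memory `STORE` list, and the example data (borrowed from the triples slide) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Term = str
Triple = Tuple[Term, Term, Term]
Binding = Dict[str, str]

# A toy local store holding the two example facts from the triples slide.
STORE: List[Triple] = [
    ("wp:Benzene", "chem:hasSMILES", '"c1ccccc1"'),
    ("wp:Benzene", "owl:sameAs", "chemspider:123"),
]

# Roughly the pattern part of:
#   SELECT ?smiles ?other
#   WHERE { ?mol chem:hasSMILES ?smiles . ?mol owl:sameAs ?other }
PATTERNS: List[Triple] = [
    ("?mol", "chem:hasSMILES", "?smiles"),
    ("?mol", "owl:sameAs", "?other"),
]


def solve(patterns: List[Triple], store: List[Triple]) -> List[Binding]:
    """Return every variable binding that satisfies all patterns at once."""
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        extended: List[Binding] = []
        for binding in solutions:
            for triple in store:
                candidate = dict(binding)
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # A variable: bind it, or check the earlier binding.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        # A constant that does not match this triple.
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions


if __name__ == "__main__":
    for row in solve(PATTERNS, STORE):
        print(row["?smiles"], row["?other"])
```

The shared variable `?mol` is what joins the two patterns: only resources that have both a SMILES and a `owl:sameAs` link survive. A remote query sends the same kind of pattern to a SPARQL endpoint instead of a local list.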
Names 2 Graphs 2 Numbers...
ChEMBL / QSAR
RDF graph visualization
OWL for Descriptors
Used for model and data.
MyExperiment: Bioclipse Scripting Language
What does this bring us?
Platform to integrate the RDF with the computation world
Bioclipse as single point of access
Scripting, sharing of scripts with MyExperiment.org
Bridge the nominal with the numerical world