KUPKB: Sharing, Connecting and Exposing Kidney and Urinary
Knowledge using RDF and OWL
Julie Klein & Simon Jupp
Bio-health informatics group
University of Manchester
www.kupkb.org
The problem domain
Thousands of studies have been conducted by the kidney research community
• On different species (human, mouse)
• On different materials (urine, tissue)
• On different biological levels (gene, protein, cell)
Large diversity: integration of the knowledge is complex
Where does the data go?
Research papers
Bespoke kidney laboratory databases
Generalist databases
Scattered, hidden in figures, coming in different formats
Most of the data is lost!
The Kidney and Urinary Pathway Knowledge Base:
SHARE AND CONNECT
The iKUP Browser:
EXPOSE
www.kupkb.org
Structure
KUP Ontology(schema)
Experimental data
KUP Knowledge Base
RDF triple store
iKUP Browser
Populous
RightField
Ontologies provide the schema
What has been observed, where and when?
Disease ontology
Animal model
Gene Ontology
Experimental factors
Cell type ontology
We needed to connect these reference ontologies.
Creation of a specialized Kidney and Urinary Pathway Ontology (KUPO)
Mouse anatomy ontology
http://www.e-lico.org/public/kupo/
Ontologies by stealth
Populous generates simple Excel based templates
The domain experts are the experts, so get them to build it
Anatomy (MAO)
Biological processes (GO)
Cells (CTO)
Spreadsheet
OPPL scripts
Ontology
http://www.e-lico.eu/populous.html
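As a rough illustration of the spreadsheet-to-ontology step, the sketch below turns the rows of a filled-in template into Turtle-style subclass statements. The column names, the `kupo:` prefix and the `kupo:part_of` property are hypothetical placeholders, and the real OPPL patterns that Populous applies are not reproduced here.

```python
import csv
import io

# Hypothetical template content: one cell type per row, with the anatomy
# term it belongs to. Column names are assumptions for illustration only.
TEMPLATE = """cell,located_in
podocyte,glomerulus
mesangial_cell,glomerulus
"""

def rows_to_axioms(text):
    """Turn each spreadsheet row into one Turtle-style statement."""
    axioms = []
    for row in csv.DictReader(io.StringIO(text)):
        cell = row["cell"].strip()
        anatomy = row["located_in"].strip()
        # "kupo:" and "kupo:part_of" are placeholder names, not real KUPO IRIs.
        axioms.append(
            f"kupo:{cell} rdfs:subClassOf "
            f"[ a owl:Restriction ; owl:onProperty kupo:part_of ; "
            f"owl:someValuesFrom kupo:{anatomy} ] ."
        )
    return axioms

axioms = rows_to_axioms(TEMPLATE)
for axiom in axioms:
    print(axiom)
```

The point of the pattern is that the domain expert only ever fills in the two columns; the logical form of the axiom is fixed once by the ontology engineer.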
Describing/Collecting experimental data
Gathering good meta-data AND data, again by stealth, using RightField
Content of the meta-data cells is constrained to the relevant set of KUPO terms
http://www.sysmo-db.org/rightfield
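The effect of constraining meta-data cells can be sketched as a simple check of each cell against an allowed term list. RightField applies the constraint inside the spreadsheet itself; the code below only mimics the outcome, and the column names and term lists are made up for illustration rather than taken from KUPO.

```python
# Hypothetical allowed terms per meta-data column; not the real KUPO term sets.
ALLOWED_TERMS = {
    "anatomy": {"kidney", "glomerulus", "urine"},
    "species": {"human", "mouse"},
}

def validate_row(row):
    """Return the (column, value) pairs whose value is not an allowed term."""
    errors = []
    for column, value in row.items():
        allowed = ALLOWED_TERMS.get(column)
        # Columns without a term list (e.g. free-text notes) are not checked.
        if allowed is not None and value not in allowed:
            errors.append((column, value))
    return errors

good = {"species": "mouse", "anatomy": "glomerulus"}
bad = {"species": "mouse", "anatomy": "kidny"}  # misspelled on purpose
print(validate_row(good))
print(validate_row(bad))
```

Catching a misspelled or free-text value at entry time is what makes the later integration step cheap: every annotation already points at a known ontology term.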
Mashing it all together
Kidney and Urinary Pathway Ontology: ~1800 classes (~40,000 after imports closure)
Experimental data: 220 KUP experiments integrated
OWL reasoning
KUP Knowledge Base
RDF triple store
~35M triples
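A toy version of the idea, assuming nothing about the real store: experimental annotations and ontology subclass links sit side by side as triples, and a query for a general class also finds data annotated with its subclasses. All identifiers below are invented, and the transitive subclass walk is only a stand-in for full OWL reasoning.

```python
# Toy triples; every identifier here is made up for illustration.
triples = {
    ("podocyte", "subClassOf", "glomerular_cell"),
    ("glomerular_cell", "subClassOf", "kidney_cell"),
    ("exp1", "observedIn", "podocyte"),
    ("exp1", "hasGene", "geneX"),
}

def superclasses(term, store):
    """All transitive superclasses of term, following subClassOf links."""
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        for s, p, o in store:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found

def experiments_in(cell_class, store):
    """Experiments observed in cell_class or in any of its subclasses."""
    hits = set()
    for s, p, o in store:
        if p == "observedIn" and (
            o == cell_class or cell_class in superclasses(o, store)
        ):
            hits.add(s)
    return hits

print(experiments_in("kidney_cell", triples))
```

The experiment was annotated only with the specific cell type, yet it is returned for the general class because the ontology supplies the missing links.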
SPARQLing results
We can now ask queries that span several databases
We can exploit OWL semantics for intelligent answers
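To give a feel for such a query, the sketch below builds a SPARQL string that joins experimental annotations with the ontology hierarchy through a SPARQL 1.1 property path. The endpoint URL, the `ex:` vocabulary and the class name are placeholders, not the real KUPKB endpoint or schema.

```python
from urllib.parse import urlencode

# Placeholder endpoint and vocabulary: neither is the real KUPKB one.
ENDPOINT = "http://example.org/sparql"

def build_query(gene_symbol):
    """SPARQL asking where a gene was observed, including subclass matches."""
    return f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/kup#>
SELECT ?experiment ?location WHERE {{
  ?experiment ex:hasGene ?gene ;
              ex:observedIn ?location .
  ?gene rdfs:label "{gene_symbol}" .
  ?location rdfs:subClassOf* ex:KidneyPart .
}}
"""

query = build_query("CALR")
# A GET request URL in the usual SPARQL protocol form (query as a parameter).
url = ENDPOINT + "?" + urlencode({"query": query})
print(query)
```

The `rdfs:subClassOf*` path is what lets one query cover every location below a general class without listing them by hand.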
Make it all RDF/OWL and expose a SPARQL endpoint…
…then we are done, right?
BUT!
An easy-to-use application… this is what biologists really want
Doing some biology
1. A biological question
Accepted for publication in the FASEB J!
Can calreticulin be associated with the development of human kidney disease?
2. No answer with classical tools
Search in Pubmed and Google does not return any relevant result!
3. Querying the KUPKB
4. Validation in the wet-lab
KUPKB in silico result confirmed.
5. Publish an innovative result
Reusing and Building
Ontologies provide the schema
Experimental data
OWL reasoning
KUP Knowledge Base
RDF triple store
iKUP Browser
Kidney and Urinary Pathway Ontology: tool to facilitate building of ontologies
Annotations, homogenization: tool to facilitate data annotation
What next
User study and evaluation experiments ongoing with Manchester Web Ergonomics Lab
Application to other biological domains
Change the domain model in the ontologies and we can construct any organ knowledge base in this way
There is already interest in gut, liver, heart and metabolic diseases
Acknowledgments
• Simon Jupp
• Stuart Owen, Matthew Horridge, Katy Wolstencroft and Carole Goble @ University of Manchester for RightField
• Joost Schanstra, Panagiotis Moulos, Jean-Loup Bascands @ Renal Fibrosis Lab, Toulouse, France
• Aristidis Charonis, Bénédicte Buffin-Meyer, Myriem Fernandez for the CALR example
• e-LICO FP7 project and EuroKUP
• Robert Stevens, ontology development, University of Manchester
Open Source License: GNU Lesser General Public License
Code: http://code.google.com/p/kupkb-dev/
Some rough stats…
• 195 KUP experiments integrated
• KUPKB RDF store ~35M triples
• KUP Ontology ~1800 classes, ~40,000 after imports closure
Architecture
• Sesame and BigOWLIM for the RDF store
• Web site developed with Google Web Toolkit
• OWL API and HermiT reasoner for classification and faceted browsing
Summary
The KUPKB RDF store is a mashup of biological knowledge relating to the KUP domain
Ontologies provide the schema and a consistent data annotation mechanism
We expose this knowledge base through the iKUP, a simple web interface that real biologists can use
iKUP and KUPKB give the biologist a faster way to survey the data in biological publications and help the hypothesis generation process.
It is a testament to the tools and APIs that such applications are now being delivered at relatively low cost