KUPKB: Sharing, Connecting and Exposing Kidney and Urinary Knowledge using RDF and OWL Julie Klein & Simon Jupp Bio-health informatics group University of Manchester www.kupkb.org

JulieKlein_Bosc2012



The problem domain

Thousands of studies have been conducted by the kidney research community

On different species: human, mouse

On different materials: urine, tissue

On different biological levels: gene, protein, cell

Large diversity: integration of the knowledge is complex

Where does the data go?

Research papers

Bespoke kidney laboratory databases

Generalist databases

Scattered, hidden in figures, coming in different formats

Most of the data is lost!

The Kidney and Urinary Pathway Knowledge Base:

SHARE AND CONNECT

The iKUP Browser:

EXPOSE

www.kupkb.org

Structure

KUP Ontology (schema)

Experimental data

KUP Knowledge Base

RDF triple store

iKUP Browser

Populous

RightField

Ontologies provide the schema

What has been observed, where and when?

Disease ontology

Animal model

Gene Ontology

Experimental factors

Cell type ontology

We needed to connect these reference ontologies.

Creation of a specialized Kidney and Urinary Pathway Ontology (KUPO)

Mouse anatomy ontology

http://www.e-lico.org/public/kupo/

Ontologies by stealth

Populous generates simple Excel based templates

The domain experts are the experts, so get them to build it

Anatomy (MAO)

Biological processes (GO)

Cells (CTO)

Spreadsheet

Ontology

OPPL scripts

http://www.e-lico.eu/populous.html
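
The spreadsheet-to-ontology idea can be illustrated with a minimal Python sketch. It is not Populous and does not use OPPL; the column names (`cell_type`, `located_in`, `participates_in`), the example rows, and the `kupo:` prefix are all assumptions made for illustration. Each row filled in by a domain expert is turned into simple subject/predicate/object statements.

```python
import csv
import io

# Hypothetical template content: one row per cell type, filled in by a domain expert.
TEMPLATE_CSV = """cell_type,located_in,participates_in
podocyte,glomerulus,glomerular filtration
proximal tubule epithelial cell,proximal tubule,reabsorption
"""

# Hypothetical mapping from spreadsheet columns to ontology properties.
COLUMN_TO_PROPERTY = {
    "located_in": "kupo:located_in",
    "participates_in": "kupo:participates_in",
}


def to_identifier(label):
    """Turn a human-readable label into a prefixed identifier (illustrative only)."""
    return "kupo:" + label.strip().replace(" ", "_")


def rows_to_statements(csv_text):
    """Convert each spreadsheet row into (subject, predicate, object) statements."""
    statements = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = to_identifier(row["cell_type"])
        statements.append((subject, "rdf:type", "owl:Class"))
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = row[column].strip()
            if value:  # skip cells the expert left empty
                statements.append((subject, prop, to_identifier(value)))
    return statements


statements = rows_to_statements(TEMPLATE_CSV)
for s, p, o in statements:
    print(s, p, o)
```

The point of the design is that the expert only ever sees the spreadsheet; the mapping from columns to ontology properties lives in one place and can be changed without touching the data.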

Describing/Collecting experimental data

Gathering good meta-data AND data, again by stealth, using RightField

Content of the meta-data cells is constrained to the relevant set of KUPO terms

http://www.sysmo-db.org/rightfield
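
Constraining meta-data cells to a fixed set of terms can be sketched in a few lines of Python. This is not RightField itself (which embeds the allowed terms in the spreadsheet); the field names and the term sets below are assumptions for illustration, borrowing the species, material and level examples mentioned earlier.

```python
# Hypothetical subset of allowed terms for each meta-data field.
ALLOWED_TERMS = {
    "species": {"human", "mouse"},
    "material": {"urine", "tissue"},
    "level": {"gene", "protein"},
}


def validate_metadata(record):
    """Return a list of problems; an empty list means every cell holds an allowed term."""
    problems = []
    for field, allowed in ALLOWED_TERMS.items():
        # A missing field is treated like an empty cell.
        value = record.get(field, "").strip().lower()
        if value not in allowed:
            problems.append(f"{field}: {value!r} is not one of {sorted(allowed)}")
    return problems


good = {"species": "Mouse", "material": "urine", "level": "protein"}
bad = {"species": "mice", "material": "urine"}  # free-text species, missing level

print(validate_metadata(good))
print(validate_metadata(bad))
```

Rejecting free text such as "mice" at entry time is what keeps annotations homogeneous enough to be integrated later.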


Mashing it all together

Kidney and Urinary Pathway Ontology

~1800 classes (~40,000 after imports closure)

Experimental data

220 KUP experiments integrated

OWL reasoning

KUP Knowledge Base

RDF triple store

~35M triples

SPARQLing results

We can now ask queries that span several databases

We can exploit OWL semantics for intelligent answers
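
To make these two points concrete, here is a toy Python sketch. It is neither SPARQL nor an OWL reasoner: a set of tuples stands in for the triple store, and a transitive walk over `subClassOf` stands in for reasoning. All identifiers and facts (`geneA`, `disease_model_X`, and so on) are invented for illustration.

```python
# Toy triples; every identifier and fact below is invented for illustration.
TRIPLES = {
    # Schema (ontology) statements
    ("podocyte", "subClassOf", "kidney_cell"),
    ("kidney_cell", "subClassOf", "cell"),
    # Data statements, as they might arrive from different sources
    ("geneA", "expressed_in", "podocyte"),
    ("geneA", "upregulated_in", "disease_model_X"),
    ("geneB", "expressed_in", "kidney_cell"),
}


def superclasses(term, triples):
    """Return the term together with all of its transitive superclasses."""
    found = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def genes_expressed_in(cell_class, triples):
    """Genes expressed in cell_class or in any of its subclasses."""
    return sorted(
        s for s, p, o in triples
        if p == "expressed_in" and cell_class in superclasses(o, triples)
    )


print(genes_expressed_in("kidney_cell", TRIPLES))
print(genes_expressed_in("podocyte", TRIPLES))
```

Asking for `kidney_cell` also returns the gene annotated only to `podocyte`, because the class hierarchy is consulted; a plain string match on the annotation would miss it.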

Make it all RDF/OWL and expose a SPARQL endpoint…

…then we are done, right?

BUT!

An easy-to-use application… …this is what biologists really want

The iKUP browser

Built as an easy-to-use, lightweight Google Web Toolkit application

To expose data from the KUPKB

Doing some biology

1. A biological question

Accepted for publication in the FASEB J!

Can calreticulin be associated with the development of human kidney disease?

2. No answer with classical tools

Searching PubMed and Google does not return any relevant results!

3. Querying the KUPKB

4. Validation in the wet-lab

KUPKB in silico result confirmed.

5. Publish an innovative result


Reusing and Building

Ontologies provide the schema

Experimental data

OWL reasoning

KUP Knowledge Base

RDF triple store

iKUP Browser

Kidney and Urinary Pathway Ontology

Tool to facilitate building of ontologies

Annotations, homogenization

Tool to facilitate data annotation

What next

User study and evaluation experiments ongoing with Manchester Web Ergonomics Lab

Application to other biological domains

Change the domain model in the ontologies and we can construct any organ knowledge base in this way

Already interest in gut, liver, heart and metabolic diseases

Acknowledgments

• Simon Jupp

• Stuart Owen, Matthew Horridge, Katy Wolstencroft and Carole Goble @ University of Manchester for RightField

• Joost Schanstra, Panagiotis Moulos, Jean-Loup Bascands @ Renal Fibrosis Lab, Toulouse, France

• Aristidis Charonis, Bénédicte Buffin-Meyer, Myriem Fernandez for the CALR example

• e-LICO FP7 project and EuroKUP

• Robert Stevens, ontology development, University of Manchester

Open Source License: GNU Lesser General Public License

Code: http://code.google.com/p/kupkb-dev/

Thank you for listening…

www.kupkb.org

Some rough stats…

• 195 KUP experiments integrated

• KUPKB RDF store ~35M triples

• KUP Ontology ~1800 classes, ~40,000 after imports closure

Architecture

• Sesame and BigOWLIM for the RDF store

• Web site developed with Google Web Toolkit

• OWL API and HermiT reasoner for classification and faceted browsing

Summary

The KUPKB RDF store is a mashup of biological knowledge relating to the KUP domain

Ontologies provide the schema and a consistent data annotation mechanism

We expose this knowledge base through a simple web interface that real biologists can use, the iKUP

iKUP and KUPKB provide a faster mechanism for the biologist to survey the data in biological publications and help the hypothesis generation process.

It is a testament to the tools and APIs that such applications are now being delivered at relatively low cost