Venkatesan bosc2010 onto-toolkit

ONTO-ToolKit: enabling bio-ontology engineering via Galaxy

Aravind Venkatesan

Systems Biology group, Department of Biology, NTNU, Trondheim

[email protected]

Overview

• Galaxy
• Ontology for Life Sciences
• ONTO-Toolkit
• Use Cases
• Conclusion
• Future Directions
• Acknowledgment
• References

Ontology for Life Sciences

• Ontologies aid in knowledge formalisation and machine interoperability

• The success of ontologies in the Life Sciences is marked by the widespread use of the Gene Ontology (GO) [1]

• Application ontologies such as the Cell Cycle Ontology [2]

• The OBO flat file format (OBOF) [3] and the Web Ontology Language (OWL) [4] have gained wide acceptance as knowledge representation languages.
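To make the OBO flat file format concrete, here is a minimal sketch of reading `[Term]` stanzas in Python. It is not ONTO-PERL code and does not cover the full OBO 1.2 specification: it handles only the `id`, `name` and `is_a` tags, and the `EX:` identifiers and the function name `parse_obo_terms` are hypothetical, chosen for illustration.

```python
# Hypothetical OBO 1.2 fragment; the EX: identifiers are made up for illustration.
SAMPLE = """format-version: 1.2

[Term]
id: EX:0000001
name: root process

[Term]
id: EX:0000002
name: child process
is_a: EX:0000001 ! root process

[Typedef]
id: part_of
name: part of
"""


def parse_obo_terms(text):
    """Return {term_id: {"name": str, "is_a": [parent_ids]}} for [Term] stanzas."""
    terms = {}
    current = None
    in_term = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest here.
            in_term = line == "[Term]"
            current = None
            continue
        if not in_term or ":" not in line:
            continue
        tag, value = line.split(":", 1)
        value = value.strip()
        if tag == "id":
            current = terms.setdefault(value, {"name": "", "is_a": []})
        elif current is not None and tag == "name":
            current["name"] = value
        elif current is not None and tag == "is_a":
            # Drop the trailing "! human-readable name" comment.
            current["is_a"].append(value.split("!", 1)[0].strip())
    return terms


terms = parse_obo_terms(SAMPLE)
for term_id, info in terms.items():
    print(term_id, info["name"], info["is_a"])
```

The `[Typedef]` stanza and the header line are skipped, so only the two terms are returned.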

ONTO-Toolkit

• Is a collection of tools for managing ontologies represented in the OBO file format within the Galaxy environment

• The tools are wrappers for commonly used functions provided by ONTO-PERL [5]

• ONTO-PERL was developed as part of the Semantic Systems Biology (SSB) initiative [6]

• ONTO-PERL (an OBOF-centred Perl API) comprises an extensible set of object-oriented Perl modules

• The modules provide an organised set of subroutines for handling ontologies and are fully compatible with the current OBO specification (ver. 1.2)

• The latest version (ver. 1.22) of ONTO-PERL can be downloaded directly from CPAN, http://search.cpan.org/dist/ONTO-PERL/

ONTO-PERL: An API supporting the development and analysis of bio-ontologies. Antezana E, Egana M, De Baets B, Kuiper M, Mironov V. Bioinformatics 2008; doi: 10.1093/bioinformatics/btn042

Examples of ONTO-PERL functionalities

Script                        Functionality
get_ancestor_terms.pl         Collects the ancestor terms (list of IDs) from a given term (existing ID) in the given OBO ontology.
get_child_terms.pl            Collects the child terms (list of term IDs and their names) from a given term (existing ID) in the given OBO ontology.
get_descendent_terms.pl       Collects the descendent terms (list of IDs) from a given term (existing ID) in the given OBO ontology.
get_subontology_from.pl       Extracts a subontology (in OBO format) of a given ontology having the given term ID as the root.
get_intersection_ontology.pl  Provides an intersection of the given ontologies (in OBO format).
obo2owl.pl                    OBO to OWL translator.
obo2rdf.pl                    OBO to RDF translator.
obo_trimming.pl               This script trims a given branch of an OBO ontology.
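The three traversal scripts listed above can be pictured as walks over the is_a graph. The following Python sketch illustrates the idea under stated assumptions: it is not the ONTO-PERL implementation, it follows only is_a links, and the toy ontology `PARENTS` with its `EX:` identifiers is hypothetical.

```python
from collections import deque

# Hypothetical toy ontology: term ID -> list of direct is_a parents.
PARENTS = {
    "EX:1": [],
    "EX:2": ["EX:1"],
    "EX:3": ["EX:1"],
    "EX:4": ["EX:2", "EX:3"],
}


def get_ancestors(term_id, parents):
    """All terms reachable by following is_a links upwards from term_id."""
    seen = set()
    queue = deque(parents.get(term_id, []))
    while queue:
        term = queue.popleft()
        if term not in seen:
            seen.add(term)
            queue.extend(parents.get(term, []))
    return seen


def get_children(term_id, parents):
    """Terms that list term_id as a direct is_a parent."""
    return {child for child, ps in parents.items() if term_id in ps}


def get_descendants(term_id, parents):
    """All terms reachable by following is_a links downwards from term_id."""
    seen = set()
    queue = deque(get_children(term_id, parents))
    while queue:
        term = queue.popleft()
        if term not in seen:
            seen.add(term)
            queue.extend(get_children(term, parents))
    return seen


print(sorted(get_ancestors("EX:4", PARENTS)))
print(sorted(get_children("EX:1", PARENTS)))
print(sorted(get_descendants("EX:1", PARENTS)))
```

Using a `seen` set keeps the walk correct when a term has several parents, as `EX:4` does here.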

ONTO-Toolkit - GALAXY

[Screenshots: ONTO-Toolkit in the Galaxy interface. Annotation: Define arguments]

Use Cases

Motivation:

• To demonstrate the functionality of ONTO-Toolkit in GALAXY

• To demonstrate the usefulness of ontology engineering in the biological domain

Use Case I:

• To investigate similarities between given molecular functions

• Collect all the upstream terms (ancestors) of two given molecular function terms and identify their common ancestor terms.

Chosen Ontology: Cell Cycle Ontology

Chosen Terms:

Term 1: id: CCO:F0000004, name: trans-hexaprenyltranstransferase activity

Term 2: id: CCO:F0000820, name: homogentisate 1,2-dioxygenase activity

Use Case I

Uploading an OBO ontology file, e.g.: cco_S_pombe

Use Case I, continued

Molecular function Term ID: CCO:F0000004

• This step is repeated for the second term - CCO:F0000820

List of ancestor terms for the given Molecular function Term 1

List of ancestor terms for Term 2

Common ancestor terms

Gets the overlapping ancestor terms
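The overlap step of Use Case I amounts to intersecting two ancestor sets. The sketch below shows that logic in Python on a hypothetical toy hierarchy. It is not the Galaxy tool or ONTO-PERL itself, and the `EX:` identifiers do not reflect the actual structure of the Cell Cycle Ontology.

```python
# Hypothetical toy hierarchy: term ID -> list of direct is_a parents.
PARENTS = {
    "EX:root": [],
    "EX:function": ["EX:root"],
    "EX:branch_a": ["EX:function"],
    "EX:branch_b": ["EX:function"],
    "EX:term_1": ["EX:branch_a"],
    "EX:term_2": ["EX:branch_b"],
}


def ancestors(term_id, parents):
    """Collect every term reachable by following is_a links upwards."""
    result = set()
    stack = list(parents.get(term_id, []))
    while stack:
        term = stack.pop()
        if term not in result:
            result.add(term)
            stack.extend(parents.get(term, []))
    return result


def common_ancestors(term_a, term_b, parents):
    """Ancestors shared by both terms (set intersection)."""
    return ancestors(term_a, parents) & ancestors(term_b, parents)


shared = common_ancestors("EX:term_1", "EX:term_2", PARENTS)
print(sorted(shared))
```

In this toy case the two terms share only the top-level nodes, which is the pattern Use Case I interprets as a sign that two functions are unrelated.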

Use Case II

• Identifying overlapping annotations for a given pair of distinct biological process terms

Chosen Ontology: Cell Cycle Ontology

Chosen Terms:

Term 1: id: CCO:P0000005, name: cell cycle checkpoint

Term 2: id: CCO:P0000069, name: mitosis

Use Case II

Gets the sub-ontology for the given terms

Generated sub-ontology of Term 1 : CCO:P0000005

Generated sub-ontology of Term 2 : CCO:P0000069

Gets the intersection of the two sub-ontologies
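The two steps of Use Case II, extracting a sub-ontology per term and then intersecting the results, can be sketched in Python as follows. This is a simplification at the level of term IDs and is_a links only, not the behaviour of get_subontology_from.pl or get_intersection_ontology.pl, and the toy ontology with its `EX:` identifiers is hypothetical.

```python
# Hypothetical toy ontology: term ID -> list of direct is_a parents.
PARENTS = {
    "EX:root": [],
    "EX:A": ["EX:root"],
    "EX:B": ["EX:root"],
    "EX:A_only": ["EX:A"],
    "EX:B_only": ["EX:B"],
    "EX:shared": ["EX:A", "EX:B"],
}


def subontology(root, parents):
    """Term IDs of the sub-ontology rooted at `root`: the root plus all its descendants."""
    keep = {root}
    changed = True
    while changed:
        changed = False
        for term, direct_parents in parents.items():
            # A term belongs to the sub-ontology if any of its parents already does.
            if term not in keep and any(p in keep for p in direct_parents):
                keep.add(term)
                changed = True
    return keep


sub_a = subontology("EX:A", PARENTS)
sub_b = subontology("EX:B", PARENTS)
overlap = sub_a & sub_b  # terms present in both sub-ontologies
print(sorted(sub_a), sorted(sub_b), sorted(overlap))
```

A non-empty `overlap` corresponds to terms that sit under both chosen roots, which is what Use Case II looks for between two biological process terms.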

Conclusion

• Use Case I: the results provide evidence that the two molecular functions are unrelated, as they share only high-level terms.

• Use Case II: the results suggest a possible overlap between two distinct biological processes.

• ONTO-Toolkit provides rich, ontology-driven solutions within the Galaxy framework.

Future Directions

• Provide an interface to perform SPARQL queries within Galaxy

• Provide a visualisation module

Acknowledgment

• Dr. Erick Antezana, NTNU
• Dr. Vladimir Mironov, NTNU
• Dr. Martin Kuiper, NTNU

References

1. M. Ashburner, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 25:25–29, May 2000.

2. The Cell Cycle Ontology, http://www.semantic-systems-biology.org/cco

3. The OBO Flat File Format Specification (ver.1.2), http://www.geneontology.org/GO.format.obo-1_2.shtml

4. OWL Web Ontology Language, http://www.w3.org/TR/owl-semantics/

5. ONTO-PERL: An API supporting the development and analysis of bio-ontologies. Antezana E, Egana M, De Baets B, Kuiper M, Mironov V. Bioinformatics 2008; doi: 10.1093/bioinformatics/btn042

6. Semantic Systems Biology, http://www.semantic-systems-biology.org/