Representing chemicals using OWL, Description Graphs and Rules

Janna Hastings, EBI, UK
Michel Dumontier, Carleton University, Canada
Duncan Hull, EBI, UK
Matthew Horridge, Manchester, UK
Christoph Steinbeck, EBI, UK
Ulrike Sattler, Manchester, UK
Robert Stevens, Manchester, UK
Tertia Hörne, University of South Africa
Katarina Britz, Meraka Institute, South Africa

OWLED 2010, San Francisco

DESCRIPTION

Objects can be said to be structured when their representation also contains their parts. While OWL in general can describe structured objects, description graphs are a recent, decidable extension to OWL that supports the description of classes of structured objects whose parts are related in complex ways. Classes of chemical entities such as molecules, ions and groups (parts of molecules) are often characterized by the way in which the constituent atoms of their instances are connected via chemical bonds. For chemoinformatics tools and applications, this internal structure is represented using chemical graphs. Here we present a chemical knowledge base built on the standard chemical graph model using description graphs, OWL and rules. Our ontology includes chemical classes, groups, and molecules, together with their structures encoded as description graphs. We show how role-safe rules can be used to determine parthood between groups and molecules based on the graph structures, and to determine basic chemical properties. Finally, we investigate the scalability of the technology by developing an automatic utility to convert standard chemical graphs into description graphs, and by converting a large number of diverse graphs obtained from a publicly available chemical database.

Problem

We wish to represent and reason over structured objects, i.e. objects whose representation also contains their parts

Chemical structures

Molecules consist of atoms connected by bonds

[Figure: structure of caffeine, with a single bond, a double bond, and carbon, hydrogen, nitrogen and oxygen atoms labelled]

Chemical ontology

A chemical ontology consists of chemical classes, which can be defined by parts of structures and/or properties of structures:

carboxylic acid: if molecule has part some carboxy group

cyclic molecule: if molecule has property cyclic, i.e. a self-connected cyclic path exists through the molecule’s atoms

OWL representation

Without structure, all parts must be explicitly asserted (combinatorial explosion for larger molecules)

But the structure of complex molecules breaks the OWL tree-model requirement: such a structure does not have a model in the shape of a tree

Description Graphs

A recent, decidable extension to OWL 2, allowing expression of complex structures as graphs within the ontology

A description graph consists of a set of labelled vertices and a set of directed edges

Each description graph has a main class which links the graph to the main OWL ontology
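The three ingredients named above (labelled vertices, directed edges, a main class) can be sketched as a small data structure. This is only an illustrative sketch: the class `DescriptionGraph`, its methods and the toy water example are hypothetical names chosen here, not part of any OWL tooling.

```python
from dataclasses import dataclass, field


@dataclass
class DescriptionGraph:
    """Labelled vertices plus directed, property-labelled edges."""
    main_class: str  # class that links the graph to the main ontology
    vertices: dict[int, str] = field(default_factory=dict)  # vertex id -> class label
    edges: set[tuple[int, str, int]] = field(default_factory=set)  # (source, property, target)

    def add_vertex(self, vertex_id: int, label: str) -> None:
        self.vertices[vertex_id] = label

    def add_edge(self, source: int, prop: str, target: int) -> None:
        # Edges may only connect vertices that have already been declared.
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must be declared vertices")
        self.edges.add((source, prop, target))


# Hypothetical toy graph: vertex 0 carries the main class, vertices 1-3 are atoms.
water = DescriptionGraph(main_class="water")
water.add_vertex(0, "water")
water.add_vertex(1, "oxygen_atom")
water.add_vertex(2, "hydrogen_atom")
water.add_vertex(3, "hydrogen_atom")
for atom_vertex in (1, 2, 3):
    water.add_edge(0, "has_atom", atom_vertex)
```

Bonds are left out of the toy example to keep it short; they could be added as further vertices and edges in the same way.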

Strong separation

In order to preserve decidability of knowledge bases enriched with description graphs, atomic properties used as graph edges have to be different to those used in axioms in the main OWL ontology

This is known as the strong separation requirement
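Read naively, the requirement says that the set of edge properties and the set of properties used in ontology axioms share no members. The sketch below checks only that name-level disjointness; the function name and the property names are hypothetical, and the formal condition in the description graph formalism may be more detailed than this.

```python
def shared_properties(graph_properties: set[str], axiom_properties: set[str]) -> set[str]:
    """Return the properties used both as graph edges and in ontology axioms."""
    return graph_properties & axiom_properties


# Hypothetical property names for illustration.
graph_props = {"has_atom", "has_bond"}
axiom_props = {"has_part", "has_role"}

# An empty result means this naive separation check passes.
violations = shared_properties(graph_props, axiom_props)
```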

Rules

Enhance OWL with the capacity to express if – then constructions

Consist of ‘antecedent’ (if conditions) and ‘consequent’ (then result)

Antecedent and consequent are composed of conjunctions of atomic statements
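The shape of such a rule can be sketched as plain data: two conjunctions of atomic statements. The classes `Statement` and `Rule`, and the example rule with its `oxygen_containing` predicate, are hypothetical and only illustrate the structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One atomic statement, e.g. has_atom(?x, ?a)."""
    predicate: str
    args: tuple[str, ...]

    def __str__(self) -> str:
        return f"{self.predicate}({', '.join(self.args)})"


@dataclass(frozen=True)
class Rule:
    antecedent: tuple[Statement, ...]  # "if" conditions, read as a conjunction
    consequent: tuple[Statement, ...]  # "then" result, read as a conjunction

    def __str__(self) -> str:
        body = ", ".join(str(s) for s in self.antecedent)
        head = ", ".join(str(s) for s in self.consequent)
        return f"{body} -> {head}"


# Hypothetical example rule, written only to show the structure.
rule = Rule(
    antecedent=(
        Statement("molecule", ("?x",)),
        Statement("has_atom", ("?x", "?a")),
        Statement("oxygen_atom", ("?a",)),
    ),
    consequent=(Statement("oxygen_containing", ("?x",)),),
)
print(rule)
```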

Goal

• Can we represent chemical structures using OWL and Description Graphs?

• Can we reason over the information encoded in chemical structures using OWL, Description Graphs and Rules?

OWL ontology

Chemical description graphs

Generated based on structures converted from a chemical database (ChEBI)
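The conversion utility itself is not shown in the slides. As a rough sketch of what such a conversion might look like, the function below turns a list of atoms and bonds into vertices and edges, treating each bond as its own vertex linked by `has_bond` edges, in line with the `has_atom` and `has_bond` statements used in the rules of this talk. The function name, the input format and the lookup tables are assumptions made for illustration.

```python
# Assumed mappings from element symbols and bond orders to class labels.
ELEMENT_CLASS = {"C": "carbon_atom", "H": "hydrogen_atom",
                 "N": "nitrogen_atom", "O": "oxygen_atom"}
BOND_CLASS = {1: "single_bond", 2: "double_bond"}


def to_description_graph(name, atoms, bonds):
    """atoms: list of element symbols; bonds: list of (index, index, order)."""
    vertices = {0: name}  # vertex 0 stands for the molecule (main class)
    edges = []
    for i, element in enumerate(atoms, start=1):
        vertices[i] = ELEMENT_CLASS[element]
        edges.append((0, "has_atom", i))
    first_bond_vertex = len(atoms) + 1
    for k, (a, b, order) in enumerate(bonds):
        bond_vertex = first_bond_vertex + k
        vertices[bond_vertex] = BOND_CLASS[order]
        # Atom indices are 0-based in the input, atom vertices start at 1.
        edges.append((a + 1, "has_bond", bond_vertex))
        edges.append((b + 1, "has_bond", bond_vertex))
    return vertices, edges


# Formaldehyde (H2C=O) as a small input example.
vertices, edges = to_description_graph(
    "formaldehyde", ["C", "O", "H", "H"], [(0, 1, 2), (0, 2, 1), (0, 3, 1)]
)
```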

Rules

Generated for properties, e.g. being cyclic

molecule(?x),
atom(?a1), atom(?a2), atom(?a3), atom(?a4),
bond(?b1), bond(?b2), bond(?b3), bond(?b4),
has_atom(?x, ?a1), has_atom(?x, ?a2), has_atom(?x, ?a3), has_atom(?x, ?a4),
has_bond(?a1, ?b1), has_bond(?a1, ?b4),
has_bond(?a2, ?b1), has_bond(?a2, ?b2),
has_bond(?a3, ?b2), has_bond(?a3, ?b3),
has_bond(?a4, ?b3), has_bond(?a4, ?b4)
-> cyclic_entity(?x)

[Figure: cyclobutane and tetrahedrane]
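To see what the antecedent of this rule asks for, it can be matched by brute force against a small set of `has_bond` facts. The sketch below is not how a reasoner evaluates rules. It also assumes that the four atom variables and the four bond variables bind to distinct individuals, whereas plain rule semantics would let different variables bind to the same individual. The function name and the fact encoding are hypothetical, and only the carbon skeletons are encoded.

```python
from itertools import permutations


def matches_four_ring(atoms, has_bond):
    """atoms: iterable of atom ids; has_bond: set of (atom, bond) pairs."""
    bonds = {bond for _, bond in has_bond}
    for a1, a2, a3, a4 in permutations(atoms, 4):
        for b1, b2, b3, b4 in permutations(bonds, 4):
            # The eight has_bond statements from the rule antecedent.
            needed = {(a1, b1), (a1, b4), (a2, b1), (a2, b2),
                      (a3, b2), (a3, b3), (a4, b3), (a4, b4)}
            if needed <= has_bond:
                return True
    return False


carbons = ["c1", "c2", "c3", "c4"]

# Cyclobutane skeleton: c1-c2-c3-c4-c1
cyclobutane = {("c1", "b1"), ("c2", "b1"), ("c2", "b2"), ("c3", "b2"),
               ("c3", "b3"), ("c4", "b3"), ("c4", "b4"), ("c1", "b4")}

# Butane skeleton: c1-c2-c3-c4 (open chain)
butane = {("c1", "b1"), ("c2", "b1"), ("c2", "b2"), ("c3", "b2"),
          ("c3", "b3"), ("c4", "b3")}

ring_found = matches_four_ring(carbons, cyclobutane)
chain_found = matches_four_ring(carbons, butane)
```

Note that this pattern only recognises four-membered rings; rings of other sizes would need their own patterns.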

Rules

Generated for parthood, e.g. carboxylic acid

molecule(?y),
atom(?a0), oxygen_atom(?a1), carbon_atom(?a2), oxygen_atom(?a3),
has_atom(?y, ?a0), has_atom(?y, ?a1), has_atom(?y, ?a2), has_atom(?y, ?a3),
double_bond(?b0), single_bond(?b1), single_bond(?b2),
has_bond(?a0, ?b2), has_bond(?a1, ?b1),
has_bond(?a2, ?b0), has_bond(?a2, ?b1), has_bond(?a2, ?b2),
has_bond(?a3, ?b0)
-> carboxylic_acid(?y)

Benzoic acid has this part, so it is a carboxylic acid

[Figure: carboxylic acid group and benzoic acid]
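The structural pattern in this rule can also be checked procedurally: look for a carbon with a double bond to one oxygen, a single bond to another oxygen, and a single bond to some further atom. The sketch below does this for benzoic acid and benzene, assuming distinct atoms, hydrogens omitted, and the aromatic ring written with alternating single and double bonds. The function name and the input encoding are hypothetical.

```python
def has_carboxy_group(elements, bonds):
    """elements: atom id -> element symbol; bonds: list of (atom, atom, order)."""
    neighbours = {atom: [] for atom in elements}
    for a, b, order in bonds:
        neighbours[a].append((b, order))
        neighbours[b].append((a, order))
    for atom, symbol in elements.items():
        if symbol != "C":
            continue
        double_o = [n for n, o in neighbours[atom] if o == 2 and elements[n] == "O"]
        singles = [n for n, o in neighbours[atom] if o == 1]
        single_o = [n for n in singles if elements[n] == "O"]
        # Need =O, a singly bonded O, and one more singly bonded atom.
        if double_o and single_o and len(singles) >= 2:
            return True
    return False


# Six-membered carbon ring c1..c6 with alternating bond orders.
ring = [(f"c{i}", f"c{i % 6 + 1}", 2 if i % 2 else 1) for i in range(1, 7)]

benzene_elements = {f"c{i}": "C" for i in range(1, 7)}
benzoic_elements = {**benzene_elements, "c7": "C", "o1": "O", "o2": "O"}
benzoic_bonds = ring + [("c1", "c7", 1), ("c7", "o1", 2), ("c7", "o2", 1)]

benzoic_acid_matches = has_carboxy_group(benzoic_elements, benzoic_bonds)
benzene_matches = has_carboxy_group(benzene_elements, ring)
```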

Testing the reasoning

• Can we use a reasoner to deduce the classification hierarchy based on the graphs and rules?

No asserted hierarchy between test classes and molecules with generated graphs

Results

Inferred hierarchy shows classified molecules

Testing the performance

• How many molecules (description graphs) can we include in our knowledge base?

• How does the reasoning task (classification) scale with respect to the number of graphs, both with and without rules in the knowledge base?

Results

Experiences

• Difficult to debug; tool support needs to be improved

• Difficult to construct rules for properties which depend on all atoms or all bonds in a given molecule

e.g. saturated -> all bonds in molecule are single
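For contrast, outside the rule language the "all bonds" condition is a one-line universal check, provided the list of bonds is assumed to be complete (a closed-world assumption that a rule made of positive atomic statements does not give for free). The function name and the bond-order lists below are hypothetical illustrations.

```python
def is_saturated(bond_orders):
    """True when every listed bond is single; assumes the list is complete."""
    return all(order == 1 for order in bond_orders)


# Ethane: one C-C single bond and six C-H single bonds.
ethane_saturated = is_saturated([1, 1, 1, 1, 1, 1, 1])

# Ethene: one C=C double bond and four C-H single bonds.
ethene_saturated = is_saturated([2, 1, 1, 1, 1])
```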

Conclusion

• Using OWL, Description Graphs and Rules we can represent chemical structures at the class level in our knowledge base and reason over the structural information

• Scalability of the reasoning with the rules is a concern

Acknowledgements

• Special thanks to Kirill Degtyarenko, Stefan Schulz, Colin Batchelor, Birte Glimm and the ChEBI team

• Funding: Meraka Institute, South Africa; BBSRC (BB/G022747/1); NSERC Discovery Grant