Ontology - and Reloaded and Revolutions

Computer Science 672, Spring 2006, Iowa State University. Feb 27. 2006

Iowa State University, Department of Computer Science, Artificial Intelligence Research Laboratory

Ontology - and Reloaded and Revolutions in Life Science

Jie Bao

baojie@cs.iastate.edu

Ontology

Once upon a time there was a graduate student called Neo, who liked thinking

And asked his advisor silly questions

Architect: Hello, Neo.

Neo: Who are you?

Architect: I am the Architect. I created the Matrix. I have been waiting for you. You have many questions, and although the process has altered your consciousness, you remain irrevocably human. Ergo, some of my answers you will understand, and some of them you will not. Concordantly, while your first question may be the most pertinent, you may or may not realize it is also the most irrelevant. Ultimately, you are just one Cell.

Meaning of the World

A Cell?

Cell?

From: Amelia Ireland (2005). GO : the Gene Ontology (Talk)

Names and Meaning

You do need to learn more about names and their meanings to be the One. In other words, you have to learn ontology.

Ontology? It is another unknown word...

What is Ontology

• the American Heritage® Dictionary: The branch of metaphysics that deals with the nature of being.

• Merriam-Webster : 1: a branch of metaphysics concerned with the nature and relations of being. 2 : a particular theory about the nature of being or the kinds of existents

Then use a dictionary!

What is Ontology(2)

• 1. <philosophy> A systematic account of Existence.

• 2. <artificial intelligence> (From philosophy) An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them. For AI systems, what "exists" is that which can be represented. ..... Formally, an ontology is the statement of a logical theory.

• 3. <information science> The hierarchical structuring of knowledge about things by subcategorising them according to their essential (or at least relevant and/or cognitive) qualities.

Source: The Free On-line Dictionary of Computing, © 1993-2005 Denis Howe

And Google!

What is Ontology(3)

• Aristotle: Science of Being (Metaphysics, IV, 1)

• Tom Gruber: An ontology is a specification of a conceptualization. (T. R. Gruber. A translation approach to portable ontologies. Knowledge Acquisition, 5(2):199-220, 1993 )

And learn from those people

This is an ontology

Another Ontology

Yet Another Ontology

Keys of Ontologies

A controlled vocabulary that classifies concepts and defines the relationships between them

• Terms: Neo, Cell, Smith, Agent, Program, fights

• Relations
– Neo is a Cell
– Smith is an Agent
– An Agent is a Program
– Neo fights Smith
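As a minimal sketch (not part of the talk), the terms and relations above can be stored as subject-predicate-object triples. The tuple layout, the `is_a` spelling, and the helper name `objects_of` are assumptions made for illustration; the slide also uses "is a" loosely for both membership (Neo) and subclassing (Agent), and the sketch keeps that looseness.

```python
# Hypothetical encoding of the slide's example as (subject, predicate, object) triples.
TRIPLES = {
    ("Neo", "is_a", "Cell"),
    ("Smith", "is_a", "Agent"),
    ("Agent", "is_a", "Program"),
    ("Neo", "fights", "Smith"),
}


def objects_of(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


# Entity terms are whatever appears as a subject or an object;
# relation names such as "fights" are the predicates.
ENTITY_TERMS = {s for s, _, _ in TRIPLES} | {o for _, _, o in TRIPLES}
RELATION_NAMES = {p for _, p, _ in TRIPLES}
```

Calling `objects_of("Neo", "fights")` returns the set containing only `"Smith"`. A plain controlled vocabulary would stop at `ENTITY_TERMS`; the triples are what add the relations.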

What is Not an Ontology

• Controlled vocabularies with no relations
– Like PDB IDs

• Dictionaries – they are not formal specifications

Why Ontology

What can I do once I have an ontology?

Annotation

In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GTP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…

Function: protein serine/threonine kinase activity ; GO:0004674 (IDA)

Component: integral to plasma membrane ; GO:0005887 (IDA)

Process: response to wounding ; GO:0009611 (NAS)

You may annotate data precisely with ontology terms
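One way to picture such annotations in code is as small records that pair a gene product with a GO ID and an evidence code. The sketch below is illustrative only: the `Annotation` class, its field names, and `with_evidence` are hypothetical, while the GO IDs and evidence codes are the ones listed above (IDA: inferred from direct assay; NAS: non-traceable author statement).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One gene-product-to-term link (hypothetical record layout)."""
    gene_product: str
    aspect: str      # Function, Component, or Process
    term: str
    go_id: str
    evidence: str    # GO evidence code, e.g. IDA or NAS


PERK1_ANNOTATIONS = [
    Annotation("PERK1", "Function",
               "protein serine/threonine kinase activity", "GO:0004674", "IDA"),
    Annotation("PERK1", "Component",
               "integral to plasma membrane", "GO:0005887", "IDA"),
    Annotation("PERK1", "Process",
               "response to wounding", "GO:0009611", "NAS"),
]


def with_evidence(annotations, code):
    """Return the GO IDs of annotations supported by the given evidence code."""
    return [a.go_id for a in annotations if a.evidence == code]
```

Filtering on the evidence code, for example `with_evidence(PERK1_ANNOTATIONS, "IDA")`, keeps only the annotations backed by a direct assay.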

Reasoning

– Recognising semantic similarity in spite of syntactic differences (Neo is the One)

– Recognising implicit consequences given explicitly stated facts ((given ontology) -> Smith is a Program)

This is what we call “machine understandability”, Neo

You may infer implicit facts from known facts
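A minimal sketch of both kinds of reasoning, assuming the toy Matrix facts from the earlier slide (Smith is an Agent, an Agent is a Program) and a hand-written alias table. The names `IS_A`, `SAME_AS`, `canonical`, and `inferred_types` are hypothetical, and a real reasoner does far more than this graph walk.

```python
# Explicitly stated facts: term -> direct parents (assumed toy data).
IS_A = {
    "Neo": {"Cell"},
    "Smith": {"Agent"},
    "Agent": {"Program"},
}

# Different names for the same individual (assumed alias table).
SAME_AS = {"the One": "Neo"}


def canonical(name):
    """Map an alias to its canonical name; other names pass through unchanged."""
    return SAME_AS.get(name, name)


def inferred_types(name, is_a=IS_A):
    """Collect every type reachable from `name` by following is_a links."""
    seen = set()
    stack = [canonical(name)]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Here `inferred_types("Smith")` includes `"Program"` even though that fact was never stated directly, and `inferred_types("the One")` gives the same answer as `inferred_types("Neo")`.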

Ontology - Reloaded

Ontologies in life science

More Graduate-Advisor Conversation

• Neo: Why am I here?

• Architect: Your life is the sum of a remainder of an unbalanced equation inherent in the programming of the Matrix. You are the eventuality of an anomaly which, despite my sincerest efforts, I have been unable to eliminate from what is otherwise a harmony of mathematical precision. While it remains a burden assiduously avoided, it is not unexpected and thus not beyond a measure of control, since we control your cells through knowledge of your biological existence. Which has led you, inexorably, here.

More Graduate-Advisor Conversation

Can you give me an example of your knowledge about biological existence? I assume it is an ontology – the study of existence.

Let’s Start with GO

GO?

GO - Gene Ontology

is NOT an ontology only about genes. Its three parts:

• Cellular Component
• Molecular Function
• Biological Process

This is a typical ontological misunderstanding

Cellular Component

Molecular Function

insulin binding, insulin receptor activity

Molecular Function

drug transporter activity

Biological Process

cell division

Biological Process

Courtship behavior of Homo sapiens

Anatomy of a GO term

id: GO:0006094    <- unique GO ID
name: gluconeogenesis    <- term name
namespace: process    <- ontology
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]    <- definition
exact_synonym: glucose biosynthesis    <- synonym
xref_analog: MetaCyc:GLUCONEO-PWY    <- database ref
is_a: GO:0006006    <- parentage
is_a: GO:0006092    <- parentage

Ontology Structure

[Figure: graph with the nodes cell, membrane, chloroplast, mitochondrial membrane, and chloroplast membrane, connected by is-a and part-of edges]

DAG (directed acyclic graph)
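To make the DAG idea concrete, here is a small sketch that walks upward from a term through both is-a and part-of edges. The node names come from the slide, but the specific edges in `EDGES` are an assumption (the figure itself did not survive extraction), and `ancestors` is a hypothetical helper name.

```python
# Assumed edges: child -> list of (relation, parent).
EDGES = {
    "membrane": [("part_of", "cell")],
    "chloroplast": [("part_of", "cell")],
    "mitochondrial membrane": [("is_a", "membrane")],
    "chloroplast membrane": [("is_a", "membrane"), ("part_of", "chloroplast")],
}


def ancestors(term, edges=EDGES):
    """Return every term reachable from `term` via is_a or part_of edges."""
    found = set()
    stack = [term]
    while stack:
        for _relation, parent in edges.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found
```

Under these assumed edges, "chloroplast membrane" has two direct parents, which is exactly what a tree cannot express and a DAG can; both paths still converge on "cell".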

What other ontologies in life science

There are many! EC, SCOP, CATH, MIPS, MGED… Let’s see some trait ontologies.

Phenotype (Trait) Ontologies

Example: Plant Ontology (http://www.plantontology.org)

Animal Trait Ontology

ATO Editor

ATO Editor Features

• Collaborative Ontology Building
– Concurrent editing
– Hierarchical management
– Partial Locking

• Scalable Database storage

• Modular ontology representation

• Partially reusable ontology

OBO

• Open Biomedical Ontologies - http://obo.sourceforge.net/

And more…

OBO Format

format-version: GO_1.0
!any comment here
typeref: relationship.types
subsetdef: goslim "Generic GO Slim"
version: $Revision: 1.11 $
date: April 18th, 2003
saved-by: jrichter
remark: Example file

[Term]
id: GO:0003674
name: molecular_function
def: "The action characteristic of a gene product." [GO:curators]
subset: goslim

[Term]
id: GO:0016209
name: antioxidant activity
is_a: GO:0003674
def: "Inhibition of the reactions brought about by dioxygen or peroxides. \
Usually the antioxidant is effective because it can itself be more easily \
oxidized than the substance protected. The term is often applied to \
components that can trap free radicals, thereby breaking the chain \
reaction that normally leads to extensive biological damage." \
[ISBN:0198506732]
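Because the format is line-oriented "tag: value" text grouped into stanzas, a few lines of code are enough to read the terms back. The following is a rough sketch, not a full OBO parser: `parse_terms` is a hypothetical helper, `SAMPLE` is a shortened copy of the example file, and backslash line continuations, quoting, and escapes are deliberately not handled.

```python
from collections import defaultdict

# Shortened copy of the example file (definitions omitted for brevity).
SAMPLE = '''format-version: GO_1.0
!any comment here
remark: Example file

[Term]
id: GO:0003674
name: molecular_function
subset: goslim

[Term]
id: GO:0016209
name: antioxidant activity
is_a: GO:0003674
'''


def parse_terms(text):
    """Return {term id: {tag: [values]}} for every [Term] stanza in `text`."""
    terms = {}
    current = None  # tags of the stanza being read, or None outside a [Term]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # blank line or comment
        if line.startswith("["):
            current = defaultdict(list) if line == "[Term]" else None
            continue
        if current is None:
            continue  # file header, or a stanza type this sketch ignores
        tag, _, value = line.partition(":")  # split on the first colon only
        tag, value = tag.strip(), value.strip()
        current[tag].append(value)
        if tag == "id":
            terms[value] = current
    return terms
```

Values are kept as lists because a tag such as `is_a` may repeat within one stanza. With the sample above, `parse_terms(SAMPLE)["GO:0016209"]["is_a"]` yields the parent ID `GO:0003674`.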

Ontology - Revolutions

Ontology for Matrix

More Graduate-Advisor Conversation

Neo: Ontologies are really fun. Could you tell me how useful they are, other than helping you to design efficient biological cells?

Architect: Basically, ontologies help us a lot in designing the Matrix. To represent the real world, we need a formalism for Knowledge Representation.

Semantic Web

The first prototype of the Matrix is based on the so-called “Semantic Web”

Semantic Web

Ontologies based on Logics

• The Matrix, like the Semantic Web, is designed on a formal representation of concepts and their relations. The underlying magic is logic, especially Description Logics, which are much more complex than the formalism behind OBO ontologies such as GO.

Web Ontology Language

• OWL is a syntax for description logics and the first-generation Matrix language.
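As a hedged illustration of what "description logic" means here, the toy Matrix facts from the earlier "Keys of Ontologies" slide can be written in standard DL notation. The split into TBox (terminology) and ABox (assertions) is the usual DL convention; the choice of which statements go where is an assumption made for this sketch, not something the talk states.

```latex
\begin{align*}
\text{TBox:}\quad & \mathit{Agent} \sqsubseteq \mathit{Program} \\
\text{ABox:}\quad & \mathit{Cell}(\mathit{Neo}),\quad
                    \mathit{Agent}(\mathit{Smith}),\quad
                    \mathit{fights}(\mathit{Neo}, \mathit{Smith}) \\
\text{Entailed:}\quad & \mathit{Program}(\mathit{Smith})
\end{align*}
```

The last line is not stated anywhere; a DL reasoner derives it from the subsumption axiom and the assertion about Smith.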

DL and OWL

DL and OWL(2)

Evolution of Web Ontology Languages

[Figure: timeline (1992, 1998, 1999, 2000, 2001, 2002, 2003) of web ontology languages: SGML, HTML, XML, SHOE, RDF, RDFS, OIL, DAML-ONT, DAML (DAML+OIL), and OWL, plus DAML-S and OWL-S for Web services. Edge labels: Revision; Extend vocabularies; Combine vocabularies; Extend HTML tags for semantic description; Define vocabularies.]

Protégé – Ontology Editor

Architect: Ontologies do open the road to the series of revolutions leading to the final Matrix

Neo: That’s an interesting talk. If I were you, I would hope that we will meet again, Dr. Arch.

Architect: We will.

References

• Amelia Ireland. GO: the Gene Ontology (talk).

• Tom Shi. Introduction to GO (talk).

• Chintan O. Patel and Yves A. Lussier. Rerepresenting Biomedical Ontologies using the Web Ontology Language - A Checklist. SOFG 2004.

• Zhi-liang Hu, Jie Bao, Max F. Rothschild, Vasant Honavar, and James M. Reecy. (2006) Developing Frameworks and Tools for Animal Trait Ontology (ATO). Plant and Animal Genome XIV Conference. Poster Track. January 14-18, 2006, San Diego, California.

• Ian Horrocks, Peter F. Patel-Schneider, and Frank van Harmelen. From SHIQ and RDF to OWL: The making of a web ontology language. Journal of Web Semantics, 2003.

Questions

A dictionary of Matrix

• First Matrix: a complete but inconsistent system

• Later Matrices: incomplete but consistent systems

• Zion: formulae not decidable in the Matrix (due to Gödel incompleteness)

• Matrix Reload: incorporate new facts from the One

• Neo – a learning program to collect new facts from non-logic worlds

• Oracle – heuristic program

• Architect – a distributed reasoner which can integrate knowledge in the Matrix and Neo.