War of Ontological Worlds: Mathematics, Computer Code or Esperanto? By Andrey Rzhetsky and James A. Evans. PLoS Computational Biology 7(9): e1002191. http://tinyurl.com/6r34vst Presented by Brian Davis, Ph.D. VCDE WS Teleconference, November 17, 2011

War of Ontological Worlds: Mathematics, Computer Code or Esperanto?

By Andrey Rzhetsky and James A. Evans

PLoS Computational Biology 7(9): e1002191

http://tinyurl.com/6r34vst

Presented by Brian Davis, Ph.D. VCDE WS Teleconference

November 17, 2011

Why is this Paper Important (for Brian)?

• Conceptual framework to understand, and perhaps address, issues I see (in caBIG, 3rd Millennium, Politics, …Life)

• Why do people talk past each other?

• Why do some participants in caBIG (among whose aims is, after all, semantic interoperability) have such a hard time understanding one another?

• Why do some “Technologists” disagree on approaches to solving technical issues (SPARQL vs. DSQL, Ontologies vs. Metadata, etc.)?

• This is not a deep technical paper (4 pages, 1 figure).

• Enjoyable, clever (Mao, Tolkien)

• However, it illustrates different Points of View (POV) among “ontology” experts …
  • especially with regard to the value of use cases and users

• Allows a facilitator to perhaps find common ground to build upon.

War of Ontology Worlds: Different Views on Ontologies

• “In biomedicine today, the term ontology means different things to different experts”

• Abstract: 3 clusters of expert views…
  • Ontology as Mathematics: value is rigor and logic, symmetry and consistency, and representation across scientific sub-fields; includes only non-contradictory knowledge
  • Ontology as Code: value is utility and diversity: fit for purpose and custom design of ontologies
  • Ontology as Esperanto: value is facilitating cross-disciplinary communication across data sets and diverse communities.

• These different views align with classical divides in science; the authors suggest how a synthesis of these concerns could strengthen the next generation of biomedical ontologies.

Origins of and developments in ontologies

• Definition: philosophical inquiry into the nature and categories of existence

• Circa 1900: logicians extended and formalized…as a system for describing entities that exist in the world:
  • properties
  • interrelationships
  • inferential mechanisms for reasoning

• Circa 1990: computer scientists …applying it to…machine-readable knowledge representations.

• Circa 2011: with the rise of scientific databases that are increasingly complex and persistent and that require interoperability, ontologies have been enlisted in information technologies

MEANINGS of Ontologies

• “In biomedicine today, the term ontology means different things to different experts”

• A continuum
  • Unordered terminologies
    • Example: American Medical Association's list of Current Procedural Terminology (CPT) codes
  • Taxonomies
    • Example: International Classification of Diseases (ICD)
    • Organizes by hierarchical “is-a” relationships
  • Formal ontologies
    • Example: Gene Ontology (GO), Foundational Model of Anatomy (FMA)
    • Organizes by rich, rigorous relationships

• Disagreement on GO categorization (inconsistent structure)
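
The continuum above can be pictured as three data structures of increasing expressiveness. The sketch below is only an illustration under stated assumptions: every term is a made-up placeholder (not a real CPT, ICD, GO, or FMA entry), and the helper names `ancestors` and `related` are hypothetical.

```python
# Illustrative toy data only; every term is a hypothetical placeholder,
# not a real CPT, ICD, GO, or FMA entry.

# 1. Unordered terminology: a flat set of terms with no relationships.
terminology = {"procedure_a", "procedure_b", "procedure_c"}

# 2. Taxonomy: each term points to at most one "is-a" parent.
taxonomy = {
    "viral_disease": "infectious_disease",
    "infectious_disease": "disease",
    "disease": None,
}


def ancestors(term, parents):
    """Walk the is-a chain from a term up to the root."""
    chain = []
    parent = parents.get(term)
    while parent is not None:
        chain.append(parent)
        parent = parents.get(parent)
    return chain


# 3. Formal ontology: typed relations stored as (subject, relation, object).
ontology = {
    ("left_ventricle", "part_of", "heart"),
    ("heart", "part_of", "cardiovascular_system"),
    ("heart", "is_a", "organ"),
}


def related(subject, relation, triples):
    """Return every object linked to the subject by the given relation."""
    return sorted(o for s, r, o in triples if s == subject and r == relation)


print(ancestors("viral_disease", taxonomy))
print(related("heart", "part_of", ontology))
```

The flat set supports only membership tests, the taxonomy adds a single hierarchy, and the triple store allows several relation types over the same terms.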

Use examples

• Unstructured Terminology (CPT)
  • Billing patients for medical procedures at hospitals

• Taxonomies and Ontologies (GO)
  • Annotation of experimental findings in research

• Formal Ontologies (FMA)
  • Reasoning across annotated findings for novel insight

Ontologists and Ontologies

• Ontologies constructed by heterogeneous groups of:
  • Computer scientists
  • Bench biologists
  • Bedside physicians
  • Programmers
  • Philosophers

• = “Ontologists” (self-identified)

• Conferences of Ontologists:
  • Focus on construction of ontologies
  • NOT focused on understanding ontologies as Knowledge Representations
  • When discussed = “scuffle of emotionally charged opinion”.

This paper

• Interviews with 14 leading ontologists

• Summarize the wide range of worldviews

• Categorize as 3 archetypes, or caricatures, that highlight essential differences:
  • Mathematics
  • Code
  • Esperanto

• Intermediate views: weighted mixtures of the 3 archetypes above.
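
One way to picture a "weighted mixture" is as three non-negative weights that sum to 1. The sketch below is an assumption made for illustration, not a method from the paper; the function names and the example weights are hypothetical.

```python
ARCHETYPES = ("mathematics", "code", "esperanto")


def mixture(**raw_weights):
    """Normalize non-negative raw weights so that they sum to 1."""
    weights = [float(raw_weights.get(name, 0.0)) for name in ARCHETYPES]
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return {name: w / total for name, w in zip(ARCHETYPES, weights)}


def dominant(view):
    """Return the archetype with the largest weight."""
    return max(view, key=view.get)


# Hypothetical expert who leans mostly toward Code with some Mathematics.
view = mixture(mathematics=1, code=3)
print(view, dominant(view))
```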

Table 1: Training and Views of Ontologists Interviewed

Ontologies as Mathematics

• Value = Formal consistency (because…)
  • …Ultimate goal is computational reasoning across ontologies

• A single, unifying ontology covering the whole of biology and medicine is possible to design and pursue

• It need not be complete and should only contain established knowledge in order to approximate the underlying reality

• Quote: “unless you have a core of terms and relations which is universally valid, however small it might be, then you’re always going to have some kind of slack in your ontology…fall short of rigorous…”

• No need to represent uncertainty, hypotheses or speculations
  • Introduction of probability will lead to “…results of quite low value.”
  • First order logic (tools) and computationally tractable subsets of logic are appropriate tools for …inference across rigorous ontologies.
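
To make "formal consistency" and "inference" concrete, here is a minimal sketch, assuming a toy ontology with only transitive is-a links and declared disjoint classes. It is not a first-order reasoner, and the terms are deliberately oversimplified placeholders chosen to trigger a contradiction.

```python
def is_a_closure(edges):
    """Return all (child, ancestor) pairs implied by transitivity of is-a."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def inconsistencies(edges, disjoint_pairs):
    """Find terms that fall under two classes declared disjoint."""
    closure = is_a_closure(edges)
    problems = []
    for x, y in disjoint_pairs:
        under_x = {child for child, anc in closure if anc == x}
        under_y = {child for child, anc in closure if anc == y}
        for term in sorted(under_x & under_y):
            problems.append((term, x, y))
    return problems


# Toy, deliberately oversimplified assertions (hypothetical, not curated).
edges = {
    ("enzyme", "protein"),
    ("protein", "molecule"),
    ("ribozyme", "enzyme"),
    ("ribozyme", "rna"),
}
disjoint = [("protein", "rna")]

print(sorted(is_a_closure(edges)))
print(inconsistencies(edges, disjoint))
```

In the Mathematics view a reported clash like this would be resolved before release, so that every inference drawn from the ontology stays trustworthy.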

Ontologies as Mathematics (cont.)

• “Every ontology ever built should have the same upper [level] ontology, ideally.”
  • Examples: BFO, SUMO, Cyc

• Upper-level ontologies will compete for scientific attention until the best wins out

• Training: computer scientists and philosophers

Ontologies as Code

• Value: utility
  • Practical value should trump mathematical elegance.

• Ontologies should be designed specifically for a range of special or general purposes (like the languages C++ and HTML)

• Quote: “I view ontologies primarily as software artifacts.”
  • An ontology should serve its function and intended user community (even if small)
  • The number of ontologies should be equal to or greater than the number of projects requiring structured knowledge representations
  • “Let a thousand flowers bloom” (Mao): let users create their own custom ontologies
  • Design choices (of the ontology) are secondary to desired utility
  • Explicitly OPPOSED to the view of a unified ontology for the whole of biomedicine

Ontologies as Code (cont.)

• “overly abstract mathematical ontologies provide a false sense of certainty. They obscure distinctions that might be useful to a particular task, and make unnecessary distinctions.”

• Abstract, upper level ontologies are disconnected from reality and may not have utility.

• Ontologies should be evaluated based on usability and efficiency in the context of specific problems.

• No unification of all ontologies: all ontologies can co-exist in peace

• This group = medical researchers, clinical researchers, bioinformaticists and biologists

Ontologies as Esperanto

• Value: facilitation of cross-community communication

• Ontologies should cross-link concepts from different domains to allow for knowledge transfer and insight between areas, even if imperfectly.

• Motivated by the possibility of making data computable over fields, experimental techniques, countries and time periods.

• A unified ontology is unrealistic

• Practical solution is “a federated interlinkage…a grid or a network of ontologies and vocabularies…”

• Systematically borrowing terms between ontologies is essential to create productive overlaps that reduce redundancy and facilitate cross-communication.

• Complete cross-mapping is not needed, only enough mapping to compute over datasets as a whole.

• Ontology construction requires diplomatic social activity to coordinate between scientists and fields (besides deep domain knowledge and design precision).

• Linguists
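
The idea that a partial cross-mapping is enough to compute over datasets as a whole can be sketched as follows. Everything here is hypothetical: the two local vocabularies, the shared concept identifiers, and the function name `pooled_counts` are invented for illustration.

```python
from collections import Counter

# Hypothetical local vocabularies used by two datasets.
dataset_a = ["heart attack", "stroke", "heart attack", "rash"]
dataset_b = ["myocardial infarction", "cerebrovascular accident", "pruritus"]

# Partial cross-mapping to shared concept identifiers (made-up IDs).
# "rash" and "pruritus" are intentionally left unmapped.
mapping = {
    "heart attack": "C:MI",
    "myocardial infarction": "C:MI",
    "stroke": "C:CVA",
    "cerebrovascular accident": "C:CVA",
}


def pooled_counts(datasets, term_map):
    """Count records per shared concept; unmapped terms keep their local label."""
    counts = Counter()
    for records in datasets:
        for term in records:
            counts[term_map.get(term, term)] += 1
    return counts


counts = pooled_counts([dataset_a, dataset_b], mapping)
print(counts.most_common())
```

Mapped terms pool across both datasets, while unmapped terms remain usable locally, which is the imperfect but workable interlinkage the Esperanto view argues for.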

How they view each other

• Mathematics vs. Code and Esperanto
  • Suggests that the Code and Esperanto approaches are messy and inconsistent, even “silly and childish”.
  • Esperanto and Code ontologies are inefficient to improve
  • Rarely able to reason over Esperanto or Code ontologies without using probability to allow for contradiction and error.

• Mathematics vs. Esperanto
  • Sees efforts to integrate domain-specific ontologies as compromising half-measures that abandon the potential strength of unification

How they view each other (cont.)

• Code and Esperanto vs. Mathematics
  • Mathematics approach is utopian
  • Of little practical use
  • Even potentially sinister (“one mother ontology to serve all purposes and in the darkness bind them”, after The Lord of the Rings)

• Code vs. Mathematics
  • Mathematics ontologies are incomplete
  • Unrepresentative of relevant knowledge in an area
  • Hence, unproductive
  • Mathematics ontologies are rigid and artificial to domain experts

• Esperanto vs. Code
  • Environment is “eclectic chaos”
  • Multiplying unnecessary redundancy
  • Failing to exploit natural linking opportunities across knowledge

Ontology Challenges posed by Text Mining

• Multiple levels of granularity co-exist in scientific literature
  • E.g., “protein methylation”
    • Molecular Biology: “PRMT5 methylates histones H3 and H4”
    • Chemistry: multistage process.

• Therefore, if we extract information from (legacy) text, we cannot commit to a single representation if we want to retain fidelity to the source.

• Disagreements persist in the scientific communities: if we wish to retain fidelity (without arbitrary censorship), the disagreements must be retained.

• Objects in ontologies change over time, so mentions in text may also change (e.g., childhood lifecycle: changes of outcomes based on time of exposure: measles)

• Must retain uncertainty and ambiguity: theories and symbols change (e.g., early “tubulin” later became “alpha-tubulin”, “beta-tubulin”, etc.)
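
Retaining disagreement rather than censoring it can be sketched as a claim store that keeps every extracted statement with its provenance and flags statements that sources both assert and negate. This is a minimal illustration under assumptions: the `Claim` fields, the entity names, and the source labels are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str
    relation: str
    obj: str
    polarity: bool  # True = asserted, False = negated
    source: str     # provenance of the extracted statement


def find_disagreements(claims):
    """Group claims by statement and keep those with both polarities."""
    by_statement = defaultdict(list)
    for claim in claims:
        key = (claim.subject, claim.relation, claim.obj)
        by_statement[key].append(claim)
    return {
        statement: group
        for statement, group in by_statement.items()
        if {c.polarity for c in group} == {True, False}
    }


# Hypothetical extracted claims; nothing is discarded on conflict.
claims = [
    Claim("protein_x", "activates", "protein_y", True, "paper_1"),
    Claim("protein_x", "activates", "protein_y", False, "paper_2"),
    Claim("protein_x", "binds", "protein_z", True, "paper_1"),
]

disagreements = find_disagreements(claims)
for statement, group in disagreements.items():
    print(statement, sorted(c.source for c in group))
```

Conflicting claims stay in the store and are surfaced for review, instead of one side being silently dropped to keep the representation consistent.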

Conclusions and Next steps

• “These challenges suggest a new virtue: Representativeness” (Esperanto view)

• If ontologies are employed to index biomedical knowledge and to discover it, they must maintain inconsistent biomedical claims (just as research scientists attempt to do)

• Inconsistencies should not be ignored, as they point to theoretical weaknesses and opportunities.

• Suggest that
  • All three ontology perspectives need to be honored
  • Usability of an ontology for a particular community should NOT be compromised
  • Additional efforts to maximize an ontology's mathematical rigor will improve its re-use and facilitate integrative analysis and discovery across biomedicine

In caBIG

• ICR and TBPT F2F in August 2011: Misunderstanding between research scientists, bioinformaticists and computer scientists (Esperanto, Code and Mathematics)

• Semantic Infrastructure: discovery across federated services via ontologies (SPARQL) (Mathematics)

• Terminologies/Ontologies as fit for purpose in specific circumstances
  • (BioPortal with >400 available)

• Tension between software developer teams and users over the terms they want/need (Code) vs. infrastructure teams that desire “higher level utility” (e.g., discovery across federated data via structured ontologies)

• Need for quick changes and local terminologies (“Dynamic Extensions”) (Code) vs. need for consistency and rigor for federated discovery (Mathematics)

• SAIF Levels of Abstraction (Conceptual, Logical, Implementable) and Viewpoints (Information, Business, etc.)

Suggestions?

• There are three valid points of view regarding the use and value of ontologies

• As technologists (possibly leaning toward the Mathematics side), we should not foist our opinions and values on the Code and Esperanto camps.

Questions, Comments, Suggestions?