ONTOLOGIES
What Are Ontologies, and Why Do We Need Them?
B. Chandrasekaran and John R. Josephson, Ohio State University
V. Richard Benjamins, University of Amsterdam
Theories in AI fall into two broad categories: mechanism theories and content theories. Ontologies are content theories about the sorts of objects, properties of objects, and relations between objects that are possible in a specified domain of knowledge. They provide potential terms for describing our knowledge about the domain.
In this article, we survey the recent development of the field of ontologies in AI. We point to the somewhat different roles ontologies play in information systems, natural-language understanding, and knowledge-based systems (KBS). Most research on ontologies focuses on what one might characterize as domain factual knowledge, because knowledge of that type is particularly useful in natural-language understanding. There is another class of ontologies that is important in KBS: ontologies that help in sharing knowledge about reasoning strategies or problem-solving methods. In a follow-up article, we will focus on method ontologies.
Ontology as vocabulary
In philosophy, ontology is the study of the kinds of things that exist. It is often said that ontologies "carve the world at its joints." In AI, the term ontology has largely come to mean one of two related things. First of all, ontology is a representation vocabulary, often specialized to some domain or subject matter. More precisely, it is not the vocabulary as such that qualifies as an ontology, but the conceptualizations that the terms in the vocabulary are intended to capture. Thus, translating the terms in an ontology from one language to another, for example from English to French, does not change the ontology conceptually. In engineering design, you might discuss the ontology of an electronic-devices domain, which might include vocabulary that describes conceptual elements (transistors, operational amplifiers, and voltages) and the relations between these elements (operational amplifiers are a type-of electronic device, and transistors are a component-of operational amplifiers). Identifying such vocabulary, and the underlying conceptualizations, generally requires careful analysis of the kinds of objects and relations that can exist in the domain.
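As a minimal sketch of the electronic-devices example, the vocabulary and its relations can be written down as plain data. The Python names used here (`CONCEPTS`, `RELATIONS`, `related`) are illustrative assumptions and do not come from any system discussed in this article.

```python
# Hypothetical encoding of the electronic-devices vocabulary (names are assumptions).
# Concepts name the conceptual elements of the domain.
CONCEPTS = {"electronic device", "operational amplifier", "transistor", "voltage"}

# Relations are (subject, relation, object) triples between concepts.
RELATIONS = {
    ("operational amplifier", "type-of", "electronic device"),
    ("transistor", "component-of", "operational amplifier"),
}


def related(subject: str, relation: str) -> set[str]:
    """Return every concept that `subject` is linked to by `relation`."""
    return {o for (s, r, o) in RELATIONS if s == subject and r == relation}


print(related("transistor", "component-of"))
```

Renaming the terms (for instance, into French) would change the strings but not the structure of the triples, which is the sense in which the conceptualization, rather than the vocabulary, is the ontology.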
In its second sense, the term ontology is sometimes used to refer to a body of knowledge describing some domain, typically a commonsense knowledge domain, using a representation vocabulary. For example, CYC¹ often refers to its knowledge representation of some area of knowledge as its ontology.
In other words, the representation vocabulary provides a set of terms with which to describe the facts in some domain, while the body of knowledge using that vocabulary is a collection of facts about a domain. However, this distinction is not as clear as it might first appear. In the electronic-device example, that transistor is a component-of operational amplifier or that the latter is a type-of electronic device is just as much a fact about its domain as a CYC fact about some aspect of space, time, or numbers. The distinction is that the former emphasizes the use of ontology as a set of terms for representing specific facts in an instance of the domain, while the latter emphasizes the view of ontology as a general set of facts to be shared.
THIS SURVEY PROVIDES A CONCEPTUAL INTRODUCTION TO ONTOLOGIES AND THEIR ROLE IN INFORMATION SYSTEMS AND AI. THE AUTHORS ALSO DISCUSS HOW ONTOLOGIES CLARIFY THE DOMAIN'S STRUCTURE OF KNOWLEDGE AND ENABLE KNOWLEDGE SHARING.
20 1094-7167/99/$10.00 © 1999 IEEE IEEE INTELLIGENT SYSTEMS
There continue to be inconsistencies in the usage of the term ontology. At times, theorists use the singular term to refer to a specific set of terms meant to describe the entity and relation types in some domain. Thus, we might speak of an ontology "for liquids" or "for parts and wholes." Here, the singular term stands for the entire set of concepts and terms needed to speak about phenomena involving liquids and parts and wholes.
When different theorists make different proposals for an ontology or when we speak about ontology proposals for different domains of knowledge, we would then use the plural term ontologies to refer to them collectively. In AI and information-systems literature, however, there seems to be inconsistency: sometimes we see references to "ontology of domain" and other times to "ontologies of domain," both referring to the set of conceptualizations for the domain. The former is more consistent with the original (and current) usage in philosophy.
Ontology as content theory
The current interest in ontologies is the latest version of AI's alternation of focus between content theories and mechanism theories. Sometimes, the AI community gets excited by some mechanism such as rule systems, frame languages, neural nets, fuzzy logic, constraint propagation, or unification. The mechanisms are proposed as the secret of making intelligent machines. At other times, we realize that, however wonderful the mechanism, it cannot do much without a good content theory of the domain on which it is to work. Moreover, we often recognize that once a good content theory is available, many different mechanisms might be used equally well to implement effective systems, all using essentially the same content.²
AI researchers have made several attempts to characterize the essence of what it means to have a content theory. McCarthy and Hayes's theory (epistemic versus heuristic distinction),³ Marr's three-level theory (information-processing strategy level, algorithms and data-structures level, and physical-mechanisms level),⁴ and Newell's theory (Knowledge Level versus Symbol Level)⁵ all grapple in their own ways with characterizing content. Ontologies are quintessentially content theories, because their main contribution is to identify specific classes of objects and relations that exist in some domain. Of course, content theories need a representation language. Thus far, predicate calculus-like formalisms, augmented with type-of relations (that can be used to induce class hierarchies), have been most often used to describe the ontologies themselves.
Why are ontologies important?
First, ontological analysis clarifies the structure of knowledge. Given a domain, its ontology forms the heart of any system of knowledge representation for that domain. Without ontologies, or the conceptualizations that underlie knowledge, there cannot be a vocabulary for representing knowledge. Thus, the first step in devising an effective knowledge-representation system, and vocabulary, is to perform an effective ontological analysis of the field, or domain. Weak analyses lead to incoherent knowledge bases.
An example of why performing good analysis is necessary comes from the field of databases.⁶ Consider a domain having several classes of people (for example, students, professors, employees, females, and males). The cited study first examined the way this database would be commonly organized: students, employees, professors, males, and females would be represented as types-of the class humans. However, this ontology has problems: students can also be employees at times, and they can stop being students. Further analysis showed that the terms student and employee do not describe categories of humans, but are roles that humans can play, while terms such as female and male more appropriately represent subcategories of humans. Clarifying the terminology in this way lets the ontology support coherent and cohesive reasoning.
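The distinction between subcategories and roles can be sketched in code. This is only an illustration under stated assumptions: the class names and the `roles` attribute are hypothetical choices made here, not the schema used in the cited database study.

```python
from dataclasses import dataclass, field


@dataclass
class Human:
    """A person. Roles can be gained and lost without changing what the person is."""
    name: str
    roles: set[str] = field(default_factory=set)


class Female(Human):
    """Subcategory of Human (modeled as a subclass)."""


class Male(Human):
    """Subcategory of Human (modeled as a subclass)."""


pat = Female("Pat")
pat.roles.add("student")
pat.roles.add("employee")     # a student can also be an employee
pat.roles.discard("student")  # and can stop being a student

print(isinstance(pat, Human), sorted(pat.roles))
```

Had student and employee been modeled as subclasses of `Human`, a person who is both, or who stops being one, would be awkward to represent; treating them as roles avoids that problem.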
Second, ontologies enable knowledge sharing. Suppose we perform an analysis and arrive at a satisfactory set of conceptualizations, and their representative terms, for some area of knowledge, for example, the electronic-devices domain. The resulting ontology would likely include domain-specific terms such as transistors and diodes; general terms such as functions, causal processes, and modes; and terms that describe behavior, such as voltage. The ontology captures the intrinsic conceptual structure of the domain. In order to build a knowledge representation language based on the analysis, we need to associate terms with the concepts and relations in the ontology and devise a syntax for encoding knowledge in terms of the concepts and relations. We can share this knowledge representation language with others who have similar needs for knowledge representation in that domain, thereby eliminating the need for replicating the knowledge-analysis process. Shared ontologies can thus form the basis for domain-specific knowledge-representation languages. In contrast to the previous generation of knowledge-representation languages (such as KL-One), these languages are content-rich; they have a large number of terms that embody a complex content theory of the domain.
Shared ontologies let us build specific knowledge bases that describe specific situations. For example, different electronic-devices manufacturers can use a common vocabulary and syntax to build catalogs that describe their products. Then the manufacturers could share the catalogs and use them in automated design systems. This kind of sharing vastly increases the potential for knowledge reuse.
Describing the world
We can use the terms provided by the domain ontology to assert specific propositions about a domain or a situation in a domain. For example, in the electronic-device domain, we can represent a fact about a specific circuit: "circuit 35 has transistor 22 as a component," where circuit 35 is an instance of the concept circuit and transistor 22 is an instance of the concept transistor. Once we have the basis for representing propositions, we can also represent knowledge involving propositional attitudes (such as hypothesize, believe, expect, hope, desire, and fear). Propositional attitude terms take propositions as arguments. Continuing with the electronic-device domain, we can assert, for example: "the diagnostician hypothesizes or believes that part 2 is broken," or "the designer expects or desires that the power plant has an output of 20 megawatts." Thus, an ontology can represent beliefs, goals, hypotheses, and predictions about a domain, in addition to simple facts. The ontology also plays a role in describing such things as plans and activities, because these also require specification of world objects and relations. Propositional attitude terms are also part of a larger ontology of the world, useful especially in describing the activities and properties of the special class of objects in the world called "intensional entities," for example, agents such as humans who have mental states.
JANUARY/FEBRUARY 1999 21
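One way to picture propositions and propositional attitudes is as nested records, where an attitude takes a proposition as its argument. The sketch below is an assumption-laden illustration: the class names `Proposition` and `Attitude`, and relation names such as `has-component` and `has-output`, are hypothetical labels chosen here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """A relation applied to arguments, e.g. has-component(circuit-35, transistor-22)."""
    relation: str
    args: tuple[str, ...]


@dataclass(frozen=True)
class Attitude:
    """An agent's attitude (believes, hypothesizes, expects, ...) toward a proposition."""
    agent: str
    attitude: str
    content: Proposition


# A simple fact about a specific circuit.
fact = Proposition("has-component", ("circuit-35", "transistor-22"))

# Propositional attitudes take propositions as arguments.
belief = Attitude("diagnostician", "believes", Proposition("broken", ("part-2",)))
expectation = Attitude(
    "designer", "expects", Proposition("has-output", ("power-plant", "20 megawatts"))
)

print(belief)
```

The same `Proposition` type serves both the plain fact and the content of each attitude, which mirrors the point that beliefs and expectations are built on top of the vocabulary for simple facts.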
Constructing ontologies is an ongoing research enterprise. Ontologies range in abstraction, from very general terms that form the foundation for knowledge representation in all domains, to terms that are restricted to specific knowledge domains. For example, space, time, parts, and subparts are terms that apply to all domains; malfunction applies to engineering or biological domains; and hepatitis applies only to medicine.
Even in cases where a task might seem to be quite domain-specific, knowledge representation might call for an ontology that describes knowledge at higher levels of generality. For example, solving problems in the domain of turbines might require knowledge expressed using domain-general terms such as flows and causality. Such general-level descriptive terms are called the upper ontology or top-level ontology. There are many open research issues about the correct ways to analyze knowledge at the upper level. To provide some idea of the issues involved, Figure 1 excerpts a quote from a recent call for papers.
Today, ontology has grown beyond philosophy and now has many connections to information technology. Thus, research on ontology in AI and information systems has had to produce pragmatically useful proposals for top-level ontology. Organizing a top-level ontology poses a number of problems, similar to the problems that surround ontology in philosophy. For example, many ontologies have thing or entity as their root class. However, Figure 2 illustrates that thing and entity start to diverge at the next level.
For example, CYC's thing has the subcategories individual object, intangible, and represented thing; the Generalized Upper Model's⁷ (GUM) um-thing has the subcategories configuration, element, and sequence; WordNet's⁸ thing has the subcategories living thing and nonliving thing; and Sowa's root T has the subcategories concrete, process, object, and abstract. (Natalya Fridman Noy and Carol Hafner's article discusses these differences more fully.⁹) Some of these differences arise because not all of these ontologies are intended to be general-purpose tools, or even explicitly to be ontologies. Another reason for the differences is that, in principle, there are many different taxonomies.
Although ontologies differ from one another, they generally agree on many issues:
• There are objects in the world.
• Objects have properties or attributes that can take values.
• Objects can exist in various relations with each other.
• Properties and relations can change over time.
• There are events that occur at different time instants.
• There are processes in which objects participate and that occur over time.
• The world and its objects can be in different states.
• Events can cause other events or states as effects.
• Objects can have parts.
The representational repertoire of objects, relations, states, events, and processes does not say anything about which classes of these entities exist. The modeler of the domain makes these commitments. As we move from an ontology's top to lower taxonomic levels, commitments specific to domains and phenomena appear. For modeling objects on earth, we can make certain commitments. For example, animals, minerals, and plants are subcategories of objects; has-life(x) and contains-carbon(x) are object properties; and can-eat(x, y) is a possible relation between any two objects. These commitments are specific to objects and phenomena in this domain. Further, the commitments are not arbitrary. For them to be useful, they should reflect some underlying reality.
There is no sharp division between domain-independent and domain-specific ontologies for representing knowledge. For example, the terms object, physical object, device, engine, and diesel engine all describe objects, but in an order of increasing domain specificity. Similarly, terms for relations between objects can span a range of specificity, such as connected, electrically-connected, and soldered-to.
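The graded specificity of these terms can be illustrated with a chain of type-of links. The sketch assumes, for illustration only, that each term in the list is a direct subtype of the one before it; the names `TYPE_OF`, `ancestors`, and `is_a` are hypothetical.

```python
# Assumed type-of links: each term maps to its immediately more general term.
TYPE_OF = {
    "diesel engine": "engine",
    "engine": "device",
    "device": "physical object",
    "physical object": "object",
}


def ancestors(term: str) -> list[str]:
    """Follow type-of links upward, from the nearest to the most general term."""
    chain = []
    while term in TYPE_OF:
        term = TYPE_OF[term]
        chain.append(term)
    return chain


def is_a(term: str, general: str) -> bool:
    """True if `term` equals `general` or reaches it through type-of links."""
    return term == general or general in ancestors(term)


print(ancestors("diesel engine"))
```

Nothing in the chain marks a point where the terms stop being domain-independent and start being domain-specific; specificity simply increases with each step down.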
Subtypes of concepts. Ontologies generally appear as a taxonomic tree of conceptualizations, from very general and domain-independent at the top levels to increasingly domain-specific further down in the hierarchy. We mentioned earlier that different ontologies propose different subtypes of even very general concepts. This is because, as a rule, different sets of subcategories...