1
Ontologies
Piek Vossen
VU University Amsterdam
2
Overview
• Ontologies versus lexicons
• Ontological starting points
• Comparison of available ontologies
• Identity criteria
• Basic Formal Ontology
3
Why ontologies?
• Lexicons of the future will depend on ontologies:
– Semantic data in the lexicon partially reflects world knowledge;
– World knowledge is stored externally, for example in the Open Data Cloud: a network of RDF data resources
• Lexicons contain linguistic knowledge that is not in an encyclopedia
4
World knowledge in Wordnet
• POS: v ID: ENG20-02177556-v BCS: 1
Synonyms: sell:1
Definition: exchange or deliver for money or its equivalent
Domain: commerce
SUMO/MILO: Selling
-> [hypernym] exchange:1, change:7, interchange:1
transfer:5
• POS: v ID: ENG20-02143689-v BCS: 2
Synonyms: buy:1, purchase:1
Definition: obtain by purchase; acquire by means of a financial transaction
Domain: commerce
SUMO/MILO: Buying
-> [hypernym] get:1, acquire:1
5
SUMO
• Selling
– (documentation Selling EnglishLanguage "A FinancialTransaction in which an instance of Physical is exchanged for an instance of CurrencyMeasure.")
• Buying
– (documentation Buying EnglishLanguage "A FinancialTransaction in which an instance of CurrencyMeasure is exchanged for an instance of Physical.")
• FinancialTransaction
– (documentation FinancialTransaction EnglishLanguage "A Transaction where an instance of Currency is exchanged for something else.")
6
Lexicon ontology mapping
Lexicon:
sell: subj(x), direct obj(z), indirect obj(y)
buy: subj(y), direct obj(z), indirect obj(x)
Ontology:
(and (instance x Human) (instance y Human) (instance z Entity) (instance e FinancialTransaction) (source x e) (destination y e) (patient z e))
The same process, seen from a different perspective through subject and object realization. Other examples: Russian has two verbs for "marry"; French apprendre can mean both "teach" and "learn".
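The lexicon-to-ontology mapping above can be sketched in code. This is only a minimal illustration: the names `FRAMES`, `ROLES` and `to_roles`, and the example fillers, are hypothetical and do not come from the slides. It shows how the different argument frames of sell and buy project onto the same ontological roles of one transaction.

```python
# Hypothetical sketch: FRAMES, ROLES and to_roles are illustrative names.
# Lexicon side: each verb maps grammatical functions to the variables
# x, y, z used on the slide.
FRAMES = {
    "sell": {"subj": "x", "direct_obj": "z", "indirect_obj": "y"},
    "buy": {"subj": "y", "direct_obj": "z", "indirect_obj": "x"},
}

# Ontology side: the role each variable plays in the event e.
ROLES = {"x": "source", "y": "destination", "z": "patient"}


def to_roles(verb, arguments):
    """Map the syntactic arguments of a verb to ontological roles."""
    frame = FRAMES[verb]
    return {ROLES[frame[function]]: filler for function, filler in arguments.items()}


sold = to_roles("sell", {"subj": "Mary", "direct_obj": "book", "indirect_obj": "John"})
bought = to_roles("buy", {"subj": "John", "direct_obj": "book", "indirect_obj": "Mary"})
print(sold == bought)  # both sentences describe the same role assignment
```

Keeping the perspective (who is subject, who is object) in the lexicon and the role assignment in the ontology is what lets two different verbs share one event description.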
8
Evolution of the web
9
Knowledge pyramid
[Figure: knowledge pyramid. Labels: Google index; the web; RDF databases; social networks; social computer networks; social computer & human networks]
10
Ontologies versus Lexicons
• A lexicon contains the knowledge about words and expressions that is necessary to communicate effectively in a language;
• The lexicon interacts with the grammar and the discourse model;
• Lexical knowledge is part of general knowledge of the world;
• Lexical knowledge is subconscious knowledge (like playing the piano), whereas our knowledge of the world is of a higher level (like the theory of harmony).
11
Ontologies versus lexicons
• Language is an instrument for communication:
– utterances are never completely descriptive
– Minimal & sufficient information for a communicative effect (Gricean maxims)
12
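Newspaper headings & captions
Vrij Nederland “Geknipt voor u” ("Clipped for you")
• Veel vrouwen verdienen minimumloon (Many women earn / deserve the minimum wage)
• Herder bijt schaap (Shepherd / shepherd dog bites sheep)
• Zwembad loopt leeg (Swimming pool empties)
• Dames lopen uit (Ladies run out / pull ahead)
• Winkelende vrouw raakt geld kwijt (Shopping woman loses money)
• Dode zwemmer (Dead swimmer)
• Vrouw draagt kruis paus (Woman carries the pope's cross)
• Eieren gooien terug op braderie (Egg throwing returns to the fair / Eggs throw back at the fair)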
13
Ontologies versus lexicons
• Speakers/writers make assumptions about the addressee:
– Knowledge of the world (Schank ('70): grammar does not exist, conceptual dependencies)
– Knowledge of language
– Knowledge about the communicative settings
14
Ontologies versus lexicon
• Multilingual perspective sheds light on the delineation of lexical and world knowledge:
– water = substance & mass noun
– sand = substance & mass noun but granular
– grass = substance & mass noun but granular
– rice, bran (Dutch plural: zemelen), chives (Dutch uncountable: bieslook) = substance? & mass noun or plural; oats (Dutch haver, havervlokken, havermeel)
– forest = group noun: one, two forests (Dutch bos = group and mass: een, twee bossen, veel bos)
• Linguistic variation around border cases:
– limited forms -> symbolic
– infinite & analogue reality
15
Autonomous & Language-Specific
[Figure: two hierarchies compared.
Dutch Wordnet: voorwerp {object} with lepel {spoon}, werktuig {tool}, tas {bag}, bak {box}, blok {block}, lichaam {body}.
Wordnet1.5: object; natural object (an object occurring naturally); artifact, artefact (a man-made object); instrumentality; container, device, implement; tool, instrument; bag, spoon, box, block, body.]
16
Linguistic versus Artificial Ontologies
Artificial ontology:
• aims at better control or performance, or a more compact and coherent structure;
• introduces artificial levels for concepts which are not lexicalized in a language (e.g. instrumentality, hand tool);
• neglects levels which are lexicalized but not relevant for the purpose of the ontology (e.g. tableware, silverware, merchandise).
What properties can we infer for spoons?
spoon -> container; artifact; hand tool; object; made of metal or plastic; for eating, pouring or cooking
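The spoon question on this slide amounts to inheriting properties along an is-a chain. The sketch below is a rough illustration under stated assumptions: the chain in `HYPERNYM` and the property lists in `PROPERTIES` are hypothetical (only the class names and the "man-made" gloss appear in the slides), and `inferred_properties` is an illustrative helper, not part of any ontology toolkit.

```python
# Hypothetical is-a chain and property table. Only the class names and
# the "man-made" gloss are taken from the slides; the rest is assumed.
HYPERNYM = {
    "spoon": "hand tool",
    "hand tool": "instrumentality",
    "instrumentality": "artifact",
    "artifact": "object",
}
PROPERTIES = {
    "spoon": ["for eating, pouring or cooking"],
    "hand tool": ["operated by hand"],  # illustrative assumption
    "artifact": ["man-made"],
}


def inferred_properties(concept):
    """Collect the properties of a concept and of all its ancestors."""
    collected = []
    while concept is not None:
        collected.extend(PROPERTIES.get(concept, []))
        concept = HYPERNYM.get(concept)  # None once the root is passed
    return collected


print(inferred_properties("spoon"))
```

Artificial levels such as "instrumentality" carry no word of their own in the language, but they give the ontology a place to attach properties that every descendant inherits.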
17
Linguistic versus Artificial Ontologies
Linguistic ontology:
• Exactly reflects the relations between all the lexicalized words and expressions in a language.
• Captures valuable information about the lexical capacity of languages: what is the available fund of words and expressions in a language.
What words can be used to name spoons?
spoon -> object, tableware, silverware, merchandise, cutlery
18
Wordnets versus ontologies
• Wordnets:
– autonomous language-specific lexicalization patterns in a relational network.
– Usage: to predict substitution in text for information retrieval, text generation, machine translation, word-sense disambiguation.
• Ontologies:
– data structure with formally defined concepts.
– Usage: making semantic inferences.
19
Ontological starting points
• What is being defined: realists versus conceptualists
– scientific definition of the world
– cognitive, cultural perception and interpretation
• How much room for different perspectives?
• Engineering point of view: what is required by applications?
• Top level ontologies versus domain ontologies
• Principles for ontology design
• Sharing, re-use, interoperability
20
Comparing available ontologies
• Mascardi, Cordì, and Rosso (2008)
• 7 different Upper Ontologies: BFO, Cyc, DOLCE, GFO, PROTON, Sowa’s ontology, and SUMO
• Software engineering criteria:
– Number of Dimensions
– Implementation language(s)
– Modularity
– Use in Applications
– Alignment with WordNet
– Licensing
21
Basic Formal Ontology
BFO: http://www.ifomis.org/bfo
Developers: Smith, Grenon, Stenzhorn, Spear (IFOMIS)
Dimensions: 36 classes related via the is_a relation
Modules: SNAP (snapshot ontologies indexed by times) and SPAN (a single videoscopic ontology)
Applications: Biomedical domain; used in building an ontology for clinico-genomic trials on cancer
Alignment with WordNet: No
Language: OWL
License: Free
22
Cyc
Cyc: http://www.cyc.com/
Developers: Cycorp
Dimensions: 300,000 concepts, 3,000,000 assertions (facts and rules), 15,000 relations
Modules: The “microtheory” approach supports modularity
Applications: Domains of NLP, e.g. WSD and Q&A; network risk assessment; terrorism-related
Alignment with WordNet: Links to 12,000 synsets
Language: CycL, OWL
License: Commercial; OpenCyc for research
23
DOLCE
DOLCE: http://www.loa-cnr.it/DOLCE.html
Developers: Guarino et al. of the LOA
Dimensions: 100 concepts, 100 axioms
Modules: Not currently divided into modules (planned)
Applications: LOIS Project, SmartWeb, Language Technology for eLearning, AsIsKnown
Alignment with WordNet: Links to 100 synsets
Language: First Order Logic, KIF, OWL
License: Free
24
GFO
GFO: http://www.onto-med.de/ontologies/gfo.html
Developers: Onto-Med Research Group
Dimensions: 79 classes, 97 subclass relations, 67 properties
Modules: 3-layered architecture: abstract top level, abstract core level, and basic level. Several ontological modules, incl. functions and roles
Applications: Ontological foundation of conceptual modelling and biomedical science: Gene Ontology, Celltype Ontology, Chemical Entities of Biological Interest Ontology, GFO-Bio
Alignment with WordNet: No
Language: First Order Logic and KIF (forthcoming); OWL
License: Released under the modified BSD Licence
25
PROTON
PROTON: http://proton.semanticweb.org/
Developers: Ontotext Lab, Sirma
Dimensions: 300 concepts, 100 properties
Modules: 3 levels including 4 modules
Applications: Different domains and purposes, e.g. semantic annotation, knowledge management systems in the legal and telecommunications domains (projects MediaCampaign, ISTWorld, Business Data Ontology for Semantic Web Services)
Alignment with WordNet: No
Language: OWL Lite
License: Free
26
John Sowa
John Sowa: http://www.jfsowa.com/ontology/
Developers: Sowa
Dimensions: 30 classes, 5 relationships, 30 axioms
Modules: Not explicitly divided into modules
Applications: Inspired many other upper ontologies
Alignment with WordNet: No
Language: 1st Order Modal Language, KIF
License: Free
27
SUMO/MILO
SUMO: http://www.ontologyportal.org/
Developers: Niles, Pease, Menzel
Dimensions: 20,000 terms, 60,000 axioms (incl. domain ontologies)
Modules: Mid-Level Ontology, and ontologies for a range of specialized domains
Applications: Many papers report on usage (from academic to government to industrial), among which NLP and “pure” representation and reasoning
Alignment with WordNet: All synsets of WN3.0
Language: SUO-KIF, OWL
License: [not stated]
28
OntoClean (Guarino & Welty)
• Methodology for designing and building ontologies that ease re-use and integration
• Intuitions on how we, as cognitive agents, interact with the world (sensory system, cognition & culture)
• Purpose: to design ontologies for information systems
29
Basic Notions
• Identity through an essential (intrinsic) property, e.g. DNA, a person’s brain
• What properties can change while maintaining identity?
• Other ways of establishing identity:
– Being a member of a class: does not keep the individual members apart
– Globally unique IDs: a hack that does not explain how two descriptions can be the same
30
Identity criteria (Guarino and Welty)
• Rigidity: to what extent are properties of an entity true in all or most worlds? E.g., a man is always a person but may bear a Role like student only temporarily. Thus manhood is a rigid property while studenthood is anti-rigid
• Essence: which properties of entities are essential? For example, “shape” is an essential property of “vase” but not an essential property of the clay it is made of.
• Unicity: which entities represent a whole and which entities are parts of these wholes? An “ocean” or “river” represents a whole but the “water” it contains does not.
31
Individuals and Concepts
• The term "meta-property" adopted here is based on a fundamental distinction within the domain of discourse:
– individuals or particulars vs. concepts or universals
• Meta-level properties induce distinctions among concepts, while object-level properties induce distinctions among individuals
32
Rigidity
• A property is essential to an individual iff it necessarily holds for that individual
• A property is rigid (+R) iff, necessarily, it is essential to all its instances. A property is non-rigid (-R) iff it is not essential to some of its instances, and anti-rigid (~R) iff it is essential to none of its instances
• Person vs Student
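The three cases can be written in modal notation. This is a sketch following the usual OntoClean formulation; the box (necessity) and diamond (possibility) operators are an assumption of this rendering, since the slide gives the definitions in prose only. Here \phi is the property being classified.

```latex
\begin{align*}
\text{rigid } (+R):           &\quad \Box\,\forall x\,\bigl(\phi(x) \rightarrow \Box\,\phi(x)\bigr) \\
\text{non-rigid } (-R):       &\quad \Diamond\,\exists x\,\bigl(\phi(x) \wedge \neg\Box\,\phi(x)\bigr) \\
\text{anti-rigid } ({\sim}R): &\quad \Box\,\forall x\,\bigl(\phi(x) \rightarrow \neg\Box\,\phi(x)\bigr)
\end{align*}
```

On this reading Person is rigid, because whatever is a person is necessarily a person, while Student is anti-rigid, because no student is necessarily a student. Anti-rigid is the stronger case of non-rigid.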
33
Identity
• A property carries an identity criterion (+I) iff all its instances can be (re)identified by means of a suitable sameness relation. A property supplies an identity criterion (+O) iff that criterion is not inherited from any subsuming property
• Person vs. Student
34
Dependence
• An individual x is constantly dependent on y iff, at any time, x can't be present unless y is fully present, and y is not part of x. Ex: Hole/Host
• A property P is constantly dependent (+D) iff, for all its instances, there exists something they are constantly dependent on.
• Here Dependent = Constantly Dependent
35
Types vs. Roles
• A rigid property that supplies an identity criterion and is not (notionally) dependent is called a type.
• An anti-rigid property that is notionally dependent is called a role. It is a material role if it carries (but does not supply) an identity criterion, and a formal role otherwise.
• Person vs. Student vs. Part
36
Typology of meta properties
O     I    D     R    Kind                      Examples
-O    -I   +/-D  +R   CATEGORY                  LOCATION, ENTITY
-O    -I   +D    -R   UNDESIRABLE
-O    -I   +D    ~R   FORMAL ROLE               PART, PATIENT
-O    -I   -D    -R   ATTRIBUTION               RED
-O    +I   -D    -R   ATTRIBUTION & TYPE        RED PERSON
+O    +I   +/-D  +R   TYPE                      FLOWER, PERSON
+O    +I   -D    -R   UNDESIRABLE
+O    +I   -D    ~R   PHASE SORTAL              CATERPILLAR
+O    +I   +D    -R   X
+/-O  +I   +D    ~R   MATERIAL ROLE             STUDENT, FOOD
-O    +I   +D    -R   UNDESIRABLE
-O    +I   +/-D  +R   MERELY ESSENTIAL SORTAL   INVERTEBRATE, MAMMAL
+O    -I              INCOHERENT
O = carries its own identity
I = carries an identity condition, possibly inherited
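The typology table on this slide can be read as a lookup from a combination of meta-properties to a kind of property. The following is a minimal sketch of that reading; `ROWS`, `TYPOLOGY` and `classify` are hypothetical names, and the row marked X on the slide is left out because its label is not given.

```python
from itertools import product

# Each row: (O, I, D, R, kind). "+-" stands for the slide's "+/-",
# meaning the row covers both values. Names here are illustrative.
ROWS = [
    ("-", "-", "+-", "+R", "category"),
    ("-", "-", "+", "-R", "undesirable"),
    ("-", "-", "+", "~R", "formal role"),
    ("-", "-", "-", "-R", "attribution"),
    ("-", "+", "-", "-R", "attribution & type"),
    ("+", "+", "+-", "+R", "type"),
    ("+", "+", "-", "-R", "undesirable"),
    ("+", "+", "-", "~R", "phase sortal"),
    ("+-", "+", "+", "~R", "material role"),
    ("-", "+", "+", "-R", "undesirable"),
    ("-", "+", "+-", "+R", "merely essential sortal"),
]

# Expand the "+-" wildcards into one dictionary entry per combination.
TYPOLOGY = {}
for o, i, d, r, kind in ROWS:
    for o_value, d_value in product(o, d):
        TYPOLOGY[(o_value, i, d_value, r)] = kind


def classify(o, i, d, r):
    """Return the kind for a meta-property combination, or None if unlisted."""
    if o == "+" and i == "-":
        return "incoherent"  # own identity without any identity condition
    return TYPOLOGY.get((o, i, d, r))


print(classify("+", "+", "-", "+R"))
print(classify("-", "+", "+", "~R"))
```

Combinations that the slide does not list return `None` rather than a guessed label.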
37
Typology of meta properties
property
• Formal property (-I)
– Category: -I, +R (entity, location)
– Attribute: -I, -R, -D (red, male)
– Formal role: -I, ~R, +D (part, patient)
• Sortal (+I)
– Material role: +I, +D, ~R (student, food)
– Phase sortal: +I, -D, ~R (caterpillar)
– Type & Attribute: +I, -D, -R (red apple)
– Type: +I, +R (apple, person)
– Merely essential sortal: +I, +R (invertebrate, mammal)
Cross-cutting groups: Role (~R, +D); Anti-essential (~R); Non-essential (-R); Essential (+R)
non = not essential to some; anti = not essential to all
38
Extensionality
• An individual is said to be extensional iff, necessarily, everything that has the same proper parts is identical to it: amount of matter
• A property is extensional (+E) iff, necessarily, all its instances are extensional
• A property is anti-extensional (~E) iff, necessarily, all its instances are non-extensional, so that they can possibly change some parts while keeping their identity: persons and their bodies
39
Unity
• An individual is unified by a (suitably constrained) relation R iff it is a mereological sum of entities that are bound together by R. Example: the relation "having the same boss" may unify a group of employees in a company -> establishes a group
• An individual w is a whole under R iff it is maximally unified by R, in the sense that R is internal to w, and no part of w is linked by R to something that is not part of w
• A property P is said to carry unity (+U) if there is a common unifying relation R such that all the instances of P are essential wholes under R. A property carries anti-unity (~U) if all its instances can possibly be non-wholes. If every instance of P is an essential whole, but there is no unifying relation common to all instances of P, then we mark P with the property *U
40
Singularity and Plurality
• An individual is a singular whole iff its unifying relation is the transitive closure of the relation "strong connection", like that existing between two 3D regions that have a surface in common. Topological wholes of this kind have a special cognitive relevance, which accounts for the natural language distinction between singular and plural -> countability
• A plural individual is a sum of singular wholes that is not itself a singular whole. Plural individuals may be wholes themselves or not. In the former case they will be called collections; in the latter case pluralities
• A piece of coal is a singular whole. A lump of coal is a topological whole, but not a singular whole, since the pieces of coal merely touch each other, with no material connection. It is therefore a plural whole
42
Messy taxonomy
[Figure: taxonomy before cleaning. Root: entity (-I -U -D +R). Nodes: Location, Amount of matter, Red, Agent, Group, Country, Physical Object, Living being, Fruit, Food, Apple, Red Apple, Caterpillar, Butterfly, Animal, Vertebrate, Person, Organization, Group of people, Social entity, Legal entity]
43
Methodology
• Analyse each property according to meta-properties
• Remove all properties except for categories and essential sortals
• Remove subsumption between incompatible identity conditions
• Add phase sortals
• Add attributes, roles and mixed types
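The step of removing incompatible subsumption links can be sketched as a check on pairs of properties. The three constraints used below are the ones commonly cited in the OntoClean literature (an anti-rigid property cannot subsume a rigid one, +I cannot subsume -I, +D cannot subsume -D); they are not spelled out on the slides, so treat them, the `Prop` class and the `violations` helper as assumptions of this sketch. Person is assumed to be -D here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prop:
    """A property tagged with OntoClean meta-properties (illustrative)."""
    name: str
    rigidity: str    # "+R", "-R" or "~R"
    identity: bool   # True for +I
    dependent: bool  # True for +D


def violations(parent, child):
    """List the constraints broken if `parent` subsumes `child`."""
    problems = []
    if parent.rigidity == "~R" and child.rigidity == "+R":
        problems.append("anti-rigid cannot subsume rigid")
    if parent.identity and not child.identity:
        problems.append("+I cannot subsume -I")
    if parent.dependent and not child.dependent:
        problems.append("+D cannot subsume -D")
    return problems


student = Prop("Student", "~R", True, True)   # material role
person = Prop("Person", "+R", True, False)    # type, assumed -D

print(violations(student, person))  # Student above Person: flagged
print(violations(person, student))  # Person above Student: no conflict
```

A link with a non-empty result is a candidate for removal or for remodelling, for example by turning the offending parent into a role played by the child.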
44
Some conflicts
• car -> physical object + amount of matter
• animal -> living being + physical object
• organization -> group of people (+ME) + social entity (-ME) + legal agent
45
Cleaner taxonomy
entity: -I -U -D +R
Location: +O -U -D +R
Amount of matter: +O -U -D +R
Group: +O ~U -D +R
Physical Object: +O +U -D +R
Living being: +O +U -D +R
Fruit
Apple
Animal
Vertebrate: +I
Person
Organization: +O +U -D +R
Group of people: +I
Social entity: -I +U -D +R
46
Clean taxonomy
[Figure: cleaned taxonomy. Root: entity (-I -U -D +R). Nodes: Location, Amount of matter, Red, Agent, Group, Region, Physical Object, Living being, Fruit, Food, Apple, Red Apple, Caterpillar, Butterfly, Animal, Vertebrate, Person, Organization, Group of people, Social entity, Legal entity, Country, Lepidopteran. Nodes are annotated with meta-property signatures such as +O -U -D +R, +O +U -D +R and +I -O ~U -D +R]
47
Basic Formal Ontology
• Realist approach to ontology, based on science:
– independent of our linguistic, conceptual, theoretical, cultural representations
– reality existed before humans
• Perspectivalism:
– there are many different representations that are equally good -> different levels of granularity (atoms, molecules, organisms, ecosystems, galaxies)
• Fallibilism: science can be wrong
• Adequate: given the domain, choose the adequate granularity
48
Substances and processes exist in time in different ways
[Figure: a substance and a process shown against a time axis]
49
Snapshot ontology vs. video ontology
[Figure: a substance and a process shown against a time axis]
50
SNAP vs SPAN
• Objects vs. events
• Continuants vs. occurrents
• Nouns vs. verbs
• In preparing an inventory of reality, we keep track of these two different kinds of entities in two different ways
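One way to picture the two bookkeeping styles is a pair of record types with different queries. The sketch below is only an illustration: the class names, the integer time stamps and the example entities are hypothetical, and it does not reproduce BFO's actual formalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Continuant:
    """SNAP entity: wholly present at every instant at which it exists."""
    name: str
    exists_from: int
    exists_until: int


@dataclass(frozen=True)
class Occurrent:
    """SPAN entity: unfolds over an interval, phase by phase."""
    name: str
    start: int
    end: int


def snapshot(continuants, t):
    """SNAP view: which continuants exist at instant t?"""
    return [c.name for c in continuants if c.exists_from <= t <= c.exists_until]


def span(occurrents, start, end):
    """SPAN view: which occurrents overlap the interval [start, end]?"""
    return [o.name for o in occurrents if o.start <= end and start <= o.end]


# Hypothetical inventory with made-up time stamps.
things = [Continuant("patient", 0, 80), Continuant("tumour", 40, 45)]
events = [Occurrent("course of disease", 40, 45), Occurrent("clinical trial", 42, 44)]

print(snapshot(things, 41))
print(span(events, 43, 50))
```

The snapshot query takes a single instant, while the span query takes an interval, which mirrors the difference between objects that exist at a time and events that extend through time.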
51
SNAP and SPAN
• anatomy and physiology
52
SNAP: Entities existing in toto at a time
53
SPAN: Entities extended in time
SPAN Entity (extended in time)
• Portion of Spacetime
• Fiat part of process *: first phase of a clinical trial
• Spacetime worm of 3 + T dimensions: occupied by life of organism
• Temporal interval *: projection of organism’s life onto temporal dimension
• Aggregate of processes *: clinical trial
• Process [±Relational]: circulation of blood, secretion of hormones, course of disease, life
• Processual Entity [exists in space and time, unfolds in time phase by phase]
• Temporal boundary of process *: onset of disease, death
54
SNAP-SPAN
Participation
Perpetration (+agentive)
Initiation
Perpetuation
Termination
Influence
Facilitation
Hindrance
Mediation
Patiency(-agentive)
55
Realization (SNAP-SPAN)
• the execution of a plan, algorithm
• the expression of a function
• the exercise of a role
• the realization of a disposition
56
Material examples:
• SPAN SNAP
• expression of an emotion
• utterance of a sentence
• application of a therapy
• course of a disease
• increase of temperature
57
SPAN SNAP
Involvement
Creation
Sustaining in being
Destruction
Demarcation
Blurring
Degradation