
ISO/TC37/SC4/TDG6 Language Resource Ontologies. 2008-05-25, Marrakech. HASIDA Koiti (hasida.k@aist.go.jp), CfSR, AIST, Japan


Page 1: ISO/TC37/SC4/TDG6 Language Resource Ontologies

2008-05-25, Marrakech
HASIDA Koiti, hasida.k@aist.go.jp
CfSR, AIST, Japan

Page 2: TDG6 Issues

- ontologization
  - DC, LAF, LMF, FS, MAF, SemAF, SynAF, TDG3, etc.
  - Cf. the Pisa group's work on LMF
- extension of RDF (and ontology framework) to more straightforwardly address linguistic information
  - extended RDF instead of XML
  - nodes embedding nodes … rdf:Container?
- publish TRs
- launch ISs

Page 3: Ontologization

- ontology-based reformulation
  - Most current standards are based on XML and lack a standard framework for semantic interpretation.
- not XML but RDF as the base description and modeling tool
  - Semantic interpretation is standardized not for XML but for RDF.
- ontology as schema
  - not DTD, XML Schema, RELAX NG, etc.

Page 4: Motivations of Ontologization

- Lack of a formal tool with which to write schemas that fully address the specifications in ISs.
- The DCR model lacks descriptive power.

Page 5: Weaknesses of DCR Metamodel

The DCR metamodel cannot address:
- sorts of DCs, such as unary predicate, binary relation, symmetric binary relation, etc.
- types of the domain (1st arg.) and the range (2nd arg.) of binary relations (properties)

Page 6: Semantic Mess of XML

- Semantic interpretation of XML is not standardized but rather arbitrary.
- Many inconsistent 'standards' on overlapping issues.
- Huge standards containing many different manners of semantic interpretation.
  - e.g., MPEG-7 > 2000 pages

Page 7: RDF

- Resource Description Framework
- W3C recommendation
  - http://www.w3.org/RDF/
- basis of ontology standards such as RDFS, OWL, and SKOS
- graph data model
- textual representation
  - XML
  - N3

Page 8: RDF Graph

[Figure: RDF graph. Nodes: http://www.example.org/people#fred, http://meetings.example.com/cal#m1, http://meetings.example.com/m1/hp, the literal "Fred", and mailto:[email protected]. Edge labels: m:attending (fred to cal#m1), m:homePage (cal#m1 to m1/hp), m:givenName (fred to "Fred"), m:hasEmail (fred to the mailto address).]
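
As a rough illustration of the graph data model, the graph on this slide can be held as a plain set of (subject, predicate, object) triples. This is a minimal sketch in plain Python rather than an RDF library; the helper name `objects` is hypothetical, the `m:` prefix is kept as an unresolved abbreviation, and the e-mail edge is omitted.

```python
# A graph is just a set of (subject, predicate, object) triples.
FRED = "http://www.example.org/people#fred"
M1 = "http://meetings.example.com/cal#m1"

graph = {
    (M1, "m:homePage", "http://meetings.example.com/m1/hp"),
    (FRED, "m:attending", M1),
    (FRED, "m:givenName", "Fred"),
}


def objects(graph, subject, predicate):
    """Return all objects reachable from `subject` via `predicate` (hypothetical helper)."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


# Follow two edges: the home page of every meeting Fred attends.
meetings = objects(graph, FRED, "m:attending")
home_pages = {hp for m in meetings for hp in objects(graph, m, "m:homePage")}
print(home_pages)
```

Nothing in the sketch depends on how the triples were written down, which is the point the slides make about graphs versus textual encodings.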

Page 9: Cf. RDF in Text

XML:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:m="http://www.example.org/meeting_organization#"
             xmlns="http://www.example.org/people#"
             xmlns:p="http://www.example.org/personal_details#">
      <rdf:Description rdf:about="http://meetings.example.com/cal#m1">
        <m:homePage rdf:resource="http://meetings.example.com/m1/hp"/>
      </rdf:Description>
      <rdf:Description rdf:about="http://www.example.org/people#fred">
        <m:attending rdf:resource="http://meetings.example.com/cal#m1"/>
        <p:GivenName>Fred</p:GivenName>
        <p:hasEmail rdf:resource="mailto:[email protected]"/>
      </rdf:Description>
    </rdf:RDF>

N3:

    @prefix p: <http://www.example.org/personal_details#> .
    @prefix m: <http://www.example.org/meeting_organization#> .

    <http://meetings.example.com/cal#m1>
        m:homePage <http://meetings.example.com/m1/hp> .

    <http://www.example.org/people#fred>
        p:GivenName "Fred";
        p:hasEmail <mailto:[email protected]>;
        m:attending <http://meetings.example.com/cal#m1> .

Let's forget these texts and use graphs!

Page 10: ISO 24610: Feature Structure

- typed feature structure as in HPSG, etc.
- ISO 24610-1: Feature Structure Representation
- ISO 24610-2: Feature System Declaration
- graph model
- AVM (attribute-value matrix)
- textual encoding by XML

Page 11: FS Graph

[Figure: feature-structure graph. SPECIFIER and HEAD edges lead from the root to two nodes. The SPECIFIER node has POS determiner and ORTH la; the HEAD node has POS noun and ORTH pomme. The AGR edges of both nodes point to one shared node with NUMBER singular.]

Page 12: FS in AVM

    SPECIFIER  [ POS   determiner
                 ORTH  'la'
                 AGR   [1] [ NUMBER singular ] ]

    HEAD       [ POS   noun
                 ORTH  'pomme'
                 AGR   [1] ]

Page 13: FS in XML

    <fs>
      <f name="specifier">
        <fs>
          <f name="pos"><symbol value="determiner"/></f>
          <f name="orth"><string>la</string></f>
          <f name="agr">
            <var label="n1">
              <fs><f name="number"><symbol value="singular"/></f></fs>
            </var>
          </f>
        </fs>
      </f>
      <f name="head">
        <fs>
          <f name="pos"><symbol value="noun"/></f>
          <f name="orth"><string>pomme</string></f>
          <f name="agr"><var label="n1"/></f>
        </fs>
      </f>
    </fs>

Let's forget this, too!

Page 14: FS in RDF Graph (= FS Graph)

[Figure: the same graph as on the FS Graph slide, read as an RDF graph. SPECIFIER and HEAD edges lead from the root to two nodes. The SPECIFIER node has POS determiner and ORTH la; the HEAD node has POS noun and ORTH pomme. The AGR edges of both nodes point to one shared node with NUMBER singular.]
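
To make the "FS graph = RDF graph" reading concrete, the sketch below flattens the "la pomme" feature structure into triples. It is an illustration under stated assumptions, not a method from the slides: nested Python dicts stand in for feature structures, the function name `fs_to_triples` and the blank-node labels `_:n0`, `_:n1`, ... are hypothetical, and re-entrancy (the [1] tag in the AVM) is modeled by reusing the same dict object.

```python
import itertools


def fs_to_triples(fs, triples, ids, counter):
    """Flatten a nested feature structure into (node, feature, value) triples.

    Dicts become graph nodes. A dict object reached twice (re-entrancy)
    is mapped to the same node both times, so the sharing is preserved.
    """
    key = id(fs)
    if key in ids:
        return ids[key]
    node = f"_:n{next(counter)}"
    ids[key] = node
    for feature, value in fs.items():
        if isinstance(value, dict):
            value = fs_to_triples(value, triples, ids, counter)
        triples.add((node, feature, value))
    return node


agr = {"NUMBER": "singular"}  # shared value, tagged [1] in the AVM
fs = {
    "SPECIFIER": {"POS": "determiner", "ORTH": "la", "AGR": agr},
    "HEAD": {"POS": "noun", "ORTH": "pomme", "AGR": agr},
}

triples = set()
root = fs_to_triples(fs, triples, {}, itertools.count())

# Both AGR edges end at one and the same node.
agr_targets = {o for (s, p, o) in triples if p == "AGR"}
print(len(agr_targets))
```

The XML encoding needs the `var` element and a label to express the shared AGR value; in the triple set the sharing is simply two edges with the same target.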

Page 15: Ontologies Subsume Feature Systems

- Features are partial functions, whereas RDF properties are relations in general (possibly partial functions).
- Usual feature systems have no taxonomy of features, whereas usual ontologies have taxonomies of properties (e.g., due to rdfs:subPropertyOf).
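
The two contrasts on this slide can be sketched over plain triples. The code below is only an illustration: the node names `w1` to `w3`, the properties `synonym` and `relatedTo`, and both function names are hypothetical, and the sub-property step mimics the usual reading of rdfs:subPropertyOf without using an RDF library.

```python
def is_functional(triples, prop):
    """True if no subject has two different values for `prop`,
    i.e. `prop` behaves like a feature (a partial function)."""
    seen = {}
    for s, p, o in triples:
        if p == prop and seen.setdefault(s, o) != o:
            return False
    return True


def with_superproperties(triples, sub_property_of):
    """Add (s, q, o) for every (s, p, o) where p is a sub-property of q."""
    closed = set(triples)
    frontier = list(triples)
    while frontier:
        s, p, o = frontier.pop()
        for q in sub_property_of.get(p, ()):
            if (s, q, o) not in closed:
                closed.add((s, q, o))
                frontier.append((s, q, o))
    return closed


triples = {
    ("w1", "ORTH", "pomme"),   # feature-like: at most one value per node
    ("w1", "synonym", "w2"),   # general relation: several values allowed
    ("w1", "synonym", "w3"),
}
# Hypothetical property taxonomy: synonym is a kind of relatedTo.
sub_property_of = {"synonym": ["relatedTo"]}

closed = with_superproperties(triples, sub_property_of)
print(is_functional(triples, "ORTH"), is_functional(triples, "synonym"))
```

A feature system corresponds to the special case where every property passes `is_functional` and `sub_property_of` is empty.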

Page 16: Feature Structure Declaration

    <fsDecl type="word" baseTypes="sign">
      <fsDescr>The fundamental type for individual words</fsDescr>
      <fDecl name="orth">
        <fDescr>The orthographic representation for this word</fDescr>
        <vRange><string/></vRange>
      </fDecl>
    </fsDecl>

[Figure: the same declaration as an RDF graph. word rdfs:subClassOf sign; word rdfs:comment "The fundamental type for individual words"; orth rdfs:domain word; orth rdfs:range string; orth rdfs:comment "The orthographic representation for this word"; orth rdf:type owl:FunctionalProperty.]
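
The correspondence between the declaration and the graph on this slide can be sketched mechanically. The code below is an assumption-laden illustration, not a converter defined by ISO 24610-2: the function name `fsdecl_to_triples` is hypothetical, triples are plain tuples with unresolved prefixes, and only the simple case shown on the slide (one base type list, a description, and a single-element value range) is handled.

```python
import xml.etree.ElementTree as ET

FSDECL = """
<fsDecl type="word" baseTypes="sign">
  <fsDescr>The fundamental type for individual words</fsDescr>
  <fDecl name="orth">
    <fDescr>The orthographic representation for this word</fDescr>
    <vRange><string/></vRange>
  </fDecl>
</fsDecl>
"""


def fsdecl_to_triples(xml_text):
    """Translate one <fsDecl> into RDFS/OWL-style triples (plain tuples)."""
    decl = ET.fromstring(xml_text)
    fs_type = decl.get("type")
    triples = set()
    for base in decl.get("baseTypes", "").split():
        triples.add((fs_type, "rdfs:subClassOf", base))
    descr = decl.findtext("fsDescr")
    if descr:
        triples.add((fs_type, "rdfs:comment", descr.strip()))
    for f_decl in decl.findall("fDecl"):
        name = f_decl.get("name")
        triples.add((name, "rdfs:domain", fs_type))
        # Features are partial functions, hence functional properties.
        triples.add((name, "rdf:type", "owl:FunctionalProperty"))
        f_descr = f_decl.findtext("fDescr")
        if f_descr:
            triples.add((name, "rdfs:comment", f_descr.strip()))
        for value_type in f_decl.findall("vRange/*"):
            triples.add((name, "rdfs:range", value_type.tag))
    return triples


triples = fsdecl_to_triples(FSDECL)
for triple in sorted(triples):
    print(triple)
```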

Page 17: Constraint (Conditional)

    <cond>
      <fs>
        <f name="inv"><binary value="true"/></f>
      </fs>
      <then/>
      <fs>
        <f name="aux"><binary value="true"/></f>
        <f name="vform"><symbol value="fin"/></f>
      </fs>
    </cond>

[Figure: the constraint drawn as two named graphs over a node X, linked by a cond edge. The antecedent graph has X inv true; the consequent graph has X aux true and X vform fin.]
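
What the conditional on this slide demands of a feature structure can be stated in a few lines. The sketch below is a simplification under stated assumptions: feature structures are flat Python dicts, the function names `matches` and `satisfies` are hypothetical, and the example values other than those on the slide (such as `"bse"`) are made up for illustration.

```python
def matches(fs, pattern):
    """True if every feature in `pattern` has the same value in `fs`."""
    return all(fs.get(feature) == value for feature, value in pattern.items())


def satisfies(fs, antecedent, consequent):
    """Conditional constraint: if `fs` matches the antecedent it must also
    match the consequent; otherwise the constraint holds vacuously."""
    return (not matches(fs, antecedent)) or matches(fs, consequent)


# The constraint from the slide: inv = true implies aux = true and vform = fin.
IF = {"inv": True}
THEN = {"aux": True, "vform": "fin"}

inverted_aux = {"inv": True, "aux": True, "vform": "fin"}
inverted_nonaux = {"inv": True, "aux": False, "vform": "fin"}
plain_verb = {"inv": False, "aux": False, "vform": "bse"}

for fs in (inverted_aux, inverted_nonaux, plain_verb):
    print(satisfies(fs, IF, THEN))
```

In the graph rendering, `IF` and `THEN` would each be a named graph; the check itself is unchanged.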

Page 18: FS Ontologization (Summary)

- RDF ⊃ FS
- Use ontologies for feature-system declarations.
- We need RDF-based notations to encode constraints.
- Defaults are outside of ontology.

Page 19: ISO 24612: Linguistic Annotation Framework

Page 20: RDF Extended for Embedding

[Figure: a node embedding nodes. An NP node (rdf:type NP, NUMBER SING) embeds two token nodes: "The" (rdf:type TOKEN, POS DET, BASE THE) and "clock" (rdf:type TOKEN, POS NN, BASE CLOCK). Callouts: "a node embedding nodes"; "possibly stand-off annotation".]
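
One way to picture a node that embeds other nodes, with the text kept stand-off, is the sketch below. It is not the extension the slides propose, whose details are not given here. The `Node` class, its field names, and the use of character offsets are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An annotation node that covers a text span and may embed other nodes."""
    start: int  # character offset into the source text, inclusive
    end: int    # character offset into the source text, exclusive
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def text(self, source):
        """Recover the covered string from the untouched source text."""
        return source[self.start:self.end]


source = "The clock"
the = Node(0, 3, {"rdf:type": "TOKEN", "POS": "DET", "BASE": "THE"})
clock = Node(4, 9, {"rdf:type": "TOKEN", "POS": "NN", "BASE": "CLOCK"})
noun_phrase = Node(the.start, clock.end,
                   {"rdf:type": "NP", "NUMBER": "SING"},
                   children=[the, clock])

print(noun_phrase.text(source))
print([child.text(source) for child in noun_phrase.children])
```

The annotation never copies the text: each node only points into `source`, which is what makes it stand-off.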

Page 21: Prospects

- RDF as basic data structure
  - Graph model is essential.
  - Forget about textual encoding such as XML, though W3C insists on plain-text encoding.
- ontology to address FSD
  - straightforward to basically declare features and feature structures
  - need some inventions for constraints
- extension of RDF
  - embeddings (of strings)
  - collections (sets, bags, lists)
- lots more to do
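
For context on the "collections" item: standard RDF can already express an ordered list as a chain of cells using rdf:first, rdf:rest, and rdf:nil. The sketch below shows that encoding over plain tuples. The function names and the blank-node labels `_:c0`, `_:c1`, ... are hypothetical, and the sketch does not represent whatever extension the slides have in mind.

```python
import itertools


def list_to_triples(items, counter=None):
    """Encode an ordered list as an rdf:first / rdf:rest chain.

    Returns (head, triples); the empty list is just rdf:nil.
    """
    if counter is None:
        counter = itertools.count()
    triples = []
    head = "rdf:nil"
    # Build from the tail so that each cell can point at the rest.
    for item in reversed(items):
        cell = f"_:c{next(counter)}"
        triples.append((cell, "rdf:first", item))
        triples.append((cell, "rdf:rest", head))
        head = cell
    return head, triples


def triples_to_list(head, triples):
    """Walk the chain from `head` and recover the items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items = []
    while head != "rdf:nil":
        items.append(first[head])
        head = rest[head]
    return items


head, triples = list_to_triples(["The", "clock"])
print(len(triples), triples_to_list(head, triples))
```

A two-item list already costs four triples and two auxiliary nodes, which gives a sense of why a more direct treatment of collections is listed as future work.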