Text Understanding Agents and the Semantic Web Akshay Java, Tim Finin, Sergei Nirenburg 01/04/2005

Page 1: Text Understanding Agents  and the Semantic Web

Text Understanding Agents and the Semantic Web

Akshay Java, Tim Finin, Sergei Nirenburg 01/04/2005

Page 2: Text Understanding Agents  and the Semantic Web

Outline

• Motivation: Language Understanding Agents
• Ontological Semantics
• Bridging the Knowledge Gap
• Preliminary Evaluation
• SemNews: An Application Testbed
• Conclusion
• Q&A

Page 3: Text Understanding Agents  and the Semantic Web

Motivation

• Intelligent agents need knowledge and information.
• Most Web content is natural language (NL) text.
• The Semantic Web (SW) can benefit NLP tools in their language understanding tasks.

[Figure: Web of documents (text, images, audio, video; natural language) and Web of data (ontologies, instances, triples; RDF/OWL). Labels: NLP tools, Semantic Web, facts from NL, structured information.]

Page 4: Text Understanding Agents  and the Semantic Web

Motivation

Language Understanding Agents

Provides RDF version of the news.

Page 5: Text Understanding Agents  and the Semantic Web

Ontological Semantics

OntoSem is a natural language processing system that processes text and converts it into facts.

It is supported by a constructed world model encoded in a rich ontology.

Page 6: Text Understanding Agents  and the Semantic Web

Ontological Semantics

Page 7: Text Understanding Agents  and the Semantic Web

Static Knowledge Sources

• Ontology
    • 8000 concepts
    • Avg 16 properties each
• Lexicons
    • English: 45000 entries
    • Spanish: 40000 entries
    • Chinese: 3000 entries
• Fact repository
    • 20000 facts
• Onomasticon
    • NNNNN names

Comment (Tim Finin): We should get some kind of estimate for this. Is there just one onomasticon, or one for each application or domain?

Page 8: Text Understanding Agents  and the Semantic Web

The OntoSem Ontology

[Figure labels: PROPERTY, FACET, FILLER]

ONTOLOGY ::= CONCEPT+
CONCEPT ::= ROOT | OBJECT-OR-EVENT | PROPERTY
SLOT ::= PROPERTY | FACET | FILLER
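To make the grammar concrete, here is a minimal Python sketch of one way such an ontology could be held in memory. It assumes that each slot on a concept pairs a property with a facet and one or more fillers; the class names `Slot` and `Concept` and the `fillers` helper are illustrative and are not part of OntoSem. The sample data is the `patent` frame that appears later in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One property/facet/filler entry attached to a concept (assumed layout)."""
    prop: str       # e.g. "is-a"
    facet: str      # e.g. "value", "sem", "default"
    fillers: tuple  # one or more filler symbols or literals


@dataclass
class Concept:
    name: str
    slots: list = field(default_factory=list)

    def fillers(self, prop, facet):
        """Return every filler recorded for the given property and facet."""
        return [f
                for s in self.slots
                if s.prop == prop and s.facet == facet
                for f in s.fillers]


# The patent frame from the class-mapping slide, restated in this layout.
patent = Concept("patent", [
    Slot("is-a", "value", ("intangible-asset", "legal-right")),
])

# ONTOLOGY ::= CONCEPT+ becomes a non-empty mapping from name to concept.
ontology = {patent.name: patent}
```

Keeping the facet on every slot matters for the translation discussed later, since each facet is mapped to OWL differently.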

Page 9: Text Understanding Agents  and the Semantic Web

Text Meaning Representation (TMR)

[Figure callouts: word sense addressed/disambiguated; a persistent fact stored in the FR; semantic dependency established]

Page 10: Text Understanding Agents  and the Semantic Web

Text Meaning Representation (TMR)

He asked the UN to authorize the war.

REQUEST-ACTION-69
    AGENT             HUMAN-72
    THEME             ACCEPT-70
    BENEFICIARY       ORGANIZATION-71
    SOURCE-ROOT-WORD  ask
    TIME              (< (FIND-ANCHOR-TIME))

ACCEPT-70
    THEME             WAR-73
    THEME-OF          REQUEST-ACTION-69
    SOURCE-ROOT-WORD  authorize

ORGANIZATION-71
    HAS-NAME          United-Nations
    BENEFICIARY-OF    REQUEST-ACTION-69
    SOURCE-ROOT-WORD  UN

HUMAN-72
    HAS-NAME          Colin Powell
    AGENT-OF          REQUEST-ACTION-69
    SOURCE-ROOT-WORD  he ; reference resolution has been carried out

WAR-73
    THEME-OF          ACCEPT-70
    SOURCE-ROOT-WORD  war
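The TMR for "He asked the UN to authorize the war" can be read as a small graph of instance frames. The Python sketch below restates those frames as a dictionary and flattens them into subject/property/value triples, which is roughly the shape needed before any RDF serialization. The function names `tmr_to_triples` and `concept_of` are illustrative assumptions, not OntoSem or OntoSem2OWL APIs.

```python
# The TMR frames from the slide, keyed by instance name.
tmr = {
    "REQUEST-ACTION-69": {
        "AGENT": "HUMAN-72",
        "THEME": "ACCEPT-70",
        "BENEFICIARY": "ORGANIZATION-71",
        "SOURCE-ROOT-WORD": "ask",
        "TIME": "(< (FIND-ANCHOR-TIME))",
    },
    "ACCEPT-70": {
        "THEME": "WAR-73",
        "THEME-OF": "REQUEST-ACTION-69",
        "SOURCE-ROOT-WORD": "authorize",
    },
    "ORGANIZATION-71": {
        "HAS-NAME": "United-Nations",
        "BENEFICIARY-OF": "REQUEST-ACTION-69",
        "SOURCE-ROOT-WORD": "UN",
    },
    "HUMAN-72": {
        "HAS-NAME": "Colin Powell",
        "AGENT-OF": "REQUEST-ACTION-69",
        "SOURCE-ROOT-WORD": "he",
    },
    "WAR-73": {
        "THEME-OF": "ACCEPT-70",
        "SOURCE-ROOT-WORD": "war",
    },
}


def tmr_to_triples(frames):
    """Flatten instance frames into (instance, property, value) triples."""
    return [(inst, prop, value)
            for inst, slots in frames.items()
            for prop, value in slots.items()]


def concept_of(instance):
    """Strip the numeric instance suffix, e.g. 'HUMAN-72' gives 'HUMAN'."""
    return instance.rsplit("-", 1)[0]


triples = tmr_to_triples(tmr)
```

Each instance name also identifies its ontology concept (`concept_of("REQUEST-ACTION-69")` gives `REQUEST-ACTION`), which is what links a TMR back to the ontology.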

Page 11: Text Understanding Agents  and the Semantic Web

Mapping OntoSem to web based KR

[Figure: NL text is processed by OntoSem. Labels: Ontology, Lexicon, Fact Repository, TMR, OntoSem2OWL, OWL Ontology, TMRs in OWL.]

Page 12: Text Understanding Agents  and the Semantic Web

Mapping Rules for Classes

OntoSem LISP version:

(make-frame patent
  (definition (value (common "the exclusive right to make, use or sell an invention, which is granted to the inventor")))
  (is-a (value (common intangible-asset legal-right))))

OWL version:

<owl:Class rdf:about="&ontosem;patent">
  <rdfs:subClassOf>
    <owl:Class rdf:about="&ontosem;intangible-asset"> </owl:Class>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <owl:Class rdf:about="&ontosem;legal-right"> </owl:Class>
  </rdfs:subClassOf>
  <rdfs:label> the exclusive right to make, use or sell an invention, which is granted to the inventor </rdfs:label>
</owl:Class>
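The class mapping can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the authors' Jena-based translator: the function name `class_to_owl` is hypothetical, the `&ontosem;` entity is assumed to expand to `http://ontosem.org/#` (the namespace seen in the backup-slide output), and the definition is written to `rdfs:label` as in the property example.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
ONTOSEM = "http://ontosem.org/#"  # assumed expansion of the &ontosem; entity

# Ask ElementTree to use the conventional prefixes when serializing.
for prefix, uri in (("owl", OWL), ("rdf", RDF), ("rdfs", RDFS)):
    ET.register_namespace(prefix, uri)


def class_to_owl(name, parents, definition):
    """Build an owl:Class element from a frame's name, is-a fillers and definition."""
    cls = ET.Element(f"{{{OWL}}}Class", {f"{{{RDF}}}about": ONTOSEM + name})
    for parent in parents:
        # Each is-a filler becomes one rdfs:subClassOf axiom.
        sub = ET.SubElement(cls, f"{{{RDFS}}}subClassOf")
        ET.SubElement(sub, f"{{{OWL}}}Class", {f"{{{RDF}}}about": ONTOSEM + parent})
    label = ET.SubElement(cls, f"{{{RDFS}}}label")
    label.text = definition
    return cls


patent = class_to_owl(
    "patent",
    ["intangible-asset", "legal-right"],
    "the exclusive right to make, use or sell an invention, "
    "which is granted to the inventor",
)
xml_text = ET.tostring(patent, encoding="unicode")
```

Because the translation is purely syntactic, one frame maps to one element with no reasoning involved, which matches the "Translation Complexity" remark in the evaluation.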

Page 13: Text Understanding Agents  and the Semantic Web

Mapping Rules for Properties

• Properties can be
    • Object property: owl:ObjectProperty
    • Datatype property: owl:DatatypeProperty
• Property hierarchy is defined by rdfs:subPropertyOf
• Domain maps to rdfs:domain
• Range maps to rdfs:range
• Restrictions are handled using owl:Restriction
• Numeric datatypes are handled using XSD

Page 14: Text Understanding Agents  and the Semantic Web

Mapping Rules for Properties…

(make-frame controls
  (domain (sem (common physical-event physical-object social-event social-role)))
  (range (sem (common actualize artifact natural-object social-role)))
  (is-a (value (common relation)))
  (inverse (value (common controlled-by)))
  (definition (value (common "A relation which relates concepts to what they can control"))))

Page 15: Text Understanding Agents  and the Semantic Web

Mapping Rules for Properties…

<owl:ObjectProperty rdf:ID="controls">
  <rdfs:domain>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#physical-event"/>
        <owl:Class rdf:about="#physical-object"/>
        <owl:Class rdf:about="#social-event"/>
        <owl:Class rdf:about="#social-role"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:domain>
  <rdfs:range>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#actualize"/>
        <owl:Class rdf:about="#artifact"/>
        <owl:Class rdf:about="#natural-object"/>
        <owl:Class rdf:about="#social-role"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:range>
  <rdfs:subPropertyOf>
    <owl:ObjectProperty rdf:about="#relation"/>
  </rdfs:subPropertyOf>
  <owl:inverseOf rdf:resource="#controlled-by"/>
  <rdfs:label> "A relation which relates concepts to what they can control" </rdfs:label>
</owl:ObjectProperty>

[Figure callouts: make-frame, domain, range, is-a, inverse]
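The `controls` example shows the one structural wrinkle in the property mapping: a domain or range with several fillers becomes an anonymous class holding an `owl:unionOf` collection. A rough Python sketch of that step follows, using only the standard library. The helper names `q`, `union_of` and `property_to_owl` are hypothetical, and this is not the authors' Jena code.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def q(ns, local):
    """Return a namespace-qualified name in ElementTree's {uri}local form."""
    return f"{{{ns}}}{local}"


def union_of(parent, class_names):
    """Attach an anonymous owl:Class containing an owl:unionOf collection."""
    anon = ET.SubElement(parent, q(OWL, "Class"))
    union = ET.SubElement(anon, q(OWL, "unionOf"),
                          {q(RDF, "parseType"): "Collection"})
    for name in class_names:
        ET.SubElement(union, q(OWL, "Class"), {q(RDF, "about"): "#" + name})
    return anon


def property_to_owl(name, domain, range_, parent, inverse, definition):
    """Translate one relation frame into an owl:ObjectProperty element."""
    prop = ET.Element(q(OWL, "ObjectProperty"), {q(RDF, "ID"): name})
    union_of(ET.SubElement(prop, q(RDFS, "domain")), domain)
    union_of(ET.SubElement(prop, q(RDFS, "range")), range_)
    sub = ET.SubElement(prop, q(RDFS, "subPropertyOf"))
    ET.SubElement(sub, q(OWL, "ObjectProperty"), {q(RDF, "about"): "#" + parent})
    ET.SubElement(prop, q(OWL, "inverseOf"), {q(RDF, "resource"): "#" + inverse})
    ET.SubElement(prop, q(RDFS, "label")).text = definition
    return prop


controls = property_to_owl(
    "controls",
    domain=["physical-event", "physical-object", "social-event", "social-role"],
    range_=["actualize", "artifact", "natural-object", "social-role"],
    parent="relation",
    inverse="controlled-by",
    definition="A relation which relates concepts to what they can control",
)
```

The union is needed because several plain `rdfs:domain` statements on one property would be read as an intersection, which is not what the frame's filler list means.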

Page 16: Text Understanding Agents  and the Semantic Web

Mapping Rules for Facets

Facets are a way of restricting the fillers that can be used for a particular slot.

• SEM and VALUE
    • Mapped using owl:Restriction on a particular property.
• RELAXABLE-TO
    • Add this to the classes present in owl:Restriction and add this information in the annotation.
• DEFAULT
    • No clear way to represent non-monotonic reasoning and closed world assumptions in the Semantic Web.
• DEFAULT-MEASURE
    • Similar to the DEFAULT facet; not handled.
    • DEFAULT and DEFAULT-MEASURE are used relatively infrequently.
• NOT
    • The NOT facet can be handled using owl:disjointWith.
• INV
    • Need not be handled, since the inverse slot is already mapped to owl:inverseOf.
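The facet rules amount to a lookup from facet name to an OWL construct or to a decision to skip. The Python sketch below encodes that lookup, using the targets listed in the facet table in the backup slides. The function name `plan_facets` and the sample slots are hypothetical and are not taken from the OntoSem ontology.

```python
# Facet handling as summarized on the mapping-rules slides.
NOT_HANDLED = "not handled"
NOT_REQUIRED = "not required"

FACET_RULES = {
    "value": "owl:Restriction",
    "sem": "owl:Restriction",
    "relaxable-to": "annotation",
    "default": NOT_HANDLED,
    "default-measure": NOT_HANDLED,
    "not": "owl:disjointWith",
    "inv": NOT_REQUIRED,
}


def plan_facets(slots):
    """Split (property, facet, filler) entries into translated and skipped."""
    translated, skipped = [], []
    for prop, facet, filler in slots:
        rule = FACET_RULES[facet.lower()]
        target = skipped if rule in (NOT_HANDLED, NOT_REQUIRED) else translated
        target.append((prop, facet, filler, rule))
    return translated, skipped


# Hypothetical slots, for illustration only.
example_slots = [
    ("has-color", "SEM", "color"),
    ("has-color", "DEFAULT", "white"),
    ("has-color", "NOT", "transparent"),
    ("has-color", "RELAXABLE-TO", "pattern"),
]
translated, skipped = plan_facets(example_slots)
```

Recording the skipped entries, rather than silently dropping them, makes the meaning-preservation loss of the translation easy to report.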

Page 17: Text Understanding Agents  and the Semantic Web

Evaluation

Tools: W3C RDF Validator (http://w3c.org/RDF/Validator/), Swoop, Pellet, WonderWeb

• Built ontology translation tool using the Jena API
• Total triples generated: ~102189 (including bnodes)
• Time to build the model: ~10-40 sec
• Time to do RDFS inference: ~10 sec
• Time to do OWL Micro: ~40 sec
• Time to do OWL Full: ????

After translation: OWL Full

DL Expressivity: ELUIH
    • EL - Conjunction and Full Existential Quantification
    • U - Union
    • H - Role Hierarchy
    • I - Role Inverse

• Total Number of Classes: 7747 (Defined: 7747, Imported: 0)
• Total Number of Datatype Properties: 0 (Defined: 0, Imported: 0)
• Total Number of Object Properties: 604 (Defined: 604, Imported: 0)
• Total Number of Annotation Properties: 1 (Defined: 1, Imported: 0)
• Total Number of Individuals: 0 (Defined: 0, Imported: 0)

NOTE: This is using no Restrictions

Page 18: Text Understanding Agents  and the Semantic Web

Evaluation

• Syntactic Correctness: was checked using OWL/RDF validators.

• Semantic Validation: Full semantic validation even for subsets of OWL is difficult.

• Meaning Preservation: some subset of the native representation features such as DEFAULTS, modality, case roles may be underrepresented or not handled.

• Feature Minimization: Complex features could be difficult for reasoners to handle hence we can perform the translations at each of the levels – OWL Lite, OWL DL, OWL Full.

• Translation Complexity: OntoSem is an extensive and large ontology (~8000 concepts). Translation itself is done syntactically but in general translation might require reasoning which could be an issue.

Page 19: Text Understanding Agents  and the Semantic Web

An Application Testbed: SemNews

• Semantically search and browse news.
• Aggregators collect the RSS news descriptions from various sources.
• The sentences are processed by OntoSem and are converted into TMRs.
• Provides intelligent agents with the latest news in a machine readable format.
• http://semnews.umbc.edu/

Page 20: Text Understanding Agents  and the Semantic Web

[Figure: SemNews architecture with numbered steps 1-15. Labels: News Feeds, RSS Aggregator, Data Aggregators, Language Processing, OntoSem, TMRs, FR, Knowledge Editor Environment, Dekade Editor, Fact Repository Interface, Semantic Web Tools, OntoSem2OWL, OntoSem Ontology (OWL), TMR, Inferred Triples, Ontology & Instance browser, Text Search, RDQL Query, Swoogle Index, Semantic RSS.]

http://semnews.umbc.edu

Page 21: Text Understanding Agents  and the Semantic Web

Agent understandable news

Provides RDF version of the news.

http://semnews.umbc.edu

Page 22: Text Understanding Agents  and the Semantic Web

Semanticizing RSS

View a structured representation of the RSS news story.

Future versions would enable editing the facts and provide provenance information.

http://semnews.umbc.edu

Page 23: Text Understanding Agents  and the Semantic Web

News stories are ontologically linked

Find news stories by browsing through the OntoSem ontology.

http://semnews.umbc.edu

Page 24: Text Understanding Agents  and the Semantic Web

Tracking Named Entities

Find stories on a specific named entity.

http://semnews.umbc.edu

Page 25: Text Understanding Agents  and the Semantic Web

Browsing Facts

Fact repository explorer for named entity ‘Mexico’ shows that it has a relation ‘nationality-of’ with CITIZEN-235

Fact repository explorer for instance CITIZEN-235 shows that the citizen is an agent of ESCAPE-EVENT

http://semnews.umbc.edu

Page 26: Text Understanding Agents  and the Semantic Web

Querying the semanticized RSS

RDQL Queries

Provides structured querying over text represented in RDF.

http://semnews.umbc.edu

Page 27: Text Understanding Agents  and the Semantic Web

Semantic Alerts

Alerts can be specified as ontological concepts, keywords or RDQL queries.

Subscribe to results of structured queries

http://semnews.umbc.edu

Page 28: Text Understanding Agents  and the Semantic Web

Beyond keyword search

• Conceptually searching for content
    Find all news stories that have something to do with a place and a terrorist activity.
• Context based querying
    Find all events in which ‘George Bush’ was the ‘speaker’.
• Reporting facts
    Find all politicians who traveled to Asia.
• Knowledge sharing
    Populating instances by mapping FOAF and DC to the OntoSem ontology.
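SemNews answers such questions with RDQL over the RDF store. As a rough stand-in for what a context based query does, the Python sketch below matches triple patterns over a handful of made-up triples. The instance names, the use of AGENT for the speaker role, and the helpers `match` and `events_with_agent_named` are all assumptions for illustration; none of this data comes from SemNews.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Made-up triples in the style of a translated TMR.
triples = [
    ("SPEECH-ACT-1", "AGENT", "HUMAN-1"),
    ("HUMAN-1", "HAS-NAME", "George Bush"),
    ("SPEECH-ACT-2", "AGENT", "HUMAN-2"),
    ("HUMAN-2", "HAS-NAME", "Colin Powell"),
    ("TRAVEL-EVENT-3", "AGENT", "HUMAN-1"),
]


def events_with_agent_named(triples, name, event_prefix):
    """Find events of one concept whose agent carries the given name."""
    people = {s for s, _, _ in match(triples, p="HAS-NAME", o=name)}
    return sorted(s for s, _, o in match(triples, p="AGENT")
                  if o in people and s.startswith(event_prefix))


bush_speeches = events_with_agent_named(triples, "George Bush", "SPEECH-ACT")
```

The join through the person instance is the step a keyword search cannot do: the name and the role are stated in different triples, and only the shared instance connects them.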

Page 29: Text Understanding Agents  and the Semantic Web

Current work

• Enron email corpus
• Profiles in terror

Comment (Tim Finin): I think we should have a slide at least mentioning some current, ongoing work, e.g., looking at the Enron email corpus and the work with Mindswap on profiles in terror.

Page 30: Text Understanding Agents  and the Semantic Web

Conclusions

• Language processing agents integrated into the SW can publish SW annotations and documents that capture the text’s meaning.

• Migrating from a native, non-web-based representation to an SW representation may be lossy but is still useful for many applications.

• SemNews application testbed demonstrates some scenarios that can benefit from language understanding agents.

Page 31: Text Understanding Agents  and the Semantic Web

For More Information

• SemNews application: http://semnews.umbc.edu/
• OntoSem NLP system: http://ilit.umbc.edu/
• UMBC ebiquity research group: http://ebiquity.umbc.edu/
• This presentation: http://ebiquity.umbc.edu/paper/html/id/260/

Page 32: Text Understanding Agents  and the Semantic Web

References

Software Used
[1] OntoSem http://ilit.umbc.edu/
[2] RDF Validation service http://w3c.org/RDF/Validator
[3] Jena Toolkit http://jena.sourceforge.net/
[4] Swoop Ontology Viewer http://www.mindswap.org/2004/SWOOP/
[5] Pellet OWL DL Reasoner http://www.mindswap.org/2003/pellet/
[6] Wonder Web OWL Validator http://phoebus.cs.man.ac.uk:9999/OWL/Validator

Papers
[1] Sergei Nirenburg and Victor Raskin, Ontological Semantics, Formal Ontology and Ambiguity
[2] Sergei Nirenburg and Victor Raskin, Ontological Semantics, MIT Press, Forthcoming
[3] Sergei Nirenburg, Ontological Semantics: Overview, Presentation CLSP JHU, Spring 2003
[4] Marjorie McShane, Sergei Nirenburg, Stephen Beale, Margalit Zabludowski, The Cross Lingual Reuse and Extension of Knowledge Resources in Ontological Semantics
[5] P.J. Beltran-Ferruz, P.A. Gonzalez-Calero, P. Gervas, Converting Mikrokosmos frames into Description Logics
[6] Sergei Nirenburg, Ontology Tutorial, ILIT UMBC

Mailing Lists
[1] Jena Developers [email protected]
[2] Pellet users [email protected]
[3] Semantic web [email protected]
[4] W3C RDF Interest [email protected]
[5] W3C Semantic web [email protected]

Page 33: Text Understanding Agents  and the Semantic Web

Backup slides

Page 34: Text Understanding Agents  and the Semantic Web

Reasoning Capabilities

Finding Transitive Closures (RDFS reasoning)

Buildfile: build.xml

init:

compile:

dist:
    [jar] Building jar: /home/aks1/software/eclipse/workspace/ontojena/dist/lib/ontojena.jar

run:
    [java] MODEL OK
    [java] Resource: http://ontosem.org/#fire-engine
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#fire-engine)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#all)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#physical-object)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#inanimate)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#wheeled-vehicle)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#engine-propelled-vehicle)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#wheeled-engine-vehicle)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#artifact)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#object)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#land-vehicle)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#vehicle)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#truck)
    [java] - (http://ontosem.org/#fire-engine rdfs:label ' "a truck with equipment for fighting fires"')
    [java] - (http://ontosem.org/#fire-engine rdf:type owl:Class)
    [java] fire-engine recognized as subclas of vehicle

BUILD SUCCESSFUL
Total time: 10 seconds

real 0m11.144s
user 0m9.530s
sys 0m0.190s
[aks1@trishuli ontojena]$

[Figure: class hierarchy with inferred triples. Labels: Fire-engine, Truck, Wheeled-engine-vehicle, Engine-propelled-vehicle, Wheeled-vehicle, Land-vehicle, vehicle.]
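The inference shown here is the transitive closure of rdfs:subClassOf. The Python sketch below computes it with a plain graph walk rather than with Jena's RDFS reasoner, which is what the slide actually uses. The direct edges are an assumption read off the order of the figure labels, and the function name `superclasses` is illustrative.

```python
def superclasses(direct, cls):
    """Return every class reachable from cls by following subClassOf edges."""
    seen, stack = set(), [cls]
    while stack:
        for parent in direct.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Assumed direct subClassOf edges, following the figure labels.
direct = {
    "fire-engine": ["truck"],
    "truck": ["wheeled-engine-vehicle"],
    "wheeled-engine-vehicle": ["engine-propelled-vehicle", "wheeled-vehicle"],
    "engine-propelled-vehicle": ["land-vehicle"],
    "wheeled-vehicle": ["land-vehicle"],
    "land-vehicle": ["vehicle"],
}

inferred = superclasses(direct, "fire-engine")
is_vehicle = "vehicle" in inferred
```

Only the first edge is asserted for `fire-engine`; the rest are inferred triples, which is how the tool can report it as a subclass of `vehicle`. The reflexive triple in the Jena output (fire-engine as a subclass of itself) is not produced by this sketch.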

Page 35: Text Understanding Agents  and the Semantic Web

Mapping Rules

Property related constructs

    Case                     Frequency   Mapped Using
1   domain                   617         rdfs:domain
2   domain with not facet    16          owl:disjointWith
3   range                    406         rdfs:range
4   range with not facet     5           owl:disjointWith
5   inverse                  260         owl:inverseOf

Property Related Constructs

Page 36: Text Understanding Agents  and the Semantic Web

Mapping Rules

Facet related constructs

    Case               Frequency   Mapped Using
1   value              18217       owl:Restriction
2   sem                5686        owl:Restriction
3   relaxable-to       95          annotation
4   default            350         Not handled
5   default-measure    612         Not handled
6   not                134         owl:disjointWith
7   inv                1941        Not required
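The facet frequencies give a rough sense of how much of the ontology the translation leaves out. The short Python calculation below sums the table and reports the share of facet occurrences marked "Not handled". It uses only the figures in the table; treating each frequency as a count of facet occurrences is an assumption.

```python
# Frequencies and handling copied from the facet table.
facet_counts = {
    "value": 18217,
    "sem": 5686,
    "relaxable-to": 95,
    "default": 350,
    "default-measure": 612,
    "not": 134,
    "inv": 1941,
}
handling = {
    "value": "owl:Restriction",
    "sem": "owl:Restriction",
    "relaxable-to": "annotation",
    "default": "Not handled",
    "default-measure": "Not handled",
    "not": "owl:disjointWith",
    "inv": "Not required",
}

total = sum(facet_counts.values())
not_handled = sum(count for facet, count in facet_counts.items()
                  if handling[facet] == "Not handled")
share = not_handled / total

print(f"{not_handled} of {total} facet occurrences not handled ({share:.1%})")
```

By this count the unhandled DEFAULT and DEFAULT-MEASURE facets are a small fraction of the total, which supports the remark that they are used relatively infrequently.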