Acquiring and Using World Knowledge using a Restricted Subset of English
Peter Clark, Phil Harrison, Tom Jenkins, John Thompson, Rick Wojcik
Boeing Phantom Works, Seattle

Page 2

Introduction

• Knowledge acquisition is still a major bottleneck

– automated methods are good but still very restricted

• Our approach:

– Knowledge entry using Controlled Language

– Hits “sweet spot” between logic and full NLP

– language interpreter generates logic output

• Outline:

1. Our Controlled Language Processing technology

2. Discussion on Natural Language as a basis for KR

Page 3

The Language Spectrum

• Formal language: “xy B(x)R(x,y)C(y)” (too hard for the user)
• Controlled English: “A ball falls from a cliff”
• Unrestricted natural language: “Consider the following possible situation in which a ball first…” (too hard for the computer to understand)

Page 4

CPL (Computer-Processable Language)

Original text (incomprehensible to computer):

An object is thrown with a horizontal velocity of 20 m/s from a cliff that is 125 m above level ground. If air resistance is negligible, how long does it take the object to fall to the ground?

Rewritten in CPL (computer can understand):

An object is thrown from a cliff.
The horizontal velocity of the object is 20 m/s.
The top of the cliff is 125 m above level ground.
The object falls 125 m to the ground.
What is the duration of the fall?

• Short sentences
• No pronouns
• Simple sentence structures

Page 5

Target Interpretation

• Sentences in first-order logic
• Capable of supporting machine inference

“An object is thrown from a cliff”

[Figure: a Throw node with an object arc to Object and an origin arc to Cliff]

isa(_Object1, object_n1)
isa(_Cliff2, cliff_n1)
isa(_Throw3, throw_v1)
object(_Throw3, _Object1)
origin(_Throw3, _Cliff2)
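The assertions on this slide can be held as plain tuples and queried directly. The following is a minimal sketch, assuming a (predicate, arg1, arg2) tuple encoding; the helper names `instances_of` and `role_filler` are hypothetical and are not part of CPL.

```python
# Assertions from the slide, encoded as (predicate, arg1, arg2) tuples.
# The tuple encoding and the helper names are assumptions for illustration.
FACTS = {
    ("isa", "_Object1", "object_n1"),
    ("isa", "_Cliff2", "cliff_n1"),
    ("isa", "_Throw3", "throw_v1"),
    ("object", "_Throw3", "_Object1"),
    ("origin", "_Throw3", "_Cliff2"),
}


def instances_of(facts, concept):
    """Return the sorted instances asserted to belong to the given concept."""
    return sorted(a for (p, a, b) in facts if p == "isa" and b == concept)


def role_filler(facts, role, event):
    """Return the filler of a role (e.g. origin) on an event, or None."""
    for (p, a, b) in facts:
        if p == role and a == event:
            return b
    return None


throw = instances_of(FACTS, "throw_v1")[0]
print(throw, "origin:", role_filler(FACTS, "origin", throw))
```

Because every assertion is ground, a question such as "where was the object thrown from?" reduces to a lookup of the origin role on the throw event.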

Page 6

Target Interpretation

• Sentences in first-order logic
• Capable of supporting machine inference

IF “a person is carrying an entity that is inside a room”
THEN “the person is in the room.”

[Figure: a Carry node with an agent arc to Person and an object arc to Object; the Object is-inside the Room (IF), so the Person is-inside the Room (THEN)]

isa(_Person1, person_n1)
isa(_Room2, room_n1)
isa(_Entity3, entity_n1)
isa(_Carry4, carry_v1)
object(_Carry4, _Entity3)
agent(_Carry4, _Person1)
is-inside(_Entity3, _Room2)
=====>
is-inside(_Person1, _Room2)
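To illustrate what "capable of supporting machine inference" can mean for this rule, here is a small forward-chaining sketch. It assumes a tuple encoding in which strings starting with "?" are variables; the matcher and the instance names in `FACTS` are hypothetical and do not reproduce the system's actual reasoner.

```python
# The IF/THEN rule from the slide, with "?"-prefixed strings as variables.
# This encoding and the tiny matcher below are assumptions for illustration.
RULE_IF = [
    ("isa", "?p", "person_n1"),
    ("isa", "?r", "room_n1"),
    ("isa", "?e", "entity_n1"),
    ("isa", "?c", "carry_v1"),
    ("object", "?c", "?e"),
    ("agent", "?c", "?p"),
    ("is-inside", "?e", "?r"),
]
RULE_THEN = ("is-inside", "?p", "?r")


def match(pattern, fact, bindings):
    """Unify one pattern with one ground fact; return new bindings or None."""
    new = dict(bindings)
    for pat, val in zip(pattern, fact):
        if pat.startswith("?"):
            if new.setdefault(pat, val) != val:
                return None
        elif pat != val:
            return None
    return new


def solve(patterns, facts, bindings=None):
    """Yield every variable binding that satisfies all patterns."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in facts:
        new = match(patterns[0], fact, bindings)
        if new is not None:
            yield from solve(patterns[1:], facts, new)


def apply_rule(facts):
    """Return the conclusions the rule derives from the given facts."""
    return {tuple(b.get(t, t) for t in RULE_THEN)
            for b in solve(RULE_IF, facts)}


# A hypothetical scenario: a person carries an entity that is inside a room.
FACTS = {
    ("isa", "_Person7", "person_n1"),
    ("isa", "_Room9", "room_n1"),
    ("isa", "_Entity8", "entity_n1"),
    ("isa", "_Carry10", "carry_v1"),
    ("object", "_Carry10", "_Entity8"),
    ("agent", "_Carry10", "_Person7"),
    ("is-inside", "_Entity8", "_Room9"),
}
print(apply_rule(FACTS))
```

If the is-inside fact for the entity is removed, the IF part no longer matches and the rule derives nothing.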

Page 7

Overview of Processing

Input: “An object is thrown from a cliff”

[Figure: processing pipeline, supported by Linguistic Knowledge and World Knowledge]

1. Parser & LF Generator
2. Word sense disambiguator
3. Relational disambiguator
4. Coreference identifier
5. Structural reorganizer

Output:

[Figure: a Throw node with an object arc to Object and an origin arc to Cliff]

(_Object13320 instance_of object_n1)
(_Cliff13321 instance_of cliff_n1)
(_Throw13319 instance_of throw_v1)
(_Throw13319 object _Object13320)
(_Throw13319 origin _Cliff13321)

Page 8

Entering Quantified Expressions (Rules)

• Seven “rule templates” used:
– IF sentence THEN sentence
– ABOUT object: sentence
– object IS noun/verb phrase
– BEFORE sentence, sentence
– BEFORE sentence, it is not true that sentence
– AFTER sentence, sentence
– AFTER sentence, it is not true that sentence

Processing:
1. Each sentence processed as a ground assertion
2. Quantifiers are added (Prolog-style)
3. “Action” templates become situation calculus rules
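Steps 1 and 2 can be sketched as follows. In Prolog, the variables of a clause are implicitly universally quantified, so one plausible reading of "Prolog-style" is that the instance constants of the ground assertions are replaced by variables. The function `generalize` and its string rendering are hypothetical; the slides do not show how CPL performs this step.

```python
def generalize(if_facts, then_facts):
    """Turn ground IF/THEN assertions into Prolog-like clause strings.

    Instance constants (names starting with "_") become variables, which a
    Prolog-style reading treats as universally quantified over the clause.
    """
    def var(term):
        # "_Person1" -> "Person1": capitalized names read as variables.
        return term[1:] if term.startswith("_") else term

    def render(fact):
        pred, *args = fact
        return f"{pred}({', '.join(var(a) for a in args)})"

    body = ", ".join(render(f) for f in if_facts)
    return [f"{render(head)} :- {body}." for head in then_facts]


# Ground assertions for: IF a person is carrying an entity that is inside
# a room THEN the person is in the room.
IF_FACTS = [
    ("isa", "_Person1", "person_n1"),
    ("isa", "_Room2", "room_n1"),
    ("isa", "_Entity3", "entity_n1"),
    ("isa", "_Carry4", "carry_v1"),
    ("object", "_Carry4", "_Entity3"),
    ("agent", "_Carry4", "_Person1"),
    ("is-inside", "_Entity3", "_Room2"),
]
THEN_FACTS = [("is-inside", "_Person1", "_Room2")]

for clause in generalize(IF_FACTS, THEN_FACTS):
    print(clause)
```

The output is a readable rendering only; a predicate name such as is-inside would need quoting to be a legal atom in real Prolog.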

Page 9

Overall Flow of Processing

[Figure: flow diagram with the stages Original text, CPL (Controlled English), Logic, and KB, plus feedback labelled “Paraphrase of system’s understanding” and “Rewriting advice”]

Example text shown in the figure:

An object is thrown from a cliff.
The horizontal velocity of the object is 20 m/s.
The top of the cliff is 125 m above level ground.

Page 17

Part II: Discussion

Controlled Languages: Strengths and challenges

Page 18

Strengths…

• CPL is easy to use, appears viable
– built KB with over 1000 rules
– KB is
  • inference-capable
  • easy to inspect and organize
• Makes knowledge entry accessible to many users
– major achievement

“xy B(x)R(x,y)C(y)” ???

“A man is driving a truck towards the factory”

Page 19

Challenges: 1. Reformulating in a Controlled Language is not trivial

• Task is not just grammatical reformulation
• Rather:
– “natural” English leaves much knowledge implicit
– CPL author must make that explicit

Original text:
“attack: intense adverse criticism”

CPL:
“IF a person attacks a 2nd person THEN the first person criticizes the 2nd person intensely.”

Page 20

Challenges: 1. Reformulating in a Controlled Language is not trivial

• Task is not just grammatical reformulation
• Rather:
– “natural” English leaves much knowledge implicit
– CPL author must make that explicit

Original text:
“axis: the center around which something rotates”

CPL:
“IF an object is rotating THEN the object is turning around the object’s axis.”

Page 21

2. Users may not be aware of system’s mistakes

1. User must be able to spot misinterpretations easily
– System’s paraphrase must be unambiguous
2. User must know how to correct them

“The man ate the sandwich on the plate”

“The man ate on the plate. He ate the sandwich.”

??????

Page 22

2. Users may not be aware of their mistakes

• User must be able to spot errors easily
– System’s paraphrase must be unambiguous
• User must know how to correct them

“The man ate the sandwich on the plate”

“The man ate on the plate. He ate the sandwich.”

“the man ate the sandwich that was on the plate”
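The three sentences on this slide can be read as an input with two possible attachments of “on the plate”, a paraphrase that exposes the wrong attachment, and a rewording that forces the intended one. The sketch below assumes that reading. The relation name `is-on`, the instance names, and both helper functions are hypothetical.

```python
# Instance names and the "is-on" relation are assumptions for illustration.
EVENT, FOOD, PLACE = "_Eat1", "_Sandwich2", "_Plate3"


def reading(attach_to):
    """Build the facts for one attachment of the phrase 'on the plate'."""
    return {
        ("agent", EVENT, "_Man4"),
        ("object", EVENT, FOOD),
        ("is-on", attach_to, PLACE),
    }


def paraphrase(facts):
    """Render the facts as text that admits only one attachment."""
    holder = next(a for (p, a, b) in facts if p == "is-on")
    if holder == EVENT:
        # The eating took place on the plate.
        return "The man ate on the plate. He ate the sandwich."
    # The sandwich was on the plate.
    return "The man ate the sandwich that was on the plate."


for target in (EVENT, FOOD):
    print(paraphrase(reading(target)))
```

Each paraphrase maps back to exactly one set of facts, which is what lets the author notice a misinterpretation and reword the input.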

Page 23

3. Natural-Language-based knowledge representations have limited expressivity

“Natural language is very expressive”

• …not to the computer! (Avoid “wishful semantics”)
• Expressiveness =
– the amount the computer understands
– the amount it is able to use to draw conclusions
• Everything else is meaningless to the computer
• e.g., CPL can’t express:
– constraints, defaults, some quantification patterns

Page 24

4. Sometimes, linguistically motivated representations are poor

• Language-based KR:
– Most concepts correspond to words
– Structure of KB will mirror structure of language
• Is this bad? Sometimes…

“… walked for 10 miles”

“Traditional” KR:
distance(_Walk1, _Distance1)
value(_Distance1, 10, mile)

NL-based KR:
distance(_Walk1, _Mile1)
count(_Mile1, 10)
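The practical difference shows up when a reasoner asks "how far?". In this sketch the two representations need different lookup code to return the same answer. The tuple encoding, the helper names, and the `isa(_Mile1, mile_n1)` fact are assumptions; the slide does not show that fact.

```python
TRADITIONAL = {
    ("distance", "_Walk1", "_Distance1"),
    ("value", "_Distance1", 10, "mile"),
}
NL_BASED = {
    ("distance", "_Walk1", "_Mile1"),
    ("count", "_Mile1", 10),
    ("isa", "_Mile1", "mile_n1"),  # assumed; not shown on the slide
}


def distance_traditional(facts, event):
    """Read (number, unit) from a value(quantity, number, unit) assertion."""
    for f in facts:
        if f[0] == "distance" and f[1] == event:
            for g in facts:
                if g[0] == "value" and g[1] == f[2]:
                    return (g[2], g[3])
    return None


def distance_nl(facts, event):
    """Recover (number, unit) from a counted unit instance."""
    for f in facts:
        if f[0] == "distance" and f[1] == event:
            node = f[2]
            count = next((g[2] for g in facts
                          if g[0] == "count" and g[1] == node), None)
            sense = next((g[2] for g in facts
                          if g[0] == "isa" and g[1] == node), None)
            if count is not None and sense is not None:
                # "mile_n1" -> "mile": strip the word-sense suffix.
                return (count, sense.rsplit("_", 1)[0])
    return None


print(distance_traditional(TRADITIONAL, "_Walk1"))
print(distance_nl(NL_BASED, "_Walk1"))
```

In the NL-based form the unit is an object that happens to be counted, so the quantity has to be reassembled from two assertions before any arithmetic can use it.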

Page 25

5. (Lack of) Canonicalization

• Many ways to say the same thing
• System needs to realize the equivalence
BUT: often NL-based KRs will not

“conducting a test of an entity”
“testing an entity”

Solutions:
• Add equivalence rules. (But there are lots!!)
– e.g., “Conducting a X of Y ↔ Xing a Y”
• Have the interpreter normalize the input.
• Restrict the input language.
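The second solution, having the interpreter normalize the input, can be sketched for the one equivalence given on the slide. The fact encoding, the `of` relation, and the `NOMINALIZATIONS` table are hypothetical; a real system would need many such rules.

```python
# Hypothetical map from a noun sense to the verb sense it nominalizes.
NOMINALIZATIONS = {"test_n1": "test_v1"}


def normalize(facts):
    """Rewrite 'conduct an X of Y' structures into 'X a Y' structures."""
    facts = set(facts)
    conducts = [a for (p, a, b) in facts if p == "isa" and b == "conduct_v1"]
    for c in conducts:
        for (p, a, x) in list(facts):
            if p != "object" or a != c:
                continue
            noun = next((b for (q, n, b) in facts
                         if q == "isa" and n == x), None)
            target = next((b for (q, n, b) in facts
                           if q == "of" and n == x), None)
            if noun in NOMINALIZATIONS and target is not None:
                # Drop the conduct event and promote the noun to an event.
                facts -= {("isa", c, "conduct_v1"), ("object", c, x),
                          ("isa", x, noun), ("of", x, target)}
                facts |= {("isa", x, NOMINALIZATIONS[noun]),
                          ("object", x, target)}
    return facts


# "conducting a test of an entity"
CONDUCTING = {
    ("isa", "_Conduct1", "conduct_v1"),
    ("object", "_Conduct1", "_Test2"),
    ("isa", "_Test2", "test_n1"),
    ("of", "_Test2", "_Entity3"),
    ("isa", "_Entity3", "entity_n1"),
}
# "testing an entity" (instance names reused so the two sets can be compared)
TESTING = {
    ("isa", "_Test2", "test_v1"),
    ("object", "_Test2", "_Entity3"),
    ("isa", "_Entity3", "entity_n1"),
}

print(normalize(CONDUCTING) == TESTING)
```

Input that is already in the canonical form passes through unchanged, so the normalizer can be applied to every sentence.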

Page 26

Summary

• CPL = a restricted English language for knowledge
– Hits “sweet spot” between logic and full NLP
– Produces inference-capable representations
– Is viable, used to build a large KB
• But: No “free lunch”
– requires skill to use it effectively
• NL-based KRs are becoming more important!
– Web: need semantically meaningful annotations
– AI: need better knowledge acquisition tools

• Some exciting possibilities ahead (esp. at Boeing!)