Semantic Enrichment of Text with Background Knowledge Anselmo Peñas NLP & IR Group UNED nlp.uned.es Eduard Hovy USC / ISI isi.edu


Text omits information

San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.


Make implicit information explicit

Implicit | (More) explicit
San Francisco's Eric Davis | Eric Davis plays for San Francisco; E.D. is a player, S.F. is a team
Eric Davis intercepted pass1 | -
Steve Walsh pass1 | Steve Walsh threw pass1; Steve Walsh threw interception1 …
Young touchdown pass2 | Young completed pass2 for touchdown …
touchdown pass2 to Brent Jones | Brent Jones caught pass2 for touchdown

San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.


Goals

General goal: automatic recovery of such omitted information

Enrichment is the process of adding explicitly to a text’s representation the information that is either implicit or missing in the text


The enrichment cycle

Cycle:
1. Read text from collection
2. Ruminate in BKB
3. Enrich text representation
4. Repeat

[Figure: cycle diagram with the labels Domain Docs, Reading, Background Knowledge Base, Rumination, Enrichment]


Goals

Specific goals of this work

Explore the idea of using “Proposition Stores” as Background Knowledge for enrichment

Explore procedures for enrichment

Determine the kinds of knowledge that Proposition Stores must include to enable enrichment


Outline

1. Intro
2. BKB
3. Enrichment
4. Features of BKBs for Enrichment
5. Conclusion


Elements in our BKB

Entities
• Classes: not limited to a predefined set
• Instances: proper nouns (in this first approach)
• Class:has-instance:Instance relations

Propositions: predefined syntactic structures
• NV, NVPN
• NVN, NVNPN
• NPN, AN
• …


Extraction of propositions

Patterns over dependency trees

prop(Type, Form : DependencyConstraints : NodeConstraints).

Examples:

prop(nv, [N,V] : [V:N:nsubj, not(V:_:'dobj')] : [verb(V)]).

prop(nvnpn, [N1,V,N2,P,N3]:[V:N2:'dobj', V:N3:Prep, subj(V,N1)]:[prep(Prep,P)]).

prop(has_value, [N,Val]:[N:Val:_]:[nn(N), cd(Val), not(lemma(Val,'one'))]).
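
A minimal Python sketch of the same idea, under stated assumptions: the parse is given as (head, dependent, relation) triples, the sentence and tags are invented for illustration, and the NVN rule is written by analogy with the nv pattern above rather than copied from the original rule set.

```python
from collections import Counter

# A parsed text as (head, dependent, relation) triples plus POS tags.
# The example sentence and the tag names are illustrative assumptions.
edges = [
    ("throw", "quarterback", "nsubj"),
    ("throw", "pass", "dobj"),
    ("celebrate", "team", "nsubj"),
]
pos = {
    "throw": "VB", "celebrate": "VB",
    "quarterback": "NN", "pass": "NN", "team": "NN",
}


def extract(edges, pos):
    """Return (type, form) propositions for the NV and NVN patterns."""
    props = []
    for head, dep, rel in edges:
        # Both patterns start from a verb with a nominal subject.
        if rel != "nsubj" or not pos[head].startswith("VB"):
            continue
        objects = [d for h, d, r in edges if h == head and r == "dobj"]
        if objects:
            # NVN: subject, verb, direct object.
            props.extend(("nvn", (dep, head, obj)) for obj in objects)
        else:
            # NV: subject and verb, only when the verb has no direct object.
            props.append(("nv", (dep, head)))
    return props


# Counting repeated propositions over a collection gives the stored frequencies.
counts = Counter(extract(edges, pos))
```

Here `counts` holds one NVN proposition (quarterback, throw, pass) and one NV proposition (team, celebrate).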


Background Knowledge Base (NFL, US football)

?> NN NNP:'pass'
NN 24 'Marino':'pass'
NN 17 'Kelly':'pass'
NN 15 'Elway':'pass'

?> X:has-instance:'Marino'
20 'quarterback':has-instance:'Marino'
6 'passer':has-instance:'Marino'
4 'leader':has-instance:'Marino'
3 'veteran':has-instance:'Marino'
2 'player':has-instance:'Marino'

?> NPN 'pass':X:'touchdown'
NPN 712 'pass':'for':'touchdown'
NPN 24 'pass':'include':'touchdown'

?> NVN 'quarterback':X:'pass'
NVN 98 'quarterback':'throw':'pass'
NVN 27 'quarterback':'complete':'pass'

?> NVNPN 'NNP':X:'pass':Y:'touchdown'
NVNPN 189 'NNP':'catch':'pass':'for':'touchdown'
NVNPN 26 'NNP':'complete':'pass':'for':'touchdown'
…

?> NVN 'end':X:'pass'
NVN 28 'end':'catch':'pass'
NVN 6 'end':'drop':'pass'
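
The queries above can be mimicked with a small in-memory store. This is a sketch only: the class name `PropositionStore`, its methods, and the use of `None` as the wildcard are assumptions made for illustration, not the interface of the original BKB. The counts are the ones listed on this slide.

```python
from collections import Counter

X = None  # wildcard slot in a query pattern (an assumption of this sketch)


class PropositionStore:
    """Hypothetical store mapping (type, form) to an occurrence count."""

    def __init__(self):
        self.counts = Counter()

    def add(self, ptype, form, n=1):
        self.counts[(ptype, tuple(form))] += n

    def query(self, ptype, pattern):
        """Return (count, form) pairs matching the pattern, most frequent first."""
        hits = [
            (n, form)
            for (t, form), n in self.counts.items()
            if t == ptype
            and len(form) == len(pattern)
            and all(p is X or p == f for p, f in zip(pattern, form))
        ]
        return sorted(hits, key=lambda hit: -hit[0])


store = PropositionStore()
store.add("nvn", ("quarterback", "throw", "pass"), 98)
store.add("nvn", ("quarterback", "complete", "pass"), 27)
store.add("nvn", ("end", "catch", "pass"), 28)
store.add("nvn", ("end", "drop", "pass"), 6)
store.add("npn", ("pass", "for", "touchdown"), 712)
store.add("npn", ("pass", "include", "touchdown"), 24)

# Equivalent of: ?> NVN 'quarterback':X:'pass'
result = store.query("nvn", ("quarterback", X, "pass"))
```

With this data, `result` lists throw (98) before complete (27), matching the order shown on the slide.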


Enrichment example (1)

…to set up a 7-yard Young touchdown pass to Brent Jones

[Figure: dependency graph over the nodes pass, Young, touchdown, Jones, with edge labels nn, nn, to]

Young pass
?> X:has-instance:Young
X=quarterback
?> NVN:quarterback:X:pass
X=throw
X=complete

pass to Jones
?> X:has-instance:Jones
X=end
?> NVN:end:X:pass
X=catch
X=drop
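
The two-step lookup in this example (instance to class, then class to candidate verbs) might be sketched as follows. The dictionaries and the function name `candidate_verbs` are hypothetical; the class assignments come from this slide and the NVN counts from the BKB slide.

```python
# Class found for each instance in this example (from the slide).
INSTANCE_CLASS = {"Young": "quarterback", "Jones": "end"}

# NVN counts taken from the BKB slide.
NVN = {
    ("quarterback", "throw", "pass"): 98,
    ("quarterback", "complete", "pass"): 27,
    ("end", "catch", "pass"): 28,
    ("end", "drop", "pass"): 6,
}


def candidate_verbs(instance, noun):
    """Verbs linking the instance's class to the noun, most frequent first."""
    cls = INSTANCE_CLASS.get(instance)
    hits = [(n, verb) for (c, verb, obj), n in NVN.items() if c == cls and obj == noun]
    return [verb for n, verb in sorted(hits, reverse=True)]


young = candidate_verbs("Young", "pass")
jones = candidate_verbs("Jones", "pass")
```

For this data, `young` is `['throw', 'complete']` and `jones` is `['catch', 'drop']`; an instance with no known class yields an empty list.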


Enrichment example (2)

[Figure: the same graph with edge labels throw/complete, nn, catch/drop]

touchdown pass
?> NVN touchdown:X:pass
False
?> NPN pass:X:touchdown
X=for

…to set up a 7-yard Young touchdown pass to Brent Jones


Enrichment example (3)

[Figure: the same graph with edge labels throw/complete, for, catch/drop]

?> NVNPN NAME:X:pass:for:touchdown
X=complete
X=catch

…to set up a 7-yard Young touchdown pass to Brent Jones


Enrichment example (4)

[Figure: the same graph with edge labels complete, for, catch]

Young complete pass for touchdown
Jones catch pass for touchdown

…to set up a 7-yard Young touchdown pass to Brent Jones


Enrichment

Build context for instances
Build context for dependencies
• Finding prepositions
• Finding verbs
Constrain interpretations
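
The "constrain interpretations" step can be read as intersecting candidate sets obtained from different syntactic structures. The sketch below assumes that reading; the function name `constrain` is hypothetical, the candidate verbs are those of examples (1) to (4), and the NVNPN counts come from the BKB slide.

```python
# Candidate verbs per entity from the NVN queries of example (1).
young_verbs = {"throw", "complete"}
jones_verbs = {"catch", "drop"}

# Verbs attested in NVNPN NAME:X:pass:for:touchdown, with counts from the BKB slide.
nvnpn_verbs = {"catch": 189, "complete": 26}


def constrain(candidates, evidence):
    """Keep only the candidates that the second structure also supports."""
    return sorted(verb for verb in candidates if verb in evidence)


young_final = constrain(young_verbs, nvnpn_verbs)
jones_final = constrain(jones_verbs, nvnpn_verbs)
```

This leaves complete for Young and catch for Jones, the interpretation shown in example (4).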


Enrichment example (5)

San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.

[Figures: the sentence's dependency graph before enrichment and after enrichment; labels recovered from the figures: for, throw, catch, complete]


What do BKBs need for enrichment? (1)

Ability to answer about instances
• Not complete population
• But allow analogy

Ability to constrain interpretations and accumulate evidence
• Several different queries over the same elements considering different syntactic structures
• Require normalization (and parsing)


What do BKBs need for enrichment? (1)

Ability to discover entity classes with appropriate granularity level
• Quarterbacks throw passes
• Ends catch passes
• Tagging an entity as person or even player is not specific enough for enrichment

Text frequently introduces the relevant class (appropriate granularity level) for understanding


What do BKBs need for enrichment? (2)

Ability to digest enough knowledge adapted to the domain
• Crucial

Approaches
• Macro-reading (web scale) + domain adaptation
  • Shallow NLP, lack of normalization
• Reading in context (suggested here)
  • Domain partitioning
  • Deeper NLP, specific domain NLP


Digest enough knowledge

DART: general domain propositions store
TextRunner: general domain (web-scale)
BKB: specific domain propositions store (only 30,000 docs)

?> quarterback:X:pass
DART: (no results)
TextRunner: (~200) threw, (~100) completed, (36) to throw, (26) has thrown, (19) makes, (19) has, (18) fires
BKB (US Football): (99) throw, (25) complete, (7) have, (5) attempt, (5) not-throw, (4) toss, (3) release


Digest Knowledge in the domain (entity classes)

?> X:intercept:pass
DART: (13) person, (6) person/place/organization, (2) full-back, (1) place
TextRunner: (30) Early, (26) Two plays, (24) fumble, (20) game, (20) ball, (17) Defensively
BKB (US Football): (75) person, (14) cornerback, (11) defense, (8) safety, (7) group, (5) linebacker


Digest Knowledge in the domain (ambiguity problem)

?> person:X:pass
DART: (47) make, (45) take, (36) complete, (30) throw, (25) let, (23) catch, (1) make, (1) expect
TextRunner: (22) gets, (17) makes, (10) has, (10) receives, (7) who has, (7) must have, (6) acting on, (6) to catch, (6) who buys, (5) bought, (5) admits, (5) gives
BKB (US Football): (824) catch, (546) throw, (256) complete, (136) have, (59) intercept, (56) drop, (39) not-catch, (37) not-throw, (36) snare, (27) toss, (23) pick off, (20) run


Domain issue

?> person:X:pass

NFL Domain
905:nvn:[person:n, catch:v, pass:n].
667:nvn:[person:n, throw:v, pass:n].
286:nvn:[person:n, complete:v, pass:n].
204:nvnpn:[person:n, catch:v, pass:n, for:in, yard:n].
85:nvnpn:[person:n, catch:v, pass:n, for:in, touchdown:n].

IC Domain
6:nvn:[person:n, have:v, pass:n]
3:nvn:[person:n, see:v, pass:n]
1:nvnpn:[person:n, wear:v, pass:n, around:in, neck:n]

BIO Domain
<No results>
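
One way to compare domains is to turn the raw counts into relative frequencies. The sketch below uses only the NVN rows listed on this slide, so the shares are relative to those rows rather than to each full store; the names `DOMAINS` and `distribution` are hypothetical.

```python
# NVN counts for person:X:pass, copied from the rows shown on this slide.
DOMAINS = {
    "NFL": {"catch": 905, "throw": 667, "complete": 286},
    "IC": {"have": 6, "see": 3},
    "BIO": {},
}


def distribution(counts):
    """Relative frequency of each verb among the listed rows."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {verb: n / total for verb, n in counts.items()}


shares = {domain: distribution(counts) for domain, counts in DOMAINS.items()}
```

The same query then yields a football-specific verb distribution in the NFL domain, a generic one in the IC domain, and nothing in the BIO domain.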


Domain issue

?> X:receive:Y

NFL Domain
55:nvn:[person:n, receive:v, call:n].
34:nvn:[person:n, receive:v, offer:n].
33:nvn:[person:n, receive:v, bonus:n].
29:nvn:[team:class, receive:v, pick:n].

IC Domain
78 nvn:[person:n, receive:v, call:n]
44 nvn:[person:n, receive:v, letter:n]
35 nvn:[group:n, receive:v, information:n]
31 nvn:[person:n, receive:v, training:n]

BIO Domain
24 nvn:[patients:n, receive:v, treatment:n]
14 nvn:[patients:n, receive:v, therapy:n]
13 nvn:[patients:n, receive:v, care:n]


Conclusions

Limiting to a specific domain provides some powerful benefits
• Ambiguity is reduced
• Higher density of relevant propositions
• Different distribution of propositions across domains
• Amount of source text is reduced, allowing deeper processing such as parsing
• Specific tools for specific domains

Proposition stores seem to be useful
• Improve parsing, coreference, WSD, …

We presented a new application: ENRICHMENT


Current work

Develop automatic procedures for Enrichment

Need better Proposition Stores
• Selectional Preferences
• Lexical relatedness
• Structural / frame transformations
• …


Future work

Develop appropriate methodologies for evaluation
Intrinsic?
Extrinsic: QA over single documents?
• Reading comprehension tests?

Thanks!


NVN 3 'quarterback':'find':'receiver'
NVNPN 3 'quarterback':'throw':'pass':'to':'receiver'
NVNPN 2 'quarterback':'complete':'pass':'to':'receiver'
NVNPN 1 'receiver':'catch':'pass':'from':'quarterback'

nvn:('NNP':'quarterback'):'hit':('NNP':'receiver'),177).
nvnpn:('NNP':'quarterback'):'throw':'pass':'to':('NNP':'receiver'),143).
nvnpn:('NNP':'quarterback'):'complete':'pass':'to':('NNP':'receiver'),79).
nvn:('NNP':'quarterback'):'find':('NNP':'receiver'),69).
nvnpn:('NNP':'receiver'):'catch':'pass':'from':('NNP':'quarterback'),43).