Semantic Enrichment of Text with Background Knowledge
Anselmo Peñas
NLP & IR Group, UNED
nlp.uned.es
Eduard Hovy USC / ISI
isi.edu
UNED
nlp.uned.es
Text omits information
San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.
Make explicit implicit information
Implicit | (More) explicit
San Francisco's Eric Davis | Eric Davis plays for San Francisco; E.D. is a player, S.F. is a team
Eric Davis intercepted pass1 | -
Steve Walsh pass1 | Steve Walsh threw pass1; Steve Walsh threw interception1 …
Young touchdown pass2 | Young completed pass2 for touchdown …
touchdown pass2 to Brent Jones | Brent Jones caught pass2 for touchdown
San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.
Goals
General goal: automatic recovery of such omitted information
Enrichment is the process of adding explicitly to a text’s representation the information that is either implicit or missing in the text
The enrichment cycle
Cycle:
1. Read text from collection
2. Ruminate in BKB
3. Enrich text representation
4. Repeat
[Diagram labels: Domain Docs., Reading, Background Knowledge Base, Rumination, Enrichment]
Goals
Specific goals of this work
Explore the idea of using “Proposition Stores” as Background Knowledge for enrichment
Explore procedures for enrichment
Determine the kinds of knowledge that Proposition Stores must include to enable enrichment
Elements in our BKB
Entities
• Classes: not limited to a predefined set
• Instances: proper nouns (in this first approach)
• Class:has-instance:Instance relations
Propositions: predefined syntactic structures
• NV, NVPN
• NVN, NVNPN
• NPN, AN
• …
Extraction of propositions
Patterns over dependency trees
prop( Type, Form : DependencyConstraints : NodeConstraints ).
Examples:
prop(nv, [N,V] : [V:N:nsubj, not(V:_:'dobj')] : [verb(V)]).
prop(nvnpn, [N1,V,N2,P,N3]:[V:N2:'dobj', V:N3:Prep, subj(V,N1)]:[prep(Prep,P)]).
prop(has_value, [N,Val]:[N:Val:_]:[nn(N), cd(Val), not(lemma(Val,'one'))]).
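The nv pattern above reads as: a verb V with an nsubj dependent N and no dobj dependent. A minimal Python sketch of that reading follows. The names Edge and extract_nv_nvn and the toy edges are hypothetical, and the nvn branch is an assumption by analogy, since the slides do not show that pattern's definition. This is an illustration, not the authors' extractor.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    head: str  # governor lemma
    dep: str   # dependent lemma
    rel: str   # dependency label, e.g. "nsubj" or "dobj"


def extract_nv_nvn(edges, verbs):
    """Return NV and NVN propositions found in one sentence's dependency edges."""
    props = []
    for v in verbs:
        subjects = [e.dep for e in edges if e.head == v and e.rel == "nsubj"]
        objects = [e.dep for e in edges if e.head == v and e.rel == "dobj"]
        for s in subjects:
            if objects:
                # Assumed NVN reading: subject and direct object of the same verb.
                props.extend(("nvn", (s, v, o)) for o in objects)
            else:
                # Mirrors the nv pattern: a subject and no direct object.
                props.append(("nv", (s, v)))
    return props


# Hypothetical edges for "Davis intercepted a Walsh pass".
edges = [
    Edge("intercept", "Davis", "nsubj"),
    Edge("intercept", "pass", "dobj"),
    Edge("pass", "Walsh", "nn"),
]
props = extract_nv_nvn(edges, verbs=["intercept"])
print(props)  # [('nvn', ('Davis', 'intercept', 'pass'))]
```

Counting how often each extracted tuple occurs across the collection would then give the frequencies stored in the BKB.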
Background Knowledge Base (NFL, US football)
?> NN NNP:'pass'
NN 24 'Marino':'pass'
NN 17 'Kelly':'pass'
NN 15 'Elway':'pass'
…
?> X:has-instance:'Marino'
20 'quarterback':has-instance:'Marino'
6 'passer':has-instance:'Marino'
4 'leader':has-instance:'Marino'
3 'veteran':has-instance:'Marino'
2 'player':has-instance:'Marino'
?> NPN 'pass':X:'touchdown'
NPN 712 'pass':'for':'touchdown'
NPN 24 'pass':'include':'touchdown'
…
?> NVN 'quarterback':X:'pass'
NVN 98 'quarterback':'throw':'pass'
NVN 27 'quarterback':'complete':'pass'
…
?> NVNPN 'NNP':X:'pass':Y:'touchdown'
NVNPN 189 'NNP':'catch':'pass':'for':'touchdown'
NVNPN 26 'NNP':'complete':'pass':'for':'touchdown'
…
?> NVN 'end':X:'pass'
NVN 28 'end':'catch':'pass'
NVN 6 'end':'drop':'pass'
…
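Queries like these can be mimicked with a small in-memory store. The sketch below is a guess at the behaviour, not the actual BKB implementation: the STORE layout and the query function are hypothetical, None plays the role of the variable X, and the counts are copied from the NFL examples on this slide.

```python
# Proposition tuples mapped to counts (counts taken from the slide).
STORE = {
    ("NVN", "quarterback", "throw", "pass"): 98,
    ("NVN", "quarterback", "complete", "pass"): 27,
    ("NVN", "end", "catch", "pass"): 28,
    ("NVN", "end", "drop", "pass"): 6,
    ("NPN", "pass", "for", "touchdown"): 712,
}


def query(store, pattern):
    """Match a pattern tuple against the store; None is a wildcard."""
    hits = [
        (count, prop)
        for prop, count in store.items()
        if len(prop) == len(pattern)
        and all(p is None or p == q for p, q in zip(pattern, prop))
    ]
    # Most frequent propositions first, as in the slide's listings.
    return sorted(hits, reverse=True)


for count, prop in query(STORE, ("NVN", "quarterback", None, "pass")):
    print(prop[0], count, ":".join(prop[1:]))
# NVN 98 quarterback:throw:pass
# NVN 27 quarterback:complete:pass
```

A query with no matching proposition, such as ("NVN", "touchdown", None, "pass"), returns an empty list.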
Enrichment example (1)
…to set up a 7-yard Young touchdown pass to Brent Jones
[Dependency graph: pass -(nn)- Young; pass -(nn)- touchdown; pass -(to)- Jones]
Young pass
?> X:has-instance:Young
X=quarterback
?> NVN:quarterback:X:pass
X=throw
X=complete
pass to Jones
?> X:has-instance:Jones
X=end
?> NVN:end:X:pass
X=catch
X=drop
Enrichment example (2)
[Dependency graph: pass -(throw, complete)- Young; pass -(nn)- touchdown; pass -(catch, drop)- Jones]
touchdown pass
?> NVN touchdown:X:pass
False
?> NPN pass:X:touchdown
X=for
…to set up a 7-yard Young touchdown pass to Brent Jones
Enrichment example (3)
[Dependency graph: pass -(throw, complete)- Young; pass -(for)- touchdown; pass -(catch, drop)- Jones]
?> NVNPN NAME:X:pass:for:touchdown
X=complete
X=catch
…to set up a 7-yard Young touchdown pass to Brent Jones
Enrichment example (4)
[Dependency graph: pass -(complete)- Young; pass -(for)- touchdown; pass -(catch)- Jones]
Young complete pass for touchdown
Jones catch pass for touchdown
…to set up a 7-yard Young touchdown pass to Brent Jones
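Examples (1) to (4) can be read as a chain of lookups followed by an intersection: find the instance's class, collect the verbs that class takes with "pass", find the preposition linking "pass" and "touchdown", then keep only the verbs that also appear in the longer NVNPN propositions. The sketch below follows that reading. All table and function names are hypothetical, each instance is mapped to a single class for simplicity, and the values are the ones quoted on the slides.

```python
# Toy background knowledge with the answers shown in examples (1) to (3).
HAS_INSTANCE = {"Young": "quarterback", "Jones": "end"}
NVN_VERBS = {  # (class, object) -> verbs seen in NVN propositions
    ("quarterback", "pass"): {"throw", "complete"},
    ("end", "pass"): {"catch", "drop"},
}
NPN_PREPS = {("pass", "touchdown"): {"for"}}
NVNPN_VERBS = {  # (object, prep, noun) -> verbs seen with a proper-noun subject
    ("pass", "for", "touchdown"): {"complete", "catch"},
}


def enrich(instance, obj, modifier):
    """Return (instance, verb, obj, prep, modifier) readings backed by every lookup."""
    cls = HAS_INSTANCE[instance]
    candidates = NVN_VERBS.get((cls, obj), set())
    readings = []
    for prep in sorted(NPN_PREPS.get((obj, modifier), set())):
        allowed = NVNPN_VERBS.get((obj, prep, modifier), set())
        # Constrain interpretations: keep verbs supported by both structures.
        for verb in sorted(candidates & allowed):
            readings.append((instance, verb, obj, prep, modifier))
    return readings


print(enrich("Young", "pass", "touchdown"))
# [('Young', 'complete', 'pass', 'for', 'touchdown')]
print(enrich("Jones", "pass", "touchdown"))
# [('Jones', 'catch', 'pass', 'for', 'touchdown')]
```

The intersection is what drops "throw" for Young and "drop" for Jones, matching the result in example (4).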
Enrichment
Build context for instances
Build context for dependencies
• Finding prepositions
• Finding verbs
Constrain interpretations
Enrichment example (5)
San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.
[Figure: dependency graph of the sentence before enrichment and after enrichment; labels surviving from the enriched graph: for, throw, catch, complete]
What BKBs need for enrichment? (1)
Ability to answer about instances
• Not complete population
• But allow analogy
Ability to constrain interpretations and accumulate evidence
• Several different queries over the same elements considering different syntactic structures
• Require normalization (and parsing)
What BKBs need for enrichment? (1)
Ability to discover entity classes with appropriate granularity level
• Quarterbacks throw passes
• Ends catch passes
• Tagging an entity as person or even player is not specific enough for enrichment
Text frequently introduces the relevant class (appropriate granularity level) for understanding
What BKBs need for enrichment? (2)
Ability to digest enough knowledge adapted to the domain
• Crucial
Approaches
• Macro-reading (web scale) + domain adaptation
  • Shallow NLP, lack of normalization
• Reading in context (suggested here)
  • Domain partitioning
  • Deeper NLP, specific domain NLP
Digest enough knowledge
DART: general domain propositions store
TextRunner: general domain (web-scale)
BKB: specific domain propositions store (only 30,000 docs)
?> quarterback:X:pass
DART: (no results)
TextRunner: (~200) threw, (~100) completed, (36) to throw, (26) has thrown, (19) makes, (19) has, (18) fires
BKB (US Football): (99) throw, (25) complete, (7) have, (5) attempt, (5) not-throw, (4) toss, (3) release
Digest Knowledge in the domain (entity classes)
?> X:intercept:pass
DART: (13) person, (6) person/place/organization, (2) full-back, (1) place
TextRunner: (30) Early, (26) Two plays, (24) fumble, (20) game, (20) ball, (17) Defensively
BKB (US Football): (75) person, (14) cornerback, (11) defense, (8) safety, (7) group, (5) linebacker
Digest Knowledge in the domain (ambiguity problem)
?> person:X:pass
DART: (47) make, (45) take, (36) complete, (30) throw, (25) let, (23) catch, (1) make, (1) expect
TextRunner: (22) gets, (17) makes, (10) has, (10) receives, (7) who has, (7) must have, (6) acting on, (6) to catch, (6) who buys, (5) bought, (5) admits, (5) gives
BKB (US Football): (824) catch, (546) throw, (256) complete, (136) have, (59) intercept, (56) drop, (39) not-catch, (37) not-throw, (36) snare, (27) toss, (23) pick off, (20) run
Domain issue
?> person:X:pass
NFL Domain
905:nvn:[person:n, catch:v, pass:n].
667:nvn:[person:n, throw:v, pass:n].
286:nvn:[person:n, complete:v, pass:n].
204:nvnpn:[person:n, catch:v, pass:n, for:in, yard:n].
85:nvnpn:[person:n, catch:v, pass:n, for:in, touchdown:n].
IC Domain
6:nvn:[person:n, have:v, pass:n]
3:nvn:[person:n, see:v, pass:n]
1:nvnpn:[person:n, wear:v, pass:n, around:in, neck:n]
BIO Domain
<No results>
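One way to make the contrast between domains concrete is to compare, per domain, how much evidence exists and which verb dominates. The sketch below does this for the nvn rows of the person:X:pass query only. The table and function names are hypothetical, and the shares are computed over the rows listed on this slide, not over the full stores.

```python
# nvn counts for ?> person:X:pass, copied from the slide.
NVN_PERSON_X_PASS = {
    "NFL": {"catch": 905, "throw": 667, "complete": 286},
    "IC": {"have": 6, "see": 3},
    "BIO": {},
}


def verb_shares(counts):
    """Relative frequency of each verb among the listed propositions."""
    total = sum(counts.values())
    return {verb: n / total for verb, n in counts.items()} if total else {}


for domain, counts in NVN_PERSON_X_PASS.items():
    shares = verb_shares(counts)
    top = max(shares, key=shares.get) if shares else None
    # Prints the domain, its most frequent verb, and the total evidence.
    print(domain, top, sum(counts.values()))
```

In the NFL rows "pass" is something a person catches or throws, with well over a thousand observations; in the IC rows it is something a person has or sees, with fewer than ten.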
Domain issue
?> X:receive:Y
NFL Domain
55:nvn:[person:n, receive:v, call:n].
34:nvn:[person:n, receive:v, offer:n].
33:nvn:[person:n, receive:v, bonus:n].
29:nvn:[team:class, receive:v, pick:n].
IC Domain
78 nvn:[person:n, receive:v, call:n]
44 nvn:[person:n, receive:v, letter:n]
35 nvn:[group:n, receive:v, information:n]
31 nvn:[person:n, receive:v, training:n]
BIO Domain
24 nvn:[patients:n, receive:v, treatment:n]
14 nvn:[patients:n, receive:v, therapy:n]
13 nvn:[patients:n, receive:v, care:n]
Conclusions
Limiting to a specific domain provides some powerful benefits
• Ambiguity is reduced
• Higher density of relevant propositions
• Different distribution of propositions across domains
• Amount of source text is reduced, allowing deeper processing such as parsing
• Specific tools for specific domains
Proposition stores seem to be useful
• Improve parsing, coref, WSD, …
We presented a new application: ENRICHMENT
Current work
Develop automatic procedures for Enrichment
Need better Proposition Stores
• Selectional Preferences
• Lexical relatedness
• Structural / frame transformations
• …
Future work
Develop appropriate methodologies for evaluation
• Intrinsic?
• Extrinsic: QA over single documents?
• Reading comprehension tests?
NVN 3 'quarterback':'find':'receiver'
NVNPN 3 'quarterback':'throw':'pass':'to':'receiver'
NVNPN 2 'quarterback':'complete':'pass':'to':'receiver'
NVNPN 1 'receiver':'catch':'pass':'from':'quarterback'
nvn:('NNP':'quarterback'):'hit':('NNP':'receiver'),177).
nvnpn:('NNP':'quarterback'):'throw':'pass':'to':('NNP':'receiver'),143).
nvnpn:('NNP':'quarterback'):'complete':'pass':'to':('NNP':'receiver'),79).
nvn:('NNP':'quarterback'):'find':('NNP':'receiver'),69).
nvnpn:('NNP':'receiver'):'catch':'pass':'from':('NNP':'quarterback'),43).