Extreme underspecification Using semantics to integrate deep and shallow processing

Page 1: Extreme underspecification Using semantics to integrate deep and shallow processing

Extreme underspecification

Using semantics to integrate deep and shallow processing

Page 2: Extreme underspecification Using semantics to integrate deep and shallow processing

Acknowledgements
• Alex Lascarides, Ted Briscoe, Simone Teufel, Dan Flickinger, Stephan Oepen, John Carroll, Anna Ritchie, Ben Waldron
• Deep Thought project members
• Cambridge Masters students …
• Other colleagues at Cambridge, Saarbrücken, Edinburgh, Brighton, Sussex and Oxford

Page 3: Extreme underspecification Using semantics to integrate deep and shallow processing

Talk overview
• Why integrate deep and shallow processing? And why use compositional semantics?
• Semantics from shallow processing
• Flattening deep semantics
• Underspecification
• Minimal semantic units
• Composition without lambdas
• Integration experiments with broad-coverage systems/grammars (LinGO ERG and RASP)
• How does this fit with deeper semantics?

Page 4: Extreme underspecification Using semantics to integrate deep and shallow processing

Deep processing
• Detailed, linguistically-motivated, e.g., HPSG, LFG, TAG, varieties of CG
• Precise; detailed compositional semantics possible; generation as well as parsing
• Some are broad coverage and fast enough for real time applications
• BUT: not robust (coverage gaps, ill-formed input), too slow for IE etc, massive ambiguity

Page 5: Extreme underspecification Using semantics to integrate deep and shallow processing

Shallow (and intermediate) processing
• Shallow: e.g., POS tagging, NP chunking
• Intermediate: e.g., grammars with only a POS tag lexicon (RASP)
• Fast; robust; integrated stochastic techniques for disambiguation
• BUT: no long-distance dependencies, allow ungrammatical input (so limitations for generation), no conventional semantics without subcategorization

Page 6: Extreme underspecification Using semantics to integrate deep and shallow processing

Why integrate deep and shallow processing?
• Complementary strengths and weaknesses
• Weaknesses of each are inherent: more complexity means larger search space, greater information requirement
  • hand-coding vs machine learning is not the main issue – treebanking costs, sparse data problems
• Lexicon is the crucial resource difference between deep and shallow approaches

Page 7: Extreme underspecification Using semantics to integrate deep and shallow processing

Applications that may benefit from integrated approaches
• Summarization: shallow parsing to identify possible key passages, deep processing to check and combine
• Email response: deep parser uses shallow parsing for disambiguation, back off when parse failure
• Information extraction: shallow first (as summarization), named entities
• Question answering: deep parse questions, shallow parse answers

Page 8: Extreme underspecification Using semantics to integrate deep and shallow processing

Compositional semantics as the common representation
• Need a common representation language: pairwise compatibility between systems is too limiting
• Syntax is theory-specific
• Eventual goal should be semantics
• Crucial idea: shallow processing gives underspecified semantic representation

Page 9: Extreme underspecification Using semantics to integrate deep and shallow processing

Shallow processing and underspecified semantics
• Integrated parsing: shallow parsed phrases incorporated into deep parsed structures
• Deep parsing invoked incrementally in response to information needs
• Reuse of knowledge sources: domain knowledge, recognition of named entities, transfer rules in MT
• Integrated generation
• Formal properties clearer, representations more generally usable

Page 10: Extreme underspecification Using semantics to integrate deep and shallow processing

Semantics from POS tagging

every_AT1 cat_NN1 chase_VVD some_AT1 dog_NN1

_every_q(x1), _cat_n(x2sg), _chase_v(epast), _some_q(x3), _dog_n(x4sg)

Tag lexicon:
AT1   _lemma_q(x)
NN1   _lemma_n(xsg)
VVD   _lemma_v(epast)
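The tag-lexicon lookup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three tags and their templates come from the slide, while the dictionary layout, the function name `pos_to_semantics`, and the variable-numbering scheme are hypothetical.

```python
# Tag lexicon from the slide: POS tag -> (predicate template, variable type, sort).
# The data layout and function name are illustrative assumptions.
TAG_LEXICON = {
    "AT1": ("_{lemma}_q", "x", ""),      # singular article -> quantifier
    "NN1": ("_{lemma}_n", "x", "sg"),    # singular noun
    "VVD": ("_{lemma}_v", "e", "past"),  # past-tense verb
}


def pos_to_semantics(tagged):
    """Map 'lemma_TAG' tokens to underspecified predications."""
    predications = []
    x_count = 0
    for token in tagged.split():
        lemma, tag = token.rsplit("_", 1)
        template, var_type, sort = TAG_LEXICON[tag]
        if var_type == "x":
            # Each nominal or quantifier predicate gets its own fresh variable;
            # a tagger has no way of knowing which variables are identical.
            x_count += 1
            variable = f"x{x_count}{sort}"
        else:
            variable = f"{var_type}{sort}"
        predications.append(f"{template.format(lemma=lemma)}({variable})")
    return predications


print(pos_to_semantics("every_AT1 cat_NN1 chase_VVD some_AT1 dog_NN1"))
```

Note that no argument relations are produced: the tagger knows nothing about which entity chases which.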

Page 11: Extreme underspecification Using semantics to integrate deep and shallow processing

Deep parser output
• Conventional semantic representation for "Every cat chased some dog":
  every(xsg,cat(xsg),some(ysg,dog1(ysg),chase(esp,xsg,ysg)))
  some(ysg,dog1(ysg),every(xsg,cat(xsg),chase(esp,xsg,ysg)))
• Compositional: reflects morphology and syntax
• Scope ambiguity

Page 12: Extreme underspecification Using semantics to integrate deep and shallow processing

Modifying syntax of deep grammar semantics: overview

1. Underspecification of quantifier scope: in this talk, using Minimal Recursion Semantics (MRS)
2. Robust MRS
  • Separating relations
  • Explicit equalities
  • Conventions for predicate names and sense distinctions
  • Hierarchy of sorts on variables

Page 13: Extreme underspecification Using semantics to integrate deep and shallow processing

Scope underspecification
• Standard logical forms can be represented as trees
• Underspecified logical forms are partial trees (or descriptions of sets of trees)
• Constraints on scope control how trees may be reconstructed

Page 14: Extreme underspecification Using semantics to integrate deep and shallow processing

Logical forms
• Generalized quantifier notation:
  every(xsg,cat(xsg),some(ysg,dog1(ysg),chase(esp,xsg,ysg)))
  forall x [cat(x) implies exists y [dog1(y) and chase(e,x,y)]]

  some(ysg,dog1(ysg),every(xsg,cat(xsg),chase(esp,xsg,ysg)))
  exists y [dog1(y) and forall x [cat(x) implies chase(e,x,y)]]
• Event variables: e.g., chase(e,x,y)

Page 15: Extreme underspecification Using semantics to integrate deep and shallow processing

PC trees

Every cat chased some dog

[Figure: the two scoped readings drawn as trees: every(x, cat(x), some(y, dog1(y), chase(e,x,y))) and some(y, dog1(y), every(x, cat(x), chase(e,x,y)))]

Page 16: Extreme underspecification Using semantics to integrate deep and shallow processing

PC trees share structure

[Figure: the same two trees; the fragments every(x, cat(x), _), some(y, dog1(y), _) and chase(e,x,y) occur in both]

Page 17: Extreme underspecification Using semantics to integrate deep and shallow processing

Bits of trees

[Figure: three separate tree fragments: every(x, cat(x), _), some(y, dog1(y), _) and chase(e,x,y)]

Reconstruction conditions:
• tree-ness
• variable binding

Page 18: Extreme underspecification Using semantics to integrate deep and shallow processing

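Label nodes and holes

[Figure: the three tree fragments with labelled nodes and holes: lb1:every over x, lb2:cat(x) and hole h6; lb4:some over y, lb5:dog1(y) and hole h7; lb3:chase(e,x,y); plus the top hole h0]

• h0: hole corresponding to the top of the tree
• Valid solutions: equate holes and labels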

Page 19: Extreme underspecification Using semantics to integrate deep and shallow processing

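Maximize splitting

[Figure: fragments split further, so that lb1:every has holes h9 and h6, lb4:some has holes h8 and h7, and lb2:cat(x), lb5:dog1(y), lb3:chase(e,x,y) and the top hole h0 stand alone]

• Constraints: h8=lb5, h9=lb2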

Page 20: Extreme underspecification Using semantics to integrate deep and shallow processing

Notation for underspecified scope

lb1:every(x,h9,h6)
lb2:cat(x)
lb5:dog1(y)
lb4:some(y,h8,h7)
lb3:chase(e,x,y)
top: h0
h9=lb2
h8=lb5

MRS actually uses:
h9 qeq lb2
h8 qeq lb5
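As a rough illustration of how the scoped readings can be recovered, the sketch below enumerates ways of plugging labels into holes and keeps those that satisfy the two reconstruction conditions (tree-ness and variable binding). It uses the simplified equalities h9=lb2 and h8=lb5 rather than qeq constraints, and the data layout and function names are assumptions, not part of MRS or any existing tool.

```python
from itertools import permutations

# label -> (predicate, variable it binds, variables it uses, argument holes)
EPS = {
    "lb1": ("every", "x", (), ("h9", "h6")),
    "lb2": ("cat", None, ("x",), ()),
    "lb4": ("some", "y", (), ("h8", "h7")),
    "lb5": ("dog1", None, ("y",), ()),
    "lb3": ("chase", None, ("e", "x", "y"), ()),
}
FIXED = {"h9": "lb2", "h8": "lb5"}   # simplified constraints from the slide
QUANTIFIED = {"x", "y"}              # the event variable e is not quantified


def render(hole, plug, bound, seen):
    """Build the formula under `hole`; return None if a variable is unbound."""
    label = plug[hole]
    seen.add(label)
    pred, binds, uses, holes = EPS[label]
    if any(v in QUANTIFIED and v not in bound for v in uses):
        return None
    inner = bound | {binds} if binds else bound
    args = [binds] if binds else list(uses)
    for h in holes:
        sub = render(h, plug, inner, seen)
        if sub is None:
            return None
        args.append(sub)
    return f"{pred}({','.join(args)})"


def readings():
    """Enumerate pluggings of the free holes and keep the valid trees."""
    free_holes = ["h0", "h6", "h7"]
    free_labels = ["lb1", "lb3", "lb4"]
    results = []
    for perm in permutations(free_labels):
        plug = {**FIXED, **dict(zip(free_holes, perm))}
        seen = set()
        formula = render("h0", plug, frozenset(), seen)
        # Tree-ness: every labelled piece must be reachable from the top hole.
        if formula is not None and seen == set(EPS):
            results.append(formula)
    return sorted(results)


for formula in readings():
    print(formula)
```

Of the six possible pluggings, only the two readings given earlier in the talk survive both checks.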

Page 21: Extreme underspecification Using semantics to integrate deep and shallow processing

Extreme underspecification
• Splitting up predicate argument structure
• Explicit equalities
• Hierarchies for predicates and sorts
• Goal is to split up semantic representation into minimal components

Page 22: Extreme underspecification Using semantics to integrate deep and shallow processing

Separating arguments

lb1:every(x,h9,h6), lb2:cat(x), lb5:dog1(y), lb4:some(y,h8,h7), lb3:chase(e,x,y), h9=lb2, h8=lb5

goes to:

lb1:every(x), RSTR(lb1,h9), BODY(lb1,h6), lb2:cat(x), lb5:dog1(y), lb4:some(y), RSTR(lb4,h8), BODY(lb4,h7), lb3:chase(e), ARG1(lb3,x), ARG2(lb3,y), h9=lb2, h8=lb5
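The splitting step above is mechanical, and a minimal sketch may make it concrete. The tuple encoding of predications, the `QUANTIFIERS` set and the function name are assumptions made for this example; they are not the LKB's actual MRS-RMRS conversion.

```python
# Predicates treated as quantifiers in this example (an assumption; a real
# system would take this from the grammar or the SEM-I).
QUANTIFIERS = {"every", "some"}


def split_arguments(eps):
    """Factor each predication into a unary predicate plus separate
    argument relations that refer back to the predicate's label."""
    out = []
    for label, pred, args in eps:
        # The first argument stays with the predicate itself.
        out.append(f"{label}:{pred}({args[0]})")
        if pred in QUANTIFIERS:
            roles = ["RSTR", "BODY"]
        else:
            roles = [f"ARG{i}" for i in range(1, len(args))]
        for role, value in zip(roles, args[1:]):
            out.append(f"{role}({label},{value})")
    return out


MRS = [
    ("lb1", "every", ("x", "h9", "h6")),
    ("lb2", "cat", ("x",)),
    ("lb5", "dog1", ("y",)),
    ("lb4", "some", ("y", "h8", "h7")),
    ("lb3", "chase", ("e", "x", "y")),
]

print(", ".join(split_arguments(MRS)))
```

The scope constraints (h9=lb2, h8=lb5) are untouched by this step and would simply be carried over.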

Page 23: Extreme underspecification Using semantics to integrate deep and shallow processing

Explicit equalities

lb1:every(x1), RSTR(lb1,h9), BODY(lb1,h6),
lb2:cat(x2),
lb5:dog1(x4),
lb4:some(x3), RSTR(lb4,h8), BODY(lb4,h7),
lb3:chase(e), ARG1(lb3,x2), ARG2(lb3,x4),
h9=lb2, h8=lb5, x1=x2, x3=x4

Page 24: Extreme underspecification Using semantics to integrate deep and shallow processing

Naming conventions

lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6),
lb2:_cat_n(x2sg),
lb5:_dog_n_1(x4sg),
lb4:_some_q(x3sg), RSTR(lb4,h8), BODY(lb4,h7),
lb3:_chase_v(esp), ARG1(lb3,x2sg), ARG2(lb3,x4sg),
h9=lb2, h8=lb5, x1sg=x2sg, x3sg=x4sg

Page 25: Extreme underspecification Using semantics to integrate deep and shallow processing

POS output as underspecification

DEEP:
lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6), lb2:_cat_n(x2sg), lb5:_dog_n_1(x4sg), lb4:_some_q(x3sg), RSTR(lb4,h8), BODY(lb4,h7), lb3:_chase_v(esp), ARG1(lb3,x2sg), ARG2(lb3,x4sg), h9=lb2, h8=lb5, x1sg=x2sg, x3sg=x4sg

POS:
lb1:_every_q(x1), lb2:_cat_n(x2sg), lb3:_chase_v(epast), lb4:_some_q(x3), lb5:_dog_n(x4sg) (as previous slide but added labels)

Page 26: Extreme underspecification Using semantics to integrate deep and shallow processing

POS output as underspecification

DEEP:
lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6), lb2:_cat_n(x2sg), lb5:_dog_n_1(x4sg), lb4:_some_q(x3sg), RSTR(lb4,h8), BODY(lb4,h7), lb3:_chase_v(esp), ARG1(lb3,x2sg), ARG2(lb3,x4sg), h9=lb2, h8=lb5, x1sg=x2sg, x3sg=x4sg

POS:
lb1:_every_q(x1), lb2:_cat_n(x2sg), lb3:_chase_v(epast), lb4:_some_q(x3), lb5:_dog_n(x4sg)

Page 27: Extreme underspecification Using semantics to integrate deep and shallow processing

Hierarchies
• esp (simple past) is defined as a subtype of epast
  • in general, hierarchy of sorts defined as part of the semantic interface (SEM-I)
• dog_n_1 is a subtype of dog_n
  • by convention, lemma_POS_sense is a subtype of lemma_POS
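These two hierarchies are what let a shallow predication count as an underspecified version of a deep one. The following is a minimal sketch of such a check. Only esp < epast and the lemma_POS_sense convention come from the slide; the other sort links (epast < e, xsg < x), the pair encoding and the function names are assumptions, and the naming test assumes sense tags contain no underscore.

```python
# Sort hierarchy, child -> parent. Only esp < epast is stated on the slide;
# the other two links are assumptions made for this example.
SORT_PARENT = {"esp": "epast", "epast": "e", "xsg": "x"}


def sort_subsumes(general, specific):
    """True if `specific` equals `general` or is one of its descendants."""
    current = specific
    while current is not None:
        if current == general:
            return True
        current = SORT_PARENT.get(current)
    return False


def pred_subsumes(general, specific):
    """Naming convention: lemma_POS_sense is a subtype of lemma_POS."""
    return specific == general or specific.rsplit("_", 1)[0] == general


def ep_subsumes(shallow, deep):
    """Each argument is a (predicate name, variable sort) pair."""
    (g_pred, g_sort), (s_pred, s_sort) = shallow, deep
    return pred_subsumes(g_pred, s_pred) and sort_subsumes(g_sort, s_sort)


# POS-tagger output compared against deep-parser output for the same words.
print(ep_subsumes(("_dog_n", "xsg"), ("_dog_n_1", "xsg")))
print(ep_subsumes(("_chase_v", "epast"), ("_chase_v", "esp")))
print(ep_subsumes(("_chase_v", "esp"), ("_chase_v", "epast")))
```

The check is deliberately asymmetric: the shallow output may be less specific than the deep output, but not the other way round.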

Page 28: Extreme underspecification Using semantics to integrate deep and shallow processing

Extreme Underspecification
• Factorize deep representation to minimal units
• Only represent what you know for each type of processor
• Compatibility:
  • Sorts and (some) closed class word information in SEM-I for consistency
  • No lexicon for shallow processing (apart from POS tags)

Page 29: Extreme underspecification Using semantics to integrate deep and shallow processing

Semantics from RASP
• RASP: robust, domain-independent, statistical parsing (Briscoe and Carroll)
• can’t produce conventional semantics because no subcategorization
• can sometimes identify arguments:
  S -> NP VP
  NP supplies ARG1 for V
• partial identification:
  VP -> V NP
  S -> NP S
  NP might be ARG2 or ARG3

Page 30: Extreme underspecification Using semantics to integrate deep and shallow processing

Underspecification of arguments

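[Figure: hierarchy of argument relations with ARGN at the top, ARG1or2 and ARG2or3 below it, and ARG1, ARG2 and ARG3 at the bottom]

• RASP arguments can be specified as ARGN, ARG2or3 etc
• Also useful for Japanese deep parsing?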
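One way to read the hierarchy is that each underspecified relation stands for the set of specific ARGs it could resolve to. The sketch below takes that reading, assuming that ARG1or2 covers ARG1 and ARG2 and that ARG2or3 covers ARG2 and ARG3 (inferred from the names, since the diagram's edges are not recoverable here). The table and function names are hypothetical.

```python
# Each relation mapped to the specific ARGs it may resolve to.
# The set-based encoding is an assumption made for this example.
RESOLVES_TO = {
    "ARGN": frozenset({"ARG1", "ARG2", "ARG3"}),
    "ARG1or2": frozenset({"ARG1", "ARG2"}),
    "ARG2or3": frozenset({"ARG2", "ARG3"}),
    "ARG1": frozenset({"ARG1"}),
    "ARG2": frozenset({"ARG2"}),
    "ARG3": frozenset({"ARG3"}),
}
NAME_OF = {members: name for name, members in RESOLVES_TO.items()}


def unify_args(a, b):
    """Return the most general relation compatible with both a and b,
    or None if the two can never describe the same argument."""
    common = RESOLVES_TO[a] & RESOLVES_TO[b]
    return NAME_OF.get(common)


print(unify_args("ARG2or3", "ARG2"))     # shallow ARG2or3 meets deep ARG2
print(unify_args("ARG1or2", "ARG2or3"))
print(unify_args("ARG1", "ARG2or3"))
```

With this encoding, comparing a RASP-derived relation against a deep-grammar relation reduces to a set intersection.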

Page 31: Extreme underspecification Using semantics to integrate deep and shallow processing

Software etc
• Open Source LinGO English Resource Grammar (ERG)
• LKB system: parsing and generation, now includes MRS-RMRS interconversion
• RMRS output as XML
• RMRS comparison
• Preliminary RASP-RMRS
• First version of SEM-I

Page 32: Extreme underspecification Using semantics to integrate deep and shallow processing

Composition without lambdas
• Formalized, consistent composition
  • integration at subsentential level
  • standardization
• Traditional lambda calculus unsuitable
  • Doesn’t allow underspecification
  • Syntactic requirements mixed up with the semantics
• Algebra is rational reconstruction of a feature structure approach to composition

Page 33: Extreme underspecification Using semantics to integrate deep and shallow processing

Lexicalized composition

[h,e1], {[h3,x]subj}, {h:_probably(h2), h3:_sleep(e), arg1(h3,x)}, {e1=e}, {h2 qeq h3}

1. hook: externally accessible information
2. slots: when functor, slot is equated with argument hook
3. relations: accumulated monotonically
4. equalities: record hook-slot equations (not shown from now on)
5. scope constraints: (ignored from now on)

Page 34: Extreme underspecification Using semantics to integrate deep and shallow processing

probably sleeps

sleeps:   [h3,e], {[h3,x]subj}, {h3:_sleep(e), ARG1(h3,x)}
probably: [h,e1], {[h2,e1]mod}, {h:_probably(h2)}

Syntax defines probably as semantic head, composition using mod slot

probably sleeps: [h,e1], {[h3,x]subj}, {h:_probably(h3), h3:_sleep(e1), arg1(h3,x)}
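The composition of "probably" with "sleeps" can be sketched as a small operation on hook, slots, relations and equalities. This is an illustrative simplification, not the algebra's formal definition: the class and function names are hypothetical, the hook-slot equations are recorded rather than substituted into the relations (so h2 and e still appear, together with h2=h3 and e1=e), and scope constraints are ignored by treating the label identification as a plain equality.

```python
from dataclasses import dataclass, field


@dataclass
class Sement:
    hook: tuple                  # (label, index): externally accessible
    slots: dict                  # slot name -> (label, index)
    rels: list                   # relations, accumulated monotonically
    eqs: list = field(default_factory=list)  # recorded hook-slot equations


def compose(functor, argument, slot_name):
    """Fill `slot_name` of the functor with the argument's hook."""
    slot = functor.slots[slot_name]
    # The filled slot is consumed; all other slots are passed up.
    slots = {k: v for k, v in functor.slots.items() if k != slot_name}
    slots.update(argument.slots)
    # Equate slot label with hook label and slot index with hook index.
    eqs = functor.eqs + argument.eqs + list(zip(slot, argument.hook))
    return Sement(functor.hook, slots, functor.rels + argument.rels, eqs)


sleeps = Sement(("h3", "e"), {"subj": ("h3", "x")},
                ["h3:_sleep(e)", "ARG1(h3,x)"])
probably = Sement(("h", "e1"), {"mod": ("h2", "e1")},
                  ["h:_probably(h2)"])

# Syntax makes "probably" the semantic head, so it acts as the functor.
probably_sleeps = compose(probably, sleeps, "mod")
print(probably_sleeps)
```

Applying the recorded equalities to the relations would give the form shown on the slide, with h3 in place of h2 and e1 in place of e.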

Page 35: Extreme underspecification Using semantics to integrate deep and shallow processing

Non-lexicalized grammars
• Lexicalized approach is a rational reconstruction of semantic composition in the ERG (Copestake et al, 2001)
• Without lexical subcategorization, rely on grammar rules to provide the ARGs
• 'anchors' rather than slots, to ground the ARGs (single anchor for RASP)

Page 36: Extreme underspecification Using semantics to integrate deep and shallow processing

Some cat sleeps (in RASP)

sleeps:   [h3,e], <h3>, {h3:_sleep(e)}
some cat: [h,x], <h1>, {h1:_some(x), RSTR(h1,h2), h2:_cat(x)}

S->NP VP: Head=VP, ARG1(<VP anchor>,<NP hook.index>)

some cat sleeps: [h3,e], <h3>, {h3:_sleep(e), ARG1(h3,x), h1:_some(x), RSTR(h1,h2), h2:_cat(x)}
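The rule-driven variant can be sketched in the same style. Here the grammar rule, not the lexical entry, introduces the ARG1 relation, and it is grounded on the head's anchor. The class name, field names and function name are assumptions for illustration and do not reflect the actual RASP-RMRS code.

```python
from dataclasses import dataclass


@dataclass
class AnchoredSement:
    hook: tuple    # (label, index)
    anchor: str    # label that rule-introduced ARG relations attach to
    rels: list


def s_from_np_vp(np, vp):
    """S -> NP VP: Head=VP, ARG1(<VP anchor>, <NP hook.index>)."""
    arg1 = f"ARG1({vp.anchor},{np.hook[1]})"
    # The head (VP) supplies the hook and anchor of the result.
    return AnchoredSement(vp.hook, vp.anchor, vp.rels + [arg1] + np.rels)


sleeps = AnchoredSement(("h3", "e"), "h3", ["h3:_sleep(e)"])
some_cat = AnchoredSement(("h", "x"), "h1",
                          ["h1:_some(x)", "RSTR(h1,h2)", "h2:_cat(x)"])

sentence = s_from_np_vp(some_cat, sleeps)
print(sentence.hook, sentence.anchor)
print(", ".join(sentence.rels))
```

Because the verb carries no subcategorization, "sleeps" on its own has no ARG1; it only appears once the S rule applies.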

Page 37: Extreme underspecification Using semantics to integrate deep and shallow processing

The current project …

Page 38: Extreme underspecification Using semantics to integrate deep and shallow processing

Deep Thought
• Saarbrücken, Sussex, Cambridge, NTNU, Xtramind, CELI
• Objectives: demonstrate utility of deep processing in IE and email response
• German, Norwegian, Italian and English
• October 2002 – October 2004

Page 39: Extreme underspecification Using semantics to integrate deep and shallow processing

Integrated IE: a scenario
• Example: I don’t like the PBX 30
• Shallow processing finds interesting sentences
• Named entity system isolates entities
  • h1:name(x,"PBX-30")
• Deep processor identifies relationships, modals, negation etc
  • h2:neg(h3), h3:_like(y,x), h3:name(x,"PBX-30")

Page 40: Extreme underspecification Using semantics to integrate deep and shallow processing

Some issues
• 'shallow' processors can sometimes be deeper: e.g. h1:model-name(x,"PBX-30")
• Compatibility and standardization: defining SEM-I (semantic interface)
• Limits on compatibility: e.g., causative-inchoative
• Efficiency of comparison: indexing representations by character position

Page 41: Extreme underspecification Using semantics to integrate deep and shallow processing

The bigger picture ...
• 'deep' processing reflects syntax and morphology but limited lexical semantics
• conventional vs predictable:
  • count/mass: lentils/rice, furniture, lettuce
  • adjectives: heavy defeat, ?large problem
  • prepositions and particles: up

Page 42: Extreme underspecification Using semantics to integrate deep and shallow processing

Incremental development of wide-coverage semantics
• corpus-based acquisition techniques: shallow processing
• eventual integration with deep processing
• statistical model of predicates: e.g., large_j_rel pointer to vector space
• logic isn’t enough but is needed

Page 43: Extreme underspecification Using semantics to integrate deep and shallow processing

Conclusion

[Figure: the two scoped trees for "Every cat chased some dog", one with every outscoping some and one with some outscoping every]

lb1:every(x), RSTR(lb1,h9), BODY(lb1,h6), lb2:cat(x), lb5:dog1(y), lb4:some(y), RSTR(lb4,h8), BODY(lb4,h7), lb3:chase(e), ARG1(lb3,x), ARG2(lb3,y), h9=lb2, h8=lb5

Page 44: Extreme underspecification Using semantics to integrate deep and shallow processing

Conclusion: extreme underspecification
• Split up information content as much as possible
• Accumulate information by simple operations
• Don’t represent what you don’t know but preserve everything you do know
• Use a flat representation to allow pieces to be accessed individually