Computational Semantics
http://www.coli.uni-sb.de/cl/projects/milca/esslli
Day II: A Modular Architecture
Aljoscha Burchardt, Alexander Koller, Stephan Walter
Universität des Saarlandes, Saarbrücken, Germany
ESSLLI 2004, Nancy, France
Computing Semantic Representations
• Yesterday: The λ-calculus is a nice tool for systematic meaning construction.
– We saw a first, sketchy implementation
– Some things still to be done
• Today:
– Let's fix the problems
– Let's build nice software
Yesterday: λ-Calculus
• Semantic representations are constructed along the syntax tree. How do we get there? By using functional application.
• λs help to guide arguments into the right place on β-reduction:
λx.love(x,mary)@john ⇒ love(john,mary)
Yesterday’s disappointment
Our first idea for NPs with determiner didn’t work out:
"A man" ~> ∃z.man(z)
"A man loves Mary" ~> * love(∃z.man(z),mary)
∃z.man(z) just isn't the meaning of "a man".
If anything, it translates the complete sentence "There is a man".
Let's try again, systematically…
But what was the idea after all?
Nothing!
∃z(man(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))
A solution
What we want is:
"A man loves Mary" ~> ∃z(man(z) ∧ love(z,mary))
What we have is:
"man" ~> λy.man(y)
"loves Mary" ~> λx.love(x,mary)
How about:
∃z(man(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))
Remember: We can use variables for any kind of term. So next:
λQ.∃z(λy.man(y)(z) ∧ Q(z)) @ λx.love(x,mary)
λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y) @ λx.love(x,mary)
λP(λQ.∃z(P(z) ∧ Q(z))) <~ "A"
"A man" ~> λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y) ⇒ λQ.∃z(man(z) ∧ Q(z))
"A man loves Mary" ~> λQ.∃z(man(z) ∧ Q(z)) @ λx.love(x,mary)
But…
"A man … loves Mary"
λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y) @ λx.love(x,mary)
∃z(man(z) ∧ λx.love(x,mary)(z))
∃z(man(z) ∧ love(z,mary)) fine!
"John … loves Mary"
john @ λx.love(x,mary) not reducible!
λx.love(x,mary) @ john not systematic!
λP.P@john @ λx.love(x,mary) better!
λx.love(x,mary)@john
love(john,mary)
So: "John" ~> λP.P(john)
Transitive Verbs
What about transitive verbs (like "love")?
"loves" ~> λyλx.love(x,y) ???
won't do:
"Mary" ~> λQ.Q(mary)
"loves Mary" ~> λyλx.love(x,y)@λQ.Q(mary)
λx.love(x,λQ.Q(mary))
How about something a little more complicated:
"loves" ~> λRλx(R@λy.love(x,y))
The only way to understand this is to see it in action...
"John loves Mary" again...
"loves Mary":
λRλx(R@λy.love(x,y)) @ λP.P(mary)
λx(λP.P(mary)@λy.love(x,y))
λx(λy.love(x,y)(mary))
λx.love(x,mary)
"John":
λP.P(john)
"John loves Mary":
λP.P(john) @ λx.love(x,mary)
λx.love(x,mary)(john)
love(john,mary)
Summing up
• nouns: "man" ~> λx.man(x)
• intransitive verbs: "smoke" ~> λx.smoke(x)
• determiner: "a" ~> λP(λQ.∃z(P(z) ∧ Q(z)))
• proper names: "mary" ~> λP.P(mary)
• transitive verbs: "love" ~> λRλx(R@λy.love(x,y))
Today's first success
What we can do now (and could not do yesterday):
• Complex NPs (with determiners)
• Transitive verbs
… and all in the same way.
Key ideas:
• Extra λs for NPs
• Variables for predicates
• Apply subject NP to VP
Yesterday’s implementation
s(VP@NP) --> np(NP),vp(VP).
np(john) --> [john].
np(mary) --> [mary].
tv(lambda(X,lambda(Y,love(Y,X)))) --> [loves], {vars2atoms(X),vars2atoms(Y)}.
iv(lambda(X,smoke(X))) --> [smokes], {vars2atoms(X)}.
iv(lambda(X,snore(X))) --> [snorts], {vars2atoms(X)}.
vp(TV@NP) --> tv(TV),np(NP).
vp(IV) --> iv(IV).
% This doesn't work!
np(exists(X,man(X))) --> [a,man], {vars2atoms(X)}.
Was this a good implementation?
A Nice Implementation
What is a nice implementation? It should be:
– Scalable: If it works with five examples, upgrading to 5000 shouldn't be a great problem (e.g. new constructions in the grammar, more words...)
– Re-usable: Small changes in our ideas about the system shouldn't lead to complex changes in the implementation (e.g. a new representation language)
Solution: Modularity
• Think about your problem in terms of interacting conceptual components
• Encapsulate these components into modules of your implementation, with clean and abstract pre-defined interfaces to each other
• Extend or change modules to scale / adapt the implementation
Another look at yesterday’s implementation
• Okay, because it was small
• Not modular at all: all linguistic functionality in one file, packed inside the DCG
• E.g. scalability of the lexicon: Always have to write new rules, like:
tv(lambda(X,lambda(Y,visit(Y,X)))) --> [visit], {vars2atoms(X),vars2atoms(Y)}.
• Changing parts for Adaptation? Change every single rule!
Let's modularize!
Semantic Construction: Conceptual Components
[Diagram: "John smokes" → Black Box → smoke(j)]
Semantic Construction: Inside the Black Box
[Diagram: the black box split along two axes, Syntax / Semantics and Words (lexical) / Phrases (combinatorial). Components: DCG, combine-rules, lexicon-facts]
DCG
The DCG-rules tell us what phrases are acceptable (mainly). Their basic structure is:
s(...) --> np(...), vp(...), {...}.
np(...) --> det(...), noun(...), {...}.
np(...) --> pn(...), {...}.
vp(...) --> tv(...), np(...), {...}.
vp(...) --> iv(...), {...}.
(The gaps will be filled later on)
combine-rules
The combine-rules encode the actual semantic construction process. That is, they glue representations together using @:
combine(s:(NP@VP),[np:NP,vp:VP]).
combine(np:(DET@N),[det:DET,n:N]).
combine(np:PN,[pn:PN]).
combine(vp:IV,[iv:IV]).
combine(vp:(TV@NP),[tv:TV,np:NP]).
Lexicon
The lexicon-facts hold the elementary information connected to words:
lexicon(noun,bird,[bird]).
lexicon(pn,anna,[anna]).
lexicon(iv,purr,[purrs]).
lexicon(tv,eat,[eats]).
Their slots contain:
1. syntactic category
2. constant / relation symbol ("core" semantics)
3. the surface form of the word
Example: lexicon(tv,eat,[eats]).
Interfaces
[Diagram: the same grid (Syntax / Semantics, Words (lexical) / Phrases (combinatorial)) with DCG, combine-rules and lexicon-facts, now showing the interfaces between them: lexicon-calls, semantic macros, combine-calls]
Interfaces in the DCG
Information is transported between the three components of our system by additional calls and variables in the DCG:
• Lexical rules are now fully abstract. We have one for each category (iv, tv, n, ...). The DCG uses lexicon-calls and semantic macros like this:
iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
pn(PN)--> {lexicon(pn,Sym,Word),pnSem(Sym,PN)}, Word.
• The combinatorial rules use combine-calls like this:
vp(VP)--> iv(IV),{combine(vp:VP,[iv:IV])}.
s(S)--> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.
Interfaces: How they work
iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
When this rule applies, the syntactic analysis component:
• looks up the Word found in the string (e.g. "smokes"), ...
• ... checks that its category is iv, ...
• ... and retrieves the relation symbol Sym to be used in the semantic construction.
lexicon(iv, smoke, [smokes])
So we have: Word = [smokes], Sym = smoke
Interfaces: How they work II
iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
Then, the semantic construction component:
• takes Sym (here Sym = smoke) ...
• ... and uses the semantic macro ivSem ...
• ... to turn it into a full semantic representation for an intransitive verb.
ivSem(Sym,IV)
ivSem(smoke,IV)
ivSem(smoke,lambda(X, smoke(X)))
IV = lambda(X, smoke(X))
The DCG-rule is now fully instantiated and looks like this:
iv(lambda(X, smoke(X)))--> {lexicon(iv,smoke,[smokes]), ivSem(smoke, lambda(X, smoke(X)))}, [smokes].
What’s inside a semantic macro?
Semantic macros simply specify how to make a valid semantic representation out of a naked symbol. The one we’ve just seen in action for the verb “smokes” was:
ivSem(Sym,lambda(X,Formula)):- compose(Formula,Sym,[X]).
compose builds a first-order formula out of Sym and a new variable X:
Formula = smoke(X)
This is then embedded into a λ-abstraction over the same X:
lambda(X, smoke(X))
Another one, without compose:
pnSem(Sym,lambda(P,P@Sym)).
john ~> lambda(P,P@john)
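The macros above can be tried out in isolation. The following is a minimal sketch for SWI-Prolog, under two assumptions that the slides do not spell out: that compose is a thin wrapper around the built-in =.. ("univ"), and that @ is declared as an infix operator with priority 950. The tvSem macro is hypothetical; it is only modelled on the "Summing up" entry for transitive verbs.

```prolog
% Assumption: @ is an infix operator for functional application.
:- op(950, yfx, @).

% compose(-Formula, +Sym, +Args): build the term Sym(Arg1,...,ArgN).
% Assumed definition, using the built-in =.. ("univ").
compose(Formula, Sym, Args) :-
    Formula =.. [Sym|Args].

% Intransitive verb: smoke ~> lambda(X, smoke(X))
ivSem(Sym, lambda(X, Formula)) :-
    compose(Formula, Sym, [X]).

% Proper name: john ~> lambda(P, P@john)
pnSem(Sym, lambda(P, P@Sym)).

% Hypothetical macro for transitive verbs, following "Summing up":
% love ~> lambda(R, lambda(X, R@lambda(Y, love(X,Y))))
tvSem(Sym, lambda(R, lambda(X, R@lambda(Y, Formula)))) :-
    compose(Formula, Sym, [X, Y]).
```

For example, the query ?- ivSem(smoke, IV). binds IV to lambda(X, smoke(X)) with a fresh Prolog variable X, which vars2atoms can later turn into an atom.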
"John smokes"
Words (lexical):
lexicon(pn,john,[john]).
lexicon(iv,smoke,[smokes]).
pn(PN) --> …,[john]: Word = [john]; pnSem(Sym,PN) with Sym = john gives PN = lambda(P,P@john)
iv(IV) --> …,[smokes]: Word = [smokes]; ivSem(Sym,IV) with Sym = smoke gives IV = lambda(X,smoke(X))
Phrases (combinatorial):
np(NP) --> …,pn(PN): NP = lambda(P,P@john)
vp(VP) --> …,iv(IV): VP = lambda(X,smoke(X))
s(S)--> np(NP), vp(VP),{combine(s:S,[np:NP,vp:VP])}.
A look at combine
combine(s:(NP@VP),[np:NP,vp:VP]).
S = NP@VP
NP = lambda(P,P@john)
VP = lambda(X,smoke(X))
So: S = lambda(P,P@john)@lambda(X,smoke(X))
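To see the three components interact, here is a small self-contained sketch of the architecture for the fragment "John smokes". It assumes SWI-Prolog, an operator declaration for @ (priority 950 is an assumption), and a compose defined via =..; the predicate parse/2 is a hypothetical wrapper around phrase/2 and not necessarily the one in the online code.

```prolog
:- op(950, yfx, @).          % assumed operator for application

% --- lexicon-facts ---
lexicon(pn, john,  [john]).
lexicon(iv, smoke, [smokes]).

% --- semantic macros ---
compose(Formula, Sym, Args) :- Formula =.. [Sym|Args].   % assumed definition
ivSem(Sym, lambda(X, Formula)) :- compose(Formula, Sym, [X]).
pnSem(Sym, lambda(P, P@Sym)).

% --- combine-rules ---
combine(s:(NP@VP), [np:NP, vp:VP]).
combine(np:PN,     [pn:PN]).
combine(vp:IV,     [iv:IV]).

% --- DCG: combinatorial rules ---
s(S)   --> np(NP), vp(VP), {combine(s:S, [np:NP, vp:VP])}.
np(NP) --> pn(PN), {combine(np:NP, [pn:PN])}.
vp(VP) --> iv(IV), {combine(vp:VP, [iv:IV])}.

% --- DCG: lexical rules (Word is bound by the lexicon call) ---
pn(PN) --> {lexicon(pn, Sym, Word), pnSem(Sym, PN)}, Word.
iv(IV) --> {lexicon(iv, Sym, Word), ivSem(Sym, IV)}, Word.

% Hypothetical entry point: parse a list of words into an
% unreduced semantic representation.
parse(Sentence, Sem) :- phrase(s(Sem), Sentence).
```

The query ?- parse([john, smokes], S). yields S = lambda(P, P@john)@lambda(X, smoke(X)), matching the result above. Adding a new proper name or intransitive verb only takes one more lexicon fact; none of the rules change.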
That’s almost all, folks…
betaConvert(lambda(P,P@john)@lambda(X,smoke(X)), Converted)
Converted = smoke(john)
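The slides do not show how betaConvert is defined. As an illustration only, here is one simple way to write it by explicit substitution. It assumes that λ-bound variables have already been turned into atoms (as vars2atoms does), that all bound variables have distinct names, and that the input is ground; it does no renaming to avoid variable capture. The helper names subst/4 and subst_arg/4 are hypothetical, and this sketch is not the implementation from the online code.

```prolog
:- op(950, yfx, @).          % assumed operator for application

% subst(+Term, +Var, +Value, -Result):
% replace every free occurrence of the atom Var in Term by Value.
subst(Term, Var, Value, Value) :-
    Term == Var, !.
subst(lambda(V, Body), Var, _, lambda(V, Body)) :-
    V == Var, !.                         % Var is rebound here: stop
subst(Term, Var, Value, Result) :-
    compound(Term), !,
    Term =.. [F|Args],
    maplist(subst_arg(Var, Value), Args, NewArgs),
    Result =.. [F|NewArgs].
subst(Term, _, _, Term).                 % other atoms, numbers

subst_arg(Var, Value, Arg, New) :-
    subst(Arg, Var, Value, New).

% betaConvert(+Term, -Result): reduce all applications in Term.
betaConvert(F@A, Result) :-
    betaConvert(F, lambda(X, Body)), !,  % functor reduces to a lambda
    subst(Body, X, A, Reduced),
    betaConvert(Reduced, Result).
betaConvert(Term, Result) :-
    compound(Term), !,                   % otherwise convert the parts
    Term =.. [F|Args],
    maplist(betaConvert, Args, NewArgs),
    Result =.. [F|NewArgs].
betaConvert(Term, Term).
```

With atoms p and x for the bound variables, ?- betaConvert(lambda(p,p@john)@lambda(x,smoke(x)), C). gives C = smoke(john). The same predicate also reduces the transitive-verb example to love(john,mary).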
Little Cheats
A few "special words" are dealt with in a somewhat different manner:
Determiners ("every man"):
• No semantic Sym in the lexicon:
lexicon(det,_,[every],uni).
• Semantic representation generated by the macro alone:
detSem(uni,lambda(P,lambda(Q,forall(X,(P@X)>(Q@X))))).
Negation, same thing ("does not walk"):
• No semantic Sym in the lexicon:
lexicon(mod,_,[does,not],neg).
• Representation solely from macro:
modSem(neg,lambda(P,lambda(X,~(P@X)))).
The code that's online (http://www.coli.uni-sb.de/cl/projects/milca/esslli)
• lexicon-facts have a fourth argument for any kind of additional information (e.g. fin/inf, gender):
lexicon(tv,eat,[eats],fin).
• iv/tv have an additional argument for infinitive / finite forms (e.g. "eat" vs. "eats"):
iv(I,IV)--> {lexicon(iv,Sym,Word,I),…}, Word.
• limited coordination, hence doubled categories (e.g. "talks and walks"):
vp2(VP2)--> vp1(VP1A), coord(C), vp1(VP1B), {combine(vp2:VP2,[vp1:VP1A,coord:C,vp1:VP1B])}.
vp1(VP1)--> v2(fin,V2), {combine(vp1:VP1,[v2:V2])}.
A demo
lambda :-
    readLine(Sentence),
    parse(Sentence,Formula),
    resetVars,
    vars2atoms(Formula),
    betaConvert(Formula,Converted),
    printRepresentations([Converted]).
Evaluation
Our new program has become much bigger, but it's…
• Modular: everything's in its right place:– Syntax in englishGrammar.pl
– Semantics (macros + combine) in lambda.pl
– Lexicon in lexicon.pl
• Scalable: E.g. extend the lexicon by adding facts to lexicon.pl
• Re-usable: E.g. to change the semantic construction method (e.g. to CLLS on Thursday), change only lambda.pl and keep the rest
What we've done today
• Complex NPs, PNs and TVs in λ-based semantic construction
• A clean semantic construction framework in Prolog
• Its instantiation for λ-based semantic construction
Ambiguity
• Some sentences have more than one reading, i.e. more than one semantic representation.
• Standard example: "Every man loves a woman":
– Reading 1: the women may be different
∀x(man(x) → ∃y(woman(y) ∧ love(x,y)))
– Reading 2: there is one particular woman
∃y(woman(y) ∧ ∀x(man(x) → love(x,y)))
• What does our system do?
Excursion: lambda, variables and atoms
• Question yesterday: Why don't we use Prolog variables for FO-variables?
• Advantage (at first sight): β-reduction as unification:
betaReduce(lambda(X, F)@X,F).
Now: X = john, F = walk(X) ("John walks")
betaReduce(lambda(john,walk(john))@john, walk(john))
F = walk(john)
Nice, but…
Problem: Coordination
"John and Mary"
(λX.λY.λP((X@P) ∧ (Y@P)) @ λQ.Q(john)) @ λR.R(mary)
λP((λQ.Q(john)@P) ∧ (λR.R(mary)@P))
λP(P(john) ∧ P(mary))
"John and Mary walk"
λP(P(john) ∧ P(mary)) @ λx.walk(x)
λx.walk(x)@john ∧ λx.walk(x)@mary
lambda(X,walk(X))@john & lambda(X,walk(X))@mary
β-reduction as unification: X = john, X = mary
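The clash can be reproduced directly in Prolog. The sketch below (SWI-Prolog assumed, with @ declared as an operator) uses the one-line betaReduce from the excursion; the predicates single/1, both/2 and bothFresh/2 are hypothetical test harnesses. The last one uses the built-in copy_term/2 to give each use of the VP its own fresh variable. That is just one possible way around the problem, shown for illustration, and not necessarily the fix adopted in the course.

```prolog
:- op(950, yfx, @).          % assumed operator for application

% Beta-reduction as unification, as on the slide.
betaReduce(lambda(X, F)@X, F).

% One use of the VP: works, X is unified with john.
single(R) :-
    VP = lambda(X, walk(X)),
    betaReduce(VP@john, R).

% Two uses of the *same* VP: the first reduction binds X to john,
% so the second one would need john = mary and fails.
both(R1, R2) :-
    VP = lambda(X, walk(X)),
    betaReduce(VP@john, R1),
    betaReduce(VP@mary, R2).

% Illustrative workaround: rename the bound variable per use.
bothFresh(R1, R2) :-
    VP = lambda(X, walk(X)),
    copy_term(VP, VP1),
    copy_term(VP, VP2),
    betaReduce(VP1@john, R1),
    betaReduce(VP2@mary, R2).
```

Here ?- single(R). succeeds with R = walk(john), ?- both(R1, R2). fails, and ?- bothFresh(R1, R2). succeeds with R1 = walk(john) and R2 = walk(mary).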