Computational Semantics
Day II: A Modular Architecture

Aljoscha Burchardt, Alexander Koller, Stephan Walter
Universität des Saarlandes, Saarbrücken, Germany

ESSLLI 2004, Nancy, France

http://www.coli.uni-sb.de/cl/projects/milca/esslli

Computing Semantic Representations

• Yesterday: λ-calculus is a nice tool for systematic meaning construction.
  – We saw a first, sketchy implementation
  – Some things still to be done

• Today:
  – Let's fix the problems
  – Let's build nice software

Yesterday: λ-Calculus

• Semantic representations are constructed along the syntax tree. How to get there? By using functional application.

• λs help to guide arguments into the right place on β-reduction:

λx.love(x,mary)@john → love(john,mary)

Yesterday’s disappointment

Our first idea for NPs with determiner didn’t work out:

"A man" ~> ∃z.man(z)

"A man loves Mary" ~> * love(∃z.man(z),mary)

∃z.man(z) just isn't the meaning of "a man".

If anything, it translates the complete sentence "There is a man".

Let's try again, systematically…

But what was the idea after all?

Nothing!

∃z(man(z) ∧ love(z,mary))

∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))

A solution

What we want is:

"A man loves Mary" ~> ∃z(man(z) ∧ love(z,mary))

What we have is:

"man" ~> λy.man(y)

"loves Mary" ~> λx.love(x,mary)

How about:

∃z(man(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))

Remember: We can use variables for any kind of term. So next:

λQ.∃z(λy.man(y)(z) ∧ Q(z)) @ λx.love(x,mary)
λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y)

λP(λQ.∃z(P(z) ∧ Q(z))) <~ "A"

But…

"A man … loves Mary"

λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y) @ λx.love(x,mary)
λQ.∃z(man(z) ∧ Q(z)) @ λx.love(x,mary)
∃z(man(z) ∧ λx.love(x,mary)(z))
∃z(man(z) ∧ love(z,mary))    fine!

"John … loves Mary"

john @ λx.love(x,mary)    not reducible!
λx.love(x,mary) @ john    not systematic!
λP.P@john @ λx.love(x,mary)    better!
λx.love(x,mary)@john
love(john,mary)

So: "John" ~> λP.P(john)

Transitive Verbs

What about transitive verbs (like "love")?

"loves" ~> λyλx.love(x,y) ??? won't do:

"Mary" ~> λQ.Q(mary)

"loves Mary" ~> λyλx.love(x,y)@λQ.Q(mary)

λx.love(x,λQ.Q(mary))

How about something a little more complicated:

"loves" ~> λRλx(R@λy.love(x,y))

The only way to understand this is to see it in action...

"John loves Mary" again...

John: λP.P(john)
loves: λRλx(R@λy.love(x,y))
Mary: λP.P(mary)

loves Mary:

λRλx(R@λy.love(x,y)) @ λP.P(mary)
λx(λP.P(mary)@λy.love(x,y))
λx(λy.love(x,y)(mary))
λx.love(x,mary)

John loves Mary:

λP.P(john) @ λx.love(x,mary)
λx.love(x,mary)(john)
love(john,mary)

Summing up

• nouns: "man" ~> λx.man(x)
• intransitive verbs: "smoke" ~> λx.smoke(x)
• determiner: "a" ~> λP(λQ.∃z(P(z) ∧ Q(z)))
• proper names: "mary" ~> λP.P(mary)
• transitive verbs: "love" ~> λRλx(R@λy.love(x,y))

Today‘s first success

What we can do now (and could not do yesterday):
• Complex NPs (with determiners)
• Transitive verbs

… and all in the same way.

Key ideas:
• Extra λs for NPs
• Variables for predicates
• Apply subject NP to VP

Yesterday’s implementation

s(VP@NP) --> np(NP),vp(VP).
np(john) --> [john].
np(mary) --> [mary].
tv(lambda(X,lambda(Y,love(Y,X)))) --> [loves], {vars2atoms(X),vars2atoms(Y)}.
iv(lambda(X,smoke(X))) --> [smokes], {vars2atoms(X)}.
iv(lambda(X,snore(X))) --> [snorts], {vars2atoms(X)}.
vp(TV@NP) --> tv(TV),np(NP).
vp(IV) --> iv(IV).

% This doesn't work!
np(exists(X,man(X))) --> [a,man], {vars2atoms(X)}.

Was this a good implementation?

A Nice Implementation

What is a nice implementation? It should be:

– Scalable: If it works with five examples, upgrading to 5000 shouldn't be a great problem (e.g. new constructions in the grammar, more words...)

– Re-usable: Small changes in our ideas about the system shouldn't lead to complex changes in the implementation (e.g. a new representation language)

Solution: Modularity

• Think about your problem in terms of interacting conceptual components

• Encapsulate these components into modules of your implementation, with clean and abstract pre-defined interfaces to each other

• Extend or change modules to scale / adapt the implementation

Another look at yesterday’s implementation

• Okay, because it was small

• Not modular at all: all linguistic functionality in one file, packed inside the DCG

• E.g. scalability of the lexicon: Always have to write new rules, like:

tv(lambda(X,lambda(Y,visit(Y,X)))) --> [visit], {vars2atoms(X),vars2atoms(Y)}.

• Changing parts for Adaptation? Change every single rule!

Let's modularize!

Semantic Construction: Conceptual Components

[Diagram: "John smokes" → Black Box → smoke(j)]

Semantic Construction: Inside the Black Box

[Diagram: the black box opened up along two axes, Syntax vs. Semantics and Words (lexical) vs. Phrases (combinatorial). Components: DCG, combine-rules, lexicon-facts]

DCG

The DCG-rules tell us what phrases are acceptable (mainly). Their basic structure is:

s(...) --> np(...), vp(...), {...}.

np(...) --> det(...), noun(...), {...}.

np(...) --> pn(...), {...}.

vp(...) --> tv(...), np(...), {...}.

vp(...) --> iv(...), {...}.

(The gaps will be filled later on)

combine-rules

The combine-rules encode the actual semantic construction process. That is, they glue representations together using @:

combine(s:(NP@VP),[np:NP,vp:VP]).

combine(np:(DET@N),[det:DET,n:N]).
combine(np:PN,[pn:PN]).

combine(vp:IV,[iv:IV]).
combine(vp:(TV@NP),[tv:TV,np:NP]).

Lexicon

The lexicon-facts hold the elementary information connected to words:

lexicon(noun,bird,[bird]).
lexicon(pn,anna,[anna]).
lexicon(iv,purr,[purrs]).
lexicon(tv,eat,[eats]).

Their slots contain:
1. syntactic category
2. constant / relation symbol ("core" semantics)
3. the surface form of the word.

lexicon(tv,eat,[eats]).

Interfaces

[Diagram: the components DCG, combine-rules and lexicon-facts, arranged by Syntax vs. Semantics and by Words (lexical), and connected by three interfaces: lexicon-calls, semantic macros, and combine-calls]

Interfaces in the DCG

Information is transported between the three components of our system by additional calls and variables in the DCG:

• Lexical rules are now fully abstract. We have one for each category (iv, tv, n, ...). The DCG uses lexicon-calls and semantic macros like this:

iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.

pn(PN)--> {lexicon(pn,Sym,Word),pnSem(Sym,PN)}, Word.

• The combinatorial rules use combine-calls like this:

vp(VP)--> iv(IV),{combine(vp:VP,[iv:IV])}.

s(S)--> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.
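The interplay of the three components can be tried out on a tiny fragment. The following is a minimal sketch, not the code from the course website: the operator declaration for @, the definition of compose/3, and the np rule are assumptions filled in for illustration, and the lexicon is reduced to three entries.

```prolog
% Assumed declaration: @ as a left-associative infix operator.
:- op(950, yfx, @).

% lexicon-facts: category, symbol, surface form.
lexicon(pn, john, [john]).
lexicon(pn, mary, [mary]).
lexicon(iv, smoke, [smokes]).

% Assumed helper: build Symbol(Args...) from a symbol and an argument list.
compose(Term, Symbol, Args) :- Term =.. [Symbol|Args].

% Semantic macros, as on the slides.
ivSem(Sym, lambda(X, Formula)) :- compose(Formula, Sym, [X]).
pnSem(Sym, lambda(P, P@Sym)).

% combine-rules: glue representations together using @.
combine(s:(NP@VP), [np:NP, vp:VP]).
combine(np:PN, [pn:PN]).
combine(vp:IV, [iv:IV]).

% DCG, lexical rules: lexicon-call plus semantic macro, then the word itself.
iv(IV) --> {lexicon(iv, Sym, Word), ivSem(Sym, IV)}, Word.
pn(PN) --> {lexicon(pn, Sym, Word), pnSem(Sym, PN)}, Word.

% DCG, combinatorial rules: one combine-call per rule.
np(NP) --> pn(PN), {combine(np:NP, [pn:PN])}.
vp(VP) --> iv(IV), {combine(vp:VP, [iv:IV])}.
s(S)   --> np(NP), vp(VP), {combine(s:S, [np:NP, vp:VP])}.
```

The query `?- phrase(s(S), [john, smokes]).` should bind S to lambda(P,P@john)@lambda(X,smoke(X)), up to the naming of variables. Adding a word only means adding a lexicon-fact; no DCG rule has to change.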

Interfaces: How they work


iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.

When this rule applies, the syntactic analysis component:

• looks up the Word found in the string (e.g. "smokes"), ...

• ... checks that its category is iv, ...

• ... and retrieves the relation symbol Sym to be used in the semantic construction.

lexicon(iv, smoke, [smokes])

So we have: Word = [smokes], Sym = smoke

Interfaces: How they work II

Then, the semantic construction component:

• takes Sym (Sym = smoke) ...

• ... and uses the semantic macro ivSem ...

• ... to transfer it into a full semantic representation for an intransitive verb.

ivSem(Sym,IV)
ivSem(smoke,IV)
ivSem(smoke,lambda(X, smoke(X)))

IV = lambda(X, smoke(X))

The DCG-rule is now fully instantiated and looks like this:

iv(lambda(X, smoke(X)))-->
    {lexicon(iv,smoke,[smokes]), ivSem(smoke, lambda(X, smoke(X)))}, [smokes].

What’s inside a semantic macro?

Semantic macros simply specify how to make a valid semantic representation out of a naked symbol. The one we've just seen in action for the verb "smokes" was:

ivSem(Sym,lambda(X,Formula)):- compose(Formula,Sym,[X]).

compose builds a first-order formula out of Sym and a new variable X:

Formula = smoke(X)

This is then embedded into a λ-abstraction over the same X:

lambda(X, smoke(X))

Another one, without compose:

pnSem(Sym,lambda(P,P@Sym)).

john ~> lambda(P,P@john)
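To make the macro idea concrete, here is a small sketch that can be loaded on its own. ivSem and pnSem are taken from the slide. The definition of compose/3 via =.. is an assumption about how that helper might look, and nounSem and tvSem are hypothetical macros written in the same style, following the λ-terms from the "Summing up" slide.

```prolog
% Assumed declaration: @ as a left-associative infix operator.
:- op(950, yfx, @).

% Assumed helper: compose(-Term, +Symbol, +Args) builds Symbol(Args...).
compose(Term, Symbol, Args) :- Term =.. [Symbol|Args].

% From the slide: intransitive verbs and proper names.
ivSem(Sym, lambda(X, Formula)) :- compose(Formula, Sym, [X]).
pnSem(Sym, lambda(P, P@Sym)).

% Hypothetical: nouns get the same shape as intransitive verbs.
nounSem(Sym, lambda(X, Formula)) :- compose(Formula, Sym, [X]).

% Hypothetical: transitive verbs, following "love" ~> λRλx(R@λy.love(x,y)).
tvSem(Sym, lambda(R, lambda(X, R@lambda(Y, Formula)))) :-
    compose(Formula, Sym, [X, Y]).
```

For example, `?- tvSem(love, TV).` should give TV = lambda(R,lambda(X,R@lambda(Y,love(X,Y)))), up to variable names. Because all the representation-specific detail sits in these few clauses, switching to another representation language would only require rewriting the macros.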

"John smokes"

Words (lexical):

lexicon(pn,john,[john]).
lexicon(iv,smoke,[smokes]).

pn(PN) --> …,[john]        Word = [john]
iv(IV) --> …,[smokes]      Word = [smokes]

pnSem(Sym,PN)    Sym = john     PN = lambda(P,P@john)
ivSem(Sym,IV)    Sym = smoke    IV = lambda(X,smoke(X))

np(NP) --> …,pn(PN)    NP = lambda(P,P@john)
vp(VP) --> …,iv(IV)    VP = lambda(X,smoke(X))

s(S)--> np(NP), vp(VP),{combine(s:S,[np:NP,vp:VP])}.

A look at combine

combine(s:(NP@VP),[np:NP,vp:VP]).

S = NP@VP
NP = lambda(P,P@john)
VP = lambda(X,smoke(X))

So:
S = lambda(P,P@john)@lambda(X,smoke(X))

That’s almost all, folks…

betaConvert(lambda(P,P@john)@lambda(X,smoke(X)),Converted)

Converted = smoke(john)
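The slides do not show how betaConvert is defined. The following is a minimal sketch of one way such a predicate could work, not the course implementation. It assumes that bound variables have already been turned into atoms (as vars2atoms does in the demo), that the input term is ground, and that all bound variables carry distinct names, so that a naive substitution is safe. The helper subst/4 is hypothetical.

```prolog
% Assumed declaration: @ as a left-associative infix operator.
:- op(950, yfx, @).

% subst(+Var, +Value, +Term, -Result): replace each free occurrence of the
% atom Var in Term by Value. Naive: no renaming to avoid variable capture.
subst(Var, Value, Term, Value) :- Term == Var, !.
subst(Var, _, lambda(Var, Body), lambda(Var, Body)) :- !.  % Var is rebound here
subst(Var, Value, Term, Result) :-
    compound(Term), !,
    Term =.. [F|Args],
    maplist(subst(Var, Value), Args, NewArgs),
    Result =.. [F|NewArgs].
subst(_, _, Term, Term).

% betaConvert(+Term, -Result): reduce every application whose functor
% converts to a lambda-abstraction; otherwise convert the subterms.
betaConvert(Fun@Arg, Result) :-
    betaConvert(Fun, lambda(X, Body)), !,
    subst(X, Arg, Body, Reduced),
    betaConvert(Reduced, Result).
betaConvert(Term, Result) :-
    compound(Term), !,
    Term =.. [F|Args],
    maplist(betaConvert, Args, NewArgs),
    Result =.. [F|NewArgs].
betaConvert(Term, Term).
```

With atoms p and x standing in for the variables, `?- betaConvert(lambda(p,p@john)@lambda(x,smoke(x)), C).` gives C = smoke(john). A full implementation would also have to rename bound variables where names clash; this sketch sidesteps that by assumption.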

Little Cheats

A few "special words" are dealt with in a somewhat different manner:

Determiners: ("every man")
• No semantic Sym in the lexicon:
  lexicon(det,_,[every],uni).
• Semantic representation generated by the macro alone:
  detSem(uni,lambda(P,lambda(Q,forall(X,(P@X)>(Q@X))))).

Negation – same thing: ("does not walk")
• No semantic Sym in the lexicon:
  lexicon(mod,_,[does,not],neg).
• Representation solely from macro:
  modSem(neg,lambda(P,lambda(X,~(P@X)))).
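A short sketch of how such a "special word" could be wired into the DCG. The lexicon-fact and the two macros are the ones from the slide; the det rule itself is an assumption, written by analogy with the iv and pn rules, and it reads the fourth lexicon argument instead of the (empty) symbol slot.

```prolog
% Assumed declaration: @ as a left-associative infix operator.
% > is a standard infix operator, and ~(...) is ordinary functional notation.
:- op(950, yfx, @).

% Four-argument lexicon-fact: no semantic symbol, only a type tag.
lexicon(det, _, [every], uni).

% Macros from the slide: the whole representation comes from the macro.
detSem(uni, lambda(P, lambda(Q, forall(X, (P@X) > (Q@X))))).
modSem(neg, lambda(P, lambda(X, ~(P@X)))).

% Assumed lexical rule for determiners: the macro is selected by the type tag.
det(DET) --> {lexicon(det, _, Word, Type), detSem(Type, DET)}, Word.
```

Under these assumptions, `?- phrase(det(D), [every]).` binds D to the universal-quantifier term from detSem. Negation could be handled by an analogous rule that calls modSem.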

The code that's online
(http://www.coli.uni-sb.de/cl/projects/milca/esslli)

• lexicon-facts have a fourth argument for any kind of additional information (e.g. fin/inf, gender):

lexicon(tv,eat,[eats],fin).

• iv/tv have an additional argument for infinitive/finite (e.g. "eat" vs. "eats"):

iv(I,IV)--> {lexicon(iv,Sym,Word,I),…}, Word.

• limited coordination (e.g. "talks and walks"), hence doubled categories:

vp2(VP2)--> vp1(VP1A), coord(C), vp1(VP1B),
    {combine(vp2:VP2,[vp1:VP1A,coord:C,vp1:VP1B])}.
vp1(VP1)--> v2(fin,V2), {combine(vp1:VP1,[v2:V2])}.

A demo

lambda :-

readLine(Sentence),

parse(Sentence,Formula), resetVars, vars2atoms(Formula),

betaConvert(Formula,Converted),

printRepresentations([Converted]).

Evaluation

Our new program has become much bigger, but it's…

• Modular: everything's in its right place:
  – Syntax in englishGrammar.pl
  – Semantics (macros + combine) in lambda.pl
  – Lexicon in lexicon.pl

• Scalable: E.g. extend the lexicon by adding facts to lexicon.pl

• Re-usable: E.g. to change the semantic construction method (e.g. to CLLS on Thursday), change only lambda.pl and keep the rest

What we‘ve done today

• Complex NPs, PNs and TVs in λ-based semantic construction

• A clean semantic construction framework in Prolog

• Its instantiation for λ-based semantic construction

Ambiguity

• Some sentences have more than one reading, i.e. more than one semantic representation.

• Standard Example: "Every man loves a woman":
  – Reading 1: the women may be different
    ∀x(man(x) -> ∃y(woman(y) ∧ love(x,y)))
  – Reading 2: there is one particular woman
    ∃y(woman(y) ∧ ∀x(man(x) -> love(x,y)))

• What does our system do?

Excursion: lambda, variables and atoms

• Question yesterday: Why don't we use Prolog variables for FO-variables?

• Advantage (at first sight): β-reduction as unification:

betaReduce(lambda(X, F)@X,F).

Now: X = john, F = walk(X) ("John walks")

betaReduce(lambda(john,walk(john))@john, walk(john))

F = walk(john)

Nice, but…

Problem: Coordination

"John and Mary"

(λX.λY.λP((X@P) ∧ (Y@P)) @ λQ.Q(john)) @ λR.R(mary)
λP((λQ.Q(john)@P) ∧ (λR.R(mary)@P))
λP(P(john) ∧ P(mary))

"John and Mary walk"

λP(P(john) ∧ P(mary)) @ λx.walk(x)
λx.walk(x)@john ∧ λx.walk(x)@mary

lambda(X,walk(X))@john & lambda(X,walk(X))@mary

β-reduction as unification would need both X = john and X = mary, which is impossible for a single Prolog variable.
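The clash can be reproduced directly in Prolog. The sketch below uses the betaReduce clause from the excursion slide and shows that one and the same VP term cannot be applied to two different arguments. The predicates shared_vp_fails and betaReduceFresh are hypothetical names; the copy_term workaround is only one possible fix and is not the route taken on these slides, where the demo calls vars2atoms to turn variables into atoms before β-conversion.

```prolog
% Assumed declaration: @ as a left-associative infix operator.
:- op(950, yfx, @).

% Beta-reduction as unification (the tempting version from the slide).
betaReduce(lambda(X, F)@X, F).

% The shared VP can be applied only once: the first call binds X to john,
% so the second call cannot unify X with mary.
shared_vp_fails :-
    VP = lambda(X, walk(X)),
    betaReduce(VP@john, _),
    \+ betaReduce(VP@mary, _).

% Possible workaround (an assumption, not the slides' approach): reduce a
% fresh copy of the functor so that the original variables stay unbound.
% Caveat: copy_term also renames any free variables inside Fun.
betaReduceFresh(Fun@Arg, Result) :-
    copy_term(Fun, lambda(Arg, Result)).
```

Here `?- shared_vp_fails.` succeeds, confirming the problem, while `?- VP = lambda(X,walk(X)), betaReduceFresh(VP@john,R1), betaReduceFresh(VP@mary,R2).` yields R1 = walk(john) and R2 = walk(mary).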
