LING 438/538 Computational Linguistics Sandiway Fong Lecture 23: 11/20



Today’s Topics

Three things
1. continue with context-free grammar example
  • deal with left recursion problem...
2. Homework
  • your chance to write a context-free grammar
3. 538 Class Presentations
  • selecting a chapter
  • format etc.


Last Time

• Let’s write a context-free grammar that returns parse trees for simple active/passive sentence pairs such as:
  – John hit a ball/John ate a sandwich
  – *John hit/John ate
  – *hit a ball/*ate a sandwich
  – the ball was hit/the sandwich was eaten
  – the ball was hit by John/the sandwich was eaten by John

• Let’s introduce traces in the case of passives:
  – [S [NP the ball] [VP [aux was] [VP [V hit] [NP trace]]]]
  – [S [NP the ball] [VP [VP [aux was] [VP [V hit] [NP trace]]] [PP [P by] [NP John]]]]


Grammar

• Note:
  – need to handle English passive morphology
  – passive be selects for a V-en
  Example
  – *was ate (simple past form)
  – was eaten (-en past participle form)

• Implementation:
  – use an extra argument to indicate the verb form
  – v(v(ate),past) --> [ate].
  – v(v(eaten),pastparticiple) --> [eaten].
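A minimal sketch of how the extra verb-form argument rules out *was ate. The rule name passive_vp//1 and the one-argument passive_be//1 are hypothetical simplifications for illustration, not the class grammar; SWI-Prolog is assumed.

```prolog
% Verb entries carry their form as a second argument (as on the slide).
v(v(ate), past) --> [ate].
v(v(eaten), pastparticiple) --> [eaten].

% Simplified passive "be" (hypothetical: no form argument here).
passive_be(v(was)) --> [was].

% Hypothetical rule: passive "be" selects a verb in past participle form only.
passive_vp(vp(Be, V)) --> passive_be(Be), v(V, pastparticiple).

% ?- phrase(passive_vp(T), [was, eaten]).
% T = vp(v(was), v(eaten)).
% ?- phrase(passive_vp(T), [was, ate]).
% false.
```

Because the form is just another argument, the selection is enforced by ordinary unification: past does not unify with pastparticiple.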


Grammar

• [Developed in class]

s(s(NP,VP)) --> np(NP,notrace), vp(VP,_,_,notrace).

np(np(Det,N),notrace) --> det(Det), common_noun(N).
np(np(N),notrace) --> proper_noun(N).
np(np(trace),trace) --> [].

proper_noun(john) --> [john].
det(det(the)) --> [the].
det(det(a)) --> [a].
common_noun(n(ball)) --> [ball].
common_noun(n(sandwich)) --> [sandwich].

vp(vp(BE,VP),Form,selectsforvp,notrace) --> passive_be(BE,Form), vp(VP,pastparticiple,transitive,trace).
vp(vp(V,NP),Form,transitive,EC) --> transitive(V,Form), np(NP,EC).
vp(vp(V),Form,intransitive,notrace) --> intransitive(V,Form).
vp(vp(VP,PP),Form,transitive,EC) --> vp(VP,Form,transitive,EC), pp(PP).


Grammar

passive_be(v(be),root) --> [be].

passive_be(v(is),thirdpersonpresent) --> [is].

passive_be(v(was),past) --> [was].

intransitive(v(eat),root) --> [eat].

intransitive(v(eats),s) --> [eats].

intransitive(v(ate),past) --> [ate].

intransitive(v(eaten),pastparticiple) --> [eaten].

transitive(v(eat),root) --> [eat].

transitive(v(eats),s) --> [eats].

transitive(v(ate),past) --> [ate].

transitive(v(eaten),pastparticiple) --> [eaten].

transitive(v(hit),root) --> [hit].

transitive(v(hits),s) --> [hits].

transitive(v(hit),past) --> [hit].

transitive(v(hit),pastparticiple) --> [hit].

pp(pp(P,NP)) --> p(P), np(NP,notrace).

p(p(by)) --> [by].


Grammatical Sentences

– John hit a ball/John ate a sandwich
– John ate
– the ball was hit/the sandwich was eaten
– the ball was hit by John/the sandwich was eaten by John

?- s(X,[john,hit,the,ball],[]).
X = s(np(john),vp(v(hit),np(det(the),n(ball))))
?- s(X,[john,ate,a,sandwich],[]).
X = s(np(john),vp(v(ate),np(det(a),n(sandwich))))
?- s(X,[john,ate],[]).
X = s(np(john),vp(v(ate)))
?- s(X,[the,sandwich,was,eaten],[]).
X = s(np(det(the),n(sandwich)),vp(v(was),vp(v(eaten),np(trace))))
?- s(X,[the,sandwich,was,eaten,by,john],[]).
X = s(np(det(the),n(sandwich)),vp(v(was),vp(vp(v(eaten),np(trace)),pp(p(by),np(john)))))


Infinite loop

• Occurs with ungrammatical input
  – *John hit

• Also with grammatical input when we ask for more solutions
  – i.e. invoke backtracking
  – John ate a sandwich

• Computational System
  – involves recursion
  – Prolog also selects first matching rule
  – but will try other rules on backtracking


Solving the Left Adjunction Problem

• Rule (simplified):
  – vp --> vp, pp.
  – causes Prolog to go into an infinite loop

• Why?
  – Suppose there is no PP in the input
  – what happens on backtracking?


Solving the Left Adjunction Problem

• Idea:
  – Look ahead into the input for a potential PP
  – License Prolog to use the VP adjunction rule only when there is an appropriate (overt) preposition ahead in the input


Solving the Left Adjunction Problem

• Implementation:
  – requires access to the input list
  – not available directly from the DCG rule

• DCG rules are translated into underlying Prolog rules that contain input/output list pairs
  Example: DCG rule
  – vp(vp(VP,PP),Number) --> vp(VP,Number), pp(PP).
  – gets translated into Prolog as
  – vp(vp(A,B), C, D, E) :- vp(A, C, D, F), pp(B, F, E).
  – D = part of sentence to be analyzed by the VP rule
  – E = part left over after VP rule
  – (the sandwich) D = [was,eaten,by,john]
  – E = []
  – F = ?
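The input/output list pair can be observed with phrase/3, which exposes the leftover list. This is a small sketch using the NP rules without the trace argument (a simplification for illustration), assuming SWI-Prolog:

```prolog
% Simplified NP fragment (trace argument omitted for this illustration).
np(np(Det, N)) --> det(Det), common_noun(N).
det(det(the)) --> [the].
common_noun(n(sandwich)) --> [sandwich].

% phrase(NonTerminal, Input, Rest): Input is the list handed to the rule,
% Rest is what remains after the rule has consumed its words.
% ?- phrase(np(T), [the, sandwich, was, eaten, by, john], Rest).
% T = np(det(the), n(sandwich)),
% Rest = [was, eaten, by, john].
```

The leftover list of one nonterminal becomes the input list of the next, which is exactly the role played by D, F and E in the translated VP rule.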


Solving the Left Adjunction Problem

• DCG rules are translated into underlying Prolog rules that contain input/output list pairs
  Example: DCG rule
  – vp(vp(VP,PP),Number) --> vp(VP,Number), pp(PP).
  – gets translated into Prolog:
  – vp(vp(A,B), C, D, E) :- vp(A, C, D, F), pp(B, F, E).

• Solution:
  – modify the underlying Prolog rule directly
  – add a call to a Prolog predicate that checks the input list for a preposition
  – vp(vp(A,B), C, D, E) :- checkforpp(D), vp(A, C, D, F), pp(B, F, E).


Solving the Left Adjunction Problem

• Prolog VP adjunction rule
  – vp(vp(A,B), C, D, E) :- checkforpp(D), vp(A, C, D, F), pp(B, F, E).

• Implementation of the supporting predicate checkforpp/1:
  – % checkforpp(List) true if List contains a preposition (by)
  – checkforpp([by|_]).
  – checkforpp([_|L]) :- checkforpp(L).
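A quick way to try checkforpp/1 on its own (the two clauses are those from the slide; the test lists are just illustrative, SWI-Prolog assumed):

```prolog
% checkforpp(List): true if List contains the preposition by.
checkforpp([by|_]).
checkforpp([_|L]) :- checkforpp(L).

% ?- checkforpp([was, eaten, by, john]).
% true.
% ?- checkforpp([john, ate]).
% false.
```

Note that the predicate only inspects the list and leaves it unchanged, so calling it again on the same input succeeds again.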


Solving the Left Adjunction Problem

• Actually, this only partially solves the problem
• Case 1: no PP in input
  – VP adjunction won’t be triggered because checkforpp/1 fails
  – vp(vp(A,B), C, D, E) :- checkforpp(D), vp(A, C, D, F), pp(B, F, E).

• Case 2: there is a PP in input
  – still get recursion on backtracking ... and an infinite loop
  – because each recursion is licensed by the same PP

• Idea:
  – need to say that we license VP adjunction one PP at a time

• Prolog solution:
  – each time checkforpp/1 succeeds it should “mark” the PP so that next time it is called it won’t select the same PP again


Solving the Left Adjunction Problem

• Idea:
  – need to say that we license VP adjunction one PP at a time

• Prolog solution:
  – each time checkforpp/1 succeeds it should “mark” the PP so that next time it is called it won’t select the same PP again

• Implementation:
  – checkforpp/2
  – % checkforpp(List,NewList) true if List contains a preposition (by) and NewList is the marked List
  – checkforpp([by|L],[by2|L]).
  – checkforpp([X|L],[X|L2]) :- \+ X = by, checkforpp(L,L2).
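The marking behaviour of checkforpp/2 can be checked in isolation. The clauses are the ones given on the slide; the queries are illustrative and assume SWI-Prolog:

```prolog
% checkforpp(List, NewList): true if List contains the preposition by;
% NewList is List with the first by replaced by the marker by2.
checkforpp([by|L], [by2|L]).
checkforpp([X|L], [X|L2]) :- \+ X = by, checkforpp(L, L2).

% ?- checkforpp([was, eaten, by, john], L).
% L = [was, eaten, by2, john].
% ?- checkforpp([was, eaten, by2, john], L).
% false.
```

The second query fails because the only preposition is already marked, which is what stops the same PP from licensing VP adjunction twice.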


Solving the Left Adjunction Problem

• VP adjunction Prolog rule:
  – vp(vp(A,B), C, D, E) :- checkforpp(D,G), vp(A, C, G, F), pp(B, F, E).

• must make sure PP rules still manage to pick up marked preposition

• Hence:
  – p(p(by)) --> [by].
  – must morph into:
  – p(p(by)) --> [by2].
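Putting the pieces together, here is a sketch of the class grammar with the marked-preposition fix applied to the four-argument VP rule. Two things are assumptions rather than slide content: the lexicon is trimmed to the words needed for the test sentences, and the adjunction rule is written as a plain clause with the two list arguments added by hand, mirroring how the simplified rule is handled on the slides. SWI-Prolog is assumed.

```prolog
s(s(NP,VP)) --> np(NP,notrace), vp(VP,_,_,notrace).

np(np(Det,N),notrace) --> det(Det), common_noun(N).
np(np(N),notrace) --> proper_noun(N).
np(np(trace),trace) --> [].

proper_noun(john) --> [john].
det(det(the)) --> [the].
det(det(a)) --> [a].
common_noun(n(sandwich)) --> [sandwich].

vp(vp(BE,VP),Form,selectsforvp,notrace) -->
    passive_be(BE,Form), vp(VP,pastparticiple,transitive,trace).
vp(vp(V,NP),Form,transitive,EC) --> transitive(V,Form), np(NP,EC).
vp(vp(V),Form,intransitive,notrace) --> intransitive(V,Form).
% VP adjunction as an ordinary Prolog clause: S0, S1, S2, S are the
% input/output lists that DCG translation would normally add for us.
vp(vp(VP,PP),Form,transitive,EC,S0,S) :-
    checkforpp(S0,S1),                  % mark one "by" as by2, or fail
    vp(VP,Form,transitive,EC,S1,S2),    % parse the inner VP on the marked list
    pp(PP,S2,S).

checkforpp([by|L],[by2|L]).
checkforpp([X|L],[X|L2]) :- \+ X = by, checkforpp(L,L2).

% Trimmed lexicon (subset of the slides' entries).
passive_be(v(was),past) --> [was].
intransitive(v(ate),past) --> [ate].
transitive(v(ate),past) --> [ate].
transitive(v(eaten),pastparticiple) --> [eaten].
transitive(v(hit),past) --> [hit].

pp(pp(P,NP)) --> p(P), np(NP,notrace).
p(p(by)) --> [by2].                     % PP rule picks up the marked preposition

% findall/3 forces full backtracking, so these only return if the search terminates.
% ?- findall(X, s(X,[the,sandwich,was,eaten,by,john],[]), Xs).
% Xs = [s(np(det(the),n(sandwich)),
%         vp(v(was),vp(vp(v(eaten),np(trace)),pp(p(by),np(john)))))].
% ?- findall(X, s(X,[john,hit],[]), Xs).
% Xs = [].
```

Collecting all solutions with findall/3 is a convenient check here: with the original left-recursive rule these queries would not return.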


Homework

• due Tuesday 27th


Why can’t computers use English?

• from Lecture 1
  – a linguist’s view:
    • a list of examples that are hard for computers to do
  – a computational linguist’s view (mine):
    • these actually aren’t very hard at all...
    • armed with some DCG technology, we can easily write a grammar that makes the distinctions outlined in the pamphlet
  – your homework task
    • write a grammar for these examples



If computers are so smart, why can't they use simple English?

• Consider, for instance, the four letters read; they can be pronounced as either reed or red. How does the machine know in each case which is the correct pronunciation? Suppose it comes across the following sentences:
• (1) The girls will read the paper. (reed)
• (2) The girls have read the paper. (red)
• We might program the machine to pronounce read as reed if it comes right after will, and red if it comes right after have. But then sentences (3) through (5) would cause trouble.
• (3) Will the girls read the paper? (reed)
• (4) Have any men of good will read the paper? (red)
• (5) Have the executors of the will read the paper? (red)
• How can we program the machine to make this come out right?


If computers are so smart, why can't they use simple English?

• (6) Have the girls who will be on vacation next week read the paper yet? (red)
• (7) Please have the girls read the paper. (reed)
• (8) Have the girls read the paper? (red)
• Sentence (6) contains both have and will before read, and both of them are auxiliary verbs. But will modifies be, and have modifies read. In order to match up the verbs with their auxiliaries, the machine needs to know that the girls who will be on vacation next week is a separate phrase inside the sentence.
• In sentence (7), have is not an auxiliary verb at all, but a main verb that means something like 'cause' or 'bring about'. To get the pronunciation right, the machine would have to be able to recognize the difference between a command like (7) and the very similar question in (8), which requires the pronunciation red.


Homework Requirements

• This is what you need to submit
• Part 1
  – write down (in English) the grammatical constraints you are going to use to make the distinctions in examples (1) – (8).
  – e.g. what you are assuming about things like auxiliary/verb fronting and the constraints from perfective have

• Part 2
  – implement your constraints in the framework of a Definite Clause Grammar (DCG) that returns parse trees.
  – submit both your grammar and the runs.
  – to make the distinction between the forms of the verb read readily apparent in your parse trees, use something like:
  – v(v(red),pastparticiple) --> [read].
  – v(v(read),root) --> [read].


Homework Requirements

• Note: the question mark is crucial in the following example
• (5) Have the executors of the will read the paper? (red)

• Note:
  – you can either treat ? as an input word or have the parser return two possible parses (without ?)


538 class presentations

• your chance to get up and explain ideas in computational linguistics to the rest of the class

• Textbook Chapters:
  – from Chapter 11 onwards
  – as long as it’s on material we haven’t covered (or will cover) in class
  – so, e.g., the basic pumping lemma wouldn’t be acceptable

• Remaining topics:
  – parsing techniques (left-corner, chart, tabular)
  – WordNet (ontologies, semantic networks)


Chapters

• II: Syntax
  • 11 Features and Unification
  • 12 Lexicalized and Probabilistic Parsing
  • 13 Language and Complexity

• III: Semantics
  • 14 Representing Meaning
  • 15 Semantic Analysis
  • 16 Lexical Semantics
  • 17 Word Sense Disambiguation and Information Retrieval

• IV: Pragmatics
  • 18 Discourse
  • 19 Dialog and Conversational Agents
  • 20 Natural Language Generation

• V: Multilingual Processing
  • 21 Machine Translation


538 class presentations

• Pick one chapter
  – pick topic(s) within the chapter
  – send me email: first-come first-served
  – (same chapter different topics possible)
  – 10 minute presentation with slides
  – (powerpoint, PDF acceptable)
  – explain and evaluate the central idea/technique/algorithm/trade-offs behind the topic you’ve chosen
  – you’ll be graded on clarity of presentation and how well you explain or communicate the topic(s)