
SIMS 290-2: Applied Natural Language Processing


Page 1: SIMS 290-2:  Applied Natural Language Processing

1

SIMS 290-2: Applied Natural Language Processing

Marti Hearst
December 1, 2004

Page 2: SIMS 290-2:  Applied Natural Language Processing

2

Today

Discourse Processing
– Going beyond the sentence

Characteristics
– Cohesion / coherence
– Given / new
– Rhetorical structure

Issues: Segmentation
– Linear
– Hierarchical
– Text vs. Dialogue
– Discourse cues vs. content change

Co-reference / anaphora resolution

Dialogue Processing

Page 3: SIMS 290-2:  Applied Natural Language Processing

3
Adapted from slide by Julia Hirschberg

What makes a text/dialogue coherent?

“Consider, for example, the difference between passages (18.71) and (18.72). Almost certainly not. The reason is that these utterances, when juxtaposed, will not exhibit coherence. Do you have a discourse? Assume that you have collected an arbitrary set of well-formed and independently interpretable utterances, for instance, by randomly selecting one sentence from each of the previous chapters of this book.”

vs….

Page 4: SIMS 290-2:  Applied Natural Language Processing

4
Adapted from slide by Julia Hirschberg

What makes a text/dialogue coherent?

“Assume that you have collected an arbitrary set of well-formed and independently interpretable utterances, for instance, by randomly selecting one sentence from each of the previous chapters of this book. Do you have a discourse? Almost certainly not. The reason is that these utterances, when juxtaposed, will not exhibit coherence. Consider, for example, the difference between passages (18.71) and (18.72).” (J&M:695)

Page 5: SIMS 290-2:  Applied Natural Language Processing

5
Adapted from slide by Julia Hirschberg

What makes a text coherent?

Discourse/topic structure: appropriate sequencing of subparts of the discourse

Rhetorical structure: appropriate use of coherence relations between subparts of the discourse

Referring expressions: words or phrases, the semantic interpretation of which is a discourse entity

Page 6: SIMS 290-2:  Applied Natural Language Processing

6
Adapted from slide by Julia Hirschberg

Information Status

Contrast

– John wanted a poodle but Becky preferred a corgi.

Topic/comment – The corgi they bought turned out to have fleas.

Theme/rheme – The corgi they bought turned out to have fleas.

Focus/presupposition – It was Becky who took him to the vet.

Given/new – Some wildcats bite, but this wildcat turned out to be a sweetheart.
– Contrast Speaker (S) and Hearer (H)

Page 7: SIMS 290-2:  Applied Natural Language Processing

7
Adapted from slide by Julia Hirschberg

Determining Given vs. New

Entities when first introduced are new:
– Brand-new (H must create a new entity): I saw a dinosaur today.
– Unused (H already knows of this entity): I saw your mother today.

Evoked entities are old -- already in the discourse:
– Textually evoked: The dinosaur was scaly and gray.
– Situationally evoked: The light was red when you went through it.

Inferrables:
– Containing: I bought a carton of eggs. One of them was broken.
– Non-containing: A bus pulled up beside me. The driver was a monkey.

Page 8: SIMS 290-2:  Applied Natural Language Processing

8
Adapted from slide by Julia Hirschberg

Given/New and Definiteness/Indefiniteness

Subject NPs tend to be syntactically definite and old
Object NPs tend to be indefinite and new

I saw a black cat yesterday. The cat looked hungry.
– Definite articles, demonstratives, possessives, personal pronouns, proper nouns, quantifiers like all, every

Indefinite articles, quantifiers like some, any, one signal indefiniteness…but….

This guy came into the room

Page 9: SIMS 290-2:  Applied Natural Language Processing

9

Discourse/Topic Structure

Text Segmentation: Linear
– TextTiling
– Look for changes in content words

Hierarchical
– Grosz & Sidner’s Centering theory
– Morris & Hirst’s algorithm
– Lexical chaining through Roget’s thesaurus

Hierarchical + Relations
– Mann et al.’s Rhetorical Structure Theory
– Marcu’s algorithm

Page 10: SIMS 290-2:  Applied Natural Language Processing

10

TextTiling

Goal: find multi-paragraph topics
Example: 21-paragraph article called Stargazers

Page 11: SIMS 290-2:  Applied Natural Language Processing

11
Adapted from slide by William Yerazunis

TextTiling

Goal: find multi-paragraph topics
But … it’s difficult to define topic (Brown & Yule)
Focus instead on topic shift or change
Change in content, by contrast with setting, scene, characters

Mechanism:
– compare adjacent blocks of text
– look for shifts in vocabulary

Page 12: SIMS 290-2:  Applied Natural Language Processing

12

Intuition behind TextTiling

Page 13: SIMS 290-2:  Applied Natural Language Processing

13
Adapted from slide by William Yerazunis

TextTiling Algorithm

Tokenization
Lexical Score Determination
– Blocks
– Vocabulary Introductions
– Chains
Boundary Identification

Page 14: SIMS 290-2:  Applied Natural Language Processing

14
Adapted from slide by William Yerazunis

Tokenization

Convert text stream into terms (words)
Remove “stop words”
Reduce to root (inflectional morphology)
Subdivide into “token-sequences” (a substitute for sentences)
Find potential boundary points (paragraph breaks)
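Below is a minimal Python sketch of this tokenization stage. The tiny stoplist, the suffix-stripping stemmer, and the default sequence length of 20 tokens are illustrative assumptions, not the exact choices of the original implementation.

    import re

    STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it"}  # toy stoplist

    def stem(term):
        # crude inflectional stripping; a real system would use a morphological analyzer
        for suffix in ("ing", "ed", "es", "s"):
            if term.endswith(suffix) and len(term) > len(suffix) + 2:
                return term[:-len(suffix)]
        return term

    def tokenize(text, w=20):
        """Lowercase, drop stopwords, stem, then group the surviving terms
        into fixed-size token-sequences of w tokens (a stand-in for sentences)."""
        words = re.findall(r"[a-z]+", text.lower())
        terms = [stem(t) for t in words if t not in STOPWORDS]
        return [terms[i:i + w] for i in range(0, len(terms), w)]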

Page 15: SIMS 290-2:  Applied Natural Language Processing

15
Adapted from slide by William Yerazunis

Determining Scores

Compute a score at each token-sequence gap
Score based on lexical occurrences
Block algorithm:

score(i) = \frac{\sum_t w_{t,b_1} \, w_{t,b_2}}{\sqrt{\sum_t w_{t,b_1}^2 \sum_t w_{t,b_2}^2}}

where w_{t,b} is the frequency of term t in block b, and b_1, b_2 are the blocks of token-sequences on either side of gap i.
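In code, this block score is just the cosine similarity of the term-frequency vectors of the two blocks around a gap. A short sketch, assuming the token-sequences produced above and an illustrative block size of k = 10:

    from collections import Counter
    from math import sqrt

    def block_score(seqs, i, k=10):
        """Cosine similarity between the k token-sequences before gap i
        and the k after it; seqs is a list of term lists."""
        b1 = Counter(t for s in seqs[max(0, i - k):i] for t in s)
        b2 = Counter(t for s in seqs[i:i + k] for t in s)
        num = sum(b1[t] * b2[t] for t in b1)
        denom = sqrt(sum(v * v for v in b1.values()) * sum(v * v for v in b2.values()))
        return num / denom if denom else 0.0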

Page 16: SIMS 290-2:  Applied Natural Language Processing

16

Page 17: SIMS 290-2:  Applied Natural Language Processing

17
Adapted from slide by William Yerazunis

Boundary Identification

Smooth the plot (average smoothing)
Assign depth score at each token-sequence gap
– “Deeper” valleys score higher
Order boundaries by depth score
Choose boundary cutoff (avg - sd/2 of the depth scores)
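The depth score and cutoff can be sketched directly from the description above; the hill-climb to the nearest peak on each side follows the published description, while the function names are illustrative:

    from statistics import mean, stdev

    def depth_score(scores, i):
        """Climb left and right from gap i to the nearest peaks and sum how far
        the gap's score sits below each: deeper valleys score higher."""
        l = i
        while l > 0 and scores[l - 1] >= scores[l]:
            l -= 1
        r = i
        while r < len(scores) - 1 and scores[r + 1] >= scores[r]:
            r += 1
        return (scores[l] - scores[i]) + (scores[r] - scores[i])

    def boundaries(scores):
        depths = [depth_score(scores, i) for i in range(len(scores))]
        cutoff = mean(depths) - stdev(depths) / 2   # the avg - sd/2 threshold
        return [i for i, d in enumerate(depths) if d > cutoff]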

Page 18: SIMS 290-2:  Applied Natural Language Processing

18
Adapted from slide by William Yerazunis

Evaluation

DATA
– Twelve news articles from Dialog
– Seven human judges per article
– “Major” boundaries: chosen by >= 3 judges
– Avg number of paragraphs: 26.75
– Avg number of boundaries: 10 (39%)

RESULTS
– Between upper and lower bounds
– Upper bound: judges’ averages
– Lower bound: reasonable simple algorithm

Page 19: SIMS 290-2:  Applied Natural Language Processing

19
Adapted from slide by William Yerazunis

Assessing Agreement Among Judges

KAPPA Coefficient
– Measures pairwise agreement
– Takes expected chance agreement into account
– P(A) = proportion of times judges agree
– P(E) = proportion of agreement expected by chance

Reported values:
– .43 to .68 (Isard & Carletta 95, boundaries)
– .65 to .90 (Rose 95, sentence segmentation)
– Here, k = .647

\kappa = \frac{P(A) - P(E)}{1 - P(E)}

P(E) = P(B)^2 + P(\lnot B)^2 = .39^2 + .61^2 \approx .52
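The computation is small enough to check directly. The slide gives P(B) = .39 and k = .647 but not P(A); the agreement value below is back-solved for illustration:

    def kappa(p_agree, p_boundary):
        """Two-category kappa: chance agreement P(E) is the probability that
        two judges pick the same label (boundary / no boundary) by chance."""
        p_e = p_boundary ** 2 + (1 - p_boundary) ** 2   # = .52 for P(B) = .39
        return (p_agree - p_e) / (1 - p_e)

    print(round(kappa(0.832, 0.39), 3))   # -> 0.647, matching the slide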

Page 20: SIMS 290-2:  Applied Natural Language Processing

20
Adapted from slide by William Yerazunis

TextTiling Conclusions

First computational investigation into multi-paragraph discourse units
Simple discourse cue: position-sensitive term repetition
Acceptable performance for some tasks
Has been reproduced/used by many researchers
Multi-lingual (applied by others to French, German, Arabic)

Page 21: SIMS 290-2:  Applied Natural Language Processing

21
Adapted from slide by Julia Hirschberg

What Can Hierarchical Structure Tell Us?

Welcome to word processing. That’s using a computer to type letters and reports. Make a typo?

No problem.

Just back up, type over the mistake, and it’s gone.

And, it eliminates retyping.

And, it eliminates retyping.

Page 22: SIMS 290-2:  Applied Natural Language Processing

22
Adapted from slide by Julia Hirschberg

Centering Theory of Discourse Structure (Grosz & Sidner ‘86)

A prominent theory of discourse structure
Provides for multiple levels of analysis: S’s purpose as well as content of utterances and S and H’s attentional state
Identifies only a few, general relations that hold among intentions
Often leads to a hierarchical structure

Three components:
– Linguistic structure
– Intentional structure
– Attentional structure

Page 23: SIMS 290-2:  Applied Natural Language Processing

23

Example of Hierarchical Analysis(Morris and Hirst ’91)

Page 24: SIMS 290-2:  Applied Natural Language Processing

24

Page 25: SIMS 290-2:  Applied Natural Language Processing

25
Adapted from slide by Julia Hirschberg

Rhetorical Structure Theory (Mann, Matthiessen, and Thompson ‘89)

One theory of discourse structure, based on identifying relations between parts of the text

Identify meaningful units and the relations between them

– Clauses and clause-like units that are unequivocally the nucleus or satellite of a rhetorical relation.

[Only the midday sun at tropical latitudes is warm enough] [to thaw ice on occasions,] [but any liquid water formed in this way would evaporate almost instantly] [because of the low atmospheric pressure.]

Nucleus/satellite notion encodes asymmetry

Page 26: SIMS 290-2:  Applied Natural Language Processing

26
Adapted from slide by Julia Hirschberg

Rhetorical Structure Theory

Some rhetorical relations:
– Elaboration (set/member, class/instance, whole/part, …)
– Contrast: multinuclear
– Condition: Sat presents precondition for N
– Purpose: Sat presents goal of the activity in N
– Sequence: multinuclear
– Result: N results from something presented in Sat
– Evidence: Sat provides evidence for something claimed in N

Page 27: SIMS 290-2:  Applied Natural Language Processing

27
Adapted from slide by Daniel Marcu

Determining high-level relations

[Smart cards are not a new phenomenon.1] [They have been in development since the late 1970s and have found major applications in Europe, with more than a quarter of a billion cards made so far.2] [The vast majority of chips have gone into prepaid, disposable telephone cards, but even so the experience gained has reduced manufacturing costs, improved reliability and proved the viability of smart cards.3] [International and national standards for smart cards are well under development to ensure that cards, readers and the software for the many different applications that may reside on them can work together seamlessly and securely.4] [Standards set by the International Organization for Standardization (ISO), for example, govern the placement of contacts on the face of a smart card so that any card and reader will be able to connect.5]

Page 28: SIMS 290-2:  Applied Natural Language Processing

28
Adapted from slide by Daniel Marcu

Representing implicit relations

[Smart cards are becoming more attractive2] [as the price of microcomputing power and storage continues to drop.3] [They have two main advantages over magnetic-stripe cards.4] [First, they can carry 10 or even 100 times as much information5] [- and hold it much more robustly.6] [Second, they can execute complex tasks in conjunction with a terminal.7]

Page 29: SIMS 290-2:  Applied Natural Language Processing

29
Adapted from slide by Julia Hirschberg

What’s the Rhetorical Structure?

System: Hello. How may I help you?
User: I would like to find out why I was charged for a call?
System: What call would you like to inquire about?
User: My bill says I made a call to Syncamaloo, Texas, but I’ve never even heard of this town.
System: May I have the date of the call that appears on your bill?

Page 30: SIMS 290-2:  Applied Natural Language Processing

30
Adapted from slide by Daniel Marcu

Issues for RST

Many variations in expression:
[I have not read this book.] [It was written by Bertrand Russell.]
[I have not read this book,] [which was written by Bertrand Russell.]
[I have not read this book written by Bertrand Russell.]
[I have not read this Bertrand Russell book.]

Rhetorical relations are ambiguous:
[He caught a bad fever] [while he was in Africa.]

– Circumstance > Temporal-Same-Time

[With its distant orbit, Mars experiences frigid weather conditions.] [Surface temperatures typically average about –60 degrees Celsius at the equator and can dip to –123 degrees C near the poles. ]

– Evidence > Elaboration

Page 31: SIMS 290-2:  Applied Natural Language Processing

31
Adapted from slide by Julia Hirschberg

Identifying RS Automatically (Marcu ’99)

Train a parser on a discourse treebank
– 90 RS trees, hand-annotated for rhetorical relations
– Elementary discourse units (edu’s) linked by RR
– Parser learns to identify N and S and their RR
– Features: WordNet-based similarity, lexical, structural

Uses discourse segmenter to identify discourse units
– Trained to segment on hand-labeled corpus (C4.5)
– Features: 5-word POS window, presence of discourse markers, punctuation, seen a verb?, …
– Eval: 96-98% accuracy
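A sketch of the kind of feature vector such a segmenter could learn from. The marker list, window shape, and feature names here are illustrative assumptions rather than Marcu's exact setup (with a generic learner standing in for C4.5):

    DISCOURSE_MARKERS = {"because", "but", "although", "however", "while", "then"}

    def segment_features(tagged, i):
        """Features for deciding whether an edu boundary follows token i;
        `tagged` is a list of (word, POS) pairs."""
        window = [tagged[j][1] if 0 <= j < len(tagged) else "NONE"
                  for j in range(i - 2, i + 3)]            # 5-word POS window
        word = tagged[i][0].lower()
        return window + [
            word in DISCOURSE_MARKERS,                     # discourse marker here?
            word in {",", ";", ":", "."},                  # punctuation?
            any(pos.startswith("VB") for _, pos in tagged[:i + 1]),  # seen a verb yet?
        ]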

Page 32: SIMS 290-2:  Applied Natural Language Processing

32
Adapted from slide by Julia Hirschberg

Identifying RS Automatically (Marcu ’99)

Evaluation of parser:
– Id edu’s: Recall 75%, Precision 97%
– Id hierarchical structure (2 edu’s related): Recall 71%, Precision 84%
– Id nucleus/satellite labels: Recall 58%, Precision 69%
– Id RR: Recall 38%, Precision 45%

Later errors due mostly to edu mis-identification
Id of hierarchical structure and n/s status comparable to human when hand-labeled edu’s used
Hierarchical structure is easier to id than RR

Page 33: SIMS 290-2:  Applied Natural Language Processing

33
Adapted from slide by Julia Hirschberg

Some Problems with RST (cf. Moore & Pollack ‘92)

How many Rhetorical Relations are there?
How can we use RST in dialogue as well as monologue?
RST does not allow for multiple relations holding between parts of a discourse
RST does not model overall structure of the discourse

Page 34: SIMS 290-2:  Applied Natural Language Processing

34
Adapted from slide by Ani Nenkova

Referring Expressions

Referring expressions are words or phrases, the semantic interpretation of which is a discourse entity (also called referent)

Discourse entities are semantic objects.
– Can have multiple syntactic realizations within a text

Discourse entities exist in the domain D in which a text is interpreted

Page 35: SIMS 290-2:  Applied Natural Language Processing

35
Adapted from slide by Ani Nenkova

Referring Expressions: Example

A pretty woman entered the restaurant. She sat at the table next to mine and only then I recognized her. This was Amy Garcia, my next door neighbor from 10 years ago. The woman has totally changed! Amy was at the time shy…

Page 36: SIMS 290-2:  Applied Natural Language Processing

36
Adapted from slide by Ani Nenkova

Pronouns vs. Full NP

A pretty woman entered the restaurant. She sat at the table next to mine and only then I recognized her. This was Amy Garcia, my next door neighbor from 10 years ago. The woman has totally changed! Amy was at the time shy…

Page 37: SIMS 290-2:  Applied Natural Language Processing

37
Adapted from slide by Ani Nenkova

Definite vs. Indefinite NPs

A pretty woman entered the restaurant. She sat at the table next to mine and only then I recognized her. This was Amy Garcia, my next door neighbor from 10 years ago. The woman has totally changed! Amy was at the time shy…

Page 38: SIMS 290-2:  Applied Natural Language Processing

38
Adapted from slide by Ani Nenkova

Common Noun vs. Proper Noun

A pretty woman entered the restaurant. She sat at the table next to mine and only then I recognized her. This was Amy Garcia, my next door neighbor from 10 years ago. The woman has totally changed! Amy was at the time shy…

Page 39: SIMS 290-2:  Applied Natural Language Processing

39
Adapted from slide by Ani Nenkova

Modified vs. Bare head NP

A pretty woman entered the restaurant. She sat at the table next to mine and only then I recognized her. This was Amy Garcia, my next door neighbor from 10 years ago. The woman has totally changed! Amy was at the time shy…

Page 40: SIMS 290-2:  Applied Natural Language Processing

40
Adapted from slide by Ani Nenkova

Premodified vs. Postmodified

A pretty woman entered the restaurant. She sat at the table next to mine and only then I recognized her. This was Amy Garcia, my next door neighbor from 10 years ago. The woman has totally changed! Amy was at the time shy…

Page 41: SIMS 290-2:  Applied Natural Language Processing

41
Adapted from slide by Ani Nenkova

Anaphora resolution

Finding in a text all the referring expressions that have one and the same denotation

– Pronominal anaphora resolution
– Anaphora resolution between named entities
– Full noun phrase anaphora resolution

Page 42: SIMS 290-2:  Applied Natural Language Processing

42
Adapted from slide by Ani Nenkova

Anaphora Resolution

A pretty woman entered the restaurant. She sat at the table next to mine and only then I recognized her. This was Amy Garcia, my next door neighbor from 10 years ago. The woman has totally changed! Amy was at the time shy…

Page 43: SIMS 290-2:  Applied Natural Language Processing

43
Adapted from slide by Ani Nenkova

Pronominal anaphora resolution

Rule-based vs statistical
– (Ken 1996), (Lap 1994) vs (Ge 1998)

Performed on full syntactic parse vs on shallow syntactic parse
– (Lap 1994), (Ge 1998) vs (Ken 1996)

Type of text used for the evaluation
– (Lap 1994) computer manual texts (86% accuracy)
– (Ge 1998) WSJ articles (83% accuracy)
– (Ken 1996) different genres (75% accuracy)
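The rule-based line of work scores pronoun candidates by grammatical salience after hard agreement filtering. A toy sketch in that spirit (the weights, feature names, and data shapes are invented for illustration, not any published system's values):

    def resolve_pronoun(pronoun, candidates):
        """Pick the most salient preceding NP that agrees with the pronoun.
        candidates: dicts like {"text": ..., "gender": ..., "number": ...,
        "sentence_distance": int, "is_subject": bool}."""
        def salience(c):
            score = 100 - 50 * c["sentence_distance"]   # recency
            if c["is_subject"]:
                score += 80                              # subjects are more salient
            return score
        agreeing = [c for c in candidates
                    if c["gender"] == pronoun["gender"]
                    and c["number"] == pronoun["number"]]   # hard agreement filter
        return max(agreeing, key=salience) if agreeing else None

    she = {"gender": "f", "number": "sg"}
    print(resolve_pronoun(she, [
        {"text": "a pretty woman", "gender": "f", "number": "sg",
         "sentence_distance": 1, "is_subject": True},
        {"text": "the restaurant", "gender": "n", "number": "sg",
         "sentence_distance": 1, "is_subject": False},
    ])["text"])   # -> a pretty woman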

Page 44: SIMS 290-2:  Applied Natural Language Processing

44
Adapted from slide by Ani Nenkova

Pronominal anaphora resolution

Generic vs specific reference
1. The Vice-President of the United States is also President of the Senate.
2. Historically, he is the President’s key person in negotiations with Congress.
3a. He is required to be 35 years old.
3b. As Ambassador to China, he handled many tricky negotiations, so he is well prepared for the job.

Page 45: SIMS 290-2:  Applied Natural Language Processing

45

Talking to a Machine….and (often) Getting an Answer

Today’s spoken dialogue systems make it possible to accomplish real tasks without talking to a person

Key advances:
– Stick to goal-directed interactions in a limited domain
– Prime users to adopt the vocabulary you can recognize
– Partition the interaction into manageable stages
– Judicious use of system vs. mixed initiative

Page 46: SIMS 290-2:  Applied Natural Language Processing

46
Adapted from slide by Julia Hirschberg

Acoustic and Prosodic Cues to Discourse Structure

Intuition:
– Speakers vary acoustic and prosodic cues to convey variation in discourse structure
– Systematic? In read or spontaneous speech?

Evidence:
– Observations from recorded corpora
– Laboratory experiments
– Machine learning of discourse structure from acoustic/prosodic features

Page 47: SIMS 290-2:  Applied Natural Language Processing

47
Adapted from slide by Julia Hirschberg

Boston Directions Corpus (Hirschberg & Nakatani ’96)

Experimental Design
– 12 speakers: 4 used
– Spontaneous and read versions of 9 direction-giving tasks

Corpus: 50m read; 67m spontaneous

Labeling
– Prosodic: ToBI intonational labeling
– Discourse: Grosz & Sidner

Features used in analysis

Page 48: SIMS 290-2:  Applied Natural Language Processing

48
Adapted from slide by Julia Hirschberg

Boston Directions Corpus: Describe how to get to MIT from Harvard

ds1: step 1, enter and get token
first
enter the Harvard Square T stop
and buy a token

ds2: inbound on red line
then
proceed to get on the
inbound
um
Red Line
uh subway

Page 49: SIMS 290-2:  Applied Natural Language Processing

49
Adapted from slide by Julia Hirschberg

ds3: take subway from hs, to cs to ks
and
take the subway
from Harvard Square
to Central Square
and then to Kendall Square

ds4: describe ks station
you’ll see a music sculpture there
which will tell you it’s Kendall Square
it’s very nice

ds5: get off T
then get off the T

Page 50: SIMS 290-2:  Applied Natural Language Processing

50

Dialogue vs. Monologue

Monologue and dialogue both involve interpreting:
– Information status
– Coherence issues
– Reference resolution
– Speech acts, implicature, intentionality

Dialogue involves managing:
– Turn-taking
– Grounding and repairing misunderstandings
– Initiative and confirmation strategies

Page 51: SIMS 290-2:  Applied Natural Language Processing

51

Segmenting Speech into Utterances

What is an ‘utterance’?
Why is EOU detection harder than EOS?
How does speech differ from text?

A single syntactic sentence may span several turns:
A: We’ve got you on USAir flight 99
B: Yep
A: leaving on December 1.

Multiple syntactic sentences may occur in a single turn:
A: We’ve got you on USAir flight 99 leaving on December 1. Do you need a rental car?

Intonational definitions: intonational phrase, breath group, intonation unit

Page 52: SIMS 290-2:  Applied Natural Language Processing

52

Turns and Utterances

Dialogue is characterized by turn-taking: who should talk next, and when they should talk
How do we identify turns in recorded speech?
– Little speaker overlap (around 5% in English, although this depends on domain)
– But little silence between turns either
How do we know when a speaker is giving up or taking a turn? Holding the floor? How do we know when a speaker is interruptible?

Page 53: SIMS 290-2:  Applied Natural Language Processing

53

Simplified Turn-Taking Rule (Sacks et al.)

At each transition-relevance place (TRP) of each turn:

1. If current speaker has selected A as next speaker, then A must speak next
2. If current speaker does not select next speaker, any other speaker may take next turn
3. If no one else takes next turn, the current speaker may take next turn

TRPs are where the structure of the language allows speaker shifts to occur
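Because the rule is an ordered preference, it translates directly into code. A literal toy encoding (the function shape and argument names are illustrative):

    def next_speaker(current, selected, volunteers):
        """Apply the simplified Sacks et al. rule at a TRP.
        selected: the speaker the current turn addresses (or None);
        volunteers: speakers who self-select, in order."""
        if selected is not None:
            return selected          # rule 1: the selected speaker must speak next
        if volunteers:
            return volunteers[0]     # rule 2: another speaker may self-select
        return current               # rule 3: current speaker may continue

    print(next_speaker("A", None, ["B", "C"]))   # -> B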

Page 54: SIMS 290-2:  Applied Natural Language Processing

54

Adjacency pairs set up next speaker expectations

GREETING/GREETING
QUESTION/ANSWER
COMPLIMENT/DOWNPLAYER
REQUEST/GRANT

‘Significant silence’ is dispreferred:
A: Is there something bothering you or not? (1.0s)
A: Yes or no? (1.5s)
A: Eh?
B: No.

Page 55: SIMS 290-2:  Applied Natural Language Processing

55

Turntaking and Initiative Strategies

System Initiative
S: Please give me your arrival city name.
U: Baltimore.
S: Please give me your departure city name….

User Initiative
S: How may I help you?
U: I want to go from Boston to Baltimore on November 8.

‘Mixed’ initiative
S: How may I help you?
U: I want to go to Boston.
S: What day do you want to go to Boston?

Page 56: SIMS 290-2:  Applied Natural Language Processing

56

Grounding (Clark & Schaefer ‘89)

Conversational participants don’t just take turns speaking… they try to establish common ground (or mutual belief)
H must ground S’s utterances by making it clear whether or not understanding has occurred
How do hearers do this?

Several different mechanisms

Page 57: SIMS 290-2:  Applied Natural Language Processing

57

Grounding Mechanisms (Clark & Schaefer ‘89)

S: I can upgrade you to an SUV at that rate.

Continued attention
(U gazes appreciatively at S)

Relevant next contribution
U: Do you have a RAV4 available?

Acknowledgement/backchannel
U: Ok/Mhmmm/Great!

Demonstration/paraphrase
U: An SUV.

Display/repetition
U: You can upgrade me to an SUV at the same rate?

Request for repair
U: I beg your pardon?

Page 58: SIMS 290-2:  Applied Natural Language Processing

58

How do we evaluate Dialogue Systems?

PARADISE framework (Walker et al ’00)

“Performance” of a dialogue system is affected both by what gets accomplished by the user and the dialogue agent and how it gets accomplished

Efficiency of the Interaction: User Turns, System Turns, Elapsed Time
Quality of the Interaction: ASR rejections, Time Out Prompts, Help Requests, Barge-Ins, Mean Recognition Score (concept accuracy), Cancellation Requests
User Satisfaction
Task Success: perceived completion, information extracted
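PARADISE folds these factors into a single score: normalized task success traded off against a weighted sum of normalized costs. A minimal sketch, assuming z-score normalization and hand-picked weights (in the actual framework, the weights are fit by regressing user satisfaction on these measures):

    def performance(task_kappa, costs, alpha, weights, stats):
        """alpha * N(kappa) - sum_i w_i * N(cost_i), with N a z-score
        normalization using per-metric (mean, stdev) pairs in `stats`."""
        def norm(value, metric):
            m, s = stats[metric]
            return (value - m) / s
        return alpha * norm(task_kappa, "kappa") - sum(
            weights[m] * norm(v, m) for m, v in costs.items())

    # one dialogue, normalized against (illustrative) corpus statistics
    print(performance(
        task_kappa=0.8,
        costs={"user_turns": 22, "timeouts": 3},
        alpha=0.5,
        weights={"user_turns": 0.2, "timeouts": 0.3},
        stats={"kappa": (0.6, 0.2), "user_turns": (18, 6), "timeouts": (1.5, 1.0)},
    ))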

Page 59: SIMS 290-2:  Applied Natural Language Processing

59

Identifying Misrecognitions and User Corrections Automatically (Hirschberg, Litman & Swerts)

Collect corpus from interactive voice response system
Identify speaker ‘turns’
– incorrectly recognized
– where speakers first aware of error
– that correct misrecognitions
Identify prosodic features of turns in each category and compare to other turns
Use Machine Learning techniques to train a classifier to make these distinctions automatically
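The final step is a standard supervised setup: one feature vector per turn, labeled by whether the ASR misrecognized it. The feature names and toy values below are illustrative, and a generic decision tree stands in for the rule learners actually used in this work:

    from sklearn.tree import DecisionTreeClassifier

    # one row per turn: [f0_max, energy_max, duration_sec, preceding_pause_sec, tempo]
    X = [
        [310.0, 72.1, 2.4, 0.9, 3.1],   # loud, slow, hyperarticulated turn
        [215.0, 60.3, 1.1, 0.2, 4.0],   # ordinary turn
        [298.0, 70.8, 2.0, 0.7, 2.8],
        [201.0, 58.9, 0.9, 0.1, 4.2],
    ]
    y = [1, 0, 1, 0]                     # 1 = turn was misrecognized by the ASR

    clf = DecisionTreeClassifier().fit(X, y)
    print(clf.predict([[305.0, 71.0, 2.2, 0.8, 3.0]]))   # predicts the misrecognized class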

Page 60: SIMS 290-2:  Applied Natural Language Processing

60

Turn Types

TOOT: Hi. This is AT&T Amtrak Schedule System. This is TOOT. How may I help you?

User: Hello. I would like trains from Philadelphia to New York leaving on Sunday at ten thirty in the evening.

TOOT: Which city do you want to go to?

User: New York.

(The slide annotates turns in this exchange as the misrecognition, the correction, and the aware site: the point where the user first becomes aware of the error.)

Page 61: SIMS 290-2:  Applied Natural Language Processing

61

Results

Reduced error in predicting misrecognized turns to 8.64%
Error in predicting ‘awares’: 12%
Error in predicting corrections: 18-21%

Page 62: SIMS 290-2:  Applied Natural Language Processing

62
Adapted from slide by Julia Hirschberg

Dialogue Conclusions

Spoken dialogue systems present new problems -- but also new possibilities

Recognizing speech introduces a new source of errors
Additional information in the speech stream offers new evidence about users’ intended meanings and emotional state (grounding of information, speech acts, reaction to system errors)

Why spoken dialogue systems rather than web-based interfaces?