
Representing Meaning

Lecture 19

13 Sep 2007

Semantic Analysis

Semantic analysis is the process of taking in some linguistic input and assigning a meaning representation to it. There are a lot of different ways to do this that make more or less (or no) use of syntax.

We're going to start with the idea that syntax does matter: the compositional, rule-to-rule approach.

Compositional Semantics

Syntax-driven methods of assigning semantics to sentences.

Semantic Processing

We're going to discuss two ways to attack this problem (just as we did with parsing).

There's the theoretically motivated, correct-and-complete approach: computational/compositional semantics. Create a FOL representation that accounts for all the entities, roles and relations present in a sentence.

And there are practical approaches that have some hope of being useful and successful: information extraction. Do a superficial analysis that pulls out only the entities, relations and roles that are of interest to the consuming application.

Compositional Analysis

Principle of Compositionality: the meaning of a whole is derived from the meanings of the parts.

What parts? The constituents of the syntactic parse of the input.

What could it mean for a part to have a meaning?

Better

Giving(John, Mary, Book)

Turns out this representation isn't quite as useful as it could be. Better would be one where the "roles" or "cases" are separated out. E.g., consider:

∃x,y Giving(x) ∧ Giver(John,x) ∧ Givee(Mary,x) ∧ Given(y,x) ∧ Isa(y,Book)

Note: essentially Giver=Agent, Given=Theme, Givee=To-Poss

Predicates

The notion of a predicate just got more complicated.

In this example, think of the verb/VP as providing a template like the following:

∃w,x,y,z Giving(x) ∧ Giver(w,x) ∧ Given(y,x) ∧ Givee(z,x)

The semantics of the NPs and the PPs in the sentence plug into the slots provided in the template.

Advantages

Can have a variable number of arguments associated with an event: events have many roles, and fillers can be glued on as they appear in the input.

Specifies categories (e.g., book) so that we can make assertions about categories themselves as well as their instances. E.g., Isa(MobyDick, Novel), AKO(Novel, Book).

Reifies events so that they can be quantified and related to other events and objects via sets of defined relations.

Can see logical connections between closely related examples without the need for meaning postulates.

Example

AyCaramba serves meat

∃e Serving(e) ∧ Server(e,AyCaramba) ∧ Served(e,Meat)

Compositional Analysis

Augmented Rules

We'll accomplish this by attaching semantic formation rules to our syntactic CFG rules. Abstractly:

A → α1 … αn  { f(α1.sem, …, αn.sem) }

This should be read as: the semantics we attach to A can be computed from some function applied to the semantics of A's parts.

Example

Easy parts… the rules and their attachments:

NP -> PropNoun          {PropNoun.sem}
NP -> MassNoun          {MassNoun.sem}
PropNoun -> AyCaramba   {AyCaramba}
MassNoun -> meat        {MEAT}

Example

S -> NP VP      {VP.sem(NP.sem)}
VP -> Verb NP   {Verb.sem(NP.sem)}
Verb -> serves  ???

∃e,x,y Serving(e) ∧ Server(e,y) ∧ Served(e,x)

Lambda Forms

A simple addition to FOPC: take a FOPC sentence with variables in it that are to be bound. Allow those variables to be bound by treating the lambda form as a function with formal arguments:

λx.P(x)

λx.P(x)(Sally)

P(Sally)

Example
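A compact reconstruction of what these example slides step through: applying the lambda attachments (given in full later, under "Using Lambda Notations") to AyCaramba serves meat.

Verb.sem = λx.λy.∃e Serving(e) ∧ Server(e,y) ∧ Served(e,x)

VP.sem = Verb.sem(NP.sem) = λy.∃e Serving(e) ∧ Server(e,y) ∧ Served(e,MEAT)

S.sem = VP.sem(NP.sem) = ∃e Serving(e) ∧ Server(e,AyCaramba) ∧ Served(e,MEAT)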

Syntax/Semantics Interface: Two Philosophies

1. Let the syntax do what syntax does well, and don't expect it to know much about meaning. In this approach, the lexical entries' semantic attachments do all the work.

2. Assume the syntax does know something about meaning. Here the grammar gets complicated and the lexicon simpler (the constructional approach).

Example

Mary freebled John the nim.

Even though freeble and nim are nonsense, the syntax alone lets us answer questions like:

Where did he get it from?

Who has it?

Why?

Example

Consider the attachments for the VP rules:

VP -> Verb NP NP   (gave Mary a book)
VP -> Verb NP PP   (gave a book to Mary)

Assume the meaning representations should be the same for both. Under the lexicon-heavy scheme, the VP attachments are:

Verb.sem(NP.sem, NP.sem)
Verb.sem(NP.sem, PP.sem)

Example

Under a syntax-heavy scheme we might want to do something like:

VP -> V NP NP   { V.sem ∧ Recip(NP1.sem) ∧ Object(NP2.sem) }
VP -> V NP PP   { V.sem ∧ Recip(PP.sem) ∧ Object(NP1.sem) }

I.e., the verb only contributes the predicate; the grammar "knows" the roles.

Integration

Two basic approaches:

Integrate semantic analysis into the parser (assign meaning representations as constituents are completed)

Pipeline: assign meaning representations to trees only after parsing is complete

Semantic Augmentation to CFG Rules

CFG rules are augmented with semantic attachments, which specify how to compute the meaning representation of a construction from the meanings of its constituent parts.

A CFG rule with a semantic attachment looks as follows:

A → α1, …, αn  { f(αj.sem, …, αk.sem) }

The meaning representation of A, A.sem, is calculated by applying the function f to the semantic representations of some of A's constituents.

Naïve Approach

ProperNoun -> Anarkali   { Anarkali }
MassNoun -> meat         { Meat }
NP -> ProperNoun         { ProperNoun.sem }
NP -> MassNoun           { MassNoun.sem }
Verb -> serves           { ∃e,x,y ISA(e,Serving) ∧ Server(e,x) ∧ Served(e,y) }

But we cannot propagate this representation to upper levels.

Using Lambda Notations

ProperNoun -> Anarkali   { Anarkali }
MassNoun -> meat         { Meat }
NP -> ProperNoun         { ProperNoun.sem }
NP -> MassNoun           { MassNoun.sem }
Verb -> serves           { λx.λy.∃e ISA(e,Serving) ∧ Server(e,y) ∧ Served(e,x) }   -- a lambda expression
VP -> Verb NP            { Verb.sem(NP.sem) }   -- application of a lambda expression
S -> NP VP               { VP.sem(NP.sem) }

A small sketch of this composition in code follows below.
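As a concrete (if toy) illustration, here is a minimal Python sketch of this rule-to-rule composition, using Python closures as stand-ins for the lambda expressions. The formula strings and variable names are illustrative, not from any particular toolkit:

```python
# Lexical attachments: constants for the NPs, a curried lambda for the verb.
anarkali_sem = "Anarkali"
meat_sem = "Meat"

# Verb -> serves { lambda x. lambda y. exists e ISA(e,Serving) ^ Server(e,y) ^ Served(e,x) }
serves_sem = lambda x: lambda y: (
    f"exists e ISA(e,Serving) ^ Server(e,{y}) ^ Served(e,{x})"
)

# VP -> Verb NP  { Verb.sem(NP.sem) }  -- apply the verb's lambda to the object
vp_sem = serves_sem(meat_sem)

# S -> NP VP  { VP.sem(NP.sem) }  -- apply the VP's lambda to the subject
s_sem = vp_sem(anarkali_sem)

print(s_sem)
# exists e ISA(e,Serving) ^ Server(e,Anarkali) ^ Served(e,Meat)
```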

Quasi-Logical Form

During semantic analysis, we may use quantified expressions as terms. In that case our formula is not a FOPC formula; we call formulas of this form quasi-logical forms.

A quasi-logical form is converted into a normal FOPC formula by applying simple syntactic translations:

Server(e, <∃x ISA(x,Restaurant)>)    -- a quasi-logical formula

∃x ISA(x,Restaurant) ∧ Server(e,x)   -- a normal FOPC formula

Parse Tree with Logical Forms

S: write(bertrand, principia)
  NP: bertrand            (bertrand)
  VP: λy.write(y, principia)
    V: λx.λy.write(y, x)  (writes)
    NP: principia         (principia)

Pros and Cons

If you integrate semantic analysis into the parser as it is running, you can use semantic constraints to cut off parses that make no sense.

But you also assign meaning representations to constituents that don't take part in the correct (most probable) parse.

Complex Terms

Allow the compositional system to pass around representations like the following as objects with parts:

Complex-Term → <Quantifier var body>

<∃x Isa(x,Restaurant)>

Example

Our restaurant example winds up looking like:

∃e Serving(e) ∧ Server(e, <∃x Isa(x,Restaurant)>) ∧ Served(e,Meat)

Big improvement…

Conversion

So… complex terms wind up being embedded inside predicates. So pull them out and redistribute the parts in the right way:

P(<quantifier var body>)

turns into

Quantifier var body connective P(var)

Example

Server(e, <∃x Isa(x,Restaurant)>)

becomes

∃x Isa(x,Restaurant) ∧ Server(e,x)

Quantifiers and Connectives

If the quantifier is an existential, then the connective is ∧ (and).

If the quantifier is a universal, then the connective is → (implies).

A sketch of this conversion in code follows below.
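A minimal sketch of this conversion in Python, assuming complex terms are represented as (quantifier, variable, body) triples; the representation and all names are invented for illustration:

```python
def convert(pred_name, args, pos, term):
    """Rewrite pred(<quantifier var body>) as: quantifier var body conn pred(var)."""
    quant, var, body = term
    conn = "^" if quant == "exists" else "->"  # existential -> "and", universal -> "implies"
    new_args = list(args)
    new_args[pos] = var                        # replace the complex term with its variable
    return f"{quant} {var} {body} {conn} {pred_name}({','.join(new_args)})"

# Server(e, <exists x Isa(x,Restaurant)>)
print(convert("Server", ["e", None], 1, ("exists", "x", "Isa(x,Restaurant)")))
# exists x Isa(x,Restaurant) ^ Server(e,x)

# A universal complex term gets an implication instead:
print(convert("Haver", ["e", None], 1, ("forall", "x", "Restaurant(x)")))
# forall x Restaurant(x) -> Haver(e,x)
```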

Multiple Complex Terms

Note that the conversion technique pulls the quantifiers out to the front of the logical form…

That leads to ambiguity if there’s more than one complex term in a sentence.

Quantifier Ambiguity

Consider: Every restaurant has a menu.

That could mean that each restaurant has a menu of its own.

Or that there's some uber-menu out there and all restaurants have that same menu.

Quantifier Scope Ambiguity

∀x Restaurant(x) → ∃e,y Having(e) ∧ Haver(e,x) ∧ Had(e,y) ∧ Isa(y,Menu)

∃y Isa(y,Menu) ∧ ∀x Isa(x,Restaurant) → ∃e Having(e) ∧ Haver(e,x) ∧ Had(e,y)

Ambiguity

This turns out to be a lot like the prepositional phrase attachment problem

The number of possible interpretations goes up exponentially with the number of complex terms in the sentence

The best we can do is to come up with weak methods to prefer one interpretation over another

Non-Compositionality

Unfortunately, there are lots of examples where the meaning (loosely defined) can’t be derived from the meanings of the parts

Idioms, jokes, irony, sarcasm, metaphor, metonymy, indirect requests, etc

English Idioms

Kick the bucket, buy the farm, bite the bullet, run the show, bury the hatchet, etc…

Lots of these… constructions where the meaning of the whole is either totally unrelated to the meanings of the parts (kick the bucket), or related in some opaque way (run the show).

The Tip of the Iceberg

Describe this construction

1. A fixed phrase with a particular meaning

2. A syntactically and lexically flexible phrase with a particular meaning

3. A syntactically and lexically flexible phrase with a partially compositional meaning

4. …

Example

Enron is the tip of the iceberg.

NP -> "the tip of the iceberg"

Not so good… attested examples:

the tip of Mrs. Ford's iceberg
the tip of a 1000-page iceberg
the merest tip of the iceberg

How about: That's just the iceberg's tip.

Example

What we seem to need is something like:

NP -> an initial NP with tip as its head, followed by a subsequent PP with of as its head and iceberg as the head of its NP

And that allows modifiers like merest, Mrs. Ford's, and 1000-page to modify the relevant semantic forms.

Quantified Phrases

Consider: A restaurant serves meat.

Assume that a restaurant looks like:

<∃x Isa(x,Restaurant)>

If we do the normal lambda thing we get:

∃e Serving(e) ∧ Server(e, <∃x Isa(x,Restaurant)>) ∧ Served(e,Meat)

Semantic analysis

Goal: to form the formal structures from smaller pieces

Three approaches:

Syntax-driven semantic analysis
Semantic grammar
Information extraction: filling templates

Semantic grammar

Syntactic parse trees contain many parts that are unimportant for semantic processing.

Ex: Mary wants to go to eat some Italian food

Rules in a semantic grammar (a toy matcher for these rules is sketched below):

InfoRequest -> USER want to go to eat FOODTYPE
FOODTYPE -> NATIONALITY FOODTYPE
NATIONALITY -> Italian | Mexican | …
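A toy sketch of how such a semantic grammar can be matched directly against an utterance, with a regular expression standing in for the rules above. The rule names follow the slide; the pattern itself is an invented illustration:

```python
import re

NATIONALITY = r"(?:Italian|Mexican)"
INFO_REQUEST = re.compile(
    rf"(?P<USER>\w+) wants? to go to eat (?:some )?(?P<FOODTYPE>{NATIONALITY} food)"
)

m = INFO_REQUEST.search("Mary wants to go to eat some Italian food")
if m:
    # The semantically irrelevant words ("some") are skipped; only the slots matter.
    print(m.group("USER"), "->", m.group("FOODTYPE"))
    # Mary -> Italian food
```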

Semantic grammar (cont)

Pros: no need for syntactic parsing; focuses on the relevant info; the semantic grammar helps to disambiguate.

Cons: the grammar is domain-specific.

Information extraction

The desired knowledge can be described by a relatively simple and fixed template.

Only a small part of the info in the text is relevant for filling the template.

No full parsing is needed: chunking, NE tagging, pattern matching, … (a minimal sketch follows below)

IE is a big field: e.g., MUC, KnowItAll.
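A minimal sketch of template filling via shallow pattern matching, with no full parse. The template fields and the pattern are invented for illustration (the sentence is the MUC-4 example shown later):

```python
import re

TEMPLATE = {"DATE": None, "TARGET": None, "PERP": None}

# One hand-written extraction pattern; a real IE system would use many,
# plus chunking and named-entity tagging.
ATTACK = re.compile(
    r"On (?P<DATE>\w+ \d+, \d{4}), (?P<TARGET>[\w\s]+?) was killed "
    r"in a reported (?P<PERP>\w+) attack"
)

text = ("On October 30, 1989, one civilian was killed in a reported "
        "FMLN attack in El Salvador.")
m = ATTACK.search(text)
if m:
    TEMPLATE.update(m.groupdict())
print(TEMPLATE)
# {'DATE': 'October 30, 1989', 'TARGET': 'one civilian', 'PERP': 'FMLN'}
```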

Summary of semantic analysis

Goal: to form the formal structures from smaller pieces

Three approaches:

Syntax-driven semantic analysis
Semantic grammar
Information extraction

Lexical Semantics

Meaning

Traditionally, meaning in language has been studied from three perspectives:

The meaning of a text or discourse
The meanings of individual sentences or utterances
The meanings of individual words

We started in the middle; now we'll look at the meanings of individual words.

Word Meaning

We didn't assume much about the meaning of words when we talked about sentence meanings: verbs provided a template-like predicate-argument structure, and nouns were practically meaningless constants.

There has to be more to it than that: the internal structure of words determines where they can go and what they can do (syntagmatic).

What’s a word?

Words?: types, tokens, stems, roots, inflected forms?

Lexeme: an entry in a lexicon consisting of a pairing of a form with a single meaning representation.

Lexicon: a collection of lexemes.

Lexical Semantics

The linguistic study of the systematic, meaning-related structure of lexemes is called lexical semantics.

A lexeme is an individual entry in the lexicon. A lexicon is a meaning structure holding meaning relations among lexemes.

A lexeme may have different meanings; each meaning component of a lexeme is known as one of its senses.

Different senses of the lexeme duck: an animal, to lower the head, ...

Different senses of the (Turkish) lexeme yüz: face, to swim, to skin, the front of something, hundred, ...

Relations Among Lexemes and Their Senses

Homonymy

Polysemy

Synonymy

Hyponymy

Hypernymy

Homonymy

Homonymy is a relation that holds between words having the same form (pronunciation, spelling) but unrelated meanings.

Bank -- financial institution vs. river bank
Bat -- wooden stick-like thing vs. flying scary mammal thing
Fluke -- a fish; a flatworm; the end parts of an anchor; the fins on a whale's tail; a stroke of luck

Homograph disambiguation is critically important in speech synthesis, natural language processing and other fields.

Polysemy

Polysemy is the phenomenon of multiple related meanings within the same lexeme.

Bank -- financial institution, blood bank, a synonym for 'rely upon' -- these senses are related:

• "I'm your friend, you can bank on me"
• While some banks furnish sperm only to married women, others are less restrictive
• However: a river bank is a homonym to senses 1 and 2, as they do not share etymologies. It is a completely different meaning.

Mole -- a small burrowing mammal; several different entities called moles refer to different things, but their names derive from the animal. E.g., a mole (espionage) burrows for information hoping to go undetected.

Polysemy

Milk -- the verb milk (e.g. "he's milking it for all he can get") derives from the process of obtaining milk.

Lexicographers define polysemes within a single dictionary lemma, numbering different meanings, while homonyms are treated in separate lemmata.

Most non-rare words have multiple meanings. The number of meanings is related to a word's frequency. Verbs tend more toward polysemy. Distinguishing polysemy from homonymy isn't always easy (or necessary).

Synonymy

Synonymy is the phenomenon of two different lexemes having the same meaning: big and large. More precisely, one of the senses of each lexeme is the same.

There aren't any true synonyms. Two lexemes are synonyms if they can be successfully substituted for each other in all situations. What does successfully mean? It preserves the meaning, but may not preserve acceptability based on notions of politeness, slang, ...

Example -- big and large?

That's my big sister / a big plane
That's my large sister / a large plane

Hyponymy and Hypernym

Hyponymy: one lexeme denotes a subclass of the other lexeme. The more specific lexeme is a hyponym of the more general lexeme; the more general lexeme is a hypernym of the more specific one.

A hyponymy relation can be asserted between two lexemes when the meanings of the lexemes entail a subset relation. Since dogs are canids:

Dog is a hyponym of canid, and canid is a hypernym of dog.

Car is a hyponym of vehicle; vehicle is a hypernym of car.

Ontology

The term ontology refers to a set of distinct objects resulting from analysis of a domain.

A taxonomy is a particular arrangement of the elements of an ontology into a tree-like class-inclusion structure.

A lexicon holds different senses of lexemes, together with other relations among lexemes.

Lexical Resources

There are lots of lexical resources available: word lists, on-line dictionaries, corpora.

The most ambitious one is WordNet: a database of lexical relations for English, with versions for other languages under development.

WordNet

WordNet is a widely used lexical database for English. WebPage: http://www.cogsci.princeton.edu/~wn/

It holds the senses of lexemes; relations among nouns such as hypernym, hyponym, MemberOf, …; and relations among verbs such as hypernym, … Relations are held for each different sense of a lexeme. A usage sketch follows below.
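For example, with NLTK's WordNet interface (this assumes nltk is installed and the WordNet data has been fetched with nltk.download('wordnet')):

```python
from nltk.corpus import wordnet as wn

# List the noun senses of "bass" with their immediate hypernyms.
for synset in wn.synsets("bass", pos=wn.NOUN):
    print(synset.name(), "--", synset.definition())
    print("   hypernyms:", [h.name() for h in synset.hypernyms()])
```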

WordNet Relations

[Table: some of WordNet's relations for nouns]

WordNet Hierarchies

[Figure: hyponymy chains for the senses of the lexeme bass]

WordNet -- bass

The noun "bass" has 8 senses in WordNet:

1. bass -- (the lowest part of the musical range)
2. bass, bass part -- (the lowest part in polyphonic music)
3. bass, basso -- (an adult male singer with the lowest voice)
4. sea bass, bass -- (the lean flesh of a saltwater fish of the family Serranidae)
5. freshwater bass, bass -- (any of various North American freshwater fish with lean flesh (especially of the genus Micropterus))
6. bass, bass voice, basso -- (the lowest adult male singing voice)
7. bass -- (the member with the lowest range of a family of musical instruments)
8. bass -- (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)

The adjective "bass" has 1 sense in WordNet:

1. bass, deep -- (having or denoting a low vocal or instrumental range; "a deep voice"; "a bass voice is lower than a baritone voice"; "a bass clarinet")

WordNet -- bass Hyponyms

Results for "Hyponyms (...is a kind of this), full" search of noun "bass" (6 of 8 senses):

Sense 2: bass, bass part -- (the lowest part in polyphonic music)
  => ground bass -- (a short melody in the bass that is constantly repeated)
  => thorough bass, basso continuo -- (a bass part written out in full and accompanied by figures for successive chords)
  => figured bass -- (a bass part in which the notes have numbers under them to indicate the chords to be played)

Sense 4: sea bass, bass -- (the lean flesh of a saltwater fish of the family Serranidae)
  => striped bass, striper -- (caught along the Atlantic coast of the United States)

Sense 5: freshwater bass, bass -- (any of various North American freshwater fish with lean flesh (especially of the genus Micropterus))
  => largemouth bass -- (flesh of largemouth bass)
  => smallmouth bass -- (flesh of smallmouth bass)

Sense 6: bass, bass voice, basso -- (the lowest adult male singing voice)
  => basso profundo -- (a very deep bass voice)

Sense 7: bass -- (the member with the lowest range of a family of musical instruments)
  => bass fiddle, bass viol, bull fiddle, double bass, contrabass, string bass -- (largest and lowest member of the violin family)
  => bass guitar -- (the lowest six-stringed guitar)
  => bass horn, sousaphone, tuba -- (the lowest brass wind instrument)
     => euphonium -- (a bass horn (brass wind instrument) that is the tenor of the tuba family)
     => helicon, bombardon -- (a tuba that coils over the shoulder of the musician)
  => bombardon, bombard -- (a large shawm; the bass member of the shawm family)

Sense 8: bass -- (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)
  => freshwater bass -- (North American food and game fish)

WordNet – bass Synonyms

Results for "Synonyms, ordered by estimated frequency" search of noun "bass"8 senses of bass                                                        Sense 1bass -- (the lowest part of the musical range)       => low pitch, low frequency -- (a pitch that is perceived as below other pitches)Sense 2bass, bass part -- (the lowest part in polyphonic music)       => part, voice -- (the melody carried by a particular voice or instrument in polyphonic music; "he tried to sing the tenor part")Sense 3bass, basso -- (an adult male singer with the lowest voice)       => singer, vocalist, vocalizer, vocaliser -- (a person who sings)Sense 4sea bass, bass -- (the lean flesh of a saltwater fish of the family Serranidae)       => saltwater fish -- (flesh of fish from the sea used as food)Sense 5freshwater bass, bass -- (any of various North American freshwater fish with lean flesh (especially of the genus Micropterus))       => freshwater fish -- (flesh of fish from fresh water used as food)Sense 6bass, bass voice, basso -- (the lowest adult male singing voice)       => singing voice -- (the musical quality of the voice while singing)Sense 7bass -- (the member with the lowest range of a family of musical instruments)       => musical instrument, instrument -- (any of various devices or contrivances that can be used to produce musical tones or sounds)Sense 8bass -- (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)       => percoid fish, percoid, percoidean -- (any of numerous spiny-finned fishes of the order Perciformes)

Internal Structure of Words

Paradigmatic relations connect lexemes together in particular ways, but don't say anything about what the meaning representation of a particular lexeme should consist of.

Various approaches have been followed to describe the semantics of lexemes:

Thematic roles in predicate-bearing lexemes
Selection restrictions on thematic roles
Decompositional semantics of predicates
Feature structures for nouns

Thematic Roles

Thematic roles provide a shallow semantic language for characterizing certain arguments of verbs.

For example: Ali broke the glass. Veli opened the door.

Ali is the Breaker and the glass is the BrokenThing of a Breaking event; Veli is the Opener and the door is the OpenedThing of an Opening event. These are deep roles of the arguments of events.

Both of these events have actors that are the doers of a volitional event, and things affected by the action. A thematic role is a way of expressing this kind of commonality: AGENT and THEME are thematic roles.

Some Thematic Roles

AGENT -- the volitional causer of an event -- She broke the door.

EXPERIENCER -- the experiencer of an event -- Ali has a headache.

FORCE -- the non-volitional causer of an event -- The wind blows it.

THEME -- the participant most directly affected by an event -- She broke the door.

INSTRUMENT -- an instrument used in an event -- He opened it with a knife.

BENEFICIARY -- a beneficiary of an event -- I bought it for her.

SOURCE -- the origin of the object of a transfer event -- I flew from Rome.

GOAL -- the destination of the object of a transfer event -- I flew to Ankara.

Thematic Roles (cont.)

Takes some of the work away from the verbs: it's not the case that every verb is unique and has to completely specify how all of its arguments uniquely behave.

Provides a mechanism to organize semantic processing.

Permits us to distinguish near-surface-level semantics from deeper semantics.

Linking

Thematic roles, syntactic categories and their positions in larger syntactic structures are all intertwined in complicated ways. For example:

AGENTs are often subjects.

In a VP -> V NP NP rule, the first NP is often a GOAL and the second a THEME.

Deeper Semantics

He melted her reserve with a husky-voiced paean to her eyes.

If we label the constituents He and her reserve as the Melter and Melted, then those labels are specific to melting events and lose any generality they might have had.

If we make them AGENT and THEME, then we don't have the same problem.

Selectional Restrictions

A selectional restriction augments thematic roles by allowing lexemes to place certain semantic restrictions on the lexemes and phrases that can accompany them in a sentence.

I want to eat someplace near Bilkent.

Now we can say that eat is a predicate that has an AGENT and a THEME, and that the AGENT must be capable of eating and the THEME must be capable of being eaten.

Each sense of a verb can be associated with its own selectional restrictions:

THY serves New York. -- the direct object (THEME) is a place
THY serves breakfast. -- the direct object (THEME) is a meal

We may use these selectional restrictions to disambiguate a sentence.

As Logical Statements

For eat:

Eating(e) ∧ Agent(e,x) ∧ Theme(e,y) ∧ Isa(y,Food)

(adding in all the right quantifiers and lambdas)

WordNet: use WordNet hyponyms (types) to encode the selection restrictions.

Specificity of Restrictions

What can you say about THEME in each with respect to the verb? Some will be high up in the WordNet hierarchy, others not so high…

PROBLEMS

Unfortunately, verbs are polysemous and language is creative:

… ate glass on an empty stomach accompanied only by water and tea

… you can't eat gold for lunch if you're hungry

… get it to try to eat Afghanistan

Discovering the Restrictions

Instead of hand-coding the restrictions for each verb, can we discover a verb's restrictions by using a corpus and WordNet?

1. Parse sentences and find heads
2. Label the thematic roles
3. Collect statistics on the co-occurrence of particular headwords with particular thematic roles
4. Use the WordNet hypernym structure to find the most meaningful level to use as a restriction

Motivation

Find the lowest (most specific) common ancestor that covers a significant number of the examples, as in the sketch below.
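A rough sketch of steps 3-4 using NLTK's WordNet: vote the observed THEME headwords of a verb up their hypernym chains, then pick a deep (specific) ancestor that covers them all. The headword list, the first-sense shortcut, and the selection rule are illustrative simplifications, not a published algorithm:

```python
from collections import Counter
from nltk.corpus import wordnet as wn

theme_heads = ["sandwich", "pizza", "lunch", "apple"]  # observed THEMEs of "eat"

ancestor_counts = Counter()
for word in theme_heads:
    synset = wn.synsets(word, pos=wn.NOUN)[0]           # naive: take the first sense
    ancestors = {s for path in synset.hypernym_paths() for s in path}
    ancestor_counts.update(ancestors)                   # one vote per word per ancestor

# Among ancestors covering every example, prefer the most specific (deepest) one.
covering = [s for s, c in ancestor_counts.items() if c == len(theme_heads)]
best = max(covering, key=lambda s: s.min_depth())
print(best.name())
# prints a shared ancestor; with these toy heads it may be food-like,
# or something more general if the chosen senses diverge
```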

Word-Sense Disambiguation

Word sense disambiguation (WSD) refers to the process of selecting the right sense for a word from among the senses that the word is known to have.

Semantic selection restrictions can be used to disambiguate:

Ambiguous arguments to unambiguous predicates

Ambiguous predicates with unambiguous arguments

Ambiguity all around

Word-Sense Disambiguation

We can use selectional restrictions for disambiguation: He cooked simple dishes. He broke the dishes.

But sometimes selectional restrictions will not be enough to disambiguate. What kind of dishes do you recommend? -- we cannot tell which sense is used.

There can also be two (or more) lexemes with multiple senses: They serve vegetarian dishes.

Selectional restrictions may even block the finding of any meaning: If you want to kill Turkey, eat its banks. Kafayı yedim. (Turkish, literally "I ate my head", i.e. "I've lost my mind.") These situations leave the system with no possible meanings, and they can indicate a metaphor.

WSD and Selection Restrictions

Ambiguous arguments:
Prepare a dish
Wash a dish

Ambiguous predicates:
Serve Denver
Serve breakfast

Both:
Serves vegetarian dishes

WSD and Selection Restrictions

This approach is complementary to the compositional analysis approach.

You need a parse tree and some form of predicate-argument analysis derived from:

the tree and its attachments

all the word senses coming up from the lexemes at the leaves of the tree

Ill-formed analyses are eliminated by noting any selection restriction violations.

Problems

As we saw last time, selection restrictions are violated all the time. This doesn't mean that the sentences are ill-formed or less preferred than others.

This approach needs some way of categorizing and dealing with the various ways that restrictions can be violated.

WSD Tags

What's a tag? A dictionary sense?

For example, for WordNet an instance of "bass" in a text has 8 possible tags or labels (bass1 through bass8).

WordNet Bass

The noun "bass" has 8 senses in WordNet:

1. bass -- (the lowest part of the musical range)
2. bass, bass part -- (the lowest part in polyphonic music)
3. bass, basso -- (an adult male singer with the lowest voice)
4. sea bass, bass -- (flesh of lean-fleshed saltwater fish of the family Serranidae)
5. freshwater bass, bass -- (any of various North American lean-fleshed freshwater fishes especially of the genus Micropterus)
6. bass, bass voice, basso -- (the lowest adult male singing voice)
7. bass -- (the member with the lowest range of a family of musical instruments)
8. bass -- (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)

Representations

Most supervised ML approaches require a very simple representation for the input training data: vectors of feature/value pairs, i.e. files of comma-separated values.

So our first task is to extract training data from a corpus with respect to a particular instance of a target word. This typically consists of a characterization of the window of text surrounding the target.

Representations

This is where ML and NLP intersect.

If you stick to trivial surface features that are easy to extract from a text, then most of the work is in the ML system.

If you decide to use features that require more analysis (say, parse trees), then the ML part may be doing less work (relatively), if these features are truly informative.

Surface Representations

Collocational and co-occurrence information.

Collocational: encode features about the words that appear in specific positions to the right and left of the target word. Often limited to the words themselves as well as their part of speech.

Co-occurrence: features characterizing the words that occur anywhere in the window, regardless of position. Typically limited to frequency counts.

Collocational

Position-specific information about the words in the window:

guitar and bass player stand

[guitar, NN, and, CJC, player, NN, stand, VVB]

In other words, a vector consisting of [position n word, position n part-of-speech, …]

Co-occurrence

Information about the words that occur within the window, regardless of position.

First derive a set of terms to place in the vector. Then note how often each of those terms occurs in a given window. A sketch of both feature types follows below.
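A small sketch of both feature types for the guitar and bass player stand example above. The POS tags are assumed to come from a tagger, and the co-occurrence vocabulary is invented for illustration:

```python
from collections import Counter

tokens = [("guitar", "NN"), ("and", "CJC"), ("bass", "NN"),
          ("player", "NN"), ("stand", "VVB")]
target = 2  # index of the target word "bass"

# Collocational: words and POS tags at fixed offsets around the target.
colloc = []
for offset in (-2, -1, 1, 2):
    word, tag = tokens[target + offset]
    colloc += [word, tag]
# ['guitar', 'NN', 'and', 'CJC', 'player', 'NN', 'stand', 'VVB']

# Co-occurrence: frequency of each vocabulary term anywhere in the window.
vocab = ["guitar", "player", "fishing", "boat"]
counts = Counter(word for word, _ in tokens)
cooc = [counts[term] for term in vocab]
# [1, 1, 0, 0]
print(colloc, cooc)
```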

Classifiers

Once we cast the WSD problem as a classification problem, all sorts of techniques are possible:

Naïve Bayes (the right thing to try first)
Decision lists
Decision trees
Neural nets
Support vector machines
Nearest-neighbor methods
…

Classifiers

The choice of technique depends, in part, on the set of features that have been used:

Some techniques work better or worse with features that have numerical values.

Some techniques work better or worse with features that have large numbers of possible values. For example, the feature "the word to the left" has a fairly large number of possible values.

Statistical Word-Sense Disambiguation

ŝ = argmax_{s ∈ S} P(s | V)

where s is a sense, S is the set of senses, and V is the vector representation of the input.

By Bayes' rule:

ŝ = argmax_{s ∈ S} P(V | s) P(s) / P(V)

By making an independence assumption over the features, the score becomes the product of the probabilities of the individual features given the sense:

ŝ = argmax_{s ∈ S} P(s) ∏_{i=1}^{n} P(v_i | s)

(A toy implementation of this decision rule follows below.)
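A toy implementation of this decision rule; the priors and conditionals are invented numbers standing in for estimates from sense-tagged training data (with proper smoothing):

```python
import math

prior = {"bass_fish": 0.4, "bass_music": 0.6}            # P(s)
cond = {                                                  # P(v_i | s), toy values
    "bass_fish":  {"fishing": 0.30, "guitar": 0.01, "player": 0.04},
    "bass_music": {"fishing": 0.01, "guitar": 0.25, "player": 0.20},
}

def disambiguate(features):
    # argmax over senses of log P(s) + sum_i log P(v_i | s)
    scores = {}
    for sense, p_s in prior.items():
        score = math.log(p_s)
        for f in features:
            score += math.log(cond[sense].get(f, 1e-6))  # crude floor for unseen features
        scores[sense] = score
    return max(scores, key=scores.get)

print(disambiguate(["guitar", "player"]))  # bass_music
```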

Problems

Given these general ML approaches, how many classifiers do I need to perform WSD robustly? One for each ambiguous word in the language.

How do you decide what set of tags/labels/senses to use for a given word? It depends on the application.

END

Examples from Russell&Norvig (1)

7.2. p.213

Not all students take both History and Biology.
Only one student failed History.
Only one student failed both History and Biology.
The best score in History was better than the best score in Biology.
Every person who dislikes all vegetarians is smart.
No person likes a smart vegetarian.
There is a woman who likes all men who are vegetarian.
There is a barber who shaves all men in town who don't shave themselves.
No person likes a professor unless the professor is smart.
Politicians can fool some people all of the time or all people some of the time, but they cannot fool all people all of the time.

Categories & Events

Categories:
VegetarianRestaurant(Joe's) -- categories are relations, not objects
MostPopular(Joe's, VegetarianRestaurant) -- not FOPC!
ISA(Joe's, VegetarianRestaurant) -- reification (turn all concepts into objects)
AKO(VegetarianRestaurant, Restaurant)

Events: Reservation(Hearer, Joe's, Today, 8PM, 2)

Problems:
Determining the correct number of roles
Representing facts about the roles associated with an event
Ensuring that all the correct inferences can be drawn
Ensuring that no incorrect inferences can be drawn

MUC-4 Example

INCIDENT: DATE -- 30 OCT 89
INCIDENT: LOCATION -- EL SALVADOR
INCIDENT: TYPE -- ATTACK
INCIDENT: STAGE OF EXECUTION -- ACCOMPLISHED
INCIDENT: INSTRUMENT ID --
INCIDENT: INSTRUMENT TYPE --
PERP: INCIDENT CATEGORY -- TERRORIST ACT
PERP: INDIVIDUAL ID -- "TERRORIST"
PERP: ORGANIZATION ID -- "THE FMLN"
PERP: ORG. CONFIDENCE -- REPORTED: "THE FMLN"
PHYS TGT: ID --
PHYS TGT: TYPE --
PHYS TGT: NUMBER --
PHYS TGT: FOREIGN NATION --
PHYS TGT: EFFECT OF INCIDENT --
PHYS TGT: TOTAL NUMBER --
HUM TGT: NAME --
HUM TGT: DESCRIPTION -- "1 CIVILIAN"
HUM TGT: TYPE -- CIVILIAN: "1 CIVILIAN"
HUM TGT: NUMBER -- 1: "1 CIVILIAN"
HUM TGT: FOREIGN NATION --
HUM TGT: EFFECT OF INCIDENT -- DEATH: "1 CIVILIAN"
HUM TGT: TOTAL NUMBER --

On October 30, 1989, one civilian was killed in a reported FMLN attack in El Salvador.

Subcategorization frames

1. I ate

2. I ate a turkey sandwich

3. I ate a turkey sandwich at my desk

4. I ate at my desk

5. I ate lunch

6. I ate a turkey sandwich for lunch

7. I ate a turkey sandwich for lunch at my desk

- no fixed “arity” (problem for FOPC)

One possible solution

1. Eating1(Speaker)
2. Eating2(Speaker, TurkeySandwich)
3. Eating3(Speaker, TurkeySandwich, Desk)
4. Eating4(Speaker, Desk)
5. Eating5(Speaker, Lunch)
6. Eating6(Speaker, TurkeySandwich, Lunch)
7. Eating7(Speaker, TurkeySandwich, Lunch, Desk)

Meaning postulates are used to tie the semantics of the predicates together:

∀w,x,y,z: Eating7(w,x,y,z) ⇒ Eating6(w,x,y)

Scalability issues again!

Another solution

Say that everything is a special case of Eating7 with some arguments unspecified: ∃w,x,y Eating(Speaker,w,x,y)

Two problems again:

Too many commitments (e.g., it rules out eating except at meals: lunch, dinner, etc.)

No way to individuate events:

∃w,x Eating(Speaker,w,x,Desk)
∃w,y Eating(Speaker,w,Lunch,y)

These cannot be combined into ∃w Eating(Speaker,w,Lunch,Desk), since we cannot tell whether the two formulas describe the same event.

Reification

∃w: Isa(w,Eating) ∧ Eater(w,Speaker) ∧ Eaten(w,TurkeySandwich) -- equivalent to sentence 2.

Reification:

No need to specify a fixed number of arguments for a given surface predicate

No more roles are postulated than are mentioned in the input

No need for meaning postulates to specify logical connections among closely related examples

Representing time

1. I arrived in New York

2. I am arriving in New York

3. I will arrive in New York

∃w: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork)

The same representation serves all three sentences: tense is lost.

Representing time

Past (1): ∃i,e,w: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ EndPoint(i,e) ∧ Precedes(e,Now)

Present (2): ∃i,w: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ MemberOf(i,Now)

Future (3): ∃i,s,w: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ StartPoint(i,s) ∧ Precedes(Now,s)

Representing time

Tense does not map directly onto time: We fly from San Francisco to Boston at 10. (present tense, future event) Flight 1390 will be at the gate an hour from now.

Use of tenses:
Flight 1902 arrived late.
Flight 1902 had arrived late.

"Similar" tenses, different reference points:
When Mary's flight departed, I ate lunch.
When Mary's flight departed, I had eaten lunch.

Aspect

Stative: I know my departure gate.
Activity: John is flying. (no particular end point)
Accomplishment: Sally booked her flight. (natural end point, resulting in a particular state)
Achievement: She found her gate.

Figuring out statives (they resist the progressive and the imperative):

* I am needing the cheapest fare.
* I am wanting to go today.
* Need the cheapest fare!

Representing beliefs

Want, believe, imagine, know -- all introduce hypothetical worlds.

I believe that Mary ate British food.

Reified example:

∃u,v: Isa(u,Believing) ∧ Isa(v,Eating) ∧ Believer(u,Speaker) ∧ BelievedProp(u,v) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood)

However, this also implies:

∃u,v: Isa(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood)

i.e., that Mary actually ate British food, which a belief report does not warrant.

Modal operators: Believing(Speaker, Eating(Mary,BritishFood)) -- not FOPC! Predicates in FOPC hold between objects, not between relations.

Believes(Speaker, ∃v: ISA(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood))

Modal operators

Beliefs, knowledge, assertions.

Issues: If you are interested in baseball, the Red Sox are playing tonight.

Examples from Russell&Norvig (2)

7.3. p.214

One more outburst like that and you'll be in contempt of court.
Annie Hall is on TV tonight if you are interested.
Either the Red Sox win or I am out ten dollars.
The special this morning is ham and eggs.
Maybe I will come to the party and maybe I won't.
Well, I like Sandy and I don't like Sandy.