Question about the reading

• What are clitics?
• They are not words.
  – Evidence: they can’t be stressed.
• They are not prefixes or suffixes.
  – Evidence: they don’t cause certain changes in the word that a prefix or suffix would cause.
  – Evidence: any given prefix or suffix can attach to one kind of word (for example, only nouns or only verbs). Some clitics attach to whatever is nearby.

Example: Spanish clitic pronouns

(Data to be supplied by the class)

• Word stress in Spanish
  – Stress the second to last or last syllable
  – Examples:
• When you add a suffix like -able or -mente, the stress goes on the new second to last syllable:
  – Examples:
• Clitic pronoun:
  – Example: I am reading it.
  – When the clitic is added, the stress stays on the old second to last syllable.
• Clitic pronoun:
  – Example: I see him.
  – Can it be stressed?

A Distributional Approach to Parts of Speech

Grammars and Lexicons

11-721

September 5, 2007

Categories of Words: Parts of Speech

• Noun

• Verb

• Adjective

• Adverb

• Preposition

• Determiner (Article)

• Modal ?

Parts of Speech

This  boy   must   seem  incredibly  stupid     to     that  girl.
Det   Noun  Modal  Verb  Adverb      Adjective  Prep.  Det   Noun

Scientific method in linguistics

• Theories (hypotheses) must be testable and falsifiable.

• Results must be reproducible.

Reproducible Results: Chomsky, 1957

The search for rigorous formulation in linguistics has a much more serious motivation than mere concern for logical niceties or the desire to purify well-established methods of linguistic analysis. Precisely constructed models for linguistic structure can play an important role, both negative and positive, in the process of discovery itself. By pushing a precise but inadequate formulation to an unacceptable conclusion, we can often expose the exact source of the inadequacy and, consequently, gain a deeper understanding of the linguistic data. More positively a formalized theory may automatically provide solutions for many problems other than those for which it was explicitly designed. Obscure and intuition-bound notions can neither lead to absurd conclusions nor provide new and correct ones, and hence they fail to be useful in two important respects.

In language technologies, imprecise definitions lead to poor intercoder reliability (annotators label the same data differently), which in turn leads to poor training, and so on.
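One common way to put a number on intercoder reliability is Cohen's kappa, which corrects the raw agreement between two annotators for the agreement expected by chance. The slides do not name a specific measure, so the choice of kappa, the function name `cohen_kappa`, and the toy annotations below are assumptions made for illustration.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, based on each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[tag] * freq_b[tag] for tag in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical part-of-speech labels from two coders for the same four words
coder_1 = ["V", "Modal", "N", "V"]
coder_2 = ["V", "V", "N", "V"]
kappa = cohen_kappa(coder_1, coder_2)
```

A kappa near 1 indicates that the category definitions are precise enough for coders to apply consistently; a value near 0 means agreement is no better than chance.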

A traditional theory of parts of speech

• Verbs denote actions

• Nouns denote entities

• Adjectives denote states

• Adverbs denote manner

• Prepositions denote location

• Determiners specify

Counter-examples

• The same concept can function in several parts of speech.
  – Pinker, page 98
• Her interest in fungi (noun)
• Fungi are starting to interest her more and more. (verb)
• She seems interested in fungi. (adjective)
• Interestingly, the fungi grew an inch in an hour. (adverb)

The distributional theory of parts of speech

• “A part of speech, then, is not a kind of meaning; it is a kind of token that obeys certain formal rules, like a chess piece or a poker chip.”
  – Pinker, page 98

• Testable and falsifiable

• Assumes discrete categories

The distributional theory of parts of speech

• Distribution
  – The contexts where the word can appear
• Morphology
  – Prefixes, suffixes, and other changes to the structure of the word.

Identifying parts of speech by their Morphology

• Morphology: The form of words

• Affixes: Prefixes, suffixes, infixes

• Stem changes: swim/swam

Morphological properties of English nouns

• Count nouns
  – Cup/cups
  – Book/books
• Mass nouns
  – Attention/?attentions
  – Sand/?sands
  – Water/?waters
  – Coffee/?coffees

Morphological Properties of English adjectives

• Monosyllabic (one syllable) adjectives
  – Tall/taller/tallest
  – Fast/faster/fastest
• Multi-syllabic adjectives
  – Intelligent/more intelligent/most intelligent
• Except for adjectives that have non-gradable meanings:
  – Alphabetical, unique, pregnant
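The adjective pattern above can be sketched as a small morphological rule. This is a rough illustration only: the syllable counter is a crude vowel-run heuristic, the `NON_GRADABLE` set holds just the three examples from the slide, and the function names are hypothetical. It follows the slide's simplification and ignores spelling changes (big/bigger) and two-syllable adjectives that take -er (happy/happier).

```python
import re

# Non-gradable adjectives listed on the slide (not an exhaustive set).
NON_GRADABLE = {"alphabetical", "unique", "pregnant"}

def count_syllables(word):
    """Crude estimate: count runs of vowel letters (not a real syllabifier)."""
    return len(re.findall(r"[aeiouy]+", word.lower()))

def comparison_forms(adjective):
    """Return (comparative, superlative), or None for non-gradable adjectives."""
    adj = adjective.lower()
    if adj in NON_GRADABLE:
        return None
    if count_syllables(adj) == 1:
        # Monosyllabic: suffixes -er and -est.
        return (adj + "er", adj + "est")
    # Multi-syllabic: separate words "more" and "most".
    return ("more " + adj, "most " + adj)
```

For example, `comparison_forms("tall")` yields the suffixed forms, while `comparison_forms("intelligent")` yields the forms with "more" and "most".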

Invariant words: no prefixes or suffixes in English

• Prepositions (in, on, at, about, across, beyond, etc.)

• Modals (may, might, can, could, must, shall, should, etc.)

Morphological Properties of English Verbs

Base          Participle         Past    Present                          Gerund
(Infinitive)  (Past Participle)          (Third person singular subject)  (Present Participle)
mow           mown               mowed   mows                             mowing
prove         proven             proved  proves                           proving
go            gone               went    goes                             going
meet          met                met     meets                            meeting
cut           cut                cut     cuts                             cutting

What are participles?

• Verb forms that act like adjectives or nouns:
  – Mown grass
    • Participle in an adjective position
  – Mowing is fun
    • Participle in a noun position

Other uses of English Participles

• The grass was mown.
  – Passive verb
• I was mowing the grass.
  – Present progressive verb

Distributional criteria for parts of speech

Template 1: adjectives

• Great ideas spread quickly.

• Interesting ideas spread quickly.

• Stupid ideas spread quickly.

• Colorless ideas spread quickly.

• Words of the same category have the same distribution. For example, adjectives can come before nouns.

Template 2: adjectives

• They are very [adjective].
• They are very nice/gentlemanly/ladylike.
• *They are very gentlemen/ladies/faxes.
• *They are very starve/die.
• *They are very to/at/on.
• They are very in.
• They are very off.

Template 3: adjectives and adverbs

• Very [adverb or adjective]

• Very slow

• Very slowly

• Very badly

• Very happy

Template 4: adverb

• He treats her [adverb].

• He treats her well.

• He treats her arrogantly.

• He treats her nicely.

• He treats her nice.

• He treats her good.

Template 5: nouns

• [Noun] can be a pain in the neck.
• Television can be a pain in the neck.
• Linguistics can be a pain in the neck.
• This can be a pain in the neck.
• *Happy can be a pain in the neck.
• *From can be a pain in the neck.
• *The can be a pain in the neck.
• *Breathe can be a pain in the neck.

Template 6: verbs

• They/it can [verb].

• They/it can stay/leave/die/cry.

• *They/it can gorgeous/cute/trendy.

• *They/it can from/to/in/off/on.

• *They/it can door/bible/gold/camera.

Template 7: Modals

• [Modal] I be frank?

• Can I be frank?

• Must I be frank?

• Should I be frank?

• Need I be frank?

Template 8: determiner

• He wrote [determiner] other works.

• He wrote the/all/these/no/few/many other works.

• *He wrote despair/be/have other works.

• *He wrote student other works.

• ?He wrote successful other works.

Template 9: prepositions

• Right [preposition].
  – Right is an intensifier.
• Right up/down/in/on/across the street
• Right down the stairs
• Right in the drawer
• Right from school
• Right across the street
• *He right despaired.
• *She chose right this one.
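The template tests above can be sketched as a lookup over acceptability judgments. The judgments below are copied from the slides (True for acceptable, False for starred), but the data layout, the `___` slot notation, and the function name `categories_for` are assumptions made for illustration. Only three of the nine templates are encoded.

```python
# Acceptability judgments from the slides: True = acceptable, False = starred (*).
JUDGMENTS = {
    "They are very ___.": {
        "nice": True, "gentlemanly": True, "ladylike": True,
        "gentlemen": False, "starve": False, "to": False,
    },
    "___ can be a pain in the neck.": {
        "television": True, "linguistics": True, "this": True,
        "happy": False, "from": False, "the": False, "breathe": False,
    },
    "They/it can ___.": {
        "stay": True, "leave": True, "die": True, "cry": True,
        "gorgeous": False, "from": False, "door": False,
    },
}

# The part of speech each template is meant to diagnose.
TEMPLATE_CATEGORY = {
    "They are very ___.": "adjective",
    "___ can be a pain in the neck.": "noun",
    "They/it can ___.": "verb",
}

def categories_for(word):
    """Return the set of categories whose template accepts the word."""
    found = set()
    for template, judged in JUDGMENTS.items():
        # Words with no recorded judgment count as not accepted.
        if judged.get(word, False):
            found.add(TEMPLATE_CATEGORY[template])
    return found
```

A word that no template accepts comes back with an empty set, which is a prompt to collect more judgments or to look for a better template rather than a final answer.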

Problems

• Problems with Radford’s templates

• Problems for the assumption of discrete categories
  – Words that evade categorization

Template 1 problem

• Templates need to be more exact:

  – Great ideas spread quickly.
  – The ideas spread quickly.

• Do great and the have the same part of speech?

Template 5: need subcategories

• Cat can be a pain in the neck.

• The template only works for
  – Plural nouns (e.g., cats)
  – Mass nouns (e.g., water)
  – Pronouns (e.g., he)
  – Proper nouns (e.g., Sam)

• Cat is a singular count noun.

Count and mass nouns

• Singular count nouns must occur with a determiner:
  – The cat was a pain in the neck.
  – A cat can be a pain in the neck.
  – *Cat was a pain in the neck.
• Plural nouns and mass nouns can occur without a determiner.
  – Cats can be a pain in the neck.
  – Water can be a pain in the neck.
• Singular mass nouns change their meaning when they occur with “a”
  – a water
  – a coffee
  – ?an information
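The generalization about bare nouns can be written as a one-line check once each noun is marked for subcategory and number. The mini-lexicon and the function name `bare_noun_ok` are hypothetical; only the rule itself comes from the slide.

```python
# Hypothetical mini-lexicon: word -> (noun subcategory, number)
LEXICON = {
    "cat": ("count", "singular"),
    "cats": ("count", "plural"),
    "water": ("mass", "singular"),
}

def bare_noun_ok(word):
    """True if the noun can appear without a determiner."""
    kind, number = LEXICON[word]
    # Only singular count nouns require a determiner (*Cat was a pain in the neck).
    return not (kind == "count" and number == "singular")
```

This shows why a lexicon needs subcategories: a single label "noun" cannot predict that "cats" and "water" fit Template 5 while "cat" does not.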

Other things to take into account

• He can be a pain in the neck.

• *Him can be a pain in the neck.

• This music rocks.

• These CDs rock.

Template 6: Need subcategories

• *They can handle.
• *They can accommodate.
• *They can harbor.
• The template only works for intransitive verbs.
• These verbs need another noun after them.
  – They can handle boredom.
  – They can accommodate changes.
  – They can harbor criminals.
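The verb subcategories above can be recorded as frames in the lexicon. In this sketch the `FRAMES` table and the function name `fits_template_6` are hypothetical, and listing "leave" with both frames is an assumption that goes beyond the slides.

```python
# Hypothetical subcategorization frames for the verbs on the slides.
FRAMES = {
    "stay": {"intransitive"},
    "die": {"intransitive"},
    "cry": {"intransitive"},
    "leave": {"intransitive", "transitive"},  # assumption: allows both
    "handle": {"transitive"},
    "accommodate": {"transitive"},
    "harbor": {"transitive"},
}

def fits_template_6(verb, obj=None):
    """Check 'They can VERB (OBJECT).' against the verb's recorded frames."""
    needed = "transitive" if obj is not None else "intransitive"
    return needed in FRAMES[verb]
```

With this table, `fits_template_6("handle")` fails while `fits_template_6("handle", "boredom")` succeeds, matching the starred and unstarred examples.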

Template 9: prepositions

• She looked at him right strangely. (dialect)

• She is right pretty. (dialect)

• You look a right clown. (Oxford English Dictionary)

• The government made a right mess of it. (Oxford English Dictionary)

Words can have more than one part of speech

• He needs to see a doctor. (verb)

• Need I be frank? (modal)

• I feel a need to explore my roots. (noun)

Importance to you

• The distributional theory of parts of speech is problematic, but it is your best bet for your grammar writing project.

• When you are building a lexicon, you will decide on parts of speech for words by using template tests and morphological tests.

In-class exercise

• Goals:
  – Interpret the results of distributional tests for parts of speech.
  – Discover that some words are problematic for the distributional theory of parts of speech.
  – Reminder:
    • When you know a language, you know a complex body of unconscious knowledge.

Words that evade classification

• More tests for prepositions and adjectives
  – Cambridge Grammar of the English Language, Chapter 7, Section 2.2

• Attempt to categorize like, worth, near, opposite, due, close, far

Predicative and non-predicative adjuncts

• Cambridge Grammar of the English Language, page 604

• Adjectives: predicative modifiers
  – Tired of the ship, the captain saw an island on which to land.
    • Tired is predicated of the captain.
  – *Tired of the ship, there was a small island.
• Prepositions: non-predicative modifiers
  – Ahead of the ship, the captain saw an island on which to land.
  – Ahead of the ship, there was an island on which to land.

Become, Feel, Seem, Look

• Adjectives
  – He became/seemed/felt/looked happy
• Prepositions
  – *He became/seemed/felt/looked in the park.
  – Exceptions
    • He became/seemed/felt/looked under the weather
    • He became/seemed/felt/looked out of his mind

Degree modification

• Adjectives
  – Very smart
  – Smarter
  – Smart enough
  – *very much smart
• Prepositions
  – *very in the room
  – ?very much in the room
  – *more on the table
    • ?This book is more on the table than that one.
  – ?This book is enough on the table not to fall.
  – ?This book is on the table enough not to fall.
  – This book is very much on the table.
  – ?This book is more about linguistics than that one.

Followed by bare NP or PP

• Adjectives: Cannot be followed by bare NP
  – Fond of Sam
  – *Fond Sam
  – Happy about the promotion
  – *Happy the promotion
• Prepositions: Can be followed by bare NP
  – In the room
  – About linguistics

Right and Straight

• Adjectives:
  – *right red
  – *right conspicuous
  – ?right smart
• Prepositions
  – Straight into the room
  – Right on the table

Coming with a question word when it moves (Pied Piping, from a story where kids and rats followed a piper)

• Relative clause
  – I saw a man
  – The man who I saw ___
• Embedded question
  – I know that you saw someone.
  – I don’t know who you saw ___.
• Prepositions
  – She cut the bread with a knife
  – The knife with which she cut it ___
  – The knife she cut it with
  – I know that you are referring to someone.
  – I don’t know to whom you are referring ___
  – I don’t know who you are referring to.
• Adjectives
  – She is fond of Sam.
  – ?The boy fond of whom she is ___
  – The boy of whom she is fond ___
  – The boy who she is fond of ___
  – *I don’t know fond of whom she is.
  – *I don’t know of whom she is fond ___.
  – I don’t know who she is fond of ___.

Worth

• Predication:
  – Worth over a million dollars, the jewels were kept under surveillance.
  – *Worth over a million dollars, there will be ample opportunity for a lavish lifestyle.
• Become
  – What might have been a $200 first edition suddenly became worth perhaps ten times that much.

Worth

• Degree modification
  – *It was very worth the effort.
  – It was very much worth the effort.
  – ?It was enough worth the effort.
  – ?It was worth the effort enough.
• Followed by a bare NP
  – yes

Worth

• Right and straight
  – *The land is right worth $100K.
• Comes with a question word?
  – She thought the land was worth $100K.
  – This was far less than the amount which she thought the land was worth ___.
  – *This was far less than the amount worth which she thought the land was ___.

Worth

• Degree modification

Parts of Speech in Language Technologies

Part of Speech Tagging

• Input: string of words
• Output: string of words with a part of speech associated with each word.
• Example:
  – This:det boy:N likes:V that:det girl:N

• Use statistical or rule-based knowledge about distribution.

• Usually use a long list of parts of speech, e.g., around 40.
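A minimal rule-based tagger, assuming a tiny hand-built lexicon and a single distributional rule, might look like the sketch below. The lexicon entries, the default tag for unknown words, and the function name `tag` are all hypothetical; real taggers use much larger tag sets and statistical models trained on annotated corpora.

```python
# Hypothetical lexicon: each word form lists its possible tags, preferred first.
LEXICON = {
    "this": ["det"], "that": ["det"],
    "boy": ["N"], "girl": ["N"],
    "likes": ["V", "N"],  # "likes" can also be a plural noun
}

def tag(words):
    """Tag each word, using the previous tag to choose among candidates."""
    tagged = []
    previous = None
    for word in words:
        # Assumption: unknown words default to noun.
        candidates = LEXICON.get(word.lower(), ["N"])
        choice = candidates[0]
        # Distributional rule: right after a determiner, prefer the noun reading.
        if previous == "det" and "N" in candidates:
            choice = "N"
        tagged.append(f"{word}:{choice}")
        previous = choice
    return " ".join(tagged)

result = tag("This boy likes that girl".split())
# result == "This:det boy:N likes:V that:det girl:N"
```

The rule about determiners is the same kind of distributional knowledge that the template tests capture, applied here to resolve an ambiguous word.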

Part of speech tags used in the Penn Treebank

• Coordinating conjunction
• Cardinal number
• Determiner
• Existential-there
• Foreign word
• Preposition/subordinating conjunction
• Adjective
• Comparative adjective
• Superlative adjective
• List item marker
• Modal

Part of speech tags used in the Penn Treebank

• Singular noun or mass noun
• Plural noun
• Singular proper noun
• Plural proper noun
• Predeterminer
• Possessive ending
• Personal pronoun
• Possessive pronoun
• Adverb
• Comparative adverb
• Superlative adverb
• Particle

Part of speech tags used in the Penn Treebank

• Symbol
• To
• Interjection
• Base form verb
• Past tense verb
• Gerund or present participle verb
• Past participle verb
• Verb not 3rd person singular present
• Verb 3rd singular present
• Wh-determiner
• Wh-pronoun
• Possessive wh-pronoun
• Wh-adverb
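In tagged corpora these 36 categories appear as short codes. The mapping below pairs each description from the three lists above with its standard Penn Treebank abbreviation; the slides give only the descriptions, so the codes are added here, and the helper `describe` and the `word/TAG` input format are illustrative assumptions.

```python
PENN_TAGS = {
    "CC": "Coordinating conjunction", "CD": "Cardinal number",
    "DT": "Determiner", "EX": "Existential there", "FW": "Foreign word",
    "IN": "Preposition/subordinating conjunction",
    "JJ": "Adjective", "JJR": "Comparative adjective",
    "JJS": "Superlative adjective", "LS": "List item marker", "MD": "Modal",
    "NN": "Singular noun or mass noun", "NNS": "Plural noun",
    "NNP": "Singular proper noun", "NNPS": "Plural proper noun",
    "PDT": "Predeterminer", "POS": "Possessive ending",
    "PRP": "Personal pronoun", "PRP$": "Possessive pronoun",
    "RB": "Adverb", "RBR": "Comparative adverb", "RBS": "Superlative adverb",
    "RP": "Particle", "SYM": "Symbol", "TO": "To", "UH": "Interjection",
    "VB": "Base form verb", "VBD": "Past tense verb",
    "VBG": "Gerund or present participle verb",
    "VBN": "Past participle verb",
    "VBP": "Verb not 3rd person singular present",
    "VBZ": "Verb 3rd singular present",
    "WDT": "Wh-determiner", "WP": "Wh-pronoun",
    "WP$": "Possessive wh-pronoun", "WRB": "Wh-adverb",
}

def describe(tagged_token):
    """Expand a 'word/TAG' token into 'word: description'."""
    word, _, tag = tagged_token.rpartition("/")
    return f"{word}: {PENN_TAGS.get(tag, 'unknown tag')}"
```

For instance, `describe("boy/NN")` gives "boy: Singular noun or mass noun". Note how the tag set builds in the subcategories discussed earlier, such as singular versus plural nouns and the separate verb forms.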

A different theory of Parts of Speech

Theory of Propositional Acts and Parts of Speech

(William Croft, Radical Construction Grammar, Chapter 2)

• Refer
• Modify
• Predicate

• Nouns are words that refer without additional marking.

• Adjectives and adverbs modify without additional marking.

• Verbs predicate without additional marking.

Additional Marking

• Predication > reference
  – Destroy > destruction
  – The destruction of the city
• Predication > modification
  – Destroy > that destroyed
  – The hurricane that destroyed New Orleans
• Modification > predication
  – Red > is red
  – The book is red
• Modification > reference
  – red > the red one
  – The red one is on the shelf
• Reference > predication
  – Teacher > is a teacher
  – He is a teacher

Problems with propositional acts and additional marking

• Modification > reference without additional marking
  – Robin Hood stole from the rich and gave to the poor.
• Reference > modification without marking
  – Toy house

Variation across languages

• World Atlas of Language Structures

Things that are marked on verbs in other languages

• Aspect
  – Perfect and imperfect
• Mood
  – Subjunctive
• Voice
  – Passive