Building a Lexicon

Statistical learning & recognizing words

Overview

• More on segmentation: the role of statistics

• Using different cues to segmentation: Transitional probabilities vs. Stress cues

• Once words are pulled out and remembered, is lexical access mature?

• What are some steps that occur?

Word Segmentation

• In writing, blank spaces between words, but not in speech

Once you know words, you can hear them

But how do babies pull them out? Only some in isolation!

What have babies learned by 8-10 months?

• Phonetic categories (position specific!)

• Phonotactics (consonant sequences allowed)

• Stress patterns (English predominantly Sw)

• Ability to use distributional information

– A particular word in different contexts might pop out, e.g.

– “the milk”, “drink your milk”, “do you want some milk”

• Can babies use these to help segment words?

Word Segmentation

• At 7.5 months only segment Sw words

• Succeed on words like DOCtor and HAMlet

• But fail on words like guiTAR and deVICE

• For WS words, seem to pull out only the strong syllable

• Indeed, if the strong syllable is always followed by a particular function word, will consider it part of the word

• E.g. pulled out “TAR is”
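The stress-based strategy described above can be sketched in a few lines. This is a deliberately crude illustration, not a model from any of the cited studies: it assumes the input is already divided into syllables marked strong or weak, and the function name `segment_by_stress` is hypothetical. It simply opens a new chunk at every strong syllable.

```python
def segment_by_stress(syllables):
    """Open a new chunk at every strong (stressed) syllable.

    `syllables` is a list of (text, is_strong) pairs. Weak syllables are
    attached to whatever chunk is currently open.
    """
    chunks = []
    for text, is_strong in syllables:
        if is_strong or not chunks:
            chunks.append([text])      # strong syllable: start a new chunk
        else:
            chunks[-1].append(text)    # weak syllable: extend the open chunk
    return ["".join(chunk) for chunk in chunks]


# Sw words come out whole because each begins on its strong syllable.
sw_result = segment_by_stress(
    [("DOC", True), ("tor", False), ("HAM", True), ("let", False)]
)

# A WS word is split, and the strong syllable absorbs the following
# function word, mirroring the "TAR is" error.
ws_result = segment_by_stress([("gui", False), ("TAR", True), ("is", False)])
```

Here `sw_result` is `["DOCtor", "HAMlet"]`, while `ws_result` is `["gui", "TARis"]`: the heuristic works for strong-weak words and fails for weak-strong words in the same way the 7.5-month-olds do.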

Transitional Probabilities

• Why would infants pull out “TAR is”?

Different types of statistics

• Frequency

– the occurrence of an item, high vs low

– Sw words are higher in frequency

• Distributional probabilities

– the distribution of occurrences, bimodal vs. monomodal

• Transitional probabilities

– the probability that one item will follow (or be followed by) another

Transitional probabilities and segmentation

• What are transitional probabilities?

“ABCABCABC” “ABCBABCBA” 121424214 etc.

• The problem:

Listeningtoenglishwhenyoudon’tknowwhereonewordendsandthenextbeginsfeelsalittlelikethis.

Listening to English when you don’t know where one word ends and the next begins feels a little like this.

Transitional probabilities and segmentation

• A sensitivity to transitional probabilities could be useful:

– Lookattheprettybaby.

– Whatalittlebaby!

– Mybabygirliscrying.

– Ilikemybagelwithcreamcheeseandlox.

• Probability of Ba+by is 3/4

• If you’re trying to figure out the “units” in speech, this might be a useful property
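To make the 3/4 figure concrete, the forward transitional probability of B given A can be computed as (number of times A is immediately followed by B) / (number of times A is followed by anything). The sketch below applies that to the four utterances above. The syllable breaks are hand-made assumptions for illustration, and the function name `transitional_probability` is hypothetical.

```python
from collections import Counter

# The four example utterances, split into syllables by hand (an assumption).
utterances = [
    ["look", "at", "the", "pre", "tty", "ba", "by"],
    ["what", "a", "li", "ttle", "ba", "by"],
    ["my", "ba", "by", "girl", "is", "cry", "ing"],
    ["i", "like", "my", "ba", "gel", "with", "cream", "cheese", "and", "lox"],
]

pair_counts = Counter()   # how often syllable a is immediately followed by b
first_counts = Counter()  # how often syllable a is followed by anything
for utterance in utterances:
    for a, b in zip(utterance, utterance[1:]):
        pair_counts[(a, b)] += 1
        first_counts[a] += 1


def transitional_probability(a, b):
    """Forward transitional probability P(b | a) within utterances."""
    if first_counts[a] == 0:
        return 0.0
    return pair_counts[(a, b)] / first_counts[a]


tp_ba_by = transitional_probability("ba", "by")    # 3 of the 4 "ba" tokens
tp_ba_gel = transitional_probability("ba", "gel")  # the remaining 1 of 4
```

“ba” occurs four times and is followed by “by” in three of them, so `tp_ba_by` is 0.75 and `tp_ba_gel` is 0.25.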

Transitional probabilities and segmentation

• Presented infants with a 2min sequence of syllables with no pauses:

• Bidakupadotigolabubidakugolabu...

• Test: golabu (word; within-word transitional probability 1.0) vs tigola (partword; transitional probability 0.33 across the word boundary)

Infants looked longer when the partwords were played, showing that they discriminated between the two test items.

Saffran et al., 1996
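The logic of this design can be sketched as a small simulation. It is an illustration under stated assumptions, not the procedure of the study: the fourth word “tupiro” is assumed, the random ordering with no immediate repeats is assumed, every syllable is taken to be two letters, and the 0.9 threshold and all function names are hypothetical.

```python
import random
from collections import Counter

# Three words appear in the stream above; "tupiro" is an assumed fourth word.
WORDS = ["bidaku", "padoti", "golabu", "tupiro"]


def syllabify(word):
    # Each syllable in this toy language is consonant + vowel (2 letters).
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def make_stream(n_words, seed=0):
    """Concatenate randomly chosen words, never repeating one back to back."""
    rng = random.Random(seed)
    stream, previous = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != previous])
        stream.extend(syllabify(word))
        previous = word
    return stream


def transitional_probabilities(stream):
    """Map each adjacent syllable pair (a, b) to P(b | a)."""
    pairs = Counter(zip(stream, stream[1:]))
    firsts = Counter(stream[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}


def segment(stream, tps, threshold=0.9):
    """Place a word boundary wherever the transitional probability dips."""
    words, current = [], [stream[0]]
    for a, b in zip(stream, stream[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


stream = make_stream(1800)
tps = transitional_probabilities(stream)
found = set(segment(stream, tps))
```

Because every syllable belongs to exactly one word, within-word transitions such as `tps[("go", "la")]` are exactly 1.0, while boundary transitions such as `tps[("ti", "go")]` hover near 1/3. Cutting at the dips recovers the four words, and a partword like tigola never appears as a unit.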

Transitional probabilities and language learning

• Infants can use the transitional probabilities between syllables to pull out candidate words from continuous speech.

• At 8 months

• Maybe segmented words are part of prelexical representations

Which comes first?

• So – we’ve learned that infants can use stress cues to segment words

• And, they can use transitional probabilities to segment words

• Heard speech has both cues

• We never produce monotone utterances like Bidakupadotigolabubidakugolabu

• It seems that if we combine stress cues and predictive transitional probabilities, segmentation might be even easier

• But how could you learn stress cues if you didn’t already know some words?

• Johnson & Jusczyk (2001) suggest that infants learn English stress frequencies from hearing words in isolation

• Is this necessary? Not if infants use TPs first.

• Thiessen & Saffran (Dev. Psych, 2003) tested this question

• They used the Saffran et al. type of task, but added stress cues

• Compared infants of 6-7 and 8-9 months on their ability to learn S-W vs. W-S words

• At 6-7 months, if the TP predicted a word, the infants were able to segment it whether it was SW or WS

• At 8-9 months, they only succeeded on SW words

• To use the Saffran et al. example:

• An infant of 6-7 months could learn either:

– bidakupadotiGOlabubidakuGOlabu OR

– bidakupadotigoLAbubidakugoLAbu

• But an infant of 8-9 months showed better learning of:

– bidakupadotiGOlabubidakuGOlabu

* This has been simplified from the way the study was actually done

• Infants can use many different types of statistics to learn about words

• These include frequency, distributional probabilities, and transitional probabilities

• Different types of statistics might be important at different points in development for helping infants along the way

Is learning transitional probabilities specific for learning language?

• Tone sequences

AFB, F#A#D, EGD#, CG#C#

AFBF#A#DAFB EGD#CG#C#

• Test: AFB (“word”) vs. D#CG# (“partword”)

• Infants of 8 months looked longer when the partwords were played, showing that infants discriminated between the two test items.

Saffran et al., 1999

How specific is this type of learning for language?

• Visual sequences

• Presented 1 object at a time, no pauses, objects were in pairs

Kirkham et al. 2001

How specific is this type of learning for language?

• Test: (real pair) vs. (novel pair)

• Infants as young as 2 months looked longer at the “novel pairs”

Kirkham et al. 2001

Are humans the only species sensitive to statistics?

• Birds are sensitive to distributions of vowels: can discriminate vowels when they are presented in a bimodal distribution

Kluender et al., 1998

• Cotton-top tamarins (monkeys) heard the continuous nonsense words of Saffran et al., and responded the same way as infants, by looking longer at the part-words

Hauser et al., 2001

Summary: statistical learning

• Human infants (and adults) can do it

• Humans can do it for language, but also for tones and for visual images!

• Birds and monkeys can do it too.

• Statistical learning seems to be a very basic learning mechanism that human infants use to help “bootstrap” into language.

But is that all word learning is?

• The “word” learning we have talked about so far, this week and last, is really just about learning familiar word forms

• We need word forms

• But we need to learn word meanings as well

• Indeed, a full lexical entry has:

– Word form

– Word meaning (intension: definition)

– Word meaning (extension: other objects it can apply to)

– Grammatical class

The mature “lexicon” and language processing

• Your lexicon has all this information about every word you hear

• So, when you hear a sentence like:

– “Get the leash so we can take the dog to the park”

• You’ll recognize the highly predictable word “dog” at the very beginning of the “d”

• The semantic context has been set up, you know a noun must come after “the”, and those two facts make you expect “dog”

• This allows you to synthesize, process more efficiently, and look forward for what might be coming next in the sentence

• Imagine if I had said:

– “Get the leash so we can take the drapes down”

• Fernald et al. (1998) ask whether infants’ lexicons are this mature from the beginning.

• They find they are not, but that they mature quickly

– There are rapid gains in the speed of word recognition between 15 and 24 months of age

– At 15 months, infants have to hear the whole word and have a chance to see both objects

– By 24 months, they, like adults, are making a guess on the basis of the beginning sounds of the word
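One way to picture “making a guess on the basis of the beginning sounds” is as a candidate set that shrinks as each sound arrives. The sketch below is only an illustration, not the method of Fernald et al.: the mini-lexicon is hypothetical, spelling stands in for the sound sequence, and both function names are made up for this example.

```python
# Hypothetical mini-lexicon; letters stand in for sounds.
LEXICON = ["dog", "doll", "drapes", "duck", "leash", "park"]


def candidates_over_time(heard_word, lexicon=LEXICON):
    """Return (prefix, matching words) after each successive sound."""
    history = []
    for end in range(1, len(heard_word) + 1):
        prefix = heard_word[:end]
        matches = [w for w in lexicon if w.startswith(prefix)]
        history.append((prefix, matches))
    return history


def recognition_point(heard_word, lexicon=LEXICON):
    """Number of sounds heard before exactly one candidate remains."""
    for prefix, matches in candidates_over_time(heard_word, lexicon):
        if len(matches) == 1:
            return len(prefix)
    return None  # never narrowed to a single candidate


# With no context, "dog" competes with "doll", "drapes" and "duck".
no_context = recognition_point("dog")

# If context has already narrowed the options (assumed here to leave only
# these three words), the first sound is enough.
with_context = recognition_point("dog", ["dog", "leash", "park"])
```

In this toy lexicon `no_context` is 3 (the whole word is needed) and `with_context` is 1, which parallels the point above that semantic and grammatical expectations let a listener commit to “dog” at the very beginning of the “d”.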

More on accessing the word form…

• So, by 24 months, infants need only part of the sound to access the whole word in the lexicon

• But this is just about accessing word form

• How about when infants start learning the meaning of words as well?

• To explore this, we tested infants at the very beginning stages of word meaning: an associative understanding of word-object linkages

• (not necessarily the same thing as referential understanding)

• We asked how easy it is for infants to access the full detail in the lexicon

Word-learning: Switch Design

(Werker, Cohen, Lloyd, Casasola, & Stager, Dev. Psych., 1998)

Baby in the Switch Task

Looking Time During Test Phase

[Figure: looking time (s) on Same vs. Switch trials, by age in months (8, 10 to 12, 14)]

Infants succeed by 14 months

Werker, Cohen, Lloyd, Casasola, & Stager, Dev. Psych., 1998

Learning Similar Words

Stager & Werker, Nature, 1997

Early Word Learning

Results: Similar sounding words

• Infants of 14 months failed to learn similar words, even though they can discriminate the words

• This suggested that at the early stages of word learning, when the task is challenging for the infant, they do not have the attentional resources available to attend to the fine phonetic detail in words

• To confirm this, we tested older infants who are more accomplished word learners (with Fennell, Corcoran, & Stager, Infancy, 2002)

• To rule out the possibility that perhaps the objects were too much alike, we used less similar objects

Testing infants on minimal pair words

Habituation Phase Test Phase Same Switch

“bih” “dih” “bih” “dih”

[Figure: looking time (s) on Same vs. Switch trials, by age in months (14, 17, 20)]

Infants fail until 17 months

Stager & Werker, Nature, 1997

Werker, Fennell, Corcoran, & Stager, Infancy, 2002

-Unless they know the words well (Fennell & Werker, Lang. & Speech, 2003; Swingley & Aslin, 2002)

-Or tested in a preference test with both objects as reminders (with Fennell, Swingley, & Yoshida)

-And see Julia Wales & George Hollich (CDS, 2003)


Minimal pair word learning

(Werker, Cohen, et al., 1998; Stager & Werker, 1997; Werker et al., 2001)

Explanation to date

(Stager & Werker, 1997; Fennell & Werker, in press; Werker & Fennell, in press)

• Word-object linking is initially difficult for infants

• May pick up detail while listening to words

• Cannot learn linkage and utilize full phonetic detail

• The task of learning the associative link is so difficult that without the object there, they can’t remember the precise details of the word form even though it is likely stored in the lexicon

• As infants become more accomplished word learners, detail can be used

Swingley & Aslin (2001, 2002) do find evidence of use of phonetic detail

Where’s the Baby? OR Where’s the Vaby?

2 DVs:

– Latency to look away from mismatch (14 & 18-22 mos)

– Total looking time to the match (14 mos)

How to explain different results?

• The two-choice looking procedure may remind the child of what the link is

– Can measure on-line processing via latency

– All information available for recognition memory to operate in overall LT to match

Experiment 2: Combined Method (w/Dan Swingley, Katie Yoshida & Chris Fennell)

• Teach infants 2 new minimal pair words via habituation design

• Test using side by side S&A design

• 2 DVs

– Latency to look toward match

– Total looking time to match

Habituation Phase

“bih”

“dih”

“bih”

Test Phase (8 trials)

• When tested this way, when both the object and the word are presented, the infant has more cues to remind them, and is better able to access the full detail in the lexicon

To think about…

• Infants can use a variety of statistics to help them learn about words

– Frequency

– Transitional probabilities

– Distributional probabilities

• Indeed, at the earliest stages of word learning, associative statistics are helpful

• Do you think statistical & associative learning can explain language?

Problems for statistics…

• How do infants know what to calculate their statistics across?

• If they can detect transitional probabilities in tones and visual images as well as syllables, why don’t they ever connect syllables to tones?

• How do they know to listen for syllables to begin with?

• NGG might suggest that just as the child is a filter for motherese, the child’s language learning biases act as a filter for which statistical regularities to pay attention to, and when