Using Extra-Linguistic Cues to Identify Good Word Learning Instances

Tamara Nicol Medina, John Trueswell, Lila Gleitman (University of Pennsylvania); Jesse Snedeker (Harvard University)

Society for Research in Child Development, April 2, 2009, Denver, CO



Just look at the world!

Observe physical and temporal contingencies between words and objects. (At least, for physically observable objects.)

Experimental evidence supports ease of mapping:
- Fast mapping (e.g., Carey, 1978; Mervis & Bertrand, 1994; Behrend et al., 2001; Jaswal & Markman, 2001)
- Cross-situational word learning (e.g., Yu & Smith, 2007; Smith & Yu, 2008; Vouloumanos, 2008; Xu & Tenenbaum, 2007)

It’s Not that Easy! (Augustine, Locke, Quine, Gleitman, Fodor, Siskind, etc.)

Reference problem: Book? Cat? Shoes? Chair? Cheerios? Cup? Rug? Pants? Head? Hand? …

Frame problem: Dog or Puppy? Hand or Finger? Red or Ball?

Naturalistic learning conditions: Medina, Trueswell, Snedeker, & Gleitman (2008). When the shoe fits: Cross-situational word learning in realistic learning environments. BUCLD.

How do learners narrow down the possibilities?

Linguistic context (Landau & Gleitman, 1985; Gleitman, 1990; Gillette, Gleitman, Gleitman, & Lederer, 1999)

Learning biases:
- Whole object constraint (Markman, 1989)
- Mutual exclusivity (Markman & Wachtel, 1988; Markman, Wasow, & Hansen, 2003)

Social-attentional cues (e.g., Baldwin 1991, 1993; Tomasello & Akhtar, 1995; Bloom, 2002; Behne, Carpenter, & Tomasello, 2005)

Social-Attentional Cues

Nonverbal cues can reduce the range of possible interpretations.

Direction of speaker eye-gaze (Baldwin, 1991, 1993; Trueswell & Gleitman, 2003; Nappa, Wessel, McEldoon, Gleitman, & Trueswell, 2009)

Joint attention: Occurs naturally when parent and child are focused on the same thing at the same time (Baldwin, 1991; Bruner, 1978)

Accompanies ~70% of mothers' utterances (Collis, 1977; Harris, Jones, & Grant, 1983; Tomasello & Todd, 1983)

Positively associated with early vocabulary acquisition (Tomasello & Todd, 1983; Harris, Jones, Brookes, & Grant, 1986; Tomasello, Mannle, & Kruger, 1986; Akhtar, Dunham, & Dunham, 1991)

Quality of Learning Instances (Baldwin, 1991)

But what about the lack of perfect contingency between word and referent?

Follow-In vs Discrepant Labeling

“Look! A dax!”


Follow-in labeling: eye gaze, voice direction, and body posture oriented toward the object the child is currently focused on. 16–19-month-olds mapped the word correctly.

Discrepant labeling: eye gaze, voice direction, and body posture directed at a hidden (but previously seen) object while the infant is focused on another object. Infants did not map the word to the focused object.

Social-Attentional Cues in Interaction (Frank, Goodman, & Tenenbaum, in press)

Rollins corpus (CHILDES): mom and baby (6 mo).

Social-attentional cues coded:
- Infant: hands, mouth (infant only), "touch", looking (direction of eye gaze)
- Caregiver: hands, "touch", looking (direction of eye gaze)

A cross-situational word-learning model successfully discovered the mappings between words and objects.


Joint attention? Follow-in? What would interaction look like if the child were initiating actions?
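The cross-situational idea can be illustrated with a minimal sketch. This is not Frank et al.'s Bayesian intentional model; it is a plain co-occurrence-counting learner, with toy scenes invented for illustration, showing how ambiguity within any single scene resolves across scenes.

```python
from collections import defaultdict

def cross_situational_learn(situations):
    """Tally word-object co-occurrences across situations and map each
    word to its most frequently co-occurring object."""
    counts = defaultdict(lambda: defaultdict(int))
    for words, objects in situations:
        for w in words:
            for o in objects:
                counts[w][o] += 1
    return {w: max(objs, key=objs.get) for w, objs in counts.items()}

# Toy scenes (hypothetical): "ball" co-occurs with BALL in every scene
# where it is uttered, while the distractor objects vary.
scenes = [
    (["look", "ball"], ["BALL", "DOG"]),
    (["the", "ball"], ["BALL", "CUP"]),
    (["dog"], ["DOG", "BALL"]),
]
lexicon = cross_situational_learn(scenes)
print(lexicon["ball"])  # BALL
```

In any one scene "ball" is ambiguous between two objects; only the aggregation over scenes singles out BALL.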

Our Goals

Look at a representative sample of parent-child interactions.

Explore the conditions under which word meaning is transparent (or not) from extra-linguistic cues alone:
- Presence (or absence) of cues
- Timing and coordination of cues
- Joint attention? Follow-in?

Selection of Stimuli

Large video corpus of parent–child interactions in natural settings (home, outdoors, etc.): Snedeker, J. (2001). Interactions between infants (12–15 months) and their parents in four settings. Unpublished corpus.

Selection of Stimuli

Word learning "norming" study: Gertner, Y., Fisher, C., Gleitman, L., Joshi, A., & Snedeker, J. (in progress). Machine implementation of a verb learning algorithm.

Adaptation of the Human Simulation Paradigm (Gillette, Gleitman, Gleitman, & Lederer, 1999; Snedeker & Gleitman, 1999):
- Randomly selected six instances of highly frequent content words. Each instance was edited into a 40-second "vignette" with the sound turned off.
- Visual context is the only cue to word meaning, placing viewers in the situation of the early word learner.
- Utterance of the target word (at 30 sec) indicated by a BEEP.
- Viewers guess the "mystery" word in each vignette.

[Figure: vignette timeline. 30 sec of silence, <BEEP> at the target word, then 10 sec of silence. Drawings courtesy of Emily Trueswell.]

Two types of vignettes:
- "High Informative" (HI): vignettes guessed by >50% of participants. Rare (only 7% of vignettes); all basic-level objects.
- "Low Informative" (LI): vignettes guessed by <33% of participants.

Stimuli for the current study: 8 nouns (bag, ball, book, horse, necklace, nose, phone, shoe), each with one HI vignette and one LI vignette.

Pilot Study: Children

N = 12 (ages 3;1 to 5;4). Modified for fun: shorter vignettes with funny noises, "What do you think the parent said?", and a celebratory animation.

[Figure: children's % correct guesses (0–100%), HI vs. LI vignettes. HI significantly higher than LI (χ² = 3.84).]
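The χ² statistics reported throughout these slides can be computed from a 2×2 contingency table (vignette type × correct/incorrect guess). A minimal sketch using the closed-form Pearson statistic for a 2×2 table; the counts below are hypothetical, not the study's actual data:

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table
    (rows: HI/LI vignette; columns: correct/incorrect guesses)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    # Closed-form Pearson statistic for a 2x2 table
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: [correct, incorrect] for HI and LI vignettes
stat = chi_square_2x2([[8, 4], [2, 10]])
print(round(stat, 2))  # 6.17
```

With df = 1, the .05 critical value is 3.84, so a statistic at or above that threshold (as in the figure) is significant at α = .05.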

Extra-Linguistic Cue Coding

- Is the target object visible in the scene (on screen)?
- Is the child moving or reaching towards the object?
- Is the child handling the object?
- Is the child looking at the object?
- Is the child looking at the parent?
- Is the parent moving or reaching towards the object?
- Is the parent handling the object?
- Is the parent looking at the object?
- Is the parent looking at the child?

Presence of Target Object

[Figure: average total duration (sec, 0–40) of the object's presence in the scene, HI vs. LI vignettes. Error bars reflect standard error of the mean.]

Cue Occurrence at Word Onset

[Figure: % occurrence of each cue at the target word (0–100%), child and parent cues, HI vs. LI vignettes. Significant HI/LI differences: χ² = 7.27, χ² = 4.00, χ² = 4.27.]

Joint Attention at Word Onset

Joint attention: child looking at and/or handling the object AND parent looking at the object.

[Figure: % occurrence of joint attention at the target word (0–100%), HI vs. LI vignettes. HI significantly higher (χ² = 4.00).]

What is the timing of cues?

Follow-in? Parent refers to the object under the child's focus of attention.

Measure: first onset of each cue relative to word onset.

First Onset of Cues

[Figure: first onset of each cue (sec, 0–40) relative to word onset, HI vs. LI vignettes. Error bars reflect standard error of the mean.]

- Child looking at object: t(1,12) = 1.56, p = 0.14
- Child moving/reaching toward object: t(1,12) = 2.05, p = 0.06
- Child handling object: t(1,12) = 2.96, p = 0.01
- Parent handling object: t(1,8) = 1.09, p = 0.31
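The onset comparisons above are two-sample t-tests on first-onset times. A minimal sketch of the statistic's form, with hypothetical onset times (not the study's data) in which the child-handling cue appears earlier in HI vignettes:

```python
import math

def welch_t(xs, ys):
    """Welch's two-sample t statistic comparing first-onset times (sec)
    between two groups of vignettes."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)  # sample variances
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

# Hypothetical first onsets (sec): cue appears earlier in HI vignettes
hi = [5.0, 8.0, 6.0, 7.0]
li = [14.0, 18.0, 12.0, 16.0]
print(welch_t(hi, li) < 0)  # True: earlier HI onsets give a negative t
```

The sign convention here is arbitrary; what matters for the slides' claim is whether the HI and LI onset distributions differ reliably.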

First Onset of Cues (continued)

[Figure: first onset of each cue (sec, 0–40) relative to word onset, HI vs. LI vignettes. Error bars reflect standard error of the mean.]

- Parent looking at child: t(1,13) = 0.54, p = 0.59
- Parent looking at object: t(1,12) = 0.08, p = 0.93
- Child looking at parent: t(1,6) = 0.02, p = 0.98
- Parent moving/reaching toward object: t(1,7) = 0.15, p = 0.89

Differentiating HI and LI Vignettes

High Informative:
- Follow-in: utterance of the target word immediately after the first onset of the child's shift in focus towards the object.
- Joint attention: co-occurring high rates of the child's attention to the object and the parent's attention to the child and object.

Low Informative:
- Delayed follow-in.
- Low joint attention.

Implications

Basic-level object terms provide a scaffold for further learning: word order, syntax, abstract lexical items, etc.

Vindication of Bruner's and Baldwin's social conditions for word learning, found in natural parent–child interactions: word learning is successful when cues align.