Cross-modal Prediction in Speech Perception Carolina Sánchez, Agnès Alsius, James T. Enns & Salvador Soto-Faraco Multisensory Research Group Universitat Pompeu Fabra Barcelona


Cross-modal Prediction in Speech Perception

Carolina Sánchez, Agnès Alsius, James T. Enns & Salvador Soto-Faraco

Multisensory Research Group

Universitat Pompeu Fabra

Barcelona


Auditory + visual performance

MSI enhancement

Background

Visual + Auditory

Improve Speech Perception

Multisensory Integration


Background

• Prediction within one sensory modality

• Many levels of information processing

– Phonological prediction: “This morning I went to the library and borrowed a … book” (DeLong, 2005; Pickering, 2007)

– Visual prediction: Visual search (Enns, 2008; Dambacher, 2009)

– Sensorimotor prediction: forward model (Wolpert, 1997)


Predictive coding

Pickering, 2007


Hypothesis

• If prediction exists within a single modality,

and if predictive coding models can account for prediction at a phonological level, then …

predictive coding could also occur across different sensory modalities.


Indirect evidence of cross-modal transfer in speech

van Wassenhove, 2005

time

ERPs

• Amplitude reduction

• Latency shortening

/pa/ high visual saliency

/ka/ low visual saliency


Our study

• Visual prediction

• Auditory prediction

• Visual-to-auditory cross-modal prediction

• Auditory-to-visual cross-modal prediction


Visual prediction

Visual stream

Auditory stream

V

A

With informative visual context

Without informative context

Task: AV Match vs. AV Mismatch

Target fragment

Context fragment

speech / non-speech


Results

[Figure: reaction time (msec, axis 0 to 1200) on match and mismatch trials, with informative visual context vs. without informative context; * marks the effect noted below]

* Participants respond faster with a preceding context than without it.

VISUAL PREDICTION


Auditory prediction

Visual stream

Auditory stream

V

A

With informative auditory context

Without informative context

speech / non-speech

Task: AV Match vs. AV Mismatch

Target fragment

Context fragment


Results

[Figure: reaction time (msec, axis 0 to 1200) on match and mismatch trials, with informative auditory context vs. without informative context; * marks the effect noted below]

* Participants respond faster with a preceding context than without it.

AUDITORY PREDICTION


Visual vs. Auditory

[Figure, two panels: RTs (msec, axis 0 to 1200) on congruent and incongruent trials. Visual prediction panel: with informative visual context vs. without informative context. Auditory prediction panel: with informative auditory context vs. without informative context. Each panel carries a * marker.]


Conclusions

• Visual prediction

• Auditory prediction

Is this prediction cross-modal?


Predictability of Vision-to-Audition

Design of the experiment

[Diagram: visual stream (V) and auditory stream (A) for each trial type]

• Unimodal continued: Match, Mismatch

• Discontinued: Match, Mismatch

• Cross-modal continued: Mismatch
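The five trial types in the design above can be written down as a small table, which may help when scripting stimulus lists or analyses. This is only an illustrative sketch: the `Condition` class, its field names, and the `mismatch_conditions` helper are hypothetical, and only the condition labels themselves come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    # Label taken from the slide: how the target fragment relates to the context.
    continuity: str
    # "match" or "mismatch": the correct answer in the AV match/mismatch task.
    av_target: str


# The five trial types listed on the design slide.
CONDITIONS = [
    Condition("unimodal continued", "match"),
    Condition("unimodal continued", "mismatch"),
    Condition("discontinued", "match"),
    Condition("discontinued", "mismatch"),
    Condition("cross-modal continued", "mismatch"),
]


def mismatch_conditions():
    """Return the continuity labels of the mismatch trial types, in slide order."""
    return [c.continuity for c in CONDITIONS if c.av_target == "mismatch"]
```

The three labels returned by `mismatch_conditions()` are the ones that appear on the x-axis of the results figures.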


Predictability of Vision-to-Audition

Stimuli

[Diagram: visual stream (V) and auditory stream (A) for the three mismatch trial types: unimodal continued, discontinued, cross-modal continued]


Results

Participants were faster in the cross-modal condition than in the completely incongruent one.

VISUAL-TO-AUDITORY PREDICTION

[Figure: reaction time (msec, axis 700 to 1000) for the unimodal continued, discontinued, and cross-modal continued conditions; stream labels: Visual, Auditory; * marker on the plot]
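The comparison reported on this slide (faster responses in the cross-modal condition than in the completely incongruent one) comes down to a difference between mean reaction times. The sketch below shows one way to compute it, assuming trials are available as (condition label, RT in msec) pairs. The function names and the input format are hypothetical, and the slides do not say which statistical test was applied, so none is shown.

```python
from collections import defaultdict
from statistics import mean


def mean_rt_by_condition(trials):
    """Average reaction time (msec) per condition label.

    `trials` is an iterable of (condition_label, rt_msec) pairs
    (a hypothetical input format, not specified in the slides).
    """
    grouped = defaultdict(list)
    for condition, rt in trials:
        grouped[condition].append(rt)
    return {condition: mean(rts) for condition, rts in grouped.items()}


def rt_advantage(trials, baseline, test):
    """Mean RT of `baseline` minus mean RT of `test`.

    A positive value means responses were faster in `test`.
    """
    means = mean_rt_by_condition(trials)
    return means[baseline] - means[test]
```

Reading "completely incongruent" as the discontinued condition, `rt_advantage(trials, "discontinued", "cross-modal continued")` would be positive for the pattern described here.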


Predictability of Audition-to-Vision

Design of the experiment

[Diagram: auditory stream (A) and visual stream (V) for each trial type]

• Unimodal continued: Match, Mismatch

• Discontinued: Match, Mismatch

• Cross-modal continued: Mismatch


Results

[Figure: reaction time (msec, axis 0 to 1200) for the unimodal continued, discontinued, and cross-modal continued conditions; stream labels: Visual, Auditory]

We did not find any difference between the mismatch conditions.

NO AUDITORY-TO-VISUAL PREDICTION


Conclusions

• There is some kind of prediction from the visual to the auditory modality

• There is no prediction from the auditory to the visual modality

Does this prediction depend on the language?


Results (L1)

VISUAL-TO-AUDITORY PREDICTION IN NATIVE LANGUAGE

[Figure, two panels: reaction time (msec, axis 700 to 1000) for the unimodal continued, discontinued, and cross-modal continued conditions; stream labels: Visual, Auditory. Panel 1: Canadian participants with English sentences. Panel 2: Spanish participants with Spanish sentences. Each panel carries a * marker.]


Results (L1)

[Figure, two panels: reaction time (msec, axis 0 to 1200) for the unimodal continued, discontinued, and cross-modal continued conditions; stream labels: Visual, Auditory. Panel 1: Canadian participants with English sentences. Panel 2: Spanish participants with Spanish sentences.]

No differences between the mismatch conditions

No prediction from auditory-to-visual modality in native language


Conclusions

• There is some kind of prediction from the visual to the auditory modality in L1

• There is no prediction from the auditory to the visual modality in L1

What happens with an unknown language?


Unknown language: visual to auditory

Canadian participants with Spanish sentences

[Figure: reaction time (msec, axis 700 to 1200) for the unimodal continued, discontinued, and cross-modal continued conditions; stream labels: Visual, Auditory]

NO VISUAL-TO-AUDITORY PREDICTION IN OTHER LANGUAGE


Unknown language: auditory to visual

[Figure, two panels: reaction time (msec, axis 0 to 1200) for the unimodal continued, discontinued, and cross-modal continued conditions; stream labels: Visual, Auditory. Panel 1: Spanish participants with English sentences. Panel 2: Canadian participants with Spanish sentences.]

No differences between the mismatch conditions

No prediction from auditory-to-visual modality in other language


Conclusions

• No visual-to-auditory cross-modal prediction in an unknown language…

it seems that some knowledge of the articulatory phonetics of the language is required to benefit from predictive coding

• No auditory-to-visual cross-modal prediction


General Conclusions

• Unimodal prediction: from visual to visual modality and from auditory to auditory modality

• L1: ASYMMETRY

– Cross-modal prediction from visual-to-auditory modality

– No cross-modal prediction from auditory-to-visual modality

• Unknown language: previous knowledge of the language is necessary to make the prediction

– No cross-modal prediction from visual-to-auditory modality

– No cross-modal prediction from auditory-to-visual modality


Thanks to…

- Agnès Alsius, Postdoc, Queen’s University

- Antonia Najas, MA / Research Assistant, Universitat Pompeu Fabra

- Phil Jaekl, Postdoc, Universitat Pompeu Fabra

- All the people of the Vision Lab, UBC, Vancouver

Thanks for your attention!!