Pragmatically-guided perceptual learning
Tanya Kraljic, Arty Samuel, Susan Brennan
Adaptation Project mini-Conference, May 7, 2007
1-Minute Background on Speech Perception
Part 1: Perceptual constancy
[Diagram: Speaker and Listener]
Speech sounds (phonemes) differ depending on:
• who is speaking
• what the immediate phonetic context is
And yet… perceptual constancy
1-Minute Background on Speech Perception
Part 2: Solutions?
[Diagram: Speaker and Listener]
1. Learn the acoustic invariants as children, then extract those and discard everything else as we’re listening
Problem: What acoustic invariants?
2. Represent (learn) every variation that is encountered
Problem: memory (if every variant is stored separately), ‘catastrophic interference’ (if you keep changing the same representation)
Getting at the Question: How does the perceptual system decide what to learn?
General idea in perception: Maybe the system tries to learn invariants of the distal objects that produce the stimuli (in this case, that would mean the speaker) and not of the stimuli themselves (in this case, the acoustic signal)
Our hypothesis: Maybe the system tries to learn those aspects of the signal that reflect characteristic properties of the speaker (and therefore are likely to remain stable across contexts and situations)
Specifically: How might it determine which variations are characteristic?
Our test: two kinds of information the system might use:
1. A ‘first impressions’ heuristic: In the absence of any other information, the properties that are present during first encounter are assumed to be representative and stable
2. Pragmatic cues: information indicating that the variation is incidental (e.g., seeing that the speaker is talking with a pen in her mouth) can override the influence of primacy
What does perceptual learning look like? 2-phase Method
1. Exposure Phase (Lexical Decision Task)
Purpose: To expose participants to a speaker who pronounces a particular sound in an ambiguous way (e.g., /?sʃ/)
Method: The /?sʃ/ occurs in the context of words that cause the sound to be perceived as one or the other phoneme (e.g., dino?aur OR impa?ent).
* Listeners hear both ‘odd’ (dino?aur) and good (legacy) versions of the phonemes from the same speaker *
2. Test Phase (Category Identification)
Purpose: Tests whether perceptual learning has occurred
Method: Participants hear items from a continuum that ranges from /s/ to /ʃ/ with several ambiguous points in between. They have to label each sound as S or SH.
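The Test-phase scoring described above can be sketched roughly as follows: tally, for each continuum step, the percentage of trials labeled SH. This is only an illustration. The function name `percent_sh_by_step`, the step numbering, and the responses are hypothetical and invented for this example; they are not the study's data or analysis code.

```python
from collections import defaultdict


def percent_sh_by_step(trials):
    """Return {continuum_step: percent of trials labeled 'SH'}.

    `trials` is an iterable of (step, label) pairs, where label is 'S' or 'SH'.
    """
    counts = defaultdict(lambda: [0, 0])  # step -> [n_sh, n_total]
    for step, label in trials:
        if label not in ("S", "SH"):
            raise ValueError(f"unexpected label: {label!r}")
        counts[step][0] += int(label == "SH")
        counts[step][1] += 1
    return {
        step: 100.0 * n_sh / n_total
        for step, (n_sh, n_total) in sorted(counts.items())
    }


# Made-up responses for illustration only
# (assumed numbering: step 1 = clear S end, step 3 = clear SH end).
example_trials = [(1, "S"), (1, "S"), (2, "S"), (2, "SH"), (3, "SH"), (3, "SH")]
print(percent_sh_by_step(example_trials))
```

Plotting these percentages against continuum step, separately for each exposure group, would give a labeling curve of the kind usually shown for this type of task.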
Manipulation: 2X2
*All manipulations are during the Exposure phase*
Modality (Audio Only, AudioVisual) X Pronunciation attribute (Characteristic, Incidental)
(really X another 2 - Phoneme: ?S or ?SH)
Pronunciation attribute varies by modality:
AudioOnly modality = Order manipulation (to test the ‘first impressions’ heuristic)

Order     1st half    2nd half    Attribution      Prediction
Odd 1st   dino?aur    legacy      Characteristic   learning
Odd 2nd   legacy      dino?aur    Incidental       no learning
Results: Audio Modality
[Figure: % SH responses (0 to 100) across the continuum from /s/ through /?sʃ/ to /ʃ/, for ?SH Exposure vs. ?S Exposure, one panel per order condition]
Odd First: Perceptual learning (F(1,62)=5.93, p=.018)
Odd Second: No perceptual learning (F(1,62)=.29, p=.59)
Pronunciation attribute varies by modality:
AudioVisual modality = Pragmatic manipulation (can it override the ‘first impressions’ heuristic?)

Pragmatic          Order       Attribution      Prediction
No pen in mouth*   odd first   Characteristic   learning
Pen in mouth       odd first   Incidental       no learning
*No pen in mouth condition is just an AV version of our Audio, Odd-first condition
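As a compact summary of the design, the sketch below enumerates the cells obtained by crossing the Audio and AudioVisual manipulations described above with the exposure phoneme (?S or ?SH), and attaches the attribution and prediction stated for each. The names `PREDICTIONS` and `design_cells` are hypothetical; the attributions and predictions are transcribed from the slides, and nothing here comes from the authors' own materials.

```python
from itertools import product

# (modality, manipulation level) -> (attribution, prediction),
# transcribed from the Audio and AudioVisual manipulation slides.
PREDICTIONS = {
    ("Audio", "odd first"): ("Characteristic", "learning"),
    ("Audio", "odd second"): ("Incidental", "no learning"),
    ("AudioVisual", "no pen in mouth"): ("Characteristic", "learning"),
    ("AudioVisual", "pen in mouth"): ("Incidental", "no learning"),
}

PHONEMES = ("?S", "?SH")  # ambiguous sound heard in S-words or in SH-words


def design_cells():
    """List every (modality, level, phoneme, attribution, prediction) cell."""
    cells = []
    for (modality, level), phoneme in product(PREDICTIONS, PHONEMES):
        attribution, prediction = PREDICTIONS[(modality, level)]
        cells.append((modality, level, phoneme, attribution, prediction))
    return cells


for cell in design_cells():
    print(cell)
```

Crossing the four modality-by-attribute conditions with the two phonemes gives eight exposure groups, matching the "really X another 2" note on the slide.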
Example of manipulation:
No pen in mouth
Pen in mouth
Results: AudioVisual Modality
[Figure: % SH responses (0 to 100) across the continuum from /s/ through /?sʃ/ to /ʃ/, for ?SH Exposure vs. ?S Exposure, one panel per pragmatic condition]
No Pen in Mouth: Perceptual learning (F(1,68)=6.29, p=.015)
Pen in Mouth: No perceptual learning (F(1,68)=.04, p>.05)
Overall results / Conclusions
Results: Same acoustic signal is handled differently depending on whether it is assumed to be a characteristic pronunciation or an incidental (perhaps transient) one
Main effect of phoneme (SH vs. S), no interaction with modality, significant interaction with Pronunciation attribute.
[Figure: % Perceptual learning effect (%SH resp - %S resp), y-axis 0 to 10, for the Audio and AudioVisual modalities; separate bars for Characteristic pronunciation and Incidental pronunciation]
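One possible reading of the y-axis label above is a difference score: the percentage of SH responses after ?SH exposure minus the percentage of SH responses after ?S exposure. That reading is an assumption, since the slide does not spell out the computation. The function name `learning_effect` and the input values below are hypothetical and chosen only to show the arithmetic.

```python
def learning_effect(pct_sh_after_sh_exposure, pct_sh_after_s_exposure):
    """Difference score in percentage points.

    Positive values mean listeners exposed to the ambiguous sound in
    SH-words gave more SH responses at test than listeners exposed to
    it in S-words (assumed interpretation of the axis label).
    """
    for value in (pct_sh_after_sh_exposure, pct_sh_after_s_exposure):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"percentage out of range: {value}")
    return pct_sh_after_sh_exposure - pct_sh_after_s_exposure


# Invented percentages, for illustration only.
print(learning_effect(55.0, 47.5))
```

Under this reading, an effect near zero corresponds to the "no learning" prediction for incidental pronunciations.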
Converging Evidence: Our work on idiolectal/dialectal STR shows learning for ?s when it is speaker-driven, but not when it is contextually-driven
Conclusion: Perceptual learning is a powerful mechanism applied conservatively.
Pragmatic information plays an immediate role in guiding learning
Thank you
Design Elaboration
[Design tree, identical under each phoneme (?S, ?SH):
Audio: odd 1st, odd 2nd
AudioVisual: No Pen, Pen]