

Rapid accent adaptation and constraints on cross-talker generalization
Kodi Weatherholtz, Linda Liu, and T. Florian Jaeger, University of Rochester
{kweatherholtz, lliu, fjaeger}@bcs.rochester.edu

To achieve robust language understanding, listeners must adjust for variability in speech (e.g., dialect, foreign accent). Listeners can adapt to systematic variability across talkers [1,2]. To the extent that listeners can generalize learning from familiar talkers to new, similar talkers, they can benefit from prior accent experience when encountering a new talker [3]. Such cross-talker generalization requires that listeners distinguish accent-general patterns from talker-specific idiosyncrasies.

Acknowledgments:
This work was supported by an NICHD R01 (HD075797) to TFJ. We are grateful to Ann Bradlow for providing access to the ALLSSTAR corpus for stimulus materials.

References:
[1] Bradlow, A., & Bent, T. (2008). Cognition, 106(2).
[2] Sidaras, S., Alexander, J., & Nygaard, L. (2009). JASA, 125(5).
[3] Kleinschmidt, D., & Jaeger, T. F. (2015). Psychological Review, 122(2).

Previous findings:
Exposure to multiple talkers with the same foreign accent facilitates, and is perhaps necessary for, generalization to new talkers with that accent (Bradlow & Bent, 2008).
By contrast, exposure to a single talker appears to result in talker-specific adaptation (i.e., better performance on the trained talker, but no generalization to new talkers).

Exposure-test paradigm:
Task: Sentence intelligibility in noise (following Bradlow & Bent, 2008), conducted on Mechanical Turk.

Stimuli: Simple declarative sentences, each containing 2-4 keywords, produced by native English or Mandarin-accented English talkers.
- Somebody stole the money.

Exp 1: Replicating foreign accent adaptation via the web

Exp 2 (N = 156): Is multi-talker training necessary for generalization?

[Figure: proportion of keywords correctly transcribed. Left panel: training data by trial (trials 0-80). Right panel: accuracy at test. Conditions: control (5 AmEng), single-talker training, multi-talker training, talker-specific training.]

[Figure: proportion of keywords correctly transcribed. Left panel: training data by trial (trials 0-80). Right panel: accuracy at test. Conditions: control (training = 5 AmEng), talker-specific training.]

Results
- Training: participants who heard foreign-accented speech were initially less accurate (relative to control) but improved with exposure.
- Test: higher accuracy in the talker-specific condition, relative to control, indicating accent adaptation (beyond task adaptation).
- Effect size for talker-specific adaptation ~10%; comparable to B&B 2008.

Training materials. 16 sentences repeated 5 times
- by 5 different native speakers, or
- by 1 Mandarin-accented talker
(80 training trials, cf. 160 in B&B 2008)
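As a rough illustration of the training design above (16 sentences x 5 repetitions = 80 trials), the sketch below builds a trial list. The block structure (one talker per repetition in the multi-talker case), the within-block shuffling, and all names such as `build_training_trials` and the talker labels are assumptions for illustration; the poster does not specify trial ordering.

```python
import random


def build_training_trials(sentences, talkers, n_blocks=5, seed=0):
    """Return a list of trials: every sentence occurs once per block.

    Assumption (not stated on the poster): each block is spoken by one
    talker, cycling through `talkers`, and sentence order is shuffled
    within each block.
    """
    rng = random.Random(seed)
    trials = []
    for block in range(n_blocks):
        talker = talkers[block % len(talkers)]
        order = list(sentences)
        rng.shuffle(order)
        for sentence in order:
            trials.append({"block": block + 1, "talker": talker, "sentence": sentence})
    return trials


# Hypothetical item and talker labels.
sentences = [f"sentence_{i:02d}" for i in range(1, 17)]
control = build_training_trials(sentences, [f"AmEng_{i}" for i in range(1, 6)])
single = build_training_trials(sentences, ["Mandarin_1"])

print(len(control), len(single))  # 80 80
```

With five talkers each block gets its own talker; with one talker the same voice is used in all five blocks, so both conditions yield 80 training trials.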

[Figure: test-phase proportion of keywords correctly transcribed, one panel per test talker, for control (training = 5 AmEng) vs. talker-specific training. Values shown per talker: Non-native 01: −1.6%; Non-native 02: −2.4%; Non-native 03: +8.3%; Non-native 04: +11.4%; Non-native 05: +12.8%; Non-native 06: +16.2%.]

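Conditions from Exp 1:
- Control: training on native (AmEng) talkers x 5; test on a non-native speaker
- Talker-specific: training and test on the same non-native speaker

Conditions:
- Single-talker: training on one non-native speaker; test on a different non-native speaker
- Multi-talker: training on non-native talkers x 5; test on a different non-native speaker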

Test materials. 16 sentences produced by a single Mandarin-accented talker.
- (identity of the test talker varied between subjects; 6 different Mandarin-accented talkers total)

[Figure: example stimulus "somebody stole the money", shown without noise and in noise (+5 signal-to-noise ratio).]
Noise added to avoid ceiling effects.
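To make the noise manipulation concrete, here is a minimal sketch of mixing a signal with noise at a target signal-to-noise ratio. It assumes the "+5" is in dB and is computed from RMS amplitudes; the noise type and the authors' actual mixing procedure are not given on the poster, so the sine-wave "speech" and white noise below are placeholders, and `mix_at_snr` is a hypothetical helper.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that 20*log10(rms(signal)/rms(noise)) equals snr_db,
    then add it to `signal`. Returns (mixture, scaled_noise)."""
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise must not be silent")
    gain = rms(signal) / (noise_rms * 10 ** (snr_db / 20))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(signal, scaled)]
    return mixture, scaled


# Placeholder signal (1 s of a 440 Hz tone at 16 kHz) and white noise.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0, 1) for _ in range(16000)]

mixture, scaled_noise = mix_at_snr(signal, noise, snr_db=5)
achieved_snr = 20 * math.log10(rms(signal) / rms(scaled_noise))
print(round(achieved_snr, 6))
```

A positive SNR means the speech is more intense than the noise, so the sentence stays audible while performance is kept off ceiling.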

Improved performance (adaptation), or not, by test talker:
- For some talkers, performance was at ceiling regardless of exposure.
- For some talkers, exposure provided no reliable benefit, despite lack of a ceiling effect.
- Some talkers were relatively hard to understand, but easy to adapt to.

Conclusions
We found evidence that generalization of accent adaptation is both more robust and less robust than previously claimed (cf. Bradlow & Bent, 2008).

- More robust in that single-talker training can be sufficient for cross-talker generalization (i.e., multi-talker training is not a necessary condition)

- Less robust in that generalization was imperfect (i.e., cross-talker generalization was never as strong as talker-specific training)

Our question
Under what conditions does adaptation to a foreign accent generalize to new talkers with that accent?


Outcome variable: proportion of keywords correctly transcribed.
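The outcome variable can be sketched as follows. The scoring rule here (a keyword counts as correct if it appears in the typed response after lowercasing and stripping punctuation) and the keyword set for the example sentence are assumptions for illustration; the poster does not state the exact scoring criteria, and `keyword_score` and `proportion_correct` are hypothetical names.

```python
import re


def normalize(text):
    """Lowercase and split into word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_score(keywords, transcription):
    """Return (n_correct, n_keywords) for one trial.

    Each response token can match at most one keyword, so a repeated
    keyword must also be repeated in the response to count twice.
    """
    remaining = normalize(transcription)
    n_correct = 0
    for keyword in keywords:
        keyword = keyword.lower()
        if keyword in remaining:
            remaining.remove(keyword)
            n_correct += 1
    return n_correct, len(keywords)


def proportion_correct(trials):
    """Pool keywords over trials: total correct / total keywords."""
    correct = sum(keyword_score(k, r)[0] for k, r in trials)
    total = sum(len(k) for k, _ in trials)
    return correct / total


# Assumed keywords for the example sentence "Somebody stole the money."
keywords = ["somebody", "stole", "money"]
trials = [
    (keywords, "Somebody stole the money."),  # 3 of 3 keywords
    (keywords, "somebody sold the money"),    # 2 of 3 keywords
]
print(round(proportion_correct(trials), 3))  # 0.833
```

Pooling over keywords rather than averaging per-sentence proportions weights sentences by their number of keywords; either choice is defensible, and which one was used here is not stated.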

[Figure: Bradlow & Bent (2008), Figure 3. Performance in RAU on the two post-tests in each of the training conditions in their Experiment 2; error bars represent the standard error of the mean. Source: A.R. Bradlow, T. Bent / Cognition 106 (2008) 707-729, p. 720.]

[Figure annotation: average training benefit by talker.]

[Figure: accuracy at test (proportion of keywords correctly transcribed, 0.00-1.00) for single-talker training and multi-talker training, by item repetition during training: yes (Exp 2) vs. no (Exp 3).]

Exp 3.
Replicated finding of cross-talker generalization in both the single-talker and multi-talker conditions, after removing item repetition during training.

Results
- Cross-talker generalization following both single-talker and multi-talker training
- Strength of generalization: no discernible benefit of multi-talker over single-talker training
- Imperfect generalization: multi-talker < talker-specific (cf. B&B, 2008)


Effect of perceived accent familiarity on test performance

Participants. 95 Mechanical Turkers

[Figure: proportion of keywords correctly transcribed (0.6-0.9) by frequency of exposure to the test talker's accent (subjective report: daily, weekly, monthly, yearly, never). Panels: control (training = 5 AmEng), single-talker training, multi-talker training, talker-specific training.]