1
0 50 100 150 200 250 300 350 Voiced Voiceless Average Vowel Length (ms) Initial Stop Consonant Vowel Length High Engagement Task Low Engagement Task 0 100 200 300 400 500 600 700 800 900 1000 Voiced Voiceless Average F1 (Hz) Initial Stop Consonant F1 Onset High Engagement Task Low Engagement Task 0 50 100 150 200 250 300 350 Voiced Voiceless Average f0 (Hz) Initial Stop Consonant F0 Onset High Engagement … Low Engagement … High Engagement Low Engagement 80 100 120 20 25 30 35 Voiceless Voiced 0.25 0.5 0.75 1 0.25 0.5 0.75 1 Normalized Trial Number (Proportion of total trials) Voice onset time (ms) Stimulus Voicing: Voiceless Voiced Initial VOT: Long Short Voice Onset Time β = 3.55, SE = 1.86, χ 2 (1) = 3.37, p = 0.07 3 β = 1.62, SE = 0.85, χ 2 (1) = 3.31, p = 0.07 3 High Engagement Low Engagement 150 200 250 150 200 250 300 Voiceless Voiced 0.25 0.5 0.75 1 0.25 0.5 0.75 1 Normalized Trial Number (Proportion of total trials) Vowel length (ms) Stimulus Voicing: Voiceless Voiced Initial VL: Long Short Vowel Length β = -8.78, SE = 2.72, χ 2 (1) = 7.82, p = .005 3 β = -6.17, SE = 3.18, χ 2 (1) = 3.51, p = .06 3 β = -3.72, SE = 1.76, χ 2 (1) = 4.43, p = .04 2 High Engagement Low Engagement 150 175 200 225 150 175 200 225 Voiceless Voiced 0.25 0.5 0.75 1 0.25 0.5 0.75 1 Normalized Trial Number (Proportion of total trials) f0 onset (Hz) Stimulus Voicing: Voiceless Voiced Initial f0 onset: High Low Phonetic convergence in an immersive game-based task INTRODUCTION METHOD RESULTS DISCUSSION ACKNOWLEDGEMENTS & REFERENCES Tifani Biro ([email protected]) Department of Speech-Language-Hearing: Sciences and Disorders University of Kansas Joe Toscano ([email protected]) Department of Psychology Villanova University Navin Viswanathan ([email protected]) Department of Speech-Language-Hearing: Sciences and Disorders University of Kansas Statistical Analyses Three linear mixed effects models were used to examine: The effect of trial number, task engagement, initial VOT, and their interactions on the produced VOTs for voiced or voiceless tokens separately. A trial number x task engagement x initial VOT interaction would indicate convergence The effect of initial VOT, trial number, and their interactions on VOTs of voiced or voiceless tokens separately in each individual task. A initial VOT x trial number interaction would indicate convergence The effect of trial number on the VOTs of voiced or voiceless tokens for each of the participants in each task Similar models examined these effects on VL, F1, and f0 Summary of Results Elicited mean VOT were longer than what had been previously reported (voiced VOT: 25 ms; voiceless VOT: 101 ms; Lisker & Abramson, 1964). Lexical factors (Baese-Berk & Goldrick, 2009) and task difficulty (Schertz, 2013) could contribute to these differences However, participants’ productions were not significantly different between the low and high engagement tasks Convergence was observed along certain acoustic dimensions (VOT, VL, F1 onset), but not others (f0). Interestingly, the extent of convergence was affected by the level of task engagement. Overall, subjects in the high engagement task converged more, whereas subjects in the low engagement task were less likely to converge. This finding was especially the case for VL and F1 onset These preliminary results suggest that engaging, naturalistic tasks may yield results that more accurately reflect real-world phonetic variation than traditional laboratory experiments Future Directions Future studies will use these communicative tasks to look at convergence among speakers of different native languages Participants completed either a high- or low-engagement task 30 word-initial voicing minimal pairs provided key information that interlocutors had to provide to each other Phonetic convergence was tracked for 4 acoustic dimensions (VOT, VL, F1, f 0) over the course of the one hour experiment Acknowledgements Thank y ou to our c oders: Anne Marie Crinnion, Sarah Welsh, J acklyn Coelho, N ic ole J ohnson, J ohn Mic hael Kay, Rakshana Selvarajan, and C hristopher Burley ; thanks to David Saltz man and Michael Phelan for help programming the Minec raft puz zles , and thanks to Emma Folk for as sistance with running s ubjec ts. TMB w as s upported by a Villanov a Graduate Student Fellows hip. References B abel, M. (2012). E vidence for phonetic and social selectivity in spontaneous phonetic imitation. Journal of P honetics, 40(1), 177-189. B aese-B erk, M., & Goldrick, M. (2009). Mechanisms of interaction in speech production. Language and cognitive processes, 24(4), 527-554. Buxó-Lugo, A., Toscano, J. C., & Watson, D. G. (2016). Effects of participant engagement on prosodic prominence. Discourse Processes, (just-accepted). Lisker, L., & A bramson, A . S . (1964). A cross-language study of voicing in initial stops: A coustical measurements. Word, 20, 384-422. Olmstead, A. J., Viswanathan, N., Aivar, M. P., & Manuel, S. (2013). Comparison of native and non-native phone imitation by E nglish and S panish speakers. Frontiers in psychology, 4. P ardo, J. S . (2010). E xpressing oneself in conversational interaction. Expressing Oneself/E xpressing One's self: Communication, Cognition, Language, and Identity, 183-196. S chertz, J. (2013). E xaggeration of featural contrasts in clarifications of misheard speech in E nglish. Journal of P honetics, 41(3), 249-263. Mapping Acoustic Signals to Phonetic Categories Stimuli (subset) Phonetic category membership is indicated by multiple dimensions in the acoustic signal. However, acoustic dimensions are variable and context-dependent, which can lead to ambiguities between two speech sounds that only differ in one acoustic attribute. One phenomenon that may affect this variability is phonetic convergence (the observation that speech patterns of interlocutors become more similar during a conversation) Procedure 1. Production Measures The extent talkers can converge on productions outside of their native language range may be limited (Olmstead et al., 2013). What dimensions talkers converge on also varies across studies (Babel, 2012; Pardo, 2010). However, since convergence is a phenomenon occurring conversationally within the natural world, typical laboratory tasks may fail at fully eliciting it 2. Convergence Measures High Engagement Low Engagement 600 700 800 500 550 600 Voiceless Voiced 0.25 0.5 0.75 1 0.25 0.5 0.75 1 Normalized Trial Number (Proportion of total trials) F1 onset (Hz) Stimulus Voicing: Voiceless Voiced Initial F1 onset: High Low β = -25.56, SE = 12.63, χ 2 (1) = 3.99, p = .05 β = 14.18, SE = 6.01, χ 2 (1) = 4.48, p = .03 Factors such as engagement have been found to have an effect on language production (Buxó-Lugo et al., 2016) How might task engagement affect speech production and phonetic convergence? 0 20 40 60 80 100 120 140 160 b d g p t k Average VOT (ms) Stop Consonant VOT Distributions Lisker & Abramson (1964) High Engagement Task Low Engagement Task Natural Setting Conversational partner Less control Highly engaging Lack of interlocutor More control Typically boring Laboratory Setting Voice Onset Time (VOT) Release Burst f1 onset f0 onset Vowel Length /b/ Time Freq Amp Maze Participant 1 Participant 2 1 Listener Talker 2 Talker Listener 3 Listener Talker 4 Talker Listener 5 Listener Talker 6 Talker Listener Voiced Voiceless Velar got cot goat coat ghost coast Alveolar dent tent dip tip dart tart Bilabial bat pat bear pear bark park gap fig goat coat muck luck sigh high rig cap say rare done pat yard doe nun ten bet sad pay heart frown glass try goat muck fig high cap Start End VOT Conversation /b/1 /b/2 /p/1 /p/2 VOT /b/1 /b/2 /p/1/p/2 F1 Onset Fundamental Frequency Onset 1 2 3 3 3 β = -8.67, SE = 3.97, χ 2 (1) = 4.73, p = .03 1 β = -18.77, SE = 7.51, χ 2 (1) = 6.20, p = .01 2 Low Engagement Listener High Engagement Listener Low Engagement Talker High Engagement Talker

Phonetic convergence in an immersive game-based taskwraplab.co/publications/Biro-Toscano-Viswanathan-ASA-Fall2016.pdf · Phonetic convergence in an immersive game-based task INTRODUCTION

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Phonetic convergence in an immersive game-based taskwraplab.co/publications/Biro-Toscano-Viswanathan-ASA-Fall2016.pdf · Phonetic convergence in an immersive game-based task INTRODUCTION

0

50

100

150

200

250

300

350

Voiced Voiceless

Aver

age

Vow

el L

engt

h (m

s)

Initial Stop Consonant

Vowel LengthHigh Engagement TaskLow Engagement Task

0100200300400500600700800900

1000

Voiced Voiceless

Aver

age

F1 (H

z)

Initial Stop Consonant

F1 OnsetHigh Engagement Task

Low Engagement Task

0

50

100

150

200

250

300

350

Voiced Voiceless

Aver

age

f0 (H

z)

Initial Stop Consonant

F0 OnsetHigh Engagement …Low Engagement …

High Engagement Low Engagement

●●

● ●

80

100

120

20

25

30

35

VoicelessVoiced

0.25 0.5 0.75 1 0.25 0.5 0.75 1

Normalized Trial Number (Proportion of total trials)

Voic

e on

set t

ime

(ms)

Stimulus Voicing: ● Voiceless Voiced

Initial VOT: ● ●Long Short

Voice Onset Time

β = 3.55, SE = 1.86,χ2(1) = 3.37, p = 0.073

β = 1.62, SE = 0.85,χ2(1) = 3.31, p = 0.073

High Engagement Low Engagement

●●

●●

●●

●●

●●

●●

●●

150

200

250

150

200

250

300

VoicelessVoiced

0.25 0.5 0.75 1 0.25 0.5 0.75 1

Normalized Trial Number (Proportion of total trials)

Vow

el le

ngth

(ms)

Stimulus Voicing: ● Voiceless Voiced

Initial VL: ● ●Long Short

Vowel Length

β = -8.78, SE = 2.72,χ2(1) = 7.82, p = .0053

β = -6 .17, SE = 3.18,χ2(1) = 3.51, p = .063

β = -3 .72, SE = 1.76,χ2(1) = 4.43, p = .042

High Engagement Low Engagement

150

175

200

225

150

175

200

225

VoicelessVoiced

0.25 0.5 0.75 1 0.25 0.5 0.75 1

Normalized Trial Number (Proportion of total trials)

f0 o

nset

(Hz)

Stimulus Voicing: ● Voiceless Voiced

Initial f0 onset: ● ●High Low

Phonetic convergence in an immersive game-based task

INTRODUCTION

METHOD

RESULTS DISCUSSION

ACKNOWLEDGEMENTS & REFERENCES

Tifani Biro ([email protected])

Department of Speech-Language-Hearing: Sciences and Disorders

University of Kansas

Joe Toscano ([email protected])

Department of PsychologyVillanova University

Navin Viswanathan([email protected])

Department of Speech-Language-Hearing: Sciences and Disorders

University of Kansas

Statistical AnalysesThree linear mixed effects models were used to examine:

The effect of trial number, task engagement, initial VOT, and their interactions on the produced VOTs for voiced or voiceless tokens separately. A trial number x task engagement x initial VOT interaction would indicate convergence

The effect of initial VOT, trial number, and their interactions on VOTs of voiced or voiceless tokens separately in each individual task. A initial VOT x trial number interaction would indicate convergenceThe effect of trial number on the VOTs of voiced or voiceless tokens for each of the participants in each task

Similar models examined these effects on VL, F1, and f0

Summary of ResultsElicited mean VOT were longer than what had been previously reported (voiced VOT: 25 ms; voiceless VOT: 101 ms; Lisker & Abramson, 1964). Lexical factors (Baese-Berk & Goldrick, 2009) and task difficulty (Schertz, 2013) could contribute to these differencesHowever, participants’ productions were not significantly different between the low and high engagement tasks

Convergence was observed along certain acoustic dimensions (VOT, VL, F1 onset), but not others (f0). Interestingly, the extent of convergence was affected by the level of task engagement. Overall, subjects in the high engagement task converged more, whereas subjects in the low engagement task were less likely to converge. This finding was especially the case for VL and F1 onset

These preliminary results suggest that engaging, naturalistic tasks may yield results that more accurately reflect real-world phonetic variation than traditional laboratory experiments

Future DirectionsFuture studies will use these communicative tasks to look at convergence among speakers of different native languages

Participants completed either a high- or low-engagement task30 word-initial voicing minimal pairs provided key information that interlocutors had to provide to each otherPhonetic convergence was tracked for 4 acoustic dimensions (VOT, VL, F1, f0) over the course of the one hour experiment

AcknowledgementsThank you to our coders: Anne Marie Crinnion, Sarah Welsh, Jacklyn Coelho, Nicole Johnson, John Michael Kay, Rakshana Selvarajan, and Christopher Burley; thanks to David Saltzman and Michael Phelan for help programming the Minecraft puzzles, and thanks to Emma Folk for assistance with running subjects. TMB was supported by a Villanova Graduate Student Fellowship.

ReferencesBabel, M. (2012). Evidence for phonetic and social selectivity in spontaneous phonetic

imitation. Journal of Phonetics, 40(1), 177-189.Baese-Berk, M., & Goldrick, M. (2009). Mechanisms of interaction in speech production.

Language and cognitive processes, 24(4), 527-554.Buxó-Lugo, A., Toscano, J. C., & Watson, D. G. (2016). Effects of participant

engagement on prosodic prominence. Discourse Processes, (just-accepted).Lisker, L., & Abramson, A. S. (1964). A cross-language study of voicing in initial stops:

Acoustical measurements. Word, 20, 384-422.Olmstead, A. J., Viswanathan, N., Aivar, M. P., & Manuel, S. (2013). Comparison of

native and non-native phone imitation by English and Spanish speakers. Frontiers in psychology, 4.

Pardo, J. S. (2010). Expressing oneself in conversational interaction. Expressing Oneself/Expressing One's self: Communication, Cognition, Language, and Identity, 183-196.

Schertz, J. (2013). Exaggeration of featural contrasts in clarifications of misheard speech in English. Journal of Phonetics, 41(3), 249-263.

Mapping Acoustic Signals to Phonetic Categories

Stimuli (subset)

Phonetic category membership is indicated by multiple dimensions in the acoustic signal. However, acoustic dimensions are variable and context-dependent, which can lead to ambiguities between two speech sounds that only differ in one acoustic attribute. One phenomenon that may affect this variability is phonetic convergence (the observation that speech patterns of interlocutors become more similar during a conversation)

Procedure

1. Production Measures

The extent talkers can converge on productions outside of their native language range may be limited (Olmstead et al., 2013). What dimensions talkers converge on also varies across studies (Babel, 2012; Pardo, 2010). However, since convergence is a phenomenon occurring conversationally within the natural world, typical laboratory tasks may fail at fully eliciting it

2. Convergence Measures

High Engagement Low Engagement

●●

●● ●

600

700

800

500

550

600

VoicelessVoiced

0.25 0.5 0.75 1 0.25 0.5 0.75 1

Normalized Trial Number (Proportion of total trials)

F1 o

nset

(Hz)

Stimulus Voicing: ● Voiceless Voiced

Initial F1 onset: ● ●High Low

β = -25.56, SE = 12.63,χ2(1) = 3.99, p = .05

β = 14.18, SE = 6.01,χ2(1) = 4.48, p = .03

Factors such as engagement have been found to have an effect on language production (Buxó-Lugo et al., 2016)How might task engagement affect speech production and phonetic convergence?

020406080

100120140160

b d g p t k

Aver

age

VOT

(ms)

Stop Consonant

VOT DistributionsLisker & Abramson (1964)

High Engagement Task

Low Engagement Task

Natural Setting• Conversational partner• Less control• Highly engaging

• Lack of interlocutor•More control• Typically boring

Laboratory Setting

Voice Onset Time (VOT)

Release Burst

f1 onset

f0 onset

Vowel Length

/b/

Time

Freq

Amp

Maze Participant 1 Participant 2

1 Listener Talker

2 Talker Listener

3 Listener Talker

4 Talker Listener

5 Listener Talker

6 Talker Listener

Voiced Voiceless

Velar got cot

goat coatghost coast

Alveolar dent tent

dip tipdart tart

Bilabial bat pat

bear pearbark park

gap fig

goat coat muck

luck sigh

high rig cap

say rare done pat yard

doe nun ten bet sad

pay heart frown glass try

goat muck fig high cap

Start

End

VOT

Conversation

/b/1 /b/2 /p/1 /p/2

VOT

/b/1 /b/2 /p/1 /p/2

F1 Onset Fundamental Frequency Onset

1

2

3

3

3

β = -8.67, SE = 3.97, χ2(1) = 4.73, p = .03

1β = -18.77, SE = 7.51,χ2(1) = 6.20, p = .01

2

Low Engagement Listener High Engagement Listener Low Engagement Talker High Engagement Talker