The role of the tongue and pharynx in enhancement of vowel nasalization: A real-time MRI investigation of French nasal vowels

Christopher Carignan (a, b), Ryan Shosted (b), Maojing Fu (b), Zhi-Pei Liang (b), Bradley Sutton (b)
(a) North Carolina State University; (b) University of Illinois at Urbana-Champaign

INTERSPEECH 2013, August 29


Introduction: nasalization

• Nasal vowels are characterized by some degree of coupling between the nasal and oral cavities.
  ◦ Velopharyngeal coupling, i.e., “nasalization”.

• It is well known that nasalization significantly alters the acoustic spectrum of vowels (Hawkins & Stevens, 1985; Kataoka et al., 2001; Pruthi et al., 2007).

• Aside from acoustic effects such as formant amplitude reduction and bandwidth widening, modeling and acoustic studies suggest that the F1/F2 frequencies are also modulated.

Introduction: effect on F1/F2

• The frequencies are centralized along the F1 dimension (i.e., F1 is raised for high vowels and lowered for low vowels) (Fujimura & Lindqvist, 1971; Diehl et al., 1991; Serrurier & Badin, 2008; Feng & Castelli, 1996).

• The F2 frequency is lowered for all non-back vowels (Serrurier & Badin, 2008; Feng & Castelli, 1996; Carignan, 2013).

Introduction: F1/F2 (cont.)

• Krakow et al. (1988) observed that the F1 variation inherent in nasalization is similar to acoustic changes associated with tongue height and jaw position, and that the effect of nasalization on F1 can be perceived as a change in tongue height.
  ◦ Nasalized high vowels are perceived as lower, nasalized low vowels are perceived as higher.

• Lingual height centralization is also well documented typologically for phonemic nasal vowels.
  ◦ In a variety of languages, under the influence of nasalization, high vowels are transcribed as lower and low vowels are transcribed as higher (Beddor, 1983; Hajek, 1997; Sampson, 1999).

• Delvaux (2009) shows that F2 lowering alone may help trigger the percept of nasality in French vowels.

Vowel nasalization’s problem of ambiguity

• Ambiguity in the acoustic signal
  ◦ Formant changes due to nasalization or changes in oral articulation?

• Ambiguity in perception
  ◦ Formant changes due to nasalization or changes in oral articulation?

• NB: The acoustic signal alone is not enough to determine the respective articulatory contributions to the acoustic realization.

Vowel nasalization → misperception of source?

• Given the perceptual confusion between formant changes due to nasalization and formant changes due to oral articulation, there is likely a tendency for the acoustic centralization of F1 and lowering of F2 (due to nasalization) to be misperceived as oral articulatory changes.

• Following Ohala (1993, 1996), this misperception may lead to consistent, systematic changes in oral articulatory configuration as concomitants of nasal vowel phonologization.

Disambiguating NMF nasal vowels

• Previous research has observed oral articulatory differences between nasal vowels and their oral counterparts in Northern Metropolitan French (NMF).
  ◦ (Straka, 1965; Brichler-Labaeye, 1970; Zerling, 1984; Bothorel et al., 1986; Montagu, 2002; Delvaux et al., 2002; Carignan, 2013)

• Using the AG200 and AG500 EMA systems, Carignan (2013) found that the F1/nasalization discrepancy is not predicted by lingual/labial articulation in some cases.
  ◦ Does pharyngealization play a role in the acoustic manifestation of some NMF nasal vowels?

Methodology

• Real-time MRI using partially separable functions (Liang, 2007)
  ◦ 128 × 128 voxels
  ◦ Voxel resolution: 2.2 mm × 2.2 mm × 8.0 mm
  ◦ ~25 fps for each of four simultaneously recorded slices
  ◦ Four slices:
    ▪ Coronal
    ▪ Oblique (velopharyngeal)
    ▪ Hyperpharyngeal
    ▪ Hypopharyngeal

rtMRI tomography

[Figure: the four imaged slices: coronal, oblique, hyperpharyngeal, hypopharyngeal]

rtMRI video (example)

• Il retape pot parfois.
  ◦ ‘He retypes pot/jar sometimes.’

• /o/
  ◦ Coronal slice

Methodology (cont.)

• Three female NMF speakers
• Word list containing six French lexical items, with the target vowel in an open syllable, preceded by [p]:
  ◦ /pɛ/: paix ‘peace’
  ◦ /pɛ̃/: pain ‘bread’
  ◦ /pa/: papa ‘daddy’
  ◦ /pɑ̃/: paon ‘peacock’
  ◦ /po/: pot ‘pot/jar’
  ◦ /pɔ̃/: pont ‘bridge’

• Vowel boundaries were segmented using the noise-cancelled audio signal.

Methodology: rotation and ROI

• The MR image for each vowel repetition was used to calculate an average MR image for the vowel set.

• Each MR image in a vowel set was rotated and translated with respect to the average MR image to correct for minor movements made by the speaker between repetitions.

• Average MR images for each of the six target vowel sets were used to determine the region of interest (ROI) for each MR slice for that vowel.

• The maximal bounds (i.e., x-y coordinates) of all six vowels determined the bounds of the ROI.
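
The ROI step can be illustrated with a minimal sketch. It assumes that the air cavity of each vowel's average image is already available as a boolean mask; the slides do not say how the cavity was located at this stage, and the function names `cavity_bounds` and `union_roi` are hypothetical. The ROI is then the smallest rectangle containing every vowel's cavity.

```python
import numpy as np

def cavity_bounds(mask):
    """Return (row_min, row_max, col_min, col_max) of the True pixels in a 2-D mask."""
    rows, cols = np.nonzero(mask)
    return rows.min(), rows.max(), cols.min(), cols.max()

def union_roi(masks):
    """Smallest rectangle that contains the cavity of every vowel mask."""
    bounds = np.array([cavity_bounds(m) for m in masks])
    return (int(bounds[:, 0].min()), int(bounds[:, 1].max()),
            int(bounds[:, 2].min()), int(bounds[:, 3].max()))

# Toy data (not from the study): two 8 x 8 masks with hand-placed cavities.
a = np.zeros((8, 8), dtype=bool)
a[2:4, 3:5] = True   # rows 2-3, columns 3-4
b = np.zeros((8, 8), dtype=bool)
b[3:6, 2:4] = True   # rows 3-5, columns 2-3

roi = union_roi([a, b])
print(roi)  # (row_min, row_max, col_min, col_max)
```

With six vowels, the list passed to `union_roi` would simply hold six masks, one per average image.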


Methodology: rotation and ROI

• Example: average coronal slice for NMF01.

• The ROI represents the maximal coordinates of the air cavity observed for the six vowels (e.g., for the coronal slice, those of [ɑ̃], the lowest vowel).

• The image shown is the average MR image for all six vowel sets (used for rotation and translation).

Measure 1: thArea analysis

• The cavity of interest in each MR slice was delineated by hand.

• The intensity values in the air cavity were measured in 8-bit space (values 0–255), and the maximum of the range for each slice was logged as the threshold (τ) for that slice.

• τ was used to convert each MR image into binary space: black (0) for each pixel value at or below τ, and white (1) for each pixel value above τ.

• thArea: number of black pixels in the ROI, multiplied by the in-plane pixel area (2.2 mm × 2.2 mm).
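
As a rough illustration of the thArea measure, the sketch below thresholds an 8-bit image at τ and converts the count of at-or-below-threshold pixels inside the ROI to an area. It assumes the in-plane pixel area is 2.2 mm × 2.2 mm (taken from the stated voxel resolution); the function name `th_area`, the ROI tuple layout, and the toy image are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Assumption: in-plane pixel area from the stated 2.2 mm x 2.2 mm resolution.
PIXEL_AREA_MM2 = 2.2 * 2.2

def th_area(image, roi, tau):
    """Area (mm^2) of pixels at or below threshold tau inside the ROI.

    image: 2-D uint8 array (0-255 intensities)
    roi:   (row_min, row_max, col_min, col_max), bounds inclusive
    tau:   threshold; pixels <= tau count as air (black)
    """
    r0, r1, c0, c1 = roi
    patch = image[r0:r1 + 1, c0:c1 + 1]
    n_air = int(np.count_nonzero(patch <= tau))
    return n_air * PIXEL_AREA_MM2

# Toy image (not study data): bright tissue with a dark 2 x 3 air pocket.
image = np.full((6, 6), 200, dtype=np.uint8)
image[2:4, 2:5] = 30

area = th_area(image, roi=(1, 4, 1, 4), tau=40)
print(f"thArea = {area:.2f} mm^2")
```

Six dark pixels fall inside the ROI here, so the toy area is six times the pixel area.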


Measure 2: pixel-articulator PCA

• An ROI is determined for each vowel.
• Principal Component Analysis (PCA) is used.
  ◦ The intensity value of each pixel in the ROI is included as an input to the PCA model.
  ◦ Reminder: high intensity values are interpreted as flesh, low intensity values are interpreted as air.
  ◦ Thus, positive loadings for PCs are interpreted as flesh entering the frame, and negative loadings for PCs are interpreted as flesh exiting the frame.

• Result: provides a method of reliably interpreting PCs in articulatory terms (what is important?).
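
A minimal sketch of the pixel-based PCA, assuming each frame has already been cropped to the ROI. The slides do not specify the PCA implementation; this version uses a plain SVD of the mean-centred pixel matrix, and the name `pixel_pca` and the toy frames are hypothetical. Because the sign of a principal component is arbitrary, the example orients PC1 so that positive loadings mean increasing intensity (tissue entering the frame).

```python
import numpy as np

def pixel_pca(frames):
    """PCA with one input variable per ROI pixel.

    frames: array of shape (n_frames, height, width)
    Returns (scores, loadings, explained), where loadings[k] is the
    height x width loading map ("heatmap") of component k.
    """
    n, h, w = frames.shape
    X = frames.reshape(n, h * w).astype(float)
    X -= X.mean(axis=0)                      # centre each pixel over frames
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U * S                           # frame-by-component scores
    loadings = Vt.reshape(-1, h, w)          # one loading map per component
    explained = S**2 / np.sum(S**2)          # proportion of variance
    return scores, loadings, explained

# Toy frames (not study data): the top two rows brighten over time,
# as if tissue were moving into the upper part of the ROI.
rng = np.random.default_rng(0)
pattern = np.zeros((4, 4))
pattern[:2, :] = 1.0
frames = np.stack([50 + 10 * i * pattern + rng.normal(0, 0.5, (4, 4))
                   for i in range(10)])

scores, loadings, explained = pixel_pca(frames)

# Component signs are arbitrary: orient PC1 so positive = more tissue.
if loadings[0].sum() < 0:
    loadings[0] *= -1
    scores[:, 0] *= -1

print(np.round(loadings[0], 2))
print(f"PC1 explains {explained[0]:.1%} of the variance")
```

The printed loading map plays the role of the PC1 heatmaps: large positive values mark pixels whose intensity rises as the PC1 score rises.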


Measure 2: pixel-articulator PCA

• PCA, with subsequent analysis of PC1, is sufficient for explaining well-known differences between oral and nasal vowels for the velopharyngeal slice (i.e., closing of the velopharyngeal port for oral vowels).
  ◦ Aside: PC2 also captures opening of the velopharyngeal port and drag of the posterior wall.

• This presentation:
  ◦ PC1 analyses for coronal, hyperpharyngeal slices
  ◦ thArea analysis for hypopharyngeal slice

Coronal slice


Coronal slice, PC1 heatmap

Interpretation, all speakers: tongue raising

[Figure: PC1 loading heatmap for the coronal slice; labels: right, tongue, left]

Coronal slice, PC1 results

• PC1 interpretation for all speakers: tongue raising

[Figure: coronal PC1 results by vowel. Legend: 1: /ɛ/, 2: /ɛ̃/, 3: /a/, 4: /ɑ̃/, 5: /o/, 6: /ɔ̃/]

Carignan (2013)

Hyperpharyngeal slice


Hyperpharyngeal slice, PC1 heatmap

Interpretation, all speakers: tongue retraction

[Figure: PC1 loading heatmap for the hyperpharyngeal slice; labels: posterior, anterior, right]

Hyperpharyngeal slice, PC1 results

• PC1 interpretation for all speakers: tongue retraction

[Figure: hyperpharyngeal PC1 results by vowel. Legend: 1: /ɛ/, 2: /ɛ̃/, 3: /a/, 4: /ɑ̃/, 5: /o/, 6: /ɔ̃/]

Hyperpharyngeal slice, PC1 results

• Horizontal tongue position patterns corroborate EMA data (Carignan, 2013).

[Figure: hyperpharyngeal PC1 results by vowel. Legend: 1: /ɛ/, 2: /ɛ̃/, 3: /a/, 4: /ɑ̃/, 5: /o/, 6: /ɔ̃/]

Correlation: cor. thArea/hypo. thArea

NMF01: r = -0.604, p < 0.001

[Figure: coronal thArea vs. hypopharyngeal thArea, points coded by vowel: /ɛ/, /ɛ̃/, /a/, /ɑ̃/, /o/, /ɔ̃/]

Correlation: cor. thArea/hypo. thArea

NMF02: r = -0.502, p < 0.001

[Figure: coronal thArea vs. hypopharyngeal thArea, points coded by vowel: /ɛ/, /ɛ̃/, /a/, /ɑ̃/, /o/, /ɔ̃/]

Correlation: cor. thArea/hypo. thArea

NMF03: r = -0.213, p < 0.001

[Figure: coronal thArea vs. hypopharyngeal thArea, points coded by vowel: /ɛ/, /ɛ̃/, /a/, /ɑ̃/, /o/, /ɔ̃/]

Correlation: cor. thArea/hypo. thArea

• Negative correlation between coronal thArea and hypopharyngeal thArea.

• Corroborates that low vowels, generally, have greater pharyngeal constriction.
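
For readers who want to reproduce this kind of check, the sketch below computes a Pearson correlation between two thArea series with NumPy. The numbers are invented for illustration only and are not the study's measurements; the variable names are hypothetical.

```python
import numpy as np

# Invented values (mm^2), for illustration only: one pair per vowel token.
coronal_th_area = np.array([100.0, 150.0, 200.0, 250.0, 300.0])
hypo_th_area = np.array([400.0, 380.0, 300.0, 310.0, 250.0])

# Pearson r is the off-diagonal entry of the 2 x 2 correlation matrix.
r = np.corrcoef(coronal_th_area, hypo_th_area)[0, 1]
print(f"r = {r:.3f}")
```

A negative `r` means that tokens with a larger coronal air area tend to have a smaller hypopharyngeal air area. If SciPy is available, `scipy.stats.pearsonr` returns the same coefficient together with a p-value.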


Pharyngeal constriction of NMF nasal vowels

[Figure: hypopharyngeal thArea by vowel. Legend: 1: /ɛ/, 2: /ɛ̃/, 3: /a/, 4: /ɑ̃/, 5: /o/, 6: /ɔ̃/]

• Hypopharyngeal thArea measure.
• LME model with speaker as random effect.
• All oral-nasal differences: p < 0.001

Pharyngeal constriction of NMF nasal vowels

[Figure: hypopharyngeal thArea by vowel. Legend: 1: /ɛ/, 2: /ɛ̃/, 3: /a/, 4: /ɑ̃/, 5: /o/, 6: /ɔ̃/]

• Low vowel pair: less pharyngeal constriction for the nasal vowel (predicted to lower F1)

• Non-low vowel pairs: greater pharyngeal constriction for the nasal vowels (predicted to raise F1)

Pharyngeal constriction of NMF nasal vowels

• Remember: the F1 dimension is centralized under the effect of nasalization.

• A constriction near the glottis will raise F1.
  ◦ Pharyngeal constriction observed for the three NMF oral/nasal vowel pairs is predicted to centralize the F1 dimension for the nasal vowels.

• This finding suggests that hypopharyngeal aperture enhances the acoustic effect of nasalization on F1 frequency.

Lingual retraction for all nasals

• Hyperpharyngeal PC1 (i.e., tongue retraction) is greater for all nasals than their oral counterparts, for all speakers.

[Figure: hyperpharyngeal PC1 by vowel. Legend: 1: /ɛ/, 2: /ɛ̃/, 3: /a/, 4: /ɑ̃/, 5: /o/, 6: /ɔ̃/]

Lingual retraction for all nasals (cont.)

• Hyperpharyngeal thArea is smaller for all nasals than their oral counterparts, for all speakers.

[Figure: hyperpharyngeal thArea by vowel. Legend: 1: /ɛ/, 2: /ɛ̃/, 3: /a/, 4: /ɑ̃/, 5: /o/, 6: /ɔ̃/]

Tongue retraction, F2 and nasalization in NMF

• Generally, global retraction has been observed for NMF nasal vowels compared to their oral counterparts using EMA (Carignan, 2013).

• Remember: F2 is lowered for non-back vowels under the effect of nasalization.

• “Passive” effect of nasalization
  ◦ Lowered velum → “velic” constriction
    ▪ Shosted et al. (2012)
  ◦ Near an anti-node of F2
    ▪ Constriction: lower F2

Tongue retraction, F2 and nasalization in NMF

• Lingual retraction observed for the three NMF nasal vowels is predicted to lower F2.

• This finding suggests that horizontal tongue position enhances the acoustic effect of nasalization on F2 frequency.

• May help explain the perceptual finding from Delvaux (2009), and modeling findings from Serrurier & Badin (2008) and Feng & Castelli (1996).

Conclusion

• Our findings suggest that oral articulatory configurations enhance the effect of nasalization on F1/F2 frequencies in the production of NMF nasal vowels:
  ◦ lingual articulation
  ◦ hypopharyngeal constriction
  ◦ labial rounding/protrusion (EMA: Carignan, 2013)

• The oral articulation of NMF nasal vowels may be due (at least in part) to misperception of the articulatory source of changes in F1/F2 over time, rather than to mere chance.

References

Beddor, P. S., “Phonological and phonetic effects of nasalization on vowel height”, Ph.D. thesis, University of Minnesota, Indiana University Linguistics Club, 1983.
Bothorel, A., Simon, P., Wioland, F., and Zerling, J.-P., Cinéradiographie des voyelles et consonnes du français, Travaux de l’Institut de Phonétique de Strasbourg, 18, 1986.
Brichler-Labaeye, C., Les voyelles françaises. Mouvements et positions articulatoires à la lumière de la radiocinématographie (coll. Bibliothèque française et romane, série A, no. 18), Klincksieck, Paris, 1970.
Carignan, C., “When nasal is more than nasal: The oral articulation of nasal vowels in two dialects of French”, Ph.D. thesis, University of Illinois at Urbana-Champaign, 2013.
Delvaux, V., “Perception du contraste de nasalité vocalique en français”, French Language Studies, 19: 25–29, 2009.
Delvaux, V., Metens, T., and Soquet, A., “French nasal vowels: Acoustic and articulatory properties”, in Proceedings of the Seventh International Conference on Spoken Language Processing, volume 1, pages 53–56, Denver, 2002.
Diehl, R. L., Kluender, K. R., Walsh, M. A., and Parker, E. M., “Auditory enhancement in speech perception and phonology”, in R. R. Hoffman and D. S. Palermo [Eds], Cognition and the symbolic processes, Vol 3: Applied and ecological perspectives, 59–76, Erlbaum, 1991.
Engwall, O., Delvaux, V., and Metens, T., “Interspeaker variation in the articulation of nasal vowels”, Online at: http://www.speech.kth.se/prod/publications/files/1926.pdf, accessed on 11 March 2013.
Feng, G., and Castelli, E., “Some acoustic features of nasal and nasalized vowels: a target for vowel nasalization”, Journal of the Acoustical Society of America, 99(6): 3694–3706, 1996.
Fujimura, O., and Lindqvist, J., “Sweep-tone measurements of vocal-tract characteristics”, Journal of the Acoustical Society of America, 49: 541–558, 1971.
Hajek, J., Universals of Sound Change in Nasalization, Blackwell, 1997.
Hawkins, S., and Stevens, K. N., “Acoustic and perceptual correlates of the non-nasal–nasal distinction for vowels”, Journal of the Acoustical Society of America, 77(4): 1560–1575, 1985.
Kataoka, R., Warren, D., Zajac, D. J., Mayo, R., and Lutz, R. W., “The relationship between spectral characteristics and perceived hypernasality in children”, Journal of the Acoustical Society of America, 109(5): 2181–2189, 2001.
Krakow, R. A., Beddor, P. S., and Goldstein, L. M., “Coarticulatory influences on the perceived height of nasal vowels”, Journal of the Acoustical Society of America, 83(3): 1146–1158, 1988.
Liang, Z.-P., “Spatiotemporal imaging with partially separable functions”, Biomedical Imaging: From Nano to Macro, 2007. ISBI 2007, 4th IEEE International Symposium on Biomedical Imaging, 988–991, 2007.
Montagu, J., “L’articulation labiale des voyelles nasales postérieures du français: Comparaison entre locuteurs français et anglo-américains”, in XXIVèmes Journées d’Étude sur la Parole, pages 253–256, Nancy, 2002.
Ohala, J. J., “The phonetics of sound change”, in Jones, C. [Ed], Historical linguistics: Problems and perspectives, 237–78, Longman, 1993.
Pruthi, T., Espy-Wilson, C. Y., and Story, B. H., “Simulation and analysis of nasalized vowels based on magnetic resonance imaging data”, Journal of the Acoustical Society of America, 121: 3858–3873, 2007.
Sampson, R., Nasal vowel evolution in Romance, Oxford University Press, 1999.
Serrurier, A., and Badin, P., “A three-dimensional articulatory model of the velum and nasopharyngeal wall based on MRI and CT data”, Journal of the Acoustical Society of America, 123(4): 2335–2355, 2008.
Shosted, R., Carignan, C., and Rong, P., “Managing the distinctiveness of phonemic nasal vowels: Articulatory evidence from Hindi”, Journal of the Acoustical Society of America, 131(1): 455–465, 2012.
Straka, G., Album phonétique, L’Université de Laval, Québec, 1965.
Zerling, J. P., “Phénomènes de nasalité et de nasalization vocaliques: Étude cinéradiographique pour deux locuteurs”, Travaux de l’Institut de Phonétique de Strasbourg, 16: 241–266, 1984.