Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder

Stephen Jannetts and Felix Schaeffler

31/08/2015, 11th Pan-European Voice Conference


Page 2

Voice in connected speech

Requires:
- Initiation of phonation
- Maintenance of phonation
- Termination of phonation

… in quick succession and at specific points in time

Page 3

Voice in connected speech

- Initiation of phonation
- Maintenance of phonation
- Termination of phonation

Voice Problems

Page 4

Voice in connected speech

- Initiation of phonation
- Maintenance of phonation
- Termination of phonation

(Gordon & Ladefoged, 2001)

Voice Problems

Page 5

Clinical acoustic assessment

- Focused on phonation maintenance
- Uses sustained vowels to exclude confounds
- Initial and final portions of the vowel are excluded

→ Phonation initiation and termination are not taken into account

Page 6

Clinical acoustic assessment

- This approach has been criticised for poor validity (e.g. Takahashi & Koike, 1976; Hammarberg et al., 1980; Askenfelt & Hammarberg, 1986; Maryn et al., 2010; Maryn & Roy, 2012; Choi et al., 2012)
- Complex transitions in connected speech could be a rich source of clinical information
- Mechanical consequences of inflammation or tension could be most evident at voice onset
- Initiation and termination are rarely differentiated, even when connected speech is used (see e.g. Vocal Rise Time, Baken & Orlikoff, 2000, p. 129)

Page 7

Phonation Stabilisation Time

- Acoustic approach to phonation initiation
- Uses connected speech
- Does not require manual segmentation

Page 8

PST based on autocorrelation

Schaeffler et al. (2015): http://www.icphs2015.info/pdfs/Papers/ICPHS0331.pdf

[Figure: autocorrelation coefficient plotted against time (s). Marked on the plot: onset of voicing, voicing threshold (.45), stable periodicity threshold (.91).]

Page 9

PST based on CPP

[Figure: Cepstral Peak Prominence plotted against time (s). Marked on the plot: onset of voicing, voicing threshold (.45), stable periodicity threshold (23.14 dB).]

Page 10

PST is a duration

PST

Page 11

PST based on CPP – keeping things robust

[Figure: Cepstral Peak Prominence plotted against time (s). Marked on the plot: onset of voicing, voicing threshold (.45), stable periodicity threshold (23.14 dB), and a 70 ms interval.]
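The threshold logic shown on the PST slides can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes per-frame voicing and CPP values are already available, and that the voicing track is an autocorrelation coefficient (the slides show the .45 voicing threshold in parentheses on the CPP plot). It uses a hypothetical 10 ms frame step and treats the 70 ms marker as a minimum voiced-segment duration, although the slides do not say what the 70 ms refers to. The function name and all parameter names are invented for illustration.

```python
from typing import List, Optional


def stabilisation_times(
    voicing: List[float],       # per-frame voicing measure (assumed: autocorrelation)
    periodicity: List[float],   # per-frame periodicity measure (assumed: CPP in dB)
    frame_ms: int = 10,         # ASSUMPTION: frame step, not given on the slides
    voicing_thr: float = 0.45,  # voicing threshold from the slides
    stable_thr: float = 23.14,  # stable periodicity threshold (dB) from the slides
    min_voiced_ms: int = 70,    # ASSUMPTION: 70 ms read as minimum segment length
) -> List[Optional[int]]:
    """Return one PST (ms) per voiced segment, or None if the segment
    never reaches the stable periodicity threshold."""
    results: List[Optional[int]] = []
    n = len(voicing)
    i = 0
    while i < n:
        if voicing[i] < voicing_thr:
            i += 1
            continue
        start = i                      # onset of voicing
        while i < n and voicing[i] >= voicing_thr:
            i += 1
        end = i                        # first frame after the voiced segment
        if (end - start) * frame_ms < min_voiced_ms:
            continue                   # segment too short to evaluate
        pst: Optional[int] = None
        for j in range(start, end):
            if periodicity[j] >= stable_thr:
                pst = (j - start) * frame_ms  # time from onset to stable phonation
                break
        results.append(pst)
    return results
```

With the same function, the autocorrelation-based variant would pass the autocorrelation track as `periodicity` and set `stable_thr=0.91`.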

Page 12

PST - Research Questions

Q1 - Can PST differentiate normal and disordered voices?

Q2 - Can PST detect cases that are below pathological thresholds for sustained vowels?

Page 13

Material

KayPENTAX Disordered Voice Database

- Sustained vowel: stable portion of sustained [a], including 22 MDVP parameters (shimmer, jitter etc.)
- Connected speech: 12 s section of the ‘rainbow passage’
- Voices are categorised as ‘normal’ and ‘pathological’

Page 14

All samples

         Normal  Disordered  Total
Female       31         191    222
Male         21         121    142
Total        52         312    364

Page 15

Samples below threshold

(Figures in parentheses are the group sizes across all samples.)

         Normal    Disordered  Total
Female   30 (31)   20 (191)       50
Male     15 (21)   17 (121)       32
Total    45        37             82

Page 16

Procedure

PST calculated using CPP

Variables
- PST mean per sample (PST M)
- PST standard deviation per sample (PST SD)
- Percentage of voiced segments that reached criterion (Seg%)
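As a rough sketch of how the three per-sample variables could be derived, the snippet below summarises a list of per-segment PST values. This is an illustration only. It assumes that segments which never reach criterion are recorded as `None` and that PST SD is the sample standard deviation; the slides do not specify either. The function name is hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List, Optional


def summarise_sample(psts: List[Optional[float]]) -> Dict[str, float]:
    """Summarise one speech sample.

    psts holds one PST value per voiced segment; None marks a segment
    that did not reach the stable periodicity criterion (ASSUMPTION).
    """
    reached = [p for p in psts if p is not None]
    nan = float("nan")
    return {
        # PST M: mean PST over segments that reached criterion
        "pst_m": mean(reached) if reached else nan,
        # PST SD: sample standard deviation (ASSUMPTION), needs >= 2 values
        "pst_sd": stdev(reached) if len(reached) >= 2 else nan,
        # Seg%: percentage of voiced segments that reached criterion
        "seg_pct": 100.0 * len(reached) / len(psts) if psts else nan,
    }
```

For example, `summarise_sample([40, 60, None, 80])` gives a PST M of 60, a PST SD of 20.0 and a Seg% of 75.0.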

Page 17

Results: all voices

Female disordered voices:
- significantly longer duration PST M (U=728.5, p<0.001)
- significantly larger SD of PST (U=502, p<0.001)
- Seg% significantly lower (U=557, p<0.001)

Male disordered voices:
- significantly longer duration PST M (U=221.5, p<0.001)
- significantly larger SD of PST (U=200, p<0.001)
- Seg% significantly lower (U=140, p<0.001)
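The U values above suggest Mann-Whitney U tests, although the slides do not name the test. As a minimal sketch of what a U statistic measures, the function below counts, over all pairs of values from two groups, how often a value from the first group exceeds one from the second, with ties counted as one half. The function name is hypothetical, and no p-value is computed here.

```python
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U_x, U_y) by direct pair counting.

    U_x is the number of (a, b) pairs with a from x and b from y where
    a > b, with each tie contributing 0.5. U_x + U_y == len(x) * len(y).
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y
```

The smaller of the two values is commonly the one reported. Half-integer results such as U=728.5 can only arise when some values are tied across groups.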

Page 18

Results: below threshold

Means (SD) for female ‘below threshold’

         Normal           Disordered
PST M    50.84 (11.39)*   70.28 (27.15)*
PST SD   23.49 (7.89)**   52.13 (28.45)**
Seg%     93.57 (6.31)*    79.02 (23.68)*

Means (SD) for male ‘below threshold’

         Normal           Disordered
PST M    45.88 (9.52)*    77.86 (33.03)*
PST SD   26.94 (9.99)*    49.31 (25.15)*
Seg%     95.58 (4.35)**   76.63 (22.05)**

* p < 0.005; ** p < 0.001

Page 19

[Figure: Male below threshold. Scatter plot of Mean PST (axis range 0 to 180) against PST SD (axis range 0 to 120), with disordered and normal voices marked separately.]


Page 21

[Figure: Female below threshold. Scatter plot of Mean PST (axis range 20 to 140) against PST SD (axis range 0 to 120), with disordered and normal voices marked separately.]


Page 23

Hypothesis confirmed

- PST was significantly longer in all disordered voice groups
- PST is a potentially useful parameter for the analysis of disordered voices
- Even for voices without pathological findings in sustained vowels
- May be particularly relevant for mild or early-stage voice disorders? Or for a certain type?

Page 24

Remaining questions and work to be done

- Categorisation and diagnostic labelling: is PST more useful for a specific voice disorder or symptom?
- Segmental context?
- Algorithm tweaking and streamlining the process.

Page 25

Thank you!

Page 26

References

Gordon, M. & Ladefoged, P. (2001). Phonation types: a cross-linguistic overview. Journal of Phonetics, 29(4), 383–406.

Maryn, Y., Roy, N., De Bodt, M., Van Cauwenberge, P. & Corthals, P. (2009). Acoustic measurement of overall voice quality: a meta-analysis. The Journal of the Acoustical Society of America, 126(5), 2619–34.

Askenfelt, A. G. & Hammarberg, B. (1986). Speech waveform perturbation analysis: a perceptual-acoustical comparison of seven measures. Journal of Speech and Hearing Research, 29(1), 50–64.

Choi, S. H. et al. (2012). The effect of segment selection on acoustic analysis. Journal of Voice, 26(1), 1–7.

Hammarberg, B. et al. (1980). Perceptual and acoustic correlates of abnormal voice qualities. Acta Oto-Laryngologica, 90(5-6), 441–51.

Maryn, Y. & Roy, N. (2012). Sustained vowels and continuous speech in the auditory-perceptual evaluation of dysphonia severity. Jornal da Sociedade Brasileira de Fonoaudiologia, 24(2), 107–12.

Maryn, Y. et al. (2010). Toward improved ecological validity in the acoustic measurement of overall voice quality: combining continuous speech and sustained vowels. Journal of Voice, 24(5), 540–55.

Takahashi, H. & Koike, Y. (1976). Some perceptual dimensions and acoustical correlates of pathologic voices. Acta Oto-Laryngologica. Supplementum, 338, 1–24.

Baken, R. J. & Orlikoff, R. F. (2000). Clinical Measurement of Speech and Voice. San Diego: Singular Publishing Group.

Schaeffler, F., Jannetts, S. & Beck, J. (2015). Phonation Stabilisation Time as an Indicator of Voice Disorder. ICPhS [accepted].