ProZed: an Editor for the Automatic Processing of Prosodic Variation C. AURAN, C. BOUZON & D.J. HIRST Laboratoire Parole et Langage CNRS UMR6057 Université de Provence


Page 1: ProZed: an Editor for the Automatic Processing of Prosodic Variation

ProZed: an Editor for the Automatic Processing of Prosodic Variation

C. AURAN, C. BOUZON & D.J. HIRST

Laboratoire Parole et Langage, CNRS UMR6057

Université de Provence

Page 2: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Summary

1. Prosodic systems

• Prosody as a multidimensional macro-system

• Levels of representation

2. ProZEd

• General conceptions

• Demonstrations (a few modules)

- Long sound file fragmentation, speaker separation

- Duration manipulation

- Silence detection and fragmentation

- MOMEL-INTSINT coding

- Phonological resynthesis

3. Perspectives

Page 3: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Prosodic systems

Page 4: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Prosody as a macro-system

• Prosody seen as consisting of 3 systems (Di Cristo 2001):

- Tonal system

- Temporal system

- Metrical system

• Intimate interactions between elements from these 3 systems

• Complex relations between the acoustic, the phonetic and the phonological levels

"Prosody" does not mean "intonation"

Page 5: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Orthogonal dimensions

• Tonal and temporal systems make use of 2 orthogonal dimensions (Ladd 1996, Di Cristo et al. 2003 and forthcoming):

- Linear dimension (tonal sequences, syllable length distribution, …)

- Frame dimension (register level and span, downtrends, tempo, …)

Both dimensions play a major part in the organisation of discourse and the linguistic characterisation of dialects (ref.)

Page 6: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Levels of representation (1)

• 4 levels of representation (cf. Hirst et al. 2000):

0. Physical level (acoustic data)

1. Phonetic level (continuous quantitative variables)

2. Surface phonological level (abstract qualitative characteristics)

3. Underlying phonological level

• Interpretability constraint → local interpretation in relation with adjacent levels

• Mapping:

- between level 0 and level 1: phonetic representation

- between level 1 and level 2: surface phonological representation

Page 7: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Levels of representation (2)

• Phonetic representation:

• Temporal system: unit alignment with the speech signal

• Tonal system: quadratic spline modelling of fundamental frequency (MOMEL algorithm)
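The quadratic spline idea can be sketched as follows. This is a minimal illustration, not the MOMEL implementation itself: it assumes that target points are already available as (time, F0) pairs with strictly increasing times, and that each pair of adjacent targets is joined by two parabolas meeting at the temporal midpoint, with a flat slope at each target. The function name `quadratic_spline` and the choice to hold the nearest value outside the target range are assumptions made for the example.

```python
from typing import List, Tuple

Target = Tuple[float, float]  # (time in seconds, F0 in Hz)


def quadratic_spline(targets: List[Target], t: float) -> float:
    """Interpolate F0 at time t from targets sorted by strictly increasing time."""
    if not targets:
        raise ValueError("at least one target point is required")
    # Outside the target range, hold the nearest target value (assumption).
    if t <= targets[0][0]:
        return targets[0][1]
    if t >= targets[-1][0]:
        return targets[-1][1]
    for (t1, h1), (t2, h2) in zip(targets, targets[1:]):
        if t1 <= t <= t2:
            span = t2 - t1
            if t <= t1 + span / 2:
                # First half: parabola leaving target 1 with zero slope.
                x = (t - t1) / span
                return h1 + 2 * (h2 - h1) * x * x
            # Second half: parabola arriving at target 2 with zero slope.
            x = (t2 - t) / span
            return h2 - 2 * (h2 - h1) * x * x
    raise ValueError("targets must be sorted by time")
```

Sampling this function at regular time steps between the first and last target gives a smooth curve that is monotonic between adjacent targets.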

Page 8: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Levels of representation (3)

• Surface phonological representation:

• Temporal system: categorical coding (--, -, , +, ++)

- Base dimension: raw segment duration

- Frame dimension: tempo factor on raw segment duration

• Tonal system: INTSINT coding of MOMEL targets (M, T, B, L, H, U, D)

- Purely formal coding (≠ ToBI but cf. narrow IPA transcription)

- Base dimension + frame dimensions (register level, register span, declination effect)

Page 9: ProZed: an Editor for the Automatic Processing of Prosodic Variation

INTSINT: base dimension

• Absolute tones: T (Top), M (Mid), B (Bottom)

• Relative tones:

- non-iterative: H (Higher), S (Same), L (Lower)

- iterative: U (Up), D (Down)
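To make the absolute/relative distinction concrete, here is a hedged sketch that turns a string of INTSINT symbols into F0 targets. The slides give no numeric rules, so everything numeric here is an assumption for illustration: the parameters `key` (a mid pitch in Hz) and `span` (a range in octaves), the geometric-mean rules for the relative tones, and the use of `key` as the reference before the first target. The function name `intsint_to_f0` is hypothetical.

```python
import math
from typing import List


def intsint_to_f0(tones: str, key: float, span: float) -> List[float]:
    """Map INTSINT symbols to F0 values in Hz under illustrative rules.

    key:  assumed mid pitch of the speaker, in Hz
    span: assumed pitch range, in octaves
    """
    top = key * 2 ** (span / 2)     # T: half the span above the key
    bottom = key / 2 ** (span / 2)  # B: half the span below the key
    values: List[float] = []
    previous = key  # assumed reference before the first target
    for tone in tones:
        if tone == "T":
            value = top
        elif tone == "M":
            value = key
        elif tone == "B":
            value = bottom
        elif tone == "S":
            value = previous
        elif tone == "H":  # non-iterative: large step towards the top
            value = math.sqrt(previous * top)
        elif tone == "L":  # non-iterative: large step towards the bottom
            value = math.sqrt(previous * bottom)
        elif tone == "U":  # iterative: smaller step up
            value = math.sqrt(previous * math.sqrt(previous * top))
        elif tone == "D":  # iterative: smaller step down
            value = math.sqrt(previous * math.sqrt(previous * bottom))
        else:
            raise ValueError(f"unknown INTSINT symbol: {tone!r}")
        values.append(value)
        previous = value
    return values
```

Absolute tones ignore the preceding target, whereas relative tones are computed from it, which is why the same symbol can yield different F0 values at different points in a sequence.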

Page 10: ProZed: an Editor for the Automatic Processing of Prosodic Variation

INTSINT: Frame dimension

• Downdrift

[Figure: INTSINT-coded target sequence M T L H L H L H B; vertical axis from 0 to 200]

• Register level and register span codings (cf. Portes & Di Cristo 2003)

Page 11: ProZed: an Editor for the Automatic Processing of Prosodic Variation

ProZEd

Page 12: ProZed: an Editor for the Automatic Processing of Prosodic Variation

General conceptions (1)

ProZEd: "Prosodic Editor"

• Multi-functional

• Preliminary processing (file segmentation, speaker separation, …)

• Specific processing (duration processing, silence detection, intonation processing, resynthesis, …)

• "Theory independent" (cf. Mixdorff's work)

• Multi-platform (Praat, Perl), freeware and open source (GPL)

Page 13: ProZed: an Editor for the Automatic Processing of Prosodic Variation

General conceptions (2)

ProZEd: Representation levels

Reversible mapping (for intonation):

0. Physical level

1. Phonetic level

2. Surface phonological level

[Diagram: mappings between adjacent levels, labelled MOMEL, INTSINT, INT2PHO, QSP and MBROLA]

Page 14: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Demonstrations

Long sound file fragmentation

Duration manipulation

Silence detection and fragmentation

MOMEL-INTSINT coding

Phonological resynthesis

[ Launch ProZEd ]

Page 15: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Perspectives

Page 16: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Perspectives

• Improved modelling of duration (z-score method)

• Automatic generation of both XML and (more easily) human-readable data sheets (polymetrical expressions for instance)

Ex.: _<M>(nV, <H>)(TIN, <BU>)_

• New modules for:

- automatic pseudo-segment detection and processing (IRIT's Vocalis software)

- automatic complementary information extraction

- automatic alignment using iterative DTW (Di Cristo & Hirst 1997)
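The z-score method mentioned in the first perspective can be sketched as below. This is only an illustration of the general idea of normalising each segment's duration against the mean and standard deviation of segments with the same label; the slides do not specify which statistics or reference corpus ProZEd would use. The function name `duration_z_scores`, the per-label grouping, the population standard deviation, and the fallback of 0.0 when a label has no variation are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Segment = Tuple[str, float]  # (segment label, duration in milliseconds)


def duration_z_scores(segments: List[Segment]) -> List[float]:
    """Return one z-score per segment, computed within each label."""
    by_label: Dict[str, List[float]] = defaultdict(list)
    for label, duration in segments:
        by_label[label].append(duration)
    # Mean and population standard deviation of durations for each label.
    stats = {
        label: (mean(durations), pstdev(durations))
        for label, durations in by_label.items()
    }
    scores: List[float] = []
    for label, duration in segments:
        mu, sigma = stats[label]
        # A label with no variation carries no information: score it 0.0.
        scores.append(0.0 if sigma == 0 else (duration - mu) / sigma)
    return scores
```

A positive score marks a segment that is longer than is typical for its label, a negative score one that is shorter, which makes durations comparable across segment types.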

Page 17: ProZed: an Editor for the Automatic Processing of Prosodic Variation

Thank you for your attention

Presentation available from

www.lpl.univ-aix.fr/~EPGA/

(ProZEd modules also available shortly…)