Functional Data Analysis for Speech Research

Michele Gubian, Radboud University Nijmegen, The Netherlands

London, March 24th 2010; Cambridge, March 26th 2010

Content

What and why Functional Data Analysis (FDA)

Motivation

Case study 1

Case study 2 – pitch re-synthesis

How to use FDA

Using the R package ‘fda’

Motivation

Analyzing curves

PCA

ANOVA

Linear models

[Figure: curves reduced to the features dur and ext. Example values: dur = 58, 48, 98; ext = 2.8, 3.8, 2.9. Axes: dur, ext]

Problems

[Figure: curves reduced to points in the dur/ext plane]

Decide what are the important features of a curve using

models

intuition / trial and error

However

Those features may not capture all the relevant dynamic aspects

e.g. concavity/convexity

long-range correlations
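A minimal sketch of the concavity/convexity problem, using made-up curves and a hypothetical helper `dur_ext` (not from the slides): two curves with identical duration and extent but opposite curvature collapse onto the same feature point.

```python
import numpy as np

def dur_ext(t, y):
    """Reduce a sampled curve to two scalar features: duration and extent (max - min)."""
    return float(t[-1] - t[0]), float(y.max() - y.min())

t = np.linspace(0.0, 1.0, 101)
convex = t ** 2          # rises slowly at first, then fast
concave = np.sqrt(t)     # rises fast at first, then slowly

features_convex = dur_ext(t, convex)
features_concave = dur_ext(t, concave)

# Largest pointwise difference between the two curves.
max_gap = float(np.max(np.abs(convex - concave)))
```

Both curves map to the same (dur, ext) pair, although they differ substantially along the way; the shape difference is invisible to any analysis that sees only those two features.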

Problems (2)

[Figure: curves reduced to points in the dur/ext plane]

Identify those feature points

manually

(semi)automatically

However

The identification may be hard, even ill-posed

time consuming

risk of subjective judgment

Analyzing curves with FDA

[Figure: curves analyzed directly with Functional Data Analysis instead of through the dur/ext features]

All the information contained in the curve (dynamics) is used

No need to reduce a curve to a set of significant features

No need to introduce assumptions on what is relevant in a curve shape and what is not

FDA provides both VISUAL and QUANTITATIVE results

input is curves, output is also curves

plus classic statistical output like p-values, confidence intervals

Functional Data Analysis: an extension of (some) statistical techniques to the domain of functions

Example

CLASSIC

Ask people: How old are you? How much do you earn?

Each data point is a point in 2D

[Figure: scatter plot of salary against age]

FDA

Record people's salaries through the years

Each “data point” is a whole CURVE

[Figure: salary curves plotted against age]

Case study

Diphthong vs. hiatus in Spanish

/ja/ vs. /i.a/ contrast is unstable in European Spanish

Diachronically, in Romance languages /i.a/ becomes /ja/

Diatopically, in Latin American Spanish the contrast seems to be lost

It is not present in orthography (“ia” in either case)

No strict minimal pairs

Investigate

Consistent realization of the contrast

Inter-speaker variation

Cues used in the realization

Cues

Cue         DIPHTHONG /ja/    HIATUS /i.a/
Duration    short             long
Formants    f1, f2            f1, f2
Pitch       f0                f0

Example diphthong

Example hiatus

Dataset

Read speech

Diphthong

‘Emiliana no, …’ /e.mi.lja.na#no#.../ (‘Not Emiliana, …’)

Hiatus

‘Mi liana no, …’ /mi#li.a.na#no#.../ (‘Not my liana, …’)

9 speakers (gender balanced)

20 repetitions per speaker per type

In total 365 utterances

Duration

Pitch

Pitch was extracted from the beginning of /l/ to the end of the rising gesture

In Spanish the pitch rising peak falls beyond the accented syllable

[Figure: pitch extraction interval, segment labels "lja" vs "li a"]

The raw data

[Figure: raw curves by speaker, /ja/ vs /i.a/]

FDA data preparation

Each sampled curve has to be turned into a function

Decide how much detail to retain (smoothing)

FDA data preparation (2)

All functions will be obtained by a combination of so-called basis functions, usually B-splines

All functions will be linearly stretched in time to become of equal duration

[Figure: functional representation of a sampled curve as a combination of B-spline basis functions]
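The two preparation steps can be sketched as follows. This is an illustration in Python with SciPy, not the R ‘fda’ workflow used in the talk; the helper `to_function`, the knot count and the synthetic curves are all assumptions. Each sampled curve is linearly rescaled to the interval [0, 1] and then fitted with a least-squares cubic B-spline; the number of interior knots controls how much detail is retained.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

def to_function(t, y, n_interior_knots=6, degree=3):
    """Fit a least-squares B-spline to one sampled curve on normalized time [0, 1]."""
    # Linear time normalization: map [t[0], t[-1]] onto [0, 1].
    u = (t - t[0]) / (t[-1] - t[0])
    # Fewer interior knots give a smoother curve (less detail retained).
    interior = np.linspace(0.0, 1.0, n_interior_knots + 2)[1:-1]
    knots = np.r_[[0.0] * (degree + 1), interior, [1.0] * (degree + 1)]
    return make_lsq_spline(u, y, knots, k=degree)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)

# Two synthetic "pitch" curves with different durations in seconds (made-up data).
t_short = np.linspace(0.0, 0.20, 40)
t_long = np.linspace(0.0, 0.35, 70)
y_short = 100 + 20 * np.sin(np.pi * t_short / 0.20) + rng.normal(0, 1, t_short.size)
y_long = 100 + 20 * np.sin(np.pi * t_long / 0.35) + rng.normal(0, 1, t_long.size)

f_short = to_function(t_short, y_short)
f_long = to_function(t_long, y_long)

# After the fit, both curves share the same [0, 1] axis and can be compared point by point.
curves = np.vstack([f_short(grid), f_long(grid)])
```

Each fitted object is a function that can be evaluated at any point of the normalized time axis, which is what the later analyses operate on.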

Classic Principal Component Analysis (PCA)

[Figure: scatter plot of salary against age (25 to 65) with the principal axes PC1 and PC2]

Functional PCA on pitch contours

PCA does not know about labels !!

[Figures: PC1 and PC2 of the pitch contours]
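A rough sketch of the idea, assuming curves already resampled on a common, evenly spaced time grid. Ordinary PCA on the sampled values then approximates functional PCA; this is not the basis-expansion computation performed by the ‘fda’ package. The function name `functional_pca` and the synthetic contours are illustrative assumptions. The labels are never passed to the PCA step.

```python
import numpy as np

def functional_pca(curves, n_components=2):
    """PCA on curves sampled on a common grid (rows = curves, columns = time points)."""
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # SVD of the centered data: rows of vt are the principal component curves.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return mean_curve, vt[:n_components], scores, explained[:n_components]

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 50)
labels = np.repeat([0, 1], 30)   # two categories, never passed to the PCA

# Synthetic contours (made-up data): category 1 has its peak later in time.
peak = np.where(labels == 0, 0.45, 0.65) + rng.normal(0, 0.03, labels.size)
curves = np.exp(-((grid[None, :] - peak[:, None]) ** 2) / 0.05)
curves += rng.normal(0, 0.02, curves.shape)

mean_curve, components, scores, explained = functional_pca(curves)

# Labels are used only afterwards, to inspect how the PC1 scores split by category.
pc1_gap = abs(scores[labels == 0, 0].mean() - scores[labels == 1, 0].mean())
```

Each principal component is a curve over time, and each input curve gets one score per component; plotting the scores coloured by label shows whether the unsupervised components separate the categories.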

Functional PCA on formants

[Figures: PC1 and PC2 of the formant curves (f1, f2)]
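One simple way to treat f1 and f2 jointly, sketched here under the assumption that both tracks are sampled on the same normalized grid, is to concatenate their samples into one long vector per token and run PCA on that. All values below are made up, and no rescaling between f1 and f2 is applied, which a real analysis might need.

```python
import numpy as np

rng = np.random.default_rng(2)
n_curves, n_points = 40, 30
grid = np.linspace(0.0, 1.0, n_points)

# Synthetic formant tracks in Hz (made-up values): one hidden factor moves f1 and f2 together.
factor = rng.normal(0, 1, n_curves)
f1 = 400 + 300 * grid[None, :] + 40 * factor[:, None] * grid[None, :]
f2 = 2200 - 800 * grid[None, :] - 100 * factor[:, None] * grid[None, :]
f1 += rng.normal(0, 5, f1.shape)
f2 += rng.normal(0, 5, f2.shape)

# Treat each (f1, f2) pair as one two-dimensional curve by concatenating the samples.
joint = np.hstack([f1, f2])
centered = joint - joint.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Each principal component is itself a pair of curves: an f1 part and an f2 part.
pc1_f1, pc1_f2 = vt[0, :n_points], vt[0, n_points:]
pc1_scores = u[:, 0] * s[0]
explained_pc1 = s[0] ** 2 / np.sum(s ** 2)
```

Because a single score drives both parts of the component, PC1 describes a coordinated change in f1 and f2 rather than two separate effects.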

Cues coordination

[Figure panels: Duration vs formants; Duration vs pitch]

Summary

FDA provides tools to extract relevant dynamic characteristics of a set of curves

Traditional tools like PCA (and linear regression) are extended to curves

Functional PCA revealed the main dynamic cues used in the realization of a (weak) contrast in Spanish

Without using the label information

Without extracting features from the curves (e.g. peaks)

Combining multi-dimensional curves (formants) without effort

References

Functional Data Analysis website:

www.functionaldata.org

Books:

Software:

a bilingual (R and MATLAB) tool is freely available online

Appendix

Functional linear models

y(t) = a(t) + b(t) x

diphthong, x = 0

hiatus, x = 1

Confidence intervals for a(t) and b(t)

R²(t) = percentage of explained variance
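A pointwise sketch of this model, assuming curves sampled on a common grid and synthetic data. Ordinary least squares is solved separately at every time point, which is a simplification of the basis-expansion fit done in the ‘fda’ package. Confidence intervals are not computed here.

```python
import numpy as np

rng = np.random.default_rng(3)
grid = np.linspace(0.0, 1.0, 50)
x = np.repeat([0.0, 1.0], 25)            # 0 = diphthong, 1 = hiatus

# Synthetic curves (made-up data): the categories differ only in the second half of the time axis.
true_a = 100 + 10 * grid
true_b = 8 * np.clip(grid - 0.5, 0.0, None)
y = true_a[None, :] + x[:, None] * true_b[None, :] + rng.normal(0, 1, (x.size, grid.size))

# Design matrix [1, x]; least squares is solved for all time points at once.
design = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
a_hat, b_hat = coef                      # each holds one value per time point

fitted = design @ coef
ss_res = np.sum((y - fitted) ** 2, axis=0)
ss_tot = np.sum((y - y.mean(axis=0)) ** 2, axis=0)
r2 = 1.0 - ss_res / ss_tot               # R^2(t): share of variance explained at each time point
```

With a binary x, a(t) is the mean diphthong curve and b(t) is the difference between the hiatus and diphthong means, so b(t) shows where along the time axis the two categories diverge and R²(t) shows how much of the variation that difference accounts for.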
