Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone
Joseph Tabrikian, Dept. of Electrical and Computer Engineering, Ben-Gurion University of the Negev (BGU)
Workshop on Speech Enhancement and Multichannel Audio Processing, Technion, 22.2.2007


Page 1: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Joseph Tabrikian
Dept. of Electrical and Computer Engineering

Ben-Gurion University of the Negev

Workshop on: Speech Enhancement and Multichannel Audio Processing

Technion 22.2.2007

BGU

Page 2: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Outline

• Motivation
• Single source pitch estimation and tracking
• Multiple source pitch estimation and tracking
• Experiments
• Conclusion

Page 3: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Motivation

• Speech enhancement
• Sensitivity of many audio processing algorithms to interference. For example:
    • Automatic speech/speaker recognition
    • Speech/music compression
• Single microphone blind source separation (BSS)
• Karaoke

Page 4: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

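Single Source - Modeling

Voiced frames - harmonic model:

$$y(t_n) = \sum_{k=1}^{K} b_k \cos(k\omega t_n + \phi_k) + v(t_n), \qquad n = 1, \ldots, N$$

where $\omega$ is the fundamental (pitch) frequency and $v(t_n)$ is additive Gaussian noise.

In matrix notation:

$$\mathbf{y} = \mathbf{A}(\omega)\,\mathbf{b} + \mathbf{v}, \qquad \mathbf{v} \sim N(\mathbf{0}, \mathbf{R}_v)$$

$$\mathbf{A}(\omega) = \begin{bmatrix}
1 & \cos\omega t_1 & \cdots & \cos K\omega t_1 & \sin\omega t_1 & \cdots & \sin K\omega t_1 \\
1 & \cos\omega t_2 & \cdots & \cos K\omega t_2 & \sin\omega t_2 & \cdots & \sin K\omega t_2 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
1 & \cos\omega t_N & \cdots & \cos K\omega t_N & \sin\omega t_N & \cdots & \sin K\omega t_N
\end{bmatrix}$$

$$\mathbf{b} = \begin{bmatrix} b_0 & b_{c1} & \cdots & b_{cK} & b_{s1} & \cdots & b_{sK} \end{bmatrix}^T$$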
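The harmonic model in matrix form can be illustrated with a short NumPy sketch. It builds a matrix with one constant column, K cosine columns and K sine columns, and synthesizes one voiced frame. The sampling rate, frame length, pitch, amplitudes and the use of white noise in place of the colored noise are all assumptions chosen for illustration; `harmonic_matrix` is a hypothetical helper name.

```python
import numpy as np

def harmonic_matrix(omega, t, K):
    """Build A(omega): a column of ones, K cosine columns, K sine columns."""
    k = np.arange(1, K + 1)                  # harmonic indices 1..K
    phase = np.outer(t, k) * omega           # phase[n, k-1] = k * omega * t_n
    return np.hstack([np.ones((len(t), 1)), np.cos(phase), np.sin(phase)])

# Assumed frame parameters (not from the slides): 8 kHz, 256 samples, 200 Hz pitch.
fs, N, K = 8000.0, 256, 5
t = np.arange(N) / fs
omega_true = 2 * np.pi * 200.0

rng = np.random.default_rng(0)
b = rng.standard_normal(2 * K + 1)           # [b_0, b_c1..b_cK, b_s1..b_sK]
A = harmonic_matrix(omega_true, t, K)
y = A @ b + 0.1 * rng.standard_normal(N)     # white noise stands in for v here
```

Each row of the matrix corresponds to one sample time, so the frame is linear in the amplitude vector and nonlinear only in the pitch.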

Page 5: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

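Single Source – Pitch Tracking

Maximum Likelihood (ML) estimator of the pitch frequency $\omega$:

$$\hat{\omega} = \arg\max_{\omega} \left\| \mathbf{P}_{\mathbf{R}_v^{-1/2}\mathbf{A}}(\omega)\, \mathbf{R}_v^{-1/2}\mathbf{y} \right\|^2$$

$$\mathbf{P}_{\mathbf{R}_v^{-1/2}\mathbf{A}}(\omega) = \mathbf{R}_v^{-1/2}\mathbf{A}(\omega)\left(\mathbf{A}^H(\omega)\,\mathbf{R}_v^{-1}\mathbf{A}(\omega)\right)^{-1}\mathbf{A}^H(\omega)\,\mathbf{R}_v^{-1/2}$$

Pitch tracking:

• The data vector at the mth frame:

$$\mathbf{y}_m = \mathbf{A}(\omega_m)\,\mathbf{b}_m + \mathbf{v}_m, \qquad m = 1, \ldots, M$$

• $\{\omega_m\}_{m=1}^{M}$ - first-order Markov process:

$$f(\omega_1, \ldots, \omega_M) = \prod_{m=1}^{M} f(\omega_m \mid \omega_{m-1})$$

• Maximum A-posteriori Probability (MAP) pitch tracking via the Viterbi algorithm (Tabrikian-Dubnov-Dickalov 2004)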
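A minimal sketch of the ML pitch estimator as a grid search: for each candidate pitch, the whitened data are projected onto the whitened harmonic subspace and the projected energy is recorded. The candidate grid, frame parameters and amplitudes are assumptions, and `ml_pitch` and `harmonic_matrix` are hypothetical helper names. Any whitening matrix W with W^T W equal to the inverse noise covariance yields the same criterion, so an inverse Cholesky factor is used here in place of the symmetric square root.

```python
import numpy as np

def harmonic_matrix(omega, t, K):
    k = np.arange(1, K + 1)
    phase = np.outer(t, k) * omega
    return np.hstack([np.ones((len(t), 1)), np.cos(phase), np.sin(phase)])

def ml_pitch(y, t, K, freqs_hz, Rv=None):
    """Grid-search ML pitch: maximize the projected energy of the whitened frame."""
    N = len(y)
    # W satisfies W^T W = Rv^{-1}; identity when the noise is assumed white.
    W = np.eye(N) if Rv is None else np.linalg.inv(np.linalg.cholesky(Rv))
    yw = W @ y
    scores = np.empty(len(freqs_hz))
    for i, f in enumerate(freqs_hz):
        Aw = W @ harmonic_matrix(2 * np.pi * f, t, K)
        Q, _ = np.linalg.qr(Aw)               # orthonormal basis of span(Aw)
        scores[i] = np.sum((Q.T @ yw) ** 2)   # squared norm of the projection
    return freqs_hz[np.argmax(scores)], scores

# Assumed test frame: 200 Hz pitch, fixed amplitudes, weak white noise.
fs, N, K = 8000.0, 256, 5
t = np.arange(N) / fs
b = np.array([0.0, 1.0, 0.8, 0.6, 0.4, 0.2, 0.5, 0.4, 0.3, 0.2, 0.1])
rng = np.random.default_rng(0)
y = harmonic_matrix(2 * np.pi * 200.0, t, K) @ b + 0.05 * rng.standard_normal(N)

freqs = np.arange(100.0, 400.0, 1.0)
f_hat, scores = ml_pitch(y, t, K, freqs)
```

The per-frame `scores` are the quantities a tracker would consume. The QR factorization avoids forming the explicit inverse inside the projection matrix.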

Page 6: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

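Single Source - Voicing Decision

Unvoiced model: colored Gaussian noise:

$$\mathbf{y} \sim N(\mathbf{0}, \mathbf{R}_y)$$

Voiced/unvoiced decision by the Generalized Likelihood Ratio Test (GLRT):

$$\mathrm{GLRT} = \frac{\max_{\omega,\,\mathbf{b},\,\sigma_v^2} f(\mathbf{y} \mid \omega, \mathbf{b}, \sigma_v^2;\, H_{voiced})}{\max_{\sigma_y^2} f(\mathbf{y} \mid \sigma_y^2;\, H_{unvoiced})} \;\underset{H_{unvoiced}}{\overset{H_{voiced}}{\gtrless}}\; \gamma$$

where $\gamma$ is the decision threshold. After maximization, the statistic depends on the data through $\mathbf{y}$ and the projection residual $(\mathbf{I} - \mathbf{P}(\hat{\omega}))\,\mathbf{y}$.

(Fisher-Tabrikian-Dubnov 2006)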
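The voicing test can be sketched for a simplified case. The code below assumes white noise with unknown variance under both hypotheses, which is a simplification of the colored-noise model on the slide. Under that assumption the GLRT is a monotone function of the ratio between the total frame energy and the smallest residual energy left after projecting out a harmonic subspace. The threshold value, frame parameters and function names are hypothetical.

```python
import numpy as np

def harmonic_matrix(omega, t, K):
    k = np.arange(1, K + 1)
    phase = np.outer(t, k) * omega
    return np.hstack([np.ones((len(t), 1)), np.cos(phase), np.sin(phase)])

def glrt_statistic(y, t, K, freqs_hz):
    """Total energy divided by the minimum residual energy over the pitch grid."""
    total = np.sum(y ** 2)
    best_residual = total
    for f in freqs_hz:
        Q, _ = np.linalg.qr(harmonic_matrix(2 * np.pi * f, t, K))
        residual = total - np.sum((Q.T @ y) ** 2)   # energy outside the subspace
        best_residual = min(best_residual, residual)
    return total / best_residual

# Assumed frames: one voiced (200 Hz harmonics plus noise), one noise only.
fs, N, K = 8000.0, 256, 5
t = np.arange(N) / fs
rng = np.random.default_rng(0)
b = np.array([0.0, 1.0, 0.8, 0.6, 0.4, 0.2, 0.5, 0.4, 0.3, 0.2, 0.1])
voiced = harmonic_matrix(2 * np.pi * 200.0, t, K) @ b + 0.1 * rng.standard_normal(N)
unvoiced = 0.1 * rng.standard_normal(N)

freqs = np.arange(100.0, 400.0, 2.0)
threshold = 3.0                                      # hypothetical value
stat_voiced = glrt_statistic(voiced, t, K, freqs)
stat_unvoiced = glrt_statistic(unvoiced, t, K, freqs)
is_voiced = stat_voiced > threshold
```

In practice the threshold would be set from a target false-alarm rate rather than fixed by hand.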

Page 7: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

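Multiple Sources

ML estimator of $\omega$ from $\{\mathbf{y}_j\}_{j=1}^{J}$ under the model

$$\mathbf{y}_j = \mathbf{a}(\omega)\, s_j + \mathbf{v}_j$$

with unknown signal and unknown (Gaussian) noise covariance:

$$\hat{\omega}_{ML} = \arg\max_{\omega}\; -\sum_{l=1}^{L-1} \left[ \log \max\!\left(\lambda_l(\mathbf{G}), \sigma^2\right) + \lambda_l(\mathbf{G}) \max\!\left(\lambda_l(\mathbf{G}), \sigma^2\right)^{-1} \right]$$

$$\mathbf{G} = \mathbf{T}_A^T \mathbf{R}_y \mathbf{T}_A, \qquad \mathbf{T}_A = \mathrm{svd}\!\left(\mathbf{I} - \mathbf{a}\mathbf{a}^T\right), \qquad \mathbf{T}_A : L \times (L-1)$$

where $\lambda_l(\mathbf{G})$ is the $l$th eigenvalue of $\mathbf{G}$.

For $\sigma^2 \to 0$:

$$\hat{\omega} = \arg\max_{\omega}\; -\sum_{l=1}^{L-1} \log \lambda_l(\mathbf{G})$$

which, for $J \ge L$, is the MVDR estimator:

$$\hat{\omega} = \arg\max_{\omega}\; \left[\mathbf{a}^T \mathbf{R}_y^{-1} \mathbf{a}\right]^{-1}$$

(Harmanci-Tabrikian-Krolik 2000)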
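The link between the log-determinant criterion and MVDR can be checked numerically. For a unit-norm steering vector a and an orthonormal basis T of its orthogonal complement, det(T^T R T) = det(R) · a^T R^{-1} a, so the negative log-determinant of the projected covariance differs from the log MVDR output only by a constant that does not depend on a. The sketch below verifies this with random data; the dimensions and helper names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
L, J = 6, 40                                   # assumed sizes, with J >= L
Ydata = rng.standard_normal((L, J))
R = Ydata @ Ydata.T / J                        # sample covariance, full rank

def complement_basis(a):
    """L x (L-1) orthonormal basis of the subspace orthogonal to a."""
    a = a / np.linalg.norm(a)
    U, _, _ = np.linalg.svd(np.eye(len(a)) - np.outer(a, a))
    return U[:, : len(a) - 1]                  # singular vectors with value 1

def neg_logdet_G(a, R):
    T = complement_basis(a)
    return -np.linalg.slogdet(T.T @ R @ T)[1]

def log_mvdr(a, R):
    a = a / np.linalg.norm(a)
    return -np.log(a @ np.linalg.solve(R, a))  # log of 1 / (a^T R^{-1} a)

# The difference between the two criteria should be the same for every a.
diffs = np.array([
    neg_logdet_G(a, R) - log_mvdr(a, R)
    for a in rng.standard_normal((5, L))
])
```

Because the offset is constant, both criteria are maximized by the same candidate vector.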

Page 8: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

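Multiple Sources

Voiced model:

$$\mathbf{y} = \mathbf{A}(\omega)\,\mathbf{b} + \mathbf{v}, \qquad \mathbf{v} \sim N(\mathbf{0}, \mathbf{R}_v)$$

• $\mathbf{v}$ includes other interferences.
• $\mathbf{R}_v$ is unknown.
• Using $J$ overlapping subframes of size $L_s$ ($2K+1 < J < L_s$):

$$\mathbf{R}_y = \frac{1}{J}\,\mathbf{Y}\mathbf{Y}^T$$

jth column of $\mathbf{Y}$:

$$\begin{bmatrix} y_j & y_{j+1} & \cdots & y_{j+N-J} \end{bmatrix}^T$$

$$\hat{\omega}_{ML} = \arg\max_{\omega}\; -\sum_{j=1}^{J} \log \lambda_j\!\left(\mathbf{G}(\omega)\right)$$

$$\mathbf{G}(\omega) = \frac{1}{J}\,\mathbf{Y}^T\!\left(\mathbf{I} - \mathbf{U}_A(\omega)\mathbf{U}_A^T(\omega)\right)\mathbf{Y}, \qquad \mathbf{A}(\omega) = \mathbf{U}_A(\omega)\,\boldsymbol{\Lambda}_A(\omega)\,\mathbf{V}_A^T(\omega)$$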
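A sketch of the subframe-based criterion, under assumptions: the frame is split into J overlapping subframes that form the columns of a data matrix, the harmonic subspace of each candidate pitch is projected out, and the negative log-determinant of the resulting J x J matrix is used as the score. The two-source test signal, its amplitudes and all parameter values are hypothetical. A QR factorization supplies the orthonormal basis, since it spans the same subspace as the left singular vectors of the harmonic matrix.

```python
import numpy as np

def harmonic_matrix(omega, t, K):
    k = np.arange(1, K + 1)
    phase = np.outer(t, k) * omega
    return np.hstack([np.ones((len(t), 1)), np.cos(phase), np.sin(phase)])

def subframe_matrix(y, J):
    """Columns are J overlapping subframes of length Ls = N - J + 1."""
    Ls = len(y) - J + 1
    return np.column_stack([y[j : j + Ls] for j in range(J)])

def neg_logdet_criterion(Y, t_sub, K, freqs_hz):
    J = Y.shape[1]
    scores = np.empty(len(freqs_hz))
    for i, f in enumerate(freqs_hz):
        U, _ = np.linalg.qr(harmonic_matrix(2 * np.pi * f, t_sub, K))
        Yp = Y - U @ (U.T @ Y)                 # (I - U U^T) Y
        G = Yp.T @ Yp / J                      # equals Y^T (I - U U^T) Y / J
        scores[i] = -np.linalg.slogdet(G)[1]   # minus the sum of log eigenvalues
    return scores

# Assumed mixture: strong source at 200 Hz, weaker source at 270 Hz, weak noise.
fs, N, J, K = 8000.0, 400, 60, 5
t = np.arange(N) / fs
rng = np.random.default_rng(0)

def harmonic_source(f0, amp):
    phases = rng.uniform(0, 2 * np.pi, K)
    return sum(amp * np.cos(2 * np.pi * k * f0 * t + phases[k - 1])
               for k in range(1, K + 1))

y = harmonic_source(200.0, 1.0) + harmonic_source(270.0, 0.3)
y = y + 0.01 * rng.standard_normal(N)

Y = subframe_matrix(y, J)
Ls = Y.shape[0]
freqs = np.arange(150.0, 351.0, 1.0)
scores = neg_logdet_criterion(Y, t[:Ls], K, freqs)
f_hat = freqs[np.argmax(scores)]
```

The global maximum is expected at the pitch of the stronger source. A secondary peak near the weaker source's pitch would also be expected. The example keeps J no larger than Ls - (2K+1) so that the projected matrix stays full rank.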

Page 9: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Multiple Sources

Pitch tracking:

• The data vector at the mth frame:

$$\mathbf{y}_m = \mathbf{A}(\omega_m)\,\mathbf{b}_m + \mathbf{v}_m, \qquad m = 1, \ldots, M$$

• $\{\omega_m\}_{m=1}^{M}$ - first-order Markov process
• Maximum A-posteriori Probability (MAP) pitch tracking via the Viterbi algorithm
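MAP tracking over a discrete pitch grid can be sketched with a standard Viterbi recursion. The inputs are per-frame log-likelihoods over S candidate pitches and a first-order Markov transition matrix. The toy likelihood table, the transition penalty and the uniform prior below are assumptions for illustration, not values from the talk.

```python
import numpy as np

def viterbi_map(loglik, log_trans, log_prior):
    """MAP state sequence.

    loglik[m, s]    : log-likelihood of frame m for candidate pitch s
    log_trans[i, j] : log f(state j at frame m | state i at frame m-1)
    log_prior[s]    : log prior of the first frame
    """
    M, S = loglik.shape
    delta = log_prior + loglik[0]
    back = np.zeros((M, S), dtype=int)
    for m in range(1, M):
        cand = delta[:, None] + log_trans        # cand[i, j]: come from i, go to j
        back[m] = np.argmax(cand, axis=0)        # best predecessor for each j
        delta = cand[back[m], np.arange(S)] + loglik[m]
    path = np.empty(M, dtype=int)
    path[-1] = np.argmax(delta)
    for m in range(M - 1, 0, -1):                # backtrack
        path[m - 1] = back[m, path[m]]
    return path

# Toy example: frame 2 slightly prefers state 2, but jumps are penalized.
loglik = np.array([[0.0, -5.0, -5.0],
                   [0.0, -5.0, -5.0],
                   [-1.0, -5.0, 0.0],
                   [0.0, -5.0, -5.0]])
idx = np.arange(3)
log_trans = -3.0 * np.abs(idx[:, None] - idx[None, :])
log_trans = log_trans - np.log(np.sum(np.exp(log_trans), axis=1, keepdims=True))
log_prior = np.full(3, np.log(1.0 / 3.0))

path = viterbi_map(loglik, log_trans, log_prior)
framewise = np.argmax(loglik, axis=1)
```

In the toy example the frame-by-frame maximum jumps to another state at frame 2, while the tracked path stays on one state because the transition model penalizes large pitch jumps.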

Page 10: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

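Multiple Sources - Voicing Decision

Unvoiced model: colored Gaussian noise:

$$\mathbf{y} \sim N(\mathbf{0}, \mathbf{R}_y)$$

Voiced/unvoiced decision by the GLRT:

$$\mathrm{GLRT} = \frac{\max_{\omega,\,\mathbf{b},\,\mathbf{R}_v} \prod_{j=1}^{J} f(\mathbf{y}_j \mid \omega, \mathbf{b}, \mathbf{R}_v;\, H_{voiced})}{\max_{\mathbf{R}_y} \prod_{j=1}^{J} f(\mathbf{y}_j \mid \mathbf{R}_y;\, H_{unvoiced})} \;\underset{H_{unvoiced}}{\overset{H_{voiced}}{\gtrless}}\; \gamma$$

where $\gamma$ is the decision threshold.

(Fisher-Tabrikian-Dubnov 2007)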

Page 11: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

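Multiple Source Models

• Exact ML for the strongest voiced signal, and “locally ML” (LML) for other voiced signals

[Figure: likelihood function, with its peaks marked as the ML estimate of the first (strongest) source and the LML estimate of the second source.]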
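The "locally ML" idea can be illustrated by picking local maxima of a likelihood curve: the highest peak is taken as the estimate for the strongest source and the next-highest local peak as the estimate for a second source. This is a sketch of one plausible reading. The synthetic curve, the peak-picking rule and the function name are assumptions.

```python
import numpy as np

def local_maxima(freqs, loglik, n_sources):
    """Return the n_sources highest interior local maxima, strongest first."""
    i = np.arange(1, len(loglik) - 1)
    is_peak = (loglik[i] > loglik[i - 1]) & (loglik[i] >= loglik[i + 1])
    peaks = i[is_peak]
    ranked = peaks[np.argsort(loglik[peaks])[::-1]]   # sort peaks by height
    return freqs[ranked[:n_sources]]

# Synthetic curve with a dominant peak at 200 Hz and a lower one at 270 Hz.
freqs = np.arange(150.0, 351.0, 1.0)
loglik = np.maximum(-((freqs - 200.0) / 10.0) ** 2,
                    -3.0 - ((freqs - 270.0) / 10.0) ** 2)

estimates = local_maxima(freqs, loglik, n_sources=2)
```

With a real likelihood curve, a minimum peak height or minimum peak separation would likely be needed to reject spurious local maxima.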

Page 12: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Experiments – Single Source


Page 13: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Experiments - Two Sources

[Figure: Two voiced sources. Normalized log-likelihood (vertical axis, -90 to 0) versus frequency [Hz] (horizontal axis, 150 to 350 Hz).]

Page 14: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Experiments – Voicing Decision


Page 15: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Experiments – Voicing Decision

Page 16: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Conclusions

• ML pitch estimators for single and multiple sources have been developed under the harmonic model for voiced frames.
• The likelihood functions derived under the two models allow implementation of the Viterbi algorithm for MAP pitch tracking.
• The GLRT for the voicing decision has been derived under the two models.
• Future work:
    • Development of multiple hypothesis tracking methods for single microphone BSS.
    • Adaptive estimation of the number of harmonics.