Audio Demixing with Decorrelation, Cross Cancellation, Normalization, and Regularization

Sean Webster
Mentors: Ernie Esser, Jack Xin

The Problem

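$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{11} * & a_{12} * \\ a_{21} * & a_{22} * \end{pmatrix} \begin{pmatrix} s_1 \\ s_2 \end{pmatrix}$$

$$x_1 = a_{11} * s_1 + a_{12} * s_2$$

$$x_2 = a_{21} * s_1 + a_{22} * s_2$$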
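The mixing model can be sketched numerically. The snippet below is a minimal illustration, assuming that `*` denotes discrete convolution and that each `a_ij` is a short finite filter; the filter taps and source samples are hypothetical values chosen only to make the example concrete.

```python
import numpy as np

def mix(A, s1, s2):
    """Return (x1, x2) with x_i = a_i1 * s1 + a_i2 * s2, where * is convolution."""
    x1 = np.convolve(A[0][0], s1) + np.convolve(A[0][1], s2)
    x2 = np.convolve(A[1][0], s1) + np.convolve(A[1][1], s2)
    return x1, x2

# Hypothetical two-tap filters and four-sample sources (illustration only).
A = [[np.array([1.0, 0.5]), np.array([0.3, 0.2])],
     [np.array([0.4, 0.1]), np.array([1.0, 0.6])]]
s1 = np.array([1.0, -1.0, 2.0, 0.0])
s2 = np.array([0.5, 0.0, -1.0, 1.0])

x1, x2 = mix(A, s1, s2)
```

All filters share one length here so that the two convolution outputs in each sum have the same size. With one-tap filters the model reduces to an instantaneous (scalar) mixture.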

Partial Inversion

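$$x = A * s$$

$$A^{-1} = \frac{1}{\det A} \begin{pmatrix} a_{22} * & -a_{12} * \\ -a_{21} * & a_{11} * \end{pmatrix}$$

$$A^{-1} x = s$$

$$\begin{pmatrix} a_{22} * & -a_{12} * \\ -a_{21} * & a_{11} * \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \det A * \begin{pmatrix} s_1 \\ s_2 \end{pmatrix} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$$

$$v_1 = a_{22} * x_1 - a_{12} * x_2$$

$$v_2 = -a_{21} * x_1 + a_{11} * x_2$$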
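A quick numerical check of the partial inversion: applying the adjugate of A to the mixtures should return each source convolved with det A = a11 * a22 − a12 * a21, with no division needed. This is a sketch assuming `*` is discrete convolution; the integer-valued filters and sources are hypothetical and are used so that the comparison is exact.

```python
import numpy as np

conv = np.convolve

# Hypothetical integer filters and sources (illustration only).
a11, a12 = np.array([2, 1]), np.array([1, 0])
a21, a22 = np.array([1, 1]), np.array([3, 1])
s1 = np.array([1, -2, 0, 4])
s2 = np.array([0, 3, 1, -1])

# Mixtures x = A * s.
x1 = conv(a11, s1) + conv(a12, s2)
x2 = conv(a21, s1) + conv(a22, s2)

# Partial inversion: apply the adjugate of A only.
v1 = conv(a22, x1) - conv(a12, x2)
v2 = -conv(a21, x1) + conv(a11, x2)

# det A as a filter: a11 * a22 - a12 * a21.
det_A = conv(a11, a22) - conv(a12, a21)
```

Here `v1` equals `det_A` convolved with `s1`, and `v2` equals `det_A` convolved with `s2`, so each output contains a single source, filtered but no longer mixed.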

Decorrelation

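$$E[s_1(t)\, s_2(t-n)] = 0$$

$$E[v_1(t)\, v_2(t-n)] = 0$$

$$-a_{22}a_{21}E[x_1(t)x_1(t-n)] + a_{12}a_{21}E[x_2(t)x_1(t-n)] + a_{22}a_{11}E[x_1(t)x_2(t-n)] - a_{12}a_{11}E[x_2(t)x_2(t-n)] = 0$$

$$C^n_{ij} = E[x_i(t)\, x_j(t-n)]$$

$$-a_{22}a_{21}C^n_{11} + a_{12}a_{21}C^n_{21} + a_{22}a_{11}C^n_{12} - a_{12}a_{11}C^n_{22} = 0$$

$$\begin{pmatrix} a_{22} & a_{12} \end{pmatrix} \begin{pmatrix} -C^n_{11} & C^n_{12} \\ C^n_{21} & -C^n_{22} \end{pmatrix} \begin{pmatrix} a_{21} \\ a_{11} \end{pmatrix} = 0$$

$$u^T C^n w = 0$$

with $u = (a_{22}, a_{12})^T$, $w = (a_{21}, a_{11})^T$, and $C^n$ in the last line denoting the signed 2×2 matrix above.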
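The bilinear form can be checked against a direct estimate of E[v1(t) v2(t − n)]. The sketch below assumes the instantaneous case (scalar coefficients), replaces expectations with sample averages, and takes u = (a22, a12) and w = (a21, a11) so that the product matches the expanded four-term equation. The function names, coefficients, and random test signals are hypothetical.

```python
import numpy as np

def lagged_corr(xi, xj, n):
    """Sample estimate of E[xi(t) xj(t - n)] for a lag n >= 0."""
    T = len(xi)
    return np.mean(xi[n:] * xj[:T - n])

def decorrelation_residual(x1, x2, a, n):
    """u^T C^n w for scalar coefficients a = (a11, a12, a21, a22)."""
    a11, a12, a21, a22 = a
    u = np.array([a22, a12])
    w = np.array([a21, a11])
    # Signed correlation matrix at lag n.
    Cn = np.array([[-lagged_corr(x1, x1, n), lagged_corr(x1, x2, n)],
                   [lagged_corr(x2, x1, n), -lagged_corr(x2, x2, n)]])
    return u @ Cn @ w

# Hypothetical mixtures and trial coefficients (illustration only).
rng = np.random.default_rng(0)
x1 = rng.standard_normal(1000)
x2 = rng.standard_normal(1000)
a = (1.0, 0.4, 0.3, 1.0)
a11, a12, a21, a22 = a

v1 = a22 * x1 - a12 * x2
v2 = -a21 * x1 + a11 * x2
```

For any lag, `decorrelation_residual(x1, x2, a, n)` agrees with `lagged_corr(v1, v2, n)` up to rounding, which is the identity the matrix form expresses. Demixing then amounts to choosing coefficients that drive this residual toward zero over many lags.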

l1 Normalization Constraint

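$$\sigma^2 \left( \|u\|_{l_1}^2 - 1 \right)^2 = 0$$

$$\sigma^2 \left( \|w\|_{l_1}^2 - 1 \right)^2 = 0$$

$$F = \sum_n \left| u^T C^n w \right|^2 + \sigma^2 \left( \|u\|_{l_1}^2 - 1 \right)^2 + \sigma^2 \left( \|w\|_{l_1}^2 - 1 \right)^2 = 0$$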
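The role of the normalization terms can be illustrated with a small objective function. Without them, u = 0 or w = 0 trivially satisfies every decorrelation equation; the penalties rule that out by pushing the l1 norms of u and w toward 1. The sketch below assumes one reading of the slide, namely a weight of σ² on each penalty and a squared l1 norm inside the parentheses; the function name and the test matrices are hypothetical.

```python
import numpy as np

def objective(u, w, C_list, sigma):
    """Decorrelation misfit plus l1 normalization penalties (assumed form)."""
    # Sum over lags of |u^T C^n w|^2.
    misfit = sum((u @ C @ w) ** 2 for C in C_list)
    # Penalties vanish exactly when the l1 norms of u and w equal 1.
    pen_u = sigma ** 2 * (np.sum(np.abs(u)) ** 2 - 1.0) ** 2
    pen_w = sigma ** 2 * (np.sum(np.abs(w)) ** 2 - 1.0) ** 2
    return misfit + pen_u + pen_w

# Hypothetical signed correlation matrices for two lags (illustration only).
C_list = [np.array([[-1.0, 0.2], [0.2, -1.0]]),
          np.array([[-0.5, 0.1], [0.3, -0.4]])]
u = np.array([0.5, 0.5])    # l1 norm 1
w = np.array([0.5, -0.5])   # l1 norm 1
```

With unit l1 norms the penalties contribute nothing, so the objective equals the decorrelation misfit alone. Scaling `u` away from unit norm, or setting it to zero, makes the penalty positive.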

Cross Cancellation

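$$\begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = \begin{pmatrix} a_{11} * & a_{12} * \\ a_{21} * & a_{22} * \end{pmatrix} \begin{pmatrix} s_1 \\ 0 \end{pmatrix}$$

$$\begin{pmatrix} x_{21} \\ x_{22} \end{pmatrix} = \begin{pmatrix} a_{11} * & a_{12} * \\ a_{21} * & a_{22} * \end{pmatrix} \begin{pmatrix} 0 \\ s_2 \end{pmatrix}$$

$$x_{11} = a_{11} * s_1, \quad x_{12} = a_{21} * s_1, \quad x_{21} = a_{12} * s_2, \quad x_{22} = a_{22} * s_2$$

$$a_{21} * x_{11} = a_{21} * a_{11} * s_1, \quad a_{11} * x_{12} = a_{11} * a_{21} * s_1$$

$$a_{22} * x_{21} = a_{22} * a_{12} * s_2, \quad a_{12} * x_{22} = a_{12} * a_{22} * s_2$$

$$a_{21} * x_{11} - a_{11} * x_{12} = 0$$

$$a_{22} * x_{21} - a_{12} * x_{22} = 0$$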
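The cross-cancellation identities follow from the commutativity of convolution and can be verified directly. In this sketch, x11 and x12 are the two microphone signals recorded while only source 1 is active, and x21 and x22 are those recorded while only source 2 is active. The integer filters and sources are hypothetical and are chosen so that the residuals are exactly zero.

```python
import numpy as np

conv = np.convolve

# Hypothetical integer filters and sources (illustration only).
a11, a12 = np.array([2, 1]), np.array([1, 0])
a21, a22 = np.array([1, 1]), np.array([3, 1])
s1 = np.array([1, -2, 0, 4])
s2 = np.array([0, 3, 1, -1])

# Recordings with a single active source.
x11, x12 = conv(a11, s1), conv(a21, s1)   # only s1 active
x21, x22 = conv(a12, s2), conv(a22, s2)   # only s2 active

# Cross-cancellation residuals.
r1 = conv(a21, x11) - conv(a11, x12)
r2 = conv(a22, x21) - conv(a12, x22)
```

Both residuals vanish for the true filters. Since each residual is linear in the unknown filter taps, such single-source segments give linear constraints on A that do not require knowing s1 or s2.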

Regularization

$$r = \exp(c \cdot [0 : q-1])$$

$$R = \begin{pmatrix} r & r \\ r & r \end{pmatrix}$$

$$p = a\, A \mathbin{.*} R$$
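The exponential weights can be built in a few lines. The sketch below assumes that q is the filter length, that `[0:q-1]` is the MATLAB-style range 0, 1, ..., q−1, and that A is stored as a 2 × 2q array whose rows hold the filter pairs [a11 | a12] and [a21 | a22]; that storage layout and the function name are hypothetical. How the weighted taps enter the objective (the scalar a and any norm) is not recoverable from the slide, so only the element-wise product `A .* R` is shown.

```python
import numpy as np

def weight_matrix(c, q):
    """Tile r = exp(c * [0, ..., q-1]) into the 2 x 2q block matrix R = [r r; r r]."""
    r = np.exp(c * np.arange(q))
    return np.tile(r, (2, 2))

# Hypothetical values (illustration only).
c, q = 0.1, 4
R = weight_matrix(c, q)

# A with rows [a11 | a12] and [a21 | a22], each filter of length q.
A = np.ones((2, 2 * q))
weighted = A * R   # element-wise product, MATLAB's A .* R
```

For c > 0 the weights grow with the tap index, so a penalty on the weighted taps discourages energy in the late part of each filter.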

Results

Figure panels: Instantaneous A; Cross Cancellation A

Results

Figure panels: Cross Cancellation + Normalization + Regularization A; Cross Cancellation + Normalization + Regularization + Decorrelation A

Results

Figure panels: Convolutive A; Cross Cancellation A

Results

Figure panel: Cross Cancellation + Normalization A

References

Alexis Favrot, Christof Faller, and Fabian Kuech. Reverberation modeling in acoustic echo suppression. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011.

Jie Liu, Jack Xin, Yingyong Qi, and Fan-Gang Zeng. A time domain algorithm for blind separation of convolutive sound mixtures and L1 constrained minimization of cross correlations. Communications in Mathematical Sciences, 7(1):109–128, 2009.

Meng Yu, Wenye Ma, Jack Xin, and Stanley Osher. A convex speech extraction model and fast computation by the split Bregman method. Pages 1–8, 2010.