Independent Component Analysis


Background

Hands-free speech recognition system

[Figure: the target speech ("Is it fine tomorrow?") and an interference source both reach the microphone, which feeds the speech recognition system.]

Interference is also observed at the microphone, so speech recognition performance degrades significantly.

Background (Cont’d)

Goal: realization of a high-quality hands-free speech interface system

Microphone array
• Receiver that consists of multiple elements
• Used to enhance target speech or reduce interference

Problem of microphone array processing: a priori information is required.
• Directions of arrival of the sound sources
• Breaks of target speech for filter adaptation

ICA

Blind Signal Separation (BSS) or Independent Component Analysis (ICA) is the identification and separation of mixtures of sources with little prior information.

• Applications include:
– Audio processing
– Medical data
– Finance
– Array processing (beamforming)
– Coding
• … and most applications where Factor Analysis and PCA are currently used.
• While PCA seeks directions that represent the data best in a Σ|x0 - x|² sense, ICA seeks directions that are most independent from each other.

We will concentrate on time-series separation of multiple targets.

Approach taken: estimate the source signals only from the observed mixed signals. No information about source directions or acoustic conditions is required. Independent Component Analysis (ICA) is mainly used.

Previous works on ICA
• J. Cardoso, 1989
• C. Jutten, 1990 (higher-order decorrelation)
• P. Comon, 1994 (defined the term “ICA”)
• A. Bell et al., 1995 (infomax)

Blind Source Separation (BSS)

[Figure: Speaker 1 ("Hello!") and Speaker 2 ("Good Morning!") emit Source 1 and Source 2, which are mutually independent. Microphone 1 and Microphone 2 record Observed signal 1 and Observed signal 2, which are known. ICA-based BSS takes the observed signals as input.]

To estimate the source signals: no a priori information (unsupervised adaptive filtering)

BSS for Instantaneous Mixture

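Linearly Mixing Process (Observed = Mixing Matrix × Source)

$$
\begin{bmatrix} x_1(t) \\ \vdots \\ x_L(t) \end{bmatrix}
=
\begin{bmatrix} A_{11} & \cdots & A_{1K} \\ \vdots & \ddots & \vdots \\ A_{L1} & \cdots & A_{LK} \end{bmatrix}
\begin{bmatrix} s_1(t) \\ \vdots \\ s_K(t) \end{bmatrix}
$$

Separation Process (Separated = Unmixing Matrix × Observed)

$$
\begin{bmatrix} y_1(t) \\ \vdots \\ y_K(t) \end{bmatrix}
=
\begin{bmatrix} W_{11} & \cdots & W_{1L} \\ \vdots & \ddots & \vdots \\ W_{K1} & \cdots & W_{KL} \end{bmatrix}
\begin{bmatrix} x_1(t) \\ \vdots \\ x_L(t) \end{bmatrix}
$$

Are the separated signals independent? A cost function measures this, and the unmixing matrix is optimized with respect to it.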
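The mixing and separation processes above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the 2 × 2 mixing matrix, the Laplacian stand-in sources, and the name `W_oracle` are hypothetical and not from the slides. The unmixing matrix is taken as the inverse of the known mixing matrix, which is exactly the information a blind method does not have.

```python
import numpy as np

rng = np.random.default_rng(0)

K, T = 2, 1000                       # hypothetical: 2 sources, 1000 samples
s = rng.laplace(size=(K, T))         # stand-in source signals s(t)
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])           # hypothetical L x K mixing matrix (L = 2)

x = A @ s                            # linear mixing process: x(t) = A s(t)

# Oracle unmixing matrix, for illustration only. In blind separation A is
# unknown, and W has to be found by optimizing an independence cost function.
W_oracle = np.linalg.inv(A)
y = W_oracle @ x                     # separation process: y(t) = W x(t)

print(np.allclose(y, s))             # checks that the sources are recovered
```

With the true inverse the sources come back exactly; ICA instead searches for a W that makes the outputs as independent as possible, which recovers the sources only up to ordering and scaling.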

Mathematical Formulation

Mixing model: $\mathbf{x}(k) = \mathbf{A}\,\mathbf{s}(k) + \mathbf{v}(k)$

• $\mathbf{s}(k) = (s_1(k), \ldots, s_n(k))^T$: the vector of n source signals
• $\mathbf{x}(k) = (x_1(k), \ldots, x_m(k))^T$: the vector of m sensor signals
• $\mathbf{v}(k)$: the vector of sensor noises
• $\mathbf{A}$: the mixing matrix

Demixing model: $\mathbf{y}(k) = \mathbf{W}\,\mathbf{x}(k)$

• $\mathbf{y}(k) = (y_1(k), \ldots, y_m(k))^T$: the vector of recovered signals
• $\mathbf{W}$: the demixing matrix

Problem: to estimate the source signals (or event-related potentials) by using the sensor signals

Definition

Kurtosis is more commonly defined as the fourth cumulant divided by the square of the second cumulant, which is equal to the fourth moment around the mean divided by the square of the variance of the probability distribution, minus 3:

$$\gamma_2 = \frac{\kappa_4}{\kappa_2^2} = \frac{\mu_4}{\sigma^4} - 3$$

This is also known as excess kurtosis. The "minus 3" at the end of this formula is often explained as a correction to make the kurtosis of the normal distribution equal to zero.

More generally, if X1, ..., Xn are independent random variables all having the same variance, then

$$\operatorname{Kurt}\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Kurt}(X_i),$$

whereas this identity would not hold if the definition did not include the subtraction of 3.

http://en.wikipedia.org/wiki/Cumulant

http://en.wikipedia.org/wiki/Variance
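As a quick numerical check of the excess-kurtosis definition and of the sum identity for independent, equal-variance variables, here is a minimal sketch. The function name `excess_kurtosis`, the sample sizes, and the choice of Laplacian test data are assumptions made for illustration only.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth central moment over squared variance, minus 3."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0

rng = np.random.default_rng(0)
n, T = 4, 200_000
X = rng.laplace(size=(n, T))     # n independent rows with equal variance

# Left-hand side: kurtosis of the sum X_1 + ... + X_n.
lhs = excess_kurtosis(X.sum(axis=0))
# Right-hand side: (1 / n^2) times the sum of the individual kurtoses.
rhs = sum(excess_kurtosis(row) for row in X) / n ** 2

print(lhs, rhs)                  # the two sample estimates should be close
```

The two printed values agree only up to sampling error, since both sides are estimated from finite data.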

Various Criteria for ICA

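Separated signal: $\mathbf{y}(t) = [y_1(t), \ldots, y_K(t)]^T$

• Decorrelation
– To minimize correlation among signals in multiple time durations

$$E[\mathbf{y}(t)\,\mathbf{y}(t)^T] \to \mathrm{diag}$$

• Nonlinear function 1
– To minimize higher-order correlation

$$E[\mathbf{y}^3(t)\,\mathbf{y}(t)^T] \to \mathrm{diag}$$

• Nonlinear function 2
– To assume the p.d.f. of the sources

$$E[\Phi(\mathbf{y}(t))\,\mathbf{y}(t)^T] \to \mathrm{diag}, \qquad \Phi: \text{sigmoid function}$$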
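The three criteria can be compared numerically by estimating each matrix from samples and measuring how far it is from diagonal. The sketch below makes several assumptions that are not in the slides: `tanh` stands in for the sigmoid function Φ, the cube is taken elementwise, expectations are replaced by averages over a single time block (the decorrelation criterion on the slide uses multiple time durations), and the helper names are hypothetical.

```python
import numpy as np

def criterion_matrices(y):
    """Sample estimates of the three matrices for y of shape (K, T)."""
    T = y.shape[1]
    return {
        "decorrelation": y @ y.T / T,          # E[y y^T]
        "nonlinear_1": (y ** 3) @ y.T / T,     # E[y^3 y^T], elementwise cube
        "nonlinear_2": np.tanh(y) @ y.T / T,   # E[Phi(y) y^T], Phi = tanh (assumed)
    }

def offdiag_norm(M):
    """Frobenius norm of the off-diagonal part; zero for a diagonal matrix."""
    return np.linalg.norm(M - np.diag(np.diag(M)))

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 50_000))              # independent stand-in sources
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])                     # hypothetical mixing matrix
x = A @ s                                      # mixed (dependent) signals

independent = {k: offdiag_norm(M) for k, M in criterion_matrices(s).items()}
mixed = {k: offdiag_norm(M) for k, M in criterion_matrices(x).items()}

for name in independent:
    print(name, independent[name], mixed[name])
```

For independent signals every off-diagonal norm is close to zero, while the mixed signals give clearly larger values, which is what each criterion tries to drive down.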

Cost Function for Nonlinear Function 2

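Kullback-Leibler (KL) divergence between $p(y_1, \ldots, y_K)$ and $\prod_{k=1}^{K} p(y_k)$:

$$
KL(\mathbf{W}) = \int p(\mathbf{y}) \log \frac{p(\mathbf{y})}{\prod_{k=1}^{K} p(y_k)} \, d\mathbf{y}
= -H(\mathbf{Y}; \mathbf{W}) + \sum_{k=1}^{K} H(Y_k; \mathbf{W})
$$

1. $H(\mathbf{Y}; \mathbf{W})$: joint entropy of $\mathbf{y}$
2. $\sum_{k=1}^{K} H(Y_k; \mathbf{W})$: sum of the marginal entropies of $y_k$

• Minimized when the $y_k$ are mutually independent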
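To make the cost concrete, the sketch below evaluates a sample-based surrogate of KL(W). It relies on assumptions that are not in the slides: the joint entropy is written as H(Y; W) = H(x) + log|det W| (valid for y = W x with square, invertible W), the constant H(x) is dropped, and each marginal entropy is replaced by the cross-entropy under an assumed density p(y_k) = 1/(π cosh y_k). The names `kl_cost` and `log_cosh` are hypothetical.

```python
import numpy as np

def log_cosh(y):
    """Numerically stable log(cosh(y))."""
    return np.logaddexp(y, -y) - np.log(2.0)

def kl_cost(W, x):
    """Surrogate of KL(W) up to the constant -H(x), for x of shape (K, T)."""
    y = W @ x
    K = W.shape[0]
    # -H(Y; W) = -H(x) - log|det W|; the constant -H(x) is omitted.
    neg_joint_entropy = -np.log(np.abs(np.linalg.det(W)))
    # Sum over k of -E[log p(y_k)] with the assumed p(y_k) = 1 / (pi cosh y_k).
    marginal_terms = np.sum(np.mean(log_cosh(y), axis=1)) + K * np.log(np.pi)
    return neg_joint_entropy + marginal_terms

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 50_000))              # independent stand-in sources
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])                     # hypothetical mixing matrix
x = A @ s

W_true = np.linalg.inv(A)                      # outputs are the sources
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
W_rotated = R @ W_true                         # same |det W|, outputs remixed

cost_true = kl_cost(W_true, x)
cost_rotated = kl_cost(W_rotated, x)
print(cost_true, cost_rotated)
```

Both candidates have the same determinant, so the difference in cost comes only from the marginal terms: the rotated outputs are mixtures of the sources and score worse under this surrogate.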

Derivation for Nonlinear Function 2

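To update $\mathbf{W}$ along the negative gradient of $KL(\mathbf{W})$:

$$
\begin{aligned}
\frac{\partial KL(\mathbf{W})}{\partial \mathbf{W}}
&= -(\mathbf{W}^T)^{-1} + \int \Phi(\mathbf{y})\,\mathbf{x}^T p(\mathbf{x})\, d\mathbf{x} \\
&= -(\mathbf{W}^T)^{-1} + E[\Phi(\mathbf{y})\,\mathbf{x}^T] \\
&= -\left(\mathbf{I} - E[\Phi(\mathbf{y})\,\mathbf{y}^T]\right)(\mathbf{W}^T)^{-1}
\end{aligned}
$$

where

$$
\Phi(\mathbf{y}) = -\left[\frac{\partial \log p(y_1)}{\partial y_1}, \ldots, \frac{\partial \log p(y_K)}{\partial y_K}\right]^T
$$

This can be approximated by a sigmoid function for speech signals.

Nonlinear Function 2 ⇒ $E[\Phi(\mathbf{y})\,\mathbf{y}^T]$ is to be diagonalized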
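A minimal gradient-descent sketch of this update rule follows. It assumes `tanh` as the sigmoid-shaped nonlinearity, replaces the expectation by a sample average over the whole batch, and uses a fixed step size and iteration count chosen for this toy problem; the function names and all numeric settings are hypothetical rather than taken from the slides.

```python
import numpy as np

def phi(y):
    """Assumed sigmoid-shaped nonlinearity (tanh); a modelling choice."""
    return np.tanh(y)

def kl_gradient(W, x):
    """Sample estimate of dKL/dW = -(I - E[Phi(y) y^T]) (W^T)^{-1}."""
    y = W @ x
    K, T = y.shape
    return -(np.eye(K) - phi(y) @ y.T / T) @ np.linalg.inv(W.T)

def fit_unmixing(x, step=0.1, n_iter=3000):
    """Plain gradient descent on KL(W), starting from the identity."""
    W = np.eye(x.shape[0])
    for _ in range(n_iter):
        W = W - step * kl_gradient(W, x)   # move along the negative gradient
    return W

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 10_000))          # super-Gaussian stand-in sources
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])                 # hypothetical mixing matrix
x = A @ s                                  # observed mixtures

W = fit_unmixing(x)
P = W @ A    # close to a scaled permutation matrix if separation worked
print(np.round(P, 2))
```

Each row of `P` should be dominated by a single entry, which reflects that the sources are recovered only up to ordering and scaling. Practical implementations often prefer the natural-gradient form of this update, which avoids the matrix inverse; that variant is not shown here.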

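Measures of Non-Gaussianity

• Kurtosis: gauss = 0 (sensitive to outliers)

$$\operatorname{kurt}(y) = E\{y^4\} - 3\,(E\{y^2\})^2$$

• Entropy: gauss = largest (for a given variance)

$$H(y) = -\int f(y) \log f(y)\, dy$$

• Neg-entropy: gauss = 0

$$J(y) = H(y_{gauss}) - H(y)$$

• Approximations

$$J(y) \approx \frac{1}{12} E\{y^3\}^2 + \frac{1}{48} \operatorname{kurt}(y)^2$$

$$J(y) \propto \left[E\{G(y)\} - E\{G(v)\}\right]^2$$

• where v is a standard Gaussian random variable and:

$$G(y) = \frac{1}{a} \log\cosh(a y), \qquad G(y) = -\exp(-a y^2 / 2)$$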
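The kurtosis and the G-based negentropy approximation are easy to estimate from samples. The following sketch assumes zero-mean, unit-variance inputs, uses the log cosh choice of G with a = 1, and estimates E{G(v)} by Monte Carlo from a standard Gaussian reference sample; the function names, sample sizes, and seeds are hypothetical.

```python
import numpy as np

def kurt(y):
    """kurt(y) = E{y^4} - 3 (E{y^2})^2 for a zero-mean signal y."""
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

def G_logcosh(u, a=1.0):
    """G(u) = (1/a) log cosh(a u), computed in a numerically stable way."""
    return (np.logaddexp(a * u, -a * u) - np.log(2.0)) / a

def negentropy_approx(y, G=G_logcosh, n_ref=200_000, seed=0):
    """[E{G(y)} - E{G(v)}]^2 with v a standard Gaussian reference sample."""
    v = np.random.default_rng(seed).standard_normal(n_ref)
    return (np.mean(G(y)) - np.mean(G(v))) ** 2

def standardize(y):
    """Center and scale to unit variance."""
    return (y - y.mean()) / y.std()

rng = np.random.default_rng(1)
gauss = standardize(rng.standard_normal(100_000))
laplace = standardize(rng.laplace(size=100_000))

print("kurt:", kurt(gauss), kurt(laplace))
print("J:", negentropy_approx(gauss), negentropy_approx(laplace))
```

Both measures come out near zero for the Gaussian sample and clearly positive for the Laplacian one, in line with the "gauss = 0" remarks on the slide.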

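Data Centering & Whitening

• Centering

$$\mathbf{x} = \mathbf{x}' - E\{\mathbf{x}'\}$$

– This does not mean that ICA cannot estimate the mean; centering just simplifies the algorithm.
– The ICs are also zero-mean, because $E\{\mathbf{s}\} = \mathbf{W} E\{\mathbf{x}\}$.
– After ICA, add $\mathbf{W} E\{\mathbf{x}'\}$ to the zero-mean ICs.

• Whitening
– We transform $\mathbf{x}$ linearly so that the components of $\tilde{\mathbf{x}}$ are white. This is done by eigenvalue decomposition (EVD):

$$\tilde{\mathbf{x}} = (\mathbf{E}\mathbf{D}^{-1/2}\mathbf{E}^T)\,\mathbf{x} = \mathbf{E}\mathbf{D}^{-1/2}\mathbf{E}^T \mathbf{A}\,\mathbf{s} = \tilde{\mathbf{A}}\,\mathbf{s}, \qquad \text{where } E\{\mathbf{x}\mathbf{x}^T\} = \mathbf{E}\mathbf{D}\mathbf{E}^T$$

So we only have to estimate the orthonormal matrix $\tilde{\mathbf{A}}$.

– An orthonormal matrix has n(n-1)/2 degrees of freedom. So for a large-dimensional A we have to estimate only about half as many parameters (n(n-1)/2 instead of n²). This greatly simplifies ICA.

• Reducing the dimension of the data (keeping the dominant eigenvalues) while whitening also helps.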
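The centering and whitening steps can be sketched as follows. The helper names `center` and `whiten`, the unit-variance Laplacian sources, the mixing matrix, and the constant offset are assumptions for illustration; expectations are replaced by sample averages.

```python
import numpy as np

def center(x):
    """Subtract the sample mean of each row; x has shape (m, T)."""
    mean = x.mean(axis=1, keepdims=True)
    return x - mean, mean

def whiten(x):
    """Whiten centered data with V = E D^(-1/2) E^T from the EVD of E{x x^T}."""
    cov = x @ x.T / x.shape[1]
    d, E = np.linalg.eigh(cov)             # cov = E diag(d) E^T
    V = E @ np.diag(d ** -0.5) @ E.T
    return V @ x, V

rng = np.random.default_rng(0)
T = 50_000
s = rng.laplace(scale=1 / np.sqrt(2), size=(2, T))   # unit-variance sources
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])                           # hypothetical mixing matrix
x_raw = A @ s + np.array([[3.0], [-1.0]])            # mixtures with an offset

x, mean = center(x_raw)
x_tilde, V = whiten(x)

A_tilde = V @ A                                      # effective mixing matrix
print(np.round(x_tilde @ x_tilde.T / T, 3))          # sample covariance of x~
print(np.round(A_tilde @ A_tilde.T, 3))              # near identity if A~ is orthonormal
```

The whitened data has identity sample covariance by construction, and the effective mixing matrix is orthonormal up to sampling error, which is why only a rotation remains to be estimated.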

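Noisy ICA Model

$$\mathbf{x} = \mathbf{A}\,\mathbf{s} + \mathbf{n}$$

• A ... m × n mixing matrix
• s ... n-dimensional vector of ICs
• n ... m-dimensional random noise vector
• The same assumptions hold as for the noise-free model, provided we use measures of non-Gaussianity that are immune to Gaussian noise.
• So Gaussian moments are used as contrast functions, i.e.

$$J(y) \propto \left[E\{G(y)\} - E\{G(v)\}\right]^2, \qquad G(y) = \frac{1}{\sqrt{2\pi}\,c} \exp\left(-\frac{y^2}{2c^2}\right)$$

• However, in pre-whitening the effect of noise must be taken into account:

$$\tilde{\mathbf{x}} = \left(E\{\mathbf{x}\mathbf{x}^T\} - \boldsymbol{\Sigma}\right)^{-1/2} \mathbf{x}, \qquad \tilde{\mathbf{x}} = \mathbf{B}\,\mathbf{s} + \tilde{\mathbf{n}}$$

where $\boldsymbol{\Sigma}$ is the noise covariance matrix.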
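The noise-corrected pre-whitening step can be illustrated with a small sketch. It assumes the noise covariance Σ is known exactly, that the sources have unit variance, and that the data are (approximately) zero-mean; the function name `quasi_whiten` and all numeric values are hypothetical.

```python
import numpy as np

def quasi_whiten(x, noise_cov):
    """Return (x_tilde, M) with M = (E{x x^T} - Sigma)^(-1/2); x is (m, T)."""
    cov = x @ x.T / x.shape[1]
    d, E = np.linalg.eigh(cov - noise_cov)
    if np.any(d <= 0):
        raise ValueError("E{x x^T} - Sigma is not positive definite")
    M = E @ np.diag(d ** -0.5) @ E.T
    return M @ x, M

rng = np.random.default_rng(0)
T = 200_000
s = rng.laplace(scale=1 / np.sqrt(2), size=(2, T))   # unit-variance sources
A = np.array([[1.0, 0.3],
              [0.2, 1.0]])                           # hypothetical mixing matrix
Sigma = 0.1 * np.eye(2)                              # assumed known noise covariance
n = np.sqrt(0.1) * rng.standard_normal((2, T))       # Gaussian sensor noise
x = A @ s + n

x_tilde, M = quasi_whiten(x, Sigma)
B = M @ A                        # mixing matrix after quasi-whitening
n_tilde_cov = M @ Sigma @ M.T    # covariance of the transformed noise

print(np.round(B @ B.T, 3))      # near identity if B is orthonormal
```

Subtracting Σ before taking the inverse square root is what keeps B close to orthonormal; whitening with the raw covariance would instead fold the noise power into the transform.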
