
INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING
Int. J. Adapt. Control Signal Process. 2006; 20:213–223
Published online 23 March 2006 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/acs.896

A new blind source separation method based on fractional lower-order statistics

Daifeng Zha 1,*,† and Tianshuang Qiu 2

1 College of Electronic Engineering, Jiujiang University, Jiujiang 332005, China
2 School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116024, China

SUMMARY

We propose neural network structures related to multilayer feed-forward networks for performing blind source separation (BSS) based on fractional lower-order statistics. As an alpha-stable distribution process has no second- or higher-order statistics, we modify conventional BSS algorithms so that their capabilities are greatly improved under both Gaussian and lower-order alpha-stable distribution noise environments. We analyse the performance of the new algorithm, including its stability and convergence. The analysis is based on the assumption that the additive noise can be modelled as an alpha-stable process. The simulation experiments and analysis show that the proposed class of networks and algorithms is more robust than second-order-statistics-based algorithms. Copyright © 2006 John Wiley & Sons, Ltd.

KEY WORDS: alpha-stable distribution; blind source separation; independent component analysis; neural networks; second-order statistics; higher-order statistics; fractional lower-order statistics (FLOS); non-Gaussian noise

1. INTRODUCTION

In applications such as underwater acoustic signal processing, radio astronomy, communications, and radar systems, most conventional, linear-theory-based methods assume that the additive noise is Gaussian distributed with finite second-order statistics (SOS). In some scenarios, however, it is inappropriate to model the noise as Gaussian. Recent studies [1,2] show that the alpha-stable distribution is better than the Gaussian distribution for modelling impulsive noises in signal processing, including underwater acoustic, low-frequency atmospheric, and many man-made noises; it has some important characteristics that make it very attractive. Stable processes arise as the limits of sums of independent, identically distributed random variables via the generalized central limit theorem. This kind of physical process, with sudden, short-duration high impulses in the real world, is called a lower-order alpha-stable distribution random process and has no second- or higher-order statistics. It has no closed-form probability density function (p.d.f.), so

Received 14 December 2004; Revised 24 July 2005; Accepted 27 December 2005

*Correspondence to: Daifeng Zha, College of Electronic Engineering, Jiujiang University, Jiujiang 332005, China.
†E-mail: [email protected]


that we can only describe it by its characteristic function [1,2]:

$$\varphi(t) = \exp\{j\mu t - \gamma|t|^{\alpha}[1 + j\beta\,\mathrm{sgn}(t)\,\omega(t,\alpha)]\} \qquad (1)$$

where

$$\omega(t,\alpha) = \begin{cases} \tan\dfrac{\alpha\pi}{2}, & \alpha \neq 1 \\[4pt] \dfrac{2}{\pi}\log|t|, & \alpha = 1 \end{cases} \qquad -\infty < \mu < \infty,\;\; \gamma > 0,\;\; 0 < \alpha \leq 2,\;\; -1 \leq \beta \leq 1$$

Here $\alpha$ is the characteristic exponent; it controls the thickness of the tails of the distribution. The Gaussian process is a special case of the stable processes, with $\alpha = 2$. The dispersion parameter $\gamma$ is similar to the variance of a Gaussian process, and $\beta$ is the symmetry parameter. If $\beta = 0$, the distribution is symmetric about the location parameter $\mu$ and is referred to as the S$\alpha$S (symmetric $\alpha$-stable) distribution. When $\alpha = 2$ and $\beta = 0$, the stable distribution becomes the Gaussian distribution.
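As a rough illustration of (1), the sketch below (our own, not the authors' code) evaluates the characteristic function numerically and checks the Gaussian special case $\alpha = 2$, $\beta = 0$, for which $\varphi(t) = \exp(-\gamma t^2)$:

```python
import numpy as np

def stable_cf(t, alpha, gamma=1.0, beta=0.0, mu=0.0):
    """Characteristic function (1) of an alpha-stable law.

    omega(t, alpha) follows the two-branch definition in the text.
    Note: log|t| diverges at t = 0, so avoid t = 0 when alpha == 1.
    """
    t = np.asarray(t, dtype=float)
    if alpha == 1.0:
        omega = (2.0 / np.pi) * np.log(np.abs(t))
    else:
        omega = np.tan(alpha * np.pi / 2.0)
    return np.exp(1j * mu * t
                  - gamma * np.abs(t) ** alpha * (1 + 1j * beta * np.sign(t) * omega))

t = np.linspace(-3.0, 3.0, 13)
# alpha = 2, beta = 0: the Gaussian special case with variance 2*gamma
assert np.allclose(stable_cf(t, alpha=2.0), np.exp(-t ** 2))
```

Since $|\varphi(t)| = \exp(-\gamma|t|^{\alpha}) \leq 1$, the magnitude never exceeds one for any admissible parameter choice.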

Conventional blind source separation (BSS) [3] is optimal in approximating the input data in the mean-square-error sense, describing some second-order characteristics of the data. Non-linear BSS [4,5], which is related to higher-order statistical techniques, is a useful extension of standard BSS that has been developed in the context of blind separation of independent sources from their linear mixtures. Such blind techniques are needed, for example, in various applications of array processing, communications, medical signal processing, and speech processing. In BSS, the data are represented in an orthogonal basis determined merely by the SOS (covariance) of the input data [6–10]. Conventional BSS methods are based on second-order statistics (SOS) or higher-order statistics (HOS) and are often utilized in blind separation of unknown source signals from their linear mixtures. In this paper, we propose neural network structures related to multilayer feed-forward networks for performing BSS based on fractional lower-order statistics (FLOS). As a lower-order alpha-stable distribution process has no second- or higher-order statistics, we modify conventional algorithms so that their capabilities are greatly improved under both Gaussian and fractional lower-order alpha-stable distribution noise environments.

2. DATA MODEL AND NETWORK STRUCTURES

In the following, we present the basic data model used in the source separation problem and discuss the necessary assumptions. Assume that there exist $P$ zero-mean source signals $s_i(n)$, $i = 1, 2, \ldots, P$, that are scalar-valued and mutually statistically independent for each sample value or, in practice, as independent as possible. The independence condition is formally defined so that the joint probability density of the source signals must be the product of the marginal densities of the individual sources. More concretely, the source signals could be sampled discrete-time waveforms. We assume that the original sources are unobservable, and all that we have is a set of noisy linear mixtures $X(n) = [x_1(n), x_2(n), \ldots, x_M(n)]^{\mathrm{T}}$, $n = 1, 2, \ldots, N$. We can write the signal model in matrix form as follows:

$$X = AS + V \qquad (2)$$

Here, $X$ is the $M \times N$ observation matrix and $A$ is the $M \times P$ constant mixing matrix with full column rank. For convenience, we assume that the number of sources equals the dimension of $X(n)$, i.e. the number of sources is known. $S$ is the $P \times N$ matrix of independent source signals, and $V$ denotes an $M \times N$ matrix of possible additive lower-order alpha-stable distribution noise; the mixing matrix $A$ is unknown. The task of source separation is to find an estimate of the sources knowing only the data $X$. Each source signal is a stationary zero-mean stochastic process.
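The model (2) can be sketched numerically as follows. This is our own illustration: the dimensions and scale factor are arbitrary, and we use Cauchy samples (the $\alpha = 1$ stable special case available directly in NumPy) to stand in for the alpha-stable noise $V$:

```python
import numpy as np

rng = np.random.default_rng(0)
P, M, N = 2, 3, 1000                    # sources, sensors, snapshots

S = rng.standard_normal((P, N))         # zero-mean, mutually independent sources
A = rng.standard_normal((M, P))         # unknown M x P mixing matrix
V = 0.1 * rng.standard_cauchy((M, N))   # Cauchy noise: alpha-stable with alpha = 1

X = A @ S + V                           # noisy linear mixtures, model (2)

assert X.shape == (M, N)
assert np.linalg.matrix_rank(A) == P    # full column rank (holds a.s. for random A)
```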

Now let us consider the two-layer feed-forward neural network structure [3,8] shown in Figure 1. The inputs of the network are the components of the observation matrix $X$ (not counted as a layer). The hidden layer contains $P$ neurons, and the output layer again consists of $P$ neurons. Let $B$ denote the $P \times M$ pre-processing matrix between the inputs and the hidden layer, and $W^{\mathrm{T}}$ the $P \times P$ weight matrix between the hidden and output layers. Based on this network structure, BSS can be done in two subsequent stages: (a) obtain a pre-processing matrix that orthogonalizes the mixture matrix; (b) learn a weight matrix that maximizes the $p$-order moment of every element of $Y(n)$:

$$\max\; E\{|y_i(n)|^p\} = E\{|W_i^{\mathrm{T}} Z(n)|^p\} \qquad (0 < p < \alpha \leq 2) \qquad (3)$$

In the meantime, the sources are separated.
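To make criterion (3) concrete: for alpha-stable data, $E\{|y|^p\}$ is finite only when $p < \alpha$, which is exactly the constraint stated in (3). A quick numerical check (our own sketch) using Cauchy data ($\alpha = 1$):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.standard_cauchy(200_000)   # alpha-stable sample with alpha = 1

p = 0.5                            # p < alpha, so E{|y|^p} is finite
m_p = np.mean(np.abs(y) ** p)      # empirical fractional lower-order moment
m_2 = np.mean(np.abs(y) ** 2)      # second moment: grows without bound as N grows

assert np.isfinite(m_p) and m_p > 0.0
```

The empirical $p$-order moment stabilizes as the sample grows, whereas the empirical second moment keeps growing, which is why SOS-based criteria break down under this noise.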

3. PRE-WHITENING BASED ON NORMALIZED COVARIANCE MATRIX

Generally, it is impossible to separate the possible noise in the input data from the source signals; in practice, noise smears the results in all separation algorithms [7]. If the amount of noise is considerable, the separation results are often fairly poor. Here, we introduce a two-step separation method that achieves BSS. The first step is a whitening procedure that orthogonalizes the mixture matrix: we search for a matrix $B$ that transforms the mixing matrix $A$ into a unitary matrix. Classically, for a finite-variance signal, the whitening matrix is computed as the inverse square root of the signal covariance matrix. In our case, alpha-stable impulsive noise has infinite variance; however, we can take advantage of the normalized covariance matrix.

Theorem 1 (Sahmoudi et al. [12])
Let $X = [X(1), X(2), \ldots, X(N)]$ be a stable process data matrix. Then the normalized covariance matrix of $X$,

$$C_x = \frac{XX^{\mathrm{T}}}{N \cdot \mathrm{Trace}(XX^{\mathrm{T}}/N)} \qquad (4)$$

Figure 1. The linear feed-forward neural network structure.


converges asymptotically to a finite matrix as $N \to \infty$, i.e.

$$\lim_{N\to\infty} C_x = ADA^{\mathrm{T}} \qquad (5)$$

$$D = \mathrm{diag}(d_1, d_2, \ldots, d_M) \qquad (6)$$

where $d_i = \lim_{N\to\infty} D_i \big/ \sum_{j=1}^{P} D_j \|a_j\|^2$, $a_j$ is a column of $A$, and $D_i = \sum_{n=1}^{N} x_i^2(n)/N$.

Theorem 2
Let the eigendecomposition of $C_x$ be $C_x = U\Lambda^2 U^{\mathrm{T}}$, with eigenvalues $\lambda_m$ and corresponding eigenvectors $e_m$, $m = 1, 2, \ldots, M$, and let the whitening matrix be $B = \Lambda^{-1} U^{\mathrm{T}}$. Then the rows of

$$Z = BX \qquad (7)$$

are orthogonal.

Proof

$$\begin{aligned}
ZZ^{\mathrm{T}} &= BXX^{\mathrm{T}}B^{\mathrm{T}} = B \cdot C_x \cdot N\,\mathrm{Trace}(XX^{\mathrm{T}}/N) \cdot B^{\mathrm{T}} \\
&= N\,\mathrm{Trace}(XX^{\mathrm{T}}/N) \cdot B \cdot U\Lambda^2 U^{\mathrm{T}} \cdot B^{\mathrm{T}} \\
&= N\,\mathrm{Trace}(XX^{\mathrm{T}}/N) \cdot \Lambda^{-1}U^{\mathrm{T}} \cdot U\Lambda^2 U^{\mathrm{T}} \cdot U(\Lambda^{-1})^{\mathrm{T}} \\
&= N\,\mathrm{Trace}(XX^{\mathrm{T}}/N) \cdot \Lambda^{-1} \cdot \Lambda^2 \cdot (\Lambda^{-1})^{\mathrm{T}} \\
&= N\,\mathrm{Trace}(XX^{\mathrm{T}}/N) \cdot I
\end{aligned}$$

So we can write

$$BC_x B^{\mathrm{T}} = BADA^{\mathrm{T}}B^{\mathrm{T}} = (BAD^{1/2})(BAD^{1/2})^{\mathrm{T}} = I \qquad (8)$$

□
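A numerical sketch of this whitening step (our own, not the authors' code). NumPy's `eigh` returns the eigenvalues $\lambda_m$ of $C_x$ directly, so the whitening matrix is formed as $B = \mathrm{diag}(\lambda_m^{-1/2})\,U^{\mathrm{T}}$, which matches $B = \Lambda^{-1}U^{\mathrm{T}}$ with $\Lambda^2$ the eigenvalue matrix:

```python
import numpy as np

def flos_whiten(X):
    """Whitening via the normalized covariance matrix, eq. (4)."""
    N = X.shape[1]
    Cx = (X @ X.T) / (N * np.trace(X @ X.T / N))  # normalized covariance (4)
    lam, U = np.linalg.eigh(Cx)                   # Cx = U diag(lam) U^T
    B = np.diag(lam ** -0.5) @ U.T                # whitening matrix of Theorem 2
    return B @ X, B                               # Z = BX, eq. (7)

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 4000))
Z, B = flos_whiten(X)

# ZZ^T = N * Trace(XX^T / N) * I, as in the proof: orthogonal up to a common scale
G = Z @ Z.T
assert np.allclose(G / G[0, 0], np.eye(3), atol=1e-8)
```

The check confirms $ZZ^{\mathrm{T}}$ is a scalar multiple of the identity, i.e. the rows of $Z$ are orthogonal with equal norms, exactly as the proof states.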

4. SEPARATING ALGORITHMS

The core part, and most difficult task, in BSS is learning the separating matrix $W^{\mathrm{T}}$. During recent years, many neural blind separation algorithms have been proposed. In the following, we discuss and propose separation algorithms suitable for alpha-stable noise environments in BSS networks. Consider the $i$th output weight vector $W_i$, $i = 1, 2, \ldots, P$. Standard BSS is based on SOS and maximizes the output variance (power) $E\{|y_i(n)|^2\} = E\{|W_i^{\mathrm{T}} Z(n)|^2\}$ subject to the orthogonality constraints $W_i W_i^{\mathrm{T}} = I_P$. As lower-order alpha-stable distribution noise has no second-order moment, we must select an appropriate optimality criterion. The BSS problem corresponding to $p$-order moment maximization is then the solution of the following optimization problem for each $W_i$, $i = 1, 2, \ldots, P$, subject to the orthogonality constraints $W_i W_i^{\mathrm{T}} = I_P$:

$$W_i^{\mathrm{opt}} = \arg\max_{W_i}\, E\left\{\frac{1}{p}|Z^{\mathrm{T}}(n)W_i|^p\right\} \qquad (W_i W_i^{\mathrm{T}} = I_P) \qquad (9)$$

Let objective function be

JðWiÞ ¼ E1

pjZTðnÞWi j

p

� �þ

1

2liiðWT

i Wi � 1Þ þ1

2

XPj¼1;j=i

lijWTi Wj ð10Þ


Here, the Lagrange multipliers $\lambda_{ij}$ are imposed on the orthogonality constraints $W_i W_i^{\mathrm{T}} = I_P$: for each neuron, $W_i$ is orthogonal to the weight vectors $W_j$, $j \neq i$. The estimated gradient of $J(W_i)$ with respect to $W_i$ is

$$\hat{\nabla} J(W_i) = E\{Z(n)|Z^{\mathrm{T}}(n)W_i|^{p-2}\,\mathrm{conj}(Z^{\mathrm{T}}(n)W_i)\} + \sum_{j=1}^{P} \lambda_{ij} W_j \qquad (11)$$

At the optimum, the gradients must vanish for $i = 1, 2, \ldots, P$, and $W_i^{\mathrm{T}} W_j = \delta_{ij}$. These constraints can be taken into account by multiplying (11) by $W_j^{\mathrm{T}}$ from the left, giving $\lambda_{ij} = -W_j^{\mathrm{T}} E\{Z(n)|Z^{\mathrm{T}}(n)W_i|^{p-2}\,\mathrm{conj}(Z^{\mathrm{T}}(n)W_i)\}$. Inserting these into (11) we get

$$\hat{\nabla} J(W_i) = \left[I - \sum_{j=1}^{P} W_j W_j^{\mathrm{T}}\right] E\{Z(n)|Z^{\mathrm{T}}(n)W_i|^{p-2}\,\mathrm{conj}(Z^{\mathrm{T}}(n)W_i)\} \qquad (12)$$

A practical gradient algorithm for optimization problem (9) is now obtained by inserting (12) into $W_i(n+1) = W_i(n) - \mu(n)\hat{\nabla} J(W_i(n))$, where $\mu(n)$ is the gain parameter. The final algorithm is thus

$$W_i(n+1) = W_i(n) - \mu(n)\left[I - \sum_{j=1}^{P} W_j(n) W_j^{\mathrm{T}}(n)\right] Z(n)\,|Z^{\mathrm{T}}(n)W_i(n)|^{p-2}\,\mathrm{conj}(Z^{\mathrm{T}}(n)W_i(n)) \qquad (13)$$

As $y_i(n) = Z^{\mathrm{T}}(n)W_i(n)$, (13) can be written as follows:

$$W_i(n+1) = W_i(n) - \mu(n)\,|y_i(n)|^{p-2}\,\mathrm{conj}(y_i(n))\left[Z(n) - \sum_{j=1}^{P} y_j(n)\, W_j(n)\right] \qquad (14)$$

Let $g(t) = |t|^{p-2}\,\mathrm{conj}(t)$; then $g(t)$ is an appropriate network non-linear transfer function for lower-order alpha-stable distribution impulse noise.

Considering that during the iteration the error term of the gradient, $I - \sum_{j=1}^{P} W_j(n)W_j^{\mathrm{T}}(n)$, might instantaneously be zero, we modify (14) to improve the robustness of the algorithm:

$$W_i(n+1) = W_i(n) - \mu(n)\,g(y_i(n))\left[Z(n) - \sum_{j=1}^{P} g(y_j(n))\, W_j(n)\right] \qquad (15)$$

Thus, $W_1, W_2, \ldots, W_P$ can be obtained. Let $Y(n) = [y_1(n), y_2(n), \ldots, y_P(n)]^{\mathrm{T}}$ and $W = [W_1, W_2, \ldots, W_P]$. For the whole network, the solution of the optimization problem in $W$ is

$$W^{\mathrm{opt}} = \arg\max_{W} \sum_{i=1}^{P} E\left\{\frac{1}{p}|Z^{\mathrm{T}}(n)W_i|^p\right\} \qquad (16)$$

According to the above derivation, using $g(t) = |t|^{p-2}\,\mathrm{conj}(t)$, the algorithm for learning $W$ is

$$W(n+1) = W(n) - \mu(n)\left[Z(n) - W(n)\,g(Y(n))\right] g(Y^{\mathrm{T}}(n)) \qquad (17)$$
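Under our reading of (17), one stochastic update step could be sketched as below. This is an illustration only, with hypothetical dimensions, step size, and moment order; `g` is the FLOS non-linearity defined after (14):

```python
import numpy as np

def g(t, p=1.2):
    """FLOS non-linearity g(t) = |t|^{p-2} conj(t); requires p < alpha."""
    return np.abs(t) ** (p - 2) * np.conj(t)

def flos_step(W, z_n, mu=0.01, p=1.2):
    """One iteration of rule (17) for a single whitened snapshot z_n.

    W is M x P (columns W_i); Y = W^T z_n collects the outputs y_i(n).
    """
    y = W.T @ z_n                                 # y_i(n) = W_i^T Z(n)
    gy = g(y, p)
    return W - mu * np.outer(z_n - W @ gy, gy)    # [Z - W g(Y)] g(Y)^T

rng = np.random.default_rng(3)
W = rng.standard_normal((3, 2))
z_n = rng.standard_normal(3)
W_next = flos_step(W, z_n)
assert W_next.shape == (3, 2) and np.all(np.isfinite(W_next))
```

Note that with $p = 2$ the non-linearity reduces to the identity, $g(t) = t$, recovering an SOS-style update; the FLOS case uses $p < \alpha \leq 2$.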


5. PERFORMANCE ANALYSIS

Different non-linear functions can be applied to different blind signal separation problems. Some popular choices are $g(t) = \mathrm{sign}(t)$ and $g(t) = \tanh(t)$, corresponding to the double-exponential p.d.f. $\frac{1}{2}\exp(-|x|)$ and the inverse-hyperbolic-cosine p.d.f. $(1/\pi)(1/\cosh(x))$, respectively. For the class of symmetric normal inverse Gaussian (NIG) distributions, it is straightforward to obtain, according to Kidmose [11],

$$g(t) = \frac{-\alpha t}{\sqrt{\delta^2 + t^2}} \cdot \frac{K_2\!\left(\alpha\sqrt{\delta^2 + t^2}\right)}{K_1\!\left(\alpha\sqrt{\delta^2 + t^2}\right)}$$

where $K_1(\cdot)$ and $K_2(\cdot)$ are the modified Bessel functions of the second kind with indices 1 and 2. As lower-order alpha-stable distribution noise has no second- or higher-order moments, we must select the appropriate non-linear function $g(t) = |t|^{p-2}\,\mathrm{conj}(t)$ $(p < \alpha)$. If $t$ is real, $g(t) = |t|^{p-1}\,\mathrm{sign}(t)$; if $p = 1$, $g(t) = \mathrm{sign}(t)$. Figure 2 shows these non-linear functions for different $\alpha$.
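The special cases stated above are easy to verify numerically. This check (our own sketch) also illustrates why the FLOS non-linearity with $p < 2$ suits impulsive data: large outliers are strongly attenuated rather than passed through linearly:

```python
import numpy as np

def g_flos(t, p):
    """Real-data form of the FLOS non-linearity: g(t) = |t|^{p-1} sign(t)."""
    return np.abs(t) ** (p - 1) * np.sign(t)

t = np.array([-2.0, -0.5, 0.5, 2.0])

# p = 1 reduces to the sign non-linearity of the double-exponential p.d.f.
assert np.allclose(g_flos(t, p=1.0), np.sign(t))

# For p < 2, a large impulse is compressed sub-linearly:
impulse = 1.0e4
assert g_flos(np.array([impulse]), p=1.2)[0] < impulse ** 0.5
```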

We start from the learning rule (17), and assume that there exists a square separating matrix $H^{\mathrm{T}}$ such that $U(n) = H^{\mathrm{T}} Z(n)$. The separating matrix $H^{\mathrm{T}}$ must be orthogonal. To make the analysis easier, we multiply both sides of the learning rule (17) by $H^{\mathrm{T}}$. We obtain

$$H^{\mathrm{T}}W(n+1) = H^{\mathrm{T}}W(n) + \mu(n)\left[H^{\mathrm{T}}Z(n) - H^{\mathrm{T}}W(n)\,g(W^{\mathrm{T}}(n)Z(n))\right] g(Z^{\mathrm{T}}(n)W(n)) \qquad (18)$$

Since $HH^{\mathrm{T}} = I_P$, we get

$$H^{\mathrm{T}}W(n+1) = H^{\mathrm{T}}W(n) + \mu(n)\left[H^{\mathrm{T}}Z(n) - H^{\mathrm{T}}W(n)\,g(W^{\mathrm{T}}(n)HH^{\mathrm{T}}Z(n))\right] g(Z^{\mathrm{T}}(n)HH^{\mathrm{T}}W(n)) \qquad (19)$$

Defining $Q(n) = H^{\mathrm{T}}W(n)$, i.e. $W(n) = (H^{\mathrm{T}})^{-1}Q(n)$, (19) can be written as

$$Q(n+1) = Q(n) + \mu(n)\left[U(n) - Q(n)\,g(Q^{\mathrm{T}}(n)U(n))\right] g(U^{\mathrm{T}}(n)Q(n)) \qquad (20)$$

Figure 2. The non-linear functions of the alpha-stable distribution for different $\alpha$.


Geometrically, multiplication by the orthogonal matrix $H^{\mathrm{T}}$ simply means a rotation to a new set of co-ordinates such that the elements of the input vector, expressed in these co-ordinates, are statistically independent.

The differential equation analogous to (20) is obtained in matrix form as

$$\mathrm{d}Q/\mathrm{d}t = E\{U\,g(U^{\mathrm{T}}Q)\} - E\{g(Q^{\mathrm{T}}U)\,g(U^{\mathrm{T}}Q)\} \qquad (21)$$

According to Karhunen et al. [3], we can easily prove that (21) has a stable solution. Since $Q = H^{\mathrm{T}}W$, it follows that $W = (H^{\mathrm{T}})^{-1}Q$ is an asymptotically stable solution of (17). Figure 3 shows the stability and convergence of the algorithms based on SOS and FLOS; from it we see that the FLOS-based algorithm has better stability and convergence performance than the SOS-based algorithm.

6. EXPERIMENTAL RESULTS

The assumption that signals have finite variance is commonly made in the real world and is reasonable for array signal processing. The array under consideration is a uniform linear array (ULA) with inter-element spacing equal to half a wavelength. Since an alpha-stable random variable with $\alpha < 2$ has infinite variance, we use the generalized SNR (GSNR) [1,2], defined as the ratio of the signal power to the noise dispersion $\gamma_v$:

$$\mathrm{GSNR} = 10\log\frac{P_s}{\gamma_v} = 10\log\left(\frac{1}{\gamma_v N}\sum_{n=1}^{N}|s(n)|^2\right) \qquad (22)$$

Figure 3. The stability and convergence of the algorithms based on SOS and FLOS.


For a finite sample realization, the GSNR can be computed as

$$\mathrm{GSNR} \approx 10\log\frac{\sum_{n=1}^{N}|s(n)|^2}{\sum_{n=1}^{N}|v(n)|^2} \qquad (23)$$

Since an S$\alpha$S random variable is characterized by the two parameters $\alpha$ and $\gamma$, in each experiment we use the GSNR to describe the signal-to-noise ratio. The simulations implemented the algorithms based on FLOS and on conventional SOS, respectively.

Experiment 1: Suppose a linear microphone array with five sensors; random audio time-domain signals of a piano and a bird enter the array from different directions. Moreover, alpha-stable impulsive noise with $\alpha = 1.7$ is present at the array at the same time. The GSNR is 20 dB. Two algorithms are used in the experiment: (1) SOS with the non-linear function $g(t) = \tanh(t)$, and (2) FLOS with $g(t) = |t|^{p-2}\,\mathrm{conj}(t)$. The resulting time-domain waveforms are shown in Figure 4, where (a) and (b) are the source signals, (c) and (d) the separated signals from the SOS algorithm, and (e) and (f) the separated signals from the FLOS algorithm.

Figure 4. The source and separated signals in the time domain.

Experiment 2: We repeat the simulations with a GSNR of 20 dB. Two independent sources are linearly mixed. One is the periodic noise-free brain evoked potential (EP) signal, with a period of 128 points and a sampling frequency of 1000 Hz. The other is alpha-stable non-Gaussian noise with $\alpha = 1.7$. Two algorithms are used in the experiment: (1) SOS with the non-linear function $g(t) = \tanh(t)$, and (2) FLOS with $g(t) = |t|^{p-2}\,\mathrm{conj}(t)$. The resulting time-domain signals are shown in Figure 5, where (a) and (b) are the source signals, (c) and (d) the separated signals with SOS, and (e) and (f) the separated signals with FLOS. For the FLOS algorithm, the correlation coefficient between the separated and source EP signals is −0.9213, and that between the separated and source alpha-stable non-Gaussian noises is −0.9098.

Experiment 3: The mixed EP signal and noise are separated again with the new FLOS algorithm and the conventional SOS algorithm, respectively. The results of 10 independent experiments are shown in Figure 6 and Table I. The correlation coefficients of the EP signal and of the noise are calculated

Figure 5. Separating results: (a), (b) are the source signals; (c), (d) are the separated signals with SOS; and(e), (f) are the separated signals with FLOS.


at selected iteration counts. From Table I we see that the performance of the new algorithm is better than that of the conventional algorithm.

7. CONCLUSION

This paper briefly introduced the statistical characteristics of the stable distribution and proposed a blind source separation method based on fractional lower-order statistics in non-Gaussian

Figure 6. The correlation coefficients of EP and noise.

Table I. Comparison between the two algorithms.

                   Correlation coefficient (FLOS)     Correlation coefficient (SOS)
Iterations n          EP          Noise                  EP          Noise
50                  0.1244       0.1044               0.0044       0.0004
100                −0.3450      −0.3050              −0.0050      −0.0063
150                 0.4378       0.4378               0.1378       0.1072
200                 0.6766       0.7706               0.1716       0.1212
250                −0.9291      −0.9091              −0.1711      −0.1451
300                −0.9287      −0.9107              −0.3937      −0.2231
350                −0.9293      −0.9113              −0.4993      −0.2923
400                 0.9295       0.9195               0.3945       0.3045
450                 0.9299       0.9292               0.2935       0.1935
500                −0.9501      −0.9593              −0.2804      −0.1904


impulsive noise environments. Furthermore, we have analysed its stability and convergence performance.

In our simulations, we compared the performance of the FLOS algorithm with that of the SOS algorithm. From the above simulations, we can draw the following conclusion: the proposed class of networks and the BSS algorithm based on FLOS are more robust than the conventional algorithm based on SOS, so that the separation capability is greatly improved under both Gaussian and fractional lower-order stable distribution noise environments.

REFERENCES

1. Nikias CL, Shao M. Signal Processing with Alpha-Stable Distributions and Applications. Wiley: New York, 1995.
2. Shao M, Nikias CL. Signal processing with fractional lower order moments: stable processes and their applications. Proceedings of the IEEE 1993; 81(7):986–1010.
3. Karhunen J, Oja E, Wang L, Vigario R, Joutsensalo J. A class of neural networks for independent component analysis. IEEE Transactions on Neural Networks 1997; 8(3):67–78.
4. Zhang Y, Ma Y. CGHA for principal component extraction in the complex domain. IEEE Transactions on Neural Networks 1997; 8(5):105–117.
5. Karhunen J, Joutsensalo J. Nonlinear generalizations of principal component learning algorithms. Proceedings of the International Joint Conference on Neural Networks, vol. 3, 1993.
6. Wang L, Karhunen J, Oja E. A bigradient optimization approach for robust BSS, MCA, and source separation. Proceedings of the IEEE International Conference on Neural Networks, vol. 4, 1995.
7. Winter S, Sawada H, Makino S. Geometrical understanding of the BSS subspace method for over-determined blind source separation. IEEE Transactions on Acoustics, Speech, and Signal Processing 2003; 13(6):112–126.
8. Mutihac R, Van H. BSS and ICA neural implementations for source separation: a comparative study. Proceedings of the International Joint Conference on Neural Networks, vol. 1, 2003; 20–24.
9. Szu H, Hsu C. Unsupervised neural network learning for blind sources separation. Proceedings of the 5th Brazilian Symposium on Neural Networks, 1998.
10. Diamantaras KI. Asymmetric BSS neural networks for adaptive blind source separation. Proceedings of the 1998 IEEE Signal Processing Society Workshop, 1998; 103–112.
11. Kidmose P. Blind separation of heavy tail signals. Ph.D. thesis, IMM, Technical University of Denmark, Lyngby, 2001.
12. Sahmoudi M, Abed-Meraim K, Benidir M. Blind separation of impulsive alpha-stable sources using minimum dispersion criterion. IEEE Signal Processing Letters 2005; 12(4):281–284.
