A Clustering Technique for Digital Communications


    570 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 4, NO. 4, JULY 1993

A Clustering Technique for Digital Communications Channel Equalization Using Radial Basis Function Networks

Sheng Chen, Member, IEEE, Bernard Mulgrew, Member, IEEE, and Peter M. Grant, Senior Member, IEEE

Abstract: The paper investigates the application of a radial basis function network to digital communications channel equalization. It is shown that the radial basis function network has an identical structure to the optimal Bayesian symbol-decision equalizer solution and, therefore, can be employed to implement the Bayesian equalizer. The training of a radial basis function network to realize the Bayesian equalization solution can be achieved efficiently using a simple and robust supervised clustering algorithm. During data transmission a decision-directed version of the clustering algorithm enables the radial basis function network to track a slowly time-varying environment. Moreover, the clustering scheme provides an automatic compensation for nonlinear channel and equipment distortion. This represents a radically new approach to adaptive equalizer design. Computer simulations are included to illustrate the analytical results.

I. INTRODUCTION

HIGH-speed communications channels are often impaired by channel intersymbol interference and additive noise. Adaptive equalizers are required in these communications systems to obtain reliable data transmission. A discrete-time model of a digital communications system is depicted in Fig. 1, where a digital sequence s(t) is transmitted through a dispersive channel with transfer function

    H(z) = Σ_{i=0}^{n_h} h_i z^{-i}.    (1)

The transmitted symbol sequence s(t) is assumed to be an equiprobable and independent binary sequence taking values from {±1}. The channel output is corrupted by an additive white Gaussian noise e(t). The task of the equalizer is to recover the transmitted symbols based on the channel observation y(t).

From estimation theory, it is known that the best performance is obtained by detecting the entire transmitted sequence using the maximum likelihood sequence estimator (MLSE) [1], [2]. Adaptive MLSE is implemented in the form of a channel estimator and a Viterbi algorithm. The high complexity and deferred decisions associated with the MLSE are however often unacceptable in many practical communications systems. Most practical equalizers therefore employ an

Manuscript received April 29, 1991; revised May 5, 1992. This work was supported by the U.K. Science and Engineering Research Council under Award GR/E10357. The authors are with the Department of Electrical Engineering, University of Edinburgh, Kings Buildings, Edinburgh EH9 3JL, Scotland. IEEE Log Number 9202044.

Fig. 1. Discrete-time model of the data transmission system.

architecture of making decisions symbol by symbol. Symbol-decision equalizers can further be classified into two categories according to whether they estimate a channel model explicitly. In a typical direct-modeling equalizer, a channel model is identified explicitly. The transmitted symbols are treated as states and a Kalman filter is used to estimate these states [3]. The indirect-modeling approach recovers the transmitted symbols by filtering the channel observations, usually using an adaptive linear filter [4], without estimating a channel model explicitly. This is by far the most widely used equalizer structure and it is the one considered in the present study.

The structure of the symbol-decision and indirect-modeling equalizer is shown in Fig. 2. The operation of the equalizer at each sample t is based on the m most recent channel observations and a decision is made regarding the transmitted symbol at sample t - τ, where the integers m and τ are known as the equalizer order and delay, respectively. How the m channel observations are processed determines the performance and the complexity of the equalizer. The indirect-modeling approach is sometimes referred to as inverse modeling because, traditionally, the equalization problem is viewed as an inverse filtering in which the equalizer forms an approximation to the inverse of the distorting channel [4]. From this viewpoint, the filter within the architecture of Fig. 2 becomes linear, and the resulting equalizer is called a linear transversal equalizer (LTE). This viewpoint however has certain shortcomings. Firstly, it completely ignores the fact that s(t) is binary. It ought to exploit this information to the benefit of performance, as is the case in the MLSE. The inverse modeling also implies that increasing the equalizer order m should lead to a more accurate approximation and hopefully better equalization performance. This is however not true, due to noise enhancement. Previous research [5], [6] has demonstrated that the LTE does not achieve the full performance potential of the given symbol-decision structure in Fig. 2. Better performance can be obtained if some more complex filtering method is employed.

1045-9227/93$03.00 © 1993 IEEE

CHEN et al.: CHANNEL EQUALIZATION USING RADIAL BASIS FUNCTION NETWORKS

Fig. 2. Architecture of the symbol-decision equalizer without estimating a channel model.

The use of the multilayer perceptron [5] and the polynomial filter [6] as equalizers can achieve significant performance improvement over the LTE. This is because these two nonlinear equalizers are able to approximate the optimal symbol-decision equalizer solution implicitly. The multilayer perceptron, however, has problems of slow convergence and unpredictable solutions during training, while the polynomial equalizer suffers the drawback of an exponentially increasing filter dimension. Furthermore, it is not known how to specify the parameters of these two nonlinear equalizers even when the channel transfer function, the symbol and the noise statistics are given. Their parameters can only be set by experiment in an ad hoc manner. This is in contrast to the LTE, whose parameters are completely specified by the Wiener filter solution when the channel statistics are provided.

The present study proposes a novel strategy for adaptive equalizer design based on the radial basis function (RBF) network [7]-[9]. The optimal solution for the symbol-decision architecture of Fig. 2 is first derived using the Bayes decision theory [10], [11]. It is then shown that this optimal Bayesian equalizer solution has an identical structure to the RBF network. For a known channel, therefore, all the parameters of the RBF network are specified explicitly, giving rise to precisely the Bayesian solution. Moreover, the training of an RBF network to realize the optimal equalizer can be carried out very efficiently by exploiting the underlying data structure. This involves a supervised clustering to position the RBF centers at the desired channel output states. Because noisy channel observations form Gaussian clusters and the means of these data clusters are the desired channel states, rapid convergence is guaranteed. The supervised clustering algorithm is very simple and robust, and it represents a radical departure from traditional training methods, which are mostly based on minimizing the mean square error between the desired filter output and the actual filter output. Often communications channels are time-varying and, during data transmission, the desired channel output states will be changing. The RBF network can track these changes using a decision-directed clustering algorithm. Similar to the multilayer perceptron and the polynomial equalizers, the adaptive RBF equalizer is capable of compensating nonlinear distortion. Compared with the two previous nonlinear equalizers, however, the adaptive

RBF equalizer has significant performance and implementation advantages because its structure is explicitly equivalent to the underlying optimal Bayesian solution. Computer simulation results are used to demonstrate the optimal performance of RBF equalizers.

II. OPTIMAL SYMBOL-DECISION EQUALIZER

The general symbol-decision equalizer depicted in Fig. 2 is characterized by the equalizer order m and delay τ. For the general channel of n_h + 1 taps given in (1), there are n_s = 2^{n_h + m} combinations of the channel input sequence

    s(t) = [s(t) s(t - 1) ... s(t - m + 1 - n_h)]^T.    (2)

This gives rise to n_s points or values of the noise-free channel output vector

    ŷ(t) = [ŷ(t) ... ŷ(t - m + 1)]^T.    (3)

These points will be referred to as the desired channel states, and they can be partitioned into two classes according to the value of s(t - τ):

    Y_{m,τ}^+ = {ŷ(t) | s(t - τ) = +1}
    Y_{m,τ}^- = {ŷ(t) | s(t - τ) = -1}.    (4)

The two sets Y_{m,τ}^+ and Y_{m,τ}^- contain the information of the channel transfer function, the symbol statistics and the equalizer constraints. Each desired state y_i^+ ∈ Y_{m,τ}^+ or y_j^- ∈ Y_{m,τ}^- has an a priori probability of appearance p_i. Under the previous assumptions on symbol statistics, all the desired states have the same probability of appearance p_i = 1/n_s. The numbers of the states in Y_{m,τ}^+ and Y_{m,τ}^- are denoted as n_s^+ and n_s^-, respectively. Because of the additive white Gaussian noise, the noisy observation vector

    y(t) = [y(t) ... y(t - m + 1)]^T    (5)

is a random process having conditional Gaussian density functions centered at each of the desired channel states. It is apparent that channel observations form clusters and the means of these data clusters are the desired states. Determining the value of the transmitted symbol s(t - τ) based on the observation vector (5) is a decision problem. The Bayes decision theory [10], [11] provides the optimal solution to the general decision problem and, therefore, can be employed to derive the optimal solution for the general equalizer of Fig. 2. It is straightforward to verify that this optimal Bayesian equalizer solution is defined as [6]

    ŝ(t - τ) = sgn(f_B(y(t)))    (6)

with the optimal Bayesian filter or decision function given by

    f_B(y(t)) = Σ_{i=1}^{n_s^+} p_i^+ exp(-||y(t) - y_i^+||² / 2σ_e²) - Σ_{j=1}^{n_s^-} p_j^- exp(-||y(t) - y_j^-||² / 2σ_e²)    (7)


where the first sum is over y_i^+ ∈ Y_{m,τ}^+, the second sum is over y_j^- ∈ Y_{m,τ}^-, and σ_e² is the noise variance.

The optimal equalizer solution clearly depends on the noise distribution as well as the desired channel states. Multiplying f_B(y(t)) by a positive constant does not change the optimality. What is critical is the optimal decision boundary defined by

    f_B(y(t)) = 0.    (8)

It partitions the observation space into two decision regions corresponding to the two decisions ŝ(t - τ) = ±1. Because the Bayesian decision function (7) is nonlinear, the optimal boundary (8) is a hypersurface in the observation space. The decision boundary of any linear equalizer is a hyperplane in the observation space. Therefore a performance gap always exists between the LTE and the optimal equalizer. For equiprobable symbols, the coefficients in the optimal filter (7) become redundant and can be removed, giving rise to the following simpler form of the optimal decision function:

    f_B(y(t)) = Σ_{i=1}^{n_s^+} exp(-||y(t) - y_i^+||² / 2σ_e²) - Σ_{j=1}^{n_s^-} exp(-||y(t) - y_j^-||² / 2σ_e²).    (9)

An example is given to illustrate these results. For the purpose of graphical display, the equalizer order is chosen as m = 2. Let the channel transfer function be

    H(z) = 0.5 + 1.0z^-1.    (10)

All the combinations of s(t) and the desired channel states are listed in Table I. The states of Y_{2,1}^+ and Y_{2,1}^- are also plotted in Fig. 3 using squares and crosses, respectively. When ŷ(t) is at a particular state, the observation vector y(t) is a stochastic process having a Gaussian density function with a mean equal to the given state and a variance equal to that of the noise. For a noise variance of σ_e² = 0.125, 1000 samples of y(t) are plotted in Fig. 3 using dots. It is seen that observations form clusters around the desired channel states. The optimal decision boundary computed using (9) is a curve, i.e., a hypersurface in the two-dimensional observation space. When a y(t) is observed in the right-hand region of the boundary, the decision ŝ(t - 1) = 1 is made. If the observation vector y(t) appears in the left-hand region of the boundary, ŝ(t - 1) = -1 is made. This way of making decisions is optimal because it produces the minimum average error probability or bit error rate.

III. THE RADIAL BASIS FUNCTION NETWORK

Consider the RBF network [7]-[9], which is a two-layered processing structure depicted in Fig. 4. The hidden layer consists of an array of computing units. Each unit contains a parameter vector called a center, and the unit calculates the squared distance between the center and the network input vector. The squared distance is then divided by a parameter called a width and the result is passed through a nonlinear function. The second layer is essentially a linear combiner

TABLE I
CHANNEL INPUTS AND DESIRED CHANNEL STATES. H(z) = 0.5 + 1.0z^-1, m = 2 AND τ = 1

No.  s(t)  s(t-1)  s(t-2)  ŷ(t)   ŷ(t-1)
 1     1      1       1     1.5     1.5
 2     1      1      -1     1.5    -0.5
 3    -1      1       1     0.5     1.5
 4    -1      1      -1     0.5    -0.5
 5     1     -1       1    -0.5     0.5
 6     1     -1      -1    -0.5    -1.5
 7    -1     -1       1    -1.5     0.5
 8    -1     -1      -1    -1.5    -1.5

Fig. 3. Desired states, data clusters, and optimal decision boundary. H(z) = 0.5 + 1.0z^-1, SNR = 10 dB, and 1000 samples of y(t).

with a set of connection weights. The overall response of the RBF network is the mapping

    f_R(y) = Σ_{i=1}^{n} w_i φ(||y - c_i||² / ρ_i)    (11)

where n is the number of computing units, c_i are the RBF centers, ρ_i are the widths of the units and w_i are the weights. Comparing the network response (11) with the optimal equalizer filter (7), it is obvious that they have the same structure. The RBF network is therefore an ideal processing means to implement the optimal Bayesian equalizer. Given channel statistics, it is known exactly how to specify all the parameters of the RBF network. The number of hidden units n is equal to the number of the desired channel states and the RBF centers are placed at these desired states. The nonlinear function φ is obviously chosen as the exponential function

    φ(v) = exp(-v)    (12)

and all the widths have the same value ρ, which is twice as large as the noise variance. Each hidden unit then implements a component conditional density function in (7) and, when the
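As a concrete illustration of (11) with the exponential unit (12), the following Python sketch evaluates an RBF network configured as the Bayesian equalizer for the example of Table I: centers at the eight desired states, weights fixed to ±1 according to the sign of s(t - 1), and width ρ = 2σ_e². The names are illustrative, not from the paper.

```python
import math

def rbf_response(y, centers, weights, rho):
    """RBF network mapping (11) with the Gaussian nonlinearity (12):
    f(y) = sum_i w_i * exp(-||y - c_i||^2 / rho)."""
    out = 0.0
    for c, w in zip(centers, weights):
        d2 = sum((a - b) ** 2 for a, b in zip(y, c))
        out += w * math.exp(-d2 / rho)
    return out

# Centers at the desired states of Table I; weights +1 for the states
# of Y+ (rows 1-4) and -1 for the states of Y- (rows 5-8).
centers = [(1.5, 1.5), (1.5, -0.5), (0.5, 1.5), (0.5, -0.5),
           (-0.5, 0.5), (-0.5, -1.5), (-1.5, 0.5), (-1.5, -1.5)]
weights = [1, 1, 1, 1, -1, -1, -1, -1]
rho = 2 * 0.125     # rho = 2 * sigma_e^2

print(rbf_response((1.4, 1.6), centers, weights, rho) > 0)    # True
```

The sign of this response reproduces the Bayesian decision (6) for the equiprobable case.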


Fig. 4. Schematic of the radial basis function network.

weights are further set to the corresponding coefficients in (7), the RBF network realizes precisely the Bayesian equalizer. For the case of equiprobable symbols, the network can be simplified considerably by fixing half of the weights to 1 and the other half to -1. In this case the second layer becomes a summer.

Whether an RBF network can realize the optimal equalizer solution depends crucially on whether the centers can be positioned correctly at the desired channel states. The width ρ is a less influential parameter. It is not too difficult to see that ρ need not be accurately set to 2σ_e². As far as the decision boundary is concerned, the influences of ρ on the hidden units cancel each other out to a certain extent. This is important because in practice only an estimate of the noise variance is available. Ideally n should be equal to n_s, and this requires a correct estimation of the channel order n_h. If the estimate n̂_h is larger than the true value n_h, more centers than really required will be employed. Apart from introducing unnecessary computations, this will not affect the performance of the RBF network. For example, if n̂_h = n_h + 1, the number of centers is twice as many as the desired channel states, and a pair of two centers will converge to each desired state. This still results in the optimal equalization solution. If however n̂_h is smaller than n_h, fewer centers than desired states will be used and some performance loss can be expected. These discussions will be further illustrated later using simulation.
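The claim that an overestimated channel order is harmless can be checked directly: duplicating every center, as happens when n̂_h = n_h + 1 and center pairs converge to the same state, doubles the response (11) but cannot change its sign. A small sketch with invented names and an arbitrary four-center network:

```python
import math

def rbf_sign(y, centers, weights, rho):
    """Sign of the RBF response (11) with exponential units (12)."""
    s = sum(w * math.exp(-sum((a - b) ** 2 for a, b in zip(y, c)) / rho)
            for c, w in zip(centers, weights))
    return 1 if s > 0 else -1

centers = [(1.5, 1.5), (0.5, -0.5), (-0.5, 0.5), (-1.5, -1.5)]
weights = [1, 1, -1, -1]

# Duplicated centers and weights model a pair of centers converging to
# each desired state: the response doubles, the decision is unchanged.
dup_centers, dup_weights = centers * 2, weights * 2

probe = [(0.3, -0.1), (-0.2, 0.4), (1.0, 1.0), (-1.0, -0.4)]
print(all(rbf_sign(p, centers, weights, 0.25) ==
          rbf_sign(p, dup_centers, dup_weights, 0.25) for p in probe))   # True
```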

IV. SUPERVISED LEARNING

In reality the channel transfer function is unavailable. An efficient learning strategy is necessary in order for an RBF network to learn the optimal equalizer solution. In the case of equiprobable symbols, the weights of the network can be fixed and learning involves finding the desired channel states so that the centers of the network can be positioned at these states. Denote the n_s combinations of s(t) as s_i, 1 ≤ i ≤ n_s. During the training period, transmitted symbols are known to the equalizer. At each sample t, it can be inferred from s(t) which member of the desired states occurs. Furthermore,


TABLE II
COMPLEXITY OF SUPERVISED CLUSTERING (13) FOR AN m-INPUT, n-UNIT RBF NETWORK

m multiplications
m divisions
m + 1 additions

TABLE III
COMPLEXITY OF SUPERVISED LMS (15) FOR AN m-INPUT, n-UNIT RBF NETWORK

n × m + 2n + 1 multiplications
n divisions
2n × m + n additions
n evaluations of exp(·)

as mentioned previously, noisy observations form Gaussian clusters centered at the desired states. This suggests that a supervised k-means clustering procedure can effectively filter out the noise so that the RBF centers converge to the desired states. The computational procedure of this clustering algorithm is summarized as follows:

    if (s(t) == s_i) {
        c_i(t) = counter_i * c_i(t - 1) + y(t);
        counter_i = counter_i + 1;
        c_i(t) = c_i(t) / counter_i;
    }    (13)

Because of the underlying data structure, a rapid convergence of this supervised clustering procedure is guaranteed. The computational requirement of this supervised clustering algorithm is given in Table II. For nonstationary channels, the following adaptive version of (13) is preferred:

    if (s(t) == s_i) {
        c_i(t) = c_i(t - 1) + g_c * (y(t) - c_i(t - 1));
    }    (14)

where g_c is the learning rate for centers. This version of the supervised clustering algorithm is simpler than (13). It should be emphasized that the algorithm (13) or (14) is all that is required to train an RBF network in the case of equiprobable symbols.

If the assumption of equiprobable symbols is violated, it is advisable to adjust the weights of the network so that the network can learn the general equalizer solution (7). The adaptation of the weights is achieved using the following supervised least mean square (LMS) algorithm:

    ε(t) = s(t - τ) - f_R(y(t))
    w_i(t) = w_i(t - 1) + g_w ε(t) exp(-||y(t) - c_i(t)||² / ρ),  1 ≤ i ≤ n    (15)

where g_w is the learning rate for weights. The computational complexity of this LMS algorithm is listed in Table III. The properties of the LMS algorithm are well understood [12].


Knowledge of the noise variance σ_e² and the channel order n_h are required to specify the network structure, and simple techniques can be employed to provide their estimates. If the centers are positioned correctly at the desired channel states and the ith desired state appears at sample t, it is easy to verify that the noise variance is given by

    σ_e² = E[||y(t) - c_i||²] / m    (16)

where E[·] denotes the expectation operator. This suggests a simple estimator for the noise variance. If s(t) = s_i at sample t, the noise variance estimate is adjusted according to

    σ̂_e²(t) = ((t - 1)σ̂_e²(t - 1) + ||y(t) - c_i(t)||² / m) / t.    (17)

Under the assumptions introduced in Section I, the autocorrelations of y(t) satisfy

    γ_k = E[y(t)y(t - k)] = Σ_{i=0}^{n_h - k} h_i h_{i+k} + σ_e² δ(k),  0 ≤ k ≤ n_h;   γ_k = 0,  k > n_h.    (18)

Note that γ_{n_h} ≠ 0 if h_0 ≠ 0. The channel order n_h can therefore be inferred from the autocorrelations of y(t). This involves the calculation of the normalized sample autocorrelations of y(t)

    r̂_k = Σ_{t=k+1}^{N} (y(t) - ȳ)(y(t - k) - ȳ) / Σ_{t=1}^{N} (y(t) - ȳ)²    (19)

where N is the number of samples, and ȳ is the sample mean of y(t)

    ȳ = N^{-1} Σ_{t=1}^{N} y(t).    (20)

An r̂_k is regarded as significant if it is outside the 95% confidence bands ±1.96 N^{-1/2}. The last significant sample autocorrelation provides an estimate n̂_h for the channel order.

V. DECISION-DIRECTED LEARNING

For time-varying channels, the channel states also become time-varying. An RBF network must have the ability to cope with this situation. During data transmission, supervised learning no longer applies and adaptation has to rely on unsupervised or decision-directed learning. The unsupervised k-means clustering procedure is often employed as part of the general learning algorithm to adjust RBF centers [13], [14]. This involves computing the squared distance between the centers and the network input vector, selecting a minimum squared distance, and moving the corresponding center closer to the input vector. This unsupervised clustering scheme can also be used in the current application for the RBF network to track time-varying channel states. The computational procedure of this unsupervised clustering is as follows:

    d_i(t) = ||y(t) - c_i(t - 1)||²,  1 ≤ i ≤ n;
    k = arg[min{d_i(t),  1 ≤ i ≤ n}];
    c_k(t) = c_k(t - 1) + g_c (y(t) - c_k(t - 1));
    c_i(t) = c_i(t - 1),  1 ≤ i ≤ n,  i ≠ k.    (21)

If it is also necessary to adapt the weights during data transmission, a decision-directed LMS algorithm can be incorporated into the adaptive algorithm. The decision-directed LMS algorithm is identical to the LMS algorithm (15) except that the error signal ε(t) is derived as the difference between the estimated symbol ŝ(t - τ) and the network response.

For the equalization application, the unsupervised clustering algorithm (21) is unnecessarily complex and a decision-directed version of the supervised clustering algorithm (14) provides a simpler alternative. The basic idea is to infer the state membership using estimated symbols. Because of the decision delay τ, at sample t the algorithm actually determines the state membership at t - τ, and the computational procedure of this decision-directed clustering algorithm is as follows:

    if (ŝ(t - τ) == s_i) {
        c_i(t) = c_i(t - 1) + g_c * (y(t - τ) - c_i(t - 1));
    }    (22)

where

    ŝ(t - τ) = [ŝ(t - τ) ... ŝ(t - τ - m + 1 - n_h)]^T.    (23)

After the initial training and in normal operation, the equalizer decisions are correct with high probability. This ensures that inferring state memberships from estimated symbols is correct often enough to allow a rapid convergence of the centers to the channel states. This algorithm is all that is required to track variations in the channel characteristics during data transmission in the case of equiprobable symbols.

VI. A COMPARISON WITH OTHER NONLINEAR EQUALIZERS

The approach reported in this study represents a radically new way of thinking about adaptive equalizer design. Traditional adaptive algorithms for equalizers are based on the criterion of minimizing the mean square error between the desired filter output and the actual filter output; that is, these learning algorithms adjust the filter parameters to achieve a minimum of the criterion

    J = E[(s(t - τ) - f(y(t)))²].    (24)

The ultimate performance criterion for an equalizer is however the bit error rate, and there exists no clear relationship between the mean square error criterion (24) and the bit error rate criterion. The optimal Bayesian equalizer (7) or (9) will not necessarily produce a good mean square error performance and yet it provides the minimum average bit error rate achievable under the general structure of Fig. 2. The two nonlinear equalizers proposed in [5] and [6] rely on the mean square error criterion


(24) for training. Because the multilayer perceptron is highly nonlinear in its parameters, its error surface (24) is very complicated. Therefore, training times for multilayer perceptron equalizers are typically very long and the adaptive algorithm may become trapped at bad local minima. An alternative nonlinear equalizer approximates the Bayesian solution (7) using a polynomial filter [6]. The polynomial equalizer suffers the drawback of an exponentially increasing filter dimension. A consequence of this is that the autocorrelation matrix of the input vector to the adaptive algorithm often becomes very ill-conditioned and has a large eigenvalue spread. Therefore, it is necessary to pass the output of the polynomial filter through a sigmoid function in order to alleviate this numerical difficulty. The training of the RBF network as an equalizer is based on a method that exploits the underlying data-generating mechanism. The channel outputs form clusters in the observation space. The clustering algorithm positions the network centers at the means of the data clusters efficiently. This approach is guaranteed to converge rapidly and it is directly linked to the bit error rate performance, since the trained RBF network will realize explicitly the Bayesian solution.

It is interesting to compare the two different strategies employed by the RBF network and those equalizers requiring a channel estimator. The MLSE identifies the channel model directly while the RBF equalizer identifies the end results of the channel. This has an important implication for channels involving nonlinear distortion. When nonlinear distortion is taken into account, the general channel model can be defined as

    y(t) = f_h(s(t), s(t - 1), ..., s(t - n_h); Θ) + e(t)    (25)

where f_h is some nonlinear function and Θ is a channel parameter vector. The identification of this nonlinear channel model requires specifying the nonlinear structure as well as estimating the channel parameters, which is a very difficult task, particularly in the context of real-time adaptation. The present approach avoids all these difficulties and can straightforwardly be applied to equalization involving nonlinear channels. Because the clustering algorithm always converges to the set of the desired channel states, the RBF network can achieve the full Bayesian performance regardless of whether the channel is linear or nonlinear. Difficulties in on-line identification of a nonlinear channel model, on the other hand, should not be underestimated. Even when the nonlinear form f_h is given, it is not always possible to identify all the parameters in the model unless the system input signal is persistently exciting [15]. For a linear channel model, persistent excitation means that s(t) should contain sufficient frequency components. Since s(t) is generally white, it is an ideal input signal for identifying the linear channel model (1). For a nonlinear channel model, however, persistent excitation requires the additional condition that s(t) should cover a sufficient range of amplitudes. The binary nature of s(t) therefore represents a worst-case scenario and, as a consequence, parameters in some nonlinear channel models may not be identifiable. This can be demonstrated by considering the

Fig. 5. Comparison of decision boundaries. Dotted: optimal, solid: RBF network, +: centers, H(z) = 0.5 + 1.0z^-1, SNR = 10 dB, m = 2, τ = 1, ρ = 2σ_e², and n̂_h = 1.

following nonlinear channel model:

    y(t) = Σ_{i1=1}^{3} h_{i1} s(t - i1) + Σ_{i1=1}^{3} Σ_{i2=i1}^{3} h_{i1,i2} s(t - i1)s(t - i2)
           + Σ_{i1=1}^{3} Σ_{i2=i1}^{3} Σ_{i3=i2}^{3} h_{i1,i2,i3} s(t - i1)s(t - i2)s(t - i3) + e(t).    (26)

The input vector to the adaptive channel estimator consists of the 19 monomials s(t - i1), s(t - i1)s(t - i2), and s(t - i1)s(t - i2)s(t - i3). The rank of the 19 × 19 autocorrelation matrix of the estimator input vector, however, is only 8. It is therefore impossible to identify all 19 parameters in (26).

VII. SIMULATION STUDY

In all the results, s(t) was an equiprobable random sequence taking values from {±1}. Therefore the weights of the RBF network were fixed and learning only involved supervised or decision-directed clustering to position the centers at the desired channel states.

For the same system shown in Fig. 3, 160 samples of training data were used to train the RBF network using (13). Initially, correct estimates of the channel order and the noise variance were assumed, giving rise to ρ = 2σ_e² and n̂_h = 1. The trained RBF network produced the decision boundary shown in Fig. 5. It is seen that convergence of the RBF centers to the desired channel states was achieved and the supervised clustering algorithm performed well even though the noise level was very high. For better signal-to-noise ratio (SNR) conditions, the number of training samples can be reduced.


The influence of the width ρ was next investigated. The width ρ was set to 4σ_e² and σ_e², respectively. The first case represented an estimate of the noise variance that was twice as large as the true value and, in the second case, the estimate was only half of the true noise variance. These two RBF networks were trained and they produced decision boundaries which were practically the same as that obtained using the true noise variance shown in Fig. 5. This confirms that the performance of RBF equalizers is relatively insensitive to the width ρ. Obviously the width ρ has no effect on the supervised clustering learning for centers. The simple variance estimator (17) was also incorporated into the training procedure to estimate the noise variance. After 160 samples of training it produced σ̂_e² = 0.105 compared with the true value σ_e² = 0.125. This estimate was apparently accurate enough for the RBF network to realize the optimal decision boundary.

The channel (10) used in the previous simulation has order n_h = 1. Assume that it was wrongly estimated as n̂_h = 2. This resulted in 16 centers compared with 8 desired channel states. The positions of these 16 centers obtained after 160 samples of supervised clustering learning are plotted in Fig. 6. As expected, two centers converged to each desired state and the optimality of the RBF equalizer was maintained. Another channel

    H(z) = 1.0 + 0.8z^-1 + 0.5z^-2    (27)

was used to test the case of n̂_h < n_h. Again a poor SNR of 10 dB was chosen and 160 samples were used in training. This was a channel of order n_h = 2. Suppose that an incorrect estimate n̂_h = 1 was provided. This gave rise to an RBF network with 8 centers compared with 16 desired channel states. Also a wrong estimate of the noise variance was used by setting ρ = 4σ_e². The result obtained is shown in Fig. 7. It is interesting to see that each center converged to the mean of two desired states. Judging from the decision boundaries depicted in Fig. 7, only a small performance loss occurred in this case. The estimator (17) produced σ̂_e² = 0.281 compared with the true noise variance σ_e² = 0.189. This estimate is more accurate than the value actually used in the network.

A third channel with a more realistic equalizer order was used to study the performance of the RBF network under a variety of SNR's. The channel transfer function was given by

    H(z) = 0.3482 + 0.8704z^-1 + 0.3482z^-2.    (28)

The equalizer order and delay were chosen as m = 4 and τ = 1, respectively. Correct estimates of the channel order and the noise variance were assumed. This gave rise to a network of 64 centers. 640 samples of training data were used in the supervised clustering procedure (13). Bit error rates of the optimal Bayesian equalizer and the trained RBF network are depicted in Fig. 8, where it is seen that the RBF equalizer achieved the optimal performance. The performance of the order-4 Wiener filter is also plotted in Fig. 8 as a comparison. The performance of the Wiener filter is the best that an LTE can hope to realize. For this example, order 4 is also the best

Fig. 6. Comparison of decision boundaries. Dotted: optimal, solid: RBF network, +: centers, H(z) = 0.5 + 1.0z^-1, SNR = 10 dB, m = 2, τ = 1, ρ = 2σ_e², and n̂_h = 2.

Fig. 7. Comparison of decision boundaries. Dotted: optimal, solid: RBF network, +: centers, H(z) = 1.0 + 0.8z^-1 + 0.5z^-2, SNR = 10 dB, m = 2, τ = 0, ρ = 4σ_e², and n̂_h = 1.

choice for the Wiener filter. Higher order Wiener filters cannot improve performance and may even worsen the results due to the noise enhancement, as can be seen from Fig. 9. The RBF equalizer reported a 4.4 dB improvement in SNR over the Wiener filter. Note that for SNR > 15 dB, fewer than 300 samples are sufficient to train the network. The noise variances were also estimated using (17) and the results are summarized in Table IV. When the width was set to ρ = 2σ̂_e², an identical performance to the case of ρ = 2σ_e² was produced by the RBF network.

Fig. 8. Comparison of performance. H(z) = 0.3482 + 0.8704z^-1 + 0.3482z^-2, m = 4 and τ = 1.

Fig. 9. Performance versus Wiener filter order. H(z) = 0.3482 + 0.8704z^-1 + 0.3482z^-2, and τ = 1.

TABLE IV
TRUE NOISE VARIANCES AND ESTIMATED NOISE VARIANCES

σ_e²   0.3162  0.1778  0.1000  0.0562  0.0316  0.0178  0.0100
σ̂_e²   0.2319  0.1304  0.0733  0.0412  0.0232  0.0131  0.0073

The ability of the RBF network to cope with nonlinear distortion was demonstrated using the following nonlinear channel:

    y(t) = x(t) + 0.2x²(t) - 0.1x³(t) + e(t)
    X(z)/S(z) = 0.3482 + 0.8704z^-1 + 0.3482z^-2.    (29)

Fig. 10 shows the performance curves of the optimal Bayesian equalizer with m = 4 and τ = 1 and the order-4 Wiener filter with τ = 1. The supervised clustering algorithm (13) was used to train an RBF equalizer of 64 centers with a training sequence of 320 samples. The trained RBF equalizer

[Figure 10]

Fig. 10. Comparison of performance. Nonlinear channel (29), m = 4 and τ = 1.

produced a performance curve closely matched to that of the Bayesian equalizer. For all of the above examples, a combined learning of supervised clustering and LMS was also performed to train the centers and the weights simultaneously. The performances obtained were identical to the results reported here. This is not surprising, and it confirms that for equiprobable symbols the training of the centers is sufficient for the network to realize the optimal equalizer solution. A further example demonstrated the tracking performance of the RBF network using the decision-directed clustering algorithm (22). The channel was time-varying with the transfer function

H(z) = 1.0 + (0.2 + 0.0015t)z^-1          (30)

and the noise variance was chosen as 0.01(1.0 + (0.2 + 0.0015t)^2) to provide a constant SNR of 20 dB. Correct estimates of σ_e² and n_h were assumed. The center trajectories obtained using the decision-directed clustering algorithm with a learning rate g_c = 0.2 are plotted in Fig. 11. It is clear from Fig. 11 that good tracking was achieved. Finally, autocorrelations of channel observations were computed using (19) for several channels, and the results are given in Fig. 12. For all four examples, the channel orders were correctly revealed from their normalized sample autocorrelations.
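The order-detection step just described can be sketched as follows. This is an illustrative reconstruction, not the paper's computation (19): for an FIR channel the autocorrelation of the observations vanishes beyond the channel memory, so the first lag whose normalized sample autocorrelation falls inside a confidence band reveals the number of channel taps. The helper names, the loose band of 0.14 (standing in for the exact 5% band), and the noise level are assumptions.

```python
import random

def sample_autocorr(y, max_lag):
    """Normalized sample autocorrelation r_k / r_0 of the observations."""
    n = len(y)
    r = [sum(y[t] * y[t - k] for t in range(k, n)) / n for k in range(max_lag + 1)]
    return [rk / r[0] for rk in r]

def estimate_channel_taps(y, max_lag=8, band=0.14):
    """First lag whose normalized autocorrelation falls inside the band;
    for an FIR channel this equals the number of taps n_h."""
    rho = sample_autocorr(y, max_lag)
    for k in range(1, max_lag + 1):
        if abs(rho[k]) < band:
            return k
    return max_lag

# Demo on channel (a) of Fig. 12, H(z) = 1.0 + 0.5z^-1, with small additive noise.
random.seed(0)
s = [random.choice((-1.0, 1.0)) for _ in range(2001)]
y = [s[t] + 0.5 * s[t - 1] + random.gauss(0.0, 0.1) for t in range(1, 2001)]
n_taps = estimate_channel_taps(y)
```

For this two-tap channel the lag-1 autocorrelation is large (about 0.4 after normalization) while lag 2 and beyond fluctuate around zero, so the estimate returns 2 taps.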

VIII. CONCLUDING REMARKS

A practical application of the RBF network to the equalization of digital communications channels has been reported. A main contribution of this study is the derivation of the structural equivalence between the optimal Bayesian solution of the symbol-decision equalizer and the RBF network. This explains clearly why neural network equalizers outperform the traditional linear equalizer, and it highlights the advantages of the RBF equalizer over other nonlinear equalizers. It has been demonstrated that the training of the RBF network to realize the optimal equalizer solution can be achieved efficiently using


[Figure 11]

Fig. 11. Tracking performance of decision-directed clustering. H(z) = 1.0 + (0.2 + 0.0015t)z^-1, SNR = 20 dB, o: desired state trajectories, m = 2 and τ = 0.

[Figure 12: (a) H(z) = 1.0 + 0.5z^-1; (b) H(z) = 1.0 + 0.5z^-1 + 0.5z^-2; (c) H(z) = 1.0 + 0.8z^-1 + 0.5z^-2; (d) H(z) = 1.0 + 1.0z^-1 - 1.0z^-2 + 1.0z^-3]

Fig. 12. Autocorrelations of channel observations. - -: 5% confidence band, SNR = 10 dB and 200 samples of observations.

the supervised clustering algorithm. For nonstationary channels, a decision-directed version of the clustering algorithm can be employed to provide tracking ability for the RBF network during data transmission.
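A decision-directed tracking step of the kind described above can be sketched as follows. This is an illustrative reconstruction, not the paper's exact update (22): the detected channel state is taken here to be the nearest center, and the helper names are assumptions; the learning rate g_c = 0.2 matches the tracking example.

```python
def nearest_center(centers, y):
    """Index of the center closest (squared Euclidean distance) to y."""
    dists = [sum((yi - ci) ** 2 for yi, ci in zip(y, c)) for c in centers]
    return dists.index(min(dists))

def decision_directed_update(centers, y, g_c=0.2):
    """Move only the detected (nearest) center a step g_c toward y."""
    k = nearest_center(centers, y)
    centers[k] = [ci + g_c * (yi - ci) for ci, yi in zip(centers[k], y)]
    return k

def channel_states(a):
    """Eight noise-free states of H(z) = 1.0 + a*z^-1 for m = 2, tau = 0:
    pairs (y(t), y(t-1)) generated by the binary symbol triple."""
    return [[s0 + a * s1, s1 + a * s2]
            for s0 in (-1.0, 1.0) for s1 in (-1.0, 1.0) for s2 in (-1.0, 1.0)]
```

Initializing the centers with channel_states(0.2) and repeating the update as the tap 0.2 + 0.0015t drifts lets the centers follow the moving states, which is the behavior plotted in Fig. 11.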

A useful technique to improve equalization performance is to use decision feedback [4]. After the submission of this paper, some new results have been obtained by incorporating decision feedback into the RBF network to improve performance and to minimize processing complexity [16]. For some communications systems, digital symbols and channels are represented in complex-valued forms. Research has been continuing into complex-valued neural network structures involving complex signals (e.g., [17], [18]). The results presented in this study can readily be extended to the general case of complex-valued symbols and channels by employing the complex-valued RBF network of [18].


ACKNOWLEDGMENT

The authors acknowledge stimulating discussions with Prof. C. Cowan and Dr. S. McLaughlin on the topics reported in this study.

REFERENCES

[1] G. D. Forney, "Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference," IEEE Trans. Inform. Theory, vol. IT-18, pp. 363-378, 1972.
[2] F. R. Magee and J. G. Proakis, "Adaptive maximum-likelihood sequence estimation for digital signaling in the presence of intersymbol interference," IEEE Trans. Inform. Theory, vol. IT-19, pp. 120-124, 1973.
[3] B. Mulgrew and C. F. N. Cowan, Adaptive Filters and Equalisers. Boston, MA: Kluwer Academic, 1988.
[4] S. U. H. Qureshi, "Adaptive equalization," Proc. IEEE, vol. 73, pp. 1349-1387, 1985.
[5] G. J. Gibson, S. Siu, and C. F. N. Cowan, "Application of multilayer perceptrons as adaptive channel equalisers," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Glasgow, Scotland, 1989, pp. 1183-1186.
[6] S. Chen, G. J. Gibson, and C. F. N. Cowan, "Adaptive channel equalisation using a polynomial-perceptron structure," Proc. Inst. Elec. Eng., vol. 137, pt. I, no. 5, pp. 257-264, 1990.
[7] D. S. Broomhead and D. Lowe, "Multivariable functional interpolation and adaptive networks," Complex Systems, vol. 2, pp. 321-355, 1988.
[8] T. Poggio and F. Girosi, "Networks for approximation and learning," Proc. IEEE, vol. 78, pp. 1481-1497, 1990.
[9] S. Chen, C. F. N. Cowan, and P. M. Grant, "Orthogonal least squares learning algorithm for radial basis function networks," IEEE Trans. Neural Networks, vol. 2, pp. 302-309, 1991.
[10] D. F. Specht, "Generation of polynomial discriminant functions for pattern recognition," IEEE Trans. Electron. Comput., vol. EC-16, pp. 308-319, 1967.
[11] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[12] B. Widrow and S. D. Stearns, Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[13] J. Moody and C. J. Darken, "Fast learning in networks of locally-tuned processing units," Neural Computation, vol. 1, no. 2, pp. 281-294, 1989.
[14] S. Chen, S. A. Billings, and P. M. Grant, "Recursive hybrid algorithm for nonlinear system identification using radial basis function networks," Int. J. Control, vol. 55, no. 5, pp. 1051-1070, 1992.
[15] I. J. Leontaritis and S. A. Billings, "Experimental design and identifiability for nonlinear systems," Int. J. Systems Sci., vol. 18, pp. 189-202, 1987.
[16] S. Chen, B. Mulgrew, and S. McLaughlin, "Adaptive Bayesian decision feedback equaliser based on a radial basis function network," in Proc. ICC '92, Chicago, IL, 1992, vol. 3, pp. 343.3.1-343.3.5.
[17] T. L. Clarke, "Generalization of neural networks to the complex plane," in Proc. Int. Joint Conf. Neural Networks, San Diego, CA, 1990, vol. II, pp. 435-440.
[18] S. Chen, S. McLaughlin, and B. Mulgrew, "Complex-valued radial basis function network, Part I: Network architecture and learning algorithms," EURASIP Signal Processing J., submitted for publication, 1992.

Sheng Chen (M'90) was born in Fujian Province, China, in 1957. He received the B.Sc. degree in control engineering from the East China Petroleum Institute in 1982 and the Ph.D. degree in control engineering from the City University, London, in 1986.
From 1986 to 1989 he was a postdoctoral research associate in the Department of Control Engineering at the University of Sheffield. He joined the Signal Processing Group in the Department of Electrical Engineering at the University of Edinburgh in 1989 as a research fellow. His research interests are in modeling and identification of nonlinear systems, artificial neural networks, and adaptive signal processing for communications.


    CHEN et al.: CHANNEL EQUALIZATION USING FUNCTION NETWORKS 579

Bernard Mulgrew (M'88) received the Ph.D. degree in 1987 from the University of Edinburgh.
He worked for four years in the Radar Systems Department at Ferranti Defence Systems, Edinburgh, Scotland, after graduation in 1979. From 1983 to 1986 he was a research associate in the Department of Electrical Engineering at the University of Edinburgh, studying the performance and design of adaptive filter algorithms. He is currently a lecturer in the Department of Electrical Engineering, University of Edinburgh, where he gives courses on signals and systems, and signal processing. His research interests are in adaptive signal processing and estimation theory and in their application to radar and communications systems. He is a coauthor of two books on signal processing.
Dr. Mulgrew is an associate member of the IEE and a member of the Audio Engineering Society.

Peter M. Grant (M'77-SM'83) was born in St. Andrews, Scotland, in 1944. He received the B.Sc. degree in electronic engineering from Heriot-Watt University, Edinburgh, Scotland, in 1966, and the Ph.D. degree from the University of Edinburgh in 1975.
From 1966 to 1970 he was with Plessey Company Ltd., before he was appointed to a research fellowship at the University of Edinburgh to study the applications of surface acoustic wave and charge-coupled devices to communication systems. He was subsequently appointed to a lectureship and promoted through to a personal chair in Electronic Signal Processing, with responsibility for teaching signal processing and communication systems. He leads the signal processing research group, with personal involvement in adaptive filtering and pattern recognition. In 1985-1986 he was appointed as a visiting staff member at the MIT Lincoln Laboratory.
Dr. Grant was the recipient of a James Caird Travelling Scholarship during the academic year 1977-1978 and, as a visiting assistant professor, he researched at the Ginzton Laboratory, Stanford University. He is a fellow of the Institution of Electrical Engineers (London). He serves there as one of the honorary editors of the IEE Proceedings Part F, Radar and Signal Processing, as the senior editor and chairman of the IEE Proceedings Editorial Panel, and as a panel member for the Electronics and Communications Engineering Journal.