
Systems & Control Letters 45 (2002) 347–356
www.elsevier.com/locate/sysconle

Blind channel identification via stochastic approximation: constant step-size algorithms

G. Yin^{a,1,∗}, Han-Fu Chen^{b,2}

^a Department of Mathematics, Wayne State University, Detroit, MI 48202, USA
^b Laboratory of Systems and Control, Institute of Systems Science, Chinese Academy of Sciences, Beijing 100080, China

Received 23 January 2001; received in revised form 25 October 2001

Abstract

This work is devoted to the study of a class of recursive algorithms for blind channel identification. Using weak convergence methods, the convergence of the algorithm is obtained and the rate of convergence is ascertained. The technique discussed can also be used in the analysis of rates of convergence for decreasing step-size algorithms. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Blind identification; Stochastic approximation; Noisy channel

1. Introduction

Recent advances in wireless communication stimulate much of the research effort in blind channel identification and related areas; see, for example, [6,9–11,13,16–20]. There are mainly two approaches, namely, deterministic methods and stochastic methods. The off-line algorithms developed to date first collect a block of sampled data and then process these data to obtain estimates of the channel parameters. In the deterministic setup, the estimation problem is solved by finding solutions of a set of linear equations. For example, the method discussed in [20] relies on least-squares techniques for noisy channels and requires that a large amount of sampled data be used, which makes the computation intensive. In a recent paper [5], we proposed a class of recursive stochastic algorithms based on the deterministic model developed in [20]. The motivation of our approach stems from stochastic approximation methods; see [1,3,4,15]. One of the features of our algorithms is that, at each step, the received data are used directly to update the estimates of the channel parameters. This differs from statistical estimation schemes that require accurate estimates of quantities such as moments, which in turn need a large amount of sampled data. The stochastic approximation algorithms are thus advantageous compared to the off-line procedures.

In [5], the convergence of the algorithm was derived for decreasing step sizes, together with simulation studies. In this paper, we study algorithms with a constant step size. Constant-step algorithms are often used in actual computation since they are easily implementable and since they have the ability to track slight variations of the true parameter.

∗ Corresponding author. Tel.: +1-313-577-2496; fax: +1-313-577-7596.
E-mail addresses: [email protected] (G. Yin), [email protected] (H.-F. Chen).

1 Supported in part by the National Science Foundation.
2 Supported in part by the National Key Project of China and the National Natural Science Foundation of China.



In addition to convergence, we also derive the convergence speed. Note that both the decreasing step-size algorithm and the constant step-size algorithm are represented by difference equations, and hence are discrete-time stochastic systems.

The rest of the paper is arranged as follows. Section 2 presents the formulation and the algorithm. Section 3 is devoted to the convergence of the algorithm. Since a constant parameter is used, the pertinent notion of convergence is weak convergence. Section 4 examines the rates of convergence. Finally, Section 5 concludes the paper with a few more remarks.

2. Problem formulation

Consider a discrete-time input–output system having $p$ FIR channels with maximum channel order $L$, in which, for discrete time $k = 1, 2, \ldots$, $\{s_k\}$ is a sequence of one-dimensional input signals and $\{x_k\} = \{(x_k^{(1)}, x_k^{(2)}, \ldots, x_k^{(p)})'\}$ is a sequence of $p$-dimensional output signals. Here and hereafter, $z'$ denotes the transpose of $z$, and $z^{(i)}$ denotes the $i$th component of $z$. Then,

$x_k = \sum_{j=0}^{L} h_j s_{k-j}, \qquad k \ge L,$   (2.1)

where $h_j = (h_j^{(1)}, \ldots, h_j^{(p)})'$. Eq. (2.1) can be written as

$x_k^{(i)} = h_k^{(i)} * s_k,$   (2.2)

where $*$ denotes convolution.

In [5], both noise-free and noisy observations were considered, and both deterministic signals and random inputs were treated. In this paper, we concentrate on the case in which the input signals are random and the observations $y_k$ are corrupted by noise, i.e.,

$y_k = x_k + n_k,$   (2.3)

where $n_k$ is a $p$-dimensional random vector. The problem is to estimate $h_j$, $j = 0, \ldots, L$, based on the sequence of observations $\{y_k\}$. Note that $s_k$, $x_k$, $n_k$, and $y_k$ can be complex valued. Nevertheless, throughout the paper we consider real-valued processes for simplicity.

The channels can be characterized by a $p(L+1)$-dimensional vector $h^*$. Define

$h^{(i)} = (h_0^{(i)}, \ldots, h_L^{(i)})', \qquad h^* = (h^{(1)\prime}, \ldots, h^{(p)\prime})'.$   (2.4)

To approximate $h^*$, we develop a sequence of estimates $\{z_k\}$ recursively. Our goal is to design an adaptive algorithm so that the estimates $z_k$ are updated online and the sequence $\{z_k\}$ converges to $h^*$ in an appropriate sense.
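To make the setup concrete, the following Python sketch generates data from the model (2.1)–(2.3) for a small example. The number of channels, channel order, coefficients, input distribution, and noise level are illustrative choices of ours, not values taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-channel FIR system (p = 2, L = 2); the coefficients are arbitrary.
p, L = 2, 2
h = np.array([[1.0, 0.5, 0.25],    # h^(1) = (h_0^(1), h_1^(1), h_2^(1))
              [0.8, -0.3, 0.1]])   # h^(2)
num_samples = 500
s = rng.choice([-1.0, 1.0], size=num_samples)   # i.i.d. binary input signal s_k
noise_std = 0.05

def observe(h, s, noise_std, rng):
    """Return noise-free x_k and noisy y_k = x_k + n_k, with x_k = sum_j h_j s_{k-j} for k >= L."""
    p, L = h.shape[0], h.shape[1] - 1
    N = len(s)
    x = np.zeros((N, p))
    for k in range(L, N):
        x[k] = h @ s[k - L:k + 1][::-1]         # sum_{j=0}^{L} h_j s_{k-j}
    n = noise_std * rng.standard_normal((N, p))
    return x, x + n

x, y = observe(h, s, noise_std, rng)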

2.1. Formulation

Let

$\psi_k^{(i)} = (y_k^{(i)} \cdots y_{k-L}^{(i)})', \qquad \varphi_k^{(i)} = (x_k^{(i)} \cdots x_{k-L}^{(i)})', \qquad i = 1, \ldots, p, \ k \ge 2L,$   (2.5)

where $y_k^{(i)}$ and $x_k^{(i)}$ are the $i$th components of $y_k$ and $x_k$, respectively. Also denote

$X_L^{(i)} = (\varphi_{2L}^{(i)\prime}, \ldots, \varphi_N^{(i)\prime})', \qquad i = 1, \ldots, p,$


which is an $(N - 2L + 1) \times (L + 1)$ matrix, where $N$ is the number of samples. From (2.2), we have (see [20]) that

$h_k^{(i_2)} * x_k^{(i_1)} = h_k^{(i_1)} * x_k^{(i_2)}, \quad \text{or equivalently,} \quad (X_L^{(i_1)}, -X_L^{(i_2)}) \begin{pmatrix} h^{(i_2)} \\ h^{(i_1)} \end{pmatrix} = 0.$   (2.6)

An $(N - 2L + 1)[p(p-1)/2] \times [(L+1)p]$ matrix

$X_L = \begin{pmatrix}
X_L^{(2)} & -X_L^{(1)} & 0 & \cdots & \cdots & 0 \\
X_L^{(3)} & 0 & -X_L^{(1)} & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\
X_L^{(p)} & 0 & 0 & \cdots & 0 & -X_L^{(1)} \\
0 & X_L^{(3)} & -X_L^{(2)} & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & & \vdots \\
0 & X_L^{(p)} & 0 & \cdots & 0 & -X_L^{(2)} \\
\vdots & & & & & \vdots \\
0 & \cdots & \cdots & 0 & X_L^{(p)} & -X_L^{(p-1)}
\end{pmatrix}$   (2.7)

is defined in [20]. In view of (2.6) and (2.7), $X_L h^* = 0$. It is proposed in [20] to solve this equation for $h^*$ in the noise-free case, or, when the observations are corrupted by noise, to consider the constrained least-squares problem $\min_h |X_L h|^2$ subject to $|h| = 1$.

Similarly to (2.7), we define two $[p(p-1)/2] \times [p(L+1)]$ matrices $\Psi_k$ and $\Phi_k$ by replacing all the matrix entries $X_L^{(i)}$ in $X_L$ with $\psi_k^{(i)}$ and $\varphi_k^{(i)}$, respectively. Therefore, $X_L$ is $(N - 2L + 1)$ times as large as $\Psi_k$ (or $\Phi_k$). Note that $\Phi_k$ (or $\Psi_k$) contains the observations $x_j$ in the noise-free case (or $y_j$ for noisy observations) in a window of size $L + 1$ back from time instant $k$ (i.e., $j = k, k-1, \ldots, k-L$); these are the observations that are related to the signal $s_{k-L}$. It is worth noting that, in contrast to $X_L$, neither $\Psi_k$ nor $\Phi_k$ depends on $N$.
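As an illustration of the construction of $\Psi_k$ (and $\Phi_k$), the following Python sketch stacks, for each channel pair $(i_1, i_2)$ with $i_1 < i_2$, a row whose block $i_1$ holds the data vector of channel $i_2$ and whose block $i_2$ holds the negated data vector of channel $i_1$, mirroring the block pattern of (2.7); the function names and array layout are ours, chosen only for the example.

import numpy as np

def data_vec(obs, k, L, i):
    """psi_k^(i) (or varphi_k^(i)): (obs[k, i], obs[k-1, i], ..., obs[k-L, i])."""
    return obs[k - L:k + 1, i][::-1]

def cross_relation_matrix(obs, k, L):
    """Psi_k when obs holds noisy y, Phi_k when obs holds noise-free x; shape [p(p-1)/2] x [p(L+1)]."""
    p = obs.shape[1]
    rows = []
    for i1 in range(p):
        for i2 in range(i1 + 1, p):
            row = np.zeros(p * (L + 1))
            row[i1 * (L + 1):(i1 + 1) * (L + 1)] = data_vec(obs, k, L, i2)
            row[i2 * (L + 1):(i2 + 1) * (L + 1)] = -data_vec(obs, k, L, i1)
            rows.append(row)
    return np.vstack(rows)

With noise-free data and $k \ge 2L$, cross_relation_matrix(x, k, L) applied to the stacked channel vector $h^*$ of (2.4) is zero up to round-off, which is the relation $\Phi_k h^* = 0$ used below.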

2.2. Algorithm

Let $\varepsilon > 0$ be a small parameter. We consider a constant step-size algorithm with initial value $z_{2L-1}^\varepsilon \ne 0$, and

$z_{k+1}^\varepsilon = z_k^\varepsilon - \varepsilon(\Psi_{k+1}'\Psi_{k+1} - E\Delta_{k+1}'\Delta_{k+1}) z_k^\varepsilon \quad \text{for } k \ge 2L,$   (2.8)

where $\Delta_k = \Psi_k - \Phi_k$ for $k \ge 2L$. Assume that the following conditions hold:

(A1) The polynomials $h_i(z) \triangleq h_0^{(i)} + h_1^{(i)} z + \cdots + h_L^{(i)} z^L$, $i = 1, \ldots, p$, have no common factor. For a fixed finite positive integer $N$, the linear complexity of the finite sequence $\{s_j, j = 1, \ldots, N\}$ is greater than or equal to $2L + 1$. For the notion of linear complexity, see [5,20] and the references therein.

(A2) $\{s_k\}$ and $\{n_k\}$ are mutually independent, and each of them is a sequence of mutually independent random variables such that, for some $\delta > 0$, $E n_k = 0$, $\sup_k E|n_k|^{2+\delta} < \infty$, and $\sup_k E|s_k|^{2+\delta} < \infty$.
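A minimal Python sketch of recursion (2.8) follows. Here build_Psi is a hypothetical callback returning $\Psi_{k+1}$ built from the noisy observations (for instance via the cross_relation_matrix sketch above), and E_noise_gram stands for the bias-correction term $E\Delta_{k+1}'\Delta_{k+1}$, assumed known or estimated beforehand; both names are ours, not the paper's.

import numpy as np

def sa_step(z, Psi_next, E_noise_gram, eps):
    """One step of (2.8): z <- z - eps * (Psi'Psi - E[Delta'Delta]) z."""
    return z - eps * (Psi_next.T @ Psi_next - E_noise_gram) @ z

def run_constant_step(build_Psi, E_noise_gram, z0, eps, num_steps):
    """Drive (2.8) for num_steps iterations with a constant step size eps."""
    z = z0.copy()
    trajectory = [z.copy()]
    for k in range(num_steps):
        z = sa_step(z, build_Psi(k), E_noise_gram, eps)
        trajectory.append(z.copy())
    return np.array(trajectory)

In the noise-free case one may simply set E_noise_gram to the zero matrix, so that the drift matrix reduces to $\Phi_{k+1}'\Phi_{k+1}$, whose null space contains $h^*$ (Remark 2.1 below).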


Remark 2.1. Under (A1), it follows from [20, Theorem 1] that $h^*$ is the (unique up to a scalar multiple) nonzero vector simultaneously satisfying $\Phi_k h^* = 0$ w.p.1 for $k \ge 2L$. As a result, we also have $E\Phi_k h^* = 0$ and $\Phi_k'\Phi_k h^* = 0$. Since $h^*$ is a nonzero vector, $\Phi_k$ does not have full rank; neither does $\Phi_k'\Phi_k$.

3. Convergence

Define

$\xi_k = \Psi_k'\Psi_k - E\Psi_k'\Psi_k \quad \text{and} \quad B = E(\Psi_k'\Phi_k + \Phi_k'\Psi_k - \Phi_k'\Phi_k).$   (3.1)

Then (2.8) becomes

$z_{k+1} = z_k - \varepsilon\xi_{k+1}z_k - \varepsilon B z_k, \qquad k \ge 2L.$   (3.2)

Remark 3.1. In view of the contents of $\Psi_k$ and $\Phi_k$, they involve mainly the terms $y_k$ and $x_k$, respectively. By virtue of the definitions of $y_k$ and $x_k$ and the independence and moment assumptions (see (A2)), we have, for each $i_1$ and each $i_2$,

$E x_k^{(i_1)} y_k^{(i_2)} = E\left(\sum_{\ell=0}^{L} h_\ell^{(i_1)} s_{k-\ell}\right)\left(x_k^{(i_2)} + n_k^{(i_2)}\right) = \sum_{\ell=0}^{L}\sum_{\nu=0}^{L} h_\ell^{(i_1)} h_\nu^{(i_2)} E s_{k-\ell} s_{k-\nu}.$

That is, the noise, having zero mean and being independent of the signal, does not contribute to the cross moments between the entries of $\Psi_k$ and $\Phi_k$, so $E\Psi_k'\Phi_k = E\Phi_k'\Psi_k = E\Phi_k'\Phi_k$. It follows that

$B = E\Phi_k'\Phi_k.$   (3.3)
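As a quick numerical sanity check of the moment computation above (our own illustration, with arbitrary channel coefficients and i.i.d. unit-variance signal and zero-mean noise), a Monte Carlo estimate of $E x_k^{(i_1)} y_k^{(i_2)}$ can be compared with the double sum, which collapses to $\sum_{\ell} h_\ell^{(i_1)} h_\ell^{(i_2)}$ in this i.i.d. case:

import numpy as np

rng = np.random.default_rng(1)
L = 2
h1 = np.array([1.0, 0.5, 0.25])    # h^(i1), arbitrary
h2 = np.array([0.8, -0.3, 0.1])    # h^(i2), arbitrary
num_trials, noise_std = 200_000, 0.1

s = rng.choice([-1.0, 1.0], size=(num_trials, L + 1))   # columns: (s_k, s_{k-1}, ..., s_{k-L})
x1 = s @ h1                                             # x_k^(i1)
x2 = s @ h2                                             # x_k^(i2)
y2 = x2 + noise_std * rng.standard_normal(num_trials)   # y_k^(i2) = x_k^(i2) + n_k^(i2)

print(np.mean(x1 * y2), h1 @ h2)   # the two values agree up to Monte Carlo error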

We proceed to analyze the asymptotics of $\{z_k\}$ and establish a convergence result by using weak convergence methods. The notion of weak convergence of probability measures is an extension of convergence in distribution. Let $\zeta_k$ and $\zeta$ be random variables living in a complete separable metric space. We say that $\zeta_k$ converges weakly to $\zeta$ if and only if, for any bounded and continuous function $f(\cdot)$, $Ef(\zeta_k) \to Ef(\zeta)$ as $k \to \infty$. The sequence $\{\zeta_k\}$ is said to be tight if for each $\eta > 0$ there is a compact set $K_\eta$ such that $P(\zeta_k \in K_\eta) \ge 1 - \eta$ for all $k$. Note that the index $k$ above can be replaced by $\varepsilon > 0$. A result known as Prohorov's theorem states that on a complete separable metric space, tightness is equivalent to relative compactness (i.e., each sequence has a convergent subsequence with limit contained in the underlying space). Using this theorem, we are able to extract convergent subsequences once tightness is verified. Further details on weak convergence methods can be found in, for example, [8, Chapter 3] or [15, Chapters 7 and 8]. In lieu of dealing with the discrete-time algorithm (3.2) directly, we work with a sequence of piecewise constant interpolations. Define $z^\varepsilon(\cdot)$ as $z^\varepsilon(t) = z_k$ for $t \in [\varepsilon k, \varepsilon k + \varepsilon)$. Then $z^\varepsilon(\cdot) \in D^p[0,\infty)$, the space of functions that are right continuous and have left limits, endowed with the Skorohod topology. Next, we present a convergence theorem.

Theorem 3.2. Suppose the initial data $z_{2L-1}^\varepsilon$ converge weakly to $z_0$ and that (A1) and (A2) hold. Then $\{z^\varepsilon(\cdot)\}$ is tight in $D^p[0,\infty)$, and $z^\varepsilon(\cdot)$ converges weakly to $z(\cdot)$, a solution of the differential equation

$\dot z = -Bz, \qquad z(0) = z_0.$   (3.4)

Moreover, for any $t_\varepsilon \to \infty$ as $\varepsilon \to 0$,

$\lim_{\varepsilon \to 0} P(\operatorname{dist}(z^\varepsilon(t_\varepsilon + \cdot), Z) \ge \eta) = 0$   (3.5)

for any $\eta > 0$, where $\operatorname{dist}(z, Z) = \inf\{|z - y|: y \in Z\}$ is the usual distance and $Z = \{z \in \mathbb{R}^p: Bz = 0\}$.


Remark 3.3. Note that $E\Phi_k'\Phi_k$ is symmetric and nonnegative definite. It does not have full rank, but the nonzero eigenvalues are positive. This allows us to get an a priori bound on the solution of (3.4). In fact, since $z = \exp(-Bt)z_0$, we have, for any $t \in [0,\infty)$, $|z| \le |\exp(-Bt)||z_0| < \infty$. Thus the solution of (3.4) is bounded for all $t \in [0,\infty)$. In view of the condition, $Z$ is an invariant set of (3.4) and is the null space of $B$.
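The ODE limit can be illustrated numerically: for the noiseless recursion $z_{k+1} = z_k - \varepsilon B z_k$ (the mean dynamics of (3.2)), the interpolated iterates stay close to $z(t) = \exp(-Bt)z_0$ at $t = k\varepsilon$. The matrix $B$ below is an arbitrary symmetric nonnegative definite stand-in of our choosing, not one derived from channel data.

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
G = rng.standard_normal((4, 6))
B = G.T @ G                       # symmetric, nonnegative definite, rank-deficient (rank 4 of 6)

eps, num_steps = 1e-3, 5000
z0 = rng.standard_normal(6)
z = z0.copy()
errors = []
for k in range(num_steps):
    z = z - eps * B @ z                               # noiseless version of (3.2)
    errors.append(np.linalg.norm(z - expm(-B * (k + 1) * eps) @ z0))

print(max(errors))                # small: the interpolation tracks dz/dt = -B z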

Proof. We will use a truncation device; see [15, p. 288]. Let $M$ be a fixed but otherwise arbitrary positive real number and $S_M = \{z \in \mathbb{R}^r: |z| \le M\}$. Use a smooth truncation function $q_M(\cdot)$ with compact support satisfying

$q_M(z) = \begin{cases} 1, & |z| \le M, \\ 0, & |z| \ge M + 1. \end{cases}$   (3.6)

Define a truncated version of (3.2) as

$z_k^M = z_k, \quad k = 0, \ldots, 2L - 1; \qquad z_{k+1}^M = z_k^M - \varepsilon[\xi_{k+1} z_k^M + B z_k^M] q_M(z_k^M), \quad k \ge 2L.$   (3.7)

That is, $\{z_k^M\}$ stops at the first exit from $S_{M+1}$ and equals the original sequence $\{z_k\}$ up to the first exit from $S_M$. Define an interpolation as $z^{\varepsilon,M}(t) = z_k^M$ for $t \in [k\varepsilon, k\varepsilon + \varepsilon)$. It then follows that $z^{\varepsilon,M}(t) = z^\varepsilon(t)$ up until the first exit from the $M$-sphere $S_M$. As a result, $z^{\varepsilon,M}(\cdot)$ is an $M$-truncation of $z^\varepsilon(\cdot)$ as defined in [15, p. 278]. We first show that the truncated process $\{z^{\varepsilon,M}(\cdot)\}$ converges weakly to $z^M(\cdot)$. Then, letting $M \to \infty$, we conclude that the untruncated process $z^\varepsilon(\cdot)$ also converges.

The first step is to prove the tightness of $z^{\varepsilon,M}(\cdot)$. By (A2), $\{\xi_k\}$ is uniformly integrable. This, together with the boundedness of $\{z_k^M\}$, implies that $\{z^{\varepsilon,M}(\cdot)\}$ is tight by Lemma 3.7 in [14, p. 51]. Moreover, as can be shown as in [15, pp. 223–224], the limits are Lipschitz continuous w.p.1.

Since $\{z^{\varepsilon,M}(\cdot)\}$ is tight, Prohorov's theorem (see [8, p. 103] or [15, p. 201]) allows us to extract a convergent subsequence. For simplicity, still denote the subsequence by $\{z^{\varepsilon,M}(\cdot)\}$, with limit $z^M(\cdot)$. By the Skorohod representation ([8, p. 102] or [15, p. 203]), without changing notation we may assume that $z^{\varepsilon,M}(\cdot) \to z^M(\cdot)$ w.p.1 and that the convergence is uniform on any bounded time interval. To characterize the limit process, we claim that

$\rho(t) \stackrel{\mathrm{def}}{=} z^M(t) - z_0 + \int_0^t B z^M(r) q_M(z^M(r))\, dr$   (3.8)

is a continuous martingale. To verify the martingale property, using [15, Section 7.4.1, p. 205], it suffices to show that for any bounded and continuous function $g(\cdot)$, any $t, s > 0$, any positive integer $\kappa$, and $t_i < t < t + s$, $i \le \kappa$,

$E\prod_{i=1}^{\kappa} g(z^M(t_i))\left(z^M(t+s) - z^M(t) + \int_t^{t+s} B z^M(r) q_M(z^M(r))\, dr\right) = 0.$

To proceed, we work with the process indexed by $\varepsilon$. By the weak convergence of $z^{\varepsilon,M}(\cdot)$ and the Skorohod representation,

$\lim_{\varepsilon\to 0} E\prod_{i=1}^{\kappa} g(z^{\varepsilon,M}(t_i))[z^{\varepsilon,M}(t+s) - z^{\varepsilon,M}(t)] = E\prod_{i=1}^{\kappa} g(z^M(t_i))[z^M(t+s) - z^M(t)].$   (3.9)


Let $\{m_\varepsilon\}$ be a sequence of positive integers satisfying $m_\varepsilon \to \infty$ but $\delta_\varepsilon = \varepsilon m_\varepsilon \to 0$ as $\varepsilon \to 0$. Then, using (3.7),

$E\prod_{i=1}^{\kappa} g(z^{\varepsilon,M}(t_i))[z^{\varepsilon,M}(t+s) - z^{\varepsilon,M}(t)]$
$\quad = E\prod_{i=1}^{\kappa} g(z^{\varepsilon,M}(t_i))\left[-\varepsilon\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\xi_{j+1}z_j^M q_M(z_j^M) - \varepsilon\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}B z_j^M q_M(z_j^M)\right]$
$\quad = E\prod_{i=1}^{\kappa} g(z^{\varepsilon,M}(t_i))\left[-\sum_{l=t/\delta_\varepsilon}^{(t+s)/\delta_\varepsilon}\delta_\varepsilon\frac{1}{m_\varepsilon}\sum_{j=lm_\varepsilon}^{lm_\varepsilon+m_\varepsilon-1}E_{lm_\varepsilon}\xi_{j+1}z_j^M q_M(z_j^M) - \sum_{l=t/\delta_\varepsilon}^{(t+s)/\delta_\varepsilon}\delta_\varepsilon\frac{1}{m_\varepsilon}\sum_{j=lm_\varepsilon}^{lm_\varepsilon+m_\varepsilon-1}B E_{lm_\varepsilon}z_j^M q_M(z_j^M)\right].$   (3.10)

Using the boundedness of $\{z^{\varepsilon,M}(\cdot)\}$, the smoothness of $q_M(\cdot)$, and the recursion (3.7), it is easily seen that

$\frac{1}{m_\varepsilon}\sum_{j=lm_\varepsilon}^{lm_\varepsilon+m_\varepsilon-1}E_{lm_\varepsilon}\xi_{j+1}z_j^M q_M(z_j^M) = \frac{1}{m_\varepsilon}\sum_{j=lm_\varepsilon}^{lm_\varepsilon+m_\varepsilon-1}E_{lm_\varepsilon}\xi_{j+1}z_{lm_\varepsilon}^M q_M(z_{lm_\varepsilon}^M) + o(1),$

where, as $\varepsilon \to 0$, $o(1) \to 0$ in probability uniformly in $t$. Note that $\{\xi_k\}$ is uniformly integrable. By the well-known law of large numbers,

$\frac{1}{m_\varepsilon}\sum_{j=lm_\varepsilon}^{lm_\varepsilon+m_\varepsilon-1}E_{lm_\varepsilon}\xi_{j+1} \to 0 \quad \text{in probability as } \varepsilon \to 0,$

so

$\frac{1}{m_\varepsilon}\sum_{j=lm_\varepsilon}^{lm_\varepsilon+m_\varepsilon-1}E_{lm_\varepsilon}\xi_{j+1}z_j^M q_M(z_j^M) \to 0 \quad \text{in probability as } \varepsilon \to 0.$   (3.11)

Using the same argument as in [15, Chapter 8], it can be shown that

$\lim_{\varepsilon\to 0} E\prod_{i=1}^{\kappa} g(z^{\varepsilon,M}(t_i))\left[-B\sum_{l=t/\delta_\varepsilon}^{(t+s)/\delta_\varepsilon}\delta_\varepsilon\frac{1}{m_\varepsilon}\sum_{j=lm_\varepsilon}^{lm_\varepsilon+m_\varepsilon-1}E_{lm_\varepsilon}z_j^M q_M(z_j^M)\right] = E\prod_{i=1}^{\kappa} g(z^M(t_i))\left(-\int_t^{t+s}B z^M(r)q_M(z^M(r))\, dr\right).$   (3.12)

Then [15, Section 7.4.1] shows that (3.8) is a martingale. The Lipschitz continuity of $z^M(\cdot)$ then yields that $\rho(\cdot)$ is Lipschitz continuous. Consequently, [15, Theorem 4.1.1, p. 70] implies that $\rho(\cdot)$ is a constant w.p.1. Since $\rho(0) = 0$, the constant must be 0. That is, $z^M(\cdot)$ satisfies the ordinary differential equation

$\dot z^M(t) = -B z^M(t) q_M(z^M(t)), \qquad z^M(0) = z_0.$

In fact, the above characterization holds for any weakly convergent subsequence. Since the limit does not depend on the extracted subsequence, the whole sequence $z^{\varepsilon,M}(\cdot)$ converges weakly to the same limit because of relative compactness. Note that the differential equation (3.4) is linear and hence has a unique solution for each initial condition. Using a similar argument as in [15, p. 220 and pp. 248–250], we conclude that the untruncated process $z^\varepsilon(\cdot)$ also converges weakly to $z(\cdot)$, a solution of (3.4). Moreover, it can be shown that (3.5) holds.


4. Convergence rate

The rate of convergence refers to the asymptotic properties of a suitably normalized sequence of errors $\{z_k - z^*\}$; see [4, Chapter 4], [7, Chapter 2], [15, Chapter 10], and the references therein. We seek $\alpha > 0$ such that $(z_k - z^*)/\varepsilon^\alpha$ converges to a nontrivial limit. To do so, we first establish an error bound, which indicates how the estimation error depends on the step size $\varepsilon$. As a consequence, tightness of a suitably scaled sequence of estimation errors is obtained. Next, by taking a continuous-time interpolation, we show that a suitably scaled sequence converges weakly to a diffusion process. It turns out that $\alpha = \frac{1}{2}$; the scaling factor $1/\sqrt{\varepsilon}$ together with the stationary covariance of the diffusion gives us the desired rate of convergence [15, Chapter 10]. Roughly, it will be shown that $(z_k - z^*)/\sqrt{\varepsilon}$ is asymptotically normal. Using weak convergence methods, a far-reaching result revealing the stochastic dynamic structure of the scaled sequence will be obtained. To proceed, we need another assumption.

(A3) As $\varepsilon \to 0$ and $t_\varepsilon \to \infty$, $z^\varepsilon(t_\varepsilon + \cdot)$ converges in probability to a $z^* \in Z$. The sequences $\{s_k\}$ and $\{n_k\}$ satisfy $\sup_k E(|s_k|^4 + |n_k|^4) < \infty$, and there is a symmetric nonnegative definite matrix $A$, independent of $k$, such that

$A = E \xi_k z^* z^{*\prime} \xi_k',$   (4.1)

where $\xi_k$ is given by (3.1).

Here the main issue is the convergence rate, so we have assumed convergence to a particular $z^* \in Z$. This convergence is in the sense that $z_k$ converges to $z^*$ as $\varepsilon \to 0$ and $k \to \infty$ simultaneously. Our objective can be stated as follows: find the limit distribution of a suitably normalized sequence $\{z_k - z^*\}$ conditioned on the weak convergence of $z^\varepsilon(t_\varepsilon + \cdot)$ to $z^*$; see [12]. That is, we aim to find the limit conditional distribution. Define $\tilde z_k = z_k - z^*$. Then (3.2) becomes

$\tilde z_{k+1} = \tilde z_k - \varepsilon\xi_{k+1}\tilde z_k - \varepsilon B\tilde z_k - \varepsilon\xi_{k+1}z^*.$   (4.2)

Using the quadratic function $V(z) = \frac{1}{2}z'z$, we obtain the following error bounds.

Theorem 4.1. Assume (A1)–(A3). Then, for sufficiently large $k$, $E|\tilde z_k|^2 = O(\varepsilon)$.

Proof. Since $\{\xi_k\}$ is a martingale difference sequence and $\tilde z_k$ is $\mathcal{F}_k$-measurable, where $\{\mathcal{F}_k\}$ is an increasing sequence of $\sigma$-algebras generated by $\{z_0, \xi_j: j \le k\}$,

$E_k\tilde z_k'\xi_{k+1}z^* = 0 \quad \text{and} \quad E\tilde z_k'\xi_{k+1}\tilde z_k = 0.$

It follows that

$E_k V(\tilde z_{k+1}) - V(\tilde z_k) = -\varepsilon\tilde z_k'B\tilde z_k + \frac{\varepsilon^2}{2}E_k(\xi_{k+1}\tilde z_k + B\tilde z_k + \xi_{k+1}z^*)'(\xi_{k+1}\tilde z_k + B\tilde z_k + \xi_{k+1}z^*).$   (4.3)

Note that

$E_k(\xi_{k+1}\tilde z_k + B\tilde z_k + \xi_{k+1}z^*)'(\xi_{k+1}\tilde z_k + B\tilde z_k + \xi_{k+1}z^*) = \tilde z_k'[E_k\xi_{k+1}'\xi_{k+1}]\tilde z_k + \tilde z_k'B'B\tilde z_k + 2\tilde z_k'[E_k\xi_{k+1}'\xi_{k+1}]z^* + z^{*\prime}[E_k\xi_{k+1}'\xi_{k+1}]z^* = O(1 + V(\tilde z_k)).$


For sufficiently small $\varepsilon > 0$, we can make $O(\varepsilon^2)V(\tilde z_k) \le \lambda\varepsilon V(\tilde z_k)/2$ for some $\lambda > 0$. Consequently, adding a perturbation $\varepsilon I$ to make $B + \varepsilon I$ positive definite, we obtain

$E_k V(\tilde z_{k+1}) - V(\tilde z_k) = -\varepsilon\tilde z_k'B\tilde z_k + O(\varepsilon^2)(1 + V(\tilde z_k)) = -\varepsilon\tilde z_k'(B + \varepsilon I)\tilde z_k + O(\varepsilon^2)(1 + V(\tilde z_k)) \le -\lambda\varepsilon V(\tilde z_k) + O(\varepsilon^2)(1 + V(\tilde z_k)) \le -\frac{\lambda\varepsilon}{2}V(\tilde z_k) + O(\varepsilon^2).$   (4.4)

Taking expectations and iterating on the resulting inequality, we obtain

$EV(\tilde z_{k+1}) \le \left(1 - \frac{\lambda\varepsilon}{2}\right)^k EV(\tilde z_0) + \sum_{j=0}^{k}\left(1 - \frac{\lambda\varepsilon}{2}\right)^j O(\varepsilon^2) = \left(1 - \frac{\lambda\varepsilon}{2}\right)^k EV(\tilde z_0) + O(\varepsilon).$   (4.5)

Choose $K_\varepsilon > 0$ such that $(1 - \lambda\varepsilon/2)^{K_\varepsilon} \le K\varepsilon$ for a constant $K$. Then, for all $k \ge K_\varepsilon$, $EV(\tilde z_k) = O(\varepsilon)$.

As a direct consequence of the above theorem, we obtain the following result.

Corollary 4.2. Define $u_k = (z_k - z^*)/\sqrt{\varepsilon}$. Under (A1)–(A3), $\{u_k: k \ge K_\varepsilon\}$ is tight.
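The $O(\varepsilon)$ mean-square error bound and the $\sqrt{\varepsilon}$ scaling can be illustrated on the error recursion (4.2) with synthetic ingredients: below, $B$ is replaced by a positive definite stand-in (the paper's $B$ is only nonnegative definite, so this is purely illustrative) and $\xi_{k+1}$ by i.i.d. zero-mean symmetric random matrices. All choices are ours.

import numpy as np

rng = np.random.default_rng(3)
d = 4
G = rng.standard_normal((d, d))
B = G.T @ G + 0.5 * np.eye(d)          # positive definite stand-in for B
z_star = rng.standard_normal(d)

def mean_square_error(eps, num_steps=5000, reps=100):
    """Average |z~_k|^2 at the end of a run of (4.2) with synthetic noise matrices."""
    errs = []
    for _ in range(reps):
        zt = rng.standard_normal(d)
        for _ in range(num_steps):
            W = rng.standard_normal((d, d))
            xi = 0.3 * (W + W.T)       # zero-mean stand-in for xi_{k+1}
            zt = zt - eps * (xi @ zt + B @ zt + xi @ z_star)
        errs.append(zt @ zt)
    return np.mean(errs)

for eps in (0.01, 0.005, 0.0025):
    print(eps, mean_square_error(eps) / eps)   # ratio roughly constant: E|z~_k|^2 = O(eps)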

The corollary indicates that, after an initial transient period, the scaled sequence $u_k$ settles down. Working with this sequence, define a piecewise constant interpolation $u^\varepsilon(t) = u_k$ for $t \in [\varepsilon(k - K_\varepsilon), \varepsilon(k - K_\varepsilon) + \varepsilon)$. We proceed to show that $u^\varepsilon(\cdot)$ converges weakly to a diffusion process. Again, we work with an $M$-truncation process. Define

$u_{K_\varepsilon}^M = u_{K_\varepsilon}, \qquad u_{k+1}^M = u_k^M - \varepsilon(\xi_{k+1}u_k^M + B u_k^M)q_M(u_k^M) - \sqrt{\varepsilon}\,\xi_{k+1}z^* \quad \text{for } k \ge K_\varepsilon,$   (4.6)

and its continuous-time interpolation $u^{\varepsilon,M}(t) = u_k^M$ for $t \in [k\varepsilon, k\varepsilon + \varepsilon)$. Then $u^{\varepsilon,M}(\cdot)$ is an $M$-truncation of $u^\varepsilon(\cdot)$ (see [15, p. 278]).

It is easily seen that for any $\delta > 0$, $t > 0$, and $\delta \ge s > 0$,

$u^{\varepsilon,M}(t+s) - u^{\varepsilon,M}(t) = -\varepsilon\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\xi_{j+1}u_j^M q_M(u_j^M) - \varepsilon\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}B u_j^M q_M(u_j^M) - \sqrt{\varepsilon}\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\xi_{j+1}z^*.$

Since $E\xi_j'\xi_k = 0$ if $j \ne k$, the orthogonality of $\{\xi_k\}$ then implies that

$E\left|\sqrt{\varepsilon}\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\xi_{j+1}z^*\right|^2 = \varepsilon\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\sum_{k=t/\varepsilon}^{(t+s)/\varepsilon-1}z^{*\prime}[E\xi_{j+1}'\xi_{k+1}]z^* = \varepsilon\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\operatorname{tr}A,$

where $A$ is given by (4.1). Thus

$\lim_{\delta\to 0}\limsup_{\varepsilon\to 0}E\left|\sqrt{\varepsilon}\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\xi_{j+1}z^*\right|^2 = \lim_{\delta\to 0}O(\delta) = 0.$


By virtue of the boundedness of $\{u_j^M\}$ and the orthogonality of $\{\xi_j\}$,

$E\left|\varepsilon\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\xi_{j+1}u_j^M q_M(u_j^M)\right|^2 = \varepsilon^2\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}\sum_{k=t/\varepsilon}^{(t+s)/\varepsilon-1}E\,u_j^{M\prime}\xi_{j+1}'\xi_{k+1}u_k^M q_M(u_j^M)q_M(u_k^M) \le K\varepsilon^2\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}E|\xi_j|^2 = O(\varepsilon) \to 0 \quad \text{as } \varepsilon\to 0.$

Finally,

$\lim_{\delta\to 0}\limsup_{\varepsilon\to 0}E\left|\varepsilon\sum_{j=t/\varepsilon}^{(t+s)/\varepsilon-1}B u_j^M q_M(u_j^M)\right|^2 \le \lim_{\delta\to 0}O(\delta^2) = 0.$

As a result,

$\lim_{\delta\to 0}\limsup_{\varepsilon\to 0}E|u^{\varepsilon,M}(t+s) - u^{\varepsilon,M}(t)|^2 = 0.$

Thus the tightness criterion (see [8, Section 3.8, p. 132] or [14, p. 47]) implies that $\{u^{\varepsilon,M}(\cdot)\}$ is tight in $D^p[0,\infty)$.

Lemma 4.3. Under (A2) and (A3), $\sqrt{\varepsilon}\sum_{j=0}^{t/\varepsilon-1}\xi_{j+1}z^*$ converges weakly to a Brownian motion $w(\cdot)$ with covariance $tA$, where $A$ is given by (4.1).

Proof. It follows from the well-known Donsker invariance theorem [2, p. 68].

Using a similar technique as in the proof of the convergence of the iterates in the last section (and by virtue of Lemma 4.3), we can first prove the weak convergence of $u^{\varepsilon,M}(\cdot)$ and then let $M \to \infty$ to conclude:

Theorem 4.4. Under the conditions of Corollary 4.2, $u^\varepsilon(\cdot)$ converges weakly to $u(\cdot)$, which is a solution of

$du = -Bu\, dt - dw, \qquad u(0) = u_0,$

where $w(\cdot)$ is the Brownian motion given in Lemma 4.3.
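A heuristic reading of Theorem 4.4, under the additional assumption (not made in the paper) that the limit dynamics evolve on a subspace on which $B$ is positive definite, is that for small $\varepsilon$ and large $k$,

$z_k \approx z^* + \sqrt{\varepsilon}\, u, \qquad u \sim N(0, \Sigma), \qquad B\Sigma + \Sigma B' = A,$

since $u(\cdot)$ is then an Ornstein–Uhlenbeck process driven by a Brownian motion with covariance $tA$, and its stationary covariance $\Sigma$ (our notation) solves the Lyapunov equation above, with $A$ given by (4.1).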

5. Further remarks

A class of algorithms for blind channel identification has been studied in this paper. Convergence and rates of convergence were obtained for the constant step-size algorithms.

In [5], a decreasing step-size algorithm for the blind channel identification problem was developed and w.p.1 convergence was obtained. Using the approach of this paper, we can also obtain the rate of convergence of the decreasing step-size algorithms. With reference to [5], consider decreasing step-size algorithms of the form

$z_{k+1} = z_k - \varepsilon_k\xi_{k+1}z_k - \varepsilon_k B z_k, \qquad k \ge 2L,$   (5.1)


where $\{\varepsilon_k\}$ is a sequence of nonnegative real numbers satisfying $\varepsilon_k \to 0$ as $k \to \infty$ and $\sum_k \varepsilon_k = \infty$. Define $z^0(t)$ to be the piecewise constant interpolation of $z_k$ on the interval $[t_k, t_{k+1})$, where $t_k = \sum_{j=0}^{k-1}\varepsilon_j$. Then we can obtain a result similar to that of Theorem 3.2, with $\varepsilon$ replaced by $\varepsilon_k$. Note that $z^\varepsilon(t_\varepsilon + \cdot)$ is now replaced by $z^0(\tau_k + \cdot)$, where $\tau_k \to \infty$ as $k \to \infty$. Furthermore, we can investigate the rate of convergence via the suitably scaled sequence $\{(z_k - z^*)/\sqrt{\varepsilon_k}\}$. Under the conditions of Theorems 3.2 and 4.4, the conclusions of the corresponding theorems continue to hold, with $\varepsilon$ replaced by $\varepsilon_k$ and $\varepsilon \to 0$ replaced by $k \to \infty$, respectively.

Further studies may be directed to obtaining large deviations properties and rates of convergence in conjunction with computational budget. Another important issue is to improve the efficiency of the algorithms.

References

[1] A. Benveniste, M. Metivier, P. Priouret, Adaptive Algorithms and Stochastic Approximations, Springer, Berlin, 1990.
[2] P. Billingsley, Convergence of Probability Measures, 2nd Edition, Wiley, New York, 1968.
[3] H.F. Chen, Stochastic approximation and its new applications, Proceedings of the 1994 Hong Kong International Workshop on New Directions of Control and Manufacturing, 1994, pp. 2–12.
[4] H.F. Chen, Y.M. Zhu, Stochastic Approximation, Shanghai Scientific and Technical Publishers, Shanghai, 1996 (in Chinese).
[5] H.F. Chen, X.-R. Cao, J. Zhu, Stochastic approximation based algorithms for blind channel identification, preprint, 2000.
[6] Z. Ding, Y. Li, On channel identification based on second order cyclic spectra, IEEE Trans. Signal Process. 42 (1994) 1260–1264.
[7] M. Duflo, Random Iterative Models, Springer, New York, 1997.
[8] S.N. Ethier, T.G. Kurtz, Markov Processes: Characterization and Convergence, Wiley, New York, 1986.
[9] D. Gesbert, P. Duhamel, S. Mayrargue, On-line blind multichannel equalization based on mutually referenced filters, IEEE Trans. Signal Process. 45 (1997) 2307–2317.
[10] D. Gesbert, P. Duhamel, Unbiased blind adaptive channel identification and equalization, IEEE Trans. Signal Process. 48 (2000) 148–158.
[11] Y. Hua, Fast maximum likelihood for blind identification of multiple FIR channels, IEEE Trans. Signal Process. 44 (1996) 661–672.
[12] Yu.M. Kaninovskii, On the limit distribution of processes of stochastic approximation type when the regression function has several roots, Soviet Math. Dokl. 30 (1988) 210–211.
[13] V. Krishnamurthy, G. Yin, S. Singh, Adaptive step size algorithms for blind interference suppression in DS/CDMA systems, IEEE Trans. Signal Process. 49 (2001) 190–201.
[14] H.J. Kushner, Approximation and Weak Convergence Methods for Random Processes, with Applications to Stochastic Systems Theory, MIT Press, Cambridge, MA, 1984.
[15] H.J. Kushner, G. Yin, Stochastic Approximation Algorithms and Applications, Springer, New York, 1997.
[16] E. Moulines, P. Duhamel, J.-F. Cardoso, S. Mayrargue, Subspace methods for the blind identification of multichannel FIR filters, IEEE Trans. Signal Process. 43 (1995) 516–525.
[17] L. Tong, S. Perreau, Multichannel blind identification: from subspace to maximum likelihood methods, Proceedings of the IEEE 86 (1998) 1951–1968.
[18] L. Tong, G. Xu, T. Kailath, Blind identification and equalization based on second-order statistics: a time domain approach, IEEE Trans. Inform. Theory 40 (1994) 340–349.
[19] A.J. van der Veen, S. Talwar, A. Paulraj, Blind estimation of multiple digital signals transmitted over FIR channels, IEEE Signal Process. Lett. 2 (1995) 99–102.
[20] G. Xu, L. Tong, T. Kailath, A least-squares approach to blind identification, IEEE Trans. Signal Process. 43 (1995) 2982–2993.