
PROCEEDINGS OF THE IEEE, VOL. 60, NO. 8, AUGUST 1972

An Algorithm for Linearly Constrained Adaptive Array Processing

OTIS LAMONT FROST, III, MEMBER, IEEE

Abstract: A constrained least-mean-squares algorithm has been derived which is capable of adjusting an array of sensors in real time to respond to a signal coming from a desired direction while discriminating against noises coming from other directions. Analysis and computer simulations confirm that the algorithm is able to iteratively adapt variable weights on the taps of the sensor array to minimize noise power in the array output. A set of linear equality constraints on the weights maintains a chosen frequency characteristic for the array in the direction of interest. The array problem would be a classical constrained least-mean-squares problem except that the signal and noise statistics are assumed unknown a priori. A geometrical presentation shows that the algorithm is able to maintain the constraints and prevent the accumulation of quantization errors in a digital implementation.

I. INTRODUCTION

THIS PAPER describes a simple algorithm for adjusting an array of sensors in real time to respond to a desired signal while discriminating against noises. A signal is here defined as a waveform of interest which arrives in plane waves from a chosen direction (called the look direction). The algorithm iteratively adapts the weights of a broad-band sensor array (Fig. 1) to minimize noise power at the array output while maintaining a chosen frequency response in the look direction.

The algorithm, called the Constrained Least Mean-Squares or Constrained LMS algorithm, is a simple stochastic gradient-descent algorithm which requires only that the direction of arrival and a frequency band of interest be specified a priori. In the adaptive process, the algorithm progressively learns the statistics of noise arriving from directions other than the look direction. Noise arriving from the look direction may be filtered out by a suitable choice of the frequency-response characteristic in that direction, or by external means. Subsequent processing of the array output may be done for detection or classification. A major advantage of the constrained LMS algorithm is that it has a self-correcting feature permitting it to operate for arbitrarily long periods of time in a digital-computer implementation without deviating from its constraints because of cumulative roundoff or truncation errors.

The algorithm is applicable to array-processing problems in geoscience, sonar, and electromagnetic antenna arrays in which a simple method is required for adjusting an array in real time to discriminate against noises impinging on the array sidelobes.

Manuscript received December 23, 1971; revised May 4, 1972. This research is based on a Ph.D. dissertation in the Department of Electrical Engineering, Stanford University, Stanford, Calif. The author is with ARGOSystems, Inc., Palo Alto, Calif. 94303.

Fig. 1. Broad-band antenna array and equivalent processor for signals coming from the look direction.

Previous Work

Previous work on iterative least-squares array processing was done by Griffiths [1]; his method uses an unconstrained minimum-mean-square-error optimization criterion which requires a priori knowledge of second-order signal statistics. Widrow, Mantey, Griffiths, and Goode [2] proposed a variable-criterion [3] optimization procedure involving the use of a known training signal; this was an application and extension of the original work on adaptive filters done by Widrow and Hoff [4]. Griffiths also proposed a constrained least-mean-squares processor not requiring a priori knowledge of the signal statistics [5]; a new derivation of this processor, given in [6], shows that it may be considered as putting soft constraints on the processor via the quadratic penalty-function method.

Hard (i.e., exactly) constrained iterative optimization was studied by Rosen [7] for the deterministic case. Lacoss [8], Booker et al. [9], and Kobayashi [10] studied hard-constrained optimization in the array-processing context for filtering short lengths of data. All four authors used gradient-projection techniques [11]; Rosen and Booker correctly indicated that gradient-projection methods are susceptible to cumulative roundoff errors and are not suitable for long runs without an additional error-correction procedure. The constrained LMS algorithm developed in the present work is designed to avoid error accumulation while maintaining a hard constraint; as a result, it is able to provide continual filtering for arbitrarily large numbers of iterations.

Basic Principle of the Constraints

The algorithm is able to maintain a chosen frequency response in the look direction while minimizing output noise power because of a simple relation between the look-direction frequency response and the weights in the array of Fig. 1. Assume that the look direction is chosen as the direction perpendicular to the line of sensors. Then identical signal components arriving on a plane wavefront parallel to the line of sensors appear at the first taps simultaneously and parade in parallel down the tapped delay lines following each sensor; however, noise waveforms arriving from other than the look direction will not, in general, produce equal voltage components on any given vertical column of taps. The voltages (signal plus noise) at each tap are multiplied by the tap weights and added to form the array output. Thus as far as the signal is concerned, the array processor is equivalent to a single tapped delay line in which each weight is equal to the sum of the weights in the corresponding vertical column of the processor, as indicated in Fig. 1. These summation weights in the equivalent tapped delay line must be selected so as to give the desired frequency-response characteristic in the look direction.
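To make this equivalence concrete, here is a minimal numerical sketch (numpy; the array dimensions, weights, and waveform are arbitrary illustrations, not values from the paper): for a look-direction signal, the full K-sensor, J-tap processor and a single tapped delay line with column-summed weights produce identical outputs.

```python
import numpy as np

K, J = 4, 3                         # K sensors, J taps per sensor (arbitrary)
rng = np.random.default_rng(0)
W = rng.standard_normal((K, J))     # W[i, j]: weight on the jth tap of sensor i
l = rng.standard_normal(100)        # look-direction waveform samples

# Broadside signal: every sensor sees the same waveform, so tap (i, j) holds l[n - j].
def full_processor(n):
    return sum(W[i, j] * l[n - j] for i in range(K) for j in range(J))

# Equivalent single tapped delay line: tap weights are the column sums of W.
w_eq = W.sum(axis=0)
def equivalent_delay_line(n):
    return sum(w_eq[j] * l[n - j] for j in range(J))

assert np.isclose(full_processor(50), equivalent_delay_line(50))
```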

If the look direction is chosen to be other than that perpendicular to the line of sensors, then the array can be steered either mechanically or electrically by the addition of steering time delays (not shown) placed immediately after each sensor.

A processor having K sensors and J taps per sensor has KJ weights and requires J constraints to determine its look-direction frequency response. The remaining KJ - J degrees of freedom in choosing the weights may be used to minimize the total power in the array output. Since the look-direction frequency response is fixed by the J constraints, minimization of the total output power is equivalent to minimizing the non-look-direction noise power, so long as the set of signal voltages at the taps is uncorrelated with the set of noise voltages at these taps. The latter assumption has commonly been made in previous work on iterative array processing [1], [5], [8]-[10]. The effect of signal-correlated noise in the array may be to cancel out all or part of the desired signal component in the array output. Sources of signal-correlated noise may be multiple signal-propagation paths, and coherent radar or sonar clutter.

It is permissible, and in fact desirable for proper noise cancellation, that the voltages produced by the noises on the taps of the array be correlated among themselves, although uncorrelated with the signal voltages. Examples of such noises include waveforms from point sources in other than the look direction (e.g., lightning, jammers, noise from nearby vehicles), spatially localized incoherent clutter, and self-noise from the structure carrying the array.

Noise voltages which are uncorrelated between taps (e.g., amplifier thermal noise) may be partially rejected by the adaptive array in two ways. As in a conventional nonadaptive array, such noises are eliminated to the extent that signal voltages on the taps are added coherently at the array output, while uncorrelated noise voltages are added incoherently. Second, an adaptive array can reduce the weighting on any tap that may have a disproportionately large uncorrelated noise power.

II. OPTIMUM-CONSTRAINED LMS WEIGHT VECTOR

The first step in developing the constrained LMS algorithm is to find the optimum weight vector.

Notation

Notation will be as follows (see Fig. 2). Every $\Delta$ seconds, where $\Delta$ may be a multiple of the delay $\tau$ between taps, the voltages at the array taps are sampled. The vector of tap voltages at the kth sample is written $X(k)$, where

$X^T(k) \triangleq [x_1(k\Delta), x_2(k\Delta), \ldots, x_{KJ}(k\Delta)]$.

The superscript $T$ denotes transpose. The tap voltages are the sums of voltages due to look-direction waveforms $l$ and non-look-direction noises $n$, so that

$X(k) = L(k) + N(k)$ (1)

where the KJ-dimensional vector of look-direction waveforms at the kth sample is

$L^T(k) \triangleq [\underbrace{l(k\Delta), \ldots, l(k\Delta)}_{K\ \text{taps}},\ \underbrace{l(k\Delta - \tau), \ldots, l(k\Delta - \tau)}_{K\ \text{taps}},\ \ldots,\ \underbrace{l(k\Delta - (J-1)\tau), \ldots, l(k\Delta - (J-1)\tau)}_{K\ \text{taps}}]$

and the vector of non-look-direction noises is

$N^T(k) \triangleq [n_1(k\Delta), n_2(k\Delta), \ldots, n_{KJ}(k\Delta)]$.

The vector of weights at each tap is $W$, where

$W^T \triangleq [w_1, w_2, \ldots, w_{KJ}]$.

It is assumed for this derivation that the signals and noises are adequately modeled as zero-mean random processes with (unknown) second-order statistics:

$E[X(k)X^T(k)] \triangleq R_{XX}$ (2a)
$E[N(k)N^T(k)] \triangleq R_{NN}$ (2b)
$E[L(k)L^T(k)] \triangleq R_{LL}$. (2c)

As previously stated, it is assumed that the vector of look-direction waveforms is uncorrelated with the vector of non-look-direction noises, i.e.,

$E[N(k)L^T(k)] = 0$. (3)

It is assumed that the noise environment is distributed so that $R_{XX}$ and $R_{NN}$ are positive definite [12].
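In a simulation these unknown statistics are typically estimated from snapshots; a minimal sketch of the sample-average estimator (the synthetic data here are an assumption for illustration, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
KJ, num_snapshots = 12, 10_000
# Synthetic zero-mean tap-voltage snapshots with unequal per-tap powers
X = rng.standard_normal((num_snapshots, KJ)) * np.linspace(1.0, 2.0, KJ)

R_xx_hat = X.T @ X / num_snapshots                 # estimate of R_XX = E[X(k) X^T(k)]
assert np.all(np.linalg.eigvalsh(R_xx_hat) > 0)    # positive definite, as assumed
```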

The output of the array (signal estimate) at the time of the kth sample is

$y(k) = W^T X(k) = X^T(k) W$. (4)

Using (4), the expected output power of the array is

$E[y^2(k)] = E[W^T X(k) X^T(k) W] = W^T R_{XX} W$. (5)

The constraint that the weights on the jth vertical column of taps sum to a chosen number $f_j$ (see Fig. 1) is expressed by


Fig. 2. Signals and noises on the array. Because the array is steered toward the look direction, all beam signal components on any given column of filter taps are identical.

the requirement

$c_j^T W = f_j, \qquad j = 1, 2, \ldots, J$ (6)

where the KJ-dimensional vector $c_j$ has the form

$c_j \triangleq [0, \ldots, 0,\ \underbrace{1, \ldots, 1}_{j\text{th group of }K\text{ elements}},\ 0, \ldots, 0]^T$. (7)

Constraining the weight vector to satisfy the J equations of (6) restricts W to a (KJ-J)-dimensional plane. Define the constraint matrix C as

$C \triangleq [c_1\ c_2\ \cdots\ c_J]$ (8)

and define $\mathcal{F}$ as the J-dimensional vector of weights of the look-direction-equivalent tapped delay line shown in Fig. 1:

$\mathcal{F}^T \triangleq [f_1, f_2, \ldots, f_J]$. (9)

By inspection the constraint vectors $c_j$ are linearly independent; hence C has full rank equal to J. The constraints (6) are now written

$C^T W = \mathcal{F}$. (10)
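A small numpy sketch of this constraint structure (dimensions arbitrary) builds C from the $c_j$ of (7) and checks the rank claim:

```python
import numpy as np

K, J = 4, 3                                 # K sensors, J taps per sensor
C = np.zeros((K * J, J))
for j in range(J):
    C[j * K:(j + 1) * K, j] = 1.0           # c_j: jth group of K ones, as in (7)

assert np.linalg.matrix_rank(C) == J        # the c_j are disjoint, so C has full rank
# For any weight vector W, C.T @ W holds the J column sums, so C.T @ W = F is (10).
```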

Optimum Weight Vector

Since the look-direction frequency response is fixed by the J constraints, minimization of the non-look-direction noise power is the same as minimization of the total output power. The cost criterion used in this paper will be minimization of total array output power $W^T R_{XX} W$. The problem of finding the optimum set of filter weights $W_{\text{opt}}$ is summarized by (5) and (10) as

$\min_W\ W^T R_{XX} W$ (11a)
subject to $C^T W = \mathcal{F}$. (11b)

This is the constrained LMS problem. $W_{\text{opt}}$ is found by the method of Lagrange multipliers, which is discussed in [13]. Including a factor of $\tfrac{1}{2}$ to simplify later arithmetic, the constraint function is adjoined to the cost function by a J-dimensional vector of undetermined Lagrange multipliers $\lambda$:

$H(W) = \tfrac{1}{2} W^T R_{XX} W + \lambda^T (C^T W - \mathcal{F})$. (12)

Taking the gradient of (12) with respect to W,

$\nabla_W H(W) = R_{XX} W + C\lambda$. (13)

The first term in (13) is a vector proportional to the gradient of the cost function (11a), and the second term is a vector


normal to the (KJ-J)-dimensional constraint plane defined by $C^T W - \mathcal{F} = 0$ [14]. For optimality these vectors must be antiparallel [15], which is achieved by setting the sum of the vectors in (13) equal to zero:

$\nabla_W H(W) = R_{XX} W + C\lambda = 0$.

In terms of the Lagrange multipliers, the optimal weight vector is then

$W_{\text{opt}} = -R_{XX}^{-1} C \lambda$ (14)

where $R_{XX}^{-1}$ exists because $R_{XX}$ was assumed positive definite. Since $W_{\text{opt}}$ must satisfy the constraint (11b),

$C^T W_{\text{opt}} = \mathcal{F} = -C^T R_{XX}^{-1} C \lambda$

and the Lagrange multipliers are found to be

$\lambda = -[C^T R_{XX}^{-1} C]^{-1} \mathcal{F}$ (15)

where the existence of $[C^T R_{XX}^{-1} C]^{-1}$ follows from the facts that $R_{XX}$ is positive definite and C has full rank [6]. From (14) and (15) the optimum-constrained LMS weight vector solving (11) is

$W_{\text{opt}} = R_{XX}^{-1} C [C^T R_{XX}^{-1} C]^{-1} \mathcal{F}$. (16)

Using the set of weights $W_{\text{opt}}$ in the array processor of

Fig. 2 forms the optimum constrained LMS processor, which is a filter in space and frequency. Substituting $W_{\text{opt}}$ in (4), the constrained least-squares estimate of the look-direction waveform is

$y_{\text{opt}}(k) = W_{\text{opt}}^T X(k)$. (17)
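A minimal numpy sketch of the optimum weight computation (16)-(17), assuming a synthetic positive-definite $R_{XX}$ (in practice $R_{XX}$ is unknown, which motivates Section III):

```python
import numpy as np

K, J = 4, 3
KJ = K * J
rng = np.random.default_rng(2)

A = rng.standard_normal((KJ, KJ))
R_xx = A @ A.T + KJ * np.eye(KJ)            # stand-in positive-definite covariance

C = np.zeros((KJ, J))
for j in range(J):
    C[j * K:(j + 1) * K, j] = 1.0
f = np.array([1.0, 0.0, 0.0])               # example look-direction tap weights

# Equation (16): W_opt = R^{-1} C (C^T R^{-1} C)^{-1} f
Rinv_C = np.linalg.solve(R_xx, C)
W_opt = Rinv_C @ np.linalg.solve(C.T @ Rinv_C, f)

assert np.allclose(C.T @ W_opt, f)          # W_opt satisfies the constraints (10)
X = rng.standard_normal(KJ)                 # one snapshot of tap voltages
y_opt = W_opt @ X                           # signal estimate, equation (17)
```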

Discussion

The constrained LMS filter is sometimes known by other names. If the frequency characteristic in the look direction is chosen to be all-pass and linear phase (distortionless), the output of the constrained LMS filter is the maximum-likelihood estimate of a stationary process in Gaussian noise if the angle of arrival is known [15]. The distortionless form of the constrained LMS filter is called by some authors the Minimum Variance Distortionless Look estimator, the Maximum Likelihood Distortionless Estimator, and the Least Squares Unbiased Estimator. By suitable choice of $\mathcal{F}$ a variety of other optimal processors can be obtained [16].

III. THE ADAPTIVE ALGORITHM

In this paper it is assumed that the input correlation matrix $R_{XX}$ is unknown a priori and must be learned by an adaptive technique. In stationary environments the weights are adjusted during learning, and in time-varying environments an estimate of the optimum weights must be recomputed periodically. Direct substitution of a correlation matrix estimate into the optimal-weight equation (16) requires a number of multiplications at each iteration proportional to the cube of the number of weights; this is primarily caused by the required inversion of the input correlation matrix. Recently Saradis et al. [17] and Mantey and Griffiths [18] have shown how to iteratively update matrix inversions, requiring only a number of multiplications and storage locations proportional to the square of the number of weights. The gradient-descent constrained LMS algorithm presented here requires only a number of multiplications and storage locations directly proportional to the number of weights. It is therefore simple to implement and, for a given computational cost, is applicable to arrays in which the number of weights is on the order of the square of the number that could be handled by the iterative matrix-inversion method and the cube of the number that could be handled by the direct-substitution method.

Derivation

For motivation of the algorithm derivation, temporarily suppose that the correlation matrix $R_{XX}$ is known. In constrained gradient-descent optimization, the weight vector is initialized at a vector satisfying the constraint (11b), say $W(0) = C(C^T C)^{-1}\mathcal{F}$, and at each iteration the weight vector is moved in the negative direction of the constrained gradient (13). The length of the step is proportional to the magnitude of the constrained gradient and is scaled by a constant $\mu$. After the kth iteration the next weight vector is

$W(k+1) = W(k) - \mu \nabla_W H[W(k)] = W(k) - \mu[R_{XX} W(k) + C\lambda(k)]$ (18)

where the second step is from (13). The Lagrange multipliers are chosen by requiring W(k+1) to satisfy the constraint (11b):

$\mathcal{F} = C^T W(k+1) = C^T W(k) - \mu C^T R_{XX} W(k) - \mu C^T C \lambda(k)$.

Solving for the Lagrange multipliers $\lambda(k)$ and substituting into the weight-iteration equation (18), we have

$W(k+1) = W(k) - \mu[I - C(C^T C)^{-1} C^T] R_{XX} W(k) + C(C^T C)^{-1}[\mathcal{F} - C^T W(k)]$. (19)

The deterministic algorithm (19) is shown in this form to emphasize that the last factor $\mathcal{F} - C^T W(k)$ is not assumed to be zero, as it would be if the weight vector precisely satisfied the constraint at the kth iteration. It will be shown in Section VI that this term permits the algorithm to correct any small deviations from the constraint due to arithmetic inaccuracy and prevents their eventual accumulation and growth.
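This correction can be seen in one line of algebra (premultiplying (19) by $C^T$ leaves exactly $\mathcal{F}$, since $C^T$ annihilates the projected gradient term) or checked numerically; a sketch with a synthetic $R_{XX}$ and a deliberately off-constraint weight vector (all values illustrative):

```python
import numpy as np

K, J, mu = 4, 3, 0.01
KJ = K * J
rng = np.random.default_rng(3)
A = rng.standard_normal((KJ, KJ))
R_xx = A @ A.T + KJ * np.eye(KJ)

C = np.zeros((KJ, J))
for j in range(J):
    C[j * K:(j + 1) * K, j] = 1.0
f = np.array([1.0, 0.0, 0.0])
CtC_inv = np.linalg.inv(C.T @ C)

W = C @ CtC_inv @ f + 0.1 * rng.standard_normal(KJ)   # violates C^T W = f

# One step of the deterministic algorithm (19)
P = np.eye(KJ) - C @ CtC_inv @ C.T
W_next = W - mu * P @ R_xx @ W + C @ CtC_inv @ (f - C.T @ W)

assert np.allclose(C.T @ W_next, f)         # constraint restored in a single step
```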

Defining the KJ-dimensional vector

$F \triangleq C(C^T C)^{-1}\mathcal{F}$ (20a)

and the $KJ \times KJ$ matrix

$P \triangleq I - C(C^T C)^{-1} C^T$ (20b)

the algorithm may be rewritten as

$W(k+1) = P[W(k) - \mu R_{XX} W(k)] + F$. (21)

Equation (21) is a deterministic constrained gradient-descent algorithm requiring knowledge of the input correlation matrix $R_{XX}$, which, however, is unavailable a priori in the array problem. An available and simple approximation for $R_{XX}$ at the kth iteration is the outer product of the tap-voltage vector with itself: $X(k)X^T(k)$. Substitution of this estimate into (21) gives the constrained LMS algorithm

$W(0) = F$
$W(k+1) = P[W(k) - \mu y(k) X(k)] + F$ (22)

where $y(k)$ is the array output (signal estimate) defined by (4).
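A minimal end-to-end sketch of the constrained LMS recursion (22) (white-noise snapshots stand in for real tap voltages purely to exercise the update; dimensions and $\mu$ are arbitrary illustrations):

```python
import numpy as np

K, J, mu, num_iters = 4, 3, 1e-3, 5000
KJ = K * J
rng = np.random.default_rng(4)

C = np.zeros((KJ, J))
for j in range(J):
    C[j * K:(j + 1) * K, j] = 1.0
f = np.array([1.0, 0.0, 0.0])

CtC_inv = np.linalg.inv(C.T @ C)            # equals (1/K) I for this C
F = C @ CtC_inv @ f                          # constant vector, (20a)
P = np.eye(KJ) - C @ CtC_inv @ C.T           # projection matrix, (20b)

W = F.copy()                                 # W(0) = F
for _ in range(num_iters):
    X = rng.standard_normal(KJ)              # tap-voltage snapshot X(k)
    y = W @ X                                # array output y(k), equation (4)
    W = P @ (W - mu * y * X) + F             # constrained LMS update, (22)

assert np.allclose(C.T @ W, f)               # constraint maintained throughout
```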


Discussion

The constrained LMS algorithm (22) satisfies the constraint $C^T W(k+1) = \mathcal{F}$ at each iteration, as can be verified by premultiplying (22) by $C^T$ and using (20). At each iteration the algorithm requires only the tap voltages $X(k)$ and the array output $y(k)$; no a priori knowledge of the input correlation matrix is needed. $F$ is a constant vector that can be precomputed. One of the two most complex operations required by (22) is the multiplication of each of the $KJ$ components of the vector $X(k)$ by the scalar $\mu y(k)$; the other significant operation is indicated by the matrix $P = I - C(C^T C)^{-1} C^T$. Because of the simple form of C [refer to (7)], multiplication of a vector by P as indicated in (22) amounts to little more than a few additions: since $C^T C = K I$, applying P simply subtracts from each component the average of the K components in the same vertical column. Expressed in summation notation, with $w_{ij}$ and $x_{ij}$ denoting the weight and tap voltage on the ith tap of the jth vertical column, the iterative equations for the weight-vector components are

$w_{ij}(k+1) = w_{ij}(k) - \mu y(k) x_{ij}(k) - \frac{1}{K}\sum_{m=1}^{K}\bigl[w_{mj}(k) - \mu y(k) x_{mj}(k)\bigr] + \frac{f_j}{K}, \qquad i = 1, \ldots, K;\ j = 1, \ldots, J.$

These equations can readily be implemented on a digital computer.
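A sketch of that cheap component-wise form (the same arithmetic as (22), without ever forming P; it assumes the tap ordering of (7), i.e., the jth group of K weights is contiguous):

```python
import numpy as np

def constrained_lms_step(W, X, y, f, K, J, mu):
    """One update of (22) in O(KJ) operations."""
    Z = (W - mu * y * X).reshape(J, K)       # tentative update, one row per column j
    Z -= Z.mean(axis=1, keepdims=True)       # multiply by P: subtract group averages
    Z += (f / K)[:, None]                    # add F: each group receives f_j / K
    return Z.reshape(K * J)

# Agreement with the explicit matrix form of (22)
K, J, mu = 4, 3, 1e-3
rng = np.random.default_rng(5)
C = np.zeros((K * J, J))
for j in range(J):
    C[j * K:(j + 1) * K, j] = 1.0
f = rng.standard_normal(J)
CtC_inv = np.linalg.inv(C.T @ C)
P = np.eye(K * J) - C @ CtC_inv @ C.T
F = C @ CtC_inv @ f
W, X = rng.standard_normal(K * J), rng.standard_normal(K * J)
y = W @ X
assert np.allclose(constrained_lms_step(W, X, y, f, K, J, mu),
                   P @ (W - mu * y * X) + F)
```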

IV. PERFORMANCE

Convergence to the Optimum

The weight vector W(k) obtained by use of the stochastic algorithm (22) is a random vector. Convergence of the mean weight vector to the optimum is demonstrated by showing that the length of the difference vector between the mean weight vector and the optimum (16) asymptotically approaches zero.

Proof of convergence of the mean is greatly simplified by the assumption (used in [2]) that successive samples of the input vector taken $\Delta$ seconds apart are statistically independent. This condition can usually be approximated in practice by sampling the input vector at intervals large compared to the correlation time of the input process plus the length of time it takes an input waveform to propagate down the array. The assumption is more restrictive than necessary, since Daniell [19] has shown that the much weaker assumption of asymptotic independence of the input vectors is sufficient to demonstrate convergence in the related unconstrained least-squares problem.

Taking the expected value of both sides of the algorithm (22), using (4), (2a), and the independence assumption yields an iterative equation in the mean value of the constrained LMS weight vector:

$E[W(k+1)] = P\{E[W(k)] - \mu R_{XX} E[W(k)]\} + F$. (23)

Define the vector V(k+1) to be the difference between the mean adaptive weight vector at iteration k+1 and the optimal weight vector (16):

$V(k+1) \triangleq E[W(k+1)] - W_{\text{opt}}$.

Using (23) and the relations $F = (I - P)W_{\text{opt}}$ and $P R_{XX} W_{\text{opt}} = 0$, which may be verified by direct substitution of (16) and (20b), an equation for the difference process may be constructed:

$V(k+1) = P V(k) - \mu P R_{XX} V(k)$. (24)

The idempotence of P (i.e., $P^2 = P$), which can be verified by carrying out the multiplication using (20b), together with premultiplication of (24) by P, shows that $P V(k) = V(k)$ for all k, so (24) can be written

$V(k+1) = [I - \mu P R_{XX} P] V(k) = [I - \mu P R_{XX} P]^{k+1} V(0)$.

The matrix $P R_{XX} P$ determines both the rate of convergence of the mean weight vector to the optimum and the steady-state variance of the weight vector about the optimum. It is shown in [6] that $P R_{XX} P$ has precisely J zero eigenvalues, corresponding to the column vectors of the constraint matrix C; this is a result of the fact that during adaption no movement is permitted away from the (KJ-J)-dimensional constraint plane. It is also shown in [6, appendix C] that $P R_{XX} P$ has KJ-J nonzero eigenvalues $\sigma_i$, $i = 1, 2, \ldots, KJ-J$, with values bounded between the smallest and largest eigenvalues of $R_{XX}$:

$\lambda_{\min} \le \sigma_{\min} \le \sigma_i \le \sigma_{\max} \le \lambda_{\max}, \qquad i = 1, 2, \ldots, KJ - J$

where $\lambda_{\min}$ and $\lambda_{\max}$ are the smallest and largest eigenvalues of $R_{XX}$, and $\sigma_{\min}$ and $\sigma_{\max}$ are the smallest and largest nonzero eigenvalues of $P R_{XX} P$.
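These spectral properties are easy to verify numerically; a sketch with a synthetic covariance (tolerances account for floating point):

```python
import numpy as np

K, J = 4, 3
KJ = K * J
rng = np.random.default_rng(6)
A = rng.standard_normal((KJ, KJ))
R_xx = A @ A.T + np.eye(KJ)                  # synthetic positive-definite R_XX

C = np.zeros((KJ, J))
for j in range(J):
    C[j * K:(j + 1) * K, j] = 1.0
P = np.eye(KJ) - C @ np.linalg.inv(C.T @ C) @ C.T

sigma = np.linalg.eigvalsh(P @ R_xx @ P)     # P R_XX P is symmetric
lam = np.linalg.eigvalsh(R_xx)

assert np.sum(np.isclose(sigma, 0.0, atol=1e-9)) == J   # exactly J zero eigenvalues
nonzero = np.sort(sigma)[J:]
assert lam.min() - 1e-9 <= nonzero.min()                # sigma_min >= lambda_min
assert nonzero.max() <= lam.max() + 1e-9                # sigma_max <= lambda_max
```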


Examination of $V(0) = F - W_{\text{opt}}$ shows that it can be expressed as a linear combination of the eigenvectors of $P R_{XX} P$ corresponding to nonzero eigenvalues. If $V(0)$ is equal to an eigenvector of $P R_{XX} P$, say $e_i$ with eigenvalue $\sigma_i \ne 0$, then

$V(k+1) = [I - \mu P R_{XX} P]^{k+1} e_i = [1 - \mu\sigma_i]^{k+1} e_i$.

The convergence of the mean weight vector to the optimum weight vector along any eigenvector of $P R_{XX} P$ is therefore geometric with geometric ratio $(1 - \mu\sigma_i)$. The time required for the euclidean length of the difference vector to decrease to $e^{-1}$ of its initial value (time constant) is

$\tau_i = -\Delta/\ln(1 - \mu\sigma_i) \cong \Delta/(\mu\sigma_i)$ (25)

where the approximation is valid for $\mu\sigma_i \ll 1$. In general

$\|V(k+1)\| \le (1 - \mu\sigma_{\min})^{k+1}\,\|V(0)\|$

and so if the initial difference is finite the mean weight vector converges to the optimum, i.e.,

$\lim_{k \to \infty} \|E[W(k)] - W_{\text{opt}}\| = 0$

with time constants given by (25).

Steady-State Performance: Stationary Environment

The algorithm is designed to continually adapt for coping with nonstationary noise environments. In stationary environments this adaptation causes the weight vector to have variance about the optimum and produces an additional component of noise (above the optimum) to appear at the output of the adaptive processor.

The output power of the optimum processor with a fixed weight vector (17) is

$E[y_{\text{opt}}^2(k)] = W_{\text{opt}}^T R_{XX} W_{\text{opt}} = \mathcal{F}^T (C^T R_{XX}^{-1} C)^{-1} \mathcal{F}$.

A measure of the fraction of additional noise caused by the adaptive algorithm operating in steady state in a stationary environment is the misadjustment $M(\mu)$ defined by Widrow.

By assuming that successive observation vectors [vectors X(k) of tap voltages] are independent and have components $x_1(k), \ldots, x_{KJ}(k)$ that are jointly Gaussian distributed, Moschner [20] calculated very tight bounds on the misadjustment, using a method due to Senne [21], [22], for a convergence constant $\mu$ satisfying

$0 < \mu \le \ldots$