
WIENER–KOLMOGOROV FILTERING, FREQUENCY-SELECTIVE FILTERING, AND POLYNOMIAL REGRESSION

D.S.G. POLLOCK
University of London

Adaptations of the classical Wiener–Kolmogorov filters are described that enable them to be applied to short nonstationary sequences. Alternative filtering methods that operate in the time domain and the frequency domain are described. The frequency-domain methods have the advantage of allowing components of the data to be separated along sharp dividing lines in the frequency domain, without incurring any leakage. The paper contains a novel treatment of the start-up problem that affects the filtering of trended data sequences.

    1. INTRODUCTION

The classical theory of statistical signal extraction presupposes lengthy data sequences, which are assumed, in theory, to be doubly infinite or semi-infinite (see, for example, Whittle, 1983). In many practical cases, and in most econometric applications, the available data are, to the contrary, both strongly trended and of a limited duration.

This paper is concerned with the theory of finite-sample signal extraction, and it shows how the classical Wiener–Kolmogorov theory of signal extraction can be adapted to cater to short sequences generated by processes that may be nonstationary.

Alternative methods of processing finite samples, which work in the frequency domain, are also described, and their relation to the time-domain implementations of the Wiener–Kolmogorov methods is demonstrated. The frequency-domain methods have the advantage that they are able to achieve clear separations of components that reside in adjacent frequency bands in a way that the time-domain methods cannot.

2. WIENER–KOLMOGOROV FILTERING OF SHORT STATIONARY SEQUENCES

Address correspondence to D.S.G. Pollock, Department of Economics, Queen Mary College, University of London, Mile End Road, London E1 4NS, U.K.; e-mail: stephen_pollock@sigmapi.u-net.com.

Econometric Theory, 23, 2007, 71-88. Printed in the United States of America. DOI: 10.1017/S026646660707003X. © 2007 Cambridge University Press.

In the classical theory, it is assumed that there is a doubly infinite sequence of observations, denoted, in this paper, by $y(t) = \{y_t;\ t = 0, \pm 1, \pm 2, \ldots\}$. Here, we shall assume that the observations run from $t = 0$ to $t = T-1$. These are gathered in the vector $y = [y_0, y_1, \ldots, y_{T-1}]'$, which is decomposed as

$y = \xi + \eta$, (1)

where $\xi$ is the signal component and $\eta$ is the noise component. It may be assumed that the latter are from independent zero-mean Gaussian processes that are completely characterized by their first and second moments. Then,

$E(\xi) = 0$, $D(\xi) = V_\xi$,
$E(\eta) = 0$, $D(\eta) = V_\eta$, and $C(\xi, \eta) = 0$. (2)

A consequence of the independence of $\xi$ and $\eta$ is that $D(y) = V = V_\xi + V_\eta$.

The autocovariance or dispersion matrices, which have a Toeplitz structure, may be obtained by replacing the argument $z$ within the relevant autocovariance generating functions by the matrix

$L_T = [e_1, \ldots, e_{T-1}, 0]$, (3)

which is derived from the identity matrix $I_T = [e_0, e_1, \ldots, e_{T-1}]$ by deleting the leading column and appending a column of zeros to the end of the array. Using $L_T$ in place of $z$ in the autocovariance generating function $\gamma(z)$ of the data process gives

$D(y) = V = \gamma_0 I_T + \sum_{\tau=1}^{T-1} \gamma_\tau (L_T^{\tau} + F_T^{\tau})$, (4)

where $F_T = L_T'$ is in place of $z^{-1}$. Because $L_T$ and $F_T$ are nilpotent of degree $T$, such that $L_T^{q} = F_T^{q} = 0$ when $q \geq T$, the index of summation has an upper limit of $T-1$.

    The optimal predictors of the signal and the noise components are

$E(\xi \mid y) = E(\xi) + C(\xi, y) D^{-1}(y)\{y - E(y)\}$
$\quad = V_\xi (V_\xi + V_\eta)^{-1} y = Z_\xi y = x$, (5)

$E(\eta \mid y) = E(\eta) + C(\eta, y) D^{-1}(y)\{y - E(y)\}$
$\quad = V_\eta (V_\xi + V_\eta)^{-1} y = Z_\eta y = h$, (6)

which are their minimum-mean-square-error estimates. The corresponding error dispersion matrices, from which confidence intervals for the estimated components may be derived, are

$D(\xi \mid y) = D(\xi) - C(\xi, y) D^{-1}(y) C(y, \xi)$
$\quad = V_\xi - V_\xi (V_\xi + V_\eta)^{-1} V_\xi$, (7)

$D(\eta \mid y) = D(\eta) - C(\eta, y) D^{-1}(y) C(y, \eta)$
$\quad = V_\eta - V_\eta (V_\xi + V_\eta)^{-1} V_\eta$. (8)


These formulas contain $D\{E(\xi \mid y)\} = C(\xi, y) D^{-1}(y) C(y, \xi)$ and $D\{E(\eta \mid y)\} = C(\eta, y) D^{-1}(y) C(y, \eta)$, which give the variability of the estimated components relative to their zero-valued unconditional expectations. The results follow from the ordinary algebra of conditional expectations, of which an account has been given by Pollock (1999). (Equations (5) and (6) are the basis of the analysis in Pollock, 2000.)

It can be seen from (5) and (6) that $y = x + h$. Therefore, only one of the components needs to be calculated. The estimate of the other component may be obtained by subtracting the calculated estimate from $y$. Also, the matrix inversion lemma indicates that

$(V_\xi^{-1} + V_\eta^{-1})^{-1} = V_\eta - V_\eta (V_\eta + V_\xi)^{-1} V_\eta$
$\quad = V_\xi - V_\xi (V_\eta + V_\xi)^{-1} V_\xi$. (9)

Therefore, (7) and (8) represent the same quantity, which is to be expected in view of the adding up.

    The estimating equations can be obtained via the criterion

Minimize $S(\xi, \eta) = \xi' V_\xi^{-1} \xi + \eta' V_\eta^{-1} \eta$ subject to $\xi + \eta = y$. (10)

Because $S(\xi, \eta)$ is the exponent of the normal joint density function $N(\xi, \eta)$, the estimates of (5) and (6) may be described, alternatively, as the minimum chi-square estimates or as the maximum-likelihood estimates. Setting $\eta = y - \xi$ gives the function

$S(\xi) = \xi' V_\xi^{-1} \xi + (y - \xi)' V_\eta^{-1} (y - \xi)$. (11)

    The minimizing value is

$x = (V_\xi^{-1} + V_\eta^{-1})^{-1} V_\eta^{-1} y$
$\quad = y - V_\eta (V_\eta + V_\xi)^{-1} y = y - h$, (12)

where the second equality depends upon the first of the identities of (9).

A simple procedure for calculating the estimates $x$ and $h$ begins by solving the equation

$(V_\xi + V_\eta) b = y$ (13)

for the value of $b$. Thereafter, one can generate

$x = V_\xi b$ and $h = V_\eta b$. (14)

If $V_\xi$ and $V_\eta$ correspond to the dispersion matrices of moving-average processes, then the solution to equation (13) may be found via a Cholesky factorization that sets $V_\xi + V_\eta = GG'$, where $G$ is a lower-triangular matrix with a limited number of nonzero bands. The system $GG'b = y$ may be cast in the form of $Gp = y$ and solved for $p$. Then, $G'b = p$ can be solved for $b$. The procedure has been described by Pollock (2000).
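As a minimal numerical sketch of (13) and (14), with an MA(1) signal and white noise chosen purely for illustration, the two triangular solves are delegated to a Cholesky routine:

```python
import numpy as np
from scipy.linalg import toeplitz, cho_factor, cho_solve

T = 50
rng = np.random.default_rng(0)
y = rng.standard_normal(T)

# Toeplitz dispersion matrices: MA(1) signal and unit-variance white noise
c = np.zeros(T)
c[0], c[1] = 2.0, 0.8
V_xi = toeplitz(c)
V_eta = np.eye(T)

# Solve (V_xi + V_eta) b = y via GG': internally Gp = y, then G'b = p
G = cho_factor(V_xi + V_eta, lower=True)
b = cho_solve(G, y)

x = V_xi @ b     # signal estimate, eq. (14)
h = V_eta @ b    # noise estimate, eq. (14)

assert np.allclose(x + h, y)   # the estimates add up to the data
```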

    3. FILTERING VIA FOURIER METHODS

A circular stochastic process of order $T$ is one in which the autocovariance matrix of a random vector $x = [x_0, x_1, \ldots, x_{T-1}]'$, generated by the process, is unchanged when the elements of $x$ are subjected to a cyclical permutation. Such a process is the finite equivalent of a stationary stochastic process.

The circular permutation of the elements of the sequence is effected by the matrix operator

$K_T = [e_1, \ldots, e_{T-1}, e_0]$, (15)

which is formed from the identity matrix $I_T$ by moving the leading column to the back of the array. The powers of $K_T$ are $T$-periodic, such that $K_T^{q+T} = K_T^{q}$. Moreover, any circulant matrix of order $T$ can be expressed as a linear combination of the basis matrices $I_T, K_T, \ldots, K_T^{T-1}$.

The matrix $K_T$ has a spectral factorization that entails the discrete Fourier transform and its inverse. Let

$U = T^{-1/2} [W^{jt};\ t, j = 0, \ldots, T-1]$ (16)

denote the symmetric Fourier matrix, of which the generic element in the $j$th row and $t$th column is

$W^{jt} = \exp(-i 2\pi t j / T) = \cos(\omega_j t) - i \sin(\omega_j t)$, where $\omega_j = 2\pi j / T$. (17)

The matrix $U$ is unitary, which is to say that it fulfils the condition

$\bar{U} U = U \bar{U} = I_T$, (18)

where $\bar{U} = T^{-1/2}[W^{-jt};\ t, j = 0, \ldots, T-1]$ denotes the conjugate matrix. The circulant lag operator can be factorized as

$K_T = \bar{U} D U$, (19)

    where

$D = \mathrm{diag}\{1, W, W^{2}, \ldots, W^{T-1}\}$ (20)

is a diagonal matrix whose elements are the $T$ roots of unity, which are found on the circumference of the unit circle in the complex plane. Observe also that $D$ is $T$-periodic, such that $D^{q+T} = D^{q}$, and that $K^{q} = \bar{U} D^{q} U$ for any integer $q$. (An account of the algebra of circulant matrices has been provided by Pollock, 2002. See also Gray, 2002.)
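The factorization of (19) can be checked directly in a few lines; the matrix order below is an arbitrary choice of my own:

```python
import numpy as np

T = 8
idx = np.arange(T)
W = np.exp(-1j * 2 * np.pi / T)            # the root of unity of eq. (17)

U = W ** np.outer(idx, idx) / np.sqrt(T)   # symmetric Fourier matrix, eq. (16)
D = np.diag(W ** idx)                      # diagonal matrix of eq. (20)
K = np.roll(np.eye(T), 1, axis=0)          # circulant lag operator K_T, eq. (15)

Ubar = U.conj()
assert np.allclose(Ubar @ U, np.eye(T))         # unitarity, eq. (18)
assert np.allclose(Ubar @ D @ U, K)             # spectral factorization, eq. (19)
assert np.allclose(Ubar @ (D @ D) @ U, K @ K)   # K^q = Ubar D^q U with q = 2
```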


The matrix of the circular autocovariances of the data is obtained by replacing the argument $z$ in the autocovariance generating function $\gamma(z)$ by the matrix $K_T$:

$D(y) = V^{\circ} = \gamma(K_T)$
$\quad = \gamma_0 I_T + \sum_{\tau=1}^{\infty} \gamma_\tau (K_T^{\tau} + K_T^{-\tau})$
$\quad = \gamma_0^{\circ} I_T + \sum_{\tau=1}^{T-1} \gamma_\tau^{\circ} (K_T^{\tau} + K_T^{-\tau})$. (21)

The circular autocovariances would be obtained by wrapping the sequence of ordinary autocovariances around a circle of circumference $T$ and adding the overlying values. Thus

$\gamma_\tau^{\circ} = \sum_{j=0}^{\infty} \gamma_{jT+\tau}$, with $\tau = 0, \ldots, T-1$. (22)

Given that $\lim(\tau \to \infty)\gamma_\tau = 0$, it follows that $\gamma_\tau^{\circ} \to \gamma_\tau$ as $T \to \infty$, which is to say that the circular autocovariances converge to the ordinary autocovariances as the circle expands.
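The convergence can be illustrated with an AR(1) process, for which $\gamma_\tau = \sigma^2 \rho^{\tau}/(1-\rho^2)$; the parameter values are my own choices:

```python
import numpy as np

rho, sigma2 = 0.7, 1.0
gamma = lambda tau: sigma2 * rho**tau / (1.0 - rho**2)   # ordinary autocovariances

def circular_gamma(tau, T, terms=200):
    # wrap gamma around a circle of circumference T: eq. (22)
    j = np.arange(terms)
    return gamma(j * T + tau).sum()

# the circular autocovariances approach the ordinary ones as the circle expands
err = [abs(circular_gamma(1, T) - gamma(1)) for T in (8, 16, 32, 64)]
assert all(a > b for a, b in zip(err, err[1:]))          # monotonically shrinking error
```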

The circulant autocovariance matrix is amenable to a spectral factorization of the form

$V^{\circ} = \gamma(K_T) = \bar{U} \gamma(D) U$, (23)

wherein the $j$th element of the diagonal matrix $\gamma(D)$ is

$\gamma(\exp\{i\omega_j\}) = \gamma_0 + 2\sum_{\tau=1}^{\infty} \gamma_\tau \cos(\omega_j \tau)$. (24)

This represents the cosine Fourier transform of the sequence of the ordinary autocovariances, and it corresponds to an ordinate (scaled by $2\pi$) sampled at the point $\omega_j = 2\pi j / T$, which is a Fourier frequency, from the spectral density function of the linear (i.e., noncircular) stationary stochastic process.

The method of Wiener–Kolmogorov filtering can also be implemented using the circulant dispersion matrices that are given by

$V_\xi^{\circ} = \bar{U} \gamma_\xi(D) U$, $V_\eta^{\circ} = \bar{U} \gamma_\eta(D) U$, and
$V^{\circ} = V_\xi^{\circ} + V_\eta^{\circ} = \bar{U}\{\gamma_\xi(D) + \gamma_\eta(D)\} U$, (25)

wherein the diagonal matrices $\gamma_\xi(D)$ and $\gamma_\eta(D)$ contain the ordinates of the spectral density functions of the component processes. By replacing the dispersion matrices within (5) and (6) by their circulant counterparts, we derive the following formulas:


$x = \bar{U} \gamma_\xi(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1} U y = P_\xi y$, (26)

$h = \bar{U} \gamma_\eta(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1} U y = P_\eta y$. (27)

Similar replacements within the formulas (7) and (8) provide the expressions for the error dispersion matrices that are appropriate to the circular filters.

The filtering formulas may be implemented in the following way. First, a Fourier transform is applied to the data vector $y$ to give $Uy$, which resides in the frequency domain. Then, the elements of the transformed vector are multiplied by those of the diagonal weighting matrices $J_\xi = \gamma_\xi(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}$ and $J_\eta = \gamma_\eta(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}$. Finally, the products are carried back into the time domain by the inverse Fourier transform, which is represented by the matrix $\bar{U}$. (An efficient implementation of a mixed-radix fast Fourier transform, which is designed to cope with samples of arbitrary sizes, has been provided by Pollock, 1999. The usual algorithms demand a sample size of $T = 2^{n}$.)
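In code, the three steps (transform, weight, transform back) require only the FFT. The spectral ordinates below are invented for illustration and are not taken from the paper:

```python
import numpy as np

T = 64
rng = np.random.default_rng(1)
y = np.cos(2*np.pi*3*np.arange(T)/T) + 0.5*rng.standard_normal(T)

omega = 2*np.pi*np.arange(T)/T
# hypothetical spectral ordinates of signal and noise at the Fourier frequencies
g_xi = 1.0/(1.05 - np.cos(omega))        # low-frequency signal spectrum
g_eta = np.full(T, 0.25)                 # white-noise spectrum

J_xi = g_xi/(g_xi + g_eta)               # diagonal weighting matrix J_xi
x = np.fft.ifft(J_xi * np.fft.fft(y)).real        # signal estimate, eq. (26)
h = np.fft.ifft((1 - J_xi) * np.fft.fft(y)).real  # noise estimate, eq. (27)

assert np.allclose(x + h, y)             # the circular filters add up exactly
```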

The frequency-domain realizations of the Wiener–Kolmogorov filters have sufficient flexibility to accommodate cases where the component processes $\xi(t)$ and $\eta(t)$ have band-limited spectra that are zero-valued beyond certain bounds. If the bands do not overlap, then it is possible to achieve a perfect decomposition of $y(t)$ into its components.

Let $V_\xi = \bar{U} \Lambda_\xi U$, $V_\eta = \bar{U} \Lambda_\eta U$, and $V = \bar{U}(\Lambda_\xi + \Lambda_\eta) U$, where $\Lambda_\xi$ and $\Lambda_\eta$ contain the ordinates of the spectral density functions of $\xi(t)$ and $\eta(t)$, sampled at the Fourier frequencies. Then, if these spectra are disjoint, there will be $\Lambda_\xi \Lambda_\eta = 0$, and the dispersion matrices of the two processes will be singular. The matrix $V = V_\xi + V_\eta$ will also be singular, unless the domains of the spectral density functions of the component processes partition the frequency range. Putting these details into (26) gives

$x = \bar{U} \Lambda_\xi \{\Lambda_\xi + \Lambda_\eta\}^{+} U y = \bar{U} J_\xi U y$, (28)

where $\{\Lambda_\xi + \Lambda_\eta\}^{+}$ denotes a generalized inverse. The corresponding error dispersion matrix is

$V_\xi - V_\xi (V_\xi + V_\eta)^{+} V_\xi = \bar{U} \Lambda_\xi U - \bar{U} \Lambda_\xi (\Lambda_\xi + \Lambda_\eta)^{+} \Lambda_\xi U$. (29)

But, if $\Lambda_\xi \Lambda_\eta = 0$, then $\Lambda_\xi (\Lambda_\xi + \Lambda_\eta)^{+} \Lambda_\xi = \Lambda_\xi$, and so the error dispersion is manifestly zero, which implies that $x = \xi$.

The significant differences between the time-domain and frequency-domain realizations of the Wiener–Kolmogorov filters occur at the ends of the sample. In the time-domain version, the coefficients of the filter vary as it moves through the sample, in a manner that prevents the filter from reaching beyond the bounds of the sample.

In the frequency-domain version, the implicit filter coefficients, which do not vary, are applied to the sample via a process of circular convolution. Thus, as it approaches the end of the sample, the filter takes an increasing proportion of its data from the beginning.

The implicit filter coefficients of the frequency-domain method are the circularly wrapped versions of the coefficients of a notional time-domain filter, defined in respect of $y(t) = \{y_t;\ t = 0, \pm 1, \pm 2, \ldots\}$, which may have an infinite impulse response. As the circle expands, the wrapped coefficients converge on the original coefficients in the same way as the circular autocovariances converge on the ordinary autocovariances.

Even when they form an infinite sequence, the coefficients of the notional time-domain filter will have a limited dispersion around their central value. Thus, as the sample size increases, the proportion of the filtered sequence that suffers from the end-effects diminishes, and the central values from the time-domain filtering, realized according to the formulas (5) and (6), and the frequency-domain filtering, realized according to (26) and (27), will converge point on point. By extrapolating the sample by a short distance at either end, one should be able to place the end-effects of the frequency-domain filter out of range of the sample.

    Example

The frequency response of a circular filter applied to a finite sample of $T$ data points is a periodic function defined over a set of $T$ adjacent Fourier frequencies $\omega_j = 2\pi j / T$, where $j$ takes $T$ consecutive integer values. It is convenient to set $j = 0, 1, \ldots, T-1$, in which case the response is defined over the frequency interval $[0, 2\pi)$.

However, it may be necessary to define the response over the interval $[-\pi, \pi)$, which is used, conventionally, for representing the continuous response function of a filter applied to a doubly infinite data sequence. In that case, $j = 0, \pm 1, \ldots, [T/2]$, where $[T/2]$ is the integral part of $T/2$.

The responses at the $T$ Fourier frequencies can take any finite real values, subject only to the condition that $\lambda_j = \lambda_{T-j}$ or to the equivalent condition that $\lambda_j = \lambda_{-j}$. These restrictions will ensure that the corresponding filter coefficients form a real-valued symmetric sequence. The $T$ coefficients of the circular filter may be obtained by applying the inverse discrete Fourier transform to the responses.

It is convenient, nevertheless, to specify the frequency response of a circular filter by sampling a predefined continuous response. Then, the coefficients of the circular filter are formally equivalent to those that would be found by wrapping the coefficients of an ordinary filter, pertaining to the continuous response, around a circle of circumference $T$ and adding the coincident values.

A filter that has a band-limited frequency response has an infinite set of filter coefficients. In that case, it is not practical to find the coefficients of the circular filter by wrapping the coefficients of the ordinary filter. Instead, they must be found by transforming a set of frequency-domain ordinates.


A leading example of a band-limited response is provided by an ideal lowpass frequency-selective filter that has a boxcar frequency response. Two versions of the filter may be defined, depending on whether the transitions between pass band and stop band occur at Fourier frequencies or between Fourier frequencies.

Consider a set of $T$ frequency-domain ordinates sampled, over the interval $[-\pi, \pi)$, from a boxcar function centered on $\omega = 0$. If the cutoff points lie between $\pm\omega_d = \pm 2\pi d / T$ and the succeeding Fourier frequencies, then the sampled ordinates will be

$\lambda_j = \begin{cases} 1, & \text{if } j \in \{-d, \ldots, d\}, \\ 0, & \text{otherwise}. \end{cases}$ (30)

    The corresponding filter coefficients will be given by

$\beta(k) = \begin{cases} \dfrac{2d+1}{T}, & \text{if } k = 0, \\[1ex] \dfrac{\sin([d + \frac{1}{2}]\omega_1 k)}{T \sin(\omega_1 k / 2)}, & \text{for } k = 1, \ldots, [T/2], \end{cases}$ (31)

where $\omega_1 = 2\pi / T$. This function is just an instance of the Dirichlet kernel; see, for example, Pollock (1999). Figure 1 depicts the frequency response for this filter at the Fourier frequencies, where $\lambda_j = 0, 1$. It also depicts the continuous frequency response that would be the consequence of applying an ordinary filter with these coefficients to a doubly infinite data sequence.

It is notable that a casual inspection of the continuous response would lead one to infer the danger of substantial leakage, whereby elements that lie within the stop band are allowed to pass into the filtered sequence. In fact, with regard to the finite sample, there is no leakage.

Figure 1. The frequency response of the 17-point wrapped filter defined over the interval $[-\pi, \pi)$. The values at the Fourier frequencies are marked by circles.
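The coefficients of (31) can be checked by comparing the closed-form Dirichlet expression with the inverse DFT of the sampled boxcar response; the choices $T = 17$ and $d = 4$ below are illustrative:

```python
import numpy as np

T, d = 17, 4
j = np.arange(T)
# boxcar response: unity on {0,...,d} and {T-d,...,T-1}, i.e. on {-d,...,d} mod T
lam = ((j <= d) | (j >= T - d)).astype(float)

beta_dft = np.fft.ifft(lam).real          # wrapped filter coefficients via inverse DFT

w1 = 2*np.pi/T
k = np.arange(1, T//2 + 1)
beta_formula = np.r_[(2*d + 1)/T,
                     np.sin((d + 0.5)*w1*k)/(T*np.sin(w1*k/2))]  # eq. (31)

assert np.allclose(beta_dft[:T//2 + 1], beta_formula)
```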


If the cutoff points fall on the Fourier frequencies $\pm\omega_d$, and if $\lambda_d = \lambda_{-d} = \frac{1}{2}$, then the filter coefficients will be

$\beta(k) = \begin{cases} \dfrac{2d}{T}, & \text{if } k = 0, \\[1ex] \dfrac{\cos(\omega_1 k / 2)\,\sin(d \omega_1 k)}{T \sin(\omega_1 k / 2)}, & \text{for } k = 1, \ldots, [T/2]. \end{cases}$ (32)

    4. POLYNOMIALS AND THE MATRIX DIFFERENCE OPERATOR

The remaining sections of this paper are devoted to methods of filtering nonstationary sequences generated by linear stochastic processes with unit roots in an autoregressive operator. To support the analysis, the present section deals with the algebra of the difference operator and of the summation operator, which is its inverse. When it is used to cumulate data, the summation operator automatically generates polynomial functions of the discrete-time index. This feature is also analyzed.

    The matrix that takes the pth difference of a vector of order T is

$\nabla_T^{p} = (I - L_T)^{p}$. (33)

We may partition this matrix so that $\nabla_T^{p} = [Q_*, Q]'$, where $Q_*'$ has $p$ rows. If $y$ is a vector of $T$ elements, then

$\nabla_T^{p} y = \begin{bmatrix} Q_*' \\ Q' \end{bmatrix} y = \begin{bmatrix} g_* \\ g \end{bmatrix}$, (34)

and $g_*$ is liable to be discarded, whereas $g$ will be regarded as the vector of the $p$th differences of the data.

The inverse matrix is partitioned conformably to give $\nabla_T^{-p} = [S_*, S]$. It follows that

$[S_*\ \ S]\begin{bmatrix} Q_*' \\ Q' \end{bmatrix} = S_* Q_*' + S Q' = I_T$ (35)

    and that

$\begin{bmatrix} Q_*' \\ Q' \end{bmatrix}[S_*\ \ S] = \begin{bmatrix} Q_*' S_* & Q_*' S \\ Q' S_* & Q' S \end{bmatrix} = \begin{bmatrix} I_p & 0 \\ 0 & I_{T-p} \end{bmatrix}$. (36)

If $g_*$ is available, then $y$ can be recovered from $g$ via

$y = S_* g_* + S g$. (37)
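These partitions are simple to realize in code. The sketch below forms $\nabla_T^p$, splits off $Q_*'$ and $Q'$, inverts to obtain $S_*$ and $S$, and confirms the identities (35) and (37); the sample size and data are arbitrary:

```python
import numpy as np

T, p = 10, 2
L = np.eye(T, k=-1)                             # finite-sample lag operator L_T
Dp = np.linalg.matrix_power(np.eye(T) - L, p)   # difference matrix of eq. (33)

Qs_t, Q_t = Dp[:p, :], Dp[p:, :]                # Q_*' (p rows) and Q', eq. (34)
Sinv = np.linalg.inv(Dp)                        # summation operator
Ss, Sm = Sinv[:, :p], Sinv[:, p:]               # S_* and S, eq. (35)

y = np.arange(T, dtype=float)**2                # any data vector
g_star, g = Qs_t @ y, Q_t @ y                   # discarded and retained differences

assert np.allclose(Ss @ Qs_t + Sm @ Q_t, np.eye(T))   # identity (35)
assert np.allclose(Ss @ g_star + Sm @ g, y)           # recovery of y, eq. (37)
```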


The lower-triangular Toeplitz matrix $\nabla_T^{-p} = [S_*, S]$ is completely characterized by its leading column. The elements of that column are the ordinates of a polynomial of degree $p-1$, of which the argument is the row index $t = 0, 1, \ldots, T-1$. Moreover, the leading $p$ columns of the matrix $\nabla_T^{-p}$, which constitute the submatrix $S_*$, provide a basis for all polynomials of degree $p-1$ that are defined on the integer points $t = 0, 1, \ldots, T-1$.

It follows that $S_* g_* = S_* Q_*' y$ contains the ordinates of a polynomial of degree $p-1$, which is interpolated through the first $p$ elements of $y$, indexed by $t = 0, 1, \ldots, p-1$, and which is extrapolated over the remaining integers $t = p, p+1, \ldots, T-1$.

A polynomial that is designed to fit the data should take account of all of the observations in $y$. Imagine, therefore, that $y = \phi + \eta$, where $\phi$ contains the ordinates of a polynomial of degree $p-1$ and $\eta$ is a disturbance term with $E(\eta) = 0$ and $D(\eta) = V_\eta$. Then, in forming an estimate $f = S_* r_*$ of $\phi$, we should minimize the sum of squares $\eta' V_\eta^{-1} \eta$. Because the polynomial is fully determined by the elements of a starting-value vector $r_*$, this is a matter of minimizing

$(y - f)' V_\eta^{-1} (y - f) = (y - S_* r_*)' V_\eta^{-1} (y - S_* r_*)$ (38)

with respect to $r_*$. The resulting values are

$r_* = (S_*' V_\eta^{-1} S_*)^{-1} S_*' V_\eta^{-1} y$ and $f = S_* (S_*' V_\eta^{-1} S_*)^{-1} S_*' V_\eta^{-1} y$. (39)

An alternative representation of the estimated polynomial is available, which avoids the inversion of $V_\eta$. This is provided by the identity

$P_* = S_* (S_*' V_\eta^{-1} S_*)^{-1} S_*' V_\eta^{-1} = I - V_\eta Q (Q' V_\eta Q)^{-1} Q' = I - P_Q$, (40)

which gives two representations of the projection matrix $P_*$. The equality follows from the fact that, if $\mathrm{Rank}[R, S_*] = T$ and if $S_*' V_\eta^{-1} R = 0$, then

$S_* (S_*' V_\eta^{-1} S_*)^{-1} S_*' V_\eta^{-1} = I - R (R' V_\eta^{-1} R)^{-1} R' V_\eta^{-1}$. (41)

Setting $R = V_\eta Q$ gives the result.

Grenander and Rosenblatt (1957) have shown that, when the disturbances are generated by a stationary stochastic process, the generalized least-squares (GLS) estimator of a polynomial regression that incorporates the matrix $V_\eta$ is asymptotically equivalent to the ordinary least-squares (OLS) estimator that has the identity matrix $I_T$ in place of $V_\eta$. The result has also been demonstrated by Anderson (1971). It means that one can use the OLS estimator without significant detriment to the quality of the estimates.

The result is proved by showing that the metric of the GLS regression is asymptotically equivalent to the Euclidean metric of OLS regression. It is required that, in the limit, the columns of $V_\eta^{-1} S_*$ and $S_*$ should span the same


space. It follows that the projection operator $P_*$ of (40) is asymptotically equivalent to $S_* (S_*' S_*)^{-1} S_*' = I - Q (Q' Q)^{-1} Q'$.

The residuals of an OLS polynomial regression of degree $p$, which are given by $y - f = Q (Q' Q)^{-1} Q' y$, contain the same information as the vector $g = Q' y$ of the $p$th differences of the data. The difference operator has the effect of nullifying the element of zero frequency and of attenuating radically the adjacent low-frequency elements. Therefore, the low-frequency spectral structures of the data are not perceptible in the periodogram of the differenced sequence.

On the other hand, the periodogram of a trended sequence is liable to be dominated by its low-frequency components, which will mask the other spectral structures. However, the periodogram of the polynomial regression residuals can be relied upon to reveal the spectral structures at all frequencies. Moreover, by varying the degree $p$ of the polynomial, one is able to alter the relative emphasis that is given to high-frequency and low-frequency structures.

5. WIENER–KOLMOGOROV FILTERING OF NONSTATIONARY DATA

The treatment of trended data must accommodate stochastic processes with drift. Therefore, it will be assumed that, within $y = \xi + \eta$, the trend component $\xi = \phi + \zeta$ is the sum of a vector $\phi$, containing ordinates sampled from a polynomial in $t$ of degree $p$ at most, and a vector $\zeta$ from a stochastic process with $p$ unit roots that is driven by a zero-mean process.

If $Q'$ is the $p$th difference operator, then $Q' \phi = \mu i$, with $i = [1, 1, \ldots, 1]'$, will contain a constant sequence of values, which will be zeros if the degree of the stochastic drift is less than $p$, which is the degree of differencing. Also, $Q' \zeta$ will be a vector sampled from a mean-zero stationary process. Therefore, $\delta = Q' \xi$ is from a stationary process with a constant mean. Thus, there is

$Q' y = Q' \xi + Q' \eta$
$\quad = \delta + \kappa = g$, (42)

    where

$E(\delta) = \mu i$, $D(\delta) = V_\delta$,
$E(\kappa) = 0$, $D(\kappa) = V_\kappa = Q' V_\eta Q$, and $C(\delta, \kappa) = 0$. (43)

Let the estimates of $\xi$, $\eta$, $\delta = Q' \xi$, and $\kappa = Q' \eta$ be denoted by $x$, $h$, $d$, and $k$, respectively. Then, with $E(g) = E(\delta) = \mu i$, there is

$E(\delta \mid g) = E(\delta) + V_\delta (V_\delta + V_\kappa)^{-1} \{g - E(g)\}$
$\quad = \mu i + V_\delta (V_\delta + Q' V_\eta Q)^{-1} \{g - \mu i\} = d$, (44)

$E(\kappa \mid g) = E(\kappa) + V_\kappa (V_\delta + V_\kappa)^{-1} \{g - E(g)\}$
$\quad = Q' V_\eta Q (V_\delta + Q' V_\eta Q)^{-1} \{g - \mu i\} = k$; (45)


and these vectors obey an adding-up condition:

$Q' y = d + k = g$. (46)

In (44), the low-pass filter matrix $Z_\delta = V_\delta (V_\delta + Q' V_\eta Q)^{-1}$ will virtually conserve the vector $\mu i$, which is an element of zero frequency. In (45), the complementary high-pass filter matrix $Z_\kappa = Q' V_\eta Q (V_\delta + Q' V_\eta Q)^{-1}$ will virtually nullify the vector. Its failure to do so completely is attributable to the fact that the filter matrix is of full rank. As the matrix converges on its asymptotic form, the nullification will become complete. It follows that, even when the degree of the stochastic drift is $p$, one can set

$d = V_\delta (V_\delta + V_\kappa)^{-1} g = V_\delta (V_\delta + Q' V_\eta Q)^{-1} Q' y$, (47)

$k = V_\kappa (V_\delta + V_\kappa)^{-1} g = Q' V_\eta Q (V_\delta + Q' V_\eta Q)^{-1} Q' y$. (48)

The estimates of $\xi$ and $\eta$ may be obtained by integrating, or reinflating, the components of the differenced data to give

$x = S_* d_* + S d$ and $h = S_* k_* + S k$. (49)

For this, the starting values $d_*$ and $k_*$ are required. The initial conditions in $d_*$ should be chosen so as to ensure that the estimated trend is aligned as closely as possible with the data. The criterion is

Minimize $(y - S_* d_* - S d)' V_\eta^{-1} (y - S_* d_* - S d)$ with respect to $d_*$. (50)

    The solution for the starting values is

$d_* = (S_*' V_\eta^{-1} S_*)^{-1} S_*' V_\eta^{-1} (y - S d)$. (51)

The starting values of $k_*$ are obtained by minimizing the (generalized) sum of squares of the fluctuations:

Minimize $(S_* k_* + S k)' V_\eta^{-1} (S_* k_* + S k)$ with respect to $k_*$. (52)

    The solution is

$k_* = -(S_*' V_\eta^{-1} S_*)^{-1} S_*' V_\eta^{-1} S k$. (53)

It is straightforward to show that, when (42) holds,

$y = x + h$, (54)

which is to say that the addition of the estimates produces the original data series. As in (40), let $P_* = S_* (S_*' V_\eta^{-1} S_*)^{-1} S_*' V_\eta^{-1}$. Then, the two components can be written as


$x = S d + S_* d_* = S d + P_* (y - S d)$, (55)

$h = S k + S_* k_* = (I - P_*) S k$. (56)

Adding these and using $d + k = Q' y$ from (46) gives

$x + h = P_* y + (I - P_*) S Q' y = P_* y + (I - P_*)(S Q' + S_* Q_*') y$
$\quad = P_* y + (I - P_*) y = y$. (57)

Here, the first equality follows from the fact that $(I - P_*) S_* = 0$, and the second follows from the identity $S Q' + S_* Q_*' = I$, which is from (35). In view of (49), it can be seen that the condition $x + h = y$ signifies that the criteria of (50) and (52) are equivalent.

The starting values $k_*$ and $d_*$ can be eliminated from the expressions for $x$ and $h$, which provide the estimates of the components. Substituting the expression for $P_Q$ from (40) into $h = (I - P_*) S k = P_Q S k$, together with the expression for $k$ from (48), and using the identity $Q' S = I_{T-p}$ gives

$h = V_\eta Q (V_\delta + Q' V_\eta Q)^{-1} Q' y$. (58)

A similar reduction can be pursued in respect of the equation $x = (I - P_*) S d + P_* y = P_Q S d + (I - P_Q) y$. However, it follows immediately from (54) that

$x = y - h$
$\quad = y - V_\eta Q (V_\delta + Q' V_\eta Q)^{-1} Q' y$. (59)

Observe that the filter matrix $Z_h = V_\eta Q (V_\delta + Q' V_\eta Q)^{-1}$ of (58), which delivers $h = Z_h g$, differs from the matrix $Z_\kappa = Q' Z_h$ of (48), which delivers $k = Z_\kappa g$, only in respect of the matrix difference operator $Q'$. The effect of omitting the operator is to remove the need for reinflating the filtered components and thus to remove the need for the starting values.

Equation (59) can also be derived via a straightforward generalization of the chi-square criterion of (10). If we regard the elements of $d_*$ as fixed values, then the dispersion matrix of $\xi = S_* d_* + S \delta$ is the singular matrix $D(\xi) = V_\xi = S V_\delta S'$. On setting $\eta = y - \xi$ in (10) and replacing the inverse of $V_\xi$ by the generalized inverse $V_\xi^{+} = Q V_\delta^{-1} Q'$, we get the function

$S(\xi) = (y - \xi)' V_\eta^{-1} (y - \xi) + \xi' Q V_\delta^{-1} Q' \xi$, (60)

    of which the minimizing value is

$x = (Q V_\delta^{-1} Q' + V_\eta^{-1})^{-1} V_\eta^{-1} y$. (61)

    The matrix inversion lemma gives

$(Q V_\delta^{-1} Q' + V_\eta^{-1})^{-1} = V_\eta - V_\eta Q (Q' V_\eta Q + V_\delta)^{-1} Q' V_\eta$, (62)


and putting this into (61) gives the expression of (59). The matrix of (62) also constitutes the error dispersion matrix $D(\eta \mid y) = D(\xi \mid y)$, which, in view of the adding-up property (54), is common to the estimates of the two components.

It is straightforward to compute $h$ when $V_\eta$ and $V_\delta$ have narrow-band moving-average forms. Consider $h = V_\eta Q b$, where $b = (Q' V_\eta Q + V_\delta)^{-1} g$. First, $b$ may be calculated by solving the equation $(Q' V_\eta Q + V_\delta) b = g$. This involves the Cholesky decomposition $Q' V_\eta Q + V_\delta = G G'$, where $G$ is a lower-triangular matrix with a number of nonzero diagonal bands equal to the recursive order of the filter. The system $G G' b = g$ is cast as $G p = g$ and solved for $p$. Then, $G' b = p$ is solved for $b$. These are the recursive operations. Given $b$, the estimate $h$ can then be obtained via direct multiplications.
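A familiar special case makes the computation concrete. Setting $V_\delta = I$ and $V_\eta = \lambda I$ with $p = 2$ reduces $x = y - V_\eta Q(V_\delta + Q'V_\eta Q)^{-1}Q'y$ to the Hodrick–Prescott trend with smoothing parameter $\lambda$; this specialization is my gloss, not a claim made in the text above:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def wk_trend(y, lam=1600.0, p=2):
    """Trend estimate x of eq. (59) with V_delta = I and V_eta = lam*I.
    With p = 2 this coincides with the Hodrick-Prescott filter."""
    T = len(y)
    Dp = np.linalg.matrix_power(np.eye(T) - np.eye(T, k=-1), p)
    Q_t = Dp[p:, :]                       # the p-th difference operator Q'
    # solve (V_delta + Q' V_eta Q) b = Q'y via a Cholesky factorization GG'
    A = np.eye(T - p) + lam * Q_t @ Q_t.T
    b = cho_solve(cho_factor(A, lower=True), Q_t @ y)
    h = lam * Q_t.T @ b                   # noise estimate, eq. (58)
    return y - h                          # trend estimate, eq. (59)

t = np.arange(80, dtype=float)
y = 2.0 + 0.5 * t                         # a linear trend: Q'y = 0, so h = 0
x = wk_trend(y, lam=1600.0)
assert np.allclose(x, y)                  # the filter preserves the trend exactly
```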

6. APPLYING THE FOURIER METHOD TO NONSTATIONARY SEQUENCES

The method of processing nonstationary sequences that has been expounded in the preceding section depends upon the finite-sample versions of the difference operator and the summation operator. These are derived by replacing $z$ in the functions $\nabla(z) = 1 - z$ and $S(z) = (1 - z)^{-1} = \{1 + z + z^{2} + \cdots\}$ by the finite-sample lag operator $L_T$, which gives rise to a nonsingular matrix $\nabla(L_T) = [Q_*, Q]'$ and its inverse $\nabla^{-1}(L_T) = S(L_T) = [S_*, S]$.

The identity of (40), which is due to the interaction of the matrices $S_*$ and $Q'$, has enabled us to reinflate the components of the differenced sequence $Q' y = g = d + k$ without explicitly representing the initial conditions or start values $d_*$ and $k_*$ that are to be found in equations (55) and (56).

The operators that are derived by replacing $z$ in $\nabla(z)$ and $S(z)$ by the circulant lag operator $K_T$ are problematic. On the one hand, putting $K_T$ in place of $z$ within $\nabla(z)$ results in a noninvertible matrix of rank $T-1$. On the other hand, the replacement of $z$ within the series expansion of $S(z)$ results in a nonconvergent series that corresponds to a matrix in which all of the elements are uniformly of infinite value.

In applying the Fourier method of signal extraction to nonstationary sequences, one recourse is to use $\nabla(L_T)$ and $S(L_T)$ for the purposes of reducing the data to stationarity and of reinflating them. The components $\delta$ and $\kappa$ of the differenced data may be estimated via the equations

$d = \bar{U} \Lambda_\delta (\Lambda_\delta + \Lambda_\kappa)^{+} U g = P_\delta g$, (63)

$k = \bar{U} \Lambda_\kappa (\Lambda_\delta + \Lambda_\kappa)^{+} U g = P_\kappa g$. (64)

For a vector $\mu i$ of repeated elements, there will be $P_\delta \mu i = \mu i$ and $P_\kappa \mu i = 0$.

Whereas the estimates of $\delta$, $\kappa$ may be extracted from $g = Q' y$ by the Fourier method, the corresponding estimates $x$, $h$ of $\xi$, $\eta$ will be found from equations (55) and (56), which originate in the time-domain approach and which require explicit initial conditions.

It may also be appropriate, in this context, to replace the criteria of (50) and (52), which generate the values of $d_*$ and $k_*$, by simplified criteria wherein $V_\eta$ is replaced by the identity matrix $I_T$. Then

d_* = (S_*'S_*)^{-1} S_*'(y − Sd)  and  k_* = −(S_*'S_*)^{-1} S_*'Sk.    (65)

The available formulas for the summation of sequences provide convenient expressions for the values of the elements of S_*'S_* (see, e.g., Banerjee, Dolado, Galbraith, and Hendry, 1993, p. 20).
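The simplified formulae of (65) amount to ordinary least-squares computations. The following sketch illustrates them for single differencing (p = 1); the split of the differenced data into d and k is arbitrary here, since any split with d + k = g leads to reinflated components that sum to y.

```python
import numpy as np

T = 8
p = 1                                   # single differencing, for simplicity

# Partition of the summation operator S(L_T) = [S_*, S]: the leading
# column S_* carries the starting value, and S reinflates the
# differenced part of the data.
Ssum = np.tril(np.ones((T, T)))
S_star, S = Ssum[:, :p], Ssum[:, p:]

rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(T)) + 5.0   # a trended sequence
g = np.diff(y)                                 # differenced data

# Suppose the differenced data have been decomposed as g = d + k.
d = 0.5 * g
k = g - d

# Equation (65), with V_eta replaced by the identity matrix: the
# starting values are ordinary least-squares estimates.
A = np.linalg.inv(S_star.T @ S_star)
d_star = A @ S_star.T @ (y - S @ d)
k_star = -A @ S_star.T @ (S @ k)

# The reinflated components sum to y exactly, since d + k = g and
# d_star + k_star equals the initial element y[0].
x = S_star @ d_star + S @ d
h = S_star @ k_star + S @ k
assert np.allclose(x + h, y)
```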

An alternative recourse, which is available in the case of a high-pass or band-pass filter that nullifies the low-frequency components of the data, entails removing the implicit differencing operator from the filter. (In an appendix to their paper, Baxter and King, 1999, demonstrate the presence, within a symmetric band-pass filter, of two unit roots, i.e., of a twofold differencing operator.)

Consider a filter defined in respect of a doubly infinite sequence, and let φ(z) be the transfer function of the filter, that is, the z-transform of the filter coefficients. Imagine that φ(z) contains the factor (1 − z)^p, and let ψ(z) = (1 − z)^{-p} φ(z). Then ψ(z) defines a filter of which the finite-sample version can be realized by the replacement of z by K_T.

Because K_T = Ū D U, the filter matrix can be factorized as ψ(K_T) = C = Ū ψ(D) U. On defining J_ψ = ψ(D), which is a diagonal weighting matrix, the estimate of the signal component is given by the equation

x = Ū J_ψ U g.    (66)
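Since J_ψ is diagonal, equation (66) amounts to weighting the Fourier ordinates of the differenced data, which the FFT accomplishes in O(T log T) operations. The sketch below uses illustrative high-pass weights of zeros and units; the cutoff of 0.1 cycles per sample is an arbitrary choice, not one taken from the paper.

```python
import numpy as np

T = 64
rng = np.random.default_rng(2)
g = rng.standard_normal(T)             # differenced (stationary) data

# Illustrative weights for J_psi: an ideal highpass response that
# selects the Fourier ordinates at or above 0.1 cycles per sample.
freqs = np.abs(np.fft.fftfreq(T))
J_psi = (freqs >= 0.1).astype(float)

# Equation (66), x = U-bar J_psi U g, realized with the FFT.
x = np.real(np.fft.ifft(J_psi * np.fft.fft(g)))

# Because J_psi contains only zeros and units, the filter is
# idempotent: applying it a second time changes nothing.
x2 = np.real(np.fft.ifft(J_psi * np.fft.fft(x)))
assert np.allclose(x2, x)
```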

Whichever procedure is adopted, the Fourier method is applied to data that have been reduced to stationarity by differencing. Given the circularity of the method, it follows that, as a filter approaches one end of the sample, it will take an increasing proportion of its data from the other end. To avoid some of the effects of this, one may extrapolate the data at both ends of the sample. The extrapolation can be done either before or after the data have been differenced. After the filtering, the extra-sample elements may be discarded.

If the trended data have been extrapolated, then the starting values d_* and k_*, which are required by the first procedure, may be obtained by minimizing the quadratic functions (50) and (52) defined over the enlarged data set. Otherwise, if the differenced data have been extrapolated, then the extra-sample points of the filtered sequence are discarded prior to its reinflation, and the quadratic functions that are to be minimized in pursuit of the starting values are defined over the T points corresponding to the original sample.

FILTERING AND POLYNOMIAL REGRESSION 85

Extrapolating the differenced data has the advantage of facilitating the process of tapering, whereby the ends of the sample are reduced to a level at which they can meet when the data are wrapped around a circle. The purpose of this is to avoid any radical disjunctions that might otherwise occur at the meeting point and that might have ill effects throughout the filtered sequences.
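One common device for such tapering is a split cosine ramp applied to the points at either end of the differenced data; the particular taper below is our illustrative choice, not one prescribed in the paper.

```python
import numpy as np

def split_cosine_taper(g, m):
    """Scale the m points at each end of g by a raised-cosine ramp
    that descends to zero, so that the two ends of the sequence can
    meet when it is wrapped around a circle."""
    g = np.asarray(g, dtype=float).copy()
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(m) / m))   # rises from 0
    g[:m] *= ramp
    g[-m:] *= ramp[::-1]
    return g

# A differenced sequence whose ends sit at different levels.
g = np.linspace(3.0, -2.0, 50)
gt = split_cosine_taper(g, 8)

# Both ends are now at zero, so they meet without a disjunction;
# the interior of the sequence is untouched.
assert gt[0] == 0.0 and gt[-1] == 0.0
```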

We should also note that, in the case of a finite band-pass filter, a third possibility arises. This is to apply the filter directly to the original, undifferenced, data. Such a procedure has been followed both by Baxter and King (1999) and by Christiano and Fitzgerald (2003). The latter have found an ingenious way of extrapolating the data so as to enable the filter to generate values at the ends of the sample.

The Fourier method can be used to advantage whenever the components of the data are to be found in disjoint frequency intervals. If the components are separated by wide spectral dead spaces, then it should be possible to extract them from the data with filters that have gradual transitions confined to the dead spaces. In that case, Wiener–Kolmogorov time-domain filters of low orders can be used. However, when the components are adjacent or are separated by narrow dead spaces, it is difficult to devise a time-domain filter with a transition that is sharp enough and that does not suffer from problems of instability. Then, a Fourier method should be used instead.

    Example

Figure 2 shows the logarithms of a monthly sequence of 132 observations of the U.S. money supply, through which a trend has been interpolated, and Figure 3 shows the sequence of the deviations from the trend. The data are from Bolch and Huang (1974).

The trend has been estimated with reference to Figure 4, which shows the periodogram of the residuals from a quadratic detrending of the data. There is a tall spike in the periodogram at ω_11 = π/6, which represents the fundamental frequency of an annual seasonal fluctuation. To the left of this spike, there is a spectral mass, which belongs to the trend component. Its presence is an indication of the inadequacy of the quadratic detrending. To remove this spectral mass and to estimate the trend more effectively, a Fourier method has been deployed.

Figure 2. The plot of the logarithms of 132 monthly observations on the U.S. money supply, beginning in January 1960. A trend, estimated by the Fourier method, has been interpolated through the data.

An inspection of the numerical values of the ordinates of the periodogram has indicated that, within the accuracy of the calculations, the element at ω_10, which is adjacent to the seasonal frequency, is zero-valued. This defines the point of transition of the low-pass filter that has been used to isolate the trend component of the data. A filter with a less abrupt transition would allow the powerful seasonal component to leak into the trend and vice versa. Experiments using Wiener–Kolmogorov time-domain filters have shown the detriment of this.

The trend in Figure 2 has been created by reinflating a sequence that has been synthesized from the relevant ordinates selected from the Fourier transform of the twice-differenced data. The latter have a positive mean value, which corresponds to a quadratic drift in the original data. The differenced data have not been extrapolated, because there appears to be no detriment in the end effects. Moreover, the regularity of the seasonal fluctuations, as revealed by the sequence of the deviations from the trend in Figure 3, is a testimony to the success in estimating the trend.

Figure 3. The deviations of the logarithmic money-supply data from the interpolated trend.

Figure 4. The periodogram of the residuals from the quadratic detrending of the logarithmic money-supply data.
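The procedure of the example can be sketched end to end on synthetic data in place of the money-supply series. The sample size below is chosen so that the twice-differenced sequence contains an integral number of annual cycles, which spares the sketch from the leakage and end-effect issues discussed above: the data are twice differenced, the Fourier ordinates at and above the seasonal frequency are suppressed, and the filtered sequence is reinflated by twofold summation, with the starting values obtained by least squares.

```python
import numpy as np

# Synthetic stand-in for the logarithmic money-supply data: a quadratic
# trend plus an annual seasonal cycle at the frequency pi/6 radians per
# (monthly) observation. T is chosen so that the twice-differenced
# sequence, of length T - 2 = 132, spans exactly 11 annual cycles.
T = 134
t = np.arange(T)
trend = 4.0 + 0.01 * t + 0.0002 * t**2
y = trend + 0.05 * np.sin(np.pi * t / 6.0)

# Twice-difference the data; the positive mean of g corresponds to the
# quadratic drift in y.
g = np.diff(y, n=2)
n = g.size

# Lowpass selection in the frequency domain: retain only the Fourier
# ordinates below a transition point placed well under the fundamental
# seasonal frequency of 1/12 cycles per sample.
freqs = np.abs(np.fft.fftfreq(n))
keep = (freqs < 0.05).astype(float)
d = np.real(np.fft.ifft(keep * np.fft.fft(g)))

# Reinflate by twofold summation. The matrix S2 inverts twofold
# differencing; its first two columns carry the starting values,
# which are chosen by least-squares regression.
L = np.tril(np.ones((T, T)))
S2 = L @ L
S_star, S = S2[:, :2], S2[:, 2:]
d_star, *_ = np.linalg.lstsq(S_star, y - S @ d, rcond=None)
x = S_star @ d_star + S @ d            # the estimated trend

# With the seasonal confined to a single pair of Fourier ordinates,
# the estimated trend is close to the true trend.
assert np.max(np.abs(x - trend)) < 0.05
```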


REFERENCES

Anderson, T.W. (1971) The Statistical Analysis of Time Series. Wiley.

Banerjee, A., J. Dolado, J.W. Galbraith, & D.F. Hendry (1993) Co-integration, Error-Correction and the Econometric Analysis of Non-Stationary Data. Oxford University Press.

Baxter, M. & R.G. King (1999) Measuring business cycles: Approximate band-pass filters for economic time series. Review of Economics and Statistics 81, 575–593.

Bolch, B.W. & C.J. Huang (1974) Multivariate Statistical Methods for Business and Economics. Prentice-Hall.

Christiano, L.J. & T.J. Fitzgerald (2003) The band-pass filter. International Economic Review 44, 435–465.

Gray, R.M. (2002) Toeplitz and Circulant Matrices: A Review. Information Systems Laboratory, Department of Electrical Engineering, Stanford University, California (http://ee.stanford.edu/~gray/toeplitz.pdf).

Grenander, U. & M. Rosenblatt (1957) Statistical Analysis of Stationary Time Series. Wiley.

Pollock, D.S.G. (1999) A Handbook of Time-Series Analysis, Signal Processing and Dynamics. Academic Press.

Pollock, D.S.G. (2000) Trend estimation and de-trending via rational square wave filters. Journal of Econometrics 99, 317–334.

Pollock, D.S.G. (2002) Circulant matrices and time-series analysis. International Journal of Mathematical Education in Science and Technology 33, 213–230.

Whittle, P. (1983) Prediction and Regulation by Linear Least-Square Methods, 2nd rev. ed. Basil Blackwell.
