Modelling Non-linear and Non-stationary Time Series€¦ · Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series

Modelling Non-linear and Non-stationaryTime Series

Chapter 2: Non-parametric methods

Henrik Madsen

Advanced Time Series Analysis

September 2016

Henrik Madsen (02427 Adv. TS Analysis) Lecture Notes September 2016 1 / 45

Kernel estimators in time series analysis Non-parametric estimation

Non-parametric estimation

Free of parameters and model structure (apart from a smoothingconstant).

Useful if no prior information about the structure is available.

The regression curve is a direct function of the data.

Important methods: Kernel estimation, splines, and nearestneighbor.



Introduction

Assume that Y has PDF f (y), then

f (y) = limh→0

12h

P (Y − h < Y ≤ Y + h)

.Assume further we have N observations. An estimator of f (y) is:

f (y) =1

2hN[Number of observations in (y-h;y+h)]

f (x) =1

2hN

n∑

i=1

χ]x−h,x+h](xi)

where χ is the characteristic function:

χ[x−h,x+h](xi) =

{

1 if xi ∈]x − h, x + h],

0 if xi 6∈]x − h, x + h]



Introduction

Define

w(u) =

{

12 if |u| < 1

0 otherwise

then

f (y) =1N

N∑

t=1

1h

w

(

y − Yt

h

)

The general kernel estimator:∫

k(u) du = 1

f (y) =1

Nh

N∑

t=1

k

(

y − Yt

h

)


Kernel estimators in time series analysis The kernel estimator

The kernel estimator

Assume {Yt}, {Zt} are strongly (or strictly) stationary processesMadsen (2008).The basic quantity estimated is then Robinson (1983):

E [g(Zt)|Yt = y] fYt(y) (*)

where fYt(y) is the PDF of Yt. The solution to (*) is given by

[G(Zt); y] = (N′h)−1N′

∑

t=1

G(Zt)k

(

y − Yt

h

)



The kernel estimator - examples

Example 1

fY(y) = [1; y] =1

Nh

N∑

t=1

k

(

y − Yt

h

)

Example 2

E [G(Zt)|Yt = y] =[G(Zt); y][1; y]

=1

N′

∑

G(Zt)k(y−Yt

h )1N

∑

k( y−Yth )

Notice the special caseg(Zt) = Yt+1

leads to estimation ofE [Yt+1|Yt = y]




Assume that the theoretical relation is

Y = g(X(1), . . . ,X(q)) + ǫ

where ǫ is white noise, and g is an unknown continuous function.Given N observations

O = {(X(1)1 , . . . ,X(q)

1 , Y1), . . . , (X(1)N , . . . ,X(q)

N , YN)}

the goal is now to estimate the function g.If q = 1 the kernel estimator for g given the observations is

g(x) =1N

∑Ns=1 Ysk{h−1(x − Xs)}

1N

∑Ns=1 k{h−1(x − Xs)}




It is shown in Robinson (1983) that under reasonable assumptionsabout the stochastic process {Xt} and the kernel, the kernel estimatorconverges in probability to the conditional expectation of Y:

g(x)p−→ E [Y|X = x]

More precisely:

N → ∞, h → 0,Nh → ∞Then at every point of continuity of g(x):

g(x)p−→ g(x)


Kernel estimators in time series analysis Kernel functions

Kernel functions

The function, k is the kernel, and h is called the bandwidth of thekernel.A kernel is a real and bounded function which satisfies

∫

k(u) du = 1.The kernel with bandwidth, h, is given by

Kh(u) = h−1k(u/h)

Commonly used kernels are Gaussian, Epanechnikov kernel,rectangular, and triangular kernels.



Important kernels

The Epanechnikov kernel with bandwidth h is given by :

KEpah (u) =

34h

(1 − u2

h2 )I{|u|≤h}

The Gaussian kernel with bandwidth h is given by:

KGauh (u) =

1

h√

2πexp

(

− u2

2h2

)



Kernel functions - illustration

0

0.2

0.4

0.6

0.8

1

-2 -1.5 -1 -0.5 0 0.5 1 1.5 2

K(u)

u

Triangular

Epanechnikov

uniform

Gauss



Nearest Neighbor (NN) estimation

Previously: h was fixed.

Another (and often better) alternative is* k-NN (k Nearest Neighbor) (Note: here, k is a natural number -not the kernel).

Since it is reasonable to link the value of k to N we often consider afraction (or percentage) of the data. Often the symbol, ℏ, is used in thiscase.



Multivariate kernels

If we allow q > 1 in (1) a kernel of q’th order have to be used, i.e. a realbounded function kq(u1, . . . , uq), where

∫

kq(u)du = 1.By a generalization of the bandwidth h to the quadratic matrix h withthe dimension q × q the more general kernel estimator forg(X(1), . . . ,X(q)) is

g(x) =1N

∑Ns=1 Yskq[(x − Xs)h−1]

1N

∑Ns=1 kq[(x − Xs)h−1]

where x = (x1, . . . , xq) and Xs = (X(1)s , . . . ,X(q)

s ). This is called theNadaraya-Watson estimator.



Multivariate kernels - example

If kq is parameterized using the normal distribution, Nq(0, I), it is seenthat |h|−1kq

(

(x − z)h−1) is the value in the point, x of the q-dimensionalnormal density function with mean z and covariance h · hT .Hence it is possible to take into account correlations in data.

Y ∼ N(0, I), Y = (X − z)h−1 ⇒

X ∼ N(z, hhT)



Product kernel

In practice product kernels are used, i.e.

kq(xh−1) =

q∏

i=1

k(xi/hi)

where k is a kernel of order 1.Assuming that hi = h, i = 1, . . . , q the following generalization of theone dimensional case in (1) is obtained

g(x) =1N

∑Ns=1 Ys

∏qi=1 k{h−1(xi − X(i)

s )}1N

∑Ns=1

∏qi=1 k{h−1(xi − X(i)

s )}

In case of a Gaussian kernel, this corresponds to a diagonalcovariance matrix.


Kernel estimators in time series analysis Non-parametric LS

Non-parametric LS

If we define the weight

ws(x) =

∏qi=1 k{h−1(xi − X(i)

s )}1N

∑Ns=1

∏qi=1 k{h−1(xi − X(i)

s )}

then it is seen that the estimator corresponds to the local average:

g(x) =1N

N∑

s=1

ws(x)Ys



Non-parametric LS - continued

The estimate is actually a non-parametric least squares estimate at thepoint x. This is recognized from the fact that the solution to the leastsquares problem

arg minθ

1N

N∑

s=1

ws(x)(Ys − θ)2

is given by

θLS(x) =

∑Ns=1 ws(x)Ys

∑Ns=1 ws(x)



Non-parametric LS - continued

This shows that at each x, g is essentially a weighted LS locationestimate, i.e.

g(x) = θLS(x)1N

N∑

s=1

ws(x) = θLS(x)

The kernel estimate is equivalent to a local weighted least squaresestimate.



Variance of the non-parametric estimates

To assess the variance of the curve estimate at point x Hardle (1990)proposes the point-wise estimator given by

σ2(x) =1N

N∑

s=1

ws(x)(Ys − g(Xs))2

where the weights are given as shown previously by

ws(x) =

∏qi=1 k{h−1(xi − X(i)

s )}1N

∑Ns=1

∏qi=1 k{h−1(xi − X(i)

s )}

Remembering the WLS interpretation this estimate seems reasonable.


Kernel estimators in time series analysis Selection of bandwidth

The bandwidth

In analogy with smoothing in spectrum analysis the bandwidth hdetermines the smoothness of g:

If h is large the variance is small but the bias is large.

If h is small the variance is large but the bias is small.

In the limits:As h → ∞ it is seen that g(x) = Y.As h → 0 it is seen that g(x) = Yi for x = (X(1)

i , . . . ,X(q)i ) and otherwise

0.



Selection of bandwidth - Cross Validation

The ultimate goal for the selection of h is to minimize (see Hardle(1990))

MSE1(h) =1N

N∑

i=1

[g(Xi)− g(Xi)]2π(Xi)

Where π(. . . ) is a weighting function which screens off some of theextreme observations. However g is unknown.An easy solution is the ’plug-in’ method, where Yi is used as anestimate of g(Xi). Hence, the criterion is

MSE2(h) =1N

N∑

i=1

[g(Xi)− Yi]2π(Xi)

but it is clear that MSE2(h) → 0 for h → 0.



The “Leave one out” method

The idea is to avoid that (Xi) is used in the estimate for Yi. For everydata (Xi, Yi) we define an estimator g(i) for Yi based on all data except(Xi).The N estimators g(1), . . . , g(N) (called the “leave one out” estimators)are written

g(j)(x) =1

N−1

∑

s6=j Ys∏q

i=1 k{h−1(xi − X(i)s )}

1N−1

∑

s6=j

∏qi=1 k{h−1(xi − X(i)

s )}



The “Leave one out” method

Now the Cross-Validation criterion using the “leave one out” estimatesg(1), . . . , g(N) is

CV(h) =1N

N∑

i=1

[Yi − g(i)(Xi)]2π(Xi)

and the optimal value of h is then found by minimizing CV(h).It can be shown that under weak assumptions the estimate of thebandwidth h that is obtained when minimizing the Cross-Validationcriterion is asymptotic optimal, i.e. it minimizes (1) – see Hardle (1990).


Applications Non-parametric estimation of the conditional mean and variance

Applications of kernel estimators

Assume that a realization X1, . . . ,XN of a stochastic process {Xt} isavailable. The goal is now to use the realization to estimate thefunctions g(·) and h(·) in the model

Xt = g(Xt−1, . . . ,Xt−p) + h(Xt−1, . . . ,Xt−q)ǫt

As in Tjostheim (1994) we shall use the notation

M(xi1 , . . . , xid) = E [Xt|Xt−i1 = xi1 , . . . ,Xt−id = xid ]

V(xi1 , . . . , xid) = V [Xt|Xt−i1 = xi1 , . . . ,Xt−id = xid ]

One solution is to use the kernel estimator (other possibilities aresplines, nearest neighbor or neural network estimates).



Applications of kernel estimators - continued

Using a product kernel we obtain

M(xi1 , . . . , xid) =

1N−id

∑Ns=id+1 Xs

∏dj=1 k{h−1(xij − Xs−ij)}

1N−id+i1

∑N+i1s=id+1


Assuming that E(Xt) = 0 the estimator for V(·) is

V(xi1 , . . . , xid) =

1N−id

∑Ns=id+1 X2

s∏d

j=1 k{h−1(xij − Xs−ij)}1

N−id+i1

∑N+i1s=id+1




Applications of kernel estimators - continued

If E [Xt] 6= 0 it is clear that the above estimator is changed to

V(xi1 , . . . , xid) =

1N−id

∑Ns=id+1 X2

s∏d

j=1 k{h−1(xij − Xs−ij)}1

N−id+i1

∑N+i1s=id+1


− M2(xi1 , . . . , xid)



Identification of functional relationship

Estimation of conditional mean and varianceWe have simulated 10 stochastic independent time series for both ofthe following non-linear models:

A : Xt =

{

−0.8Xt−1 + ǫt, Xt−1 ≥ 00.8Xt−1 + 1.5 + ǫt, Xt−1 < 0

B : Xt =√

1 + X2t−1ǫt

What kind of models are model A and B?

Gaussian kernels with bandwidths, h = 0.6, are used.

Simulation of 10 independent time series.



Identification of functional relationship- Estimated M(x) for model A

-5

-4

-3

-2

-1

0

1

2

-6 -4 -2 0 2 4

M(x)

x

The theoretical conditional mean and variance are shown with ’+’.



Identification of functional relationship- Estimated M(x) for model B

-2

-1.5

-1

-0.5

0

0.5

1

1.5

-4 -2 0 2 4

M(x)

x



Identification of functional relationship- Estimated V(x) for model A

0

0.5

1

1.5

2

2.5

3

3.5

4

-6 -4 -2 0 2 4

V(x)

x



Applications of kernel estimators- Estimated V(x) for model B

0

5

10

15

20

25

30

35

40

-4 -2 0 2 4

V(x)

x


Applications Example: non-parametric estimation of the PDF

Example: non-parametric estimation of the PDF

An estimate of the probability density function (PDF) is Robinson(1983) :

f (xi1 , . . . , xid) =1

N − id + i1

N+i1∑

s=id+1

d∏

j=1

h−1k{h−1(xij − Xs−ij)}


Applications Non-parametric estimation of non-stationarity

Example: Non-parametric estimation ofnon-stationarity

Non-parametric methods can be applied to an identification of thestructure of the existing relationships leading to proposals for aparametric model. An example of identifying the diurnal variationin a non-stationary time series is given in the following.

Measurements of heat supply from 16 terrace houses in Kulladal,a suburb of Malmo in Sweden, are used to estimate the heat loadas a function of the time of the day.


Applications Heat supply vs. time of day

Example data- Heat supply versus measurement number

40

50

60

70

80

24 72 120 168 216 264 312 360 408 456 504 552 600 648 696

[kJ/s]

No.

Heat Supply



Example data – heat supply versus time of day

40

50

60

70

80

0 3 6 9 12 15 18 21 24

[kJ/s]

Hour

Heat Supply

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

��

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

��

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

��

�

�

�

�

��

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

��

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

��

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

��

�

�

�

�

�

�

�

�

�

�

��

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

��

�

�

�

�

�

�

��

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

��

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

�

��

�

�

�

�

�

�

�

The observations are equidistantly sampled on the interval [0, 24] with0.25 hour between the sample points. The transport delay between theconsumers and the measuring instruments is not more than a coupleof minutes.



Dependence between data points

Assumptions: The heat supply can be related to the time of day.There is no difference between the days for the considered period.Hence, the regression curve is a function only of the time of day,and it is this functional relationship, which will be considered.

Thus, the assumption of independence between measurements isnot fulfilled. The regression function contained in data is minorcompared to a set of data with the same number of independentobservations.



Smoothing with different bandwidths

Smoothing of the diurnal power load curve using an Epanechnikovkernel and bandwidths 0.125, 0.625,. . . , 3.625.

46

48

50

52

54

56

58

60

62

0 3 6 9 12 15 18 21 24

[kJ/s]

Hour

Nadaraya-Watson Estimator – Epanechnikov Kernel

The characteristics of the non-smoothed averages graduallydisappear, when the bandwidth is increased.



CV - approach for finding h

18.5

18.75

19

19.25

19.5

19.75

0 0.5 1 1.5 2 2.5 3 3.5 4

[(kJ/s)2]

Bandwidth [hour]

CV-function – Epanechnikov Kernel

Obviously the minimum of the CV-function is obtained for a bandwidthclose to 1.3. This value is most likely somewhat bigger than thevisually chosen.



Cross-Validation- Comments

The drop at 5.15 in the morning will clearly disappear in a curveestimate with this bandwidth.

These results indicate the need of non-parametric curve estimatesin which the bandwidth is not only a function of the density of thepredictor variables, but for instance also of the gradient of theregression curve.



Cross-Validation– Gaussian kernels

Smoothing of the diurnal power load curve using a Gaussian kerneland bandwidths 0.025, 0.125,. . . , 0.925.

46

48

50

52

54

56

58

60

62

0 3 6 9 12 15 18 21 24

[kJ/s]

Hour

Nadaraya-Watson Estimator – Gaussian Kernel



Cross-validation– Gaussian kernels

Clearly it is difficult visually to observe any difference between thecurve estimates obtained with the Epanechnikov and Gaussian kernel.A general conclusion : The choice of the kernel (window) is notimportant as long as it is reasonably smooth. It is the choice of thebandwidth that matters.


Applications Estimation of dependence on external variables

Example: Non-parametric estimation of (static)dependence on external variables

The dependent variable is the heat load in a district heating system,and the independent variables are ambient and supply temperature.Data are collected in the district heating system in Esbjerg, Denmark,during the period August 14 to December 10, 1989.

Heat load p(t) versusambient airtemperature a(t) andsupply temperatures(t).



Applying the Epanechnikov kernel

For the estimation of the regression surface the kernel method isapplied using the Epanechnikov kernel (1). A product kernel is used,i.e. the power load estimate, p, at time t, at ambient air temperature aand for supply temperature s is estimated as

pht,ha,hs(t, a, s) =

3843∑

i=1009

piKht(t − ti)Kha(a − ai)Khs(s − si)

3843∑

i=1009

Kht(t − ti)Kha(a − ai)Khs(s − si)

where Kh(u) = h−1k(u/h). The bandwidths were chosen using acombination of Cross-Validation and visual inspection.



The Epanechnikov kernel estimate

Kernel estimate of the dependence on time of day, td, ambient airtemperature, a and supply temperature, s, shown at td = 7.



The Epanechnikov kernel estimate– Estimate of the standard deviation

Point-wise estimate of standard deviation of the surface estimates.



Hardle, W. (1990), Applied nonparametric regression, CambridgeUniversity Press, New York.

Madsen, H. (2008), Time Series Analysis, 1. edn, Chapman &Hall/CRC.

Robinson, P. (1983), ‘Nonparametric estimators for time series’,Journal of time series analysis 4(3), 185–207.

Tjostheim, D. (1994), ‘Non-linear time series: A selective review’,Scandinavian Journal of Statistics 21, 97–130.


Documents

Modelling Non-linear and Non-stationary Time Series€¦ · Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series