Econometrics 2 — Fall 2005

Univariate Time Series Analysis;

ARIMA Models

Heino Bohn Nielsen

1 of 41

Univariate Time Series Analysis

• We consider a single time series, y1, y2, ..., yT. We want to construct simple models for yt as a function of the past: E[yt | history].

• Univariate models are useful for:

(1) Analyzing the dynamic properties of time series. What is the dynamic adjustment after a shock? Do shocks have transitory or permanent effects (presence of unit roots)?

(2) Forecasting. A model for E[yt | xt] is only useful for forecasting yt+1 if we know (or can forecast) xt+1.

(3) Univariate time series analysis is a way to introduce the tools necessary for analyzing more complicated models.


Outline of the Lecture

(1) Characterizing time dependence: ACF and PACF.

(2) Modelling time dependence: the ARMA(p,q) model.

(3) Examples:

• AR(1).
• AR(2).
• MA(1).

(4) Lag operators, lag polynomials and invertibility.

(5) Model selection.

(6) Estimation.

(7) Forecasting.

3 of 41

Characterizing Time Dependence

• For a stationary time series the autocorrelation function (ACF) is

ρk = Corr(yt, yt−k) = Cov(yt, yt−k) / √(V(yt) · V(yt−k)) = Cov(yt, yt−k) / V(yt) = γk/γ0.

An alternative measure is the partial autocorrelation function (PACF), which is the conditional correlation

θk = Corr(yt, yt−k | yt−1, ..., yt−k+1).

Note: the ACF and PACF are bounded in [−1; 1], symmetric (ρk = ρ−k), and ρ0 = θ0 = 1.

• Simple estimators, ρ̂k and θ̂k, can be derived from OLS regressions:

ACF: yt = c + ρk yt−k + residual
PACF: yt = c + θ1yt−1 + ... + θkyt−k + residual

• For an IID time series it holds that V(ρ̂k) = V(θ̂k) = T⁻¹, and a 95% confidence band is given by ±2/√T.
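The two estimating regressions above can be sketched in Python (an illustrative sketch, not from the slides; `sample_acf` and `sample_pacf` are hypothetical helper names, and NumPy's least-squares solver stands in for OLS):

```python
import numpy as np

def sample_acf(y, k):
    """Estimate rho_k by OLS: regress y_t on a constant and y_{t-k}."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - k), y[:-k]])
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return beta[1]

def sample_pacf(y, k):
    """Estimate theta_k by OLS: regress y_t on a constant and y_{t-1},...,y_{t-k};
    the PACF at lag k is the coefficient on y_{t-k}."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = np.column_stack([np.ones(n - k)] + [y[k - j:n - j] for j in range(1, k + 1)])
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return beta[-1]

# For IID data, the estimates at any lag should fall inside the +/- 2/sqrt(T) band.
rng = np.random.default_rng(0)
e = rng.standard_normal(2000)
band = 2 / np.sqrt(len(e))
```

For white noise both estimators should be statistically indistinguishable from zero at every lag, which is exactly the null the ±2/√T band is built for.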

4 of 41

Example: Danish GDP

[Figure: Danish GDP in logs; deviation from trend; first difference — each shown 1980–2000 with its ACF and PACF for lags 0–20.]

The ARMA(p,q) Model

• First define a white noise process, εt ∼ IID(0, σ²).

• The autoregressive AR(p) model is defined as

yt = θ1yt−1 + θ2yt−2 + ... + θpyt−p + εt.

The systematic part of yt is a linear function of p lagged values. We need p (observed) initial values: y−(p−1), y−(p−2), ..., y−1, y0.

• The moving average MA(q) model is defined as

yt = εt + α1εt−1 + α2εt−2 + ... + αqεt−q.

yt is a moving average of past shocks to the process. We need q initial values: ε−(q−1) = ε−(q−2) = ... = ε−1 = ε0 = 0.

• They can be combined into the ARMA(p,q) model

yt = θ1yt−1 + ... + θpyt−p + εt + α1εt−1 + ... + αqεt−q.

Dynamic Properties of an AR(1) Model

• Consider the AR(1) model

Yt = δ + θYt−1 + εt.

Assume for a moment that the process is stationary. As we will see later, this requires |θ| < 1.

• First we want to find the expectation. Stationarity implies that E[Yt] = E[Yt−1] = µ. We find

E[Yt] = E[δ + θYt−1 + εt]
E[Yt] = δ + θE[Yt−1] + E[εt]
(1 − θ)µ = δ
µ = δ/(1 − θ).

Note the following:
(1) The effect of the constant term, δ, depends on the autoregressive parameter, θ.
(2) µ is not defined if θ = 1. This is excluded for a stationary process.


• Next we want to calculate the variance and the autocovariances. It is convenient to define the deviation from the mean, yt = Yt − µ, so that

Yt = δ + θYt−1 + εt
Yt = (1 − θ)µ + θYt−1 + εt
Yt − µ = θ(Yt−1 − µ) + εt
yt = θyt−1 + εt.

• We note that γ0 = V[Yt] = V[yt]. We find:

V[yt] = E[yt²]
      = E[(θyt−1 + εt)²]
      = E[θ²yt−1² + εt² + 2θyt−1εt]
      = θ²E[yt−1²] + E[εt²] + 2θE[yt−1εt]
      = θ²V[yt−1] + σ² + 0.

Using stationarity, γ0 = V[yt] = V[yt−1], we get

γ0(1 − θ²) = σ²  or  γ0 = σ²/(1 − θ²).


• The covariances, Cov[yt, yt−k] = E[yt yt−k], are given by

γ1 = E[yt yt−1] = E[(θyt−1 + εt)yt−1] = θE[yt−1²] + E[yt−1εt] = θγ0 = θσ²/(1 − θ²)
γ2 = E[yt yt−2] = E[(θyt−1 + εt)yt−2] = θE[yt−1yt−2] + E[εt yt−2] = θγ1 = θ²σ²/(1 − θ²)
...
γk = E[yt yt−k] = θᵏγ0.

• The ACF is given by

ρk = γk/γ0 = θᵏγ0/γ0 = θᵏ.

• The PACF is simply the autoregressive coefficients: θ, 0, 0, ...
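A quick simulation check of the result ρk = θᵏ (an illustrative sketch, assuming standard normal shocks; the seed and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, T = 0.8, 50_000
eps = rng.standard_normal(T)

# Simulate y_t = theta * y_{t-1} + eps_t, starting from y_0 = eps_0
y = np.empty(T)
y[0] = eps[0]
for t in range(1, T):
    y[t] = theta * y[t - 1] + eps[t]

def acf(y, k):
    """Sample autocorrelation at lag k."""
    yc = y - y.mean()
    return (yc[k:] @ yc[:-k]) / (yc @ yc)
```

With T this large, the sample ACF at lags 1 and 3 should sit very close to 0.8 and 0.8³ = 0.512.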


Examples of Stationary AR(1) Models

[Figure: simulated paths with theoretical and estimated ACF/PACF for yt = εt, yt = 0.80·yt−1 + εt, and yt = −0.80·yt−1 + εt.]

Examples of AR(1) Models

[Figure: simulated paths for yt = 0.50·yt−1 + εt, yt = 0.95·yt−1 + εt, yt = 1.00·yt−1 + εt, and yt = 1.05·yt−1 + εt.]

Dynamic Properties of an AR(2) Model

• Consider the AR(2) model given by

Yt = δ + θ1Yt−1 + θ2Yt−2 + εt.

• Again we find the mean under stationarity:

E[Yt] = δ + θ1E[Yt−1] + θ2E[Yt−2] + E[εt]
E[Yt] = δ/(1 − θ1 − θ2) = µ.

• We then define the process yt = Yt − µ, for which it holds that

yt = θ1yt−1 + θ2yt−2 + εt.


• Multiplying both sides with yt, yt−1, yt−2, ... and taking expectations yields

E[yt²] = θ1E[yt−1yt] + θ2E[yt−2yt] + E[εt yt]
γ0 = θ1γ1 + θ2γ2 + σ²

E[yt yt−1] = θ1E[yt−1yt−1] + θ2E[yt−2yt−1] + E[εt yt−1]
γ1 = θ1γ0 + θ2γ1

E[yt yt−2] = θ1E[yt−1yt−2] + θ2E[yt−2yt−2] + E[εt yt−2]
γ2 = θ1γ1 + θ2γ0

E[yt yt−3] = θ1E[yt−1yt−3] + θ2E[yt−2yt−3] + E[εt yt−3]
γ3 = θ1γ2 + θ2γ1

• These are the so-called Yule-Walker equations.

• To find the variance we can substitute γ1 and γ2 into the equation for γ0. This is, however, a bit tedious.

• We can find the autocorrelations, ρk = γk/γ0, as

ρ1 = θ1 + θ2ρ1
ρ2 = θ1ρ1 + θ2
ρk = θ1ρk−1 + θ2ρk−2, k ≥ 3,

or alternatively

ρ1 = θ1/(1 − θ2)
ρ2 = θ1²/(1 − θ2) + θ2
ρk = θ1ρk−1 + θ2ρk−2, k ≥ 3.
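The Yule-Walker recursion for the AR(2) ACF translates directly into code (an illustrative sketch; `ar2_acf` is a hypothetical helper name):

```python
import numpy as np

def ar2_acf(theta1, theta2, kmax):
    """ACF of a stationary AR(2) via the Yule-Walker recursion."""
    rho = np.empty(kmax + 1)
    rho[0] = 1.0
    rho[1] = theta1 / (1 - theta2)        # rho_1 = theta_1 / (1 - theta_2)
    for k in range(2, kmax + 1):          # rho_k = theta_1 rho_{k-1} + theta_2 rho_{k-2}
        rho[k] = theta1 * rho[k - 1] + theta2 * rho[k - 2]
    return rho

rho = ar2_acf(0.5, 0.4, 5)
```

For k = 2 the recursion reproduces the closed form ρ2 = θ1²/(1 − θ2) + θ2, since ρ0 = 1.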


Examples of AR(2) Models

[Figure: simulated paths with theoretical and estimated ACF/PACF for yt = 0.50·yt−1 + 0.40·yt−2 + εt, yt = −0.80·yt−2 + εt, and yt = 1.30·yt−1 − 0.80·yt−2 + εt.]

Dynamic Properties of an MA(1) Model

• Consider the MA(1) model

Yt = µ + εt + αεt−1.

• The mean is given by

E[Yt] = E[µ + εt + αεt−1] = µ,

which is here identical to the constant term.

• Define the deviation from the mean: yt = Yt − µ.

• Next we find the variance:

V[Yt] = E[yt²] = E[(εt + αεt−1)²] = E[εt²] + E[α²εt−1²] + E[2αεtεt−1] = (1 + α²)σ².


• The covariances, Cov[yt, yt−k] = E[yt yt−k], are given by

γ1 = E[yt yt−1]
   = E[(εt + αεt−1)(εt−1 + αεt−2)]
   = E[εtεt−1 + αεtεt−2 + αεt−1² + α²εt−1εt−2] = ασ²

γ2 = E[yt yt−2]
   = E[(εt + αεt−1)(εt−2 + αεt−3)]
   = E[εtεt−2 + αεtεt−3 + αεt−1εt−2 + α²εt−1εt−3] = 0
...
γk = E[yt yt−k] = 0 for k ≥ 2.

• The ACF is given by

ρ1 = γ1/γ0 = ασ²/((1 + α²)σ²) = α/(1 + α²),  ρk = 0 for k ≥ 2.
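The MA(1) results ρ1 = α/(1 + α²) and ρk = 0 for k ≥ 2 can again be checked by simulation (an illustrative sketch, assuming standard normal shocks; seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, T = 0.9, 100_000
eps = rng.standard_normal(T + 1)
y = eps[1:] + alpha * eps[:-1]           # y_t = eps_t + alpha * eps_{t-1}

yc = y - y.mean()
rho1_hat = (yc[1:] @ yc[:-1]) / (yc @ yc)
rho2_hat = (yc[2:] @ yc[:-2]) / (yc @ yc)
rho1_theory = alpha / (1 + alpha ** 2)   # = 0.9 / 1.81
```

Note that even with α = 0.9 the first autocorrelation is only about 0.497: an MA(1) can never have |ρ1| > 0.5.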


Examples of MA Models

[Figure: simulated paths with theoretical and estimated ACF/PACF for yt = εt − 0.90·εt−1, yt = εt + 0.90·εt−1, and yt = εt − 0.60·εt−1 + 0.50·εt−2 + 0.50·εt−3.]

The Lag and Difference Operators

• Now we introduce an important tool called the lag operator, L. It has the property that

L · yt = yt−1,

and, for example,

L²yt = L(Lyt) = Lyt−1 = yt−2.

• Also define the first-difference operator, ∆ = 1 − L, such that

∆yt = (1 − L)yt = yt − Lyt = yt − yt−1.

• The operators L and ∆ are not functions, but they can be used in calculations.


Lag Polynomials

• Consider as an example the AR(2) model

yt = θ1yt−1 + θ2yt−2 + εt.

That can be written as

yt − θ1yt−1 − θ2yt−2 = εt
yt − θ1Lyt − θ2L²yt = εt
(1 − θ1L − θ2L²)yt = εt
θ(L)yt = εt,

where

θ(L) = 1 − θ1L − θ2L²

is a polynomial in L, denoted a lag polynomial.

• Standard rules for calculating with polynomials also hold for polynomials in L.


Characteristic Equations and Roots

• For a model

yt − θ1yt−1 − θ2yt−2 = εt
θ(L)yt = εt,

we define the characteristic equation as

θ(z) = 1 − θ1z − θ2z² = 0.

The solutions, z1 and z2, are denoted characteristic roots.

• An AR(p) model has p roots. Some of them may be complex, h ± v · i, where i = √−1.

• Recall that the roots can be used for factorizing the polynomial:

θ(z) = 1 − θ1z − θ2z² = (1 − φ1z)(1 − φ2z),

where φ1 = z1⁻¹ and φ2 = z2⁻¹ are the inverse roots.


Invertibility of Polynomials

• Define the inverse of a polynomial, θ⁻¹(L) of θ(L), so that

θ⁻¹(L)θ(L) = 1.

• Consider the AR(1) case, θ(L) = 1 − θL, and look at the product

(1 − θL)(1 + θL + θ²L² + θ³L³ + ... + θᵏLᵏ)
= (1 − θL) + (θL − θ²L²) + (θ²L² − θ³L³) + (θ³L³ − θ⁴L⁴) + ...
= 1 − θᵏ⁺¹Lᵏ⁺¹.

If |θ| < 1, it holds that θᵏ⁺¹Lᵏ⁺¹ → 0 as k → ∞, implying that

θ⁻¹(L) = (1 − θL)⁻¹ = 1/(1 − θL) = 1 + θL + θ²L² + θ³L³ + ... = Σ_{i=0}^{∞} θⁱLⁱ.

• If θ(L) is a finite polynomial, the inverse polynomial, θ⁻¹(L), is infinite.
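The truncation argument can be verified numerically: multiplying 1 − θL by the truncated series leaves only the leading 1 and the remainder −θᵏ⁺¹Lᵏ⁺¹ (an illustrative sketch treating lag polynomials as coefficient vectors, with arbitrary θ and truncation order):

```python
import numpy as np

theta, K = 0.6, 60
inv_coefs = theta ** np.arange(K + 1)          # 1, theta, theta^2, ..., theta^K
# Polynomial product (1 - theta L)(1 + theta L + ... + theta^K L^K):
poly = np.convolve([1.0, -theta], inv_coefs)
# All interior terms cancel; the remainder -theta^(K+1) is ~1e-14 here.
```

Since |θ| < 1, increasing K drives the remainder to zero, which is exactly the invertibility argument on this slide.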


ARMA Models in AR and MA Form

• Using lag polynomials we can rewrite the stationary ARMA(p,q) model as

yt − θ1yt−1 − ... − θpyt−p = εt + α1εt−1 + ... + αqεt−q   (∗)
θ(L)yt = α(L)εt,

where θ(L) and α(L) are finite polynomials.

• If θ(L) is invertible, (∗) can be written as the infinite MA(∞) model

yt = θ⁻¹(L)α(L)εt
yt = εt + γ1εt−1 + γ2εt−2 + ...

This is called the MA representation.

• If α(L) is invertible, (∗) can be written as an infinite AR(∞) model

α⁻¹(L)θ(L)yt = εt
yt − γ1yt−1 − γ2yt−2 − ... = εt.

This is called the AR representation.

Invertibility and Stationarity

• A finite-order MA process is stationary by construction.
— It is a linear combination of stationary white-noise terms.
— Invertibility is sometimes convenient for estimation and prediction.

• An infinite MA process is stationary if the coefficients, αi, converge to zero.
— We require that Σ_{i=1}^{∞} αi² < ∞.

• An AR process is stationary if θ(L) is invertible.
— This is important for interpretation and inference.
— In the case of a root at unity, standard results no longer hold. We return to unit roots later.


• Consider again the AR(2) model

θ(z) = 1 − θ1z − θ2z² = (1 − φ1z)(1 − φ2z).

The polynomial is invertible if the factors (1 − φiL) are invertible, i.e. if

|φ1| < 1 and |φ2| < 1.

• In general, a polynomial, θ(L), is invertible if the characteristic roots, z1, ..., zp, are larger than one in absolute value. In complex cases, this corresponds to the roots being outside the complex unit circle (modulus larger than one).

[Figure: the complex unit circle, with real and imaginary axes.]
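Checking stationarity this way is mechanical (an illustrative sketch; `inverse_roots` is a hypothetical helper built on `numpy.roots`, applied to the oscillating AR(2) example yt = 1.30·yt−1 − 0.80·yt−2 + εt):

```python
import numpy as np

def inverse_roots(thetas):
    """Inverse characteristic roots phi_i = 1/z_i of
    theta(z) = 1 - theta_1 z - ... - theta_p z^p.
    Stationarity requires all |phi_i| < 1 (roots z_i outside the unit circle)."""
    coefs = np.r_[-np.asarray(thetas, dtype=float)[::-1], 1.0]  # highest power first
    return 1.0 / np.roots(coefs)

phi = inverse_roots([1.3, -0.8])   # complex pair with modulus sqrt(0.8) < 1
```

The complex conjugate pair of inverse roots (modulus √0.8 ≈ 0.894) is what produces the damped oscillations in that example.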

Solution to the AR(1) Model

• Consider the model

Yt = δ + θYt−1 + εt
(1 − θL)Yt = δ + εt.

The solution is given as

Yt = (1 − θL)⁻¹(δ + εt)
   = (1 + θL + θ²L² + θ³L³ + ...)(δ + εt)
   = (1 + θ + θ² + θ³ + ...)δ + εt + θεt−1 + θ²εt−2 + θ³εt−3 + ...

• This is the MA representation. The expectation is given by

E[Yt] = (1 + θ + θ² + θ³ + ...)δ = δ/(1 − θ).


• An alternative solution method is recursive substitution:

Yt = δ + θYt−1 + εt
   = δ + θ(δ + θYt−2 + εt−1) + εt
   = (1 + θ)δ + εt + θεt−1 + θ²Yt−2
   = (1 + θ)δ + εt + θεt−1 + θ²(δ + θYt−3 + εt−2)
   = (1 + θ + θ²)δ + εt + θεt−1 + θ²εt−2 + θ³Yt−3
   ...
   = (1 + θ + ... + θᵗ⁻²)δ + εt + θεt−1 + ... + θᵗ⁻²ε2 + θᵗ⁻¹Y1,

where we see the effect of the initial observation.

• The expectation is

E[Yt] = (1 + θ + ... + θᵗ⁻²)δ + θᵗ⁻¹Y1 → δ/(1 − θ) as t → ∞.


ARMA Models and Common Roots

• Consider the stationary ARMA(p,q) model

yt − θ1yt−1 − ... − θpyt−p = εt + α1εt−1 + ... + αqεt−q
θ(L)yt = α(L)εt
(1 − φ1L)(1 − φ2L) · · · (1 − φpL)yt = (1 − ξ1L)(1 − ξ2L) · · · (1 − ξqL)εt.

• If φi = ξj for some i, j, they are denoted common roots or canceling roots. The ARMA(p,q) model is then equivalent to an ARMA(p−1,q−1) model.

• As an example, consider

yt − yt−1 + 0.25yt−2 = εt − 0.5εt−1
(1 − L + 0.25L²)yt = (1 − 0.5L)εt
(1 − 0.5L)(1 − 0.5L)yt = (1 − 0.5L)εt
(1 − 0.5L)yt = εt.
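The cancellation in the example can be seen from the roots of the two polynomials (an illustrative sketch using `numpy.roots`; both polynomials turn out to have all their roots at z = 2):

```python
import numpy as np

# AR and MA polynomials of the ARMA(2,1): (1 - L + 0.25 L^2) y_t = (1 - 0.5 L) eps_t
ar = [1.0, -1.0, 0.25]          # coefficients of 1 - z + 0.25 z^2, lowest power first
ma = [1.0, -0.5]                # coefficients of 1 - 0.5 z

ar_roots = np.roots(ar[::-1])   # np.roots expects highest power first
ma_roots = np.roots(ma[::-1])
# The shared root at z = 2 means the factor (1 - 0.5 L) cancels on both sides.
```

In practice, near-canceling roots make ARMA parameters poorly identified, which is one reason to prefer parsimonious models.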


Unit Roots and ARIMA Models

• A root at one is denoted a unit root. We consider the consequences later; here we just remove unit roots by taking first differences.

• Consider an ARMA(p,q) model

θ(L)yt = α(L)εt.

If there is a unit root in the AR polynomial, we can factorize it as

θ(L) = (1 − L)(1 − φ2L) · · · (1 − φpL) = (1 − L)θ∗(L),

and we can write the model as

θ∗(L)(1 − L)yt = α(L)εt
θ∗(L)∆yt = α(L)εt.

• An ARMA(p,q) model for ∆ᵈyt is denoted an ARIMA(p,d,q) model for yt.


Example: Danish Real House Prices

[Figure: Danish real house prices 1970–2005, P = log(HousePrice/CPI), together with the first difference ∆P.]

• Estimating an AR(2) model for 1972:1–2004:2 yields

pt = 1.545 pt−1 − 0.5646 pt−2 + 0.003359,

with t-values 21.0, −7.58 and 1.29, respectively. The lag polynomial is given by

θ(L) = 1 − 1.545 · L + 0.5646 · L²,

with inverse roots given by 0.953 and 0.592.

• One root is close to unity, and we estimate an ARIMA(1,1,0) model for pt:

∆pt = 0.5442 ∆pt−1 + 0.0008369,

with t-values 7.35 and 0.416. The second root is basically unchanged.


ARIMA(p,d,q) Model Selection

• Find a transformation of the process that is stationary, e.g. ∆ᵈYt.

• Recall that for the stationary AR(p) model:
— The ACF is infinite but convergent.
— The PACF is zero for lags larger than p.

• For the MA(q) model:
— The ACF is zero for lags larger than q.
— The PACF is infinite but convergent.

• The ACF and PACF thus contain information on p and q, which can be used to select relevant models.


• If alternative models are nested, they can be tested against each other.

• Model selection can also be based on information criteria of the form

IC = log σ̂²  +  penalty(T, #parameters),

where the first term measures the likelihood and the second penalizes the number of parameters. The information criteria should be minimized!

• Three important criteria:

AIC = log σ̂² + 2k/T
HQ  = log σ̂² + 2k · log(log(T))/T
BIC = log σ̂² + k · log(T)/T,

where k is the number of estimated parameters, e.g. k = p + q.
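The three criteria are easy to transcribe (an illustrative sketch; note that software packages often scale the log-likelihood differently, so reported levels need not match these formulas exactly, only the model rankings matter):

```python
import numpy as np

def info_criteria(sigma2, k, T):
    """AIC, HQ and BIC as defined above (all to be minimized)."""
    aic = np.log(sigma2) + 2 * k / T
    hq = np.log(sigma2) + 2 * k * np.log(np.log(T)) / T
    bic = np.log(sigma2) + k * np.log(T) / T
    return aic, hq, bic

aic, hq, bic = info_criteria(1.0, 2, 100)
```

For T ≥ 16 the penalties are ordered AIC < HQ < BIC, so BIC tends to pick the most parsimonious model of the three.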


Example: Consumption-Income Ratio

[Figure: (A) consumption (c) and income (y) in logs, 1970–2000; (B) consumption-income ratio, logs; (C) ACF for the series in (B); (D) PACF for the series in (B).]

Model       T    p   log-lik     SC       HQ       AIC
ARMA(2,2)   130  5   300.82151   -4.4408  -4.5063  -4.5511
ARMA(2,1)   130  4   300.39537   -4.4717  -4.5241  -4.5599
ARMA(2,0)   130  3   300.38908   -4.5090  -4.5483  -4.5752
ARMA(1,2)   130  4   300.42756   -4.4722  -4.5246  -4.5604
ARMA(1,1)   130  3   299.99333   -4.5030  -4.5422  -4.5691
ARMA(1,0)   130  2   296.17449   -4.4816  -4.5078  -4.5258
ARMA(0,0)   130  1   249.82604   -3.8060  -3.8191  -3.8281


---- Maximum likelihood estimation of ARFIMA(1,0,1) model ----

The estimation sample is: 1971 (1) - 2003 (2)

The dependent variable is: cy (ConsumptionData.in7)

Coefficient Std.Error t-value t-prob

AR-1 0.857361 0.05650 15.2 0.000

MA-1 -0.300821 0.09825 -3.06 0.003

Constant -0.0934110 0.009898 -9.44 0.000

log-likelihood 299.993327

sigma 0.0239986 sigma^2 0.000575934

---- Maximum likelihood estimation of ARFIMA(2,0,0) model ----

The estimation sample is: 1971 (1) - 2003 (2)

The dependent variable is: cy (ConsumptionData.in7)

Coefficient Std.Error t-value t-prob

AR-1 0.536183 0.08428 6.36 0.000

AR-2 0.250548 0.08479 2.95 0.004

Constant -0.0935407 0.009481 -9.87 0.000

log-likelihood 300.389084

sigma 0.0239238 sigma^2 0.000572349

Estimation of ARMA Models

• The natural estimator is maximum likelihood. With normal errors,

log L(θ, α, σ²) = −(T/2) · log(2πσ²) − Σ_{t=1}^{T} εt²/(2σ²),

where εt is the residual.

• For an AR(1) model we can write the residual as

εt = Yt − δ − θ · Yt−1,

and OLS coincides with ML.

• It is usual to condition on the initial values. Alternatively, we can postulate a distribution for the first observation, e.g.

Y1 ∼ N(δ/(1 − θ), σ²/(1 − θ²)),

where the mean and variance are chosen as implied by the model for the rest of the observations. We say that Y1 is chosen from the invariant distribution.


• For the MA(1) model

Yt = µ + εt + αεt−1,

the residuals can be found recursively as a function of the parameters:

ε1 = Y1 − µ
ε2 = Y2 − µ − αε1
ε3 = Y3 − µ − αε2
...

Here the initial value is ε0 = 0, but that could be relaxed if required by using the invariant distribution.

• The likelihood function can then be maximized with respect to α and µ.
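The residual recursion is a one-line loop (an illustrative sketch; `ma1_residuals` is a hypothetical helper conditioning on ε0 = 0, as on the slide):

```python
def ma1_residuals(Y, mu, alpha):
    """Recursive residuals of the MA(1) Y_t = mu + eps_t + alpha * eps_{t-1},
    conditioning on eps_0 = 0: eps_t = Y_t - mu - alpha * eps_{t-1}."""
    eps, prev = [], 0.0
    for y in Y:
        prev = y - mu - alpha * prev
        eps.append(prev)
    return eps

eps = ma1_residuals([1.0, 0.5, 0.2], mu=0.0, alpha=0.5)
```

An optimizer would call this inside the Gaussian log-likelihood and search over µ, α and σ²; this is the "conditional" likelihood, since ε0 is fixed at zero rather than integrated out.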


Forecasting

• It is easy to forecast with ARMA models. The main drawback is that there is no economic insight.

• We want to predict yT+k given all information up to time T, i.e. given the information set

IT = {y−∞, ..., yT−1, yT}.

The optimal predictor is the conditional expectation

yT+k|T = E[yT+k | IT].


• Consider the ARMA(1,1) model

yt = θ · yt−1 + εt + αεt−1,  t = 1, 2, ..., T.

• To forecast we:
— Substitute the estimated parameters for the true ones.
— Use estimated residuals up to time T. After that, the best forecast of a shock is zero.

• The optimal forecasts will be

yT+1|T = E[θ · yT + εT+1 + α · εT | IT] = θ̂ · yT + α̂ · ε̂T
yT+2|T = E[θ · yT+1 + εT+2 + α · εT+1 | IT] = θ̂ · yT+1|T.
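The two forecast equations extend to any horizon by the recursion yT+k|T = θ̂ · yT+k−1|T for k ≥ 2 (an illustrative sketch; `arma11_forecast` is a hypothetical helper, with θ̂, α̂ and ε̂T passed in as given numbers):

```python
def arma11_forecast(y_T, eps_T, theta, alpha, horizon):
    """Point forecasts y_{T+1|T}, ..., y_{T+h|T} from an ARMA(1,1):
    the first step uses the last residual; later shocks are forecast as zero."""
    forecasts = [theta * y_T + alpha * eps_T]
    for _ in range(horizon - 1):
        forecasts.append(theta * forecasts[-1])
    return forecasts

fc = arma11_forecast(y_T=1.0, eps_T=0.5, theta=0.8, alpha=0.3, horizon=3)
```

For |θ̂| < 1 the forecasts decay geometrically toward the unconditional mean (zero here, since the model has no constant).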


[Figure: forecasts and actual values for the consumption-income ratio, 1990–2008.]