
Time Series Analysis

Andrea Beccarini

Center for Quantitative Economics

Winter 2013/2014

Andrea Beccarini (CQE) Time Series Analysis Winter 2013/2014 1 / 143

Introduction: Objectives

Time series are ubiquitous in economics, and very important in macroeconomics and financial economics

GDP, inflation rates, unemployment, interest rates, stock prices

You will learn . . .

the formal mathematical treatment of time series and stochastic processes
what the most important standard models in economics are
how to fit models to real-world time series

Introduction: Prerequisites

Descriptive Statistics

Probability Theory

Statistical Inference

Introduction: Class and material

Class

Class teacher: Sarah Meyer

Time: Tu., 12:00-14:00

Location: CAWM 3

Start: 22 October 2013

Material

Course page on Blackboard

Slides and class material are (or will be) downloadable

Introduction: Literature

Neusser, Klaus (2011), Zeitreihenanalyse in den Wirtschaftswissenschaften, 3rd ed., Teubner, Wiesbaden. (Available online in the RUB-Netz.)

Hamilton, James D. (1994), Time Series Analysis, Princeton University Press, Princeton.

Pfaff, Bernhard (2006), Analysis of Integrated and Cointegrated Time Series with R, Springer, New York.

Schlittgen, Rainer and Streitberg, Bernd (1997), Zeitreihenanalyse, 7th ed., Oldenbourg, Munich.

Basics: Definition

Definition: Time series

A sequence of observations ordered by time is called a time series

Time series can be univariate or multivariate

Time can be discrete or continuous

The states can be discrete or continuous

Basics: Definition

Typical notations:

x1, x2, . . . , xT
or x(1), x(2), . . . , x(T)
or xt, t = 1, . . . , T
or (xt)t≥0

This course is about . . .

univariate time series
in discrete time
with continuous states

Basics: Examples

Quarterly GDP Germany, 1991 I to 2012 II

[Figure: time series plot of quarterly German GDP in current billion Euro, rising from about 350 in the early 1990s to about 650 in 2012.]

Basics: Examples

DAX index and log(DAX), 31.12.1964 to 6.4.2009

[Figure: two time series plots, the DAX index (levels, roughly 2000 to beyond 6000) and the logarithm of the DAX (roughly 6.0 to 9.0).]

Basics: Definition

Definition: Stochastic process

A sequence (Xt)t∈T of random variables, all defined on the same probability space (Ω, A, P), is called a stochastic process with discrete time parameter (usually T = N or T = Z)

Short version: A stochastic process is a sequence of random variables

A stochastic process depends on both chance and time

Basics: Definition

Distinguish four cases: both time and chance can be fixed or variable

             t fixed                     t variable
ω fixed      Xt(ω) is a real number      Xt(ω) is a sequence of real numbers
                                         (path, realization, trajectory)
ω variable   Xt(ω) is a random variable  Xt(ω) is a stochastic process

process.R

Basics: Examples

Example 1: White noise

εt ∼ NID(0, σ²)

Example 2: Random walk

Xt = Xt−1 + εt with X0 = 0 and εt ∼ NID(0, σ²)

Example 3: A random constant

Xt = Z with Z ∼ N(0, σ²)
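The three example processes are easy to simulate. The course's own examples are R scripts; the following is a small illustrative Python sketch (variable names such as `random_walk` are mine, not from the slides).

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
sigma = 1.0

# Example 1: white noise, eps_t ~ NID(0, sigma^2)
eps = rng.normal(0.0, sigma, size=T)

# Example 2: random walk, X_t = X_{t-1} + eps_t with X_0 = 0
random_walk = np.concatenate(([0.0], np.cumsum(eps)))

# Example 3: a random constant, X_t = Z for all t with Z ~ N(0, sigma^2)
Z = rng.normal(0.0, sigma)
random_const = np.full(T, Z)
```

Plotting a few paths of each process makes the difference in time-dependence visible: white noise fluctuates around zero, the random walk wanders, and the random constant is a horizontal line.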

Basics: Moment functions

Definition: Moment functions

The following functions of time are called moment functions:

µ(t) = E(Xt)            (expectation function)
σ²(t) = Var(Xt)         (variance function)
γ(s, t) = Cov(Xs, Xt)   (covariance function)

Correlation function (autocorrelation function)

ρ(s, t) = γ(s, t) / (√σ²(s) · √σ²(t))

moments.R [1]

Basics: Estimation of moment functions

Usually, the moment functions are unknown and have to be estimated

Problem: Only a single path (realization) can be observed

X_1^(1)  X_1^(2)  . . .  X_1^(n)
X_2^(1)  X_2^(2)  . . .  X_2^(n)
  ...      ...    . . .    ...
X_T^(1)  X_T^(2)  . . .  X_T^(n)

Can we still estimate the expectation function µ(t) and the autocovariance function γ(s, t)? Under which conditions?

Basics: Estimation of moment functions

X_1^(1)  X_1^(2)  . . .  X_1^(n)
X_2^(1)  X_2^(2)  . . .  X_2^(n)
  ...      ...    . . .    ...
X_T^(1)  X_T^(2)  . . .  X_T^(n)

Usually, the expectation function µ(t) should be estimated by averaging over realizations,

µ̂(t) = (1/n) ∑_{i=1}^{n} X_t^(i)

Basics: Estimation of moment functions

X_1^(1)  X_1^(2)  . . .  X_1^(n)
X_2^(1)  X_2^(2)  . . .  X_2^(n)
  ...      ...    . . .    ...
X_T^(1)  X_T^(2)  . . .  X_T^(n)

Under certain conditions, µ(t) can be estimated by averaging over time,

µ̂ = (1/T) ∑_{t=1}^{T} X_t^(1)

Basics: Estimation of moment functions

X_1^(1)  X_1^(2)  . . .  X_1^(n)
X_2^(1)  X_2^(2)  . . .  X_2^(n)
  ...      ...    . . .    ...
X_T^(1)  X_T^(2)  . . .  X_T^(n)

Usually, the autocovariance γ(t, t + h) should be estimated by averaging over realizations,

γ̂(t, t + h) = (1/n) ∑_{i=1}^{n} (X_t^(i) − µ̂(t))(X_{t+h}^(i) − µ̂(t + h))

Basics: Estimation of moment functions

X_1^(1)  X_1^(2)  . . .  X_1^(n)
X_2^(1)  X_2^(2)  . . .  X_2^(n)
  ...      ...    . . .    ...
X_T^(1)  X_T^(2)  . . .  X_T^(n)

Under certain conditions, γ(t, t + h) can be estimated by averaging over time,

γ̂(t, t + h) = (1/T) ∑_{t=1}^{T−h} (X_t^(1) − µ̂)(X_{t+h}^(1) − µ̂)

Basics: Definition

Moment functions cannot be estimated without additional assumptions since only one path is observed

There are restrictions which allow us to estimate the moment functions

Restriction of time heterogeneity: the distribution of (Xt(ω))t∈T must not be completely different for each t ∈ T

Restriction of memory: if the values of the process are coupled too closely over time, the individual observations do not supply any (or only insufficient) information about the distribution

Basics: Restriction of time heterogeneity: Stationarity

Definition: Strong stationarity

Let (Xt)t∈T be a stochastic process and let t1, . . . , tn ∈ T be an arbitrary number n ∈ N of arbitrary time points.

(Xt)t∈T is called strongly stationary if for arbitrary h ∈ Z

P(X_{t1} ≤ x1, . . . , X_{tn} ≤ xn) = P(X_{t1+h} ≤ x1, . . . , X_{tn+h} ≤ xn)

Implication: all univariate marginal distributions are identical

Basics: Restriction of time heterogeneity: Stationarity

Definition: Weak stationarity

(Xt)t∈T is called weakly stationary if

1. the expectation exists and is constant: E(Xt) = µ < ∞ for all t ∈ T
2. the variance exists and is constant: Var(Xt) = σ² < ∞ for all t ∈ T
3. for all t, s, r ∈ Z (in the admissible range)

   γ(t, s) = γ(t + r, s + r)

Simplified notation for covariance and correlation functions

γ(h) = γ(t, t + h)
ρ(h) = ρ(t, t + h)

Basics: Restriction of time heterogeneity: Stationarity

Strong stationarity implies weak stationarity (but only if the first two moments exist)

A stochastic process is called Gaussian if the joint distribution of X_{t1}, . . . , X_{tn} is multivariate normal

For Gaussian processes, weak and strong stationarity coincide

Intuition: an observed time series can be regarded as a realization of a stationary process if a gliding window of "appropriate width" always displays "qualitatively the same" picture

stationary.R

Examples [2]

Basics: Restriction of memory: Ergodicity

Definition: Ergodicity (I)

Let (Xt)t∈T be a weakly stationary stochastic process with expectation µ and autocovariance γ(h); define

µ̂_T = (1/T) ∑_{t=1}^{T} Xt

(Xt)t∈T is called (expectation) ergodic if

lim_{T→∞} E[(µ̂_T − µ)²] = 0

Basics: Restriction of memory: Ergodicity

Definition: Ergodicity (II)

Let (Xt)t∈T be a weakly stationary stochastic process with expectation µ and autocovariance γ(h); define

γ̂(h) = (1/T) ∑_{t=1}^{T−h} (Xt − µ̂)(X_{t+h} − µ̂)

(Xt)t∈T is called (covariance) ergodic if for all h ∈ Z

lim_{T→∞} E[(γ̂(h) − γ(h))²] = 0

Basics: Restriction of memory: Ergodicity

Ergodicity is consistency (in quadratic mean) of the estimators µ̂ of µ and γ̂(h) of γ(h) for dependent observations

The process (Xt)t∈T is expectation ergodic if (γ(h))h∈Z is absolutely summable, i.e.

∑_{h=−∞}^{∞} |γ(h)| < ∞

The dependence between far-away observations must be sufficiently small

Basics: Restriction of memory: Ergodicity

Ergodicity condition (for autocovariance): a stationary Gaussian process (Xt)t∈T with absolutely summable autocovariance function γ(h) is (autocovariance) ergodic

Under ergodicity, the law of large numbers holds even if the observations are dependent

If the dependence γ(h) does not diminish fast enough, the estimators are no longer consistent

Examples [3]

Basics: Estimation of moment functions

Summary of estimators (electricity.R)

µ̂ = X̄_T = (1/T) ∑_{t=1}^{T} Xt

γ̂(h) = (1/T) ∑_{t=1}^{T−h} (Xt − µ̂)(X_{t+h} − µ̂)

ρ̂(h) = γ̂(h) / γ̂(0)

Sometimes, γ̂(h) is defined with factor 1/(T − h)
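The estimators above (with the 1/T factor) fit in a few lines; the course implementation is the R script electricity.R, while the following is an illustrative Python sketch with hypothetical helper names `acov` and `acorr`.

```python
import numpy as np

def acov(x, h):
    """gamma_hat(h): autocovariance estimator with factor 1/T, as on the slide."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    mu = x.mean()                 # mu_hat, the average over time
    return np.sum((x[:T - h] - mu) * (x[h:] - mu)) / T

def acorr(x, h):
    """rho_hat(h) = gamma_hat(h) / gamma_hat(0)."""
    return acov(x, h) / acov(x, 0)

rng = np.random.default_rng(1)
x = rng.normal(size=2000)         # white noise for a quick sanity check
```

For white noise, `acorr(x, h)` should be close to zero for every h ≥ 1, and `acov(x, 0)` coincides with the (1/T-normalized) sample variance.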

Basics: Estimation of moment functions

A closer look at the expectation estimator µ̂

The estimator µ̂ is unbiased, i.e. E(µ̂) = µ [4]

The variance of µ̂ is [5]

Var(µ̂) = γ(0)/T + (2/T) ∑_{h=1}^{T−1} (1 − h/T) γ(h)

Under ergodicity, for T → ∞

T · Var(µ̂) → γ(0) + 2 ∑_{h=1}^{∞} γ(h) = ∑_{h=−∞}^{∞} γ(h)

Basics: Estimation of moment functions

For Gaussian processes, µ̂ is normally distributed,

µ̂ ∼ N(µ, Var(µ̂))

and asymptotically

√T (µ̂ − µ) → Z ∼ N(0, γ(0) + 2 ∑_{h=1}^{∞} γ(h))

For non-Gaussian processes, µ̂ is (often) asymptotically normal,

√T (µ̂ − µ) → Z ∼ N(0, γ(0) + 2 ∑_{h=1}^{∞} γ(h))

Basics: Estimation of moment functions

A closer look at the autocovariance estimators γ̂(h)

For Gaussian processes with absolutely summable covariance function,

(√T (γ̂(0) − γ(0)), . . . , √T (γ̂(K) − γ(K)))′

is multivariate normal with expectation vector (0, . . . , 0)′ and

T · Cov(γ̂(h1), γ̂(h2)) = ∑_{r=−∞}^{∞} (γ(r) γ(r + h1 + h2) + γ(r − h2) γ(r + h1))

Basics: Estimation of moment functions

A closer look at the autocorrelation estimators ρ̂(h)

For Gaussian processes with absolutely summable covariance function, the random vector

(√T (ρ̂(0) − ρ(0)), . . . , √T (ρ̂(K) − ρ(K)))′

is multivariate normal with expectation vector (0, . . . , 0)′ and a complicated covariance matrix

Be careful: for small to medium sample sizes, the autocovariance and autocorrelation estimators are biased!

autocorr.R

Basics: Estimation of moment functions

An important special case for autocorrelation estimators:

Let (εt) be a white-noise process with Var(εt) = σ² < ∞; then

E(ρ̂(h)) = −T⁻¹ + O(T⁻²)

Cov(ρ̂(h1), ρ̂(h2)) = T⁻¹ + O(T⁻²)  for h1 = h2
                     O(T⁻²)         otherwise

For white-noise processes and long time series, the empirical autocorrelations are approximately independent normal random variables with expectation −T⁻¹ and variance T⁻¹
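This approximation can be checked by Monte Carlo (an illustrative Python sketch, not from the slides): for white noise of length T = 100, the empirical first-order autocorrelation should average about −1/T with variance about 1/T.

```python
import numpy as np

rng = np.random.default_rng(2)
T, reps = 100, 4000

def rho1(x):
    """Empirical first-order autocorrelation with the 1/T convention."""
    mu = x.mean()
    g0 = np.sum((x - mu) ** 2) / len(x)
    g1 = np.sum((x[:-1] - mu) * (x[1:] - mu)) / len(x)
    return g1 / g0

# rho_hat(1) across many independent white-noise paths
r = np.array([rho1(rng.normal(size=T)) for _ in range(reps)])
```

The sample mean of `r` should be near −0.01 and its sample variance near 0.01, matching −T⁻¹ and T⁻¹.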

Mathematical digression (I): Complex numbers

Some quadratic equations do not have real solutions, e.g.

x² + 1 = 0

Still it is possible (and sensible) to define solutions to such equations

The definition in common notation is

i = √−1

where i is the number which, when squared, equals −1

The number i is called imaginary (i.e. not real)

Mathematical digression (I): Complex numbers

Other imaginary numbers follow from this definition, e.g.

√−16 = √16 · √−1 = 4i
√−5 = √5 · √−1 = √5 i

Further, it is possible to define numbers that contain both a real part and an imaginary part, e.g. 5 − 8i or, in general, a + bi

Such numbers are called complex, and the set of complex numbers is denoted by C

The pair a + bi and a − bi is called complex conjugate

Mathematical digression (I): Complex numbers

Geometric interpretation:

[Figure: a complex number a+bi as a point in the plane, with real part a on the real axis, imaginary part b on the imaginary axis, absolute value r, and angle θ.]

Mathematical digression (I): Complex numbers

Polar coordinates and Cartesian coordinates

z = a + bi
  = r · (cos θ + i sin θ)
  = r e^{iθ}

a = r cos θ
b = r sin θ

r = √(a² + b²)
θ = arctan(|b/a|)
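These conversions map directly onto Python's built-in complex type (an illustrative aside, not part of the slides; note that `cmath.phase` uses atan2 and therefore handles all quadrants, whereas arctan(|b/a|) alone only gives the first-quadrant angle).

```python
import cmath
import math

z = 3 + 4j                          # Cartesian form a + bi
r, theta = abs(z), cmath.phase(z)   # polar form: absolute value r, angle theta

# a = r cos(theta), b = r sin(theta), and z = r e^{i theta}
z_back = r * cmath.exp(1j * theta)
```

Converting to polar form and back recovers the original number, and the real and imaginary parts satisfy a = r cos θ, b = r sin θ.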

Mathematical digression (I): Complex numbers

Rules of calculus:

Addition
(a + bi) + (c + di) = (a + c) + (b + d)i

Multiplication (Cartesian coordinates)
(a + bi) · (c + di) = (ac − bd) + (ad + bc)i

Multiplication (polar coordinates)
r1 e^{iθ1} · r2 e^{iθ2} = r1 r2 e^{i(θ1 + θ2)}

Mathematical digression (I): Complex numbers

Addition:

[Figure: addition in the complex plane, built up in three steps; the vectors a+bi and c+di are added componentwise to give (a+c)+(b+d)i.]

Mathematical digression (I): Complex numbers

Multiplication:

[Figure: multiplication in the complex plane, built up in two steps; the absolute values multiply, r = r1 · r2, and the angles add, θ = θ1 + θ2.]

Mathematical digression (I): Complex numbers

The quadratic equation

x² + px + q = 0

has the solutions

x = −p/2 ± √(p²/4 − q)

If p²/4 − q < 0, the solutions are complex (and conjugate)

Mathematical digression (I): Complex numbers

Example: The solutions of

x² − 2x + 5 = 0

are

x = −(−2)/2 + √((−2)²/4 − 5) = 1 + 2i

and

x = −(−2)/2 − √((−2)²/4 − 5) = 1 − 2i
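The same pair of roots can be checked numerically (an illustrative Python sketch; in R one would use polyroot):

```python
import numpy as np

# coefficients of x^2 - 2x + 5, highest power first
roots = np.roots([1, -2, 5])

# the two solutions are the complex conjugate pair 1 + 2i and 1 - 2i
pair = sorted(roots, key=lambda z: z.imag)
```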

Mathematical digression (II): Linear difference equations

First-order difference equation with initial value x0:

xt = c + φ1 x_{t−1}

p-th order difference equation with p initial values:

xt = c + φ1 x_{t−1} + . . . + φp x_{t−p}

A sequence (xt)t=0,1,... that satisfies the difference equation is called a solution of the difference equation

Examples (diffequation.R)

Mathematical digression (II): Linear difference equations

We only consider the homogeneous case, i.e. c = 0

The general solution of the first-order difference equation

xt = φ1 x_{t−1}

is

xt = A · φ1^t

with arbitrary constant A, since xt = A φ1^t = φ1 · A φ1^{t−1} = φ1 x_{t−1}

The constant is determined by the initial condition, A = x0

The sequence xt = A φ1^t is convergent if and only if |φ1| < 1

Mathematical digression (II): Linear difference equations

Solution of the p-th order difference equation

xt = φ1 x_{t−1} + . . . + φp x_{t−p}

Let xt = A z^{−t}; then

A z^{−t} = φ1 A z^{−(t−1)} + . . . + φp A z^{−(t−p)}
z^{−t} = φ1 z^{−(t−1)} + . . . + φp z^{−(t−p)}

and thus

1 − φ1 z − . . . − φp z^p = 0

Characteristic polynomial, characteristic equation

Mathematical digression (II): Linear difference equations

There are p (possibly complex, possibly nondistinct) solutions of the characteristic equation

Denote the solutions (called roots) by z1, . . . , zp

If all roots are real and distinct, then

xt = A1 z1^{−t} + . . . + Ap zp^{−t}

is a solution of the homogeneous difference equation

If there are complex roots, the solution is oscillating

The constants A1, . . . , Ap can be determined with p initial conditions (x0, x−1, . . . , x_{−(p−1)})

Mathematical digression (II): Linear difference equations

Stability condition: the linear difference equation

xt = φ1 x_{t−1} + . . . + φp x_{t−p}

is stable (i.e. convergent) if and only if all roots of the characteristic polynomial

1 − φ1 z − . . . − φp z^p = 0

lie outside the unit circle, i.e. |zi| > 1 for all i = 1, . . . , p

In R, the stability condition can be checked easily using the commands polyroot (base package) or ArmaRoots (fArma package)
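A Python analogue of the polyroot check (illustrative; `is_stable` is a hypothetical helper name, not part of the course material):

```python
import numpy as np

def is_stable(phi):
    """True if all roots of 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle."""
    # coefficients in ascending powers of z: 1, -phi_1, ..., -phi_p
    coefs = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))
    roots = np.polynomial.polynomial.polyroots(coefs)
    return bool(np.all(np.abs(roots) > 1.0))
```

For example, xt = 0.5 x_{t−1} is stable (root z = 2), while xt = 1.2 x_{t−1} is not (root z ≈ 0.83 inside the unit circle).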

ARMA models: Definition

Definition: ARMA process

Let (εt)t∈T be a white noise process; the stochastic process

Xt = φ1 X_{t−1} + . . . + φp X_{t−p} + εt + θ1 ε_{t−1} + . . . + θq ε_{t−q}

with φp, θq ≠ 0 is called an ARMA(p, q) process

AutoRegressive Moving Average process

ARMA processes are important since every stationary process can be approximated by an ARMA process

ARMA models: Lag operator and lag polynomial

The lag operator is a convenient notational tool

The lag operator L shifts the time index of a stochastic process

L (Xt)t∈T = (X_{t−1})t∈T
L Xt = X_{t−1}

Rules

L² Xt = L (L Xt) = X_{t−2}
Lⁿ Xt = X_{t−n}
L⁻¹ Xt = X_{t+1}
L⁰ Xt = Xt

ARMA models: Lag operator and lag polynomial

Lag polynomial

A(L) = a0 + a1 L + a2 L² + . . . + ap L^p

Example: Let A(L) = 1 − 0.5L and B(L) = 1 + 4L²; then

C(L) = A(L) B(L)
     = (1 − 0.5L)(1 + 4L²)
     = 1 − 0.5L + 4L² − 2L³

Lag polynomials can be treated in the same way as ordinary polynomials
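Because lag polynomials multiply like ordinary polynomials, the product above can be verified by convolving the coefficient sequences (an illustrative Python sketch):

```python
import numpy as np

# coefficients in ascending powers of L
A = np.array([1.0, -0.5])        # A(L) = 1 - 0.5 L
B = np.array([1.0, 0.0, 4.0])    # B(L) = 1 + 4 L^2

# polynomial multiplication is convolution of coefficient sequences
C = np.convolve(A, B)            # C(L) = A(L) B(L)
```

The result is the coefficient vector of 1 − 0.5L + 4L² − 2L³.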

ARMA models: Lag operator and lag polynomial

Define the lag polynomials

Φ(L) = 1 − φ1 L − . . . − φp L^p
Θ(L) = 1 + θ1 L + . . . + θq L^q

The ARMA(p, q) process can be written compactly as

Φ(L) Xt = Θ(L) εt

Important special cases

MA(q) process: Xt = εt + θ1 ε_{t−1} + . . . + θq ε_{t−q}
AR(1) process: Xt = φ1 X_{t−1} + εt
AR(p) process: Xt = φ1 X_{t−1} + . . . + φp X_{t−p} + εt

ARMA models: MA(q) process

The MA(q) process is

Xt = Θ(L) εt
Xt = εt + θ1 ε_{t−1} + . . . + θq ε_{t−q}

with εt ∼ NID(0, σε²)

Expectation function

E(Xt) = E(εt + θ1 ε_{t−1} + . . . + θq ε_{t−q})
      = E(εt) + θ1 E(ε_{t−1}) + . . . + θq E(ε_{t−q})
      = 0

ARMA models: MA(q) process

Autocovariance function

γ(s, t) = E[(εs + θ1 ε_{s−1} + . . . + θq ε_{s−q})(εt + θ1 ε_{t−1} + . . . + θq ε_{t−q})]
        = E[εs εt + θ1 εs ε_{t−1} + θ2 εs ε_{t−2} + . . . + θq εs ε_{t−q}
           + θ1 ε_{s−1} εt + θ1² ε_{s−1} ε_{t−1} + θ1 θ2 ε_{s−1} ε_{t−2} + . . . + θ1 θq ε_{s−1} ε_{t−q}
           + . . .
           + θq ε_{s−q} εt + θ1 θq ε_{s−q} ε_{t−1} + θ2 θq ε_{s−q} ε_{t−2} + . . . + θq² ε_{s−q} ε_{t−q}]

The expectations of the cross products are

E(εs εt) = 0    for s ≠ t
           σε²  for s = t

ARMA models: MA(q) process

Define θ0 = 1; then

γ(t, t) = σε² ∑_{i=0}^{q} θi²
γ(t − 1, t) = σε² ∑_{i=0}^{q−1} θi θ_{i+1}
γ(t − 2, t) = σε² ∑_{i=0}^{q−2} θi θ_{i+2}
...
γ(t − q, t) = σε² θ0 θq = σε² θq
γ(s, t) = 0 for s < t − q

Hence, MA(q) processes are always stationary

Simulation of MA(q) processes (maqsim.R)
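The autocovariances above collapse to γ(t − h, t) = σε² ∑_{i=0}^{q−h} θi θ_{i+h}, which a short function can evaluate (an illustrative Python sketch; `ma_acov` is a hypothetical name, and the course's simulation script is maqsim.R in R):

```python
import numpy as np

def ma_acov(theta, sigma2, h):
    """gamma(h) of an MA(q) process; theta = (theta_1, ..., theta_q), theta_0 = 1."""
    th = np.concatenate(([1.0], np.asarray(theta, dtype=float)))  # prepend theta_0 = 1
    q = len(th) - 1
    h = abs(h)
    if h > q:
        return 0.0                     # gamma(h) vanishes beyond lag q
    return sigma2 * float(np.sum(th[:q - h + 1] * th[h:]))
```

For an MA(1) with θ1 = 0.5 and σε² = 2 this gives γ(0) = 2(1 + 0.25) = 2.5, γ(1) = 2 · 0.5 = 1, and γ(h) = 0 for h ≥ 2; the autocovariances depend only on the lag h, never on t, which is exactly the stationarity statement above.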

ARMA models: AR(1) process

The AR(1) process is

Φ(L) Xt = εt
(1 − φ1 L) Xt = εt
Xt = φ1 X_{t−1} + εt

with εt ∼ NID(0, σε²)

Expectation and variance function [6]

Stability condition: AR(1) processes are stable if |φ1| < 1

ARMA models: AR(1) process

Stationarity: stable AR(1) processes are weakly stationary if [7]

E(X0) = 0
Var(X0) = σε² / (1 − φ1²)

Nonstationary stable processes converge towards stationarity [8]

It is common parlance to call stable processes stationary

Covariance function of the stationary AR(1) process [9]

ARMA models: AR(p) process

The AR(p) process is

Φ(L) Xt = εt
Xt = φ1 X_{t−1} + . . . + φp X_{t−p} + εt

with εt ∼ NID(0, σε²)

Assumption: εt is independent of X_{t−1}, X_{t−2}, . . . (innovations)

Expectation function [10]

The covariance function is complicated (ar2autocov.R)

ARMA models: AR(p) process

AR(p) processes are stable if all roots of the characteristic equation

Φ(z) = 0

are larger than 1 in absolute value, |zi| > 1 for i = 1, . . . , p

An AR(p) process is weakly stationary if the joint distribution of the p initial values (X0, X−1, . . . , X_{−(p−1)}) is "appropriate"

Stable AR(p) processes converge towards stationarity; they are often called stationary

Simulation of AR(p) processes (arpsim.R)

ARMA models: Invertibility

AR and MA processes can be inverted (into each other)

Example: Consider the stable AR(1) process with |φ1| < 1

Xt = φ1 X_{t−1} + εt
   = φ1 (φ1 X_{t−2} + ε_{t−1}) + εt
   = φ1² X_{t−2} + φ1 ε_{t−1} + εt
   ...
   = φ1ⁿ X_{t−n} + φ1^{n−1} ε_{t−(n−1)} + . . . + φ1² ε_{t−2} + φ1 ε_{t−1} + εt

ARMA models: Invertibility

Since |φ1| < 1,

Xt = ∑_{i=0}^{∞} φ1^i ε_{t−i}
   = εt + θ1 ε_{t−1} + θ2 ε_{t−2} + . . .

with θi = φ1^i

A stable AR(1) process can be written as an MA(∞) process (the same is true for stable AR(p) processes)
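Numerically, the MA(∞) coefficients θi = φ1^i are exactly the impulse response of the AR(1) recursion, which a quick check confirms (an illustrative Python sketch, not from the slides):

```python
import numpy as np

phi1, n = 0.8, 50

# impulse response of X_t = phi1 * X_{t-1} + eps_t to a unit shock eps_0 = 1
x = np.zeros(n)
x[0] = 1.0
for t in range(1, n):
    x[t] = phi1 * x[t - 1]

# MA(infinity) coefficients theta_i = phi1**i
theta = phi1 ** np.arange(n)
```

Because |φ1| < 1, the coefficients decay geometrically, which is why truncating the MA(∞) sum after a moderate number of terms loses almost nothing.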

ARMA models: Invertibility

Using lag polynomials, this can be written as

(1 − φ1 L) Xt = εt
Xt = (1 − φ1 L)⁻¹ εt
Xt = ∑_{i=0}^{∞} (φ1 L)^i εt

General compact and elegant notation

Φ(L) Xt = εt
Xt = (Φ(L))⁻¹ εt
   = Θ(L) εt

ARMA models: Invertibility

MA(q) can be written as AR(∞) if all roots of Θ(z) = 0 are larger than 1 in absolute value (invertibility condition)

Example: MA(1) with |θ1| < 1; from

Xt = εt + θ1 ε_{t−1}
θ1 X_{t−1} = θ1 ε_{t−1} + θ1² ε_{t−2}

we find Xt = θ1 X_{t−1} + εt − θ1² ε_{t−2}

Repeated substitution of the ε_{t−i} terms yields

Xt = ∑_{i=1}^{∞} φi X_{t−i} + εt    with φi = (−1)^{i+1} θ1^i

ARMA models: Invertibility

Summary

ARMA(p, q) processes are stable if all roots of

Φ(z) = 0

are larger than 1 in absolute value

ARMA(p, q) processes are invertible if all roots of

Θ(z) = 0

are larger than 1 in absolute value

ARMA models: Invertibility

Sometimes (e.g. for proofs), it is useful to write an ARMA(p, q) process either as AR(∞) or as MA(∞)

ARMA(p, q) can be written as AR(∞) or MA(∞)

Φ(L)Xt = Θ(L)εt

Xt = (Φ(L))−1 Θ(L)εt

(Θ(L))−1 Φ(L)Xt = εt

ARMA models: Deterministic components

Until now we only considered processes with zero expectation

Many processes have both a zero-expectation stochastic component (Yt) and a non-zero deterministic component (Dt)

Examples:

linear trend Dt = a + bt
exponential trend Dt = a b^t
seasonal patterns

Let (Xt)t∈Z be a stochastic process with deterministic component Dt and define Yt = Xt − Dt

ARMA models: Deterministic components

Then E(Yt) = 0 and

Cov(Yt, Ys) = E[(Yt − E(Yt))(Ys − E(Ys))]
            = E[(Xt − Dt − E(Xt − Dt))(Xs − Ds − E(Xs − Ds))]
            = E[(Xt − E(Xt))(Xs − E(Xs))]
            = Cov(Xt, Xs)

The covariance function does not depend on the deterministic component

To derive the covariance function of a stochastic process, simply drop the deterministic component

ARMA models: Deterministic components

Special case: Dt = µt = µ

ARMA(p, q) process with constant (non-zero) expectation

Xt − µ = φ1 (X_{t−1} − µ) + . . . + φp (X_{t−p} − µ) + εt + θ1 ε_{t−1} + . . . + θq ε_{t−q}

The process can also be written as

Xt = c + φ1 X_{t−1} + . . . + φp X_{t−p} + εt + θ1 ε_{t−1} + . . . + θq ε_{t−q}

where c = µ (1 − φ1 − . . . − φp)

ARMA models: Deterministic components

Wold's representation theorem: every stationary stochastic process (Xt)t∈T can be represented as

Xt = ∑_{h=0}^{∞} ψh ε_{t−h} + Dt

with ψ0 = 1, ∑_{h=0}^{∞} ψh² < ∞, and εt white noise with variance σ² > 0

Stationary stochastic processes can be written as a sum of a deterministic process and an MA(∞) process

Often, low-order ARMA(p, q) processes can approximate MA(∞) processes well

ARMA models: Linear processes and filters

Definition: Linear process

Let (εt)t∈Z be a white noise process; a stochastic process (Xt)t∈Z is called linear if it can be written as

Xt = ∑_{h=−∞}^{∞} ψh ε_{t−h} = Ψ(L) εt

where the coefficients are absolutely summable, i.e. ∑_{h=−∞}^{∞} |ψh| < ∞.

The lag polynomial Ψ(L) is called a (linear) filter

ARMA models: Linear processes and filters

Some special filters

Change from previous period (difference filter)

Ψ(L) = 1 − L

Change from last year (for quarterly or monthly data)

Ψ(L) = 1 − L⁴
Ψ(L) = 1 − L¹²

Elimination of seasonal influences (quarterly data)

Ψ(L) = (1 + L + L² + L³)/4
Ψ(L) = 0.125 L² + 0.25 L + 0.25 + 0.25 L⁻¹ + 0.125 L⁻²

ARMA models: Linear processes and filters

Hodrick-Prescott filter (an important tool in empirical macroeconomics)

Decompose a time series (Xt) into a long-term growth component (Gt) and a short-term cyclical component (Ct),

Xt = Gt + Ct

Trade-off between goodness-of-fit and smoothness of Gt

Minimize the criterion function

∑_{t=1}^{T} (Xt − Gt)² + λ ∑_{t=2}^{T−1} [(G_{t+1} − Gt) − (Gt − G_{t−1})]²

with respect to Gt for a given smoothness parameter λ

ARMA models: Linear processes and filters

The FOCs of the minimization problem are

(G1, . . . , GT)′ = A (X1, . . . , XT)′

where A = (I + λ K′K)⁻¹ with the (T − 2) × T second-difference matrix

K = | 1 −2  1  0  0 . . . 0  0  0 |
    | 0  1 −2  1  0 . . . 0  0  0 |
    | 0  0  1 −2  1 . . . 0  0  0 |
    | .  .  .  .  .  . .  .  .  . |
    | 0  0  0  0  0 . . . 1 −2  1 |
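The matrix formula translates directly into code. A minimal Python sketch (`hp_trend` is a hypothetical name; the course's own implementation is R code):

```python
import numpy as np

def hp_trend(x, lam):
    """Growth component G = (I + lam * K'K)^{-1} X, with K the (T-2) x T
    second-difference matrix from the slide."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    K = np.zeros((T - 2, T))
    for i in range(T - 2):
        K[i, i:i + 3] = [1.0, -2.0, 1.0]   # one row of second differences
    A = np.linalg.inv(np.eye(T) + lam * (K.T @ K))
    return A @ x
```

Because K annihilates linear functions of t, a purely linear series is returned unchanged for any λ; the cyclical component is then Ct = Xt − Gt.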

ARMA models: Linear processes and filters

The HP filter is a linear filter

Typical values for the smoothing parameter λ:

λ = 10      annual data
λ = 1600    quarterly data
λ = 14400   monthly data

Implementation in R (code by Olaf Posch)

Empirical examples (hpfilter.R)

Estimation of ARMA models: The estimation problem

Problem: the parameters φ1, . . . , φp, θ1, . . . , θq, σε² of an ARMA(p, q) process are usually unknown

They have to be estimated from an observed time series X1, . . . , XT

Standard estimation methods:

least squares (OLS)
maximum likelihood (ML)

Assumption: the lag orders p and q are known

Estimation of ARMA models: Least squares estimation of AR(p) models

The AR(p) model with non-zero constant expectation

Xt = c + φ1 X_{t−1} + . . . + φp X_{t−p} + εt

can be written in matrix notation:

| X_{p+1} |   | 1  X_p      X_{p−1}  . . .  X_1     |   | c  |   | ε_{p+1} |
| X_{p+2} | = | 1  X_{p+1}  X_p      . . .  X_2     | · | φ1 | + | ε_{p+2} |
|   ...   |   | .  .        .        . . .  .       |   | ...|   |   ...   |
| X_T     |   | 1  X_{T−1}  X_{T−2}  . . .  X_{T−p} |   | φp |   | ε_T     |

Compact notation: y = Xβ + u

Estimation of ARMA models: Least squares estimation of AR(p) models

The standard least squares estimator is

β̂ = (X′X)⁻¹ X′y

The matrix of exogenous variables X is stochastic
−→ the usual results for OLS regression do not hold

But: there is no contemporaneous correlation between the error term and the exogenous variables

Hence, the OLS estimators are consistent and asymptotically efficient
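The lag-matrix regression can be sketched in Python (`ar_ols` is a hypothetical helper name, not from the course); on a long simulated stationary AR(1), OLS recovers c and φ1 closely:

```python
import numpy as np

def ar_ols(x, p):
    """OLS estimate of (c, phi_1, ..., phi_p) from the lag regression y = X beta + u."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    y = x[p:]
    # columns: constant, X_{t-1}, ..., X_{t-p}
    X = np.column_stack([np.ones(T - p)] +
                        [x[p - j:T - j] for j in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# simulate a stationary AR(1): X_t = 1 + 0.6 X_{t-1} + eps_t
rng = np.random.default_rng(3)
x = np.zeros(5000)
for t in range(1, len(x)):
    x[t] = 1.0 + 0.6 * x[t - 1] + rng.normal()

beta = ar_ols(x, 1)
```

With T = 5000 observations, the estimates should land within a few hundredths of the true values.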

Estimation of ARMA models: Least squares estimation of ARMA models

Solve the ARMA equation

Xt = c + φ1 X_{t−1} + . . . + φp X_{t−p} + εt + θ1 ε_{t−1} + . . . + θq ε_{t−q}

for εt,

εt = Xt − c − φ1 X_{t−1} − . . . − φp X_{t−p} − θ1 ε_{t−1} − . . . − θq ε_{t−q}

Define the residuals as functions of the unknown parameters

ε̂t(d, f1, . . . , fp, g1, . . . , gq) = Xt − d − f1 X_{t−1} − . . . − fp X_{t−p} − g1 ε̂_{t−1} − . . . − gq ε̂_{t−q}

Estimation of ARMA models: Least squares estimation of ARMA models

Define the sum of squared residuals

S(d, f1, . . . , fp, g1, . . . , gq) = ∑_{t=1}^{T} (ε̂t(d, f1, . . . , fp, g1, . . . , gq))²

The least squares estimators are

(ĉ, φ̂1, . . . , φ̂p, θ̂1, . . . , θ̂q) = arg min S(d, f1, . . . , fp, g1, . . . , gq)

Since the residuals are defined recursively, one needs starting values ε0, . . . , ε−q+1 and X0, . . . , X−p+1 to calculate ε̂1

Easiest way: set all starting values to zero ("conditional estimation")
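The recursion with zero starting values can be sketched in Python for the ARMA(1,1) special case (function names are made up for illustration):

```python
def css_residuals(x, d, f1, g1):
    """Residuals of an ARMA(1,1): e_t = x_t - d - f1*x_{t-1} - g1*e_{t-1},
    with starting values x_0 = e_0 = 0 ('conditional estimation')."""
    e_prev, x_prev = 0.0, 0.0
    res = []
    for xt in x:
        e = xt - d - f1 * x_prev - g1 * e_prev
        res.append(e)
        e_prev, x_prev = e, xt
    return res

def css(x, d, f1, g1):
    """Sum of squared residuals S(d, f1, g1), to be minimized numerically."""
    return sum(e * e for e in css_residuals(x, d, f1, g1))
```

A numerical optimizer would then minimize css over (d, f1, g1).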

Estimation of ARMA models: Least squares estimation of ARMA models

The first order conditions form a nonlinear equation system which cannot be solved easily

Minimization by standard numerical methods (implemented in all usual statistical packages)

Either solve the nonlinear system of first order conditions or minimize S directly

Simple special case: ARMA(1, 1)

arma11.R

Estimation of ARMA models: Maximum likelihood estimation

Additional assumption: The innovations εt are normally distributed

Implication: ARMA processes are Gaussian

The joint distribution of X1, . . . , XT is multivariate normal:

X = (X1, . . . , XT)′ ∼ N(µ, Σ)

Estimation of ARMA models: Maximum likelihood estimation

Expectation vector

      ⎡ X1 ⎤   ⎡ c/(1 − φ1 − . . . − φp) ⎤
µ = E ⎢ ⋮  ⎥ = ⎢           ⋮             ⎥
      ⎣ XT ⎦   ⎣ c/(1 − φ1 − . . . − φp) ⎦

Covariance matrix

        ⎡ X1 ⎤   ⎡ γ(0)      γ(1)      . . .  γ(T − 1) ⎤
Σ = Cov ⎢ ⋮  ⎥ = ⎢ γ(1)      γ(0)      . . .  γ(T − 2) ⎥
        ⎣ XT ⎦   ⎢  ⋮         ⋮        ⋱       ⋮      ⎥
                 ⎣ γ(T − 1)  γ(T − 2)  . . .  γ(0)     ⎦

Estimation of ARMA models: Maximum likelihood estimation

The expectation vector and the covariance matrix contain all unknown parameters ψ = (φ1, . . . , φp, θ1, . . . , θq, c, σ²)′

The likelihood function is

L(ψ; X) = (2π)^{−T/2} (det Σ)^{−1/2} exp(−(1/2)(X − µ)′Σ⁻¹(X − µ))

and the loglikelihood function is

ln L(ψ; X) = −(T/2) ln(2π) − (1/2) ln(det Σ) − (1/2)(X − µ)′Σ⁻¹(X − µ)

The ML estimators are ψ̂ = arg max ln L(ψ; X)

Estimation of ARMA models: Maximum likelihood estimation

The loglikelihood function has to be maximized by numerical methods

Standard properties of ML estimators:

1. consistency
2. asymptotic efficiency
3. asymptotically jointly normally distributed
4. the covariance matrix of the estimators can be consistently estimated

Example: ML estimation of an ARMA(3, 3) model for the interest rate spread (arma33.R)

Estimation of ARMA models: Hypothesis tests

Since the estimation method is maximum likelihood, the classical tests (Wald, LR, LM) are applicable

General null and alternative hypotheses

H0 : g(ψ) = 0

H1 : not H0

where g(ψ) is an m-dimensional function of the parameters

Example: If H0 : φ1 = 0 then m = 1 and g(ψ) = φ1

Estimation of ARMA models: Hypothesis tests

Likelihood ratio test statistic:

LR = 2 (ln L(θ̂ML) − ln L(θ̂R))

where θ̂ML and θ̂R are the unrestricted and restricted estimators

Under the null hypothesis,

LR →d U ∼ χ²m

and H0 is rejected at significance level α if LR > χ²m;1−α

Disadvantage: two models must be estimated
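The mechanics of the test are simple once both models are fitted; a Python sketch with made-up loglikelihood values (3.841 is the 95% quantile of χ²₁):

```python
def lr_test(loglik_unrestricted, loglik_restricted, chi2_crit):
    """LR = 2 (ln L_U - ln L_R); reject H0 if LR exceeds the critical value."""
    lr = 2.0 * (loglik_unrestricted - loglik_restricted)
    return lr, lr > chi2_crit

# Hypothetical fitted loglikelihoods of an unrestricted and a restricted model
lr, reject = lr_test(-830.2, -835.0, 3.841)  # m = 1 restriction, alpha = 0.05
```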

Estimation of ARMA models: Hypothesis tests

For the Wald test we only consider g(ψ) = ψ − ψ0, i.e.

H0 : ψ = ψ0
H1 : not H0

Test statistic:

W = (ψ̂ − ψ0)′ [Cov(ψ̂)]⁻¹ (ψ̂ − ψ0)

If the null hypothesis is true then W →d U ∼ χ²m

The asymptotic covariance matrix can be estimated consistently as Cov(ψ̂) = H⁻¹, where H is the Hessian matrix returned by the maximization procedure

Estimation of ARMA models: Hypothesis tests

Test example 1:

H0 : φ1 = 0
H1 : φ1 ≠ 0

Test example 2:

H0 : ψ = ψ0
H1 : not H0

Illustration (arma33.R)

Estimation of ARMA models: Model selection

Usually, the lag orders p and q of an ARMA model are unknown

Trade-off: Goodness-of-fit against parsimony

Akaike's information criterion for the model with non-zero expectation:

AIC = ln σ̂² + 2 (p + q + 1)/T

where ln σ̂² measures goodness-of-fit and 2 (p + q + 1)/T is the penalty for model complexity

Choose the model with the smallest AIC

Estimation of ARMA models: Model selection

Bayesian information criterion (BIC, Schwarz information criterion):

BIC = ln σ̂² + (p + q + 1) · ln T / T

Hannan-Quinn information criterion:

HQ = ln σ̂² + 2 (p + q + 1) · ln(ln T) / T

Both BIC and HQ are consistent while the AIC tends to overfit

Illustration (arma33.R)
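All three criteria are cheap to compute from a fitted model's residual variance; a Python sketch with hypothetical fit results (not output of arma33.R):

```python
import math

def info_criteria(sigma2_hat, p, q, T):
    """AIC, BIC and Hannan-Quinn for an ARMA(p, q) with non-zero expectation."""
    k = p + q + 1  # number of estimated mean-equation parameters
    aic = math.log(sigma2_hat) + 2 * k / T
    bic = math.log(sigma2_hat) + k * math.log(T) / T
    hq = math.log(sigma2_hat) + 2 * k * math.log(math.log(T)) / T
    return aic, bic, hq

# Hypothetical fits: the richer model buys only a tiny reduction in sigma^2
aic1, bic1, hq1 = info_criteria(1.00, 1, 1, 200)  # ARMA(1,1)
aic2, bic2, hq2 = info_criteria(0.99, 3, 3, 200)  # ARMA(3,3)
```

Here all three criteria prefer the parsimonious ARMA(1,1); for T = 200 the per-parameter penalties order as AIC < HQ < BIC, since 2 < 2 ln ln T < ln T.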

Estimation of ARMA models: Model selection

Another illustration: the true model is ARMA(2, 1) with Xt = 0.5Xt−1 + 0.3Xt−2 + εt + 0.7εt−1; 1000 samples of size n = 500 were generated; the table shows the model orders p and q as selected by AIC and BIC

     # orders selected by AIC            # orders selected by BIC
           q                                   q
p     0    1    2    3    4    5         0    1    2    3    4    5
0     0    0    0    0    0    0         0    0    0    0    0    0
1     0   18   64   23   14    6         0  310  167    4    0    0
2     0  171   21   16    5    7         0  503    3    1    0    0
3     0    7   35   58   80   45         1    0    2    1    0    0
4     9    2   12  139   37   44         6    1    0    0    0    0
5    11    6   12   56   46   56         1    0    0    0    0    0

Integrated processes: Difference operator

Define the difference operator

∆ = 1 − L,

then

∆Xt = Xt − Xt−1

Second order differences:

∆² = ∆(∆) = (1 − L)² = 1 − 2L + L²

Higher orders ∆ⁿ are defined in the same way; note that ∆ⁿ ≠ 1 − Lⁿ
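A quick Python check of the operator algebra: ∆ applied twice agrees with 1 − 2L + L², and second differences of t² are constant.

```python
def diff(x, n=1):
    """Apply the difference operator Delta = 1 - L to a series, n times."""
    for _ in range(n):
        x = [b - a for a, b in zip(x, x[1:])]
    return x

x = [1.0, 4.0, 9.0, 16.0, 25.0]  # X_t = t^2
d1 = diff(x)      # first differences
d2 = diff(x, 2)   # second differences: constant for a quadratic
# Delta^2 X_t = X_t - 2 X_{t-1} + X_{t-2}
check = [x[t] - 2 * x[t - 1] + x[t - 2] for t in range(2, len(x))]
```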

Integrated processes: Definition

Definition: Integrated process

A stochastic process is called integrated of order 1 if

∆Xt = µ + Ψ(L)εt

where εt is white noise, Ψ(1) ≠ 0, and ∑_{j=0}^{∞} j |ψj| < ∞

Common notation: Xt ∼ I(1)

I(1) processes are also called difference stationary or unit root processes

Stochastic and deterministic trends

Trend stationary processes are not I(1) (since Ψ(1) = 0)

Integrated processes: Definition

Stationary processes are sometimes called I(0)

Higher orders of integration are possible, e.g. Xt ∼ I(2) with ∆²Xt ∼ I(0)

In general, Xt ∼ I(d) means that ∆ᵈXt ∼ I(0)

Most economic time series are either I (0) or I (1)

Some economic time series may be I (2)

Integrated processes: Definition

Example 1: The random walk with drift, Xt = b + Xt−1 + εt, is I(1) because

∆Xt = Xt − Xt−1 = b + εt = b + Ψ(L)εt

where ψ0 = 1 and ψj = 0 for j ≠ 0
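A Python sketch of Example 1 (drift and sample size are illustrative): differencing a simulated random walk with drift leaves b + εt, so the sample mean of ∆Xt is close to b.

```python
import random

random.seed(42)

# Random walk with drift: X_t = b + X_{t-1} + eps_t
b, T = 0.2, 20000
x = [0.0]
for _ in range(T):
    x.append(b + x[-1] + random.gauss(0, 1))

dx = [x[t] - x[t - 1] for t in range(1, len(x))]  # Delta X_t = b + eps_t
mean_dx = sum(dx) / len(dx)  # should be close to the drift b = 0.2
```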

Integrated processes: Definition

Example 2: The trend stationary process, Xt = a + bt + εt, is not I(1) because

∆Xt = b + εt − εt−1 = b + Ψ(L)εt

with ψ0 = 1, ψ1 = −1 and ψj = 0 for all other j, so that Ψ(1) = 0

Integrated processes: Definition

Example 3: The "AR(2) process"

Xt = b + (1 + φ)Xt−1 − φXt−2 + εt

(1 − φL)(1 − L)Xt = b + εt

is I(1) if |φ| < 1 because ∆Xt = Ψ(L)(b + εt) with

Ψ(L) = (1 − φL)⁻¹ = 1 + φL + φ²L² + φ³L³ + φ⁴L⁴ + . . .

and thus Ψ(1) = ∑_{i=0}^{∞} φⁱ = 1/(1 − φ) ≠ 0. The roots of the characteristic equation are z = 1 and z = 1/φ

Integrated processes: Definition

Example 4: The process

Xt = 0.5Xt−1 − 0.4Xt−2 + εt

is a stationary (stable) zero expectation AR(2) process; the process

Yt = a + bt + Xt

is trend stationary and I(0) since

∆Yt = b + ∆Xt

with ∆Xt = Ψ(L)εt = (1 − L)(1 − 0.5L + 0.4L²)⁻¹ εt and therefore Ψ(1) = 0 (i0andi1.R)

Integrated processes: Definition

Definition: ARIMA process

Let (εt)t∈Z be a white noise process; the stochastic process (Xt)t∈Z is called an integrated autoregressive moving-average process of orders p, d and q, or ARIMA(p, d, q), if ∆ᵈXt is an ARMA(p, q) process:

Φ(L)∆ᵈXt = Θ(L)εt

For d > 0 the process is nonstationary (I(d)) even if all roots of Φ(z) = 0 are outside the unit circle

Simulation of an ARIMA(p, d, q) process (arimapdqsim.R)

Integrated processes: Deterministic versus stochastic trends

Why is it important to distinguish deterministic and stochastic trends?

Reason 1: Long-term forecasts and forecasting errors

Deterministic trend: The forecasting error variance is bounded

Stochastic trend: The forecasting error variance is unbounded

Illustrations

i0andi1.R

Integrated processes: Deterministic versus stochastic trends

Why is it important to distinguish deterministic and stochastic trends?

Reason 2: Spurious regression

OLS regressions will show spurious relationships between time series with (deterministic or stochastic) trends

Detrending works if the series have deterministic trends, but it does not help if the series are integrated

Illustrations

spurious1.R

Integrated processes: Integrated processes and parameter estimation

OLS estimators (and ML estimators) are consistent and asymptotically normal for stationary processes

The asymptotic normality is lost if the processes are integrated

We only look at the very special case

Xt = φ1Xt−1 + εt

with εt ∼ NID(0, 1) and X0 = 0

The AR(1) process is stationary if |φ1| < 1 and has a unit root if |φ1| = 1

Integrated processes: Integrated processes and parameter estimation

The usual OLS estimator of φ1 is

φ̂1 = ∑_{t=1}^{T} XtXt−1 / ∑_{t=1}^{T} X²t−1

What does the distribution of φ̂1 look like?

Influence of φ1 and T

Consistency?

Asymptotic normality?

Illustration (phihat.R)
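A small Monte Carlo sketch in Python (not the course's phihat.R; sizes are illustrative) for the unit root case φ1 = 1: the estimates concentrate just below 1, hinting at superconsistency and at the nonnormal limit distribution.

```python
import random

random.seed(0)

def phi_hat(x):
    """OLS estimator: sum X_t X_{t-1} / sum X_{t-1}^2 (no constant)."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den

def simulate_ar1(phi, T):
    x = [0.0]
    for _ in range(T):
        x.append(phi * x[-1] + random.gauss(0, 1))
    return x

# Unit root case phi_1 = 1: estimates pile up just below 1
estimates = [phi_hat(simulate_ar1(1.0, 500)) for _ in range(500)]
mean_est = sum(estimates) / len(estimates)
```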

Integrated processes: Integrated processes and parameter estimation

Consistency and asymptotic normality for I(0) processes (|φ1| < 1):

plim φ̂1 = φ1

√T (φ̂1 − φ1) →d Z ∼ N(0, 1 − φ1²)

Consistency and asymptotic normality for I(1) processes (φ1 = 1):

plim φ̂1 = 1

T (φ̂1 − 1) →d V

where V is a nondegenerate, nonnormal random variable

Root-T-consistency versus superconsistency

Integrated processes: Unit root tests

Importance of distinguishing between trend stationarity and difference stationarity

Test of the hypothesis that a process has a unit root (i.e. is I(1))

Classical approaches: (Augmented) Dickey-Fuller test, Phillips-Perron test

Basic tool: linear regression

Xt = deterministics + φXt−1 + εt

∆Xt = deterministics + βXt−1 + εt,   where β := φ − 1

Integrated processes: Unit root tests

Null and alternative hypothesis

H0 : φ = 1 (unit root)

H1 : |φ| < 1 (no unit root)

or, equivalently,

H0 : β = 0 (unit root)

H1 : β < 0 (no unit root)

Unit root tests are one-sided; explosive processes are ruled out

Rejecting the null hypothesis is evidence in favour of stationarity

If the null hypothesis is not rejected, there could be a unit root

Integrated processes: DF test and ADF test

Dickey-Fuller (DF) and Augmented Dickey-Fuller (ADF) tests

Possible regressions:

Xt = φXt−1 + εt            or   ∆Xt = βXt−1 + εt
Xt = a + φXt−1 + εt        or   ∆Xt = a + βXt−1 + εt
Xt = a + bt + φXt−1 + εt   or   ∆Xt = a + bt + βXt−1 + εt

Assumption for Dickey-Fuller test: no autocorrelation in εt

If there is autocorrelation in εt , use the augmented DF test

Integrated processes: DF test and ADF test

Dickey-Fuller regression, case 1: no constant, no trend

∆Xt = βXt−1 + εt

Null and alternative hypotheses

H0 : β = 0

H1 : β < 0

Null hypothesis: stochastic trend without drift

Alternative hypothesis: stationary process around zero

Integrated processes: DF test and ADF test

Dickey-Fuller regression, case 2: constant, no trend

∆Xt = a + βXt−1 + εt

Null and alternative hypotheses

H0 : β = 0   or   H0 : β = 0, a = 0

H1 : β < 0   or   H1 : β < 0, a ≠ 0

Null hypothesis: stochastic trend without drift

Alternative hypothesis: stationary process around a constant

Integrated processes: DF test and ADF test

Dickey-Fuller regression, case 3: constant and trend

∆Xt = a + bt + βXt−1 + εt

Null and alternative hypotheses

H0 : β = 0   or   H0 : β = 0, b = 0

H1 : β < 0   or   H1 : β < 0, b ≠ 0

Null hypothesis: stochastic trend with drift

Alternative hypothesis: trend stationary process

Integrated processes: DF test and ADF test

Dickey-Fuller test statistics for single hypotheses:

"ρ-test": T · β̂
"τ-test": β̂/σ̂β̂

The τ-test statistic is computed in the same way as the usual t-test statistic

Reject the null hypothesis if the test statistics are too small

The critical values are not the quantiles of the t-distribution

There are tables with the correct critical values (e.g. Hamilton, table B.6)
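A Python sketch of the ρ-statistic for case 1 (no constant, no trend), computed on a simulated pure random walk so that H0 is true; the sample size is illustrative:

```python
import random

random.seed(7)

# Under H0 the series is a pure random walk
T = 500
x = [0.0]
for _ in range(T):
    x.append(x[-1] + random.gauss(0, 1))

# Case-1 DF regression: Delta X_t = beta * X_{t-1} + eps_t (no deterministics)
dx = [x[t] - x[t - 1] for t in range(1, len(x))]
beta_hat = (sum(d * xl for d, xl in zip(dx, x[:-1]))
            / sum(xl ** 2 for xl in x[:-1]))

rho_stat = T * beta_hat  # compare to Dickey-Fuller tables, not to N(0,1)
```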

Integrated processes: DF test and ADF test

The Dickey-Fuller test statistics for the joint hypotheses are computed in the same way as the usual F-test statistics

Reject the null hypothesis if the test statistic is too large

The critical values are not the quantiles of the F-distribution

There are tables with the correct critical values (e.g. Hamilton, table B.7)

Illustrations (dftest.R)

Integrated processes: DF test and ADF test

If there is autocorrelation in εt the DF test does not work (dftest.R)

Augmented Dickey-Fuller test (ADF test) regressions:

∆Xt = γ1∆Xt−1 + . . . + γp∆Xt−p + βXt−1 + εt
∆Xt = a + γ1∆Xt−1 + . . . + γp∆Xt−p + βXt−1 + εt
∆Xt = a + bt + γ1∆Xt−1 + . . . + γp∆Xt−p + βXt−1 + εt

The added lagged differences capture the autocorrelation

The number of lags p must be large enough to make εt white noise

The critical values remain the same as in the no-correlation case

Integrated processes: DF test and ADF test

Further interesting topics (but we skip these)

Phillips-Perron test

Structural breaks and unit roots

KPSS test of stationarity

H0 : Xt ∼ I (0)

H1 : Xt ∼ I (1)

Integrated processes: Regression with integrated processes

Spurious regression: If Xt and Yt are independent but both I(1), then the regression

Yt = α + βXt + ut

will result in an estimated coefficient β̂ that is significantly different from 0 with probability 1 as T → ∞

BUT: The regression

Yt = α + βXt + ut

may be sensible even though Xt and Yt are I(1)

Cointegration

Integrated processes: Regression with integrated processes

Definition: Cointegration

Two stochastic processes (Xt)t∈T and (Yt)t∈T are cointegrated if both processes are I(1) and there is a constant β such that the process (Yt − βXt) is I(0)

If β is known, cointegration can be tested using a standard unit root test on the process (Yt − βXt)

If β is unknown, it can be estimated from the linear regression

Yt = α + βXt + ut

and cointegration is tested using a modified unit root test on the residual process (ût)t=1,...,T
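The two-step idea can be sketched in Python (an Engle-Granger style illustration with made-up data where Yt = 2Xt + stationary noise): estimate β by OLS, then the residuals would be handed to the unit root test.

```python
import random

random.seed(11)

# X_t is a random walk (I(1)); Y_t = 2 X_t + u_t with u_t I(0)
T = 2000
x = [0.0]
for _ in range(T):
    x.append(x[-1] + random.gauss(0, 1))
x = x[1:]
y = [2.0 * xt + random.gauss(0, 1) for xt in x]

# Step 1: OLS of Y_t on X_t; beta_hat is superconsistent under cointegration
mx, my = sum(x) / T, sum(y) / T
beta_hat = (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))
resid = [b - my - beta_hat * (a - mx) for a, b in zip(x, y)]
resid_var = sum(r * r for r in resid) / T  # stationary residuals, variance near 1

# Step 2 (not shown): modified unit root test on resid
```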

GARCH models: Conditional expectation

Let (X, Y) be a bivariate random variable with a joint density function; then

E(X|Y = y) = ∫_{−∞}^{∞} x f_{X|Y=y}(x) dx

is the conditional expectation of X given Y = y

E (X |Y ) denotes a random variable with realization E (X |Y = y)if the random variable Y realizes as y

Both E (X |Y ) and E (X |Y = y) are called conditional expectation

GARCH models: Conditional variance

Let (X, Y) be a bivariate random variable with a joint density function; then

Var(X|Y = y) = ∫_{−∞}^{∞} (x − E(X|Y = y))² f_{X|Y=y}(x) dx

is the conditional variance of X given Y = y

Var(X |Y ) denotes a random variable with realization Var(X |Y = y)if the random variable Y realizes as y

Both Var(X |Y = y) and Var(X |Y ) are called conditional variance

GARCH models: Rules for conditional expectations

1. Law of iterated expectations: E(E(X|Y)) = E(X)

2. If X and Y are independent, then E(X|Y) = E(X)

3. The condition can be treated like a constant: E(XY|Y) = Y · E(X|Y)

4. The conditional expectation is a linear operator: for a1, . . . , an ∈ R,

E(∑_{i=1}^{n} aiXi | Y) = ∑_{i=1}^{n} ai E(Xi|Y)

GARCH models: Basics

Some economic time series show volatility clusters, e.g. stock returns, commodity price changes, inflation rates, . . .

Simple autoregressive models cannot capture volatility clusters since their conditional variance is constant

Example: stationary AR(1) process, Xt = αXt−1 + εt with |α| < 1; then

Var(Xt) = σ²X = σ²ε / (1 − α²),

and the conditional variance is

Var(Xt|Xt−1) = σ²ε

GARCH models: Basics

In the following, we will focus on stock returns

Empirical fact: squared (or absolute) returns are positively autocorrelated

Implication: Returns are not independent over time

The dependence is nonlinear

How can we model this kind of dependence?

GARCH models: ARCH(1)-process

Definition: ARCH(1)-process

The stochastic process (Xt)t∈Z is called an ARCH(1)-process if

E(Xt|Xt−1) = 0

Var(Xt|Xt−1) = σ²t = α0 + α1X²t−1

for all t ∈ Z, with α0, α1 > 0

Often, an additional assumption is

Xt | (Xt−1 = xt−1) ∼ N(0, α0 + α1x²t−1)

GARCH models: ARCH(1)-process

The unconditional distribution of Xt is non-normal

Leptokurtosis: the tails are heavier than those of the normal distribution

Example of an ARCH(1)-process:

Xt = εtσt

where (εt)t∈Z is white noise with σ²ε = 1 and

σt = √(α0 + α1X²t−1)

GARCH models: ARCH(1)-process

One can show that

E(Xt|Xt−1) = 0
E(Xt) = 0
Var(Xt|Xt−1) = α0 + α1X²t−1
Var(Xt) = α0/(1 − α1)
Cov(Xt, Xt−i) = 0 for i > 0

Stationarity condition: 0 < α1 < 1

The unconditional kurtosis is 3(1 − α1²)/(1 − 3α1²) if εt ∼ N(0, 1); if α1 ≥ √(1/3) ≈ 0.57735, the kurtosis does not exist

GARCH models: ARCH(1)-process

Squared returns follow

X²t = α0 + α1X²t−1 + vt

with vt = σ²t (ε²t − 1)

Thus, the squared returns of an ARCH(1) process follow an AR(1)

The process (vt)t∈Z is white noise:

E(vt) = 0
Var(vt) = E(v²t) = const.
Cov(vt, vt−i) = 0 (i = 1, 2, . . .)

GARCH models: ARCH(1)-process

Simulation of an ARCH(1)-process for t = 1, . . . , 2500

Parameters: α0 = 0.05, α1 = 0.95, start value X0 = 0

Conditional distribution: εt ∼ N(0, 1)

archsim.R

Check whether the simulated time series shows the typical stylized facts of return distributions
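A Python variant of the simulation (with α1 = 0.5 instead of the 0.95 above, so that the fourth moment exists and sample moments settle down):

```python
import math
import random

random.seed(3)

# ARCH(1): X_t = eps_t * sigma_t with sigma_t^2 = a0 + a1 * X_{t-1}^2
a0, a1, T = 0.05, 0.5, 20000
x = [0.0]
for _ in range(T):
    sigma = math.sqrt(a0 + a1 * x[-1] ** 2)
    x.append(random.gauss(0, 1) * sigma)
x = x[1:]

# Sample variance should be near the theoretical a0 / (1 - a1) = 0.1
var_x = sum(xi ** 2 for xi in x) / T
```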

GARCH models: Estimation of an ARCH(1)-process

Of course, we do not know the true values of the model parameters α0 and α1

How can we estimate the unknown parameters α0 and α1?

Observations: X1, . . . , XT

Because of

X²t = α0 + α1X²t−1 + vt

a possible estimation method is OLS

GARCH models: Estimation of an ARCH(1)-process

OLS estimator of α1:

α̂1 = ∑_{t=2}^{T} (X²t − m)(X²t−1 − m₋) / ∑_{t=2}^{T} (X²t−1 − m₋)² ≈ ρ̂(X²t, X²t−1),

where m and m₋ denote the sample means of X²t and X²t−1 over t = 2, . . . , T

Careful: These estimators are only consistent if the kurtosis exists (i.e. if α1 < √(1/3))

Test of ARCH effects:

H0 : α1 = 0
H1 : α1 > 0

GARCH models: Estimation of an ARCH(1)-process

For T large, under H0,

√T α̂1 ∼ N(0, 1) (approximately)

Reject H0 if √T α̂1 > Φ⁻¹(1 − α)

Second version of this test: consider the R² of the regression

X²t = α0 + α1X²t−1 + vt;

then, under H0,

T α̂1² ≈ TR² ∼ χ²1 (approximately)

Reject H0 if TR² > F⁻¹_{χ²1}(1 − α)

GARCH models: ARCH(p)-process

Definition: ARCH(p)-process

The stochastic process (Xt)t∈Z is called an ARCH(p)-process if

E(Xt|Xt−1, . . . , Xt−p) = 0

Var(Xt|Xt−1, . . . , Xt−p) = σ²t = α0 + α1X²t−1 + . . . + αpX²t−p

for t ∈ Z, where αi ≥ 0 for i = 0, 1, . . . , p − 1 and αp > 0

Often, an additional assumption is that

Xt|(Xt−1 = xt−1, . . . , Xt−p = xt−p) ∼ N(0, σ²t)

GARCH models: ARCH(p)-process

Example of an ARCH(p)-process:

Xt = εtσt

where (εt)t∈Z is white noise with σ²ε = 1 and

σt = √(α0 + α1X²t−1 + . . . + αpX²t−p)

An ARCH(p) process is weakly stationary if all roots of 1 − α1z − α2z² − . . . − αpzᵖ = 0 are outside the unit circle

Then, for all t ∈ Z, E(Xt) = 0 and

Var(Xt) = α0 / (1 − ∑_{i=1}^{p} αi)

GARCH models: ARCH(p)-process

If (Xt)t∈Z is a stationary ARCH(p) process, then (X²t)t∈Z is a stationary AR(p) process

X²t = α0 + α1X²t−1 + . . . + αpX²t−p + vt

As to the error term,

E(vt) = 0
Var(vt) = const.
Cov(vt, vt−i) = 0 for i = 1, 2, . . .

Simulating an ARCH(p) process is easy

GARCH models: Estimation of ARCH(p) models

OLS estimation of

X²t = α0 + α1X²t−1 + . . . + αpX²t−p + vt

Test of ARCH effects:

H0 : α1 = α2 = . . . = αp = 0 vs H1 : not H0

Let R² denote the coefficient of determination of the regression

Under H0, the test statistic TR² ∼ χ²p (approximately); thus reject H0 if TR² > F⁻¹_{χ²p}(1 − α)

GARCH models: Maximum likelihood estimation

Basic idea of the maximum likelihood estimation method: choose the parameters such that the joint density of the observations

f_{X1,...,XT}(x1, . . . , xT)

is maximized

Let X1, . . . , XT denote a random sample from X

The density fX(x; θ) depends on R unknown parameters θ = (θ1, . . . , θR)

GARCH models: Maximum likelihood estimation

ML estimation of θ: maximize the (log)likelihood function

L(θ) = f_{X1,...,XT}(x1, . . . , xT; θ) = ∏_{t=1}^{T} fX(xt; θ)

ln L(θ) = ∑_{t=1}^{T} ln fX(xt; θ)

ML estimate:

θ̂ = arg max ln L(θ)

GARCH models: Maximum likelihood estimation

Since observations are independent in random samples,

f_{X1,...,XT}(x1, . . . , xT) = ∏_{t=1}^{T} f_{Xt}(xt)

or

ln f_{X1,...,XT}(x1, . . . , xT) = ∑_{t=1}^{T} ln f_{Xt}(xt) = ∑_{t=1}^{T} ln fX(xt)

But: ARCH returns are not independent!

GARCH models: Maximum likelihood estimation

Factorization with dependent observations:

f_{X1,...,XT}(x1, . . . , xT) = ∏_{t=1}^{T} f_{Xt|Xt−1,...,X1}(xt|xt−1, . . . , x1)

or

ln f_{X1,...,XT}(x1, . . . , xT) = ∑_{t=1}^{T} ln f_{Xt|Xt−1,...,X1}(xt|xt−1, . . . , x1)

Hence, for an ARCH(1)-process,

f_{X1,...,XT}(x1, . . . , xT) = f_{X1}(x1) ∏_{t=2}^{T} (1/√(2πσ²t)) exp(−(1/2)(xt/σt)²)

GARCH models: Maximum likelihood estimation

The marginal density of X1 is complicated but becomes negligible for large T and will therefore be dropped from now on

Log-likelihood function (without the initial marginal density):

ln L(α0, α1|x1, . . . , xT) = −(T/2) ln 2π − (1/2) ∑_{t=2}^{T} ln σ²t − (1/2) ∑_{t=2}^{T} (xt/σt)²

where σ²t = α0 + α1x²t−1

ML estimation of α0 and α1 by numerical maximization of ln L(α0, α1) with respect to α0 and α1
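A Python sketch of this conditional log-likelihood; the toy data and the crude grid search (standing in for a proper numerical optimizer) are purely illustrative:

```python
import math

def arch1_loglik(a0, a1, x):
    """Conditional Gaussian log-likelihood of an ARCH(1), with the
    marginal density of x_1 dropped (conditional estimation)."""
    T = len(x)
    ll = -T / 2 * math.log(2 * math.pi)
    for t in range(1, T):
        s2 = a0 + a1 * x[t - 1] ** 2  # sigma_t^2 = a0 + a1 * x_{t-1}^2
        ll -= 0.5 * (math.log(s2) + x[t] ** 2 / s2)
    return ll

# Toy data and a coarse parameter grid
x = [0.1, -0.4, 0.9, -0.2, 0.5, -1.1, 0.3, 0.2, -0.6, 0.4]
grid = [(a0, a1) for a0 in (0.05, 0.1, 0.2, 0.4) for a1 in (0.1, 0.3, 0.5, 0.8)]
a0_hat, a1_hat = max(grid, key=lambda p: arch1_loglik(p[0], p[1], x))
```

A real implementation would hand arch1_loglik to a numerical maximizer instead of a grid.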

GARCH models: GARCH(p,q)-process

Definition: GARCH(p,q)-process

The stochastic process (Xt)t∈Z is called a GARCH(p, q)-process if

E(Xt|Xt−1, Xt−2, . . .) = 0

Var(Xt|Xt−1, Xt−2, . . .) = σ²t = α0 + α1X²t−1 + . . . + αpX²t−p + β1σ²t−1 + . . . + βqσ²t−q

for t ∈ Z with αi, βi ≥ 0

Often, an additional assumption is that

(Xt|Xt−1 = xt−1, Xt−2 = xt−2, . . .) ∼ N(0, σ²t)

GARCH models: GARCH(p,q)-process

Conditional variance of a GARCH(1, 1):

Var(Xt|Xt−1, Xt−2, . . .) = σ²t = α0 + α1X²t−1 + β1σ²t−1
                          = α0/(1 − β1) + α1 ∑_{i=1}^{∞} β1^{i−1} X²t−i

Unconditional variance:

Var(Xt) = α0 / (1 − ∑_{i=1}^{p} αi − ∑_{j=1}^{q} βj)

GARCH models: GARCH(p,q)-process

Necessary condition for weak stationarity:

∑_{i=1}^{p} αi + ∑_{j=1}^{q} βj < 1

(Xt)t∈Z has no autocorrelation

GARCH processes can be written as ARMA(max(p, q), q) processes in the squared returns

Example: GARCH(1, 1)-process with Xt = εtσt and σ²t = α0 + α1X²t−1 + β1σ²t−1

GARCH models: Estimation of GARCH(p,q)-processes

Estimation of the ARMA(max(p, q), q) process in the squared returns

Alternative (and better) method: maximum likelihood

For a GARCH(1, 1)-process,

f_{X1,...,XT}(x1, . . . , xT) = f_{X1}(x1) ∏_{t=2}^{T} (1/√(2πσ²t)) exp(−(1/2)(xt/σt)²)

GARCH models: Estimation of GARCH(p,q)-processes

Again, the density of X1 can be neglected

Log-likelihood function:

ln L(α0, α1, β1|x1, . . . , xT) = −(T/2) ln 2π − (1/2) ∑_{t=2}^{T} ln σ²t − (1/2) ∑_{t=2}^{T} (xt/σt)²

with σ²t = α0 + α1x²t−1 + β1σ²t−1 and σ²1 = 0

ML estimation of α0, α1 and β1 by numerical maximization

GARCH models: Estimation of GARCH(p,q)-processes

Conditional h-step forecast of the volatility σ²t+h in a GARCH(1, 1) model:

E(σ²t+h|Xt, Xt−1, . . .) = (α1 + β1)ʰ (σ²t − α0/(1 − α1 − β1)) + α0/(1 − α1 − β1)

If the process is stationary,

lim_{h→∞} E(σ²t+h|Xt, Xt−1, . . .) = α0/(1 − α1 − β1)

Simulation of GARCH processes is easy; the estimation can be computer intensive
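The forecast recursion in Python (hypothetical parameter values): forecasts decay geometrically from the current σ²t toward the unconditional variance.

```python
def garch11_vol_forecast(sigma2_t, a0, a1, b1, h):
    """h-step volatility forecast in a stationary GARCH(1,1):
    (a1 + b1)^h * (sigma2_t - sbar) + sbar, with sbar = a0/(1 - a1 - b1)."""
    sbar = a0 / (1 - a1 - b1)
    return (a1 + b1) ** h * (sigma2_t - sbar) + sbar

# Illustrative parameters: unconditional variance sbar = 0.1/(1 - 0.9) = 1.0
f1 = garch11_vol_forecast(0.5, 0.1, 0.1, 0.8, 1)      # one step ahead
f100 = garch11_vol_forecast(0.5, 0.1, 0.1, 0.8, 100)  # essentially sbar
```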

GARCH models: Residuals of an estimated GARCH(1,1) model

Careful: residuals are slightly different from what you know from OLS regressions

Estimates: α̂0, α̂1, β̂1, µ̂

From σ²t = α0 + α1X²t−1 + β1σ²t−1 and Xt = µ + σtεt we calculate the standardized residuals

ε̂t = (Xt − µ̂)/σ̂t = (Xt − µ̂) / √(α̂0 + α̂1X²t−1 + β̂1σ̂²t−1)

Histogram of the standardized residuals

GARCH models: AR(p)-ARCH(q)-models

Definition: (Xt)t∈Z is called an AR(p)-ARCH(q)-process if (shown here for p = q = 1)

Xt = µ + φ1Xt−1 + εt

σ²t = α0 + α1ε²t−1

where εt ∼ N(0, σ²t)

mean equation / variance equation

Maximum likelihood estimation

GARCH models: Extensions of the GARCH model

There are a number of possible extensions to the GARCH model:

Empirical fact: Negative shocks have a larger impact on volatility than positive shocks (leverage effect)

News impact curve

Nonnormal innovations, e.g. εt ∼ tν
