Lecture Notes 9: Filtering

  • State Space Models and Filtering

    Jesús Fernández-Villaverde
    University of Pennsylvania

  • State Space Form

    What is a state space representation?

    States versus observables.

    Why is it useful?

    Relation with filtering.

    Relation with optimal control.

    Linear versus nonlinear, Gaussian versus non-Gaussian.

  • State Space Representation

    Consider the following system:

    Transition equation:

    x_{t+1} = F x_t + G ω_{t+1},   ω_{t+1} ~ N(0, Q)

    Measurement equation:

    z_t = H' x_t + υ_t,   υ_t ~ N(0, R)

    where x_t are the states and z_t are the observables.

    Assume we want to write the likelihood function of z^T = {z_t}_{t=1}^T.

  • The State Space Representation is Not Unique

    Take the previous state space representation.

    Let B be a non-singular square matrix conforming with F.

    Then, if x*_t = B x_t, F* = B F B^{-1}, G* = B G, and H*' = H' B^{-1}, we can write a new, equivalent representation:

    Transition equation:

    x*_{t+1} = F* x*_t + G* ω_{t+1},   ω_{t+1} ~ N(0, Q)

    Measurement equation:

    z_t = H*' x*_t + υ_t,   υ_t ~ N(0, R)

  • Example I

    Assume the following AR(2) process:

    z_t = ρ_1 z_{t-1} + ρ_2 z_{t-2} + ε_t,   ε_t ~ N(0, σ²)

    The model is apparently not Markovian.

    Can we write this model in different state space forms?

    Yes!

  • State Space Representation I

    Transition equation:

    x_t = [ρ_1 1; ρ_2 0] x_{t-1} + [1; 0] ε_t

    where x_t = [y_t   ρ_2 y_{t-1}]'

    Measurement equation:

    z_t = [1 0] x_t

  • State Space Representation II

    Transition equation:

    x_t = [ρ_1 ρ_2; 1 0] x_{t-1} + [1; 0] ε_t

    where x_t = [y_t   y_{t-1}]'

    Measurement equation:

    z_t = [1 0] x_t

    Try B = [1 0; 0 ρ_2] on the second system to get the first system.
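    The equivalence can be checked numerically. The sketch below (Python with numpy; the values of ρ_1 and ρ_2 are purely illustrative, not taken from the notes) applies the matrix B to Representation II and recovers the matrices of Representation I.

```python
import numpy as np

# Hypothetical AR(2) coefficients, chosen only for illustration.
rho1, rho2 = 0.5, 0.3

# Representation II: state x_t = [y_t, y_{t-1}]'
F2 = np.array([[rho1, rho2],
               [1.0,  0.0]])
G2 = np.array([[1.0], [0.0]])
H2 = np.array([[1.0, 0.0]])      # row of H', so z_t = H' x_t

# Change of basis B maps Representation II into Representation I.
B = np.array([[1.0, 0.0],
              [0.0, rho2]])
Binv = np.linalg.inv(B)

F1 = B @ F2 @ Binv               # should equal [[rho1, 1], [rho2, 0]]
G1 = B @ G2                      # should equal [[1], [0]]
H1 = H2 @ Binv                   # should equal [[1, 0]]

print(F1)
print(G1)
print(H1)
```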

  • Example II

    Assume the following MA(1) process:

    z_t = ε_t + θ ε_{t-1},   ε_t ~ N(0, σ²),   and E(ε_t ε_s) = 0 for s ≠ t.

    Again, we have a more complicated structure than a simple Markovian process.

    However, it will again be straightforward to write a state space representation.

  • State Space Representation I

    Transition equation:

    x_t = [0 1; 0 0] x_{t-1} + [1; θ] ε_t

    where x_t = [y_t   θ ε_t]'

    Measurement equation:

    z_t = [1 0] x_t

  • State Space Representation II

    Transition equation:

    x_t = θ ε_{t-1}

    where x_t = [θ ε_{t-1}]'

    Measurement equation:

    z_t = x_t + ε_t

    Again, both representations are equivalent!

  • Example III

    Assume the following random walk plus drift process:

    z_t = z_{t-1} + μ + ε_t,   ε_t ~ N(0, σ²)

    This is even more interesting.

    We have a unit root.

    We have a constant parameter (the drift).

  • State Space Representation

    Transition equation:

    x_t = [1 1; 0 1] x_{t-1} + [1; 0] ε_t

    where x_t = [y_t   μ]'

    Measurement equation:

    z_t = [1 0] x_t

  • Some Conditions on the State Space Representation

    We only consider stable systems.

    A system is stable if, for any initial state x_0, the vector of states x_t converges to some unique x̄.

    A necessary and sufficient condition for the system to be stable is that:

    |λ_i(F)| < 1

    for all i, where λ_i(F) stands for an eigenvalue of F.
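    In practice the condition is easy to verify. A minimal sketch (Python with numpy; the matrix F is an arbitrary illustrative example, not from the notes):

```python
import numpy as np

# Stability check for an illustrative transition matrix F.
F = np.array([[0.5, 0.3],
              [1.0, 0.0]])

eigenvalues = np.linalg.eigvals(F)
is_stable = np.all(np.abs(eigenvalues) < 1)   # |lambda_i(F)| < 1 for all i

print(eigenvalues, is_stable)
```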

  • Introducing the Kalman Filter

    Developed by Kalman and Bucy.

    Wide application in science.

    Basic idea.

    Prediction, smoothing, and control.

    Why the name filter?


  • Some Definitions

    Let x_{t|t-1} = E[x_t | z^{t-1}] be the best linear predictor of x_t given the history of observables until t-1, i.e. z^{t-1}.

    Let z_{t|t-1} = E[z_t | z^{t-1}] = H' x_{t|t-1} be the best linear predictor of z_t given the history of observables until t-1, i.e. z^{t-1}.

    Let x_{t|t} = E[x_t | z^t] be the best linear predictor of x_t given the history of observables until t, i.e. z^t.

  • What is the Kalman Filter trying to do?

    Let us assume we have x_{t|t-1} and z_{t|t-1}.

    We observe a new z_t.

    We need to obtain x_{t|t}.

    Note that x_{t+1|t} = F x_{t|t} and z_{t+1|t} = H' x_{t+1|t}, so we can go back to the first step and wait for z_{t+1}.

    Therefore, the key question is how to obtain x_{t|t} from x_{t|t-1} and z_t.

  • A Minimization Approach to the Kalman Filter I

    Assume we use the following equation to get x_{t|t} from z_t and x_{t|t-1}:

    x_{t|t} = x_{t|t-1} + K_t (z_t − z_{t|t-1}) = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

    This formula will have some probabilistic justification (to follow).

    What is K_t?

  • A Minimization Approach to the Kalman Filter II

    K_t is called the Kalman filter gain and it measures how much we update x_{t|t-1} as a function of our error in predicting z_t.

    The question is how to find the optimal K_t.

    The Kalman filter is about how to build K_t such that we optimally update x_{t|t} from x_{t|t-1} and z_t.

    How do we find the optimal K_t?

  • Some Additional Definitions

    Let Σ_{t|t-1} ≡ E[(x_t − x_{t|t-1})(x_t − x_{t|t-1})' | z^{t-1}] be the predicting error variance-covariance matrix of x_t given the history of observables until t-1, i.e. z^{t-1}.

    Let Ω_{t|t-1} ≡ E[(z_t − z_{t|t-1})(z_t − z_{t|t-1})' | z^{t-1}] be the predicting error variance-covariance matrix of z_t given the history of observables until t-1, i.e. z^{t-1}.

    Let Σ_{t|t} ≡ E[(x_t − x_{t|t})(x_t − x_{t|t})' | z^t] be the predicting error variance-covariance matrix of x_t given the history of observables until t, i.e. z^t.

  • Finding the optimal K_t

    We want the K_t that minimizes Σ_{t|t}.

    It can be shown that, if that is the case:

    K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}

    with the optimal update of x_{t|t} given z_t and x_{t|t-1} being:

    x_{t|t} = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

    We will provide some intuition later.

  • Example I

    Assume the following model in state space form:

    Transition equation:

    x_t = μ + ε_t,   ε_t ~ N(0, σ²_ε)

    Measurement equation:

    z_t = x_t + υ_t,   υ_t ~ N(0, σ²_υ)

    Let σ²_υ = q σ²_ε.

  • Example II

    Then, if Σ_{1|0} = σ²_ε, which means that x_1 was drawn from the ergodic distribution of x_t, we have:

    K_1 = σ²_ε / (σ²_ε (1 + q)) = 1 / (1 + q)

    Therefore, the bigger σ²_υ relative to σ²_ε (the bigger q), the lower K_1 and the less we trust z_1.
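    A one-line check of this example (plain Python; the value of σ²_ε and the grid of q are illustrative): in the scalar case K_1 = Σ_{1|0}H(H'Σ_{1|0}H + R)^{-1} collapses to 1/(1+q).

```python
# Scalar Kalman gain for the example above: Sigma_{1|0} = sigma_eps^2,
# H = 1, R = sigma_upsilon^2 = q * sigma_eps^2, so K_1 = 1 / (1 + q).
sigma_eps2 = 1.0                      # illustrative value
for q in [0.1, 1.0, 10.0]:
    Sigma_10, H, R = sigma_eps2, 1.0, q * sigma_eps2
    K1 = Sigma_10 * H / (H * Sigma_10 * H + R)
    print(q, K1)   # K_1 falls as q rises: noisier data get less weight
```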

  • The Kalman Filter Algorithm I

    Given Σ_{t|t-1}, z_t, and x_{t|t-1}, we can now set up the Kalman filter algorithm.

    Let Σ_{t|t-1}; then we compute:

    Ω_{t|t-1} ≡ E[(z_t − z_{t|t-1})(z_t − z_{t|t-1})' | z^{t-1}]
             = E[ H'(x_t − x_{t|t-1})(x_t − x_{t|t-1})'H + υ_t(x_t − x_{t|t-1})'H
                  + H'(x_t − x_{t|t-1})υ_t' + υ_t υ_t' | z^{t-1} ]
             = H' Σ_{t|t-1} H + R

  • The Kalman Filter Algorithm II

    Let Σ_{t|t-1}; then we compute:

    E[(z_t − z_{t|t-1})(x_t − x_{t|t-1})' | z^{t-1}]
        = E[ (H'(x_t − x_{t|t-1}) + υ_t)(x_t − x_{t|t-1})' | z^{t-1} ] = H' Σ_{t|t-1}

    Let Σ_{t|t-1}; then we compute:

    K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}

    Let Σ_{t|t-1}, x_{t|t-1}, K_t, and z_t; then we compute:

    x_{t|t} = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

  • The Kalman Filter Algorithm III

    Let Σ_{t|t-1}, x_{t|t-1}, K_t, and z_t; then we compute:

    Σ_{t|t} ≡ E[(x_t − x_{t|t})(x_t − x_{t|t})' | z^t]
        = E[ (x_t − x_{t|t-1})(x_t − x_{t|t-1})'
             − (x_t − x_{t|t-1})(z_t − H'x_{t|t-1})' K_t'
             − K_t (z_t − H'x_{t|t-1})(x_t − x_{t|t-1})'
             + K_t (z_t − H'x_{t|t-1})(z_t − H'x_{t|t-1})' K_t' | z^t ]
        = Σ_{t|t-1} − K_t H' Σ_{t|t-1}

    where you have to notice that x_t − x_{t|t} = x_t − x_{t|t-1} − K_t (z_t − H' x_{t|t-1}).

  • The Kalman Filter Algorithm IV

    Let Σ_{t|t-1}, x_{t|t-1}, K_t, and z_t; then we compute:

    Σ_{t+1|t} = F Σ_{t|t} F' + G Q G'

    Let x_{t|t}; then we compute:

    1. x_{t+1|t} = F x_{t|t}

    2. z_{t+1|t} = H' x_{t+1|t}

    Therefore, from x_{t|t-1}, Σ_{t|t-1}, and z_t we compute x_{t|t} and Σ_{t|t}.

  • The Kalman Filter Algorithm V

    We also compute z_{t|t-1} and Ω_{t|t-1}.

    Why?

    To calculate the likelihood function of z^T = {z_t}_{t=1}^T (to follow).

  • The Kalman Filter Algorithm: A Review

    We start with x_{t|t-1} and Σ_{t|t-1}.

    Then, we observe z_t and:

    Ω_{t|t-1} = H' Σ_{t|t-1} H + R

    z_{t|t-1} = H' x_{t|t-1}

    K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}

    Σ_{t|t} = Σ_{t|t-1} − K_t H' Σ_{t|t-1}

    x_{t|t} = x_{t|t-1} + K_t (z_t − H' x_{t|t-1})

    Σ_{t+1|t} = F Σ_{t|t} F' + G Q G'

    x_{t+1|t} = F x_{t|t}

    We finish with x_{t+1|t} and Σ_{t+1|t}.
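    A minimal sketch of one iteration of the recursion just reviewed, assuming numpy arrays and the notation of the notes (H is stored as an n_x × n_z matrix so that z_t = H'x_t):

```python
import numpy as np

def kalman_step(x_pred, Sigma_pred, z, F, G, H, Q, R):
    """From (x_{t|t-1}, Sigma_{t|t-1}) and a new observation z_t
    to (x_{t+1|t}, Sigma_{t+1|t}), following the review above."""
    Omega = H.T @ Sigma_pred @ H + R                 # Omega_{t|t-1}
    z_pred = H.T @ x_pred                            # z_{t|t-1}
    K = Sigma_pred @ H @ np.linalg.inv(Omega)        # Kalman gain K_t
    x_filt = x_pred + K @ (z - z_pred)               # x_{t|t}
    Sigma_filt = Sigma_pred - K @ H.T @ Sigma_pred   # Sigma_{t|t}
    x_next = F @ x_filt                              # x_{t+1|t}
    Sigma_next = F @ Sigma_filt @ F.T + G @ Q @ G.T  # Sigma_{t+1|t}
    return x_next, Sigma_next, z_pred, Omega
```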

  • Some Intuition about the optimal K_t

    Remember: K_t = Σ_{t|t-1} H (H' Σ_{t|t-1} H + R)^{-1}

    Notice that we can rewrite K_t in the following way:

    K_t = Σ_{t|t-1} H Ω_{t|t-1}^{-1}

    If we made a big mistake forecasting x_{t|t-1} using past information (Σ_{t|t-1} large), we give a lot of weight to the new information (K_t large).

    If the new information is noisy (R large), we give a lot of weight to the old prediction (K_t small).

  • A Probabilistic Approach to the Kalman Filter

    Assume:

    Z|w = [X'|w   Y'|w]' ~ N( [μ_x; μ_y], [Σ_xx Σ_xy; Σ_yx Σ_yy] )

    then:

    X|y, w ~ N( μ_x + Σ_xy Σ_yy^{-1}(y − μ_y),   Σ_xx − Σ_xy Σ_yy^{-1} Σ_yx )

    Also x_{t|t-1} ≡ E[x_t | z^{t-1}] and:

    Σ_{t|t-1} ≡ E[(x_t − x_{t|t-1})(x_t − x_{t|t-1})' | z^{t-1}]

  • Some Derivations I

    If z_t|z^{t-1} is the random variable z_t (observable) conditional on z^{t-1}, then:

    Let z_{t|t-1} ≡ E[z_t | z^{t-1}] = E[H'x_t + υ_t | z^{t-1}] = H' x_{t|t-1}

    Let

    Ω_{t|t-1} ≡ E[(z_t − z_{t|t-1})(z_t − z_{t|t-1})' | z^{t-1}]
        = E[ H'(x_t − x_{t|t-1})(x_t − x_{t|t-1})'H + υ_t(x_t − x_{t|t-1})'H
             + H'(x_t − x_{t|t-1})υ_t' + υ_t υ_t' | z^{t-1} ]
        = H' Σ_{t|t-1} H + R

  • Some Derivations II

    Finally, let

    E[(z_t − z_{t|t-1})(x_t − x_{t|t-1})' | z^{t-1}]
        = E[ (H'(x_t − x_{t|t-1}) + υ_t)(x_t − x_{t|t-1})' | z^{t-1} ]
        = H' Σ_{t|t-1}

  • The Kalman Filter First Iteration I

    Assume we know x_{1|0} and Σ_{1|0}; then

    (x_1; z_1) | z^0 ~ N( [x_{1|0}; H'x_{1|0}], [Σ_{1|0}  Σ_{1|0}H; H'Σ_{1|0}  H'Σ_{1|0}H + R] )

    Remember that:

    X|y, w ~ N( μ_x + Σ_xy Σ_yy^{-1}(y − μ_y),   Σ_xx − Σ_xy Σ_yy^{-1} Σ_yx )

  • The Kalman Filter First Iteration II

    Then, we can write:

    x_1 | z_1, z^0 = x_1 | z^1 ~ N(x_{1|1}, Σ_{1|1})

    where

    x_{1|1} = x_{1|0} + Σ_{1|0}H(H'Σ_{1|0}H + R)^{-1}(z_1 − H'x_{1|0})

    and

    Σ_{1|1} = Σ_{1|0} − Σ_{1|0}H(H'Σ_{1|0}H + R)^{-1}H'Σ_{1|0}

  • Therefore, we have that:

    z_{1|0} = H'x_{1|0}

    Ω_{1|0} = H'Σ_{1|0}H + R

    x_{1|1} = x_{1|0} + Σ_{1|0}H(H'Σ_{1|0}H + R)^{-1}(z_1 − H'x_{1|0})

    Σ_{1|1} = Σ_{1|0} − Σ_{1|0}H(H'Σ_{1|0}H + R)^{-1}H'Σ_{1|0}

    Also, since x_2 = F x_1 + G ω_2 and z_2 = H'x_2 + υ_2:

    x_{2|1} = F x_{1|1}

    Σ_{2|1} = F Σ_{1|1} F' + G Q G'

  • The Kalman Filter t-th Iteration I

    Assume we know x_{t|t-1} and Σ_{t|t-1}; then

    (x_t; z_t) | z^{t-1} ~ N( [x_{t|t-1}; H'x_{t|t-1}], [Σ_{t|t-1}  Σ_{t|t-1}H; H'Σ_{t|t-1}  H'Σ_{t|t-1}H + R] )

    Remember that:

    X|y, w ~ N( μ_x + Σ_xy Σ_yy^{-1}(y − μ_y),   Σ_xx − Σ_xy Σ_yy^{-1} Σ_yx )

  • The Kalman Filter t-th Iteration II

    Then, we can write:

    x_t | z_t, z^{t-1} = x_t | z^t ~ N(x_{t|t}, Σ_{t|t})

    where

    x_{t|t} = x_{t|t-1} + Σ_{t|t-1}H(H'Σ_{t|t-1}H + R)^{-1}(z_t − H'x_{t|t-1})

    and

    Σ_{t|t} = Σ_{t|t-1} − Σ_{t|t-1}H(H'Σ_{t|t-1}H + R)^{-1}H'Σ_{t|t-1}

  • The Kalman Filter Algorithm

    Given x_{t|t-1}, Σ_{t|t-1}, and observation z_t:

    Ω_{t|t-1} = H'Σ_{t|t-1}H + R

    z_{t|t-1} = H'x_{t|t-1}

    Σ_{t|t} = Σ_{t|t-1} − Σ_{t|t-1}H(H'Σ_{t|t-1}H + R)^{-1}H'Σ_{t|t-1}

    x_{t|t} = x_{t|t-1} + Σ_{t|t-1}H(H'Σ_{t|t-1}H + R)^{-1}(z_t − H'x_{t|t-1})

    Σ_{t+1|t} = FΣ_{t|t}F' + GQG'

    x_{t+1|t} = Fx_{t|t}

  • Putting the Minimization and the Probabilistic Approaches Together

    From the Minimization Approach we know that:

    x_{t|t} = x_{t|t-1} + K_t (z_t − H'x_{t|t-1})

    From the Probability Approach we know that:

    x_{t|t} = x_{t|t-1} + Σ_{t|t-1}H(H'Σ_{t|t-1}H + R)^{-1}(z_t − H'x_{t|t-1})

    But since:

    K_t = Σ_{t|t-1}H(H'Σ_{t|t-1}H + R)^{-1}

    we can also write, in the probabilistic approach:

    x_{t|t} = x_{t|t-1} + Σ_{t|t-1}H(H'Σ_{t|t-1}H + R)^{-1}(z_t − H'x_{t|t-1})
            = x_{t|t-1} + K_t (z_t − H'x_{t|t-1})

    Therefore, both approaches are equivalent.

  • Writing the Likelihood Function

    We want to write the likelihood function of z^T = {z_t}_{t=1}^T:

    log ℓ(z^T | F, G, H, Q, R) = Σ_{t=1}^T log ℓ(z_t | z^{t-1}; F, G, H, Q, R)

        = − Σ_{t=1}^T [ (N/2) log 2π + (1/2) log |Ω_{t|t-1}| ] − (1/2) Σ_{t=1}^T v_t' Ω_{t|t-1}^{-1} v_t

    where

    v_t = z_t − z_{t|t-1} = z_t − H'x_{t|t-1}

    Ω_{t|t-1} = H'Σ_{t|t-1}H + R
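    A minimal sketch combining the recursion and the formula above (Python with numpy; Z is a T × N array of observations, and x0, Sigma0 stand for the initial conditions x_{1|0} and Σ_{1|0} discussed next):

```python
import numpy as np

def kalman_log_likelihood(Z, F, G, H, Q, R, x0, Sigma0):
    """Gaussian log likelihood of Z computed with the Kalman filter,
    using the convention z_t = H'x_t as in the notes."""
    x_pred, Sigma_pred = x0, Sigma0
    loglik = 0.0
    for z in Z:
        Omega = H.T @ Sigma_pred @ H + R                # Omega_{t|t-1}
        v = z - H.T @ x_pred                            # prediction error v_t
        Omega_inv = np.linalg.inv(Omega)
        _, logdet = np.linalg.slogdet(Omega)
        loglik += -0.5 * (len(v) * np.log(2 * np.pi) + logdet
                          + v @ Omega_inv @ v)
        K = Sigma_pred @ H @ Omega_inv                  # Kalman gain K_t
        x_filt = x_pred + K @ v                         # x_{t|t}
        Sigma_filt = Sigma_pred - K @ H.T @ Sigma_pred  # Sigma_{t|t}
        x_pred = F @ x_filt                             # x_{t+1|t}
        Sigma_pred = F @ Sigma_filt @ F.T + G @ Q @ G.T
    return loglik
```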

  • Initial conditions for the Kalman Filter

    An important step in the Kalman filter is to set the initial conditions.

    Initial conditions:

    1. x_{1|0}

    2. Σ_{1|0}

    Where do they come from?

  • Since we only consider stable systems, the standard approach is to set:

    x_{1|0} = x̄

    Σ_{1|0} = Σ̄

    where x̄ and Σ̄ solve:

    x̄ = F x̄

    Σ̄ = F Σ̄ F' + G Q G'

    How do we find Σ̄?

    vec(Σ̄) = [I − F ⊗ F]^{-1} vec(GQG')
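    The vec formula is straightforward to implement. A sketch (Python with numpy), assuming a strictly stable F so that the unique fixed point of x̄ = F x̄ is the zero vector:

```python
import numpy as np

def unconditional_moments(F, G, Q):
    """Solve Sigma_bar = F Sigma_bar F' + G Q G' via
    vec(Sigma_bar) = (I - kron(F, F))^{-1} vec(G Q G')."""
    n = F.shape[0]
    GQG = G @ Q @ G.T
    vec_Sigma = np.linalg.solve(np.eye(n * n) - np.kron(F, F),
                                GQG.reshape(n * n, order="F"))
    Sigma_bar = vec_Sigma.reshape(n, n, order="F")
    x_bar = np.zeros(n)        # unique fixed point of x = F x when F is stable
    return x_bar, Sigma_bar
```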

  • Initial conditions for the Kalman Filter II

    Under the following conditions:

    1. The system is stable, i.e. all eigenvalues of F are strictly less than one in absolute value.

    2. GQG' and R are p.s.d. symmetric.

    3. Σ_{1|0} is p.s.d. symmetric.

    Then Σ_{t+1|t} → Σ̄.

  • Remarks

    1. There are more general theorems than the one just described.

    2. Those theorems are based on non-stable systems.

    3. Since we are going to work with stable systems, the former theorem is enough.

    4. The last theorem gives us a way to find Σ̄ as the limit of Σ_{t+1|t} for any Σ_{1|0} we start with.

  • The Kalman Filter and DSGE models

    Basic Real Business Cycle model:

    max E_0 Σ_{t=0}^∞ β^t { θ log c_t + (1 − θ) log(1 − l_t) }

    c_t + k_{t+1} = k_t^α (e^{z_t} l_t)^{1−α} + (1 − δ) k_t

    z_t = ρ z_{t-1} + ε_t,   ε_t ~ N(0, σ²_ε)

    Parameters: γ = {α, β, ρ, θ, δ, σ_ε}

  • Equilibrium Conditions

    1/c_t = β E_t [ (1/c_{t+1}) (1 − δ + α e^{z_{t+1}} k_{t+1}^{α−1} l_{t+1}^{1−α}) ]

    (1 − θ)/(1 − l_t) = (θ/c_t) (1 − α) e^{z_t} k_t^α l_t^{−α}

    c_t + k_{t+1} = e^{z_t} k_t^α l_t^{1−α} + (1 − δ) k_t

    z_t = ρ z_{t-1} + ε_t

  • A Special Case

    We set, unrealistically but rather usefully for our point, δ = 1.

    In this case, the model has two important and useful features:

    1. First, the income and the substitution effect from a productivity shock to labor supply exactly cancel each other. Consequently, l_t is constant and equal to:

    l_t = l = θ(1 − α) / ( θ(1 − α) + (1 − θ)(1 − αβ) )

    2. Second, the policy function for capital is k_{t+1} = αβ e^{z_t} k_t^α l^{1−α}.

  • A Special Case II

    The definition of k_{t+1} implies that c_t = (1 − αβ) e^{z_t} k_t^α l^{1−α}.

    Let us check whether the Euler equation holds:

    1/c_t = β E_t [ (1/c_{t+1}) α e^{z_{t+1}} k_{t+1}^{α−1} l^{1−α} ]

    1/((1 − αβ) e^{z_t} k_t^α l^{1−α}) = β E_t [ α e^{z_{t+1}} k_{t+1}^{α−1} l^{1−α} / ((1 − αβ) e^{z_{t+1}} k_{t+1}^α l^{1−α}) ]

    1/((1 − αβ) e^{z_t} k_t^α l^{1−α}) = E_t [ αβ / ((1 − αβ) k_{t+1}) ]

    Substituting k_{t+1} = αβ e^{z_t} k_t^α l^{1−α}, the right-hand side equals 1/((1 − αβ) e^{z_t} k_t^α l^{1−α}), so the Euler equation holds.

  • Let us check whether the intratemporal condition holds:

    (1 − θ)/(1 − l) = (θ/c_t)(1 − α) e^{z_t} k_t^α l^{−α}

    (1 − θ)/(1 − l) = θ(1 − α) e^{z_t} k_t^α l^{−α} / ((1 − αβ) e^{z_t} k_t^α l^{1−α})

    (1 − θ)/(1 − l) = θ(1 − α) / ((1 − αβ) l)

    (1 − θ)(1 − αβ) l = θ(1 − α)(1 − l)

    ( θ(1 − α) + (1 − θ)(1 − αβ) ) l = θ(1 − α)

    which is exactly the constant l defined above.

    Finally, the budget constraint holds because of the definition of c_t.

  • Transition Equation

    Since this policy function is linear in logs, we have the transition equation for the model:

    [1; log k_{t+1}; z_t] = [1 0 0; log(αβ l^{1−α}) α ρ; 0 0 ρ] [1; log k_t; z_{t-1}] + [0; 1; 1] ε_t

    Note the constant.

    Alternative formulations.

  • Measurement Equation

    As observables, we assume log y_t and log i_t, subject to a linearly additive measurement error V_t = (v_{1,t}, v_{2,t})'.

    Let V_t ~ N(0, Λ), where Λ is a diagonal matrix with σ²_1 and σ²_2 as diagonal elements.

    Why measurement error? Stochastic singularity.

    Then (since with δ = 1, i_t = k_{t+1} = αβ y_t):

    [log y_t; log i_t] = [−log(αβ) 1 0; 0 1 0] [1; log k_{t+1}; z_t] + [v_{1,t}; v_{2,t}]

  • The Solution to the Model in State Space Form

    x_t = [1; log k_{t+1}; z_t],   z_t = [log y_t; log i_t]

    F = [1 0 0; log(αβ l^{1−α}) α ρ; 0 0 ρ],   G = [0; 1; 1],   Q = σ²_ε

    H' = [−log(αβ) 1 0; 0 1 0],   R = Λ
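    To make the mapping concrete, here is a hedged sketch (Python with numpy; the parameter values are purely illustrative, not the calibration used later) that builds F, G, H, Q, and R for the δ = 1 special case. These arrays can then be passed to the kalman_log_likelihood sketch given earlier to evaluate the likelihood of data on (log y_t, log i_t).

```python
import numpy as np

# Illustrative parameter values (assumptions for this sketch only).
alpha, beta, rho, theta, sigma_eps = 0.4, 0.99, 0.95, 0.357, 0.007
sigma1, sigma2 = 1e-4, 1e-4          # measurement error std. deviations

# Constant labor supply in the delta = 1 special case.
l = theta * (1 - alpha) / (theta * (1 - alpha)
                           + (1 - theta) * (1 - alpha * beta))

F = np.array([[1.0, 0.0, 0.0],
              [np.log(alpha * beta * l ** (1 - alpha)), alpha, rho],
              [0.0, 0.0, rho]])
G = np.array([[0.0], [1.0], [1.0]])
Q = np.array([[sigma_eps ** 2]])
H = np.array([[-np.log(alpha * beta), 0.0],   # columns of H, so z_t = H'x_t
              [1.0, 1.0],
              [0.0, 0.0]])
R = np.diag([sigma1 ** 2, sigma2 ** 2])
```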

  • The Solution to the Model in State Space Form III

    Now, using z^T, F, G, H, Q, and R as defined in the last slide...

    ...we can use the Riccati equations to compute the likelihood function of the model:

    log ℓ(z^T | F, G, H, Q, R)

    Cross-equation restrictions implied by the equilibrium solution.

    With the likelihood, we can do inference!

  • What Do We Do if δ ≠ 1?

    We have two options:

    First, we could linearize or log-linearize the model and apply the Kalman filter.

    Second, we could compute the likelihood function of the model using a nonlinear filter (particle filter).

    Advantages and disadvantages.

    Fernández-Villaverde, Rubio-Ramírez, and Santos (2005).

  • The Kalman Filter and Linearized DSGE Models

    We linearize (or log-linearize) around the steady state.

    We assume that we have data on log output (log y_t), log hours (log l_t), and log consumption (log c_t), subject to a linearly additive measurement error V_t = (v_{1,t}, v_{2,t}, v_{3,t})'.

    We need to write the model in state space form. Remember that:

    k̂_{t+1} = P k̂_t + Q z_t   and   l̂_t = R k̂_t + S z_t

  • Writing the Likelihood Function I

    The transition equation:

    [1; k̂_{t+1}; z_{t+1}] = [1 0 0; 0 P Q; 0 0 ρ] [1; k̂_t; z_t] + [0; 0; 1] ε_{t+1}

    The measurement equation requires some care.

  • Writing the Likelihood Function II

    Notice that ŷ_t = z_t + α k̂_t + (1 − α) l̂_t. Therefore, using l̂_t = R k̂_t + S z_t:

    ŷ_t = z_t + α k̂_t + (1 − α)(R k̂_t + S z_t) = (α + (1 − α)R) k̂_t + (1 + (1 − α)S) z_t

    Also, since ĉ_t = −α_5 l̂_t + z_t + α k̂_t, and using again l̂_t = R k̂_t + S z_t:

    ĉ_t = z_t + α k̂_t − α_5 (R k̂_t + S z_t) = (α − α_5 R) k̂_t + (1 − α_5 S) z_t

  • Writing the Likelihood Function III

    Therefore, the measurement equation is:

    [log y_t; log l_t; log c_t] = [log y   α + (1 − α)R   1 + (1 − α)S;
                                  log l   R              S;
                                  log c   α − α_5 R      1 − α_5 S] [1; k̂_t; z_t] + [v_{1,t}; v_{2,t}; v_{3,t}]

  • The Likelihood Function of a General Dynamic Equilibrium Economy

    Transition equation:

    S_t = f(S_{t-1}, W_t; γ)

    Measurement equation:

    Y_t = g(S_t, V_t; γ)

    Interpretation.

  • Some Assumptions

    1. We can partition {W_t} into two independent sequences {W_{1,t}} and {W_{2,t}}, s.t. W_t = (W_{1,t}, W_{2,t}) and dim(W_{2,t}) + dim(V_t) ≥ dim(Y_t).

    2. We can always evaluate the conditional densities p(y_t | W_1^t, y^{t-1}, S_0; γ). Lubik and Schorfheide (2003).

    3. The model assigns positive probability to the data.

  • Our Goal: Likelihood Function

    Evaluate the likelihood function of a sequence of realizations of the observable y^T at a particular parameter value γ:

    p(y^T; γ)

    We factorize it as:

    p(y^T; γ) = ∏_{t=1}^T p(y_t | y^{t-1}; γ)

              = ∏_{t=1}^T ∫∫ p(y_t | W_1^t, y^{t-1}, S_0; γ) p(W_1^t, S_0 | y^{t-1}; γ) dW_1^t dS_0

  • A Law of Large Numbers

    If { {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N }_{t=1}^T are N i.i.d. draws from {p(W_1^t, S_0 | y^{t-1}; γ)}_{t=1}^T, then:

    p(y^T; γ) ≃ ∏_{t=1}^T (1/N) Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ)

  • ...thus

    The problem of evaluating the likelihood is equivalent to the problem of drawing from {p(W_1^t, S_0 | y^{t-1}; γ)}_{t=1}^T.

  • Introducing Particles

    {s_0^{t-1,i}, w_1^{t-1,i}}_{i=1}^N: N i.i.d. draws from p(W_1^{t-1}, S_0 | y^{t-1}; γ).

    Each (s_0^{t-1,i}, w_1^{t-1,i}) is a particle and {s_0^{t-1,i}, w_1^{t-1,i}}_{i=1}^N a swarm of particles.

    {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N: N i.i.d. draws from p(W_1^t, S_0 | y^{t-1}; γ).

    Each (s_0^{t|t-1,i}, w_1^{t|t-1,i}) is a proposed particle and {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N a swarm of proposed particles.

  • ... and Weights

    q_t^i = p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ) / Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ)

  • A Proposition

    Let {s̃_0^i, w̃_1^i}_{i=1}^N be a draw with replacement from {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N with probabilities q_t^i. Then {s̃_0^i, w̃_1^i}_{i=1}^N is a draw from p(W_1^t, S_0 | y^t; γ).

  • Importance of the Proposition

    1. It shows how a draw {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N from p(W_1^t, S_0 | y^{t-1}; γ) can be used to draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N from p(W_1^t, S_0 | y^t; γ).

    2. With a draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N from p(W_1^t, S_0 | y^t; γ) we can use p(W_{1,t+1}; γ) to get a draw {s_0^{t+1|t,i}, w_1^{t+1|t,i}}_{i=1}^N and iterate the procedure.

  • Sequential Monte Carlo I: Filtering

    Step 0, Initialization: Set t = 1 and initialize p(W_1^{t-1}, S_0 | y^{t-1}; γ) = p(S_0; γ).

    Step 1, Prediction: Sample N values {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N from the density p(W_1^t, S_0 | y^{t-1}; γ) = p(W_{1,t}; γ) p(W_1^{t-1}, S_0 | y^{t-1}; γ).

    Step 2, Weighting: Assign to each draw (s_0^{t|t-1,i}, w_1^{t|t-1,i}) the weight q_t^i.

    Step 3, Sampling: Draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N with replacement from {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N with probabilities {q_t^i}_{i=1}^N. If t < T, set t = t + 1 and go to step 1. Otherwise stop.

  • Sequential Monte Carlo II: Likelihood

    Use { {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N }_{t=1}^T to compute:

    p(y^T; γ) ≃ ∏_{t=1}^T (1/N) Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ)

  • A Trivial Application

    How do we evaluate the likelihood function p(y^T | α, β, σ) of the nonlinear, nonnormal process:

    s_t = α + β s_{t-1}/(1 + s_{t-1}) + w_t

    y_t = s_t + v_t

    where w_t ~ N(0, σ) and v_t ~ t(2), given some observables y^T = {y_t}_{t=1}^T and s_0?

  • 1. Let s_0^{0,i} = s_0 for all i.

    2. Generate N i.i.d. draws {s_0^{1|0,i}, w^{1|0,i}}_{i=1}^N from N(0, σ).

    3. Evaluate

    p(y_1 | w_1^{1|0,i}, y^0, s_0^{1|0,i}) = p_{t(2)}( y_1 − (α + β s_0^{1|0,i}/(1 + s_0^{1|0,i}) + w^{1|0,i}) )

    4. Evaluate the relative weights

    q_1^i = p_{t(2)}( y_1 − (α + β s_0^{1|0,i}/(1 + s_0^{1|0,i}) + w^{1|0,i}) ) / Σ_{i=1}^N p_{t(2)}( y_1 − (α + β s_0^{1|0,i}/(1 + s_0^{1|0,i}) + w^{1|0,i}) )

  • 5. Resample with replacement N values of {s_0^{1|0,i}, w^{1|0,i}}_{i=1}^N with relative weights q_1^i. Call those sampled values {s_0^{1,i}, w^{1,i}}_{i=1}^N.

    6. Go to step 1, and iterate 1-4 until the end of the sample.
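    A compact implementation of these steps (Python with numpy and scipy; a sketch only). It carries the current state s_t of each particle rather than the full histories, treats σ as a standard deviation, and the parameter values passed to it are up to the user:

```python
import numpy as np
from scipy import stats

def particle_filter_loglik(y, alpha, beta, sigma, s0, N=10000, seed=0):
    """Particle filter approximation to log p(y^T | alpha, beta, sigma) for
    s_t = alpha + beta * s_{t-1}/(1+s_{t-1}) + w_t,  w_t ~ N(0, sigma),
    y_t = s_t + v_t,  v_t ~ t(2)."""
    rng = np.random.default_rng(seed)
    s = np.full(N, float(s0))              # step 1: s_0^{0,i} = s_0
    loglik = 0.0
    for yt in y:
        w = rng.normal(0.0, sigma, N)      # step 2: draw the w shocks
        s_prop = alpha + beta * s / (1 + s) + w      # proposed states
        dens = stats.t.pdf(yt - s_prop, df=2)        # step 3: p_{t(2)}(y_t - ...)
        loglik += np.log(dens.mean())      # likelihood contribution for period t
        q = dens / dens.sum()              # step 4: relative weights
        s = rng.choice(s_prop, size=N, p=q)          # step 5: resample
    return loglik
```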

  • A Law of Large Numbers

    A law of large numbers delivers:

    p(y_1 | y^0, α, β, σ) ≃ (1/N) Σ_{i=1}^N p(y_1 | w_1^{1|0,i}, y^0, s_0^{1|0,i})

    and consequently:

    p(y^T | α, β, σ) ≃ ∏_{t=1}^T (1/N) Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i})

  • Comparison with Alternative Schemes

    Deterministic algorithms: Extended Kalman Filter and derivations (Jazwinski, 1973), Gaussian Sum approximations (Alspach and Sorenson, 1972), grid-based filters (Bucy and Senne, 1974), Jacobian of the transform (Miranda and Rui, 1997).

    Tanizaki (1996).

    Simulation algorithms: Kitagawa (1987), Gordon, Salmond and Smith (1993), Mariano and Tanizaki (1995) and Geweke and Tanizaki (1999).

  • A Real Application: the Stochastic Neoclassical Growth Model

    Standard model.

    Isn't the model nearly linear?

    Yes, but:

    1. Better to begin with something easy.

    2. We will learn something nevertheless.

  • The Model

    Representative agent with utility function U = E_0 Σ_{t=0}^∞ β^t (c_t^θ (1 − l_t)^{1−θ})^{1−τ} / (1 − τ).

    One good produced according to y_t = e^{z_t} A k_t^α l_t^{1−α} with α ∈ (0, 1).

    Productivity evolves z_t = ρ z_{t-1} + ε_t, |ρ| < 1 and ε_t ~ N(0, σ_ε).

    Law of motion for capital k_{t+1} = i_t + (1 − δ) k_t.

    Resource constraint c_t + i_t = y_t.

  • Solve for c(·, ·) and l(·, ·) given initial conditions.

    Characterized by:

    U_c(t) = β E_t [ U_c(t+1) (1 − δ + α A e^{z_{t+1}} k_{t+1}^{α−1} l(k_{t+1}, z_{t+1})^{1−α}) ]

    (1 − θ)/θ · c(k_t, z_t)/(1 − l(k_t, z_t)) = (1 − α) e^{z_t} A k_t^α l(k_t, z_t)^{−α}

    A system of functional equations with no known analytical solution.

  • Solving the Model

    We need to use a numerical method to solve it.

    Different nonlinear approximations: value function iteration, perturbation, projection methods.

    We use a Finite Element Method. Why? Aruoba, Fernández-Villaverde and Rubio-Ramírez (2003):

    1. Speed: sparse system.

    2. Accuracy: flexible grid generation.

    3. Scalable.

  • Building the Likelihood Function

    Time series:

    1. Quarterly real output, hours worked and investment.

    2. Main series from the model and keep dimensionality low.

    Measurement error. Why?

    γ = (θ, ρ, τ, α, δ, β, σ_ε, σ_1, σ_2, σ_3)

  • State Space Representation

    k_t = f_1(S_{t-1}, W_t; γ)
        = e^{tanh^{-1}(λ_{t-1})} k_{t-1}^α l(k_{t-1}, tanh^{-1}(λ_{t-1}); γ)^{1−α}
          × [ 1 − (θ/(1 − θ)) (1 − α) (1 − l(k_{t-1}, tanh^{-1}(λ_{t-1}); γ)) / l(k_{t-1}, tanh^{-1}(λ_{t-1}); γ) ]
          + (1 − δ) k_{t-1}

    λ_t = f_2(S_{t-1}, W_t; γ) = tanh( ρ tanh^{-1}(λ_{t-1}) + ε_t )

    gdp_t = g_1(S_t, V_t; γ) = e^{tanh^{-1}(λ_t)} k_t^α l(k_t, tanh^{-1}(λ_t); γ)^{1−α} + V_{1,t}

    hours_t = g_2(S_t, V_t; γ) = l(k_t, tanh^{-1}(λ_t); γ) + V_{2,t}

    inv_t = g_3(S_t, V_t; γ)
        = e^{tanh^{-1}(λ_t)} k_t^α l(k_t, tanh^{-1}(λ_t); γ)^{1−α}
          × [ 1 − (θ/(1 − θ)) (1 − α) (1 − l(k_t, tanh^{-1}(λ_t); γ)) / l(k_t, tanh^{-1}(λ_t); γ) ]
          + V_{3,t}

    Likelihood Function

  • Since our measurement equation implies that

    p(y_t | S_t; γ) = (2π)^{−3/2} |Σ_v|^{−1/2} e^{−ω(S_t; γ)/2}

    where ω(S_t; γ) = (y_t − x(S_t; γ))' Σ_v^{−1} (y_t − x(S_t; γ)) ∀t, we have

    p(y^T; γ) = (2π)^{−3T/2} |Σ_v|^{−T/2} ∫ ∏_{t=1}^T [ ∫ e^{−ω(S_t; γ)/2} p(S_t | y^{t-1}, S_0; γ) dS_t ] p(S_0; γ) dS_0

        ≃ (2π)^{−3T/2} |Σ_v|^{−T/2} ∏_{t=1}^T (1/N) Σ_{i=1}^N e^{−ω(s_t^i; γ)/2}

  • Priors for the Parameters

    Priors for the Parameters of the Model

    Parameter   Distribution   Hyperparameters
    θ           Uniform        [0, 1]
    ρ           Uniform        [0, 1]
    τ           Uniform        [0, 100]
    α           Uniform        [0, 1]
    δ           Uniform        [0, 0.05]
    β           Uniform        [0.75, 1]
    σ_ε         Uniform        [0, 0.1]
    σ_1         Uniform        [0, 0.1]
    σ_2         Uniform        [0, 0.1]
    σ_3         Uniform        [0, 0.1]

  • Likelihood-Based Inference I: a Bayesian Perspective

    Define priors over parameters: truncated uniforms.

    Use a Random-walk Metropolis-Hastings to draw from the posterior.

    Find the Marginal Likelihood.


  • Likelihood-Based Inference II: a Maximum Likelihood Perspective

    We only need to maximize the likelihood.

    Difficulties to maximize with Newton-type schemes.

    Common problem in dynamic equilibrium economies.

    We use a simulated annealing scheme.

  • An Exercise with Artificial Data

    First simulate data with our model and use that data as sample.

    Pick true parameter values. Benchmark calibration values for the stochastic neoclassical growth model (Cooley and Prescott, 1995).

    Calibrated Parameters

    Parameter   θ       ρ      τ     α     δ
    Value       0.357   0.95   2.0   0.4   0.02

    Parameter   β       σ_ε     σ_1         σ_2      σ_3
    Value       0.99    0.007   1.58×10⁻⁴   0.0011   8.66×10⁻⁴

    Sensitivity: τ = 50 and σ_ε = 0.035.

  • Figure 5.1: Likelihood Function, Benchmark Calibration. [Figure: likelihood cuts at each parameter (ρ, τ, α, δ, σ_ε, β, θ), comparing the nonlinear and linear filters against the pseudotrue values.]

  • Figure 5.2: Posterior Distribution, Benchmark Calibration. [Figure: histograms of the posterior distribution of each parameter.]

  • Figure 5.3: Likelihood Function, Extreme Calibration. [Figure: likelihood cuts at each parameter, comparing the nonlinear and linear filters against the pseudotrue values.]

  • Figure 5.4: Posterior Distribution, Extreme Calibration. [Figure: histograms of the posterior distribution of each parameter.]

  • Figure 5.5: Convergence of Posteriors, Extreme Calibration. [Figure: parameter draws plotted against the number of draws for each parameter.]

  • Figure 5.6: Posterior Distribution, Real Data. [Figure: histograms of the posterior distribution of each parameter.]

  • Figure 6.1: Likelihood Function. [Figure: transversal cuts of the likelihood at α, β, ρ, and σ_ε, comparing the exact likelihood with particle filter approximations using 100, 1,000, and 10,000 particles.]

  • Figure 6.2: C.D.F., Benchmark Calibration. [Figure: empirical c.d.f.s for 10,000 to 60,000 particles.]

  • Figure 6.3: C.D.F., Extreme Calibration. [Figure: empirical c.d.f.s for 10,000 to 60,000 particles.]

  • Figure 6.4: C.D.F., Real Data. [Figure: empirical c.d.f.s for 10,000 to 60,000 particles.]

  • Posterior Distributions, Benchmark Calibration

    Parameter   Mean        s.d.
    θ           0.357       6.72×10⁻⁵
    ρ           0.950       3.40×10⁻⁴
    τ           2.000       6.78×10⁻⁴
    α           0.400       8.60×10⁻⁵
    δ           0.020       1.34×10⁻⁵
    β           0.989       1.54×10⁻⁵
    σ_ε         0.007       9.29×10⁻⁶
    σ_1         1.58×10⁻⁴   5.75×10⁻⁸
    σ_2         1.12×10⁻³   6.44×10⁻⁷
    σ_3         8.64×10⁻⁴   6.49×10⁻⁷

  • Maximum Likelihood Estimates, Benchmark Calibration

    Parameter   MLE         s.d.
    θ           0.357       8.19×10⁻⁶
    ρ           0.950       0.001
    τ           2.000       0.020
    α           0.400       2.02×10⁻⁶
    δ           0.002       2.07×10⁻⁵
    β           0.990       1.00×10⁻⁶
    σ_ε         0.007       0.004
    σ_1         1.58×10⁻⁴   0.007
    σ_2         1.12×10⁻³   0.007
    σ_3         8.63×10⁻⁴   0.005

  • Posterior Distributions, Extreme Calibration

    Parameter   Mean        s.d.
    θ           0.357       7.19×10⁻⁴
    ρ           0.950       1.88×10⁻⁴
    τ           50.00       7.12×10⁻³
    α           0.400       4.80×10⁻⁵
    δ           0.020       3.52×10⁻⁶
    β           0.989       8.69×10⁻⁶
    σ_ε         0.035       4.47×10⁻⁶
    σ_1         1.58×10⁻⁴   1.87×10⁻⁸
    σ_2         1.12×10⁻³   2.14×10⁻⁷
    σ_3         8.65×10⁻⁴   2.33×10⁻⁷

  • Maximum Likelihood Estimates, Extreme Calibration

    Parameter   MLE         s.d.
    θ           0.357       2.42×10⁻⁶
    ρ           0.950       6.12×10⁻³
    τ           50.000      0.022
    α           0.400       3.62×10⁻⁷
    δ           0.019       7.43×10⁻⁶
    β           0.990       1.00×10⁻⁵
    σ_ε         0.035       0.015
    σ_1         1.58×10⁻⁴   0.017
    σ_2         1.12×10⁻³   0.014
    σ_3         8.66×10⁻⁴   0.023

  • Convergence on Number of Particles

    Convergence, Real Data

    N        Mean        s.d.
    10000    1014.558    0.3296
    20000    1014.600    0.2595
    30000    1014.653    0.1829
    40000    1014.666    0.1604
    50000    1014.688    0.1465
    60000    1014.664    0.1347

  • Posterior Distributions, Real Data

    Parameter   Mean    s.d.
    θ           0.323   7.976×10⁻⁴
    ρ           0.969   0.008
    τ           1.825   0.011
    α           0.388   0.001
    δ           0.006   3.557×10⁻⁵
    β           0.997   9.221×10⁻⁵
    σ_ε         0.023   2.702×10⁻⁴
    σ_1         0.039   5.346×10⁻⁴
    σ_2         0.018   4.723×10⁻⁴
    σ_3         0.034   6.300×10⁻⁴

  • Maximum Likelihood Estimates, Real Data

    Parameter   MLE     s.d.
    θ           0.390   0.044
    ρ           0.987   0.708
    τ           1.781   1.398
    α           0.324   0.019
    δ           0.006   0.160
    β           0.997   8.67×10⁻³
    σ_ε         0.023   0.224
    σ_1         0.038   0.060
    σ_2         0.016   0.061
    σ_3         0.035   0.076

  • Log Marginal Likelihood Difference: Nonlinear − Linear

    p     Benchmark Calibration   Extreme Calibration   Real Data
    0.1   73.631                  117.608               93.65
    0.5   73.627                  117.592               93.55
    0.9   73.603                  117.564               93.55

  • Nonlinear versus Linear Moments, Real Data

                Real Data         Nonlinear (SMC filter)   Linear (Kalman filter)
                Mean    s.d.      Mean    s.d.             Mean    s.d.
    output      1.95    0.073     1.91    0.129            1.61    0.068
    hours       0.36    0.014     0.36    0.023            0.34    0.004
    inv         0.42    0.066     0.44    0.073            0.28    0.044

  • A Future Application: Good Luck or Good Policy?

    The U.S. economy has become less volatile over the last 20 years (Stock and Watson, 2002).

    Why?

    1. Good luck: Sims (1999), Bernanke and Mihov (1998a and 1998b) and Stock and Watson (2002).

    2. Good policy: Clarida, Gertler and Galí (2000), Cogley and Sargent (2001 and 2003), De Long (1997) and Romer and Romer (2002).

    3. Long run trend: Blanchard and Simon (2001).

  • How Has the Literature Addressed this Question?

    So far: mostly with reduced form models (usually VARs).

    But:

    1. Results difficult to interpret.

    2. How to run counterfactuals?

    3. Welfare analysis.

  • Why Not a Dynamic Equilibrium Model?

    New generation of equilibrium models: Christiano, Eichenbaum and Evans (2003) and Smets and Wouters (2003).

    Linear and Normal.

    But we can do it!!!

  • Environment

    Discrete time t = 0, 1, ...

    Stochastic process s_t ∈ S with history s^t = (s_0, ..., s_t) and probability μ(s^t).

  • The Final Good Producer

    Perfectly competitive final good producer that solves:

    max_{y_i(s^t)}  ( ∫ y_i(s^t)^θ di )^{1/θ} − ∫ p_i(s^t) y_i(s^t) di

    Demand function for each input of the form:

    y_i(s^t) = ( p_i(s^t) / p(s^t) )^{1/(θ−1)} y(s^t)

    with price aggregator:

    p(s^t) = ( ∫ p_i(s^t)^{θ/(θ−1)} di )^{(θ−1)/θ}

  • The Intermediate Good Producer

    Continuum of intermediate good producers, each one behaving as a monopolistic competitor.

    The producer of good i has access to the technology:

    y_i(s^t) = max{ e^{z(s^t)} k_i(s^{t-1})^α l_i(s^t)^{1−α}, 0 }

    Productivity z(s^t) = ρ z(s^{t-1}) + ε_z(s^t).

    Calvo pricing with indexing: prices can only be changed (before observing the current period shocks) with a fixed probability each period.

  • Consumers' Problem

    max E Σ_{t=0}^∞ β^t [ ε_c(s^t) (c(s^t) − d c(s^{t-1}))^{σ_c}/σ_c − ε_l(s^t) l(s^t)^{σ_l}/σ_l + ε_m(s^t) m(s^t)^{σ_m}/σ_m ]

    subject to

    p(s^t)(c(s^t) + x(s^t)) + M(s^t) + ∫_{s^{t+1}} q(s^{t+1}|s^t) B(s^{t+1}) ds^{t+1}
        = p(s^t)(w(s^t) l(s^t) + r(s^t) k(s^{t-1})) + M(s^{t-1}) + B(s^t) + Π(s^t) + T(s^t)

    B(s^{t+1}) ≥ B̲

    k(s^t) = (1 − δ) k(s^{t-1}) − φ(x(s^t)/k(s^{t-1})) k(s^{t-1}) + x(s^t)

  • Government Policy

    Monetary Policy: Taylor rule

    i(s^t) = r g(s^t) + a(s^t)(π(s^t) − g(s^t)) + b(s^t)(y(s^t) − y^g(s^t)) + ε_i(s^t)

    g(s^t) = ρ_g g(s^{t-1}) + ε_π(s^t)

    a(s^t) = ρ_a a(s^{t-1}) + ε_a(s^t)

    b(s^t) = ρ_b b(s^{t-1}) + ε_b(s^t)

    Fiscal Policy.

  • Stochastic Volatility I

    We can stack all shocks in one vector:

    ε(s^t) = ( ε_z(s^t), ε_c(s^t), ε_l(s^t), ε_m(s^t), ε_i(s^t), ε_π(s^t), ε_a(s^t), ε_b(s^t) )'

    Stochastic volatility:

    ε(s^t) = R(s^t)^{0.5} η(s^t)

    The matrix R(s^t) can be decomposed as:

    R(s^t) = G(s^t)^{-1} H(s^t) G(s^t)^{-1}'

  • Stochastic Volatility II

    H(s^t) (instantaneous shock variances) is diagonal with nonzero elements h_i(s^t) that evolve:

    log h_i(s^t) = ρ_h log h_i(s^{t-1}) + ς_i η_i(s^t)

    G(s^t) (loading matrix) is lower triangular, with unit entries in the diagonal and entries γ_ij(s^t) that evolve:

    γ_ij(s^t) = ρ_γ γ_ij(s^{t-1}) + ς_ij η_ij(s^t)

  • Where Are We Now?

    Solving the model: a problem with 45 state variables: physical capital, the aggregate price level, 7 shocks, the 8 elements of the matrix H(s^t), and the 28 elements of the matrix G(s^t).

    Perturbation.

    We are making good progress.