Panel Data III


PANEL DATA WORKSHOP

BRUNEL UNIVERSITY

February 29, 2008.

PART II:

STATIC AND DYNAMIC PANEL DATA MODELLING



Static and dynamic panel data modelling

Presentation outline

1. Introduction

2. Static panel data models

3. Empirical example

4. Dynamic panel data models

5. Empirical example

6. Further considerations

7. Summary



1. Introduction

Suppose the aim is to establish the link between domestic firms' profitability and foreign direct investment (foreign finance) by multinational enterprises (MNEs).

MNEs are at the frontier of technology and management practices and have vast marketing resources. Thus it is reasonable to expect that domestic firms receiving foreign finance will increase their profitability.

The regression analysis will typically require data on firms' profits and the amount of foreign finance they have attracted. These variables are observable to the researcher.

Profitability, however, is also affected by several firm-level characteristics that are unobservable to the analyst.



These characteristics include managerial ability, political and business connections, and the happiness of the workforce, and are referred to as firm heterogeneity (or firm-specific effects).

Firm heterogeneity is part of the error term of the model since it is unobservable.

It is assumed to be constant over a reasonably short span of time.

Firm heterogeneity affects not only profits but also foreign finance. It can be argued that MNEs are likely to invest in firms with more able managers and useful connections.

This creates correlation between the error term and the regressor, i.e. an endogeneity problem.



We saw that the method of instrumental variables can be used to tackle the problem of endogeneity.

Valid instruments, however, are not always easy to come by.

One of the main advantages of panel data is the ability to control for the problem of endogeneity without the necessity of getting external instruments (in other words, without the need for additional data).

The objective of this lecture is to demonstrate this advantage of panel data using an example from finance.

In particular, static and dynamic linear panel data models will be considered.


2. Static panel data models


For simplicity, consider the following static panel data model with a single explanatory variable:

$y_{it} = \beta x_{it} + f_i + \varepsilon_{it}$    (1)

where i and t index firms and time periods respectively, y is profits, x is foreign finance, $f_i$ is firm heterogeneity and ε is the error term.

The problem with estimating equation (1) by OLS is that the individual heterogeneity $f_i$ is likely to be correlated with x, i.e. $\mathrm{Cov}(x_{it}, f_i) \neq 0$.

$f_i$ is also called the fixed effect or correlated effect.

The panel data solution to the problem of correlated effects is to eliminate them from the model by transforming the data.




There are two methods of transforming the data to eliminate correlated effects.

Method I: The within transformation

Step 1: For each firm i, average equation (1) over time as

$\bar{y}_i = \beta \bar{x}_i + f_i + \bar{\varepsilon}_i$    (2)

where $\bar{y}_i = T^{-1} \sum_{t=1}^{T} y_{it}$, and so on.

Note that because $f_i$ does not change over time it appears in both (1) and (2).

Step 2: Subtract (2) from (1) to eliminate $f_i$ and obtain the following model:




$y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$    (3)

Step 3: Estimate equation (3) by OLS.

The resulting estimator of β is called the within (or fixed effect) estimator.

This estimator is free of the endogeneity bias caused by $f_i$ because the correlated effect no longer appears in the transformed model.

A drawback of this transformation is that it eliminates all variables that are time-invariant.

For example, if equation (1) includes the gender of the manager and the location of the firm as regressors, these variables will drop out of the model.

Another drawback is that only the within variability of the variables is used (i.e. between variability is neglected).
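The three steps of the within transformation can be sketched on simulated data. This is only an illustration under stated assumptions: the data-generating process (true β = 0.5, x equal to the firm effect plus noise, 500 firms, 5 periods) and all function names are hypothetical and are not taken from the workshop data.

```python
import random

def simulate_panel(n_firms=500, n_periods=5, beta=0.5, seed=0):
    """Simulate y_it = beta*x_it + f_i + e_it, with x_it correlated with f_i."""
    rng = random.Random(seed)
    panel = []  # one (x_list, y_list) pair per firm
    for _ in range(n_firms):
        f = rng.gauss(0, 1)  # unobserved firm heterogeneity
        xs = [f + rng.gauss(0, 1) for _ in range(n_periods)]  # x depends on f
        ys = [beta * x + f + rng.gauss(0, 1) for x in xs]
        panel.append((xs, ys))
    return panel

def slope(xs, ys):
    """OLS slope for already-demeaned data: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def pooled_ols(panel):
    """Pooled OLS with an intercept; f_i is left in the error term."""
    all_x = [x for xs, _ in panel for x in xs]
    all_y = [y for _, ys in panel for y in ys]
    mx, my = sum(all_x) / len(all_x), sum(all_y) / len(all_y)
    return slope([x - mx for x in all_x], [y - my for y in all_y])

def within_estimator(panel):
    """Subtract firm-specific time averages, then run OLS on the deviations."""
    dx, dy = [], []
    for xs, ys in panel:
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        dx += [x - mx for x in xs]
        dy += [y - my for y in ys]
    return slope(dx, dy)

panel = simulate_panel()
b_pooled = pooled_ols(panel)
b_within = within_estimator(panel)
print(f"pooled OLS: {b_pooled:.3f}   within: {b_within:.3f}")
```

In this design the pooled OLS slope should sit well above the true β because the firm effect is absorbed into the error term, while the within estimate should be close to 0.5.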




Method II: The first-difference transformation

Step 1: For each firm i, lag equation (1) by one time period as

$y_{i,t-1} = \beta x_{i,t-1} + f_i + \varepsilon_{i,t-1}$    (4)

Note that because $f_i$ does not change over time it appears in both (1) and (4).

Step 2: Subtract (4) from (1) to eliminate $f_i$ and obtain the following first-differenced model:

$\Delta y_{it} = \beta \Delta x_{it} + \Delta\varepsilon_{it}$    (5)

Step 3: Estimate equation (5) by OLS. The resulting estimator of β will also be free of the endogeneity bias caused by $f_i$.




When estimating panel data models by OLS following the within and first-difference transformations, it is vital to adjust the standard errors for serial correlation in the transformed error term within each i.

For example, the transformed error terms in equation (5) for time periods t=3 and t=4 are

$\Delta\varepsilon_{i3} = \varepsilon_{i3} - \varepsilon_{i2}$ and $\Delta\varepsilon_{i4} = \varepsilon_{i4} - \varepsilon_{i3}$.

It is easy to see that the two error terms are correlated since they have a common element in $\varepsilon_{i3}$.

Most econometric packages can adjust the OLS standard errors for serial correlation in the transformed model.

When T=2, the within and first-difference estimators coincide.
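Both claims on this slide can be checked numerically. The sketch below uses simulated data under assumed settings (true β = 0.5, standard normal errors; all names are hypothetical): part (a) compares the within and first-difference estimators when T = 2, and part (b) estimates the correlation between the differenced errors at t=3 and t=4 when ε itself is serially uncorrelated.

```python
import random

rng = random.Random(1)
beta = 0.5

# (a) T = 2: within and first-difference estimators give the same slope
num_w = den_w = num_fd = den_fd = 0.0
for _ in range(200):
    f = rng.gauss(0, 1)  # firm effect, correlated with x
    x1, x2 = f + rng.gauss(0, 1), f + rng.gauss(0, 1)
    y1 = beta * x1 + f + rng.gauss(0, 1)
    y2 = beta * x2 + f + rng.gauss(0, 1)
    mx, my = (x1 + x2) / 2, (y1 + y2) / 2
    # within: deviations from firm means, two observations per firm
    num_w += (x1 - mx) * (y1 - my) + (x2 - mx) * (y2 - my)
    den_w += (x1 - mx) ** 2 + (x2 - mx) ** 2
    # first difference: one observation per firm
    num_fd += (x2 - x1) * (y2 - y1)
    den_fd += (x2 - x1) ** 2
b_within, b_fd = num_w / den_w, num_fd / den_fd
print(f"within: {b_within:.6f}   first difference: {b_fd:.6f}")

# (b) differenced errors at t=3 and t=4 share the term e_i3
def corr(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

d3, d4 = [], []
for _ in range(20000):
    e2, e3, e4 = (rng.gauss(0, 1) for _ in range(3))
    d3.append(e3 - e2)
    d4.append(e4 - e3)
r = corr(d3, d4)
print(f"corr(d_e3, d_e4) = {r:.3f}")
```

With independent errors of equal variance the theoretical correlation in (b) is -0.5, which is why standard errors that ignore within-firm serial correlation are unreliable after differencing.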



Fixed/correlated effects.

Within transformation.

First-differencing.


3. Empirical example


The aim is to test whether foreign direct investment (foreign finance) leads to increased firm profitability in the UK.

The following model is specified (i and t index firm and year):

$PROF_{it} = \beta_1 FDI_{it} + \beta_2 MS_{it} + f_i + \varepsilon_{it}$    (6)

PROF is log of profitability, FDI is log of foreign finance, MS is market share, f is firm heterogeneity (fixed effect) and ε is an error term which is assumed to be serially uncorrelated.

Task 1: Estimate the model by using the within transformation.

Task 2: Test for heterogeneity-regressor correlation using the Hausman test (remember this test?).

Task 3: For comparison, re-estimate the model by first-differencing the data.




A peek at the data (N = 2813; T = 5)




TASK 1




TASK 2

Reject the null hypothesis that firm heterogeneity is uncorrelated with the regressors.




TASK 3

1. Recall that D is the first-difference operator in Stata (it automatically first differences the variables).

2. Also note that the noc (no constant) option was used.

3. The elasticity of profitability with respect to foreign finance (0.0732) is very close to the corresponding elasticity from the within estimator (0.0747).



4. Dynamic panel data models

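When econometric models contain lagged dependent variables, they are called dynamic models.

In these models the past influences the present because of adjustment costs, habits, etc.

Consider the following dynamic panel data model:

$y_{it} = \alpha y_{i,t-1} + \beta x_{it} + f_i + \varepsilon_{it}$, i=1,…,N; t=3,…,T.    (7)

Note that we need T to be at least 3, so t=3,…,T.

First-differencing equation (7) to eliminate $f_i$ gives

$y_{it} - y_{i,t-1} = \alpha (y_{i,t-1} - y_{i,t-2}) + \beta (x_{it} - x_{i,t-1}) + (\varepsilon_{it} - \varepsilon_{i,t-1})$

For convenience, rewrite the above model as

$\Delta y_{it} = \alpha \Delta y_{i,t-1} + \beta \Delta x_{it} + \Delta\varepsilon_{it}$    (8)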




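Although the heterogeneity term is eliminated from the dynamic panel data model, the transformed equation (8) has a problem of its own!

Namely, the first-differenced lagged dependent variable and the first-differenced error term are correlated:

$\mathrm{Cov}(\Delta y_{i,t-1}, \Delta\varepsilon_{it}) \neq 0$    (9)

To prove equation (9), note that $\Delta y_{i,t-1}$ contains $y_{i,t-1}$, which by equation (7) can be written as

$y_{i,t-1} = \alpha y_{i,t-2} + \beta x_{i,t-1} + f_i + \varepsilon_{i,t-1}$    (10)

and

$\Delta\varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}$    (11)

As equations (10) and (11) have the term $\varepsilon_{i,t-1}$ in common, $\Delta y_{i,t-1}$ and $\Delta\varepsilon_{it}$ are correlated, which establishes (9).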
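The size of this correlation can be worked out explicitly. The short derivation below is a sketch under assumptions that go slightly beyond the slide: ε is serially uncorrelated with constant variance, written here as $\sigma_\varepsilon^2$ (a symbol introduced only for this illustration), and $y_{i,t-2}$, $x_{i,t-1}$ and $f_i$ are uncorrelated with $\varepsilon_{i,t-1}$ and $\varepsilon_{it}$.

```latex
\begin{aligned}
\mathrm{Cov}(\Delta y_{i,t-1}, \Delta\varepsilon_{it})
  &= \mathrm{Cov}(y_{i,t-1} - y_{i,t-2},\; \varepsilon_{it} - \varepsilon_{i,t-1}) \\
  &= -\,\mathrm{Cov}(y_{i,t-1}, \varepsilon_{i,t-1}) \\
  &= -\,\mathrm{Cov}(\alpha y_{i,t-2} + \beta x_{i,t-1} + f_i + \varepsilon_{i,t-1},\; \varepsilon_{i,t-1}) \\
  &= -\,\sigma_\varepsilon^2 \neq 0 .
\end{aligned}
```

Under these assumptions the covariance is negative, so OLS on the first-differenced dynamic model would be expected to be biased even in large samples.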




Because of this regressor-error correlation, the first-differenced dynamic panel data model cannot be estimated by OLS (unlike the static panel model).

Provided that ε is not serially correlated, however, it can be estimated by IV/GMM using values of y lagged by two or more periods as instruments.

Consider the case of y lagged by two periods as IV.

It is not difficult to see that $\mathrm{Cov}(y_{i,t-2}, \Delta\varepsilon_{it}) = 0$, since $y_{i,t-2}$ depends only on errors dated t-2 and earlier, while $y_{i,t-2}$ is correlated with $\Delta y_{i,t-1} = y_{i,t-1} - y_{i,t-2}$.

As long as the lagged values of y are valid instruments, it is possible to obtain consistent estimators of α and β.
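A minimal simulation can illustrate the idea of instrumenting the differenced lag with y lagged two periods (this simple IV approach is often associated with Anderson and Hsiao). Everything in the sketch is an assumption made for illustration: the model has no x regressor, true α = 0.5, errors are standard normal and serially uncorrelated, and the function names are hypothetical.

```python
import random

def simulate_ar1_panel(n_firms=2000, n_periods=6, alpha=0.5, burn_in=20, seed=42):
    """Simulate y_it = alpha*y_i,t-1 + f_i + e_it and keep the last n_periods."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_firms):
        f = rng.gauss(0, 1)      # firm heterogeneity
        y = f / (1 - alpha)      # start at the firm-specific long-run mean
        path = []
        for t in range(burn_in + n_periods):
            y = alpha * y + f + rng.gauss(0, 1)
            if t >= burn_in:
                path.append(y)
        panel.append(path)
    return panel

def fd_ols_and_iv(panel):
    """Estimate alpha in the first-differenced model by OLS and by IV."""
    ols_num = ols_den = iv_num = iv_den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            dy = y[t] - y[t - 1]          # differenced dependent variable
            dy_lag = y[t - 1] - y[t - 2]  # differenced lagged dependent variable
            z = y[t - 2]                  # instrument: y lagged two periods
            ols_num += dy_lag * dy
            ols_den += dy_lag * dy_lag
            iv_num += z * dy
            iv_den += z * dy_lag
    return ols_num / ols_den, iv_num / iv_den

panel = simulate_ar1_panel()
a_ols, a_iv = fd_ols_and_iv(panel)
print(f"OLS on differences: {a_ols:.3f}   IV with y(t-2): {a_iv:.3f}")
```

In this design the OLS estimate on the differenced data should fall far below the true α (it can even turn negative), while the IV estimate should be close to 0.5.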



First-difference model;

Lagged values of y as IV;

GMM.


5. Empirical example


Now extend the static panel data model of equation (6) by including a lagged dependent variable:

$PROF_{it} = \alpha PROF_{i,t-1} + \beta_1 FDI_{it} + \beta_2 MS_{it} + f_i + \varepsilon_{it}$    (12)

1. Because of dynamics and individual heterogeneity in the model, first-difference the data.

2. Use lagged values of profits as instruments. In this particular case, use twice and three times lagged values.

3. Estimate the model by GMM.

4. Test for the validity of the instruments.

Note that there are several ways of estimating the model depending on the choice of instruments, but the basic principle is the same.




Instruments are valid.

FDI is insignificant!

Note: Obtained from ivregress 2sls


6. Further considerations


1. In dynamic panel data modelling, is it desirable to use all possible lagged values of the dependent variable as instruments?

Not necessarily, for two main reasons:

a. In general, too many instruments, even if all of them are valid, can bias IV/GMM in finite samples.

b. From a practical point of view, if you employ the IV/GMM estimator discussed in the previous section, the more lags you use as IV, the more observations you lose. For example, if you decide to use the dependent variable lagged 4 times, you can't use observations from the first 4 time periods.




2. Is there any technique that allows me to use further lags of the dependent variable as instruments, without losing data from the early periods of the panel?

Yes, there are several methods that allow you to do so!

Without loss of generality, consider the following simple dynamic panel model with T = 5 (as in our example).

$y_{it} = \alpha y_{i,t-1} + f_i + \varepsilon_{it}$    [1]

First-difference the model to obtain

$\Delta y_{it} = \alpha \Delta y_{i,t-1} + \Delta\varepsilon_{it}$, t=3,4,5    [2]

If ε is not serially correlated, the following instruments are valid for each i:

At t=3: $y_{i1}$; at t=4: $y_{i1}$ and $y_{i2}$; and at t=5: $y_{i1}$, $y_{i2}$ and $y_{i3}$.




If you want to use all of the available instruments, construct an instrument matrix Z with one row for each period that you are instrumenting as:

$Z_i = \begin{pmatrix}
y_{i1} & 0 & 0 & 0 & 0 & 0 \\
0 & y_{i1} & y_{i2} & 0 & 0 & 0 \\
0 & 0 & 0 & y_{i1} & y_{i2} & y_{i3}
\end{pmatrix}
\begin{matrix}
\text{IV for } t=3 \\
\text{IV for } t=4 \\
\text{IV for } t=5
\end{matrix}$

This type of instrumentation procedure ensures that the number of instruments is maximised.

Z is referred to as a GMM-style matrix of instruments.

Using the variables in Z as instruments, the parameters of the dynamic panel model can be estimated by GMM.
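The block structure of the GMM-style instrument matrix can be built mechanically. The sketch below does this for one firm; the helper name `gmm_style_instruments` and the example values are hypothetical and serve only to show the layout for T = 5.

```python
def gmm_style_instruments(y):
    """Build the GMM-style instrument matrix for one firm.

    y holds the levels y_i1, ..., y_iT. The result has one row per
    instrumented period t = 3, ..., T; the row for period t contains
    y_i1, ..., y_i,t-2 in its own block of columns and zeros elsewhere.
    """
    T = len(y)
    n_cols = sum(t - 2 for t in range(3, T + 1))  # 1 + 2 + ... + (T - 2)
    rows, offset = [], 0
    for t in range(3, T + 1):
        n_iv = t - 2                      # number of valid lags at period t
        row = [0.0] * n_cols
        row[offset:offset + n_iv] = y[:n_iv]
        rows.append(row)
        offset += n_iv
    return rows

y_i = [1.0, 2.0, 3.0, 4.0, 5.0]           # made-up values for y_i1 .. y_i5
Z_i = gmm_style_instruments(y_i)
for t, row in zip(range(3, 6), Z_i):
    print(f"IV for t={t}:", row)
# IV for t=3: [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# IV for t=4: [0.0, 1.0, 2.0, 0.0, 0.0, 0.0]
# IV for t=5: [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]
```

Because each period gets its own block of columns, no early observation has to be dropped just because fewer lags are available for it.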




Using Stata with GMM-style instruments:

Test for instrument exogeneity.




A two-step variant of the estimation:

We are looking for the absence of second-order serial correlation in the first-differenced errors.

Note: In theory, the two-step estimator is superior!




3. First-differencing the model wipes out the effects of time-invariant variables. Is there a way of identifying these effects?

The answer is yes, and this involves using the level equation [1] rather than the first-differenced equation [2].

The idea is to use first-differences of the lagged dependent variable as instruments in the level equation (as opposed to using lagged dependent variables as instruments in the first-differenced equation).

But this requires the additional assumption that the process under study is stationary, in the sense that the distribution of the initial observations coincides with the steady-state distribution of the process.

If this assumption is valid, $E[\Delta y_{i,t-1} (f_i + \varepsilon_{it})] = 0$, so lagged first differences of y are valid instruments for the level equation.




In fact, one can use both the level and first-differenced equations with the relevant instruments to obtain what is known as the system GMM estimator.

This estimator is especially recommended when y is highly persistent (α is close to 1), so that lagged values of y are quite weak (nearly irrelevant) instruments in the first-differenced model.

System GMM estimator




4. How can I deal with endogenous conditioning variables (variables in X) in a dynamic panel data model?

A conditioning variable could be strictly exogenous (uncorrelated with past, current and future values of ε), predetermined (uncorrelated with current and future values of ε, but possibly correlated with its past values) or endogenous (correlated with current and possibly past values of ε, but uncorrelated with its future values).

If x is strictly exogenous, its past, current and future values can be used as instruments.

If x is predetermined, only its current and past values may be valid instruments; in the first-differenced equation this means values lagged one or more periods.

With endogenous conditioning variables, the nature of permissible instruments depends on the lag structure of the error term; if ε is serially uncorrelated, values of x lagged two or more periods are valid in the first-differenced equation.




System GMM estimation assuming MS and FDI are endogenous and the error term is serially uncorrelated.




5. Is it possible to get widely varying results when trying different IV/GMM estimations?

Indeed! IV/GMM estimates can be quite erratic in finite samples.

It is advisable to try different IV/GMM estimators in order to establish the sensitivity of the results to choices of instruments and estimation method.

At the end of the day, however, you have to decide which set of results, if any, is most convincing and be prepared to explain and defend your decision.



7. Summary

1. Static panel data models with correlated effects can be estimated by OLS after transforming the data to eliminate individual heterogeneity.

2. The within or first-difference transformations can be used to eliminate correlated effects.

3. The estimation of dynamic panel data models is slightly more complicated and involves the use of GMM.

THANK YOU!
