Panel Data I


  • 8/9/2019 Panel Data I

    1/34

    [email protected] 1

    1

    PANEL DATA WORKSHOP

    BRUNEL UNIVERSITY

    February 29, 2008.

    PART I:

    THE ABC OF INSTRUMENTAL VARIABLES

    AND GMM ESTIMATION


    The ABC of instrumental variables and

    GMM estimation

    Presentation outline

    1. Introduction

    2. Instrumental variables estimation

    3. Empirical example

    4. Three important tests

    5. Empirical example continued

    6. Summary


    1. Introduction

    Econometrics is concerned with the analysis of financial,
    business and economic data: time series, cross-sectional
    or panel data.

    The analysis can be at individual, firm, industry or country

    level.

    Often the aim is to establish a causal relationship between

    various variables.

    1. Do bank loans help firms export more?

    2. Are women discriminated against in the labour market?

    3. Does financial development foster aggregate growth?


    How about the endogeneity problem?


    1. Introduction

    Suppose OLS regression of exports on bank loans shows a

    positive correlation between the two variables.

    Does this correlation imply that bank loans are the cause of
    increased exports? Not necessarily!

    Maybe exporters are more successful in obtaining bank
    loans, or banks favour exporters over non-exporters.

    Thus the causality might run from exports (dependent
    variable) to bank loans (independent variable).

    This is known as the problem of reverse causality or more

    generally endogeneity.

    Bank loans are potentially endogenous in the model.


    1. Introduction

     As another example, suppose that OLS regression of per

    capita GDP growth on financial development shows a

    positive relationship between the two variables.

    Since it is possible that financial markets develop in

    anticipation of future GDP growth, financial development

    could be a lead indicator of growth rather than an

    exogenous cause of growth.

    This creates the problem of endogeneity because the

    finance-growth relationship is simultaneously determined.

    It is important to remember that OLS would be biased when

    one or more regressors are endogenous.


    1. Introduction

    The examples illustrate that (i) correlation does not

    necessarily imply causality, and (ii) OLS may not always be

    an adequate empirical tool in finance.

    The objective of this lecture is to first introduce an estimation

    technique which is effective at dealing with the problem of

    endogeneity.

    This technique is known as instrumental variables (IV)

    estimation, or more generally as generalised method of

    moments (GMM).

    Unlike OLS, IV/GMM offer the chance of testing for causal

    relationships between economic variables.


    2. IV estimation

    Consider the following regression model

    $$y = X\beta + \varepsilon$$

    For OLS to be unbiased, the matrix of regressors X and the
    error term ε should be uncorrelated. That is

    $$E(X'\varepsilon) = 0$$

    In this case we say that the regressors are exogenous.

    When at least one of the regressors is correlated with ε,
    this condition fails:

    $$E(X'\varepsilon) \neq 0$$

    Regressors that are correlated with the error term are called
    endogenous regressors.


    2. IV estimation

    Endogeneity could result from a variety of reasons including:

    1. Reverse causality.

    2. Simultaneity bias

    3. Omitted variables bias.

    4. Measurement errors.

    Whatever the reason behind endogeneity, the OLS estimator
    of β will be biased and inconsistent.

    If there are several regressors and just one of them is

    endogenous, the OLS estimator would still be biased.
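
    The bias can be seen in a small simulation. The sketch below is
    only an illustration on synthetic data: the data-generating
    process, the true slope of 2 and the helper names (cov,
    simulate_ols_slope) are assumptions, not part of the workshop
    material. The regressor x and the error share an unobserved
    component u, so x is endogenous.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def simulate_ols_slope(n, beta=2.0, seed=0):
    """OLS slope of y on x when x and the error share the component u."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # exogenous source of variation in x
        u = rng.gauss(0, 1)  # unobserved factor entering both x and the error
        xi = z + u
        eps = u              # error term, correlated with x through u
        x.append(xi)
        y.append(beta * xi + eps)
    return cov(x, y) / cov(x, x)


# The estimate does not approach beta = 2 as n grows. In this assumed design
# the probability limit is beta + cov(x, eps) / var(x) = 2 + 1/2.
for n in (100, 10_000, 100_000):
    print(n, round(simulate_ols_slope(n), 3))
```

    A larger sample does not help here: the estimate settles
    around 2.5 rather than 2, which is what inconsistency means.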


    2. IV estimation

    In order to obtain a valid estimator of β and make correct

    inference about the relationship between y and X, we need

    some additional variables.

    The variables which help obtain a consistent estimator of β

    are known as instrumental variables (say Z).

    Instrumental variables should satisfy two properties:

    1. They have to be correlated with the endogenous
    regressors X, that is $E(Z'X) \neq 0$: instrument relevance.

    2. They have to be uncorrelated with the error term ε,
    that is $E(Z'\varepsilon) = 0$: instrument exogeneity.


    2. IV estimation

    The instrumental variables should only affect the dependent

    variable (y) indirectly through their relationship with the

    endogenous regressors (X).

    In other words Z should not be part of the model.

    It is not always easy to come up with valid instruments that

    are exogenous to the model AND correlated with X.

    Suppose y = per capita GDP growth and X includes an

    indicator of financial development.

    The concern of endogeneity arises because faster per capita

    GDP growth is conducive to financial development.


    2. IV estimation

    If valid instruments that satisfy the properties of relevance

    and exogeneity are available, a consistent estimator of β

    can be obtained.

    This consistent estimator is called the instrumental variables
    (IV) estimator, and is denoted as $\hat{\beta}_{IV}$.

    Consistency means that as the sample size gets large, the
    estimator converges to the true value β.

    The formula for the basic IV estimator, with as many columns
    in Z as in X (exogenous regressors serve as their own
    instruments), is

    $$\hat{\beta}_{IV} = (Z'X)^{-1} Z'y$$

    The IV estimator is approximately normally distributed in
    large samples, so statistical inference such as t-tests can be
    conducted in a standard fashion.
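
    With a single regressor and a single instrument (plus a
    constant), the IV estimator of the slope reduces to the ratio
    cov(z, y) / cov(z, x). The following sketch illustrates
    consistency on synthetic data. The design, the true slope of 2
    and the function names (draw_sample, iv_slope) are assumptions
    made for illustration only.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def draw_sample(n, beta, rng):
    """Synthetic data: z is relevant and exogenous, x is endogenous."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]       # error term, also part of x
    x = [zi + ui for zi, ui in zip(z, u)]         # relevance: x depends on z
    y = [beta * xi + ui for xi, ui in zip(x, u)]  # exogeneity: z is not in the error
    return y, x, z


def iv_slope(y, x, z):
    """Simple IV estimator of the slope: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


rng = random.Random(1)
for n in (100, 1_000, 10_000, 100_000):
    y, x, z = draw_sample(n, 2.0, rng)
    ols = cov(x, y) / cov(x, x)
    print(n, "IV:", round(iv_slope(y, x, z), 3), "OLS:", round(ols, 3))
```

    As n grows the IV estimate should move towards the true
    slope, while the OLS estimate stays away from it.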


    2. IV estimation

    The basic IV estimator can be obtained as a two-stage least

    squares estimation process:

    1. Regress each endogenous regressor on all instruments

    and exogenous regressors, and generate predicted values.

    2. Estimate the model by OLS, replacing the endogenous

    regressors with their predicted values.
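
    The two steps above can be sketched as follows for the simplest
    case of one endogenous regressor and one instrument. The data
    are synthetic and the names (simple_ols,
    two_stage_least_squares) are hypothetical helpers, not a
    library API.

```python
import random


def simple_ols(y, x):
    """Intercept and slope from an OLS regression of y on a constant and x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z."""
    a, b = simple_ols(x, z)            # stage 1: regress x on z
    x_hat = [a + b * zi for zi in z]   # predicted values of x
    return simple_ols(y, x_hat)        # stage 2: regress y on x_hat


# Synthetic data (assumed design): true intercept 1, true slope 2,
# and x endogenous through the shared component u.
rng = random.Random(42)
n = 20_000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ui for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

intercept, slope = two_stage_least_squares(y, x, z)
print("2SLS intercept and slope:", round(intercept, 3), round(slope, 3))
print("OLS slope for comparison:", round(simple_ols(y, x)[1], 3))
```

    Note that the standard errors from a second stage run by hand
    are not valid, because its residuals are computed with the
    predicted rather than the actual regressor. Dedicated IV
    routines correct for this.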

    If the error term is heteroskedastic or serially correlated,
    there are two options:

    a. The IV estimator can be used with robust standard errors.

    This option corresponds to the use of robust standard errors
    in OLS regressions.

    It is the "safest" option, though not the most efficient one.


    2. IV estimation

    b. An efficient version of the IV estimator called the

    generalised method of moments (GMM) can also be used.

    This option corresponds to the use of Generalised Least

    Squares (GLS) in the standard regression analysis.

    In small samples, the GMM estimator tends to be inaccurate.

    IV-GMM estimation requires at least as many instrumental

    variables as endogenous regressors.

    When there are more instruments than endogenous

    regressors we say that the model is overidentified.


    IV, GMM, relevance,

    exogeneity, overidentification.


    3. Empirical example

    The aim is to test whether access to finance (bank loans)
    causes an increase in private firms' exports in China.

    The following model is specified (i indexes firms)

    $$EXPORT_i = \beta_0 + \beta_1 BANK_i + \beta_2 DIST_i + \beta_3 LAB_i + \varepsilon_i$$

    EXPORT is the log of exports, BANK is the log of bank loans,
    DIST is the log of the distance from the city the firm is
    located in to the nearest port, and LAB is a dummy variable
    showing whether the firm is in a labour-intensive industry or not.

    The model has three regressors, two of which, DIST and

    LAB are arguably exogenous (why?)

    BANK is potentially endogenous, however.

    On the one hand, bank loans might help firms export by

    providing them with the necessary financial resources.


    3. Empirical example

    On the other hand, banks might prefer to lend to exporting

    firms. So exporting could help secure more bank loans.

    Because of this potential problem of simultaneity bias, we

    employ IV/GMM.

    To start with, explore the following three variables as

    potential instruments:

    1. POL: A dummy variable indicating whether the firm has

    political connections or not.

    2. STATE: The share of state-owned enterprises (SOEs) in

    the region the firm is located in.

    3. EQUITY: The amount of equity/collateral the firm has.


    3. Empirical example

     Arguably, these instrumental variable candidates are

    correlated with the endogenous regressor (BANK). Thus

    they are likely to be relevant instruments.

    Political connection and high level of collateral help obtain

    more bank loans; while high presence of SOEs is likely to

    reduce private firms’ access to finance.

    On the other hand, the property of exogeneity requires that

    the instruments affect exporting through bank loans alone,

    rather than being fundamental drivers of export.

    First estimate the model by OLS with robust standard errors,

    and then by IV/GMM.


    3. Empirical example

    A peek at the cross-sectional data (N = 5167)


    3. Empirical example

    OLS with robust standard errors:

    Bank loans are positively correlated with export, but with

    marginal statistical significance.

    Should we trust these results? Probably not, because of

    simultaneity bias.


    3. Empirical example

    Two-stage least squares (IV) with robust standard errors: First stage regression

    1. POL, STATE and EQUITY are the instrumental variables.

    2. Usually first-stage regressions are not reported in applied work.

    But it is important to routinely inspect them.


    3. Empirical example

    Two-stage least squares (IV) with robust standard errors: Second stage regression

    1. All interpretation of the model should be based on the second stage

    regression.

    2. Compared to OLS, 2SLS results appear to be counterintuitive: bank

    loans hurt exports; and distance to port does not seem to matter.

    3. We should test for the validity of the instruments before taking these

    results seriously!!


    4. Three important tests

    When working with instrumental variables, three important
    tests should be performed as a matter of routine. These are:

    1. Testing or checking for the relevance of the instrumental

    variable candidates: If the instruments have no or little

    correlation with the endogenous regressors, they are called

    weak instruments and would bias the IV estimator.

    2. Testing for the exogeneity of the instruments: If the
    instruments are correlated with the error term, IV would be
    invalid.

    3. Testing whether the endogenous regressors are really

    endogenous: If the regressors are not endogenous after all,

    OLS would be the most efficient estimation method.


    4. Three important tests

    1. TESTING FOR INSTRUMENT RELEVANCE:

    The idea is to check whether the instruments are sufficiently

    correlated with the endogenous regressors.

    The simplest way is to test for the joint significance of the

    instruments in the first stage regression.

    As a rule of thumb, if the calculated F statistic is more than
    10 and the p-value is close to 0, the instruments are likely
    to be relevant.

    If the instruments are weak, it is advisable to look for

    other/additional instruments.
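
    A minimal sketch of the first-stage check, assuming a single
    instrument so that the F statistic can be computed from the
    first-stage R-squared as F = R² (n - 2) / (1 - R²). This is the
    classical (non-robust) F statistic; the data are synthetic and
    the function name first_stage_f is hypothetical.

```python
import random


def first_stage_f(x, z):
    """Classical F statistic for z in a regression of x on a constant and z."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    szx = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(z, x))
    r2 = szx ** 2 / (szz * sxx)        # first-stage R-squared
    return r2 / (1.0 - r2) * (n - 2)   # one restriction, n - 2 residual df


rng = random.Random(5)
n = 2_000
z = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]

x_strong = [1.0 * zi + e for zi, e in zip(z, noise)]   # z clearly moves x
x_weak = [0.02 * zi + e for zi, e in zip(z, noise)]    # z barely moves x

print("F, strong instrument:", round(first_stage_f(x_strong, z), 1))
print("F, weak instrument:  ", round(first_stage_f(x_weak, z), 1))
```

    With several instruments the same idea applies, but the F
    statistic tests the joint significance of all excluded
    instruments in the first stage.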


    4. Three important tests

    2. TESTING FOR INSTRUMENT EXOGENEITY

    The instruments should have no correlation with the error.

    In order to test for instrument exogeneity, we need to have
    more instruments than endogenous regressors. The number
    of excess instruments is called the number of overidentifying
    restrictions (in our example this number equals 2).

    The Sargan/Hansen test can be used to test for instrument exogeneity.

    The null hypothesis of the test is “All instruments are valid”.

    If the null hypothesis is rejected, it means that at least one of

    the instruments is not valid.

    The test does not pinpoint which instruments are invalid.
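
    One common way to compute the Sargan statistic is n times the
    R-squared from regressing the 2SLS residuals on all
    instruments. The sketch below follows that recipe for one
    endogenous regressor and two instruments, so there is one
    overidentifying restriction. Everything here is an assumption
    for illustration: the synthetic design, the helper names
    (solve, ols, fitted, sargan, make_data, chi2_sf_1df) and the
    homoskedastic form of the test.

```python
import math
import random


def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (M[r][n] - s) / M[r][r]
    return coef


def ols(X, y):
    """OLS coefficients via the normal equations (X is a list of rows)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def fitted(X, coef):
    """Fitted values X * coef."""
    return [sum(c * v for c, v in zip(coef, row)) for row in X]


def sargan(y, x, instruments):
    """Sargan statistic for one endogenous regressor x."""
    n = len(y)
    Z = [[1.0] + [col[i] for col in instruments] for i in range(n)]
    x_hat = fitted(Z, ols(Z, x))                          # first stage
    b0, b1 = ols([[1.0, xh] for xh in x_hat], y)          # second stage
    resid = [yi - b0 - b1 * xi for yi, xi in zip(y, x)]   # uses actual x
    e_hat = fitted(Z, ols(Z, resid))                      # residuals on instruments
    return n * sum(e * e for e in e_hat) / sum(r * r for r in resid)


def chi2_sf_1df(stat):
    """Upper-tail probability of a Chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def make_data(n, direct_effect, rng):
    """z2 is an invalid instrument whenever direct_effect is non-zero."""
    z1 = [rng.gauss(0, 1) for _ in range(n)]
    z2 = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    x = [a + b + c for a, b, c in zip(z1, z2, u)]
    y = [2.0 * xi + direct_effect * b + c for xi, b, c in zip(x, z2, u)]
    return y, x, [z1, z2]


rng = random.Random(9)
for label, effect in (("valid instruments", 0.0), ("z2 enters y directly", 1.0)):
    y, x, ivs = make_data(3_000, effect, rng)
    stat = sargan(y, x, ivs)
    print(label, "| statistic:", round(stat, 2), "| p-value:", round(chi2_sf_1df(stat), 4))
```

    In the second case one instrument affects y directly, so the
    statistic is large and the null of valid instruments is
    rejected. As noted above, the test does not say which
    instrument is at fault.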


    4. Three important tests

    3. TESTING FOR ENDOGENEITY OF REGRESSORS.

    Even if the instruments are found to be valid, it is a good

    idea to test whether it is really necessary to use IV/GMM.

    This can be achieved through the Hausman test for the
    endogeneity of regressors

    $$H = (\hat{\beta}_{IV} - \hat{\beta}_{OLS})' (V_{IV} - V_{OLS})^{-1} (\hat{\beta}_{IV} - \hat{\beta}_{OLS})$$

    where $V_{IV}$ and $V_{OLS}$ are the variances of the IV and OLS
    estimators respectively.

    The null hypothesis is: “All regressors are exogenous”.

    Under the null, H is distributed as a Chi-squared random variable
    with degrees of freedom equal to the number of regressors.

    If the null hypothesis is not rejected, stick with OLS!!
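
    A simplified sketch of the Hausman comparison for a single
    slope coefficient, so that the statistic is a scalar and has 1
    degree of freedom in this illustration. Both variances use the
    error variance estimated from the OLS residuals, which keeps
    the difference V_IV - V_OLS positive. The synthetic data and
    the function name hausman_slope_test are assumptions, not the
    workshop's own computation.

```python
import math
import random


def hausman_slope_test(y, x, z):
    """Scalar Hausman statistic comparing the IV and OLS slope estimates."""
    n = len(y)
    mean_y, mean_x, mean_z = sum(y) / n, sum(x) / n, sum(z) / n
    dy = [v - mean_y for v in y]
    dx = [v - mean_x for v in x]
    dz = [v - mean_z for v in z]
    sxx = sum(a * a for a in dx)
    szz = sum(a * a for a in dz)
    szx = sum(a * b for a, b in zip(dz, dx))
    b_ols = sum(a * b for a, b in zip(dx, dy)) / sxx
    b_iv = sum(a * b for a, b in zip(dz, dy)) / szx
    # Error variance from OLS residuals (OLS is efficient under the null).
    s2 = sum((a - b_ols * b) ** 2 for a, b in zip(dy, dx)) / (n - 2)
    v_ols = s2 / sxx
    v_iv = s2 * szz / szx ** 2
    h = (b_iv - b_ols) ** 2 / (v_iv - v_ols)
    p_value = math.erfc(math.sqrt(h / 2.0))   # Chi-squared with 1 df
    return h, p_value


rng = random.Random(21)
n = 5_000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]   # error term
w = [rng.gauss(0, 1) for _ in range(n)]   # noise unrelated to the error

x_endog = [zi + ui for zi, ui in zip(z, u)]   # shares u with the error
x_exog = [zi + wi for zi, wi in zip(z, w)]    # unrelated to the error
y_endog = [2.0 * xi + ui for xi, ui in zip(x_endog, u)]
y_exog = [2.0 * xi + ui for xi, ui in zip(x_exog, u)]

print("endogenous x:", hausman_slope_test(y_endog, x_endog, z))
print("exogenous x: ", hausman_slope_test(y_exog, x_exog, z))
```

    In the endogenous case the statistic is large and the null of
    exogeneity is rejected, which is the situation where IV is
    needed.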


    Relevance

    Exogeneity

    Endogeneity

    TEST TEST TEST!


    5. Empirical example cont.

    We can test for the relevance and exogeneity of the

    instruments in our export model as follows:

    The command “estat first” tests for instrument relevance.

    Since the F statistic is greater than 10 and the p-value = 0,

    the problem of weak instruments is probably not too serious.


    5. Empirical example cont.

    The command “estat overid” tests for the exogeneity of
    instruments.

    The null hypothesis of the test is “POL, EQUITY and STATE
    are all exogenous instruments”.

    Under the null, the test statistic is distributed as a
    Chi-squared random variable with 2 degrees of freedom (3
    instrumental variables minus 1 endogenous regressor).

    The p-value of the test = 0, so we reject the null hypothesis

    and conclude that at least one of the instruments is not valid.

    Thus the IV results reported earlier should be discarded.

    Let’s try re-estimating the model by dropping one of the

    instruments (EQUITY).


    5. Empirical example cont.

    POL and STATE

    are valid instruments


    5. Empirical example cont.

    With valid instruments, the results suggest that bank loans

    play a positive and highly significant role in boosting exports.

    To check the robustness of this finding, re-estimate the last

    model by GMM.

    The two sets of results are practically the same, which is

    reassuring.


    5. Empirical example cont.

    Finally, test for the endogeneity of BANK.

    The Hausman test suggests that BANK is indeed

    endogenous. So using OLS would have been problematic.

    REJECT THE NULL


    6. Summary

    1. The problem of endogeneity is common in applied

    econometrics.

    2. IV/GMM offer a way of tackling the problem of

    endogeneity.

    3. It is important to test for the validity of the

    instruments before taking the results from IV/GMM

    estimation seriously.

    THANK YOU!