Introduction to Econometrics- Stock & Watson -Ch 9 Slides.doc

8/14/2019 Introduction to Econometrics- Stock & Watson -Ch 9 Slides.doc

1/69


2/69

Example !ortgage denial and ra"e

#he Boston $ed %!D& data set

"ndividual applications for single-family mortgagesmade in ! # in the greater Boston area

$% observations, collected under 'ome ortgageisclosure *ct (' *)

Variables

ependent variable:o "s the mortgage denied or accepted+

"ndependent variables:o income, wealth, employment statuso

other loan, property characteristics -$


3/69

o race of applicant

-%


4/69

#he 'inear robability !odel(SW Se"tion 9. )

* natural starting point is the linear regression modelwith a single regressor:

Y i = # ! X i uiBut:

hat does ! mean when Y is binary+ "s ! =Y

X

+

hat does the line # ! X mean when Y is binary+ hat does the predicted value .Y mean when Y is binary+ /or e0ample, what does .Y = #1$2 mean+

-3


5/69

#he linear probability model* "td.

Y i = # ! X i ui

4ecall assumption 5!: E (ui6 X i) = #, so

E (Y i6 X i) = E ( # ! X i ui6 X i) = # ! X i

hen Y is binary,

E (Y ) = ! 7r( Y =!) # 7r( Y =#) = 7r( Y =!)so

E (Y 6 X ) = 7r( Y =!6 X )

-8


6/69

#he linear probability model* "td.hen Y is binary, the linear regression model

Y i = # ! X i uiis called the linear probability model 1

9he predicted value is a probability :o E (Y 6 X = x) = 7r( Y =!6 X = x) = prob1 that Y = ! given xo .Y = the predicted probability that Y i = !, given X

! = change in probability that Y = ! for a given x:

! = 7r( !6 ) 7r( !6 )Y X x x Y X x x

= = + = =

0ample: linear probability model, ' * data-2


7/69

!ortgage denial +. ratio o, debt payments to in"ome( - ratio) in the %!D& data set (s/bset)

-;


8/69

'inear probability model %!D& data

deny = -1# 12#3 P/I ratio (n = $%)

(1#%$) (1# &)

hat is the predicted value for P/I ratio = 1%+

7r( !6 < 1%)deny P Iratio= = = -1# 12#3 1% = 1!8! alculating >effects:? increase P/I ratio from 1% to 13:7r( !6 < 13)deny P Iratio= = = -1# 12#3 13 = 1$!$9he effect on the probability of denial of an increasein P/I ratio from 1% to 13 is to increase the probability

by 1#2!, that is, by 21! percentage points (what? )1

-&


9/69

@e0t include black as a regressor:

deny = -1# ! 188 P/I ratio 1!;; black

(1#%$) (1# &) (1#$8)

7redicted probability of denial:

for black applicant with P/I ratio = 1%:

7r( !)deny = = -1# ! 188 1% 1!;; ! = 1$83 for white applicant, P/I ratio = 1%:

7r( !)deny = = -1# ! 188 1% 1!;; # = 1#;;

difference = 1!;; = !;1; percentage points oefficient on black is significant at the 8A level Still plenty of room for omitted ariable bias!

-


10/69

#he linear probability model S/mmary

odels probability as a linear function of X *dvantages:

o simple to estimate and to interpreto inference is the same as for multiple regression

(need heteroskedasticity-robust standard errors)

isadvantages:o oes it make sense that the probability should be

linear in +o 7redicted probabilities can be C# or D!E

-!#


11/69

9hese disadvantages can be solved by using a nonlinear probability model: probit and logit regression

robit and 'ogit Regression(SW Se"tion 9.0)

9he problem with the linear probability model is that it

models the probability of Y =! as being linear:

7r( Y = !6 X ) = # ! X

"nstead, we want:

# F 7r( Y = !6 X ) F ! for all X 7r( Y = !6 X ) to be increasing in X (for ! D#)

-!!


12/69

9his reGuires a nonlinear functional form for the probability1 'ow about an >S-curve?H

-!$


13/69

9he probit model satisfies these conditions:

# F 7r( Y = !6 X ) F ! for all X

-!%


14/69

7r( Y = !6 X ) to be increasing in X (for ! D#)

-!3


15/69

Probit regression models the probability that Y =! usingthe cumulative standard normal distribution function,

evaluated at " = # ! X :

7r( Y = !6 X ) = ( # ! X )

is the cumulative normal distribution function1 " = # ! X is the > " -value? or > " -inde0? of the probit model1

Example : Suppose # = -$, ! = %, X = 13, so

7r( Y = !6 X =13) = (-$ % 13) = (-#1&)7r( Y = !6 X =13) = area under the standard normal densityto left of " = -1&, which isH

-!8


16/69

7r( # F -#1&) = 1$!!-!2


17/69

robit regression* "td.

hy use the cumulative normal probability distribution+

9he >S-shape? gives us what we want:o # F 7r( Y = !6 X ) F ! for all X

o 7r( Y = !6 X ) to be increasing in X (for ! D#)

asy to use I the probabilities are tabulated in thecumulative normal tables

4elatively straightforward interpretation:o " -value = # ! X o #

. !. X is the predicted " -value, given X

-!;


18/69

o ! is the change in the " -value for a unit changein X

-!&


19/69

S$%$% Example : ' * data

. probit deny p_irat, r;

Iteration 0: log likelihood = -872.0853 We ll di!"#!! thi! later

Iteration $: log likelihood = -835.%%33Iteration 2: log likelihood = -83$.8053&Iteration 3: log likelihood = -83$.7'23&

(robit e!ti)ate! *#)ber o+ ob! = 2380 Wald "hi2 $ = &0.%8 (rob "hi2 = 0.0000/og likelihood = -83$.7'23& (!e#do 2 = 0.0&%2

------------------------------------------------------------------------------ 1 ob#!t deny 1 oe+. td. 4rr. ( 1 1 6'5 on+. Inter al9------------- ---------------------------------------------------------------- p_irat 1 2.'%7'08 .&%53$$& %.38 0.000 2.055'$& 3.87''0$ _"on! 1 - 2.$'&$5' .$%&'72$ -$3.30 0.000 -2.5$7&'' -$.87082

------------------------------------------------------------------------------7r( !6 < )deny P Iratio= = (-$1! $1 ; P/I ratio ) (1!2) (13;)

-!


20/69

S$%$% Example : ' * data, ctd17r( !6 < )deny P Iratio= = (-$1! $1 ; P/I ratio )

(1!2) (13;)

7ositive coefficient: does this make sense + Standard errors have usual interpretation 7redicted probabilities:

7r( !6 < 1%)deny P Iratio= = = (-$1! $1 ; 1%)= (-!1%#) = 1# ;

ffect of change in P/I ratio from 1% to 13: 7r( !6 < 13)deny P Iratio= = = (-$1! $1 ; 13) = 1!87redicted probability of denial rises from 1# ; to 1!8

-$#


21/69

robit regression with m/ltiple regressors

7r( Y = !6 X ! , X $) = ( # ! X ! $ X $)

is the cumulative normal distribution function1 " = # ! X ! $ X $ is the > " -value? or > " -inde0? of the probit model1

! is the effect on the " -score of a unit change in X ! ,holding constant X $

-$!


22/69

S$%$% Example : ' * data

. probit deny p_irat bla"k, r;

Iteration 0: log likelihood = -872.0853

Iteration $: log likelihood = -800.8850&Iteration 2: log likelihood = -7'7.$&78Iteration 3: log likelihood = -7'7.$3%0&

(robit e!ti)ate! *#)ber o+ ob! = 2380 Wald "hi2 2 = $$8.$8 (rob "hi2 = 0.0000/og likelihood = -7'7.$3%0& (!e#do 2 = 0.085'

------------------------------------------------------------------------------ 1 ob#!t deny 1 oe+. td. 4rr. ( 1 1 6'5 on+. Inter al9------------- ---------------------------------------------------------------- p_irat 1 2.7&$%37 .&&&$%33 %.$7 0.000 $.87$0'2 3.%$2$8$ bla"k 1 .708$57' .083$877 8.5$ 0.000 .5&5$$3 .87$2028

_"on! 1 -2.258738 .$588$%8 -$&.22 0.000 -2.5700$3 -$.'&7&%3------------------------------------------------------------------------------

&e'll go through the estimation details later!

-$$


23/69

S$%$% Example : predicted probit probabilities. probit deny p_irat bla"k, r;

(robit e!ti)ate! *#)ber o+ ob! = 2380 Wald "hi2 2 = $$8.$8 (rob "hi2 = 0.0000/og likelihood = -7'7.$3%0& (!e#do 2 = 0.085'

------------------------------------------------------------------------------ 1 ob#!t deny 1 oe+. td. 4rr. ( 1 1 6'5 on+. Inter al9------------- ----------------------------------------------------------------

p_irat 1 2.7&$%37 .&&&$%33 %.$7 0.000 $.87$0'2 3.%$2$8$ bla"k 1 .708$57' .083$877 8.5$ 0.000 .5&5$$3 .87$2028 _"on! 1 -2.258738 .$588$%8 -$&.22 0.000 -2.5700$3 -$.'&7&%3------------------------------------------------------------------------------

. !"a $ = _b6_"on!9 _b6p_irat9 .3 _b6bla"k9 0;

. di!play


24/69

S$%$% Example : ' * data, ctd17r( !6 < , )deny P I black =

= (-$1$2 $1;3 P/I ratio 1;! black ) (1!2) (133) (1#&)

"s the coefficient on black statistically significant+

stimated effect of race for P/I ratio = 1%:

7r( !61%,!)deny = = (-$1$2 $1;3 1% 1;! !) = 1$%%7r( !61%,#)deny = = (-$1$2 $1;3 1% 1;! #) = 1#;8

ifference in reJection probabilities = 1!8& (!81&

percentage points)

Still plenty of room still for omitted ariable bias!

-$3


25/69

'ogit regression

Logit regression models the probability of Y =! as thecumulative standard logistic distribution function,

evaluated at " = # ! X :

7r( Y = !6 X ) = ( ( # ! X )

( is the cumulative logistic distribution function:

( ( # ! X ) =# !( )

!! X e

++

-$8


26/69

'ogisti" regression* "td.

7r( Y = !6 X ) = ( ( # ! X )

where ( ( # ! X ) =# !( )

!! X e

++ 1

Example : # = -%, ! = $, X = 13,

so # ! X = -% $ 13 = -$1$ so

7r( Y = !6 X =13) = !


27/69

S$%$% Example : ' * data. logit deny p_irat bla"k, r;

Iteration 0: log likelihood = -872.0853 /ater>Iteration $: log likelihood = -80%.357$

Iteration 2: log likelihood = -7'5.7&&77Iteration 3: log likelihood = -7'5.%'52$Iteration &: log likelihood = -7'5.%'52$

/ogit e!ti)ate! *#)ber o+ ob! = 2380 Wald "hi2 2 = $$7.75 (rob "hi2 = 0.0000/og likelihood = -7'5.%'52$ (!e#do 2 = 0.087%

------------------------------------------------------------------------------ 1 ob#!t deny 1 oe+. td. 4rr. ( 1 1 6'5 on+. Inter al9------------- ---------------------------------------------------------------- p_irat 1 5.3703%2 .'%33&35 5.57 0.000 3.&822&& 7.258&8$ bla"k 1 $.272782 .$&%0'8% 8.7$ 0.000 .'8%&33' $.55'$3 _"on! 1 -&.$25558 .3&5825 -$$.'3 0.000 -&.8033%2 -3.&&7753------------------------------------------------------------------------------

. di!


28/69

7redicted probabilities from estimated probit and logitmodels usually are very close1

-$&


29/69

Estimation and n,eren"e in robit (and 'ogit)!odels (SW Se"tion 9.1)

7robit model:

7r( Y = !6 X ) = ( # ! X )

stimation and inferenceo 'ow to estimate # and ! +o hat is the sampling distribution of the estimators+

o hy can we use the usual methods of inference+ /irst discuss nonlinear least s)uares (easier to e0plain)

-$


30/69

9hen discuss maximum likelihood estimation (what isactually done in practice)

-%#


31/69

robit estimation by nonlinear least s2/ares4ecall KLS:

# !

$

, # !!min M ( )N

n

b b i ii Y b b X = +

9he result is the KLS estimators #. and !.

"n probit, we have a different regression function I thenonlinear probit model1 So, we could estimate # and !

by nonlinear least squares :

# !

$, # !

!min M ( )N

n

b b i ii

Y b b X = +

Solving this yields the nonlinear least squares estimatorof the probit coefficients1

-%!


32/69

3onlinear least s2/ares* "td.

# !

$

, # !!min M ( )N

n

b b i ii Y b b X = +

'ow to solve this minimiOation problem+

alculus doesnPt give and e0plicit solution1 ust be solved numerically using the computer, e1g1 by >trial and error? method of trying one set of valuesfor ( b#,b! ), then trying another, and another,H

Better idea: use specialiOed minimiOation algorithms"n practice, nonlinear least sGuares isnPt used because itisnPt efficient I an estimator with a smaller variance isH

-%$


33/69

robit estimation by maxim/m li4elihood

9he likelihood function is the conditional density of Y ! ,H, Y n given X ! ,H, X n, treated as a function of the

unknown parameters # and ! 1

9he ma0imum likelihood estimator ( L ) is the valueof ( #, ! ) that ma0imiOe the likelihood function1

9he L is the value of ( #, ! ) that best describe thefull distribution of the data1

"n large samples, the L is:o consistento normally distributed

-%%


34/69

o efficient (has the smallest variance of all estimators)

Spe"ial "ase the probit !'E with no X

Y =! with probability

# with probability !

p

p (Bernoulli distribution)

ata: Y ! ,H, Y n, i1i1d1

erivation of the likelihood starts with the density of Y ! :

7r( Y ! = !) = p and 7r( Y ! = #) = !I pso

7r( Y ! = y! ) = ! !!(! ) y y p p ( erify this for y *+,- *. )

-%3


35/69

Qoint density of ( Y ! ,Y $):Because Y ! and Y $ are independent,

7r( Y ! = y! ,Y $ = y$) = 7r( Y ! = y! ) 7r( Y $ = y$)

= M ! !!(! ) y y p p NM $ $!(! ) y y p p N

Qoint density of ( Y ! ,11,Y n):

7r( Y ! = y! ,Y $ = y$,H, Y n = yn)

= M ! !!(! ) y y p p NM $ $!(! ) y y p p NH M !(! )n n y y p p N

= ( )!! (! )nn

ii ii

n y y

p p =

=

9he likelihood is the Joint density, treated as a function ofthe unknown parameters, which here is p:

-%8


36/69

f ( pRY ! ,H, Y n) = ( )!! (! )nn

ii iin Y Y

p p ==

9he L ma0imiOes the likelihood1 "ts standard to workwith the log likelihood, lnM f ( pRY ! ,H, Y n)N:

lnM f ( pRY ! ,H, Y n)N =( ) ( )! !ln( ) ln(! )n ni ii iY p n Y p= =+

!ln ( R ,111, )nd f p Y Y

dp = ( ) ( )! !! !

!

n n

i ii iY n Y p p= =

+ = #Solving for p yields the L R that is, . /0E p satisfies,

-%2


37/69

( ) ( )! !! !. .!n n

i i /0E /0E i iY n Y

p p= = + = #

or

( ) ( )! !! !. .!n n

i i /0E /0E i iY n Y

p p= ==

or

. .! !

/0E

/0E Y pY p= or

. /0E p = Y = fraction of !Ps

-%;


38/69

9he L in the >no- X ? case (Bernoulli distribution):. /0E p = Y = fraction of !Ps

/or Y i i1i1d1 Bernoulli, the L is the >natural?estimator of p, the fraction of !Ps, which is Y

e already know the essentials of inference:o "n large n, the sampling distribution of . /0E p = Y is

normally distributedo 9hus inference is >as usual:? hypothesis testing via

t -statistic, confidence interval as !1 2 SE

S9*9* note: to emphasiOe reGuirement of large- n, the printout calls the t -statistic the " -statisticR instead of the ( -statistic, the chi1s)uared statstic (= ) ( )1

-%&


39/69

#he probit li4elihood with one X 9he derivation starts with the density of Y ! , given X ! :

7r( Y ! = !6 X ! ) = ( # ! X ! )

7r( Y ! = #6 X ! ) = !I ( # ! X ! )

so

7r( Y ! = y! 6 X ! ) = ! !!# ! ! # ! !( ) M! ( )N y y X X + +

9he probit likelihood function is the Joint density of Y ! ,

H, Y n given X ! ,H, X n, treated as a function of #, ! :

f( #, ! RY ! ,H, Y n6 X ! ,H, X n)= ! !!# ! ! # ! !( ) M! ( )N

Y Y X X + + T

H !# ! # !( ) M! ( )Nn nY Y

n n X X + + T

-%

h b l l h d


40/69

#he probit li4elihood ,/n"tion :

f( #, ! RY ! ,H, Y n6 X ! ,H, X n)

= ! !!# ! ! # ! !( ) M! ( )NY Y X X + + T

H !# ! # !( ) M! ( )Nn nY Y

n n X X + + T

anPt solve for the ma0imum e0plicitly ust ma0imiOe using numerical methods *s in the case of no X , in large samples:

o #. /0E , !.

/0E are consistent

o #. /0E , !.

/0E are normally distributed (more laterH)o 9heir standard errors can be computedo 9esting, confidence intervals proceeds as usual

/or multiple X Ps, see S *pp1 1$-3#

h l i li lih d i h


41/69

#he logit li4elihood with one X

9he only difference between probit and logit is thefunctional form used for the probability: isreplaced by the cumulative logistic function1

Ktherwise, the likelihood is similarR for details seeS *pp1 1$

*s with probit,o #

. /0E , !. /0E are consistent

o #. /0E , !. /0E are normally distributedo 9heir standard errors can be computedo 9esting, confidence intervals proceeds as usual

-3!

! / i


42/69

!eas/res o, ,it9he 2 $ and $ 2 donPt make sense here ( why? )1 So, twoother specialiOed measures are used:

!1 9he fraction correctly predicted = fraction of Y Ps forwhich predicted probability is D8#A (if Y i=!) or is

C8#A (if Y i=#)1

$1 9he pseudo-R 0 measure the fit using the likelihoodfunction: measures the improvement in the value ofthe log likelihood, relative to having no X Ps (see S*pp1 1$)1 9his simplifies to the 2 $ in the linearmodel with normally distributed errors1

-3$

' 5 di ib/ i h !'E ( i SW )


43/69

'arge5 n distrib/tion o, the !'E ( not in SW )

9his is foundation of mathematical statistics1 ePll do this for the >no- X ? special case, for which p is

the only unknown parameter1 'ere are the steps:

!1 erive the log likelihood (> ( p)?) (done)1$1 9he L is found by setting its derivative to OeroR

that reGuires solving a nonlinear eGuation1%1 /or large n, . /0E p will be near the true p ( p true ) so this

nonlinear eGuation can be appro0imated (locally) by

a linear eGuation (9aylor series around ptrue

)131 9his can be solved for . /0E p I p true 181 By the Law of Large @umbers and the L9, for n

large, n ( . /0E p I p true ) is normally distributed1-3%

!1 i th l lik lih d


44/69

!1 erive the log likelihood 2ecall : the density for observation 5! is:

7r( Y ! = y! ) = ! !!(! ) y y p p (density)

so

f ( pRY ! ) = ! !!(! )Y Y p p (likelihood)

9he likelihood for Y ! ,H, Y n is,

f ( pRY ! ,H, Y n) = f ( pRY ! ) H f ( pRY n)so the log likelihood is,

( p) = ln f ( pRY ! ,H, Y n)

= lnM f ( pRY ! ) H f ( pRY n)N=

!

ln ( R )n

ii

f p Y =

$1 Set the derivative of ( p) to Oero to define the L :-33


45/69

31 S l thi li 0i ti f ( /0E I true )


46/69

31 Solve this linear appro0imation for ( . /0E p I p true ):

( )true p

p p

L

$

$

( )

true p

p p

L

( . /0E

p I ptrue

) #

so$

$

( )

true p

p

p

L( . /0E p I p true ) I

( )true p

p

p

L

or

( . /0E p I p true ) I

!$

$

( )

true p

p

p

L ( )

true p

p

p

L

-32

81 S bstit te things in and appl the LL@ and L91


47/69

81 Substitute things in and apply the LL@ and L91

( p) =!

ln ( R )n

ii

f p Y =

( )true p

p p

L =

!

ln ( R )true

ni

i p

f p Y p=

$

$

( )

true p

p

p

L =

$

$!

ln ( R )

true

ni

i p

f p Y

p=

so

( . /0E p I p true ) I

!$

$

( )

true p

p

p

L ( )true

p

p

p

L

=

!$

$!

ln ( R )

true

ni

i p

f p Y p

=

!

ln ( R )true

ni

i p

f p Y p=

-3;

ultiply through by :


48/69

ultiply through by n :

n ( . /0E p I p true )

!$

$!

! ln ( R )true

n

i

i p

f p Y n p

=

!

! ln ( R )true

n

i

i p f p Y pn =

Because Y i is i1i1d1, theith terms in the summands are alsoi1i1d1 9hus, if these terms have enough ($) moments, thenunder general conditions (not Just Bernoulli likelihood):

$

$!

! ln ( R )

true

ni

i p

f p Y n p=

p

a (a constant) ( LL@)

!

! ln ( R )true

ni

i p

f p Y pn =

d 3 (#,

$ln f ) ( L9) ( &hy? )

7utting this together,-3&

( /0E p I true )


49/69

n ( . p I p true )

!$

$

!

! ln ( R )

true

ni

i p

f p Y n p

=

!

! ln ( R )true

ni

i p

f p Y pn =

$

$!

! ln ( R )

true

ni

i p

f p Y n p=

p

a (a constant) ( LL@)

!

! ln ( R )true

ni

i p

f p Y pn =

d 3 (#,

$ln f ) ( L9) ( &hy? )

son ( . /0E p I p true )

d 3 (#,

$ln f


50/69

4ecall: f ( pRY i) = !(! )i iY Y p p

so

ln f ( pRY i) = Y iln p (!I Y i)ln(!I p)and

ln ( , )i f p Y p

=

!!

i iY Y p p

=(! )

iY p p p

and$

$

ln ( , )i f p Y p

= $ $

!(! )

i iY Y p p

= $ $!

(! )i iY Y

p p +

-8#

enominator term first:


51/69

enominator term first:$

$

ln ( , )i f p Y p

= $ $

!(! )

i iY Y p p +

so$

$!

! ln ( R )

true

ni

i p

f p Y n p=

= $ $!! !

(! )

ni i

i

Y Y n p p=

+

= $ $!

(! )Y Y

p p+

p

$ $!

(! ) p p p p

+ (LL@)

=! !

! p p+ =

!(! ) p p

-8!

@e0t the numerator:


52/69

@e0t the numerator:

ln ( , )i f p Y

p

= (! )

iY p

p p

so

!

! ln ( R )

true

ni

i p

f p Y

pn =

=

!

!

(! )

ni

i

Y p

p pn =

=!

! !( )

(! )

n

ii

Y p p p n =

d 3 (#,

$

$M (! )NY

p p )

-8$

7ut these pieces together:


53/69

7ut these pieces together:

n ( . /0E p I p true )!

$

$!

! ln ( R )true

n

i

i p

f p Y n p

=

!

! ln ( R )true

n

i

i p

f p Y pn =

where$

$!

! ln ( R )

true

n

ii p

f p Y n p=

p

!

(! ) p p

!

! ln ( R )true

ni

i p

f p Y pn =

d 3 (#,

$

$M (! )NY

p p )

9hus

n ( . /0E p I p true )d

3 (#, $

Y )

S/mmary probit !'E* no5 X "ase-8%


54/69

9he L : . /0E p = Y

orking through the full L distribution theory gave:


3 (#, $

Y )

But because p true = 7r( Y = !) = E (Y ) = Y , this is:

n (Y I Y )d

3 (#, $

Y )

* familiar result from the first week of classE

-83

#he !'E deri+ation applies generally


55/69

#he ! E deri+ation applies generally


3 (#,$ln f


56/69

S/mmary distrib/tion o, the ! E(Why did do this to yo/6)

9he L is normally distributed for large n e worked through this result in detail for the probit

model with no X Ps (the Bernoulli distribution)

/or large n, confidence intervals and hypothesis testing proceeds as usual

"f the model is correctly specified, the L is efficient,that is, it has a smaller large- n variance than all other

estimators (we didnPt show this)1 9hese methods e0tend to other models with discretedependent variables, for e0ample count data (5crimes


57/69

&ppli ation to the Boston %!D& Data(SW Se"tion 9.7)

ortgages (home loans) are an essential part of buying a home1

"s there differential access to home loans by race+

"f two otherwise identical individuals, one white andone black, applied for a home loan, is there adifference in the probability of denial+

-8;

#he %!D& Data Set


58/69

#he %!D& Data Set

ata on individual characteristics, propertycharacteristics, and loan denial


59/69

#he loan o,,i er8s de ision

Loan officer uses key financial variables:o P/I ratioo housing e0pense-to-income ratioo loan-to-value ratioo personal credit history

9he decision rule is nonlinear:o loan-to-value ratio D A

o loan-to-value ratio D 8A (what happens in default+)o credit score

-8

Regression spe"i,i"ations


60/69

Regression spe i,i ations7r( deny=!6black , other X Ps) = H

linear probability model probit

ain problem with the regressions so far: potential

omitted variable bias1 *ll these (i) enter the loan officerdecision function, all (ii) are or could be correlated withrace:

wealth, type of employment

credit history family status

Wariables in the ' * data setH-2#


61/69

-2!


62/69


63/69

-2%


64/69

-23


65/69

-28

S/mmary o, Empiri"al Res/lts


66/69

y , p

oefficients on the financial variables make sense1 4lack is statistically significant in all specifications 4ace-financial variable interactions arenPt significant1 "ncluding the covariates sharply reduces the effect of

race on denial probability1 L7 , probit, logit: similar estimates of effect of race

on the probability of denial1

stimated effects are large in a >real world? sense1

-22

Remaining threats to internal* external +alidity


67/69

g y

"nternal validity!1 omitted variable bias

what else is learned in the in-person interviews+$1 functional form misspecification (noH)

%1 measurement error (originally, yesR now, noH)31 selection

random sample of loan applications

define population to be loan applicants81 simultaneous causality (no) 0ternal validity

9his is for Boston in ! #- !1 hat about today+-2;

S/mmary


68/69

y(SW Se"tion 9. )

"f Y i is binary, then E (Y 6 X ) = 7r( Y =!6 X ) 9hree models:

o linear probability model (linear multiple regression)o probit (cumulative standard normal distribution)o logit (cumulative standard logistic distribution)

L7 , probit, logit all produce predicted probabilities ffect of X is change in conditional probability that

Y =!1 /or logit and probit, this depends on the initial X 7robit and logit are estimated via ma0imum likelihood

o oefficients are normally distributed for large n

-2&

o Large- n hypothesis testing, conf1 intervals is as usual


69/69

g yp g

Documents

Introduction to Econometrics- Stock & Watson -Ch 9 Slides.doc