50
Lecture 10: Nonlinear Regression Functions Zheng Tian Zheng Tian Lecture 10: Nonlinear Regression Functions 1 / 46

Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Lecture 10: Nonlinear Regression Functions

Zheng Tian

Zheng Tian Lecture 10: Nonlinear Regression Functions 1 / 46

Page 2: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Outline

1 Introduction

2 A General Strategy For Modeling Nonlinear Regression Functions

3 Nonlinear functions of a single independent variable

4 Interactions between independent variables

5 Warm-up exercises

6 Regression Functions That Are Nonlinear in the Parameters

Zheng Tian Lecture 10: Nonlinear Regression Functions 2 / 46

Page 3: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Introduction

Overview

Linear population regression function

E (Yi | Xi ) = β0 + β1Xi1 + · · ·+ βkXik , where Xi = (Xi1, . . . ,Xik)′.

Nonlinear population regression function

E (Yi | Xi ) = f (Xi1,Xi2, . . . ,Xik ;β1, β2, . . . , βm), where f (·) is a nonlinearfunction.

Study questions

Why do we need to use nonlinear regression models?What types of nonlinear regression models can we estimate by OLS?How can we interpret the coefficients in nonlinear regression models?

Zheng Tian Lecture 10: Nonlinear Regression Functions 3 / 46

Page 4: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

Test Scores and district income

Test scores can bedetermined by averagedistrict incomeWe estimate a simple linearregression model

TestScore = β0+β1Income+u

What’s the problem withthe simple linear regressionmodel?

Figure: Scatterplot of test score vs district incomeand a linear regression line

Zheng Tian Lecture 10: Nonlinear Regression Functions 4 / 46

Page 5: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

Why does a simple linear regression model not fit the datawell?

Data points are below the OLS line when income is very low (under $10,000)or very high (over $40,000), and are above the line when income is between$15,000 and $30,000.

The scatterplot may imply a curvature in the relationship between test scoresand income.That is, a unit increase in income may have larger effect on test scores whenincome is very low than when income is very high.

The linear regression line cannot capture the curvature because the effect ofdistrict income on test scores is constant over all the range of income since

∆TestScore/∆Income = β1

where β1 is constant.

Zheng Tian Lecture 10: Nonlinear Regression Functions 5 / 46

Page 6: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

Estimate a quadratic regression model

TestScore = β0 + β1Income + β2Income2 + u (1)

This model is nonlinear, specifically quadratic, with respect to Income sincewe include the squared income.The population regression function is

E (TestScore|Income) = β0 + β1Income + β2Income2

It is linear with respect to β. So we can still use the OLS estimation andcarry out hypothesis testing as we do with a linear regression model.

Zheng Tian Lecture 10: Nonlinear Regression Functions 6 / 46

Page 7: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

Estimate a quadratic regression model (cont’d)

Figure: Scatterplot of test score vs district income and a quadratic regression line

Zheng Tian Lecture 10: Nonlinear Regression Functions 7 / 46

Page 8: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

A general formula for a nonlinear population regressionfunction

A general nonlinear regression model is

Yi = f (Xi1,Xi2, . . . ,Xik ;β1, β2, . . . , βm) + ui (2)

The population nonlinear regression function:

E (Yi |Xi1, . . . ,Xik) = f (Xi1,X2i , . . . ,Xik ;β1, β2, . . . , βm)

The number of regressors and the number of parameters are not necessarilyequal in the nonlinear regression model.In vector notation

Yi = f (Xi ;β) + ui (3)

We focus on the nonlinear regression models such that f (·) is nonlinear withXi but linear with β.

Zheng Tian Lecture 10: Nonlinear Regression Functions 8 / 46

Page 9: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

The effect on Y of a change in a regressor

For any general nonlinear regression function

The effect on Y of a change in one regressor, say X1, holding other thingsconstant, can be computed as

∆Y = f (X1 + ∆X1,X2, . . . ,Xk ;β)− f (X1,X2, . . . ,Xk ;β) (4)

For continuous and differentiable nonlinear functions

When X1 and Y are continuous variables and f (·) is differentiable, the marginaleffect of X1 is the partial derivative of f with respect to X1, that is, holding otherthings constant

dY =∂f (X1, . . . ,Xk ;β)

∂XidXi

because dXj = 0 for j 6= i

Zheng Tian Lecture 10: Nonlinear Regression Functions 9 / 46

Page 10: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

Application to test scores and income

Estimation

TestScore = 607.3(2.9)

+ 3.85(0.27)

Income − 0.0423(0.0048)

Income2, R2 = 0.554 (5)

Hypothesis test

Test H0 : β2 = 0 vs. H1 : β2 6= 0.

t =−0.04230.0048

= −8.81 > −1.96

We reject the null at the 1%, 5% and 10% significance levels, and therefore,confirm the quadratic relationship between test scores and income.

Zheng Tian Lecture 10: Nonlinear Regression Functions 10 / 46

Page 11: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

The effect of change in income on test scores

A change in income from $10 thousand to $20 thousand

∆Y = β0 + β1 × 11 + β2 × 112 − (β0 + β1 × 10 + β2 × 102)

= β1(11− 10) + β2(112 − 102)

= 3.85− 0.0423× 21 = 2.96

A change in income from $40 thousand to $41 thousand

∆Y = β0 + β1 × 41 + β2 × 412 − (β0 + β1 × 40 + β2 × 402)

= β1(41− 40) + β2(412 − 402)

= 3.85− 0.0423× 81 = 0.42

Zheng Tian Lecture 10: Nonlinear Regression Functions 11 / 46

Page 12: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

A General Strategy For Modeling Nonlinear Regression Functions

A general approach to modeling nonlinearities using multipleregression

1 Identify a possible nonlinear relationship.Economic theoryScatterplotsYour judgment and experts’ opinions

2 Specify a nonlinear function and estimate its parameters by OLS.The OLS estimation and inference techniques can be used as usual when theregression function is linear with respect to β.

3 Determine whether the nonlinear model can improve a linear modelUse t- and/or F-statistics to test the null hypothesis that the populationregression function is linear against the alternative that it is nonlinear.

4 Plot the estimated nonlinear regression function.5 Compute the effect on Y of a change in X and interpret the results.

Zheng Tian Lecture 10: Nonlinear Regression Functions 12 / 46

Page 13: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

Polynomials

A polynomial regression model of degree r

Yi = β0 + β1Xi + β2X2i + · · ·+ βrX

ri + ui (6)

r = 2: a quadratic regression modelr = 3: a cubic regression modelUse the OLS method to estimate β1, β2, . . . , βr .

Testing the null hypothesis that the population regression function is linear

H0 : β2 = 0, β3 = 0, ..., βr = 0 vs. H1 : at least one βj 6= 0, j = 2, . . . , r

Use F statistic to test this joint hypothesis. The number of restriction is q = r − 1.

Zheng Tian Lecture 10: Nonlinear Regression Functions 13 / 46

Page 14: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

What is ∆Y /∆X in a polynomial regression model?

Consider a cubic model and continuous X and Y

Y = β0 + β1X + β2X2 + β3X

3 + u

Then, we can calculate

dYdX

= β1 + 2β2X + 3β3X2

The effect of a unit change in X on Y depends on the value of X atevaluation.

Zheng Tian Lecture 10: Nonlinear Regression Functions 14 / 46

Page 15: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

Which degree of polynomial should I use?

Balance a trade-off between flexibility and statistical precision.Flexibility. Relate Y to X in more complicated way than simple linearregression.Statistical precision. X ,X 2,X 3, . . . are correlated so that there is the problemof imperfect multicollinearity.

Follow a sequential hypothesis testing procedure1 Pick a maximum value of r and estimate the polynomial regression for that r .2 Follow a "deletion" rule based on t-statistic or F-statistic.

Zheng Tian Lecture 10: Nonlinear Regression Functions 15 / 46

Page 16: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

Application to district income and test scores

We estimate a cubic regression model relating test scores to district income asfollows

TestScore = 600.1(5.1)

+ 5.02(0.71)

Income− 0.096(0.029)

Income2+ 0.00069(0.00035)

Income3, R2 = 0.555

Test whether it is a cubic modelThe t-statistic for H0 : β3 = 0 is 1.97 ⇒ Fail to reject

Test whether it is a nonlinear modelThe F-statistic for H0 : β2 = β3 = 0 is 37.7, p-value < 0.01

Interpretation of coefficients

Use the general formula of interpreting the effect of ∆X on Y .

Zheng Tian Lecture 10: Nonlinear Regression Functions 16 / 46

Page 17: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

A natural logarithmic function y = ln(x)

Properties of ln(x)

ln(1/x) = − ln(x), ln(ax) = ln(a) + ln(x)

ln(x/a) = ln(x)− ln(a), and ln(xa) = a ln(x)

The derivative of ln(x) is

d ln(x)

dx= lim

∆x→0

ln(x + ∆x)− ln(x)

∆x=

1x.

It follows that d ln(x) = dx/x , representing the percentage change in x .

Zheng Tian Lecture 10: Nonlinear Regression Functions 17 / 46

Page 18: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

The percentage-change form using ln(x)

The change in ln(X ) represents the percentage change in X

ln(x + ∆x)− ln(x) ≈ ∆x

xwhen ∆x is small.

The Taylor expansion of ln(x + ∆x) at x , which is

ln(x + ∆x) = ln(x) +d ln(x)

dx(x + ∆x − x) +

12!

d2 ln(x)

dx2 (x + ∆x − x)2 + · · ·

= ln(x) +∆x

x− ∆x2

2x2 + · · ·

When ∆x is very small, we can omit the terms with ∆x2,∆x3, etc. Thus, wehave ln(x + ∆x)− ln(x) ≈ ∆x

x when ∆x is small.

Zheng Tian Lecture 10: Nonlinear Regression Functions 18 / 46

Page 19: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

The three logarithmic regression models

There are three types of logarithmic regression models:

Linear-log modelLog-linear modelLog-log model

Differences in logarithmic transformation of X and/or Y lead to differences ininterpretation of the coefficient.

Zheng Tian Lecture 10: Nonlinear Regression Functions 19 / 46

Page 20: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

Case I: linear-log model

Model form. X is in logarithms, Y is not.

Yi = β0 + β1 ln(Xi ) + ui , i = 1, . . . , n (7)

Interpretation. a 1% change in X is associated with a change in Y of 0.01β1

∆Y = β1 ln(X + ∆X )− β1 ln(X ) ≈ β1∆X

X

Example. The estimated model is

TestScore = 557.8 + 36.42 ln(Income)

1% increase in average district income results in an increase in test scores by0.01× 36.42 = 0.36 point.

Zheng Tian Lecture 10: Nonlinear Regression Functions 20 / 46

Page 21: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

Case II: log-linear model

Model form. Y is in logarithms, X is not.

ln(Yi ) = β0 + β1Xi + ui (8)

Interpretation. A one-unit change in X is associated with a 100× β1%change in Y because

∆Y

Y≈ ln(Y + ∆Y )− ln(Y ) = β1∆X

Example.ln(Earnings) = 2.805 + 0.0087Age

Earnings are predicted to increase by 0.87% for each additional year of age.

Zheng Tian Lecture 10: Nonlinear Regression Functions 21 / 46

Page 22: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

Case III: log-log model

Model form. Both X and Y are in logarithms.

ln(Yi ) = β0 + β1 ln(Xi ) + ui (9)

Interpretation: elasticity. 1% change in X is associated with a β1% change inY because

∆Y

Y≈ ln(Y + ∆Y )− ln(Y ) = β1(ln(X + ∆X )− ln(X )) ≈ β1

∆X

X

β1 is the elasticity of Y with respect to X , that is

β1 =100× (∆Y /Y )

100× (∆X/X )=

percentage change in Y

percentage change in X

With the derivative, β1 = d ln(Y )/d ln(X ) = (dY /Y )/(dX/X ).

Example. The log-log model of the test score application is estimated as

ln(TestScore) = 6.336 + 0.0544 ln(Income)

This implies that a 1% increase in income corresponds to a 0.0544% increasein test scores.

Zheng Tian Lecture 10: Nonlinear Regression Functions 22 / 46

Page 23: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

The log-linear and log-log regression functions

Figure: The log-linear and log-log regression functions

Zheng Tian Lecture 10: Nonlinear Regression Functions 23 / 46

Page 24: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Nonlinear functions of a single independent variable

Summary

Regression specification Interpretation of β1

Y = β0 + β1 ln(X ) + u A 1% change in X is associated with achange in Y of 0.01β1

ln(Y ) = β0 + β1X + u A change in X by one unit is associatedwith a 100β1% change in Y

ln(Y ) = β0 + β1 ln(X ) + u A 1% change in X is associated with aβ1% change in Y, so β1 is the elasticityof Y with respect to X

Zheng Tian Lecture 10: Nonlinear Regression Functions 24 / 46

Page 25: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Interactions between independent variables

Interaction between two binary variables

Interaction between a continuous and a binary variable

Interaction between two continuous variables

Zheng Tian Lecture 10: Nonlinear Regression Functions 25 / 46

Page 26: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

The regression model with interaction between two binaryvariables

Two binary variables

D1i = 1 if the ith person has a college degree, and 0 otherwise.D2i = 1 if the ith person is female, and 0 otherwise.

A regression with an interaction term of two binary variables

Consider a regresion model concerning the effects of education and gender onearnings. The population regression function is

Yi = β0 + β1D1i + β2D2i + β3(D1i × D2i ) + ui (10)

The dependent variable: Yi , where Yi = Earningsi

D1i × D2i is the interaction term.

Zheng Tian Lecture 10: Nonlinear Regression Functions 26 / 46

Page 27: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

The method of interpreting coefficients in regressions withinteracted binary variables

We can follow a general rule for interpreting coefficients in Equation (10):

First compute the expected values of Y for each possible case described bythe set of binary variables.

Next compare these expected values. Each coefficient can then be expressedeither as an expected value or as the difference between two or moreexpected values.

Zheng Tian Lecture 10: Nonlinear Regression Functions 27 / 46

Page 28: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Compute the expected values of Y for each possiblecombinations of D1 and D2

Case 1 E (Yi |D1i = 0,D2i = 0) = β0: the average income of malenon-college graduates.

Case 2 E (Yi |D1i = 1,D2i = 0) = β0 + β1: the average income malecollege graduates.

Case 3 E (Yi |D1i = 0,D2i = 1) = β0 + β2: the average income of femalenon-college graduates.

Case 4 E (Yi |D1i = 1,D2i = 1) = β0 + β1 + β2 + β3: the average incomeof female college graduates.

Zheng Tian Lecture 10: Nonlinear Regression Functions 28 / 46

Page 29: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Compute the difference between a pair of cases

Case 1 vs. Case 2 E (Yi |D1i = 1,D2i = 0)− E (Yi |D1i = 0,D2i = 0) = β1: theaverage income difference between college graduates andnon-college graduates among male workers.

Case 1 vs. Case 3 E (Yi |D1i = 0,D2i = 1)− E (Yi |D1i = 0,D2i = 0) = β2: theaverage income difference between female and male workers whoare not college graduates.

Case 1 vs. Case 4E (Yi |D1i = 1,D2i = 1)− E (Yi |D1i = 0,D2i = 0) = β1 + β2 + β3:the average income difference between female college graduatesand male non-college graduates.

Zheng Tian Lecture 10: Nonlinear Regression Functions 29 / 46

Page 30: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Compute the difference between a pair of cases (cont’d)

Case 2 vs. Case 3 E (Yi |D1i = 0,D2i = 1)− E (Yi |D1i = 1,D2i = 0) = β2 − β1.Thus, the average income difference between female non-collegegraduates and male college graduates is β2 − β1.

Case 2 vs. Case 4 E (Yi |D1i = 1,D2i = 1)− E (Yi |D1i = 1,D2i = 0) = β2 + β3.Thus, the average income difference between female collegegraduates and male college graduates is β2 + β3.

Case 3 vs. Case 4 E (Yi |D1i = 1,D2i = 1)− E (Yi |D1i = 0,D2i = 1) = β1 + β3.Thus, the average income difference between female collegegraduates and female non-college graduates is β1 + β3.

Zheng Tian Lecture 10: Nonlinear Regression Functions 30 / 46

Page 31: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Hypothesis testing

We can use t-statistic or F-statistic to test whether the differences betweendifferent cases are statistically significant.

The null hypothesis: H0 : β2 = 0 vs. H1 : β2 6= 0.

What is this test for?What test statistic can we use?

The hypothesis is H0 : β1 + β3 = 0 vs. H1 : β1 + β3 6= 0.

What is this test for?What test statistic can we use?

Zheng Tian Lecture 10: Nonlinear Regression Functions 31 / 46

Page 32: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Interactions between a continuous and a binary variable

Consider the population regression of earnings (Yi ) against

one continuous variable, individual’s years of work experience (Xi ), andone binary variable, whether the worker has a college degree (Di , whereDi = 1 if the ith person is a college graduate).

As shown in the next figure, the population regression line relating Y and X candepend on D in three different ways.

Zheng Tian Lecture 10: Nonlinear Regression Functions 32 / 46

Page 33: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Interactions between a continuous and a binary variable:graphic representation

Figure: Regression Functions Using Binary and Continuous VariablesZheng Tian Lecture 10: Nonlinear Regression Functions 33 / 46

Page 34: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Different intercept, same slope: (a) in Figure 4

Yi = β0 + β1Xi + β2Di + ui (11)

From Equation (11), we have the population regression functions asE(Yi |Di = 1) = (β0 + β2) + β1Xi

E(Yi |Di = 0) = β0 + β1Xi .

Thus, E (Yi |Di = 1)− E (Yi |Di = 0) = β2.The average initial salary of college graduates is higher than non-collegegraduates by β2, and this gap persists at the same magnitude regardless ofhow many years a worker has been working.

Zheng Tian Lecture 10: Nonlinear Regression Functions 34 / 46

Page 35: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Different intercepts and different slopes: (b) in Figure 4

Equation (11):Yi = β0 + β1Xi + β2Di + β3(Xi × Di ) + ui (12)

The population regression functions for the two cases areE(Yi |Di = 1) = (β0 + β2) + (β1 + β3)Xi

E(Yi |Di = 0) = β0 + β1Xi .

Thus, β2 is the difference in intercepts and β3 is the difference in slopes.The average initial salary of college graduates is higher than non-collegegraduates by β2, and this gap will widen (or narrow) depending on the effectof the years of work experience on earnings.

Zheng Tian Lecture 10: Nonlinear Regression Functions 35 / 46

Page 36: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Different intercepts and same intercept: (c) in Figure 4

Yi = β0 + β1Xi + β2(Xi × Di ) + ui (13)

The population regression functions for the two cases areE(Yi |Di = 1) = β0 + (β1 + β2)Xi

E(Yi |Di = 0) = β0 + β1Xi .

Thus, there is only a difference in the slope but not in the intercept.Although college graduates have the same starting salary as those withoucolledge degree, the raise in salary and promotion of the former will be fasterthan the latter.

Zheng Tian Lecture 10: Nonlinear Regression Functions 36 / 46

Page 37: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Interactions between independent variables

Interactions between two continuous variables

Now we consider the regression of earnings against two continuous variables, onefor the years of work experience (X1) and another for the years of schooling (X2).

The interaction model is

Yi = β0 + β1X1i + β2X2i + β3(X1i × X2i ) + ui (14)

The effect of a change in X1, holding X2 constant, is

∆Y

∆X1= β1 + β3X2

Similarly, the effect of a change in X2, holding X1 constant, is

∆Y

∆X2= β1 + β3X1

Zheng Tian Lecture 10: Nonlinear Regression Functions 37 / 46

Page 38: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Warm-up exercises

Question 1

The interpretation of the slope coefficient in the modelln(Yi ) = β0 + β1 ln(Xi ) + ui is as follows:

A) a 1% change in X is associated with a β1% change in Y.B) a change in X by one unit is associated with a β1 change in Y.C) a change in X by one unit is associated with a 100 β1 % change in

Y.D) a 1% change in X is associated with a change in Y of 0.01β1.

Answer: A

Zheng Tian Lecture 10: Nonlinear Regression Functions 38 / 46

Page 39: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Warm-up exercises

Question 1

The interpretation of the slope coefficient in the modelln(Yi ) = β0 + β1 ln(Xi ) + ui is as follows:

A) a 1% change in X is associated with a β1% change in Y.B) a change in X by one unit is associated with a β1 change in Y.C) a change in X by one unit is associated with a 100 β1 % change in

Y.D) a 1% change in X is associated with a change in Y of 0.01β1.

Answer: A

Zheng Tian Lecture 10: Nonlinear Regression Functions 38 / 46

Page 40: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Warm-up exercises

Question 2

In the regression model Yi = β0 + β1Xi + β2Di + β3(Xi × Di ) + ui , where X is acontinuous variable and D is a binary variable, to test that the two regressions areidentical, you must use the

A) t-statistic separately for β2 = 0, β3 = 0.B) F-statistic for the joint hypothesis that β0 = 0, β1 = 0C) t-statistic separately for β3 = 0D) F-statistic for the joint hypothesis that β2 = 0, β3 = 0.

Answer: D

Zheng Tian Lecture 10: Nonlinear Regression Functions 39 / 46

Page 41: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Warm-up exercises

Question 2

In the regression model Yi = β0 + β1Xi + β2Di + β3(Xi × Di ) + ui , where X is acontinuous variable and D is a binary variable, to test that the two regressions areidentical, you must use the

A) t-statistic separately for β2 = 0, β3 = 0.B) F-statistic for the joint hypothesis that β0 = 0, β1 = 0C) t-statistic separately for β3 = 0D) F-statistic for the joint hypothesis that β2 = 0, β3 = 0.

Answer: D

Zheng Tian Lecture 10: Nonlinear Regression Functions 39 / 46

Page 42: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Warm-up exercises

Question 2

In the regression model Yi = β0 + β1Xi + β2Di + β3(Xi × Di ) + ui , where X is acontinuous variable and D is a binary variable, to test that the two regressions areidentical, you must use the

A) t-statistic separately for β2 = 0, β3 = 0.B) F-statistic for the joint hypothesis that β0 = 0, β1 = 0C) t-statistic separately for β3 = 0D) F-statistic for the joint hypothesis that β2 = 0, β3 = 0.

Answer: D

Zheng Tian Lecture 10: Nonlinear Regression Functions 39 / 46

Page 43: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Warm-up exercises

Question 3

(Requires Calculus) In the equationTestScore = 607.3 + 3.85Income − 0.0423Income2, the following income level

results in the maximum test score

A) 607.3.B) 91.02.C) 45.50.D) cannot be determined without a plot of the data.

Answer: C

Zheng Tian Lecture 10: Nonlinear Regression Functions 40 / 46

Page 44: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Warm-up exercises

Question 3

(Requires Calculus) In the equationTestScore = 607.3 + 3.85Income − 0.0423Income2, the following income level

results in the maximum test score

A) 607.3.B) 91.02.C) 45.50.D) cannot be determined without a plot of the data.

Answer: C

Zheng Tian Lecture 10: Nonlinear Regression Functions 40 / 46

Page 45: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Regression Functions That Are Nonlinear in the Parameters

Nonlinear regression models and nonlinear least squaresestimator

All the regression models that we have discussed in this lecture are nonlinear inthe regressors but linear in parameters so that we can still treat them as linearregression models and estimate using the OLS.

However, there exist regression models that are nonlinear in parameters. For thesemodels, we can either transform them to the "linear" type of models or estimateusing the nonlinear least squares (NLS) estimators.

Zheng Tian Lecture 10: Nonlinear Regression Functions 41 / 46

Page 46: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Regression Functions That Are Nonlinear in the Parameters

Transform a nonlinear model to a linear one

Suppose we have a nonlinear regression model as follows

Yi = αXβ11i X

β22i · · ·X

βk

ki eui (15)

Taking the natural logarithmic function on both sides of the equation

ln(Yi ) = ln(α) + β1 ln(X1i ) + β2 ln(X2i ) + · · ·+ βk ln(Xki ) + ui (16)

Equation (15) becomes a log-log regression model, which is linear in allparameters and can be estimated using the OLS. Let β0 = ln(α) and α = eβ0 .βi for i = 1, 2, . . . , k are the elasticities of Y with respect to Xi .

Zheng Tian Lecture 10: Nonlinear Regression Functions 42 / 46

Page 47: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Regression Functions That Are Nonlinear in the Parameters

A nonlinear model: logistic function

A dependent variable can only take values between 0 and 1.

The logistic regression model with k regressors is

Yi =1

1 + exp(β0 + β1X1i + · · ·βkXki )+ ui (17)

For small values of X , the value of the function is nearly 0 and the shape isflat.For large values of X , the function approaches 1 and the slope is flat again.

Zheng Tian Lecture 10: Nonlinear Regression Functions 43 / 46

Page 48: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Regression Functions That Are Nonlinear in the Parameters

A nonlinear model: negative exponential growth function

The effect of X on Y must be positive and the effect is bounded by a upperbound.Use the negative-exponential growth function to set up a regression model asfollows

Yi = β0[1− exp(−β1(Xi − β2))] + ui (18)

The slope is positive for all values of X .The slope is greatest at low values of X and decreases as X increases.There is an upper bound, that is, a limit of Y as X goes to infinity, β0.

Zheng Tian Lecture 10: Nonlinear Regression Functions 44 / 46

Page 49: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Regression Functions That Are Nonlinear in the Parameters

Logistic and negative exponential growth curves

Figure: The logistic and negative exponential growth functions

Zheng Tian Lecture 10: Nonlinear Regression Functions 45 / 46

Page 50: Lecture 10: Nonlinear Regression Functions · A General Strategy For Modeling Nonlinear Regression Functions TestScoresanddistrictincome Testscorescanbe determinedbyaverage districtincome

Regression Functions That Are Nonlinear in the Parameters

The nonlinear least squares estimators

For a nonlinear regression function

Yi = f (X1, . . . ,Xk ;β1, . . . , βm) + ui

which is nonlinear in both X and β, we can obtain the estimated parameters bynonlinear least squares (NLS) estimation.

The essential idea of NLS is the same as OLS, which is to minimize the sum ofsquared prediction mistakes. That is

minb1,...,bm

S(b1, . . . , bm) =n∑

i=1

[Yi − f (X1, . . . ,Xk ; b1, . . . , bm)]2

The solution to this minimization problem is the nonlinear least squares estimators.

Zheng Tian Lecture 10: Nonlinear Regression Functions 46 / 46