Research Method Lecture 1 (Ch1, Ch2) Simple linear regression

  • Slide 1
  • 1 Research Method Lecture 1 (Ch1, Ch2) Simple linear regression
  • Slide 2
  • 2 The goal of econometric analysis: to estimate the causal effect of one variable on another, that is, the effect of one variable on another holding all other relevant factors constant. The causal effect is, in other words, the ceteris paribus effect: the effect with other relevant factors held constant.
  • Slide 3
  • 3 For example, consider the following model: (Crop yield) = $\beta_0 + \beta_1$(fertilizer) + $u$. You are interested in the causal effect of the amount of fertilizer on crop yield. $u$ contains all relevant factors that are unobserved by the researcher, such as the quality of land.
  • Slide 4
  • 4 One way to obtain the causal effect is to control for all other relevant variables, as in (Crop yield) = $\beta_0 + \beta_1$(fertilizer) + $\beta_2$(land quality) + ... + $u$. In reality, we do not have all the relevant variables in the data set.
  • Slide 5
  • 5 However, under certain conditions, we can estimate the causal effect even if we do not have all the relevant variables in the data. In this lecture, you will learn these conditions for the case of simple linear regression.
  • Slide 6
  • 6 Types of data sets
      • Cross-sectional
      • Time series
      • Pooled cross-sectional
      • Panel data
  • Slide 11
  • 11 A simple linear regression: assumptions. Assumption SLR.1: Linear in parameters. In the population model, the dependent variable $y$ is related to the independent variable $x$ and the error term $u$ as $y = \beta_0 + \beta_1 x + u$.
  • Slide 12
  • 12 Assumption SLR.2: Random sampling. We have a random sample of size $n$, $\{(x_i, y_i)\}$ for $i = 1, \dots, n$, following the population model.
  • Slide 13
  • 13 Understanding SLR.2 is important. Suppose you have the following data.

        Obs id | Y     | X
        1      | $y_1$ | $x_1$
        2      | $y_2$ | $x_2$
        ⋮      | ⋮     | ⋮
        n      | $y_n$ | $x_n$

    Then SLR.2 means the following:
      • SLR.2a: $y_1, y_2, \dots, y_n$ are independently and identically distributed.
      • SLR.2b: $x_1, x_2, \dots, x_n$ are independently and identically distributed.
      • SLR.2c: $x_i$ and $y_j$ are independent for $i \neq j$.
      • SLR.2d: $u_1, u_2, \dots, u_n$ are independently and identically distributed.
  • Slide 14
  • 14 Assumption SLR.3: The sample outcomes of $x$, namely $x_1, x_2, \dots, x_n$, are not all the same value.
  • Slide 15
  • 15 Assumption SLR.4 : Zero conditional mean Given any value of x, the expected value of u is zero, that is E(u|x)=0
  • Slide 16
  • 16 Combining SLR.2 and SLR.4, we have the following. Given the data $\{(x_i, y_i)\}$ for $i = 1, 2, \dots, n$, we have SLR.4a: $E(u_i | x_i) = 0$ for $i = 1, 2, \dots, n$, and SLR.4b: $E(u_i | x_1, x_2, \dots, x_n) = 0$ for $i = 1, 2, \dots, n$. We usually write the latter as $E(u_i | X) = 0$ as shorthand.
  • Slide 17
  • 17 Note the following:
      • $E(u|x) = 0$ implies $cov(u, x) = 0$.
      • But $cov(u, x) = 0$ does not necessarily imply $E(u|x) = 0$.
      • $E(u|x) = 0$ does not imply that $u$ and $x$ are independent.
      • But if $u$ and $x$ are independent and $E(u) = 0$, then $E(u|x) = 0$ is always satisfied.
    SLR.4 is the assumption that allows you to interpret the result as a causal effect.
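
The first two claims on this slide can be checked with a short derivation. This is a sketch using the law of iterated expectations; the counterexample at the end (a standard normal $x$ with $u = x^2 - 1$) is an illustration chosen here, not taken from the slides.

```latex
% E(u|x) = 0 implies cov(u, x) = 0:
\begin{align*}
E(u)  &= E\big[E(u \mid x)\big] = 0, \\
E(xu) &= E\big[x\,E(u \mid x)\big] = 0, \\
\mathrm{cov}(u, x) &= E(xu) - E(x)\,E(u) = 0.
\end{align*}

% The converse fails. Illustrative counterexample: x ~ N(0, 1), u = x^2 - 1.
\begin{align*}
E(u) &= E(x^2) - 1 = 0, \\
\mathrm{cov}(u, x) &= E(x^3) - E(x) = 0, \\
E(u \mid x) &= x^2 - 1 \neq 0 \quad \text{for } x \neq \pm 1.
\end{align*}
```
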
  • Slide 18
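  • 18 Estimation of $\beta_0$ and $\beta_1$. From the assumptions, we can motivate the estimation procedure. SLR.4 implies $E(u) = 0$ and $E(ux) = 0$. This motivates the following empirical counterparts: $\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$ and $\frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$.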
  • Slide 19
  • 19 The hats above the coefficients indicate that they are estimates of the true parameters $\beta_0$ and $\beta_1$. Let us call the above two equations the first order conditions (FOCs) for the simple linear regression. Solving the FOCs for the beta coefficients gives the following estimates. (See next page)
  • Slide 20
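  • 20 The estimators for simple OLS: $\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$ and $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$. Proof: See the front board. These are called the ordinary least squares (OLS) estimators.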
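
The closed-form OLS estimators can be computed directly from sample means. The following is a minimal sketch in Python with NumPy; the function name `ols_simple` and the simulated data (true intercept 1.0, true slope 0.5) are assumptions made for illustration and do not come from the lecture.

```python
import numpy as np

def ols_simple(x, y):
    """Return (beta0_hat, beta1_hat) for a regression of y on x with an intercept.

    Hypothetical helper: implements the textbook closed-form OLS formulas.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    sst_x = np.sum((x - x_bar) ** 2)  # sum of squared deviations of x
    if sst_x == 0:
        # SLR.3 requires that the x values are not all the same
        raise ValueError("SLR.3 fails: all x values are identical")
    beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / sst_x
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat

# Simulated data (assumed values): y = 1.0 + 0.5 x + u, with u independent of x
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
u = rng.normal(0, 1, size=200)
y = 1.0 + 0.5 * x + u

b0, b1 = ols_simple(x, y)
print(b0, b1)  # estimates should be near 1.0 and 0.5, not exactly equal
```

Because the estimates solve the first order conditions, the residuals `y - b0 - b1 * x` sum to zero and are uncorrelated with `x` in the sample, up to floating-point error.
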
  • Slide 21
  • 21 After estimating the coefficients, you can compute the residual, $\hat{u}_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$, which is the estimated value of the error term $u_i$.
  • Slide 22
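  • 22 Some useful results. From the FOCs, the following equations follow: $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} x_i \hat{u}_i = 0$, where $\hat{u}_i$ is the OLS residual. We will use these equations many times in the proofs of various theorems.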
  • Slide 23
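  • 23 SST, SSE and SSR. Total sum of squares: $SST = \sum_{i=1}^{n}(y_i - \bar{y})^2$. Explained sum of squares: $SSE = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$, where $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$. Residual sum of squares: $SSR = \sum_{i=1}^{n}\hat{u}_i^2$. These satisfy the relationship $SST = SSE + SSR$. Proof: See front board.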
  • Slide 24
  • 24 R squared. R squared is a measure of fit: $R^2 = SSE/SST = 1 - SSR/SST$. R squared is always between 0 and 1.
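
The decomposition $SST = SSE + SSR$ and the two equivalent expressions for R squared can be verified numerically. The sketch below assumes a small made-up data set and a hypothetical helper `fit_and_decompose`; neither comes from the lecture.

```python
import numpy as np

def fit_and_decompose(x, y):
    """Fit y on x by OLS and return the sums of squares and R squared.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    y_hat = b0 + b1 * x              # fitted values
    u_hat = y - y_hat                # residuals
    sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
    sse = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
    ssr = np.sum(u_hat ** 2)               # residual sum of squares
    return {"sst": sst, "sse": sse, "ssr": ssr, "r2": sse / sst, "u_hat": u_hat}

# Made-up data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

out = fit_and_decompose(x, y)
print(out["sst"], out["sse"] + out["ssr"])  # the two numbers agree up to rounding
print(out["r2"], 1 - out["ssr"] / out["sst"])  # two equivalent forms of R squared
```

The identity relies on the regression including an intercept, since the proof uses the fact that the residuals sum to zero.
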
  • Slide 25
  • 25 Units of measurement and functional form. 1. Level-level form. Example: the determinants of CEO salary. Salary = $\beta_0 + \beta_1$(Sales) + $u$, where Salary and Sales are both measured in thousands of dollars. Then $\beta_1$ shows the change in CEO salary, in thousands of dollars, when sales increase by one thousand dollars.
  • Slide 26
  • 26 2. Log-log form. Suppose you regress log(salary) on log(sales) in the CEO compensation example: log(Salary) = $\beta_0 + \beta_1$ log(Sales) + $u$. Then $\beta_1$ shows the percentage change. That is, if sales increase by 1%, salary increases by $\beta_1$%.
  • Slide 27
  • 27 3. Log-level form. Example: the return on education. log(wage) = $\beta_0 + \beta_1$(educ) + $u$, where wage is the hourly wage in dollars and educ is the years of education. Then, if education increases by 1 year, wage increases by approximately $100\beta_1$%.
  • Slide 28
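  • 28 Summary: Units of measurement and functional form

        Model       | Dependent variable | Independent variable | Interpretation
        Level-level | $y$                | $x$                  | $\Delta y = \beta_1 \Delta x$
        Level-log   | $y$                | $\log(x)$            | $\Delta y = (\beta_1/100)\,\%\Delta x$
        Log-level   | $\log(y)$          | $x$                  | $\%\Delta y = (100\beta_1)\,\Delta x$
        Log-log     | $\log(y)$          | $\log(x)$            | $\%\Delta y = \beta_1\,\%\Delta x$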
  • Slide 29
  • 29 Unbiasedness of OLS. Theorem 2.1: Under SLR.1 through SLR.4, we have $E(\hat\beta_0) = \beta_0$ and $E(\hat\beta_1) = \beta_1$. Proof: See the front board.
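
For readers without access to the board, the standard argument for the slope can be sketched as follows. This is the usual textbook route and may differ in detail from the proof given in lecture; $SST_x = \sum_{i=1}^{n}(x_i - \bar{x})^2$ is notation introduced here, and $X$ denotes $x_1, \dots, x_n$.

```latex
\begin{align*}
\hat\beta_1
  &= \frac{\sum_{i=1}^{n}(x_i - \bar{x})\,y_i}{SST_x}
   = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{SST_x}
   && \text{(SLR.1, SLR.3)} \\
  &= \beta_1 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})\,u_i}{SST_x}
   && \Big(\textstyle\sum_i (x_i - \bar{x}) = 0,\ \sum_i (x_i - \bar{x})\,x_i = SST_x\Big) \\
E(\hat\beta_1 \mid X)
  &= \beta_1 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})\,E(u_i \mid X)}{SST_x}
   = \beta_1
   && \text{(SLR.2, SLR.4)} \\
\hat\beta_0
  &= \bar{y} - \hat\beta_1 \bar{x}
   = \beta_0 + (\beta_1 - \hat\beta_1)\,\bar{x} + \bar{u}
   \;\Longrightarrow\; E(\hat\beta_0 \mid X) = \beta_0 .
\end{align*}
```

Taking expectations over $X$ then gives the unconditional statements in Theorem 2.1.
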
  • Slide 30
  • 30 Variance of OLS estimators. First, we introduce one more assumption. Assumption SLR.5: Homoskedasticity. $Var(u|x) = \sigma^2$. This means that the variance of $u$ does not depend on the value of $x$.
  • Slide 31
  • 31 Combining SLR.5 with SLR.2, we also have SLR.5a: $Var(u_i | X) = \sigma^2$ for $i = 1, \dots, n$, where $X$ denotes the independent variable for all the observations, that is, $x_1, x_2, \dots, x_n$.
  • Slide 32
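  • 32 Theorem 2.2: Under SLR.1 through SLR.5, $Var(\hat\beta_1 | X) = \frac{\sigma^2}{SST_x}$ and $Var(\hat\beta_0 | X) = \frac{\sigma^2\, n^{-1}\sum_{i=1}^{n} x_i^2}{SST_x}$, where $SST_x = \sum_{i=1}^{n}(x_i - \bar{x})^2$. Proof: See front board.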
  • Slide 33
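  • 33 The standard deviations of the estimated parameters are then given by $sd(\hat\beta_1) = \frac{\sigma}{\sqrt{SST_x}}$ and $sd(\hat\beta_0) = \sigma\sqrt{\frac{n^{-1}\sum_{i=1}^{n} x_i^2}{SST_x}}$, where $SST_x = \sum_{i=1}^{n}(x_i - \bar{x})^2$.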
  • Slide 34
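  • 34 Estimating the error variance. In Theorem 2.2, $\sigma^2$ is unknown and has to be estimated. The estimate of $\sigma^2$ is given by $\hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^{n}\hat{u}_i^2 = \frac{SSR}{n-2}$, where $\hat{u}_i$ are the OLS residuals.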
  • Slide 35
  • 35 Theorem 2.3: Unbiased estimator of $\sigma^2$. Under SLR.1 through SLR.5, we have $E(\hat\sigma^2) = \sigma^2$. Proof: See the front board.
  • Slide 36
  • 36 Estimates of the variance and the standard error of the OLS slope parameter. We replace the $\sigma^2$ in Theorem 2.2 by $\hat\sigma^2$ to get the estimate of the variance of the OLS slope parameter. This is given by $\widehat{Var}(\hat\beta_1) = \frac{\hat\sigma^2}{SST_x}$, where $SST_x = \sum_{i=1}^{n}(x_i - \bar{x})^2$. Note the hat over Var, indicating that this is an estimate. The standard error of the OLS estimate is then the square root of the above, $se(\hat\beta_1) = \frac{\hat\sigma}{\sqrt{SST_x}}$. This is the estimated standard deviation of the slope parameter.
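
The steps on the last few slides (fit the line, compute residuals, estimate the error variance with $n - 2$ degrees of freedom, then take the square root of the estimated variance) can be sketched as follows. The function name `slope_standard_error` and the four-point data set are assumptions made for illustration.

```python
import numpy as np

def slope_standard_error(x, y):
    """Return (beta1_hat, sigma2_hat, se_beta1) for a simple OLS regression.

    Hypothetical helper: sigma2_hat = SSR / (n - 2), se = sqrt(sigma2_hat / SST_x).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if n <= 2:
        raise ValueError("need more than 2 observations to estimate sigma^2")
    sst_x = np.sum((x - x.mean()) ** 2)
    beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / sst_x
    beta0_hat = y.mean() - beta1_hat * x.mean()
    u_hat = y - beta0_hat - beta1_hat * x      # OLS residuals
    sigma2_hat = np.sum(u_hat ** 2) / (n - 2)  # estimator of the error variance
    se_beta1 = np.sqrt(sigma2_hat / sst_x)     # standard error of the slope
    return beta1_hat, sigma2_hat, se_beta1

# Made-up data: SST_x = 5, slope = 0.6, SSR = 0.2
b1, s2, se = slope_standard_error([0, 1, 2, 3], [0, 1, 1, 2])
print(b1, s2, se)
```

For this data set the hand calculation gives $\hat\beta_1 = 0.6$, $\hat\sigma^2 = 0.2/2 = 0.1$, and $se(\hat\beta_1) = \sqrt{0.1/5} \approx 0.141$, which the function reproduces up to floating-point error.
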