Topic 4
Ordinary Least Squares
• Suppose that $X$ is a non-random variable.
• $Y$ is a random variable that is affected by $X$ in a linear fashion and by the random variable $\varepsilon$ with $E(\varepsilon) = 0$. That is,

$E(Y) = \beta_1 + \beta_2 X$

Or, $Y = \beta_1 + \beta_2 X + \varepsilon$
[Figures (slide sequence): scatter plots with $Y$ on the vertical axis and $X$ on the horizontal axis, showing the observed points, the actual line $Y = \beta_1 + \beta_2 X$, and the fitted line $Y = b_1 + b_2 X$. Three labelled points A, B and C illustrate that BC is an error of estimation and AC is an effect of the random factor.]
• The Ordinary Least Squares (OLS) estimates are obtained by minimising the sum of the squares of each of these errors.
• The OLS estimates are obtained from the values of $X$ and the actual $Y$ values ($Y_A$) as follows:

Error of estimation: $e \equiv Y_A - Y_E$, where $Y_E$ is the estimated value of $Y$.

$\sum e^2 = \sum [Y_A - Y_E]^2 = \sum [Y_A - (b_1 + b_2 X)]^2$

Setting the partial derivatives with respect to $b_1$ and $b_2$ to zero:

$\partial \sum e^2 / \partial b_1 = \sum 2[Y_A - (b_1 + b_2 X)](-1) = 0$

$\partial \sum e^2 / \partial b_2 = \sum 2[Y_A - (b_1 + b_2 X)](-X) = 0$

The first condition gives

$\sum [Y_A - (b_1 + b_2 X)] = 0$
$-N\bar{Y} + N b_1 + b_2 N\bar{X} = 0$
$b_1 = \bar{Y} - b_2 \bar{X}$ ….. (1)

The second condition gives

$\sum [Y_A - (b_1 + b_2 X)]\,X = 0$
$b_1 \sum X + b_2 \sum X^2 = \sum XY$ ….. (2)

Substituting (1) into (2) and solving yields the estimates.
• These estimates are given below (with the subscript A on $Y$ dropped).
$\hat{\beta}_1 = \dfrac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{N \sum X^2 - (\sum X)^2}$

$\hat{\beta}_2 = \dfrac{N \sum XY - (\sum X)(\sum Y)}{N \sum X^2 - (\sum X)^2}$

• Alternatively,

$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$

$\hat{\beta}_2 = \dfrac{\text{Covariance}(X, Y)}{\text{Variance}(X)}$
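As a quick sanity check, here is a minimal sketch of these closed-form estimates in Python. The function name `ols_two_variable` and the toy data are ours, not from the slides:

```python
def ols_two_variable(x, y):
    # Closed-form OLS estimates for Y = b1 + b2*X.
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    denom = n * sxx - sx ** 2            # N*sum(X^2) - (sum X)^2
    b2 = (n * sxy - sx * sy) / denom     # slope estimate
    b1 = sy / n - b2 * sx / n            # intercept: Ymean - b2*Xmean
    return b1, b2

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
print(ols_two_variable(x, y))            # roughly (0.14, 1.96)
```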
Two Important Results

(a) $\sum e_i = \sum (Y_i - Y_i^E) = 0$, and

(b) $\sum X_{2i} e_i = \sum X_{2i}(Y_i - Y_i^E) = 0$,

where $Y_i^E$ is the estimated value of $Y_i$, and $X_{2i}$ is the same as $X_i$ from before.

Proof of (a):

$\sum (Y_i - Y_i^E) = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_{2i})$
$= \sum Y_i - n\hat{\beta}_1 - n\hat{\beta}_2 \bar{X}$
$= n(\bar{Y} - \hat{\beta}_1 - \hat{\beta}_2 \bar{X})$
$= 0$ [since $\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$]

See the lecture notes for a proof of part (b).
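Both results are easy to verify numerically. A short sketch, continuing the illustrative example above (all names are ours):

```python
# Illustrative check of results (a) and (b): the OLS residuals
# sum to zero and are orthogonal to the regressor X.
b1, b2 = ols_two_variable(x, y)
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
print(sum(residuals))                                  # (a): ~ 0
print(sum(xi * ei for xi, ei in zip(x, residuals)))    # (b): ~ 0
```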
Total sum of squares (TSS) $\equiv \sum (Y_i - \bar{Y})^2$

Residual sum of squares (RSS) $\equiv \sum (Y_i - Y_i^E)^2$

Explained sum of squares (ESS) $\equiv \sum (Y_i^E - \bar{Y})^2$
To prove that TSS = RSS + ESS:

$\text{TSS} \equiv \sum (Y_i - \bar{Y})^2$
$= \sum \{(Y_i - Y_i^E) + (Y_i^E - \bar{Y})\}^2$
$= \sum (Y_i - Y_i^E)^2 + \sum (Y_i^E - \bar{Y})^2 + 2 \sum (Y_i - Y_i^E)(Y_i^E - \bar{Y})$
$= \text{RSS} + \text{ESS} + 2 \sum (Y_i - Y_i^E)(Y_i^E - \bar{Y})$

Now,

$\sum (Y_i - Y_i^E)(Y_i^E - \bar{Y}) = \sum (Y_i - Y_i^E)\,Y_i^E - \bar{Y} \sum (Y_i - Y_i^E)$
$= \sum (Y_i - Y_i^E)\,Y_i^E$ [by (a) above]

and

$\sum (Y_i - Y_i^E)\,Y_i^E = \sum (Y_i - Y_i^E)(\hat{\beta}_1 + \hat{\beta}_2 X_i)$
$= \hat{\beta}_1 \sum (Y_i - Y_i^E) + \hat{\beta}_2 \sum X_i (Y_i - Y_i^E)$
$= 0$ [by (a) and (b) above]

Hence TSS = RSS + ESS.
$R^2 \equiv \text{ESS}/\text{TSS}$

Since TSS = RSS + ESS and both RSS and ESS are non-negative, it follows that $0 \le R^2 \le 1$.
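Continuing the toy example, a sketch verifying the decomposition and computing $R^2$ (names ours):

```python
# Illustrative check that TSS = RSS + ESS, and computation of R^2.
y_mean = sum(y) / len(y)
y_hat = [b1 + b2 * xi for xi in x]                    # fitted values Y_i^E
tss = sum((yi - y_mean) ** 2 for yi in y)
rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
ess = sum((yh - y_mean) ** 2 for yh in y_hat)
print(tss, rss + ess)       # the two agree up to rounding error
print(ess / tss)            # R^2, which lies between 0 and 1
```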
Topic 5
Properties of Estimators
In the discussion that follows, $\hat{\theta}$ is an estimator of the parameter of interest, $\theta$.

Bias of $\hat{\theta} \equiv E(\hat{\theta}) - \theta$

$\hat{\theta}$ is unbiased if Bias of $\hat{\theta}$ = 0.
$\hat{\theta}$ is negatively biased if Bias of $\hat{\theta}$ < 0.
$\hat{\theta}$ is positively biased if Bias of $\hat{\theta}$ > 0.
The Mean Squared Error (MSE) of estimation for $\hat{\theta}$ is given as

$\text{MSE}(\hat{\theta}) \equiv E[(\hat{\theta} - \theta)^2]$
$\equiv E[\{\hat{\theta} - E(\hat{\theta}) + E(\hat{\theta}) - \theta\}^2]$
$\equiv E[\{\hat{\theta} - E(\hat{\theta})\}^2] + E[\{E(\hat{\theta}) - \theta\}^2] + 2E[\{\hat{\theta} - E(\hat{\theta})\}\{E(\hat{\theta}) - \theta\}]$
$\equiv \text{Var}(\hat{\theta}) + \{E(\hat{\theta}) - \theta\}^2 + 2E[\{\hat{\theta} - E(\hat{\theta})\}\{E(\hat{\theta}) - \theta\}]$

Now,

$E[\{\hat{\theta} - E(\hat{\theta})\}\{E(\hat{\theta}) - \theta\}]$
$\equiv \{E(\hat{\theta}) - E(\hat{\theta})\}\{E(\hat{\theta}) - \theta\}$
$\equiv 0 \cdot \{E(\hat{\theta}) - \theta\} = 0$

(the second factor is a constant, so the expectation passes through to the first factor, which has mean zero).

Therefore,

$\text{MSE}(\hat{\theta}) \equiv \text{Var}(\hat{\theta}) + \{E(\hat{\theta}) - \theta\}^2$, that is,
$\text{MSE}(\hat{\theta}) \equiv \text{Var}(\hat{\theta}) + (\text{bias})^2$.

If $\hat{\theta}$ is unbiased, that is, if $E(\hat{\theta}) - \theta = 0$, then we have

$\text{MSE}(\hat{\theta}) \equiv \text{Var}(\hat{\theta})$
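The identity MSE = Var + (bias)² can be illustrated by simulation. Below is a sketch using a deliberately biased estimator of a normal mean; the +0.1 shift and all names are our assumptions, not from the slides:

```python
import random

# Deliberately biased estimator of a mean: sample mean plus 0.1.
theta, n, reps = 5.0, 30, 5000
estimates = []
for _ in range(reps):
    sample = [random.gauss(theta, 1.0) for _ in range(n)]
    estimates.append(sum(sample) / n + 0.1)      # built-in bias of +0.1
mean_est = sum(estimates) / reps
var_est = sum((e - mean_est) ** 2 for e in estimates) / reps
mse = sum((e - theta) ** 2 for e in estimates) / reps
print(mse, var_est + (mean_est - theta) ** 2)    # the two agree closely
```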
An unbiased estimator $\hat{\theta}$ of a parameter $\theta$ is efficient if and only if it has the smallest variance of all unbiased estimators. That is, for any other unbiased estimator $p$ of $\theta$,

$\text{Var}(\hat{\theta}) \le \text{Var}(p)$
An estimator $\hat{\theta}$ is said to be consistent if it converges in probability to $\theta$. That is,

$\lim_{n \to \infty} \text{Prob}(|\hat{\theta} - \theta| > \delta) = 0$ for every $\delta > 0$.

When the above condition holds, $\theta$ is said to be the probability limit of $\hat{\theta}$; that is, $\text{plim}\,\hat{\theta} = \theta$.

Sufficient conditions for consistency: if the mean of $\hat{\theta}$ converges to $\theta$ and $\text{Var}(\hat{\theta})$ converges to zero (as $n$ approaches $\infty$), then $\hat{\theta}$ is consistent.

That is, $\hat{\theta}_n$ is consistent if it can be shown that

$\lim_{n \to \infty} E(\hat{\theta}_n) = \theta$ and $\lim_{n \to \infty} \text{Var}(\hat{\theta}_n) = 0$.
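These sufficient conditions are easy to see by simulation for the sample mean of a normal population (a sketch; the setup is ours): its expectation equals $\theta$ for every $n$, and its variance shrinks toward zero as $n$ grows.

```python
import random

# Sample mean of n draws from N(theta, 1): its mean stays at theta and
# its variance shrinks roughly like 1/n, so the estimator is consistent.
theta, reps = 5.0, 2000
for n in (10, 100, 1000):
    means = [sum(random.gauss(theta, 1.0) for _ in range(n)) / n
             for _ in range(reps)]
    avg = sum(means) / reps
    var = sum((m - avg) ** 2 for m in means) / reps
    print(n, round(avg, 3), round(var, 5))       # avg ~ theta, var -> 0
```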
The Regression Model with TWO Variables
The Model: $Y = \beta_1 + \beta_2 X + \varepsilon$
Y is the DEPENDENT variable
X is the INDEPENDENT variable
$Y_i = \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$

The OLS estimates $\hat{\beta}_1$ and $\hat{\beta}_2$ are sample statistics used to estimate $\beta_1$ and $\beta_2$ respectively.

Here $X_{1i} \equiv 1$ for all $i$, and $X_2$ is nothing but $X$.
Assumptions about $X_2$:

(1a) $X_2$ is non-random (chosen by the investigator).

(1b) Random sampling is performed from a population of fixed values of $X_2$.

(1c) $\lim_{n \to \infty} (1/n) \sum x_{2i}^2 = Q > 0$, where $x_{2i} \equiv X_{2i} - \bar{X}_2$.

(1d) $\lim_{n \to \infty} (1/n) \sum X_{2i} = P > 0$.
Assumptions about the disturbance term $\varepsilon$:

2a. $E(\varepsilon_i) = 0$.

2b. $\text{Var}(\varepsilon_i) = \sigma^2$ for all $i$ (homoskedasticity).

2c. $\text{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \ne j$ (the $\varepsilon$ values are uncorrelated across observations).

2d. The $\varepsilon_i$ all have a normal distribution.
Result: $\hat{\beta}_2$ is linear in the dependent variable $Y_i$.

$\hat{\beta}_2 = \dfrac{\text{Covariance}(X, Y)}{\text{Variance}(X)} = \dfrac{\sum (Y_i - \bar{Y})(X_i - \bar{X})}{\sum (X_i - \bar{X})^2}$

Proof:

$\hat{\beta}_2 = \dfrac{\sum Y_i (X_i - \bar{X})}{\sum (X_i - \bar{X})^2}$

(the term in $\bar{Y}$ drops out because $\sum (X_i - \bar{X}) = 0$)

$= \sum C_i Y_i + K$,

where the $C_i$ and $K$ are constants.

Therefore, $\hat{\beta}_2$ is a linear function of $Y_i$.

Since $Y_i = \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$, $\hat{\beta}_2$ is a linear function of $\varepsilon_i$ and hence is normally distributed.

Similarly, $\hat{\beta}_1$ is a linear function of $Y_i$ (and hence $\varepsilon_i$) and is normally distributed.
Both $\hat{\beta}_1$ and $\hat{\beta}_2$ are unbiased estimators of $\beta_1$ and $\beta_2$ respectively. That is,

$E(\hat{\beta}_1) = \beta_1$ and $E(\hat{\beta}_2) = \beta_2$
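A small simulation can illustrate the unbiasedness of $\hat{\beta}_2$: generate many samples from $Y = \beta_1 + \beta_2 X + \varepsilon$ over fixed $X$ values and average the slope estimates. This is a sketch reusing the illustrative `ols_two_variable` function from earlier; all parameter values are our assumptions:

```python
import random

# Simulate Y = beta1 + beta2*X + eps repeatedly over fixed X values
# (assumption 1a) and average the OLS slope estimates.
beta1, beta2, sigma = 2.0, 0.5, 1.0
xs = [float(i) for i in range(1, 21)]
reps = 2000
b2_total = 0.0
for _ in range(reps):
    ys = [beta1 + beta2 * xi + random.gauss(0.0, sigma) for xi in xs]
    _, b2_rep = ols_two_variable(xs, ys)
    b2_total += b2_rep
print(b2_total / reps)      # close to beta2 = 0.5
```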
Each of $\hat{\beta}_1$ and $\hat{\beta}_2$ is an efficient estimator of $\beta_1$ and $\beta_2$ respectively.

Thus, each of $\hat{\beta}_1$ and $\hat{\beta}_2$ is a

Best (efficient)
Linear (in the dependent variable $Y_i$)
Unbiased
Estimator

of $\beta_1$ and $\beta_2$ respectively.

Each of $\hat{\beta}_1$ and $\hat{\beta}_2$ is a consistent estimator of $\beta_1$ and $\beta_2$ respectively.
Also,

$\text{Var}(\hat{\beta}_1) = \left( \dfrac{1}{n} + \dfrac{\bar{X}_2^2}{\sum x_{2i}^2} \right) \sigma^2$

$\text{Var}(\hat{\beta}_2) = \dfrac{\sigma^2}{\sum x_{2i}^2}$

$\text{Cov}(\hat{\beta}_1, \hat{\beta}_2) = \dfrac{-\bar{X}_2 \, \sigma^2}{\sum x_{2i}^2}$
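These formulas translate directly into code. A sketch (the helper name `ols_variances` is ours), taking $\sigma^2$ as known for now:

```python
# Variance of the OLS intercept and slope, and their covariance,
# for a given error variance sigma2 (treated as known here).
def ols_variances(x, sigma2):
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)   # sum of squared deviations
    var_b1 = (1.0 / n + x_mean ** 2 / sxx) * sigma2
    var_b2 = sigma2 / sxx
    cov_b1_b2 = -x_mean * sigma2 / sxx
    return var_b1, var_b2, cov_b1_b2
```

Note that the same sum of squared deviations $\sum x_{2i}^2$ appears in all three quantities, so a small spread in $X$ inflates every one of them.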
$\lim_{n \to \infty} \text{Var}(\hat{\beta}_2) = \lim_{n \to \infty} \dfrac{\sigma^2}{\sum x_{2i}^2} = \lim_{n \to \infty} \dfrac{\sigma^2 / n}{\sum x_{2i}^2 / n} = 0 / Q = 0$ [using assumption (1c)]

Because $\hat{\beta}_2$ is an unbiased estimator of $\beta_2$ and $\lim_{n \to \infty} \text{Var}(\hat{\beta}_2) = 0$,

$\hat{\beta}_2$ is a consistent estimator of $\beta_2$.
The variance $\sigma^2$ of the random term $\varepsilon$ is not known.

To perform statistical analysis, we estimate $\sigma^2$ by

$\hat{\sigma}^2 \equiv \text{RSS}/(n - 2)$

This is because $\hat{\sigma}^2$ is an unbiased estimator of $\sigma^2$.
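In code, this estimate plugs straight into the variance formulas above (a sketch continuing the toy example; `ols_variances` is the illustrative helper defined earlier):

```python
# Estimate sigma^2 by RSS/(n-2), then plug it into the variance formulas.
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(e * e for e in residuals) / (len(x) - 2)
print(sigma2_hat)
print(ols_variances(x, sigma2_hat))   # estimated Var(b1), Var(b2), Cov
```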