Three Variable Regression Model
• Yi = B1 + B2X2i + B3X3i (nonstochastic form, PRF)
• Yi = B1 + B2X2i + B3X3i + ui (stochastic form)
• B2, B3 called partial regression or partial slope coefficients
• B2 measures the change in the mean value of Y per unit change in X2, holding the value of X3 constant
• Yi = b1+b2X2i+b3X3i+ei SRF
Assumptions
• Linear relationship
• Xs are non-stochastic variables.
• No linear relationship exists between two or more independent variables (no multicollinearity). Ex: X2i = 3 + 2X3i
• The error has zero expected value, constant variance, and is normally distributed
• RSS = ∑ei2 = ∑(Yi – Ŷi)2 = ∑(Yi – b1 – b2X2i – b3X3i)2
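As a quick sketch of how the estimates that minimize this RSS are computed in practice (using NumPy and synthetic data, since the slides give no dataset here; the "true" coefficients below are made up for illustration):

```python
import numpy as np

# Synthetic data for illustration only (not from the slides)
rng = np.random.default_rng(0)
n = 50
X2 = rng.uniform(0, 10, n)
X3 = rng.uniform(0, 5, n)
u = rng.normal(0, 1, n)
Y = 2.0 + 1.5 * X2 - 0.8 * X3 + u          # B1 = 2, B2 = 1.5, B3 = -0.8

# Design matrix: a column of ones for b1, then X2 and X3
X = np.column_stack([np.ones(n), X2, X3])

# Least squares: b minimizes RSS = sum((Yi - b1 - b2*X2i - b3*X3i)^2)
b, rss, rank, _ = np.linalg.lstsq(X, Y, rcond=None)
b1, b2, b3 = b
print("b1, b2, b3 =", b1, b2, b3)
```

With enough observations and well-behaved errors, the estimates land close to the coefficients used to generate the data.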
Least squares estimators
• Like 2-variable case, we can derive formulae for var(b1), var(b2) & var(b3) and hence their S.E.s
• We can also estimate σ2 as σ̂2 = ∑ei2/(n – 3)
• Goodness of fit: R2 = ESS/TSS
R2 = [b2∑yix2i + b3∑yix3i]/∑yi2 (lowercase letters denote deviations from sample means)
• 0 ≤ R2 ≤ 1
Testing of hypothesis, t-test
• Say, Ŷi = -1336.09 + 12.7413X2i+85.7640X3i
(175.2725) (0.9123) (8.8019)
p=0.000 0.000 0.000
R2 = 0.89, n =32
• H0: B1=0, b1/se(b1)~ t(n-3)
• H0: B2=0, b2/se(b2)~ t(n-3)
• H0: B3=β, (b3 - β) /se(b3)~ t(n-3)
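The t statistics for this example can be reproduced directly from the reported coefficients and standard errors (a sketch; the 5% two-sided critical value for t(29) is an approximate table value):

```python
# t tests for the example above. H0: B = 0; under H0, b/se(b) ~ t(n - 3).
n = 32
t_crit = 2.045  # approx. 5% two-sided critical value for t(29), from a t table

coefs = {"B1": (-1336.09, 175.2725),
         "B2": (12.7413, 0.9123),
         "B3": (85.7640, 8.8019)}

t_stats = {name: b / se for name, (b, se) in coefs.items()}
for name, t in t_stats.items():
    verdict = "reject H0" if abs(t) > t_crit else "cannot reject H0"
    print(f"{name}: t = {t:.2f} -> {verdict}")
```

All three |t| values far exceed 2.045, consistent with the reported p-values of 0.000.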
Testing Joint Hypothesis, F Test
H0 : B2 = B3 = 0
Or, H0 : R2 = 0
• X2 & X3 explain zero percent of the variation of Y
H1: At least one B ≠ 0
• A test of either hypothesis is called a test of overall significance of the estimated multiple regression
• We know, TSS = ESS + RSS
• The test statistic is F = [ESS/(k – 1)]/[RSS/(n – k)], which follows the F(k – 1, n – k) distribution under H0
F test
• If the computed F value exceeds the critical F value, we reject the null hypothesis that the impact of the explanatory variables is simultaneously equal to zero
• Otherwise, we cannot reject the null hypothesis
• It may happen that not all the explanatory variables individually have much impact on the dependent variable (i.e., some of the t values may be statistically insignificant), yet all of them collectively influence the dependent variable (H0 is rejected in the F test)
• This happens only when we have the problem of multicollinearity
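A sketch of the F computation using this example's R2 = 0.89, n = 32, k = 3 (the critical value is an approximate table value):

```python
# F test for H0: B2 = B3 = 0, written in terms of R^2
# (equivalent to [ESS/(k-1)]/[RSS/(n-k)], since R^2 = ESS/TSS):
# F = [R^2/(k-1)] / [(1-R^2)/(n-k)]
r2, n, k = 0.89, 32, 3
F = (r2 / (k - 1)) / ((1 - r2) / (n - k))

f_crit = 3.33  # approx. 5% critical value for F(2, 29), from an F table
print(round(F, 1), "reject H0" if F > f_crit else "cannot reject H0")
```

The computed F (about 117) dwarfs the critical value, so the regressors are jointly significant.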
Specification error
• In this example we have seen that both explanatory variables are, individually and collectively, significantly different from zero
• If we omit either of these explanatory variables from our model, we commit a specification error
• What would be b1, b2 & R2 in 2-variable model?
Specification error
• Ŷi = -1336.09 + 12.7413X2i + 85.7640X3i
(175.2725) (0.9123) (8.8019)
p = 0.000 0.000 0.000
R2 = 0.89, n = 32
• Ŷi = -191.66 + 10.48X2i
(264.43) (1.79)
R2 = 0.53
• Ŷi = 807.95 + 54.57X3i
(231.95) (23.57)
R2 = 0.15
R2 versus Adjusted R2
• The larger the number of explanatory variables in the model, the higher the R2 will be
• However, R2 does not take into account dof
• Therefore, comparing the R2 values of two models with the same dependent variable but different numbers of explanatory variables is essentially like comparing apples and bananas
• We need a measure of fit that is adjusted for the no. of explanatory variables in the model
R2 versus Adjusted R2
• Such a measure is called Adj R2
• If k > 1, Adj R2 ≤ R2; as the number of explanatory variables in the model increases, Adj R2 becomes increasingly smaller than R2
• It enables us to compare two models that have the same dependent variable but different numbers of independent variables
• In our example, it can be shown that Adj R2=0.88 < 0.89 (R2)
Adj R2 = 1 – (1 – R2)(n – 1)/(n – k)
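Plugging this example's numbers into the adjusted R2 formula reproduces the 0.88 reported on the earlier slide:

```python
# Adj R2 = 1 - (1 - R2)(n - 1)/(n - k), with this example's values
r2, n, k = 0.89, 32, 3
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
print(round(adj_r2, 2))   # 0.88, as reported on the slide
```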
When to add an additional variable?
• We are often faced with the problem of deciding among several competing explanatory variables
• Common practice is to add variables as long as Adj R2 increases even though its numerical value may be smaller than R2
Computer output & Reporting
The Chicken Consumption Example
• Explain US Consumption of Chicken
• Time Series Observations - 1950-1984
Variable Definitions
• CHCONS - Chicken consumption in the US
• LDY - Log of disposable income in the US
• PC/PB - Price of Chicken relative to the Price of ‘Best Red Meat’
Data Time plots
Actual plots of the data over time follow
• Note the trends and cycles
• What are the relationships between the variables?
• Are movements in CHCONS related to movements in LDY and PC/PB?
[Figure: Time plot - CHCONS Actual Data (CHCONS vs. Year, 1950 – 1984)]
[Figure: Timeplot - LDY Actual Data (LDY vs. Year)]
[Figure: Timeplot - PC/PB Actual Data (PC/PB vs. Year, 1950 – 1983)]
Chicken Consumption vs. Income
• There may be a relationship between CHCONS and LDY
• A simple plot of the two variables seems to reveal this
• Note the positive relationship
[Figure: Scatter Plot - CHCONS vs. LYD]
Chicken Consumption vs. Relative Price of Chicken
• There may also be a relationship between CHCONS and PC/PB
• A plot of these two variables shows the relationship
• Note the negative relationship
[Figure: Scatter Plot - CHCONS vs. PC/PB]
CHCONS = f(LDY)
• Simple linear regression captures the relationship between CHCONS and LDY, assuming no other relationships
• This regression explains much of the change in CHCONS, but not everything
• The plotted regression line shows the hypothesized relationship and the actual data
CHCONS = f(LDY)

          LDY      Const.
Coeff     15.86    -92.17
SE(b)      0.53      4.34

R2 = 0.9641, SE(y) = 2.03, F = 879.05, df = 33
SSReg = 3639.12 (also called SSE), SSResid = 136.61 (also called SSR)
[Figure: Regression Line - CHCONS = f(LYD), fitted line with CHCONS = f(LYD) actual data]
CHCONS = f(PC/PB)
• Another simple regression examines the relationship between CHCONS and PC/PB
• While the line explains some of the variation of CHCONS, there is more unexplained error
CHCONS = f(PC/PB)
          PC/PB    Const.
Coeff     -28.83    50.77
SE(b)       2.93     1.75

R2 = 0.746, SE(y) = 5.39, F = 97.14, df = 33
SSReg = 2818.32 (also called ESS), SSResid = 957.42 (also called RSS)
[Figure: Regression Line - CHCONS = f(PC/PB), fitted line with CHCONS = f(PC/PB) actual data]
CHCONS = f(LDY, PC/PB)

          LDY     PC/PB    Const.
Coeff     12.79   -8.08    -63.19
SE(b)      0.54    1.12      4.84

R2 = 0.986, SE(y) = 1.27, F = 1149.89, df = 32
SSReg = 3723.92 (SSE), SSResid = 51.82 (SSR)
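As a consistency check on this output (a sketch using only the reported sums of squares), R2 can be recovered from SSReg and SSResid:

```python
# R^2 = SSReg / TSS = SSReg / (SSReg + SSResid)
ss_reg, ss_resid = 3723.92, 51.82
r2 = ss_reg / (ss_reg + ss_resid)
print(round(r2, 3))   # 0.986, matching the reported value
```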
[Figure: Actual vs. Predicted - CHCONS over time: actual data and CHCONS = f(LDY, PC/PB) predictions]
• Table 7.8 Gujarati: US Defense budget outlays 1962 – 1981
Yt= Defense budget outlays for year t ($ Bn)
X2t=GNP for year t ($ Bn)
X3t=US military sales/assistance ($ Bn)
X4t=Aerospace industry sales ($ Bn)
X5t = Military conflicts involving troops (dummy variable)
  = 0, if troops < 100000
  = 1, if troops > 100000
• Table 8.10, Gujarati: data used by a telephone cable manufacturer to predict sales to a major consumer for the period 1968 – 1983
Y = annual sales in MPF (million paired feet)
X2 = GNP (billion $)
X3 = housing starts (1000s of units)
X4 = unemployment rate (%)
X5 = prime rate lagged 6 months
X6 = customer line gains (%) (introduced later)
• Table 7.10, Gujarati
Consider the following demand function for money in the US for 1980 – 1998:

Mt = b1 · Yt^b2 · rt^b3 · e^ut

where M = real money demand, Y = real GDP, r = interest rate (LTRATE: long-term interest rate, 30-year Treasury bond; TBRATE: 3-month Treasury bill rate)