Regression Models. Professor William Greene, Stern School of Business, IOMS Department, Department of Economics. Regression and Forecasting Models. Part 1 – Simple Linear Model. Theory. Demand Theory: Q = f(Price). “The Law of Demand”: Demand curves slope downward.
Part 1: Simple Linear Model
Regression Models
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics
Regression and Forecasting Models
Part 1 – Simple Linear Model
Theory
Demand Theory: Q = f(Price)
“The Law of Demand”: Demand curves slope downward
What does “ceteris paribus” mean here?
Data on the U.S. Gasoline Market
Quantity = G = Expenditure / Price
Shouldn’t Demand Curves Slope Downward?
Figure: Scatterplot of GasPrice (vertical axis) vs. G (horizontal axis)
Data on 62 Movies in 2010
Average Box Office Revenue is about $20.7 Million
Is There a Theory for This?
Scatter plot of box office revenues vs. number of “Can’t Wait To See It” votes on Fandango for 62 movies.
Average Box Office by Internet Buzz Index
Plotted points = Average Box Office for Buzz in Interval
Deterministic Relationship: Not a Theory
Expected High Temperatures, August 11-20, 2013, ZIP 10012, NY
Probabilistic Relationship
What Explains the Noise?
Fuel Bill = Function of Rooms + Random Variation
Movie Buzz Data
Probabilistic Relationship?
The Regression Model
\( y = \beta_0 + \beta_1 x + \varepsilon \)
y = dependent variable
x = independent variable
The ‘regression’ is the deterministic part, \( \beta_0 + \beta_1 x \)
The ‘disturbance’ (noise) is \( \varepsilon \).
The regression model is \( E[y|x] = \beta_0 + \beta_1 x \)
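A small simulation may make the split between the deterministic part and the disturbance concrete. This is only a sketch: the parameter values (intercept 1.0, slope 2.0), the normal disturbance with standard deviation 0.5, and the function names are illustrative assumptions, not values from the slides.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
BETA0 = 1.0   # y intercept
BETA1 = 2.0   # slope
SIGMA = 0.5   # standard deviation of the disturbance (assumed normal)


def regression(x):
    """Deterministic part of the model: E[y|x] = beta0 + beta1 * x."""
    return BETA0 + BETA1 * x


def draw_y(x, rng):
    """One observation: deterministic part plus a random disturbance."""
    epsilon = rng.gauss(0.0, SIGMA)
    return regression(x) + epsilon


rng = random.Random(12345)  # fixed seed so the run is repeatable
x = 2.0
draws = [draw_y(x, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

print("E[y|x=2] =", regression(x))
print("average of simulated y at x=2 =", round(sample_mean, 3))
```

Individual draws scatter around the line, but their average at a fixed x settles near the deterministic part, which is what E[y|x] describes.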
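Linear Regression Model
Figure: the line \( E[y|x] = \beta_0 + \beta_1 x \), with y on the vertical axis and x on the horizontal axis
\( \beta_0 \) = y intercept
\( \beta_1 \) = slope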
The Model
Constructed to provide a framework for interpreting the observed data
What is the meaning of the observed relationship (assuming there is one)?
How it’s used
Prediction: What reason is there to assume that we can use sample observations to predict outcomes?
Testing relationships
The slope is the interesting quantity.
Each additional year of education is associated with an increase of 3.611 years in disability-adjusted life expectancy.
A Cost Model
Electricity.mpj
Total cost in $Million
Output in Million KWH
N = 123 American electric utilities
Model: \( \text{Cost} = \beta_0 + \beta_1\,\text{KWH} + \varepsilon \)
Cost Relationship
Figure: Scatterplot of Cost (vertical axis) vs. Output (horizontal axis)
Sample Regression
Interpreting the Model
Cost = 2.44 + 0.00529 Output + e
Cost is $Million, Output is Million KWH.
Fixed Cost = Cost when output = 0
Fixed Cost = $2.44 Million
Marginal cost = Change in cost / change in output
= 0.00529 * $Million / Million KWH
= 0.00529 $/KWH = 0.529 cents/KWH.
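The unit arithmetic in this interpretation can be checked with a short sketch. The coefficients are the ones reported on the slide; the function and variable names are illustrative only.

```python
# Estimated coefficients from the slide: Cost = 2.44 + 0.00529 * Output
B0 = 2.44      # intercept, in $Million
B1 = 0.00529   # slope, in $Million per Million KWH


def predicted_cost(output_million_kwh):
    """Predicted total cost in $Million for a given output in Million KWH."""
    return B0 + B1 * output_million_kwh


# Fixed cost is the predicted cost when output is zero.
fixed_cost = predicted_cost(0.0)

# Marginal cost is the change in cost per unit change in output.
marginal_cost = predicted_cost(1001.0) - predicted_cost(1000.0)

# $Million / Million KWH: the millions cancel, leaving $/KWH.
marginal_cost_cents_per_kwh = 100 * B1

print("fixed cost ($Million):", fixed_cost)
print("marginal cost ($/KWH):", round(marginal_cost, 5))
print("marginal cost (cents/KWH):", round(marginal_cost_cents_per_kwh, 3))
```

Because the model is linear, the marginal cost is the same at every output level; evaluating the difference at 1000 and 1001 Million KWH is an arbitrary choice.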
Covariation and Causality
Figure: Fitted Line Plot of DALE (vertical axis) vs. EDUC (horizontal axis)
DALE = 35.16 + 3.611 EDUC
S = 7.87034, R-Sq = 59.2%, R-Sq(adj) = 59.0%
Does more education make you live longer (on average)?
Causality?
Height (inches) and Income ($/mo.) in first post-MBA Job (men). WSJ, 12/30/86.
Ht.   Inc.     Ht.   Inc.     Ht.   Inc.
70    2990     68    2910     75    3150
67    2870     66    2840     68    2860
69    2950     71    3180     69    2930
70    3140     68    3020     76    3210
65    2790     73    3220     71    3180
73    3230     73    3370     66    2670
64    2880     70    3180     69    3050
70    3140     71    3340     65    2750
69    3000     69    2970     67    2960
73    3170     73    3240     70    3050
Estimated Income = -451 + 50.2 Height
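The fitted line can be reproduced from the 30 (height, income) pairs listed on this slide. The sketch below uses the covariance-over-variance formula for the slope; the function name `least_squares` is an illustrative choice. The fit measures association only and says nothing on its own about whether height causes income.

```python
# (height in inches, income in $/mo.) pairs as listed on the slide
heights = [70, 68, 75, 67, 66, 68, 69, 71, 69, 70,
           68, 76, 65, 73, 71, 73, 73, 66, 64, 70,
           69, 70, 71, 65, 69, 69, 67, 73, 73, 70]
incomes = [2990, 2910, 3150, 2870, 2840, 2860, 2950, 3180, 2930, 3140,
           3020, 3210, 2790, 3220, 3180, 3230, 3370, 2670, 2880, 3180,
           3050, 3140, 3340, 2750, 3000, 2970, 2960, 3170, 3240, 3050]


def least_squares(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample covariance of x and y, and sample variance of x
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(heights, incomes)
print("Estimated Income = %.0f + %.1f Height" % (b0, b1))
```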
How to compute the y intercept, b0, and the slope, b1, in y = b0 + b1x.
Least Squares Regression
Fitting a Line to a Set of Points
Figure: Scatterplot of PerCapitaG (vertical axis) vs. Income (horizontal axis), with a fitted line and the residual at a point \( (X_i, Y_i) \)
Choose b0 and b1 to minimize the sum of squared residuals: Gauss’s method of least squares.
\[ SS = \sum_{i=1}^{N} [y_i - b_0 - b_1 x_i]^2 = \sum_{i=1}^{N} [y_i - (b_0 + b_1 x_i)]^2 = \sum_{i=1}^{N} e_i^2 \]
Residuals: \( e_i = y_i - (b_0 + b_1 x_i) = y_i - \hat{y}_i \)
Predictions: \( \hat{y}_i = b_0 + b_1 x_i \)
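The criterion can be written as a small function that scores any candidate line. This is a sketch on a made-up four-point data set (the `xs` and `ys` values are hypothetical, not from the slides); it compares the least squares line for those points with one other candidate line.

```python
# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def sum_of_squares(b0, b1, x, y):
    """Sum of squared residuals e_i = y_i - (b0 + b1 * x_i)."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Least squares coefficients for this data set
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1_ls = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
         / sum((xi - x_bar) ** 2 for xi in xs))
b0_ls = y_bar - b1_ls * x_bar

ss_ls = sum_of_squares(b0_ls, b1_ls, xs, ys)
ss_other = sum_of_squares(0.0, 2.0, xs, ys)  # another candidate line

print("least squares line SS:", ss_ls)
print("candidate line (b0=0, b1=2) SS:", ss_other)
```

Any line other than the least squares line gives a larger sum of squares on the same data.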
Computing the Least Squares Parameters b0 and b1
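4 numbers are needed:
\[ \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i = 20.721, \qquad \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = 0.48242 \]
\[ \text{Var}(x) = s_x^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2 = 0.02453 \]
\[ \text{Cov}(x,y) = s_{xy} = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) = 1.784 \]
\[ b_1 = \frac{s_{xy}}{s_x^2} = \frac{1.784}{0.02453} = 72.7181 \]
\[ b_0 = \bar{y} - b_1 \bar{x} = 20.721 - (72.7181)(0.48242) = -14.36 \]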
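The same two steps can be sketched in a few lines, starting from the four summary numbers reported on the slide. Because those inputs are rounded, the slope computed here comes out near 72.73 rather than exactly 72.7181; the slide's figure was presumably computed from unrounded values, which is an assumption on my part.

```python
# The four summary numbers reported on the slide (rounded)
y_bar = 20.721    # mean of y
x_bar = 0.48242   # mean of x
var_x = 0.02453   # sample variance of x
cov_xy = 1.784    # sample covariance of x and y

# Slope: covariance divided by variance of x
b1 = cov_xy / var_x

# Intercept: forces the line through the point of means (x_bar, y_bar)
b0 = y_bar - b1 * x_bar

print("b1 =", round(b1, 4))
print("b0 =", round(b0, 2))
```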
b0 = -14.36
b1 = 72.718
Least Squares Uses Calculus
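\[ SS = \frac{1}{N-1}\sum_{i=1}^{N} (y_i - b_0 - b_1 x_i)^2 \]
\[ \frac{\partial SS}{\partial b_0} = \frac{1}{N-1}\sum_{i=1}^{N} \frac{\partial (y_i - b_0 - b_1 x_i)^2}{\partial b_0} = \frac{1}{N-1}\sum_{i=1}^{N} 2(y_i - b_0 - b_1 x_i)(-1) = 0 \]
\[ \frac{\partial SS}{\partial b_1} = \frac{1}{N-1}\sum_{i=1}^{N} \frac{\partial (y_i - b_0 - b_1 x_i)^2}{\partial b_1} = \frac{1}{N-1}\sum_{i=1}^{N} 2(y_i - b_0 - b_1 x_i)(-x_i) = 0 \]
The solution is \( b_0 = \bar{y} - b_1 \bar{x} \) where
\[ b_1 = \frac{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2} \]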
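The two first-order conditions can be checked numerically: at the least squares solution both partial derivatives should be zero up to floating-point error. The sketch below uses a made-up five-point data set (the values are hypothetical) and keeps the 1/(N-1) scaling used on this slide, which does not change the minimizing values.

```python
# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]


def solve(x, y):
    """Closed-form least squares solution (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def gradient(b0, b1, x, y):
    """Partial derivatives of SS with respect to b0 and b1."""
    n = len(x)
    d_b0 = sum(2 * (yi - b0 - b1 * xi) * (-1) for xi, yi in zip(x, y)) / (n - 1)
    d_b1 = sum(2 * (yi - b0 - b1 * xi) * (-xi) for xi, yi in zip(x, y)) / (n - 1)
    return d_b0, d_b1


b0, b1 = solve(xs, ys)
g0, g1 = gradient(b0, b1, xs, ys)
print("b0, b1 =", b0, b1)
print("gradient at the solution =", g0, g1)

# Away from the solution the derivatives are not zero.
print("gradient at (b0 + 1, b1) =", gradient(b0 + 1.0, b1, xs, ys))
```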
Figure: the least squares line (b0 = -14.36, b1 = 72.718) compared with an alternative line (b0 = -20.00, b1 = 73.50), each labeled with its sum of squares
Least squares minimizes the sum of squared deviations from the line.
Summary
Theory vs. practice
Linear Relationship
  Deterministic
  Random, stochastic, ‘probabilistic’
  Mean is a function of x
Regression Relationship
Causality vs. correlation
Least squares