Part 1: Simple Linear Model, 1-1/30. Regression Models. Professor William Greene, Stern School of Business, IOMS Department, Department of Economics


Page 1: Regression Models

Regression Models
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics

Page 2: Regression Models


Regression and Forecasting Models

Part 1 – Simple Linear Model

Page 3: Regression Models

Theory

Demand Theory: Q = f(Price)
“The Law of Demand”: Demand curves slope downward.
What does “ceteris paribus” mean here?

Page 4: Regression Models


Data on the U.S. Gasoline Market

Quantity = G = Expenditure / Price

Page 5: Regression Models

Shouldn’t Demand Curves Slope Downward?

[Figure: Scatterplot of GasPrice vs G]

Page 6: Regression Models


Data on 62 Movies in 2010

Page 7: Regression Models


Average Box Office Revenue is about $20.7 Million

Page 8: Regression Models


Is There a Theory for This?

Scatter plot of box office revenues vs. number of “Can’t Wait To See It” votes on Fandango for 62 movies.

Page 9: Regression Models


Average Box Office by Internet Buzz Index

= Average Box Office for Buzz in Interval
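Averaging the outcome within intervals of the buzz index can be sketched in Python. This is only an illustration: the data are synthetic stand-ins (the deck's 62-movie data set is not reproduced here), and the names `binned_means`, `buzz`, `box_office`, and the interval edges are assumptions made for the example.

```python
import random
from statistics import mean

def binned_means(x, y, edges):
    """Average y over each half-open interval [edges[k], edges[k+1]) of x."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ys = [yi for xi, yi in zip(x, y) if lo <= xi < hi]
        if ys:  # skip intervals that contain no observations
            out.append(((lo, hi), mean(ys)))
    return out

rng = random.Random(0)
# Synthetic stand-in data (NOT the deck's movie data): revenue rises with buzz.
buzz = [rng.random() for _ in range(200)]
box_office = [10 + 40 * b + rng.gauss(0, 5) for b in buzz]

edges = [0.0, 0.25, 0.5, 0.75, 1.0]  # assumed interval boundaries
for (lo, hi), avg in binned_means(buzz, box_office, edges):
    print(f"buzz in [{lo:.2f}, {hi:.2f}): average box office = {avg:.1f}")
```

The interval averages trace out how the mean of the outcome moves with x, which is the quantity the regression model describes with a straight line.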

Page 10: Regression Models


Deterministic Relationship: Not a Theory

Expected High Temperatures, August 11-20, 2013, ZIP 10012, NY

Page 11: Regression Models

Probabilistic Relationship: What Explains the Noise?

Fuel Bill = Function of Rooms + Random Variation

Page 12: Regression Models

Movie Buzz Data: Probabilistic Relationship?

Page 13: Regression Models

The Regression Model

y = β0 + β1x + ε
y = dependent variable
x = independent variable
The ‘regression’ is the deterministic part, β0 + β1x.
The ‘disturbance’ (noise) is ε.
The regression model is E[y|x] = β0 + β1x.
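A small simulation can make the split between the regression and the disturbance concrete. This is a sketch under stated assumptions: the parameter values (intercept 2.0, slope 0.5), the normal distribution for the noise, and the names `regression` and `draw_y` are all chosen for illustration and do not come from the slides.

```python
import random
from statistics import mean

BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0  # assumed values, for illustration only

def regression(x):
    """Deterministic part: E[y|x] = beta0 + beta1 * x."""
    return BETA0 + BETA1 * x

def draw_y(x, rng):
    """One observation: the regression plus a mean-zero disturbance."""
    return regression(x) + rng.gauss(0.0, SIGMA)

rng = random.Random(42)
x = 10.0
sample = [draw_y(x, rng) for _ in range(20000)]
print(f"E[y|x={x}] = {regression(x):.2f}, average of simulated y = {mean(sample):.2f}")
```

Individual draws of y scatter around the line, while their average at a fixed x settles near the regression value.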

Page 14: Regression Models

Linear Regression Model

[Figure: the line E[y|x] = β0 + β1x, with y on the vertical axis and x on the horizontal axis. β0 = y intercept, β1 = slope.]

Page 15: Regression Models

The Model

Constructed to provide a framework for interpreting the observed data
  What is the meaning of the observed relationship (assuming there is one)?
How it’s used
  Prediction: What reason is there to assume that we can use sample observations to predict outcomes?
  Testing relationships

Page 16: Regression Models

The slope is the interesting quantity.
Each additional year of education is associated with an increase of 3.611 in disability-adjusted life expectancy.

Page 17: Regression Models

A Cost Model
Electricity.mpj
Total cost in $Million
Output in Million KWH
N = 123 American electric utilities
Model: Cost = β0 + β1 KWH + ε

Page 18: Regression Models

Cost Relationship

[Figure: Scatterplot of Cost vs Output]

Page 19: Regression Models


Sample Regression

Page 20: Regression Models

Interpreting the Model

Cost = 2.44 + 0.00529 Output + e
Cost is $Million, Output is Million KWH.
Fixed Cost = Cost when output = 0
  Fixed Cost = $2.44 Million
Marginal cost = Change in cost / change in output
  = 0.00529 × $Million/Million KWH
  = 0.00529 $/KWH = 0.529 cents/KWH.
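The unit conversion for the marginal cost can be checked with a few lines of Python. The coefficients are the ones on the slide; the function name `predicted_cost` and the example output level of 10,000 Million KWH are assumptions added for illustration.

```python
intercept = 2.44    # fixed cost, in $Million (from the fitted line)
slope = 0.00529     # $Million per Million KWH (from the fitted line)

# The two "Million" units cancel, so the slope is also dollars per KWH.
dollars_per_kwh = slope * 1_000_000 / 1_000_000
cents_per_kwh = dollars_per_kwh * 100

def predicted_cost(output_million_kwh):
    """Fitted cost in $Million at a given output in Million KWH."""
    return intercept + slope * output_million_kwh

print(f"marginal cost = {cents_per_kwh:.3f} cents/KWH")
print(f"fixed cost    = ${predicted_cost(0):.2f} Million")
# Hypothetical output level, chosen only to show a prediction:
print(f"cost at 10,000 Million KWH = ${predicted_cost(10_000):.2f} Million")
```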

Page 21: Regression Models

Covariation and Causality

[Figure: Fitted Line Plot of DALE vs EDUC. DALE = 35.16 + 3.611 EDUC; S = 7.87034, R-Sq = 59.2%, R-Sq(adj) = 59.0%]

Does more education make you live longer (on average)?

Page 22: Regression Models

Causality?

Height (inches) and Income ($/mo.) in first post-MBA Job (men). WSJ, 12/30/86.

Ht.  Inc.    Ht.  Inc.    Ht.  Inc.
70   2990    68   2910    75   3150
67   2870    66   2840    68   2860
69   2950    71   3180    69   2930
70   3140    68   3020    76   3210
65   2790    73   3220    71   3180
73   3230    73   3370    66   2670
64   2880    70   3180    69   3050
70   3140    71   3340    65   2750
69   3000    69   2970    67   2960
73   3170    73   3240    70   3050

Estimated Income = -451 + 50.2 Height
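The estimated line can be recomputed from the 30 height and income pairs listed on the slide. The sketch below uses the covariance-over-variance formula for the slope; the function name `least_squares` is an assumption for the example, and the result should land close to the slide's rounded coefficients.

```python
# (height in inches, income in $/month), as listed on the slide
data = [
    (70, 2990), (68, 2910), (75, 3150), (67, 2870), (66, 2840), (68, 2860),
    (69, 2950), (71, 3180), (69, 2930), (70, 3140), (68, 3020), (76, 3210),
    (65, 2790), (73, 3220), (71, 3180), (73, 3230), (73, 3370), (66, 2670),
    (64, 2880), (70, 3180), (69, 3050), (70, 3140), (71, 3340), (65, 2750),
    (69, 3000), (69, 2970), (67, 2960), (73, 3170), (73, 3240), (70, 3050),
]

def least_squares(pairs):
    """Return (b0, b1) for the line y = b0 + b1 * x fitted by least squares."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

b0, b1 = least_squares(data)
print(f"Estimated Income = {b0:.0f} + {b1:.1f} Height")
```

A good fit here says nothing about whether height causes income; the code only reproduces the covariation.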

Page 23: Regression Models

How to compute the y intercept, b0, and the slope, b1, in y = b0 + b1x.

Page 24: Regression Models


Least Squares Regression

Page 25: Regression Models

Fitting a Line to a Set of Points

[Figure: Scatterplot of PerCapitaG vs Income with a fitted line. Data points are (Xi, Yi); predictions are b0 + b1xi.]

Choose b0 and b1 to minimize the sum of squared residuals: Gauss’s method of least squares.

$$SS = \sum_{i=1}^{N} [y_i - b_0 - b_1 x_i]^2 = \sum_{i=1}^{N} [y_i - (b_0 + b_1 x_i)]^2 = \sum_{i=1}^{N} e_i^2$$

Residuals:

$$e_i = y_i - (b_0 + b_1 x_i) = y_i - \hat{y}_i$$

Page 26: Regression Models

Computing the Least Squares Parameters b0 and b1

4 numbers are needed:

$$\bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i = 20.721, \qquad \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = 0.48242$$

$$\mathrm{Var}(x) = s_x^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2 = 0.02453$$

$$\mathrm{Cov}(x,y) = s_{xy} = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) = 1.784$$

$$b_1 = \frac{s_{xy}}{s_x^2} = \frac{1.784}{0.02453} = 72.7181$$

$$b_0 = \bar{y} - b_1 \bar{x} = 20.721 - (72.7181)(0.48242) = -14.36$$
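The arithmetic from the four summary numbers can be replayed in Python. One caveat, stated as an assumption: dividing the rounded covariance by the rounded variance gives roughly 72.73, slightly above the slide's 72.7181, presumably because the slide's figure was computed from unrounded inputs. The function names `slope` and `intercept` are illustrative.

```python
# The four summary numbers, as rounded on the slide
y_bar = 20.721     # mean of y
x_bar = 0.48242    # mean of x
var_x = 0.02453    # sample variance of x
cov_xy = 1.784     # sample covariance of x and y

def slope(cov_xy, var_x):
    """b1 = Cov(x, y) / Var(x)."""
    return cov_xy / var_x

def intercept(y_bar, x_bar, b1):
    """b0 = y_bar - b1 * x_bar."""
    return y_bar - b1 * x_bar

b1_from_rounded = slope(cov_xy, var_x)   # uses the rounded inputs above
b1_slide = 72.7181                       # value reported on the slide
b0 = intercept(y_bar, x_bar, b1_slide)

print(f"b1 from rounded inputs = {b1_from_rounded:.4f}")
print(f"b0 using the slide's b1 = {b0:.2f}")
```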

Page 27: Regression Models


b0=-14.36

b1= 72.718

Page 28: Regression Models

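Least Squares Uses Calculus

$$SS = \frac{1}{N-1}\sum_{i=1}^{N} (y_i - b_0 - b_1 x_i)^2$$

$$\frac{\partial SS}{\partial b_0} = \frac{1}{N-1}\sum_{i=1}^{N} \frac{\partial (y_i - b_0 - b_1 x_i)^2}{\partial b_0} = \frac{1}{N-1}\sum_{i=1}^{N} 2(y_i - b_0 - b_1 x_i)(-1) = 0$$

$$\frac{\partial SS}{\partial b_1} = \frac{1}{N-1}\sum_{i=1}^{N} \frac{\partial (y_i - b_0 - b_1 x_i)^2}{\partial b_1} = \frac{1}{N-1}\sum_{i=1}^{N} 2(y_i - b_0 - b_1 x_i)(-x_i) = 0$$

The solution is $b_0 = \bar{y} - b_1 \bar{x}$, where

$$b_1 = \frac{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2}$$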
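The first-order conditions can be verified numerically: at the least squares solution, both partial derivatives of the sum of squares should be zero up to rounding error. The five data points and the names `fit` and `gradient` below are hypothetical, chosen only for this check. A constant scale factor on SS, such as 1/(N-1), would not change where the derivatives vanish, so the sketch omits it.

```python
def fit(x, y):
    """Least squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

def gradient(b0, b1, x, y):
    """Partial derivatives of SS = sum (y_i - b0 - b1 x_i)^2 w.r.t. b0 and b1."""
    d_b0 = sum(2 * (yi - b0 - b1 * xi) * (-1) for xi, yi in zip(x, y))
    d_b1 = sum(2 * (yi - b0 - b1 * xi) * (-xi) for xi, yi in zip(x, y))
    return d_b0, d_b1

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit(x, y)
print("gradient at the least squares solution:", gradient(b0, b1, x, y))
print("gradient at b0 = 0, b1 = 0:            ", gradient(0.0, 0.0, x, y))
```

The first condition says the residuals sum to zero; the second says they are uncorrelated with x in the sample.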

Page 29: Regression Models

[Figure: the least squares line (b0 = -14.36, b1 = 72.718) compared with an alternative line (b0 = -20.00, b1 = 73.500), each labeled with its sum of squares.]

Least squares minimizes the sum of squared deviations from the line.
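The minimizing property can be illustrated by comparing the sum of squares at the fitted line with the sum of squares at a few other lines. The data and the perturbations below are hypothetical, and the names `sum_of_squares` and `least_squares` are assumptions for this sketch, not values from the deck.

```python
def sum_of_squares(b0, b1, x, y):
    """Sum of squared deviations of y from the line b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

def least_squares(x, y):
    """Least squares intercept and slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b1 * x_bar, b1

# Hypothetical data, for illustration only
x = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
y = [2.0, 6.5, 15.0, 21.0, 30.5, 35.0]

b0, b1 = least_squares(x, y)
best = sum_of_squares(b0, b1, x, y)
print(f"least squares line: b0 = {b0:.2f}, b1 = {b1:.2f}, SS = {best:.2f}")

# Any other line gives a larger sum of squares.
alternatives = [(-5.0, 0.0), (0.0, 1.0), (2.0, -3.0)]  # shifts applied to (b0, b1)
for d0, d1 in alternatives:
    alt = sum_of_squares(b0 + d0, b1 + d1, x, y)
    print(f"b0 = {b0 + d0:.2f}, b1 = {b1 + d1:.2f}, SS = {alt:.2f}")
```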

Page 30: Regression Models

Summary
  Theory vs. practice
  Linear Relationship
    Deterministic
    Random, stochastic, ‘probabilistic’
    Mean is a function of x
  Regression Relationship
  Causality vs. correlation
  Least squares