Econometrics I Andrea Beccarini Summer 2011

Andrea Beccarini Summer 2011 - wiwi.uni-muenster.de · Review of basic statistics • Random experiment (Zufallsexperiment) • Sample space (Ergebnismenge) • Event (Ereignis) •


Econometrics I

Andrea Beccarini

Summer 2011


Outline

• Very brief review of statistical basics

• Simple linear regression model (specification, point estimation, interval estimation, hypothesis tests, forecasting, maximum likelihood estimation)

• Multiple linear regression model

• Violations of (some) model assumptions


Review of basic statistics

• Random experiment (Zufallsexperiment)

• Sample space (Ergebnismenge)

• Event (Ereignis)

• Set operations (Verknüpfungen von Ereignissen)

• Partition (Partition oder vollständige Zerlegung)


• Probability (Wahrscheinlichkeit)

• Kolmogorov’s axioms (Kolmogorovs Axiome)

• Conditional probability (bedingte Wahrscheinlichkeit)

• Total probability (Satz von der totalen Wahrscheinlichkeit)

• Bayes’ theorem (Satz von Bayes)

• Independence (Unabhängigkeit)

• Random variables (Zufallsvariable)

— Definition and intuition

— Distribution function and quantile function (Verteilungsfunktion und Quantilfunktion)

— Discrete and continuous random variables (diskrete und stetige Zufallsvariable)

— Density function (Dichtefunktion)

— Expectation (Erwartungswert)

— Variance (Varianz)

• Special discrete distributions, e.g. Bernoulli, binomial, Poisson, geometric, hypergeometric, ...

• Special continuous distributions, e.g. normal, standard normal, exponential, Pareto, $\chi^2$, F, t, ...

• There are many more special distributions

• Which distribution can be used when?


Simple linear regression model

• Econometrics: Application of statistical methods to empirical research in economics

• Econometric problems:

— Specification of an appropriate model

— Estimation of the model (Schätzung)

— Hypothesis testing

— Forecasting (Prognose)

Economic model
↓
SPECIFICATION: functional form (A-assumptions), error term (B-assumptions), variables (C-assumptions)
↓
Econometric model
↓
ESTIMATION
↓
Estimated model
↓
HYPOTHESIS TESTS and FORECASTING


Data

• Empirical research requires (high quality) data

• Often, collecting data is the main problem of empirical research

• There is no systematic approach

Kinds of data:

• Time series data (Zeitreihendaten), cross sectional data (Querschnittsdaten), panel data (Paneldaten)


Specification

• Numeric illustration: Data of the gratuity example

 t     xt     yt       t     xt     yt
 1   10.00   2.00     11   60.00   7.00
 2   30.00   3.00     12   47.50   5.50
 3   50.00   7.00     13   45.00   7.00
 4   25.00   2.00     14   27.50   4.50
 5    7.50   2.50     15   15.00   1.50
 6   42.50   6.00     16   20.00   4.00
 7   35.00   5.00     17   47.50   9.00
 8   40.00   4.00     18   32.50   3.00
 9   25.00   6.00     19   37.50   6.50
10   12.50   1.00     20   20.00   2.50

Billing amount xt and tip yt (both in euros) of 20 observed guests


• Functional dependence (generic)

y = f (x)

• More specifically, the functional dependence is assumed to be

y = α+ βx

• Other functional forms are of course possible; more on that later

• The econometric model is specified using the A-, B- and C-assumptions


• Economic model: yt = α+ βxt for t = 1, . . . , 20

[Figure: straight line R in the (x, y) plane with intercept α and slope β (a horizontal step of 20 raises y by 20β). Horizontal axis: billing amount x (Rechnungsbetrag); vertical axis: tip y (Trinkgeld).]

• Econometric model: yt = α+ βxt + ut for t = 1, . . . , 20

[Figure: scatter plot of the observations; horizontal axis: xt, vertical axis: yt]

The A-assumptions (functional specification):

Assumption a1: No relevant exogenous variable is omitted from the econometric model, and the exogenous variable included in the model is relevant

Assumption a2: The true functional dependence between $x_t$ and $y_t$ is linear

Assumption a3: The parameters α and β are constant for all T observations $(x_t, y_t)$


The B-assumptions (error term specification):

Assumption b1: $E(u_t) = 0$ for $t = 1, \dots, T$

Assumption b2: Homoskedasticity: $\operatorname{Var}(u_t) = \sigma^2$ for $t = 1, \dots, T$

Assumption b3: For all $t \neq s$ with $t = 1, 2, \dots, T$ and $s = 1, 2, \dots, T$ we have

$$\operatorname{Cov}(u_t, u_s) = 0$$

Assumption b4: The error terms $u_t$ are normally distributed.

Compact notation of all B-assumptions: $u_t \sim NID(0, \sigma^2)$ for $t = 1, \dots, T$


• Graphical illustration of the error term distribution


The C-assumptions (variable specification):

Assumption c1: The exogenous variable $x_t$ is not stochastic, but can be controlled as in an experimental situation

Assumption c2: The exogenous variable $x_t$ is not constant for all observations t

• Of course, many (or even all?) of the A-, B-, and C-assumptions are restrictive and unrealistic

• We will nevertheless suppose they are satisfied for the time being, and consider their violations later on


Point estimation

• The simple (two-variable) linear regression model is

yt = α+ βxt + ut

• Numeric illustration: The first three observations of the gratuity example

t   xt   yt
1   10    2
2   30    3
3   50    7

[Figure: scatter plot of the three observations; horizontal axis: xt, vertical axis: yt]

• Estimation: Compute estimated values $\hat\alpha$ and $\hat\beta$

• Distinguish between true and estimated values

• If the true econometric model is

$$y_t = \alpha + \beta x_t + u_t$$

then the corresponding estimated model is

$$\hat y_t = \hat\alpha + \hat\beta x_t$$


• How can we estimate the coefficients?

[Figure: the three observations with three candidate regression lines R1, R2 and R3; horizontal axis: xt, vertical axis: yt]

Least squares method

• Sum of squared residuals

$$S_{\hat u\hat u} = \sum_{t=1}^{T} \hat u_t^2$$

where the residuals are

$$\hat u_t = y_t - \hat y_t = y_t - \hat\alpha - \hat\beta x_t$$

• Residual (Residuum): Difference between the observed value $y_t$ and the estimated (predicted) value $\hat y_t$


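• Choose $\hat\alpha$ and $\hat\beta$ such that the sum of squared residuals

$$S_{\hat u\hat u} = \sum_{t=1}^{T} \hat u_t^2 = \sum_{t=1}^{T} (y_t - \hat\alpha - \hat\beta x_t)^2$$

is minimized

• Derivation of estimators (Schätzer) [1]

$$\hat\beta = S_{xy}/S_{xx}$$

$$\hat\alpha = \bar y - \hat\beta \bar x$$

with

$$S_{xx} = \sum (x_t - \bar x)^2 = \sum x_t^2 - T \bar x^2$$

$$S_{xy} = \sum (x_t - \bar x)(y_t - \bar y) = \sum y_t x_t - T \bar x \bar y$$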


• Numeric illustration for the three-points example

t   xt   yt
1   10    2
2   30    3
3   50    7

• Calculate {1}

$\hat\alpha$, $\hat\beta$

$\hat y_1$, $\hat y_2$, $\hat y_3$

$\hat u_1$, $\hat u_2$, $\hat u_3$

$S_{\hat u\hat u}$
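The calculation requested above can be sketched in a few lines of Python. This is only an illustration of the least squares formulas for the slope ($S_{xy}/S_{xx}$) and the intercept; the function name `ols_simple` and the variable names are choices made here, not part of the course material.

```python
def ols_simple(x, y):
    """Least squares estimates for y_t = alpha + beta * x_t + u_t."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# The three observations of the gratuity example
x = [10, 30, 50]
y = [2, 3, 7]

alpha_hat, beta_hat = ols_simple(x, y)
y_hat = [alpha_hat + beta_hat * xt for xt in x]   # fitted values
u_hat = [yt - yh for yt, yh in zip(y, y_hat)]     # residuals
s_uu = sum(u ** 2 for u in u_hat)                 # sum of squared residuals

print(alpha_hat, beta_hat, y_hat, u_hat, s_uu)
```

For these three points the estimates are 0.25 for the intercept and 0.125 for the slope, the fitted values are 1.5, 4.0 and 6.5, the residuals are 0.5, -1.0 and 0.5, and the sum of squared residuals is 1.5.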


The coefficient of determination $R^2$

• Variation of the endogenous variable $S_{yy} = \sum (y_t - \bar y)^2$

[Figure: the three observations, the regression line R and the horizontal line at $\bar y$, with the deviations $y_1 - \bar y$, $y_2 - \bar y$ and $y_3 - \bar y$ marked; horizontal axis: xt, vertical axis: yt]

• Variation $S_{yy} = \sum (y_t - \bar y)^2$ and sum of squared residuals $S_{\hat u\hat u} = \sum \hat u_t^2$

[Figure: the three observations and the regression line R, with the deviations $y_t - \bar y$ and the residuals $\hat u_1$, $\hat u_2$, $\hat u_3$ marked; horizontal axis: xt, vertical axis: yt]

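• Decomposition of sum of squares (Streuungszerlegungssatz): [2]

$$S_{yy} = S_{\hat y\hat y} + S_{\hat u\hat u}$$

or

$$\sum (y_t - \bar y)^2 = \sum (\hat y_t - \bar y)^2 + \sum \hat u_t^2$$

• Coefficient of determination (Bestimmtheitsmaß)

$$R^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{S_{yy} - S_{\hat u\hat u}}{S_{yy}} = \frac{S_{\hat y\hat y}}{S_{yy}}$$

• Computation of $R^2$ {2}

$$R^2 = \frac{\hat\beta S_{xy}}{S_{yy}} = \frac{S_{xy}^2}{S_{xx} S_{yy}}$$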
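As a small check, the three expressions for the coefficient of determination can be evaluated for the three-points example. The sketch below is only illustrative; all variable names are chosen here.

```python
# The three observations of the gratuity example
x = [10, 30, 50]
y = [2, 3, 7]
T = len(x)

x_bar, y_bar = sum(x) / T, sum(y) / T
s_xx = sum((xt - x_bar) ** 2 for xt in x)
s_yy = sum((yt - y_bar) ** 2 for yt in y)
s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))

beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar
s_uu = sum((yt - alpha_hat - beta_hat * xt) ** 2 for xt, yt in zip(x, y))

# Three equivalent ways of computing R^2
r2_from_residuals = (s_yy - s_uu) / s_yy
r2_from_beta = beta_hat * s_xy / s_yy
r2_from_sums = s_xy ** 2 / (s_xx * s_yy)

print(r2_from_residuals, r2_from_beta, r2_from_sums)
```

All three values coincide (12.5/14, roughly 0.89), so about 89 percent of the variation of the tip is explained by the billing amount in this tiny sample.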


Properties of the estimators

• The estimators

$$\hat\beta = S_{xy}/S_{xx}, \qquad \hat\alpha = \bar y - \hat\beta \bar x$$

are random variables

• Thought experiment: repeated samples

• Computer simulation [experiment.R]
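The thought experiment of repeated samples can be imitated with a small Monte Carlo simulation. The sketch below is not the course script `experiment.R`; it is a stand-alone Python illustration in which the "true" parameter values, the error standard deviation, the number of replications and the function names are all assumptions made for the example.

```python
import random
import statistics

def ols_simple(x, y):
    """Least squares estimates for y_t = alpha + beta * x_t + u_t."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    beta_hat = s_xy / s_xx
    return y_bar - beta_hat * x_bar, beta_hat

def simulate(alpha, beta, sigma, x, n_samples, seed=0):
    """Draw repeated samples with fixed x and normal errors; return all estimates."""
    rng = random.Random(seed)
    alphas, betas = [], []
    for _ in range(n_samples):
        y = [alpha + beta * xt + rng.gauss(0.0, sigma) for xt in x]
        a_hat, b_hat = ols_simple(x, y)
        alphas.append(a_hat)
        betas.append(b_hat)
    return alphas, betas

x = [10, 30, 50]  # regressor values held fixed across samples (assumption c1)
# Hypothetical true values, chosen only for this illustration
alphas, betas = simulate(alpha=0.25, beta=0.125, sigma=1.0, x=x, n_samples=5000)

print("mean of alpha estimates:", statistics.mean(alphas))
print("mean of beta estimates: ", statistics.mean(betas))
print("variance of beta estimates:", statistics.variance(betas))
```

Each replication yields different estimates, which is what is meant by calling the estimators random variables; across many replications their averages should settle near the values used to generate the data.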


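• Under the a-, b- and c-assumptions (without b4) [3]

$$E(\hat\alpha) = \alpha, \qquad E(\hat\beta) = \beta$$

and [4]

$$\operatorname{Cov}(\hat\alpha, \hat\beta) = -\sigma^2 (\bar x / S_{xx})$$

$$\operatorname{Var}(\hat\alpha) = \sigma^2 \left(1/T + \bar x^2 / S_{xx}\right)$$

$$\operatorname{Var}(\hat\beta) = \sigma^2 / S_{xx}$$

• BLUE property: $\hat\alpha$ and $\hat\beta$ are the best linear unbiased estimators [5]

• If, additionally, b4 is true, then $\hat\alpha$ and $\hat\beta$ are the best unbiased estimators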
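The variance of the slope estimator can be verified in a few lines. The weights $w_t$ are notation introduced only for this sketch; the argument uses c1 (fixed $x_t$), b2 and b3, and the fact that $\sum (x_t - \bar x) = 0$.

```latex
\begin{aligned}
\hat\beta &= \frac{S_{xy}}{S_{xx}}
           = \frac{\sum_{t=1}^{T} (x_t - \bar x)\, y_t}{S_{xx}}
           = \sum_{t=1}^{T} w_t y_t,
           \qquad w_t = \frac{x_t - \bar x}{S_{xx}} \\
\operatorname{Var}(\hat\beta)
          &= \sum_{t=1}^{T} w_t^2 \operatorname{Var}(y_t)
           = \sigma^2 \sum_{t=1}^{T} \frac{(x_t - \bar x)^2}{S_{xx}^2}
           = \frac{\sigma^2}{S_{xx}}
\end{aligned}
```

The second line uses that the $y_t$ are uncorrelated with common variance $\sigma^2$, so no covariance terms appear.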


• How are $y_t$, $\hat\alpha$ and $\hat\beta$ distributed?

• Because of

ut ∼ NID(0, σ2)

yt is normally distributed, t = 1, . . . , T

• The expectation of yt is

E (yt) = E(α+ βxt + ut)

= E (α) +E (βxt) +E (ut)

= α+ βxt


• The variance of yt is

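$$
\begin{aligned}
\operatorname{Var}(y_t) &= E\left[(y_t - E(y_t))^2\right] \\
&= E\left[(y_t - \alpha - \beta x_t)^2\right] \\
&= E\left[u_t^2\right] \\
&= E\left[(u_t - E(u_t))^2\right] \\
&= \sigma^2
\end{aligned}
$$

• Further, for $t = 1, \dots, T$

$$y_t \sim NID(\alpha + \beta x_t, \sigma^2)$$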


• Since

$$\hat\beta = S_{xy}/S_{xx}, \qquad \hat\alpha = \bar y - \hat\beta \bar x$$

both $\hat\alpha$ and $\hat\beta$ are linear transformations of the $y_t$

• Linear transformations of independent normally distributed random variables are normally distributed

• Hence,

$$\hat\alpha \sim N\left(\alpha,\ \sigma^2 (1/T + \bar x^2 / S_{xx})\right), \qquad \hat\beta \sim N\left(\beta,\ \sigma^2 / S_{xx}\right)$$

Interval estimation (Intervallschätzung)

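• We already know that $\hat\beta$ is a random variable and

$$\hat\beta \sim N\left(\beta,\ \sigma^2 / S_{xx}\right)$$

• Instead of a point estimator $\hat\beta$ we now want an interval estimator

$$[\hat\beta - k\ ;\ \hat\beta + k]$$

satisfying

$$P(\hat\beta - k \le \beta \le \hat\beta + k) = 1 - a$$

• The interval $[\hat\beta - k\ ;\ \hat\beta + k]$ is called the $(1 - a)$-confidence interval (Konfidenzintervall)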


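Confidence interval when $\sigma^2$ is known

• Step 1: Standardization of $\hat\beta$

$$se(\hat\beta) = \sqrt{\sigma^2 / S_{xx}}$$

$$z = \frac{\hat\beta - E(\hat\beta)}{se(\hat\beta)} = \frac{\hat\beta - \beta}{se(\hat\beta)} \sim N(0, 1)$$

• The random variable $z = (\hat\beta - \beta)/se(\hat\beta)$ is a pivot (Pivot), i.e. its distribution does not depend on unknown parameters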


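• Step 2: Find the $(1 - a/2)$-quantile $z_{a/2}$ of the standard normal distribution

$$P(-z_{a/2} \le z \le z_{a/2}) = 1 - a$$

• Step 3: Substitute $(\hat\beta - \beta)/se(\hat\beta)$ for $z$

$$P\left(-z_{a/2} \le \frac{\hat\beta - \beta}{se(\hat\beta)} \le z_{a/2}\right) = 1 - a$$

• Rewriting yields the $(1 - a)$-interval [6]{3}

$$\left[\hat\beta - z_{a/2} \cdot se(\hat\beta)\ ;\ \hat\beta + z_{a/2} \cdot se(\hat\beta)\right]$$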
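The three steps above can be sketched in Python with the standard library's normal distribution. The numbers passed to the function (a slope estimate of 0.125, $S_{xx} = 800$ and a "known" error variance of 1) are purely hypothetical inputs for illustration, and the function name is a choice made here.

```python
from math import sqrt
from statistics import NormalDist

def ci_known_sigma(beta_hat, sigma2, s_xx, a=0.05):
    """(1 - a) confidence interval for beta when sigma^2 is known."""
    se = sqrt(sigma2 / s_xx)                 # standard error of beta_hat
    z = NormalDist().inv_cdf(1 - a / 2)      # (1 - a/2)-quantile of N(0, 1)
    return beta_hat - z * se, beta_hat + z * se

# Hypothetical inputs: sigma2 is assumed to be known, which is rarely the case
lower, upper = ci_known_sigma(beta_hat=0.125, sigma2=1.0, s_xx=800.0)
print(lower, upper)
```

The interval is symmetric around the point estimate, and its width shrinks when $S_{xx}$ grows or when the significance level $a$ is increased.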


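Confidence interval when $\sigma^2$ is unknown

• Step 1: Estimation of $\sigma^2$ and $se(\hat\beta)$:

$$\hat\sigma^2 = \frac{1}{T - 2} \sum_{t=1}^{T} \hat u_t^2$$

is a consistent and unbiased estimator of $\sigma^2$ and

$$\widehat{se}(\hat\beta) = \sqrt{\hat\sigma^2 / S_{xx}}$$

is a consistent estimator of $se(\hat\beta)$ (we postpone the proofs)

• Step 2: Standardization of $\hat\beta$

$$t = \frac{\hat\beta - E(\hat\beta)}{\widehat{se}(\hat\beta)} = \frac{\hat\beta - \beta}{\widehat{se}(\hat\beta)} \sim t_{T-2}$$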


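• The random variable $t = (\hat\beta - \beta)/\widehat{se}(\hat\beta)$ is a pivot

• Step 3: Find the $(1 - a/2)$-quantile $t_{a/2}$ of the $t_{T-2}$ distribution

$$P(-t_{a/2} \le t \le t_{a/2}) = 1 - a$$

• Step 4: Substitute and solve for $\beta$,

$$P\left(\hat\beta - t_{a/2} \cdot \widehat{se}(\hat\beta) \le \beta \le \hat\beta + t_{a/2} \cdot \widehat{se}(\hat\beta)\right) = 1 - a$$

• The interval estimator is {4}

$$\left[\hat\beta - t_{a/2} \cdot \widehat{se}(\hat\beta)\ ;\ \hat\beta + t_{a/2} \cdot \widehat{se}(\hat\beta)\right]$$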
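A minimal sketch of these four steps for the 20 observations of the gratuity example. The Python standard library has no t distribution, so the critical value 2.101 (the 0.975-quantile of the t distribution with 18 degrees of freedom) is taken from a statistical table and hard-coded; the variable names are choices made here.

```python
from math import sqrt

# Billing amount x_t and tip y_t of the 20 observed guests
x = [10, 30, 50, 25, 7.5, 42.5, 35, 40, 25, 12.5,
     60, 47.5, 45, 27.5, 15, 20, 47.5, 32.5, 37.5, 20]
y = [2, 3, 7, 2, 2.5, 6, 5, 4, 6, 1,
     7, 5.5, 7, 4.5, 1.5, 4, 9, 3, 6.5, 2.5]
T = len(x)

x_bar, y_bar = sum(x) / T, sum(y) / T
s_xx = sum((xt - x_bar) ** 2 for xt in x)
s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Step 1: estimate sigma^2 and the standard error of beta_hat
s_uu = sum((yt - alpha_hat - beta_hat * xt) ** 2 for xt, yt in zip(x, y))
sigma2_hat = s_uu / (T - 2)
se_beta_hat = sqrt(sigma2_hat / s_xx)

# Steps 2 to 4: t-based interval with a = 0.05 and T - 2 = 18 degrees of freedom
T_CRIT = 2.101  # tabulated 0.975-quantile of t(18), not computed here
lower = beta_hat - T_CRIT * se_beta_hat
upper = beta_hat + T_CRIT * se_beta_hat

print(beta_hat, se_beta_hat, (lower, upper))
```

With a different sample size or significance level the hard-coded critical value has to be replaced by the corresponding table entry.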


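• Interval estimator for the intercept $\alpha$

$$\left[\hat\alpha - t_{a/2} \cdot \widehat{se}(\hat\alpha)\ ;\ \hat\alpha + t_{a/2} \cdot \widehat{se}(\hat\alpha)\right]$$

where

$$\widehat{se}(\hat\alpha) = \sqrt{\hat\sigma^2 \left(1/T + \bar x^2 / S_{xx}\right)}$$

• Some terminology: The standard error (Standardfehler) is $se(\hat\beta)$; the estimated standard error is $\widehat{se}(\hat\beta)$

• Usually, both $se(\hat\beta)$ and $\widehat{se}(\hat\beta)$ are called standard error (Standardfehler)

• Interpretation of interval estimators?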


Hypothesis tests

• How can we test hypotheses about the regression coefficients (usually about the slope $\beta$)?

• Null hypothesis $H_0$ and alternative hypothesis $H_1$ (Nullhypothese und Alternativhypothese)

• There are one-sided and two-sided tests

• We already know that

$$\hat\beta \sim N\left(\beta,\ \sigma^2 / S_{xx}\right)$$

• If the null hypothesis $H_0: \beta = q$ is true, then $\beta$ can be replaced by $q$

$$\hat\beta \sim N\left(q,\ \sigma^2 / S_{xx}\right)$$

• Then

$$P(\hat\beta - k \le q \le \hat\beta + k) = 1 - a$$

$$P(q - k \le \hat\beta \le q + k) = 1 - a$$

• With high probability $1 - a$, the estimator $\hat\beta$ will be inside the interval $[q - k\ ;\ q + k]$ if $H_0$ is true

• If the estimator $\hat\beta$ is outside the interval, that is evidence against the null hypothesis


• Graphical illustration


• The analytical approach is slightly different

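• Step 1: Set up $H_0$ and $H_1$ and fix the significance level $a$

$$H_0: \beta = q$$

$$H_1: \beta \neq q$$

• Step 2: Estimate $se(\hat\beta)$

$$\widehat{se}(\hat\beta) = \sqrt{\hat\sigma^2 / S_{xx}}$$

with $\hat\sigma^2 = S_{\hat u\hat u} / (T - 2)$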


• Step 3: Compute the t-test statistic

$$t = \frac{\hat\beta - q}{\widehat{se}(\hat\beta)}$$

If $H_0: \beta = q$ is true, then

$$t \sim t_{T-2}$$

• Step 4: Find the critical value $t_{a/2}$

$$P(-t_{a/2} \le t \le t_{a/2}) = 1 - a$$

• Step 5: Compare $t_{a/2}$ and $t$. If $t$ is outside $[-t_{a/2}\ ;\ t_{a/2}]$, i.e. if $|t| > t_{a/2}$, then reject $H_0$ {5}
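The five steps can be sketched as a small function and applied to the gratuity data with the null hypothesis that the slope is zero (q = 0). The critical value 2.101 is the tabulated 0.975-quantile of the t distribution with 18 degrees of freedom and is supplied by hand; the function name and the choice q = 0 are assumptions made for the illustration.

```python
from math import sqrt

def t_test_slope(x, y, q, t_crit):
    """Two-sided t-test of H0: beta = q in y_t = alpha + beta * x_t + u_t."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    s_uu = sum((yt - alpha_hat - beta_hat * xt) ** 2 for xt, yt in zip(x, y))
    se_hat = sqrt(s_uu / (T - 2) / s_xx)     # estimated standard error of beta_hat
    t_stat = (beta_hat - q) / se_hat
    return t_stat, abs(t_stat) > t_crit      # True means: reject H0

x = [10, 30, 50, 25, 7.5, 42.5, 35, 40, 25, 12.5,
     60, 47.5, 45, 27.5, 15, 20, 47.5, 32.5, 37.5, 20]
y = [2, 3, 7, 2, 2.5, 6, 5, 4, 6, 1,
     7, 5.5, 7, 4.5, 1.5, 4, 9, 3, 6.5, 2.5]

# a = 0.05, T - 2 = 18 degrees of freedom: table value 2.101
t_stat, reject = t_test_slope(x, y, q=0.0, t_crit=2.101)
print(t_stat, reject)
```

For these data the test statistic is a little above 6, far beyond the critical value, so the hypothesis of a zero slope is rejected at the 5 percent level.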


Connections between hypothesis testing and confidence intervals

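• Under the (two-sided) null hypothesis $H_0$

$$P\left(q - t_{a/2} \cdot \widehat{se}(\hat\beta) \le \hat\beta \le q + t_{a/2} \cdot \widehat{se}(\hat\beta)\right) = 1 - a$$

• The $(1 - a)$-confidence interval is

$$\left[\hat\beta - t_{a/2} \cdot \widehat{se}(\hat\beta)\ ;\ \hat\beta + t_{a/2} \cdot \widehat{se}(\hat\beta)\right]$$

• Conclusion: If $q$ is outside the confidence interval, $H_0$ is rejected {6}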


One-sided hypothesis tests (einseitige Tests)

• Right or left-sided tests

• Right-sided null hypothesis

H0 : β ≤ q

H1 : β > q

• The basic idea remains the same: If $\hat\beta$ is "much larger" than $q$, reject $H_0$


• Graphical illustration:


Analytical approach (right-sided null hypothesis)

• Step 1: State H0 and H1 and set the significance level a

H0 : β ≤ q

H1 : β > q

• Step 2: Estimate $se(\hat\beta)$

• Step 3: Compute the t-statistic

$$t = \frac{\hat\beta - q}{\widehat{se}(\hat\beta)}$$

Under $H_0$ its distribution is $t \sim t_{T-2}$

• Step 4: Find the critical value $t_a$

$$P(t \le t_a) = 1 - a$$

For left-sided null hypotheses, steps 1, 2 and 3 are the same; the critical value is $t_{1-a}$ with $P(t < t_{1-a}) = a$

• Step 5: Compare $t_a$ and $t$; reject $H_0$ if $t > t_a$ {7}

• For left-sided null hypotheses, $H_0$ is rejected if $t$ is less than the critical value, $t < t_{1-a}$


The p-value (p-Wert)

• The p-value is the probability, computed under $H_0$, that the test statistic (a random variable) is greater than the realized test statistic (right-sided test)

• Traditional approach: Reject the null hypothesis if the test statistic is inside the critical region, e.g. if $t > t_a$

• Alternative approach: Comparison of probabilities; reject the null hypothesis if the p-value is less than the significance level $a$


• Graphical illustration:


• The two approaches (comparing the t-statistic with the critical value, or comparing the p-value with the significance level) are essentially identical {8}

• Advantages of the p-value approach?

• Disadvantages?

• p-value formulas for right- and left-sided hypothesis tests? [7]

• p-value formula for two-sided hypothesis test?
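One possible way to write down the p-values asked for above. Here $F$ denotes the distribution function of the $t_{T-2}$ distribution and $t_{obs}$ the realized value of the test statistic; both symbols are notation introduced for this sketch. The two-sided formula relies on the symmetry of the t distribution around zero.

```latex
\begin{aligned}
\text{right-sided } (H_1: \beta > q): \quad & p = P(t \ge t_{obs}) = 1 - F(t_{obs}) \\
\text{left-sided } (H_1: \beta < q): \quad & p = P(t \le t_{obs}) = F(t_{obs}) \\
\text{two-sided } (H_1: \beta \ne q): \quad & p = P(|t| \ge |t_{obs}|) = 2\,\bigl(1 - F(|t_{obs}|)\bigr)
\end{aligned}
```

In each case $H_0$ is rejected if $p < a$.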


How to choose the null and alternative hypotheses

• There are basically two strategies:

— State the opposite of the conjecture as the null hypothesis and try to reject it

— State the conjecture as the null hypothesis and show that it cannot be rejected

• There is an important asymmetry between rejection and non-rejection


Maximum likelihood estimation

• Main idea: Find those parameter values that maximize the probability (or likelihood) of observing the actually observed data

• Notation:

$\theta$: Parameter vector, e.g. $\theta = (\alpha, \beta, \sigma^2)$

$L(\theta)$: Likelihood (given all the data)

$\ln L(\theta)$: Log-likelihood

• Maximum likelihood estimators

$$\hat\theta = \arg\max_{\theta} \ln L(\theta)$$


• We already know that, for t = 1, . . . , T

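$$y_t \sim NID(\alpha + \beta x_t, \sigma^2),$$

hence the density of $y_t$ is

$$f_{y_t}(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2} \frac{(y - \alpha - \beta x_t)^2}{\sigma^2}\right)$$

• Due to independence, the joint likelihood and log-likelihood are

$$L(\alpha, \beta, \sigma^2) = f_{y_1, \dots, y_T}(y_1, \dots, y_T) = \prod_{t=1}^{T} f_{y_t}(y_t)$$

$$\ln L(\alpha, \beta, \sigma^2) = \ln f_{y_1, \dots, y_T}(y_1, \dots, y_T) = \sum_{t=1}^{T} \ln f_{y_t}(y_t)$$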


• Maximize

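$$\ln L(\alpha, \beta, \sigma^2) = \ln f_{y_1, \dots, y_T}(y_1, \dots, y_T) = \sum_{t=1}^{T} \ln\left[\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2} \frac{(y_t - \alpha - \beta x_t)^2}{\sigma^2}\right)\right]$$

with respect to the parameters $\alpha$, $\beta$, $\sigma^2$ [8]

• The ML estimators are

$$\hat\alpha_{ML} = \bar y - \hat\beta_{ML} \bar x$$

$$\hat\beta_{ML} = \frac{S_{xy}}{S_{xx}}$$

$$\hat\sigma^2_{ML} = \frac{1}{T} \sum_{t=1}^{T} \hat u_t^2$$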
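A brief sketch of how these estimators arise from the log-likelihood; it is one standard route and not necessarily the derivation used in the lecture. Expanding the logarithm shows that $\alpha$ and $\beta$ enter only through the sum of squares, so maximizing over them is the same as least squares; the first order condition for $\sigma^2$ then gives the variance estimator.

```latex
\begin{aligned}
\ln L(\alpha, \beta, \sigma^2)
  &= -\frac{T}{2} \ln(2\pi) - \frac{T}{2} \ln \sigma^2
     - \frac{1}{2\sigma^2} \sum_{t=1}^{T} (y_t - \alpha - \beta x_t)^2 \\
\frac{\partial \ln L}{\partial \sigma^2}
  &= -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{t=1}^{T} \hat u_t^2 = 0
  \quad\Longrightarrow\quad
  \hat\sigma^2_{ML} = \frac{1}{T} \sum_{t=1}^{T} \hat u_t^2
\end{aligned}
```

Compared with the estimator $\hat\sigma^2 = S_{\hat u\hat u}/(T-2)$ used for the confidence intervals, the ML estimator divides by $T$ instead of $T - 2$ and is therefore smaller by the factor $(T-2)/T$.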


• Hypothesis tests in the maximum likelihood framework (the three classical tests: Wald, LR, LM)

• Null and alternative hypotheses, e.g.

$$H_0: \beta = \beta_0$$

$$H_1: \beta \neq \beta_0$$

• Derivation of the test statistics [exercise]


Forecasting

• Conditional forecast: the value $x_0$ of the exogenous variable is known and non-stochastic

• Point forecast of the endogenous variable is {9}

$$\hat y_0 = \hat\alpha + \hat\beta x_0$$

• The true value of $y_0$ is usually not $\hat y_0$ but

$$y_0 = \alpha + \beta x_0 + u_0$$


• The forecasting error is

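$$\hat y_0 - y_0 = \hat\alpha + \hat\beta x_0 - (\alpha + \beta x_0 + u_0) = (\hat\alpha - \alpha) + (\hat\beta - \beta) x_0 - u_0$$

• There are two error sources:

1. The error term $u_0$ will not vanish, in general.

2. The parameter estimates $\hat\alpha$ and $\hat\beta$ will deviate from the true values $\alpha$ and $\beta$.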


Properties of the point forecast

• Expected forecasting error:

$$E(\hat y_0 - y_0) = E(\hat\alpha - \alpha) + E(\hat\beta - \beta) x_0 - E(u_0) = 0$$

• Variance of the forecasting error [9]

$$\operatorname{Var}(\hat y_0 - y_0) = \sigma^2 \left[1 + 1/T + (x_0 - \bar x)^2 / S_{xx}\right]$$

• Estimated variance of the forecasting error {9}

$$\widehat{\operatorname{Var}}(\hat y_0 - y_0) = \hat\sigma^2 \left[1 + 1/T + (x_0 - \bar x)^2 / S_{xx}\right]$$

Interval forecast

• Step 1: Estimation of $se(\hat y_0 - y_0)$

• Step 2: Standardization of $(\hat y_0 - y_0)$

$$t = \frac{(\hat y_0 - y_0) - \overbrace{E(\hat y_0 - y_0)}^{=0}}{\widehat{se}(\hat y_0 - y_0)} = \frac{\hat y_0 - y_0}{\widehat{se}(\hat y_0 - y_0)} \sim t_{T-2}$$

• Step 3: Find the $t_{a/2}$-value (from statistical tables or using statistical computer software)


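• Step 4: With large probability $1 - a$, the random variable $t$ will be inside the interval $[-t_{a/2}\ ;\ t_{a/2}]$,

$$P\left(-t_{a/2} \le \frac{\hat y_0 - y_0}{\widehat{se}(\hat y_0 - y_0)} \le t_{a/2}\right) = 1 - a$$

Solve for $y_0$

$$P\left(\hat y_0 - t_{a/2} \cdot \widehat{se}(\hat y_0 - y_0) \le y_0 \le \hat y_0 + t_{a/2} \cdot \widehat{se}(\hat y_0 - y_0)\right) = 1 - a$$

• Hence, the interval forecast is {9}

$$\left[\hat y_0 - t_{a/2} \cdot \widehat{se}(\hat y_0 - y_0)\ ;\ \hat y_0 + t_{a/2} \cdot \widehat{se}(\hat y_0 - y_0)\right]$$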
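A sketch of the point and interval forecast for the gratuity data. The billing amount x0 = 30 euros is a hypothetical value chosen for the illustration, and the critical value 2.101 (0.975-quantile of the t distribution with 18 degrees of freedom) is taken from a table rather than computed.

```python
from math import sqrt

x = [10, 30, 50, 25, 7.5, 42.5, 35, 40, 25, 12.5,
     60, 47.5, 45, 27.5, 15, 20, 47.5, 32.5, 37.5, 20]
y = [2, 3, 7, 2, 2.5, 6, 5, 4, 6, 1,
     7, 5.5, 7, 4.5, 1.5, 4, 9, 3, 6.5, 2.5]
T = len(x)

x_bar, y_bar = sum(x) / T, sum(y) / T
s_xx = sum((xt - x_bar) ** 2 for xt in x)
s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar
s_uu = sum((yt - alpha_hat - beta_hat * xt) ** 2 for xt, yt in zip(x, y))
sigma2_hat = s_uu / (T - 2)

x0 = 30.0                                  # hypothetical billing amount
y0_hat = alpha_hat + beta_hat * x0         # point forecast

# Step 1: estimated standard error of the forecasting error
se_forecast = sqrt(sigma2_hat * (1 + 1 / T + (x0 - x_bar) ** 2 / s_xx))

# Steps 3 and 4: interval forecast with a = 0.05
T_CRIT = 2.101  # tabulated 0.975-quantile of t(18)
lower = y0_hat - T_CRIT * se_forecast
upper = y0_hat + T_CRIT * se_forecast

print(y0_hat, (lower, upper))
```

The term $(x_0 - \bar x)^2 / S_{xx}$ makes the interval wider the further x0 lies from the sample mean of the billing amounts.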


• Width of the interval


Multiple linear regression model

• So far we have only considered a single exogenous variable, but in most empirical problems we face many exogenous variables

• Many of the results from the simple linear regression model can be transferred to the multiple case

• Important tool: matrix algebra (main diagonal, transpose, addition, scalar multiplication, inner product, matrix multiplication, idempotent, determinant, rank, inverse, trace, definite matrices, semidefinite matrices)


Specification

• Example: Estimation of a production function for barley

• Conduct an experiment where the barley output (Gerste, $g_t$) is observed for different combinations of phosphate ($p_t$) and nitrogen ($n_t$)

• There are T = 30 different combinations

• The following table shows the data


 t     pt       nt      gt       t     pt       nt      gt
 1   22.00    40.00   38.36     16   25.00   110.00   59.55
 2   22.00    60.00   49.03     17   26.00    50.00   55.24
 3   22.00    90.00   59.87     18   26.00    70.00   54.13
 4   22.00   120.00   59.35     19   26.00    90.00   66.57
 5   23.00    50.00   45.45     20   26.00   110.00   61.74
 6   23.00    80.00   53.23     21   27.00    40.00   48.99
 7   23.00   100.00   56.55     22   27.00    60.00   54.38
 8   23.00   120.00   50.91     23   27.00    80.00   58.28
 9   24.00    40.00   44.87     24   27.00   100.00   62.81
10   24.00    60.00   54.06     25   28.00    50.00   50.76
11   24.00    90.00   60.34     26   28.00    70.00   51.54
12   24.00   120.00   58.21     27   28.00   100.00   59.39
13   25.00    50.00   51.52     28   28.00   110.00   68.17
14   25.00    80.00   58.58     29   29.00    60.00   59.25
15   25.00   100.00   57.27     30   29.00   100.00   64.39


Functional specification (A-assumptions)

• The economic (agro-economic) model formalizes the connection between the barley output (g) and the fertilizers (p and n)

$$g = f(p, n)$$

• Possible functional form

$$g = \alpha + \beta_1 p + \beta_2 n$$

• A more realistic functional form

$$g = A p^{\beta_1} n^{\beta_2},$$

where $A$, $\beta_1$ and $\beta_2$ are constant parameters


• Take logarithms of the production function $g = A p^{\beta_1} n^{\beta_2}$,

$$\ln g = \ln A + \beta_1 \ln p + \beta_2 \ln n$$

• Define $\alpha = \ln A$, $y = \ln g$, $x_1 = \ln p$ and $x_2 = \ln n$, then

$$y = \alpha + \beta_1 x_1 + \beta_2 x_2$$

• Table of log-values:

 t    x1 (= ln pt)   x2 (= ln nt)   y (= ln gt)
 1    3.0910         3.6889         3.6470
...   ...            ...            ...
30    3.3673         4.6052         4.1650
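The log transformation can be reproduced with a few lines of Python. The sketch below uses only the first and the last row of the barley table (observations 1 and 30), since those are the rows shown in the table of log-values; the dictionary layout is a choice made here.

```python
import math

# (p_t, n_t, g_t) for observations 1 and 30 of the barley data
barley_rows = {
    1: (22.00, 40.00, 38.36),
    30: (29.00, 100.00, 64.39),
}

# Natural logarithms, rounded to four decimals as in the table of log-values
log_rows = {
    t: tuple(round(math.log(value), 4) for value in row)
    for t, row in barley_rows.items()
}

for t, (x1, x2, y_log) in log_rows.items():
    print(t, x1, x2, y_log)
```

The rounded values agree with the two rows of the table of log-values.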


• The econometric model is

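$$y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t$$

for $t = 1, \dots, T$

• General model for K exogenous variables

$$y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \dots + \beta_K x_{Kt} + u_t$$

for $t = 1, \dots, T$ or

$$
\begin{aligned}
y_1 &= \alpha + \beta_1 x_{11} + \beta_2 x_{21} + \dots + \beta_K x_{K1} + u_1 \\
y_2 &= \alpha + \beta_1 x_{12} + \beta_2 x_{22} + \dots + \beta_K x_{K2} + u_2 \\
&\ \ \vdots \\
y_T &= \alpha + \beta_1 x_{1T} + \beta_2 x_{2T} + \dots + \beta_K x_{KT} + u_T
\end{aligned}
$$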


• Matrix notation: Define

$$
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix}; \quad
X = \begin{bmatrix} 1 & x_{11} & \dots & x_{K1} \\ 1 & x_{12} & \dots & x_{K2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1T} & \dots & x_{KT} \end{bmatrix}; \quad
\beta = \begin{bmatrix} \alpha \\ \beta_1 \\ \vdots \\ \beta_K \end{bmatrix}; \quad
u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_T \end{bmatrix}
$$

• Compact notation for the multiple regression model

$$y = X\beta + u$$

or

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix} =
\begin{bmatrix} 1 & x_{11} & \dots & x_{K1} \\ 1 & x_{12} & \dots & x_{K2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1T} & \dots & x_{KT} \end{bmatrix}
\begin{bmatrix} \alpha \\ \beta_1 \\ \vdots \\ \beta_K \end{bmatrix} +
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_T \end{bmatrix}
$$

The A-assumptions

Assumption A1: No relevant exogenous variable is omitted from the econometric model, and all exogenous variables included in the model are relevant

Assumption A2: The true functional dependence between X and y is linear

Assumption A3: The parameters β are constant for all T observations (xt, yt)


The B-assumptions

The B-assumptions are the same as in the simple linear model, i.e. $E(u_t) = 0$, $\operatorname{Var}(u_t) = \sigma^2$, $\operatorname{Cov}(u_t, u_s) = 0$ for $t \neq s$ and normality

B1 to B4 in matrix notation

$$u \sim N\left(0, \sigma^2 I_T\right)$$


The C-assumptions

Assumption C1: The exogenous variables $x_{1t}, \dots, x_{Kt}$ are not stochastic, but can be controlled as in an experimental situation

Assumption C2: No perfect multicollinearity: There are no parameter values $\gamma_0, \gamma_1, \gamma_2, \dots, \gamma_K$ (with at least one $\gamma_k \neq 0$) such that for all $t = 1, \dots, T$

$$\gamma_0 + \gamma_1 x_{1t} + \gamma_2 x_{2t} + \dots + \gamma_K x_{Kt} = 0$$

Assumption C2' in matrix notation:

$$\operatorname{rank}(X) = K + 1$$

(implication: $T \ge K + 1$)


Perfect multicollinearity with two regressors

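• If C2 is violated, there are γ0, γ1, γ2 (not all 0) such that

$$
\gamma_0 + \gamma_1 x_{1t} + \gamma_2 x_{2t} = 0
$$

for all t = 1, . . . , T; thus, if γ2 ≠ 0,

$$
x_{2t} = -(\gamma_0/\gamma_2) - (\gamma_1/\gamma_2)\, x_{1t} = \delta_0 + \delta_1 x_{1t}
$$

with $\delta_0 = -(\gamma_0/\gamma_2)$ and $\delta_1 = -(\gamma_1/\gamma_2)$

• Hence, there are not really two regressors, since

$$
y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t
= \underbrace{(\alpha + \beta_2\delta_0)}_{=\alpha'} + \underbrace{(\beta_1 + \beta_2\delta_1)}_{=\beta'}\, x_{1t} + u_t
$$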
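A small numerical check of this case, as a sketch with made-up data (the values of x1 and the relation x2 = 1 + 2·x1 are assumptions chosen only for illustration): when x2 is an exact linear function of x1, the matrix X'X is singular, so the normal equations have no unique solution.

```python
# Hypothetical data: x2 is an exact linear function of x1 (delta0 = 1, delta1 = 2).
x1 = [1, 2, 3, 4, 5]
x2 = [1 + 2 * v for v in x1]
X = [[1, a, b] for a, b in zip(x1, x2)]  # rows (1, x1t, x2t)

# Cross-product matrix X'X (3 x 3), computed with exact integer arithmetic.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


det_XtX = det3(XtX)  # vanishes because the columns of X are linearly dependent
print(det_XtX)
```

Integer data are used so that the determinant is computed without rounding error.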


Point estimation

• The econometric model is

y = Xβ + u

yt = α+ β1x1t + . . .+ βKxKt + ut for t = 1, . . . , T

• The estimated model is

$$
\hat y = X\hat\beta
$$

$$
\hat y_t = \hat\alpha + \hat\beta_1 x_{1t} + \dots + \hat\beta_K x_{Kt} \quad \text{for } t = 1, \dots, T
$$


• Define the residuals

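$$
\hat u = y - \hat y, \qquad \hat u_t = y_t - \hat y_t \quad \text{for } t = 1, \dots, T
$$

• How can we find an estimator $\hat\beta$ in the multiple regression model?

• The sum of squared residuals is

$$
S_{\hat u\hat u} = \hat u'\hat u = \sum_{t=1}^{T} \hat u_t^2
$$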


• Because of

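$$
\hat u = y - X\hat\beta, \qquad \hat u_t = y_t - \hat\alpha - \hat\beta_1 x_{1t} - \dots - \hat\beta_K x_{Kt}
$$

we have

$$
S_{\hat u\hat u} = (y - X\hat\beta)'(y - X\hat\beta)
= \sum_{t=1}^{T} \left(y_t - \hat\alpha - \hat\beta_1 x_{1t} - \dots - \hat\beta_K x_{Kt}\right)^2
$$

• First order conditions

$$
\frac{\partial S_{\hat u\hat u}}{\partial\hat\beta}
= \begin{bmatrix}
\partial S_{\hat u\hat u}/\partial\hat\alpha \\
\partial S_{\hat u\hat u}/\partial\hat\beta_1 \\
\vdots \\
\partial S_{\hat u\hat u}/\partial\hat\beta_K
\end{bmatrix} = 0
$$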

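• Vector of derivatives

$$
\frac{\partial S_{\hat u\hat u}}{\partial\hat\beta}
= \frac{\partial}{\partial\hat\beta}\,(y - X\hat\beta)'(y - X\hat\beta)
= \frac{\partial}{\partial\hat\beta}\, y'y - \frac{\partial}{\partial\hat\beta}\, 2y'X\hat\beta + \frac{\partial}{\partial\hat\beta}\, \hat\beta'X'X\hat\beta
= -2X'y + 2X'X\hat\beta
$$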

• J.R. Magnus, H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, rev. ed., John Wiley & Sons: Chichester, 1999.

• Phoebus J. Dhrymes, Mathematics for Econometrics, 3rd ed., Springer: New York, 2000.


• Solving the first order conditions yields the normal equations

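$$
X'X\hat\beta = X'y
$$

and thus

$$
\hat\beta = (X'X)^{-1}X'y
$$

• The terms are

$$
X'X = \begin{bmatrix}
T & \sum x_{1t} & \cdots & \sum x_{Kt} \\
\sum x_{1t} & \sum x_{1t}^2 & \cdots & \sum x_{1t}x_{Kt} \\
\vdots & \vdots & \ddots & \vdots \\
\sum x_{Kt} & \sum x_{Kt}x_{1t} & \cdots & \sum x_{Kt}^2
\end{bmatrix}, \quad
X'y = \begin{bmatrix}
\sum y_t \\ \sum x_{1t}y_t \\ \vdots \\ \sum x_{Kt}y_t
\end{bmatrix}
$$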

• Numeric illustration {10}
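The normal equations can be solved directly on a small data set. The following is a minimal sketch in plain Python, not the numeric illustration referenced as {10}: the data are hypothetical and generated without noise from y = 1 + 2·x1 − 3·x2, and the helper names `solve` and `ols` are chosen here for illustration.

```python
def solve(A, b):
    """Solve the linear system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        z[i] = (M[i][n] - sum(M[i][k] * z[k] for k in range(i + 1, n))) / M[i][i]
    return z


def ols(X, y):
    """Return the OLS coefficients solving the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yt for row, yt in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical data generated without noise from y = 1 + 2*x1 - 3*x2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * a - 3 * b for a, b in zip(x1, x2)]
X = [[1, a, b] for a, b in zip(x1, x2)]  # first column of ones for the intercept

beta_hat = ols(X, y)  # [alpha_hat, beta1_hat, beta2_hat]
print(beta_hat)
```

Because the data contain no error term, the estimates reproduce the generating coefficients up to rounding. Solving the linear system is used here instead of forming the inverse of X'X explicitly; both give the same estimator.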


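Meaning of the estimators $\hat\alpha$, $\hat\beta_1$ and $\hat\beta_2$

• Formal meaning

$$
\frac{\partial\hat y_t}{\partial x_{1t}} = \hat\beta_1 \quad \text{and} \quad \frac{\partial\hat y_t}{\partial x_{2t}} = \hat\beta_2
$$

• Meaning of $\hat\alpha$: for $x_{1t} = x_{2t} = 0$

$$
\ln \hat g_t = \hat\alpha = 0.9543
$$

$$
\hat g_t = e^{0.9543} = 2.5969
$$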


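• Meaning of $\hat\beta_1$ and $\hat\beta_2$:

$$
\hat\beta_1 = \frac{\partial\hat y_t}{\partial x_{1t}} = \frac{\partial(\ln g_t)}{\partial(\ln p_t)}
$$

• Because of

$$
\frac{\partial \ln g_t}{\partial g_t} = \frac{1}{g_t} \quad \text{and} \quad \frac{\partial \ln p_t}{\partial p_t} = \frac{1}{p_t}
$$

we find

$$
\hat\beta_1 = \frac{\partial g_t / g_t}{\partial p_t / p_t}
$$

• $\hat\beta_1$ is the estimated elasticity of the barley output with respect to the phosphate fertilizer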
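The two interpretations can be checked with a few lines of arithmetic. This sketch uses the intercept 0.9543 quoted above and the coefficient 0.59652 reported for x1 in the fertilizer results later in these slides; the 1 percent change in phosphate is an arbitrary illustration.

```python
import math

alpha_hat = 0.9543   # estimated intercept quoted above
beta1_hat = 0.59652  # estimated coefficient of x1 from the fertilizer results

# Predicted output when x1t = x2t = 0: exponentiate the predicted log output.
g_at_zero = math.exp(alpha_hat)

# Exact relative change (in percent) of predicted output if phosphate rises
# by 1 percent and x2 is held fixed; the elasticity approximates this value.
pct_change = (1.01 ** beta1_hat - 1) * 100
print(round(g_at_zero, 4), round(pct_change, 3))
```

The percentage change comes out slightly below the elasticity itself, since the elasticity is a marginal (infinitesimal) concept.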


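Coefficient of determination $R^2$

• The total variation of y can be decomposed in the same way as in the simple linear model

$$
\underbrace{S_{yy}}_{\text{total variation}} = \underbrace{S_{\hat y\hat y}}_{\text{explained variation}} + \underbrace{S_{\hat u\hat u}}_{\text{unexplained variation}}
$$

• The coefficient of determination is defined as

$$
R^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{S_{\hat y\hat y}}{S_{yy}} = \frac{S_{yy} - S_{\hat u\hat u}}{S_{yy}}
$$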


• Graphical illustration

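[Figure: diagram with areas labelled A, B, C, D, E, F, G for the variations $S_{yy}$, $S_{11}$ and $S_{22}$]

• Here

$$
R^2 = \frac{A + B + C}{A + B + C + E}
$$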


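• Computation of $R^2$: In the simple linear regression model

$$
R^2 = \frac{S_{\hat y\hat y}}{S_{yy}} = \frac{\hat\beta S_{xy}}{S_{yy}}
$$

• It can be shown that in the multiple linear regression model

$$
S_{\hat y\hat y} = \sum_{k=1}^{K} \hat\beta_k S_{ky}
$$

with the covariations $S_{ky} = \sum_{t=1}^{T} (x_{kt} - \bar x_k)(y_t - \bar y)$

• Then {11}

$$
R^2 = \frac{\sum_{k=1}^{K} \hat\beta_k S_{ky}}{S_{yy}}
$$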


Properties of the OLS estimators

• The estimator $\hat\beta$ is a random vector

• The expectation vector is [10]

$$
E(\hat\beta) = \beta
$$

(unbiasedness, Erwartungstreue)

• The covariance matrix of $\hat\beta$ is [11]

$$
V(\hat\beta) = \sigma^2 (X'X)^{-1}
$$

• Special case: Covariance matrix in the two regressor model:

$$
\begin{aligned}
Var(\hat\beta_1) &= \frac{\sigma^2}{S_{11}\left(1 - R^2_{1\cdot 2}\right)} \\
Var(\hat\beta_2) &= \frac{\sigma^2}{S_{22}\left(1 - R^2_{1\cdot 2}\right)} \\
Var(\hat\alpha) &= \sigma^2/T + \bar x_1^2\, Var(\hat\beta_1) + 2\bar x_1\bar x_2\, Cov(\hat\beta_1, \hat\beta_2) + \bar x_2^2\, Var(\hat\beta_2) \\
Cov(\hat\beta_1, \hat\beta_2) &= \frac{-\sigma^2 R^2_{1\cdot 2}}{S_{12}\left(1 - R^2_{1\cdot 2}\right)}
\end{aligned}
$$

where

$$
R^2_{1\cdot 2} = \frac{S_{12}^2}{S_{11}S_{22}}
$$

Gauss-Markov theorem

• The estimator $\hat\beta = (X'X)^{-1}X'y$ is linear in y, since

$$
\hat\beta = Dy \quad \text{with} \quad D = (X'X)^{-1}X'
$$

• $\hat\beta = (X'X)^{-1}X'y$ is not only unbiased but also efficient

• Let $\tilde\beta$ be another linear unbiased estimator of β

• Then $V(\tilde\beta) - V(\hat\beta)$ is positive semidefinite [12]

Distribution of the estimator

• The model is y = Xβ + u

• From $u \sim N(0, \sigma^2 I_T)$ we conclude that y is multivariate normally distributed

• Expectation vector and covariance matrix of the endogenous variable

$$
E(y) = E(X\beta + u) = X\beta
$$

$$
V(y) = V(X\beta + u) = V(u) = \sigma^2 I_T
$$

• Thus $y \sim N(X\beta, \sigma^2 I_T)$


• How is the estimator $\hat\beta$ distributed?

• Since $\hat\beta = (X'X)^{-1}X'y$ is a linear function of y, the estimator $\hat\beta$ also has a multivariate normal distribution

• Expectation vector and covariance matrix are already known

• Hence

$$
\hat\beta \sim N\left(\beta, \sigma^2 (X'X)^{-1}\right)
$$

• Problem: The error term variance $\sigma^2$ is unknown

• The covariance matrix $V(\hat\beta)$ cannot be computed without $\sigma^2$

• Since $\sigma^2$ is usually unknown, it has to be estimated

• An estimator of $\sigma^2$ is

$$
\hat\sigma^2 = \frac{S_{\hat u\hat u}}{T - K - 1}
$$

• Its expectation is $E(\hat\sigma^2) = \sigma^2$ [13]{12}

• The “residual maker matrix”

$$
M = I_T - X(X'X)^{-1}X'
$$
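Why dividing by T − K − 1 gives an unbiased estimator can be sketched with the residual maker matrix. The steps below are the standard argument, written out here as a supplement; they may differ in detail from the derivation referenced as [13].

```latex
\begin{aligned}
\hat u &= y - X\hat\beta = My = M(X\beta + u) = Mu
  && \text{since } MX = 0, \\
S_{\hat u\hat u} &= \hat u'\hat u = u'M'Mu = u'Mu
  && \text{since } M' = M \text{ and } MM = M, \\
E(S_{\hat u\hat u}) &= E\big(\operatorname{tr}(M u u')\big) = \sigma^2 \operatorname{tr}(M)
  = \sigma^2 \big(T - \operatorname{tr}\big((X'X)^{-1}X'X\big)\big)
  = \sigma^2 (T - K - 1).
\end{aligned}
```

Dividing the sum of squared residuals by T − K − 1 therefore yields expectation σ².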


Interval estimation

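• Interval estimation of a single component $\beta_k$ of the vector β

$$
P\left(\hat\beta_k - c \le \beta_k \le \hat\beta_k + c\right) = 1 - a
$$

• We know that

$$
\hat\beta_k \sim N\left(\beta_k, Var(\hat\beta_k)\right)
$$

where $Var(\hat\beta_k)$ is the (k + 1)th diagonal element of $\sigma^2 (X'X)^{-1}$

• Problem: $\sigma^2$ and $Var(\hat\beta_k)$ are unknown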


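• Step 1: Estimation of $\sigma^2$ by $\hat\sigma^2$ and of $se(\hat\beta_k) = \sqrt{Var(\hat\beta_k)}$ by $\widehat{se}(\hat\beta_k) = \sqrt{\widehat{Var}(\hat\beta_k)}$

• Step 2: Standardization of $\hat\beta_k$

$$
t = \frac{\hat\beta_k - E(\hat\beta_k)}{\widehat{se}(\hat\beta_k)} = \frac{\hat\beta_k - \beta_k}{\widehat{se}(\hat\beta_k)} \sim t_{(T-K-1)}
$$

• Step 3: Find the $t_{a/2}$-value

• Step 4: The (1 − a)-interval estimator is {13}

$$
\left[\hat\beta_k - t_{a/2} \cdot \widehat{se}(\hat\beta_k)\; ;\; \hat\beta_k + t_{a/2} \cdot \widehat{se}(\hat\beta_k)\right]
$$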
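Step 4 amounts to one line of arithmetic once the estimate, its estimated standard error and the critical value are available. The sketch below uses the coefficient and standard error of x1 from the fertilizer example in these slides; the critical value 2.05 is an assumed number for illustration only, and the function name `confidence_interval` is hypothetical.

```python
def confidence_interval(beta_hat, se_hat, t_crit):
    """Return (lower, upper) of the interval beta_hat -/+ t_crit * se_hat."""
    half_width = t_crit * se_hat
    return beta_hat - half_width, beta_hat + half_width


# Estimate and estimated standard error of x1 from the fertilizer example.
# t_crit = 2.05 is an assumed value; the correct t_{a/2} has to be looked up
# for T - K - 1 degrees of freedom and the chosen significance level a.
lower, upper = confidence_interval(0.59652, 0.13788, 2.05)
print(lower, upper)
```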


Interval estimation of linear combinations of β

• Let r be an arbitrary (K + 1)-column vector

• How can we find a confidence interval for $r'\beta$?

• Fertilizer example: $r = [0, 1, 1]'$, then $r'\beta = \beta_1 + \beta_2$ (economies of scale?)

• The point estimator of $r'\beta$ is $r'\hat\beta$

• The variance of $r'\hat\beta$ is $r'V(\hat\beta)r = \sigma^2 r'(X'X)^{-1}r$

• The confidence interval for $r'\beta$ is

$$
\left[r'\hat\beta - t_{a/2} \cdot \hat\sigma\sqrt{r'(X'X)^{-1}r}\; ;\; r'\hat\beta + t_{a/2} \cdot \hat\sigma\sqrt{r'(X'X)^{-1}r}\right]
$$

• Special case of a single component

$$
\beta_k = r'\beta
$$

for

$$
r = [0, \dots, 0, 1, 0, \dots, 0]'
$$

where the 1 is the element $r_k$ (the (k + 1)th entry of r, since the first entry belongs to α)

• Then $Var(\hat\beta_k) = r'\sigma^2 (X'X)^{-1} r$

Hypothesis tests: t-test

• There are tests of a single linear combination (t-tests) and tests of multiple linear combinations (F-tests)

• Testing a single linear combination of parameters: t-test (two-sided)

• Remember: In the simple linear regression case

$$
H_0: \beta = q
$$

$$
H_1: \beta \neq q
$$

• In the multiple linear model the null and alternative hypotheses are

$$
H_0: r_0\alpha + r_1\beta_1 + \dots + r_K\beta_K = q
$$

$$
H_1: r_0\alpha + r_1\beta_1 + \dots + r_K\beta_K \neq q
$$

or

$$
H_0: r'\beta = q
$$

$$
H_1: r'\beta \neq q
$$

where

$$
r = [r_0, r_1, \dots, r_K]'
$$

The test procedure:

1. Set up H0 and H1 and fix the significance level a

2. Estimate $se(r'\hat\beta)$

3. Compute the t-statistic $t = (r'\hat\beta - q)/\widehat{se}(r'\hat\beta)$

4. Find the critical value $t_{a/2}$

5. Test decision: Compare $t_{a/2}$ and $|t|$; reject H0 if $|t| > t_{a/2}$ {14}

• The left-sided t-test

$$
H_0: r'\beta \ge q
$$

$$
H_1: r'\beta < q
$$

and the right-sided test

$$
H_0: r'\beta \le q
$$

$$
H_1: r'\beta > q
$$

are similar

• The critical values are lower quantiles of the t-distribution for the left-sided test and upper quantiles for the right-sided test {14}

Hypothesis tests: F-test

• Simultaneous test of two or more linear combinations (restrictions)

• Null hypothesis and alternative hypothesis

$$
H_0: R\beta = q
$$

$$
H_1: R\beta \neq q
$$

• Examples:

$$
\begin{aligned}
&H_0: \beta_1 = \beta_2 = \dots = \beta_K = 0 \\
&H_0: \beta_1 = \beta_2 = \dots = \beta_K \\
&H_0: \beta_1 + \dots + \beta_K = 1 \text{ and } \beta_1 = 2\beta_2 \\
&H_0: \beta_1 = 5 \text{ and } \beta_2 = \dots = \beta_K = 0
\end{aligned}
$$

• Basic idea of the F -test: Compare the restricted and the unrestricted model

• Sum of squared residuals of the econometric model and of the model under the null hypothesis

$$
S_{\hat u\hat u} = \hat u'\hat u = \sum_{t=1}^{T} \hat u_t^2
$$

$$
S^0_{\hat u\hat u} = \hat u^{0\prime}\hat u^0 = \sum_{t=1}^{T} \left(\hat u^0_t\right)^2
$$

where $\hat u^0$ are the residuals if the model is estimated under the restrictions of the null hypothesis

• Example: Null hypothesis

$$
y_t = \alpha + 0 \cdot x_{1t} + \dots + 0 \cdot x_{Kt} + u_t = \alpha + u_t
$$

• Obviously, $S^0_{\hat u\hat u} \ge S_{\hat u\hat u}$; the null hypothesis is likely to be false if $S^0_{\hat u\hat u}$ is “much larger” than $S_{\hat u\hat u}$

• The test statistic is

$$
F = \frac{\left(S^0_{\hat u\hat u} - S_{\hat u\hat u}\right)/L}{S_{\hat u\hat u}/(T - K - 1)}
$$

where L is the number of restrictions in H0

• If the null hypothesis is true, then $F \sim F_{(L,\,T-K-1)}$

The five steps of the F -test

1. Set up H0 and H1 and choose the significance level a

2. Calculate $S_{\hat u\hat u}$ and $S^0_{\hat u\hat u}$ (more on the computation of $S^0_{\hat u\hat u}$ later)

3. Compute the F-test statistic

4. Find the critical value $F_a$, i.e. the upper a-quantile of the $F_{(L,\,T-K-1)}$-distribution

5. Reject H0 if $F > F_a$ {15}
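Step 3 can be sketched as a small function. The sums of squared residuals and the dimensions below are hypothetical numbers chosen only to illustrate the formula, and the function name `f_statistic` is an assumption of this sketch.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, L, T, K):
    """F = ((S0 - S) / L) / (S / (T - K - 1)) for L linear restrictions."""
    numerator = (ssr_restricted - ssr_unrestricted) / L
    denominator = ssr_unrestricted / (T - K - 1)
    return numerator / denominator


# Hypothetical values: restricted SSR 100, unrestricted SSR 60,
# L = 2 restrictions, T = 23 observations, K = 2 exogenous variables.
F = f_statistic(ssr_restricted=100.0, ssr_unrestricted=60.0, L=2, T=23, K=2)
print(F)
```

The resulting value would then be compared with the upper a-quantile of the F distribution with L and T − K − 1 degrees of freedom, taken from a table or from statistical software.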


Remarks:

• For L = 1 the F -test is identical to a two-sided t-test

• Careful: A combination of t-tests is not the same as a single F -test

• The decisions of individual t-tests and of an F-test can contradict each other

• Distinction between individual t-tests and a simultaneous F -test


• Example: H0 : β1 = β2 = 0.33 {16}


Computation of $\hat u^{0\prime}\hat u^0$

• Estimate β subject to the restrictions Rβ = q given in the null hypothesis

• Optimization under constraints: Minimize

$$
S^0_{\hat u\hat u} = (y - X\hat\beta)'(y - X\hat\beta)
$$

with respect to $\hat\beta$ subject to $R\hat\beta = q$

• A standard Lagrange approach yields [14]

$$
\hat\beta^{RLS} = \hat\beta - (X'X)^{-1}R'\left[R(X'X)^{-1}R'\right]^{-1}\left(R\hat\beta - q\right)
$$


• Residuals of the restricted model: $\hat u^0 = y - X\hat\beta^{RLS}$ {17}

• The F-test statistic can also be written as [15]

$$
F = \frac{\left(R\hat\beta - q\right)'\left[R(X'X)^{-1}R'\right]^{-1}\left(R\hat\beta - q\right)/L}{\hat u'\hat u/(T - K - 1)}
$$

• Note the similarity to the t-test statistic

$$
t^2 = \frac{\left(r'\hat\beta - q\right)^2}{\hat\sigma^2\left[r'(X'X)^{-1}r\right]}
$$

• Standard statistical software includes simultaneous tests of linear combinations (F-tests)


Maximum likelihood estimation

• Reminder: If X is a K-dimensional random vector with multivariate normal distribution N(μ, Σ), then its joint density is

$$
f_X(x) = (2\pi)^{-K/2} (\det\Sigma)^{-1/2} \exp\left(-\frac{1}{2}(x - \mu)'\Sigma^{-1}(x - \mu)\right)
$$

• Multiple linear regression model

$$
y = X\beta + u \quad \text{with} \quad u \sim N(0, \sigma^2 I)
$$

• Distribution of the endogenous variables: $y \sim N(X\beta, \sigma^2 I)$

• Joint density of y

$$
\begin{aligned}
f_y(y) &= (2\pi)^{-T/2} \left(\det \sigma^2 I\right)^{-1/2} \exp\left(-\frac{1}{2}(y - X\beta)'\left(\sigma^2 I\right)^{-1}(y - X\beta)\right) \\
&= (2\pi)^{-T/2} \left(\sigma^{2T}\right)^{-1/2} \exp\left(-\frac{(y - X\beta)'(y - X\beta)}{2\sigma^2}\right)
\end{aligned}
$$

• Log-likelihood function

$$
\ln L(\beta, \sigma^2) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln\sigma^2 - \frac{(y - X\beta)'(y - X\beta)}{2\sigma^2}
$$

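• First order conditions for a maximum

$$
\begin{bmatrix}
\dfrac{\partial \ln L}{\partial\beta} \\[2ex]
\dfrac{\partial \ln L}{\partial\sigma^2}
\end{bmatrix}
=
\begin{bmatrix}
\dfrac{X'(y - X\beta)}{\sigma^2} \\[2ex]
-\dfrac{T}{2\sigma^2} + \dfrac{(y - X\beta)'(y - X\beta)}{2\sigma^4}
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
$$

• Solution of the FOCs [16]

$$
\hat\beta_{ML} = (X'X)^{-1}X'y
$$

$$
\hat\sigma^2_{ML} = \frac{1}{T}\,\hat u'\hat u
$$

• The ML estimator of β is identical to the OLS estimator; the ML estimator of $\sigma^2$ divides by T instead of T − K − 1 and is therefore biased (but asymptotically unbiased)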
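The size of the bias follows in one step from the unbiasedness of the estimator that divides by T − K − 1, i.e. from $E(\hat u'\hat u) = \sigma^2 (T - K - 1)$. The short derivation below is added as a supplement and is not part of the referenced derivation [16].

```latex
E\left(\hat\sigma^2_{ML}\right)
  = \frac{1}{T}\, E\left(\hat u'\hat u\right)
  = \frac{T - K - 1}{T}\,\sigma^2 ,
\qquad
\lim_{T \to \infty} \frac{T - K - 1}{T} = 1 .
```

The ML estimator thus underestimates σ² on average in finite samples, and the bias vanishes as T grows.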


The classical tests (LR, Wald, LM)

• Illustration of the basic test ideas [threetests.R]

• Generalization to multiple restrictions

$$
H_0: g(\beta) = 0
$$

$$
H_1: g(\beta) \neq 0
$$

where β is the coefficient vector of a multiple linear regression model and g is a (possibly nonlinear) vector-valued function

• Test of L linear restrictions: g(β) = Rβ − q


Wald test

• Idea: If $g(\hat\beta_{ML})$ is significantly different from 0, reject H0

• Test statistic (for multiple restrictions)

$$
W = g(\hat\beta_{ML})'\left[\widehat{Cov}\left(g(\hat\beta_{ML})\right)\right]^{-1} g(\hat\beta_{ML}) \xrightarrow{d} U \sim \chi^2_L
$$

if the null hypothesis is true

• Wald test statistic for L linear restrictions Rβ − q = 0 [17]


Likelihood ratio (LR) test

• Idea: If the maximal likelihood under the restrictions $L(\hat\beta_R, \hat\sigma^2_R)$ is significantly lower than the maximal likelihood without restrictions $L(\hat\beta_{ML}, \hat\sigma^2_{ML})$, then reject H0

• Test statistic

$$
LR = 2\left(\ln L(\hat\beta_{ML}, \hat\sigma^2_{ML}) - \ln L(\hat\beta_R, \hat\sigma^2_R)\right) \xrightarrow{d} U \sim \chi^2_L
$$

if the null hypothesis is true

• LR test statistic for L linear restrictions Rβ − q = 0 [18]


Lagrange multiplier (LM) test

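• Idea: If the slope of the log-likelihood function $\partial \ln L(\hat\beta_R)/\partial\beta$ is significantly different from 0, reject H0

• Test statistic

$$
LM = \left(\frac{\partial \ln L(\hat\beta_R)}{\partial\beta}\right)'\left[\widehat{Cov}(\hat\beta_R)\right]^{-1}\left(\frac{\partial \ln L(\hat\beta_R)}{\partial\beta}\right) \xrightarrow{d} U \sim \chi^2_L
$$

if the null hypothesis is true

• LM test statistic for L linear restrictions Rβ − q = 0 [19]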
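For linear restrictions in the normal linear model, all three statistics can be expressed through the restricted and unrestricted sums of squared residuals. The closed forms used below are the standard textbook expressions based on the ML variance estimates; they are stated here as an assumption about what the derivations [17] to [19] yield, since those are not reproduced in this section. The input numbers are hypothetical.

```python
import math


def classical_tests(ssr_restricted, ssr_unrestricted, T):
    """Wald, LR and LM statistics for linear restrictions, written via the SSRs."""
    S0, S = ssr_restricted, ssr_unrestricted
    wald = T * (S0 - S) / S      # uses the unrestricted variance estimate S / T
    lr = T * math.log(S0 / S)    # twice the difference of the maximized log-likelihoods
    lm = T * (S0 - S) / S0       # uses the restricted variance estimate S0 / T
    return wald, lr, lm


# Hypothetical values: restricted SSR 100, unrestricted SSR 60, T = 23.
W, LR, LM = classical_tests(ssr_restricted=100.0, ssr_unrestricted=60.0, T=23)
print(W, LR, LM)
```

With these expressions the ordering W ≥ LR ≥ LM holds in every sample, although all three share the same asymptotic chi-square distribution with L degrees of freedom under H0.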


Forecasting

• The approach is similar to forecasting in the simple linear regression

• Let $x_0 = [1, x_{10}, x_{20}, \dots, x_{K0}]'$ denote the vector of exogenous variables

• Point forecast

$$
\hat y_0 = x_0'\hat\beta
$$

• Variance of the forecast error [20]

$$
Var(\hat y_0 - y_0) = \sigma^2\left(1 + x_0'(X'X)^{-1}x_0\right)
$$

Presentation of the results

• In the literature, the results of regression analyses are often presented as follows

$$
\begin{array}{cccccc}
\hat y = & \hat\alpha & + & \hat\beta_1 x_1 & + \dots + & \hat\beta_K x_K \\
 & (\widehat{se}(\hat\alpha)) & & (\widehat{se}(\hat\beta_1)) & & (\widehat{se}(\hat\beta_K))
\end{array}
$$

• Sometimes you find t-values in the parentheses, i.e. the values of the test statistics for the tests $H_0: \beta_k = 0$ vs $H_1: \beta_k \neq 0$

• Often, $R^2$ and $\hat\sigma$ and the value of the test statistic of the F-test

$$
H_0: \beta_1 = \dots = \beta_K = 0 \quad \text{vs} \quad H_1: \text{not } H_0
$$

are reported additionally

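• Fertilizer example:

$$
\begin{array}{cccccc}
\hat y = & 0.95432 & + & 0.59652\, x_1 & + & 0.26255\, x_2 \\
 & (0.46943) & & (0.13788) & & (0.03400)
\end{array}
$$

• Additional results

$$
R^2 = 0.743, \qquad \hat\sigma^2 = 0.00425, \qquad \hat\sigma = 0.0652
$$

• Test statistics

$H_0: \beta_1 = 0 \longrightarrow 4.326$
$H_0: \beta_2 = 0 \longrightarrow 7.723$
$H_0: \beta_1 = \beta_2 = 0 \longrightarrow 38.98$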
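The two t-values and the standard error of the regression can be recomputed from the other reported numbers. This is a quick consistency check using only the values printed above; differences in the last digit stem from rounding of the reported figures.

```python
import math

coef = {"x1": 0.59652, "x2": 0.26255}  # reported coefficient estimates
se = {"x1": 0.13788, "x2": 0.03400}    # reported estimated standard errors

# The t-statistic for H0: beta_k = 0 is the estimate divided by its standard error.
t_values = {name: coef[name] / se[name] for name in coef}

# The standard error of the regression is the square root of the estimated variance.
sigma_hat = math.sqrt(0.00425)
print(t_values, sigma_hat)
```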


Examples of computer output:

• Excel

• SPSS

• EViews

• Stata

• R

• MATLAB

Assumptions

A1: No relevant variable is omitted, and no irrelevant variables are included
A2: The true functional dependence between X and y is linear
A3: The parameters β are constant for all T observations (xt, yt)
B1-B4: $u \sim N(0, \sigma^2 I_T)$
C1: The exogenous variables are not stochastic
C2: No perfect multicollinearity: rank(X) = K + 1

All assumptions can be violated

What happens if they are violated?


Omitted or irrelevant variables

• Assumption A1: No relevant exogenous variable is omitted from the econometric model, and all exogenous variables included in the model are relevant

• What happens if relevant variables are missing?

• What happens if there are irrelevant variables included in the model?

• Example: Wage structure in a firm with 20 employees; what are the determinants of the wage yt?


Data: Education x1t; age x2t; firm tenure x3t

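| t | yt | x1t | x2t | x3t | t | yt | x1t | x2t | x3t |
|---|----|-----|-----|-----|---|----|-----|-----|-----|
| 1 | 1250 | 1 | 28 | 12 | 11 | 1350 | 1 | 30 | 13 |
| 2 | 1950 | 9 | 34 | 8 | 12 | 1600 | 2 | 43 | 21 |
| 3 | 2300 | 11 | 55 | 25 | 13 | 1400 | 2 | 23 | 5 |
| 4 | 1350 | 3 | 24 | 5 | 14 | 1500 | 3 | 21 | 1 |
| 5 | 1650 | 2 | 42 | 21 | 15 | 2350 | 6 | 50 | 22 |
| 6 | 1750 | 1 | 43 | 19 | 16 | 1700 | 9 | 64 | 36 |
| 7 | 1550 | 4 | 37 | 17 | 17 | 1350 | 1 | 36 | 10 |
| 8 | 1400 | 1 | 18 | 1 | 18 | 2600 | 7 | 58 | 30 |
| 9 | 1700 | 3 | 63 | 25 | 19 | 1400 | 2 | 35 | 17 |
| 10 | 2000 | 4 | 58 | 30 | 20 | 1550 | 2 | 41 | 6 |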


• Three potential models (M2 is the true model)

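$$
\begin{aligned}
\text{(M1)} \quad y_t &= \alpha + \beta x_{1t} + u'_t \\
\text{(M2)} \quad y_t &= \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t \\
\text{(M3)} \quad y_t &= \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + u''_t
\end{aligned}
$$

| Model | Variable | Coeff. | Est. se | t-value | p-value |
|-------|----------|--------|---------|---------|---------|
| (M1) | Constant | 1354.7 | 94.2 | 14.38 | 0.0000 |
| | Education | 89.3 | 19.8 | 4.50 | 0.0003 |
| (M2) | Constant | 1027.8 | 164.5 | 6.25 | 0.0000 |
| | Education | 62.6 | 21.2 | 2.95 | 0.0089 |
| | Age | 10.6 | 4.6 | 2.32 | 0.0333 |
| (M3) | Constant | 1000.5 | 225.7 | 4.43 | 0.0004 |
| | Education | 62.4 | 21.8 | 2.86 | 0.0114 |
| | Age | 12.4 | 10.7 | 1.16 | 0.2634 |
| | Firm tenure | -2.6 | 14.3 | -0.18 | 0.8569 |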


Omitted relevant variables

• Graphical representation

[Figure: diagram with areas labelled A, B, C, D, E, F, G for the variations $S_{yy}$, $S_{11}$ and $S_{22}$]

• The models:

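$$
\begin{aligned}
\text{(M1)} \quad y_t &= \alpha + \beta x_{1t} + u'_t \\
\text{(M2)} \quad y_t &= \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t \\
\text{(M3)} \quad y_t &= \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + u''_t
\end{aligned}
$$

• The error terms

$$
\begin{aligned}
u'_t &= \beta_2 x_{2t} + u_t \\
E(u'_t) &= E(\beta_2 x_{2t} + u_t) \\
&= \beta_2 x_{2t} + E(u_t) \\
&= \beta_2 x_{2t} + 0 \\
&\neq 0
\end{aligned}
$$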


• If a relevant exogenous variable is omitted, assumption B1 is violated!

• Consequence for point estimation

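$$
\hat\beta'_1 = \hat\beta_1 + \hat\beta_2\,\frac{S_{12}}{S_{11}}
$$

$$
E(\hat\beta'_1) = E\left(\hat\beta_1 + \hat\beta_2\,\frac{S_{12}}{S_{11}}\right) = \beta_1 + \beta_2\,\frac{S_{12}}{S_{11}}
$$

• Consequence for interval estimation

$$
\left[\hat\beta'_1 - t_{a/2} \cdot \widehat{se}(\hat\beta'_1)\; ;\; \hat\beta'_1 + t_{a/2} \cdot \widehat{se}(\hat\beta'_1)\right]
$$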
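The relation between the slope of the short regression (M1) and the estimates of (M2) can be verified on the wage data from the table in these slides. The sketch below uses the standard closed-form OLS solution for two regressors, which is not stated on the slide, and a helper `covariation` named here for illustration.

```python
# Wage data from the slides: wage y, education x1, age x2 (T = 20).
y = [1250, 1950, 2300, 1350, 1650, 1750, 1550, 1400, 1700, 2000,
     1350, 1600, 1400, 1500, 2350, 1700, 1350, 2600, 1400, 1550]
x1 = [1, 9, 11, 3, 2, 1, 4, 1, 3, 4, 1, 2, 2, 3, 6, 9, 1, 7, 2, 2]
x2 = [28, 34, 55, 24, 42, 43, 37, 18, 63, 58,
      30, 43, 23, 21, 50, 64, 36, 58, 35, 41]


def covariation(a, b):
    """S_ab = sum over t of (a_t - mean(a)) * (b_t - mean(b))."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b))


S11, S22, S12 = covariation(x1, x1), covariation(x2, x2), covariation(x1, x2)
S1y, S2y = covariation(x1, y), covariation(x2, y)

# Model M1 (age omitted): slope of the simple regression of y on x1.
b1_short = S1y / S11

# Model M2: closed-form OLS slopes for two regressors.
D = S11 * S22 - S12 ** 2
b1 = (S22 * S1y - S12 * S2y) / D
b2 = (S11 * S2y - S12 * S1y) / D

# The short-regression slope equals b1 + b2 * S12 / S11.
print(b1_short, b1 + b2 * S12 / S11)
```

Since education and age are positively related in this sample (S12 > 0) and the age coefficient is positive, the education coefficient in M1 is larger than in M2.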


• Further

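$$
se(\hat\beta'_1) = \sqrt{var(\hat\beta'_1)} \quad \text{with} \quad var(\hat\beta'_1) = \frac{\sigma^2}{S_{11}}
$$

• The estimator

$$
\hat\sigma^2 = \frac{S_{\hat u'\hat u'}}{T - 2}
$$

is biased; the unbiased estimator is

$$
\hat\sigma^2 = \frac{S_{\hat u\hat u}}{T - 3}
$$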


• Conclusion: The coverage probability of the confidence intervals is not 1 − a

• Hypothesis tests are also biased: The probability of an error of the first kind does not equal the significance level

• If a relevant exogenous variable is omitted, then

— the point estimators are biased and inconsistent

— the interval estimators and hypothesis tests are no longer valid {18}


Irrelevant variables

• The error term in the misspecified model M3 is

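$$
u''_t = u_t - \beta_3 x_{3t}
$$

and since $\beta_3 = 0$

$$
u''_t = u_t
$$

• Consequently,

$$
\begin{aligned}
E(\hat\alpha'') &= \alpha \\
E(\hat\beta''_1) &= \beta_1 \\
E(\hat\beta''_2) &= \beta_2 \\
E(\hat\beta''_3) &= \beta_3 = 0
\end{aligned}
$$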


• The variances of the estimators are

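$$
Var(\hat\beta_1) = \frac{\sigma^2}{S_{11}\left(1 - R^2_{1\cdot 2}\right)}
$$

$$
Var(\hat\beta''_1) = \frac{\sigma^2}{S_{11}\left(1 - R^2_{1\cdot 2} - R^2_{1\cdot 3}\right)}
$$

• The estimated error term variance is

$$
\hat\sigma^2 = \frac{S_{\hat u''\hat u''}}{T - 4}
$$

• Conclusion: Omitted relevant variables are a serious problem, redundant variables are not (but they inflate the standard errors)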


Diagnosis

• How can we find the correct model?

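• The coefficient of determination $R^2$ does not help select a model, since it never decreases when a further regressor is added

• Adjusted $R^2$

$$
\bar R^2 = 1 - \frac{S_{\hat u\hat u}/(T - K - 1)}{S_{yy}/(T - 1)} = 1 - \left(1 - R^2\right)\frac{T - 1}{T - K - 1}
$$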


• Further model selection criteria (trade-off between biasedness and inefficiency)

• Akaike information criterion (AIC)

$$
AIC = \ln\left(\frac{S_{\hat u\hat u}}{T}\right) + \frac{2(K + 1)}{T}
$$

• t-test for single variables; F-test for multiple variables
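The criteria above can be compared for the three wage models M1, M2 and M3. The following sketch refits the models on the wage data from the slides with a plain-Python least-squares routine (the helper names `solve` and `ols_ssr` are chosen here for illustration) and evaluates the adjusted R² and the AIC formula as given above.

```python
import math

# Wage data from the slides: wage y, education x1, age x2, firm tenure x3.
y = [1250, 1950, 2300, 1350, 1650, 1750, 1550, 1400, 1700, 2000,
     1350, 1600, 1400, 1500, 2350, 1700, 1350, 2600, 1400, 1550]
x1 = [1, 9, 11, 3, 2, 1, 4, 1, 3, 4, 1, 2, 2, 3, 6, 9, 1, 7, 2, 2]
x2 = [28, 34, 55, 24, 42, 43, 37, 18, 63, 58,
      30, 43, 23, 21, 50, 64, 36, 58, 35, 41]
x3 = [12, 8, 25, 5, 21, 19, 17, 1, 25, 30,
      13, 21, 5, 1, 22, 36, 10, 30, 17, 6]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (M[i][n] - sum(M[i][k] * z[k] for k in range(i + 1, n))) / M[i][i]
    return z


def ols_ssr(columns, y):
    """Fit OLS with an intercept and return the sum of squared residuals."""
    X = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yt for r, yt in zip(X, y)) for i in range(k)]
    b = solve(XtX, Xty)
    return sum((yt - sum(bi * xi for bi, xi in zip(b, r))) ** 2 for r, yt in zip(X, y))


T = len(y)
y_bar = sum(y) / T
Syy = sum((v - y_bar) ** 2 for v in y)

results = {}
for name, cols in [("M1", [x1]), ("M2", [x1, x2]), ("M3", [x1, x2, x3])]:
    K = len(cols)
    S = ols_ssr(cols, y)
    adj_r2 = 1 - (S / (T - K - 1)) / (Syy / (T - 1))
    aic = math.log(S / T) + 2 * (K + 1) / T
    results[name] = (S, adj_r2, aic)
    print(name, round(adj_r2, 3), round(aic, 3))
```

A higher adjusted R² and a lower AIC indicate the preferred model; both criteria penalize additional regressors, unlike the plain R².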


Functional form

• Assumption A2: The true functional dependence between X and y is linear

• Milk example: Milk production m depends on amount of concentrated feed f

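| t | ft | mt | t | ft | mt |
|---|----|----|---|----|----|
| 1 | 10 | 6525 | 7 | 8 | 5821 |
| 2 | 30 | 8437 | 8 | 14 | 7531 |
| 3 | 20 | 8019 | 9 | 25 | 8320 |
| 4 | 33 | 8255 | 10 | 1 | 4336 |
| 5 | 5 | 5335 | 11 | 17 | 7225 |
| 6 | 22 | 7236 | 12 | 28 | 8112 |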


[Figure: scatter plot of milk production (Milchmenge, vertical axis, about 5000 to 8000) against concentrated feed (Kraftfutter, horizontal axis, 0 to 30)]

• A misspecified model returns useless results

• Some nonlinear dependencies

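$$
\begin{aligned}
\text{Semi-log:} \quad & m_t = \alpha + \beta \ln f_t + u_t \\
\text{Inverse:} \quad & m_t = \alpha + \beta\,(1/f_t) + u_t \\
\text{Exponential:} \quad & \ln m_t = \alpha + \beta f_t + u_t \\
\text{Logarithmic:} \quad & \ln m_t = \alpha + \beta \ln f_t + u_t \\
\text{Quadratic:} \quad & m_t = \alpha + \beta_1 f_t + \beta_2 f_t^2 + u_t
\end{aligned}
$$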


• Approach I: Estimation of a nonlinear regression

$$
y_t = g(x_t) + u_t
$$

with criterion function

$$
\sum_{t=1}^{T} \left(y_t - g(x_t)\right)^2
$$

• Optimization by numerical methods

• Approach II: Linearization of the model; then linear regression

$$
y_t = \alpha + \beta x_t + u_t \quad \text{with} \quad y_t = \ln m_t, \quad x_t = \ln f_t
$$


Diagnosis: Regression Specification Error Test (RESET)

• Higher order Taylor approximation

$$
y_t = f(x_t) = \alpha + \beta_1 x_t + \beta_2 x_t^2 + \beta_3 x_t^3 + \dots
$$

• Are the higher orders (jointly) significant?

• F -test of β2 = β3 = . . . = 0

• Problem: What happens if there are many exogenous variables?


• Basic idea of the RESET: $\hat y_t^2, \hat y_t^3, \dots$ are included as additional exogenous variables

$$
y_t = \alpha + \beta_1 x_t + \gamma_2 \hat y_t^2 + \gamma_3 \hat y_t^3 + u_t
$$

• If γ2 and/or γ3 are significant, then there are nonlinearities

• F -test of γ2 = γ3 = 0 (maybe even higher orders)

• The test is implemented in many statistical software packages


RESET in the linear model:

1. Estimate the linear model and calculate $S_{\hat u\hat u}$ and the fitted values $\hat y_t$

2. Add L powers of $\hat y_t$ to the linear model, e.g. for L = 2

$$
y_t = \alpha + \beta_1 x_t + \gamma_2 \hat y_t^2 + \gamma_3 \hat y_t^3 + u_t
$$

Estimate the extended model and calculate the sum of squared residuals $S^*_{\hat u\hat u}$

3. The null hypothesis is $H_0: \gamma_2 = \gamma_3 = 0$

4. Compute the F-test statistic

$$
F_{(L,\,T-K^*-1)} = \frac{\left(S_{\hat u\hat u} - S^*_{\hat u\hat u}\right)/L}{S^*_{\hat u\hat u}/(T - K^* - 1)}
$$

where K* is the number of exogenous variables in the extended model

5. If $F > F_a$ (significance level a, degrees of freedom L and T − K* − 1), then H0 is rejected and the linear model is discarded

• Milk example {18}
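The five steps can be sketched in plain Python on the milk data from the table in these slides. This is not the worked example referenced as {18}. One deviation is an assumption of this sketch: the fitted values are standardized before their powers are taken, purely for numerical stability. Because the fitted values are a linear function of the regressors already in the model, this rescaling leaves both sums of squared residuals, and hence F, unchanged. The helper names `solve` and `ols` are hypothetical.

```python
# Milk data from the slides: concentrated feed f and milk production m (T = 12).
f = [10, 30, 20, 33, 5, 22, 8, 14, 25, 1, 17, 28]
m = [6525, 8437, 8019, 8255, 5335, 7236, 5821, 7531, 8320, 4336, 7225, 8112]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (M[i][n] - sum(M[i][k] * z[k] for k in range(i + 1, n))) / M[i][i]
    return z


def ols(X, y):
    """Return (coefficients, fitted values, sum of squared residuals)."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yt for r, yt in zip(X, y)) for i in range(k)]
    b = solve(XtX, Xty)
    fitted = [sum(bi * xi for bi, xi in zip(b, r)) for r in X]
    ssr = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    return b, fitted, ssr


T = len(m)

# Step 1: linear model m_t = alpha + beta * f_t + u_t.
_, fitted, S_lin = ols([[1.0, ft] for ft in f], m)

# Standardize the fitted values (numerical-stability choice of this sketch).
mean_fit = sum(fitted) / T
sd_fit = (sum((v - mean_fit) ** 2 for v in fitted) / T) ** 0.5
z = [(v - mean_fit) / sd_fit for v in fitted]

# Step 2: extended model with the squared and cubed (standardized) fitted values.
_, _, S_ext = ols([[1.0, ft, zt ** 2, zt ** 3] for ft, zt in zip(f, z)], m)

# Steps 3 and 4: F statistic for H0: gamma_2 = gamma_3 = 0.
L = 2        # number of added powers
K_star = 3   # exogenous variables in the extended model
F = ((S_lin - S_ext) / L) / (S_ext / (T - K_star - 1))
print(F)
```

Step 5 then compares F with the upper a-quantile of the F distribution with L = 2 and T − K* − 1 = 8 degrees of freedom.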


Qualitative exogenous variables

• Assumption A3: The parameters β are constant for all T observations (xt, yt)

• Example: The wage yt depends on education x1t and age x2t

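$$
y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t
$$

• The wage equations for males and females might be different

$$
y_t = \alpha_M + \beta_{M1} x_{1t} + \beta_{M2} x_{2t} + u_t
$$

$$
y_t = \alpha_F + \beta_{F1} x_{1t} + \beta_{F2} x_{2t} + u_t
$$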

• What happens if the difference is neglected? [qualitative.R]


• Dummy variable

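$$
D_t = \begin{cases} 0 & \text{if male} \\ 1 & \text{if female} \end{cases}
$$

• Extended model

$$
y_t = \alpha + \gamma D_t + \beta_1 x_{1t} + \delta_1 D_t x_{1t} + \beta_2 x_{2t} + \delta_2 D_t x_{2t} + u_t
$$

• Model for men ($D_t = 0$)

$$
y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t
$$

• Model for women ($D_t = 1$)

$$
y_t = (\alpha + \gamma) + (\beta_1 + \delta_1)\, x_{1t} + (\beta_2 + \delta_2)\, x_{2t} + u_t
$$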


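• If the qualitative variable has more than two values, we need more than one dummy variable

• Example: Religion (protestant, catholic, other)

$$
D^P_t = \begin{cases} 0 & \text{for other} \\ 1 & \text{for protestant} \\ 0 & \text{for catholic} \end{cases}
\qquad
D^C_t = \begin{cases} 0 & \text{for other} \\ 0 & \text{for protestant} \\ 1 & \text{for catholic} \end{cases}
$$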

• Meaning of the coefficients; testing structural stability


• Estimation of the model

• Use the ordinary t- or F -tests to detect differences in the coefficients, e.g.

H0 : γ = δ1 = δ2 = 0

• Very often, the model includes only a level effect, i.e.

$$
y_t = \alpha + \gamma D_t + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t
$$

• Then use a t-test for γ


• Estimation of the wage equation model

yt = α+Dtγ + β1x1t + δ1Dtx1t + β2x2t + δ2Dtx2t + ut

• Compare with separate estimation of the two models [wages.R]

yt = αM + βM1x1t + βM2x2t + ut for men

yt = αF + βF1x1t + βF2x2t + ut for women

• The point estimates and the sum of squared residuals are identical (why?)

• The standard errors differ (why?)

• For simplicity we only consider one exogenous variable

yt = α+ γDt + βxt + δDtxt + ut

• Order the observations such that Dt = 0 for t = 1, . . . , T1 and Dt = 1 for t = T1 + 1, . . . , T

• The joint estimation minimizes (with respect to α, β, γ, δ)

$$S(\alpha, \beta, \gamma, \delta) = \sum_{t=1}^{T_1} (y_t - \alpha - \beta x_t)^2 + \sum_{t=T_1+1}^{T} \bigl(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\bigr)^2$$

The first order conditions for the joint estimation are

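$$
\begin{aligned}
\frac{\partial S}{\partial \alpha} &= -2\sum_{t=1}^{T_1} (y_t - \alpha - \beta x_t) - 2\sum_{t=T_1+1}^{T} \bigl(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\bigr) = 0 \\
\frac{\partial S}{\partial \beta} &= -2\sum_{t=1}^{T_1} (y_t - \alpha - \beta x_t)\, x_t - 2\sum_{t=T_1+1}^{T} \bigl(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\bigr)\, x_t = 0 \\
\frac{\partial S}{\partial \gamma} &= -2\sum_{t=T_1+1}^{T} \bigl(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\bigr) = 0 \\
\frac{\partial S}{\partial \delta} &= -2\sum_{t=T_1+1}^{T} \bigl(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\bigr)\, x_t = 0
\end{aligned}
$$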

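• The last two conditions are the normal equations of the second subsample (in α + γ and β + δ); inserting them into the first two leaves the normal equations of the first subsample (in α and β)

• Hence, the point estimates in the joint estimation are identical to those of the separate estimations

• If the point estimates are identical, then so are the residuals; and if the residuals are identical, then so are the sums of squared residuals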

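• As to the standard errors, in the joint model we estimate

$$\hat{\sigma}^2 = S_{\hat{u}\hat{u}} / (T - 4)$$

while in the separate estimations we estimate

$$\hat{\sigma}_0^2 = S^0_{\hat{u}\hat{u}} / (T_1 - 2), \qquad \hat{\sigma}_1^2 = S^1_{\hat{u}\hat{u}} / ((T - T_1) - 2)$$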
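The equivalence derived above can be verified numerically. The following sketch is not the course script wages.R: the data are hypothetical, and `solve` and `ols` are small helper functions written for this illustration (normal equations solved by Gaussian elimination). It compares the joint estimation with regressors (1, xt, Dt, Dt·xt) to the two separate estimations.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def ols(rows, y):
    """OLS via the normal equations X'X b = X'y; returns (coefficients, SSR)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    b = solve(xtx, xty)
    ssr = sum((yt - sum(bi * ri for bi, ri in zip(b, r))) ** 2
              for r, yt in zip(rows, y))
    return b, ssr

# Hypothetical data: first T1 = 5 observations have D = 0, the rest D = 1.
x0, y0 = [1, 2, 3, 4, 5], [2.1, 2.9, 4.2, 4.8, 6.1]
x1, y1 = [1, 2, 3, 4, 5, 6], [3.0, 4.6, 5.9, 7.5, 9.1, 10.4]
t1, t = len(x0), len(x0) + len(x1)

# Joint estimation, coefficient order (alpha, beta, gamma, delta).
rows = [[1.0, x, 0.0, 0.0] for x in x0] + [[1.0, x, 1.0, x] for x in x1]
(alpha, beta, gamma, delta), ssr_joint = ols(rows, y0 + y1)

# Separate estimations.
(a0, b0), ssr0 = ols([[1.0, x] for x in x0], y0)
(a1, b1), ssr1 = ols([[1.0, x] for x in x1], y1)

# Estimated disturbance variances with their different degrees of freedom.
sigma2_joint = ssr_joint / (t - 4)
sigma2_0 = ssr0 / (t1 - 2)
sigma2_1 = ssr1 / ((t - t1) - 2)

print("joint:   ", alpha, beta, alpha + gamma, beta + delta, ssr_joint)
print("separate:", a0, b0, a1, b1, ssr0 + ssr1)
print("variance estimates:", sigma2_joint, sigma2_0, sigma2_1)
```

The joint variance estimate pools the residuals of both groups, so it lies between the two separate estimates. That is why the standard errors differ although the point estimates coincide.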

Remarks

• What happens if the dummy variables are not 0/1-coded but 1/2-coded?

• Consider the model

yt = α+ γD1t + δD2t + βxt + ut

where

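$$D_{1t} = \begin{cases} 0 & \text{for males} \\ 1 & \text{for females} \end{cases} \qquad D_{2t} = \begin{cases} 0 & \text{for German citizenship} \\ 1 & \text{else} \end{cases}$$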

• Interaction terms
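A short worked note on the first and the last remark may help. The tilde symbols and the coefficient θ are not part of the slides and are introduced here only for illustration. For 1/2-coding, write the recoded dummy as D̃t = Dt + 1 and substitute; for the interaction term, add the product D1t·D2t to the model with two dummies.

```latex
% 1/2-coding: \tilde{D}_t = D_t + 1 (1 for males, 2 for females)
\begin{align*}
y_t &= \tilde{\alpha} + \tilde{\gamma}\,\tilde{D}_t + \beta x_t + u_t \\
    &= (\tilde{\alpha} + \tilde{\gamma}) + \tilde{\gamma}\, D_t + \beta x_t + u_t
\end{align*}
% Comparing with y_t = \alpha + \gamma D_t + \beta x_t + u_t:
%   \tilde{\gamma} = \gamma,  \tilde{\alpha} = \alpha - \gamma

% Interaction of the two dummies (theta is a hypothetical extra coefficient)
\begin{align*}
y_t &= \alpha + \gamma D_{1t} + \delta D_{2t} + \theta D_{1t} D_{2t} + \beta x_t + u_t \\
\text{intercept} &=
\begin{cases}
\alpha & \text{German males} \\
\alpha + \gamma & \text{German females} \\
\alpha + \delta & \text{non-German males} \\
\alpha + \gamma + \delta + \theta & \text{non-German females}
\end{cases}
\end{align*}
```

Under 1/2-coding the dummy coefficient keeps its meaning as the difference between the groups, but the constant no longer equals the intercept of the group coded 1. Without the interaction term (θ = 0), the gender effect γ is restricted to be the same for both citizenship groups.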
