
1 MADE

Why do we need econometrics?

• If there are only two points, which relation describes them?

[Figure: two points plotted against the X and Y axes]

2 MADE


• But what if there are more than two points for the two variables?

3 MADE


• How would we look for this line?

MINIMISING THE RESIDUALS!!!!
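As a minimal sketch of what "minimising the residuals" means in practice, the snippet below fits a line through five made-up points using NumPy's least-squares solver. The data values and variable names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Hypothetical data: five (x, y) points that do not lie on a single line.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# Least squares picks the intercept and slope that minimise
# the sum of squared residuals.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

residuals = y - X @ coef
rss = float(residuals @ residuals)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, RSS={rss:.4f}")
```

Any other line through these points gives a larger sum of squared residuals, which is the sense in which the fitted line is the "best" one.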

4 MADE

What is an econometric model?

Some things about reality are known…
– GDP per capita
– capital accumulation
– volume of trade

… but the relations between them are unknown
– correlation
– causality
We need a tool to seek the latter using the former.

Costs? We need to simplify reality.

5 MADE

An example of a model

• Suppose you wanted to measure the degree of gender discrimination in wages.

• Your model: wages = f(gender and ???)
– education
– experience
– profession
– city/rural area
– …

• We cannot consider everything because:
– no data
– model quality => STATISTICS

6 MADE

Random versus deterministic

• What is a variable?

• What is a random variable?
– example: height of all the people in this room

• Can you ever get a deterministic number from a random one?

• What is EXPECTED VALUE?
– for a deterministic variable
– for a random variable

7 MADE

Are the residuals from this graph random or deterministic?

8 MADE

An example of a model revisited

• Let’s go back to the example of gender discrimination:

• We said the model looked like this: wages = f(gender and ???)

• But now we know that in fact:
wages = constant + coeff*education + coeff*experience + coeff*gender + coeff*(whatever else we think of) + residuals

• We don’t know the coefficients => we seek a method to find them!!!

• Residuals depend on how we choose the coefficients and are unknown (random)
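To make the linear form concrete, here is a small sketch that simulates wages from made-up coefficients and then recovers them from the data. Every number below (sample size, coefficients, variable ranges) is a hypothetical assumption chosen for illustration; nothing here is real wage data.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Simulated explanatory variables (all hypothetical).
education = rng.integers(8, 21, size=n).astype(float)   # years of schooling
experience = rng.integers(0, 41, size=n).astype(float)  # years of work
female = rng.integers(0, 2, size=n).astype(float)       # 1 = woman, 0 = man

# wages = constant + coeff*education + coeff*experience + coeff*gender + residuals
true_coeffs = np.array([5.0, 1.2, 0.3, -2.0])
X = np.column_stack([np.ones(n), education, experience, female])
wages = X @ true_coeffs + rng.normal(scale=1.0, size=n)

# Estimate the coefficients from the data alone.
estimated, *_ = np.linalg.lstsq(X, wages, rcond=None)
print(estimated)
```

In this simulation the coefficient on the gender dummy plays the role of the "degree of discrimination" once education and experience are held fixed.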

9 MADE

Finding a method

• We want to minimise our “error”:

or

10 MADE

Finding a method

• We can write each of the elements as:

11 MADE

Finding a method

• What we have is:
– X – a matrix of exogenous (input) variables (“knowns”)
– y – a vector of the endogenous (but still input) variable (we think we know the results of the random process)
– ɛ – unknown residuals (errors) that can only be estimated using the residuals from the model
– β – unknown parameters that we want to estimate (output)

• What we need is:
– a model that will let us know the β’s, with the ɛ’s as small as possible

12 MADE

Finding a method

• Let’s define:

• Where:

» ŷ is a theoretical, fitted value of y
» e’s are only estimates of ɛ’s, and do not have to be equal to them
» b’s are only estimates of β’s, but are chosen such that y and ŷ are as close as possible
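The defining equation on this slide did not survive. Assuming the standard OLS notation used in the rest of the deck (b for the estimated parameters, e for the estimated residuals), it is presumably of the form:

```latex
\[
  y = Xb + e, \qquad \hat{y} = Xb, \qquad e = y - \hat{y}
\]
```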

13 MADE

Finding a method

• We find the method for estimation by minimising the residuals, but:
– There are a lot of them
– They can be very big (positive and negative) and still add up to zero
=> we need to take squares (distances) and not direct values
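A tiny numerical illustration of the point above, using made-up residuals: large positive and negative values cancel in a plain sum, while the sum of squares shows how poor the fit really is.

```python
import numpy as np

# Hypothetical residuals from a badly fitting line: large, but cancelling out.
e = np.array([10.0, -10.0, 7.5, -7.5])

plain_sum = e.sum()               # positive and negative values cancel
sum_of_squares = (e ** 2).sum()   # squaring removes the sign

print(plain_sum, sum_of_squares)
```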

14 MADE

Finding a method

• We look for the first order conditions for:

• So we differentiate and put equal to zero:
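The formulas for this step are missing from the extracted slides. A sketch of the usual derivation, assuming the objective is the sum of squared residuals written in matrix form with e = y - Xb, would be:

```latex
\[
  S(b) = e'e = (y - Xb)'(y - Xb) = y'y - 2\,b'X'y + b'X'Xb
\]
\[
  \frac{\partial S}{\partial b} = -2\,X'y + 2\,X'Xb = 0
  \quad\Longrightarrow\quad
  X'Xb = X'y
\]
```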

15 MADE

Finding a method

• When it comes to matrices, multiplication is no longer as straightforward (the order matters and you can’t divide)

• What you can do is pre-multiply by an inverted matrix

• In order for a matrix to be invertible, it has to be nonsingular (no row and no column is a linear combination of the others)

• X’X is a square matrix and meets this condition whenever the columns of X are linearly independent
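The pre-multiplication step can be sketched numerically as b = (X’X)^(-1) X’y. The example below uses simulated data (all values are assumptions) and also shows that adding a perfectly collinear column makes X lose full column rank, in which case X’X can no longer be inverted.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta = np.array([1.0, 2.0, -0.5])  # hypothetical "true" parameters
y = X @ beta + rng.normal(scale=0.1, size=n)

XtX = X.T @ X
# X'X is invertible only when the columns of X are linearly independent.
full_rank = np.linalg.matrix_rank(X) == X.shape[1]

# Pre-multiply X'y by the inverse of X'X.
b = np.linalg.inv(XtX) @ (X.T @ y)

# A column that is an exact multiple of another one breaks invertibility.
X_bad = np.column_stack([X, 2.0 * X[:, 1]])
bad_rank = np.linalg.matrix_rank(X_bad)

print(b, full_rank, bad_rank)
```

In numerical work `np.linalg.solve` or `np.linalg.lstsq` is usually preferred over an explicit inverse; the inverse is used here only to mirror the algebra.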

16 MADE

Finding a method

• We have an optimum, but we don’t know if it’s a max or a min => we need to find the second derivative and show it is positive to be sure we have a minimum (so the residuals are as small as possible)

• It is positive, so we have found what we were looking for
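In matrix terms "positive" means positive definite. A sketch of the argument, assuming the sum-of-squares objective S(b) = (y - Xb)'(y - Xb) and linearly independent columns of X:

```latex
\[
  \frac{\partial^2 S}{\partial b\,\partial b'} = 2\,X'X,
  \qquad
  z'X'Xz = (Xz)'(Xz) = \lVert Xz \rVert^2 > 0 \quad \text{for every } z \neq 0
\]
```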

17 MADE

Properties of OLS

1. X’e=0
2. Fitted and actual values of y are on average equal
3. Σe=0 (for a model with a constant)
4. There is nothing more systematic about y than already explained by X (fitted y and residuals are not correlated)
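The four properties can be checked numerically. The sketch below uses simulated data (the coefficients and sample size are assumptions) and a model with a constant; each printed quantity should be zero up to floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])            # model with a constant
y = 3.0 + 1.5 * x1 - 2.0 * x2 + rng.normal(size=n)   # hypothetical data

b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b   # fitted values
e = y - y_hat   # residuals

print("1. X'e                 :", X.T @ e)
print("2. mean(y) - mean(yhat):", y.mean() - y_hat.mean())
print("3. sum(e)              :", e.sum())
print("4. cov(yhat, e)        :", np.cov(y_hat, e)[0, 1])
```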

18 MADE

Properties of OLS

• If a model has a constant…

• … and then

19 MADE

Is OLS the best?

• Can we be sure that OLS will always give us the best possible estimator?

• If assumptions are fulfilled, OLS is BLUE (meaning Best Linear Unbiased Estimator)

• Assumptions:
1. y=Xβ+ɛ (the model is linear)
2. X is deterministic and exogenous
3. E(ɛi)=0
4. Cov(ɛi,ɛj)=0 for i≠j
5. Var(ɛi)=σ²

• What do we lose by restricting ourselves to linear and unbiased estimators?

20 MADE

Variance-covariance matrix

21 MADE

What do we know about OLS properties?

• It is unbiased:
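The formula on this slide is missing. The usual argument, assuming the linear model y = Xβ + ɛ, a deterministic X and E(ɛ) = 0, runs as follows:

```latex
\[
  b = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + \varepsilon)
    = \beta + (X'X)^{-1}X'\varepsilon
\]
\[
  E(b) = \beta + (X'X)^{-1}X'\,E(\varepsilon) = \beta
\]
```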

22 MADE

What do we know about OLS properties?

• The variance of the parameters is given by:

so we only need to find an estimator of σ², but:

so…

23 MADE

What do we know about OLS properties?

…
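The formulas on these two slides did not survive. Under the assumptions listed on the “Is OLS the best?” slide (zero-mean, uncorrelated residuals with common variance σ²), the standard result is sketched below; n (number of observations) and k (number of estimated parameters) are notation introduced here, not taken from the slides.

```latex
\[
  \operatorname{Var}(b) = E\big[(b-\beta)(b-\beta)'\big]
  = (X'X)^{-1}X'\,E(\varepsilon\varepsilon')\,X(X'X)^{-1}
  = \sigma^2 (X'X)^{-1}
\]
\[
  E(e'e) = \sigma^2 (n - k)
  \quad\Longrightarrow\quad
  s^2 = \frac{e'e}{n-k}, \qquad \widehat{\operatorname{Var}}(b) = s^2 (X'X)^{-1}
\]
```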

24 MADE

Why do we need the properties?

• How can we say that a model is good?
– We only know that, among linear and unbiased estimators, we have the estimators of β that yield the lowest errors

• How can we say if one model is better than another?
– So far we didn’t ask this question at all!

• How can we say AT ALL if a variable really is correlated with another?
– So far we only considered setting up a model, but in reality this is an implicit hypothesis and needs to be tested!

25 MADE

How good is our model?

• We can ask how big the residuals are compared with the input values

TSS = ESS + RSS (total = explained + residual sum of squares; holds for a model with a constant)
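A short sketch of the decomposition on simulated data (the data-generating values are assumptions). It computes the three sums of squares and the share of variation explained by the model, usually called R².

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])      # model with a constant
y = 1.0 + 0.8 * x + rng.normal(size=n)    # hypothetical data

b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b
e = y - y_hat

tss = ((y - y.mean()) ** 2).sum()       # total sum of squares
ess = ((y_hat - y.mean()) ** 2).sum()   # explained sum of squares
rss = (e ** 2).sum()                    # residual sum of squares
r_squared = ess / tss

print(tss, ess + rss, r_squared)
```

Without a constant in X the equality TSS = ESS + RSS generally fails, which is why the slide adds the qualifier.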

26 MADE

How good are our estimates?

• We can test the values we have obtained against the hypothesis that they are zero
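One common way to run such a test is the t-ratio: each estimate divided by its estimated standard error. The sketch below is an illustration on simulated data, assuming the variance estimator s²(X’X)^(-1); the true parameters, including the zero on the last regressor, are made up.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
# Hypothetical truth: the last regressor has no effect on y.
beta = np.array([1.0, 2.0, 0.0])
y = X @ beta + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                  # OLS estimates
e = y - X @ b                          # residuals
s2 = (e @ e) / (n - k)                 # estimate of the residual variance
se = np.sqrt(s2 * np.diag(XtX_inv))    # standard errors of the estimates
t_stats = b / se                       # t-ratios for H0: coefficient = 0

print(t_stats)
```

If the residuals are additionally assumed to be normally distributed, each ratio can be compared with critical values of the Student t distribution with n - k degrees of freedom.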

27 MADE

Preview of coming attractions

• Hypothesis testing

• Understanding the output of any statistical package (or tables in papers you have to read)

• Interpretation

• Prognosis
