BUEC 333: An Introduction to Introduction to Econometrics Krishna Pendakur, Economics, SFU

What is Econometrics?

• Econometrics is the application of economic theory and statistical methods to analyze economic data.
• It's both a science and an art.
– There's always a disconnect between economic theory and the real world. Creatively bridging this gap to apply economic theory to real-world data is an art.
– Economic data are rarely "perfect" (experimental), so it sometimes takes some finesse to get believable results out of them.
– Economic theory and statistics are grounded in mathematics. Direct application of either is scientific.

Econometric Questions

• Magnitudes and Tests
• Does reducing class size improve the quality of education?
– Common sense tells us that students get more attention in smaller classes, and therefore probably learn more. But how much more? Reducing class size costs money (need more teachers). Does the improved quality of education justify the expense? Here, the magnitude of improvement matters.
• What will be the value of the S&P 500 one year from today?
– Knowing (with certainty) the answer to this question would be worth a lot.
– A good (accurate) estimate is also valuable.
• Do Immigrants Face Discrimination in the Labour Market?
– This is a yes/no question. We can use statistics to test whether or not they face disparity.

So What?

• Knowing some basic econometrics can be very lucrative.
– Lots of jobs, some pay very well.
• Knowing your way around some econometric software can too.
• Useful for future study (4th year courses: you'll see/read quite a bit of empirical work, and maybe do some).
• Even if you end up doing something completely different, it's useful for understanding the world around you.
– Managers need to understand where estimates come from and what they tell you (and what they don't!)
– Helpful for understanding real-world statistics (polls, press about empirical studies, "facts" and arguments, etc.)
– Helpful for understanding finance (alphas, betas, R-squared, etc.)
• Good Policy Uses Econometric Analysis
– Magnitudes Matter
– Causation Matters

BUEC 333 Deliverables…

• In this class, you will learn the statistical methods you would use to answer these questions (and many more ...)

• You will learn how to assess whether a particular estimate/analysis is any good.

• You will learn to use a specific statistical software package (EViews)

• You will get some practice working with real data.


Today

• Correlation

• Magnitudes Matter

• Statistical Precision Matters

• One Dimension Isn’t Enough

• Regression

• Causation

• Instruments, Natural Experiments and Field Experiments

Some Data

• Long Example: Do immigrants earn less?
– How? Why?
• 2006 Census public-use microdata, a sample drawn from Census long forms.
• 3% sample of population of BC, about 100,000 observations of individuals
• Earnings from Wages and Salaries (i.e., working for other people) in 2005

What is Microdata? (Excel)

[Excel screenshot: sample rows of numeric codes under the variable names ABOID, AGEGRP, AGEIMM, ATTSCH, BFNMEMB, CFINC, CFINC_AT, CFINEF, CFSIZE, CFSTAT. The individual cell values are not reproduced here.]

What is Microdata?

• But, there are 56,530 rows and 124 columns

• A bit cumbersome

• Need to summarize this information

– Cross-Tabs

– Tables of Means
– Regressions

Correlation

• When one number is high, do we expect another number to be high (or low)?
• When a person is an immigrant, are their earnings lower?
• Need to map out (at least) immigration status and earnings

The Cross-Tab

Cross-Tabs show the numbers of observations (rows of the data) that have values in different ranges for different variables. They show how the data are clumped, and how the data vary with each other.

Counts by immigrant status (rows) and WAGES group (columns):

Immigrant status               Zero     1-$20k  $20K-$50k      $50k+      Total
Temporary Residents             527        239        115         72        953
Canadian-born                 7,289      6,212      6,520      4,881     24,902
Foreign-born Perm Res         8,814      5,132      4,878      2,355     21,179
Total                        16,630     11,583     11,513      7,308     47,034

• It seems crazy to talk about the earnings of children, the very elderly, and people who arrived after the 2005 income year
• So, drop
– persons under 25
– persons over 69
– immigrants who arrived in 2005 or 2006

Counts by immigrant status (rows) and WAGES group (columns):

Immigrant status               Zero     1-$20k  $20K-$50k      $50k+      Total
Temporary Residents             274        162        103         69        608
Canadian-born                 3,962      3,125      5,823      4,820     17,730
Foreign-born Perm Res         5,206      3,608      4,573      2,304     15,691
Total                         9,442      6,895     10,499      7,193     34,029

Cross-Tabs with Column %s

Immigrant status               Zero     1-$20k  $20K-$50k      $50k+      Total
Temporary Residents            2.90       2.35       0.98       0.96       1.79
Canadian-born                 41.96      45.32      55.46      67.01      52.10
Foreign-born Perm Res         55.14      52.33      43.56      32.03      46.11
Total                        100.00     100.00     100.00     100.00     100.00

Cross-Tabs with Row %s

Immigrant status               Zero     1-$20k  $20K-$50k      $50k+      Total
Temporary Residents           45.07      26.64      16.94      11.35     100.00
Canadian-born                 22.35      17.63      32.84      27.19     100.00
Foreign-born Perm Res         33.18      22.99      29.14      14.68     100.00
Total                         27.75      20.26      30.85      21.14     100.00
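
The row and column percentages can be recomputed from the counts. The following is a minimal Python sketch, assuming the counts from the restricted sample (ages 25-69, recent arrivals dropped); the function names are illustrative and are not part of EViews or any course software.

```python
# Counts by immigrant status and WAGES group, taken from the restricted-sample cross-tab.
COLUMNS = ["Zero", "1-$20k", "$20K-$50k", "$50k+"]
counts = {
    "Temporary Residents": [274, 162, 103, 69],
    "Canadian-born": [3962, 3125, 5823, 4820],
    "Foreign-born Perm Res": [5206, 3608, 4573, 2304],
}


def row_percents(table):
    """Share of each row that falls in each earnings band (each row sums to 100)."""
    return {name: [100 * c / sum(row) for c in row] for name, row in table.items()}


def column_percents(table):
    """Share of each earnings band that belongs to each row (each column sums to 100)."""
    col_totals = [sum(col) for col in zip(*table.values())]
    return {
        name: [100 * c / total for c, total in zip(row, col_totals)]
        for name, row in table.items()
    }


row_pct = row_percents(counts)
col_pct = column_percents(counts)

for name in counts:
    print(name, [round(p, 2) for p in row_pct[name]])
```

Row percentages answer "of the Canadian-born, what share earn $50k+?", while column percentages answer "of those earning $50k+, what share are Canadian-born?".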


Do Immigrants Earn Less?

• Temporary residents have far more zero-earners, and foreign-born permanent residents have somewhat more zero-earners, than the Canadian-born.
• Foreign-born permanent residents include more low-earners and fewer high-earners than the Canadian-born. The pattern is even stronger for temporary residents.

Integration Takes Time

• Immigrants are heterogeneous

• Some arrived long ago, some very recently

• Can we account for this in a Cross-Tab?


Do Immigrants Earn Less?

Row percentages by immigrant cohort (rows) and WAGES group (columns):

Immigrant Cohort       Zero     1-$20k  $20K-$50k      $50k+      Total
Before 1950           37.88      19.70      19.70      22.73     100.00
1950s                 40.84      16.40      20.77      21.99     100.00
1960s                 40.87      14.92      24.44      19.76     100.00
1970s                 29.79      16.16      29.94      24.11     100.00
1980s                 29.51      19.50      33.49      17.51     100.00
1990s                 34.77      25.92      29.23      10.07     100.00
2000s                 33.66      32.07      27.28       6.98     100.00
Cdn-Born              22.35      17.63      32.84      27.19     100.00
Temporary             45.07      26.64      16.94      11.35     100.00
Total                 27.80      20.11      30.81      21.28     100.00

Table of Means

• Only 7% of recent immigrants have high earnings, compared with 27% of Cdn-born.

• But, 24% of immigrants from the 1970s have high earnings.

• This is getting clunky: how else can we summarize these data?

• A Table of Means gives the average value of some variable for values or ranges of another variable


Table of Means

Immigrant Cohort     N(WAGES)  mean(WAGES)
Before 1950                66        27561
1950s                     573        30682
1960s                   1,260        31760
1970s                   2,679        33365
1980s                   2,416        27478
1990s                   5,412        20283
2000s                   2,463        17103
Cdn-Born               17,730        38317
Temporary                 608        21737

Table of Means

• Those 2000s arrivals have it bad: roughly $21,000 lower earnings than the Canadian-born (17,103 versus 38,317).
• 1990s arrivals have similarly low earnings.
• But, 1970s arrivals have earnings only about $5,000 lower than the Canadian-born.

Variation

• Used 3% of Greater Vancouver households.
• The sample mean is not the true mean: it is an estimate of the mean.
• There’s dispersion around the mean. Lots of people have income far from the mean.
• Standard Deviation (SD) characterises how far people are from the mean
– SD = the square root of the average squared distance from the mean

Table of Means and SDs

Immigrant Cohort     N(WAGES)  mean(WAGES)    sd(WAGES)
Before 1950                66        27561        38380
1950s                     573        30682        58677
1960s                   1,260        31760        65581
1970s                   2,679        33365        49297
1980s                   2,416        27478        34858
1990s                   5,412        20283        31889
2000s                   2,463        17103        24848
Cdn-Born               17,730        38317        56307
Temporary                 608        21737        57811

• Immigrants arriving 1961-1970 and 1971-1980 earn almost as much as Canadian-born people
• Those arriving after seem to earn less and less
• But, are these differences statistically significant? That is, are we sure that they are there?

Precision

• The precision of an estimated mean depends on

– How much data you have (N)

– How much dispersion there is (sd)

• Standard Error (SE)

– equals the standard deviation divided by the square root of N: SE = SD/√N

• 95% Confidence Interval (CI)

– gives a range such that we’re 95% sure that the true mean lies inside. These are roughly 4 SEs wide (2 above, and 2 below, the estimated value)
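
As a worked illustration of the SE and confidence-interval rules, here is a small Python sketch. It assumes the N, mean and SD reported for the Canadian-born and 1970s groups in the tables of means, and it uses the "2 SEs either side" rule of thumb rather than an exact critical value; the function names are illustrative.

```python
import math


def standard_error(sd, n):
    """Standard error of a sample mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)


def confidence_interval_95(mean, se):
    """Approximate 95% interval: 2 standard errors below and above the estimate."""
    return (mean - 2 * se, mean + 2 * se)


# Canadian-born: N = 17,730, mean = 38,317, SD = 56,307
se_cdn = standard_error(56307, 17730)
ci_cdn = confidence_interval_95(38317, se_cdn)

# 1970s arrivals: N = 2,679, mean = 33,365, SD = 49,297
se_1970s = standard_error(49297, 2679)
ci_1970s = confidence_interval_95(33365, se_1970s)

print(round(se_cdn), [round(x) for x in ci_cdn])
print(round(se_1970s), [round(x) for x in ci_1970s])
```

The 1970s group has a smaller SD than the Canadian-born group but far fewer observations, so its mean is estimated less precisely.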


Table of Means, SDs and SEs

Immigrant Cohort     N(WAGES)  mean(WAGES)    sd(WAGES)   sem(WAGES)
Before 1950                66        27561        38380         4724
1950s                     573        30682        58677         2451
1960s                   1,260        31760        65581         1848
1970s                   2,679        33365        49297          952
1980s                   2,416        27478        34858          709
1990s                   5,412        20283        31889          433
2000s                   2,463        17103        24848          501
Cdn-Born               17,730        38317        56307          423
Temporary                 608        21737        57811         2345

Another Way To Present the Means and Standard Errors

[Chart: Earnings by Immigrant Cohort. Vertical axis: Mean Earnings, 15,000 to 40,000. Horizontal axis: Immigrant Cohort (Before 1950, 1950s, 1960s, 1970s, 1980s, 1990s, 2000s, Cdn-Born, Temporary).]

So, Do Immigrants Earn Less?

• The 95% confidence interval for the mean earnings of a Canadian-born person aged 25-69 is entirely above that of the mean earnings of every immigrant cohort.

• A test asks how likely the observed data would be if some hypothesized relationship were true.

• Can we test whether or not immigrants earn less? If immigrants earned the same as the Canadian-born, gaps this large would be highly improbable. So, we reject the hypothesis that they earn the same.
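
One way to make this test concrete is a z statistic for the difference between two group means. The sketch below is an illustration only: it assumes the two groups are independent samples, takes the means and standard errors for the Canadian-born and 1970s cohorts from the table of means, and uses the rough rule that a difference more than 2 standard errors from zero is statistically significant.

```python
import math


def difference_z(mean_a, se_a, mean_b, se_b):
    """Return (difference, SE of the difference, z) for mean_a - mean_b.

    Assumes the two means come from independent samples, so their
    variances (squared standard errors) add.
    """
    diff = mean_a - mean_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se_diff, diff / se_diff


# Canadian-born (mean 38,317, SE 423) versus 1970s arrivals (mean 33,365, SE 952)
diff, se_diff, z = difference_z(38317, 423, 33365, 952)
reject_equal_means = abs(z) > 2

print(diff, round(se_diff), round(z, 2), reject_equal_means)
```

Even for the 1970s cohort, one of the best-paid immigrant groups in the table, the gap is several standard errors from zero, so the hypothesis of equal mean earnings is rejected.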


Age and Arrival Cohort

• Immigrants arriving before 1961 and after 1980 earn less than Canadian-born workers.
• But, those arriving before 1961 are old.
• We know old people have low earnings, because they quit working.
• Those arriving between 1961 and 1980 are in their prime earning years.
• Is this all about age?

Immigrants Age and Earnings

Mean earnings by immigrant cohort (rows) and age group (columns); "-" marks cells with no observations:

Immigrant Cohort    25-34    35-44    45-54    55-64    65-69
Before 1950             -        -        -    31765    13267
1950s                   -        -    49199    36359     7537
1960s                   -    53708    42967    28865    10187
1970s               37704    41336    33789    33291     6876
1980s               31648    31057    30124    19169     4008
1990s               20317    25139    22777    10365     2117
2000s               18793    19564    15538     8810     2896
Cdn-Born            31552    43511    47176    35218    10337
Temporary           16820    26950    38551    13000     1333

Do Immigrants Earn Less?

• Now that we consider people who are the same age, the disparities look smaller.
• Immigrants aged 25-34 who arrived in the 1970s and 1980s have average earnings higher than their Canadian-born counterparts.
• But, a lot of age-arrival year groups earn less.

Controlling for Age

• When we control for a variable x, we ask “what is the difference in y for a given value of x?”
• When you look within age categories, there’s less disparity.
• Thus, controlling for age reduces the amount of disparity we see between immigrants and Canadian-born people
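
"The difference in y for a given value of x" can be computed directly from the age-by-cohort table of means. This is a minimal Python sketch, assuming the mean earnings for the Canadian-born and the 1970s cohort from that table; the variable and function names are illustrative.

```python
AGE_GROUPS = ["25-34", "35-44", "45-54", "55-64", "65-69"]

# Mean earnings by age group, from the age-by-cohort table of means.
cdn_born = [31552, 43511, 47176, 35218, 10337]
arrived_1970s = [37704, 41336, 33789, 33291, 6876]


def gaps_by_group(reference, comparison, labels):
    """Reference mean minus comparison mean, separately within each group."""
    return {
        label: ref - comp
        for label, ref, comp in zip(labels, reference, comparison)
    }


# Gap within each age group (positive means the Canadian-born earn more).
within_age_gap = gaps_by_group(cdn_born, arrived_1970s, AGE_GROUPS)

# Gap ignoring age, from the overall table of means.
overall_gap = 38317 - 33365

print(overall_gap)
print(within_age_gap)
```

The within-age gaps differ a lot from one age group to the next, which is one reason a single number that controls for age (as a regression provides) is easier to work with than a growing set of tables.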


Controlling for Education

• Immigrants are more educated than Canadian-born people. Shouldn’t they earn more?

Row percentages by immigrant cohort (rows) and highest level of schooling (columns):

Immigrant Cohort     LT HS        HS   Some PS    BA/BSc   Post BA     Total
Before 1950          13.64     16.67     36.36      1.52     31.82    100.00
1950s                16.75     27.92     30.89      4.89     19.55    100.00
1960s                13.67     24.09     33.07      6.36     22.81    100.00
1970s                13.14     22.10     30.20      7.61     26.95    100.00
1980s                17.00     23.91     24.70      8.81     25.57    100.00
1990s                15.22     23.35     19.93      9.29     32.20    100.00
2000s                10.68     17.63     13.89      9.42     48.38    100.00
Cdn-Born              9.21     26.27     32.47      5.36     26.69    100.00
Temporary             4.94     14.99     16.97      8.90     54.20    100.00
Total                11.41     24.37     28.02      6.82     29.38    100.00

Mean Earnings by Education

Mean earnings by immigrant cohort (rows) and highest level of schooling (columns):

Immigrant Cohort     LT HS        HS   Some PS    BA/BSc   Post BA
Before 1950          10222     11000     43167     12000     26571
1950s                17343     26142     29176     34143     49405
1960s                13041     26711     34918     29211     44817
1970s                15679     24290     34611     32363     48300
1980s                11915     20463     29960     29356     41335
1990s                 9438     14238     21435     20271     29106
2000s                 8502     12737     15944     13242     21692
Cdn-Born             20218     33702     35447     40512     52129
Temporary            11533     13571     18226     12037     27577

Controlling for Education

• Unlike when we controlled for age, when we control for education, the disparity gets bigger.
• So, is the disparity big or small? Should we control for age or education?
• You could make an even bigger table of means, or…we could run a regression.

Regression Analysis

• Tool to look at differences along many dimensions (i.e., along age, education and immigration cohort all at the same time)
• Modeling Assumption: the effects of all these things can be summarised by adding up a bunch of individual effects.
• Then, assessing any one of them is easy: its effect on the total is isolated.

Analysing Regressions

• Cdn-Born, HS, aged 35-44 has estimated earnings of $29,168 (top number)
• With a PhD (but still Cdn-Born aged 35-44), it would be $23,459 higher, equalling $52,627
• So, you can isolate the effect of having a PhD
• You can also isolate the effect of immigration arrival year
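
The "adding up individual effects" idea can be sketched in a few lines of Python. The baseline and the PhD effect below are the two numbers quoted on this slide; the dictionary layout and function name are illustrative assumptions, not how any particular regression package stores its output.

```python
# Estimated earnings for the reference person (Cdn-born, HS, aged 35-44), from the slide.
BASELINE = 29168

# Additive effects relative to the reference person. Only the PhD effect is
# quoted on the slide; other effects would be added to this dictionary.
EFFECTS = {"PhD": 23459}


def predicted_earnings(baseline, effects, traits):
    """Baseline plus the sum of the effects for each trait that applies."""
    return baseline + sum(effects[trait] for trait in traits)


reference_person = predicted_earnings(BASELINE, EFFECTS, [])
with_phd = predicted_earnings(BASELINE, EFFECTS, ["PhD"])

print(reference_person, with_phd)
```

Because the model is additive, the effect of any one trait is the same regardless of which other traits a person has, which is what makes it easy to isolate.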


Regression Estimates

Variable              Estimate   Std Err
Cdn-born 35-44 HS        37629       742
Before 1950              -3662      5999
1950s                      952      2101
1960s                    -1538      1446
1970s                    -4382      1020
1980s                   -10952      1057
1990s                   -18928       760
2000s                   -25217      1059
Temporary               -20328      2017
25-34                    -9657       766
45-54                     1692       737
55-64                    -7878       827
65-69                   -27324      1248
LT HS                    -7989       965
Some PS                   3813       738
BA/BSc                    5838      1157
Post BA                  17300       736

Part of a Regression

Variable                   Est        SE
Cdn-born 35-44 HS        37629       742
Before 1950              -3662      5999
1950s                      952      2101
1960s                    -1538      1446
1970s                    -4382      1020
1980s                   -10952      1057
1990s                   -18928       760
2000s                   -25217      1059
Temporary               -20328      2017

Regressions are cool because you can look at part of a regression.

Consider the part with the immigrant cohort effects.

Holding age and education constant, if you compare a Canadian-born person with a Temporary resident, the latter earns $20,328 less.

Immigrants from the 1960s earn only $1538 less, and the confidence interval for that is about [-1500, 4500]: it could be zero dollars less!
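
The interval quoted for the 1960s cohort follows from the estimate and its standard error. Here is a small Python sketch, assuming the "estimate plus or minus 2 standard errors" rule of thumb and the coefficients from the cohort part of the regression; the function names are illustrative.

```python
def coef_interval(estimate, std_err, width=2):
    """Approximate 95% interval: the estimate plus or minus `width` standard errors."""
    return (estimate - width * std_err, estimate + width * std_err)


def includes_zero(interval):
    """True when zero lies inside the interval, i.e. no effect cannot be ruled out."""
    low, high = interval
    return low <= 0 <= high


ci_1960s = coef_interval(-1538, 1446)        # 1960s cohort
ci_temporary = coef_interval(-20328, 2017)   # temporary residents

print(ci_1960s, includes_zero(ci_1960s))
print(ci_temporary, includes_zero(ci_temporary))
```

The 1960s interval runs from about -4,400 to about +1,400. Expressed as a shortfall, that is roughly the [-1500, 4500] range on the slide, and it includes zero. The temporary-resident interval lies entirely below zero.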


Multivariate Analysis

• With tables of proportions or frequencies (i.e., Cross-tabs), you can see concentration.
– E.g., concentration of immigrants at the low end of the income distribution
• You can also see that they are concentrated in the middle of the age distribution and the top of the education distribution
• Thus, controlling for these things is essential.

Multivariate Analysis

• Regressions allow you to control for lots of things at once.
• One-way analysis on the basis of age might tell you one thing (e.g., earnings seem okay)
• One-way analysis on the basis of education might tell you another (e.g., earnings are low)
• Multivariate analysis can resolve this by controlling for both at the same time.

R-squared

• The regression model has just three things: year of arrival, age and education.
• These three things ‘explain’ 7.5% of the variation in earnings from wages and salaries.
– Explain, as in are correlated with
– Not explain, as in causes
• What explains the other 92.5%?
– Things that explain earnings that are correlated with arrival, age and education
– Things that explain earnings that are not correlated with arrival, age and education
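
To make "share of variation explained" concrete, here is a toy Python sketch of R-squared. The four earnings values and the two groups are hypothetical numbers chosen for illustration, not census data. The fitted values are group means, which is what a regression on a single group indicator produces.

```python
def group_mean_fit(groups, y):
    """Fitted values from regressing y on group indicators: each group's mean."""
    by_group = {}
    for g, value in zip(groups, y):
        by_group.setdefault(g, []).append(value)
    means = {g: sum(values) / len(values) for g, values in by_group.items()}
    return [means[g] for g in groups]


def r_squared(actual, fitted):
    """1 minus (unexplained variation / total variation around the mean)."""
    mean_y = sum(actual) / len(actual)
    total = sum((y - mean_y) ** 2 for y in actual)
    unexplained = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    return 1 - unexplained / total


# Hypothetical data: earnings (in $000s) for two people in each of two groups.
groups = ["A", "A", "B", "B"]
earnings = [10, 30, 20, 40]

fitted = group_mean_fit(groups, earnings)
r2 = r_squared(earnings, fitted)

print(fitted, round(r2, 3))
```

In this toy example group membership accounts for 20% of the variation in earnings; the remaining 80% is variation within groups that the model does not capture.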


Men Earn More: So What?

• Men earn more than women. What if we add “male” to the regression?
– 1970s immigrants have roughly the same sex ratio as the Canadian-born

Row percentages by immigrant cohort (rows) and the male indicator (columns):

Immigrant Cohort         0         1     Total
Before 1950          45.45     54.55    100.00
1950s                49.57     50.43    100.00
1960s                52.34     47.66    100.00
1970s                51.92     48.08    100.00
1980s                53.08     46.92    100.00
1990s                54.27     45.73    100.00
2000s                55.94     44.06    100.00
Cdn-Born             50.10     49.90    100.00
Temporary            55.59     44.41    100.00
Total                51.74     48.26    100.00

Comparing Regressions

Three regressions, left to right: (1) without male, (2) with male, (3) just men. Each pair of columns is Coef, SE.

Variable                 Coef       SE     Coef       SE     Coef       SE
Cdn-born 35-44 HS       37629      742    29566      777    48207     1360
Before 1950             -3662     5999    -4387     5915    -5535    10462
1950s                     952     2101      778     2072     1538     3807
1960s                   -1538     1446    -1218     1426    -1345     2688
1970s                   -4382     1020    -4114     1006    -7893     1890
1980s                  -10952     1057   -10523     1042   -15131     1975
1990s                  -18928      760   -18298      750   -25109     1433
2000s                  -25217     1059   -24254     1045   -31833     2042
Temporary              -20328     2017   -19392     1989   -15805     3895
25-34                   -9657      766    -9537      755   -14263     1418
45-54                    1692      737     1849      726     2777     1362
55-64                   -7878      827    -7778      815    -7198     1519
65-69                  -27324     1248   -27016     1230   -35258     2312
LT HS                   -7989      965    -8057      951   -11041     1789
Some PS                  3813      738     3599      728     3929     1366
BA/BSc                   5838     1157     6871     1142     6881     2295
Post BA                 17300      736    16962      726    20588     1360
male                                      16172      526
just men                                                       yes

• Check out just the top left panel: on the left there’s a regression without male, on the right a regression with male.
• Adding male barely changes the cohort coefficients.
• This is because male, though important to earnings, is not very correlated with immigrant arrival cohort.

Left: without male. Right: with male. Each pair of columns is Coef, SE.

Variable                 Coef       SE     Coef       SE
Cdn-born 35-44 HS       37629      742    29566      777
Before 1950             -3662     5999    -4387     5915
1950s                     952     2101      778     2072
1960s                   -1538     1446    -1218     1426
1970s                   -4382     1020    -4114     1006
1980s                  -10952     1057   -10523     1042
1990s                  -18928      760   -18298      750
2000s                  -25217     1059   -24254     1045
Temporary              -20328     2017   -19392     1989

Just Men’s Earnings

Variable                 Coef       SE
Cdn-born 35-44 HS       48207     1360
Before 1950             -5535    10462
1950s                    1538     3807
1960s                   -1345     2688
1970s                   -7893     1890
1980s                  -15131     1975
1990s                  -25109     1433
2000s                  -31833     2042
Temporary              -15805     3895
25-34                  -14263     1418
45-54                    2777     1362
55-64                   -7198     1519
65-69                  -35258     2312
LT HS                  -11041     1789
Some PS                  3929     1366
BA/BSc                   6881     2295
Post BA                 20588     1360
male
just men                  yes

This is the rightmost panel: a regression with just men in it.

The same pattern emerges: immigrants from earlier cohorts have higher earnings, and immigrants from recent cohorts have very low earnings.

Endogeneity

• Adding Male to the regression changed nothing important (because it wasn’t correlated with cohort)
– But let’s stick with just men.
• What about visible minority status? It is correlated.

Row percentages by immigrant cohort (rows) and the VM (visible minority) indicator (columns):

Immigrant Cohort         0         1     Total
Before 1950          94.44      5.56    100.00
1950s                86.21     13.79    100.00
1960s                69.38     30.62    100.00
1970s                35.74     64.26    100.00
1980s                18.34     81.66    100.00
1990s                14.61     85.39    100.00
2000s                17.57     82.43    100.00
Cdn-Born             90.71      9.29    100.00
Temporary            37.78     62.22    100.00
Total                62.70     37.30    100.00

Just men. Left: without vm. Right: with vm. Each pair of columns is Coef, SE.

Variable                     Coef       SE     Coef       SE
White Cdn-born 35-44 HS     48207     1360    49041     1364
Before 1950                 -5535    10462    -5361    10448
1950s                        1538     3807     2343     3804
1960s                       -1345     2688      998     2708
1970s                       -7893     1890    -2463     2060
1980s                      -15131     1975    -8222     2233
1990s                      -25109     1433   -17912     1799
2000s                      -31833     2042   -25021     2286
Temporary                  -15805     3895   -11181     3952
25-34                      -14263     1418   -13452     1421
45-54                        2777     1362     2625     1360
55-64                       -7198     1519    -7576     1519
65-69                      -35258     2312   -35378     2309
LT HS                      -11041     1789   -10752     1787
Some PS                      3929     1366     3540     1365
BA/BSc                       6881     2295     7352     2293
Post BA                     20588     1360    20674     1358
vm                                            -9523     1444

Endogeneity

• Those numbers changed a lot!

• Visible minority status is correlated with time of arrival: immigrants in later cohorts were more likely to be visible minorities.

• Visible minorities have lower earnings.

• Thus, if you leave visible minority out of the regression, you think you are measuring a cohort effect, but it is really a visible minority effect.

• This is a form of endogeneity. If a coefficient suffers from endogeneity, then its estimated value is not informative about the effect of the variable of interest.
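
The cohort story above can be sketched as a small simulation. This is only an illustration under assumed numbers, not the data behind the tables: the share of visible minorities in each cohort (0.8 vs. 0.2), the earnings effects (-5000 for recent arrival, -10000 for visible minority status) and the helper names `ols_one` and `ols_two` are all invented for the sketch. It compares the cohort coefficient when visible minority status is left out with the coefficient when it is included.

```python
import random

def ols_one(x, y):
    """Slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def ols_two(x1, x2, y):
    """Slopes from regressing y on x1 and x2 (with an intercept)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(333)
recent, vm, earnings = [], [], []
for _ in range(20000):
    r = 1 if rng.random() < 0.5 else 0
    # Assumption: recent arrivals are much more likely to be visible minorities.
    v = 1 if rng.random() < (0.8 if r else 0.2) else 0
    # Assumed true effects: -5000 for recent arrival, -10000 for vm.
    e = 50000 - 5000 * r - 10000 * v + rng.gauss(0, 15000)
    recent.append(r)
    vm.append(v)
    earnings.append(e)

short_slope = ols_one(recent, earnings)               # vm left out
long_cohort, long_vm = ols_two(recent, vm, earnings)  # vm included

print(f"cohort coefficient, vm left out: {short_slope:9.0f}")
print(f"cohort coefficient, vm included: {long_cohort:9.0f}")
print(f"vm coefficient:                  {long_vm:9.0f}")
```

With vm left out, the cohort coefficient absorbs part of the visible minority effect, so it looks much more negative than the assumed true cohort effect.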

Page 50: BUEC 333: An Introduction to Introduction to Econometrics

Correlation and Causation

• Two things make correlation different from causation.

– Missing variables: things (e.g., earnings and arrival cohort) may be correlated, but that correlation may be driven by left-out or unobserved stuff (e.g., country of birth).

– Reverse causation: things can be correlated, but you don’t know which causes which.

• E.g., if Chinese-born people fare well in B.C., then they may choose to move here. This means that their high wages cause their clustering here.

Page 51: BUEC 333: An Introduction to Introduction to Econometrics

Causation and Endogeneity

• If two variables are correlated but are not causally related, the econometric problem is one of endogeneity (just as above).

Page 52: BUEC 333: An Introduction to Introduction to Econometrics

Policy Needs Isolated Effects

• If we want to know whether or not recent cohorts fare badly, a table will do fine.

• If we want to know why they fare badly, we need to control for other factors.

– E.g., part of their ill fortune is driven by their ethnic origin, not when they came.

Page 53: BUEC 333: An Introduction to Introduction to Econometrics

Let’s Educate Immigrants

              Coef       SE     Coef       SE
LT HS       -11041     1789   -10752     1787
Some PS       3929     1366     3540     1365
BA/BSc        6881     2295     7352     2293
Post BA      20588     1360    20674     1358
vm                               yes
just men       yes               yes

Education pays off no matter how you slice it, controlling for age or not. Should we invest in educating immigrants as a matter of policy?

Page 54: BUEC 333: An Introduction to Introduction to Econometrics

Policy Needs Causal Effects

• Does the table tell us what happens if we get more educated immigrants?

• A lot of missing variables here:

– Quality of education; relevance of education; place of education.

– Are people who get educated smarter than people who don’t? If so, then educating a given person won’t raise their wage as much as the difference in earnings might suggest.

Page 55: BUEC 333: An Introduction to Introduction to Econometrics

Instrumental Variables

• Education is endogenous: that is, we don’t really believe that we have included all the correlated missing variables.

• We could use an instrument to solve this problem.

– An instrument is a variable that is correlated with the problem variable (education in this case) but not with all those correlated missing variables.

– Bit too tricky to define right now.
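
A rough sketch of the idea, using simulated data only. Everything here is assumed for illustration: an unobserved `ability` that raises both education and wages, a made-up instrument `z` that shifts education but is unrelated to ability, and a true return of 3000 per year of education. The simple instrumental-variables estimate in this one-instrument case is the ratio cov(z, wage) / cov(z, education).

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)

rng = random.Random(2009)
TRUE_RETURN = 3000.0  # assumed true effect of one more year of education

z, educ, wage = [], [], []
for _ in range(50000):
    ability = rng.gauss(0, 1)  # unobserved: the missing variable
    zi = rng.gauss(0, 1)       # hypothetical instrument, independent of ability
    e = 12 + 1.0 * zi + 1.0 * ability + rng.gauss(0, 1)
    w = 20000 + TRUE_RETURN * e + 5000 * ability + rng.gauss(0, 5000)
    z.append(zi)
    educ.append(e)
    wage.append(w)

ols = cov(educ, wage) / cov(educ, educ)  # contaminated by ability
iv = cov(z, wage) / cov(z, educ)         # uses only variation driven by z

print(f"OLS estimate of the return to education: {ols:7.0f}")
print(f"IV estimate of the return to education:  {iv:7.0f}")
```

In this setup the ordinary regression overstates the return because educated people are also higher-ability, while the instrument-based estimate lands near the assumed true value. Whether a credible instrument exists in real data is a separate and much harder question.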

Page 56: BUEC 333: An Introduction to Introduction to Econometrics

Experimental Approaches

• The idea of experiments, in the lab, in the field, or natural experiments, is:

– Hold everything constant while changing something of interest.

– Measure the difference across its values.

– Since nothing else is changing, there are no missing variables.

– Since the experimenter is changing the values of the variable, there is no reverse causation.
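
The logic can be sketched with a simulation. All numbers are assumed for illustration: a true treatment effect of 2, and an unobserved `ability` that also raises the outcome. In the "observational" version, higher-ability people select into treatment; in the "experimental" version, a coin flip decides.

```python
import random
from statistics import mean

rng = random.Random(56)
TRUE_EFFECT = 2.0  # assumed effect of the treatment on the outcome

def outcome(treated, ability):
    """Outcome depends on treatment, unobserved ability, and noise."""
    return 10 + TRUE_EFFECT * treated + 3 * ability + rng.gauss(0, 1)

abilities = [rng.gauss(0, 1) for _ in range(20000)]

# Observational data: higher-ability people tend to choose treatment.
observational = [(1 if a + rng.gauss(0, 1) > 0 else 0, a) for a in abilities]
# Experiment: the investigator assigns treatment by coin flip.
experimental = [(rng.randint(0, 1), a) for a in abilities]

def diff_in_means(assignments):
    """Mean outcome of the treated minus mean outcome of the untreated."""
    treated = [outcome(1, a) for t, a in assignments if t == 1]
    control = [outcome(0, a) for t, a in assignments if t == 0]
    return mean(treated) - mean(control)

obs_estimate = diff_in_means(observational)
exp_estimate = diff_in_means(experimental)

print(f"observational difference in means: {obs_estimate:.2f}")
print(f"experimental difference in means:  {exp_estimate:.2f}")
```

Random assignment makes treatment unrelated to ability, so the simple difference in means sits close to the assumed true effect; the self-selected comparison mixes the treatment effect with the ability gap.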

Page 57: BUEC 333: An Introduction to Introduction to Econometrics

Natural Experiments

• Lab experiments don’t work so well for economic questions, e.g., do immigrant arrivals affect native-born wages? Need a big lab.

• A natural experiment addresses reverse causality: use a change that is certainly not driven by your supposed dependent variable.

– E.g., the Mariel Boatlift put more than 100,000 Cubans into Florida.

– This was not due to high wages in Florida; it was due to Castro emptying prisons.

– It had no effect on native-born white or Hispanic wages (David Card 1990).

Page 58: BUEC 333: An Introduction to Introduction to Econometrics

Field Experiments

• Often, you don’t get the natural experiment you wish for.

• A field experiment uses data created by the investigator in such a way that there is little or no missing variables problem.

• Phil Oreopoulos (2009) (in MBC-supported research) sent out 8000 fake resumes to figure out whether and why immigrants have trouble finding work.

Page 59: BUEC 333: An Introduction to Introduction to Econometrics

Oreopoulos’ Study

• Field experiment with 8,000 mock resumes, emailed in response to newspaper and online job postings

• Randomize name, experience, education, and other characteristics, send, and compare callback rate differences

• Offers clear and convincing evidence on why recent immigrants fare poorly in the Canadian labour market
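
Comparing callback rates between two randomized resume types amounts to a difference in proportions with a standard error. The sketch below uses hypothetical counts chosen purely for illustration; they are not the study's data, and the labels "type A" and "type B" are placeholders.

```python
from math import sqrt

# Hypothetical counts for illustration only; NOT the study's numbers.
sent_a, callbacks_a = 2000, 320   # resume type A
sent_b, callbacks_b = 2000, 220   # resume type B

rate_a = callbacks_a / sent_a
rate_b = callbacks_b / sent_b
diff = rate_a - rate_b

# Standard error of a difference between two independent proportions.
se = sqrt(rate_a * (1 - rate_a) / sent_a + rate_b * (1 - rate_b) / sent_b)
z = diff / se

print(f"callback rate A: {rate_a:.3f}")
print(f"callback rate B: {rate_b:.3f}")
print(f"difference: {diff:.3f} (std err {se:.4f}, z = {z:.2f})")
```

Because the resume characteristics are randomized, a gap that is large relative to its standard error can be attributed to the characteristic that differs rather than to missing variables.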

Page 60: BUEC 333: An Introduction to Introduction to Econometrics

Example: Different experience

• Names

Page 61: BUEC 333: An Introduction to Introduction to Econometrics

Example: Different name

Page 62: BUEC 333: An Introduction to Introduction to Econometrics

Field Experiment Setup

• Respond only to internet or newspaper postings (e.g. Workopolis, Job Bank, Craigslist, …), May – Nov. 2008

• 4 resumes per posting, sent over 2-3 days:

• 0: Can. name, Can. education, Can. experience

• 1: foreign name, Can. education, Can. exp.

• 2: foreign name, foreign education, Can. exp.

• 3, 4: foreign name, foreign educ., some or all foreign exp., respectively

Page 63: BUEC 333: An Introduction to Introduction to Econometrics
Page 64: BUEC 333: An Introduction to Introduction to Econometrics

Oreopoulos’ Results

• Name discrimination likely illegal

– “Subsection 11 (1) of the Code also establishes that the right of a person under Part I is infringed where a requirement, qualification or factor exists that is not a prohibited ground of discrimination, but that results in the exclusion or restriction of a group of persons who are identified by a prohibited ground of discrimination, unless the requirement or factor is reasonable and genuine in the circumstances, subject to hardship on the employer”

• For those with 4-6 years of Can. experience, minimal differences for foreign vs. Can. education

• Lower returns to foreign experience are not because of type of job or employer

• Brits do fine

Page 65: BUEC 333: An Introduction to Introduction to Econometrics

Food For Thought

• One-way tables are not enough:

– Missing variables may be correlated, and may in fact be driving things.

– Multivariate analysis can fix up missing variables problems, and allow you to get at causation.

• Statistical precision matters:

– If things look different, but are not statistically distinguished (e.g., because they are only 1 std err apart), then they may not be different at all.

• Regression analysis is a multivariate tool that lets you think about each variable separately.

• Natural experiments, field experiments and instrumental variables help with both missing variables and reverse causation, and thus allow you to get at causation.
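
The point about statistical precision can be checked with simple arithmetic on two cohort coefficients as read from the second pair of columns of the regression table on Page 48 (1960s: 998 with std err 2708; 2000s: -25021 with std err 2286). The threshold of about 2 standard errors is the usual rule of thumb for a 5% test, and the helper name `t_ratio` is just a label for this sketch.

```python
def t_ratio(coef, se):
    """How many standard errors the estimate is away from zero."""
    return coef / se

# (coefficient, standard error) pairs as read from the Page 48 table.
estimates = {
    "1960s": (998, 2708),
    "2000s": (-25021, 2286),
}

for label, (coef, se) in estimates.items():
    t = t_ratio(coef, se)
    if abs(t) > 2:  # rule-of-thumb cutoff, roughly a 5% test
        verdict = "statistically distinguishable from the baseline"
    else:
        verdict = "not statistically distinguishable from the baseline"
    print(f"{label}: coef {coef}, std err {se}, t = {t:.2f} -> {verdict}")
```

The 1960s coefficient is well under one standard error from zero, so that cohort cannot be told apart from the baseline group, while the 2000s coefficient is many standard errors away.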