
Basics of the Labor Market

Participants are assigned motives:

• Workers look for the best job

• Firms look for profits

• Government uses regulation to achieve goals of public policy

– Minimum wages

– Occupational safety

Workers

• The most important actor; without workers, there is no “labor”

• Desire to optimize (to select the best option from available choices) to maximize well-being

• Will want to supply more time and effort for higher payoffs, causing an upward sloping labor supply curve

Firms

• Decide who to hire and fire

• Motivated to maximize profits

• Relationship between price of labor and the number of workers a firm is willing to hire generates the labor demand curve


Government

• Imposes taxes

• Safety/environmental regulations

• Set minimum wages

• Force firms to shuttle workers from home to work (SF, CA)

• Mediate labor union disputes with firms

Why is there a shortage of math teachers?

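[Figure: side-by-side supply-and-demand diagrams for the Mathematics and History teacher markets. Labels: LS (mathematicians), LS (historians), LD, union wage (w union), equilibrium wage (w*ind), and employment levels E*math and E*hist. The Mathematics panel is marked "shortage" and the History panel "surplus".]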

Why study Labor Economics?

How does increasing the minimum wage affect workers and firms?

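[Figure: supply-and-demand diagrams of the low skilled labor market, with labor supply LS (workers), labor demand LD (firms), equilibrium wage w* and employment E*, and a minimum wage wmin whose gap between labor supplied and labor demanded is labeled "unemployment".]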

Is there a cost to immigration?

A flood of low skilled workers into an economy…


Rise = .018 – (-.012) = .03

Run = -0.053 – 0.037 = -0.09

Slope = Rise / Run = .03 / (-0.09) = -.333

Using data to confirm theory (Scatterplots and simple regression)

The equation that describes how the dependent variable y is related to the independent variables and an error term is called the multiple regression model:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$

The equation that describes how the mean value of y is related to the independent variables is called the multiple regression equation:

$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k$

Using data to confirm theory (multiple regression)

The equation that describes how the predicted value of y is related to the independent variables is called the estimated multiple regression equation:

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$

1. Formulate a research question:

How has welfare reform affected employment of low-income mothers?

Issue 1: How should welfare reform be defined?

Since we are talking about aspects of welfare reform that influence the decision to work, we include the following variables:

• Welfare payments allow the head of household to work less.

tanfben3 = real value (in 1983 $) of the welfare payment to a family of 3 (x1)

• The Republican-led Congress passed welfare reform twice, and President Clinton vetoed it both times. Clinton signed it into law after Congress passed it a third time in 1996. All states put their TANF programs in place by 2000.

2000 = 1 if the year is 2000, 0 if it is 1994 (x2)


1. Formulate a research question:

How has welfare reform affected employment of low-income mothers?

Issue 1: How should welfare reform be defined? (continued)

• Families receive full sanctions if the head of household fails to adhere to a state’s work requirement.

fullsanction = 1 if state adopted policy, 0 otherwise (x3)

Issue 2: How should employment be defined?

• One might use the employment-population ratio of Low-Income Single Mothers (LISM):

$epr = \dfrac{\text{number of LISM that are employed}}{\text{number of LISM living}}$


2. Use economic theory or intuition to determine what the true regression model might look like.

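[Figure: leisure-consumption diagram with indifference curves U0 and U1. Marked values: leisure of 40 and 55; consumption of 300, 400, and 550.]

Receiving the welfare check increases LISM’s leisure, which decreases hours worked.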

Use economic graphs to derive testable hypotheses:

Economic theory suggests the following is not true:

$H_0: \beta_1 = 0$

Using data to confirm theory (Using theory to build testable hypotheses)

2. Use economic theory or intuition to determine what the true regression model might look like.

Use a mathematical model to derive testable hypotheses:

Economic theory suggests the following is not true:

$H_0: \beta_1 = 0$

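$\max_{C,\,L} \; U(C, L) = C \cdot L$

$\text{s.t.} \quad H = 80 - L, \qquad C = P + wH$

The solution of this problem is:

$L^* = 40 + \dfrac{P}{2w}$

$\dfrac{\partial L^*}{\partial P} = \dfrac{1}{2w} > 0$

$\beta_1 < 0$ because $\dfrac{\partial H^*}{\partial P} < 0$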
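The closed-form solution can be checked numerically. The sketch below is only an illustration: it assumes the utility function U(C, L) = C·L with 80 available hours, and the wage w = 10 and payment P = 300 are hypothetical values chosen for the check, not figures stated in the model.

```python
# Minimal check of the leisure choice model (illustrative values only).

def utility(leisure, P, w, total_hours=80):
    """U(C, L) = C * L with C = P + w * H and H = total_hours - L."""
    hours_worked = total_hours - leisure
    consumption = P + w * hours_worked
    return consumption * leisure

def optimal_leisure_closed_form(P, w):
    """Interior solution L* = 40 + P / (2w)."""
    return 40 + P / (2 * w)

def optimal_leisure_grid(P, w, step=0.01, total_hours=80):
    """Brute-force search over a grid of leisure choices."""
    n = int(total_hours / step)
    candidates = [i * step for i in range(n + 1)]
    return max(candidates, key=lambda L: utility(L, P, w, total_hours))

for P in (0, 300):  # hypothetical welfare payments
    L_star = optimal_leisure_closed_form(P, w=10)
    print("P =", P, "leisure =", L_star, "hours worked =", 80 - L_star)
```

A larger payment P raises optimal leisure and lowers hours worked, which is the sign prediction behind the hypothesis on the welfare coefficient.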

Using data to confirm theory (Using theory to build testable hypotheses)

3. Compute means, standard deviations, minimums and maximums for the variables.

state year epr tanfben3 fullsanction black dropo unemp

Alabama 1994 52.35 110.66 0 25.69 26.99 5.38

Alaska 1994 38.47 622.81 0 4.17 8.44 7.50

Arizona 1994 49.69 234.14 0 3.38 13.61 5.33

Arkansas 1994 48.17 137.65 0 16.02 25.36 7.50

West Virginia 2000 51.10 190.48 1 3.10 23.33 5.48

Wisconsin 2000 57.99 390.82 1 5.60 11.84 3.38

Wyoming 2000 58.34 197.44 1 0.63 11.14 3.81

Using data to confirm theory (Model Specification in regression)

3. Compute means, standard deviations, minimums and maximums for the variables.

                   1994                                 2000
variable       Mean     Std Dev  Min     Max      Mean     Std Dev  Min     Max      Diff

epr            46.73    8.58     28.98   65.64    53.74    7.73     40.79   74.72    7.01

tanfben3       265.79   105.02   80.97   622.81   234.29   90.99    95.24   536.00   -31.50

fullsanction   0.02     0.14     0.00    1.00     0.70     0.46     0.00    1.00     0.68

black          9.95     9.45     0.34    36.14    9.82     9.57     0.26    36.33    -0.13

dropo          17.95    5.20     8.44    28.49    14.17    4.09     6.88    23.33    -3.78

unemp          5.57     1.28     2.63    8.72     3.88     0.96     2.26    6.17     -1.69

Diff = 2000 mean minus 1994 mean.


4. Construct scatterplots of the variables. (1994, 2000)

[Figure: four scatterplots of epr (vertical axis, 0 to 80) against unemp, dropo, black, and tanfben3.]


5. Compute correlations for all pairs of variables. If | r | > .7 for a pair of independent variables,

• multicollinearity may be a problem

• it is not possible to determine the separate effect of any particular independent variable on y.

• Some say avoid including independent variables that are highly correlated, but it is better to have multicollinearity than omitted variable bias.

epr fullsanction black dropo unemp

tanfben3 -0.03 -0.24 -0.53 -0.50 0.10

unemp -0.64 -0.51 0.16 0.47

dropo -0.44 -0.25 0.51

black -0.32 0.07

fullsanction 0.43
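The correlation screen in step 5 can be sketched as follows. The function names are hypothetical, and the sample contains only the seven state rows shown earlier, so its correlations will not match the full 100-observation matrix above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def flag_collinear(data, threshold=0.7):
    """Return pairs of variables whose |r| exceeds the threshold."""
    names = list(data)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(data[a], data[b])
            if abs(r) > threshold:
                flagged.append((a, b, round(r, 2)))
    return flagged

# Seven sample rows only (illustrative subset of the data set).
sample = {
    "black": [25.69, 4.17, 3.38, 16.02, 3.10, 5.60, 0.63],
    "dropo": [26.99, 8.44, 13.61, 25.36, 23.33, 11.84, 11.14],
    "unemp": [5.38, 7.50, 5.33, 7.50, 5.48, 3.38, 3.81],
}
print(flag_collinear(sample))
```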


state year pmt pmt_ln

Alabama 1994 110.66 4.71

Alaska 1994 622.81 6.43

Arizona 1994 234.14 5.46

Arkansas 1994 137.65 4.92

West Virginia 2000 190.48 5.25

Wisconsin 2000 390.82 5.97

Wyoming 2000 197.44 5.29

Variable transformation
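The pmt_ln column is the natural log of the payment (pmt, the same values as tanfben3). A short sketch that reproduces the rounded values in the table above:

```python
from math import log

# (state, year) -> welfare payment, copied from the rows shown above.
pmt = {
    ("Alabama", 1994): 110.66,
    ("Alaska", 1994): 622.81,
    ("Arizona", 1994): 234.14,
    ("Arkansas", 1994): 137.65,
    ("West Virginia", 2000): 190.48,
    ("Wisconsin", 2000): 390.82,
    ("Wyoming", 2000): 197.44,
}

# Natural-log transformation, rounded to two decimals as in the table.
pmt_ln = {key: round(log(value), 2) for key, value in pmt.items()}

for key, value in pmt_ln.items():
    print(key, value)
```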


Least Squares Criterion: $\min \sum e_i^2 = \min \sum (y_i - \hat{y}_i)^2$

Computation of Coefficient Values:

In simple regression:

$b_1 = \dfrac{\text{cov}(x_1, y)}{\text{var}(x_1)}, \qquad b_0 = \bar{y} - b_1 \bar{x}_1$

In multiple regression:

$b = (X'X)^{-1} X'y$, where $b = (b_0, b_1, \dots, b_p)'$

You can use matrix algebra or computer software packages to compute the coefficients.
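The simple-regression formulas can be sketched in a few lines. The function name is hypothetical, and the sample uses only the seven state rows shown earlier, so the estimates will differ from the full-sample output.

```python
def simple_ols(xs, ys):
    """Return (b0, b1) with b1 = cov(x, y) / var(x) and b0 = ybar - b1 * xbar."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    b1 = cov_xy / var_x
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Illustrative subset: pmt_ln and epr for the seven rows shown in the deck.
pmt_ln = [4.71, 6.43, 5.46, 4.92, 5.25, 5.97, 5.29]
epr = [52.35, 38.47, 49.69, 48.17, 51.10, 57.99, 58.34]
b0, b1 = simple_ols(pmt_ln, epr)
print("intercept:", b0, "slope:", b1)
```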

Using data to confirm theory (Estimation)

Regression Statistics

Multiple R 0.0279

R Square 0.0008

Adjusted R Square -0.0094

Standard Error 8.8978

Observations 100

ANOVA

df SS MS F

Regression 1 6.031 6.031 0.076

Residual 98 7758.733 79.171

Total 99 7764.764

Coefficients Standard Error t Stat P-value

Intercept 46.9192 12.038 3.897 0.000

pmt_ln 0.6087 2.206 0.276 0.783

$r^2 \cdot 100\%$ of the variability in y can be explained by the model: here, .08% of the variability in the epr of LISM.
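The summary statistics in the regression output follow from the ANOVA sums of squares. A minimal sketch, assuming the standard definitions of R Square, Adjusted R Square, and the F statistic (the function name is hypothetical):

```python
def fit_statistics(ss_regression, ss_residual, n_obs, n_regressors):
    """Return (R^2, adjusted R^2, F) from ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    df_residual = n_obs - n_regressors - 1
    r2 = ss_regression / ss_total
    adj_r2 = 1 - (1 - r2) * (n_obs - 1) / df_residual
    f_stat = (ss_regression / n_regressors) / (ss_residual / df_residual)
    return r2, adj_r2, f_stat

# Values from the one-regressor ANOVA table above (epr on pmt_ln).
r2, adj_r2, f_stat = fit_statistics(6.031, 7758.733, n_obs=100, n_regressors=1)
print(round(r2, 4), round(adj_r2, 4), round(f_stat, 3))
```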

Using data to confirm theory (Omitting variable bias)

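$H_0: \beta_1 = 0$

$t\text{-stat} = \dfrac{b_1}{s_{b_1}} = \dfrac{.6087}{2.2055} = .276$

df = 100 – 1 – 1 = 98 (column), α/2 = .05/2 = .025 (row): critical values are -1.984 and 1.984

[Figure: t distribution with rejection regions of area .025 in each tail, beyond -1.984 and 1.984; the t-stat of .276 falls in the "Do Not Reject" region.]

We cannot reject H0 at a 5% level of significance.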

Using data to confirm theory (hypothesis testing)

• If estimated coefficient b1 were statistically significant, we would interpret its value as follows:

$\widehat{epr}\,|_{x_1 = \$292} - \widehat{epr}\,|_{x_1 = \$266} = b_1 \ln(292) - b_1 \ln(266) = 0.6087 \ln(1.10) = .058$

Increasing monthly benefit levels for a family of three by 10% would result in a .058 percentage point increase in the average epr of LISM.

• However, since estimated coefficient b1 is statistically insignificant, we interpret its value as follows:

Increasing monthly benefit levels for a family of three has no effect on the epr of LISM.

Our theory suggests that this estimate has the wrong sign and is biased towards zero. This bias is called omitted variable bias.

Using data to confirm theory (interpretation)

R Square 0.166

Adjusted R Square 0.149


Observations 100

ANOVA

df SS MS F

Regression 2 1288.797 644.398 9.652

Residual 97 6475.967 66.763

Total 99 7764.764


Intercept

pmt_ln

2000



About 15% of the variability in the epr of LISM can be explained by the model.


R Square 0.214



Observations 100

ANOVA

df SS MS F

Regression 3 1664.635 554.878 8.732

Residual 96 6100.129 63.543

Total 99 7764.764


Intercept 31.544 11.204 2.815 0.006

pmt_ln 2.738 2.024 1.353 0.179

2000 3.401 2.259 1.506 0.135

full 5.793 2.382 2.432 0.017



About 19% of the variability in the epr of LISM can be explained by the model.


R Square 0.517



Observations 100

ANOVA

df SS MS F

Regression 6 4018.075 669.679 16.623

Residual 93 3746.689 40.287

Total 99 7764.764


Intercept

pmt_ln

2000

full

black

drop

unemp



About 49% of the variability in the epr of LISM can be explained by the model.



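Using data to confirm theory (Estimation)

Regressors: ln x1 = pmt_ln, x2 = 2000, x3 = full, x4 = black, x5 = drop, x6 = unemp

$\hat{y} = 104.529 - 5.709 \ln x_1 - 2.821 x_2 + 3.768 x_3 - 0.291 x_4 - 0.374 x_5 - 3.023 x_6$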


$E(\varepsilon)$ is probably equal to zero since $E(e) = 0$

$\hat{y} = 104.529 - 5.709 \ln x_1 - 2.821 x_2 + 3.768 x_3 - 0.291 x_4 - 0.374 x_5 - 3.023 x_6$

Columns: epr (y), pmt_ln (ln x1), 2000 (x2), full (x3), black (x4), drop (x5), unemp (x6), epr hat (ŷ), residual (e)

epr pmt_ln 2000 full black drop unemp epr hat residual

52.35 4.71 0 0 25.69 26.99 5.38 43.83 8.52

38.47 6.43 0 0 4.17 8.44 7.50 40.76 -2.29

49.69 5.46 0 0 3.38 13.61 5.33 51.19 -1.50

48.17 4.92 0 0 16.02 25.36 7.50 39.60 8.57

51.10 5.25 1 1 3.10 23.33 5.48 49.31 1.79

57.99 5.97 1 1 5.60 11.84 3.38 55.14 2.85

58.34 5.29 1 1 0.63 11.14 3.81 59.44 -1.10

Sum of residuals: 0
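The fitted values and residuals in the table can be reproduced from the estimated equation. This is a sketch with hypothetical names; the coefficient signs are taken from the coefficient significance tests reported later in the deck, and small differences from the table come from rounding in the published coefficients.

```python
from math import log

# Estimated coefficients (rounded as published in the deck).
COEF = {
    "intercept": 104.529,
    "pmt_ln": -5.709,
    "2000": -2.821,
    "full": 3.768,
    "black": -0.291,
    "drop": -0.374,
    "unemp": -3.023,
}

def predict_epr(pmt, year2000, full, black, drop, unemp):
    """Predicted epr for one state-year observation."""
    return (COEF["intercept"]
            + COEF["pmt_ln"] * log(pmt)
            + COEF["2000"] * year2000
            + COEF["full"] * full
            + COEF["black"] * black
            + COEF["drop"] * drop
            + COEF["unemp"] * unemp)

# Alabama, 1994: observed epr is 52.35.
alabama_hat = predict_epr(110.66, 0, 0, 25.69, 26.99, 5.38)
alabama_resid = 52.35 - alabama_hat
print(round(alabama_hat, 2), round(alabama_resid, 2))
```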

Using data to confirm theory (A1: zero mean)

Heteroscedasticity is likely present if scatterplots of residuals versus t, $\hat{y}$, x1, x2, …, xk are not a random horizontal band of points.

[Figure: five scatterplots of the residuals (vertical axis, -15 to 20) against predicted epr, black, pmt, drop, and unemp.]

Non-constant variance in black?

Using data to confirm theory (A2: Constant variance)


To test for heteroscedasticity, perform White’s squared residual test by first squaring the residuals, and then using these as the “y” variable in a secondary regression:

Coefficients Standard Error t Stat P-value

Intercept -4681.00 3014.00 -1.55 0.125

pmt_ln 1526.70 972.20 1.57 0.121

2000 76.30 398.50 0.19 0.849

full -88.70 394.80 -0.22 0.823

black 28.53 28.72 0.99 0.324

drop 115.56 55.98 2.06 0.042

unemp -204.70 165.00 -1.24 0.219

pmt_ln2 -128.25 79.93 -1.60 0.113

black2 0.13 0.10 1.30 0.196

drop2 -0.75 0.44 -1.69 0.095

unemp2 -1.73 4.33 -0.40 0.690

pmt_lnX2000 -6.67 63.28 -0.11 0.916

pmt_lnXfull 26.14 61.29 0.43 0.671

pmt_lnXblack -5.31 4.43 -1.20 0.234

pmt_lnXdrop -16.31 8.97 -1.82 0.073

2000Xunemp 30.33 17.53 1.73 0.088

fullXblack 0.86 4.08 0.21 0.834

fullXdrop 5.31 6.46 0.82 0.413

fullXunemp -55.11 19.20 -2.87 0.005

blackXdrop -0.56 0.33 -1.71 0.091

blackXunemp 0.97 0.89 1.10 0.275

dropXunemp 1.23 2.17 0.57 0.572

ANOVA

df SS MS F

Regression 25 81517 3261 1.24

Residual 74 194024 2622

Total 99 275541

If F-stat > F.05, we reject H0: no heteroscedasticity.

With 25 and 74 degrees of freedom, F.05 = 1.66. Since F-stat = 1.24 < 1.66, we do not reject H0. Hence, $\sigma^2$ is probably constant.
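The decision rule for White's test uses only the ANOVA table of the secondary regression. A minimal sketch (the function name is hypothetical; the critical value 1.66 is the one quoted in the deck):

```python
def anova_f_stat(ss_regression, df_regression, ss_residual, df_residual):
    """F statistic = mean square regression / mean square residual."""
    return (ss_regression / df_regression) / (ss_residual / df_residual)

# ANOVA values from the secondary (squared-residual) regression.
white_f = anova_f_stat(81517, 25, 194024, 74)
F_CRITICAL = 1.66  # F.05 with 25 and 74 degrees of freedom, as quoted in the deck

reject_homoscedasticity = white_f > F_CRITICAL
print(round(white_f, 2), reject_homoscedasticity)
```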

If heteroscedasticity is a problem,

• Estimated coefficients aren’t biased

• Coefficient standard errors are wrong

• Hypothesis testing is unreliable

In our example, heteroscedasticity does not seem to be a problem.

If heteroscedasticity is a problem, do one of the following:

• Use Weighted Least Squares with 1/xj or 1/xj^0.5 as weights where xj is the variable causing the problem

• Compute “Huber-White standard errors”

$t\text{-stat} = \dfrac{b_1}{s_{b_1}}$


The error $\varepsilon$ is probably normally distributed if e is normally distributed.

[Figure: histogram of residuals, showing frequency (0 to 30) for residual bins from -20 to 20.]

Using data to confirm theory (A3: Normality)

There are a number of normality tests one can choose.

• The Jarque-Bera test involves using the skew and kurtosis of the residuals.

• The test statistic follows a chi-square distribution with 2 degrees of freedom:

kurtosis measures "peakedness" of the probability distribution.

• High kurtosis → sharp peak, low kurtosis → flat peak.

• involves raising standardized residuals to the 4th power

• Excel: =kurt(A1:A100) → 0.0214

skewness measures asymmetry of the distribution.

• 0 skew → symmetric distribution, negative skew → skewed left, positive skew → skewed right

• involves raising standardized residuals to the 3rd power

• Excel: =skew(A1:A100) → 0.3276

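$\chi^2\text{-stat} = \dfrac{n}{6}\left(skew^2 + \dfrac{kurt^2}{4}\right) = \dfrac{100}{6}\left(.3276^2 + \dfrac{.0214^2}{4}\right) = 1.791$

df = 2 (row), α = .05 (column): $\chi^2_{.05} = 5.99$

[Figure: chi-square distribution with a rejection region of area .05 to the right of 5.99; the χ²-stat of 1.791 falls in the "Do Not Reject H0" region.]

Since 1.791 < 5.99, we do not reject H0: errors are normal. There is no reason to doubt the assumption that the errors are normally distributed.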
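The Jarque-Bera statistic can be recomputed from the reported skew and kurtosis. This sketch assumes the kurtosis is excess kurtosis (which is what Excel's KURT returns), so it enters the formula without subtracting 3; the function name is hypothetical.

```python
def jarque_bera(n, skew, excess_kurtosis):
    """Jarque-Bera statistic: n/6 * (skew^2 + excess_kurtosis^2 / 4)."""
    return n / 6 * (skew ** 2 + excess_kurtosis ** 2 / 4)

jb_stat = jarque_bera(n=100, skew=0.3276, excess_kurtosis=0.0214)
CHI2_CRITICAL = 5.99  # chi-square, 2 degrees of freedom, alpha = .05 (from the deck)

reject_normality = jb_stat > CHI2_CRITICAL
print(round(jb_stat, 3), reject_normality)
```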


If the errors are normally distributed,

• parameter estimates are normally distributed

• F and t significance tests are valid

If the errors are not normally distributed but the sample size is large,

• parameter estimates are approximately normally distributed (CLT)

• F and t significance tests are valid

If the errors are not normally distributed and the sample size is small,

• parameter estimates are not normally distributed

• F and t significance tests are not reliable


The values of $\varepsilon$ are probably independent if the autocorrelation residual plot or the Durbin-Watson statistic (DW-stat) indicates that the values of e are independent.

The DW-stat varies when the data’s order is altered

• If there are multiple time periods, compute DW-stat after sorting by time periods

• If the data is cross-sectional, compute the DW-stat after sorting by geography (e.g., NE, NW, Central, SW, SE …)

• If the data is both, compute the DW-stat after sorting by time periods and geography

no autocorrelation if DW-stat = 2

perfect "−" autocorrelation if DW-stat = 4

perfect "+" autocorrelation if DW-stat = 0

Using data to confirm theory (A4: Independence)

Observation region Residuals $(e_i - e_{i-1})^2$ $e_i^2$

2 0 -2.29 - 5.24

5 0 -0.56 3.00 0.31

11 0 -5.59 25.29 31.21

37 0 -4.42 1.36 19.55

47 0 -3.76 0.43 14.16

52 0 14.84 345.91 220.08

55 0 -4.80 385.57 23.05

61 0 11.91 279.27 141.86

87 0 3.51 70.62 12.30

98 5 1.79 161.93 3.19

99 5 2.85 1.12 8.11

sum 7620.63 3746.69

$\text{DW-stat} = \dfrac{\sum (e_i - e_{i-1})^2}{\sum e_i^2} = \dfrac{7620.63}{3746.69} = 2.03$

There is no reason to doubt the assumption that the errors are independent.
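The Durbin-Watson statistic is the ratio of the two column sums in the table. A minimal sketch with a hypothetical function name; the residuals must already be sorted as described (by time period and geography):

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Using the column sums reported in the table above.
dw_from_sums = 7620.63 / 3746.69
print(round(dw_from_sums, 2))

# Toy residuals (made up) that alternate in sign give a value above 2.
print(durbin_watson([1, -1, 1, -1]))
```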


If autocorrelation (or serial correlation) is a problem,

• Estimated coefficients aren’t biased, but

• Their standard errors may be inflated

• Hypothesis testing is unreliable

In our example, autocorrelation does not seem to be a problem.

If autocorrelation is a problem, do one of the following:

• Change the functional form

• Include an omitted variable

• Use Generalized Least Squares

• Compute “Newey-West standard errors” for the estimated coefficients.

$t\text{-stat} = \dfrac{b_1}{s_{b_1}}$


The true model is probably linear if the scatterplot of e versus $\hat{y}$ is a horizontal, random band of points.

[Figure: scatterplot of the residuals (vertical axis, -15 to 20) against predicted epr (30 to 70).]

1. The simple regression line’s slope = 0 and its height = 0.

2. There is no pattern in this scatter plot.

Using data to confirm theory (A5: Linearity)

If you fit a linear model to data which are nonlinearly related,

• Estimated coefficients are biased

• Predictions are likely to be seriously in error

In our example, nonlinearity does not seem to be a problem.

If the data are nonlinearly related, do one of the following:

• Rethink the functional form

• Transform one or more of the variables

All 5 model assumptions appear to be valid. Hence, the t and F tests are reliable provided the “right” regressors are included.

$t\text{-stat} = \dfrac{b_1}{s_{b_1}}$

Using data to confirm theory (A5: Linearity)

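$H_0: \beta_1 = \beta_2 = \dots = \beta_6 = 0$

dfN = 6 (column), dfD = 93 and α = .05 (row): F.05 = 2.20

[Figure: F distribution with a rejection region of area .05 to the right of 2.20; the F-stat of 16.623 falls in the "Reject H0" region.]

Since F-stat = 16.623 > 2.20, we reject H0. There is sufficient evidence to conclude that the coefficients are not all equal to zero simultaneously.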

Using data to confirm theory (Testing for Overall Significance)

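$H_0: \beta_1 = 0$

$t\text{-stat} = \dfrac{b_1}{s_{b_1}} = \dfrac{-5.709}{2.461} = -2.32$

df = 100 – 6 – 1 = 93 (column), α/2 = .05/2 = .025 (row): critical values are -1.986 and 1.986

[Figure: t distribution with rejection regions of area .025 in each tail; the t-stat of -2.32 falls in the left rejection region.]

Reject H0 at a 5% level of significance.

I.e., TANF welfare payments influence the decision to work.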

Using data to confirm theory (Testing for Coefficient Significance)

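$H_0: \beta_2 = 0$

$t\text{-stat} = \dfrac{b_2}{s_{b_2}} = \dfrac{-2.821}{2.029} = -1.39$

df = 100 – 6 – 1 = 93 (column), α/2 = .05/2 = .025 (row): critical values are -1.986 and 1.986

[Figure: t distribution with rejection regions of area .025 in each tail; the t-stat of -1.39 falls in the "Do Not Reject" region.]

We cannot reject H0 at a 5% level of significance.

I.e., welfare reform in general does not influence the decision to work.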


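$H_0: \beta_3 = 0$

$t\text{-stat} = \dfrac{b_3}{s_{b_3}} = \dfrac{3.768}{1.927} = 1.96$

df = 100 – 6 – 1 = 93 (column), α/2 = .05/2 = .025 (row): critical values are -1.986 and 1.986

[Figure: t distribution with rejection regions of area .025 in each tail; the t-stat of 1.96 falls just inside the "Do Not Reject" region.]

Although we cannot reject H0 at a 5% level of significance, we can at the 10% level (p-value = .054).

I.e., full sanctions for failure to comply with work rules influence the decision to work.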


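$H_0: \beta_4 = 0$

$t\text{-stat} = \dfrac{b_4}{s_{b_4}} = \dfrac{-0.291}{0.089} = -3.26$

df = 100 – 6 – 1 = 93 (column), α/2 = .05/2 = .025 (row): critical values are -1.986 and 1.986

[Figure: t distribution with rejection regions of area .025 in each tail; the t-stat of -3.26 falls in the left rejection region.]

Reject H0 at a 5% level of significance.

I.e., the share of the population that is black influences the decision to work.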


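$H_0: \beta_5 = 0$

$t\text{-stat} = \dfrac{b_5}{s_{b_5}} = \dfrac{-0.374}{0.202} = -1.85$

df = 100 – 6 – 1 = 93 (column), α/2 = .05/2 = .025 (row): critical values are -1.986 and 1.986

[Figure: t distribution with rejection regions of area .025 in each tail; the t-stat of -1.85 falls in the "Do Not Reject" region.]

Although we cannot reject H0 at a 5% level of significance, we can at the 10% level (p-value = .068).

I.e., the share of the population that is a high school dropout influences the decision to work.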


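$H_0: \beta_6 = 0$

$t\text{-stat} = \dfrac{b_6}{s_{b_6}} = \dfrac{-3.023}{0.618} = -4.89$

df = 100 – 6 – 1 = 93 (column), α/2 = .05/2 = .025 (row): critical values are -1.986 and 1.986

[Figure: t distribution with rejection regions of area .025 in each tail; the t-stat of -4.89 falls in the left rejection region.]

Reject H0 at a 5% level of significance.

I.e., the unemployment rate influences the decision to work.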
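The six coefficient tests share the same arithmetic, so they can be run in one loop. This is a sketch with hypothetical names; the coefficients, standard errors, and the critical value 1.986 are the ones quoted in the deck, and t-stats recomputed from rounded inputs can differ from the published ones in the last digit.

```python
# (coefficient, standard error) for each regressor, as quoted in the deck.
ESTIMATES = {
    "pmt_ln": (-5.709, 2.461),
    "2000": (-2.821, 2.029),
    "full": (3.768, 1.927),
    "black": (-0.291, 0.089),
    "drop": (-0.374, 0.202),
    "unemp": (-3.023, 0.618),
}
T_CRITICAL = 1.986  # two-tailed, alpha = .05, df = 93 (from the deck)

def t_stat(b, se):
    """t statistic for H0: beta = 0."""
    return b / se

# Regressors whose |t| exceeds the critical value at the 5% level.
rejected = [name for name, (b, se) in ESTIMATES.items()
            if abs(t_stat(b, se)) > T_CRITICAL]

for name, (b, se) in ESTIMATES.items():
    print(name, round(t_stat(b, se), 2), name in rejected)
```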


• Since estimated coefficient b1 is statistically significant, we interpret its value as follows:

$\widehat{epr}\,|_{x_1 = \$292} - \widehat{epr}\,|_{x_1 = \$266} = b_1 \ln(292) - b_1 \ln(266) = -5.709 \ln(1.10) = -.54$

Increasing monthly benefit levels for a family of three by 10% would result in a .54 percentage point reduction in the average epr of LISM.

• Since estimated coefficient b2 is statistically insignificant (its p-value is greater than 15%), we interpret its value as follows:

Welfare reform in general had no effect on the epr of LISM.
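Because the payment enters the model in logs, the effect of a percentage increase is the coefficient times ln(1 + percentage). A small sketch of that arithmetic (the function name is hypothetical):

```python
from math import log

def epr_change_from_pct_increase(b_log, pct_increase):
    """Change in predicted epr when a logged regressor rises by pct_increase (0.10 = 10%)."""
    return b_log * log(1 + pct_increase)

# Coefficient on pmt_ln from the six-regressor model.
change = epr_change_from_pct_increase(-5.709, 0.10)
print(round(change, 2))
```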

Using data to confirm theory (Interpreting Coefficients)

• Since estimated coefficient b3 is statistically significant at the 10% level, we interpret its value as follows:

$b_3 = \dfrac{\Delta y}{\Delta x_3} = \dfrac{+3.768}{+1} = 3.768$

The epr of LISM is 3.768 percentage points higher in states that adopted full sanctions for families that fail to comply with work rules.



$b_4 = \dfrac{\Delta y}{\Delta x_4} = \dfrac{-0.291}{+1} = \dfrac{-2.91}{+10}$

Each 10 percentage point increase in the share of the black population in states is associated with a 2.91 percentage point decline in the epr of LISM.




$b_5 = \dfrac{\Delta y}{\Delta x_5} = \dfrac{-0.374}{+1} = \dfrac{-3.74}{+10}$

Each 10 percentage point increase in the high school dropout rate is associated with a 3.74 percentage point decline in the epr of LISM.



$b_6 = \dfrac{\Delta y}{\Delta x_6} = \dfrac{-3.023}{+1} = \dfrac{-30.23}{+10}$

Each 10 percentage point increase in the unemployment rate is associated with a 30.23 percentage point decline in the epr of LISM.
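For regressors that enter the model in levels, the predicted change in epr is the coefficient times the change in the regressor. A minimal sketch of the scaling used in the interpretations above (the names are hypothetical):

```python
def effect_of_change(b, delta_x):
    """Predicted change in y for a delta_x change in a level regressor."""
    return b * delta_x

# Coefficients from the six-regressor model and the changes discussed in the deck.
effects = {
    "full (0 -> 1)": effect_of_change(3.768, 1),
    "black (+10 points)": effect_of_change(-0.291, 10),
    "drop (+10 points)": effect_of_change(-0.374, 10),
    "unemp (+10 points)": effect_of_change(-3.023, 10),
}
for label, value in effects.items():
    print(label, round(value, 3))
```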