
LABOR SUPPLY RESPONSES TO TAXES AND TRANSFERS

Henrik Jacobsen Kleven

London School of Economics

Lecture Notes for MSc Public Economics (EC426)

AGENDA

Efficiency cost of labor supply responses.

Theoretical models of labor supply.

Empirical approaches and methodological issues in labor supply estimation.

Quasi-experiments:

— Tax-transfer reforms and difference-in-differences

— Kinks, notches, and bunching

Optimization frictions and observed vs. structural elasticities.

WHY CARE ABOUT LABOR SUPPLY RESPONSES?

When the government levies income taxes to finance transfer programs

and public goods, individuals respond by changing labor supply.

By revealed preference, each individual prefers the new labor supply to

the old one at the tax-inclusive prices. So, everyone is better off because

of the labor supply adjustments?

No, the revealed preference argument applies to each individual alone,

not to the population as a whole.

Behavioral responses affect government revenue and create a fiscal externality, which is the deadweight loss of taxation.

THE TOTAL DEADWEIGHT LOSS OF TAXATION

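[Figure: labor supply function, with the net-of-tax wage rate on the vertical axis and hours worked on the horizontal axis. The tax lowers the net wage from w to (1-τ)w and hours from h*A to h*B. Labelled areas: tax revenue (T), consumer surplus (CS), and deadweight loss (DWL); points A and B.]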

THE MARGINAL DEADWEIGHT LOSS OF TAXATION

[Figure: the same labor supply diagram (net-of-tax wage rate against hours worked) for a small tax increase, showing the mechanical revenue effect dM and the behavioral revenue effect dB (≈ dDWL) alongside the areas T, CS, and DWL.]

THE DEADWEIGHT LOSS OF TAXATION

The deadweight loss is given by $DWL = \Delta U - T$, where $\Delta U$ is the utility loss from taxation (in monetary units) and $T$ is collected tax revenue.

The marginal DWL is given by $dDWL = dU - dT$.

We have $dT = dM + dB$, where $dM$ is the mechanical revenue effect and $dB$ is the behavioral revenue effect. We have $dU = dM$ using the envelope theorem.

$\Rightarrow\ dDWL = dM - (dM + dB) = -dB$.

General insight: In a model with efficient markets, the marginal efficiency cost of taxation is given by the behavioral revenue loss.

DWL FROM LABOR SUPPLY RESPONSES

We have

$$dDWL = -dB = -\tau \cdot w \cdot dh = \frac{\tau}{1-\tau} \cdot \varepsilon \cdot wh \cdot d\tau$$

where $\varepsilon \equiv \frac{\partial h}{\partial (1-\tau)} \cdot \frac{1-\tau}{h}$ is the elasticity of hours worked with respect to the net-of-tax rate.

Compensated vs. uncompensated elasticity? For DWL, we want the compensated elasticity (see Auerbach, 1985, for a rigorous treatment).

The empirical literature estimates both compensated and uncompensated elasticities.
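As a rough illustration of the marginal deadweight-loss formula, the sketch below evaluates dDWL = τ/(1-τ) · ε · wh · dτ, where τ is the tax rate, ε the compensated elasticity of hours with respect to the net-of-tax rate, and wh pre-tax earnings. The function name and all numbers are hypothetical, chosen only for illustration.

```python
def marginal_dwl(tau, elasticity, earnings, dtau):
    """Marginal deadweight loss of raising the tax rate by dtau.

    Evaluates tau / (1 - tau) * elasticity * earnings * dtau, where
    earnings = w * h and elasticity is the compensated elasticity of
    hours with respect to the net-of-tax rate (1 - tau).
    """
    if not 0.0 <= tau < 1.0:
        raise ValueError("tau must lie in [0, 1)")
    return tau / (1.0 - tau) * elasticity * earnings * dtau


# Hypothetical inputs: 40% tax rate, elasticity 0.25, earnings of 50,000,
# and a one-percentage-point tax increase.
example_dwl = marginal_dwl(tau=0.4, elasticity=0.25, earnings=50_000, dtau=0.01)

# Mechanical revenue effect of the same reform, dM = w * h * dtau.
mechanical_gain = 50_000 * 0.01

# Share of the mechanical revenue gain lost through the behavioral response.
loss_share = example_dwl / mechanical_gain
```

With these made-up inputs, about one sixth of the mechanical revenue gain is offset by the behavioral response; the share scales linearly with the elasticity.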

DIMENSIONS OF LABOR SUPPLY RESPONSE

1. Quantitative dimensions:

(a) Hours worked for those who are working (intensive margin)

(b) Labor force participation (extensive margin)

2. Qualitative dimensions:

(a) Effort on the job

(b) Type of job (occupation, industry, etc.)

(c) Education and on-the-job training

(d) Location and migration

THE BASIC STATIC MODEL WITHOUT TAXES

Utility maximization:

$$\max_{c,\,h}\ u = u(c, h, \mathbf{x}) \quad \text{s.t.} \quad c = wh + y$$

where $w$ is the wage rate, $y$ is non-labor income, and $\mathbf{x}$ is a vector of individual characteristics.

Optimal labor supply satisfies $-u'_h / u'_c = w$ along with a non-negativity constraint → labor supply function $h = h(w, y, \mathbf{x}) \geq 0$.

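INCOME AND SUBSTITUTION EFFECTS

[Figure: consumption c against hours worked h, with non-labor income y as the intercept, budget lines of slope w0 and w1, and indifference curves u0 and u1. The move between the two optima is decomposed into a substitution effect and an income effect.]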

ELASTICITIES AND SLUTSKY DECOMPOSITION

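Uncompensated wage elasticity $\varepsilon^u = \frac{\partial h}{\partial w}\frac{w}{h}$. Can be either positive or negative.

Income elasticity $\varepsilon^y = \frac{\partial h}{\partial y}\frac{y}{h}$. Negative if leisure is normal.

Compensated wage elasticity $\varepsilon^c$ captures the response at a constant utility level. This elasticity is always positive.

The Slutsky equation links the elasticities:

$$\varepsilon^u = \varepsilon^c - \eta$$

where $\eta \equiv -w \cdot \frac{\partial h}{\partial y}$ is the income effect.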

BASIC CROSS-SECTIONAL ESTIMATION

Based on the model above, early work considered a regression specification such as

$$h = \beta_0 + \beta_1 \cdot w + \beta_2 \cdot y + \beta_3 \cdot \mathbf{x} + \nu$$

where $\mathbf{x}$ is a vector of observable controls and $\nu$ is the error term.

OLS is consistent if the explanatory variables and $\nu$ are uncorrelated.

Results:

• Male labor supply (see Pencavel, 1986): $\varepsilon^u \approx 0$ and $\eta \approx 0.1$ ⇒ $\varepsilon^c \approx 0.1$.

• Female labor supply (Heckman-Killingsworth, 1986): Much larger elasticities on average, but enormous variation across studies.

PROBLEMS WITH BASIC APPROACH

1. Identification: $w$ may be positively correlated with taste for work, which is unobserved and captured by $\nu$ → positive correlation between $w$ and $\nu$ → upward bias [omitted variable bias].

2. Measurement error, as $w$ is measured as earnings/hours from surveys.

3. Functional form sensitivity.

4. Ignores the impact of taxes on the central explanatory variables.

5. Non-participation: sample is selected so that $h > 0$, because otherwise no wage rate is observed. Extensive responses ignored.

ACCOUNTING FOR (NON-LINEAR) TAXES

$J$-bracket piecewise linear schedule $T(wh)$ with marginal tax rates $\tau_1, \ldots, \tau_J$.

The budget constraint:

$$c = wh - T(wh) + y = (1-\tau_j)\,wh + R_j$$

where $R_j \equiv \tau_j wh - T(wh) + y$ is virtual income in bracket $j$.

Within each bracket, optimal labor supply can be written in the form $h = h((1-\tau_j)w, R_j)$.

We must deal explicitly with the possibility of a kink solution at the threshold between two brackets.
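The virtual-income construction can be made concrete with a small sketch. It assumes a piecewise linear schedule described by bracket lower bounds and marginal rates; the function names, the two-bracket schedule, and the numbers are all hypothetical. Virtual income in bracket j is the intercept obtained by extending that bracket's budget line back to zero earnings.

```python
def tax_liability(z, thresholds, rates):
    """Tax T(z) on earnings z under a piecewise linear schedule.

    thresholds[j] is the lower earnings bound of bracket j (thresholds[0] = 0)
    and rates[j] is the marginal tax rate inside bracket j.
    """
    tax = 0.0
    for j, rate in enumerate(rates):
        lower = thresholds[j]
        upper = thresholds[j + 1] if j + 1 < len(thresholds) else float("inf")
        if z > lower:
            tax += rate * (min(z, upper) - lower)
    return tax


def virtual_income(j, thresholds, rates, y):
    """Virtual income R_j = tau_j * z - T(z) + y for any z inside bracket j."""
    z_j = thresholds[j]  # the lower bound is one convenient point in bracket j
    return rates[j] * z_j - tax_liability(z_j, thresholds, rates) + y


# Hypothetical schedule: 20% up to 30,000 and 50% above; non-labor income 1,000.
thresholds, rates, y = [0.0, 30_000.0], [0.2, 0.5], 1_000.0
R1 = virtual_income(0, thresholds, rates, y)
R2 = virtual_income(1, thresholds, rates, y)

# Check: consumption at z = 40,000 computed directly and via the linearized budget.
z = 40_000.0
c_direct = z - tax_liability(z, thresholds, rates) + y
c_linear = (1.0 - rates[1]) * z + R2
```

The two consumption figures coincide, which is what allows the within-bracket problem to be treated as a linear budget problem with net wage (1-τ_j)w and non-labor income R_j.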

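BUDGET SET AND VIRTUAL INCOME WITH A 2-BRACKET NONLINEAR TAX SCHEDULE

[Figure: consumption c against hours worked h. The budget set has slope (1-τ1)w in the first bracket and slope (1-τ2)w in the second, with a kink at the bracket threshold. R1 and R2 are the virtual incomes of the two brackets, and u is an indifference curve.]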

ESTIMATION WITH NONLINEAR TAXES

Based on the model, consider an empirical specification such as

$$h = \beta_0 + \beta_1 \cdot (1-\tau)w + \beta_2 \cdot R + \beta_3 \cdot \mathbf{x} + \nu$$

which can be estimated by e.g. OLS.

Problems with this approach:

1. Identification:

(a) $w$ is correlated with $\nu$ [omitted variable bias]

(b) $\tau$ and $R$ are endogenous to $h$ and hence correlated with $\nu$ [reverse causality].

2. Bunching: the model predicts bunching at kink points, but we typically do not observe much bunching in the data.

THE CONVEXITY ASSUMPTION

So far, we have assumed that workers supply labor where the indifference curve is tangent to the budget set.

If this point of tangency occurred at negative hours, a non-negativity constraint implied a corner solution at zero hours (non-participation).

In this model, marginal changes in taxes lead to marginal changes in hours worked. Even participation responses are smooth in this way.

This analysis is fine as long as the budget set and preferences are convex.

THE CONVEX MODEL: INTENSIVE VS EXTENSIVE LABOR SUPPLY RESPONSES

[Figure, two frames: consumption c against hours worked h with non-labor income y. The first frame shows indifference curves uA and uB and the hours choices hA and hB on a budget line with slope w0. The second frame adds a budget line with slope w1.]

THE REAL WORLD IS NON-CONVEX

Fixed costs of working due to child care, commuting, etc.

Fixed costs of employing due to hiring costs, technological constraints, etc.

Means-tested transfers creating higher marginal tax rates at the bottom than further up the distribution.

Non-convexities bring discrete participation responses into play even with small changes in wages or taxes.


A MODEL WITH FIXED WORK COSTS: INTENSIVE VS EXTENSIVE LABOR SUPPLY RESPONSES

[Figure, two frames: consumption c against hours worked h with non-labor income y and labels for the fixed cost of working F. The first frame shows the budget line with slope w0. The second frame adds a budget line with slope w1 and marks a small intensive response and a large extensive response.]

BUDGET SET WITH MEANS-TESTED TRANSFERS

[Figure: consumption c against hours worked h, showing a nonconvex budget made up of a transfer phase-out segment, a low-tax bracket, and a high-tax bracket, together with an indifference curve.]

MODERN LABOR SUPPLY LITERATURE

• Early phase (around 1990s):

— Focus on the evaluation problem in previous work

— Search for identification takes center stage

— Focus on experimental evidence

• Later phase (2000s onwards):

— From survey data to large administrative datasets

— Sharper identification and graphical evidence

— Sufficient statistics approach as a bridge between reduced-form and structural estimation

SOURCES OF EXPERIMENTAL VARIATION

• Randomized experiments

— Truly exogenous variation

— Not very common in public finance

• Quasi-experiments [variation created by actual policy]

— Tax-transfer reforms and difference-in-differences

— Discontinuities in tax-transfer schedules and bunching approaches

— Regression Discontinuity Design

THE EVALUATION PROBLEM

Counterfactual question: what would person $i$'s labor supply be if taxes were different?

$h_i^T$ = person $i$'s labor supply if taxes are high ('treatment'), $h_i^C$ = person $i$'s labor supply if taxes are low ('control'). We don't observe both $h_i^T$ and $h_i^C$ at the same time. What we do observe is

$$E[h_i^T \mid T] - E[h_i^C \mid C] = \underbrace{E[h_i^T - h_i^C \mid T]}_{\text{treatment effect}} + \underbrace{\left(E[h_i^C \mid T] - E[h_i^C \mid C]\right)}_{\text{selection bias}}$$

The selection bias reflects systematic differences between treatments

and controls. Randomized experiments solve this problem.
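The decomposition of the observed gap into a treatment effect and a selection bias can be checked on a toy population in which both potential outcomes are known. This is only possible in a constructed example; every number below is hypothetical.

```python
from statistics import mean

# Hypothetical potential outcomes (weekly hours) for six people:
# (hours if treated, hours if untreated, actual group).
people = [
    (30, 35, "T"), (32, 38, "T"), (28, 33, "T"),
    (40, 40, "C"), (39, 41, "C"), (40, 42, "C"),
]

# Observable means: treated outcome for group T, untreated outcome for group C.
treated_mean_T = mean(p[0] for p in people if p[2] == "T")
untreated_mean_C = mean(p[1] for p in people if p[2] == "C")

# Counterfactual mean: untreated outcome for group T (never observed in real data).
untreated_mean_T = mean(p[1] for p in people if p[2] == "T")

observed_gap = treated_mean_T - untreated_mean_C
treatment_effect = treated_mean_T - untreated_mean_T
selection_bias = untreated_mean_T - untreated_mean_C
```

In this toy population the treated group would have worked fewer hours even without treatment, so the raw comparison overstates the size of the (negative) treatment effect.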

RANDOMIZED EXPERIMENTS

A sample of agents is divided randomly into treatments $T$ and controls $C$. Group $T$ is treated by a policy while group $C$ is not.

The treatment effect on outcome $h$ is measured by $E[h^T \mid T] - E[h^C \mid C]$.

Random assignment ⇒ $E[h^C \mid T] = E[h^C \mid C]$. Then, $E[h^T \mid T] - E[h^C \mid C] = E[h^T - h^C \mid T]$ = true treatment effect.

Applications:

1. NIT (negative income tax) experiments in the U.S. in the 60s and 70s. Many studies (e.g., Ashenfelter-Plant, JLE 1990).

2. Canadian Self Sufficiency Programme (SSP) in the 90s. Michalopoulos-Robins-Card (JPubE 2005).

QUASI-EXPERIMENTS & DIF-IN-DIF

Tax reforms create natural experiments that may resemble true experiments.

Compare a group affected by the reform ($T$) to a group not affected ($C$). Let $B$ and $A$ denote before and after the reform.

The effect on $h$ can be estimated by the difference-in-differences

$$\Delta h^T - \Delta h^C = [h^T_A - h^T_B] - [h^C_A - h^C_B]$$

A before-after estimator $\Delta h^T$ is biased by time effects. A group comparison $h^T_A - h^C_A$ is biased by group effects. The dif-in-dif removes common (group-invariant) time effects and time-invariant group effects.
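A minimal sketch of the estimator from four group means, assuming the means have already been computed. The function name and the participation rates are hypothetical and are not taken from any study cited here.

```python
def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Change in the treatment group minus change in the control group."""
    return (treat_after - treat_before) - (control_after - control_before)


# Hypothetical participation rates before and after a reform.
estimate = diff_in_diff(
    treat_before=0.46, treat_after=0.55,
    control_before=0.69, control_after=0.72,
)

# The two naive estimators that the dif-in-dif improves upon.
before_after = 0.55 - 0.46      # contaminated by common time effects
group_comparison = 0.55 - 0.72  # contaminated by fixed group differences
```

Here the before-after estimator attributes the whole 9-point rise to the reform, while the dif-in-dif nets out the 3-point rise also seen in the control group.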

REGRESSION FORMULATION OF DIF-IN-DIF

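Time dummy $A_t = 1$ if $t$ is after the reform. Treatment dummy $T_i = 1$ if individual $i$ belongs to the treatment group.

Regression framework:

$$h_{it} = \beta_0 + \beta_1 A_t + \beta_2 T_i + \beta_3 A_t \cdot T_i + \beta_4 \mathbf{x}_{it} + \nu_{it}$$

Without covariates $\mathbf{x}$, $\beta_3$ is the basic dif-in-dif estimate. Including covariates can improve the basic dif-in-dif.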
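The equivalence between the interaction coefficient and the dif-in-dif of group means can be verified numerically. The sketch below uses NumPy least squares on a tiny hypothetical dataset without covariates; the data and variable names are illustrative only.

```python
import numpy as np

# Hypothetical observations: (treated dummy, after dummy, outcome).
rows = [
    (0, 0, 10.0), (0, 0, 12.0),
    (0, 1, 13.0), (0, 1, 15.0),
    (1, 0, 20.0), (1, 0, 22.0),
    (1, 1, 28.0), (1, 1, 30.0),
]
treated = np.array([r[0] for r in rows], dtype=float)
after = np.array([r[1] for r in rows], dtype=float)
outcome = np.array([r[2] for r in rows], dtype=float)

# Design matrix: constant, time dummy, treatment dummy, interaction.
X = np.column_stack([np.ones_like(treated), after, treated, after * treated])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)

# Dif-in-dif computed directly from the four cell means.
def cell_mean(t, a):
    return outcome[(treated == t) & (after == a)].mean()

did_from_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))
```

Because the regression is saturated in the two dummies, the coefficient on the interaction term reproduces the dif-in-dif of cell means exactly; adding covariates breaks this exact equality but can reduce residual variance.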

PARALLEL TRENDS ASSUMPTION

Identification assumption of dif-in-dif:

All non-reform time effects must be the same for $T$ and $C$, i.e. if not for the reform the two groups would have been on parallel trends.

(When both $T$ and $C$ experience a policy change but of different size, dif-in-dif is still possible. But this requires an extra assumption.)

EISSA (1995)

Never published but a good example for teaching.

The Tax Reform Act of 1986 (TRA86) cut marginal tax rates at the top much more than further down the distribution.

The jointness of the US income tax implies that wives of rich husbands experienced larger tax cuts than wives of not-so-rich husbands.

Treatment/control assignment based on (husband's earnings + family non-labor income). $T$-group = wives at the 99th percentile, $C$-group = wives at the 75th percentile.

Uses a dif-in-dif comparing changes in labor supply for $T$ and $C$ from before the reform (1985) until after the reform (1989).

THE TAX REFORM ACT OF 1986

Source: Eissa (1995)

MARGINAL TAX RATES FOR TREATMENTS AND CONTROLS

Source: Eissa (1995)

DIFFERENCE-IN-DIFFERENCES: LABOR FORCE PARTICIPATION

DIFFERENCE-IN-DIFFERENCES: HOURS WORKED CONDITIONAL ON PARTICIPATION


LABOR SUPPLY ELASTICITIES

Relate dif-in-dif for labor supply to dif-in-dif for $(1-\tau)$:

$$\frac{\Delta h^T / h^T - \Delta h^C / h^C}{\Delta(1-\tau^T)/(1-\tau^T) - \Delta(1-\tau^C)/(1-\tau^C)}$$

Elasticity of labor force participation is 0.5. Elasticity of hours worked is 0.4. Total elasticity is 0.9.

Large elasticities but also large standard errors → effects not statistically significant.
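One way to read the ratio is as a dif-in-dif of percentage changes in labor supply divided by a dif-in-dif of percentage changes in the net-of-tax rate. The sketch below assumes percentage changes measured relative to the pre-reform level; the function name and inputs are hypothetical and are not Eissa's data.

```python
def did_elasticity(h_t, h_c, ntr_t, ntr_c):
    """Elasticity implied by relating two difference-in-differences.

    Each argument is a (before, after) pair: labor supply h and the
    net-of-tax rate (1 - tau) for the treatment (t) and control (c) groups.
    """
    def pct_change(pair):
        before, after = pair
        return (after - before) / before

    numerator = pct_change(h_t) - pct_change(h_c)
    denominator = pct_change(ntr_t) - pct_change(ntr_c)
    if denominator == 0:
        raise ValueError("no differential change in the net-of-tax rate")
    return numerator / denominator


# Hypothetical inputs: hours rise 10% for T and 2% for C, while the
# net-of-tax rate rises 20% for T and 4% for C.
elasticity = did_elasticity(
    h_t=(1000.0, 1100.0), h_c=(1000.0, 1020.0),
    ntr_t=(0.50, 0.60), ntr_c=(0.70, 0.728),
)
```

The guard on the denominator reflects the point that identification comes entirely from the differential tax change across the two groups.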

ISSUES WITH EISSA’S APPROACH

1. The parallel trends assumption:

(a) $T$-group starts from a lower level than the $C$-group ⇒ $C$-group less able to absorb an upward trend in female labor supply.

(b) Alternative story: trend towards "power couples" in late 80s.

2. $T$/$C$ assignment is not clean as TRA86 affected both groups → dif-in-dif requires homogeneous responsiveness for $T$ and $C$.

3. Identification strategy requires no cross-substitutability in spousal leisures, which is a very strong assumption.

NON-PARALLEL TRENDS?

1. Use data from periods prior and subsequent to the policy change:

(a) Do a placebo dif-in-dif for non-reform periods. If it is non-zero, then dif-in-dif is tenuous.

(b) Do a triple-dif = dif-in-dif of interest − placebo dif-in-dif.

(c) Plot longer time series of outcomes for $T$ and $C$ to see if "something different" happens around the reform.

2. Plotting the time-series from 1979-1998, Liebman and Saez (2006) show that Eissa's results are not robust.

TRANSFERS AND LABOR SUPPLY:IN-WORK BENEFITS

Transfers which are conditional on labor force participation.

They are typically means-tested [targeted to low earnings/assets] and categorical [targeted to single mothers].

Examples:

1. Earned Income Tax Credit (EITC) in the U.S. A cash transfer provided through the tax system.

2. Working Families Tax Credit (WFTC) in the U.K. is broadly similar to the EITC.

LABOR SUPPLY RESPONSES TO THE EITC

[Figure, four frames: consumption c against hours worked h, comparing the non-EITC budget with the EITC budget (phase-in, plateau, and phase-out regions) in the presence of a fixed cost of working.]

EISSA & LIEBMAN (1996)

Use a dif-in-dif to estimate the effect of the EITC expansion in TRA86

on labor force participation and hours worked for single mothers.

Assignment to treatments and controls:

1. T = single women with children, C = single women without children.

2. T = single women with children and low education, C = single women without children and low education.

EITC BEFORE AND AFTER TRA86

Source: Eissa and Liebman (1996)

DIFFERENCE-IN-DIFFERENCES: LABOR FORCE PARTICIPATION RATES OF SINGLE WOMEN


PARTICIPATION RATES OF SINGLE WOMEN 1981-1992

All Unmarried Females


PARTICIPATION RATES OF SINGLE WOMEN 1981-1992

Unmarried Females With Less Than High School Education


FINDINGS AND ISSUES

EL96 find large participation effects, no significant hours-of-work effects.

Using single women without children as a control for single women with children is problematic, especially with regard to the parallel trends assumption.

Robustness checks:

• Treatment-control assignment that exploits EITC means-testing: single mothers with low vs. high education → no significant effect.

• Longer time series plot of participation rates across treatment and control groups → only moderately convincing.

OVERVIEW OF EITC LABOR SUPPLY STUDIES

• Eissa-Liebman (1996): effect on single mothers using TRA86 as a quasi-experiment. Large participation effect, no hours-of-work effect.

• Meyer-Rosenbaum (2001): effect on single mothers using the 86, 90, and 93 reforms as quasi-experiments. More variation than EL96, but complications due to welfare reform. Large participation effect.

• Eissa-Hoynes (2004): effect on married couples with low earnings. Jointness of EITC discourages participation of married women.

⇒ consensus that extensive responses are more important than intensive responses. But this view might need revision in light of recent work.

CURRENT RESEARCH FRONTIER

Survey data are out; large administrative datasets are in.

Quasi-experimental work based on clear and sharp identification. Effects should be visible in graphs!

Combines reduced-form strategies (credible identification) with more structural approaches (link to theory and welfare analysis). Estimation of sufficient statistics for welfare analysis.

Distinction between observed and structural elasticities due to optimization frictions (adjustment costs, inattention, etc.).

RECENT QUASI-EXPERIMENTAL APPROACHES

Difference-in-differences using administrative data over long time periods allowing for graphical identification.

Bunching approaches that exploit discontinuous jumps in marginal tax rates (kinks) or discontinuous jumps in tax liability (notches).

Regression Discontinuity Design (RDD) and Regression Kink Design (RKD).

RECENT PAPERS

• Bunching approaches:

— Saez (AEJ-Pol 2010)

— Chetty, Friedman, Olsen, and Pistaferri (QJE 2011)

— Kleven and Waseem (QJE 2013)

— Chetty, Friedman, Saez (AER 2013)

— Kleven (ARE 2016)

• Difference-in-differences:

— Kleven, Landais, and Saez (AER 2013)

— Kleven, Landais, Saez, and Schultz (QJE 2014)

• Regression Discontinuity Design:

— Saez, Matsaganis, and Tsakloglou (QJE 2012)

BUNCHING AT KINK POINTS

In the old debate (Hausman vs. Heckman), bunching was seen as a

purely technical issue in the estimation of labor supply responses.

But bunching provides direct and compelling evidence of a labor supply

response to the marginal tax rate.

To detect bunching precisely, administrative data are crucial.

Saez (2010) shows that excess bunching around kinks can be used to identify the compensated elasticity of labor supply/earnings.

SAEZ (2010): KINK ANALYSIS

Utility $u\left(z - T(z),\ z/n\right)$ where $z$ is before-tax earnings and $n$ is ability.

Linear tax system $T(z) = t \cdot z$; smooth ability distribution ⇒ smooth earnings distribution with density $h_0(z)$.

Introduce a (small) kink by increasing the MTR from $t$ to $t + \Delta t$ at the cutoff $z^*$.

This kink produces bunching at $z^*$ by all individuals who had incomes in the interval $[z^*, z^* + \Delta z^*]$ prior to the kink. Individual H is the marginal buncher who reduces his income by $\Delta z^*$ in response to the kink.

INDIFFERENCE CURVES AND BUNCHING

Source: Saez (2010)

DENSITY DISTRIBUTIONS AND BUNCHING

Source: Saez (2010)

SAEZ (2010): KINK ANALYSIS

The elasticity of earnings with respect to the marginal net-of-tax rate can be inferred from the marginal buncher H.

For this individual, the income cutoff $z^*$ represents a tangency point between the indifference curve and the upper part of the budget set ⇒ response $\Delta z^*$ is like an interior response.

For a small kink, the earnings elasticity is given by

$$e = \frac{\Delta z^*/z^*}{\Delta t/(1-t)} = \frac{B/[h_0(z^*) \cdot z^*]}{\Delta t/(1-t)}$$

where $B = h_0(z^*)\,\Delta z^*$ is the total number of bunchers.
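A small sketch of the small-kink formula: the earnings response of the marginal buncher is backed out from the number of bunchers and the counterfactual density at the kink, then divided by the percentage change in the net-of-tax rate. The function name and all inputs are hypothetical.

```python
def kink_elasticity(bunchers, density_at_kink, z_star, t, dt):
    """Small-kink elasticity implied by excess bunching.

    bunchers        : total number of bunchers B
    density_at_kink : counterfactual density at the kink (people per unit of income)
    z_star          : earnings level of the kink
    t, dt           : marginal tax rate below the kink and its increase at the kink
    """
    dz_star = bunchers / density_at_kink  # from B = density * dz*
    return (dz_star / z_star) / (dt / (1.0 - t))


# Hypothetical inputs: 500 bunchers, density of 0.5 people per currency unit,
# kink at 100,000, marginal rate rising from 20% to 30%.
e_hat = kink_elasticity(bunchers=500, density_at_kink=0.5,
                        z_star=100_000, t=0.20, dt=0.10)

# The implied earnings response of the marginal buncher.
dz_marginal = 500 / 0.5
```

With these numbers the marginal buncher cuts earnings by 1% in response to a 12.5% fall in the net-of-tax rate. The approximation is meant for small kinks only.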

WHICH ELASTICITY CONCEPT?

A kink does not change the tax burden on inframarginal units of income below $z^*$

⇒ a small kink does not change the average tax rate over the range $[z^*, z^* + \Delta z^*]$

⇒ the response $\Delta z^*$ is driven by a change in the marginal tax rate with no change in the average tax rate (i.e., no income effect), and therefore $e$ is the compensated elasticity.

Very large kinks may have income effects, in which case $e$ will be a mix of the compensated and uncompensated elasticities.

ESTIMATING EXCESS BUNCHING USING AN EMPIRICAL DENSITY DISTRIBUTION

Source: Saez (2010)

EMPIRICAL IMPLEMENTATION

Counterfactual density $h_0(z^*)$ and bunching $B$ are estimated from the observed density, dropping observations right around the bunch.

Basic method by Saez (2010) assumes a uniform counterfactual density in a small interval around the kink.

Refinement by Chetty et al. (2011) allows for curvature by fitting a flexible polynomial to the observed distribution.

Standard errors can be obtained using the delta method or a bootstrap method.
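The polynomial approach can be sketched as follows. This is a simplified illustration rather than the procedure of any cited paper: it fits a polynomial to binned counts outside an excluded window around the kink, predicts the counterfactual inside the window, and reports excess mass relative to the average counterfactual height there. It omits refinements such as adjusting the counterfactual so that it integrates to the observed total. The function name, the window, the polynomial degree, and the data are assumptions.

```python
import numpy as np

def excess_mass(bin_centers, counts, kink, window, degree):
    """Return (normalized excess mass b, raw excess mass B) around a kink."""
    z = np.asarray(bin_centers, dtype=float) - kink
    counts = np.asarray(counts, dtype=float)
    excluded = np.abs(z) <= window  # bins right around the bunch

    # Fit the counterfactual on bins outside the excluded window only.
    coeffs = np.polyfit(z[~excluded], counts[~excluded], degree)
    counterfactual = np.polyval(coeffs, z)

    raw = float(np.sum(counts[excluded] - counterfactual[excluded]))
    normalized = raw / float(np.mean(counterfactual[excluded]))
    return normalized, raw


# Hypothetical binned data: a smooth downward-sloping density plus a spike at 0.
bins = np.arange(-50, 51)
smooth = 1000.0 - 2.0 * bins
spike = np.where(bins == 0, 300.0, 0.0) + np.where(np.abs(bins) == 1, 100.0, 0.0)
b, B = excess_mass(bins, smooth + spike, kink=0.0, window=2, degree=2)
```

A bootstrap standard error could be obtained by resampling the residuals of the polynomial fit and recomputing `b`; that step is left out to keep the sketch short.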

BUNCHING AND THE EITC

[Two figures. Source: Saez (2010)]

BUNCHING AND THE EITC: WAGE EARNERS VS. SELF-EMPLOYED

[Two figures. Source: Saez (2010)]

SAEZ (2010): RESULTS AND IMPLICATIONS

Clear bunching at the first EITC kink (for the self-employed) and the

first kink of the income tax. Both kinks are large and salient.

Otherwise, no evidence of bunching. Why?

Two potential explanations:

1. True structural elasticity is small.

2. True structural elasticity is not small, but the observed elasticity is

attenuated by optimization frictions.

Optimization frictions: (a) imperfect information, inattention, inertia, (b) adjustment costs and hours constraints.

CHETTY ET AL. (2011): BASIC IDEA

Micro elasticity estimates are small, macro elasticity estimates are large:

• Micro estimates are attenuated by frictions that prevent workers from re-optimizing in response to small tax changes in the short run.

• Macro estimates are not attenuated by frictions, but they suffer from identification problems.

For welfare analysis and optimal policy, we are interested in the long-run

structural elasticity that is not attenuated by frictions.

Micro approach (ensuring identification) to analyze the role of frictions

for elasticity estimates.

CHETTY ET AL. (2011): SETUP

• Firms offer jobs with fixed wage-hours packages

• Workers draw jobs from a distribution of wage-hours packages and must pay search costs to re-optimize

• Not all workers are located at their individual optimum (without search frictions)

• Two types of equilibrium: (i) competitive markets, (ii) collective bargaining

• In both cases, the distribution of hours offered by firms must equal the distribution of hours selected by workers after search is complete

CHETTY ET AL. (2011): BUNCHING

With hours constraints, there are two ways to locate at the kink:

1. Individual bunching: workers search for jobs at the kink

2. Aggregate bunching: workers draw jobs at the kink to begin with

• Such bunching may be generated by firms/unions tailoring job offers to aggregate worker preferences

• Signature of aggregate bunching: even workers who do not face the kink bunch there

CHETTY ET AL. (2011): PREDICTIONS

1. [Size] Larger kinks generate larger observed elasticities

• Larger kinks induce more workers to pay search costs and relocate to the kink

2. [Scope] Kinks that affect a larger number of workers generate larger observed elasticities

• Firms/unions tailor jobs to aggregate preferences → more aggregate bunching at common kinks

3. [Correlation] More aggregate bunching in sectors with greater individual bunching

CHETTY ET AL. (2011): DATA & VARIATION

Matched employer-employee administrative panel data for the full population of Denmark.

Very detailed tax return, labor market, and socio-demographic information.

Sample restriction: wage earners aged 15-70, fully tax liable, years 1994-2001 → about 20 million observations.

Quasi-experimental variation from large vs. small kinks in the Danish income tax [and using tax reforms that move bracket cutoffs over time].

[Figure: Marginal Tax Rates in Denmark in 2000. Marginal tax rate (%) against taxable income (1000s DKr); the two kinks are annotated log(NTR) = -11% and log(NTR) = -33%. Note: $1 ≈ 6 DKr.]

[Figure: Income Distribution for Wage Earners Around Top Kink (1994-2001). Frequency against taxable income relative to top bracket cutoff (1000s DKr). Excess mass (b) = 0.81, standard error = 0.05.]

[Figure (a): Married Women vs. Single Men. Frequency against taxable income relative to top bracket cutoff. Married women: excess mass (b) = 1.79, standard error = 0.10. Single men: excess mass (b) = 0.25, standard error = 0.04.]

[Figure (b): Teachers vs. Military. Teachers: excess mass (b) = 3.54, standard error = 0.25. Military: excess mass (b) = -0.12, standard error = 0.21.]

[Figures: Taxable Income Distributions, one panel per year from 1994 through 2001. Frequency for all wage earners and for married women against taxable income (1000s DKr).]

[Figure: Distribution of Wage Earnings. Frequency against income measure relative to top bracket cutoff (1000s DKr). Excess mass (b) = 0.68, standard error = 0.05.]

PREDICTION 1: SIZE OF TAX CHANGE

Larger kinks generate larger observed elasticities, because they induce more workers to pay search costs and relocate to the kink.

Compare observed elasticities from bunching at the large top tax kink

and the small middle tax kink.

Auxiliary tests to make sure that a difference in observed elasticities is

indeed driven by differences in the size of the tax change rather than

heterogeneity in elasticities by income level or tax rate level.

[Figure: Middle Tax Kink: All Wage Earners, Taxable Income Distribution. Frequency against taxable income relative to middle bracket cutoff. Predicted excess mass = 0.16, standard error = 0.01.]

[Figure: Middle Tax Kink: All Wage Earners, Wage Earnings Distribution. Frequency against wage earnings relative to middle bracket cutoff. Excess mass (b) = -0.06, standard error = 0.03. Predicted excess mass = 0.14, standard error = 0.01.]

[Figure: Middle Tax Kink: Married Women, Taxable Income Distribution. Frequency against taxable income relative to middle bracket cutoff. Predicted excess mass = 0.35, standard error = 0.02.]

[Figure: Switchers from Top Tax to Middle Tax. Frequency (middle tax, year t+2) and frequency (top tax, year t) against taxable income relative to bracket cutoff.]

[Figure: Observed Elasticity vs. Size of Tax Change. Observed elasticities against log change in net-of-tax rate.]

PREDICTION 2: SCOPE OF TAX CHANGE

Kinks that affect a larger group of workers generate larger observed

elasticities as firms/unions tailor jobs to aggregate worker preferences.

Variation in the size of the group affected by a kink:

Variation in deductions and non-wage income across workers creates

variation in the effective location of the top bracket cutoff [i.e., the

wage income needed to locate at the kink].

Consider two different effective locations of the top tax kink:

• Statutory top tax kink [faced by 60% of population]

• “Pension kink” [faced by 2.5% of population]

[Figure: Distribution of Net Deductions. Frequency against net deduction (DKr), with annotations for individuals making pension contributions and individuals with non-wage income.]

[Figure: Distribution of Net Deductions Given Deductions > DKr 20,000. Frequency against net deduction (DKr).]

[Figure: Wage Earnings Distribution: Teachers. Frequency against wage earnings relative to statutory kink (1000s DKr).]

[Figure: Wage Earnings Distribution: Teachers with Deductions > DKr 20,000. Annotation: "This group starts paying top tax here".]

[Figure: Wage Earnings Around Pension Kink: Deductions > 20,000. Frequency against wage earnings relative to pension kink (1000s DKr).]

[Figure: Wage Earnings Around Pension Kink: Deductions Between 7,500 and 25,000. Excess mass (b) = -0.01, standard error = 0.15.]

[Figure: Wage Earnings Around Statutory Kink: Deductions Between 7,500 and 25,000. Excess mass (b) = 0.56, standard error = 0.10.]

[Figure: Observed Elasticities vs. Scope of Tax Kink. Observed elasticity from bunching at top kink against fraction of group with |net deductions| < 7500.]

SELF-EMPLOYED INDIVIDUALS

Thus far, we have looked only at wage earners

The self-employed do not face search frictions or hours constraints

• They can easily adjust reported earnings either by changing labor supply or by avoidance/evasion

The self-employed serve as a “placebo test” for the findings

• Three predictions should not hold for the self-employed

• Size and scope of tax change should not matter

[Figure: Self-Employed: Taxable Income Distribution around Top Tax Cutoff. Frequency against taxable income relative to the cutoff.]

[Figure: Self-Employed: Taxable Income Distribution around Middle Tax Cutoff. Frequency against taxable income relative to the cutoff.]

[Figure: Self-Employment Income Around Statutory Kink: Deductions > 20,000. Frequency against self-employment income relative to statutory top tax cutoff (1000s DKr).]

[Figure: Self-Employed: Observed Elasticities vs. Scope of Tax Changes. Observed elasticity from bunching at top kink against fraction of group with |net deductions| < 7500.]

CHETTY ET AL. (2011): CONCLUSIONS

• Frictions due to search costs and hours constraints attenuate short-run behavioral responses to taxation

— Size and scope of tax variation matter for the observed elasticity

• Some notes and caveats:

— For wage earners, the observed elasticity is tiny even at the large top kink [size effect on elasticity is extremely small].

— For the self-employed, the small-kink elasticity is also smaller than the large-kink elasticity [size effect even when search costs and hours constraints shouldn't matter].

— Top and middle kinks differ in other dimensions than size.

— A broader view on frictions (inattention, inertia, etc.).

KLEVEN & WASEEM (2013): NOTCHES

Previous papers: observed elasticity ≠ structural elasticity due to frictions, but the true structural parameter is not identified.

This paper develops a method to identify the amount of friction and the structural elasticity using notches.

Notches = discontinuities in the choice sets of individuals or firms.

Notches are conceptually different from kinks = discontinuities in the

slope of choice sets.

Application to income tax notches in Pakistan.

KLEVEN & WASEEM (2013): NOTCHES

Consider a notch where income tax liability increases discretely at an

earnings cutoff.

Prediction 1: excess bunching just below cutoff, hole above cutoff.

Prediction 2 (no frictions): strictly dominated region just above the cutoff should be empty in a frictionless world under any preferences.

Methodology:

1. Bunching identifies the observed elasticity

2. Density mass in dominated region identifies the amount of friction

3. Combination of (1)-(2) identifies the structural elasticity
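To make the dominated region concrete, the sketch below assumes one specific notch design: a proportional tax whose average rate jumps from t to t + Δt on all income once earnings exceed the cutoff z*. Under that assumption, an earner just above z* takes home less than an earner exactly at z*, up to the earnings level where after-tax incomes are equal again. The function names and the numbers are hypothetical.

```python
def after_tax_income(z, z_star, t, dt):
    """After-tax income when the average rate jumps from t to t + dt above z_star."""
    rate = t if z <= z_star else t + dt
    return (1.0 - rate) * z


def dominated_range_width(z_star, t, dt):
    """Width of the earnings range above z_star that yields lower after-tax
    income than locating exactly at z_star."""
    return z_star * dt / (1.0 - t - dt)


# Hypothetical notch: cutoff at 500 (thousand), rate rising from 7.5% to 10%.
z_star, t, dt = 500.0, 0.075, 0.025
width = dominated_range_width(z_star, t, dt)

income_at_cutoff = after_tax_income(z_star, z_star, t, dt)
income_at_upper_bound = after_tax_income(z_star + width, z_star, t, dt)
income_inside = after_tax_income(z_star + width / 2.0, z_star, t, dt)
```

Anyone observed strictly inside this range could raise both consumption and leisure by moving to the cutoff, which is why the density found there is informative about frictions rather than preferences.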

KLEVEN & WASEEM (2013): NOTCHES VS KINKS

Smallness of bunching at kinks has two possible interpretations:

1. Small structural elasticities and small frictions

2. Large structural elasticities and large frictions

Cases 1 & 2 are observationally equivalent for kinks (same bunching),

but not for notches (same bunching, different hole).

Notches separately identify observed and structural elasticities by using two moments (bunching and hole), whereas kinks provide only one moment (bunching) and identify only the observed elasticity.

PAKISTAN INCOME TAX SCHEDULES

[Figure: average tax rate (%) against taxable income in PKR 000s, for wage earners (2006-07) and self-employed individuals (2006-09).]

BEHAVIORAL RESPONSES TO A TAX NOTCH: BASELINE MODEL

[Figure: consumption z - T(z) against earnings z. The budget line has slope 1-t below the cutoff z* and slope 1-t-Δt above it, with a notch of size Δt·z* at z*. Indifference curves are drawn for individual L and individual H; marked earnings levels are z*, z*+ΔzD, zI, and z*+Δz*.]

EFFECT OF NOTCH ON EARNINGS DENSITY: BASELINE MODEL

[Figure: pre-notch and post-notch densities against earnings z, with bunching at z* and a density hole above it; marked earnings levels are z*, zI, and z*+Δz*.]

EFFECT OF NOTCH ON EARNINGS DENSITY: BASELINE MODEL SIMPLIFIED

[Figure: pre-notch and post-notch densities against earnings, with bunching at z*, the dominated region, and z*+Δz* marked.]

EFFECT OF NOTCH ON EARNINGS DENSITY: HETEROGENEITY IN ELASTICITIES

[Figure: as above, with post-notch density remaining above the dominated region where "e is too low for bunching".]

EFFECT OF NOTCH ON EARNINGS DENSITY: OPTIMIZATION FRICTIONS

[Figure: as above, with post-notch density also remaining in the dominated region where "frictions are too high for bunching".]

EFFECT OF NOTCH ON EARNINGS DENSITY: MEASURING FRICTIONS

[Figure: as above, with the dominated region split into share a* (remaining mass) and share 1-a*.]

OBSERVED VS. STRUCTURAL ELASTICITY

Assuming quasi-linear utility with constant elasticity $e$, the indifference condition for the marginal buncher implies

$$\frac{\Delta z^*}{z^*} = f\left(e, \frac{\Delta t}{1-t}\right); \qquad \lim_{e \to 0} \Delta z^* = \Delta z^D$$

Observed elasticity based on $\Delta z^* = \frac{B}{h_0(z^*)}$.

Structural elasticity based on $\Delta z^* = \frac{B}{(1-a^*)\,h_0(z^*)}$.

$\frac{B}{1-a^*}$ = bunching scaled by the hole in the dominated range = amount of bunching if individuals overcame adjustment costs.
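The two earnings responses differ only in how bunching is scaled. The sketch below computes both from the bunching mass B, the counterfactual density at the notch, and the share a* of the counterfactual mass that stays in the dominated range. The function name and the numbers are hypothetical and are not estimates from the paper.

```python
def earnings_response(bunching, density_at_notch, friction_share=0.0):
    """Earnings response of the marginal buncher implied by bunching.

    With friction_share = 0 this is the observed response B / h0(z*).
    With friction_share = a* it is the structural response
    B / ((1 - a*) * h0(z*)), i.e. bunching rescaled to what it would be
    if non-responders in the dominated range had also moved.
    """
    if not 0.0 <= friction_share < 1.0:
        raise ValueError("friction_share must lie in [0, 1)")
    return bunching / ((1.0 - friction_share) * density_at_notch)


# Hypothetical inputs: 2,000 bunchers, counterfactual density of 100 taxpayers
# per thousand of income, and half of the dominated-range mass not responding.
observed_dz = earnings_response(2000.0, 100.0)
structural_dz = earnings_response(2000.0, 100.0, friction_share=0.5)
```

The ratio of the two responses equals 1/(1 - a*), so a high friction share implies a structural response far above the observed one.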

ESTIMATING THE COUNTERFACTUAL DENSITY USING THE EMPIRICAL DENSITY

[Figure, two frames: empirical density h(z) and counterfactual density h0(z) against earnings z, with the excluded range between zL and zU around z*, the dominated range, bunching B, missing mass M = B, and the share a* of the counterfactual that remains in the dominated range (frictions).]

BUNCHING, HOLE & FRICTIONS: SELF-EMPLOYED (500K NOTCH)

[Figure: number of taxpayers against taxable income in PKR 000s, with zL, zU, and the dominated range marked. b = 5.52 (0.38), a* = 0.51 (0.02), zU = 540.0 (9.9).]

BUNCHING, HOLE & FRICTIONS: WAGE EARNERS (600K NOTCH)

[Figure: number of taxpayers against taxable income in PKR 000s, with zL, zU, and the dominated range marked. b = 0.80 (0.09), a* = 0.91 (0.01), zU = 633.0 (16.4).]

KLEVEN-WASEEM (2013): EMPIRICAL RESULTS

1. Observed bunching at notches is large and sharp.

2. Optimization frictions are also very large (majority of taxpayers in

dominated ranges are unresponsive).

3. Combination of 1 & 2 implies that the frictionless earnings response

to notches is extremely large.

4. But the structural elasticity driving this response is nevertheless

modest.

5. This highlights the inefficiency of notches: by creating extremely

strong price distortions, they induce large behavioral responses even

when structural elasticities are small.