View
38
Download
0
Category
Preview:
Citation preview
Econ 2450B, Lecture 1: Basics of Welfare Estimation
Nathaniel Hendren
Harvard and NBER
Spring, 2015
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 1 / 145
1 Basics of Welfare Estimation
2 Optimal Tax: Early Literature on Public Goods and Commodity Taxes
3 Optimal Income Taxation: Mirrlees Model
4 Optimal Tax in Partial Equilibrium: A General Formula
5 Optimal Taxation in General Equilibrium
6 Inverse Optimum Program
7 Commodity versus Income Taxation: Atkinson-Stiglitz
8 Education & the Hamiltonian Approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 1 / 145
Goal of Public Finance
Study the interaction between the government and the economyImplicitly, always a normative component
Should we increase taxes?Should we spend more on education, roads, etc.?Should we change the mix of taxes (e.g. commodity vs. property vs.capital vs. income taxes)?
Requires notion of “should”Use economics to formalize notions of welfare
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 2 / 145
Welfare Measurement
Measurement of welfare:Consumer surplus (Marshall 1890)Compensating variation and Equivalent variation (Hicks 1939, 1940,1941; Kaldor 1939)Takeaway of this literature
Income affects make definitions difficultInterpersonal comparisons are very difficult (Boadway 1974)But, envelope theorem is useful for analyzing small welfare changes
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 3 / 145
Welfare Impact of Policy Changes
Common approach in PF: consider small policy changesHow do we think about the welfare impacts of small policy changes?
Marginal excess burden (Harberger 1964, Feldstein 1999, Kleven andKreiner 2005)Marginal willingness to pay (Mayshar 1990, Slemrod and Yitzhaki1996, 2001, Kleven and Kreiner 2006, Hendren 2014)
Set up a general economic model to define these (Following Hendren2014)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 4 / 145
Setup
Individual i chooses:Goods xi and labor supply li
Could be vectors of goods/labor supply activities
Government chooses:Publicly provided goods and services, G , at marginal cost cTaxes on goods and labor supply: τx
i and τli
Transfers TiNon-linear taxes?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 5 / 145
Utility function
Individuals have utility function
ui (x , l ,G)
Rules out externalitiesProduction: Goods are produced linearly with one unit of labor supply
(1+ τxi ) xi ≤
(1− τl
i
)li + Ti
Rules out:SpilloversGE effects (see pset)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 6 / 145
Maximization
Indirect utility function
Vi(
τxi , τl
i ,G)= max ui (x , l ,G)
s.t.(1+ τx
i ) xi ≤(1− τl
i
)li + Ti
Lagrange multiplier λi is marginal utility of income
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 7 / 145
Social Welfare
Social welfare function
W({
τxi , τl
i ,Gi}
i
)= ∑
iψiVi
(τx
i , τli ,Gi
)Bergson (1938)-Samuelson (1947)Does ψ depend on things other than utility? (Saez and Stantcheva2013)Does it matter that we assume weights are linear?
Local changes...
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 8 / 145
Policy Changes
Define a “Policy Path” to trace out changes to government policy,P (θ):For any θ ∈ (−ε, ε)
P (θ) =
{{τl
ij (θ)}
j,{
τxij (θ)
}j , Ti (θ) , Gi (θ)
}i ,
Two assumptions (Draw Picture):1 θ = 0 is status quo:{{
τlij (0)
},{
τxij (0)
}, Ti (0) , Gi (0)
}i ,={{
τlij
},{
τxij},Ti ,Gi
}i
2 P (θ) is continuously differentiable in θ
d τxij
dθ , d τlij
dθ , dTidθ , and dGij
dθ exist and are continuous in θ
Should the government follow the policy path and increase θ?Need to measure how welfare changes with θFirst, start with the positive questions...
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 9 / 145
Positive Analysis: Agent’s Behavior and GovernmentBudget
Agents optimally choose xi and li facing policy P (θ)
xi (θ) = {xij (θ)}j and li (θ) ={
lij (θ)}
jThese are “potential outcomes” in world P (θ)
Net government resources towards individual i ,
ti (θ) =JG
∑j=1
cGj Gij (θ) + Ti (θ)−
JX
∑j=1
τxij (θ) xij (θ)−
JL
∑j=1
τlij (θ) lij (θ)
Budget neutrality would be ∑id tidθ = 0 ∀θ
dtidθ captures distributional impact
Behavioral response affects budgetddθ
JX∑j=1
τxij (θ) xij (θ) +
JL∑j=1
τlij (θ) lij (θ)
=
JX∑j
d τxij
dθxij +
JL∑j
d τlij
dθlij
︸ ︷︷ ︸
Mechanical Impacton Govt Revenue
+
JX∑j
τxij
dxijdθ
+JL∑j
τlij
dlijdθ
︸ ︷︷ ︸
Behavioral Impacton Govt Revenue
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 10 / 145
Normative Analysis: Marginal Willingness to Pay for Policy
Normative questions:WTP: How much are people willing to pay to move along the policypath?MEB: How much additional revenue could the government get if thepolicy change is implemented but utility is held constant usingindividual specific lump-sum transfers
Person i ’s marginal willingness to pay to move along the policy pathdVidθ |θ=0
λi
Money metric utility measureEquivalent to marginal EV and marginal CV
Why?Will define MEB later
Why is this different from EV/CV? (think about whose dollars we aremeasuring)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 11 / 145
Characterization of Marginal Willingness to Pay for Policy
The envelope theorem (Draw Picture) implies:dVidθ |θ=0
λi=
JG
∑j=1
∂ui∂Gij
λi
dGijdθ
+dTidθ−
JX
∑j
d τxij
dθxij −
JL
∑j
d τlij
dθlij
Now, substitute:
dTidθ
=dtidθ−
JG
∑j=1
cGj
dGijdθ
+ddθ
(JX
∑j=1
τxij (θ) xij (θ) +
JL
∑j=1
τlij (θ) lij (θ)
)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 12 / 145
Characterization of MWTP
Behavioral responses matter in keeping track of net resourcesdVidθ |θ=0
λi=
dtidθ︸︷︷︸
Net Resources
+JG
∑j=1
∂ui∂Gij
λi− cG
j
dGijdθ︸ ︷︷ ︸
Public Spending/Mkt Failure
+
(JX
∑j
τxij
dxijdθ
+JL
∑j
τlij
dlijdθ
)︸ ︷︷ ︸
Behavioral Impacton Govt Revenue
where the RHS is evaluated at θ = 0.Behavioral responses matter to the extent to which individuals imposeresource costs for which they don’t pay
If government taxation is only wedge between social and privatecosts, a single causal effect is sufficient
Impact on government revenue is sufficient for all behavioral responses
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 13 / 145
Marginal Excess Burden (MEB)
Can define MEB/MDWL in this frameworkLet v denote a vector of pre-specified utilities (e.g. status quo <–>“equivalent variation” MEB in Auerbach and Hines 2002)
Define an augmented policy path:
Pv =
{{τl
ij (θ)}
j,{
τxij (θ)
}j , Ti (θ) + Ci (θ; v) , Gi (θ)
}i
where Ci (θ; v) holds utilities constant at v.
MEB is defined asMEBvi
i =dtv
idθ|θ=0
Measures additional revenue government could obtain if it implementsthe policy but then holds people’s utility constant usingindividual-specific lump-sum transfersDepends on compensated elasticities (by definition)Conceptually deals with the budget constraintBut, MWTP is generally positive for budget-negative policies!
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 14 / 145
Motivating a Particular MVPF Measure
Many real-world policies are not budget neutralCommon to “adjust for the MCPF”
There are a lot of different definitions (see Ballard and Fullerton,1992; Dahlby, 2008)One definition is particularly useful: no need to decompose any causaleffects into income and substitution effects
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 15 / 145
Defining the MVPF
Suppose P1 (θ) and P2 (θ) are two non-budget neutral policies
Marginal cost to govt of∫
id tP1
idθ di and
∫i
d tP2i
dθ di
Marginal social welfare of∫
i ηidV P1
idθ |θ=0
λidi and
∫i ηi
dV P2i
dθ |θ=0λi
di
Define MVPF as in Mayshar (1990), Dahlby (1998), Slemrod andYitzhaki (1996, 2001), Kleven and Kreiner (2006)Benefit-cost ratio for each policy
MVPF iP =
∫i
ηiηi
dV Pi
dθ |θ=0λi
di∫i
d tPi
dθ di=
”BENEFIT”
”COST”
measured in units of i income
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 16 / 145
Properties
MVPF is policy-specific – can be computed for any non-budgetneutral policy
Does not require decompositions into income and substitution effects
Comparisons of MVPF correspond to comparisons of social welfareWelfare impact of budget-neutral policy with more P2 and less P1
dWdθ
ηi= MVPF i
P2−MVPF i
P1
Allocating money to high MVPF from low MVPF policies increasessocial welfare
But need to keep units i the same
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 17 / 145
Comparisons Using Okun’s Bucket
In general, MVPF requires weighting by social marginal utilities ofincomeFormula can be simplified if ηi is constant within the set ofbeneficiaries
Beneficiaries of P1 have equal social marginal utility of income η1MVPF 1
P1is marginal benefit to beneficiaries, normalized by govt cost
Beneficiaries of P2 have equal social marginal utility of income η2MVPF 2
P2is marginal benefit to beneficiaries, normalized by govt cost
Increasing spending on P1 and decrease spending on P2 increaseswelfare iff
η1η2≥
MVPF 2P2
MVPF 1P1
Differs from MEB comparisons across policies which requires modifiedsocial welfare weights that include income effects (Diamond andMirrlees 1971)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 18 / 145
Simplified Formulas
Simplification #1: Assume beneficiaries have same ηiCompute MVPF in units of beneficiaries’ income
Simplification #2: Suppose policy either effects market or non-markettransfers
[Market Goods/Transfers] P (θ) increases mechanicaltransfers/subsidies by $θ
MVPF =1
1|I |∫
i∈IdtP
idθ di
1|I |∫
i∈IdtP
idθ di = 1+ FE is cost of providing $1 mechanical income
[Non-Market Goods] P (θ) increases public goods/services by $θ
MVPF =
∂u∂Gλ
1|I |∫
i∈IdtP
idθ di
Multiply by WTP for G relative to income,∂u∂Gλ
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 19 / 145
Applications
Use existing causal effects to calculate MVPF for various policychanges
Top marginal tax rate increaseMany studies summarized in Saez et al (2012)
EITC GenerosityMany studies summarized in Hotz and Scholz (2003), Chetty et al(2013)
Food StampsHoynes and Schanzenbach (2012)
Job TrainingRCT of Job Training Partnership Act (Bloom et al 1997)
Section 8 Housing VouchersLotteried access to Section 8 in Illinois (Jacob and Ludwig 2012)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 20 / 145
Top Tax Rate Increases
Large literature studying causal impact of top tax rate increases /decreases
Saez, Slemrod, and Giertz (2012) provide reviewMany estimates of causal effect of changes to top income tax rateTax-weighted taxable income elasticity
Suggests 25-50% of mechanical revenue lost (lots ofdisagreement/uncertainty!)
Fiscal cost is $0.50-$0.75 for $1 in transfer
Suggests MVPF of $1.33-$2
MVPF =1
1− .25 = 1.33
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 21 / 145
EITC Expansions
Large literature studying causal impact of EITC expansions (Hotz andScholz 2003, Chetty et al 2013)
Intensive + extensive calculations suggest fiscal cost of EITC is ~14%higher because of labor supply impactsFiscal cost is $1.14 for $1 in mechanical EITC benefitsSuggests MVPF of $0.88
MVPF =1
1+ .14 = 0.88
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 22 / 145
Food Stamps
Hoynes and Schanzenbach (2012) use variation across counties inintroduction of food stamp program (1960-70s)Use data from 1968-78 PSIDCompare labor supply over counties across time
yict = α + δFSPct + ηc + λt + µst
+Xit β + σCB60c ∗ t + γTRANSFERSct
Standard DD with controls for 1960 census controls * time trends
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 23 / 145
Results
Find large but imprecise decrease in labor earnings of $2,943 (can’treject zero)
Assume 20% marginal tax rate -> $588.60 impact on governmentbudget
Average household transfer: $1,153.25Total cost is $1,153.25+$588.60=$1,741.85.MVPF:
11|I |∫
i∈IdtP
idθ di
=1, 153.251741.85 = 0.66
Food stamps are “in-kind”:∂u∂Gλ 6= 1
May be that∂u∂Gλ < 1 because goods are in kind
Smeeding (1982) estimates 0.97; Moffitt (1989) estimates ~1Whitmore (2002) estimates 0.80 for marginal/distorted recipients
Assuming food stamps valued as cash, MVPF is 0.66Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 24 / 145
Concerns with Food Stamp Results
Causal effects very impreciseAlso, causal effect in 1970 = causal effect now?How do we value non-market goods?
Generally requires structural modeling assumptions... (Whitmore 2002)How much do people distort their consumptionHow much does this affect their WTP (small distortion satisfiesenvelope theorem)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 25 / 145
Job Training
Job Training Partnership Act of 1982 provided job training services tolow income youth and adultsBloom et al (1997) report results from RCT (I focus on adult womenimpact)
Increased tax collection of $236Reduction in welfare benefits (AFDC) $235
$471 net increase in government budget from behavioral responses
Marginal cost of providing the training is $1,381Cost net of fiscal externality is $910
MVPF is 1.52 if program costs are valued at its costs
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 26 / 145
Job Training
No estimates of∂u∂Gλ for the program
Bloom et al (1997) implicitly assume earnings is fully valuedThis is far too common!When is this OK?
Earnings increase of $1,683 for marginal cost of $1,381 ->∂u∂Gλ = 1.22
Suggests MVPF of 1.85 if increase was entirely productivity
But could be MVPF = 0 if no one valued it
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 27 / 145
Section 8 Housing Vouchers
Section 8 is largest low-income housing program in USProvides vouchers to low-income households (see MTO experiment,etc.)
Jacob and Ludwig (2012) exploit excess applications in IllinoisAllocated via lotteryEstimate significant impact on labor supply and welfare take-up
Earnings decrease implies fiscal externality of $129 per voucherWelfare programs increase sum to $432 (mostly medicaid)But vouchers are a lot of money ($8,400/yr)Voucher cost $1.05 for every $1 of vouchers
MVPF = 0.95∂u∂Gλ
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 28 / 145
Section 8 Housing Vouchers
Reeder (1985) suggests $1 vouchers valued at∂u∂Gλ = 0.83
People consume too much house!
Suggests MVPF of 0.79 for housing vouchers
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 29 / 145
Summary
Policy∂u∂Gλ
11|I |∫
id tidθ di
MVPF
Top Tax Rate 1 1.33 - 2 1.33 - 2EITC Expansion 1 0.88 0.88Food Stamps 0.8 - 1 0.66 0.53 - 0.66Job Training 0 - 1.22 1.52 0 - 1.85
Housing Vouchers 0.83 0.95 0.79
Taking MVPF TopTax = 1.33, increasing EITC and top tax ratedesireable iff
ηRich
ηPoor ≤.881.33 = 0.66
$0.66 to a poor person or $1 to a rich person?Question: What about MEB comparisons?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 30 / 145
Takeaways
Need causal effect of policy in questionIs this what we used?ATE/ATT/ITT?
Also need WTP for non-market goodsThis is the hard part!
Aggregations across people require social welfare weightsHow to identify?Surveys (Saez Stantcheva 2013)Inverse Optimum (Later...)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 31 / 145
1 Basics of Welfare Estimation
2 Optimal Tax: Early Literature on Public Goods and Commodity Taxes
3 Optimal Income Taxation: Mirrlees Model
4 Optimal Tax in Partial Equilibrium: A General Formula
5 Optimal Taxation in General Equilibrium
6 Inverse Optimum Program
7 Commodity versus Income Taxation: Atkinson-Stiglitz
8 Education & the Hamiltonian Approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 31 / 145
Optimal Policy vs. MVPF
How does the welfare framework we have covered so far relate tooptimal policy formulas?
We have been asking “what marginal policy changes from the statusquo are worth are most valuable (highest MVPF)?”Optimal policy formulas ask “In an ideal world, what would the optimalpolicy look like (given our SWF)?”
MVPF is evaluated around the status quoHence, requires local causal effects defined around the statusquo–observed elasticitiesWhat is the marginal WTP for a policy change?What is the additional deadweight loss from a policy change?(well-defined?)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 32 / 145
Optimal Policy vs. MVPF
Optimal policy is thought experiment: “Given our SWF, What is theoptimal __...”:
Tax rate/subsidyPublic goods provision“...given our constraints
Status quo is irrelevantBut when behavioral responses are important, must be evaluatedaround the optimal policy, not the status quo
Early optimal policy:Samuelson (1954) -> public goodsRamsey (1927) -> taxes
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 33 / 145
Optimal Public Goods (Samuelson 1954)
Before turning to taxes, consider public goodsPure public goods:
Non-rival: My consumption doesn’t prevent your consumption (NoCongestion)Non-excludable: Provider can’t prevent consumption by those whodon’t pay
Private Goods benefit one individual i at a time:Total consumed units of X = ∑i xi
Public Goods benefit several individuals simultaneouslyEach person consumes G but G may be the sum of all contributions byothers:
G = ∑j
Gj
where Gj is the contribution by individual j towards the public goodNote this differs slightly from the notation above where Gi is thepublicly provided good consumed by individual i .
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 34 / 145
Public Goods
Why might the free market under-provide public goods?Free-ridingPublic goods create positive externalities, individuals underprovide
Two main research questions:1. What is the optimal level of PGs?2. How are (and can) PGs provided in practice?
Provision problemsFree-rider models (BBV, in section)Optimal “warm glow” offsets the free-riding externalityWill government crowd-out private provision? Do we care?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 35 / 145
Optimal Public Goods (Samuelson 1954)
First Welfare Theorem: Any market equilibrium is Pareto OptimalWith public goods, this failsSamuelson (1954) derives condition for a Pareto Optimum (mayrequire transfers)
Consider First Welfare Theorem setup:Individuals indexed by i, two goods, X and GUtility functions U i (xi ,Gi ), standard budget constraintc is the dollar cost of producing G. (Normalize price of x to 1 sopGpx
= c)
Condition for private optimalityUG (xi ,Gi )Ux (xi ,Gi )
= c ⇐⇒ MRSi = MRT ∀i
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 36 / 145
Optimal Public Goods: Failure of FWT
Now, suppose G is publicSo each person purchases Gi , but values G = ∑i GiUtility is U(xi ,G) = U(xi ,Gi + ∑j 6=i Gj )
Condition for private optimality
Still UG (xi ,G)Ux (xi ,G)
= c ⇐⇒ MRSi = MRT∀i
But social value of additional spending on G is ∑ ηiUG (xi ,G)Ux (xi ,G)
Private choice of G ignores external benefit!Private equilibrium is Pareto Inferior: there is an allocation that canmake everyone better
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 37 / 145
Optimal Public Goods
Now assume G is only provided by the governmentWorry about crowd out?
Conder a policy path θ that just provides a Public Good funded byLump-Sum Transfers (like Sameulson’s planner would)
dGdθ = 1, at marginal cost c = 1dtidθ = 0 (budget neutral)Each person pays dTi
dθ
What is the condition under which we should have more G?Optimality Condition:
dWdθ
= 0
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 38 / 145
Optimal Public Goods (Samuelson 1954)
Case 1: ηi = ηj = η. Then,
dWdθ
= ∑i
(∂ui∂Gλi− dTi
dθ
)
which equals zero if and only if the sum of MWTP for G out of ownincome equals the marginal cost ($1):
∑i
∂ui∂Gλi
= ∑i
dTidθ
= 1
Is this justified by the Pareto principle?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 39 / 145
Optimal Public Goods (Samuelson 1954)
Case 2: Pareto implementation. Use Lindahl pricing: dTidθ = −
∂ui∂Gλi.
Each person pays their WTP. Then,
dVidθ
λi=
∂ui∂Gλi
+dTidθ
= 0
which is budget feasible if and only if the taxes collected are greaterthan the cost:
∑i
∂ui∂Gλi
= ∑i
dTidθ≥ 1
Problems with Lindahl price? Need each person to pay their (privatelyknown) WTP.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 40 / 145
Stiglitz-Dasgupta-Atkinson-Stern MCPF
Suppose we need to finance with linear tax on labor.Consider representative agent model
NOTE: As we will see, it’s a bad idea to assume away lump-sum taxesin representative agent models...this makes the results largely uselessand even mis-leading.
Tax on labor, τ
We have dVdG = 0 if and only if:
∂u∂Gλ
= 1+ τ
(− dl
dG
)The fiscal externality, FE = τ
(− dl
dG)is known as the “Marginal Cost
of Public Funds” by Atkinson and Stern (1974) and Stiglitz andDasgupta (1971).
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 41 / 145
Stiglitz-Dasgupta-Atkinson-Stern MCPF
MCPF=τ(− dl
dG)
Intuition: the fiscal externality from a budget neutral policy thatincreases spending on G by $1 financed by an increase in τ.
Two components:dldθ
=dldτ
dτ
dθ+
dldG
dGdθ
Here, the MCPF is defined as a component of the welfare impact of abudget-neutral policy
Differs from “MVPF” in Hendren (2013) or “MCPF” in Kleven andKreiner (2006), who define MCPF/MVPF as the welfare impact perdollar spent on a non-budget neutral policy.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 42 / 145
MVPF, MEB, MCPF...Yikes!
A lot of different definitions of welfare measures / costs of taxation /etc:MVPF: Marginal welfare impact per dollar of government spending
Defined for non-budget neutral policiesCan be compared across policies to form hypothetical budget neutralpolicies (MVPF P1 −MVPF P2 is the welfare impact of a budget neutralpolicy that spends money on P1 financed with money from P2).Depends on causal effects of policy on government budget (toaccurately measure costs)
MEB: Marginal additional revenue the govt can collect if itimplements the policy but holds all utilities constant usingindividual-specific lump-sum taxation
Depends on compensated elasticitiesCan’t be aggregated to social weflare without adding back in theincome effects (Diamond and Mirrlees 1971).
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 43 / 145
MVPF, MEB, MCPF...Yikes!
MCPF of Atkinson-Stern-Stiglitz-DasguptaAtkinson and Stern (1974)Dasgupta and Stiglitz (1971)Fiscal externality component of broader budget-neutral policy thatsimultaneously increases G financed by τ.
MCPF of Harberger, Pigou, BrowningDefines MCPF as fiscal externality of MEB experiment of spendingmore on G financed by τ but holding utilities constant usingindividual-specific lump-sum taxation.
Depends on compensated response.Implicitly compares to world with lump-sum taxation
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 44 / 145
Nathan’s Commentary...
This literature is fairly complicated – terms are often not used in aconsistent mannerMy suggestion (others may disagree):
Just construct the benefit-cost ratio (MVPF) and compare acrosspolicies.It’s simpler, allows for easier conceptual comparisons across policies,and aggregates across people using social marginal utilites of income(Okun’s bucket)
If you have a non-budget neutral policy and you’ve constructedpeople’s MWTP for that policy, you don’t need to “adjust for themarginal cost of public funds”; rather, you need to think of yourestimate as a marginal value of public funds that can be compared toother policies.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 45 / 145
Pigouvian Externalities
Now suppose choosing x causes an externality, E (x), where x is theaggregate choice of x in the populationTo make things simple, retain the representative agent framework butassume choosing x doesn’t incorporate the effect on x and thus theexternality.Utility
u (x , l ,G ,E (x))
MWTP:
dVdθ
λ=
dtdθ
+
(∂u∂Gλ− c)
dGdθ
+ τx dxdθ
+ τl dldθ
+dEdθ
∂u∂Eλ
where∂u∂Eλ is the MWTP for E and dE
dθ is the causal (notcompensated) impact on E.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 46 / 145
Pigouvian Tax
Assume budget neutrality, no public goods, and no tax on labor (doesthis matter?)Note that
dEdθ
=dxdθ
dEdx
sodVdθ
λ=
(τx +
dEdx
∂u∂Eλ
)dxdθ
Pigouvian tax:
τPIGOU = −dEdx
∂u∂Eλ
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 47 / 145
MCPF and Double Dividend
With public goods
dVdθ
λ=
(∂u∂Gλ− c)
dGdθ
+
(τx +
dEdx
∂u∂Eλ
)dxdθ
Aside: Stiglitz-Dasgupta-Atkinson-Stern definition of MCPF is justthe FEDouble dividend: taxing x yields a ’cheaper’ MCPF because it alsodeals with externalityYour exercise: Show this is true iff τx < τPIGOU . What ifτx > τPIGOU
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 48 / 145
Optimal Taxation in Ramsey (1927)
Ramsey (1927): How should commodities be taxed to raise revenue,R > 0.
Modeled by Diamond and Mirrlees (1971)
Key result: Tax-weighted Hicksian price derivatives are equated acrossgoods
“Inverse elasticity rule”: tax goods with smaller compensatedbehavioral responses
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 49 / 145
Setup
Representative Agent (drop i subscripts).Commodities, xk , indexed by kGovernment imposes taxes on commodities, τk .Necessary condition for optimality
dVPdθ|θ=0 = 0
for all feasible policy paths P.Optimal tax would be lump-sum of size R
Assumed to not exist
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 50 / 145
Commodity Tax Variation
Consider policy P (θ) that changes commodity taxes (e.g. lowers taxon good 1 and raises tax on good 2)Budget neutral: dt
dθ = 0No change in public goodsSo, optimality condition only involves behavioral response:
∑k
τkdxkdθ|θ=0 = 0
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 51 / 145
Hicksian Elasticity
Diamond and Mirrlees (1971): At the optimum, expand thebehavioral response using the Hicksian demands, xh
k ,
dxkdθ
=∂xh
k∂τ1
dτ1dθ
+∂xh
k∂τ2
dτ2dθ
Additional term, ∂xhk
∂udVpdθ , but this vanishes at the optimum.
Optimality condition is given by
∑k
τk∂xh
k∂τ1
dτ1dθ
= ∑k
τk∂xh
k∂τ2
(−dτ2
dθ
)
Tax-weighted Hicksian responses are equated across the tax ratesInverse elasticity rule
What are the needed elasticities?Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 52 / 145
Inverse Elasticity Rule
Assume cross elasticities are zero:
BC = x1dτ1dθ
+ τ1dx1dθ
+ x2dτ2dθ
+ τ2dx2dθ
= 0
sox1(1+ τ1
x1∂xh
1∂τ1
)dτ1dθ
= x2(1+ τ2
x2∂xh
2∂τ2
)(−dτ2
dθ
)And optimality implies
x1(
τ1x1
∂xh1
∂τ1
)dτ1dθ
= x2(
τ2x2
∂xh2
∂τ2
)(−dτ2
dθ
)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 53 / 145
Inverse Elasticity Rule
So (τ1x1
∂xh1
∂τ1
)=
(τ2x2
∂xh2
∂τ2
)= κ
Translating to price (1+tau) instead of tax (tau) elasticities:
τj1+ τj
εhj,(1+τj )
= κ
Orτj
1+ τj=
κ
εhj,(1+τj )
which is the “inverse elasticity rule”.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 54 / 145
Key Result: Inverse Elasticity Rule
Main result of Ramsey model: Inverse elasticity ruleKey Assumptions:
Representative agentNo lump sum taxation
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 55 / 145
Optimal Taxation of Production
Diamond and Mirrlees (1971) also consider the issue of productionefficiency.Commodities, xk , indexed by k, transformed into one another(produced) by firms and governmentProducer prices pk , Consumer prices qk
Tax is wedge τk = qk − pk
Consumer i solves max ui (x) s.t. ∑ qkxk ≤ 0Defines consumer (final) demand for each commodity x i
k (q)and indirect utility Vi (q) = u(xi (q))
Note: Consumers are the ones endowed with the initial commoditysupplyEndowments allow them to exchange, consumers are on budgetconstraint
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 56 / 145
Firm side
Price-taking firms j transform commoditiesProduction possibilites represented by input output function f j(y) = 0
for example, y1 = y .32 ∗ y .73 ⇐⇒ y1 − (−y .32 ) ∗ (−y .73 ) = 0Can turn y2and y3 into y1 (or vice versa, depending of domain)Negative arguments are inputs, positives are outputs
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 57 / 145
Firm side: CRS Production
Assumption: constant returns to scaleThen each firm can produce “as much” or “as little” as desired infixed proportions
Together, many CRS firms define an aggregate production functionf (y) = 0No profits for any firm (otherwise infinite production) in equilibriump·yj = 0 must hold in equilibrium, and thus p · y = p · (∑ yj) =0
Under CRS, behavior of many optimizing firms same as one aggregatefirm
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 58 / 145
Firm side: Firm Objective
Objective: Choose point on frontier to maximize output prices - inputprices
max p·y s.t. f (y) = 0
Optimality condition: ∂f∂yk
= pk ⇐⇒ MRT =∂f
∂yk∂f
∂yk ′= pk
p′k
Why can we ignore lagrange multiplier on f (y) = 0 condition?Because we can normalize the units of f to be in terms of one of thecommodities...see Diamond-Mirrlees (1971).
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 59 / 145
Govt
D&M think of Gov’t as a planner with a distributive objective but:Can’t just pick point on PPFMust deal with consumers through market place using uniform pricesUses:
a.) linear commodity taxes to set prices andb.) public production to adjust quantities above and beyond whatprivate sector does given prices
Public production follows PPF given by g(z) ≤ 0
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 60 / 145
Objective
What is the objective here?redistribution–different than Ramsey, since no revenue requirement
Why would commodity taxes help with no lump sum transfers?differential wealth levels are due to endowment differencesCommodity taxes target:
Different tastesValue of endowment
But commodity taxes cause DWL
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 61 / 145
Objective
Solve
maxq,p,z ∑
iW (Vi (q)) s.t. ∑
ix i
k(q) = yk(p)+ zk , f (y) = 0, and g(z) = 0
Lagrangian
maxq,p,z ∑
iW (Vi (q))+∑
kλk(yk(p)+ zk −∑
ix i
k(q))+γf f (y(p))+γgg(z)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 62 / 145
Objective
Production-side and consumer-side variables are additively separable
maxq,p,z ∑
iW (Vi (q))−∑
kλk ∑
ix i
k (q)︸ ︷︷ ︸consumption
+∑k
λk (yk (p) + zk ) + γf f (y(p))+γg g(z)︸ ︷︷ ︸production
Note that FOC for producer prices and government production dependon W only through the shadow value of an endowment unit of k.Also, choice of p directly implements y, so we can choose y directly
maxq,y ,z ∑
iW (Vi (q))−∑
kλk ∑
ix i
k (q)︸ ︷︷ ︸consumption
+∑k
λk (yk + zk ) + γf f (y)+γg g(z)︸ ︷︷ ︸production
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 63 / 145
Result
[FOC yk ]λk = γf ∂f∂yk
[FOC gk ]λk = γg ∂g∂zk
Taking ratio, for any social welfare objective, it must be the case that:
∂g∂zk∂g
∂zk ′
=∂f∂yk∂f
∂yk ′
=pkpk ′
The government’s decision to intervene in the economy isindependent of the objective. MRTs are always equalized, and theonly wedge is between consumer and producer prices. Production-sideand consumer-side variables are additively separable
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 64 / 145
1 Basics of Welfare Estimation
2 Optimal Tax: Early Literature on Public Goods and Commodity Taxes
3 Optimal Income Taxation: Mirrlees Model
4 Optimal Tax in Partial Equilibrium: A General Formula
5 Optimal Taxation in General Equilibrium
6 Inverse Optimum Program
7 Commodity versus Income Taxation: Atkinson-Stiglitz
8 Education & the Hamiltonian Approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 64 / 145
Mirrlees Model
Mirrlees (1971)Formalizes tradeoff between equity and efficiency as an informationproblem for the government
Equity and the size of the pie are necessarily related...
Saez (2001) shows Mirrlees model can be interpreted throughempirical elasticities
Spawned huge empirical literature on optimal taxation
Key difference: Redistribution as rationale for taxing income insteadof lump-sum taxation
Would like individual-specific lump-sum taxes (efficient!)But, hard to make individual-specific taxes (information constraints)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 65 / 145
Setup
Individuals choose effort/income to maximize utilityGovernment doesn’t observe your latent productivityCan only tax you based on observed income, not latent productivity /effortTaxing income -> affects effort
Efficiency / equity tradeoff
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 66 / 145
Setup
Individuals consume c and earn yIndividuals are heterogeneous and have utility u (c, y ; θ)
e.g. θ indexes cost of effortPopular specifications:
u (c, y ; θ) = u(c, y
θ
)= c − v
( yθ
)where y
θ is “effective effort”High θ implies higher productive (can earn y with lower utility cost).
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 67 / 145
Government’s Problem
Government maximizes Bergson-Samuelson SWF:
max∫
v (θ)ψ (θ) dµ (θ)
where v (θ) is individual θ’s utility and ψ (θ) are a set of welfareweights.If government could observe θ, second welfare theorem applies: Taxpeople based on θ
Individual-specific lump sum taxes for each θ
Key insight of Mirrlees: government can’t observe θ
Can observe income, yCan choose tax function T (y) so that individuals have budgetconstraint
Individuals earn y are taxed T (y):c ≤ y − T (y)
Partial equilibrium assumption: choice of y by type θ doesn’t affectearnings capabilities of type θ′
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 68 / 145
Individual’s Problem
Individuals choose y subject to tax schedule T (y)Substituting c = y − T (y), individuals choose earnings to maximizeutility:
maxy
u (y − T (y) , y ; θ)
If u is concave, yields FOC:
uc(1− T ′ (y (θ))
)= uy
or uyuc
= 1− T ′ (y (θ))
MRS equated to wagesTaxes distort earningsy decisionsLeads to choice y (θ) and utility v (θ)
v (θ) = u (y (θ)− T (y (θ)) , y (θ) ; θ)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 69 / 145
Government’s Constrained Problem
Government chooses function T (y) subject to the constraint thatindividuals choose y (θ) when facing T (y)
max∫
v (θ)ψ (θ) dµ (θ)
s.t.
v (θ) = maxy u (y − T (y) , y ; θ) [IC ]
y (θ) = argmax u (y − T (y) , y ; θ)
and ∫T (y (θ)) dµ (θ) ≤ 0 [RC ]
Note we assume the government does not face a participationconstraint
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 70 / 145
Solution: Two Approaches
General solution is difficult (Mirrlees 1971)Two general approaches: Calculus of Variations vs. Hamiltonian
Hamiltonian works well if θ has a nice structure (e.g. uni-dimensional)Calculus of variations useful for characterizing necessary conditions foroptimum, but sufficiency is difficult
More closely aligned to empirical quantitiesLoosely, calculate MVPF for variations in tax schedule and set welfareimpact equal to zero
Saez (2001) takes an empirical approach varying T ′ (y) functionSaez (2001) provides very nice formulasStart with a simpler version: optimal top tax rate
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 71 / 145
Calculus of Variations Approach: Top Tax Rate Changes
Suppose τ is tax rate on income above y . What is the optimal τ?(Saez 2001)
Simpler question: what is the revenue maximizing tax rate?Assume no social value of income on the rich
Total revenue from tax on incomes above y :
R =∫
y≥yτ (y − y) f (y) dy
where y (and it’s pdf, f (y)) is an endogenous response to the taxrate, τMarginal revenue from increasing τ:
dRdτ
= (E [y |y ≥ y ]− y) (1− F (y))︸ ︷︷ ︸Mechanical
+τd
dτ
∫y≥y
(y − y) f (y) dy︸ ︷︷ ︸Behavioral
where F (y) is the c.d.f. of the income distributionThe revenue-maximizing tax rate is:
τ∗ =(E [y |y ≥ y ]− y) (1− F (y))− d
dτ
∫y≥y (y − y) f (y) dy
Where are the IC constraints?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 72 / 145
Behavioral Response
Under some assumptions can write the response using “structural”elasticities:
ddτ
∫y≥y
(y − y) f (y) dy =∫
y≥y
dydτ
f (y) dy
e.g. Participation margin? Discrete shifts in labor supply?
What is dydτ?
Use utility theory (Saez 2001; Diamond and Saez 2011)Uncompensated Elasticity
ζu =1− τ
y∂y
∂ (1− τ)
Assume no income effects (see Saez 2001 for broader derivation)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 73 / 145
Graphically (Diamond and Saez 2011)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 74 / 145
Behavioral ResponseThen,
dydτ
= − ∂y∂ (1− τ)
dτ
dydτ
= −ζu y1− τ
So
−∫
y≥y
dydτ
f (y) dy =∫
y≥yζu y
1− τf (y) dy
=ζu
1− τ
∫y≥y
yf (y) dy
=ζu
1− τ(1− F (y))E [y |y ≥ y ]
where ζu is the income-weighted elasticity and ym = E [y |y ≥ y ]Note the elasticities are defined around the optimum
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 75 / 145
Top Tax Formula
Top tax rate satisfies:
τ =1− τ
ζuE [y |y ≥ y ]− y
E [y |y ≥ y ]
orτ =
11+ aζu
wherea =
E [y |y ≥ y ]− yy
is the pareto parameter
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 76 / 145
Pareto Tails
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 77 / 145
Top Tax Rate
If ζu = 0.3 and a = 1.5, then top tax rate is
τ =1
1+ 1.5 ∗ 0.3 = 0.69
If ζu = 0.5, thenτ = 57%
If ζu = 0.1, thenτ = 87%
What is the size of the behavioral response?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 78 / 145
Top Taxable Income Elasticity
Much literature on this...See Saez, Slemrod, and Giertz (2012 JEL review)DD with tax reforms reforms:
Reagan tax cutsFeldstein (1995)
Clinton tax increases (OBRA 93)Goolsbee 2000Giertz 2009
Pooled variationGruber and Saez 2002 (JpubEc)
Kink analysisSaez 2002
Difficulty: Top incomes are NOISY...precision is difficult.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 79 / 145
Classic Study: Feldstein 1995 JPE
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 80 / 145
Shortcomings
Feldstein 1995 is a seminal article, highlighting that people mayrespond to taxation and this affects how we should think about taxdesignBut, there are a few shortcomings:
57 observations in the top income bracket!Where are the standard errors?
Is panel data better than a repeated cross section?Mean reversion issues?Conditions on tax group in 1985...is that the right counterfactual?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 81 / 145
Gruber and Saez (2002, JPubEc)
Gruber and Saez 2002 fix the mean reversion issueUse variation in tax rates from 1979-1990
Consider model where income yit = y0it (1− τit)
ε where ε is thetaxable income elasticityTaking logs:
log(
yit+3yit
)= α + εlog
(1− τit+31− τit
)+ ηit
There are two reasons people face higher taxes:They choose to earn moreFor a given earnings, policy yields different tax ratesHence, 1− τit+3 is correlated with ηit because τit+3 is a function ofyit+3! Need an instrument to isolate policy changes!
IV using τPit+3 = T ′t+3 (yit): Instrument for marginal tax rate in t + 3
if the only thing that changed was the tax policy (not y !).Yields 0.3-0.6 estimate of ETI.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 82 / 145
Top Tax Rate
Goodsbee 2002 JPEUses panel data on corporate compensationShows taxable income declines after OBRA93
Short run elasticity > 1 (consistent with Feldstein 1995)
But it’s all short termOne-year elasticity 0-.4Changes in timing of compensation around the introduction of OBRA
Entirely driven by changes in the exercising of stock options aroundthe time of OBRA!
Increased 1992 taxable income, decreased 1993 taxable income
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 83 / 145
Are behavioral responses to top tax rates real vsavoidance?
Large literature on tax avoidance / tax evasionFeldstein (1999)
OK if y is taxable income
Chetty (2009)What if evasion has externalities / internalities?
Evasion effects the optimal designSandmo 1981Slemrod and Yitzhaki (2002 Handbook)Dina Pomerantz (2013 JMP) VAT...
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 84 / 145
1 Basics of Welfare Estimation
2 Optimal Tax: Early Literature on Public Goods and Commodity Taxes
3 Optimal Income Taxation: Mirrlees Model
4 Optimal Tax in Partial Equilibrium: A General Formula
5 Optimal Taxation in General Equilibrium
6 Inverse Optimum Program
7 Commodity versus Income Taxation: Atkinson-Stiglitz
8 Education & the Hamiltonian Approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 84 / 145
General Solution
We ask: what are the properties of T (y) that maximize thegovernment’s objective?If T (y) is optimal, then should be indifferent to small changes inT (y)Calculus of Variations approach: Vary T (y).Consider calculus of variations in T (y)
Define T (y ; y∗, ε, η) by
T (y ; y∗, ε, η) =
{T (y) if y 6∈
(y∗ − ε
2 , y∗ + ε
2)
T (y)− η if y ∈(y∗ − ε
2 , y∗ + ε
2)
T provides η additional resources to an ε-region of individuals earningbetween y∗ − ε/2 and y∗ + ε/2.Given T , individual of type θ chooses y (y∗, ε, η; θ) that maximizesutility
Intuitively, some people who earn near y∗ might move away from y∗because the government is taxing them more (or move towards y∗ ifη < 0)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 85 / 145
Causal effects vs. IC constraints
Define choice of income, y , in environment with ε and η by
y (θ; y ∗, ε, η) = argmax u(y − T (y ; y ∗, ε, η) , y ; θ
)Why might an individual change choice of y?Why do we care about these changes?
Impact on government revenue!Do individuals internalize this impact on government revenue? NO.
Where are the IC constraints?Embedded in y function - we substitute the maximization program intothe resource constraint and assume observed behavior maximizes the ICconstraintTherefore, we need causal effects of policy changes instead of a fullsolution to the programming problem.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 86 / 145
Government Revenue Impact
Given choices y (y ∗, ε, η; θ), government revenue is given by
q (y ∗, ε, η) =1
Pr{
y (θ) ∈[y ∗ − ε
2 , y ∗ +ε2]} ∫
θT (y (θ; y ∗, ε, η) ; y ∗, ε, η) dµ (θ)
(normalized by the number of mechanical beneficiaries).Note q (y∗, 0, η) = q (y∗, ε, 0) = 0 for all ε and η
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 87 / 145
Social Welfare
Recall social welfare:
W =∫
v (θ)ψ (θ) dµ (θ)
Define social welfare W (y ∗, ε, η) to be social welfare underT (y ; y ∗, ε, η)
Let g (θ) denote the social marginal utility of income for type θ:
g (θ) =dWdyθ
= λ (θ)ψ (θ)
where λ is the individual’s marginal utility of incomeSo, g is the impact on social welfare of giving type θ an additional $1.Ratios of g are Okun’s bucket (Okun (1975))
g (θ1)
g (θ2)= 2
implies indifferent to $1 to type θ1 relative to $2 to type θ2
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 88 / 145
Welfare Impact
Welfare impact of increasing ηCan use envelope theorem:
Marginal welfare impact given by mechanical loss in income weightedby social marginal utility of income:
dWdη|η=0 =
∫g (θ) 1
{y ∈
(y∗ − ε
2 , y∗ +
ε
2
)}dµ (θ)
Note this assumes partial equilibrium (no welfare impact of changingtaxes on those not directly affected)
Marginal cost given by
Pr{
y (θ) ∈[y ∗ − ε
2 , y∗ +
ε
2
]} dq (y ∗, ε, η)
dη|η=0
Welfare impact per dollar of government budget:1
Pr{y(θ)∈[y∗− ε2 ,y∗+
ε2 ]}∫
g (θ) 1{
y ∈(y ∗ − ε
2 , y∗ + ε
2)}
dµ (θ)
dq(y∗,ε,η)dη |η=0
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 89 / 145
Optimality Condition
Calculus of Variations implies welfare impact per dollar should beequated for all y ∗ and all ε > 0!
At the optimum, the government is indifferent to small variations inT (y)Otherwise wouldn’t be at an optimum!
General optimality condition
Let q (y) = limε→0(
dq(y ,ε,η)dη |η=0
)= 1+ FE (y)
Marginal revenue gained from imposing $1 of welfare loss onindividuals earning y
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 90 / 145
Optimality as equating MVPFs
Consider policy of giving money to y1:
MVPFy1 =E [g (θ) |y (θ) = y1]
q (y1)=
E [g (θ) |y (θ) = y1]1+ FE (y1)
Necessary condition for optimality: Indifferent to giving more moneyto y1 vs. y2.:
MVPFy1 = MVPFy2
orE [g (θ) |y (θ) = y1]
1+ FE (y1)=
E [g (θ) |y (θ) = y2]1+ FE (y2)
Must be indifferent to budget-neutral manipulations of resourcesRelative cost given by impact of small variations in tax schedule
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 91 / 145
Income and Substitution Effects
Can write FE (y) using behavioral elasticities...For linear tax rates,
E [g (θ) |y (θ) = y∗] = 1− τ (y∗)1− τ (y∗)
ζ (y∗)︸ ︷︷ ︸Income Effect
+τ (y)
1− τ (y)ddy |y=y∗
[yf (y)f (y∗)
εc (y)]
︸ ︷︷ ︸Substitution Effect
Optimal income tax schedule depends on behavioral distortions
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 92 / 145
Simplification: Diamond/Mirrlees/Saez formula
Assume:No income effects: ζ = 0Constant compensated elasticity εc
g ′ is constant conditional on y
Theng (y)− 1
εc =τ (y)
1− τ (y)ddy |y=y∗
[yf (y)f (y ∗)
]︸ ︷︷ ︸
Substitution EffectSome intuition: In regions where the optimal tax is linear:
1− τ
τ=
ε
1− g (y)α
where α = −1− dlog(f (y))dlog(y) is the elasticity of the income distribution
(constant if Pareto)Tax is higher when elasticity is lower, g is lower, alpha is lower.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 93 / 145
Laffer Tax
Suppose g ′ = 0Then,
τ
1− τ=
1εα
orτεα = 1− τ
orτ =
11+ εα
which is Diamond-Saez 2011 formula for revenue maximizing top taxrate
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 94 / 145
General Mirrlees Formula
Optimal Tax solves (τ linear)
g (y)− 1ε
f (y) = ddy |y=y∗
[τ (y)
1− τ (y)yf (y)
]Fundamental Thm of Calculus:[
limy→∞
τ (y)1− τ (y)
yf (y)f (y ∗)
]− τ (y)
1− τ (y)yf (y) =
∫ ∞
y
g (y)− 1ε
f (y) dy
Generally, limy→∞τ(y)
1−τ(y)yf (y)f (y∗) = 0 (e.g. if f is pareto, f ∝ y−α−1)
Soτ (y)
1− τ (y)yf (y) =
∫ ∞
y
1− g (y)ε
f (y) dy
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 95 / 145
General Mirrlees Formula (Diamond 1998)
Re-writing:τ (y)
1− τ (y)αε = 1− G (y)
whereG (y) = 1
1− F (y)
∫ ∞
yg (y) f (y) dy
is the average social marginal utilities on those earning more than y
α (y) = yf (y)1− F (y)
is the local Pareto parameter of the income distribution
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 96 / 145
General Mirrlees Formula (Diamond 1998)
Re-writing:τ (y)
1− τ (y)=
1− G (y)αε
Title of Diamond (1998): “Optimal Income Taxation: An Examplewith a U-Shaped Pattern of Optimal Marginal Tax Rates”
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 97 / 145
Optimal Top Tax Rate is Zero?
Wisdom prior to Saez (2001): Mirrlees Optimal Tax program implies“no tax at the top”.
Celebrated result of Mirrlees (1971)
Suppose income distribution is bounded, y ≤ y .Should the marginal tax rate be positive at y?
Suppose it is.Lower the tax rate slightly at y + ε –> individuals may increaseearnings to y
Increases tax revenue as long as marginal tax rate on earnings y toy + ε is positive
No individuals above y -> no one will decrease their earningsTherefore, optimality requires the tax on earnings = 0 from y to y + ε
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 98 / 145
Optimal Top Tax Rate is Zero?
Top tax rate = 0 only applies to the person who earns the maximumBut, also applies for some distributions:
Suppose α (y)→ ∞.Recall α (y) = −1− dlog(f )
dlog(y) = −1−yf ′(y)f (y)
If f ′/f is constant, then α→ −∞, so that τ → 0e.g. y exponential λe−λy -> λ2e−λy , so that
y f ′ (y)f (y)
= yλ
which blows up.
But if f is Pareto, then α is constantTurns out Pareto is good approximation (Saez 2001) -> optimal taxrate is not zero at the top.Need tax return data to know this!
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 99 / 145
Participation Effects
Implicitly, we assumed behavioral responses were continuousLocal income and substitution effects govern cost from behavioralresponess
But what if people enter the labor force?Saez 2002Kleven and Kreiner 2006
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 100 / 145
1 Basics of Welfare Estimation
2 Optimal Tax: Early Literature on Public Goods and Commodity Taxes
3 Optimal Income Taxation: Mirrlees Model
4 Optimal Tax in Partial Equilibrium: A General Formula
5 Optimal Taxation in General Equilibrium
6 Inverse Optimum Program
7 Commodity versus Income Taxation: Atkinson-Stiglitz
8 Education & the Hamiltonian Approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 100 / 145
Mirrlees Model assumes no general equilibrium
[this section walks through proof of optimal tax with GE effects a la RSQJE 2013 roy model...]
Mirrlees model assumes no general equilibrium effects of tax changesIndividuals θ1’s choice of y doesn’t affect the tradeoffs faced by θ2Own-earnings impact of θ1 earning y is same as GDP impact of θ1increasing y
Assume there exists a utility function
u (c, y , θ)
where y = f (l ; θ)
But, what if choice of labor effort of type 1 affects earnings of type 2?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 101 / 145
Two Sources of GE Bias
“Trickle down” GE effectsRich people work harder -> poor people make more money?Build businesses, “job creators”, etc.Rothschild and Scheuer (QJE 2013)
“zero-sum” GE effectsRich people earn more money -> they steal it from poor peopleRent-seeking / investment bankingRothschild and Scheuer (Econometrica 2013)
The presence of such GE effects changes the optimal income taxschedule
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 102 / 145
Trickle Down/Up Economics
Suppose earnings of y given by
y (θ) = f ({l (θ)}θ)
where l is effort of type θ
Allows for arbitrary inter-connectedness of production/earnings
Suppose govt icnreases transfers to those earning y by η dollarsHas direct effect on those earning yBehavioral responses can directly affect earnings of those with earningsaway from y
How do we value these impacts?Envelope Theorem
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 103 / 145
Envelope Theorem
Utilityu (c, l , θ)
Earnings of type θy (θ) = w (θ) l (θ)
Suppose we change the top marginal income tax rate, τtop
Impact on wages of type θ is dwdτtop
Welfare impact on type θ not directly affected by the transfer is givenby
dVdτtop =
dwdτtop l = dy (θ)
dτtop |l
earnings impact holding behavioral responses constant
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 104 / 145
Welfare Impact
Combining with the impact of the transfer, we have an aggregatewelfare impact from the tax policy giving transfers near y as:
dWdτtop |η=0 = g ′top +
∫g ′ (y) dy
dτtop |ldy︸ ︷︷ ︸Externalities
where g ′top is the average social marginal utility of income on the rich(continue to assume g ′top = 0)Trickle down: ∫
g ′ (y) dydτtop |ldy > 0
Giving money to rich people increases incomes of poor:But could go the other way (e.g. fixed job slots):∫
g ′ (y) dydτtop |ldy < 0
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 105 / 145
Optimality condition with externalities
Externalities impact direct welfare impact of tax changesBut, now behavioral responses are also more complicated...
Maybe the poor offset the mechanical impact on earnings by increasedlabor supply?Effects fiscal cost of tax change
Total revenue from increasing τ:
dR = E [y |y ≥ y ]− y︸ ︷︷ ︸Mechanical
+ τtop dE [y |y ≥ y ]dτtop︸ ︷︷ ︸
Behavioral
+∫
y<yT ′ (y) dy
dτtop︸ ︷︷ ︸Trickle-down Rev Impact
assuming continuously differentiable responses
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 106 / 145
Optimal Tax Rate
Normalize g ′ so that g ′ is 1 for a policy that provides income acrossthe distribution (dW=dR+dWOptimality implies
τ∗ =E [y |y ≥ y ]− y− dE [y |y≥y ]
dτ︸ ︷︷ ︸Original Formula
+
∫y<y T ′ (y) dy
dτtop
− dE [y |y≥y ]dτ︸ ︷︷ ︸
Fiscal Externality
−∫
g ′ (y) dydτtop |ldy
− dE [y |y≥y ]dτ︸ ︷︷ ︸
Direct Externality
Suppose fiscal externality small:Trickle down externalities -> lower top marginal tax rateFixed jobs -> higher top marginal tax rate
Could the fiscal externality overwhelm?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 107 / 145
1 Basics of Welfare Estimation
2 Optimal Tax: Early Literature on Public Goods and Commodity Taxes
3 Optimal Income Taxation: Mirrlees Model
4 Optimal Tax in Partial Equilibrium: A General Formula
5 Optimal Taxation in General Equilibrium
6 Inverse Optimum Program
7 Commodity versus Income Taxation: Atkinson-Stiglitz
8 Education & the Hamiltonian Approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 107 / 145
Inverse Optimum Program
Bourguignon and Spadaro (2012), Werning (2007), Jacobs, Jongen,and Zoutman (2013; 2014), Lockwood and Weinzierl (2014), Hendren(2014)Traditional “optimal tax”: Start with primitives, derive optimal taxratesInverse optimum approach: Start with status quo, derive implicitweights, g (y).
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 108 / 145
Inverse Optimum Program: A Crazy Approach?
You may think this is crazy (why would I like Harry Reid’spreferences!?). But, it is not! There are three key reasons to like thistype of an approach even if you don’t think govts are optimizing:
Can be used to search for Pareto improvements of the tax schedule(Werning 2007, Bourguignon and Spadaro 2012).For other policies besides the tax schedule, can use Kaldor-Hicksefficiency criterion, as opposed to subjective social preference
Weights correspond to the “correct” Kaldor Hicks corrections for thecost of transfers (Hendren 2014)
Although the inverse optimum approach is generically “not invertible”and the social welfare interpretation breaks down in multi-dimensionalheterogeneity, the measure of these costs as “marginal costs” for theKaldor-Hicks corrections remains quite general
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 109 / 145
Bourguignon and Spadaro
“Tax-Benefit Revealed Social Preferences”We can identify G (y) from:
τ (y)1− τ (y)
=1− G (y)
α (y) ε
where α (y) = yf (y)1−F (y) and G (y) = E [g (y) |y ≥ y ]. Re-writing:
G (y) = 1+ α (y) εT ′ (y)
1− T ′ (y)
And g (y) = G ′ (y)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 110 / 145
Bourguignon and Spadaro
Use data from FranceMarginal tax ratesWage distribution from survey dataFind G ′ (y) < 0 for high values of earnings
Suggests implicit welfare weights are negative at the top
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 111 / 145
Bourguignon and Spadaro
Concerns:Use of HH data for wage distribution -> thin upper tail (decliningα (y))
Mechanically generates optimal zero tax rate at top (as in Diamond 98)
Treatment of heterogeneityDifferent tax rates for same wage...how to construct T ′ (y) from surveydata? See heterogeneity section below.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 112 / 145
Werning (2009)
Pareto Efficient Income TaxationCharacterize when is a tax schedule constrained inefficient?Local Laffer Effects
g (y) = 1− T ′ (y)1− T ′ (y)
εα (y) ≥ 0
where α = −1− dlog(f (y))dlog(y) (derivative of α above...).
Pareto efficiency test:
T ′ (y)1− T ′ (y)
εα (y) ≤ 1
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 113 / 145
Generalization to multiple dimensions
Suppose people have characteristics X (multi-dimensional). Obtaintaxes/transfers T (X ).Construct FE (X ) = limε→0 FE (X , ε), whereFE (X ∗, ε) = d
dη q (X , η, ε)− 1 is the fiscal externality from giving atax cut to those with values of X ∈ Nε (X ∗)Test:
FE (X ) < −1
Local Laffer Effects...But, testing FE (X ) < −1 for all X is hard!
Think of all the possible values of X in reality! All of people’sobservable characteristics – very high dimensional!
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 114 / 145
Resolving interpersonal comparisons
Difficult to find Laffer effects. Otherwise, need to resolveinterpersonal comparisons
g (θ) can describe social preferences...but what weights to use?Kaldor and Hicks: can winner’s hypothetically compensate losersthrough transfers?
Generally interpreted as lump-sum transfers (individual specific)Setup: Consider policy path P; construct WTP out of own income,s (θ) = dV
λ
Assume budget neutral policy for nowIn principle, s (θ) < 0 for some >0 for others.KH test: does there exist transfers t (θ) s.t. everyone is better off.True iff ∫
s (θ) dµ (θ)︸ ︷︷ ︸Aggregate Surplus
> 0
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 115 / 145
Lump sum vs. distortionary transfers
KH motivates aggregate surplus (aka efficiency) as a normativecriterionMirrlees 71 observation: Transfers aren’t free!Hendren (2014): Imagine the transfers occur through the income taxschedule
See also Coate (2000) and Kaplow (2006; 2008).
Imagine taxing back the benefits to the government through changesin the tax scheduleFor now, assume benefits are homogeneous conditional on incomes (y)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 116 / 145
Multi-D heterogeneity
Benefits s (y). We tax them back s.t. government revenue is 1+FEper $1 taxed:
(1+ FE (y)) s (y)
Can get positive govt revenue from spending the $1 if∫(1+ FE (y)) s (y) dF (y) ≥ 0
If holds, then can do the policy, tax the benefits, and redistribute thesurplus to everyone!
Benefit-based taxation (Musgrave).
Motivates using weights 1+ FE (y) or
g (y) = 1+ FE (y)E [1+ FE (y)]
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 117 / 145
Two reasons to use these weights...
g (y) characterizes cost of providing $1 of welfare to those earning ythrough modifications in the tax schedule.
Give higher weight to those more costly to reach!
Two reasons to use these weights:Positive (Hendren 2014): Can augment policy with benefit tax to makePareto improvement
Prefer the policy by the (potential) Pareto principle
Normative (Bourguignon and Spadaro): g (y) are welfare weights thatrationalize status quo as optimum
g (y) rationalizes status quo as optimal (assuming g (y) > 0).
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 118 / 145
Application to U.S. data (Hendren 2014)
Hendren (2014) estimates FE (y) by calibrating elasticitiesEstimates T ′ and α using universe of income tax returns
Accounts for covariance between T ′ and αContrasts with existing approaches assuming single tax rate schedules(e.g. Bourguignon and Spadaro, Jacobs et. al.)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 119 / 145
Application to U.S. data (Hendren 2014)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 120 / 145
Multi-D heterogeneity
What to do if there is heterogeneity conditional on incomein earnings yin surplus s (θ)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 121 / 145
1 Basics of Welfare Estimation
2 Optimal Tax: Early Literature on Public Goods and Commodity Taxes
3 Optimal Income Taxation: Mirrlees Model
4 Optimal Tax in Partial Equilibrium: A General Formula
5 Optimal Taxation in General Equilibrium
6 Inverse Optimum Program
7 Commodity versus Income Taxation: Atkinson-Stiglitz
8 Education & the Hamiltonian Approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 121 / 145
Commodity versus Income Taxation
Mirrlees model assumes only income taxWhat about commodity taxes? Or other taxes?
Diamond-Mirrlees (1971, AER) calculates optimal commodity taxes inworld with no lump-sum taxation
Leads to inverse elasticity ruleConsider policy variation around an optimum that solely changescommodity taxes:
dV P
dθ= 0
which implies
∑k
τkdxkdθ
= 0
At an optimum, budget-neutral policy changes have no utility impacts-> compensated elasticities.
But, suppose there was lump-sum taxation -> optimal distortionarytax is zero.
Need to nest commodity taxes in model with rational desire to avoidlump-sum taxationNathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 122 / 145
Commodity Taxation Revisited
Does commodity taxation have a role if we have a nonlinear incometax (with lump-sum)?
Need to put commodity taxes into Mirrlees (1971) frameworkAtkinson and Stiglitz (1976) JPubEc
Follow Kaplow (2006, JPubEc) for a simple proof
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 123 / 145
Kaplow (2006)
SetupIndividuals choose commodities {c1, c2...} and labor effort, lMaximize utility function
uh (c1, c2, ..., l) = uh (g (c1, ...) , l)
Key assumption: g is the same across people
Subject to budget constraint
∑ (pi + τi ) ci ≤ wl − T (wl)
where w is an individual’s wage (heterogeneous in population)wl is earnings and T (wl) is the (nonlinear) tax on earnings
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 124 / 145
Statement
Suppose there is a commodity taxpi + τipj + τj
6= pipj
for some i and jCan welfare be improved by re-setting τi = τj = 0 and suitablyaugmenting the tax schedule T?
Atkinson-Stiglitz/Kaplow: YES.Define V (τ,T ,wl) to be
V (τ,T ,wl) = max v (c1, c2, ...)
s.t. ∑ (pi + τi ) ci ≤ wl − T (wl)V is the value of the consumption argument of the utility function –holds independent of labor effort l!
Consumption allocations don’t reveal any information about laborsupply type w conditional on wl .
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 125 / 145
Proof
Define intermediate environment:Start with commodity taxes τDefine new taxes at zero τ∗i = 0Augment the tax schedule
Define T ∗ to offset the impact on utility so that utility is held constantin this intermediate world
Specifically, T ∗ satisfies
V (τ,T ,wl) = V (τ∗,T ∗,wl)
for all wl
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 126 / 145
Proof (Cont’d)
Lemma 1: Every type w chooses the same level of labor effort underτ∗,T ∗ as under τ,T .Proof:
Note that
U (τ,T ,w , l) = u (V (τ,T ,wl) , l) = u (V (τ∗,T ∗,wl) , l) = U (τ∗,T ∗,w , l)
so that utility is the same in both environments for a given individualfor any choice of l .Therefore, the l that maximizes utility in the original world maximizesutility in the intermediate world
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 127 / 145
Proof Cont’d
Lemma 2: The augmented world raises more revenue than the originalworldProof:
Will show that no individual in the intermediate regime can afford theoriginal consumption vector
Implies they pay more taxes in intermediate regime
Suppose type w can afford original vectorThen she strictly prefers a different vector because of change in relativepriceImplies intermediate environment is strictly better off -> contradictingdefinition of intermediate environment holding utilities constant
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 128 / 145
Proof Cont’d
Why does this imply aggregate tax revenue is higher in theintermediate environment?We have:
∑ (pi ) ci > wl − T ∗ (wl)for all wl (note τ∗ = 0)Budget constraint in initial regime implies
∑i(pi + τi ) ci = wl − T (wl)
so that∑
ipici = −∑
iτici + wl − T (wl)
So that−∑
iτici + wl − T (wl) > wl − T ∗ (wl)
orT ∗ (wl) > ∑ τici + T (wl)
QEDNathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 129 / 145
Proof Cont’d
So, the intermediate world generates more tax revenue and holdsutility constantNow, generate a third world that gives ε benefits to everyone throughlowering the tax scheduleImplies everyone better off.
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 130 / 145
Implications of Atkinson Stiglitz
Incredibly powerfulImplies zero capital taxes in the standard modelNests the celebrated “production efficiency” theorem of Diamond andMirrlees (1971)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 131 / 145
Capital Taxes
Suppose
U (c1, c2, ..., l) = u (c1)− v (l1) + β [u (c2)− v (l2)] + ...
With budget constraint
∑i(pi + τi ) ci ≤∑
iwi li
Sog (c1, c2, ...) = u (c1) + βu (c2) + ...
Implies no distortion in relative price of c1 and c2You should prove extension to case with li instead of just l .
What if more productive types have higher preferences for bequests?
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 132 / 145
Production Efficiency
Diamond and Mirrlees (1971) show a surprising result:Suppose C is produced with a bunch of intermediate inputs, xi
C = f (x1, ..., xn)
Question: would you ever want to tax these inputs?Answer: No!
u (x , l) = U (C (x) , l)
The production function for C is the same for all peopleWeak separability holdsImplies no taxes on intermediate inputs
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 133 / 145
When does Atkinson-Stiglitz Fail?
Mirrlees information logic:When commodity choices have desirable information about typeconditional on earnings!
What constitutes “desirable information”? (Saez 2002 JPubEc)Information about social welfare weights: Society likes people thatconsume x1 more than x2 conditional on earnings
Implement subsidy on good x1 financed by tax on x2First order welfare gain (b/c of difference in social welfare weights)Second order distortionary cost starting at τ = 0
Information about latent productivity: More productive types like x1more than x2 conditional on earnings
e.g. x1 is books; x2 is surf boardsThen, tax the goods rich people like but reduce the marginal tax rateLeads to increase in earnings!Depends on covariance
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 134 / 145
Back to Diamond and Mirrlees (1971) Optimal CommodityTaxation
Diamond and Mirrlees (1971) derive conditions for optimalcommodity taxationConsider model without an intercept in the tax schedule (i.e. nolump-sum transfers)Result: Taxes on goods inverse to their behavioral responses
Inverse elasticity rule
My view: This is arguably the most mis-understood result in allof public economics.
Because intercept is ruled out, introduces desire to tax inelastic goodsbecause they replicate the lump-sum tax.But with lump-sum tax, this desire goes awayOptimal commodity taxes depend on whether commodity choiceprovides systematic information about latent productivity and allowsfor a relaxation of the income distribution
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 135 / 145
1 Basics of Welfare Estimation
2 Optimal Tax: Early Literature on Public Goods and Commodity Taxes
3 Optimal Income Taxation: Mirrlees Model
4 Optimal Tax in Partial Equilibrium: A General Formula
5 Optimal Taxation in General Equilibrium
6 Inverse Optimum Program
7 Commodity versus Income Taxation: Atkinson-Stiglitz
8 Education & the Hamiltonian Approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 135 / 145
Education Policy
Should we subsidize education?Bovenberg and Jacobs (2005 JPubEc); Stantcheva (2013, JMP)Setup:
l is labor effort (unobserved)y is an individual’s production (observed)θ is an individual’s type (unobserved)h is human capital investment (observed)Arbitrary production (for now):
y = f (h, l , θ)
UtilityU (c, l , h, θ)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 136 / 145
Education Policy
Question: How should we tax/subsidize human capital in addition tononlinear income tax T (y)?Special Case: Can we apply Atkinson-Stiglitz theorem?Case 1: h is pure consumption and has no impact on production
If U = U (g (c, h) , l , θ), then AS implies no tax/subsidy for education
Case 2: High earning people like human capital more than poorpeople:
gPoorc
gPoorh
>gRich
cgRich
h
Tax it (allows lower marginal tax on earnings)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 137 / 145
Education Policy
Case 3: h has productive value and no utility valueFollow Bovenberg and Jacobs (2005, JPubEc)
Production functiony = lφ (h) θ = lhβθ
Utility
U = c − l1+ 1ε
1+ 1ε
This is more complicated...turn to Hamiltonian approach
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 138 / 145
Hamiltonian Approach
Common method for solving uni-dimensional screening problems: Usea Hamiltonian
Original Approach by Mirrlees (1971)
WLOG?, government chooses menu of {c (θ) , l (θ) , h (θ)}θ tomaximize social welfare: ∫
v (θ)ψ (θ) dθ
Will sometimes switch l and y ...
Subject to IC constraints and aggregate resource constraints
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 139 / 145
Switch to utility space
Common approach to using the hamiltonian: switch to utility spaceand solve for utility
c (v , y , h) = v +
(y
θφ(h)
)1+ 1ε
1+ 1ε
so given v , l , and h, c is the level of consumption that delivers utilityv
Helpful to have quasilinear utility here...
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 140 / 145
IC and Resource Constraint
Start with IC constraint:Define
v(θ, θ)= u
(c(θ), y(θ), h(θ); θ)= c
(θ)−
[y(θ)
φ(h(θ))θ
]1+ 1ε
1+ 1ε
utility of type θ who says they are type θIC constraint is
v (θ) = maxθ v(θ, θ)∀θ
Each type prefers truth-telling
Resource Constraint:∫T (h (θ) , y (θ)) =
∫(y (θ)− c (θ)− h (θ)) dθ ≥ 0
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 141 / 145
First Order Approach
Under a single crossing assumption, the global incentive constraintscan be replaced with local incentive constraintsLocal IC constraints described by envelope theorem:
v ′ (θ) =∂u∂θ
= −(
y (θ)
φ (h (θ))
)1+ 1ε d
dθ θ−(1+1ε )
1+ 1ε
= −(
y (θ)
φ (h (θ))
)1+ 1ε d
dθ θ−(1+1ε )
1+ 1ε
= −(
y (θ)
φ (h (θ))
)1+ 1ε
θ1ε
=1θ
(y (θ)
φ (h (θ)) θ
)1+ 1ε
Note that v ′ (θ) > 0. Implies more productive types must get higherutility...
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 142 / 145
Hamiltonian
Hamiltonian:Think of θ as “time”v (θ) is the state variable (we have a constraint for v ′ (θ))control variables (aka co-state variables): h (θ), y (θ), and c (θ)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 143 / 145
Hamiltonian
H = v (θ)ψ (θ)−γIC (θ)
[1θ
(y (θ)
φ (h (θ)) θ
)1+ 1ε
]+γRC [y (θ)− h (θ)− c (y (θ) , h (θ) , v (θ) , θ)]
Why is this useful? Well, we’re done. We can solve for the optimalsubsidy on human capital...At the optimum, ∂H
∂f (X )= 0 for all ctsly diff functions of control
variables f (X ).So, substitute back l (θ) instead of y (θ).
H = v (θ)ψ (θ)−γIC (θ)1θ
l1+1ε +γRC [θl (θ) φ (h (θ))− h (θ)− c (θl (θ) φ (h (θ)) , h (θ) , v (θ) , θ)]
Now, take derivative wrt h holding l and v constant:∂H∂h = γRC
[θl (θ) φ′ (h (θ))− 1− dc
dh |l,v]= 0
where dcdh |l,v = 0
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 144 / 145
How general is this?
Why is the tax precisely zero?Because taxing it provides no help in screening (h does not help in ICconstraint)Is this general?
No: Depends on shape of incentive constraints (Stantcheva 2013).Consider general case: y (θ) = φ (h, θ):
v(θ, θ)= c
(θ)−
[y(θ)
φ(h(θ),θ)
]1+ 1ε
1+ 1ε
Then,
v ′ (θ) =[
y(θ)
φ(h(θ), θ)] 1
ε y (θ)∂φ∂θ
φ2 =
[y(θ)
φ(h(θ), θ)]1+ 1
ε ∂φ∂θ
φ
Note that∂φ∂θφ does NOT depend on h when φ = hθ. But, more
generally it does.The IC constraint now enters when taking the derivative of H wrt hholding l and v fixed
Stantcheva: human capital wedge depends on hicksian coefficient ofcomplementarity:
ρ =∂2φ∂θ∂h φ∂φ∂θ
∂φ∂h
Subsidize human capital more (less) than taxes if ρ < 1 (ρ > 1)
Nathaniel Hendren (Harvard and NBER) Lecture 1 Spring, 2015 145 / 145
Recommended