Measuring Polarization in High-Dimensional Data: Method and Application to Congressional Speech
Matthew Gentzkow, Stanford and NBER
Jesse M. Shapiro, Brown and NBER
Matt Taddy, Microsoft and Chicago Booth
Tax Relief
DEATH TAX
freedom fighters
illegal alien
terrorists
ESTATE TAX
Tax Breaks
undocumented worker
Wealthiest
1 percent
living wage
fair labor
capitalist African American
Pro choice
equality
Tax Freedom
War on Terror
Pro life
Big Government
entrepreneurs
Right to life
Washington takeover
Welfare Queens
Origins
Dr. Frank I. Luntz – The Language of Healthcare 2009
THE LANGUAGE OF HEALTHCARE 2009
THE 10 RULES FOR STOPPING THE
“WASHINGTON TAKEOVER” OF HEALTHCARE
(1) Humanize your approach. Abandon and exile ALL references to the “healthcare
system.” From now on, healthcare is about people. Before you speak, think of the three
components of tone that matter most: Individualize. Personalize. Humanize.
(2) Acknowledge the “crisis” or suffer the consequences. If you say there is no healthcare
crisis, you give your listener permission to ignore everything else you say. It is a
credibility killer for most Americans. A better approach is to define the crisis in your
terms. “If you’re one of the millions who can’t afford healthcare, it is a crisis.” Better
yet, “If some bureaucrat puts himself between you and your doctor, denying you
exactly what you need, that’s a crisis.” And the best: “If you have to wait weeks for
tests and months for treatment, that’s a healthcare crisis.”
(3) “Time” is the government healthcare killer. As Mick Jagger once sang, “Time is on
Your Side.” Nothing else turns people against the government takeover of healthcare
than the realistic expectation that it will result in delayed and potentially even denied
treatment, procedures and/or medications. “Waiting to buy a car or even a house won’t
kill you. But waiting for the healthcare you need – could. Delayed care is denied care.”
(4) The arguments against the Democrats’ healthcare plan must center around
“politicians,” “bureaucrats,” and “Washington” … not the free market, tax incentives,
or competition. Stop talking economic theory and start personalizing the impact of a
government takeover of healthcare. They don’t want to hear that you’re opposed to
government healthcare because it’s too expensive (any help from the government to
lower costs will be embraced) or because it’s anti-competitive (they don’t know about or
care about current limits to competition). But they are deathly afraid that a government
takeover will lower their quality of care – so they are extremely receptive to the
anti-Washington approach. It’s not an economic issue. It’s a bureaucratic issue.
(5) The healthcare denial horror stories from Canada & Co. do resonate, but you have
to humanize them. You’ll notice we recommend the phrase “government takeover”
rather than “government run” or “government controlled.” It’s because too many
politicians say “we don’t want a government run healthcare system like Canada or Great
Britain” without explaining those consequences. There is a better approach. “In
countries with government run healthcare, politicians make YOUR healthcare decisions.
THEY decide if you’ll get the procedure you need, or if you are disqualified because the
treatment is too expensive or because you are too old. We can’t have that in America.”
Example: Social Security
• Luntz (2006):
  • “Never say ‘privatization / private accounts.’ Instead say ‘personalization / personal accounts.’ Two-thirds of America want to personalize security while only one third would privatize it. Why? [Personalization] suggests ownership and control... while [privatization] suggests a profit motive and winners and losers.”
Example: Social Security
• 2005 Congress

  Phrase               Rep   Dem
  “personal account”   184    48
  “private account”      5   542

• Media coverage, 6/23/05
  • “House GOP offers plan for Social Security; Bush’s private accounts would be scaled back” (Washington Post)
  • “GOP backs use of Social Security surplus; Finds funding for personal accounts” (Washington Times)
Is partisan speech a new phenomenon?
This Paper
• Goal: Measure trends in partisanship of political speech
• Data: US Congressional Record, 1873-2009
• Challenge: Speech is high-dimensional choice data
  • Potential for severe finite-sample bias
  • Computation can be difficult
• Solution: Structural estimation with machine-learning methods
• Approach exportable to other contexts (e.g. web browsing, residential segregation)
Literature
• Polarization in Congress
  • E.g., Poole & Rosenthal (1984, 1997); McCarty et al. (2006)
• Polarization more broadly
  • E.g., Fiorina et al. (2006); Fiorina & Abrams (2006); Abramowitz & Saunders (2008)
• Congressional speech
  • E.g., Grimmer (2010, 2013); Quinn et al. (2010)
  • Jensen et al. (2012)
Data
• US Congressional Record, 1873-2009
• Use automated script to identify speaker and tag with metadata
• Use some rules of thumb to remove procedural phrases
  • “I yield the remainder of my time...”
• Turn into counts of two-word phrases after stemming and removing stopwords
  • “war on terrorism” and “war on terror” become “war terror”
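The last step can be illustrated with a minimal sketch. The stopword set and the suffix-stripping rule below are hypothetical stand-ins (a real pipeline would presumably use a full stopword list and a proper stemmer such as Porter's); they are chosen only so that the slide's example comes out as stated.

```python
from collections import Counter

# Hypothetical stopword list and suffix rules, for illustration only.
STOPWORDS = {"on", "the", "of", "a", "an", "and", "to", "in", "i", "my"}
SUFFIXES = ("ism", "ing", "ed", "s")


def crude_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bigram_counts(text: str) -> Counter:
    """Lowercase, drop stopwords, stem, then count adjacent word pairs."""
    tokens = [w.strip(".,;:!?\"'") for w in text.lower().split()]
    stems = [crude_stem(w) for w in tokens if w and w not in STOPWORDS]
    return Counter(" ".join(pair) for pair in zip(stems, stems[1:]))


print(bigram_counts("war on terrorism"))
print(bigram_counts("war on terror"))
```

Both calls map to the same single phrase, which is the point of stemming before counting: surface variants of one expression are pooled into one vocabulary entry.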
Trends in Verbosity
[Figure: total utterances per speaker by year, 1880-2000.]
Model
Statistical Model
• Vector of phrase counts $c_{it}$ for members $i$
• Party affiliation $P(i) \in \{R, D\}$
• Speaker characteristics $x_{it}$
• Verbosity $m_{it} = \sum_j c_{ijt}$
• Assume throughout that
$$c_{it} \sim \mathrm{MN}\left(m_{it},\, q_t^{P(i)}(x_{it})\right)$$
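To make the multinomial assumption concrete, here is a small simulation sketch using only the standard library. The probabilities `q_R` and the verbosity `m_it` are made-up illustrative values, and the helper `draw_phrase_counts` is a hypothetical name, not part of the paper's code.

```python
import random
from collections import Counter


def draw_phrase_counts(m, q, rng):
    """Draw m phrases independently with probabilities q; return counts per phrase."""
    draws = rng.choices(range(len(q)), weights=q, k=m)
    tally = Counter(draws)
    return [tally[j] for j in range(len(q))]


rng = random.Random(0)
q_R = [0.6, 0.3, 0.1]   # hypothetical choice probabilities for party R
m_it = 50               # hypothetical verbosity of one speaker
c_it = draw_phrase_counts(m_it, q_R, rng)
print(c_it, sum(c_it))  # the counts always sum to the verbosity m_it
```

Under this model a speaker's verbosity is taken as given, and only the split of those utterances across phrases carries information about party.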
Question
• How different are choices of R and D at each $t$?
  • Translation: how different are $q_t^R(\cdot)$ and $q_t^D(\cdot)$?
• Approach: measure partisanship by diagnosticity
  • How much can I learn about your party from what you say?
Posteriors
• Posterior belief of an observer with a neutral prior after hearing phrase $j$
$$\rho_{jt}(x) = \frac{q_{jt}^R(x)}{q_{jt}^R(x) + q_{jt}^D(x)}$$
• Posterior that the observer expects to assign to the speaker’s true party
$$\pi_t(x) = \frac{1}{2}\, q_t^R(x)' \cdot \rho_t(x) + \frac{1}{2}\, q_t^D(x)' \cdot \left(1 - \rho_t(x)\right)$$
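The two formulas translate directly into a few lines of code. This is a minimal sketch for a fixed value of the covariates, with a made-up three-phrase vocabulary; it assumes every phrase has positive probability under at least one party, so no denominator is zero.

```python
def posterior_R(q_R, q_D):
    """rho_j: posterior on R after hearing one phrase j, from a neutral (1/2) prior."""
    return [r / (r + d) for r, d in zip(q_R, q_D)]


def expected_posterior(q_R, q_D):
    """pi: posterior the observer expects to assign to the speaker's true party."""
    rho = posterior_R(q_R, q_D)
    from_R = sum(r * p for r, p in zip(q_R, rho))        # speaker is R
    from_D = sum(d * (1 - p) for d, p in zip(q_D, rho))  # speaker is D
    return 0.5 * from_R + 0.5 * from_D


# Hypothetical three-phrase vocabulary.
q_R = [0.7, 0.2, 0.1]
q_D = [0.1, 0.2, 0.7]
print(posterior_R(q_R, q_D))
print(expected_posterior(q_R, q_D))
```

The middle phrase is used equally by both parties, so hearing it leaves the observer at 1/2; the outer phrases are what move the expected posterior above 1/2.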
Measure of Partisanship
$$\pi_t = \frac{1}{N_t} \sum_i \pi_t(x_{it})$$
• Between $\frac{1}{2}$ (speech uninformative) and 1 (speech fully revealing)
• Close cousin of isolation (White 1986, Cutler et al 1999)
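The bounds can be checked from the definition of $\pi_t(x)$ on the Posteriors slide. A short derivation, writing $a_j = q_{jt}^R(x)$ and $b_j = q_{jt}^D(x)$ (shorthand introduced here, not in the slides) and assuming $a_j + b_j > 0$ for every phrase:

```latex
\begin{align*}
\pi_t(x) &= \frac{1}{2}\sum_j \frac{a_j^2}{a_j+b_j} + \frac{1}{2}\sum_j \frac{b_j^2}{a_j+b_j}
          = \frac{1}{2}\sum_j \frac{a_j^2+b_j^2}{a_j+b_j} \\
         &= \frac{1}{2}\sum_j \left[\frac{a_j+b_j}{2} + \frac{(a_j-b_j)^2}{2(a_j+b_j)}\right]
          = \frac{1}{2} + \frac{1}{4}\sum_j \frac{(a_j-b_j)^2}{a_j+b_j}.
\end{align*}
```

The last sum is zero exactly when the two parties' distributions coincide, which gives $\frac{1}{2}$. Because $(a_j-b_j)^2 \le (a_j+b_j)^2$, the sum is at most $\sum_j (a_j+b_j) = 2$, which gives 1, attained when no phrase is used by both parties. Averaging over speakers preserves both bounds.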
Estimation
Plug-In Estimator
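• Empirical analogues
$$\hat{q}_{jt}^P = \frac{\sum_{i \in P} c_{ijt}}{\sum_{i \in P} m_{it}}$$
$$\hat{\rho}_{jt} = \frac{\hat{q}_{jt}^R}{\hat{q}_{jt}^R + \hat{q}_{jt}^D}$$
$$\hat{\pi}_t^{PLUGIN} = \frac{1}{2}\left(\hat{q}_t^R\right)'\hat{\rho}_t + \frac{1}{2}\left(\hat{q}_t^D\right)'\left(1 - \hat{\rho}_t\right)$$
• This is the MLE when $x_{it}$ is constant
• Consistent as quantity of speech grows large holding size of vocabulary fixed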
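As a worked illustration of the plug-in formulas, the sketch below applies them to the 2005 Social Security counts shown earlier in the deck. Treating those two phrases as the entire vocabulary is a simplification made here purely for illustration; the function name `plug_in_partisanship` is hypothetical.

```python
def plug_in_partisanship(counts_R, counts_D):
    """Plug-in estimate from phrase counts pooled by party."""
    m_R, m_D = sum(counts_R), sum(counts_D)
    q_R = [c / m_R for c in counts_R]
    q_D = [c / m_D for c in counts_D]
    total = 0.0
    for r, d in zip(q_R, q_D):
        if r + d == 0:   # phrase unused by both parties contributes nothing
            continue
        rho = r / (r + d)
        total += 0.5 * r * rho + 0.5 * d * (1 - rho)
    return total


# Counts from the 2005 Congress slide: ["personal account", "private account"].
rep = [184, 5]
dem = [48, 542]
print(round(plug_in_partisanship(rep, dem), 3))
```

With a vocabulary this small and counts this lopsided the estimate is far above 1/2, which is expected: the two phrases were picked for the slide precisely because they split the parties.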
Maximum Likelihood Estimator
[Figure: average partisanship by year, 1870-2010, for the real and random series.]
Bias
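$$E\left[\left(\hat{q}_t^R\right)'\hat{\rho}_t - \left(q_t^R\right)'\rho_t\right] = \left(q_t^R\right)' E\left(\hat{\rho}_t - \rho_t\right) + \mathrm{Cov}\left[\left(\hat{q}_t^R - q_t^R\right)',\, \left(\hat{\rho}_t - \rho_t\right)\right]$$
• $\hat{q}_t^P$ is unbiased for $q_t^P$
• First term non-zero because $\hat{\rho}_t$ is a non-linear function of $\hat{q}_t^P$
• Second term non-zero because $\hat{\rho}_t$ is an increasing function of $\hat{q}_t^R$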
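The size of this bias can be seen in a small simulation. In the sketch below both parties draw phrases from the same uniform distribution, so true partisanship is exactly 1/2 and any excess in the plug-in estimate is bias. The vocabulary sizes and word counts are arbitrary illustrative choices.

```python
import random


def plug_in(counts_R, counts_D):
    """Plug-in partisanship; r*rho + d*(1-rho) simplifies to (r^2 + d^2)/(r + d)."""
    m_R, m_D = sum(counts_R), sum(counts_D)
    total = 0.0
    for c_r, c_d in zip(counts_R, counts_D):
        r, d = c_r / m_R, c_d / m_D
        if r + d > 0:
            total += 0.5 * (r * r + d * d) / (r + d)
    return total


def simulate_null(vocab_size, words_per_party, rng):
    """Both parties draw from the same uniform distribution over phrases."""
    def draw():
        counts = [0] * vocab_size
        for _ in range(words_per_party):
            counts[rng.randrange(vocab_size)] += 1
        return counts
    return plug_in(draw(), draw())


rng = random.Random(1)
small_vocab = simulate_null(vocab_size=10, words_per_party=2000, rng=rng)
large_vocab = simulate_null(vocab_size=5000, words_per_party=2000, rng=rng)
print(small_vocab, large_vocab)
```

With few phrases and many words the estimate sits just above 1/2. When the vocabulary is large relative to the amount of speech, most phrases are observed for only one party by chance, and the estimate rises well above 1/2 even though the parties speak identically.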
Jensen et al. (2012)
[Figure: standardized polarization by year, 1870-2010, for the real and random series.]
Restrict to Commonly Occurring Phrases?
[Figure: average partisanship by year, 1870-2010, real and random series, in eight panels: Top 90 percent of phrases; Top 50 percent of phrases; Top 10 percent of phrases; Top 1 percent of phrases; Spoken more than 5 times; Spoken more than 20 times; Spoken more than 100 times; Spoken more than 500 times.]
Leave-Out Estimator
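• Define $\hat{\rho}_{-i,t}$ which leaves out $i$
• Define
$$\hat{\pi}_t^{LOE} = \frac{1}{2}\,\frac{1}{|R_t|}\sum_{i \in R_t} \hat{q}_{i,t}' \cdot \hat{\rho}_{-i,t} + \frac{1}{2}\,\frac{1}{|D_t|}\sum_{i \in D_t} \hat{q}_{i,t}' \cdot \left(1 - \hat{\rho}_{-i,t}\right)$$
• Enforces independence of $\hat{q}$ and $\hat{\rho}$
• Still biased because of non-linear $\hat{\rho}$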
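A minimal sketch of the leave-out estimator follows. It assumes each speaker has at least one utterance and each party has at least two speakers; the fallback to 1/2 when nobody else used a phrase is a choice made here for illustration, not something the slides specify.

```python
import random


def leave_out_partisanship(speakers):
    """speakers: list of (party, counts) pairs with party in {"R", "D"}."""
    vocab = len(speakers[0][1])
    tot = {"R": [0] * vocab, "D": [0] * vocab}
    for party, counts in speakers:
        for j, c in enumerate(counts):
            tot[party][j] += c

    sums = {"R": 0.0, "D": 0.0}
    n = {"R": 0, "D": 0}
    for party, counts in speakers:
        m_i = sum(counts)
        # Own-party totals with speaker i removed; other party unchanged.
        own = [t - c for t, c in zip(tot[party], counts)]
        other = tot["D" if party == "R" else "R"]
        c_R, c_D = (own, other) if party == "R" else (other, own)
        m_R, m_D = sum(c_R), sum(c_D)
        score = 0.0
        for j, c in enumerate(counts):
            r, d = c_R[j] / m_R, c_D[j] / m_D
            # Assumption: neutral posterior when no other speaker used phrase j.
            rho = r / (r + d) if r + d > 0 else 0.5
            score += (c / m_i) * (rho if party == "R" else 1 - rho)
        sums[party] += score
        n[party] += 1
    return 0.5 * sums["R"] / n["R"] + 0.5 * sums["D"] / n["D"]


rng = random.Random(2)


def random_speaker(party, vocab_size=50, words=200):
    """Hypothetical speaker drawing uniformly over phrases, regardless of party."""
    counts = [0] * vocab_size
    for _ in range(words):
        counts[rng.randrange(vocab_size)] += 1
    return (party, counts)


null_speakers = [random_speaker("R") for _ in range(20)] + \
                [random_speaker("D") for _ in range(20)]
null_estimate = leave_out_partisanship(null_speakers)
print(null_estimate)
```

Because each speaker's own counts never enter the posterior used to score that speaker, sampling noise in a speaker's phrase frequencies is no longer mechanically correlated with the weights applied to them.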
Leave-Out Estimator
[Figure: average partisanship by year, 1870-2010, for the real and random series.]
• Controlling bias
  • Add lasso type penalty to likelihood
  • Shrinks $\hat{\rho}_{jt}$ toward $\frac{1}{2}$
• Making computation feasible
  • Approximate likelihood with Poisson
  • Allows distributed computing (Taddy 2015)
• Controlling for confounds ($x_{it}$)
  • geography, chamber, gender, indicator for being in majority party
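The link between a lasso-type penalty and shrinkage of the posterior toward 1/2 can be illustrated with a one-phrase analogy. This is not the penalized estimator used in the paper: it only uses the identity that $\rho_{jt}$ is the logistic function of the log odds $\log q_{jt}^R - \log q_{jt}^D$, and applies soft thresholding (the closed-form effect of an L1 penalty in the simplest one-parameter problem) to that log odds. The penalty value is arbitrary.

```python
import math


def soft_threshold(value, penalty):
    """Lasso-style shrinkage: move value toward 0 by penalty, stopping at 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def shrunk_posterior(q_R_j, q_D_j, penalty):
    """Shrink the log odds of phrase j, then map back to a posterior rho_j."""
    log_odds = math.log(q_R_j) - math.log(q_D_j)
    return 1.0 / (1.0 + math.exp(-soft_threshold(log_odds, penalty)))


# Hypothetical phrase probabilities: a small party gap and a large one.
print(shrunk_posterior(0.012, 0.010, penalty=0.5))
print(shrunk_posterior(0.050, 0.005, penalty=0.5))
```

Small differences between the parties, which are the ones most likely to be sampling noise, are set to exactly 1/2, while large differences survive in attenuated form.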
Main Results
Baseline Specification
[Figure: average partisanship by year, 1870-2010, for the real and random series.]
Magnitude
[Figure: expected posterior (0.5 to 1.0) against number of phrases (0 to 100), with curves for 1873-1874, 1989-1990, and 2007-2008 and a marker for one minute of speech.]
Comparison: Roll Call Votes
[Figure: average partisanship (speech) and distance between parties (roll-call voting) by year, 1870-2010.]
Comparison: Roll Call Votes
[Figure: partisanship (0.44 to 0.56) against NOMINATE score (-1.0 to 1.0); legend: Democrat, Republican.]
Unpacking Partisanship
Most Partisan Phrases
• Define the partisanship of phrase $j$ in session $t$ to be the effect on $\pi_t$ of removing phrase $j$ from the vocabulary (redistributing probability mass to other phrases proportionally)
  • Let $\tilde{q}_{kt}^P$ equal $q_{kt}^P / \left(1 - q_{jt}^P\right)$ if $k \neq j$ and 0 otherwise
  • Recompute $\pi_t$ replacing $q_t^P$ with $\tilde{q}_t^P$ and holding $\rho_t$ constant
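The definition above can be sketched in code for a single value of the covariates. The probabilities are made-up values, and reporting the effect as "before minus after" is a sign convention assumed here; the slide itself only says "the effect on $\pi_t$". The sketch requires $q_{jt}^P < 1$ for the dropped phrase.

```python
def expected_posterior_fixed_rho(q_R, q_D, rho):
    """pi computed from given choice probabilities and a given posterior vector."""
    from_R = sum(r * p for r, p in zip(q_R, rho))
    from_D = sum(d * (1 - p) for d, p in zip(q_D, rho))
    return 0.5 * from_R + 0.5 * from_D


def drop_phrase(q, j):
    """q-tilde: zero out phrase j and rescale the rest so they sum to one."""
    return [0.0 if k == j else q_k / (1.0 - q[j]) for k, q_k in enumerate(q)]


def phrase_partisanship(q_R, q_D, j):
    rho = [r / (r + d) for r, d in zip(q_R, q_D)]  # held fixed throughout
    before = expected_posterior_fixed_rho(q_R, q_D, rho)
    after = expected_posterior_fixed_rho(
        drop_phrase(q_R, j), drop_phrase(q_D, j), rho
    )
    return before - after  # assumed sign: positive means phrase j adds partisanship


# Hypothetical three-phrase vocabulary.
q_R = [0.7, 0.2, 0.1]
q_D = [0.1, 0.2, 0.7]
print([phrase_partisanship(q_R, q_D, j) for j in range(3)])
```

In this toy vocabulary the two lopsided phrases get positive scores and the evenly used phrase gets a negative one, since removing an uninformative phrase shifts probability mass onto the informative ones.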
60th Congress (1907-08)

  Most Republican     Most Democratic
  infantri war        section corner
  indian war          ship subsidi
  mount volunt        republ panama
  feet thenc          level canal
  postal save         powder trust
  spain pay           print paper
  war pay             lock canal
  first regiment      bureau corpor
  soil survey         senatori term
  nation forest       remove wreck
• 1908 Rep platform: Calls for “generous provision” for veterans of Spanish-American and Indian wars
• 1908 Dem platform: “Free the Government from the grip of those who have made it a business asset of the favor-seeking corporations.”
• William Cox (D-IN): “the entire United States is now being held up by a great hydra-headed monster, known in ordinary parlance as a ‘powder trust’.”
80th Congress (1947-48)

  Most Republican     Most Democratic
  steam plant         admir denfeld
  coast guard         public busi
  stop communism      labor standard
  depart agricultur   intern labor
  lend leas           tax refund
  zone germani        concili service
  british loan        standard act
  approv compact      soil conserv
  unit kingdom        school lunch
  union shop          cent hour
• Aftermath of WWII
• 1948 Dem platform: Advocates amending Fair Labor Standards Act to raise the federal minimum wage to 75 cents per hour; also advocates school lunch program
100th Congress (1987-88)

  Most Republican     Most Democratic
  freedom fighter     star war
  doubl breast        contra aid
  abort industri      nuclear weapon
  demand second       contra war
  heifer tax          support contra
  reserv object       nuclear wast
  incom ballist       agent orang
  communist govern    central american
  withdraw reserv     nicaraguan govern
  abort demand        hatian peopl
• Debate over support for Contra rebels fighting Sandinista government in Nicaragua; Iran-Contra affair
• Debate over Reagan’s “Star Wars” missile defense initiative & nuclear weapons policy
104th Congress (1995-96)

  Most Republican     Most Democratic
  medic save          tax break
  partialbirth abort  nurs home
  big govern          comp time
  feder debt          break wealthi
  tax increas         break wealthiest
  tax relief          communiti polic
  term limit          million children
  nation debt         assault weapon
  tax freedom         deficit reduct
  item veto           head start
• Debate over taxes and fiscal policy; Republicans using language from Luntz memos and Contract with America
Distribution of Phrase-Level Partisanship
[Figure: posterior (0 to 1) by year, 1976-2008, at quantiles 0.001, 0.01, 0.05, 0.1, 0.9, 0.95, 0.99, 0.999.]
Neologisms
[Figure: average partisanship by year, 1870-2010, for the baseline, pre-1980 vocabulary, and post-1980 vocabulary.]
Topic Decomposition
• Are trends in partisanship driven by
  • Divergence in which topics Dems/Reps emphasize?
  • Divergence in how the parties talk about a given topic?
Topics
  alcohol      environment   mail
  budget       federalism    minorities
  business     foreign       money
  crime        government    religion
  defense      health        tax
  economy      immigration   trade
  education    justice
  elections    labor
[Figure: average partisanship by year, 1870-2010: overall, within, and between.]
[Figure: average partisanship (0.50 to 0.60) and phrase frequency by year, 1870-2010, one panel per topic: alcohol, defense, minorities, budget, crime, government, health, immigration, tax.]
Individual Tax Phrases
[Figure: posterior probability that speaker is Republican (0 to 1) by year for the phrases death tax, tax break, tax spend, tax loophol, tax relief, close tax, tax freedom, share tax.]
Explanations
Political Innovation
• Contract with America (1994)
  • Republicans take control of Congress for first time since 1952
  • Frank Luntz: novel polling techniques, memos to Republican candidates
  • In the aftermath, Democrats launch an effort to improve their own choice of language
You believe language can change a paradigm? “I don’t believe it – I know it. I’ve seen it with my own eyes...I watched in 1994 when the group of Republicans got together and said: ‘We’re going to do this completely differently than it’s ever been done before.’...Every politician and every political party issues a platform, but only these people signed a contract.” - Luntz (2004)
“Republican framing superiority had played a major role in their takeover of Congress in 1994. I and others had hoped that... a widespread understanding of how framing worked would allow Democrats to reverse the trend.” - Lakoff (2014)
Phrases from CWA
[Figure: average partisanship (0.50 to 0.54) and frequency of Contract with America phrases by year, 1870-2010.]
Broader Context
• Party discipline in speech
  • Democratic Message Board (1989-1991)
  • Republican Theme Team (1991-1993): “develop ideas and phrases to be used by all Republicans”
• Changing media environment
  • 1979: C-SPAN (House of Representatives)
  • 1986: C-SPAN2 (Senate)
“When asked whether he would be the Republican leader without C-SPAN, Gingrich... [replied] ‘No’... C-SPAN provided a group of media-savvy House conservatives in the mid-1980s with a method of... winning a prime-time audience.” (Frantzich & Sullivan 1996)
Summary
[Figure: average partisanship by year, 1976-2008, annotated with C-SPAN, C-SPAN2, and Contract with America, and with presidential terms (Ford, Carter, Reagan, Bush, Clinton, Bush).]
Conclusion
Does Language Matter?
• Partisan language in Congress diffuses to broader public
  • Gentzkow & Shapiro 2010; Martin & Yurukoglu 2016; Greenstein & Zhu 2012
• Issue framing affects public opinion
  • Lathrop 2003; Graetz and Shapiro 2006; Druckman et al. 2013
• Language affects group identity
  • Kinzler et al. 2007; Clots-Figueras and Masella 2013
• “Human beings do not live in the objective world alone, nor alone in the world of social activity as ordinarily understood, but are very much at the mercy of the particular language which has become the medium of expression.” (Sapir 1954)
• “When we successfully reframe public discourse, we change the way the public sees the world. We change what counts as common sense.... Thinking differently requires speaking differently.” (Lakoff 2014)