A Sabermetric Test of Neustadt’s Skills Hypothesis
or
Why Ronald Reagan is like Bobby Cox and Lyndon Johnson is like Joe Torre
Jon R. Bond
Texas A&M University
and
Manny Teodoro
Texas A&M University
Prepared for Presentation at the 73rd Annual Meeting of the
Midwest Political Science Association
Chicago, Illinois
April 16-19, 2015
Abstract
A Sabermetric Test of Neustadt’s Skills Hypothesis*
This paper presents a new test of Neustadt’s Political Skills Hypothesis. We believe an indirect test comparing actual
wins against expected wins offers the most systematic test of this key hypothesis. We update and replicate findings of
Wins Above Expectations (WAE) from the traditional analysis of residuals from multiple regression. Then we adapt the
Pythagorean Expectations formula from sabermetrics to estimate WAE based on votes supporting and votes opposing
the president. Both approaches predict presidential success rates with high accuracy across all years and presidents. The
WAEs from the two approaches, however, are uncorrelated. The two approaches also identify different presidents as
above and below expectations. This suggests that the two approaches model different aspects of the underlying data
generation process. More research is needed to sort out what all this means.
*We are grateful to the Department of Political Science at Texas A&M University for supporting this research.
A Sabermetric Test of Neustadt’s Skills Hypothesis
“The other teams could make trouble for us if they win.”
Yogi Berra
Presidential-congressional relations, like baseball, is all about winning. What the manager of a baseball team
needs most from his players is runs: his hitters must score runs, his pitchers and defenders must prevent them.
What the president needs most from members of Congress is their votes: he must get members to vote for his
favored legislation and against legislation he opposes. But important as winning is, what professional observers
(sports geeks and political scientists) really find most intriguing is Wins Above Expectations (WAE)—did the
team do better or worse than expected? If a team wins more games than expected, was it due to an advantageous
set of circumstances, simple “luck,” or did the manager’s skill and leadership produce wins when the team
should have lost? Using different statistical methods, sabermetricians and political scientists have attempted to
answer these questions for their respective fields.1
This paper applies sabermetric methods to analyze presidential success in Congress from Eisenhower to
Obama. Specifically, we adapt Bill James’ (1982) Pythagorean Expectations (PE) formula, which estimates how
many games a baseball team should have won based only on the number of runs scored and runs allowed, to
estimate WAE for presidents. If a roll call vote on which the president expresses a position is analogous to a
game, and votes for and against the president are analogous to runs scored and runs allowed, can the PE formula
predict presidential success rates over the course of a season (year) as accurately as it does for professional
baseball teams? If so, how well do WAE from PE correlate with residuals from multiple regression models used
by political scientists to identify presidents who won significantly more or less often than predicted (Bond and
Fleisher 1990, chap. 8; Cohen, Bond, and Fleisher 2013a; Fleisher and Bond 1983, 1992; Fleisher, Bond, and
1 Bill James named his approach to analyzing baseball “sabermetrics” in honor of the Society for American Baseball Research (SABR).
Wood 2008)? What are the lifetime WAEs of presidents from Eisenhower to Obama? Which presidents, if any,
stand out as uncommonly successful or unsuccessful?
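To make the baseball analogy concrete, the following is a minimal sketch of the Pythagorean calculation in Python. The exponent of 2 is the value in James's original baseball formula; whether the same exponent suits presidential roll calls is an assumption here, not something stated above. The function names and the example season are hypothetical.

```python
def pythagorean_expectation(scored, allowed, exponent=2.0):
    """Expected share of contests won, given totals scored and allowed."""
    if scored < 0 or allowed < 0 or scored + allowed == 0:
        raise ValueError("totals must be non-negative and not both zero")
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


def wins_above_expectations(wins, contests, scored, allowed, exponent=2.0):
    """Actual wins minus the wins implied by the Pythagorean share."""
    expected_wins = contests * pythagorean_expectation(scored, allowed, exponent)
    return wins - expected_wins


# Hypothetical baseball season: 800 runs scored, 700 allowed, 95 wins in 162 games.
share = pythagorean_expectation(800, 700)
wae = wins_above_expectations(95, 162, 800, 700)
print(round(share, 3), round(wae, 2))
```

Under the analogy described above, `scored` and `allowed` would be a year's total votes cast for and against the president's positions, and `contests` would be the number of roll calls on which he took a position.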
The paper begins with a discussion of the fundamentals of presidential success in Congress. Political science
research finds that a regression model with three independent variables—party control, presidential approval, and
party polarization—and interactions successfully models the fundamentals of presidential success on roll call
votes in Congress. We update this model through 2014 (Obama’s sixth year), and use the residuals to estimate
WAE relative to this updated baseline. Next, we adapt the Pythagorean Expectations (PE) approach to predict
presidential success rates based only on votes won and lost, and present WAE relative to this parsimonious
atheoretical baseline. We find that both approaches predict presidential success rates with high accuracy across
all years and presidents. WAE from the two approaches, however, are uncorrelated. This suggests that the two
approaches model different aspects of the underlying data generation process.
The Fundamentals of Presidential-Congressional Relations
Richard Neustadt is rightly considered the dean of the behavioral study of the presidency. His message is clear
and compelling. To understand the presidency, study presidential behavior—what the president “can do, as one
man among many, to carry his own choices through that maze of personalities and institutions called the
government of the United States” (Neustadt 1960, i). Although he did not present a formal theory or use the
language of the scientific method, his insightful analysis generated hypotheses that subsequent generations of
political scientists set about testing. The three that have received the most attention include:
1. The Political Party Hypothesis (for which he expected null results): “What the Constitution separates our
political parties do not combine” (Neustadt 1960, 33);
2. The Public Approval Hypothesis: ceteris paribus, popular presidents are more successful than are
unpopular presidents (chap. 5); and
3. The Political Skills Hypothesis: ceteris paribus, presidents who are highly skilled at bargaining and
persuasion will be uncommonly successful while unskilled presidents will be unsuccessful (chap. 4).
As a test of the Neustadtian theory of presidential power,2 the Political Skills Hypothesis is clearly the most
crucial because it focuses directly on what the president does to maximize his influence. With respect to public
approval, Neustadt (1960, 87) did not expect a strong “one-to-one relationship”. Rather, he viewed public
approval “as a conditioner, not the determinant” of presidential success. The Political Parties Hypothesis is the
least important test. Neustadt expected null results, but rejecting the null hypothesis does not necessarily
undermine the theory.
What evidence has 40 years of scientific research testing these hypotheses produced? Overall, the
correlation between the importance of the hypothesis to the theory and empirical support for it is -1.0.
Political Parties
Political science research has found strong support for the Political Parties Hypothesis. Large-n quantitative
studies consistently reject the null hypothesis that parties do not provide a bridge between the president and
Congress. Indeed, we find that party is by far the strongest determinant of presidential success in Congress—
support from members of the president’s party is statistically and substantively higher than that of members of
the opposition (Edwards 1989), and the president wins significantly more roll calls if his party has a majority
(Bond and Fleisher 1990). Although Neustadt’s observation about the weakness of American parties was
accurate, the pattern of robust party effects held during the period of his analysis—mid-twentieth century—when
parties were at their weakest. Rising party polarization in Congress from the mid-1980s to the present has
magnified the effects of party, though important chamber differences have emerged (Bond, Cohen, and Fleisher
2014, 2015; Bond, Fleisher, and Cohen 2012; Cohen, Bond, and Fleisher 2013a, 2013b).
We have sound theoretical reasons to explain the strong influence of party on presidential support in
Congress. Most obvious is simple arithmetic—majority presidents have more members on the floor with
incentives to support their positions than do minority presidents. Because members of the same political party
2 Neustadt’s analysis, of course, is not a scientific theory, as we understand it in contemporary political science. Yet, the generation of
testable hypotheses is arguably the most important thing that we want our theories to do.
must satisfy similar electoral coalitions, they have a wide range of policy preferences in common. Consequently,
majority presidents win more because they have more members on the floor who agree with their proposals.
Furthermore, because the president’s co-partisans must run for reelection on his record as well as their own, they
have an electoral incentive to help him succeed. Thus, simple arithmetic explains much of why majority
presidents win more votes in Congress than do minority presidents.
Yet, arithmetic is not the only, or even most important, reason why majority presidents win more floor votes
than do minority presidents. A political party is more than just a bundle of shared preferences. More important than
numbers is control of key levers of power in Congress. Political parties have long been conceived of as competing
teams (Downs 1957). Like sports teams, political parties are institutions with rules and procedures that channel and
constrain behavior (Aldrich 2011). The goal of party institutions is to win—win elections and enact policies favored
by party members. As a result, parties in Congress regularly take opposing positions on issues that have nothing to
do with policy or ideology, just because they don’t want the other team to win (Lee 2009), and to score points for the
next election. Majority party agenda control is a central component of the two leading partisan theories of
Congress—Conditional Party Government (CPG) theory (Aldrich and Rohde 2000; Rohde 1991) and Cartel Theory
(Cox and McCubbins 2005). When it comes to winning or losing roll call votes, then, the main advantage of majority
party control is that the president’s team controls key levers of power in Congress—committees, access to the floor,
rules governing debate, and which amendments are considered. Hence, the issues on the agenda and the presentation
of choices to members tend to reflect the president’s preferences when his party controls the chamber (Covington,
Wrighton, and Kinney 1995). These institutional advantages explain why party was such a strong influence on
presidential success even when parties were ideologically diverse in the mid-twentieth century.
Public Approval
Support for the Public Approval hypothesis is mixed but broadly supportive of Neustadt’s expectations.
Although some studies reported insignificant or negative relationships (Bond, Cohen, and Fleisher 2014; Bond
and Fleisher, 1980; Bond, Fleisher, and Northrup 1988), research typically finds that public approval has a
statistically significant but substantively marginal effect on support from members of Congress (Edwards 1989)
and on presidential success rates (Bond and Fleisher 1990). Furthermore, research that tests for interactions
shows that the effects of public approval are indeed conditional on party polarization and party control (Bond,
Cohen, and Fleisher 2014; Bond, Fleisher, and Wood 2003) as Neustadt suggested.
Given the importance of public support in a democracy, why does public approval of the president have
only a marginal effect on success rates? After all, if members of Congress want to get reelected (and they do),
rational actors should take public approval into account when deciding whether to support the president. As
Neustadt (1960, 86) explained, “Dependent men must take account of popular reactions to their actions. What
their publics may think of them becomes a factor, therefore, in deciding how to deal with the desires of a
President. His prestige enters into that decision; their publics are part of his.”
Closer reflection, however, suggests two reasons why presidential popularity might have marginal—or in
some cases negative—effects on presidential success. First, popular presidents may overestimate the benefits of
public approval and be less willing to compromise. An unwillingness to compromise is likely to lead to increased
party unity on floor votes, and hence, more support from the president’s party and less support from the
opposition. Second is the question of credit and blame. Few voters have specific information about how often
their representative supports the president. Rather, the Strategic Politicians Theory indicates that the effects of
presidential popularity in congressional elections are indirect. If the economy is booming and the president is
popular, the president’s party will recruit a disproportionate number of politically experienced, well-financed
candidates to challenge vulnerable incumbents in the opposition party; if the president is unpopular, the opposing
party will recruit more high quality challengers (Jacobson 1989, 1990; Jacobson and Kernell 1983). Because
high quality challengers run vigorous campaigns that cast the incumbent in a negative light, the president’s co-
partisans tend to receive blame for unpopular policies even if they do not support them, and members of the
opposition are unlikely to receive credit even if they support popular policies. As a result, members of Congress
tend to follow their basic partisan predispositions—opposition partisans may oppose popular presidents because
they have little to gain from their support and much to lose if the president succeeds; co-partisans can’t run away
from an unpopular president, so they might be better off to help him succeed.
Political Skills
Thus far, political science research has been unable to uncover convincing evidence supporting the core
hypothesis of Neustadt’s argument—that highly skilled presidents are uncommonly successful. The Political
Skills Hypothesis is central to Neustadt’s thesis, but it’s also the most difficult to test, mainly because of
theoretical ambiguity about how to define and measure political skills, particularly with the small number of
cases. Unlike party and public approval, we have been unable to identify valid and reliable quantitative measures
of political skills. Supporting evidence comes mainly from in-depth case studies. But generalizing findings from
a single (and unrepresentative) case is problematical. Even a multiple-cases approach that uses common
standards to assess the performance of several presidents on important issues (Kellerman 1984) suffers from
questions about selection bias and interpretation of findings (see Bond and Fleisher 1990, 34-40).
Several studies developed quantitative measures of particular aspects of political skills based on
journalists’ assessments of presidential performance (Lockerbie and Borrelli 1989), presidential speeches (Fett
1994), activities to influence members’ votes documented in presidential archives (Covington 1987a, 1987b,
1988a, 1988b; Sullivan 1988, 1990, 1991), and agenda setting influence (Covington, Wrighton, and Kinney
1995). Although these studies are clever, their support for the Political Skills Hypothesis is mixed at best.
Moreover, questions about the validity of the measures, ambiguities interpreting the findings, and questions
about generalizability remain (see Bond and Fleisher 1990, 36-40; Fleisher, Bond, and Wood 2008).
Another approach seeks to finesse the intractable problem of measuring political skill and test the Political
Skills Hypothesis indirectly. Research on presidential-congressional relations shows that two variables—party
and public approval—systematically influence presidential success in Congress. These variables establish the
context within which presidential bargaining and persuasion take place. Regression models estimating the effects
of these contextual variables on presidential success establish a common baseline. The errors can be interpreted
as an estimate of wins above expectations—whether the president wins more or less often than should be
expected given the political context. Research in public administration has used regression errors as indirect
indicators of management skill, a similarly difficult construct to measure (Meier and O’Toole 2002). The error
term contains everything not explained by variables in the model. If the model is properly specified, the
unexplained variance will be random error. Errors that are not randomly distributed indicate that important
variables may have been omitted or that the model is misspecified. If political skill is as important to presidential
success as Neustadtian theory suggests, then the effects of this omitted variable should show up as non-random
variance in the errors.
Furthermore, theory together with evidence from qualitative case studies and close observers’ assessments
suggests specific hypotheses about patterns that we should observe in the residuals. These assessments, of
course, are not direct measures of political skill, but rather perceptions about a president’s “general reputation”
as more or less skilled. Neustadt’s concept of “professional reputation”, however, is based on perceptions of
Washingtonians. While every Washingtonian may not share the general view, there “usually is a dominant tone,
a central tendency, in Washington appraisals of a President” (Neustadt, 1960, 62). A review of the case-study
literature and assessments of close observers of presidential performance (Bond and Fleisher 1990, 198-204;
Edwards 2009, chap. 4) found consensus that Lyndon Johnson and Ronald Reagan were the most highly skilled
leaders since FDR, and that Richard Nixon and Jimmy Carter were relatively unskilled in their dealings with
Congress. Historical assessments of Eisenhower, Kennedy, and Ford are mixed. Reputations of presidents after
Reagan are still being formed. If perceived reputations of political skills have a strong effect on presidential
success rates, then:
1. Presidents reputed as highly skilled (Johnson and Reagan) should win significantly more often than the
baseline model predicts, and presidents reputed as unskilled (Nixon and Carter) should win significantly
less often; and
2. Since Neustadtian theory does not suggest that different skills are effective in the House and Senate, we
should observe similar patterns of wins above expectations in both chambers.
This approach, of course, is not a direct test of the effects of political skills. Rejecting the null hypothesis
would not mean that non-random variance in the residuals is due to unmeasured effects of political skills.
Exactly what unmeasured variables are in the error term is unknown, and non-random variance might result from
a number of causes. Yet failure to reject this null hypothesis would add to a growing accumulation of
quantitative evidence that actions of the president, “as one man among many,” can seldom overcome the
political and institutional context that sets the broad parameters of presidential success in Congress.
Several studies using slightly different model specifications analyzing different periods were unable to
reject the null hypothesis of no non-random variance in the residuals:
• The number of significant outliers was no greater than should be expected by chance;
• Johnson and Reagan, perceived to have exceptionally high political skills, did not win significantly
more often than should be expected given the conditions they faced, nor did reputedly unskilled
Nixon and Carter win significantly less often; and
• Winning more or less often in one chamber was not associated with winning more or less often in
the other chamber (Bond and Fleisher 1990, chap. 8; Cohen, Bond, and Fleisher 2013a; Fleisher
and Bond 1983, 1992; Fleisher, Bond, and Wood 2008).
So far, then, political science research has been unable to find even indirect evidence that supports the
Political Skills Hypothesis. Nevertheless, we believe the indirect test comparing actual wins against expected
wins offers the most systematic test of this key hypothesis. The logic of the scientific method is not to prove
what is true, but to disprove what is not true. The initial step is to try to reject the null hypothesis of no
relationship. Although previous studies have been unable to reject the null hypothesis regarding political skills,
we need to continue trying to replicate the results with models updated to include new data points and
specifications refined to incorporate new evidence about the determinants of presidential success. Moreover, we
need to look for different ways to estimate the baseline to measure wins above expectations.
Thus, we present results from two different quantitative approaches to estimating presidential wins above
expectations. First, we present results from an updated regression analysis of presidential success from 1953
through 2014, specified to incorporate the results of recent research showing that party polarization conditions
the effects of party control and presidential approval, and that the conditional relationships are different in the
House and Senate. Second, we adapt the sabermetric Pythagorean Expectations formula, used to estimate WAE
for baseball teams, to estimate WAE for presidents.
Regression Estimates of Presidential Wins Above Expectations
Party control and public approval are the key variables in previous baseline models.3 In addition, recent research
shows that rising party polarization over the last several decades has altered relationships between the president
and Congress. The effects of party polarization on presidential success, however, are conditional.
In particular, party polarization conditions the relationship between party control and presidential success.
The escalating number of cloture votes on presidential roll calls has changed the conditional effects in the
Senate. In the House governed by majority rule, party polarization amplifies the effects of party control—as
polarization increases, majority party presidents win more and minority presidents win less. In the Senate,
increasingly governed by a 60-vote decision rule, party polarization suppresses success rates—majority
presidents still win more than minority presidents do, but as polarization increases in the Senate, success rates
decline for both majority and minority presidents (Cohen, Bond, and Fleisher 2013a, 2013b).
The rise of cloture votes in recent decades (Binder and Smith 2001; Koger 2010) explains why
polarization suppresses majority presidents’ success rates in the Senate. Until the Clinton years, cloture votes
were less common on presidential roll calls than on all Senate votes. Two developments associated with party
polarization became apparent during the George W. Bush (hereafter Bush43) and Obama years: (1) cloture votes
on presidential roll calls nearly doubled—19% of Bush43 and Obama votes were on cloture compared to 10%
for Clinton; and (2) the “minority party filibuster” and 60-vote requirement to invoke cloture became a highly
effective tool that the minority party routinely used to block majority party presidents’ policies and nominees
(Bond, Fleisher, and Krutz 2009). Consequently, voting on cloture became much more partisan than on other
types of presidential roll calls: over 80% of Bush43 and Obama cloture votes had highly unified parties on
3 Some models included a dummy variable for honeymoon and a count variable to account for cycles over the course of the term. We drop
these controls because they added little explanatory power to the models and the effects are not consistent across chambers.
opposite sides. With highly polarized parties, majority presidents almost always lose these votes because they
favor invoking cloture over 98 percent of the time. Getting to 60 typically requires votes from the minority,
which the president is unlikely to get from a highly cohesive minority party. Minority party presidents, in
contrast, almost always win these votes because they oppose invoking cloture over 83 percent of the time—it’s a
whole lot easier to win if you only need 41 votes that you can get from your own caucus. To account for this
change, we exclude cloture votes from the calculation of presidential success rates in the Senate. Since cloture is
unique to the Senate, excluding these votes provides a mix of majority and supermajority votes similar to that in
the House over the entire period. This has almost no effect on success rates before Bush43, but it is a way to
estimate wins above expectations for Bush43 and Obama in the Senate relative to a baseline similar to that in the
House and for earlier presidents in the Senate.
There is also evidence that the effects of public approval are conditional on the level of party polarization
(Bond, Fleisher, and Wood 2003). An update through Obama’s first term shows that public approval is
conditional on party control as well as party polarization (Bond, Cohen, and Fleisher 2014).
The Updated Regression Model
We estimate the following OLS regression model for the House and Senate:
PSS=B0+B1(Ptycntl)+B2(Aprv)+B3(Plrzn)+B4(Ptycntl*Aprv)+B5(Plrzn*Ptycntl)+
B6(Plrzn*Aprv)+B7(Plrzn*Ptycntl*Aprv)
Where:
1. PSS=Presidential Success Score measured as the annual percentage of roll calls (Congressional
Quarterly, Inc. Annually 1953-2015) on which the president’s position prevailed, excluding consensus
wins (more than 90% supporting the president4) and excluding cloture votes in the Senate;
4 This differs from the 20 percent threshold used in previous work (Bond and Fleisher 1990; Fleisher and Bond 2000). The lower
threshold still excludes the most routine issues, but it increases the n slightly.
2. Ptycntl=Party control measured with a binary variable5 coded 1=President’s party has a majority;
0=President’s party in minority6;
3. Aprv=President’s job approval measured as the mean Gallup job approval rating for the year adjusted to
exclude “don’t know”/no-opinion responses (%Aprv/(%Aprv+%Disaprv)), centered on its mean7;
4. Plrzn=Party polarization measured as the mean distance between the parties (|%Dem yea -%Rep yea|)
on all roll calls in the year, excluding consensus votes (less than 10% in the minority), centered on its mean;
5. Ptycntl*Aprv=interaction of party control and approval;
6. Plrzn*Ptycntl=interaction of polarization and party control;
7. Plrzn*Aprv=interaction of polarization and approval;
8. Plrzn*Ptycntl*Aprv=interaction of all three conditional variables.
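The variable definitions above can be sketched in Python as follows. This is only an illustration of how the eight columns of the model (intercept, three main effects, and four interactions) might be assembled; the function names are hypothetical, and the three years of input values in the usage lines are made up rather than taken from the data.

```python
from statistics import mean


def adjusted_approval(pct_approve, pct_disapprove):
    """Approval share among respondents who expressed an opinion."""
    return pct_approve / (pct_approve + pct_disapprove)


def center(values):
    """Subtract the sample mean so that zero means 'average'."""
    m = mean(values)
    return [v - m for v in values]


def design_rows(ptycntl, aprv, plrzn):
    """One row per year: [1, Ptycntl, Aprv, Plrzn, Ptycntl*Aprv,
    Plrzn*Ptycntl, Plrzn*Aprv, Plrzn*Ptycntl*Aprv], matching B0..B7."""
    aprv_c = center(aprv)
    plrzn_c = center(plrzn)
    rows = []
    for p, a, z in zip(ptycntl, aprv_c, plrzn_c):
        rows.append([1.0, p, a, z, p * a, z * p, z * a, z * p * a])
    return rows


# Made-up inputs for three hypothetical years (not the paper's data).
approval = [adjusted_approval(54, 36), adjusted_approval(45, 45), adjusted_approval(36, 54)]
rows = design_rows([1, 0, 1], approval, [30.0, 40.0, 50.0])
for row in rows:
    print([round(v, 3) for v in row])
```

Any OLS routine could then regress the annual success score on these columns, once for the House and once for the Senate.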
Table 1 presents the results for the House and Senate. The models perform well, explaining 89 percent of
the variance in the House and 72 percent in the Senate with just three variables plus interactions. The Senate is
somewhat less predictable than the House. This result is consistent with previous research and with the
expectation that institutional features (statewide constituency, six-year term, and smaller size) and different rules
and traditions would make the Senate more individualistic and deliberative than the House. Including the
interactions significantly improves the fit in both chambers. With interaction terms, we need to interpret the joint
significance of all the elements of the interactions rather than significance of any particular coefficient. Since the
focus of this paper is on estimating wins above expectations, we do not discuss the substantive interpretation of
the interactions here except to note that the results are generally consistent with previous work.8 We turn now to
an analysis of the residuals.
5 Using a binary variable instead of percent of the president’s party throws out information. But theory suggests that the primary benefit
of majority status is control of institutional levers of power, and there is evidence that the continuous variable does not add significant
explanatory power over the majority/minority dichotomy (Bond, Fleisher, and Cohen 2012; Bond, Fleisher, and Wood 2003).
6 The party division in the Senate was a tie in 2001. With VP Cheney breaking the tie, Republicans organized the Senate. On June 6, Sen.
Jeffords (R-VT) switched to caucus with Democrats. Thus, Republicans held the majority from January-June 6, and Democrats were the
majority from June 7 to the end of the 107th Congress. Bush is coded as a minority president because there were more days and more
votes in 2001 when Republicans were the minority. Coding Bush as a majority president makes little difference in the results.
7 We center continuous variables—Aprv and Plrzn—on their means. The only effect is to shift the intercept, giving zero a substantive
interpretation—the effects at average approval and average polarization. The slopes, R2, and overall model significance are unchanged.
8 Marginal effects plots and a brief discussion interpreting the interactions are in the Appendix.
[Table 1 about here]
Diagnostic tests found no evidence of heteroskedasticity, autocorrelation, specification error, or omitted
variables.9 This evidence and the high R² indicate that this parsimonious model with just three variables (plus
interactions) is quite successful at modeling the fundamentals of presidential success in Congress. And we were
unable to reject the Political Skills null hypothesis: omitting this theoretically essential variable from
the model did not show up as non-random variance in the error term.
Regression Estimates of Presidential Wins Above Expectations
Since we are interested in estimating presidential wins above expectations each year in the House and Senate, we
plot the residuals to see which presidents won significantly more or less often than expected given the political
and institutional context. Studentized residuals provide a useful metric to identify significant outliers. Using a
95% confidence interval, absolute values greater than 2.017 would be considered unusual. We would expect one
in twenty to appear unusual by random chance. For our sample of 62 years, we would expect about 3.1 unusual
observations by random chance in each chamber.
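As an illustration of the outlier screen, here is a minimal Python sketch of externally studentized residuals. To stay short it uses a regression with a single predictor, whereas the model in this paper has eight coefficients; the logic of dividing each residual by an error estimate that leaves that observation out is the same. The function names and the sample data are hypothetical.

```python
from math import sqrt


def studentized_residuals(x, y):
    """Externally studentized residuals for y = b0 + b1*x (one predictor)."""
    n, p = len(x), 2  # p counts the intercept and the slope
    if n <= p + 1:
        raise ValueError("need more than three observations")
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    leverage = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
    sse = sum(e * e for e in resid)
    out = []
    for e, h in zip(resid, leverage):
        # Error variance estimated with this observation deleted.
        s2_deleted = (sse - e * e / (1.0 - h)) / (n - p - 1)
        out.append(e / sqrt(s2_deleted * (1.0 - h)))
    return out


def expected_outliers(n_obs, alpha=0.05):
    """Number of 'unusual' observations expected by chance alone."""
    return n_obs * alpha


# Hypothetical series with one obvious outlier at the fifth observation.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [2.1, 3.9, 6.2, 7.8, 20.0, 12.1, 13.9, 16.2, 17.8]
t = studentized_residuals(x, y)
flagged = [i for i, ti in enumerate(t) if abs(ti) > 2.017]
print(flagged, expected_outliers(62))
```

With 62 annual observations and a one-in-twenty chance rate, the chance expectation is 62 × 0.05 = 3.1 flagged years per chamber.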
Figure 1 plots studentized residuals in the House and Senate for each year. In the House, we find only two
outside the ±2.0 range—Nixon in 1971 (+2.86) and Ford in 1976 (-2.51)—though three more just miss with
absolute values of 1.9 (-1.9 for Bush41 in 1999 and Bush43 in 2004, and +1.9 for Clinton in 2000). In the
Senate, we find four residuals outside ±2.0 and two more at -1.99. Not only do we observe more outliers in the
Senate than should be expected by chance, but compared to the House, the pattern in the Senate appears less
random. Four of the large residuals are negative (Eisenhower in 1959, LBJ in 1966, Clinton in 1999, and Bush43
in 2004) while only two are positive (Bush43 in 2002 and 2008). Moreover, the Senate outliers appear early
or late in the time series: three of the six appear during Bush43, including both of the positive ones. Yet, the
diagnostics did not find significant evidence of anomalies in the behavior of the Senate residuals.
9 Cameron and Trivedi’s decomposition of IM-test (estat imtest) for heteroskedasticity (chi2 = 11.12 in the House model, 17.58 in the
Senate model) not significant; Durbin-Watson d-statistic (d=1.902 in House, 1.877 in the Senate) no evidence of positive or negative
autocorrelation; linktest revealed no evidence of specification error; and Ramsey (1969) RESET test (ovtest) revealed no evidence of
omitted variables (F(3, 51) = 0.37 in the House, 1.46 in the Senate).
[Figure 1 about here]
Some might view the ±2.0 standard to define “unusualness” of residuals as arbitrary. Not only do we
observe more significant outliers in the Senate than expected by chance, there are numerous years in both
chambers when a president’s actual wins deviated from expectations by a considerable amount. Before
concluding that the effects of presidential skills are intermittent and idiosyncratic, let’s employ the tried-and-
flawed “Interocular Test”—eyeball the plot, and if a pattern hits you between the eyes, it’s significant—to look
more closely for expected patterns.
In particular, qualitative assessments suggest that Johnson and Reagan are candidates for uncommonly
successful presidents, and Nixon and Carter are candidates for exceptionally unsuccessful presidents in their
dealings with Congress. Focusing only on whether residuals are positive or negative, however, we do not see
consistent patterns. Although Johnson exceeded expectations more often than not in the House, Reagan did not;
and in the Senate, both of these reputedly skilled presidents fell short of expectations as often as they exceeded
them. Consistent with his reputation as an unskilled leader, Carter’s success rate in the House fell short of
expectations in three of four years, but Nixon usually exceeded expectations, significantly so
in 1971. In the Senate, Nixon and Carter exceeded expectations as often as they fell short. Thus, we see no
consistent patterns for the four presidents for whom we have specific expectations. What’s more, we do not
see similar patterns in both chambers: the R² between House and Senate residuals is .03 and the slope of the
regression line is nearly flat.10
Another objection to this approach is that patterns in the residuals might be sensitive to model
specification. To test this possibility, we employ a comparative Interocular Test. Two earlier studies analyzed
studentized residuals using slightly different model specifications for different time periods. Fleisher, Bond, and
Wood (2008) used a model that included party control, public approval, party polarization, and honeymoon to
10 ResidSen = 0.16 (ResidHse)+ 0.011; R² = 0.026
analyze annual success rates from 1953 to 2001; Cohen, Bond and Fleisher (2012a) used a model with only party
control and polarization with interactions to analyze presidential success from 1953-2010. The models produced
similar, though not identical, results. Correlations between residuals from the current model and those from the earlier models are:
with the Fleisher, Bond, and Wood (2008) model: House R² = .59; Senate R² = .73;
with the Cohen, Bond, and Fleisher (2012a) model: House R² = .77; Senate R² = .74.
Figure 2 plots the studentized residuals from all three studies. We observe few years when the residuals are
substantially different across the different models (highly divergent residuals are enlarged). In the House,
Eisenhower’s estimate in 1960 changed from a significant positive outlier in the earlier studies to positive but far
from significant in the current model; Clinton’s estimate in 2000 appears as a significant positive outlier in the
current model but as close to zero in the earlier studies. In the Senate, Carter’s estimate in the current model is
negative, but well above -2.0; Clinton’s estimate in 2000 changes from clearly negative in earlier studies to
slightly above zero in the current model.11 One anomaly stands out in all three models—Nixon’s House success
rate in 1971 was much higher than expected (studentized residuals near +3.0). This result is surprising given the
conventional view that Nixon was unskilled in his interactions with Congress.
[Figure 2 about here]
This analysis ought to raise serious doubts about leadership skill as a systematic explanation of presidential
success on roll call votes in Congress. We are certainly not arguing that presidents and political skill do not
matter. There are numerous examples showing that how well the president played the game of Washington
politics changed the outcome in some really important cases. Explaining these unusual and important cases is
interesting, but science seeks generalizable relationships in a large representative sample of cases. Thus far, we
have been unable to find evidence that is generalizable. Let’s see if sabermetrics can provide additional leverage
on estimating presidents’ wins above expectations.
11 Bush-43’s estimate in 2012 is significantly below expectations in both studies, but just barely so in the current model.
Sabermetrics Estimates of Presidential Wins Above Expectations
Political scientists chase the god of statistical significance (anything more common than .05 will be rejected) in
search of theory to explain political phenomena. Sabermetricians chase the god of big (really big) data in search
of accurate predictions (a little post hoc rationalizing will do just fine thank you). With apologies to Alfonso
Bedoya in Treasure of the Sierra Madre, the sabermetrician’s attitude is: “Theory? We ain’t got no theory. We
don’t need no . . . stinkin’ theory!”
Pythagorean Expectations
Bill James’ (1982) Pythagorean Expectations formula is a shining example. It predicts Major League Baseball
teams’ winning percentages based solely on runs scored and runs allowed with a high degree of accuracy:
\[
\text{Win Percentage} = \frac{\text{runs scored}^{\gamma}}{\text{runs scored}^{\gamma} + \text{runs allowed}^{\gamma}}
\]
Central to James’ PE is the insight that runs scored and runs allowed in a given game reveal information about a
team’s play beyond whether the team won or lost the game. Each run scored or allowed implies a story of
success or failure that culminated in a win or loss, and so runs scored and runs allowed are better measures of
actual team performance—are the hitters scoring? Are the pitchers and fielders making outs?—than is simple
winning percentage. Over the course of a 162-game season for thirty Major League franchises, winning
percentage and PE converge quite closely. In James’ original formulation, the value of γ was set at 2 for the sake
of simplicity and because it fit the data reasonably well—hence the name “Pythagorean Expectation.” Since then,
the PE formula has been refined to arrive at γ =1.82 as the exponent that generates the most accurate predictions
(Miller 2007).12
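As an illustration only, the formula can be sketched in a few lines of Python. The function name and the run totals below are hypothetical and are not drawn from any data used in this paper; the two exponents are the values discussed in the text.

```python
def pythagorean_expectation(scored, allowed, gamma=2.0):
    """Expected winning share given totals scored and allowed.

    gamma=2.0 is James' original exponent; 1.82 is the refined value
    cited in the text. Both inputs must be non-negative and not both zero.
    """
    if scored < 0 or allowed < 0 or (scored == 0 and allowed == 0):
        raise ValueError("totals must be non-negative and not both zero")
    s = scored ** gamma
    a = allowed ** gamma
    return s / (s + a)


# Hypothetical team: 800 runs scored and 700 runs allowed over a season.
pe_james = pythagorean_expectation(800, 700)           # roughly 0.566
pe_refined = pythagorean_expectation(800, 700, 1.82)   # slightly closer to .500
expected_wins = 162 * pe_james                          # about 92 wins
```

A smaller exponent pulls every expectation toward .500, which is why the choice of γ matters most for teams with lopsided run totals.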
Of course, PE does not perfectly predict actual winning percentages, and sabermetricians hotly debate the
meaning of deviations from PE, or Wins Above Expectations (WAE). Put simply, WAE is a measure of the
games that a team won that it should have lost (positive WAE), or lost that it should have won (negative WAE),
12 See Miller (2007) for an excellent discussion of the mathematical theory underlying James’ PE formula.
based on PE. In practical terms, WAE measures close wins and losses accumulated over a season. For many,
WAE is merely noise that reflects the irreducible randomness of a baseball game played by humans under
varying conditions. For others, WAE is an indicator of quality of relief pitching, the ability to hit or pitch “in the
clutch” (i.e., in high-leverage situations), or—critically for our purposes—the quality of a team’s manager.13
WAE as a measure of managerial skill. The intuition behind WAE as a measure of managerial skill is that, over
the course of a season, a team’s PE is mostly a measure of the talent on its roster. A team with a roster of elite
athletes in the prime of their careers is expected to score lots of runs and allow few, with or without a great
manager. Just so, a team with slower, weaker, or injury-prone players is expected to score few runs and allow
many, regardless of what its manager does. WAE, then, is argued to measure how effectively a manager deploys
the talent at his disposal in close games.
Two championship seasons from the career of Hall of Fame manager Sparky Anderson provide a useful
comparison: the 1976 Cincinnati Reds and 1984 Detroit Tigers. Fresh off of a World Series championship the
previous year, the 1976 Reds were a juggernaut—the team’s roster boasted three future Hall of Famers (plus Pete
Rose, whose off-field transgressions have kept him from enshrinement). Dubbed the “Big Red Machine” in the
press, the team cruised to a 102-60 record, finishing ten games ahead of the second-place Dodgers, and went on
to sweep the Phillies and Yankees in the playoffs en route to a second consecutive World Series title. Although
baseball fans revere the 1976 Reds as one of the all-time great teams, its 102 wins were actually fewer than its
Pythagorean Expectation of 103-59, giving Sparky Anderson a -1.0 WAE (or -0.62 WAE%) for the season.
By contrast, the 1984 Tigers were coming off of a solid but unspectacular performance in 1983 with a
roster that included no Hall of Famers.14 Yet the 1984 Tigers led the American League all season, swept the
13 Darowski uses WAE to construct a baseball manager Hall of Fame. By WAE, Mike Scioscia of the Angels is the best manager in the
history of baseball. Bruce Bochy, Wilbert Robinson, Bobby Cox, and Felipe Alou round out the top five.
http://www.beyondtheboxscore.com/2012/3/28/2908044/manager-wins-above-expectancy
14 Co-author Teodoro believes that the 1984 Tigers’ Jack Morris is a Hall-of-Fame quality pitcher, and that the Baseball Writers
Association of America’s (BBWAA) failure to admit Morris to the Hall is both an injustice and an indictment of the BBWAA’s collective
judgment. Interested readers are encouraged to consider Morris’ career records and judge for themselves:
http://www.baseball-reference.com/players/m/morrija02.shtml.
Royals to win the American League pennant, and beat the Padres in five games to win the World Series. The
1984 Tigers’ 104-58 record exceeded their Pythagorean Expectation of 99-63 by five games, giving Anderson an
extraordinary 5.0 WAE and 3.09 WAE%. Not coincidentally, Anderson was named the 1984 American League
Manager of the Year.
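The arithmetic behind these two seasons can be checked with a short Python sketch. The function name is our own, for illustration; the win totals and Pythagorean expected wins are the ones quoted in the text.

```python
def wins_above_expectation(actual_wins, expected_wins, games):
    """Return (WAE, WAE%) for one season.

    WAE is actual wins minus Pythagorean expected wins;
    WAE% expresses that difference as a percentage of games played.
    """
    wae = actual_wins - expected_wins
    wae_pct = 100.0 * wae / games
    return wae, wae_pct


# 1976 Reds: 102 actual wins against 103 expected, in 162 games.
reds_wae, reds_wae_pct = wins_above_expectation(102, 103, 162)        # -1, about -0.62
# 1984 Tigers: 104 actual wins against 99 expected, in 162 games.
tigers_wae, tigers_wae_pct = wins_above_expectation(104, 99, 162)     # 5, about 3.09
```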
The point of the illustration is simple but profound. In 1976 the Reds were expected to crush the National
League with their talent-laden roster, so the team’s excellent record and championship offer little evidence of
Anderson’s managerial prowess; the 1984 Tigers were expected to be merely good, so the team’s dominant
performance that year suggests that Anderson’s management contributed to its success.
Presidential PE in Congress. Can the PE formula be adapted to predict presidential success in Congress? There
are a number of parallels between presidential-congressional relations and baseball. The president is analogous to
a manager. A roll call is analogous to a game that the president’s team plays. Votes supporting the president are
analogous to runs scored, and votes against the president are analogous to runs allowed in each “game.” A year is
analogous to a season, and House and Senate to different leagues. A president who serves when his party holds
large majorities in both houses of Congress is akin to a manager with an all-star roster in a weak division; we
would expect him to win most of his roll calls without trying very hard. A president who faces hostile majorities
in Congress is like a manager of a team full of poor hitters and soft-tossing pitchers who struggle to throw
strikes; under such conditions, any roll call wins at all might be evidence of great presidential political skill, even
if the president’s actual winning percentage is low.
Just as James’ PE formula recognized that runs scored and allowed convey information about team
performance that is not reflected in winning percentage, votes won and lost on presidential roll calls may carry
information about a president’s legislative skill that percentage of roll calls won does not. Each vote for or
against the president carries information about legislative politics, including partisanship, cohesion of the
majority party, and congressional organization. Rather than considering each of these variables that typically
populate the right-hand side of a regression equation separately and with particular causal theories in mind, the
PE approach aggregates votes over a year as a way of approximating the overall strength of a president’s
legislative position apart from any one roll call, just as the baseball PE measures the strength of a team’s roster
apart from any one game. Adapted to presidential-congressional politics, the formula for Pythagorean Expected
roll call winning percentage (PE%) is:
\[
PE\% = \frac{\text{votes for}^{\gamma}}{\text{votes for}^{\gamma} + \text{votes against}^{\gamma}}
\]
The difference between roll call winning percentage and PE% is percentage Wins Above Expectations (WAE%).
At the end of a year, a positive value indicates that the president has won more roll calls than expected, while a
negative value indicates that he has lost more than he was expected to lose. This percentage multiplied by the
number of roll calls on which the president took a position yields Wins Above Expectations (WAE). Just as some
baseball analysts use WAE to assess managerial skill, WAE might also yield important information about
presidents’ skill with legislative politics. As with baseball managers, WAE% offers a useful way to think about
presidents’ annual performance, while total WAE rewards performance across several years and so offers some
perspective on career accomplishments.
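To make the definitions concrete, the following Python sketch computes PE%, WAE%, and WAE for a single president-year. The function name and all vote and roll call totals are hypothetical, chosen only for illustration, and the exponent is left at 2 as in James' original formulation.

```python
def presidential_wae(votes_for, votes_against, roll_calls_won, roll_calls, gamma=2.0):
    """Return (PE%, WAE%, WAE) for one chamber in one year.

    votes_for / votes_against: total votes cast with and against the
    president's position, summed over all roll calls in the year.
    roll_calls_won / roll_calls: roll calls won and roll calls on which
    the president took a position.
    """
    pe_pct = 100.0 * votes_for ** gamma / (votes_for ** gamma + votes_against ** gamma)
    win_pct = 100.0 * roll_calls_won / roll_calls
    wae_pct = win_pct - pe_pct             # percentage points above expectation
    wae = wae_pct / 100.0 * roll_calls     # roll calls won above expectation
    return pe_pct, wae_pct, wae


# Hypothetical year: 11,000 votes for and 9,000 against across 100 roll calls,
# of which the president won 65.
pe_pct, wae_pct, wae = presidential_wae(11000, 9000, 65, 100)
# pe_pct is about 59.9, so wae_pct is about 5.1 and wae is about 5.1 roll calls.
```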
The distinction between WAE and WAE% highlights an important difference between baseball games and
roll call votes.15 Whether enjoying a winning season or suffering through a miserable one, every Major League
team plays 162 games. “You win a few, you lose a few, some get rained out,” Satchel Paige once observed. “But
you got to dress for all of them.” Presidents may win a few and lose a few, but they don’t have to dress for all of
them. Unlike ballplayers, presidents get to pick their “games” because they choose when to take a position on a
particular roll call. A president who is relatively active in the legislative arena may have a high WAE
despite a modest WAE%, while a president who takes few positions on roll calls may enjoy a very strong
WAE% but a low WAE. For example, if a president attempts to artificially inflate his success rate by avoiding
difficult roll calls and expressing a position only on those that are sure to go his way, the PE formula will
15 There are others. Roll call votes, for example, don’t involve the use of baseball bats, though Rep. Preston Brooks (D-SC) did attack
Sen. Charles Sumner (R-MA) with a cane on the Senate floor in 1856 (U.S. Senate n.d.
http://www.senate.gov/artandhistory/history/minute/The_Caning_of_Senator_Charles_Sumner.htm).
systematically underestimate his success rate. Although a president may occasionally engage in posturing to
avoid an embarrassing loss or endorse a sure winner, Peterson (1990) presents evidence that presidential
positions generally are sincere. And Covington (1987b) finds evidence that the president can sometimes increase
the chances that a roll call will go his way by “staying private.” These cases are also rare, but these “wins” do not
appear in the success rate because the president remained in the clubhouse. Nonetheless, we might expect
presidents to have positive overall WAE because they sometimes do not express positions on roll calls they
expect to lose (unlike baseball teams, whose aggregate WAE is close to zero each season).
Calculating presidential PE. Following standard sabermetric procedures, we used the presidential roll call vote
data described earlier to calculate PE, WAE% and WAE for presidents. Votes for and against the president’s
position were totaled for roll calls on which the president took a position for each year from 1953-2014. As in the
regression analysis, we exclude cloture roll calls in the Senate, as well as “consensus” roll calls where the
president won over 90 percent of the votes. Using these totals, we calculated PE for each year with γ set to 2
using James’ (1982) original formulation. The result was a striking .934 Pearson correlation between PE% and
actual winning percentage in the House and .907 in the Senate. These impressive correlations belied substantial
year-to-year error, so we followed the usual sabermetric practice of minimizing squared differences between
PE% and actual winning percentage. To do so, we applied a Generalized Reduced Gradient (GRG) iterative
algorithm to the PE% formula and roll call voting data with an aim of minimizing squared differences with γ
constrained to be greater than zero and less than 100. The GRG algorithm is an efficient, robust method for
optimizing nonlinear problems, as in the present analysis (Mantell and Lasdon 1978).16 Optimization yields γ
values of 3.44 in the House and 3.11 in the Senate. Consequently, the PE% formulae for presidents are:
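\[
\text{House } PE\% = \frac{\text{votes for}^{3.44}}{\text{votes for}^{3.44} + \text{votes against}^{3.44}}
\qquad
\text{Senate } PE\% = \frac{\text{votes for}^{3.11}}{\text{votes for}^{3.11} + \text{votes against}^{3.11}}
\]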
16 As a check of stability and robustness, we replicated the optimization procedure using the evolutionary/genetic stochastic
optimization algorithm recommended by Yeniay (2005). For both House and Senate, evolutionary optimization yielded a value of γ equal
to the GRG estimate to the seventh decimal place.
These formulae generate PE% that correlate .939 with actual winning percentages in the House and .823 in the
Senate.
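The idea of choosing γ to minimize squared differences between PE% and actual winning percentage can be sketched as follows. This is not the GRG procedure used in the analysis: the sketch substitutes a simple golden-section search, which assumes the squared-error function has a single minimum on the search interval. The data are synthetic, generated from a known exponent of 3.0 so that the recovered value can be checked, and all names are hypothetical.

```python
import math


def pe_share(votes_for, votes_against, gamma):
    """Pythagorean expected winning share, written as 1 / (1 + (against/for)^gamma)
    so that large exponents do not overflow. Requires votes_for > 0."""
    return 1.0 / (1.0 + (votes_against / votes_for) ** gamma)


def sum_squared_error(gamma, seasons):
    """Sum of squared differences between actual and expected winning share."""
    return sum((actual - pe_share(vf, va, gamma)) ** 2 for vf, va, actual in seasons)


def fit_gamma(seasons, lower=1e-6, upper=100.0, tol=1e-8):
    """Golden-section search for the gamma in (lower, upper) minimizing squared error."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = sum_squared_error(c, seasons), sum_squared_error(d, seasons)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = sum_squared_error(c, seasons)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = sum_squared_error(d, seasons)
    return (a + b) / 2


# Synthetic (votes for, votes against) totals; "actual" shares come from gamma = 3.0.
TRUE_GAMMA = 3.0
vote_totals = [(11000, 9000), (9500, 10500), (12000, 8000), (10200, 9800), (8800, 11200)]
seasons = [(vf, va, pe_share(vf, va, TRUE_GAMMA)) for vf, va in vote_totals]
gamma_hat = fit_gamma(seasons)  # recovers a value very close to 3.0
```

With real roll call data the fit would not be exact, and a general-purpose nonlinear optimizer would be the safer choice if the error surface had more than one local minimum.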
Results of Sabermetric Analysis of Presidential Wins Above Expectations
We used PE% for each year in each chamber to calculate WAE% and WAE. We also calculated career average
WAE% and WAE for each president by summing his annual WAE in each chamber.
Regression and Sabermetric Approaches Estimate Different Baselines. The regression models and the PE
formula predict presidential success rates with similarly high levels of accuracy—correlations around 0.94 in the
House and above 0.82 in the Senate. Despite highly accurate predictions, the WAE% from the two approaches
are uncorrelated (R² = 0.001 in the House and 0.078 in the Senate). Figure 3 shows annual WAE% for House and
Senate. We standardized PE WAE% to facilitate comparison to the studentized residuals from the regression
analysis.17 The patterns of PE WAE% are quite different from those estimated by studentized residuals. Whereas
the studentized residuals indicated that each president exceeded expectations in some years and fell short in
others, the PE estimates indicate that presidents were consistently above or below expectations across the years
of their administrations. Perhaps most striking is the partisan difference—except for Eisenhower, Republican
presidents generally exceeded expectations, while Democratic presidents usually fell short.18 In particular,
Reagan was considerably above the mean in both the House and Senate during all eight years. Clinton was below
the mean for all but one or two years of his presidency, and well below in the Senate. Obama exceeded
expectations in the House during his first two years, but fell far below the mean in the last four years.
[Figure 3 about here]
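The standardization of WAE% into z-scores can be illustrated with a brief Python sketch. The WAE% values below are hypothetical, and the use of the sample standard deviation is an assumption made for the example rather than a detail reported in the text.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert values to z-scores: deviation from the mean in standard-deviation units.

    Uses the sample standard deviation (an assumption for this sketch).
    """
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical annual WAE% values for one chamber.
wae_pct = [12.5, -4.0, 3.2, 8.1, -6.3, 1.9]
z_scores = standardize(wae_pct)
# A z-score of zero marks the mean WAE%, not a WAE% of zero.
```

Because the mean WAE% is positive, a president can sit below zero on the standardized scale while still winning slightly more roll calls than PE predicts.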
Sabermetric Analysis of Neustadt’s Skills Hypothesis. How does Neustadt’s Political Skills Hypothesis fare
with sabermetric WAE%? Do presidents who are renowned for legislative skill generate impressive WAE%? Do
17 Standardized WAE% (z-scores), of course, are not the same as residuals. Residuals from OLS regression have a mean of zero. The
means of WAE% are 3.22% in the House and 2.43% in the Senate with standard deviations of 7.84 and 7.85 respectively. Thus, this
figure indicates each president’s deviation above or below the mean WAE% for all 62 years.
18 Keep in mind that zero is the mean level of performance.
presidents whom historians regard as less skilled generate lackluster WAE%? And does WAE% in the House
correlate with WAE% in the Senate, as we would expect from the political skill hypothesis? Table 2 reports
career average WAE% and WAE for each president by summing his annual WAE in each chamber. Table 2 also
reports mean WAE% and total WAE (the sum of WAE for each chamber) for each president. For comparison to
a non-quantitative assessment, we also report each president’s “Relations with Congress” ranking from the 2010
Siena Presidential Expert Poll of presidential historians.19
[Table 2 about here]
Sabermetric analysis offers mixed evidence about the political skill of the reputed legislative giants
Lyndon Johnson and Ronald Reagan. Johnson’s career figures put him in the middle of the pack among post-WWII
presidents, with House (hWAE% = 2.9, hWAE = 11.4) and Senate (sWAE% = 1.6, sWAE = 7.9) averages
below the mean for all analyzed presidents. In contrast, Reagan’s career statistics make him by far the most
impressive legislative operator in the sample, ranking second in career WAE% and a walk-off first place in WAE
in both houses. Reagan’s career hWAE of 82.5 and sWAE of 51.7 are more than double his next closest rival’s.
How do we reconcile these differences between presidents’ historical reputations and their
sabermetric figures? Again, baseball offers an instructive metaphor. Since the emergence of sabermetrics, a new
generation of statistically trained, quantitatively oriented baseball analysts has clashed with traditional, “old
school” baseball writers over the interpretation of team, player, and manager performance (Lewis 2003).
Generally, sabermetricians seek objective measures of performance in terms of run production and run
prevention that account for differences in context, while traditionalists tend to focus on wins, losses, and the “eye
test” (visual, qualitative evaluation). Sometimes players and managers look better from one perspective than
another. Over the past twenty years, sabermetricians have come to dominate baseball analysis for hardcore fans
of the game, but the traditional baseball writers still stand as gatekeepers to the Hall of Fame, and so the
traditionalist imprimatur is still the standard by which the casual fan defines greatness in baseball.
19 To help get your students’ attention, Appendix B presents a set of presidential baseball cards that report detailed results for (eventually)
all eleven presidents.
The managerial careers of Joe Torre and Bobby Cox offer useful allegories to the legislative records of
presidents Johnson and Reagan. Torre and Cox are both widely considered all-time great managers, and both
were inducted into the Hall of Fame in 2014. Over his 29-year career, Torre won 2,326 games (.538 winning
percentage), six American League pennants, and four World Series titles; over his 29-year career, Cox won 2,504 games
(.556 winning percentage), five National League pennants, and one World Series. But their career WAE figures
are quite different: Cox ranks fourth all-time with a career WAE of 77.0, but Torre’s respectable 23.3 puts him
far behind at 39th on the all-time list. A close look at their records makes the reason clear: Torre’s most
successful years came with the New York Yankees between 1996 and 2007, when his roster included some of
the most fearsome hitters and pitchers in baseball. Torre’s Yankees won big, but they were supposed to win big
based on their enormous payroll and all-star laden roster. Cox spent nearly his entire managerial career with the
Atlanta Braves, winning consistently and routinely outperforming his PE.
Like Torre and Cox, Johnson and Reagan earned their reputations in very different ways. Johnson’s actual
legislative winning percentages are very high, but not much better than his very high PE. Johnson won most of
his legislative battles, including some extraordinary legislative “championships” with civil rights laws. But like
Joe Torre’s Yankees of the late-1990s and early-2000s, LBJ was expected to win with his strong partisan
majorities in both houses and a booming economy. Reagan’s actual winning percentages are fairly low (.420
House, .720 Senate), but like Bobby Cox, his actual record was consistently much higher than his PE,
particularly in the House. Torre, Cox, Johnson and Reagan were all successful and probably deserving of their
places in their respective Halls of Fame. But sabermetrics make a stronger case for Cox and Reagan, while
traditionalists would point to Johnson's landmark legislative achievements and Torre’s four World Series rings in
making their arguments for enshrinement.
On the other end of the reputational distribution are Richard Nixon and Jimmy Carter—reputedly the least
politically skilled presidents in our sample. Sabermetric assessment again yields mixed evidence for Neustadt’s
political skill hypothesis. Nixon and Carter both generated below-average WAE figures across the board, with
the exception of Nixon’s strong career sWAE (which is something of an oddity).20 But close examination of
Nixon’s and Carter’s records shows that their legislative records relative to PE were not especially strong or
weak. Although they occupy the bottom half of the standings, a more accurate description might be “mediocre”
rather than “bad.” In contrast, Dwight Eisenhower and Bill Clinton occupy the sabermetric cellar.
Table 2 shows that Siena’s expert rankings are uncorrelated with legislative success measured as WAE. In
fact, the Siena rankings of “Relations with Congress” correlate much more strongly with its nineteen other
categories (e.g., presidential “Background,” “Imagination,” “Intelligence,” and “Luck”) than with actual winning
percentage. This divergent assessment implies that presidential experts are applying their own adjustments to
presidents’ legislative records to account for context, but evidently not the same kinds of adjustments as those
implied by PE. As with sabermetricians and traditionalist sports writers, quantitative analytics and expert
qualitative assessments lead to very different conclusions about presidents’ legislative success.
Another striking finding that emerges from Table 2 is that Republican presidents outperform Democrats,
and it’s not close.21 Ike is in last place, but otherwise the top half of the distribution is all GOP. This may tell us
something about the cohesiveness of the GOP in Congress. Yet, the regression approach does not find any
systematic partisan advantage in WAE, so we need further investigation to determine why the PE formula
generates this result.
Finally, if Neustadt is correct that political skill drives legislative success, we should observe similar
patterns of WAE in both chambers. WAE offers support for this hypothesis. From 1953-2014, annual WAE% for
House and Senate correlate at .44 (p<.01). Perhaps more importantly for Neustadt’s understanding of presidential
power, the Spearman rank-order correlation for presidents’ career WAE% in House and Senate is .69 (p=.02). In
other words, each president’s ranking relative to other presidents is broadly similar in both houses.
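For readers unfamiliar with the rank-order statistic, the Spearman correlation is simply a Pearson correlation computed on ranks. The Python sketch below shows the calculation with hypothetical career WAE% values for five presidents; it does not reproduce the figures in Table 2, and the function names are our own.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical career WAE% in each chamber for five presidents.
house_wae_pct = [5.0, 1.0, 3.0, 9.0, -2.0]
senate_wae_pct = [4.0, 2.0, 1.0, 7.0, -1.0]
rho = spearman(house_wae_pct, senate_wae_pct)  # 0.9: only two presidents swap ranks
```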
20 Nixon’s career sWAE of 22.2 is mainly the result of an extraordinary sWAE of 20.9 in 1973 when he won several close votes en route
to a low .376 overall winning percentage. Nixon’s sWAE was actually negative for three of his six years in office.
21 This was an extraordinarily painful sentence to write. That’s the benefit and curse of science and statistics: it’s difficult for one’s
partisan biases to creep into the findings even subconsciously.
Discussion
This analysis shows that the Pythagorean Expectations formula predicts presidential success rates as well as fully
specified regression models. What PE extracts from the sum of votes supporting and votes opposing the
president, however, is different from theoretical explanations of presidential success—the interaction of party
control, public approval, and party polarization—included in our regression models. But as is the case with
regression residuals, we don’t know exactly what’s in the WAE%. Perhaps WAE is capturing something
important about crafting legislative coalitions. At the very least, we think sabermetrics may offer a new, WAE
cool direction for the study of presidents’ participation in legislative politics.
Although this paper shows that sabermetrics may be a useful tool to analyze roll call votes in legislatures,
we barely scratch the surface. The more important contribution of sabermetrics to baseball is estimating players’
Wins Above Replacement (WAR) to show how much each one contributes to his team’s success. The parallel
between members of the president’s party and baseball players is much less clear than the parallel between votes
(for and against) and runs (scored and allowed). Yet members of Congress do play different positions. Party and
committee leaders have resources and talents that might help determine whether the president wins or loses at the
margins. In addition, students of presidential-congressional relations have long recognized that some votes are
more important than are others. A somewhat subjective approach is to analyze a subset of “Key Votes” identified
by journalists (Congressional Quarterly, Inc. Annually 1953-2015; Edwards 1989, 2009). The sabermetrics
approach is to let the data identify the most important votes. Political scientists have proposed ways to estimate
the significance of roll call votes based on closeness and turnout (King 1986; Riker 1959; Yohe 1968). Perhaps
these methods could be adapted and used with PE to calculate WAR for Congress.
References
Aldrich, John H. 2011. Why Parties? A Second Look. Chicago: University of Chicago Press.
Aldrich, John H., and David W. Rohde. 2000. “The Consequences of Party Organization in the House: The Role of the
Majority and Minority Parties in Conditional Party Government.” In Polarized Politics: Congress and the President
in a Partisan Era, ed. Jon R. Bond and Richard Fleisher. Washington, DC: CQ Press.
Berry, William D., Matt Golder, and Daniel Milton. 2012. “Improving Tests of Theories Positing Interaction.” Journal
of Politics 74 (July): 653-671.
Binder, Sarah A., and Steven S. Smith. 2001. Politics Or Principle?: Filibustering in the United States Senate.
Washington, DC: Brookings Institution Press.
Bond, Jon R., and Richard Fleisher. 1980. The Limits of Presidential Popularity as a Source of Influence in the U.S.
House. Legislative Studies Quarterly, 5 (February): 69-78.
Bond, Jon R. and Richard Fleisher. 1990. The President in the Legislative Arena. Chicago: University of Chicago Press.
Bond, Jon R., Jeffrey E. Cohen, and Richard Fleisher. 2014. “Why Party Polarization Affects Presidential Success
Differently in the Senate and House: Testing for Conditional Effects of Party Control and Presidential Approval.”
Presented at the Annual Meeting of the American Political Science Association, Washington, DC, August 27 - 31,
2014.
Bond, Jon R., Richard Fleisher, and Jeffrey E. Cohen. 2015. “Presidential-Congressional Relations in an Era of Polarized
Parties and a 60-Vote Senate.” In James A. Thurber and Antoine Yoshinaka, eds., American Gridlock: Causes,
Characteristics, and Consequences of Polarization. (New York: Cambridge University Press, 2015).
Bond, Jon R., Richard Fleisher, and Jeffrey E. Cohen. 2012. “How Party Polarization Affects the Relationship between
Party Control and Presidential Success Differently in the House and Senate.” Presented at the Annual Meeting of
the Southern Political Science Association, New Orleans, January 12-14, 2012.
Bond, Jon R., Richard Fleisher, and Glen S. Krutz. 2009. “Malign Neglect: Evidence that Delay Has Become the Primary
Method of Defeating Presidential Appointments.” Congress & the Presidency 36: (No.3): 226-243.
Bond, Jon R., Richard Fleisher, and Michael Northrup. 1988. “Public Opinion and Presidential Support.” Annals, 499
(September): 47-63.
Bond, Jon R., Richard Fleisher, and B. Dan Wood. 2003. “The Marginal and Time Varying Effect of Public Approval
on Presidential Success in Congress.” Journal of Politics 65 (February): 92-110.
Cohen, Jeffrey, Jon R. Bond, and Richard Fleisher. 2013a. “Placing Presidential-Congressional Relations in Context: A
Comparison of Barack Obama and His Predecessors.” Polity 45 (Jan.): 105-126.
Cohen, Jeffrey, Jon R. Bond, and Richard Fleisher. 2013b. “The Implications of the 2012 Presidential Election for
Presidential-Congressional Relations: Change or More of the Same?” In Amnon Cavari, Richard J. Powell, and
Kenneth R. Mayer, eds., The 2012 Presidential Election: Forecasts, Outcomes, and Consequences (New York:
Routledge).
Congressional Quarterly, Inc. Annually 1953-2015. “Presidential Support.” Congressional Quarterly Almanac.
(Washington, DC: Congressional Quarterly, Inc.).
Covington, Cary R., J. Mark Wrighton, and Rhonda Kinney. 1995. “A Presidency-Augmented Model of Presidential
Success on House Roll Call Votes.” American Journal of Political Science 39 (November): 1001-24.
Covington, Cary R. 1987a. “Mobilizing Congressional Support for the President: Insights from the 1960s.” Legislative
Studies Quarterly 12 (February): 77-96.
Covington, Cary R. 1987b. “Staying Private: Gaining Congressional Support for Unpublicized Presidential Preferences
on Roll Call Votes.” Journal of Politics 49 (August): 737-55.
Covington, Cary R. 1988a. “Building Presidential Coalitions Among Cross-pressured Members of Congress.” Western
Political Quarterly 41 (March): 47-62.
Covington, Cary R. 1988b. “Guess Who’s Coming to Dinner: The Distribution of White House Social Invitations and
Their Effects on Congressional Support.” American Politics Quarterly 16 (July): 243-65.
Cox, Gary W., and Mathew D. McCubbins. 2005. Setting the Agenda: Responsible Party Government Theory in the
U.S. House of Representatives. New York: Cambridge University Press.
Davidson, Roger H. 1984. “The Presidency and Congress.” In The Presidency and the Political System, ed. Michael
Nelson. Washington, DC: CQ Press.
Downs, Anthony. 1957. An Economic Theory of Democracy. New York: Harper & Row.
Edwards, George C. III. 1989. At the Margins: Presidential Leadership of Congress. New Haven: Yale University Press.
Edwards, George C. III. 2009. The Strategic President: Persuasion and Opportunity in Presidential Leadership.
Princeton, NJ: Princeton University Press.
Fett, Patrick J. 1994. “Presidential Legislative Priorities and Legislators’ Voting Decisions: An Exploratory Analysis.”
Journal of Politics 56 (May): 502-12.
Fleisher, Richard, and Jon R. Bond. 1983. “Assessing Presidential Support in the House: Lessons from Reagan and
Carter.” Journal of Politics 45 (August): 745-58.
Fleisher, Richard, and Jon R. Bond. 1992. “Assessing Presidential Support in the House II: Lessons from George Bush.”
American Journal of Political Science 36 (May): 525-41.
Fleisher, Richard, and Jon R. Bond. 2000. “Partisanship and the President’s Quest for Votes on the Floor of Congress.”
In Polarized Politics: Congress and the President in a Partisan Era, ed. Jon R. Bond and Richard Fleisher.
Washington, DC: CQ Press, Chap. 8.
Fleisher, Richard, Jon R. Bond, and B. Dan Wood. 2008. “Which Presidents are Uncommonly Successful in Congress?”
In Presidential Leadership: The Vortex of Power, eds. Bert A. Rockman and Richard W. Waterman. New York:
Oxford University Press, pp. 191-214.
Jacobson, Gary C. 1990. The Electoral Origins of Divided Government. Boulder, CO: Westview.
Jacobson, Gary C., and Samuel Kernell. 1983. Strategy and Choice in Congressional Elections, 2nd ed. New Haven:
Yale University Press.
James, Bill. 1982. The Bill James Baseball Abstract, 1982. New York: Ballantine Books.
Kellerman, Barbara. 1984. The Political Presidency: Practice of Leadership from Kennedy through Reagan. New York:
Oxford University Press.
King, Gary. 1986. “The Significance of Roll Calls in Voting Bodies: A Model and Statistical Estimation.” Social Science
Research 15: 135-152.
Lee, Frances E. 2009. Beyond Ideology: Politics, Principles, and Partisanship in the U.S. Senate. Chicago: University
of Chicago Press.
Lewis, Michael. 2003. Moneyball: The Art of Winning an Unfair Game. New York: W.W. Norton.
Lockerbie, Brad and Stephen A. Borrelli. 1989. “Getting Inside the Beltway: Perceptions of Presidential Skill and
Success in Congress.” British Journal of Political Science 19 (January): 97-106.
Mantell, J. B. and Leon S. Lasdon. 1978. “A GRG Algorithm for Econometric Control Problems.” Annals of Economic
and Social Measurement 6(5): 581-597.
Meier, Kenneth J. and Laurence J. O’Toole Jr. 2002. “Public Management and Organizational Performance: The Effect
of Managerial Quality,” Journal of Policy Analysis and Management 21(4): 629-643.
Miller, Steven J. 2007. “A Derivation of the Pythagorean Won-Loss Formula in Baseball.” Chance 20(1): 40-48.
Neustadt, Richard E. 1960. Presidential Power: The Politics of Leadership. New York: Wiley.
Newey, W. and K. West. 1987. “A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent
Covariance Matrix.” Econometrica 55 (May): 703-708.
Peterson, Mark A. 1990. Legislating Together: Eisenhower to Reagan. Cambridge, MA: Harvard University Press.
Ramsey, J.B. 1969. “Tests for Specification Error in Classical Linear Least Squares Regression Analysis.” Journal of
the Royal Statistical Society. B31: 250-71.
Riker, William H. 1959. “A Method for Determining the Significance of Roll Calls in Voting Bodies.” In John C.
Wahlke and Heinz Eulau, eds., Legislative Behavior: A Reader in Theory and Research. Glencoe, IL: Free Press.
Rohde, David. 1991. Parties and Leaders in the Post Reform House. Chicago: University of Chicago Press.
Sullivan, Terry. 1988. “Headcounts, Expectations, and Presidential Coalitions in Congress.” American Journal of
Political Science 32 (August): 567-89.
Sullivan, Terry. 1990. “Explaining Why Presidents Count: Signaling and Information.” Journal of Politics 52 (August):
939-62.
Sullivan, Terry. 1991. “The Bank Account Presidency: A New Measure of Evidence on the Temporal Path of
Presidential Influence.” American Journal of Political Science 35 (August): 686-723.
U.S. Senate n.d. “Senate Stories.”
http://www.senate.gov/artandhistory/history/minute/The_Caning_of_Senator_Charles_Sumner.htm
Yeniay, Ozgur. 2005. “A Comparative Study on Optimization Methods for the Constrained Nonlinear Programming
Problems.” Mathematical Problems in Engineering 2: 165-173.
Yohe, William P. 1968. “Riker's Method for Assessing the Significance of Roll Call Votes.” Public Choice 4 (Spring):
59-66.
Tables and Figures
Table 1
Conditioning Effects of Party Control, Polarization, and Public Approval on Presidential Success in Congress

                                        House        Senate
                                        Coef.        Coef.
Party control                           0.447***     0.288***
                                        (20.20)      (10.30)
Approval                               -0.064**      0.478***
                                        (-0.52)      (3.29)
Polarization                           -1.033***    -0.170**
                                        (-9.01)      (-1.00)
Party control*Approval                  0.267#**    -0.418**
                                        (1.77)       (-2.35)
Polarization*Party control              1.365***     0.423***
                                        (9.27)       (2.16)
Polarization*Approval                  -2.385***    -0.819**
                                        (-3.39)      (-0.70)
Polarization*Party control*Approval     2.256***    -0.894**
                                        (2.04)       (-0.62)
Constant                                0.343***     0.508**
                                        (20.83)      (22.62)
N                                       62           62
F(7, 54)                                87.63        24.16
Prob > F                                0.000        0.000
R2                                      0.893        0.724

Entries are OLS regression coefficients estimated with Stata 13 with robust standard errors (t-test in parentheses).
***p<.001, **p<.01, *p<.05, #p<.10
Table 2
Career Sabermetric Wins Above Expectations

                  House            Senate         Mean     Total    Actual Mean   2010 Siena Poll Rank
President      WAE%     WAE     WAE%     WAE      WAE%      WAE        Win %      “Relations w/ Cong”
Reagan         12.0    82.5      9.0    51.7      10.5     134.2       56.8               5
Bush43          8.8    33.9      7.4    21.2       8.1      55.1       67.3              32
Ford           13.4    19.7     12.7    19.3      13.0      39.0       47.6              17
Bush41          6.1    24.9      5.4    13.2       5.7      38.1       45.4              23
Nixon           2.4     8.3      1.6    22.2       2.0      30.5       60.1              36
Johnson         2.9    11.4      1.6     7.9       2.3      19.3       79.2               1
Carter          2.8    13.3     -1.0     1.5       0.9      14.8       73.1              39
Kennedy         1.6     2.4      1.4     4.8       1.5       7.2       82.3              13
Obama          -6.7   -34.8      5.9    19.0      -0.4     -15.9       66.7              18
Clinton        -0.4    -6.2     -5.4   -14.7      -2.9     -21.0       50.9              25
Eisenhower     -2.0    -5.5     -3.3   -15.5      -2.6     -27.0       65.9              10
Mean            3.7    13.6      3.2    11.9       3.5      25.5       63.4
Entries listed in descending order of career Wins Above Expectations. Republican presidents in bold; Democratic
presidents in italics.
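As a reading aid, the sketch below shows how the combined columns of Table 2 appear to relate to the chamber columns: the mean WAE% looks like the simple average of the House and Senate WAE%, and the total WAE looks like the sum of the House and Senate WAE. This relationship is inferred from the printed values for the three rows used here and is an assumption, not a definition stated in the table. The names `ROWS` and `combine` are hypothetical.

```python
# Chamber-level career figures copied from Table 2 for three presidents:
# (House WAE%, House WAE, Senate WAE%, Senate WAE)
ROWS = {
    "Reagan": (12.0, 82.5, 9.0, 51.7),
    "Nixon": (2.4, 8.3, 1.6, 22.2),
    "Kennedy": (1.6, 2.4, 1.4, 4.8),
}


def combine(house_pct, house_wae, senate_pct, senate_wae):
    """Return (mean WAE%, total WAE) under the assumed column definitions."""
    mean_pct = (house_pct + senate_pct) / 2   # assumed: simple two-chamber average
    total_wae = house_wae + senate_wae        # assumed: sum of chamber wins
    return round(mean_pct, 1), round(total_wae, 1)


if __name__ == "__main__":
    for name, row in ROWS.items():
        print(name, combine(*row))
```

Because the printed entries are rounded to one decimal, values recomputed this way can differ from the table in the last digit for other rows.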
Figure 2
Interocular Test: Comparison of WAEs from Three Different Studies
Panels: Bond & Teodoro 2015; Cohen, Bond, & Fleisher 2013a; Fleisher, Bond, and Wood 2008
Appendix A: Marginal Effects Plots of House and Senate Models
When testing for conditional effects, it is common to designate one of the variables as the conditioning variable. Such a
designation is inappropriate because the effects of interaction terms are symmetrical: “when the effect of X on Y is
conditional on the value of Z, the effect of Z must be conditional on the value of X.” Marginal effects plots are useful to show
the symmetrical effects of interactions (Berry, Golder, and Milton 2012). Our models test for conditional effects of two
continuous variables (polarization and approval) and a dichotomous variable (party control). Showing the effects of the two
continuous independent variables requires a three-dimensional plot with polarization on the x-axis, approval on the z-axis,
and the dependent variable (presidential success rate) on the y-axis. The plane in the three-dimensional space represents the
estimated presidential success rate at each level of polarization and approval. Because party control is a dichotomy, the
effects of polarization and approval conditional on party control could, in principle, be represented by two planes (one for
minority presidents and one for majority presidents) in the same three-dimensional plot, but two planes plotted in the same
space would add complexity with no gain in understanding. We therefore present separate three-dimensional plots for
minority and majority presidents. To see more clearly how party control conditions the effects of polarization and approval,
we also show relationships for majority and minority presidents in two-dimensional plots from two perspectives: the effects of
polarization conditional on approval, and the effects of approval conditional on polarization. These graphs plot cross-sections
of the three-dimensional plots at low, average, and high levels of approval and polarization. They do not show the contours in
the three-dimensional plots, but they show more precisely how the slopes change under different conditions.
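The symmetry argument can be made concrete with a minimal sketch of a three-way interaction specification. It plugs in the House coefficients as printed in Table 1; how approval and polarization are scaled in the estimated model is not restated here, so any input values passed to these functions are hypothetical, and the sketch is not claimed to reproduce the plotted success rates. All function and dictionary names are illustrative.

```python
# House coefficients as printed in Table 1 (scaling of inputs is an assumption).
HOUSE = {
    "const": 0.343,
    "pc": 0.447,          # party control (0 = minority, 1 = majority)
    "app": -0.064,        # approval
    "pol": -1.033,        # polarization
    "pc_app": 0.267,      # party control * approval
    "pol_pc": 1.365,      # polarization * party control
    "pol_app": -2.385,    # polarization * approval
    "pol_pc_app": 2.256,  # polarization * party control * approval
}


def predicted_success(b, pc, app, pol):
    """Fitted success rate from the fully interacted linear model."""
    return (b["const"] + b["pc"] * pc + b["app"] * app + b["pol"] * pol
            + b["pc_app"] * pc * app + b["pol_pc"] * pol * pc
            + b["pol_app"] * pol * app + b["pol_pc_app"] * pol * pc * app)


def effect_of_polarization(b, pc, app):
    """Slope of success on polarization; it depends on party control and approval."""
    return b["pol"] + b["pol_pc"] * pc + b["pol_app"] * app + b["pol_pc_app"] * pc * app


def effect_of_approval(b, pc, pol):
    """Slope of success on approval; it depends on party control and polarization."""
    return b["app"] + b["pc_app"] * pc + b["pol_app"] * pol + b["pol_pc_app"] * pc * pol
```

The same interaction coefficients appear in both slope functions, which is the symmetry in question: the terms that make the polarization slope depend on approval are the ones that make the approval slope depend on polarization.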
Conditional Effects in the House
Appendix Figure 1 shows the conditional effects of party control, polarization, and presidential approval in the House. The
three-dimensional plots show the conditioning effects of polarization and approval on success rates of minority presidents
(Panel 1a) and majority presidents (Panel 1b). To show relationships for majority and minority presidents on the same graph,
Panel 1c plots the effects of party polarization on success when approval is low, average, or high. At average
approval, as polarization increases, the probability of winning increases for majority presidents and declines for minority
presidents. Specifically, when polarization is low (around -.30), success rates are the same for majority and minority
presidents. But when polarization is high (around .30), success rates are about 71% for majority presidents compared to 51%
for minority presidents. Majority presidents benefit from increasing polarization at all levels of approval. The effects of
polarization on minority presidents’ success rates are negligible at low approval but strongly negative at high approval.
Looking at the effects of approval conditional on party polarization (Panel 1d), we see that rising popularity has small positive
effects on majority presidents’ success rates at all levels of polarization. Minority presidents, on the other hand, seem to benefit from
rising public approval only if polarization is low; the effects of rising approval are slightly negative at average polarization
and strongly negative at high polarization. The theory underlying the popularity hypothesis does not anticipate a negative
relationship. We speculate that the strong effects of party control and polarization may swamp the smaller effects of approval.
If minority presidents overestimate the benefits of public approval and make fewer concessions to the majority, the error in
judgment might account for the negative relationship. And when parties in Congress are highly polarized, the majority may
ignore a rise in approval because they know that public opinion is also polarized. If so, increases in public approval come
mainly from the president’s own partisans, votes that majority party members are not going to get anyway.
[Appendix A Figure 1 about here]
Appendix Figure 1
Conditional Effects of Party Control, Polarization & Approval on Presidential Success in the House
Conditional Effects in the Senate
In the Senate the interactions between polarization and approval are not significant. This suggests that the only conditional
effects are between party control and polarization, and between party control and approval (see Appendix Figure 2). The effects of
party polarization on presidential success are similar to those in the House: as polarization increases, majority presidents win more
and minority presidents win less (panel 2c). The slopes of the lines are less steep than in the House, indicating that effects of
party are smaller in the Senate. Panel 2d shows the effects of approval conditional on party control. The relationships differ
from those observed in the House. In the Senate, minority presidents benefit from increased public approval, but majority
presidents do not. We find no significant conditional effects between party polarization and approval, which means that the
slopes of the lines are not conditional on the level of polarization. Notice, however, that high party polarization increases the
success rate of majority presidents and decreases the success rate of minority presidents.
[Appendix A Figure 2 about here]
Appendix Figure 2
Conditional Effects of Party Control, Polarization & Approval on Presidential Success in the Senate
Appendix B: Presidential Baseball Cards
DWIGHT D. EISENHOWER | President, Republicans
Ike’s 1953 rookie legislative season was his best, with winning percentages of .897 in the House and .865 in the Senate.

              House                            Senate
Season   RC    W%      WAE%     WAE      RC    W%      WAE%     WAE
1953     29   0.897    0.025    0.07     44   0.864    0.035    1.55
1954     29   0.724   -0.041   -1.19     66   0.742   -0.005   -0.33
1955     31   0.516   -0.011   -0.33     33   0.758   -0.066   -2.19
1956     23   0.609   -0.086   -1.99     56   0.625   -0.050   -2.80
1957     54   0.537   -0.061   -3.27     42   0.714   -0.075   -3.17
1958     40   0.675   -0.030   -1.21     85   0.729   -0.024   -2.02
1959     49   0.510   -0.018   -0.88    101   0.406   -0.033   -3.37
1960     41   0.634    0.065    2.65     75   0.600   -0.042   -3.12
Career  296   0.622   -0.019   -6.15    502   0.647   -0.031  -15.45
JOHN F. KENNEDY | Democrats | Pres.
Kennedy was on pace for an outstanding 1963 in the House with .054 WAE% when his death cut the season short.

             House                         Senate
Season   RC    W%    WAE%    WAE     RC    W%    WAE%    WAE
1961     55   .800   .020    1.12   115   .791   .026    2.99
1962     50   .820  -.025   -1.25   109   .844   .021    2.25
1963     47   .830   .054    2.52    90   .856  -.005   -0.41
Career  152   .816   .016    2.39   314   .828   .015    4.83
LYNDON JOHNSON | President
Johnson won the Voting Rights Act and Immigration & Nationality Act on his way to a career-high 4.46 hWAE and 5.25 sWAE in 1965.

             House                         Senate
Season   RC    W%    WAE%    WAE     RC    W%    WAE%    WAE
1963     13   .692  -.017   -0.23    13  1.000   .069    0.90
1964     41   .854   .077    3.14   189   .915   .021    4.05
1965     85   .906   .052    4.46   138   .913   .038    5.25
1966     74   .838   .084    6.25    91   .604  -.049   -4.49
1967     91   .648  -.034   -3.08   117   .735  -.009   -1.05
1968     70   .757   .013    0.88   129   .636   .025    3.22
Career  374   .789   .031   11.41   677   .790   .012    7.89
Richard Nixon | PRESIDENT | REPUBLICANS
Four of Nixon’s 5½ seasons were lackluster, but 1973 was a blockbuster for Tricky Dick with +4.04 hWAE on 111 hRC and a whopping +20.87 sWAE on 141 sRC in the Senate.

             HOUSE                         SENATE
YEAR     RC    W%    WAE%    WAE     RC    W%    WAE%    WAE
1969     40   .675   .065    2.62    59   .712  -.001   -0.08
1970     43   .744   .021    0.90    59   .559   .049    2.90
1971     49   .796   .008    0.38    72   .653  -.019   -1.34
1972     31   .774   .026    0.82    31   .419  -.134   -4.16
1973    111   .414   .036    4.04   141   .376   .148   20.87
1974     44   .614  -.011   -0.48    73   .479   .055    4.04
CAREER  318   .770   .021    8.28   435   .513   .051   22.22