
A Sabermetric Test of Neustadt’s Skills Hypothesis

or

Why Ronald Reagan is like Bobby Cox and Lyndon Johnson is like Joe Torre

Jon R. Bond

Texas A&M University

[email protected]

and

Manny Teodoro


[email protected]

Prepared for Presentation at the 73rd Annual Meeting of the

Midwest Political Science Association

Chicago, Illinois

April 16-19, 2015


Abstract

A Sabermetric Test of Neustadt’s Skills Hypothesis*

or

Why Ronald Reagan is like Bobby Cox and Lyndon Johnson is like Joe Torre

Jon R. Bond


And

Manny Teodoro


This paper presents a new test of Neustadt’s Political Skills Hypothesis. We believe an indirect test comparing actual

wins against expected wins offers the most systematic test of this key hypothesis. We update and replicate findings of

Wins Above Expectations (WAE) from the traditional analysis of residuals from multiple regression. Then we adapt the

Pythagorean Expectations formula from sabermetrics to estimate WAE based on votes supporting and votes opposing

the president. Both approaches predict presidential success rates with high accuracy across all years and presidents. The

WAEs from the two approaches, however, are uncorrelated. The two approaches also identify different presidents as

above and below expectations. This suggests that the two approaches model different aspects of the underlying data

generation process. More research is needed to sort out what all this means.

*We are grateful to the Department of Political Science at Texas A&M University for supporting this research.


A Sabermetric Test of Neustadt’s Skills Hypothesis

“The other teams could make trouble for us if they win.”

Yogi Berra

Presidential-congressional relations, like baseball, is all about winning. What the manager of a baseball team

needs most from his players is runs: his hitters must score runs, his pitchers and defenders must prevent them.

What the president needs most from members of Congress is their votes: he must get members to vote for his

favored legislation and against legislation he opposes. But important as winning is, what professional observers

(sports geeks and political scientists) really find most intriguing is Wins Above Expectations (WAE)—did the

team do better or worse than expected? If a team wins more games than expected, was it due to an advantageous

set of circumstances, simple “luck,” or did the manager’s skill and leadership produce wins when the team

should have lost? Using different statistical methods, sabermetricians and political scientists have attempted to

answer these questions for their respective fields.1

This paper applies sabermetric methods to analyze presidential success in Congress from Eisenhower to

Obama. Specifically, we adapt Bill James’ (1982) Pythagorean Expectations (PE) formula, which estimates how

many games a baseball team should have won based only on the number of runs scored and runs allowed, to

estimate WAE for presidents. If a roll call vote on which the president expresses a position is analogous to a

game, and votes for and against the president are analogous to runs scored and runs allowed, can the PE formula

predict presidential success rates over the course of a season (year) as accurately as it does for professional

baseball teams? If so, how well do WAE from PE correlate with residuals from multiple regression models used

by political scientists to identify presidents who won significantly more or less often than predicted (Bond and

Fleisher 1990, chap. 8; Cohen, Bond, and Fleisher 2013a; Fleisher and Bond 1983, 1992; Fleisher, Bond, and

1 Bill James named his approach to analyzing baseball “sabermetrics” in honor of the Society for American Baseball Research (SABR).


Wood 2008)? What are the lifetime WAEs of presidents from Eisenhower to Obama? Which presidents, if any,

stand out as uncommonly successful or unsuccessful?
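
As a point of reference, the baseball form of the Pythagorean Expectations formula can be sketched as follows. The sketch uses James's original exponent of 2. Treating votes for and against the president as the analogue of runs follows the analogy described above, but the function names, the invented vote totals, and the decision to keep the exponent at 2 for roll call votes are illustrative assumptions rather than the specification estimated in this paper.

```python
def pythagorean_expectation(scored: float, allowed: float, exponent: float = 2.0) -> float:
    """Expected winning share from totals scored and allowed.

    Baseball form: scored**k / (scored**k + allowed**k), with k = 2 in
    James's original formula. The exponent is left adjustable because a
    different value may fit roll call data better (an assumption to test).
    """
    if scored < 0 or allowed < 0:
        raise ValueError("totals must be non-negative")
    if scored == 0 and allowed == 0:
        raise ValueError("at least one total must be positive")
    s, a = scored ** exponent, allowed ** exponent
    return s / (s + a)


def wins_above_expectation(actual_share: float, scored: float, allowed: float,
                           exponent: float = 2.0) -> float:
    """Actual winning share minus the Pythagorean expectation."""
    return actual_share - pythagorean_expectation(scored, allowed, exponent)


# Hypothetical "season": votes for and against the president's positions,
# summed over all roll calls in a year (numbers invented for illustration).
votes_for, votes_against = 26_000, 22_000
expected = pythagorean_expectation(votes_for, votes_against)
wae = wins_above_expectation(0.60, votes_for, votes_against)
print(f"expected success rate: {expected:.3f}, WAE: {wae:+.3f}")
```

A positive WAE in this sketch means the president won a larger share of roll calls than the vote totals alone would predict.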

The paper begins with a discussion of the fundamentals of presidential success in Congress. Political science

research finds that a regression model with three independent variables—party control, presidential approval, and

party polarization—and interactions successfully models the fundamentals of presidential success on roll call

votes in Congress. We update this model through 2014 (Obama’s sixth year), and use the residuals to estimate

WAE relative to this updated baseline. Next, we adapt the Pythagorean Expectations (PE) approach to predict

presidential success rates based only on votes won and lost, and present WAE relative to this parsimonious

atheoretical baseline. We find that both approaches predict presidential success rates with high accuracy across

all years and presidents. WAE from the two approaches, however, are uncorrelated. This suggests that the two

approaches model different aspects of the underlying data generation process.

The Fundamentals of Presidential-Congressional Relations

Richard Neustadt is rightly considered the dean of the behavioral study of the presidency. His message is clear

and compelling. To understand the presidency, study presidential behavior—what the president “can do, as one

man among many, to carry his own choices through that maze of personalities and institutions called the

government of the United States” (Neustadt 1960, i). Although he did not present a formal theory or use the

language of the scientific method, his insightful analysis generated hypotheses that subsequent generations of

political scientists set about testing. The three that have received the most attention include:

1. The Political Parties Hypothesis (for which he expected null results): “What the Constitution separates our

political parties do not combine” (Neustadt 1960, 33);

2. The Public Approval Hypothesis: ceteris paribus, popular presidents are more successful than are

unpopular presidents (chap. 5); and

3. The Political Skills Hypothesis: ceteris paribus, presidents who are highly skilled at bargaining and

persuasion will be uncommonly successful while unskilled presidents will be unsuccessful (chap. 4).


As a test of the Neustadtian theory of presidential power2, the Political Skills Hypothesis is clearly the most

crucial because it focuses directly on what the president does to maximize his influence. With respect to public

approval, Neustadt (1960, 87) did not expect a strong “one-to-one relationship”. Rather, he viewed public

approval “as a conditioner, not the determinant” of presidential success. The Political Parties Hypothesis is the

least important test. Neustadt expected null results, but rejecting the null hypothesis does not necessarily

undermine the theory.

What evidence has 40 years of scientific research testing these hypotheses produced? Overall, the

correlation between the importance of the hypothesis to the theory and empirical support for it is -1.0.

Political Parties

Political science research has found strong support for the Political Parties Hypothesis. Large-n quantitative

studies consistently reject the null hypothesis that parties do not provide a bridge between the president and

Congress. Indeed, we find that party is by far the strongest determinant of presidential success in Congress—

support from members of the president’s party is statistically and substantively higher than that of members of

the opposition (Edwards 1989), and the president wins significantly more roll calls if his party has a majority

(Bond and Fleisher 1990). Although Neustadt’s observation about the weakness of American parties was

accurate, the pattern of robust party effects held during the period of his analysis—mid-twentieth century—when

parties were at their weakest. Rising party polarization in Congress from the mid-1980s to the present has

magnified the effects of party, though important chamber differences have emerged (Bond, Cohen, and Fleisher

2014, 2015; Bond, Fleisher, and Cohen 2012; Cohen, Bond, and Fleisher 2013a, 2013b).

We have sound theoretical reasons to explain the strong influence of party on presidential support in

Congress. Most obvious is simple arithmetic—majority presidents have more members on the floor with

incentives to support their positions than do minority presidents. Because members of the same political party

2 Neustadt’s analysis, of course, is not a scientific theory, as we understand it in contemporary political science. Yet, the generation of

testable hypotheses is arguably the most important thing that we want our theories to do.


must satisfy similar electoral coalitions, they have a wide range of policy preferences in common. Consequently,

majority presidents win more because they have more members on the floor who agree with their proposals.

Furthermore, because the president’s co-partisans must run for reelection on his record as well as their own, they

have an electoral incentive to help him succeed. Thus, simple arithmetic explains much of why majority

presidents win more votes in Congress than do minority presidents.

Yet, arithmetic is not the only, or even most important, reason why majority presidents win more floor votes

than do minority presidents. A political party is more than just a bundle of shared preferences. More important than

numbers is control of key levers of power in Congress. Political parties have long been conceived of as competing

teams (Downs 1957). Like sports teams, political parties are institutions with rules and procedures that channel and

constrain behavior (Aldrich 2011). The goal of party institutions is to win—win elections and enact policies favored

by party members. As a result, parties in Congress regularly take opposing positions on issues that have nothing to

do with policy or ideology, just because they don’t want the other team to win (Lee 2009), and to score points for the

next election. Majority party agenda control is a central component of the two leading partisan theories of

Congress—Conditional Party Government (CPG) theory (Aldrich and Rohde 2000; Rohde 1991) and Cartel Theory

(Cox and McCubbins 2005). When it comes to winning or losing roll call votes, then, the main advantage of majority

party control is that the president’s team controls key levers of power in Congress—committees, access to the floor,

rules governing debate, and which amendments are considered. Hence, the issues on the agenda and the presentation

of choices to members tend to reflect the president’s preferences when his party controls the chamber (Covington,

Wrighton, and Kinney 1995). These institutional advantages explain why party was such a strong influence on

presidential success even when parties were ideologically diverse in the mid-twentieth century.

Public Approval

Evidence on the Public Approval Hypothesis is mixed but broadly consistent with Neustadt’s expectations.

Although some studies reported insignificant or negative relationships (Bond, Cohen, and Fleisher 2014; Bond

and Fleisher 1980; Bond, Fleisher, and Northrup 1988), research typically finds that public approval has a

statistically significant but substantively marginal effect on support from members of Congress (Edwards 1989)


and on presidential success rates (Bond and Fleisher 1990). Furthermore, research that tests for interactions

shows that the effects of public approval are indeed conditional on party polarization and party control (Bond,

Cohen, and Fleisher 2014; Bond, Fleisher, and Wood 2003) as Neustadt suggested.

Given the importance of public support in a democracy, why does public approval of the president have

only a marginal effect on success rates? After all, if members of Congress want to get reelected (and they do),

rational actors should take public approval into account when deciding whether to support the president. As

Neustadt (1960, 86) explained, “Dependent men must take account of popular reactions to their actions. What

their publics may think of them becomes a factor, therefore, in deciding how to deal with the desires of a

President. His prestige enters into that decision; their publics are part of his.”

Closer reflection, however, suggests two reasons why presidential popularity might have marginal—or in

some cases negative—effects on presidential success. First, popular presidents may overestimate the benefits of

public approval and be less willing to compromise. An unwillingness to compromise is likely to lead to increased

party unity on floor votes, and hence, more support from the president’s party and less support from the

opposition. Second is the question of credit and blame. Few voters have specific information about how often

their representative supports the president. Rather, the Strategic Politicians Theory indicates that the effects of

presidential popularity in congressional elections are indirect. If the economy is booming and the president is

popular, the president’s party will recruit a disproportionate number of politically experienced, well-financed

candidates to challenge vulnerable incumbents in the opposition party; if the president is unpopular, the opposing

party will recruit more high quality challengers (Jacobson 1989, 1990; Jacobson and Kernell 1983). Because

high quality challengers run vigorous campaigns that cast the incumbent in a negative light, the president’s co-

partisans tend to receive blame for unpopular policies even if they do not support them, and members of the

opposition are unlikely to receive credit even if they support popular policies. As a result, members of Congress

tend to follow their basic partisan predispositions—opposition partisans may oppose popular presidents because

they have little to gain from their support and much to lose if the president succeeds; co-partisans can’t run away

from an unpopular president, so they might be better off to help him succeed.


Political Skills

Thus far, political science research has been unable to uncover convincing evidence supporting the core

hypothesis of Neustadt’s argument—that highly skilled presidents are uncommonly successful. The Political

Skills Hypothesis is central to Neustadt’s thesis, but it’s also the most difficult to test, mainly because of

theoretical ambiguity about how to define and measure political skills, particularly with the small number of

cases. Unlike party and public approval, we have been unable to identify valid and reliable quantitative measures

of political skills. Supporting evidence comes mainly from in-depth case studies. But generalizing findings from

a single (and unrepresentative) case is problematical. Even a multiple-cases approach that uses common

standards to assess the performance of several presidents on important issues (Kellerman 1984) suffers from

questions about selection bias and interpretation of findings (see Bond and Fleisher 1990, 34-40).

Several studies developed quantitative measures of particular aspects of political skills based on

journalists’ assessments of presidential performance (Lockerbie and Borrelli 1989), presidential speeches (Fett

1994), activities to influence members’ votes documented in presidential archives (Covington 1987a, 1987b,

1988a, 1988b; Sullivan 1988, 1990, 1991), and agenda setting influence (Covington, Wrighton, and Kinney

1995). Although these studies are clever, their support for the Political Skills Hypothesis is mixed at best.

Moreover, questions about the validity of the measures, ambiguities interpreting the findings, and questions

about generalizability remain (see Bond and Fleisher 1990, 36-40; Fleisher, Bond, and Wood 2008).

Another approach seeks to finesse the intractable problem of measuring political skill and test the Political

Skills Hypothesis indirectly. Research on presidential-congressional relations shows that two variables—party

and public approval—systematically influence presidential success in Congress. These variables establish the

context within which presidential bargaining and persuasion take place. Regression models estimating the effects

of these contextual variables on presidential success establish a common baseline. The errors can be interpreted

as an estimate of wins above expectations—whether the president wins more or less often than should be

expected given the political context. Research in public administration has used regression errors as indirect

indicators of management skill, a similarly difficult construct to measure (Meier and O’Toole 2002). The error


term contains everything not explained by variables in the model. If the model is properly specified, the

unexplained variance will be random error. Errors that are not randomly distributed indicate that important

variables may have been omitted or that the model is misspecified. If political skill is as important to presidential

success as Neustadtian theory suggests, then the effects of this omitted variable should show up as non-random

variance in the errors.

Furthermore, theory together with evidence from qualitative case studies and close observers’ assessments

suggests specific hypotheses about patterns that we should observe in the residuals. These assessments, of

course, are not direct measures of political skill, but rather perceptions about a president’s “general reputation”

as more or less skilled. Neustadt’s concept of “professional reputation”, however, is based on perceptions of

Washingtonians. While every Washingtonian may not share the general view, there “usually is a dominant tone,

a central tendency, in Washington appraisals of a President” (Neustadt 1960, 62). A review of the case-study

literature and assessments of close observers of presidential performance (Bond and Fleisher 1990, 198-204;

Edwards 2009, chap. 4) found consensus that Lyndon Johnson and Ronald Reagan were the most highly skilled

leaders since FDR, and that Richard Nixon and Jimmy Carter were relatively unskilled in their dealings with

Congress. Historical assessments of Eisenhower, Kennedy, and Ford are mixed. Reputations of presidents after

Reagan are still being formed. If perceived reputations of political skills have a strong effect on presidential

success rates, then:

1. Presidents reputed as highly skilled (Johnson and Reagan) should win significantly more often than the

baseline model predicts, and presidents reputed as unskilled (Nixon and Carter) should win significantly

less often; and

2. Since Neustadtian theory does not suggest that different skills are effective in the House and Senate, we

should observe similar patterns of wins above expectations in both chambers.

This approach, of course, is not a direct test of the effects of political skills. Rejecting the null hypothesis

would not mean that non-random variance in the residuals is due to unmeasured effects of political skills.

Exactly what unmeasured variables are in the error term is unknown, and non-random variance might result from


a number of causes. Yet failure to reject this null hypothesis would add to a growing accumulation of

quantitative evidence that actions of the president, “as one man among many,” can seldom overcome the

political and institutional context that set the broad parameters of presidential success in Congress.

Several studies using slightly different model specifications analyzing different periods were unable to

reject the null hypothesis of no non-random variance in the residuals:

The number of significant outliers was no greater than should be expected by chance;

Johnson and Reagan, perceived to have exceptionally high political skills, did not win significantly

more often than should be expected given the conditions they faced, nor did reputedly unskilled

Nixon and Carter win significantly less often; and

Winning more or less often in one chamber was not associated with winning more or less often in

the other chamber (Bond and Fleisher 1990, chap. 8; Cohen, Bond, and Fleisher 2013a; Fleisher

and Bond 1983, 1992; Fleisher, Bond, and Wood 2008).

So far, then, political science research has been unable to find even indirect evidence that supports the

Political Skills Hypothesis. Nevertheless, we believe the indirect test comparing actual wins against expected

wins offers the most systematic test of this key hypothesis. The logic of the scientific method is not to prove

what is true, but to disprove what is not true. The initial step is to try to reject the null hypothesis of no

relationship. Although previous studies have been unable to reject the null hypothesis regarding political skills,

we need to continue trying to replicate the results with models updated to include new data points and

specifications refined to incorporate new evidence about the determinants of presidential success. Moreover, we

need to look for different ways to estimate the baseline to measure wins above expectations.

Thus, we present results from two different quantitative approaches to estimating presidential wins above

expectations. First, we present results from an updated regression analysis of presidential success from 1953

through 2014, specified to incorporate the results of recent research showing that party polarization conditions

the effects of party control and presidential approval, and that the conditional relationships are different in the


House and Senate. Second, we adapt the sabermetric Pythagorean Expectations formula, used to estimate WAE

for baseball teams, to estimate WAE for presidents.

Regression Estimates of Presidential Wins Above Expectations

Party control and public approval are the key variables in previous baseline models.3 In addition, recent research

shows that rising party polarization over the last several decades has altered relationships between the president

and Congress. The effects of party polarization on presidential success, however, are conditional.

In particular, party polarization conditions the relationship between party control and presidential success.

The escalating number of cloture votes on presidential roll calls has changed the conditional effects in the

Senate. In the House governed by majority rule, party polarization amplifies the effects of party control—as

polarization increases, majority party presidents win more and minority presidents win less. In the Senate,

increasingly governed by a 60-vote decision rule, party polarization suppresses success rates—majority

presidents still win more than minority presidents do, but as polarization increases in the Senate, success rates

decline for both majority and minority presidents (Cohen, Bond, and Fleisher 2013a, 2013b).

The rise of cloture votes in recent decades (Binder and Smith 2001; Koger 2010) explains why

polarization suppresses majority presidents’ success rates in the Senate. Until the Clinton years, cloture votes

were less common on presidential roll calls than on all Senate votes. Two developments associated with party

polarization became apparent during the George W. Bush (hereafter Bush43) and Obama years: (1) cloture votes

on presidential roll calls nearly doubled—19% of Bush43 and Obama votes were on cloture compared to 10%

for Clinton; and (2) the “minority party filibuster” and 60-vote requirement to invoke cloture became a highly

effective tool that the minority party routinely used to block majority party presidents’ policies and nominees

(Bond, Fleisher, and Krutz 2009). Consequently, voting on cloture became much more partisan than on other

types of presidential roll calls: over 80% of Bush43 and Obama cloture votes had highly unified parties on

3 Some models included a dummy variable for honeymoon and a count variable to account for cycles over the course of the term. We drop

these controls because they added little explanatory power to the models and the effects are not consistent across chambers.


opposite sides. With highly polarized parties, majority presidents almost always lose these votes because they

favor invoking cloture over 98 percent of the time. Getting to 60 typically requires votes from the minority,

which the president is unlikely to get from a highly cohesive minority party. Minority party presidents, in

contrast, almost always win these votes because they oppose invoking cloture over 83 percent of the time—it’s a

whole lot easier to win if you only need 41 votes that you can get from your own caucus. To account for this

change, we exclude cloture votes from the calculation of presidential success rates in the Senate. Since cloture is

unique to the Senate, excluding these votes provides a mix of majority and supermajority votes similar to that in

the House over the entire period. This has almost no effect on success rates before Bush43, but it is a way to

estimate wins above expectations for Bush43 and Obama in the Senate relative to a baseline similar to that in the

House and for earlier presidents in the Senate.
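
The effect of dropping cloture votes on a yearly success rate can be illustrated with a minimal sketch. The `RollCall` record, its field names, and the ten invented votes are hypothetical and stand in for the Congressional Quarterly roll call data; the sketch handles only the cloture exclusion, not the other exclusions applied to the success score.

```python
from dataclasses import dataclass


@dataclass
class RollCall:
    president_won: bool  # did the president's position prevail?
    is_cloture: bool     # was this a vote on invoking cloture?


def success_rate(votes, exclude_cloture=True):
    """Percentage of roll calls on which the president's position prevailed."""
    counted = [v for v in votes if not (exclude_cloture and v.is_cloture)]
    if not counted:
        raise ValueError("no roll calls left after exclusions")
    return 100.0 * sum(v.president_won for v in counted) / len(counted)


# Invented Senate year for a majority president: 8 ordinary votes (6 won)
# and 2 cloture votes (both lost to a unified minority).
year = (
    [RollCall(True, False)] * 6
    + [RollCall(False, False)] * 2
    + [RollCall(False, True)] * 2
)
print(success_rate(year, exclude_cloture=False))  # 60.0, all ten votes
print(success_rate(year))                         # 75.0, cloture votes dropped
```

In this invented year the two lost cloture votes pull the success rate down from 75 to 60 percent, which is the kind of distortion the exclusion is meant to remove.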

There is also evidence that the effects of public approval are conditional on the level of party polarization

(Bond, Fleisher, and Wood 2003). An update through Obama’s first term shows that public approval is

conditional on party control as well as party polarization (Bond, Cohen, and Fleisher 2014).

The Updated Regression Model

We estimate the following OLS regression model for the House and Senate:

PSS=B0+B1(Ptycntl)+B2(Aprv)+B3(Plrzn)+B4(Ptycntl*Aprv)+B5(Plrzn*Ptycntl)+

B6(Plrzn*Aprv)+B7(Plrzn*Ptycntl*Aprv)

Where:

1. PSS=Presidential Success Score measured as the annual percentage of roll calls (Congressional

Quarterly, Inc., annually 1953-2015) on which the president’s position prevailed, excluding consensus

wins (more than 90% supporting the president4) and excluding cloture votes in the Senate;

4 This differs from the 20 percent threshold used in previous work (Bond and Fleisher 1990; Fleisher and Bond 2000). The lower

threshold still excludes the most routine issues, but it increases the n slightly.


2. Ptycntl=Party control measured with a binary variable5 coded 1=President’s party has a majority;

0=President’s party in minority6;

3. Aprv=President’s job approval measured as the mean Gallup job approval rating for the year adjusted to

exclude “don’t know”/no opinion responses (%Aprv/(%Aprv+%Disaprv)), centered on its mean7;

4. Plrzn=Party polarization measured as the mean distance between the parties (|%Dem yea -%Rep yea|)

on all roll calls in the year, excluding consensus votes (less than 10% in the minority), centered on its mean;

5. Ptycntl*Aprv=interaction of party control and approval;

6. Plrzn*Ptycntl=interaction of polarization and party control;

7. Plrzn*Aprv=interaction of polarization and approval;

8. Plrzn*Ptycntl*Aprv=interaction of all three conditional variables.
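
The construction of the regressors defined above can be sketched as follows. This is a minimal illustration of how the approval adjustment, mean-centering, and interaction terms fit together. The function names are hypothetical, the coefficients are placeholders rather than the estimates in Table 1, and the estimation step itself (OLS) is omitted.

```python
def adjusted_approval(pct_approve, pct_disapprove):
    """Approval as a share of respondents with an opinion (no-opinion dropped)."""
    return 100.0 * pct_approve / (pct_approve + pct_disapprove)


def center(values):
    """Subtract the mean so that zero corresponds to the average year."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def design_row(ptycntl, aprv, plrzn):
    """Regressors in the order B0..B7 of the model above.

    ptycntl is 0/1; aprv and plrzn are assumed to be mean-centered already.
    """
    return [
        1.0,                     # B0: intercept
        ptycntl,                 # B1: party control
        aprv,                    # B2: approval
        plrzn,                   # B3: polarization
        ptycntl * aprv,          # B4: party control x approval
        plrzn * ptycntl,         # B5: polarization x party control
        plrzn * aprv,            # B6: polarization x approval
        plrzn * ptycntl * aprv,  # B7: three-way interaction
    ]


def predicted_success(coefs, ptycntl, aprv, plrzn):
    """Fitted success score for one chamber-year given eight coefficients."""
    row = design_row(ptycntl, aprv, plrzn)
    return sum(b * x for b, x in zip(coefs, row))


def regression_wae(actual_pss, coefs, ptycntl, aprv, plrzn):
    """Residual: actual success score minus the model's prediction."""
    return actual_pss - predicted_success(coefs, ptycntl, aprv, plrzn)


# Placeholder coefficients, NOT the estimates reported in Table 1.
coefs = [50.0, 25.0, 0.2, -0.1, 0.1, 0.5, 0.0, 0.0]
aprv = center([adjusted_approval(55, 40), adjusted_approval(45, 50)])
print(predicted_success(coefs, ptycntl=1, aprv=aprv[0], plrzn=0.0))
```

Because approval and polarization are centered, the intercept and the party control coefficient in such a setup describe a year with average approval and average polarization.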

Table 1 presents the results for the House and Senate. The models perform well, explaining 89 percent of

the variance in the House and 72 percent in the Senate with just three variables plus interactions. The Senate is

somewhat less predictable than the House. This result is consistent with previous research and with the

expectation that institutional features (statewide constituency, six-year term, and smaller size) and different rules

and traditions would make the Senate more individualistic and deliberative than the House. Including the

interactions significantly improves the fit in both chambers. With interaction terms, we need to interpret the joint

significance of all the elements of the interactions rather than significance of any particular coefficient. Since the

focus of this paper is on estimating wins above expectations, we do not discuss the substantive interpretation of

the interactions here except to note that the results are generally consistent with previous work.8 We turn now to

an analysis of the residuals.

5 Using a binary variable instead of percent of the president’s party throws out information. But theory suggests that the primary benefit

of majority status is control of institutional levers of power, and there is evidence that the continuous variable does not add significant

explanatory power over the majority/minority dichotomy (Bond, Fleisher, and Cohen 2012; Bond, Fleisher, and Wood 2003).

6 The party division in the Senate was a tie in 2001. With VP Cheney breaking the tie, Republicans organized the Senate. On June 6, Sen.

Jeffords (R-VT) switched to caucus with Democrats. Thus, Republicans held the majority from January-June 6, and Democrats were the

majority from June 7 to the end of the 107th Congress. Bush is coded as a minority president because there were more days and more

votes in 2001 when Republicans were the minority. Coding Bush as a majority president makes little difference in the results.

7 We center continuous variables—Aprv and Plrzn—on their means. The only effect is to shift the intercept, giving zero a substantive

interpretation—the effects at average approval and average polarization. The slopes, R2, and overall model significance are unchanged.

8 Marginal effects plots and a brief discussion interpreting the interactions are in the Appendix.


[Table 1 about here]

Diagnostic tests found no evidence of heteroskedasticity, autocorrelation, specification error, or omitted

variables.9 This evidence and the high R2 indicate that this parsimonious model with just three variables (plus

interactions) is quite successful at modeling the fundamentals of presidential success in Congress. It also means that

we were unable to reject the null hypothesis for Political Skills: if omitting this theoretically essential variable

mattered, it should show up as non-random variance in the error term, and the diagnostics find none.

Regression Estimates of Presidential Wins Above Expectations

Since we are interested in estimating presidential wins above expectations each year in the House and Senate, we

plot the residuals to see which presidents won significantly more or less often than expected given the political

and institutional context. Studentized residuals provide a useful metric to identify significant outliers. Using a

95% confidence interval, absolute values greater than 2.017 would be considered unusual. We would expect one

in twenty to appear unusual by random chance. For our sample of 62 years, we would expect about 3.1 unusual

observations by random chance in each chamber.
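
The chance benchmark can be made concrete with a short sketch. With 62 annual observations and a one-in-twenty chance per observation, the expected count is 62 × 0.05 = 3.1. The binomial tail probability below goes one step further and asks how likely it is to see at least k outliers by chance alone. It treats the yearly residuals as independent draws, which is a simplifying assumption, and the function names are hypothetical.

```python
from math import comb


def expected_outliers(n_obs, alpha=0.05):
    """Expected number of observations flagged as unusual by chance."""
    return n_obs * alpha


def prob_at_least(k, n_obs, alpha=0.05):
    """P(X >= k) for X ~ Binomial(n_obs, alpha), assuming independent years."""
    return sum(
        comb(n_obs, j) * alpha ** j * (1 - alpha) ** (n_obs - j)
        for j in range(k, n_obs + 1)
    )


n_years = 62
print(f"expected by chance: {expected_outliers(n_years):.1f}")
for k in (2, 4, 6):
    print(f"P(at least {k} outliers) = {prob_at_least(k, n_years):.3f}")
```

Comparing the observed count of outliers in each chamber against these tail probabilities gives a rough sense of whether the count is surprising under the null of purely random residuals.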

Figure 1 plots studentized residuals in the House and Senate for each year. In the House, we find only two

outside the ±2.0 range—Nixon in 1971 (+2.86) and Ford in 1976 (-2.51)—though three more just miss with

absolute values of 1.9 (-1.9 for Bush41 in 1999 and Bush43 in 2004, and +1.9 for Clinton in 2000). In the

Senate, we find four residuals outside ±2.0 and two more at -1.99. Not only do we observe more outliers in the

Senate than should be expected by chance, but compared to the House, the pattern in the Senate appears less

random. Four of the large residuals are negative (Eisenhower in 1959, LBJ in 1966, Clinton in 1999, and Bush43

in 2004) while only two are positive (Bush43 in 2002 and 2008). Moreover, the Senate outliers appear early

or late in the time series: three of the six appear during Bush43, including both of the positive ones. Yet, the

diagnostics did not find significant evidence of anomalies in the behavior of the Senate residuals.

9 Cameron and Trivedi’s decomposition of IM-test (estat imtest) for heteroskedasticity (chi2 = 11.12 in the House model, 17.58 in the

Senate model) not significant; Durbin-Watson d-statistic (d=1.902 in House, 1.877 in the Senate) no evidence of positive or negative

autocorrelation; linktest revealed no evidence of specification error; and Ramsey (1969) RESET test (ovtest) revealed no evidence of

omitted variables (F(3, 51) = 0.37 in the House, 1.46 in the Senate).

[Figure 1 about here]

Some might view the ±2.0 standard to define “unusualness” of residuals as arbitrary. Not only do we

observe more significant outliers in the Senate than expected by chance, there are numerous years in both

chambers when a president’s actual wins deviated from expectations by a considerable amount. Before

concluding that the effects of presidential skills are intermittent and idiosyncratic, let’s employ the tried-and-

flawed “Interocular Test”—eyeball the plot, and if a pattern hits you between the eyes, it’s significant—to look

more closely for expected patterns.

In particular, qualitative assessments suggest that Johnson and Reagan are candidates for uncommonly

successful presidents, and Nixon and Carter are candidates for exceptionally unsuccessful presidents in their

dealings with Congress. Focusing only on whether residuals are positive or negative, however, we do not see

consistent patterns. Although Johnson exceeded expectations more often than not in the House, Reagan did not;

and in the Senate, both of these reputedly skilled presidents fell short of expectations as often as they exceeded

them. Consistent with his reputation as an unskilled leader, Carter’s success rate in the House fell short of

expectations in three of four years, but Nixon usually exceeded expectations and was significantly above them

in 1971. In the Senate, Nixon and Carter exceeded expectations as often as they fell short. Thus, we see no

consistent patterns for the four presidents for whom we have specific expectations. What’s more, we do not see

similar patterns in both chambers: the R² between House and Senate residuals is .03 and the slope of the

regression line is nearly flat.10

Another objection to this approach is that patterns in the residuals might be sensitive to model

specification. To test this possibility, we employ a comparative Interocular Test. Two earlier studies analyzed

studentized residuals using slightly different model specifications for different time periods. Fleisher, Bond, and

Wood (2008) used a model that included party control, public approval, party polarization, and honeymoon to

10 ResidSen = 0.16 (ResidHse)+ 0.011; R² = 0.026

analyze annual success rates from 1953 to 2001; Cohen, Bond and Fleisher (2012a) used a model with only party

control and polarization with interactions to analyze presidential success from 1953-2010. The models produced

similar, though not identical, results. Correlations between residuals from this model with others are:

with the Fleisher, Bond, and Wood (2008) model: House R² = .59; Senate R² = .73;

with the Cohen, Bond, and Fleisher (2012a) model: House R² = .77; Senate R² = .74.

Figure 2 plots the studentized residuals from all three studies. We observe few years when the residuals are

substantially different across the different models (highly divergent residuals are enlarged). In the House,

Eisenhower’s estimate in 1960 changed from a significant positive outlier in the earlier studies to positive but far

from significant in the current model; Clinton’s estimate in 2000 appears as a significant positive outlier in the

current model but as close to zero in the earlier studies. In the Senate, Carter’s estimate in the current model is

negative, but well above -2.0; Clinton’s estimate in 2000 changes from clearly negative in earlier studies to

slightly above zero in the current model.11 One anomaly stands out in all three models—Nixon’s House success

rate in 1971 was much higher than expected (Studentized residuals near +3.0). This result is surprising given the

conventional view that Nixon was unskilled in his interactions with Congress.


This analysis ought to raise serious doubts about leadership skill as a systematic explanation of presidential

success on roll call votes in Congress. We are certainly not arguing that presidents and political skill do not

matter. There are numerous examples showing that how well the president played the game of Washington

politics changed the outcome in some really important cases. Explaining these unusual and important cases is

interesting, but science seeks generalizable relationships in a large representative sample of cases. Thus far, we

have been unable to find evidence that is generalizable. Let’s see if sabermetrics can provide additional leverage

on estimating presidents’ wins above expectations.

11 Bush-43’s estimate in 2012 is significantly below expectations in both studies, but just barely so in the current model.

Sabermetrics Estimates of Presidential Wins Above Expectations

Political scientists chase the god of statistical significance (anything more common than .05 will be rejected) in

search of theory to explain political phenomena. Sabermetricians chase the god of big (really big) data in search

of accurate predictions (a little post hoc rationalizing will do just fine thank you). With apologies to Alfonso

Bedoya in Treasure of the Sierra Madre, the sabermetrician’s attitude is: “Theory? We ain’t got no theory. We

don’t need no . . . stinkin’ theory!”

Pythagorean Expectations

Bill James’ (1982) Pythagorean Expectations formula is a shining example. It predicts Major League Baseball

teams’ winning percentages based solely on runs scored and runs allowed with a high degree of accuracy:

$$\text{Win Percentage} = \frac{\text{runs scored}^{\gamma}}{\text{runs scored}^{\gamma} + \text{runs allowed}^{\gamma}}$$

Central to James’ PE is the insight that runs scored and runs allowed in a given game reveal information about a

team’s play beyond whether the team won or lost the game. Each run scored or allowed implies a story of

success or failure that culminated in a win or loss, and so runs scored and runs allowed are better measures of

actual team performance—are the hitters scoring? Are the pitchers and fielders making outs?—than is simple

winning percentage. Over the course of a 162-game season for thirty Major League franchises, winning

percentage and PE converge quite closely. In James’ original formulation, the value of γ was set at 2 for the sake

of simplicity and because it fit the data reasonably well—hence the name “Pythagorean Expectation.” Since then,

the PE formula has been refined to arrive at γ =1.82 as the exponent that generates the most accurate predictions

(Miller 2007).12

Of course, PE does not perfectly predict actual winning percentages, and sabermetricians hotly debate the

meaning of deviations from PE, or Wins Above Expectations (WAE). Put simply, WAE is a measure of the

games that a team won that it should have lost (positive WAE), or lost that it should have won (negative WAE),

12 See Miller (2007) for an excellent discussion of the mathematical theory underlying James’ PE formula.

based on PE. In practical terms, WAE measures close wins and losses accumulated over a season. For many,

WAE is merely noise that reflects the irreducible randomness of a baseball game played by humans under

varying conditions. For others, WAE is an indicator of quality of relief pitching, the ability to hit or pitch “in the

clutch” (i.e., in high-leverage situations), or—critically for our purposes—the quality of a team’s manager.13

WAE as a measure of managerial skill. The intuition behind WAE as a measure of managerial skill is that, over

the course of a season, a team’s PE is mostly a measure of the talent on its roster. A team with a roster of elite

athletes in the prime of their careers is expected to score lots of runs and allow few, with or without a great

manager. Just so, a team with slower, weaker, or injury-prone players is expected to score few runs and allow

many, regardless of what its manager does. WAE, then, is argued to measure how effectively a manager deploys

the talent at his disposal in close games.

Two championship seasons from the career of Hall of Fame manager Sparky Anderson provide a useful

comparison: the 1976 Cincinnati Reds and 1984 Detroit Tigers. Fresh off of a World Series championship the

previous year, the 1976 Reds were a juggernaut—the team’s roster boasted three future Hall of Famers (plus Pete

Rose, whose off-field transgressions have kept him from enshrinement). Dubbed the “Big Red Machine” in the

press, the team cruised to a 102-60 record, finishing ten games ahead of the second-place Dodgers, and went on

to sweep the Phillies and Yankees in the playoffs en route to a second consecutive World Series title. Although

baseball fans revere the 1976 Reds as one of the all-time great teams, its 102 wins were actually fewer than its

Pythagorean Expectation of 103-59, giving Sparky Anderson a -1.0 WAE (or -0.62 WAE%) for the season.

By contrast, the 1984 Tigers were coming off of a solid but unspectacular performance in 1983 with a

roster that included no Hall of Famers.14 Yet the 1984 Tigers led the American League all season, swept the

13 Darawski uses WAE to construct a baseball manager Hall of Fame. By WAE, Mike Scioscia of the Angels is the best manager in the

history of baseball. Bruce Bochy, Wilbert Robinson, Bobby Cox, and Felipe Alou round out the top five.

http://www.beyondtheboxscore.com/2012/3/28/2908044/manager-wins-above-expectancy

14 Co-author Teodoro believes that the 1984 Tigers’ Jack Morris is a Hall-of-Fame quality pitcher, and that the Baseball Writers’

Association of America’s (BBWAA) failure to admit Morris to the Hall is both an injustice and an indictment of the BBWAA’s collective

judgment. Interested readers are encouraged to consider Morris’ career records and judge for themselves:

http://www.baseball-reference.com/players/m/morrija02.shtml.

Royals to win the American League pennant, and beat the Padres in five games to win the World Series. The

1984 Tigers’ 104-58 record exceeded their Pythagorean Expectation of 99-63 by five games, giving Anderson an

extraordinary 5.0 WAE and 3.09 WAE%. Not coincidentally, Anderson was named the 1984 American League

Manager of the Year.
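
The WAE and WAE% figures for the two Anderson seasons follow directly from the records quoted above, with WAE% expressed as WAE divided by the 162 games in a season:

$$\text{WAE}_{1976} = 102 - 103 = -1, \qquad \text{WAE\%}_{1976} = \frac{-1}{162} \times 100 \approx -0.62$$

$$\text{WAE}_{1984} = 104 - 99 = 5, \qquad \text{WAE\%}_{1984} = \frac{5}{162} \times 100 \approx 3.09$$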

The point of the illustration is simple but profound. In 1976 the Reds were expected to crush the National

League with their talent-laden roster, so the team’s excellent record and championship offer little evidence of

Anderson’s managerial prowess; the 1984 Tigers were expected to be merely good, so the team’s dominant

performance that year suggests that Anderson’s management contributed to its success.

Presidential PE in Congress. Can the PE formula be adapted to predict presidential success in Congress? There

are a number of parallels between presidential-congressional relations and baseball. The president is analogous to

a manager. A roll call is analogous to a game that the president’s team plays. Votes supporting the president are

analogous to runs scored, and votes against the president are analogous to runs allowed in each “game.” A year is

analogous to a season, and House and Senate to different leagues. A president who serves when his party holds

large majorities in both houses of Congress is akin to a manager with an all-star roster in a weak division; we

would expect him to win most of his roll calls without trying very hard. A president who faces hostile majorities

in Congress is like a manager of a team full of poor hitters and soft-tossing pitchers who struggle to throw

strikes; under such conditions, any roll call wins at all might be evidence of great presidential political skill, even

if the president’s actual winning percentage is low.

Just as James’ PE formula recognized that runs scored and allowed convey information about team

performance that is not reflected in winning percentage, votes won and lost on presidential roll calls may carry

information about a president’s legislative skill that percentage of roll calls won does not. Each vote for or

against the president carries information about legislative politics, including partisanship, cohesion of the

majority party, and congressional organization. Rather than considering each of these variables that typically

populate the right-hand side of a regression equation separately and with particular causal theories in mind, the

PE approach aggregates votes over a year as a way of approximating the overall strength of a president’s

legislative position apart from any one roll call, just as the baseball PE measures the strength of a team’s roster

apart from any one game. Adapted to presidential-congressional politics, the formula for Pythagorean Expected

roll call winning percentage (PE%) is:

$$PE\% = \frac{\text{votes for}^{\gamma}}{\text{votes for}^{\gamma} + \text{votes against}^{\gamma}}$$

The difference between roll call winning percentage and PE% is percentage Wins Above Expectations (WAE%).

At the end of a year, a positive value indicates that the president has won more roll calls than expected, while a

negative value indicates that he has lost more than he was expected to lose. This percentage multiplied by the

number of roll calls on which the president took a position yields Wins Above Expectations (WAE). Just as some

baseball analysts use WAE to assess managerial skill, WAE might also yield important information about

presidents’ skill with legislative politics. As with baseball managers, WAE% offers a useful way to think about

presidents’ annual performance, while total WAE rewards performance across several years and so offers some

perspective on career accomplishments.
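
To make the bookkeeping concrete, here is a minimal sketch of the three quantities just defined. The function names and the example counts are hypothetical and are not taken from our data. PE% is computed in ratio form, which is algebraically identical to the formula above and is used only to avoid raising large vote totals to a power.

```python
def pe_pct(votes_for, votes_against, gamma=2.0):
    """Pythagorean expected winning share, as a proportion between 0 and 1."""
    # for^g / (for^g + against^g) rewritten as 1 / (1 + (against/for)^g)
    return 1.0 / (1.0 + (votes_against / votes_for) ** gamma)


def wae(wins, roll_calls, votes_for, votes_against, gamma=2.0):
    """Return (WAE%, WAE) for one president-year in one chamber.

    WAE% is the actual winning share minus PE%, as a proportion;
    WAE rescales it by the number of position roll calls.
    """
    actual_pct = wins / roll_calls
    wae_pct = actual_pct - pe_pct(votes_for, votes_against, gamma)
    return wae_pct, wae_pct * roll_calls


if __name__ == "__main__":
    # Hypothetical year: 60 wins on 100 position roll calls, with 21,000
    # votes cast for and 20,000 cast against the president's positions.
    share, count = wae(60, 100, 21_000, 20_000)
    print(f"WAE% = {100 * share:.1f}, WAE = {count:.1f}")
```

Because WAE scales WAE% by the number of position roll calls, two presidents with the same WAE% can differ widely in WAE.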

The distinction between WAE and WAE% highlights an important difference between baseball games and

roll call votes.15 Whether enjoying a winning season or suffering through a miserable one, every Major League

team plays 162 games. “You win a few, you lose a few, some get rained out,” Satchel Paige once observed. “But

you got to dress for all of them.” Presidents may win a few and lose a few, but they don’t have to dress for all of

them. Unlike ballplayers, presidents get to pick their “games” because they choose when to take a position on a

particular roll call. A president who is relatively active in the legislative arena may often have a high WAE

despite a modest WAE%, while a president who takes few positions on roll calls may enjoy a very strong

WAE% but a low WAE. For example, if a president attempts to artificially inflate his success rate by avoiding

difficult roll calls and expressing a position only on those that are sure to go his way, the PE formula will

15 There are others. Roll call votes, for example, don’t involve the use of baseball bats, though Rep. Preston Brooks (D-SC) did attack

Sen. Charles Sumner (R-MA) with a cane on the Senate floor in 1856 (U.S. Senate n.d.

http://www.senate.gov/artandhistory/history/minute/The_Caning_of_Senator_Charles_Sumner.htm).

systematically underestimate his success rate. Although a president may occasionally engage in posturing to

avoid an embarrassing loss or endorse a sure winner, Peterson (1990) presents evidence that presidential

positions generally are sincere. And Covington (1987b) finds evidence that the president can sometimes increase

the chances that a roll call will go his way by “staying private”. These cases are also rare, but these “wins” do not

appear in the success rate because the president remained in the clubhouse. Nonetheless, we might expect

presidents to have positive overall WAE because they sometimes do not express positions on roll calls they

expect to lose (unlike baseball teams, whose aggregate WAE is necessarily close to zero each season because every win is another team’s loss).

Calculating presidential PE. Following standard sabermetric procedures, we used the presidential roll call vote

data described earlier to calculate PE, WAE% and WAE for presidents. Votes for and against the president’s

position were totaled for roll calls on which the president took a position for each year from 1953-2014. As in the

regression analysis, we exclude cloture roll calls in the Senate, as well as “consensus” roll calls where the

president won over 90 percent of the votes. Using these totals, we calculated PE for each year with γ set to 2

using James’ (1982) original formulation. The result was a striking .934 Pearson correlation between PE% and

actual winning percentage in the House and .907 in the Senate. These impressive correlations belied substantial

year-to-year error, so we followed the usual sabermetric practice of minimizing squared differences between

PE% and actual winning percentage. To do so, we applied a Generalized Reduced Gradient (GRG) iterative

algorithm to the PE% formula and roll call voting data with an aim of minimizing squared differences with γ

constrained to be greater than zero and less than 100. The GRG algorithm is an efficient, robust method for

optimizing nonlinear problems, as in the present analysis (Mantell and Lasdon 1978).16 Optimization yields γ

values of 3.44 in the House and 3.11 in the Senate. Consequently, the PE% formulae for presidents are:

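$$\text{House } PE\% = \frac{\text{votes for}^{3.44}}{\text{votes for}^{3.44} + \text{votes against}^{3.44}}$$

$$\text{Senate } PE\% = \frac{\text{votes for}^{3.11}}{\text{votes for}^{3.11} + \text{votes against}^{3.11}}$$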

16 As a check of the stability and robustness, we replicated the optimization procedure using the evolutionary/genetic stochastic

optimization algorithm recommended by Yeniay (2005). For both House and Senate, evolutionary optimization yielded a value of γ equal

to the GRG estimate to the seventh decimal place.

These formulae generate PE% that correlate .939 with actual winning percentages in the House and .823 in the

Senate.
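
The exponent search can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure we ran: it substitutes a golden-section search for the GRG algorithm, which is adequate here only because there is a single parameter and the sum of squared errors has one minimum on the interval for the synthetic data used. The function names and the four "years" of vote totals are hypothetical.

```python
import math


def pe_pct(votes_for, votes_against, gamma):
    """PE% as a proportion, in ratio form to avoid very large powers."""
    return 1.0 / (1.0 + (votes_against / votes_for) ** gamma)


def sse(gamma, seasons):
    """Sum of squared differences between actual share and PE%."""
    return sum(
        (actual - pe_pct(vf, va, gamma)) ** 2 for vf, va, actual in seasons
    )


def fit_gamma(seasons, lo=1e-6, hi=100.0, tol=1e-9):
    """Golden-section search for the gamma in (lo, hi) that minimizes sse."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if sse(c, seasons) <= sse(d, seasons):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


if __name__ == "__main__":
    # Synthetic data: "actual" shares are generated from gamma = 3 so that
    # the search has a known answer to recover.
    totals = [(52000, 48000), (45000, 55000), (60000, 40000), (51000, 50000)]
    seasons = [(vf, va, pe_pct(vf, va, 3.0)) for vf, va in totals]
    print(fit_gamma(seasons))
```

With real data the fitted exponent will not reproduce the actual shares exactly; the remaining differences are the annual WAE%.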

Results of Sabermetric Analysis of Presidential Wins Above Expectations

We used PE% for each year in each chamber to calculate WAE% and WAE. We also calculated each president’s career

average WAE% and his career WAE (the sum of his annual WAE) in each chamber.

Regression and Sabermetric Approaches Estimate Different Baselines. The regression models and the PE

formula predict presidential success rates with similarly high levels of accuracy—correlations around 0.94 in the

House and above 0.82 in the Senate. Despite highly accurate predictions, the WAE% from the two approaches

are uncorrelated (R² = 0.001 in the House and 0.078 in the Senate). Figure 3 shows annual WAE% for House and

Senate. We standardized PE WAE% to facilitate comparison to the studentized residuals from the regression

analysis.17 The patterns of PE WAE% are quite different from those estimated by studentized residuals. Whereas

the studentized residuals indicated that each president exceeded expectations in some years and fell short in

others, the PE estimates indicate that presidents were consistently above or below expectations across the years

of their administrations. Perhaps most striking is the partisan difference—except for Eisenhower, Republican

presidents generally exceeded expectations, while Democratic presidents usually fell short.18 In particular,

Reagan was considerably above the mean in both the House and Senate during all eight years. Clinton was below

the mean for all but one or two years of his presidency, and well below in the Senate. Obama exceeded

expectations in the House during his first two years, but fell far below the mean in the last four years.
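
The standardization used for Figure 3 is the ordinary z-score. Written out with the chamber means and standard deviations reported in note 17, for year t:

$$z^{\text{House}}_{t} = \frac{\text{WAE\%}_{t} - 3.22}{7.84}, \qquad z^{\text{Senate}}_{t} = \frac{\text{WAE\%}_{t} - 2.43}{7.85}$$

A raw House WAE% of 3.22 therefore plots at zero, and a president can post a positive WAE% in a given year while still falling below the mean.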


Sabermetric Analysis of Neustadt’s Skills Hypothesis. How does Neustadt’s Political Skills Hypothesis fare

with sabermetric WAE%? Do presidents who are renowned for legislative skill generate impressive WAE%? Do

17 Standardized WAE% (z-scores), of course, are not the same as residuals. Residuals from OLS regression have a mean of zero. The

means of WAE% are 3.22% in the House and 2.43% in the Senate with standard deviations of 7.84 and 7.85 respectively. Thus, this

figure indicates each president’s deviation above or below the mean WAE% for all 62 years.

18 Keep in mind that zero is the mean level of performance.

presidents whom historians regard as less skilled generate lackluster WAE%? And does WAE% in the House

correlate with WAE% in the Senate, as we would expect from the political skill hypothesis? Table 2 reports

career average WAE% and career WAE (the sum of his annual WAE) for each president in each chamber. Table 2 also

reports mean WAE% and total WAE (the sum of WAE for each chamber) for each president. For comparison to

a non-quantitative assessment, we also report each president’s “Relations with Congress” ranking from the 2010

Siena Presidential Expert Poll of presidential historians.19

[Table 2 about here]

Sabermetric analysis offers mixed evidence about the political skill of the reputed legislative giants

Lyndon Johnson and Ronald Reagan. Johnson’s career figures put him in the middle of the pack among post-WWII

presidents, with House (hWAE% = 2.9, hWAE = 11.4) and Senate (sWAE% = 1.6, sWAE = 7.9) averages

below the mean for all analyzed presidents. In contrast, Reagan’s career statistics make him by far the most

impressive legislative operator in the sample, ranking second in career WAE% and a walk-off first place in WAE

in both houses. Reagan’s career hWAE of 82.5 and sWAE of 51.7 are more than double his next closest rival’s.

How do we reconcile these differences between presidents’ historic reputations and their

sabermetric figures? Again, baseball offers an instructive metaphor. Since the emergence of sabermetrics, a new

generation of statistically trained, quantitatively oriented baseball analysts has clashed with traditional, “old

school” baseball writers over the interpretation of team, player, and manager performance (Lewis 2003).

Generally, sabermetricians seek objective measures of performance in terms of run production and run

prevention that account for differences in context, while traditionalists tend to focus on wins, losses, and the “eye

test” (visual, qualitative evaluation). Sometimes players and managers look better from one perspective than

another. Over the past twenty years, sabermetricians have come to dominate baseball analysis for hardcore fans

of the game, but the traditional baseball writers still stand as gatekeepers to the Hall of Fame, and so the

traditionalist imprimatur is still the standard by which the casual fan defines greatness in baseball.

19 To help get your students’ attention, Appendix B presents a set of presidential baseball cards that report detailed results for (eventually)

all eleven presidents.

The managerial careers of Joe Torre and Bobby Cox offer useful allegories to the legislative records of

presidents Johnson and Reagan. Torre and Cox are both widely considered all-time great managers, and both

were inducted to the Hall of Fame in 2014. Over his 29-year career, Torre won 2,326 games (.538 winning

percentage), six American League pennants, and four World Series titles; over his 29-year career, Cox won 2,504 games

(.556 winning percentage), five National League pennants, and one World Series. But their career WAE figures

are quite different: Cox ranks fourth all-time with a career WAE of 77.0, but Torre’s respectable 23.3 puts him

far behind at 39th on the all-time list. A close look at their records makes the reason clear: Torre’s most

successful years came with the New York Yankees between 1996 and 2007, when his roster included some of

the most fearsome hitters and pitchers in baseball. Torre’s Yankees won big, but they were supposed to win big

based on their enormous payroll and all-star laden roster. Cox spent nearly his entire managerial career with the

Atlanta Braves, winning consistently and routinely outperforming his PE.

Like Torre and Cox, Johnson and Reagan earned their reputations in very different ways. Johnson’s actual

legislative winning percentages are very high, but not much better than his very high PE. Johnson won most of

his legislative battles, including some extraordinary legislative “championships” with civil rights laws. But like

Joe Torre’s Yankees of the late-1990s and early-2000s, LBJ was expected to win with his strong partisan

majorities in both houses and a booming economy. Reagan’s actual winning percentages are fairly low (.420

House, .720 Senate), but like Bobby Cox, his actual record was consistently much higher than his PE,

particularly in the House. Torre, Cox, Johnson and Reagan were all successful and probably deserving of their

places in their respective Halls of Fame. But sabermetrics make a stronger case for Cox and Reagan, while

traditionalists would point to Johnson's landmark legislative achievements and Torre’s four World Series rings in

making their arguments for enshrinement.

On the other end of the reputational distribution are Richard Nixon and Jimmy Carter—reputedly the least

politically-skilled presidents in our sample. Sabermetric assessment again yields mixed evidence for Neustadt’s

political skill hypothesis. Nixon and Carter both generated below-average WAE figures across the board, with

the exception of Nixon’s strong career sWAE (which is something of an oddity).20 But close examination of

Nixon’s and Carter’s records shows that their legislative records relative to PE were not especially strong or

weak. Although they occupy the bottom half of the standings, a more accurate description might be “mediocre”

rather than “bad.” In contrast, Dwight Eisenhower and Bill Clinton occupy the sabermetric cellar.

Table 2 shows that Siena’s expert rankings are uncorrelated with legislative success measured as WAE. In

fact, the Siena rankings of “Relations with Congress” correlate much more strongly with its nineteen other

categories (e.g., presidential “Background,” “Imagination,” “Intelligence,” and “Luck”) than with actual winning

percentage. This divergent assessment implies that presidential experts are applying their own adjustments to

presidents’ legislative records to account for context, but evidently not the same kinds of adjustments as those

implied by PE. As with sabermetricians and traditionalist sports writers, quantitative analytics and expert

qualitative assessments lead to very different conclusions about presidents’ legislative success.

Another striking finding that emerges from Table 2 is that Republican presidents outperform Democrats,

and it’s not close.21 Ike is in last place, but otherwise the top half of the distribution is all GOP. This may tell us

something about the cohesiveness of the GOP in Congress. Yet, the regression approach does not find any

systematic partisan advantage in WAE, so we need further investigation to determine why the PE formula

generates this result.

Finally, if Neustadt is correct that political skill drives legislative success, we should observe similar

patterns of WAE in both chambers. WAE offers support for this hypothesis. From 1953-2014, annual WAE% for

House and Senate correlate at .44 (p<.01). Perhaps more importantly for Neustadt’s understanding of presidential

power, the Spearman rank-order correlation for presidents’ career WAE% in House and Senate is .69 (p=.02). In

other words, each president’s ranking relative to other presidents is broadly similar in both houses.
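
For readers unfamiliar with the rank-order statistic, the following is a minimal sketch of a Spearman correlation: rank each series (tied values share their average rank), then take the Pearson correlation of the ranks. The function names and the career WAE% values in the example are hypothetical and are not the figures in Table 2.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical career WAE% for five presidents in each chamber.
    house = [9.1, 2.9, -4.0, 5.5, 0.3]
    senate = [6.2, 1.6, -1.0, 7.0, -2.5]
    print(spearman(house, senate))
```

Because it uses only ranks, the statistic asks whether presidents are ordered similarly in the two chambers, not whether their WAE% values are similar in size.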

20 Nixon’s career sWAE of 22.2 is mainly the result of an extraordinary sWAE of 20.9 in 1973 when he won several close votes en route

to a low .376 overall winning percentage. Nixon’s sWAE was actually negative for three of his six years in office.

21 This was an extraordinarily painful sentence to write. That’s the benefit and curse of science and statistics: it’s difficult for one’s

partisan biases to creep into the findings even subconsciously.

Discussion

This analysis shows that the Pythagorean Expectations formula predicts presidential success rates as well as fully

specified regression models. What PE extracts from the sum of votes supporting and votes opposing the

president, however, is different from theoretical explanations of presidential success—the interaction of party

control, public approval, and party polarization—included in our regression models. But as is the case with

regression residuals, we don’t know exactly what’s in the WAE%. Perhaps WAE is capturing something

important about crafting legislative coalitions. At the very least, we think sabermetrics may offer a new, WAE

cool direction for the study of presidents’ participation in legislative politics.

Although this paper shows that sabermetrics may be a useful tool to analyze roll call votes in legislatures,

we barely scratch the surface. A more important contribution of sabermetrics to baseball is estimating players’

Wins Above Replacement (WAR) to show how much each one contributes to his team’s success. The parallel

between members of the president’s party and baseball players is much less clear than the parallel between votes

(for and against) and runs (scored and allowed). Yet members of Congress do play different positions. Party and

committee leaders have resources and talents that might help determine whether the president wins or loses at the

margins. In addition, students of presidential-congressional relations have long recognized that some votes are

more important than are others. A somewhat subjective approach is to analyze a subset of “Key Votes” identified

by journalists (Congressional Quarterly, Inc. Annually 1953-2015; Edwards 1989, 2009). The sabermetrics

approach is to let the data identify the most important votes. Political scientists have proposed ways to estimate

the significance of roll call votes based on closeness and turnout (King 1986; Riker 1959; Yohe 1968). Perhaps

these methods could be adapted and used with PE to calculate WAR for Congress.

References

Aldrich, John H. 2011. Why Parties? A Second Look. Chicago: University of Chicago Press.

Aldrich, John H., and David W. Rohde. 2000. “The Consequences of Party Organization in the House: The Role of the

Majority and Minority Parties in Conditional Party Government.” In Polarized Politics: Congress and the President

in a Partisan Era, ed. Jon R. Bond and Richard Fleisher. Washington, DC: CQ Press.

Berry, William D., Matt Golder, and Daniel Milton. 2012. “Improving Tests of Theories Positing Interaction.” Journal

of Politics 74 (July): 653-671.

Binder, Sarah A., and Steven S. Smith. 2001. Politics Or Principle?: Filibustering in the United States Senate.

Washington, DC: Brookings Institution Press.

Bond, Jon R., and Richard Fleisher. 1980. “The Limits of Presidential Popularity as a Source of Influence in the U.S.

House.” Legislative Studies Quarterly 5 (February): 69-78.

Bond, Jon R. and Richard Fleisher. 1990. The President in the Legislative Arena. Chicago: University of Chicago Press.

Bond, Jon R., Jeffrey E. Cohen, and Richard Fleisher. 2014. “Why Party Polarization Affects Presidential Success

Differently in the Senate and House: Testing for Conditional Effects of Party Control and Presidential Approval.”

Presented at the Annual Meeting of the American Political Science Association, Washington, DC, August 27 - 31,

2014.

Bond, Jon R., Richard Fleisher, and Jeffrey E. Cohen. 2015. “Presidential-Congressional Relations in an Era of Polarized

Parties and a 60-Vote Senate.” In James A. Thurber and Antoine Yoshinaka, eds., American Gridlock: Causes,

Characteristics, and Consequences of Polarization. (New York: Cambridge University Press, 2015).

Bond, Jon R., Richard Fleisher, and Jeffrey E. Cohen. 2012. “How Party Polarization Affects the Relationship between

Party Control and Presidential Success Differently in the House and Senate.” Presented at the Annual Meeting of

the Southern Political Science Association, New Orleans, January 12-14, 2012.

Bond, Jon R., Richard Fleisher, and Glen S. Krutz. 2009. “Malign Neglect: Evidence that Delay Has Become the Primary

Method of Defeating Presidential Appointments.” Congress & the Presidency 36: (No.3): 226-243.

Bond, Jon R., Richard Fleisher, and Michael Northrup. 1988. “Public Opinion and Presidential Support.” Annals, 499

(September): 47-63.

Bond, Jon R., Richard Fleisher, and B. Dan Wood. 2003. “The Marginal and Time Varying Effect of Public Approval

on Presidential Success in Congress.” Journal of Politics 65 (February): 92-110.

Cohen, Jeffrey, Jon R. Bond, and Richard Fleisher. 2013a. “Placing Presidential-Congressional Relations in Context: A

Comparison of Barack Obama and His Predecessors.” Polity 45 (Jan.): 105-126.

Cohen, Jeffrey, Jon R. Bond, and Richard Fleisher. 2013b. “The Implications of the 2012 Presidential Election for

Presidential-Congressional Relations: Change or More of the Same?” In Amnon Cavari, Richard J. Powell, and

Kenneth R. Mayer, eds., The 2012 Presidential Election: Forecasts, Outcomes, and Consequences (New York:

Routledge).

Congressional Quarterly, Inc. Annually 1953-2015. “Presidential Support.” Congressional Quarterly Almanac.

(Washington, DC: Congressional Quarterly, Inc.).

Covington, Cary R., J. Mark Wrighton, and Rhonda Kinney. 1995. “A Presidency-Augmented Model of Presidential

Success on House Roll Call Votes.” American Journal of Political Science 39 (November): 1001-24.

Covington, Cary R. 1987a. “Mobilizing Congressional Support for the President: Insights from the 1960s.” Legislative

Studies Quarterly 12 (February): 77-96.


Covington, Cary R. 1987b. “Staying Private: Gaining Congressional Support for Unpublicized Presidential Preferences

on Roll Call Votes.” Journal of Politics 49 (August): 737-55.

Covington, Cary R. 1988a. “Building Presidential Coalitions Among Cross-pressured Members of Congress.” Western

Political Quarterly 41 (March): 47-62.

Covington, Cary R. 1988b. “Guess Who’s Coming to Dinner: The Distribution of White House Social Invitations and

Their Effects on Congressional Support.” American Politics Quarterly 16 (July): 243-65.

Cox, Gary W., and Mathew D. McCubbins. 2005. Setting the Agenda: Responsible Party Government Theory in the

U.S. House of Representatives. New York: Cambridge University Press.

Davidson, Roger H. 1984. “The Presidency and Congress.” In The Presidency and the Political System, ed. Michael

Nelson. Washington, DC: CQ Press.

Downs, Anthony. 1957. An Economic Theory of Democracy. New York: Harper & Row.

Edwards, George C. III. 1989. At the Margins: Presidential Leadership of Congress. New Haven: Yale University Press.

Edwards, George C. III. 2009. The Strategic President: Persuasion and Opportunity in Presidential Leadership.

Princeton, NJ: Princeton University Press.

Fett, Patrick J. 1994. “Presidential Legislative Priorities and Legislators’ Voting Decisions: An Exploratory Analysis.”

Journal of Politics. 56 (May): 502-12.

Fleisher, Richard, and Jon R. Bond. 1983. “Assessing Presidential Support in the House: Lessons from Reagan and

Carter.” Journal of Politics 45 (August): 745-58.

Fleisher, Richard, and Jon R. Bond. 1992. “Assessing Presidential Support in the House II: Lessons from George Bush.”

American Journal of Political Science 37 (May): 525-41.

Fleisher, Richard, and Jon R. Bond. 2000. “Partisanship and the President’s Quest for Votes on the Floor of Congress.”

In Polarized Politics: Congress and the President in a Partisan Era, ed. Jon R. Bond and Richard Fleisher.

Washington, DC: CQ Press, Chap. 8.

Fleisher, Richard, Jon R. Bond, and B. Dan Wood. 2008. “Which Presidents are Uncommonly Successful in Congress?”

In Presidential Leadership: The Vortex of Power, eds. Bert A. Rockman and Richard W. Waterman. New York:

Oxford University Press, pp. 191-214.

Jacobson, Gary C. 1990. The Electoral Origins of Divided Government. Boulder, CO: Westview.

Jacobson, Gary C., and Samuel Kernell. 1983. Strategy and Choice in Congressional Elections, 2nd ed. New Haven:

Yale University Press.

James, Bill. 1982. The Bill James Baseball Abstract, 1982. New York: Ballantine Books.

Kellerman, Barbara. 1984. The Political Presidency: Practice of Leadership from Kennedy through Reagan. New York:

Oxford University Press.

King, Gary. 1986. “The Significance of Roll Calls in Voting Bodies: A Model and Statistical Estimation.” Social Science

Research 15 (): 135-152.

Lee, Frances E. 2009. Beyond Ideology: Politics, Principles, and Partisanship in the U.S. Senate. Chicago: University

of Chicago Press.

Lewis, Michael. 2003. Moneyball: the Art of Winning an Unfair Game. New York: W.W. Norton.

Lockerbie, Brad and Stephen A. Borrelli. 1989. “Getting Inside the Beltway: Perceptions of Presidential Skill and


Success in Congress.” British Journal of Political Science 19 (January): 97-106.

Mantell, J. B. and Leon S. Lasdon. 1978. A GRG Algorithm for Econometric Control Problems. Annals of Economic

and Social Measurement 6(5): 581-597.

Meier, Kenneth J. and Laurence J. O’Toole Jr. 2002. “Public Management and Organizational Performance: The Effect

of Managerial Quality,” Journal of Policy Analysis and Management 21(4): 629-643.

Miller, Steven J. 2007. A Derivation of the Pythagorean Won-Loss Formula in Baseball. Chance 20(1): 40-48.

Neustadt, Richard E. 1960. Presidential Power: The Politics of Leadership. New York: Wiley.

Newey, W. and K. West. 1987. A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent

Covariance Matrix. Econometrica. 55 (May): 703-708.

Peterson, Mark A. 1990. Legislating Together: Eisenhower to Reagan. Cambridge, MA: Harvard University Press.

Ramsey, J.B. 1969. “Tests for Specification Error in Classical Linear Least Squares Regression Analysis.” Journal of

the Royal Statistical Society. B31: 250-71.

Riker, William H. 1959. “A Method for Determining the Significance of Roll Calls in Voting Bodies.” In John C.

Wahlke and Heinz Eulau, eds., Legislative Behavior: A Reader in Theory and Research. Glencoe, IL: Free Press.

Rohde, David. 1991. Parties and Leaders in the Post Reform House. Chicago: University of Chicago Press.

Sullivan, Terry. 1988. “Headcounts, Expectations, and Presidential Coalitions in Congress.” American Journal of

Political Science. 32 (August): 567-89.

Sullivan, Terry. 1990. “Explaining Why Presidents Count: Signaling and Information.” Journal of Politics. 52 (August):

939-62.

Sullivan, Terry. 1991. “The Bank Account Presidency: A New Measure of Evidence on the Temporal Path of

Presidential Influence.” American Journal of Political Science. 35 (August): 686-723.

U.S. Senate n.d. “Senate Stories.”

http://www.senate.gov/artandhistory/history/minute/The_Caning_of_Senator_Charles_Sumner.htm

Yeniay, Ozgur. 2005. A Comparative Study on Optimization Methods for the Constrained Nonlinear Programming

Problems. Mathematical Problems in Engineering 2: 165-173.

Yohe, William P. 1968. “Riker's Method for Assessing the Significance of Roll Call Votes.” Public Choice 4 (Spring):

59-66.


Tables and Figures

Table 1

Conditioning Effects of Party Control, Polarization, and Public

Approval on Presidential Success in Congress

Variable                               House                  Senate
                                       Coef.      (t)         Coef.      (t)
Party control                          0.447***   (20.20)     0.288***   (10.30)
Approval                              -0.064**    (-0.52)     0.478***   (3.29)
Polarization                          -1.033***   (-9.01)    -0.170**    (-1.00)
Party control*Approval                 0.267#**   (1.77)     -0.418**    (-2.35)
Polarization*Party control             1.365***   (9.27)      0.423***   (2.16)
Polarization*Approval                 -2.385***   (-3.39)    -0.819**    (-0.70)
Polarization*Party control*Approval    2.256***   (2.04)     -0.894**    (-0.62)
Constant                               0.343***   (20.83)     0.508**    (22.62)

N                                      62                     62
F(7,54)                                87.63                  24.16
Prob > F                               0.000                  0.000
R2                                     0.893                  0.724

Entries are OLS regression coefficients estimated with Stata 13 with robust standard errors (t-statistics in parentheses).

***p<.001, **p<.01, *p<.05, #p<.10


Table 2

Career Sabermetric Wins Above Expectations

              House           Senate          Mean    Total    Actual Mean   2010 Siena Poll Rank
President     WAE%    WAE     WAE%    WAE     WAE%    WAE      Win %         “Relations w/ Cong”
Reagan        12.0    82.5     9.0    51.7    10.5    134.2    56.8           5
Bush43         8.8    33.9     7.4    21.2     8.1     55.1    67.3          32
Ford          13.4    19.7    12.7    19.3    13.0     39.0    47.6          17
Bush41         6.1    24.9     5.4    13.2     5.7     38.1    45.4          23
Nixon          2.4     8.3     1.6    22.2     2.0     30.5    60.1          36
Johnson        2.9    11.4     1.6     7.9     2.3     19.3    79.2           1
Carter         2.8    13.3    -1.0     1.5     0.9     14.8    73.1          39
Kennedy        1.6     2.4     1.4     4.8     1.5      7.2    82.3          13
Obama         -6.7   -34.8     5.9    19.0    -0.4    -15.9    66.7          18
Clinton       -0.4    -6.2    -5.4   -14.7    -2.9    -21.0    50.9          25
Eisenhower    -2.0    -5.5    -3.3   -15.5    -2.6    -27.0    65.9          10
Mean           3.7    13.6     3.2    11.9     3.5     25.5    63.4

Entries listed in descending order of career Wins Above Expectations. Republican presidents in bold; Democratic

presidents in italics.


Figure 1

Regression Estimates of Presidents’ Wins Above Expectation in Congress


Figure 2

Interocular Test: Comparison of WAEs from Three Different Studies

Panels: Bond & Teodoro 2015; Cohen, Bond, & Fleisher 2013a; Fleisher, Bond, and Wood 2008


Figure 3

Pythagorean Presidential Wins Above Expectations


Appendix A: Marginal Effects Plots of House and Senate Models

When testing for conditional effects, it is common to designate one of the variables as the conditioning variable. Such a designation is inappropriate because the effects of interaction terms are symmetrical: “when the effect of X on Y is conditional on the value of Z, the effect of Z must be conditional on the value of X.” Marginal effects plots are useful to show the symmetrical effects of interactions (Berry, Golder, and Milton 2012). Our models test for conditional effects of two continuous variables (polarization and approval) and a dichotomous variable (party control). Showing the effects of the two continuous independent variables requires a three-dimensional plot with polarization on the x-axis, approval on the z-axis, and the dependent variable (presidential success rate) on the y-axis. The plane in the three-dimensional space represents the estimated Presidential Success Rate at each level of polarization and approval. Because party control is a dichotomy, the effects of polarization and approval conditional on party control could, in principle, be represented by two planes (one for minority presidents and one for majority presidents) in the same three-dimensional plot, but two planes plotted in the same space would add complexity with no gain in understanding. We therefore present separate three-dimensional plots for minority and majority presidents. To see more clearly how party control conditions the effects of polarization and approval, we also show relationships for majority and minority presidents in two-dimensional plots from two perspectives: the effects of polarization conditional on approval, and the effects of approval conditional on polarization. These graphs plot cross-sections of the three-dimensional plots at low, average, and high levels of approval and polarization. They do not show the contours in the three-dimensional plots, but we can see more precisely how the slopes change under different conditions.
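The symmetry of the interaction effects can be made explicit with a short derivation. The notation below is introduced here for illustration only and is an assumption, not the authors' own: S is the presidential success rate, C is party control (0 for minority, 1 for majority presidents), A is approval, P is polarization, and the coefficients are numbered in the order in which the terms appear in Table 1.

```latex
\begin{align*}
S &= \beta_0 + \beta_1 C + \beta_2 A + \beta_3 P
     + \beta_4\, C A + \beta_5\, P C + \beta_6\, P A + \beta_7\, P C A + \varepsilon \\
\frac{\partial S}{\partial P} &= \beta_3 + \beta_5\, C + \beta_6\, A + \beta_7\, C A \\
\frac{\partial S}{\partial A} &= \beta_2 + \beta_4\, C + \beta_6\, P + \beta_7\, C P
\end{align*}
```

The same coefficients, \beta_6 and \beta_7, appear in both slopes: whatever makes the effect of polarization depend on approval also makes the effect of approval depend on polarization, which is why neither variable can be singled out as "the" conditioning variable.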

Conditional Effects in the House

Appendix Figure 1 shows the conditional effects of party control, polarization, and presidential approval in the House. The three-dimensional plots show the conditioning effects of polarization and approval on success rates of minority presidents (Panel 1a) and majority presidents (Panel 1b). To see relationships for majority and minority presidents on the same graph, Panel 1c shows the effects of party polarization on success conditional on whether approval is low, average, or high. At average approval, as polarization increases, the probability of winning increases for majority presidents and declines for minority presidents. Specifically, when polarization is low (around -.30), success rates are the same for majority and minority presidents. But when polarization is high (around .30), success rates are about 71% for majority presidents compared to 51% for minority presidents. Majority presidents benefit from increasing polarization at all levels of approval. The effects of polarization on minority presidents’ success rates are negligible at low approval but strongly negative at high approval.

Looking at the effects of approval conditional on party polarization (Panel 1d), we see that rising popularity has small positive effects on majority presidents’ success rates at all levels of polarization. Minority presidents, on the other hand, seem to benefit from rising public approval only if polarization is low; the effects of rising approval are slightly negative at average polarization and strongly negative at high polarization. The theory underlying the popularity hypothesis does not anticipate a negative relationship. We speculate that the strong effects of party control and polarization may swamp the smaller effects of approval. If minority presidents overestimate the benefits of public approval and make fewer concessions to the majority, the error in judgment might account for the negative relationship. And when parties in Congress are highly polarized, the majority may ignore a rise in approval because they know that public opinion is also polarized. If so, increases in public approval come mainly from the president’s own partisans, votes that majority party members are not going to get anyway.

[Appendix A Figure 1 about here]


Appendix Figure 1

Conditional Effects of Party Control, Polarization & Approval on Presidential Success in the House

Conditional Effects in the Senate

In the Senate the interactions between polarization and approval are not significant. This suggests that the only conditional effects are between party control and polarization, and between party control and approval (see Appendix Figure 2). The effects of party polarization on presidential success are similar to those in the House: as polarization increases, majority presidents win more and minority presidents win less (Panel 2c). The slopes of the lines are less steep than in the House, indicating that the effects of party are smaller in the Senate. Panel 2d shows the effects of approval conditional on party control. The relationships differ from those observed in the House. In the Senate, minority presidents benefit from increased public approval, but majority presidents do not. We find no significant conditional effects between party polarization and approval, which means that the slopes of the lines are not conditional on the level of polarization. Notice, however, that high party polarization increases the success rate of majority presidents and decreases the success rate of minority presidents.
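As a rough numerical illustration of the approval effects described for the two chambers, the sketch below plugs the Table 1 coefficients into the slope of success with respect to approval. The function name and dictionary keys are hypothetical and introduced only for this example. The polarization values of -0.30, 0, and 0.30 follow the "low" and "high" values mentioned in the House discussion; that the regression uses exactly this scaling is an assumption the text does not confirm. Senate slopes are evaluated at polarization 0 only, because the polarization-by-approval terms are not significant there.

```python
# Approval-related coefficients copied from Table 1.
# Key names are hypothetical labels chosen for this sketch.
COEFS = {
    "House": {"app": -0.064, "pc_app": 0.267, "pol_app": -2.385, "pol_pc_app": 2.256},
    "Senate": {"app": 0.478, "pc_app": -0.418, "pol_app": -0.819, "pol_pc_app": -0.894},
}


def approval_effect(chamber, party_control, polarization):
    """Slope of success with respect to approval, given party control
    (0 = minority president, 1 = majority president) and polarization."""
    b = COEFS[chamber]
    return (
        b["app"]
        + b["pc_app"] * party_control
        + b["pol_app"] * polarization
        + b["pol_pc_app"] * party_control * polarization
    )


# House: slopes at low, average, and high polarization (assumed scaling).
for pc, label in ((0, "minority"), (1, "majority")):
    slopes = [round(approval_effect("House", pc, pol), 3) for pol in (-0.30, 0.0, 0.30)]
    print("House", label, slopes)

# Senate: slopes at average polarization only.
for pc, label in ((0, "minority"), (1, "majority")):
    print("Senate", label, round(approval_effect("Senate", pc, 0.0), 3))
```

Under these assumptions the signs line up with the discussion above: in the House the approval slope for minority presidents is positive at low polarization and negative at average and high polarization, while it stays positive for majority presidents; in the Senate the slope at average polarization is sizable for minority presidents and close to zero for majority presidents.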

[Appendix A Figure 2 about here]


Appendix Figure 2

Conditional Effects of Party Control, Polarization & Approval on Presidential Success in the Senate


Appendix B: Presidential Baseball Cards

DWIGHT D. EISENHOWER | President, Republicans

Ike’s 1953 rookie legislative season was his best, with winning percentages of .897 in the House and .865 in the Senate.

          House                            Senate
Season    RC     W%      WAE%     WAE      RC     W%      WAE%     WAE
1953      29     0.897    0.025    0.07    44     0.864    0.035     1.55
1954      29     0.724   -0.041   -1.19    66     0.742   -0.005    -0.33
1955      31     0.516   -0.011   -0.33    33     0.758   -0.066    -2.19
1956      23     0.609   -0.086   -1.99    56     0.625   -0.050    -2.80
1957      54     0.537   -0.061   -3.27    42     0.714   -0.075    -3.17
1958      40     0.675   -0.030   -1.21    85     0.729   -0.024    -2.02
1959      49     0.510   -0.018   -0.88    101    0.406   -0.033    -3.37
1960      41     0.634    0.065    2.65    75     0.600   -0.042    -3.12
Career    296    0.622   -0.019   -6.15    502    0.647   -0.031   -15.45
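The season lines on these cards can be sanity-checked against one another. The sketch below assumes that WAE is approximately RC times WAE%, where RC is presumably the number of roll calls; this relationship is inferred from the printed values and is not stated on the card itself. The function name and the rounding tolerance are choices made for this example. The rows are Eisenhower's House seasons as printed.

```python
# (season, RC, WAE%, WAE) for the House, copied from the Eisenhower card.
HOUSE_ROWS = [
    (1953, 29, 0.025, 0.07),
    (1954, 29, -0.041, -1.19),
    (1955, 31, -0.011, -0.33),
    (1956, 23, -0.086, -1.99),
    (1957, 54, -0.061, -3.27),
    (1958, 40, -0.030, -1.21),
    (1959, 49, -0.018, -0.88),
    (1960, 41, 0.065, 2.65),
]


def inconsistent_seasons(rows):
    """Return seasons whose printed WAE differs from RC * WAE% by more
    than rounding alone could explain (assumed relationship)."""
    flagged = []
    for season, rc, wae_pct, wae in rows:
        implied = rc * wae_pct
        # Rounding slack: WAE% is printed to 3 decimals, WAE to 2.
        tolerance = 0.0005 * rc + 0.005 + 1e-9
        if abs(implied - wae) > tolerance:
            flagged.append(season)
    return flagged


print(inconsistent_seasons(HOUSE_ROWS))  # prints [1953]
```

Seven of the eight seasons agree within rounding. The 1953 House line is the exception: 29 roll calls at a WAE% of 0.025 imply a WAE of roughly 0.73 rather than the printed 0.07, which may reflect a transcription slip on the card or a definition of WAE that differs from the one assumed here.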

JOHN F. KENNEDY | Democrats | Pres.

Kennedy was on pace for an outstanding 1963 in the House with .054 WAE% when his death cut the season short.

          House                          Senate
Season    RC     W%     WAE%    WAE      RC     W%     WAE%    WAE
1961      55     .800    .020    1.12    115    .791    .026    2.99
1962      50     .820   -.025   -1.25    109    .844    .021    2.25
1963      47     .830    .054    2.52    90     .856   -.005   -0.41
Career    152    .816    .016    2.39    314    .828    .015    4.83


LYNDON JOHNSON | President

Johnson won the Voting Rights Act and Immigration & Nationality Act on his way to a career-high 4.46 hWAE and 5.25 sWAE in 1965.

          House                          Senate
Season    RC     W%     WAE%    WAE      RC     W%      WAE%    WAE
1963      13     .692   -.017   -0.23    13     1.000    .069    0.90
1964      41     .854    .077    3.14    189    .915     .021    4.05
1965      85     .906    .052    4.46    138    .913     .038    5.25
1966      74     .838    .084    6.25    91     .604    -.049   -4.49
1967      91     .648   -.034   -3.08    117    .735    -.009   -1.05
1968      70     .757    .013    0.88    129    .636     .025    3.22
Career    374    .789    .031   11.41    677    .790     .012    7.89

Richard Nixon | PRESIDENT | REPUBLICANS

Four of Nixon’s 5½ seasons were lackluster, but 1973 was a blockbuster for Tricky Dick with +4.04 hWAE on 111 hRC and a whopping +20.87 sWAE on 141 sRC in the Senate.

          HOUSE                          SENATE
YEAR      RC     W%     WAE%    WAE      RC     W%     WAE%    WAE
1969      40     .675    .065    2.62    59     .712   -.001   -0.08
1970      43     .744    .021    0.90    59     .559    .049    2.90
1971      49     .796    .008    0.38    72     .653   -.019   -1.34
1972      31     .774    .026    0.82    31     .419   -.134   -4.16
1973      111    .414    .036    4.04    141    .376    .148   20.87
1974      44     .614   -.011   -0.48    73     .479    .055    4.04
CAREER    318    .770    .021    8.28    435    .513    .051   22.22
