Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Part 3
Bayesian Updating of Political
Beliefs: Normative and Descriptive
Properties
Social scientists increasingly use Bayes’ Theorem as a normative standard of rational
political thinking (Bartels 2002; Gerber and Green 1998, 1999; Tetlock 2005;
Steenbergen 2002) and as a tool to describe how people actually think (Bartels 1993;
Achen 1992, 2002; Husted, Kenny, and Morton 1995; Grynaviski 2006; Lohmann
1994; Neill 2005). Models based on the Theorem are attractive because they offer a
way to account for the weight that people place on old beliefs and new influences when
revising their political ideas. They are also formal models, and as such, they bring the
benefits of mathematical exposition to topics that have usually lacked it (Lupia 2002;
see also Luce 1995). But uncertainty remains about the basis of their normative appeal
and about whether they can accommodate everyday features of political cognition.
This essay clarifies those matters. After explaining the Theorem’s appeal
as a standard of rationality, I show that four important features of political
thought—increased uncertainty in response to surprising information, selective
89
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
perception, attitude polarization, and enduring disagreement—are inconsistent with the
most widely-used Bayesian updating model but quite consistent with other Bayesian
models. Bayes’ Theorem proves capable of capturing many features of real-world
political thinking. But this flexibility is the Theorem’s downside: precisely because it is
consistent with so many different ways of thinking about politics, it is inadequate as a
standard of rationality.
90 Bayesian Updating of Political Beliefs
Bayes’ Theorem and Bayesian Updating Models
Most of the matters that interest political scientists—public opinion toward candidates,
implications of new policies, the probability of terrorist attacks—can be thought of
as probability distributions. Like most distributions, they have means and variances,
and the task that we set for ourselves is to learn about these parameters. A politician’s
ability to manage the economy, for example, may oscillate over time around a fixed
but unknown mean. Learning about politics becomes a matter of learning about
probability distributions—a task to which Bayesian statistics is especially well-suited.
The bedrock of Bayesian statistics is Bayes’ Theorem, an equation that relates
conditional and marginal probabilities:
p(E\S)p(S) p(E\S)p(S)(3 , )
where S and E are events in a sample space and /?(•) is a probability distribution
function. In words, the Theorem indicates that p(S |£), the probability that S occurs
conditional on E having occurred, is a function of the conditional probability p(E\S)
and the marginal probabilities p(E) and p(S). Stated thus, the Theorem is merely an
accounting identity. But a change in terminology draws out its significance. This time,
let S be a statement about politics and p(S ) be a belief about S , i.e., a probability
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Introduction 91
distribution indicating someone’s estimate of the extent to which S is true. E is
evidence bearing on the belief. In this version of Bayes’ Theorem, the estimated
probability of S before observing E is given by p(S); it is often called the prior
probability o f S , or simply the “prior.” The estimated probability that S is true after
observing E is p(S |£), often called the posterior probability o f S . And p(E\S) is the
likelihood function that one assigns to the evidence; it reflects a person’s guess about
the probability distribution from which the data are drawn. Understood in this way,
the Theorem tells us how to revise any belief after receiving relevant evidence and
subjectively estimating its likelihood. It is most often applied to beliefs about future
events (Tetlock 2005), but it is fundamentally a tool for calculating probabilities, and
it applies with equal force to all ideas that can be described in probabilistic terms. See
Figure 3.1; for applications of Bayesian models to political attitudes and evaluations,
see Bartels (1993, 2002) and Gerber and Green (1999).1
Bayes’ Theorem is attractive as a normative standard of belief updating because
it can be derived from two fundamental axioms of probability:
1 If it seems confusing to think of attitudes in probabilistic terms, consider that attitudes are merely beliefs that objects are good or bad in some way, often accompanied by affective responses to those objects (Zanna and Rempel 1988; Abelson 1986). If I like John McCain, I have assigned a high probability to the hypothesis that he belongs to a category of objects that I like. And if reviewing new evidence causes me to like McCain less, I assign a lower probability to that hypothesis. This definition of attitude is in keeping with the view that much mental categorization is probabilistic (e.g., Smith and Medin 1981; Smith 1990).
p(5n^) = Jp(£ri5) (3.2a)
(3.2b)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
92 Bayesian Updating of Political Beliefs
Median Household Income in Am erica
below above
$50,000 $50,000
Probability that Gingrich W ins the Presidency
.4 .6t -.8
Attitude toward John Edwards
Bill Clinton as Steward of the Economy
neither like nor
dislike
like
a littlelike
a lot
Figure 3 .1: Attitudes, Evaluations, and Factual Beliefs Are Probability Distributions. All of political cognition can be conceived in terms of probability distributions, and in doing so we win for political science the vast body of knowledge about subjective probability theory, the chief element of which is Bayes’ Theorem. The upper left-hand panel depicts a factual belief about household income in the U.S.: the person holding this belief estimates that there is a 35% chance that the median household income is below $50,000 and a 65% chance that it is greater than that. This is just a two-category discrete probability distribution. The upper right-hand panel depicts a belief about a future matter, Newt Gingrich’s chance of winning the Presidency in 2008. The belief is a beta distribution: continuous, asymmetric, and bounded between 0 and I (Paolino 2001; Jackman 2008, ch. 2). The lower left-hand panel depicts a positive but somewhat ambivalent attitude about John Edwards: it is a normal distribution. The lower right-hand panel is an ambivalent voter’s evaluation of Bill Clinton as a manager of the national economy—a discrete probability distribution with five categories.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Introduction 93
Equation 3.2a says that the probability that 5 and E both occur equals the probability
that E and 5 both occur. Equation 3.2b is a definition of conditional probability.2
No one consciously rejects either axiom, and following both of them requires that
beliefs be updated according to Bayes’ Theorem: p(S IE) = p(Ep \ and p(E f lS ) =
p(E\S)p(S), so p(S\E) = • By contrast, updating that does not correspond to
Bayes’ Theorem constitutes an implicit rejection of either or both of the axioms.3
Because the denominator of Equation 3.1 only serves to ensure that the
posterior density integrates to one (and is therefore a proper probability density),
Bayes’ Theorem is more commonly expressed as
P(S\E)k P(E\S)P(S).
In words, “the posterior belief is proportional to the prior belief times the likelihood.”
People are Bayesian if their posterior beliefs are determined in this fashion.
Importantly, Bayes’ Theorem says nothing about what one’s priors should be, what
evidence one should use to update, or how one should interpret the evidence that one
does use—a point to which we shall return.
In political science, one Bayesian updating model is far more common than
others: the “normal-normal” model, so-called because it assumes both that people’s
priors are normally distributed and that they perceive new information to be normally
distributed (e.g., Achen 1992; Bartels 1993,2002; Gerber and Green 1999; Husted,
Kenny, and Morton 1995; Gerber and Jackson 1993; Zechman 1979). Suppose that
a voter is trying to learn about p, a politician’s level of honesty. Initially, her belief
2 Although conditional probability is usually presented as an axiom, Bernardo and Smith (1994, ch. 2) show that it can be derived from simpler axioms.
3 This is the simplest valid treatment of a complex topic. For elaborate efforts to root Bayesian statistics in an axiomatic framework, see Savage (1954) and Pratt, Raiffa, and Schlaifer (1964).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
about his honesty is normally distributed: pi ~ N (jjq, <Xq). Later, she encounters a
new message, x, that contains information about his level of honesty. She assumes
that the message is a draw from a distribution with a mean equal to the parameter of
interest.4 The normal distribution is usually a sensible assumption: if the message can
theoretically assume any real value, and if error or “noise” is likely to be contributed
to it by many minor causes, the central limit theorem suggests that it is likely to be
normal. We write x ~ N(ji, cr2x). The variance of this distribution, cr2x, captures how
definitive the new information is. If it is communicated directly from a highly credible
source, the signal it sends is clear and the variance is quite small. But if it is merely
a rumor that the voter spots in a tabloid, the signal is only slightly informative and its
variance will be high. Similarly, the variance of a prior or posterior belief is a measure
of the confidence with which it is held: the higher the variance, the less confidence one
places in one’s estimate of //. (See Figure 3.2.)
94 Bayesian Updating of Political Beliefs
4 Assuming that the mean of the message distribution is \i is tantamount to assuming that the message comes from an unbiased source. If the voter believes that the message comes from a biased source, she needs to adjust for that bias before updating. This is no obstacle to Bayesian updating (e.g., Jackman 2005), but it is not part of the normal-normal model.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Introduction 95
3 5
strong belief
3 5
weak belief
Figure 3.2: The Variance of a Belief Indicates Its Strength. Both panels depict beliefs that are normal distributions with means of 3. The distribution in the left-hand panel has a variance of .25: the person who holds this belief is quite confident that the parameter of interest is about 3. By contrast, the distribution in the right-hand panel has a variance of 4. The person who holds this belief is not confident that the parameter of interest is close to 3; to him, it could easily be around I or 5 or some value even more distant from 3.
By a common result (e.g., Box and Tiao 1973), a voter with a normal prior
belief who updates according to Bayes’ Theorem in response to x will have posterior
belief fi\x ~ N(jx\,cr\), where
The posterior mean, n\, is a weighted average of the mean of the prior belief and the
new message. The weights are determined by the precisions, i.e., the reciprocals of the
variances of the prior belief and the new message. This is a fantastically convenient
result, as it permits us to compute posterior means without multiplying the prior
probability distribution by the likelihood. And this convenience helps to account for
and (3.3a)
(3.3b)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
96 Bayesian Updating of Political Beliefs
the popularity of the normal-normal model. But the model has shortcomings that make
it normatively unattractive and descriptively unrealistic:
1. Surprise is impossible. The model implies that people always become more
certain of their beliefs over time.
2. Polarization is unthinkable. Under widely assumed conditions, the model
implies that people who initially disagree are literally incapable of holding
posterior beliefs that are less alike than their priors.
3. Agreement is inevitable. The model implies that people will always disagree less
as they learn more. Furthermore, learning enough will always cause their beliefs
to converge to agreement.5
All of these shortcomings are peculiar to the normal-normal model; they are not
inherent properties of Bayesian updating. In the remainder of this article, I elaborate
each of these shortcomings and define other Bayesian updating models that surmount
them.
5 These two statements are not equivalent: it is possible for an updating model to imply ever-diminishing disagreement without implying eventual agreement. But the normal-normal model implies both.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Bayesians Can Be Surprised 97
The Norm al-Norm al Model Implies that People
Always Become More Certain, But O ther Bayesian
Updating Models Do Not
A desirable property of any learning model is that surprising new evidence can
cause people to become less sure of their beliefs. A distressing property of the
normal-normal model is that surprising new evidence always makes people more
sure of their beliefs. Formally, note that the variance of a posterior belief in the
normal-normal model can be expressed as
people more certain of their belief. Moreover, the extent to which new information
is surprising has no bearing on the extent to which it changes the certainty of one’s
beliefs. This shortcoming of the normal-normal model is often noted (Learner 1978;
Gerber and Green 1998; Bartels 1993, 2002; Grynaviski 2006), even though it has not
dented the model’s popularity. Fortunately, the problem is anything but endemic to
Bayesian updating. It is an artifact of an unrealistic assumption of the normal-normal
model: that we know the variance of the distribution that we are trying to learn about,
and need only estimate its mean. This situation is as rare in politics as it is in any other
domain. A model in which we simultaneously learn about the mean and variance of the
unknown distribution is both more realistic on its face and capable of accommodating
cases in which people become less certain over time.
<7If cr\ is finite, < 1, and therefore cr] < cr : updating will always make0 x
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Updating with Unknown Mean and Unknown Variance
98 Bayesian Updating of Political Beliefs
Instead of trying to learn only about the mean of a distribution, we are now trying
to learn about both its mean, p, and its variance, cr2. Our prior belief about these
parameters is a bivariate joint distribution, pip, cr2). It can be expressed as the product
of a conditional prior belief about the mean, p(p\(r2), and a marginal prior belief about
the variance, p(cr2). Many different densities might be used to model these priors,
but two of the most obvious choices are a normal distribution for the mean and an
inverse-Gamma distribution for the variance. (The inverse-Gamma distribution is
attractive for modeling variances because it is continuous, flexible, and has a lower
bound of 0 but no upper bound.) Specifically, the normal/inverse-Gamma prior
distribution can be expressed as the product of two densities,
• po is the mean of the prior belief about p
• a21 no is the variance of the prior distribution for p, conditional on cr2. no is
interpretable as “prior sample size”: the smaller it is, the larger the variance
of the prior belief, reflecting the fact that a prior based on less information
(or fewer “prior observations”) is less precise than a prior based on more
information
• v0 > 0 is a prior shape parameter, i.e., a prior “degrees of freedom” parameter
where
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
• vocr2 is a prior scale parameter, equivalent to the sum of squared residuals
one would obtain from a previously observed dataset of size v0 in which each
observation came from a N(fi0, cr^) dataset.
Bayesians Can Be Surprised 99
The marginal inverse-Gamma prior distribution has a * 2 shape, and indeed, the
inverse-*2 distribution is a special case of the inverse-Gamma distribution. (Gelman
et al. 2004 discuss updating with normal/inverse-*2 priors; see Grynaviski 2006 for an
application.)
Suppose that we encounter messages x = (xi, . . . , jc„) that we believe bear
directly on the parameters about which we are trying to learn; i.e., we believe
that Xi ~ Nip, cr2) V i e (1 , . . . ,«). If our prior belief about n and cr2 is a
normal/inverse-Gamma distribution with parameters //0, n0, v0, and <Xq, our posterior
belief will also be a normal/inverse-Gamma distribution:
filer2,x ~ ) and (3.4a)\ n0 + n)
7 (v\ Vjcr? \cr |x ~ inverse-Gamma I —, I, (3.4b)
where Vi = v0 + n, and vicr2 = v0ct + £ (xt - x)2 + ^ Qi0 - x)2.1=1
Usually, interest focuses on the marginal posterior distribution of fi, which is a t
distribution:
//|x = tVl (//i, -^cr2/ (n0 + n ) j .
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
100 Bayesian Updating of Political Beliefs
Now, the marginal prior distribution of ji is tYQ ^u0, yjo^/noj. If cr2/n0 < cr2/ (n0 + n),
the posterior belief about n has a higher variance than the prior. This occurs when
2 1
no riQ + n
2 ( n 0 + n \ voctq + £ /= i (*« - x)2 + & ( / /0 - * ) 20-51------- 1 <n0 / v0 + n •
( n° + H\ ( yo + n ) ~ v0o^ < V (x, - x)2 + (^0 _ x)2 .\ no j no + n
_2 2 nL'q / 2\ C/- 2 ^— (nv0 + nn0 + n ) + — (n0v0) - v0cr < ) (Xj - x) + ------- (pi0 - x)n0 v ’ n0 n0 + n
cr2 n— [n (v0 + n0 + n)] < V (j:,- - x)2 + —— (/i0 - *)2 . (3.5)no no + n
Equation 3.5 establishes that people who update according to Equations 3.4a and 3.4b
will become less sure if they learn from data that are sufficiently surprising ((Jc - fio)2
is large enough) or sufficiently vague (£"=i ( j c , - x)2 is large enough). Thus, in contrast
to the normal-normal model, a model that presumes that both a mean and a variance
are unknown permits new information to shake the confidence that people repose in
their beliefs.
Bayesian Updating Models Can Accommodate
Biased Interpretation of Political Information
It is common to hear or read that “Bayesian updating requires independence between
priors and new evidence” (Taber and Lodge 2006, 767; see also Ottati 1990, 160;
Fischle 2000). The notion underpinning the claim is, presumably, that Bayes’ Theorem
demands that beliefs correspond to some objective conception of reality. But there
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Bayesians Can Be Biased 101
is not even a germ of truth to such claims. Nothing in Bayes’ Theorem—nothing in
the other writing of Reverend Bayes—nothing in the writing of his contemporary,
Laplace—nothing in the whole of Bayesian statistics past or present warrants such
a claim. “Objective perception” of political information may belong in a standard
of rationality—and this is a point to which I shall return—but it has no place in a
framework for belief updating. Indeed, the entire history of Bayesian scholarship
militates against the notion that understanding of probability requires or profits from
objective perception of the world (de Finetti 1974; Savage 1964; Jeffrey 2004).6 Still,
there are some for whom the dream will never die, and Bayesian updating models are
quite able to accommodate them.
Partisanship is often thought to influence political views through selective
perception: the assimilation, from the evidence at hand, of only or chiefly those details
that support one’s prior beliefs (Campbell et al. 1960, esp. Chapter 6; Bartels 2002;
Taber and Lodge 2006; Jacobson 2006; Gaines et al. 2007; see also Bullock 2006).
Nothing in the normal-normal model is incompatible with selective perception; like
Bayes’ Theorem, it makes no prescription about the way in which evidence is to be
interpreted. Still, researchers who have a standard of objective perception may prefer
a model that explicitly distinguishes between selective and objective perception. The
normal-normal model can be adapted to this task.
We begin by distinguishing “good” messages that comport with one’s prior
and “bad” messages that do not. Formally, let the former set of messages be xg =
(x i , . . . , x,) and the latter set be xb = (xt+l, . . . , x T). ~ N (//, cr2) Vi € (1 , . . . , T).
6 O f course, this is not to deny that there are objective probabilities; although many contemporary “subjectivist Bayesians” (e.g., de Finetti, Savage, Diaconis) deny that there are, Bayes and Laplace did not. Note that there is a school of thought that goes by the name “Objective Bayes,” but the “objectivity” that its members favor amounts to the rejection of certain types of prior beliefs as inappropriate—not to the assumption that there are objectively correct likelihoods for data, least of all in the ambiguous world of politics. On both counts, see Press 2003.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
xg = xg ~ N (//,cr;/r); Xb = Xb ~ N ( fi ,a 2x/(T - t)). If one’s prior belief is
H ~ N (juo, cr^j, and one updates according to the normal-normal model, one’s posterior
is /i|xg, Xb ~ N (/ii, erf), where
102 Bayesian Updating of Political Beliefs
xg = £ |=1 Xj/t, and Xb = EL+i Xi/(T - 0- As each xt is subjectively determined, so
too are xg and xb: different people interpreting the same evidence may come up with
different values of these “sample means.” But if we, as researchers, have a standard
of objective interpretation, we can make this subjective assessment explicit by using
a modified model. The formula for the mean of the posterior is given by Gerber and
Green (1999):
/ l / o l \//1 ^ \ l/cr2 + t/cr2 + (T - t)/(r2J
_ ( t/<r2 \ . _ I (T - t)/cr2x \agXg* \ 1 /cr2 + t/cr2x + ( T - t)/cr2x ) “ *** \ 1 /cr2 + t/cr2 + (T - t) /a 2 ) '
(3.6)
ag and or* are the selection weights', they indicate the extent to which favorable and
unfavorable messages are misinterpreted. xg* and xbt are the objective sample means
of the favorable and unfavorable messages, i.e., the sample means in the eyes of the
researcher. In the absence of selective perception, a g = a b = 1. If higher values
of Xi are preferable, and selective perception consists of giving an unduly favorable
interpretation to bad news, a b will be greater than 1. If selective perception consists of
exaggerating the good news provided by favorable information, ag will be greater than
one.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Convergence and Polarization of Public Opinion 103
A closely related form of political bias is at work when people attribute more or
less credibility to a news source than it deserves: when Communist Party officials exalt
the People s Daily, perhaps; or when Republicans take Rush Limbaugh at face value.
This calls for a slightly different model:
I l / o t \
^ ^ \ I/a* + t/(r2x + (T - t)/o~2/
+ x (________ azfl^x________ \ ( ab( T - t ) / o j \g \ 1 /o% + ocgt/a2 + (T - i ) / a 2) b \ 1 /a 2 + t/cr2x + a b{T - t ) / a 2J ‘
(3.7)
Here, ag and a b apply not to the content of new information but to its credibility. If one
overrates the credibility of favorable messages, ag is greater than 1: it it is as though he
is responding to more favorable messages than he really received. If one underrates the
credibility of unfavorable messages, ab is less than 1: it is as though he is responding
to fewer favorable messages than he has received.
Convergence and Polarization of Public Opinion
under Bayesian Updating
No issue in Bayesian analysis of public opinion is more disputed than the implications
of Bayesian updating for disagreement among people with different prior beliefs.
Gerber and Green (1999, 203-05) maintain that if Republicans and Democrats are
Bayesian, they will agree neither more nor less as they update in response to new
evidence. Empirically, Gerber and Green find just this patterning of presidential
approval over time and adduce it as evidence that many people are Bayesians
whose views are unaffected by partisan bias. Bartels (2002) cites the same public
opinion data as evidence that people are biased Bayesians or not Bayesian at all:
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
104 Bayesian Updating of Political Beliefs
unbiased application of Bayes’ Theorem, he writes, implies convergence of public
opinion. Achen (2005, 334) agrees, and Grynaviski (2006, 331) claims that Bartels
“formally proved” that Bayesian updaters will “inexorably come to see the world
in the same way.” (He didn’t.) A closely related dispute is about polarization of
public opinion: Gerber and Green (1999) write that attitude polarization (Lord,
Ross, and Lepper 1979) is incompatible with Bayesian updating except under very
unusual circumstances, while Steenbergen (2002, 7-8) concludes exactly the opposite.
There is at least a little truth to all of these positions: Bayesian updating does imply
convergence of public opinion under some conditions, but these are more numerous
and more stringent than the discussions to date have acknowledged.
Before embarking on a series of proofs, it will help to distinguish between
three kinds of convergence, all of which are depicted in Figure 3.3. Convergence
to agreement occurs when prior beliefs converge to the same belief after updating.
Convergence to signal occurs when people’s beliefs converge to the mean of the
distribution of messages that they are using to update; if beliefs converge to the
same signal, they also converge to agreement. This kind of convergence has been
widely discussed in Bayesian statistics, where it is subsumed by the broader topic
of consistency of Bayes estimates. In that literature, convergence to signal has
been proved to hold under general conditions for a wide variety of prior and data
distributions (e.g., Diaconis and Freedman 1986; Strasser 1981); I focus here on the
case of normal priors and data because of the ubiquity of these assumptions in political
science. Convergence to truth occurs when beliefs converge to the true parameter of
interest. Often, convergence to signal implies convergence to truth, but not always. If
beliefs are updated in response to messages from a biased news source, convergence to
signal implies that beliefs are not converging to the truth.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Convergence and Polarization of Public Opinion 105
E |
% ! 2.1
estimates converge to agreement but not to signal
estimates converge to agreement and signal
but not to truth
time
estimates converge to agreement, signal, and truth
Figure 3.3: Convergence to Agreement, to Signal, and to Truth. The dashed lines depict parameter estimates by two people. They converge in each panel; this is convergence to agreement. The solid black line in each panel represents the mean of the distribution of information that people are using to update their estimates, e.g., the distribution from which news articles are drawn. In the leftmost panel, the parameter estimates do not converge to the mean of this distribution. In the middle and rightmost panels, they do: this is convergence to signal as well as convergence to agreement
The grey line in the last two panels indicates the true parameter value. In the middle panel, it differs from the mean of the information distribution. This occurs whenever the information that people use to update their beliefs is biased on average. In the rightmost panel, the mean of the information distribution is also the true parameter value: here we have convergence to truth as well as to agreement and to signal.
Convergence of Public Opinion Under the Normal-Normal
Model
Proposition 1. A person’s prior belief is/u ~ N(p0, cr2). He updates according to
Equation 3.3a in response to x, a sample of t messages that he perceives to have mean
jc, with
jc, ~ NQj., crl) V i e (1, . . . , t). If t is large enough, his belief will converge to x.
Proof. By a result shown in the appendix, the t messages are equivalent to a
single message Jc from distribution N(ji, cr^/t). Suppose e > 0 and T = . Then°0€t > T implies
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
106 Bayesian Updating of Political Beliefs
M* - *1 =a \x + (a lJ t) ii0 - x(crl + cr2J t)vV/T-2
O’O + ° 1 / '
((T2x/t)(ji0 - X)
+ <r2xlt
(T2x |/io - *1to-2 + 0-2
T5f1o3,
70-2+0-2
-•*1
e)] 1 o*0 6 1 a x
o-\ \Mo -
+ O'2
ecr |/io - x|0-2 (1/iq - *1 - e) + ecr^
= 6. □
Discussion. The proof reveals conditions under which any Bayesian updater’s
belief will converge to a subjectively defined jc. To some (e.g., Grynaviski 2006;
Bartels 2002), it seems a short step to infer that Bayesians who initially disagree will
come to agree with each other. In fact, the conditions set forth in the proof are quite
stringent, and the conditions required for convergence to agreement are more stringent
still.
First among the requirements for convergence to agreement is simply that
people are Bayesian or that they adopt non-Bayesian updating rules that nevertheless
permit convergence. This is a point too often elided by those who take nonconvergence
as proof of partisan bias (e.g., Bartels 2002). There is ample laboratory evidence
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Convergence and Polarization of Public Opinion 107
that people are not Bayesian updaters, and while most documented non-Bayesian
tendencies do nevertheless permit convergence (Phillips and Edwards 1966; Tversky
and Kahneman 1971), some may not (Chapman and Chapman 1959; Hamm 1993).
The second requirement is that people with different priors perceive the new
information in the same way: technically, they must agree on the value of Jc. This
rules out selective perception (unless people with different prior selectively perceive
evidence in the same way—an unlikely circumstance, given that priors influence
the strength and direction of selective perception). It also rules out cases in which
people are updating evaluations that simply reflect different values. For example, if a
Republican president’s economic policies are consistent with Republican values and
inconsistent with Democratic ones, we should not expect convergence even if members
of both parties are Bayesians who have the same understanding of new economic
information.
A third requirement likely to be violated is that people update exclusively
on the basis of the same set of messages. If they do not—if, for example, i and
j both update on the basis of x, but i also updates on the basis of messages that
seem have a different sample mean—there is no reason to expect updating to cause
agreement. This rules out selective exposure, whereby people with different views
may systematically expose themselves chiefly to congenial news sources (Taber and
Lodge 2006; Gentzkow and Shapiro 2006). It was once argued that selective exposure
is uncommon—especially in politics—because people do not consciously seek to
reinforce their views through their choice of media (Sears and Freedman 1967; Frey
1986). But selective exposure does not require a reinforcement motive (Katz 1968),
and the heightened sorting of the electorate (Levendusky 2006) and splintering of
the market for news into specialized niches (Prior 2007, Chapter 4) may have made it
increasingly common.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
108 Bayesian Updating of Political Beliefs
A fourth requirement is that the parameter about which people are updating
is constant over time. As I show below, weakening this assumption allows
nonconvergence and polarization when people update their beliefs, even if they are
updating in response to the same information and are interpreting that information in
the same way.
Finally, complete convergence to agreement—the assumption that Bayesians
will “inexorably come to see the world in the same way” (Grynaviski 2006, 331)—can
only occur if updaters are responding to a set of messages so powerful that it causes
them to completely ignore their prior beliefs. (Formally, the precision of the set of
new messages must be infinitely greater than the precision of the prior beliefs.) Even
relative to the other conditions, this is unrealistic. Bartels (1993) argues forcefully that
people’s beliefs about candidates at the start of Presidential campaigns are far stronger
than we usually imagine. And it is widely known that most Americans are exposed
to only meager amounts of political news (Campbell et al. 1960; Zaller 1992; Delli
Carpini and Keeter 1996; Prior 2007). The assumption may hold in the extremely long
run, but by then, as Keynes noted, we’ll be dead.
Almost no interesting political predicaments satisfy all of these conditions.
This suggests that the recent focus on convergence has been misplaced: the failure of
Republicans and Democrats to evaluate the President in the same way or to otherwise
share the same beliefs may be evidence of selective perception, but it may also be due
to any of several other, quite likely factors. On the other hand, convergence becomes
more likely as more of these conditions are met, and it would be quite surprising if
Bayesian updating did not imply convergence of public opinion when all of them are
met. That is why the next two proofs are interesting: they show that convergence may
not occur under Bayesian updating if these conditions are only slightly relaxed.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Nonconvergence and Polarization Under Bayesian Updating
with Selective Perception
Convergence and Polarization of Public Opinion 109
Proposition 2: Nonconvergence and Polarization Under Bayesian Updating with
Selective Perception. Let voters i and j have prior belief // ~ N(po, cTq). Both update
in response to x, a set of T messages, under the presumption that jc , ~ N(p, cr2x) V i e
(1 , . . . , T ). As in Equation 3.6, x can be partitioned into xg, a set of t messages that
favor the voters’ prior, and xb, a set of (T - 1) messages that contradict the voters’ prior.
Assume t > 1 and (T - t) > 1. The objective mean of xg is xg* and the objective mean
of Xb is x again, these “objective” means are stipulated by the researcher. Assume
^ Xb*. Voter i updates by Equation 3.6 with selection weights agi and a^,. Voter j
updates by Equation 3.6 with selection weights agj and abj. If agi ± agj or abi a b],
convergence to agreement cannot occur unless
and that will only occur by chance.
Proof. By Appendix A, updating in response to the messages in xg and xb is
equivalent to updating in response to a single draw
_ _ org X g ^ /iT - t ) + abxb*(T2x/t X* (T2x/t + (T2x/(T - t)
agXg(r2x/(T - t ) + a b X b ^ /t ~ cr2xT/n(T - t)
= (agXg^x/iT - 0 + oibXb^Jt) H 2 T ^
tagxgt + (T - t)abXb,= T
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
from a normal distribution with variance a^/T . By Equation 3.3a, the posterior mean
is
. . I V ° t \ - 1 T/cri \flX + ( l/er j + r / c r j ) '
Assume e > 0. By Proposition 1, \/x\x - x*| < e fo rT > i.e.,//|x will°oeconverge to x,. It only remains to show that jc, takes on different values for i and j
when they have different selection weights:
tagiXg* "H (T t^)(YXf}* t(XgjXg* "I- (T t )(}*bjXfr*
110 Bayesian Updating of Political Beliefs
where jc*, is the value of jc, using fs selection weights and xtj is the value of jc* using
/ s selection weights. By assumption, t and T - t are positive, and xgt + xb*, so the
posterior beliefs of i and j will never converge to agreement unless
(agi - a gj) = ( a bi - a bj) . :
Discussion. The proof shows that Bayesians with the same prior beliefs who update
under selective perception will generally have different prior beliefs; it is therefore not
just a nonconvergence result but a polarization result, too. Of course, nonconvergence
is no less likely when i and j have different prior beliefs: it depends entirely on
the difference between jc„• and xtj, and not at all on the difference between prior
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
beliefs. And even when i and j have different priors, selective perception will lead to
polarization when
Convergence and Polarization of Public Opinion 111
Convergence and Polarization under Kalman-Filter Updating
The models considered to this point assume that people are trying to learn about //, a
quality of the political environment that does not change. Achen (1992), for example,
assumes that the Democratic and Republican Parties offer benefits to each voter that
oscillate over time around a mean benefit level that never changes, and he uses the
assumption to justify his use of the normal-normal model to study changes in party
identification. Such fixed-mean assumptions are apt when we are trying to learn about
history and perhaps when we are trying to update our beliefs over the short term.
But they are inappropriate when the parameter of interest changes over time. Pace
Achen, the net benefit that I derive from a party changes as my views or economic
status change. My preferences over policies change as new proposals are placed on the
table or taken off of it. And candidates may improve during their time in office or fall
increasingly under the sway of constituents whose views I oppose. In all of these cases,
the constant-parameter assumption is an approximation at best.
Suppose that a Bayesian is trying to learn about a parameter that changes
according to the rule
a, = ?<*,_! + ea, ea ~ Af(0,o£) (3.8)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
where t is the current period, y is a known autoregressive parameter, and ea is a
disturbance term with known, finite, nonzero variance a 2a. Let jcj, . . . , x,_i, xt denote
the observed values of the parameter of interest at times 1, 1, t. The relationship
between x, and a, is
x, = a, + ex, ex ~ N(0, a^) (3.9)
where cr2x is a known, finite, nonzero variance term. (Following Gerber and Green 1998
and Green, Gerber, and de Boef 1999, y, cr2, and cr\ are held constant for simplicity of
exposition. But the model described in this section can easily accommodate the case in
which they change over time. See Meinhold and Singpurwalla 1983 for an example.)
His initial belief is
(ar0 | y, a^, crty ~ N(a0, P0)-
Looking forward to period 1, his prior belief about the parameter of interest is
governed by Equation 3.8:
(ori | y, a l , aj'j ~ N (ya0, + a2a).
112 Bayesian Updating of Political Beliefs
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
But after receiving message x {, he updates. Because both his prior belief and the
perceived likelihood of x Y are normal, he updates according to Equations 3.3a and
3.3b:
Convergence and Polarization of Public Opinion 113
(ari | y ,a l , o l ,x i) ~ N(&i,Pi), where
l / ( y 2P(> + O'2)a i = ya 0i / ( y 2P0 + a 2) + i/o-i
l
+ X\l /cr
[ l / t f P o + 0-1)+1/0-1
l /(y 2P0 + o-l)+ l/a-2'
And generally,
( o r , | y , a l , o j , x t _ i , x t ) ~ N ( a t , P t ) , w h e r e
\ / ( y 2 P t- i + a l )at = yat~i
l/(y2Pt-i + o-2) + I /0-2+ x ,
I /0-2
Pt =\ / { y 2 P , - i + a 2a ) + I / 0 - 2 ’
l / ( y 2 P t- i + c r 2 ) + \ / o -2
(3.10)
(3.11)
and where xt_i is the vector of messages jci, . . . , x,_i. Equations 3.10 and 3.11 are
known as the Kalman filter algorithm after Kalman (1960) and Kalman and Bucy
(1961), who show that Equation 3.10 yields the expected value of a t under the
assumption of normal errors. If the normality assumption is relaxed, the Kalman
filter estimator of a, remains best (i.e., least-squares-minimizing) among all linear
estimators. Harvey (1989) and Beck (1990) contain extensive descriptions of
the Kalman filter and its properties. Goussev (2004) argues from a neurological
perspective that the human brain unconsciously uses it to update probabilities.
Meinhold and Singpurwalla (1983) provide a lucid introduction to it from a Bayesian
point of view. Gerber and Green (1998) note that the normal-normal model is a special
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
case of the Kalman filter model in which y = 1 and cr2a = 0. And given that the
Kalman filter model is a Bayesian updating model, one of its surprising implications
is that people’s beliefs may diverge even if they are updating in response to the same
messages and interpreting those messages in the same way.
Proposition 3A: Polarization Can Occur under Kalman Filtering Even in the
Absence of Selective Perception and Selective Exposure. Assume that a„ (r2a, y, x„
and a i are as described above. Let K, = , D 1/(7t Voter Vs belief about a 0 is
N(&io, Pq). Voter f s belief about «o is N(&jo, Po). They are exposed to one message
at each stage t; the entire set of messages is x,. They update their beliefs according to
Equations 3.10 and 3.11. If a® 4- ajo, their beliefs diverge from time t to time t + 1 if
and only if (1 - K,+l) |y| > 1.
Proof. We begin with a lemma: the Kalman filter estimator a t can be written as
a linear function of or0,
a, = c,ao + ft'xt, (3.12)
where c, = n /= i (1 ~ Kdy, ft' is a row vector, and xt is the column vector of messages
x i , . . . , xt. (See the appendix for a proof.)
114 Bayesian Updating of Political Beliefs
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Proof by contradiction: assume some t such that beliefs at t + 1 are not more
polarized than beliefs at t even though (1 - Kt+i)\y\ > 1. Note that 0 < l - K , < \ V t .
Then
|a it - a jt| - |q',-,+i - or;,,+i| > 0
=> |c,arl0 + ft'xt - (ctajo + ft'xt)| - |(c +idri0 + ft'+1xt+i) - (cl+la j0 + ft'+1xt+i)| > 0
^ Iq I — Iq +iI > ot t
=> Y \ ( 1 - Ki) Irl - (1 - */+i) Irl n a - Ki) w ^ 0i=l i=l
=>(i-ATf+1) irl < i ,
which is a contradiction. This establishes that divergence occurs between times t and
t + 1 if (1 - Kt+i) |y| > 1. Now assume some t such that beliefs at t + 1 are less alike
than beliefs at t even though (1 - Kt+{) |y| < 1. Then
|a.t - & jt\ ~ |a,-,/+iQ'y>f+i| < 0 t t
=> f [ ( i - Irl - (i - Kt+1) Irl Y \ ( i - Kt) Irl < o1=1 i=i
= > d - ^ +1) l r l> i,
which is a contradiction. This establishes that divergence occurs only if
( l - f f , +1) l r l > l . □
Proposition 3B: Convergence to Agreement Occurs Eventually under the Kalman
Filter. Assume the conditions of Proposition 3 A. At some period t, K, will reach a
steady state. After that point, polarization will not be possible.
Convergence and Polarization of Public Opinion 115
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Proof. We begin with a lemma: K, will gradually converge to
y2 — c - 1 + y](-y2 + c + l)2 + 4cy2
116 Bayesian Updating of Political Beliefs
K =2 y2
where c = cr2/cr2 (Gerber and Green 1998; see the appendix for a proof). By
Proposition 3A, divergence occurs if and only if (1 - Kt+l) \y\ > 1, but this is not
possible when Kt+l = K.
Proof by contradiction. (1 - £ , +1) \y\ > 1 implies |y| > 1, because 0 < K, < 1 V t.
If y > 1, divergence in the steady state implies
y 2 - c - 1 + V ( - 7 2 + c + l)2 + 4cy22y2
y > 1
y2 — c - 1 + -y/(-y2 + c + l)2 + 4cy2= > y ------------------------ ------------------------> 1
2 y
=> - (y2 - c - 1 + yj(-y2 + c + l)2 + 4cy2) > (1 - y)2y
=> c + 1 - V (-y2 + c + l)2 + 4cy2 > 2 y - y2
=> c + 1 - Vr4 + 2cy2 - 2y2 + c2 + 2c + 1 > 2y - y2
^ y 2 - 2 y + c + l > Vr4 + 2cy2 - 2y2 + c2 + 2c + 1
y4 - 4y3 + 2cy2 + 6y2 - 4cy - 4y + c2 + 2c + 1 > y4 + 2cy2 - 2y2 + c2 + 2c + 1
=> —4y3 + 6y2 — 4cy — 4y > —2y2
=> -4 y 3 + 8y2 - 4yc - 4y > 0.
This implies c < - y 2 + 2y -1 . But - y 2 + 2y - 1 < 0 V y, and c must be positive because
it is a ratio of positive variances. Contradiction.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Convergence and Polarization of Public Opinion 117
If 7 < “ 1,
y2 - c - 1 + V(-y2 + c + l)2 + 4cy2'2y2
=>2 Irl
=> - Ir2 - c — l + ^ ( - r 2 + c + i)2 + 4cr2) > (l - lrl)2 Irl
=> - (r2 - c - 1 + -\J(-y2 + c + l)2 + 4cy2 j > 2 Irl - 2r2
=> r 2 - 2 Irl + c + 1 > ^ r 4 + 2cy2 - 2r2 + c2 + 2c + 1
=> 74 - 4 Irl3 + 6r2 + 2r2c - 4\y\ - 4\y\c + c2 + 2c + I > y4 + 2c/2 - 2r2 + c2 + 2c + 1
-4 Irl3 + 8-y2 - 4 Irl c - 4 Irl > o.
This implies c < - y 2 - 2 y - 1. But - y 2 - 2y - 1 < 0 V y, and c must be positive
Discussion. The proof of Proposition 3 A shows that polarization can happen
even in the absence of selective exposure or perception: it occurs when updating
causes people who disagree to weight their prior beliefs more heavily than they did
before. And this occurs when (1 - Kt+\) Irl > 1. Because K, lies between 0 and 1 for all
values of t, (1 - Kt+i) must also lie between 0 and 1, and Irl > 1 is therefore a necessary
condition for polarization. It may seem, moreover, that bigger values of Irl produce
more polarization. But this is not generally so, because y enters into the definition of
Kt, too, and the algorithm that determines K, quickly “catches up” to offset the size of
Irl in (1 - Kt) |rl- Proposition 3B shows that the algorithm catches up completely when
K, reaches its steady state K: it is impossible for (1 - K) Irl to be greater than 1.
because it is a ratio of variances. Contradiction. □
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
118 Bayesian Updating of Political Beliefs
time
c o n v e r g e n c e s l ig h t p o l a r i z a t i o n
th e n c o n v e r g e n c ee x t e n d e d p o la r iz a t io n
t h e n c o n v e r g e n c e
Figure 3.4: Convergence and Polarization under Kalman Filter Updating. Allpanels depict simulated belief updating by two hypothetical voters when y = I . I , ao = I , the voter represented by the solid line has prior belief N(l,l), and the voter represented by the dashed line has prior belief N(—1,1). The scale of the y axes differs across panels to make differences between the voters apparent.
In the leftmost panel, cr = .7 and cr2 = 4. We see the intuitive pattern of belief updating under Bayes’ Theorem: subjects who receive the same messages and interpret them in the same way draw closer to agreement every time they update. The convergence is represented by the vertical distance between the solid and dashed line, which diminishes in each period.
The middle panel depicts updating when cr2 = .5 and cr2 = 25. Even though voters are receiving the same messages and interpreting them in the same way, their beliefs diverge (slightly) from period 0 to I and again from period I to period 2. Not until period 4 is the distance between their beliefs smaller than the distance between their priors. This pattern is exacerbated in the rightmost panel, where <x2 - .19 and cr2 = 100. Here, voters’ views diverge continuously in each of the first eleven periods; not until the 22nd period (unshown) is the distance between their beliefs smaller than the distance between their priors.
In practice, polarization is sustained longest when |y| is barely greater than 1
and when cr2 is far smaller than cr2. (See Figure 3.4.) The former condition occurs in
politics whenever a parameter of interest is trending slowly away from zero.7 Almost
all political parameters of interest trend away from zero sometimes, and some (e.g.,
population, racial tolerance, per capita income in developed countries) seem to trend
slowly away from zero for very long periods of time. The second condition, cr2
exists whenever the true variation in a parameter is slight but the quality of the
7 For parameters that have no natural scale, zero is arbitrarily defined. For example, in the case of the President’s honesty, zero might correspond to a survey answer of “neither honest nor dishonest.”
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Convergence and Polarization of Public Opinion
messages that we receive about it is poor. This is likely when secretive or totalitarian
states exert control over the press. Schumpeter (1942) argues that it is also a condition
endemic to democratic politics: although there are knowable political truths, the
feedback that we receive about our political decisions is so poor as to verge on useless.
Presume, for example, that we want to know whether the President is a good steward of
the economy. How are we to know? If we think that we have observed improvement in
the economy, it might be attributed to the President’s policies, or the lagged effects of
his predecessor’s policies, or the actions of incumbents at other levels of government,
or the lagged effects of their predecessors, or the Federal Reserve, or wholly apolitical
factors. It is rarely easy to tell who is responsible. And this elides the difficulty of
simply knowing when the economy has improved. Even a free and robust media will
not be of much help in these cases: no matter how precise the signals they send about
the state of the economy, signals about who deserves credit and blame are unavoidably
vague.8
Bartels (2002, 123) tells us that “it is failure to converge that requires
explanation within the Bayesian framework,” and he fingers selective perception
by Democrats and Republicans as the culprit in their failure to agree on a host of actual
matters. He may be right, but the case is far from cinched, and his characterization
of Bayesian updating is too strong. Bayesian updating models imply only partial
convergence of only some beliefs, and only under fantastically rare conditions do they
imply that people who disagree will “inexorably come to see the world in the same
way.” The logic of Bayesian updating alone provides no reasonable expectation of
8 Indeed, o~ <k cr^ in the case of attributions of praise and blame is the root of Schumpeter’s (1942, 262) contention that “the typical citizen drops down to a lower level of mental performance as soon as he enters the political field.” The problem is not that political man is stupid but that causality in politics is so complex and feedback about political decisions is so poor. Man’s “lower level of mental performance” is due to his failure to sufficiently adjust for the poor quality of information at his disposal.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
convergence—not during a campaign, not even over a lifetime. Of course, agreement
may occur. But enduring disagreement is no proof that people are not Bayesian. Still
less is it proof of partisan bias.
Discussion: Bayes’ Theorem as a Normative
Standard of Belief Updating
As a way of describing how citizens actually update their beliefs, Bayesian updating
models have come in for a wealth of criticism. Much laboratory research shows
that people do not update as Bayesians (Phillips and Edwards 1966; Tversky and
Kahneman 1971; though see Koehler 1996). And some political scientists believe
that the Theorem cannot accommodate ordinary features of public opinion about
politics (Taber and Lodge 2006; Fischle 2000). This essay shows, to the contrary,
that important features of public opinion about politics are readily accommodated by
Bayesian updating. Political events are often surprising, and Bayesian updating can
reflect that surprise by causing people to hold their views less confidently. Partisan
bias affects people’s views through selective perception or misjudgments of the
credibility of media outlets, and Bayesian updating can easily accommodate this.
Most importantly, people who disagree about politics are rarely moved to agreement
by even a flood of evidence. Exposure to the evidence may even cause their views
to draw further apart. Contrary to what some have written, Bayes’ Theorem can
easily accommodate enduring disagreement and polarization, too. None of this
means that Bayesian updating models perfectly capture every facet of political
decision-making—no models do—but it does mean that they are better than many
suppose.
120 Bayesian Updating of Political Beliefs
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Discussion 121
But it is precisely this flexibility that makes Bayesian updating inadequate
as a standard of rational thinking about politics. One need only consider all that it
permits. There is no limit to the amount of evidence that a Bayesian may ignore or
misconstrue, for Bayes’ Theorem applies only to updating, not to the collection or
interpretation of evidence. And even in the absence of perceptual biases—even if
people are interpreting and using new information in exactly the same way—Bayes’
Theorem permits their views to polarize. No standard of rational belief revision should
be so permissive.
The problem is not that the Theorem is irrational; as we have seen, it entails
abiding by axioms of probability so fundamental that they should be a component of
any standard of rational thinking. The problem is that the Theorem is not restrictive
enough: it permits what we should reject as irrational. Political scientists interested
in constructing standards of rational political thinking will do well to couple it with
restrictions on the interpretation of political evidence. Bayes’ Theorem should be part
of any normative criteria by which we judge rational political thinking—but only a
part.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
122 Bayesian Updating of Political Beliefs
Appendix H: Proofs of Several Bayesian Updating
Results
Updating in Response to Many Normal Messages is Equivalent
to Updating in Response to a Single, More Precise Normal
Message
Updating sequentially in response to many messages is equivalent to updating once
in response to a single, more precise message. I show this result for normal-normal
updating—first for the case in which all the messages are drawn from the same
distribution, then for the case in which different messages are drawn from distributions
with different variances.
Formally, assume (as in the normal-normal model) that we are trying to learn
the mean of a distribution whose variance is known. Let the prior belief about the
mean be fi ~ N (//0, cr2). A random sample of messages x = ( jt i, . . . , xn) is drawn
independently from a distribution that is presumed to be N(p, cr2). Because Bayes’
Theorem requires that the posterior be proportional to the prior times the likelihood of
the new messages, updating sequentially requires that
p(jj|x) oc prior x £(.xi|//) x • • • x £(jt„|/i)
n
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Appendix 123
The part that does not depend on pi is constant. Absorbing it into the proportionality
constant, we get
p(/i|x) oc prior x expl U x - n Y2 \ cr2/n
The rightmost term is the likelihood of a single observation from a normal distribution
with mean pi and variance <r2/n.
The case of updating with messages from distributions with different variances
is very similar. Suppose that messages xi = (jti,. . . , jc„) have mean x\ and are
presumed to be drawn independently from the N(pi, cr2Xi) distribution, while messages
x2 = (jc„+1, . . . , xN) have mean x2 and are presumed to be drawn independently from
the N(ji, cr2Xi) distribution. By the previous result, this is equivalent to updating on the
basis of just two messages:
p(pi|x i ,x 2) oc prior x exp l / ( * l - / / ) 2 \. 2 \ cr2, /n /
x exp 1 / (*2 ~ AO2 \2 \(r2xJ(N -ri))
Let cr2 = o Xl /n and cr^ = <tX2/(N - n). Then
pip|xi, X2) oc prior x exp
p n o r x exp
1 ( Ui -pi)2 (x2 -pi)2^l ' 2 l crl a i
(xi - pi)2 + (x2 - n)2 cr2u
- p n o r x exp
= p n o r x exp
= p n o r x exp
crtcr221* 2*
' ( jc 2 - 2fix\ + pi2) a \ t + ( jc 2 - 2pix2 + pi2) cr2u
1 [ - 2m*i + x2<r2j + r fv j, + 4 ^ *2 V Cri«Cr2,
1 ( 2 2//(x1cr2t + X2CT2,) tfcr2' + x2a jt \ + cr21 ( 2 2 -
“ 2 r ~ *r*.+ ° i +oi IPs?)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Absorbing the part that does not depend on p. into the constant of proportionality, we get
124 Bayesian Updating of Political Beliefs
pip|xi, X2) oc prior x exp 1 f + X2o j \ 22 ( ^ 0-2jt)/(cr2t + o-jj r °-2u +crl 1
_ _2 _ 9The rightmost term is the kernel of a normal density with mean *' f +x\ “ and
1* 2*<r? 0-*variance° i +oi ‘
Under Kalman Filtering, the Posterior Estimate of the Mean is
a Linear Function of the Prior Belief and Messages Received
Equation 3.3a shows that when a normally distributed prior belief is updated in
response to information that is perceived to be normal, the Bayes estimate of the
posterior mean is a linear combination of the prior estimate of the mean and the new
information. Diaconis and Ylvisaker (1979) show that this is not peculiar to updating
with normal priors and likelihoods: whenever the prior belief has a distribution in the
exponential family and is conjugate to the distribution of the new information, the
Bayes estimate of the posterior mean is a convex combination of the prior estimate
of the mean and the new information. In this appendix, I calculate the weights of the
linear combination for the Kalman filter estimator at of random variable a t, which are
defined on pages 111-113.
The Kalman filter estimator a t can be written as a linear function of do,
or, = cta0 + f(Xt, where c, = flLiO - K-i)y, Kt is the “Kalman gain” defined on page
114, xt is the vector of messages ( j q , . . . , j c , ) , and f t' is a row vector of weights on the
components of xt.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Appendix 125
ai = ya 0 + Ki(xi - yar0)
= (y - K\y)a0 + Kxxi
= Ci&o + fjxi,
with ci = (y - K\y) = ni= 1 (1 ~ Kdy and fi = K\. Assuming the claim is true for t, it is
also true for t + 1:
a t+1 = ya t + Kt+x(xt+] - ya ,)
= (1 - Kt+i)yat + Kt+lxt+l
= (1 - K,+i)y [cfor0 + + Kl+lxt+i
= [(1 - Kt+\)yct\ or0 + (1 - ^ r+1)yft'xt + Kt+1 xl+ [
= ct+lao + fj+1xt+1,
with cr+i = (1 - Kt+i)yc, and ft+i a vector in which the first t elements are given by
(1 - A^+Oyft and in which the (N + l) th element is Kt+l. □
This proof follows the similar proof in Gerber and Green (1998). They are
not identical because (perhaps due to a copyediting error) the earlier works reports
c, = n U l ~ which is not quite right.
K Has a Steady State
According to Gerber and Green (1998), the Kalman weight Kt in Kalman-filter
updating stabilizes at a unique value:
K _ ~[c + (1 - y2)] + V[c + (1 - y2)]2 +
Proof by induction. The claim is true for t = 1:
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
where c = cr 2/cr2.
126 Bayesian Updating of Political Beliefs
Proof. K, = Ptlcr2, so behavior of P, implies behavior of Kt. To find the steady state of
P, (and thus of Kt), we need to find the value for which Pt — Pt+i'
P, = <r2xKt+l
= )Ky2Pt + o l + o j)
P t t f P t + <r2a + o 2x) = P ,y 2crzx + o2
Pf y2 + P.icrl + a j - y W x) - o j o i = 0 .
This is just a quadratic equation, so
~(o-2a + crx - y 2cr2x) ± y j(a2a + a 2x - y 2cr2x)2 - 4 y2(-o-^oj)2 y 2 '
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Appendix 127
The denominator is positive. The numerator must also be positive, then, because P
must be positive.
cr2, cr2, and y2 must all be positive. The numerator will thus only be positive
if it is -(cr2 + cr2 - y2cr2x) + >/(cr2 + cr2 - y2cr2)2 - 4y2(-cr2cr2). In other words, we
replace ± with +:
That is, after some stage t, the newest observation always receives the same weight
(K) when updating. By extension, one’s prior always receives the same weight (1 - K)
when updating.
This result is contrary to the normal-normal model, in which information
received at previous stages is always reflected in the prior. As time passes, the weight
placed on the prior under the normal-normal model always increases, because the
prior always reflects more information than it did in the past. Consequently, the weight
placed on new messages always decreases.
The Kalman filter is more realistic because it implies that very old information
is either forgotten or discounted: when the parameter of interest is changing over time,
a very old message is not as important as a new message, even if both come from
P = ~ ( o j + o~x~ y2° j ) + V ( o j + o j - y2c 2x)2 - 4 y 2( - c r 2cr2)2 y2
K =cr22y2
—[c + (1 - y 2)\ + V(cr2 + cr2 - y2cr2)2 + 4y2cr2cr22 y 2 <t 22 y2
+ 4cy2
y2 - c - 1 + ^ (c - y2 + l)2 + 4cy22y2
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
equally credible sources. That is why the weight placed on the prior when updating
under the Kalman filter model does not always increase over time. See Gerber and
Green (1998) for an extensive discussion of this difference between the normal-normal
and Kalman-filter models.
128 Bibliography
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Bibliography
Abelson, Robert R 1986. “Beliefs Are Like Possessions.” Journal for the Theory o f
Social Behaviour 16 (October): 224-50.
Achen, Christopher H. 1992. “Social Psychology, Demographic Variables, and Linear
Regression: Breaking the Iron Triangle in Voting Research.” Political Behavior 14
(September): 195-211.
Achen, Christopher H. 2002. “Parental Socialization and Rational Party Identification.”
Political Behavior 24 (June): 151 -70.
Achen, Christopher H. 2003. “Toward a New Political Methodology:
Microfoundations and ART.” Annual Review o f Political Science 5 (June): 423-50.
Achen, Christopher H. 2005. “Let’s Put Garbage-Can Regressions and Garbage-Can
Probits Where They Belong.” Conflict Management and Peace Science 22 (Winter):
327-39.
American Presidency Project. 2006 03 01. “Political Party Platforms.”
http://www.presidency.ucsb.edu/platforms.php.
Anderson, Craig A., B. Lynn New, and James R. Speer. 1985. “Argument Availability
as a Mediator of Social Theory Perseverance.” Social Cognition 3 (3): 235-49.
129
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
i 30 Bibliography
Anderson, Craig A., Mark R. Lepper, and Lee Ross. 1980. “The Perseverance of Social
Theories: The Role of Explanation in the Persistence of Discredited Information.”
Journal o f Personality and Social Psychology 39 (December): 1037-49.
Ansolabehere, Stephen, James M. Snyder, Jr., and Charles Stewart, III. 2001.
“Candidate Positioning in U.S. House Elections.” American Journal o f Political
Science 45 (January): 136-59.
Ansolabehere, Stephen, and Shanto Iyengar. 1995. Going Negative: How Political
Advertisements Shrink and Polarize the Electorate. New York: Free Press.
Ansolabehere, Stephen, Shigeo Hirano, James M. Snyder, Jr., and Michiko Ueda.
2006. “Party and Incumbency Cues in Voting: Are They Substitutes?” Quarterly
Journal o f Political Science 1 (March): 119-37.
Barge, Matthew. 9 August 2005. “NARAL Falsely Accuses Supreme Court Nominee
Roberts.” http://www.factcheck.org/article340.html.
Bargh, John A., Shelly Chaiken, Rajen Govender, and Felicia Pratto. 1992. “The
Generality of the Automatic Attitude Activation Effect.” Journal o f Personality and
Social Psychology 62 (June): 893-912.
Bartels, Larry M. 1993. “Messages Received: The Political Impact of Media
Exposure.” American Political Science Review 87 (June): 267-85.
Bartels, Larry M. 2002. “Beyond the Running Tally: Partisan Bias in Political
Perceptions.” Political Behavior 24 (June): 117-50.
Beck, Nathaniel. 1990. “Estimating Dynamic Models Using Kalman Filtering.”
Political Analysis 1: 121-56.
Bernardo, Jose. M., and A. F. M. Smith. 1994. Bayesian Theory. New York: Wiley.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
131
Bochner, Stephen, and Chester A. Insko. 1966. “Communicator Discrepancy, Source
Credibility, and Opinion Change.” Journal o f Personality and Social Psychology 4
(December): 614-21.
Bound, John, David A. Jaeger, and Regina M. Baker. 1995. “Problems with
Instrumental Variables Estimation When the Correlation between the Instruments
and the Endogenous Explanatory Variable Is Weak.” Journal o f the American
Statistical Association 90 (June): 443-50.
Box, George E. P., and George C. Tiao. 1973. Bayesian Inference in Statistical
Analysis. Reading, MA: Addison-Wesley.
Brader, Ted. 2006. Campaigning for Hearts and Minds: How Emotional Appeals in
Political Ads Work. Chicago: University of Chicago Press.
Bullock, John G. 2006. “Partisanship and the Enduring Effects of False Political
Information.” Presented at the Annual Conference of the Midwest Political Science
Association, Chicago.
Callander, Steven, and Simon Wilkie. 2007. “Lies, Damned Lies, and Political
Campaigns.” Games and Economic Behavior.
Campbell, Angus, Philip E. Converse, Warren E. Miller, and Donald E. Stokes. 1960.
The American Voter. Chicago: University of Chicago Press.
Chaiken, Shelly. 1979. “Communicator Physical Attractiveness and Persuasion.”
Journal o f Personality and Social Psychology 37 (August): 1387-97.
Chapman, Loren J., and Jean P. Chapman. 1959. “Atmosphere Effect Re-examined.”
Journal o f Experimental Psychology 58 (September): 220-26.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Chong, Dennis, and James N. Druckman. 2006. “Democratic Competition and Public
Opinion.” Presented at the Annual Meeting of the American Political Science
Association, Philadelphia.
Cohen, Geoffrey L. 2003. “Party Over Policy: The Dominating Impact of Group
Influence on Political Beliefs.” Journal o f Personality and Social Psychology 85
(November): 808-22.
Cohen, Joshua. 1989. “Deliberation and Democratic Legitimacy.” In The Good Polity,
ed. Alan Hamlin and Phillip Pettit. Oxford: Blackwell.
Conover, Pamela Johnston, and Stanley Feldman. 1989. “Candidate Perception in an
Ambiguous World: Campaigns, Cues, and Inference Processes.” American Journal
o f Political Science 33 (November): 912-40.
Davis, Otto A., Melvin J. Hinich, and Peter C. Ordeshook. 1970. “An Expository
Development of a Mathematical Model of the Electoral Process.” American Political
Science Review 64 (June): 426-48.
de Finetti, Bruno. 1974. Theory o f Probability: A Critical Introductory Treatment.
Trans. Antonio Machi and Adrian F. Smith. New York: Wiley.
Delli Carpini, Michael X., and Scott Keeter. 1996. What Americans Know about
Politics and Why It Matters. New Haven, CT: Yale University Press.
Diaconis, Persi, and David Freedman. 1986. “On the Consistency of Bayes Estimates.”
Annals o f Statistics 14 (March): 1-26.
Diaconis, Persi, and Donald Ylvisaker. 1979. “Conjugate Priors for Exponential
Families.” Annals o f Statistics 7 (March): 269-81.
132 Bibliography
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
133
Downs, Anthony. 1957. An Economic Theory o f Democracy. New York:
HarperCollins.
Druckman, James N. 2001. “The Implications of Framing Effects for Citizen
Competence.” Political Behavior 23 (September): 225-56.
Druckman, James N. 2004. “Political Preference Formation: Competition,
Deliberation, and the (Ir)relevance of Framing Effects.” American Political Science
Review 98 (November): 671-86.
Druckman, James N., and Kjersten R. Nelson. 2003. “Framing and Deliberation:
How Citizens’ Conversations Limit Elite Influence.” American Journal o f Political
Science Al (October): 729-45.
Duelfer, Charles. 2004. “Comprehensive Report of the Special Advisor to the
Director of Central Intelligence on Iraq’s Weapons of Mass Destruction.”
https://www.cia.gov/cia/reports/iraq_wmd_2004/.
Epley, Nicholas, and David Dunning. 2000. “Feeling ‘Holier than Thou’: Are
Self-Serving Assessments Produced by Errors in Self- or Social Prediction?”
Journal o f Personality and Social Psychology 79 (December): 861-75.
Fischle, Mark. 2000. “Mass Response to the Lewinsky Scandal: Motivated Reasoning
or Bayesian Updating?” Political Psychology 21 (March): 135-59.
Freedman, David A. 1991. “Statistical Models and Shoe Leather.” Sociological
Methodology 21: 291-313.
Frey, Dieter. 1986. “Recent Research on Selective Exposure to Information.” Advances
in Experimental Social Psychology 19: 41-80.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Gaines, Brian J., Janies H. Kuklinski, Paul J. Quirk, Buddy Peyton, and Jay Verkuilen.
2007. “Same Facts, Different Interpretations: Partisan Motivation and Opinion on
Iraq.” Journal o f Politics.
Gellman, Barton. 2004 January 7. “Iraq’s Arsenal Was Only on Paper; Since Gulf War,
Nonconventional Weapons Never Got Past the Planning Stage.” Washingon Post
p. A l.
Gellman, Barton, and Walter Pincus. 2003 August 10. “Depiction of Threat Outgrew
Supporting Evidence.” Washington Post p. Al.
Gelman, Andrew, John B. Carlin, Hal S. Stem, and Donald B. Rubin. 2004. Bayesian
Data Analysis. 2nd ed. Boca Raton, FL: Chapman & Hall.
Gentzkow, Matthew, and Jesse M. Shapiro. 2006. “What Drives Media Slant?
Evidence from U.S. Daily Newspapers.” University of Chicago. Manuscript.
Gerber, Alan, and Donald Green. 1999. “Misperceptions about Perceptual Bias.”
Annual Review o f Political Science 2: 189-210.
Gerber, Alan, and Donald P. Green. 1998. “Rational Learning and Partisan Attitudes.”
American Journal o f Political Science 42 (July): 794-818.
Gerber, Elisabeth R., and John E. Jackson. 1993. “Endogenous Preferences and the
Study of Institutions.” American Political Science Review 87 (September): 639-56.
Gertz v. Schmidt. 1974. 418 U.S. 323.
Goussev, Valeri. 2004. “Does the Brain Implement the Kalman Filter?” Behavioral
and Brain Sciences 21 (June): 404-05.
134 Bibliography
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
135
Green, Donald P., Alan S. Gerber, and Suzanna L. de Boef. 1999. “Tracking Opinion
over Time: A Method for Reducing Sampling Error.” Public Opinion Quarterly 63
(Summer): 178-92.
Groseclose, Tim. 2001. “A Model of Candidate Location When One Candidate Has a
Valence Advantage.” American Journal o f Political Science 45 (October): 862-86.
Grynaviski, Jeffrey D. 2006. “A Bayesian Learning Model with Applications to Party
Identification.” Journal o f Theoretical Politics 18 (July): 323-46.
Hamm, Robert M. 1993. “Explanations for Common Responses to the Blue/Green Cab
Probabilistic Inference Word Problem.” Psychological Reports 72 (1): 219-42.
Harvey, Andrew C. 1989. Forecasting, Structural Time Series and the Kalman Filter.
New York: Cambridge University Press.
Higgins, E. Tory. 1996. “Knowledge Activation: Accessibility, Applicability, and
Salience.” In Social Psychology: Handbook o f Basic Principles, ed. E. Tory Higgins
and Arie W. Kruglanski. New York: Guilford Press.
Hovland, Carl I., and Muzafer Sherif. 1957. “Assimilation and Contrast Effects in
Reactions to Communication and Attitude Change.” Journal o f Abnormal Social
Psychology 55 (September): 244-52.
Husted, Thomas A., Lawrence W. Kenny, and Rebecca B. Morton. 1995. “Constituent
Errors in Assessing Their Senators.” Public Choice 83 (June): 251-71.
Isikoff, Michael, and John Barry. 2005 May 9. “Gitmo: SouthCom Showdown.”
Newsweekp. 10. http://www.msnbc.msn.com/id/7693014/site/newsweek/.
Iyengar, Shanto, and Nicholas A. Valentino. 2000. “Who Says What? Source
Credibility as a Mediator of Campaign Advertising.” In Elements o f Reason,
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
ed. Arthur Lupia, Mathew D. McCubbins, and Samuel L. Popkin. New York:
Cambridge University Press.
Jackman, Simon. 2005. “Pooling the Polls Over an Election Campaign.” Australian
Journal o f Political Science 40 (December): 499-517.
Jacobson, Gary C. 2006. A Divider, Not a Uniter: George W. Bush and the American
People. Upper Saddle River, NJ: Pearson.
Jamieson, Kathleen Hall, and Paul Waldman. 2003. The Press Effect: Politicians,
Journalists, and the Stories that Shape the Political World. New York: Oxford
University Press.
Jeffrey, Richard. 2004. Subjective Probability: The Real Thing. New York: Cambridge
University Press.
Jessee, Stephen A., and Douglas Rivers. 2007. “Political Judgment Under Uncertainty:
Heuristics and Biases.” Stanford University. Manuscript.
Jones, Edward E., and Richard E. Nisbett. 1972. “The Actor and the Observer:
Divergent Perceptions of the Causes of the Behavior.” In Attribution: Perceiving
the Causes o f Behavior, ed. Edward E. Jones, David E. Kanouse, Harold H. Kelley,
Richard. E. Nisbett, Stuart Valins, and Bernard Weiner. Morristown, NJ: General
Learning Press.
Kalman, Rudolph E. 1960. “A New Approach to Linear Filtering and Prediction
Problems.” Journal o f Basic Engineering, Series D 82: 35-45.
Kalman, Rudolph E., and Richard S. Bucy. 1961. “New Results in Linear Filtering and
Prediction Theory.” Journal o f Basic Engineering, Series D 83 (March): 95-108.
136 Bibliography
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
137
Kam, Cindy D. 2005. “Who Toes the Party Line? Cues, Values, and Individual
Differences.” Political Behavior 27 (June): 163-82.
Katz, Elihu. 1968. “On Reopening the Question of Selectivity in Exposure to Mass
Communication.” In Theories o f Cognitive Consistency: A Sourcebook, ed. Robert P.
Abelson, Elliot Aronson, William J. McGuire, Theodore M. Newcomb, Milton J.
Rosenberg, and Percy H. Tannenbaum. Chicago: Rand McNally.
Keenan, Nancy. 11 August 2005. “Rebuttal to FactCheck.org Analysis of NARAL
Pro-Choice America’s TV Ad.”
http://www.factcheck.org/UploadedFiles/Rebuttal-letter-from-Nancy-Keenan-to-FactCheck.org-8-l l-05.pdf.
Kelley, Harold H. 1973. “The Processes of Causal Attribution.” American Psychologist
28 (February): 107-28.
Kennan, George F. 1997. George F. Kennan and the Origins o f Containment,
1944-1946: The Kennan-Lukacs Correspondence. Columbia, Missouri: University
of Missouri Press.
Kerkhof, Peter. 1999. “Applying the Unimodel to Political Persuasion.” Psychological
Inquiry 10 (2): 137-40.
Kingdon, John W. 1984. Agendas, Alternative, and Public Policies. Boston: Little,
Brown.
Knight, Jack, and James Johnson. 1997. “What Sort of Political Equality Does
Deliberative Democracy Require?” In Deliberative Democracy, ed. James Bohman
and William Rehg. Cambridge: MIT Press.
Koehler, Jonathan J. 1996. “The Base Rate Fallacy Reconsidered: Descriptive,
Normative, and Methodological Challenges.” Behavioral and Brain Sciences 19
(March): 1-53.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Krosnick, Jon A., and Donald R. Kinder. 1990. “Altering the Foundations of Support
for the President through Priming.” American Political Science Review 84 (June):
497-512.
Kuklinski, James H., Paul J. Quirk, Jennifer Jerit, David Schwieder, and Robert F.
Rich. 2000. “Misinformation and the Currency of Democratic Citizenship.” Journal
o f Politics 62 (August): 790-816.
Kunda, Ziva. 1990. “The Case for Motivated Reasoning.” Psychological Bulletin 108
(November): 480-98.
Kurtz, Howard. 16 May 2005. “Newsweek Apologizes: Inaccurate Report on Koran
Led to Riots.” Washington Post p. A01.
Lau, Richard P., and David P. Redlawsk. 2001. “Advantages and Disadvantages of
Heuristics in Political Decision Making.” American Journal o f Political Science 45
(October): 951-71.
Learner, Edward E. 1978. Specification Searches. New York: Wiley.
Learner, Edward E. 1983. “Let’s Take the Con out of Econometrics.” American
Economic Review 73 (1): 31-43.
Lepper, Mark R., Lee Ross, and Richard R. Lau. 1986. “Persistence of Inaccurate
Beliefs about the Self: Perseverance Effects in the Classroom.” Journal o f
Personality and Social Psychology 50 (March): 482-91.
Levendusky, Matthew S. 2006. “Sorting in the U.S. Mass Electorate.” Ph.D. diss.
Stanford University.
Lieb, David A. 2005 April 7. “Missouri Medicaid Cuts Sent to Blunt.” Associated
Press.
138 Bibliography
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
139
Lodge, Milton, and Charles Taber. 2000. “Three Steps toward a Theory of Motivated
Political Reasoning.” In Elements o f Reason, ed. Arthur Lupia, Mathew D.
McCubbins, and Samuel L. Popkin. New York: Cambridge University Press.
Lohmann, Suzanne. 1994. “The Dynamics of Informational Cascades: The Monday
Demonstrations in Leipzig, East Germany, 1989-91.” World Politics 47 (October):
42-101.
Londregan, John B. 2000. Legislative Institutions and Ideology in Chile. New York:
Cambridge University Press.
Lord, Charles. G., Lee Ross, and Mark R. Lepper. 1979. “Biased Assimilation and
Attitude Polarization: The Effects of Prior Theories on Subsequently Considered
Evidence.” Journal o f Personality and Social Psychology 37 (November):
2098-2109.
Luce, R. Duncan. 1995. “Four Tensions Concerning Mathematical Modeling in
Psychology.” Annual Review o f Psychology 46: 1 -26.
Lupia, Arthur. 2002. “Who Can Persuade Whom? Implications from the Nexus of
Psychology and Rational Choice Theory.” In Thinking about Political Psychology,
ed. James H. Kuklinski. New York: Cambridge University Press.
Lupia, Arthur. 2006. “When Can Politicians Scare Citizens Into Supporting Bad
Policies? A Theory of Incentives with Fear-Based Content.” University of Michigan
at Ann Arbor. Typescript.
Lupia, Arthur, and Mathew D. McCubbins. 1998. The Democratic Dilemma: Can
Citizens Learn What They Need to Know? New York: Cambridge University Press.
Lupia, Arthur, Mathew D. McCubbins, and Samuel L. Popkin, eds. 2000. Elements o f
Reason. New York: Cambridge University Press.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Luskin, Robert C., James S. Fishkin, and Roger Jowell. 2002. “Considered Opinions:
Deliberative Polling in Britain.” British Journal o f Political Science 32 (July):
455-87.
McGuire, William J. 1964. “Inducing Resistance to Persuasion: Some Contemporary
Approaches.” Advances in Experimental Social Psychology 1: 191-229.
McGuire, William J. 1969. “The Nature of Attitudes and Attitude Change.” In
Handbook o f Social Psychology, ed. Gardner Lindzey and Elliot Aronson. 2nd ed.
Reading, MA: Addison-Wesley.
Meinhold, Richard J., and Nozer D. Singpurwalla. 1983. “Understanding the Kalman
Filter.” American Statistician 37 (May): 123-27.
Mondak, Jeffery J. 1993a. “Public Opinion and Heuristic Processing of Source Cues.”
Political Behavior 15 (June): 167-92.
Mondak, Jeffery J. 1993b. “Source Cues and Policy Approval: The Cognitive
Dynamics of Public Support for the Reagan Agenda.” American Journal o f Political
Science 37 (February): 186-212.
NARAL Pro-Choice America. 2005. “Speaking Out.” Television advertisement.
http://www.factcheck.org/video/Speaking-Out.wmv.
Neill, Daniel B. 2005. “Cascade Effects in Heterogeneous Populations.” Rationality
and Society 17(2): 191-241.
Newell, Allen, J.C. Shaw, and Herbert A. Simon. 1958. “Chess-Playing Programs
and the Problem of Complexity.” IBM Journal o f Research and Development 2
(October): 320-35.
140 Bibliography
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
141
Nielsen Media Research. 18 August 2005. “Supreme Court Political Ads Virtually
Absent From Television.” Press release.
Ottati, Victor C. 1990. “Determinants of Political Judgments: The Joint Influence of
Normative and Heuristic Rules of Inference.” Political Behavior 12 (June): 159-79.
Page, Benjamin I., and Richard A. Brody. 1972. “Policy Voting and the Electoral
Process: The Vietnam War Issue.” American Political Science Review 66
(September): 979-95.
Petrocik, John R. 1996. “Issue Ownership in Presidential Elections, with a 1980 Case
Study.” American Journal o f Political Science 40 (August): 825-50.
Petty, Richard E., and John T. Cacioppo. 1986. Communication and Persuasion:
Central and Peripheral Routes to Attitude Change. New York: Springer-Verlag.
Phillips, Lawrence D., and Ward Edwards. 1966. “Conservatism in a Simple
Probability Inference Task.” Journal o f Experimental Psychology 72 (September):
346-54.
Poole, Keith T., and Howard L. Rosenthal. 2006. Ideology and Congress. Piscataway,
NJ: Transaction Publishers.
Popkin, Samuel L. 1994. The Reasoning Voter: Communication and Persusasion in
Presidential Campaigns. 2nd ed. Chicago: University of Chicago Press.
Pratt, John W., Howard Raiffa, and Robert Schlaifer. 1964. “The Foundations of
Decision under Uncertainty: An Elementary Exposition.” Journal o f the American
Statistical Association 59 (June): 353-75.
Press, S. James. 2003. Subjective and Objective Bayesian Statistics. 2nd ed. Hoboken,
NJ: Wiley.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Prior, Markus. 2007. Post-Broadcast Democracy. New York: Cambridge University
Press.
Program on International Policy Attitudes. 2003 February 21. “Americans on Iraq
and the U.N. Inspections II.” http://www.pipa.org/OnlineReports/Iraq/IraqUNInsp2_Feb03/
IraqUNInsp2%20Feb03%20rpt.pdf.
Rahn, Wendy M. 1993. “The Role of Partisan Stereotypes in Information Processing
about Political Candidates.” American Journal o f Political Science 37 (May):
472-96.
Rawls, John. 1993. Political Liberalism. New York: Columbia University Press.
Ross, Lee, and Andrew Ward. 1996. “Naive Realism in Everyday Life: Implications
for Social Conflict and Misunderstanding.” In Values and Knowledge, ed. T. Brown,
E. Reed, and E. Turiel. New Jersey: Lawrence Erlbaum.
Ross, Lee, Mark R. Lepper, and Michael Hubbard. 1975. “Perseverance in
Self-Perception and Social Perception: Biased Attributional Processes in the
Debriefing Paradigm.” Journal o f Personality and Social Psychology 32 (May):
880-92.
Savage, Charlie. 2005 June 4. “Pentagon Details Koran Mistreatment.” Boston Globe
p. A l.
Savage, Leonard J. 1954. The Foundations o f Statistics. New York: Wiley.
Savage, Leonard J. 1964. “The Foundations of Statistics Reconsidered.” In Studies in
Subjective Probability, ed. Henry E. Kyburg, Jr. and Howard E. Smokier. New York:
Wiley.
142 Bibliography
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
143
Schaffner, Brian F., and Matthew J. Streb. 2002. “The Partisan Heuristic in
Low-information Elections.” Public Opinion Quarterly 66 (Winter): 559-81.
Schuman, Howard, and Stanley Presser. 1981. Questions and Answers in Attitude
Surveys: Experiments on Question Form, Wording, and Context. San Diego:
Academic Press.
Schumpeter, Joseph A. 1942. Capitalism, Socialism, and Democracy. New York:
Harper & Brothers.
Schwieder, David W., and Paul J. Quirk. 2004. “Missed Cues: Judgment Heuristics and
Citizen Competence.” Presented at the Annual Conference of the Midwest Political
Science Association, Chicago.
Sears, David O., and Jonathan L. Freedman. 1967. “Selective Exposure to Information:
A Critical Review.” Public Opinion Quarterly 31 (Summer): 194-213.
Simon, Herbert A. 1955. “A Behavioral Model of Rational Choice.” Quarterly Journal
o f Economics 69 (February): 99-118.
Simon, Herbert A. 1959. “Theories of Decision-Making in Economics and Behavioral
Science.” American Economic Review 49 (June): 253-83.
Simon, Herbert A., and Allen Newell. 1958. “Heuristic Problem Solving: The Next
Advance in Operations Research.” Operations Research 6 (January-February): 1-10.
Smith, Edward E. 1990. “Categorization.” In Thinking: An Invitation to Cognitive
Science, ed. Daniel N. Osherson and Edward E. Smith. Cambridge, MA: MIT Press.
Smith, Edward E., and Douglas L. Medin. 1981. Categories and Concepts. Cambridge,
MA: Harvard University Press.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Sniderman, Paul M., and Sean M. Theriault. 2004. “The Structure of Political
Argument and the Logic of Issue Framing.” In Studies in Public Opinion: Attitudes,
Nonattitudes, Measurement Error, and Change, ed. Willem E. Saris and Paul M.
Sniderman. Princeton, New Jersey: Princeton University Press.
Steenbergen, Marco R. 2002. “Political Belief Updating: An Experimental
Investigation Within a Bayesian Framework.” Presented at the Annual Scientific
Meeting of the International Society of Political Psychology, Berlin.
Stokes, Donald E. 1963. “Spatial Models of Party Competition.” American Political
Science Review 57 (June): 368-77.
Strasser, Helmut. 1981. “Consistency of Maximum Likelihood and Bayes Estimates.”
Annals o f Statistics 9 (September): 1107-1113.
Taber, Charles S., and Milton Lodge. 2006. “Motivated Skepticism in the Evaluation
of Political Beliefs.” American Journal o f Political Science 50 (July): 755-69.
Tessin, Jeff. 2006. “Cues Given, Cues Received: How Candidates Use Shortcuts When
Voters Need Them Most.” Presented at the Annual Meeting of the Midwest Political
Science Association, Chicago.
Tetlock, Philip E. 2005. Expert Political Judgment. Princeton: Princeton University
Press.
Tversky, Amos, and Daniel Kahneman. 1971. “Belief in the Law of Small Numbers.”
Psychological Bulletin 76 (August): 105-10.
van Houweling, Robert P., and Paul M. Sniderman. 2004. “The Political Logic of a
Downsian Space.” Presented at the Annual Meeting of the Midwest Political Science
Association, Chicago.
144 Bibliography
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
145
Webster, James G. 2005. “Beneath the Veneer of Fragmentation: Television Audience
Polarization in a Multichannel World.” Journal o f Communication 55 (June):
366-82.
White House. 2002 February 14. “President Announces
Clear Skies & Global Climate Change Initiatives.” Speech.
http://www.whitehouse.gov/news/releases/2002/02/20020214-5.html.
Whitney v. California. 1927. 274 U.S. 357.
Williams, Daniel, and Kamran Khan. 2005 May 28. “Muslims Rally Over Koran
Report.” Washington Post p. A 14.
Zaller, John R. 1992. The Nature and Origins o f Mass Opinion. New York: Cambridge
University Press.
Zanna, Mark P., and John K. Rempel. 1988. “Attitudes: A New Look at an Old
Concept.” In The Social Psychology o f Knowledge, ed. Daniel Bar-Tal and Arie W.
Kruglanski. New York: Cambridge University Press.
Zechman, Martin J. 1979. “Dynamic Models of the Voter’s Decision Calculus:
Incorporating Retrospective Considerations into Rational-Choice Models of
Individual Voting Behavior.” Public Choice 34 (September): 297-315.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.