Part 3 Bayesian Updating of Political Beliefs: Normative

Part 3

Bayesian Updating of Political

Beliefs: Normative and Descriptive

Properties

Social scientists increasingly use Bayes’ Theorem as a normative standard of rational

political thinking (Bartels 2002; Gerber and Green 1998, 1999; Tetlock 2005;

Steenbergen 2002) and as a tool to describe how people actually think (Bartels 1993;

Achen 1992, 2002; Husted, Kenny, and Morton 1995; Grynaviski 2006; Lohmann

1994; Neill 2005). Models based on the Theorem are attractive because they offer a

way to account for the weight that people place on old beliefs and new influences when

revising their political ideas. They are also formal models, and as such, they bring the

benefits of mathematical exposition to topics that have usually lacked it (Lupia 2002;

see also Luce 1995). But uncertainty remains about the basis of their normative appeal

and about whether they can accommodate everyday features of political cognition.

This essay clarifies those matters. After explaining the Theorem’s appeal

as a standard of rationality, I show that four important features of political

thought—increased uncertainty in response to surprising information, selective

89

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

perception, attitude polarization, and enduring disagreement—are inconsistent with the

most widely-used Bayesian updating model but quite consistent with other Bayesian

models. Bayes’ Theorem proves capable of capturing many features of real-world

political thinking. But this flexibility is the Theorem’s downside: precisely because it is

consistent with so many different ways of thinking about politics, it is inadequate as a

standard of rationality.

90 Bayesian Updating of Political Beliefs

Bayes’ Theorem and Bayesian Updating Models

Most of the matters that interest political scientists—public opinion toward candidates,

implications of new policies, the probability of terrorist attacks—can be thought of

as probability distributions. Like most distributions, they have means and variances,

and the task that we set for ourselves is to learn about these parameters. A politician’s

ability to manage the economy, for example, may oscillate over time around a fixed

but unknown mean. Learning about politics becomes a matter of learning about

probability distributions—a task to which Bayesian statistics is especially well-suited.

The bedrock of Bayesian statistics is Bayes’ Theorem, an equation that relates

conditional and marginal probabilities:

p(E\S)p(S) p(E\S)p(S)(3 , )

where S and E are events in a sample space and /?(•) is a probability distribution

function. In words, the Theorem indicates that p(S |£), the probability that S occurs

conditional on E having occurred, is a function of the conditional probability p(E\S)

and the marginal probabilities p(E) and p(S). Stated thus, the Theorem is merely an

accounting identity. But a change in terminology draws out its significance. This time,

let S be a statement about politics and p(S ) be a belief about S , i.e., a probability


Introduction 91

distribution indicating someone’s estimate of the extent to which S is true. E is

evidence bearing on the belief. In this version of Bayes’ Theorem, the estimated

probability of S before observing E is given by p(S); it is often called the prior

probability o f S , or simply the “prior.” The estimated probability that S is true after

observing E is p(S |£), often called the posterior probability o f S . And p(E\S) is the

likelihood function that one assigns to the evidence; it reflects a person’s guess about

the probability distribution from which the data are drawn. Understood in this way,

the Theorem tells us how to revise any belief after receiving relevant evidence and

subjectively estimating its likelihood. It is most often applied to beliefs about future

events (Tetlock 2005), but it is fundamentally a tool for calculating probabilities, and

it applies with equal force to all ideas that can be described in probabilistic terms. See

Figure 3.1; for applications of Bayesian models to political attitudes and evaluations,

see Bartels (1993, 2002) and Gerber and Green (1999).1

Bayes’ Theorem is attractive as a normative standard of belief updating because

it can be derived from two fundamental axioms of probability:

1 If it seems confusing to think of attitudes in probabilistic terms, consider that attitudes are merely beliefs that objects are good or bad in some way, often accompanied by affective responses to those objects (Zanna and Rempel 1988; Abelson 1986). If I like John McCain, I have assigned a high probability to the hypothesis that he belongs to a category of objects that I like. And if reviewing new evidence causes me to like McCain less, I assign a lower probability to that hypothesis. This definition of attitude is in keeping with the view that much mental categorization is probabilistic (e.g., Smith and Medin 1981; Smith 1990).

p(5n^) = Jp(£ri5) (3.2a)

(3.2b)



Median Household Income in Am erica

below above

$50,000 $50,000

Probability that Gingrich W ins the Presidency

.4 .6t -.8

Attitude toward John Edwards

Bill Clinton as Steward of the Economy

neither like nor

dislike

like

a littlelike

a lot

Figure 3 .1: Attitudes, Evaluations, and Factual Beliefs Are Probability Distributions. All of political cognition can be conceived in terms of probability distributions, and in doing so we win for political science the vast body of knowledge about subjective probability theory, the chief element of which is Bayes’ Theorem. The upper left-hand panel depicts a factual belief about household income in the U.S.: the person holding this belief estimates that there is a 35% chance that the median household income is below $50,000 and a 65% chance that it is greater than that. This is just a two-category discrete probability distribution. The upper right-hand panel depicts a belief about a future matter, Newt Gingrich’s chance of winning the Presidency in 2008. The belief is a beta distribution: continuous, asymmetric, and bounded between 0 and I (Paolino 2001; Jackman 2008, ch. 2). The lower left-hand panel depicts a positive but somewhat ambivalent attitude about John Edwards: it is a normal distribution. The lower right-hand panel is an ambivalent voter’s evaluation of Bill Clinton as a manager of the national economy—a discrete probability distribution with five categories.


Introduction 93

Equation 3.2a says that the probability that 5 and E both occur equals the probability

that E and 5 both occur. Equation 3.2b is a definition of conditional probability.2

No one consciously rejects either axiom, and following both of them requires that

beliefs be updated according to Bayes’ Theorem: p(S IE) = p(Ep \ and p(E f lS ) =

p(E\S)p(S), so p(S\E) = • By contrast, updating that does not correspond to

Bayes’ Theorem constitutes an implicit rejection of either or both of the axioms.3

Because the denominator of Equation 3.1 only serves to ensure that the

posterior density integrates to one (and is therefore a proper probability density),

Bayes’ Theorem is more commonly expressed as

P(S\E)k P(E\S)P(S).

In words, “the posterior belief is proportional to the prior belief times the likelihood.”

People are Bayesian if their posterior beliefs are determined in this fashion.

Importantly, Bayes’ Theorem says nothing about what one’s priors should be, what

evidence one should use to update, or how one should interpret the evidence that one

does use—a point to which we shall return.

In political science, one Bayesian updating model is far more common than

others: the “normal-normal” model, so-called because it assumes both that people’s

priors are normally distributed and that they perceive new information to be normally

distributed (e.g., Achen 1992; Bartels 1993,2002; Gerber and Green 1999; Husted,

Kenny, and Morton 1995; Gerber and Jackson 1993; Zechman 1979). Suppose that

a voter is trying to learn about p, a politician’s level of honesty. Initially, her belief

2 Although conditional probability is usually presented as an axiom, Bernardo and Smith (1994, ch. 2) show that it can be derived from simpler axioms.

3 This is the simplest valid treatment of a complex topic. For elaborate efforts to root Bayesian statistics in an axiomatic framework, see Savage (1954) and Pratt, Raiffa, and Schlaifer (1964).


about his honesty is normally distributed: pi ~ N (jjq, <Xq). Later, she encounters a

new message, x, that contains information about his level of honesty. She assumes

that the message is a draw from a distribution with a mean equal to the parameter of

interest.4 The normal distribution is usually a sensible assumption: if the message can

theoretically assume any real value, and if error or “noise” is likely to be contributed

to it by many minor causes, the central limit theorem suggests that it is likely to be

normal. We write x ~ N(ji, cr2x). The variance of this distribution, cr2x, captures how

definitive the new information is. If it is communicated directly from a highly credible

source, the signal it sends is clear and the variance is quite small. But if it is merely

a rumor that the voter spots in a tabloid, the signal is only slightly informative and its

variance will be high. Similarly, the variance of a prior or posterior belief is a measure

of the confidence with which it is held: the higher the variance, the less confidence one

places in one’s estimate of //. (See Figure 3.2.)


4 Assuming that the mean of the message distribution is \i is tantamount to assuming that the message comes from an unbiased source. If the voter believes that the message comes from a biased source, she needs to adjust for that bias before updating. This is no obstacle to Bayesian updating (e.g., Jackman 2005), but it is not part of the normal-normal model.


Introduction 95

3 5

strong belief

3 5

weak belief

Figure 3.2: The Variance of a Belief Indicates Its Strength. Both panels depict beliefs that are normal distributions with means of 3. The distribution in the left-hand panel has a variance of .25: the person who holds this belief is quite confident that the parameter of interest is about 3. By contrast, the distribution in the right-hand panel has a variance of 4. The person who holds this belief is not confident that the parameter of interest is close to 3; to him, it could easily be around I or 5 or some value even more distant from 3.

By a common result (e.g., Box and Tiao 1973), a voter with a normal prior

belief who updates according to Bayes’ Theorem in response to x will have posterior

belief fi\x ~ N(jx\,cr\), where

The posterior mean, n\, is a weighted average of the mean of the prior belief and the

new message. The weights are determined by the precisions, i.e., the reciprocals of the

variances of the prior belief and the new message. This is a fantastically convenient

result, as it permits us to compute posterior means without multiplying the prior

probability distribution by the likelihood. And this convenience helps to account for

and (3.3a)

(3.3b)



the popularity of the normal-normal model. But the model has shortcomings that make

it normatively unattractive and descriptively unrealistic:

1. Surprise is impossible. The model implies that people always become more

certain of their beliefs over time.

2. Polarization is unthinkable. Under widely assumed conditions, the model

implies that people who initially disagree are literally incapable of holding

posterior beliefs that are less alike than their priors.

3. Agreement is inevitable. The model implies that people will always disagree less

as they learn more. Furthermore, learning enough will always cause their beliefs

to converge to agreement.5

All of these shortcomings are peculiar to the normal-normal model; they are not

inherent properties of Bayesian updating. In the remainder of this article, I elaborate

each of these shortcomings and define other Bayesian updating models that surmount

them.

5 These two statements are not equivalent: it is possible for an updating model to imply ever-diminishing disagreement without implying eventual agreement. But the normal-normal model implies both.


Bayesians Can Be Surprised 97

The Norm al-Norm al Model Implies that People

Always Become More Certain, But O ther Bayesian

Updating Models Do Not

A desirable property of any learning model is that surprising new evidence can

cause people to become less sure of their beliefs. A distressing property of the

normal-normal model is that surprising new evidence always makes people more

sure of their beliefs. Formally, note that the variance of a posterior belief in the

normal-normal model can be expressed as

people more certain of their belief. Moreover, the extent to which new information

is surprising has no bearing on the extent to which it changes the certainty of one’s

beliefs. This shortcoming of the normal-normal model is often noted (Learner 1978;

Gerber and Green 1998; Bartels 1993, 2002; Grynaviski 2006), even though it has not

dented the model’s popularity. Fortunately, the problem is anything but endemic to

Bayesian updating. It is an artifact of an unrealistic assumption of the normal-normal

model: that we know the variance of the distribution that we are trying to learn about,

and need only estimate its mean. This situation is as rare in politics as it is in any other

domain. A model in which we simultaneously learn about the mean and variance of the

unknown distribution is both more realistic on its face and capable of accommodating

cases in which people become less certain over time.

<7If cr\ is finite, < 1, and therefore cr] < cr : updating will always make0 x


Updating with Unknown Mean and Unknown Variance


Instead of trying to learn only about the mean of a distribution, we are now trying

to learn about both its mean, p, and its variance, cr2. Our prior belief about these

parameters is a bivariate joint distribution, pip, cr2). It can be expressed as the product

of a conditional prior belief about the mean, p(p\(r2), and a marginal prior belief about

the variance, p(cr2). Many different densities might be used to model these priors,

but two of the most obvious choices are a normal distribution for the mean and an

inverse-Gamma distribution for the variance. (The inverse-Gamma distribution is

attractive for modeling variances because it is continuous, flexible, and has a lower

bound of 0 but no upper bound.) Specifically, the normal/inverse-Gamma prior

distribution can be expressed as the product of two densities,

• po is the mean of the prior belief about p

• a21 no is the variance of the prior distribution for p, conditional on cr2. no is

interpretable as “prior sample size”: the smaller it is, the larger the variance

of the prior belief, reflecting the fact that a prior based on less information

(or fewer “prior observations”) is less precise than a prior based on more

information

• v0 > 0 is a prior shape parameter, i.e., a prior “degrees of freedom” parameter

where


• vocr2 is a prior scale parameter, equivalent to the sum of squared residuals

one would obtain from a previously observed dataset of size v0 in which each

observation came from a N(fi0, cr^) dataset.

Bayesians Can Be Surprised 99

The marginal inverse-Gamma prior distribution has a * 2 shape, and indeed, the

inverse-*2 distribution is a special case of the inverse-Gamma distribution. (Gelman

et al. 2004 discuss updating with normal/inverse-*2 priors; see Grynaviski 2006 for an

application.)

Suppose that we encounter messages x = (xi, . . . , jc„) that we believe bear

directly on the parameters about which we are trying to learn; i.e., we believe

that Xi ~ Nip, cr2) V i e (1 , . . . ,«). If our prior belief about n and cr2 is a

normal/inverse-Gamma distribution with parameters //0, n0, v0, and <Xq, our posterior

belief will also be a normal/inverse-Gamma distribution:

filer2,x ~ ) and (3.4a)\ n0 + n)

7 (v\ Vjcr? \cr |x ~ inverse-Gamma I —, I, (3.4b)

where Vi = v0 + n, and vicr2 = v0ct + £ (xt - x)2 + ^ Qi0 - x)2.1=1

Usually, interest focuses on the marginal posterior distribution of fi, which is a t

distribution:

//|x = tVl (//i, -^cr2/ (n0 + n ) j .



Now, the marginal prior distribution of ji is tYQ ^u0, yjo^/noj. If cr2/n0 < cr2/ (n0 + n),

the posterior belief about n has a higher variance than the prior. This occurs when

2 1

no riQ + n

2 ( n 0 + n \ voctq + £ /= i (*« - x)2 + & ( / /0 - * ) 20-51------- 1 <n0 / v0 + n •

( n° + H\ ( yo + n ) ~ v0o^ < V (x, - x)2 + (^0 _ x)2 .\ no j no + n

_2 2 nL'q / 2\ C/- 2 ^— (nv0 + nn0 + n ) + — (n0v0) - v0cr < ) (Xj - x) + ------- (pi0 - x)n0 v ’ n0 n0 + n

cr2 n— [n (v0 + n0 + n)] < V (j:,- - x)2 + —— (/i0 - *)2 . (3.5)no no + n

Equation 3.5 establishes that people who update according to Equations 3.4a and 3.4b

will become less sure if they learn from data that are sufficiently surprising ((Jc - fio)2

is large enough) or sufficiently vague (£"=i ( j c , - x)2 is large enough). Thus, in contrast

to the normal-normal model, a model that presumes that both a mean and a variance

are unknown permits new information to shake the confidence that people repose in

their beliefs.

Bayesian Updating Models Can Accommodate

Biased Interpretation of Political Information

It is common to hear or read that “Bayesian updating requires independence between

priors and new evidence” (Taber and Lodge 2006, 767; see also Ottati 1990, 160;

Fischle 2000). The notion underpinning the claim is, presumably, that Bayes’ Theorem

demands that beliefs correspond to some objective conception of reality. But there


Bayesians Can Be Biased 101

is not even a germ of truth to such claims. Nothing in Bayes’ Theorem—nothing in

the other writing of Reverend Bayes—nothing in the writing of his contemporary,

Laplace—nothing in the whole of Bayesian statistics past or present warrants such

a claim. “Objective perception” of political information may belong in a standard

of rationality—and this is a point to which I shall return—but it has no place in a

framework for belief updating. Indeed, the entire history of Bayesian scholarship

militates against the notion that understanding of probability requires or profits from

objective perception of the world (de Finetti 1974; Savage 1964; Jeffrey 2004).6 Still,

there are some for whom the dream will never die, and Bayesian updating models are

quite able to accommodate them.

Partisanship is often thought to influence political views through selective

perception: the assimilation, from the evidence at hand, of only or chiefly those details

that support one’s prior beliefs (Campbell et al. 1960, esp. Chapter 6; Bartels 2002;

Taber and Lodge 2006; Jacobson 2006; Gaines et al. 2007; see also Bullock 2006).

Nothing in the normal-normal model is incompatible with selective perception; like

Bayes’ Theorem, it makes no prescription about the way in which evidence is to be

interpreted. Still, researchers who have a standard of objective perception may prefer

a model that explicitly distinguishes between selective and objective perception. The

normal-normal model can be adapted to this task.

We begin by distinguishing “good” messages that comport with one’s prior

and “bad” messages that do not. Formally, let the former set of messages be xg =

(x i , . . . , x,) and the latter set be xb = (xt+l, . . . , x T). ~ N (//, cr2) Vi € (1 , . . . , T).

6 O f course, this is not to deny that there are objective probabilities; although many contemporary “subjectivist Bayesians” (e.g., de Finetti, Savage, Diaconis) deny that there are, Bayes and Laplace did not. Note that there is a school of thought that goes by the name “Objective Bayes,” but the “objectivity” that its members favor amounts to the rejection of certain types of prior beliefs as inappropriate—not to the assumption that there are objectively correct likelihoods for data, least of all in the ambiguous world of politics. On both counts, see Press 2003.


xg = xg ~ N (//,cr;/r); Xb = Xb ~ N ( fi ,a 2x/(T - t)). If one’s prior belief is

H ~ N (juo, cr^j, and one updates according to the normal-normal model, one’s posterior

is /i|xg, Xb ~ N (/ii, erf), where


xg = £ |=1 Xj/t, and Xb = EL+i Xi/(T - 0- As each xt is subjectively determined, so

too are xg and xb: different people interpreting the same evidence may come up with

different values of these “sample means.” But if we, as researchers, have a standard

of objective interpretation, we can make this subjective assessment explicit by using

a modified model. The formula for the mean of the posterior is given by Gerber and

Green (1999):

/ l / o l \//1 ^ \ l/cr2 + t/cr2 + (T - t)/(r2J

_ ( t/<r2 \ . _ I (T - t)/cr2x \agXg* \ 1 /cr2 + t/cr2x + ( T - t)/cr2x ) “ *** \ 1 /cr2 + t/cr2 + (T - t) /a 2 ) '

(3.6)

ag and or* are the selection weights', they indicate the extent to which favorable and

unfavorable messages are misinterpreted. xg* and xbt are the objective sample means

of the favorable and unfavorable messages, i.e., the sample means in the eyes of the

researcher. In the absence of selective perception, a g = a b = 1. If higher values

of Xi are preferable, and selective perception consists of giving an unduly favorable

interpretation to bad news, a b will be greater than 1. If selective perception consists of

exaggerating the good news provided by favorable information, ag will be greater than

one.


Convergence and Polarization of Public Opinion 103

A closely related form of political bias is at work when people attribute more or

less credibility to a news source than it deserves: when Communist Party officials exalt

the People s Daily, perhaps; or when Republicans take Rush Limbaugh at face value.

This calls for a slightly different model:

I l / o t \

^ ^ \ I/a* + t/(r2x + (T - t)/o~2/

+ x (________ azfl^x________ \ ( ab( T - t ) / o j \g \ 1 /o% + ocgt/a2 + (T - i ) / a 2) b \ 1 /a 2 + t/cr2x + a b{T - t ) / a 2J ‘

(3.7)

Here, ag and a b apply not to the content of new information but to its credibility. If one

overrates the credibility of favorable messages, ag is greater than 1: it it is as though he

is responding to more favorable messages than he really received. If one underrates the

credibility of unfavorable messages, ab is less than 1: it is as though he is responding

to fewer favorable messages than he has received.

Convergence and Polarization of Public Opinion

under Bayesian Updating

No issue in Bayesian analysis of public opinion is more disputed than the implications

of Bayesian updating for disagreement among people with different prior beliefs.

Gerber and Green (1999, 203-05) maintain that if Republicans and Democrats are

Bayesian, they will agree neither more nor less as they update in response to new

evidence. Empirically, Gerber and Green find just this patterning of presidential

approval over time and adduce it as evidence that many people are Bayesians

whose views are unaffected by partisan bias. Bartels (2002) cites the same public

opinion data as evidence that people are biased Bayesians or not Bayesian at all:



unbiased application of Bayes’ Theorem, he writes, implies convergence of public

opinion. Achen (2005, 334) agrees, and Grynaviski (2006, 331) claims that Bartels

“formally proved” that Bayesian updaters will “inexorably come to see the world

in the same way.” (He didn’t.) A closely related dispute is about polarization of

public opinion: Gerber and Green (1999) write that attitude polarization (Lord,

Ross, and Lepper 1979) is incompatible with Bayesian updating except under very

unusual circumstances, while Steenbergen (2002, 7-8) concludes exactly the opposite.

There is at least a little truth to all of these positions: Bayesian updating does imply

convergence of public opinion under some conditions, but these are more numerous

and more stringent than the discussions to date have acknowledged.

Before embarking on a series of proofs, it will help to distinguish between

three kinds of convergence, all of which are depicted in Figure 3.3. Convergence

to agreement occurs when prior beliefs converge to the same belief after updating.

Convergence to signal occurs when people’s beliefs converge to the mean of the

distribution of messages that they are using to update; if beliefs converge to the

same signal, they also converge to agreement. This kind of convergence has been

widely discussed in Bayesian statistics, where it is subsumed by the broader topic

of consistency of Bayes estimates. In that literature, convergence to signal has

been proved to hold under general conditions for a wide variety of prior and data

distributions (e.g., Diaconis and Freedman 1986; Strasser 1981); I focus here on the

case of normal priors and data because of the ubiquity of these assumptions in political

science. Convergence to truth occurs when beliefs converge to the true parameter of

interest. Often, convergence to signal implies convergence to truth, but not always. If

beliefs are updated in response to messages from a biased news source, convergence to

signal implies that beliefs are not converging to the truth.



E |

% ! 2.1

estimates converge to agreement but not to signal

estimates converge to agreement and signal

but not to truth

time

estimates converge to agreement, signal, and truth

Figure 3.3: Convergence to Agreement, to Signal, and to Truth. The dashed lines depict parameter estimates by two people. They converge in each panel; this is convergence to agreement. The solid black line in each panel represents the mean of the distribution of information that people are using to update their estimates, e.g., the distribution from which news articles are drawn. In the leftmost panel, the parameter estimates do not converge to the mean of this distribution. In the middle and rightmost panels, they do: this is convergence to signal as well as convergence to agreement

The grey line in the last two panels indicates the true parameter value. In the middle panel, it differs from the mean of the information distribution. This occurs whenever the information that people use to update their beliefs is biased on average. In the rightmost panel, the mean of the information distribution is also the true parameter value: here we have convergence to truth as well as to agreement and to signal.

Convergence of Public Opinion Under the Normal-Normal

Model

Proposition 1. A person’s prior belief is/u ~ N(p0, cr2). He updates according to

Equation 3.3a in response to x, a sample of t messages that he perceives to have mean

jc, with

jc, ~ NQj., crl) V i e (1, . . . , t). If t is large enough, his belief will converge to x.

Proof. By a result shown in the appendix, the t messages are equivalent to a

single message Jc from distribution N(ji, cr^/t). Suppose e > 0 and T = . Then°0€t > T implies



M* - *1 =a \x + (a lJ t) ii0 - x(crl + cr2J t)vV/T-2

O’O + ° 1 / '

((T2x/t)(ji0 - X)

+ <r2xlt

(T2x |/io - *1to-2 + 0-2

T5f1o3,

70-2+0-2

-•*1

e)] 1 o*0 6 1 a x

o-\ \Mo -

+ O'2

ecr |/io - x|0-2 (1/iq - *1 - e) + ecr^

= 6. □

Discussion. The proof reveals conditions under which any Bayesian updater’s

belief will converge to a subjectively defined jc. To some (e.g., Grynaviski 2006;

Bartels 2002), it seems a short step to infer that Bayesians who initially disagree will

come to agree with each other. In fact, the conditions set forth in the proof are quite

stringent, and the conditions required for convergence to agreement are more stringent

still.

First among the requirements for convergence to agreement is simply that

people are Bayesian or that they adopt non-Bayesian updating rules that nevertheless

permit convergence. This is a point too often elided by those who take nonconvergence

as proof of partisan bias (e.g., Bartels 2002). There is ample laboratory evidence



that people are not Bayesian updaters, and while most documented non-Bayesian

tendencies do nevertheless permit convergence (Phillips and Edwards 1966; Tversky

and Kahneman 1971), some may not (Chapman and Chapman 1959; Hamm 1993).

The second requirement is that people with different priors perceive the new

information in the same way: technically, they must agree on the value of Jc. This

rules out selective perception (unless people with different prior selectively perceive

evidence in the same way—an unlikely circumstance, given that priors influence

the strength and direction of selective perception). It also rules out cases in which

people are updating evaluations that simply reflect different values. For example, if a

Republican president’s economic policies are consistent with Republican values and

inconsistent with Democratic ones, we should not expect convergence even if members

of both parties are Bayesians who have the same understanding of new economic

information.

A third requirement likely to be violated is that people update exclusively

on the basis of the same set of messages. If they do not—if, for example, i and

j both update on the basis of x, but i also updates on the basis of messages that

seem have a different sample mean—there is no reason to expect updating to cause

agreement. This rules out selective exposure, whereby people with different views

may systematically expose themselves chiefly to congenial news sources (Taber and

Lodge 2006; Gentzkow and Shapiro 2006). It was once argued that selective exposure

is uncommon—especially in politics—because people do not consciously seek to

reinforce their views through their choice of media (Sears and Freedman 1967; Frey

1986). But selective exposure does not require a reinforcement motive (Katz 1968),

and the heightened sorting of the electorate (Levendusky 2006) and splintering of

the market for news into specialized niches (Prior 2007, Chapter 4) may have made it

increasingly common.



A fourth requirement is that the parameter about which people are updating

is constant over time. As I show below, weakening this assumption allows

nonconvergence and polarization when people update their beliefs, even if they are

updating in response to the same information and are interpreting that information in

the same way.

Finally, complete convergence to agreement—the assumption that Bayesians

will “inexorably come to see the world in the same way” (Grynaviski 2006, 331)—can

only occur if updaters are responding to a set of messages so powerful that it causes

them to completely ignore their prior beliefs. (Formally, the precision of the set of

new messages must be infinitely greater than the precision of the prior beliefs.) Even

relative to the other conditions, this is unrealistic. Bartels (1993) argues forcefully that

people’s beliefs about candidates at the start of Presidential campaigns are far stronger

than we usually imagine. And it is widely known that most Americans are exposed

to only meager amounts of political news (Campbell et al. 1960; Zaller 1992; Delli

Carpini and Keeter 1996; Prior 2007). The assumption may hold in the extremely long

run, but by then, as Keynes noted, we’ll be dead.

Almost no interesting political predicaments satisfy all of these conditions.

This suggests that the recent focus on convergence has been misplaced: the failure of

Republicans and Democrats to evaluate the President in the same way or to otherwise

share the same beliefs may be evidence of selective perception, but it may also be due

to any of several other, quite likely factors. On the other hand, convergence becomes

more likely as more of these conditions are met, and it would be quite surprising if

Bayesian updating did not imply convergence of public opinion when all of them are

met. That is why the next two proofs are interesting: they show that convergence may

not occur under Bayesian updating if these conditions are only slightly relaxed.


Nonconvergence and Polarization Under Bayesian Updating

with Selective Perception


Proposition 2: Nonconvergence and Polarization Under Bayesian Updating with

Selective Perception. Let voters i and j have prior belief // ~ N(po, cTq). Both update

in response to x, a set of T messages, under the presumption that jc , ~ N(p, cr2x) V i e

(1 , . . . , T ). As in Equation 3.6, x can be partitioned into xg, a set of t messages that

favor the voters’ prior, and xb, a set of (T - 1) messages that contradict the voters’ prior.

Assume t > 1 and (T - t) > 1. The objective mean of xg is xg* and the objective mean

of Xb is x again, these “objective” means are stipulated by the researcher. Assume

^ Xb*. Voter i updates by Equation 3.6 with selection weights agi and a^,. Voter j

updates by Equation 3.6 with selection weights agj and abj. If agi ± agj or abi a b],

convergence to agreement cannot occur unless

and that will only occur by chance.

Proof. By Appendix A, updating in response to the messages in xg and xb is

equivalent to updating in response to a single draw

_ _ org X g ^ /iT - t ) + abxb*(T2x/t X* (T2x/t + (T2x/(T - t)

agXg(r2x/(T - t ) + a b X b ^ /t ~ cr2xT/n(T - t)

= (agXg^x/iT - 0 + oibXb^Jt) H 2 T ^

tagxgt + (T - t)abXb,= T


from a normal distribution with variance a^/T . By Equation 3.3a, the posterior mean

is

. . I V ° t \ - 1 T/cri \flX + ( l/er j + r / c r j ) '

Assume e > 0. By Proposition 1, \/x\x - x*| < e fo rT > i.e.,//|x will°oeconverge to x,. It only remains to show that jc, takes on different values for i and j

when they have different selection weights:

tagiXg* "H (T t^)(YXf}* t(XgjXg* "I- (T t )(}*bjXfr*


where jc*, is the value of jc, using fs selection weights and xtj is the value of jc* using

/ s selection weights. By assumption, t and T - t are positive, and xgt + xb*, so the

posterior beliefs of i and j will never converge to agreement unless

(agi - a gj) = ( a bi - a bj) . :

Discussion. The proof shows that Bayesians with the same prior beliefs who update

under selective perception will generally have different prior beliefs; it is therefore not

just a nonconvergence result but a polarization result, too. Of course, nonconvergence

is no less likely when i and j have different prior beliefs: it depends entirely on

the difference between jc„• and xtj, and not at all on the difference between prior


beliefs. And even when i and j have different priors, selective perception will lead to

polarization when


Convergence and Polarization under Kalman-Filter Updating

The models considered to this point assume that people are trying to learn about //, a

quality of the political environment that does not change. Achen (1992), for example,

assumes that the Democratic and Republican Parties offer benefits to each voter that

oscillate over time around a mean benefit level that never changes, and he uses the

assumption to justify his use of the normal-normal model to study changes in party

identification. Such fixed-mean assumptions are apt when we are trying to learn about

history and perhaps when we are trying to update our beliefs over the short term.

But they are inappropriate when the parameter of interest changes over time. Pace

Achen, the net benefit that I derive from a party changes as my views or economic

status change. My preferences over policies change as new proposals are placed on the

table or taken off of it. And candidates may improve during their time in office or fall

increasingly under the sway of constituents whose views I oppose. In all of these cases,

the constant-parameter assumption is an approximation at best.

Suppose that a Bayesian is trying to learn about a parameter that changes

according to the rule

a, = ?<*,_! + ea, ea ~ Af(0,o£) (3.8)


where t is the current period, y is a known autoregressive parameter, and ea is a

disturbance term with known, finite, nonzero variance a 2a. Let jcj, . . . , x,_i, xt denote

the observed values of the parameter of interest at times 1, 1, t. The relationship

between x, and a, is

x, = a, + ex, ex ~ N(0, a^) (3.9)

where cr2x is a known, finite, nonzero variance term. (Following Gerber and Green 1998

and Green, Gerber, and de Boef 1999, y, cr2, and cr\ are held constant for simplicity of

exposition. But the model described in this section can easily accommodate the case in

which they change over time. See Meinhold and Singpurwalla 1983 for an example.)

His initial belief is

(ar0 | y, a^, crty ~ N(a0, P0)-

Looking forward to period 1, his prior belief about the parameter of interest is

governed by Equation 3.8:

(ori | y, a l , aj'j ~ N (ya0, + a2a).



But after receiving message x {, he updates. Because both his prior belief and the

perceived likelihood of x Y are normal, he updates according to Equations 3.3a and

3.3b:


(ari | y ,a l , o l ,x i) ~ N(&i,Pi), where

l / ( y 2P(> + O'2)a i = ya 0i / ( y 2P0 + a 2) + i/o-i

l

+ X\l /cr

[ l / t f P o + 0-1)+1/0-1

l /(y 2P0 + o-l)+ l/a-2'

And generally,

( o r , | y , a l , o j , x t _ i , x t ) ~ N ( a t , P t ) , w h e r e

\ / ( y 2 P t- i + a l )at = yat~i

l/(y2Pt-i + o-2) + I /0-2+ x ,

I /0-2

Pt =\ / { y 2 P , - i + a 2a ) + I / 0 - 2 ’

l / ( y 2 P t- i + c r 2 ) + \ / o -2

(3.10)

(3.11)

and where xt_i is the vector of messages jci, . . . , x,_i. Equations 3.10 and 3.11 are

known as the Kalman filter algorithm after Kalman (1960) and Kalman and Bucy

(1961), who show that Equation 3.10 yields the expected value of a t under the

assumption of normal errors. If the normality assumption is relaxed, the Kalman

filter estimator of a, remains best (i.e., least-squares-minimizing) among all linear

estimators. Harvey (1989) and Beck (1990) contain extensive descriptions of

the Kalman filter and its properties. Goussev (2004) argues from a neurological

perspective that the human brain unconsciously uses it to update probabilities.

Meinhold and Singpurwalla (1983) provide a lucid introduction to it from a Bayesian

point of view. Gerber and Green (1998) note that the normal-normal model is a special


case of the Kalman filter model in which y = 1 and cr2a = 0. And given that the

Kalman filter model is a Bayesian updating model, one of its surprising implications

is that people’s beliefs may diverge even if they are updating in response to the same

messages and interpreting those messages in the same way.

Proposition 3A: Polarization Can Occur under Kalman Filtering Even in the

Absence of Selective Perception and Selective Exposure. Assume that a„ (r2a, y, x„

and a i are as described above. Let K, = , D 1/(7t Voter Vs belief about a 0 is

N(&io, Pq). Voter f s belief about «o is N(&jo, Po). They are exposed to one message

at each stage t; the entire set of messages is x,. They update their beliefs according to

Equations 3.10 and 3.11. If a® 4- ajo, their beliefs diverge from time t to time t + 1 if

and only if (1 - K,+l) |y| > 1.

Proof. We begin with a lemma: the Kalman filter estimator a t can be written as

a linear function of or0,

a, = c,ao + ft'xt, (3.12)

where c, = n /= i (1 ~ Kdy, ft' is a row vector, and xt is the column vector of messages

x i , . . . , xt. (See the appendix for a proof.)



Proof by contradiction: assume some t such that beliefs at t + 1 are not more

polarized than beliefs at t even though (1 - Kt+i)\y\ > 1. Note that 0 < l - K , < \ V t .

Then

|a it - a jt| - |q',-,+i - or;,,+i| > 0

=> |c,arl0 + ft'xt - (ctajo + ft'xt)| - |(c +idri0 + ft'+1xt+i) - (cl+la j0 + ft'+1xt+i)| > 0

^ Iq I — Iq +iI > ot t

=> Y \ ( 1 - Ki) Irl - (1 - */+i) Irl n a - Ki) w ^ 0i=l i=l

=>(i-ATf+1) irl < i ,

which is a contradiction. This establishes that divergence occurs between times t and

t + 1 if (1 - Kt+i) |y| > 1. Now assume some t such that beliefs at t + 1 are less alike

than beliefs at t even though (1 - Kt+{) |y| < 1. Then

|a.t - & jt\ ~ |a,-,/+iQ'y>f+i| < 0 t t

=> f [ ( i - Irl - (i - Kt+1) Irl Y \ ( i - Kt) Irl < o1=1 i=i

= > d - ^ +1) l r l> i,

which is a contradiction. This establishes that divergence occurs only if

( l - f f , +1) l r l > l . □

Proposition 3B: Convergence to Agreement Occurs Eventually under the Kalman

Filter. Assume the conditions of Proposition 3 A. At some period t, K, will reach a

steady state. After that point, polarization will not be possible.



Proof. We begin with a lemma: K, will gradually converge to

y2 — c - 1 + y](-y2 + c + l)2 + 4cy2


K =2 y2

where c = cr2/cr2 (Gerber and Green 1998; see the appendix for a proof). By

Proposition 3A, divergence occurs if and only if (1 - Kt+l) \y\ > 1, but this is not

possible when Kt+l = K.

Proof by contradiction. (1 - £ , +1) \y\ > 1 implies |y| > 1, because 0 < K, < 1 V t.

If y > 1, divergence in the steady state implies

y 2 - c - 1 + V ( - 7 2 + c + l)2 + 4cy22y2

y > 1

y2 — c - 1 + -y/(-y2 + c + l)2 + 4cy2= > y ------------------------ ------------------------> 1

2 y

=> - (y2 - c - 1 + yj(-y2 + c + l)2 + 4cy2) > (1 - y)2y

=> c + 1 - V (-y2 + c + l)2 + 4cy2 > 2 y - y2

=> c + 1 - Vr4 + 2cy2 - 2y2 + c2 + 2c + 1 > 2y - y2

^ y 2 - 2 y + c + l > Vr4 + 2cy2 - 2y2 + c2 + 2c + 1

y4 - 4y3 + 2cy2 + 6y2 - 4cy - 4y + c2 + 2c + 1 > y4 + 2cy2 - 2y2 + c2 + 2c + 1

=> —4y3 + 6y2 — 4cy — 4y > —2y2

=> -4 y 3 + 8y2 - 4yc - 4y > 0.

This implies c < - y 2 + 2y -1 . But - y 2 + 2y - 1 < 0 V y, and c must be positive because

it is a ratio of positive variances. Contradiction.



If 7 < “ 1,

y2 - c - 1 + V(-y2 + c + l)2 + 4cy2'2y2

=>2 Irl

=> - Ir2 - c — l + ^ ( - r 2 + c + i)2 + 4cr2) > (l - lrl)2 Irl

=> - (r2 - c - 1 + -\J(-y2 + c + l)2 + 4cy2 j > 2 Irl - 2r2

=> r 2 - 2 Irl + c + 1 > ^ r 4 + 2cy2 - 2r2 + c2 + 2c + 1

=> 74 - 4 Irl3 + 6r2 + 2r2c - 4\y\ - 4\y\c + c2 + 2c + I > y4 + 2c/2 - 2r2 + c2 + 2c + 1

-4 Irl3 + 8-y2 - 4 Irl c - 4 Irl > o.

This implies c < - y 2 - 2 y - 1. But - y 2 - 2y - 1 < 0 V y, and c must be positive

Discussion. The proof of Proposition 3 A shows that polarization can happen

even in the absence of selective exposure or perception: it occurs when updating

causes people who disagree to weight their prior beliefs more heavily than they did

before. And this occurs when (1 - Kt+\) Irl > 1. Because K, lies between 0 and 1 for all

values of t, (1 - Kt+i) must also lie between 0 and 1, and Irl > 1 is therefore a necessary

condition for polarization. It may seem, moreover, that bigger values of Irl produce

more polarization. But this is not generally so, because y enters into the definition of

Kt, too, and the algorithm that determines K, quickly “catches up” to offset the size of

Irl in (1 - Kt) |rl- Proposition 3B shows that the algorithm catches up completely when

K, reaches its steady state K: it is impossible for (1 - K) Irl to be greater than 1.

because it is a ratio of variances. Contradiction. □



time

c o n v e r g e n c e s l ig h t p o l a r i z a t i o n

th e n c o n v e r g e n c ee x t e n d e d p o la r iz a t io n

t h e n c o n v e r g e n c e

Figure 3.4: Convergence and Polarization under Kalman Filter Updating. Allpanels depict simulated belief updating by two hypothetical voters when y = I . I , ao = I , the voter represented by the solid line has prior belief N(l,l), and the voter represented by the dashed line has prior belief N(—1,1). The scale of the y axes differs across panels to make differences between the voters apparent.

In the leftmost panel, cr = .7 and cr2 = 4. We see the intuitive pattern of belief updating under Bayes’ Theorem: subjects who receive the same messages and interpret them in the same way draw closer to agreement every time they update. The convergence is represented by the vertical distance between the solid and dashed line, which diminishes in each period.

The middle panel depicts updating when cr2 = .5 and cr2 = 25. Even though voters are receiving the same messages and interpreting them in the same way, their beliefs diverge (slightly) from period 0 to I and again from period I to period 2. Not until period 4 is the distance between their beliefs smaller than the distance between their priors. This pattern is exacerbated in the rightmost panel, where <x2 - .19 and cr2 = 100. Here, voters’ views diverge continuously in each of the first eleven periods; not until the 22nd period (unshown) is the distance between their beliefs smaller than the distance between their priors.

In practice, polarization is sustained longest when |y| is barely greater than 1

and when cr2 is far smaller than cr2. (See Figure 3.4.) The former condition occurs in

politics whenever a parameter of interest is trending slowly away from zero.7 Almost

all political parameters of interest trend away from zero sometimes, and some (e.g.,

population, racial tolerance, per capita income in developed countries) seem to trend

slowly away from zero for very long periods of time. The second condition, cr2

exists whenever the true variation in a parameter is slight but the quality of the

7 For parameters that have no natural scale, zero is arbitrarily defined. For example, in the case of the President’s honesty, zero might correspond to a survey answer of “neither honest nor dishonest.”


Convergence and Polarization of Public Opinion

messages that we receive about it is poor. This is likely when secretive or totalitarian

states exert control over the press. Schumpeter (1942) argues that it is also a condition

endemic to democratic politics: although there are knowable political truths, the

feedback that we receive about our political decisions is so poor as to verge on useless.

Presume, for example, that we want to know whether the President is a good steward of

the economy. How are we to know? If we think that we have observed improvement in

the economy, it might be attributed to the President’s policies, or the lagged effects of

his predecessor’s policies, or the actions of incumbents at other levels of government,

or the lagged effects of their predecessors, or the Federal Reserve, or wholly apolitical

factors. It is rarely easy to tell who is responsible. And this elides the difficulty of

simply knowing when the economy has improved. Even a free and robust media will

not be of much help in these cases: no matter how precise the signals they send about

the state of the economy, signals about who deserves credit and blame are unavoidably

vague.8

Bartels (2002, 123) tells us that “it is failure to converge that requires

explanation within the Bayesian framework,” and he fingers selective perception

by Democrats and Republicans as the culprit in their failure to agree on a host of actual

matters. He may be right, but the case is far from cinched, and his characterization

of Bayesian updating is too strong. Bayesian updating models imply only partial

convergence of only some beliefs, and only under fantastically rare conditions do they

imply that people who disagree will “inexorably come to see the world in the same

way.” The logic of Bayesian updating alone provides no reasonable expectation of

8 Indeed, o~ <k cr^ in the case of attributions of praise and blame is the root of Schumpeter’s (1942, 262) contention that “the typical citizen drops down to a lower level of mental performance as soon as he enters the political field.” The problem is not that political man is stupid but that causality in politics is so complex and feedback about political decisions is so poor. Man’s “lower level of mental performance” is due to his failure to sufficiently adjust for the poor quality of information at his disposal.


convergence—not during a campaign, not even over a lifetime. Of course, agreement

may occur. But enduring disagreement is no proof that people are not Bayesian. Still

less is it proof of partisan bias.

Discussion: Bayes’ Theorem as a Normative

Standard of Belief Updating

As a way of describing how citizens actually update their beliefs, Bayesian updating

models have come in for a wealth of criticism. Much laboratory research shows

that people do not update as Bayesians (Phillips and Edwards 1966; Tversky and

Kahneman 1971; though see Koehler 1996). And some political scientists believe

that the Theorem cannot accommodate ordinary features of public opinion about

politics (Taber and Lodge 2006; Fischle 2000). This essay shows, to the contrary,

that important features of public opinion about politics are readily accommodated by

Bayesian updating. Political events are often surprising, and Bayesian updating can

reflect that surprise by causing people to hold their views less confidently. Partisan

bias affects people’s views through selective perception or misjudgments of the

credibility of media outlets, and Bayesian updating can easily accommodate this.

Most importantly, people who disagree about politics are rarely moved to agreement

by even a flood of evidence. Exposure to the evidence may even cause their views

to draw further apart. Contrary to what some have written, Bayes’ Theorem can

easily accommodate enduring disagreement and polarization, too. None of this

means that Bayesian updating models perfectly capture every facet of political

decision-making—no models do—but it does mean that they are better than many

suppose.



Discussion 121

But it is precisely this flexibility that makes Bayesian updating inadequate

as a standard of rational thinking about politics. One need only consider all that it

permits. There is no limit to the amount of evidence that a Bayesian may ignore or

misconstrue, for Bayes’ Theorem applies only to updating, not to the collection or

interpretation of evidence. And even in the absence of perceptual biases—even if

people are interpreting and using new information in exactly the same way—Bayes’

Theorem permits their views to polarize. No standard of rational belief revision should

be so permissive.

The problem is not that the Theorem is irrational; as we have seen, it entails

abiding by axioms of probability so fundamental that they should be a component of

any standard of rational thinking. The problem is that the Theorem is not restrictive

enough: it permits what we should reject as irrational. Political scientists interested

in constructing standards of rational political thinking will do well to couple it with

restrictions on the interpretation of political evidence. Bayes’ Theorem should be part

of any normative criteria by which we judge rational political thinking—but only a

part.



Appendix H: Proofs of Several Bayesian Updating

Results

Updating in Response to Many Normal Messages is Equivalent

to Updating in Response to a Single, More Precise Normal

Message

Updating sequentially in response to many messages is equivalent to updating once

in response to a single, more precise message. I show this result for normal-normal

updating—first for the case in which all the messages are drawn from the same

distribution, then for the case in which different messages are drawn from distributions

with different variances.

Formally, assume (as in the normal-normal model) that we are trying to learn

the mean of a distribution whose variance is known. Let the prior belief about the

mean be fi ~ N (//0, cr2). A random sample of messages x = ( jt i, . . . , xn) is drawn

independently from a distribution that is presumed to be N(p, cr2). Because Bayes’

Theorem requires that the posterior be proportional to the prior times the likelihood of

the new messages, updating sequentially requires that

p(jj|x) oc prior x £(.xi|//) x • • • x £(jt„|/i)

n


Appendix 123

The part that does not depend on pi is constant. Absorbing it into the proportionality

constant, we get

p(/i|x) oc prior x expl U x - n Y2 \ cr2/n

The rightmost term is the likelihood of a single observation from a normal distribution

with mean pi and variance <r2/n.

The case of updating with messages from distributions with different variances

is very similar. Suppose that messages xi = (jti,. . . , jc„) have mean x\ and are

presumed to be drawn independently from the N(pi, cr2Xi) distribution, while messages

x2 = (jc„+1, . . . , xN) have mean x2 and are presumed to be drawn independently from

the N(ji, cr2Xi) distribution. By the previous result, this is equivalent to updating on the

basis of just two messages:

p(pi|x i ,x 2) oc prior x exp l / ( * l - / / ) 2 \. 2 \ cr2, /n /

x exp 1 / (*2 ~ AO2 \2 \(r2xJ(N -ri))

Let cr2 = o Xl /n and cr^ = <tX2/(N - n). Then

pip|xi, X2) oc prior x exp

p n o r x exp

1 ( Ui -pi)2 (x2 -pi)2^l ' 2 l crl a i

(xi - pi)2 + (x2 - n)2 cr2u

- p n o r x exp

= p n o r x exp

= p n o r x exp

crtcr221* 2*

' ( jc 2 - 2fix\ + pi2) a \ t + ( jc 2 - 2pix2 + pi2) cr2u

1 [ - 2m*i + x2<r2j + r fv j, + 4 ^ *2 V Cri«Cr2,

1 ( 2 2//(x1cr2t + X2CT2,) tfcr2' + x2a jt \ + cr21 ( 2 2 -

“ 2 r ~ *r*.+ ° i +oi IPs?)


Absorbing the part that does not depend on p. into the constant of proportionality, we get


pip|xi, X2) oc prior x exp 1 f + X2o j \ 22 ( ^ 0-2jt)/(cr2t + o-jj r °-2u +crl 1

_ _2 _ 9The rightmost term is the kernel of a normal density with mean *' f +x\ “ and

1* 2*<r? 0-*variance° i +oi ‘

Under Kalman Filtering, the Posterior Estimate of the Mean is

a Linear Function of the Prior Belief and Messages Received

Equation 3.3a shows that when a normally distributed prior belief is updated in

response to information that is perceived to be normal, the Bayes estimate of the

posterior mean is a linear combination of the prior estimate of the mean and the new

information. Diaconis and Ylvisaker (1979) show that this is not peculiar to updating

with normal priors and likelihoods: whenever the prior belief has a distribution in the

exponential family and is conjugate to the distribution of the new information, the

Bayes estimate of the posterior mean is a convex combination of the prior estimate

of the mean and the new information. In this appendix, I calculate the weights of the

linear combination for the Kalman filter estimator at of random variable a t, which are

defined on pages 111-113.

The Kalman filter estimator a t can be written as a linear function of do,

or, = cta0 + f(Xt, where c, = flLiO - K-i)y, Kt is the “Kalman gain” defined on page

114, xt is the vector of messages ( j q , . . . , j c , ) , and f t' is a row vector of weights on the

components of xt.


Appendix 125

ai = ya 0 + Ki(xi - yar0)

= (y - K\y)a0 + Kxxi

= Ci&o + fjxi,

with ci = (y - K\y) = ni= 1 (1 ~ Kdy and fi = K\. Assuming the claim is true for t, it is

also true for t + 1:

a t+1 = ya t + Kt+x(xt+] - ya ,)

= (1 - Kt+i)yat + Kt+lxt+l

= (1 - K,+i)y [cfor0 + + Kl+lxt+i

= [(1 - Kt+\)yct\ or0 + (1 - ^ r+1)yft'xt + Kt+1 xl+ [

= ct+lao + fj+1xt+1,

with cr+i = (1 - Kt+i)yc, and ft+i a vector in which the first t elements are given by

(1 - A^+Oyft and in which the (N + l) th element is Kt+l. □

This proof follows the similar proof in Gerber and Green (1998). They are

not identical because (perhaps due to a copyediting error) the earlier works reports

c, = n U l ~ which is not quite right.

K Has a Steady State

According to Gerber and Green (1998), the Kalman weight Kt in Kalman-filter

updating stabilizes at a unique value:

K _ ~[c + (1 - y2)] + V[c + (1 - y2)]2 +

Proof by induction. The claim is true for t = 1:


where c = cr 2/cr2.


Proof. K, = Ptlcr2, so behavior of P, implies behavior of Kt. To find the steady state of

P, (and thus of Kt), we need to find the value for which Pt — Pt+i'

P, = <r2xKt+l

= )Ky2Pt + o l + o j)

P t t f P t + <r2a + o 2x) = P ,y 2crzx + o2

Pf y2 + P.icrl + a j - y W x) - o j o i = 0 .

This is just a quadratic equation, so

~(o-2a + crx - y 2cr2x) ± y j(a2a + a 2x - y 2cr2x)2 - 4 y2(-o-^oj)2 y 2 '


Appendix 127

The denominator is positive. The numerator must also be positive, then, because P

must be positive.

cr2, cr2, and y2 must all be positive. The numerator will thus only be positive

if it is -(cr2 + cr2 - y2cr2x) + >/(cr2 + cr2 - y2cr2)2 - 4y2(-cr2cr2). In other words, we

replace ± with +:

That is, after some stage t, the newest observation always receives the same weight

(K) when updating. By extension, one’s prior always receives the same weight (1 - K)

when updating.

This result is contrary to the normal-normal model, in which information

received at previous stages is always reflected in the prior. As time passes, the weight

placed on the prior under the normal-normal model always increases, because the

prior always reflects more information than it did in the past. Consequently, the weight

placed on new messages always decreases.

The Kalman filter is more realistic because it implies that very old information

is either forgotten or discounted: when the parameter of interest is changing over time,

a very old message is not as important as a new message, even if both come from

P = ~ ( o j + o~x~ y2° j ) + V ( o j + o j - y2c 2x)2 - 4 y 2( - c r 2cr2)2 y2

K =cr22y2

—[c + (1 - y 2)\ + V(cr2 + cr2 - y2cr2)2 + 4y2cr2cr22 y 2 <t 22 y2

+ 4cy2

y2 - c - 1 + ^ (c - y2 + l)2 + 4cy22y2


equally credible sources. That is why the weight placed on the prior when updating

under the Kalman filter model does not always increase over time. See Gerber and

Green (1998) for an extensive discussion of this difference between the normal-normal

and Kalman-filter models.

128 Bibliography


Bibliography

Abelson, Robert R 1986. “Beliefs Are Like Possessions.” Journal for the Theory o f

Social Behaviour 16 (October): 224-50.

Achen, Christopher H. 1992. “Social Psychology, Demographic Variables, and Linear

Regression: Breaking the Iron Triangle in Voting Research.” Political Behavior 14

(September): 195-211.

Achen, Christopher H. 2002. “Parental Socialization and Rational Party Identification.”

Political Behavior 24 (June): 151 -70.

Achen, Christopher H. 2003. “Toward a New Political Methodology:

Microfoundations and ART.” Annual Review o f Political Science 5 (June): 423-50.

Achen, Christopher H. 2005. “Let’s Put Garbage-Can Regressions and Garbage-Can

Probits Where They Belong.” Conflict Management and Peace Science 22 (Winter):

327-39.

American Presidency Project. 2006 03 01. “Political Party Platforms.”

http://www.presidency.ucsb.edu/platforms.php.

Anderson, Craig A., B. Lynn New, and James R. Speer. 1985. “Argument Availability

as a Mediator of Social Theory Perseverance.” Social Cognition 3 (3): 235-49.

129


http://www.presidency.ucsb.edu/platforms.php

i 30 Bibliography

Anderson, Craig A., Mark R. Lepper, and Lee Ross. 1980. “The Perseverance of Social

Theories: The Role of Explanation in the Persistence of Discredited Information.”

Journal o f Personality and Social Psychology 39 (December): 1037-49.

Ansolabehere, Stephen, James M. Snyder, Jr., and Charles Stewart, III. 2001.

“Candidate Positioning in U.S. House Elections.” American Journal o f Political

Science 45 (January): 136-59.

Ansolabehere, Stephen, and Shanto Iyengar. 1995. Going Negative: How Political

Advertisements Shrink and Polarize the Electorate. New York: Free Press.

Ansolabehere, Stephen, Shigeo Hirano, James M. Snyder, Jr., and Michiko Ueda.

2006. “Party and Incumbency Cues in Voting: Are They Substitutes?” Quarterly

Journal o f Political Science 1 (March): 119-37.

Barge, Matthew. 9 August 2005. “NARAL Falsely Accuses Supreme Court Nominee

Roberts.” http://www.factcheck.org/article340.html.

Bargh, John A., Shelly Chaiken, Rajen Govender, and Felicia Pratto. 1992. “The

Generality of the Automatic Attitude Activation Effect.” Journal o f Personality and

Social Psychology 62 (June): 893-912.

Bartels, Larry M. 1993. “Messages Received: The Political Impact of Media

Exposure.” American Political Science Review 87 (June): 267-85.

Bartels, Larry M. 2002. “Beyond the Running Tally: Partisan Bias in Political

Perceptions.” Political Behavior 24 (June): 117-50.

Beck, Nathaniel. 1990. “Estimating Dynamic Models Using Kalman Filtering.”

Political Analysis 1: 121-56.

Bernardo, Jose. M., and A. F. M. Smith. 1994. Bayesian Theory. New York: Wiley.


http://www.factcheck.org/article340.html

131

Bochner, Stephen, and Chester A. Insko. 1966. “Communicator Discrepancy, Source

Credibility, and Opinion Change.” Journal o f Personality and Social Psychology 4

(December): 614-21.

Bound, John, David A. Jaeger, and Regina M. Baker. 1995. “Problems with

Instrumental Variables Estimation When the Correlation between the Instruments

and the Endogenous Explanatory Variable Is Weak.” Journal o f the American

Statistical Association 90 (June): 443-50.

Box, George E. P., and George C. Tiao. 1973. Bayesian Inference in Statistical

Analysis. Reading, MA: Addison-Wesley.

Brader, Ted. 2006. Campaigning for Hearts and Minds: How Emotional Appeals in

Political Ads Work. Chicago: University of Chicago Press.

Bullock, John G. 2006. “Partisanship and the Enduring Effects of False Political

Information.” Presented at the Annual Conference of the Midwest Political Science

Association, Chicago.

Callander, Steven, and Simon Wilkie. 2007. “Lies, Damned Lies, and Political

Campaigns.” Games and Economic Behavior.

Campbell, Angus, Philip E. Converse, Warren E. Miller, and Donald E. Stokes. 1960.

The American Voter. Chicago: University of Chicago Press.

Chaiken, Shelly. 1979. “Communicator Physical Attractiveness and Persuasion.”

Journal o f Personality and Social Psychology 37 (August): 1387-97.

Chapman, Loren J., and Jean P. Chapman. 1959. “Atmosphere Effect Re-examined.”

Journal o f Experimental Psychology 58 (September): 220-26.


Chong, Dennis, and James N. Druckman. 2006. “Democratic Competition and Public

Opinion.” Presented at the Annual Meeting of the American Political Science

Association, Philadelphia.

Cohen, Geoffrey L. 2003. “Party Over Policy: The Dominating Impact of Group

Influence on Political Beliefs.” Journal o f Personality and Social Psychology 85

(November): 808-22.

Cohen, Joshua. 1989. “Deliberation and Democratic Legitimacy.” In The Good Polity,

ed. Alan Hamlin and Phillip Pettit. Oxford: Blackwell.

Conover, Pamela Johnston, and Stanley Feldman. 1989. “Candidate Perception in an

Ambiguous World: Campaigns, Cues, and Inference Processes.” American Journal

o f Political Science 33 (November): 912-40.

Davis, Otto A., Melvin J. Hinich, and Peter C. Ordeshook. 1970. “An Expository

Development of a Mathematical Model of the Electoral Process.” American Political

Science Review 64 (June): 426-48.

de Finetti, Bruno. 1974. Theory o f Probability: A Critical Introductory Treatment.

Trans. Antonio Machi and Adrian F. Smith. New York: Wiley.

Delli Carpini, Michael X., and Scott Keeter. 1996. What Americans Know about

Politics and Why It Matters. New Haven, CT: Yale University Press.

Diaconis, Persi, and David Freedman. 1986. “On the Consistency of Bayes Estimates.”

Annals o f Statistics 14 (March): 1-26.

Diaconis, Persi, and Donald Ylvisaker. 1979. “Conjugate Priors for Exponential

Families.” Annals o f Statistics 7 (March): 269-81.

132 Bibliography


133

Downs, Anthony. 1957. An Economic Theory o f Democracy. New York:

HarperCollins.

Druckman, James N. 2001. “The Implications of Framing Effects for Citizen

Competence.” Political Behavior 23 (September): 225-56.

Druckman, James N. 2004. “Political Preference Formation: Competition,

Deliberation, and the (Ir)relevance of Framing Effects.” American Political Science

Review 98 (November): 671-86.

Druckman, James N., and Kjersten R. Nelson. 2003. “Framing and Deliberation:

How Citizens’ Conversations Limit Elite Influence.” American Journal o f Political

Science Al (October): 729-45.

Duelfer, Charles. 2004. “Comprehensive Report of the Special Advisor to the

Director of Central Intelligence on Iraq’s Weapons of Mass Destruction.”

https://www.cia.gov/cia/reports/iraq_wmd_2004/.

Epley, Nicholas, and David Dunning. 2000. “Feeling ‘Holier than Thou’: Are

Self-Serving Assessments Produced by Errors in Self- or Social Prediction?”

Journal o f Personality and Social Psychology 79 (December): 861-75.

Fischle, Mark. 2000. “Mass Response to the Lewinsky Scandal: Motivated Reasoning

or Bayesian Updating?” Political Psychology 21 (March): 135-59.

Freedman, David A. 1991. “Statistical Models and Shoe Leather.” Sociological

Methodology 21: 291-313.

Frey, Dieter. 1986. “Recent Research on Selective Exposure to Information.” Advances

in Experimental Social Psychology 19: 41-80.


https://www.cia.gov/cia/reports/iraq_wmd_2004/

Gaines, Brian J., Janies H. Kuklinski, Paul J. Quirk, Buddy Peyton, and Jay Verkuilen.

2007. “Same Facts, Different Interpretations: Partisan Motivation and Opinion on

Iraq.” Journal o f Politics.

Gellman, Barton. 2004 January 7. “Iraq’s Arsenal Was Only on Paper; Since Gulf War,

Nonconventional Weapons Never Got Past the Planning Stage.” Washingon Post

p. A l.

Gellman, Barton, and Walter Pincus. 2003 August 10. “Depiction of Threat Outgrew

Supporting Evidence.” Washington Post p. Al.

Gelman, Andrew, John B. Carlin, Hal S. Stem, and Donald B. Rubin. 2004. Bayesian

Data Analysis. 2nd ed. Boca Raton, FL: Chapman & Hall.

Gentzkow, Matthew, and Jesse M. Shapiro. 2006. “What Drives Media Slant?

Evidence from U.S. Daily Newspapers.” University of Chicago. Manuscript.

Gerber, Alan, and Donald Green. 1999. “Misperceptions about Perceptual Bias.”

Annual Review o f Political Science 2: 189-210.

Gerber, Alan, and Donald P. Green. 1998. “Rational Learning and Partisan Attitudes.”

American Journal o f Political Science 42 (July): 794-818.

Gerber, Elisabeth R., and John E. Jackson. 1993. “Endogenous Preferences and the

Study of Institutions.” American Political Science Review 87 (September): 639-56.

Gertz v. Schmidt. 1974. 418 U.S. 323.

Goussev, Valeri. 2004. “Does the Brain Implement the Kalman Filter?” Behavioral

and Brain Sciences 21 (June): 404-05.

134 Bibliography


135

Green, Donald P., Alan S. Gerber, and Suzanna L. de Boef. 1999. “Tracking Opinion

over Time: A Method for Reducing Sampling Error.” Public Opinion Quarterly 63

(Summer): 178-92.

Groseclose, Tim. 2001. “A Model of Candidate Location When One Candidate Has a

Valence Advantage.” American Journal o f Political Science 45 (October): 862-86.

Grynaviski, Jeffrey D. 2006. “A Bayesian Learning Model with Applications to Party

Identification.” Journal o f Theoretical Politics 18 (July): 323-46.

Hamm, Robert M. 1993. “Explanations for Common Responses to the Blue/Green Cab

Probabilistic Inference Word Problem.” Psychological Reports 72 (1): 219-42.

Harvey, Andrew C. 1989. Forecasting, Structural Time Series and the Kalman Filter.

New York: Cambridge University Press.

Higgins, E. Tory. 1996. “Knowledge Activation: Accessibility, Applicability, and

Salience.” In Social Psychology: Handbook o f Basic Principles, ed. E. Tory Higgins

and Arie W. Kruglanski. New York: Guilford Press.

Hovland, Carl I., and Muzafer Sherif. 1957. “Assimilation and Contrast Effects in

Reactions to Communication and Attitude Change.” Journal o f Abnormal Social

Psychology 55 (September): 244-52.

Husted, Thomas A., Lawrence W. Kenny, and Rebecca B. Morton. 1995. “Constituent

Errors in Assessing Their Senators.” Public Choice 83 (June): 251-71.

Isikoff, Michael, and John Barry. 2005 May 9. “Gitmo: SouthCom Showdown.”

Newsweekp. 10. http://www.msnbc.msn.com/id/7693014/site/newsweek/.

Iyengar, Shanto, and Nicholas A. Valentino. 2000. “Who Says What? Source

Credibility as a Mediator of Campaign Advertising.” In Elements o f Reason,


http://www.msnbc.msn.com/id/7693014/site/newsweek/

ed. Arthur Lupia, Mathew D. McCubbins, and Samuel L. Popkin. New York:

Cambridge University Press.

Jackman, Simon. 2005. “Pooling the Polls Over an Election Campaign.” Australian

Journal o f Political Science 40 (December): 499-517.

Jacobson, Gary C. 2006. A Divider, Not a Uniter: George W. Bush and the American

People. Upper Saddle River, NJ: Pearson.

Jamieson, Kathleen Hall, and Paul Waldman. 2003. The Press Effect: Politicians,

Journalists, and the Stories that Shape the Political World. New York: Oxford

University Press.

Jeffrey, Richard. 2004. Subjective Probability: The Real Thing. New York: Cambridge

University Press.

Jessee, Stephen A., and Douglas Rivers. 2007. “Political Judgment Under Uncertainty:

Heuristics and Biases.” Stanford University. Manuscript.

Jones, Edward E., and Richard E. Nisbett. 1972. “The Actor and the Observer:

Divergent Perceptions of the Causes of the Behavior.” In Attribution: Perceiving

the Causes o f Behavior, ed. Edward E. Jones, David E. Kanouse, Harold H. Kelley,

Richard. E. Nisbett, Stuart Valins, and Bernard Weiner. Morristown, NJ: General

Learning Press.

Kalman, Rudolph E. 1960. “A New Approach to Linear Filtering and Prediction

Problems.” Journal o f Basic Engineering, Series D 82: 35-45.

Kalman, Rudolph E., and Richard S. Bucy. 1961. “New Results in Linear Filtering and

Prediction Theory.” Journal o f Basic Engineering, Series D 83 (March): 95-108.

136 Bibliography


137

Kam, Cindy D. 2005. “Who Toes the Party Line? Cues, Values, and Individual

Differences.” Political Behavior 27 (June): 163-82.

Katz, Elihu. 1968. “On Reopening the Question of Selectivity in Exposure to Mass

Communication.” In Theories o f Cognitive Consistency: A Sourcebook, ed. Robert P.

Abelson, Elliot Aronson, William J. McGuire, Theodore M. Newcomb, Milton J.

Rosenberg, and Percy H. Tannenbaum. Chicago: Rand McNally.

Keenan, Nancy. 11 August 2005. “Rebuttal to FactCheck.org Analysis of NARAL

Pro-Choice America’s TV Ad.”

http://www.factcheck.org/UploadedFiles/Rebuttal-letter-from-Nancy-Keenan-to-FactCheck.org-8-l l-05.pdf.

Kelley, Harold H. 1973. “The Processes of Causal Attribution.” American Psychologist

28 (February): 107-28.

Kennan, George F. 1997. George F. Kennan and the Origins o f Containment,

1944-1946: The Kennan-Lukacs Correspondence. Columbia, Missouri: University

of Missouri Press.

Kerkhof, Peter. 1999. “Applying the Unimodel to Political Persuasion.” Psychological

Inquiry 10 (2): 137-40.

Kingdon, John W. 1984. Agendas, Alternative, and Public Policies. Boston: Little,

Brown.

Knight, Jack, and James Johnson. 1997. “What Sort of Political Equality Does

Deliberative Democracy Require?” In Deliberative Democracy, ed. James Bohman

and William Rehg. Cambridge: MIT Press.

Koehler, Jonathan J. 1996. “The Base Rate Fallacy Reconsidered: Descriptive,

Normative, and Methodological Challenges.” Behavioral and Brain Sciences 19

(March): 1-53.


http://www.factcheck.org/UploadedFiles/Rebuttal-letter-from-Nancy-Keenan-to-FactCheck.org-8-l

Krosnick, Jon A., and Donald R. Kinder. 1990. “Altering the Foundations of Support

for the President through Priming.” American Political Science Review 84 (June):

497-512.

Kuklinski, James H., Paul J. Quirk, Jennifer Jerit, David Schwieder, and Robert F.

Rich. 2000. “Misinformation and the Currency of Democratic Citizenship.” Journal

o f Politics 62 (August): 790-816.

Kunda, Ziva. 1990. “The Case for Motivated Reasoning.” Psychological Bulletin 108

(November): 480-98.

Kurtz, Howard. 16 May 2005. “Newsweek Apologizes: Inaccurate Report on Koran

Led to Riots.” Washington Post p. A01.

Lau, Richard P., and David P. Redlawsk. 2001. “Advantages and Disadvantages of

Heuristics in Political Decision Making.” American Journal o f Political Science 45

(October): 951-71.

Learner, Edward E. 1978. Specification Searches. New York: Wiley.

Learner, Edward E. 1983. “Let’s Take the Con out of Econometrics.” American

Economic Review 73 (1): 31-43.

Lepper, Mark R., Lee Ross, and Richard R. Lau. 1986. “Persistence of Inaccurate

Beliefs about the Self: Perseverance Effects in the Classroom.” Journal o f

Personality and Social Psychology 50 (March): 482-91.

Levendusky, Matthew S. 2006. “Sorting in the U.S. Mass Electorate.” Ph.D. diss.

Stanford University.

Lieb, David A. 2005 April 7. “Missouri Medicaid Cuts Sent to Blunt.” Associated

Press.

138 Bibliography


139

Lodge, Milton, and Charles Taber. 2000. “Three Steps toward a Theory of Motivated

Political Reasoning.” In Elements o f Reason, ed. Arthur Lupia, Mathew D.

McCubbins, and Samuel L. Popkin. New York: Cambridge University Press.

Lohmann, Suzanne. 1994. “The Dynamics of Informational Cascades: The Monday

Demonstrations in Leipzig, East Germany, 1989-91.” World Politics 47 (October):

42-101.

Londregan, John B. 2000. Legislative Institutions and Ideology in Chile. New York:

Cambridge University Press.

Lord, Charles. G., Lee Ross, and Mark R. Lepper. 1979. “Biased Assimilation and

Attitude Polarization: The Effects of Prior Theories on Subsequently Considered

Evidence.” Journal o f Personality and Social Psychology 37 (November):

2098-2109.

Luce, R. Duncan. 1995. “Four Tensions Concerning Mathematical Modeling in

Psychology.” Annual Review o f Psychology 46: 1 -26.

Lupia, Arthur. 2002. “Who Can Persuade Whom? Implications from the Nexus of

Psychology and Rational Choice Theory.” In Thinking about Political Psychology,

ed. James H. Kuklinski. New York: Cambridge University Press.

Lupia, Arthur. 2006. “When Can Politicians Scare Citizens Into Supporting Bad

Policies? A Theory of Incentives with Fear-Based Content.” University of Michigan

at Ann Arbor. Typescript.

Lupia, Arthur, and Mathew D. McCubbins. 1998. The Democratic Dilemma: Can

Citizens Learn What They Need to Know? New York: Cambridge University Press.

Lupia, Arthur, Mathew D. McCubbins, and Samuel L. Popkin, eds. 2000. Elements o f

Reason. New York: Cambridge University Press.


Luskin, Robert C., James S. Fishkin, and Roger Jowell. 2002. “Considered Opinions:

Deliberative Polling in Britain.” British Journal o f Political Science 32 (July):

455-87.

McGuire, William J. 1964. “Inducing Resistance to Persuasion: Some Contemporary

Approaches.” Advances in Experimental Social Psychology 1: 191-229.

McGuire, William J. 1969. “The Nature of Attitudes and Attitude Change.” In

Handbook o f Social Psychology, ed. Gardner Lindzey and Elliot Aronson. 2nd ed.

Reading, MA: Addison-Wesley.

Meinhold, Richard J., and Nozer D. Singpurwalla. 1983. “Understanding the Kalman

Filter.” American Statistician 37 (May): 123-27.

Mondak, Jeffery J. 1993a. “Public Opinion and Heuristic Processing of Source Cues.”

Political Behavior 15 (June): 167-92.

Mondak, Jeffery J. 1993b. “Source Cues and Policy Approval: The Cognitive

Dynamics of Public Support for the Reagan Agenda.” American Journal o f Political

Science 37 (February): 186-212.

NARAL Pro-Choice America. 2005. “Speaking Out.” Television advertisement.

http://www.factcheck.org/video/Speaking-Out.wmv.

Neill, Daniel B. 2005. “Cascade Effects in Heterogeneous Populations.” Rationality

and Society 17(2): 191-241.

Newell, Allen, J.C. Shaw, and Herbert A. Simon. 1958. “Chess-Playing Programs

and the Problem of Complexity.” IBM Journal o f Research and Development 2

(October): 320-35.

140 Bibliography


http://www.factcheck.org/video/Speaking-Out.wmv

141

Nielsen Media Research. 18 August 2005. “Supreme Court Political Ads Virtually

Absent From Television.” Press release.

Ottati, Victor C. 1990. “Determinants of Political Judgments: The Joint Influence of

Normative and Heuristic Rules of Inference.” Political Behavior 12 (June): 159-79.

Page, Benjamin I., and Richard A. Brody. 1972. “Policy Voting and the Electoral

Process: The Vietnam War Issue.” American Political Science Review 66

(September): 979-95.

Petrocik, John R. 1996. “Issue Ownership in Presidential Elections, with a 1980 Case

Study.” American Journal o f Political Science 40 (August): 825-50.

Petty, Richard E., and John T. Cacioppo. 1986. Communication and Persuasion:

Central and Peripheral Routes to Attitude Change. New York: Springer-Verlag.

Phillips, Lawrence D., and Ward Edwards. 1966. “Conservatism in a Simple

Probability Inference Task.” Journal o f Experimental Psychology 72 (September):

346-54.

Poole, Keith T., and Howard L. Rosenthal. 2006. Ideology and Congress. Piscataway,

NJ: Transaction Publishers.

Popkin, Samuel L. 1994. The Reasoning Voter: Communication and Persusasion in

Presidential Campaigns. 2nd ed. Chicago: University of Chicago Press.

Pratt, John W., Howard Raiffa, and Robert Schlaifer. 1964. “The Foundations of

Decision under Uncertainty: An Elementary Exposition.” Journal o f the American

Statistical Association 59 (June): 353-75.

Press, S. James. 2003. Subjective and Objective Bayesian Statistics. 2nd ed. Hoboken,

NJ: Wiley.


Prior, Markus. 2007. Post-Broadcast Democracy. New York: Cambridge University

Press.

Program on International Policy Attitudes. 2003 February 21. “Americans on Iraq

and the U.N. Inspections II.” http://www.pipa.org/OnlineReports/Iraq/IraqUNInsp2_Feb03/

IraqUNInsp2%20Feb03%20rpt.pdf.

Rahn, Wendy M. 1993. “The Role of Partisan Stereotypes in Information Processing

about Political Candidates.” American Journal o f Political Science 37 (May):

472-96.

Rawls, John. 1993. Political Liberalism. New York: Columbia University Press.

Ross, Lee, and Andrew Ward. 1996. “Naive Realism in Everyday Life: Implications

for Social Conflict and Misunderstanding.” In Values and Knowledge, ed. T. Brown,

E. Reed, and E. Turiel. New Jersey: Lawrence Erlbaum.

Ross, Lee, Mark R. Lepper, and Michael Hubbard. 1975. “Perseverance in

Self-Perception and Social Perception: Biased Attributional Processes in the

Debriefing Paradigm.” Journal o f Personality and Social Psychology 32 (May):

880-92.

Savage, Charlie. 2005 June 4. “Pentagon Details Koran Mistreatment.” Boston Globe

p. A l.

Savage, Leonard J. 1954. The Foundations o f Statistics. New York: Wiley.

Savage, Leonard J. 1964. “The Foundations of Statistics Reconsidered.” In Studies in

Subjective Probability, ed. Henry E. Kyburg, Jr. and Howard E. Smokier. New York:

Wiley.

142 Bibliography


http://www.pipa.org/OnlineReports/Iraq/IraqUNInsp2_Feb03/

143

Schaffner, Brian F., and Matthew J. Streb. 2002. “The Partisan Heuristic in

Low-information Elections.” Public Opinion Quarterly 66 (Winter): 559-81.

Schuman, Howard, and Stanley Presser. 1981. Questions and Answers in Attitude

Surveys: Experiments on Question Form, Wording, and Context. San Diego:

Academic Press.

Schumpeter, Joseph A. 1942. Capitalism, Socialism, and Democracy. New York:

Harper & Brothers.

Schwieder, David W., and Paul J. Quirk. 2004. “Missed Cues: Judgment Heuristics and

Citizen Competence.” Presented at the Annual Conference of the Midwest Political

Science Association, Chicago.

Sears, David O., and Jonathan L. Freedman. 1967. “Selective Exposure to Information:

A Critical Review.” Public Opinion Quarterly 31 (Summer): 194-213.

Simon, Herbert A. 1955. “A Behavioral Model of Rational Choice.” Quarterly Journal

o f Economics 69 (February): 99-118.

Simon, Herbert A. 1959. “Theories of Decision-Making in Economics and Behavioral

Science.” American Economic Review 49 (June): 253-83.

Simon, Herbert A., and Allen Newell. 1958. “Heuristic Problem Solving: The Next

Advance in Operations Research.” Operations Research 6 (January-February): 1-10.

Smith, Edward E. 1990. “Categorization.” In Thinking: An Invitation to Cognitive

Science, ed. Daniel N. Osherson and Edward E. Smith. Cambridge, MA: MIT Press.

Smith, Edward E., and Douglas L. Medin. 1981. Categories and Concepts. Cambridge,

MA: Harvard University Press.


Sniderman, Paul M., and Sean M. Theriault. 2004. “The Structure of Political

Argument and the Logic of Issue Framing.” In Studies in Public Opinion: Attitudes,

Nonattitudes, Measurement Error, and Change, ed. Willem E. Saris and Paul M.

Sniderman. Princeton, New Jersey: Princeton University Press.

Steenbergen, Marco R. 2002. “Political Belief Updating: An Experimental

Investigation Within a Bayesian Framework.” Presented at the Annual Scientific

Meeting of the International Society of Political Psychology, Berlin.

Stokes, Donald E. 1963. “Spatial Models of Party Competition.” American Political

Science Review 57 (June): 368-77.

Strasser, Helmut. 1981. “Consistency of Maximum Likelihood and Bayes Estimates.”

Annals o f Statistics 9 (September): 1107-1113.

Taber, Charles S., and Milton Lodge. 2006. “Motivated Skepticism in the Evaluation

of Political Beliefs.” American Journal o f Political Science 50 (July): 755-69.

Tessin, Jeff. 2006. “Cues Given, Cues Received: How Candidates Use Shortcuts When

Voters Need Them Most.” Presented at the Annual Meeting of the Midwest Political

Science Association, Chicago.

Tetlock, Philip E. 2005. Expert Political Judgment. Princeton: Princeton University

Press.

Tversky, Amos, and Daniel Kahneman. 1971. “Belief in the Law of Small Numbers.”

Psychological Bulletin 76 (August): 105-10.

van Houweling, Robert P., and Paul M. Sniderman. 2004. “The Political Logic of a

Downsian Space.” Presented at the Annual Meeting of the Midwest Political Science

Association, Chicago.

144 Bibliography


145

Webster, James G. 2005. “Beneath the Veneer of Fragmentation: Television Audience

Polarization in a Multichannel World.” Journal o f Communication 55 (June):

366-82.

White House. 2002 February 14. “President Announces

Clear Skies & Global Climate Change Initiatives.” Speech.

http://www.whitehouse.gov/news/releases/2002/02/20020214-5.html.

Whitney v. California. 1927. 274 U.S. 357.

Williams, Daniel, and Kamran Khan. 2005 May 28. “Muslims Rally Over Koran

Report.” Washington Post p. A 14.

Zaller, John R. 1992. The Nature and Origins o f Mass Opinion. New York: Cambridge

University Press.

Zanna, Mark P., and John K. Rempel. 1988. “Attitudes: A New Look at an Old

Concept.” In The Social Psychology o f Knowledge, ed. Daniel Bar-Tal and Arie W.

Kruglanski. New York: Cambridge University Press.

Zechman, Martin J. 1979. “Dynamic Models of the Voter’s Decision Calculus:

Incorporating Retrospective Considerations into Rational-Choice Models of

Individual Voting Behavior.” Public Choice 34 (September): 297-315.


http://www.whitehouse.gov/news/releases/2002/02/20020214-5.html

Documents

Part 3 Bayesian Updating of Political Beliefs: Normative