15
Ordering anomalies in choice experiments Brett Day a, , Jose-Luis Pinto Prades b a Centre for Social and Economic Research on the Global Environment (CSERGE), School of Environmental Sciences, University of East Anglia, Norwich NR4 7TJ, UK b Departamento de Economia, Me´todos Cuantitativos e Historia Econo ´mica, Universidad Pablo de Olavide, Sevilla, Spain article info Article history: Received 10 September 2008 Available online 7 March 2010 Keywords: Choice experiments Ordering effects Dichotomous choice contingent valuation Nonparametric testing abstract This paper investigates whether responses to choice experiments (CEs) are subject to ordering anomalies. While previous research has focussed on the possibility that such anomalies relate to position in the sequence of choice tasks, our research reveals that the particular order of tasks matters. Using a novel experimental design that allows us to test our hypotheses using simple nonparametric statistics, we observe ordering anomalies in CE data similar to those recorded in the dichotomous choice contingent valuation literature. Those ordering effects operate in both price and commodity dimensions and are observed to compound over a series of choice tasks. Our findings cast serious doubt on the current practice of asking each respondent to undertake several choice tasks in a CE while treating each response as an independent observation on that individual’s preferences. & 2010 Elsevier Inc. All rights reserved. 1. Introduction Over recent years, choice experiments (CEs) have enjoyed a startling rise in popularity among the practitioners of non-market valuation [2]. The fundamental building block of a CE is a choice task. A choice task confronts subjects with two or more options where the options differ both in the qualities of the non-market good (the ‘commodity dimension’) and in the cost imposed on the subject (the ‘price dimension’). The usual procedure is to ask subjects to indicate their preferred option in a series of such choice tasks. In principle, carefully designed CEs can furnish researchers with a potentially rich source of data from which to deduce individuals’ willingness to trade off between money and the various dimensions of the commodity space. In contrast, dichotomous-choice (DC) contingent valuation (CV) techniques, that had previously enjoyed the status of ‘most preferred valuation method’, provide a relative paucity of data. In their earliest inception [8], DC CV exercises presented each respondent with just one task; a choice between the status quo and the provision of the non-market good at a cost. Across a sample of respondents, the commodity dimensions were held constant while the price dimension was varied so that the willingness to pay (WTP) distribution for that particular manifestation of the non-market good could be estimated. While this ‘single-bounded’ DC (SBDC) elicitation method was strongly endorsed by the US National Oceanographic and Atmospheric Administration’s blue ribbon panel [4], practitioners were concerned by its relative inefficiency. In particular, SBDC elicitation provides a value estimate for just one manifestation of the non-market good and the collection of just one piece of preference information from each respondent is extremely inefficient. In response to the latter criticism, Hanemann et al. [17] proposed the ‘double-bounded’ DC (DBDC) elicitation method. Here, following the initial DC question, a ‘follow-up’ DC question is asked offering the same non-market good at a different Contents lists available at ScienceDirect journal homepage: www.elsevier.com/locate/jeem Journal of Environmental Economics and Management ARTICLE IN PRESS 0095-0696/$ - see front matter & 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.jeem.2010.03.001 Corresponding author. Fax: + 44 1603 507719. E-mail address: [email protected] (B. Day). Journal of Environmental Economics and Management 59 (2010) 271–285

Ordering anomalies in choice experiments

Embed Size (px)

Citation preview

Page 1: Ordering anomalies in choice experiments

ARTICLE IN PRESS

Contents lists available at ScienceDirect

Journal ofEnvironmental Economics and Management

Journal of Environmental Economics and Management 59 (2010) 271–285

0095-06

doi:10.1

� Cor

E-m

journal homepage: www.elsevier.com/locate/jeem

Ordering anomalies in choice experiments

Brett Day a,�, Jose-Luis Pinto Prades b

a Centre for Social and Economic Research on the Global Environment (CSERGE), School of Environmental Sciences, University of East Anglia, Norwich NR4 7TJ, UKb Departamento de Economia, Metodos Cuantitativos e Historia Economica, Universidad Pablo de Olavide, Sevilla, Spain

a r t i c l e i n f o

Article history:

Received 10 September 2008Available online 7 March 2010

Keywords:

Choice experiments

Ordering effects

Dichotomous choice contingent valuation

Nonparametric testing

96/$ - see front matter & 2010 Elsevier Inc. A

016/j.jeem.2010.03.001

responding author. Fax: +44 1603 507719.

ail address: [email protected] (B. Day).

a b s t r a c t

This paper investigates whether responses to choice experiments (CEs) are subject to

ordering anomalies. While previous research has focussed on the possibility that such

anomalies relate to position in the sequence of choice tasks, our research reveals that

the particular order of tasks matters. Using a novel experimental design that allows us

to test our hypotheses using simple nonparametric statistics, we observe ordering

anomalies in CE data similar to those recorded in the dichotomous choice contingent

valuation literature. Those ordering effects operate in both price and commodity

dimensions and are observed to compound over a series of choice tasks. Our findings

cast serious doubt on the current practice of asking each respondent to undertake

several choice tasks in a CE while treating each response as an independent observation

on that individual’s preferences.

& 2010 Elsevier Inc. All rights reserved.

1. Introduction

Over recent years, choice experiments (CEs) have enjoyed a startling rise in popularity among the practitioners ofnon-market valuation [2]. The fundamental building block of a CE is a choice task. A choice task confronts subjects withtwo or more options where the options differ both in the qualities of the non-market good (the ‘commodity dimension’)and in the cost imposed on the subject (the ‘price dimension’). The usual procedure is to ask subjects to indicate theirpreferred option in a series of such choice tasks. In principle, carefully designed CEs can furnish researchers with apotentially rich source of data from which to deduce individuals’ willingness to trade off between money and the variousdimensions of the commodity space.

In contrast, dichotomous-choice (DC) contingent valuation (CV) techniques, that had previously enjoyed the status of‘most preferred valuation method’, provide a relative paucity of data. In their earliest inception [8], DC CV exercisespresented each respondent with just one task; a choice between the status quo and the provision of the non-market goodat a cost. Across a sample of respondents, the commodity dimensions were held constant while the price dimension wasvaried so that the willingness to pay (WTP) distribution for that particular manifestation of the non-market good could beestimated. While this ‘single-bounded’ DC (SBDC) elicitation method was strongly endorsed by the US NationalOceanographic and Atmospheric Administration’s blue ribbon panel [4], practitioners were concerned by its relativeinefficiency. In particular, SBDC elicitation provides a value estimate for just one manifestation of the non-market good andthe collection of just one piece of preference information from each respondent is extremely inefficient.

In response to the latter criticism, Hanemann et al. [17] proposed the ‘double-bounded’ DC (DBDC) elicitation method.Here, following the initial DC question, a ‘follow-up’ DC question is asked offering the same non-market good at a different

ll rights reserved.

Page 2: Ordering anomalies in choice experiments

ARTICLE IN PRESS

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285272

price. The follow-up question greatly increases the information garnered from each respondent. Compared with SBDCelicitation, a significantly smaller sample is needed to estimate the WTP distribution to a given level of precision [17].

Subsequent empirical testing, however, has revealed a robust anomaly in responses to DBDC questions. Numerousstudies have observed that responses to follow-up questions are systematically influenced by the particular price offered inthe initial question (for example [20,18,14]). The existence of these price-sequence ordering effects casts serious doubt onthe validity of responses to follow-up questions.1 Indeed, one might argue that these well-documented problems haveprecipitated a growing disaffection with DC CV and contributed to the growing interest in CE methods.

While there are differences between DC CV and CE approaches, there are also similarities. For example, while differingin presentation, the SBDC elicitation method is essentially a simple form of CE in which subjects face only one choice taskrequiring a preference to be stated between their initial endowment and an alternative in which a non-market good isprovided at a price. Likewise, the DBDC elicitation method is a CE with two choice tasks pitting the initial endowmentagainst an alternative offering a non-market good. In this case, moving from the first choice task to the second, the pricedimension of the alternative is altered, but there is no change in the commodity dimension.

Given these similarities, a fundamental question that must be asked of the CE method is whether it is subject to thesame price-sequence ordering anomalies that have been observed in DC CV studies. This paper addresses that question. Inaddition, since CEs allow commodity as well as price dimensions to vary across tasks, this study seeks to establish ifequivalent commodity-sequence ordering anomalies are observable in CE responses. Indeed, our study is also designed toestablish whether ordering anomalies persist in sequences of choice tasks where both price and commodity dimensionsvary simultaneously. Finally, since CEs typically present a series of choice tasks, a further objective is to explore howordering anomalies, should they exist, develop over multiple tasks.

While a number of previous studies have investigated the issue of ordering anomalies in CEs, most have sought toidentify systematic changes in stated preferences relating simply to position in the sequence of tasks. Some studies (forexample [26]) have looked to see whether preferences for particular attributes differ at different positions in the sequence.Others have looked for evidence of either increasing fatigue (for example [24]) or preference learning (for example [19]) asrespondents progress through the sequence of choice tasks. In contrast, our work seeks to understand how ordering effectsare determined by the particular sequence in which the choice tasks are presented. As we report subsequently, we find thattask order has profound effects on respondent choices in CEs.

Within the context of CEs the study that is most similar to our work is that of Holmes and Boyle [19]. Using data from aCE, they estimate a parametric model of preferences and show that the prices and commodity attributes of options inprevious and future tasks have significant impacts on choices in the current task. Building on that insight, our workexplores ordering effects in CEs but using a very different testing framework. In particular, we develop a carefully designedexperiment in which separate samples are presented with three choice tasks. The order of choice tasks for each sample ischosen so that a variety of hypotheses regarding the nature of ordering effects can be tested by simply comparingresponses to certain tasks across samples. In contrast to previous studies, this experimental approach allows us to examineordering effects in CEs without imposing assumptions regarding the structure of preferences. The nonparametric tests weemploy are both simple and transparent and as such, we believe, increase the credibility of our findings.

2. Ordering anomalies in dichotomous choice contingent valuation studies

Implicit in the standard interpretation of both DC CV and CE data are a number of assumptions regarding respondents’behaviour that we shall refer to as the standard assumptions. Among the most important of those assumptions are thefollowing:

1.

effe

Stable reference point: respondents regard each valuation task as independent and exclusive of all other tasks. In thatway, each choice can be taken to reflect preferences from the same reference point: the respondent’s initial endowment.

2.

Neoclassical preferences: respondents possess complete, coherent and stable preferences conforming to the standardneoclassical model. In the context of a CE that assumption dictates that respondents can determine with absolutecertainty which of the options in a choice task is their most preferred.

3.

Truthful revelation: respondents are motivated to answer each question truthfully and hence reveal through theirchoices their most preferred option.

Provided these three behavioural assumptions hold, then an individual’s observed choices in a CE should not change

according to the nature of the options presented in previous choice tasks.

2.1. Price-sequence ordering anomalies in DC contingent valuation studies

In the DC CV literature several studies have shown that average WTP when calculated from responses to follow-upquestions is significantly lower than that calculated from first questions (for example [20,11,18,3,7]). DeShazo [14] appears

1 Following Bateman et al. [6] we describe these as ‘ordering effects’, which run contrary to the expectations of economic theory, and not ‘sequencing

cts’, which are anticipated by economic theory; our thanks to an anonymous referee for highlighting the distinction.

Page 3: Ordering anomalies in choice experiments

ARTICLE IN PRESS

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285 273

to be the first to attribute that shift to a price-sequence ordering anomaly. Reviewing data from a number of previousstudies, DeShazo compares sample acceptance rates (proportion answering ‘Yes’) for follow-up questions to those forinitial questions. He finds that when follow-up questions are preceded by a higher price (what we shall term an improving

price sequence) the sample acceptance rates do not systematically differ. However, when the preceding price was a loweramount (what we shall term a worsening price sequence) acceptance rates are significantly depressed. DeShazo concludesthat it is this unwillingness to accept offers presented in worsening price sequences that causes the downward shift in theaverage WTP implied by follow-up responses.

2.2. Commodity-sequence ordering anomalies in DC contingent valuation studies

Relatively less well studied in DC CV are commodity-sequence ordering anomalies. Using DBDC elicitation, Powe andBateman [22] investigate differences in WTP for environmental improvements to small and large areas of natural habitat.They observe significant commodity-sequence ordering effects; implied values for the large good are higher when valuedafter the small good, while the small good is valued less when presented after the large good. A qualitatively identicalresult is reported by Bateman and Brouwer [5] in a study using SBDC elicitation. It is interesting to note that Bateman et al.[6] observe a similar pattern of commodity-sequence ordering anomaly in a CV study using open-ended elicitation.2

Likewise, Clark and Friesen [13] report the same result for private goods valued in real trading experiments concluding that‘items of a smaller size/quality suffer by relative comparison, whereas items of a larger size/quality gain by it’ (p. 205). Theevidence that exists, therefore, conforms with a commodity-sequence ordering anomaly in which a good is regarded morefavourably when preceded by a question offering a relatively smaller level of provision (an improving commodity sequence)while being regarded less favourably if preceded by a relatively larger level of provision (a worsening commodity sequence).

2.3. Competing explanations of ordering anomalies

A number of possible explanations for ordering anomalies in DC CV exercises have been proposed in the literature. Inessence, each of those explanations postulates a violation of one or more of the standard assumptions. Among the variousexplanations, perhaps three theories stand out as having endured close scrutiny in the CV literature.

2.3.1. The framing model: shifting reference points

DeShazo [14] proposes that in DBDC elicitation, those answering ‘Yes’ to the initial question may suppose that they havecompleted an informal transaction and so update their perceptions of their endowment. In contrast, those answering ‘No’have no reason to update their reference point. The structure of DBDC elicitation ensures that those answering ‘Yes’ face afollow-up question offering a higher price (a worsening price sequence) while those answering ‘No’ are presented with alower price (an improving price sequence). Evoking the loss aversion feature of reference-dependent utility theory [25],DeShazo proposes that it is this shift in reference points that causes responses to follow-up questions in worsening pricesequences to exhibit significantly reduced acceptance rates. Accordingly, the framing model challenges the first standardassumption regarding the independence and exclusivity of tasks.

2.4. The anchoring model: shifting reservation prices

Within the CV literature considerable attention has focused on the possibility of an anchoring or starting-point bias (forexample [9,18,21,27]). The usual explanation for the anchoring effect is motivated by the assertion that individuals’preferences for a non-market good are poorly formed and might more closely resemble a probability distribution over arange of values (for example [23,26]). Accordingly, this explanation challenges the second standard assumption ofcomplete and coherent preferences.

In essence, it is argued that individuals’ preference distributions may be malleable and influenced by outside cues. Inthe case of DC CV, for example, individuals may interpret the price offered in the initial question as conveying informationon the true value of the non-market good. In that case, they may perform a Bayesian update of their prior value distributionplacing greater weight on the value presented to them as the initial offer price [21,1,15,16]. In a worsening price sequence,such respondents are less likely to accept the high-priced offer in the follow-up question since their updated perception ofthe value of the good is somehow focussed on the lower price offered in the initial question. The opposite is true of animproving price sequence. Likewise, commodity-sequence ordering effects may result from the anchoring heuristicoperating on individuals’ value functions in the commodity dimension, an explanation proposed by Powe and Bateman[22] as an explanation for the ordering effects observed in their data.

2 Interestingly, Bateman et al. [6] also observe that these ordering effects disappear when the sequence of valuation questions is described to

respondents previous to undertaking the tasks.

Page 4: Ordering anomalies in choice experiments

ARTICLE IN PRESS

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285274

2.4.1. The strategic behaviour model

A third possible contention is that strategic misrepresentation of preferences might explain the price-sequence orderinganomalies observed in DBDC responses [27,12], an explanation that challenges the third standard assumption of truthfulpreference revelation. Carson and Groves [12] argue that the presentation of a second price in DBDC elicitation signals torespondents that they are in a price negotiation and encourages them to reject the subsequent offer, hoping to signal thatonly a low cost of provision is acceptable. In a similar vein, Powe and Bateman [22] and Bateman et al. [6] cite strategicresponses as a possible explanation of the commodity-sequence ordering effects observed in their DC CV studies.

In Section 6, we will return to consider what light our findings shed on the relative merits of these various theories. Fornow, however, we turn to the question of how price and commodity sequences, like those observed in DC CV questions,might be manifested in a CE.

3. Sequences in choice experiment tasks

Consider a simple CE consisting of a series of choice tasks each presenting just two options; options I and II. To simplifyfurther, imagine that the commodity dimension of the good can be described as small, medium or large. Likewise the pricescan take the values h0, hlow and hhigh. In this context, we can construct pairs of choice tasks that exactly replicate CVquestions with price and commodity sequences.

For example, the sequence of choice tasks shown in the upper part of Fig. 1 replicates a worsening price sequence in aDBDC CV question. Here option I is the same in both tasks (the status quo in a DBDC CV question). Likewise, the commoditydimension of option II is the same in both tasks (the new level of provision of the non-market good in a DBDC CV question).The only difference between the two tasks is that the price dimension of option II increases (a worsening price sequence ina DBDC CV question). Accordingly, we describe option II in choice task 2 as appearing in a worsening price sequence. Notethat in a CE the sequence must be attributed to the particular option in which it is observed.

The lower part of Fig. 1 shows a sequence of CE tasks that replicate the improving price sequence of a DBDC CVquestion. Again option I is the same in both tasks. Option II, on the other hand, offers the same level of provision in bothtasks but at a high-price in the first task followed by a low price in the second task. Again we attribute the sequence to theoption and describe option II in choice task 2 as appearing in an improving price sequence.

If the series of choice tasks in Fig. 1 exactly replicate improving and worsening price sequences in DBDC CV questions,then we might expect them to generate the same sequencing anomalies observed by DeShazo [14]. We could simply testthat contention by presenting independent samples with the two sequences of choice tasks. By virtue of their position atthe beginning of the sequence of tasks, responses to the first task cannot be subject to ordering effects. Consequently wemight take first tasks as forming a control case. Observe that the tasks in Fig. 1 are chosen such that the first choice taskfaced by one sample is identical to the second choice task faced by the other sample. Accordingly, each sample provides thecontrol case against which the other sample’s responses to the second choice task can be compared. According to thestandard assumptions, responses to those second tasks should be similar to responses given in the control case. In contrast,DeShazo’s [14] results suggest that that result will only characterise responses in an improving price sequence. Theworsening price sequence, it is predicted, will produce an ordering anomaly in which the proportion favouring option II inthe second task will be significantly less than the proportion favouring that option in the control case.

Fig. 2 presents a similar construction but this time illustrating option I of the choice tasks in worsening and improvingcommodity sequences. In the upper diagram, option II is the same for both tasks. Option I, on the other hand, maintains thesame price in both tasks, but the amount of good offered in the second task is lower. This is then a worsening commodity

Option II in a Worsening Price Sequence:

Choice Task 1 Choice Task 2

Option I Option II Option I Option II

Small Large Small Large

€Low €0 €0 €High

€High €0 €0 €Low

Option II in an Improving Price Sequence:

Choice Task 1 Choice Task 2

Option I Option II Option I Option II

Small Large Small Large

Fig. 1. Examples of price-sequencing in a choice experiment.

Page 5: Ordering anomalies in choice experiments

ARTICLE IN PRESS

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285 275

sequence in option I. The opposite pattern, an improving commodity sequence in option I, is illustrated in the lower part ofFig. 2.

Again, a simple test of the standard assumptions would be to present the two sequences of choice tasks toindependent samples and compare response proportions for the matched tasks. If the findings from the DC CVliterature carry over to CEs, then we might expect to see ordering effects in both improving and worsening commoditysequences.

Of course, in a CE it is unusual to have only one of the options changing from choice task to choice task. For example,Fig. 3 illustrates choice tasks that present respondents with multiple-sequences. In the top sequence of tasks, option I is in aworsening commodity sequence while option II is also in a worsening sequence but this time in the price domain. Thebottom sequence presents the opposite case where option I is in an improving commodity sequence while option II is in animproving price sequence.

If the patterns of behaviour observed in DC CV studies carry over to these more complex choice situations, then theimproving multiple-sequence should also elicit responses showing evidence of ordering effects. In particular, in theimproving multiple-sequence the relatively larger commodity offered by option I in the second task leads respondents toregard this option more favourably, while the improved price offered by option II in the second task results in no suchequivalent bias. This combination of effects would lead to a relatively greater proportion of respondents favouring option Iin this choice task than would do if that same choice task was the first in the sequence.

In the worsening multiple-sequence, the two ordering biases work in the same direction; the relatively smallercommodity offered by option 1 makes this option appear less favourable but the relatively greater price offered by option 2makes this also appear less favourable. Since we are unable to determine in advance which of the two biases will dominate,it is not possible to make predictions concerning ordering effects in this case.

Option I in a Worsening Commodity Sequence:

Choice Task 1 Choice Task 2

Option I Option II Option I Option II

Medium Large Small Large

€0 €High €0 €High

€0 €High €0 €High

Option I in an Improving Commodity Sequence:

Choice Task 1 Choice Task 2

Option I Option II Option I Option II

Small Large Medium Large

Fig. 2. Examples of commodity-sequencing in a choice experiment.

Worsening Commodity (Option I) and Price (Option II) Sequences:

Choice Task 1 Choice Task 2

Option I Option II Option I Option II

Medium Large Small Large

€0 €Low €0 €High

Improving Commodity (Option I) and Price (Option II) Sequences:

Choice Task 1 Choice Task 2

Option I Option II Option I Option II

Small Large Medium Large

€0 €High €0 €Low

Fig. 3. Examples of multiple-sequencing in a choice experiment.

Page 6: Ordering anomalies in choice experiments

ARTICLE IN PRESS

Table 1Health state descriptions used in the study.

Health dimension Health state

Full health (EQ-5D: 11111) Mild (EQ-5D: 21212) Severe (EQ-5D: 22223)

Personal mobility No problems Some problems Some problems

Self-care No problems No problems Some problems

Normal activity No problems Some problems Some problems

Pain or discomfort None None Moderate

Anxious or depressed No Somewhat Very

Option I Option II

Duration of Treatment

12 mths 12mths

Symptoms Severe None

Duration of Symptoms

2 mths None

Cost €0 €120

I Severe

II €Med

Fig. 4. Schematics of typical choice task.

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285276

4. Experimental design and testing framework

Our application concerns the valuation of health using a CE. In each choice task respondents were asked to imagine thatthey had been diagnosed with a medical problem that would result in an episode of ill-health. The severity of that episodewas measured using the Euroqol (EQ-5D), a standardised instrument for measuring health outcome [10]. In this studywe used three EQ-5D health states (see Table 1); one severe, one mild and full health. The advantage of EQ-5D is that thereare clear dominance relations between health states, such that an improvement from severe to full health is necessarily agreater health gain than an improvement from mild to full health.

Respondents were informed that the problem was treatable and that following treatment they would be returned to fullhealth within 1 year. In each choice task, subjects were asked to choose between two treatment options. One option was amedicine provided by the hospital. While this option was free of charge it meant that the subject would still experiencesymptoms of ill-health for the first 2 months of treatment. The second option in each choice task was to purchase analternative medicine from a pharmacy. While this treatment was costly, it meant that the subject would avoid anysymptoms while being treated thereby enjoying a quality of life that was equivalent to full health.

The commodity dimension of options in our CE described one of three states of health; 2 months of severe ill-health or2 months of mild ill-health or on-going full health. In addition, the price dimension of options in our CE described one of fourlevels of treatment cost; h0, h60 (hLow), h120 (hMed) or h240 (hHigh). As shown in Fig. 4, a typical choice task pitted a zero costtreatment with 2 months of reduced quality of life, against a costly treatment that returned the subject to immediate full health.

In order to simplify the presentation of our experimental design, it is expedient to further condense each choice taskinto the simple schematic shown on the right hand side of Fig. 4. Here the top box represents option I where the cost,treatment and symptom duration dimensions have all been suppressed since these were always h0, 12 and 2 months,respectively. Likewise the bottom box represents option II with the commodity dimensions suppressed since these werealways 12 months of treatment with no symptoms.

For the purposes of our study, we constructed four different choice tasks. Our experimental design involved a six-waysplit sample with each sample facing a different set of three of those four choice tasks. We label the six samples, A1, A2, B1,B2, C1 and C2. The tasks and order in which they were presented to respondents in each sample are summarised using thesimplifying schematic in Fig. 5.

The essence of our experimental design is to confront independent samples with the same choice task but set indiffering sequences and to record the proportions choosing each option. Following the lead of McFadden [20], Batemanet al. [7] and De Shazo [14] (among others), the null hypothesis of the irrelevance of task order can then be tested using asimple nonparametric comparison of proportions in independent samples.3

3 In two independent samples of sizes n1 and n2, we observe the proportion of respondents choosing a particular option and label those p1 and p2,

respectively. The null hypothesis, p1=p2, can be tested using the statistic z¼ ðp1�p2Þ=ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffip1ðð1�p1Þ=n1Þþp2ðð1�p2Þ=n2Þ

p. The test statistic has a distribution

that is well-approximated by the standard normal in samples where, according to a frequently cited rule of thumb, np45 and n(1�p)45; a condition

met by all tests applied in our study.

Page 7: Ordering anomalies in choice experiments

ARTICLE IN PRESS

Sample Choice Task 1 Choice Task 2 Choice Task 3

A11 A12 A13

A1II II 50.6

A21 A22 A23

A2

B11 B12 B13

B1

B21 B22 B23

B2

C11 C12 C13

C1

C21 C22 C23

C2

II

II IIII

II II II

II IIII

II IIII

II II II

€Low

€Low

€Low

€Low

Notes:Top box of each choice task represents Option I, bottom box Option II. Acceptance rates are printed in the box to the right of each Option Severe = 2 months 22223, Mild = 2 months 21212 High = €240, Medium = €120, Low = €60

€Med 74.7 €High

€High

€High

€High

€High

57.8 €Med

€Med

€Med

I I I

I I I

I I I

I I I

I I I

I I I

Severe Severe25.3 42.2 Mild

Severe

Severe

Severe

Severe

Severe

Severe

Severe

Severe

Mild

49.4

67.5 75.9

32.5 24.1 36.1

63.9

Mild Mild

Mild

Mild

Mild Mild

€Med

€Med

€Med

€Med

€Med

37.3

62.7

25.9

74.1

22.9

77.1

28.9

71.1

41.0

59.0

59.0

41.0

32.5

67.5

49.4

50.6

64.7

35.3

38.8

61.2

8.4

91.6

24.1

75.9

Fig. 5. Experimental design and observed acceptance rates for options in each task.

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285 277

The first two tasks of our design are selected to test whether CE-style questions precipitate the same ordering anomaliesobserved in DC CV. For clarity, Table 2 describes the option sequences in these first two tasks as well as a prediction of theresulting ordering anomaly suggested by evidence from DC CV studies. In our subsequent testing, we take the absence ofan ordering effect to be our null hypothesis and the predictions of Table 2 to be our (frequently directional) alternativehypotheses.

The comparison of response proportions for the first two choice tasks of samples A1 and A2 provides tests of price-sequencing (as per Fig. 1). The same comparison for samples B1 and B2 test for commodity-sequencing (as per Fig. 2).While the comparison of samples C1 and C2 test for multiple price and commodity-sequencing (as per Fig. 3). Through theinclusion of a third task, our design also allows us to consider how ordering anomalies might evolve over a series of tasks.We return to discuss this issue in more detail in the next section.

5. Results

Surveying was performed in Northern Spain, using professional interviewers undertaking personal interview sessions.Respondents were not rewarded monetarily or otherwise for their participation in the survey. A sample of 500 individualswas drawn from the population of the study area and respondents were randomly allocated into one of the six treatments.

Page 8: Ordering anomalies in choice experiments

ARTICLE IN PRESS

Table 2Patterns of sequencing and predictions of resultant response bias in second choice task.

Sample Option First choice Second choice Sequence Prediction of bias in second choice task

A1 I Severe Severe No sequence Increase attractiveness of option I

II hMed hHigh Worsening

A2 I Severe Severe No sequence No bias

II hHigh hMed Improving

B1 I Mild Severe Worsening Reduce attractiveness of option I

II hMed hMed No sequence

B2 I Severe Mild Improving Increase attractiveness of option I

II hMed hMed No sequence

C1 I Mild Severe Worsening No prediction

II hLow hHigh Worsening

C2 I Severe Mild Improving Increase attractiveness of option I

II hMed hLow Improving

Table 3Comparison of descriptive statistics across samples.

Variable Sample means (standard deviations for continuous variables) Test of difference ingroups (p-value)

A1 A2 B1 B2 C1 C2

Income per household member (h/month) 714.31 704.60 704.68 739.38 609.11 647.38 0.533a

(444.5) (481.0) (623.8) (456.4) (318.5) (429.4)

Age (years) 46.52 45.67 45.67 47.52 44.18 46.12 0.895a

(18.7) (17.9) (16.1) (17.8) (18.4) (16.7)

Formal education (years) 10.17 10.51 10.41 11.20 10.25 11.30 0.203a

(3.80) (3.05) (3.56) (4.12) (4.05) (3.47)

Gender (% female) 54.2 47.0 48.2 48.2 53.0 45.8 0.859b

Working status (% working) 65.06 69.88 68.67 63.53 62.65 60.24 0.846b

Working status (% retired) 16.87 19.28 14.46 21.18 19.28 21.69 0.781b

a p-value calculated from ANOVA using F-test of equality of means across multiple groups.b p-value calculated from w2-test of equality of proportions across multiple groups.

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285278

Each sample contained 83 observations except for sample B2 which contained 85. Table 3 summarises some of thesocioeconomic characteristics of the 6 samples. As can be seen from the final column of that table, no statisticallysignificant differences in those characteristics are observable across samples.

Responses to the CE questions are summarised in Fig. 5 which shows the proportions of each sample choosing eachoption in each choice task. For ease of exposition, we refer to the first question faced by sample A1 as A11, likewise thesecond question faced by this sample is denoted A12, and so on.

5.1. Tests of consistency

The fundamental assumption of our experimental design is that each sample represents an independent observation ofthe same underlying population of preferences. Moreover, we assume that those preferences are economically meaningful,according with basic notions of economic rationality. Our first series of tests examine those assumptions.

5.1.1. Tests 1–3: across-sample consistency in identical first choice task

If each sample is an independent draw from the same population then the distribution of preferences in each sample shouldbe approximately the same. We test that by confronting samples A1, B2 and C2 with identical first choice tasks. Reassuringly,tests 1–3 of Table 4 confirm that the proportions choosing option I in this task (25.3%, 25.9% and 28.9%, respectively) do notdiffer significantly. For the purposes of subsequent testing, we combine these three samples and treat them as observationspertaining to the same set of underlying preferences.4 Of that combined sample of 187 individuals, 26.7% chose option I.

4 As pointed out by an anonymous referee, the pooling of three samples to form a single observation for this task leads to a disparity in sample sizes

in our subsequent testing. The results of the analysis, however, remain qualitatively identical using the disaggregated data even with respect to the broad

level of statistical confidence that can be attributed to the findings of the various tests.

Page 9: Ordering anomalies in choice experiments

ARTICLE IN PRESS

Table 4Tests of across-sample consistency in responses to choice task 1.

Test Test case Control case Predicted change in test case preferences Signed p-valuea

Option I Option II

Response proportions the same for identical choice tasks

Test 1 A11 B21 None None +0.9312b

Test 2 B21 C21 None None +0.6593b

Test 3 C21 A11 None None �0.6004b

More is better

Test 4 B11 A11, B21 and C21 + None +0.0323c

Lower prices are better

Test 5 A11, B21 and C21 A21 None – �0.1527c

Test 6 C11 B11 None – �0.0212c

a p-value records probability that the proportions choosing the two options are identical in control and test cases. A positive (negative) sign indicates

a higher (lower) proportion chose option I in the test case than the control case.b From two-tailed test of equality of proportions.c From one-tailed test of equality of proportions.

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285 279

5.1.2. Test 4: across-sample consistency; more commodity at same price

Standard economic theory supposes that individuals prefer more of good things to less. Accordingly, we would expectmore individuals to choose a treatment if it offered a greater health benefit. In the first task, samples A1, B2 and C2 face achoice between a medium-priced treatment and an episode of severe ill-health. For sample B1 the choice is between thesame medium-priced treatment and an episode of only mild ill-health. When the ill-health episode is mild (B11), 37.7% areprepared to endure it but, in line with expectations, when the ill-health event is severe (A11, B21 and C21) only 26.7% willendure it. As shown in test 4 of Table 4, that difference is statistically significant (p-value: 0.0323).

5.1.3. Tests 5 and 6: across-sample consistency; same commodity at different price

In a similar vein, economically rational individuals should prefer lower priced goods to higher priced goods. In the firsttask, samples A1, B2 and C2 must choose between an episode of severe ill-health and a medium-priced treatment. Forsample, A2 the choice is the same except the treatment now has a high-price. As expected, 73.3% choose the treatment atthe medium price compared with 67.5% when the price is high. Samples B1 and C1 afford similar comparison. Here the firsttasks pair a mild ill-health event with a medium- and low-priced treatment, respectively. Again economic expectations areborne out; 62.7% choose the treatment at the medium price compared with 77.1% at the low price. Referring to tests 5 and6 in Table 4, the first of these comparisons fails to register as a significant difference (p-value: 0.1527), while the other issignificant at the 95% confidence level (p-value: 0.0212).

5.1.4. Within-subject consistency

A further expectation of standard economic theory is that individuals respond to repeated choice tasks with reference toa stable set of well-formed preferences. Under that assumption, we would expect respondents’ choices to exhibit internalconsistency. For example, an individual who agrees to pay for a high-priced treatment in order to avoid a certain episode ofill-health should, if offered the choice in a subsequent task, also agree to pay for a low-priced treatment so as to avoid thesame episode. Respondents to the survey, it transpires, are remarkably consistent. In all 151 cases where individuals’ firsttask choices reveal preferences dictating a particular choice in the second task, consistent choices were made. Of the 138cases where a choice in the first task dictates a particular choice in the third task, only 4 individuals make the inconsistentchoice.

So far, the prognosis for CE is rather positive; subjects’ responses are internally consistent and conform to some basictenets of economic theory. Moreover, the distribution of preferences in first choice tasks across samples is remarkablysimilar. That finding supports our assumption that the first task responses reflect unbiased population preferences and canact as control cases to compare against responses to tasks later in the sequence.

5.2. Price-sequence ordering effects

5.2.1. Test 7: worsening price sequence

The control case for our worsening price sequence test is provided by sample A2. Here we observe 67.5% of the samplechoosing option II in the first task (A21). Sample A1 face the exact same choice but as their second task (A12). This secondtask differs from the first (A11) only in as much as the price of option II increases. This worsening price sequence has asubstantial impact with only 57.8% of that sample selecting option II, almost 10% less than in the control case. As shown intest 7 of Table 5, a one-tailed test reveals this difference to be significant at the 90% level of confidence (p-value: 0.0996).

Page 10: Ordering anomalies in choice experiments

ARTICLE IN PRESS

Table 5Tests of ordering anomalies.

Sequence Test case Control case Predicted change in test case preferences preferences Signed p-valuea

Option I Option II

Price sequence

Test 7 A12 A21 None – +0.0996b

I: no change

II: worsening Price

Test 8 A22 A11, B21 and C21 None None �0.6403c

I: no change

II: improving Price

Commodity sequence

Test 9 B12 A11, B21 and C21 – None �0.0003b

I: worsening commodity

II: no change

Test 10: B22 B11 + None +0.0002b

I. improving commodity

II: no change

Multiple sequence

Test 11 C22 C11 + None +0.0063b

I: improving commodity

II: improving Price

Test 12 C12 A21 – – 1.0000c

I: worsening commodity

II: worsening price

a p-value records probability that the proportions choosing the two options are identical in control and test cases. A positive (negative) sign indicates

a higher (lower) proportion chose option I in the test case than the control case.b From one-tailed test of equality of proportions.c From two-tailed test of equality of proportions.

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285280

5.2.2. Test 8: improving price sequence

The control case for our improving price sequence test is provided by samples A1, B2 and C2 that share the same firstchoice task. In those tasks (A11, B21 and C21), we observe 73.3% of respondents choosing option II. The exact same choice ispresented to sample A2 as their second task (A22). This second task differs from the first (A21) only in that option II is offeredat a lower price. This improving price sequence has a negligible impact on choices with the proportion selecting option IIremaining relatively stable at 75.9%. Test 8 of Table 5 confirms that these two proportions do not differ significantly(p-value: 0.6403).

Our data provide evidence of price-sequence ordering effects in CE tasks. We observe a substantial (if only marginallysignificant) reduction in the frequency with which respondents choose an option if, all else equal, that option is in aworsening price sequence. In contrast, no systematic impact on preferences can be discerned for options in improving pricesequences. Interestingly, the pattern we observe in our CE data exactly replicates the ordering effects recorded by DeShazo[14] in DBDC CV data.

5.3. Commodity-sequence ordering effects

5.3.1. Test 9: worsening commodity sequence

The control case for our test of worsening commodity sequences is provided by the first choice task presented tosamples A1, B2 and C2. Some 26.7% of that combined sample chose the episode of severe ill-health offered by option I.Sample B1 face the same task (B12) but second in the sequence where the previous task (B11) differed only in that option Ientailed enduring an episode of mild ill-health. This worsening commodity sequence has a major impact on the selection ofoption I with only 8.4% of the sample making that choice. Test 9 of Table 5 reveals that difference to be statisticallysignificant at all conventional levels (p-value: 0.0003).

5.3.2. Test 10: improving commodity sequence

The control case for the improving commodity sequence is provided by the first task presented to sample B1. Weobserve 37.7% of that sample plumping for the mild ill-health event offered by option I. Sample B2 faced the same tasksecond in the sequence such that option I is in an improving commodity sequence. Again this precipitates a considerableshift in expressed preferences with the proportion choosing option I jumping to 64.7%. The formal test shown as test 9 inTable 5 confirms this difference to be statistically significant at all conventional levels (p-value from a one-tailed test:0.0002).

Page 11: Ordering anomalies in choice experiments

ARTICLE IN PRESS

Choice Task 1

Choice Task 2

Choice Task 3

First-OrderSequences:

Initial Secondary

Second-OrderSequence:

Fig. 6. First- and second-order sequences in a three-task choice experiment.

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285 281

The pattern of responses observed in our data appears to offer strong evidence of commodity-sequence orderinganomalies. In contrast to the asymmetry observed in the ordering effect for price sequences, these anomalies function inboth improving and worsening directions. Again, our data confirm that the patterns of commodity-sequence orderinganomalies suggested by DC CV studies persist in CEs.5

5.4. Multiple-sequence ordering effects

Unlike DC CV, CEs typically vary more than just one attribute of one option from choice task to choice task. Suchvariation may lead to situations in which both options in a task are in sequences that might potentially influencerespondents’ choices. We refer to such tasks as being in a multiple-sequence.

5.4.1. Test 11: improving multiple-sequence

Consider first the comparison between responses to questions C11 and C22. Both present a choice between an episode ofmild ill-health (option I) and a low priced treatment (option II). While C11 represents our control case, C22 is the secondchoice task such that option I is in an improving commodity sequence, which our earlier findings suggest should make itappear more favourable, and option II is in an improving price sequence, which our earlier findings suggest will have noimpact on preferences. Accordingly we might expect more respondents to select option I in C22 than in the control case(C11). As shown in test 11 of Table 5, this is what we observe; 22.9% prefer option I in C11 compared with 41% in C22 whichproves to be a statistically significant difference at all conventional levels (p-value: 0.0063).

5.4.2. Test 12: worsening multiple-sequence

Worsening multiple-sequences are tested by comparing responses to C12 to the control case provided by A21. Here wehave no directional hypothesis since in C12 both options are in worsening sequences and hence both should be regardedless favourably. The data support that contention as in both tasks the proportion favouring option I is identically 32.5%.

It appears that the price and commodity-sequence ordering effects we have documented in very simple sequences(where the attributes of just one option of the choice change) can be used to explain the patterns of response observed inmultiple-sequences (where the attributes of both options of the choice change). In particular, the ordering effect observedin a multiple-sequence can be explained by netting out the separate ordering effects acting on the individual options of thechoice tasks.

5.5. Third task ordering effects

Thus far our examination has focused on the nature of the sequence of first and second tasks. We shall refer to thatsequence as the initial sequence. Moreover, since the response anomalies observed in the second task are determined by thenature of the immediately preceding task we shall describe those as first-order ordering effects. When a third choice task isintroduced, further sequences are identifiable. The third task forms a first-order sequence with the second task; a sequencethat we shall refer to as the secondary first-order sequence. Furthermore, the third choice task forms a second-order sequence

with the first choice task. Those different types of sequence are depicted in Fig. 6.One possibility is that both or either of these additional sequences induce ordering effects similar in nature to those

observed for the initial first-order sequence (replicated for reference in the second column of Table 6). Under thatassumption, we can identify the particular type of sequence characterising the move from the second to third choice tasksand thereby predict the direction of bias the secondary first-order sequence might precipitate in responses to the thirdtask. Those predictions are listed in column 3 of Table 6.6 Likewise, comparing first and third tasks allows us to predict thedirection of possible bias precipitated by the second-order sequence. Those predictions are listed in column 4 of Table 6.

5 Our assumption that the evidence from the DC CV literature is sufficient to warrant the use of one-tailed tests for commodity-sequence ordering

anomalies is not required. The differences in response proportions prove statistically significant at all conventional levels even when applying a

two-tailed test.6 In the worsening multiple-sequence the overall effect depends on the relative strength of conflicting improving price-sequence and

commodity-sequence ordering effects. In sample C1’s initial sequence those two effects appear to cancel each other out such that no response bias is

Page 12: Ordering anomalies in choice experiments

ARTICLE IN PRESS

Table 6Direction of first- and second-order ordering effects on preferences for option I and observed response bias in the third choice task.

Sample Ordering effects in option I Bias in third task responses compared withcontrol case

First-order initial(observed)a

First-order secondary(potential)

Second-order(potential)

Direction(observed)a

One-tailedtestb

Two-tailedtestb

A1 + + + + 0.0587 0.1173

A2 None + + + 0.0306 0.0612

B1 – + None None 0.4274 0.8547

B2 + �c + None 0.1973 0.3947

C1 None + + + 0.0587 0.1173

C2 + None + + 0.0003 0.0006

a An observed ordering effect is classified as being positive (+) or negative (�) if a one-tailed test is significant at the 90% confidence

level or greater.b p-value records probability that the proportions choosing option I in each third task are identical to the appropriate test case.c See discussion in footnote 6.

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285282

To illustrate, the initial first-order sequence for sample A1 is a worsening price sequence that we have observed to shiftpreferences in favour of option I. That observed bias is indicated by a plus sign in column 2. The secondary first-ordersequence for sample A1 is an improving multiple-sequence. As an initial sequence that pattern was observed to biaspreferences in favour of option I (sample C2). We record that prediction in column 3 with a plus symbol. The second-ordersequence for this sample forms an improving commodity sequence. As an initial sequence that pattern of tasks biasesresponses in favour of option I (sample B2). As such, column 4 for sample A2 contains a plus symbol.

Columns 5–7 of Table 6 provide a summary of the direction of biases actually observed in the third choice tasks. In eachcase, the response proportions of the third task have been compared with the appropriate control case and the statisticalsignificance of any differences in response patterns calculated. For example, in the case of sample A1 we observe responsesto the third task are biased in favour of option I. Of course, that bias is what we might expect given that all three sequencesare predicted to push preferences in that direction. Given the directionality of that prediction a one-tailed test is sufficientto establish an ordering anomaly significant at the 90% level of confidence (p-value: 0.0587).

For sample A1 all first- and second-order sequences work to bias responses in the same direction. As a result, thatsample is relatively uninformative in untangling which of the sequences, if any, are responsible for the response biasobserved in the third task. The possibility remains that responses to the third task are influenced by any combination of thevarious first- and second-order effects. In the following, we use our data to explore a number of hypotheses regarding thedeterminants of bias in third tasks.

5.5.1. Hypothesis 1: no ordering effects in responses to third choice tasks

By the time the third task is presented, respondents may have grown accustomed to the repeated nature of the exercise.As a result, their responses may no longer be distorted by ordering effects. Comparing third choice task responses withthe appropriate base case using a two-tailed test (final column of Table 6), we find that samples A2 and C2 provide evidencethat contradicts that conjecture with greater than 90% and 99% confidence, respectively. Evidently third choice tasks arenot immune from response anomalies.

5.5.2. Hypothesis 2: residual ordering effect from initial sequence

Alternatively, one might conjecture that it is the initial sequence and that sequence alone that causes ordering effects.Respondents’ preferences systematically shift in response to that first sequence but those shifted preferences then persistin their consideration of the subsequent choice tasks. In this case, we would expect the observed response bias from theinitial sequence (column 2 of Table 6) to have the same sign as the observed response bias in the third task (column 5 ofTable 6). Since these present directional hypotheses a one-tailed test is appropriate. While samples A1 and C2 provideresponses that conform to this hypothesis, the responses of the remaining samples provide contradicting evidence. Thepreferences of samples A2 and C1 are apparently unaffected by the initial sequences yet we observe statistically significantbiases in favour of option I in responses to the third task. Likewise, samples B1 and B2 express preferences that aresignificantly affected by the initial sequence but responses to the third task appear unbiased. The hypothesis of responsesbeing influenced solely by the initial sequence is firmly rejected by our data.

(footnote continued)

seen in the second task. Sample B2 face a similar secondary sequence to C1’s initial sequence except that the worsening price sequence is less extreme. In

this case, we surmise that that the two ordering effects will no longer cancel and that deduction supports the appearance of a minus symbol in column 3

of Table 6 for sample B2.

Page 13: Ordering anomalies in choice experiments

ARTICLE IN PRESS

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285 283

5.5.3. Hypothesis 3: second-order effect

Another possibility is that respondents anchor on the attribute levels in the first choice task. Subsequent tasks arecompared with this initial task and it is this comparison which precipitates response anomalies. According to thathypothesis, it is the second-order sequence that should determine the nature of ordering effect in third choice taskresponses (column 4 of Table 6). The data conform much more closely to this hypothesis. We find that for samples A1, A2, C1

and C2 the bias in the third choice task is in the direction of that predicted from the second-order sequences and in all casesa one-tailed test confirms these biases to be significant with 90% confidence or greater. Sample B1 also conforms to theprediction though, in this case, that prediction is for no response anomaly in the third task, a prediction borne out by atwo-tailed test of significance. In the case of sample B2, however, the second-order sequence would imply a bias in favourof option I in third task responses. A one-tailed test rejects that hypothesis. Accordingly, the proposition that the directionof response anomalies in third choice tasks can be explained through second-order sequence effects alone fails tocompletely explain the response patterns observed in our data.

5.5.4. Hypothesis 4: compounding first-order effects

A further possibility is that response anomalies are precipitated primarily by first-order ordering effects. Over a series ofchoice tasks, responses may be influenced by each first-order sequence encountered. The initial sequence maysystematically shift respondents’ preferences and then from that baseline the secondary sequence may precipitate afurther shift. According to this theory, the bias we would expect to observe in responses to third tasks should be in thedirection suggested by a compounding of the first-order sequence effects.

With reference to Table 6, we find that for all samples the predictions made by this hypothesis entirely conform to theobserved direction of bias in third task responses. In sample A1 the initial and secondary first-order sequences both biasresponses in favour of option I and a one-tailed test confirms that responses to the third task are significantly biased in thatdirection (p-value: 0.0587). In samples A2, C1 and C2 where one of the first-order sequences biases responses in favour of option Ibut the other has no predicted effect, we observe significant biases in the expected direction (p-values: 0.0306, 0.0587 and0.0003, respectively). Finally, in samples B1 and B2, the initial and secondary first-order sequences imply biases that movepreferences first in one direction and then in the other. As a result, the observation of no significant biases in third task responses(p-values: 0.8547 and 0.3947, respectively) entirely conforms with the predictions of compounding first-order effects.

In addition to the direction of bias, compounding first-order effects have implications for the magnitude of observedordering anomalies. For example, the initial sequence for sample A2 has no systematic impact on preferences. As such, wemight expect the secondary sequence to induce the same magnitude of response anomaly as would be observed if it werethe initial sequence. In our experiment, that comparison is provided by the initial sequence of sample C2. We observe 36.1%choosing option I in task A23 and 41% choosing that option in task C22. Since those two proportions do not differsignificantly (p-value: 0.5326) the data support the prediction of a similarly-sized response bias.

The one contradictory outcome is provided by a comparison of samples A1 and C1. Here we would expect the magnitudeof the response bias in the third choice task to be larger in A13 than in C13. The reason for that expectation is that bothsamples face an identical secondary sequence. The initial sequence of sample A1, however, acts to positively reinforce thebias of the secondary sequence whereas the initial sequence of sample C1 is observed not to bias responses. In actuality, theresponse biases in the third tasks are of identical size.

Nonetheless, the patterns of response predicted by compounding first-order ordering effects successfully organise ourdata, perfectly anticipating the direction of biases observed in responses to the third choice task. The ‘compounding’explanation is not, however, a perfect explanation for all of our data; in one case the magnitude of response biases does notconform to expectations. In addition, our data does not allow us to identify whether the compounding of first-orderordering effects is further compounded by the second-order sequence. A new experimental design would be needed to castgreater light on those issues.

6. Discussion and concluding remarks

Despite the recent and rapid uptake of CE methods by non-market valuation practitioners, there are reasons to suspectthat the method may suffer from ordering anomalies. Previous research in this area has, in the main, focussed on thepossibility that responses to CEs are affected by position in the sequence of tasks. In contrast, our research explores thequestion of whether the particular sequence of choice tasks matters.

Our findings categorically reject the assumption that CEs are immune from ordering anomalies. Our results indicate thatthe frequency with which respondents choose options in worsening price sequences is significantly reduced though theopposite anomaly is not observed for options in improving price sequences. In addition, we find highly significantanomalies resulting from commodity sequences. In this case, the effects work in both directions with significant reductions(increases) in the frequency with which options in worsening (improving) commodity sequences are chosen.

That pattern of choice behaviour completely organises our data. It can be used to explain the direction of bias observedto choice tasks in multiple-sequences; that is, when both options exhibit price/commodity sequences. Likewise, we findthat the response patterns that are predicted by compounding that behaviour over two sequences of tasks are consistentwith the patterns of response anomaly that we actually observe in third choice tasks.

Page 14: Ordering anomalies in choice experiments

ARTICLE IN PRESS

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285284

Our experiment was designed primarily to document and characterise the existence of ordering anomalies in CEs. andnot to provide insights into the behavioural or psychological mechanism by which those anomalies may be explained. Ourfindings, however, allow us to appraise the various explanations of ordering effects outlined in Section 2.

Our data provide little support for the strategic respondent model as envisaged by Carson and Groves [12]. That modelproposes that individuals may respond negatively to options in simple price (commodity) sequences in an attempt tosignal a requirement for a low price (high level) of provision of the commodity. While we observe that negative reaction tooptions in worsening price and commodity sequences, our data reveals no significant ordering anomaly for options inimproving price sequences and a strong positive ordering anomaly for options in improving commodity sequences.Strategic behaviour, as imagined by Carson and Groves [12], falls at the first hurdle; it is unable to accommodate thepatterns of response observed in our tests of simple first-order ordering effects.

The framing model suggested by DeShazo [14] fits well with our observations of price-sequence ordering anomalies.The model also provides a possible explanation for the negative ordering anomaly we observe in worsening commoditysequences. Our observation of a significant positive ordering anomaly for options in improving commodity sequences,however, runs counter to the predictions of the model. Moreover, there are some considerable complexities in transferringDeShazo’s framing model into the context of CEs. In particular, when CEs (like our own) do not include a status quo option,there is no intuitive parallel to answering ‘No’ in response to a DBDC question. Whichever option is chosen implies somechange from the original endowment.

The anchoring model successfully explains all but one of the patterns of choice seen in our tests of simple first-orderordering anomalies; it fails to provide an explanation for the lack of an ordering anomaly in improving price sequences.One possibility is that the updating process is asymmetric with price signals above the prior reservation price provokingsignificantly larger revisions in preferences than price signals that are below the prior reservation price. Of course, if thatwere true, a further explanation is required to justify why respondents do not react with the same asymmetry when facedby commodity sequences.

Our data also appear to support a model that allows first-order effects to compound over initial and secondarysequences. That pattern would seem to tally quite closely with the updating process envisaged by the anchoring model.Individuals’ values (or, more generally, value functions), updated in a particular direction as a result of the price andcommodity information in the first task, might subsequently be revised further in light of the information in the secondtask, giving the appearance of the compounding ordering effects we observe.

Accordingly, none of the models discussed in Section 2 provide predictions that offer an unambiguously superior fit tothe patterns of response observed in our data, though perhaps the weight of evidence favours an explanation for orderinganomalies driven by a psychological phenomenon not dissimilar to that envisaged in the anchoring model.

While the exact mechanism through which ordering anomalies are manifested is not yet evident, the central messagesof our paper remain clear; CEs are vulnerable to ordering anomalies and the particular pattern of anomaly can be explainedby the particular sequence of choice tasks.

The existence of ordering anomalies poses a serious dilemma for CE analysts. If responses to choice tasks other than the firstare blighted by ordering effects, does it follow that we must simply discard responses to those additional questions? If thatwere so, then CEs would effectively provide no more information than SBDC CV. One possibility would be to model responsesparametrically and control for ordering effects through a suitably specified econometric model (see [18,21], for examples of thatapproach in the CV literature). Of course, the efficacy of an econometric solution relies heavily on the degree to which theeconometric model reflects the true data-generating process. The key constraint on pursuing an econometric modellingsolution, therefore, remains the lack of a robust theoretical understanding of the drivers of ordering effects in CEs. What is clear,however, is that our findings cast serious doubt on what is currently the standard practice; an individual’s responses to a seriesof choice tasks in a CE cannot be assumed to provide independent observations of some stable set of preferences.

Acknowledgments

The authors gratefully acknowledge the financial support of the Ministerio de Investigacion, Ciencia y TecnologıaProject SEJ2007-67734 and Junta de Andalucıa project SEJ2936 which made this research possible. The preparatory workfor this research was undertaken as part of the U.K. Economic and Social Research Council’s Programme for EnvironmentalDecision Making, Award M535255117, and the subsequent work was conducted as part of the Catchment Hydrology,Resources, Economics and Management (ChREAM) project (which is supported by the ESRC and other research councilsunder the RELU programme; Reference RES-227-25-0024).The paper was greatly improved by two anonymous referees’comments. All remaining errors, however, are the authors’ responsibility. Brett Day is grateful to the Institute of Transportand Logistic Studies at University of Sydney for their support during the completion of this paper.

References

[1] D.M. Aadland, A.J. Caplan, O.R. Phillips, A Bayesian examination of information and uncertainty in contingent valuation, Journal of Risk andUncertainty 35 (2007) 149–178.

Page 15: Ordering anomalies in choice experiments

ARTICLE IN PRESS

B. Day, J.-L. Pinto Prades / Journal of Environmental Economics and Management 59 (2010) 271–285 285

[2] W.L. Adamowicz, What’s it worth? An examination of historical trends and future directions in environmental valuation, Australian Journal ofAgricultural and Resource Economics 48 (2004) 419–443.

[3] A. Alberini, B. Kanninen, R.T. Carson, Modeling response incentive effects in dichotomous choice contingent valuation data, Land Economics 73(1997) 309–324.

[4] K. Arrow, R. Solow, P.R. Portney, E.E. Learner, R. Radner, H. Schuman, Report of the NOAA panel on contingent valuation, Federal Register 58 (1993)4601–4614.

[5] I.J. Bateman, R. Brouwer, Consistency and construction in stated WTP for health risk reductions: a novel scope-sensitivity test, Resource and EnergyEconomics 28 (2006) 199–214.

[6] I.J. Bateman, M. Cole, P. Cooper, S. Georgiou, D. Hadley, G.L. Poe, On visible choice sets and scope sensitivity, Journal of Environmental Economics andManagement 47 (2004) 71–93.

[7] I.J. Bateman, I.H. Langford, A.P. Jones, G.N. Kerr, Bound and path effects in double and triple bounded dichotomous choice contingent valuation,Resource and Energy Economics 23 (2001) 191–213.

[8] R.C. Bishop, T.A. Heberlein, Measuring values of extramarket goods: are indirect measures biased?, American Journal of Agricultural Economics 61(1979) 926–930.

[9] K.J. Boyle, R.C. Bishop, M.P. Welsh, Starting point bias in contingent valuation bidding games, Land Economics 61 (1985) 188–194.[10] R. Brooks, EuroQol: the current state of play, Health Policy 37 (1996) 53–72.[11] T.A. Cameron, J. Quiggin, Estimation using contingent valuation data from a ‘‘dichotomous choice with follow-up’’ questionnaire, Journal of

Environmental Economics and Management 27 (1994) 218–234.[12] R.T. Carson, T. Groves, Incentive and informational properties of preference questions, Environmental and Resource Economics 37 (2007) 181–210.[13] J. Clark, L. Friesen, The causes of order effects in contingent valuation surveys: an experimental investigation, Journal of Environmental Economics

and Management 56 (2008) 195–206.[14] J.R. DeShazo, Designing transactions without framing effects in iterative question formats, Journal of Environmental Economics and Management 43

(2002) 360–385.[15] E. Flachaire, G. Hollard, Starting point bias and respondent uncertainty in dichotomous choice contingent valuation surveys, Resource and Energy

Economics 29 (2007) 183–194.[16] N.E. Flores, A. Strong, Cost credibility and the stated preference analysis of public goods, Resource and Energy Economics 29 (2007) 195–205.[17] M. Hanemann, J. Loomis, B. Kanninen, Statistical efficiency of double-bounded dichotomous choice contingent valuation, American Journal of

Agricultural Economics 73 (1991) 1255–1263.[18] J.A. Herriges, J.F. Shogren, Starting point bias in dichotomous choice valuation with follow-up questioning, Journal of Environmental Economics and

Management 30 (1996) 112–131.[19] T.P. Holmes, K.J. Boyle, dynamic learning and context-dependence in sequential, attribute-based, stated-preference valuation questions, Land

Economics 81 (2005) 114–126.[20] D. McFadden, Contingent valuation and social choice, American, Journal of Agricultural Economics 76 (1994) 689–708.[21] D.M. McLeod, O. Bergland, Willingness-to-pay estimates using the double-bounded dichotomous-choice contingent valuation format: a test for

validity and precision in a Bayesian framework, Land Economics 75 (1999) 115–125.[22] N.A. Powe, I.J. Bateman, Ordering effects in nested top-down’ and bottom-up’ contingent valuation designs, Ecological Economics 45 (2003)

255–270.[23] R.C. Ready, J.C. Whitehead, G.C. Blomquist, Contingent valuation when respondents are ambivalent, Journal of Environmental Economics and

Management 29 (1995) 181–196.[24] J. Swait, W. Adamowicz, The influence of task complexity on consumer choice: a latent class model of decision strategy switching, Journal of

Consumer Research 28 (2001) 135–148.[25] A. Tversky, D. Kahneman, Loss aversion in riskless choice: a reference-dependent model, Quarterly Journal of Economics 106 (1991) 1039–1061.[26] G.C. van Kooten, E. Krcmar, E.H. Bulte, Preference uncertainty in non-market valuation: a fuzzy approach, American Journal of Agricultural

Economics 83 (2001) 487–500.[27] J.C. Whitehead, Incentive incompatibility and starting-point bias in iterative valuation questions, Land Economics 78 (2002) 285–297.