



Variation of Preference Inconsistency When Applying Ratio and Interval Scale Pairwise Comparisons

SUSANNA SIRONEN a,*, PEKKA LESKINEN a, ANNIKA KANGAS b and TEPPO HUJALA c
a Finnish Environment Institute (SYKE), Joensuu, Finland
b Faculty of Agriculture and Forestry, Department of Forest Sciences, University of Helsinki, Helsinki, Finland
c Finnish Forest Research Institute, Vantaa Unit, Vantaa, Finland

ABSTRACT

Several studies on numerical rating in discrete choice problems address the tendency of inconsistencies in decision makers' measured preferences. This is partly due to true inconsistencies in preferences or the decision makers' uncertainty about what he or she really wants. This uncertainty may be reflected in the elicited preferences in different ways depending on the questions asked and the methods used in deriving the preferences for alternatives. Some part of the inconsistency is due to only having a discrete set of possible judgments. This study examined the variation of preference inconsistency when applying different pairwise preference elicitation techniques in a five-item discrete choice problem. The study data comprised preferences over five career alternatives elicited applying interval scale and numerically and verbally anchored ratio scale pairwise comparisons. A statistical regression technique was used to analyse the differences in inconsistencies between the tested methods. The resulting relative residual variances showed that the interval scale comparison technique provided the greatest variation of inconsistencies between respondents, thus being the most sensitive to inconsistency in preferences. The numeric ratio scale comparison gave the most uniform preferences between the respondents. The verbal ratio scale comparison performed between the latter two when relative residual variances were considered. However, the verbal ratio scale comparison had a weaker ability to differentiate the alternatives. The results indicated that the decision recommendation may not be sensitive to the selection between these preference elicitation methods in this kind of five-item discrete choice problem. The numeric ratio scale comparison technique seemed to be the most suitable method to reveal the decision makers' true preferences. However, to confirm this result, further study is needed, with attention paid to users' comprehension and learning in the course of the experiment.
Copyright © 2013 John Wiley & Sons, Ltd.

KEY WORDS: decision support; MCDA; inconsistency; measurement scales; preference elicitation

1. INTRODUCTION

Multi-objective decision making often requires the comparison of qualitatively different entities (e.g. Alho et al., 2001). Decision analysis is a theory designed to help the decision maker to make a choice from a set of alternative choices. The purpose of decision analysis is to support decision making in problems that are too complex to be solved by common sense alone (e.g. Schmoldt et al., 2001). Multi-criteria decision analysis (MCDA) is a discipline aimed at supporting decision makers faced with multiple goals. In many MCDA cases, decision criteria are conflicting because, typically, all goals cannot be reached concurrently (e.g. a large and affordable flat with good quality and preferred location). The central problem is how to evaluate a set of alternatives in terms of a number of criteria. Multi-objective decisions under certainty require the decision maker to define the trade-offs between decision criteria or attributes (e.g. Kangas et al., 2008). There is no correct answer as to what values the decision maker should use, because the trade-offs are subjective (e.g. Keeney and Raiffa, 1976; Kangas et al., 2008). Multi-objective decision making under uncertainty is the most challenging situation, because there may be uncertainty in all of the parameters of the decision analysis (Kangas et al., 2008).

Many MCDA methods with different definitions and assumptions have been developed, including methods based on multi-attribute utility theory (Keeney and Raiffa, 1976), outranking (e.g. Roy, 1968; Brans et al., 1986; Rogers and Bruen, 1998) and the analytic hierarchy process (AHP) (Saaty, 1980). The MCDA methods differ in many respects, for example, with respect to the preference elicitation techniques employed

*Correspondence to: Susanna Sironen, Finnish Environment Institute (SYKE), P.O. Box 111, FI-80101, Joensuu, Finland. E-mail: [email protected]

Copyright © 2013 John Wiley & Sons, Ltd. Received 18 October 2012

Accepted 25 April 2013

JOURNAL OF MULTI-CRITERIA DECISION ANALYSIS, J. Multi-Crit. Decis. Anal. 21: 183–195 (2014). Published online 4 November 2013 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/mcda.1500


(Aloysius et al., 2006). Saaty's (1977, 1980) AHP is one of the most widely used techniques developed for multi-criteria decision analysis (e.g. Schmoldt et al., 2001; Temesi, 2010). In the AHP and its generalization, the analytic network process, hierarchies or feedback networks are constructed to describe the decision structure (e.g. Saaty, 2001a, 2001b; Kangas and Kangas, 2002; Wolfslehner et al., 2005). Then pairwise comparisons of the decision alternatives are made at each level of the decision hierarchy to evaluate their relative importance with regard to each element above in the hierarchy (Saaty, 1977). Rating a set of alternatives can be a demanding task even when one unique criterion is considered; thus, pairwise comparison methods simplify the problem by concentrating attention on the pairs of alternatives to be compared under a given criterion (Limayem and Yannou, 2007).

It is difficult to compare the different rating techniques, as the true ratings of a given decision maker are unobtainable with any method. Thus, no benchmark exists against which the quality of the ratings could be calculated. The only available measures of quality of the ratings are thus normative, that is, comparisons to standards that are selected on theoretical grounds. There have been many studies of the elicitation techniques applied under MCDA concerning their effects on consistency, reliability and various other normative properties of the decision making; however, there is no consistent evidence that any specific technique is superior to the others (Aloysius et al., 2006). One normative standard is the consistency of the given ratings. While the decision makers themselves need to be consistent in their ratings, a good rating system should not increase the inconsistency of the ratings. The purpose of this study was to examine the variation of preference inconsistency when applying different pairwise preference elicitation techniques. The aim was to explore whether different elicitation techniques differ in terms of sensitivity to inconsistency and what the implications are for the use of such techniques in multi-criteria decision analysis tasks. The tested methods included verbal ratio scale, numeric ratio scale and interval scale pairwise comparison techniques. The methods were compared through regression analysis of pairwise comparison data.

2. ANALYSING PAIRWISE COMPARISONS

2.1. Alternative scoring techniques

In a standard pairwise comparison experiment, every alternative is paired with every other alternative in turn. The comparison evaluates the preference intensity of a given pair of alternatives, with potential inconsistency between pairs. The AHP utilizes ratio scale assessment of the decision alternatives (e.g. Saaty, 2001b). The original scoring technique proposed by Saaty (1977) applies a discrete and verbal scoring technique with numerical counterparts 1/9, 1/8, …, 1/2, 1/1, 2/1, …, 8/1, 9/1. Other techniques have been developed, for example, by Ma and Zheng (1991), Lootsma (1993) and Salo and Hämäläinen (1997). The ratio scale scoring technique applied in the AHP has been debated and criticized, for example, by Barzilai (2005), who claims that scale ratios are defined only if there exists an absolute zero for the property under measurement, such as in temperature. In addition to ratio scale assessments, other scoring methods include nominal, ordinal, difference and interval scale assessments (Saaty, 1980).

A nominal scale essentially consists of assigning labels to objects (e.g. Saaty, 1980). In an ordinal scale, only differences in order can be distinguished, not differences in preference intensity. An interval scale utilizes two fixed points of a preference scale. This means that we can interpret differences in the distance along the scale. When applying interval scale methods, the value scale is defined by the set of alternatives under consideration. Usually, the most and least preferred alternatives with respect to some criterion create the value scale, and the intermediate alternatives are then evaluated with respect to the specified local scale (Kainulainen et al., 2009). A ratio scale consists not only of equidistant points but also of a meaningful zero point. Ratio scale assessments can be formed both verbally and numerically. Verbal comparison of objects is more common in our everyday lives than numerical comparison. In addition, the use of verbal assessments is intuitively appealing and user-friendly. The intensity of importance may vary from equal importance to absolute importance of one object over another in verbal ratio scale assessments (e.g. Saaty, 1980). The verbal ratio scale comparisons must be converted into numerical ones to derive priorities. For example, applying the AHP, the verbal statements are converted into their numeric counterparts from one to nine (e.g. Saaty, 1980). However, theoretically, there is no justification to be restricted only to this kind of verbal gradation and numbers.
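The conversion of verbal judgments to numeric counterparts described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the verbal labels and the helper names (`SAATY_SCALE`, `to_numeric`) are hypothetical, and only the odd anchor values of Saaty's one-to-nine scale are shown.

```python
# Illustrative mapping of Saaty-style verbal judgments to numeric
# ratio values (anchor labels are assumptions, not the paper's wording).
SAATY_SCALE = {
    "equal importance": 1,
    "moderate importance": 3,
    "strong importance": 5,
    "very strong importance": 7,
    "absolute importance": 9,
}

def to_numeric(label, reversed_pair=False):
    """Return the numeric ratio for a verbal judgment.

    If the less preferred item was named first in the pair,
    the reciprocal value is used.
    """
    value = SAATY_SCALE[label]
    return 1 / value if reversed_pair else value
```

The reciprocal handling mirrors the 1/9, …, 1/1, …, 9/1 structure of the original scale.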

The classical pairwise comparison process applying the ratio scale assumes that the decision maker can compare any decision elements pairwisely and provide a numerical value of the ratio of their importance (Mikhailov, 2004). The choice of the specific elicitation technique usually depends on the decision problem, the number of alternatives and the time and money available.


2.2. Inconsistency and preference uncertainty

Subjective preference assessments are often inconsistent, because decision makers may have difficulties in evaluating some of the alternatives or they are not necessarily consistent with their own pairwise evaluations. It may simply be difficult for the decision maker to express exact numerical estimates of the ratios of importance (Ramik, 2009), and a suitable estimate may not be among the discrete set of choices available.

Temesi (2010) classifies decision makers as informed or uninformed. The former know the preference intensities explicitly; the latter do not know them sufficiently, or the values may not even exist (Temesi, 2010). If decision makers know their preferences, the valuation task involves only revealing these well-defined and pre-existing preferences. Yet, the design of the valuation task must be defined so that the respondent is motivated to research his or her preferences and respond truthfully (Payne et al., 1999). If people do not have existing well-defined values for many objects, the valuation task includes constructing the preferences. According to Payne et al. (1999), expressed preferences generally reflect both a decision maker's basic values for highlighted attributes and the particular heuristics or processing strategies used to combine information selectively to construct the required response to a particular situation. Thus, expressed preferences include both random error and two different sources of systematic variance.

Inconsistency is one element of preference uncertainty. Measured preferences may also be inconsistent because of behavioural or psychological reasons, for example, when the respondent becomes tired or loses concentration with a long list of pairwise comparisons or items to rate. It is hard to distinguish whether and to what extent the observed inconsistencies are based on true inconsistency, an uninformed respondent, behavioural reasons, or framing or anchoring effects of the preference enquiry method.

In real-life decision problems, pairwise comparison matrices are rarely consistent; therefore, one crucial but challenging point of the methodology is to determine the inconsistency of the matrices (Herman and Koczkodaj, 1996; Bozóki and Rapcsák, 2008). A decision maker's preferences may be inconsistent, for example, so that for three given alternatives, alternative A is two times better than B and B is three times better than C, but A is not six times better than C, but maybe four or eight times better. The preferences may even be intransitive, meaning that alternative A is preferred to B, B is preferred to C, but then alternative C is preferred to A (e.g. Linares, 2009). Intransitivity is more likely to occur with multi-attribute comparisons between alternatives and in MCDA, especially if the comparisons are conducted with pairwise comparisons (Linares, 2009). Especially with many pairs of alternatives, the decision maker may not see the consequences of the many pairwise comparisons (e.g. Temesi, 2010). Inconsistency has a possible effect on decision making, since it may then not be possible to find the optimal alternative. If the inconsistency is high enough, decision recommendations on the rank order are impossible to render.

Many kinds of methods have been proposed to account for the inconsistency and the decision maker's incomplete knowledge of his/her preferences. One option to determine the internal consistency of the applied method is to perform a classical test-retest experiment and examine the consistency of the elicited weights at two points in time (Bottomley and Doyle, 2001). In the AHP (Saaty, 1980), the inconsistencies of pairwise comparisons are measured by the consistency ratio (CR) of a pairwise comparison matrix, which is the ratio of its consistency index to the corresponding random index value. However, the CR is criticized because its definition of inconsistency is based on a 10% rule of thumb (e.g. Ramik, 2009). According to Bozóki and Rapcsák (2008), Saaty's consistency measure for a pairwise decision matrix is insufficient to exclude asymmetric inconsistency. It has also been criticized for allowing contradictory judgments in matrices (Kwiesielewicz and van Uden, 2004; Bana e Costa and Vansnick, 2008). Therefore, several other methods have been proposed to measure consistency. Crawford and Williams (1985), for example, prefer to sum the differences between the ratios of the calculated priorities and the given comparisons. Furthermore, ratio scale pairwise comparisons data can be analysed by using standard regression models as well (De Jong, 1984; Crawford and Williams, 1985). The regression approach has been further developed in several ways (e.g. Alho et al., 1996; Alho and Kangas, 1997; Leskinen and Kangas, 1998; Alho et al., 2001; Leskinen et al., 2003). The uncertainty of the estimated priorities, measured by the residual variance σ², can be incorporated into the analysis of preferences through statistical inference by applying the regression approach (Alho and Kangas, 1997). Otherwise, the priorities obtained through the regression analysis behave similarly to the priorities obtained through Saaty's eigenvalue technique, and the differences in the priority estimates are usually small (e.g. Alho and Kangas, 1997).
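Saaty's CR computation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the random index (RI) values are Saaty's commonly tabulated averages for matrix sizes 3 to 5, and the principal eigenvalue is approximated by power iteration rather than an exact eigensolver. The example matrix encodes the mildly inconsistent judgment pattern from the text (A is two times better than B, B three times better than C, but A only four times better than C).

```python
# Sketch of Saaty's consistency ratio (CR) for a reciprocal pairwise
# comparison matrix. RI values are Saaty's tabulated random indices.
RI = {3: 0.58, 4: 0.90, 5: 1.12}

def principal_eigenvalue(A, iterations=100):
    """Approximate the largest eigenvalue of a positive matrix
    by power iteration."""
    n = len(A)
    w = [1.0] * n
    for _ in range(iterations):
        w = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]
    Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    return sum(Aw[i] / w[i] for i in range(n)) / n

def consistency_ratio(A):
    n = len(A)
    lam = principal_eigenvalue(A)
    ci = (lam - n) / (n - 1)  # consistency index
    return ci / RI[n]

# Inconsistent example from the text: A is 2x B, B is 3x C,
# but A is judged only 4x (not 6x) better than C.
A = [[1, 2, 4],
     [1/2, 1, 3],
     [1/4, 1/3, 1]]
```

Note that this particular matrix still passes the 10% rule of thumb (CR well below 0.1), which illustrates why the rule has been criticized as a somewhat arbitrary threshold.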


3. MATERIAL AND METHODS

3.1. Study material

The study material contained parts of the preference enquiry experiment carried out at the University of Helsinki (Kainulainen et al., 2009). Altogether, 45 forestry students were asked to evaluate five different career alternatives according to their own preferences. The career alternatives included lecturer, forestry advisor, senior inspector, software designer and researcher. The students were first given a brief three-sentence introduction to each of the career alternatives. The students were asked to give their preferences through direct rating and pairwise comparisons on both interval and ratio scales. Each assignment was distinct. This particular study included three different pairwise comparison techniques: the verbal ratio scale technique, the numeric ratio scale technique and the numeric interval scale technique. These preference elicitation techniques were selected because they represent distinct and practically applicable alternatives that have psychometrically different characteristics and two different but simple calculation procedures that are both well known in the literature and thus enable comparison with other research. In the verbal ratio scale technique, the intensity of preference was given on a scale from 1 to 9 as proposed by Saaty (1980). In the numeric ratio scale technique, the intensity of preference was obtained as a comparison to the least preferred alternative, given 100 points within each pair. In SMART methods, the alternatives are rated relative to the least important one (e.g. Kangas et al., 2008). In the numeric interval scale technique, the preference difference in each pair was compared with the difference of the overall best and worst alternatives set as 100 (see Appendix). Kainulainen et al. (2009) analysed these data sets from the viewpoint of priorities, whereas this study took into account the inconsistencies involved.

3.2. Methods

Regression technique was used to analyse all the pairwise comparisons. In general, we may consider the relative merits of some attributes 1, …, n. In a pairwise comparisons experiment, the judge is asked to evaluate the attributes in a pairwise manner. Usually, all the pairs (i, j) are compared, which leads to a maximum of m = n(n − 1)/2 comparisons (e.g. Leskinen, 2000). First, let $r_{ij}$ be the relative value of attribute i compared with attribute j assessed by the judge with respect to a single decision criterion. It is assumed that $r_{ij} = (v_i / v_j)\exp(\varepsilon_{ij})$, where $v_i$ and $v_j$ refer to the true and unknown values of attributes i and j, and $\varepsilon_{ij}$ measures the uncertainty or error with which the true values are obtained in the elicitation by the judge. Then, by defining $y_{ij} = \ln(r_{ij})$, the regression model for ratio scale pairwise comparisons data in a single-judge case becomes (Crawford and Williams, 1985; Alho and Kangas, 1997)

$$y_{ij} = a_i - a_j + \varepsilon_{ij}, \qquad (1)$$

where $a_i = \ln(v_i)$ and the residuals $\varepsilon_{ij}$ are uncorrelated with $E(\varepsilon_{ij}) = 0$ and $\mathrm{Var}(\varepsilon_{ij}) = \sigma^2$. The $r_{ij}$ is analysed on a logarithmic scale in the regression model, because $r_{ij} > 0$ and, after the logarithmic transformation, the response becomes an arithmetic scale variable allowing the direct utilization of the normal distribution. The estimation method is similar for the pairwise comparisons at the interval scale as well, except for this logarithmic transformation (Kainulainen et al., 2009). The regression model can be written as

$$\mathbf{Y} = \mathbf{X}\mathbf{a} + \boldsymbol{\varepsilon}, \qquad (2)$$

where $\mathbf{Y}$, $\mathbf{a}$ and $\boldsymbol{\varepsilon}$ are the vectors of response, parameters and residuals, respectively, and $\mathbf{X}$ is a design matrix defining the pairwise comparison in question (e.g. Alho and Kangas, 1997). The parameters can be estimated by the ordinary least-squares technique. The ordinary least-squares estimators are $\hat{\mathbf{a}} = (\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathsf{T}}\mathbf{Y}$ for the vector $\mathbf{a}$ and $\hat{\sigma}^2 = (\mathbf{Y} - \mathbf{X}\hat{\mathbf{a}})^{\mathsf{T}}(\mathbf{Y} - \mathbf{X}\hat{\mathbf{a}})/(m - n + 1)$ for the residual variance, where m is the number of pairwise comparisons and n is the number of attributes to be compared.

To ensure identifiability, it is required that $a_n = 0$, where n is the number of attributes to be compared. Thus, the parameter $a_i$ measures the value of the entity i relative to the entity n. In this particular study, where n = 5, the design matrix $\mathbf{X}$ takes the form

$$\mathbf{X} = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 1 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad (3)$$

and $\mathbf{a} = (a_1, a_2, a_3, a_4)^{\mathsf{T}}$, $\mathbf{Y} = (y_{12}, y_{13}, y_{14}, y_{15}, y_{23}, y_{24}, y_{25}, y_{34}, y_{35}, y_{45})^{\mathsf{T}}$ and $\boldsymbol{\varepsilon} = (\varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{15}, \varepsilon_{23}, \varepsilon_{24}, \varepsilon_{25}, \varepsilon_{34}, \varepsilon_{35}, \varepsilon_{45})^{\mathsf{T}}$ (Alho and Kangas, 1997). The estimates of the values can be transformed to the scale of priorities by $a_i = \exp(\hat{a}_i) / \sum_i \exp(\hat{a}_i)$, so that $\sum_i a_i = 1$.

In this case, the career alternatives were arranged in the following order: lecturer, forestry advisor, senior inspector, software designer and researcher, as in the previous study of Kainulainen et al. (2009).

Regression models were constructed separately for each of the student respondents and for each of the elicitation techniques. The preferences expressed with the verbal ratio scale technique were first transformed to a numeric scale according to Saaty (1980), that is, to integer values varying from 1 to 9. To obtain as similar a scale as possible across the elicitation methods for examining the residual variances, the median of the maximum relative values perceived by the respondents with the numeric ratio scale technique was used as the highest value in the verbal ratio scale technique. This value was 2; therefore, the scale constructed in this way took the form 1/2, 1/1.875, …, 1/1.125, 1, 1.125, …, 1.875, 2.
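The rescaled verbal scale described above can be reconstructed as follows. This is a sketch under the stated assumption that Saaty's nine levels are mapped linearly onto 1 to 2 (and their inverses onto the reciprocals); the function name is hypothetical.

```python
# Reconstruct the rescaled verbal scale from the text: Saaty's nine
# levels mapped linearly onto [1, max_value], inverses as reciprocals.
def rescaled_verbal_scale(max_value=2.0, levels=9):
    step = (max_value - 1.0) / (levels - 1)          # 0.125 for max 2
    upper = [1.0 + k * step for k in range(levels)]  # 1, 1.125, ..., 2
    lower = [1.0 / v for v in reversed(upper[1:])]   # 1/2, ..., 1/1.125
    return lower + upper
```

The resulting 17-point scale runs from 1/2 through 1 up to 2, matching the form given in the text.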

If the pairwise comparisons $r_{ij}$ are perfectly consistent, the residual variance $\sigma^2 = 0$; otherwise, $\sigma^2 > 0$. Thus, the inconsistency of the pairwise comparisons may be measured by the residual variance $\sigma^2$ in a statistical way. In addition to the residual variances $\hat{\sigma}^2 = (\mathbf{Y} - \mathbf{X}\hat{\mathbf{a}})^{\mathsf{T}}(\mathbf{Y} - \mathbf{X}\hat{\mathbf{a}})/(m - n + 1)$, relative residual variances were calculated, because they are not as scale dependent as the residual variances as such.

Furthermore, the square of the multiple correlation coefficient, that is, the coefficient of determination ($R^2$), was calculated to describe the overall quality of the resulting regression. In this case, it may be estimated by $R^2 = 1 - \mathrm{SSR}/\mathrm{SST}$, where SSR equals the sum of squared residuals from the regression and SST equals the sum of squared scores, because there is no constant in the regression models (e.g. Alho et al., 2001). Without the constant, the degrees of freedom for SST are n, the total number of the comparisons included. The degrees of freedom for SSR are n − r, where r is the number of columns in the design matrix. Then the adjusted $R^2$ is

$$R^2_{\mathrm{adj}} = 1 - \frac{\mathrm{SSR}/(n - r)}{\mathrm{SST}/n}. \qquad (4)$$
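The no-intercept $R^2$ and its adjustment in Equation (4) can be sketched directly; the function name is hypothetical, and, following the text, SST is the raw sum of squared scores (no mean-centring, since there is no constant in the model).

```python
# No-intercept R^2 and adjusted R^2 per Equation (4):
# SSR = sum of squared residuals, SST = sum of squared scores,
# n = number of comparisons, r = number of design-matrix columns.
def r_squared(y, fitted, r):
    n = len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum(yi ** 2 for yi in y)  # raw sum of squares, no centring
    r2 = 1 - ssr / sst
    r2_adj = 1 - (ssr / (n - r)) / (sst / n)
    return r2, r2_adj
```

A perfect fit gives both values equal to one, and the adjusted value is always penalized relative to the raw $R^2$ when residuals are nonzero.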

Furthermore, the null hypothesis $H_0: a_i = a_j$, that is, equal merits of attributes i and j, was tested against $H_1: a_i \neq a_j$ by

$$t = \frac{\hat{a}_i - \hat{a}_j}{\sqrt{\mathrm{Var}(\hat{a}_i) + \mathrm{Var}(\hat{a}_j) - 2\,\mathrm{Cov}(\hat{a}_i, \hat{a}_j)}}, \qquad (5)$$

which has a $t_{m-n+1}$ distribution under $H_0$ and normality of the residuals. The variances and covariances are of the form $\mathrm{Cov}(\hat{\mathbf{a}}) = \hat{\sigma}^2 (\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}$, that is, the denominator of Equation (5) reduces to $\sqrt{2\hat{\sigma}^2/n}$ for all i and j (e.g. Alho et al., 2001; Leskinen, 2000). The estimated regression coefficients were first ordered, and the best, that is, the largest regression coefficient, was always tested against the others. The resulting p-values were classified into different classes according to the magnitude of the p-value, and the frequencies within these classes were calculated.
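The testing procedure above, in its simplified equal-variance form, can be sketched as follows; the function names are hypothetical, and the list of coefficients is assumed to include the reference alternative fixed at zero.

```python
import math

# Test statistic of Equation (5) in the simplified form derived in the
# text: for a complete comparison design with n alternatives, the
# denominator reduces to sqrt(2 * s2 / n), s2 the residual variance.
def t_statistic(a_i, a_j, s2, n):
    return (a_i - a_j) / math.sqrt(2.0 * s2 / n)

def best_vs_others(coefs, s2):
    """t statistics of the largest coefficient against each other one.

    `coefs` includes the reference alternative (fixed at 0).
    """
    n = len(coefs)
    best = max(coefs)
    idx = coefs.index(best)
    return [t_statistic(best, c, s2, n)
            for i, c in enumerate(coefs) if i != idx]
```

Each statistic would then be referred to a $t_{m-n+1}$ distribution to obtain the p-values that are binned into frequency classes in the study.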

4. RESULTS

Of the original dataset, three answers were incomplete, and they were excluded. Thus, the final study material comprised altogether 42 student evaluations. The first step was to derive the preference weights from the regression models constructed for each of the student respondents. Differences in the inconsistency between the elicitation techniques were analysed through the estimated residual variances and relative residual variances of the regression models, the coefficients of determination, and by testing whether the estimated regression coefficients are equal. The average of the estimated residual variances was smallest with the verbal ratio scale technique (Table I). Also, the range of the residual variances was much smaller than in the other techniques, varying from 0.003 to 0.055. The

Table I. Residual variance, relative residual variance and coefficients of determination (R² and adjusted R²) received through regression analyses for the pairwise comparison data acquired with the tested scaling techniques

                              Verbal ratio   Numeric ratio   Interval
Residual variance
  Min                              0.003           0.001        0.003
  Mean                             0.021           0.432        0.116
  Max                              0.055          16.81         2.156
Relative residual variance
  Min                              0.054           0.001        0.030
  Mean                            56.08           10.48       993.1
  Max                           2303             381.1       41667
R²
  Min                              0.594           0.301        0.703
  Mean                             0.070           0.117        0.075
  Max                              0.992           1.000        0.994
Adjusted R²
  Min                              0.324          −0.165        0.505
  Mean                             0.873           0.884        0.876
  Max                              0.986           1.000        0.990

VARIATION OF PREFERENCE INCONSISTENCY 187

Copyright © 2013 John Wiley & Sons, Ltd. J. Multi-Crit. Decis. Anal. 21: 183–195 (2014)DOI: 10.1002/mcda

Page 6: Variation of Preference Inconsistency When Applying Ratio and Interval Scale Pairwise Comparisons

The largest residual variances were received with the numeric ratio scale technique (Table I). The average of the estimated residual variances was 0.432, and the range varied from 0.001 to 16.8. However, the largest residual variances were mainly due to a couple of the respondents (Figure 1). Similarly, large residual variances were received for the same respondents with the interval scale technique as well. Otherwise, the numeric ratio scale produced the narrowest variation in most of the cases. The variance was under 0.02 for 62% of the respondents, whereas it was under 0.02 in half of the cases with the interval scale technique and 55% with the verbal ratio scale technique.

Because the scale may affect the analysis of residual variances, relative residual variances were also calculated, as they are not as scale sensitive. The relative residual variances were lowest for the numeric ratio scale technique, whereas they were largest for the interval scale technique (Table I and Figure 2). The relative residual variance was under 1 for 76% of the respondents applying numeric ratio scale assessments, whereas it was under 1 for 69% of the respondents with

Figure 1. Residual variances received through regression analyses for the different techniques plotted against respondents 1–42. The residual variances for respondent 35 elicited with the numeric ratio scale and for respondent 37 elicited with the interval scale were omitted from the figure because the values were so large (16.8 and 2.15, respectively) that including them would make the other values less distinguishable.

Figure 2. Relative residual variances received through regression analyses for the different techniques plotted against respondents 1–42. The relative residual variances for respondent 8 elicited with the verbal ratio scale, respondent 34 elicited with the numeric ratio scale and respondent 37 elicited with the interval scale were omitted from the figure because the values were so large (2302, 381 and 41 667, respectively) that including them would make the other values less distinguishable.

S. SIRONEN ET AL.188

Copyright © 2013 John Wiley & Sons, Ltd. J. Multi-Crit. Decis. Anal. 21: 183–195 (2014)DOI: 10.1002/mcda

Page 7: Variation of Preference Inconsistency When Applying Ratio and Interval Scale Pairwise Comparisons

the interval scale technique and 62% with the verbalratio scale technique. The results also showed that mostof the respondents that gave largest relative residualvariances were the same for the numeric ratio scaletechnique and interval scale technique, whereas theywere mainly different respondents who gave themost inconsistent evaluations applying the verbal ratioscale assessments.

The differences between the coefficients of determination (R2 and adjusted R2) received from each of the regression models were calculated and plotted against the respondents (Figure 3). The differences in the coefficients of determination between the interval scale technique and the numeric ratio scale technique were quite uniformly distributed around zero. In 45% of the cases, the R2 received with the interval scale technique was better, and in 55% of the cases, the numeric ratio technique gave a better coefficient of determination. Both of these techniques outperformed the verbal ratio scale technique when the coefficients of determination were analysed: the R2 was better in 69% of the cases with the interval scale technique and in 76% of the cases with the numeric ratio scale technique than with the verbal ratio scale technique.

The hypothesis H0: ai = aj against H1: ai ≠ aj was tested to find out how well the different techniques could separate the rank order of the different career alternatives. The best alternative, that is, the alternative having the largest regression coefficient, was tested against the others. The ordering of the alternatives showed that there were no differences in the rank order of the alternatives between the different techniques; all the techniques resulted in the same order for all of the respondents. The resulting p-values were classified into five classes: under 0.001, 0.001–0.01, 0.01–0.05, 0.05–0.1 and over 0.1. The frequencies within each of these classes were calculated (Figure 4). The p-values of the verbal ratio scale technique were somewhat larger, especially for the best alternative versus the second best alternative: in over half of the cases, the p-value was over 0.1. Thus, this method could not distinguish these alternatives as well as the other methods did. The numeric ratio scale had the smallest p-values for the best versus the second best alternative. In addition, its frequencies in the smallest p-value class, that is, p-values under 0.001, were somewhat larger than those of the other methods. According to the p-values, the differences in preferences could be found best with the numeric ratio scale technique, next best with the interval scale technique and worst with the verbal ratio scale technique.

Figure 3. Differences in the coefficients of determination (R2 and adjusted R2) between the tested methods plotted against respondents 1–42.
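The coefficient-equality test described above can be illustrated with a standard two-sided t-test on the difference between two regression coefficients. The numbers below are invented for illustration and are not the study's estimates.

```python
import numpy as np
from scipy import stats

# Illustrative two-sided t-test of H0: a_i = a_j, given the estimated
# difference of two regression coefficients, the variance of that
# difference, and the residual degrees of freedom. All inputs are made up.

def equality_test(diff, var_diff, dof):
    """Return the t-statistic and two-sided p-value for H0: a_i = a_j."""
    t = diff / np.sqrt(var_diff)
    p = 2.0 * stats.t.sf(abs(t), dof)
    return t, p

# e.g. best vs. second-best alternative with 6 residual degrees of freedom
t, p = equality_test(diff=0.4, var_diff=0.01, dof=6)
# this p falls in the 0.001-0.01 class used in Figure 4
```

A p-value over 0.1 in this test corresponds to a pair of alternatives the technique cannot reliably separate, which is how the classes in Figure 4 are read.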

5. DISCUSSION

5.1. Messages from the numerical results

The variation of inconsistency when applying different pairwise preference elicitation techniques in a five-item discrete choice problem was tested in this study through regression analysis of pairwise comparison data. In addition, the preferred rank order acquired with the different techniques was examined. Many experimental examples show that simple implementations of different techniques for deriving weights in the same task and with the same decision maker often lead to different results (e.g. Moshkovich et al., 2002). In addition to a large number of alternatives to be compared in a pairwise manner, the source of inconsistency may lie in the questioning procedure itself, or the applied scaling may lead to inconsistency, for example in scaling systems with fixed endpoints (Temesi, 2010).
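The regression analysis of pairwise comparison data referred to here can be sketched in a few lines. The sketch below is an assumption-laden illustration in the spirit of Alho et al. (2001), not the study's own code: each stated ratio r_ij is modelled as log(r_ij) = a_i - a_j + error, one parameter is fixed at zero for identifiability, and the residual variance of the fit serves as the inconsistency measure. The item values and judgments are invented.

```python
import numpy as np

# Illustrative sketch (not the study's code): log-linear regression for
# ratio-scale pairwise comparisons, in the spirit of Alho et al. (2001).
# Each stated ratio r_ij is modelled as log(r_ij) = a_i - a_j + error,
# with the last alternative's parameter fixed at 0 for identifiability.

def fit_pairwise(n_items, pairs, ratios):
    """Return OLS estimates of a_1..a_{n-1} and the residual variance."""
    X = np.zeros((len(pairs), n_items - 1))
    y = np.log(np.asarray(ratios, dtype=float))
    for row, (i, j) in enumerate(pairs):
        if i < n_items - 1:
            X[row, i] += 1.0   # alternative i judged against alternative j
        if j < n_items - 1:
            X[row, j] -= 1.0
    a_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ a_hat
    dof = len(pairs) - (n_items - 1)      # residual degrees of freedom
    s2 = float(resid @ resid) / dof       # residual variance = inconsistency
    return a_hat, s2

# All 10 pairs of 5 alternatives; perfectly consistent invented judgments
a_true = np.array([1.0, 0.6, 0.3, 0.1, 0.0])
pairs = [(i, j) for i in range(5) for j in range(i + 1, 5)]
ratios = [np.exp(a_true[i] - a_true[j]) for i, j in pairs]
a_hat, s2 = fit_pairwise(5, pairs, ratios)   # recovers a_true[:4], s2 near 0
```

With perfectly consistent judgments the residual variance is essentially zero; inconsistent judgments inflate it, which is the quantity compared between respondents and techniques in this study.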

Results of this study showed that the numeric ratio scale had the smallest residual variances for most of the respondents, whether relative residual variances or residual variances as such were analysed. However, large variances were obtained for some of the respondents with this technique as well. Mostly, the same respondents gave more inconsistent preferences with the interval scale technique too.

Figure 4. Frequencies of the p-values resulting from testing the hypothesis H0: ai = aj against H1: ai ≠ aj, within the different p-value classes.

The verbal ratio scale produced the most homogeneous results, with less disparity between the different respondents, when the residual variances were analysed as such. On the other hand, the relative residual variances showed that the numeric ratio scale assessments had the least disparity between the respondents. On average and in most cases, the best coefficients of determination were achieved with the numeric ratio scale technique. However, a couple of large R2 values were obtained with the numeric ratio scale method as well.

Because the results of the numeric ratio scale technique were slightly better in general, but the technique also yielded some quite inconsistent assessments, some kind of adjustment of these techniques is recommended. In the numeric ratio scale task, the alternatives were rated relative to the least preferred one. The results would possibly have been different if the comparisons had been made with respect to the most interesting alternative instead of the least interesting one. According to the results of Bottomley and Doyle (2001), comparisons with respect to the least preferred alternative were harder to do and elicited weights of an inferior quality, at least in their test–retest analyses. This should be taken into account when selecting a method for preference elicitation (Bottomley and Doyle, 2001). Payne et al. (1999) studied preference measurement and construction issues and identified some principles and techniques for preference construction and measurement that should mitigate some of the most common and important construction faults.

The verbal ratio scale technique produced the most uniform results between the different respondents when residual variances as such were analysed. According to the hypothesis testing, which takes both the regression coefficients and their uncertainty into account, the verbal ratio scale technique had the weakest ability to distinguish between options, especially when the best and second best alternatives were considered. The verbal ratio scale technique is the most constrained technique from the outset, and in some applications this would probably be too simplified, so that no differences between the alternatives would be detected. The numeric ratio scale technique was able to separate the preferred order most effectively, although the differences were not large. Otherwise, there were no mismatches or differences in the elicited preferences between the techniques. All the respondents gave the same preferred order for the career alternatives with every technique. Thus, none of the methods was too misleading or gave too inconsistent results.

However, when considering the relative residual variances and omitting the most inconsistent assessments (relative residual variance over 1), all the methods behaved quite similarly and gave quite uniform relative residual variances between the respondents. Thus, the results of the five-item discrete choice problem used in this study indicate that a relative residual variance of 1 could be used as a threshold between consistent and inconsistent assessments. If the relative residual variance exceeds 1, the respondents could be given feedback on the consistency of their evaluations and asked to reconsider their preferences.
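The feedback rule suggested above amounts to a one-line filter. In the sketch below, the variances for respondents 8 and 34 echo the outlying values reported with Figure 2, while the other entries are invented for illustration.

```python
# Illustrative sketch of the suggested feedback rule: flag any respondent
# whose relative residual variance exceeds the threshold of 1 and ask them
# to reconsider their preferences. Respondents 8 and 34 use the outlying
# values reported with Figure 2; the other IDs and values are invented.

relative_residual_variance = {8: 2302.0, 12: 0.35, 34: 381.0, 40: 0.07}

flagged = sorted(r for r, v in relative_residual_variance.items() if v > 1.0)
# respondents in `flagged` would be asked to re-evaluate their comparisons
```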

When interpreting the results of this study, the interval scale technique seemed to be the technique most sensitive to inconsistencies in the responses. However, it needs to be kept in mind that the true preferences may indeed be inconsistent, although it is almost impossible to know this. In that case, the interval scale technique might be the only method that allows the decision maker to properly express their true preferences and the related uncertainty. Nevertheless, the model parameters should be consistent to reliably estimate the true and unknown a. Therefore, the true underlying preferences are assumed to be consistent, while the judge's statements of the true preferences may in general be inconsistent and uncertain.

5.2. Behavioural considerations

Some respondents gave highly inconsistent responses in both the numeric ratio scale and the interval scale ratings. It is possible that these respondents found the verbal statements more convenient to apply than the numeric ones. High inconsistency may also be a consequence of unmotivated respondents. However, neither here nor in general is it possible to detect whether the preferences were true despite the inconsistencies, or whether the decision maker had not fully understood the task or was unmotivated.

The numeric ratio scale technique used in this task allows eliciting very variable preferences, because the scale can be selected freely. Saaty (1980), however, stated that there are several reasons why an upper limit should exist; for example, the qualitative distinctions are meaningful in practice when the items compared are close together with regard to the property used to make the comparison. In addition, Saaty (1980) and Saaty and Vargas (2012) have analysed the magnitude of the upper value of the scale. According to them, people have the capacity to qualitatively divide their response to stimuli into three categories: high, medium and low, and to refine this by further subdividing each of these intensities, resulting in a total of nine subdivisions.


The numeric ratio scale technique seems to require more training and to be more suitable for experienced respondents who are familiar with numeric methods. Moshkovich et al. (2002) suggested that it is preferable to use judgments in a qualitative form, because it is more natural for people and provides more reliable information. They also suggested that eliciting preference information in ordinal instead of cardinal form may lead to a satisfactory, easily explainable and reliable solution. However, it is very important to ensure that the respondent has understood exactly what he or she is being asked to do. The numeric ratio scale technique allows the respondent to determine the scale freely, and no reference point is selected beforehand. Each pair was compared one at a time, whereas in the interval scale task, the respondent first had to choose the pair including the worst and the best alternatives, and the other pairs were then evaluated with respect to that particular pair. According to Kainulainen et al. (2009), the main benefit of the interval scale technique is that the evaluation is conducted with respect to concrete, specified utility points. There is an opportunity to reduce especially the number of the most inconsistent results if the respondent first has to consider the decision alternatives more carefully.

5.3. Validity and value of the results

Our approach was based on a modification of the AHP, which is a well-known technique for analysing the performance of decision alternatives with respect to multiple criteria. However, the empirical data collected to analyse the uncertainty measures were related to a single criterion instead of multiple criteria. Despite this, the results concerning the differences between interval and ratio scale preference assessment techniques can be generalized to multiple criteria, because the multi-criteria problem can be seen as a generalization of a single-criterion assessment problem, where a single-criterion model is applied repeatedly with respect to multiple criteria. Our viewpoint is that the characteristics of different measurement scales can be observed more clearly with respect to a single criterion, because criterion-specific random impacts may be partly balanced out through aggregation over multiple criteria. In other words, we think that this simplified one-criterion problem produces results that inform the framing and analysis of multi-criteria problems.

Different prioritisation (i.e. estimation) techniques may produce very similar estimates for the relative priorities of decision alternatives. This is probably true also with respect to the magnitude of the uncertainty measures originating from the inconsistency of the pairwise comparisons. It has to be remembered, however, that the metrics of the uncertainty measures can differ, and as a consequence the uncertainty measures (such as the p-value and CR) cannot be directly compared. Our central motivation for using statistical estimation techniques in this study was that the uncertainty measures can be used to make probabilistic inference on the priorities of decision alternatives, for example through hypothesis testing.

Although scales as similar as possible were selected beforehand for all of the techniques, the results are not fully comparable, because the dependent variables in the regression analyses differed when the residual variance was analysed as such. In particular, the numeric scale chosen for the verbal ratio scale technique affects the results, because it was chosen somewhat arbitrarily to make the methods comparable. If Saaty's original numeric counterparts had been applied, the residual variance would have varied from 0.109 to 0.77, whereas it was mostly under 0.4 for the other methods and for the numeric counterparts applied in this study. One possibility could have been to test various numeric scales for the verbal statements, for example a geometric or inverse-linear scale, and to analyse their implications relative to the other methods in order to draw more general conclusions. However, the analysis conducted with the relative residual variances was more comparable and not as scale sensitive.
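As a concrete illustration of the alternative codings mentioned above, the nine verbal anchors a–i can be mapped to numbers in several ways. The geometric and inverse-linear codings below are assumptions for illustration (the latter in the spirit of Ma and Zheng's 9/9–9/1 scale), not the coding applied in this study.

```python
# Illustrative numeric codings for the nine verbal anchors a-i; none of
# these is claimed to be the coding actually used in the study.
anchors = "abcdefghi"

# Saaty's integers 1..9
saaty = {ch: k + 1 for k, ch in enumerate(anchors)}
# geometric: equal ratios between adjacent steps, spanning 1..9
geometric = {ch: 9 ** (k / 8) for k, ch in enumerate(anchors)}
# inverse-linear: 9/9, 9/8, ..., 9/1 (cf. Ma and Zheng's 9/9-9/1 scale)
inverse_linear = {ch: 9 / (9 - k) for k, ch in enumerate(anchors)}
```

Refitting the verbal-scale regressions under each coding and comparing the resulting residual variances would be one way to carry out the sensitivity check suggested above.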

From the results of this study, it cannot be determined whether the preferences were truly inconsistent or whether the methods were too difficult for some decision makers; all the different reasons for inconsistency are mixed in the analysis. The career choice problem with a very short description of the alternatives may have been too complicated a task for the students. Their assessments of the career alternatives were possibly uncertain or unformed per se. The experimental design of a preference study should be chosen such that it does not require too long a time for the decision makers to become acquainted with the pros and cons of the different alternatives (e.g. Linares, 2009). However, the number of alternatives to be compared was not large, so the task was not overly complex. If the number of alternatives to be compared in a pairwise manner is high, the decision maker is potentially uncertain in his or her answers, and the pairwise comparison matrix may fail to express the real preferences of the decision maker (Temesi, 2010). The larger the number of alternatives, the larger the number of questions posed to the decision maker (Moshkovich et al., 2002). If more data were available, it would be possible to examine the effects of the size of the pairwise comparison matrix and thereby the sensitivity of the different scaling techniques more carefully.


5.4. Avenues for further research

Overall, the decision-making problem in this study included only five alternatives, and the number of participants was rather low (42) for drawing general conclusions about the variation of inconsistency across different preference elicitation techniques. An empirical enquiry with more alternatives and a notably higher number of participants, although a complex and laborious experiment, would be a reasonable further step to confirm and extend the present results and insights. For example, the use of a relative residual variance of 1 as a threshold for consistent assessment deserves further study, as it has the potential to become a widely used rule of thumb in practical multi-criteria analyses.

Perhaps true inconsistency could be found, or at least some of the reasons for inconsistency could be excluded, if the experiment were repeated several times and the same persons were consistently inconsistent. In further studies, a possible learning effect in the course of the experiment needs to be taken into account by varying the order of the tasks, which was not done in this study. It would also be necessary to ensure that the respondents understand the different preference elicitation tasks, which may be fundamental to the validity of the acquired results. Nevertheless, it seems that the decision making is not very sensitive to the observed inconsistency, as the obtained decision recommendation was the same in each case. It would be useful to examine, with a simulation study or similar, in what kinds of situations the inconsistency would have a significant effect on the recommendations.

APPENDIX A: QUESTIONNAIRE TASKS

Your task is to define how preferable you find the following job alternatives at the moment. In all of the separate tasks, the only criterion for rating is how interesting the job alternatives are when considering working in these jobs.

[Verbal three-sentence descriptions of the five alternatives omitted.]

Pairwise comparisons in ratio scale, verbal anchoring

Proceed downwards one pair at a time. On each line, tick off (x) the more interesting alternative within the pair (if equally interesting, tick off both) and determine how much more interesting it is with respect to the less interesting alternative: write on the following line the letter a–i that corresponds to your view according to the table below.

Example response:
x Lecturer – Forestry advisor x    a
   Senior inspector – Software designer x    e
Etc. (total of 10 pairs for 5 alternatives)

a    Alternatives are equally interesting
b
c    The better alternative is slightly more interesting than the worse one
d
e    The better alternative is strongly more interesting than the worse one
f
g    The better alternative is very strongly more interesting than the worse one
h
i    The better alternative is absolutely more interesting than the worse one

Pairwise comparisons in ratio scale, numerical anchoring

Proceed downwards one pair at a time. On each line, tick off (x) the more interesting alternative within the pair (if equally interesting, tick off both) and write on the following line how much more interesting it is with respect to the less interesting alternative: imagine a rating of 100 for the worse one and determine the ratio for the advantage.

Example response:
   Lecturer – Forestry advisor x    150
x Senior inspector – Software designer    200
Etc. (total of 10 pairs for 5 alternatives)

Pairwise comparisons in interval scale

First, define the pair that includes the most interesting and the least interesting job alternative. Tick off (x) the most interesting alternative within this pair and write on the following line the utility difference of 100. Second, evaluate each of the remaining pairs one at a time, starting from the uppermost pair. Denote by ticking off which alternative is preferable in each pair (if equally interesting, tick off both). Also write on each line the utility difference within the pair, using the difference between the overall worst and best alternatives as a reference in the evaluation.

Example response:
x Lecturer – Forestry advisor    100
   Senior inspector – Software designer x    50
Etc. (total of 10 pairs for 5 alternatives)


ACKNOWLEDGEMENTS

This study was funded by the Academy of Finland project ‘Bridging the gap between qualitative problem structuring and quantitative decision analysis in forestry’ (decision number 127681).

REFERENCES

Alho J, Kangas J, Kolehmainen O. 1996. Uncertainty in expert predictions of the ecological consequences of forest plans. Applied Statistics 45: 1–14.

Alho J, Kangas J. 1997. Analyzing uncertainties in experts’ opinions of forest plan performances. Forest Science 43: 521–528.

Alho J, Kolehmainen O, Leskinen P. 2001. Regression methods for pairwise comparisons data. In The Analytic Hierarchy Process in Natural Resource and Environmental Decision Making, Schmoldt DL, Kangas J, Mendoza GA, Pesonen M (eds.), Managing Forest Ecosystems Vol. 3. Kluwer Academic Publishers: The Netherlands; 235–251.

Aloysius JA, Davis FD, Wilson DD, Taylor AR, Kotteman JE. 2006. User acceptance of multi-criteria decision support systems: the impact of preference elicitation techniques. European Journal of Operational Research 169: 273–295.

Bana e Costa C, Vansnick J. 2008. A critical analysis of the eigenvalue method used to derive priorities in AHP. European Journal of Operational Research 187: 1422–1428.

Barzilai J. 2005. Measurement and preference function modelling. International Transactions in Operational Research 12: 173–183.

Bottomley PA, Doyle JR. 2001. A comparison of three weight elicitation methods: good, better, and best. Omega 29: 553–560.

Bozóki S, Rapcsák T. 2008. On Saaty’s and Koczkodaj’s inconsistencies of pairwise comparison matrices. Journal of Global Optimization 42: 157–175.

Brans JP, Vincke P, Mareschal B. 1986. How to select and how to rank projects: the PROMETHEE method. European Journal of Operational Research 24: 228–238.

Crawford G, Williams C. 1985. A note on the analysis of subjective judgment matrices. Journal of Mathematical Psychology 29: 387–405.

De Jong P. 1984. A statistical approach to Saaty’s scaling method for priorities. Journal of Mathematical Psychology 28: 467–478.

Herman M, Koczkodaj WW. 1996. Monte Carlo study of pairwise comparisons. Information Processing Letters 57: 25–29.

Kainulainen T, Leskinen P, Korhonen P, Haara A, Hujala T. 2009. A statistical approach to assessing interval scale preferences in discrete choice problems. Journal of the Operational Research Society 60: 252–258.

Kangas J, Kangas A. 2002. Multiple criteria decision support methods in forest management. In Multi-Objective Forest Planning, Pukkala T (ed.), Managing Forest Ecosystems Vol. 6. Kluwer Academic Publishers: The Netherlands; 37–70.

Kangas A, Kangas J, Kurttila M. 2008. Decision Support for Forest Management. Managing Forest Ecosystems Vol. 16. Springer: New York.

Keeney RL, Raiffa H. 1976. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. Wiley: New York.

Kwiesielewicz M, van Uden E. 2004. Inconsistent and contradictory judgements in pairwise comparison method in AHP. Computers and Operations Research 31: 713–719.

Leskinen P, Kangas J. 1998. Analysing uncertainties of interval judgment data in multiple-criteria evaluation of forest plans. Silva Fennica 32: 363–372.

Leskinen P. 2000. Measurement scales and scale independence in the analytic hierarchy process. Journal of Multi-Criteria Decision Analysis 9: 163–174.

Leskinen P, Kangas J, Pasanen AM. 2003. Assessing ecological values with dependent explanatory variables in multi-criteria forest ecosystem management. Ecological Modelling 170: 1–12.

Limayem F, Yannou B. 2007. Selective assessment of judgmental inconsistencies in pairwise comparisons for group decision rating. Computers & Operations Research 34: 1824–1841.

Linares P. 2009. Are inconsistent decisions better? An experiment with pairwise comparisons. European Journal of Operational Research 193: 492–498.

Lootsma FA. 1993. Scale sensitivity in the multiplicative AHP and SMART. Journal of Multi-Criteria Decision Analysis 2: 87–110.

Ma D, Zheng X. 1991. 9/9–9/1 scale method of AHP. In Proceedings of the Second International Symposium on the AHP, Vol. I. University of Pittsburgh: Pittsburgh, PA.

Mikhailov L. 2004. Deriving priorities from ratio-scale comparison judgments: a multiple objective approach. MCDM 2004, Whistler, BC, Canada, August 6–11, 2004. [Cited January 2012.] Available from: http://www.bus.sfu.ca/events/mcdm/Proceedings/Paper/AP94%20CF%20Mikhailov%20Priorities.pdf

Moshkovich HM, Mechitov AI, Olson DL. 2002. Ordinal judgments in multi-attribute decision analysis. European Journal of Operational Research 137: 625–641.

Payne JW, Bettman JR, Schkade DA. 1999. Measuring constructed preferences: towards a building code. Journal of Risk and Uncertainty 19: 243–270.

Ramik J. 2009. Consistency of pair-wise comparison matrix with fuzzy elements. In Proceedings of the Joint 2009 International Fuzzy Systems Association World Congress and 2009 European Society of Fuzzy Logic and Technology Conference, Lisbon, Portugal, July 20–24, 2009, Carvalho JP, Dubois D, Kaymak U, Sousa JMC (eds.); 98–101.

Rogers M, Bruen M. 1998. A new system for weighting environmental criteria for use within ELECTRE III. European Journal of Operational Research 107: 552–563.


Roy B. 1968. Classement et choix en présence de points de vue multiples (la méthode ELECTRE). Revue Française d’Informatique et de Recherche Opérationnelle 8: 57–75.

Saaty TL. 1977. A scaling method for priorities in hierarchical structures. Journal of Mathematical Psychology 15: 234–281.

Saaty TL. 1980. The Analytic Hierarchy Process. McGraw-Hill: New York.

Saaty TL. 2001a. Decision Making with Dependence and Feedback: The Analytic Network Process, 2nd edn. RWS Publications: Pittsburgh, PA.

Saaty TL. 2001b. Fundamentals of the analytic hierarchy process. In The Analytic Hierarchy Process in Natural Resource and Environmental Decision Making, Schmoldt DL, Kangas J, Mendoza GA, Pesonen M (eds.), Managing Forest Ecosystems Vol. 3. Kluwer Academic Publishers: The Netherlands; 15–36.

Saaty TL, Vargas LG. 2012. The seven pillars of the analytic hierarchy process. In Models, Methods, Concepts & Applications of the Analytic Hierarchy Process, 2nd edn, Saaty TL, Vargas LG, International Series in Operations Research & Management Science Vol. 175; 23–40.

Salo AA, Hämäläinen RP. 1997. On the measurement of preferences in the analytic hierarchy process. Journal of Multi-Criteria Decision Analysis 6: 309–319.

Schmoldt D, Kangas J, Mendoza GA, Pesonen M (eds.). 2001. The Analytic Hierarchy Process in Natural Resource and Environmental Decision Making. Managing Forest Ecosystems Vol. 3. Kluwer Academic Publishers: The Netherlands; 51–65.

Temesi J. 2010. Pairwise comparison matrices and the error-free property of the decision maker. Central European Journal of Operations Research 19: 239–249.

Wolfslehner B, Vacik H, Lexer M. 2005. Application of the analytic network process in multi-criteria analysis of sustainable forest management. Forest Ecology and Management 207: 157–170.
