
Personality and Individual Differences 51 (2011) 497–501


Faking propensity and faking-related change: A model-based analysis of the EPQ-R scores

Pere J. Ferrando *, Cristina Anguiano-Carrasco
Research Centre for Behavioural Assessment (CRAMC), ‘Rovira i Virgili’ University, Spain

Article info

Article history:
Received 16 December 2010
Received in revised form 27 April 2011
Available online 1 June 2011

Keywords:
Eysenck Personality Questionnaire Revised (EPQ-R)
Change scores
Social desirability
Instructed faking
Two-group two-wave models

0191-8869/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.paid.2011.05.006

* Corresponding author. Address: Universidad ‘Rovira i Virgili’, Facultad de Psicologia, Carretera Valls s/n, 43007 Tarragona, Spain. Tel./fax: +34 977558079.

E-mail address: [email protected] (P.J. Ferrando).

Abstract

Although change scores in a measure administered under neutral and faking-motivating conditions have become a main choice to operationalize faking, there are still some unresolved issues concerning the results they provide. The present study uses a two-wave two-group design with a control group to assess three of these issues: (a) the role of individual differences in the amount of faking-induced change, (b) the relation between Impression Management (IM) scores under neutral conditions and change scores, and (c) the convergent validity of change scores as a requisite to view them as measures of an individual-difference variable. A Spanish translation of the Eysenck Personality Questionnaire Revised was administered twice to 489 undergraduate students under standard–standard instructions (N = 215) and under standard-faking-good instructions (N = 274). For the P, N, and Lie scales, the results showed that the role of individual differences was very relevant and that the only common variable underlying the scores was a general factor of faking-induced change. However, the IM scores were unable to predict effective change.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Faking good can be conceptualized as an individual’s deliberate attempt to manipulate responses to psychological instruments, under motivated conditions, in order to create a positive impression (e.g., Furnham, 1986; Griffith & Peterson, 2008; McFarland & Ryan, 2006; Zickar & Robie, 1999). At present this concept is generally operationalized by using two main approaches (Griffith & Peterson, 2008; McFarland & Ryan, 2006). The first is to use scores on a detection scale. The second is to use change or difference scores in a measure administered under neutral and faking-motivating conditions. Although the second approach has been used for many decades (e.g. Eysenck & Eysenck, 1963), it started to become more popular in the 1990s as interest in the use of personality measures in selection began to increase again (Griffith & Peterson, 2008; McFarland & Ryan, 2006).

The detection-based approach has mainly focused on the Impression Management component (IM; Paulhus, 1991) of social desirability (McFarland & Ryan, 2006; Mersman & Shultz, 1998): a conscious tendency by respondents to tailor their answers so as to create a more positive image. As faking-related measures, some authors (e.g. Eysenck & Eysenck, 1976; Furnham, 1986) have interpreted IM scores in two ways. When administered under faking-motivating conditions they are thought to behave as detection measures. So, in one sense, they reflect the consequences of faking. When administered under neutral conditions, however, they are thought to measure a substantive personality variable (e.g. Furnham, 1986; McFarland & Ryan, 2006). However, this second view is controversial (e.g. Holden & Passey, 2010) and there is no consensus among its advocates on how this hypothetical variable should be understood. Some authors (Ferrando & Anguiano-Carrasco, 2011; McFarland & Ryan, 2006) further conjectured that it might be conceptualized as a propensity to fake good. Even if this were so, however, it is not clear to what extent the initial levels in this variable can predict effective change under faking-motivating conditions.

Unlike the detection-based strategy, change scores are direct and have obvious face validity. However, they also have potential shortcomings that must be addressed if results are to be interpretable. To start with, there are two basic problems. First, changes might be totally or partially due to pre-test effects rather than to the changed faking conditions (e.g. Mesmer-Magnus & Viswesvaran, 2006). Second, the dimensionality and structure of the measure might be different under neutral and faking-motivating conditions, which would mean that under faking-motivating conditions, respondents would attach different meanings to the items. That the trait under measure retains the same meaning under both conditions is a logical prerequisite for change scores to be interpreted meaningfully (Rogosa, Brandt, & Zimowski, 1982).

In this study we shall use change scores, which we shall interpret using parsimonious hypotheses. As for the first problem, we shall assume that there are no pre-test effects. As for the second, we shall adopt the "theta-shift" hypothesis (e.g. Zickar & Robie, 1999), and assume that: (a) the same common trait is measured under neutral and faking-motivating conditions and the measurement properties remain invariant; and (b), under faking-motivating conditions, the true-trait level is temporarily changed to provide more positive item scores.

The plausibility of the two hypotheses above must be assessed before using change scores, and this requires more complex designs and procedures than those usually employed in applied research (Mesmer-Magnus & Viswesvaran, 2006). The first hypothesis requires a control group to be used in which the measures are administered in neutral conditions both at Time1 and at Time2. The second requires a structural equation model (SEM) to be fitted in which the measurement parameters remain invariant whereas the structural parameters (of trait levels) can vary.

Even when the above hypotheses are plausible, change scores have two further potential problems regarding the issues considered here. First, there is a spurious negative element in the correlation between the initial levels and the amount of change. Second, change scores are generally unreliable (Rogosa et al., 1982). Elementary procedures for correcting these shortcomings have been available for a long time now (e.g. Lord, 1958) but are seldom used. If they are not, the correlational results can be misleading.
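The unreliability problem can be made concrete with the classical formula for the reliability of a difference score. The sketch below assumes equal variances at the two waves, a simplification not made in the article itself:

```python
def change_score_reliability(rel_t1, rel_t2, r_t1t2):
    """Classical reliability of a difference score, assuming equal
    variances at Time1 and Time2 (a simplifying assumption)."""
    return (rel_t1 + rel_t2 - 2 * r_t1t2) / (2 - 2 * r_t1t2)

# Two reliable scales can still yield an unreliable change score when
# the Time1-Time2 correlation is high:
print(round(change_score_reliability(0.80, 0.80, 0.70), 2))  # 0.33

# With the experimental-group P values later reported in Table 2
# (alphas .65 and .77, r = .21), the result approximates the .64 tabled there:
print(round(change_score_reliability(0.65, 0.77, 0.21), 2))  # 0.63
```

The equal-variance case makes the key point visible: the closer the retest correlation gets to the wave reliabilities, the less true change variance is left in the difference.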

A review of the literature shows that, at present, consistent results have not been obtained on some basic issues related to change scores. Of these issues, we shall focus on the following three.

The first is the role of individual differences: in other words, whether the same instructions or motivating conditions produce about the same amount of faking in all respondents or whether this amount is largely idiosyncratic. This issue is of great practical relevance (Furnham, 1986; McFarland & Ryan, 2006). If, under similar faking-motivating conditions, all the respondents increase or decrease their scores by about the same extent, then the rank order of the individuals would not be changed. So, selection decisions based on this rank will be essentially correct, and criterion-related validities would be unaltered. However, if some individuals increase or decrease more than others, then validity estimates might be distorted and decisions based on rank order might be erroneous.
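The rank-order logic above can be illustrated with a toy simulation (the numbers are purely hypothetical, not from the study): a uniform shift leaves the Time1–Time2 correlation intact, whereas idiosyncratic shifts attenuate it.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(0.0, 1.0, 500)   # honest-condition trait levels
noise = rng.normal(0.0, 0.5, 500)   # retest measurement error

# Uniform faking: everyone shifts by the same constant, so the rank
# order -- and hence the Pearson correlation -- is unaffected.
t2_uniform = theta + 1.5 + noise

# Idiosyncratic faking: the size of the shift varies across respondents,
# which lowers the Time1-Time2 correlation.
t2_idio = theta + rng.normal(1.5, 1.0, 500) + noise

r_uniform = np.corrcoef(theta, t2_uniform)[0, 1]
r_idio = np.corrcoef(theta, t2_idio)[0, 1]
print(r_uniform > r_idio)  # True
```

Adding a constant to every score cannot change a Pearson correlation, so any drop in the experimental-group retest correlation must come from individual variation in the shift plus error, which is the comparison exploited later in the Results.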

Of the different procedures that can be used to assess this issue (Lautenschlager, 1986), the one that is most related to the above concerns is based on the correlation between the scores obtained under honest and faking-motivating conditions. Studies report moderate (Mueller-Hanson, Heggestad, & Thornton, 2006) or low (Loo & Wudel, 1979) correlations that are in all cases substantially lower than the reliability estimates in each condition. This suggests that the amount of change due to individual differences varies considerably. However, as discussed below, this evidence is only partial.

The second issue refers to the relationship between IM scores obtained under neutral conditions and change scores. As discussed above, if IM is regarded as a variable of propensity to fake, then some relation is expected to be found between this variable and effective change. So far, however, the results are disappointing. Griffith and Peterson (2008) and McFarland and Ryan (2006) obtained non-significant correlations, whereas Mersman and Shultz (1998), Quist, Arora, and Griffith (2007), and Wrensen and Biderman (2005) obtained weak negative correlations. These weak and inconsistent results might be partly due to the psychometric problems of the change scores referred to above (Burns & Christiansen, 2006).

The last issue on which we shall focus is the convergent validity of the change scores. Some authors (Lautenschlager, 1986; McFarland & Ryan, 2006; Mersman & Shultz, 1998; Mueller-Hanson et al., 2006) have assumed that change scores are indicators of an individual-difference variable. Now, to be conceptualized in this way, they should, at the very least, show some consistency across situations and over time. So, if substantial individual differences are found in the amount of faking-induced change, it is of interest to assess the extent to which this change generalizes across different measures. McFarland and Ryan (2006) assessed the convergence between change scores from the 5 NEO scales and obtained correlations ranging from 0.10 to 0.60. Mueller-Hanson et al. (2006) used a more specific measure, the Job Fit Assessment, and obtained correlations ranging from 0.40 to 0.60.

1.1. Overview of the study and rationale

This study uses as measures the Neuroticism (N), Psychoticism (P) and Lie scale scores of the Eysenck Personality Questionnaire Revised (EPQ-R; Eysenck, Eysenck, & Barrett, 1985). Eysenck’s questionnaires are widely used in basic and applied personality research, selection, and clinical assessment. And, although they have been criticized, the psychometric properties of their scales appear to be good. Furthermore, the Lie scale is considered to be a good, almost pure measure of IM (Paulhus, 1991). Finally, there is a large body of research on the fakeability of the EPI, EPQ-A and EPQ-R scores (see e.g. Furnham, 1986).

So far, most of the change-based studies that use Eysenck’s questionnaires have been based on mean changes at the group level. As for N, P and Lie, the results are quite consistent (see Furnham, 1986, for a review). They are very sensitive to faking so, under faking-motivating conditions, Lie scores increase and N and P scores decrease quite substantially. The E scale appears to be less sensitive. In greater detail, previous studies that used the generic-job instructions that we shall use here found no systematic changes in the E scores (Eysenck & Eysenck, 1963; Eysenck, Eysenck, & Shaw, 1974; Furnham, 1986). The E change scores, then, cannot be expected to be a valid measure of faking in the present study and so were not used.

Given the basic nature of the questions to be assessed, this study is not based on real applicant samples. Rather, it uses a situational design based on standard and instructed-faking-good conditions. Because all the respondents receive exactly the same instructions, the role of individual differences can be assessed more clearly. This role would be far more difficult to assess in a real applicant sample, in which the motivation for faking would probably be different for different individuals. Indeed, the present results are not believed to be directly generalizable to real applicant samples (see e.g. Hogan, Barrett, & Hogan, 2007). Rather, they are intended to respond to certain basic issues in a controlled situation, which can then be researched further in more realistic settings.

Table 1
SEM results: Goodness of fit assessment and estimated mean trait change.

Measure   χ²       df   RMSEA   NNFI   CFI   d̂ (mean change)   90% C.I.
Lie       173.88   62   0.08    0.91   0.91    2.04            (1.88; 2.20)
N         153.75   62   0.07    0.94   0.94   −1.12            (−1.27; −0.97)
P         106.12   62   0.05    0.93   0.93   −0.90            (−1.05; −0.75)

At the methodological level the study has two parts. The first consists of fitting a SEM for each scale which addresses the two starting hypotheses discussed above. The model used is the one recently proposed by Ferrando and Anguiano-Carrasco (2011), intended for a design in which two groups (experimental and control) are administered the measures of interest at two points of time (see Ferrando & Anguiano-Carrasco, 2011, for details). Conceptually, the model places restrictions on both measurement and structure. The measurement restrictions are that in the four conditions the measurement properties of the scale remain invariant. The structural restrictions (that no systematic changes in trait levels should occur) are imposed in the control group. As for the units of analysis, in this study we deal with scales that have between 22 and 32 items. And because the items are measured in two groups and at two points of time, the models have between 88 and 128 variables. So, if we work at the item level, it is very difficult to obtain stable results with the moderately large samples used here. For this reason we have decided to use item aggregates (parcels) as units for fitting the SEM. Parcels were made using the commonly accepted aggregation rules: the same number of items were included in each parcel and the same number of parcels included in each factor (5). To preserve factor structure, intra-factor parcelled items were unidimensional (see Sass & Smith, 2006). We believe that this procedure is defensible in this case, and that it can provide a strong enough basis for a correct interpretation of change scores. However, we acknowledge that, strictly speaking, measurement invariance is not assessed on an item-by-item basis but only at the aggregate level.
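As a rough sketch of the parcelling step (the item count and responses below are hypothetical; the study’s own parcels additionally followed the unidimensionality checks cited above):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical binary (yes/no) responses: 489 respondents x 25 items of one scale.
items = rng.integers(0, 2, size=(489, 25))

# Five parcels, each aggregating the same number of items (here, five
# consecutive items are summed into each parcel).
parcels = items.reshape(489, 5, 5).sum(axis=2)
print(parcels.shape)  # (489, 5)
```

Summing items into five parcels per scale reduces a model with 88–128 observed variables to one with 20 (5 parcels × 2 groups × 2 waves), which is what makes stable estimation feasible at these sample sizes.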

If fits are acceptable, the three main issues will then be assessed in the second part. First, however, we note that fitting the SEM also makes it possible to assess the amount of faking-induced change at the group level (i.e. mean theta-shift difference). The magnitude of the mean change can be meaningfully interpreted because the trait levels under neutral conditions are scaled to have zero mean and unit variance.

The first issue is assessed by comparing the Time1–Time2 correlations obtained in the experimental group to the corresponding correlations in the control group. Because no systematic change occurs in the control group, the Time1–Time2 correlations are interpretable in this group as test–retest reliability estimates, so that the attenuation with respect to unity is solely due to random error. Now, if the same faking-inducing instructions equally affected all the respondents, all the scores would undergo the same amount of systematic change (theta-shift). So, the rank order would be preserved and the Time1–Time2 correlations in the experimental group would be expected to be the same as those in the control group. On the other hand, if the amount of change is due to a greater or lesser extent to individual differences, the Time1–Time2 correlations in the experimental group will be lower than those in the control group. The lower these correlations are, the more important the role of the individual differences will be. This approach, which uses a control group, is more complete than the single-group assessments discussed above.
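This comparison can be sketched with the usual Fisher-z machinery for independent correlations; with the Lie-scale values later reported in Table 2 (r = .60, N = 215 in the control group; r = .24, N = 274 in the experimental group) it reproduces the tabled difference and interval:

```python
from math import atanh, sqrt

def fisher_z_diff(r1, n1, r2, n2, z_crit=1.645):
    """Difference between two independent correlations in Fisher's z
    scale, with its confidence interval (z_crit = 1.645 gives 90%)."""
    diff = atanh(r1) - atanh(r2)
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return diff, (diff - z_crit * se, diff + z_crit * se)

# Lie scale: control (r = .60, N = 215) vs. experimental (r = .24, N = 274).
diff, (lo, hi) = fisher_z_diff(0.60, 215, 0.24, 274)
print(round(diff, 2), round(lo, 2), round(hi, 2))  # 0.45 0.3 0.6
```

A confidence interval that excludes zero, as here, indicates that the experimental-group retest correlation is significantly lower than the control-group one.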

Of the various proposals that have been made for testing the second issue, we have chosen Lord’s (1958) residual-change procedure based on the partial correlation. It is designed to assess the relation between change scores and a third variable c (in this case the IM scores under neutral conditions), and the measure is the partial correlation r_{cT2.T1}. Conceptually, Lord’s index measures the correlation between change and a third variable when the initial levels are statistically controlled. In the present study, Lord’s index is solely obtained in the experimental group (i.e. the group in which a systematic change trend towards more socially desirable responding is expected).
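Lord’s index is the standard first-order partial correlation; a minimal sketch follows (the input values are hypothetical, chosen only for illustration):

```python
from math import sqrt

def lord_partial(r_c_t2, r_c_t1, r_t1_t2):
    """Partial correlation r_{cT2.T1}: the relation between a third
    variable c and the Time2 scores, controlling for the initial
    (Time1) levels."""
    num = r_c_t2 - r_c_t1 * r_t1_t2
    den = sqrt((1.0 - r_c_t1 ** 2) * (1.0 - r_t1_t2 ** 2))
    return num / den

# Hypothetical correlations: c with T2, c with T1, and T1 with T2.
print(round(lord_partial(0.30, 0.20, 0.50), 2))  # 0.24
```

Partialling out Time1 removes the spurious negative component that raw difference scores carry, which is why this index is preferred over correlating c with T2 − T1 directly.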

The third issue is also assessed in the experimental group by analysing the correlation matrix of the change scores in two ways. First, the disattenuated correlations are obtained by using the estimated reliabilities of the change scores. Second, the matrix is factor-analyzed using the one common factor model. If the disattenuated correlations are near unity, the change scores are interpreted to measure one common construct except for measurement error. If they are clearly lower than one but the factor model fits well, they are interpreted as having systematic specific variation beyond the common factor, but that this specificity is uncorrelated between measures.
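The disattenuation step is the classical correction for attenuation. Using the Lie–P values reported in the tables (raw r = −.49 in Table 3; change-score reliabilities .79 and .64 in Table 2), it recovers the −.69 shown above the diagonal:

```python
from math import sqrt

def disattenuate(r_xy, rel_x, rel_y):
    """Correlation corrected for attenuation due to unreliability."""
    return r_xy / sqrt(rel_x * rel_y)

# Lie-P change scores: raw r = -.49, reliabilities .79 (Lie) and .64 (P).
print(round(disattenuate(-0.49, 0.79, 0.64), 2))  # -0.69
```

A disattenuated correlation estimates what the correlation between the error-free change scores would be, so values still well below one signal genuine specific variance rather than mere unreliability.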

With only three change scores (Lie, N and P) the one common factor model is just identified and the goodness of model-data fit cannot be assessed. To solve this problem, we chose to use the Lie change scores as an indicator variable, and fix the loading of this variable by using the corresponding reliability estimate (see e.g. Hancock, 1997). This constraint provides one degree of freedom for testing the goodness of fit of the one-factor solution.

2. Method

2.1. Participants and procedure

Respondents were 489 undergraduate students from the Psychology and Social Sciences faculties of a Spanish university, and were assigned to two groups: control (N = 215) and experimental (N = 274). The mean age (about 21) and the proportions of genders (about 80% female) were the same in both groups. The questionnaires were administered in paper-and-pencil version by the same person in all cases, and were completed voluntarily in classroom groups of 25–60 students. The administration was anonymous, and the respondents had to provide only two particulars: gender and age.

At Time1 all the participants were asked to respond under the standard instructions provided in the manuals of Eysenck’s questionnaires. At Time2, the participants assigned to the control group were re-tested using the same standard instructions as at Time1. Participants assigned to the experimental group were given the instructions listed in Eysenck et al. (1974): respondents are asked to imagine that they are applying for a job that they really want, and to try to give a good impression when answering by putting what they think the employer would like them to put. The re-test interval was 6 weeks in all cases.

2.2. Measures

The study used the Spanish translation of the EPQ-R by Aguilar, Tous, and Andrés (1990). The number and order of the items in the questionnaire are exactly the same as in the original version.

3. Results

3.1. First step: SEM-based analyses

In each scale, the two-wave two-group SEM was fitted to the parcel scores using robust maximum likelihood estimation as implemented in the Mplus program, version 5 (Muthén & Muthén, 2007). For each measure, Table 1 shows the model-fit results as well as the estimated mean trait change in the experimental group.

The goodness-of-fit results in Table 1 are quite clear: for all three measures, the proposed model fits reasonably well. These results suggest that the two parsimonious starting hypotheses, (a) no systematic re-test changes occur in the control group and (b) the measurement properties remain invariant under both conditions, can be considered to be reasonable. So, change estimates can be meaningfully interpreted: they are shown together with their corresponding 90% confidence intervals. The pattern of changes agrees with expectations: very strong positive change in Lie, and strong negative changes in N and P. In standard deviation units, the results obtained here for N and Lie agree closely with those obtained by Eysenck and Eysenck (1963).

3.2. Issue 1: The role of individual differences in the amount of change

Table 3
Convergent validity of the change scores.

          Lie      N       P      Factor loadings
Lie        —     −0.56   −0.69     0.78
N        −0.43     —      0.52    −0.56
P        −0.49    0.36     —      −0.64

Note. Product-moment correlations below the diagonal; disattenuated correlations above the diagonal.

For both groups, Table 2 shows: (a) the estimated reliabilities of the scale scores at each point of time (Cronbach alpha), (b) the estimated reliabilities of the change scores, (c) the correlations between scores at Time1 and scores at Time2, and (d) the experimental–control correlation differences in the Fisher-z transformed scale together with the corresponding 90% confidence interval.

As far as reliabilities are concerned, estimates at Time1 are quite similar in both groups, which reinforces the hypothesis of initial between-group comparability. Furthermore, the alpha estimates at Time1 agree with those provided by the Spanish version of the EPQ-R used here (e.g. Ferrando, Chico, & Lorenzo, 1997).

In the control group the reliability estimates at Time2 are very similar to those at Time1. However, in the experimental group there is a clear increase in the Lie and P scores. Finally, the reliability estimates of the change scores tend to be a little lower than those of the scale scores, but are still acceptable for research purposes.

In the control group, the Time1–Time2 correlations are generally smaller than the alpha estimates. However, for a 6-week re-test interval they are reasonable test–retest estimates. In the experimental group, however, the correlations are much lower. Taken together, these results suggest that, under faking-induced conditions, and in the three scales, the rank order of the scores changes much more than would be expected in a comparable test–retest situation. The Fisher-z comparisons show that the differences are highly significant statistically (none of the confidence intervals contain the zero value). However, the overlap among the intervals cautions against interpreting the scales differently.

3.3. Issue 2: Relation between IM and change scores

In the experimental group, the partial correlation estimates between IM as measured by the Lie scores at Time1 and the remaining change scores were 0.08 (N) and 0.02 (P). None of them were significant. So, even when a more refined procedure is used, the results agree with most previous reports in which IM scores obtained under neutral conditions were unable to predict effective change.

3.4. Issue 3: The convergent validity of change scores

For the three change scores, Table 3 shows: (a) the product-moment correlations (below the diagonal), (b) the disattenuated correlations (above the diagonal), and (c) the loadings of the unidimensional solution obtained by fitting the constrained one-factor model discussed above to the correlation matrix.

The signs of the raw correlations agree with expectations, and the moderate values are similar to those reported previously for general measures. The disattenuated correlations are indeed higher but still far from unity. Overall, these results suggest that there is a substantial degree of convergence among the measures, but that, at the very least, each measure has specific variation.

Table 2
Reliability estimates and Time1–Time2 correlations.

                 Control group              Experimental group
Measure   α(T1)  α(T2)  r(T1T2)    α(T1)  α(T2)  ρ(ch)  r(T1T2)  ZF diff  90% CI
Lie       0.73   0.74   0.60       0.75   0.89   0.79   0.24     0.45     (0.30; 0.60)
N         0.85   0.87   0.73       0.82   0.82   0.74   0.34     0.57     (0.42; 0.72)
P         0.67   0.69   0.71       0.65   0.77   0.64   0.21     0.67     (0.52; 0.82)

Note. α(T1): alpha reliability estimate at Time1; α(T2): alpha reliability estimate at Time2; r(T1T2): Time1–Time2 correlation; ρ(ch): estimated reliability of change scores; ZF diff: control–experimental r(T1T2) difference in Fisher’s Z scale; 90% CI: confidence interval for ZF diff.

The one-factor model was estimated using the maximum likelihood procedure, and fitted the data quite well. The chi-squared statistic with one degree of freedom was 0.04, and the NNFI and GFI goodness-of-fit indices were 0.99 and 1, respectively. So it seems reasonable to assume that, although there is substantial specific variation in the change scores, the specificities are uncorrelated, and the only thing the measures have in common is a general factor. Furthermore, if we interpret this factor as ‘faking-induced change’ the results are quite clear. The sign of the loading reflects the direction of change and its magnitude the sensitivity of the corresponding measure. Note again that Lie is the most sensitive.

4. Discussion

This study aimed to assess some unresolved general issues on faking-induced change scores by using EPQ-R scores as measures. With respect to previous studies, this one has a series of strengths. At the design level, it uses a two-wave two-group design with a control group. At the analysis level it is model-based, so that the two basic hypotheses that are a prerequisite if the change scores are to be interpreted clearly are first tested with a SEM.

The results obtained for the first and third issues (role of individual differences and convergent validity) were interesting. In the present controlled situation, the role of the individual differences was of considerable importance. Furthermore, although the change scores exhibited specific variance, the hypothesis that the only common variable underlying the scores was a general factor of faking-induced change was supported by the data. These results seem to warrant further research on change scores viewed as measures of an individual-differences variable rather than purely situational measures. Future research could extend the present study in a variety of ways. First, a more complete multiple-group two-wave design could be envisaged in which different experimental groups were instructed to fake specific job profiles (i.e. job-specific instructions instead of the generic instructions used here). A study of this type would also make it possible to include the E scale and evaluate the role of individual differences when the instructions are more concrete. In this respect, the results of Furnham’s (1990) study suggest that this role is still important even when faking specific job profiles. Second, it would be of interest to assess how change scores function in real applicant samples (e.g. Hogan et al., 2007). Given that in real situations the decision to engage in faking depends to a far greater extent on the individual, we conjecture that the role of individual differences will be much more important. Third, we believe that it would be interesting to assess the consistency of change scores not only across different scales, but also over time and situations.

The most disappointing result is the lack of relation between IM scores and change scores. Even when methods that control potential biases have been used, the results agree with most previous evidence: IM scores obtained under neutral conditions are unable to predict effective change. This result has prompted some authors

[Table fragment carried over from the preceding page. Columns: α (T2), ρ (ch), r (T1T2), ZF diff, 90% CI. Rows (scale labels not recoverable): 0.89, 0.79, 0.24, 0.45 (0.30; 0.60); 0.82, 0.74, 0.34, 0.57 (0.42; 0.72); 0.77, 0.64, 0.21, 0.67 (0.52; 0.82). Note: r (T1T2) = Time1–Time2 correlation; ρ (ch) = estimated reliability of change scores; 90% CI = confidence interval for ZF diff.]
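For reference, the "estimated reliability of change scores" mentioned in the table note can be related to the classical expression for the reliability of a difference score given by Lord (1958), which is cited in the reference list. This is a sketch of the classical formula only; the model-based estimates in the table were obtained via SEM and need not coincide with it. Writing $X_1$ and $X_2$ for the Time-1 and Time-2 scores, with standard deviations $\sigma_1$, $\sigma_2$, reliabilities $\rho_{11}$, $\rho_{22}$, and Time1–Time2 correlation $r_{12}$:

```latex
\rho_{D} \;=\;
\frac{\sigma_1^{2}\rho_{11} + \sigma_2^{2}\rho_{22} - 2\,\sigma_1\sigma_2\,r_{12}}
     {\sigma_1^{2} + \sigma_2^{2} - 2\,\sigma_1\sigma_2\,r_{12}}
```

The formula makes explicit why low Time1–Time2 correlations favor reliable change scores: as $r_{12}$ decreases, the true-change variance in the numerator grows relative to the error variance, which is consistent with the pattern of ρ (ch) and r (T1T2) values in the table.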


P.J. Ferrando, C. Anguiano-Carrasco / Personality and Individual Differences 51 (2011) 497–501 501

to consider IM measures as inappropriate indicators of faking (Burns & Christiansen, 2006; Griffith & Peterson, 2008). However, we believe that this consideration is premature. Our results support the statement that IM scores obtained under faking-motivating conditions are the most susceptible to modification and, therefore, the most sensitive for detecting faking. Furthermore, their failure to explain faking variance in the present situation does not mean that they cannot function in other contexts. However, if the body of evidence continues to point in this direction, it would be interesting to re-think the type of construct that the IM scores measure under neutral conditions. Or perhaps, as Holden and Passey (2010) hypothesized, they are a mixture of substance and method effects.

Acknowledgment

This research was supported by a grant from the Spanish Ministry of Education and Science (PSI2008-00236/PSIC).

References

Aguilar, A., Tous, J. M., & Andrés, A. (1990). Adaptación y estudio psicométrico del EPQ-R [Adaptation and psychometric study of the EPQ-R]. Anuario de Psicología, 46, 101–118.

Burns, G. N., & Christiansen, N. D. (2006). Sensitive or senseless: On the use of social desirability measures in selection and assessment. In R. L. Griffith & M. H. Peterson (Eds.), A closer examination of applicant faking behavior (pp. 113–148). Greenwich, CT: Information Age Publishing.

Eysenck, H. J., & Eysenck, S. B. G. (1976). Psychoticism as a dimension of personality. New York: Crane-Russak.

Eysenck, S. B. G., & Eysenck, H. J. (1963). An experimental investigation of 'desirability' response set in a personality questionnaire. Life Sciences, 2, 343–355.

Eysenck, S. B. G., Eysenck, H. J., & Barrett, P. (1985). A revised version of the psychoticism scale. Personality and Individual Differences, 6, 21–29.

Eysenck, S. B. G., Eysenck, H. J., & Shaw, L. (1974). The modification of personality and Lie scores by special 'honesty' instructions. British Journal of Social and Clinical Psychology, 13, 41–50.

Ferrando, P. J., & Anguiano-Carrasco, C. (2011). A SEM at the individual and group level for assessing faking-related change. Structural Equation Modeling, 18, 1–19.

Ferrando, P. J., Chico, E., & Lorenzo, U. (1997). Dimensional analysis of the EPQ-R Lie scale with a Spanish sample: Gender differences and relations to N, E, and P. Personality and Individual Differences, 23, 631–637.

Furnham, A. (1986). Response bias, social desirability and dissimulation. Personality and Individual Differences, 7, 385–400.

Furnham, A. (1990). Faking personality questionnaires: Fabricating different profiles for different purposes. Current Psychology, 9, 46–56.

Griffith, R. L., & Peterson, M. H. (2008). The failure of social desirability measures to capture applicant faking behavior. Industrial and Organizational Psychology, 1, 308–311.

Hancock, G. R. (1997). Correlation/validity coefficients disattenuated for score reliability: A structural equation modeling approach. Educational and Psychological Measurement, 57, 598–606.

Hogan, J., Barrett, P., & Hogan, R. (2007). Personality measurement, faking, and employment selection. Journal of Applied Psychology, 92, 1270–1285.

Holden, R. R., & Passey, J. (2010). Socially desirable responding in personality assessment: Not necessarily faking and not necessarily substance. Personality and Individual Differences, 49, 446–450.

Lautenschlager, G. J. (1986). Within-subject measures for the assessment of individual differences in faking. Educational and Psychological Measurement, 46, 309–316.

Loo, R., & Wudel, P. (1979). Estimates of fakeability on the Eysenck Personality Questionnaire. Social Behavior and Personality: An International Journal, 7, 157–160.

Lord, F. M. (1958). Further problems in the measurement of growth. Educational and Psychological Measurement, 18, 437–451.

McFarland, L. A., & Ryan, A. M. (2006). Toward an integrated model of applicant faking behavior. Journal of Applied Social Psychology, 36, 979–1016.

Mersman, J. L., & Shultz, K. S. (1998). Individual differences in the ability to fake on personality measures. Personality and Individual Differences, 24, 217–227.

Mesmer-Magnus, J., & Viswesvaran, C. (2006). Assessing response distortion in personality tests: A review of research designs and analytic strategies. In R. L. Griffith & M. H. Peterson (Eds.), A closer examination of applicant faking behavior (pp. 85–113). Greenwich, CT: Information Age Publishing.

Mueller-Hanson, R. A., Heggestad, E. D., & Thornton, G. C. (2006). Individual differences in impression management: An exploration of the psychological process underlying faking. Psychology Science, 48, 288–312.

Muthén, L. K., & Muthén, B. (2007). Mplus user's guide (5th ed.). Los Angeles: Muthén & Muthén.

Paulhus, D. L. (1991). Measurement and control of response bias. In J. P. Robinson, P. R. Shaver, & L. S. Wrightsman (Eds.), Measures of personality and social psychological attitudes (pp. 17–59). San Diego: Academic Press.

Quist, J. S., Arora, S., & Griffith, R. L. (2007). The association of social desirability and applicant response distortion: A validation study. Paper presented at the 22nd annual conference of the Society for Industrial and Organizational Psychology, New York.

Rogosa, D., Brandt, D., & Zimowski, M. (1982). A growth curve approach to the measurement of change. Psychological Bulletin, 92, 726–748.

Sass, D. A., & Smith, P. L. (2006). The effects of parcelling unidimensional scales on structural parameter estimates in structural equation modelling. Structural Equation Modeling, 13, 566–586.

Wrensen, L. B., & Biderman, M. D. (2005). Factors related to faking ability: A structural equation model application. Paper presented at the 20th annual conference of the Society for Industrial and Organizational Psychology, Los Angeles, CA.

Zickar, M. J., & Robie, C. (1999). Modeling faking good on personality items: An item-level analysis. Journal of Applied Psychology, 84, 551–563.