  • Original Articles and Reviews

    Construct-Driven Development of a Video-Based Situational Judgment Test for Integrity: A Study in a Multi-Ethnic Police Setting

    Lonneke A. L. de Meijer,¹ Marise Ph. Born,¹ Jaap van Zielst,² and Henk T. van der Molen¹

    ¹ Erasmus University, Rotterdam, The Netherlands
    ² Police Academy of The Netherlands, Apeldoorn, The Netherlands

    Abstract. In a field study conducted in a multi-ethnic selection setting at the Dutch police, we examined the construct validity of a video-based situational judgment test (SJT) aimed to measure the construct of integrity. Integrity is of central importance to productive work performance of police officers. We used a sample of police applicants, which consisted of a Dutch ethnic majority group and an ethnic minority group. The ethnic minority applicants came from one of the four largest ethnic minority groups in The Netherlands, namely groups with a Dutch Antillean, a Moroccan, a Surinamese, or a Turkish background. A critical issue is the often-found construct-heterogeneity of SJTs. However, we found that a construct-driven approach may be fruitful in the development of SJTs aiming to measure one single construct. Confirming our expectations, we found support for the construct validity of the SJT intended to measure the construct of integrity. These results held across ethnic majority and ethnic minority applicants. Therefore, the SJT is a promising test for personnel selection in a multi-ethnic setting.

    Keywords: situational judgment test, integrity, construct validity, ethnicity

    Situational judgment tests (SJTs) typically consist of scenarios of hypothetical work situations in which a problem has arisen. Accompanying each scenario are multiple possible ways to respond to the hypothetical situation. The test taker is then asked to judge the possible courses of action. Although SJTs have been in use since the 1920s, they have become increasingly popular in personnel selection and in the research literature during the last two decades (e.g., Chan & Schmitt, 1997, 2005; Dalessio, 1994; McDaniel, Hartman, Whetzel, & Grubb, 2007; Olson-Buchanan et al., 1998; Weekley & Jones, 1997, 1999). Several characteristics of the SJT have caused its revival. First, McDaniel et al. (2007) meta-analytically showed the criterion-related validity and the incremental validity of SJTs over and above a composite of cognitive ability tests and personality questionnaires in predicting job performance. Second, SJTs have been found to have less adverse impact on ethnic minority groups than more traditionally used cognitive ability tests (Clevenger, Pereira, Wiechmann, Schmitt, & Harvey, 2001; Motowidlo, Dunnette, & Carter, 1990; Nguyen & McDaniel, 2003; O'Connell, Hartman, McDaniel, Grubb, & Lawrence, 2007; Olson-Buchanan et al., 1998; Weekley & Jones, 1997, 1999). Finally, new technology has made the development of SJTs based on video material possible. The video-based SJT appears to have several advantages compared to the paper-and-pencil SJT, such as a higher criterion-related validity (Lievens & Sackett, 2006), less adverse impact, and higher realism of the test leading to more reliable respondent reactions (Chan & Schmitt, 1997; Richman-Hirsch, Olson-Buchanan, & Drasgow, 2000).

    Even though SJTs have a series of advantages, important questions still persist. A critical issue is the often-found construct-heterogeneity of SJTs and the difficulty of developing an SJT that measures one specific construct. Summarizing the empirical literature, McDaniel and Nguyen (2001) showed that SJTs are not one-dimensional construct tests, but should be considered as a measurement method capable of measuring several constructs. According to McDaniel and Nguyen, empirical evidence indicates that the constructs measured by SJTs can be limited to cognitive ability or g (observed r = .36; McDaniel, Morgeson, Bruhn Finnegan, Campion, & Braverman, 2001), and the personality dimensions conscientiousness (observed r = .26), agreeableness (observed r = .25), and emotional stability (observed r = .31; McDaniel & Nguyen, 2001). Although we agree with the viewpoint that SJTs are measurement methods, we question whether SJTs are limited to measuring g, conscientiousness, agreeableness, or emotional stability. We argue that SJTs can also be built to measure other constructs.

    © 2010 Hogrefe Publishing. European Psychologist 2010; Vol. 15(3):229–236. DOI: 10.1027/1016-9040/a000027

    The present study aims to demonstrate that a construct-driven development of SJTs is possible. We developed an SJT intended to measure the concept of integrity and based on video scenarios (i.e., a video-based SJT for integrity). We collected field data in a multi-ethnic setting during Dutch police officer selection. Therefore, the construct validity of the SJT was examined for both the ethnic majority and the ethnic minority group. The largest ethnic minority groups in The Netherlands are groups with a Dutch Antillean, a Moroccan, a Surinamese, and a Turkish background. We will first discuss the concept of integrity and of integrity tests. Second, the Integrity-SJT will be introduced and the hypotheses will be described. Finally, the relevance of the Integrity-SJT for personnel selection will be dealt with.

    The Concept of Integrity and Integrity Tests

    For the following two reasons, more and more attention is given to integrity during personnel selection and for job performance. First, measures of integrity have been shown to be predictive of organizational outcomes, from theft to job performance (Ones, Viswesvaran, & Schmidt, 1993). Second, integrity tests have also been shown to predict incrementally over and above measures of cognitive ability (Schmidt & Hunter, 1998).

    Integrity is difficult to define and appears to consist of various dimensions (Jones, Brasher, & Huff, 2002; Van Iddekinge, Taylor, & Eidson, 2005). Often-found examples of dimensions of integrity are honesty, work values, customer service, and drug avoidance. Violations of integrity at the Dutch police involve, among other things, corruption, fraud and theft, accepting dubious gifts and services, misuse of authority, and misuse of information (Naeye, Huberts, Van Zweden, Busato, & Berger, 2004), which can be viewed as dimensions of police integrity. Because of the impact that these integrity violations may have on the police organization, it is important to determine an applicant's integrity by means of a police officer selection measure.

    Broadly, there are two types of integrity tests. Tests using items that focus on attitudes toward theft and other dishonest behaviors are referred to as overt integrity tests, whereas tests developed to assess broad personality traits that predict counterproductive behaviors are referred to as personality-based integrity tests or so-called disguised purpose tests (Ones & Viswesvaran, 1998). In general, integrity tests have been found to positively relate to the Big Five personality dimensions of conscientiousness (observed r = .26), agreeableness (observed r = .23), and emotional stability (observed r = .18; Ones, 1993, in Wanek, 1999). Integrity tests have negligible correlations with cognitive ability (Ones & Viswesvaran, 1998).

    An Integrity-SJT

    The SJT that was developed for the Dutch police consists of videos of critical situations in each of which police-integrity violations are presented. Little is known in the literature about construct-driven development of SJTs. We know of one empirical study; Becker (2005) argued that his SJT was based on an explicit, clear definition of integrity and was intended to capture specific integrity values rather than general personality traits or other variables that are related to, but are not synonymous with, integrity. He stated that a clear a priori definition of integrity was necessary, but did not examine the construct validity of his SJT. Becker found that the SJT was a moderately valid predictor of outcomes in real-world settings, such as promotion, career progress, and status as a team leader. As these criteria are general in nature, they do not explicitly demonstrate that the SJT measures integrity. Building on Becker (2005), we developed an SJT intended to measure integrity and investigated its construct validity. To this end, we investigated the relationship between the SJT score and actual integrity-related variables, instead of examining the relationship between the SJT score and general work-related outcomes, as Becker did. Additionally, we examined the construct validity of the Integrity-SJT in a multi-ethnic sample.

    Overview of Hypotheses

    Since both integrity tests and SJTs have been shown to be related to conscientiousness, agreeableness, and emotional stability (McDaniel & Nguyen, 2001; Ones, 1993, in Wanek, 1999), examining correlations between the present SJT and these three Big Five dimensions alone will not give much insight into the question whether the SJT indeed measures integrity. If, for instance, correlations around .25 are found in the present study, would this mean that the SJT measures integrity or would this mean that the test is yet another multidimensional SJT? Therefore, the SJT's convergent validity was examined by means of the relationship between the SJT score and several integrity-related measures, namely the dimension Honesty-Humility of the HEXACO-model (Lee & Ashton, 2004) and cognitive distortions by means of the How-I-Think questionnaire (HIT questionnaire; Barriga, Gibbs, Potter, & Liau, 2001). Also, the discriminant validity of the SJT is investigated by means of a cognitive ability test.

    In the following, we will state the hypotheses and the arguments for these hypotheses. The first hypothesis states that scores on the Integrity-SJT will be related to scores on other integrity-related dimensions (Hypothesis 1). A dimension that has shown a strong resemblance to the concept of integrity is the sixth factor of the recently introduced HEXACO-model of personality (Lee & Ashton, 2004; Lee, Ashton, Morrison, Cordery, & Dunlop, 2008). This sixth factor is labeled Honesty-Humility and is typically described as honesty, fairness, sincerity, modesty, and lack of greed. Lee, Ashton, and De Vries (2005) argue that the dimension Honesty-Humility has a clear conceptual link to integrity, since both consist of "admissions of wrongdoing such as theft, fraud, sabotage, and alcohol and drug abuse" (p. 182). Hence, they investigated the relationship between Honesty-Humility on the one hand and workplace delinquency and scores on an overt integrity test on the other hand. They found correlations of .47 for workplace delinquency and .53 for integrity. Therefore, we expect that the score on the Integrity-SJT will be substantially correlated to the score on the dimension Honesty-Humility.

    The HIT questionnaire is a measure of self-serving cognitive distortions (Barriga et al., 2001). Self-serving cognitive distortions are inaccurate or biased ways of attending to or conferring meaning upon experiences associated with externalizing behavior. An example of a person showing self-serving cognitive distortions is someone who has been stealing something from a shop but who blames the shop owner for making stealing possible. Barriga and Gibbs (1996) argued that self-serving cognitive distortions should correlate with measures of antisocial behavior, such as theft, fraud, aggressive behavior, and disobedience. They found a correlation of .54 between scores on the HIT questionnaire and aggressive behavior, and a correlation of .46 between scores on the HIT questionnaire and delinquent behavior. Therefore, we expect that the score on the Integrity-SJT will also be substantially correlated to scores on the HIT questionnaire.

    The second hypothesis states that scores on the Integrity-SJT will not be related to scores on a cognitive ability test (Hypothesis 2). Ones and Viswesvaran (1998) showed that integrity has a negligible correlation with cognitive ability. Since it is integrity that the SJT intends to measure, we expect a negligible correlation between the SJT score and scores on the cognitive ability test.

    Finally, the construct validity of the Integrity-SJT will be examined in a group of Dutch ethnic majority applicants as well as a group of ethnic minority applicants. Little is known about potential differences in construct validity across ethnic groups. There has been some cross-cultural research on translated integrity tests and potential differences in criterion-related validity (i.e., Fortmann, Leslie, & Cunningham, 2002; Marcus, Lee, & Ashton, 2007). In general, the criterion-related validity findings point toward equivalence across cultures. This was examined with North American integrity tests that were appropriately translated predicting (counterproductive) work behavior in Canada, Central and South America, South Africa, and Germany. Equivalence in construct validity between ethnic groups differing in mastery of the language used in the test, however, has to our knowledge not been examined until this point in time. Therefore, the central question in this present study is whether the correlations between the Integrity-SJT on the one hand and other integrity-related measures and the cognitive ability test on the other hand are similar across the ethnic majority and minority group (Research Question).

    Method

    Sample and Procedure

    Data came from ethnic majority and ethnic minority applicants who applied for a position at the Police Academy of The Netherlands in the period from June 2006 to August 2006. The dataset consisted of 203 applicants (59% male; M_age = 23.34, SD = 5.98), of which 151 were ethnic majority applicants (58% male, M_age = 23.32, SD = 6.03) and 52 were ethnic minority applicants (62% male, M_age = 23.39, SD = 5.89). The ethnic minority applicants had a Dutch Antillean, Moroccan, Surinamese, or Turkish background. Applicants who are interested in a job as police officer first apply to the local police force where they want to work after training. For the selection procedure, the local police forces routinely send all applicants to the national Police Centre for Competence Assessment and Monitoring (CCM). During the selection process of the CCM, several instruments are used. In the present study, we only used the scores on the cognitive ability test. Next to applicants' scores on the cognitive ability test, scores on the Integrity-SJT, on the in-depth Honesty-Humility interview, and on the HIT questionnaire were used.

    Measures

    Situational Judgment Test for Integrity

    SJTs typically consist of hypothetical scenarios describing interpersonal work situations in which a problem has arisen. The scenario may represent an actual situation on the target job or a situation constructed in such a manner that it is psychologically identical to an actual work situation (Chan & Schmitt, 1997). Scenarios within the test are usually developed on the basis of a critical-incident analysis involving subject matter experts.

    An approach analogous to earlier SJT studies was used for the development of the Integrity-SJT (see, e.g., Weekley & Jones, 1997; for an example of an SJT item, see Appendix A). First, we collected realistic critical incidents regarding interactions between police officers and civilians or among police colleagues from 15 experienced police officers (both policemen and policewomen; both ethnic majority and minority police officers; police experts had around 15 years of police work experience). All incidents focused on integrity violations and potential reactions to these violations. For example, several incidents dealt with resisting fraudulent people or situations. Second, critical incidents that were similar were grouped and scenarios were written about each of these groups of critical incidents. Simultaneously, with the help of the experienced police officers, four response options were derived for each scenario. This procedure resulted in 14 SJT items (a scenario including its four response options is labeled an "item") that were pilot-tested in a written version of the test. Third, after examining the descriptives and the factor-analytic results of the pilot-study data (N = 228, 72% male, M_age = 24.08; SD = 6.78), 3 of the 14 SJT items were eliminated.

    Subsequently, a video-based version of the test was developed. Both professional actors and police officers were trained to act in scenarios. After this training, the scenarios were videotaped in a professional manner. Finally, a panel of experts was asked to fill out the video-based SJT in order to develop a scoring key. The expert panel consisted of 50 experienced police officers with on average 14.06 years of work experience (SD = 6.38) and with different ethnic backgrounds, namely 10 ethnic majority experts, 10 Antillean experts, 10 Moroccan experts, 10 Surinamese experts, and 10 Turkish experts. They had to evaluate each response option on its effectiveness given the situation presented in the scenario. Agreement among the experts in effectiveness ratings was generally satisfactory (mean intraclass correlation (ICC) = .70), both within ethnic groups (mean ICC = .69) and between ethnic groups (mean ICC = .69). Since agreement among experts was satisfactory, the scoring key was set at the mode of the total expert group. The absolute difference between the scoring key of a given item response option and the applicant response formed the applicant score, varying from 4 (largest difference between expert and applicant response) to 0 (no difference between expert and applicant response). The applicant score was subtracted from 4, in order to have an intuitively logical range from 0 (lowest possible integrity score) to 4 (highest possible integrity score).
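
    The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not report the rating scale, so a 5-point effectiveness scale (1 to 5) is assumed because it yields the reported maximum difference of 4; the tie-breaking rule for a multimodal expert distribution and the example ratings are hypothetical.

```python
from statistics import multimode


def scoring_key(expert_ratings):
    """Modal expert rating for one response option.

    Ties between modes are resolved by taking the lowest value;
    this tie-break is an assumption, not something the study reports.
    """
    return min(multimode(expert_ratings))


def option_score(key, applicant_rating, max_diff=4):
    """Reversed absolute distance to the key: 0 (lowest) to 4 (highest).

    max_diff=4 assumes a 5-point rating scale (1..5).
    """
    return max_diff - abs(key - applicant_rating)


# Hypothetical expert ratings for a single response option.
experts = [5, 5, 4, 5, 3]
key = scoring_key(experts)

# Hypothetical applicants rating the same option.
scores = [option_score(key, rating) for rating in (5, 4, 1)]
print(key, scores)
```

    An applicant who matches the modal expert rating receives 4 points for that response option; each scale step away from the key costs one point.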

    In its final form, the video-based SJT consists of short, videotaped scenarios of key integrity issues that police officers are likely to encounter with civilians or with police colleagues. A narrator introduces each scenario. Per SJT item, the scene freezes at an important point and the applicant has to rate the response options related to the scene presented. The 11 items have four response options each. Applicants have to evaluate each response option in terms of its effectiveness within the given situation. This response instruction generally is known as a knowledge response instruction.

    The SJT structure, consisting of one factor and four subfactors, was analyzed with structural equation modeling using Amos 6.0 (Arbuckle, 2005). The model showed a good fit to the data (χ² [df = 1] = 1.09, ns; TLI = .96; CFI = 1.00; RMSEA = .02). The four subfactors represent meaningful clusters of response options (i.e., Factor 1 represented applicant scores on the response option that can generally be described as "It is alright for this time." Factor 2: "It is not permitted!!" (in a stern way), Factor 3: "These are the rules, so it is not allowed." (in a more friendly way), and Factor 4: "It is not allowed and I have to report it to the supervisor!"), which all loaded significantly (.28 < β < .55, p < .05) on one general SJT factor. The error terms of Factor 1 and Factor 3 appeared to be somewhat negatively correlated (r = −.20, ns). The internal consistency of the SJT was .69. In further analyses, the total SJT score was used.

    Other Integrity Measures

    In-Depth Interview

    The in-depth interview was built around the sixth factor of the HEXACO-model (Lee & Ashton, 2004), Honesty-Humility, and its four subdimensions: Modesty, Honesty, Morality, and Avoidance of materialism (for definitions, see Appendix B). The interviews were semistructured and behaviorally based. The interviewer and an assessor, who was present during the interview, independently made ratings on a 7-point Likert scale ranging from 1 (extremely weak) to 7 (excellent), on the factor Honesty-Humility and each of the four subdimensions. Interrater reliabilities ranged from .63 to .78 (N = 203). The interview ratings were used for further analyses. The model of the in-depth interview, consisting of one Honesty-Humility factor and four subdimensions, showed a good fit to the data (χ² [df = 2] = 2.12, ns; TLI = 1.00; CFI = 1.00; RMSEA = .01). All subdimensions loaded significantly (.45 < β < .83, p < .001) on the Honesty-Humility factor. In further analyses, the dimension Honesty-Humility was used.

    How-I-Think Questionnaire

    To measure applicants' cognitive distortions, the Dutch translation (translated from English by Utrecht University, The Netherlands) of the HIT questionnaire (Barriga et al., 2001) was used. The HIT questionnaire was developed to measure two broad dimensions, namely cognitive distortions and behavioral referents, each consisting of four subdimensions. Cognitive Distortions consist of the subdimensions Self-centered, Blaming others, Minimizing/Mislabeling, and Assuming the Worst. Behavioral Referents consist of the subdimensions Opposition-Defiance, Physical Aggression, Lying, and Stealing (for definitions, see Appendix B). The alpha reliability of the dimension Cognitive Distortions was .90 and of the dimension Behavioral Referents was .89, based on the present sample. The alpha reliabilities of the subdimensions varied from .70 for Blaming others to .79 for Stealing. Two models were tested in accordance with Barriga et al. (2001), each consisting of one dimension (i.e., Cognitive Distortions or Behavioral Referents) and four subdimensions. Both models showed a good fit to the data (χ² [df = 2] = 8.82, p < .05, and 1.66, ns; TLI = .94 and 1.00; CFI = .99 and 1.00; and RMSEA = .05 and .00). All Cognitive Distortions subdimensions loaded significantly (.80 < β < .90, p < .001) on the Cognitive Distortions dimension. All Behavioral Referents subdimensions loaded significantly (.75 < β < .90, p < .001) on the Behavioral Referents dimension. However, because the two dimensions Cognitive Distortions and Behavioral Referents were highly correlated (r = .99, p < .001), further analyses were conducted with one general HIT scale consisting of the mean of the two dimensions.

    Cognitive Ability Test

    The police intelligence test (PIT; Rijks Psychologische Dienst, 1975) is a cognitive ability test and consists of 107 items divided over six subtests: Verbal Comprehension, Inductive Reasoning, Numerical Reasoning, Word Fluency, Spatial Ability, and Picture Arrangement. The time limit is 51 min. Applicants completed the PIT in Dutch. Prior research by Lem and Van Doorn (2000) indicated alpha reliabilities varying from .69 for Series of Numbers, to .87 for Folding Figures. A study by Van der Maesen (1992) showed a corrected predictive validity coefficient of .39 (N = 162) for the total PIT score predicting work performance. The model of the test, consisting of one factor for general cognitive ability and six subtests, showed a good fit to the data (χ² [df = 9] = 19.42, p < .05; TLI = .86; CFI = .94; and RMSEA = .08). All cognitive ability subtests loaded significantly (.23 < β < .68, p < .05) on general cognitive ability. In further analyses, the total score on the cognitive ability test was used.

    Analyses

    Preliminary Analyses

    Because response styles can affect answers on questionnaires (e.g., Van de Vijver & Leung, 1997), structural equivalence (i.e., absence of bias) was checked across the ethnic majority and minority group for each measure separately before conducting further analyses. In accordance with Van de Vijver and Leung (1997), structural equivalence across cultures is interpreted as follows: A test measures the same trait cross-culturally, but not necessarily on the same quantitative scale. Using an equal measurement weights model in Amos 6.0 (Arbuckle, 2005), no significant differences between factor structures of the measures were found between the ethnic majority group and the minority group (for detailed information, please contact the first author).

    Main Analyses

    Using Amos 6.0 (Arbuckle, 2005), a general model was tested to examine the correlations between the Integrity-SJT and the two integrity-related instruments and between the Integrity-SJT and the cognitive ability test. The model showed a good fit to the data (χ² [df = 4] = 4.73, ns; TLI = .89; CFI = .98; RMSEA = .03). Multigroup analysis was conducted to test for cross-ethnic invariance in correlations. Correlations between the SJT, the in-depth interview, and the HIT questionnaire were calculated to examine the convergent validity of the SJT. The correlation between the SJT and the cognitive ability test was calculated to investigate its discriminant validity.

    Results

    First, we expected that scores on the Integrity-SJT would correlate with scores on other integrity-related tests (Hypothesis 1). Second, we expected that scores on the Integrity-SJT would not correlate with scores on the cognitive ability test (Hypothesis 2). Finally, we examined potential correlational differences between the ethnic majority group and the ethnic minority group (Research Question).

    With regard to the Research Question, a multigroup analysis was conducted. A model in which the covariances were held constant across the ethnic majority and minority group was compared to the unconstrained model. No significant difference was found between the two models (Δχ² [Δdf = 8] = 10.26, ns), meaning that no significant difference in covariances was found between the two ethnic groups. Therefore, the correlations between scores on the SJT and the other measures were investigated in one combined group, without distinguishing between ethnic majority and minority applicants.
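
    As a rough check on the model comparison reported above, the p-value of a chi-square difference test can be computed directly. The sketch below is not the Amos procedure used in the study; it only evaluates the chi-square survival function for the reported difference (10.26 with 8 degrees of freedom), using a closed form that is exact for even degrees of freedom. The function name is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with a positive even df.

    Closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!, with k = df / 2.
    It does not apply to odd df, so those are rejected.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Values reported in the text for constrained vs. unconstrained model.
delta_chi2, delta_df = 10.26, 8
p_value = chi2_sf_even_df(delta_chi2, delta_df)
print(f"p = {p_value:.3f}")
```

    The resulting p-value is roughly .25, well above the conventional .05 threshold, which is consistent with the reported nonsignificant difference between the two models.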

    With regard to the integrity-related measures, the observed correlations with the SJT were .23 (p < .05) for Honesty-Humility and .36 (p < .001) for the HIT questionnaire. On cognitive ability, the correlation with the SJT was .13 (ns). Thus, concerning the convergent-validity evidence, the correlations were moderate to high (Cohen, 1988) between the SJT and the in-depth Honesty-Humility interview and between the SJT and the HIT questionnaire. Concerning the discriminant-validity results, the correlation between the SJT and the cognitive ability test was small (Cohen, 1988). These findings support Hypotheses 1 and 2 and support the notion that the SJT seems to measure integrity in both the ethnic majority and minority group.
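
    The significance pattern of these correlations can be illustrated with a simple approximation. The sketch below assumes plain Pearson correlations on the full sample (N = 203) and uses the Fisher z transformation with a normal approximation; the study itself estimated the correlations within a structural equation model in Amos, so the p-values here are indicative only. The function name and dictionary labels are hypothetical, and the correlations are taken as printed.

```python
import math


def fisher_z_p_value(r, n):
    """Two-sided p-value for H0: rho = 0 via the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail area of the standard normal, written with erfc.
    return math.erfc(abs(z) / math.sqrt(2.0))


n = 203  # total sample size reported in the Method section
reported = {
    "Honesty-Humility interview": 0.23,
    "HIT questionnaire": 0.36,
    "Cognitive ability test": 0.13,
}

p_values = {name: fisher_z_p_value(r, n) for name, r in reported.items()}
for name, p in p_values.items():
    print(f"{name}: r = {reported[name]:.2f}, p = {p:.4f}")
```

    Under these assumptions the two integrity-related correlations fall below .05 and .001 respectively, while the correlation with cognitive ability does not reach the .05 level, matching the pattern reported in the text.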

    Discussion

    Situational judgment tests have recently gained in popularity because of their incremental validity over and above cognitive ability tests and because of their smaller adverse impact against ethnic minority groups (Clevenger et al., 2001; Motowidlo et al., 1990; Nguyen & McDaniel, 2003; Weekley & Jones, 1997, 1999). New technology has made the development of video-based SJTs possible, which show even higher criterion-related validity (Lievens & Sackett, 2006) and less adverse impact (Chan & Schmitt, 1997) than paper-and-pencil SJTs. Furthermore, more and more attention is given to measuring integrity of applicants during personnel selection because it has been shown to be predictive of outcomes varying from theft to job performance (Ones et al., 1993). It is of central importance to determine integrity as a work style for police officer positions. In fact, integrity is seen as the most important work style of police officers according to O*Net (O*Net Online, 2009, May 6). Because of the impact that integrity violations have on the police organization, a video-based SJT intended to measure integrity was developed. In a field study conducted in a multi-ethnic setting at the Dutch police, we examined the construct validity of this Integrity-SJT. We investigated convergent and discriminant validity of the SJT, including potential correlational differences between the ethnic majority group and the ethnic minority group. The largest ethnic minority groups in The Netherlands have a Dutch Antillean, a Moroccan, a Surinamese, or a Turkish background.

    Firstly, we found no significant differences in correlations between the ethnic majority and minority group. Secondly, correlations between the SJT and the integrity-related measures (the in-depth Honesty-Humility interview and the HIT questionnaire) were moderate to high (Cohen, 1988). Finally, the correlation between the SJT and the cognitive ability test was low (Cohen, 1988). These results are in line with our expectations (Hypotheses 1 and 2) and demonstrate the construct validity of the SJT measuring integrity.

    The construct Honesty-Humility includes subdimensions such as Morality (i.e., being able to avoid fraud and corruption and unwilling to take advantage of other individuals or of society at large) and Honesty (i.e., being genuine in interpersonal relations and unwilling to manipulate others). The HIT questionnaire contains subdimensions such as Opposition-Defiance (i.e., being disrespectful of rules, laws, or authorities), Stealing, and Lying. Therefore, the moderate to high correlations between the SJT and the Honesty-Humility interview and between the SJT and the HIT questionnaire showed support for the construct validity of the SJT.

    Additionally, we found a negligible relationship between the SJT and the cognitive ability test. Regarding integrity tests, Ones and Viswesvaran (1998) showed that they have negligible correlations with cognitive ability. Since integrity was the intended SJT construct, we expected a small correlation between the SJT score and scores on the cognitive ability test. This was what we found, providing more evidence for the construct validity of the present SJT.

    Limitations

    Our study had some limitations. First, the small sample size of ethnic minority applicants resulted in low statistical power for the multigroup analysis. With regard to the ethnic minority group, a larger sample size would have allowed us to draw firmer conclusions. Also, a larger sample size of ethnic minorities would allow a further differentiation within the ethnic minority group. De Meijer, Born, Terlouw, and Van der Molen (2006) showed that large score differences on various selection measures exist between ethnic minority groups, which might be explained by differences in history and culture between the ethnic groups. Investigating these ethnic minority groups separately may result in more useful information compared to merely contrasting the ethnic majority to the minority group and not taking into account potential differences between ethnic groups.

    Second, we did not have criterion data at our disposal to investigate the criterion-related validity of the present SJT. Although the construct-validity results are promising, we do not know whether the present SJT is able to predict job performance, workplace (dis)honesty, theft, fraud, etc. Since little is known about SJTs measuring a single construct, in general, and their criterion-related validity, specifically, future research should be focused on these types of SJTs and their predictive power. Furthermore, SJTs intended to measure a single construct should be developed in different companies, in different settings, and on different job levels to be able to properly generalize the findings in the present study.

    Conclusion

    A critical issue regarding SJTs is the often-found construct-heterogeneity. However, we argue that a construct-driven approach may be fruitful in the development of SJTs measuring one single construct. In a field study conducted in a multi-ethnic setting during Dutch police officer selection, we examined the construct validity of a video-based SJT measuring integrity. We investigated (1) the convergent and discriminant validity of the SJT and (2) potential correlational differences between the Dutch ethnic majority and ethnic minority group. First, we found support for the construct validity of the Integrity-SJT. Second, we found no significant differences in correlations between the ethnic majority and minority group.

    Acknowledgments

    We acknowledge Hans van Loon and Hellen Westerveld for their contribution to the execution of this study.

    References

    Arbuckle, J. L. (2005). Amos 6.0 [Computer software]. Chicago: Smallwaters.

    Barriga, A. Q., & Gibbs, J. C. (1996). Measuring cognitive distortions in antisocial youth: Development and preliminary validation of the How-I-Think questionnaire. Aggressive Behavior, 22(3), 333–343.

    Barriga, A. Q., Gibbs, J. C., Potter, G. B., & Liau, A. K. (2001). The How-I-Think questionnaire: Manual. Champaign, IL: Research Press.

    Becker, T. E. (2005). Development and validation of a situational judgment test of employment integrity. International Journal of Selection and Assessment, 13(3), 225–232.

    Chan, D., & Schmitt, N. (1997). Video-based versus paper-and-pencil method of assessment in situational judgment tests: Subgroup differences in test performance and face validity perception. Journal of Applied Psychology, 82(1), 143–159.

    Chan, D., & Schmitt, N. (2005). Situational judgment tests. In A. Evers, N. Anderson, & O. Voskuijl (Eds.), Handbook of personnel selection (pp. 219–246). Oxford: Blackwell.

    Clevenger, J., Pereira, G. M., Wiechmann, D., Schmitt, N., & Harvey, V. S. (2001). Incremental validity of situational judgment tests. Journal of Applied Psychology, 86(3), 410–417.

    Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

    Dalessio, A. T. (1994). Predicting insurance agent turnover using a video-based situational judgment test. Journal of Business and Psychology, 9(1), 23–32.

    De Meijer, L. A. L., Born, M. Ph., Terlouw, G., & Van der Molen, H. T. (2006). Applicant and method factors related to ethnic score differences in personnel selection: A study at the Dutch police. Human Performance, 19(3), 219–251.

    Fortmann, K., Leslie, C., & Cunningham, M. (2002). Cross-cultural comparisons of the Reid Integrity Scale in Latin America and South Africa. International Journal of Selection and Assessment, 10(1/2), 98–108.

    Jones, J. W., Brasher, E. E., & Huff, J. W. (2002). Innovations in integrity-based personnel selection: Building a technology-friendly assessment. International Journal of Selection and Assessment, 10(1/2), 87–97.

    Lee, K., & Ashton, M. C. (2004). Psychometric properties of the HEXACO personality inventory. Multivariate Behavioral Research, 39(2), 329–358.

    Lee, K., Ashton, M. C., & De Vries, R. E. (2005). Predicting workplace delinquency and integrity with the HEXACO and five-factor models of personality structure. Human Performance, 18(2), 179–197.

    Lee, K., Ashton, M. C., Morrison, D. L., Cordery, J., & Dunlop, P. D. (2008). Predicting integrity with the HEXACO personality model: Use of self- and observer reports. Journal of Occupational and Organizational Psychology, 81(1), 147–167.

    Lem, J., & Van Doorn, E. (2000). Voortgangsrapportage. Onderzoek kenmerkende voorspellers politie [Progress report. Study noticeable predictors police]. Culemborg, The Netherlands: Meurs Personeelsadvies.

    Lievens, F., & Sackett, P. R. (2006). Video-based versus written situational judgment tests: A comparison in terms of predictive validity. Journal of Applied Psychology, 91(5), 1181–1188.

    Marcus, B., Lee, K., & Ashton, M. C. (2007). Personality dimensions explaining relationships between integrity tests and counterproductive behavior: Big Five, or one in addition? Personnel Psychology, 60(1), 1–34.

    McDaniel, M. A., Hartman, N. S., Whetzel, D. L., & Grubb, W. L. III (2007). Situational judgment tests, response instructions, and validity: A meta-analysis. Personnel Psychology, 60(1), 63–91.

    McDaniel, M. A., Morgeson, F. P., Bruhn Finnegan, E., Campion, M. A., & Braverman, E. P. (2001). Use of situational judgment tests to predict job performance: A clarification of the literature. Journal of Applied Psychology, 86(4), 730–740.

    McDaniel, M. A., & Nguyen, N. T. (2001). Situational judgment tests: A review of practice and constructs assessed. International Journal of Selection and Assessment, 9(1/2), 103–113.

    Motowidlo, S. J., Dunnette, M. D., & Carter, G. W. (1990). An alternative selection procedure: The low-fidelity simulation. Journal of Applied Psychology, 75(6), 640–647.

    Naeye, J., Huberts, L., Van Zweden, C., Busato, V., & Berger, B. (2004). Integriteit in het dagelijkse politiework [Integrity during daily police work]. Amsterdam: Vrije Universiteit.

    Nguyen, N. T., & McDaniel, M. A. (2003). Response instructions and racial differences in a situational judgment test. Applied H.R.M. Research, 8(1), 33–44.

    Occupational Information Network (O*NET). (2009, May 6). OnLine developed for the US Department of Labor by the National O*NET Consortium. Retrieved May 6, 2009, from http://www.online.onetcenter.org.

    O'Connell, M. S., Hartman, N. S., McDaniel, M. A., Grubb, W. L., & Lawrence, A. III (2007). Incremental validity of situational judgment tests for task and contextual performance. International Journal of Selection and Assessment, 15(1), 19–29.

    Olson-Buchanan, J. B., Drasgow, F., Moberg, P. J., Mead, A. D., Keenan, P. A., & Donovan, M. A. (1998). An interactive video assessment of conflict resolution skills. Personnel Psychology, 51(1), 1–24.

    Ones, D. S. (1993). The construct validity evidence for integrity tests. University of Iowa. Iowa City: Unpublished doctoral dissertation.

    Ones, D. S., & Viswesvaran, C. (1998). Gender, age, and race differences on overt integrity tests: Results across four large-scale job applicant data sets. Journal of Applied Psychology, 83(1), 35–42.

    Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test validities: Findings and implications for personnel selection and theories of job performance [Monograph]. Journal of Applied Psychology, 78(4), 679–703.

    Richman-Hirsch, W. L., Olson-Buchanan, J. B., & Drasgow, F. (2000). Examining the impact of administration medium on examinee perceptions and attitudes. Journal of Applied Psychology, 85(6), 880–887.

    Rijks Psychologische Dienst. (1975). Politie intelligentie test [Police intelligence test]. The Hague, The Netherlands: RPD.

    Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel selection: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262–274.

    Van der Maesen, P. E. A. M. (1992). Het rendement van personeelsselectie [The efficiency of personnel selection]. The Netherlands: Rijksuniversiteit Groningen, Unpublished doctoral dissertation.

    Van de Vijver, A. J. R., & Leung, K. (1997). Methods and data analysis for cross-cultural research. Thousand Oaks, CA: Sage.

    Van Iddekinge, C. H., Taylor, M. A., & Eidson, C. E., Jr. (2005). Broad versus narrow facets of integrity: Predictive validity and subgroup differences. Human Performance, 18(2), 151–177.

    Wanek, J. E. (1999). Integrity and honesty testing: What do we know? How do we use it? International Journal of Selection and Assessment, 7(4), 183–195.

    Weekley, J. A., & Jones, C. (1997). Video-based situational judgment testing. Personnel Psychology, 50(1), 25–49.

    Weekley, J. A., & Jones, C. (1999). Further studies of situational tests. Personnel Psychology, 52(3), 679–700.

    Received October 10, 2007
    Accepted September 15, 2009
    Published online March 16, 2010

    About the authors

    Dr. Lonneke A. L. de Meijer is assistant professor of Industrial and Organizational Psychology at Erasmus University, Rotterdam, The Netherlands. Her research examines the role of ethnic diversity in personnel selection, with a special focus on the development of innovative tests.

    Prof. Dr. Marise Ph. Born is professor of Personnel Psychology at Erasmus University Rotterdam, The Netherlands, and President of the International Test Commission. Her research focuses on personnel selection, test development, and cross-cultural psychology.

    Dr. Jaap van Zielst is head of the national Police Centre for Competence Assessment and Monitoring at the Police Academy of The Netherlands.

    Prof. Dr. Henk T. van der Molen is Dean of the Faculty of Social Sciences and professor of Psychology at Erasmus University, Rotterdam, The Netherlands. His research interests are (effectiveness of) communication skills training programs, problem-based learning, and innovations in education.

    Lonneke A.L. de Meijer

    Institute of Psychology
    Erasmus University
    Rotterdam FSW
    Woudestein T13-24
    P.O. Box 1738
    NL-3000 DR Rotterdam
    The Netherlands
    Tel. +31 10 408 8678
    Fax +31 10 408 9009
    E-mail [email protected]


  • Appendix B

    Integrity-Related Dimensions and Their Descriptions

    Dimension: Description

    In-depth interview

    Modesty: Being modest, unassuming, and seeing oneself as an ordinary person without any claim to special treatment.

    Honesty: Being genuine in interpersonal relations and unwilling to manipulate others.

    Morality: Being able to avoid fraud and corruption and unwilling to take advantage of other individuals or of society at large.

    Avoidance of materialism: Being uninterested in possessing lavish wealth, luxury goods, and signs of high social status.

    How-I-Think questionnaire

    Self-centered: According status to one's own view, expectations, needs, rights, immediate feelings, and desires to such a degree that the legitimate views, etc., of others are scarcely considered or are disregarded altogether.

    Blaming others: Misattributing blame to outside sources or misattributing blame for one's victimization or other misfortune to innocent others.

    Minimizing/mislabeling: Depicting antisocial behavior as causing no real harm or referring to others with a belittling or dehumanizing label.

    Opposition-defiance: Being disrespectful of rules, laws, or authorities.

    Note. Definitions of the facets of the in-depth Integrity interview are from Lee and Ashton (2004), and definitions of the (sub)dimensions of the HIT questionnaire are from Barriga et al. (2001). Definitions of the subdimensions Physical Aggression, Stealing, and Lying of the HIT are not listed here because we assumed that they are self-explanatory.

    Appendix A

    Example of Integrity-SJT Item

    Description of Situation

    A police officer (police officer 1) comes to work on his motorbike. When he enters the parking garage of the police station, he accidentally hits a police car, causing a big scratch on the police car. Shortly after, he meets a colleague (police officer 2) and tells her what happened.

    Police Officer 1

    Hi! Listen: I just entered the parking garage with my motorbike and caused a big scratch on one of the police cars. I feel really bad about it and, actually, I don't know what to do.

    Possible Reactions of Police Officer 2

    1. "Don't worry about it! Police cars are covered with scratches." (Factor 1, "It is alright for this time.")

    2. "O... I'm sorry. If I were you, I would report it to the chief." (Factor 3, "These are the rules, so it is not allowed." (in a friendly way))

    3. "Well, that's pretty stupid of you!! You have to report it to the chief!" (Factor 2, "It is not permitted!" (in a stern way))

    4. "The only thing you can do is to report it to the chief! And if you're not going to do it, I will!!" (Factor 4, "It is not allowed and I have to report it to the supervisor!")


    European Psychologist 2010; Vol. 15(3):229–236 © 2010 Hogrefe Publishing