Exploring the Equivalence and Rater Bias in AC Ratings
Prof Gert Roodt – Department of Industrial Psychology and People Management, University of Johannesburg
Sandra Schlebusch – The Consultants
ACSG Conference
17 – 19 March 2010
Presentation Overview
• Background and Objectives of the Study
• Research Method
• Results
• Discussion and Conclusions
• Recommendations
Background
Construct Validity has long been a Problem in ACs (Jones & Born, 2008)
Perhaps the Mental Models that the Raters use are Part of the Problem
However, other Factors that Influence Reliability Should not be Neglected
Background Continued
To Increase Reliability Focus on All Aspects of the Design Model (Schlebusch & Roodt, 2007):
• Analysis
• Design
• Implementation
  o Context
  o Participants
  o Process Owners (Simulation Administrator; Raters; Role-players)
Background Continued
Analysis (International Guidelines, 2009)
  o Competencies / Dimensions
  o Also Characteristics of Dimensions (Jones & Born, 2008)
  o Situations
  o Trends/Issues in Organisation
  o Technology
Background Continued
Design of Simulations
  o Fidelity
  o Elicit Behaviour
  o Pilot
Background Continued
Implementation
  o Context:
      Purpose
  o Participants
  o Simulation Administration (Potosky, 2008)
      Instructions
      Resources
      Test Room Conditions
Background Continued
Raters
  o Background
  o Characteristics
  o “What are Raters Thinking About When Making Ratings?” (Jones & Born, 2008)
Sources of Rater Bias
Rater Differences (background; experience, etc.)
Rater Predisposition (attitude; ability; knowledge; skills, etc.)
Mental Models
Objective of the Study
The Focus of this Study is on Equivalence and Rater Bias in AC Ratings
More specifically on:
• Regional Differences
• Age Differences
• Tenure Differences
• Rater Differences
Research Method
Participants (Ratees): Region

Region     Frequency   Percent   Valid Percent   Cumulative Percent
Western    368         34.8      34.8            34.8
Central    537         50.8      50.8            85.6
Eastern    152         14.4      14.4            100.0
Total      1057        100.0     100.0
Research Method (cont.)
Participants (Ratees): Age

Age (Recode)         Frequency   Percent   Valid Percent   Cumulative Percent
30 years or less     115         10.9      12.3            12.3
31 - 40 years        217         20.5      23.3            35.6
41 - 50 years        268         25.4      28.7            64.3
51 years or older    333         31.5      35.7            100.0
Total (valid)        933         88.3      100.0
Missing (System)     124         11.7
Total                1057        100.0
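
The Percent, Valid Percent and Cumulative Percent columns in the age table above follow the usual frequency-table arithmetic: Percent divides by all 1057 ratees, while Valid Percent and Cumulative Percent divide by the 933 ratees with a recorded age. A minimal Python sketch of that arithmetic, rebuilt from the counts on the slide (the function name `frequency_table` is hypothetical and is not part of the original analysis):

```python
from collections import Counter

def frequency_table(values, categories):
    """Return (rows, n_valid, n_missing) for a frequency table.

    `values` holds one entry per ratee, with None marking a missing value.
    `categories` fixes the row order. Each row is
    (label, frequency, percent, valid percent, cumulative percent).
    """
    counts = Counter(v for v in values if v is not None)
    n_total = len(values)
    n_valid = sum(counts[c] for c in categories)
    rows, running = [], 0
    for c in categories:
        running += counts[c]
        rows.append((
            c,
            counts[c],
            round(100 * counts[c] / n_total, 1),   # share of all ratees
            round(100 * counts[c] / n_valid, 1),   # share of non-missing ratees
            round(100 * running / n_valid, 1),     # running valid share
        ))
    return rows, n_valid, n_total - n_valid

# Counts taken from the age table; 124 ratees have no recorded age.
age_counts = {
    "30 years or less": 115,
    "31 - 40 years": 217,
    "41 - 50 years": 268,
    "51 years or older": 333,
}
values = [label for label, n in age_counts.items() for _ in range(n)]
values += [None] * 124

rows, n_valid, n_missing = frequency_table(values, list(age_counts))
for row in rows:
    print(row)
```

The same arithmetic applies to the tenure table and to the rating distributions in the Results slides.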
Research Method (cont.)
Participants (Ratees): Tenure

Years of Service (Recode)   Frequency   Percent   Valid Percent   Cumulative Percent
10 years or less            363         34.3      38.9            38.9
11 - 20 years               106         10.0      11.4            50.3
21 - 30 years               196         18.5      21.0            71.3
31 years or more            268         25.4      28.7            100.0
Total (valid)               933         88.3      100.0
Missing (System)            124         11.7
Total                       1057        100.0
Research Method (cont.)
Measurement:
In-Basket Test
Measuring Six Dimensions:
• Initiative; • Information Gathering; • Judgement; • Providing Direction; • Empowerment; • Management Control
Overall In-Basket Rating
Research Method (cont.)
Procedure:
Ratings were Conducted by 3 Raters on 1057 Ratees

Observer (Rater)   Frequency   Percent   Valid Percent   Cumulative Percent
1                  370         35.0      35.0            35.0
2                  378         35.8      35.8            70.8
3                  309         29.2      29.2            100.0
Total              1057        100.0     100.0
Results
Initiative

Rating   Frequency   Percent   Valid Percent   Cumulative Percent
0        61          5.8       5.8             5.8
ND       962         91.0      91.0            96.8
R        34          3.2       3.2             100.0
Total    1057        100.0     100.0
Results (cont.)
Initiative

Reliability Statistics: Initiative
Cronbach's Alpha   N of Items
.556               4

Reliability Statistics: Initiative, by Observer
Observer   Cronbach's Alpha   N of Items
1          .785               4
2          .610               4
3          .621               4
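
The reliability tables report Cronbach's alpha per dimension and per observer. As a reminder of what that coefficient measures, the sketch below computes alpha from a ratee-by-item score matrix using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The item scores here are invented purely for illustration; the item-level data behind the slides is not available.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows, one row of k item scores per ratee."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across ratees.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each ratee's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores: 5 ratees on a 4-item dimension (not data from the study).
example = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [1, 1, 2, 1],
    [4, 3, 4, 4],
]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

Running the same function separately on each observer's ratees would give per-observer coefficients of the kind shown in the second table.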
Results (cont.)
Information Gathering

Rating   Frequency   Percent   Valid Percent   Cumulative Percent
0        61          5.8       5.8             5.8
ND       990         93.7      93.7            99.4
R        6           .6        .6              100.0
Total    1057        100.0     100.0
Results (cont.)
Information Gathering

Reliability Statistics: Information Gathering
Cronbach's Alpha   N of Items
.485               3

Reliability Statistics: Information Gathering, by Observer
Observer   Cronbach's Alpha   N of Items
1          .603               3
2          .355               3
3          .453               3
Results (cont.)
Judgement

Rating   Frequency   Percent   Valid Percent   Cumulative Percent
0        62          5.9       5.9             5.9
ND       544         51.5      51.5            57.3
R        346         32.7      32.7            90.1
E        105         9.9       9.9             100.0
Total    1057        100.0     100.0
Results (cont.)
Judgement

Reliability Statistics: Judgement
Cronbach's Alpha   N of Items
.813               5

Reliability Statistics: Judgement, by Observer
Observer   Cronbach's Alpha   N of Items
1          .900               5
2          .670               5
3          .770               5
Results (cont.)
Providing Direction

Rating   Frequency   Percent   Valid Percent   Cumulative Percent
0        62          5.9       5.9             5.9
ND       776         73.4      73.4            79.3
R        129         12.2      12.2            91.5
E        72          6.8       6.8             98.3
HE       18          1.7       1.7             100.0
Total    1057        100.0     100.0
Results (cont.)
Providing Direction

Reliability Statistics: Providing Direction
Cronbach's Alpha   N of Items
.745               5

Reliability Statistics: Providing Direction, by Observer
Observer   Cronbach's Alpha   N of Items
1          .791               5
2          .478               5
3          .742               5
Results (cont.)
Empowerment

Rating   Frequency   Percent   Valid Percent   Cumulative Percent
0        62          5.9       5.9             5.9
ND       547         51.8      51.8            57.6
R        250         23.7      23.7            81.3
E        163         15.4      15.4            96.7
HE       35          3.3       3.3             100.0
Total    1057        100.0     100.0
Results (cont.)
Empowerment

Reliability Statistics: Empowerment
Cronbach's Alpha   N of Items
.749               3

Reliability Statistics: Empowerment, by Observer
Observer   Cronbach's Alpha   N of Items
1          .772               3
2          .782               3
3          .764               3
Results (cont.)
Control

Rating             Frequency   Percent   Valid Percent   Cumulative Percent
0                  61          5.8       5.8             5.8
ND                 811         76.7      76.8            82.6
R                  126         11.9      11.9            94.5
E                  38          3.6       3.6             98.1
HE                 20          1.9       1.9             100.0
Total (valid)      1056        99.9      100.0
Missing (System)   1           .1
Total              1057        100.0
Results (cont.)
Control

Reliability Statistics: Control
Cronbach's Alpha   N of Items
.748               5

Reliability Statistics: Control, by Observer
Observer   Cronbach's Alpha   N of Items
1          .788               5
2          .674               5
3          .757               5
Results (cont.)
Overall In-Basket Rating

Reliability Statistics: In-Basket
Cronbach's Alpha   N of Items
.768               6

Reliability Statistics: In-Basket, by Observer
Observer   Cronbach's Alpha   N of Items
1          .869               6
2          .713               6
3          .695               6
Results (cont.)
Regional Differences

Robust Tests of Equality of Means (Brown-Forsythe)
Dimension             Statistic(a)   df1   df2       Sig.
Initiative            11.567         2     631.991   .000
Info Gathering        14.755         2     789.232   .000
Judgement             12.065         2     625.270   .000
Providing Direction   6.990          2     482.067   .001
Empowerment           9.205          2     566.078   .000
Control               3.776          2     484.448   .024
In-Basket             10.876         2     621.425   .000
a Asymptotically F distributed.
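
The Brown-Forsythe statistic in the table above is a robust alternative to the one-way ANOVA F that does not assume equal group variances, which is also why its df2 values are fractional. The sketch below implements the textbook Brown-Forsythe (1974) F* formula in plain Python; it is an illustration of the technique, not a reproduction of the original SPSS run, and the example groups are invented.

```python
from statistics import mean, variance

def brown_forsythe(groups):
    """Brown-Forsythe F* test of equal means; returns (statistic, df1, df2)."""
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [mean(g) for g in groups]
    variances = [variance(g) for g in groups]  # sample variances

    # Numerator: the usual between-groups sum of squares.
    numerator = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    # Denominator: group variances weighted by (1 - n_j / N).
    weights = [(1 - n / n_total) * v for n, v in zip(sizes, variances)]
    denominator = sum(weights)

    statistic = numerator / denominator
    df1 = len(groups) - 1
    # Satterthwaite-style approximation for the denominator degrees of freedom.
    df2 = 1 / sum((w / denominator) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    return statistic, df1, df2

# Hypothetical ratings for three regions (not data from the study).
stat, df1, df2 = brown_forsythe([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(stat, df1, df2)
```

With equal group sizes and equal variances, as in this toy example, F* coincides with the ordinary ANOVA F. A significance value could then be read from an F distribution with (df1, df2) degrees of freedom, for instance with `scipy.stats.f.sf` if SciPy is available.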
Results (cont.)
Age Differences

Robust Tests of Equality of Means (Brown-Forsythe)
Dimension             Statistic(a)   df1   df2       Sig.
Initiative            6.002          3     770.593   .000
Info Gathering        6.478          3     913.407   .000
Judgement             12.074         3     710.775   .000
Providing Direction   4.129          3     725.823   .006
Empowerment           2.571          3     752.519   .053
Control               1.323          3     856.726   .266
In-Basket             7.532          3     828.234   .000
a Asymptotically F distributed.
Results (cont.) - Tenure
Tenure Differences

ANOVA
Dimension             Source           Sum of Squares   df    Mean Square   F       Sig.
Initiative            Between Groups   .673             3     .224          2.810   .038
                      Within Groups    74.197           929   .080
                      Total            74.870           932
Info Gathering        Between Groups   .457             3     .152          3.073   .027
                      Within Groups    45.996           929   .050
                      Total            46.452           932
Judgement             Between Groups   8.438            3     2.813         5.197   .001
                      Within Groups    502.735          929   .541
                      Total            511.173          932
Providing Direction   Between Groups   2.539            3     .846          1.597   .189
                      Within Groups    492.473          929   .530
                      Total            495.012          932
Empowerment           Between Groups   3.056            3     1.019         1.179   .317
                      Within Groups    802.629          929   .864
                      Total            805.685          932
Control               Between Groups   1.067            3     .356          .794    .498
                      Within Groups    415.966          928   .448
                      Total            417.033          931
In-Basket             Between Groups   1.416            3     .472          2.560   .054
                      Within Groups    171.258          929   .184
                      Total            172.674          932
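
Each F value in the ANOVA table above is the between-groups mean square divided by the within-groups mean square, where a mean square is a sum of squares divided by its degrees of freedom. A short worked check for the Judgement row, using only the figures printed on the slide (the helper name `anova_f` is hypothetical):

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from sums of squares and their df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Judgement row: SS between = 8.438 (df 3), SS within = 502.735 (df 929).
ms_b, ms_w, f_value = anova_f(8.438, 3, 502.735, 929)
print(round(ms_b, 3), round(ms_w, 3), f_value)
```

The mean squares round to the 2.813 and .541 shown in the table, and their ratio agrees with the reported F of 5.197 to within rounding of the inputs.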
Results (cont.)
Rater Differences

Robust Tests of Equality of Means (Brown-Forsythe)
Dimension             Statistic(a)   df1   df2        Sig.
Initiative            .809           2     1001.085   .446
Info Gathering        2.772          2     1014.735   .063
Judgement             6.839          2     1041.068   .001
Providing Direction   15.992         2     771.235    .000
Empowerment           24.900         2     843.165    .000
Control               3.740          2     837.540    .024
In-Basket             7.784          2     945.444    .000
a Asymptotically F distributed.
Results (cont.)
Post Hoc Tests: Judgement

Multiple Comparisons (Dunnett T3)
Dependent Variable: Judgement
(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .145(*)                 .056         .029   .01                  .28
1              3              .203(*)                 .058         .001   .06                  .34
2              1              -.145(*)                .056         .029   -.28                 -.01
2              3              .058                    .055         .641   -.07                 .19
3              1              -.203(*)                .058         .001   -.34                 -.06
3              2              -.058                   .055         .641   -.19                 .07
* The mean difference is significant at the .05 level.
Results (cont.)
Post Hoc Tests: Providing Direction

Multiple Comparisons (Dunnett T3)
Dependent Variable: Providing Direction
(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .143(*)                 .045         .005   .03                  .25
1              3              -.182(*)                .065         .015   -.34                 -.03
2              1              -.143(*)                .045         .005   -.25                 -.03
2              3              -.325(*)                .060         .000   -.47                 -.18
3              1              .182(*)                 .065         .015   .03                  .34
3              2              .325(*)                 .060         .000   .18                  .47
* The mean difference is significant at the .05 level.
Results (cont.)
Post Hoc Tests: Empowerment

Multiple Comparisons (Dunnett T3)
Dependent Variable: Empowerment
(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .023                    .059         .971   -.12                 .16
1              3              -.432(*)                .076         .000   -.61                 -.25
2              1              -.023                   .059         .971   -.16                 .12
2              3              -.455(*)                .077         .000   -.64                 -.27
3              1              .432(*)                 .076         .000   .25                  .61
3              2              .455(*)                 .077         .000   .27                  .64
* The mean difference is significant at the .05 level.
Results (cont.)
Post Hoc Tests: Control

Multiple Comparisons (Dunnett T3)
Dependent Variable: Control
(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .095                    .044         .090   -.01                 .20
1              3              -.044                   .059         .834   -.18                 .10
2              1              -.095                   .044         .090   -.20                 .01
2              3              -.139(*)                .054         .030   -.27                 -.01
3              1              .044                    .059         .834   -.10                 .18
3              2              .139(*)                 .054         .030   .01                  .27
* The mean difference is significant at the .05 level.
Results (cont.)
Post Hoc Tests: In-Basket

Multiple Comparisons (Dunnett T3)
Dependent Variable: In-Basket
(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .056                    .031         .215   -.02                 .13
1              3              -.080                   .037         .096   -.17                 .01
2              1              -.056                   .031         .215   -.13                 .02
2              3              -.135(*)                .033         .000   -.21                 -.06
3              1              .080                    .037         .096   -.01                 .17
3              2              .135(*)                 .033         .000   .06                  .21
* The mean difference is significant at the .05 level.
Results (cont.)
Non-Parametric Correlations

                      Initiative   Info Gathering   Judgement   Providing Direction   Empowerment   Control    In-Basket
Initiative            1.000
Info Gathering        .813(**)     1.000
Judgement             .448(**)     .445(**)         1.000
Providing Direction   .554(**)     .506(**)         .493(**)    1.000
Empowerment           .441(**)     .428(**)         .479(**)    .469(**)              1.000
Control               .491(**)     .535(**)         .419(**)    .431(**)              .400(**)      1.000
In-Basket             .475(**)     .418(**)         .761(**)    .679(**)              .814(**)      .595(**)   1.000
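
The slide labels these coefficients only as non-parametric correlations and does not name the specific coefficient. Assuming they are Spearman rank correlations (an assumption; Kendall's tau is the other common choice), the computation can be sketched as a Pearson correlation on average ranks. The ratings in the example are invented for illustration.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical dimension ratings for six ratees (not data from the study).
initiative = [1, 2, 2, 3, 1, 4]
judgement = [2, 2, 3, 4, 1, 4]
print(round(spearman_rho(initiative, judgement), 3))
```

Rank-based coefficients suit ordinal rating scales with many ties, which matches the heavily skewed rating distributions reported earlier.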
Discussion
Clear Regional, Age and Tenure Differences Do Exist among Participants
Possible Sources of the Differences:
Regional Administration of In-Basket
Thus Differences in Administration Medium (Potosky, 2008)
  o Different Administrators (Explaining Purpose; Giving Instructions; Answering Questions)
  o Different Resources
  o Different Test Room Conditions
Differences Between Participants Regionally:
English Language Ability (not tested)
Motivation to Participate in the Assessment (not tested)
Differences in Employee Selection Processes as well as Training Opportunities (Burroughs et al., 1973)
Simulation Fidelity (not tested)
Discussion (cont.)
Clear Regional, Age and Tenure Differences Do Exist among Participants
Supporting Findings by Burroughs et al. (1973)
• Age does Significantly Influence AC Performance
• Participants from Certain Departments Perform Better
Discussion (cont.)
Appropriateness of In-Basket for Ratees
• Level of Complexity
• Situation Fidelity
Recommendations:
• Ensure Documented Evidence (Analysis Phase in Design Model)
• Pilot In-Basket on Target Ratees (Design Phase of Design Model)
• Shared Responsibility of Service Provider and Client Organisation
Discussion (cont.)
Context in Which In-Basket Administered
Purpose Communicated
Recommendations:
Ensure Participants (Ratees) and Process Owners Understand and Buy into the Purpose
Discussion (cont.)
Consistent Simulation Administration:
• Instructions Given Consistently
• Interaction with Administrator
• Appropriate Resources Available During Administration
• Test Room Conditions Appropriate for Testing
Recommendations:
• Ensure All Administrators Trained
• Standardise Test Room Conditions
Discussion (cont.)
Rater Differences do Exist
Possible Sources of Rater Differences:
Background (All from a Psychology Background, with Management Experience)
Characteristics such as Personality (Bartels & Doverspike)
Owing to Cognitive Load on Raters
Owing to Differences in Mental Models (Jones & Born, 2008)
Discussion (cont.)
Possible Sources of Rater Differences (cont.):
Training
  o All Received Behaviour Oriented Rater Training
  o Frame of Reference Different
Discussion (cont.)
Recommendations:
Frame of Reference Training on:
Dimensions,
Management-Leadership Behaviour,
Norms
Project Management
Personality Assessment of Raters
Sub-dimension Differences
Questions?
Summary
Found Rater Bias
Need to Research the Source of the Bias
Recommend Frame of Reference Training, Project Management, Communication of Purpose and Administrator Training