Exploring the Equivalence and Rater Bias in AC Ratings Prof Gert Roodt – Department of Industrial Psychology and People Management, University of Johannesburg


Sandra Schlebusch – The Consultants

ACSG Conference

17 – 19 March 2010

Presentation Overview

Background and Objectives of the Study
Research Method
Results
Discussion and Conclusions
Recommendations

Background

Construct Validity has long been a Problem in ACs (Jones & Born, 2008)

Perhaps the Mental Models that the Raters use are Part of the Problem

However, other Factors that Influence Reliability Should not be Neglected

Background Continued

To Increase Reliability, Focus on All Aspects of the Design Model (Schlebusch & Roodt, 2007):

Analysis
Design
Implementation
o Context
o Participants
o Process Owners (Simulation Administrator; Raters; Role-players)

Analysis (International Guidelines, 2009)
o Competencies / Dimensions
o Also Characteristics of Dimensions (Jones & Born, 2008)
o Situations
o Trends/Issues in Organisation
o Technology

Design of Simulations
o Fidelity
o Elicit Behaviour
o Pilot

Implementation
o Context: Purpose
o Participants
o Simulation Administration (Potosky, 2008): Instructions; Resources; Test Room Conditions
o Raters: Background; Characteristics; “What are Raters Thinking About When Making Ratings?” (Jones & Born, 2008)

Sources of Rater Bias

Rater Differences (background; experience, etc.)

Rater Predisposition (attitude; ability; knowledge; skills, etc.)

Mental Models

Objective of the Study

The Focus of this Study is on Equivalence and Rater Bias in AC Ratings

More specifically on:
o Regional Differences
o Age Differences
o Tenure Differences
o Rater Differences

Research Method

Participants (Ratees): Region

Region     Frequency   Percent   Valid Percent   Cumulative Percent
Western          368      34.8            34.8                 34.8
Central          537      50.8            50.8                 85.6
Eastern          152      14.4            14.4                100.0
Total           1057     100.0           100.0

Research Method (cont.)

Participants (Ratees): Age

Age (Recode)         Frequency   Percent   Valid Percent   Cumulative Percent
30 years or less           115      10.9            12.3                 12.3
31 - 40 years              217      20.5            23.3                 35.6
41 - 50 years              268      25.4            28.7                 64.3
51 years or older          333      31.5            35.7                100.0
Total (valid)              933      88.3           100.0
Missing (System)           124      11.7
Total                     1057     100.0
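The Percent, Valid Percent and Cumulative Percent columns in these frequency tables follow directly from the counts. The sketch below recomputes them for the age table. It assumes that Percent uses all 1057 ratees as the base, while Valid Percent and Cumulative Percent use only the 933 ratees with a recorded age. The function name `frequency_table` is illustrative and is not part of the original analysis.

```python
def frequency_table(counts, n_missing=0):
    """Return rows of (label, frequency, percent, valid percent, cumulative percent).

    counts: dict mapping category label -> frequency (valid cases only).
    n_missing: number of cases with no recorded value.
    """
    n_valid = sum(counts.values())
    n_total = n_valid + n_missing
    rows = []
    cumulative = 0
    for label, freq in counts.items():
        cumulative += freq
        rows.append((
            label,
            freq,
            round(100 * freq / n_total, 1),        # base: all cases
            round(100 * freq / n_valid, 1),        # base: valid cases only
            round(100 * cumulative / n_valid, 1),  # running total of valid percent
        ))
    return rows


# Counts taken from the age table in the presentation.
age_counts = {
    "30 years or less": 115,
    "31 - 40 years": 217,
    "41 - 50 years": 268,
    "51 years or older": 333,
}
age_rows = frequency_table(age_counts, n_missing=124)
for row in age_rows:
    print(row)
```

The same function applies to the region and tenure tables by swapping in their counts and missing-case totals.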

Participants (Ratees): Tenure

Years of Service (Recode)   Frequency   Percent   Valid Percent   Cumulative Percent
10 years or less                  363      34.3            38.9                 38.9
11 - 20 years                     106      10.0            11.4                 50.3
21 - 30 years                     196      18.5            21.0                 71.3
31 years or more                  268      25.4            28.7                100.0
Total (valid)                     933      88.3           100.0
Missing (System)                  124      11.7
Total                            1057     100.0


Measurement:

In-Basket Test

Measuring Six Dimensions:

• Initiative
• Information Gathering
• Judgement
• Providing Direction
• Empowerment
• Management Control

Overall In-Basket Rating


Procedure:

Ratings were Conducted by 3 Raters on 1057 Ratees

Observer (Rater)   Frequency   Percent   Valid Percent   Cumulative Percent
1                        370      35.0            35.0                 35.0
2                        378      35.8            35.8                 70.8
3                        309      29.2            29.2                100.0
Total                   1057     100.0           100.0

Results

Initiative

Initiative   Frequency   Percent   Valid Percent   Cumulative Percent
0                   61       5.8             5.8                  5.8
ND                 962      91.0            91.0                 96.8
R                   34       3.2             3.2                100.0
Total             1057     100.0           100.0

Results (cont.)

Reliability Statistics: Initiative

Cronbach's Alpha   N of Items
.556               4

Reliability Statistics: Initiative, by Observer

Observer   Cronbach's Alpha   N of Items
1          .785               4
2          .610               4
3          .621               4
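The reliability tables report Cronbach's alpha for each dimension, overall and per observer. As a reminder of what that coefficient measures, here is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The item scores in the example are made-up toy data, because the presentation does not include item-level ratings, and the function name `cronbach_alpha` is illustrative.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items scored on the same n ratees.

    items: list of k lists; items[i][j] is ratee j's score on item i.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    n = len(items[0])
    if any(len(scores) != n for scores in items):
        raise ValueError("every item must have one score per ratee")

    totals = [sum(ratee_scores) for ratee_scores in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Toy data (hypothetical): two items rated for four ratees.
toy_items = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
]
print(round(cronbach_alpha(toy_items), 3))  # 0.75
```

Computing alpha separately per observer, as in the tables above, would amount to calling the function on the subset of ratees scored by each rater.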


Information Gathering

Information Gathering   Frequency   Percent   Valid Percent   Cumulative Percent
0                              61       5.8             5.8                  5.8
ND                            990      93.7            93.7                 99.4
R                               6        .6              .6                100.0
Total                        1057     100.0           100.0

Reliability Statistics: Information Gathering

Cronbach's Alpha   N of Items
.485               3

Reliability Statistics: Information Gathering, by Observer

Observer   Cronbach's Alpha   N of Items
1          .603               3
2          .355               3
3          .453               3


Judgement

Judgement   Frequency   Percent   Valid Percent   Cumulative Percent
0                  62       5.9             5.9                  5.9
ND                544      51.5            51.5                 57.3
R                 346      32.7            32.7                 90.1
E                 105       9.9             9.9                100.0
Total            1057     100.0           100.0

Reliability Statistics: Judgement

Cronbach's Alpha   N of Items
.813               5

Reliability Statistics: Judgement, by Observer

Observer   Cronbach's Alpha   N of Items
1          .900               5
2          .670               5
3          .770               5


Providing Direction

Providing Direction   Frequency   Percent   Valid Percent   Cumulative Percent
0                            62       5.9             5.9                  5.9
ND                          776      73.4            73.4                 79.3
R                           129      12.2            12.2                 91.5
E                            72       6.8             6.8                 98.3
HE                           18       1.7             1.7                100.0
Total                      1057     100.0           100.0

Reliability Statistics: Providing Direction

Cronbach's Alpha   N of Items
.745               5

Reliability Statistics: Providing Direction, by Observer

Observer   Cronbach's Alpha   N of Items
1          .791               5
2          .478               5
3          .742               5


Empowerment

Empowerment   Frequency   Percent   Valid Percent   Cumulative Percent
0                    62       5.9             5.9                  5.9
ND                  547      51.8            51.8                 57.6
R                   250      23.7            23.7                 81.3
E                   163      15.4            15.4                 96.7
HE                   35       3.3             3.3                100.0
Total              1057     100.0           100.0

Reliability Statistics: Empowerment

Cronbach's Alpha   N of Items
.749               3

Reliability Statistics: Empowerment, by Observer

Observer   Cronbach's Alpha   N of Items
1          .772               3
2          .782               3
3          .764               3


Control

Control            Frequency   Percent   Valid Percent   Cumulative Percent
0                         61       5.8             5.8                  5.8
ND                       811      76.7            76.8                 82.6
R                        126      11.9            11.9                 94.5
E                         38       3.6             3.6                 98.1
HE                        20       1.9             1.9                100.0
Total (valid)           1056      99.9           100.0
Missing (System)           1        .1
Total                   1057     100.0

Reliability Statistics: Control

Cronbach's Alpha   N of Items
.748               5

Reliability Statistics: Control, by Observer

Observer   Cronbach's Alpha   N of Items
1          .788               5
2          .674               5
3          .757               5


Overall In-Basket Rating

Reliability Statistics: In-Basket

Cronbach's Alpha   N of Items
.768               6

Reliability Statistics: In-Basket, by Observer

Observer   Cronbach's Alpha   N of Items
1          .869               6
2          .713               6
3          .695               6


Regional Differences

Robust Tests of Equality of Means

Dimension             Test             Statistic(a)   df1   df2       Sig.
Initiative            Brown-Forsythe   11.567         2     631.991   .000
Info Gathering        Brown-Forsythe   14.755         2     789.232   .000
Judgement             Brown-Forsythe   12.065         2     625.270   .000
Providing Direction   Brown-Forsythe   6.990          2     482.067   .001
Empowerment           Brown-Forsythe   9.205          2     566.078   .000
Control               Brown-Forsythe   3.776          2     484.448   .024
In-Basket             Brown-Forsythe   10.876         2     621.425   .000

a Asymptotically F distributed.
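The Brown-Forsythe statistic in these "Robust Tests of Equality of Means" tables compares group means without assuming equal variances. The following is a minimal sketch of the usual textbook form of that statistic and its degrees of freedom. It is not the software output itself, and the input groups are made-up toy data because the raw ratings are not part of the presentation. The function name `brown_forsythe_means` is illustrative. The p-value (the Sig. column) would additionally require the F distribution, which this sketch does not compute.

```python
from statistics import mean, variance


def brown_forsythe_means(groups):
    """Brown-Forsythe F* test statistic for equality of group means.

    groups: list of lists of scores, one list per group (each with n >= 2).
    Returns (statistic, df1, df2); df2 is the Satterthwaite-type approximation.
    """
    sizes = [len(g) for g in groups]
    total_n = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / total_n
    means = [mean(g) for g in groups]
    variances = [variance(g) for g in groups]  # sample variances

    # Between-group sum of squares, as in a classic one-way ANOVA.
    numerator = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

    # Each group's variance is weighted by (1 - n_j / N) instead of pooling.
    weighted = [(1 - n / total_n) * v for n, v in zip(sizes, variances)]
    denominator = sum(weighted)

    statistic = numerator / denominator
    df1 = len(groups) - 1
    shares = [w / denominator for w in weighted]
    df2 = 1 / sum(c ** 2 / (n - 1) for c, n in zip(shares, sizes))
    return statistic, df1, df2


# Toy data (hypothetical): three equally sized groups with equal variances,
# where F* coincides with the classic ANOVA F and df2 equals N - k.
stat, df1, df2 = brown_forsythe_means([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(stat, df1, df2)
```

With unequal group sizes and variances, as in the regional groups above, df2 is generally fractional, which is consistent with values such as 631.991 in the table.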


Age Differences

Dimension             Test             Statistic   df1   df2       Sig.
Initiative            Brown-Forsythe   6.002       3     770.593   .000
Providing Direction   Brown-Forsythe   4.129       3     725.823   .006





Results (cont.) - Tenure

Tenure Differences

ANOVA

Dimension             Source           Sum of Squares   df    Mean Square   F       Sig.
Initiative            Between Groups   .673             3     .224          2.810   .038
                      Within Groups    74.197           929   .080
                      Total            74.870           932
Info Gathering        Between Groups   .457             3     .152          3.073   .027
                      Total            46.452           932
Judgement             Between Groups   8.438            3     2.813         5.197   .001
                      Total            511.173          932
Providing Direction   Between Groups   2.539            3     .846          1.597   .189
                      Total            495.012          932
Empowerment           Between Groups   3.056            3     1.019         1.179   .317
                      Total            805.685          932
Control               Between Groups   1.067            3     .356          .794    .498
                      Total            417.033          931
In-Basket             Between Groups   1.416            3     .472          2.560   .054
                      Total            172.674          932
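The F values in the tenure ANOVA can be cross-checked from the sums of squares alone, since the within-groups sum of squares is the total minus the between-groups value and F is the ratio of the two mean squares. The sketch below performs that check for two rows of the table. The helper name `anova_f` is illustrative, and small discrepancies from the reported F values are expected because the published sums of squares are rounded to three decimals.

```python
def anova_f(ss_between, df_between, ss_total, df_total):
    """One-way ANOVA F ratio from between-groups and total sums of squares."""
    ss_within = ss_total - ss_between
    df_within = df_total - df_between
    ms_between = ss_between / df_between  # mean square between groups
    ms_within = ss_within / df_within     # mean square within groups
    return ms_between / ms_within


# Values taken from the tenure ANOVA table in the presentation.
f_initiative = anova_f(0.673, 3, 74.870, 932)
f_judgement = anova_f(8.438, 3, 511.173, 932)
print(round(f_initiative, 2), round(f_judgement, 2))  # 2.81 5.2
```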


Rater Differences

Dimension             Test             Statistic   df1   df2        Sig.
Initiative            Brown-Forsythe   .809        2     1001.085   .446
Providing Direction   Brown-Forsythe   15.992      2     771.235    .000






Post Hoc Tests: Judgement

Multiple Comparisons (Dunnett T3); Dependent Variable: Judgement

(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .145(*)                 .056         .029   .01                  .28
1              3              .203(*)                 .058         .001   .06                  .34
2              1              -.145(*)                .056         .029   -.28                 -.01
2              3              .058                    .055         .641   -.07                 .19
3              1              -.203(*)                .058         .001   -.34                 -.06
3              2              -.058                   .055         .641   -.19                 .07

* The mean difference is significant at the .05 level.



Post Hoc Tests: Providing Direction

Multiple Comparisons (Dunnett T3); Dependent Variable: Providing Direction

(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .143(*)                 .045         .005   .03                  .25
1              3              -.182(*)                .065         .015   -.34                 -.03
2              1              -.143(*)                .045         .005   -.25                 -.03
2              3              -.325(*)                .060         .000   -.47                 -.18
3              1              .182(*)                 .065         .015   .03                  .34
3              2              .325(*)                 .060         .000   .18                  .47




Post Hoc Tests: Empowerment

Multiple Comparisons (Dunnett T3); Dependent Variable: Empowerment

(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .023                    .059         .971   -.12                 .16
1              3              -.432(*)                .076         .000   -.61                 -.25
2              1              -.023                   .059         .971   -.16                 .12
2              3              -.455(*)                .077         .000   -.64                 -.27
3              1              .432(*)                 .076         .000   .25                  .61
3              2              .455(*)                 .077         .000   .27                  .64




Post Hoc Tests: Control

Multiple Comparisons (Dunnett T3); Dependent Variable: Control

(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .095                    .044         .090   -.01                 .20
1              3              -.044                   .059         .834   -.18                 .10
2              1              -.095                   .044         .090   -.20                 .01
2              3              -.139(*)                .054         .030   -.27                 -.01
3              1              .044                    .059         .834   -.10                 .18
3              2              .139(*)                 .054         .030   .01                  .27




Post Hoc Tests: In-Basket

Multiple Comparisons (Dunnett T3); Dependent Variable: In-Basket

(I) Observer   (J) Observer   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1              2              .056                    .031         .215   -.02                 .13
1              3              -.080                   .037         .096   -.17                 .01
2              1              -.056                   .031         .215   -.13                 .02
2              3              -.135(*)                .033         .000   -.21                 -.06
3              1              .080                    .037         .096   -.01                 .17
3              2              .135(*)                 .033         .000   .06                  .21




Non-Parametric Correlations

                      Initiative   Info Gathering   Judgement   Providing Direction   Empowerment   Control    In-Basket
Initiative            1.000
Info Gathering        .813(**)     1.000
Judgement             .448(**)     .445(**)         1.000
Providing Direction   .554(**)     .506(**)         .493(**)    1.000
Empowerment           .441(**)     .428(**)         .479(**)    .469(**)              1.000
Control               .491(**)     .535(**)         .419(**)    .431(**)              .400(**)      1.000
In-Basket             .475(**)     .418(**)         .761(**)    .679(**)              .814(**)      .595(**)   1.000
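The presentation does not state which non-parametric coefficient was used for the correlations between dimensions. Assuming it was Spearman's rank correlation (a common choice for ordinal ratings such as these), the following sketch shows how such a coefficient can be computed: both variables are converted to ranks, with tied values sharing their average rank, and the ranks are then correlated with the Pearson formula. The function names and the example data are illustrative only.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical): ratings on two dimensions for four ratees.
print(round(spearman_rho([1, 2, 2, 3], [1, 2, 3, 4]), 3))  # 0.949
```

Tie handling matters for data like these, because most ratees fall into only a few rating categories.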

Discussion

Clear Regional, Age and Tenure Differences Do Exist among Participants

Possible Sources of the Differences:

Regional Administration of In-Basket

Thus Differences in Administration Medium (Potosky, 2008)
o Different Administrators (Explaining Purpose; Giving Instructions; Answering Questions)
o Different Resources
o Different Test Room Conditions

Differences Between Participants Regionally:

English Language Ability (not tested)

Motivation to Participate in the Assessment (not tested)

Differences in Employee Selection Processes as well as Training Opportunities (Burroughs et al., 1973)

Simulation Fidelity (not tested)

Discussion (cont.)

Clear Regional, Age and Tenure Differences Do Exist among Participants

Supporting Findings by Burroughs et al. (1973)

Age does Significantly Influence AC Performance
Participants from Certain Departments Perform Better


Appropriateness of In-Basket for Ratees

Level of Complexity
Situation Fidelity

Recommendations:

Ensure Documented Evidence (Analysis Phase in Design Model)
Pilot In-Basket on Target Ratees (Design Phase of Design Model)
Shared Responsibility of Service Provider and Client Organisation


Context in Which In-Basket Administered

Purpose Communicated

Recommendations:

Ensure Participants (Ratees) and Process Owners Understand and Buy into the Purpose


Consistent Simulation Administration:

Instructions Given Consistently
Interaction with Administrator
Appropriate Resources Available During Administration
Test Room Conditions Appropriate for Testing

Recommendations:

Ensure All Administrators Trained
Standardise Test Room Conditions


Rater Differences do Exist

Possible Sources of Rater Differences:

Background (All from a Psychology Background, with Management Experience)

Characteristics such as Personality (Bartels & Doverspike)

Owing to Cognitive Load on Raters

Owing to Differences in Mental Models (Jones & Born, 2008)


Possible Sources of Rater Differences (cont.):

Training
o All Received Behaviour Oriented Rater Training
o Frame of Reference Different



Recommendations:

Frame of Reference Training on:

Dimensions,

Management-Leadership Behaviour,

Norms

Project Management

Personality Assessment of Raters

Sub-dimension Differences

Questions?

Summary

Found Rater Bias

Need to Research the Source of the Bias

Recommend Frame of Reference Training, Project Management, Communication of Purpose and Administrator Training
