Western Michigan University
ScholarWorks at WMU
Master's Theses Graduate College
8-1984
Psychometric Characteristics of the Behavioral Observation Scale
Gregg Allen Bolt
Follow this and additional works at: https://scholarworks.wmich.edu/masters_theses
Part of the Industrial and Organizational Psychology Commons
Recommended Citation
Bolt, Gregg Allen, "Psychometric Characteristics of the Behavioral Observation Scale" (1984). Master's Theses. 1482. https://scholarworks.wmich.edu/masters_theses/1482
This Masters Thesis-Open Access is brought to you for free and open access by the Graduate College at ScholarWorks at WMU. It has been accepted for inclusion in Master's Theses by an authorized administrator of ScholarWorks at WMU. For more information, please contact [email protected].
PSYCHOMETRIC CHARACTERISTICS OF THE BEHAVIORAL OBSERVATION SCALE
by
Gregg Allen Bolt
A Thesis Submitted to the Faculty of The Graduate College in partial fulfillment of the requirements for the Degree of Master of Arts
Department of Psychology
Western Michigan University
Kalamazoo, Michigan
August 1984
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
PSYCHOMETRIC CHARACTERISTICS OF THE BEHAVIORAL OBSERVATION SCALE
Gregg Allen Bolt, M.A.
Western Michigan University, 1984
Self-, peer-, and supervisor ratings were obtained on 52 psychiatric aides using a Behavioral Observation Scale (BOS). Self ratings showed less leniency error than peer and supervisor ratings. Halo error could not be assessed due to a negative correlation between means and variances. A multitrait-multimethod (MTMM) analysis supported the presence of strong rater bias and significant convergent validity but not discriminant validity. The results of the analyses demonstrated that the ratings obtained from a BOS were not psychometrically superior to other appraisal formats. Questions were raised as to the adequacy of a five point scale, data transformation, and rating scales.
ACKNOWLEDGEMENTS
I would like to express sincere gratitude to the many people involved in the writing of this thesis.
Thanks go to Gerald DeWeerd and the nursing supervisors who were willing to undertake this research project. Norman Peterson deserves special thanks for his undying willingness to advise me to and from Grand Rapids, and for his words of encouragement. A special thanks also goes to Peninnah Miller and Bradley Huitema who provided statistical assistance and consultation. I would also like to acknowledge Dale Brethower and Jack Asher who served as committee members. And to Sheryl, my wife, a special thanks for providing encouragement and support when the obstacles seemed insurmountable.
Gregg Allen Bolt
INFORMATION TO USERS
This reproduction was made from a copy of a document sent to us for microfilming. While the most advanced technology has been used to photograph and reproduce this document, the quality of the reproduction is heavily dependent upon the quality of the material submitted.
The following explanation of techniques is provided to help clarify markings or notations which may appear on this reproduction.
1. The sign or "target" for pages apparently lacking from the document photographed is "Missing Page(s)". If it was possible to obtain the missing page(s) or section, they are spliced into the film along with adjacent pages. This may have necessitated cutting through an image and duplicating adjacent pages to assure complete continuity.
2. When an image on the film is obliterated with a round black mark, it is an indication of either blurred copy because of movement during exposure, duplicate copy, or copyrighted materials that should not have been filmed. For blurred pages, a good image of the page can be found in the adjacent frame. If copyrighted materials were deleted, a target note will appear listing the pages in the adjacent frame.
3. When a map, drawing or chart, etc., is part of the material being photographed, a definite method of "sectioning" the material has been followed. It is customary to begin filming at the upper left hand corner of a large sheet and to continue from left to right in equal sections with small overlaps. If necessary, sectioning is continued again, beginning below the first row and continuing on until complete.
4. For illustrations that cannot be satisfactorily reproduced by xerographic means, photographic prints can be purchased at additional cost and inserted into your xerographic copy. These prints are available upon request from the Dissertations Customer Services Department.
5. Some pages in any document may have indistinct print. In all cases the best available copy has been filmed.
1323913
BOLT, GREGG ALLEN
PSYCHOMETRIC CHARACTERISTICS OF THE BEHAVIORAL OBSERVATION SCALE
WESTERN MICHIGAN UNIVERSITY M.A. 1984
University Microfilms
International 300 N. Zeeb Road, Ann Arbor, MI 48106
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
Chapter
I. INTRODUCTION
Halo Error
Leniency Error
Convergent and Discriminant Validity
II. METHOD
Subjects
BOS Development
Procedure
Analyses
III. RESULTS
IV. DISCUSSION
Data Distribution
Halo Effects
Leniency Effect
MTMM Interpretation
APPENDICES
A. BEHAVIORAL OBSERVATION SCALE FOR PSYCHIATRIC AIDE
B. INSTRUCTIONS FOR BOS
C. ESTIMATES FOR VARIANCE COMPONENTS
REFERENCE NOTES
BIBLIOGRAPHY
LIST OF TABLES
Table
1. Example of BES
2. Example of BOS
3. Example of Graphic Rating Scale
4. Percentage of Raters in Each Category for Items on BOS
5. Means, Variances, One-Way ANOVA, Levene's Test for Equal Variances for Performance Ratings by Rating Source for Each Unit
6. Dunn-Bonferroni Comparison Tests for Performance Ratings
7. Weighted Means for Source Ratings
8. Three-Way Analysis of Variance Summary Table
9. Number and Percentage of Supervisors in Each Category: Latham et al. (1979) Data
LIST OF FIGURES
Figure
1. Mean Source Ratings by Psychiatric Unit
CHAPTER I
INTRODUCTION
The assessment of how well people perform on their jobs has been the focus of considerable research and debate over the past 60 years. The academician and the practitioner have generated literally hundreds of articles suggesting new appraisal systems, revising old ones, and bantering over how to control for rater bias and the like. Why such a furor over appraisal systems? The demand for an effective and efficient method for assessing performance arises out of one or more of the following four purposes: (a) appraisals are the basis of promotion and placement decisions; (b) appraisals are frequently used to determine merit allocation; (c) appraisals are the criterion against which selection devices and training programs are validated; and (d) appraisals are one of the primary sources of performance feedback (Kane & Lawler, 1979). If organizations are to successfully utilize performance appraisals as the data base for personnel decisions, then they must be concerned about how well a given performance appraisal accurately reflects actual performance. In light of this concern, the purpose of this study was to explore some of the psychometric properties of a relatively new appraisal system, the Behavioral Observation Scale (BOS), developed by Latham and Wexley (1977). However, before the specifics of the study are described, the concerns for psychometrically sound performance appraisals warrant further comment.
Despite the continuous flow of research and new state-of-the-art appraisals, many of the currently used appraisal systems fall short of expectations in terms of discriminant and convergent validity, reliability, and freedom from rater bias (Kane & Lawler, 1979). For those who must rely on performance appraisals for personnel decisions, awareness of the limitations of appraisals compounds an already difficult decision making process. Furthermore, legal requirements for appraisals are enforced by the Equal Employment Opportunity Commission (EEOC), the Office of Federal Contract Compliance Programs, and the courts, who demand validity studies for appraisals contributing to adverse impact. Since the time EEOC wrote "Guidelines for Employment Selection Procedures" (1970), which in effect placed legal requirements on "tests and other selection procedures which are used as a basis for any employment decision" (p. 655^3), numerous court cases have been lost by organizations because employers implemented performance appraisals contributing to adverse impact. (For reviews see Cascio & Bernardin, 1981; Schneier, 1978.) As a result of these legal pressures, the demand for psychometrically sound appraisal systems has increased in the working community.
In response to the needs for effective appraisal systems, a number of new models and technical advances in performance appraisals have appeared in the literature over the past 20 years (e.g., Allen & Rosenberg, 1978; Latham & Wexley, 1977; Rosinger, Myers, & Levy, 1982; Smith & Kendall, 1963). One of the new models is the BOS. Within the literature, the BOS is at times referred to as an extension of the Behavioral Anchored Rating Scale (BARS) or the Behavioral Expectation Scale (BES) developed by Smith and Kendall (1963) (e.g., Feldman, 1980; Landy & Farr, 1980). The BES differs from the BARS in that BES behavioral statements are written as expectations rather than as neutral behaviors as with BARS (BARS and BES will be used interchangeably). Both BES and BOS are developed using the Critical Incident Technique (Flanagan, 1959). However, the developers of a BES generate behavioral anchors from the incidents, allocate the behavioral anchors to specific dimensions, and use seven of the anchors to form a Thurstone-type rating scale. A BES example is shown in Table 1.
Table 1
Example of BES
Motivation: Willingness to work hard
7 Employee could be expected to lend help to other employees when own work is finished
6
5 Employee could be expected to organize time to insure completion of tasks
4
3 Employee could be expected to need frequent reminders about tasks at hand
2
1 Employee could be expected to frequently forget to complete work and report unfinished tasks
On the other hand, the developers of the BOS derive behavioral descriptions from the incidents, allocate the descriptions to specific dimensions, and attach a Likert-type scale to each description within each dimension. An example is shown in Table 2. The specific procedures for developing a BOS are found in Latham and Wexley (1981). Latham and Wexley (1981) primarily developed the BOS in order to overcome the problems plaguing graphic rating scales, appraisals based on cost-related outcomes, and the BES.
Table 2
Example of BOS
Motivation
1. Lends help to other staff when needed
Almost Never 1 2 3 4 5 Almost Always
2. Organizes time so all tasks are completed
Almost Never 1 2 3 4 5 Almost Always
3. Forgets to report unfinished tasks
Almost Always 1 2 3 4 5 Almost Never
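The Likert-type format in Table 2 can be illustrated with a minimal Python sketch. It assumes that a dimension score is the unweighted sum of the item ratings, which is a common convention for Likert-type scales and not a procedure stated in this section. The names `MOTIVATION_ITEMS` and `score_bos_dimension` are hypothetical and are used only for illustration.

```python
# Hypothetical encoding of the Table 2 items: (behavior, label at 1, label at 5).
MOTIVATION_ITEMS = [
    ("Lends help to other staff when needed", "Almost Never", "Almost Always"),
    ("Organizes time so all tasks are completed", "Almost Never", "Almost Always"),
    # The anchors are printed in reverse for this negatively worded item,
    # so a 5 still marks the desirable end and no recoding is needed.
    ("Forgets to report unfinished tasks", "Almost Always", "Almost Never"),
]


def score_bos_dimension(ratings, n_items, low=1, high=5):
    """Return the summed rating for one BOS dimension.

    Assumption: the dimension score is the unweighted sum of item ratings.
    """
    if len(ratings) != n_items:
        raise ValueError("one rating is required per behavioral item")
    for rating in ratings:
        if not low <= rating <= high:
            raise ValueError(f"rating {rating} is outside {low}-{high}")
    return sum(ratings)


if __name__ == "__main__":
    # One rater's marks on the three motivation items.
    total = score_bos_dimension([4, 5, 3], n_items=len(MOTIVATION_ITEMS))
    print(total)
```

In contrast, a BES dimension such as the one in Table 1 yields a single value, the scale point of the one anchor endorsed, so there is nothing to sum.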
Graphic rating scales or trait scales have been increasingly criticized by other researchers and unfavorably looked upon by the courts (Borman & Dunnette, 1975; Holley & Field, 1975; Kleiman & Durham, 1981; Kleiman & Faley, 1978; Latham & Wexley, 1977; Schneier, 1978). An example is shown in Table 3. Briefly, traits tend to be ambiguous and cause confusion and misinterpretation by rater and ratee alike. In addition, at the time of evaluation, unless the evaluator knows specifically what behaviors the traits denote, feedback from trait scales can frequently be meaningless or misleading, and subsequently have little impact on future performance. Cascio and Bernardin (1981) suggested, "performance dimensions should be behaviorally based. Avoid abstract trait names in graphic rating scales."
Table 3
Example of Graphic Rating Scale
Unfavorable 1 2 3 4 5 Favorable
Quick Tempered __ __ __ __ __
Stubborn __ __ __ __ __
Intelligent __ __ __ __ __
Firm __ __ __ __ __
Appreciates Me __ __ __ __ __
Despite the lack of support for graphic rating scales, it should be noted that little evidence supports psychometric superiority of behaviorally based appraisals over graphic rating scales. In a study designed to assess utility of three rating instruments, including a trait scale and a BARS, DeCotiis (1977) reported that the three instruments were approximately equal in terms of their resistance to errors of leniency and central tendency. Landy and Farr (1980) in their review on rating tools concluded, "After more than 30 years of serious research, it seems that little progress has been made in developing an efficient and psychometrically sound alternative to the traditional graphic rating scale" (p. 89). Though one may conclude, based on common sense or intuition, that behaviorally based appraisal tools are superior to trait scales, data supporting psychometric superiority have yet to be documented.
Appraisals based on cost-related outcomes have some value, but when used in isolation from other performance data, they can often be misleading and omit relevant performance information. Such cost-related measures typically include economic or cost-related outcomes of the organization (e.g., profits, costs, return on investments). Latham and Wexley (1981) cited the following problems associated with cost-related formats: (a) cost-related measures frequently omit relevant factors for which the ratee should be held accountable; (b) cost-related measures are often difficult to obtain for every employee; (c) cost-related measures can for some employees involve factors beyond their control; (d) cost-related measures can fail miserably in providing specific performance feedback necessary for increasing or maintaining productivity; and (e) cost-related measures can foster a "results-at-all-costs mentality" which can run counter to organizational values and goals (pp. 41-44). It seems that if cost-related measures are to be used, they should be carefully scrutinized to reflect only the factors under the control of the ratee and be used as complementary data for behaviorally based data. Latham, Fay, and Saari (1979) supported the need for behaviorally based data, for without it, "it may be easy to determine whether an employee is or is not meeting a set of objectives, but the answer(s) to the question(s) of how and why can remain elusive" (p. 300).
The BARS has received a considerable amount of attention in the literature since its development by Smith and Kendall (1963). (For reviews see Landy & Farr, 1980; Schwab, Heneman, & DeCotiis, 1975.) It was the intention of Smith and Kendall to develop a behaviorally based rating scale derived from a complete job analysis; hence, the scale would take into account all critical behaviors of a job and be specific enough to avoid the confusion of ambiguous trait names. In addition, the scale could encompass cost-related measures. The BARS was expected to provide psychometrically sound and specific performance feedback. Unfortunately, the BARS has fallen short of original expectations. Studies that have set out to support the psychometric properties of the BARS have reported equivocal results (Bernardin, 1977; Bernardin, Alvares, & Cranny, 1976; Borman & Dunnette, 1975; Campbell, Dunnette, Arvey, & Hellervik, 1973; Kingstrom & Bass, 1981; Landy & Farr, 1980; Schwab et al., 1975; Shapira & Shirom, 1980). Borman and Vallon (1974) reported that after developing a BARS in one setting and using it in another setting, the effectiveness of the appraisal decreased. Subjectivity in categorizing the anchors on a BES is cited as a problem by Latham and Wexley (1981), since nonindependent categories may result in redundancy. Finally, Borman (1979) and Landy and Farr (1980) argued that frequently a rater using BARS has problems discerning the similarity between anchors on the scale and actual performance, which may result in significant rating errors and poor validity. The final question raised by Landy and Farr (1980) concerns whether or not the benefits outweigh the costs of developing a BARS. This appears to be a legitimate concern in light of BARS limitations.
In order to overcome the limitations of BARS, and yet retain the practical and legal advantages of a behaviorally based rating scale, Latham and Wexley (1977) developed the BOS. In a discussion that cited four disadvantages with BES usage, Latham and Wexley (1981) argued that such limitations do not occur with BOS usage. First, "endorsement of an incident above the neutral point on the BES implies endorsement of all other incidents between the incident checked and the neutral point" (p. 63). The rater using the BOS is allowed to evaluate the ratee on all relevant behaviors within a behavioral dimension, whereas the rater using BES is forced to evaluate the ratee on an entire behavioral dimension with a single endorsement. Problems occur when the rater cannot endorse items between the neutral point and the behavioral item endorsed. This problem does not occur with BOS usage.
Second, "the subjective definition of 'critical' is minimized in the generation of the behavioral items for BOS" (Latham & Wexley, 1981, p. 63). In the process of BES development, only those items judged to be "critical" are retained for anchors on the rating scale, thereby increasing the chances of significantly reducing content validity. Because all behavioral items that are not redundant are retained for the BOS, content validity is not jeopardized in BOS development.
Third, "in using BES, standard or normal behaviors may not be remembered in the same way as unusual or unique behaviors" (Latham & Wexley, 1981, p. 64). In order to overcome this problem the BES user must systematically record performance on normal, routine behavior. The recording procedure could easily become a time consuming task if the relevant behaviors are unknown. BOS eliminates this problem because it serves as a checklist for both the rater and ratee; irrelevant behaviors are ignored.
Finally, Latham and Wexley suggested that the range of behaviors on a single dimension may be biased by the judges who develop it. Atkin and Conlon (1978) conducted research on the Thurstone scale and found that when judges believe one dimension is significantly more important than others, they will describe few acceptable behaviors, many unacceptable behaviors, and almost no neutral behaviors. Problems of this sort are avoided if one uses a BOS. On the BOS a rater is simply required to rate the frequency of behavior observed; all relevant behaviors are found on the scale. Although Latham and Wexley (1981) argued for the superiority of the BOS, most of their arguments were based on logic rather than research data. In assessing the psychometric characteristics of the BOS, Latham and Wexley did not argue that the BOS was superior to the BES, but they did suggest that the scale satisfied EEOC requirements and standards.
The studies that supported Latham and Wexley's contention that the BOS was satisfactory both in terms of reliability and validity for assessing performance were, surprisingly, based on the same data set. Latham and Wexley (1981) wrote: "In previous studies (Latham & Wexley, 1977; Latham, Wexley & Rand, 1975; and Ronan & Latham, 1974) the test-retest and interobserver reliability, as well as the validity of the BOS in indicating employee attendance and productivity, were demonstrated" (p. 63). Although a complete analysis of each study is beyond the scope of the present study, it should be pointed out that the three studies cited in the above quote have hypotheses supported by the same data set taken from performance of loggers in the Southeastern United States. From this, one might conclude that at best the reliability and validity of the BOS appear promising, but further research is necessary before more conclusive statements can be made.
In addition, Latham and Wexley (1981) contended that the BOS typically satisfies EEOC standards in terms of content validity and interjudge agreement of categorization. These standards will most likely be met if the procedure that Latham and Wexley (1977, 1981) described for BOS development is followed.
With respect to rater bias, Latham and Wexley (1981) suggested that bias was minimized, "because observers do not have to extrapolate from what they have observed to the placement of a checkmark beside an example on the scale that may or may not be appropriate" (p. 63). Empirical support of this final contention has yet to be documented.
The need for further research on the psychometric properties of the BOS is evident. From a psychometric standpoint, the BOS has received only indirect criticism. The criticism focused on behaviorally based performance scales has frequently been directed at the BARS or BES and only indirectly at the BOS (e.g., Landy & Farr, 1980). The contention that the BOS is an extension of the BES does not necessarily allow one to argue that the psychometric properties of the BES are synonymous with those of the BOS, because the psychometric properties of the Thurstone scale are not synonymous with those of the Likert scale. Independent research on the psychometric characteristics of the BOS is therefore necessary in order that the user of the BOS may be assured of its effectiveness. The research already completed on halo error, leniency error, and convergent and discriminant validities will be reviewed.
Halo Error
Halo error has been defined by Holzbach (1978) as a bias in ratings that occurs when a rater evaluates an individual on various items and dimensions without differentiating among them, but instead evaluates the ratee according to a single global or overall judgment. A second usage of halo error was offered by Cooper (1981); he wrote, "Salient features affect the ratings of categories that the rater believes are related to the salient features" (p. 218). Both definitions, though conceptually different, are operationalized similarly. In both cases, the raters are depicted as tarnishing the ratings by evaluating the ratee in light of a global evaluation or of some salient feature(s).
In a review entitled "Ubiquitous Halo," Cooper (1981) identified two forms of halo error that occur in all rating scales: illusory halo and true halo. Illusory halo is what one generally thinks of as halo error; it is the bias in ratings that most appraisal users wish to avoid. True halo is operationalized as the true correlations that exist between dimensions. Any correlation between two dimensions on an appraisal will consist of some illusory halo and some true halo. The correlation coefficient is the sum of true halo and illusory halo. Both types of halo warrant review.
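The additive relation described above (observed correlation = true halo + illusory halo) can be sketched in Python. The sketch assumes that halo is indexed by the mean correlation between dimension ratings, which is one common operationalization and not necessarily the one used in this study, and that an estimate of the true correlation is available from some outside source. The function names and example ratings are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when ratings do not vary")
    return sxy / sqrt(sxx * syy)


def mean_interdimension_r(ratings_by_dimension):
    """Average correlation over all pairs of dimensions.

    ratings_by_dimension maps a dimension name to a list of ratings,
    one per ratee, with ratees in the same order for every dimension.
    """
    pairs = list(combinations(ratings_by_dimension, 2))
    total = sum(
        pearson(ratings_by_dimension[a], ratings_by_dimension[b])
        for a, b in pairs
    )
    return total / len(pairs)


def illusory_halo(observed_r, true_r):
    """Illusory halo as the part of the observed correlation beyond true halo.

    Assumption: the additive relation observed_r = true_r + illusory part.
    """
    return observed_r - true_r


if __name__ == "__main__":
    # Made-up ratings of five ratees on two dimensions, for illustration only.
    example = {
        "motivation": [4, 5, 3, 2, 4],
        "teamwork": [4, 4, 3, 2, 5],
    }
    print(mean_interdimension_r(example))
```

The difficulty in practice is that the true correlation between dimensions is rarely known, so the illusory component usually cannot be isolated from observed ratings alone.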
Five sources of illusory halo have been identified by Cooper (1981):
undersampling, engulfing, insufficient concreteness, insufficient rater
motivation and knowledge, and cognitive distortions. Undersampling as a
source of halo error occurs when the rater has insufficient information
on the ratee's behavior; therefore, the rater is forced to rely on a
global impression or a few salient features to make each rating
decision.
Engulfing traces halo error to the rater's belief that categories
covary with global impressions or salient features.
Halo error that is attributable to insufficient concreteness occurs
when raters base their ratings on salient features because raters are
unable to differentiate item dimensions. An implication of this theory
is that if rating dimensions and rating items are highly descriptive as
opposed to abstract, halo error will be reduced. Empirical evidence for
this has been equivocal. Cooper (1981) found evidence to support the
hypothesis that halo error is reduced with highly descriptive and
concrete rating items. Finley, Osburn, Dubin, and Jeannert (1977)
reached no firm conclusions on the effects of general and specific
anchors on their rating scale.
A fourth source of illusory halo attributes biased ratings to one's
inability or unwillingness to sensitize oneself to committing halo
errors. To remove this source of halo error, attempts have been made to
train raters to reduce illusory halo. Some methods of training have
been more successful than others; however, mixed results have been more
prevalent (e.g., Borman, 1979; Fay & Latham, 1982; Thorton & Zorich,
1980; Warmke & Billings, 1979; Zedeck & Cascio, 1982).
A final source of illusory halo occurs as the result of cognitive
distortions. Cooper (1981) argued that stored observations become
distorted over time as one adds and deletes information in the
cognitive process. Essentially, detail is lost and beliefs about
dimension covariance are added. As in the typical organization with an
annual review, raters must recall an individual's performance from the
past year; actual behaviors cannot be recalled, but impressions
resulting from cross-dimension correlations are recalled. If the
cross-dimension correlations overstate true covariance, illusory halo
results. Cooper wrote, "This fifth source has been unappreciated in the
halo-reduction literature" (p. 221).
For those who must rely on rating scales for performance reviews, the
presence of true halo in addition to illusory halo makes ratings
difficult to interpret. The premise that true correlations exist
between dimensions has been supported in the literature (Cooper, 1983;
Fay & Latham, 1982; Murphy, 1982). Cooper (1981) argued that the
abilities to perform a specific job are frequently more homogeneous
than heterogeneous. Cooper concluded that although a job may possess
various duties, the skills and abilities to perform those duties are
often dependent and correlated, resulting in true halo on performance
appraisals. For the researcher as well as the employer of rating
scales, the implication is that in order to assess what is true halo
and what is illusory halo, actual between-dimension correlations must
be computed. Murphy (1982) agreed and wrote, "Unless the researcher has
some independent estimate of true correlations among performance
dimensions and of the correlations between performance appraisal items
and overall evaluations, it is simply not possible to cite the observed
correlations as evidence of rating error" (p. 162).
The evidence that supported the existence of true and illusory halo was
convincing and was taken into consideration when assessing halo error
in the present study. No attempt was made to suggest a higher or lower
magnitude of halo error using the BOS as compared to other appraisal
systems. The present study was designed to assess differences in the
magnitude of halo error across raters (self, peer, and supervisor)
using the BOS. It was concluded, then, that the distinction between
illusory halo and true halo would make no difference in the results of
the present study.
As suggested above, the present study was concerned with halo errors as
they occurred across raters; a significant amount of research has been
conducted which addresses these issues, although none has been found
which uses the BOS. (See Landy & Farr, 1980, for review.) Studies which
examined differences in halo error across the role of the rater
reported equivocal results. Thorton (1980) reviewed the literature on
psychometric properties of self-appraisals and found 12 studies
reporting higher incidence of halo for self-ratings vs. peer- and
supervisor-ratings; however, he also found 10 studies where
self-appraisals manifested less halo than comparison groups. The
differences in halo error in peer-ratings vs. supervisor-ratings have
also been studied. Klimoski and London (1974) reported a greater degree
of halo error in peer-ratings, whereas Holzbach (1978) found similar
degrees of halo in peer-ratings and supervisor-ratings.
Researchers have suggested various hypotheses to explain halo error
differences among raters. One hypothesis states that raters who occupy
different roles in an organization view any one particular job from
different vantage points (Borman, 1974; Holzbach, 1978; Schneier &
Beatty, 1978; Zedeck, Imporato, Krausz, & Oleno, 1974). Each rater in
this case may have different expectations for an individual's
performance based on their (the raters') own jobs and experiences.
Schneier and Beatty (1978) hypothesized that "differences in job duties
and proximity, causing differing frequencies and/or duration of
observation or ratee performance, could account for divergent ratings
given by, for example, superiors and peers" (p. 130).
If the problem of interpretation of appraisal dimensions and items
exists for raters occupying different vantage points, then one might
expect that if the appraisal items were very specific and behavioral in
nature, then every rater from each vantage point should interpret the
appraisal item in the same way, thus reducing the differences in halo
error across raters. After examining the studies that assessed halo
error across raters, it was found that many researchers used graphic
rating scales or BARS (Borman, 1974; Heneman, 1974; Holzbach, 1978;
Klimoski & London, 1974; Lawler, 1967; Lee, Malone, & Greco, 1981;
Parker, Taylor, Barret, & Martens, 1959; Schneier & Beatty, 1978;
Zammuto et al., 1982). In comparison, Cooper (1983) reported that by
using very specific, behavioral rating items, halo error was reduced.
If insufficient concreteness promotes halo error, and particularly,
causes raters from different roles to commit different degrees of halo
error, then it can be hypothesized that by utilizing a behavioral
rating scale like the BOS, differences in halo errors across raters
will be virtually nil. The point is that rating scales that consist of
items that are clearly delineated, specific, and measurable should not
be subject to misinterpretation from any vantage point. Therefore, no
differences in halo errors should occur across raters from different
roles in the organization. In the present study, it was hypothesized
that no differences in degree of halo error would occur across
supervisor-, peer-, and self-ratings.
Leniency Error
According to Holzbach (1978), "leniency errors, attributable to
specific rating sources, occur when ratings from different rating
sources on the same ratee group are significantly different" (p. 579).
Latham and Wexley (1981) suggested that positive and negative leniency
errors are committed by employers who rate too easily or too harshly,
respectively. Two problems occur with undue leniency errors: one
measurement problem and one practical problem.
The measurement problem occurs when leniency errors cause undue
restriction of range on the performance ratings, which limits the
magnitude of the possible relationship between the ratings and other
variables of interest (Holzbach, 1978). The practical problem occurs
when the ratee interprets the performance ratings. With positive
leniency errors, the performer will incorrectly assume adequate
performance and continue with poor performance. The appraisal
contaminated with negative leniency errors will incorrectly reflect
poorer performance than actually occurs. This performer may be deprived
of rewards or promotions that were deserved. In either situation, poor
performance may result with undue leniency errors.
With respect to leniency errors as they occur across raters, the
evidence suggests that self-appraisals are more lenient than either
peer- or supervisor-ratings (Holzbach, 1978; Klimoski & London, 1974;
Meyer, 1980; Parker et al., 1959; Schneier, 1978; Thorton, 1980),
although one study (Heneman, 1974) reported less leniency error for
self-appraisals in comparison to supervisor-appraisals. With respect to
supervisor- and peer-appraisals, two studies (Schneier, 1978; Zedeck
et al., 1974) reported that supervisor-appraisals demonstrated less
leniency error, and one study (Holzbach, 1978) reported no significant
differences.
Very little research has been done to explain why leniency errors
occur. Zammuto et al. (1982) reported that organizational differences
in leniency error occurred for six items on their performance
appraisal, though no conclusions were reached as to why this occurred.
No doubt many of the rater characteristics discussed earlier affect
leniency errors. In addition, leniency errors may be greater simply
because an individual wants to make himself/herself appear competent or
a fellow employee to appear competent. The consequences of the
appraisal then could play an important role; Zedeck and Cascio (1982)
supported this assumption. Consistent with the research reviewed, it
was hypothesized that self-appraisals would demonstrate more leniency
errors than either peer- or supervisor-ratings.
Before the literature is reviewed on convergent and discriminant
validity, a discussion on the operational definitions of halo error and
leniency error is warranted, since some have found that different
operational definitions yield different values of error (e.g., Saal,
Downey, & Lahey, 1980). Saal et al. described four methods of halo
error assessment taken from the literature. Briefly, the methods are
the following: (a) comparison of dimension ratings by examining the
intercorrelations among different dimensions, where higher
intercorrelations suggest greater halo error; (b) factor analysis of
the dimension intercorrelation matrix, where fewer emerging factors or
principal components are indicative of greater halo error; (c) analysis
of the variance or standard deviations of a rater's ratings of an
individual across each performance dimension, where less variance or
restricted standard deviations suggest greater incidence of halo error;
and (d) rater x ratee x dimension ANOVA, where a significant rater x
ratee interaction, especially one that accounts for a sizeable
proportion of the total variance, is interpreted as halo error. The
operational definition used in the present study to assess incidence of
halo error among raters is the third definition, examination of rater
variance. Although Saal et al. (1980) criticized all four definitions
as poor indicators of absolute halo, for the present purposes of
comparing halo error among raters this definition will suffice.
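
The third definition can be illustrated with a short computation. The
following is a minimal sketch, not the procedure actually run in the
study: the function names and the ratings are hypothetical, and the
population standard deviation is an assumed choice. It computes the
spread of each ratee's ratings across items for one rater source; a
smaller average spread would be read as more halo.

```python
from statistics import mean, pstdev

def within_ratee_sd(ratings):
    """Standard deviation of one ratee's ratings across items (one source)."""
    return pstdev(ratings)

def mean_within_ratee_sd(rating_matrix):
    """Average the across-item SD over all ratees rated by one source."""
    return mean(within_ratee_sd(row) for row in rating_matrix)

# Hypothetical data: rows are ratees, columns are items on a 1-5 scale.
self_ratings = [[5, 4, 5, 3, 4], [4, 2, 5, 3, 4]]
supervisor_ratings = [[4, 4, 4, 4, 4], [3, 3, 4, 3, 3]]

self_spread = mean_within_ratee_sd(self_ratings)
supervisor_spread = mean_within_ratee_sd(supervisor_ratings)

# Under this definition, the source with the smaller spread shows more halo.
print(self_spread, supervisor_spread)
```

In this made-up example the supervisor ratings vary less across items
than the self-ratings, which the third definition would interpret as
greater halo for supervisors.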
According to Saal et al. (1980), three operational definitions exist
for assessing leniency error. The first definition, the most popular
one, is to compare the mean dimension ratings with the midpoint of the
scale. Mean ratings that significantly exceed the midpoint of the scale
reflect leniency, whereas mean ratings that are below the midpoint of
the scale reflect severity. The second definition suggests a rater x
ratee x dimension ANOVA. A rater main effect, especially one that
accounts for a large proportion of the total variance, is said to
reflect leniency. Finally, Saal et al. noted that a few studies
examined the degree of skewness of dimension ratings for evidence of
leniency. A significant negative skewness of dimension ratings is said
to reflect leniency, whereas a significant positive skewness is said to
reflect severity. A problem with assessing leniency is that without
actual performance data, no absolute degree of leniency can be
determined. The present study operationally defined leniency as present
when supervisor-, peer-, and self-mean ratings differed significantly
across behavioral items. Interest was in examining incidence of
leniency, not in assessing absolute leniency.
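
The first and third definitions lend themselves to a brief numerical
illustration. The sketch below is an assumption-laden example rather
than anything taken from the study: the ratings are invented, the
function names are hypothetical, and no significance test is performed.
It shows the difference between a mean rating and the midpoint of a
1-to-5 scale, and the moment-based skewness of the same ratings.

```python
from statistics import mean

def midpoint_difference(values, low=1, high=5):
    """Mean rating minus the scale midpoint; positive values suggest leniency."""
    return mean(values) - (low + high) / 2

def skewness(values):
    """Population (moment) skewness: third central moment over SD cubed."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical ratings on a 1-5 scale, clustered near the top of the scale.
ratings = [5, 5, 4, 5, 3, 5, 4, 5, 2, 5]

print(midpoint_difference(ratings))  # above zero: mean exceeds the midpoint
print(skewness(ratings))             # below zero: negatively skewed
```

Both indices point the same way for these invented ratings, but neither
says anything about absolute leniency without true performance data.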
Convergent and Discriminant Validity
Since Campbell and Fiske (1959) first introduced the multitrait-
multimethod analysis (MTMM) as a means to assess convergent and
discriminant validity, a significant amount of research has been done
to assess these validities of appraisal scales. (For reviews, see
Holzbach, 1978; Lee, Malone, & Greco, 1981; Kavanagh, MacKinney, &
Wolins, 1971.) Included in that research were investigations reporting
on the discriminant and convergent validity of the ratings obtained
with BARS. A few studies (Dickenson & Tice, 1973; Zedeck & Baker, 1972)
have reported little of either validity. In contrast, Friedman and
Cornelius (1976) found evidence of convergent validity and less halo
when participants were active in BARS construction. In addition, Lee,
Malone, and Greco (1981), using MTMM, found good convergent and
discriminant validity using a summated rating scale. No research was
found assessing discriminant validity and convergent validity of a BOS.
Before the present hypothesis is proposed, it is worthwhile to review
the meaning and value of both convergent and discriminant validity.
Convergent validity has been defined by Holzbach (1978) as "the extent
of agreement between two or more measures of the same trait using
different methods" (p. 580). Discriminant validity has been defined by
the same author as "the extent of independence between measures of
different traits" (p. 580). Although the definitions of convergent and
discriminant validity are fairly straightforward, the value of
assessing them is more obscure. Lawler (1967) wrote, "the primary gain
from a research point of view is that this approach [MTMM] allows the
researcher to develop a much more sophisticated understanding of his
criteria than is possible where it is not employed" (p. 372). Part of
this understanding, as explained by Lawler, comes about through
determining convergent and discriminant validity. Unfortunately, it
seems that, like Lawler, numerous researchers have assumed that more
information is better than none, since many of the studies reviewed by
the present researcher never mentioned why discriminant and convergent
validity were being assessed.
Why convergent and discriminant validity are valuable must be
determined so that the results of any study assessing them can be put
into proper perspective. First, Campbell and Fiske (1959) wrote,
"validation is typically convergent, a confirmation by independent
measurement procedures. Independence of methods is a common denominator
among the major types of validity (excepting content validity) insofar
as they are to be distinguished from reliability" (p. 81). Although
convergent validity cannot and does not evaluate validity in the
absolute sense, evidence of convergent validity (correlations between
the same items or dimensions as rated by various raters being
significantly larger than zero) indicates that raters are rating the
same construct or behavior, as is the case with a BOS. Therefore, with
a BOS, significant convergent validity would suggest that raters (peer,
self, and supervisor) are rating the same construct or behavior.
Second, Campbell and Fiske (1959) wrote, "for justification of novel
trait measures, for the validation of test interpretation or for the
establishment of construct validity, discriminant validation as well as
convergent validation is required. Tests can be invalidated by too high
correlations with other tests from which they were intended to differ"
(p. 81). In the same way, behavioral items on a BOS can be invalidated
by too high correlations with other items from which they were intended
to differ. If the BOS used in the present study was found to possess
some significant degree of discriminant validity, then the present BOS
could be said to differentiate among behavioral items. Whether or not
differentiation was accurate and meaningful cannot be argued without
additional "true" performance data. All that could be argued is that
the behavioral items do discriminate performance among ratees and/or
within one ratee in an orderly fashion.
It should be emphasized again at this time that even if a BOS was found
to possess significant degrees of convergent and discriminant validity,
direct inferences about the accuracy or criterion-related validity of
the items could not be made. It is possible that a BOS could possess
both types of validity for assessing performance, yet be invalid as a
measure of "true" performance. True performance data are needed to make
true validity inferences. This observation about convergent and
discriminant validity was further delineated and supported by Lawler
(1967).
Similar to the purposes of the present study, other investigations that
have utilized the MTMM approach to assess convergent and discriminant
validity for combinations of supervisor-, self-, and peer-ratings have
generally reported support for convergent validity and little or no
support for discriminant validity (Heneman, 1974; Kavanagh et al.,
1971; Klimoski & London, 1974; Lawler, 1967; Lee, Malone, & Greco,
1981). Lack of discriminant validity has been primarily attributed to
the occurrence of large halo effects. One observation that can be made
regarding these studies is that the appraisal scales used were graphic
rating scales or BARS. As pointed out earlier, using a summated rating
scale with specific items produced good convergent and discriminant
validity (Lee et al., 1981).
In the present study, the discriminant validity and convergent validity
of the BOS to measure performance were explored using the MTMM. The
multitraits were the behavioral items on the BOS. The multimethods were
the rater sources: peers, self, and supervisors. It was hypothesized
that since the items were very concrete and specific, significant
convergent validity and discriminant validity would be found.
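
To make the trait-by-method layout concrete, the following is a minimal
sketch in the correlational style of Campbell and Fiske rather than the
ANOVA approach adopted for the analyses. All names and ratings are
hypothetical. It correlates two rater sources on the same item (a
convergent validity value) and on different items (a value that should
be lower if the items discriminate).

```python
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical ratings: ratings[source][item] lists scores for the same
# four ratees, in the same order.
ratings = {
    "self":       {"item_1": [5, 4, 3, 2], "item_2": [2, 5, 3, 4]},
    "supervisor": {"item_1": [4, 4, 2, 1], "item_2": [1, 5, 2, 4]},
}

# Same item, different sources: a validity-diagonal entry of the matrix.
convergent = pearson(ratings["self"]["item_1"], ratings["supervisor"]["item_1"])
# Different items, different sources: expected to be lower than the above.
heterotrait = pearson(ratings["self"]["item_1"], ratings["supervisor"]["item_2"])

print(convergent, heterotrait)
```

With many items and three sources, the number of such comparisons grows
quickly, which is the practical motivation for an ANOVA summary of the
same information.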
In review, the other two hypotheses previously proposed are that
(a) no differences in degree of halo error would occur across rater
sources; and (b) self-ratings would demonstrate more leniency error
than peer-ratings or supervisor-ratings.
CHAPTER II
METHOD
Subjects
Subjects for BOS development were 29 male psychiatric aides (aides) and
1 female aide randomly selected from a population of 95 aides from a
psychiatric facility in Western Michigan.
Subjects who provided data for hypotheses testing included 49 male
aides and 2 female aides, six nursing supervisors, and an undetermined
number of peers, both aides and nurses (RN's), providing 156 peer
ratings. (The number of peers could not be determined, since some peers
rated more than one aide and raters' names were kept anonymous.)
BOS Development
The procedure followed for BOS development closely resembled the
procedure outlined by Latham and Wexley (1981). From a computer-
generated list of 95 aides, 30 aides were randomly selected using a
table of random numbers. From each of the six psychiatric subunits, a
proportion of aides was selected equal to the proportion of aides on
the specific unit to the total number of aides. The critical incident
technique developed by Flanagan (1959) was utilized to collect ten
critical incidents from each aide: five incidents that described
effective behavior and five incidents that described ineffective
behavior. (Further clarification of effective and ineffective incidents
can be found in Latham and Wexley, 1981.)
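
The proportional selection of aides described above can be sketched as
follows. This is an illustration only: the unit sizes are hypothetical
(the source reports only the totals of 95 and 30), a seeded
pseudo-random generator stands in for the table of random numbers, and
rounding each quota is an assumed rule that need not sum exactly to the
target for other unit sizes.

```python
import random

def proportional_sample(units, total_sample, seed=0):
    """Draw from each unit in proportion to its share of the population."""
    rng = random.Random(seed)
    population = sum(len(members) for members in units.values())
    sample = {}
    for unit, members in units.items():
        # Quota is the unit's share of the population, rounded to a whole aide.
        quota = round(total_sample * len(members) / population)
        sample[unit] = rng.sample(members, quota)
    return sample

# Hypothetical unit sizes summing to 95 aides across six units.
unit_sizes = [19, 19, 19, 19, 10, 9]
units = {
    f"unit_{i}": [f"aide_{i}_{j}" for j in range(size)]
    for i, size in enumerate(unit_sizes, start=1)
}

selected = proportional_sample(units, total_sample=30)
print({unit: len(aides) for unit, aides in selected.items()})
```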
A total of 300 incidents was collected. After initial screening for
redundancy and ambiguity, a list of 167 incidents was obtained.
Since each of the six units treated different types and ages of
patients, aide job duties varied, necessitating the development of five
slightly modified BOS's. The relevancy of the 167 incidents to each
unit was judged by the respective nursing supervisor. Each supervisor
edited the list for relevant items and appropriate medical jargon. A
total of 92 incidents remained.
The next step was to determine overall job categories or behavioral
criteria under which the incidents would be grouped (e.g., work habits,
staff interactions, communication). Nine of the ten criteria selected
were chosen from the existing aide job model. Two aides, one nurse, and
the researcher collectively assigned the incidents to the broader
behavioral categories. Eighteen items did not seem to fit under the
given nine criteria, so Work Habits was selected as a tenth criterion.
Content validity was assessed in two ways. First, it was checked to
ensure that each accomplishment listed in the job description was
represented by an incident. No items were added or deleted. Second, a
completed BOS was sent to each supervisor, and they were asked to add,
delete, and/or edit items to make the appraisal job relevant for the
respective unit. Supervisors deleted 4 to 26 items.
From the corrections, five Behavioral Observation Scales were
constructed with an item range of 66 to 88. All five BOS's retained the
10 general behavioral criteria. For items representing effective
behavior, (1) almost never and (5) almost always served as anchors on
the rating scale. For items representing ineffective behaviors,
(1) almost always and (5) almost never served as rating anchors.
Therefore, a 5 always represented superior performance. Percents
defining the values of 1 to 5 can be found in Appendix B as well as the
instructions for completing the BOS.
Procedure
Over a five-month period, evaluations of aide performance were
completed. Three sources provided evaluations: peers, self, and
supervisors. With respect to the peer evaluations, 25 aides were
permitted to choose between nurses and/or aides to complete their
evaluations; 27 other aides had peers assigned to rate them by their
supervisors. The number of peers reporting data on a single aide varied
from two to nine. For the analyses, peer ratings were averaged into a
single rating. All ratings were recorded on computer scoring cards.
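
The averaging of peer ratings can be illustrated with a small example.
This is a sketch under the assumption that averaging was done item by
item for each aide; the source does not spell out the exact procedure,
and the function name and ratings are hypothetical.

```python
from statistics import mean

def average_peer_ratings(peer_ratings):
    """Collapse several peers' item ratings of one aide into one rating per item."""
    # zip(*...) regroups the scores so that each tuple holds one item's ratings.
    return [mean(item_scores) for item_scores in zip(*peer_ratings)]

# Hypothetical ratings of one aide by three peers on three items (1-5 scale).
peers_for_one_aide = [
    [5, 4, 3],
    [4, 4, 5],
    [3, 4, 4],
]

averaged = average_peer_ratings(peers_for_one_aide)
print(averaged)
```

Averaging gives each aide a single peer profile comparable to the single
self and supervisor profiles, regardless of whether two or nine peers
responded.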
Analyses
The analyses were computed using the data on behavioral items, those
found common to the five BOS's. (See Appendix A for the 48-item BOS.)
Prior to running the data analyses for the assessment of halo error,
leniency error, and convergent and discriminant validity, histograms
were plotted for each of the five psychiatric units (units) across the
three rater sources in order to assess for normality of the data
distribution. A normal distribution of data has been shown to be a
necessary assumption of the ANOVA model (Hopkins & Glass, 1978). A
total of 15 histograms was plotted.
In order to assess convergent and discriminant validity and rater bias
for the five units taken collectively, the ANOVA technique described by
Kavanagh et al. (1972) and Stanley (1961) was utilized. By using a
three-way factorial design, Kavanagh et al. found that very large MTMM
matrices could be analyzed with considerably less effort. For example,
if the present study assessed convergent and discriminant validity by
comparing intercorrelations, as was the technique described by Campbell
and Fiske (1959), 13,000 intercorrelations would have to be examined
and compared! The alternative design allowed for the assessment of
convergent validity by testing for significant main effects across
aides and for the assessment of discriminant validity by testing for a
significant interaction between aides and behavioral items in a
3 x 48 x 52 factorial design, where there were 3 rater sources, 48
behavioral items, and 52 aides.
Included in the ANOVA described by Kavanagh et al. was an assessment
procedure for rater bias, operationalized as a significant interaction
between aides and rater sources. Some researchers have incorrectly
referred to rater bias as halo error (e.g., Kavanagh et al., 1971;
Holzbach, 1978; Lee et al., 1981). It should be pointed out that rater
bias may be due to halo error, but that a significant interaction
between aides and rater sources (rater bias) may also be due to
leniency error or some other systematic rater bias. Therefore, for
comparison purposes with other literature, rater bias was calculated,
though it was not considered a measure of halo error.
Differences in halo error among rater sources for each of the
five units were compared using Levene's Test for equal variances.
Less variance for any one rater source for the 48 behavioral items
was indicative of halo error.
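As a minimal sketch of the comparison described above, Levene's Test can be computed as a one-way ANOVA on the absolute deviations of each rating from its group mean. The ratings below are hypothetical, and the function names are illustrative rather than taken from the thesis.

```python
from statistics import mean


def one_way_f(groups):
    """Return the F-ratio of a one-way ANOVA for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def levene_f(groups):
    """Levene's statistic: one-way ANOVA on absolute deviations from group means."""
    deviations = []
    for g in groups:
        m = mean(g)
        deviations.append([abs(x - m) for x in g])
    return one_way_f(deviations)


# Hypothetical ratings on a 1-5 scale from two rater sources.
spread_out = [5, 5, 4, 5, 3, 4, 5, 2, 5, 4]
bunched_up = [5, 5, 5, 5, 5, 5, 4, 5, 5, 5]  # very little variance across items
levene_statistic = levene_f([spread_out, bunched_up])
```

A large statistic indicates unequal variances between the sources; under the operational definition above, the source with the smaller variance would be the one suspected of halo error.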
Utilizing a one-way ANOVA, leniency error was operationalized as
a significant rater mean difference. Leniency error was assessed for
each unit. Given a significant F-ratio, the Dunn-Bonferroni procedure
(Huitema, 1980) was utilized to determine which rater source was most
severe and most lenient.
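A one-way ANOVA F-ratio of this kind can be recomputed from group means, variances, and sample sizes alone. The sketch below assumes the standard between/within decomposition and uses the Unit 1 summary statistics reported in Table 5 of the Results chapter; the function name is hypothetical. Small discrepancies from the tabled MS and F are expected because the tabled means and variances are rounded.

```python
def f_from_summary(means, variances, ns):
    """One-way ANOVA from summary statistics.

    Returns (ms_between, ms_within, f_ratio).
    """
    k = len(means)
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ms_between = sum(n * (m - grand_mean) ** 2
                     for n, m in zip(ns, means)) / (k - 1)
    ms_within = sum((n - 1) * v for n, v in zip(ns, variances)) / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


# Unit 1 of Table 5, in the order self, supervisor, peer.
ms_between, ms_within, f_ratio = f_from_summary(
    means=[4.373, 4.284, 4.548],
    variances=[0.669, 0.945, 0.350],
    ns=[402, 401, 411],
)
```

The recomputed values land close to the MS of 7.32 and F of 11.23 reported for Unit 1.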
CHAPTER III
RESULTS
The results obtained from computing the histograms suggested that
the data violate the assumption of normality in the ANOVA model.
Histograms showed that ratings across the three rater sources clustered
toward the upper end of the scale causing a strong negatively skewed
distribution. Table 4, a summary of the 15 histograms computed, shows
the percentage of raters in each rating category.
Table 4
Percentage of Raters in Each Category for Items on BOS

Category     Self      Peer*     Supervisor
   1          1.2%      0.5%       0.42%
   2          1.7%      0.8%       0.68%
   3          7.1%      4.7%       7.3%
   4         31.2%     22.6%      28.3%
   5         59.2%     71.3%      63.4%

* Peer scores were rounded to the nearest whole number.
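The degree of negative skew implied by Table 4 can be quantified directly from the category percentages. The following is a minimal sketch, assuming the five categories are scored 1 to 5 and renormalizing each column because the rounded percentages do not sum to exactly 100; the function name is hypothetical.

```python
def distribution_moments(percentages):
    """Mean, variance, and skewness of a 1..k rating distribution.

    `percentages` lists the share of ratings in categories 1, 2, ..., k.
    """
    total = sum(percentages)
    probs = [p / total for p in percentages]  # renormalize rounded percentages
    categories = range(1, len(probs) + 1)
    mean = sum(c * p for c, p in zip(categories, probs))
    variance = sum(p * (c - mean) ** 2 for c, p in zip(categories, probs))
    third = sum(p * (c - mean) ** 3 for c, p in zip(categories, probs))
    return mean, variance, third / variance ** 1.5


# Percentages copied from Table 4.
TABLE_4 = {
    "self": [1.2, 1.7, 7.1, 31.2, 59.2],
    "peer": [0.5, 0.8, 4.7, 22.6, 71.3],
    "supervisor": [0.42, 0.68, 7.3, 28.3, 63.4],
}
moments = {source: distribution_moments(p) for source, p in TABLE_4.items()}
```

Each source has a mean above 4 and a negative skewness coefficient, which is the pattern the histograms showed.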
Table 5 presents the means and variances for the five units and
the results of the one-way ANOVA and Levene's Test for equal variances.

Table 5
Means, Variances, One-Way ANOVA, Levene's Test for Equal Variances for
Performance Ratings by Rating Source for Each Unit

            Self               Supervisor              Peer              ANOVA       Levene's
Unit    M      V      N      M      V      N      M      V      N      MS       F        F
 1    4.373   .669   402   4.284   .945   401   4.548   .350   411    7.32   11.23*   41.74*
 2    4.262   .898   899   4.489   .378   872   4.490  illeg.  854   14.87   26.54*   87.51*
 3    4.576   .513   323   4.224   .643   317   4.522   .465   324   11.47   21.28*   13.39*
 4    4.491  illeg.  501   4.804   .253   522   4.694   .249   509   12.92   35.86*   85.68*
 5    4.491  illeg.  281   4.785   .228   275   4.599   .551   284    6.16   13.48*   24.17*

* p < .0001
illeg. = value illegible in the source copy.

Halo error in performance ratings was operationally defined as present
when the variances associated with supervisor, peer, and self-ratings
were significantly different or heterogeneous. Significant variance
differences by rating source were found for each of the five units.
After further examination of the results, it was discovered that due
to the strong negatively skewed distributions, a functional relationship
existed between source mean ratings and the respective variances.
Specifically, a significant negative correlation existed between the
rater source means and their respective variances, rendering the variance
differences uninterpretable as halo error (r = -.85, p < .01).
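The mean-variance relationship can be checked against Table 5 with a plain Pearson correlation. The sketch below is illustrative only: it uses the 12 of 15 mean-variance pairs that are legible in Table 5 as reproduced here, so its result need not equal the reported r = -.85, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# (mean, variance) pairs legible in Table 5. The Unit 2 peer, Unit 4 self,
# and Unit 5 self variances are illegible and therefore omitted.
LEGIBLE_PAIRS = [
    (4.373, 0.669), (4.284, 0.945), (4.548, 0.350),  # Unit 1
    (4.262, 0.898), (4.489, 0.378),                  # Unit 2
    (4.576, 0.513), (4.224, 0.643), (4.522, 0.465),  # Unit 3
    (4.804, 0.253), (4.694, 0.249),                  # Unit 4
    (4.785, 0.228), (4.599, 0.551),                  # Unit 5
]
r = pearson_r([m for m, _ in LEGIBLE_PAIRS], [v for _, v in LEGIBLE_PAIRS])
```

On these pairs the correlation is strongly negative, in line with the relationship described above: sources with higher mean ratings show smaller variances.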
Leniency error in performance ratings was operationally defined
as present when mean ratings associated with supervisors, peers, and
self-ratings were significantly different. Table 5 presents the
results of the one-way ANOVA and Table 6 presents the results of the
Dunn-Bonferroni for pair comparisons. The results of the one-way
ANOVA demonstrated significant differences among rater sources for
each psychiatric unit. Dunn-Bonferroni tests demonstrated that for
seven of the eight significant comparisons found between self-ratings
and the other two sources, self-ratings were lower or less lenient
than both peer and supervisor-ratings.
The comparison tests also demonstrated that of the four significant
differences found between peer-ratings and supervisor-ratings,
two peer-ratings from two units were higher or more lenient than their
respective mean supervisor-ratings and two peer-ratings from two other
units were less lenient than their respective mean supervisor-ratings.
No difference between peer-ratings and supervisor-ratings was found
for the fifth unit.
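The logic of a Dunn-Bonferroni pair comparison can be sketched with the Unit 1 figures from Table 5. Several assumptions are made here that the thesis does not state: a familywise alpha of .05, three comparisons per unit, the within-group mean square recovered as MS divided by F, and a normal approximation to the t distribution (reasonable only because the error degrees of freedom exceed 1,000). Function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def bonferroni_critical_z(alpha, n_comparisons):
    """Two-tailed critical value with alpha split evenly across comparisons."""
    return NormalDist().inv_cdf(1 - alpha / (2 * n_comparisons))


def pairwise_t(mean_a, mean_b, n_a, n_b, ms_within):
    """t statistic for the difference between two group means."""
    return (mean_a - mean_b) / sqrt(ms_within * (1 / n_a + 1 / n_b))


# Unit 1 of Table 5: within-group mean square recovered from MS and F.
ms_within = 7.32 / 11.23
critical = bonferroni_critical_z(alpha=0.05, n_comparisons=3)

t_self_vs_peer = pairwise_t(4.373, 4.548, 402, 411, ms_within)
t_self_vs_supervisor = pairwise_t(4.373, 4.284, 402, 401, ms_within)
t_supervisor_vs_peer = pairwise_t(4.284, 4.548, 401, 411, ms_within)
```

A comparison counts as significant when the absolute t exceeds the critical value. Under these assumptions the Unit 1 self-versus-peer and supervisor-versus-peer differences exceed it and the self-versus-supervisor difference does not. Whether this matches the thesis's own Unit 1 results cannot be confirmed from the text reproduced here.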
[Table 6. Results of the Dunn-Bonferroni pair comparisons between rater
sources for each unit. The table values are illegible in the source copy.]

From the results of the data analyses on leniency error, it
appeared as if a relationship or interaction existed between mean
source ratings and psychiatric units. Figure 1 demonstrates that,
indeed, an interaction existed. From the figure it can be seen that
both self-ratings and peer-ratings tend to be more stable than
supervisor-ratings across units.
[Line plot. Vertical axis: mean rating; horizontal axis: Unit; series:
Self-rating, Supervisor-rating, Peer-rating.]
Figure 1. Mean Source Ratings by Psychiatric Unit
In addition to the above analyses, weighted means were calculated
for the three rater sources taking the five units collectively. The
results are presented in Table 7. The mean self-rating was lower than
the peer-rating and the supervisor-rating; whereas, the mean
peer-rating was comparable to the mean supervisor-rating.
Table 7
Weighted Means for Source Ratings

Self-rating     Supervisor-rating     Peer-rating
   4.398              4.523               4.561
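The weighted means in Table 7 can be approximately recovered from the unit means and sample sizes in Table 5. The sketch below assumes the weights are the Ns reported in Table 5, which the thesis does not state explicitly; names are hypothetical.

```python
def weighted_mean(means, weights):
    """Mean of `means` weighted by `weights`."""
    return sum(m * w for m, w in zip(means, weights)) / sum(weights)


# Unit means and Ns from Table 5, Units 1 through 5.
TABLE_5 = {
    "self": ([4.373, 4.262, 4.576, 4.491, 4.491], [402, 899, 323, 501, 281]),
    "supervisor": ([4.284, 4.489, 4.224, 4.804, 4.785], [401, 872, 317, 522, 275]),
    "peer": ([4.548, 4.490, 4.522, 4.694, 4.599], [411, 854, 324, 509, 284]),
}
weighted = {source: weighted_mean(m, n) for source, (m, n) in TABLE_5.items()}
```

Under that assumption each recomputed value falls within about .001 of the corresponding entry in Table 7, with the self-rating mean lowest.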
The results of the ANOVA technique to test for convergent and
discriminant validity and rater bias are presented in Table 8. The
analysis provided no support for discriminant validity, strong support
for convergent validity, and strong support for substantial rater bias.
Variance components were also computed. Kavanagh et al. (1971)
suggested formulas for estimating variance components so that one might
compare the amount of variance due to each source in Table 8.
Formulas for computing variance components can be found in Appendix C.
Table 8
Three-Way Analysis of Variance Summary Table

Source               df        MS         F        P*      Variance Component
Aide (A)              51     13.964     15.50     .0001          .091
A x Behavior (B)    2397      1.490      1.65     .16            .196
A x Source (S)       102      6.585      7.31     .008           .118
Error (A,B,S)       4794       .901

*P = Probability of a Type 1 error.
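The F-ratios and variance components in Table 8 can be reproduced from the mean squares. The thesis's own formulas are in Appendix C, which is not reproduced in this section, so the expressions below are an assumption: each component is taken as the effect mean square minus the error mean square, divided by the number of observations per level of that effect (48 behaviors, 3 sources). Function names are hypothetical.

```python
def f_ratios(mean_squares, ms_error):
    """F-ratio of each effect against the error mean square."""
    return {effect: ms / ms_error for effect, ms in mean_squares.items()}


def variance_components(mean_squares, ms_error, n_behaviors, n_sources):
    """Variance component estimates for the aide effects (assumed formulas)."""
    return {
        "aide": (mean_squares["aide"] - ms_error) / (n_behaviors * n_sources),
        "aide_x_behavior": (mean_squares["aide_x_behavior"] - ms_error) / n_sources,
        "aide_x_source": (mean_squares["aide_x_source"] - ms_error) / n_behaviors,
    }


# Mean squares from Table 8.
MEAN_SQUARES = {"aide": 13.964, "aide_x_behavior": 1.490, "aide_x_source": 6.585}
MS_ERROR = 0.901

ratios = f_ratios(MEAN_SQUARES, MS_ERROR)
components = variance_components(MEAN_SQUARES, MS_ERROR, n_behaviors=48, n_sources=3)
```

Under these assumed formulas the results round to the tabled values (.091, .196, and .118), which suggests, but does not confirm, that they match the ones the thesis used.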
CHAPTER IV
DISCUSSION
Data Distribution
Preliminary analysis of the data raised some questions as to the
appropriateness of the ANOVA models used to assess leniency error and
convergent and discriminant validity. Assumptions underlying the ANOVA
model include: (a) normality of the data distribution and (b)
homogeneity of variances (Hopkins & Glass, 1978). Both of these assumptions
were shown to be violated by the data set. Specifically, histograms
showed the data set to be negatively skewed; and Levene's Test provided
support that the variances associated with rating sources were
heterogeneous. Violation of the assumptions has been shown to increase the
probability of a Type 1 error.
Despite the violations of assumptions, both ANOVA techniques were
used on the data set. In defense of using the ANOVA techniques on the
skewed data, research has demonstrated that the ANOVA is robust with
respect to violations of the normality assumption given a large sample
size (n > 30) (Glass, Peckham, & Sanders, 1972; Hopkins & Glass, 1978).
The degrees of freedom for the present study, which are directly related
to sample size, ranged from 839 to 4794. With respect to heterogeneous
variances, the ANOVA model has also been shown to be robust given equal
sample sizes of the groups being compared (Glass, Peckham, & Sanders,
1972; Hopkins & Glass, 1978). The sample sizes were approximately
equal in the present data set.
Although it was possible to demonstrate that the ANOVA was robust
when the assumptions that underlie the model were violated, other
researchers have overcome similar problems by using rating scales that
range from seven points (e.g., Friedman & Cornelius, 1976) to 110
(e.g., Lee et al., 1981) and by transforming the data set (e.g., Latham
et al., 1979). The possibility of increasing the range of potential
responses and transforming the data raised two questions for the
present study.
The first question addressed the issue of whether or not a five
point scale was adequate for appraising performance. Latham and Wexley
(1981) contended that a five point Likert-type scale was adequate
for rating scales based on the research of Lissitz and Green (1975)
and Jenkins and Taber (1977). A reexamination of this literature
proved enlightening.
The research completed by Lissitz and Green (1975) and Jenkins
and Taber (1977) involved two Monte Carlo studies which examined the
optimal number of rating points for assessing reliability. Both groups
of researchers agreed that there was little utility in using more than
five rating points given the data are drawn from a normally
distributed population. The data used by both groups were generated by a
computer where the means of errors, correlations between errors, and
correlations between errors and true scores were equal to zero.
Jenkins and Taber made this observation:

Although our study was limited to the case in which responses are
distributed uniformly across all categories, such distributions are
not common in actual research. Future simulations should explore the
generalizability of the current findings for other distributions,
especially for the skewed ones often found in applied research. (p. 395)
It seems that Latham and Wexley (1981) failed to realize that
performance rating scales often generate skewed distributions thereby making
the five point scale undesirable. The data generated by Latham et al.
(1979) demonstrated a high degree of skewness evidenced by the need
to eliminate 32 of 90 items because the items did not discriminate
among performers. Furthermore, even after calculating the total range
of scores for each supervisor and dividing by five, the distribution
still appeared negatively skewed, as shown in Table 9.
Table 9
Number and Percentage of Supervisors in Each Category: Latham et al. (1979) Data

1. Below Adequate      0   (0%)
2. Adequate            0   (0%)
3. Full               15  (17%)
4. Excellent          59  (65%)
5. Superior           16  (18%)
The question then was raised as to exactly how Ronan and Latham
(1974) and Latham, Wexley, and Rand (1975) were able to conduct
reliability and validity studies with their obtained skewed distribution.
(It should be noted that all validity and reliability studies cited
by Latham and Wexley [1981] were generated from the same data set.)
Ronan and Latham reported that interobserver reliability on the raw scores
was .50 or less for 63 of the 78 behavioral items. Intraobserver
reliability was reported as .64 to .84; however, intraobserver
reliability was computed by using eight composite scores where the 78
items were subjectively clustered. In the assessment of concurrent
validity, Ronan and Latham stated, "Correlations [between items and
each criterion] were obtained after normalizing the response to each
item" (p. 60). How the authors normalized the data was not reported.
Validity coefficients after normalizations ranged from .16 to .31,
where 16 of the 17 validity coefficients were significant at the .001
level.
Latham et al. (1975) reported on the intrarater reliability,
interrater reliability, and relevance of clustered items taken from a BOS.
In order to run the analyses, the 78 behavioral items were grouped
into eight criterion scores by taking the algebraic sum of the
effective behaviors minus the ineffective behaviors. In other words, the
data were transformed in order to increase the reliability and
validity coefficients. The point is that both studies, Ronan and Latham
(1974) and Latham et al. (1975), demonstrated that without some
transformation of the data, the resulting correlation coefficients were
lower than desired due to either poor reliability and validity of the
BOS for assessing performance or skewness of the data being analyzed.
(Skewness has been shown to be detrimental to reliability and validity
coefficients by Lemke and Wiersma [1976].) If the low correlations
generated from the raw scores were the result of the skewed
distribution, then the five point scale would not be the optimal choice for a
performance appraisal. The process of transforming the data raised
the second question.
The second question raised concerned whether or not the data
generated by the BOS should have been transformed so that the data
would be normally distributed. Although three transformations were
considered, they were rejected because of the issues they raised.
The first issue dealt with whether or not the original research
question, regarding the convergent and discriminant validity of the
BOS items, would be answered with transformed data. A transformation
considered but rejected was clustering the items together and using
the algebraic sum of the items (e.g., Ronan & Latham, 1974). The
transformation was rejected on the grounds that the researcher would
no longer be assessing convergent and discriminant validity of the
items, but of the groups of items.
The second issue dealt with the appropriateness of normalizing
the data given no evidence the true population was normally distributed.
The two transformations considered were the following: first, using a
logarithmic function to transform each data point (recommended by
Miller, Note 1); and second, correcting each data point for leniency
error (recommended by Brethower, Note 2). The logarithmic function
was rejected on the grounds that the resulting statistical analyses
from the transformed data would be difficult to interpret (Huitema,
Note 3) and, again, no data suggested the true population was normally
distributed. Correcting for leniency was rejected because no proof
of absolute leniency error existed. It was possible that most aides
had excellent performance and that the true population distribution
was negatively skewed. Landy and Farr (1980) reached a relevant
conclusion on the research involved in transformations of data
generated from rating scales: "in general, this particular area of research
raises many more questions than it answers" (p. 92). Future research
that would focus on data transformations should take into account not
only how the data are clustered, but also how to interpret the
clustered data.
Halo Effects
A secondary problem with the skewed distribution was that it
introduced a negative correlation between rater mean ratings and their
respective variances. The correlation was such that as mean ratings
increased, variances decreased. The correlation had the effect of
rendering Levene's Test for equal variances uninterpretable, which
in turn made the assessment of halo error impossible. Although
Levene's Test was significant for all units across raters, it did not
make sense to interpret any one group as demonstrating greater halo
error than another group because the variances were negatively
correlated with their respective means. Implications for future research
would be that one should not use the comparison-of-variance technique
for the assessment of halo error if skewed distributions are expected
or obtained.
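One way to see why a ceiling effect ties variances to means is a general bound on bounded variables, which is not part of the thesis's own analysis: for ratings confined to a 1-to-5 scale, the population variance cannot exceed (mean - 1)(5 - mean), and that ceiling shrinks toward zero as the mean approaches 5. A minimal sketch with a hypothetical function name:

```python
def max_variance(mean, low=1.0, high=5.0):
    """Largest population variance possible for values in [low, high] with this mean."""
    if not low <= mean <= high:
        raise ValueError("mean must lie within the scale limits")
    return (mean - low) * (high - mean)


# Ceiling on the variance at several mean ratings near the top of the scale.
bounds = {mean: max_variance(mean) for mean in (4.2, 4.4, 4.6, 4.8)}
```

Because every source mean in this study was above 4, the higher a source's mean, the less room its variance had, regardless of any halo error.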
Leniency Effect
Contrary to most previous research, self-ratings tended to be
more severe than either peer-ratings or supervisor-ratings. In
addition to this study, Heneman (1974) was one of the few researchers
to find self-ratings less lenient than other rating sources. Heneman
hypothesized that self-ratings tended to be more severe because no
consequences were made contingent on low ratings. Self-raters in the
Heneman study knew that Heneman was collecting data for research
purposes only and that no consequences were attached to their ratings.
Few if any consequences were attached to low ratings for the BOS used
in the present study. (The researcher is not aware of any formal
consequences.)
A second observation made from the analysis completed on leniency
was that an interaction existed between raters and psychiatric units.
The most pronounced interaction involved the supervisor-ratings. Both
peer- and self-ratings tended to be stable across units. However, the
supervisors' ratings tended to fluctuate more between units, suggesting
more inconsistency in their ratings.
Several explanations could possibly account for the fluctuation
in supervisor-ratings and the lack of fluctuation in peer and
self-ratings. First, because only one supervisor represented each unit and
a number of individuals represented each self mean rating and peer mean
rating, any bias in supervisor-ratings would not have been washed out
in the averaging process as is possible with the other groups. The
effect would be that the supervisor-ratings would appear to be more
inconsistent than the other groups. Second, according to Borman
(1974), different raters tend to view a single job differently and,
therefore, tend to rate inconsistently. It was hypothesized that this
effect would not occur since the BOS consisted of behavioral items.
However, it is possible that the supervisors still interpreted the
behavioral items differently; unfortunately, there was no systematic
way to assess if this was the case. Third, supervisors may have
ignored or have been unable to judge the percentage of times an aide
engaged in any one behavior. The effect would be that each supervisor
used his/her own interpretation of the number of anchors attached to
each item on the BOS. The likelihood of supervisors using their own
rating interpretations is fairly good, since being able to estimate
the percent difference associated with a 4 or 5 appears to be a
difficult task given many opportunities to behave in a given manner as
dictated by the BOS. Unfortunately, it is not possible to assess
which effect or combination of effects caused the rating fluctuations.
Future research might focus on each cause in a more controlled setting.
MTMM Interpretation
The results obtained from the three-way ANOVA technique to assess
convergent and discriminant validity and rater bias should, according
to Kavanagh et al. (1971), be interpreted in the following manner.
Differentiation existed among aides attributable to the BOS used, that
is, person variance or convergent validity. However, the equally
large aide x source effect indicated a substantial method bias
confounding the first result. In other words, the ratings for various
aides were not consistent across supervisors, or the ratings an aide
received were dependent upon which supervisor the aide had as a rater,
which would tend to decrease the aide main effect. The lack of aide
x behavior interaction indicated no ordering of aides differently on
different behaviors, i.e., no discriminant validity. The lack of
discriminant validity has generally been attributed to substantial
rater bias in performance ratings (Heneman, 1974; Holzbach, 1978;
Lawler, 1967; Lee et al., 1981).
The important conclusion to be drawn from these results is that
the ratings obtained from using an instrument like the BOS, which has
specific behavioral items, are not immune to the problems that plague
BARS, BES, and graphic rating scales. Such problems include, but are
not limited to, halo error and leniency error, which adversely affect
discriminant validity and generate skewed distributions, making
analyses difficult to complete and interpret.
These observations regarding the psychometric characteristics of
a BOS raised a final question: Should performance rating scales even
be used? Landy and Farr's (1980) conclusion is reiterated: "After
more than 30 years of serious research, it seems that little progress
has been made in developing an efficient and psychometrically sound
alternative to the traditional graphic rating scale" (p. 89). Even
the BOS, which appears to be the very best attempt to overcome the
shortcomings of ambiguous performance rating scales, was found to be
no better than the graphic rating scale. Murphy, Martin, and Garcia
(1982) concluded, based on their research with the BOS, "the BOS as
typically used, measure traitlike judgements rather than behavioral
observation" (p. 562).
After completing the literature review on performance appraisals
and the data analysis, it became increasingly evident that true halo
error could not be assessed without true performance measures, that
leniency error could not be assessed without true performance
measures, that all criterion-related validity studies could not be
completed without true performance measures, and that the decision to
normalize the data could not be made without true performance data
that would support a true normalized distribution. Since true
performance measures are necessary to compute many of the psychometric
properties of rating scales, it seems logical to use them in
evaluating performance rather than subjective rating scales. Perhaps if
we had spent 30 years of serious research on techniques of obtaining
true performance measures rather than attempting to improve rating
scales, performance appraisals would be superior to appraisals now
available.
Future research should investigate measures of productivity that
include outcome measures and process measures of productivity that are
not cost related. Latham and Wexley (1981) argued that cost-related
measures often omit important information, are difficult to obtain,
include factors beyond the performer's control, lead to a
results-at-all-costs mentality, and provide inadequate feedback necessary to
correct performance. Future investigations should be directed at
overcoming these barriers and engineering new methods to measure
productivity. Overcoming the hurdles to obtaining true performance measures
may be a more productive avenue for research than attempting to
overcome the hurdles of subjective rating scales.
APPENDIX A
BEHAVIORAL OBSERVATION SCALE FOR PSYCHIATRIC AIDE
Interpersonal Relationships with Staff
1. Offers assistance to nurses
almost never 1 2 3 4 5 almost always
2. Attends staff meetings when possible
almost never 1 2 3 4 5 almost always
Interpersonal Relationships with Patients
3. Takes initiative to introduce himself to new patients
almost never 1 2 3 4 5 almost always
4. Participates in daily activity with patients
almost never 1 2 3 4 5 almost always
5. Spends too much time in the office avoiding interaction with patients
almost always 1 2 3 4 5 almost never
6. Discusses patient issues confidentially when possible
almost never 1 2 3 4 5 almost always
7. Is willing to discuss patients' complaints
almost never 1 2 3 4 5 almost always
8. Praises patients for accomplishments
almost never 1 2 3 4 5 almost always
Communication Process
9. Charts and records 1:1's
almost never 1 2 3 4 5 almost always
10. Completes his share of the daily charting
almost never 1 2 3 4 5 almost always
11. Charts in a concise manner
almost never 1 2 3 4 5 almost always
12. Charts legibly and with good grammar and spelling
almost never 1 2 3 4 5 almost always
13. Charts with appropriate charting symbols
almost never 1 2 3 4 5 almost always
14. Makes assessments relevant to treatment plan when charting
almost never 1 2 3 4 5 almost always
15. Conveys accurate information to team
almost never 1 2 3 4 5 almost always
16. Listens to patients and staff with attention
almost never 1 2 3 4 5 almost always
17. Informs at least one other staff member before leaving the unit
almost never 1 2 3 4 5 almost always
18. Calls ahead if he expects to be late for work
almost never 1 2 3 4 5 almost always
Spiritual Values
19. Shares spiritual needs and spiritual issues appropriately
almost never 1 2 3 4 5 almost always
Leadership Abilities/Role Modeling
20. Carries out all delegated tasks
almost never 1 2 3 4 5 almost always
21. Makes individual decisions when necessary
almost never 1 2 3 4 5 almost always
22. Is able to continue to function appropriately in stressful situations
almost never 1 2 3 4 5 almost always
23. Demonstrates good empathic skills
almost never 1 2 3 4 5 almost always
24. Knows unit rules
almost never 1 2 3 4 5 almost always
25. Fails to confront when unit rule is broken
almost always 1 2 3 4 5 almost never
26. Uses humor appropriately
almost never 1 2 3 4 5 almost always
27. Complains about work, patients, and/or staff
almost always 1 2 3 4 5 almost never
Educational
28. Attends assigned inservices
almost never 1 2 3 4 5 almost always
29. Takes opportunity to involve himself in nonmandatory inservices
almost never 1 2 3 4 5 almost always
Teaching
30. Is able to teach conflict resolution and does so when necessary
almost never 1 2 3 4 5 almost always
31. Assists in orientating new nursing personnel
almost never 1 2 3 4 5 almost always
Implementation of Treatment Programs
32. Writes treatment programs with specific objectives
almost never 1 2 3 4 5 almost always
33. Follows treatment plan issues in 1:1's
almost never 1 2 3 4 5 almost always
34. Does his share of the close observations
almost never 1 2 3 4 5 almost always
35. Completes close observations on time
almost never 1 2 3 4 5 almost always
36. Attends team meetings when possible
almost never 1 2 3 4 5 almost always
37. Provides suggestions for treatment plan in team meetings
almost never 1 2 3 4 5 almost always
38. Follows team decisions
almost never 1 2 3 4 5 almost always
Ethical and Legal Issues
39. Knows location of safety equipment
almost never 1 2 3 4 5 almost always
40. Knows fire and tornado procedures
almost never 1 2 3 4 5 almost always
41. Discusses confidential information with patients, families, friends, or relatives of patient
almost never 1 2 3 4 5 almost always
Public Relations
42. Realizes that their communication about their work influences the community's concept of psychiatric care
almost never 1 2 3 4 5 almost always
43. Takes an active role as host or hostess to new personnel, patients, visitors, students, and interns
almost never 1 2 3 4 5 almost always
44. Can direct others to appropriate persons when additional information is requested
almost never 1 2 3 4 5 almost always
Work Habits
45. Generally displays a positive attitude
almost never 1 2 3 4 5 almost always
46. Evidence in mood and attitude that personal problems are not interfering with job performance
almost never 1 2 3 4 5 almost always
47. Demonstrates a willingness to accomplish various tasks and special projects
almost never 1 2 3 4 5 almost always
48. Comes to work on time
almost never 1 2 3 4 5 almost always
APPENDIX B
INSTRUCTIONS FOR BOS
Rate the individual whom you are evaluating on the 5-point scale by darkening the corresponding number on the computer sheet. Rate the employee as best you can in the following manner:
Employees receive a 1 if you suspect they engage in this behavior 0-50 percent of the time, 2 for 50-65 percent of the time, 3 for 65-80 percent of the time, 4 for 80-90 percent of the time, and 5 for 90-100 percent of the time. If you are unable to make a fair rating, leave it blank.
NOTE: the words almost always and almost never are reversed for those items that are worded as inappropriate behavior. Hence, the individual will always be rated a 5 when he is exhibiting excellent behavior.
REMINDERS: Do not write on this booklet.
Always use a #2 pencil; do not make any stray marks on the computer sheet.
When you erase, erase completely.
Be sure you begin with number 1 on the computer sheet.
Do not fill in any names.
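The percentage bands in the instructions above can be illustrated with a short sketch. It is not part of the original instrument. The function names `percent_to_rating` and `rate_item` are hypothetical. Two assumptions are made that the instructions do not state: a percentage falling exactly on a shared boundary (50, 65, 80, 90) is assigned to the higher rating, and for items worded as inappropriate behavior the rater's estimate is converted to the percentage of time the behavior is absent before the bands are applied.

```python
def percent_to_rating(percent):
    """Map an estimated percent-of-time (0-100) to a 1-5 BOS rating.

    Assumption: a value exactly on a shared boundary (e.g. 50)
    goes to the higher rating. None means "unable to rate" (blank).
    """
    if percent is None:
        return None  # rater left the item blank
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Lower bounds of the bands for ratings 5, 4, 3, and 2.
    for lower_bound, rating in [(90, 5), (80, 4), (65, 3), (50, 2)]:
        if percent >= lower_bound:
            return rating
    return 1  # 0 up to (but not including) 50 percent


def rate_item(percent, reversed_item=False):
    """Rate one item; reversed_item marks inappropriate-behavior wording.

    Assumption: for reversed items the bands are applied to the
    percent of time the inappropriate behavior does NOT occur,
    so that 5 always indicates excellent behavior.
    """
    if percent is None:
        return None
    if reversed_item:
        percent = 100 - percent
    return percent_to_rating(percent)


# Illustrative values only (not data from the study).
print(rate_item(85))                      # appropriate behavior, 85% of the time
print(rate_item(95, reversed_item=True))  # inappropriate behavior, 95% of the time
```

Under these assumptions, an aide who engages in an inappropriate behavior nearly all of the time receives a low rating on that item, which agrees with the NOTE that a 5 always reflects excellent behavior.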
APPENDIX C
ESTIMATES FOR VARIANCE COMPONENTS
Source              Variance Formula
Aide (A)            $(MS_{A} - MS_{A \times B \times S}) / nm$
A x Behavior (B)    $(MS_{A \times B} - MS_{A \times B \times S}) / m$
A x Source (S)      $(MS_{A \times S} - MS_{A \times B \times S}) / n$
Error               $MS_{A \times B \times S}$
Note: n = number of behaviors, m = number of sources
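As a minimal sketch of how the estimates above might be computed, the following function applies the four formulas to a set of mean squares. The function name `variance_components` and its argument names are hypothetical, and the mean squares in the usage lines are invented for illustration; they are not values from this study.

```python
def variance_components(ms_a, ms_ab, ms_as, ms_abs, n, m):
    """Estimate variance components from three-way ANOVA mean squares.

    ms_a   : mean square for Aide (A)
    ms_ab  : mean square for A x Behavior (B)
    ms_as  : mean square for A x Source (S)
    ms_abs : mean square for A x B x S (used as the error term)
    n      : number of behaviors
    m      : number of sources
    """
    return {
        "aide": (ms_a - ms_abs) / (n * m),
        "aide_x_behavior": (ms_ab - ms_abs) / m,
        "aide_x_source": (ms_as - ms_abs) / n,
        "error": ms_abs,
    }


# Hypothetical mean squares, for illustration only.
components = variance_components(
    ms_a=10.0, ms_ab=4.0, ms_as=6.0, ms_abs=1.0, n=5, m=3
)
for source, estimate in components.items():
    print(f"{source}: {estimate:.3f}")
```

A negative estimate can arise when a mean square is smaller than the error mean square; how such a value is handled is left to the analyst and is not addressed in this sketch.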