DOCUMENT RESUME
ED 380 476                                        TM 022 780

AUTHOR        Wright, Benjamin D.
TITLE         Rasch Factor Analysis.
PUB DATE      Oct 94
NOTE          34p.; Expanded version of a paper presented at the Annual Meeting of the Midwestern Educational Research Association (October 12-15, 1994).
PUB TYPE      Reports - Evaluative/Feasibility (142); Speeches/Conference Papers (150)
EDRS PRICE    MF01/PC02 Plus Postage.
DESCRIPTORS   Comparative Analysis; *Estimation (Mathematics); *Factor Analysis; Instructional Leadership; *Item Response Theory; Principals; Problem Solving; *Rating Scales; Statistical Analysis; Teachers; Test Interpretation
IDENTIFIERS   *Linear Models; *Rasch Model

ABSTRACT
Factor analysis and Rasch measurement are compared, showing how they address the same data with different interpretations of numerical status. Both methods use the same estimation method, with different measurement models, and they solve the same problem, with different utility. Factor analysis is faulted for mistaking stochastic observations of ordered labels as established linear measures and for failing to construct linear measurement. Using the Rasch measurement to replace factor analysis is developed for a dichotomy and shown for a rating scale example using responses of 2,049 Chicago (Illinois) public school teachers on the 13-item "Strength of Principal Leadership Scale" rating scale. Includes 11 figures. (Contains 18 references.) (Author/SLD)

***********************************************************************
Reproductions supplied by EDRS are the best that can be made from the original document.
***********************************************************************
U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
This document has been reproduced as received from the person or organization originating it.
Minor changes have been made to improve reproduction quality.
Points of view or opinions stated in this document do not necessarily represent official OERI position or policy.

"PERMISSION TO REPRODUCE THIS MATERIAL HAS BEEN GRANTED BY Benjamin D. Wright TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)"

RASCH FACTOR ANALYSIS
Benjamin D. Wright
MESA, Education, University of Chicago
Abstract
Compares factor analysis and Rasch measurement. Shows how
they: 1) address the same data, with different interpretations of
numerical status, 2) use the same estimation method, with
different measurement models, 3) solve the same problem, with
different utility. Factor analysis is faulted for 1) mistaking
stochastic observations of ordered labels as established linear
measures and for 2) failing to construct linear measurement. How
to use Rasch measurement to replace factor analysis is developed
for a dichotomy and shown for a rating scale example.
Factor Analysis
Input datum $x_{ni}$ is a test score, Likert rating or MCQ response of persons n=1,N to items (or tests) i=1,L. The raw data are expected to be sufficiently linear to allow equating incommensurable item origins and scales by subtracting local means and dividing by local standard deviations.
The sample standardized data for Factor 1 become:
$$ z_{ni1} = (x_{ni} - m_i) / s_i $$
$$ \text{with} \quad m_i = \sum_{n=1}^{N} x_{ni} / N, \qquad s_i^2 = \sum_{n=1}^{N} (x_{ni} - m_i)^2 / (N-1) \qquad (1) $$
This item scale equating expects complete data. Only
persons with usable responses to every item can participate.
When data are missing, they must be feigned or incomplete persons
(or items) deleted. Deleting persons alters the interpretation
of the standardizing sample. Deleting items alters the
construct. Pair-wise deletion to estimate correlations biases
factors.
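The standardization in Equation (1), together with the complete-data requirement described above, can be sketched as follows. This is a minimal illustration, not the author's program: the function name `standardize_complete`, the use of NaN to mark a missing response, and the small data matrix are all assumptions made for the example.

```python
import numpy as np

def standardize_complete(x):
    """Sample-standardize item columns as in Equation (1), using complete persons only."""
    x = np.asarray(x, dtype=float)
    complete = ~np.isnan(x).any(axis=1)   # persons with a usable response to every item
    xc = x[complete]
    m = xc.mean(axis=0)                   # local item means m_i
    s = xc.std(axis=0, ddof=1)            # local item standard deviations s_i (divisor N-1)
    return (xc - m) / s, complete

# Hypothetical ratings of 4 persons on 3 items; NaN marks a missing response.
raw = np.array([[3, 2, 4],
                [1, 1, 2],
                [4, np.nan, 3],           # incomplete person, deleted before standardizing
                [2, 3, 1]])
z, kept = standardize_complete(raw)
```

Deleting the third person changes the means and standard deviations of every item, which is the sense in which deletion alters the standardizing sample.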
The model for Factor 1 is:
$$ z_{ni1} = u_{n1} v_{i1} + e_{ni1}, \qquad e_{ni1} \sim N(0, \sigma_1^2) \qquad (2) $$
$\{u_{n1}, n=1,N\}$ is a vector of person "factor scores", predicted for persons by Factor 1.
$\{v_{i1}, i=1,L\}$ is a vector of item "factor loadings", the regressions of $\{u_{n1}\}$ on the data from items i=1,L.
1 A short version of this mss. was delivered at the Mid-Western Educational Research Association Annual Meeting, October 13, 1994.
Copyright © 10 26 94 B.D. Wright, MESA, 5835 Kimbark, Chicago 60637
The residual from Factor 1 is:
$$ z_{ni1} - u_{n1} v_{i1} = e_{ni1} = z_{ni2} \qquad (3) $$
Whether this residual is all error, $e_{ni1} \sim N(0, \sigma^2)$, or in addition to error implies other factors is presumably unknown.² When an additional factor is suspected, Factor 1 residuals are used for seeking Factor 2 and so on. Matrices of decreasing residuals $\{\{z_{nij}\}\}$ for j=2,M are extracted in turn to calculate the M factor model (Thurstone, 1947):
$$ z_{ni1} = \sum_{j=1}^{M} u_{nj} v_{ij} + e_{ni}, \qquad e_{ni} \sim N(0, \sigma^2) \qquad (4) $$
Since there is no objective basis for a "right" number of factors nor a "theoretical" value for "error" $\sigma^2$ to factor down to, factor analysts default to conventions like: Stop when factor size (sum of squared factor loadings) becomes less than one. Stop when successive factor sizes level off. Stop after two or three factors, because anything more complicated is impossible to replicate.
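The first of these conventions can be written down in a few lines. The sketch below is only an illustration of the "stop when factor size becomes less than one" rule; the function name and the factor sizes are hypothetical and are not taken from the data analyzed later in this paper.

```python
def count_factors_by_size(sizes, minimum=1.0):
    """Count the leading factors whose size (sum of squared loadings) is at least `minimum`."""
    kept = 0
    for size in sizes:
        if size < minimum:
            break            # first factor smaller than the cutoff: stop factoring
        kept += 1
    return kept

# Hypothetical factor sizes, in order of extraction.
hypothetical_sizes = [5.2, 1.6, 0.7, 0.4]
n_factors = count_factors_by_size(hypothetical_sizes)
```

The cutoff of one is a convention, not an estimate of error, which is why the rule gives no objective stopping point.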
The simplest way to obtain optimal values for person and item vectors $\{u_{nj}\}$ and $\{v_{ij}\}$ for each Factor j is to minimize:
$$ \sum_{n=1}^{N} \sum_{i=1}^{L} (z_{nij} - u_{nj} v_{ij})^2 \qquad (5) $$
This "direct factor analysis" (Saunders 1950, Cattell 1952, MacRae 1960, Wright and Evitts 1961) is a principal component decomposition (Hotelling 1933) of a "sample standardized" data matrix into j=1,M item vectors $\{v_{ij}, i=1,L\}$ and M corresponding person vectors $\{u_{nj}, n=1,N\}$.³ The results are comparable to Thurstone centroids (1947).
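Decomposition to identify each Factor j = 1,M is accomplished by initializing at $u_{nj}=1$ for all n and iterating through equations:
$$ v_{ij} = \sum_{n=1}^{N} u_{nj} z_{nij} / N $$
$$ u_{nj} = \sum_{i=1}^{L} v_{ij} z_{nij} $$
$$ \text{renormed by} \quad c^2 = \sum_{n=1}^{N} u_{nj}^2 / N, \quad u_{nj} = u_{nj}/c \quad \text{so that} \quad \sum_{n=1}^{N} u_{nj}^2 = N \qquad (6) $$
until successively smaller changes in $\{u_{nj}\}$ become uninteresting.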
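The iteration in Equation (6) amounts to alternating two regressions until the person vector settles. A minimal sketch follows, with several assumptions: the function name `extract_factor`, the tolerance, and the iteration cap are invented for the example, and the person vector is started from the row sums of the data rather than from a constant, because a constant start gives zero loadings on the first pass when the columns have been centered.

```python
import numpy as np

def extract_factor(z, tol=1e-10, max_iter=1000):
    """Extract one factor by alternating the two regressions of Equation (6)."""
    n_persons = z.shape[0]
    u = z.sum(axis=1)                            # assumed start (same as starting from v_ij = 1)
    u = u / np.sqrt(np.sum(u ** 2) / n_persons)  # renorm so that sum_n u_nj^2 = N
    for _ in range(max_iter):
        v = z.T @ u / n_persons                  # loadings: v_ij = sum_n u_nj z_nij / N
        u_new = z @ v                            # scores:   u_nj = sum_i v_ij z_nij
        u_new = u_new / np.sqrt(np.sum(u_new ** 2) / n_persons)
        change = np.max(np.abs(u_new - u))
        u = u_new
        if change < tol:                         # changes have become "uninteresting"
            break
    v = z.T @ u / n_persons
    residual = z - np.outer(u, v)                # data for seeking the next factor
    return u, v, residual
```

Calling `extract_factor` again on the returned `residual` would seek the next factor, and `np.sum(v ** 2)` is the factor size in the sense used above.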
2 Should another factor be expected, the most efficient approach is to analyze each subset of expected-to-be-singular items separately and to defer comparing any resulting "variables" until their construct definition and quantitative representation is established.
3 Principal component decomposition is the core of most contemporary factor analysis and multi-dimensional scaling programs.
Standardized factor score $u_{nj}$ is the value predicted by Factor j's regression on "independent" variables i = 1,L with regression coefficients $\{v_{ij}\}$.
Factor j results are:
a) Factor j Size (variance "explained"):
$$ \sum_{i=1}^{L} v_{ij}^2 \quad \text{with} \quad M < G < L $$
b) Factor j Loadings (regression coefficients):
$$ v_{ij} = \sum_{n=1}^{N} u_{nj} z_{nij} / N $$
the regression of factor scores $\{u_{nj}\}$ on residuals $\{\{z_{nij}\}\}$.
c) Factor j Standard Scores (zero mean, unit variance):
$$ u_{nj} = \sum_{i=1}^{L} v_{ij} z_{nij} / c \qquad (7) $$
predicted by regression coefficients $\{v_{ij}\}$ from residuals $\{\{z_{nij}\}\}$.
Problems with Factor Analysis
1. Raw data $\{\{x_{ni}\}\}$ are never linear measures. Even test scores, unless transformed into logits, become increasingly non-linear as they near their finite limits. When $x_{ni}$ is a Likert rating it is not even cardinal! But ordinal data are not suited to Equations (1) through (5) and the factor scores they produce are necessarily non-linear.
2. In view of the non-linearity, the "true score" error models of Equations (2) and (3) do not deal with uncertainty in a useful way.
3. The necessity for complete data is awkward. Data are never complete.
4. After each factor is extracted, its residuals $\{\{z_{nij}\}\}$ are the data for smaller factors. These residuals contain one sure effect, the turbulence left behind by the estimation of preceding factors. Intimations of smaller factors are necessarily awash in the residual wake of larger factors.
5. Without a basis for anticipating a final "error" size for $\sigma^2$, there is no objective way to decide when to stop factoring.
6. Software implementations seldom provide standard errors for factor loadings or factor scores.
7. When a "same" set of items is refactored from a new sample of persons, neither factor sizes nor loadings are ever the same. Only the most generous fudging allows one to suppose their factor structure has been confirmed.
Most factor analysts swallow problems 1 through 6. But problem 7 is fatal. As the numerical instability of presumed "replications" emerges, we are forced to retreat from non-reproducible numbers to nominal conclusions. Person factor scores are abandoned. All but the relative magnitudes of item factor loadings are ignored. The only use we make of the factor analysis output is to classify items according to their highest factor loading. Person scores for each category of items are obtained, not from factor scores (even when separate refactorings are done for each class of items) but by summing the original standardized (but inevitably non-linear) person responses $z_{ni}$ to the items in a factor class. Whatever the distribution of item loadings in a factor class may suggest, all items are given equal weight in this summation of non-linear numerical labels.
Rasch Factor Analysis
Why not admit that our data are neither measures nor cardinal numbers but necessarily begin as nothing more than labels $\{\{C_{ni}\}\}$ for nominal qualities - labels which may respond to an intelligent ordinal scoring $x_{ni}$ = 0,1,2,3 to produce the ordinal score matrix $\{\{x_{ni}\}\}$?
Familiar examples are: rating scales like (strongly agree > agree > disagree > strongly disagree) which can usually be scored $x_{ni}$ = 3,2,1,0 or at least $x_{ni}$ = 2,1,0,0 and MCQ options like (right > wrong) which can always be scored $x_{ni}$ = 1,0. Even raw scores (r+1 > r > r-1) can respond to an ordinal interpretation.
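The two scorings of the agreement scale just mentioned can be made concrete with a small lookup. The dictionary names and the sample responses below are hypothetical; only the score values 3,2,1,0 and 2,1,0,0 come from the text.

```python
# Ordinal scoring x_ni = 3,2,1,0 for the four agreement labels.
FULL_SCORING = {"strongly agree": 3, "agree": 2, "disagree": 1, "strongly disagree": 0}

# Coarser scoring x_ni = 2,1,0,0, which collapses the two disagreement labels.
COLLAPSED_SCORING = {"strongly agree": 2, "agree": 1, "disagree": 0, "strongly disagree": 0}

def score_labels(labels, scoring):
    """Turn nominal category labels into ordinal scores under a chosen scoring."""
    return [scoring[label] for label in labels]

responses = ["agree", "strongly disagree", "disagree", "strongly agree"]
full = score_labels(responses, FULL_SCORING)
collapsed = score_labels(responses, COLLAPSED_SCORING)
```

Either scoring preserves the intended order of the labels; neither claims that the distances between categories are equal.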
Why not embrace the inescapable initially nominal but possibly ordinally interpretable status of datum $x_{ni}$ and address it directly for what it is with a probability model for the occurrence of ordered categories (Rasch 1961, Andrich 1978, Wright and Masters 1982)?
First, we will show the algebra of this approach in its simplest form, the use of $x_{ni}$ = 0,1 to represent a dichotomy through which nominal events interpreted as signifying "more" of an intended variable are scored "1" and nominal events signifying less are scored "0" (Rasch 1960/1980/1992, pp 62-124).
Then we will illustrate the empirical similarities and differences between factor analysis and Rasch measurement by analyzing the responses of 2049 public school teachers to a "Strength of Principal Leadership" rating scale.
To begin the algebra for $x_{ni}$ = 0,1 we will not mistake the number labeling as a linear measure, but will, instead, recognize it for the binomial it is - a stochastic binomial for which we have decided, on the basis of our measuring intentions, what kind of events are "better" for our purposes than others, i.e. what we decide qualifies as a "right" answer. To set up a counting system for this preference, we label the preferred ("right" answer) event "1" and its absence ("not right" hence "wrong" answer) "0".
The error model which follows is not the ill-suited linear error (true score) model of factor analysis (Equation 2) but a binomial probability model $P_{ni1}$ for the occurrence of $x_{ni}$ = 0,1 (Equation 8) which, because of its formulation (Rasch 1960):
1) Obtains the parameter separability necessary for constructing objective (additive conjoint) measurement (Luce and Tukey 1964, Perline, Wright and Wainer 1979) and
2) Has Fisher sufficient statistics (1922) for linear person measures $B_n$ and item calibrations $D_i$ which combine additively as $(B_n - D_i)$ (and therefore construct the linearity we need for subsequent quantitative analysis) to govern the probabilities of $x_{ni}$ = 0,1.
The necessary and sufficient model is:
$$ \log(P_{ni1}/P_{ni0}) = B_n - D_i, \qquad x_{ni} = 0,1 $$
from which
$$ E x_{ni} = P_{ni1}, \qquad V x_{ni} = P_{ni1}(1 - P_{ni1}) = P_{ni1} P_{ni0} \qquad (8) $$
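Parameters for this model, as with factor analysis (5), are estimated by minimizing:
$$ \sum_{n=1}^{N} \sum_{i=1}^{L} (x_{ni} - E x_{ni})^2 \qquad (9) $$
but now the expectation is $E x_{ni} = P_{ni1} = \exp(B_n - D_i) / [1 + \exp(B_n - D_i)]$ rather than the product $u_{nj} v_{ij}$ of Equation (5).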
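One way to see how person measures and item calibrations can be obtained from a matrix of 0,1 responses is the sketch below. It is an assumption-laden illustration, not the BIGSTEPS program used for the analyses in this paper: it alternates Newton-Raphson steps on the joint likelihood equations, the function names are invented, the example data are made up, and it presumes that no person or item has a zero or perfect score (such scores have no finite estimate).

```python
import numpy as np

def rasch_prob(b, d):
    """P_ni1 = exp(B_n - D_i) / [1 + exp(B_n - D_i)] for every person-item pair."""
    return 1.0 / (1.0 + np.exp(-(b[:, None] - d[None, :])))

def estimate_rasch(x, n_iter=200):
    """Alternate Newton steps for person measures B_n and item calibrations D_i."""
    x = np.asarray(x, dtype=float)
    b = np.zeros(x.shape[0])
    d = np.zeros(x.shape[1])
    for _ in range(n_iter):
        p = rasch_prob(b, d)
        w = p * (1.0 - p)                            # binomial variances P_ni1 * P_ni0
        b = b + (x - p).sum(axis=1) / w.sum(axis=1)  # move persons toward observed scores
        p = rasch_prob(b, d)
        w = p * (1.0 - p)
        d = d - (x - p).sum(axis=0) / w.sum(axis=0)  # move items toward observed scores
        d = d - d.mean()                             # fix the origin at mean item calibration 0
    return b, d

# Hypothetical responses of 5 persons to 3 items, with no extreme scores.
x = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 0],
              [1, 0, 1],
              [1, 1, 0]])
b, d = estimate_rasch(x)
```

At convergence the expected scores `rasch_prob(b, d).sum(axis=0)` reproduce the observed item scores, and persons with the same raw score receive the same measure, which reflects the sufficiency of raw scores noted above.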
With this Rasch approach to $x_{ni}$ we get not only least-square (the implementation here for maximizing likelihood) estimates of linear person measures $B_n$ and linear item calibrations $D_i$ along the single common measurement line (variable) which they conjointly define, but also a stochastic basis, the binomial error variances $P_{ni1} P_{ni0}$, for estimating relevant standard errors for $B_n$ and $D_i$ and for evaluating the probabilities of residuals:
$$ y_{ni} = x_{ni} - P_{ni1}, \qquad z_{ni} = y_{ni} / (P_{ni1} P_{ni0})^{1/2}, \qquad E z_{ni} = 0, \qquad V z_{ni} = 1 \qquad (10) $$
This enables a detailed misfit analysis which, in turn, allows a partition of the full matrix of residuals $\{\{z_{ni}\}\}$ into those many $|z_{ni}|$ which are observed to be no greater than the probability model expects, say $|z_{ni}| < 2$, and hence, in "all probability", of no immediate empirical interest and the, possibly interesting, subset of remaining, more extreme residuals, say $|z_{ni}| > 3$, which are sufficiently improbable to invite further investigation and reconsideration as possible evidence of a second variable.
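The standardized residuals of Equation (10) and the partition into expected and improbable residuals can be sketched as below. The function name and the small example (two persons, two items, with measures and calibrations chosen by hand) are hypothetical; the cutoff of 3 is the illustrative value mentioned in the text.

```python
import numpy as np

def standardized_residuals(x, b, d):
    """z_ni = (x_ni - P_ni1) / sqrt(P_ni1 * P_ni0), as in Equation (10)."""
    b = np.asarray(b, dtype=float)
    d = np.asarray(d, dtype=float)
    p = 1.0 / (1.0 + np.exp(-(b[:, None] - d[None, :])))
    return (np.asarray(x, dtype=float) - p) / np.sqrt(p * (1.0 - p))

# Two persons at measure 0; item 2 is 3 logits harder than item 1.
x = np.array([[1, 0],
              [0, 1]])
z = standardized_residuals(x, b=[0.0, 0.0], d=[0.0, 3.0])
improbable = np.abs(z) > 3    # residuals that invite further investigation
```

Only the second person's success on the hard item is flagged here; the other three responses are within what the model expects.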
It is only when improbable residuals emerge that there is an
empirical incentive to look for a presumably unexpected second
variable.⁴
Should we decide to venture further, the improbable residuals tell us exactly where to look. No need to accumulate confusion by wallowing through the full matrix $\{\{z_{ni}\}\}$ or by subsequent factor rotations which are, after all, directed by the largest residuals. We already know from the residuals in hand which particular items (and also which persons) do not fit into the construction of the first variable. Should there be another useful, albeit unexpected, variable in these data, it will be most directly accessible among the original (rather than residual) person responses to the subset of items which misfit the first analysis.
To seek a second variable j=2, therefore, we concentrate on just the data for the subset of misfitting items.⁵ We apply the Rasch probability model again, not to the whole matrix of residuals $\{\{z_{ni}\}\}$, but only to the submatrix $\{\{x_{ni}, i \in j=2\}\}$ of original ordinally interpreted responses of persons to this subset of items. We estimate a new set of linear item calibrations $\{D_{i2}\}$ for just these misfitting items along with a new set of additive conjoint person measures $\{B_{n2}\}$ on this newly defined "Variable 2".
4 In sensible research, of course, a "theoretical" incentive would dominate most "empirical" results. There would be a set of well-designed items which were intended to define a single variable. The "empirical" question would narrow to finding out whether any of these well-intended items failed to perform usefully and, if so, with which particular persons and possibly "Why?".
5 The focusing provided by identifying items manifesting improbable residuals is analogous in purpose and consequence to the item clustering by which factor rotation is guided.
To find out whether unexpected Variable 2 brings out any new information, we plot Variable 2 person measures $\{B_{n2}\}$ against Variable 1 person measures $\{B_{n1}\}$. The shape of this plot shows us in detail the extent to which we have found a useful second variable and also for which of these persons it may actually provide new information.
Because we have independent standard errors for each linear measure on each variable, the statistical status of the differences $(B_{n2} - B_{n1})$ for each of the n=1,N persons can be judged objectively by comparing them with their estimated standard errors:
$$ (B_{n2} - B_{n1}) / (SE_{n1}^2 + SE_{n2}^2)^{1/2} = z_{n12} \sim N(0,1) \qquad (11) $$
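Equation (11) is a standardized difference between two independently estimated measures. A minimal sketch, with a hypothetical function name and made-up measures and standard errors for one person:

```python
import math

def measure_difference_z(b1, se1, b2, se2):
    """Standardized difference between a person's two measures, as in Equation (11)."""
    return (b2 - b1) / math.sqrt(se1 ** 2 + se2 ** 2)

# Hypothetical person: 0.5 logits on Variable 1, 1.5 logits on Variable 2.
z_n12 = measure_difference_z(b1=0.5, se1=0.3, b2=1.5, se2=0.4)
```

With these invented values the joint standard error is 0.5 logits, so a difference of 1.0 logit stands two standard errors from zero.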
Extension of the dichotomous Rasch model to the ordered response categories $x_{ni} = 0,1,2,3,\ldots,\infty$ of rating scales, partial credits, grades, ranks, raw scores, counts and to models with additional facets for raters and tasks is straightforward (Wright & Masters 1982, Linacre 1989).
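For orientation, the rating scale form of this extension, as it is usually written following Andrich (1978) and Wright and Masters (1982), replaces the single log-odds of Equation (8) with one log-odds for each pair of adjacent categories. The step symbol $F_x$ and the highest category $m$ are notation assumed here for illustration; they do not appear elsewhere in this paper.

```latex
\log\left( \frac{P_{nix}}{P_{ni(x-1)}} \right) = B_n - D_i - F_x ,
\qquad x = 1, 2, \ldots, m
```

With $m = 1$ and $F_1 = 0$ this reduces to the dichotomous model of Equation (8).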
An Empirical Example
We will illustrate the empirical similarities and differences between factor analysis and Rasch measurement by using both methods to analyze the responses which 2049 Chicago public school teachers gave to the 13 item "Strength of Principal Leadership" rating scale on page 8 of the Consortium on Chicago School Research questionnaire "Charting Reform: The Teachers' Turn, 1994".⁶
The 13 rating items in Figure 1 were written to define a single line of inquiry which produced a single measure of perceived "Strength of Principal Leadership" for each of the 2049 teachers. The methodological question then is: Are these 13 items used by these 2049 teachers in a way that enables the construction of a reasonable and useful single measure?
[Figure 1]
The three items about to be exposed as diverging from the coherent core defined by the other 10 are marked [A], [B] and [C] in Figure 1.
Figure 2 is the factor analysis scree plot of principal component eigenvalues for these data. Some factor analysts might conclude, at this point, that the 13 items work together well enough to define a single 13 item factor and stop. The scree plot, however, does hint that components [2] and [3] may be a bit too large.
6 The research from which these data come is supported by the Consortium on Chicago School Research under the direction of Anthony S. Bryk and Penny Sebring. The factor analyses were done with SAS by Stuart Luppescu. The Rasch measurement analyses were done with BIGSTEPS by Winifred Lopez.
[Figure 2]
Figure 3 is a Rasch measurement item misfit plot for the
same data. Here the salient misfit of items [A], [B] and [C] is
unequivocally distinct.
[Figure 3]
Figure 4 is the table of varimax factor loadings. Factor
analysts who rotate these data will not miss the exceptionality
of items [A], [B] and [C] and, after studying their item text,
will have some useful ideas as to why these three items might not
follow the mainline defined by the 10 item core.
[Figure 4]
Figure 5 is the Rasch measurement item calibration table
listed both in misfit and in measurement orders. Here we see for
each of the 13 items not only its fit statistics in mean square
and standardized form but also its raw score point-biserial and
relative "difficulty to be agreed with", its "unpopularity" as it
were.
[Figure 5]
Figure 5 shows that items [C] and [B] are 0.4 logits harder
to agree with (1.04 and 0.99 versus 0.61 logits) than the next
"hardest" item. Their texts share a close, personal supervision
of teacher by principal. This supervision could be supportive
but it is more likely to be restrictive. Indeed the verb
"supervises" is viewed by many as counter-collaborative. Thus
these items are not only hard to agree with but also ambiguous
with respect to the spirit of increasing egalitarian
collaboration which dominates the hierarchy of the 10 core items.
Item [A], "Principal makes all final decisions." at the
other end of the line (at -0.77 logits) is the item easiest to
agree with but also emphatically counter-collaborative.
Once the text of these three items is examined and
understood in terms of the pro-collaborative tenor of the 10 core
items, it is easy to agree with both factor analysis and Rasch
measurement results and to remove these three items from the
definition of this "Strength of Principal Leadership" variable.
Indeed, at this point we may find ourselves wondering why we
included these three items in the first place. We may also
wonder, now that we see the construct evolution of the variable
defined by the 10 core items, whether it would not be useful to
rename this variable "Strength of Principal Support for
Collaboration".
Do you see how useful the Rasch measurement information in
Figure 5 can be for confirming a decision to set aside the three
aberrant items and for identifying the construct evolved by the
hierarchy of the 10 core items? Other parts of the standard
Rasch measurement output are equally useful.
Figure 6 maps both 2049 teachers and 10 core items onto the single line of inquiry that the 10 items define. The line of inquiry rises from low agreement at the bottom of Figure 6 up to high agreement at the top. On the left, each teacher is located at their level of agreement. On the right, each item is located first at its level of transition from "strongly disagree" to "disagree", then at its level of transition from "disagree" to "agree" and finally, on the far right, at its level of transition from "agree" to "strongly agree".
[Figure 6]
Figure 7 shows the same data in a different form. Now the line of inquiry is drawn to increase from left to right. The exact count of teachers at each level of agreement is given along the bottom of the upper figure. We can see that '27 teachers at the top of Figure 6 and at the far right of Figure 7 have "strongly agreed" with all 10 items. We can also see that the modal group of 333 teachers at a measure of about 1.3 logits is above the disagree-to-agree transition of even the hardest to agree with item. And we can see that 59 teachers at the far left of Figure 7 have claimed to "strongly disagree" with all 10 items.
[Figure 7]
Figure 7 also shows us something of considerable importance to our understanding and application of this rating scale as it was used by these teachers. The spacing between rating scale categories (1) to mark "strongly disagree", (2) to mark "disagree", (3) to mark "agree" and (4) to mark "strongly agree" is not uniform (equal) as a Likert interpretation would have it. Values for the expected difficulties of each step are given in the table at the bottom of Figure 7. The estimated increase in difficulty from "strongly disagree" at step (1) to "disagree" at step (2) in expected step measures is -1.74 - (-3.74) = 2.00 logits. But the estimated increase from "agree" at step (3) to "strongly agree" at step (4) is 4.57 - 1.28 = 3.29 logits. The second distance is 1.65 times the first. We see that it is tangibly easier to move up from "strongly disagree" to "disagree" and so reduce strong disagreement, than it is to move up from "agree" to "strongly agree", and so produce strong agreement. Figure 7 also shows us why we might prefer to avoid factoring Likert scores like 1,2,3,4 as though they were equally spaced measures.
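The arithmetic behind this comparison is short enough to check directly. The snippet below uses only the four expected step measures quoted in the text; the variable names are invented for the example.

```python
# Expected measures at each rating category, in logits, as quoted in the text.
expected_at_step = {1: -3.74, 2: -1.74, 3: 1.28, 4: 4.57}

low_gap = expected_at_step[2] - expected_at_step[1]     # strongly disagree -> disagree
high_gap = expected_at_step[4] - expected_at_step[3]    # agree -> strongly agree
ratio = high_gap / low_gap

print(f"{low_gap:.2f} logits, {high_gap:.2f} logits, ratio {ratio:.3f}")
```

Equal spacing, as assumed when the labels 1,2,3,4 are factored directly, would require a ratio of 1.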
Rasch measurement also shows us useful information about each of the 2049 teachers. Figure 8 begins this part of the Rasch analysis by showing the distribution of teacher response pattern misfit. We see that a substantial number of teachers are using these 10 items idiosyncratically. Subsequent pages of output identify these teachers and show for which items they provided surprising ratings. These diagnostic outputs show us how teachers use the 10 items and differentiate the many teachers whose responses are sufficiently coherent to produce a valid measure of perceived "Strength of Principal Support for Collaboration" from other teachers whose response patterns make unique individual statements.
[Figure 8]
Figure 9 shows the exact and non-linear relationship between factor scores and Rasch measures for these 2049 teachers. Since the Rasch measures are modelled to be linear and the Rasch fit analyses expose any failures of data to support the linear measure construction based on this modelling, it must be the factor scores which are not linear.
[Figure 9]
Figure 10 summarizes Richard Smith's use of two-factor simulated data to evaluate how well factor analysis and Rasch measurement detect a second factor. Smith finds that factors of equal size can only be discerned when they are uncorrelated R.3. Against that kind of data factor analysis does better than Rasch measurement.
[Figure 10]
Against all other kinds of data however, particularly the kind of data most frequently encountered in social science research in which the factors are NOT of equal size and NOT uncorrelated, i.e. the usual situation where there is first an intended dominant factor and then an unintended off-shoot, correlated with the first factor and less well represented, Rasch measurement does better.
Finally, Figure 11 collects into one summary table the considerations which bring out the similarities and differences between factor analysis and Rasch measurement.
[Figure 11]
References
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.
Cattell, R.B. (1952). Factor Analysis. New York: Harper, 414-416.
Fisher, R.A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London (A), 222, 309-368.
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417-441, 498-520.
Linacre, J.M. (1989). Many-Facet Rasch Measurement. Chicago: MESA Press.
Luce, R.D. & Tukey, J.W. (1964). Simultaneous conjoint measurement. Journal of Mathematical Psychology, 1, 1-27.
Lumsden, J. (1961). The construction of unidimensional tests. Psychological Bulletin, 58, 122-131.
MacRae, D. (1960). Direct factor analysis of sociometric data. Sociometry, 23, 360-371.
Perline, R., Wright, B.D. and Wainer, H. (1979). The Rasch model as additive conjoint measurement. Applied Psychological Measurement, 3, 237-256.
Rasch, G. (1960/1980/1993). Probabilistic Models for Some Intelligence and Attainment Tests. Chicago: MESA Press.
Rasch, G. (1961). On general laws and the meaning of measurement in psychology. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 4, 321-333. Berkeley: University of California Press.
Saunders, D.R. (1950). Practical Methods in the Direct Factor Analysis of Psychological Score Matrices. Ph.D. Dissertation. Urbana: University of Illinois.
Smith, R.M. and Miao, C.Y. (1991). Assessing unidimensionality for Rasch measurement. In M. Wilson (ed.) Objective Measurement: Theory into Practice. Vol. 2. Norwood, N.J.: Ablex Publishing Corp.
Thurstone, L.L. (1947). Multiple-Factor Analysis. Chicago: University of Chicago Press.
Wright, B.D. and Linacre, J.M. (1989/94). BIGSTEPS: Rasch Analysis Computer Program. Chicago: MESA Press.
Wright, B.D. and Evitts, M. (1961). Direct factor analysis in sociometry. Sociometry, 24, 82-98.
Wright, B.D. and Masters, G.N. (1982). Rating Scale Analysis. Chicago: MESA Press.
Wright, B.D. and Stone, M. (1979). Best Test Design. Chicago: MESA Press.
Figure 1
The Intended Variable is: STRENGTH OF PRINCIPAL LEADERSHIP
To be constructed from teachers' responses to the following 13 items.

The principal at this school:
(each item is rated: strongly disagree, disagree, agree, strongly agree)

  Encourages teachers to try new methods of instruction
  Is willing to make changes
  Makes clear to the staff his or her expectations for meeting instructional goals
  Sets high standards for teaching
  Promotes parental and community involvement in the school
  Supports and encourages teachers to take risks
  Understands how children learn
  Works to create a sense of community in this school
  Communicates a clear vision for our school
  Visits classrooms regularly [B]
  Makes the final decision on all important matters [A]
  Is strongly committed to shared decision making
  Closely supervises teachers' work [C]

Consortium on Chicago School Research Questionnaire "Charting Reform: The Teachers' Turn, 1994", Page 8
Figure 2
How Many Factors Are There?
SCREE PLOT OF EIGENVALUES for 13 Original Items

[Scree plot not reproduced: eigenvalues on the vertical axis against component number on the horizontal axis, with components [1], [2] and [3] marked.]

COMPONENT   EIGENVALUE
    1           8.1
    2           1.0
    3           0.8
    4           0.5
    5           0.5
    6           0.4
    7           0.3
    8           0.3
    9           0.3

ONE MIGHT CONCLUDE THAT THIS SET OF 13 ITEMS DEFINES 1 AND ONLY 1 FACTOR. THERE IS, HOWEVER, USEFUL INFORMATION IN ROTATED FACTORS 2 AND 3 WHICH ISOLATE THE SAME 3 ITEMS FOUND TO BE WORST FITTING IN THE RASCH ANALYSIS.
Figure 3
Which Items Misfit?
RASCH ITEM FIT PLOT for 13 Original Items

[Plot not reproduced: ITEM INFIT STD on the vertical axis against ITEM OUTFIT STD on the horizontal axis, both running from -10 to 10, with a FIT REGION and a MISFIT REGION. Items [A], [B] and [C] fall in the misfit region; the remaining items are marked [X].]

Three items marked [A], [B] and [C] and identified by Rasch statistics and item text in Figure 5 are found to misfit more extremely than the remaining 10 items. These are the same 3 items identified as not on the main factor by the varimax factor rotation reported in Figure 4.
Figure 4
Which Items Are Not On The Main Factor?
ROTATED FACTOR LOADINGS for 13 Original Items

FACTOR LOADINGS
  F1            F2       F3       THE PRINCIPAL AT THIS SCHOOL:
  0.81          0.15     0.12     encourages teachers to try new methods
  0.83          0.20    -0.00     is willing to make changes
  0.72          0.38     0.24     makes teaching expectations clear
  [illegible]   0.38     0.24     sets high standards for teaching
  0.75          0.26     0.15     promotes community involvement
  0.78          0.23     0.03     encourages teachers to take risks
  0.75          0.35     0.16     understands how students learn
  0.78          0.36     0.07     works to create community
  0.75          0.41     0.18     communicates a clear vision
  0.33         [0.85]    0.07     visits classrooms regularly [B]     (Rasch MISFIT item)
  0.12          0.14    [0.95]    makes all final decisions [A]       (Rasch MISFIT item)
  0.10          0.47    -0.05     committed to shared decisions
  0.33         [0.84]    0.20     closely supervises teachers [C]     (Rasch MISFIT item)

Variance explained: F1 = 6.1, F2 = 2.6, F3 = 1.2

Rotated Factors 2 and 3 find items [A], [B] and [C] not on the main factor.
Figure 5
Which Items Misfit?
RASCH MISFIT ORDER for 13 Original Items

MEASURE  ERROR   MNSQ    STD        PTBIS   THE PRINCIPAL AT THIS SCHOOL:
  -.77    .04    2.78    9.9   A     .21    makes all final decisions            Factor 3
   .99    .04    1.34    9.4   B     .61    visits classrooms regularly          Factor 2
  1.04    .04    1.32    8.9   C     .62    closely supervises teachers          Factor 2
   .57    .04    1.13    3.8         .69    encourages teachers to take risks

MEASURE  ERROR   MNSQ    STD        PTBIS   THE PRINCIPAL AT THIS SCHOOL:
  1.04    .04    1.32    8.9   C     .62    closely supervises teachers          Factor 2
   .99    .04    1.34    9.4   B     .61    visits classrooms regularly          Factor 2
   .61    .04     .95   -1.5         .74    strongly committed to shared decisions
   .57    .04    1.13    3.8         .69    encourages teachers to take risks
   .05    .04     .66   -9.9         .83    communicates a clear vision for school
   .02    .04     .76   -7.5         .80    works to create community in school
  -.20    .04     .68   -9.8         .78    understands how children learn
  -.30    .04     .69   -9.6         .79    makes teaching expectations clear
  -.32    .04     .83   -4.7         .73    is willing to make changes
  -.38    .04     .63   -9.9         .81    sets high standards for teaching
  -.62    .04     .92   -2.1         .69    encourages teachers to try new methods
  -.69    .04     .73   -7.6         .74    promotes community involvement
  -.77    .04    2.78    9.9   A     .21    makes all final decisions            Factor 3

Rasch misfit analysis finds the same 3 deviant items found by factor analysis. Qualitatively, i.e. finding deviant items, factor analysis and Rasch analysis reach the same conclusion. But there are substantial differences in the utility and inferential stability of quantitative results. Factor analysis gives sample dependent item regression coefficients and the sample standardized person scores predicted by this regression on the data. Rasch analysis gives test-free person measures and sample-free item calibrations positioned together on the common linear scale defined by the items.
Figure 6 Rasch MAP OF TEACHERS on the 10 BEST ITEMS

[Map: vertical MEASURE axis in logits, from -4.0 to 6.0. The TEACHERS column shows the distribution of teacher measures. The ITEMS columns show each item at its LOW, MEAN and HIGH positions. Annotations, from top to bottom: teachers who STRONGLY AGREE; items midway between STRONGLY AGREE and AGREE; items midway between DISAGREE and AGREE; teachers who STRONGLY DISAGREE; items midway between STRONGLY DISAGREE and DISAGREE. Each '#' in the TEACHERS column is 28 teachers; each '.' is 1 to 27 teachers.]
Rasch output of teacher measures in the metric of the variable defined by item calibrations enables MAPPING teachers and items together into a single picture showing the spread and targeting of the sample of persons on the span of defining items. These MAPS show normative and criterion validity simultaneously. MAPS also show for each person their expected level of agreement with each item.
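How a MAP position translates into an expected level of agreement can be sketched in Python. The sketch assumes the rating scale form of the Rasch model and uses the three step measures reported in the rating scale step summary (-2.51, -.96, 3.46). The teacher measure and the two item calibrations are hypothetical values chosen for illustration, and the function names are not from BIGSTEPS.

```python
import math

# Step measures in logits for the transitions 1->2, 2->3, 3->4,
# taken from the rating scale step summary.
STEPS = [-2.51, -0.96, 3.46]


def category_probabilities(b, d, steps=STEPS):
    """Rating scale model probabilities for categories 1..len(steps)+1.

    b is the person measure and d the item calibration, both in logits.
    """
    log_numerators = [0.0]
    for f in steps:
        log_numerators.append(log_numerators[-1] + (b - d - f))
    top = max(log_numerators)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def expected_rating(b, d, steps=STEPS):
    """Expected rating on the 1..4 scale for a person at b and an item at d."""
    probs = category_probabilities(b, d, steps)
    return sum((k + 1) * p for k, p in enumerate(probs))


# Hypothetical teacher and items (assumed values, not from the paper).
teacher = 1.0
for label, calibration in [("harder item", 1.0), ("easier item", -0.5)]:
    print(label, round(expected_rating(teacher, calibration), 2))
```

Under this assumption a teacher located 1.28 logits above an item has an expected rating close to 3, which agrees with the AT STEP entry for category 3 in the step summary.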
Figure 7

TEACHER DISTRIBUTION
[Histogram of teacher measures; the printed counts are not recoverable.]

RASCH SUMMARY OF RATING SCALE STEP MEASURES

CATEGORY  OBSERVED  STEP      EXPECTED SCORE MEASURES       THURSTONE  AVG
LABEL     COUNT     MEASURE   STEP-.5   AT STEP    STEP+.5  THRESHOLD  MEASURE
   1        1314    NONE                [-3.74]     -2.87               -2.46
   2        3219    -2.51      -2.87    [-1.74]      -.63     -2.67      -.74
   3       10729     -.96       -.63    [ 1.28]      3.50      -.80      1.31
   4        4713     3.46       3.50    [ 4.57]                3.47      3.98
The Likert ratings labeled 1, 2, 3, 4 used by these teachers are NOT EQUALLY SPACED IN THEIR USE. The teachers needed only 2.00 logits to go from STRONGLY DISAGREE at -3.74 to DISAGREE at -1.74, but 3.29 logits to go from AGREE at 1.28 to STRONGLY AGREE at 4.57. Most Likert scales are even more unequally spaced in use. Instead of supposing equal spacing, Rasch analysis lets the responses of the persons using the rating scale determine the spacing actually in effect for them during their ratings.
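The spacing quoted above is plain subtraction of the AT STEP measures. A short Python sketch, using the four bracketed values from the step summary and illustrative variable names, makes all three gaps explicit, including the DISAGREE to AGREE gap that the text does not quote.

```python
# AT STEP measures in logits for rating categories 1..4,
# copied from the step summary table.
AT_STEP = {
    "STRONGLY DISAGREE": -3.74,
    "DISAGREE": -1.74,
    "AGREE": 1.28,
    "STRONGLY AGREE": 4.57,
}

labels = list(AT_STEP)

# Distance in logits between each pair of adjacent categories.
gaps = {
    f"{lower} -> {upper}": round(AT_STEP[upper] - AT_STEP[lower], 2)
    for lower, upper in zip(labels, labels[1:])
}

for pair, gap in gaps.items():
    print(f"{pair}: {gap:.2f} logits")
```

If the categories were equally spaced in use, the three gaps would be equal.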
Figure 8 FINDING MISFITTING TEACHERS on the 10 Best Items

[Scatterplot of teacher fit statistics: vertical axis TEACHER INFIT STD, horizontal axis TEACHER MISFIT STD, both running from about -2 to 7. The plot is divided into a FIT REGION and a MISFIT REGION. Annotation: "Note large number of teachers who do not fit." Plotted point counts are not recoverable.]
Each of these misfitting teachers can be recognized for what is unique in their views by examining the individual response vectors printed in BIGSTEPS Table 7 "Misfitting Persons". Many of the misfits appearing here are due to ethnic differences in the use of 2 of the 10 items. But that's another story.

Figure 9 FACTOR SCORES versus RASCH MEASURES on the 10 Best Items for the 2049 Teachers

[Scatterplot: factor scores plotted against RASCH MEASURES on the horizontal axis, with tick labels from -4 to +1 legible. The vertical axis labels and plotted points are not recoverable.]
Figure 10 RASCH MEASUREMENT vs FACTOR ANALYSIS - in General

[Smith, R.N. (1992). Assessing unidimensionality for the Rasch Rating Scale Model. AERA, San Francisco, April 1992] evaluates the dimensionality sensitivity of principal component factor analysis (by SPSS-PC) and Rasch measurement (by BIGSTEPS) for 5-category rating scale data simulated from pairs of variously correlated factors x and y represented by equal and unequal numbers of items Ny and Nx.

"Yes" for factor analysis means > 70% of the y items loaded on the 2nd factor.
"Yes" for Rasch measurement means > 70% of the y items produced standardized mean square outfits > 2.0.
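The two "Yes" criteria can be written as one decision rule. The Python sketch below is an illustration only: the function names are assumptions, and the outfit values and loading flags are made-up inputs, not Smith's simulated data.

```python
def verdict(flags, min_share=0.70):
    """Return "Yes" when more than min_share of the y items are flagged."""
    flags = list(flags)
    if not flags:
        raise ValueError("need at least one y item")
    share = sum(flags) / len(flags)
    return "Yes" if share > min_share else "NO"


def rasch_verdict(outfit_std, cutoff=2.0):
    """Rasch criterion: share of y items with standardized outfit > cutoff."""
    return verdict(z > cutoff for z in outfit_std)


def factor_verdict(loads_on_second_factor):
    """Factor analysis criterion: share of y items loading on the 2nd factor."""
    return verdict(loads_on_second_factor)


# Hypothetical results for five y items (assumed values).
outfits = [3.1, 2.4, 2.6, 1.2, 2.2]
loadings = [True, True, False, False, True]

print("Rasch item MISFIT:", rasch_verdict(outfits))
print("Factor analysis:  ", factor_verdict(loadings))
```

The threshold is strict, so a share of exactly 70% counts as "NO" under this reading of "> 70%".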
Rxy    Ny - Nx   Factor Analysis   Rasch Item MISFIT
.07     5 - 35        Yes               Yes
       10 - 30        Yes               Yes
       15 - 25        Yes               Yes
       20 - 20        Yes               NO             FA better
.26     5 - 35        Yes               Yes
       10 - 30        Yes               Yes
       15 - 25        Yes               Yes
       20 - 20        Yes               NO             FA better
.39     5 - 35        Yes               Yes
       10 - 30        Yes               Yes
       15 - 25        Yes               Yes
       20 - 20        NO                NO
.55     5 - 35        Yes               Yes
       10 - 30        Yes               Yes
       15 - 25        NO                Yes            Rasch better
       20 - 20        NO                NO
.72     5 - 35        Yes               Yes
       10 - 30        NO                Yes            Rasch better
       15 - 25        NO                Yes            Rasch better
       20 - 20        NO                NO
.89     5 - 35        NO                Yes            Rasch better
       10 - 30        NO                Yes            Rasch better
       15 - 25        NO                NO
       20 - 20        NO                NO

Equal size factors (WHEN Ny = Nx) can only be separated when nearly UNCORRELATED (R < .3).
Rasch item MISFIT does better than factor analysis at separating CORRELATED factors (R > .5) WHENEVER Ny < Nx.
Figure 11 COMPARING FACTOR ANALYSIS AND RASCH MEASUREMENT

ASPECT                    FACTOR ANALYSIS                  RASCH MEASUREMENT

MOTIVE                    summarize data                   construct measurement

INPUT MODEL               established linearities          stochastic events

MISSING DATA              loses rows or cols,              routinely accommodated,
                          biases factors                   no data lost

EQUALIZATION              norm-standardized                criterion-classified
                          into [formula illegible]         into ordered categories = 0,1,2,...

METHOD MODEL              principal components             latent trait
                          [formula illegible]              [formula illegible]

ESTIMATED BY MINIMIZING   [illegible]                      [illegible]

ANCHORED AT               [illegible]                      [illegible]

ERROR MODEL               [illegible]                      [illegible]

ITEM ESTIMATE             [illegible]                      Di, linear calibration of
                                                           item i on variable

ITEM ESTIMATE ERROR       [illegible]                      [illegible]

PERSON ESTIMATE           u, score predicted for           Bn, linear measure of
                          [illegible]                      person n on variable

PERSON ESTIMATE ERROR     [illegible]                      [illegible]

RESIDUAL STATISTICS       [illegible]                      [illegible]

MISFITS                   [illegible]                      [illegible]

TO SEEK NEXT VARIABLE     subtract u from all z            select only [illegible]
                          for next data (z - u);           of misfitting items;
                          includes residual noise          avoids residual noise