Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces. A Critical Examination of the Common Psychometric Claims of Unidimensionality, Linearity, and Interval-Level Measurement. Joseph Martineau, Qi Diao, Yang Lu, Dipendra Subedi, Samuel Drake, Feng-Hsien Pang (PowerPoint presentation)
1
A Critical Examination of the Common Psychometric Claims of Unidimensionality, Linearity, and Interval-Level Measurement
Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces
Joseph Martineau, Qi Diao, Yang Lu, Dipendra Subedi, Samuel Drake, Feng-Hsien Pang, Kyle Ward, Shu-Chuan Kao, Tian Song, Tianli Li, Xin Li, Yan Zheng
Authorship is alphabetical after the first four
2
Michigan Department of Education
Author Affiliations
□ Michigan Department of Education
  □ Joseph Martineau
  □ Kyle Ward
□ Michigan State University
  □ Qi Diao
  □ Samuel Drake
  □ Shu-Chuan Kao
  □ Tianli Li
  □ Xin Li
  □ Yang Lu
  □ Feng-Hsien Pang
  □ Tian Song
  □ Dipendra Subedi
  □ Yan Zheng
3
First, a Little Academic Genealogy
□ Why the concern with these concepts?
  □ Michigan State University students and grads
  □ Mentored by Reckase
  □ General theme is the real world is more complicated than we would like
□ Stimulated by discussions of…
  □ Multidimensionality
  □ Implications of violations of unidimensionality assumptions of traditional psychometric models
□ This talk could be entitled “The real world is so much more complicated than we psychometricians anticipated, it’s not even funny.”
□ Conflicts with at least the title of the next presentation
4
First, a Little Academic Genealogy
□ This work is not the responsibility of Reckase, but is clearly an outgrowth of that work
□ The focus of this work is not to knock down traditional psychometric models, but to challenge claims based on those models, to encourage more valid use of scales resulting from those models, and to move beyond those models if we want to continue making those claims
□ Martineau (2004, 2006a) showed that violations of the assumptions of linearity and unidimensionality can create very misleading results when using vertically scaled data
□ This presentation identifies trajectories within and across grades and investigates the difficulties posed by non-linearity of scales, unmodeled dimensionality, and less than interval-level measurement
5
Content Specifications in a Grade 3-8 Mathematics Assessment Blueprint
□ Grade 3-8 Fall 2005 Michigan Educational Assessment Program (MEAP) Mathematics
□ Based on a hierarchical content standard structure as follows:
  □ Content Area (e.g. Mathematics)
    □ Strand (e.g. Geometry)
      □ Domain (e.g. Transformation & Symmetry)
        □ Benchmark (e.g. Recognize that transformed shapes are still the same shape)
6
Content Specifications in a Grade 3-8 Mathematics Assessment Blueprint
[Figure: percent of total (0% to 100%) by grade level of the assessment (3 to 8) for the strands Algebra, Geometry, Data & Probability, Measurement, and Number & Operations]
7
The Illusion of Unidimensionality
□ Current dimensionality assessment procedures under-detect dimensionality because they…
  □ Rely on detection of dominant dimensionality
  □ Rely on the assumption that shared variance is a sign of shared construct representation (and therefore dominant dimensionality)
□ However, shared variance can result both from shared construct representation and exogenous causal variables
  □ The uniqueness of one construct (the unshared variance) is identified as a nuisance factor that may be ignored
  □ The shared variance of multiple constructs is interpreted as a single construct when the shared variance will decrease if the constructs respond differently to an exogenous variable
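The point about shared variance can be illustrated with a small simulation. This is only a sketch under stated assumptions, not an analysis from the presentation: the two constructs, the common exogenous cause, the intervention, and all coefficients are hypothetical. Two distinct constructs correlate strongly because they share a cause, and the correlation drops once a second exogenous variable moves only one of them.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Hypothetical exogenous cause acting on both constructs.
exogenous = [rng.gauss(0, 1) for _ in range(n)]

# Two distinct constructs: each is the common cause plus its own unique part.
construct_a = [z + 0.5 * rng.gauss(0, 1) for z in exogenous]
construct_b = [z + 0.5 * rng.gauss(0, 1) for z in exogenous]
r_before = pearson(construct_a, construct_b)

# A second, hypothetical exogenous variable (an intervention given to every
# other student) that moves construct A only.
construct_a_after = [a + (2.0 if i % 2 == 0 else 0.0)
                     for i, a in enumerate(construct_a)]
r_after = pearson(construct_a_after, construct_b)

print(f"correlation before: {r_before:.2f}, after: {r_after:.2f}")
```

With these made-up coefficients the correlation before the intervention is about 0.8 in expectation, which a dominance-based criterion could read as a single construct, even though the two constructs are separate by design.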
8
The Illusion of Unidimensionality
□Martineau (2006b) and Tan & Gierl (2006) show that current dimensionality assessment procedures under-detect dimensionality when dimensions are correlated and/or factor structure is complex
□Martineau & Reckase (2006b) show that strongly multidimensional data (e.g. 10 or more factors) are identified as unidimensional using traditional methods when factors are moderately to strongly correlated
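One way to see why correlated factors can look like a single dominant dimension is to examine the eigenvalues of an equicorrelated matrix. The sketch below is an illustration at the factor level only, with hypothetical values (ten factors, every pairwise correlation 0.6); it is not the analysis reported in the cited papers.

```python
def equicorrelation_matrix(k, rho):
    """k x k correlation matrix with every off-diagonal entry equal to rho."""
    return [[1.0 if i == j else rho for j in range(k)] for i in range(k)]

def mat_vec(matrix, vector):
    """Matrix-vector product for lists of lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

k, rho = 10, 0.6  # hypothetical: ten factors, all pairwise correlations 0.6
R = equicorrelation_matrix(k, rho)

# The all-ones vector is an eigenvector: R @ ones = (1 + (k - 1) * rho) * ones.
ones = [1.0] * k
first_eigenvalue = mat_vec(R, ones)[0]

# A contrast such as (1, -1, 0, ..., 0) is an eigenvector with eigenvalue 1 - rho.
contrast = [1.0, -1.0] + [0.0] * (k - 2)
other_eigenvalue = mat_vec(R, contrast)[0]

share_of_variance = first_eigenvalue / k
print(first_eigenvalue, other_eigenvalue, share_of_variance)
```

Under these assumptions the first eigenvalue is 1 + 9(0.6) = 6.4 and each of the remaining nine is 0.4, so one component carries 64% of the variance although ten factors were built in.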
9
Empirical Analyses of Unidimensionality and Multidimensionality
□ Winter 2004 Grade 4 Michigan Educational Assessment Program (MEAP) Mathematics Test
  □ See Martineau, Mapuranga, & Ward, 2006 (AERA paper)
  □ Meets unidimensionality criteria
  □ Contains four interpretable dimensions
  □ Reasonably similar to intended content structure (a diffuse test blueprint indicating X% of items per each of five strands without further specificity)
10
Empirical Analyses of Unidimensionality and Multidimensionality
□ Martineau, Mapuranga, & Ward (2006)
  □ Four dimensions (correlated 0.58 to 0.77)
    1. Pattern/rule generation/recognition
    2. Specific mathematics vocabulary
    3. Combinations
    4. Match data to source
  □ Six content clusters
    □ Rule and pattern recognition (dimension 1)
    □ Match data to source (dimension 4)
    □ Identify/carry out rule-based operations (dimensions 1 and 4)
    □ Combinations (dimensions 1 and 3)
    □ Counting units (low loadings on all dimensions)
    □ Mathematics vocabulary (dimension 2)
11
Empirical Analyses of Unidimensionality and Multidimensionality
□Fall 2005 Grade 3-8 MEAP Mathematics Tests
□Meets unidimensionality criteria (Rasch model fits without any concerns, EFA identifies a single dimension)
□Unable to interpret with small number of dimensions
□Contains between 9 and 12 dimensions per grade (with only 4-5 dimensions having sufficient contributing items to produce a reliable scale)
□Reasonably similar to intended content structure (a very tight test blueprint indicating 3 items per content standard with 20 content standards)
□Dimensionality is reasonably representative of the level of specificity in the test blueprint
12
Empirical Analyses of Unidimensionality and Multidimensionality
□Results & Implications
□Multiple dimensions are identifiable even when traditional methods identify a single dimension
□Dimensions differ from each other in non-trivial ways (e.g. along the lines of theoretically defined content standards as represented by test blueprints)
□If there are reasonable theoretical reasons to expect that different types of content within a subject matter may respond differently to an exogenous variable…
□The different types of content should be scaled separately
□If not, the results may be misleading and harmful to the educational community
13
The Illusion of Linearity and Interval-Level Measurement
□ From Reckase’s invited address at the APA 1989 annual meeting, used with permission
□ Trajectory of a 3-PL unidimensional ACT mathematics scale through a two-dimensional mathematics achievement space
□ It is possible that the unidimensional scale is linear, or that one or both of the multidimensional scales are linear, but not all three!
□ The meaning of the unidimensional scale changes depending upon location
□ When the scale is non-linear, it is also not equal-interval by definition
14
Non-linear Unidimensional Trajectories through Multidimensional Content Space
□ Plotting the trajectories
  □ Obtain overall “unidimensional” scores
  □ Calibrate unidimensional scales for each strand within each grade where there were 10 or more items
  □ Divide the overall scale into quantiles (between 50 and 100) with equal numbers of students in each quantile
  □ Calculate the mean strand scores for each quantile
  □ Plot the mean strand scores against each other
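The quantile and mean-score steps above can be sketched as follows. This is a minimal illustration, assuming the overall and strand scores have already been calibrated elsewhere; the function name `quantile_trajectory` and the tiny data set are hypothetical, and the final plotting step is left to whatever plotting tool is at hand.

```python
from statistics import mean

def quantile_trajectory(overall, strand_x, strand_y, n_quantiles=50):
    """Mean (strand_x, strand_y) per equal-count quantile of the overall score."""
    n = len(overall)
    order = sorted(range(n), key=lambda i: overall[i])  # students, lowest to highest
    points = []
    for q in range(n_quantiles):
        # Integer boundaries keep group sizes within one student of each other.
        group = order[q * n // n_quantiles:(q + 1) * n // n_quantiles]
        points.append((mean(strand_x[i] for i in group),
                       mean(strand_y[i] for i in group)))
    return points

# Tiny made-up data set: 100 students, 10 quantiles.
overall = list(range(100))
number_ops = [float(v) for v in overall]
measurement = [float(v * v) for v in overall]
trajectory = quantile_trajectory(overall, number_ops, measurement, n_quantiles=10)

for x, y in trajectory:
    print(f"{x:6.1f} {y:8.1f}")
```

Each returned pair is one point on the trajectory; joining the points in order traces how the overall scale moves through the two-strand space.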
15
Unidimensional Trajectories through Multidimensional Content Space
[Figure: Grade 3 MEAP Mathematics. Measurement (vertical axis) plotted against Number & Operations (horizontal axis), both from -2.5 to 2.5]
16
Unidimensional Trajectories through Multidimensional Content Space
[Figure: Grade 4 MEAP Mathematics. Measurement (vertical axis) plotted against Number & Operations (horizontal axis), both from -2.5 to 2.5]
17
Unidimensional Trajectories through Multidimensional Content Space
[Figure: Grade 5 MEAP Mathematics. Three-dimensional plot with axes Data & Probability, Geometry, and Measurement, each from -2 to 2]
18
Unidimensional Trajectories through Multidimensional Content Space
[Figure: Grade 6 MEAP Mathematics. Three-dimensional plot with axes Data & Probability, Geometry, and Number & Operations, each from -2 to 2]
19
Unidimensional Trajectories through Multidimensional Content Space
[Figure: Grade 7 MEAP Mathematics. Three-dimensional plot with axes Algebra, Geometry, and Number & Operations, each from -2 to 2]
20
Unidimensional Trajectories through Multidimensional Content Space
[Figure: Grade 8 MEAP Mathematics. Three-dimensional plot with axes Algebra, Geometry, and Number & Operations, each from -2 to 2]
21
Another View of Non-Linear Trajectories
□Vertically Scale all Mathematics 3-8 Items
□Use Unidimensional, Vertically Scaled Item Parameters to Calculate Student Strand Scores
□Calculate Centiles Across and Within Grades
□Create Plots
22
23
24
Yet Another View of Non-Linear Measurement
□ Reference composites
  □ Easiest third of items on grade 6
  □ Hardest third of items on grade 6
  □ Easiest third of items on grade 7
  □ Hardest third of items on grade 7
□ Identifies the dimensions best measured by the top and bottom thirds of the grade 6 and 7 tests
25
[Figure: Reference Composites of Easiest and Hardest Thirds of Items, plotted against Dimension 1, Dimension 2, and Dimension 3. Blue is Grade 6, Red is Grade 7]
26
A Caveat About Psychometric Claims
□Corporations with a monetary interest tend to make the claims described below often and with impunity
□Academics are less likely to make these claims, but some do still make them
□When academics make these claims, they tend to make them under more stringent (but still insufficiently stringent) conditions
27
Psychometric Claim 1: Advanced Psychometric Procedures Produce (Essentially) Unidimensional Scales
□ Rasch/IRT Point of View
  □ “…[IRT scales are] meaningful only if each and every question contributes to the measure of a single attribute.”
    □ Bond & Fox, 2001, p. 25
  □ If there is a single dominant dimension in the data, the data can be successfully modeled as unidimensional.
    □ Nandakumar, 1991
  □ Items that are clearly multidimensional can be selected to model a composite dimension that satisfies essential unidimensionality needs.
    □ Reckase, Ackerman, & Carlson, 1988
  □ The implication is that if the data fit the selected model (e.g. meet unidimensionality and fit tests), it is reasonable to treat the resulting scale as unidimensional from top to bottom
28
Psychometric Claim 1: Advanced Psychometric Procedures Produce (Essentially) Unidimensional Scales
□ Examples
  □ “We place all of our test items on the RIT scale according to their difficulty.”
    □ NWEA website, 9/18/2006
    □ http://www.nwea.org/assessments/researchbased.asp
  □ “the developmental standard score is also a number that describes a student's location on an achievement continuum.”
    □ ITBS website, 9/18/2006
    □ http://www.education.uiowa.edu/itp/itbs/itbs_interp_score.htm
  □ “The Quantile Framework measures student mathematical achievement and concept/application solvability on the same scale ….”
    □ Quantiles website, 9/18/2006
    □ http://www.quantiles.com/DesktopDefault.aspx?view=fa&tabindex=4&tabid=22
29
Psychometric Claim 1: Advantages of Claiming Unidimensional Scales
□ Unidimensionality is a great simplifier of…
  □ Measurement models
  □ Interpretation of the resulting scale
  □ Explanation of the scale to non-technical audiences
  □ Statistical analyses of the resulting scale
  □ Collaboration with substantive researchers
  □ Instrument development
  □ Meeting profit margins
□ Unidimensionality is a convenient selling point for…
  □ Clients
  □ Researchers
  □ Policymakers
  □ Managers
  □ Shareholders
  □ Ourselves
30
The Illusion of Unidimensionality
For example, if the following two models both approximately capture the shared variance in items 1 through N, we select the less complex model to be parsimonious. So, what’s wrong with that?
[Figure: two path diagrams. First model: a single Achievement factor with Item 1, Item 2, Item 3, …, Item N. Second model: Achievement Dimension 1 and Achievement Dimension 2 with Item 1, Item 2, Item 3, …, Item N, plus a Predictor]
31
The Illusion of Unidimensionality
Let’s try this one on for size:
This model on the left is, of course, a ridiculous econometric model. The model on the right is simplistic, but more realistic. But how does the left model fare as a psychometric model?
[Figure: two path diagrams. Left model: Economic Health with Projected IA Corn Crop, Price of Watermelon, AK Tourism Revenue, College Town Revenue, … Right model: Days since last hard frost and Summer Vacation Season with the same four indicators, …]
32
The Illusion of Unidimensionality
Let’s try this one on for size:
The left model fares very well in the psychometric community. But so does the model on the right. So how does the model on the left cause problems for education?
[Figure: two path diagrams. Left model: Mathematics Achievement with 2 digit addition, 2 digit multiplication, calculate triangle height, identify parallelogram, … Right model: Computation Achievement and Geometry Achievement with the same four items, …, plus Class Environment]
33
How the Claim of Unidimensionality Can Harm the Educational Community
□ When the available data is “unidimensional,” studies are likely to be performed using general unidimensional outcomes
□ If there is any reasonable theory that different dimensions will respond differently to a given educational intervention, using general unidimensional outcomes is misleading
□ To visualize the effect of incorrect claims of unidimensionality
  □ Assume that the “unidimensional” scale traverses a multidimensional space in a linear fashion
  □ Understand that if the traversal is non-linear, the problems become greater (see Martineau, 2006a)
34
Effects of Unmodeled Multidimensionality
□ Best Case Scenario
  □ An essentially unidimensional reading scale traverses the decoding/comprehension space linearly
  □ Scale measures mostly decoding, but is also moderately sensitive to differences in comprehension
  □ Randomized experiment, students assigned to phonics versus whole language instruction (the two groups start out at the same location in the decoding/comprehension space)
  □ Pre-post assessment on the same unidimensional reading instrument (problems become worse if a different level of the assessment is used; see Martineau, 2004, 2006a)
  □ Phonics has a strong positive effect on decoding (effect size 0.4), no effect on comprehension; whole language has opposite effects
35
Effects of Unmodeled Multidimensionality
[Figure: Comprehension (vertical axis) plotted against Decoding (horizontal axis), both from 400 to 550. Shown: essentially unidimensional reading scale, measuring mostly decoding, but some comprehension]
36
Effects of Unmodeled Multidimensionality
[Figure: Comprehension (vertical axis) plotted against Decoding (horizontal axis), both from 400 to 550. Shown: pretest centroid, both groups (e.g. randomized experiment)]
37
Effects of Unmodeled Multidimensionality
[Figure: Comprehension (vertical axis) plotted against Decoding (horizontal axis), both from 400 to 550. Shown: whole-language post-test centroid and phonics post-test centroid]
38
Effects of Unmodeled Multidimensionality
[Figure: Comprehension (vertical axis) plotted against Decoding (horizontal axis), both from 400 to 550. Shown: projections of multidimensional centroids onto the essentially unidimensional scale]
39
Effects of Unmodeled Multidimensionality
[Figure: Comprehension (vertical axis) plotted against Decoding (horizontal axis), both from 400 to 550. Shown: Effect of Phonics over Whole Language]
40
Effects of Unmodeled Multidimensionality
□ Misleading results
  □ Phonics instruction increases reading ability over whole language instruction
□ Accurate results with multidimensional instruments
  □ Phonics instruction increases decoding ability over whole language instruction
  □ Whole language instruction increases comprehension ability over phonics instruction
□ What are the policy ramifications of these misleading results? Can anyone say “Educational Pendulum?”
□ If the assessment were an essentially unidimensional measure of mostly comprehension but some decoding, the misleading results would go in exactly the opposite (but equally misleading) direction
41
Alternate Effects of Unmodeled Multidimensionality
[Figure: Comprehension (vertical axis) plotted against Decoding (horizontal axis), both from 400 to 550. Shown: projections of multidimensional centroids onto the essentially unidimensional scale]
42
Alternate Effects of Unmodeled Multidimensionality
[Figure: Comprehension (vertical axis) plotted against Decoding (horizontal axis), both from 400 to 550. Shown: No distinguishable effects on reading?]
43
Alternate Effects of Unmodeled Multidimensionality
□ Misleading results
  □ It does not matter whether one uses phonics or whole language instruction
□ Accurate results with multidimensional instruments
  □ Phonics instruction increases decoding ability over whole language instruction
  □ Whole language instruction increases comprehension ability over phonics instruction
□ What are the policy ramifications of these misleading results?
□ If the assessment were an essentially unidimensional measure of mostly comprehension but some decoding, the misleading results would go in exactly the opposite (but equally misleading) direction
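The reading scenarios above can be worked through numerically by projecting each group's post-test centroid onto a straight scale. This is a sketch under stated assumptions: scores are in effect-size units with the common pretest centroid at the origin, "opposite effects" is taken to mean 0.4 on comprehension and none on decoding, and the scale directions (20, 45, and 70 degrees from the decoding axis) are hypothetical.

```python
import math

def project(point, direction_deg):
    """Signed position of `point` along a straight scale through the origin."""
    a = math.radians(direction_deg)
    return point[0] * math.cos(a) + point[1] * math.sin(a)

# Post-test centroids as (decoding, comprehension); pretest centroid is (0, 0).
phonics_post = (0.4, 0.0)         # effect size 0.4 on decoding only
whole_language_post = (0.0, 0.4)  # assumed reading of "opposite effects"

def apparent_phonics_advantage(direction_deg):
    """Phonics minus whole language, as seen on the unidimensional scale."""
    return (project(phonics_post, direction_deg)
            - project(whole_language_post, direction_deg))

mostly_decoding = apparent_phonics_advantage(20.0)       # scale leans toward decoding
balanced = apparent_phonics_advantage(45.0)              # equal weight on both
mostly_comprehension = apparent_phonics_advantage(70.0)  # leans toward comprehension

print(f"{mostly_decoding:+.3f} {balanced:+.3f} {mostly_comprehension:+.3f}")
```

The same pair of true effects yields an apparent phonics advantage, no apparent difference, or an apparent whole language advantage, depending only on the direction in which the scale crosses the space.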
44
Psychometric Claim 2: Advanced Psychometric Procedures Produce Linear, Interval Level Scales
□If a scale is unidimensional and measures at the interval level, the scale is then linear by definition
□Therefore
□ If we claim that a scale is unidimensional
AND
□ If we claim that a scale is interval level
THEN
□ We also claim that the scale is linear
45
Psychometric Claim 2: Advanced Psychometric Procedures Produce Linear, Interval Level Scales
□ Rasch Point of View
  □ “…estimates derived from Rasch procedures are located on an interval scale….”
    □ Bond & Fox, 2001, p. 119
  □ This claim is well accepted in the Rasch literature
□ Examples
  □ “You can liken the [NWEA Rasch scale] to a meterstick which is comprised of equal units of measurement, centimeters.”
    □ NWEA website, 9/18/2006
    □ http://www.nwea.org/assessments/researchbased.asp
  □ “Measurements…are now reportable in a common unit, a Lexile, which is similar to the degree calibrations on a thermometer….”
    □ Lexile website, 9/18/2006
    □ p.14 of http://www.lexile.com/lexilearticles/objective-measurement-reading-response.pdf
  □ “Pearson Educational Measurement’s PASeries reading assessment is reported on the Lexile scale….”
    □ Pearson website, 9/18/2006
    □ http://www.pearsonpaseries.com/downloads/overview/PASeriesOverview.html
46
Psychometric Claim 2: Advanced Psychometric Procedures Produce Linear, Interval Level Scales
□ This claim is more controversial in the more general psychometric world, but the claim is made nonetheless
□ Claims for:
  □ “[IRT models] are calibrated to an equal-interval scale….”
    □ METRIC website, 9/18/2006
    □ http://www.metric.research.med.va.gov/learn/theories/theories_irt.asp
  □ “[An IRT] scale…[is not] interval…although it is popular and reasonable to assume [it is].”
    □ Hambleton, Swaminathan, and Rogers, 1991, p. 87
□ Claims against:
  □ “[The equal interval assumption] is…either demonstrably wrong…or arbitrary….”
    □ Cliff, 1991, p. 37
  □ “When the characteristic…cannot be directly observed, claims of equal-interval properties…are not testable and are therefore meaningless.”
    □ Zwick, 1992, p. 209
47
Psychometric Claim 2: Advanced Psychometric Procedures Produce Linear, Interval Level Scales
□ General Psychometric Examples
  □ “…[TerraNova K-12] Scale Scores [derived from 3-PL IRT models] are equal-interval scores….”
    □ CTB McGraw-Hill website, accessed 9/18/2006
    □ http://www.ctb.com/static/about_assessment/popup_f6.jsp
  □ “…the [ITBS] developmental standard score scale…mirrors reality better….[than Grade Equivalents in that] growth is usually not as great at the upper grades as it is at the lower grades.”
    □ Iowa Testing Program website, accessed 9/18/2006
    □ http://www.education.uiowa.edu/itp/itbs/itbs_interp_score.htm
  □ 1998 NAEP Technical manual describes several analyses that assume the scale scores are equal interval
    □ NCES website, accessed 9/18/2006
    □ http://nces.ed.gov/nationsreportcard/pdf/main1998/2001509.pdf
48
Psychometric Claim 2: Advantages of Claiming Linear, Interval-Level Scales
□ Scale linearity and level of measurement determine the types of analyses that are appropriate
□ Non-linear, nominal/ordinal measurement
  □ Requires categorical analyses (discarding some data to create categories) or non-linear transformations
  □ Interpretation is more difficult
  □ Categorical/non-linear analyses are less well known and developed
□ Linear, interval/ratio level measurement
  □ Allows for linear variable analyses without transformation
  □ Interpretation is more clear
  □ Analyses of linear interval/ratio variables are better known and better developed (e.g. HLM, ANOVA, Factor Analysis, Regression)
□ If we have linear, interval/ratio level measurement, then studies of education based on those measures are very much like studies of physical, directly measurable phenomena
49
How the Claim of Linearity/Interval-Level Measurement Can Harm the Educational Community
□Non-linearity may dramatically increase the problems of un-modeled dimensionality
□Scenario demonstrates how this can happen
50
How the Claim of Linearity/Interval-Level Measurement Can Harm the Educational Community
□ Scenario
  □ The MEAP grade 8 mathematics scale traverses the algebra/geometry space non-linearly (as taken from the empirical trajectories shown earlier)
  □ Scale measures mostly geometry near the bottom, and changes to mostly algebra near the top
  □ Quasi-experiment, matched samples (based on the unidimensional math pre-test) from convenience populations are assigned to use or not use 3-D manipulatives
  □ Pre-post assessment on the same unidimensional mathematics instrument (problems become worse if a different level of the assessment is used; see Martineau, 2004, 2006a)
  □ Treatment has a strong positive effect on geometry gains (effect size 0.5), no effect on algebra gains
51
Effects of Non-Linear, Non-Interval Scales
[Figure: Geometry (vertical axis, -2.0 to 0.5) plotted against Algebra (horizontal axis, -1.5 to 0.5). Shown: treatment pre-test centroid and control pre-test centroid]
52
Effects of Non-Linear, Non-Interval Scales
[Figure: Geometry (vertical axis, -2.0 to 0.5) plotted against Algebra (horizontal axis, -1.5 to 0.5). Both pre-test centroids project onto the same location on the non-linear unidimensional scale (gives matched samples)]
53
Effects of Non-Linear, Non-Interval Scales
[Figure: Geometry (vertical axis, -2.0 to 0.5) plotted against Algebra (horizontal axis, -1.5 to 0.5). Shown: treatment post-test centroid and control post-test centroid]
54
Effects of Non-Linear, Non-Interval Scales
[Figure: Geometry (vertical axis, -2.0 to 0.5) plotted against Algebra (horizontal axis, -1.5 to 0.5). Shown: projections of post-test centroids onto non-linear unidimensional scale]
55
Effects of Non-Linear, Non-Interval Scales
[Figure: Geometry (vertical axis, -2.0 to 0.5) plotted against Algebra (horizontal axis, -1.5 to 0.5). Shown: effect in favor of the control group]
56
Effects of Unmodeled Non-Linearity and Non-Interval Level Measurement
□ Misleading results
  □ Treatment has negative effect on math achievement
□ Accurate results with linear, interval-level multidimensional measurement
  □ Treatment has positive effect on geometry achievement
  □ Treatment has no effect on algebra achievement
□ What are the policy ramifications of these misleading results?
  □ Don’t use 3-D manipulatives; they are a detriment to learning mathematics!
□ It is possible to actually reverse the estimated effects from real effects (an overall negative effect may be detected when the only real effect on any dimension is positive)!
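The reversal described above can be reproduced with a toy curved scale. This is a sketch under stated assumptions, not the empirical MEAP trajectory: the scale is modeled as a quarter-circle arc in (algebra, geometry) space that climbs along geometry first and then bends toward algebra, a student's scale location is the nearest point on the arc, and the group locations and the common growth vector are hypothetical. Only the treatment effect (0.5 on geometry, none on algebra) comes from the scenario.

```python
import math

# Assumed stand-in for a curved scale: a quarter-circle arc that starts at
# (-1.5, -2.0) heading up the geometry axis and ends at (0.0, -0.5) heading
# along the algebra axis.
CENTER = (0.0, -2.0)
RADIUS = 1.5

def scale_location(point):
    """Arc length, from the bottom of the arc, of the arc point nearest `point`.

    Valid for points above and to the left of CENTER, as all points below are.
    """
    angle = math.atan2(point[1] - CENTER[1], point[0] - CENTER[0])
    return RADIUS * (math.pi - angle)

def on_ray(distance, angle_deg=135.0):
    """Point at `distance` from CENTER; points on one ray share a scale location."""
    a = math.radians(angle_deg)
    return (CENTER[0] + distance * math.cos(a), CENTER[1] + distance * math.sin(a))

def shift(point, *moves):
    """Add one or more (algebra, geometry) displacements to a point."""
    return (point[0] + sum(m[0] for m in moves),
            point[1] + sum(m[1] for m in moves))

# Matched on the unidimensional pre-test only: same scale location,
# different places in the two-dimensional space (hypothetical values).
control_pre, treatment_pre = on_ray(0.5), on_ray(2.5)

growth = (0.3, 0.1)            # hypothetical growth common to both groups
treatment_effect = (0.0, 0.5)  # geometry only, as in the scenario

control_post = shift(control_pre, growth)
treatment_post = shift(treatment_pre, growth, treatment_effect)

control_gain = scale_location(control_post) - scale_location(control_pre)
treatment_gain = scale_location(treatment_post) - scale_location(treatment_pre)
print(f"control gain {control_gain:.2f}, treatment gain {treatment_gain:.2f}")
```

Under these assumptions the control group shows the larger gain on the curved scale even though the treatment group gained 0.5 more on geometry and the same amount on algebra, which is the kind of sign reversal the last bullet warns about.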
57
Implications: Where Do We Go From Here With Unidimensional Psychometric Models?
□ Need more stringent measures of unidimensionality
  □ Avoid relying on the assumption that shared variance means that items measure the same construct
□ Pay more careful attention to subject matter specialists/theorists in terms of
  □ What constitutes a separable dimension
  □ Whether different dimensions are reasonably likely to be affected differently by exogenous variables we want to include in the models
□ Incorporate both empirical dimensionality assessment and theoretical content dimensions
58
Implications: Where Do We Go From Here With Unidimensional Psychometric Models?
□ Unless a measure is very narrow, drop the claim of linearity and interval-level measurement
  □ Stop marketing broad “unidimensional” scales as adequate for statistical procedures requiring linear, interval-level scales
□ Take proactive measures to discourage inappropriate uses of broad “unidimensional” scales as linear and interval-level:
  □ Clearly label broad “unidimensional” scales as ordinal and/or ordered metrics
  □ Caution against assuming linearity and/or interval-level measurement
  □ Stop washing our hands of inappropriate uses of psychometric scales because we are not the ones using the scales (we only produce them)
  □ Speak out against inappropriate uses
59
But Linearity & Interval-level Measurement are So Useful, How Do We Get Them Back?
□ We will achieve these goals neither quickly nor easily
□ There is a strong investment in claiming properties of linearity, unidimensionality, and interval-level measurement
  □ Profit!
  □ Prestige
  □ Simplicity
  □ Theory (e.g. Rasch and/or IRT)
□ Adequate psychometric models do not currently exist to substantiate these claims
60
But Linearity & Interval-level Measurement are So Useful, How Do We Get Them Back?
□ Step out of unidimensional illusions and into multidimensional reality
  □ 30+ years of research on multidimensionality
  □ Negligible application
  □ Feasibility is a problem (“it’s impractical”)
    □ Cost
    □ Software
    □ Effort
  □ Replicability is a problem
  □ Interpretability is a problem
  □ Dimensionality theory is a problem
61
But Linearity & Interval-level Measurement are So Useful, How Do We Get Them Back?
□ Step out of unidimensional illusions and into multidimensional reality
□ Step away from what is convenient to…
  □ Develop stronger models of dimensionality that do not assume shared variance necessarily indicates shared underlying constructs
  □ Develop feasible and replicable methods of multidimensional scale analysis that do not assume simple structure
  □ Develop methods of multidimensional psychometric analysis that pay strong attention to both theoretical (e.g. intended, interpreted) and empirical (e.g. observed) content structure