
Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces


Page 1: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

A Critical Examination of the Common Psychometric Claims of Unidimensionality, Linearity, and Interval-Level Measurement

Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Joseph Martineau, Qi Diao, Yang Lu, Dipendra Subedi, Samuel Drake, Feng-Hsien Pang, Kyle Ward, Shu-Chuan Kao, Tian Song, Tianli Li, Xin Li, Yan Zheng

Authorship is alphabetical after the first four

Page 2: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Michigan Department of Education

Author Affiliations

□ Michigan Department of Education
  □ Joseph Martineau
  □ Kyle Ward

□ Michigan State University
  □ Qi Diao
  □ Samuel Drake
  □ Shu-Chuan Kao
  □ Tianli Li
  □ Xin Li
  □ Yang Lu
  □ Feng-Hsien Pang
  □ Tian Song
  □ Dipendra Subedi
  □ Yan Zheng

Page 3: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

First, a Little Academic Genealogy

□ Why the concern with these concepts?
  □ Michigan State University students and grads
  □ Mentored by Reckase
    □ General theme is the real world is more complicated than we would like
  □ Stimulated by discussions of…
    □ Multidimensionality
    □ Implications of violations of unidimensionality assumptions of traditional psychometric models
□ This talk could be entitled "The real world is so much more complicated than we psychometricians anticipated, it’s not even funny."
□ Conflicts with at least the title of the next presentation

Page 4: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

First, a Little Academic Genealogy

□ This work is not the responsibility of Reckase, but is clearly an outgrowth of that work
□ The focus of this work is not to knock down traditional psychometric models, but to challenge claims based on those models, to encourage more valid use of scales resulting from those models, and to move beyond those models if we want to continue making those claims
□ Martineau (2004, 2006a) showed that violations of the assumptions of linearity and unidimensionality can create very misleading results when using vertically scaled data
□ This presentation identifies trajectories within and across grades and investigates the difficulties posed by non-linearity of scales, unmodeled dimensionality, and less than interval-level measurement

Page 5: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Content Specifications in a Grade 3-8 Mathematics Assessment Blueprint

□ Grade 3-8 Fall 2005 Michigan Educational Assessment Program (MEAP) Mathematics
□ Based on a hierarchical content standard structure as follows:
  □ Content Area (e.g. Mathematics)
    □ Strand (e.g. Geometry)
      □ Domain (e.g. Transformation & Symmetry)
        □ Benchmark (e.g. Recognize that transformed shapes are still the same shape)

Page 6: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Content Specifications in a Grade 3-8 Mathematics Assessment Blueprint

[Figure: percent of total (0% to 100%) by grade level of the assessment (grades 3 to 8) for each strand: Algebra, Geometry, Data & Probability, Measurement, Number & Operations.]

Page 7: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

The Illusion of Unidimensionality

□ Current dimensionality assessment procedures under-detect dimensionality because they…
  □ Rely on detection of dominant dimensionality
  □ Rely on the assumption that shared variance is a sign of shared construct representation (and therefore dominant dimensionality)
□ However, shared variance can result both from shared construct representation and from exogenous causal variables
  □ The uniqueness of one construct (the unshared variance) is identified as a nuisance factor that may be ignored
  □ The shared variance of multiple constructs is interpreted as a single construct, even though that shared variance will decrease if the constructs respond differently to an exogenous variable
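The point about shared variance can be illustrated with a small simulation. This is only a sketch under stated assumptions: the sample size, loadings, construct correlation, and the use of the first-to-second eigenvalue ratio as a stand-in for a "dominant dimension" check are all hypothetical choices made for illustration, not the procedures used in the studies cited in this talk. Two distinct constructs share variance only because both depend on one exogenous variable, yet the item correlation matrix shows one dominant eigenvalue.

```python
import numpy as np


def simulate_items(n_students=5000, items_per_construct=10, rho=0.75,
                   loading=0.6, seed=0):
    """Simulate continuous item scores for two distinct constructs.

    All numbers are hypothetical. The constructs correlate (about rho)
    only because both depend on the same exogenous variable z.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_students)            # shared exogenous cause
    unique = rng.standard_normal((n_students, 2))  # construct-specific parts
    theta = np.sqrt(rho) * z[:, None] + np.sqrt(1.0 - rho) * unique

    # Each item loads on exactly one construct (simple structure).
    construct_of_item = np.repeat([0, 1], items_per_construct)
    noise = rng.standard_normal((n_students, 2 * items_per_construct))
    items = (loading * theta[:, construct_of_item]
             + np.sqrt(1.0 - loading ** 2) * noise)
    return theta, items


def first_to_second_eigenvalue_ratio(items):
    """Ratio of the two largest eigenvalues of the item correlation matrix."""
    corr = np.corrcoef(items, rowvar=False)
    eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
    return eigenvalues[0] / eigenvalues[1]


theta, items = simulate_items()
construct_corr = np.corrcoef(theta, rowvar=False)[0, 1]
ratio = first_to_second_eigenvalue_ratio(items)
print(f"construct correlation: {construct_corr:.2f}")
print(f"first/second eigenvalue ratio: {ratio:.1f}")
```

In this sketch the first eigenvalue is several times the second even though the data were generated from two constructs, which is the kind of pattern a dominant-dimension check would read as a single dimension.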

Page 8: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces


The Illusion of Unidimensionality

□Martineau (2006b) and Tan & Gierl (2006) show that current dimensionality assessment procedures under-detect dimensionality when dimensions are correlated and/or factor structure is complex

□Martineau & Reckase (2006b) show that strongly multidimensional data (e.g. 10 or more factors) are identified as unidimensional using traditional methods when factors are moderately to strongly correlated

Page 9: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Empirical Analyses of Unidimensionality and Multidimensionality

□ Winter 2004 Grade 4 Michigan Educational Assessment Program (MEAP) Mathematics Test
  □ See Martineau, Mapuranga, & Ward, 2006 (AERA paper)
  □ Meets unidimensionality criteria
  □ Contains four interpretable dimensions
  □ Reasonably similar to intended content structure (a diffuse test blueprint indicating X% of items for each of five strands without further specificity)

Page 10: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Empirical Analyses of Unidimensionality and Multidimensionality

□ Martineau, Mapuranga, & Ward (2006)
  □ Four dimensions (correlated 0.58 to 0.77)
    1. Pattern/rule generation/recognition
    2. Specific mathematics vocabulary
    3. Combinations
    4. Match data to source
  □ Six content clusters
    □ Rule and pattern recognition (dimension 1)
    □ Match data to source (dimension 4)
    □ Identify/carry out rule based operations (dimensions 1 & 4)
    □ Combinations (dimensions 1 and 3)
    □ Counting units (low loadings on all dimensions)
    □ Mathematics vocabulary (dimension 2)

Page 11: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Empirical Analyses of Unidimensionality and Multidimensionality

□Fall 2005 Grade 3-8 MEAP Mathematics Tests

□Meets unidimensionality criteria (Rasch model fits without any concerns, EFA identifies a single dimension)

□Unable to interpret with small number of dimensions

□Contains between 9 and 12 dimensions per grade (with only 4-5 dimensions having sufficient contributing items to produce a reliable scale)

□Reasonably similar to intended content structure (a very tight test blueprint indicating 3 items per content standard with 20 content standards)

□Dimensionality is reasonably representative of the level of specificity in the test blueprint

Page 12: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Empirical Analyses of Unidimensionality and Multidimensionality

□Results & Implications

□Multiple dimensions are identifiable even when traditional methods identify a single dimension

□Dimensions differ from each other in non-trivial ways (e.g. along the lines of theoretically defined content standards as represented by test blueprints)

□If there are reasonable theoretical reasons to expect that different types of content within a subject matter may respond differently to an exogenous variable…

□The different types of content should be scaled separately

□If not, the results may be misleading and harmful to the educational community

Page 13: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces


The Illusion of Linearity and Interval-Level Measurement

□ From Reckase’s invited address at the APA 1989 annual meeting, used with permission

□ Trajectory of a 3-PL unidimensional ACT mathematics scale through a two-dimensional mathematics achievement space

□ It is possible that the unidimensional scale is linear, or that one or both of the multidimensional scales are linear, but not all three!

□ The meaning of the unidimensional scale changes depending upon location

□ When the scale is non-linear, it is also not equal-interval by definition

Page 14: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Non-linear Unidimensional Trajectories through Multidimensional Content Space

□ Plotting the trajectories
  □ Obtain overall “unidimensional” scores
  □ Calibrate unidimensional scales for each strand within each grade where there were 10 or more items
  □ Divide the overall scale into quantiles (between 50 and 100) with equal numbers of students in each quantile
  □ Calculate the mean strand scores for each quantile
  □ Plot the mean strand scores against each other
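The quantile and mean steps listed above can be sketched as follows. This is a minimal illustration on simulated data, not the MEAP analysis itself: the function name `strand_trajectory`, the simulated scores, and the composite weights are hypothetical, and the calibration of the overall and strand scales is assumed to have been done already.

```python
import numpy as np


def strand_trajectory(overall, strand_scores, n_quantiles=50):
    """Mean strand scores within equal-count quantiles of the overall score.

    overall:       shape (n_students,), overall "unidimensional" scores
    strand_scores: shape (n_students, n_strands), separately calibrated scores
    Returns an array of shape (n_quantiles, n_strands); each row is one
    point on the trajectory through the strand space.
    """
    overall = np.asarray(overall, dtype=float)
    strand_scores = np.asarray(strand_scores, dtype=float)
    order = np.argsort(overall, kind="stable")   # students sorted low to high
    groups = np.array_split(order, n_quantiles)  # near-equal group sizes
    return np.array([strand_scores[g].mean(axis=0) for g in groups])


# Hypothetical data: two strands and an overall score that weights them unequally.
rng = np.random.default_rng(1)
strands = rng.standard_normal((2000, 2))
overall = 0.7 * strands[:, 0] + 0.3 * strands[:, 1]
path = strand_trajectory(overall, strands, n_quantiles=50)
print(path.shape)
```

Plotting the columns of `path` against each other (for example with matplotlib) gives one trajectory per grade; a straight line would indicate that the overall scale moves through the strand space linearly.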

Page 15: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Unidimensional Trajectories through Multidimensional Content Space

[Figure: Grade 3 MEAP Mathematics. Axes: Number & Operations and Measurement, each from -2.5 to 2.5.]

Page 16: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Unidimensional Trajectories through Multidimensional Content Space

[Figure: Grade 4 MEAP Mathematics. Axes: Number & Operations and Measurement, each from -2.5 to 2.5.]

Page 17: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Unidimensional Trajectories through Multidimensional Content Space

[Figure: Grade 5 MEAP Mathematics. Axes: Data & Probability, Geometry, and Measurement, each from -2 to 2.]

Page 18: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Unidimensional Trajectories through Multidimensional Content Space

[Figure: Grade 6 MEAP Mathematics. Axes: Data & Probability, Geometry, and Number & Operations, each from -2 to 2.]

Page 19: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Unidimensional Trajectories through Multidimensional Content Space

[Figure: Grade 7 MEAP Mathematics. Axes: Algebra, Geometry, and Number & Operations, each from -2 to 2.]

Page 20: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Unidimensional Trajectories through Multidimensional Content Space

[Figure: Grade 8 MEAP Mathematics. Axes: Algebra, Geometry, and Number & Operations, each from -2 to 2.]

Page 21: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces


Another View of Non-Linear Trajectories

□Vertically Scale all Mathematics 3-8 Items

□Use Unidimensional, Vertically Scaled Item Parameters to Calculate Student Strand Scores

□Calculate Centiles Across and Within Grades

□Create Plots
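The second step above, scoring each strand with item parameters fixed from the overall vertical scale, might look like the sketch below. It assumes a Rasch model and maximum-likelihood scoring by Newton-Raphson; the talk does not say which scoring method was used, and the item difficulties, strand labels, and responses shown are hypothetical.

```python
import numpy as np


def rasch_ability(responses, difficulties, n_iter=50, tol=1e-8):
    """Maximum-likelihood Rasch ability for one student, difficulties held fixed.

    Returns None for all-correct or all-incorrect response patterns, where the
    maximum-likelihood estimate is not finite.
    """
    responses = np.asarray(responses, dtype=float)
    difficulties = np.asarray(difficulties, dtype=float)
    raw_score = responses.sum()
    if raw_score == 0 or raw_score == len(responses):
        return None
    theta = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(theta - difficulties)))  # P(correct)
        gradient = raw_score - p.sum()       # first derivative of log-likelihood
        information = (p * (1.0 - p)).sum()  # negative second derivative
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical vertically scaled difficulties and one student's responses.
difficulties = np.array([-1.2, -0.4, 0.3, 0.9, 1.5, 2.1])
strand_of_item = np.array(["geometry", "geometry", "geometry",
                           "algebra", "algebra", "algebra"])
responses = np.array([1, 1, 0, 1, 0, 0])

# Score each strand using only that strand's items, on the common scale.
strand_scores = {
    strand: rasch_ability(responses[strand_of_item == strand],
                          difficulties[strand_of_item == strand])
    for strand in ("geometry", "algebra")
}
print(strand_scores)
```

Because the difficulties come from one unidimensional calibration, the resulting strand scores share a metric and can be summarized by centile across and within grades, for example with `np.percentile`.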

Page 22: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces


Page 23: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces


Page 24: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Yet Another View of Non-Linear Measurement

□ Reference composites
  □ Easiest third of items on grade 6
  □ Hardest third of items on grade 6
  □ Easiest third of items on grade 7
  □ Hardest third of items on grade 7
□ Identifies the dimensions best measured by the top and bottom thirds of the grade 6 and 7 tests
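For readers unfamiliar with reference composites, the sketch below uses the usual multidimensional IRT definition: the direction of the reference composite for a set of items is the eigenvector of a'a (with a the matrix of item discrimination vectors) that has the largest eigenvalue. The discrimination values are hypothetical and two-dimensional for brevity; the analysis in this talk used three dimensions and real MEAP items.

```python
import numpy as np


def reference_composite(a):
    """Unit-length direction of the reference composite for a set of items.

    a: array of shape (n_items, n_dims) of item discrimination vectors.
    """
    a = np.asarray(a, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(a.T @ a)
    direction = eigenvectors[:, np.argmax(eigenvalues)]
    # Eigenvectors are defined only up to sign; orient toward positive axes.
    if direction.sum() < 0:
        direction = -direction
    return direction


def angles_with_axes(direction):
    """Angle in degrees between the composite and each coordinate axis."""
    return np.degrees(np.arccos(np.clip(direction, -1.0, 1.0)))


# Hypothetical discriminations: easy items load mostly on dimension 1,
# hard items mostly on dimension 2.
easy_items = np.array([[1.1, 0.2], [0.9, 0.3], [1.0, 0.1]])
hard_items = np.array([[0.3, 1.0], [0.2, 1.2], [0.4, 0.9]])

easy_direction = reference_composite(easy_items)
hard_direction = reference_composite(hard_items)
print(angles_with_axes(easy_direction))
print(angles_with_axes(hard_direction))
```

When the composites for the easiest and hardest thirds point in different directions, the low and high ends of the same reported scale are measuring different mixtures of the underlying dimensions.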

Page 25: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

[Figure: Reference Composites of Easiest and Hardest Thirds of Items, plotted against Dimension 1, Dimension 2, and Dimension 3. Blue is Grade 6, Red is Grade 7.]

Page 26: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces


A Caveat About Psychometric Claims

□Corporations with a monetary interest tend to make the claims described below often and with impunity

□Academics are less likely to make these claims, but some do still make them

□When academics make these claims, they tend to make them under more stringent (but still insufficiently stringent) conditions

Page 27: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Psychometric Claim 1: Advanced Psychometric Procedures Produce (Essentially) Unidimensional Scales

□ Rasch/IRT Point of View
  □ “…[IRT scales are] meaningful only if each and every question contributes to the measure of a single attribute.”
    □ Bond & Fox, 2001, p. 25
  □ If there is a single dominant dimension in the data, the data can be successfully modeled as unidimensional.
    □ Nandakumar, 1991
  □ Items that are clearly multidimensional can be selected to model a composite dimension that satisfies essential unidimensionality needs.
    □ Reckase, Ackerman, & Carlson, 1988
□ The implication is that if the data fit the selected model (e.g. meet unidimensionality and fit tests), it is reasonable to treat the resulting scale as unidimensional from top to bottom

Page 28: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Psychometric Claim 1: Advanced Psychometric Procedures Produce (Essentially) Unidimensional Scales

□ Examples
  □ “We place all of our test items on the RIT scale according to their difficulty.”
    □ NWEA website, 9/18/2006
    □ http://www.nwea.org/assessments/researchbased.asp
  □ “the developmental standard score is also a number that describes a student's location on an achievement continuum.”
    □ ITBS website, 9/18/2006
    □ http://www.education.uiowa.edu/itp/itbs/itbs_interp_score.htm
  □ “The Quantile Framework measures student mathematical achievement and concept/application solvability on the same scale ….”
    □ Quantiles website, 9/18/2006
    □ http://www.quantiles.com/DesktopDefault.aspx?view=fa&tabindex=4&tabid=22

Page 29: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Psychometric Claim 1: Advantages of Claiming Unidimensional Scales

□ Unidimensionality is a great simplifier of…
  □ Measurement models
  □ Interpretation of the resulting scale
  □ Explanation of the scale to non-technical audiences
  □ Statistical analyses of the resulting scale
  □ Collaboration with substantive researchers
  □ Instrument development
  □ Meeting profit margins
□ Unidimensionality is a convenient selling point for…
  □ Clients
  □ Researchers
  □ Policymakers
  □ Managers
  □ Shareholders
  □ Ourselves

Page 30: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

The Illusion of Unidimensionality

For example, if the following two models both approximately capture the shared variance in items 1 through N, we select the less complex model to be parsimonious. So, what’s wrong with that?

[Figure: two path diagrams. First model: a single Achievement factor with Item 1, Item 2, Item 3, … Item N. Second model: Achievement Dimension 1 and Achievement Dimension 2 with Item 1, Item 2, Item 3, … Item N, plus a Predictor.]

Page 31: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

The Illusion of Unidimensionality

Let’s try this one on for size:

[Figure: two path diagrams. Left model: a single Economic Health factor with Projected IA Corn Crop, Price of Watermelon, AK Tourism Revenue, and College Town Revenue. Right model: the same four variables with Days since last hard frost and Summer Vacation Season.]

This model on the left is, of course, a ridiculous econometric model. The model on the right is simplistic, but more realistic. But how does the left model fare as a psychometric model?

Page 32: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

The Illusion of Unidimensionality

Let’s try this one on for size:

[Figure: two path diagrams. Left model: a single Mathematics Achievement factor with 2 digit addition, 2 digit multiplication, calculate triangle height, and identify parallelogram. Right model: the same four items with Computation Achievement, Geometry Achievement, and Class Environment.]

The left model fares very well in the psychometric community. But so does the model on the right. So how does the model on the left cause problems for education?

Page 33: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

How the Claim of Unidimensionality Can Harm the Educational Community

□ When the available data are “unidimensional,” studies are likely to be performed using general unidimensional outcomes
□ If there is any reasonable theory that different dimensions will respond differently to a given educational intervention, using general unidimensional outcomes is misleading
□ To visualize the effect of incorrect claims of unidimensionality
  □ Assume that the “unidimensional” scale traverses a multidimensional space in a linear fashion
  □ Understand that if the traversal is non-linear, the problems become greater (see Martineau, 2006a)

Page 34: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Unmodeled Multidimensionality

□ Best Case Scenario
  □ An essentially unidimensional reading scale traverses the decoding/comprehension space linearly
  □ Scale measures mostly decoding, but is also moderately sensitive to differences in comprehension
  □ Randomized experiment, students assigned to phonics versus whole language instruction (the two groups start out at the same location in the decoding/comprehension space)
  □ Pre-post assessment on the same unidimensional reading instrument (problems become worse if a different level of the assessment is used; see Martineau, 2004, 2006a)
  □ Phonics has a strong positive effect on decoding (effect size 0.4), no effect on comprehension; whole language has opposite effects

Page 35: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Unmodeled Multidimensionality

[Figure: Decoding versus Comprehension, each axis from 400 to 550. Annotation: essentially unidimensional reading scale, measuring mostly decoding, but some comprehension.]

Page 36: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Unmodeled Multidimensionality

[Figure: Decoding versus Comprehension, each axis from 400 to 550. Annotation: pretest centroid, both groups (e.g. randomized experiment).]

Page 37: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Unmodeled Multidimensionality

[Figure: Decoding versus Comprehension, each axis from 400 to 550. Annotations: whole-language post-test centroid; phonics post-test centroid.]

Page 38: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Unmodeled Multidimensionality

[Figure: Decoding versus Comprehension, each axis from 400 to 550. Annotation: projections of multidimensional centroids onto the essentially unidimensional scale.]

Page 39: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Unmodeled Multidimensionality

[Figure: Decoding versus Comprehension, each axis from 400 to 550. Annotation: Effect of Phonics over Whole Language.]

Page 40: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Unmodeled Multidimensionality

□ Misleading results
  □ Phonics instruction increases reading ability over whole language instruction
□ Accurate results with multidimensional instruments
  □ Phonics instruction increases decoding ability over whole language instruction
  □ Whole language instruction increases comprehension ability over phonics instruction
□ What are the policy ramifications of these misleading results? Can anyone say “Educational Pendulum?”
□ If the assessment were an essentially unidimensional measure of mostly comprehension but some decoding, the misleading results would go in exactly the opposite (but equally misleading) direction
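The projection argument in this scenario can be worked through numerically. The sketch below is illustrative only: it works in effect-size units with the shared pretest centroid at the origin, and the 20-degree angle between the scale and the decoding axis is a hypothetical stand-in for "mostly decoding, but some comprehension".

```python
import numpy as np


def project_onto_scale(points, angle_degrees):
    """Project (decoding, comprehension) points onto a linear composite scale.

    angle_degrees is the angle between the scale and the decoding axis;
    small angles mean the scale measures mostly decoding.
    """
    angle = np.radians(angle_degrees)
    direction = np.array([np.cos(angle), np.sin(angle)])  # unit vector
    return np.asarray(points, dtype=float) @ direction


# Post-test centroids relative to the common pretest centroid (effect sizes).
phonics_post = np.array([0.4, 0.0])         # gain on decoding only
whole_language_post = np.array([0.0, 0.4])  # gain on comprehension only

scores = project_onto_scale([phonics_post, whole_language_post],
                            angle_degrees=20.0)
apparent_effect = scores[0] - scores[1]
print(f"apparent effect of phonics over whole language: {apparent_effect:.2f}")
```

Under these assumptions the composite reports an advantage for phonics, although each method produced an equal-sized gain on a different dimension. Mirroring the angle to 70 degrees (mostly comprehension) flips the sign of the apparent effect.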

Page 41: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Alternate Effects of Unmodeled Multidimensionality

[Figure: Decoding versus Comprehension, each axis from 400 to 550. Annotation: projections of multidimensional centroids onto the essentially unidimensional scale.]

Page 42: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Alternate Effects of Unmodeled Multidimensionality

[Figure: Decoding versus Comprehension, each axis from 400 to 550. Annotation: No distinguishable effects on reading?]

Page 43: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Alternate Effects of Unmodeled Multidimensionality

□ Misleading results
  □ It does not matter whether one uses phonics or whole language instruction
□ Accurate results with multidimensional instruments
  □ Phonics instruction increases decoding ability over whole language instruction
  □ Whole language instruction increases comprehension ability over phonics instruction
□ What are the policy ramifications of these misleading results?
□ If the assessment were an essentially unidimensional measure of mostly comprehension but some decoding, the misleading results would go in exactly the opposite (but equally misleading) direction
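The scenarios above can be summarized in one expression. As a sketch, assume a linear composite with unit weight vector w = (w_D, w_C) on decoding and comprehension, a common pretest centroid, and equal-sized gains δ for phonics on decoding and for whole language on comprehension. These symbols are introduced here for illustration and do not appear in the original slides.

```latex
\[
\hat{\Delta}
  \;=\; \mathbf{w}^{\top}\left(\boldsymbol{\mu}_{P} - \boldsymbol{\mu}_{W}\right)
  \;=\; \begin{pmatrix} w_D & w_C \end{pmatrix}
        \begin{pmatrix} \delta \\ -\delta \end{pmatrix}
  \;=\; \delta\,(w_D - w_C)
\]
```

The estimated effect of phonics over whole language is positive when the scale weights decoding more heavily, negative when it weights comprehension more heavily, and zero when the two weights are equal, even though the true two-dimensional effects are identical in all three cases.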

Page 44: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Psychometric Claim 2: Advanced Psychometric Procedures Produce Linear, Interval Level Scales

□If a scale is unidimensional and measures at the interval level, the scale is then linear by definition

□Therefore

□ If we claim that a scale is unidimensional

AND

□ If we claim that a scale is interval level

THEN

□ We also claim that the scale is linear

Page 45: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Psychometric Claim 2: Advanced Psychometric Procedures Produce Linear, Interval Level Scales

□ Rasch Point of View
  □ “…estimates derived from Rasch procedures are located on an interval scale….”
    □ Bond & Fox, 2001, p. 119
  □ This claim is well accepted in the Rasch literature
□ Examples
  □ “You can liken the [NWEA Rasch scale] to a meterstick which is comprised of equal units of measurement, centimeters.”
    □ NWEA website, 9/18/2006
    □ http://www.nwea.org/assessments/researchbased.asp
  □ “Measurements…are now reportable in a common unit, a Lexile, which is similar to the degree calibrations on a thermometer….”
    □ Lexile website, 9/18/2006
    □ p. 14 of http://www.lexile.com/lexilearticles/objective-measurement-reading-response.pdf
  □ “Pearson Educational Measurement’s PASeries reading assessment is reported on the Lexile scale….”
    □ Pearson website, 9/18/2006
    □ http://www.pearsonpaseries.com/downloads/overview/PASeriesOverview.html

Page 46: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Psychometric Claim 2: Advanced Psychometric Procedures Produce Linear, Interval Level Scales

□ This claim is more controversial in the more general psychometric world, but the claim is made nonetheless
□ Claims for:
  □ “[IRT models] are calibrated to an equal-interval scale….”
    □ METRIC website, 9/18/2006
    □ http://www.metric.research.med.va.gov/learn/theories/theories_irt.asp
  □ “[An IRT] scale…[is not] interval…although it is popular and reasonable to assume [it is].”
    □ Hambleton, Swaminathan, and Rogers, 1991, p. 87
□ Claims against:
  □ “[The equal interval assumption] is…either demonstrably wrong…or arbitrary….”
    □ Cliff, 1991, p. 37
  □ “When the characteristic…cannot be directly observed, claims of equal-interval properties…are not testable and are therefore meaningless.”
    □ Zwick, 1992, p. 209

Page 47: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Psychometric Claim 2: Advanced Psychometric Procedures Produce Linear, Interval Level Scales

□ General Psychometric Examples
  □ “…[TerraNova K-12] Scale Scores [derived from 3-PL IRT models] are equal-interval scores….”
    □ CTB McGraw-Hill website, accessed 9/18/2006
    □ http://www.ctb.com/static/about_assessment/popup_f6.jsp
  □ “…the [ITBS] developmental standard score scale…mirrors reality better….[than Grade Equivalents in that] growth is usually not as great at the upper grades as it is at the lower grades.”
    □ Iowa Testing Program website, accessed 9/18/2006
    □ http://www.education.uiowa.edu/itp/itbs/itbs_interp_score.htm
  □ 1998 NAEP Technical manual describes several analyses that assume the scale scores are equal interval
    □ NCES website, accessed 9/18/2006
    □ http://nces.ed.gov/nationsreportcard/pdf/main1998/2001509.pdf

Page 48: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Psychometric Claim 2: Advantages of Claiming Linear, Interval-Level Scales

□ Scale linearity and level of measurement determine the types of analyses that are appropriate
  □ Non-linear, nominal/ordinal measurement
    □ Requires categorical analyses (discarding some data to create categories) or non-linear transformations
    □ Interpretation is more difficult
    □ Categorical/non-linear analyses are less well known and developed
  □ Linear, interval/ratio level measurement
    □ Allows for linear variable analyses without transformation
    □ Interpretation is more clear
    □ Analyses of linear interval/ratio variables are better known and better developed (e.g. HLM, ANOVA, Factor Analysis, Regression)
□ If we have linear, interval/ratio level measurement, then studies of education based on those measures are very much like studies of physical, directly measurable phenomena
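One way to see why the level of measurement matters for linear analyses: if a scale is only ordinal, any order-preserving rescaling of it is equally defensible, yet comparisons of mean gains are not invariant to such rescalings. The toy numbers and the choice of the exponential as the rescaling below are hypothetical, chosen only to make the reversal easy to see.

```python
import math


def mean_gain(pre, post, transform=lambda score: score):
    """Mean pre-to-post gain after applying a rescaling to every score."""
    gains = [transform(after) - transform(before)
             for before, after in zip(pre, post)]
    return sum(gains) / len(gains)


# Hypothetical scores: group A starts low, group B starts high.
group_a_pre, group_a_post = [1.0, 1.0], [3.0, 3.0]
group_b_pre, group_b_post = [5.0, 5.0], [6.0, 6.0]

# On the reported metric, group A gains more than group B.
print(mean_gain(group_a_pre, group_a_post), mean_gain(group_b_pre, group_b_post))

# After an order-preserving rescaling, group B gains more than group A.
print(mean_gain(group_a_pre, group_a_post, math.exp),
      mean_gain(group_b_pre, group_b_post, math.exp))
```

Every student keeps the same rank under both metrics, so an ordinal scale gives no basis for preferring one answer over the other. Only a defensible interval-level claim does.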

Page 49: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces


How the Claim of Linearity/Interval-Level Measurement Can Harm the Educational Community

□Non-linearity may dramatically increase the problems of un-modeled dimensionality

□Scenario demonstrates how this can happen

Page 50: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

How the Claim of Linearity/Interval-Level Measurement Can Harm the Educational Community

□ Scenario
  □ The MEAP grade 8 mathematics scale traverses the algebra/geometry space non-linearly (as taken from the empirical trajectories shown earlier)
  □ Scale measures mostly geometry near the bottom, and changes to mostly algebra near the top
  □ Quasi-experiment, matched samples (based on the unidimensional math pre-test) from convenience populations are assigned to use or not use 3-D manipulatives
  □ Pre-post assessment on the same unidimensional mathematics instrument (problems become worse if a different level of the assessment is used; see Martineau, 2004, 2006a)
  □ Treatment has a strong positive effect on geometry gains (effect size 0.5), no effect on algebra gains

Page 51: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Non-Linear, Non-Interval Scales

[Figure: Algebra (-1.5 to 0.5) versus Geometry (-2.0 to 0.5). Annotations: Treatment pre-test centroid; Control pre-test centroid.]

Page 52: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Non-Linear, Non-Interval Scales

[Figure: Algebra (-1.5 to 0.5) versus Geometry (-2.0 to 0.5). Annotation: Both pre-test centroids project onto the same location on the non-linear unidimensional scale (gives matched samples).]

Page 53: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Non-Linear, Non-Interval Scales

[Figure: Algebra (-1.5 to 0.5) versus Geometry (-2.0 to 0.5). Annotations: Treatment post-test centroid; Control post-test centroid.]

Page 54: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Non-Linear, Non-Interval Scales

[Figure: Algebra (-1.5 to 0.5) versus Geometry (-2.0 to 0.5). Annotation: Projections of post-test centroids onto non-linear unidimensional scale.]

Page 55: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Non-Linear, Non-Interval Scales

[Figure: Algebra (-1.5 to 0.5) versus Geometry (-2.0 to 0.5). Annotation: Effect in favor of the control group.]

Page 56: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Effects of Unmodeled Non-Linearity and Non-Interval Level Measurement

□ Misleading results
  □ Treatment has negative effect on math achievement
□ Accurate results with linear, interval-level multidimensional measurement
  □ Treatment has positive effect on geometry achievement
  □ Treatment has no effect on algebra achievement
□ What are the policy ramifications of these misleading results?
  □ Don’t use 3-D manipulatives; they are a detriment to learning mathematics!
  □ It is possible to actually reverse the estimated effects from real effects (an overall negative effect may be detected when the only real effect on any dimension is positive)!
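The reversal described above can be reproduced with a toy curved scale. Everything in this sketch is hypothetical: the trajectory is a circular arc (not the empirical grade 8 trajectory), a point's scale score is the arc-length position of the nearest point on the arc, and the centroids and gains are invented. Both groups gain the same amount of algebra and geometry, and the treatment group gains an extra 0.5 in geometry.

```python
import math

# Hypothetical arc in (algebra, geometry) space: it rises mostly along
# geometry at the bottom of the scale and bends toward algebra at the top.
CENTER = (0.5, -2.0)  # center of curvature of the arc
RADIUS = 2.0


def scale_score(point, center=CENTER, radius=RADIUS):
    """Arc-length position of the nearest point on the arc.

    The nearest point on a circle lies on the ray from the center through
    the point, so the score depends only on the point's polar angle.
    """
    polar_angle = math.atan2(point[1] - center[1], point[0] - center[0])
    return radius * (math.pi - polar_angle)


def point_on_ray(distance, angle_degrees=135.0, center=CENTER):
    """Point at a given distance from the center along one ray."""
    angle = math.radians(angle_degrees)
    return (center[0] + distance * math.cos(angle),
            center[1] + distance * math.sin(angle))


def shift(point, algebra_gain, geometry_gain):
    return (point[0] + algebra_gain, point[1] + geometry_gain)


# Matched on the scale (same ray), but at different locations in the space.
control_pre = point_on_ray(1.0)
treatment_pre = point_on_ray(3.0)

# Common growth of (0.5, 0.3); treatment adds 0.5 on geometry only.
control_post = shift(control_pre, 0.5, 0.3)
treatment_post = shift(treatment_pre, 0.5, 0.3 + 0.5)

control_gain = scale_score(control_post) - scale_score(control_pre)
treatment_gain = scale_score(treatment_post) - scale_score(treatment_pre)
apparent_effect = treatment_gain - control_gain
print(f"apparent treatment effect on the curved scale: {apparent_effect:.2f}")
```

In this construction the apparent effect is negative even though the only real difference between the groups is a positive geometry effect. The two groups sit at different distances from the bend in the scale, so the same movement in the content space converts into different amounts of scale-score change.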

Page 57: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Implications: Where Do We Go From Here With Unidimensional Psychometric Models?

□ Need more stringent measures of unidimensionality
  □ Avoid relying on the assumption that shared variance means that items measure the same construct
  □ Pay more careful attention to subject matter specialists/theorists in terms of
    □ What constitutes a separable dimension
    □ Whether different dimensions are reasonably likely to be affected differently by exogenous variables we want to include in the models
  □ Incorporate both empirical dimensionality assessment and theoretical content dimensions

Page 58: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

Implications: Where Do We Go From Here With Unidimensional Psychometric Models?

□ Unless a measure is very narrow, drop the claim of linearity and interval-level measurement
  □ Stop marketing broad “unidimensional” scales as adequate for statistical procedures requiring linear, interval-level scales
□ Take proactive measures to discourage inappropriate uses of broad “unidimensional” scales as linear and interval-level:
  □ Clearly label broad “unidimensional” scales as ordinal and/or ordered metrics
  □ Caution against assuming linearity and/or interval-level measurement
  □ Stop washing our hands of inappropriate uses of psychometric scales because we are not the ones using the scales (we only produce them)
  □ Speak out against inappropriate uses

Page 59: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

But Linearity & Interval-level Measurement are So Useful, How Do We Get Them Back?

□ We will achieve these goals neither quickly nor easily
  □ There is a strong investment in claiming properties of linearity, unidimensionality, and interval-level measurement
    □ Profit!
    □ Prestige
    □ Simplicity
    □ Theory (e.g. Rasch and/or IRT)
  □ Adequate psychometric models do not currently exist to substantiate these claims

Page 60: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

But Linearity & Interval-level Measurement are So Useful, How Do We Get Them Back?

□ Step out of unidimensional illusions and into multidimensional reality
  □ 30+ years of research on multidimensionality
  □ Negligible application
  □ Feasibility is a problem (“it’s impractical”)
    □ Cost
    □ Software
    □ Effort
  □ Replicability is a problem
  □ Interpretability is a problem
  □ Dimensionality theory is a problem

Page 61: Non-Linear Unidimensional Scale Trajectories Through Multidimensional Content Spaces

But Linearity & Interval-level Measurement are So Useful, How Do We Get Them Back?

□ Step out of unidimensional illusions and into multidimensional reality
  □ Step away from what is convenient to…
    □ Develop stronger models of dimensionality that do not assume shared variance necessarily indicates shared underlying constructs
    □ Develop feasible and replicable methods of multidimensional scale analysis that do not assume simple structure
    □ Develop methods of multidimensional psychometric analysis that pay strong attention to both theoretical (e.g. intended, interpreted) and empirical (e.g. observed) content structure