Diagnostic Modeling of English Language Grammar Attributes

Session 3

AERA Diagnostic Measurement Workshop 1


Session 3 Overview

• An example of a DCM through an application

• Educational measurement – English proficiency
  LCDM demonstration in practice
  Sample results
  Potential problems in analysis


LARGE-SCALE LANGUAGE ASSESSMENT USING THE LCDM

Session 3 – Diagnostic Modeling of English Language Grammar Attributes


Introduction

• With today’s academic environment emphasizing testing, the focus on formative assessment is growing

• Among possible formative settings, language assessment has received some attention (e.g., Buck & Tatsuoka, 1998; Jang, 2004; von Davier, 2005)

• The purpose of this study is to explore the possibility of using the LCDM for the evaluation of the grammar section of the Examination for the Certificate of Proficiency in English (ECPE)

Also provides an example analysis using the LCDM


Examination for the Certificate of Proficiency in English (ECPE)

• The ECPE is a test developed and scored by the English Language Institute of the University of Michigan

• The ECPE was developed to measure advanced English ability in respondents for whom English is not their first language

• Analysis is for the grammar section of the test
  40 multiple-choice items (28 items used in analysis)
  10 were non-operational
  2 had difficulties greater than 0.9


Example Item from Grammar Section

• An example written to resemble an item in the Grammar section of the ECPE is:

I have always ______ snow.

to enjoy

enjoyed

enjoying

to enjoyed


ECPE ANALYSIS METHODS

Session 3 – Diagnostic Modeling of English Language Grammar Attributes


Examinees and Data

• Data from a total of 2,922 examinees were used to analyze the 2003-2004 ECPE Grammar section
  The average age of examinees was approximately 23 years
  Approximately 50% spoke Portuguese and an additional 31% spoke Spanish as a first language

• Presently, a composite score is provided for each examinee

• This study illustrates how DCMs can provide skill-specific feedback for examinees

The full LCDM was estimated using Mplus

Marginal maximum likelihood estimation


Attributes Measured by Test

• Three attributes measured, representing knowledge of:
  Morphosyntactic rules
  Cohesive rules
  Lexical rules

• Q-matrix characteristics
  19 items measuring only one attribute (simple structure)
  9 items measuring two attributes
  0 items measuring all three attributes


ECPE Q-matrix

• Here are the entries for several items from the ECPE Q-matrix

Item   Morphosyntactic Rules   Cohesive Rules   Lexical Rules
1      1                       1                0
2      0                       1                0
3      1                       0                1
4      0                       0                1
5      0                       0                1
6      0                       0                1
7      1                       0                1
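The Q-matrix rows above can be held in a small lookup table. The following is a minimal sketch in Python; the names `Q_ROWS`, `ATTRIBUTES`, `attributes_measured`, and `count_by_complexity` are illustrative assumptions and are not part of the workshop materials. It covers only the seven items shown, not the full 28-item Q-matrix.

```python
# Q-matrix entries for the seven items shown on the slide.
# Column order: (morphosyntactic, cohesive, lexical).
ATTRIBUTES = ("morphosyntactic", "cohesive", "lexical")

Q_ROWS = {
    1: (1, 1, 0),
    2: (0, 1, 0),
    3: (1, 0, 1),
    4: (0, 0, 1),
    5: (0, 0, 1),
    6: (0, 0, 1),
    7: (1, 0, 1),
}


def attributes_measured(item):
    """Return the names of the attributes an item measures."""
    return [name for name, q in zip(ATTRIBUTES, Q_ROWS[item]) if q == 1]


def count_by_complexity(rows):
    """Count items by the number of attributes they measure."""
    counts = {}
    for entries in rows.values():
        n_attributes = sum(entries)
        counts[n_attributes] = counts.get(n_attributes, 0) + 1
    return counts


print(attributes_measured(7))
print(count_by_complexity(Q_ROWS))
```

Applied to all 28 rows, the same count would be expected to reproduce the 19 single-attribute and 9 two-attribute items reported earlier.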


LCDM RESULTS

Session 3 – Diagnostic Modeling of English Language Grammar Attributes


LCDM Results

• To further describe the parameters of the LCDM, several types of results will be presented:

  Item parameter results
    Inspection of interactions
    Interpretation
  Structural parameter results
    Implied attribute hierarchy


Example Item

• To demonstrate parameter interpretation, results from Item 7 will be shown

  Attributes measured:
    Morphosyntactic rules (Attribute 1)
    Lexical rules (Attribute 3)

• Parameter estimates:

Parameter     Estimate   SE      p-value
λ7,0          -0.106     0.095   0.264
λ7,1,(1)       2.855     0.208   0.000
λ7,1,(3)       0.952     0.144   0.000
λ7,2,(1,3)    -0.952     0.144   0.000
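For reference, the four parameters in the table combine into the Item 7 response function as written below. This is the standard LCDM form for an item measuring two attributes, with α1 and α3 as 0/1 mastery indicators. The notation X_7 for the Item 7 response is an assumption made here for illustration.

```latex
\operatorname{logit} P(X_7 = 1 \mid \boldsymbol{\alpha})
  = \lambda_{7,0}
  + \lambda_{7,1,(1)}\,\alpha_1
  + \lambda_{7,1,(3)}\,\alpha_3
  + \lambda_{7,2,(1,3)}\,\alpha_1\alpha_3
  = -0.106 + 2.855\,\alpha_1 + 0.952\,\alpha_3 - 0.952\,\alpha_1\alpha_3
```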


LCDM Intercepts

• Estimated intercept: -0.106 (SE = 0.095)

• Indicates the logit of a correct response for a non-master of all attributes
  Here, non-masters have an average probability of a correct response of exp(-0.106)/(1+exp(-0.106)) = 0.47

• Hypothesis test is not important
  Tests whether non-masters have a probability of a correct response of 0.5

• Problematic when very high
  Difficult to identify other parameters
  Indicates issues with test, Q-matrix, or attributes
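The intercept arithmetic can be checked with a short Python sketch. It converts the estimated logit to a probability and computes a two-sided Wald p-value for the null hypothesis that the logit is zero (a probability of 0.5). Treating the reported p-value as a two-sided normal-theory Wald test is an assumption, and the function names are illustrative.

```python
import math


def inv_logit(x):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def wald_two_sided_p(estimate, se, null_value=0.0):
    """Two-sided p-value for a Wald z-test of estimate against null_value."""
    z = (estimate - null_value) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


intercept, se = -0.106, 0.095  # Item 7 intercept and its standard error

p_correct = inv_logit(intercept)           # probability for non-masters
p_value = wald_two_sided_p(intercept, se)  # test of logit = 0, i.e. P = 0.5

print(round(p_correct, 2))
print(round(p_value, 3))
```

Under that assumption, the computed p-value agrees closely with the 0.264 reported in the parameter table. This is consistent with the test being a comparison against a probability of 0.5, not a check of item quality.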


Higher Order Model Parameters

• Interpretation of main effects and interactions proceeds sequentially:

• If interactions are present:
  Examine highest level of interaction
  If significantly different from zero, leave in model
  If not, term can be omitted

• If interactions are not present:
  Examine how far main effect is from zero


Examining Interaction Parameters

• 2-way interaction parameter: -0.952 (0.144)

• P-value for parameter was small (0.000)
  Indicates parameter is significantly different from zero
  Candidate to leave in model

• Value indicates that there is an under-additive effect of mastering both attributes
  Means mastery of one attribute is sufficient for a high chance of answering the item correctly


More on Interactions

• Interaction pattern for this item indicates that mastery of morphosyntactic rules is key to answering correctly
  Mastery of lexical rules helps, but not above that of mastery of morphosyntactic rules
  For why this is the case, stay tuned…


[Figure: Item 7 response function. Left panel: Logit(X=1|α) plotted for α3=0 and α3=1, separately for α1=0 and α1=1. Right panel: P(X=1|α) for the possible attribute patterns (α1=0, α3=0; α1=0, α3=1; α1=1, α3=0; α1=1, α3=1).]
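The logits and probabilities for the four attribute patterns can be recomputed from the Item 7 estimates in the parameter table. The Python sketch below does so. The constant and function names are illustrative assumptions, and the response function is the standard two-attribute LCDM form.

```python
import math
from itertools import product

# Item 7 estimates from the parameter table.
LAMBDA_0 = -0.106    # intercept
LAMBDA_1 = 2.855     # main effect, morphosyntactic rules (attribute 1)
LAMBDA_3 = 0.952     # main effect, lexical rules (attribute 3)
LAMBDA_13 = -0.952   # two-way interaction of attributes 1 and 3


def item7_logit(a1, a3):
    """Logit of a correct response given mastery indicators a1 and a3."""
    return LAMBDA_0 + LAMBDA_1 * a1 + LAMBDA_3 * a3 + LAMBDA_13 * a1 * a3


def inv_logit(x):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


for a1, a3 in product((0, 1), repeat=2):
    logit = item7_logit(a1, a3)
    print(f"a1={a1} a3={a3} logit={logit:.3f} prob={inv_logit(logit):.2f}")
```

Because the interaction cancels the lexical main effect, masters of morphosyntactic rules have the same logit (2.749) whether or not they have also mastered lexical rules.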


Other Interactions

• Of 9 interaction parameters, 3 were significantly different from zero
  The other 6 are candidates to be removed from model

• Of the 6 non-significant interactions, 4 had small main effects on one attribute
  Attribute not highly related to item response
  Indicates that Q-matrix may be incorrect
  Have to re-fit with new Q-matrix and look at information criteria (later session)


Interpreting Main Effects

• When significant interactions are present, main effects cannot be easily interpreted
  Sometimes called conditional main effects
  Need to know combination of attributes mastered to fully describe item response function

• Main effects in LCDM have added concern
  Lower bound is zero (for monotonicity)
  p-values are inaccurate as estimates approach zero
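A small Python sketch can make the monotonicity idea concrete for a two-attribute item. The slide names only the zero lower bound on main effects. The additional condition used here, that the interaction may not be more negative than the smaller main effect, is an assumption about how monotonicity is usually imposed, and the function name is hypothetical.

```python
def satisfies_monotonicity(main_a, main_b, interaction):
    """Check assumed monotonicity conditions for a two-attribute item.

    Mastering an additional attribute should never lower the logit:
    both main effects must be non-negative, and the interaction must
    not outweigh either main effect.
    """
    main_effects_ok = main_a >= 0 and main_b >= 0
    interaction_ok = interaction >= -min(main_a, main_b)
    return main_effects_ok and interaction_ok


# Item 7 estimates from the parameter table.
MAIN_1, MAIN_3, INTERACTION = 2.855, 0.952, -0.952

item7_ok = satisfies_monotonicity(MAIN_1, MAIN_3, INTERACTION)
on_boundary = INTERACTION == -min(MAIN_1, MAIN_3)

print(item7_ok, on_boundary)
```

For Item 7 the interaction estimate equals the negative of the lexical main effect, so under this assumed condition the estimate sits exactly on the boundary. Tests of parameters at or near a boundary are one reason to treat the reported p-values as approximate.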


ECPE Item 7 Lexical Main Effect

• Because of the significant interaction, interpretation is conditional

• When Morphosyntactic Rules have not been mastered:

• Lexical main effect λ7,1,(3) = 0.952

• Respondents who have mastered Lexical Rules have an increase in logit of 0.952 over respondents who are non-masters

[Figure: P(X=1|α) for the possible attribute patterns (α1=0, α3=0; α1=0, α3=1; α1=1, α3=0; α1=1, α3=1).]


ECPE Item 7 Morphosyntactic Main Effect

• Because of the significant interaction, interpretation is conditional

• When Lexical Rules have not been mastered:

• Morphosyntactic main effect λ7,1,(1) = 2.855

• Respondents who have mastered Morphosyntactic Rules have an increase in logit of 2.855 over respondents who are non-masters

[Figure: P(X=1|α) for the possible attribute patterns (α1=0, α3=0; α1=0, α3=1; α1=1, α3=0; α1=1, α3=1).]
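The conditional reading of both main effects can be sketched in Python using the estimates from the Item 7 parameter table. The effect of mastering one attribute is its main effect plus the interaction whenever the other attribute is already mastered. The helper name `conditional_effect` is an illustrative assumption.

```python
# Item 7 estimates from the parameter table.
MAIN_MORPHOSYNTACTIC = 2.855   # lambda_7,1,(1)
MAIN_LEXICAL = 0.952           # lambda_7,1,(3)
INTERACTION = -0.952           # lambda_7,2,(1,3)


def conditional_effect(main_effect, interaction, other_attribute_mastered):
    """Change in logit from mastering one attribute, given the other's status."""
    return main_effect + interaction * other_attribute_mastered


# Effect of mastering lexical rules, by morphosyntactic status (0 or 1).
lexical_effects = {
    a1: conditional_effect(MAIN_LEXICAL, INTERACTION, a1) for a1 in (0, 1)
}

# Effect of mastering morphosyntactic rules, by lexical status (0 or 1).
morphosyntactic_effects = {
    a3: conditional_effect(MAIN_MORPHOSYNTACTIC, INTERACTION, a3) for a3 in (0, 1)
}

print(lexical_effects)
print(morphosyntactic_effects)
```

The lexical effect is 0.952 for non-masters of morphosyntactic rules and vanishes for masters. The morphosyntactic effect is 2.855 for non-masters of lexical rules and drops to about 1.903 for masters.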


General Modeling Tips

• High-level interactions are difficult to estimate in most samples

More than 2-way interactions may not be possible

• Modeling strategy:
  Try all interactions
  If model does not converge, limit to only 2-way interactions
  Remove non-significant interactions from model
  If all interactions and main effects for an attribute are close to zero:
    Entry for attribute in Q-matrix can be removed
  Double check with AIC/BIC as hypothesis test is approximate (more on model fit to come later)
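The AIC/BIC check in the last step can be sketched in Python with the standard definitions of the two criteria. The log-likelihoods and parameter counts below are hypothetical placeholders, not results from the ECPE analysis. Only the sample size of 2,922 comes from the slides.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


N_EXAMINEES = 2922  # sample size reported for the ECPE analysis

# Hypothetical fit summaries for two candidate models (placeholders only).
candidates = {
    "full_q_matrix": {"loglik": -42500.0, "n_params": 81},
    "revised_q_matrix": {"loglik": -42503.0, "n_params": 73},
}

for name, fit in candidates.items():
    fit["aic"] = aic(fit["loglik"], fit["n_params"])
    fit["bic"] = bic(fit["loglik"], fit["n_params"], N_EXAMINEES)

# Lower values indicate the preferred model under each criterion.
best_by_aic = min(candidates, key=lambda name: candidates[name]["aic"])
best_by_bic = min(candidates, key=lambda name: candidates[name]["bic"])

print(best_by_aic, best_by_bic)
```

With these placeholder values the smaller model is preferred by both criteria, because removing eight parameters costs only three log-likelihood units.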


Attribute Pattern Probabilities

• Base-rate pattern of profiles mastered in sample indicates an attribute hierarchy
  Lexical → Cohesive → Morphosyntactic

• Implications for Item 7
  Cannot have morphosyntactic without lexical

• Suggests information about second-language acquisition
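The consequence of the hierarchy for the possible attribute profiles can be enumerated in Python. The sketch assumes the hierarchy is linear, with lexical rules acquired first, then cohesive rules, then morphosyntactic rules. That reading is inferred from the statement that morphosyntactic mastery cannot occur without lexical mastery, and the names used are illustrative.

```python
from itertools import product

# Assumed linear hierarchy: each attribute requires the one before it.
ORDER = ("lexical", "cohesive", "morphosyntactic")


def permitted_profiles(order):
    """List mastery profiles consistent with a linear attribute hierarchy."""
    profiles = []
    for bits in product((0, 1), repeat=len(order)):
        # A later attribute may be mastered only if the earlier one is.
        if all(bits[i] >= bits[i + 1] for i in range(len(bits) - 1)):
            profiles.append(dict(zip(order, bits)))
    return profiles


profiles = permitted_profiles(ORDER)
for profile in profiles:
    print(profile)
```

Only 4 of the 8 possible profiles remain, and none has morphosyntactic mastery without lexical mastery. Under this assumption the Item 7 pattern α1=1, α3=0 describes essentially no examinees, which helps explain why lexical mastery adds nothing once morphosyntactic rules are mastered.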


CONCLUDING REMARKS – EDUCATIONAL MEASUREMENT

Session 3 – Diagnostic Modeling of English Language Grammar Attributes


Educational Measurement Wrap-Up

• Demonstrated results from LCDM when applied to English language assessment

• Described item parameter estimates
  Interpreting interactions/main effects
  Modeling strategy

• Described structural parameter estimates
  Useful for understanding latent variables measured by test


Wrap-Up and Take-Home Points

• Session 3 demonstrated a potential use of DCMs

• Applications of DCMs are rare
  Tests haven’t been built to measure categorical attributes
  Item information is different in DCMs
  Users haven’t had access to software
    Previously, most applications used software built by researchers
      – MCMC in Fortran or WinBUGS
      – MML in Fortran
    Now, researchers can use Mplus


Notes on Usefulness of DCMs

• Full utility of DCMs cannot be understood unless applications become more frequent
  For now, have to use sub-optimal data and problems
  Future applications coming soon
    Mathematical reasoning test under development (NSF funded)
    Assessment of readiness for first grade in kindergartners

• Funding opportunities exist and seem to review well
  Educational Measurement: NSF (DR-K12); IES (Goals 2 and 5)
  Psychological Measurement: NIH (NIMH; NIDA; NIA;…)

• Industry seems interested
  ETS/College Board/ACT/Measurement Inc.
  Typically proprietary – dangerous for academics
