Page 1: MEASUREMENT Goal

To develop reliable and valid measures using state-of-the-art measurement models

Members: Chang, Berdes, Gehlert, Gibbons, Schrauf, Weiss

Page 2: Why Item Response Theory?

Classical Test Theory (Traditional) | Item Response Theory (Modern)
Measures of precision fixed for all scores | Precision measures vary across scores
Longer scales increase reliability | Shorter, targeted scales can be equally reliable (Short Form)
Scale properties are sample dependent | Item & scale properties are invariant within a linear transformation (DIF)
Comparing person scores dependent on item set | Person scores comparable across different item sets (CAT)
Comparing respondents requires parallel scales | Different scales can be placed on a common metric (Instrument Linking/Equating)
Mixed item formats lead to unbalanced impact on total scale scores | Easily handles mixed item formats
Summed scores are on an ordinal scale | Scores on interval scale
- | Graphical tools for item and scale analysis
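The phrase "invariant within a linear transformation" can be made concrete with a short derivation. This is a sketch that assumes a logistic model in which the response probability depends on the person and the item only through the product a_i(θ − b_i), where a_i is the item discrimination, b_i the item difficulty, and θ the person latent trait. The rescaling constants A > 0 and B are notation introduced here for illustration and do not appear in the slides.

```latex
\theta^{*} = A\theta + B, \qquad b_i^{*} = A b_i + B, \qquad a_i^{*} = \frac{a_i}{A}

a_i^{*}\left(\theta^{*} - b_i^{*}\right)
  = \frac{a_i}{A}\left(A\theta + B - A b_i - B\right)
  = a_i\left(\theta - b_i\right)
```

Because the product is unchanged, every response probability is unchanged. Parameter estimates from two calibrations of the same items should therefore agree up to one such linear transformation, which is the property that linking and equating rely on.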

Page 3: Item Response Theory (IRT)

A family of mathematical descriptions of what happens when a person meets a test or survey question

Relates characteristics of items (item parameters) and characteristics of persons (person latent traits) to the probability of a correct or rating/categorical response

Models the test-taking behavior at the item level

Page 4: Item-Person Map

[Figure: item-person map. Labels: Person Latent Trait (Poor to Good); Item Location (Likely, “easy”, to Unlikely, “hard”). Individual items are marked Q.]

Chang & Gehlert (2002).

Page 5: Dichotomous Unidimensional IRT Models

1-PL (Rasch)
– Difficulty (b)

2-PL
– Difficulty (b)
– Discrimination (a)

3-PL
– Difficulty (b)
– Discrimination (a)
– Guessing (c)

[Figure: item characteristic curve. Horizontal axis: Ability (θ), with b_i marked; vertical axis: probability from 0 to 1.0, with lower asymptote c_i; slope label: Constant x a_i.]

$$P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-D a_i (\theta - b_i)}}$$
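The three dichotomous models can be sketched as one function with progressively fewer free parameters. This is a minimal illustration, not the authors' software: the function names are hypothetical, the scaling constant D = 1.7 is an assumed conventional value, and the 1-PL is written here as the special case a = 1, c = 0.

```python
import math

D = 1.7  # assumed scaling constant (conventional value; not stated on the slide)


def p_3pl(theta, a, b, c=0.0, d=D):
    """3-PL probability of a keyed response: c + (1 - c) * logistic(d * a * (theta - b))."""
    logistic = 1.0 / (1.0 + math.exp(-d * a * (theta - b)))
    return c + (1.0 - c) * logistic


def p_2pl(theta, a, b, d=D):
    """2-PL: the 3-PL with no guessing (c = 0)."""
    return p_3pl(theta, a, b, c=0.0, d=d)


def p_1pl(theta, b, d=D):
    """1-PL: the 2-PL with a common discrimination, fixed to 1 in this sketch."""
    return p_3pl(theta, a=1.0, b=b, c=0.0, d=d)


if __name__ == "__main__":
    # Hypothetical item: a = 1.2, b = 0.0, c = 0.2
    for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(theta, round(p_3pl(theta, a=1.2, b=0.0, c=0.2), 3))
```

At θ = b the logistic term equals 0.5, so the 3-PL probability there is c + (1 − c)/2, halfway between the lower asymptote and 1.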

Page 6: Polytomous IRT Models

Polytomous
– 1-PL (threshold)
  Partial Credit
  Rating Scale
– 2-PL (threshold & discrimination)
  Nominal
  Graded Response
  Generalized Partial Credit

[Figure: Item Characteristic Curve: 0001, Partial Credit Model (Normal Metric). Horizontal axis: Ability, -3 to 3; vertical axis: Probability, 0 to 1.0; one curve for each response category (1, 2, 3).]

* Vigorous activities, such as running, lifting heavy objects, participating in strenuous sports

1=Yes, Limited a lot
2=Yes, Limited a little
3=No, Not Limited at all
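Category curves like those for the three-category item above can be computed from the partial credit model. The sketch below assumes the standard formulation, in which the log-odds of adjacent categories is θ minus a step difficulty. The step values are hypothetical, the function name is invented for illustration, and no scaling constant is applied, so the metric may differ from the "Normal Metric" plot.

```python
import math


def pcm_probs(theta, steps):
    """Category probabilities for one item under the partial credit model.

    steps holds the step difficulties delta_1..delta_m; the result has m + 1
    probabilities, one per ordered response category.
    """
    # Cumulative sums of (theta - delta_j); the lowest category has sum 0.
    cumulative = [0.0]
    for delta in steps:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cumulative)
    weights = [math.exp(v - top) for v in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    labels = ["1=Limited a lot", "2=Limited a little", "3=Not limited at all"]
    hypothetical_steps = [-0.5, 0.8]  # illustrative values, not estimated from data
    for theta in (-3.0, 0.0, 3.0):
        probs = pcm_probs(theta, hypothetical_steps)
        print(theta, dict(zip(labels, (round(p, 3) for p in probs))))
```

Two adjacent categories are equally likely exactly where θ equals the step difficulty between them, which is where their curves cross in a plot of this kind.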

Page 7: Potential Advantages of Using IRT in “Geriatric” Pain Assessment

Refine existing instruments
Evaluate item and scale characteristics
Evaluate different response formats
Detect differential item functioning
Evaluate person fit (clinical diagnosis)
Equate/Link instruments
Establish item banks and brief forms
Develop computerized adaptive testing

Page 8: Item Banking and CAT

[Figure: Item Pool (Sets of Questions), with sets labelled A to F and "new", passing through IRT into an Item Bank (Catalogued; Hierarchically Structured), which supports Brief Forms and CAT. Two plots show Probability of Response (0 to 1.0) against the trait level (-3 to 3), one labelled Depression and one labelled Overall Mental Health.]

Page 9: Principles of Adaptive Testing

IRT pre-calibrated item bank
Initial item selection
Test scoring method
Item selection during test administration
Stopping rules

Page 10: Item Bank

Set of carefully IRT-calibrated questions
Items cover entire latent trait continuum
Items represent differing amounts of trait
Items represent differing amounts of information
Basis for tailored/adaptive testing
Items can be selected to maximize precision and retain clinical relevance
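"Differing amounts of information" has a precise meaning in IRT: each calibrated item has an information function that peaks near its own location. The sketch below uses the standard 3-PL item information formula to pick the most informative item at a given trait level. The bank contents, item names, and D = 1.7 are assumptions made for illustration only.

```python
import math

D = 1.7  # assumed scaling constant


def p_3pl(theta, a, b, c=0.0):
    """3-PL response probability."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c=0.0):
    """3-PL item information; reduces to (D*a)^2 * P * (1 - P) when c = 0."""
    p = p_3pl(theta, a, b, c)
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


# Hypothetical calibrated bank: item name -> (a, b)
BANK = {"Q1": (1.0, -1.5), "Q2": (1.5, 0.0), "Q3": (0.8, 1.2)}


def most_informative(theta, bank):
    """Name of the bank item with the largest information at theta."""
    return max(bank, key=lambda name: item_information(theta, *bank[name]))


if __name__ == "__main__":
    for theta in (-1.5, 0.0, 1.5):
        print(theta, most_informative(theta, BANK))
```

Selecting on information alone is the simplest rule. A bank that must also "retain clinical relevance" would add content constraints on top of it, which this sketch does not attempt.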

Page 11: Item Banking is Inter-disciplinary

Psychometricians
Information scientists
Clinicians/healthcare providers
Outcomes researchers
Content experts
…

Page 12: Approaches to Develop Item Banks

Top-Down Approach
Bottom-Up Approach

[Figure: hierarchical domain tree rooted at Health. Node labels: Physical, Mental, Social, Spiritual, Physical Functioning, Pain, Symptom, Depression, Anxiety.]

Page 13: Development and Maintenance of an Item Bank

How to best calibrate existing items?
– Model selection
– Whose item parameters to use?
– Standardization?
– Generic vs. disease-specific

Item parameter drift
– Anchor or Re-calibrate?

How to write and best test new items?

Page 14: Adaptive Test

An adaptive test is a tailored, individualized measure which involves selecting a set of test items for each individual that best measures the psychological characteristics of that person (Weiss, 1985)

Weiss DJ. Adaptive testing by computer. J Consult Clin Psychol. Dec 1985;53(6):774-789.

Page 15: Why Computerized Adaptive Testing?

Adaptive testing selects questions based on previous responses
Tailored item and test difficulties
Eliminates floor and ceiling effects
Requires fewer questions to arrive at an accurate estimate
Automates question administration, data recording, scoring, and prompt reporting
Allows for immediate feedback

Page 16: CAT Algorithm

1. Administer Item of Median Difficulty (or Screening Item)
2. Score Item
3. Estimate Latent Trait (Theta)
4. Termination Criterion Satisfied?
   Yes: Stop
   No: Choose and Administer Next Item with Maximum Information, then return to step 2
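The flowchart can be sketched as a short loop. Several choices below are assumptions, not part of the slide: a 2-PL model with D = 1.7, expected a posteriori (EAP) scoring under a standard normal prior, a stopping rule based on the posterior standard deviation or a maximum test length, an evenly spaced hypothetical bank, and a deterministic simulated respondent. Real respondents answer probabilistically, and all names here are invented for illustration.

```python
import math

D = 1.7  # assumed scaling constant
GRID = [i / 10.0 for i in range(-40, 41)]  # theta grid used for EAP scoring


def log_lik(theta, a, b, x):
    """Log-likelihood of response x (1 or 0) to a 2-PL item, in a stable form."""
    z = D * a * (theta - b)
    return -math.log1p(math.exp(-z if x == 1 else z))


def info(theta, a, b):
    """2-PL item information at theta."""
    p = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return (D * a) ** 2 * p * (1.0 - p)


def eap(responses):
    """Posterior mean and SD of theta under a standard normal prior."""
    logs = [-0.5 * t * t + sum(log_lik(t, a, b, x) for (a, b), x in responses)
            for t in GRID]
    top = max(logs)
    w = [math.exp(v - top) for v in logs]
    total = sum(w)
    mean = sum(t * wi for t, wi in zip(GRID, w)) / total
    var = sum((t - mean) ** 2 * wi for t, wi in zip(GRID, w)) / total
    return mean, math.sqrt(var)


def run_cat(bank, answer, se_target=0.35, max_items=20):
    """Run one adaptive test; returns (theta, se, history)."""
    remaining = dict(bank)
    # Step 1: start with the item of median difficulty.
    by_difficulty = sorted(remaining, key=lambda k: remaining[k][1])
    item = by_difficulty[len(by_difficulty) // 2]
    responses, history = [], []
    while True:
        params = remaining.pop(item)
        x = answer(item, params)              # Step 2: score the item
        responses.append((params, x))
        theta, se = eap(responses)            # Step 3: estimate theta
        history.append((item, x, theta, se))
        # Step 4: termination criterion
        if se <= se_target or len(responses) >= max_items or not remaining:
            return theta, se, history
        # Otherwise choose the remaining item with maximum information.
        item = max(remaining, key=lambda k: info(theta, *remaining[k]))


# Hypothetical bank: 31 items, a = 1.5, difficulties from -3.0 to 3.0.
BANK = {f"Q{k:02d}": (1.5, round(-3.0 + 0.2 * k, 1)) for k in range(31)}


def deterministic_respondent(true_theta):
    """Simulated respondent who endorses every item located below true_theta."""
    return lambda item, params: 1 if true_theta > params[1] else 0


if __name__ == "__main__":
    theta, se, history = run_cat(BANK, deterministic_respondent(0.9))
    for step in history:
        print(step)
```

EAP scoring is used here because it gives a finite estimate after a single response, whereas a maximum-likelihood estimate does not exist until both response types have been observed. Other scoring methods fit the same loop.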

Page 17: Increase of Accuracy of Ability or Latent Trait Estimation in CAT

[Figure: interval estimates of Ability (θ) after Item 1, Items 1-2, Items 1-3, Items 1-4, and Items 1-5.]

For each item added to the test, the width of the interval decreases.
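One way to see why the interval narrows is through the test information function. The relation below is a sketch under the usual large-sample approximation for maximum-likelihood scoring, with the item information written for a 2-PL item as an assumed example; I_i, n, and the 95% multiplier 1.96 are notation introduced here.

```latex
I_i(\theta) = D^{2} a_i^{2}\, P_i(\theta)\,\bigl(1 - P_i(\theta)\bigr), \qquad
\mathrm{SE}(\hat{\theta}) \approx \frac{1}{\sqrt{\sum_{i=1}^{n} I_i(\hat{\theta})}}, \qquad
\hat{\theta} \pm 1.96\,\mathrm{SE}(\hat{\theta})
```

Each item contributes a non-negative amount of information, so at a fixed trait level the sum cannot decrease when an item is added and the approximate interval cannot widen. In practice the estimate itself moves after every response, so the narrowing need not be perfectly smooth.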

Page 18: Potential Problems with CAT in Pain and Health Outcomes Measurement

Context effects
Unbalanced content
Time frame
Response categories
Multidimensionality

Page 19: What kind of short form?

Response options: Rarely or none of the time (less than 1 day); Some or a little of the time (1-2 days); Occasionally or a moderate amount of time (3-4 days); All of the time (5-7 days)
1. I was bothered by things that usually don't bother me

Question 1
0 I do not feel sad.
1 I feel sad
2 I am sad all the time and I can’t snap out of it.
3 I am so sad or unhappy that I can’t stand it.

Are you basically satisfied with your life? True/False

Page 20: MORE Research Still Needed for Effective CAT Implementation

Item production
Item statistics
Item exposure
Maintaining a valid bank of items for test construction
Fairness
Delivery options
Effects of modes of administration
Cost-benefit considerations

Page 21: Infrastructure of a National Geriatric Pain Item Bank

[Figure: diagram of a National “Central” Item Bank. Labels: Individual Researchers, Pharm. Industries, Non-profit Institutions, Government Agencies, Subscriber, Public, IRT Analyses, Item Parameters, Consortium Approval, Collector, Builder, Analyzer, Retriever, Customized Information Retrieval; CAT; (automated) Brief Form.]

Page 22: An Integrated Solution for Pain and Outcomes Assessments

[Figure: PROsIT System architecture. Labels: Field Survey PDA, Field Survey Laptop, Clinic Survey Station, Physician Station, Physician PDA, Survey Archive, Survey Data Warehouse, Subject Profile, Survey Designer, Survey Builder, Survey Collector, Survey Retriever, Survey Analyzer, Service Dispatcher, Device Detector, XML/XSL Parser, Security Manager, Patient, Pharmaceutical Research, University Research, System Administrator, Data Collection, Data Analysis/Mining, System Maintenance.]

Chang, C.-H., & Yang, D. (2003, April 15). Patient-Reported Outcomes Information Technology: The PROsIT™ System. ISPOR CONNECTIONS, 9(2), 5-6.