Generalizability Theory Nothing more practical than a good theory!

Page 1: Generalizability Theory

Generalizability Theory

Nothing more practical than a good theory!

Page 2: Generalizability Theory

This presentation was made by Prof. Zhao.

Page 3: Generalizability Theory

Overview of Presentation

Classes of reliability theories
Generalizability Theory
    G-study
    D-study
Illustrations

Page 4: Generalizability Theory

Three Reliability Theories

Classical Test Theory
Generalizability Theory
Item Response Theory

Page 5: Generalizability Theory

Overview of Presentation

Classes of reliability theories
Generalizability Theory
    G-study
    D-study
Illustrations

Page 6: Generalizability Theory

Generalizability Theory

As in classical test theory, the concept of parallel measures is fundamental, but generalizability theory allows for a multitude of error sources.

Generalizability concept: Reliability depends on the inferences (generalizations) that the investigator wishes to make from the measurement data.

Page 7: Generalizability Theory

Illustration

Essay test
    7 vignette-based essay questions
    2 markers independently marking all questions for all examinees

Reliability in a classical framework:
    Cronbach's alpha: 0.66
    Inter-rater reliability (i.e. kappa): 0.71

Page 8: Generalizability Theory

Fundamental Equation

$$X = T + E$$

X = Observed score
T = True score
E = Error score

$$\text{Reliability} = \frac{\text{Variance of } T}{\text{Variance of } X}$$

The larger the variance of T in relation to the variance of X, the higher the reliability.

Page 10: Generalizability Theory

Fundamental Equation

$$X = T + E$$

X = Observed score
T = True score
E = Error score

$$\text{Reliability} = \frac{\text{Variance of } T}{\text{Variance of } X} = \frac{\text{Var } T}{\text{Var } T + \text{Var } E}$$

Page 11: Generalizability Theory

Multiple sources of error variance

$$\text{Reliability} = \frac{\text{Var } T}{\text{Var } T + \text{Var } E}$$

Sources of Var E: Markers, Essays, Unexplained

Page 12: Generalizability Theory

Two steps in G analysis

1) G(eneralizability)-study: Estimation of the sources of variance that influence the measurement (e.g., variance between examinees, essays and markers)

2) D(ecision)-study: Estimation of reliability indices as a function of concrete sample size(s) (e.g., number of essays, number of markers)

Page 13: Generalizability Theory

G-study steps

Determine facets (factors of variance)

Determine design
    Random vs fixed
    Crossed vs nested

Page 14: Generalizability Theory

Crossed vs nested designs

[Figure: a crossed design, in which A and B are each combined with all of 1 to 6, contrasted with a nested design, in which A to L are nested within 1 to 6]

Page 15: Generalizability Theory

G-study

Determine facets (factors of variance)

Determine design
    Random vs fixed
    Crossed vs nested

Collect data

Analysis of Variance (ANOVA)

Estimation of variance components

Page 16: Generalizability Theory

Illustration 1

Essay test
    7 vignette-based open-ended questions
    100 students
    One marker marked all essays for all students

G-study questions:
    Number of factors/facets? One-facet design
    Random or fixed facets? Random
    Nested or crossed? Crossed

Page 17: Generalizability Theory

Sources of Variance

Person x Items

[Venn diagram of variance sources: p, i, and pi,e]

Page 21: Generalizability Theory

Variance component estimation (one facet design)

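An observed score for a person on an item ($X_{pi}$):

$$
\begin{aligned}
X_{pi} ={}& \mu && \text{[Overall mean]} \\
&+ \mu_p - \mu && \text{[Person effect]} \\
&+ \mu_i - \mu && \text{[Item effect]} \\
&+ X_{pi} - \mu_p - \mu_i + \mu && \text{[Residual]}
\end{aligned}
$$

Each of these effects has a mean (always 0) and a variance ($\sigma^2$). These variances are the variance components.

The variance of all observed scores $X_{pi}$ across all persons and items:

$$\sigma^2(X_{pi}) = \sigma^2_p + \sigma^2_i + \sigma^2_{pi,e}$$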
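The decomposition above can be turned into estimates with a two-way ANOVA. What follows is a minimal sketch, assuming a fully crossed persons x items table with one score per cell and no missing data; the function name and the toy data are hypothetical and not part of the presentation.

```python
def variance_components_pxi(scores):
    """Estimate variance components for a crossed persons x items design.

    `scores` is a list of rows: one row per person, one column per item,
    exactly one observation per cell.
    """
    n_p = len(scores)
    n_i = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_i)
    person_means = [sum(row) / n_i for row in scores]
    item_means = [sum(row[i] for row in scores) / n_p for i in range(n_i)]

    # Mean squares of the two-way ANOVA without replication.
    ms_p = n_i * sum((m - grand) ** 2 for m in person_means) / (n_p - 1)
    ms_i = n_p * sum((m - grand) ** 2 for m in item_means) / (n_i - 1)
    ss_res = sum(
        (scores[p][i] - person_means[p] - item_means[i] + grand) ** 2
        for p in range(n_p)
        for i in range(n_i)
    )
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Solve the expected-mean-square equations for the components.
    return {
        "p": (ms_p - ms_res) / n_i,
        "i": (ms_i - ms_res) / n_p,
        "pi,e": ms_res,
    }


# Hypothetical toy data: 3 persons (rows) by 2 items (columns).
toy_scores = [[1, 2], [3, 4], [5, 9]]
print(variance_components_pxi(toy_scores))
```

With small samples these ANOVA estimates can come out negative; a common convention is to set such estimates to zero.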

Page 22: Generalizability Theory

Variance components

P x I design

Source   Estimated Variance Component   Standard Error   Percentage of Total Variance
p         97.57                          19.02           13.35
i        261.24                         112.98           35.75
pi,e     371.97                          17.60           50.90
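The percentage column follows directly from the estimated components: each component is divided by their sum. A short sketch, using the three P x I estimates from this slide (the function name is an assumption made for illustration):

```python
def percent_of_total(components):
    """Express each variance component as a percentage of their sum."""
    total = sum(components.values())
    return {source: 100 * value / total for source, value in components.items()}


components = {"p": 97.57, "i": 261.24, "pi,e": 371.97}
for source, pct in percent_of_total(components).items():
    print(f"{source:5s} {pct:6.2f}")
```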

Page 23: Generalizability Theory

Crossed vs nested designs

[Figure: a crossed design, in which A and B are each combined with all of 1 to 6, contrasted with a nested design, in which A to L are nested within 1 to 6]

Page 24: Generalizability Theory

Sources of Variance

Items : Persons

[Venn diagram of variance sources: p and i,pi,e]

Page 25: Generalizability Theory

Variance components

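I : P design

Source   Estimated Variance Component   Percentage of Total Variance
p         97.57                         13.35
i,pi,e   633.21                         86.65

In the nested design the i and pi,e components of the crossed design (261.24 and 371.97, i.e. 35.75% and 50.90%) cannot be separated; they combine into the single component i,pi,e (261.24 + 371.97 = 633.21).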

Page 27: Generalizability Theory

Sources of Variance

Person x Items x Judges

[Venn diagram of variance sources: p, i, j, pi, pj, ij, and pij,e]

Page 28: Generalizability Theory

Variance component estimation (two facet design)

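An observed score for a person on an item as rated by a judge ($X_{pij}$):

$$
\begin{aligned}
X_{pij} ={}& \mu && \text{[Overall mean]} \\
&+ \mu_p - \mu && \text{[Person effect]} \\
&+ \mu_i - \mu && \text{[Item effect]} \\
&+ \mu_j - \mu && \text{[Judge effect]} \\
&+ \mu_{pi} - \mu_p - \mu_i + \mu && \text{[Person by item effect]} \\
&+ \mu_{pj} - \mu_p - \mu_j + \mu && \text{[Person by judge effect]} \\
&+ \mu_{ij} - \mu_i - \mu_j + \mu && \text{[Item by judge effect]} \\
&+ X_{pij} - \mu_{pi} - \mu_{pj} - \mu_{ij} + \mu_p + \mu_i + \mu_j - \mu && \text{[Residual]}
\end{aligned}
$$

The variance of observed scores $X_{pij}$ across all persons, items and judges:

$$\sigma^2(X_{pij}) = \sigma^2_p + \sigma^2_i + \sigma^2_j + \sigma^2_{pi} + \sigma^2_{pj} + \sigma^2_{ij} + \sigma^2_{pij,e}$$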
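For a fully crossed random P x I x J design, the seven components can be recovered from the ANOVA mean squares by solving the expected-mean-square equations. The sketch below assumes a balanced design with one observation per cell; the function names and the numeric values are hypothetical and chosen only to show a round trip from components to mean squares and back.

```python
def expected_mean_squares(vc, n_p, n_i, n_j):
    """Expected mean squares implied by a set of variance components."""
    e = vc["pij,e"]
    return {
        "p": e + n_j * vc["pi"] + n_i * vc["pj"] + n_i * n_j * vc["p"],
        "i": e + n_j * vc["pi"] + n_p * vc["ij"] + n_p * n_j * vc["i"],
        "j": e + n_i * vc["pj"] + n_p * vc["ij"] + n_p * n_i * vc["j"],
        "pi": e + n_j * vc["pi"],
        "pj": e + n_i * vc["pj"],
        "ij": e + n_p * vc["ij"],
        "pij": e,
    }


def variance_components_pxixj(ms, n_p, n_i, n_j):
    """Solve the expected-mean-square equations for the seven components."""
    res = ms["pij"]
    return {
        "p": (ms["p"] - ms["pi"] - ms["pj"] + res) / (n_i * n_j),
        "i": (ms["i"] - ms["pi"] - ms["ij"] + res) / (n_p * n_j),
        "j": (ms["j"] - ms["pj"] - ms["ij"] + res) / (n_p * n_i),
        "pi": (ms["pi"] - res) / n_j,
        "pj": (ms["pj"] - res) / n_i,
        "ij": (ms["ij"] - res) / n_p,
        "pij,e": res,
    }


# Hypothetical components and sample sizes, used only for the round trip.
true_vc = {"p": 4.0, "i": 2.0, "j": 1.0, "pi": 3.0, "pj": 0.5, "ij": 1.5, "pij,e": 2.0}
ms = expected_mean_squares(true_vc, n_p=10, n_i=5, n_j=2)
print(variance_components_pxixj(ms, n_p=10, n_i=5, n_j=2))
```

In practice the mean squares come from the ANOVA of the observed scores rather than from known components.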

Page 29: Generalizability Theory

Variance components

P x I x J design

Source   Estimated Variance Component   Percentage of Total Variance
p         48.71                         10.57
i         25.12                          5.45
j         15.00                          3.26
pi       185.87                         40.33
pj        33.18                          7.20
ij        80.00                         17.36
pij,e     72.94                         15.83

Page 30: Generalizability Theory

Overview of Presentation

Classes of reliability theories
Generalizability Theory
    G-study
    D-study
Illustrations

Page 31: Generalizability Theory

Two steps in G analysis

1) G(eneralizability)-study: Estimation of the sources of variance that influence the measurement (e.g., variance between examinees, essays and markers)

2) D(ecision)-study: Estimation of reliability indices as a function of concrete sample size(s) (e.g., number of essays, number of markers)

Page 32: Generalizability Theory

Interpretation of scores

Norm-oriented perspective: Scores have relative meaning; they have meaning in relation to each other.

Domain-oriented perspective: Scores have absolute meaning in relation to the domain of measurement.

Mastery-oriented perspective: Scores have meaning in relation to a cut-off score (reliability of decisions, not of scores).

Page 33: Generalizability Theory

Fundamental Equation

$$X = T + E$$

X = Observed score
T = True score
E = Error score

$$\text{Reliability} = \frac{\text{Variance of } T}{\text{Variance of } X} = \frac{\text{Var } T}{\text{Var } T + \text{Var } E}$$

Page 34: Generalizability Theory

Illustration 1

Essay test
    7 vignette-based essay questions
    1 marker marked all questions for all examinees
    Norm-referenced perspective

Calculate the generalizability coefficient!

Page 35: Generalizability Theory

D-study (ni = 7; norm-referenced)

Source   Estimated Variance Component   Standard Error   Percentage of Total Variance
p         97.57                          19.02           13.35
i        261.24                         112.98           35.75
pi,e     371.97                          17.60           50.90

$$G = \frac{T}{T + E} = \frac{97.57}{97.57 + 371.97/7} = 0.65$$
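As a quick check of the arithmetic, the calculation can be written as a small function. This is only a sketch: the function name is an assumption, and it covers the one-facet P x I design under a norm-referenced interpretation, where only the pi,e component contributes to error.

```python
def g_coefficient(var_p, var_pi_e, n_items):
    """Generalizability coefficient for a one-facet P x I design."""
    relative_error = var_pi_e / n_items  # error variance of a mean over n_items
    return var_p / (var_p + relative_error)


print(round(g_coefficient(97.57, 371.97, 7), 2))  # 0.65
```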

Page 36: Generalizability Theory

Illustration 2

Essay test
    7 vignette-based essay questions
    1 marker marked all questions for all examinees
    Domain-referenced perspective

Calculate the dependability coefficient!

Page 37: Generalizability Theory

D-study (ni = 7; domain referenced)

Source   Estimated Variance Component   Standard Error   Percentage of Total Variance
p         97.57                          19.02           13.35
i        261.24                         112.98           35.75
pi,e     371.97                          17.60           50.90

$$D = \frac{97.57}{97.57 + 261.24/7 + 371.97/7} = 0.52$$
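Under the domain-referenced interpretation the item component also counts as error. A minimal sketch of that calculation (the function name is an assumption made for illustration):

```python
def dependability(var_p, var_i, var_pi_e, n_items):
    """Dependability coefficient for a one-facet P x I design."""
    # Absolute error: item difficulty differences count as error too.
    absolute_error = var_i / n_items + var_pi_e / n_items
    return var_p / (var_p + absolute_error)


print(round(dependability(97.57, 261.24, 371.97, 7), 2))  # 0.52
```

Because the absolute error variance contains everything in the relative error variance plus the item component, D can never exceed G for the same design.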

Page 38: Generalizability Theory

Illustration 3

Essay test
    7 vignette-based essay questions
    1 marker marked all questions for all examinees
    Domain-referenced perspective

Calculate the dependability coefficient for a sample of 10 essays!

Page 39: Generalizability Theory

D-study (ni = 10; domain referenced)

Source   Estimated Variance Component   Standard Error   Percentage of Total Variance
p         97.57                          19.02           13.35
i        261.24                         112.98           35.75
pi,e     371.97                          17.60           50.90

$$D = \frac{97.57}{97.57 + 261.24/10 + 371.97/10} = 0.61$$

Page 40: Generalizability Theory

D-studies for several item samples

N Essays   Generalizability Coefficient (G)   Dependability Coefficient (D)
 1         0.21                               0.13
 5         0.57                               0.44
 7         0.65                               0.52
10         0.72                               0.61
15         0.80                               0.70
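A D-study of this kind is easy to script. The sketch below recomputes both coefficients for each test length from the P x I components (97.57, 261.24, 371.97) and adds a hypothetical helper that searches for the shortest test reaching a target G; the helper and its names are illustrative assumptions, not part of the presentation.

```python
VAR_P, VAR_I, VAR_PI_E = 97.57, 261.24, 371.97  # components from the P x I G-study


def g_and_d(n_items):
    """Return (G, D) for a test of n_items essays scored by one marker."""
    relative_error = VAR_PI_E / n_items
    absolute_error = (VAR_I + VAR_PI_E) / n_items
    return VAR_P / (VAR_P + relative_error), VAR_P / (VAR_P + absolute_error)


def min_items_for_g(target, max_items=1000):
    """Smallest number of essays whose G coefficient reaches `target`."""
    for n in range(1, max_items + 1):
        if g_and_d(n)[0] >= target:
            return n
    return None  # target not reachable within max_items


for n in (1, 5, 7, 10, 15):
    g, d = g_and_d(n)
    print(f"{n:2d} essays: G = {g:.2f}, D = {d:.2f}")

print(min_items_for_g(0.70))  # 9
```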

Page 41: Generalizability Theory

Illustration 4

Essay test
    7 vignette-based essay questions
    2 markers independently marked all questions for all examinees
    Norm-referenced perspective

Calculate the generalizability coefficient!

Page 42: Generalizability Theory

D-study (ni=7; nj=2; norm referenced)

Source   Variance Component   % of Total Variance
p         48.71               10.57
i         25.12                5.45
j         15.00                3.26
pi       185.87               40.33
pj        33.18                7.20
ij        80.00               17.36
pij,e     72.94               15.83

$$G = \frac{48.71}{48.71 + 185.87/7 + 33.18/2 + 72.94/(2 \times 7)} = 0.50$$
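In the two-facet crossed design, every component that involves the person (pi, pj and pij,e) contributes to relative error, each divided by the number of conditions it is averaged over. A minimal sketch, with a function name chosen here for illustration:

```python
def g_coefficient_pxixj(vc, n_items, n_judges):
    """Generalizability coefficient for a crossed P x I x J design."""
    relative_error = (
        vc["pi"] / n_items
        + vc["pj"] / n_judges
        + vc["pij,e"] / (n_items * n_judges)
    )
    return vc["p"] / (vc["p"] + relative_error)


vc = {"p": 48.71, "pi": 185.87, "pj": 33.18, "pij,e": 72.94}
print(round(g_coefficient_pxixj(vc, n_items=7, n_judges=2), 2))  # 0.5
```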

Page 43: Generalizability Theory

Illustration 5

Essay test
    7 vignette-based essay questions
    2 markers independently marked all questions for all examinees
    Domain-referenced perspective

Calculate the dependability coefficient!

Page 44: Generalizability Theory

D-study (ni=7; nj=2; domain referenced)

Source   Variance Component   % of Total Variance
p         48.71               10.57
i         25.12                5.45
j         15.00                3.26
pi       185.87               40.33
pj        33.18                7.20
ij        80.00               17.36
pij,e     72.94               15.83

$$D = \frac{48.71}{48.71 + 25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14} = 0.43$$
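For the dependability coefficient, all six non-person components enter the error term. Each is divided by the number of items (7), the number of judges (2), or their product (14), depending on which facets it involves. A sketch under those assumptions (the function name is illustrative):

```python
def dependability_pxixj(vc, n_items, n_judges):
    """Dependability coefficient for a crossed P x I x J design."""
    n_both = n_items * n_judges
    absolute_error = (
        vc["i"] / n_items
        + vc["j"] / n_judges
        + vc["pi"] / n_items
        + vc["pj"] / n_judges
        + vc["ij"] / n_both
        + vc["pij,e"] / n_both
    )
    return vc["p"] / (vc["p"] + absolute_error)


vc = {"p": 48.71, "i": 25.12, "j": 15.00, "pi": 185.87,
      "pj": 33.18, "ij": 80.00, "pij,e": 72.94}
print(round(dependability_pxixj(vc, n_items=7, n_judges=2), 2))  # 0.43
```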

Page 45: Generalizability Theory

Illustration 6

Essay test
    7 vignette-based essay questions
    2 different markers independently marked each question for all examinees
    Norm-referenced perspective

Calculate the generalizability coefficient!

Page 46: Generalizability Theory

D-study (ni=7; nj=2; norm referenced)

(Judges : Items) x Persons

Source     Estimated Variance Component   Percentage of Total Variance
p           48.71                         10.57
i           25.12                          5.45
j,ij        95.00                         20.62
pi         185.87                         40.33
pj,pij,e   106.12                         23.03

$$G = \frac{48.71}{48.71 + 185.87/7 + 106.12/(2 \times 7)} \approx 0.59$$

Page 47: Generalizability Theory

D-study summary table

Norm-referenced score interpretation

                   Same marker for all essays   Different marker for each essay
Number of essays   One marker    Two markers    One marker    Two markers
 5                 0.36          0.44           0.39          0.46
 7                 0.41          0.50           0.47          0.54
10                 0.45          0.56           0.56          0.63
15                 0.49          0.61           0.65          0.72
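The coefficients for the design in which the same markers score all essays can be recomputed from the crossed P x I x J components (p = 48.71, pi = 185.87, pj = 33.18, pij,e = 72.94). The sketch below covers only that fully crossed case; the names are illustrative assumptions, and the values for a different marker per essay are not reproduced here.

```python
VC = {"p": 48.71, "pi": 185.87, "pj": 33.18, "pij,e": 72.94}


def g_crossed(n_essays, n_markers):
    """G coefficient when the same markers score every essay."""
    relative_error = (
        VC["pi"] / n_essays
        + VC["pj"] / n_markers
        + VC["pij,e"] / (n_essays * n_markers)
    )
    return VC["p"] / (VC["p"] + relative_error)


print("essays  one marker  two markers")
for n in (5, 7, 10, 15):
    print(f"{n:6d}  {g_crossed(n, 1):10.2f}  {g_crossed(n, 2):11.2f}")
```

Adding a second marker helps more than adding a few essays only when the pj component is large relative to pi; here pi dominates, so test length matters most.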

Page 48: Generalizability Theory

Another reliability index

Reliability coefficient (G & D coefficients)
    Scale independent (0-1)
    Non-intuitive interpretation

Standard Error of Measurement (SEM)
    Intuitive interpretation
    Scale dependent

Page 49: Generalizability Theory

Standard Error of Measurement

$$X = T + E$$

X = Observed score
T = True score
E = Error score

$$\text{Reliability index} = \frac{\text{Variance } T}{\text{Variance } T + \text{Variance } E}$$

$$\text{Standard Error of Measurement (SEM)} = \sqrt{\text{Variance } E}$$

Page 50: Generalizability Theory

Interpretation of SEM

Suppose an examinee has a score of 60% and the SEM is 5:

[Figure: score axis from 45 to 75, centered on 60, showing three confidence intervals: one labelled 65% CI; a 95% CI with half-width 1.96 x 5 ≈ 10; and a 95% CI with half-width 2.14 x 5 ≈ 11]
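The interval for the 1.96 multiplier can be computed as follows. This is a minimal sketch assuming normally distributed measurement error; the function name and the default multiplier are illustrative choices.

```python
def confidence_interval(score, sem, multiplier=1.96):
    """Interval of score +/- multiplier x SEM."""
    half_width = multiplier * sem
    return score - half_width, score + half_width


low, high = confidence_interval(60, 5)
print(f"{low:.1f} to {high:.1f}")  # 50.2 to 69.8
```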

Page 51: Generalizability Theory

D-study (ni = 7; norm referenced)

Source   Estimated Variance Component   Standard Error   Percentage of Total Variance
p         97.57                          19.02           13.35
i        261.24                         112.98           35.75
pi,e     371.97                          17.60           50.90

$$G = \frac{97.57}{97.57 + 371.97/7} = 0.65$$

$$\text{SEM} = \sqrt{371.97/7} = 7.29$$
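The SEM is the square root of the error variance that appears in the denominator of the coefficient. A short sketch for the norm-referenced one-facet case (the function name is an assumption):

```python
import math


def sem_relative(var_pi_e, n_items):
    """Standard error of measurement from the relative error variance."""
    return math.sqrt(var_pi_e / n_items)


print(round(sem_relative(371.97, 7), 2))  # 7.29
```

Unlike G, the SEM is expressed on the score scale, so it can be used directly to build an interval around an observed score.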

Page 52: Generalizability Theory

D-study (ni=7; nj=2; domain referenced)

Source   Variance Component   % of Total Variance
p         48.71               10.57
i         25.12                5.45
j         15.00                3.26
pi       185.87               40.33
pj        33.18                7.20
ij        80.00               17.36
pij,e     72.94               15.83

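$$D = \frac{48.71}{48.71 + 25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14} = 0.43$$

$$\text{SEM} = \sqrt{25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14} = \sqrt{65.16} \approx 8.07$$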

Page 53: Generalizability Theory

Overview of Presentation

Classes of reliability theories
Generalizability Theory
    G-study
    D-study
Illustrations

Page 54: Generalizability Theory

Scenario CEX

A clinical mini exercise (CEX) was developed in which examinees are periodically observed and rated on a rating form. An investigator analyzed a data set from 88 residents who were each observed on 4 occasions, each time by a single, different examiner (cf. Norcini JJ, Blank LL, Arnold GK, Kimball HR. The mini-CEX (Clinical Evaluation Exercise): A preliminary investigation. Annals of Internal Medicine 1995;123:795-799).

Variance components: p; o:p (o, op and e confounded)

$$G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{o:p}/4} = D$$

Page 55: Generalizability Theory

Scenario OSCE I

An OSCE consisting of 15 stations was administered to 100 final-year students. Each station was scored by two independent examiners on a case-specific checklist. Different examiners were used in each station.

Variance components: p, s, j:s, ps, pj:s

$$G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{ps}/15 + \sigma^2_{pj:s}/(2 \times 15)}$$

Page 56: Generalizability Theory

Scenario OSCE II

An experimental OSCE was administered to 20 residents. Each resident was tested on a different day. For each resident, 3 stations were organized, consisting of real patients who were available that day. Two examiners observed all residents in all stations and completed a generic rating scale.

Variance components: p, s:p, j, ps:s, pj

$$D = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{s:p}/3 + \sigma^2_j/2 + \sigma^2_{ps:s}/3 + \sigma^2_{pj}/6}$$

Page 57: Generalizability Theory

Scenario Clerkship Evaluation

An investigator wishes to evaluate the teaching quality of 10 clinical clerkships. She developed a questionnaire with 30 items on various quality aspects. The questionnaire was completed by 50 students in each clerkship.

Variance components: c, i, s:c, ci, cs:i

$$G = \frac{\sigma^2_c}{\sigma^2_c + \sigma^2_{s:c}/50 + \sigma^2_{ci}/30 + \sigma^2_{cs:i}/(50 \times 30)}$$

PS: It is doubtful that i is a random facet; i could be treated as fixed or ignored.

Page 58: Generalizability Theory

Further reading & software

Literature

Cronbach LJ, Gleser GC, Nanda H, Rajaratnam N. The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley, 1972. Original monograph on generalizability theory. Complete, but hardly accessible for any reader.

Brennan RL. Elements of Generalizability Theory. Iowa: ACT Publications, 1983. This is the resource book for most specialists. Not easy for readers without statistical training.

Shavelson RJ, Webb NM. Generalizability theory: A primer. Newbury Park, CA: Sage Publications, 1991. Good and accessible introduction to generalizability theory for any reader.

Software

GENOVA: Conducts G and D studies and provides ample statistical information. Operates on any PC. The program is relatively old and not user friendly. Available from Dr. J. Crick, National Board of Medical Examiners, 3750 Market Street, Philadelphia, PA 19104-3190, USA.

SPSS: SPSS General Linear Models, subprogram Variance Components, estimates variance components (also for unbalanced designs). D-studies need to be done manually.