Slide 1 Estimating Performance Below the National Level Applying Simulation Methods to TIMSS Fourth Annual IES Research Conference Dan Sherman, Ph.D. American Institutes for Research June 8, 2009


Slide 1

Estimating Performance Below the National Level

Applying Simulation Methods to TIMSS

Fourth Annual IES Research Conference

Dan Sherman, Ph.D.
American Institutes for Research
June 8, 2009


Slide 2

Current TIMSS Not Appropriate For Direct State-level Estimates

Sample is designed to produce national rather than state estimates

Small number of schools sampled in about 40 states in 2007
– 234 public schools with 4th grade scores
  • CA has 30+ schools; FL, TX, and NY have 10+ schools
– 207 public schools with 8th grade scores
  • CA has 20+ schools; TX, NY, and MI have 10+ schools

Large variations in school means within states make direct estimates sensitive to choice of schools


Slide 3

One Alternative to Direct Estimation for States Is to Use Regression Model Approach

Relate individual school-level mean score to variables observed for school, district, and state
– Regression can take account of data structure (e.g., clustering of schools) in estimation

Use regression coefficients to create expected score for schools outside TIMSS sample

“Add up” expected scores across schools (weighted by number of students within state)
– Provides estimate of mean score for state

Variance estimation is analytically complex; here handled by simulation
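The first two steps of the approach (fit on sampled schools, then predict for schools outside the sample) can be sketched as follows. This is a minimal illustration, not the presenter's model: the data are synthetic, the two covariates and their coefficients are made up, and plain least squares is used, so it ignores the clustering of schools mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical universe of 200 schools with an intercept and two made-up covariates
# (think of them as a poverty share and a state-level score; both are assumptions).
n_all, n_sampled = 200, 40
X_all = np.column_stack([
    np.ones(n_all),
    rng.uniform(0, 1, n_all),
    rng.normal(0, 1, n_all),
])
true_beta = np.array([540.0, -60.0, 8.0])  # invented coefficients for the synthetic data
score_all = X_all @ true_beta + rng.normal(0, 20, n_all)

# Only the sampled schools have an observed mean score.
sampled = rng.choice(n_all, size=n_sampled, replace=False)

# Ordinary least squares on the sampled schools (no clustering adjustment).
beta_hat, *_ = np.linalg.lstsq(X_all[sampled], score_all[sampled], rcond=None)

# Expected score for every school in the universe, sampled or not.
expected_all = X_all @ beta_hat
```

Because the covariates are available for every school, the fitted coefficients can be applied to schools that were never tested.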


Slide 4

Overview of Regression Model

Model begins with Common Core of Data (CCD) variables on right-hand side
– Share of students by gender, race, poverty status; also school size and location indicator (i.e., urban, rural)
– Can apply these variables to all public schools in sample

Then add average 2007 NAEP state math score for grade
– Helps measure whether school is in a high- or low-performing state, relative to expectation of CCD variables

Also add in relative performance of school on state math test (percent proficient in school in SD units relative to mean percent proficient in state)
– Helps adjust for position of school in state compared to other schools
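One way to build the relative-position variable is a within-state z-score of percent proficient. The sketch below assumes an unweighted mean and a population standard deviation across schools in each state; the slides do not say which weighting was used, and the function name and the state codes "AA" and "BB" are hypothetical.

```python
import numpy as np

def relative_position(pct_proficient, state):
    """Express each school's percent proficient in SD units from its own state's mean."""
    pct = np.asarray(pct_proficient, dtype=float)
    state = np.asarray(state)
    z = np.empty_like(pct)
    for s in np.unique(state):
        mask = state == s
        sd = pct[mask].std(ddof=0)  # population SD across schools (an assumption)
        # A state whose schools all share one value gets a relative position of zero.
        z[mask] = (pct[mask] - pct[mask].mean()) / sd if sd > 0 else 0.0
    return z

# Made-up percent-proficient values for five schools in two hypothetical states.
pct = [40.0, 60.0, 80.0, 70.0, 90.0]
st = ["AA", "AA", "AA", "BB", "BB"]
z = relative_position(pct, st)
```

Standardizing within state makes the variable comparable across states whose tests and proficiency cut points differ.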


Slide 5

Goodness of Fit (Adjusted R2) for Alternative Specifications

Model            Demographics Only  Locality / School Size Added  NAEP Score Added  Relative Position on State Test Added
Grade 4 Math     0.640              0.665                         0.669             0.741
Grade 4 Science  0.688              0.702                         0.708             0.771
Grade 8 Math     0.505              0.522                         0.562             0.585
Grade 8 Science  0.589              0.601                         0.638             0.653
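For reference, adjusted R2 penalizes the raw R2 for the number of predictors. The sketch below uses the standard formula; the raw R2 of 0.75 and the count of 10 predictors are illustrative assumptions, and only the 234 schools figure comes from the slides.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical inputs: raw R^2 and predictor count are not reported in the slides.
value = adjusted_r2(0.75, n_obs=234, n_predictors=10)
```

With a few hundred schools and a modest number of predictors, the adjustment is small, so gains across columns mostly reflect real added explanatory power.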


Slide 6

Summary of Alternative Regression Specifications

Models explain significant portion of variation in TIMSS (much more than PISA)

Demographics are (collectively) most significant group of variables
– Poverty status is single best predictor variable
– State NAEP score related to school mean, as is (strongly) position in state on test

Key is to predict large share of variance in mean scores of sampled schools
– Larger R2 is desirable to closely track school mean estimates; otherwise will have large variance when school estimates aggregated to states


Slide 7

Validation of School-level Model with (Three) Individual States

Can validate outside the model by predicting mean scores in 2 states with larger TIMSS samples (MA and MN, with approximately 50 schools per grade per state)
– California design part of national sample

Fit (R2) between actual means and those predicted by model similar to overall model:

Grade/Subject    Massachusetts  Minnesota  California
Grade 4 Math     0.701          0.551      0.789
Grade 4 Science  0.574          0.545      0.817
Grade 8 Math     0.742          0.551      0.506
Grade 8 Science  0.547          0.533      0.607

Obviously would like to validate/examine model fit to schools in other states


Slide 8

Illustration of Model Fit to Individual School Data – High R2

[Figure: Massachusetts, 4th Grade Math. Comparison of Predicted to Actual Mean School Scores. Horizontal axis: TIMSS Score (450 to 650); vertical axis: Predicted Score (450 to 600). R-squared = 0.70]


Slide 9

Illustration of Model Fit to Individual School Data - Lower R2

[Figure: Massachusetts, 8th Grade Math. Comparison of Predicted to Actual Mean School Scores. Horizontal axis: TIMSS Score (450 to 650); vertical axis: Predicted Score (450 to 600). R-squared = 0.55]


Slide 10

Approach to Computing State-Level Means

Apply regression estimated from TIMSS sample to create point estimate of mean for each school in CCD

Weight school estimates up to state-level mean using CCD number of students in grade

Should be unbiased in terms of expectation across repeated samples of schools
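The weighting step amounts to an enrollment-weighted average of expected school scores within each state. A minimal sketch, with made-up scores, enrollments, and hypothetical state codes:

```python
import numpy as np

def state_means(expected_score, enrollment, state):
    """Enrollment-weighted mean of expected school scores, keyed by state."""
    expected_score = np.asarray(expected_score, dtype=float)
    enrollment = np.asarray(enrollment, dtype=float)
    state = np.asarray(state)
    return {
        str(s): float(np.average(expected_score[state == s],
                                 weights=enrollment[state == s]))
        for s in np.unique(state)
    }

# Three hypothetical schools: two in state "AA", one in state "BB".
means = state_means([500, 560, 520], [100, 300, 50], ["AA", "AA", "BB"])
```

In state "AA" the larger school pulls the mean toward its score: (500 x 100 + 560 x 300) / 400 = 545.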


Slide 11

Sources of Variance in Estimates of State-level Means

1. Mix and characteristics of schools in TIMSS sample used in given regression
– Different sample will produce different coefficients

2. Individual school estimate has prediction error around its expectation from regression
– Standard error of regression: reduces with R2 and increases with distance from sample mean
– Mean square error of regressions is about 20–25 points

3. Measurement error around mean (relatively small, about 5 points) for sampled schools
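The second source can be illustrated with the textbook prediction standard error for least squares, s * sqrt(1 + x0' (X'X)^-1 x0), which grows as a school's covariates move away from the sample mean. The sketch below uses one made-up covariate and a residual SD of 22 points, chosen only because it falls in the 20–25 point range quoted above.

```python
import numpy as np

def prediction_se(X, residual_sd, x0):
    """Standard error of an OLS prediction for a new school with covariate row x0."""
    XtX_inv = np.linalg.inv(X.T @ X)
    leverage = float(x0 @ XtX_inv @ x0)  # larger when x0 is far from the sample mean
    return residual_sd * np.sqrt(1.0 + leverage)

# Five hypothetical sampled schools with a single covariate centered on 0.5.
x = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
X = np.column_stack([np.ones_like(x), x])

near = prediction_se(X, 22.0, np.array([1.0, 0.5]))  # school at the sample mean
far = prediction_se(X, 22.0, np.array([1.0, 1.5]))   # school well outside the sample range
```

Schools that look unlike the sampled schools get wider prediction error, which is one reason states with unusual school populations end up with larger standard errors.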


Slide 12

Estimating Variance in State-Level Means

1. Take coefficients from regression model using 2007 TIMSS sample as given
2. Predict expected mean score for all regular public schools in CCD and add sources of error (i.e., random draws from distributions) for individual schools
3. Apply TIMSS sampling methodology (or any other!) and draw sample of schools
4. Estimate regression for this sample and compute coefficients
5. Compute state means from coefficient sample
6. REPEAT procedure drawing different samples to compute state-level means
7. Summarize means and standard deviation of estimated means (i.e., SEs) by state
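The seven steps can be sketched as a small Monte Carlo loop. Everything below is a stand-in: the school universe is synthetic rather than the CCD, the coefficients are invented, errors are drawn from normal distributions, and simple random sampling replaces the actual TIMSS design.

```python
import numpy as np

rng = np.random.default_rng(2009)

# Hypothetical universe standing in for the CCD (all values are made up).
n_schools, n_states = 600, 6
state = rng.integers(0, n_states, n_schools)
enroll = rng.integers(30, 200, n_schools).astype(float)
X = np.column_stack([np.ones(n_schools), rng.uniform(0, 1, n_schools)])

beta_given = np.array([560.0, -70.0])  # step 1: coefficients taken as given (invented here)
resid_sd, meas_sd = 22.0, 5.0          # illustrative error sizes, in score points

def weighted_state_means(pred):
    """Enrollment-weighted mean of school predictions for each state."""
    return np.array([np.average(pred[state == s], weights=enroll[state == s])
                     for s in range(n_states)])

def one_replicate(sample_size=120):
    # Step 2: expected score plus random draws for prediction and measurement error.
    simulated = (X @ beta_given
                 + rng.normal(0, resid_sd, n_schools)
                 + rng.normal(0, meas_sd, n_schools))
    # Step 3: draw a sample of schools (simple random sampling, an assumption).
    idx = rng.choice(n_schools, size=sample_size, replace=False)
    # Step 4: re-estimate the regression on the drawn sample.
    beta_hat, *_ = np.linalg.lstsq(X[idx], simulated[idx], rcond=None)
    # Step 5: state means implied by this replicate's coefficients.
    return weighted_state_means(X @ beta_hat)

# Step 6: repeat with different samples. Step 7: summarize by state.
reps = np.array([one_replicate() for _ in range(500)])
state_mean = reps.mean(axis=0)
state_se = reps.std(axis=0, ddof=1)
```

Swapping the sampling line for a different design (more schools, oversampling of particular school types) is how the same loop could be used to compare sampling schemes.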


Slide 13

Illustration of National Results (4th Grade Math Scores)


Slide 14

Summary of National Means and Standard Errors

Regression/ simulation provides national estimates similar to direct estimates

Can be “apportioned” to states by pulling out observations for individual states

Grade/Subject    Model Estimate Mean (SE)  Direct Estimate Mean (SE)
Grade 4 Math     528 (3.4)                 527 (2.6)
Grade 4 Science  534 (1.8)                 536 (2.9)
Grade 8 Math     508 (3.5)                 506 (3.0)
Grade 8 Science  520 (3.4)                 517 (3.0)


Slide 15

[Figure: 4th Grade Mathematics. Estimated TIMSS State Means and Confidence Intervals. Horizontal axis: Mean and 95 percent Confidence Interval (400 to 650); points labeled by state abbreviation, including DC]


Slide 16

Comparison of Model Results to Direct State-Level Estimates

Grade/Subject    Model Estimate Mean (SE)  Direct Estimate Mean (SE)

State = MA
Grade 4 Math     557 (7.7)                 572 (3.5)
Grade 4 Science  562 (4.5)                 571 (4.3)
Grade 8 Math     549 (8.0)                 547 (4.6)
Grade 8 Science  568 (7.7)                 556 (4.6)

State = MN
Grade 4 Math     555 (5.5)                 554 (5.9)
Grade 4 Science  564 (3.4)                 551 (6.1)
Grade 8 Math     549 (6.2)                 532 (4.2)
Grade 8 Science  566 (5.9)                 539 (4.8)
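A rough way to read these pairs is to scale each gap by the combined standard error. The sketch below treats the model and direct estimates as independent, which is an assumption the slides do not state, so the resulting ratios are only indicative.

```python
import math

def gap_in_se_units(model_mean, model_se, direct_mean, direct_se):
    """Difference between two estimates divided by the SE of the difference.

    Assumes the two estimates are independent (an assumption, not from the slides).
    """
    return (model_mean - direct_mean) / math.sqrt(model_se ** 2 + direct_se ** 2)

# Values taken from the table above.
z_ma_g4_math = gap_in_se_units(557, 7.7, 572, 3.5)
z_mn_g8_science = gap_in_se_units(566, 5.9, 539, 4.8)
```

Under that assumption the Massachusetts Grade 4 Math gap is under two combined standard errors (about -1.8), while the Minnesota Grade 8 Science gap is larger (about 3.5).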


Slide 17

Some Observations on State-Level Estimates

4th grade math presented as illustration; high correlation in state estimates across grade and subject (r = 0.90 to 0.98)

Median standard errors across states are 6.1 for 4th grade math, 3.3 for 4th grade science, 5.6 for 8th grade math, and 5.4 for 8th grade science

Large standard errors in HI, AK, and DC estimates reflect large population shares of minorities multiplied through by variance of coefficients of associated variables


Slide 18

Potential Extensions

Current estimates are regression-based only and do not directly combine with existing state-level samples
– Composite estimator would weight the two by relative variance, though current sample doesn’t support these estimators well with small state samples

Can validate model against two states (MA and MN) but would be useful to have other states with sufficient samples
– Could work with 1999 data that had 13 state samples

Regression / Simulation approach could be used to assess alternative sampling schemes or models, given regression-based assumptions of scores at school level
– Could draw larger samples to assess precision of direct estimates and compute composite estimates
– Could draw from more high-minority schools to reduce standard errors for some states
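A composite estimator that weights the two estimates by relative variance could take the inverse-variance form sketched below. This is one common choice rather than a method described in the slides, and it assumes the model and direct estimates are independent. The inputs are the Massachusetts Grade 4 Math values reported earlier: model 557 (7.7) and direct 572 (3.5).

```python
import math

def composite_estimate(model_mean, model_se, direct_mean, direct_se):
    """Inverse-variance weighted combination of a model and a direct estimate.

    Assumes the two estimates are independent (an assumption, not from the slides).
    """
    w_model = 1.0 / model_se ** 2
    w_direct = 1.0 / direct_se ** 2
    mean = (w_model * model_mean + w_direct * direct_mean) / (w_model + w_direct)
    se = math.sqrt(1.0 / (w_model + w_direct))
    return mean, se

est, se = composite_estimate(557, 7.7, 572, 3.5)
```

The composite lands closer to the more precise direct estimate (about 569.4), with a standard error (about 3.2) below either input. With only a handful of sampled schools in a state, the direct estimate's variance is itself poorly known, which is the limitation noted above.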


Slide 19

Conclusions

Regression model works well with current TIMSS given relatively high explanatory power of regression models

Direct estimates for most states would not be reliable/ credible with current sampling

Sample could be potentially modified to provide direct estimates for states and “borrow strength” from regression and require smaller state samples
– Simulation can help evaluate sampling schemes a priori for complex estimators