Slide 1
Estimating Performance Below the National Level
Applying Simulation Methods to TIMSS
Fourth Annual IES Research Conference
Dan Sherman, Ph.D.
American Institutes for Research
June 8, 2009
Slide 2
Current TIMSS Not Appropriate For Direct State-level Estimates
Sample is designed to produce national rather than state estimates
Small number of schools sampled in about 40 states in 2007
– 234 public schools with 4th grade scores
• CA has 30+ schools; FL, TX, and NY have 10+ schools
– 207 public schools with 8th grade scores
• CA has 20+ schools; TX, NY, and MI have 10+ schools
Large variations in school means within states make direct estimates sensitive to choice of schools
Slide 3
One Alternative to Direct Estimation for States Is to Use Regression Model Approach
Relate individual school-level mean score to variables observed for school, district, and state
– Regression can take account of data structure (e.g., clustering of schools) in estimation
Use regression coefficients to create expected score for schools outside TIMSS sample
“Add up” expected scores across schools (weighted by number of students within state)
– Provides estimate of mean score for state
Variance estimation is analytically complex; here handled by simulation
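The regression step on this slide can be illustrated with a minimal sketch. It assumes plain ordinary least squares on made-up data; the slides note that the actual regression can account for clustering of schools, which this sketch does not. The function names `fit_ols` and `predict`, the two covariates, and all numbers are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X (no clustering adjustment)."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def predict(beta, X):
    """Expected school mean score for each row of X."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ beta

# Hypothetical toy data: two covariates standing in for CCD-style shares.
rng = np.random.default_rng(0)
X_sampled = rng.uniform(0, 1, size=(200, 2))
y_sampled = (560 - 60 * X_sampled[:, 0] - 20 * X_sampled[:, 1]
             + rng.normal(0, 20, size=200))

beta = fit_ols(X_sampled, y_sampled)

# Schools outside the sample get an expected score from the fitted coefficients.
X_unsampled = rng.uniform(0, 1, size=(1000, 2))
expected = predict(beta, X_unsampled)
```

The same coefficients are applied to every school with observed covariates, which is what allows schools outside the TIMSS sample to contribute to a state estimate.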
Slide 4
Overview of Regression Model
Model begins with Common Core of Data (CCD) variables on right hand side
– Share of students by gender, race, poverty status; also school size and location indicator (i.e., urban, rural)
– Can apply these variables to all public schools in sample
Then add average 2007 NAEP state math score for grade
– Helps measure whether school in high or low performing state, relative to expectation of CCD variables
Also add in relative performance of school on state math test (percent proficient in school in SD units relative to mean percent proficient in state)
– Helps adjust for position of school in state compared to other schools
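As an illustration of the relative-position variable, the sketch below standardizes each school's percent proficient within its own state. It assumes an unweighted mean and standard deviation across schools in the state; the slides do not say whether the original calculation was weighted by enrollment. The function name `relative_position` and the example values are hypothetical.

```python
import numpy as np

def relative_position(pct_proficient, state_ids):
    """Percent proficient expressed in within-state SD units (z-scores)."""
    pct = np.asarray(pct_proficient, dtype=float)
    states = np.asarray(state_ids)
    z = np.zeros_like(pct)
    for s in np.unique(states):
        mask = states == s
        mean = pct[mask].mean()
        sd = pct[mask].std()  # assumption: unweighted population SD across schools
        if sd > 0:
            z[mask] = (pct[mask] - mean) / sd
    return z

# Hypothetical example: three schools in one state, two in another.
z = relative_position([40, 60, 80, 50, 50], ["MA", "MA", "MA", "MN", "MN"])
```

Standardizing within state puts schools on a common scale even though each state's test and proficiency cut point differ.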
Slide 5
Goodness of Fit (Adjusted R2) for Alternative Specifications
Model             Demographics Only   Locality / School Size Added   NAEP Score Added   Relative Position on State Test Added
Grade 4 Math      0.640               0.665                          0.669              0.741
Grade 4 Science   0.688               0.702                          0.708              0.771
Grade 8 Math      0.505               0.522                          0.562              0.585
Grade 8 Science   0.589               0.601                          0.638              0.653
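For reference, adjusted R2 penalizes the ordinary R2 for the number of predictors, which is why it is the natural statistic for comparing specifications that add variables. A small sketch of the standard formula follows; the function name and the example inputs are hypothetical and are not the values behind the table.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors (plus intercept) and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical inputs: the raw R2 and predictor count are not given in the slides.
example = adjusted_r2(0.75, n=234, k=10)
```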
Slide 6
Summary of Alternative Regression Specifications
Models explain significant portion of variation in TIMSS (much more than PISA)
Demographics are (collectively) most significant group of variables
– Poverty status is single best predictor variable
– State NAEP score related to school mean, as is (strongly) position in state on test
Key is to predict large share of variance in mean scores of sampled schools
– Larger R2 is desirable to closely track school mean estimates; otherwise will have large variance when school estimates aggregated to states
Slide 7
Validation of School-level Model with (Three) Individual States
Can validate outside the model by predicting mean scores in 2 states with larger TIMSS samples (MA and MN, with approximately 50 schools per grade per state)
– California design part of national sample
Fit (R2) between actual means and those predicted by model similar to overall model:
Obviously would like to validate/ examine model fit to schools in other states
Grade/ Subject Massachusetts Minnesota California
Grade 4 Math 0.701 0.551 0.789
Grade 4 Science 0.574 0.545 0.817
Grade 8 Math 0.742 0.551 0.506
Grade 8 Science 0.547 0.533 0.607
Slide 8
Illustration of Model Fit to Individual School Data – High R2
[Figure: Massachusetts: 4th Grade Math. Comparison of Predicted to Actual Mean School Scores. Horizontal axis: TIMSS Score; vertical axis: Predicted Score. R-squared = 0.70]
Slide 9
Illustration of Model Fit to Individual School Data - Lower R2
[Figure: Massachusetts: 8th Grade Math. Comparison of Predicted to Actual Mean School Scores. Horizontal axis: TIMSS Score; vertical axis: Predicted Score. R-squared = 0.55]
Slide 10
Approach to computing State-Level Means
Use regression estimated from TIMSS sample to create point estimate of mean for each school in CCD
Weight school estimates up to state-level mean using CCD number of students in grade
Should be unbiased in terms of expectation across repeated samples of schools
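The weighting step can be sketched as an enrollment-weighted average of predicted school means within each state. The function name `state_means` and the example numbers are hypothetical; the only assumption beyond the slide is that each school carries a predicted mean, a grade enrollment count, and a state label.

```python
import numpy as np

def state_means(predicted, enrollment, state_ids):
    """Enrollment-weighted mean of predicted school scores, by state."""
    predicted = np.asarray(predicted, dtype=float)
    enrollment = np.asarray(enrollment, dtype=float)
    states = np.asarray(state_ids)
    result = {}
    for s in np.unique(states):
        mask = states == s
        result[str(s)] = float(np.average(predicted[mask], weights=enrollment[mask]))
    return result

# Hypothetical example: two schools in one state, one in another.
means = state_means([500, 600, 550], [100, 300, 50], ["MA", "MA", "MN"])
```

In this toy case the larger school pulls its state's mean toward its own predicted score, as the weighting intends.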
Slide 11
Sources of Variance in Estimates of State-level Means
1. Mix and characteristics of schools in TIMSS sample used in given regression
– Different sample will produce different coefficients
2. Individual school estimate has prediction error around its expectation from regression
– Standard error of regression: decreases with R2 and increases with distance from sample mean
– Mean square error of regressions is about 20–25 points
3. Measurement error around mean (relatively small – about 5 points) for sampled schools
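The dependence of prediction error on R2 and on distance from the sample mean follows from the textbook prediction variance for a linear regression. The notation below is an assumption for illustration and does not appear in the slides: x_0 is the covariate vector of a school outside the sample, X is the design matrix of the n sampled schools with k predictors, and SST is the total sum of squares of the sampled school means.

```latex
\widehat{\operatorname{Var}}\left(y_0 - \hat{y}_0\right)
  = \hat{\sigma}^2 \left(1 + x_0^{\top} (X^{\top} X)^{-1} x_0\right),
\qquad
\hat{\sigma}^2 = \frac{(1 - R^2)\,\mathrm{SST}}{n - k - 1}

% Special case of a single predictor with an intercept:
x_0^{\top} (X^{\top} X)^{-1} x_0
  = \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
```

A higher R2 lowers the residual variance, while the second term grows as the school's covariates move away from the sample mean.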
Slide 12
Estimating Variance in State-Level Means
1. Take coefficients from regression model using 2007 TIMSS sample as given
2. Predict expected mean score for all regular public schools in CCD and add sources of error (i.e., random draws from distributions) for individual schools
3. Apply TIMSS sampling methodology (or any other!) and draw sample of schools
4. Estimate regression for this sample and compute coefficients
5. Compute state means from coefficient sample
6. REPEAT procedure drawing different samples to compute state-level means
7. Summarize means and standard deviation of estimated means (i.e., SEs) by state
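The seven steps above can be sketched as a small Monte Carlo loop. This is a simplified illustration under stated assumptions, not the procedure actually used: sampling is a weighted draw without replacement with probability proportional to enrollment (a stand-in for the TIMSS design), errors are normal and redrawn in every replication, and all data, names, and parameter values are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def simulate_state_means(X, enrollment, states, beta0, resid_sd, meas_sd,
                         n_sample, n_reps, seed=0):
    rng = np.random.default_rng(seed)
    design = np.column_stack([np.ones(len(X)), X])
    expected = design @ beta0                 # steps 1-2: expected school means
    labels = np.unique(states)
    prob = enrollment / enrollment.sum()      # assumption: size-weighted sampling
    draws = np.empty((n_reps, len(labels)))
    for r in range(n_reps):                   # step 6: repeat
        # Step 2: add school-level prediction error (assumed normal).
        school_means = expected + rng.normal(0, resid_sd, len(X))
        # Step 3: draw a sample of schools.
        idx = rng.choice(len(X), size=n_sample, replace=False, p=prob)
        # Measurement error applies only to sampled schools.
        observed = school_means[idx] + rng.normal(0, meas_sd, n_sample)
        # Step 4: re-estimate the regression on this sample.
        beta_r = ols(X[idx], observed)
        # Step 5: predict all schools and weight up to state means.
        pred = design @ beta_r
        for j, s in enumerate(labels):
            mask = states == s
            draws[r, j] = np.average(pred[mask], weights=enrollment[mask])
    # Step 7: mean and SD of the estimated state means across replications.
    return labels, draws.mean(axis=0), draws.std(axis=0, ddof=1)

# Hypothetical population of schools in three made-up states.
rng = np.random.default_rng(1)
n_schools = 2000
X = rng.uniform(0, 1, size=(n_schools, 2))
enrollment = rng.integers(20, 200, size=n_schools).astype(float)
states = rng.choice(np.array(["AA", "BB", "CC"]), size=n_schools)
beta0 = np.array([560.0, -60.0, -20.0])

labels, means, ses = simulate_state_means(
    X, enrollment, states, beta0,
    resid_sd=22.0, meas_sd=5.0, n_sample=230, n_reps=200)
for s, m, se in zip(labels, means, ses):
    print(f"{s}: mean={m:.1f}, SE={se:.1f}")
```

Because the sampling step is an ordinary function of the school list, swapping in a different design (larger samples, oversampling particular schools) only requires changing how `idx` is drawn.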
Slide 13
Illustration of National Results (4th Grade Math Scores)
Slide 14
Summary of National Means and Standard Errors
Regression/ simulation provides national estimates similar to direct estimates
Can be “apportioned” to states by pulling out observations for individual states
Grade/ Subject   Model Estimate Mean (SE)   Direct Estimate Mean (SE)
Grade 4 Math 528 (3.4) 527 (2.6)
Grade 4 Science 534 (1.8) 536 (2.9)
Grade 8 Math 508 (3.5) 506 (3.0)
Grade 8 Science 520 (3.4) 517 (3.0)
Slide 15
[Figure: 4th Grade Mathematics. Estimated TIMSS State Means and Confidence Intervals. Horizontal axis: Mean and 95 percent Confidence Interval, 400 to 650; points labeled by state abbreviation]
Slide 16
Comparison of Model Results to Direct State-Level Estimates
Grade/ Subject   Model Estimate Mean (SE)   Direct Estimate Mean (SE)
State = MA
Grade 4 Math 557 (7.7) 572 (3.5)
Grade 4 Science 562 (4.5) 571 (4.3)
Grade 8 Math 549 (8.0) 547 (4.6)
Grade 8 Science 568 (7.7) 556 (4.6)
State = MN
Grade 4 Math 555 (5.5) 554 (5.9)
Grade 4 Science 564 (3.4) 551 (6.1)
Grade 8 Math 549 (6.2) 532 (4.2)
Grade 8 Science 566 (5.9) 539 (4.8)
Slide 17
Some Observations on State-Level Estimates
4th grade math presented as illustration; high correlation in state estimates across grade and subject (r = 0.90 to 0.98)
Median standard errors across states are 6.1 for 4th grade math, 3.3 for 4th grade science, 5.6 for 8th grade math, and 5.4 for 8th grade science
Large standard errors in HI, AK, and DC estimates reflect large population shares of minorities multiplied through by variance of coefficients of associated variables
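The last observation can be written out for the coefficient-sampling part of the variance. Because the state estimate is linear in the coefficients, its variance is a quadratic form in the state's weighted mean covariates. The symbols here are assumptions for illustration: n_i is grade enrollment in school i, x_i its covariate vector (including the intercept), and \hat{\beta} the estimated coefficients.

```latex
\hat{\mu}_s = \bar{x}_s^{\top} \hat{\beta},
\qquad
\bar{x}_s = \frac{\sum_{i \in s} n_i \, x_i}{\sum_{i \in s} n_i}

\operatorname{Var}\left(\hat{\mu}_s\right)
  = \bar{x}_s^{\top} \operatorname{Var}(\hat{\beta}) \, \bar{x}_s
```

A state whose mean covariates are large on variables with imprecisely estimated coefficients therefore has a large standard error, even if it has many schools.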
Slide 18
Potential Extensions
Current estimates are regression-based only and do not directly combine with existing state-level samples
– Composite estimator would weight the two by relative variance, though current sample doesn’t support these estimators well with small state samples
Can validate model against two states (MA and MN) but would be useful to have other states with sufficient samples
– Could work with 1999 data that had 13 state samples
Regression / Simulation approach could be used to assess alternative sampling schemes or models, given regression-based assumptions of scores at school level
– Could draw larger samples to assess precision of direct estimates and compute composite estimates
– Could draw from more high-minority schools to reduce standard errors for some states
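One common form of a composite estimator weights each estimate by the inverse of its variance. The sketch below uses that form as an assumption; the slides do not specify the weighting, and the formula treats the two estimates as independent and unbiased, which is only approximate here since the direct sample also feeds the regression. The inputs are the Massachusetts Grade 4 Math values from Slide 16; the function name is hypothetical.

```python
import math

def composite_estimate(direct, se_direct, model, se_model):
    """Inverse-variance weighted combination of a direct and a model-based estimate."""
    precision_direct = 1.0 / se_direct ** 2
    precision_model = 1.0 / se_model ** 2
    total = precision_direct + precision_model
    weight_direct = precision_direct / total
    estimate = weight_direct * direct + (1.0 - weight_direct) * model
    se = math.sqrt(1.0 / total)  # valid only if the two estimates are independent
    return estimate, se, weight_direct

# MA Grade 4 Math from Slide 16: direct 572 (3.5), model 557 (7.7).
estimate, se, weight_direct = composite_estimate(572.0, 3.5, 557.0, 7.7)
```

With these inputs most of the weight goes to the direct estimate because its standard error is less than half that of the model estimate; in states with few sampled schools the balance would shift toward the regression.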
Slide 19
Conclusions
Regression model works well with current TIMSS given relatively high explanatory power of regression models
Direct estimates for most states would not be reliable/ credible with current sampling
Sample could be potentially modified to provide direct estimates for states and “borrow strength” from regression and require smaller state samples
– Simulation can help evaluate sampling schemes a priori for complex estimators