EDUCATION RESEARCH MEETS THE GOLD STANDARD: STATISTICS, EDUCATION, AND RESEARCH METHODS AFTER “NO CHILD LEFT BEHIND”

Mack C. Shelley, II
Iowa State University
[email protected]

Presented at the Joint Statistical Meetings, August 7-11, 2005, Minneapolis, MN



Background

- This session is meant to help inform the national debate over the role of scientific standards for research in education, particularly as those research standards are influenced by statistical methods and theory.
- This session builds on a National Science Foundation award to Brian Hand (University of Iowa) and me.


Background

- The panel is designed to meld research interests in statistics, education, and related disciplines, and to discuss the dramatically changing context of contemporary education research.
- Why, exactly, is the context changing for statistical research in education?


Background

- Standards for acceptable research in education are affected greatly by:
  - the recent creation of the Institute of Education Sciences in the U.S. Department of Education,
  - passage of the No Child Left Behind Act of 2001, and
  - passage of the Education Sciences Reform Act (H.R. 3801) in 2002


Background

- Together, these developments
  - have reconstituted federal support for research and dissemination of information in education,
  - are meant to foster “scientifically valid research,” and
  - have established what is referred to as the “gold standard” for research in education.


Background

- These and other developments mean that education research now places greater emphasis on
  - quantification,
  - the use of randomized trials, and
  - the selection of valid control groups


Background

- This panel is intended to be part of a sustained and expanded dialogue
  - between the statistical community and those who implement the education research agenda
  - through a discussion of whether and how to implement the new standards for statistical work in the field of education research


What Is The “Gold Standard”?

- U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance
  - Identifying and implementing educational practices supported by rigorous evidence: A user friendly guide
  - http://www.ed.gov/about/offices/list/ies/news.html#guide


What Is The “Gold Standard”?

- This publication emphasizes:
  - evidence-based interventions
  - educational outcomes that have been found to be effective in randomized controlled trials
  - “research’s ‘gold standard’ for establishing what works”
  - following patterns of evidence use in medicine and welfare policy


What Is The “Gold Standard”?

- Establishing “strong” evidence requires
  - quality of studies: randomized controlled trials that are well-designed and implemented
  - quantity of evidence: trials showing effectiveness in two or more typical school settings
    - including a setting similar to that of schools/classrooms


What Is The “Gold Standard”?

- “Possible” evidence may include
  - randomized controlled trials whose quality/quantity are good but fall short of “strong” evidence
  - and/or comparison-group studies in which the intervention and comparison groups are very closely matched
    - in academic achievement, demographics, and other characteristics


What Is The “Gold Standard”?

- Evaluating whether an intervention is backed by “strong” evidence of effectiveness hinges on
  - well-designed and well-implemented randomized controlled trials
  - demonstrating that there are no systematic differences between intervention and control groups before the intervention
  - the use of measures and instruments of proven validity
  - “real-world” objective measures of the outcomes the intervention is designed to affect
  - attrition of no more than 25% of the original sample
  - effect size combined with statistical significance
  - an adequate sample size to achieve statistical significance
  - controlled trials implemented in more than one site in schools that represent a cross-section of all schools
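
Three of the criteria above (attrition, effect size, and sample size) are quantitative, so a rough sketch may help make them concrete. The Python below is an illustration only, not a procedure taken from the IES guide: the function names are hypothetical, the effect size is assumed to be Cohen's d with a pooled standard deviation, and the sample-size formula is the usual normal approximation for a two-sided, two-sample comparison of means. It ignores clustering of students within classrooms and schools, which real education trials must account for.

```python
import math
from statistics import NormalDist, mean, variance

MAX_ATTRITION = 0.25  # the 25% ceiling quoted on the slide


def attrition_rate(n_randomized, n_analyzed):
    """Share of the original sample lost before outcomes were measured."""
    return (n_randomized - n_analyzed) / n_randomized


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample SD (an assumed choice)."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for a two-sided test of two means."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil(2 * (z_total / effect_size) ** 2)


# Hypothetical trial: 200 students randomized, 160 with outcome data.
rate = attrition_rate(200, 160)
print(f"attrition = {rate:.0%}, within limit: {rate <= MAX_ATTRITION}")
print("students needed per arm to detect d = 0.5:", n_per_group(0.5))
```

The defaults of alpha = 0.05 and power = 0.80 are conventions, not requirements stated in the presentation.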


No Child Left Behind

- Public Law 107–110 [H.R. 1]
  - signed into law on January 8, 2002
  - “An Act to close the achievement gap with accountability, flexibility, and choice, so that no child is left behind”
- the “No Child Left Behind Act of 2001” (NCLB)
  - established standards for academic assessments in mathematics, reading or language arts, and science
  - multiple up-to-date measures of student academic achievement, including measures that assess higher-order thinking skills and understanding
- These requirements for program assessment lead to many opportunities and circumstances for the application of statistical methods.


No Child Left Behind

- The research program under NCLB was designed to examine the effect of the assessment and accountability systems on students, teachers, parents, families, schools, school districts, and States, including correlations between such systems and
  - student academic achievement
  - progress toward meeting the State-defined level of proficiency
  - progress toward closing the achievement gap
  - changes in course offerings, teaching practices, course content, and instructional material
  - teacher, principal, and pupil-services personnel turnover rates
  - student dropout, grade-retention, and graduation rates
  - students with disabilities
  - student socioeconomic status
  - level of student English proficiency
  - student ethnicity and race


The Education Sciences Reform Act and IES

- “The Education Sciences Reform Act”
  - “An Act to provide for improvement of Federal education research, statistics, evaluation, information, and dissemination, and for other purposes”
  - H.R. 3801, signed into law November 5, 2002
  - reconstituted federal support for research and dissemination of information in education, to foster “scientifically valid research”
  - established the Institute of Education Sciences (IES)
    - replacing the Office of Educational Research and Improvement
    - part of the Department of Education but functioning separately from it


The Education Sciences Reform Act and IES

- IES is the research arm of the Department of Education
- Mission is to expand knowledge and provide information on
  - the condition of education
  - practices that improve academic achievement
  - the effectiveness of Federal and other education programs
- Goal
  - the transformation of education into an evidence-based field in which decision makers routinely seek out the best available research and data before adopting programs or practices that will affect significant numbers of students
- Consists of
  - Grover J. (Russ) Whitehurst, first Director, since November 2002
  - Office of the Director
  - National Center for Education Research
  - National Center for Education Statistics
  - National Center for Education Evaluation and Regional Assistance
  - National Center for Special Education Research


The Education Sciences Reform Act and IES

- H.R. 3801 defined “Scientifically based research standards” to
  - apply rigorous, systematic, and objective methodology to obtain reliable and valid knowledge relevant to education activities and programs
  - present findings and make claims that are appropriate to and supported by the methods that have been employed


The Education Sciences Reform Act and IES

- “Scientifically based research” also includes
  - employing systematic, empirical methods that draw on observation or experiment
  - involving data analyses that are adequate to support the general findings
  - relying on measurements or observational methods that provide reliable data
  - making claims of causal relationships only in random assignment experiments or other designs (to the extent such designs substantially eliminate plausible competing explanations for the obtained results)
  - ensuring that studies and methods are presented in sufficient detail and clarity to allow for replication or, at a minimum, to offer the opportunity to build systematically on the findings of the research
  - obtaining acceptance by a peer-reviewed journal or approval by a panel of independent experts through a comparably rigorous, objective, and scientific review
  - using research designs and methods appropriate to the research question posed


The Education Sciences Reform Act and IES

- “Scientifically valid education evaluation” means an evaluation that
  - adheres to the highest possible standards of quality with respect to research design and statistical analysis
  - provides an adequate description of the programs evaluated and, to the extent possible, examines the relationship between program implementation and program impacts
  - provides an analysis of the results achieved by the program with respect to its projected effects
  - employs experimental designs using random assignment, when feasible, and other research methodologies that allow for the strongest possible causal inferences when random assignment is not feasible
  - may study program implementation through a combination of scientifically valid and reliable methods
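
Random assignment, which the Act names as the preferred design when feasible, can be illustrated with a very small sketch. The Python below is a minimal example under stated assumptions and is not drawn from the Act or from IES guidance: the function name, the classroom labels, and the seed are all hypothetical, and it performs simple (unstratified) assignment of whole units to two arms. Actual education trials often block or stratify by school.

```python
import random


def randomly_assign(units, seed=None):
    """Shuffle the units and split them into two arms of (nearly) equal size."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible and auditable
    shuffled = list(units)     # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


# Hypothetical units: 20 classrooms identified only by label.
classrooms = [f"classroom_{i:02d}" for i in range(1, 21)]
arms = randomly_assign(classrooms, seed=2005)
print(len(arms["intervention"]), "intervention;", len(arms["control"]), "control")
```

Because chance alone decides which arm a unit enters, pre-existing differences between the arms are not systematic, which is what licenses the causal claims the definition refers to.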


What Works

- What Works Clearinghouse (WWC)
  - established in 2002 by IES
  - to provide educators, policymakers, and the public with a central and trusted source of scientific evidence of what works in education
  - administered by the U.S. Department of Education, through a contract to a joint venture of the American Institutes for Research and the Campbell Collaboration
  - reviews and reports on existing studies of interventions (education programs, products, practices, and policies) in selected topic areas
    - applies standards that follow scientifically valid criteria for determining the effectiveness of these interventions
- Technical Advisory Group (TAG)
  - leading experts in research design, program evaluation, and research synthesis
  - advises on the standards for evaluation research reviews
  - monitors and informs the methodological aspects of WWC reviews and reports
- www.whatworks.ed.gov


What Works - TAG

- Dr. Larry V. Hedges, Chairperson, Stella M. Rowley Professor of Education, Psychology, Public Policy Studies, and Sociology, University of Chicago, and editorial board member of the American Journal of Sociology, the Review of Educational Research, and Psychological Bulletin.
- Dr. Betsy Jane Becker, Professor of Measurement and Quantitative Methods, College of Education, Michigan State University.
- Dr. Jesse A. Berlin, Professor of Biostatistics, University of Pennsylvania School of Medicine, and Director of Biostatistics at the university's Comprehensive Cancer Center.
- Dr. Douglas Carnine, Professor of Education, University of Oregon, and Director of the National Center to Improve the Tools of Educators.
- Dr. Thomas D. Cook, Professor of Sociology, Psychology, Education and Social Policy, Northwestern University, and Faculty Fellow at the Institute for Policy Research.
- Dr. David J. Francis, Professor of Quantitative Methods, Chairman of the Department of Psychology, and Director of the Texas Institute for Measurement, Evaluation, and Statistics, University of Houston.
- Dr. Robert L. Linn, Distinguished Professor of Education, University of Colorado at Boulder, and Co-Director of the National Center for Research on Evaluation, Standards, and Student Testing.
- Dr. Mark W. Lipsey, Senior Research Associate, Vanderbilt Institute for Public Policy Studies, and Director of the Center for Evaluation Research and Methodology.
- Dr. David Myers, Senior Fellow, Mathematica Policy Research, and former Director of the U.S. Department of Education's national evaluation of Upward Bound.
- Dr. Andrew C. Porter, Patricia and Rodes Hart Professor of Educational Leadership and Policy and Director of the Learning Sciences Institute at Vanderbilt University.
- Dr. David Rindskopf, Professor of Psychology and Educational Psychology, City University of New York Graduate Center, and elected Fellow of the American Statistical Association.
- Dr. Cecilia E. Rouse, Professor of Economics and Public Affairs, and joint appointee in the Economics Department and Woodrow Wilson School, Princeton University.
- Dr. William R. Shadish, Founding Faculty and Professor of Social Sciences, Humanities, and Arts at the University of California, Merced.


What Works Current Topics

The What Works Clearinghouse (WWC) prioritizes topics based on the following criteria:
- potential to improve important student outcomes;
- applicability to a broad range of students or to particularly important subpopulations;
- policy relevance and perceived demand within the education community; and
- likely availability of scientific studies.

Specifically, the topics were selected from nominations received through:
- emails from the public;
- meetings and presentations sponsored by the What Works Clearinghouse;
- the What Works Network;
- suggestions presented by senior members of education associations, policymakers, and the U.S. Department of Education; and
- reviews of existing research.


What Works Current Topics

Topics include:
- Math: Curriculum-Based Interventions for Increasing Middle School Math
- Reading: Interventions for Beginning Reading
- Character Education: Comprehensive Schoolwide Character Education Interventions: Benefits for Character Traits, Behavioral, and Academic Outcomes
- Dropout Prevention: Interventions for Preventing High School Dropout
- English Language Learning: Interventions for Elementary School English Language Learners: Increasing English Language Acquisition and Academic Achievement
- Math: Curriculum-Based Interventions for Increasing Elementary School Math
- Early Childhood: Interventions for Improving Preschool Children’s School Readiness
- Delinquent, Disorderly, and Violent Behavior: Interventions to Reduce Delinquent, Disorderly, and Violent Behavior in Middle and High Schools
- Adult Literacy: Interventions for Increasing Adult Literacy
- Peer-Assisted Learning: Peer-Assisted Learning Interventions in Elementary Schools: Reading, Mathematics, and Science Gains


“Does Not Meet Evidence Screens”

Studies may not pass WWC screening requirements for the following reasons:
- Evaluation research design. The study did not meet certain design standards. Study designs that provide the strongest evidence of effects include
  - randomized controlled trials
  - regression discontinuity designs
  - quasi-experimental designs (must use a similar comparison group and have no attrition or disruption problems)
  - single subject designs
- Topic area definition. The study did not meet the intervention definition developed by the WWC for a particular topic.
- Time period definition (generally, the last 20 years)
- Relevant outcome
  - academic outcomes, not, for example, student self-confidence
  - needs to have only one relevant outcome to pass this screen
  - test reliability or validity
  - sample or description of relevant test items if a study outcome test is not known or available
- Relevant student sample


A Real Live Current Example

MATHEMATICS AND SCIENCE EDUCATION RESEARCH GRANTS PROGRAM
CFDA (Catalog of Federal Domestic Assistance) NUMBER: 84.305
RELEASE DATE: May 6, 2005
REQUEST FOR APPLICATIONS NUMBER: NCER-06-02 Mathematics and Science Education Research Grants Program
http://www.ed.gov/about/offices/list/ies/programs.html
LETTER OF INTENT RECEIPT DATE: September 12, 2005
APPLICATION RECEIPT DATE: November 3, 2005, 8:00 p.m. Eastern time


A Real Live Current Example

REVIEW CRITERIA FOR SCIENTIFIC MERIT

Significance
  Does applicant make a compelling case for the potential contribution of the project to the solution of an education problem?
  Does the applicant present a strong rationale justifying the need to evaluate the selected intervention (e.g., does prior evidence suggest that the intervention is likely to substantially improve student learning and achievement)?

Research Plan
  Does the applicant present
    (a) clear hypotheses or research questions
    (b) clear descriptions of and strong rationales for the sample, measures (including information on reliability and validity), data collection procedures, and research design
    (c) a detailed and well-justified data analysis plan?
  Does the research plan meet the requirements described in the section on the Requirements of the Proposed Research?
  Is the research plan appropriate for answering the research questions or testing the proposed hypotheses?


A Real Live Current Example

Applications under Goal Three (Efficacy and Replication Trials)

Under Goal Three, the Institute requests proposals to test the efficacy of fully developed interventions that already have evidence of potential efficacy.

By efficacy, the Institute means the degree to which an intervention has a net positive impact on the outcomes of interest in relation to the program or practice to which it is being compared.


A Real Live Current Example

Methodological requirements

(i) Sample

The applicant should define, as completely as possible, the sample to be selected and sampling procedures to be employed for the proposed study. Additionally, the applicant should describe strategies to ensure that participants will remain in the study over the course of the evaluation.


A Real Live Current Example

(ii) Design
  Applicants should describe how potential threats to internal and external validity will be addressed.
  Studies using randomized assignment to treatment and comparison conditions are strongly preferred.
  When a randomized trial is used, the applicant should clearly state the unit of randomization (e.g., students, classroom, teacher, or school).
  Choice of randomizing unit or units should be grounded in a theoretical framework.
  Applicants should explain the procedures for assignment of groups (e.g., schools, classrooms) or participants to treatment and comparison conditions.
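As an illustration of what a documented assignment procedure might look like when the school is the unit of randomization, here is a minimal sketch. It is not taken from the RFA; the function name, the school identifiers, the seed, and the equal split between conditions are all assumptions made for the example.

```python
import random

def assign_clusters(cluster_ids, seed=2005):
    """Randomly assign whole clusters (e.g., schools) to two conditions.

    Hypothetical helper: the fixed seed only makes the assignment
    reproducible so that it can be documented and audited.
    """
    rng = random.Random(seed)
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # First half of the shuffled list -> treatment, remainder -> comparison
    return {cid: ("treatment" if i < half else "comparison")
            for i, cid in enumerate(shuffled)}

# Hypothetical school identifiers
schools = [f"school_{k:02d}" for k in range(1, 11)]
assignment = assign_clusters(schools)
for school in sorted(assignment):
    print(school, assignment[school])
```

Because every student in a school receives that school's condition, the school (not the student) is the unit of randomization in this sketch, and later power and analysis steps would need to reflect that.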


A Real Live Current Example

(ii) Design (continued)

Only in circumstances in which a randomized trial is not possible may alternatives that substantially minimize selection bias or allow it to be modeled be employed. Applicants … must make a compelling case that randomization is not possible.

Acceptable alternatives include appropriately structured regression-discontinuity designs or other well-designed quasi-experimental designs that come close to true experiments in minimizing the effects of selection bias on estimates of effect size.
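To make the regression-discontinuity idea concrete, the sketch below estimates the jump in an outcome at an assignment cutoff by fitting a separate straight line on each side and comparing the two fitted values at the cutoff. It is a simplified illustration, not the RFA's prescribed analysis: the function names, the cutoff, the bandwidth, the rule that units at or above the cutoff are treated, and the noiseless made-up data are all assumptions.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def sharp_rd_estimate(running, outcome, cutoff, bandwidth):
    """Difference between the two fitted lines, evaluated at the cutoff.

    Assumes units with running >= cutoff received the intervention.
    """
    pairs = list(zip(running, outcome))
    below = [(x, y) for x, y in pairs if cutoff - bandwidth <= x < cutoff]
    above = [(x, y) for x, y in pairs if cutoff <= x <= cutoff + bandwidth]
    slope_b, icept_b = fit_line(*zip(*below))
    slope_a, icept_a = fit_line(*zip(*above))
    return (slope_a * cutoff + icept_a) - (slope_b * cutoff + icept_b)

# Made-up data: a linear trend plus a jump of 3 points at the cutoff of 10
running = list(range(0, 21))
outcome = [2 + 0.5 * x + (3 if x >= 10 else 0) for x in running]
effect = sharp_rd_estimate(running, outcome, cutoff=10, bandwidth=10)
print(round(effect, 6))
```

With real data the result depends heavily on the bandwidth and on whether a straight line is adequate on each side, which is one reading of the requirement that such designs be "appropriately structured".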


A Real Live Current Example

(ii) Design (continued)

A well-designed quasi-experiment reduces substantially the potential influence of selection bias on membership in the intervention or comparison group. This involves:
  demonstrating equivalence between the intervention and comparison groups at program entry on the variables measuring program outcomes (e.g., math achievement test scores), or obtaining such equivalence through statistical procedures such as propensity score balancing or regression
  demonstrating equivalence or removing statistically the effects of other variables on which the groups may differ and that may affect intended outcomes of the program being evaluated (e.g., demographic variables, experience and level of training of teachers, motivation of parents or students)
  a design for the initial selection of the intervention and comparison groups that minimizes selection bias or allows it to be modeled
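One common way to examine equivalence at program entry is to express the pretest difference between groups in standard deviation units. The sketch below does this with a pooled standard deviation. The function name and the pretest scores are hypothetical, and the slides do not say how large a difference would still count as equivalent, so any threshold is left to the analyst.

```python
import math
import statistics

def standardized_mean_difference(treatment, comparison):
    """Baseline difference between groups in pooled standard deviation units."""
    n_t, n_c = len(treatment), len(comparison)
    var_t = statistics.variance(treatment)    # sample variance, n - 1 denominator
    var_c = statistics.variance(comparison)
    pooled_sd = math.sqrt(((n_t - 1) * var_t + (n_c - 1) * var_c)
                          / (n_t + n_c - 2))
    return (statistics.mean(treatment) - statistics.mean(comparison)) / pooled_sd

# Hypothetical math pretest scores at program entry
pretest_intervention = [48.0, 52.0, 55.0, 61.0, 47.0, 58.0]
pretest_comparison = [50.0, 49.0, 57.0, 60.0, 45.0, 56.0]
smd = standardized_mean_difference(pretest_intervention, pretest_comparison)
print(f"baseline standardized mean difference: {smd:.3f}")
```

The same function can be applied to each of the other variables on which the groups may differ, such as teacher experience.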


A Real Live Current Example

(iii) Power
  Applicants should clearly address the power of the evaluation design to detect a reasonably expected and minimally important effect.
  For determining the sample size, applicants need to consider the number of clusters, the number of individuals within clusters, the potential adjustment from covariates, the desired effect, the intraclass correlation (i.e., the variance between clusters relative to the total variance between and within clusters), the desired power of the design, one-tailed vs. two-tailed tests, repeated observations, attrition of participants, etc.
  Applicants should anticipate the degree to which the magnitude of the expected effect may vary across the primary outcomes of interest.
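The interplay of the number of clusters, cluster size, and intraclass correlation can be sketched with two textbook quantities: the design effect, 1 + (m - 1) * ICC, and an approximate minimum detectable effect size for a two-arm trial that assigns whole clusters in equal numbers to each condition. This is a simplified normal-approximation sketch, not the RFA's required method. It ignores covariate adjustment, repeated observations, attrition, and t-distribution corrections, and the function names and example values are assumptions.

```python
import math
from statistics import NormalDist

def design_effect(m, icc):
    """Variance inflation from clustering with m individuals per cluster."""
    return 1 + (m - 1) * icc

def minimum_detectable_effect(n_clusters, m, icc,
                              alpha=0.05, power=0.80, two_tailed=True):
    """Approximate minimum detectable effect size in standard deviation units.

    Assumes equal allocation of n_clusters clusters to two conditions,
    equal cluster sizes, and a normal approximation to the test statistic.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    z_power = z.inv_cdf(power)
    # Standard error of the difference in means, with total variance set to 1
    se = math.sqrt(4 * (icc + (1 - icc) / m) / n_clusters)
    return (z_alpha + z_power) * se

# Hypothetical design: 40 schools, 25 students per school
for icc in (0.05, 0.10, 0.20):
    print(f"ICC={icc:.2f}  design effect={design_effect(25, icc):.2f}  "
          f"MDES={minimum_detectable_effect(40, 25, icc):.3f}")
```

Under these assumptions, raising the intraclass correlation raises the detectable effect size quickly, and adding schools tends to help more than adding students within schools.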


A Real Live Current Example

(iv) Measures
Investigators should include
  relevant standardized measures of student achievement (e.g., standardized measures of mathematics achievement)
  other measures of student learning and achievement (e.g., researcher-developed measures)
  measures of teacher practices
  information on the reliability, validity, and appropriateness of proposed measures


A Real Live Current Example

(v) Fidelity of implementation of the intervention
The applicant should
  specify how the implementation of the intervention will be documented and measured
  either indicate how the intervention will be maintained consistently across multiple groups (e.g., classrooms and schools) over time or describe the parameters under which variations in the implementation may occur
  propose research designs that permit the identification and assessment of factors impacting the fidelity of implementation


A Real Live Current Example

(vi) Comparison group, where applicable
The applicant should
  describe strategies to avoid contamination between treatment and comparison groups
  include procedures for describing practices in the comparison groups
  be able to compare intervention and comparison groups on the implementation of key features of the intervention
Using a business-as-usual comparison group is acceptable
  applicants should specify the treatment or treatments received in the comparison group
  applicants should account for the ways in which what happens in the comparison group is important to understanding the net impact of the experimental treatment


A Real Live Current Example

(vii) Mediating and moderating variables
  Mediating and moderating variables that are measured in the intervention condition and that are also likely to affect outcomes in the comparison condition should be measured in the comparison condition (e.g., student time-on-task, teacher experience/time in position).
  The evaluation should account for sources of variation in outcomes across settings (i.e., to account for what might otherwise be part of the error variance).

(viii) Data analysis
  specific statistical procedures should be described
  the relation between hypotheses, measures, and independent and dependent variables should be clear
  the effects of clustering must be accounted for in the analyses, even when individuals are randomly assigned to condition
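One simple way to respect clustering when individuals are randomly assigned within clusters is to compute the treatment-minus-comparison difference inside each cluster and then test the cluster-level differences, so that the cluster rather than the student supplies the degrees of freedom. The sketch below shows this approach. It is only one option (multilevel models are a widely used alternative) and is not presented in the slides; the function names and classroom scores are hypothetical.

```python
import math
import statistics
from collections import defaultdict

def within_cluster_differences(records):
    """records: iterable of (cluster_id, condition, score) tuples.

    Returns one treatment-minus-comparison mean difference per cluster.
    Assumes every cluster has at least one score in each condition.
    """
    scores = defaultdict(lambda: {"treatment": [], "comparison": []})
    for cluster_id, condition, score in records:
        scores[cluster_id][condition].append(score)
    return [statistics.mean(arms["treatment"]) - statistics.mean(arms["comparison"])
            for arms in scores.values()]

def cluster_level_t(differences):
    """Mean difference, its standard error, and a t statistic
    with len(differences) - 1 degrees of freedom."""
    mean_diff = statistics.mean(differences)
    se = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_diff, se, mean_diff / se

# Hypothetical posttest scores for three classrooms
records = [
    ("A", "treatment", 12.0), ("A", "treatment", 14.0),
    ("A", "comparison", 10.0), ("A", "comparison", 10.0),
    ("B", "treatment", 11.0), ("B", "treatment", 13.0),
    ("B", "comparison", 10.0), ("B", "comparison", 12.0),
    ("C", "treatment", 15.0), ("C", "treatment", 15.0),
    ("C", "comparison", 13.0), ("C", "comparison", 13.0),
]
differences = within_cluster_differences(records)
mean_diff, se, t_stat = cluster_level_t(differences)
print(differences, mean_diff, round(se, 3), round(t_stat, 3))
```

Treating the twelve students as independent observations would overstate the amount of information in these data, which is the concern behind the requirement that clustering be accounted for.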