
National Evaluation of Upward Bound


DESCRIPTION

Presentation explains issues with the evaluation and gives results of a re-analysis correcting for sampling and non-sampling error issues.


Page 1: National evaluationofupwardbound
Page 2: National evaluationofupwardbound

Topics/Purposes of Presentation

1. Give overview and policy history
2. Explain what went wrong and why it went wrong
3. Present results of re-analyses that mitigate issues and correct impact estimates
4. Discuss next steps and invitation for more analyses

Page 3: National evaluationofupwardbound

Clarification of What Presentation Is Not

Not a critique of random assignment: I recognize the power of the method and hope this critique will improve its application.

Not a general critique of Mathematica Policy Research's work. I believe the conclusions and "no impact" estimates in their Upward Bound (UB) reports are seriously flawed, and I am very critical of Mathematica's refusal to acknowledge more robust positive impact estimates and of their misleading masking of key issues with the study in reports, but I respect the hard work and determination involved in completing this study.

Not an act of advocacy for the program: I am acting as a researcher concerned with meeting research standards.

Page 4: National evaluationofupwardbound

Personal Involvement Disclosure

Employed as a contractor for over 25 years:

Westat for 16 years; served as Project Director (PD) for the National Evaluation of Student Support Services (SSS).

Mathematica for 6 years; served as PD for the National Evaluation of Talent Search. While employed at Mathematica, also served as Survey Director for the UB Third and the start of the Fourth follow-up data collection.

RTI for 3 years; served as NSOPF PD.

The UB study began in 1992 and has been controversial over its entire history. Random assignment combined with a national probability sample is very rare. Mathematica published 4 reports (the two most recent in 2004 and 2009).

I joined the US Department of Education (ED), Policy and Program Studies Service (PPSS), in late 2004 as Team Leader for the Secondary Postsecondary Cross-Cutting (SPCC) Team; the UB study was under my team.

Developed concerns and was involved in a long, painful internal debate, 2006-2011; retired from ED in 2011.

Currently Co-Principal Investigator for an ED i3 grant, Using Data to Inform College Access Programming, at the Pell Institute for the Study of Opportunity in Higher Education at the Council for Opportunity in Education (COE).

Page 5: National evaluationofupwardbound

Basic Problem

As the final ED COR/Technical Monitor, I found that the impact estimates published in 2004 and again in 2009 were seriously flawed, such that the conclusions of "no detectable impact" for the UB program were erroneous.

Re-analyses correcting for these errors using standard statistical procedures found strong positive results for the UB program on major outcomes.

The report is not transparent in revealing these issues or the positive results found when these issues are addressed.

Page 6: National evaluationofupwardbound

Upward Bound (UB) Program Overview

UB begun in 1965 as part of the civil rights movement and Great Society
1991: Upward Bound Math Science (UBMS) initiative begun
Goal: increase college access and preparation for eligible high school students: low-income (150 percent of poverty) and first-generation college (no parent has a BA degree)
Academic focus: 6- to 8-week program on a college campus in summer, with academic year follow-up sessions
Most intensive of the TRIO programs: $4900 per year per student served; the average program serves 50 students per year
Grants are made to postsecondary institutions to run programs, and students often enroll in those institutions; currently over 1000 programs across the nation

Page 7: National evaluationofupwardbound

Figure: Percentage of high school students who had at least one parent with a four-year college degree, by race/ethnicity: 1972, 1980, 1990 and 2002. Source: NCES High School Longitudinal Studies.

[Line chart. Vertical axis: percent (0 to 60). Horizontal axis: year (1970 to 2005). Series: White; Hispanic or Latino; Black or African American; Asian; American Indian or Alaska Native; All.]

Note the large increase, since the program began, in the percent of parents having a BA degree.

Page 8: National evaluationofupwardbound

UB Evaluation: Study History

Second national evaluation and first random assignment study of UB: begun in 1992, last follow-up in 2003-04

Under 3 contracts, Mathematica has authored 4 reports published by ED (1996, 1999, 2004, 2009); the Fourth Follow-up report is unpublished

Page 9: National evaluationofupwardbound

UB Study Basic Design

Unique combination:
Multi-stage, complex, nationally representative probability sampling procedures, with inverse probability of selection weights to produce national estimates
Experimental random assignment design

Multi-stage sample design:
67 projects from 46 strata designed to represent different types of projects (4-year/2-year, public/private, small, medium, large, rural, non-rural, race/ethnicity of participants)
339 end-stage strata for 1500 treatment and 1380 control applicants

Projects were required to recruit at least twice the number of openings so that random assignment could be done
Study sought to change as little as possible about the program except recruitment

Accommodations: allowed "must serves", who were removed from analyses
Did not control actual offering of treatment or participation of those assigned
Multi-grade, multi-year cohort: grades 7 to 10 at baseline

Page 10: National evaluationofupwardbound

Flawed reports authored by Mathematica Policy Research have driven ED policy with regard to the UB program for more than a decade

Third Follow-up reported no average overall effects, but large effects for students at risk academically and for those with lower educational expectations, defined as expecting less than a BA at baseline

The Program Assessment Rating Tool (PART) was developed to assess and improve program performance so that the Federal government can achieve better results. UB was given an OMB PART rating of "ineffective"

Based on study findings, ED began a new UB initiative to serve more academically at-risk students

Budget: the Bush budgets zero-funded all federal pre-college programs (UB, UBMS, Talent Search and GEAR UP) in FY05 and FY06, justified by the UB study results; this was dropped in FY07 and FY08

Page 11: National evaluationofupwardbound

Policy History (cont)

UB 2006 Absolute Priority to serve 1/3 at-risk students and 9th graders. A new random assignment study to evaluate it was begun in 2006, blocked by Congress in 2007, and cancelled by ED in 2008.

HEOA 2008:
Mandates rigorous evaluations
Prohibits over-recruitment to the program solely for the purposes of random assignment evaluation. It does not prohibit all random assignment studies, only those that involve deliberate denial of services.

Absolute Priority cancelled

Page 12: National evaluationofupwardbound

Impact Estimates Reported by Mathematica and on ED Website have:

Inadequately controlled for bias in favor of the control group
Serious representational issues for the largest 4-year public stratum
Severe unequal weighting, with one project given 26 percent of the weight
Lack of standardization of outcome measures to expected high school graduation year, for a sample that spanned 5 years of expected high school graduation
Inappropriate use of National Student Clearinghouse (NSC) data when coverage was too low to meet standards or non-existent and there is evidence of bias

Page 13: National evaluationofupwardbound

Other Researchers Have Confirmed Issues

The initial concern came in 2005 from Mathematica itself, when a new staff person (no longer employed there) who was lead analyst for the Fourth Follow-up sent ED tables showing that results were sensitive to a single project. This revealed for the first time that one project had 26 percent of the weight and seemingly large negative impacts: overall impacts were positive when it was excluded and not significant when it was included.

PPSS consulted with statistical experts at RTI. James Chromy, Fellow of the American Statistical Association, was sent the file in 2007; he advised on how to handle project 69 (treat as ineligible) and replicated the statistical tabulations using SUDAAN. He asked for the sample frame, which Mathematica delayed in sending.

David Goodwin, the Division Director who was the original COR for the UB study and who originally defended the impact estimates, eventually came to see the problems and to believe that analyses without project 69 were more credible.

IES external reviews confirmed the basic issues and stated that results with project 69 were not robust.

When the information is presented, academic discussants and audiences are incredulous and do not understand why ED would continue to publish these impacts.

Page 14: National evaluationofupwardbound

Guidance from three intersecting traditions

Experimental design work examining the threats to validity (for example, Shadish, Cook, and Campbell; Heckman)

Survey methods research on sampling and non-sampling error (for example, Groves et al. 2004)

Statistical and program evaluation standards (for example, the Program Evaluation Standards, NCES Standards, AERA Standards)

Page 15: National evaluationofupwardbound

What is Sampling and Non-Sampling Error?

Sampling error is the error caused by observing a sample instead of the whole population. This sample-to-sample variation is estimated by observing variation among the sample members or by sub-dividing the sample.

Non-sampling error is a catch-all term for deviations from the true value of estimates, or study error, not caused by sampling (examples: non-response bias, lack of understanding of questions, lack of recall). It is harder to measure statistically.
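A small simulation can make the idea of sample-to-sample variation concrete. The sketch below is illustrative only: the population rate of 0.60, the sample size of 400, and the number of repeated samples are made-up values, not figures from the UB study, and it assumes simple random sampling rather than the study's complex design.

```python
import math
import random
import statistics

rng = random.Random(2011)  # fixed seed so the illustration is repeatable

TRUE_RATE = 0.60    # hypothetical population rate (assumption)
SAMPLE_SIZE = 400   # hypothetical sample size (assumption)
REPLICATES = 1000   # number of repeated samples drawn


def sample_rate(n, p):
    """Draw one simple random sample of size n and return its observed rate."""
    return sum(rng.random() < p for _ in range(n)) / n


estimates = [sample_rate(SAMPLE_SIZE, TRUE_RATE) for _ in range(REPLICATES)]

# Sampling error seen directly: spread of the estimate across repeated samples.
empirical_se = statistics.stdev(estimates)

# Sampling error from the textbook formula for a proportion.
formula_se = math.sqrt(TRUE_RATE * (1 - TRUE_RATE) / SAMPLE_SIZE)

print(f"spread across samples:  {empirical_se:.4f}")
print(f"formula standard error: {formula_se:.4f}")
```

Neither number captures non-sampling error: a problem such as non-response bias would shift the estimates in one direction no matter how many samples were drawn.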

Page 16: National evaluationofupwardbound

Basic Assumptions of Random Assignment Studies

1. Sample representative of the population to which one wishes to generalize
2. Treatment and control groups are equivalent
3. Treatment and control groups are treated equally except for the treatment
4. Treatment and control groups are mutually exclusive with regard to the treatment

Page 17: National evaluationofupwardbound

Request for Correction Covers

Major focus on the technical standards violations in the report

Also covers:

Transparency issues in the report (does not provide the information needed to judge, and also masks some of the issues)

Review process issues: in a politically directed process, the report was published over the objections of the unit responsible for the study (the PPSS Team Leader and Technical Reviewers) and over the formal disapproval of the Office of Postsecondary Education (OPE), in the last week of the Bush Administration

Note: It was published with the reported acquiescence of IES, even though an IES external reviewer had specifically stated that the "impact estimates were not robust"

Page 18: National evaluationofupwardbound

REPORTS HAVE 6 MAJOR STANDARDS VIOLATIONS

1. Seriously flawed sample design: one project of 67 carried 26 percent of the weight, and only a single project was selected from the largest study-defined stratum (some cases were weighted up to 200 times the weights of other students)

2. Serious representational issues for the project with 26 percent of the weight: it was atypical for its 4-year stratum in that it had mostly 2-year and less than 2-year certificate programs

3. Treatment and control groups were seriously non-equivalent, with bias in favor of the control group

4. Outcome variables were not standardized to expected high school graduation year (EHSGY) for a sample that spanned 5 years of graduation dates

5. Improper use of National Student Clearinghouse data for survey non-responders when coverage was too low or non-existent and there was evidence of bias

6. Lack of transparency in acknowledging issues and masking of some issues: biased reporting of findings and lack of acknowledgement of alternative credible positive findings for Upward Bound

Page 19: National evaluationofupwardbound

1. Sample Design Issues

Sample highly stratified: 46 strata for 67 projects
Unequal weighting: one project carries 26 percent, 3 projects 35 percent, and 8 projects 50 percent of the weight
Project-level stratification: 339 strata, unequal within projects
Basic design flaw: only one project selected for the largest stratum
Treatment-control non-equivalency introduced by the outlier 26 percent project

Page 20: National evaluationofupwardbound

Project that should have been declared ineligible to represent its 4-year stratum carried 26 percent of the weight

Extreme unequal weighting and serious representation issues
One project of 67 in the sample (known as project 69) carried 26 percent of the weight and was the sole representative of the largest 4-year public stratum, but was a former 2-year school with largely less than 2-year programs
Project partnered with a job training program
Inadequate representation of the 4-year stratum

Figure 5. Percent of the sum of the weights, by project, for the 67 projects making up the Upward Bound national evaluation sample: study conducted 1992-93 to 2003-04

[Bar chart. Vertical axis: percent of weight (0 to 30). The largest bar, project 69, is labeled 26.38.]

NOTE: Of the 67 projects making up the UB sample, just over half (54 percent) have less than 1 percent of the weights each, and one project (69) accounts for 26.4 percent of the weights.
SOURCE: Data tabulated (December 2007) by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education (ED), using national evaluation of Upward Bound data files: study conducted 1992-93 to 2003-04.
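To see why one project holding about a quarter of the weight matters, it can help to compute weight shares and Kish's approximate design effect from unequal weighting, n * sum(w^2) / (sum(w))^2. The sketch below is a rough illustration, not the study's calculation: it uses hypothetical project-level weights in which one project holds 26.4 percent and the other 66 projects are assumed to share the rest equally. The actual distribution was more uneven than that, so this simplification is an assumption.

```python
def weight_shares(weights):
    """Return each unit's share of the total weight."""
    total = sum(weights)
    return [w / total for w in weights]


def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weighting."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


# Hypothetical project-level weights (assumption): one project with 26.4
# percent of the weight, the remaining 66 projects sharing 73.6 percent equally.
project_weights = [26.4] + [73.6 / 66] * 66

shares = weight_shares(project_weights)
deff = kish_design_effect(project_weights)
effective_projects = len(project_weights) / deff

print(f"largest share of weight: {max(shares):.3f}")
print(f"design effect from weighting: {deff:.2f}")
print(f"effective number of projects: {effective_projects:.1f}")
```

Under these assumed weights, the 67 projects carry roughly the information of a much smaller equally weighted sample, which is the sense in which estimates become sensitive to a single project.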

Page 21: National evaluationofupwardbound

2. Treatment–Control Non-Equivalency

Sample well matched without project 69
Project 69 introduces bias into the overall sample in favor of the controls
Project 69 has large differences (examples):
Education expectations: 56 percent of controls expect an advanced degree, compared with 15 percent of the treatment group
9th grade academics: 8 percent of controls are at risk, compared with 33 percent of the treatment group
Expected HS graduation in 1997 (younger group): 60 percent of treatment and 42 percent of controls

Page 22: National evaluationofupwardbound

Project 69 had seriously non-equivalent treatment and control groups

[Bar chart, percent (0 to 100). Characteristics compared: male; expect MA or higher; base grade 8 or below; algebra in 9th; high academic risk; GPA below 2.5; White. Series: No69Treatment, No69Control, 69Treatment, 69Control.]

Page 23: National evaluationofupwardbound

Bias in 69 and balance in rest of sample taken together

Project 69 (percent treatment and control within each category)
High academic risk: Treatment, 80; Control, 20
In 9th (younger) grade in 1993-94: Treatment, 77; Control, 23
Expect advanced degree: Treatment, 21; Control, 79

Other 66 projects in sample
High academic risk: Treatment, 51; Control, 49
In 9th (younger) grade in 1993-94: Treatment, 51; Control, 49
Expect advanced degree: Treatment, 49; Control, 51

Page 24: National evaluationofupwardbound

High academic risk: Treatment, 58; Control, 42
In 9th (younger) grade in 1993-94: Treatment, 56; Control, 44
Expect advanced degree: Treatment, 42; Control, 58

Page 25: National evaluationofupwardbound

3. Lack of Outcome Standardization to Expected High School Graduation Year (EHSGY)

Multi-grade study cohort spanned 5 years of expected high school graduation

At the time of the last (5th) follow-up, 10 percent had 6 years, 30 percent had 7 years, 34 percent had 8 years, 19 percent had 9 years, and 5 percent had 10 years since high school graduation

Imbalance between treatment and control: the control group has a larger percentage of older 10th grade students at the time of randomization

Mathematica never standardized outcome measures based on EHSGY; ED staff derived these variables for re-analysis
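A minimal sketch of what standardizing an outcome to EHSGY means, under stated assumptions: the function names, the rule that EHSGY equals the baseline spring year plus the grades remaining, and the two example applicants are all hypothetical illustrations, not the variable definitions ED staff actually derived.

```python
def expected_hs_grad_year(baseline_grade, baseline_spring_year):
    """Spring year in which an on-time student would finish grade 12 (assumed rule)."""
    return baseline_spring_year + (12 - baseline_grade)


def entered_within(first_enroll_year, ehsgy, window_years):
    """True if postsecondary entry was observed within the window after EHSGY."""
    if first_enroll_year is None:
        return False
    return first_enroll_year <= ehsgy + window_years


# Two hypothetical applicants randomized in the same spring (1993).
older = {"grade": 10, "first_enroll_year": 1996}
younger = {"grade": 7, "first_enroll_year": 1999}

for student in (older, younger):
    student["ehsgy"] = expected_hs_grad_year(student["grade"], 1993)

# Unstandardized: "enrolled by calendar year 1997" favors the older student.
by_fixed_date = [s["first_enroll_year"] <= 1997 for s in (older, younger)]

# Standardized: "enrolled within +1 of EHSGY" gives both the same window.
by_ehsgy = [
    entered_within(s["first_enroll_year"], s["ehsgy"], 1)
    for s in (older, younger)
]

print(by_fixed_date, by_ehsgy)
```

Both hypothetical students enrolled one year after their expected graduation, yet only the standardized measure treats them alike. When one group contains more older students, a fixed calendar cutoff can therefore tilt the comparison.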

Page 26: National evaluationofupwardbound

4. Survey Attrition and Non-Response and Non-Coverage

Bias is a concern in longitudinal studies
UB response rates were very high for the follow-ups but were at 74 percent by the end; the control group response rate was 4 to 5 percent lower on the Third and Fourth follow-ups
Those with positive outcomes are more likely to respond
Use federal aid files to observe and impute
Improper use of National Student Clearinghouse data for non-respondents when enrollment coverage was too low and biased due to clustering, and when coverage of 2-year and less than 2-year institutions was non-existent in the most applicable period

Page 27: National evaluationofupwardbound

Figure 4. Percent of total UB study participants found on the federal financial aid files as applicants and as Pell recipients, classified by fourth follow-up survey response status: study conducted 1992-93 to 2003-04

Pell recipient: Responder, 63; Non-responder, 47
Applied for aid: Responder, 79; Non-responder, 62

NOTE: Unweighted data based on 2845 Upward Bound sample members from both treatment and control groups.
SOURCE: Data tabulated (October 2006) by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education (ED), using national evaluation of Upward Bound data files and Federal Applicant and Award Files 1994-95 to 2003-04.

Page 28: National evaluationofupwardbound

5. Service Participation and Non-Participation Issues

Waiting list drop-outs: 26 percent of the treatment group were coded as waiting list file drop-outs and kept in the treatment sample

First Follow-up survey: 18 percent of the treatment group participated in neither UB nor UBMS

Survey data: 12 to 14 percent of controls showed evidence of UB or UBMS participation

60 percent of controls and 92 percent of the treatment group reported some pre-college supplemental service participation

Page 29: National evaluationofupwardbound

6. Masking of Issues in Final Report

Failure to report on project 69's representational issues
Failure to acknowledge large impacts without project 69, and stating that exclusion of project 69 does not make a difference in conclusions
Failure to acknowledge NSC coverage and bias issues
Failure to acknowledge results of standardization of outcomes, and misleading statements concerning results
Failure to acknowledge the extent of academic risk bias in favor of the control group in estimates

Page 30: National evaluationofupwardbound

Alternative Re-Analyses

Experimental analyses
Intent to Treat (ITT): UB opportunity, using the original random assignment groups; logistic regression
Treatment on the Treated (TOT): UB/UBMS participation; instrumental variables regression

Quasi-experimental, observational
UB/UBMS compared with non-UB/non-UBMS service
Any service compared with no service

Selected subgroups (academic risk and educational expectations)

Page 31: National evaluationofupwardbound

Instrumental Variables Regression

Used in TOT/CACE and observational analyses
Two-stage regression to mitigate selection bias
First stage models factors related to participation
Second stage uses the first-stage results as an additional control in the model estimating outcomes
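The two-stage idea can be sketched on simulated data. This is a generic textbook two-stage least squares example, not the STATA specification used in the re-analysis: it uses random assignment as the only instrument, no baseline controls, no weights, and a second stage that replaces participation with its first-stage prediction. The participation thresholds, the unobserved "motivation" variable, and the true effect of 0.10 are all invented for illustration.

```python
import random


def ols(x, y):
    """One-regressor ordinary least squares; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(outcome, participated, assigned):
    """Estimate the effect of participation using assignment as the instrument."""
    # Stage 1: predict participation from random assignment.
    intercept, slope = ols(assigned, participated)
    predicted = [intercept + slope * z for z in assigned]
    # Stage 2: regress the outcome on predicted participation.
    _, effect = ols(predicted, outcome)
    return effect


rng = random.Random(69)
TRUE_EFFECT = 0.10  # invented effect of participation on the outcome

assigned, participated, outcome = [], [], []
for _ in range(20000):
    z = 1 if rng.random() < 0.5 else 0
    motivation = rng.gauss(0, 1)  # unobserved; drives both participation and outcome
    # Most of the treatment group participates; some controls participate too.
    threshold = -1.0 if z == 1 else 1.2
    d = 1 if motivation + rng.gauss(0, 1) > threshold else 0
    y = 0.6 + TRUE_EFFECT * d + 0.1 * motivation + rng.gauss(0, 0.3)
    assigned.append(z)
    participated.append(d)
    outcome.append(y)

naive = ols(participated, outcome)[1]  # biased by self-selection
tot = two_stage_least_squares(outcome, participated, assigned)

print(f"naive participant vs. non-participant difference: {naive:.3f}")
print(f"two-stage estimate: {tot:.3f}")
```

In this simulation the naive comparison overstates the effect because motivated students select into participation, while the two-stage estimate stays close to the invented true value. With observational comparisons, where no randomized instrument is available, the method can reduce selection bias but cannot be expected to remove it.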

Page 32: National evaluationofupwardbound

What is the same as Mathematica's Analyses?

Use same statistical methods (logistic and instrumental variables regression)
Statistical programs that take into account the complex multi-stage sample design in estimating standard errors (STATA)
Same ITT opportunity grouping; TOT participation grouping recognizes UBMS as a form of UB
Similar model baseline controls: both omit 9th grade academic risk indicators; include additional control for grade at baseline
Same weights (Mathematica's)

Page 33: National evaluationofupwardbound

What is Different from Mathematica's Analyses?

Standardize outcomes by expected high school graduation year
Avoid using early NSC data when coverage is too low; use NSC only for BA degree, as a supplement for survey non-responders
Use all applicable follow-up surveys (3 to 5), not just one round at a time; use federal aid files
Present data with and without project 69, weighted and unweighted
View impact estimates without project 69 as reasonably robust for 74 percent of applicants; view estimates with project 69 as non-robust, and their use should be avoided, especially for estimates of BA impact

Page 34: National evaluationofupwardbound

Re-analyses Findings for Enrollment and Financial aid

Standardizing for Expected High School Graduation Year (and not using NSC data for enrollment) found significant and substantial positive ITT and TOT findings weighted and unweighted and with and without project 69

Page 35: National evaluationofupwardbound

Overall Results

Significant and substantial positive ITT and TOT findings, weighted and unweighted and with and without project 69, for:
Evidence of postsecondary entrance in +18 months and in +4 years
Application for financial aid in +18 months and in +4 years
Evidence of award of any postsecondary degree or credential by the fourth follow-up (4 to 6 years after EHSGY)

Page 36: National evaluationofupwardbound

Figure 1. Estimated rates of postsecondary entrance within +1 (about 18 months) of expected high school graduation year (EHSGY) for Upward Bound Opportunity (ITT) and Upward Bound/Upward Bound Math Science Participation (TOT/CACE): study conducted 1992-93 to 2003-04

Estimate (evidence of postsecondary within +1 of EHSGY)   Treatment   Control   Difference
TOT/CACE, excludes outlier                                74.6        60.4      14.2****
ITT, excludes outlier                                     73.3        64.3       9.1***
TOT/CACE, includes outlier                                73.5        62.5      10.9****
ITT, includes outlier                                     72.9        66         6.9****

*/**/***/**** Significant at 0.10/0.05/.01/00 level; UB = regular Upward Bound; UBMS = Upward Bound Math Science; ITT = Intent to Treat; TOT = Treatment on Treated; CACE = Complier Average Causal Effect.

NOTE: Estimated rates from STATA logistic and instrumental variables regression taking into account the complex sample design. Weighted estimates use poststratified weights. See table 4 in body of the report for detailed note.
SOURCE: Data tabulated (January 2008) by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education (ED), using national evaluation of Upward Bound data files: study conducted 1992-93 to 2003-04; and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.

Page 37: National evaluationofupwardbound
Page 38: National evaluationofupwardbound

Figure 2. Estimated rates of application for federal financial aid within +4 of expected high school graduation year (EHSGY) for Upward Bound Opportunity (ITT) and Upward Bound/Upward Bound Math Science Participation (TOT/CACE): study conducted 1992-93 to 2003-04

Estimate (applied for federal financial aid within +4 of EHSGY)   Treatment   Control   Difference
TOT/CACE, excludes outlier                                        69.1        57.1      11.9****
ITT, excludes outlier                                             67.7        60.4       7.3***
TOT/CACE, includes outlier                                        66.7        56.1      10.6****
ITT, includes outlier                                             65.4        58.7       6.7****

*/**/***/**** Significant at 0.10/0.05/.01/00 level; UB = regular Upward Bound; UBMS = Upward Bound Math Science; ITT = Intent to Treat; TOT = Treatment on Treated; CACE = Complier Average Causal Effect.

NOTE: Estimated rates from STATA logistic and instrumental variables regression taking into account the complex sample design. Weighted data use poststratified weights. See table 6 and table 4 in body of the report for detailed notes.
SOURCE: Data tabulated (January 2008) by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education (ED), using national evaluation of Upward Bound data files: study conducted 1992-93 to 2003-04; and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.

Page 39: National evaluationofupwardbound

Re-Analyses: Awarded a BA within +6 years of EHSGY

Weighted with project 69: not significant
Unweighted: significant
For the 74 percent of the sample not represented by project 69:
28 percent increase in BA award for ITT UB opportunity (13.3 increased to 17.0)
50 percent increase in BA award for TOT UB participation analyses (14.1 increased to 21.1)
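The 28 and 50 percent figures are relative increases, which are easy to confuse with percentage-point differences. The short check below recomputes both from the rates quoted on the slide; treating the lower rate in each pair as the comparison rate is an assumption based on the wording "increased to".

```python
def relative_increase(comparison_rate, treatment_rate):
    """Percent increase of the treatment rate over the comparison rate."""
    return 100 * (treatment_rate - comparison_rate) / comparison_rate


itt = relative_increase(13.3, 17.0)  # ITT: UB opportunity
tot = relative_increase(14.1, 21.1)  # TOT: UB participation

print(f"ITT: {17.0 - 13.3:.1f} points, {itt:.0f} percent relative increase")
print(f"TOT: {21.1 - 14.1:.1f} points, {tot:.0f} percent relative increase")
```

So the ITT estimate is a gain of under 4 percentage points on a low base rate, which is why it reads as a 28 percent increase.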

Page 40: National evaluationofupwardbound

Impact of Upward Bound (UB) on Bachelor's (BA) degree attainment

NOTE: Instrumental variables regression models for Treatment on the Treated (TOT) estimates, based on 66 of 67 projects in the UB sample: National Evaluation of Upward Bound, study conducted 1992-93 to 2003-04.

EHSGY = Expected High School Graduation Year; NSC = National Student Clearinghouse; SFA = Student Financial Aid. All estimates significant at the .01 level or higher. Estimates based on 66 of 67 projects in sample, representing 74 percent of UB at the time of the study. One project removed due to introducing bias into estimates and representational issues. We use a 2-stage instrumental variables regression procedure to control for selection effects for the Treatment on the Treated (TOT) impact estimates.

SOURCE: Data tabulated January 2010 using National Evaluation of Upward Bound data files; study sponsored by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), U.S. Department of Education; study conducted 1992-93 to 2003-04.

Page 41: National evaluationofupwardbound

UB/UBMS Participation Compared with Other non-UB/UBMS Services Participation

Quasi-experimental: uses 2-stage instrumental variables regression, which controls for selection bias but does not eliminate it

Found statistically significant and substantive positive results for UB/UBMS participation for:
Evidence of postsecondary entrance +1 and +4
Application for financial aid +1 and +4
Award of BA in +6, unweighted overall, and unweighted and weighted without project 69

Page 42: National evaluationofupwardbound

Table 5. Evidence of postsecondary entrance within +1 (18 months) and within +4 of expected high school graduation year (EHSGY) for observational models comparing types of service receipt: National Evaluation of Upward Bound, study conducted 1992-93 to 2003-04

Comparison A: Participated in UB/UBMS compared with participated in other non-UB/non-UBMS pre-college support or supplemental services only (observational, instrumental variables regression)
Comparison B: Any pre-college support or supplemental services reported compared with no services reported (observational, instrumental variables regression)

All sampling strata

Outcome variable                  Comparison   xb T          xb C          Difference
Entrance within +1 of EHSGY       A            74.4 (76.2)   65.3 (66.8)    9.1*** (9.3****)
Entrance within +1 of EHSGY       B            73.5 (75.8)   48.6 (51.7)   25.0**** (24.1****)
Entrance within +4 of EHSGY       A            75.6 (78.2)   67.5 (68.7)    8.2*** (9.5****)
Entrance within +4 of EHSGY       B            74.8 (77.7)   51.4 (54.1)   23.5*** (23.6****)

One outlier project removed (remainder represents 74 percent of Horizons waiting list)

Outcome variable                  Comparison   xb T          xb C          Difference
Entrance within +1 of EHSGY       A            75.0 (76.3)   61.7 (66.3)   13.3**** (10.1****)
Entrance within +1 of EHSGY       B            74.3 (75.9)   44.6 (51.1)   29.8**** (24.7****)
Entrance within +4 of EHSGY       A            76.5 (78.4)   64.4 (68.2)   12.1**** (10.2****)
Entrance within +4 of EHSGY       B            75.9 (77.8)   47.8 (53.7)   28.1**** (24.1****)

*/**/***/**** Significant at 0.10/0.05/.01/00 level. UB = regular Upward Bound; UBMS = Upward Bound Math Science; T = Treatment; C = Control or comparison; xb = linear prediction from STATA ivreg instrumental variables regression. Odds ratio = prT(1-prC)/prC(1-prT).
NOTE: Unweighted data given in parentheses. Please see table 4 for detailed notes.
SOURCE: Data tabulated (January 2008) by the Policy and Program Studies Service (PPSS) using data from the National Evaluation of Upward Bound study files, baseline through 4th follow-up, and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.
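The odds ratio formula in the table note can be applied directly to any pair of rates. The sketch below is only an illustration of that formula: it treats the reported xb values as proportions (an assumption) and uses the weighted +1 entrance rates of 74.4 and 65.3 from the all-strata UB/UBMS comparison as the example.

```python
def odds_ratio(pr_t, pr_c):
    """Odds ratio = prT(1 - prC) / (prC(1 - prT)), with rates given as proportions."""
    return (pr_t * (1 - pr_c)) / (pr_c * (1 - pr_t))


# Weighted entrance within +1 of EHSGY, all sampling strata,
# UB/UBMS participation compared with other services only.
example = odds_ratio(0.744, 0.653)

print(f"odds ratio: {example:.2f}")
```

An odds ratio above 1 means the odds of postsecondary entrance are higher in the treatment column than in the comparison column; a value of exactly 1 would mean no difference.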

Page 43: National evaluationofupwardbound

Sub-Group Analyses

Bottom 20 percent on academic indicators
Large positive significant effects for:
Postsecondary entrance
Application for financial aid
Award of any postsecondary degree
Not for BA degree: too few achieved one to compare treatment and control

Top 80 percent on academic indicators
Moderate positive significant effects for:
Postsecondary entrance
Application for financial aid
Award of any postsecondary degree
BA degree in +6

Page 44: National evaluationofupwardbound
Page 45: National evaluationofupwardbound

Impact Estimates from Two-Stage Instrumental Variables Regression for Percent Obtaining a BA in +6 Years, Based on UB Random Assignment Evaluation

Note: All estimates significant at the .01 level or higher. Estimates based on 66 of 67 projects in sample representing 74 percent of UB at the time of the study. One project removed due to introducing bias into estimates and representational issues.

Page 46: National evaluationofupwardbound

Random Assignment National Evaluation of Upward Bound (UB): data on estimated increase in lifetime taxes paid compared to program cost per participant. Taxes are 4.9 to 5.9 times the cost of participation.

Sources and Assumptions:

*UB evaluation data. Estimates are based on estimated differences in educational attainment between the treatment and control groups from the random assignment study, which followed the sample for 6 to 10 years after expected high school graduation. The $41,495 figure is based on impact estimates from the final Fifth Follow-up Survey using outcome variables derived by Mathematica Policy Research, with weights adjusted for survey non-response. The $36,493 estimate is based on outcome variables for the longitudinal file standardized by expected high school graduation date. Treatment on the Treated (TOT) estimates are based on instrumental variables regression modeling for 66 of the 67 projects, representing 74 percent of the sample. One project of 67 in the sample was excluded because it was found to be ineligible to represent its stratum and also had large imbalances between the treatment and control groups that, due to its extreme weight, introduced bias into previously published overall estimates.

*Lifetime earnings and taxes data: US Census Bureau, The Big Payoff: Educational Attainment and Synthetic Estimates of Work-Life Earnings, Current Population Reports, July 2002, Jennifer Day and Eric Newburger; College Board, Education Pays: The Benefits of Higher Education for Individuals and Society, 2007.

**Cost of UB program per participant: US Department of Education data on average cost of UB for one year ($4900). Assumes the average participant uses about 1.5 times this level of resources.

Page 47: National evaluationofupwardbound

Ways to Support Request for Correction

Public statement of the fact that a request is being submitted, and the reasons for it

Statement requesting timely review by ED signed by stakeholders and evaluators

Holding panels discussing the issues at major education and evaluation associations (wider issues of evaluation methods and use and transparency)

Accountability of the evaluator contractors and ed. issues

Support for Timely Review of the Correction Request Will Be Needed

Page 48: National evaluationofupwardbound

Caution about trying to do too much: the study chose a difficult and atypical design combining probability sampling with experimental design, which led to serious issues made worse by mistakes and a general lack of awareness of sampling and non-sampling study errors and their role in impact estimation

Sample design flawed from start with serious unequal weighting—follow established standards for sample design

Representation issues—contractor did not adequately check representation of stratum and did not fully reveal issues when discovered

Lack of care in analysis: outcome measures were not standardized to expected high school graduation year, which spanned 5 years

Lack of checking of treatment and control group balance (equivalency on key attributes); relied on faith in random assignment to ensure it

Failure to respect stakeholder concerns about control group contamination and other issues, and the technical monitor's legitimate concerns about representation and treatment-control group non-balance bias issues; both were repeatedly dismissed as non-objective advocates

How could problems have been avoided in the first place? Follow existing standards!

Page 49: National evaluationofupwardbound

Serious Problems with Doing Nothing about Report

1. ED continues to officially misrepresent the impact of UB

2. The UB program reputation continues to be hurt by the evaluation and stakeholders have officially objected; could have serious consequences in Congress

3. Missed opportunity to build on the program’s successes and find ways to strengthen and adapt the program to achieve the nation’s goals of increased postsecondary access and completion

4. Evaluation research as a whole suffers from not correcting mistakes made and learning from them

Page 50: National evaluationofupwardbound

Do not try to represent the entire population of interest with the study (remove project 69 and represent 74 percent); an IES reviewer stated that estimates are robust for the other 66 projects taken together

Standardize outcomes to expected high school graduation year

Use NSC data only for BA degree and not for less than BA and not for postsecondary entrance

How to Correct Report? It is correctable and can provide useful information

Page 51: National evaluationofupwardbound

Partnership model among stakeholders

Use more innovative evaluation methods (collaborative, participatory, empowerment, utilization, systems analysis, culturally responsive evaluation)

Utilize resources/leverage academic institutional research offices of grantees

Focus on program improvement rather than up or down

Open and transparent sharing

Build capacity for self evaluation and accountability

Utilization of standards for statistical research and program evaluation

Next Steps in Evaluation

Page 52: National evaluationofupwardbound

Invitation to Research & Further Additional Information

The full text of the COE Request for Correction can be found at http://www.coenet.us/files/spotlight-COE_Request_for_Correction_of_Mathematica_Report_011812.pdf

Statement of concern by leading researchers in the field: http://www.coenet.us/files/spotlight-Statement_of_Concern_011812.pdf

Results of the re-analysis detailing study error issues can be found at: http://www.coenet.us/files/files-Do_the_Conclusions_Change_2009.pdf.

Information on obtaining the restricted use UB data files for additional research can be obtained by contacting: [email protected]

Page 53: National evaluationofupwardbound

Margaret.Cahalan@pellinstitute.org

202-347-7430 ext. 212

301-642-4851

Contact Information