JOURNAL OF RESEARCH IN SCIENCE TEACHING VOL. 49, NO. 8, PP. 987–1011 (2012)

Research Article

The Effect of an Instructional Intervention on Middle School English Learners’ Science and English Reading Achievement

Rafael Lara-Alecio,1 Fuhui Tong,1 Beverly J. Irby,2 Cindy Guerrero,1

Maggie Huerta,1 and Yinan Fan1

1Department of Educational Psychology, Texas A&M University, College Station, Texas 77843

2Sam Houston State University, Huntsville, Texas 77320

Received 27 April 2011; Accepted 7 June 2012

Abstract: This study examined the effect of a quasi-experimental project on fifth grade English

learners’ achievement in state-mandated standards-based science and English reading assessment. A

total of 166 treatment students and 80 comparison students from four randomized intermediate schools

participated in the current project. The intervention consisted of on-going professional development and

specific instructional science lessons with inquiry-based learning, direct and explicit vocabulary instruc-

tion, integration of reading and writing, and enrichment components including integration of technology,

take-home science activities, and university scientists mentoring. Results suggested a significant and

positive intervention effect in favor of the treatment students as reflected in higher performance

in district-wide curriculum-based tests of science and reading and standardized tests of oral reading

fluency. © 2012 Wiley Periodicals, Inc. J Res Sci Teach 49: 987–1011, 2012

Keywords: science intervention; English learners; inquiry-based learning; integrating science and

literacy; standards-based assessment

Recent demographics have revealed that English language learners (ELLs) comprise 21%

of the national enrollment in public elementary and secondary schools, with 79% of those

students being Spanish speakers (National Center for Education Statistics [NCES], 2010). In

Texas, alone, over 778,806 students were served in ELL programs in 2009–2010, accounting

for 16% of the school population (Texas Education Agency [TEA], 2010a). These numbers

are significant when achievement is compared to that of mainstream students. For example, at

the eighth grade level, the national data show that only 3% of ELLs achieved at or above

proficient level on the 2009 National Assessment of Educational Progress (NAEP) reading

assessment, in comparison to 35% of English-speaking students, or a statistically significant

43-point difference (NCES, 2010). Similarly, the percentage of ELLs at or above proficient

at eighth grade level was 5% in Math and 2% in Science, as compared to 36% in Math and

32% in Science among English-speaking students (NCES, 2010). Such an achievement gap is

also reflected in the state-wide standardized assessment of science in which fifth grade ELLs

Additional Supporting Information may be found in the online version of this article.

Contract grant sponsor: National Science Foundation (NSF, grant # 0822343 to Texas A&M University,

0822153 to Sam Houston State University).

Correspondence to: F. Tong; E-mail: [email protected]

DOI 10.1002/tea.21031

Published online 11 July 2012 in Wiley Online Library (wileyonlinelibrary.com).

© 2012 Wiley Periodicals, Inc.

scored lower than any other subgroup including Special Education, Economically

Disadvantaged, and At-Risk (TEA, 2010a).

These statistics are staggering especially in an era of school accountability reform which

has been based on measured student achievement at both national and state levels (Anderson,

2012; Maerten-Rivera, Myers, Lee, & Penfield, 2010). The No Child Left Behind Act (NCLB,

2002) reinforces that all schools are held accountable for their students’ attainment by meeting

adequate yearly progress (AYP), and achieving a 100% proficiency rate on state tests by 2014,

while facing sanctions if they fall behind in meeting the standards. Science standards began

to be required in 2005–2006, and by 2007, science achievement was required to be measured

at least three times from Grades 3 to 12 (U.S. Department of Education, 2009). Such pressure

has resulted in the increased focus on test-driven accountability and teaching to the test,

because improvement on the assessments is how schools and districts are currently being

evaluated (Texas AFT, 2008). Even though there is now a state waiver for NCLB, account-

ability is still at the forefront. For example, states must ensure the following: standards are

adopted for college and career readiness; new accountability systems are implemented with

more flexibility in assessing student achievement, and evaluation and support systems are

developed and based on measures to improve teacher effectiveness (Hu, 2012). Considering

teacher effectiveness in science education, Lee (2005) stated that ‘‘Teachers often lack the

knowledge and the institutional support needed to address the complex educational needs of

ELLs’’ (p. 492). Indeed, according to Byrnes, Kiger, and Manning (1997), most classroom

teachers have had minimal, if any, training in meeting the academic or linguistic needs of

their ELLs, and, in fact, McCloskey (2002) reported that only 12% of teachers nationwide

had any training on how to teach ELLs, much less when the content area of science is

added. Even with professional development on strategies to make content comprehensible for

ELLs, mainstream teachers did not accommodate the ELLs’ learning needs as they should

(Bentley, 2004).

Because the best program for ELLs at this point remains undetermined due to lack of

randomized trial studies and other sufficiently rigorous studies, many school district personnel

are blindly adopting approaches for ELLs, while these students ‘‘frequently confront the

demands of academic learning through a yet-unmastered language’’ (Lee, 2005, p. 492). Most

researchers have not actually observed bilingual/English as second language (ESL) and/or

English immersion science classrooms in a large-scale study to take into account instructional

factors in the learning of English (Bruenig, 1998; Irby, Tong, Lara-Alecio, Meyer, &

Rodriguez, 2007; Meyer, 2000). Furthermore, few researchers have utilized experimental or

quasi-experimental design to report how classroom instruction in science for ELLs can be

enhanced (e.g., August, Branum-Martin, Cardenas-Hagan, & Francis, 2009; Lee, Maerten-

Rivera, Penfield, LeRoy, & Secada, 2008; Lynch, Kuipers, Pyke, & Syesze, 2005). Therefore,

the purpose of our study was to evaluate, via a quasi-experimental design, the effectiveness of

a literacy-integrated science intervention on fifth grade ELLs’ science and reading literacy

achievement on accountability-based state assessments.

Standards-Based Accountability and ELLs’ Science Achievement

As Anderson (2012) noted, quantitative measurement of educational quality has existed

in the United States since the beginning of the early 20th century, but accountability standards

and repercussions for not meeting the standards, that is, high-stakes consequences, have pro-

gressively increased. The NCLB set unprecedented forceful provisions on using state-mandat-

ed assessments to hold schools accountable for their students’ academic performance (Wang,

Beckett, & Brown, 2006). These assessments are developed surrounding a common theme of

the current reform to adopt more rigorous and measurable standards and higher expectations

for student performance (Linn, 2000). To aid in decision-making about individual students’

program placement/exit, grade promotion, graduation, etc., many states use scores on stan-

dardized assessments to measure achievement. While in most states only math and reading

tests are high stakes and counted in AYP that carry significant educational consequences for

all students including ELLs (Anderson, 2012; Maerten-Rivera et al., 2010), in Texas, the state

where the current study was conducted, a science achievement test is also included in the

state accountability system that assigns ratings to every campus and district based on the

evaluation of indicators of performance, such as assessment results on the state standardized

instruments (TEA, 2010b). Texas accountability ratings for AYP are based on test perfor-

mance in three more subjects (Science, Writing, and Social Studies) other than just math and

reading, and in two more grade levels (Grades 9 and 11) than NCLB requires (TEA, 2011a).

NCLB regulations required students to be assessed in science in at least one elementary,

middle and high school grade level starting with the 2007–2008 school year, but the regula-

tions do not currently require science proficiency rates to be used in AYP determinations

(TEA, 2011b).

Current policies that call for more accountability in the education system present oppor-

tunities and challenges in science education for ELLs. On one hand, it is expected that the

appropriate use of standardized tests can promote educational quality and academic achieve-

ment in school accountability (American Educational Research Association, 2000; McNeil,

2000). However, it is also noted that the introduction of an accountability system has also led

to the narrowed gap between language minority students and their non-language minority

peers (Hanushek & Raymond, 2005). Furthermore, researchers have expressed concerns re-

garding the reliability, validity, and fairness of state-mandated achievement tests (Abedi,

2004; Kieffer, Lesaux, & Snow, 2008), particularly in science for ELLs (Penfield & Lee,

2010). In fact, ELLs’ language status was found to be associated with lower science achieve-

ment scores—more so than with gender, ethnicity, or economic status (Maerten-Rivera et al.,

2010). One of the assessment issues related to AYP reporting, as summarized by Abedi

(2004), pointed to the fact that assessment results are not directly comparable across the

ELLs and non-ELLs groups. The data also showed that ELLs’ performance may be under-

estimated, due to confounding of language and content. Additionally, another confounding

issue for data analysis related to performance is the way in which ELLs are categorized

without attention to their immigration status, preventing researchers, in general, from

determining whether the students are first, second, or third generation ELLs (Clewell, deCohen, &

Murray, 2007). Additionally, we note that researchers do not have information regarding the

classification of ELLs as former or current program participants in Texas or in national

databases.

Effective Science Instruction for ELLs

Researchers have noted components of pedagogy that contribute to ELLs’ science

achievement on standardized tests that are a part of school accountability systems. These

components include science inquiry approach (National Research Council, 1996); targeted,

explicit, and intensive instruction in the specialized language of science (Anstrom et al.,

2010; Freeman & Freeman, 2008; Kieffer, Lesaux, Rivera, & Francis, 2009; Luykx, Lee,

Mahotiere, Lester, Hart, & Deaktor, 2007; Tong, Irby, Lara-Alecio, Yoon, & Mathes, 2010);

and the integration of literacy instruction with science content (Merino & Scarcella, 2005).

These components are elaborated as follows.


Science Inquiry

Science inquiry is based on constructivist theory (Bruner, 1996) in which individuals are

believed to learn by making connections between new information and prior knowledge. In

science-inquiry instruction, students are expected to build their own knowledge as teachers

facilitate and encourage students to ask questions, hypothesize, experiment, and draw infer-

ences from science experiences and experiments in the classroom (Rosebery, Warren, &

Conant, 1992). This type of instruction allows students to use the full range of language skills

(listening, speaking, reading, and writing) and provides a strong base for establishing back-

ground knowledge and vocabulary for students, and consequently promotes academic

achievement for ELLs (August et al., 2009).

Inquiry-based interventions have been found to promote the development of ELLs’ con-

ceptual understanding of science (Amaral, Garrison, & Klentschy, 2002; August et al., 2009;

Lee, Deaktor, Hart, Cuevas, & Enders, 2005; Lee et al., 2008). This foundation in understand-

ing, in turn, results in students’ higher achievement in science. As Liu, Lee, and Linn (2011)

noted, ‘‘. . . teachers who [implement] . . . inquiry-based science units more often tended to

have larger student success in science achievement’’ (p. 1103).

Geier et al. (2008) examined the effect of a combination of standards-based and inquiry

science instruction on standardized science achievement in an urban school district. It was

reported that seventh and eighth graders receiving standards-based science inquiry interven-

tion outperformed their peers in science content and process understanding. As the research-

ers noted from their findings, standards-based, science inquiry curriculum can lead to positive

scores on science standardized tests for underserved urban students.

Amaral et al.’s (2002) landmark study with ELLs from Grades K to 6 in a high poverty,

primarily Spanish-speaking community in California employed inquiry-based science instruc-

tion within the context of a structured English immersion approach. In their program, units

were created using inquiry-based kits that allowed students a hands-on approach to learning.

In addition, teachers were provided with professional development training to support their

implementation of the instruction which integrated science and literacy (i.e., reading and

writing). Data from state-mandated assessment showed that the achievement of ELLs in-

creased in all content areas in relation to the number of years the students participated in the

science inquiry program. Though the study’s goal was not initially to explicitly improve read-

ing and math, the researchers attributed the student growth in these areas to the problem

solving and critical thinking that science inquiry methods promoted.

Similarly, Lee et al.’s (2005) pre- and post-design study with upper elementary ELL

students participating in a science inquiry intervention showed statistically significant gains in

science and literacy. As in Amaral et al.’s (2002) study, the Lee et al. intervention provided

training for teachers. Students were given the opportunity to conduct science investigations

and were subsequently given background information on the science concepts they had ex-

plored. This form of instruction closely mirrors the 5-E model developed by Bybee (1996) of

the Biological Science Curriculum Study and recognized as a framework to teach science

inquiry. In this model, students are first Engaged in a science concept, then they are asked to

Explore the concept by means of guided inquiry, and finally they are moved into the Explain

phase, explicitly discussing scientific concepts. Concepts are reinforced in the Elaborate

phase and concluded in the Evaluate phase. It should be noted, as pointed out by August

et al. (2009) that both the Amaral et al. and Lee et al. studies employed a pre- and post-test

pre-experimental design without a comparison group; as a result, the gains cannot be directly

attributed to the treatment and may be subject to threats to internal validity

such as maturation between pre- and post-tests.



Lee et al.’s (2008) subsequent work was a quasi-experimental 5-year study with

upper elementary ELLs in urban schools. Like the other studies, this study trained

teachers to instruct using a science inquiry approach and provided a description of how the science

units were developed to promote student initiative in learning with the teacher as a facilitator.

More specifically, the units included guides for teachers in terms of questions to ask students

to promote higher level thinking as the students conducted science investigations with science

background information to be presented at the end of the student investigation, which, again,

followed a 5-E model approach. One of the findings was that treatment students demonstrated

significantly greater gains in science achievement as measured by a researcher developed

science exam. Like Amaral et al.’s (2002) study, the impact of a successfully implemented

science inquiry intervention was not only reflected in science achievement as measured by

public-released items from the NAEP, but in other content areas as well, likely as a result of the

success of science inquiry in promoting higher level thinking. Additionally, Lynch et al.

(2005) conducted a quasi-experimental study to evaluate the effect of a highly rated science

curriculum which included science inquiry investigations among diverse groups of students.

Although former ELLs and native English speakers significantly outscored the comparison

group, the results were similar between ELLs and the comparison group, which, according to

the authors, was likely due to the high literacy demand in the curriculum, as well as the

assessment that failed to capture the gains made by these ELLs.

A more recent intervention study that reported significant intervention effects was con-

ducted by August et al. (2009) among sixth grade ELLs and native English speakers. The

purpose of their study was to assess the effectiveness of a 9-week science inquiry intervention

model to develop science knowledge and academic language. The researchers randomly

assigned sections, or classrooms, to receive the science intervention or to serve as comparison

groups. They also implemented professional development to support teachers with the inter-

vention, and explicitly noted using the 5-E approach discussed earlier. Moreover, the use of

visuals, graphic organizers, demonstrations, experiments, modeling to students, explicit

vocabulary instruction, and reading integration were included in the intervention. Pre- and

post-test assessments selected from state-standardized tests in science content and science

vocabulary were developed by the researchers, with fidelity checks to ensure teachers were

following the inquiry curriculum. Results noted post-test differences in favor of the treatment

group for both science knowledge and vocabulary, after adjusting for pre-test performance, with

reported standardized effect size estimates of 0.163 for Science and 0.263 for Vocabulary.
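To make the idea of a pre-test-adjusted standardized effect size concrete, the following is a minimal sketch of one common convention: the treatment-comparison difference in post-test means is adjusted with a pooled within-group regression slope on the pre-test (a common-slope ANCOVA adjustment) and then divided by the pooled post-test standard deviation. The reviewed study does not describe its exact computation here, so this convention is an assumption, and the function names and all scores are hypothetical values invented for illustration.

```python
from statistics import mean


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    return ((ss_a + ss_b) / (len(a) + len(b) - 2)) ** 0.5


def adjusted_effect_size(pre_t, post_t, pre_c, post_c):
    """Pre-test-adjusted mean difference divided by the pooled post-test SD."""
    # Pooled within-group slope of post-test on pre-test (common-slope ANCOVA).
    sxy = sxx = 0.0
    for pre, post in ((pre_t, post_t), (pre_c, post_c)):
        m_pre, m_post = mean(pre), mean(post)
        sxy += sum((x - m_pre) * (y - m_post) for x, y in zip(pre, post))
        sxx += sum((x - m_pre) ** 2 for x in pre)
    slope = sxy / sxx

    # Remove the part of the post-test gap explained by pre-test differences.
    raw_diff = mean(post_t) - mean(post_c)
    adj_diff = raw_diff - slope * (mean(pre_t) - mean(pre_c))
    return adj_diff / pooled_sd(post_t, post_c)


# Hypothetical scores, invented purely for illustration (not study data).
pre_treat, post_treat = [10, 12, 14, 16], [15, 17, 19, 21]
pre_comp, post_comp = [10, 12, 14, 16], [13, 15, 17, 19]
d = adjusted_effect_size(pre_treat, post_treat, pre_comp, post_comp)
print(round(d, 3))
```

Other standardizers (for example, the pre-test or comparison-group standard deviation) are also used in practice, which is one reason effect sizes are not always directly comparable across studies.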

Finally, Santau, Maerten-Rivera, and Huggins (2011) analyzed the effectiveness of a

5-year professional development intervention aimed at improving science and literacy

achievement for urban elementary school students, including ELLs. Inquiry-based science

was the primary goal of this intervention with a balance between teacher guidance and

student initiative during the hands-on, real-world science workshops. Students’ science

achievement was measured by a researcher-developed multiple choice test. The mixed model

analysis revealed that all students in the intervention made significant gains from pre- to post-

test, independent of their English language status (i.e., ELL and non-ELL) or number of years

the students participated in the intervention. The authors concluded that the strength of pro-

fessional development led to effective science instruction, student learning, and, consequently,

higher achievement.

Direct and Explicit Vocabulary Instruction

Direct and explicit vocabulary instruction has been linked with the language and literacy

acquisition success for both elementary and middle grades ELLs (Beck, McKeown, & Kucan,



2002; Graves, 2000; National Institute of Child Health and Human Development [NICHD],

2000). Tong et al. (2010) presented a successful case of direct and explicit language and

literacy intervention delivered to Spanish-speaking ELLs longitudinally from kindergarten

to second grade. Their intervention consisted of a systematic strand of teaching vocabulary

directly through pronunciation, spelling, repeated exposure in context, Spanish (i.e., students’

first language) clarification, and word meaning. The Tong et al. experimental study resulted in

faster growth in English vocabulary knowledge and subsequent reading comprehension

among treatment students.

Carlo et al. (2004) implemented a vocabulary intervention relying on explicit instruction

to fifth grade Spanish-speaking ELLs. The instruction included the teaching of word meaning

in the context of engaging texts, the pronunciation, polysemy, morphology, glossary use with

Spanish translation in context and English definition, and cognate use. The positive treatment

effect was reported by the authors at the end of the academic year in academic vocabulary

knowledge and reading comprehension.

Although little is known at the middle school level as to the means by which ELLs most effectively

develop content, along with English oracy and literacy proficiency, explicit vocabulary

instruction has been identified in studies with science intervention. For example, Lee et al.

(2005) provided teachers with guides on how to promote literacy with ideas on science writ-

ing topics and the integration of literature related to the science topic. To promote English

oral proficiency, the units introduced key vocabulary words at the beginning of lessons and

encouraged students to use and practice the words in a variety of contexts. Results yielded

growth in both language acquisition and science achievement.

Science and Literacy Integration

Simultaneously, within the framework of science inquiry, researchers and theorists have

acknowledged the importance of providing structured and explicit instruction that delineates a

difference between ‘‘scientific’’ language and ‘‘everyday’’ language for ELLs (Warren,

Ballenger, Ogonowski, Rosebery, & Hudicourt-Barnes, 2001, p. 530). Moreover, researchers

have agreed that without the explicit learning of science language, science will simply ‘‘re-

main a foreign language to most students’’ (Wellington & Osborne, 2001, p. 139), especially

for ELLs. All of the studies we reviewed that involved ELLs integrated science and literacy. For

instance, in Amaral et al.’s (2002) inquiry-based science instruction study, one of the most

marked components included the integration of science notebooks in which students were

asked to write frequently and in a structured manner about their science investigations in

order to promote English writing literacy and science conceptual formation.

Within the last decade, a handful of descriptive studies have been published to help

practitioners integrate language and content in the science classroom for monolingual children

(e.g., Gelman & Brenneman, 2004; Knipper & Duggan, 2006; Rupley, 2009; Waldman &

Crippen, 2009); unfortunately, there has been limited research produced addressing the needs

for ELLs (e.g., Huerta & Jackson, 2010; Hyland, 2007; Pray & Monhardt, 2009; Rupley &

Slough, 2010). Even fewer are the studies that target the integration of literacy and science

for ELLs at the middle school level (Amaral, Garrison, & Duron-Flores, 2006; Palumbo &

Sanacore, 2009; Watkins & Lindahl, 2010). Klentschy (2005) elaborated on how to imple-

ment science notebooks effectively in the classroom to promote English literacy skills and

scientific thinking. He outlined the different components of a science notebook that can be

systematically modeled and used by students during a science inquiry experiment. Palumbo

and Sanacore (2009) underlined the importance of explicitly and systematically teaching

vocabulary to diverse middle school learners, including teaching the pronunciation, roots, and



meanings of academic words in the content areas. In addition, they also suggested that read-

ing strategies to increase fluency and comprehension can be incorporated into the science

classroom such as repeated reading and readers theater. Watkins and Lindahl (2010) recom-

mended that a structured reading framework in science can and should include the use of

explicit vocabulary instruction for ELLs in the content areas. They provided a useful frame-

work of strategies to activate student background knowledge, increase student motivation and

reading comprehension, and integrate vocabulary, oral, and written language development

within content area instruction.

Guiding Premises of the MSSELL Project

According to Palumbo and Sanacore (2009), there is a growing body of research on

potentially effective practice to ensure ELL academic achievement that points to the value

and importance of embedding literacy instruction in the content areas as a way to provide

context, purpose, meaning, and motivation for learning at all grade levels. Unfortunately,

there is a lack of clear understanding of how to best assist ELLs in acquiring the academic

language of science while also learning the second language of English in the most efficient

and effective manner. Limitations are identified among such literature in the following

aspects. First, descriptive and non-experimental studies fail to suggest a cause and effect

relationship under certain conditions, with information on effect size missing, limiting the

comparability across studies. According to a synthesis of research conducted on science education

and student diversity by Lee and Luykx (2006), experimental designs in studies with

diverse students are rare, few studies are longitudinal, and many of the studies do not yield

concrete evidence related to student achievement. Similarly, after their extensive review of

literature, August et al. (2009) pinpointed that the majority of studies available on effective

science instruction among upper-elementary and middle grades ELLs are descriptive in na-

ture, with few studies that have used an experimental design to test the effectiveness of an

intervention.

Second, detailed information from instructional practices is missing, making replication

almost impossible (Tong, Lara-Alecio, Irby, Mathes, & Kwok, 2008). For example, Thomas

and Collier (2002) conducted a landmark longitudinal study responding to the need to deter-

mine which language support programs successfully promote the long-term academic

achievement of ELLs. Although their study lays the foundation for our study, it included in

the sample typical programs that varied greatly and, more specifically, lacked

descriptions of instructional practices from district to district and state to state. It is difficult

to control for such variability and lack of specificity within a large-scale study. To address

this issue, a longitudinal experimental/quasi-experimental design in which the language/litera-

cy is merged with science content standards (Lee & Luykx, 2006; Merino & Scarcella, 2005;

Minicucci, 1996) taught by qualified professionals trained with inquiry-based instruction is

needed to enhance Spanish-speaking English learners’ academic language development and

science achievement in English in state and other standardized assessments.

These guiding principles gleaned from the literature indicate that students will increase

significantly in their academic language in science, their English language/literacy skills, and

science achievement and will outperform students who are in the typical practice science

program given that they have (a) appropriately certified teachers who receive ongoing professional

development and frequent classroom observations with feedback, along with scripted

lessons with clarifications in Spanish, (b) bilingual paraprofessionals to assist ELLs with clar-

ifications and support low functioning students in science, and (c) enhanced curriculum that

includes inquiry-based learning, scripted lessons, technology integration, academic oral and



written science language development, family involvement, and university scientist and

college student mentors.

Therefore, the purpose of our study was to evaluate, via a quasi-experimental design, the

effectiveness of a literacy-integrated science intervention on fifth grade ELLs’ science and

reading literacy achievement on accountability-based state assessments. Specifically, we

sought to answer the following questions:

(1) Do students who are enrolled in the literacy-embedded science treatment condition

classrooms perform better on the district science benchmark tests, and state stan-

dardized science test than do students in a comparison condition?


(2) Do students who are enrolled in the literacy-embedded science treatment condition

classrooms perform better on the district reading benchmark tests, and state stan-

dardized reading test than do students in a comparison condition?


(3) Do students who are enrolled in the literacy-embedded science treatment condition

classrooms perform better on the standardized English decoding measure than do

students in a comparison condition?

Methods

Design, Context, and Participants

Our study was derived from a larger longitudinal (fifth to sixth grade), field-based research

project targeting both ELLs and low socioeconomic status (SES) non-ELLs in

an urban school district in Southeast Texas. Over 45% of the students served in the district

speak Spanish as their first language. The majority of students (85%) in the selected school district

site qualified for free or reduced-lunch (TEA, 2010a). This particular district was selected for

study because of its (a) positive reputation based on student achievement and national awards

such as the Broad Foundation Prize, (b) lengthy experience working with ELLs, (c) consistency

in program philosophy and implementation, and (d) ease of access to

English learning and regular programs within the same school throughout the district. All

participating ELLs were identified as limited English proficient with Spanish as the primary

language spoken at home.

There were 10 intermediate schools (lower middle school) in the selected district in

Southeast Texas. The state law (Texas Education Code, 1995) prohibits random selection

and assignment on the basis of individual students for program placement; therefore, in the

larger research project, four intermediate schools with principals’ approval from the district

site were randomly assigned to conditions, resulting in two treatment (enhanced science prac-

tice) and two comparison (typical science practice) schools. Table 1 demonstrates the demo-

graphics of these four schools. When a school was assigned, teachers from that campus were

then randomly selected for the assigned condition within that campus, and both ELLs and low

SES non-ELLs in the same school received the same practice to avoid contamination between

experimental and comparison classrooms. Due to the low return response in the two compari-

son schools, a balanced design with four science teachers in their respective condition

(i.e., two in treatment and two in comparison) could not be achieved. Therefore, in order to

increase the sample size at student level, we recruited four more teachers within the same

comparison schools that were already assigned based on the criteria that they had Spanish-

speaking ELLs in their classrooms. Because of the non-random addition of these teachers,

the design of the overall project was considered to be quasi-experimental. Among the

12 teachers, the average number of years of teaching was 8.6 (9 for the treatment group and



8.4 for the comparison group), with two teachers new to the teaching profession. Each teacher

taught 2–3 classes (commonly termed rotations in the district; see Table 2 for a breakdown

of numbers by condition). The final sample consisted of a total of 166 treatment

students with an average age of 11.83 years (SD = 0.75) and 80 comparison students with

an average age of 11.78 years (SD = 0.71). Student information was coded by the district

personnel so that no individual student’s name could be identified.

Intervention

Professional Development. The intervention was composed of two main components.

The first component was teacher professional development (professional portfolio assessment,

biweekly staff development sessions, and monthly staff meetings for paraprofessionals).

This component consisted of ongoing training workshops for both teachers (biweekly) and

paraprofessionals (monthly) provided by research coordinators for 3 hours per session.

During the trainings teachers (a) reviewed and practiced upcoming lessons and materials,

(b) discussed science concepts and cleared up their own misconceptions, (c) reflected on and

discussed student learning, (d) assessed pedagogical progress as a teacher in the intervention,

(e) conducted experiments and inquiry activities and targeted areas in which students might have

difficulty, and (f) were instructed on the following ESL strategies that were incorporated

into the researcher-developed lessons: questioning strategies, language scaffolding, visual

scaffolding, manipulatives and realia, advance organizers, cooperative grouping, content

connections, and technology integration. The scripted lesson plans were tightly aligned to state

science standards, national science standards, and English language proficiency standards

with leveled questions that included cognitive verbs (e.g., identify, describe, explain, analyze,

and draw conclusions). Further, the lesson plans also included minimal language-of-instruction

clarifications, specifically L2 (English) clarified by L1 (Spanish); otherwise, the language of

instruction for the intervention was L2.

Table 2

Breakdown of number of students, rotations, and teachers by condition

Condition    School   Teacher   Rotation   Student
Treatment    A        3         8          41
             B        3         7          39
Comparison   C        2         6          118
             D        2         3          48

Table 1

School demographics for treatment and comparison group 2009–2010

             African        Hispanic   White   Native         Asian   Low SES   ELL    Academic
             American (%)   (%)        (%)     American (%)   (%)     (%)       (%)    Rating
Treatment
  School A   19.5           78.3       1.5     0.1            0.6     92.9      30.4   Recognized
  School B   11.9           84.1       2.4     0.0            1.6     92.1      31.3   Exemplary
Comparison
  School C   26.8           68.4       2.4     0.0            2.5     88.0      31.5   Recognized
  School D   26.5           67.7       5.0     0.2            0.7     90.5      22.9   Recognized



It is worth mentioning that a positive effect of training workshops was noted from classroom

observations, which suggested that treatment teachers were utilizing each minute following

the lesson plans with inquiry-based learning and integration of vocabulary and writing

in a structured, systematic manner. Based on our bi-monthly meetings and our field notes,

treatment teachers reported that the professional development equipped them with strategies

to (a) build science academic vocabulary; (b) integrate speaking, reading, and writing into

science lessons; (c) implement questioning strategies that hold students accountable for

their own learning; and (d) improve time management. Teachers commented that they would

utilize such strategies in their classrooms after the project implementation.

Instructional Activities. The second component, an 85-minute daily science instruction,

usually began with Daily Oral and Written Language in Science (DOWLS, approximately

10 minutes), a warm-up activity in which students were presented with a science-related

prompt or scenario, given individual think time, recorded written responses, and then dis-

cussed responses with a student partner. The academic science intervention continued follow-

ing the 5-E instructional cycle including: (a) engage activity (5–10 minutes), which consisted

of either a visual or teacher demonstration of a science concept helping to focus students’

thinking and make connections between past and present learning; (b) explore activity

(10–20 minutes), in which students worked in cooperative groups to explore science concepts

through manipulating science materials or exploring the environment; (c) explain activity

(15–30 minutes), in which students gained deeper understanding of science concepts through

direct instruction of science vocabulary, partner reading of expository text (see further

description of CRISELLA in the next section), and interacting with science software

that explained science concepts through animation and simulation; (d) evaluate activity

(10–20 minutes), in which students demonstrated their understanding of the science concept

in their science journals (see further description of WAVES in the next section); and, when

time allowed, (e) elaborate activity, in which students further applied their understanding

by conducting additional activities. A daily lesson concluded with closure questions to review

the science concept as related to the objective.

A major component was Content Area Reading in Science for English Literacy and

Language Acquisition (CRISELLA), which focused on vocabulary development and exten-

sion through science-related expository text to improve students’ understanding of science

concepts. In CRISELLA, students were provided with direct instruction of vocabulary, includ-

ing pronunciation of words, student friendly definitions, and visual scaffolding before they

started reading. Students then partner-read and asked each other scripted comprehension ques-

tions and had the opportunity to re-read the selection to increase fluency and comprehension.

After that, the teacher reviewed the questions and answers with the class, clarifying any

misconceptions. Teachers also modeled academic science language, provided opportunities for students

to practice using science academic language, and encouraged students to answer in complete

sentences. Students entered vocabulary words and definitions into the glossary of their science

journals weekly.

Another component to improve science-related literacy was the integration of writing, that

is, Written and Academic oral language Vocabulary development in English in Science

(WAVES), in which individual science notebooks were used to help students process science

content through written academic science vocabulary. Science journals were structured so that

students had multiple opportunities to write daily to (a) record predictions and observations,

(b) illustrate and label diagrams, (c) organize information using notebook two-dimensional

figures, (d) record vocabulary in the glossary section of the journal, and (e) develop writing



skills while creating perspective-based writing, post-cards, newspaper articles, and reflections

on science field trips. Teachers were trained to provide both science and writing feedback.

The academic science curriculum was embedded with English as a second language strategies

and questioning strategies [teachers asked scripted leveled questions that included cognitive

verbs and were trained to alternate among answering techniques, such as (a) randomness,

(b) quick write, (c) pair-share, (d) choral response, (e) visual cues, and (f) timed thinking].

Students were held accountable by not being allowed to say “I don’t know”; instead, students

had the option of having more time to think, requesting that the question be asked in a

different way, requesting clues, or conferencing with another student.

In addition to the well-controlled components described above, there were also enrichment

components in the treatment classrooms to assist student learning, including (a) technology

integration: treatment classrooms were equipped with computers, a projector, a document camera, an

interactive whiteboard, science-based educational software, such as EduSmart, internet resources,

and a digital camera that further facilitated delivery of the academic science curriculum;

(b) Science Saturdays with scientists. Over 150 treatment students traveled to one of the

research universities that are part of this larger research project (November 14, 2009 and

February 27, 2010) and attended two interactive sessions developed by university scientists

based on physical, earth, life, and space sciences; and (c) Family involvement in science

(FIS). Take-home science materials were developed for students to work with their parents/

family so that students and their parents could become citizen scientists, which we define as those

citizens who are engaged in their world and understand scientific principles to better under-

stand their world and assist others, especially their children, in understanding and navigating

it. Each booklet contained a short letter to the family introducing the concepts and vocabulary

for the chapter, a crossword puzzle, a fun fact, various family science activities, an additional

reading relating to the topic, links to online science games, a short assessment, and a section

to sign and return upon completion. Each take-home booklet was provided in both English

and Spanish. One 45-minute parent meeting was held during the fall of the academic year to

discuss how to implement FIS at home.

The district’s science scope and sequence outlines specific state standards to be covered

each 6 weeks (see Table 3 for a list of science content taught in treatment and comparison

conditions). The lesson plans developed by the research team were based on the same science

standards that the district included in their lesson plans in typical practice. Therefore,

both treatment and typical science teachers taught the same science standards; however, the

standards were incorporated in a different order during each 6-week period in comparison

classrooms.

Table 3

Science content taught in both treatment and comparison conditions during each 6-week period

Six Weeks   Science Content
1           Force and motion; forms of energy; reflection and refraction
2           Matter and its properties, classifying matter, conductors and insulators, mixtures and solutions
3           Solar system, physical characteristics of the sun, earth, and moon; rotation and revolution of earth; phases of the moon
4           Weather and the water cycle; earth’s structure; earth’s changing surface; fossils
5           Basic needs of organism; how organisms survive; processes and cycles in ecosystems; competition, reproduction, and life cycles
6           Adaptations; inherited traits and learned behaviors



Typical Practice/Comparison Condition

In typical practice in the comparison group, science was taught by certified or permitted

bilingual/ESL education and science education teachers in English for the ELL students with

no Spanish clarifications. Compared to a daily 85-minute 5-E lesson in treatment classrooms,

science instruction in comparison classrooms varied from 80 to 90 minutes daily including

one 5-E lesson cycle weekly. Language development strategies may be included in the con-

tent subjects. Teachers followed a locally developed science curriculum aligned to the TEKS.

On-going classroom observations were conducted and suggested the following: (a) vocabulary

instruction included the use of word walls and students looking up definitions in a glossary;

(b) students read the textbook independently and answered questions at the end of the section;

(c) students completed worksheets and handouts, with limited use of science journals;

and (d) the integration of ESL strategies and questioning strategies varied and was inconsistent.

Although no training support was provided by the research team to the teachers in the

comparison group, they attended workshops as a state requirement to fulfill a minimum of

30 professional development hours each year related to their content area. Typical practice

also had computers, projectors, and document cameras (ELMOs) in the science classrooms.

The major technological difference was that treatment classrooms had EduSmart, science

software aligned with the intervention.

Measures

A battery of assessments was given to student participants to evaluate the effect after

1 year of implementation of this project. These measures include both standardized tests and

district-developed tests for the classroom level in science and English language and literacy.

These two levels were adopted based on the work of Ruiz-Primo, Shavelson, Hamilton, and

Klein (2002). They indicated:

. . . we reasoned that if science reform has an impact on students’ achievement, this

result should be located at different levels: most likely, if at all, at the local classroom

curriculum level, and then, we hoped, to transfer to more broadly or distal situations

than those covered in the classroom, for example, those posed by statewide and national

assessments. Evaluating students with achievement measures at different distances from

the science curriculum they studied would provide a better picture of the extent of

the effect that the science reform is having than using only close or distal measures.

(p. 371)

Specifically, based on the multilevel, multifaceted assessments framework proposed by

Ruiz-Primo et al., we employed a proximal level assessment which is used to ensure that the

teachers were teaching the assigned standards (and to hold schools accountable). That level

is observed in the district benchmark tests used in our study. Additionally, we employed what

Ruiz-Primo et al. called a distal level assessment based on state standards in a particular

domain (in our case—the Texas Assessment of Knowledge and Skills in Science and

Reading) and on a national level in assessing reading fluency using DIBELS.

District Benchmark Tests. The district-wide benchmark tests (in science and reading)

were used to compare students’ performance. These benchmark tests are criterion-referenced,

using cut-off scores to determine whether a student passes and/or meets commended performance.

These benchmark tests were developed according to the scope and sequence of the fifth grade

Texas Essential Knowledge and Skills (TEKS), the state standards/curriculum in science/

reading. Similar to the state standardized test (see the following section), a student who



passes the science test (e.g., 18 correct items out of 24 in science benchmark test 1) demonstrates

satisfactory performance, at or above the state passing standard, and a sufficient

understanding of the TEKS-aligned science curriculum; the level of commended performance

(e.g., 22 correct items out of 24 in science benchmark test 1) reveals high academic achievement,

considerably above the state passing standard, and a thorough understanding of the TEKS-aligned

science curriculum. The topics covered over the year were physics (test 1), chemistry

(test 2), earth and space (test 4), and life science (test 5), with process skills integrated into these

topics. Tests 3 and 6 were midterm and final tests, cumulative across these topics. (Sample test items

are available as Supporting Information accompanying the online article.)

Face validity of the benchmark tests was established through a recursive process

of reviews by committee members as to whether the tests assessed what they should, and content

validity was obtained because the tests were directly aligned with the state standards (TEKS).

More specifically, the science/reading program director decided the order in which to teach

the state standards. The curriculum guide sequenced the science/reading TEKS that were

taught in each 6-week period. In the recursive process, a committee comprising

curriculum specialists, teachers, and the program director used the district curriculum

guide to determine which TEKS were to be taught for each 6-week period, for a total of

six periods. This committee analyzed these TEKS and referenced related questions from previous

Texas Assessment of Knowledge and Skills (TAKS) tests, the state standardized test for

Grades 3–12, to write test questions. Once the test questions were generated they were sent to

the proofing committee, consisting of science/reading specialists. This committee reviewed

the test questions, checked the alignment to the TEKS, and corrected any errors. These

tests then went back to the test writing committee to make further changes and edits if need-

ed. Then the tests were forwarded to a district committee consisting of district curriculum

directors for the final grammar and format edits before they were printed and administered to

students.

Information about the reliability and validity of the benchmark tests at the district level

is limited; nevertheless, these instruments are what has been adopted by the district, and

the focus of our study is on improved achievement as measured by district- and state-level

tests, which are used to hold educators accountable. We therefore checked the internal consistency

reliability and predictive validity based on the sample in our study. Internal consistency

for science benchmark tests was calculated on each 6-week test respectively, ranging from

0.66 to 0.80, with an average of 0.72 over the five tests for the sample of this study.

Predictive validity with TAKS, the state standardized assessment, ranged from 0.45 to 0.60

with an average of 0.53. Internal consistency for reading benchmark tests ranged from 0.60 to

0.82, with an average of 0.71 over the four tests for the sample of this study; and predictive

validity with TAKS ranged from 0.44 to 0.62 with an average of 0.51. These estimates are

similar to those reported in two previous studies by Lee et al. (2005) with a range of internal

consistency from 0.71 to 0.86 and Lee et al. (2008) with an internal consistency of 0.60 in

pre-test and 0.71 in post-test; and therefore, should be considered within an acceptable range.
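
The internal consistency estimates above can be illustrated with a small computation. The article does not name the coefficient used, so the sketch below assumes Cronbach's alpha, and the item responses are hypothetical (1 = correct, 0 = incorrect), not data from the study.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a students-by-items score matrix.

    item_scores: list of rows, one row per student, one column per test item.
    Assumption: alpha is the coefficient of interest; the article only says
    "internal consistency".
    """
    k = len(item_scores[0])  # number of items
    # Variance of each item across students
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of students' total scores
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five students to a four-item test
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together, which is the sense in which the reported ranges are judged acceptable.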

State Standardized Test: Texas Assessment of Knowledge and Skills (TAKS). The TAKS, a

criterion-referenced assessment, measures student mastery of the content areas of state curric-

ulum outlined in the TEKS. The TAKS reading assessments, available in both English and

Spanish, are first administered during the spring of Grade 3. Typically, Spanish-speaking

ELLs who are not otherwise exempt can take the TAKS in Spanish for up to 3 years in

Grades 3–6. In Texas, the student’s assessment committee is responsible for recommending

whether a Spanish-speaking ELL should be assessed with reading



TAKS in English or reading TAKS in Spanish (TEA, 2006). In the current study, 19 ELLs

from comparison classrooms and 14 from treatment classrooms were recommended to take

reading TAKS in Spanish.

The TAKS science assessments are first administered during the spring of Grade 5. As

described by TEA, students who pass TAKS science in fifth grade demonstrate satisfactory

performance, at or above the state passing standard, and a sufficient mastery of the

TEKS-aligned science curriculum. The level of commended performance reveals high academic

achievement, considerably above the state passing standard, and a thorough mastery of the

TEKS science curriculum. Both English TAKS reading and science tests in fifth grade consist

of multiple-choice items (42 in reading and 40 in science) measuring four objectives. The

English form has a reported internal consistency ranging from 0.87 to 0.90, and predictive

validity ranging from 0.56 to 0.79 with SAT and ACT (TEA, 2008). In addition, construct

validity has been established using confirmatory factor analysis with good model fit for read-

ing in Grades 3 and 5 (Burk, Johnson, & Whitley, 2005; Davies, O’Malley, & Wu, 2007). For

fifth grade TAKS (both English and Spanish) in 2010, a raw score of 29 (i.e., total number of

correct items) corresponds to a scaled score of 2100, the cut-off point representing meeting

standard level (i.e., passing). A raw score of 39 corresponds to a scaled score of 2300, the

cut-off point representing commended performance level (TEA, 2010b). (Sample test items

are available as Supporting Information accompanying the online article.)
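
The cut-off logic just described can be sketched as a small classifier. The two scaled-score thresholds (2100 and 2300) come from the text; the function name and level labels are illustrative assumptions.

```python
# Scaled-score cut-offs reported in the text for fifth grade TAKS in 2010
MET_STANDARD_CUTOFF = 2100   # passing
COMMENDED_CUTOFF = 2300      # commended performance


def taks_level(scaled_score):
    """Map a TAKS scaled score to a performance level (labels are illustrative)."""
    if scaled_score >= COMMENDED_CUTOFF:
        return "commended"
    if scaled_score >= MET_STANDARD_CUTOFF:
        return "met standard"
    return "did not meet standard"


# Scores one point either side of the passing cut-off fall in different levels
for score in (2099, 2100, 2101, 2300):
    print(score, taks_level(score))
```

Because a criterion-referenced test reports levels rather than distances from the cut-off, scores of 2099 and 2101 land in different categories even though they differ by only two points.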

Dynamic Indicators of Basic Early Literacy Skills (DIBELS). DIBELS (Good & Kaminski,

2002) includes a set of procedures and measures for assessing the acquisition of early literacy

skills from kindergarten through sixth grade. These tests were initially developed as curriculum-based

assessments and have documented psychometric reliability and validity to measure

“critical skills that underlie early reading success.” They can be used for evaluative

purposes with individual students or student groups. In this study, Oral Reading Fluency (ORF)

was administered to students at the beginning and end of fifth grade. ORF is reported to have

a median alternate form reliability of 0.95 (Good, Kaminski, Smith, & Bratten, 2001), concurrent

validity of 0.92–0.96 with test of ORF, and predictive validity of 0.71 with Stanford

Achievement Test 10 (Roehrig, Petscher, Nettles, Hudson, & Torgesen, 2008). In this subtest,

examinees are asked to read grade-level fictional passages aloud, and the score is the number

of words correctly read in 1 minute. At each administration, three stories were tested,

and the middle score was used for analysis.
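
The scoring rule for one ORF administration (three passages, middle score retained) amounts to taking a median. A minimal sketch, with hypothetical words-correct-per-minute values and an illustrative function name:

```python
from statistics import median


def orf_score(words_correct_per_minute):
    """Return the middle of three one-minute passage scores.

    Each entry is the number of words read correctly in 1 minute on one passage.
    """
    if len(words_correct_per_minute) != 3:
        raise ValueError("expected scores from exactly three passages")
    return median(words_correct_per_minute)


# Hypothetical scores for one student on three passages
score = orf_score([98, 112, 105])
print(score)
```

Using the middle score rather than the mean keeps a single unusually easy or hard passage from shifting a student's result.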

Fidelity Measure. Fidelity checks were developed to monitor implementation among

the treatment science teachers. The Science Teacher Observation Record (STOR) was used to

monitor whether teachers were delivering the intervention as proposed. The inter-rater reliability of

STOR was 0.86. Each of the 5-Es was rated on a 1–4 scale in the following areas unique to

this intervention: (a) knowledge of lesson content; (b) material usage and teacher preparation;

(c) student involvement; (d) academic language scaffolding; (e) affective and cognitive

feedback; (f) writing feedback; and (g) pacing. In addition, there was a category for professional

development. The total possible score was 124. The mean score for treatment teachers was

107.93 (SD = 15.2) across four rounds of observation at the beginning, beginning/middle,

middle, and end of the academic year, with an average observational time of 84 minutes per

round per teacher. Observers were educators who had attended the professional workshops

and had been trained by the principal investigators. They had over 10 years of experience

observing classrooms over the course of one previous federally funded longitudinal project.

In addition to the fidelity instrument, that is, STOR, field notes from classroom observations

were also collected from both conditions and suggested that the instructional behavior was



more consistent and committed to science teaching in treatment classrooms than in compari-

son classrooms.

Data Collection and Analysis

Data were collected in the fall and spring of school year 2009–2010. A science benchmark

test was administered to each student every 6 weeks during the school year, for a total of

six tests. Each 6-week benchmark test covered a different topic area; for example, the science

benchmark tests covered Physics (test 1), Chemistry (test 2), Mid-term (cumulative of physics,

chemistry, and space, test 3), Earth/Space (test 4), Life Science (test 5), and final test

(cumulative of physics, chemistry, space, and life science, test 6). TAKS data were collected

during the spring of 2010. Scores from benchmark tests and TAKS were obtained

through the district database. Data from DIBELS were collected at the beginning and end of

the school year.

To examine the initial equivalence between the two groups, students’ vocabulary knowl-

edge, reading comprehension skills, and non-verbal ability were measured for the larger

research project and were collected at the beginning of fifth grade (Fall, 2009). These measures

included the subtests of Picture Vocabulary and Passage Comprehension in the Woodcock

Language Proficiency Battery-Revised (WLPB-R; Woodcock, 1991), a standardized instrument

assessing a broad range of language proficiency in speaking, listening, reading, and writing in

English. The Listening Comprehension subtest requires test takers to listen to a passage read to

them and to supply the single word missing at the end of the passage. In the

Passage Comprehension subtest, test takers are required to point to the picture represented by

a phrase. It also measures skills of reading a short passage and identifying a missing key

word. W-scores were used for analysis. Finally, students’ non-verbal ability was measured by

the Naglieri Nonverbal Ability Test (NNAT, Naglieri, 1997). It is designed for ages between

5 and 17 to provide a concise but reliable and valid non-verbal evaluation of general ability.

Students are required to analyze the associations among the parts of the divided matrix, the

design, and to determine which answer choice is correct based on the information in each test

item. The NNAT has been used to identify gifted children, especially those who

are culturally and linguistically diverse. This test is group administered in approximately

30 minutes and was given as a district-level measure.

As a first step, students’ listening and reading comprehension in English, as well as their

non-verbal cognitive ability, were compared between treatment and comparison groups prior to

the implementation of the larger research project using independent samples t-tests. No statistically

significant difference was detected on listening comprehension (F = 0.11, p = 0.736),

reading comprehension (F = 0.25, p = 0.619), or non-verbal ability (F = 0.03,

p = 0.863). Therefore, initial equivalence was established.

Note that it may not be beneficial to plot progress based on the scaled score for district

benchmark tests, because each 6-week test covers completely different science content,

which suggests that a higher score does not necessarily indicate progression; instead, it may

mean that the science content is simply different, more interesting to the students, or more

difficult by topic area in a single 6-week period. Further, on the TAKS tests, a scaled score of 2099

is still considered failing although it is only 1 point below 2100 (the cut-off score), while 2101 is

considered passing although it is only 1 point above. Therefore, to most accurately present the findings,

we conducted chi-squared tests of independence to compare the rates of passing and

commended performance between treatment and comparison groups of ELLs for benchmark

tests and TAKS, with Cramer’s V reported as the effect size, or magnitude of the relationship.

Because a total of 33 students took the Spanish TAKS reading test, we decided to exclude



these data when comparing the English literacy achievement between the two conditions. To

analyze DIBELS scores, an analysis of covariance (ANCOVA) with pre-test as the covariate was

conducted to compare students’ growth in literacy skills during the first year of implementation.

Partial eta squared was reported as the effect size.
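
The chi-squared test of independence and Cramer's V described above can be sketched for a 2 × 2 table (condition by pass/fail). The counts below are hypothetical, not the study's data, and the sketch assumes no continuity correction; the article does not say whether one was applied.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts.

    table: [[a, b], [c, d]], e.g. rows = condition, columns = pass / not pass.
    Returns (chi2, p_value, cramers_v). No continuity correction (an assumption).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Cramer's V = sqrt(chi2 / (n * min(rows - 1, cols - 1))); the minimum is 1 here
    cramers_v = math.sqrt(chi2 / n)
    return chi2, p_value, cramers_v


# Hypothetical counts: [passed, did not pass] for treatment and comparison
counts = [[143, 23], [56, 24]]
chi2, p_value, cramers_v = chi_squared_2x2(counts)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, V = {cramers_v:.3f}")
```

For a 2 × 2 table, Cramer's V equals the absolute value of the phi coefficient, so it can be read like a correlation between condition and passing.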

Results

In this quasi-experimental study, we investigated the effectiveness of a science

intervention among fifth grade ELLs. We compared students’ performance between treatment

and comparison groups in the district benchmark tests in science and reading, state standard-

ized tests in science and reading, and literacy development. Results are presented by construct

measured, that is, science and reading achievement.

Science Achievement

Benchmark Tests in Science. Table 4 lists the descriptive statistics of percentage of pass-

ing and commended performance by condition, together with effect size in the form of

Cramer’s V. Results indicate that there is a statistically significant difference in the percentage

of passing in tests 2, 4, and 6 (ps < 0.05), and in the percentage of commended performance

in tests 2, 4 and 6 (ps < 0.05). These differences are all in favor of treatment students who

demonstrated higher rates with the magnitude ranging between 0.127 and 0.195. For example,

there is an average of 87% passing and 43% commended performance rates in the treatment

group over the five benchmark tests in science, as compared to an average of 78% passing

and 32% commended performance in comparison group (Test 5 was optional due to the time

conflict of TAKS administration).
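
The averages quoted above can be checked directly from the per-test percentages in Table 4 (tests 1, 2, 3, 4, and 6). The short script below only does this arithmetic; the variable names are illustrative.

```python
from statistics import mean

# Percentages transcribed from Table 4, in test order 1, 2, 3, 4, 6
passing = {
    "treatment": [85.5, 89.7, 84.6, 89.6, 86.3],
    "comparison": [88.5, 76.3, 80.9, 74.7, 69.9],
}
commended = {
    "treatment": [41.6, 49.7, 34.0, 47.2, 43.5],
    "comparison": [34.3, 33.8, 27.5, 32.9, 30.3],
}

# Average each condition over the five tests and round to whole percentages
avg_passing = {group: round(mean(rates)) for group, rates in passing.items()}
avg_commended = {group: round(mean(rates)) for group, rates in commended.items()}

print(avg_passing)
print(avg_commended)
```

These are unweighted means of percentages, which treats each test equally regardless of how many students sat it.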

TAKS in Science. Chi-squared test did not reveal statistically significant differences in the

rate of passing or commended performance (ps > 0.05). The treatment group had an average

passing rate of 78.2% and a commended performance rate of 25.1%. The comparison

group had an average passing rate of 84.6% and a commended performance rate of 19.8%.

Reading Achievement

Benchmark Tests in Reading. Table 5 lists the descriptive statistics of percentage of

passing and commended performance by condition, together with effect size in the form of

Cramer’s V. Results from chi-squared test of independence suggest a similar pattern as was

Table 4

Difference between treatment (n = 166) and comparison group (n = 80) in District Benchmark Science Tests

        Passing (%)                                Commended Performance (%)
Test    Treatment   Comparison   Effect Size^a     Treatment   Comparison   Effect Size^a
1       85.5        88.5         −0.040            41.6        34.3         0.066
2       89.7        76.3         0.178**           49.7        33.8         0.150*
3       84.6        80.9         0.026             34.0        27.5         0.065
4       89.6        74.7         0.194**           47.2        32.9         0.136*
6       86.3        69.9         0.195**           43.5        30.3         0.127*

^a Positive effect size indicates higher performance in treatment condition. Test 5 was optional as it was given the same time period as TAKS test.
*p < 0.05. **p < 0.01.



observed in the science benchmark tests: treatment students significantly outperformed comparison

students in the passing rate in tests 2, 4, and 6 (ps < 0.05), with the magnitude of

such difference ranging between 0.163 and 0.238. For example, there is an average of 56.7%

passing and 8% commended performance in the treatment group over the four benchmark

tests in reading, as compared to an average of 38% passing and 3.7% commended performance

in the comparison group (test 5 was optional due to the time conflict of TAKS administration,

and not all groups were given test 3). The only statistically significant difference in the

rate of commended performance was found in the second test in favor of the treatment group

(p < 0.05). Further, it is worth noting that on average, both treatment and comparison groups

demonstrated a low passing rate in the reading tests, indicating that a higher percentage

of students did not meet the expected standard level of performance in the subject area of

reading.

TAKS in Reading. A chi-squared test revealed a statistically significant difference in the rate

of passing (χ² = 3.086, p = 0.046, Cramer’s V = 0.11), in favor of the treatment group with

an average passing rate of 68.9%, as compared to the comparison group with an average

passing rate of 60.4%. No statistically significant difference was found in the rate of commended

performance (p > 0.05), with an average of 7.8% in the treatment group and 7.4% in the comparison

group. Similarly, the average rate of passing in the TAKS reading test was lower than that in

the TAKS science test for both treatment and comparison groups.

DIBELS. Results from the ANCOVA (with pre-test score as the covariate and post-test

score as the dependent variable) revealed that, although all students (n = 246) made statistically

substantial gains from the beginning to the end of the school year, a statistically significant

difference was observed, F = 37.26, p < 0.001, partial eta squared = 0.134, with treatment

students outperforming their comparison peers on the post-test by 12 points at the end of fifth

grade, after adjustment for pre-test performance levels at the beginning of fifth grade.
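
To make the ANCOVA quantities concrete, the sketch below computes a covariate-adjusted group difference, the F statistic for the group effect, and partial eta squared for two groups with one covariate. It assumes a common within-group slope (the standard ANCOVA model). The scores are hypothetical and were chosen so that the adjusted difference is 12 points; they are not the study's data.

```python
from statistics import mean


def _sums(xs, ys):
    """Sums of squares and cross-products about the means of xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy


def ancova_two_groups(pre_t, post_t, pre_c, post_c):
    """One-covariate ANCOVA for two groups (treatment vs. comparison).

    Returns (adjusted_difference, f_statistic, partial_eta_squared).
    """
    # Full model: separate intercepts, common slope (pooled within-group sums)
    sxx_w = sxy_w = syy_w = 0.0
    for pre, post in ((pre_t, post_t), (pre_c, post_c)):
        sxx, sxy, syy = _sums(pre, post)
        sxx_w, sxy_w, syy_w = sxx_w + sxx, sxy_w + sxy, syy_w + syy
    slope = sxy_w / sxx_w
    sse_full = syy_w - sxy_w ** 2 / sxx_w

    # Reduced model: a single regression line that ignores group membership
    sxx_all, sxy_all, syy_all = _sums(pre_t + pre_c, post_t + post_c)
    sse_reduced = syy_all - sxy_all ** 2 / sxx_all

    ss_group = sse_reduced - sse_full
    df_error = len(pre_t) + len(pre_c) - 3  # n minus 2 intercepts minus 1 slope
    f_statistic = ss_group / (sse_full / df_error)  # group effect has 1 df
    partial_eta_squared = ss_group / (ss_group + sse_full)
    adjusted_difference = (mean(post_t) - mean(post_c)) - slope * (mean(pre_t) - mean(pre_c))
    return adjusted_difference, f_statistic, partial_eta_squared


# Hypothetical pre-test and post-test scores (words correct per minute)
pre_t, post_t = [60, 70, 80, 90, 100], [103, 111, 122, 133, 141]
pre_c, post_c = [62, 72, 82, 92, 102], [91, 103, 112, 121, 133]
adj_diff, f_stat, eta_sq = ancova_two_groups(pre_t, post_t, pre_c, post_c)
print(f"adjusted difference = {adj_diff:.1f}, F = {f_stat:.1f}, partial eta squared = {eta_sq:.3f}")
```

In this toy data the raw post-test gap is 10 points, but the treatment group started 2 points lower, so the adjustment raises the difference to 12.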

Limitation

This study was, of course, subject to several limitations. Given the fact that the purpose

of this current study was to evaluate the effect of first year of implementation, we did not

present the results using multi-level structure. One reason was that the fidelity measure

Table 5

Difference between treatment (n = 166) and comparison group (n = 80) in District Benchmark Reading Tests

        Passing (%)                                Commended Performance (%)
Test    Treatment   Comparison   Effect Size^a     Treatment   Comparison   Effect Size^a
1       44.1        33.3         0.103             1.9         3.8          −0.059
2       58.7        32.8         0.238**           7.7         0            0.157*
4       49.3        31.9         0.163*            5.9         1.4          0.138
6       74.8        54           0.212**           16.5        9.5          0.096

^a Positive effect size indicates higher performance in treatment condition. Not all groups were given test 3. Test 5 was optional as it was given the same time period as TAKS test.
*p < 0.05. **p < 0.01.



indicated a fairly consistent practice among treatment classrooms. In a related note, we had a

quite small sample size at the teacher level (n = 12) or school level (n = 4), which results
in a potential limitation of statistical power at the teacher or school level. In order to
increase the power, we would have had to conduct randomization at the section or teacher
level, which, however, would have resulted in contamination of the intervention at the
teacher level, since teachers would be teaching both treatment and comparison classrooms,
or there could have been both comparison and treatment teachers on the same campus. We
analyzed the data with a multi-level modeling approach and obtained similar results. Future
studies with a larger sample size at the level of randomization, preferably school, are
highly desired to yield results that can justify a cause–effect relationship. Further, as we plan for the next step of the larger

research project, we will take into consideration the rotation/classroom nature of science

instruction in these schools as students complete sixth grade, and we will include the hierar-

chical modeling with cross-classification in the final analysis. Another limitation is that we

compared conditions that differed in several enrichment components (e.g., family science

support, Saturday science activities, and extra technology resources in the classroom) that

may seem to confound the ability to compare the nature-of-the-instruction treatment.

Nevertheless, because of the low response rate (<10%) from parents who were required to rate
their child’s performance, and a 60% attendance rate for the university activities, we were
unable to measure whether these enrichment components directly influenced benchmark scores.

Discussion and Conclusions

The purpose of our study was to evaluate the effectiveness of a literacy-integrated science

intervention with professional development on fifth grade ELLs’ science and reading literacy

achievement on accountability-based state assessments. Findings reveal that, on average,

the intervention produced higher academic achievement for treatment students in both science

and reading outcomes as reflected in the district-wide standards-based measures, which were

aligned with the materials taught in both treatment and comparison classrooms. Standardized

effect sizes were in small-to-moderate range, with larger magnitude in science than in read-

ing. More specifically, in the science benchmark tests, the treatment group demonstrated

a better understanding of the different science topics in 3 out of 5 benchmark tests.

Furthermore, a considerably larger percentage of treatment students performed above the state

passing standards with a thorough understanding of the TEKS in 4 out of 5 benchmark tests.

These findings, consistent with previous studies (e.g., Amaral et al., 2002; August et al.,

2009; Lee et al., 2008), suggest that inquiry approaches to teaching and learning science,

rather than teaching strategies in high needs schools in an attempt to raise test scores, can

promote ELLs’ performance on standardized and high-stakes achievement tests.

In the reading benchmark tests, treatment students demonstrated a higher percentage of

passing in 3 out of the 4 tests, suggesting satisfactory performance equal to or above the
state passing standard and a sufficient understanding of the state reading curriculum. In
addition, although on average treatment and comparison students acquired a similar level of
sufficient mastery of the state curriculum in science achievement, more students from the
treatment classrooms met the state standard in reading, with a moderate effect size of 0.11.

Finally, results also show that the intervention produced positive gains for treatment students

in ORF as reflected in standardized measure, with a moderate standardized effect size

(0.134). These findings corroborated current research findings, such as those by Amaral et al.

(2002) and Lee et al. (2005), regarding the positive effect of integrating literacy with science

instruction on literacy outcomes for ELLs. Our findings, plus those of other researchers

mentioned, reveal that literacy integration in the content areas is a promising solution to the



challenge of holding all students, including ELLs, accountable not just on reading
assessments but on content-area (i.e., science) assessments as well (Maerten-Rivera et al., 2010).
In addition, the results of our study underscore the importance of implementing direct and
explicit vocabulary instruction that yields positive results within content-area instruction
for ELLs.

We observed, nonetheless, that the differences between treatment and comparison groups

in science achievement were more pronounced for the benchmark tests than for the TAKS.

These findings seem to align with Ruiz-Primo et al.’s (2002) conclusion that proximity of

assessment matters. They suggested that close and proximal assessments (the benchmark tests
in our study) were more sensitive in detecting students’ improvement, while such sensitivity
was less evident in distal assessments (the TAKS in our study).

One might question the heavy focus on English science vocabulary building in this

science intervention. Existing literature indicates that compared to their monolingual English

peers, ELLs lack not only breadth of vocabulary but also depth of vocabulary
knowledge (August, Carlo, Dressler, & Snow, 2005; Carlo et al., 2004). As a result,

these ELLs are less able to comprehend English text at grade level, and to perform on aca-

demic assessment in English, and they may be at risk of being diagnosed as learning disabled,

when in fact their limitation is due to limited English vocabulary and poor comprehension

(August et al., 2005). In his summary of the work of the National Reading Panel (NRP) on

reading comprehension instruction, Kamil (2004) asserted that “vocabulary seems to occupy an
important middle ground in learning to read” (p. 215). Likewise, Hickman, Pollard-Durodola,
and Vaughn (2004) reported that, for ELLs, of primary importance in academic language
development are the “related elements of vocabulary and comprehension” (p. 720). In addition,

level of English language proficiency and literacy (particularly size of academic vocabulary)

is positively related to academic achievement in English among ELLs (Fernandez & Nielsen,

1986; Lindholm-Leary, 2001; Rumberger & Larson, 1998). Therefore, the focus on strength-

ening science vocabulary among ELLs is deemed necessary to enlarge and expand their learn-

ing of the academic language in order to solve science problems competently and to move

forward as citizen scientists. Our hypothesis was also supported by the findings of the fifth

grade implementation that more statistically significant differences were identified in the sci-

ence benchmark tests in favor of treatment group. It is also worth noting that the average

rates of passing and commended performance were higher in science than in reading for both

benchmark tests and TAKS in both groups. Such a finding corroborates that of Lee et al.

(2005) who used responses to expository writing samples to assess literacy achievement.

They noted that, although there were statistically significant increases and effect
magnitudes on all measures of science and literacy for the treatment group at both grade
levels, the effect magnitudes were stronger for science achievement than for literacy
achievement. Lee et al. recommended, and we agree, that more emphasis be placed on literacy
development in tandem with science learning in future studies.

One may argue that time-on-task, as well as the multiple components in the intervention
group, could explain the differences in performance. However, it should be recognized that
students in both the intervention (85 minutes) and comparison conditions (with varied length
of time between 80 and 90 minutes) had a similar number of minutes of science instruction
during the day, and the district has had a long-standing positive reputation

for typical practice in their science education programs. Our point, from an instructional

perspective, is how to best allocate and utilize those minutes so as to provide quality science

instruction at school. Further, it was the multiple components in our intervention that differed



from the typical practice and served as the contrast, not the time; nevertheless, these compo-

nents were all embedded within the daily inquiry-based 5-E lessons, as compared to weekly

5-E lessons observed in comparison classrooms. The end result derived from our study is that

for students with limited English proficiency who are placed at a disadvantage of learning

academic content in a language that is less familiar, effective practices in science instruction

should include an explicitly structured (specific to what the teacher should teach),
consistent (the format of lessons is consistent, and components become understood by and
commonplace to students), inquiry-based (5-E model hands-on activities), standards-aligned
curriculum that is integrated with language literacy and that makes efficient use of
instructional time each school day. Educators are encouraged to introduce and apply such
effective practices in science instruction in order to maximize ELLs’ achievement on
standardized assessments while promoting students’ language and content literacy critical
to their ongoing academic success beyond middle school and into high school and college.

Finally, as addressed in the limitations, we are aware that the effect of the enrichment

components of the intervention is difficult to measure. However, enrichment activities with
visits to science labs at universities or private labs are encouraged to pique interest;
enrichment activities with parental-involvement take-home packets are encouraged to engage
students and parents in conversations around science; and additional technology integration
in lessons is encouraged to support science concepts and maintain interest and discussion
around those concepts.

Little is known about effective instruction at the middle school level and beyond for

language minority students who are learning English and the content science knowledge in

English at the same time (August & Hakuta, 1997; Lee & Luykx, 2006). The larger research

project from which the current study was derived, to our best knowledge, is the only longitu-

dinal quasi-experimental design with science intervention of a full academic year among

fifth grade ELLs and low-SES English proficient students; and therefore, the results have the

potential to impact middle school level science for ELLs where there is the combination of

the teaching of science with integrated English language development skills. We found that

not only did a literacy-integrated science curriculum and instruction matter, but the other
component of the intervention, that is, professional development for teachers, also
mattered, as Lee (2005) found, in terms of melding teachers’ understanding of science content and

teachers’ application of techniques for teaching English language acquisition and literacy

skills.

NCLB urged educators and researchers to seek out answers to what predicts and fosters

students’ school progress (Abedi, 2004; Maerten-Rivera et al., 2010). Part of the answer

to promoting ELLs’ success on standardized assessments could be, as Abedi (2004) put it,

“improving teacher capacity,” that is, to train teachers to become “well qualified in both
language development and content, each of which plays a crucial role in [ELL] student
achievement” (p. 12). The implications of this study, particularly related to professional
development for teachers who have ELLs and economically disadvantaged students in their
classrooms, are to emphasize (a) the development of science academic vocabulary through

listening, speaking, reading, and writing activities, (b) the implementation of leveled question-

ing to help students develop oral language and correct misconceptions, and (c) the integration

of second language learning theory into the science classrooms. Because achieving

similar outcomes as reported in our study requires a substantial amount of time and resource

commitment, we recommend that districts consider at least 4 hours of release time per month

for teachers to develop skills outlined in our study that promote student learning in science

and literacy.



Our findings may impact the field of science education and bilingual/ESL education in

that if this model works under controlled conditions, it can be sustainable and transferrable to

other settings. The advantage of our intervention is that it can be implemented in schools by

enhancing and modifying traditional practice, provided that the teachers are given quality

professional development on content–area strategies and time to develop their skills. If school

district personnel introduce non-intrusive, consistent, yet structured literacy-integrated science

interventions for ELLs with supported professional development, then teachers may be open

to such practices to improve ELLs’ academic English proficiency and science achievement.

Consequently, ELLs, who have not had the opportunity to develop specialized academic

language skills and who, thus, have limited access to learning science skills and concepts,

then may be provided opportunities to experience success in science achievement.

The authors thank the participating students, parents, teachers, and school administrators

for their cooperation.

References

Abedi, J. (2004). Accommodations for students with limited English proficiency in the National

Assessment of Educational Progress. Applied Measurement in Education, 17(4), 371–392.

American Educational Research Association. (2000). AERA position statements: High-stakes
testing in pre-k education. Retrieved from http://aera.net/policyandprograms/?id=378

Anderson, K. J. B. (2012). Science education and test-based accountability: Reviewing their

relationship and exploring implications for future policy. Science Education, 96(1), 104–129.

Anstrom, K., DiCerbo, P., Butler, F., Katz, A., Millet, J., & Rivera, C. (2010). A review of literature

on academic English: Implications for K-12 English language learners. Arlington, VA: The George

Washington University Center for Equity and Excellence in Education.

Amaral, O., Garrison, L., & Duron-Flores, M. (2006). Taking inventory. Science and Children,

43(4), 30–33.

Amaral, O. M., Garrison, L., & Klentschy, M. (2002). Helping English learners increase achieve-

ment through inquiry-based science instruction. Bilingual Research Journal, 26(2), 213–239.

August, D., Branum-Martin, L., Cardenas-Hagan, E., & Francis, D. (2009). The impact of an

instructional intervention on the science and language learning of middle grade English language learn-

ers. Journal of Research on Educational Effectiveness, 2, 345–376. DOI: 10.1080/19345740903217623

August, D., Carlo, M., Dressler, C., & Snow, C. (2005). Accelerating English academic vocabulary:

An intervention design for Spanish literate children acquiring English as a second language. Learning

Disabilities Research and Practice, 20(1), 50–57. DOI: 10.1111/j.1540-5826.2005.00120.x

August, D., & Hakuta, K. (1997). Improving schooling for language-minority children: A research

agenda. Washington, DC: National Research Council.

Beck, I. L., McKeown, M. G., & Kucan, L. (2002). Bringing words to life: Robust vocabulary

instruction. New York: Guilford Press.

Bentley, M. L. (2004). ELLs: Children left behind in science class. Paper presented at the Annual

Meeting of the School Science and Mathematics Association. Atlanta, Georgia.

Bruenig, N. A. (1998). Measuring the instructional use of Spanish and English in elementary

transitional bilingual classrooms. Dissertation, Abstracts International, 59(04), 1046A.

Bruner, J. (1996). The culture of education. Cambridge, MA: Harvard University Press.

Burk, J., Johnson, D. M., & Whitley, J. (2005). Validity of the Texas Assessment of Knowledge

and Skills (TAKS). The Journal of Border Education Research, 4(2), 29–39.

Bybee, R. W. (1996). The contemporary reform of science education. In J. Rhoton & P. Bowers

(Eds.), Issues in science education (pp. 1–14). Arlington, VA: National Science Education Leadership

Association.

Byrnes, D. A., Kiger, G., & Manning, M. L. (1997). Teachers’ attitudes about language diversity.

Teaching and Teacher Education, 13(6), 637–644.



Carlo, M., August, D., McLaughlin, B., Snow, C., Dressler, C., Lippman, D., . . . White, C. E.

(2004). Closing the gap: Addressing the vocabulary needs of English language learners in bilingual and

mainstream classrooms. Reading Research Quarterly, 39(2), 188–215.

Clewell, B. C., de Cohen, C. C., & Murray, J. (2007). Promise or peril? NCLB and the education

of ELL students. Retrieved from http://www.urban.org/UploadedPDF/411469_ell_students.pdf

Davies, S., O’Malley, K., & Wu, B. (2007, April). Establishing measurement equivalence of trans-

adapted reading and mathematics tests. Paper presented at the annual meeting of the American

Educational Research Association, Chicago.

Fernandez, R., & Nielsen, F. (1986). Bilingualism and Hispanic scholastic achievement: Some

baseline result. Social Science Research, 15, 43–70.

Freeman, Y. S., & Freeman, D. E. (2008). Academic language for English language learners and

struggling readers. Portsmouth, NH: Heinemann.

Geier, R., Blumenfeld, P. C., Marx, R. W., Krajcik, J. S., Fishman, B., Soloway, E., & Clay-

Chambers, J. (2008). Standardized test outcomes for students engaged in inquiry-based science curricula

in the context of urban reform. Journal of Research in Science Teaching, 45(8), 922–939. DOI: 10.1002/

tea.20248

Gelman, R., & Brenneman, K. (2004). Science learning pathways for young children. Early

Childhood Research Quarterly, 19(1), 150–158. DOI: 10.1016/j.ecresq.2004.01.009

Good, R. H., & Kaminski, R. A. (Eds.). (2002). Dynamic indicators of basic early literacy skills

(6th ed.). Eugene, OR: Institute for the Development of Educational Achievement.

Good, R. H., Kaminski, R. A., Smith, S., & Bratten, J. (2001). Technical adequacy of second grade

DIBELS Oral Reading Fluency passages (Tech. Rep. No. 8). Eugene: University of Oregon.

Graves, M. F. (2000). A vocabulary program to complement and bolster a middle grade
comprehension program. In B. M. Taylor, M. F. Graves, & P. van den Broek (Eds.), Reading for meaning:

Fostering comprehension in the middle grades (pp. 116–135). Newark, DE: International Reading

Association.

Hanushek, E. A., & Raymond, M. E. (2005). Does school accountability lead to improved student

performance? Journal of Policy Analysis and Management, 24(2), 297–327. DOI: 10.1002/pam.20091

Hickman, P., Pollard-Durodola, S., & Vaughn, S. (2004). Storybook reading: Improving vocabulary

and comprehension for English language learners. The Reading Teacher, 57, 720–730.

Hu, W. (2012). 10 States are given waivers from education law. The New York Times. Retrieved

from http://www.nytimes.com/2012/02/10/education/10-states-given-waivers-from-no-child-left-behind-law.html?_r=1

Huerta, M., & Jackson, J. (2010). Connecting literacy and science to increase achievement for

English language learners. Early Childhood Education Journal, 38, 205–211. DOI: 10.1007/s10643-010-0402-4

Hyland, K. (2007). Genre pedagogy: Language, literacy and L2 writing instruction. Journal of

Second Language Writing, 16, 148–164.

Irby, B. J., Tong, F., Lara-Alecio, R., Meyer, D., & Rodriguez, L. (2007). The critical nature of

language of instruction compared to observed practices and high stakes tests in transitional bilingual

classroom. Research in the Schools, 14(2), 27–36.

Kamil, M. L. (2004). Vocabulary and comprehension instruction. In P. McCardle & V. Chhabra

(Eds.), The voice of evidence in reading research (pp. 213–234). Baltimore, MD: Paul H. Brookes

Publishing Co.

Kieffer, M. J., Lesaux, N., Rivera, M., & Francis, D. J. (2009). Accommodations for English lan-

guage learners taking large-scale assessments: A meta-analysis on effectiveness and validity. Review of

Educational Research, 29(3), 1168–1201. DOI: 10.3102/003465430933249

Kieffer, J. J., Lesaux, N. K., & Snow, C. E. (2008). Promises and pitfalls: Implications of NCLB

for identifying, assessing, and educating English language learners. In G. L. Sunderman (Ed.), Holding

NCLB accountable: Achieving accountability, equity, and school reform (pp. 57–74). Thousand Oaks,

CA: Corwin Press.



Klentschy, M. (2005). Science notebook essentials: A guide to effective notebook components.

Science and Children, 43, 24–27.

Knipper, K. J., & Duggan, T. J. (2006). Writing to learn across the curriculum: Tools for compre-

hension in content area classes. The Reading Teacher, 59(5), 462–470. DOI: 10.1598/RT.59.5.5

Lee, O. (2005). Science education with English language learners: Synthesis and research agenda.

Review of Educational Research, 75(4), 491–521. DOI: 10.3102/00346543075004491

Lee, O., Deaktor, R. A., Hart, J. E., Cuevas, P., & Enders, C. (2005). An instructional intervention’s

impact on the science and literacy achievement of culturally and linguistically diverse elementary

students. Journal of Research in Science Teaching, 42(8), 857–887. DOI: 10.1002/tea.20071

Lee, O., & Luykx, A. (2006). Science education and student diversity. New York, NY: Cambridge

University Press.

Lee, O., Maerten-Rivera, J., Penfield, R., LeRoy, K., & Secada, W. G. (2008). Science

achievement of English language learners in urban elementary schools: Results of a first-year profes-

sional development intervention. Journal of Research in Science Teaching, 45(1), 31–52. DOI: 10.1002/

tea.20209

Lindholm-Leary, K. J. (2001). Dual language education. Avon, UK: Multilingual Matters.

Linn, R. L. (2000). Assessments and accountability. Educational Researcher, 29(4), 4–15.

Liu, O., Lee, H., & Linn, M. C. (2011). Measuring knowledge integration: Validation of four-year

assessments. Journal of Research in Science Teaching, 48(9), 1079–1107.

Lynch, S., Kuipers, J., Pyke, C., & Szesze, M. (2005). Examining the effects of a highly rated

science curriculum unit on diverse students: Results from a planning grant. Journal of Research in

Science Teaching, 42(8), 921–946. DOI: 10.1002/tea.20080

Luykx, A., Lee, O., Mahotiere, M., Lester, B., Hart, J., & Deaktor, R. (2007). Cultural and home

language influences on children’s responses to science assessments. Teachers College Record, 109(4),

897–926.

Maerten-Rivera, J., Myers, N., Lee, O., & Penfield, R. (2010). Student and school predictors of

high-stakes assessment in science. Science Education, 94, 937–962.

Merino, B. J., & Scarcella, R. (2005). Teaching science to English learners. Invited Essay.

Language Minority Research Institute Newsletter, 14(4), 1–7.

Meyer, L. (2000). Barriers to meaningful instruction for English learners. Theory Into Practice,

39(4), 228–236.

McCloskey, M. (2002). President’s message: No child left behind. TESOL Matters, 12, 4. Retrieved

from http://www.tesol.org/pubs/articles/2002/tm12-4-04.html

McNeil, L. M. (2000). Contradictions of reform: The educational costs of standardization. New

York: Routledge.

Minicucci, C. (1996). Learning science and English: How school reform advances scientific learn-

ing for limited English proficient middle school students. Santa Cruz, CA: National Center for Research

on Cultural Diversity and Second Language Learning. CREDE.

Naglieri, J. A. (1997). Naglieri nonverbal ability test. San Antonio, TX: The Psychological Corp.

National Center for Education Statistics. (2010). The Condition of Education 2010 (NCES 2010-

028). Washington, DC: U.S. Department of Education.

National Institute of Child Health and Human Development. (2000). Report of the National

Reading Panel. Teaching children to read: An evidence-based assessment of the scientific research litera-

ture on reading and its implications for reading instruction (NIH Publication No. 00-4769). Washington,

DC: U.S. Government Printing Office.

National Research Council. (1996) National Science Education Standards. Washington, DC: The

National Academy Press.

No Child Left Behind Act. (2002). Pub. L. No. 107-110. Washington, DC.

Palumbo, A., & Sanacore, J. (2009). Helping struggling middle school literacy learners achieve

success. The Clearing House, 82(6), 275–280. Retrieved from ProQuest Education Journals. (Document

ID: 1808509731).



Penfield, R. D., & Lee, O. (2010). Test-based accountability: Potential benefits and pitfalls of

science assessment with student diversity. Journal of Research in Science Teaching, 47(1), 6–24.

Pray, L., & Monhardt, R. (2009). Sheltered instruction techniques for ELLs: Ways to adapt science

inquiry lessons to meet the academic needs of English language learners. Science and Children, 46(7),

34–38.

Roehrig, A. D., Petscher, Y., Nettles, S. M., Hudson, R. F., & Torgesen, J. K. (2008). Accuracy of

the DIBELS Oral Reading Fluency measure for predicting third grade reading comprehension outcomes.

Journal of School Psychology, 46(3), 343–366.

Rosebery, A. S., Warren, B., & Conant, F. R. (1992). Appropriating scientific discourse: Findings

from language minority classrooms. The Journal of the Learning Sciences, 21, 61–94.

Ruiz-Primo, M. A., Shavelson, R. J., Hamilton, L., & Klein, S. (2002). On the evaluation of sys-

temic science education reform: Searching for instructional sensitivity. Journal of Research in Science

Teaching, 39(5), 369–393. DOI: 10.1002/tea.10027

Rumberger, R. W., & Larson, K. A. (1998). Toward explaining differences in educational achieve-

ment among Mexican-American language-minority students. Sociology of Education, 71(1), 68–92.

Rupley, W. H. (2009). Linking reading and science: Focusing on a broader base of understanding.

Reading Psychology, 31(3), 203–205. DOI: 10.1080/02702710903241389

Rupley, W. H., & Slough, S. W. (2010). Building prior knowledge and vocabulary in science in the

intermediate grades: Creating hooks for learning. Literacy Research and Instruction, 49, 99–112.

Santau, A. O., Maerten-Rivera, J. L., & Huggins, A. C. (2011). Science achievement of

English language learners in urban elementary schools: Fourth-grade students achievement results from

a professional development intervention. Science Education, 95, 771–793.

Texas AFT. (2008). Beyond TAKS (and NCLB): Putting Texas school accountability back on track.

Retrieved from http://docs.texasaft.org/legislative/TestReformForumPaper100208.pdf

Texas Education Agency. (2006). TAKS performance level descriptors. Austin, TX: Author.

Retrieved from http://www.tea.state.tx.us/index3.aspx?id=3222&menu_id=793

Texas Education Agency. (2008). Texas education agency technical report 2006–2007. Retrieved

from http://www.tea.state.tx.us/index3.aspx?id=4326&menu_id=793 http://www.tea.state.tx.us/student.assessment/resources/techdigest/

Texas Education Agency. (2010a). 2009–10 Academic Excellence Indicator System. Retrieved

from http://ritter.tea.state.tx.us/perfreport/aeis/2010/state.html

Texas Education Agency. (2010b). Accountability manual: The 2010 accountability rating system

for Texas public schools and school districts. Retrieved from http://ritter.tea.state.tx.us/perfreport/

account/2010/manual/manual.pdf

Texas Education Agency. (2011a). Accountability manual: The 2011 accountability rating system

for Texas public schools and school districts. Retrieved from http://ritter.tea.state.tx.us/perfreport/

account/2011/manual/manual.pdf

Texas Education Agency. (2011b). 2011 Adequate yearly progress (AYP) guide. Retrieved from

http://ritter.tea.state.tx.us/ayp/2011/index.html

Texas Education Code. (1995) 74th Leg., Ch. 260, § 29.063.

Thomas, W. P., & Collier, V. P. (2002). A national study of school effectiveness for language

minority students’ long-term academic achievement. Final report. Washington, DC: Center for Research

on Education, Diversity & Excellence.

Tong, F., Lara-Alecio, R., Irby, B.J., Mathes, P., & Kwok, O. (2008). Accelerating early academic

oral English development in transitional bilingual and structured English immersion programs.

American Educational Research Journal, 45(4), 1011–1044.

Tong, F., Irby, B. J., Lara-Alecio, R., Yoon, M., & Mathes, P. G. (2010). Hispanic English learners’

responses to longitudinal English instructional intervention and the effect of gender: A multilevel analy-

sis. Elementary School Journal, 110(4), 542–566.



U.S. Department of Education. (2009). Standards and assessment group accountability group.

Ed.gov. Retrieved from http://www2.ed.gov/admins/lead/account/saa.html

Waldman, C. A., & Crippen, K. J. (2009). Integrating interactive notebooks: A daily learning cycle

to empower students for science. The Science Teacher, 76, 51–55.

Wang, L., Beckett, G. H., & Brown, L. (2006). Controversies of standardized assessment in school

accountability reform: A critical synthesis of multidisciplinary research evidence. Applied Measurement

in Education, 19(4), 305–328.

Warren, B., Ballenger, C., Ogonowski, M., Rosebery, A., & Hudicourt-Barnes, J. (2001).

Rethinking diversity in learning science: The logic of everyday language. Journal of Research in

Science Teaching, 38(5), 529–552.

Watkins, N. M., & Lindahl, K. M. (2010). Targeting content area literacy instruction to meet the

needs of adolescent English language learners. Middle School Journal, 41(3), 23–32.

Wellington, J., & Osborne, J. (2001). Language and literacy in science education. Buckingham,

England: Open University Press.

Woodcock, R. W. (1991). Woodcock Language Proficiency Battery—Revised, English and Spanish

Forms: Examiner’s manual. Itasca, IL: Riverside.


