JOURNAL OF RESEARCH IN SCIENCE TEACHING VOL. 49, NO. 8, PP. 987–1011 (2012)
Research Article
The Effect of an Instructional Intervention on Middle School English Learners’ Science and English Reading Achievement
Rafael Lara-Alecio,1 Fuhui Tong,1 Beverly J. Irby,2 Cindy Guerrero,1
Maggie Huerta,1 and Yinan Fan1
1Department of Educational Psychology, Texas A&M University, College Station, Texas 77843
2Sam Houston State University, Huntsville, Texas 77320
Received 27 April 2011; Accepted 7 June 2012
Abstract: This study examined the effect of a quasi-experimental project on fifth grade English
learners’ achievement in state-mandated standards-based science and English reading assessment. A
total of 166 treatment students and 80 comparison students from four randomized intermediate schools
participated in the current project. The intervention consisted of on-going professional development and
specific instructional science lessons with inquiry-based learning, direct and explicit vocabulary instruc-
tion, integration of reading and writing, and enrichment components including integration of technology,
take-home science activities, and university scientists mentoring. Results suggested a significant and
positive intervention effect in favor of the treatment students as reflected in higher performance
in district-wide curriculum-based tests of science and reading and standardized tests of oral reading
fluency. © 2012 Wiley Periodicals, Inc. J Res Sci Teach 49: 987–1011, 2012
Keywords: science intervention; English learners; inquiry-based learning; integrating science and
literacy; standards-based assessment
Recent demographics have revealed that English language learners (ELLs) comprise 21%
of the national enrollment in public elementary and secondary schools, with 79% of those
students being Spanish speakers (National Center for Education Statistics [NCES], 2010). In
Texas, alone, over 778,806 students were served in ELL programs in 2009–2010, accounting
for 16% of the school population (Texas Education Agency [TEA], 2010a). These numbers
are significant when achievement is compared to that of mainstream students. For example, at
the eighth grade level, the national data show that only 3% of ELLs achieved at or above
proficient level on the 2009 National Assessment of Educational Progress (NAEP) reading
assessment, in comparison to 35% of English-speaking students, a statistically significant
43-point difference (NCES, 2010). Similarly, the percentage of ELLs at or above proficient
at eighth grade level was 5% in Math and 2% in Science, as compared to 36% in Math and
32% in Science among English-speaking students (NCES, 2010). Such an achievement gap is
also reflected in the state-wide standardized assessment of science in which fifth grade ELLs
scored lower than any other subgroup including Special Education, Economically
Disadvantaged, and At-Risk (TEA, 2010a).
Additional Supporting Information may be found in the online version of this article.
Contract grant sponsor: National Science Foundation (NSF, grant # 0822343 to Texas A&M University,
0822153 to Sam Houston State University).
Correspondence to: F. Tong; E-mail: [email protected]
DOI 10.1002/tea.21031
Published online 11 July 2012 in Wiley Online Library (wileyonlinelibrary.com).
These statistics are staggering especially in an era of school accountability reform which
has been based on measured student achievement at both national and state levels (Anderson,
2012; Maerten-Rivera, Myers, Lee, & Penfield, 2010). The No Child Left Behind Act (NCLB,
2002) reinforces that all schools are held accountable for their students’ attainment by making
adequate yearly progress (AYP) and achieving a 100% proficiency rate on state tests by 2014,
while facing sanctions if they fall behind in meeting the standards. Science standards began
to be required in 2005–2006, and by 2007, science achievement was required to be measured
at least three times from Grades 3 to 12 (U.S. Department of Education, 2009). Such pressure
has resulted in the increased focus on test-driven accountability and teaching to the test,
because improvement on the assessments is how schools and districts are currently being
evaluated (Texas AFT, 2008). Even though there is now a state waiver for NCLB, account-
ability is still at the forefront. For example, states must ensure the following: standards are
adopted for college and career readiness; new accountability systems are implemented with
more flexibility in assessing student achievement, and evaluation and support systems are
developed and based on measures to improve teacher effectiveness (Hu, 2012). Considering
teacher effectiveness in science education, Lee (2005) stated that ‘‘Teachers often lack the
knowledge and the institutional support needed to address the complex educational needs of
ELLs’’ (p. 492). Indeed, according to Byrnes, Kiger, and Manning (1997), most classroom
teachers have had minimal, if any, training in meeting the academic or linguistic needs of
their ELLs, and, in fact, McCloskey (2002) reported that only 12% of teachers nationwide
had any training on how to teach ELLs, much less when the content area of science is
added. Even with professional development on strategies to make content comprehensible for
ELLs, mainstream teachers did not accommodate the ELLs’ learning needs as they should
(Bentley, 2004).
Because the best program for ELLs at this point remains undetermined due to lack of
randomized trial studies and other sufficiently rigorous studies, many school district personnel
are blindly adopting approaches for ELLs, while these students ‘‘frequently confront the
demands of academic learning through a yet-unmastered language’’ (Lee, 2005, p. 492). Most
researchers have not actually observed bilingual/English as second language (ESL) and/or
English immersion science classrooms in a large-scale study to take into account instructional
factors in the learning of English (Bruenig, 1998; Irby, Tong, Lara-Alecio, Meyer, &
Rodriguez, 2007; Meyer, 2000). Furthermore, few researchers have utilized experimental or
quasi-experimental design to report how classroom instruction in science for ELLs can be
enhanced (e.g., August, Branum-Martin, Cardenas-Hagan, & Francis, 2009; Lee, Maerten-Rivera,
Penfield, LeRoy, & Secada, 2008; Lynch, Kuipers, Pyke, & Szesze, 2005). Therefore,
the purpose of our study was to evaluate, via a quasi-experimental design, the effectiveness of
a literacy-integrated science intervention on fifth grade ELLs’ science and reading literacy
achievement on accountability-based state assessments.
Standards-Based Accountability and ELLs’ Science Achievement
As Anderson (2012) noted, quantitative measurement of educational quality has existed
in the United States since the early 20th century, but accountability standards
and repercussions for not meeting the standards, that is, high-stakes consequences, have pro-
gressively increased. The NCLB set unprecedented forceful provisions on using state-mandat-
ed assessments to hold schools accountable for their students’ academic performance (Wang,
Beckett, & Brown, 2006). These assessments are developed surrounding a common theme of
the current reform to adopt more rigorous and measurable standards and higher expectations
for student performance (Linn, 2000). To aid in decision-making about individual students’
program placement/exit, grade promotion, graduation, etc., many states use scores on stan-
dardized assessments to measure achievement. While in most states only math and reading
tests are high stakes and counted in AYP that carry significant educational consequences for
all students including ELLs (Anderson, 2012; Maerten-Rivera et al., 2010), in Texas, the state
where the current study was conducted, a science achievement test is also included in the
state accountability system that assigns ratings to every campus and district based on the
evaluation of indicators of performance, such as assessment results on the state standardized
instruments (TEA, 2010b). Texas accountability ratings for AYP are based on test performance
in three more subjects (Science, Writing, and Social Studies) in addition to math and
reading, and in two more grade levels (Grades 9 and 11) than NCLB requires (TEA, 2011a).
NCLB regulations required students to be assessed in science in at least one elementary,
middle and high school grade level starting with the 2007–2008 school year, but the regula-
tions do not currently require science proficiency rates to be used in AYP determinations
(TEA, 2011b).
Current policies that call for more accountability in the education system present oppor-
tunities and challenges in science education for ELLs. On one hand, it is expected that the
appropriate use of standardized tests can promote educational quality and academic achieve-
ment in school accountability (American Educational Research Association, 2000; McNeil,
2000). However, it is also noted that the introduction of an accountability system has led
to a narrowed gap between language minority students and their non-language minority
peers (Hanushek & Raymond, 2005). Furthermore, researchers have expressed concerns re-
garding the reliability, validity, and fairness of state-mandated achievement tests (Abedi,
2004; Kieffer, Lesaux, & Snow, 2008), particularly in science for ELLs (Penfield & Lee,
2010). In fact, ELLs’ language status was found to be associated with lower science achieve-
ment scores—more so than with gender, ethnicity, or economic status (Maerten-Rivera et al.,
2010). One of the assessment issues related to AYP reporting, as summarized by Abedi
(2004), pointed to the fact that assessment results are not directly comparable across the
ELL and non-ELL groups. The data also showed that ELLs’ performance may be underestimated,
due to confounding of language and content. Another confounding
issue for data analysis related to performance is the way in which ELLs are categorized
without attention to their immigration status, which generally prevents researchers from
determining whether the students are first, second, or third generation ELLs (Clewell, deCohen, &
Murray, 2007). Additionally, we note that researchers do not have information regarding the
classification of ELLs as former or current program participants in Texas or in national
databases.
Effective Science Instruction for ELLs
Researchers have noted components of pedagogy that contribute to ELLs’ science
achievement on standardized tests that are a part of school accountability systems. These
components include science inquiry approach (National Research Council, 1996); targeted,
explicit, and intensive instruction in the specialized language of science (Anstrom et al.,
2010; Freeman & Freeman, 2008; Kieffer, Lesaux, Rivera, & Francis, 2009; Luykx, Lee,
Mahotiere, Lester, Hart, & Deaktor, 2007; Tong, Irby, Lara-Alecio, Yoon, & Mathes, 2010);
and the integration of literacy instruction with science content (Merino & Scarcella, 2005).
These components are elaborated as follows.
Science Inquiry
Science inquiry is based on constructivist theory (Bruner, 1996) in which individuals are
believed to learn by making connections between new information and prior knowledge. In
science-inquiry instruction, students are expected to build their own knowledge as teachers
facilitate and encourage students to ask questions, hypothesize, experiment, and draw infer-
ences from science experiences and experiments in the classroom (Rosebery, Warren, &
Conant, 1992). This type of instruction allows students to use the full range of language skills
(listening, speaking, reading, and writing) and provides a strong base for establishing
background knowledge and vocabulary for students, and consequently promotes academic
achievement for ELLs (August et al., 2009).
Inquiry-based interventions have been found to promote the development of ELLs’ con-
ceptual understanding of science (Amaral, Garrison, & Klentschy, 2002; August et al., 2009;
Lee, Deaktor, Hart, Cuevas, & Enders, 2005; Lee et al., 2008). This foundation in understand-
ing, in turn, results in students’ higher achievement in science. As Liu, Lee, and Linn (2011)
noted, ‘‘. . . teachers who [implement] . . . inquiry-based science units more often tended to
have larger student success in science achievement’’ (p. 1103).
Geier et al. (2008) examined the effect of a combination of standards-based and inquiry
science instruction on standardized science achievement in an urban school district. It was
reported that seventh and eighth graders receiving standards-based science inquiry interven-
tion outperformed their peers in science content and process understanding. As the research-
ers noted from their findings, standards-based, science inquiry curriculum can lead to positive
scores on science standardized tests for underserved urban students.
Amaral et al.’s (2002) landmark study with ELLs from Grades K to 6 in a high poverty,
primarily Spanish-speaking community in California employed inquiry-based science instruc-
tion within the context of a structured English immersion approach. In their program, units
were created using inquiry-based kits that allowed students a hands-on approach to learning.
In addition, teachers were provided with professional development training to support their
implementation of the instruction which integrated science and literacy (i.e., reading and
writing). Data from state-mandated assessment showed that the achievement of ELLs in-
creased in all content areas in relation to the number of years the students participated in the
science inquiry program. Though the study’s goal was not initially to explicitly improve read-
ing and math, the researchers attributed the student growth in these areas to the problem
solving and critical thinking that science inquiry methods promoted.
Similarly, Lee et al.’s (2005) pre- and post-design study with upper elementary ELL
students participating in a science inquiry intervention showed statistically significant gains in
science and literacy. As in Amaral et al.’s (2002) study, the Lee et al. intervention provided
training for teachers. Students were given the opportunity to conduct science investigations
and were subsequently given background information on the science concepts they had ex-
plored. This form of instruction closely mirrors the 5-E model developed by Bybee (1996) of
the Biological Science Curriculum Study and recognized as a framework to teach science
inquiry. In this model, students are first Engaged in a science concept, then they are asked to
Explore the concept by means of guided inquiry, and finally they are moved into the Explain
phase, explicitly discussing scientific concepts. Concepts are reinforced in the Elaborate
phase and concluded in the Evaluate phase. It should be noted, as pointed out by August
et al. (2009), that both the Amaral et al. and Lee et al. studies employed a pre- and post-test
pre-experimental design without a comparison group; as a result, the gains cannot be directly
attributed to the treatment and may be subject to threats to internal validity
such as maturation between pre- and post-tests.
Lee et al.’s (2008) subsequent work was a quasi-experimental 5-year study with
upper elementary ELLs in urban schools. Like the other studies, this study trained
teachers to instruct using a science inquiry approach and described how the science
units were developed to promote student initiative in learning with the teacher as a facilitator.
More specifically, the units included guides for teachers in terms of questions to ask students
to promote higher level thinking as the students conducted science investigations with science
background information to be presented at the end of the student investigation, which, again,
followed a 5-E model approach. One of the findings was that treatment students demonstrated
significantly greater gains in science achievement as measured by a researcher-developed
science exam. Like Amaral et al.’s (2002) study, the impact of a successfully implemented
science inquiry intervention was not only reflected in science achievement as measured by
public-released items from the NAEP, but in other content areas as well, likely as a result of the
success of science inquiry in promoting higher level thinking. Additionally, Lynch et al.
(2005) conducted a quasi-experimental study to evaluate the effect of a highly rated science
curriculum which included science inquiry investigations among diverse groups of students.
Although former ELLs and native English speakers significantly outscored the comparison
group, the results were similar between ELLs and the comparison group, which, according to
the authors, was likely due to the high literacy demand in the curriculum, as well as the
assessment that failed to capture the gains made by these ELLs.
A more recent intervention study that reported significant intervention effects was con-
ducted by August et al. (2009) among sixth grade ELLs and native English speakers. The
purpose of their study was to assess the effectiveness of a 9-week science inquiry intervention
model to develop science knowledge and academic language. The researchers randomly
assigned sections, or classrooms, to receive the science intervention or to serve as comparison
groups. They also implemented professional development to support teachers with the inter-
vention, and explicitly noted using the 5-E approach discussed earlier. Moreover, the use of
visuals, graphic organizers, demonstrations, experiments, modeling to students, explicit
vocabulary instruction, and reading integration were included in the intervention. Pre- and
post-test assessments selected from state-standardized tests in science content and science
vocabulary were developed by the researchers, with fidelity checks to ensure teachers were
following the inquiry curriculum. Results noted post-test differences in favor of the treatment
group for both science knowledge and vocabulary, after adjusting for pre-test performance, with
reported standardized effect size estimates of 0.163 for Science and 0.263 for Vocabulary.
Finally, Santau, Maerten-Rivera, and Huggins (2011) analyzed the effectiveness of a
5-year professional development intervention aimed at improving science and literacy
achievement for urban elementary school students, including ELLs. Inquiry-based science
was the primary goal of this intervention with a balance between teacher guidance and
student initiative during the hands-on, real-world science workshops. Students’ science
achievement was measured by a researcher-developed multiple choice test. The mixed model
analysis revealed that all students in the intervention made significant gains from pre- to post-
test, independent of their English language status (i.e., ELL and non-ELL) or number of years
the students participated in the intervention. The authors concluded that the strength of pro-
fessional development led to effective science instruction, student learning, and, consequently,
higher achievement.
Direct and Explicit Vocabulary Instruction
Direct and explicit vocabulary instruction has been linked with the language and literacy
acquisition success for both elementary and middle grades ELLs (Beck, McKeown, & Kucan,
2002; Graves, 2000; National Institute of Child Health and Human Development [NICHD],
2000). Tong et al. (2010) presented a successful case of direct and explicit language and
literacy intervention delivered to Spanish-speaking ELLs longitudinally from kindergarten
to second grade. Their intervention consisted of a systematic strand of teaching vocabulary
directly through pronunciation, spelling, repeated exposure in context, Spanish (i.e., students’
first language) clarification, and word meaning. The Tong et al. experimental study resulted in
faster growth in English vocabulary knowledge and subsequent reading comprehension
among treatment students.
Carlo et al. (2004) implemented a vocabulary intervention relying on explicit instruction
to fifth grade Spanish-speaking ELLs. The instruction included the teaching of word meaning
in the context of engaging texts, the pronunciation, polysemy, morphology, glossary use with
Spanish translation in context and English definition, and cognate use. The positive treatment
effect was reported by the authors at the end of the academic year in academic vocabulary
knowledge and reading comprehension.
Although little is known at the middle school level about the means by which ELLs most effectively
develop content, along with English oracy and literacy proficiency, explicit vocabulary
instruction has been identified in studies with science intervention. For example, Lee et al.
(2005) provided teachers with guides on how to promote literacy with ideas on science writ-
ing topics and the integration of literature related to the science topic. To promote English
oral proficiency, the units introduced key vocabulary words at the beginning of lessons and
encouraged students to use and practice the words in a variety of contexts. Results yielded
growth in both language acquisition and science achievement.
Science and Literacy Integration
Simultaneously, within the framework of science inquiry, researchers and theorists have
acknowledged the importance of providing structured and explicit instruction that delineates a
difference between ‘‘scientific’’ language and ‘‘everyday’’ language for ELLs (Warren,
Ballenger, Ogonowski, Rosebery, & Hudicourt-Barnes, 2001, p. 530). Moreover, researchers
have agreed that without the explicit learning of science language, science will simply ‘‘re-
main a foreign language to most students’’ (Wellington & Osborne, 2001, p. 139), especially
for ELLs. All of the studies we reviewed that involved ELLs integrated science and literacy. For
instance, in Amaral et al.’s (2002) inquiry-based science instruction study, one of the most
marked components included the integration of science notebooks in which students were
asked to write frequently and in a structured manner about their science investigations in
order to promote English writing literacy and science conceptual formation.
Within the last decade, although a handful of descriptive studies have been published to help
practitioners integrate language and content in the science classroom for monolingual children
(e.g., Gelman & Brenneman, 2004; Knipper & Duggan, 2006; Rupley, 2009; Waldman &
Crippen, 2009), there has been limited research addressing the needs
of ELLs (e.g., Huerta & Jackson, 2010; Hyland, 2007; Pray & Monhardt, 2009; Rupley &
Slough, 2010). Even fewer are the studies that target the integration of literacy and science
for ELLs at the middle school level (Amaral, Garrison, & Duron-Flores, 2006; Palumbo &
Sanacore, 2009; Watkins & Lindahl, 2010). Klentschy (2005) elaborated on how to imple-
ment science notebooks effectively in the classroom to promote English literacy skills and
scientific thinking. He outlined the different components of a science notebook that can be
systematically modeled and used by students during a science inquiry experiment. Palumbo
and Sanacore (2009) underlined the importance of explicitly and systematically teaching
vocabulary to diverse middle school learners, including teaching the pronunciation, roots, and
meanings of academic words in the content areas. In addition, they also suggested that read-
ing strategies to increase fluency and comprehension can be incorporated into the science
classroom such as repeated reading and readers theater. Watkins and Lindahl (2010) recom-
mended that a structured reading framework in science can and should include the use of
explicit vocabulary instruction for ELLs in the content areas. They provided a useful frame-
work of strategies to activate student background knowledge, increase student motivation and
reading comprehension, and integrate vocabulary, oral, and written language development
within content area instruction.
Guiding Premises of the MSSELL Project
According to Palumbo and Sanacore (2009), there is a growing body of research on
potentially effective practice to ensure ELL academic achievement that points to the value
and importance of embedding literacy instruction in the content areas as a way to provide
context, purpose, meaning, and motivation for learning at all grade levels. Unfortunately,
there is a lack of clear understanding of how to best assist ELLs in acquiring the academic
language of science while also learning the second language of English in the most efficient
and effective manner. We identified two limitations in this literature.
First, descriptive and non-experimental studies fail to suggest a cause and effect
relationship under certain conditions, with information on effect size missing, limiting the
comparability across studies. According to a synthesis of research on science education
and student diversity by Lee and Luykx (2006), experimental designs in studies with
diverse students are rare, few studies are longitudinal, and many of the studies do not yield
concrete evidence related to student achievement. Similarly, after their extensive review of
literature, August et al. (2009) pointed out that the majority of studies available on effective
science instruction among upper-elementary and middle grades ELLs are descriptive in na-
ture, with few studies that have used an experimental design to test the effectiveness of an
intervention.
Second, detailed information from instructional practices is missing, making replication
almost impossible (Tong, Lara-Alecio, Irby, Mathes, & Kwok, 2008). For example, Thomas
and Collier (2002) conducted a landmark longitudinal study responding to the need to deter-
mine which language support programs successfully promote the long-term academic
achievement of ELLs. Although their study lays the foundation for our study, its
sample included typical programs that varied greatly and, more specifically, lacked
descriptions of instructional practices from district to district and state to state. It is difficult
to control for such variability and lack of specificity within a large-scale study. To address
this issue, a longitudinal experimental/quasi-experimental design in which the language/litera-
cy is merged with science content standards (Lee & Luykx, 2006; Merino & Scarcella, 2005;
Minicucci, 1996) taught by qualified professionals trained with inquiry-based instruction is
needed to enhance Spanish-speaking English learners’ academic language development and
science achievement in English in state and other standardized assessments.
These guiding principles gleaned from the literature indicate that students will increase
significantly in their academic language in science, their English language/literacy skills, and
science achievement and will outperform students who are in the typical practice science
program given that they have (a) appropriately certified teachers who receive ongoing professional
development and frequent classroom observations with feedback, along with scripted
lessons with clarifications in Spanish, (b) bilingual paraprofessionals to assist ELLs with
clarifications and support low functioning students in science, and (c) an enhanced curriculum that
includes inquiry-based learning, scripted lessons, technology integration, academic oral and
written science language development, family involvement, and university scientist and
college student mentors.
Therefore, the purpose of our study was to evaluate, via a quasi-experimental design, the
effectiveness of a literacy-integrated science intervention on fifth grade ELLs’ science and
reading literacy achievement on accountability-based state assessments. Specifically, we
sought to answer the following questions:
(1) Do students who are enrolled in the literacy-embedded science treatment condition
classrooms perform better on the district science benchmark tests and the state
standardized science test than do students in a comparison condition?
(2) Do students who are enrolled in the literacy-embedded science treatment condition
classrooms perform better on the district reading benchmark tests and the state
standardized reading test than do students in a comparison condition?
(3) Do students who are enrolled in the literacy-embedded science treatment condition
classrooms perform better on the standardized English decoding measure than do
students in a comparison condition?
Methods
Design, Context, and Participants
Our study was derived from a larger longitudinal (fifth to sixth grade), field-based research
project targeting both ELLs and low socioeconomic status (SES) non-ELLs in
an urban school district in Southeast Texas. Over 45% of the students served in the district
speak Spanish as their first language. The majority of students (85%) in the selected school district
site qualified for free or reduced-price lunch (TEA, 2010a). This particular district was selected for
study because of its (a) positive reputation based on student achievement and national awards
such as the Broad Foundation Prize, (b) lengthy experience working with ELLs, (c) consistency
in program philosophy and implementation, and (d) ease of access to
English learning and regular programs within the same school throughout the district. All
participating ELLs were identified as limited English proficient with Spanish as the primary
language spoken at home.
There were 10 intermediate schools (lower middle schools) in the selected district in
Southeast Texas. State law (Texas Education Code, 1995) prohibits random selection
and assignment on the basis of individual students for program placement; therefore, in the
larger research project, four intermediate schools with principals’ approval from the district
site were randomly assigned to conditions, resulting in two treatment (enhanced science practice)
and two comparison (typical science practice) schools. Table 1 presents the demographics
of these four schools. When a school was assigned, teachers from that campus were
then randomly selected for the assigned condition within that campus, and both ELLs and low
SES non-ELLs in the same school received the same practice to avoid contamination between
experimental and comparison classrooms. Due to the low return response in the two comparison
schools, a balanced design with four science teachers in their respective condition
(i.e., two in treatment and two in comparison) could not be achieved. Therefore, in order to
increase the sample size at the student level, we recruited four more teachers within the same
comparison schools that were already assigned, based on the criterion that they had Spanish-speaking
ELLs in their classrooms. Because of the non-random addition of these teachers,
the design of the overall project was considered to be quasi-experimental. Among the
12 teachers, the average number of years of teaching was 8.6 (9 for the treatment group and
8.4 for the comparison group), with two teachers new to the teaching profession. Each teacher
taught 2–3 classes (commonly termed rotations in the district; see Table 2 for a breakdown
of numbers by condition). The final sample consisted of a total of 166 treatment
students with an average age of 11.83 years (SD = 0.75) and 80 comparison students with
an average age of 11.78 years (SD = 0.71). Student information was coded by the district
personnel so that no individual student’s name could be identified.
Intervention
Professional Development. The intervention was composed of two main components.
The first component was teacher professional development (professional portfolio assessment,
biweekly staff development sessions, and monthly staff meetings for paraprofessionals).
This component consisted of ongoing training workshops for both teachers (biweekly) and
paraprofessionals (monthly) provided by research coordinators for 3 hours per session.
During the trainings teachers (a) reviewed and practiced upcoming lessons and materials,
(b) discussed science concepts and cleared up their own misconceptions, (c) reflected on and
discussed student learning, (d) assessed pedagogical progress as a teacher in the intervention,
(e) conducted experiments and inquiry activities and targeted areas in which students might have
difficulty, and (f) were instructed on the following ESL strategies that were incorporated
into the researcher-developed lessons: questioning strategies, language scaffolding, visual
scaffolding, manipulatives and realia, advance organizers, cooperative grouping, content
connections, and technology integration. The scripted lesson plans were tightly aligned to state
science standards, national science standards, and English language proficiency standards
with leveled questions that included cognitive verbs (e.g., identify, describe, explain, analyze,
and draw conclusions). Further, the lesson plans also included minimal language-of-instruction
clarifications, specifically L2 (English) clarified by L1 (Spanish); otherwise, the language of
instruction for the intervention was L2.
Table 2
Breakdown of number of students, rotations, and teachers by condition
Condition    School   Teacher   Rotation   Student
Treatment    A        3         8          41
             B        3         7          39
Comparison   C        2         6          118
             D        2         3          48
Table 1
School demographics for treatment and comparison group 2009–2010
School       African American (%)   Hispanic (%)   White (%)   Native American (%)   Asian (%)   Low SES (%)   ELL (%)   Academic Rating
Treatment
  School A   19.5                   78.3           1.5         0.1                   0.6         92.9          30.4      Recognized
  School B   11.9                   84.1           2.4         0.0                   1.6         92.1          31.3      Exemplary
Comparison
  School C   26.8                   68.4           2.4         0.0                   2.5         88.0          31.5      Recognized
  School D   26.5                   67.7           5.0         0.2                   0.7         90.5          22.9      Recognized
Classroom observations suggested a positive effect of the training workshops: treatment
teachers used each minute of instruction, following the lesson plans with inquiry-based
learning and integration of vocabulary and writing
in a structured, systematic manner. Based on our bi-monthly meetings and our field notes,
treatment teachers reported that the professional development equipped them with strategies
to (a) build science academic vocabulary; (b) integrate speaking, reading, and writing into
science lessons; (c) implement questioning strategies that hold students accountable for
their own learning; and (d) improve time management. Teachers commented that they would
utilize such strategies in their classrooms after the project implementation.
Instructional Activities. The second component, 85 minutes of daily science instruction,
usually began with Daily Oral and Written Language in Science (DOWLS, approximately
10 minutes), a warm-up activity in which students were presented with a science-related
prompt or scenario, given individual think time, recorded written responses, and then
discussed responses with a student partner. The academic science intervention continued
following the 5-E instructional cycle: (a) engage activity (5–10 minutes), which consisted
of either a visual or a teacher demonstration of a science concept to help focus students’
thinking and make connections between past and present learning; (b) explore activity
(10–20 minutes), in which students worked in cooperative groups to explore science concepts
by manipulating science materials or exploring the environment; (c) explain activity
(15–30 minutes), in which students gained deeper understanding of science concepts through
direct instruction of science vocabulary, partner reading of expository text (see further
description of CRISELLA in the next section), and interacting with science software
that explained science concepts through animation and simulation; (d) evaluate activity
(10–20 minutes), in which students demonstrated their understanding of the science concept
in their science journals (see further description of WAVES in the next section); and, when
time allowed, (e) elaborate activity, in which students further applied their understanding
by conducting additional activities. A daily lesson concluded with closure questions to review
the science concept as related to the objective.
A major component was Content Area Reading in Science for English Literacy and
Language Acquisition (CRISELLA), which focused on vocabulary development and exten-
sion through science-related expository text to improve students’ understanding of science
concepts. In CRISELLA, students were provided with direct instruction of vocabulary, includ-
ing pronunciation of words, student friendly definitions, and visual scaffolding before they
started reading. Students then partner-read and asked each other scripted comprehension
questions and had the opportunity to re-read the selection to increase fluency and comprehension.
After that, the teacher reviewed the questions and answers with the class, clarifying any
misconceptions. Teachers also modeled academic science language, provided opportunities for
students to practice using science academic language, and encouraged students to answer in complete
sentences. Students entered vocabulary words and definitions into the glossary of their science
journals weekly.
Another component to improve science-related literacy was the integration of writing, that
is, Written and Academic oral language Vocabulary development in English in Science
(WAVES), in which individual science notebooks were used to help students process science
content through written academic science vocabulary. Science journals were structured so that
students had multiple opportunities to write daily to (a) record predictions and observations,
(b) illustrate and label diagrams, (c) organize information using notebook two-dimensional
figures, (d) record vocabulary in the glossary section of the journal, and (e) develop writing
skills while creating perspective-based writing, post-cards, newspaper articles, and reflections
on science field trips. Teachers were trained to provide both science and writing feedback.
The academic science curriculum was embedded with English as a second language strategies
and questioning strategies: teachers asked scripted leveled questions that included cognitive
verbs and were trained to alternate among answering techniques, such as (a) randomness,
(b) quick write, (c) pair-share, (d) choral response, (e) visual cues, and (f) timed thinking.
Students were held accountable by not being allowed to say “I don’t know”; instead, students
had the option of having more time to think, requesting that the question be asked in a
different way, requesting clues, or conferencing with another student.
In addition to the well-controlled components described above, there were also enrichment
components in the treatment classrooms to assist student learning: (a) Technology. Treatment
classrooms were equipped with computers, a projector, a document camera, an interactive
whiteboard, science-based educational software such as EduSmart, internet resources, and
a digital camera that further facilitated delivery of the academic science curriculum.
(b) Science Saturdays with scientists. Over 150 treatment students traveled to one of the
research universities that are part of this larger research project (November 14, 2009 and
February 27, 2010) and attended two interactive sessions developed by university scientists
based on physical, earth, life, and space sciences. (c) Family involvement in science
(FIS). Take-home science materials were developed for students to work on with their parents/
family so that students and their parents could become citizen scientists, whom we define as
citizens who are engaged in their world and understand scientific principles to better
understand their world and assist others, especially their children, in understanding and navigating
it. Each booklet contained a short letter to the family introducing the concepts and vocabulary
for the chapter, a crossword puzzle, a fun fact, various family science activities, an additional
reading relating to the topic, links to online science games, a short assessment, and a section
to sign and return upon completion. Each take home booklet was provided in both English
and Spanish. One 45-minute parent meeting was held during the fall of the academic year to
discuss how to implement FIS at home.
The district’s science scope and sequence outlines specific state standards to be covered
each 6 weeks (see Table 3 for a list of science content taught in treatment and comparison
conditions). The lesson plans developed by the research team were based on the same science
standards that the district included in their lesson plans in typical practice. Therefore,
both treatment and typical science teachers taught the same science standards; however, the
standards were incorporated in a different order during each 6-week period in comparison
classrooms.
Table 3
Science content taught in both treatment and comparison conditions during each 6-week period
Six Weeks   Science Content
1           Force and motion; forms of energy; reflection and refraction
2           Matter and its properties, classifying matter, conductors and insulators, mixtures and solutions
3           Solar system, physical characteristics of the sun, earth, and moon; rotation and revolution of earth; phases of the moon
4           Weather and the water cycle; earth’s structure; earth’s changing surface; fossils
5           Basic needs of organism; how organisms survive; processes and cycles in ecosystems; competition, reproduction, and life cycles
6           Adaptations; inherited traits and learned behaviors
Typical Practice/Comparison Condition
In typical practice in the comparison group, science was taught by certified or permitted
bilingual/ESL education and science education teachers in English for the ELL students with
no Spanish clarifications. Compared to a daily 85-minute 5-E lesson in treatment classrooms,
science instruction in comparison classrooms varied from 80 to 90 minutes daily, including
one 5-E lesson cycle weekly. Language development strategies may have been included in the
content subjects. Teachers followed a locally developed science curriculum aligned to the TEKS.
On-going classroom observations were conducted and suggested the following: (a) vocabulary
instruction included the use of word walls and students looking up definitions in a glossary;
(b) students read the textbook independently and answered questions at the end of the section;
(c) students completed worksheets and handouts, with limited use of science journals;
and (d) the integration of ESL strategies and questioning strategies varied and was
inconsistent. Although no training support was provided by the research team to the teachers in the
comparison group, they attended workshops as a state requirement to fulfill a minimum of
30 professional development hours each year related to their content area. Typical practice
also had computers, projectors, and document cameras (ELMOs) in the science classrooms.
The major technological difference was that treatment classrooms had EduSmart, science
software aligned with the intervention.
Measures
A battery of assessments was given to student participants to evaluate the effect after
1 year of implementation of this project. These measures include both standardized tests and
district-developed tests for the classroom level in science and English language and literacy.
These two levels were adopted based on the work of Ruiz-Primo, Shavelson, Hamilton, and
Klein (2002). They indicated:
. . . we reasoned that if science reform has an impact on students’ achievement, this
result should be located at different levels: most likely, if at all, at the local classroom
curriculum level, and then, we hoped, to transfer to more broadly or distal situations
than those covered in the classroom, for example, those posed by statewide and national
assessments. Evaluating students with achievement measures at different distances from
the science curriculum they studied would provide a better picture of the extent of
the effect that the science reform is having than using only close or distal measures.
(p. 371)
Specifically, based on the multilevel, multifaceted assessments framework proposed by
Ruiz-Primo et al., we employed a proximal level assessment which is used to ensure that the
teachers were teaching the assigned standards (and to hold schools accountable). That level
is observed in the district benchmark tests used in our study. Additionally, we employed what
Ruiz-Primo et al. called a distal level assessment based on state standards in a particular
domain (in our case—the Texas Assessment of Knowledge and Skills in Science and
Reading) and on a national level in assessing reading fluency using DIBELS.
District Benchmark Tests. The district-wide benchmark tests (in science and reading)
were used to compare students’ performance. These benchmark tests are criterion-referenced,
using cut-off scores to determine if a student passes and/or meets commended performance.
These benchmark tests were developed according to the scope and sequence of the fifth grade
Texas Essential Knowledge and Skills (TEKS), the state standards/curriculum in science/
reading. Similar to the state standardized test (see the following section), a student who
passes the science test (e.g., 18 correct items out of 24 in science benchmark test 1)
demonstrates satisfactory performance at or above the state passing standard and a sufficient
understanding of the TEKS-aligned science curriculum; the level of commended performance
(e.g., 22 correct items out of 24 in science benchmark test 1) reflects high academic
achievement, considerably above the state passing standard, and a thorough understanding of the
TEKS-aligned science curriculum. The topics covered over the year are: physics (test 1), chemistry
(test 2), earth and space (test 4), life science (test 5), with process skills integrated into these
topics. Tests 3 and 6 are mid and final tests and cumulative of these topics. (Sample test items
are available as Supporting Information accompanying the online article.)
Face validity of the benchmark tests was established through a recursive process
of reviews by committee members as to whether the tests conveyed what they should, and content
validity was obtained because the tests were directly aligned with the state standards (TEKS).
More specifically, the science/reading program director decided the order in which to teach
the state standards. The curriculum guide sequenced the science/reading TEKS that were
taught in each of the 6-week periods. In the recursive process, a committee comprising
curriculum specialists, teachers, and the program director used the district curriculum
guide to determine which TEKS were to be taught for each 6-week period, for a total of
six periods. This committee analyzed these TEKS and referenced related questions from
previous Texas Assessment of Knowledge and Skills (TAKS), the state standardized test for
Grades 3–12, to write test questions. Once the test questions were generated, they were sent to
the proofing committee, consisting of science/reading specialists. This committee reviewed
the test questions, checked the alignment to the TEKS, and corrected any errors. These
tests then went back to the test writing committee to make further changes and edits if need-
ed. Then the tests were forwarded to a district committee consisting of district curriculum
directors for the final grammar and format edits before they were printed and administered to
students.
Information about the reliability and validity of the benchmark tests at the
district level is limited; nevertheless, these instruments are what the district has adopted, and
the focus of our study is on improved achievement as measured by district and state level
tests, which are used to hold educators accountable. We therefore checked the internal
consistency reliability and predictive validity based on the sample in our study. Internal consistency
for science benchmark tests was calculated on each 6-week test separately, ranging from
0.66 to 0.80, with an average of 0.72 over the five tests for the sample of this study.
Predictive validity with TAKS, the state standardized assessment, ranged from 0.45 to 0.60
with an average of 0.53. Internal consistency for reading benchmark tests ranged from 0.60 to
0.82, with an average of 0.71 over the four tests for the sample of this study; and predictive
validity with TAKS ranged from 0.44 to 0.62 with an average of 0.51. These estimates are
similar to those reported in two previous studies, by Lee et al. (2005), with a range of internal
consistency from 0.71 to 0.86, and Lee et al. (2008), with an internal consistency of 0.60 in
pre-test and 0.71 in post-test, and therefore should be considered within an acceptable range.
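The article does not name the internal consistency coefficient used. As a minimal sketch, assuming Cronbach's alpha computed on dichotomously scored (0/1) items, the calculation could look like the following. The function name and the response matrix are hypothetical and are not taken from the study data.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-student item-score lists.

    Assumption: every student answered the same k items, scored 0 or 1.
    """
    k = len(scores[0])                       # number of items
    items = list(zip(*scores))               # transpose to per-item columns
    item_variance_sum = sum(pvariance(col) for col in items)
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 students x 4 items (not study data).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Applied to each 6-week test separately, a function like this would yield one coefficient per test, which could then be averaged as described above.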
State Standardized Test: Texas Assessment of Knowledge and Skills (TAKS). The TAKS, a
criterion-referenced assessment, measures student mastery of the content areas of state curric-
ulum outlined in the TEKS. The TAKS reading assessments, available in both English and
Spanish, are first administered during the spring of Grade 3. Typically, Spanish-speaking
ELLs who are not otherwise exempt can take the TAKS in Spanish for up to 3 years in
Grades 3–6. In Texas, the student’s assessment committee is responsible for recommending
whether a Spanish-speaking ELL should be assessed with reading
TAKS in English or reading TAKS in Spanish (TEA, 2006). In the current study, 19 ELLs
from comparison classrooms and 14 from treatment classrooms were recommended to take
reading TAKS in Spanish.
The TAKS science assessments are first administered during the spring of Grade 5. As is
described by TEA, students who pass TAKS science in fifth grade demonstrate satisfactory
performance at or above the state passing standard and a sufficient mastery of the
TEKS-aligned science curriculum. The level of commended performance reflects high academic
achievement, considerably above the state passing standard, and a thorough mastery of the
TEKS science curriculum. Both English TAKS reading and science tests in fifth grade consist
of multiple-choice items (42 in reading and 40 in science) measuring four objectives. The
English form has a reported internal consistency ranging from 0.87 to 0.90, and predictive
validity ranging from 0.56 to 0.79 with SAT and ACT (TEA, 2008). In addition, construct
validity has been established using confirmatory factor analysis with good model fit for read-
ing in Grades 3 and 5 (Burk, Johnson, & Whitley, 2005; Davies, O’Malley, & Wu, 2007). For
fifth grade TAKS (both English and Spanish) in 2010, a raw score of 29 (i.e., total number of
correct items) corresponds to a scaled score of 2100, the cut-off point representing meeting
standard level (i.e., passing). A raw score of 39 corresponds to a scaled score of 2300, the
cut-off point representing commended performance level (TEA, 2010b). (Sample test items
are available as Supporting Information accompanying the online article.)
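The cut-off logic described above can be sketched as a small classifier. This is an illustration only: the raw-score thresholds (29 and 39) are the ones quoted in the text for 2010, while the function name, the level labels, and the treatment of commended students as also passing are assumptions.

```python
PASS_RAW = 29        # raw score corresponding to scaled score 2100 (met standard)
COMMENDED_RAW = 39   # raw score corresponding to scaled score 2300 (commended)


def taks_level(raw_score):
    """Map a raw score (number of correct items) to a performance level."""
    if raw_score >= COMMENDED_RAW:
        return "commended"
    if raw_score >= PASS_RAW:
        return "met standard"
    return "did not meet standard"


def passing_rate(raw_scores):
    """Percentage of students at or above the passing cut-off.

    Assumption: commended students are counted as passing.
    """
    passed = sum(1 for s in raw_scores if s >= PASS_RAW)
    return 100.0 * passed / len(raw_scores)


# Hypothetical raw scores for four students (not study data).
example_scores = [28, 29, 38, 39]
print([taks_level(s) for s in example_scores])
print(passing_rate(example_scores))
```

Reducing each score to a pass/commended category in this way is what makes a comparison of rates between groups possible.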
Dynamic Indicators of Basic Early Literacy Skills (DIBELS). DIBELS (Good & Kaminski,
2002) includes a set of procedures and measures for assessing the acquisition of early literacy
skills from kindergarten through sixth grade. These tests were initially developed as
curriculum-based assessments and have documented psychometric reliability and validity for
measuring “critical skills that underlie early reading success.” They can be used for evaluative
purposes with individual students or student groups. In this study, Oral Reading Fluency (ORF)
was administered to students at the beginning and end of fifth grade. ORF is reported to have
a median alternate form reliability of 0.95 (Good, Kaminski, Smith, & Bratten, 2001), concurrent
validity of 0.92–0.96 with test of ORF, and predictive validity of 0.71 with Stanford
Achievement Test 10 (Roehrig, Petscher, Nettles, Hudson, & Torgesen, 2008). In this subtest,
examinees are asked to read grade-level fictional passages aloud, and the score is the number
of words correctly read in 1 minute. For each administration, three stories were tested,
and the middle score was used for analysis.
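Taking the middle of three passage scores is the same as taking their median. A minimal sketch, with a hypothetical function name and hypothetical scores:

```python
from statistics import median


def orf_score(words_correct_per_minute):
    """Return the middle (median) of exactly three passage scores."""
    if len(words_correct_per_minute) != 3:
        raise ValueError("expected scores from exactly three passages")
    return median(words_correct_per_minute)


# Hypothetical words-correct-per-minute on three passages (not study data).
print(orf_score([98, 112, 105]))
```

Using the middle score limits the influence of one unusually easy or difficult passage on a student's result.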
Fidelity Measure. Fidelity checks were developed to ensure implementation of the intervention
among the treatment science teachers. The Science Teacher Observation Record (STOR) was used to
monitor whether teachers were delivering the intervention as proposed. The inter-rater reliability
of STOR was 0.86. Each of the 5-Es was rated on a 1–4 scale in the following areas unique to
this intervention: (a) knowledge of lesson content; (b) material usage and teacher preparation;
(c) student involvement; (d) academic language scaffolding; (e) affective and cognitive
feedback; (f) writing feedback; and (g) pacing. In addition, there was a category for professional
development. The total possible score was 124. The mean score for treatment teachers was
107.93 (SD = 15.2) across four rounds of observation (at the beginning, beginning/middle,
middle, and end of the academic year), with an average observational time of 84 minutes per
round per teacher. Observers were educators who had attended the professional workshops
and had been trained by the principal investigators. They had over 10 years of experience
observing classrooms over the course of one previous federally funded longitudinal project.
In addition to the fidelity instrument, that is, STOR, field notes from classroom observations
were also collected from both conditions and suggested that the instructional behavior was
more consistent and committed to science teaching in treatment classrooms than in compari-
son classrooms.
Data Collection and Analysis
Data were collected in the fall and spring of school year 2009–2010. The science benchmark
test was administered to each student every 6 weeks during the school year, for a total of
six tests. Each 6-week benchmark test has a different topic area; for example, the science
benchmark tests cover Physics (test 1), Chemistry (test 2), Mid-term (cumulative of physics,
chemistry, and space, test 3), Earth/Space (test 4), Life Science (test 5), and final test
(cumulative of physics, chemistry, space and life science, test 6). TAKS data were collected
during the spring of 2010. Scores from benchmark tests, and TAKS scores were obtained
through the district database. Data from DIBELS were collected at the beginning and end of
school year.
To examine the initial equivalence between the two groups, students’ vocabulary knowledge,
reading comprehension skills, and non-verbal ability were measured for the larger
research project, with data collected at the beginning of fifth grade (Fall, 2009). These
measures included the subtests of Picture Vocabulary and Passage Comprehension in the Woodcock
Language Proficiency Battery-Revised (WLPB-R; Woodcock, 1991), a set of standardized instruments
assessing a broad range of language proficiency in speaking, listening, reading, and writing in
English. The Listening Comprehension subtest requires test takers to listen to a passage read to
them and to supply the single word missing at the end of the passage. In the
Passage Comprehension subtest, test takers are required to point to the picture represented by
a phrase. It also measures skills of reading a short passage and identifying a missing key
word. W-scores were used for analysis. Finally, students’ non-verbal ability was measured by
the Naglieri Nonverbal Ability Test (NNAT; Naglieri, 1997). It is designed for ages
5 to 17 to provide a concise but reliable and valid non-verbal evaluation of general ability.
Students are required to analyze the associations among the parts of the divided matrix, the
design, and to determine which answer choice is correct based on the information in each test
item. The NNAT has been used to identify gifted children, especially those who
are culturally and linguistically diverse. The test is group administered in approximately
30 minutes and was given as a district-level measure.
As a first step, students’ listening and reading comprehension in English, as well as their
non-verbal cognitive ability, were compared between treatment and comparison groups prior to
the implementation of the larger research project using independent samples t-tests. No
statistically significant difference was detected on listening comprehension (F = 0.11, p = 0.736),
reading comprehension (F = 0.25, p = 0.619), or non-verbal ability (F = 0.03,
p = 0.863). Therefore, initial equivalence was established.
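For readers who want to see the mechanics of a two-group baseline comparison, the sketch below computes a pooled-variance independent samples t statistic. For two groups this test is equivalent to a one-way ANOVA, with F equal to t squared; whether the F values reported above arise from that equivalence is not stated in the article. The function name and scores are hypothetical.

```python
from statistics import mean, variance


def pooled_t(a, b):
    """Pooled-variance independent samples t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance (sample variances, n - 1 denominator).
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5
    return t, df


# Hypothetical baseline scores for two small groups (not study data).
treatment = [1, 2, 3, 4, 5]
comparison = [2, 3, 4, 5, 6]
t_stat, df = pooled_t(treatment, comparison)
f_equivalent = t_stat ** 2   # F from the equivalent two-group one-way ANOVA
print(t_stat, df, f_equivalent)
```

A p-value would additionally require the t distribution with `df` degrees of freedom, which is omitted here to keep the sketch dependency-free.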
Note that it may not be beneficial to plot progress based on the scaled score for district
benchmark tests, because each 6-week test covers completely different science content.
A higher score therefore does not necessarily indicate progression; instead, it may
mean that the science content is simply different, more interesting to the students, or more
difficult by topic area in a single 6-week period. Further, on the TAKS tests, a scaled score of 2099
is still considered failing at only 1 point below 2100 (the cut-off score), while 2101 is
considered passing at only 1 point above. Therefore, to present the findings most accurately,
we conducted chi-squared tests of independence to compare the rates of passing and
commended performance between treatment and comparison groups of ELLs for benchmark
tests and TAKS, with Cramer’s V reported as the effect size or magnitude of the relationship.
Because a total of 33 students took the Spanish TAKS reading test, we decided to exclude
these data when comparing the English literacy achievement between the two conditions. To
analyze DIBELS scores, analysis of covariance (ANCOVA) with pre-test as the covariate was
conducted to compare students’ growth in literacy skills during the first year of implementation.
Partial eta squared was reported as the effect size.
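A chi-squared test of independence with Cramer's V can be sketched as follows for a condition-by-outcome table of counts. This is a minimal illustration without a continuity correction; the article does not say whether a correction was applied, and the function name and counts are hypothetical.

```python
def chi2_cramers_v(table):
    """Pearson chi-squared statistic and Cramer's V for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # V = sqrt(chi2 / (n * (min(r, c) - 1))); always non-negative.
    v = (chi2 / (n * (min(len(row_totals), len(col_totals)) - 1))) ** 0.5
    return chi2, v


# Hypothetical counts (not study data): rows = treatment, comparison;
# columns = passed, did not pass.
counts = [[30, 10],
          [20, 20]]
chi2, v = chi2_cramers_v(counts)
print(round(chi2, 3), round(v, 3))
```

The formula itself yields an unsigned value, so a sign on an effect size would have to be attached separately according to which group had the higher rate.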
Results
In this quasi-experimental study, we investigated the effectiveness of a science
intervention among fifth grade ELLs. We compared students’ performance between treatment
and comparison groups in the district benchmark tests in science and reading, state standard-
ized tests in science and reading, and literacy development. Results are presented by construct
measured, that is, science and reading achievement.
Science Achievement
Benchmark Tests in Science. Table 4 lists the descriptive statistics of percentage of pass-
ing and commended performance by condition, together with effect size in the form of
Cramer’s V. Results indicate that there is a statistically significant difference in the percentage
of passing in tests 2, 4, and 6 (ps < 0.05), and in the percentage of commended performance
in tests 2, 4 and 6 (ps < 0.05). These differences are all in favor of treatment students who
demonstrated higher rates, with the magnitude ranging between 0.127 and 0.195. For example,
there is an average of 87% passing and 43% commended performance rates in the treatment
group over the five benchmark tests in science, as compared to an average of 78% passing
and 32% commended performance in the comparison group (Test 5 was optional due to the time
conflict with TAKS administration).
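The averages quoted above can be reproduced from the percentages in Table 4, assuming they are simple unweighted means over tests 1, 2, 3, 4, and 6. The variable names below are illustrative.

```python
from statistics import mean

# Percentages from Table 4, in test order 1, 2, 3, 4, 6.
passing = {
    "treatment": [85.5, 89.7, 84.6, 89.6, 86.3],
    "comparison": [88.5, 76.3, 80.9, 74.7, 69.9],
}
commended = {
    "treatment": [41.6, 49.7, 34.0, 47.2, 43.5],
    "comparison": [34.3, 33.8, 27.5, 32.9, 30.3],
}

for condition in ("treatment", "comparison"):
    print(condition,
          round(mean(passing[condition])),
          round(mean(commended[condition])))
```

Rounded to whole percentages, this gives 87 and 43 for the treatment group and 78 and 32 for the comparison group.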
TAKS in Science. Chi-squared tests did not reveal statistically significant differences in the
rate of passing or commended performance (ps > 0.05). The treatment group had an average
passing rate of 78.2% and a commended performance rate of 25.1%. The comparison
group had an average passing rate of 84.6% and a commended performance rate of 19.8%.
Table 4
Difference between treatment (n = 166) and comparison group (n = 80) in District Benchmark Science Tests
        Passing (%)                                 Commended Performance (%)
Test    Treatment   Comparison   Effect Size(a)     Treatment   Comparison   Effect Size(a)
1       85.5        88.5         −0.040             41.6        34.3         0.066
2       89.7        76.3         0.178**            49.7        33.8         0.150*
3       84.6        80.9         0.026              34.0        27.5         0.065
4       89.6        74.7         0.194**            47.2        32.9         0.136*
6       86.3        69.9         0.195**            43.5        30.3         0.127*
(a) Positive effect size indicates higher performance in treatment condition. Test 5 was optional as it was given in the same time period as the TAKS test. *p < 0.05. **p < 0.01.
Reading Achievement
Benchmark Tests in Reading. Table 5 lists the descriptive statistics of percentage of
passing and commended performance by condition, together with effect size in the form of
Cramer’s V. Results from chi-squared tests of independence suggest a pattern similar to that
observed in science benchmark tests: treatment students statistically outperformed
comparison students in the passing rate in tests 2, 4, and 6 (ps < 0.05), with the magnitude of
such difference ranging between 0.163 and 0.238. For example, there is an average of 56.7%
passing and 8% commended performance in the treatment group over the four benchmark
tests in reading, as compared to an average of 38% passing and 3.7% commended performance
in the comparison group (test 5 was optional due to the time conflict with TAKS administration,
and not all groups were given test 3). The only statistically significant difference in the
rate of commended performance was found in the second test, in favor of the treatment group
(p < 0.05). Further, it is worth noting that on average, both treatment and comparison groups
demonstrated a low passing rate in the reading tests, indicating that a high percentage
of students did not meet the expected standard level of performance in the subject area of
reading.
TAKS in Reading. The chi-squared test revealed a statistically significant difference in the rate
of passing (χ² = 3.086, p = 0.046, Cramer’s V = 0.11), in favor of the treatment group with
an average passing rate of 68.9%, as compared to the comparison group with an average
passing rate of 60.4%. No statistically significant difference was found in the rate of commended
performance (p > 0.05), with an average of 7.8% in the treatment group and 7.4% in the
comparison group. Similarly, the average rate of passing in the TAKS reading test was lower than
that in the TAKS science test for both treatment and comparison groups.
DIBELS. Results from the ANCOVA (with pre-test score as the covariate and post-test
score as the dependent variable) revealed that, although all students (n = 246) made
statistically significant gains from the beginning to the end of the school year, a statistically
significant difference was observed, F = 37.26, p < 0.001, partial eta squared = 0.134, with treatment
students outperforming their comparison peers on the post-test by 12 points at the end of fifth
grade, after adjustment for pre-test performance levels at the beginning of fifth grade.
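The quantities reported for the ANCOVA (adjusted group difference, F, and partial eta squared) can be illustrated with the classical single-covariate formulas, which assume a common within-group slope. The sketch below is not the authors' analysis script; the function name and all scores are hypothetical.

```python
from statistics import mean


def ancova(pre, post, group):
    """One-way ANCOVA with one covariate, assuming a common within-group slope."""
    labels = sorted(set(group))
    n, k = len(post), len(labels)
    grand_x, grand_y = mean(pre), mean(post)

    # Total sums of squares and cross-products, ignoring group membership.
    sxx_t = sum((x - grand_x) ** 2 for x in pre)
    syy_t = sum((y - grand_y) ** 2 for y in post)
    sxy_t = sum((x - grand_x) * (y - grand_y) for x, y in zip(pre, post))

    # Pooled within-group sums of squares and cross-products.
    sxx_w = syy_w = sxy_w = 0.0
    group_means = {}
    for g in labels:
        xs = [x for x, lab in zip(pre, group) if lab == g]
        ys = [y for y, lab in zip(post, group) if lab == g]
        mx, my = mean(xs), mean(ys)
        group_means[g] = (mx, my)
        sxx_w += sum((x - mx) ** 2 for x in xs)
        syy_w += sum((y - my) ** 2 for y in ys)
        sxy_w += sum((x - mx) * (y - my) for x, y in zip(xs, ys))

    slope = sxy_w / sxx_w                          # pooled within-group slope
    ss_error = syy_w - sxy_w ** 2 / sxx_w          # residual SS, full model
    ss_total_adj = syy_t - sxy_t ** 2 / sxx_t      # residual SS, covariate only
    ss_group = ss_total_adj - ss_error             # adjusted group effect
    f_stat = (ss_group / (k - 1)) / (ss_error / (n - k - 1))
    adjusted = {g: my - slope * (mx - grand_x)
                for g, (mx, my) in group_means.items()}
    return {
        "adjusted_means": adjusted,
        "F": f_stat,
        "partial_eta_squared": ss_group / (ss_group + ss_error),
    }


# Hypothetical pre-test and post-test scores (not study data).
pre = [2, 3, 4, 5, 1, 2, 3, 4]
post = [12.1, 12.9, 14.1, 14.9, 1.1, 1.9, 3.1, 3.9]
group = ["T"] * 4 + ["C"] * 4
result = ancova(pre, post, group)
difference = result["adjusted_means"]["T"] - result["adjusted_means"]["C"]
print(round(difference, 2), round(result["partial_eta_squared"], 3))
```

The adjusted difference is the gap between group post-test means after each mean is shifted to the common pre-test average, which is the sense in which a group can be said to outperform another "after adjustment for pre-test performance."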
Table 5
Difference between treatment (n = 166) and comparison group (n = 80) in District Benchmark Reading
Tests
        Passing (%)                            Commended Performance (%)
Test    Treatment  Comparison  Effect Sizeᵃ    Treatment  Comparison  Effect Sizeᵃ
1       44.1       33.3        0.103           1.9        3.8         −0.059
2       58.7       32.8        0.238***        7.7        0           0.157*
4       49.3       31.9        0.163*          5.9        1.4         0.138
6       74.8       54          0.212**         16.5       9.5         0.096
ᵃPositive effect size indicates higher performance in treatment condition. Not all groups were given test 3. Test 5 was
optional as it was given in the same time period as the TAKS test. *p < 0.05. **p < 0.01. ***p < 0.01.
Limitation
This study was, of course, subject to several limitations. Given that the purpose
of this current study was to evaluate the effect of the first year of implementation, we did not
present the results using a multi-level structure. One reason was that the fidelity measure
indicated a fairly consistent practice among treatment classrooms. In a related note, we had a
quite small sample size at the teacher level (n = 12) or school level (n = 4), which results in
a potential limitation of power at the teacher or school level. In order to increase the
power, we would have had to conduct randomization at the section or teacher level, which,
however, would have resulted in contamination of the intervention at the teacher level,
since teachers would be teaching both treatment and comparison classrooms, or there could have
been both comparison and treatment teachers on the same campus. We analyzed the data with a
multi-level modeling approach and obtained similar results. Future studies with a larger sample
size at the level of randomization, preferably school, are highly desired to yield results that
can justify a causal-effect relationship. Further, as we plan for the next step of the larger
research project, we will take into consideration the rotation/classroom nature of science
instruction in these schools as students complete sixth grade, and we will include the hierar-
chical modeling with cross-classification in the final analysis. Another limitation is that we
compared conditions that differed in several enrichment components (e.g., family science
support, Saturday science activities, and extra technology resources in the classroom) that
may seem to confound the ability to compare the nature-of-the-instruction treatment.
Nevertheless, due to the low response rate (<10%) from parents who were required to rate
their child’s performance, and a 60% attendance rate for the university activities, it is unlikely
that we could measure whether these enrichment components directly influenced benchmark scores.
Discussion and Conclusions
The purpose of our study was to evaluate the effectiveness of a literacy-integrated science
intervention with professional development on fifth grade ELLs’ science and reading literacy
achievement on accountability-based state assessments. Findings reveal that, on average,
the intervention produced higher academic achievement for treatment students in both science
and reading outcomes as reflected in the district-wide standards-based measures, which were
aligned with the materials taught in both treatment and comparison classrooms. Standardized
effect sizes were in the small-to-moderate range, with larger magnitude in science than in reading.
More specifically, in the science benchmark tests, the treatment group demonstrated
a better understanding of the different science topics in 3 out of 5 benchmark tests.
Furthermore, a considerably larger percentage of treatment students performed above the state
passing standards with a thorough understanding of the TEKS in 4 out of 5 benchmark tests.
These findings, consistent with previous studies (e.g., Amaral et al., 2002; August et al.,
2009; Lee et al., 2008), suggest that inquiry approaches to teaching and learning science,
rather than teaching strategies in high needs schools in an attempt to raise test scores, can
promote ELLs’ performance on standardized and high-stakes achievement tests.
In the reading benchmark tests, treatment students demonstrated a higher percentage of
passing in 3 out of the 4 tests, suggesting satisfactory performance at or above the
state passing standard and a sufficient understanding of the state reading curriculum. In addition,
although on average treatment and comparison students acquired a similar level of sufficient
mastery of the state curriculum in science achievement, more students from the
treatment classrooms met the state standard in reading, with a moderate effect size of 0.11.
Finally, results also show that the intervention produced positive gains for treatment students
in ORF as reflected in a standardized measure, with a moderate standardized effect size
(0.134). These findings corroborated current research findings, such as those by Amaral et al.
(2002) and Lee et al. (2005), regarding the positive effect of integrating literacy with science
instruction on literacy outcomes for ELLs. Our findings, plus those of other researchers
mentioned, reveal that literacy integration in the content areas is a promising solution to the
challenge of holding all students, including ELLs, accountable not just on reading assessments
but on content-area (i.e., science) assessments as well (Maerten-Rivera et al., 2010).
In addition, the results of our study underscore the importance of implementing direct and
explicit vocabulary instruction that yields positive results within content-area instruction for
ELLs.
We observed, nonetheless, that the differences between treatment and comparison groups
in science achievement were more pronounced for the benchmark tests than for the TAKS.
These findings seem to align with Ruiz-Primo et al.’s (2002) conclusion that proximity of
assessment matters. They suggested that close and proximal assessments (the benchmark tests
in our study) were more sensitive in detecting students’ improvement, while such
sensitivity was less evident in distal assessments (the
TAKS in our study).
One might question the heavy focus on English science vocabulary building in this
science intervention. Existing literature indicates that, compared to their monolingual English
peers, ELLs lack not only breadth of vocabulary but also depth of vocabulary
knowledge (August, Carlo, Dressler, & Snow, 2005; Carlo et al., 2004). As a result,
these ELLs are less able to comprehend English text at grade level, and to perform on aca-
demic assessment in English, and they may be at risk of being diagnosed as learning disabled,
when in fact their limitation is due to limited English vocabulary and poor comprehension
(August et al., 2005). In his summary of the work of the National Reading Panel (NRP) on
reading comprehension instruction, Kamil (2004) asserted “vocabulary seems to occupy an
important middle ground in learning to read” (p. 215). Likewise, Hickman, Pollard-Durodola,
and Vaughn (2004) reported that, for ELLs, of primary importance in academic language development
are the “related elements of vocabulary and comprehension” (p. 720). In addition,
level of English language proficiency and literacy (particularly size of academic vocabulary)
is positively related to academic achievement in English among ELLs (Fernandez & Nielsen,
1986; Lindholm-Leary, 2001; Rumberger & Larson, 1998). Therefore, the focus on strength-
ening science vocabulary among ELLs is deemed necessary to enlarge and expand their learn-
ing of the academic language in order to solve science problems competently and to move
forward as citizen scientists. Our hypothesis was also supported by the findings of the fifth
grade implementation that more statistically significant differences were identified in the sci-
ence benchmark tests in favor of treatment group. It is also worth noting that the average
rates of passing and commended performance were higher in science than in reading for both
benchmark tests and TAKS in both groups. Such a finding corroborates that of Lee et al.
(2005) who used responses to expository writing samples to assess literacy achievement.
They noted that though there were statistically significant increases and effect magnitudes on
all measures of science and literacy for the treatment group at both grade levels, stronger effect
magnitudes for science achievement over literacy achievement were found. Lee et al. recommended,
and we agree, that more emphasis be placed on literacy development in tandem with science
learning in future studies.
One may argue the issue of time-on-task as well as the multiple components in the
intervention group which could explain the differences in performance. However, it should be
recognized that students in both the intervention (85 minutes) and comparison conditions
(with varied length of time between 80 and 90 minutes) had a similar number of minutes of
science instruction during the day, and the district has had a long-standing positive reputation
for typical practice in their science education programs. Our point, from an instructional
perspective, is how to best allocate and utilize those minutes so as to provide quality science
instruction at school. Further, it was the multiple components in our intervention that differed
from the typical practice and served as the contrast, not the time; nevertheless, these compo-
nents were all embedded within the daily inquiry-based 5-E lessons, as compared to weekly
5-E lessons observed in comparison classrooms. The end result derived from our study is that
for students with limited English proficiency who are placed at a disadvantage of learning
academic content in a language that is less familiar, effective practices in science instruction
should include an explicitly structured (specific to what the teacher should teach), consistent
(the format of lessons is consistent and components become understood by and commonplace
to students), inquiry-based (5-E model hands-on activities), standards-aligned curriculum that is
integrated with language literacy and with efficiently utilized instructional time each school
day. Educators are encouraged to introduce and apply such effective practices in science
instruction in order to maximize ELLs’ achievement on standardized assessments while
promoting students’ language and content literacy critical to their ongoing academic success
beyond middle school and into high school and college.
Finally, as addressed in the limitations, we are aware that the effect of the enrichment
components of the intervention is difficult to measure. However, enrichment activities with
visits to science labs at universities or private labs are encouraged to pique interest; enrichment
activities with parental involvement take-home packets are encouraged to engage
students and parents in conversations around science; and additional technology integration in
lessons is encouraged to support science concepts and maintain interest and discussion
around those concepts.
Little is known at the middle school and beyond about the effective instruction for
language minority students who are learning English and the content science knowledge in
English at the same time (August & Hakuta, 1997; Lee & Luykx, 2006). The larger research
project from which the current study was derived is, to the best of our knowledge, the only
longitudinal quasi-experimental design with a science intervention of a full academic year among
fifth grade ELLs and low-SES English proficient students; and therefore, the results have the
potential to impact middle school level science for ELLs where there is the combination of
the teaching of science with integrated English language development skills. We found that
not only did a literacy-integrated science curriculum and instruction matter, but so did the other
component of the intervention, that is, professional development for teachers, as Lee
(2005) also found, in terms of melding teachers’ understanding of science content with
teachers’ application of techniques for teaching English language acquisition and literacy
skills.
NCLB urged educators and researchers to seek out answers to what predicts and fosters
students’ school progress (Abedi, 2004; Maerten-Rivera et al., 2010). Part of the answer
to promoting ELLs’ success on standardized assessments could be, as Abedi (2004) put it,
“improving teacher capacity,” that is, to train teachers to become “well qualified in both
language development and content, each of which plays a crucial role in [ELL] student
achievement” (p. 12). The implications of this study, particularly related to professional
development for teachers who have ELLs and economically disadvantaged students in their
classrooms, are to emphasize (a) the development of science academic vocabulary through
listening, speaking, reading, and writing activities, (b) the implementation of leveled questioning
to help students develop oral language and correct misconceptions, and (c) the integration
of second language learning theory into the science classrooms. Because achieving
outcomes similar to those reported in our study requires a substantial amount of time and resource
commitment, we recommend that districts consider at least 4 hours of release time per month
for teachers to develop skills outlined in our study that promote student learning in science
and literacy.
Our findings may impact the field of science education and bilingual/ESL education in
that if this model works under controlled conditions, it can be sustainable and transferable to
other settings. The advantage of our intervention is that it can be implemented in schools by
enhancing and modifying traditional practice, provided that the teachers are given quality
professional development on content-area strategies and time to develop their skills. If school
district personnel introduce non-intrusive, consistent, yet structured literacy-integrated science
interventions for ELLs with supported professional development, then teachers may be open
to such practices to improve ELLs’ academic English proficiency and science achievement.
Consequently, ELLs, who have not had the opportunity to develop specialized academic
language skills and who, thus, have limited access to learning science skills and concepts,
then may be provided opportunities to experience success in science achievement.
The authors thank the participating students, parents, teachers, and school administrators
for their cooperation.
References
Abedi, J. (2004). Accommodations for students with limited English proficiency in the National
Assessment of Educational Progress. Applied Measurement in Education, 17(4), 371–392.
American Educational Research Association. (2000). AERA position statements: High-stakes testing
in pre-k education. Retrieved from http://aera.net/policyandprograms/?id=378
Anderson, K. J. B. (2012). Science education and test-based accountability: Reviewing their
relationship and exploring implications for future policy. Science Education, 96(1), 104–129.
Anstrom, K., DiCerbo, P., Butler, F., Katz, A., Millet, J., & Rivera, C. (2010). A review of literature
on academic English: Implications for K-12 English language learners. Arlington, VA: The George
Washington University Center for Equity and Excellence in Education.
Amaral, O., Garrison, L., & Duron-Flores, M. (2006). Taking inventory. Science and Children,
43(4), 30–33.
Amaral, O. M., Garrison, L., & Klentschy, M. (2002). Helping English learners increase achieve-
ment through inquiry-based science instruction. Bilingual Research Journal, 26(2), 213–239.
August, D., Branum-Martin, L., Cardenas-Hagan, E., & Francis, D. (2009). The impact of an
instructional intervention on the science and language learning of middle grade English language learn-
ers. Journal of Research on Educational Effectiveness, 2, 345–376. DOI: 10.1080/19345740903217623
August, D., Carlo, M., Dressler, C., & Snow, C. (2005). Accelerating English academic vocabulary:
An intervention design for Spanish literate children acquiring English as a second language. Learning
Disabilities Research and Practice, 20(1), 50–57. DOI: 10.1111/j.1540-5826.2005.00120.x
August, D., & Hakuta, K. (1997). Improving schooling for language-minority children: A research
agenda. Washington, DC: National Research Council.
Beck, I. L., McKeown, M. G., & Kucan, L. (2002). Bringing words to life: Robust vocabulary
instruction. New York: Guilford Press.
Bentley, M. L. (2004). ELLs: Children left behind in science class. Paper presented at the Annual
Meeting of the School Science and Mathematics Association. Atlanta, Georgia.
Bruenig, N. A. (1998). Measuring the instructional use of Spanish and English in elementary
transitional bilingual classrooms. Dissertation Abstracts International, 59(04), 1046A.
Bruner, J. (1996). The culture of education. Cambridge, MA: Harvard University Press.
Burk, J., Johnson, D. M., & Whitley, J. (2005). Validity of the Texas Assessment of Knowledge
and Skills (TAKS). The Journal of Border Education Research, 4(2), 29–39.
Bybee, R. W. (1996). The contemporary reform of science education. In J. Rhoton & P. Bowers
(Eds.), Issues in science education (pp. 1–14). Arlington, VA: National Science Education Leadership
Association.
Byrnes, D. A., Kiger, G., & Manning, M. L. (1997). Teachers’ attitudes about language diversity.
Teaching and Teacher Education, 13(6), 637–644.
Carlo, M., August, D., McLaughlin, B., Snow, C., Dressler, C., Lippman, D., . . . White, C. E.
(2004). Closing the gap: Addressing the vocabulary needs of English language learners in bilingual and
mainstream classrooms. Reading Research Quarterly, 39(2), 188–215.
Clewell, B. C., de Cohen, C. C., & Murray, J. (2007). Promise or peril? NCLB and the education
of ELL students. Retrieved from http://www.urban.org/UploadedPDF/411469_ell_students.pdf
Davies, S., O’Malley, K., & Wu, B. (2007, April). Establishing measurement equivalence of trans-
adapted reading and mathematics tests. Paper presented at the annual meeting of the American
Educational Research Association, Chicago.
Fernandez, R., & Nielsen, F. (1986). Bilingualism and Hispanic scholastic achievement: Some
baseline results. Social Science Research, 15, 43–70.
Freeman, Y. S., & Freeman, D. E. (2008). Academic language for English language learners and
struggling readers. Portsmouth, NH: Heinemann.
Geier, R., Blumenfeld, P. C., Marx, R. W., Krajcik, J. S., Fishman, B., Soloway, E., & Clay-
Chambers, J. (2008). Standardized test outcomes for students engaged in inquiry-based science curricula
in the context of urban reform. Journal of Research in Science Teaching, 45(8), 922–939. DOI: 10.1002/
tea.20248
Gelman, R., & Brenneman, K. (2004). Science learning pathways for young children. Early
Childhood Research Quarterly, 19(1), 150–158. DOI: 10.1016/j.ecresq.2004.01.009
Good, R. H., & Kaminski, R. A. (Eds.). (2002). Dynamic indicators of basic early literacy skills
(6th ed.). Eugene, OR: Institute for the Development of Educational Achievement.
Good, R. H., Kaminski, R. A., Smith, S., & Bratten, J. (2001). Technical adequacy of second grade
DIBELS Oral Reading Fluency passages (Tech. Rep. No. 8). Eugene: University of Oregon.
Graves, M. F. (2000). A vocabulary program to complement and bolster a middle grade comprehension
program. In B. M. Taylor, M. F. Graves, & P. van den Broek (Eds.), Reading for meaning:
Fostering comprehension in the middle grades (pp. 116–135). Newark, DE: International Reading
Association.
Hanushek, E. A., & Raymond, M. E. (2005). Does school accountability lead to improved student
performance? Journal of Policy Analysis and Management, 24(2), 297–327. DOI: 10.1002/pam.20091
Hickman, P., Pollard-Durodola, S., & Vaughn, S. (2004). Storybook reading: Improving vocabulary
and comprehension for English language learners. The Reading Teacher, 57, 720–730.
Hu, W. (2012). 10 States are given waivers from education law. The New York Times. Retrieved
from http://www.nytimes.com/2012/02/10/education/10-states-given-waivers-from-no-child-left-behind-law.html?_r=1
Huerta, M., & Jackson, J. (2010). Connecting literacy and science to increase achievement for
English language learners. Early Childhood Education Journal, 38, 205–211.
DOI: 10.1007/s10643-010-0402-4
Hyland, K. (2007). Genre pedagogy: Language, literacy and L2 writing instruction. Journal of
Second Language Writing, 16, 148–164.
Irby, B. J., Tong, F., Lara-Alecio, R., Meyer, D., & Rodriguez, L. (2007). The critical nature of
language of instruction compared to observed practices and high stakes tests in transitional bilingual
classroom. Research in the Schools, 14(2), 27–36.
Kamil, M. L. (2004). Vocabulary and comprehension instruction. In P. McCardle & V. Chhabra
(Eds.), The voice of evidence in reading research (pp. 213–234). Baltimore, MD: Paul H. Brookes
Publishing Co.
Kieffer, M. J., Lesaux, N., Rivera, M., & Francis, D. J. (2009). Accommodations for English language
learners taking large-scale assessments: A meta-analysis on effectiveness and validity. Review of
Educational Research, 79(3), 1168–1201. DOI: 10.3102/003465430933249
Kieffer, M. J., Lesaux, N. K., & Snow, C. E. (2008). Promises and pitfalls: Implications of NCLB
for identifying, assessing, and educating English language learners. In G. L. Sunderman (Ed.), Holding
NCLB accountable: Achieving accountability, equity, and school reform (pp. 57–74). Thousand Oaks,
CA: Corwin Press.
Klentschy, M. (2005). Science notebook essentials: A guide to effective notebook components.
Science and Children, 43, 24–27.
Knipper, K. J., & Duggan, T. J. (2006). Writing to learn across the curriculum: Tools for compre-
hension in content area classes. The Reading Teacher, 59(5), 462–470. DOI: 10.1598/RT.59.5.5
Lee, O. (2005). Science education with English language learners: Synthesis and research agenda.
Review of Educational Research, 75(4), 491–521. DOI: 10.3102/00346543075004491
Lee, O., Deaktor, R. A., Hart, J. E., Cuevas, P., & Enders, C. (2005). An instructional intervention’s
impact on the science and literacy achievement of culturally and linguistically diverse elementary
students. Journal of Research in Science Teaching, 42(8), 857–887. DOI: 10.1002/tea.20071
Lee, O., & Luykx, A. (2006). Science education and student diversity. New York, NY: Cambridge
University Press.
Lee, O., Maerten-Rivera, J., Penfield, R., LeRoy, K., & Secada, W. G. (2008). Science
achievement of English language learners in urban elementary schools: Results of a first-year profes-
sional development intervention. Journal of Research in Science Teaching, 45(1), 31–52. DOI: 10.1002/
tea.20209
Lindholm-Leary, K. J. (2001). Dual language education. Avon, UK: Multilingual Matters.
Linn, R. L. (2000). Assessments and accountability. Educational Researcher, 29(4), 4–15.
Liu, O., Lee, H., & Linn, M. C. (2011). Measuring knowledge integration: Validation of four-year
assessments. Journal of Research in Science Teaching, 48(9), 1079–1107.
Lynch, S., Kuipers, J., Pyke, C., & Szesze, M. (2005). Examining the effects of a highly rated
science curriculum unit on diverse students: Results from a planning grant. Journal of Research in
Science Teaching, 42(8), 921–946. DOI: 10.1002/tea.20080
Luykx, A., Lee, O., Mahotiere, M., Lester, B., Hart, J., & Deaktor, R. (2007). Cultural and home
language influences on children’s responses to science assessments. Teachers College Record, 109(4),
897–926.
Maerten-Rivera, J., Myers, N., Lee, O., & Penfield, R. (2010). Student and school predictors of
high-stakes assessment in science. Science Education, 94, 937–962.
Merino, B. J., & Scarcella, R. (2005). Teaching science to English learners. Invited Essay.
Language Minority Research Institute Newsletter, 14(4), 1–7.
Meyer, L. (2000). Barriers to meaningful instruction for English learners. Theory Into Practice,
39(4), 228–236.
McCloskey, M. (2002). President’s message: No child left behind. TESOL Matters, 12, 4. Retrieved
from http://www.tesol.org/pubs/articles/2002/tm12-4-04.html
McNeil, L. M. (2000). Contradictions of reform: The educational costs of standardization. New
York: Routledge.
Minicucci, C. (1996). Learning science and English: How school reform advances scientific learning
for limited English proficient middle school students. Santa Cruz, CA: National Center for Research
on Cultural Diversity and Second Language Learning. CREDE.
Naglieri, J. A. (1997). Naglieri nonverbal ability test. San Antonio, TX: The Psychological Corp.
National Center for Education Statistics. (2010). The Condition of Education 2010 (NCES 2010-
028). Washington, DC: U.S. Department of Education.
National Institute of Child Health and Human Development. (2000). Report of the National
Reading Panel. Teaching children to read: An evidence-based assessment of the scientific research litera-
ture on reading and its implications for reading instruction (NIH Publication No. 00-4769). Washington,
DC: U.S. Government Printing Office.
National Research Council. (1996). National Science Education Standards. Washington, DC: The
National Academy Press.
No Child Left Behind Act. (2002). Pub. L. No. 107-110. Washington, DC.
Palumbo, A., & Sanacore, J. (2009). Helping struggling middle school literacy learners achieve
success. The Clearing House, 82(6), 275–280. Retrieved from ProQuest Education Journals. (Document
ID: 1808509731).
Penfield, R. D., & Lee, O. (2010). Test-based accountability: Potential benefits and pitfalls of
science assessment with student diversity. Journal of Research in Science Teaching, 47(1), 6–24.
Pray, L., & Monhardt, R. (2009). Sheltered instruction techniques for ELLs: Ways to adapt science
inquiry lessons to meet the academic needs of English language learners. Science and Children, 46(7),
34–38.
Roehrig, A. D., Petscher, Y., Nettles, S. M., Hudson, R. F., & Torgesen, J. K. (2008). Accuracy of
the DIBELS Oral Reading Fluency measure for predicting third grade reading comprehension outcomes.
Journal of School Psychology, 46(3), 343–366.
Rosebery, A. S., Warren, B., & Conant, F. R. (1992). Appropriating scientific discourse: Findings
from language minority classrooms. The Journal of the Learning Sciences, 2(1), 61–94.
Ruiz-Primo, M. A., Shavelson, R. J., Hamilton, L., & Klein, S. (2002). On the evaluation of sys-
temic science education reform: Searching for instructional sensitivity. Journal of Research in Science
Teaching, 39(5), 369–393. DOI: 10.1002/tea.10027
Rumberger, R. W., & Larson, K. A. (1998). Toward explaining differences in educational achieve-
ment among Mexican-American language-minority students. Sociology of Education, 71(1), 68–92.
Rupley, W. H. (2009). Linking reading and science: Focusing on a broader base of understanding.
Reading Psychology, 31(3), 203–205. DOI: 10.1080/02702710903241389
Rupley, W. H., & Slough, S. W. (2010). Building prior knowledge and vocabulary in science in the
intermediate grades: Creating hooks for learning. Literacy Research and Instruction, 49, 99–112.
Santau, A. O., Maerten-Rivera, J. L., & Huggins, A. C. (2011). Science achievement of
English language learners in urban elementary schools: Fourth-grade students achievement results from
a professional development intervention. Science Education, 95, 771–793.
Texas AFT. (2008). Beyond TAKS (and NCLB): Putting Texas school accountability back on track.
Retrieved from http://docs.texasaft.org/legislative/TestReformForumPaper100208.pdf
Texas Education Agency. (2006). TAKS performance level descriptors. Austin, TX: Author.
Retrieved from http://www.tea.state.tx.us/index3.aspx?id=3222&menu_id=793
Texas Education Agency. (2008). Texas education agency technical report 2006–2007. Retrieved
from http://www.tea.state.tx.us/index3.aspx?id=4326&menu_id=793
http://www.tea.state.tx.us/student.assessment/resources/techdigest/
Texas Education Agency. (2010a). 2009–10 Academic Excellence Indicator System. Retrieved
from http://ritter.tea.state.tx.us/perfreport/aeis/2010/state.html
Texas Education Agency. (2010b). Accountability manual: The 2010 accountability rating system
for Texas public schools and school districts. Retrieved from http://ritter.tea.state.tx.us/perfreport/
account/2010/manual/manual.pdf
Texas Education Agency. (2011a). Accountability manual: The 2011 accountability rating system
for Texas public schools and school districts. Retrieved from http://ritter.tea.state.tx.us/perfreport/
account/2011/manual/manual.pdf
Texas Education Agency. (2011b). 2011 Adequate yearly progress (AYP) guide. Retrieved from
http://ritter.tea.state.tx.us/ayp/2011/index.html
Texas Education Code. (1995). 74th Leg., Ch. 260, § 29.063.
Thomas, W. P., & Collier, V. P. (2002). A national study of school effectiveness for language
minority students’ long-term academic achievement. Final report. Washington, DC: Center for Research
on Education, Diversity & Excellence.
Tong, F., Lara-Alecio, R., Irby, B. J., Mathes, P., & Kwok, O. (2008). Accelerating early academic
oral English development in transitional bilingual and structured English immersion programs.
American Educational Research Journal, 45(4), 1011–1044.
Tong, F., Irby, B. J., Lara-Alecio, R., Yoon, M., & Mathes, P. G. (2010). Hispanic English learners’
responses to longitudinal English instructional intervention and the effect of gender: A multilevel analy-
sis. Elementary School Journal, 110(4), 542–566.
U.S. Department of Education. (2009). Standards and assessment group accountability group.
Ed.gov. Retrieved from http://www2.ed.gov/admins/lead/account/saa.html
Waldman, C. A., & Crippen, K. J. (2009). Integrating interactive notebooks: A daily learning cycle
to empower students for science. The Science Teacher, 76, 51–55.
Wang, L., Beckett, G. H., & Brown, L. (2006). Controversies of standardized assessment in school
accountability reform: A critical synthesis of multidisciplinary research evidence. Applied Measurement
in Education, 19(4), 305–328.
Warren, B., Ballenger, C., Ogonowski, M., Rosebery, A., & Hudicourt-Barnes, J. (2001).
Rethinking diversity in learning science: The logic of everyday language. Journal of Research in
Science Teaching, 38(5), 529–552.
Watkins, N. M., & Lindahl, K. M. (2010). Targeting content area literacy instruction to meet the
needs of adolescent English language learners. Middle School Journal, 41(3), 23–32.
Wellington, J., & Osborne, J. (2001). Language and literacy in science education. Buckingham,
England: Open University Press.
Woodcock, R. W. (1991). Woodcock Language Proficiency Battery—Revised, English and Spanish
Forms: Examiner’s manual. Itasca, IL: Riverside.