
UNSTANDARDIZED MEASURES

A Cross-Case Analysis of Test Prep in Two Urban High-Needs Fourth-Grade Classes

Ted Kesler
Queens College, City University of New York

!"#$%!&$This article presents a cross-case analysis of two fourth-grade teachers’ instruction while preparing their stu-dents for an English language arts test. Both teacherstaught in high-needs urban public schools and wereidentified as effective teachers of balanced literacythrough a multiple nomination process. This article usessituated literacy theory to consider the research ques-tion, How did the localized contexts influence bothteachers’ instruction during test prep? The author fo-cuses on the theme of standardization, which emergedfrom the cross-case analysis, and presents findings infour categories: (a) classroom conditions, (b) test mate-rials, (c) test administration, and (d) test results. Find-ings show key distinctions in the localized contexts thatdirectly influenced both teachers’ test prep. These dis-tinctions challenge the construct of standardization thatconstitutes high-stakes tests and provide a far more com-plex view of school contexts that terms such as “urban,high-needs” conflate, with implications for research andpolicy.

The Elementary School Journal, Volume 113, Number 4. © 2013 by The University of Chicago. All rights reserved. 0013-5984/2013/11304-0002 $10.00

DURING the 2006–2007 school year, I investigated how two fourth-grade teachers in urban high-needs public schools prepared their students for the high-stakes New York State English Language Arts exam (the ELA). Several characteristics of urban high-needs public schools in federal and state education policies defined both schools, including high poverty levels, based on 70% or more of students receiving free or reduced-price lunch; high percentages, based on national averages (nces.ed.gov), of minorities and English language learners; and a higher percentage, based on national averages (nces.ed.gov), of teachers who had taught 5 years or less. Both schools were located in poor, noisy, densely populated, high-volume traffic areas, with few nearby public parks. I used a multiple nomination process (Allington & Johnston, 2002, p. 32) to select highly effective teachers of literacy and conducted ethnographic research to discover what counted as literacy instruction in their classrooms, what counted as test preparation (test prep), and what were the larger social, cultural, and historical contexts that shaped what the teachers did. In this article, I focus specifically on the test prep portion of both teachers' instruction. I asked the following research question: How did the localized contexts influence both teachers' instruction during test prep?

The Institutional Context: Regulating Literacy

In the two schools where the study was conducted, the ELA exam was definitely high stakes—especially for the fourth grade, since the fourth- (and eighth-) grade test scores are reported in the local newspapers.1 Holdover policies, teacher evaluations, and school rankings and status all depend on test scores. Another reason that fourth grade was considered more intense was that, unlike 2 days of testing in third and fifth grades,2 the fourth-grade exam took 3 days. On the first day, there were multiple-choice passages; on the second day, short and extended written responses to an oral listening passage; and on the third day, short and extended written responses to two reading passages (students were directed to use information from both passages in their extended response). In this study, I paid attention to all the visible practices that both teachers used to prepare their students for this high-stakes exam.

Balanced Literacy

In the spring of 2003, the New York City Department of Education (NYCDOE) declared balanced literacy as the official literacy curriculum framework for all elementary and middle schools. This announcement was part of the NYCDOE's Children First initiative, in response to No Child Left Behind (NCLB), to unify and stabilize the school system. Balanced literacy, however, is not a program, but rather calls for a balance of skills-based and holistic-based approaches that together aim to meet the needs of students to learn to read and write (Applebee, 2003; Au, 2003; Freppon & Dahl, 1998; Pearson & Raphael, 1999).

At the time of this study, the New York City Department of Education (2003) provided a conceptual overview of balanced literacy that emphasized the following components: (a) independent reading for at least half an hour daily; (b) writing workshop for at least half an hour daily; (c) systematic phonemic awareness, phonics, spelling, and word study instruction; (d) direct and explicit instruction daily, using mini-lessons, small-group work, and one-on-one conferences; (e) interactive read-aloud daily; (f) shared reading and shared and interactive writing; and (g) literacy structures such as literature circles, book clubs, guided reading groups, and thematic study groups. The NYCDOE maintained that these components, when implemented collectively and in balance, enabled students to acquire "the habits and techniques of accomplished readers and writers" (NYCDOE, 2003, p. 26).

!"#$%"&%'&()*& +*%#!'*# ! ,-.

Page 3: Untitled

But balanced literacy is more than a set of routines and practices. How it is implemented involves "ways of being in the world . . . ways of acting, interacting, feeling, believing, valuing, and using various sorts of objects, symbols, tools, and technologies" (Gee, 2005, p. 7) that define what Gee calls a Discourse (with Gee's capitalization). Discourses are always linked to the power structures of institutions such as schools that enable particular kinds of literacy, restrict others, and normalize particular ways of showing competence, thus generating a hierarchy of successful and unsuccessful expressions of literacy in classrooms. This normalizing process masks the reality that teachers and students have a limited range of ways of enacting literate behaviors in their classrooms. It also masks the situated meanings in localized contexts of the ensuing literacy practices.

For example, although the NYCDOE framework does not emphasize how to organize and balance the component structures, in many schools, including the two schools of this study, two key organizing structures of balanced literacy practices in the upper elementary school grades were interactive read-aloud and reading and writing workshops. During interactive read-aloud, teachers generally gathered their students in a meeting area, read children's literature aloud, and engaged them in interactive discussion. Interactive read-aloud sessions generally lasted 20–30 minutes. Reading and writing workshops generally lasted 45–50 minutes. Workshop time began with a 15-minute mini-lesson that provided explicit teaching of a particular skill and strategy. During work time, students were expected to apply the focus of this and other mini-lessons to their ongoing reading and writing work. Workshops concluded with a 5–10-minute sharing of students' work. Reading and writing workshops were organized by units of study of approximately 4 weeks' duration. The demands of teaching units of study made them the central components of the balanced literacy framework. Also, in the weeks preceding the high-stakes test, test sophistication as a genre study became the focus during interactive read-aloud and both reading and writing workshops, as some literature on test prep recommends (Hornof, 2008; Kontovourki & Campis, 2010; Santman, 2002).

The balanced literacy units of study were intended to both accommodate and align skills and strategies with the demands of the ELA exam. This design was based on the belief in aligning test preparation with the balanced literacy curriculum and providing students with preparation in the test format, otherwise known as "test-wiseness" (Millman, Bishop, & Ebel, 1965), consistent with the literature on test preparation (e.g., Calkins, Montgomery, Santman, & Falk, 1998; Fountas & Pinnell, 2001; Guthrie, 2002; Hornof, 2008; Miyasaka, 2000). Guthrie (2002) stated, "Assessment tasks should not become the curriculum. It is not the format of the test that should be aligned with teaching, but the objectives that should be aligned with teaching" (p. 387). This belief was also consistent with statements by the New York State Education Department, which collaborated with CTB/McGraw-Hill to create the exam. As explicitly stated in its core curriculum guides, the ELA assessments focus on students' actual performances as readers, writers, and listeners and are directly connected to curriculum and instructional practice through the performance indicators, emphasizing that if teachers teach "effectively," minimal preparation for the test would be required (New York State Education Department, 2005).

Gee (2005) discusses fluid boundaries among discourses. He uses the metaphor of weaving together multiple strands of multiple discourses that, when enacted and recognized often enough, become "a single Discourse whose hybridity may ultimately be forgotten" (p. 30). The alignment that the literature suggests calls for the blending of practices across the two discourses of balanced literacy and test prep, generating what Guthrie (2002) called integrated reading-language arts (p. 376). Performance on the test, Guthrie stated, "is highly influenced by general reading competence gained through substantial time spent in a coherent curriculum that is aligned with the test" (p. 377). The balanced literacy framework in both schools aligned with test preparation in several ways: developing students' stamina for reading and writing; giving them experience with both narrative and nonnarrative text structures, and with both fiction and nonfiction; writing both narratives and expository essays; and writing ELA-like responses to the read-aloud texts. Moreover, the test sophistication genre study relied on the same balanced literacy structures. Thus, as I observed literacy events in their classrooms, balanced literacy and test prep practices often blended together. In this article, I sometimes refer to these practices as blended discourse.

I define test prep as the teachers' deliberate teaching to the ELA exam, which included these blended practices. For example, from the start of the school year, as deliberate test prep for the ELA exam, Jennifer (all names are pseudonyms) guided her students to write short essay responses to prompts that she provided immediately following their interactive read-aloud time. Moreover, during the test sophistication unit of study, test prep was the literacy work in both classes. In this study, I focus on the instruction during this 5-week test prep unit.

Theoretical Framework

For this article, I use a theory of situated literacy (Barton & Hamilton, 1998; Barton, Hamilton, & Ivanic, 2000; Gee, 1996; Street, 1995) to guide the cross-case analysis of the teachers' test prep instruction. Street (1995) distinguished an autonomous, universal model versus an ideological model of literacy, explaining that in the autonomous model, "many individuals, often against their own experience, come to conceptualize literacy as a separate, reified set of 'neutral' competencies, autonomous of social context—and with the procedures and social roles through which this model of literacy is disseminated and internalized" (p. 114). Street cited Ogbu's definition of literacy as characteristic of schools, in which literacy is "synonymous with academic performance" and "the ability to read and write and compute in the form taught and expected in formal education" (p. 107). This separate, reified set of neutral competencies, autonomous of social contexts, is especially dominant in test prep for the high-stakes, standardized ELA, with its underlying value placed on individual performance.

Conversely, an ideological model of literacy recognizes that literacy is inherently contextual. Thus, there is not one universal, autonomous form of literacy, but rather multiple literacies that are always situated in the social institutions of hierarchy and political power (Street, 1995). Gee (1996) used the distinction d/Discourse to distinguish language in use (discourse) from the richly contextual, social embeddedness of literacy with all its underlying values, beliefs, actions, interactions, and commitments (Discourse). Gee (2005) explained that "all life for all of us is just a patchwork of thoughts, words, objects, events, actions, and interactions in Discourses" (p. 7). Gee's conception of Discourse maintains that our ways of being, speaking, writing, and reading are intimately tied to our different discourse communities. Thus, literacy practices are infused with identity, and identity is situated in literacy practices. In this study, I began with the premise that when teachers prepare their students for standardized tests, they are teaching them how to function competently in a Discourse of language arts testing, and how to take on the situated identities of competent test takers, including how to position their bodies, how to enact test sophistication, how to sustain their attention, how to read for one right answer, how to avoid the test makers' tricks, and so on—beliefs, practices, and ways of speaking, listening, acting, valuing, and thinking that are particular to each context.

Lewis, Enciso, and Moje (2007) argued for new visions of sociocultural theory that address issues of power, identity, and agency. Particularly pertinent to this study are issues of power, since "systems of educational accountability built on high-stakes, standardized tests are in fact intended to increase external control over what happens in schools and classrooms" (Au, 2007, p. 264). Lewis et al. adapted a Foucauldian sense of power as "a field of relations that circulate in social networks rather than originating from some point of domination" (p. 4). In specific contexts, power is "produced in and through individuals as they are constituted in larger systems of power and as they participate in and reproduce those systems" (p. 4).

Similarly, Maybin (2000) explained the importance of oral language in the mediation of texts in literacy events for expressing these contextual relationships. Drawing on Foucauldian conceptions of discourse, Maybin explained that language interactions around texts have an immediate function in accomplishing bureaucratic or educational tasks. But they also serve to induct the individual into the discourses of wider social structures, which have specific consequences for people's positioning in relation to particular kinds of knowledge, their social relationships, and their sense of identity (p. 205). Foucault (1995) used the term surveillance to describe the coercive power of wider social structures on everyday practices in institutions such as schools. Maybin (2000) stated, "this suggests a dialectical relationship between structure and agency, and between micro- and macro-level contexts" (p. 198), which inevitably involves issues of power. Thus, a theory of situated literacy that tightly couples the dynamics of power in local contexts suggests a careful analysis of discourse. In this article, I consistently look at the literacy practices, events, and texts that constituted test prep in both teachers' classrooms and, through discourse analysis, their dialectical relationships to the macro-level contexts of their schools.

Review of the Literature

A review of the research on the impact of testing on teachers' beliefs and practices shows that the influence of high-stakes tests on teaching and learning is pervasive and complex. Based on this review, I put the findings into three categories: studies that generally advocate teaching to the test, studies that report teacher accommodation to testing, and studies that report the deleterious impact of testing. Abrams, Pedulla, and Madaus (2003) stated, "Regardless of one's position on this issue, it is impossible to deny that statewide testing policies influence classroom instruction and student learning" (p. 18).

Skrla and Scheurich (2001) and Koschoreck (2001) looked at the impact of high-stakes tests at the district level in Texas. Skrla and Scheurich (2001) studied four school districts, ranging in size from 8,000 to 50,000 students, focusing especially on each district's superintendent. Their findings showed how the superintendents took an antideficit orientation that pervaded their districts, "bringing their districts much, much closer to the democratic ideal that all of us hold dear—truly high and equitable academic success for literally all children" (p. 257). In the school district of 50,000 students, Koschoreck (2001) reported that the accountability system functioned as a tool for promoting equity. This led to a "paradigmatic coalescence" (p. 302), spurred by a visionary superintendent, for active participation among all district stakeholders that "led to a coherent image of a community working toward the common goal of success for all children" (p. 302).

At the classroom level, some case studies reported teachers preparing students for success on the standardized test while providing a more comprehensive literacy program (e.g., Langer, 2001; Williamson, Bondy, Langley, & Mayne, 2005). Both Wollman-Bonilla (2004) and Wolf and Wolf (2002) reported teachers' success with the teaching of writing. Through a study of persuasive writing in a third- and a fourth-grade class, Wollman-Bonilla concluded that "teachers needn't teach to the test in a narrow, evaluation-focused manner; rather, they can develop tools that move students towards test readiness while keeping writing principles in focus" (p. 510). Wolf and Wolf studied the writing instruction of six exemplary fourth- and seventh-grade teachers in Kentucky and Washington. The authors revealed a few key characteristics of these teachers' writing instruction: (a) they always presented larger purposes for writing than just a narrow focus on the test, (b) they gave their students "substantive opportunities" for writing (p. 231), and (c) they had in-depth understanding of the required performance assessments. The researchers concluded that, using exemplary practices, these teachers managed to prepare their students for testing "without losing sight of the richness of human composition" (p. 239).

Some case studies and teacher testimonies reported preparing students successfully for standardized tests by approaching test prep as a genre study, consistent with the literature on test preparation. Teachers in these reports found the overlap of good reading instruction with test prep and then, in the weeks before the test, provided a focused study on the particular demands of the test. For example, Santman (2002) reported teaching her middle school students to read with stamina across a variety of genres, and teaching strategies to handle unfamiliar texts, before engaging in a test sophistication genre study. Kontovourki and Campis (2010) showed ways to empower students during this test sophistication study by turning testing into a game that the students could master. A key component of the test prep genre study that they reported was the collaboration among teachers that positioned them "as professionals even within the constraints of mandated testing" (p. 244). Nevertheless, these reports took a critical stance toward the regime of testing. For example, even as Santman prepared students for success on the test, she guided them to recognize the injustices of the standardized tests and felt deeply conflicted about the compromises she had to make, leading her to conclude, "in the end, what we're really after—lifelong readers and thoughtful citizens—is lost. Is that what we want?" (p. 211).

Another view of the impact of testing is that teachers will continue functioning within their operative paradigms. In this view, the demands of tests cause adaptations and intensifications to teachers' pedagogy, but no real improvements (Cimbricz, 2001; Firestone, Mayrowetz, & Fairman, 1998; Grant, 2001; Zancanella, 1992). Significantly, these findings often contradict teachers' claims, stated in interviews and surveys, that high-stakes tests completely control their curriculum and instruction, highlighting the essential need for observation of teacher practices as part of the research methodology, as this study does. For example, Grant (2001) studied two highly experienced teachers who had "radically different instructional practices" (p. 400). Grant reasoned that if tests drive curriculum, then what accounts for the variation in these teachers' pedagogy, and what might this variation mean for "testing as a lever of instructional change?" (p. 413). Based on his analysis, Grant concluded that the tests are just one of several influences and have little direct, deep, and consistent influence on these teachers' practices: "The pervading sense that tests drive content, instruction, and the like seems alternately overstated, ill informed, or misplaced. If tests are an influence on practice, and more importantly, if they are intended as a means of changing teachers' practices, then they may be an uncertain lever at best" (pp. 421–422).

In her unpublished dissertation, Cimbricz (2001) studied the influence of standards-based state testing on the pedagogy of three experienced fourth-grade teachers, all of whom taught in the same school in an upstate New York suburban school district. She found that the teachers made adaptations, intensified, and added on to their literacy instruction in response to state ELA assessments. As a result, they cut back or put off other curriculum work, such as creative writing or science and social studies instruction, or delegated this work to others. However, like Grant (2001), Cimbricz reported the teachers' agentive role in what literacy instruction happened in their classrooms. Cimbricz (2001) affirmed the findings of numerous policy analysts that the teachers in her study were not passive recipients of policy and mandates but active interpreters, based on their beliefs, knowledge, and experience of pedagogy. The teachers were engaged in a process that Tyack and Cuban (1995) call adaptive tinkering, "whereby teachers preserve what they think is valuable and remedy what they think is not" (Cimbricz, 2001, p. 137). This finding suggests the need to study the situated literacy that occurs as teachers prepare their students for standardized tests.

Some researchers reported several detrimental impacts of testing. These impacts tend to center around four categories: (a) testing as a gatekeeping mechanism, (b) the narrowing of the curriculum, (c) the loss of control, and (d) the inhibiting costs (Koschoreck, 2001, p. 301). Most of these studies relied on large-scale survey methods, emphasizing the need for more case studies at the classroom level (Abrams et al., 2003; Haladyna, Nolen, & Haas, 1991; Smith, 1991; Urdan & Paris, 1994). Many researchers concluded that teaching to the test improved students' test scores with no benefit to their literacy performance, raising questions about the validity of test scores as a measure of student achievement.

Also focusing on the Texas standardized achievement tests, the survey study by Hoffman, Assaf, and Paris (2001) directly challenged the findings reported by Skrla and Scheurich (2001) and Koschoreck (2001). The researchers wondered why students' scores on the state test were rising each year, suggesting clear instructional improvements, while their scores showed little improvement in reading or writing achievement on a norm-referenced test with national norms. Similarly, the National Assessment of Educational Progress scores showed that growth in reading for Texas students had been flat during the same period. These results suggested that teachers had simply done a better job teaching to the test, but not a better job at literacy instruction. These findings also corroborated the respondents' claims on the survey. The researchers concluded that when tests drive instruction, teachers become less considerate of their students' needs, and instruction becomes more patterned and predictable and less responsive and adaptive (Hoffman et al., 2001, p. 490).

!"! ! !"# #$#%#&!'() *+",,$ -,.(&'$ #$%& '()*

Page 8: Untitled

Some case studies and teacher testimonies corroborate these findings (e.g., Assaf, 2006; Bomer, 2005; Dooley, 2005). Bomer (2005) discussed how the intense climate of accountability and surveillance created by federal and state policies diminished her professionalism and pushed her out of classroom teaching. Dooley (2005) reported one resource-room teacher's resistance to accountability and surveillance pressures in order to provide literacy instruction that best met the needs of her students in a well-off suburban school. Conversely, Assaf (2006) reported one highly experienced reading specialist's acquiescence in the face of high-stakes testing in an urban school with a large immigrant population, replacing what she knew as "sound literacy instruction with instruction mainly interested in helping students pass the test" (p. 165). Assaf then suggested ways for teachers to push back against the regime of testing.

Dooley and Assaf (2009) then conducted a cross-case analysis of the impact of standardized testing on the two resource-room teachers, one in a middle-class suburban school and one in a high-needs urban school. Their results showed dramatic differences in these teachers' instructional practices, despite the alignment of their beliefs about literacy instruction, as a result of their localized contexts. The suburban schoolteacher was able to enact literacy practices that involved much more social interaction for the construction of meaning, whereas the urban schoolteacher enacted many more skills-based practices to derive a text's inherent meanings that were aligned with the test format and required students to work in isolation. Thus, NCLB accountability measures exacerbated the achievement gap that this policy intended to rectify. The results led the authors to advocate for more in-depth studies of teachers' practices in contexts that highlight the situated effects of standardized measures.

As with the teacher testimony and case study reported by Santman (2002) and Kontovourki and Campis (2010), this study focuses on test prep in the context of a balanced literacy framework and as a test sophistication genre study at the classroom level. However, unlike the peripheral attention to school-wide contexts that influenced the teachers' work in those reports, this study pays careful attention to school-wide contexts that were integral parts of the situated literacy in both classrooms. Moreover, those reports focused on single cases. As in the study by Dooley and Assaf (2009), this study presents a cross-case analysis of two in-depth cases, looking at situated meanings. However, this study looks at the test prep practices of two fourth-grade teachers in urban, high-needs schools. They worked in schools that federal policies and funding sources would consider equivalent based on disaggregated student populations. The intention is to illuminate distinctions within the "urban, high-needs" designation that policies and funding sources conflate and that directly affect teachers' literacy instruction. Thus, this study addresses the following research question: How did the localized contexts influence both teachers' instruction during test prep?

Methodology

Ethnographic collective case study methods were used. Within a sociocultural framework, collective case study prevented an assumption of one best way and conversely presented the complexity and variety of curriculum implementation in context. Case studies enable the researcher to see "the larger economic, cultural, and historical forces that shape and are shaped by local encounters" (Dyson & Genishi, 2005, p. 9). Two cases were an optimal number, since this reached the limit of reasonableness of in-depth analysis for one researcher while still enabling the thick description I sought. Merriam (1998) asserted that collective case study demands a two-step process of analysis: within-case analysis followed by cross-case analysis. This article focuses on the cross-case analysis with the aim of "building substantive theory offering an integrated framework" (p. 195) across both cases.

The Two Teachers

I selected Jennifer and Lisa for this study after an extensive multiple nomination process (Allington & Johnston, 2002, p. 32) for fourth-grade teachers in high-needs urban public schools who demonstrated exemplary literacy instruction and classroom management. Using purposeful sampling (Merriam, 1998), I sent an e-mail letter to 20 nominators who all served in instructional leadership positions, with a list of 12 criteria based on a synthesis of the research literature for effective teachers of literacy. I first narrowed the 34 nominees by considering only those who were teaching a mainstream, monolingual fourth-grade class in a high-needs school. I further narrowed the nominees by talking with people who had first-hand experience with them at their schools, using the same set of criteria. These people included principals, assistant principals, literacy coaches, and parent coordinators. I then contacted the nominees, and eight were willing to commit to the study.

In the second phase of the nomination process, I conducted nonparticipant observations in each nominee's classroom for approximately an hour and a half, during their literacy block of time. My final choices of participants were also guided by Stake's (2000) advice to lean toward those cases that seem to offer "opportunities to learn" (p. 446): "Even for collective case studies, selection by sampling of attributes should not be the highest priority. Balance and variety are important; opportunity to learn is of primary importance" (p. 447). Some of the considerations for "opportunities to learn" were (a) the receptivity and enthusiasm of the teacher for this project, (b) the teacher's willingness to accommodate my "role-making" (Angrosino & Mays de Perez, 2000, p. 678), (c) particular challenges that the teacher faced with her students who were labeled "at-risk," (d) the demographics of the class, (e) the school context, and (f) accessibility. My personal experience as a literacy staff developer also guided these decisions.

The two teachers in this study were both White 30-year-old women who had attended highly selective undergraduate programs and earned their master's degrees in education through the New York City Department of Education's Teaching Fellows program, which specifically prepares candidates to teach in high-needs schools (www.nycteachingfellows.org). Lisa, in her fourth year of teaching, and Jennifer, in her fifth year, were considered the veteran teachers on their grade. Lisa and I met at the start of the study, so Lisa knew me only as a researcher in her classroom. Conversely, I was the literacy staff developer in Jennifer's school for 3 years prior to this study, and Jennifer knew me in this role before I became a researcher in her classroom.

!"# ! !"# #$#%#&!'() *+",,$ -,.(&'$ $%&' ()*+

Page 10: Untitled

Researcher’s Role

For role making, I acknowledged "the transactional aspects of ethnographic research" (Florio-Ruane & McVee, 2000, p. 159) in both research sites. First, I functioned within "the creative tension in the role of member/observer" (Angrosino & Mays de Perez, 2000, p. 684). In other words, while I actively pursued observation in both research sites, I also made a role for myself, with both teachers' guidance, as assistant and consultant, which reflected my role as literacy staff developer: "In no case is it advantageous for the ethnographer to be passive in the face of the assumptions of the community he or she is studying" (p. 680). This was especially important given the high-stakes conditions that I was researching. Second, I used "proportionate reason" (p. 692) in order to "weigh the consequences" (p. 693) of my actions in each setting, engaging three criteria: experience, intuition, and trial and error. I kept a reflective journal and met biweekly with critical friends to discuss my role making and strengthen data collection.

The School Sites

Both schools were located in poor urban neighborhoods. Both were clean and well maintained and had some excellent facilities. For example, Lisa's school had two computer labs, a spacious new library, and a well-organized book room; Jennifer's school had a shiny new gym, a computer lab, and a science lab. Both also had staff developers from the same literacy support provider for ongoing professional development.

But the school sites also had pronounced differences, which the findings will show partly constituted the situated literacy that occurred. Lisa's school, constructed in the 1950s, had 1,300 students and was at 130% capacity. Student demographics were 78% Hispanic, 19% Black, 2% White, and 1% Asian or other ethnicities. Forty-five percent of the students were English language learners. Eighty-six percent qualified for free or reduced-price lunch. The school had a turnover rate of 50% for teachers who taught 5 years or less, and 26% of the teachers had 3 or fewer years of teaching experience. In grades 3–5, class sizes averaged 30 students. There were seven fourth-grade classes, including one inclusion class; in addition, there were one third-/fourth-grade and one fourth-/fifth-grade special education class. To accommodate the overcrowding, lunch and recess periods ran from 10:20 to 1:55, and since Lisa's classroom windows faced the recess yard, her class contended with recess noise, in addition to the noise of busy city streets, for large portions of the day. Lisa had 32 students, including six with individualized education plans.

Jennifer's school, constructed in 1995, had an elevator to accommodate students with physical disabilities in inclusion classrooms. The school had 544 students and was at 80% capacity. The student demographics were 87% Hispanic, 10% Black, 2% White, 1% Asian, and 1% American Indian; 28% of the students were English language learners. Ninety-nine percent of the students qualified for free or reduced-price lunch. The school had a turnover rate of only 8% for teachers who taught 5 years or less, and 17% of the teachers had 3 or fewer years of teaching experience. Average class size for grade 3 was 22 students; for grade 4, 24; and for grade 5, 26, which matched the city averages. There were three fourth-grade classes: one was bilingual, the second was inclusion, and the third was general education. The school staff called this class the "regular" or "top" class, reproducing a discourse of normalcy (Foucault, 1995); it tended to have the higher-performing students, based on their third-grade ELA scores and their teachers' recommendations. Jennifer had 24 students in the general education class. Jennifer's classroom faced the busy city streets, and sometimes sirens and traffic drowned out instruction.

Data Sources

Fieldwork for this study lasted from mid-October 2006 to the end of January 2007. During that time, I made 20 alternating visits to each teacher's classroom during literacy instruction, usually occurring in the morning and ranging from 2.5 to 4 hours. I observed both teachers for one whole school day, including the extended day program (a 37.5-minute session, Monday through Thursday, that was mandated for the lowest-achieving students in each class, based on their previous-year assessment results), to get the full flow of the day. My focus was on their literacy instruction and the take-up of that instruction by three focal students in each class. In Lisa's class, the focal students were Samuel, Frederick, and Maria. In Jennifer's class, the focal students were Sabrina, Bernard, and Amelia. All six focal students were designated "at-risk" by their teachers within the context of their own classes. In this article, I include these students' ELA results as part of the data set for cross-case analysis. Test scores are placed in a range from 1 to 4 (see Table 4): a score of 1 is below proficiency, 2 indicates partial proficiency, 3 indicates proficiency, and 4 indicates above proficiency. A score of 3 indicates grade-level performance, and a score of 1 indicates a potential holdover.

Six data sources contributed to the cross-case analysis: (1) participant observation; (2) two semistructured interviews with each teacher, pre- and post-fieldwork, plus ongoing conversations and e-mail exchanges, including after test results were reported in early June; (3) three 2.5-hour collaborative meetings at the beginning, middle, and end of the study; (4) one semistructured interview with each school principal after test results were reported; (5) informal conversations with building personnel such as assistant principals, literacy coaches, and parent coordinators; and (6) document analysis. During my visits, I took extensive notes and audio-recorded literacy events. Consistent with Gee's (1996) conception of literacy events as situated meanings in particular Discourses of practice, I observed "the shared contexts of meaning that constitute social activity" (Hicks, 1996, p. 113) in both classroom settings. After each visit, I wrote reflective notes. Formal test prep sessions presented one important Discourse for observation. I paid particular attention to language use and mediational tools and symbols that built connections between "everyday" and "scientific" concepts (Vygotsky, 1978), within an ecological view of literacy (Barton & Hamilton, 1998), as a way to provide assisted performance in the zone of proximal development for all students.

The research design included three collaborative group meetings, in November, December, and January. Each meeting was 2.5 hours long and was built around a whole-day workshop, focusing on the alignment of balanced literacy and test prep, that the two teachers and I attended at the university that sponsored their staff development. My aim was to develop a community of learners, guided by three principles: collegiality, intellectual challenge, and practicality. This was collaborative work in which I also openly contributed my expertise and experience as part of my shifting role of researcher/participant (Angrosino & Mays de Perez, 2000). All interviews and sessions were audio-recorded and later transcribed. Finally, I also analyzed relevant documents, including the teachers' plan books and lesson plans, test prep materials, sample tests, and so forth.

Data Analysis

During fieldwork, my data collection and analysis kept showing key distinctions between the two sites even as both teachers followed the same balanced literacy curriculum and worked toward the same demands of the ELA (see Tables 1 and 2). In fact, strong similarities in both teachers' backgrounds, experiences, beliefs, and literacy events enabled me to more easily perceive these distinctions. Theories of situated literacies illuminated these distinctions as inevitable aspects of each localized context, occurring at micro and macro levels of school functioning (Maybin, 2000). Moreover, these distinctions challenged assumptions of standardization that are the premise of high-stakes testing. I was also realizing that the distinctions I was perceiving were not expressed in policy and funding documents for urban high-needs schools with comparable student populations. I made notes on these insights in reflective memos.

After fieldwork ended, I reread data across sources for cross-case analysis (Miles & Huberman, 1994).

Table 1. Lisa's Shift in Literacy Practices for Test Prep

Time (in Minutes) | Balanced Literacy | Test Prep
25 | Routines, word study, independent reading | Routines, independent reading
25 | Read-aloud with accountable talk |
25 | Reading workshop | Read-aloud with accountable talk
25 | | Short responses from the read-aloud or from a shared reading text
25 | Writing workshop | Day 2 extended response or Day 3 work
25 | |
50 | – | Day 1 work: test-taking strategies for multiple choice

Note: Blank cells indicate that the activity named above continued into that time slot.

Table 2. Jennifer's Shift in Literacy Practices for Test Prep

Time (in Minutes) | Balanced Literacy | Test Prep
25 | Routines, calendar math, reading projects, independent reading | Routines, calendar math, reading projects, independent reading
25 | Read-aloud with accountable talk | Read-aloud with accountable talk
20 | Short responses off the read-aloud | Short responses from the read-aloud or from a shared reading text
25 | Writing workshop | Day 2 extended response or Day 3 work
25 | |
25 | Reading workshop | Day 1 work: test-taking strategies for multiple choice, some independent reading
25 | |
15 | Some word work (two or three times per week), so that reading and/or writing workshops had shorter amounts of time | Some word work (one or two times per week), so that reading and/or writing workshops had shorter amounts of time

Note: Blank cells indicate that the activity named above continued into that time slot.

!"#$%"&%'&()*& +*%#!'*# ! ,--

Page 13: Untitled

Using constant comparative analysis (Strauss & Corbin, 1994), I found that one emerging theme was one-size-fits-all instruction or, as the teachers sometimes called it, teaching to the middle. This practice occurred in the data sets of both sites, but with vastly different ramifications given each teacher's localized context. By narrowing my focus to data units that I coded for test prep, I perceived these differences as inherent tensions as efforts to achieve standardization played out in each site. In the cross-case analysis of the teachers' implementation of test prep, five categories in the theme of standardization arose: (a) classroom conditions, (b) opportunities for integrated curriculum, (c) test materials, (d) test administration, and (e) test results.

I then reanalyzed the data sets for key quotes in our interviews and collaborative group meetings, and for key literacy events and practices in both teachers' classrooms, that demonstrated the theme of standardization in action. To reveal the complexity of this instruction, I relied on discourse analysis, since language was the central means through which new understandings were negotiated among participants in these classrooms (Hicks, 1996). I relied on Gee's (2005) and Fairclough's (1989) systems of discourse analysis, which provided diverse and flexible toolkits for analysis consistent with the theoretical framework of this study. For example, I considered who had agency in discourse units, what relational values the words expressed, and what causal connections were being made. Put simply, I considered who was doing what to whom within the Discourse of test prep (Gee, 2005). Discourse analysis enabled me to perceive issues of power, positioning, and perspective in language, which revealed the connections between micro and macro levels of localized contexts. Through these layers of analysis, I was able to express the nuances in each category within the standardization theme.

Findings

Using data from the cross-case analysis, I now elaborate on differences in the localized contexts that directly influenced both teachers' instruction during test prep in four of the five categories listed above. Space limitations prevent my discussion of differences in opportunities for integrated curriculum. Findings show key distinctions in these two urban high-needs schools that policies do not distinguish.

Differences in Classroom Conditions

During the teachers' school-sanctioned test prep unit of study, their literacy instruction narrowed. I compared literacy events in both classrooms before and during their test prep units of study, which lasted 5 weeks. For Lisa, the curriculum narrowed as shown in Table 1. For Jennifer, the curriculum narrowed as shown in Table 2.

Lisa extended her independent reading time out of a strong belief in providing uninterrupted reading time for students to read books at their independent reading levels. For Lisa, providing this time was imperative because (a) she believed that most of her students did not have family supervision to ensure independent reading time at home, (b) during the rest of her literacy block, at least one-third of her students would be working with texts that were too hard for them, and (c) she had few other opportunities during the day to work with students one-on-one or in small groups. Out of this commitment, she extended her test prep work an additional 25 minutes, which cut into other curriculum work (exclusively the untested subjects of science and social studies). In Jennifer's class, an opposite outcome occurred: students lost time for independent reading, which occurred almost exclusively during reading workshop time. For Jennifer, time for uninterrupted independent reading during test prep was less of a priority because (a) she perceived that most of her students had family supervision to ensure independent reading time at home, (b) the test prep materials that Jennifer provided accommodated most of her students, and (c) she was able to work with students one-on-one and in small groups during her extended day program. In addition, as Table 2 shows, for 15 minutes three times a week, word work cut into other curriculum work (exclusively, the untested subject of social studies).

No new literacy instruction happened the week of the ELA. In a 182-day school year, time devoted to the ELA then accounted for 30 days, or 16.5% of instructional time. In addition, the focus of their extended day work also became test prep. For those students who attended an after-school program and/or the Saturday Academy, test prep was also the focus. Some students also reported doing test prep at home. Moreover, teachers were required to administer three interim literacy assessments (two before the ELA). Each interim assessment took an hour to administer, not including the teachers' time for scoring and data entry. Similar time demands and requirements occurred for the math standardized test. All told, this presented a pronounced intensification and narrowing of activity for test prep.
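As a rough check on that figure, assuming the 30 days comprise the 5-week test prep unit (25 school days) plus the 5-day week of the exam itself:

\[
\frac{(5 \times 5) + 5}{182} = \frac{30}{182} \approx 0.165 \approx 16.5\%
\]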

Lisa and Jennifer had other pronounced differences in their localized contexts that influenced what they accomplished in their classrooms. In Table 3, I list key distinctions in the two sites that directly connected to conditions for optimal test prep in the research literature (e.g., Calkins et al., 1998; Guthrie, 2002; Hornof, 2008; Miyasaka, 2000). Regardless of both schools' designation as high-needs, these were not standardized conditions.

Table 3. Differences in Lisa's and Jennifer's Localized Contexts

Lisa | Jennifer
32 students | 24 students
More than one-third of the students were reading and writing below grade-level norms, some as low as 2.5 years below grade level. | The "top" fourth-grade class, so most of the students were able to read grade-level texts.
Limited access to differentiated materials, which further positioned Lisa and her students into one-size-fits-all, teach-to-the-middle practices. | Easy access to photocopying and many resources from her literacy coach, such as differentiated test prep materials.
As a result of 130% school capacity, Lisa had all 32 students in her extended day program, with three other teachers who were more distracting than helpful, based on my observations and numerous statements by Lisa about these conditions. | Jennifer had only her 10 most struggling students in her extended day program, which enabled her to work with them intensively in small groups.
No science room. The science teacher traveled with a cart to each classroom, which limited the kinds of science experiences the students could have. | A science room. The science teacher was thus able to conduct experiments and give students hands-on science experiences.
Again, as a result of 130% capacity, Lisa had to take 15 to 20 minutes out of her daily instructional time for a whole-class bathroom break. | No whole-class bathroom break. Thus, Jennifer's class had 15 to 20 minutes of extra instructional time.
Six of Lisa's students, including two of the focal students, attended pull-out programs, so they missed two to three instructional periods daily, three to four times per week. | All of Jennifer's students benefited from her consistent instructional methods all day long.

!"#$%"&%'&()*& +*%#!'*# ! ,-.

Page 15: Untitled

Thus, no matter how experienced, devoted, and hardworking the two teachers were, Lisa clearly faced more arduous conditions, as both teachers often expressed.

Moreover, they repeatedly expressed how these conditions directly affected what they could accomplish with their students. In our third collaborative meeting, after the ELA exam, I asked both teachers what worked well and what was problematic in their test prep:

Lisa: Well, I feel like a lot of the challenges I faced were in terms of class size and behavioral issues. And, if that had been different, then all the test prep would have been different. But, it's been a very difficult situation.

Jennifer: I've had a number of very difficult classes. But this year I have been fortunate to have a very capable group, which, for me, was such a welcome change, so I was able to cover a lot more, more quickly than in previous years.

Both Lisa's and Jennifer's utterances hide the source of agency, presumably the administration, which annually determines the makeup of each class. In Lisa's first statement, she connects class size and behavioral issues. In her use of the conditional sentence, Lisa then directly connects class size and behavioral issues with the quality of her test prep. This connection presupposes larger policy debates about the effects of school safety and class size on the quality of teachers' instruction. Jennifer's use of the word fortunate expresses fate or chance: the quality of her students is not in her control. Her use of the causal connector so directly connects the quality of her students with the quality of the work that she is able to do with them: "I was able to cover a lot more, more quickly than in previous years." Both teachers, then, asserted that key factors in the specific contexts of their classes were out of their control, yet directly affected the quality of their test prep.

Test Materials

The pervasive power of high stakes in both schools also expressed itself in terms of the amount of test prep materials that the administrations supplied. Both teachers reported receiving more test prep materials than they knew what to do with. For them, most of these materials were wasted, as the following excerpt from our third collaborative group meeting illustrates:

Jennifer: I just have to say, you have not seen a fraction of the amount of materials that we received.

[Lisa chuckles knowingly.]

Researcher: Oh, really. Is that true for you too [Lisa]?

Lisa: Absurd. Absurd.

Jennifer: You would laugh. Because what else are you supposed to do. Especially for extended day and after school.

[Lisa sighs loudly in exasperation.]

Jennifer: Because that gets delegated to people who have no idea what's going on, who get a key from the custodian to go to a closet, who knock on your door [J knocks on the desk three times], I get one of the kids to open it, and they come into your room with a stack of books this thick [J makes a space two feet high between her hands]. Archaic test prep books from 10 years ago. "Here. Your test prep books. Now I've done my job. You've got your materials." [J rubs her hands to indicate wiping your hands clean.]

Researcher: But you don't have to use them.

Jennifer: No. But you asked about the materials that we're supplied with. I could fill a room with the materials that we're supplied with.

Jennifer's utterance that begins "because that gets delegated to people" masked the source of authority. Power again resided with an unnamed, disembodied entity, suggesting hidden power (Fairclough, 1989) in the form of the administration. Both teachers were positioned as powerless and compliant in these interactions, even while they considered them to be absurd. By using your room (as opposed to my room), Jennifer generalizes this absurd situation to all teachers in test-taking grades. Moreover, Jennifer refers to "people who have no idea what's going on," even though in her relatively small school community she knows these people personally by name, to establish that the entire staff is complicit in this absurd imposition. Her parody of their speech and gestures conveys workers mindlessly following orders, accentuating the position the entire school staff has been put in under high-stakes testing. She portrays them as just doing their job, regardless of how it affects teachers and students. This exchange also suggests the amount of school funds that go toward test prep materials and aligns with reports of the aggressive practices of publishing companies, which produce standardized tests and test prep materials, to market these products to schools for substantial profits with no evidence base for improving test score results (Glovin & Evans, 2006).

In addition to workbooks, both teachers had easy access to and used ELA exams from previous years, which were posted for downloading on the New York State Education Department website (http://www.nysedregents.org/grade4/englishlanguagearts/home.html). They also compiled collections of short passages, of the length and in the various genres on the test, with talk and reading response prompts, resembling what students would experience on the ELA. Thus, either the teachers provided passages that had testlike prompts or multiple-choice questions, or they generated these prompts or questions themselves. Therefore, both teachers believed that test prep materials should match the formats, questions, and levels of difficulty of the actual test as much as possible, which was consistent with the literature on test prep. However, as I presented in Table 3, Lisa had greater demand for, but less access to, differentiated test prep materials. Moreover, providing test prep materials as much as possible at the levels of difficulty of the actual test, with their emphasis on one right answer, pushed both teachers into one-size-fits-all teaching.

Since Jennifer perceived less surveillance from her administration, she chose not to use the piles of workbooks that filled an entire back cabinet in her classroom. Instead, she used a series of workbooks by the Kaplan Test Prep Company. Again, as a result of having "a very capable class" that was "quick on the uptake" of test prep skills and strategies, Jennifer felt that the one-size-fits-all passages in these workbooks matched the needs of most of her students, and she was able to work closely with students in the extended day program who might have struggled. These were disposable workbooks, so students were allowed to write in them, take marginal notes, and underline—all test-taking skills that Jennifer emphasized. Moreover, since Jennifer was in a school at only 80% capacity, with just three fourth-grade classes with an average of 24 students, she had easy access to other differentiated test prep materials from her literacy coach, including packets of short passages at different reading levels, like the kinds the students would face on the ELA. Jennifer gave these to her students to read and discuss in December, 1 month before the test.

For Lisa, the one-size-fits-all demands of test prep materials proved problematic, given the wide range of reading levels represented by her students. In our second collaborative group meeting, we discussed her use of past ELA exams for test prep:

Researcher: Are you differentiating these materials? What about the kids for whom these passages are too hard?

Lisa: Well, unfortunately I don’t have enough resources, so I’m going to be targeting the middle.

“Targeting the middle” was a predicament that Lisa called “morally awful.” Lisa leaves unstated the causes of not having enough resources, but based on previous statements, we can infer that she is implying the demands of an overcrowded school. Also implied in this main clause is Lisa’s positioning in this situation: there’s nothing she can do about it. The connector so then establishes the cause-and-effect connection: as a result of not having enough resources, Lisa is unfortunately positioned to target the middle.

The Kaplan test prep workbooks that Lisa used consisted of 12 lessons. Each lesson focused on a particular strategy for the day 1 multiple-choice format and included a multiple-choice passage at four levels of text difficulty. Even at the lowest levels, however, the texts were too difficult for approximately one-third of her students. In addition, students were not permitted to write in the workbooks, so they were unable to practice taking marginal notes or underlining, skills that they would use on the actual ELA. Instead, they wrote their answers on a separate piece of paper, which added another layer of organization and reduced the immediacy of their work. We had the following exchange in our third collaborative meeting:

Researcher: How has test prep helped or hurt your students’ literacy learning?

Jennifer: I don’t think it’s hurt them. I certainly hope it hasn’t hurt them!

Lisa: I think it may have hurt a few of my students, my strugglers, because those times when we were working on Kaplan, and I was handing out test materials for the middle of the class, they really didn’t have that much to do, so they were kind of lost for that time. They weren’t getting anything from this work.

Jennifer: [Speaks over Lisa] Any time spent on reading materials that were too difficult was wasted time for those kids.

Lisa’s use of the clause “I was handing out test materials for the middle of the class” further generates her position as powerless and compliant regarding the use of test prep materials. By using the progressive past tense, Lisa mitigates her action and ascribes more direct agency to the test prep materials: Lisa was the one who handed out these test prep materials, but it was the test prep materials that created this situation of targeting the middle. We also know from previous statements that Lisa believed she had no choice but to use these test prep materials in this way. Therefore, by providing test prep materials that carefully aligned with the actual test, Lisa was positioned to teach to the middle using one-size-fits-all instructional practices, which left her strugglers “kind of lost for that time” and unable to get anything from this work. As Jennifer expressed, this was “wasted time for those kids.” Jennifer’s statement assimilated language from Richard Allington’s (2001) What Really Matters for Struggling Readers, connected to their balanced literacy framework. Allington was one of the guest speakers at one of the day-long workshops that we attended, and both teachers were deeply familiar with this principle of practice. Thus, both teachers’ belief in matching test prep materials with the demands of the actual test generated a mismatch for these students, further determining their subjectivities as at-risk and failing in ways that “may have hurt them.”

Lisa also explained that the administration mandated the use of another set of test prep booklets called STARS for multiple-choice passage practice, which was the focus of her test prep in the weeks preceding the test in her extended day program: “Also, the end of the day, we’re mandated to use these other test prep books called STARS, I’m not sure if you’re familiar with them. They’re test prep books, so for extended day we have to use those. Last year, they had us use this in place of Kaplan, and we kinda rose up as a grade and had it changed, because it’s really not that effective. It’s more for practice with multiple choice, but the way they teach the concepts is less than ideal.” Lisa’s use of the passive form in her first utterance gives hidden agency to the administration. Further, her language establishes an “us versus them” dichotomy with her use of they in her third utterance and her description, “we kinda rose up as a grade,” which is the language of protest. In Lisa’s final utterance, her language “the way they teach the concepts” gives agency to the test prep workbook designers for the implementation of this material. Thus, according to Lisa, the teachers were positioned to use test prep workbooks that were “not that effective” and to implement teaching practices that were “less than ideal.” Twice I observed Lisa’s implementation of workbook lessons in the extended day program. Students had only the fourth-grade level D workbooks, which were beyond the reading level of at least one-third of the students. Since she had all 32 students in her extended day program, she was unable to provide the differentiated support that these students needed. Nor were these workbooks aligned with the format of the test, which proponents of test prep as genre study strongly advocate. Thus, the use of these workbooks further encouraged teaching to the middle, which Lisa perceived as possibly harmful to “her strugglers,” since they “weren’t getting anything from this work.”

Test Administration

In a climate of accountability and surveillance, high-stakes standardized tests are the lever of change. However, my findings show serious challenges to the assumption of standardization in the testing conditions between the two schools. For both teachers, a powerful source of semiotic mediation in the construction of meaning for read-aloud texts was all the paralinguistic moves that both teachers made (Wertsch, 1991): changes in volume, pitch, and pacing; their use of gestures and expressions; and their body language. These semiotic tools mattered most of all for day 2 of the ELA, the oral listening passage. All of Jennifer’s students benefited from the consistency of her approach on the actual test day; on the other hand, 13 of Lisa’s students—more than one-third of her class—tested in other rooms with modified test conditions and did not receive the benefit of her consistent paralinguistic communications. Yet, these students needed this consistency the most on testing days.

The directions for the day 2 test administration are vague. They state, “Read the listening selection aloud to the students twice, making sure to read the title, the name of the author, and any introductory material. Read the listening selection at a moderate and steady pace, speaking clearly and with expression” (CTB/McGraw-Hill, 2007, p. 17). It is up to each test administrator to determine what a moderate and steady pace is or what it means to speak clearly and with expression. Consequently, staff developers from the literacy support provider encouraged teachers to use their full array of gestures and other paralinguistic tools during the read-aloud, and they trained their students to “read” these tools in the weeks preceding the test. Here is how Lisa prepared her class to listen to a short story (Dec. 14, 2006):3

When I’m reading aloud, I’m going to be doing things that are going to better help you understand the passage. They’re called gestures, and I’ll be gesturing with my hands, and later we’ll discuss if this helped you with understanding the passage. If you are not looking at me, you will miss out on this support. And not only are you listening to my voice, because my voice is going to be CHANGING IN VOLUME, and I’m going to slow down. . . . Remember when there’s something important, I’m going to pause [Lisa does with her voice what she is describing: raising the volume, slowing down, pausing, for each of these phrases]. Not only are you listening to my voice, but you’re also watching me for WHAT MY BODY IS DOING, the gestures [Lisa moves her open hands, palms up, up and down] because that is going to give you a lot of help. And I’m going to read the same way today as I will on the test day. Of course, it’s going to be a different passage, but I’ll be reading it the same way.

One week before the test, out of a need for uniformity in test administration across all testing conditions, Lisa’s school administration announced that while teachers were permitted to read with expression, no other modes of communication were allowed. Meanwhile, at Jennifer’s school, no such policy was issued, and students were able to benefit from this training.

Both teachers administered practice tests in the weeks preceding the test. I compared their reading of the teacher directions to the actual directions and noted all the ways that they conveyed meaning, which highlighted again that “in the end, to read is to be able to actively assemble situated meanings in one or more specific ‘literate’ Discourses. There is no ‘reading in general,’ at least none that leads to thought and action in the world” (Gee, 2000a, p. 204). For example, in Table 4, I overlay Jennifer’s reading of the directions with the actual directions as printed in the manual for the day 2 practice test.

All Jennifer’s gestures, changes in pitch, volume, emphasis, phrasing, and the discourse patterns of overlapping voices construct the meaning of the directions, consistent with her class’s test prep work. Notice, for example, how much emphasis Jennifer gives to listening and watching carefully, but not taking notes, for the first reading aloud of the story. In fact, Jennifer’s emphasis on not taking notes reinforces her previous lessons on watching her carefully as she reads aloud for all the paralinguistic cues she will provide for the story’s meaning. For the second reading aloud of the story, however, Jennifer’s modifications convey that she expects her students to take notes in moderation and to use those notes to support all their written responses. For example, notice how Jennifer changes the modal relational form in the directions, “you may use these notes . . . ,” that grants permission, to the imperative, “Use these notes. . . .” The class’s emphasis that the notes will not count toward their grade reinforces that their notes can be abbreviated, messy, and without punctuation, as they practiced. By encouraging the overlapping voices of the students, Jennifer is able to monitor their comprehension of what she is expecting of them, and reading the directions becomes a communal act of meaning construction. Of course, on the days of the actual test, Jennifer prevented students’ overlapping speech. However, by that point, Jennifer fully expected the transformation of this intermental process into an intramental one (Vygotsky, 1978). In addition, many of her paralinguistic tools would remain. The class’s co-construction of meaning for the directions shows how a discourse community inevitably appropriates texts for its own ends (Barton & Hamilton, 1998). This is what readers do to construct meaning from texts, and it inevitably varies from context to context, since reading is a sociocultural act. Moreover, on the second reading of the passage, Jennifer gave explicit signals for note-taking. In the following excerpt, Jennifer was preparing students for this note-taking work:

Jennifer: If I get LOUD! [“Ooh! That means . . .”] What am I telling you? [many voices]. That something’s IMPORTANT! I can’t say to you [in a whisper], ‘Jose, write this down [clap!]!’ But if all of a sudden I’m SHOUTING [clap!] something, what’s Jose gonna think? [“Write it down!” “It’s important!”]. ‘Hey! Ms. Miller seems pretty excited. There must be something that’s worth [pause] jotting down.’ Right? [“Yeah.”] Or if I pause [J pauses for 5 seconds] [“Oh, my gosh! That’s something important!”] I’m giving you time to jot something down. [“Jot something down now.”] So if I pause, you should think, ‘Ah, hah! Maybe I should write down that word or that detail,’ right?

Table 4. Actual Directions and Jennifer’s Reading of Them

Actual directions: You will listen to the story twice. The first time you hear the story, listen carefully but do not take notes. As you listen to the story the second time, you may want to take notes. Use the space below and on the next page for your notes. You may use these notes to answer the questions that follow. Your notes on these pages will NOT count toward your final score (CTB/McGraw-Hill, 2007, p. 1).

Jennifer’s reading of the directions: You will listen to the story twice. The FIRST time you hear the story, LISTEN CAREFULLY [“and watch”] (“and, of course, you’ll be watching”), BUT DO NOT TAKE NOTES! As you listen to the story the second time, then you may want to take some notes. You can use the space below and even the next page to take your notes. [Some comments, and J shushes them.] [. . .] Use these notes to answer all the questions that follow. Your notes on these pages will NOT COUNT TOWARDS YOUR FINAL SCORE! [Students had the directions and shouted them out with Jennifer here.] [Whisper] That means your notes are not graded! They’re only notes to help you! So far, are there any questions? (Field notes, Dec. 21, 2006)

Note.—For this transcription, italic indicates language that was not included in the actual directions, and [. . .] indicates omitted language.

!"#$%"&%'&()*& +*%#!'*# ! ,-.

Page 21: Untitled

All Jennifer’s students benefited from these ways of being on test day, whereas more than one-third of Lisa’s students—all students with testing modifications, which included the three focal students in this study—had the test administered by an unfamiliar person in an unfamiliar context. This is in addition to testing conditions that in both schools included constant urban noise, compared to the quiet and calm outside the classroom windows of most suburban and rural schools. Certainly for these two teachers, test administration was not standardized.

Interpreting Test Results

In late May 2007, both schools received the ELA test score results. Both principals and teachers shared and discussed these results with me. Tables 5 and 6 summarize the results for both classes.

Lisa and her principal were disappointed by these results. As a grade, fourth graders’ scores decreased compared to their third-grade scores. In Lisa’s class, six students went down while only four students went up in their scores. All the students who scored at Level 1 had to attend summer school and retake the test in August, or they would be held over. The principal told me that by the end of summer, only three or four students would end up repeating fourth grade. “For some reason, most of the students pass the test at the end of summer. Maybe they [the test administrators] make the scale a little easier for the students to pass” (field notes, May 30, 2007). This statement implied that passing might only partially indicate academic progress.

Table 5. Final Test Scores for Lisa’s and Jennifer’s Classes

Final Score (scaled score range)    Lisa’s Class: No. of Students    Jennifer’s Class: No. of Students
4 (716–775)                         0                                1
3 (650–715)                         4                                17
2 (612–649)                         16                               6
1 (430–611)                         9                                0

Note.—Scaled scores are in parentheses. There were 32 students in Lisa’s class; the total here (29) reflects the fact that three students were exempt from testing.

Table 6. Final Test Score Results for Focal Students

Student       Reading Level in September    Reading Level at Time of ELA    ELA Score in Grade 3    ELA Score in Grade 4    Difference
Lisa’s students:
  Samuel         K                             K                               N/A                     511                     X
  Frederick      K                             K                               N/A                     589                     X
  Maria          M                             P                               N/A                     631                     X
Jennifer’s students:
  Sabrina        M                             M                               617                     638                     +21
  Bernard        N                             O                               638                     638                     0
  Amelia         N                             N                               638                     656                     +18

Note.—The letters in the first two columns refer to the book leveling system of Fountas and Pinnell (2001), which ranges from level A to level Z. Levels O through T are considered fourth-grade range, with O at the low end and T at the high end.

!"# ! !"# #$#%#&!'() *+",,$ -,.(&'$ $%&' (")*

Page 22: Untitled

The principal’s reference to test administrators as a disembodied they imbued them with power. She positioned herself as compliant and lacking power in the faceless gaze of this disciplinary mechanism that has the power to adjust test scores and grant or deny grade promotion (Foucault, 1995). Based on these results, nine of Lisa’s students were at risk of grade retention, including Samuel and Frederick, and both Lisa and the school were positioned as in need of improvement. At the time of these test results, Lisa was at a loss as to what to do differently in her literacy instruction, given the makeup of her class and her localized context.

Although Jennifer’s class had solid results, both Jennifer and the principal were also disappointed. Ten students went down in their scores from third grade, nine stayed the same, and only five went up. Two of those students were Sabrina and Amelia, and Sabrina and Bernard were right on the cusp of a 3. Given the way the grading scales work, the difference between their scores and Amelia’s might have been as little as one or two multiple-choice answers. Based on the third-grade scores, Jennifer and the principal both expected more 4s, considering that this was the “top” class. Jennifer felt that she did a better job this year with test prep and had a class that was academically stronger and more socially cohesive than in previous years, and yet her previous year’s class performed better. She was at a loss about what to do differently for literacy instruction in the future. Nevertheless, none of Jennifer’s students were at risk for grade retention or in need of mandatory summer school; thus, Jennifer and her school were positioned as successful.

Overall, the test score results certainly did not provide teachers or principals with “the information they need to ensure that children will reach academic success” (U.S. Department of Education, 2001), as NCLB assures annual testing will do. Both teachers were at a loss about what to do differently in their enactments of blended Discourse. Nor did the test results correlate with changes in the focal students’ reading levels. In Lisa’s class, Maria’s reading level increased from M to P, yet she scored lower than all three of the focal students in Jennifer’s class. In Jennifer’s class, of the three focal students, Bernard’s reading level increased one level, from N to O, but he was the only one who had no increase in his test score results. Amelia, who remained at level N, scored 18 points higher. Both teachers had equal levels of qualifications, professional development, professional expertise, commitment, and care. Both gave their utmost effort. The pronounced differences in the makeup of their classes and their localized contexts challenge assumptions of standardization in a climate of accountability and surveillance that positions one teacher, Lisa, as unsuccessful, and the other teacher, Jennifer, as successful.

Discussion

The findings in this article corroborate survey research on the detrimental effects of high-stakes testing on teaching (e.g., Abrams et al., 2003; Haladyna et al., 1991; Hoffman et al., 2001; Smith, 1991; Urdan & Paris, 1994). They also corroborate the dominant theme that Au (2007) reported in his metasynthesis of qualitative studies of the effects of high-stakes testing on curriculum: “the combination of contracting curriculum, fragmentation of the structure of knowledge, and increasing teacher-centered pedagogy in response to high-stakes testing” (p. 263). Certainly the demands of test prep narrowed the curriculum in significant ways for both Jennifer and Lisa. Within their balanced literacy framework, test prep induced targeting the middle, especially at the expense of the lowest performing students, which Lisa called “morally awful.” Their balanced literacy and test prep units of study consumed time and attention from key subjects in a well-balanced liberal arts education, such as science, social studies, and the arts. Math, science, social studies, and the arts also occurred as discrete subjects in a fragmented curriculum. The units of study also crowded out opportunities for integrated curriculum across disciplines and integration in and out of school. Ironically, both teachers’ approach to test prep prevented them from achieving Guthrie’s (2002) recommendations for an integrated curriculum, which he stated would help students do well on standardized tests. Ravitch (2010) stated, “Ironically, test prep is not always the best preparation for taking tests. Children expand their vocabulary and improve their reading skills when they learn history, science, and literature, just as they may sharpen their mathematics skills while learning science and geography. And the arts may motivate students to love learning” (p. 108).

Also consistent with the research was how much time was devoted to test prep. In addition to teaching to the test in their blended Discourse, both teachers enacted a school-sanctioned test prep unit of study of more than 5 weeks. Nichols and Berliner (2007) explained that this substantial amount of time for test prep is especially pronounced in high-needs schools. Smith (1991) asserted that, coupled with other mandates and an already overcrowded curriculum, time taken for test prep hinders teachers’ flexibility to meet students’ needs and limits what and how they teach (p. 10). We also saw that test scores produced feelings of alienation and dissonance for the teachers. For Jennifer, the test scores did not reflect the actual academic attainment she believed her students achieved this year. For Lisa, the test scores could not account for “the vagaries of pupil effort and emotional status at the time of the test” (p. 9). Also unarticulated were all the social and economic factors that Rothstein (2004) asserted directly contribute to students’ test performance. The findings also show that both Jennifer and Lisa experienced guilt and anxiety about the impact of high-stakes tests on their students (Smith, 1991). Recall, for example, our discussion about how test prep may have hurt their students.

The use of situated literacy as a theoretical framework challenged the concept of standardization. As a result of important contextual differences in two high-needs urban public schools, the conditions for testing were certainly not uniform for all students. For one thing, differences in school contexts enabled Jennifer to use more paralinguistic modes of communication during test directions and the read-aloud of the passage on day 2 of the ELA than Lisa could. Moreover, one-third of Lisa’s students—the lowest performing students with test modifications—did not benefit from Lisa’s modes of expression at all on testing days. As Street (1995) reminded us, “literacy is part of ideological practice and will no more conform to the grand designs of central planners than will the members of the different cultures that happen to be defined within a given nation-state” (p. 45).

Situated literacy illuminated other important factors in each localized context that challenge the validity of standardized tests for high-stakes purposes. First, we saw that while both teachers were pushed into one-size-fits-all practices, these practices had much different consequences in their two contexts. Most of Jennifer’s students could manage the test prep materials. Moreover, she had access to more differentiated materials and was able to provide support to her neediest students in her extended day program. At least one-third of Lisa’s students struggled with even the lowest levels of test prep materials that Lisa was able to provide, and with 32 students in her extended day program, she was unable to provide differentiated support. Other conditions listed in Table 3 also directly contributed to the quality and amount of teaching in each localized context. For example, consider that, as a result of being in a school at 130% capacity, Lisa had a designated bathroom break for her class that took up 15 to 20 minutes daily. Or, consider that, as a result of being in a school at 80% capacity, Jennifer’s students had access to weekly hands-on experiences in a science lab.

These differences also corroborate the literature that gives critical readings of the accountability measures of NCLB (e.g., Nichols & Berliner, 2007; Rothstein, 2004). Psychometricians use the term variability to describe the kinds of differences in conditions that I found between the two fourth-grade classes in these schools. They expect this variability and therefore realize that differences between two schools’ standardized test scores do not adequately determine differences in the relative qualities of the schools. Likewise, differences in test scores between two classes in the same school do not adequately determine differences in the relative quality of the teachers (Koretz, 2008). Ravitch (2010) concluded, “Because there are so many variables that cannot be measured, even attempts to match schools by the demographic profile of their student body do not suffice to eliminate random variation” (p. 154).

Situated literacy also enabled intriguing analysis of the relative power and positioning the teachers perceived and how their perceptions revealed the connections between the micro level of their classrooms and the macro level of their schools and beyond “as they participate[d] in and reproduce[d] those systems” (Lewis et al., 2007, p. 4). In general, both teachers expressed complicity in mandates they opposed, such as contending with the overwhelming amount of test prep materials they received. My interviews with both principals showed that they too were positioned as powerless and complicit in mandates that gave them little knowledge about best educational practices for their students. The findings showed “how individuals are inserted, through local activities, into broader regulating discourses” (Maybin, 2000, p. 205).

The discourse analysis, however, also showed that Lisa more consistently used passive syntax, referenced hidden power, and left sources of agency unnamed. She described “going along” with mandates, even when they were “morally awful.” Her positioning showed a stronger sense of surveillance, in a school at 130% capacity, with seven general education fourth-grade classes, and with 32 students in her class who had a huge range of variability in their reading and writing abilities. Lisa had to use the STARS workbooks in her extended day program, and teaching to the middle had dire consequences for at least the lowest third of her students. Conversely, Jennifer more consistently used active syntax that expressed more of her own agency for teaching to the test. After school aides delivered the big piles of test prep workbooks, Jennifer was able to close her door and choose test prep materials that best fit the needs of her students. She positioned herself with far less surveillance in a school at 80% capacity, teaching the “top” fourth-grade class, with 24 students who had a much narrower range of variability in their reading and writing abilities. Jennifer was able to give individual and small-group attention to the “strugglers” in her extended day program, and teaching to the middle accommodated most of her students. Both teachers expressed the “dialectical relationship between structure and agency, and between micro- and macro-level contexts” in their schools (Maybin, 2000, p. 198). By looking across both sites, it became clear that teaching to the middle, test prep materials, administering the test, integrating the curriculum, and other pieces of language, tools, technology, or social practice connected to teaching to the test took on quite different meanings and values in these different contexts (Gee, 2000b). This is what Gee meant when he said that “words and contexts are mutually constitutive of each other” (p. 190).

Limitations

This study focuses on two teachers who had taught 5 years or less, came through an alternative licensing program, and had taught only in the schools where I observed them. While I have no way of knowing if or how either teacher would perform differently in different settings, the point of this article was to show how a climate of high-stakes testing influenced each teacher’s localized context in particular ways. Nor could I generalize these findings to all urban teachers in high-needs schools. Other teachers might respond much differently to macro-level policies and mandates, even within these same contexts. Indeed, a growing body of research shows how teachers in urban settings similar to the school sites in this study are experiencing success, even as they contend with high-stakes tests (e.g., Kontovourki & Campis, 2010; Langer, 2001; Wolf & Wolf, 2002; Wollman-Bonilla, 2004). However, the cross-case analysis in this article showed key distinctions among urban public schools that terms such as “urban, high-needs” gloss over, and how these distinctions influence what counts as literacy in each context. Finally, this study considered the teachers’ perspectives and practices, and therefore offered an in-depth but narrow view of the influence of high-stakes tests on literacy instruction. We also need to consider studies from the perspectives of students (e.g., Triplett & Barksdale, 2005), administrators (e.g., Koschoreck, 2001; Skrla & Scheurich, 2001), and policy makers. Collectively, such studies might indicate common issues across perspectives that would provide a more nuanced and complex picture of the influence of high-stakes testing on classroom practices than currently exists.

Implications

The findings lead to two implications. First, while many interview and survey studies analyze teachers’ beliefs about high-stakes testing, this article indicates the need for more case studies that show connections between teachers’ beliefs and practices at the micro level and policies at the macro level. In addition, Dooley and Assaf (2009) stated, “understanding educational inequity from teachers’ perspectives might help to offer a fair portrayal of the complexities of context” (p. 384). This, in turn, might influence macro-level policies regarding high-stakes tests and literacy instruction. Complexities of context in my study showed that NCLB accountability measures exacerbate the achievement gap among urban, high-needs schools.

Second, teacher quality needs to be assessed in terms other than students’ test scores. I witnessed two devoted, competent teachers exert their utmost effort to implement the expected literacy curriculum. Indeed, each was nominated by multiple people in their local communities as highly effective. Yet, based on their students’ test scores, one was considered successful and the other unsuccessful, despite the uneven conditions that they faced.

!"# ! !"# #$#%#&!'() *+",,$ -,.(&'$ $%&' #(")

Page 26: Untitled

Baskwill (2006) discussed a perception in American culture that those who are less successful in school simply do not try hard enough, based on the adage, “if at first you don’t succeed, try, try again.” Baskwill posited that this adage is a powerful myth in our nation’s cultural model of success. For example, jobless people are simply “not trying hard enough” or “are lazy” or “prefer living on welfare.” In our high-stakes climate, it now implicates schools, teachers, and ultimately students. It forces conformity to standards against which everyone is measured, and consequently reifies a deficit discourse for teachers and students as “lacking motivation, not applying themselves, or being ‘deficient’ in some way” (p. 512). This study reveals what Baskwill reported: that success or failure in high-stakes tests might have nothing to do with effort. Instead, teachers in Baskwill’s study saw the need for other discourses around literacy that would actually open opportunities for students’ literacy learning that our high-stakes school environments are shutting down.

Notes

1. They are also the two testing grades for the low-stakes National Assessment of Educational Progress (NAEP).

2. In 2011, the ELA also became a 3-day test for grades 3 and 5.

3. My transcriptions of discourse used the following coding: underline indicates overlapping speech; bold print indicates emphasis; ALL CAPITALS indicates an increase in volume (and pitch); [words in brackets] indicate actions; [“words in brackets”] indicate students’ talk.

References

Abrams, L. M., Pedulla, J. J., & Madaus, G. F. (2003). Views from the classroom: Teachers’ opinions of statewide testing programs. Theory into Practice, 42(1), 18–29.

Allington, R. L. (2001). What really matters for struggling readers: Designing research-based programs. New York: Longman.

Allington, R. L., & Johnston, P. H. (2002). Reading to learn: Lessons from exemplary fourth-grade classrooms. New York: Guilford.

Angrosino, M. V., & Mays de Perez, K. A. (2000). Rethinking observation: From method to context. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp. 673–703). Thousand Oaks, CA: Sage.

Applebee, A. N. (2003). Balancing the curriculum in the English language arts: Exploring the components of effective teaching and learning. In J. Flood, D. Lapp, J. R. Squire, & J. M. Jensen (Eds.), Handbook of research on teaching the English language arts (2nd ed., pp. 676–686). Mahwah, NJ: Erlbaum.

Assaf, L. C. (2006). One reading specialist’s response to high-stakes testing pressures. Reading Teacher, 60(2), 158–167.

Au, K. H. (2003). Balanced literacy instruction: Implications for students of diverse backgrounds. In J. Flood, D. Lapp, J. R. Squire, & J. M. Jensen (Eds.), Handbook of research on teaching the English language arts (2nd ed., pp. 955–966). Mahwah, NJ: Erlbaum.

Au, W. (2007). High-stakes testing and curricular control: A qualitative metasynthesis. Educational Researcher, 36(5), 258–267.

Barton, D., & Hamilton, M. (1998). Local literacies: Reading and writing in one community. London: Routledge.

Barton, D., Hamilton, M., & Ivanic, R. (Eds.). (2000). Situated literacies: Reading and writing in context. New York: Routledge.

Baskwill, J. (2006). If at first you don’t succeed . . . : A closer look at an old adage. Language Arts, 83(6), 506–513.

!"#$%"&%'&()*& +*%#!'*# ! ,-.

Page 27: Untitled

Bomer, K. (2005). Missing the children: When politics and programs impede our teaching. Language Arts, 82(3), 168–176.

Calkins, L., Montgomery, K., Santman, D., & Falk, B. (1998). A teacher’s guide to standardized reading tests: Knowledge is power. Portsmouth, NH: Heinemann.

Cimbricz, S. K. (2001). The making and meaning of change: Standards-based state testing, fourth-grade teachers’ thinking and practice, and English language arts instruction (Unpublished doctoral dissertation). State University of New York at Buffalo.

CTB/McGraw-Hill. (2007). English Language Arts Test: Teacher’s directions. Retrieved from http://www.nysedregents.org/grade4/englishlanguagearts/home.html

Dooley, C. M. (2005). One teacher’s resistance to the pressures of test mentality. Language Arts, 82(3), 177–185.

Dooley, C. M., & Assaf, L. C. (2009). Contexts matter: Two teachers’ language arts instruction in this high-stakes era. Journal of Literacy Research, 41(3), 354–391.

Dyson, A. H., & Genishi, C. (2005). On the case: Approaches to language and literacy research. New York: Teachers College Press.

Fairclough, N. (1989). Language and power. London: Longman.

Firestone, W. A., Mayrowetz, D., & Fairman, J. (1998). Performance-based assessment and instructional change: The effects of testing in Maine and Maryland. Educational Evaluation and Policy Analysis, 20(2), 95–113.

Florio-Ruane, S., & McVee, M. (2000). Ethnographic approaches to literacy research. In M. L. Kamil, P. B. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. 3, pp. 153–162). Mahwah, NJ: Erlbaum.

Foucault, M. (1995). Discipline and punish: The birth of the prison (A. Sheridan, Trans.) (2nd ed.). New York: Vintage Books.

Fountas, I., & Pinnell, G. S. (2001). Guiding readers and writers, grades 3–6: Teaching comprehension, genre, and content literacy. Portsmouth, NH: Heinemann.

Freppon, P. A., & Dahl, K. L. (1998). Balanced instruction: Insights and considerations. Reading Research Quarterly, 33(2), 240–251.

Gee, J. P. (1996). Social linguistics and literacies: Ideologies in Discourses (2nd ed.). London: Taylor & Francis.

Gee, J. P. (2000a). Discourse and sociocultural studies in reading. In M. L. Kamil, P. B. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. 3, pp. 195–207). Mahwah, NJ: Erlbaum.

Gee, J. P. (2000b). The New Literacy Studies: From “socially situated” to the work of the social. In D. Barton, M. Hamilton, & R. Ivanic (Eds.), Situated literacies: Reading and writing in context (pp. 180–196). London: Routledge.

Gee, J. P. (2005). An introduction to discourse analysis: Theory and method (2nd ed.). New York: Routledge.

Glovin, D., & Evans, D. (2006, December). How test companies fail your kids. Bloomberg Markets, 127–138.

Grant, S. G. (2001). An uncertain lever: Exploring the influence of state-level testing in New York State on teaching social studies. Teachers College Record, 103(3), 398–426.

Guthrie, J. T. (2002). Preparing students for high-stakes test taking in reading. In A. E. Farstrup & S. J. Samuels (Eds.), What research has to say about reading instruction (3rd ed., pp. 370–391). Newark, DE: International Reading Association.

Haladyna, T. M., Nolen, S. B., & Haas, N. S. (1991). Raising standardized achievement test scores and the origins of test score pollution. Educational Researcher, 20(5), 2–7.

Hicks, D. (1996). Contextual inquiries: A discourse-oriented study of classroom learning. In D. Hicks (Ed.), Discourse, learning, and schooling (pp. 104–141). New York: Cambridge University Press.

Hoffman, J. V., Assaf, L. C., & Paris, S. G. (2001). High-stakes testing in reading: Today in Texas, tomorrow? Reading Teacher, 54(5), 482–492.

Hornof, M. (2008). Reading tests as a genre study. Reading Teacher, 62(1), 69–73.

Kontovourki, S., & Campis, C. (2010). Meaningful practice: Test prep in a third-grade public school classroom. Reading Teacher, 64(4), 236–245.

!"# ! !"# #$#%#&!'() *+",,$ -,.(&'$ $%&' ()"*

Page 28: Untitled

Koretz, D. (2008). Measuring up: What educational testing really tells us. Cambridge, MA: Harvard University Press.

Koschoreck, J. W. (2001). Accountability and educational equity in the transformation of an urban district. Education and Urban Society, 33(3), 284–304.

Langer, J. A. (2001). Beating the odds: Teaching middle and high school students to read and write well. American Educational Research Journal, 38, 837–880.

Lewis, C., Enciso, P., & Moje, E. B. (2007). Reframing sociocultural research on literacy: Identity, agency, and power. Mahwah, NJ: Erlbaum.

Maybin, J. (2000). The new literacy studies: Context, intertextuality and discourse. In D. Barton, M. Hamilton, & R. Ivanic (Eds.), Situated literacies: Reading and writing in context (pp. 197–209). London: Routledge.

Merriam, S. B. (1998). Qualitative research and case study applications in education. San Francisco: Jossey-Bass.

Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook (2nd ed.). Thousand Oaks, CA: Sage.

Millman, J., Bishop, C. H., & Ebel, R. (1965). An analysis of test-wiseness. Educational and Psychological Measurement, 25(3), 707–726.

Miyasaka, J. R. (2000, April 21–24). A framework for evaluating the validity of test preparation practices. Paper presented at the annual meeting of the American Educational Research Association, New Orleans.

New York City Department of Education. (2003). A comprehensive approach to balanced literacy: A handbook for educators. New York: Author.

New York State Education Department. (2005). English language arts core curriculum (prekindergarten–grade 12). Albany, NY: Author.

Nichols, S. L., & Berliner, D. C. (2007). Collateral damage: How high-stakes testing corrupts American schools. Cambridge, MA: Harvard Education Press.

Pearson, P. D., & Raphael, T. E. (1999). Toward a more complex view of balance in the literacy curriculum. In W. D. Hammond & T. E. Raphael (Eds.), Early literacy instruction for the new millennium (pp. 1–21). Grand Rapids: Michigan Reading Association and Center for the Improvement of Early Reading Achievement.

Ravitch, D. (2010). The death and life of the great American school system: How testing and choice are undermining education. New York: Basic Books.

Rothstein, R. (2004). Class and schools: Using social, economic, and educational reform to close the black-white achievement gap. New York: Economic Policy Institute, Teachers College.

Santman, D. (2002). Teaching to the test? Test preparation in the reading workshop. Language Arts, 79(3), 203–211.

Skrla, L., & Scheurich, J. J. (2001). Displacing deficit thinking in school district leadership. Education and Urban Society, 33(3), 235–259.

Smith, M. L. (1991). Put to the test: The effects of external testing on teachers. Educational Researcher, 20(5), 8–11.

Stake, R. E. (2000). Case studies. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed.). Thousand Oaks, CA: Sage.

Strauss, A., & Corbin, J. (1994). Grounded theory methodology: An overview. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (pp. 273–284). Thousand Oaks, CA: Sage.

Street, B. V. (1995). Social literacies: Critical approaches to literacy in development, ethnography and education. London: Longman.

Triplett, C. F., & Barksdale, M. A. (2005). Third through sixth graders’ perceptions of high-stakes testing. Journal of Literacy Research, 37, 237–260.

Tyack, D., & Cuban, L. (1995). Tinkering toward utopia: A century of public school reform. Cambridge, MA: Harvard University Press.

Urdan, T. C., & Paris, S. G. (1994). Teachers’ perceptions of standardized achievement tests. Educational Policy, 8(2), 137–156.

U.S. Department of Education. (2001). No child left behind. Jessup, MD: Education Publications Center. Retrieved from http://www.ed.gov/pubs/edpubs.html

Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.

!"#$%"&%'&()*& +*%#!'*# ! ,-,

Page 29: Untitled

Wertsch, J. V. (1991). Voices of the mind: A sociocultural approach to mediated action. Cambridge, MA: Harvard University Press.

Williamson, P., Bondy, E., Langley, L., & Mayne, D. (2005). Meeting the challenge of high-stakes testing while remaining child-centered: The representations of two urban teachers. Childhood Education, 81(4), 190–195.

Wolf, S. A., & Wolf, K. P. (2002). Teaching true and to the test in writing. Language Arts, 79(3), 229–241.

Wollman-Bonilla, J. E. (2004). Principled teaching to(wards) the test? Persuasive writing in two classrooms. Language Arts, 81(6), 502–511.

Zancanella, D. (1992). The influence of state-mandated testing on teachers of literature. Educational Evaluation and Policy Analysis, 14(3), 283–295.

!"# ! !"# #$#%#&!'() *+",,$ -,.(&'$ $%&' ()"*

Page 30: Untitled

Copyright of Elementary School Journal is the property of University of Chicago Press and itscontent may not be copied or emailed to multiple sites or posted to a listserv without thecopyright holder's express written permission. However, users may print, download, or emailarticles for individual use.