
The Journal of Educational Research. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/vjer20

Effects of Classroom Assessment on Student Motivation in Fifth-Grade Science. Candice Stefanou (Bucknell University) & Jay Parkes (University of New Mexico). Published online: 02 Apr 2010.

To cite this article: Candice Stefanou & Jay Parkes (2003) Effects of Classroom Assessment on Student Motivation in Fifth-Grade Science, The Journal of Educational Research, 96(3), 152-162, DOI: 10.1080/00220670309598803

To link to this article: http://dx.doi.org/10.1080/00220670309598803



Effects of Classroom Assessment on Student Motivation in Fifth-Grade Science

CANDICE STEFANOU, Bucknell University
JAY PARKES, University of New Mexico

ABSTRACT Researchers have suggested that assessment has the potential to affect learner behavior in terms of cognitive strategy use and motivation. The authors attempted to provide an understanding of the nature of the effect of particular assessment types on motivation. Students in 3 5th-grade science classes were exposed to 3 different classroom assessment conditions: traditional paper-and-pencil tests, a laboratory task format of assessment, and a performance assessment. Measures of attitudes about science, goal orientation, and cognitive engagement (J. L. Meece, P. C. Blumenfeld, & R. K. Hoyle, 1988) were used. Analyses indicate a significant effect attributable to assessment type on goal orientation only, with the traditional paper-and-pencil tests and the performance assessments fostering more task-focused orientations than the laboratory tests.

Key words: classroom assessment, fifth-grade science, student motivation

Dissatisfaction with large-scale, high-stakes multiple-choice testing in the 1980s led many educators to focus attention on the format of the assessments that students were taking. Although some educators called for abolishing the use of standardized tests (e.g., Kohn, 2000), others rallied around the "tail wagging the dog" argument, acknowledging that large-scale, high-stakes tests were not going away and that a better tactic might be to co-opt high-stakes tests and make those measures "tests worth taking" (Wiggins, 1992). As a result, research was done and conjecture made on the systemic impact of educators' using performance assessments for large-scale, high-stakes assessment (e.g., Khattri, Reeve, & Kane, 1998).

Part of the impetus for the performance-assessment movement was the grassroots work of classroom teachers in assessing their students through projects, portfolios, and the like. There were many claims that the same benefits seen (but rarely documented) at the classroom level would accrue when these assessment formats were transferred to the large-scale, high-stakes level (cf. Wiggins, 1989). Few of those claims for large-scale, high-stakes use were empirically explored (Bond, 1995).

The goal of this study was to explore one claim for performance assessments: that they foster a more positive motivational orientation for students than supplied-response assessments do when used in the classroom. There is more than anecdotal experience to expect them to do so; there is a strong theoretical underpinning to support this argument, assuming that certain design elements are in place. In this study, we explored the effects of different classroom assessment formats on motivation by using upper elementary science classes as the context.

Systemic Impact of Standardized Testing

Because the framework of interest in this study centers on the consequences of test use (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on Measurement in Education [NCME], 1999; Messick, 1994), the importance of consequences to valid inferences and uses of assessment results is explored through a review of studies that document the consequences of large-scale, high-stakes tests.

Studies of Consequences

The consequences of a particular assessment are important evidence of validity (AERA, APA, & NCME, 1999; Messick, 1994; Shepard, 1997) and are especially important with reference to performance assessments (e.g., Shepard, 1989; Wiggins, 1989, 1992). Exploring the consequences of a particular assessment procedure is in essence a cost-benefit analysis. That is, there are intended positive benefits to be gained from a particular assessment, although there are undoubtedly also some unforeseen and sometimes detrimental outcomes. The role of an analysis of consequences is to determine whether the positive consequences (intended and unintended) outweigh the negative consequences.

Address correspondence to Candice Stefanou, 465 Olin Hall, Bucknell University, Lewisburg, PA 17837 (E-mail: cstefanou@bucknell.edu).



This type of analysis is not new in the performance-assessment debate. Although test designers have acknowledged some of the benefits of using performance assessments in large-scale, high-stakes settings, the costs of doing so in dollars and in time (among other costs) have made the use of such assessments prohibitive (cf. Hardy, 1996).

A body of research focuses on the costs or detrimental effects of standardized testing on teaching practices and learning strategies (e.g., Darling-Hammond, 1990; Shepard, 1991). The argument in favor of alternative assessments as a tool for reform rests in large part on the results of studies that have been conducted to assess the effects of traditional standardized testing on instruction and achievement (e.g., Herman, 1992; Herman & Golan, 1993; McNeil & Valenzuela, 2000; Monty Neill & Medina, 1989; Smith & Rottenberg, 1991). Such studies indicate that there are a number of unintended, negative consequences of an increasing overreliance on standardized assessments. Included among these adverse consequences are (a) large portions of instructional time devoted to preparing students to take the standardized tests; (b) narrowing of the curriculum to include only that which is assessed by the tests; (c) fragmentation of the curriculum, leading to an inert knowledge base for the students; (d) evolution of instructional practices, including teacher-made tests, to mimic the types of learning skills that dominate standardized tests; and (e) truncation of students' learning strategies, resulting in the use of primarily lower order learning skills.

Research provides support for some of the claims about standardized testing. For example, in a survey of nearly 1,000 students in elementary school through high school, Paris, Lawton, Turner, and Roth (1991) asked students about standardized tests. As they get older, students feel "greater resentment, anxiety, cynicism, and mistrust of standardized achievement tests" (p. 16). The authors reported that students seem to get the message that the tests do not count for anything: they have no impact on grades, teachers and parents do not care about an individual student's performance, and the tests are unrelated to what is going on in the classroom. Overall, Paris et al. reported a decrease in motivation to perform well on standardized achievement tests as students get older.

A. H. Karmos and Karmos (1984) found similar results by surveying students in Grades 6-9. In general, they found that students tended to react negatively to standardized achievement tests. A large percentage of students responded positively to statements that the tests were dumb or a waste of time. However, the authors found that students who believed that achievement tests were important and who had higher perceptions of self-efficacy did better on the tests than students who reacted negatively to the tests.

Claims About Performance Assessment

Advocates of performance assessments argue, in light of the perceived power of standardized assessments to drive instruction, that the form of an assessment creates the conditions for the type and quality of instructional practices engaged in by educators and the learning strategies and behaviors exhibited by students. The arguments offered to demonstrate the advantages of performance assessments over more traditional assessments are that performance assessments (a) are more authentic assessments; (b) use context, thus diminishing the potential overreliance on inert knowledge; (c) encourage higher order thinking rather than memorization and recall; (d) are more engaging and therefore more motivating for the student; and (e) are more valid assessments, specifically more systemically valid (Frederiksen & Collins, 1989). Wiggins (1989) argued that standardized tests rarely inform students, often tap irrelevant information, and are seldom meaningful to students. He suggested that performance assessments not only inform others about the test taker but also inform the test taker about the important issues and topics within a given field.

Furthermore, the idea that performance assessments may encourage higher order thinking (e.g., Frederiksen, 1984; Resnick, 1987; Resnick & Resnick, 1989; Shepard, 2000; Wiggins, 1992) and may be more engaging and more motivating (e.g., Meisels, Dorfman, & Steele, 1995; Wiggins, 1989) than traditional tests leads one to investigate the hypothesized relationship. The criticisms of standardized testing and the advocacy for performance assessments have been generalized to the format of the tests, that is, multiple-choice items versus open-ended tasks. At this point in the understanding of the effects of testing, it is unclear whether the environment surrounding the test format (e.g., the imposed testing for accountability purposes) or the format itself is more problematic.

Performance Assessment in the Classroom

There is some supportive evidence for the claims made regarding performance assessments in the classroom. Lee (1994) found that college students have a tendency to use different study strategies depending on the format of the test to be taken; deep-level processing strategies are associated with performance assessments, and surface-level strategies are associated with traditional paper-and-pencil tests. Lu and Suen (1995) examined the effects of field dependence and field independence on performance-assessment scores and multiple-choice test scores in a college classroom. They found that there was an interaction between cognitive style and assessment type. Their results showed that field-independent students scored higher on the performance assessment than did field-dependent students, whereas no difference existed on the multiple-choice scores. This finding suggests that performance assessments may require a set of filtering strategies different from those that are required for traditional assessments.

Parkes (2000) explored the relationship between goal orientation and scores on both an objective measure and a performance assessment.



High school introductory Spanish students were given an objective test of grammar, vocabulary, and idioms and a performance assessment requiring a written paragraph about a day of sightseeing in Spain. Both tests covered the same text chapters. The paragraphs were scored for verb forms, vocabulary, coherence, and creativity. The students also completed a survey of their perceived control. Parkes found that a student's perceived internal control correlated significantly with performance-assessment scores but did not correlate with objective-test scores. Elliot, McGregor, and Gable (1999) similarly found that performance goals (avoidant and approach) were related to an examination score but mastery goals were not. Furthermore, when Parkes (2000) used objective-test scores to predict performance-assessment scores, perceptions of internal control explained variance beyond that predicted by the knowledge represented on the objective test.

Pintrich and DeGroot (1990) explored the relationships among self-efficacy, test anxiety, intrinsic value, cognitive strategy use, and self-regulation and seventh graders' performance on seatwork, examinations and quizzes, and essays and reports. Regression analyses revealed a different set of relationships when one considered examinations and quizzes versus essays and reports. In the authors' investigation of examinations and quizzes, (a) test anxiety showed a negative relationship; (b) self-regulation was positively related; and (c) cognitive strategy use, self-efficacy, and intrinsic value were unrelated. For the essays and reports, there was no relationship with self-efficacy, test anxiety, or intrinsic value; cognitive strategy use was negatively related and self-regulation positively related. Studies such as these support the claim that different psychological processes operate under different assessment formats.

Some researchers (e.g., Meisels, Dorfman, & Steele, 1995; Wiggins, 1989) have claimed that performance assessments will be more motivating for students than other forms of assessment. There are characteristics of performance assessments that should foster positive goal orientations in students, especially when these characteristics are examined from the perspective of the achievement goal theoretic framework.

Motivational Climate and Orientation

Goal theory posits that persons enter achievement situations with different orientations. Generally speaking, a person might exhibit mastery-learning goals or performance goals (both approach and avoidant). Although slightly different names have been used in the literature, the underlying concepts are the same (cf. Ames & Archer, 1988; Midgley, Kaplan, & Middleton, 2001).

Students with a mastery-learning goal orientation are concerned with increasing their competence and believe that effort is a key mechanism in achieving mastery. Working hard is valued in successful performance, learning situations are seen as opportunities to develop one's skills, and learning is perceived as rewarding in itself. Success is not determined through normative comparison but is self-referenced. These students are more likely to adopt deep strategy use (Alao & Guthrie, 1999; Elliot, McGregor, & Gable, 1999), to initiate learning, and to prefer difficult or challenging tasks (e.g., Turner, Thorpe, & Meyer, 1998). Deep strategy use is similar to intrinsic motivation in that it involves a desire to learn for learning's sake.

Students with performance goal orientations are concerned either (a) with displaying their competence or ability and avoiding a display of incompetence or inability (Dowson & McInerney, 2001; Middleton & Midgley, 1997; Turner et al., 1998) or (b) with performing for the purpose of receiving external rewards. Learning situations are seen as opportunities to be successful and to be deemed able. Learning is merely a means to an end. Putting forth effort, or working hard, is perceived as an antagonist to being judged able. Consequently, such students will avoid making a demonstration of their lack of ability (Dowson & McInerney; Middleton & Midgley). Success is determined in reference to others (interpersonal comparisons) or an external standard. Such students are less likely to choose challenging tasks and to use higher order learning strategies (Turner et al.).

These personal characteristics are influenced by the environment surrounding an individual (Anderman & Young, 1994; Church, Elliot, & Gable, 2001). This forms a cyclical effect whereby one's goal orientation influences how one perceives the environment and the environment affects one's goal orientation. If one considers assessments to be part of the environment, then assessments might affect and be affected by goal orientation (Ames, 1992). Parkes (2000) presented evidence that goal orientation can predict performance-assessment scores.

Assessments, and, more generally, evaluative information, are highly salient elements that one can use to establish a motivational climate (Ames, 1992; Church et al., 2001). When evaluation is product focused, emphasizes the elimination of errors (Turner et al., 1998), and encourages social comparison (Anderman & Young, 1994) and public performance, a performance-focused climate ensues (Ames). Authority and control are also important elements in establishing a motivational climate (Ames). When students are encouraged to be autonomous, when they have choices and are encouraged to choose judiciously and taught how to do so, a positive mastery climate is established (Ames).

In summary, a mastery climate can be fostered when (a) tasks involve variety, novelty, diversity, and interest; (b) tasks are meaningful and therefore hold justification for engagement; (c) tasks are challenging; (d) students have control over the process, the product, or both; (e) students are encouraged to develop and use self-management and monitoring skills and effective learning strategies; (f) students have real choices and input; and (g) the focus is on individual improvement, mastery, effort, and making mistakes as part of growth (Ames, 1992).




Motivational Aspects of Performance Assessments

Blumenfeld (1992) determined that performance assessments are good candidates for research on mastery goals because of their characteristics. According to Baker, O'Neil, and Linn (1993, p. 1211), performance assessments (a) are open ended, (b) focus on higher order or complex skills, (c) use context-sensitive strategies, (d) use complex problems requiring several types of performance and significant student time, (e) may require group as well as individual performance, and (f) allow for a significant degree of student choice. Given the goal theory perspective, and especially Ames's (1992) articulation of instructional practices that foster mastery orientations, these characteristics should help with that process.

Method

Participants

Seventy-nine students in three 5th-grade science classes participated in this study. All three classes were in the same rural, northeastern U.S. elementary school. The classes were heterogeneous in terms of ability levels and gender; the actual classroom roster was determined in the previous academic year by a team consisting of school administrators and faculty. All three classes were taught by the same teacher. The sample consisted of 46 (58%) boys, 28 (35%) girls, and 5 (6%) students whose sex was not indicated. All students were either 10 or 11 years old.

Procedure

The study took place during the last half of the school year (roughly January to June) and focused on three different instructional units: (a) a unit on fresh water, (b) a unit on salt water, and (c) a special unit on The Voyage of the Mimi, a commercially available interdisciplinary unit (Bank Street College of Education, 1990). Each form of assessment was developed for each unit. The assessment type was then counterbalanced across the three classes of students so that within each unit, each class took a different type of assessment, and across the three units, each class received each type of assessment. After each assessment had been administered, we asked the students to complete the questionnaire about their attitudes toward science, goals, and cognitive engagement during the preceding assessment session.
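Concretely, the counterbalance is a 3 x 3 Latin square: each row (class) and each column (unit) contains each assessment type exactly once. The short Python sketch below generates one such rotation; the pairing of particular classes with particular orderings is hypothetical, because the article does not report which class began with which format.

# Minimal sketch of a 3 x 3 Latin-square counterbalance (illustrative only;
# the actual class-to-ordering pairing is not reported in the article).
UNITS = ["fresh water", "salt water", "Voyage of the Mimi"]
ASSESSMENTS = ["paper-and-pencil", "laboratory", "performance assessment"]

def assessment_for(class_idx: int, unit_idx: int) -> str:
    # Rotating the assignment guarantees each class meets each format once
    # and each unit is assessed by all three formats across the classes.
    return ASSESSMENTS[(class_idx + unit_idx) % len(ASSESSMENTS)]

for c in range(3):
    for u, unit in enumerate(UNITS):
        print(f"Class {c + 1} | {unit} unit -> {assessment_for(c, u)}")

Printed out, the schedule shows the property the design relies on: no class ever repeats a format, so format effects are not confounded with unit topic or time of year.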

Instruments

In this study, we used portions of the Science Activity Questionnaire (reported in Meece, Blumenfeld, & Hoyle, 1988), specifically the Science Attitudes, Goal Orientations, and Cognitive Engagement subscales. The Science Attitudes subscale consisted of 12 items with observed Cronbach's alphas of .76, .86, and .87 when administered after the first, second, and third instructional units, respectively. Higher scores on the Science Attitudes subscale indicate more positive experiences with science. The Goal Orientations subscale consisted of 12 items with observed Cronbach's alphas after each instructional unit of .56, .36, and .62, respectively. Some items on the Goal Orientations subscale require reverse scoring; when the subscale is scored with the reverse-scored items given the appropriate values, higher scores indicate more mastery-learning goal orientations. The Cognitive Engagement subscale consisted of 15 items with observed Cronbach's alphas after each instructional unit of .55, .71, and .74, respectively. Some items on the Cognitive Engagement subscale also required reverse scoring; when scored appropriately, higher scores indicated more self-regulated and engaged behaviors.
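For readers who want the computation behind these reliability figures: Cronbach's alpha for a k-item subscale is estimated from the item variances and the variance of the total score,

\alpha = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_X^2}\right),

where \sigma_i^2 is the variance of item i and \sigma_X^2 is the variance of the summed scale score. The Goal Orientations value of .36 after the second unit therefore means that, on that administration, the 12 items shared relatively little common variance.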

Students were assessed through traditional paper-and-pencil (P&P) tests, traditional science laboratories (lab), and performance assessments (PA). The traditional P&P tests were designed by the classroom teacher and consisted of multiple-choice, true-false, matching, fill-in-the-blank, and concept definition essay questions. The teacher designed the traditional P&P tests to capture conceptual knowledge apart from the student's ability to put that knowledge to use. We designed the laboratory assessments and the PAs jointly with the classroom teacher (see Appendix for a sample P&P test, laboratory assessment, and PA). The laboratory assessments were designed along the lines of typical laboratory experiments that require students to actively manipulate materials and reach conclusions based on observations. The steps that the students needed to take were clearly defined, and the observations the students needed to make were specified. The intent of the laboratory assessment was to serve as an assessment type that falls in between the traditional P&P test and the more open-ended PA. We designed the PAs by using the attributes of performance-based assessments delineated by Baker, O'Neil, and Linn (1993) and reviewed earlier. In the sample laboratory assessment and PA provided in the Appendix, we illustrate the differences between the two items using the same subject-matter platform.

In addition to the quantitative data, we obtained qualitative data in the form of taped interviews with each of the fifth-grade science classes several days after the third instructional unit. All three interview sessions were conducted in the classroom with the entire class participating. The sessions were conducted on the same day in three consecutive class periods. We asked the following questions: (a) Having experienced all three assessment types, which type of assessment did the students prefer? (b) How could the PA be made better; that is, what recommendations would the students have for making these tests "tests worth taking" (as Wiggins, 1992, has called them)? And, following conversation during the interviews, (c) If grades were not a factor, would the students' preferences change? We collected qualitative data to assist in gaining a better understanding of the interplay between assessment type and student motivation than what might have been captured by the inventory.




Analyses

Any effect on the dependent variables (attitudes, goal orientations, and cognitive engagement) caused by time of year and topic of the unit should have been negated by the counterbalance. Likewise, the counterbalance combined both between-subjects and within-subjects comparisons across different assessment types. Therefore, the data were collapsed so that one multivariate analysis of variance was performed with the three dependent variables (attitudes, goal orientations, and cognitive engagement) and assessment type, the one within-subjects independent variable. Some students were lost because of the repeated measures design. Students were eliminated from the study if they missed either the PA test or the laboratory test on the scheduled test date. Although students could easily make up the traditional P&P test by taking it on another day, they could not make up the PA or laboratory tests in this way because the tests were designed to be carried out in groups. Therefore, once a student missed either the PA or the laboratory test, he or she was automatically given the traditional P&P test as the make-up test and was accordingly excluded from the analysis. This left an operational sample of 57 students who each had one traditional P&P test, one PA, and one laboratory test.
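The collapsing step can be pictured as a reshape from unit-indexed to assessment-indexed data. The pandas sketch below (our illustration with invented student IDs and scores, not the study's data) shows the move from one row per student per unit to one column per assessment type per student, the layout a one-way within-subjects analysis consumes.

# Illustrative reshape only; student IDs and scores are invented.
import pandas as pd

raw = pd.DataFrame({
    "student": [1, 1, 1, 2, 2, 2],
    "unit": ["fresh", "salt", "mimi", "fresh", "salt", "mimi"],
    "assessment": ["P&P", "lab", "PA", "lab", "PA", "P&P"],
    "goal_orientation": [3.2, 3.0, 3.1, 3.1, 3.3, 3.4],
})

# Because of the counterbalance, dropping the unit column loses nothing:
# each student contributes exactly one score per assessment type.
wide = raw.pivot(index="student", columns="assessment",
                 values="goal_orientation")
print(wide)  # rows = students; columns = P&P, PA, lab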

Results

There was a statistically significant multivariate effect (Wilks's Λ = .769), F(6, 51) = 2.551, p < .05. Follow-up univariate F tests revealed no statistically significant effect on science attitudes, F(2, 112) = 2.557, p > .05, or cognitive engagement, F(2, 112) = 0.922, p > .05. There was, however, a statistically significant effect for goal orientation, F(2, 112) = 6.013, p < .05. Using the Bonferroni multiple-comparisons procedure for pairwise comparisons, we found that mean goal orientation was statistically significantly higher in the P&P and PA conditions than in the lab condition, although there was no statistically significant difference between the PA and P&P conditions. The respective means are reported in Table 1.
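As a consistency check on the reported statistics (our reconstruction, assuming the standard repeated-measures formulation in which the three-level factor is reduced to two contrast scores on each of the three dependent variables, i.e., a one-sample Hotelling's T^2 on p = 6 variables with n = 57):

T^2 = (n - 1)\,\frac{1 - \Lambda}{\Lambda} = 56 \times \frac{1 - .769}{.769} \approx 16.8, \qquad F = \frac{n - p}{p(n - 1)}\,T^2 = \frac{51}{6 \times 56} \times 16.8 \approx 2.55,

with degrees of freedom (p, n - p) = (6, 51), which reproduces the reported F(6, 51) = 2.551 to rounding. The Bonferroni procedure for the three pairwise comparisons simply evaluates each at \alpha = .05/3 \approx .017.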

Qualitative analyses revealed that the students were concerned primarily with grades, and their preference for assessment type was overwhelmingly in favor of P&P, the form that they knew best and that they believed would help them achieve their grade. When asked for a show of hands for which type of assessment they preferred, 54 out of 79 students indicated the traditional P&P test; 23 students indicated a preference for the laboratory; and only 2 students preferred the PAs. They preferred the P&P test because they knew how to study for it; they knew how to take it; they preferred the breadth of the P&P tests over the depth of the PAs; and, above all, they knew what the teacher wanted from them. When asked why they preferred the P&P test, the students pointed to the comfort that familiarity with the P&P test provided for them. For example, students responded,

Experience taking the test:

I prefer the regular test because we have more experience on them than the other.

I think the written test was a lot easier because it was like the ones we already had. You just had to give the answers to the questions.

It seems like there are more questions and you can study for that, but really you don’t need to study for the lab test. But the written ones-I don’t know. I just know them better.

Knowing what the teacher expects:

The written test, they kind of [like] tell you more of [like] what they want to see really. You can tell what the teacher is looking for but it’s not so clear in the other ones.

I like the written test better because it [like] gives you the directions and you don’t have to be so specific about things besides the essays. And you’re learning about many different things instead of just one thing.

Breadth of coverage:

On the written test, it covered everything that we learned. On the lab or the performance test, it was only about one or two things. So we studied everything instead of just that.

The PA required the students to work out for themselves what steps to take, causing great concern about taking the wrong step and thereby negatively influencing their grade.

Table 1. Means and Standard Deviations

                           Science attitudes   Goal orientation   Cognitive engagement
Assessment type             M       SD          M       SD         M       SD
Paper and pencil            3.100   0.414       3.214   0.434      3.120   0.507
Laboratory                  3.008   0.454       3.035   0.386      3.055   0.439
Performance assessment      3.149   0.391       3.213   0.431      3.126   0.442



Examples of student concerns about PA and laboratory assessments follow:

Ambiguity regarding expectations and how to proceed:

On the performance test, or whatever it was, there were no directions and you could have done something [like] what the teacher didn’t want. Like something different and on the written test there were directions and you knew what you were doing.

You didn't get directions and it was [like] hard to do.

I didn’t like performance assessments because I didn’t know where to start or anything. I didn’t have a set of rules. I couldn’t set my goal. My goal is to get an A.

I didn’t like the performance test because if you couldn’t pick out what the rules were or if the rules you made up were wrong then everything you did would be wrong.

Depth rather than breadth of coverage:

I didn't like it because you studied all things then it was only on one thing.

Unfamiliar with task demands:

With the lab and the other one, it all depended on how you explained it. By writing, and I'm not very good at writing. The other one-it's how much you know.

Discussions of the relative merits of PAs and laboratory tests revealed that there were some features that the students found attractive. Generally speaking, the students found certain aspects of working in teams to be both positive and negative and the interactive, hands-on nature of the tasks to be positive as well.

Benefits of working in teams:

I prefer the lab test. It’s a lot easier to get answers because you’re working in a group. And it’s a lot easier to get them in the beginning so you have them for the other part where you have to write down your results and all that. And you can get together and if you don’t know something maybe somebody else in your group does and they can help you out with it.

I like to work with partners so you can give your opinion and everyone else gives their opinion so you can get the general idea.

Hands-on aspects:

It’s something new and different and you can actually see for yourself what happens when you put two ends of a magnet together or something.

I like it more because you got to interact with your work instead of just [like] writing true/false. You got to do it.

I don’t like to wear my hand out because it starts aching. And it’s more fun and you get to be involved with your work instead of having to do it all [the experiment] in your head like with the written test.

I like doing hands on activities and I really hate writing.

I like the performance assessment because it’s just like the lab but you get to think of the rules and stuff-how to do it.

I liked it because it was more fun. You actually got to do something. I like it because you get to fool around. I mean that’s the purpose-you experiment and things.

When grades were removed from the equation, a different picture emerged; students were much more willing to take the risks of deciding how to proceed on their own. Issues of challenge also emerged in the interviews. Although students found the familiar P&P tests preferable, they also indicated that the PAs presented them with more of an intellectual challenge, with challenge being viewed in a positive light.

For example, students' preoccupation with grades is evident in the following comments (made multiple times by many different children):

I prefer the regular kind of test because you get better grades on them.

I like the regular test better because it was too hard to get a 100, because like on the graphs you’d always get a 3 and you’d call that average. You had to be really outstanding to get a 5.

When the children were asked to consider how they might feel if grades were no longer assigned, they responded,

The performance assessments are more challenging.

I think it would be a little different there because the project with the experiment would be more fun than sitting down and just writing essays and filling out true or false, fixing the wrong answer.

If there were no grades, and you were going to do the project one. . . the project one was more fun.

However, the children were not necessarily in favor of eliminating grades. Some expressed concerns that the elimination of grades would create a situation in which students would not have a reason to work hard and then would not be acknowledged for the work they did.

I like grades. It makes people stand out more. People who get good grades are going to stand out more.

I think that if somebody had a pretty bad report card and they wanted to higher it and then they had this test where there was no grade and they got everything right they would feel pretty bad because then they wouldn’t get a higher grade.

I think it would make a big difference because you wouldn’t really care because your parents wouldn’t see the grade on the paper.

Discussion

The results of the quantitative analysis indicate that goal orientation may be influenced by assessment type. However, the picture that emerges from these data is not definitive enough to allow us to say that PA alone fosters a mastery goal orientation. In looking at the data, it is clear that both PA and P&P tests had a tendency to foster a mastery goal orientation more for these students. But why both? And why especially P&P tests? These are the types of tests that are thought to mimic standardized tests and therefore should orient students toward performance goals; they should have been least likely to foster a mastery goal orientation. Even the laboratory assessments seemed to allow students more opportunities to become actively engaged with task requirements, which should have led to more cognitive engagement and a more mastery-task goal orientation. The hypothesized relationships do not clearly emerge simply on the basis of the quantitative data.




However, when we combine the quantitative results with the qualitative information, a somewhat clearer picture emerges. Students seemed to believe that they needed the familiar format to provide them with evidence that they had mastered the concepts. They also seemed to believe that this format was the best way for them to demonstrate to the teacher that they knew the material. This pattern seems to reflect the recent differentiation between performance-approach and performance-avoidant goals. Midgley et al. (2001) synthesized the literature in this area to point out that a performance-approach orientation is sometimes not harmful to learning and is perhaps even helpful. What seemed to be emerging in this study was a performance-approach orientation from the students regarding P&P tests.

It is clear that these students equated a good grade with competence and ability. Because they knew the P&P format only as a testing tool, they believed that their best chances of demonstrating competency were through this format. They did not appear to interpret any of the more open-ended classroom activities (not inclusive of testing situations) as developmental precursors to competence in performing on PAs. Nor did they appear to appreciate the difference between knowing science and doing science. The knowing part is what was important, and the only mechanism that they believed served as a vehicle for demonstrating that knowledge was the traditional P&P test.

In addition, the risk involved because of the ambiguity imposed by the openness of the task seemed to be more than these children were willing to take. Their concerns about being able to predict what the teacher wanted of them appeared to far outweigh any excitement about being able to decide for themselves how to demonstrate understanding of scientific principles. The impression that one is left with after listening to the interviews is not that the children were reluctant to strike out on their own because of a perceived lack of ability or confidence in their ability. Rather, their reluctance seemed in large part to emanate from concerns about how such activities would affect their grades.

If we remove receiving a grade from the equation for these children, they appear to have viewed PA in a much more favorable light. Once the high-stakes (high being relative to the individual at a particular time) component was removed, these children expressed much of what is hypothesized about the interaction of PA with students' behaviors and motivation. Then the students were much more willing to take risks to try to identify the problem and attempt alternate solution paths. They stated not only that PA might be more interesting than P&P tests but also that it was actually more challenging. Of course, the concern is whether challenge is viewed as something negative and to be avoided (i.e., frustration) or as something positive and engaging. When asked for an interpretation of challenge, these students defined challenge as the intellectual challenge that is thought to foster task orientation. They were not referring to task difficulty and challenge equated with frustration. These students indicated that under low-stakes conditions (in their case, no grade attached), PA may be preferable to the other forms of assessment presented in this study: more interesting, more engaging, and more intellectually challenging than the traditional P&P tests. However, when one reattaches the stakes (adds the grade factor back in), students' perceptions of which format they prefer may very well revert to the original preference for the P&P tests.

What seems to be emerging is that it may not be so much the assessment format per se that influenced the goal orientations of students but the assessment format in interaction with the stakes or consequences attached to the results of the assessment. In other words, challenge is fine (and may even carry a different definition under low-stakes conditions, that is, intellectual challenge), as long as it is not academically costly. Choice and autonomy are fine, as long as there is no perceived penalty. If one reattaches stakes to an assessment, then the students' goal orientations seem to change.

Limitations of the Study

The strength and clarity of the quantitative results may have been affected by the fluctuating and sometimes low reliability of the scales used. The reliability estimates reported by Meece et al. (1988) were all within acceptable ranges; the lowest reported value was .75. Several factors could have affected our reliability estimates, which varied within subscales and across administrations of the instrument. Although we anticipated some fluctuation given the different nature of each form of assessment for the different instructional units, at some points the reliability estimates were disconcertingly low. In addition to the design of the study, the size of the final sample also may have affected the reliability estimates; larger samples typically produce more stable estimates. These considerations further temper our interpretation of and confidence in the quantitative data alone.

Clearly, this study is a first step in understanding the relationship of assessment type and student motivation. Researchers need to do much more work to fully understand the effects of assessment type on student behavior and on instructional practices to support desired student behaviors. However, the data indicate that the relationship may not be as linear as may have been believed. Nor is the relationship necessarily clearly depicted without a qualitative look at the interaction from the students' perspective.




REFERENCES

Alao, S., & Guthrie, J. T. (1999). Predicting conceptual understanding with cognitive and motivational variables. The Journal of Educational Research, 92, 243-254.

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: Author.

Ames, C. (1992). Classrooms: Goals, structures, and student motivation. Journal of Educational Psychology, 84(3), 261-271.

Ames, C., & Archer, J. (1988). Achievement goals in the classroom: Students' learning strategies and motivation processes. Journal of Educational Psychology, 80, 260-267.

Anderman, E. M., & Young, A. J. (1994). Motivation and strategy use in science: Individual differences and classroom effects. Journal of Research in Science Teaching, 31, 811-831.

Baker, E. L., O'Neil, H. F., & Linn, R. L. (1993). Policy and validity prospects for performance-based assessment. American Psychologist, 48, 1210-1218.

Bank Street College of Education. (1990). The voyage of the Mimi. Pleasantville, NY: Sunburst Communications.

Blumenfeld, P. C. (1992). Classroom learning and motivation: Clarifying and expanding goal theory. Journal of Educational Psychology, 84, 272-281.

Bond, L. (1995). Unintended consequences of performance assessment: Issues of bias and fairness. Educational Measurement: Issues and Practice, 14(4), 21-24.

Church, M. A., Elliot, A. J., & Gable, S. L. (2001). Perceptions of classroom environment, achievement goals, and achievement outcomes. Journal of Educational Psychology, 93(1), 43-54.

Darling-Hammond, L. (1990). Achieving our goals: Superficial or structural reforms. Phi Delta Kappan, 72(4), 286-295.

Dowson, M., & McInerney, D. M. (2001). Psychological parameters of students' social and work avoidance goals: A qualitative investigation. Journal of Educational Psychology, 93(1), 35-42.

Elliot, A. J., McGregor, H. A., & Gable, S. (1999). Achievement goals, study strategies, and exam performance: A mediational analysis. Journal of Educational Psychology, 91(3), 549-563.

Frederiksen, J. R., & Collins, A. (1989). A systems approach to educational testing. Educational Researcher, 18(9), 27-32.

Frederiksen, N. (1984). The real test bias: Influences of testing on teaching and learning. American Psychologist, 39(3), 193-202.

Hardy, R. (1996). Performance assessment: Examining the costs. In M. B. Kane & R. Mitchell (Eds.), Implementing performance assessment: Promises, problems, and challenges (pp. 107-117). Mahwah, NJ: Erlbaum.

Herman, J. L. (1992). Accountability and alternative assessment: Research and development issues. Los Angeles: National Center for Research on Evaluation, Standards, and Student Testing. (ERIC Document Reproduction Service No. ED357037)

Herman, J. L., & Golan, S. (1993). The effects of standardized testing on teaching and schools. Educational Measurement: Issues and Practice, 12(4), 20-25, 41-42.

Karmos, A. H., & Karmos, J. S. (1984). Attitudes toward standardized achievement tests and their relation to achievement test performance. Measurement and Evaluation in Counseling and Development, 17, 56-66.

Khattri, N., Reeve, A. L., & Kane, M. B. (1998). Principles and practices of performance assessment. Mahwah, NJ: Erlbaum.

Kohn, A. (2000). The case against standardized testing: Raising the scores, ruining the schools. Portsmouth, NH: Heinemann.

Lee, S. (1994). The effect of assessment approach on reported study strategy use. Unpublished doctoral dissertation, Pennsylvania State University, University Park.

Lu, C., & Suen, H. K. (1995). Assessment approaches and cognitive styles. Journal of Educational Measurement, 32(1), 1-17.

McNeil, L., & Valenzuela, A. (2000). The harmful impact of the TAAS system of testing in Texas: Beneath the accountability rhetoric. Retrieved October 12, 2001, from http://www.law.harvard.edu/groups/civilrights/conferences/testing98/drafts/mcneil_valenzuela.html

Meece, J. L., Blumenfeld, P. C., & Hoyle, R. K. (1988). Students' goal orientations and cognitive engagement in classroom activities. Journal of Educational Psychology, 80(4), 514-523.

Meisels, S. J., Dorfman, A., & Steele, D. (1995). Equity and excellence in group-administered and performance-based assessments. In M. T. Nettles & A. L. Nettles (Eds.), Equity and excellence in educational testing and assessment. Boston: Kluwer.

Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13-23.

Middleton, M. J., & Midgley, C. (1997). Avoiding the demonstration of lack of ability: An underexplored aspect of goal theory. Journal of Educational Psychology, 89(4), 710-718.

Midgley, C., Kaplan, A., & Middleton, M. (2001). Performance-approach goals: Good for what, for whom, under what circumstances, and at what cost? Journal of Educational Psychology, 93(1), 77-86.

Monty Neill, D., & Medina, N. J. (1989). Standardized testing: Harmful to educational health. Phi Delta Kappan, 70(9), 688-697.

Paris, S. G., Lawton, T. A., Turner, J. C., & Roth, J. L. (1991). A developmental perspective on standardized achievement testing. Educational Researcher, 20(5), 12-20.

Parkes, J. (2000). The interaction of assessment format and examinees' perceptions of control. Educational Research, 42(2), 175-182.

Pintrich, P. R., & DeGroot, E. V. (1990). Motivational and self-regulated learning components of classroom academic performance. Journal of Educational Psychology, 82(1), 33-40.

Resnick, L. B. (1987). Education and learning to think. Washington, DC: National Research Council, Committee on Mathematics, Science, and Technology Education and Commission on Behavioral and Social Sciences and Education. (ERIC Document Reproduction Service No. ED289832)

Resnick, L. B., & Resnick, D. P. (1989). Tests as standards of achievement in schools. New York: Educational Testing Service Conference. (ERIC Document Reproduction Service No. ED335421)

Shepard, L. (1989). Why we need better assessments. Educational Leadership, 46(7), 4-9.

Shepard, L. A. (1991). Will national tests improve student learning? Phi Delta Kappan, 73(3), 232-238.

Shepard, L. A. (1997). The centrality of test use and consequences for test validity. Educational Measurement: Issues and Practice, 16(2), 5-8, 13, 24.

Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4-14.

Smith, M. L., & Rottenberg, C. (1991). Unintended consequences of external testing in elementary schools. Educational Measurement: Issues and Practice, 10(4), 7-11.

Turner, J. C., Thorpe, P. K., & Meyer, D. K. (1998). Students' reports of motivation and negative affect: A theoretical and empirical analysis. Journal of Educational Psychology, 90(4), 758-771.

Wiggins, G. (1989). A true test: Toward more authentic and equitable assessment. Phi Delta Kappan, 70(9), 703-713.

Wiggins, G. (1992). Creating tests worth taking. Educational Leadership, 49(8), 26-33.



APPENDIX Sample Test Items

Science Grade 5: Sample Paper and Pencil Test Questions

(NOTE: All items in the appendix have been sampled from among the tests administered in this study. This appendix does not represent an intact test.)

Example multiple-choice questions:

A humpback whale's migration on the east coast takes them from
a. California to Mexico. b. Massachusetts to the Caribbean. c. Massachusetts to Africa. d. California to South America.

A fathom is equal to
a. 6 feet. b. 10 feet. c. 1 foot. d. 12 feet.

The heat from the sun's rays causes ocean water to
a. condense. b. precipitate. c. evaporate. d. transpire.

The temperature of ocean water is highest
a. in the deep ocean zone. b. at the bottom. c. in the thermocline. d. at the surface.

Example Matching Exercise

1. salinity  2. density  3. shoreline  4. seamounts  5. oceanographer  6. brackish

A. layer between fresh water and salt water  B. the amount of dissolved salts in water  C. scientist who studies the ocean  D. the amount of space between water molecules  E. underwater mountains  F. boundary where the land and ocean meet

Example True/False Items

A census is the process for counting certain organisms. A hydrophone measures the temperature of the water at different depths. Anne is the marine biologist on "The Mimi". Female humpback whales sing long songs.

Example short answer questions

List two indications that a storm was approaching in Episode 8. List two pieces of information the radio tag gave the crew about the whale. Why don't whales get hypothermia in cold water? What is the reason we shiver when we are cold?

Other item types

Please put a "B" for benthos, "N" for nekton, or "P" for plankton on each line below to identify the organism.

___ Whale shark   ___ Starfish   ___ Krill   ___ Kelp   ___ Squid

Choose the best answer from the words below and write it on the line.

abyssal plains   trenches   submarine canyons   mid-ocean ridges   continental shelf

Deep V-shaped valleys cut through the continental shelf and the continental slope are called __________. Long, narrow cracks along the ocean floor are called __________. Large, flat areas on the ocean floor are called __________. The flat area of the ocean floor covered by shallow water where you swim is called the __________. Mountain ranges located under the ocean are called __________.



Science Grade 5: Planet Earth Unit (Lab)

Objective: To follow directions to conduct an experiment to compare rates at which different substances dissolve in fresh water. Students will make predictions and observations, will use appropriate units of measurement, will report results, and will form a conclusion about "What happens when different substances are added to water?"


Key Competencies: Follow directions to perform the experiment. Make credible predictions. Use appropriate units of measurement. Make accurate observations. Report observations accurately. Arrive at a logical conclusion based on results of observations.

Key Concepts and Content: Many substances form a solution when mixed with water; some do not. Some substances go into solution faster than others.

Science Skills Needed: Observing, measuring, organizing, inferring, predicting, experimenting, communicating.

Materials Needed: For the Class: water, table salt, granulated table sugar, corn starch. For Each Group: 3 clear plastic cups, 3 plastic straws or stirrers, a plastic teaspoon, 3 pieces of tape or sticky labels, a graduated measuring cup, data sheets, crayons or colored pencils.

Performance Task I: In groups, students will follow directions to conduct an experiment observing the rate at which a substance dissolves in water. They will work as a group up to the point of writing a conclusion.

Performance Task II: Individually, students will write a conclusion to the experiment. They will base their conclusion on the observations and recordings from the experiment.

Directions Given to Students

Directions for the Groups:

You are going to add table salt, sugar, and corn starch to cups of water. What do you predict will happen to the substances in each cup?

Salt:

Sugar:

Corn Starch:

Fill three clear plastic cups with about the same amount of water. Leave about an inch at the top so you can stir. Label each so you know which gets salt, sugar, and corn starch.

How many teaspoons will you add to each cup? Circle 1, 2, or 3. Each cup gets the same number.

Drop in the number of teaspoons you circled above of each substance into the appropriate cup. After adding the substances, sit and observe for two minutes. What can you see happening?

Stir each cup 10 times using the same technique. Did any dissolve?

Repeat stirring each 10 times and then observing all of them. Continue to do this until the substance disappears. Mark the number of times stirred before a substance disappeared, if it did so.

Salt: 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200

Sugar: 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200

Corn Starch: 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200

Did each dissolve?

Make a bar graph of these results.



Which dissolved fastest?

Write a conclusion that you could make based on the results of your experiment.

Science Grade 5: Planet Earth Unit (PA)

Objective: To follow directions to conduct an experiment to compare rates at which different substances dissolve in fresh water. Students will make predictions and observations, will use appropriate units of measurement, will report results, and will form a conclusion about "What happens when different substances are added to water?"

Key Competencies: Make credible predictions. Use appropriate units of measurement. Make accurate observations. Report observations accurately. Arrive at a logical conclusion based on results of observations.

Key Concepts and Content: Many substances form a solution when mixed with water; some do not. Some substances go into solution faster than others.

Science Skills Needed: Observing, measuring, organizing, inferring, predicting, experimenting, communicating.

Materials Needed: For the Class: water, table salt, granulated table sugar, corn starch. For Each Group: 3 clear plastic cups, 3 plastic straws or stirrers, a plastic teaspoon, 3 pieces of tape or sticky labels, a graduated measuring cup, data sheets, crayons or colored pencils.

Performance Task I: In groups, students will follow directions to conduct an experiment observing the rate at which a substance dissolves in water. They will work as a group up to the point of writing a conclusion.

Performance Task II: Individually, students will write a conclusion to the experiment. They will base their conclusion on the observations and recordings from the experiment.

Directions Given to Students

Directions for the Groups:

Will salt, sugar and corn starch mix completely and dissolve when added to water? We know some substances will dissolve in water and others will not. Why is that? What are the things that could possibly make a difference in whether the substance will dissolve or not?

List some things that might make a difference in the ability of salt, sugar or corn starch to dissolve in water.

ALL THINGS BEING EQUAL, meaning that you do the same thing for each substance each time . . . What do you predict will happen when you add salt to water? What do you predict will happen when you add sugar to water? What do you predict will happen when you add corn starch to water? You have been given some equipment to use to create a way of determining if your predictions were right. Conduct an experiment and observe what happens to see if you were right.

You have been given: 3 clear plastic cups 3 plastic straws or stirrers a plastic teaspoon 3 pieces of tape or sticky labels a graduated measuring cup data sheets crayons or colored pencils

Available to the whole class is: water table salt granulated table sugar corn starch

You will need to describe your experiment (what you are doing). You will need to record your observations (tell what happens as it happens). You will need to present your results both in words and in a graph. You will need to reach a conclusion (tell what happens to the three substances when added to water). You will need to communicate that conclusion to the teacher.
