School Science and Mathematics, Volume 92(6), October 1992


Investigating Gender Bias in the Evaluations of Middle School Teachers of Mathematics

Clyde A. Wiles
Division of Education
Indiana University Northwest
Gary, Indiana 46408

Gender differences, and particularly those associated with the learning of mathematics, have been the focus of a great deal of formal investigation. Shibley and Linn (1986), Chipman (1988), and Friedman (1989) summarize a great deal of research relating to this. There is general agreement that differences in mathematics achievement have been measured during early and late adolescence, and that while these differences are usually very small, they are most evident and not nearly so small when tasks that are associated with problem solving are the focus of the investigation (Dossey, Mullis, Lindquist, & Chambers, 1988).

While there is some reason to believe that these differences have been decreasing and indeed will cease to exist in the not-too-distant future, such extrapolations may prove to be deceiving. More disturbing than the usual small differences found on standardized tests are the much larger differences in course taking and subsequent career alternatives reported for boys and girls (Leder, 1986). If differences in problem-solving achievement and differences in course taking are results of environmental influences, it is important to know what these influences are.

One approach to this question was taken by Duval (1980). She examined teachers' grading behavior on a geometry exam for evidence of teacher bias. Student responses to a state geometry exam were reviewed, and an examination paper was constructed that was reflective of a typical student. Secondary teachers of the State of New York were asked to score this exam. Six forms of the paper were provided as follows. The paper was labeled either as Jeanne or Thomas and then provided with an academic profile of grades that indicated Jeanne or Thomas was above average, average, or below average. Extensive analysis of the returned, scored papers revealed no differences in the scores given to Jeanne or Thomas of any indicated ability level.

In discussing the implications of this study, Duval noted that while no significant differences were found, there was some appearance of discrepant grading behavior, and she suggested that populations of teachers of the elementary or junior high school levels should be examined for such behavior. She conjectured, "their [the teachers of these lower grade levels] own apprehensions and insecurities about mathematics may be communicated to students, including the stereotypical belief that mathematics is predominantly a male domain" (p. 12).

This study is a direct response to Duval's recommendation. The hypothesis tested was that teachers will reflect an expectation of male success in the scores they assign to middle-grade-level (fourth, fifth, and sixth) students' work in mathematical problem solving, and that there will be a higher rating given to an indicated male student than to an indicated female student.

Method

This study was directed at teachers of grades 4, 5, and 6 because these grades immediately precede grade levels at which achievement differences attributable to gender have been widely reported. The student content involved mathematical problem solving because it is with respect to understanding and applications that differences have been most pronounced. The selected subjects were teachers because they provide a key link in the transmission of information that children receive about their relative abilities and achievements. Furthermore, the subjective nature of the evaluation of solutions (both for answer and method) of non-routine problems would seem to allow the most opportunity for gender bias to express itself. Since the term gender is more closely related to role identification than sex, gender rather than sex is the term used.

Procedure

In the spring of 1988, 96 fourth-, fifth-, and sixth-grade teachers evaluated a single set of student responses to five nontraditional problems. These teachers were typical of practicing upper-grade elementary teachers of suburban/urban northwest Indiana. This convenience sample of teachers was teaching in the same schools and/or neighborhood schools of graduate students enrolled in a master's program at a nearby university. The data were collected by these graduate students. The willing teachers were asked to score a student paper that was constructed in the following way.

A set of nontraditional problems was selected and administered to two classes of sixth-grade students. These problems were taken from no specific source; however, no claim for originality of these problems is made. Students were asked to write descriptions and illustrations of how they thought each problem might be solved and to provide correct answers if they could. They were further told that why they thought an answer was correct was more important than the answer itself. The regular teachers of these students administered the problem set.

Following the pilot completion of these ten items, the work of the students was reviewed, and five of the ten items along with partially-constructed student work were selected that seemed representative of a fair-to-good student. Correct and incorrect answers, with work that was not always consistent with the answer provided by the student, were selected. Language and writing irregularities were retained because they were present on the actual student papers. The intention was to produce a realistic sample of student work on a set of teacher-made problem-solving exercises. The objective was to put the eventual teacher subjects at ease and to allow the maximum possible opportunity for conscious or unconscious gender bias to be expressed.

Items

The items, along with a description of the student work, were:

1. You are hosting a 12-team, single elimination basketball tourney. How many games need to be scheduled before a winner is found? (A single elimination tourney means that once a team loses, it is no longer in the tourney.) [The correct answer, 11, was given along with the student's rationale, "Each game takes two teams to have a game if you keep subtracting the number one you will find the answer."]

2. A ball is dropped from a height of 10 feet and will bounce half its height each time. How many times will the ball hit the floor before it bounces to a height of no more than one foot? [The paper provided the correct answer, 5, with a partially correct picture consistent with the wrong rationale, "5 because it bounces back one less foot every bounce."]

3. A frog falls into a 30-foot well. Each day it is able to climb 8 feet up the wall and it slides back 4 feet at night. How long will it take the frog to climb out of the well? [An incorrect answer, 3 days, 6 hours, was provided along with a correctly completed division example showing 30 divided by 8 and the text, "If he climbs 8 feet each day and falls four feet each night that means he is climbing four feet every 24 hours. You will divide 8 feet into 30 feet."]

4. You and 7 friends go for a ride in a rowboat that only carries two people at a time. How many rides are needed for all 8 people to ride with everyone once? [The correct answer of 28 is provided. Student work consists of a systematic listing of 28 ordered pairs, AB, AC, AD, ..., GH.]

5. How many different ways can you have 17¢ in change? [The incorrect answer, 5, is provided. The handwritten text shows a correct listing of all six possibilities but two of them are written on the same line. There are five lines.]
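As an aside for readers who want to check the arithmetic behind the items, the following is a minimal sketch that brute-forces items 1, 3, 4, and 5. It is not part of the study materials: the function names are hypothetical, the frog is assumed to escape as soon as a daytime climb reaches the rim, and change is assumed to be made from pennies, nickels, and dimes. Item 2 is left out because its count depends on how the wording is read.

```python
from itertools import combinations


def elimination_games(teams: int) -> int:
    """Play out single-elimination rounds (odd team gets a bye) and count games."""
    games = 0
    while teams > 1:
        played = teams // 2   # each game pairs two teams and eliminates one
        games += played
        teams -= played
    return games


def frog_escape_day(depth: int, climb: int, slide: int) -> int:
    """Return the day on which the frog's daytime climb first reaches the rim."""
    height, day = 0, 0
    while True:
        day += 1
        height += climb
        if height >= depth:   # assumption: no night-time slide once the frog is out
            return day
        height -= slide


def rowboat_pairs(people: int) -> int:
    """Count unordered pairs, i.e. rides needed for everyone to ride with everyone once."""
    return sum(1 for _ in combinations(range(people), 2))


def change_ways(cents: int, coins=(1, 5, 10)) -> int:
    """Count coin combinations (order ignored) that total `cents`."""
    ways = [1] + [0] * cents
    for coin in coins:
        for amount in range(coin, cents + 1):
            ways[amount] += ways[amount - coin]
    return ways[cents]


if __name__ == "__main__":
    print("Item 1 games:", elimination_games(12))
    print("Item 3 escape day:", frog_escape_day(30, 8, 4))
    print("Item 4 rides:", rowboat_pairs(8))
    print("Item 5 ways:", change_ways(17))
```

Under these assumptions the counts for items 1, 4, and 5 agree with the answers the paper treats as correct (11, 28, and six possibilities), and the frog leaves the well during the seventh day rather than after the 3 days, 6 hours given on the student paper.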

Following the selection of the items along with the student responses just described, two sixth-grade students, a boy and a girl, were asked by their teacher to write these responses on a newly copied problem sheet. The boy was asked to write Robert at the top of his paper, and the girl to write Valarie at the top of hers. The names Robert and Valarie were selected as being clearly male and female and as being ethnically neutral.

Subjects were asked to read a student paper and assign from 0 to 10 points for each item. They were asked to complete the task at one sitting, not to discuss it with anyone else, and to return the scored paper as soon as possible. The subjects were teachers who agreed to participate on this basis. They were given papers labeled Valarie or Robert on an alternating basis. They generally did this task over a lunch period or during a free period.

A cover sheet for the problem set was constructed that told the teacher subject that problem-solving behavior of upper-grade children was of interest and that some baseline data were to be established. Teachers were also told that this was the work of a sixth-grade student, the answers were not all correct, and their rating of the strategy and approach was more interesting than the correctness of the solution. It was suggested that 2 points be awarded for a correct answer and that an additional 0 to 8 points be awarded as a rating of the student's strategy and approach. Finally, they were told, "We are most interested in your perception of the quality of this work." Correct answers were provided but no rationale for a correct answer or strategy for problem solution was given. A number of questions were suggested that the teachers might consider in awarding the 0 to 8 points (for example, "How appropriate, effective and efficient was the strategy chosen?" "Were correct arithmetical operations chosen and carried out?").

Results

All 96 of the teachers who received forms completed and returned them. The data are reported in Table 1. There were 30 fourth-grade teachers, 32 fifth-grade, and 28 sixth-grade teachers who scored the student paper. Six teachers were listed as other for one of the following reasons: (a) they were presently teaching at grade 7, (b) they were not presently teaching, or (c) they did not indicate the grade they were teaching on the scoring form. This left data from 90 teachers to be considered. Twenty-five of them were men and 65 were women.

There was some concern at this point about the treatment. That is, perhaps the teachers did not notice that they were scoring a paper titled either Robert or Valarie. Of course, if they did not, any bias regarding how a boy or girl ought to do could not operate. During the two weeks following the original data collection effort, 44 of the scorers were contacted and asked, "Which student's paper did you grade?" Thirty-six of them knew Robert or Valarie. The others could not remember the student's name but reported that they thought they had either a boy or a girl. Other observations were made by those who administered the survey. They reported that some teachers did much more than score the papers. Answers were corrected, comments were written, and teachers asked what the catch was. Some teachers wondered if the given answers were actually correct, and some teachers said this reminded them that they need to do more problem solving within their own classroom. In sum, it appears unreasonable to suppose that these teachers did not notice the indicated gender of the student.

A paper could have been assigned from 0 to 50 points. Scores given for Robert ranged from 25 through 46, with a mean of 33.9 and a standard deviation of 4.63. The scores for Valarie ranged from 23 through 44, with a mean of 33.8 and a standard deviation of 5.07.

Table 1
Teachers' Ratings for Robert and Valarie: Means (Standard Deviations) for Teacher Groups (a)

Group              n (b)   Robert          n (b)   Valarie
Female Teachers
  4th Grade         14     34.6 (4.77)      13     34.5 (4.65)
  5th Grade         11     33.6 (5.84)      10     31.8 (4.39)
  6th Grade          9     31.8 (5.25)       8     34.3 (4.71)
  Total             34     33.5 (5.25)      31     33.6 (4.60)
Male Teachers
  4th Grade          1     28.0 (0.00)       2     35.0 (4.24)
  5th Grade          3     35.7 (1.15)       8     37.6 (2.62)
  6th Grade          5     36.2 (4.82)       6     29.5 (3.33)
  Total              9     35.1 (4.37)      16     34.3 (4.82)

(a) Data for the six non-classified teachers are not included.
(b) Number of teachers in each group.
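The group totals in Table 1 can be cross-checked against the grade-level rows. The sketch below is an illustration only, not the author's procedure: it pools the grade-level means, weighted by the number of teachers, and compares them with the reported totals. The dictionary layout and function name are assumptions, and because the cell means are themselves rounded to one decimal, agreement is only expected to within rounding.

```python
# (n, mean) for 4th, 5th, and 6th grade teachers, transcribed from Table 1.
CELLS = {
    ("female", "Robert"):  [(14, 34.6), (11, 33.6), (9, 31.8)],
    ("female", "Valarie"): [(13, 34.5), (10, 31.8), (8, 34.3)],
    ("male", "Robert"):    [(1, 28.0), (3, 35.7), (5, 36.2)],
    ("male", "Valarie"):   [(2, 35.0), (8, 37.6), (6, 29.5)],
}

# (n, mean) from the "Total" rows of Table 1.
REPORTED = {
    ("female", "Robert"):  (34, 33.5),
    ("female", "Valarie"): (31, 33.6),
    ("male", "Robert"):    (9, 35.1),
    ("male", "Valarie"):   (16, 34.3),
}


def pooled(rows):
    """Return total n and the n-weighted mean of a list of (n, mean) rows."""
    total_n = sum(n for n, _ in rows)
    weighted_mean = sum(n * mean for n, mean in rows) / total_n
    return total_n, weighted_mean


if __name__ == "__main__":
    for group, rows in CELLS.items():
        n, mean = pooled(rows)
        rep_n, rep_mean = REPORTED[group]
        print(f"{group}: n={n} (reported {rep_n}), "
              f"pooled mean={mean:.2f} (reported {rep_mean})")
```

The teacher counts recovered this way sum to the 90 classified teachers described in the text (65 women and 25 men).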

The data were analyzed by a MANOVA. Table 2 presents a summary of the F tests completed. The following variables were entered into the model: (a) student (Robert or Valarie), (b) grade level of teacher (4, 5, or 6), and (c) gender of teacher (male, female). The hypothesis of no differences in the scores for Robert and Valarie was tested by considering the main effect of student. It was not significant. It seemed wise to consider other main effects and interactions for significance also. Neither the remaining main effects for grade level and gender of teacher nor the pair-wise interactions of student by gender of teacher and student by grade level were significant. The number of men involved is too small to give any meaning to other interactions.

Table 2
Summary of MANOVA

Source                      SS     df      MS      F
Student                   4.65      1    4.65    .22
Gender of Teacher        36.67      1   36.67   1.71
Grade Level              60.27      2   30.14   1.41
Student X Gender          3.01      1    3.01    .14
Student X Grade Level     6.94      2    3.47    .16
Within Cells           1672.70     78   21.44

Discussion

This study cannot demonstrate that gender bias does not exist, but it can be reported that the sample size is reasonable, the observations were independent, and that no evidence of significant differences was found. Regrettably, the number of men involved was too small to give a strong test of differences in the scoring of men and women teachers. The gender of teacher was not significant as a main effect or in interaction with student gender. All of this then is consistent with Duval's (1980) study, which also failed to find evidence of teacher bias in scoring high school geometry exams.
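The F column of Table 2 follows from the sums of squares and degrees of freedom by the usual rule: each mean square is SS divided by df, and each F is the effect mean square divided by the within-cells mean square. The sketch below recomputes the ratios from the table's entries. It is an illustrative check rather than the analysis software used in the study, and the variable and function names are assumptions.

```python
# (sum of squares, degrees of freedom) transcribed from Table 2.
EFFECTS = {
    "Student": (4.65, 1),
    "Gender of Teacher": (36.67, 1),
    "Grade Level": (60.27, 2),
    "Student X Gender": (3.01, 1),
    "Student X Grade Level": (6.94, 2),
}
SS_WITHIN, DF_WITHIN = 1672.70, 78


def f_ratios(effects, ss_within, df_within):
    """Return {effect: F}, where F = (SS_effect / df_effect) / (SS_within / df_within)."""
    ms_within = ss_within / df_within
    return {name: (ss / df) / ms_within for name, (ss, df) in effects.items()}


if __name__ == "__main__":
    for name, f in f_ratios(EFFECTS, SS_WITHIN, DF_WITHIN).items():
        print(f"{name:<24} F = {f:.2f}")
```

Rounded to two decimals, the recomputed ratios match the reported F column. All of them are well below the conventional .05 critical values (roughly 3.96 for 1 and 78 degrees of freedom and roughly 3.11 for 2 and 78), which is consistent with the report that no effect was significant.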

It may be that the pure novelty of scoring papers for strategy and process masked any such predispositions; that is, even though teachers may be biased in their expectations, they were so involved with the novel task of scoring problem-solving tests that this bias did not come into play. The wide range of scores for the papers indicates, at the least, a lack of uniformity of expectations about the task. Of course, Valarie and Robert were just names; actual boys and girls may be required for bias to be expressed. In the situations reported by Fennema and Reyes (1981) where boys and girls were responded to differently, the students were not hypothetical; however, in spite of these plausible arguments, there is no reason provided by these data to suppose that middle school teachers as a group are grading boys' and girls' work differently.

Implications

The evidence provided here suggests that any teacher participation in a process to direct boys and girls into socially correct roles with respect to mathematics is not so subtle as conscious or unconscious differential responses to written student work. It follows then that teachers' actions are more purposeful than scoring bias might suggest. People who are directed by purpose can re-direct themselves on the basis of new information and insight. This gives reason to hope that any environmental causes of gender differences in either achievement or course taking that are under the control of teachers can likely be eliminated. If indeed the causes of observed differences in mathematics achievement are environmental, it can be expected, along with Friedman (1989), that such differences will in fact disappear in a rather short period of time.

If this is so, it is encouraging to believe that deliberate attempts to encourage objectivity in responses to students should be both possible and effective. If teachers can direct themselves to ignore gender expectations on written test responses, they can be expected to look beyond other stereotypical expectations of race, language, economics, and class as well. If this can be done for test scoring, perhaps it can be done for other kinds of student responses. Teachers should take a personal inventory of their attitudes and expectations regarding what boys or girls or members of this or that group can or cannot do well and make adjustments in the direction of fairness and objectivity that such an inventory may suggest.

Another matter of practical importance was illuminated by the data of this study. While deliberate attempts were made to ensure variability in the scores given to Robert or Valarie, the very wide range of scores attributed to either Robert or Valarie suggests that teachers do not have valid and reliable methods for scoring or evaluating student problem-solving efforts. The observations of the data collectors concerning the tentativeness and reported insecurity of teacher-subjects underscore the need for such methods. Tentativeness regarding the scoring of problem-solving tasks is to be expected in problem-poor curriculums where the primary emphasis is placed upon right answers to routine computations and applications. Inservice training efforts that focus on valid and reliable means for evaluating student responses to non-trivial, non-routine problems in both mathematics and science may provide a context for developing problem-centered curriculums in the middle school.

References

Chipman, S. (1988). Far too sexy a topic. Educational Researcher, 17(3), 46-49.

Dossey, J., Mullis, I., Lindquist, M., & Chambers, D. (1988). The mathematics report card: Are we measuring up? Princeton, NJ: Educational Testing Service.

Duval, C. (1980). Differential teacher grading behavior toward female students of mathematics. Journal for Research in Mathematics Education, 14, 202-213.

Fennema, E., & Reyes, L. (1981). Teacher/peer influences on sex differences in mathematics confidence: Final report. Madison: University of Wisconsin-Madison.

Friedman, L. (1989). Mathematics and the gender gap: A meta-analysis of recent studies on sex differences in mathematics tasks. Review of Educational Research, 59(2), 185-213.

Leder, G. C. (1986). Gender linked differences in mathematics learning: Further explorations. Paper presented at the annual meeting of the National Council of Teachers of Mathematics, Washington, DC.

Shibley, J., & Linn, M. C. (Eds.). (1986). The psychology of gender: Advances through meta-analysis. Baltimore, MD: Johns Hopkins University Press.


School Science and Mathematics