
Knowledge and Use of Testing and Measurement Literacy of Elementary and Secondary Teachers



The Journal of Educational Research. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/vjer20

Knowledge and Use of Testing and Measurement Literacy of Elementary and Secondary Teachers. Larry G. Daniel (University of Southern Mississippi) and Debra A. King (Learning Solutions, Hattiesburg, Mississippi). Published online: 01 Apr 2010.

To cite this article: Larry G. Daniel & Debra A. King (1998) Knowledge and Use of Testing and Measurement Literacy of Elementary and Secondary Teachers, The Journal of Educational Research, 91:6, 331-344, DOI: 10.1080/00220679809597563

To link to this article: http://dx.doi.org/10.1080/00220679809597563



Knowledge and Use of Testing and Measurement Literacy of Elementary and Secondary Teachers

LARRY G. DANIEL, University of Southern Mississippi
DEBRA A. KING, Learning Solutions, Hattiesburg, Mississippi

ABSTRACT The purposes of the present descriptive study were (a) to determine the educational testing and measurement literacy of elementary and secondary teachers, (b) to examine the degree to which various testing and measurement concepts are applied in the classroom assessment environment, and (c) to determine if assessment strategies vary across elementary and secondary teachers. Ninety-five elementary and secondary teachers completed a survey that described their knowledge of testing and measurement, their use of assessment strategies, and their grade and content areas, along with additional demographic information. The teachers’ knowledge bases were somewhat inadequate, but the teachers regularly used the knowledge they did possess when assessing student progress. Few differences were noted between elementary and secondary teachers’ knowledge and uses of assessment practices.

Testing and measurement are among the many tools teachers use in making decisions concerning student progress. Routinely, teachers assess the status and development of the learners in their classrooms. Carlberg (1981) reported that teachers spend an average of 15% of classroom time in testing procedures, plus the additional time needed for test preparation, scoring, and analysis of results. Similarly, Stiggins and Conklin (1988) estimated that assessment and corresponding activities occupy as much as one third of the total time teachers spend in educational endeavors. More recently, Stiggins (1991) increased that estimate to one half. Testing and measurement have clearly become an integral part of schooling, yet many teachers report that their training in this aspect of teacher competence is less than adequate (Scales, 1993).

With assessment taking its place as a major component in the educational process, teacher effectiveness relies, in part, on teachers’ knowledge of interpretation and use of a variety of assessment procedures. Wolf (1995) affirmed this position by stating, “A knowledgeable teacher is the foundation of informed assessment” (p. 4). Accordingly, for teachers to achieve maximum effectiveness, they must know how to appropriately interpret and use measurement and evaluation when examining the learning environment of students (Noll, 1955). All teachers are expected to evaluate student progress, and success in this endeavor comes only through substantial knowledge of testing and measurement procedures and techniques (Gullickson, 1985).

School reform initiatives have, in recent years, brought about the need for increased attention to the teacher’s knowledge base. Wise, Lukin, and Roos (1991), for example, concluded that teachers need skills in testing and measurement in order to adequately assess the numerous teaching models used in today’s classrooms. Moreover, current usage of a vast array of student assessment procedures makes knowledge of educational testing and measurement concepts particularly necessary for today’s classroom teachers. Popham (1995), who uses testing, measurement, and assessment interchangeably, identified three contemporary functions of assessment that directly affect teachers: (a) public perceptions of educational effectiveness, (b) evaluation of teachers, and (c) guidance of instructional objectives. The National Board for Professional Teaching Standards (NBPTS) identified teacher management and monitoring of student learning as a general proposition of accomplished practice. The NBPTS standards require that teachers be able to adequately assess student progress and teach students how to evaluate their own progress and improvement (Shapiro, 1995). Clearly, the average teacher must have a working knowledge of testing and measurement procedures.

Numerous studies have been conducted that examined issues surrounding use of measurement and evaluation in the classroom. For example, Noll (1955) examined the requirements for teacher certification in various states to determine the degree to which these states’ certification guidelines included coursework in testing and measurement. Noll also examined university criteria for preservice teacher education in this area. The results indicated that at the time few states required a course in measurement and evaluation for certification. Furthermore, public universities included in the study generally did not require a course in educational measurement for graduation in teacher education. Similarly, Goslin (1967) found that less than 40% of 1,450 teachers surveyed had received more than one measurement course at the preservice level; a sizable number of teachers had never had a course in assessment techniques or attended a clinic or inservice program that emphasized measurement.

Address correspondence to Larry G. Daniel, Educational Leadership and Research, University of Southern Mississippi, Hattiesburg, MS 39406-5027.

In a related study of preservice education, Roeder (1972) collected data from 916 teacher training institutions to ascertain whether preparation in testing and measurement was required for elementary education majors to graduate or to receive state certification. His findings indicated that 57.7% of the institutions surveyed did not require elementary preservice teachers to complete an evaluation course before graduation. Of the remaining institutions, 12.1% required only a 1- or 2-hr evaluation course, 17.8% required a 3-semester-hr course, 1.4% required more than one course in evaluation, and 7.2% indicated that testing and measurement received emphasis as part of another course required of elementary education majors. Overall, Roeder concluded, elementary teachers were generally ill prepared to use testing procedures effectively. In an investigation that again focused on preparation of preservice teachers, Schafer and Lissitz (1987) found that training in testing and measurement at the university level appeared to be inadequate to meet the needs of classroom teachers. Furthermore, although testing and measurement occupy a significant portion of teacher time, only the school counseling program appeared to provide students with sufficient preparation regarding assessment of student progress.

A systematic effort to study the assessment needs of preservice teachers (n = 397) was conducted by Wise et al. (1991). The teachers reported on their perceptions of the formal training they had received during their preservice education. Forty-seven percent of the total rated their training as either somewhat or very inadequate. Two thirds of the teachers in the sample indicated that they had taken less than one measurement course at the preservice level. Furthermore, most teachers indicated that most of their knowledge of testing and measurement was obtained through on-the-job experience using a trial-and-error format. Yet, most teachers reported that their knowledge of testing and measurement was most influenced by the course or courses they had taken at the college or university level. To enhance teacher knowledge of assessment-related activities, Wise et al. recommended extensive requirements in testing and measurement during preservice education.

This recommendation was corroborated by Rosenfeld, Thornton, and Skurnik (1986), who found that knowledge of tests and measurement was related to core job functions of teachers as measured by the National Teacher Exam (NTE). The NTE, which is the primary screening device of preservice teachers in many states, listed “diagnosis” as one of five skills components necessary for teacher competence, thus emphasizing the aforementioned need for enhanced preparation in testing and measurement at the preservice level.

Salmon-Cox (1981) conducted a study of how elementary inservice teachers use standardized tests. Her findings and those of others (e.g., Herman & Dorr-Bremme, 1982; Kellaghan, Madaus, & Airasian, 1982) indicated that a great deal of teacher assessment takes the form of “observation,” with standardized tests used primarily as a supplement to this observation. In addition, the results suggested that teachers used standardized tests within rather narrow parameters. Those tests were found to be used primarily as a guide for instruction and for grouping or tracking students within the classroom setting. Salmon-Cox, in effect, determined that standardized tests were given a back seat to a child’s classroom performance, which was oftentimes ascertained through observation. Furthermore, these classroom teachers generally failed to use test information as an aid in decision making, which indicated a rather limited focus on measurement techniques.

In a related study of elementary and secondary inservice teachers, Gullickson (1984) reported that teachers are, by and large, heavy users of tests but lack a sophisticated knowledge of assessment-related activities. This finding is consistent with the previously cited finding by Wise et al. (1991), who noted that teachers believed that a substantial portion of their knowledge of testing and measurement was acquired through on-the-job experience, not through preservice education at the university level. Gullickson summarized, therefore, that inservice teachers may be inadequately prepared in the area of testing and measurement.

Stiggins and Bridgeford (1985) analyzed the classroom assessment practices of 228 elementary and secondary teachers to determine how teachers used self-developed assessments (i.e., teacher-made tests). First, the results indicated that teacher-made tests are used much more consistently than are standardized tests. As with previous research, the teachers reported substantial reliance on observation as a means of assessing student progress. Stiggins and Bridgeford, however, indicated that teachers used their own objective tests more frequently than other types of assessment, including observation. Interestingly, Flemming and Chambers (1984) found that 80% of the items used on these teacher-made objective tests primarily measured lower order cognitive skills that focused on fact and skill acquisition.

Stiggins and Bridgeford (1985) also discovered that teachers at different grade levels used different types of assessment. For example, their findings indicated that teachers in higher grades tended to use assessment techniques (objective tests) that were self-developed, whereas teachers in lower grades relied more heavily on standardized tests for assessment of student progress. The authors suggested that there are fundamental differences in elementary, junior high, and high school environments that affect assessment practices at these various levels. Finally, even though these teachers were concerned about the quality of their assessment practices, few appeared to be in the process of changing current assessment practices or seeking avenues that would aid in the development of improved assessment techniques (e.g., inservice education or additional coursework in testing and measurement).

Gullickson (1985), in a similar study, examined how teachers at different grade levels and in different content areas used assessment. The results indicated that significant differences do exist among assessment techniques used at different grade levels. Whereas elementary teachers used a variety of assessment methods, secondary teachers relied on fewer types of assessment, with teacher-made tests being the predominant method used to determine student progress. Differences across teachers of various curricular areas were not noteworthy. Gullickson emphasized that past and current undergraduate measurement courses fail to focus on assessment techniques that include nontest assessment as well as a range of testing techniques; this lack of preparation could, in effect, place inservice teachers at a distinct disadvantage in assessing student progress.

In practical terms, Canady and Hotchkiss (1989) identified how this lack of teacher assessment knowledge is played out in the classroom. When they examined grading practices, they identified several common assessment mistakes. For example, teachers oftentimes assigned grades based on pretest scores that reflect incomplete instruction, or they failed to adequately relay to students what would be included on tests, which then required students, instead of their teachers, to decide what was important enough to study. Similarly, emphasizing higher order thinking skills during instruction yet testing on facts and recall was found to be a recurring assessment problem. In addition, teachers may place emphasis on assessments early in the instructional process while concepts are still forming. This emphasis tends to leave students with the belief that success is unattainable. Another common assessment mistake reported by Canady and Hotchkiss was the tendency to assign zeros for missing or incomplete work; such zeros are inadequate and inappropriate indicators of achievement and have a profound effect on student averages.

Hills (1991) identified other common assessment mistakes that included using grades for discipline, assigning grades that are contingent on improvement, using tests that are technically inappropriate, and deviating from established standardized-test administration guidelines. Schafer (1993) emphasized that a particularly disturbing aspect associated with these assessment misuses is the possibility of others modeling poor practices. Poor practice tends to perpetuate more poor practice.

In sum, the research during the past 30 years has clearly documented the inadequacies of teacher preparation and knowledge in the area of testing and measurement. Although the findings of this research generally paint a rather gloomy picture of assessment practices in our schools, this same research can serve as a foundation for future improvement in how teachers are trained to assess student achievement. With this in mind, a shift from examining, through research, what preservice and inservice teachers do not know to what they should know seems appropriate. Accordingly, Winograd, Paris, and Bridge (1995) recently offered the following insight: “Literacy reflects both processes of learning and products of learning, so assessment must provide measures of both” (p. 4).

Gullickson (1984) offered a specific look at what teachers need to know concerning testing and measurement. First, he emphasized the need for distinction between tests and evaluation: Tests offer primarily descriptions, whereas evaluation requires a combination of description and facilitation of judgment. Regarding this distinction, Gullickson found that teachers do not use tests for evaluation exclusively; however, they do believe that tests are the most appropriate way to assess student progress. According to Gullickson, a broadening of this assumption is needed. Similarly, Dorr-Bremme and Herman (1986) focused their national survey concerning assessment of student achievement on several factors that they considered to be the framework for assessment practices. This framework included emphasis on federal/state/local testing requirements, federal/state/local programs that require assessment, organization of curriculum and instruction as it relates to assessment, types of students enrolled, teacher attitudes and beliefs as to the usefulness of tests and types of assessment, previous testing and measurement experience, responsibilities of district and local schools (i.e., providing training experiences), purpose and frequency of testing, types of test scores used, and impact of assessment. The authors further suggested that tests be designed that could meet the demands of classroom use while simultaneously fulfilling policymaking functions at the local, district, or state level.

Based on their findings, Stiggins and Bridgeford (1985) surmised that inservice teachers must understand how to assess data gathered via teacher observation as a part of the total assessment of student progress. Even more important is their supposition that greater attention should be given to the practices of assessment for general classroom use, with emphasis on building collaboration among educators. Relevant inservice training for teachers was also suggested. Consequently, Stiggins and Bridgeford recommended that classroom teachers exercise quality control of the tests they construct, whether these tests are based on subjective interpretations or observation-based assessments.

Rowntree (1987) proposed further suggestions for a compilation of knowledge surrounding testing and measurement. These suggestions included assessment that clearly articulates the goals and objectives to be assessed, the use of a variety of assessment methods, maximum feedback from the assessment, knowing the criteria to be used as performance indicators before assessment (i.e., criterion-referenced or norm-referenced), and realizing that different colleagues may, in effect, hold different opinions of the same assessment device.

A statement that focused on appropriate measurement standards was issued by the American Federation of Teachers, the National Council on Measurement in Education, and the National Education Association; the statement steered preservice and inservice teachers toward more effective assessment. It emphasized that teachers should be proficient in (a) matching assessment methods with decisions that enhance instruction; (b) developing these methods; (c) adequately understanding commercially produced and teacher-produced assessment results; (d) using assessment results to make decisions concerning individual students, developing teaching plans and strategies, planning curricula, and instigating school improvements; (e) using assessment to perpetuate valid grading practices; (f) communicating assessment results to a variety of individuals; and (g) recognizing assessment practices that are deemed to be inappropriate (Schafer, 1993).

Kubiszyn and Borich (1996) further proposed that, among many important measurement issues, knowing the type of test needed based on the purpose of the test, identification of instructional objectives, knowledge of essay-test construction, reliability, validity, basic test statistics, and the implications that surround the assignment of grades as an end product of assessment are important issues that classroom teachers must address. An additional recommendation proposed by the authors is that teachers acquire the knowledge to effectively communicate assessment results to parents and the community at large in clear and meaningful ways. Schafer (1993) cited miscommunication of standardized test results to parents and the community as a recurring dilemma in our schools. In short, teachers often do not report standardized test scores correctly because of inadequate assessment literacy. This miscommunication is exhibited when teachers report percentile ranks as percentages of subject content (e.g., assuming that the 50th percentile is in the “F” range) or treat grade equivalents as the averages of students at that grade level when the comparison should be made with the content of the curriculum at that particular grade level.
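The distinction at issue can be made concrete with a small computation. The sketch below is our own illustration (the norm-group scores and function names are hypothetical, not data from the study); it contrasts the percentage of items answered correctly with the percentile rank, which reports standing relative to a norm group:

    def percent_correct(raw_score, n_items):
        # Share of the test's items answered correctly.
        return 100.0 * raw_score / n_items

    def percentile_rank(raw_score, norm_scores):
        # Percentage of the norm group scoring below the given raw score.
        below = sum(1 for s in norm_scores if s < raw_score)
        return 100.0 * below / len(norm_scores)

    # A student answers 21 of 30 items (70% of the content) on a test whose
    # hypothetical norm sample is:
    norm_group = [12, 15, 18, 19, 20, 22, 23, 24, 26, 29]

    print(percent_correct(21, 30))          # 70.0
    print(percentile_rank(21, norm_group))  # 50.0 -- average standing, not an "F"

A student at the 50th percentile here has outscored half of the norm group while answering 70% of the items correctly; reporting the percentile as a percentage of content mastered would misstate both numbers.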

O’Neil (1994) further reported that today’s teachers should be competent in developing “rubrics” that provide criteria that describe how students perform at a variety of levels. Similarly, the notion of portfolio assessment (collections of student work showing student effort, progress, or achievement in one or more areas) is currently gaining in popularity. Engel (1994) determined that effective portfolio usage requires preparation; professional development; attention to decision making; and a rethinking of instruction, curriculum, and assessment. She further contended that a knowledge base for portfolio assessment is vital for all concerned, with an emphasis on parental understanding.

Popham (1995) identified several critical issues surrounding assessment that teachers should understand in order to be effective in the classroom. Among these issues are (a) diagnosis of students’ strengths and weaknesses, (b) monitoring of students’ progress, (c) assignment of grades, (d) teacher-constructed assessment, (e) knowledge of commercially constructed educational tests, and (f) determination of the teacher’s instructional effectiveness. Popham further emphasized the need for a realization that the primary goal of testing is to assist the teacher in making sound, rational decisions about education.

While making the argument that assessment has as its primary goal the improvement of student performance, Jamentz (1995) identified four key practices that link assessment to instruction and are therefore important for teachers to internalize: (a) the identification of standards and the creation of meaningful assessment, (b) the foundation of assessment in order to improve instruction, (c) assisting students in becoming users of assessment in order to gauge their individual learning, and (d) using assessment to positively affect the learning environment. Ultimately, according to Jamentz, teachers should continually strive for improved assessment that will, in effect, result in enhanced student performance.

As described by Seeley (1995), teachers need to recreate classrooms that are moving from a “testing culture,” where teachers are the sole authority, students work alone, and learning is done to pass a test, to an “assessment culture,” where teachers and students collaborate about learning, assessment takes many forms for multiple audiences, and assessments are blurred (p. 6). Similarly, Wiggins (1995) outlined six necessary components that teachers must embrace in order to report assessment accurately. These include (a) a distinction between standard-referenced (criterion-referenced) and norm-referenced achievement, (b) interpretation of data that assesses the student against an established standard and against progress of the individual student, (c) a system that establishes standards for each grade level that is accessible to students in order to chart individual progress, (d) assessment of achievement at multiple levels, (e) designation of the difficulty of the assessment in relation to the quality of the student work produced, and (f) assessment of habits that comprise the individual character of students. Finally, Kohn (1995) encouraged teachers to examine carefully why students are evaluated as opposed to how they are evaluated.

Teachers are expected to know a great deal concerning testing and measurement if they are to be effective in assessing student progress. Clearly, much is required of the classroom teacher in the realm of assessment, with all roads leading to increased assistance to the student. Farr (1995) summarized this view with the following insight: “The bottom line in selecting and using any assessment should be whether it helps the students” (p. 4).

An interesting question posed by the preceding statement is whether today’s teachers are, in fact, using assessment to aid students. Numerous pitfalls in the use of testing and measurement have been previously documented. Are contemporary teachers avoiding these assessment snares or continuing to make the same mistakes as their predecessors? Posing a similar question in 1987, Schafer and Lissitz stated, “We hope that in fifteen years there is not yet another survey revealing little progress since the previous study, as ours has done” (p. 62). Even though 15 years have not yet passed, the need to determine what contemporary teachers know concerning testing and measurement and how they use what they know in the classroom is a noteworthy and much needed endeavor. Again referring to Farr (1995), the bottom line in assessment is helping students. This raises the persistent and recurring question as to whether classroom teachers are, in effect, using appraisal techniques that improve the learning environment for students.

Purposes of the Study

We designed the present study to determine the literacy level of elementary and secondary teachers concerning educational testing and measurement. We also sought to examine the reciprocal relationship between this knowledge base and the actual classroom assessment practices used by elementary and secondary teachers. We sought to ascertain whether contemporary teachers are more adequately prepared than their predecessors to use assessment as a framework to aid student performance and whether assessment practices differ by grade level. Specifically, our purposes in the present study were to (a) determine the educational testing and measurement literacy of elementary and secondary teachers, (b) examine how testing and measurement concepts are applied in the classroom assessment environment, and (c) determine if assessment strategies vary across grade levels.

Method

Participants

The participants in the present study were elementary and secondary teachers (n = 95) from two schools in southern Mississippi. Because the sample was a relatively small, nonprobability sample, we took steps to determine the degree to which the sample was similar to the population of teachers nationwide to ensure at least reasonable evidence of generalizability of the findings to that population. As Schumacher and McMillan (1993) noted, “Often researchers will describe the subjects carefully to show that although they were not selected randomly from a larger population, the characteristics of the subjects appear representative of much of the population” (p. 160).

In analyzing the present sample’s representativeness, we found that the teachers sampled taught students in a wide range of grades (1st through 12th) and curricular/certification areas (including elementary education, special education, mathematics, science, social studies/history, English/language arts, reading, vocational/technical education, physical education/health, library, and counseling). At the elementary level, the respondents were generally responsible for teaching all content areas; secondary teachers were specialized according to subject matter. Distribution of the sample indicated that 74 (80.4%) of the respondents represented elementary grades (K-6) and 21 (19.6%) represented secondary grades (7-12). Of the 90 persons indicating gender, 11.1% (n = 10) were men and 88.9% (n = 80) were women. The average age of the sample was 41.5 years, with ages ranging from 22 to 59 years. The average number of years of teaching experience was 14.1, with years of experience ranging from 1 to 39 years. An examination of the various degrees earned by the respondents indicated that of the 92 respondents providing this information, 38% (n = 35) held bachelor’s degrees, 58.7% (n = 54) held master’s degrees, and 3.3% (n = 3) held specialist degrees. A breakdown of the ethnic composition of the sample (92 cases reporting) indicated that 94.6% (n = 87) were White and 5.4% (n = 5) were African American. The district included in the study was considered to be suburban, with schools varying in size and student composition.

Also, although the sample used in the present study was one of convenience, it is interesting to note that the sample’s demographic characteristics are within reasonable bounds of the actual data for the teaching force at large in the United States (National Center for Educational Statistics, 1997). The present sample had a higher percentage of women (88.9%) than the national teaching population (73%), reflecting the high percentage of elementary teachers included in the present sample. Similarly, the percentage of White teachers was slightly higher than the national average (94.6% vs. 87.3%). The present sample was slightly more highly educated than the national average, with 62% of the sample holding degrees above the bachelor’s, compared with the national average of 53.3%. Length of teaching experience (14.1 years) for the present sample was virtually identical to the national data (14.8 years). Because of these similarities, we deemed the sample to be relatively reflective of teachers at large but limited by its geographic cohesiveness.

All participants were fully informed of the purposes of the study and were advised of their rights as human participants in accordance with federal guidelines. The initial approval of the study included permission from the office of the superintendent of schools of the district from which the sample was drawn. At the time the data were collected, the participants were given the option not to participate without penalty. Strict anonymity of responses was assured by the nonplacement of teachers’ names on the completed answer sheets.

Instrumentation

The questionnaire used for the purposes of the present study contained 67 items divided categorically by background information (7 items), testing and measurement literacy (30 items), and use of assessment techniques (30 items). Background information was gathered through either a fill-in-the-blank or four-choice (i.e., A, B, C, or D) multiple-choice design. Items on the knowledge segment of the survey were presented in a true-false format, and responses to the level-of-use-of-assessment-techniques items were elicited through a 5-point Likert-type scale arrangement in which 1 = I do not use it at all, 2 = I rarely use it, 3 = I occasionally use it, 4 = I use it a good bit, and 5 = I use it with great frequency. Responses were recorded on an optiscan answer sheet that accompanied the survey. An explanation of the purpose of the study, directions for each section, and assurance of confidentiality were included in the text of the instrument. Alpha reliability coefficients for the scores on the measurement literacy and use of techniques sections were .60 and .93, respectively. The somewhat lower alpha coefficient for the scores on the measurement literacy section is typical for a true-false test of this length because of the susceptibility of true-false items to guessing (Linn & Gronlund, 1995).
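For readers who wish to see how such coefficients are obtained, the sketch below computes Cronbach’s alpha from an examinee-by-item score matrix. This is a generic illustration under our own assumptions (the function name and toy data are ours; the article does not describe the software used):

    # Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of totals)
    def cronbach_alpha(scores):
        # scores: one list per examinee, each holding k item scores
        # (0/1 for the true-false items; 1-5 for the Likert-type items).
        k = len(scores[0])

        def var(xs):  # unbiased sample variance
            m = sum(xs) / len(xs)
            return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

        totals = [sum(person) for person in scores]
        item_vars = [var([person[i] for person in scores]) for i in range(k)]
        return (k / (k - 1)) * (1 - sum(item_vars) / var(totals))

    toy = [[1, 0, 1, 1], [1, 1, 1, 1], [0, 0, 1, 0]]  # 3 examinees, 4 items
    print(round(cronbach_alpha(toy), 2))  # 0.76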

Data Analytic Procedures

We used descriptive procedures to analyze all data. We computed item difficulty indices (IDIs) for the true-false knowledge-base-for-testing-and-measurement items. An IDI ranges in value from zero to one and indicates the percentage of persons who have gotten an item correct (Linn & Gronlund, 1995). We computed IDIs for the entire sample and separately for the elementary and secondary cohorts. In addition, we computed a total score for the true-false items based on counting the number of correct responses to the 30 items in this portion of the instrument. We computed mean scores for this variable for the entire sample and separately for the elementary and secondary cohorts. Descriptive statistics for the continuously scaled assessment-use items were computed for the entire sample as well as for the elementary and secondary cohorts. Because the present study was descriptive in nature, we used no parametric statistical tests.
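These two computations are simple enough to state in code. The sketch below is our own illustration with hypothetical toy data (not the authors’ scripts); it shows the IDI as a proportion correct and the total score as a count of correct responses:

    # correct: one list per examinee, with 1 for a correct response and 0 otherwise.
    def idi(correct, item):
        # Item difficulty index: proportion of examinees answering this item correctly.
        answers = [person[item] for person in correct]
        return sum(answers) / len(answers)

    def total_scores(correct):
        # Number-correct total for each examinee (0-30 on the actual instrument).
        return [sum(person) for person in correct]

    correct = [[1, 0, 1], [1, 1, 1], [0, 1, 1], [1, 1, 0]]  # 4 examinees, 3 items
    print([idi(correct, i) for i in range(3)])  # [0.75, 0.75, 0.75]
    print(total_scores(correct))                # [2, 3, 2, 2]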

Results

Teachers’ Knowledge Base for Testing and Measurement

Item difficulty indices for the 30 knowledge-base-for-testing-and-measurement items are presented for the entire sample and separately for the elementary/secondary cohorts in Table 1. Descriptive statistics for the total scores on the true-false items are presented in Table 2. We used a minimum IDI of .80 to ascertain the content that teachers in the sample were most likely to have mastered. Seven items (A8, A9, A10, A25, A28, A30, and A31; see Table 1) met this criterion for the full sample. Interestingly, the majority of these items reflected the teachers’ understanding of standardized tests and their knowledge of various types of standard scores, suggesting an adequate knowledge base regarding these concepts.

We used IDIs between .50 and .80 to ascertain the content that teachers in the sample were less likely to have mastered. However, because at least half of the sample responded to these items correctly, we assumed that at least some of the teachers had an adequate knowledge base in this area. Fifteen items (A4-A7, A11-A13, A15, A18, A19, A21, A26, A27, A29, and A32) had IDIs within this range.

Table 1. Item Difficulty Indices for True-False Items Across Teaching Levels

(IDIs are reported for the entire sample, n = 90; the elementary cohort, n = 73; and the secondary cohort, n = 17. The keyed answer follows each item.)

A4. One of the shortcomings of the range as an index of variability is that it is derived from only 2 raw scores. (T)
    Entire .6111; Elementary .6301; Secondary .5294
A5. If scores on a test are heterogeneous, the test will tend to have a large standard deviation. (T)
    Entire .5667; Elementary .5890; Secondary .4706
A6. Once performance standards are set for a norm-referenced test, the norms may be used for up to 15 years before renorming is required. (F)
    Entire .7333; Elementary .7671; Secondary .5882
A7. Performance standards based on a local norm group are almost always lower than standards based on a national norm group. (F)
    Entire .7111; Elementary .7123; Secondary .7059
A8. Percentile scores are the most common way of describing students’ performance on standardized tests. (T)
    Entire .9778; Elementary 1.0000; Secondary .8824
A9. A grade-equivalent score of 6.2 for a 5th grader indicates the 5th grader is performing 6 months and 2 weeks ahead of expectations. (F)
    Entire .9778; Elementary .9863; Secondary .9412
A10. Even though age- and grade-equivalent scores are communicable, they are often misinterpreted. (T)
    Entire .8667; Elementary .9041; Secondary .7059
A11. A percentile score on a standardized test is based on the percentage of items the student has answered correctly. (F)
    Entire .5778; Elementary .6027; Secondary .4706
A12. Essay items are very useful because of their tendency to be highly reliable. (F)
    Entire .7111; Elementary .6849; Secondary .8235
A13. Matching items are most appropriately used when testing material is highly homogeneous in nature. (T)
    Entire .5333; Elementary .5479; Secondary .4706
A14. Multiple-choice items are useful in measuring not only factual recall but also the ability to analyze and synthesize information. (T)
    Entire .4111; Elementary .4247; Secondary .3529
A15. When constructing tests, teachers should be more concerned with content validity than construct or predictive validity. (T)
    Entire .6667; Elementary .6849; Secondary .5882
A16. A correlation of .85 between 2 sets of scores for the same test given to the same students at different times indicates the scores are highly valid. (F)
    Entire .3222; Elementary .3425; Secondary .2353
A17. It is not correct to make the statement “This is a reliable test.” (T)
    Entire .4111; Elementary .4110; Secondary .4118
A18. A person’s observed score on any given test actually indicates a point in the range near which the person’s true score falls. (T)
    Entire .7111; Elementary .7260; Secondary .6471
A19. On a criterion-referenced test, each student’s score is compared to the performance of the class as a whole. (F)
    Entire .6111; Elementary .6301; Secondary .5294
A20. In general, adding items to a test will increase the likelihood of the test producing reliable results. (T)
    Entire .3889; Elementary .3973; Secondary .3529
A21. Determining the difficulty level of a test item helps the teacher to evaluate the degree to which students have attained instructional objectives. (T)
    Entire .6889; Elementary .6849; Secondary .7059
A22. A correlation of -.82 is “stronger” than a correlation of +.33. (T)
    Entire .2222; Elementary .1918; Secondary .3529
A23. The standard error of measure indicates the amount of variability in a set of scores. (F)
    Entire .1556; Elementary .1096; Secondary .3529
A24. Of the mean, median, and mode, the median is the central tendency statistic most affected by outlying scores. (F)
    Entire .2889; Elementary .3151; Secondary .1765
A25. Teacher-made tests are usually preferable to standardized tests as a measure of the teacher’s specific learning objectives. (T)
    Entire .9000; Elementary .8904; Secondary .9412
A26. Matching items should be used for measuring students’ ability to understand general principles. (F)
    Entire .5333; Elementary .5890; Secondary .2941
A27. A stanine score of 6 is higher than a T score of 50. (T)
    Entire .5889; Elementary .5616; Secondary .7059
A28. A test score at the 61st percentile indicates that 61% of the items on the test have been answered correctly. (F)
    Entire .8333; Elementary .8219; Secondary .8824
A29. The purpose of achievement tests is to measure the student’s performance after instruction has taken place. (T)
    Entire .6778; Elementary .6849; Secondary .6471
A30. A standardized test is a test published by a professional test development company administered according to a prescribed set of procedures. (T)
    Entire .9444; Elementary .9452; Secondary .9412
A31. When interpreting results of a standardized test, the test’s norms serve as a means for judging the student’s relative level of performance when compared to a representative sample of test takers. (T)
    Entire .9111; Elementary .9178; Secondary .8824
A32. A student portfolio consists simply of a file in which student work is placed. (F)
    Entire .6000; Elementary .6027; Secondary .5882
A33. Performance-based assessments are superior to traditional achievement testing because they tend to produce more highly reliable ratings of student performance. (F)
    Entire .1222; Elementary .1096; Secondary .1765

Note. For each item, total cases = 93 and missing cases = 3 (3.2%).

Table 2. Descriptive Statistics for True-False Total Score Across Teaching Levels

    Entire sample: M = 18.2556, SD = 3.2136, n = 90
    Elementary:    M = 18.4658, SD = 3.1758, n = 73
    Secondary:     M = 17.3529, SD = 3.3155, n = 17

Note. Total cases = 93; missing cases = 3 (3.2%).

In general, these items attempted to measure the teachers’ knowledge base across the areas of variability statistics (4 items), construction of and purpose of various types of tests and test items (5 items), and knowledge of norm-referenced tests and scores (5 items). In addition, a single item (A32) within this group addressed the teachers’ understanding of portfolio assessment.

Finally, we examined items with IDI values that were less than .50. Considering that less than half of the sample had correctly responded to these items, it was apparent that the knowledge base of the participants was inadequate in the areas addressed by these items. Eight items (A14, A16, A17, A20, A22, A23, A24, and A33) had IDIs within this range. Content themes addressed by these items included knowledge of validity, reliability, and elementary test statistics (6 items); use of multiple-choice items (1 item); and use of performance-based assessments (1 item).

The teachers’ overall knowledge of testing and measurement was summarized by the total score on the 30 true-false items (see Table 2). The mean of 18.3 items (61%) correct was just 3 items better than the chance score of 15 (50%) expected had the teachers simply guessed at the items; this finding suggests that the teachers generally lacked knowledge of the issues addressed in the items. Considering that Linn and Gronlund (1995) noted that it is typical for true-false tests to yield scores no lower than 80% for a usual group of examinees, a score of at least 24 points would be judged as adequate. Obviously, the mean score for the present sample was substantially lower than this criterion.
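For clarity, the two benchmarks used above follow directly from the length of the scale (our restatement in formula form, using the figures already given): with $n = 30$ true-false items and a guessing probability of $p = .5$, the expected chance score is $E[X] = np = 30 \times .5 = 15$ items, and the 80% adequacy criterion corresponds to $.80 \times 30 = 24$ items. The observed mean of 18.26 falls much nearer the chance value than the adequacy criterion.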

Teachers’ Use of Assessment Information

Descriptive statistics are presented for the 30 use-of-assessment-information items for the entire sample and separately for the elementary/secondary cohorts in Table 3. For interpretive purposes, items having means greater than 3.00 (information used more than “occasionally”) were considered to reflect information used with some frequency in the teachers’ classroom assessment practices. By contrast, items with means less than 3.00 (information used less than occasionally) were considered to reflect information not typically used in the teachers’ classroom assessment practices. A brief overview of these two item cohorts follows.

Information used more frequently. Sixteen items (Items 36, 38-45, 47, 50, 51, and 60-63) represented information frequently used by teachers in assessing and reporting student progress. The majority of these items focused on knowledge relative to the construction and use of teacher-made tests (Items 38, 40-44, 50, 60, and 61). Two items were related to performance-based assessments (Items 62 and 63), 1 was related to writing of educational objectives (Item 39), and 2 were related to interpretation of standardized achievement test data (Items 36 and 47). In addition, 1 item measured the teachers’ use of information about strategies for grouping students (Item 51), and another item measured their use of information about reporting student progress to parents (Item 45).


Table 3. Descriptive Statistics for Assessment-Use Items

(Each item was rated on the 5-point use scale described above. Statistics are reported for the entire sample and the elementary and secondary cohorts. Total cases = 93; missing cases per item ranged from 4, or 4.3%, to 9, or 9.7%, as reflected in the ns.)

Item 34. Knowledge of the advantages and disadvantages of standardized tests.
    Entire M = 2.8539, SD = 1.0175, n = 89; Elementary M = 3.0139, SD = .9567, n = 72; Secondary M = 2.1765, SD = 1.0146, n = 17
Item 35. Ability to compare standardized tests with teacher-made tests.
    Entire M = 2.2584, SD = .9715, n = 89; Elementary M = 2.2222, SD = .9527, n = 72; Secondary M = 2.4118, SD = 1.0641, n = 17
Item 36. Ability to interpret achievement test scores.
    Entire M = 3.4719, SD = .9545, n = 89; Elementary M = 3.6528, SD = .8906, n = 72; Secondary M = 2.7059, SD = .8489, n = 17
Item 37. Knowledge of general information about individual intelligence (aptitude) tests.
    Entire M = 2.6591, SD = 1.0816, n = 88; Elementary M = 2.8028, SD = .9946, n = 71; Secondary M = 2.0588, SD = 1.2485, n = 17
Item 38. Knowledge of advantages and disadvantages of teacher-made tests.
    Entire M = 3.3295, SD = .9910, n = 88; Elementary M = 3.2535, SD = 1.0102, n = 71; Secondary M = 3.6471, SD = .8618, n = 17
Item 39. Ability to state educational objectives in measurable terms.
    Entire M = 3.8977, SD = 1.0062, n = 88; Elementary M = 3.8873, SD = 1.0358, n = 71; Secondary M = 3.9412, SD = .8993, n = 17
Item 40. Knowledge of the general principles of test construction.
    Entire M = 3.6136, SD = .9992, n = 88; Elementary M = 3.5070, SD = 1.0124, n = 71; Secondary M = 4.0588, SD = .8269, n = 17
Item 41. Knowledge of the advantages and disadvantages of various types of objective test items.
    Entire M = 3.4205, SD = .9556, n = 88; Elementary M = 3.3380, SD = .9848, n = 71; Secondary M = 3.7647, SD = .7524, n = 17
Item 42. Knowledge of techniques of administering a test.
    Entire M = 3.8977, SD = .9594, n = 88; Elementary M = 3.8310, SD = .9998, n = 71; Secondary M = 4.1765, SD = .7276, n = 17
Item 43. Ability to construct different types of test items.
    Entire M = 3.7586, SD = 1.2101, n = 87; Elementary M = 3.6714, SD = 1.2594, n = 70; Secondary M = 4.1176, SD = .9275, n = 17
Item 44. Knowledge of principles involved in scoring different types of test items.
    Entire M = 3.5517, SD = 1.0202, n = 87; Elementary M = 3.5429, SD = 1.0452, n = 70; Secondary M = 3.5882, SD = .9393, n = 17
Item 45. Knowledge of procedures for effectively reporting student progress to parents.
    Entire M = 4.3908, SD = .8939, n = 87; Elementary M = 4.3714, SD = .9352, n = 70; Secondary M = 4.4706, SD = .7174, n = 17
Item 46. Familiarity with correct procedures for using a “table of specifications” or “test blueprint” when constructing a test.
    Entire M = 2.0230, SD = .9642, n = 87; Elementary M = 2.1000, SD = .9502, n = 70; Secondary M = 1.7059, SD = .9852, n = 17
Item 47. Ability to interpret various types of standard scores yielded by formal achievement and aptitude tests.
    Entire M = 3.0460, SD = 1.0105, n = 87; Elementary M = 3.1714, SD = .9627, n = 70; Secondary M = 2.5294, SD = 1.0676, n = 17
Item 48. Ability to compare two classes on the basis of simple statistics such as means and standard deviations.
    Entire M = 2.4713, SD = 1.0766, n = 87; Elementary M = 2.4714, SD = 1.0864, n = 70; Secondary M = 2.4706, SD = 1.0676, n = 17
Item 49. Knowledge of concepts of validity, reliability, and item analysis.
    Entire M = 2.9885, SD = 1.1049, n = 87; Elementary M = 3.0429, SD = 1.1349, n = 70; Secondary M = 2.7647, SD = .9701, n = 17
Item 50. Ability to do a simple item analysis for a teacher-made test.
    Entire M = 3.1379, SD = 1.3483, n = 87; Elementary M = 3.1000, SD = 1.3742, n = 70; Secondary M = 3.2941, SD = 1.2632, n = 17
Item 51. Knowledge of the advantages and disadvantages of various strategies for grouping students.
    Entire M = 3.6977, SD = 1.1175, n = 86; Elementary M = 3.7286, SD = 1.1411, n = 70; Secondary M = 3.5625, SD = 1.0308, n = 16
Item 52. Familiarity with the uses of a frequency distribution.
    Entire M = 2.0930, SD = .9534, n = 86; Elementary M = 2.1143, SD = .9254, n = 70; Secondary M = 2.0000, SD = 1.0954, n = 16
Item 53. Understanding of the concept of measurement error.
    Entire M = 2.5176, SD = 1.1914, n = 85; Elementary M = 2.5217, SD = 1.1582, n = 69; Secondary M = 2.5000, SD = 1.3663, n = 16
Item 54. Understanding of the uses of the mode, median, and mean.
    Entire M = 2.9767, SD = 1.2740, n = 86; Elementary M = 3.0429, SD = 1.2561, n = 70; Secondary M = 2.6875, SD = 1.3525, n = 16
Item 55. Ability to compute the mode, median, and mean for a given set of scores.
    Entire M = 2.7093, SD = 1.2354, n = 86; Elementary M = 2.7429, SD = 1.2358, n = 70; Secondary M = 2.5625, SD = 1.2633, n = 16
Item 56. Understanding of the uses of the standard deviation.
    Entire M = 2.5698, SD = 1.1836, n = 86; Elementary M = 2.5429, SD = 1.1253, n = 70; Secondary M = 2.6875, SD = 1.4477, n = 16
Item 57. Knowledge of the properties, uses, and limitations of the normal curve.
    Entire M = 2.4588, SD = 1.0971, n = 85; Elementary M = 2.4286, SD = 1.0977, n = 70; Secondary M = 2.6000, SD = 1.1212, n = 15
Item 58. Knowledge of the means and standard deviations of common standard scores (e.g., z scores, T scores, stanines).
    Entire M = 2.1512, SD = 1.2504, n = 86; Elementary M = 2.2286, SD = 1.3206, n = 70; Secondary M = 1.8125, SD = .9106, n = 16
Item 59. Ability to interpret a correlation coefficient.
    Entire M = 1.8837, SD = 1.0674, n = 86; Elementary M = 1.9000, SD = 1.0653, n = 70; Secondary M = 1.8125, SD = 1.1087, n = 16
Item 60. Knowledge of appropriate procedures for administering a test.
    Entire M = 4.0233, SD = 1.0844, n = 86; Elementary M = 4.0000, SD = 1.1421, n = 70; Secondary M = 4.1250, SD = .8602, n = 16
Item 61. Ability to write effective test items of various types (e.g., true-false, multiple-choice, matching, short-answer, essay).
    Entire M = 3.7791, SD = 1.2593, n = 86; Elementary M = 3.6571, SD = 1.3175, n = 70; Secondary M = 4.3125, SD = .7932, n = 16
Item 62. Knowledge of procedures for ensuring accurate ratings of performance-based tasks.
    Entire M = 3.2326, SD = 1.0480, n = 86; Elementary M = 3.2571, SD = 1.0589, n = 70; Secondary M = 3.1250, SD = 1.0247, n = 16
Item 63. Knowledge of procedures for compiling and appraising a student portfolio.
    Entire M = 3.2857, SD = 1.0929, n = 84; Elementary M = 3.4203, SD = .9611, n = 69; Secondary M = 2.6667, SD = 1.4475, n = 15

Information used less frequently. The remaining 14 items (Items 34, 35, 37, 46, 48, 49, and 52-59) represented information not frequently used by the teachers. The majority of these items represent statistical information (Items 48, 49, and 52-59), although 3 reflected general information about and comparisons of various standardized tests (Items 34, 35, and 37), and 1 focused on the use of a table of specifications when constructing a test (Item 46).

Comparisons Across Grade Levels

Teachers’ knowledge base for testing and measurement. Generally speaking, there were few noteworthy differences between elementary and secondary teachers in the difficulty levels of the 30 items used to measure the teachers’ knowledge base. Using the previously established categories for determining level of mastery (i.e., IDI > .80 = mastery; .50 ≤ IDI ≤ .80 = partial proficiency; and IDI < .50 = inadequate mastery), we compared IDIs for the cohort groups on each item. Differences were considered noteworthy if the IDIs for the two groups fell within disparate mastery ranges. Only four such differences were found (Items A11, A12, A13, and A26), and all favored the elementary-teacher cohort. Nevertheless, total scores across the elementary and secondary cohorts (see Table 2) indicated only a minimal difference in the knowledge base of the two groups, with the two means varying only about one third of a standard deviation.
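The comparison rule can be written out directly; the following sketch is our own restatement of it (band labels and cutoffs from the text; IDI values from Table 1):

    def band(idi):
        # Mastery categories used above.
        if idi > 0.80:
            return "mastery"
        if idi >= 0.50:
            return "partial proficiency"
        return "inadequate mastery"

    def noteworthy(elementary_idi, secondary_idi):
        # A difference is noteworthy when the cohorts' IDIs fall in disparate bands.
        return band(elementary_idi) != band(secondary_idi)

    print(noteworthy(0.6301, 0.5294))  # False (Item A4: both partial proficiency)
    print(noteworthy(0.6027, 0.4706))  # True  (Item A11: partial vs. inadequate)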

Teachers’ use of assessment information. In general, assessment-use items were not perceived differently across the grade-level cohorts. In fact, an inspection of the cohort means across the 30 items (Table 3) indicates that only 1 item (Item 36, “ability to interpret achievement test scores”) accounted for a difference of as much as a standard deviation. This finding indicates that uses of assessment information do not vary considerably across elementary and secondary teachers.

Discussion

Teachers’ Knowledge Base for Testing and Measurement

In general, the teachers did not appear to have an extensive knowledge base in testing and measurement. That the mean score for the items in this section of the survey was only slightly higher than chance results is particularly disturbing. This finding indicates that inservice teachers may be less well prepared in testing and measurement skills than they are in other areas of their training (Gullickson, 1984; Scales, 1993).

As indicated previously, the teachers tended to have an adequate knowledge base as regards their understanding of standardized tests and their interpretation of various types of standard scores. In addition, some of the teachers had an adequate understanding of variability statistics, construction of and purpose of various types of tests and test items, norm-referenced tests and scores, and portfolio assessment. Considering the importance of standardized testing in education, teachers’ ability to interpret results of these tests is imperative. Hence, it is reassuring that the teachers’ highest level of competence was in this area. Moreover, teachers’ effectiveness relies in part on their knowledge of interpretation and use of a variety of assessment procedures. Thus, the fact that at least half of the participating teachers demonstrated competence in their knowledge of variability statistics, test construction, norm-referenced tests, and portfolio assessment indicates an appropriately comprehensive focus on assessment procedures.

The teachers’ knowledge base regarding psychometric characteristics of tests and statistical issues related to assessment was less adequate. In particular, the participating teachers appeared to lack a basic understanding of the concepts of validity and reliability and did not understand simple test statistics. Regarding knowledge of validity and reliability, for example, most of the teachers (a) did not know that longer tests tend to produce more reliable results (Item A20), (b) inappropriately assumed reliability to be a characteristic of a test rather than of test data (Item A17), (c) inappropriately believed that high-inference assessment decisions based on evaluation of performance-based data necessarily produced more highly reliable assessment results, and (d) seemed to be confused about differences between validity and reliability (Item A16). Although these deficiencies in the teachers’ knowledge base may not be detrimental to the teachers’ day-to-day assessment practices, the deficiencies may lead these teachers to place their confidence in various assessment tools (e.g., standardized tests, student portfolios) without adequately investigating the psychometric merits of such tools. Considering that these assessment tools are frequently used in making high-stakes decisions (e.g., promotion/retention decisions, program placement decisions, special education screening), teachers’ inability to make informed judgments about the merits of these tools could yield very distressing results.
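The first of these points, that longer tests tend to produce more reliable results, reflects the Spearman-Brown prophecy formula, a standard psychometric result (the formula is supplied here for context; the authors do not state it):

$$r_{kk} = \frac{k\,r_{xx}}{1 + (k - 1)\,r_{xx}},$$

where $r_{xx}$ is the reliability of the original test and $k$ is the factor by which the test is lengthened. Doubling a test with reliability .60 ($k = 2$), for example, predicts a reliability of $2(.60)/(1 + .60) = .75$.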

Regarding knowledge of test statistics, the respondents lacked a basic understanding of correlation coefficients (Item A22), central tendency statistics (Item A24), and measures of test score error (Item A23). As previously noted regarding teachers’ knowledge of basic principles of validity and reliability, dangers resulting from deficiencies in teachers’ knowledge of elementary test statistics may not be immediately apparent. However, such deficiencies may prove problematic. For example, a teacher who does not understand simple central tendency and variability statistics may have a difficult time trying to explain to a student or parent the degree to which a given student’s score varies from a typical score on a particular teacher-made test. Similarly, a less-than-adequate understanding of the correct way to interpret a coefficient of correlation may prompt a teacher to initiate a classroom practice in hopes of improving student achievement just because that practice was correlated with student achievement.
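A concrete instance of the first difficulty (the numbers here are hypothetical): on a teacher-made test with a mean of 78 and a standard deviation of 6, a student who scores 90 lies

$$z = \frac{X - M}{SD} = \frac{90 - 78}{6} = 2.0$$

standard deviations above the typical score, which is precisely the comparison a teacher would need to articulate for a student or parent.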

Teachers’ Use of Assessment Information

The foregoing findings indicate that teachers frequently use information regarding the assessment and reporting of student progress. In fact, the only 2 items (i.e., Items 45 and 60) with overall mean scores in excess of 4 (“I use it [this information] a good bit”) dealt with these issues. Specifically, the teachers recognized that their knowledge of how to interpret standardized test scores, how to construct good teacher-made tests, and how to communicate assessment results to parents was extremely important. Thus, these teachers agreed with Wiggins’s (1995) assertion that interpretation of data that assesses the student against established standards and against the progress of the individual student is a necessary component of teacher effectiveness.

It is not surprising that the participating teachers reported that knowledge of the construction and use of teacher-made tests was among the information that they found most useful in their assessment practices. Despite the increasing use of performance-based assessments, teacher-made tests no doubt remain one of the most used sources of information about student progress. Consequently, Popham (1995) identified teachers’ ability “to construct and evaluate their own classroom tests” as one of the critical issues that will continue to dominate teachers’ thinking about assessment (p. 17). Nevertheless, the teachers reported that they were not particularly familiar with more formalized test construction procedures (i.e., use of a table of specifications or test blueprint when constructing a test [Item 46]).

As previously noted, the teachers reported that they did not frequently use knowledge of statistical procedures related to testing and measurement in their assessment of students. On the one hand, this finding is not surprising when one considers that teachers do not frequently have opportunities for direct application of this knowledge (Schafer & Lissitz, 1987). On the other hand, the question remains as to whether teachers do not use this information frequently because they lack a sufficient knowledge base in this area. Educational tests and measurements textbooks (e.g., Kubiszyn & Borich, 1996; Linn & Gronlund, 1995; Popham, 1995) provide numerous examples illustrating the usefulness of this type of information to the classroom teacher. Moreover, with the advent of computer-assisted scoring procedures for classroom tests, many teachers now are privy to test summary output from scoring programs that presents teachers with a host of helpful summary statistics that would be tedious for them to compute by hand. These statistics may be useful in interpreting test results and refining existing tests. Hence, further study is needed to determine exactly why teachers do not typically use this type of information.
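By way of illustration, the kind of summary output such scoring programs produce can be generated in a few lines of Python (a sketch of ours; the item data are hypothetical, and actual scoring software varies by vendor):

```python
import statistics

# Hypothetical classroom test: one row per student, one column per item
# (1 = correct, 0 = incorrect) on a 5-item quiz.
responses = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 0, 1, 1],
]

totals = [sum(student) for student in responses]
print("Mean total score:", statistics.mean(totals))
print("Standard deviation:", round(statistics.stdev(totals), 2))

# Item difficulty: the proportion of students answering each item correctly.
n_students = len(responses)
for item in range(len(responses[0])):
    p = sum(student[item] for student in responses) / n_students
    print(f"Item {item + 1} difficulty: {p:.2f}")
```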

The teachers also indicated that they did not use general information about or typically make comparisons among various types of standardized tests. In fact, teachers may view standardized tests as somewhat of a necessary evil, tolerating their use while taking little opportunity to gather specific information about the nature of these tests. Teachers may consider standardized achievement tests to be an inferior source of assessment information (Stiggins & Bridgeford, 1985). Thus, as Salmon-Cox (1981) noted, “Teachers rely heavily on observation and . . . feel confident with their own judgments. . . . [A] child’s classroom performance is given more credence than a standardized test score” (p. 633).

Comparisons of Assessment Knowledge and Use Across Grade Levels

Few noteworthy differences were found between the elementary and secondary teachers’ difficulty levels for the 30 items used to measure the teachers’ knowledge base about assessment, indicating that the teachers had approximately the same amount and type of knowledge about assessment despite any differences in their preservice teacher education programs. As previously noted, there were also few differences between elementary and secondary teachers across the 30 items used to measure the teachers’ use of assessment information, with the difference between groups exceeding one standard deviation on only 1 item (Item 36). These findings contrast with those of previous studies (e.g., Gullickson, 1985; Stiggins & Bridgeford, 1985) that have indicated a number of differences in assessment usage across teachers at the two levels. Considering that the secondary teacher cohort used in the present study was rather small (n = 21), this cohort may not necessarily serve as an adequate representation of secondary teachers in general. Thus, additional studies with larger samples are warranted. Furthermore, the size of the sample precluded examination of differences in the teachers’ knowledge and use of assessment procedures across various curricular areas.


Summary



The foregoing results indicate that elementary and secondary teachers lack an adequate knowledge base regarding testing and measurement procedures. These findings confirm those of previous studies (Gullickson, 1985; Salmon-Cox, 1981; Scales, 1993; Schafer & Lissitz, 1987; Stiggins & Bridgeford, 1985). However, the teachers did not appear to be totally deficient in all areas of testing and measurement. Moreover, the teachers regularly applied the knowledge they did have when assessing student progress. In particular, the construction and use of teacher-made tests appear to be assessment procedures frequently used by elementary and secondary teachers.

One of the purposes of the present study was to determine whether teacher education programs are doing a better job of preparing teachers for the realities of student assessment than they did 10 to 15 years ago. The results indicate that teachers of today are not remarkably more or less knowledgeable than their counterparts of 10 to 15 years ago as regards the principles of testing and measurement. Hence, it would appear, at least for the teachers in the present sample, that teacher training in this area has not changed dramatically in recent years.

Perhaps the most obvious implication of these findings is the recommendation that inservice teachers receive continuous training in the knowledge and use of assessment procedures. As noted by Stiggins and Bridgeford (1985), “Inservice training, structured to meet teachers’ assessment needs, provides the greatest opportunity for impact. . . . [This] training must focus on real teacher needs and provide guidance in quality control for all teacher-made tests, including those based on observation and subjective judgements” (p. 285). In addition, similar training should be incorporated in preservice teacher education programs to ensure the continued competence of the future teacher work force.

A worthy area of inquiry would be the investigation of cases in which testing and measurement have been successfully incorporated into teacher education programs. Programs of this type could be described in the literature along with documentation of their successes in adequately training teachers for testing and measurement tasks. These programs could serve the field as models of the types of programs that other teacher training institutions might emulate. As previously noted, Schafer and Lissitz (1987) stated in their study a decade ago that they hoped that future studies would find that teachers' knowledge of testing and measurement had increased. Unfortunately, the results of the present study confirm that little progress has been made. The hope for the future is that model programs will be implemented on a wide-scale basis. With these programs in place, perhaps the field could avoid yet another disappointing report of the status of teachers' knowledge of testing and measurement a decade from now.

REFERENCES

Canady, R. L., & Hotchkiss, P. R. (1989). It's a good score! Just a bad grade. Phi Delta Kappan, 71, 68-71.

Carlberg, C. (1981). South Dakota study report. Denver, CO: Midcontinental Regional Educational Laboratory.

Dorr-Bremme, D. W., & Herman, J. L. (1986). Assessing student achievement: A profile of classroom practices (CSE Monograph Series in Evaluation No. 11). Los Angeles: University of California, Center for the Study of Evaluation.

Engel, B. S. (1994). Portfolio assessment and the new paradigm: New instruments and new places. The Educational Forum, 59, 22-27.

Farr, R. (1995, February/March). Assessment quotes. Reading Today, 12(4), 4.

Fleming, M., & Chambers, B. (1984, September). Window on the classroom: A look at teachers' tests. Portland, OR: Northwest Regional Educational Laboratory.

Goslin, D. A. (1967). Teachers and testing. Hartford, CT: Connecticut Printers.

Gullickson, A. R. (1984). Teacher perspectives of their instructional use of tests. The Journal of Educational Research, 77(4), 244-248.

Gullickson, A. R. (1985). Student evaluation techniques and their relationship to grade and curriculum. The Journal of Educational Research, 79(2), 96-100.

Herman, J., & Dorr-Bremme, D. W. (1982, March). Assessing students: Teachers' routine practices and reasoning. Paper presented at the annual meeting of the American Educational Research Association, New York.

Hills, J. R. (1991). Apathy concerning grading and testing. Phi Delta Kappan, 72, 540-545.

Jamentz, K. (1995). Making sure that assessment improves performance. Educational Leadership, 51(6), 55-57.

Kellaghan, T., Madaus, G. F., & Airasian, P. W. (1982). The effects of standardized testing. Boston: Kluwer-Nijhoff.

Kohn, A. (1995). Grading: The issue is not how but why. Educational Leadership, 52(2), 38-41.

Kubiszyn, T., & Borich, G. (1996). Educational testing and measurement: Classroom application and practice (5th ed.). New York: HarperCollins.

Linn, R. L., & Gronlund, N. E. (1995). Measurement and assessment in teaching (7th ed.). Englewood Cliffs, NJ: Merrill.

National Center for Education Statistics. (1997). America's teachers: Profile of a profession, 1993-94 (NCES 97-460). Washington, DC: U.S. Government Printing Office.

Noll, V. H. (1955). Requirements in educational measurements for prospective teachers. School and Society, 82, 88-90.

O'Neil, J. (1994, August). Making assessment meaningful. Update, 36(4), 1, 4-5.

Popham, W. J. (1995). Classroom assessment: What teachers need to know. Boston: Allyn & Bacon.

Roeder, H. H. (1973). Are today's teachers prepared to use tests? Peabody Journal of Education, 49, 239-240.

Rosenfeld, M., Thornton, R. F., & Skurnik, L. S. (1986). Relationships between job functions and the NTE core battery (Research Report No. 86-8). Princeton, NJ: Educational Testing Service.

Rowntree, D. (1987). Assessing students: How shall we know them? New York: Nichols.

Salmon-Cox, L. (1981). Teachers and standardized tests: What's really happening? Phi Delta Kappan, 62(9), 631-634.

Scales, P. (1993). How teachers and education deans rate the quality of teacher preparation for the middle grades. Journal of Teacher Education, 44(5), 378-383.

Schafer, W. D. (1993). Assessment literacy for teachers. Theory Into Practice, 32(2), 119-126.

Schafer, W. D., & Lissitz, R. W. (1987). Measurement training for school personnel: Recommendations and reality. Journal of Teacher Education, 38(3), 57-63.

Schumacher, S., & McMillan, J. H. (1993). Research in education: A conceptual introduction (3rd ed.). New York: HarperCollins.

Seeley, M. M. (1995). The mismatch between assessment and grading. Educational Leadership, 52(2), 4-6.

Shapiro, B. C. (1995). The NBPTS sets standards for accomplished teaching. Educational Leadership, 52(6), 55-57.

Stiggins, R. J. (1991). Relevant classroom assessment training for teachers. Educational Measurement: Issues and Practice, 11(2), 35-39.


Stiggins, R. J., & Bridgeford, N. J. (1985). The ecology of classroom assessment. Journal of Educational Measurement, 22(4), 271-286.

Stiggins, R. J., & Conklin, N. F. (1988). Teacher training in assessment. Portland, OR: Northwest Regional Educational Laboratory.

Wiggins, G. (1995). Toward better report cards. Educational Leadership, 52(2), 28-37.

Winograd, P., Paris, S., & Bridge, C. (1995, February/March). Assessment quotes. Reading Today, 12(4), 4.

Wise, S. L., Lukin, L. E., & Roos, L. L. (1991). Teacher beliefs about training in testing and measurement. Journal of Teacher Education, 42(1), 37-42.

Wolf, K. P. (1995). Assessment quotes. Reading Today, 12(4), 4.
