Evaluation in Education and Human Services
Editors:
George F. Madaus, Boston College, Chestnut Hill, Mass., U.S.A.
Daniel L. Stufflebeam, Western Michigan University, Kalamazoo, Mich., U.S.A.
Previously published books in the series:
Kellaghan, T., Madaus, G.F., and Airasian, P.W.; The Effects of Standardized Testing
The Courts, Validity, and Minimum Competency Testing
George F. Madaus Editor
Springer Science+Business Media, LLC
Library of Congress Cataloging in Publication Data Main entry under title:
The Courts, validity, and minimum competency testing.
(Evaluation in education and human services) Includes index. 1. Competency based educational tests - Law and legislation - United States - Congresses. I. Madaus, George F. II. Series. KF4156.A75C68 1982 344.73'077 82-4666
347.30477
ISBN 978-94-017-5366-1 ISBN 978-94-017-5364-7 (eBook) DOI 10.1007/978-94-017-5364-7
Copyright © 1983 by Springer Science+Business Media New York. Originally published by Kluwer · Nijhoff Publishing in 1983. Softcover reprint of the hardcover 1st edition 1983.
No part of this book may be reproduced in any form by print, photoprint, microfilm, or any other means, without written permission from the publisher.
Contents
List of Figures and Tables vii
Preface ix
1 Debra P. v. Turlington: Judicial Standards for Assessing the Validity of Minimum Competency Tests Diana Pullin 3
2 Minimum Competency Testing for Certification: The Evolution and Evaluation of Test Validity George F. Madaus 21
3 Validity and Competency Tests: The Debra P. Case, Conceptions of Validity, and Strategies for the Future Walt Haney 63
4 Establishing Instructional Validity for Minimum Competency Programs Robert Calfee, with the assistance of Edmund Lau and Lynne Sutter 95
5 Curricular Validity: Convincing the Court That It Was Taught without Precluding the Possibility of Measuring It Robert L. Linn 115
6 Validity as a Variable: Can The Same Certification Test Be Valid for All Students? William H. Schmidt, Andrew C. Porter, John R. Schwille, Robert E. Floden, and Donald J. Freeman 133
7 Overlap: Testing Whether It Is Taught Gaea Leinhardt 151
8 What Constitutes Curricular Validity in a High-School-Leaving Examination Decker F. Walker 169
9 Curricular Validity: The Case for Structure and Process Richard L. Venezky 181
10 Minimum Competency Testing of Reading: An Analysis of Eight Tests Designed for Grade 11 Jeanne S. Chall, Ann Freeman, and Benjamin Levy 195
11 The Search for Content Validity through World Knowledge Roger W. Shuy 207
Appendix A: Debra P. v. Turlington, United States District Court, M.D. Florida, Tampa Division, July 12, 1979. As Amended August 7 and 8, 1979. 229
Appendix B: Debra P. v. Turlington, United States Court of Appeals, Fifth Circuit, Unit B, May 4, 1981. 257
Appendix C: Debra P. v. Turlington, United States Court of Appeals, Fifth Circuit, September 4, 1981. On Petition for Rehearing and Petition for Rehearing En Banc. 269
Index 283
List of Contributors and Conference Participants 290
List of Figures and Tables
List of Figures
Figure 3-1. Terms with Which to Address the Question of Whether the Test Covers Material Actually Taught 74
Figure 5-1. Means (•) and 95% Confidence Limits on Two Reading Comprehension Test Passages for a Rural (N = 101) and an Urban Sample (N = 106) (based on Johnston 1981) 124
Figure 6-1. Venn Diagram Showing Interrelationship Between Types of Content Validity 135
Figure 6-2. Venn Diagram Showing Relationship Between Objectives and Instructional Content 136
Figure 6-3. Content Analysis of Stanford Achievement Test (Intermediate Level/Grades 4.5-5.6), 1973 141
Figure 6-4. Distribution of Items in the Addison-Wesley Fourth-Grade Text 142
Figure 7-1. Overlap between Tests, Texts, and Instruction 157
Figure 11-1. Intersection of Sender, Message, and Receiver to Available Real Knowledge to the Test Question 208
Figure 11-2. Intersection of Test Question Communication Context with Receiver's World Knowledge 210
Figure 11-3. Intersection of Text Question Communicative Context with Both Sender and Receiver's World Knowledge 211
Figure 11-4. Representation of Straight Task Learning 216
Figure 11-5. Representation of Functional Learning 217
Figure 11-6. Communicative Skills and Computational Skills Set in a Framework of Generalization 218
List of Tables
Table 3-1. Correlations (Pearson r) and Partial Correlations between Functional Literacy and GPAs, Eleventh Grade 88
Table 6-1. Distribution of Specific Topics across Concepts, Skills, and Applications 143
Table 6-2. Percent of Tested Topics Covered in Each Textbook 144
Table 7-1. Means, Standard Deviations, and Correlations Using Overlap Estimates from IDS, Grades 1 and 3 160
Table 7-2. Results of Regression of Posttest on Pretest and Overlap (IDS)* 161
Table 7-3. Correlations and Regressions Using Overlap Estimates (LD) 162
Table 10-1. State High School Level Reading Minimum Competency Tests 199
Table 10-2. Characteristics of State Test Passages 200
Table 10-3. Publishers' High School Level Reading Minimum Competency Tests 202
Table 10-4. Characteristics of Passages on Publishers' Tests 203
PREFACE
On May 4, 1981, the United States Court of Appeals, Fifth Circuit, handed down a landmark decision in the Debra P. v. Turlington case. Plaintiffs had challenged Florida's policy requiring that students, to receive their high school diploma, pass the State Student Assessment Test, Part II (SSAT-II), a test purporting to measure functional literacy. The Fifth Circuit ruled that the state may not deprive its high school seniors of the economic and educational benefits of a high school diploma until it demonstrated that the SSAT-II (the functional literacy test, [FLT]) was a fair test of what is taught in its classrooms. The Fifth Circuit thereupon sent the case back to the district court for a new hearing at which the state must show that the material tested was actually taught.
The ruling that a test used as a graduation requirement must measure what is actually taught in schools is one that creates immediate precedent for other litigation involving minimum competency testing (MCT) for certification and has implications in fourteen other states that presently have such programs. The ruling should affect the deliberations of those policy makers currently considering new assessment programs designed to certify or classify pupils and those charged with implementing extant programs in elementary and secondary schools. It should have an impact on the process of selecting or contracting for the development of certification tests, on the steps that a test developer must follow in order to produce a valid certification test, and on the implementation of the total program. In states or districts that mandate a certification test, the decision should affect the work of those in charge of curriculum development, implementation, and supervision and will, perforce, affect the teacher who is ultimately responsible for what is taught, and how it is taught, in the classroom.
The Fifth Circuit's finding that, in the field of competency testing, an important component of content validity is curricular validity (defined as things that are currently taught) reintroduced the old concept of curricular validity to the testing
community and by inference endorsed recent arguments for the need to demonstrate the instructional validity of MCTs. The court's interpretation of the meaning of content validity raises a number of important and interesting questions. What are the implications of using the adjectives curricular and instructional to modify the noun validity? How do these concepts relate to each other and to the prevailing concept of content validity? What are the implications of the court's endorsement of two additional facets of validity at a time when the profession is abandoning a misunderstood tripartite conceptualization in favor of a unified concept of validity? How does a state go about demonstrating that its certification test measures what was taught and that the inferences made about pupils are valid? What kinds of data should advocacy groups look for in evaluating the fundamental fairness of a test used for certification in elementary or secondary schools?
In an attempt to answer these questions, the Ford Foundation sponsored a conference on the Courts and the Content Validity of Minimum Competency Tests, held at Boston College in Chestnut Hill, Mass., on October 13 and 14, 1981. Ten papers dealing with the issue of how one might investigate the match between the content of an MCT and a school district's curriculum and instruction were commissioned for the conference. The papers were then discussed by the thirty-six participants. After the conference, the authors revised their papers, which now constitute ten of the eleven chapters in this book. Chapter 2, written after the conference, traces the evolution of the concept of content validity. Work on this chapter was supported by the Ford Grant and by the Carnegie Corporation. The Ford Foundation also helped to underwrite the costs of editing this book; the Carnegie Corporation supported production expenses to help reduce the cost.
This book focuses entirely on what a state department of education or a local educational authority might do to evaluate the extent to which skills and knowledge measured by an MCT were in fact covered by curriculum materials and by instruction. Several chapters provide necessary background. Chapter 1 by Diana Pullin gives the reader a legal analysis of the Debra P. case. (The full decisions of the district court and the Fifth Circuit Court are reproduced in Appendixes A and B, and the petition for Rehearing En Banc in Appendix C.) Several chapter authors discuss general issues of validity: George F. Madaus (2), Walt Haney (3), Robert Calfee (4), Robert L. Linn (5), and William H. Schmidt, Andrew C. Porter, John R. Schwille, Robert E. Floden, and Donald J. Freeman (6).
Techniques to analyze the match between textbooks and test material are described by Schmidt and his colleagues (6) and Gaea Leinhardt (7). These chapters also describe techniques that have been used to analyze the match between instruction and material covered on tests. Calfee (4) and Linn (5) describe techniques that can be used to explore differences between groups on item-response patterns. Jeanne S. Chall (10) describes techniques to analyze the reading-difficulty levels associated both with textbooks and with MCTs. Roger W.
Shuy (11) offers a linguistic analysis of issues related to test validity. Decker F. Walker (8) and Richard L. Venezky (9) discuss aspects of the curriculum and school environment that must be present if pupils can be said to have had an adequate opportunity to learn.
There was no attempt to arrive at a consensus on what steps should be followed in evaluating the curriculum or instructional validity of MCTs used for certification. The hope is that after reading this book, readers will be able to draw up their own agendas for evaluating the extent to which pupils have had a fair opportunity to acquire the skills measured by such tests.
Acknowledgments
I am indebted to a number of people for their assistance in producing this book. First I wish to express my appreciation to the officers of the Ford Foundation for their financial support. Marjorie Martus, while working as a program officer at the Ford Foundation, was particularly helpful in planning the conference. Edward Meade of the Ford Foundation supported efforts to organize and edit the manuscript. I also wish to thank the Carnegie Corporation of New York for its financial support. Fritz Mosher of the Carnegie Corporation was particularly helpful throughout the entire process.
Boston College also offered a great deal of support. For assistance at various stages of the writing process, I am indebted to Dean Mary Griffin of the School of Education and to Joseph Pedulla of the Center for the Study of Testing, Evaluation, and Educational Policy.
Diana Pullin was extremely helpful in planning the conference. My thanks go to Robert Linn, Jason Millman, and Daniel Stufflebeam for chairing the group discussions at the conference. Peter Airasian and Paul Weckstein offered summaries of all papers from a measurement and legal view, respectively, which helped to focus the group discussions. Robert Calfee offered excellent suggestions on what might be included in the third section of Chapter 2. Joseph Foley of Boston State College was of enormous help to me in reacting to and editing my chapter. My thanks to Philip Jones of Kluwer-Nijhoff Publishing for speeding up the steps in production in order to publish this book as soon as possible. To my wife Anne, my thanks for all of her patient help.
For their assistance at the conference, I am indebted to Paul Lucas and Rita Comtois. To Carolyn Pike and Catherine Bransfield I extend my thanks for the many patient hours they gave during the process.
Last, but by no means least, special thanks go to each of the conference participants for their time, consideration, and substantial contributions and to the chapter authors for an excellent job and for their patience.