
The Courts, Validity, and Minimum Competency Testing

Evaluation in Education and Human Services

Editors:

George F. Madaus, Boston College, Chestnut Hill, Mass., U.S.A.

Daniel L. Stufflebeam, Western Michigan University, Kalamazoo, Mich., U.S.A.

Previously published books in the series:

Kellaghan, T., Madaus, G.F., and Airasian, P.W.; The Effects of Standardized Testing


George F. Madaus Editor

Springer Science+Business Media, LLC

Library of Congress Cataloging in Publication Data Main entry under title:

The Courts, validity, and minimum competency testing.

(Evaluation in education and human services) Includes index. 1. Competency based educational tests - Law and legislation - United States - Congresses. I. Madaus, George F. II. Series KF4156.A75C68 1982 344.73'077 82-4666

347.30477

ISBN 978-94-017-5366-1 ISBN 978-94-017-5364-7 (eBook) DOI 10.1007/978-94-017-5364-7

Copyright © 1983 by Springer Science+Business Media New York. Originally published by Kluwer-Nijhoff Publishing in 1983. Softcover reprint of the hardcover 1st edition 1983

No part of this book may be reproduced in any form by print, photoprint, microfilm, or any other means, without written permission from the publisher.

Contents

List of Figures and Tables vii

Preface ix

1 Debra P. v. Turlington: Judicial Standards for Assessing the Validity of Minimum Competency Tests Diana Pullin 3

2 Minimum Competency Testing for Certification: The Evolution and Evaluation of Test Validity George F. Madaus 21

3 Validity and Competency Tests: The Debra P. Case, Conceptions of Validity, and Strategies for the Future Walt Haney 63

4 Establishing Instructional Validity for Minimum Competency Programs Robert Calfee, with the assistance of Edmund Lau and Lynne Sutter 95

5 Curricular Validity: Convincing the Court That It Was Taught without Precluding the Possibility of Measuring It Robert L. Linn 115

6 Validity as a Variable: Can The Same Certification Test Be Valid for All Students? William H. Schmidt, Andrew C. Porter, John R. Schwille, Robert E. Floden, and Donald J. Freeman 133

7 Overlap: Testing Whether It Is Taught Gaea Leinhardt 151

8 What Constitutes Curricular Validity in a High-School-Leaving Examination Decker F. Walker 169

9 Curricular Validity: The Case for Structure and Process Richard L. Venezky 181

10 Minimum Competency Testing of Reading: An Analysis of Eight Tests Designed for Grade 11 Jeanne S. Chall, Ann Freeman, and Benjamin Levy 195

11 The Search for Content Validity through World Knowledge Roger W. Shuy 207

Appendix A: Debra P. v. Turlington, United States District Court, M.D. Florida, Tampa Division, July 12, 1979. As Amended August 7 and 8, 1979. 229

Appendix B: Debra P. v. Turlington, United States Court of Appeals, Fifth Circuit, Unit B, May 4, 1981. 257

Appendix C: Debra P. v. Turlington, United States Court of Appeals, Fifth Circuit, September 4, 1981. On Petition for Rehearing and Petition for Rehearing En Banc. 269

Index 283

List of Contributors and Conference Participants 290

List of Figures and Tables

List of Figures

Figure 3-1. Terms with Which to Address the Question of Whether the Test Covers Material Actually Taught 74

Figure 5-1. Means (•) and 95% Confidence Limits on Two Reading Comprehension Test Passages for a Rural (N = 101) and an Urban Sample (N = 106) (based on Johnston 1981) 124

Figure 6-1. Venn Diagram Showing Interrelationship Between Types of Content Validity 135

Figure 6-2. Venn Diagram Showing Relationship Between Objectives and Instructional Content 136

Figure 6-3. Content Analysis of Stanford Achievement Test (Intermediate Level/Grades 4.5-5.6), 1973 141

Figure 6-4. Distribution of Items in the Addison-Wesley Fourth-Grade Text 142

Figure 7-1. Overlap between Tests, Texts, and Instruction 157

Figure 11-1. Intersection of Sender, Message, and Receiver to Available Real Knowledge to the Test Question 208

Figure 11-2. Intersection of Test Question Communication Context with Receiver's World Knowledge 210

Figure 11-3. Intersection of Text Question Communicative Context with Both Sender and Receiver's World Knowledge 211

Figure 11-4. Representation of Straight Task Learning 216

Figure 11-5. Representation of Functional Learning 217

Figure 11-6. Communicative Skills and Computational Skills Set in a Framework of Generalization 218


List of Tables

Table 3-1. Correlations (Pearson r) and Partial Correlations between Functional Literacy and GPAs, Eleventh Grade 88

Table 6-1. Distribution of Specific Topics across Concepts, Skills, and Applications 143

Table 6-2. Percent of Tested Topics Covered in Each Textbook 144

Table 7-1. Means, Standard Deviations, and Correlations Using Overlap Estimates from IDS, Grades 1 and 3 160

Table 7-2. Results of Regression of Posttest on Pretest and Overlap (IDS)* 161

Table 7-3. Correlations and Regressions Using Overlap Estimates (LD) 162

Table 10-1. State High School Level Reading Minimum Competency Tests 199

Table 10-2. Characteristics of State Test Passages 200

Table 10-3. Publishers' High School Level Reading Minimum Competency Tests 202

Table 10-4. Characteristics of Passages on Publishers' Tests 203

PREFACE

On May 4, 1981, the United States Court of Appeals, Fifth Circuit, handed down a landmark decision in the Debra P. v. Turlington case. Plaintiffs had challenged Florida's policy requiring that students, to receive their high school diploma, pass the State Student Assessment Test, Part II (SSAT-II) - a test purporting to measure functional literacy. The Fifth Circuit ruled that the state may not deprive its high school seniors of the economic and educational benefits of a high school diploma until it demonstrated that the SSAT-II (the functional literacy test [FLT]) was a fair test of what is taught in its classrooms. The Fifth Circuit thereupon sent the case back to the district court for a new hearing at which the state must show that the material tested was actually taught.

The ruling that a test used as a graduation requirement must measure what is actually taught in schools is one that creates immediate precedent for other litigation involving minimum competency testing (MCT) for certification and has implications in fourteen other states that presently have such programs. The ruling should affect the deliberations of those policy makers currently considering new assessment programs designed to certify or classify pupils and those charged with implementing extant programs in elementary and secondary schools. It should have an impact on the process of selecting or contracting for the development of certification tests, on the steps that a test developer must follow in order to produce a valid certification test, and on the implementation of the total program. In states or districts that mandate a certification test, the decision should affect the work of those in charge of curriculum development, implementation, and supervision and will, perforce, affect the teacher who is ultimately responsible for what is taught - and how it is taught - in the classroom.

The Fifth Circuit's finding that in the field of competency testing, an important component of content validity is curricular validity - defined as things that are currently taught - reintroduced the old concept of curricular validity to the testing community and by inference endorsed recent arguments for the need to demonstrate the instructional validity of MCTs. The court's interpretation of the meaning of content validity raises a number of important and interesting questions. What are the implications of using the adjectives curricular and instructional to modify the noun validity? How do these concepts relate to each other and to the prevailing concept of content validity? What are the implications of the court's endorsement of two additional facets of validity at a time when the profession is abandoning a misunderstood tripartite conceptualization in favor of a unified concept of validity? How does a state go about demonstrating that its certification test measures what was taught and that the inferences made about pupils are valid? What kinds of data should advocacy groups look for in evaluating the fundamental fairness of a test used for certification in elementary or secondary schools?

In an attempt to answer these questions, the Ford Foundation sponsored a conference on the Courts and the Content Validity of Minimum Competency Tests, held at Boston College in Chestnut Hill, Mass., on October 13 and 14, 1981. Ten papers dealing with the issue of how one might investigate the match between the content of an MCT and a school district's curriculum and instruction were commissioned for the conference. The papers were then discussed by the thirty-six participants. After the conference, the authors revised their papers, which now constitute ten of the eleven chapters in this book. Chapter 2, written after the conference, traces the evolution of the concept of content validity. Work on this chapter was supported by the Ford Grant and by the Carnegie Corporation. The Ford Foundation also helped to underwrite the costs of editing this book; the Carnegie Corporation supported production expenses to help reduce the cost.

This book focuses entirely on what a state department of education or a local educational authority might do to evaluate the extent to which skills and knowledge measured by an MCT were in fact covered by curriculum materials and by instruction. Several chapters provide necessary background. Chapter 1 by Diana Pullin gives the reader a legal analysis of the Debra P. case. (The full decisions of the district court and the Fifth Circuit Court are reproduced in Appendixes A and B and the petition for Rehearing En Banc in Appendix C.) Several chapter authors - George F. Madaus (2), Walt Haney (3), Robert Calfee (4), Robert L. Linn (5), and William H. Schmidt, Andrew C. Porter, John R. Schwille, Robert E. Floden, and Donald J. Freeman (6) - discuss general issues of validity.

Techniques to analyze the match between textbooks and test material are described by Schmidt and his colleagues (6) and Gaea Leinhardt (7). These chapters also describe techniques that have been used to analyze the match between instruction and material covered on tests. Calfee (4) and Linn (5) describe techniques that can be used to explore differences between groups on item-response patterns. Jeanne S. Chall (10) describes techniques to analyze the reading-difficulty levels associated both with textbooks and with MCTs. Roger W. Shuy (11) offers a linguistic analysis of issues related to test validity. Decker F. Walker (8) and Richard L. Venezky (9) discuss aspects of the curriculum and school environment that must be present if pupils can be said to have had an adequate opportunity to learn.

There was no attempt to arrive at a consensus on what steps should be followed in evaluating the curriculum or instructional validity of MCTs used for certification. The hope is that after reading this book, readers will be able to draw up their own agendas for evaluating the extent to which pupils have had a fair opportunity to acquire the skills measured by such tests.

Acknowledgments

I am indebted to a number of people for their assistance in producing this book. First I wish to express my appreciation to the officers of the Ford Foundation for their financial support. Marjorie Martus, while working as a program officer at the Ford Foundation, was particularly helpful in planning the conference. Edward Meade of the Ford Foundation supported efforts to organize and edit the manuscript. I also wish to thank the Carnegie Corporation of New York for its financial support. Fritz Mosher of the Carnegie Corporation was particularly helpful throughout the entire process.

Boston College also offered a great deal of support. For assistance at various stages of the writing process, I am indebted to Dean Mary Griffin of the School of Education and to Joseph Pedulla of the Center for the Study of Testing, Evaluation, and Educational Policy.

Diana Pullin was extremely helpful in planning the conference. My thanks go to Robert Linn, Jason Millman, and Daniel Stufflebeam for chairing the group discussions at the conference. Peter Airasian and Paul Weckstein offered summaries of all papers from a measurement and legal view, respectively, which helped to focus the group discussions. Robert Calfee offered excellent suggestions on what might be included in the third section of Chapter 2. Joseph Foley of Boston State College was of enormous help to me in reacting to and editing my chapter. My thanks to Philip Jones of Kluwer-Nijhoff Publishing for speeding up the steps in production in order to publish this book as soon as possible. To my wife Anne, my thanks for all of her patient help.

For their assistance at the conference, I am indebted to Paul Lucas and Rita Comtois. To Carolyn Pike and Catherine Bransfield I extend my thanks for the many patient hours they gave during the process.

Last, but by no means least, special thanks go to each of the conference participants for their time, consideration, and substantial contributions and to the chapter authors for an excellent job and for their patience.