UNDERSTANDING EDUCATIONAL STATISTICS USING MICROSOFT EXCEL® AND SPSS®

MARTIN LEE ABBOTT
Department of Sociology
Seattle Pacific University
Seattle, Washington

Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

ISBN: 978-0-470-88945-9

Printed in Singapore

10 9 8 7 6 5 4 3 2 1

To those who seek a deeper understanding of the world as it appears and of what lies beyond.

CONTENTS

Preface

Acknowledgments

1 Introduction

Approach of the Book
Project Labs
Real-World Data
Research Design
“Practical Significance”—Implications of Findings
Coverage of Statistical Procedures

2 Getting Acquainted with Microsoft Excel®

Data Management
Rows and Columns
Data Sheets
The Excel® Menus
Home
Insert Tab
Page Layout
Formulas
Data
Review and View Menus

3 Using Statistics in Excel®

Using Statistical Functions
Entering Formulas Directly
Data Analysis Procedures
Missing Values and “0” Values in Excel® Analyses
Using Excel® with Real Data
School-Level Achievement Database
TAGLIT Data
The STAR Classroom Observation Protocol™ Data

4 SPSS® Basics

Using SPSS®
General Features
Management Functions
Reading and Importing Data
Sort
Additional Management Functions
Split File
Transform/Compute (Creating Indices)
Merge
Analysis Functions

5 Descriptive Statistics—Central Tendency

Research Applications—Spuriousness
Descriptive and Inferential Statistics
The Nature of Data—Scales of Measurement
Nominal Data
Ordinal Data
Interval Data
Ratio Data
Choosing the Correct Statistical Procedure for the Nature of Research Data
Descriptive Statistics—Central Tendency
Mean
Median
Mode
Using Excel® and SPSS® to Understand Central Tendency
Excel®
SPSS®
Distributions
Describing the Normal Distribution
Central Tendency
Skewness
Kurtosis
Descriptive Statistics—Using Graphical Methods
Frequency Distributions
Histograms
Terms and Concepts
Real-World Lab I: Central Tendency
Real-World Lab I: Solutions
Results
Results

6 Descriptive Statistics—Variability

Range
Percentile
Scores Based on Percentiles
Using Excel® and SPSS® to Identify Percentiles
Note
Standard Deviation and Variance
Calculating the Variance and Standard Deviation
The Deviation Method
The Average Deviation
The Computation Method
The Sum of Squares
Sample SD and Population SD
Obtaining SD from Excel® and SPSS®
Terms and Concepts
Real-World Lab II: Variability
Real-World Lab II: Solutions
Results

7 The Normal Distribution

The Nature of the Normal Curve
The Standard Normal Score: z Score
The z-Score Table of Values
Navigating the z-Score Distribution
Calculating Percentiles
Creating Rules for Locating z Scores
Calculating z Scores
Working with Raw Score Distributions
Using Excel® to Create z Scores and Cumulative Proportions
STANDARDIZE Function
NORMSDIST Function
NORMDIST Function
Using SPSS® to Create z Scores
Terms and Concepts
Real-World Lab III: The Normal Curve and z Scores
Real-World Lab III: Solutions

8 The Z Distribution and Probability

Transforming a z Score to a Raw Score
Transforming Cumulative Proportions to z Scores
Deriving Sample Scores from Cumulative Percentages
Additional Transformations Using the Standard Normal Distribution
Normal Curve Equivalent
Stanine
T Score
Grade Equivalent Score
Using Excel® and SPSS® to Transform Scores
Probability
Determinism Versus Probability
Elements of Probability
Probability and the Normal Curve
Relationship of z Score and Probability
“Inside” and “Outside” Areas of the Standard Normal Distribution
Outside Area Example
“Exact” Probability
From Sample Values to Sample Distributions
Terms and Concepts
Real-World Lab IV
Real-World Lab IV: Solutions

9 The Nature of Research Design and Inferential Statistics

Research Design
Theory
Hypothesis
Types of Research Designs
Experiment
Post Facto Research Designs
The Nature of Research Design
Research Design Varieties
Sampling
Inferential Statistics
One Sample from Many Possible Samples
Central Limit Theorem and Sampling Distributions
The Sampling Distribution and Research
Populations and Samples
The Standard Error of the Mean
“Transforming” the Sample Mean to the Sampling Distribution
Example
Z Test
The Hypothesis Test
Statistical Significance
Practical Significance: Effect Size
Z-Test Elements
Terms and Concepts
Real-World Lab V
Real-World Lab V: Solutions

10 The T Test for Single Samples

Z Versus T: Making Accommodations
Research Design
Experiment
Post Facto Comparative Design
Parameter Estimation
Estimating the Population Standard Deviation
A New Symbol: sx
Biased Versus Unbiased Estimates
A Research Example
T Test for a Single Mean
Example Calculations
Degrees of Freedom
The T Distribution
The Hypothesis Test
Type I and Type II Errors
Type I (Alpha) Errors (α)
Type II (Beta) Errors (β)
Effect Size
Another Measurement of the (Cohen’s d) Effect Size
Power, Effect Size, and Beta
One- and Two-Tailed Tests
Two-Tailed Tests
One-Tailed Tests
Choosing a One- or Two-Tailed Test
A Note About Power
Point and Interval Estimates
Calculating the Interval Estimate of the Population Mean
The Value of Confidence Intervals
Using Excel® and SPSS® with the Single-Sample T Test
SPSS® and the Single-Sample T Test
Excel® and the Single Sample T Test
Terms and Concepts
Real-World Lab VI: Single-Sample T Test
Real-World Lab VI: Solutions

11 Independent-Samples T Test

A Lot of “T’s”
Research Design
Experimental Designs
Pretest or No Pretest
Post Facto Designs
Independent T Test: The Procedure
Creating the Sampling Distribution of Differences
The Nature of the Sampling Distribution of Differences
Calculating the Estimated Standard Error of Difference
Using Unequal Sample Sizes
The Independent T Ratio
Independent T-Test Example
The Null Hypothesis
The Alternative Hypothesis
The Critical Value of Comparison
The Calculated T Ratio
Statistical Decision
Interpretation
Before–After Convention with the Independent T Test
Confidence Intervals for the Independent T Test
Effect Size
Equal and Unequal Sample Sizes
The Assumptions for the Independent-Samples T Test
The Excel® “F-Test Two Sample for Variances” Test
The SPSS® “Explore” Procedure for Testing the Equality of Variances
The Homogeneity of Variances Assumption for the Independent T Test
A Rule of Thumb
Using Excel® and SPSS® with the Independent-Samples T Test
Using Excel® with the Independent T Test
Using SPSS® with the Independent T Test
Parting Comments
Nonparametric Statistics
Terms and Concepts
Real-World Lab VII: Independent T Test
Procedures
Real-World Lab VII: Solutions

12 Analysis of Variance

A Hypothetical Example of ANOVA
The Nature of ANOVA
The Components of Variance
The Process of ANOVA
Calculating ANOVA
Calculating the Variance: Using the Sum of Squares (SS)
Using Mean Squares (MS)
Degrees of Freedom in ANOVA
Calculating Mean Squares (MS)
The F Ratio
The F Distribution
Effect Size
Post Hoc Analyses
“Varieties” of Post Hoc Analyses
The Post Hoc Analysis Process
Tukey’s HSD (Range) Test Calculation
Means Comparison Table
Compare Mean Difference Values from HSD
Post Hoc Summary
Assumptions of ANOVA
Additional Considerations with ANOVA
A Real-World Example of ANOVA
Are the Assumptions Met?
Hand Calculations
Calculating SST
Calculating SSB
Calculating SSW
The Hypothesis Test
Effect Size
Post Hoc Analysis
Using Excel® and SPSS® with One-Way ANOVA
Excel® Procedures with One-Way ANOVA
SPSS® Procedures with One-Way ANOVA
The Need for Diagnostics
Nonparametric ANOVA Tests
Terms and Concepts
Real-World Lab VIII: ANOVA
Real-World Lab VIII: Solutions

13 Factorial ANOVA

Extensions of ANOVA
Within-Subjects ANOVA
Two-Way Within-Subjects ANOVA
ANCOVA
Multivariate ANOVA Procedures
MANOVA
MANCOVA
Factorial ANOVA
Interaction Effects
An Example of 2×ANOVA
Charting Interactions
Simple Effects
The Example DataSet
Calculating Factorial ANOVA
Calculating the Interaction
The 2×ANOVA Summary Table
Creating the MS Values
The Hypotheses Tests
The Omnibus F Ratio
Effect Size for 2×ANOVA: Partial η²
Discussing the Results
Using SPSS® to Analyze 2×ANOVA
The “Plots” Specification
Omnibus Results
Simple Effects Analyses
Summary Chart for 2×ANOVA Procedures
Terms and Concepts
Real-World Lab IX: 2×ANOVA
Real-World Lab IX: 2×ANOVA Solutions

14 Correlation

The Nature of Correlation
Explore and Predict
Different Measurement Values
Different Data Levels
Correlation Measures
The Correlation Design
Pearson’s Correlation Coefficient
Interpreting the Pearson’s Correlation
The Fictitious Data
Assumptions for Correlation
Plotting the Correlation: The Scattergram
Patterns of Correlations
Strength of Correlations in Scattergrams
Creating the Scattergram
Using Excel® to Create Scattergrams
Using SPSS® to Create Scattergrams
Calculating Pearson’s r
The Z-Score Method
The Computation Method
Evaluating Pearson’s r
The Hypothesis Test for Pearson’s r
The Comparison Table of Values
Effect Size: The Coefficient of Determination
Correlation Problems
Correlations and Sample Size
Correlation is Not Causation
Restricted Range
Extreme Scores
Heteroscedasticity
Curvilinear Relations
The Example Database
Assumptions for Correlation
Computation of Pearson’s r for the Example Data
Evaluating Pearson’s r: Hypothesis Test
Evaluating Pearson’s r: Effect Size
Correlation Using Excel® and SPSS®
Correlation Using Excel®
Correlation Using SPSS®
Nonparametric Statistics: Spearman’s Rank-Order Correlation (rs)
Variations of Spearman’s Rho Formula: Tied Ranks
A Spearman’s Rho Example
Terms and Concepts
Real-World Lab X: Correlation
Real-World Lab X: Solutions

15 Bivariate Regression

The Nature of Regression
The Regression Line
Calculating Regression
The Slope Value b
The Regression Equation in “Pieces”
A Fictitious Example
Interpreting and Using the Regression Equation
Effect Size of Regression
The Z-Score Formula for Regression
Using the Z-Score Formula for Regression
Unstandardized and Standardized Regression Coefficients
Testing the Regression Hypotheses
The Standard Error of Estimate
Calculating sest
Confidence Interval
Explaining Variance through Regression
Using Scattergrams to Understand the Partitioning of Variance
A Numerical Example of Partitioning the Variation
Using Excel® and SPSS® with Bivariate Regression
The Excel® Regression Output
The SPSS® Regression Output
Assumptions of Bivariate Linear Regression
Curvilinear Relationships
Detecting Problems in Bivariate Linear Regression
A Real-World Example of Bivariate Linear Regression
Normal Distribution and Equal Variances Assumptions
The Omnibus Test Results
Effect Size
The Model Summary
The Regression Equation and Individual Predictor Test of Significance
The Scattergram
Advanced Regression Procedures
Multiple Correlation
Partial Correlation
Multiple Regression
Additional Considerations
Terms and Concepts
Real-World Lab XI: Bivariate Linear Regression
Real-World Lab XI: Solutions

16 Introduction to Multiple Linear Regression

The Elements of MLR
Same Process as Bivariate Regression
Similar Assumptions
Statistical Significance
Effect Size
Coefficients
Scatterdiagrams
Some Differences Between Bivariate Regression and MLR
Multiple Coefficients
Multicollinearity
Explanation of R²
Entry Schemes
Stuff Not Covered
Using MLR with Categorical Data
Curvilinear Regression
Multilevel Analysis
MLR Extended Example
Are the Assumptions Met?
The Findings
The SPSS® Findings
The Unstandardized Coefficients
The Standardized Coefficients
Collinearity Statistics
The Squared Part Correlation
Conclusion
Terms and Concepts
Real-World Lab XII: Multiple Linear Regression
Real-World Lab XII: MLR Solutions

17 Chi Square and Contingency Table Analysis

Contingency Tables
The Chi Square Procedure and Research Design
Post Facto Designs
Experimental Designs
Chi Square Designs
Goodness of Fit
Expected Frequencies—Equal Probability
Expected Frequencies—A Priori Assumptions
The Chi Square Test of Independence
A Fictitious Example—Goodness of Fit
Frequencies Versus Proportions
Effect Size—Goodness of Fit
Chi Square Test of Independence
Two-Way Chi Square
Assumptions
A Fictitious Example—Test of Independence
Creating Expected Frequencies
Degrees of Freedom for the Test of Independence
Special 2 × 2 Chi Square
The Alternate 2 × 2 Formula
Effect Size in 2 × 2 Tables: Phi
Correction for 2 × 2 Tables
Cramer’s V: Effect Size for the Chi Square Test of Independence
Repeated Measures Chi Square
Repeated Measures Chi Square Table
Using Excel® and SPSS® with Chi Square
Using Excel® for Chi Square Analyses
Sort the Database
The Excel® Count Function
The Excel® CHITEST Function
The Excel® CHIDIST Function
Using SPSS® for the Chi Square Test of Independence
The Crosstabs Procedure
Analyzing the Contingency Table Data Directly
Interpreting the Contingency Table
Terms and Concepts
Real-World Lab XIII: Chi Square
Real-World Lab XIII: Solutions
Hand Calculations
Using Excel® for Chi Square Analyses
Using SPSS® for Chi Square Solutions

18 Repeated Measures Procedures: Tdep and ANOVAws

Independent and Dependent Samples in Research Designs
Using Different T Tests
The Dependent T-Test Calculation: The Long Formula
Example
Results
Effect Size
The Dependent T-Test Calculation: The Difference Formula
The Tdep Ratio from the Difference Method
Tdep and Power
Using Excel® and SPSS® to Conduct the Tdep Analysis
Tdep with Excel®
Tdep with SPSS®
Within-Subjects ANOVA (ANOVAws)
Experimental Designs
Post Facto Designs
Within-Subjects Example
Using SPSS® for Within-Subjects Data
Sphericity
The SPSS® Procedure
The SPSS® Output
The Omnibus Test
Effect Size
Post Hoc Analyses
The Interpretation
Nonparametric Statistics
Terms and Concepts

References

Appendix: Statistical Tables

Index

PREFACE

I have written this book many times in my head over the years! As I conducted research and taught statistics (graduate and undergraduate) in many fields, I developed an approach to helping students understand the difficult concepts in a new way. I find that the great majority of students are visual learners, so I developed diagrams and figures over the years that help create a conceptual picture of the statistical procedures that are often problematic to students (like sampling distributions!).

The other reason I wanted to write this book was to give students a way to understand statistical computing without having to rely on comprehensive and expensive statistical software programs. Because most students have access to Microsoft Excel®,¹ I developed a step-by-step approach to using the powerful statistical procedures in Excel® to analyze data and conduct research in each of the statistical topics I cover in the book.

I also wanted to make those comprehensive statistical programs more approachable to statistics students, so I have also included a hands-on guide to SPSS® in parallel with the Excel® examples. In some cases, SPSS® has the only means to perform some statistical procedures; but in most cases, both Excel® and SPSS® can be used.

Last, like my other work dealing with applied statistical topics (Abbott, 2010), I included real-world data in this book as examples for the procedures I discuss. I introduce extended examples in each chapter that use these real-world datasets, and I conclude the chapters with a Real-World Lab in which I present data for students to use with Excel® and SPSS®. Each Lab is followed by the Real World Lab: Solutions section so that students can examine their work in greater depth.

¹ Excel® references and screen shots in this book are used with permission from Microsoft.

One limitation to teaching statistics through Excel® is that the data analysis features are different, depending on whether the user is a Mac user or a PC user. I am using the PC version, which features a Data Analysis suite of statistical tools. This feature may no longer be included in the Mac version of Excel® you are using.

I am posting the datasets for the real-world labs at the Wiley Publisher ftp site. You can access these datasets there to complete the labs instead of entering the data from the tables in the chapters. You may note some slight discrepancies in the results if you enter the data by hand rather than downloading the data due to rounding of values. The data in the chapters are typically reported to two decimal places, whereas the analyses reported in the Labs are based on the actual data that both Excel® and SPSS® carry to many decimal places even though you may only see a value with two decimal places. Despite any slight differences resulting from rounding, the primary findings should not change. You may encounter these types of discrepancies in your research with real data as you move data from program to program to page.
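The effect of rounding described above can be seen in a few lines of Python. This is only a minimal sketch: the four scores are hypothetical values chosen for illustration, not data from the labs, and Python stands in here for what Excel® and SPSS® do internally.

```python
# Hypothetical scores carried to many decimal places (illustrative values only,
# not taken from the book's datasets).
full_precision = [3.14159, 2.71828, 1.41421, 1.73205]

# The same scores as they might appear in a printed table: two decimal places.
as_printed = [round(value, 2) for value in full_precision]

# Compute the mean from each version of the data.
mean_full = sum(full_precision) / len(full_precision)
mean_printed = sum(as_printed) / len(as_printed)
difference = abs(mean_full - mean_printed)

print(f"Mean from full-precision data: {mean_full:.5f}")
print(f"Mean from two-decimal data:    {mean_printed:.5f}")
print(f"Difference:                    {difference:.5f}")
```

The two means differ only slightly, which is the kind of small discrepancy to expect when data are entered by hand from a table rather than downloaded.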

The John Wiley & Sons Publisher ftp address is as follows: ftp://ftp.wiley.com/public/sci_tech_med/educational_statistics. You may also want to visit my personal website at the following address: http://myhome.spu.edu/mabbott/.

MARTIN LEE ABBOTT

Seattle, Washington

ACKNOWLEDGMENTS

I would like to thank everyone who reviewed this manuscript. In particular, Nyaradzo Mvududu’s thorough critique was invaluable throughout the process. Adrianna Bagnall reviewed the manuscript and provided help in a great many other ways, especially with the tables. Dominic Williamson’s outstanding work on the figures and graphic design was a critical feature of my approach to conceptual understanding of complex processes. I am especially grateful for his design of the image on the book cover. Kristin Hovaguimian again provided outstanding support for the Index—not an easy task with a book of this nature. My graduate students in Industrial/Organizational Psychology were kind to review the Factorial ANOVA chapter (Chapter 13).

I also want to thank Duane Baker (The BERC Group, Inc.) and Liz Cunningham (T.E.S.T., Inc.) for approval to use their data in this book as they did for my former work (Abbott, 2010). Using real-world data of this nature will be very helpful to readers in their efforts to understand statistical processes.

I especially want to recognize Jacqueline Palmieri and Stephen Quigley at John Wiley & Sons, Inc. for their continuing encouragement. They have been steadfast in their support of this approach to statistical analysis from the beginning of our work together.

MARTIN LEE ABBOTT

1 INTRODUCTION

Many students and researchers are intimidated by statistical procedures. This may in part be due to a fear of math, problematic math teachers in earlier education, or the lack of exposure to a “discovery” method for understanding difficult procedures. Readers of this book should realize that they have the ability to succeed in understanding statistical processes.

APPROACH OF THE BOOK

This is an introduction to statistics using EXCEL® and SPSS® to make it more understandable. Ordinarily, the first course leads the student through the worlds of descriptive and inferential statistics by highlighting the formulas and sequential procedures that lead to statistical decision making. We will do all this in this book, but I place a good deal more attention on conceptual understanding. Thus, rather than memorizing a specific formula and using it in a specific way to solve a problem, I want to make sure the student first understands the nature of the problem, why a specific formula is needed, and how it will result in the appropriate information for decision making.

By using statistical software, we can place more attention on understanding how to interpret findings. Statistics courses taught in mathematics departments, and in some social science departments, often place primary emphases on the formulas/processes themselves. In the extreme, this can limit the usefulness of the analyses to the practitioner. My approach encourages students to focus more on how to understand and make applications of the results of statistical analyses. EXCEL® and other statistical programs are much more efficient at performing the analyses; the key issue in my approach is how to interpret the results in the context of the research question.

Understanding Educational Statistics Using Microsoft Excel® and SPSS®. By Martin Lee Abbott. © 2011 John Wiley & Sons, Inc. Published 2011 by John Wiley & Sons, Inc.

Beginning with my first undergraduate course through teaching statistics with conventional textbooks, I have spent countless hours demonstrating how to conduct statistical tests by hand and teaching students to do likewise. This is not always a bad strategy; performing the analysis by hand can lead the student to understand how formulas treat data and yield valuable information. However, it is often the case that the student gravitates to memorizing the formula or the steps in an analysis. Again, there is nothing wrong with this approach as long as the student does not stop there. The outcome of the analysis is more important than memorizing the steps to the outcome. Examining the appropriate output derived from statistical software shifts the attention from the nuances of a formula to the wealth of information obtained by using it.

It is important to understand that I do indeed teach the student the nuances of formulas, understanding why, when, how, and under what conditions they are used. But in my experience, forcing the student to scrutinize statistical output files accomplishes this and teaches them the appropriate use and limitations of the information derived.

Students in my classes are always surprised (ecstatic) to realize they can use their textbooks, notes, and so on, on my exams. But they quickly find that, unless they really understand the principles and how they are applied and interpreted, an open book is not going to help them. Over time, they come to realize that the analyses and the outcomes of statistical procedures are simply the ingredients for what comes next: building solutions to research problems. Therefore, their role is more detective and constructor than number juggler.

This approach mirrors the recent national and international debate about math pedagogy. In my recent book, Winning the Math Wars (Abbott et al., 2010), my colleagues and I addressed these issues in great detail, suggesting that, while traditional ways of teaching math are useful and important, the emphases of reform approaches are not to be dismissed. Understanding and memorizing detail are crucial, but problem solving requires a different approach to learning.

PROJECT LABS

Labs are a very important part of this course since they allow students to take charge of their learning. This is the “discovery learning” element I mentioned above. Understanding a statistical procedure in the confines of a classroom is necessary and helpful. However, learning that lasts is best accomplished by students directly engaging the processes with actual data and observing what patterns emerge in the findings that can be applied to real research problems.

In this course, we will have several occasions to complete Project Labs that pose research problems on actual data. Students take what they learn from the book material and conduct a statistical investigation using EXCEL® and SPSS®. Then, they have the opportunity to examine the results, write research summaries, and compare findings with the solutions presented at the end of the book.

These labs do not use data created for classroom use; they use real-world data from actual research databases. Not only does this engage students in the learning process with specific statistical processes, but it presents real-world information in all its “grittiness.” Researchers know that they will discover knotty problems and unusual, sometimes idiosyncratic, information in their data. If students are not exposed to this real-world aspect of research, it will be confusing when they engage in actual research beyond the confines of the classroom.

The project labs also introduce students to two software approaches for solving statistical problems. These are quite different in many regards, as we will see in the following chapters. EXCEL® is widely accessible and provides a wealth of information to researchers about many statistical processes they encounter in actual research. SPSS® provides additional, advanced procedures that educational researchers utilize for more complex and extensive research questions. The project labs provide solutions in both formats so the student can learn the capabilities and approaches of each.

REAL-WORLD DATA

As I mentioned, I focus on using real-world data for many reasons. One reason is that students need to be grounded in approaches they can use with “gritty” data. I want to make sure that students leave the classroom prepared for encountering the little nuances that characterize every research project.

Another reason I use real-world data is to familiarize students with contemporary research questions in education. Classroom data often are contrived to make a certain point or show a specific procedure, which are both helpful. But I believe that it is important to draw the focus away from the procedure per se and understand how the procedure will help the researcher resolve a research question. The research questions are important. Policy reflects the available information on a research topic, to some extent, so it is important for students to be able to generate that information as well as to understand it. This is an “active” rather than “passive” learning approach to understanding statistics.

RESEARCH DESIGN

People who write statistics books have a dilemma with respect to research design. Typically, statistics and research design are taught separately in order for students to understand each in greater depth. The difficulty with this approach is that the student is left on their own to synthesize the information; this is often not done successfully.

Colleges and universities attempt to manage this problem differently. Some require statistics as a prerequisite for a research design course, or vice versa. Others attempt to synthesize the information into one course, which is difficult to do given the eventual complexity of both sets of information. Adding somewhat to the problem is the approach of multiple courses in both domains.

I do not offer a perfect solution to this dilemma. My approach focuses on an in-depth understanding of statistical procedures for actual research problems. What this means is that I cannot devote a great deal of attention in this book to research design apart from the statistical procedures that are an integral part of it. However, I try to address the problem in two ways.

First, wherever possible, I connect statistics with specific research designs. This provides an additional context in which students can focus on using statistics to answer research questions. The research question drives the decision about which statistical procedures to use; it also calls for discussion of appropriate design in which to use the statistical procedures. We will cover essential information about research design in order to show how these might be used.

Second, I am making available an online course in research design as part of this book. In addition to databases and other research resources, you can follow the web address in the Preface to gain access to the online course that you can take in tandem with reading this book or separately.

“PRACTICAL SIGNIFICANCE”—IMPLICATIONS OF FINDINGS

I emphasize ‘‘practical significance’’ (effect size) in this book as well as statistical

significance. In many ways, this is a more comprehensive approach to uncertainty,

since effect size is a measure of ‘‘impact’’ in the research evaluation. It is important

to measure the likelihood of chance findings (statistical significance), but the extent

of influence represented in the analyses affords the researcher another vantage point

to determine the relationship among the research variables.

I call attention to problem solving as the important part of statistical analysis. It

is tempting for students to focus so much on using statistical procedures to create

meaningful results (a critical matter!) that they do not take the next steps in

research. They stop after they use a formula and decide whether or not a finding is

statistically significant. I strongly encourage students to think about the findings in

the context and words of the research question. This is not an easy thing to do

because the meaning of the results is not always cut and dried. It requires students

to think beyond the formula.

Statisticians and practitioners have devised rules to help researchers with this

dilemma by creating criteria for decision making. For example, squaring a correlation

yields the ‘‘coefficient of determination,’’ which represents the amount of

variance in one variable that is accounted for by the other variable. But the next

question is, How much of the ‘‘accounted for variance’’ is meaningful?
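
The arithmetic behind the coefficient of determination can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are invented, the names pearson_r, class_size, and math_score are hypothetical, and the book itself works in Excel and SPSS rather than Python.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data, invented purely for illustration.
class_size = [18, 22, 25, 27, 30, 33]
math_score = [82, 80, 77, 78, 71, 70]

r = pearson_r(class_size, math_score)
r_squared = r ** 2  # coefficient of determination

print(f"r = {r:.3f}")
print(f"r squared = {r_squared:.3f} (proportion of variance accounted for)")
```

Squaring removes the sign of the correlation, so the coefficient of determination describes how much variance is shared, not the direction of the relationship.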

Statisticians have suggested different ways of helping with this question. One

such set of criteria determines that 0.01 (or 1% of the variance accounted for) is

considered ‘‘small’’ while 0.05 (5% of variance) is ‘‘medium,’’ and so forth. (And,

much to the dismay of many students, there is more than one set of these criteria.)

But the material point is that these criteria do not apply equally to every research

question.
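
One way to see how mechanical such criteria are is to write them as a rule. The sketch below assumes only the two cutoffs quoted above (0.01 and 0.05); the function name label_effect and the wording of the labels are hypothetical, and other sets of criteria would use different cutoffs.

```python
def label_effect(r_squared, small=0.01, medium=0.05):
    """Label a proportion of variance using the two cutoffs quoted in the text.

    The default cutoffs are the ones cited above; they are one convention
    among several, not a universal standard.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    if r_squared < small:
        return "below small"
    if r_squared < medium:
        return "small"
    return "medium or larger"


# The same numerical result receives the same label for any research question.
for question in ("class size and math achievement", "drug X and survival rates"):
    print(f"{question}: {label_effect(0.01)}")
```

The rule returns the same label for both questions, which is exactly why the label alone cannot settle whether a finding is meaningful.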

If a research question is, ‘‘Does class size affect math achievement?’’ for

example, and the results suggest that class size accounts for 1% of the variance in

math achievement, many researchers might agree it is a small and perhaps even

inconsequential impact. However, if a research question is, ‘‘Does drug X account

for 1% of the variance in AIDS survival rates?’’ researchers might consider this to

be much more consequential than ‘‘small’’!

This is not to say that math achievement is any less important than AIDS

survival rates (although that is another of those debatable questions researchers

face), but the researcher must consider a range of factors in determining meaningfulness:

the intractability of the research problem, the discovery of new dimensions

of the research focus, whether or not the findings represent life and death, and so on.

I have found that students have the most difficult time with these matters. They

often prefer using a formula to create numerical results over understanding

what the results mean in the context of the research question. Students have

been conditioned to stop after they get the right numerical answer. They typically

do not get to the difficult work of what the right answer means because it isn’t

always apparent.

COVERAGE OF STATISTICAL PROCEDURES

The statistical applications we will discuss in this book are ‘‘workhorses.’’ This is

an introductory treatment, so we need to spend time discussing the nature of statistics

and basic procedures that allow you to use more sophisticated procedures. We

will not be able to examine advanced procedures in much detail. I will provide

some references for students who wish to continue their learning in these areas. I

hope that, as you learn the capabilities of Excel® and SPSS®, you can explore

more advanced procedures on your own, beyond the end of our discussions.

Some readers may have taken statistics coursework previously. If so, my hope is

that they are able to enrich what they previously learned and develop a more

nuanced understanding of how to address problems in educational research through

the use of Excel® and SPSS®. But whether readers are new to the study or

experienced practitioners, my hope is that statistics becomes meaningful as a

way of examining problems and debunking prevailing assumptions in the field

of education.

Often, well-intentioned people can, through ignorance of appropriate processes,

promote ideas in education that may not be true. Furthermore, policies that would

have a negative impact might be offered even though they were not based on

sound statistical analyses. Statistics are tools that can be misused and influenced by

the value perspective of the wielder. However, policies are often generated

in the absence of compelling research. Students need to become ‘‘research literate’’

in order to recognize when statistical processes should be used and when they are

being used incorrectly.
