UNDERSTANDING EDUCATIONAL STATISTICS USING MICROSOFT EXCEL® AND SPSS®
MARTIN LEE ABBOTT
Department of Sociology
Seattle Pacific University
Seattle, Washington
Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
ISBN: 978-0-470-88945-9
Printed in Singapore
10 9 8 7 6 5 4 3 2 1
To those who seek a deeper understanding of the world as it appears and of what lies beyond.
CONTENTS
Preface
Acknowledgments
1 Introduction
Approach of the Book
Project Labs
Real-World Data
Research Design
“Practical Significance”—Implications of Findings
Coverage of Statistical Procedures
2 Getting Acquainted with Microsoft Excel®
Data Management
Rows and Columns
Data Sheets
The Excel® Menus
Home
Insert Tab
Page Layout
Formulas
Data
Review and View Menus
3 Using Statistics in Excel®
Using Statistical Functions
Entering Formulas Directly
Data Analysis Procedures
Missing Values and “0” Values in Excel® Analyses
Using Excel® with Real Data
School-Level Achievement Database
TAGLIT Data
The STAR Classroom Observation Protocol™ Data
4 SPSS® Basics
Using SPSS®
General Features
Management Functions
Reading and Importing Data
Sort
Additional Management Functions
Split File
Transform/Compute (Creating Indices)
Merge
Analysis Functions
5 Descriptive Statistics—Central Tendency
Research Applications—Spuriousness
Descriptive and Inferential Statistics
The Nature of Data—Scales of Measurement
Nominal Data
Ordinal Data
Interval Data
Ratio Data
Choosing the Correct Statistical Procedure for the Nature of Research Data
Descriptive Statistics—Central Tendency
Mean
Median
Mode
Using Excel® and SPSS® to Understand Central Tendency
Excel®
SPSS®
Distributions
Describing the Normal Distribution
Central Tendency
Skewness
Kurtosis
Descriptive Statistics—Using Graphical Methods
Frequency Distributions
Histograms
Terms and Concepts
Real-World Lab I: Central Tendency
Real-World Lab I: Solutions
Results
Results
6 Descriptive Statistics—Variability
Range
Percentile
Scores Based on Percentiles
Using Excel® and SPSS® to Identify Percentiles
Note
Standard Deviation and Variance
Calculating the Variance and Standard Deviation
The Deviation Method
The Average Deviation
The Computation Method
The Sum of Squares
Sample SD and Population SD
Obtaining SD from Excel® and SPSS®
Terms and Concepts
Real-World Lab II: Variability
Real-World Lab II: Solutions
Results
7 The Normal Distribution
The Nature of the Normal Curve
The Standard Normal Score: z Score
The z-Score Table of Values
Navigating the z-Score Distribution
Calculating Percentiles
Creating Rules for Locating z Scores
Calculating z Scores
Working with Raw Score Distributions
Using Excel® to Create z Scores and Cumulative Proportions
STANDARDIZE Function
NORMSDIST Function
NORMDIST Function
Using SPSS® to Create z Scores
Terms and Concepts
Real-World Lab III: The Normal Curve and z Scores
Real-World Lab III: Solutions
8 The Z Distribution and Probability
Transforming a z Score to a Raw Score
Transforming Cumulative Proportions to z Scores
Deriving Sample Scores from Cumulative Percentages
Additional Transformations Using the Standard Normal Distribution
Normal Curve Equivalent
Stanine
T Score
Grade Equivalent Score
Using Excel® and SPSS® to Transform Scores
Probability
Determinism Versus Probability
Elements of Probability
Probability and the Normal Curve
Relationship of z Score and Probability
“Inside” and “Outside” Areas of the Standard Normal Distribution
Outside Area Example
“Exact” Probability
From Sample Values to Sample Distributions
Terms and Concepts
Real-World Lab IV
Real-World Lab IV: Solutions
9 The Nature of Research Design and Inferential Statistics
Research Design
Theory
Hypothesis
Types of Research Designs
Experiment
Post Facto Research Designs
The Nature of Research Design
Research Design Varieties
Sampling
Inferential Statistics
One Sample from Many Possible Samples
Central Limit Theorem and Sampling Distributions
The Sampling Distribution and Research
Populations and Samples
The Standard Error of the Mean
“Transforming” the Sample Mean to the Sampling Distribution
Example
Z Test
The Hypothesis Test
Statistical Significance
Practical Significance: Effect Size
Z-Test Elements
Terms and Concepts
Real-World Lab V
Real-World Lab V: Solutions
10 The T Test for Single Samples
Z Versus T: Making Accommodations
Research Design
Experiment
Post Facto Comparative Design
Parameter Estimation
Estimating the Population Standard Deviation
A New Symbol: sx
Biased Versus Unbiased Estimates
A Research Example
T Test for a Single Mean
Example Calculations
Degrees of Freedom
The T Distribution
The Hypothesis Test
Type I and Type II Errors
Type I (Alpha) Errors (α)
Type II (Beta) Errors (β)
Effect Size
Another Measurement of the (Cohen’s d) Effect Size
Power, Effect Size, and Beta
One- and Two-Tailed Tests
Two-Tailed Tests
One-Tailed Tests
Choosing a One- or Two-Tailed Test
A Note About Power
Point and Interval Estimates
Calculating the Interval Estimate of the Population Mean
The Value of Confidence Intervals
Using Excel® and SPSS® with the Single-Sample T Test
SPSS® and the Single-Sample T Test
Excel® and the Single Sample T Test
Terms and Concepts
Real-World Lab VI: Single-Sample T Test
Real-World Lab VI: Solutions
11 Independent-Samples T Test
A Lot of “T’s”
Research Design
Experimental Designs
Pretest or No Pretest
Post Facto Designs
Independent T Test: The Procedure
Creating the Sampling Distribution of Differences
The Nature of the Sampling Distribution of Differences
Calculating the Estimated Standard Error of Difference
Using Unequal Sample Sizes
The Independent T Ratio
Independent T-Test Example
The Null Hypothesis
The Alternative Hypothesis
The Critical Value of Comparison
The Calculated T Ratio
Statistical Decision
Interpretation
Before–After Convention with the Independent T Test
Confidence Intervals for the Independent T Test
Effect Size
Equal and Unequal Sample Sizes
The Assumptions for the Independent-Samples T Test
The Excel® “F-Test Two Sample for Variances” Test
The SPSS® “Explore” Procedure for Testing the Equality of Variances
The Homogeneity of Variances Assumption for the Independent T Test
A Rule of Thumb
Using Excel® and SPSS® with the Independent-Samples T Test
Using Excel® with the Independent T Test
Using SPSS® with the Independent T Test
Parting Comments
Nonparametric Statistics
Terms and Concepts
Real-World Lab VII: Independent T Test
Procedures
Real-World Lab VII: Solutions
12 Analysis of Variance
A Hypothetical Example of ANOVA
The Nature of ANOVA
The Components of Variance
The Process of ANOVA
Calculating ANOVA
Calculating the Variance: Using the Sum of Squares (SS)
Using Mean Squares (MS)
Degrees of Freedom in ANOVA
Calculating Mean Squares (MS)
The F Ratio
The F Distribution
Effect Size
Post Hoc Analyses
“Varieties” of Post Hoc Analyses
The Post Hoc Analysis Process
Tukey’s HSD (Range) Test Calculation
Means Comparison Table
Compare Mean Difference Values from HSD
Post Hoc Summary
Assumptions of ANOVA
Additional Considerations with ANOVA
A Real-World Example of ANOVA
Are the Assumptions Met?
Hand Calculations
Calculating SST
Calculating SSB
Calculating SSW
The Hypothesis Test
Effect Size
Post Hoc Analysis
Using Excel® and SPSS® with One-Way ANOVA
Excel® Procedures with One-Way ANOVA
SPSS® Procedures with One-Way ANOVA
The Need for Diagnostics
Nonparametric ANOVA Tests
Terms and Concepts
Real-World Lab VIII: ANOVA
Real-World Lab VIII: Solutions
13 Factorial ANOVA
Extensions of ANOVA
Within-Subjects ANOVA
Two-Way Within-Subjects ANOVA
ANCOVA
Multivariate ANOVA Procedures
MANOVA
MANCOVA
Factorial ANOVA
Interaction Effects
An Example of 2×ANOVA
Charting Interactions
Simple Effects
The Example DataSet
Calculating Factorial ANOVA
Calculating the Interaction
The 2×ANOVA Summary Table
Creating the MS Values
The Hypotheses Tests
The Omnibus F Ratio
Effect Size for 2×ANOVA: Partial η²
Discussing the Results
Using SPSS® to Analyze 2×ANOVA
The “Plots” Specification
Omnibus Results
Simple Effects Analyses
Summary Chart for 2×ANOVA Procedures
Terms and Concepts
Real-World Lab IX: 2×ANOVA
Real-World Lab IX: 2×ANOVA Solutions
14 Correlation
The Nature of Correlation
Explore and Predict
Different Measurement Values
Different Data Levels
Correlation Measures
The Correlation Design
Pearson’s Correlation Coefficient
Interpreting the Pearson’s Correlation
The Fictitious Data
Assumptions for Correlation
Plotting the Correlation: The Scattergram
Patterns of Correlations
Strength of Correlations in Scattergrams
Creating the Scattergram
Using Excel® to Create Scattergrams
Using SPSS® to Create Scattergrams
Calculating Pearson’s r
The Z-Score Method
The Computation Method
Evaluating Pearson’s r
The Hypothesis Test for Pearson’s r
The Comparison Table of Values
Effect Size: The Coefficient of Determination
Correlation Problems
Correlations and Sample Size
Correlation is Not Causation
Restricted Range
Extreme Scores
Heteroscedasticity
Curvilinear Relations
The Example Database
Assumptions for Correlation
Computation of Pearson’s r for the Example Data
Evaluating Pearson’s r: Hypothesis Test
Evaluating Pearson’s r: Effect Size
Correlation Using Excel® and SPSS®
Correlation Using Excel®
Correlation Using SPSS®
Nonparametric Statistics: Spearman’s Rank-Order Correlation (rs)
Variations of Spearman’s Rho Formula: Tied Ranks
A Spearman’s Rho Example
Terms and Concepts
Real-World Lab X: Correlation
Real-World Lab X: Solutions
15 Bivariate Regression
The Nature of Regression
The Regression Line
Calculating Regression
The Slope Value b
The Regression Equation in “Pieces”
A Fictitious Example
Interpreting and Using the Regression Equation
Effect Size of Regression
The Z-Score Formula for Regression
Using the Z-Score Formula for Regression
Unstandardized and Standardized Regression Coefficients
Testing the Regression Hypotheses
The Standard Error of Estimate
Calculating sest
Confidence Interval
Explaining Variance through Regression
Using Scattergrams to Understand the Partitioning of Variance
A Numerical Example of Partitioning the Variation
Using Excel® and SPSS® with Bivariate Regression
The Excel® Regression Output
The SPSS® Regression Output
Assumptions of Bivariate Linear Regression
Curvilinear Relationships
Detecting Problems in Bivariate Linear Regression
A Real-World Example of Bivariate Linear Regression
Normal Distribution and Equal Variances Assumptions
The Omnibus Test Results
Effect Size
The Model Summary
The Regression Equation and Individual Predictor Test of Significance
The Scattergram
Advanced Regression Procedures
Multiple Correlation
Partial Correlation
Multiple Regression
Additional Considerations
Terms and Concepts
Real-World Lab XI: Bivariate Linear Regression
Real-World Lab XI: Solutions
16 Introduction to Multiple Linear Regression
The Elements of MLR
Same Process as Bivariate Regression
Similar Assumptions
Statistical Significance
Effect Size
Coefficients
Scatterdiagrams
Some Differences Between Bivariate Regression and MLR
Multiple Coefficients
Multicollinearity
Explanation of R²
Entry Schemes
Stuff Not Covered
Using MLR with Categorical Data
Curvilinear Regression
Multilevel Analysis
MLR Extended Example
Are the Assumptions Met?
The Findings
The SPSS® Findings
The Unstandardized Coefficients
The Standardized Coefficients
Collinearity Statistics
The Squared Part Correlation
Conclusion
Terms and Concepts
Real-World Lab XII: Multiple Linear Regression
Real-World Lab XII: MLR Solutions
17 Chi Square and Contingency Table Analysis
Contingency Tables
The Chi Square Procedure and Research Design
Post Facto Designs
Experimental Designs
Chi Square Designs
Goodness of Fit
Expected Frequencies—Equal Probability
Expected Frequencies—A Priori Assumptions
The Chi Square Test of Independence
A Fictitious Example—Goodness of Fit
Frequencies Versus Proportions
Effect Size—Goodness of Fit
Chi Square Test of Independence
Two-Way Chi Square
Assumptions
A Fictitious Example—Test of Independence
Creating Expected Frequencies
Degrees of Freedom for the Test of Independence
Special 2 × 2 Chi Square
The Alternate 2 × 2 Formula
Effect Size in 2 × 2 Tables: Phi
Correction for 2 × 2 Tables
Cramer’s V: Effect Size for the Chi Square Test of Independence
Repeated Measures Chi Square
Repeated Measures Chi Square Table
Using Excel® and SPSS® with Chi Square
Using Excel® for Chi Square Analyses
Sort the Database
The Excel® Count Function
The Excel® CHITEST Function
The Excel® CHIDIST Function
Using SPSS® for the Chi Square Test of Independence
The Crosstabs Procedure
Analyzing the Contingency Table Data Directly
Interpreting the Contingency Table
Terms and Concepts
Real-World Lab XIII: Chi Square
Real-World Lab XIII: Solutions
Hand Calculations
Using Excel® for Chi Square Analyses
Using SPSS® for Chi Square Solutions
18 Repeated Measures Procedures: Tdep and ANOVAws
Independent and Dependent Samples in Research Designs
Using Different T Tests
The Dependent T-Test Calculation: The Long Formula
Example
Results
Effect Size
The Dependent T-Test Calculation: The Difference Formula
The Tdep Ratio from the Difference Method
Tdep and Power
Using Excel® and SPSS® to Conduct the Tdep Analysis
Tdep with Excel®
Tdep with SPSS®
Within-Subjects ANOVA (ANOVAws)
Experimental Designs
Post Facto Designs
Within-Subjects Example
Using SPSS® for Within-Subjects Data
Sphericity
The SPSS® Procedure
The SPSS® Output
The Omnibus Test
Effect Size
Post Hoc Analyses
The Interpretation
Nonparametric Statistics
Terms and Concepts
References
Appendix: Statistical Tables
Index
PREFACE
I have written this book many times in my head over the years! As I conducted research and taught statistics (graduate and undergraduate) in many fields, I developed an approach to helping students understand the difficult concepts in a new way. I find that the great majority of students are visual learners, so I developed diagrams and figures over the years that help create a conceptual picture of the statistical procedures that are often problematic to students (like sampling distributions!).
The other reason I wanted to write this book was to give students a way to understand statistical computing without having to rely on comprehensive and expensive statistical software programs. Because most students have access to Microsoft Excel®,¹ I developed a step-by-step approach to using the powerful statistical procedures in Excel® to analyze data and conduct research in each of the statistical topics I cover in the book.
I also wanted to make those comprehensive statistical programs more approachable to statistics students, so I have also included a hands-on guide to SPSS® in parallel with the Excel® examples. In some cases, SPSS® has the only means to perform some statistical procedures; but in most cases, both Excel® and SPSS® can be used.
Last, like my other work dealing with applied statistical topics (Abbott, 2010), I included real-world data in this book as examples for the procedures I discuss. I introduce extended examples in each chapter that use these real-world datasets, and I conclude the chapters with a Real-World Lab in which I present data for students to use with Excel® and SPSS®. Each Lab is followed by the Real World Lab: Solutions section so that students can examine their work in greater depth.
¹ Excel® references and screen shots in this book are used with permission from Microsoft.
One limitation to teaching statistics through Excel® is that the data analysis features are different, depending on whether the user is a Mac user or a PC user. I am using the PC version, which features a Data Analysis suite of statistical tools. This feature may no longer be included in the Mac version of Excel® you are using.
I am posting the datasets for the real-world labs at the Wiley Publisher ftp site. You can access these datasets there to complete the labs instead of entering the data from the tables in the chapters. You may note some slight discrepancies in the results if you enter the data by hand rather than downloading the data due to rounding of values. The data in the chapters are typically reported to two decimal places, whereas the analyses reported in the Labs are based on the actual data that both Excel® and SPSS® carry to many decimal places even though you may only see a value with two decimal places. Despite any slight differences resulting from rounding, the primary findings should not change. You may encounter these types of discrepancies in your research with real data as you move data from program to program to page.
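As a rough illustration of the rounding point, the following Python sketch compares a mean computed from full-precision values with a mean computed from the same values rounded to two decimal places. The scores are hypothetical and are not taken from the book's datasets; Python is used here only for illustration, since the book itself works in Excel® and SPSS®.

```python
from statistics import mean

# Hypothetical scores carried to three decimal places (not from the book's data).
full_precision = [72.344, 68.914, 81.254, 77.683]

# The same scores as they might appear in a printed table: two decimal places.
rounded = [round(value, 2) for value in full_precision]

mean_full = mean(full_precision)
mean_rounded = mean(rounded)

# The gap between the two means is the kind of slight discrepancy
# that hand-entered (rounded) data can produce.
diff = abs(mean_full - mean_rounded)

print("Rounded values:", rounded)
print("Mean from full-precision data:", mean_full)
print("Mean from rounded data:", mean_rounded)
print("Absolute difference:", diff)
```

Because each value is off by at most 0.005 after rounding, the two means cannot differ by more than that amount, which is why the primary findings should be unaffected.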
The John Wiley & Sons Publisher ftp address is as follows: ftp://ftp.wiley.com/public/sci_tech_med/educational_statistics. You may also want to visit my personal website at the following address: http://myhome.spu.edu/mabbott/.
MARTIN LEE ABBOTT
Seattle, Washington
ACKNOWLEDGMENTS
I would like to thank everyone who reviewed this manuscript. In particular, Nyaradzo Mvududu’s thorough critique was invaluable throughout the process. Adrianna Bagnall reviewed the manuscript and provided help in a great many other ways, especially with the tables. Dominic Williamson’s outstanding work on the figures and graphic design was a critical feature of my approach to conceptual understanding of complex processes. I am especially grateful for his design of the image on the book cover. Kristin Hovaguimian again provided outstanding support for the Index, not an easy task with a book of this nature. My graduate students in Industrial/Organizational Psychology were kind to review the Factorial ANOVA chapter (Chapter 13).
I also want to thank Duane Baker (The BERC Group, Inc.) and Liz Cunningham (T.E.S.T., Inc.) for approval to use their data in this book as they did for my former work (Abbott, 2010). Using real-world data of this nature will be very helpful to readers in their efforts to understand statistical processes.
I especially want to recognize Jacqueline Palmieri and Stephen Quigley at John Wiley & Sons, Inc. for their continuing encouragement. They have been steadfast in their support of this approach to statistical analysis from the beginning of our work together.
MARTIN LEE ABBOTT
1 INTRODUCTION
Many students and researchers are intimidated by statistical procedures. This may
in part be due to a fear of math, problematic math teachers in earlier education, or
the lack of exposure to a “discovery” method for understanding difficult procedures.
Readers of this book should realize that they have the ability to succeed in
understanding statistical processes.
APPROACH OF THE BOOK
This is an introduction to statistics using Excel® and SPSS® to make it more
understandable. Ordinarily, the first course leads the student through the worlds of
descriptive and inferential statistics by highlighting the formulas and sequential
procedures that lead to statistical decision making. We will do all this in this book,
but I place a good deal more attention on conceptual understanding. Thus, rather
than memorizing a specific formula and using it in a specific way to solve a problem,
I want to make sure the student first understands the nature of the problem,
why a specific formula is needed, and how it will result in the appropriate information
for decision making.
By using statistical software, we can place more attention on understanding how
to interpret findings. Statistics courses taught in mathematics departments, and in
some social science departments, often place primary emphases on the formulas/
processes themselves. In the extreme, this can limit the usefulness of the analyses
to the practitioner. My approach encourages students to focus more on how to
understand and make applications of the results of statistical analyses. Excel®
and other statistical programs are much more efficient at performing the analyses;
the key issue in my approach is how to interpret the results in the context of the
research question.

Understanding Educational Statistics Using Microsoft Excel® and SPSS®. By Martin Lee Abbott. © 2011 John Wiley & Sons, Inc. Published 2011 by John Wiley & Sons, Inc.
Beginning with my first undergraduate course through teaching statistics with
conventional textbooks, I have spent countless hours demonstrating how to conduct
statistical tests by hand and teaching students to do likewise. This is not always a
bad strategy; performing the analysis by hand can lead the student to understand
how formulas treat data and yield valuable information. However, it is often
the case that the student gravitates to memorizing the formula or the steps in an
analysis. Again, there is nothing wrong with this approach as long as the student
does not stop there. The outcome of the analysis is more important than memorizing
the steps to the outcome. Examining the appropriate output derived from statistical
software shifts the attention from the nuances of a formula to the wealth of information
obtained by using it.
It is important to understand that I do indeed teach the student the nuances of
formulas, understanding why, when, how, and under what conditions they are used.
But in my experience, forcing the student to scrutinize statistical output files
accomplishes this and teaches them the appropriate use and limitations of the
information derived.
Students in my classes are always surprised (ecstatic) to realize they can use
their textbooks, notes, and so on, on my exams. But they quickly find that, unless
they really understand the principles and how they are applied and interpreted, an
open book is not going to help them. Over time, they come to realize that the analyses
and the outcomes of statistical procedures are simply the ingredients for what
comes next: building solutions to research problems. Therefore, their role is more
detective and constructor than number juggler.
This approach mirrors the recent national and international debate about math
pedagogy. In my recent book, Winning the Math Wars (Abbott et al., 2010), my
colleagues and I addressed these issues in great detail, suggesting that, while traditional
ways of teaching math are useful and important, the emphases of reform
approaches are not to be dismissed. Understanding and memorizing detail are
crucial, but problem solving requires a different approach to learning.
PROJECT LABS
Labs are a very important part of this course since they allow students to take
charge of their learning. This is the “discovery learning” element I mentioned
above. Understanding a statistical procedure in the confines of a classroom is necessary
and helpful. However, learning that lasts is best accomplished by students
directly engaging the processes with actual data and observing what patterns
emerge in the findings that can be applied to real research problems.
In this course, we will have several occasions to complete Project Labs that pose
research problems on actual data. Students take what they learn from the book
material and conduct a statistical investigation using Excel® and SPSS®. Then,
they have the opportunity to examine the results, write research summaries, and
compare findings with the solutions presented at the end of the book.
These labs do not use data created for classroom use; they use
real-world data from actual research databases. Not only does this engage students
in the learning process with specific statistical processes, but it presents real-world
information in all its “grittiness.” Researchers know that they will discover knotty
problems and unusual, sometimes idiosyncratic, information in their data. If
students are not exposed to this real-world aspect of research, it will be confusing
when they engage in actual research beyond the confines of the classroom.
The project labs also introduce students to two software approaches for solving
statistical problems. These are quite different in many regards, as we will
see in the following chapters. Excel® is widely accessible and provides a
wealth of information to researchers about many statistical processes they
encounter in actual research. SPSS® provides additional, advanced procedures
that educational researchers utilize for more complex and extensive research
questions. The project labs provide solutions in both formats so the student can
learn the capabilities and approaches of each.
REAL-WORLD DATA
As I mentioned, I focus on using real-world data for many reasons. One reason is
that students need to be grounded in approaches they can use with “gritty” data. I
want to make sure that students leave the classroom prepared for encountering the
little nuances that characterize every research project.
Another reason I use real-world data is to familiarize students with contemporary
research questions in education. Classroom data often are contrived to make a certain
point or show a specific procedure, which are both helpful. But I believe that it
is important to draw the focus away from the procedure per se and understand how
the procedure will help the researcher resolve a research question. The research
questions are important. Policy reflects the available information on a research
topic, to some extent, so it is important for students to be able to generate that
information as well as to understand it. This is an “active” rather than “passive”
learning approach to understanding statistics.
RESEARCH DESIGN
People who write statistics books have a dilemma with respect to research design.
Typically, statistics and research design are taught separately in order for students
to understand each in greater depth. The difficulty with this approach is that
the student is left on their own to synthesize the information; this is often not
done successfully.
Colleges and universities attempt to manage this problem differently. Some
require statistics as a prerequisite for a research design course, or vice versa. Others
attempt to synthesize the information into one course, which is difficult to do given
the eventual complexity of both sets of information. Adding somewhat to the
problem is the approach of multiple courses in both domains.
I do not offer a perfect solution to this dilemma. My approach focuses on an
in-depth understanding of statistical procedures for actual research problems. What
this means is that I cannot devote a great deal of attention in this book to research
design apart from the statistical procedures that are an integral part of it. However, I
try to address the problem in two ways.
First, wherever possible, I connect statistics with specific research designs. This
provides an additional context in which students can focus on using statistics to
answer research questions. The research question drives the decision about which
statistical procedures to use; it also calls for discussion of appropriate design in
which to use the statistical procedures. We will cover essential information about
research design in order to show how these might be used.
Second, I am making available an online course in research design as part of this
book. In addition to databases and other research resources, you can follow the web
address in the Preface to gain access to the online course that you can take in
tandem with reading this book or separately.
“PRACTICAL SIGNIFICANCE”—IMPLICATIONS OF FINDINGS
I emphasize “practical significance” (effect size) in this book as well as statistical
significance. In many ways, this is a more comprehensive approach to uncertainty,
since effect size is a measure of “impact” in the research evaluation. It is important
to measure the likelihood of chance findings (statistical significance), but the extent
of influence represented in the analyses affords the researcher another vantage point
to determine the relationship among the research variables.
I call attention to problem solving as the important part of statistical analysis. It
is tempting for students to focus so much on using statistical procedures to create
meaningful results (a critical matter!) that they do not take the next steps in
research. They stop after they use a formula and decide whether or not a finding is
statistically significant. I strongly encourage students to think about the findings in
the context and words of the research question. This is not an easy thing to do
because the meaning of the results is not always cut and dried. It requires students
to think beyond the formula.
Statisticians and practitioners have devised rules to help researchers with this
dilemma by creating criteria for decision making. For example, squaring a correlation
yields the “coefficient of determination,” which represents the proportion of
variance in one variable that is accounted for by the other variable. But the next
question is, How much of the “accounted for variance” is meaningful?
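To make the arithmetic concrete, here is a minimal sketch of squaring a correlation to obtain the coefficient of determination. It is written in Python rather than the spreadsheet and SPSS tools used in this book, the function names are hypothetical, and the class-size and math-score values are invented purely for illustration; they are not data from any study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

def coefficient_of_determination(r):
    """Square the correlation to get the proportion of variance accounted for."""
    return r ** 2

# Hypothetical, made-up values used only to show the calculation
class_size = [15, 18, 22, 25, 28, 30]
math_score = [82, 80, 75, 77, 70, 68]

r = pearson_r(class_size, math_score)
r_squared = coefficient_of_determination(r)
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

The calculation itself is the easy part. Whether the resulting proportion of variance is meaningful is a judgment the formula cannot make.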
Statisticians have suggested different ways of helping with this question. One
such set of criteria determines that 0.01 (or 1% of the variance accounted for) is
considered “small” while 0.05 (5% of variance) is “medium,” and so forth. (And,
much to the dismay of many students, there is more than one set of these criteria.)
But the material point is that these criteria do not apply equally to every research
question.
If a research question is, “Does class size affect math achievement?” for
example, and the results suggest that class size accounts for 1% of the variance in
math achievement, many researchers might agree it is a small and perhaps even
inconsequential impact. However, if a research question is, “Does drug X account
for 1% of the variance in AIDS survival rates?” researchers might consider this to
be much more consequential than “small”!
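The limitation of a fixed rule of thumb can be sketched in a few lines. The example below, again in Python and with a hypothetical function name, encodes only the two cutoffs mentioned above (0.01 and 0.05); other sets of criteria use different values, so these thresholds are an assumption of the sketch, not a standard. It applies the same rule to the two 1% findings just described.

```python
# Cutoffs taken from the one set of criteria quoted in the text;
# other published sets of criteria use different values.
SMALL = 0.01
MEDIUM = 0.05

def label_variance_explained(r_squared):
    """Attach a rule-of-thumb label to a proportion of variance accounted for."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    if r_squared >= MEDIUM:
        return "medium or larger"
    if r_squared >= SMALL:
        return "small"
    return "below small"

# The two research questions from the text, each with 1% of variance accounted for
findings = {
    "class size and math achievement": 0.01,
    "drug X and AIDS survival rates": 0.01,
}

labels = {question: label_variance_explained(value)
          for question, value in findings.items()}

for question, label in labels.items():
    print(f"{question}: {label}")
```

Both findings receive the same label because the rule sees only the number. The difference in consequence comes entirely from the research context, which is why the criteria cannot be applied mechanically.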
This is not to say that math achievement is any less important than AIDS
survival rates (although that is another of those debatable questions researchers
face), but the researcher must consider a range of factors in determining meaningfulness:
the intractability of the research problem, the discovery of new dimensions
of the research focus, whether or not the findings represent life and death, and so on.
I have found that students have the most difficult time with these matters. They
often find using a formula to create numerical results much preferable to understanding
what the results mean in the context of the research question. Students have
been conditioned to stop after they get the right numerical answer. They typically
do not get to the difficult work of what the right answer means because it isn’t
always apparent.
COVERAGE OF STATISTICAL PROCEDURES
The statistical applications we will discuss in this book are “workhorses.” This is
an introductory treatment, so we need to spend time discussing the nature of statistics
and basic procedures that allow you to use more sophisticated procedures. We
will not be able to examine advanced procedures in much detail. I will provide
some references for students who wish to continue their learning in these areas. It is
hoped that, as you learn the capability of EXCEL® and SPSS®, you can explore
more advanced procedures on your own, beyond the end of our discussions.
Some readers may have taken statistics coursework previously. If so, my hope is
that they are able to enrich what they previously learned and develop a more
nuanced understanding of how to address problems in educational research through
the use of EXCEL® and SPSS®. But whether readers are new to the study or
experienced practitioners, my hope is that statistics becomes meaningful as a
way of examining problems and debunking prevailing assumptions in the field
of education.
Often, well-intentioned people can, through ignorance of appropriate processes,
promote ideas in education that may not be true. Furthermore, policies might be
offered that would have a negative impact even though the policy was not based on
sound statistical analyses. Statistics are tools that can be misused and influenced by
the value perspective of the wielder. However, policies are often generated
in the absence of compelling research. Students need to become “research literate”
in order to recognize when statistical processes should be used and when they are
being used incorrectly.