SPSS Handout Version 19 1-12-12

    INTRODUCTION TO SPSS

    FOR WINDOWS

    Version 19.0

    Winter 2012

    Contents

    Purpose of handout & Compatibility between different versions of SPSS
    SPSS window & menus
    Getting data into SPSS & Editing data
    Reading an SPSS viewer/output (.spv) file & Editing your output
    Saving data as an SPSS data (.sav) file
    Saving your output (statistical results and graphs)
    Exporting SPSS Output
    Printing your work & Exiting SPSS
    Running SPSS using syntax or command language (.sps files)
    Display variable names or variable labels

    Creating and Recoding Variables
      Creating a new variable
      Recoding or combining categories of a variable
      Example: Recoding a categorical variable
      Example: Creating an indicator or dummy variable

    Summarizing your data
      Frequency tables (& bar charts) for categorical variables
      Contingency tables for categorical variables
      Descriptive statistics (& histograms) for numerical variables
      Descriptive statistics (& boxplots) by groups for numerical variables
      Using the Split File option for summaries by groups
      Using the Select Cases option for summaries for a subgroup of subjects/observations

    Graphing your data
      Bar chart
      Histogram & Boxplot
      Normal probability plot
      Error bar plot
      Scatter plot
      Adding a line or loess smooth to a scatter plot
      Stem-and-leaf plot

    Hypothesis tests & Confidence intervals
      One sample t test & Confidence interval for a mean
      Paired t test & Confidence interval for the difference between means
      Two sample t test & Confidence interval for the difference between means
      Sign test and Wilcoxon signed rank test
      Mann Whitney U test (or Wilcoxon rank sum test)
      One-way ANOVA (Analysis of variance) & Post-hoc tests
      Kruskal-Wallis test
      One-sample binomial test
      McNemar's test
      Chi-square test for contingency tables
      Fisher's exact test
      Trend test for contingency tables/ordinal variables
      Binomial, McNemar's, Chi-square and Fisher's exact tests using summary data
      Confidence interval for a proportion

    Correlation & Regression
      Pearson and Spearman rank correlation coefficient
      Linear regression
      Linear regression via ANOVA commands
      Logistic regression

    Purpose of handout

    IBM SPSS Statistics (or SPSS) provides a powerful statistical and data management system in a

    graphical environment. The user interfaces make statistical analysis more accessible for casual

    users and more convenient for experienced users. Most tasks can be accomplished simply by

    pointing and clicking the mouse.

    The objective of this handout is to get you oriented with SPSS for Windows. It teaches you how to enter and save data in SPSS, how to edit and transform data, how to explore your data by

    producing graphics and summary descriptives, and how to use pointing and clicking to run

    statistical procedures.

    Compatibility between different versions of SPSS and PASW Statistics

    SPSS data files (files ending in .sav) and syntax (command) files (files ending in .sps) are

    compatible between different versions of SPSS (at least, versions 11.0 or newer). However,

    SPSS viewer/output files (files ending in .spv) are NOT compatible between different versions. One option for avoiding compatibility problems between different versions of SPSS is to export your output using an html or MS Word format. The compatibility between Windows and Mac versions of SPSS is also limited.

    SPSS Windows & Menus

    An overview of the SPSS windows, menus, toolbars, and dialog boxes is given in the SPSS

    Tutorials under Help. You can also find information under Topics, Case Studies, Statistics

    Coach, and Command & Syntax (if you are using syntax commands).

    Window Types

    Data Editor. When you start an SPSS session, you usually see the Data Editor window (otherwise you will see a Viewer window). The Data Editor displays the contents of the working data file. There are two views in the Data Editor window: 1) Data View, which displays the data in a spreadsheet format with variable names listed as column headings, and 2) Variable View, which displays information about the variables in your data set. In the Data View you can edit or enter data, and in the Variable View you can change the format of a variable, add format and variable labels, etc.

    Viewer (Output). Statistical results and graphs are displayed in the Viewer window. The (output) Viewer window is divided into two panes. The right-hand pane contains all the output and the left-hand pane contains a tree-structure of the results. You can use the left-hand pane for navigating through, editing and printing your results.

    Chart Editor. The Chart Editor is used to edit graphs. When you double-click on a figure or graph, it will reappear in a Chart Editor window.

    Syntax Editor. The Syntax Editor is used to create SPSS command syntax for using the SPSS production facility. Usually you will be using the point and click facilities of SPSS, and hence, you will not need to use the Syntax Editor. More information about the Syntax Editor and using the SPSS syntax is given in the SPSS Help Tutorials under Working with Syntax. A few instructions to get you started are given later in the handout in the section Running SPSS using Syntax (or Command Language).

    Menus

    Data Editor Menu:

    File. Use the File menu to create a new SPSS file, open an existing file, or read in spreadsheet or

    database files created by other software programs (e.g., Excel).

    Edit. Use the Edit menu to modify or copy data and output files.

    View. Choose which buttons are available in the window or how the window should look.

    Data. Use the Data menu to make changes to SPSS data files, such as merging files, transposing

    variables, or creating subsets of cases for subset analysis.

    Transform. Use the Transform menu to make changes to selected variables in the data file (e.g.,

    to recode a variable) and to compute new variables based on existing variables.

    Analyze. Use the Analyze menu to select the various statistical procedures you want to use, such

    as descriptive statistics, cross-tabulation, hypothesis testing and regression analysis.

    Graphs. Use the Graphs menu to display the data using bar charts, histograms, scatterplots, boxplots, or other graphical displays. All graphs can be customized with the Chart Editor.

    Utilities. Use the Utilities menu to view variable labels for each variable.

    Add-ons. Information about other SPSS software.

    Window. Choose which window you want to view.

    Help. Index of help topics, tutorials, SPSS home page, Statistics coach, and version of SPSS.

    Viewer Menu: Menu is similar to Data Editor menu, but has two additional options:

    Insert. Use the Insert menu to edit your output.

    Format. Use the Format menu to change the format of your output.

    Chart Editor Menu: Use SPSS Help to learn more about the Chart Editor.

    Toolbars

    Most Windows applications provide buttons arranged along the top of a window that act as shortcuts to executing various functions. In SPSS, you will find such buttons (icons) at the top of the Data Editor, Viewer, Chart Editor, and Syntax windows. The icons are usually symbolic representations of the procedure they execute when pushed; unfortunately, their meanings are not intuitively obvious until one has already used them. Hence, the best way to learn these buttons is to use them and note what happens.

    The Status Bar. The Status Bar runs along the bottom of a window and alerts the user to the status of the system. Typical messages one will see are "Processor is ready" and "Running procedure". The Status Bar will also provide up-to-date information concerning special manipulations of the data file, like whether only certain cases are being used in an analysis or if the data has been weighted according to the value of some variable.

    File Types

    Data Files. A file with an extension of .sav is assumed to be a data file in SPSS for Windows format. A file with an extension of .por is a portable SPSS data file. The contents of a data file are displayed in the Data Editor window.

    Viewer (Output) Files. A file with an extension of .spv is assumed to be a Viewer file containing statistical results and graphs.

    Syntax (Command) Files. A file with an extension of .sps is assumed to be a Syntax file containing SPSS syntax and commands.

    Getting Data into SPSS & Editing Data

    When reading and editing data into SPSS the data will be displayed in the Data Editor Window. An overview of the basic structure of an SPSS data file is given in the SPSS Help Tutorials:

    1. Choose Help on the menu bar
    2. Choose Tutorial
    3. Choose Reading Data

    Reading Data from a SPSS Data (.sav) File

    To read a data file from your computer/floppy disk/flash drive that was created and saved using SPSS (the filename should end with the suffix .sav), use the dialog box that appears when SPSS starts:

    1. Choose Open an existing data source
    2. Double click on the filename or
    3. Single click on the filename and choose OK

    Or

    1. Choose Cancel
    2. Choose File on the menu bar
    3. Choose Open
    4. Choose Data...
    5. Edit the directory or disk drive to indicate where the data is located.
    6. Double click on the filename or
    7. Single click on the filename and choose Open
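For readers who later move to syntax, the menu steps above correspond to a single GET FILE command. This is a minimal sketch only; the path and filename are hypothetical placeholders, not files supplied with the handout.

```spss
* Open an existing SPSS data file from a hypothetical location.
GET FILE='C:\mydata\chd.sav'.
```

The file opens in the Data Editor window, just as it does when opened through the menus.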

    Reading Data from a Text Data File

    The following steps read a raw/text (ASCII) data file from your computer/floppy disk/flash drive, where the data for each observation is on a separate line and a space is used to separate variables on the same line (i.e., the file format is freefield). The filename should end with the suffix .dat.

    1. Choose File on the menu bar
    2. Choose Read Text Data
    3. Choose Files of Type *.dat
    4. Edit the directory or disk drive to indicate where the data is located
    5. Double click on the filename or
    6. Single click on the filename and choose Open
    7. Follow the Import Wizard Instructions.

    You can also get to the Import Wizard as follows:

    1. Choose File on the menu bar
    2. Choose Open
    3. Choose Data...
    4. Choose Files of Type *.dat
    5. Edit the directory or disk drive to indicate where the data is located
    6. Double click on the filename or
    7. Single click on the filename and choose Open
    8. Follow the Import Wizard Instructions.

    Instructions on how to read a text data file in fixed format are located in SPSS Help Tutorials

    under Reading Data from a Text File.
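A freefield text file like the one described above can also be read with the DATA LIST command. The sketch below is illustrative: the path and the three variable names (id, age, hdl) are assumptions, and the Import Wizard itself usually pastes a longer GET DATA command rather than this one.

```spss
* Read a freefield text file whose values are separated by spaces.
* The path and the variable names id, age and hdl are hypothetical.
DATA LIST FILE='C:\mydata\chd.dat' FREE
  / id age hdl.
EXECUTE.
```

The variables are named in the order in which their values appear on each line of the file.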

    Reading Data from Other Types of External Files

    SPSS allows you to read a variety of other types of external files, such as Excel spreadsheet files,

    SAS data files, and Stata data files. To read data from other types of external files, you follow

    the same steps as you would for reading an SPSS save file, except that you specify the file type

    according to what package was used to create the save file. For further instruction on how to read data from other types of external files, see the SPSS for Windows Base System User's Guide on

    data files or the SPSS Help Tutorials.
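As one illustration of reading an external file, an Excel worksheet can be read with GET DATA. This is a sketch under stated assumptions: the workbook path and the sheet name Sheet1 are hypothetical, and the first row of the sheet is assumed to hold the variable names.

```spss
* Read a worksheet from a hypothetical Excel workbook.
* READNAMES=ON treats the first row as variable names.
GET DATA
  /TYPE=XLS
  /FILE='C:\mydata\chd.xls'
  /SHEET=NAME 'Sheet1'
  /CELLRANGE=FULL
  /READNAMES=ON.
```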

    Entering and Editing Data Using the Data Editor

    The Data Editor provides a convenient spreadsheet-like facility for entering, editing, and displaying the contents of your data file. A Data Editor window opens automatically when you start an SPSS session. Instruction on Using the Data Editor to enter data is given in the SPSS Help Tutorials. Note that if you are already familiar with entering data into a different spreadsheet program (e.g., MS Excel), you might find it easier to enter your data in the program you are familiar with and then read the data into SPSS.

    Entering Data. Basic data entry in the Data Editor is simple:

    Step 1. Create a new (empty) Data Editor window. At the start of an SPSS session a new (empty) Data Editor window opens automatically. During an SPSS session you can create a new Data Editor window as follows:

    1. Choose File
    2. Choose New
    3. Choose Data

    Step 2. Move the cursor to the first empty column.

    Step 3. Type a value into the cell. As you type, the value appears in the cell editor at the top of

    the Data Editor window. Each time you press the Enter key, the value is entered in the cell and

    you move down to the next row. By entering data in a column, you automatically create a

    variable and SPSS gives it the default variable name var00001.

    Step 4. Choose the first cell in the next column. You can use the mouse to click on the cell or use the arrow keys on the keyboard to move to the cell. By default, SPSS names the data in the second column var00002.

    Step 5. Repeat step 4 until you have entered all the data. If you entered an incorrect value(s) you will need to edit your data. See the following section on Editing Data.

    Editing Data. With the Data Editor, you can modify a data file in many ways. For example, you can change values; cut, copy, and paste values; or add and delete cases.

    To Change a Data Value:

    1. Click on a data cell. The cell value is displayed in the cell editor.
    2. Type the new value. It replaces the old value in the cell editor.
    3. Press the Enter key. The new value appears in the data cell.

    To Cut, Copy, and Paste Data Values:

    1. Select (highlight) the cell value(s) you want to cut or copy.
    2. Pull down the Edit box on the main menu bar.
    3. Choose Cut. The selected cell values will be copied, then deleted. Or
    4. Choose Copy. The selected cell values will be copied, but not deleted.
    5. Select the target cell(s) (where you want to put the cut or copied values).
    6. Pull down the Edit box on the main menu bar.
    7. Choose Paste. The cut or copied values will be "pasted" in the target cells.

    To Delete a Case (i.e., a Row of Data):

    1. Click on the case number on the left side of the row. The whole row will be highlighted.
    2. Pull down the Edit box on the main menu bar.
    3. Choose Clear.

    To Add a Case (i.e., a Row of Data):

    1. Select any cell in the row below where you want to insert the new case.
    2. Pull down the Data box on the main menu bar.
    3. Choose Insert.

    Defining Variables. The default name for new variables is the prefix var and a sequential five-digit number (e.g., var00001, var00002, var00003). To change the name, format and other attributes of a variable:

    1. Double click on the variable name at the top of a column or,
    2. Click on the Variable View tab at the bottom of the Data Editor Window.
    3. Edit the variable name under the column labeled Name. SPSS 19 accepts variable names of up to 64 characters; keeping names to eight characters or less preserves compatibility with versions older than 12.0. You can also specify the number of decimal places (under Decimals), assign a descriptive name (under Label), define missing values (under Missing), define the type of variable (under Measure; e.g., scale, ordinal, nominal), and define the values for nominal variables (under Values).

    After the data is entered (or several times during data entering), you will want to save it as an SPSS save file. See the section on Saving Data As An SPSS Save File.
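The variable attributes set in Variable View can also be defined with syntax. The following is a sketch, not part of the original handout: the new names age and smoke, the label text, and the missing-value code 999 are all assumptions chosen for illustration.

```spss
* Rename the default variables to hypothetical names.
RENAME VARIABLES (var00001 = age) (var00002 = smoke).
* Descriptive labels, as entered in the Label column.
VARIABLE LABELS
  age 'Age in years'
  /smoke 'Smoking status'.
* Value labels for a categorical variable, as entered in the Values column.
VALUE LABELS smoke
  1 'Never smoked'
  2 'Former smoker'
  3 'Current smoker'.
* Treat 999 as a user-missing code for age, as entered in the Missing column.
MISSING VALUES age (999).
* Measurement level, as chosen in the Measure column.
VARIABLE LEVEL age (SCALE) /smoke (NOMINAL).
* Width and decimal places, as set in the Decimals column.
FORMATS age (F3.0) smoke (F1.0).
```

Keeping these commands in a syntax file gives a record of how each variable was defined, which can be rerun if the data are re-entered.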

    Reading an SPSS Viewer/Output (.spv) File

    Statistical results and graphs are displayed in the Viewer window. An overview of how to use

    the Viewer is given in the SPSS Help Tutorials under Working with Output.

    If you saved the results of a Viewer window during an earlier SPSS session, you can use the following commands to display the Viewer (output) results in a current SPSS session. However, SPSS output/viewer files (files ending in .spv) are NOT always compatible between different versions. Usually SPSS output files created with an older version can be read by a newer version, but an output file created using a newer version cannot be read by an older version. One option for avoiding compatibility problems between different versions of SPSS is to export your output in html or MS Word format. The compatibility between Windows and Mac versions of SPSS is limited.

    To read a Viewer file from your computer/floppy disk/flash drive that was created and saved using SPSS (the filename should end with the suffix .spv):

    1. Choose File on the menu bar
    2. Choose Open
    3. Choose Output...
    4. Edit the directory or disk drive to indicate where the file is located
    5. Double click on the filename or
    6. Single click on the filename and choose Open

    Editing Your Output

    Editing the statistical results and graphs in the Viewer window is beyond the scope of this handout. Instructions on how to edit your output are given in the SPSS Help Tutorials under Working with Output and Creating and Editing Charts.

    You can use either the tree-structure in the left-hand pane or the results displayed in the right-hand pane to select, move or delete parts of the output.

    To edit a table or object (an object is a group of results) you first need to double click on the table/object so an "editing box" appears around the table/object, and then select the value you want to modify. An "editing box" will be a ragged box outlining the table. If you only do a single click you will get a box with straight/plain lines outlining the table. In general, to create nice looking tables of your results it is often easier to hand enter the values into a blank MS Word table than to edit an SPSS table/object (either in SPSS or MS Word).

    To edit a chart you first need to double click on the chart so it appears in a new Chart Editor window. After you are done editing the chart, close the window and then export the chart, for example to a Windows metafile and then into an MS Word file.

    By default in SPSS a P-value is displayed as .000 if the P-value is less than .001. You can report the P-value as P < .001, or you can display it with more digits as follows:

    1. In an SPSS (output) Viewer window double click (with the left mouse button) on the table containing the p-value you want to display differently. An "editing box" should appear around the table.
    2. Click on the p-value using the right mouse button.
    3. Choose Cell Properties. (If you do not get this option, you need to double click on the table to get the ragged box.)
    4. Change the number of decimals to the desired number (default is 3).
    5. Choose OK or
    6. Double click on the p-value with the left mouse button and SPSS will display the p-value with more significant digits. If the p-value is very small, the p-value will be displayed in scientific notation (e.g., 1.745E-10 = 0.0000000001745).

    Saving Data as an SPSS Data (.sav) File

    To save data as a new SPSS Data file onto your computer/floppy disk/flash drive:

    1. Display the Data Editor window (i.e., execute the following commands while in the Data Editor window displaying the data you want to save.)
    2. Choose File on the menu bar.
    3. Choose Save As...
    4. Edit the directory or disk drive to indicate where the data should be saved, and enter a filename. SPSS will automatically add the .sav suffix to the filename.
    5. Choose Save

    To save data changes in an existing SPSS Data file:

    1. Display the Data Editor window (i.e., execute the following commands while in the Data Editor window displaying the data you want to save.)
    2. Choose File on the menu bar
    3. Choose Save

    Caution. The Save command saves the modified data by overwriting the previous version of the

    file.

    You can save your data in other formats besides an SPSS save file (e.g., as an ASCII file, Excel file, SAS data set). To save your data with a given format you follow the same steps as saving data in a new SPSS Save file, except that you specify the Save as Type as the desired format.
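The save steps above have syntax counterparts: SAVE writes an SPSS data file and SAVE TRANSLATE writes other formats. This is a minimal sketch with hypothetical paths; as with the menu Save command, an existing file of the same name is overwritten.

```spss
* Save the working data as an SPSS data file at a hypothetical path.
SAVE OUTFILE='C:\mydata\chd.sav'.

* Save the same data as an Excel file with variable names in the first row.
SAVE TRANSLATE OUTFILE='C:\mydata\chd.xls'
  /TYPE=XLS
  /VERSION=8
  /FIELDNAMES
  /REPLACE.
```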

    Saving Your Output (Statistical Results and Graphs)

    To save the statistical results and graphs displayed in the Viewer window as a new SPSS Output

    file:

    1. Display the Viewer window (i.e., execute the following commands while in the Viewer window displaying the results you want to save.)
    2. Choose File on the menu bar.
    3. Choose Save As...
    4. Edit the directory or disk drive to indicate where the output should be saved, and enter a filename. SPSS will automatically add the .spv suffix to the filename.
    5. Choose Save

    To save Viewer changes in an existing SPSS Output file:

    1. Display the Viewer window (i.e., execute the following commands while in the Viewer window displaying the results you want to save.)
    2. Choose File on the menu bar.
    3. Choose Save.

    Caution. The Save command saves the modified Viewer window by overwriting the previous version of the file.

    NOTE that you may not be able to open SPSS output that was created with a different version of SPSS than the one you are using to open it (in particular, output created with a newer version cannot be read by an older version). You can avoid this incompatibility problem by exporting your output in an html or MS Word format (see the next section, Exporting SPSS Output).
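Saving the Viewer contents can also be done from a syntax file with OUTPUT SAVE. The path below is a hypothetical placeholder.

```spss
* Save the contents of the active Viewer window to a hypothetical path.
OUTPUT SAVE OUTFILE='C:\mydata\chd_results.spv'.
```

Placing this command at the end of a syntax file saves the results each time the analysis is rerun.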

    Exporting SPSS Output

    Sometimes you will want to save your SPSS output in a different file format than an SPSS output file, because you want to avoid compatibility problems between different versions of SPSS, you want to further edit your output in a Word document, or you want to include graphs or figures in another document file. The basic steps in exporting SPSS output to another file type are, while in an SPSS (output) Viewer window:

    1. Choose File
    2. Choose Export
    3. Objects to Export: Choose what you want to export.
       All: Exports all the output and other information not shown in the output. You usually do not want to use this option.
       All visible: Exports all visible output.
       Selected: Exports only output that is selected or highlighted in the Viewer window.
    4. Document Type: Choose the type of file or format you want to use to save your results.
       Word/RTF (*.doc) is a good option. Numerical and graphical output will be saved in the same file.
       With the HTML option numerical output will be saved in one file and each graph will be saved in a separate file.
    5. Document File Name: Enter the file name and location.
    6. Choose OK (or Paste)
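Choosing Paste at the last step writes an OUTPUT EXPORT command to the Syntax Editor. A stripped-down sketch of such a command is shown here; the path is hypothetical, and the pasted version usually carries additional page-layout keywords that are omitted in this example.

```spss
* Export all visible output to a Word/RTF document at a hypothetical path.
OUTPUT EXPORT
  /CONTENTS EXPORT=VISIBLE
  /DOC DOCUMENTFILE='C:\mydata\chd_results.doc'.
```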

    Printing Your Work in SPSS

    To print statistical results and graphs in the Viewer window or data in the Data Editor window:

    1. Display the output or data you want to print (i.e., execute the following commands while in a viewer/output or data window)
    2. Choose File on the menu bar.
    3. Choose Print...
    4. Choose All visible output or Selected output (if you have selected parts of the output).
    5. Choose OK

    NOTE there is no printing capability at the Seattle Downtown Campus Classroom Location.

    Exiting SPSS

    To exit SPSS:

    1. Choose File on the menu bar
    2. Choose Exit SPSS

    If you have made changes to the data file or the output file since the last time you saved these files, before exiting SPSS you will be asked whether you want to save the contents of the Data Editor window and Viewer window. If you are unsure as to whether you want to save the contents of the data or output window, choose Cancel, then display the window(s) and if you want to save the contents of the window, follow the instructions in this handout for saving data or output windows. SPSS will use the overwrite method when saving the contents of the window.

    Running SPSS using Syntax (or Command Language)

    This handout describes how to run various statistical summaries and procedures using the point-and-click menus in SPSS. However, it is also possible to run SPSS commands using SPSS syntax/command language. If you are running similar analyses repeatedly, it can be more efficient to run your analysis using SPSS syntax. How to run SPSS using the syntax/command language is beyond the scope of this handout. Help on running SPSS using the syntax/command language can be found in the SPSS Tutorials under Working with Syntax.

    To get you started using SPSS syntax, follow the point-and-click instructions for running a particular analysis, but select Paste instead of OK at the last step. A Syntax Editor window will open containing the SPSS syntax for running the analysis. To run the analysis you can choose Run on the menu bar or you can highlight the syntax you want to run, click the right mouse button, and select Run Selection. You can add more syntax to the Syntax Editor window by using the point-and-click method, selecting Paste instead of OK at the last step. The additional syntax will be added at the bottom of the Syntax Editor window. You can also write syntax directly into the syntax file and/or use copy, paste and editing commands to modify the syntax. Remember to save your syntax file before exiting SPSS. The file should end in .sps. You can open a syntax file by selecting File on the menu bar, then Open, and then Syntax.

    Here's an example of SPSS syntax.

    [Figure: Syntax Editor window containing example SPSS syntax]

    The example runs a two sample t-test comparing HDL cholesterol (hdl) for subjects without and with CHD (incchd, coded 0 for no and 1 for yes).

    It also creates 3 indicator variables, neversmoker, formersmoker, and currentsmoker, for smoking status (smoke).

    Note that a period (.) is used to denote the end of a string of syntax and "Execute." is sometimes required to run the syntax.

    Comments can be added between the symbols /* and */ or after * to help you remember what the syntax is doing.
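The syntax in the original figure is not available in this text. The sketch below shows syntax of the kind described, and is not a transcription of the author's example. It uses the variable names given above (hdl, incchd, smoke, neversmoker, formersmoker, currentsmoker) and assumes smoke is coded 1 for never smoked, 2 for former smoker and 3 for current smoker, the coding used in the recoding examples in this handout.

```spss
* Two sample t test comparing hdl between subjects without and with CHD.
T-TEST GROUPS=incchd(0 1)
  /MISSING=ANALYSIS
  /VARIABLES=hdl
  /CRITERIA=CI(.95).

* Create three indicator variables for smoking status.
RECODE smoke (1=1) (ELSE=0) INTO neversmoker.
RECODE smoke (2=1) (ELSE=0) INTO formersmoker.
RECODE smoke (3=1) (ELSE=0) INTO currentsmoker.
EXECUTE.  /* EXECUTE runs the pending RECODE transformations */
```

Each command ends with a period. The lines beginning with * are comments, and the text between /* and */ is a comment placed on the same line as a command.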

    Displaying Variable Names or Variable Labels

    When running SPSS via the menus you can have either the variable labels or the variable names displayed in the dialog boxes.

    Here is an example of the variable labels being displayed. The variable name is also (always) displayed in parentheses after the variable label.

    [Figure: dialog box with variable labels displayed]

    Here is an example of the variable names being displayed.

    [Figure: dialog box with variable names displayed]

    To select whether the variable labels or names display:

    1. Choose Edit
    2. Choose Options
    3. Choose General
    4. Select Display labels or Display names.

    Creating and Recoding Variables

    Creating a New Variable

    To create a new variable:

    1. Display the Data Editor window (i.e., execute the following commands while in the Data Editor window displaying the data file you want to use to create a new variable).
    2. Choose Transform on the menu bar
    3. Choose Compute Variable...
    4. Enter the new variable name in the Target Variable box.
    5. Enter the definition of the new variable in the Numeric Expression box (e.g., SQRT(visan), LN(age), or MEAN(age)) or
    6. Select variable(s) and combine with desired arithmetic operations and/or functions.
    7. Choose OK

    After creating a new variable(s), you will probably want to save the new variable(s) by re-saving your data using the Save command under File on the menu bar (See Saving Data as an SPSS Save File). Further instructions on creating a new variable are given in the SPSS Help Tutorials under Modifying Data Values.

    Example: Creating a (New) Transformed Variable

    You can use the SPSS commands for creating a new variable to create a transformed variable. Suppose you have a variable indicating triglyceride level, trig, and you want to transform this variable using the natural logarithm to make the distribution less skewed (i.e., you want to create a new variable which is the natural logarithm of triglyceride levels).

    1. Display the Data Editor window
    2. Choose Transform on the menu bar
    3. Choose Compute Variable...
    4. Enter, say, lntrig, in the Target Variable box.
    5. Enter LN(trig) in the Numeric Expression box.
    6. Choose OK

    Now, a new variable, lntrig, which is the natural logarithm of trig, will be added to your data set. Remember to save your data set before exiting SPSS (e.g., while in the SPSS Data window, choose Save under File or click on the floppy disk icon).
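In syntax, the log transformation in this example is a single COMPUTE command. This sketch uses the variable names from the example (trig and lntrig).

```spss
* Create lntrig as the natural logarithm of trig.
COMPUTE lntrig = LN(trig).
EXECUTE.
```

LN returns a system-missing value for cases where trig is missing, zero or negative, so it is worth checking the range of trig first.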

    Recoding or Combining Categories of a Variable

    To recode or combine categories of a variable:

    1. Display the Data Editor window (i.e., execute the following commands while in the Data Editor window displaying the data file you want to use to recode variables).
    2. Choose Transform on the menu bar
    3. Choose Recode
    4. Choose Into Same Variables... or Into Different Variables...
    5. Select a variable to recode from the variable list on the left and then click on the arrow located in the middle of the window. This defines the input variable.
    6. If recoding into a different variable, enter the new variable name in the box under Name:, then choose Change. This defines the output variable.
    7. Choose Old and New Values...
    8. Choose Value or Range under Old Value and enter old value(s).
    9. Choose New Value and enter new value, then choose Add.
    10. Repeat the process until all old values have been redefined.
    11. Choose Continue
    12. Choose OK

    After creating a new variable(s), you will probably want to save the new variable(s) by re-saving your data using the Save command under File on the menu bar (See Saving Data as an SPSS Save File).

    Example: Recoding a Categorical Variable

    You can use the commands for recoding a variable to change the coding values of a categorical variable. You may want to change a coding value for a particular category to modify which category SPSS uses as the referent category in a statistical procedure. For example, suppose you want to perform linear regression using the ANOVA (or General Linear Model) commands, and one of your independent variables is smoking status, smoke, that is coded 1 for never smoked, 2 for former smoker and 3 for current smoker. By default SPSS will use current smoker as the referent category because current smoker has the largest numerical (code) value. If you want never smoked to be the referent category you need to recode the value for never smoked to a value larger than 3.

    Although you can recode the smoking status into the same variable, it is better to recode the variable into a new/different variable, newsmoke, so you do not lose your original data if you make an error while recoding.

    Remember to save your data set before exiting SPSS.

    1. Display the Data Editor window
    2. Choose Transform
    3. Choose Recode
    4. Choose Into Different Variables...
    5. Select the variable smoke as the Input variable
    6. Enter newsmoke as the name of the Output variable, and then choose Change.
    7. Choose Old and New Values...
    8. Choose Value under Old Value. (It may already be selected.)
    9. Enter 1 (code for never smoker)
    10. Choose Value under New Value. (It may already be selected.)
    11. Enter 4 (or any value greater than 3)
    12. Choose Add
    13. Choose All Other Values under Old Value.
    14. Choose Copy Old Value(s) under New Value.
    15. Choose Add
    16. Choose Continue
    17. Choose OK
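    The recoding rule set up in the steps above (old value 1 becomes 4, all other values are copied) can be sketched outside SPSS as follows. This is only an illustration in Python, not SPSS syntax; the function name recode_smoke and the sample codes are hypothetical.

```python
def recode_smoke(value):
    """1 (never smoked) becomes 4; every other value is copied unchanged."""
    return 4 if value == 1 else value

# Hypothetical smoke codes for five subjects (1 = never, 2 = former, 3 = current).
smoke = [1, 2, 3, 1, 2]

# Build a new list so the original codes are kept, as with Into Different Variables.
newsmoke = [recode_smoke(v) for v in smoke]

print(newsmoke)  # [4, 2, 3, 4, 2]
```

    After the recode, never smoked carries the largest code value (4), while the original smoke values are left untouched.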


    Example: Creating Indicator or Dummy Variables

    You can use the commands for recoding a variable to create indicator or dummy variables in SPSS. Suppose you have a variable indicating smoking status, smoke, that is coded 1 for never smoked, 2 for former smoker and 3 for current smoker. To create three new indicator or dummy variables for never, former and current smoking:

    1. Display the Data Editor window
    2. Choose Transform
    3. Choose Recode
    4. Choose Into Different Variables...
    5. Select the variable smoke as the Input variable
    6. Enter neversmoke as the name of the Output variable, and then choose Change.
    7. Choose Old and New Values...
    8. Choose Value under Old Value. (It may already be selected.)
    9. Enter 1 (code value for never smoker)
    10. Choose Value under New Value. (It may already be selected.)
    11. Enter 1 (to indicate never smoker)
    12. Choose Add
    13. Choose All Other Values under Old Value.
    14. Choose Value under New Value.
    15. Enter 0
    16. Choose Add
    17. Choose Continue
    18. Choose OK

    Now, you have created a binary indicator variable for never smoker (coded 1 if never smoker, 0 if former or current smoker). Next, create a binary indicator variable for former smoker.


    1. Display the Data Editor window
    2. Choose Transform
    3. Choose Recode
    4. Choose Into Different Variables...
    5. Select the variable smoke as the Input variable
    6. Enter formersmoke as the name of the Output variable, and then choose Change. (Or change (edit) never to former, and then choose Change.)
    7. Choose Old and New Values...
    8. Choose 1 --> 1 under Old --> New and then choose Remove.
    9. Choose Value under Old Value.
    10. Enter 2 (code value for former smoker)
    11. Choose Value under New Value.
    12. Enter 1 (to indicate former smoker)
    13. Choose Add
    14. Choose Continue
    15. Choose OK

    Now, you have created a binary indicator variable for former smoker (coded 1 if former smoker, 0 if never or current smoker). To create a binary indicator variable for current smoker you would use similar commands to those for creating the indicator variable for former smoker, except that now the value of 3 for smoke is coded as 1 and all other values are coded as 0.
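    The logic behind the three indicator variables can be sketched in a few lines. This is an illustration in Python rather than SPSS syntax; the helper name indicator, the sample codes, and the variable name currentsmoke are hypothetical (the handout names only neversmoke and formersmoke).

```python
def indicator(value, target):
    """Return 1 when value equals the target code, otherwise 0."""
    return 1 if value == target else 0

# Hypothetical smoke codes (1 = never, 2 = former, 3 = current).
smoke = [1, 2, 3, 2, 1]

neversmoke = [indicator(v, 1) for v in smoke]
formersmoke = [indicator(v, 2) for v in smoke]
currentsmoke = [indicator(v, 3) for v in smoke]  # hypothetical name

print(neversmoke)    # [1, 0, 0, 0, 1]
print(formersmoke)   # [0, 1, 0, 1, 0]
print(currentsmoke)  # [0, 0, 1, 0, 0]
```

    Each subject has exactly one indicator equal to 1, so the three indicators always sum to 1 for every subject.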


    Example: Creating a Categorical Variable From a Numerical Variable

    You can use the commands for recoding a variable to create a categorical variable from a numerical variable (i.e., group values of the numerical variable into categories). For example, suppose you have a variable that is the number of pack years smoked, packyrs, and you want to create a categorical variable with the four categories, 0, >0 to 10, >10 to 30, and >30 pack years smoked.

    Note that you may want to use different coding values depending on which category you want to be used as the referent category in certain statistical procedures. Remember to save your data set before exiting SPSS.

    1. Display the Data Editor window
    2. Choose Transform
    3. Choose Recode
    4. Choose Into Different Variables...
    5. Select the variable packyrs as the Input variable
    6. Enter a name for the new variable, packcat, for the Output variable, and then choose Change.
    7. Choose Old and New Values...
    8. Choose Value under Old Value. (It may already be selected.)
    9. Enter 0
    10. Choose Value under New Value.
    11. Enter 0 (to indicate 0 pack years)
    12. Choose Add
    13. Choose Range under Old Value.
    14. Enter 0.01 and 10 in the two blank boxes.
    15. Choose Value under New Value
    16. Enter 1 (to indicate >0 to 10 pack years)
    17. Choose Add
    18. Choose Range under Old Value.
    19. Enter 10.01 and 30 in the two blank boxes.
    20. Choose Value under New Value
    21. Enter 2 (to indicate >10 to 30 pack years)
    22. Choose Add
    23. Choose Range, value through HIGHEST under Old Value.
    24. Enter 30.01 in the blank box.
    25. Choose Value under New Value
    26. Enter 3 (to indicate >30 pack years)
    27. Choose Add
    28. Choose Continue
    29. Choose OK
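    The four categories defined above (0, >0 to 10, >10 to 30, and >30 pack years) can be written as a small function. This is a sketch in Python, not SPSS syntax; it uses strict inequalities rather than the 0.01 offsets typed into the dialog, so values such as 10.005 (which lie between the dialog ranges and would match none of them) still land in a category. The function name simply reuses the handout's packcat, and the rejection of negative values is an assumption.

```python
def packcat(packyrs):
    """Group pack years into the categories 0, >0 to 10, >10 to 30, and >30."""
    if packyrs < 0:
        raise ValueError("pack years cannot be negative")  # assumption, not in the handout
    if packyrs == 0:
        return 0
    if packyrs <= 10:
        return 1
    if packyrs <= 30:
        return 2
    return 3

# Hypothetical pack-year values.
values = [0, 0.5, 10, 10.005, 30, 45]
print([packcat(v) for v in values])  # [0, 1, 1, 2, 2, 3]
```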


    Summarizing Your Data

    Frequency Tables (& Bar Charts) for Categorical Variables. To produce frequency tables and bar charts for categorical variables:

    1. Choose Analyze from the menu bar
    2. Choose Descriptive Statistics
    3. Choose Frequencies
    4. Variable(s): To select the variables you want from the source list on the left, highlight a variable by pointing and clicking the mouse and then click on the arrow located in the middle of the window. Repeat the process until you have selected all the variables you want.
    5. Choose Charts (Skip to step 8 if you do not want bar charts.)
    6. Choose Bar Chart(s)
    7. Choose Continue
    8. Choose OK

    Example: Frequency table and bar chart for the categorical variable, smoking status (smoke).

    Frequency table and bar chart of smoking status

    Smoking status is the selected variable and Bar charts has been selected under Charts.

    [Figure: bar chart of Smoking status (never, former, current). y-axis: Percent (0 to 60)]

    Smoking status

               Frequency   Percent   Valid Percent   Cumulative Percent
    never      590         59.0      59.0            59.0
    former     293         29.3      29.3            88.3
    current    117         11.7      11.7            100.0
    Total      1000        100.0     100.0
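    The Percent and Cumulative Percent columns follow directly from the frequencies. A short Python sketch of the arithmetic, using the counts from the table above (the variable names are illustrative only):

```python
# Frequencies from the Smoking status table, in the order SPSS lists them.
counts = {"never": 590, "former": 293, "current": 117}

total = sum(counts.values())
running = 0
rows = []
for category, n in counts.items():
    running += n  # cumulative count up to and including this category
    percent = round(100 * n / total, 1)
    cumulative = round(100 * running / total, 1)
    rows.append((category, n, percent, cumulative))

for row in rows:
    print(row)
```

    Percent and Valid Percent are identical here because there are no missing values, so both use the same denominator of 1000.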


    Contingency Tables for Categorical Variables. To produce contingency tables for categorical variables:

    1. Choose Analyze from the menu bar.
    2. Choose Descriptive Statistics
    3. Choose Crosstabs...
    4. Row(s): Select the row variable you want from the source list on the left and then click on the arrow located next to the Row(s) box. Repeat the process until you have selected all the row variables you want.
    5. Column(s): Select the column variable you want from the source list on the left and then click on the arrow located next to the Column(s) box. Repeat the process until you have selected all the column variables you want.
    6. Choose Cells...
    7. Choose the cell values (e.g., observed counts; row, column, and margin (total) percentages). Note the option is selected when the little box is not empty.
    8. Choose Continue
    9. Choose OK

    Example: Contingency table of smoking status by coronary heart disease (CHD).

    Smoking status is the row variable and CHD is the column variable. Observed counts and row percentages will be displayed.

    Smoking status * Incident CHD Crosstabulation

                                                   Incident CHD
    Smoking status                                 no        yes       Total
    never      Count                               537       53        590
               % within Smoking status             91.0%     9.0%      100.0%
    former     Count                               257       36        293
               % within Smoking status             87.7%     12.3%     100.0%
    current    Count                               106       11        117
               % within Smoking status             90.6%     9.4%      100.0%
    Total      Count                               900       100       1000
               % within Smoking status             90.0%     10.0%     100.0%
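    The row percentages ("% within Smoking status") are each cell count divided by its row total. A small Python sketch using the counts from the crosstabulation above (the dictionary layout and names are illustrative only):

```python
# Observed counts: smoking status (rows) by incident CHD (columns).
table = {
    "never": {"no": 537, "yes": 53},
    "former": {"no": 257, "yes": 36},
    "current": {"no": 106, "yes": 11},
}

row_pct = {}
for status, cells in table.items():
    row_total = sum(cells.values())
    # Percentage of each cell within its own row.
    row_pct[status] = {chd: round(100 * n / row_total, 1) for chd, n in cells.items()}

# Column totals give the Total row of the table.
col_total = {chd: sum(cells[chd] for cells in table.values()) for chd in ("no", "yes")}

print(row_pct)
print(col_total)
```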


    Descriptive Statistics (& Histograms) for Numerical Variables. To produce descriptive statistics and histograms for numerical variables:

    1. Choose Analyze on the menu bar
    2. Choose Descriptive Statistics
    3. Choose Frequencies...
    4. Variable(s): To select the variables you want from the source list on the left, highlight a variable by pointing and clicking the mouse and then click on the arrow located in the middle of the window. Repeat the process until you have selected all the variables you want.
    5. Choose Display frequency tables to turn off the option. Note that the option is turned off when the little box is empty.
    6. Choose Statistics
    7. Choose summary measures (e.g., mean, median, standard deviation, minimum, maximum, skewness or kurtosis).
    8. Choose Continue
    9. Choose Charts (Skip to step 12 if you do not want histograms.)
    10. Choose Histogram(s)
    11. Choose Continue
    12. Choose OK

    An alternate way to produce only the descriptive statistics is to choose Descriptives... instead of Frequencies... at step 3, and then select the variables you want. By default SPSS computes the mean, standard deviation, minimum and maximum. Choose Options... to select other summary measures.

    Example: Descriptive summaries and histogram for the numerical variable age.

    Age is the variable to summarize. You can select more than one variable to analyze.

    Remember to turn off the Display frequency tables option.


    Summaries for Age

    Mean, standard deviation, minimum and maximum were selected under Statistics, and histogram was selected under Charts.

    Statistics

    Age
    N    Valid         1000
         Missing       0
    Mean               72.14
    Std. Deviation     5.275
    Minimum            65
    Maximum            90

    [Figure: Histogram of Age. x-axis: Age (60 to 95); y-axis: Frequency (0 to 120). Mean = 72.14, Std. Dev. = 5.275, N = 1,000]


    Descriptive Statistics (& Boxplots) by Groups for Numerical Variables. To produce descriptive statistics and boxplots by groups for numerical variables:

    1. Choose Analyze on the menu bar
    2. Choose Descriptive Statistics
    3. Choose Explore...
    4. Dependent List: To select the variables you want to summarize from the source list on the left, highlight a variable by pointing and clicking the mouse and then click on the arrow located next to the dependent list box. Repeat the process until you have selected all the variables you want.
    5. Factor List: To select the variables you want to use to define the groups from the source list on the left, highlight a variable by pointing and clicking the mouse and then click on the arrow located next to the factor list box.
    6. Choose Plots... (If you do not want boxplots, choose Statistics for the Display option and skip to Step 11.)
    7. Choose Factor levels together from the Boxplot box.
    8. Select the Stem-and-leaf option from the Descriptive box to turn off the option.
    9. Choose Continue
    10. Choose Both for the Display option
    11. Choose OK

    Example: Total cholesterol by family history of heart attack (yes or no).

    In this example total cholesterol is the dependent variable. You can select more than one variable.

    Summaries will be computed for each group defined by family history of heart attack.

    Both numerical summaries (statistics) and plots are selected.

    Under Statistics, Descriptives is usually selected by default. Select Percentiles if you want the 25th and 75th percentiles to report with the median.

    Under Plots, select the Boxplot option and unselect stem-and-leaf.


    Descriptives

    Total cholesterol, by Family history of heart attack

                                                             Statistic    Std. Error
    no     Mean                                              221.93       1.417
           95% Confidence Interval for Mean   Lower Bound    219.15
                                              Upper Bound    224.72
           5% Trimmed Mean                                   221.63
           Median                                            219.76
           Variance                                          1350.641
           Std. Deviation                                    36.751
           Minimum                                           111
           Maximum                                           363
           Range                                             252
           Interquartile Range                               49
           Skewness                                          .184         .094
           Kurtosis                                          .363         .188
    yes    Mean                                              220.53       2.150
           95% Confidence Interval for Mean   Lower Bound    216.30
                                              Upper Bound    224.76

    The Explore command by default produces a lot of different summaries, so you need to select what to report.

    All summaries are shown for all groups; the table has been cropped in this example.

    The interquartile range is reported as the difference between the 75th and 25th percentiles. Request percentiles (see prior page) to get the 25th and 75th percentiles.

    [Figure: Boxplot of Total Cholesterol by Family History of Heart Attack. x-axis: Family history of heart attack (no, yes); y-axis: Total cholesterol (100 to 400)]


    Using the Split File Option for Summaries by Groups for Categorical and Numerical Variables. The Split File option in SPSS is a convenient way to produce summaries, graphs, and run statistical procedures by groups. To activate the option:

    1. Choose Data on the menu bar of the Data Editor window
    2. Choose Split File
    3. Choose Compare groups or Organize output by groups. The two options display the output differently. Try each option to see which works best for your needs.
    4. Choose the variable that defines the groups.
    5. Choose OK

    Now, all the summaries, graphs, and statistical procedures you request will be done (automatically) for each group. To turn off this option:

    1. Choose Data on the menu bar of the Data Editor window
    2. Choose Split File
    3. Choose Analyze all cases, do not create groups
    4. Choose OK

    Example. Use the Split File option to run summaries by family history of heart attack (yes or no).

    The Compare groups option will try to display the results for each group side by side when feasible.

    The Organize output by groups option will display the results separately for each group, starting with the group with the lowest numerical code value.


    Using the Select Cases Option for Summaries for a Subgroup of Subjects/Observations. The Select Cases option in SPSS is a convenient way to produce summaries and run statistical procedures for a subgroup of subjects or to temporarily exclude subjects from the analysis. To activate this option:

    1. Choose Data on the menu bar of the Data Editor window
    2. Choose Select Cases
    3. Choose If condition is satisfied
    4. Choose If
    5. Enter the expression that indicates the subjects/observations you want to select.
    6. Choose Continue
    7. Choose OK

    Now, all the summaries, graphs, and statistical procedures you request will be done using only the selected subjects/observations. To turn off this option:

    1. Choose Data on the menu bar of the Data Editor window
    2. Choose Select Cases
    3. Choose All cases
    4. Choose OK

    Example: Select subjects not on lipid lowering medications (i.e., subjects with lipid = 0 indicating no medications).

    Select If condition is satisfied and then If.

    Caution! Usually you do not want to delete observations from your dataset, so do not select this option.

    Typical expressions will involve combinations of the following symbols:

    Symbol    Definition
    =         equal
    ~=        not equal
    >=        greater than or equal
    <=        less than or equal
    >         greater than
    <         less than
    &         and
    |         or
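    To see how such expressions pick out subjects, the selection logic can be mimicked on a few made-up records. This is a Python sketch, not SPSS syntax: the records and the id field are hypothetical, and only the variable names lipid and age come from the handout. Each comment shows the Select Cases expression the line imitates.

```python
# Hypothetical records for four subjects.
subjects = [
    {"id": 1, "lipid": 0, "age": 68},
    {"id": 2, "lipid": 1, "age": 74},
    {"id": 3, "lipid": 0, "age": 81},
    {"id": 4, "lipid": 1, "age": 66},
]

# Select Cases expression: lipid = 0
no_lipid_meds = [s["id"] for s in subjects if s["lipid"] == 0]

# Select Cases expression: lipid = 0 & age >= 70
no_meds_70_plus = [s["id"] for s in subjects if s["lipid"] == 0 and s["age"] >= 70]

# Select Cases expression: lipid ~= 0 | age < 70
on_meds_or_under_70 = [s["id"] for s in subjects if s["lipid"] != 0 or s["age"] < 70]

print(no_lipid_meds)        # [1, 3]
print(no_meds_70_plus)      # [3]
print(on_meds_or_under_70)  # [1, 2, 4]
```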


    Graphing Your Data

    You can produce very fancy figures and graphs in SPSS. Producing fancy figures and graphs is beyond the scope of this handout. Instructions on producing figures and graphs can be found in SPSS Help under Topics > Contents > Building Charts and Editing Charts, as well as in the SPSS Tutorials under Creating and Editing Charts. Note that you need Internet access for both the Help and the Tutorials. Also, the last time I tried doing a tutorial, it didn't work.

    This handout covers the basic commands for creating simple graphs using the Legacy Dialogs under Graphs, rather than the newer methods using the Chart Builder.

    Bar Charts

    The easiest way to produce simple bar charts is to use the Bar Chart option with the Frequencies... command. See Frequency Tables (& Bar Charts) for Categorical Variables. You can produce only one bar chart at a time using the Bar command.

    [Figures: two bar charts of Smoking status (never, former, current) with Percent on the y-axis (0.0% to 60.0%); in the second chart the bars are split by Family history of heart attack (yes, no)]

    1. Choose Graphs and then Legacy Dialogs from the menu bar.
    2. Choose Bar...
    3. Choose Simple, Clustered, or Stacked
    4. Choose what the data in the bar chart represent (e.g., summaries for groups of cases).
    5. Choose Define
    6. Select a variable from the variable list on the left and then click on the arrow next to the Category axis.
    7. Choose what the bars represent (e.g., number of cases or percentage of cases)
    8. Choose OK


    Histograms

    The easiest way to produce simple histograms is to use the Histogram option with the Frequencies... command. See Descriptive Statistics (& Histograms) for Numerical Variables. You can produce only one histogram at a time using the Histogram command.

    1. Choose Graphs and then Legacy Dialogs from the menu bar
    2. Choose Histogram...
    3. Select a variable from the variable list on the left and then click on the arrow in the middle of the window.
    4. Choose Display normal curve if you want a normal curve superimposed on the histogram.
    5. Choose OK

    [Figure: histogram of Body mass index. x-axis: Body mass index (10 to 50); y-axis: Frequency (0 to 120). Mean = 26.2366, Std. Dev. = 4.8667, N = 1,000]

    Boxplots

    The easiest way to produce simple boxplots is to use the Boxplot option with the Explore... command. See Descriptive Statistics (& Boxplots) By Groups for Numerical Variables. You can produce only one boxplot at a time using the Boxplot command.

    1. Choose Graphs and then Legacy Dialogs from the menu bar.
    2. Choose Boxplot...
    3. Choose Simple or Clustered
    4. Choose what the data in the boxplots represent (e.g., summaries for groups of cases).
    5. Choose Define
    6. Select a variable from the variable list on the left and then click on the arrow next to the Variable box.
    7. Select the variable from the variable list that defines the groups and then click on the arrow next to Category Axis.
    8. Choose OK

    [Figure: boxplots of Serum fasting glucose (0 to 400) by ADA diabetes status (normal, impaired fasting glucose, diabetic)]


    Normal Probability Plots. To produce Normal probability plots:

    1. Choose Analyze from the menu bar
    2. Choose Descriptive Statistics.
    3. Choose Q-Q Plots... to get a plot of the quantiles (Q-Q plot) or choose P-P Plots... to get a plot of the cumulative proportions (P-P plot)
    4. Select the variables from the source list on the left and then click on the arrow located in the middle of the window.
    5. Choose Normal as the Test Distribution. The Normal distribution is the default Test Distribution. Other Test Distributions can be selected by clicking on the down arrow and clicking on the desired Test Distribution.
    6. Choose OK

    SPSS will produce both a Normal probability plot and a detrended Normal probability plot for each selected variable. Usually the Q-Q plot is the most useful for assessing if the distribution of the variable is approximately Normal.

    [Figure: Normal Q-Q Plot of Serum fasting glucose. x-axis: Observed Value (-200 to 600); y-axis: Expected Normal Value (-50 to 250)]

    [Figure: Normal Q-Q Plot of Body mass index. x-axis: Observed Value (10 to 50); y-axis: Expected Normal Value (10 to 40)]


    Error Bar Plot. To produce an error bar plot of the mean of a numerical variable (or the means for different groups of subjects):

    1. Choose Graphs and then Legacy Dialogs from the menu bar.
    2. Choose Error Bar...
    3. Choose Simple or Clustered
    4. Choose what the data in the error bars represent (e.g., summaries for groups of cases).
    5. Choose Define
    6. Select a variable from the variable list on the left and then click on the arrow next to the Variable box.
    7. Select the variable from the variable list that defines the groups and then click on the arrow next to Category Axis.
    8. Select what the bars represent (e.g., confidence interval, standard deviation, standard error of the mean)
    9. Choose OK

    [Figure: Error Bar Plot. x-axis: ADA diabetes status (normal, impaired fasting glucose, diabetic); y-axis: Mean +- 2 SD Serum fasting glucose (50 to 300)]

    A bar chart of the mean with error bars can be made using the commands for making a bar chart:

    1. Choose Graphs and then Legacy Dialogs from the menu bar.
    2. Choose Bar...
    3. Choose Simple
    4. Choose Summaries for groups of cases
    5. Choose Define
    6. Select a variable from the variable list on the left and then click on the arrow next to the Category axis (e.g., diabetes status)
    7. Choose Other statistic (e.g., mean). By default the mean will be selected.
    8. Choose the variable for which you want to display the mean (or other statistic) for the Variable box.
    9. Choose Options
    10. Select Display error bars
    11. Select Standard deviation, and enter 2 for the Multiplier
    12. Choose Continue
    13. Choose OK

    [Figure: bar chart of Mean Serum fasting glucose (0 to 300) by ADA diabetes status (normal, impaired fasting glucose, diabetic). Error bars: +/- 2 SD]


    Scatter Plot. To produce a scatter plot between two numerical variables:

    1. Choose Graphs and then Legacy Dialogs on the menu bar.
    2. Choose Scatter/Dot...
    3. Choose Simple
    4. Choose Define
    5. Y Axis: Select the y variable you want from the source list on the left and then click on the arrow next to the y axis box.
    6. X Axis: Select the x variable you want from the source list on the left and then click on the arrow next to the x axis box.
    7. Choose Titles...
    8. Enter a title for the plot (e.g., y vs. x).
    9. Choose Continue
    10. Choose OK

    [Figure: scatter plot titled "HDL cholesterol vs BMI". x-axis: Body mass index (10 to 50); y-axis: HDL cholesterol (0 to 140)]

    Adding a linear regression line to a scatter plot. To add a linear regression (least-squares) line to a scatter plot of two numerical variables:

    1. While in the Viewer window double click on the scatter plot. The scatter plot should now be displayed in a window titled Chart Editor.
    2. Choose Elements.
    3. Choose Fit Line at Total. (A line should be added to the plot, because the next 2 steps are the default options.)
    4. Choose Linear (in the Properties window)
    5. Choose Apply
    6. Choose Close

    Additional options:
    o Choose Mean under Confidence Intervals (in the Properties window) to add a confidence interval for the linear regression line to the scatter plot, or
    o Choose Individual under Confidence Intervals to add a prediction interval for individual observations to the scatter plot.

    7. Click on the "X" in the upper right hand corner of the Chart Editor window, or choose File and then Close to return to the Viewer window.

    [Figure: scatter plot titled "HDL cholesterol vs BMI" with the fitted regression line. x-axis: Body mass index (10 to 50); y-axis: HDL cholesterol (0 to 140). R Sq Linear = 0.121]


    Adding a Loess (scatter plot) smooth to a scatter plot. To add a Loess smooth to a scatter plot of two numerical variables:

    1. While in the Viewer window double click on the scatter plot. The scatter plot should now be displayed in a window titled Chart Editor.
    2. Choose Elements.
    3. Choose Fit Line at Total. The next two steps (4. & 5.) may be already selected.
    4. Choose Loess (in the Properties window). Default options for % of points to fit (50%) and kernel (Epanechnikov) are usually appropriate options.
    5. Choose Apply (in the Properties window).
    6. Choose Close
    7. Click on the "X" in the upper right hand corner of the Chart Editor window, or choose File and then Close to return to the Viewer.

    [Figure: scatter plot titled "HDL cholesterol vs BMI" with a Loess smooth. x-axis: Body mass index (10 to 50); y-axis: HDL cholesterol (0 to 140)]

    Stem-and-leaf Plot. To produce a stem-and-leaf plot:

    1. Choose Analyze on the menu bar
    2. Choose Descriptive Statistics
    3. Choose Explore...
    4. Dependent List: To select the variables you want from the source list on the left, highlight a variable by pointing and clicking the mouse and then click on the arrow located next to the dependent list box. Repeat the process until you have selected all the variables you want.
    5. Choose Plots...
    6. Choose Stem-and-leaf from the Descriptive box. Note the option may already be selected if the little box is not empty.
    7. Choose None from the Boxplot box
    8. Choose Continue
    9. Choose Plots for the Display option
    10. Choose OK

    Severity of Illness Index Stem-and-Leaf Plot

    Frequency    Stem &  Leaf

         2.00       4 .  34
         7.00       4 .  6688899
        10.00       5 .  0001112344
         3.00       5 .  568
         1.00 Extremes   (>=62)

    Stem width:  10.00
    Each leaf:   1 case(s)


    Hypothesis Tests & Confidence Intervals

    One-Sample t Test

    1. Choose Analyze from the menu bar.
    2. Choose Compare Means
    3. Choose One-Sample T Test...
    4. Test Variable(s): Select the variable you want from the source list on the left, highlight variables by pointing and clicking the mouse and then click on the arrow located in the middle of the window.
    5. Edit the Test Value. The Test Value is the value of the mean under the null hypothesis. The default value is zero.
    6. Choose OK

    Confidence Interval for a Mean (from one sample of data)

    1. Choose Analyze from the menu bar.
    2. Choose Compare Means
    3. Choose One-Sample T Test...
    4. Test Variable(s): Select the variable you want from the source list on the left, highlight variables by pointing and clicking the mouse and then click on the arrow located in the middle of the window.
    5. The Test Value should be 0, which is the default value.
    6. By default a 95% confidence interval will be computed. Choose Options to change the confidence level.
    7. Choose OK

    SIDS Example. There were 48 SIDS cases in King County, Washington, during the years 1974 and 1975. The birth weights (in grams) of these 48 cases were:

    2466 3941 2807 3118 2098 3175
    3317 3742 3062 3033 2353 3515
    2013 3515 3260 2892 1616 4423
    2750 2807 2807 3005 3374 3572
    2722 2495 3459 3374 1984 2495
    3005 2608 2353 4394 3232 3062
    2013 2551 2977 3118 2637 1503
    2722 2863 2013 3232 2863 2438

    The mean (and standard deviation) of these measurements is 2891 (623) grams.

    We want to know if the mean birth weight in the population of SIDS infants is different from that of normal children, 3300 grams. We could construct a 95% confidence interval, to see if the interval contains the value of 3300 grams, or we could perform a one sample t test to test if the mean in the SIDS population is equal to 3300 (versus not equal to 3300).


    To construct a 95% confidence interval:

    When computing the interval for a mean make sure the Test Value is 0.

    One-Sample Statistics

                    N     Mean         Std. Deviation   Std. Error Mean
    birth weight    48    2891.1250    623.39177        89.97885

    Number of subjects, mean, standard deviation, and standard error of the mean.

    One-Sample Test (Test Value = 0)

                                                                        95% Confidence Interval
                                                                        of the Difference
                    t         df    Sig. (2-tailed)   Mean Difference   Lower        Upper
    birth weight    32.131    47    .000              2891.12500        2710.1109    3072.1391

    Ignore the t test results (t, df, sig.) because these results are for testing if the mean birth weight is equal to 0 (versus not equal to zero).

    The 95% confidence interval for the mean birth weight is 2710 to 3072 grams.
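    The numbers in the output above can be reproduced by hand from the 48 birth weights. The Python sketch below is only an illustration of the arithmetic (mean, standard deviation, standard error, and the interval mean +/- t x standard error); the critical value 2.0117 for 47 degrees of freedom is taken from a t table and typed in as an assumption, since the Python standard library has no t distribution.

```python
import math
import statistics

# Birth weights (grams) of the 48 SIDS cases listed in the example.
birth_weights = [
    2466, 3941, 2807, 3118, 2098, 3175,
    3317, 3742, 3062, 3033, 2353, 3515,
    2013, 3515, 3260, 2892, 1616, 4423,
    2750, 2807, 2807, 3005, 3374, 3572,
    2722, 2495, 3459, 3374, 1984, 2495,
    3005, 2608, 2353, 4394, 3232, 3062,
    2013, 2551, 2977, 3118, 2637, 1503,
    2722, 2863, 2013, 3232, 2863, 2438,
]

n = len(birth_weights)
mean = statistics.mean(birth_weights)
sd = statistics.stdev(birth_weights)   # sample standard deviation (n - 1 divisor)
se = sd / math.sqrt(n)                 # standard error of the mean

T_CRIT = 2.0117  # assumed: two-sided 95% critical value of t with 47 df, from a t table

lower = mean - T_CRIT * se
upper = mean + T_CRIT * se

print(n, round(mean, 3), round(sd, 3), round(se, 3))
print(round(lower, 1), round(upper, 1))
```

    The interval agrees with the Lower and Upper values in the One-Sample Test table, and it does not contain 3300 grams.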


    To perform a one sample t test to test if the mean in the SIDS population is equal to 3300 versus not equal to 3300:

    To run the one-sample t test to test if the mean birth weight is equal to 3300 you need to change the Test Value from the default value of 0 to 3300.

    One-Sample Statistics

                    N     Mean         Std. Deviation   Std. Error Mean
    birth weight    48    2891.1250    623.39177        89.97885

    One-Sample Test (Test Value = 3300)

                                                                        95% Confidence Interval
                                                                        of the Difference
                    t         df    Sig. (2-tailed)   Mean Difference   Lower        Upper
    birth weight    -4.544    47    .000              -408.87500        -589.8891    -227.8609

    Ignore the results for the 95% confidence interval of the difference, because it is the confidence interval for the mean minus 3300.

    Sig. (2-tailed) = two-tailed p-value. The table displays .000, which means p < 0.001.
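    The t statistic in the table is the mean difference divided by the standard error of the mean. A short Python sketch of that arithmetic, using the summary values from the One-Sample Statistics table (the variable names are illustrative only; the p-value is not computed here because the Python standard library has no t distribution):

```python
import math

# Summary values from the One-Sample Statistics table.
n = 48
mean = 2891.125
sd = 623.39177

test_value = 3300  # mean under the null hypothesis

se = sd / math.sqrt(n)           # standard error of the mean
mean_difference = mean - test_value
t = mean_difference / se         # one-sample t statistic
df = n - 1                       # degrees of freedom

print(round(mean_difference, 3), round(t, 3), df)
```

    Adding the Test Value back to the interval for the difference recovers the interval for the mean: -589.8891 + 3300 = 2710.1109 and -227.8609 + 3300 = 3072.1391.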


    Paired t Test

    1. Choose Analyze from the menu bar.
    2. Choose Compare Means
    3. Choose Paired-Samples T Test...
    4. Paired Variable(s): Select two paired variables you want from the source list on the left, and then click on the arrow in the middle of the window. The order in which you select the two variables will determine how the difference is computed. Repeat the process until you have selected all the paired variables you want to test.
    5. Choose OK

    Confidence Interval for the Difference Between Means from Paired Samples

    By default a 95% confidence interval for the difference between the means of the paired samples will be computed when performing a paired t test. Choose Options to change the confidence level.

    Prozac Example. To compare the effect of Prozac on anxiety, 10 subjects are given one week of treatment with Prozac and one week of treatment with a placebo. The order of the treatments was randomized for each subject. An anxiety questionnaire was used to measure a subject's anxiety on a scale of 0 to 30. Higher scores indicate more anxiety.

    Subject    Placebo    Prozac    Difference
    1          22         19        3
    2          18         11        7
    3          17         14        3
    4          19         17        2
    5          22         23        -1
    6          12         11        1
    7          14         15        -1
    8          11         19        -8
    9          19         11        8
    10         7          8         -1

    Mean difference, d-bar = 1.3. Standard deviation of the differences, s_d = 4.5.


    Paired t test and confidence interval for the difference between paired means.

    Paired Samples Statistics

                       Mean       N     Std. Deviation   Std. Error Mean
    Pair 1   placebo   16.1000    10    4.95424          1.56667
             prozac    14.8000    10    4.68568          1.48174

    Summaries for each sample of data (or variable).

    Paired Samples Correlations

                                N     Correlation   Sig.
    Pair 1   placebo & prozac   10    .556          .095

    Correlation between the paired values - usually not useful.

    Paired Samples Test

                                Paired Differences
                                                                       95% Confidence Interval
                                                                       of the Difference
                                Mean      Std. Deviation   Std. Error Mean   Lower       Upper      t       df    Sig. (2-tailed)
    Pair 1   placebo - prozac   1.30000   4.54728          1.43798           -1.95293    4.55293    .904    9     .390

    difference = placebo - prozac
    mean difference = 1.3
    standard deviation of the differences = 4.5
    standard error of the mean difference = 1.4
    95% confidence interval for the mean difference is -2.0 to 4.6

    Paired t test
    t = test statistic value = .904
    df = degrees of freedom
    Sig. (2-tailed) = two-sided p-value = 0.39

    The order of the variables in calculating the difference is determined by the order in which you selected the variables. The difference will be computed as Variable 1 - Variable 2.
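    A paired t test is a one-sample t test on the within-subject differences, which can be checked by hand with the Prozac data. The Python sketch below only illustrates that arithmetic; the critical value 2.262 for 9 degrees of freedom is taken from a t table and typed in as an assumption, and the variable names are illustrative.

```python
import math
import statistics

# Anxiety scores for the 10 subjects in the Prozac example.
placebo = [22, 18, 17, 19, 22, 12, 14, 11, 19, 7]
prozac = [19, 11, 14, 17, 23, 11, 15, 19, 11, 8]

# Difference = Variable 1 - Variable 2 = placebo - prozac.
diffs = [p - z for p, z in zip(placebo, prozac)]

n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)     # standard deviation of the differences
se_d = sd_d / math.sqrt(n)         # standard error of the mean difference
t = mean_d / se_d                  # paired t statistic, tests mean difference = 0
df = n - 1

T_CRIT = 2.262  # assumed: two-sided 95% critical value of t with 9 df, from a t table

lower = mean_d - T_CRIT * se_d
upper = mean_d + T_CRIT * se_d

print(diffs)
print(round(mean_d, 2), round(sd_d, 3), round(se_d, 3), round(t, 3), df)
print(round(lower, 2), round(upper, 2))
```

    Because the interval contains 0, it is consistent with the non-significant test result (Sig. = .390).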

    Two-Sample t Test

    1. Choose Analyze on the menu bar.
    2. Choose Compare Means
    3. Choose Independent-Samples T Test...
    4. Test Variable(s): Select the test variable you want from the source list on the left and then click on the arrow located next to the test variable box. Repeat the process until you have selected all the variables you want.
    5. Grouping Variable: Select the variable which defines the groups and then click on the arrow located next to the grouping variable box.
    6. Choose Define Groups...
    7. Click on the blank box next to Group 1, then enter the code value (numeric or character/string) for group 1.
    8. Click on the blank box next to Group 2, then enter the code value (numeric or character/string) for group 2.
    9. Choose Continue
    10. Choose OK

    Confidence Interval for the Difference Between Means from Independent Samples

    By default a 95% confidence interval for the difference between the means of two independent samples will be computed when performing a two sample t test. Choose Options to change the confidence level.

    Model Cities Example. Two groups of people were studied - those who had been randomly allocated to a Fee-For-Service medical insurance group and those who had been randomly allocated to a Prepaid insurance group.

    We would like to compare the two groups on the quality of health care they received in each group, but first we would like to know how comparable the groups are on other characteristics that might affect medical outcome. For example, we would like to know if the mean age in the two groups is similar. Hopefully, the process of random allocation minimizes this possibility, but there is always a chance that it didn't.

    Group                   n      Mean   Standard deviation
    Prepaid (GHC)           1167   24.0   15.3
    Fee-for-service (KCM)   3207   26.4   17.1

    We could compare the average age between the two groups using a two sample t test or a confidence interval for the difference between the average ages of the two groups.

    Two sample t test and 95% confidence interval for the difference between means (from independent samples).

    T-Test

    Group Statistics

          prov   N      Mean      Std. Deviation   Std. Error Mean
    age   GHC    1167   23.9846   15.30787         .44810
          KCM    3207   26.3676   17.10260         .30200

    Independent Samples Test

    Levene's Test for Equality of Variances

                                        F        Sig.
    age   Equal variances assumed       47.068   .000
          Equal variances not assumed

    After you select the Grouping Variable, SPSS will put in question marks to prompt you to define the code values for the two groups. Select Define Groups to enter the code values.

    In this example the group codes are numeric, 0 (for GHC) and 1 (for KCM).

    Summaries for each sample/group.

    SPSS by default tests if the variances are equal using Levene's test. A small p-value (sig.) indicates the variances may be different.

    sig. = p-value =

    Independent Samples Test

    t-test for Equality of Means

                                        t        df         Sig. (2-tailed)   Mean Difference   Std. Error Difference
    age   Equal variances assumed       -4.188   4372       .000              -2.38306          .56896
          Equal variances not assumed   -4.410   2293.698   .000              -2.38306          .54037

    Independent Samples Test

    95% Confidence Interval of the Difference

                                        Lower      Upper
    age   Equal variances assumed       -3.49851   -1.26760
          Equal variances not assumed   -3.44273   -1.32338

    Two Sample t test. SPSS by default always performs both versions of the two sample t test: one assuming equal variances and one assuming unequal variances.

    Sig. (2 tailed) = two sided p-value =
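
Both rows of the Independent Samples Test table can be reproduced from the Group Statistics table alone. The sketch below is an illustration in Python, not SPSS code: it computes the pooled-variance t statistic ("equal variances assumed") and the separate-variance (Welch) t statistic with Satterthwaite degrees of freedom ("equal variances not assumed"). All variable names are hypothetical.

```python
import math

# Summary statistics copied from the Group Statistics table.
n1, mean1, sd1 = 1167, 23.9846, 15.30787   # GHC (prepaid)
n2, mean2, sd2 = 3207, 26.3676, 17.10260   # KCM (fee-for-service)

diff = mean1 - mean2

# Equal variances assumed: pool the two sample variances.
df_pooled = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df_pooled
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = diff / se_pooled

# Equal variances not assumed: separate variances, Satterthwaite df.
v1, v2 = sd1**2 / n1, sd2**2 / n2
se_welch = math.sqrt(v1 + v2)
t_welch = diff / se_welch
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"equal variances assumed:     t = {t_pooled:.3f}, df = {df_pooled}, SE = {se_pooled:.5f}")
print(f"equal variances not assumed: t = {t_welch:.3f}, df = {df_welch:.1f}, SE = {se_welch:.5f}")
```

Small differences in the last decimal place relative to SPSS are expected, because the inputs here are the rounded values printed in the table.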

    Sign Test and Wilcoxon Signed-Rank Test

    1. Choose Analyze from the menu bar.
    2. Choose Nonparametric Tests
    3. Choose Legacy Dialogs
    4. Choose 2 Related Samples...
    5. Test Pair(s) List: Select the two paired variables you want from the source list on the left, and then click on the arrow in the middle of the window. The order in which you select the two variables will determine how the difference is computed. Repeat the process until you have selected all the paired variables you want to test.
    6. Choose Sign and/or Wilcoxon as the Test Type.
    7. Choose OK

    Aspirin Example. To compare 2 types of Aspirin, A and B, 1-hour urine samples were collected from 10 people after each had taken either A or B. A week later the same routine was followed after giving the other type to the same 10 people.

    Person               Type A   Type B   Difference
    1                    15       13        2
    2                    26       20        6
    3                    13       10        3
    4                    28       21        7
    5                    17       17        0
    6                    20       22       -2
    7                     7        5        2
    8                    36       30        6
    9                    12        7        5
    10                   18       11        7

    Mean                 19.2     15.6     3.6 = d̄
    Standard deviation   8.63     7.78     3.098 = s_d

    A Sign test or Wilcoxon Signed Rank test could be used to compare the two types of Aspirin.

    Descriptive Statistics

                                                                  Percentiles
               N    Mean      Std. Deviation   Minimum   Maximum   25th      50th (Median)   75th
    AspirinA   10   19.2000   8.62554          7.00      36.00     12.7500   17.5000         26.500
    AspirinB   10   15.6000   7.77746          5.00      30.00     9.2500    15.0000         21.250

    Sign Test

    Frequencies

                                                     N
    AspirinB - AspirinA   Negative Differences(a)    8
                          Positive Differences(b)    1
                          Ties(c)                    1
                          Total                      10

    a AspirinB < AspirinA
    b AspirinB > AspirinA
    c AspirinB = AspirinA

    Test Statistics(b)

                            AspirinB - AspirinA
    Exact Sig. (2-tailed)   .039(a)

    a Binomial distribution used.
    b Sign Test

    The order of the variables in calculating the difference is determined by the order in which you selected the variables. The difference will be computed as Variable 2 - Variable 1 (which is the opposite of the paired t test).

    Select Wilcoxon or Sign (or both).

    Under Options you can select the summaries Descriptive (n, mean, etc.) and Quartiles (median, 25th and 75th percentile).

    Sign Test

    Exact sig. (2-tailed) = exact, two-sided p-value = 0.039

    The p-value is exact because it is computed using the Binomial distribution instead of using a Normal approximation. (Note that the exact p-value is reported only for small sample sizes.)
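
The exact p-value of the sign test follows directly from the counts of negative and positive differences. The sketch below, written in Python for illustration only, drops the tie, treats the remaining signs as Binomial(n, 1/2) under the null hypothesis, and doubles the smaller tail. Variable names are hypothetical.

```python
from math import comb

# Urine measurements from the Aspirin example.
aspirin_a = [15, 26, 13, 28, 17, 20, 7, 36, 12, 18]
aspirin_b = [13, 20, 10, 21, 17, 22, 5, 30, 7, 11]

# For this test SPSS computes Variable 2 - Variable 1, i.e. AspirinB - AspirinA.
diffs = [b - a for a, b in zip(aspirin_a, aspirin_b)]

negative = sum(d < 0 for d in diffs)
positive = sum(d > 0 for d in diffs)
ties = sum(d == 0 for d in diffs)

# Ties are dropped; under H0 each remaining sign is + or - with probability 1/2.
n = negative + positive
k = min(negative, positive)
one_tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
p_two_sided = min(1.0, 2 * one_tail)

print(f"negative = {negative}, positive = {positive}, ties = {ties}")
print(f"exact two-sided p-value = {p_two_sided:.3f}")
```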

    Wilcoxon Signed Ranks Test

    Ranks

                                           N       Mean Rank   Sum of Ranks
    aspirinb - aspirina   Negative Ranks   8(a)    5.38        43.00
                          Positive Ranks   1(b)    2.00        2.00
                          Ties             1(c)
                          Total            10

    a aspirinb < aspirina
    b aspirinb > aspirina
    c aspirinb = aspirina

    Test Statistics(b)

                             aspirinb - aspirina
    Z                        -2.442(a)
    Asymp. Sig. (2-tailed)   .015

    a Based on positive ranks.
    b Wilcoxon Signed Ranks Test

    Wilcoxon Signed Rank Test

    Asymp. Sig. (2-tailed) = two sided p-value = 0.015

    Asymp. is an abbreviation for asymptotic, which means the p-value is computed using a large sample approximation based on the Normal distribution.

    Information used in the test statistic - not usually reported; use the previous descriptives.
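
The rank sums and the Z statistic in the Wilcoxon output can be recomputed from the Aspirin data. The following Python sketch is an illustration under the usual textbook formulas (average ranks for ties, tie-corrected variance, no continuity correction); it is assumed, not confirmed by the handout, that SPSS uses the same formulas, although the results agree with the output above.

```python
import math
from collections import Counter
from statistics import NormalDist

aspirin_a = [15, 26, 13, 28, 17, 20, 7, 36, 12, 18]
aspirin_b = [13, 20, 10, 21, 17, 22, 5, 30, 7, 11]

# Differences AspirinB - AspirinA, with zero differences (ties) dropped.
diffs = [b - a for a, b in zip(aspirin_a, aspirin_b) if b != a]
abs_sorted = sorted(abs(d) for d in diffs)

def average_rank(value):
    """Rank of an absolute difference, averaging the ranks of tied values."""
    first = abs_sorted.index(value) + 1      # 1-based rank of the first occurrence
    count = abs_sorted.count(value)
    return first + (count - 1) / 2

positive_sum = sum(average_rank(abs(d)) for d in diffs if d > 0)
negative_sum = sum(average_rank(abs(d)) for d in diffs if d < 0)

n = len(diffs)
mean_w = n * (n + 1) / 4
tie_term = sum(t**3 - t for t in Counter(abs_sorted).values()) / 48
sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24 - tie_term)

# Z is based on the smaller rank sum (here the positive ranks).
z = (min(positive_sum, negative_sum) - mean_w) / sd_w
p_two_sided = 2 * NormalDist().cdf(-abs(z))

print(f"sum of positive ranks = {positive_sum}, sum of negative ranks = {negative_sum}")
print(f"Z = {z:.3f}, asymptotic two-sided p-value = {p_two_sided:.3f}")
```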

    Mann-Whitney U Test (or Wilcoxon Rank Sum Test)

    1. Choose Analyze on the menu bar.
    2. Choose Nonparametric Tests
    3. Choose Legacy Dialogs
    4. Choose 2 Independent Samples...
    5. Test Variable(s): Select the test variable you want from the source list on the left and then click on the arrow located next to the test variable box. Repeat the process until you have selected all the variables you want.
    6. Grouping Variable: Select the variable which defines the grouping and then click on the arrow located next to the grouping variable box. The grouping variable must be numeric for the variable to appear on the left hand side.
    7. Choose Define Groups...
    8. Click on the blank box next to group 1, then enter the code value (it must be numeric) for group 1.
    9. Click on the blank box next to group 2, then enter the code value (it must be numeric) for group 2.
    10. Choose Continue to return to the Two Independent Samples dialog box.
    11. Choose Mann-Whitney U as the Test Type. Note that the option may already be selected if the little box is not empty.
    12. Choose OK

    Legionnaires Example. During July and August, 1976, a large number of Legionnaires attending a convention died of a mysterious and unknown cause. Chen et al. (1977) examined the hypothesis of nickel contamination as a toxin. They examined the nickel levels in the lungs of nine cases and nine controls. There was no attempt to match cases and controls. The data are as follows (μg/100g dry weight):

    Legionnaire cases   65   24   52   86   120   82   399   87   139
    Controls            12   10   31    6     5    5    29    9    12

    The Mann Whitney U test could be used to compare the two groups.

    After you select the Grouping Variable, SPSS will put in question marks to prompt you to define the code values for the two groups. Select Define Groups to enter the code values.

    Note: The codes must be numeric, otherwise the grouping variable will not appear on the left hand side.

    Mann-Whitney Test

    Ranks

             group   N    Mean Rank   Sum of Ranks
    Nickel   1       9    13.78       124.00
             2       9    5.22        47.00
             Total   18

    Test Statistics(b)

                                     nickel
    Mann-Whitney U                   2.000
    Wilcoxon W                       47.000
    Z                                -3.403
    Asymp. Sig. (2-tailed)           .001
    Exact Sig. [2*(1-tailed Sig.)]   .000(a)

    a Not corrected for ties.
    b Grouping Variable: group

    In this example the group codes are 1 for legionnaires and 2 for controls.

    Information used in the test statistic - not usually reported. The descriptives under Options are not useful; you can produce relevant descriptives (e.g. median and interquartile range for each group) using the Explore command.

    Mann Whitney test

    Asymp. Sig. (2-tailed) = two-sided p-value = 0.001

    This p-value is computed based on a large sample Normal approximation, and it corrects for ties in the data, if present.

    Exact Sig. [2*(1-tailed Sig.)] = two-sided p-value =
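
The rank sums, U, and the tie-corrected Z in the Mann-Whitney output can be recomputed from the nickel data. The sketch below is a Python illustration using the standard textbook formulas (average ranks for ties, tie-corrected variance, no continuity correction); it is assumed, not confirmed by the handout, that SPSS uses exactly these formulas, although the results agree with the output above.

```python
import math
from collections import Counter
from statistics import NormalDist

# Nickel levels from the Legionnaires example.
cases = [65, 24, 52, 86, 120, 82, 399, 87, 139]
controls = [12, 10, 31, 6, 5, 5, 29, 9, 12]

pooled = sorted(cases + controls)

def average_rank(value):
    """Rank within the pooled sample, averaging the ranks of tied values."""
    first = pooled.index(value) + 1          # 1-based rank of the first occurrence
    return first + (pooled.count(value) - 1) / 2

rank_sum_cases = sum(average_rank(x) for x in cases)
rank_sum_controls = sum(average_rank(x) for x in controls)

n1, n2 = len(cases), len(controls)
total = n1 + n2

# U for each group; the reported U is the smaller of the two.
u_cases = rank_sum_cases - n1 * (n1 + 1) / 2
u_controls = rank_sum_controls - n2 * (n2 + 1) / 2
u = min(u_cases, u_controls)

# Normal approximation with the variance corrected for ties.
tie_term = sum(t**3 - t for t in Counter(pooled).values()) / (total * (total - 1))
sd_u = math.sqrt(n1 * n2 / 12 * ((total + 1) - tie_term))
z = (u - n1 * n2 / 2) / sd_u
p_two_sided = 2 * NormalDist().cdf(-abs(z))

print(f"rank sums: cases = {rank_sum_cases}, controls = {rank_sum_controls}")
print(f"U = {u}, Z = {z:.3f}, asymptotic two-sided p-value = {p_two_sided:.4f}")
```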

    One-way ANOVA (Analysis of Variance) (E.g., to compare two or more means from two or more independent samples)

    1. Choose Analyze on the menu bar
    2. Choose Compare Means
    3. Choose One-Way ANOVA...
    4. Dependent: Select the variable from the source list on the left that you want to use to compare the groups and then click on the arrow next to the dependent variable box. You can run multiple one-way ANOVAs by selecting more than one dependent variable.
    5. Factor: Select the variable from the source list on the left which defines the groups.
    6. Choose OK

    To perform pairwise comparisons to determine which groups are different while controlling for multiple testing, use the Post Hoc... option. There are many methods to choose from (e.g., Bonferroni and R-E-G-W-Q).

    Other useful options can be found under Options... For example, choose Descriptive to get descriptive statistics for each group (e.g., mean, standard deviation, minimum value, and maximum value). Choose Homogeneity-of-variance to perform Levene's Test of whether the group variances are all equal versus not all equal. A small p-value for Levene's Test may indicate that the variances are not all equal.

    CHD Example. We can use one-way ANOVA to compare HDL levels between subjects with different hypertensive status (0=normotensive, 1=borderline, 2=definite).

    Hypertensive Group   n      Mean   Standard Deviation
    Normotensive         1568   55.8   15.5
    Borderline           547    55.7   16.2
    Definite             1310   53.5   15.2

    You can select 1 or more variables to compare between groups.

    The variable selected as the Factor defines the groups. The variable can be numeric or character/string.

    Oneway

    ANOVA

    HDL cholesterol

                     Sum of Squares   df     Mean Square   F       Sig.
    Between Groups   4344.834         2      2172.417      9.045   .000
    Within Groups    821904.577       3422   240.183
    Total            826249.411       3424

    Descriptives

    HDL cholesterol

                                                                95% Confidence Interval for Mean
                   N      Mean    Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
    normotensive   1568   55.82   15.500           .391         55.05         56.59         21        138
    borderline     547    55.67   16.202           .693         54.30         57.03         24        149
    definite       1310   53.47   15.192           .420         52.64         54.29         15        129
    Total          3425   54.90   15.534           .265         54.38         55.42         15        149

    One-way analysis of variance

    Sig. = p-value =
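
The F statistic in the ANOVA table can be approximately recovered from the Descriptives table. The sketch below is a Python illustration, not SPSS code: it builds the between-group and within-group sums of squares from each group's n, mean, and standard deviation. Because the inputs are the rounded values printed in the table, the result differs slightly from the SPSS value of 9.045.

```python
# (n, mean, standard deviation) per group, copied from the Descriptives table.
groups = {
    "normotensive": (1568, 55.82, 15.500),
    "borderline": (547, 55.67, 16.202),
    "definite": (1310, 53.47, 15.192),
}

total_n = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / total_n

# Between groups: spread of the group means around the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within groups: pooled spread of observations around their own group mean.
ss_within = sum((n - 1) * sd**2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"SS between = {ss_between:.1f} (df = {df_between})")
print(f"SS within  = {ss_within:.1f} (df = {df_within})")
print(f"F = {f_stat:.2f}")
```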

    Post Hoc Tests

    Under Post Hoc you can request that further comparisons be done between each possible pair of groups to determine which groups are different from each other. These are multiple comparison procedures, which control for the number of tests/comparisons being performed. There are many methods to choose from; below are examples of the Bonferroni method and the Ryan-Einot-Gabriel-Welsch method.

    Multiple Comparisons

    Dependent Variable: HDL cholesterol

                 (I) Hypertension   (J) Hypertension   Mean Difference                        95% Confidence Interval
                 status             status             (I-J)             Std. Error   Sig.    Lower Bound   Upper Bound
    Bonferroni   normotensive       borderline         .157              .770         1.000   -1.69         2.00
                                    definite           2.356(*)          .580         .000    .97           3.74
                 borderline         normotensive       -.157             .770         1.000   -2.00         1.69
                                    definite           2.198(*)          .789         .016    .31           4.09
                 definite           normotensive       -2.356(*)         .580         .000    -3.74         -.97
                                    borderline         -2.198(*)         .789         .016    -4.09         -.31

    * The mean difference is significant at the .05 level.

    The Bonferroni method is a method that shows all pairwise comparisons/differences along with a p-value (sig.) adjusted for the number of comparisons. In this example, subjects with normal blood pressure and borderline hypertension have similar HDL cholesterol levels, but subjects with definite hypertension have different HDL cholesterol levels than both subjects with normal blood pressure and borderline hypertension.
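
The Bonferroni adjustment itself is simple: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The Python sketch below illustrates this using the mean differences and standard errors from the Multiple Comparisons table. As an assumption made for this sketch, the unadjusted p-values use a Normal approximation to the t distribution, which is reasonable here only because the within-groups degrees of freedom (3422) are very large.

```python
from statistics import NormalDist

# (group I, group J): (mean difference I - J, standard error), from the table.
comparisons = {
    ("normotensive", "borderline"): (0.157, 0.770),
    ("normotensive", "definite"): (2.356, 0.580),
    ("borderline", "definite"): (2.198, 0.789),
}

n_comparisons = len(comparisons)
adjusted = {}
for pair, (difference, std_error) in comparisons.items():
    z = difference / std_error
    # Assumption: Normal approximation to the t distribution with 3422 df.
    raw_p = 2 * NormalDist().cdf(-abs(z))
    # Bonferroni: multiply by the number of comparisons, cap at 1.
    adjusted[pair] = min(1.0, n_comparisons * raw_p)

for (group_i, group_j), p in adjusted.items():
    print(f"{group_i} vs {group_j}: adjusted p = {p:.3f}")
```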

    Homogeneous Subsets

    HDL cholesterol

                                                                  Subset for alpha = .05
                                      Hypertension status   N      1       2
    Ryan-Einot-Gabriel-Welsch Range   definite              1310   53.47
                                      borderline            547            55.67
                                      normotensive          1568           55.82
                                      Sig.                         1.000   .867

    Means for groups in homogeneous subsets are displayed.

    The Ryan-Einot-Gabriel-Welsch (R-E-G-W-Q) method places groups that are similar in the same subset and groups that are different in different subsets. In this example, subjects with normal blood pressure and borderline hypertension are in one subset and subjects with definite hypertension are in a different subset. Hence, subjects with definite hypertension have different HDL cholesterol levels than subjects with normal blood pressure and borderline hypertension, but subjects with normal blood pressure and borderline hypertension have similar HDL cholesterol levels.

    Kruskal-Wallis Test

    1. Choose Analyze on the menu bar.
    2. Choose Nonparametric Tests
    3. Choose Legacy Dialogs
    4. Choose K Independent Samples...
    5. Test Variable(s): Select the test variable you want from the source list on the left and then click on the arrow located next to the test variable box. Repeat the process until you have selected all the variables you want to test.
    6. Grouping Variable: Select the variable which defines the grouping and then click on the arrow located next to the grouping variable box.
    7. Choose Define Range...
    8. Click on the blank box next to Minimum, then enter the smallest numeric code value for the groups.
    9. Click on the blank box next to Maximum, then enter the largest numeric code value for the groups.
    10. Choose Continue
    11. Choose Kruskal-Wallis H as the Test Type. Note that the option may already be selected if the little box is not empty.
    12. Choose OK

    CAUTION: The group variable must be numeric and you must correctly enter the smallest numeric code value and the largest numeric code value. SPSS will not allow you to select a character/string variable as the grouping variable, but it will allow you to enter the numeric code values incorrectly. The results displayed for the Kruskal Wallis test in these cases will be incorrect, but no error or warning message will be displayed.

    CHD Example. We can use the Kruskal-Wallis test to compare serum insulin levels between subjects with different hypertensive status (0=normotensive, 1=borderline, 2=definite).

    Hypertensive Group   n      Median   IQR*
    Normotensive         1568   12       9, 15
    Borderline           547    12       9, 17
    Definite             1310   14       11, 20

    *IQR, interquartile range = 25th percentile, 75th percentile

    Kruskal Wallis test

    Kruskal-Wallis Test

    Ranks

                    Hypertension status   N      Mean Rank
    Serum insulin   normotensive          1568   1526.31
                    borderline            547    1685.28
                    definite              1310   1948.03
                    Total                 3425

    Test Statistics(a,b)

                  Serum insulin
    Chi-Square    130.816
    df            2
    Asymp. Sig.   .000

    a Kruskal Wallis Test
    b Grouping Variable: Hypertension status

    You can select 1 or more variables to compare between groups.

    The variable selected as the Grouping Variable defines the groups. THE VARIABLE SHOULD BE NUMERIC.

    In this example the smallest numeric code is 0 (for normal) and the largest numeric code is 2 (for definite).

    Information used in the test statistic - not usually reported.

    The descriptives under Options are not useful; you can produce relevant descriptives (e.g. median and interquartile range for each group) using the Explore command.

    Kruskal Wallis test

    Asymp. Sig. = p-value =
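
The Kruskal-Wallis statistic can be approximated from the Ranks table alone. The Python sketch below is an illustration that computes H from the group sizes and mean ranks without a tie correction; the handout's raw data are not available, so the tie correction that SPSS applies cannot be reproduced here, and the result is therefore slightly smaller than the reported Chi-Square of 130.816. With exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-H/2).

```python
import math

# (n, mean rank) per group, copied from the Ranks table.
groups = {
    "normotensive": (1568, 1526.31),
    "borderline": (547, 1685.28),
    "definite": (1310, 1948.03),
}

total_n = sum(n for n, _ in groups.values())
expected_mean_rank = (total_n + 1) / 2   # mean rank under the null hypothesis

# Kruskal-Wallis H without the correction for ties.
h = 12 / (total_n * (total_n + 1)) * sum(
    n * (mean_rank - expected_mean_rank) ** 2 for n, mean_rank in groups.values()
)
df = len(groups) - 1

# For a chi-square distribution with 2 df, P(X > x) = exp(-x / 2).
p_value = math.exp(-h / 2)

print(f"H (uncorrected for ties) = {h:.2f}, df = {df}")
print(f"asymptotic p-value = {p_value:.3g}")
```

The p-value is far below 0.001, which is why SPSS displays it as .000.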

    One-Sample Binomial Test

    1. Choose Analyze from the menu bar.
    2. Choose Nonparametric Tests
    3. Choose Legacy Dialogs
    4. Choose Binomial...
    5. Test Variable List: Select the test variable you want from the source list on the left and then click on the arrow located next to the test variable box. Repeat the process until you have selected all the variables you want.
    6. Test Proportion: Click on the box next to Test Proportion and enter/edit the proportion value specified by your null hypothesis.
    7. Choose OK

    Example. In the TRAP study, 125 of the 527 patients who were negative for lymphocytotoxic antibodies at baseline became antibody positive. The expected rate for being antibody positive is 30%. We could use the one-sample binomial test to test if the rate is different in the TRAP study population.

    NPar Tests

    Binomial Test

                        Category   N     Observed Prop.   Test Prop.   Exact Sig. (1-tailed)
    Outcome   Group 1   No         402   .8               .3           .000
              Group 2   Yes        125   .2
              Total                527   1.0

    Outcome is a variable coded 1 if positive and 0 if negative.

    Make sure to edit the test proportion value. In this case it is .30 or 30%. The default is .50.

    One-sample binomial test, two-sided p-value given by 2 x .001 = .002 (Note: SPSS reports the one-sided p-value).
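
The one-sided exact p-value quoted in the note can be checked by summing Binomial probabilities directly. The Python sketch below is an illustration that tests the proportion of antibody-positive patients (125 of 527) against the null value of .30; since the observed proportion is below .30, the relevant tail is P(X <= 125). Doubling the one-sided value, as the note does, is one common convention for a two-sided p-value.

```python
from math import comb

n_patients = 527
n_positive = 125
null_proportion = 0.30

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Observed proportion (about .24) is below .30, so sum the lower tail.
one_sided_p = sum(
    binomial_pmf(k, n_patients, null_proportion) for k in range(n_positive + 1)
)
two_sided_p = min(1.0, 2 * one_sided_p)

print(f"observed proportion = {n_positive / n_patients:.3f}")
print(f"exact one-sided p-value = {one_sided_p:.4f}")
print(f"doubled (two-sided) p-value = {two_sided_p:.4f}")
```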

    McNemar's Test

    1. Choose Analyze from the menu bar.
    2. Choose Descriptive Statistics
    3. Choose Crosstabs...
    4. Row(s): Select the row variable you want from the source list on the left and then click on the arrow located next to the Row(s) box. Repeat the process until you have selected all the row variables you want.
    5. Column(s): Select the column variable you want from the source list on the left and then click on the arrow located next to the Column(s) box. Repeat the process until you have selected all the column variables you want.
    6. Choose Cells...
    7. For cell values choose total under percentages.
    8. Choose Continue
    9. Choose Statistics...
    10. Choose McNemar
    11. Choose Continue
    12. Choose OK

    There is also another way to run McNemar's test (but the test pair variables must be numeric).

    1. Choose Analyze from the menu bar.
    2. Choose Nonparametric Tests
    3. Choose Legacy Dialogs
    4. Choose 2 Related Samples...
    5. Test Pair(s) List: Select the two paired variables you want from the source list on the left, highlight both variables by pointing and clicking the mouse and then click on the arrow located in the middle of the window. Repeat the process until you have selected all the paired variables you want.
    6. Choose McNemar as the Test Type.
    7. Unselect Wilcoxon to turn off the option. Note that the option is turned off when the little box is empty.
    8. Choose OK

    Example. Suppose we want to compare two different treatments for a rare form of cancer. Since relatively few cases of this disease are seen, we want the two treatment groups to be as comparable as possible. To accomplish this goal, we set up a matched study such that a random member of each matched pair gets treatment A (chemotherapy), whereas the other member gets treatment B (surgery). The patients are assigned to pairs (621 pairs) matched on age (within 5 years), sex, and clinical condition. The patients are followed for 5 years, with survival as the outcome variable.

    The 5-year survival rate for treatment A is 17.1% (106/621) and for treatment B is 15.3% (95/621). We could use McNemar's test to compare the survival rate of the two treatments.

    McNemar's test

    Crosstabs

    TreatmentA * TreatmentB Crosstabulation

                                         TreatmentB
                                         died    survived   Total
    TreatmentA   died       Count        510     5          515
                            % of Total   82.1%   .8%        82.9%
                 survived   Count        16      90         106
                            % of Total   2.6%    14.5%      17.1%
    Total                   Count        526     95         621
                            % of Total   84.7%   15.3%      100.0%

    Chi-Square Tests

                       Value   Exact Sig. (2-sided)
    McNemar Test               .027(a)
    N of Valid Cases   621

    a Binomial distribution used.

    It doesn't matter for McNemar's test which variable is selected for the Row(s): or Column(s):. You can run more than one test at a time.

    Under Statistics select McNemar.

    Under Cells, in this example, select Total percentages.

    McNemar's test

    Exact Sig. (2-sided) = exact two-sided p-value = 0.027

    The p-value is exact because it is computed using the Binomial distribution instead of using a Normal approximation.

    Survival rate for Treatment A is 17.1%

    Survival rate for Treatment B is 15.3%
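
The exact McNemar p-value depends only on the discordant pairs, i.e. the pairs in which one member survived and the other died. The Python sketch below is an illustration: under the null hypothesis each discordant pair is equally likely to fall in either off-diagonal cell, so the smaller count is compared with a Binomial(n, 1/2) distribution and the tail is doubled. Variable names are hypothetical.

```python
from math import comb

# Discordant pairs from the crosstabulation.
a_died_b_survived = 5
a_survived_b_died = 16

# Concordant pairs (510 and 90) carry no information about the difference.
n_discordant = a_died_b_survived + a_survived_b_died
smaller = min(a_died_b_survived, a_survived_b_died)

# Exact two-sided p-value from the Binomial(n_discordant, 1/2) distribution.
one_tail = sum(comb(n_discordant, i) for i in range(smaller + 1)) / 2**n_discordant
p_two_sided = min(1.0, 2 * one_tail)

print(f"discordant pairs = {n_discordant}")
print(f"exact two-sided p-value = {p_two_sided:.3f}")
```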

    Chi-square Test, Fisher's Exact Test and Trend Test for Contingency Tables

    If the Chi-square test is requested for a 2 x 2 table, SPSS will also compute Fisher's Exact test. If the Chi-square test is requested for a table larger than 2 x 2, SPSS will also compute the Mantel-Haenszel test for linear (linear-by-linear) association between the row and column variables.

    1. Choose Analyze from the menu bar.
    2. Choose Descriptive Statistics
    3. Choose Crosstabs...
    4. Row(s): Select the row variable you want from the source list on the left and then click on the arrow located next to the Row(s) box. Repeat the process until you have selected all the row variables you want.
    5. Column(s): Select the column variable you want from the source list on the left and then click on the arrow located next to the Column(s) box. Repeat the process until you have selected all the column variables you want.
    6. Choose Cells...
    7. Choose the cell values (e.g., observed and expected co