Lab Activity Community Research Program IV
STUDENT’S GUIDE
INTRODUCTION TO SPSS 15.0 FOR WINDOWS
Faculty of Medicine
Universitas Padjadjaran
JATINANGOR
2012
Week:
Introduction to SPSS (two meetings)
Learning Objective:
At the end of the course, the student should be able to:
1. Start the SPSS 15.0 program
2. Perform data entry
3. Perform data import from Excel file (xls)
4. Save data in SPSS file (sav) and SPSS output file (spo)
5. Perform data sorting and selection
6. Perform data transformation as follows:
a. Involving dates and times
b. Recoding variables
7. Perform descriptive statistics as follows:
a. Frequency and percentage
b. Mean, median
c. Standard deviation, range, quartiles, interquartile range
8. Create an appropriate table, chart, or graph using SPSS to present the data
9. Open an existing SPSS data file or SPSS output file
Reference
1. Peat J, Barton B. Medical Statistics: A Guide to Data Analysis and Critical Appraisal. 1st ed.
Oxford: BMJ Books; 2005.
1. Starting the SPSS 15.0 for Windows program
Click Start menu
Click All Programs
Choose SPSS for Windows
or
Click the shortcut
You will have the following window:
Select Type in data
Click OK
Two windows are now open, each serving a different purpose. The first is the SPSS Data Editor
(untitled1 dataset0) and the second is the SPSS Viewer (output1 document1).
The first window consists of Data View and Variable View. Data View looks much like
Microsoft Office Excel, with rows and columns. The difference is that variable names are
assigned to the columns in Variable View, so data entry can start from the first row. In Excel,
data entry usually starts from the second row, because the first row is used for the variable
names.
2. Performing data entry
2.1 Variable setting in Variable View
2.1.1 Name
A variable name must begin with an alphabetic character and must be unique. Spaces
between words are not allowed; use an underscore ( _ ) instead.
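The naming rules above can be checked mechanically. The following Python sketch is only an
illustration and is not part of SPSS. The function name is_valid_name is hypothetical, the
case-insensitive comparison is an assumption, and only the rules stated here are tested, not
every rule SPSS enforces.

```python
import re

# Pattern for the rules stated in this guide only: a letter first, then
# letters, digits, or underscores (no spaces).
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name, existing_names=()):
    """Return True if `name` follows the stated rules and is not already used."""
    if NAME_PATTERN.fullmatch(name) is None:
        return False
    # Assumption: names are compared without regard to upper/lower case.
    taken = {n.lower() for n in existing_names}
    return name.lower() not in taken


print(is_valid_name("birth_weight"))        # True
print(is_valid_name("birth weight"))        # False: contains a space
print(is_valid_name("dob", ["id", "dob"]))  # False: already used
```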
2.1.2 Type
In this exercise, we are going to use Numeric and Date.
2.1.3 Width
For a numeric variable, this is the number of digits displayed; it can also be set in the
previous step (Type).
2.1.4 Decimals
For a numeric variable, this is the number of digits after the decimal point; it can also
be set in the previous step (Type).
2.1.5 Label
A fuller description of the variable can be typed here. It is shown in the SPSS output in
place of the variable name after a statistical analysis (descriptive or analytic) is
performed or a chart/graph is created. If it is left empty, the variable name is shown
instead.
2.1.6 Values
When a numeric variable is used to represent categories (for example, for the variable
sex we assign 1 = male and 2 = female; for the variable socioeconomic status we assign
0 = high, 1 = moderate, and 2 = low), or after variables are recoded (discussed further
in step 6.3), make sure to assign a value label to each number. The order of the numbers
matters because it determines how the data are presented after a statistical analysis
(descriptive or analytic) is performed or a chart/graph is created. For instance, in a
table the smallest number is placed in the first row or in the leftmost column.
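As a rough illustration of how value labels and code order shape a table, the Python sketch
below counts coded responses and reports them under their labels in ascending code order. The
label mapping follows the socioeconomic status example above; the coded responses and the
function name labelled_frequencies are invented for illustration.

```python
from collections import Counter

# Value labels as defined in this guide for socioeconomic status.
SES_LABELS = {0: "high", 1: "moderate", 2: "low"}

# Hypothetical coded responses, invented for illustration only.
ses_codes = [0, 1, 1, 2, 0, 1]


def labelled_frequencies(codes, labels):
    """Count each code and report it under its label, smallest code first."""
    counts = Counter(codes)
    return [(labels[code], counts[code]) for code in sorted(counts)]


for label, n in labelled_frequencies(ses_codes, SES_LABELS):
    print(f"{label}: {n}")
```

Without the label mapping, the same table would show only 0, 1, and 2, and the reader would
have to guess which number stands for which category.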
If no value labels are assigned, the output is hard to interpret: it shows only the
numbers, and readers, especially those unfamiliar with the data or the study, may be
confused or draw the wrong conclusions. From the unlabelled dummy table, for example,
one might conclude that the study has 11 persons with low, 11 persons with moderate,
and 8 persons with high socioeconomic status.
What conclusion can be drawn from the following dummy table, in which the value labels
have been assigned correctly?
2.1.7 Missing
We might want to distinguish between data that are missing because a respondent
refused to answer and data that are missing because the question did not apply to that
respondent. Data values that are specified as user-missing are flagged for special
treatment and are excluded from most calculations. In this exercise, we have a
complete set of data, so the default setting (no missing values) is kept.
2.1.8 Columns
If necessary, we can set the width of the column in the Data View.
2.1.9 Align
If necessary, we can set the alignment in the Data View.
2.1.10 Measure
Nominal scales:
1) Have no order and are generally category labels that have been assigned to
classify items or information
2) In SPSS, it can be string (alphanumeric) values or numeric values that have been
assigned to represent categories
Ordinal scales:
1) Have a logical or ordered relationship across the values
2) It is usually not possible to measure a specific amount difference between
categories
3) In SPSS, it can be string (alphanumeric) values or numeric values that have been
assigned to represent categories
Scale
1) In SPSS, set to this measure if the variable type is numeric (interval or ratio) or date
2) In SPSS, set to this measure for numeric values that have been assigned to
represent categories (do not forget to put value labels!)
2.2 Data Organization
2.3 Enter the data in Data View
To practice data entry, please use Data entry exercise for SPSS.pdf (can be downloaded
from crpfkunpad.multiply.com). After the information for each variable has been defined in
Variable View, the data can be entered in the Data View screen. Try to complete the data
entry in 10 – 15 minutes!
3. Performing data import from Excel file (xls)
Many researchers use Excel or Access for ease of entering and managing data, but statistical
analyses and appropriate charts/graphs are best produced in a specialist statistical package
such as SPSS, which is designed to safeguard the integrity and accuracy of the statistics.
SPSS 15.0 can import data in the Excel 2003 (xls) and Access 2002-2003 (mdb) file formats.
Starting from version 16.0, SPSS can also import the Excel 2007 (xlsx) and Access 2007 (accdb)
file formats. Make sure the file is not open in MS Excel while we import the data into SPSS!
Suppose we will import Measles Case data in Surveillance Data Analyses 2012.xls file into
SPSS. We can use the following steps:
3.1 Open Data Method
Click File
Select Open
Select Data…
or click the open-file symbol below the File menu
On Open Data Window:
Search the intended folder
Select Excel (*.xls) in Files of Type options
Select Surveillance Data Analyses 2012.xls file
Click Open
On Opening Excel Data Source window:
Put a tick/check mark on Read variable names from the first row of data option
Select Measles Case in Worksheet options
Change the maximum width for string columns to 15 or 20 (not too wide)
Click OK
3.2 Open Database Method
Click File
Select Open Database
Select New Query…
Select Excel File
Click Next
or double-click on Excel File
On the ODBC Driver Login window:
Click Browse…
On Open Data window:
Search the intended folder
Select Excel (*.xls) in Files of Type options
Select Surveillance Data Analyses 2012.xls file
Click Open
On the ODBC Driver Login window:
Click OK
On Database Wizard window:
Just follow the instructions to select the data. An alternative is to double-click
'Measles Cases$'
Click Next
Click Next again when asked to limit the cases
If we want SPSS to recode string variables automatically (remember the value labels
discussed earlier), put a tick/check mark for the string variables. Some regard this as an
advantage, others do not, because the numeric codes are assigned according to the
alphabetical order of the string values (example: 1 for No and 2 for Yes). If we choose not
to recode to numeric now, we will do it later (in step 6.3.2).
Change the Width for variable-width string fields to 10 (not too wide)
Click Next
Select Retrieve the data I have selected
Click Finish
4. Saving data in SPSS file (sav) and SPSS output file (spo)
4.1 Saving data in SPSS file (Make sure we are on Data View or Variable View screen)
Click File
Select Save
In Save Data As window:
Select the intended folder
Type “measles” as the file name
Select SPSS(*.sav) in Save as type
Click Save
4.2 Saving SPSS output (Make sure we are on Output View)
Click File
Select Save
In Save Output As window:
Select the intended folder
Type “measles” as the file name
Select Viewer Files (*.spo) in Save as type
Click Save
4.3 Save the data from the earlier data entry exercise as an SPSS file named “data
entry spss”, so that we have two SPSS files and one output file.
4.4 Notes
An SPSS 15.0 data file (*.sav) can be opened by other versions of SPSS, such as SPSS
16.0, 17.0, or 18.0, but an SPSS 15.0 output file (*.spo) can't be opened by a different
version of SPSS, since each version has its own output file format.
5. Performing data sorting and selection (please use data entry spss.sav file!)
5.1 Data sorting
Suppose we will sort the data according to date of birth of the mother (ascending)
5.1.1 By right-clicking the variable name
5.1.2 By clicking Data tab
Click Data
Select Sort Cases…
In Sort Cases window:
Select dob variable then move it to Sort by window
Select Ascending in Sort Order options
Click OK
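For readers who want to see what an ascending sort does to the cases, here is a minimal
Python sketch. The records are hypothetical; only the variable name dob comes from this
guide.

```python
from datetime import date

# Hypothetical records; the field name dob follows the variable used in this guide.
mothers = [
    {"id": 1, "dob": date(1985, 6, 14)},
    {"id": 2, "dob": date(1980, 1, 30)},
    {"id": 3, "dob": date(1990, 11, 2)},
]

# Ascending sort by date of birth: the earliest date (oldest mother) comes first.
sorted_mothers = sorted(mothers, key=lambda row: row["dob"])

print([row["id"] for row in sorted_mothers])  # [2, 1, 3]
```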
5.2 Data selection (please use data entry spss.sav file!)
Suppose we want to select only the smoking mothers, so that further data transformation
(step 6), descriptive statistics (step 7), and data presentation (step 8) are applied only to
smoking mothers.
Click Data
Select Cases…
In Select Cases window:
Select If condition is satisfied
Click If…
In Select Cases: If window:
Select smoking variable then make sure we have expression: “smoking = 1”
In Select Cases window:
Click OK
To select all data again:
Click Data
Select Cases…
In Select Cases window:
Click Reset
Click OK
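The selection condition smoking = 1 can be pictured as a filter over the cases. The Python
sketch below uses hypothetical records and a hypothetical helper named select_cases; coding
non-smokers as 0 is an assumption, since this guide only states the code for smoking mothers.

```python
# Hypothetical records; smoking = 1 marks a smoking mother, as in the expression above.
# Assumption: non-smoking mothers are coded 0.
mothers = [
    {"id": 1, "smoking": 1},
    {"id": 2, "smoking": 0},
    {"id": 3, "smoking": 1},
]


def select_cases(rows, condition):
    """Return only the rows for which condition(row) is true."""
    return [row for row in rows if condition(row)]


smokers = select_cases(mothers, lambda row: row["smoking"] == 1)

print([row["id"] for row in smokers])  # [1, 3]
print(len(mothers))                    # 3: the full data set is still available
```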
6. Performing data transformation
6.1 Data involving dates and times (please use data entry spss.sav file!)
6.1.1 Determining age (years) from date of birth of mother
Click Transform
Select Date and Time Wizard…
In Date and Time Wizard window:
Select Calculate with dates and times
Click Next
Select Calculate the number of time units between two dates
Click Next
In Date1: select Current date and time (March 12th, 2012) from Variables box
In minus Date2: select dob from Variables box
Set the unit as Years
Click Next
In Result Variable, please type “age”
Click Finish
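The calculation performed in the steps above can be sketched in Python as follows. This is
an illustration only, assuming the result is truncated to completed years; the function name
age_in_years and the date of birth are hypothetical, while the reference date is the one
quoted in this guide.

```python
from datetime import date


def age_in_years(dob, on_date):
    """Completed years between dob and on_date (fractional part dropped)."""
    years = on_date.year - dob.year
    # Subtract one year if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (dob.month, dob.day):
        years -= 1
    return years


reference = date(2012, 3, 12)       # the current date quoted in this guide
example_dob = date(1985, 6, 14)     # hypothetical date of birth

print(age_in_years(example_dob, reference))  # 26
```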
6.1.2 Predicting the due date (280 days after the first day of the last menstrual period)
Click Transform
Select Date and Time Wizard…
In Date and Time Wizard window:
Select Calculate with dates and times
Click Next
Select Add or subtract a duration from a date
Click Next
In Date: select first day of last menstrual period from Variables box
In Duration Constant type 280
In Unit: select Days
Click Next
In Result Variable, please type “duedate”
Click Finish
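Adding a fixed duration to a date, as done above, looks like this in a short Python sketch.
The function name due_date and the example date are hypothetical; the 280-day constant is
the one used in this guide.

```python
from datetime import date, timedelta

GESTATION_DAYS = 280  # duration constant used in this guide


def due_date(first_day_of_lmp):
    """Return the date 280 days after the first day of the last menstrual period."""
    return first_day_of_lmp + timedelta(days=GESTATION_DAYS)


# Hypothetical example: last menstrual period started on 1 January 2012.
print(due_date(date(2012, 1, 1)))  # 2012-10-07
```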
6.2 Calculating/computing (please use data entry spss.sav file!)
Determining body mass index (BMI, kg/m2)
Click Transform
Select Compute…
In Compute Variable window:
In Target Variable: type “bmi”
In Numeric Expression: select weight and height from the left box, make sure we have
expression: weight / (height / 100) ** 2
Click OK
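The same expression can be tried outside SPSS. In the Python sketch below, the function name
bmi and the example values are hypothetical; the formula is the one typed into Numeric
Expression, with weight in kg and height in cm. The power operator is evaluated before the
division, so the height is converted to metres and squared first.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m2: weight divided by the square of height in metres."""
    return weight_kg / (height_cm / 100) ** 2


# Hypothetical example: 60 kg and 160 cm.
print(round(bmi(60, 160), 2))  # 23.44
```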
6.3 Recoding variables
6.3.1 Into Different Variable (please use data entry spss.sav file!)
6.3.1.1 Age group of the mother
20 – 24 years (numeric code = 1)
25 – 29 years (numeric code = 2)
30 – 34 years (numeric code = 3)
Click Transform
Select Recode into Different Variables…
In Recode into Different Variable window:
Select variable age and move it to the right box
In Output Variable, for Name: please type “agegroup”
Click Change
Click Old and New Values…
In Recode into Different Variable: Old and New Values window:
Select Range, then type 20 through 24
In New Value, select Value, then type 1
Click Add
Select Range, then type 25 through 29
In New Value, select Value, then type 2
Click Add
Select Range, then type 30 through 34
In New Value, select Value, then type 3
Click Add
Click Continue
Click OK
Click Variable View
Modify Values for agegroup variable
In Value Labels window:
Type value 1, type Label “20 – 24 years”
Type value 2, type Label “25 – 29 years”
Type value 3, type Label “30 – 34 years”
Click OK
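The recoding rule for age groups can be written out as a small Python sketch. The function
name age_group is hypothetical, and returning None for ages outside the three listed ranges
is an assumption about how uncoded values should be treated.

```python
# Value labels for the new variable, as listed in this step.
AGEGROUP_LABELS = {1: "20 – 24 years", 2: "25 – 29 years", 3: "30 – 34 years"}


def age_group(age):
    """Map age in completed years to the numeric code of its age group."""
    if 20 <= age <= 24:
        return 1
    if 25 <= age <= 29:
        return 2
    if 30 <= age <= 34:
        return 3
    return None  # assumption: ages outside the listed ranges stay uncoded


print(age_group(24), age_group(25), age_group(34))  # 1 2 3
print(AGEGROUP_LABELS[age_group(27)])               # 25 – 29 years
```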
6.3.1.2 BMI category/classification
< 18.5 kg/m2 = Underweight (numeric code = 1)
18.5 – 24.99 kg/m2 = Normal (numeric code = 2)
≥ 25 kg/m2 = Overweight (numeric code = 3)
Click Transform
Select Recode into Different Variables…
In Recode into Different Variable window:
Click Reset
Select variable bmi and move it to the right box
In Output Variable, for Name: please type “bmicat”
Click Change
Click Old and New Values…
In Recode into Different Variable: Old and New Values window:
Select Range, LOWEST through value: then type 18.49
In New Value, select Value, then type 1
Click Add
Select Range, then type 18.5 through 24.99
In New Value, select Value, then type 2
Click Add
Select Range, value through HIGHEST: then type 25
In New Value, select Value, then type 3
Click Add
Click Continue
Click OK
Click Variable View
Modify Values for bmicat variable
In Value Labels window:
Type value 1, type Label “Underweight”
Type value 2, type Label “Normal”
Type value 3, type Label “Overweight”
Click OK
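The BMI classification can be sketched in Python as below. The function name bmi_category is
hypothetical. Note one deliberate difference from the dialog entries: the sketch compares
against the cut-offs 18.5 and 25 directly instead of typing 18.49 and 24.99, so that a value
such as 18.495 still receives a code.

```python
# Value labels for the new variable, as listed in this step.
BMICAT_LABELS = {1: "Underweight", 2: "Normal", 3: "Overweight"}


def bmi_category(bmi):
    """Map a BMI value in kg/m2 to its numeric category code."""
    if bmi < 18.5:
        return 1  # Underweight
    if bmi < 25:
        return 2  # Normal
    return 3      # Overweight


print(bmi_category(18.49), bmi_category(18.5), bmi_category(25))  # 1 2 3
```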
6.3.2 Into Same Variable (please use measles.sav file!)
Yes = 1 and No = 2
Click Transform
Select Recode into Same Variables…
In Recode into Same Variable window:
Select all variables using Yes or No answer and move them into String Variables
box
Click Old and New Values…
In Recode into Same Variables: Old and New Values window:
In Old Value, type “Yes”, then type New Value as 1
Click Add
In Old Value, type “No”, then type New Value as 2
Click Add
Click Continue
Click OK
Click Variable View
Modify Type (from String to Numeric) and Values for these variables: Fever, fever
with rash, conjunctivitis, coryza, cough, lymphnode, probable case, confirm case,
traveling history, hospitalized, death, and immunization
In Value Labels window:
Type value 1, type Label “Yes”
Type value 2, type Label “No”
Click OK
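The Yes/No recoding amounts to a lookup from the old string value to the new numeric code.
In the Python sketch below, the function name recode_yes_no and the sample answers are
hypothetical, and leaving any other value unchanged is an assumption. Unlike the steps above,
the sketch produces numbers directly rather than changing the variable type afterwards.

```python
# Old value -> new value, as specified in this step.
YES_NO_CODES = {"Yes": 1, "No": 2}


def recode_yes_no(value):
    """Return 1 for "Yes" and 2 for "No"; any other value is returned unchanged."""
    return YES_NO_CODES.get(value, value)


fever = ["Yes", "No", "Yes", ""]  # hypothetical answers; the last one is blank
print([recode_yes_no(answer) for answer in fever])  # [1, 2, 1, '']
```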
7. Performing descriptive statistics (please use data entry spss.sav file!)
7.1 Frequency and percentage (categorical data)
7.1.1 Suppose we wish to know frequency of age group and BMI category for all
participants
Click Analyze
Select Descriptive Statistics
Select Frequencies…
In Frequencies window:
Select agegroup and bmicat variables and move them into Variable(s) box
Put a tick/check mark onto Display frequency tables option
Click OK
Check the result in SPSS output
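To make clear what the frequency table contains, the Python sketch below computes the count
and percentage of each category. The function name frequency_table and the coded values are
hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Return (code, count, percent) for each distinct code, smallest code first."""
    counts = Counter(values)
    total = len(values)
    return [(code, counts[code], 100 * counts[code] / total) for code in sorted(counts)]


agegroup = [1, 2, 2, 3, 2, 1, 3, 2]  # hypothetical age group codes
for code, n, percent in frequency_table(agegroup):
    print(f"{code}: n = {n}, {percent:.1f}%")
```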
7.1.2 Suppose we wish to know frequency of age group and BMI category based on
smoking status
Click Analyze
Select Descriptive Statistics
Select Crosstabs…
In Crosstabs window:
Select agegroup and bmicat variables and move them into Row(s) box
Select smoking variable and move it into Column(s) box
Click Cells…
In Crosstabs: Cell Display
Put a tick/check mark for Column percentages (the choice depends on the study design)
Notes: for cross-sectional and cohort studies, select Row; for case-control studies, select
Column
Click Continue
Click OK
Check the result in SPSS output
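A column percentage expresses each cell count as a share of its column total. The Python
sketch below illustrates this with hypothetical paired codes; the function name
crosstab_column_percent is an assumption made for the example, as is coding non-smokers
as 0.

```python
from collections import Counter


def crosstab_column_percent(row_values, col_values):
    """Return {(row, col): percent of the column total} for paired observations."""
    cell_counts = Counter(zip(row_values, col_values))
    col_totals = Counter(col_values)
    return {
        (row, col): 100 * n / col_totals[col]
        for (row, col), n in cell_counts.items()
    }


# Hypothetical codes: one bmicat and one smoking value per mother.
bmicat = [1, 2, 2, 3, 2, 3]
smoking = [1, 1, 0, 0, 0, 1]

table = crosstab_column_percent(bmicat, smoking)
for (row, col), percent in sorted(table.items()):
    print(f"bmicat={row}, smoking={col}: {percent:.1f}%")
```

Within each smoking column the percentages add up to 100, which is how the table should be
read when Column percentages are selected.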
7.2 Mean, median, standard deviation, range, quartiles, interquartile range (numerical data)
7.2.1 Suppose we wish to know mean, median, SD, range, quartiles, and IQR of age
(years) and BMI (kg/m2) for all participants
Click Analyze
Select Descriptive Statistics
Select Explore…
In Explore window:
Select age and bmi variables and move them into Dependent List box
Click Statistics…
In Explore: Statistics
Put a tick/check mark for Percentiles
Click Continue
Click OK
Check the result in SPSS output
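The statistics requested above can be reproduced with Python's standard statistics module, as
a rough cross-check. The function name describe and the ages are hypothetical. Quartiles can
be defined in more than one way, so the values from this sketch (which uses the module's
default "exclusive" method) may differ slightly from those in the SPSS output.

```python
import statistics


def describe(values):
    """Return mean, median, sample SD, range, quartiles, and IQR of a list of numbers."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # default method: "exclusive"
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 in the denominator)
        "range": max(values) - min(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


ages = [21, 23, 24, 26, 28, 30, 33]  # hypothetical ages in years
summary = describe(ages)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```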
7.2.2 Based on smoking status
Click Analyze
Select Descriptive Statistics
Select Explore…
In Explore window:
Select age and bmi variables and move them into Dependent List box
Select smoking variable and move it into Factor List box
Click Statistics…
In Explore: Statistics
Put a tick/check mark for Percentiles
Click Continue
Click OK
Check the result in SPSS output
8. Creating an appropriate table, chart, or graph using SPSS to present the data
8.1 Contingency table (please use measles.sav file!)
Suppose we wish to know whether sex is a risk factor for measles disease (confirm case)
in a case-control study. We use the same method as in step 7.1.2 and add Statistics
Click Statistics…
In Crosstabs: Statistics window:
Put a tick/check mark for Risk
Click Continue
Click OK
Check the result in SPSS output
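In a case-control study the usual measure of association from a 2 x 2 table is the odds
ratio. The Python sketch below shows the arithmetic only; the function name odds_ratio and
the counts are hypothetical and are not taken from the measles data.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds ratio of a 2 x 2 table: (a * d) / (b * c)."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


# Hypothetical counts: 20 exposed cases, 10 exposed controls,
# 15 unexposed cases, 30 unexposed controls.
print(odds_ratio(20, 10, 15, 30))  # 4.0
```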
8.2 Pie chart (please use data entry spss.sav file!)
Suppose we wish to know the proportion of smoking among mothers
Click Graphs
Select Legacy Dialogs
Select Pie…
In Pie Charts window:
Select Summaries for groups of cases
Click Define
In Define Pie: Summaries for Groups of Cases window:
Select smoking variable and move it into Define Slices by box
Click OK
Check the result in SPSS output
8.3 Bar (please use data entry spss.sav file!)
Suppose we wish to know the agegroup distribution of the mothers
Click Graphs
Select Legacy Dialogs
Select Bar…
In Bar Charts window:
Select Simple
Select Summaries for groups of cases
Click Define
In Define Simple Bar: Summaries for Groups of Cases window:
Select agegroup variable and move it into Category Axis box
Click OK
Check the result in SPSS output
8.4 Histogram (please use data entry spss.sav file!)
Suppose we wish to know the distribution of BMI (kg/m2) of mothers
Click Graphs
Select Legacy Dialogs
Select Histogram…
In Histogram window:
Select bmi variable and move it into Variable box
Click OK
Check the result in SPSS output
8.5 Line (please use the data of Clinical Scenario B from Surveillance Analyses Data Exercise!)
Suppose we wish to know the rotavirus infection trend from 1974 to 1989 in one large
district in Bandung
Click Graphs
Select Legacy Dialogs
Select Line…
In Line Charts window:
Select Simple
Select Summaries for groups of cases
Click Define
In Define Simple Line: Summaries for Groups of Cases window:
Select Other statistics for Line Represents option
Select rotavirus_cases and move it into Variable box
Select year variable and move it into Category Axis box
Click Change Statistics…
In Statistics window:
Select Sum of Values
Click Continue
Click OK
Check the result in SPSS output
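Choosing Sum of Values means that each point on the line is the total of the case counts
recorded for that year. A minimal Python sketch of that aggregation follows; the function
name sum_by_category and the numbers are hypothetical, while the variable names year and
rotavirus_cases follow this guide.

```python
from collections import defaultdict


def sum_by_category(categories, values):
    """Add up the values for each category and return the totals in category order."""
    totals = defaultdict(int)
    for category, value in zip(categories, values):
        totals[category] += value
    return dict(sorted(totals.items()))


# Hypothetical records: several reports per year.
year = [1974, 1974, 1975, 1975, 1975]
rotavirus_cases = [3, 5, 2, 4, 1]

print(sum_by_category(year, rotavirus_cases))  # {1974: 8, 1975: 7}
```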
8.6 Box-plot (please use data entry spss.sav file!)
Suppose we wish to compare the median BMI (kg/m2) of smoking and non-smoking
mothers
Click Graphs
Select Legacy Dialogs
Select Boxplot…
In Boxplot window:
Select Simple
Select Summaries for groups of cases
Click Define
In Define Simple Boxplot: Summaries for Groups of Cases window:
Select bmi and move it to Variable box
Select smoking and move it to Category Axis box
Click OK
Check the result in SPSS output
8.7 Error Bar (please use data entry spss.sav file!)
Suppose we wish to compare the mean BMI (kg/m2) of smoking and non-smoking
mothers
Click Graphs
Select Legacy Dialogs
Select Error Bar…
In Error Bar window:
Select Simple
Select Summaries for groups of cases
Click Define
In Define Simple Error Bar: Summaries for Groups of Cases window:
Select bmi and move it to Variable box
Select smoking and move it to Category Axis box
Click OK
Check the result in SPSS output
8.8 Scatter/Dot
Suppose we wish to know the correlation between weight (kg) and height (cm) of the
mothers
Click Graphs
Select Legacy Dialogs
Select Scatter/Dot…
In Scatter/Dot window:
Select Simple Scatter
Click Define
In Simple Scatterplot window:
Select weight variable and move it into Y axis box
Select height variable and move it into X axis box
Click OK
Check the result in SPSS output
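A scatter plot shows the relationship visually. As a numeric complement, the Python sketch
below computes Pearson's correlation coefficient by hand. It is an illustration only: the
function name pearson_r and the measurements are hypothetical, and the example data were
chosen to lie exactly on a straight line.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


height = [150, 160, 170]  # hypothetical heights in cm
weight = [50, 60, 70]     # hypothetical weights in kg

print(round(pearson_r(height, weight), 3))  # 1.0
```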
9. Opening an existing SPSS data file or SPSS output file
9.1 When starting the SPSS program
9.1.1 SPSS file
Select Open an existing data source
Select the intended file, let's say measles.sav or More Files… to select another file
Click OK
9.1.2 SPSS output
Select Open another type of file
Select the intended file or More Files… to select another file
Click OK
9.2 From Windows Explorer
9.2.1 SPSS file
Select the intended folder
Double click on the intended file
9.2.2 SPSS output
Select the intended folder
Double click on the intended file