Data processing and exporting
Module 2 Session 6
Overview
The next slide again shows the data management cycle.
Data have been entered (and checked) in Epi Info. We are now ready to undertake data analysis. Do we remain in Epi Info, or move to another software package?
This session explores both: simple data analysis in Epi Info, and exporting data so that analysis can be undertaken in another software package if required.
Data management cycle
Design survey
Design questionnaire
Enumerators collect data in the field
Data entered onto computer
Manual checking, editing etc.
Data analysis
Reporting of results
Other labels on the diagram: Conception, Computer data management
We are now ready to move on to the data analysis stage
Contents
Using Epi Info to merge data from separate files
Simple summaries, tables and graphs in Epi Info
Exporting data from Epi Info to use with other programmes
Learning Objectives
At the end of this session participants will be able to:
merge data from separate data files
use Epi Info to produce tables, graphs and summary statistics, and interpret the results
export data from Epi Info for reading into another software package.
Merging Data
There are three common types of merging:
Top to Bottom – Adding records – Merge
Side to Side – Adding variables – Relate
Table Lookup – Data at different levels – Relate
Top to Bottom Merge
Used when data files have the same variables but different records
Used to combine data entered by different data entry staff
For example: A enters records 1 to 10, B enters records 11 to 20
ID  Var1  Var2  Var3
01  24    54    62
02  32    54    14
03  54    24    35

ID  Var1  Var2  Var3
04  35    45    12
05  64    74    25
06  54    54    65
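The idea behind a top-to-bottom merge can be sketched outside Epi Info. The Python below is an illustration only, not the Epi Info Merge command: the names `file_a`, `file_b` and `merge_top_to_bottom` are hypothetical, and the values are copied from the two example tables above.

```python
# Records from two data entry files that share the same variables.
file_a = [
    {"ID": "01", "Var1": 24, "Var2": 54, "Var3": 62},
    {"ID": "02", "Var1": 32, "Var2": 54, "Var3": 14},
    {"ID": "03", "Var1": 54, "Var2": 24, "Var3": 35},
]
file_b = [
    {"ID": "04", "Var1": 35, "Var2": 45, "Var3": 12},
    {"ID": "05", "Var1": 64, "Var2": 74, "Var3": 25},
    {"ID": "06", "Var1": 54, "Var2": 54, "Var3": 65},
]


def merge_top_to_bottom(first, second):
    """Append the records of `second` below the records of `first`."""
    # Both files must contain the same variables.
    if first and second and set(first[0]) != set(second[0]):
        raise ValueError("files do not have the same variables")
    merged = first + second
    # The same record should not have been entered in both files.
    ids = [record["ID"] for record in merged]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate IDs found across the two files")
    return merged


merged = merge_top_to_bottom(file_a, file_b)
for record in merged:
    print(record["ID"], record["Var1"], record["Var2"], record["Var3"])
```

The duplicate-ID check is a design choice of this sketch: it catches the case where two data entry staff have keyed in the same record.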
Side to Side Merge
Used when data files have the same records but different variables
Each file should have key field(s) to ensure correct merging
For example: Person A enters Health data, Person B enters Education data
ID  Health1  Health2
01  02       03
02  04       05
03  14       24

ID  Educ1  Educ2
01  34     45
02  71     55
03  62     34
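A side-to-side merge matches records on the key field and then joins their variables. The Python below is a minimal sketch of that idea, not the Epi Info Relate command: `merge_side_to_side` is a hypothetical helper, and the values are copied from the Health and Education example tables above.

```python
health = [
    {"ID": "01", "Health1": 2, "Health2": 3},
    {"ID": "02", "Health1": 4, "Health2": 5},
    {"ID": "03", "Health1": 14, "Health2": 24},
]
education = [
    {"ID": "01", "Educ1": 34, "Educ2": 45},
    {"ID": "02", "Educ1": 71, "Educ2": 55},
    {"ID": "03", "Educ1": 62, "Educ2": 34},
]


def merge_side_to_side(left, right, key="ID"):
    """Join the variables of `right` onto `left`, matching records on `key`."""
    # Index the second file by its key field so each match is a direct lookup.
    right_by_key = {record[key]: record for record in right}
    merged = []
    for record in left:
        match = right_by_key.get(record[key])
        if match is None:
            raise KeyError(f"no matching record for {key}={record[key]}")
        merged.append({**record, **match})
    return merged


merged = merge_side_to_side(health, education)
for record in merged:
    print(record)
```

Matching on the key, rather than on row position, is what makes the merge safe when the two files are sorted differently.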
Table Lookup Merge
Used for data at different levels For example: Household data and data on
individuals within the household Household ID must appear in Individuals dataset
HHID   WALL  ROOF  FLOOR
10001  02    03    01
10002  03    03    01
10003  04    05    01

HHID   ID  GENDER  AGE
10001  01  01      51
10001  02  02      48
10001  03  01      20
10002  01  02      84
10003  01  01      56
10003  02  02      51
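In a table lookup merge, each individual picks up the variables of the household it belongs to, so household values are repeated for every member. The Python below is a sketch of that logic only, not Epi Info syntax: `lookup_merge` is a hypothetical helper, and the values are copied from the two example tables above.

```python
# Household-level data, indexed by the household ID.
households = {
    "10001": {"WALL": 2, "ROOF": 3, "FLOOR": 1},
    "10002": {"WALL": 3, "ROOF": 3, "FLOOR": 1},
    "10003": {"WALL": 4, "ROOF": 5, "FLOOR": 1},
}
# Individual-level data; HHID links each person to a household.
individuals = [
    {"HHID": "10001", "ID": "01", "GENDER": 1, "AGE": 51},
    {"HHID": "10001", "ID": "02", "GENDER": 2, "AGE": 48},
    {"HHID": "10001", "ID": "03", "GENDER": 1, "AGE": 20},
    {"HHID": "10002", "ID": "01", "GENDER": 2, "AGE": 84},
    {"HHID": "10003", "ID": "01", "GENDER": 1, "AGE": 56},
    {"HHID": "10003", "ID": "02", "GENDER": 2, "AGE": 51},
]


def lookup_merge(people, household_table):
    """Attach the household variables to every individual in that household."""
    merged = []
    for person in people:
        # Raises KeyError if an individual refers to an unknown household.
        household = household_table[person["HHID"]]
        merged.append({**person, **household})
    return merged


merged = lookup_merge(individuals, households)
for record in merged:
    print(record)
```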
Activity 2
The first part of the practical takes you through the process of a side-to-side merge
An example of the Table Lookup merge appears in Activity 4
The top-to-bottom merge is a much simpler and more intuitive process so is not included in this practical
Data Checking with Analysis
Initial data analyses can be part of the data checking process
Useful to check on spellings and ranges – e.g. are all ages feasible?
Useful to have ability to produce simple tables and charts from the data entry package
Corrections can then be made in the same package
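The two checks mentioned above, ranges and spellings, can be sketched in a few lines of Python. This is an illustration only: the records are hypothetical and not taken from the course dataset, the limits of 0 to 120 for age are an assumption, and `out_of_range` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical records for illustration only.
records = [
    {"ID": "01", "AGE": 51, "DISTRICT": "North"},
    {"ID": "02", "AGE": 48, "DISTRICT": "north"},
    {"ID": "03", "AGE": 510, "DISTRICT": "North"},
    {"ID": "04", "AGE": 20, "DISTRICT": "South"},
]


def out_of_range(rows, field, low, high):
    """Return the IDs of records whose `field` value lies outside low..high."""
    return [row["ID"] for row in rows if not low <= row[field] <= high]


# Range check: flag ages that are not feasible (limits are an assumption).
suspect_ids = out_of_range(records, "AGE", 0, 120)

# Spelling check: a 1-way table lists variant spellings as separate values.
district_counts = Counter(row["DISTRICT"] for row in records)

print("Ages to check:", suspect_ids)
for value, count in sorted(district_counts.items()):
    print(value, count)
```

A frequency table is often the quickest way to spot spelling variants, because each variant appears as its own row with a small count.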
Data Analysis in Epi Info
The Analyze Data utility in Epi Info produces tables, graphs and summary statistics.
The relevant commands are:
Frequencies: 1-way tables
Tables: cross-tabulations (2-way tables)
Means: mean values
Graph: graphs and charts
Summarize: summary statistics
Frequencies (FREQ command) is used for 1-way tables
Tables is used for cross-tabulations
Means produces the mean, median, minimum, maximum, quartiles and standard deviation.
Graph offers a wide choice of graphs
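For readers who move to another package, the kinds of output described for Frequencies, Tables and Means can be sketched in Python. This is not Epi Info syntax and the output layout will differ; the data are the Individuals example from the merging section, and quartile values depend on the calculation method, so they may not match another package exactly.

```python
from collections import Counter
import statistics

# (HHID, GENDER, AGE) copied from the Individuals example table.
individuals = [
    ("10001", 1, 51), ("10001", 2, 48), ("10001", 1, 20),
    ("10002", 2, 84), ("10003", 1, 56), ("10003", 2, 51),
]

# 1-way table: number of records for each GENDER code.
gender_freq = Counter(gender for _, gender, _ in individuals)

# 2-way table: number of records for each (HHID, GENDER) combination.
crosstab = Counter((hhid, gender) for hhid, gender, _ in individuals)

# Summary statistics for the continuous variable AGE.
ages = [age for _, _, age in individuals]
summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "min": min(ages),
    "max": max(ages),
    "quartiles": statistics.quantiles(ages, n=4),  # needs Python 3.8 or later
    "sd": statistics.stdev(ages),
}

print(dict(gender_freq))
print(dict(crosstab))
for name, value in summary.items():
    print(name, value)
```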
Example Bar Chart
Labelling values
When data have been entered as numeric codes the graphs do not give much information
To label the values we first define a new text variable
Then we recode the existing numeric variable into the new variable
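The define-then-recode idea can be sketched in Python as follows. This is an illustration, not the Epi Info Define and Recode commands: the labels "Male" and "Female", the variable name `GENDER_TXT` and the helper `recode` are all hypothetical, and the real labels come from the questionnaire.

```python
# Hypothetical value labels for a numeric code.
GENDER_LABELS = {1: "Male", 2: "Female"}


def recode(rows, source, target, labels, missing="Unknown"):
    """Return copies of `rows` with a new text variable holding the labels."""
    # The numeric variable is kept; codes without a label are marked `missing`.
    return [{**row, target: labels.get(row[source], missing)} for row in rows]


rows = [
    {"ID": "01", "GENDER": 1},
    {"ID": "02", "GENDER": 2},
    {"ID": "03", "GENDER": 9},  # a code with no label defined
]
labelled = recode(rows, "GENDER", "GENDER_TXT", GENDER_LABELS)
for row in labelled:
    print(row["ID"], row["GENDER"], row["GENDER_TXT"])
```

Keeping the original numeric variable alongside the new text variable means the recode can be checked or repeated later.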
Define and Recode
Revised Bar Chart
Activity 4
For your dataset, generate:
Frequencies for the categorical variables
Means for the continuous variables
Cross-tabulations
Graphs
The results are copied to a Word file to be used in a report (sessions 10 & 11)
Moving out of Epi Info
The Write (Export) command within Analyze Data exports data from Epi Info.
Options on the Export
When exporting to Excel the options are Excel 3.0 or Excel 4.0
These are earlier versions of Excel – only one worksheet per file – but they can be read by the later versions of Excel
You can choose which variables to export
Note: Epi Info gives no indication that the export has been done, and re-running the command will append the same data to the worksheet!
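The risk of appending the same data twice is not specific to Epi Info. As an illustration of one way to guard against it in a script, the Python sketch below exports chosen variables to a CSV file and refuses to write if the file already exists. It does not reproduce the Epi Info Write command; `export_once`, the file name and the example rows are hypothetical.

```python
import csv
import os
import tempfile


def export_once(rows, variables, path):
    """Write the chosen variables to a new CSV file; fail if it already exists."""
    # Mode "x" creates the file and raises FileExistsError if it is already
    # there, so a second run cannot silently duplicate the data.
    with open(path, "x", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=variables, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)


rows = [
    {"HHID": "10001", "ID": "01", "GENDER": 1, "AGE": 51},
    {"HHID": "10001", "ID": "02", "GENDER": 2, "AGE": 48},
]
path = os.path.join(tempfile.mkdtemp(), "export.csv")

export_once(rows, ["HHID", "ID", "AGE"], path)  # first export succeeds
try:
    export_once(rows, ["HHID", "ID", "AGE"], path)  # second export is refused
    second_run = "written again"
except FileExistsError:
    second_run = "refused"
print(second_run)
```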
Variable and Value Labels
Variable labels in Epi Info are the prompts or questions
Value labels can only be created by recoding into a different variable
Labels in Exported file
In the resulting Excel file, variable names are used as column headings
Text fields come across too
Vlookup Function in Excel
Alternatively export only the numeric codes and use Vlookup in Excel for labels
Codes are stored in a separate worksheet
=vlookup(C2, Codes!$A$2:$B$8, 2, FALSE)
The advantage is that codes stay synchronised with labels
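The formula above looks up the value in C2 in the first column of the code table and returns the entry from the second column, with FALSE requesting an exact match. The same logic can be sketched in Python. The code table and the exported codes below are hypothetical, and this `vlookup` function is a simplified stand-in, not Excel's implementation.

```python
# Hypothetical code table, standing in for the "Codes" worksheet.
CODES = [
    (1, "Male"),
    (2, "Female"),
]


def vlookup(value, table, col_index):
    """Exact-match lookup, like VLOOKUP(value, table, col_index, FALSE)."""
    for row in table:
        if row[0] == value:
            # col_index counts from 1, as in Excel.
            return row[col_index - 1]
    return None  # Excel would show #N/A when no match is found


# Hypothetical column of exported numeric codes.
exported_codes = [1, 2, 2, 1, 3]
labels = [vlookup(code, CODES, 2) for code in exported_codes]
print(labels)
```

Because every label comes from the single code table, changing a label there changes it everywhere it is used.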
Vlookup Example
Activity 6 (Optional)
Export some of the data you have been working on into Excel
Try to use the vlookup function to label the coded variables
For more information about the Vlookup function see Chapter 5 – Multi-level Data of SSC Introduction to data handling in Excel – SADC version