ANALYSIS OF SURVEILLANCE DATA
Dr. Ronnie D. Domingo
Design form → Field data gathering → Data encoding → Data analysis → Report writing
Data Processing
• Sorting
• Coding
• Editing
• Summarizing
Data analysis
Data processing
• A series of steps undertaken to transform collected raw data into a form suitable for statistical analysis (Sanchez et al., 1989)
Data sorting method
• Types of data sheets
• Numbering system for data sheets (especially for surveys)
• The physical “container” for these raw data
Data Coding
• Examples
Data               Possible codes
“Yes” answer       Y or 1
“No” answer        N or 2
No response        999 or U for unknown
Does not know      888 or D
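The coding scheme in the table above can be applied mechanically during encoding. The following is a minimal sketch in Python, assuming the numeric variant of each code is used; the names `CODES`, `NO_RESPONSE`, and `code_answer` are hypothetical and not part of the original material.

```python
# Numeric variants of the codes listed in the table above.
CODES = {
    "yes": 1,
    "no": 2,
    "does not know": 888,
}
NO_RESPONSE = 999  # blank or missing answer


def code_answer(raw):
    """Return the numeric code for one raw questionnaire answer."""
    if raw is None or raw.strip() == "":
        return NO_RESPONSE
    key = raw.strip().lower()
    if key not in CODES:
        # Unrecognized answers are surfaced for editing, not silently coded.
        raise ValueError(f"no code defined for answer: {raw!r}")
    return CODES[key]


answers = ["Yes", "no", "", "Does not know"]
coded = [code_answer(a) for a in answers]
print(coded)
```

Raising an error on an unrecognized answer is one possible design choice; it forces the encoder to resolve the entry during editing instead of storing an arbitrary code.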
Data editing/validation
Examine the data for four things: C.A.T.S.
• Completeness
• Accuracy
• Traceability
• Standard format
Spreadsheet from Hell
By Daniel W. Byrne
Spreadsheet from Heaven
By Daniel W. Byrne
GIGO
• Garbage In, Garbage Out
Data Validation
Form Level Validation:
• At the stage of filling out the online or printed form.
• Mandatory vs optional fields
• Incomplete (INC) entries = “SUBMIT” fails
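Form-level validation of mandatory fields can be sketched as follows. This is an illustration only: the field names in `MANDATORY_FIELDS` and the functions `missing_mandatory` and `submit` are assumptions, since the slides do not describe a specific form.

```python
# Hypothetical form definition: fields that must be filled before submission.
# Any other field (for example "remarks") is treated as optional.
MANDATORY_FIELDS = ["farmer_name", "visit_date", "species"]


def missing_mandatory(form):
    """Return the mandatory fields that are absent or blank in the form."""
    return [
        name for name in MANDATORY_FIELDS
        if str(form.get(name) or "").strip() == ""
    ]


def submit(form):
    """Accept the form only when every mandatory field has an entry."""
    missing = missing_mandatory(form)
    if missing:
        return False, missing  # "SUBMIT" fails on incomplete entries
    return True, []


ok, missing = submit(
    {"farmer_name": "Fernando Cruz", "visit_date": "", "remarks": "n/a"}
)
print(ok, missing)
```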
Field Level Validation:
• Field = space where you write the answer
• “Farmer’s Name” field = Fernan@do Cruz (the stray “@” should be rejected)
• Date: 03-02-2016 (ambiguous: 3 February or 2 March?)
• Provide a list of possible answers
• Other fields automatically appear or disappear
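The field-level checks listed above (name characters, date format, list of possible answers) might look like the sketch below. The permitted name characters, the declared date format, and the `ALLOWED_SPECIES` pick-list are assumptions chosen for illustration.

```python
from datetime import datetime

# Assumed pick-list for a hypothetical "species" field.
ALLOWED_SPECIES = {"pig", "dog", "cattle", "carabao", "chicken"}


def valid_name(value):
    """Accept letters plus spaces, periods, apostrophes, and hyphens only."""
    text = value.strip()
    return text != "" and all(ch.isalpha() or ch in " .'-" for ch in text)


def parse_date(value, fmt="%d-%m-%Y"):
    """Parse a date under one declared format; return None if it does not fit."""
    try:
        return datetime.strptime(value, fmt).date()
    except ValueError:
        return None


def valid_choice(value, allowed=ALLOWED_SPECIES):
    """Accept only answers that appear in the list of possible answers."""
    return value.strip().lower() in allowed


print(valid_name("Fernan@do Cruz"))          # rejected because of "@"
print(parse_date("03-02-2016"))              # read as day-month-year
print(parse_date("03-02-2016", "%m-%d-%Y"))  # same text read as month-day-year
print(valid_choice("Dog"))
```

The two `parse_date` calls show why the form should declare a single date format: the same text yields two different dates depending on the convention assumed.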
Data Saving Validation:
• Option: keep the record as a draft copy vs “Submit” as final copy
• User with time to review and revise entries
Validation of Continuous Variables
• Continuous variables: age, height, weight, feed consumption, size of lesion, eggs per gram of feces, temperature, etc.
• Check the following:
– Minimum value
– Maximum value
– Mean
– Median
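The four checks for a continuous variable can be computed with the Python standard library. The sketch below uses hypothetical body weights in which one value (450) looks like a misplaced decimal point; none of these numbers come from the slides.

```python
from statistics import mean, median

# Hypothetical body weights (kg); 450 looks like a data-entry error for 45.0.
weights_kg = [42.5, 44.0, 45.5, 43.0, 450, 46.0, 44.5]

summary = {
    "min": min(weights_kg),
    "max": max(weights_kg),
    "mean": round(mean(weights_kg), 2),
    "median": median(weights_kg),
}
print(summary)
```

A maximum far above the rest of the data, or a mean far from the median, is a signal to go back to the original data sheet.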
Validation techniques
Sample bar chart of lung score of pigs from several farm sources. The expected lung scores should range from 0-55. Note “farmer117” registered an erroneous lung score of 60.
[Bar chart: lung score (axis 0 to 70) by farm source, Farmer111 to Farmer129]
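The same range check can be done without a chart. In this sketch the expected range of 0 to 55 and the erroneous score of 60 for farmer117 come from the caption above; the other scores and the function name `out_of_range` are hypothetical.

```python
# Expected range of the lung score, as stated in the caption.
LUNG_SCORE_MIN, LUNG_SCORE_MAX = 0, 55

# Hypothetical scores, except farmer117 = 60 (the erroneous value in the caption).
lung_scores = {
    "farmer115": 12,
    "farmer116": 30,
    "farmer117": 60,
    "farmer118": 0,
}


def out_of_range(scores, low=LUNG_SCORE_MIN, high=LUNG_SCORE_MAX):
    """Return the records whose value falls outside [low, high]."""
    return {k: v for k, v in scores.items() if not low <= v <= high}


print(out_of_range(lung_scores))
```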
Validation of Categorical Variables
• Categorical variables:
– nominal (sick, healthy)
– ordinal (+, ++, +++)
• Techniques:
– Frequency checks
– Cross tabulations
Cross-check variables to detect awkward combinations.
Example: a male dog positive for metritis.
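Frequency checks and cross tabulations can both be done with a simple counter. The records and the `IMPOSSIBLE` rule set below are hypothetical; only the male-dog-with-metritis example is taken from the slides.

```python
from collections import Counter

# Hypothetical records: (sex, diagnosis).
records = [
    ("female", "metritis"),
    ("male", "mange"),
    ("female", "mange"),
    ("male", "metritis"),  # awkward: metritis is an infection of the uterus
    ("femal", "mange"),    # misspelled category
]

# Frequency check: unexpected categories appear as extra, low-count keys.
sex_counts = Counter(sex for sex, _ in records)

# Cross tabulation: count every (sex, diagnosis) combination.
crosstab = Counter(records)

# Assumed rule set of combinations that should never occur.
IMPOSSIBLE = {("male", "metritis")}
flagged = [combo for combo in crosstab if combo in IMPOSSIBLE]

print(dict(sex_counts))
print(flagged)
```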
Data Verification
• Comparing the output of two encoders
• Comparing the data on the screen against the original paper document.
• Comparing the print out of the computer database and the original printed document.
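Comparing the output of two encoders is easy to automate once both have keyed in the same records. The following is a minimal sketch, assuming each encoder's output is a dictionary keyed by record number; the data and the function name `compare_entries` are hypothetical.

```python
def compare_entries(first, second):
    """Return (record_id, field, value_1, value_2) for every disagreement."""
    mismatches = []
    for record_id in sorted(set(first) | set(second)):
        rec1 = first.get(record_id, {})
        rec2 = second.get(record_id, {})
        for field in sorted(set(rec1) | set(rec2)):
            if rec1.get(field) != rec2.get(field):
                mismatches.append(
                    (record_id, field, rec1.get(field), rec2.get(field))
                )
    return mismatches


encoder_a = {1: {"species": "pig", "age": 6}, 2: {"species": "dog", "age": 3}}
encoder_b = {1: {"species": "pig", "age": 9}, 2: {"species": "dog", "age": 3}}
print(compare_entries(encoder_a, encoder_b))
```

Each mismatch still has to be resolved against the original paper document; the comparison only shows where to look.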
Summarizing the data
6400 records
6002 usable
398 rejected
4565 from Bulacan
1835 from Pampanga
83 other places
2 files: Abat.xls, Farm.xls
Data analysis
Data analysis: Tools
• Install statistical and graphics software packages
• Examples: SAS, SPSS, Stata, Epi Info, R, OpenEpi, WinEpi, QGIS
• Check the provider for newer software packages.
Type of Statistical Analysis
Descriptive statistics
Measures                        Descriptive statistics
Measures of central tendency    Mean, median, mode
Measures of variation           Range, variance, standard deviation, standard error, confidence limits
Frequency distribution          Counts or proportions in different groups; use frequency tables, histograms and other graphs for visual presentation
Rates and ratios                Incidence, prevalence, etc.
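Most of the measures in the table above are available in Python's standard `statistics` module. The sample below is hypothetical, and the confidence limits use the normal critical value 1.96 as a simplifying assumption; for a sample this small a t critical value would give wider limits.

```python
from math import sqrt
from statistics import mean, median, mode, stdev, variance

# Hypothetical sample of 10 measurements.
sample = [4, 5, 5, 6, 7, 5, 8, 6, 5, 9]

n = len(sample)

# Measures of central tendency.
central = {"mean": mean(sample), "median": median(sample), "mode": mode(sample)}

# Measures of variation.
value_range = max(sample) - min(sample)
var = variance(sample)  # sample variance (n - 1 denominator)
sd = stdev(sample)      # sample standard deviation
se = sd / sqrt(n)       # standard error of the mean

# Approximate 95% confidence limits for the mean (normal approximation).
lower = mean(sample) - 1.96 * se
upper = mean(sample) + 1.96 * se

print(central, value_range, round(var, 2), round(sd, 2), round(se, 2))
print(round(lower, 2), round(upper, 2))
```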
Inferential statistics
Tests for difference    Tests for association
See next page           Cohort study = Relative risk, attributable risk
                        Case-control study = Odds ratio
                        Experimental study = Protective value
                        Correlation and regression analysis = linear relationship, non-linear relationship
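The cohort and case-control measures of association are computed from a 2x2 table. The sketch below uses the standard textbook formulas with cells labeled a, b, c, d; the counts are hypothetical and the function names are illustrative.

```python
def relative_risk(a, b, c, d):
    """Cohort study: risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


def attributable_risk(a, b, c, d):
    """Cohort study: risk in the exposed minus risk in the unexposed."""
    return a / (a + b) - c / (c + d)


def odds_ratio(a, b, c, d):
    """Case-control study: cross-product ratio of the 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical 2x2 table:
#                diseased   not diseased
# exposed          a = 30      b = 70
# unexposed        c = 10      d = 90
a, b, c, d = 30, 70, 10, 90

print(round(relative_risk(a, b, c, d), 2))
print(round(attributable_risk(a, b, c, d), 2))
print(round(odds_ratio(a, b, c, d), 2))
```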
From your sample, make inferences about the larger population
Inferential statistics (deduce, generalize, extrapolate)
• Uses the theory of probability to make inferences about larger populations from your sample.
• The pattern seen in the analyzed sample is extrapolated to the target population.
Tests
Sample flow chart to select the appropriate statistical test
Essential components of a common report in veterinary practice
Generate information from collected data.
Name the comic hero who caught this criminal?
The Phantom
Who visited this place?
Calling?
Every disease leaves a distinct mark
Two premises of modern epidemiology:
Diseases in populations do not occur in random fashion
Diseases in populations do have multiple determinants
Disease patterns are described based on three main epidemiologic variables:
Reasons for the Epi Triad:
• The three = most important
• The result = significant information
• The process = systematic
• The by-product = hypothesis
• The output = transferable to the stakeholders
Information is processed data
Basic Activities: CDC
Count      Aggregate the cases in the line listing by characteristic (e.g., place, animal, time)
Divide     Divide the number of cases by the relevant denominator
Compare    Compare incidence across groups
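The three activities map directly onto a few lines of code. This is a sketch with a hypothetical line listing and hypothetical population denominators; the per-1,000 scaling is an assumption made for readability.

```python
from collections import Counter

# Hypothetical line listing: one entry per case, recording the place.
line_listing = ["Bulacan", "Pampanga", "Bulacan", "Bulacan", "Pampanga"]
# Hypothetical population at risk in each place (the denominators).
population = {"Bulacan": 1000, "Pampanga": 500}

# Count: aggregate the cases by characteristic (here, place).
cases = Counter(line_listing)

# Divide: cases over the relevant denominator, per 1,000 animals.
incidence = {p: 1000 * cases[p] / population[p] for p in population}

# Compare: ratio of incidence between the two groups.
ratio = incidence["Pampanga"] / incidence["Bulacan"]

print(dict(cases))
print(incidence)
print(round(ratio, 2))
```

Bulacan has more cases in this example, but Pampanga has the higher incidence once the counts are divided by population, which is why the Divide step comes before Compare.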
Forms of analysis output
• Textual
• Tabular
• Graphical
Data Presentation: Graphical (Horizontal bar graph)
[Horizontal bar graph. Axes: Farm Code (SFB, BFB, PFB, RFB, PGF, AAF, RDF, SCF); Proportion of positive samples (%), 0 to 90]
Figure 1. Bar graph of the proportion of Mycoplasma hyopneumoniae positive samples per farm of origin, as detected by the LAMP technique
Qualitative data
Data Presentation: Graphical (Vertical bar graph)
[Vertical bar graph. Provinces: Aurora, Bataan, Bulacan, N. Ecija, Pampanga, Tarlac, Zambales; vertical axis 0 to 350,000]
Figure 1. Estimated dog population in the different provinces of Region III, 2013
Qualitative data
Data Presentation: Graphical (Line graph)
[Line graph. Horizontal axis: years 2002 to 2013; vertical axis 0 to 500]
Figure 2. Secular trend of animal rabies in Central Luzon, 2002 to 2013.
Continuous quantitative data
Data Presentation: Graphical (Pie Graph)
Bulacan 20%
Nueva Ecija 15%
Tarlac 10%
Pampanga 30%
Zambales 7%
Aurora 6%
Bataan 12%
Figure 3. Rabies vaccine allotment to different provinces in Central Luzon, 2013
Animal
Which types of animals are prone to develop the disease and which types tend to be spared?
Common groupings employed in epidemiology
• Age
• Sex
• Species
• Breed
• Use
Classification of time trends
• Short-term
• Cyclical
• Seasonal
• Long-term
Graphs of endemic and sporadic diseases
Seasonal distribution of animal rabies in Central Luzon, 2002-2011
[Graph. Axes: Month (January to December); Incidence count of animal rabies, 0 to 25]
Surra Prevalence – CATT Percent Positive by Municipality
Source: EAHMI, based on data provided by PAHC and RADLs.
Types of Thematic Maps
1. Qualitative maps = maps that show non-measurable characteristics (e.g., low and high rainfall).
2. Quantitative maps = maps that depict areas with measured variations.
Qualitative Map
Geographic distribution of Japanese encephalitis
Types of quantitative maps: (a) Dot maps (b) Choropleth maps (c) Isopleth maps (d) Proportional symbol maps
Dot Maps
Choropleth maps
• Geographic areas are shaded or colored according to a prearranged key, each shading or color type corresponding to a range of values
• Commonly used in showing population density information
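Assigning each area to a shade is a matter of finding which value range it falls in. The sketch below assumes a hypothetical key with three class breaks and four shades; the breaks, shade names, and prevalence values are invented for illustration.

```python
from bisect import bisect_right

# Assumed prearranged key. Classes: below 10, 10 to <25, 25 to <50, 50 and above.
BREAKS = [10, 25, 50]  # percent positive
SHADES = ["lightest", "light", "dark", "darkest"]


def shade_for(value):
    """Return the shade whose value range contains `value`."""
    return SHADES[bisect_right(BREAKS, value)]


# Hypothetical prevalence (%) by area.
prevalence = {"Area A": 4.0, "Area B": 10.0, "Area C": 37.5, "Area D": 80.0}
shaded = {area: shade_for(v) for area, v in prevalence.items()}
print(shaded)
```

The shaded dictionary is what a mapping package such as QGIS would then use to color each polygon.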
Isopleth Map
From iso meaning “equal” and plethos meaning “quantity”: lines connect points of equal value.