What Can I Do With My Data? Utilizing Existing Data for Analysis and Hypothesis Development Falgunee Parekh, MPH, PhD

What Can I Do With My Data? › 2018 › 04 › ... · Case Study –Lassa Fever Data, Sierra Leone Viral Hemorrhagic Fevers (VHFs) pose serious biological threats and potent agents

What Can I Do With My Data?

Utilizing Existing Data for Analysis and Hypothesis Development

Falgunee Parekh, MPH, PhD

Agenda

• My Research Background

• Background on Analysis of Surveillance (or Initial) Data

• Case Study of Lassa Fever Data Analysis Utilizing Surveillance Data
  • Development of Collaboration
  • Type of Existing Data
  • Developing a Research Question
  • Analysis Plan
  • Results

• Questions and Discussion

Research Background

• Infectious Disease Epidemiologist

• >15 years of experience

• Field Epidemiology and Clinical Research

• Disease Experience
  • Malaria, Zika, Lassa Fever, Influenza
  • Zoonotic Diseases and One Health Approach

• Country Experience
  • Peru, Colombia, India, Azerbaijan, Tanzania, Democratic Republic of Congo, Gabon, South Africa, Zimbabwe

Aims of Surveillance

• Allows for rapid detection of disease outbreaks

• Supports early identification of disease problems – endemic and non-endemic

• Provides an early warning system able to identify new and emerging diseases

• Assesses the health status of a defined population (estimating level of occurrence/trends among diseases)

• Confirms absence of a specific disease

Uses and Applications of Surveillance Data

• Estimate the magnitude of the problem

• Detect epidemics/define a problem

• Evaluate control measures

• Facilitate health planning

• Determine geographic distribution of illness

• Portray the natural history of a disease

• Generate hypotheses, stimulate research

• Monitor changes in infectious agents and/or health practices

Example: Raw Dataset

| Case # | Date of Onset | Disease | Case Classification | Age | Gender |
|---|---|---|---|---|---|
| 1 | 22/10/16 | Anthrax | Confirmed | 19 | M |
| 2 | 25/10/16 | Anthrax | Not a case | 17 | M |
| 3 | 19/10/16 | Anthrax | Probable | 23 | F |
| 4 | 15/10/16 | Anthrax | Investigation Pending | 18 | ? |
| 5 | 23/10/16 | Anthrax | Confirmed | 21 | F |
| 6 | 27/10/16 | Anthrax | Suspect | 18 | M |
| 7 | 21/10/16 | Anthrax | Confirmed | 25 | F |
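
A first pass over a raw dataset like this one might look like the sketch below. It is only an illustration: it assumes the dates are day/month/two-digit-year, treats any gender code other than M or F as missing, and the function and variable names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Rows transcribed from the example raw dataset:
# (case number, date of onset, disease, case classification, age, gender)
RAW_ROWS = [
    (1, "22/10/16", "Anthrax", "Confirmed", 19, "M"),
    (2, "25/10/16", "Anthrax", "Not a case", 17, "M"),
    (3, "19/10/16", "Anthrax", "Probable", 23, "F"),
    (4, "15/10/16", "Anthrax", "Investigation Pending", 18, "?"),
    (5, "23/10/16", "Anthrax", "Confirmed", 21, "F"),
    (6, "27/10/16", "Anthrax", "Suspect", 18, "M"),
    (7, "21/10/16", "Anthrax", "Confirmed", 25, "F"),
]

VALID_GENDERS = {"M", "F"}  # assumption: any other code is treated as missing


def clean(rows):
    """Parse onset dates and list the problems that need follow-up."""
    records, problems = [], []
    for case_no, onset, disease, classification, age, gender in rows:
        try:
            # Assumed day/month/year, since "22/10/16" cannot be month-first.
            onset_date = datetime.strptime(onset, "%d/%m/%y").date()
        except ValueError:
            onset_date = None
            problems.append((case_no, "unparseable onset date"))
        if gender not in VALID_GENDERS:
            gender = None
            problems.append((case_no, "gender missing or invalid"))
        records.append({
            "case": case_no,
            "onset": onset_date,
            "disease": disease,
            "classification": classification,
            "age": age,
            "gender": gender,
        })
    return records, problems


records, problems = clean(RAW_ROWS)
classification_counts = Counter(r["classification"] for r in records)

print(problems)
print(classification_counts.most_common())
```

Listing the problems separately, instead of silently dropping rows, keeps case 4 in the dataset while flagging it for follow-up.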

Methods of Analysis of Surveillance Data

• Descriptive Methods
  • Analysis of the data by person, place and time
  • Calculation of rates
  • Use of tables, graphs, and maps

• Analytical methods
  • Cohort studies
  • Case-Control studies
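
As a small illustration of why rates matter more than raw counts, the sketch below converts case counts to rates per 100,000 population. The district names, counts, and populations are hypothetical and are not taken from any dataset in this presentation.

```python
def rate_per_100k(cases, population):
    """Return cases per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100_000 * cases / population


# Hypothetical districts: (reported cases, population at risk)
DISTRICTS = {
    "District A": (12, 250_000),
    "District B": (12, 60_000),
}

rates = {
    name: round(rate_per_100k(cases, population), 1)
    for name, (cases, population) in DISTRICTS.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate} per 100,000")
```

Both hypothetical districts report the same number of cases, but the rate in the smaller district is several times higher.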

Developing a Data Analysis Plan

• To analyze data you need a data analysis plan
  • A series of steps to organize your work

• The data analysis plan must build upon itself
  • Start with simple descriptive statistics
  • Build to more complex analyses

• Examine the data for possible errors and correct if possible at every step of the data analysis plan

Components of a Surveillance Analysis Plan

• Become familiar with the data

• Check for errors – “Clean” the data

• Analyze counts and rates by year, month, or week (Time)
  • Check for trends and seasonality

• Analyze data by regions or districts (Place)

• Analyze data by age and sex (Person)

• Subgroup analysis

Data Quality

• Missing Values

• Completeness of critical variables

• Data entry errors

• Adherence to strict case definitions

• Biases
  • Severe cases tend to be reported more than mild cases
  • Better surveillance in urban areas than rural

• Non-standard reporting

Collaborations

• Develop collaborations with other investigators
  • Fill your knowledge gaps
  • Assist in development of analysis plan
  • Allow for multiple perspectives in interpretation of analysis
  • Allow for hypothesis development and continued collaboration on future projects

Case Study – Lassa Fever Data, Sierra Leone

Case Study – Lassa Fever Data, Sierra Leone

• Viral Hemorrhagic Fevers (VHFs) pose serious biological threats and are potent agents of bioterrorism
  • Ease of aerosolized dissemination
  • Low infectious dose
  • High morbidity/mortality rates
  • Lack of effective vaccines or treatments

• The outbreak of Ebola demonstrates the rapid spread of VHFs across borders and regions due to mobile populations

• VHFs have a serious impact on public health and place a heavy burden on health care infrastructure and agencies

• Lassa Fever has been imported to other countries

Background: Lassa Fever (LF)

Lassa virus (LASV) is an arenavirus

The reservoir is the multimammate rat, genus Mastomys

LF is NOT a rare disease

Endemic to West Africa and transmitted throughout the year

Occurs in several countries including Guinea, Liberia, Nigeria, and Sierra Leone

Estimated that 300,000 cases and 5,000 deaths occur annually

One of the few VHFs that can be prospectively studied

Understanding how LF spreads can help us better understand other diseases like Ebola

LF in Sierra Leone 2004-2011

Study Objective

Characterize the morbidity/mortality, epidemiology and risk factors associated with clinical outcome for infection with Lassa virus (LASV)

Description of Dataset – LF from Sierra Leone

Developed Collaboration:

• Sierra Leone Ministry of Health and Sanitation (MOHS) provided access to country-wide data on suspected LF cases

• Surveillance and clinical data of suspected cases reported by MOHS, 2008–2013

• Includes data on:
  • Suspected cases identified through passive and active surveillance
  • Results of diagnostic laboratory testing
  • Epidemiologic data collected from patient questionnaires and clinical assessments
  • Potential contacts identified and approached by active surveillance team

LF Dataset

Study Methods:

• Retrospective analysis of data collected from surveillance of LF in Sierra Leone

• Assess epidemiologic risk factors associated with disease and mortality

Where Do I Start??

Analysis of Data by Person, Place and Time

Analysis by Person

• Compare counts or frequencies by:
  • Age
  • Gender
  • Ethnicity
  • Occupation
  • Vaccination status
  • Others?

Analysis by Place

• Present geographic distribution of counts or rates
  • Where cases were reported
  • Where exposures might occur

• Determine the geographic area with the highest rates of infection

Analysis by Time

• Examine occurrence of disease during particular time interval (years, months, weeks)

• Seasonal trends

• Analysis of time using person and place subcategories:
  • Gender frequency over time
  • Frequency in a region over time
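
The person, place and time breakdowns can be sketched with simple frequency counts. The line list below is hypothetical (district names, ages, and dates are made up for illustration), and the age bands are an assumption, not a recommendation from the slides.

```python
from collections import Counter
from datetime import date

# Hypothetical line list: (onset date, district, age in years, gender)
LINE_LIST = [
    (date(2016, 10, 15), "District A", 18, "M"),
    (date(2016, 10, 19), "District A", 23, "F"),
    (date(2016, 10, 21), "District B", 25, "F"),
    (date(2016, 10, 22), "District B", 4, "M"),
    (date(2016, 10, 23), "District A", 21, "F"),
    (date(2016, 10, 27), "District C", 3, "M"),
]


def age_group(age):
    """Assign an age band (bands chosen only for illustration)."""
    if age < 5:
        return "<5"
    if age < 15:
        return "5-14"
    return "15+"


# Person: counts by age group and gender
by_person = Counter((age_group(age), gender) for _, _, age, gender in LINE_LIST)

# Place: counts by district
by_place = Counter(district for _, district, _, _ in LINE_LIST)

# Time: counts by ISO week of onset
by_week = Counter(onset.isocalendar()[1] for onset, _, _, _ in LINE_LIST)

# Time within a person subcategory: gender frequency per ISO week
gender_by_week = Counter(
    (onset.isocalendar()[1], gender) for onset, _, _, gender in LINE_LIST
)

print(sorted(by_person.items()))
print(by_place.most_common())
print(sorted(by_week.items()))
print(sorted(gender_by_week.items()))
```

The same pattern extends to any other subcategory, for example district by week, by changing the key that is counted.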

Analysis of Subgroups

• Analysis of sub-groups can reveal additional information

• Sub-Groups
  • Gender
  • Children
  • Ethnicity
  • Individuals with outdoor occupations
  • Combinations (gender and ethnicity)

Develop an Analysis Plan

• Univariate analysis

• Temporal trend analysis across years

• Risk factor analysis to assess predictors of disease and mortality
  • Age
  • Gender
  • Other subgroups

LF Results 2008-2013 – Univariate Analysis by Time

3,348 suspected LF cases identified between 2008 and 2013:

• 27.0% were LF positive

• 31.5% of LF positive (n=872) died

• 56.3% of suspected cases were female

• 13.7% of suspected cases received Ribavirin treatment

Lassa Fever Enrollees, Diagnosis and Mortality 2008-2013 (bar chart; data labels below)

| Series | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | Total |
|---|---|---|---|---|---|---|---|
| N | 178 | 317 | 673 | 776 | 806 | 598 | 3348 |
| LF Pos | 42 | 64 | 191 | 192 | 222 | 194 | 905 |
| LF Died | 19 | 34 | 57 | 66 | 59 | 40 | 275 |
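
The yearly counts behind this chart (N, LF Pos, LF Died) are enough to sketch the univariate analysis by time. In the example below, the fatality figures divide deaths by all LF-positive cases in that year, which is an assumption: the slide's 31.5% uses n=872 as its denominator, and per-year denominators of that kind are not shown, so the yearly fatality values are only rough approximations. Variable names are hypothetical.

```python
YEARS = [2008, 2009, 2010, 2011, 2012, 2013]
SUSPECTED = [178, 317, 673, 776, 806, 598]    # N
LF_POSITIVE = [42, 64, 191, 192, 222, 194]    # LF Pos
LF_DIED = [19, 34, 57, 66, 59, 40]            # LF Died


def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# Share of suspected cases that tested LF positive, by year
positivity = {
    year: percent(pos, n)
    for year, pos, n in zip(YEARS, LF_POSITIVE, SUSPECTED)
}

# Crude fatality: deaths / all LF positives (approximate denominator)
crude_fatality = {
    year: percent(died, pos)
    for year, died, pos in zip(YEARS, LF_DIED, LF_POSITIVE)
}

overall_positivity = percent(sum(LF_POSITIVE), sum(SUSPECTED))

for year in YEARS:
    print(year, positivity[year], crude_fatality[year])
print("overall positivity:", overall_positivity)
```

Summing the yearly columns first is also a quick consistency check against the totals printed on the chart.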

LF Results – Analysis by Time, 2008-2013

• The proportion of female suspected cases significantly increased over the years

• Days Since Onset of Illness (DSOI) differed significantly across the years
  • Appears to be decreasing

Characteristics of Suspected LF Cases by Year

| Characteristic | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | Total | Chi-Sq. P-value | CA* Trend P-value |
|---|---|---|---|---|---|---|---|---|---|
| N | 178 | 317 | 673 | 776 | 806 | 598 | 3348 | | |
| Female | 84 (47.2) | 177 (55.8) | 356 (52.9) | 460 (59.3) | 473 (805, 58.7) | 335 (56.0) | 1,885 (3347, 56.3) | .016 | .026 |
| Age in Years (Median) | 25.5 (26.0) | 25.0 (316, 25.0) | 23.7 (670, 23.0) | 24.3 (766, 24.0) | 24.7 (788, 23.0) | 23.7 (593, 22.0) | 24.3 (3311, 23.0) | .23** | NA |
| Mean DSOI/days (Median) | 9.6 (134, 8.0) | 9.2 (307, 7.0) | 8.6 (647, 6.0) | 9.6 (600, 7.0) | 8.2 (418, 6.0) | 8.5 (323, 6.0) | 8.9 (2429, 7.0) | .0003** | NA |

* Cochran-Armitage Trend test, ** Kruskal-Wallis test
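
The trend p-value for the Female row can be approximated with a Cochran-Armitage trend test. The sketch below is a minimal pure-Python version, assuming equally spaced scores (0 to 5) for the six years and a two-sided normal approximation. The slides do not state which software or scoring was used, so this illustrates the technique and is not the author's exact procedure. The function name is hypothetical.

```python
import math


def cochran_armitage(events, totals, scores=None):
    """Cochran-Armitage test for a linear trend in proportions.

    events[i] out of totals[i] subjects have the characteristic in group i.
    Returns (z statistic, two-sided p-value from the normal approximation).
    """
    if scores is None:
        scores = list(range(len(events)))  # assumption: equally spaced years
    n_total = sum(totals)
    p_bar = sum(events) / n_total
    # Score-weighted sum of observed minus expected events
    t_stat = sum(s * (x - n * p_bar) for s, x, n in zip(scores, events, totals))
    s1 = sum(n * s for s, n in zip(scores, totals))
    s2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_total)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return z, p_value


# Female suspected cases per year, with the denominators from the table
# (2012 has 805 records with gender recorded rather than 806).
FEMALE = [84, 177, 356, 460, 473, 335]
WITH_GENDER = [178, 317, 673, 776, 805, 598]

z_stat, p_value = cochran_armitage(FEMALE, WITH_GENDER)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.3f}")
```

Under these assumptions the result should come out close to the .026 reported in the table.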

LF Results

Total Suspected Enrollees with Defined LF Diagnosis, 2008-2013

[Figure: LF Positive Cases and Ribavirin Treated by Year. Y-axis: Proportion of Suspected Cases; x-axis: Year (2008-2013). Series: % LF+ and % Ribavirin. Trend p-values shown: p<.0001*, p<.0009*]

[Figure: LF Mortality by Year. Y-axis: Proportion of LF Positive; x-axis: Year (2008-2013). Series: % Mortality. Trend p-value shown: p<.0001*]

• Increased prevalence may be due to improved detection and/or increasing transmission

• More mild LF cases may be detected that don’t require Ribavirin treatment

* Cochran-Armitage Trend test

LF Results – Analysis by Place

Map of Cases in Sierra Leone

• LF cases identified from districts that had previously not reported LF
  • Improved detection of LF
  • Improved awareness of the population at risk of LF
  • LF may be spreading

[Figure: maps of cases in Sierra Leone, one panel per year, 2008-2013. Courtesy of Marc Souris]

LF Results – Analysis by Person

Risk Factors of Lassa Fever Diagnosis, 2008-2013

• LF-positive patients were significantly younger and had more days since onset of illness

• LF-negative patients were significantly more likely to have reported a death in their household and contact with an LF case

• Gender was not significantly different between LF positive and LF negative

| Characteristic | All Patients | LF | Non-LF | P-value |
|---|---|---|---|---|
| N | 3233 | 882 (27.3) | 2351 (72.7) | |
| Female | 1823 | 508 (57.6) | 1315 (55.9) | NS |
| Mean Age (Median) | 24.5 (24.0) | 21.9 (20.0) | 25.5 (25.0) | <.0001* |
| Mean DSOI (Median) | 8.96 (2377, 7.0) | 9.6 (723, 8.0) | 8.7 (1654, 6.0) | <.0001* |
| House Deaths | 167 (927) | 36 (295, 12.2) | 131 (632, 20.7) | .0017 |
| Contact with LF Case | 770 (2009) | 145 (565, 25.6) | 625 (1444, 43.3) | <.0001 |
| Ribavirin | 454 (3218) | 406 (871, 46.6) | 48 (2347, 2.1) | <.0001 |

* Wilcoxon Rank Sum Test
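
For the categorical rows, a 2x2 comparison can be sketched with a Pearson chi-square test. The example below uses the House Deaths row (36 of 295 LF patients vs. 131 of 632 non-LF patients) and assumes no continuity correction; the slide does not say which test produced its p-values for these rows, so treat this as an illustration. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (chi-square statistic, p-value).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# House Deaths row: reported / not reported, by LF status
lf_yes, lf_total = 36, 295
non_lf_yes, non_lf_total = 131, 632

chi2, p_value = chi_square_2x2(
    lf_yes, lf_total - lf_yes,
    non_lf_yes, non_lf_total - non_lf_yes,
)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Under these assumptions the p-value should land close to the .0017 shown in the table.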

LF Results

Risk Factors of Lassa Fever Mortality, 2008-2013

• Non-survivors were significantly younger (p=.0005)

• Survivors were significantly more likely to report a household death or contact with an LF case (p=.045, p<.0001)

• Ribavirin was significantly associated with mortality (p<.0001); most likely a confounding factor and an indication of disease severity

| Characteristic | Total LF | Non-Survivors | Survivors | P-value |
|---|---|---|---|---|
| N | 856 | 271 (31.7) | 585 (68.3) | |
| Female | 495 (57.8) | 146 (53.9) | 349 (59.7) | NS |
| Mean Age (Median) | 21.7 (20.0) | 18.7 (18.0) | 23.1 (21.0) | .0005* |
| Mean DSOI (Median) | 9.6 (704, 8.0) | 9.3 (230, 8.0) | 9.7 (474, 7.0) | NS* |
| House Deaths | 36 (285) | 3 (60, 5.0) | 33 (225, 14.7) | .045 |
| Contact with LF Case | 139 (549) | 16 (141, 11.4) | 123 (408, 30.2) | <.0001 |
| Ribavirin | 405 (852) | 156 (270, 57.8) | 249 (582, 42.8) | <.0001 |

* Wilcoxon Rank Sum Test

LF Results – Subgroup Analysis

Children < 5 years of age vs. All Other Suspected LF Cases, 2008-2013

• Children < 5 years were significantly more likely to be LF positive, receive Ribavirin treatment, and die from LF compared to all others

• Children < 5 years were significantly more likely to have malaria

• All others were significantly more likely to report household death or contact with case; low sample size

| Characteristic | Total | Age<5 | All Others | P-value |
|---|---|---|---|---|
| N | 3233 | 583 | 2650 | |
| LF Positive | 882 (27.3) | 198 (34.0) | 684 (25.8) | <.0001 |
| LF Mortality (N=856) | 271 (31.7) | 83 (193, 43.0) | 188 (663, 28.4) | .0001 |
| Ribavirin | 454 (3218, 14.1) | 107 (582, 18.4) | 347 (2636, 13.2) | .0011 |
| Female | 1823 (56.4) | 268 (46.0) | 1555 (58.7) | <.0001 |
| Household Deaths | 167 (927, 18.0) | 7 (95, 7.4) | 160 (832, 19.2) | .0044 |
| Contact with Case | 770 (2009, 38.3) | 76 (309, 24.6) | 694 (1700, 40.8) | <.0001 |
| Mean DSOI | 9.6 (704, 8.0) | 8.0 (392, 7.0) | 9.1 (1985, 7.0) | NS |
| Malaria | 152 (356, 42.7) | 57 (87, 65.5) | 95 (269, 35.3) | <.0001 |

• Among LF+, median DSOI for children < 5 years was 7.0 compared to 8.0 for all others (p=.065)

LF Results – Subgroup Analysis

Pregnant vs. Non-pregnant females (14-49), 2008-2013

| Characteristic | Total | Pregnant | Non-Pregnant | P-value |
|---|---|---|---|---|
| N | 345 | 162 (47.0) | 183 (53.0) | - |
| LF Positive | 120 (34.8) | 63 (38.9) | 57 (31.2) | NS |
| LF Mortality | 44 (117, 37.6) | 32 (61, 52.5) | 12 (56, 21.4) | .0005 |
| Ribavirin | 71 (343, 20.7) | 43 (160, 26.9) | 28 (15.3) | .0083 |
| Household Deaths | 21 (220, 9.6) | 5 (68, 7.4) | 16 (152, 10.5) | NS |
| Contact with Case | 39 (266, 14.7) | 10 (107, 9.4) | 29 (159, 18.2) | .04 |
| Mean DSOI | 8.4 (278, 6.0) | 8.5 (122, 7.0) | 8.4 (156, 6.0) | NS |
| Malaria | 41 (117, 35.0) | 17 (46, 37.0) | 24 (71, 33.8) | NS |

• Pregnant women significantly more likely to receive Ribavirin treatment and die from LF

• Non-pregnant women significantly more likely to report contact with case

• Small sample size, so difficult to detect significance for other factors
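
To put a size on the mortality difference, the LF Mortality row (32 deaths among 61 pregnant women vs. 12 among 56 non-pregnant women) can be expressed as a risk ratio. The sketch below uses the standard log-scale approximation for a 95% confidence interval; the slides report only a p-value, so the risk ratio and interval are an added illustration rather than part of the original results. The function name is hypothetical.

```python
import math


def risk_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Risk ratio of group A vs. group B with an approximate 95% CI.

    The interval uses the usual standard error of log(RR).
    """
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    rr = risk_a / risk_b
    se_log_rr = math.sqrt(
        1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# LF mortality: pregnant (32 of 61) vs. non-pregnant (12 of 56)
rr, ci_lower, ci_upper = risk_ratio(32, 61, 12, 56)
print(f"RR = {rr:.2f}, 95% CI {ci_lower:.2f} to {ci_upper:.2f}")
```

With groups this small the interval is wide, which is consistent with the slide's caution about sample size.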

LF Results – Subgroup Analysis

Malaria and LF Co-Infection, 2008-2013

• Malaria testing results reported for 356 suspected LF cases

• 152 (42.7%) of suspected LF cases were malaria positive

• 55/141 (39.0%) of LF+ patients were co-infected with malaria

• Those who were co-infected were significantly younger compared to those who were only LF positive: 7.0 years vs. 22.7 years (p=.0005)

• A large share (41.8%) of co-infection cases occurred in children < 5 years of age

• No significant difference in mortality detected between co-infected and LF+ alone; low sample size

[Images: Lassa Fever; P. falciparum malaria. Source: www.cdc.gov]

Summary of LF Results - Interpretation

• LF prevalence significantly increased over the years, and LF was reported from new districts
  • Could be due to improving detection, increasing transmission, or both

• LF mortality significantly decreased over the years
  • Earlier detection and improving clinical management may result in better outcome

• Ribavirin treatment significantly associated with mortality
  • The most severe cases usually receive Ribavirin treatment
  • Ribavirin treatment probably a confounding factor, and an indicator of severe disease

Summary of LF Results – Interpretation and Hypothesis Generation

Summary of Results

• Young individuals, especially children < 5 years of age, were significantly more likely to be LF positive, to receive Ribavirin treatment, and to die from LF
  • Early detection and clinical care targeted for LF-infected young children may be critical to improving LF outcome

• Pregnant women were significantly more likely to die from LF compared to non-pregnant counterparts

• High prevalence of malaria co-infection, especially in younger age
  • Impact of co-infections on LF outcome needs to be further investigated

Hypothesis Development

• Young children have increased risk of LASV infection and severe LF

• Pregnant women have increased risk of severe LF and death

• Malaria exacerbates LASV infection and results in more severe LF outcome

Conclusion

If you have data, develop a step-by-step plan for analysis:

• Define objectives

• Assess data quality

• Develop collaborations

• Develop study methods

• Develop analysis plan
  • Person, Place, Time

• Conduct analysis utilizing appropriate resources

• Interpret results

• Present results – Abstract, Presentation, or Manuscript

• Develop hypotheses for future studies

Questions