Predictive Analytics & West Nile Disease Prevention
Raed Mansour, MS @reedmonseur
November 3, 2015
Presenter Disclosure
Raed Mansour, MS
No relationships to disclose.
Culex pipiens
"Culex pipiens 2007-1" by Alvesgaspar - Own work. Licensed under CC BY-SA 3.0 via Commons https://commons.wikimedia.org/wiki/File:Culex_pipiens_2007-1.jpg#/media/File:Culex_pipiens_2007-1.jpg
West Nile Virus Activity by State (10/27/15)
West Nile Virus Disease Cases 2015
58
West Nile Virus Disease Cases 2015
Comprehensive Surveillance
Since 2004, CDPH has had a comprehensive environmental surveillance and control program to address West Nile virus.
• Collection of mosquitoes from traps located throughout the city
• Identification and sorting of mosquitoes collected from these traps
• Testing of specific species of mosquitoes for WNV
• Environmental control measures have included targeted adulticiding efforts.
Weekly Reports June to October
All Trap Locations & Sprays (2007-2014)
What if…
• Determine the effectiveness of adulticiding efforts in reducing WNV-positive mosquito pools and allocate and prioritize trapping and spraying.
• Predict when and where different species of mosquitos will test positive for West Nile Virus.
• Improve visualizations
• Create an open source model replicable for other cities to use.
Partners
Robert Wood Johnson Foundation and Kaggle
• RWJF funded this project to build a model that will identify factors associated with the presence of West Nile virus in mosquitoes and determine the effectiveness of mosquito control measures.
• Information gained will help CDPH better target its surveillance, prevention and control efforts.
• Collaboration builds capacity.
Challenge
Given weather, location, testing, and spraying data, this competition asks you to predict when and where different species of mosquitos will test positive for West Nile virus. A more accurate method of predicting outbreaks of West Nile virus in mosquitos will help the City of Chicago and CDPH more efficiently and effectively allocate resources towards preventing transmission of this potentially deadly virus.
Crowdsourcing Models
Data
• Quality Controlled Local Climatological Data, National Oceanic and Atmospheric Administration. Daily and monthly summaries for approximately 1,600 U.S. locations. Daily Summary forms are not available for all stations. May 1, 2007 to October 31, 2014.
• Zoonotic Disease Tracker. Records all environmental surveillance data as well as data gathered to determine the effectiveness of larviciding efforts. There are 100,000 records and 72,000 records of adulticiding data.
• Vector Disease Control International. Provides consultation on environmental surveillance and control measures and conducts adulticiding efforts. VDCI has also recorded where adulticiding has occurred.
Data
• Spray: since August 29, 2011. Date, Time, Latitude, Longitude
• Train Data: May 9, 2007 to September 9, 2013. Date, Address, Species, Block, Street, Trap, AddressNumberAndStreet, Latitude, Longitude, AddressAccuracy, NumMosquitos, WNV Present
• Sample Submission: WNV Present
• Test Data: June 11, 2008 to October 2, 2014. Id, Date, Address, Species, Block, Street, Trap, AddressNumberAndStreet, Latitude, Longitude, AddressAccuracy
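One way the spray and trap records listed above could be combined is a "sprayed recently nearby" flag per trap collection. The following is only a minimal sketch under stated assumptions: the 1 km radius, the 14-day window, the function names, and the sample rows are all hypothetical and are not taken from the CDPH program or from any competition entry. Only the Date, Latitude, and Longitude field names come from the data description.

```python
import math
from datetime import date


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def sprayed_recently(trap, sprays, max_km=1.0, max_days=14):
    """True if any spray point is within max_km of the trap and was applied
    between 0 and max_days days before the trap collection date.
    max_km and max_days are illustrative assumptions, not program values."""
    for s in sprays:
        days_since = (trap["Date"] - s["Date"]).days
        if not 0 <= days_since <= max_days:
            continue  # spray happened after collection, or too long before it
        dist = haversine_km(trap["Latitude"], trap["Longitude"],
                            s["Latitude"], s["Longitude"])
        if dist <= max_km:
            return True
    return False


# Made-up rows for illustration only (not real surveillance data).
sprays = [{"Date": date(2013, 8, 15), "Latitude": 41.95, "Longitude": -87.80}]
trap_near = {"Date": date(2013, 8, 22), "Latitude": 41.954, "Longitude": -87.801}
trap_far = {"Date": date(2013, 8, 22), "Latitude": 41.70, "Longitude": -87.60}
trap_early = {"Date": date(2013, 8, 10), "Latitude": 41.954, "Longitude": -87.801}

flags = [sprayed_recently(t, sprays) for t in (trap_near, trap_far, trap_early)]
```

A flag like this could be fed to a model as one feature, or used to compare WNV-positive rates between sprayed and unsprayed trap collections.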
An open competition to determine the best model.
1,306 TEAMS
1,375 SCRIPTS
1,447 PLAYERS
29,961 ENTRIES
“I'm very much a novice myself at Kaggle and have benefited immensely from
other people generously sharing their scripts. Although the team and I did rather
poorly overall (ROC ~0.679) this was my favorite competition by far.”
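The ROC figure in the quote above is read here as the area under the ROC curve (AUC): the probability that a randomly chosen WNV-positive record receives a higher predicted score than a randomly chosen negative one. As a hedged illustration of how such a score can be computed, the sketch below uses the rank-sum (Mann-Whitney) identity. The function name and sample values are made up and do not come from any competition entry.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.
    labels are 0/1; tied scores receive the average of their ranks."""
    n = len(labels)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at sorted position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank over the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Made-up example: two negatives, two positives.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a score near 0.68 ranks positives above negatives roughly two times out of three.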
Crowdsourcing Models
Operationalize…
• What is the normal business practice?
• What is the Minimum Viable Product?
• Customization to Needs of Program: UX/UI Dashboard/Tablet to set Traps & Plan Spraying
• Updates/Improvements/Discoveries: Culex population dynamics, weather forecasting, wind, etc. It is believed that hot and dry conditions are more favorable for WNV than cold and wet.
• Deployment: Connected Data Systems, Open and/or Closed Data
• Automation
• Sustainability
• Competing priorities
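The point above about hot and dry conditions suggests summarizing recent weather for each trap collection date. The sketch below is one possible way to do that, assuming a simple per-day weather lookup. The keys "tavg" and "precip", the 7-day default window, and the sample values are hypothetical and are not the fields or settings of any deployed model.

```python
from datetime import date, timedelta


def trailing_weather(weather_by_day, day, window=7):
    """Mean temperature and total precipitation over the `window` days
    before `day` (the collection day itself is excluded)."""
    temps = []
    total_precip = 0.0
    for offset in range(1, window + 1):
        rec = weather_by_day.get(day - timedelta(days=offset))
        if rec is None:
            continue  # no station report for that day; skip it
        temps.append(rec["tavg"])
        total_precip += rec["precip"]
    mean_temp = sum(temps) / len(temps) if temps else None
    return {"mean_temp": mean_temp,
            "total_precip": total_precip,
            "days_used": len(temps)}


# Made-up daily summaries for illustration only.
weather = {
    date(2013, 8, 1): {"tavg": 80.0, "precip": 0.0},
    date(2013, 8, 2): {"tavg": 82.0, "precip": 0.1},
    date(2013, 8, 3): {"tavg": 84.0, "precip": 0.0},
}
features = trailing_weather(weather, date(2013, 8, 4), window=3)
```

Reporting days_used alongside the summaries makes it visible when a window is only partly covered, which matters because daily summaries are not available for every station.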
Open vs “Closed” Data
Ease of Collaborations to build capacity faster
• Lower Costs
• Limited bureaucratic process (contracting)
• Complement traditional PH workforce with data scientists
Open Codes
• Opportunities to improve models through crowd evaluation
• Replication and adaptation to spread the innovation
Policy & Infrastructure
• Tradition creates inertia that impedes change efforts
• Must work within existing constructs
• Sustainability (workforce vs. community)
Limited privacy concerns vs PHI (deidentified blocks)
Open Data Spurs Innovation
Community Built
• Apps
• Models
• Evaluations
• Data Visualizations
• Opens Funding Opportunities
Data Visualizations
Data Visualizations
Fight the Bite App?
Most apps work by emitting a high-frequency sound, inaudible to humans, so that mosquitoes will stay away.
What if predictive models could send alerts based on analytics to give:
Recommendations when they are needed the most
• Clothing
• Repellents
• Time of Day
Mosquito-Proof Your Home
• Emptying standing water from flowerpots, gutters, buckets, pool covers, pet water dishes, discarded tires, and birdbaths on a regular basis.
Build Community Trust
• Announcements of West Nile Virus Surveillance and Control Programs
• Knowing spraying schedule with automated visualizations of communities to be sprayed
• Facilitate Calls to 311 to report standing water, dead birds or high grass/weeds.
Open Chicago Models Available
https://github.com/Cardal/Kaggle_WestNileVirus
https://github.com/diefimov/west_nile_virus_2015
https://github.com/nhlx5haze/Kaggle_WestNileVirus
/ChicagoPublicHealth
@ChiPublicHealth
www.CityofChicago.org/Health