Forecasting critical food violations at restaurants using open data
Nicole Donnelly
PyData DC
October 8, 2016
Hello! Thank you!
Who are you?
Who am I?
Why am I here?
The Project
Replicate Chicago’s Food Inspection Forecasting project using Python and data about DC.
Data → Ingest → Wrangle → Compute → Visualize → Report
Hypothesis
Foodborne illness outbreaks affect millions of people annually. The city of Washington, DC, like most cities, has limited resources to inspect food establishments for critical violations that lead to these outbreaks.
We can use machine learning to predict when a critical violation is likely to occur and prioritize inspections to catch these violations sooner, mitigating foodborne illness outbreaks and more effectively deploying limited resources.
Instance: an inspection
Features: the data about the instance
Prediction: will there be a critical violation?
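The framing above maps onto a standard binary classification setup. The following is a minimal sketch of that mapping, not the project's actual code: the `Inspection` class, the `to_xy` helper, and the feature names `prior_critical` and `max_temp_f` are all hypothetical, and the two records are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Inspection:
    """One instance: a single inspection of one establishment (hypothetical schema)."""
    establishment_id: str
    inspection_date: date
    features: dict             # data known about the instance before the inspection
    critical_violation: bool   # label: was a critical violation found?


def to_xy(inspections, feature_names):
    """Split instances into a feature matrix X and a label vector y."""
    X = [[insp.features.get(name, 0.0) for name in feature_names]
         for insp in inspections]
    y = [int(insp.critical_violation) for insp in inspections]
    return X, y


# Made-up example records, for illustration only.
example_inspections = [
    Inspection("A", date(2016, 1, 5), {"prior_critical": 2, "max_temp_f": 41.0}, True),
    Inspection("B", date(2016, 1, 6), {"prior_critical": 0, "max_temp_f": 38.0}, False),
]

X, y = to_xy(example_inspections, ["prior_critical", "max_temp_f"])
print(X)
print(y)
```

Each row of `X` describes one inspection and the matching entry of `y` is the outcome a model would try to predict.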
Data
Weather
DOH Inspections, Crime, ABRA, DCRA, Construction
Rating, Number of Reviews, Category
Non-emergency City Issues
Places
Scraping
APIs
CSVs
Ingest
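Whatever the route (scraping, APIs, or CSVs), ingestion ends with records in a common shape. Here is a small sketch of that idea using only the standard library. The column names, the JSON layout, and the two in-memory strings are assumptions standing in for a real downloaded file and a real API response; they do not describe any actual DC data source.

```python
import csv
import io
import json

# Made-up stand-ins for a downloaded CSV file and an API response body.
CSV_TEXT = (
    "establishment_id,inspection_date,critical_violations\n"
    "A,2016-01-05,2\n"
    "B,2016-01-06,0\n"
)
API_TEXT = '{"results": [{"id": "A", "rating": 4.5, "review_count": 120}]}'


def read_inspections_csv(text):
    """Parse CSV text into a list of dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "establishment_id": row["establishment_id"],
            "inspection_date": row["inspection_date"],
            "critical_violations": int(row["critical_violations"]),
        }
        for row in reader
    ]


def read_ratings_json(text):
    """Parse a JSON response into a dict keyed by establishment id."""
    payload = json.loads(text)
    return {
        item["id"]: {
            "rating": float(item["rating"]),
            "review_count": int(item["review_count"]),
        }
        for item in payload["results"]
    }


inspection_rows = read_inspections_csv(CSV_TEXT)
ratings = read_ratings_json(API_TEXT)
print(inspection_rows)
print(ratings)
```

Keying every source by the same establishment identifier makes the later join into one row per inspection much simpler.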
Clean the data
Create the instances
Come to terms with features
Feature engineering
Wrangle
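The wrangling steps above can be sketched as one pass over the raw rows: clean each row, then build one instance per inspection with features derived only from that establishment's earlier history. This is an illustrative sketch under stated assumptions: the field names, the three engineered features, and the choice to treat a blank violation count as zero are all hypothetical, not the project's actual rules.

```python
from collections import defaultdict
from datetime import date

# Made-up raw rows; note the stray whitespace and the blank count.
raw_rows = [
    {"establishment_id": "A ", "inspection_date": "2016-01-05", "critical_violations": "2"},
    {"establishment_id": "A", "inspection_date": "2016-03-05", "critical_violations": "0"},
    {"establishment_id": "B", "inspection_date": "2016-02-01", "critical_violations": ""},
]


def clean(row):
    """Strip whitespace and convert strings to typed values."""
    count = row["critical_violations"].strip()
    return {
        "establishment_id": row["establishment_id"].strip(),
        "inspection_date": date.fromisoformat(row["inspection_date"]),
        # Assumption: a blank count means no critical violations were recorded.
        "critical_violations": int(count) if count else 0,
    }


def build_instances(rows):
    """Create one instance per inspection using only earlier inspections as features."""
    history = defaultdict(list)
    instances = []
    for row in sorted(map(clean, rows), key=lambda r: r["inspection_date"]):
        past = history[row["establishment_id"]]
        instances.append({
            "establishment_id": row["establishment_id"],
            "prior_inspections": len(past),
            "prior_critical": sum(p["critical_violations"] for p in past),
            "days_since_last": (
                (row["inspection_date"] - past[-1]["inspection_date"]).days
                if past else None
            ),
            "label": int(row["critical_violations"] > 0),
        })
        past.append(row)
    return instances


instances = build_instances(raw_rows)
for instance in instances:
    print(instance)
```

Processing rows in date order and appending to the history only after the instance is built keeps the outcome of an inspection out of its own features.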
Which estimator?
All of them
Compute
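One way to try "all of them" is to loop over several candidate estimators and score each with the same cross-validation split. The sketch below assumes scikit-learn and uses synthetic data from `make_classification`; the four estimators and the F1 metric are illustrative choices, not the set or the metric used in the talk.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, imbalanced stand-in data (about 20% positives); not inspection data.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, weights=[0.8, 0.2], random_state=0
)

# Hypothetical shortlist of estimators to compare.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gaussian_nb": GaussianNB(),
    "k_neighbors": KNeighborsClassifier(),
}

# Mean 5-fold cross-validated F1 score for each candidate.
scores = {
    name: cross_val_score(model, X_demo, y_demo, cv=5, scoring="f1").mean()
    for name, model in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: mean F1 = {score:.3f}")
```

Scoring every candidate the same way gives a like-for-like baseline before any tuning.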
Drumroll please...
Visualize
Results on out-of-sample data
The scores were not great, but reprioritizing the inspections using the model confidence scores yields results.
Report
11% more violations, 10 days sooner
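A comparison like this can be computed by reordering the same set of inspections by model confidence and checking when the violations would have been found. The sketch below is a toy illustration with made-up scores and outcomes, and it assumes a fixed number of inspections per day; the helper names `assign_days` and `mean_discovery_day` are hypothetical, and the numbers it prints are not the project's results.

```python
PER_DAY = 2  # assumed inspection capacity per day

# Made-up inspections in their original schedule order.
scheduled_inspections = [
    {"id": "A", "score": 0.10, "violation": False},
    {"id": "B", "score": 0.80, "violation": True},
    {"id": "C", "score": 0.30, "violation": False},
    {"id": "D", "score": 0.65, "violation": True},
    {"id": "E", "score": 0.20, "violation": False},
    {"id": "F", "score": 0.55, "violation": True},
]


def assign_days(items, per_day):
    """Pair each inspection with the day (starting at 1) on which it would run."""
    return [(index // per_day + 1, item) for index, item in enumerate(items)]


def mean_discovery_day(scheduled):
    """Average day on which inspections with a critical violation take place."""
    days = [day for day, item in scheduled if item["violation"]]
    return sum(days) / len(days)


original = assign_days(scheduled_inspections, PER_DAY)
reprioritized = assign_days(
    sorted(scheduled_inspections, key=lambda item: item["score"], reverse=True),
    PER_DAY,
)

days_sooner = mean_discovery_day(original) - mean_discovery_day(reprioritized)
found_day_one_original = sum(item["violation"] for day, item in original if day == 1)
found_day_one_reprioritized = sum(item["violation"] for day, item in reprioritized if day == 1)

print(f"Violations found {days_sooner:.2f} days sooner on average")
print(f"Found on day 1: {found_day_one_original} originally, "
      f"{found_day_one_reprioritized} after reprioritizing")
```

In this toy data the scores rank every violation first, which a real model will not do; the point is only the shape of the comparison between the two orderings.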
What now?
Build a better dataset
Get more data
Get more input
Poor scores do not mean failure; they are just a starting point.