vitaly-gordon
Data Science
Data Meetup Jan. 12
What is data science?
Besides a reason to have beer and pizza…
What does the literature say?
Hacking
“Good data scientists understand, in a deep way, that the heavy lifting of cleanup and preparation isn’t something that gets in the way of solving the problem… it is the problem”
DJ Patil
bash/awk/sed
Statistics
What’s the probability that 2 people in the front 2 rows share a birthday?
1. ~10%
2. ~20%
3. ~50%
4. ~90%
What’s the probability that a patient who tests positive on a 99% accurate test for a 1-in-1000 disease actually has it?
1. ~10%
2. ~50%
3. ~90%
4. ~99%
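Both questions can be checked with a few lines of arithmetic. The sketch below is an illustration, not part of the original slides. It assumes birthdays are uniform over 365 days, and it reads "99% accurate" as 99% sensitivity and 99% specificity. The function names are hypothetical, and the head count of the front two rows is not given in the slides, so it stays a parameter.

```python
def birthday_collision_probability(n_people: int, days: int = 365) -> float:
    """Probability that at least two of n_people share a birthday.

    Assumes birthdays are independent and uniform over `days` days.
    """
    if n_people > days:
        return 1.0  # pigeonhole: a shared birthday is guaranteed
    p_all_distinct = 1.0
    for i in range(n_people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def posterior_given_positive(prevalence: float,
                             sensitivity: float,
                             specificity: float) -> float:
    """Bayes' rule: P(disease | positive test)."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)
```

With the classic group size of 23 (an assumption, not the size of the room), `birthday_collision_probability(23)` comes out just above one half. Under the stated reading of "99% accurate", `posterior_given_positive(0.001, 0.99, 0.99)` is roughly 0.09, because false positives from the healthy majority outnumber true positives.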
Domain Expertise
Intelligence Cookbook
Just follow the steps
The Recipe
First, make it valuable.
Then, make it possible.
Then, make it beautiful.
Then, make it smart.
Example
E-Commerce website
Make it valuable
Find a KPI that is correlated with bottom-line revenue
e.g. number of products the visitor browses through
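One way to check whether a candidate KPI tracks revenue is a plain Pearson correlation over per-visitor data. This is a minimal sketch under that assumption. The function name and the sample numbers are made up for illustration and do not come from the talk.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-visitor data: products browsed vs. revenue from that visit.
products_browsed = [1, 3, 2, 8, 5, 13]
revenue = [0.0, 12.5, 0.0, 40.0, 19.9, 75.0]
r = pearson(products_browsed, revenue)
```

A value of `r` close to 1 suggests the KPI moves with revenue. Correlation alone does not show that raising the KPI raises revenue, which is one reason the later split-testing step matters.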
Make it possible
Develop the simplest heuristic
e.g. show the visitor one of the top 10 selling products
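The top-sellers heuristic fits in a dozen lines. The sketch below assumes order history is available as `(product_id, quantity)` pairs. That data shape and both function names are hypothetical.

```python
import random
from collections import Counter


def top_sellers(order_lines, k=10):
    """Return the k product ids with the most units sold.

    order_lines: iterable of (product_id, quantity) pairs (assumed shape).
    """
    units_sold = Counter()
    for product_id, quantity in order_lines:
        units_sold[product_id] += quantity
    return [product_id for product_id, _ in units_sold.most_common(k)]


def recommend_top_seller(order_lines, k=10, rng=random):
    """Pick one of the top-k sellers at random, or None if there are no orders."""
    candidates = top_sellers(order_lines, k)
    return rng.choice(candidates) if candidates else None
```

No personalisation, no model: every visitor gets the same pool of products. That is the point of this step, since it gives a baseline that later algorithms have to beat.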
Make it beautiful
Create a method to quickly test new algorithms against old ones
e.g. create a framework that split-tests two models and reports which one performs better
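A split-testing framework needs two pieces: a stable way to assign visitors to a model, and a way to decide whether the observed difference is real. The sketch below is one possible shape, not the framework from the talk. It assumes the KPI has been reduced to a yes/no outcome per visitor (for example "browsed at least N products") and uses a two-proportion z-test. All names are hypothetical.

```python
import zlib
from math import erf, sqrt


def assign_variant(visitor_id: str) -> str:
    """Deterministically bucket a visitor into model 'A' or 'B'."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return "A" if zlib.crc32(visitor_id.encode("utf-8")) % 2 == 0 else "B"


def compare_models(successes_a, visitors_a, successes_b, visitors_b, alpha=0.05):
    """Two-sided two-proportion z-test; reports a winner only if significant."""
    rate_a = successes_a / visitors_a
    rate_b = successes_b / visitors_b
    pooled = (successes_a + successes_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        # Every visitor succeeded or every visitor failed: nothing to compare.
        return {"rate_a": rate_a, "rate_b": rate_b, "p_value": 1.0, "winner": None}
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = 1.0 - erf(abs(z) / sqrt(2))
    winner = None
    if p_value < alpha:
        winner = "B" if z > 0 else "A"
    return {"rate_a": rate_a, "rate_b": rate_b, "p_value": p_value, "winner": winner}
```

Returning `None` when the difference is not significant keeps the report honest: "no winner yet" is a valid outcome of a split test.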
Make it smart
Figure out what field your problem belongs to and choose an off-the-shelf algorithm
e.g. recognize that the problem is product recommendation and use collaborative filtering
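To make "collaborative filtering" concrete, here is a minimal item-based variant using cosine similarity over co-occurrence in visitors' browsing histories. It is a toy sketch under assumed data shapes (`baskets` maps a visitor id to the set of products seen), not a production recommender and not necessarily the variant the speaker had in mind. In practice an off-the-shelf library would replace it.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(baskets):
    """Cosine similarity between items, based on which visitors saw both.

    baskets: dict mapping visitor_id -> set of product ids (assumed shape).
    """
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in baskets.values():
        for a in items:
            item_count[a] += 1
            for b in items:
                if a != b:
                    pair_count[(a, b)] += 1
    return {
        (a, b): shared / sqrt(item_count[a] * item_count[b])
        for (a, b), shared in pair_count.items()
    }


def recommend_for_visitor(visitor_id, baskets, k=3):
    """Rank unseen items by summed similarity to the items the visitor has seen."""
    seen = baskets.get(visitor_id, set())
    scores = defaultdict(float)
    for (a, b), similarity in item_similarities(baskets).items():
        if a in seen and b not in seen:
            scores[b] += similarity
    # Sort by score descending, then by item id for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]
```

For example, with `baskets = {"u1": {"tent", "stove"}, "u2": {"tent", "stove", "lamp"}, "u3": {"tent", "lamp"}, "u4": {"stove"}}`, calling `recommend_for_visitor("u4", baskets)` ranks `"tent"` ahead of `"lamp"`, because more visitors who saw the stove also saw the tent.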
Common ML problems
• Supervised learning
  • Classification
  • Regression
  • Anomaly detection
• Unsupervised learning
  • Clustering
  • Separation
• Recommendation
  • Feature-based recommendation
  • Collaborative filtering
• Search
  • Indexing
  • Ranking
To sum it all up
Real data science is hard
but …
Real data science is the last step in data science, not the first
and besides …
The most important thing in data science is the business, not the science