bigdataexpo
BEST MODEL
• Which one would you choose here?
• It’s about making a tradeoff
• This tradeoff is the most important job of the PO (product owner)
• A 100% correct answer might not exist!!!
ULTIMATELY
• It’s about creating value from data
• Using Machine Learning, Advanced Analytics, and visualization
WHEN YOU SAY DATA SCIENCE, COMPANIES UNDERSTAND
• All the things big data
• Predictive modeling & Advanced Analytics
• More money
• Do all the cool things the others are doing
TRADITIONAL DATA WAREHOUSE ARCHITECTURE
EDW
Data consumer
Web app
Dashboard / Reporting
Traditional business app
WHAT COMPANIES GOT
• A lot of POCs
• A lot of screenshots/presentations/dashboards on a laptop
• Nice stories to tell to their network, about those screenshots and especially those dashboards
• Headaches, with data and infrastructure even more scattered than before
WHAT DO COMPANIES ACTUALLY NEED
• Put things into production
• They don’t teach that in any data science course or MOOC (that I know of)
OVERSIMPLIFYING
Requirements
Data sources
Exploration / Modeling
Products
Feedback
Data scientist / ML engineer
Data engineer
Data engineer
🤦
Customers
KAGGLE CURSE
• gdd.li/toldYouSo
• Many data scientists approach the problem at hand with a Kaggle-like mentality: delivering the best model in absolute terms, no matter what the practical implications are.
• In reality it's not the best model that we implement, but the one that combines quality and practicality: a continuous balancing act
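• Netflix Prize competition: the winning ensemble was never put into production, because the accuracy gain did not justify the engineering effort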
BUSINESS CASE
Business case for
• True Positives
• True Negatives
Cost of
• False Positives
• False Negatives
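The business case above can be made concrete by attaching a value to each of the four outcomes and scoring a model on total value rather than accuracy. The following is a minimal sketch, not a definitive implementation: the `Payoff` class, the `business_value` and `best_threshold` functions, and all the numbers are hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoff:
    """Hypothetical monetary value of each outcome (assumed numbers, not from the talk)."""
    tp: float  # value gained per true positive
    tn: float  # value gained per true negative
    fp: float  # cost per false positive (entered as a positive number)
    fn: float  # cost per false negative (entered as a positive number)


def business_value(y_true, y_pred, payoff):
    """Sum the value of every prediction: gains for correct ones, costs for wrong ones."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        if actual and predicted:
            total += payoff.tp
        elif not actual and not predicted:
            total += payoff.tn
        elif predicted:  # predicted positive, actually negative
            total -= payoff.fp
        else:  # predicted negative, actually positive
            total -= payoff.fn
    return total


def best_threshold(y_true, scores, payoff, thresholds):
    """Pick the score threshold that maximizes business value on the given data."""
    def value_at(threshold):
        predictions = [score >= threshold for score in scores]
        return business_value(y_true, predictions, payoff)

    return max(thresholds, key=value_at)


if __name__ == "__main__":
    y_true = [1, 0, 1, 0]
    scores = [0.9, 0.6, 0.4, 0.1]
    thresholds = [0.3, 0.5, 0.8]

    cheap_fp = Payoff(tp=10, tn=0, fp=2, fn=5)
    costly_fp = Payoff(tp=10, tn=0, fp=20, fn=5)

    print(best_threshold(y_true, scores, cheap_fp, thresholds))
    print(best_threshold(y_true, scores, costly_fp, thresholds))
```

With the same model scores, changing only the cost of a false positive moves the preferred threshold, which is why the payoff numbers have to come from the business and not from the data scientist.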
SKILLS
• Can you participate in actually building production-quality systems, OR are you just proficient enough in R or Python to hack together a prototype on a very small dataset?
• Supply of the second group keeps growing while demand is flat or shrinking
• Especially as executives get burned by “data scientists” who don't know how to help them build things of value
HIRING
• Companies that are not engineering-driven often have trouble hiring good technical people
• The “IQ” test is not really representative of applied data science
• At GoDataDriven we do an “at home, at your convenience” assessment
• Real dataset, real business question, real product
• Models are software: treat them as such
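One way to read "treat them as such" is that a model gets the same input validation and automated tests as any other piece of code. The sketch below assumes a toy logistic model; `ChurnModel`, its weights, and the test function are hypothetical names invented for illustration, not something shown in the talk.

```python
import math


class ChurnModel:
    """Hypothetical toy logistic model, used only to illustrate testing a model like software."""

    def __init__(self, weights, bias):
        self.weights = list(weights)
        self.bias = bias

    def predict_proba(self, features):
        """Return a probability in [0, 1]; reject malformed input instead of guessing."""
        if len(features) != len(self.weights):
            raise ValueError(
                f"expected {len(self.weights)} features, got {len(features)}"
            )
        if not all(math.isfinite(x) for x in features):
            raise ValueError("features must be finite numbers")

        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        # Numerically stable sigmoid: avoids overflow for large negative z.
        if z >= 0:
            return 1.0 / (1.0 + math.exp(-z))
        e = math.exp(z)
        return e / (1.0 + e)


def test_churn_model():
    """Plain-assert tests that could run in CI on every change to the model code."""
    model = ChurnModel(weights=[0.0, 0.0], bias=0.0)
    assert model.predict_proba([1.0, 2.0]) == 0.5

    model = ChurnModel(weights=[1.0, -1.0], bias=0.0)
    assert 0.0 <= model.predict_proba([1000.0, -1000.0]) <= 1.0
    assert 0.0 <= model.predict_proba([-1000.0, 1000.0]) <= 1.0

    for bad_input in ([1.0], [1.0, float("nan")]):
        try:
            model.predict_proba(bad_input)
        except ValueError:
            pass
        else:
            raise AssertionError("malformed input was accepted")


if __name__ == "__main__":
    test_churn_model()
    print("all model tests passed")
```

The same idea extends to version control, code review, and automated deployment for the model code.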
TAKEAWAYS
• POs should know “their stuff”
• Automate all the data movements
• Hire data scientists who are good at programming (or hire machine learning engineers)
QUESTIONS?
• We’re hiring
• Data & Machine Learning Engineers!