#datapopupseattle
How Data Science Builds Better Products
Sean McClure, Ph.D., Data Scientist, Senior Consultant, ThoughtWorks
WorldOfDataSci Thoughtworks
UNSTRUCTURED
Data Science POP-UP in Seattle
www.dominodatalab.com
Produced by Domino Data Lab
Domino’s enterprise data science platform is used by leading analytical organizations to increase productivity, enable collaboration, and publish models into production faster.
Products are Built to Quickly Test Ideas
lightweight, end-to-end, imperfect
Why Do We Build Products?
data value
product
early stages
learning is the key to building better products
world -> data -> discovery -> pivot
product
discovery
TRADITIONAL “DISCOVERY”
the right decisions
unrealistic?
data science
the right decisions
understands strategy + understands data
BETTER DISCOVERY
Count-controlled loops, Condition-controlled loops, Collection-controlled loops, Infinite loops, Restart loop, Generators,
Early exit from loops, Loop variants and invariants, Loop system cross-references, Structured non-local control flow, Conditions, Exceptions,
Loops, Flow Control structures, If-then-(else), Case and switch, Coroutines, Continuations
What’s Wrong With the Usual Approach?
STANDARD SOFTWARE
All functionality is locked in place
STANDARD SOFTWARE
software environment
Learning algorithms
Model Validation
Model Performance
Data visualization
Operationalizing Models
Scientific computing libraries
Data cleansing
Data preparation
Probability and statistics
Loops, Flow Control structures, If-then-(else), Case and switch, Coroutines, Continuations,
Count-controlled loops, Condition-controlled loops, Collection-controlled loops, Infinite loops, Restart loop, Generators,
Early exit from loops, Loop variants and invariants, Loop system cross-references, Structured non-local control flow, Conditions, Exceptions
What is the New Approach?
ADAPTIVE SOFTWARE
unlocked
ADAPTIVE SOFTWARE
software environment
“rapid and flexible response to change”
“continuous improvement”
post-development
development
ability to pivot
How Do You Put the Brain in the Box?
Successful Data Products
• establish early benchmarks
• understand true validation
• build sophistication via iteration
• provide APIs to model results
• get continuous exposure to domain experience
• design product experiments
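The first two bullets can be illustrated with a small sketch. It is not from the talk: the data, the one-feature threshold model, and all function names are made up. The point is only that a trivial baseline is scored first, on held-out data, so later iterations have a number to beat.

```python
from collections import Counter

def majority_baseline(train_labels):
    """Early benchmark: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda x: most_common

def threshold_model(train_rows, train_labels):
    """Hypothetical one-feature model: predict 1 above the midpoint of the class means."""
    pos = [x for x, y in zip(train_rows, train_labels) if y == 1]
    neg = [x for x, y in zip(train_rows, train_labels) if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > cut else 0

def accuracy(predict, rows, labels):
    """Share of rows where the prediction matches the label."""
    hits = sum(1 for x, y in zip(rows, labels) if predict(x) == y)
    return hits / len(labels)

# Made-up data: one numeric feature, binary label.
train_x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 2.5]
train_y = [0, 0, 0, 1, 1, 1, 0]
# Held-out rows the models never saw during fitting.
test_x = [1.5, 8.5, 2.2, 7.5]
test_y = [0, 1, 0, 1]

baseline_acc = accuracy(majority_baseline(train_y), test_x, test_y)
model_acc = accuracy(threshold_model(train_x, train_y), test_x, test_y)
print(f"baseline={baseline_acc:.2f} model={model_acc:.2f}")
```

Each later iteration can be scored with the same `accuracy` call on the same held-out rows, which keeps the comparison honest as the model grows more sophisticated.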
Choose technologies that make it possible to build data products successfully
Search Engine Marketing - Recommendation
• Increasing CTR?
• Decreasing CPC?
• Call volume trends
• Percentage of Good Call trends
• Page Position
• Visits vs Cost Per Visit
• Impressions vs CTR graph
• Breakdown of CVT types
• Click-to-call
• Daily Budget Spend
• Top 5 KWs vs Previous Good Cycle
• Budget distribution
• Impressions per publisher
• Revenue per publisher
• Page position per publisher
• Review for Negative KWs
• Review for Partner site issues
• Review for OAT
• Check Category page
• Impression Share
• Are the ads approved and running?
• Below 1st Page Bid KWs
• Quality Score
• Is it loading?
• Are all numbers replacing correctly?
• Out of Area Traffic
• High Spend – Low Revenue
• Super Low CTRs
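A few of the checks above reduce to simple ratios, which a sketch can make concrete. The formulas are the standard ones (CTR = clicks / impressions, CPC = cost / clicks). The thresholds, field names, and the `flag_campaign` helper are hypothetical and do not come from the talk.

```python
def ctr(clicks, impressions):
    """Click-through rate: clicks divided by impressions."""
    return clicks / impressions if impressions else 0.0

def cpc(cost, clicks):
    """Cost per click; None when there were no clicks."""
    return cost / clicks if clicks else None

def flag_campaign(stats, low_ctr=0.005, min_revenue_per_dollar=1.0):
    """Return the manual checks (from the list above) that this campaign trips.

    Both thresholds are assumed values chosen for illustration.
    """
    flags = []
    if ctr(stats["clicks"], stats["impressions"]) < low_ctr:
        flags.append("super low CTR")
    if stats["cost"] > 0 and stats["revenue"] / stats["cost"] < min_revenue_per_dollar:
        flags.append("high spend, low revenue")
    return flags

# Made-up campaign numbers.
example = {"impressions": 10000, "clicks": 20, "cost": 50.0, "revenue": 30.0}
print(ctr(example["clicks"], example["impressions"]))
print(cpc(example["cost"], example["clicks"]))
print(flag_campaign(example))
```

Hard-coded rules like these only cover the mechanical part of the checklist; they are shown here to make the metrics concrete, not as a stand-in for a learned recommendation model.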
making decisions
Data Product
[Architecture diagram. Labels: Hadoop Cluster; HDFS; Flume; Sqoop; Databases (rl_op, rl_keyword, rl_report); DB DataProducer; Queue; Reporting Data; Operational Data; Data Core; CPI Data Mart; CPI; Space; Raw; Normalized; Core Jobs; CPI Jobs; Campaign; Creative; Publishers; Proxy Logs; Call Logs; Admin Console; Others]
Search Engine Marketing - Recommendation
Designing APIs around Model Results
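As a rough illustration of an API around model results, the sketch below wraps precomputed model output in a function that returns a status code and a JSON body, the way a small HTTP handler might. The keyword IDs, field names, recommendation labels, and `get_recommendation` are all hypothetical; the talk does not describe the actual API.

```python
import json

# Hypothetical stored model output: keyword ID -> recommended action and score.
MODEL_RESULTS = {
    "kw-001": {"recommendation": "raise_bid", "score": 0.82},
    "kw-002": {"recommendation": "add_negative_keyword", "score": 0.67},
}

def get_recommendation(keyword_id, results=MODEL_RESULTS):
    """Return (status_code, json_body) for one keyword's model result."""
    record = results.get(keyword_id)
    if record is None:
        # Unknown keyword: report it rather than guessing a recommendation.
        return 404, json.dumps({"error": "unknown keyword", "keyword_id": keyword_id})
    return 200, json.dumps({"keyword_id": keyword_id, **record})

status, body = get_recommendation("kw-001")
print(status, body)
```

Keeping the model output behind a function like this lets the product call one stable interface while the model behind it is retrained or replaced.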
Healthcare Prediction Engine
@datapopup #datapopupseattle
Thank You To Our Sponsors
THANK YOU