Data Science and Goodhart’s Law
Kyle Polich, Data Science, Inc.
2
Goodhart’s Law
When a measure becomes a target, it ceases to be a good measure
3
Sales Rep Compensation Example
• Base pay + variable commission
• For monthly <50k, commission = 3%
• For monthly 50-99k, commission = 5%
• For monthly 100k+, commission = 7%
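A minimal sketch of why this plan invites gaming. It assumes the tier rate applies to the whole month's sales rather than only to the amount above each threshold, and it treats everything from 50k up to (but not including) 100k as the 5% band; the slide does not specify either point. The function names are illustrative only.

```python
def commission_rate(monthly_sales):
    """Tier rate for a month's sales (assumed to apply to the full amount)."""
    if monthly_sales < 50_000:
        return 0.03
    if monthly_sales < 100_000:
        return 0.05
    return 0.07


def commission(monthly_sales):
    """Variable commission for one month, excluding base pay."""
    return monthly_sales * commission_rate(monthly_sales)


def two_month_commission(month_one, month_two):
    """Total variable commission across two months."""
    return commission(month_one) + commission(month_two)


# Same 90k of total sales, booked two different ways.
even = two_month_commission(45_000, 45_000)   # both months in the 3% tier
shifted = two_month_commission(0, 90_000)     # deals held back, then booked together
```

Under these assumptions the rep earns about 2,700 by booking evenly and about 4,500 by delaying deals into one month, so the sales figure being rewarded no longer reflects when business was actually won.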
4
Some Examples
• Spam filtering arms race
• Search engine ranking
• Clearing cookies to get better airline prices
• Keep account open to manipulate FICO score
• Retail discounting/couponing strategies
• Bidding in AdTech marketplaces
5
Measuring with Cross Validation
• You should be doing this anyway!
• Set production performance expectation
• Measure post deployment
• Total deviation = deviation due to overfit + deviation due to incomplete training + deviation due to Goodhart’s Law
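The workflow above can be sketched as follows: use k-fold cross validation to set an expected accuracy, measure accuracy after deployment, and take the difference as the total deviation. Everything here is an assumption made for illustration: the synthetic one-feature data, the toy threshold model, and the simulated production shift in which class-0 instances push their feature value upward. The sketch measures only the total; it does not separate the three components.

```python
import random
from statistics import mean


def make_data(n, class0_mean, class1_mean, seed):
    """Synthetic one-feature data with alternating labels (illustrative only)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for i in range(n):
        label = i % 2
        center = class1_mean if label else class0_mean
        xs.append(rng.gauss(center, 1.0))
        ys.append(label)
    return xs, ys


def fit(xs, ys):
    """Toy model: a threshold halfway between the two class means."""
    mean0 = mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = mean(x for x, y in zip(xs, ys) if y == 1)
    return (mean0 + mean1) / 2


def accuracy(threshold, xs, ys):
    """Fraction of instances where 'x above threshold' matches 'label is 1'."""
    return mean(int((x > threshold) == (y == 1)) for x, y in zip(xs, ys))


def cross_val_accuracy(xs, ys, k=5):
    """Mean held-out accuracy over k folds (fold i holds out every k-th row)."""
    n = len(xs)
    scores = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train = [(xs[i], ys[i]) for i in range(n) if i not in held_out]
        test = [(xs[i], ys[i]) for i in range(n) if i in held_out]
        threshold = fit(*zip(*train))
        scores.append(accuracy(threshold, *zip(*test)))
    return mean(scores)


train_x, train_y = make_data(400, 0.0, 4.0, seed=1)
expected = cross_val_accuracy(train_x, train_y)   # production expectation

model = fit(train_x, train_y)
# Hypothetical post-deployment data: class-0 instances now sit much closer to class 1.
prod_x, prod_y = make_data(400, 3.0, 4.0, seed=2)
observed = accuracy(model, prod_x, prod_y)

total_deviation = expected - observed
```

A total deviation near zero suggests production matches the cross-validated expectation; a large gap is the signal to investigate which of the three sources is responsible.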
6
Measuring via Homogeneity Assumption
Can you train a model to accurately predict the date at which the observation was created?
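A simplified stand-in for that question, assuming a single numeric feature and a split of observations into an "early" and a "late" period. Instead of training a full model to predict the date, it computes the AUC of the feature as a predictor of "late"; the function names and the 0-to-1 separability scale are illustrative choices, not something from the talk.

```python
def auc(early_values, late_values):
    """Probability that a random late value exceeds a random early value.

    Ties count as half. A value of 0.5 means the feature says nothing about time.
    """
    wins = 0.0
    for early in early_values:
        for late in late_values:
            if late > early:
                wins += 1.0
            elif late == early:
                wins += 0.5
    return wins / (len(early_values) * len(late_values))


def time_separability(early_values, late_values):
    """0.0 = periods are indistinguishable; 1.0 = the feature separates them perfectly."""
    return abs(auc(early_values, late_values) - 0.5) * 2
```

If the data are homogeneous over time, separability should stay close to 0 for every feature. A feature that predicts the period well is a candidate for having been shifted by behavior that reacts to the model.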
7
Measuring Drift
8
Measuring Drift
Typical failure from a web application release
9
Measuring Drift
Possible failure from a web application release
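A minimal sketch of one way to flag drift around a release, assuming a daily metric (for example, model accuracy or conversion rate) with a stable pre-release baseline. The z-score approach and the function name are assumptions for illustration; the talk does not prescribe a specific test.

```python
from statistics import mean, stdev


def drift_z_score(baseline, recent):
    """How many standard errors the recent mean sits from the baseline mean.

    baseline: metric values from before the release (at least two points).
    recent:   metric values from after the release.
    """
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; z-score is undefined")
    standard_error = sigma / len(recent) ** 0.5
    return (mean(recent) - mean(baseline)) / standard_error
```

A sharp failure shows up as a large negative score within a day or two. A slow degradation needs a longer recent window before the score moves, which is why a fixed alert threshold can miss it.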
10
Dealing with it
• Detection is key
• Experimentation is required
• Agile methods for model deployment
11
Causal Impact
• An approach to estimating the causal effect of a designed intervention on a time series.
• Predicts counterfactual (how response likely would have evolved absent the intervention)
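The counterfactual idea can be sketched with a much simpler technique than a full time-series model: ordinary least squares on one control series that the intervention did not touch. This is a swapped-in simplification for illustration, and the function names and the single-control setup are assumptions.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimated_effect(control_pre, response_pre, control_post, response_post):
    """Fit on the pre-intervention period, then compare post-period actuals
    with the predicted counterfactual.

    Returns (counterfactual, pointwise_effect) as two lists.
    """
    slope, intercept = fit_line(control_pre, response_pre)
    counterfactual = [intercept + slope * c for c in control_post]
    pointwise_effect = [actual - predicted
                        for actual, predicted in zip(response_post, counterfactual)]
    return counterfactual, pointwise_effect


# Pre-period: the response tracks the control at roughly 2x.
counterfactual, effect = estimated_effect(
    control_pre=[1, 2, 3, 4], response_pre=[2, 4, 6, 8],
    control_post=[5, 6], response_post=[13, 15],
)
```

Here the response would have been about 10 and 12 without the intervention, so the estimated effect is about 3 per period. The estimate is only as good as the assumption that the control series was unaffected.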
12
Self-Fulfilling Prophecies
• Beware!
• Case study: lead qualification
  – Try to predict leads that will close
  – Relearn the bias of your training
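A small simulation of the lead-qualification loop, under invented assumptions: two hypothetical lead sources with the same true close rate when a rep works them, a model that starts out biased toward one source, and reps who only work leads scored above a threshold. Retraining is reduced to "score = observed close rate".

```python
import random

TRUE_CLOSE_RATE = 0.3  # assumed identical for both sources when a lead is worked


def run_round(scores, leads_per_source, threshold, rng):
    """One cycle: reps work only high-scoring sources, then the model is
    'retrained' on the observed close rates."""
    observed = {}
    for source, score in scores.items():
        closes = 0
        if score >= threshold:  # leads below the threshold are never worked
            closes = sum(rng.random() < TRUE_CLOSE_RATE
                         for _ in range(leads_per_source))
        observed[source] = closes / leads_per_source
    return observed


rng = random.Random(0)
scores = {"webinar": 0.35, "cold_list": 0.10}  # initial, biased model
for _ in range(3):
    scores = run_round(scores, leads_per_source=1000, threshold=0.2, rng=rng)
```

After the first round the neglected source has an observed close rate of zero, so every retrained model confirms the original bias even though both sources are equally good. Working a random sample of low-scoring leads is one way to break the loop.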
13
Fast Iterations
• Outside normal SWLC release cycle
  – State updates
  – Parameter tuning
• Run experiments
14
Explanatory power
• Goodhart’s law will often manifest on only a subset of (possibly significant) instances.
• Model interpretability for affected instances is key
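A minimal sketch of checking for this, assuming each prediction can be tagged with a segment. The segment names and the toy records are hypothetical; the point is only that an overall average can look healthy while one subset has failed.

```python
from collections import defaultdict


def accuracy_by_segment(records):
    """records: iterable of (segment, predicted, actual) tuples."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [correct, total]
    for segment, predicted, actual in records:
        counts[segment][0] += int(predicted == actual)
        counts[segment][1] += 1
    return {segment: correct / total
            for segment, (correct, total) in counts.items()}


# Hypothetical results: 90 regular customers, 10 who learned to game the model.
records = (
    [("regular", 1, 1)] * 90
    + [("coupon_user", 1, 1)] * 2
    + [("coupon_user", 1, 0)] * 8
)
overall = sum(p == a for _, p, a in records) / len(records)
per_segment = accuracy_by_segment(records)
```

Overall accuracy is 92%, yet the model is right only 20% of the time on the small segment that changed its behavior. Those are the instances where an explanation of each prediction is most useful.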
15
Interpretable Models
17
“Why Should I Trust You?”: Explaining the Predictions of Any Classifier
Ribeiro, Singh, Guestrin
Model Interpretability
18
Summary
• Goodhart’s law: When a measure becomes a target, it ceases to be a good measure
• As a data scientist, if your work is meaningful, you will encounter it
• Try to measure it in the data
• Work on explanatory models to mitigate
• Don’t let the average case blind you
19
DataScience
facebook.com/datascience
@DataSkeptic
@datascienceinc
linkedin.com/company/datascience-inc
(310) 579 - 6200