DATA PRODUCTS: 5 DEADLY SINS AND HOW TO PREVENT THEM

Pride Wrath Lust Gluttony Sloth

Mathieu Bastian

Web Summit 2015, Dublin

Credits: The Seven Deadly Sins, Nanatsu no Taizai & nimbus-mage.deviantart.com

ABOUT ME

• Data scientist & engineer

• Led data products team at LinkedIn

• Gephi co-founder

• Open-source contributor

DATA PRODUCTS

Source: http://bit.ly/1kMUPAe.

Tentative definition

User-facing production system based on an automated learning algorithm

DATA PRODUCTS TODAY

PRIDE

"Excessive belief in one’s own abilities or excessive love of oneself"

PRIDE

Source: http://www.themeasurementstandard.com/wp-content/uploads/2015/06/data-scientist-as-superman.jpg

With power comes responsibility

Source: http://www.economist.com/node/15579717

Who are you building it for?

• Understand user intent

• Integrate into the user flow

• Explain recommendations to the user

• Set the right user expectations

• Treat users the way you would like to be treated

Credits: Google

Anticipate edge cases

WRATH

"Choice of violent and hateful actions over love and patience"

WRATH

Exercise perseverance

[Chart: reward (vertical axis) over time (horizontal axis)]

Phase I: Inception

Phase II: Growth

Phase III: Maintenance

But have a plan

LUST

"Depraved thought, unwholesome morality and desire for excitement"

LUST

Credits: Google Data Center

Perform due diligence

Thank the janitor & handyman

GLUTTONY

"The consumption of more of anything than you need"

GLUTTONY

Avoid solo data scientists

Credits: Lucasfilm

Choose the right problem

• M - Measurable

• E - Explainable

• R - Rapid prototyping

• C - Core

• I - Iterable

SLOTH

"Not caring about others or living life in a fulfilling way"

SLOTH

Embrace continuous data pipelines

Source: http://azkaban.github.io/
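A continuous pipeline can be pictured as a set of jobs that a scheduler runs in dependency order every time new data arrives. The snippet below is a minimal Python sketch of that idea, not Azkaban's API: the job names, the WORKFLOW mapping and both helper functions are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical jobs, each mapped to the set of jobs it depends on.
WORKFLOW = {
    "extract_events": set(),
    "build_features": {"extract_events"},
    "train_model": {"build_features"},
    "publish_scores": {"train_model"},
}


def execution_order(workflow):
    """Return job names in an order that respects every dependency."""
    return list(TopologicalSorter(workflow).static_order())


def run_workflow(workflow, jobs):
    """Call each job in dependency order and return the names that ran."""
    completed = []
    for name in execution_order(workflow):
        jobs[name]()  # a real scheduler would also handle retries and alerts
        completed.append(name)
    return completed
```

Declaring dependencies as data, rather than hard-coding the call order, is what lets a scheduler rerun the whole flow automatically instead of relying on someone to trigger each step by hand.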

Make data pipelines robust

Code → Upload → Run workflow → Look at logs

Code → Upload → Run workflow

PigUnit
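The slide contrasts a loop that ends in reading logs with one that involves PigUnit, a unit-testing framework for Pig scripts. The same idea, testing a transformation locally on small inputs before uploading it, can be sketched in Python. This is an analogy under stated assumptions, not PigUnit itself: the top_queries function, the record layout and the test cases are all hypothetical.

```python
import unittest
from collections import Counter


def top_queries(records, n):
    """Return the n most frequent non-empty queries as (query, count) pairs."""
    counts = Counter(
        r["query"].strip().lower()
        for r in records
        if r.get("query") and r["query"].strip()
    )
    # Sort by descending count, then by query text, so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


class TopQueriesTest(unittest.TestCase):
    def test_counts_and_normalizes(self):
        records = [{"query": "Gephi"}, {"query": "gephi "}, {"query": "pig"}]
        self.assertEqual(top_queries(records, 2), [("gephi", 2), ("pig", 1)])

    def test_skips_missing_and_blank(self):
        # Edge cases: null value, whitespace-only value, missing field.
        records = [{"query": None}, {"query": "  "}, {}]
        self.assertEqual(top_queries(records, 5), [])
```

Saved as a module, the tests run locally with `python -m unittest`, so a malformed record is caught in seconds rather than after an upload and a full workflow run.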

THANK YOU!

Mathieu Bastian @mathieubastian

www.linkedin.com/in/mathieubastian