Michaël de Toldi – BNP Paribas Cardif
October 2015
What governance for Data Analytics?
What do we want as clients?
Personalised offers
Adapted services
Quick answers
Great prices
Contextual offers
Expected contacts
Value for loyalty
Privacy
Some people think that we should…
But this is the reality!
And it is better to know more than one model…
“All models are wrong but some are useful”
George E. P. Box
“Data Science” vs “Data Mining” culture
Data Mining (1990)
Data format: SQL, flat files
Tools: low-level programming frameworks, proprietary software
Training: books, diploma
Skills: statistics
Data Science (2015)
Data format: SQL, flat files, web scraping, web logs, cookies, text, images, .json, .xml, …
Tools: high-abstraction programming, open source software and libraries
Training: MOOCs, blogs, tutorials, forums, Kaggle competitions
Skills: statistics, computer science (memory optimization, parallelization)
“Machine Learning” vs “Data Modelling” culture
The Data Modelling Culture
Toolbox: OLS, GLMs, GAMs, Cox
Objective: understand model nature
Strategy: design the model structure manually
Generalizing: combat overfitting with expertise
Validation: “goodness of fit” statistical tests
Computation: model simplicity
Experience:
• Provides more insight into how nature associates the response variables with the input variables.
• Works well for small datasets.
• But if the model is a poor emulation of nature, the conclusions based on this insight may be wrong!
The Machine Learning Culture
Toolbox: regularized GLMs, random forests, gradient boosting, neural nets, blending/stacking
Objective: look for the best accuracy
Strategy: hyperparameters to control model complexity
Generalizing: automated strategies to combat overfitting
Validation: measured by predictive accuracy
Computation: parallelizing strategies
Experience:
• Sometimes considered a black box (unfairly for some techniques).
• Often produces higher predictive power with less modelling effort because of automated strategies.
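As an illustration of the machine learning column (a hyperparameter controls model complexity, and validation is measured by predictive accuracy), here is a minimal sketch in plain Python. It is not taken from the presentation: the data is synthetic, and the names fit_ridge_slope, cross_validated_error and select_penalty are hypothetical helpers chosen for this example. It fits a one-variable linear model with an L2 (ridge) penalty and picks the penalty by k-fold cross-validation.

```python
import random


def fit_ridge_slope(xs, ys, penalty):
    """Closed-form slope w for y ~ w * x with an L2 penalty on w (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def mse(xs, ys, w):
    """Mean squared error of the predictions w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validated_error(xs, ys, penalty, n_folds=5):
    """Average held-out MSE over n_folds contiguous folds."""
    n = len(xs)
    fold_errors = []
    for k in range(n_folds):
        test_idx = set(range(k * n // n_folds, (k + 1) * n // n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        test_x = [x for i, x in enumerate(xs) if i in test_idx]
        test_y = [y for i, y in enumerate(ys) if i in test_idx]
        w = fit_ridge_slope(train_x, train_y, penalty)
        fold_errors.append(mse(test_x, test_y, w))
    return sum(fold_errors) / n_folds


def select_penalty(xs, ys, candidates):
    """Pick the penalty with the lowest cross-validated error."""
    return min(candidates, key=lambda p: cross_validated_error(xs, ys, p))


# Synthetic toy data (an assumption for this sketch): y = 2x plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(40)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]

candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best_penalty = select_penalty(xs, ys, candidates)
best_slope = fit_ridge_slope(xs, ys, best_penalty)
print("selected penalty:", best_penalty, "slope:", best_slope)
```

The penalty of 0.0 corresponds to ordinary least squares, which always has the lowest error on the training data itself. The cross-validated error is what lets an automated procedure, rather than manual expertise, decide how much to shrink the model.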
1st conviction for Data Analytics
“Data Science should be understood and internalised”
Michaël de Toldi
What about data in insurance companies?
Actuarial Data
Marketing Data
Commercial Data
Finance Data
Client Management Data
Fraud Data
What about data in insurance companies?
Actuarial
Marketing
Finance
Commercial
Client Management
Fraud
Internal / external data
2nd conviction for Data Analytics
“Internal & external Data should be freed, shared, controlled and secured”
Michaël de Toldi
Data Science & IT are deeply linked
Data Science innovation relies on Open Source
3rd conviction for Data Analytics
“IT framework should be Data Science friendly”
Michaël de Toldi
Data Analytics governance in a nutshell…
Data Science should be understood and internalised
Internal & external Data should be freed, shared, controlled and secured
IT framework should be Data Science friendly
Don’t wait too long to understand what is at stake!
THANK YOU!
BNP PARIBAS CARDIF
8, rue du Port
92 728 Nanterre Cedex
Tel.: +33 (0)1 41 42 83 00
bnpparibascardif.com