Michael de Toldi BNP Paribas Cardif October 2015


  • Michael de Toldi – BNP Paribas Cardif

    October 2015

  • 2

    What governance for Data Analytics?

  • 3

    What do we want as clients?

    Personalised offers

    Adapted services

    Quick answers

    Great prices

    Contextual offers

    Expected contacts

    Value for loyalty

    Privacy

  • 4

    Some people think that we should…

  • 5

    But this is the reality!

  • 6

    And it is better to know more than one model…

    “All models are wrong, but some are useful”

    George E. P. Box

  • 7

    “Data Science” vs “Data Mining” culture

    Data Mining (1990)

    Data format: SQL, flat files

    Tools: low-level programming frameworks, proprietary software

    Training: books, diploma

    Skills: statistics

    Data Science (2015)

    Data format: SQL, flat files, web scraping, web logs, cookies, text, images, .json, .xml, …

    Tools: high-abstraction programming, Open Source software & libraries

    Training: MOOCs, blogs, tutorials, forums, Kaggle competitions

    Skills: statistics, computer science (memory optimization, parallelization)

  • 8

    “Machine Learning” vs “Data Modelling” culture

    The Data Modelling Culture

    Toolbox: OLS, GLMs, GAMs, Cox

    Objective: understand the nature of the model

    Strategy: manually design the model structure

    Generalizing: combat overfitting with expertise

    Validation: “goodness of fit” statistical tests

    Computation: model simplicity

    Experience:

    • Provides more insight into how nature associates the response variables with the input variables.

    • Works well for small datasets.

    • But if the model is a poor emulation of nature, the conclusions based on this insight may be wrong!

    The Machine Learning Culture

    Toolbox: regularized GLMs, Random Forests, Gradient Boosting, Neural Nets, Blending/Stacking

    Objective: look for the best accuracy

    Strategy: hyperparameters to control model complexity

    Generalizing: automated strategies to combat overfitting

    Validation: measured by predictive accuracy

    Computation: parallelizing strategies

    Experience:

    • Sometimes considered a black box (unfairly for some techniques).

    • Often produces higher predictive power with less modelling effort, thanks to automated strategies.
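    As an illustration of the machine learning column ("hyperparameters to control model complexity", validation "measured by predictive accuracy"), here is a minimal sketch, not taken from the slides. It assumes a one-feature, no-intercept ridge regression and picks the penalty by k-fold cross-validation. The function names, the synthetic data and the candidate penalties are all hypothetical choices made for this example.

```python
import random


def fit_ridge(xs, ys, penalty):
    """Closed-form ridge slope for a one-feature model without intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def mse(xs, ys, slope):
    """Mean squared prediction error of y = slope * x."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validated_mse(xs, ys, penalty, n_folds=5):
    """Average held-out error over n_folds interleaved folds."""
    errors = []
    for k in range(n_folds):
        train_x, train_y, test_x, test_y = [], [], [], []
        for i, (x, y) in enumerate(zip(xs, ys)):
            if i % n_folds == k:      # observation i is held out in fold k
                test_x.append(x)
                test_y.append(y)
            else:
                train_x.append(x)
                train_y.append(y)
        slope = fit_ridge(train_x, train_y, penalty)
        errors.append(mse(test_x, test_y, slope))
    return sum(errors) / n_folds


def select_penalty(xs, ys, candidates, n_folds=5):
    """Pick the penalty with the lowest cross-validated error."""
    return min(candidates, key=lambda p: cross_validated_mse(xs, ys, p, n_folds))


# Synthetic data (an assumption for illustration): y = 2x + noise.
rng = random.Random(0)
demo_x = [rng.gauss(0.0, 1.0) for _ in range(50)]
demo_y = [2.0 * x + rng.gauss(0.0, 1.0) for x in demo_x]

demo_candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best_penalty = select_penalty(demo_x, demo_y, demo_candidates)
print("selected penalty:", best_penalty)
print("fitted slope:", fit_ridge(demo_x, demo_y, best_penalty))
```

    The penalty is never set from an analyst's view of the model structure. It is chosen only by how well each candidate predicts held-out observations, which is the contrast the two columns draw.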

  • 9

    1st conviction for Data Analytics

    “Data Science should be understood and internalised”

    Michaël de Toldi

  • 10

    What about data in insurance companies?

    Actuarial Data

    Marketing Data

    Commercial Data

    Finance Data

    Client Management Data

    Fraud Data

  • 11

    What about data in insurance companies?

    Actuarial

    Marketing

    Finance

    Commercial

    Client Management

    Fraud

    INT / EXT DATA

  • 12

    2nd conviction for Data Analytics

    “Internal & external Data should be freed, shared, controlled and secured”

    Michaël de Toldi

  • 13

    Data Science & IT are deeply linked

  • 14

    Data Science innovation relies on Open Source

  • 15

    3rd conviction for Data Analytics

    “IT framework should be Data Science friendly”

    Michaël de Toldi

  • 16

    Data Analytics governance in a nutshell…

    Data Science should be understood and internalised

    Internal & external Data should be freed, shared, controlled and secured

    IT framework should be Data Science friendly

    Don’t wait too long to understand what is at stake!

  • THANK YOU!

    BNP PARIBAS CARDIF

    8, rue du Port

    92 728 Nanterre Cedex

    Tel.: +33 (0)1 41 42 83 00

    bnpparibascardif.com