60
Model Visualisation ___ Amit Kapoor @amitkaps

Model Visualisation

Embed Size (px)

Citation preview

Page 1: Model Visualisation

Model Visualisation___Amit Kapoor@amitkaps

Page 2: Model Visualisation

Story___“We don’t see things as they are, we see them as we are.”— Anais Nin

Page 3: Model Visualisation

The Blind Men & the Elephant___“And so these men of IndostanDisputed loud and long,Each in his own opinionExceeding stiff and strong,Though each was partly in the right,And all were in the wrong.”

— John Godfrey Saxe

Page 4: Model Visualisation

The Elephant: Data___“Data is just a clue to the end truth”— Josh Smith

Page 5: Model Visualisation

The Men: Building Models___"All models are wrong, but some are useful"— George Box

Page 6: Model Visualisation

Ladder of Abstraction___Data AbstractionVisual AbstractionModel Abstraction

Page 7: Model Visualisation
Page 8: Model Visualisation

Why Build Models?___First Level of Ignorance

"I know, what I don't know"

Page 9: Model Visualisation

Why Visualise Models?___Second Level of Ignorance

"I don't know, what I don't know"

Page 10: Model Visualisation

Machine Learning (ML) Speak___Data TransformationVisual ExplorationModel Building

Page 11: Model Visualisation

ML Pipeline___Data Transformation — — — — Model Building (Tidy Data) ! ! ! Visual Exploration (Data-Vis)

Page 12: Model Visualisation

ML Pipeline++___Data Transformation — — — — Model Building (Tidy Data) (Tidy Model) ! ! ! ! ! ! Visual Exploration Model Exploration (Data-Vis) (Model-Vis)

Page 13: Model Visualisation

Model-Vis Key Concept___Use visualisation to aid the transition of implicit knowledge in the data and your head to explicit knowledge in the model.

Page 14: Model Visualisation

Model-Vis Approach___[0] Visualise the data space[1] Visualise the predictions in the data space[2] Visualise the errors in model fitting[3] Visualise with different model parameters[4] Visualise with different input datasets[5] Visualise the entire model space[6] Visualise the entire feature space[7] Visualise the many models together

Page 15: Model Visualisation

Model-Vis Examples___Regression: SmallClassification: Large pRegression: Large n

Page 16: Model Visualisation

Model-Vis Examples___Cars (n < 50, p = 4)Digits (n ~ 5K, p = 785)Taxi (n ~ 10M, p = 20)

Page 17: Model Visualisation

Regression: Small___Cars dataset - price vs kmplScraped from comparison websiteRefined & tidied upBase version for petrol carsPrice < 1,000K, n = 42

Page 18: Model Visualisation

brand model price kmpl type bhpTata Nano 199 23.9 Hatchback 38Suzuki Alto800 248 22.7 Hatchback 47Hyundai EON 302 21.1 Hatchback 55Nissan Datsun 312 20.6 Hatchback 67... ... ... ... ... ...

Suzuki Ciaz 725 20.7 Sedan 91Skoda Rapid 756 15.0 Sedan 104Hyundai Verna 774 17.4 Sedan 106VW Vento 785 16.1 Sedan 104

Page 19: Model Visualisation

[0] Visualise the data space

Page 20: Model Visualisation

[1] Visualise the predictions in data space

Page 21: Model Visualisation

[2] Visualise the errors in model fitting

Page 22: Model Visualisation

[3] Visualise with different model parameters

Page 23: Model Visualisation

[4] Visualise with different input datasets

Page 24: Model Visualisation

[5] Visualise the entire model space

Page 25: Model Visualisation

[6] Visualise the entire feature space

Page 26: Model Visualisation

[7] Visualise the many models together

Page 27: Model Visualisation

Model-Vis Approach___[0] Visualise the data space[1] Visualise the predictions in the data space[2] Visualise the errors in model fitting[3] Visualise with different model parameters[4] Visualise with different input datasets[5] Visualise the entire model space[6] Visualise the entire feature space[7] Visualise the many models together

Page 28: Model Visualisation

Model-Vis & ML Approach___[0] DATA VIS: the data space[1] PREDICTION: the predictions in the data space[2] VALIDATION: the errors in model fitting[3] TUNING: with different model parameters[4] BOOTSTRAP: with different input datasets[5] ENSEMBLE: the entire model space[6] FEATURE ENGG: the entire feature space[7] N-MODELS: the many models together

Page 29: Model Visualisation

Move through Layers___Iterative, not linearUp and Down, not lateralComplementary, not exclusive

Page 30: Model Visualisation

p/n/N Model-Vis challenge___p -- High dimensional datan -- Large and big dataN -- Multiple models

Page 31: Model Visualisation

Classification: 2 Class___MNIST - digit recognitionReduced to 2-class: 1 and 2p = 784, 28 x 28 gray pixel mapn > 5000

Page 32: Model Visualisation

MNIST dataset: Examples of number 1 and 2

Page 33: Model Visualisation

Visualise the data space

Page 34: Model Visualisation

Identify the features - Symmetry & Intensity

Page 35: Model Visualisation

Visualise the reduced feature space

Page 36: Model Visualisation

Visualise the predictions in data space

Page 37: Model Visualisation

Visualise the predictions boundaries

Page 38: Model Visualisation

Visualise the errors in model fitting

Page 39: Model Visualisation

Visualise with different model parameters

Page 40: Model Visualisation

Easy to visualise errors in data space

Page 41: Model Visualisation

How to scale for large p?___Curse of dimensionalityMesh approach computationally expensiveNeed to use projections

Page 42: Model Visualisation

For entire feature space - PCA projection

Page 43: Model Visualisation

Map the error on the projection

Page 44: Model Visualisation

Cannot use any projection e.g. t-SNE

Page 45: Model Visualisation

High-p Boundary Classifiers___

Github: highdimensional-decision-boundary-plot

Page 46: Model Visualisation

Regression: Large n___NYC Taxi Trip Datan ~ 10M (in just one month)p = 20, geo location (drop & pick up), fare breakup, passenger no. etc.

Page 47: Model Visualisation

Data-Vis Issue___

Plotting is hard e.g. alpha

Sampling (~1%) may be effective

Require careful tuning parameters e.g. overweighting unusual values

Page 48: Model Visualisation

Binning Helps___“Bin - Summarize - Smooth: A framework for visualising big data” - Hadley WickamPackage in R: 'BigVis' (2013)

Recent Interactive implementation in PythonPackage in Python: 'Datashader' (2016)

Page 49: Model Visualisation

Vis Data Space___

Plot the probability of getting a tip

Start to see the patterns in the visualisation

Page 50: Model Visualisation

Vis Predictions___

Predict the probability of getting tip

Simple Linear Model - drop coords, passenger count, time and day of week

Page 51: Model Visualisation

Vis Errors___

Visualise the errors in tip probability distribution

Page 52: Model Visualisation

N-Models Challenge___Model ExplosionEntire Model Space+ Add Tuning Models+ Add Bootstrap Models+ Add Ensemble Models+ Add Cross-Validation Models

Page 53: Model Visualisation

N-Models Challenge___Keep track of prediction & errorsKeep track of model output parameters

Page 54: Model Visualisation

Tidy Model___Augment predictions & errors to datasetCreate output parameters data frameVisualise like Tidy Data

Page 55: Model Visualisation

Managing N-Models___"Managing Many Models in R"by Hadley Wickham "Broom Package in Rby David Robinson

Page 56: Model Visualisation

p/n/N Model-Vis challenge___p -- High dimensional datan -- Large and big dataN -- Multiple models

Page 57: Model Visualisation

p/n/N Model-Vis approach___p -- use Projectionsn -- use Binning or SamplingN -- use Tidy Model

Page 58: Model Visualisation

Model-Vis___Similar challenges to Data-VisMore an Art, than a ScienceEssential in ML Model PipelineBoth to Explain or to PredictScope for easier tooling

Page 59: Model Visualisation

Model-Vis___Slides and Codehttp://modelvis.amitkaps.com

Mini-Site and Explanation (by End of 2016)

Page 60: Model Visualisation

Model Visualisation___Amit Kapoor@amitkaps

amitkaps.com