Barga Data Science lecture 10

Deriving Knowledge from Data at Scale


Models in Production


Putting an ML Model into Production

• A/B Testing


Controlled Experiments in One Slide

Concept is Trivial

• Must run statistical tests to confirm differences are not due to chance

• Best scientific way to prove causality, i.e., the changes in metrics are caused by changes introduced in the treatment(s)
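A minimal sketch of the kind of statistical test the first bullet calls for, assuming the metric is a conversion rate and that a two-proportion z-test is an acceptable choice (the slides do not name a specific test). The function name and the sample counts are illustrative, not taken from the lecture.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants convert at the same rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)            # rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))          # two-sided normal tail
    return z, p_value

# Hypothetical counts: 10,000 users per variant.
z, p = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be due to chance alone; the threshold (often 0.05) should be chosen before the experiment starts.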


Best Practice: A/A Test

Run A/A tests

before


Best Practice: Ramp-up

Ramp-up


Best Practice: Run Experiments at 50/50


Cost-based learning


Imbalanced Class Distribution & Error Costs

WEKA cost-sensitive learning

weighting method

false negatives (FN): try to avoid false negatives


Imbalanced Class Distribution

WEKA cost-sensitive learning

Preprocess, Classify

meta.CostSensitiveClassifier

set the FN to 100, FP to 10

tries to optimize accuracy or error; can be cost-sensitive: decision trees, rule learner
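A minimal sketch of the cost-sensitive idea, using the FN = 100 and FP = 10 costs from the slide. It illustrates the minimum-expected-cost decision rule in plain Python; it is not WEKA's CostSensitiveClassifier implementation, and the function names are hypothetical.

```python
COST_FN = 100.0  # cost of missing a positive (from the slide)
COST_FP = 10.0   # cost of a false alarm (from the slide)

def min_cost_prediction(p_positive, cost_fn=COST_FN, cost_fp=COST_FP):
    """Pick the class with the lower expected cost given P(positive)."""
    cost_if_predict_negative = p_positive * cost_fn        # risk a false negative
    cost_if_predict_positive = (1 - p_positive) * cost_fp  # risk a false positive
    return 1 if cost_if_predict_positive <= cost_if_predict_negative else 0

def total_cost(y_true, y_pred, cost_fn=COST_FN, cost_fp=COST_FP):
    """Sum the misclassification costs over a labeled set."""
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:
            cost += cost_fn
        elif t == 0 and p == 1:
            cost += cost_fp
    return cost

# With these costs the decision threshold drops from 0.5 to 10 / 110 (about 0.09).
print(min_cost_prediction(0.2), min_cost_prediction(0.05))
print(total_cost([1, 1, 0, 0], [1, 0, 1, 0]))
```

Because a false negative is ten times as costly as a false positive here, an instance is flagged positive even when the model is far from certain.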





curated; completely specify a problem; measure progress

paired with a metric, target SLAs, scoreboard


This isn't easy…

• Building high quality gold sets is a challenge

• It is time consuming

• It requires making difficult and long-lasting choices, and the rewards are delayed…


enforce a few principles

1. Distribution parity

2. Testing blindness

3. Production parity

4. Single metric

5. Reproducibility

6. Experimentation velocity

7. Data is gold


• Test set blindness

• Reproducibility and Data is gold

• Experimentation velocity


Building Gold sets is hard work. Many common and avoidable mistakes are made. This suggests having a checklist. Some questions will be trivial to answer or not applicable; some will require work…

1. Metrics: For each gold set, choose one (1) metric. Having two metrics on the same gold set is a problem (you can't optimize both at once).

2. Weighting/Slicing: Not all errors are equal. This should be reflected in the metric, not through sampling manipulation. Having the weighting in the metric has two advantages: 1) it is explicitly documented and reproducible in the form of a metric algorithm, and 2) production, train, and test set results remain directly comparable (automatic testing).

3. Yardstick(s): Define algorithms and configuration parameters for public yardstick(s). There could be more than one yardstick. A simple yardstick is useful for ramping up. Once one can reproduce/understand the simple yardstick's result, it becomes easier to improve on the latest "production" yardstick. Ideally yardsticks come with downloadable code. The yardsticks provide a set of errors that suggests where innovation should happen.
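A minimal sketch of the Weighting/Slicing item: the error weights live inside the metric instead of being baked into the sample. The function name and the example weights are assumptions made for illustration.

```python
def weighted_error(y_true, y_pred, weights):
    """Weighted error rate: each mistake counts in proportion to its weight."""
    total = sum(weights)
    wrong = sum(w for t, p, w in zip(y_true, y_pred, weights) if t != p)
    return wrong / total

y_true = [1, 0, 1, 0]
y_pred = [1, 1, 0, 0]
weights = [1.0, 1.0, 5.0, 1.0]   # hypothetical: the third example is 5x as important

print(weighted_error(y_true, y_pred, weights))        # weighted metric
print(weighted_error(y_true, y_pred, [1.0] * 4))      # plain error rate for comparison
```

Since the data itself is not resampled, the same set can be scored with and without weights, and train, test, and production results stay comparable.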


4. Sizes and access: What are the set sizes? Each size corresponds to an innovation velocity and a level of representativeness. A good rule of thumb is 5X size ratios between gold sets drawn from the same distribution. Where should the data live? If on a server, some services are needed for access and simple manipulations. There should always be a size that is downloadable (< 1GB) to a desktop for high velocity innovation.

5. Documentation and format: Create a format/API for the data. Is the data compressed? Provide sample code to load the data. Document the format. Assign someone to be the curator of the gold set.


6. Features: What (gold) features go in the gold sets? Features must be pickled for results to be reproducible. Ideally we would have 2 and possibly 3 types of gold sets:

a. One set should have the deployed features (computed from the raw data). This provides the production yardstick.

b. One set should be Raw (e.g., contains all information, possibly through tables). This allows contributors to create features from the raw data to investigate its potential compared to existing features. This set has more information per pattern and a smaller number of patterns.

c. One set should have an extended number of features. The additional features may be "building blocks" features that are scheduled to be deployed next, or high potential features. Moving some features to a gold set is convenient if multiple people are working on the next generation. Not all features are worth being in a gold set.

7. Feature optimization sets: Does the data require feature optimization? For instance, an IP address, a query, or a listing id may be features, but only the most frequent 10M instances are worth having specific trainable parameters. A pass over the data can identify the top 10M instances. This is a form of feature optimization. Identifying these features does not require labels. If a form of feature optimization is done, a separate data set (disjoint from the training and test sets) must be provided.
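A minimal sketch of the feature optimization pass in item 7: count values on a separate, unlabeled data set and keep only the K most frequent ones as dedicated features. The names `build_vocabulary` and `encode`, and the tiny K, are illustrative assumptions (the slide speaks of the top 10M).

```python
from collections import Counter

def build_vocabulary(values, k):
    """One pass over the data: map the k most frequent values to indices 0..k-1."""
    counts = Counter(values)
    return {value: i for i, (value, _) in enumerate(counts.most_common(k))}

def encode(value, vocab):
    """Frequent values get their own index; everything else shares one bucket."""
    return vocab.get(value, len(vocab))

# Hypothetical query log taken from a set disjoint from training and test data.
optimization_set = ["a", "b", "a", "c", "a", "b", "d"]
vocab = build_vocabulary(optimization_set, k=2)

print(vocab)
print(encode("a", vocab), encode("c", vocab))
```

No labels are needed for this pass, which is why it can run on a dedicated set without touching the training or test data.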


8. Stale rate, optimization, monitoring: How long does the set stay current? In many cases, we hide the fact that the problem is a time series even though the goal is to predict the future and we know that the distribution is changing. We must quantify how much a distribution changes over a fixed period of time. There are several ways to mitigate the changing distribution problem:

a. Assume the distribution is IID. Regularly re-compute training sets and Gold sets. Determine the frequency of re-computation, or set in place a system to monitor distribution drifts (monitor KPI changes while the algorithm is kept constant).

b. Decompose the model along "distribution (fast) tracking parameters" and slow tracking parameters. The fast tracking model may be a simple calibration with very few parameters.

c. Recast the problem as a time series problem: patterns are (input data from t-T to t-1, prediction at time t). In this space, the patterns are much larger, but the problem is closer to being IID.

9. The gold sets should have information that reveals the stale rate and allows algorithms to differentiate themselves based on how they degrade with time.
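A minimal sketch of option (c), recasting a series as patterns of the form (input data from t-T to t-1, prediction at time t). The function name `make_windows` and the toy series are assumptions for illustration.

```python
def make_windows(series, T):
    """Turn a series into (inputs from t-T to t-1, target at t) patterns."""
    return [(series[t - T:t], series[t]) for t in range(T, len(series))]

daily_kpi = [1, 2, 3, 4, 5]          # hypothetical measurements, oldest first
patterns = make_windows(daily_kpi, T=3)

for inputs, target in patterns:
    print(inputs, "->", target)
```

Each pattern is larger than a single observation, as the slide notes, but consecutive patterns now carry their own history, which brings the problem closer to the IID setting.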


10. Grouping: Should the patterns be grouped? For example, in handwriting, examples are grouped per writer. A set built by shuffling the words is misleading because training and testing would have word examples for the same writer, which makes generalization much easier. If the words are grouped per writer, then a writer is unlikely to appear in both the training and test sets, which requires the system to generalize to never-seen-before handwriting (as opposed to never-seen-before words). Do we have these types of constraints? Should we group per advertiser, campaign, or user to generalize across new instances of these entities (as opposed to generalizing to new queries)? ML requires training and testing to be drawn from the same distribution. Drawing duplicates is not a problem. Problems arise when one partially draws examples from the same entity in both training and testing on a small set of entities. This breaks the IID assumption and makes generalization on the test set look much easier than it actually is.

11. Sampling production data: What strategy is used for sampling? Uniform? Are any of the following filtered out: fraud, bad configurations, duplicates, non-billable, adult, overwrites, etc.? Guidance: use the production sameness principle.
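A minimal sketch of the grouping idea from item 10: split by entity (here, writer) so that no writer appears in both the training and test sets. The function name `group_split` and the toy data are assumptions for illustration.

```python
import random

def group_split(examples, group_of, test_fraction=0.2, seed=0):
    """Assign whole groups to train or test so no group is shared between them."""
    groups = sorted({group_of(e) for e in examples})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [e for e in examples if group_of(e) not in test_groups]
    test = [e for e in examples if group_of(e) in test_groups]
    return train, test

# Hypothetical handwriting set: (writer, word) pairs, three words per writer.
examples = [(f"writer{w}", f"word{i}") for w in range(5) for i in range(3)]
train, test = group_split(examples, group_of=lambda e: e[0])

print(len(train), len(test))
print({w for w, _ in train} & {w for w, _ in test})   # shared writers
```

A plain shuffle of the 15 examples would almost certainly put the same writer on both sides and overstate how well the model generalizes to new writers.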


12. Unlabeled set: If the number of labeled examples is small, a large data set of unlabeled data with the same distribution should be collected and be made a gold set. This enables the discovery of new features using intermediate classifiers and active labeling.


Greatest Challenge in Machine Learning


gender   age   smoker   eye color   lung cancer
male     19    yes      green       no
female   44    yes      gray        yes
male     49    yes      blue        yes
male     12    no       brown       no
female   37    no       brown       no
female   60    no       brown       yes
male     44    no       blue        no
female   27    yes      brown       no
female   51    yes      green       yes
female   81    yes      gray        no
male     22    yes      brown       no
male     29    no       blue        no

gender   age   smoker   eye color   lung cancer
male     77    yes      gray        yes
male     19    yes      green       no
female   44    no       gray        no

Train ML Model


The greatest challenge in Machine Learning: Lack of Labelled Training Data…

What to Do

• Controlled Experiments – get feedback from users to serve as labels

• Mechanical Turk – pay people to label data to build a training set

• Ask Users to Label Data – report as spam, 'hot or not', review a product; observe their click behavior (ad retargeting, search results, etc.)


What if you can't get labeled Training Data?

Traditional Supervised Learning

• Promotion on bookseller's web page

• Customers can rate books

• Will a new customer like this book?

• Training set: observations on previous customers

• Test set: new customers

What happens if only few customers rate a book?

Training Data (Attributes: Age, Income; Target/Label: LikesBook)

Age   Income   LikesBook
24    60K      +
65    80K      -
60    95K      -
35    52K      +
20    45K      +
43    75K      +
26    51K      +
52    47K      -
47    38K      -
25    22K      -
33    47K      +

Test Data

Age   Income
22    67K
39    41K

Prediction (from Model)

Age   Income   LikesBook
22    67K      +
39    41K      -

© 2013 Datameer, Inc. All rights reserved.




Semi-Supervised Learning

Can we make use of the unlabeled data?

In theory: no

… but we can make assumptions

Popular Assumptions

• Clustering assumption

• Low density assumption

• Manifold assumption


The Clustering Assumption

Clustering

• Partition instances into groups (clusters) of similar instances

• Many different algorithms: k-Means, EM, etc.

Clustering Assumption

• The two classification targets are distinct clusters

• Simple semi-supervised learning: cluster, then perform majority vote
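A minimal sketch of "cluster, then perform majority vote", assuming plain k-means with caller-supplied starting centers. The function names and the toy points are illustrative; the slides do not prescribe a particular clustering algorithm for this step.

```python
from collections import Counter

def kmeans(points, initial_centers, iterations=20):
    """Plain k-means on tuples; returns one cluster index per point."""
    centers = list(initial_centers)
    assign = []
    for _ in range(iterations):
        assign = [min(range(len(centers)),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def cluster_then_label(points, labels, initial_centers):
    """Give every point the majority label of the labeled points in its cluster."""
    assign = kmeans(points, initial_centers)
    votes = {}
    for a, y in zip(assign, labels):
        if y is not None:
            votes.setdefault(a, Counter())[y] += 1
    return [votes[a].most_common(1)[0][0] if a in votes else None for a in assign]

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["+", None, None, "-", None, None]      # only two points are labeled
predicted = cluster_then_label(points, labels, [points[0], points[3]])
print(predicted)
```

The approach only works when the clustering assumption holds; if a cluster mixes both classes, the vote simply propagates the majority's mistake.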









Generative Models

Mixture of Gaussians

• Assumption: the data in each cluster is generated by a normal distribution

• Find most probable location and shape of clusters given data

Expectation-Maximization

• Two-step optimization procedure

• Keeps estimates of cluster assignment probabilities for each instance

• Might converge to local optimum
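A minimal sketch of Expectation-Maximization for a one-dimensional mixture of Gaussians, written in plain Python. The function names, the starting means, and the toy data are assumptions for illustration; a practical version would also monitor the log-likelihood and try several initializations, since the result depends on the starting point.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, initial_means, iterations=50):
    """Fit a 1-D Gaussian mixture; returns (weights, means, variances, responsibilities)."""
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: cluster assignment probabilities for each instance
        resp = []
        for x in xs:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate location and shape of each cluster
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)   # floor avoids a collapsed cluster
    return weights, means, variances, resp

xs = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]
weights, means, variances, resp = em_gmm_1d(xs, initial_means=[0.0, 6.0])
print([round(m, 3) for m in means], [round(w, 3) for w in weights])
```

In the semi-supervised setting, labeled instances would have their responsibilities fixed to their known class while unlabeled ones are estimated as above.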





Beyond Mixtures of Gaussians


• Can be adjusted to all kinds of mixture models

• E.g., use Naive Bayes as mixture model for text classification

Self-Training

• Learn model on labeled instances only

• Apply model to unlabeled instances

• Learn new model on all instances

• Repeat until convergence
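A minimal sketch of the self-training loop, assuming a nearest-centroid classifier as the base model (the slides leave the base learner open). All function names and the toy data are illustrative.

```python
def fit_centroids(points, labels):
    """Base learner: one centroid (mean point) per class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        s = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(v / counts[y] for v in s) for y, s in sums.items()}

def predict(centroids, p):
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(p, centroids[y])))

def self_train(labeled, unlabeled, max_rounds=10):
    points = [p for p, _ in labeled]
    labels = [y for _, y in labeled]
    model = fit_centroids(points, labels)                    # learn on labeled only
    pseudo = None
    for _ in range(max_rounds):
        new_pseudo = [predict(model, p) for p in unlabeled]  # apply to unlabeled
        if new_pseudo == pseudo:                             # converged
            break
        pseudo = new_pseudo
        model = fit_centroids(points + unlabeled, labels + pseudo)  # learn on all
    return model, pseudo

labeled = [((1.0, 1.0), "+"), ((5.0, 5.0), "-")]
unlabeled = [(1.5, 1.2), (0.8, 1.4), (4.6, 5.3), (5.5, 4.7), (2.8, 2.9)]
model, pseudo = self_train(labeled, unlabeled)
print(pseudo)
```

A common refinement, not shown here, is to add only the most confident pseudo-labels in each round so that early mistakes are less likely to reinforce themselves.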


The Low Density Assumption

Assumption

• The area between the two classes has low density

• Does not assume any specific form of cluster

Support Vector Machine

• Decision boundary is linear

• Maximizes margin to closest instances











Semi-Supervised SVM

• Maximize margin to labeled and unlabeled instances

• Parameter to fine-tune influence of unlabeled instances

• Additional constraint: keep class balance correct

Implementation

• Simple extension of SVM

• But non-convex optimization problem
















Semi-Supervised SVM

Stochastic Gradient Descent

• One run over the data in random order

• Each misclassified or unlabeled instance moves classifier a bit

• Steps get smaller over time

Implementation on Hadoop

• Mapper: send data to reducer in random order

• Reducer: update linear classifier for unlabeled or misclassified instances

• Many random runs to find best one
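A minimal single-machine sketch of the stochastic gradient descent pass described above, assuming a linear classifier with hinge loss on labeled instances and a symmetric hinge ("hat") loss on unlabeled ones. The function name, the hyperparameters, and the toy data are assumptions; the class-balance constraint, the Hadoop mapper/reducer split, and the "many random runs" step are left out.

```python
import random

def s3vm_sgd(labeled, unlabeled, unlabeled_weight=0.1, lr0=0.1, reg=0.01, seed=0):
    """One SGD pass over labeled (x, y in {-1, +1}) and unlabeled x instances."""
    rng = random.Random(seed)
    data = [(x, y) for x, y in labeled] + [(x, None) for x in unlabeled]
    rng.shuffle(data)                                   # one run in random order
    w, b = [0.0] * len(data[0][0]), 0.0
    for step, (x, y) in enumerate(data, start=1):
        lr = lr0 / (1 + lr0 * reg * step)               # steps get smaller over time
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y is None:
            # Unlabeled: push the instance away from the boundary on its current side.
            target, scale = (1 if score >= 0 else -1), unlabeled_weight
        else:
            target, scale = y, 1.0
        w = [wi * (1 - lr * reg) for wi in w]           # L2 regularization shrinkage
        if target * score < 1:                          # misclassified or inside the margin
            w = [wi + lr * scale * target * xi for wi, xi in zip(w, x)]
            b += lr * scale * target
    return w, b

labeled = [((2.0,), 1)] * 20 + [((-2.0,), -1)] * 20
unlabeled = [(1.5,)] * 5 + [(-1.5,)] * 5
w, b = s3vm_sgd(labeled, unlabeled)
print(w, b)
```

`unlabeled_weight` plays the role of the parameter that tunes the influence of unlabeled instances. Because the objective is non-convex, different random orders can end in different solutions, which is why the slide suggests keeping the best of many runs.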









The Manifold Assumption

The Assumption

• Training data is (roughly) contained in a low-dimensional manifold

• One can perform learning in a more meaningful low-dimensional space

• Avoids curse of dimensionality

Similarity Graphs

• Idea: compute similarity scores between instances

• Create a network where the nearest neighbors are connected




Label Propagation

Main Idea

• Propagate label information to neighboring instances

• Then repeat until convergence

• Similar to PageRank

Theory

• Known to converge under weak conditions

• Equivalent to matrix inversion
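A minimal sketch of label propagation on a similarity graph, assuming an unweighted graph given as adjacency lists and labels encoded as scores in [-1, +1]. The function name and the five-node chain are illustrative assumptions; every node is assumed to have at least one neighbor.

```python
def label_propagation(neighbors, seeds, iterations=100, tol=1e-9):
    """neighbors: {node: [adjacent nodes]}; seeds: {labeled node: score}."""
    scores = {n: seeds.get(n, 0.0) for n in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for n, adj in neighbors.items():
            if n in seeds:
                new_scores[n] = seeds[n]          # labeled nodes stay clamped
            else:
                new_scores[n] = sum(scores[m] for m in adj) / len(adj)
        change = max(abs(new_scores[n] - scores[n]) for n in neighbors)
        scores = new_scores
        if change < tol:                          # repeat until convergence
            break
    return scores

# Chain a - b - c - d - e with one labeled node at each end.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
scores = label_propagation(neighbors, seeds={"a": 1.0, "e": -1.0})
print(scores)
```

The fixed point of this iteration is the solution of a linear system over the unlabeled nodes, which is the sense in which the slide calls the method equivalent to matrix inversion. The sign of each score gives the predicted class.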









Conclusion

Semi-Supervised Learning

• Only few training instances have labels

• Unlabeled instances can still provide valuable signal

Different assumptions lead to different approaches

• Cluster assumption: generative models

• Low density assumption: semi-supervised support vector machines

• Manifold assumption: label propagation


10 Minute Break…


Controlled Experiments


• A

• B


OEC

Overall Evaluation Criterion

Picking a good OEC is key



• Lesson 2: GET THE DATA


Lesson 3: Prepare to be humbled

Left Elevator | Right Elevator


• Lesson 1

• Lesson 2

• Lesson 3

15 Bing


• HiPPO: stop the project

From Greg Linden's Blog: http://glinden.blogspot.com/2006/04/early-amazon-shopping-cart.html


TED talk






• Raise your right hand if you think A Wins

• Raise your left hand if you think B Wins

• Don't raise your hand if you think they're about the same

A B


• A was 85 better


Differences:

A has taller search box (overall size is the same), has magnifying glass icon, "popular searches"

B has big search button



• Don't raise your hand if they are about the same




A B





Get the data; prepare to be humbled


Any statistic that appears interesting is almost certainly a mistake

If something is "amazing", find the flaw!

Examples

If you have a mandatory birth date field and people think it's unnecessary, you'll find lots of 11/11/11 or 01/01/01

If you have an optional drop down, do not default to the first alphabetical entry, or you'll have lots of jobs = Astronaut

The previous Office example assumes click maps to revenue. Seemed reasonable, but when the results look so extreme, find the flaw (conversion rate is not the same; see why?)


Data Trumps Intuition


Sir Ken Robinson


• OEC = Overall Evaluation Criterion


• Controlled Experiments in one slide

• Examples: you're the decision maker


It is difficult to get a man to understand something when his salary depends upon his not understanding it.

-- Upton Sinclair


Hubris


Cultural Stage 2: Insight through Measurement and Control

• Semmelweis worked at Vienna's General Hospital, an important teaching/research hospital, in the 1830s-40s

• In 19th-century Europe, childbed fever killed more than a million women

• Measurement: the mortality rate for women giving birth was

  • 15% in his ward, staffed by doctors and students

  • 2% in the ward at the hospital attended by midwives



• He tried to control all differences

  • Birthing positions, ventilation, diet, even the way laundry was done

• He was away for 4 months, and the death rate fell significantly when he was away. Could it be related to him?

• Insight:

  • Doctors were performing autopsies each morning on cadavers

  • Conjecture: particles (called germs today) were being transmitted to healthy patients on the hands of the physicians

• He experiments with cleansing agents

  • Chlorine and lime was effective: death rate fell from 18% to 1%


Semmelweis Reflex

• Semmelweis Reflex

2005 study: inadequate hand washing is one of the prime contributors to the 2 million health-care-associated infections and 90,000 related deaths annually in the United States


Fundamental Understanding


Hubris | Measure and Control | Accept Results (avoid Semmelweis Reflex) | Fundamental Understanding


• Cultural evolution: hubris, insight through measurement, Semmelweis reflex, fundamental understanding



• Real Data for the city of Oldenburg, Germany

• X-axis: stork population

• Y-axis: human population

What your mother told you about babies and storks when you were three is still not right, despite the strong correlational "evidence"

Ornithologische Monatsberichte 1936;44(2)


Women have smaller palms and live 6 years longer on average

But… don't try to bandage your hands


causal


If you don't know where you are going, any road will take you there

-- Lewis Carroll







• Hippos kill more humans than any other (non-human) mammal (really)

• OEC

• Get the data

• Prepare to be humbled

The less data, the stronger the opinions…


Out of Class Reading

Eight (8) page conference paper

40 page journal version…



Course Project

Due Oct 25th


Open Discussion on Course Project…




Gallery of Experiments

Contributed by the community


Azure Machine Learning Studio


Sample Experiments

To help you get started


Experiment

Tools that you can use in your experiment. For feature selection, large set of machine learning algorithms



Using classification algorithms

Evaluating the model

Splitting to Training and Testing Datasets

Getting Data For the Experiment

http://gallery.azureml.net/browse?tags=[%22Azure%20ML%20Book%22

Customer Churn Model


Deployed web service endpoints that can be consumed by applications and for batch processing



Define Objective | Access and Understand the Data | Pre-processing | Feature and/or Target construction

1. Define the objective and quantify it with a metric – optionally with constraints, if any. This typically requires domain knowledge.

2. Collect and understand the data; deal with the vagaries and biases in the data acquisition (missing data, outliers due to errors in the data collection process, more sophisticated biases due to the data collection procedure, etc.)

3. Frame the problem in terms of a machine learning problem – classification, regression, ranking, clustering, forecasting, outlier detection, etc. – some combination of domain knowledge and ML knowledge is useful.

4. Transform the raw data into a "modeling dataset", with features, weights, targets, etc., which can be used for modeling. Feature construction can often be improved with domain knowledge. Target must be identical to (or a very good proxy of) the quantitative metric identified in step 1.


Feature selection | Model training | Model scoring | Evaluation | Train/Test split

5. Train, test, and evaluate, taking care to control bias/variance and ensure the metrics are reported with the right confidence intervals (cross-validation helps here); be vigilant against target leaks (which typically lead to unbelievably good test metrics) – this is the ML-heavy step.
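A minimal sketch of reporting a metric with a confidence interval from k-fold cross-validation. The fold assignment (interleaved indices), the normal-approximation interval, the toy threshold model, and all function names are assumptions for illustration; fold scores are not fully independent, so the interval is only a rough guide.

```python
import statistics

def k_fold_indices(n, k):
    """Yield (train, test) index lists; fold i holds indices i, i+k, i+2k, ..."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def cross_validate(xs, ys, fit, predict, k=5):
    """Return mean accuracy and an approximate 95% confidence interval."""
    accuracies = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        hits = sum(predict(model, xs[i]) == ys[i] for i in test)
        accuracies.append(hits / len(test))
    mean = statistics.mean(accuracies)
    half_width = 1.96 * statistics.stdev(accuracies) / len(accuracies) ** 0.5
    return mean, (mean - half_width, mean + half_width)

# Hypothetical model: a threshold halfway between the two classes.
def fit_threshold(xs, ys):
    highest_negative = max(x for x, y in zip(xs, ys) if not y)
    lowest_positive = min(x for x, y in zip(xs, ys) if y)
    return (highest_negative + lowest_positive) / 2

def predict_threshold(threshold, x):
    return x >= threshold

xs = list(range(20))
ys = [x >= 10 for x in xs]
mean_acc, (low, high) = cross_validate(xs, ys, fit_threshold, predict_threshold)
print(f"accuracy = {mean_acc:.2f}, 95% CI about ({low:.2f}, {high:.2f})")
```

The model is fit only on the training folds in each round; fitting any preprocessing or feature selection on the full data before splitting is one common way target leaks creep in.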


Define Objective | Access and Understand the data | Pre-processing | Feature and/or Target construction | Feature selection | Model training | Model scoring | Evaluation | Train/Test split

6. Iterate steps (2) – (5) until the test metrics are satisfactory


Access Data | Pre-processing | Feature construction | Model scoring




Book Recommendation


That's all for our course…


Run AA tests

before



Ramp-up




Cost based learning



weighting method


false negatives



Preprocess Classify



















2 Testing blindness

3 Production parity

4 Single metric

5 Reproducibility


7 Data is gold


































male 19 yes green

female 44 yes gray

male 49 yes blue

male 12 no brown

female 37 no brown

female 60 no brown

male 44 no blue

female 27 yes brown

female 51 yes green

female 81 yes gray

male 22 yes brown

male 29 no blue

lung cancer

no

yes

yes

no

no

yes

no

no

yes

no

no

no

male 77 yes gray

male 19 yes green

female 44 no gray

yes

no

no

Train ML Model



What to Do














24 60K +

65 80K -

60 95K -

35 52K +

20 45K +

43 75K +

26 51K +

52 47K -

47 38K -

25 22K -

33 47K +


22 67K

39 41K


22 67K +

39 41K -

Model

Test Data

Prediction

Training Data

Attributes

Target

Label



24 60K +

65 80K -

60 95K -

35 52K +

20 45K +

43 75K +

26 51K +

52 47K -

47 38K -

25 22K -

33 47K +


22 67K

39 41K


22 67K +

39 41K -




In theory no


PopularAssumptions






Clustering


instances








Clustering


instances







Clustering


instances







Clustering


instances






Generative Models




given data




for each instance



Generative Models




given data




for each instance



Generative Models




given data




for each instance



Generative Models




given data




for each instance



Generative Models




given data




for each instance



Generative Models




given data




for each instance







Self-Training







Assumption








Assumption








Assumption




























Semi-Supervised SVM



classifier a bit







Semi-Supervised SVM



classifier a bit







Semi-Supervised SVM



classifier a bit







Semi-Supervised SVM



classifier a bit







Semi-Supervised SVM



classifier a bit








The Assumption




Similarity Graphs






The Assumption








connected



The Assumption






SimilarityGraphs



are connected


Label Propagation







Label Propagation







Label Propagation







Label Propagation







Conclusion











bull A

bull B


OEC












• Lesson 1

• Lesson 2

• Lesson 3

15 Bing





TED talk









A B




A

B










A B









Examples











Sir Ken Robinson









-- Upton Sinclair


Hubris













• Insight






Semmelweis Reflex






Hubris

Measure and Control

Accept Results, avoid Semmelweis Reflex

Fundamental Understanding








Germany









on average



causal



-- Lewis Carroll



before





• OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Getting Data for the Experiment

Splitting to Training and Testing Datasets

Using classification algorithms

Evaluating the model









Define Objective

Access and Understand the Data

Pre-processing

Feature and/or Target construction

Feature selection

Model training

Model scoring

Evaluation

Train/Test split
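The last few steps listed above (train/test split, model training, model scoring, evaluation) can be sketched end to end as follows. This is a hedged illustration and not the lecture's own pipeline: the rows reuse the small two-attribute table from earlier in the deck, the 70/30 split and fixed seed are arbitrary choices, and the "model" is a 1-nearest-neighbour stand-in that simply memorizes its training rows.

```python
import math
import random

# (attribute 1, attribute 2 in K, label), taken from the deck's small example table.
ROWS = [
    (24, 60, "+"), (65, 80, "-"), (60, 95, "-"), (35, 52, "+"),
    (20, 45, "+"), (43, 75, "+"), (26, 51, "+"), (52, 47, "-"),
    (47, 38, "-"), (25, 22, "-"), (33, 47, "+"),
]


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle reproducibly and hold out a fraction of the rows for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train(rows):
    """Model training: 1-nearest-neighbour just stores the training rows."""
    return list(rows)


def score(model, features):
    """Model scoring: return the label of the closest stored row."""
    nearest = min(model, key=lambda r: math.dist(features, r[:2]))
    return nearest[2]


def evaluate(model, rows):
    """Evaluation: fraction of held-out rows whose label is predicted correctly."""
    hits = sum(score(model, r[:2]) == r[2] for r in rows)
    return hits / len(rows)


train_rows, test_rows = train_test_split(ROWS)
model = train(train_rows)
accuracy = evaluate(model, test_rows)
print(f"train={len(train_rows)} test={len(test_rows)} accuracy={accuracy:.2f}")
```

Keeping the split seeded makes the experiment reproducible, and evaluating only on rows the model never saw is what keeps the reported accuracy honest. With only three held-out rows the number itself is too noisy to mean much.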







ML heavy step





Access Data

Pre-processing

Feature construction

Model scoring




Book Recommendation





Ramp-up




Cost based learning



weighting method


false negatives



Preprocess Classify



















2 Testing blindness

3 Production parity

4 Single metric

5 Reproducibility


7 Data is gold


































male 19 yes green

female 44 yes gray

male 49 yes blue

male 12 no brown

female 37 no brown

female 60 no brown

male 44 no blue

female 27 yes brown

female 51 yes green

female 81 yes gray

male 22 yes brown

male 29 no blue

lung cancer

no

yes

yes

no

no

yes

no

no

yes

no

no

no

male 77 yes gray

male 19 yes green

female 44 no gray

yes

no

no

Train ML Model



What to Do














24 60K +

65 80K -

60 95K -

35 52K +

20 45K +

43 75K +

26 51K +

52 47K -

47 38K -

25 22K -

33 47K +


22 67K

39 41K


22 67K +

39 41K -

Model

Test Data

Prediction

Training Data

Attributes

Target

Label



24 60K +

65 80K -

60 95K -

35 52K +

20 45K +

43 75K +

26 51K +

52 47K -

47 38K -

25 22K -

33 47K +


22 67K

39 41K


22 67K +

39 41K -




In theory no


PopularAssumptions






Clustering


instances








Clustering


instances







Clustering


instances







Clustering


instances






Generative Models




given data




for each instance










Self-Training
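The slide carries only the title, so the following is a minimal sketch of the usual self-training loop rather than the lecture's own code: train on the labeled points, label the unlabeled ones, keep the single most confident prediction as if it were a true label, and retrain. The nearest-centroid classifier, the one-dimensional data, and all function names are assumptions made for illustration.

```python
def centroids(labeled):
    """Mean of the points in each class; `labeled` is a list of (x, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(cents, x):
    """Return (label, confidence); confidence is the distance gap between
    the runner-up centroid and the nearest centroid (needs two classes)."""
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best, runner_up = ranked[0], ranked[1]
    return best, abs(x - cents[runner_up]) - abs(x - cents[best])


def self_train(labeled, unlabeled):
    """Repeatedly move the most confidently predicted point into the labeled set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    while unlabeled:
        cents = centroids(labeled)  # retrain on everything labeled so far
        scored = [(predict(cents, x), x) for x in unlabeled]
        (label, _conf), x = max(scored, key=lambda s: s[0][1])
        labeled.append((x, label))  # trust the model's own prediction
        unlabeled.remove(x)
    return labeled


LABELED = [(1.0, "a"), (9.0, "b")]
UNLABELED = [2.0, 3.5, 4.8, 6.5, 8.5]
result = dict(self_train(LABELED, UNLABELED))
print(result)
```

The design choice to add one point per round keeps the loop easy to follow; a confidence threshold or a batch of top-k points per round are common alternatives.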







Assumption




































Semi-Supervised SVM



classifier a bit















The Assumption

Similarity Graphs

are connected


Label Propagation
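Only the slide title survives here, so the sketch below illustrates one common form of label propagation on a similarity graph, not necessarily the variant shown in the lecture. Each unlabeled node repeatedly takes the weighted average of its neighbours' scores while labeled nodes stay fixed. The graph, the edge weights, the 0.5 threshold, and the name `propagate_labels` are all assumptions; every unlabeled node is assumed to have at least one edge.

```python
def propagate_labels(weights, seeds, n_nodes, iterations=200):
    """weights: {(i, j): w} undirected edges; seeds: {node: score in [0, 1]}.
    Returns one score per node; scores near 1 lean towards class A."""
    neighbours = {i: [] for i in range(n_nodes)}
    for (i, j), w in weights.items():
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))

    # Unlabeled nodes start undecided at 0.5.
    scores = [seeds.get(i, 0.5) for i in range(n_nodes)]
    for _ in range(iterations):
        updated = list(scores)
        for i in range(n_nodes):
            if i in seeds:
                continue  # labeled nodes stay clamped to their seed value
            total = sum(w for _, w in neighbours[i])
            updated[i] = sum(w * scores[j] for j, w in neighbours[i]) / total
        scores = updated
    return scores


# A chain of six nodes: node 0 is labeled class A (1.0), node 5 class B (0.0).
EDGES = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 4): 1.0, (4, 5): 1.0}
SEEDS = {0: 1.0, 5: 0.0}
scores = propagate_labels(EDGES, SEEDS, n_nodes=6)
labels = ["A" if s > 0.5 else "B" for s in scores]
print([round(s, 3) for s in scores], labels)
```

With equal weights on a chain the scores settle to a straight-line interpolation between the two seeds; unequal weights would pull each node towards its more similar neighbours.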














Conclusion











• A

• B


OEC
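Comparing variants A and B on the OEC (Overall Evaluation Criterion) needs a statistical test to check that the difference is not due to chance. A minimal sketch follows, assuming the OEC is a simple conversion rate and using a pooled two-proportion z-test; the function name and the visitor counts are made up for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test. Returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200 of 10,000 users convert on A, 250 of 10,000 on B.
z, p_value = two_proportion_z_test(200, 10_000, 250, 10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")

# An A/A comparison with identical counts shows no difference at all.
z_aa, p_aa = two_proportion_z_test(200, 10_000, 200, 10_000)
print(f"A/A: z = {z_aa:.2f}, p = {p_aa:.2f}")
```

For the hypothetical counts, z is about 2.4 and the two-sided p-value falls below 0.05; the identical A/A counts give z = 0 and p = 1.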












• Lesson 1

• Lesson 2

• Lesson 3

15 Bing





TED talk









A B




A

B










A B









Examples











Sir Ken Robinson









-- Upton Sinclair


Hubris













• Insight






Semmelweis Reflex






Hubris

Measure and Control

Accept Results, avoid Semmelweis Reflex

Fundamental Understanding








Germany









on average



causal



-- Lewis Carroll



before





• OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Getting Data For the Experiment
Splitting to Training and Testing Datasets
Using classification algorithms
Evaluating the model









Define Objective
Access and Understand the Data
Pre-processing
Feature and/or Target construction
Feature selection
Model training
Model scoring
Evaluation
Train/Test split
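The later steps in this list (train/test split, model training, model scoring, evaluation) can be sketched in a few lines. This is an illustrative outline only: the rows are copied from the training table earlier in the deck (the extracted text does not say what the two attribute columns mean), while the majority-class "model", the 30% test fraction, the seed, and all function names are assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter

# (attributes, target) rows copied from the deck's training table.
ROWS = [((24, 60), "+"), ((65, 80), "-"), ((60, 95), "-"), ((35, 52), "+"),
        ((20, 45), "+"), ((43, 75), "+"), ((26, 51), "+"), ((52, 47), "-"),
        ((47, 38), "-"), ((25, 22), "-"), ((33, 47), "+")]

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Train/Test split: shuffle reproducibly, then hold out a fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def train_majority_model(train_rows):
    """Model training: a placeholder baseline that always predicts the most
    frequent training label (a real pipeline would fit a classifier here)."""
    majority = Counter(label for _, label in train_rows).most_common(1)[0][0]
    return lambda attributes: majority

def score_rows(model, rows):
    """Model scoring: apply the trained model to each row's attributes."""
    return [model(attributes) for attributes, _ in rows]

def accuracy(predictions, rows):
    """Evaluation: fraction of predictions that match the held-out labels."""
    hits = sum(p == label for p, (_, label) in zip(predictions, rows))
    return hits / len(rows)

train_rows, test_rows = train_test_split(ROWS)
model = train_majority_model(train_rows)
predictions = score_rows(model, test_rows)
print(f"held-out accuracy: {accuracy(predictions, test_rows):.2f}")
```

Keeping each step as its own function mirrors the boxes on the slide and makes it easy to swap the placeholder baseline for a real learner without touching the split or the evaluation.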







ML heavy step





Access Data
Pre-processing
Feature construction
Model scoring




Book Recommendation













Assumption




























Semi-Supervised SVM



classifier a bit







Semi-Supervised SVM



classifier a bit







Semi-Supervised SVM



classifier a bit







Semi-Supervised SVM



classifier a bit







Semi-Supervised SVM



classifier a bit








The Assumption




Similarity Graphs






The Assumption








connected



The Assumption






SimilarityGraphs



are connected


Label Propagation







Label Propagation







Label Propagation







Label Propagation







Conclusion











bull A

bull B


OEC












bull Lesson 1

bull Lesson 2

bull Lesson 3

15 Bing





TED talk









A B




A

B










A B









Examples











Sir Ken Robinson









-- Upton Sinclair


Hubris













bull Insight






Semmelweis Reflex






HubrisMeasure and

Control

Accept Results

avoid

Semmelweis

Reflex

Fundamental

Understanding








Germany









on average



causal



mdashLewis Carroll



before





bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation











bull Lesson 1

bull Lesson 2

bull Lesson 3

15 Bing





TED talk









A B




A

B










A B









Examples











Sir Ken Robinson









-- Upton Sinclair


Hubris













bull Insight






Semmelweis Reflex






HubrisMeasure and

Control

Accept Results

avoid

Semmelweis

Reflex

Fundamental

Understanding








Germany









on average



causal



mdashLewis Carroll



before





bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




causal



mdashLewis Carroll



before





bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation





mdashLewis Carroll



before





bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation





before





bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




before





bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation







bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation






bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation





bull OEC

Get the data




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation
















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation















Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation













Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation











Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation










Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation









Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation






Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




Sample

Experiments



Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




Experiment




learning algorithms



Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation





Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




Using

classificatio

n

algorithms

Evaluating

the model

Splitting to

Training

and Testing

Datasets

Getting

Data

For the

Experiment









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation











Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation










Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation









Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation





Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




Define

Objective

Access and

Understand the

Data

Pre-processing

Feature andor

Target

construction














Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




Feature selection

Model training

Model scoring

Evaluation

Train Test split







ML heavy step


Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




Define

Objective

Access and

Understand

the data

Pre-processing

Feature andor

Target

construction

Feature selection

Model training

Model scoring

Evaluation

Train Test split



Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation




Access Data

Pre-processing

Feature

construction

Model scoring




Book

Recommendation






Book

Recommendation





Book

Recommendation




Book

Recommendation




