RUBYGARAGE 2017 | TECHNOLOGY MATTERS
Machine Learning: What It Is And How It Works
Volodymyr Vorobiov, Software Development Consultant at RubyGarage
Machine learning is a subset of artificial intelligence whose goal is to give computers the ability to teach themselves, whereas artificial intelligence is the general concept of smart machines. In other words, artificial intelligence is implemented through machine learning, or, more precisely, through machine learning algorithms.
Artificial Intelligence
TEACH YOUR COMPUTER
EXAMPLES OF HOW MACHINE LEARNING IS USED IN THE REAL WORLD
- Facial recognition
- Voice recognition
- Text recognition
- Diagnostics in medicine
- Self-driving cars
- Robot behavior adjustment
- Ad targeting
- Predictions in financial trading
- Virtual and augmented reality
- Astronomy and space
The 21st century is the age of data. It's literally everywhere. In fact, there has been exponential growth in the volume of data over the past decade; the total amount of data doubles every two years. Most of it, however, isn't used. Huge volumes of data can be tagged, structured, and analyzed, revealing a lot of valuable information. Only machine learning algorithms can easily cope with this task.
WHY THE FUTURE BELONGS TO MACHINE LEARNING
HOW MACHINE LEARNING WORKS
Preprocessing (putting raw data into the necessary shape) -> Learning (a learning algorithm creates a model from the training dataset) -> Evaluation (model assessment using the test dataset) -> Prediction (application of the final model to new data). Labels accompany the raw data through the training and evaluation stages.
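The four stages above can be sketched end to end with scikit-learn; the Iris dataset stands in here for the deck's own data.

```python
# A minimal sketch of the preprocessing -> learning -> evaluation -> prediction
# pipeline, using the bundled Iris dataset as a stand-in.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Preprocessing: split the data and put the features into the necessary shape
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Learning: create a model from the training dataset
model = LogisticRegression().fit(X_train, y_train)

# Evaluation: assess the model on the test dataset
accuracy = model.score(X_test, y_test)

# Prediction: apply the final model to new data
new_prediction = model.predict(X_test[:1])
```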
TOOLS
- Python
- Pandas: a powerful data analysis library for Python that provides flexible and fast data structures for processing "relational" or "labeled" data. This is a fundamental data analysis toolkit in Python.
- Scikit-learn: machine learning in Python; simple and effective open-source tools for data mining and data analysis.
- Statsmodels: a Python module providing functions and classes to estimate different statistical models as well as to conduct tests and explore statistical data. The Statsmodels module offers a comprehensive list of result statistics.
- Matplotlib: a Python 2D plotting library that produces publication-quality figures in multiple formats and interactive environments across platforms.
The quality of the data and the amount of useful information that it contains are key factors that determine how well a machine learning algorithm can learn. Therefore, it is absolutely critical that we make sure to examine and preprocess a dataset before we feed it to a learning algorithm:
- Removing and imputing missing values from the dataset
- Getting categorical data into shape for machine learning algorithms
- Selecting relevant features for the model construction
DATA PREPROCESSING
DATA PREPROCESSING. DATASET PRESENTATION.
Independent variables Dependent variables
IMPORTING THE DATASET
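Importing a dataset typically means reading a CSV with pandas and splitting it into independent and dependent variables. The deck's own CSV isn't included here, so a small inline stand-in is used; with a real file you would call `pd.read_csv('Data.csv')` instead.

```python
# A sketch of the import step with pandas, using an inline CSV as a stand-in.
import io
import pandas as pd

csv = io.StringIO(
    "Country,Age,Salary,Purchased\n"
    "France,44,72000,No\n"
    "Spain,27,48000,Yes\n"
    "Germany,30,54000,No\n"
)
dataset = pd.read_csv(csv)

# Independent variables: all columns but the last; dependent variable: the last column
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
```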
DEALING WITH MISSING DATA
Most computational tools are unable to handle missing values, or would produce unpredictable results if we simply ignored them. Therefore, it is crucial that we take care of those missing values before we proceed with further analysis.
- Eliminating samples or features with missing values. The easiest solution to this problem is simply to remove samples with missing values from a dataset. However, this seemingly handy approach has a number of drawbacks. For example, removing too many such samples is likely to compromise the quality of the analysis.
- Imputing missing values. The solution is to use various interpolation techniques that help to "guess" the missing values from other samples in a dataset.
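Mean imputation can be sketched with scikit-learn. Note: the 2017-era API was `sklearn.preprocessing.Imputer`; modern versions use `SimpleImputer` instead.

```python
# A sketch of mean imputation: each missing entry is replaced by its column mean.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[44.0, 72000.0],
              [27.0, np.nan],      # missing salary
              [np.nan, 54000.0],   # missing age
              [38.0, 61000.0]])

imputer = SimpleImputer(strategy='mean')
X_imputed = imputer.fit_transform(X)
```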
DEALING WITH MISSING DATA
IMPUTING MISSING VALUES
IMPUTING MISSING VALUES RESULTS
HANDLING CATEGORICAL DATA
ENCODE LABELS
ENCODE LABELS RESULTS
DUMMY VARIABLES
DUMMY VARIABLES RESULTS
DUMMY VARIABLE TRAP
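Dummy (one-hot) encoding and the trap can be sketched with pandas: `drop_first=True` removes one dummy column, because keeping all of them makes the columns perfectly multicollinear (the dummy variable trap).

```python
# A sketch of dummy encoding; dropping one column avoids the dummy variable trap.
import pandas as pd

df = pd.DataFrame({'Country': ['France', 'Spain', 'Germany']})
dummies = pd.get_dummies(df, columns=['Country'], drop_first=True)
# Only Country_Germany and Country_Spain remain; France is the baseline category
```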
PARTITIONING A DATASET INTO TRAINING AND TEST SETS
TRAINING AND TEST SETS RESULTS
BRINGING FEATURES ONTO THE SAME SCALE
SAME SCALE RESULTS
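Partitioning and scaling go together: the scaler must be fitted on the training set only and then reused on the test set, so no information leaks from test data.

```python
# A sketch of train/test partitioning followed by standardization.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)   # learn mean/std from training data only
X_test_s = scaler.transform(X_test)         # reuse the same transform on test data
```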
TRAINING AND SELECTING A PREDICTIVE MODEL
- Supervised Learning: Regression, Classification
- Unsupervised Learning: Clustering, Dimensionality Reduction
- Reinforcement Learning
- Association Rule Learning
- Natural Language Processing
- Deep Learning
- Model Selection
SUPERVISED LEARNING
For making predictions about the future:
- Regression: for predicting continuous outcomes
- Classification: for predicting class labels
REGRESSION
Regression models (both linear and non-linear) are used for predicting a real value, like salary, for example. If your independent variable is time, then you are forecasting future values; otherwise your model is predicting present but unknown values.
SIMPLE LINEAR REGRESSION
Simple linear regression fits y = b0 + b1*x1, where y is the dependent variable (DV), x1 is the independent variable (IV), b0 is the constant (intercept), and b1 is the coefficient (slope).
DATASET PRESENTATION. EXPERIENCE AND SALARY.
SIMPLE LINEAR REGRESSION TRAINING
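Training a simple linear regression can be sketched with scikit-learn on toy experience/salary numbers (made up for illustration):

```python
# A sketch of simple linear regression y = b0 + b1*x on toy experience/salary data.
import numpy as np
from sklearn.linear_model import LinearRegression

experience = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
salary = np.array([40000.0, 50000.0, 60000.0, 70000.0, 80000.0])

model = LinearRegression().fit(experience, salary)
b0, b1 = model.intercept_, model.coef_[0]    # constant and coefficient
prediction = model.predict([[6.0]])[0]       # expected salary at 6 years
```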
MULTIPLE LINEAR REGRESSION
Multiple linear regression fits y = b0 + b1*x1 + b2*x2 + ... + bn*xn, where y is the dependent variable (DV), x1...xn are the independent variables (IVs), b0 is the constant, and b1...bn are the coefficients.
DATASET PRESENTATION. INVESTMENT FUND STATISTICS.
MULTIPLE LINEAR REGRESSION TRAINING
EVALUATING REGRESSION MODELS PERFORMANCE
1. All-in
2. Backward Elimination
3. Forward Selection
4. Bidirectional Elimination
5. Score Comparison
Methods 2-4 are collectively known as Stepwise Regression.
BACKWARD ELIMINATION
STEP 1: Select a significance level to stay in the model (e.g. SL = 0.05)
STEP 2: Fit the full model with all possible predictors
STEP 3: Consider the predictor with the highest P-value. If P > SL, go to STEP 4, otherwise go to FIN
STEP 4: Remove the predictor
STEP 5: Fit the model without this variable, then return to STEP 3
BACKWARD ELIMINATION TRAINING
BACKWARD ELIMINATION TRAINING STEP 1
BACKWARD ELIMINATION TRAINING STEP 4
EVALUATING PERFORMANCE R-SQUARED
Ordinary least squares picks the line that minimizes SUM_i (y_i - ŷ_i)^2, the sum of squared vertical distances between each observed y_i and its fitted value ŷ_i (illustrated on the simple linear regression of Salary ($) against Experience).
EVALUATING PERFORMANCE R-SQUARED
SS_res = SUM_i (y_i - ŷ_i)^2 (residual sum of squares)
SS_tot = SUM_i (y_i - y_avg)^2 (total sum of squares)
R^2 = 1 - SS_res / SS_tot
(illustrated on the simple linear regression of Salary ($) against Experience)
EVALUATING PERFORMANCE ADJUSTED R-SQUARED
Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1), where p is the number of regressors and n is the sample size.
ADJUSTED R-SQUARED STEP 3
ADJUSTED R-SQUARED STEP 4
ADJUSTED R-SQUARED STEP 5
POLYNOMIAL REGRESSION
Polynomial regression fits y = b0 + b1*x1 + b2*x1^2 (with higher powers for higher degrees).
POLYNOMIAL REGRESSION. DATASET PRESENTATION.
BLUFFING DETECTOR
POLYNOMIAL REGRESSION. FITTING THE DATASET
POLYNOMIAL REGRESSION. TRAINING THE MODEL
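Fitting and training can be sketched by expanding x into polynomial features and then running an ordinary linear regression on them:

```python
# A sketch of polynomial regression: expand x into [1, x, x^2], fit a linear model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = 2.0 + 0.5 * x.ravel() + 3.0 * x.ravel() ** 2   # a known quadratic for illustration

poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(x)                      # columns: 1, x, x^2
model = LinearRegression().fit(X_poly, y)
prediction = model.predict(poly.transform([[11.0]]))[0]
```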
POLYNOMIAL REGRESSION RESULTS
SUPPORT VECTOR REGRESSION (BASED ON SUPPORT VECTOR MACHINES)
SUPPORT VECTOR REGRESSION. RESULTS.
WHAT IF
DECISION TREE REGRESSION
Figure: the (X1, X2) feature space is partitioned by Splits 1-4 (at X1 = 20, X1 = 40, X2 = 170, and X2 = 200); each resulting region predicts the average Y of its training points (here 300.5, 65.7, 1023, 0.7, and -64.1).
DECISION TREE REGRESSION
Figure: the equivalent decision tree. The root tests X1 < 20; deeper nodes test X2 < 200, X2 < 170, and X1 < 40; the leaves return the region averages (300.5, 65.7, 1023, -64.1, 0.7).
DECISION TREE REGRESSION TRAINING
DECISION TREE REGRESSION RESULT
ENSEMBLE LEARNING. RANDOM FOREST REGRESSION.
STEP 1: Pick at random K data points from the Training set.
STEP 2: Build the Decision Tree associated with these K data points.
STEP 3: Choose the number Ntree of trees you want to build and repeat STEPS 1 & 2.
STEP 4: For a new data point, make each one of your Ntree trees predict the value of Y for the data point in question, and assign the new data point the average across all of the predicted Y values.
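These steps can be sketched with scikit-learn's `RandomForestRegressor`, where `n_estimators` plays the role of Ntree and the per-tree sampling is handled internally; the noisy sine data is made up for the sketch.

```python
# A sketch of random forest regression: 100 trees, prediction = average of trees.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X.ravel()) + rng.normal(scale=0.1, size=200)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
prediction = forest.predict([[np.pi / 2]])[0]   # true value is about 1.0
```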
RANDOM FOREST REGRESSION TRAINING
RANDOM FOREST REGRESSION RESULT
REGRESSION MODELS. PROS AND CONS.
CLASSIFICATION
Unlike regression, where you predict a continuous number, you use classification to predict a category. There is a wide variety of classification applications, from medicine to marketing.
LOGISTIC REGRESSION
We know this: Salary ($) as a function of Experience, fitted by linear regression as y = b0 + b1*x. This is new: predicting a yes/no Action from Age, which calls for logistic regression.
LOGISTIC REGRESSION PREDICTION
DATASET PRESENTATION. SOCIAL NETWORK ADS.
LOGISTIC REGRESSION. PREPROCESSING
LOGISTIC REGRESSION. TRAINING
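Training can be sketched on a toy age/action dataset (made up here): the model passes b0 + b1*x through the sigmoid to get a probability, thresholded at 0.5 for the class.

```python
# A sketch of logistic regression: probability via sigmoid, class via threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression

age = np.array([[18], [22], [25], [30], [45], [50], [55], [60]])
action = np.array([0, 0, 0, 0, 1, 1, 1, 1])        # Y/N encoded as 1/0

clf = LogisticRegression().fit(age, action)
probability = clf.predict_proba([[58]])[0, 1]       # P(action | age = 58)
prediction = clf.predict([[58]])[0]
```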
LOGISTIC REGRESSION. TRAINING SET RESULTS
LOGISTIC REGRESSION. TEST SET RESULTS.
K-NEAREST NEIGHBORS
STEP 1: Choose the number K of neighbors
STEP 2: Take the K nearest neighbors of the new data point, according to the Euclidean distance
STEP 3: Among these K neighbors, count the number of data points in each category
STEP 4: Assign the new data point to the category where you counted the most neighbors
Your Model is Ready
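The steps above can be sketched with scikit-learn's `KNeighborsClassifier`, using K = 5 and the Euclidean distance on two made-up clusters:

```python
# A sketch of K-NN: classify a new point by the majority of its 5 nearest neighbors.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two clusters: category 0 near the origin, category 1 near (5, 5)
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [5, 5], [6, 5], [5, 6], [6, 6]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=5, metric='euclidean').fit(X, y)
prediction = knn.predict([[5.5, 5.5]])[0]   # 4 of the 5 neighbors are category 1
```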
Category 1: 3 neighbors
Category 2: 2 neighbors
K-NEAREST NEIGHBORS. TRAINING
K-NEAREST NEIGHBORS. TRAINING SET RESULTS
K-NEAREST NEIGHBORS. TEST SET RESULTS
SUPPORT VECTOR MACHINES
SUPPORT VECTOR MACHINES TRAINING
SUPPORT VECTOR MACHINES. TRAINING SET RESULTS.
SUPPORT VECTOR MACHINES. TEST SET RESULTS.
KERNEL SVM
KERNEL SVM TRAINING
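A kernel SVM can be sketched on data that is not linearly separable; the ring-around-a-blob dataset below is made up for the sketch, and the RBF kernel separates it where a linear boundary cannot.

```python
# A sketch of a kernel SVM: the RBF kernel separates a ring from an inner blob.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 200)
inner = rng.normal(scale=0.3, size=(200, 2))                       # class 0: blob
outer = np.column_stack([3 * np.cos(angles), 3 * np.sin(angles)])  # class 1: ring
X = np.vstack([inner, outer])
y = np.array([0] * 200 + [1] * 200)

clf = SVC(kernel='rbf').fit(X, y)
```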
KERNEL SVM. TRAINING SET RESULTS.
KERNEL SVM. TEST SET RESULTS.
NAIVE BAYES
Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
DRIVER OR WALKER.
NAIVE BAYES. BAYES THEOREM. WALKS.
NAIVE BAYES. BAYES THEOREM. DRIVES.
NAIVE BAYES. P(WALKS).
NAIVE BAYES. P(X).
NAIVE BAYES. P(X|WALKS).
NAIVE BAYES. P(WALKS|X).
NAIVE BAYES. P(DRIVES|X).
NAIVE BAYES. NEW WALKER.
NAIVE BAYES. TRAINING.
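The walker/driver example can be sketched with Gaussian Naive Bayes; the age/salary numbers below are invented for illustration, with class 1 = Walks and 0 = Drives.

```python
# A sketch of Gaussian Naive Bayes on a made-up walker/driver dataset.
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[25, 20000], [27, 25000], [30, 30000], [22, 18000],   # walkers
              [45, 80000], [50, 90000], [55, 95000], [48, 85000]])  # drivers
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])

nb = GaussianNB().fit(X, y)
prediction = nb.predict([[26, 22000]])[0]           # a new young, low-salary point
posterior = nb.predict_proba([[26, 22000]])[0, 1]   # P(Walks | X)
```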
NAIVE BAYES. TRAINING SET RESULTS.
NAIVE BAYES. TEST SET RESULTS.
DECISION TREE CLASSIFICATION
DECISION TREE CLASSIFICATION. TRAINING.
DECISION TREE CLASSIFICATION. TRAINING SET RESULTS.
DECISION TREE CLASSIFICATION. TEST SET RESULTS.
RANDOM FOREST CLASSIFICATION
STEP 1: Pick at random K data points from the Training set.
STEP 2: Build the Decision Tree associated with these K data points.
STEP 3: Choose the number Ntree of trees you want to build and repeat STEPS 1 & 2.
STEP 4: For a new data point, make each one of your Ntree trees predict the category to which the data point belongs, and assign the new data point to the category that wins the majority vote.
RANDOM FOREST CLASSIFICATION. TRAINING
RANDOM FOREST CLASSIFICATION. TRAINING SET RESULTS
RANDOM FOREST CLASSIFICATION. TEST SET RESULTS.
EVALUATING CLASSIFICATION MODELS PERFORMANCE. FALSE POSITIVES & FALSE NEGATIVES.
EVALUATING CLASSIFICATION MODELS PERFORMANCE. CONFUSION MATRIX.
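A confusion matrix can be sketched directly with scikit-learn: rows are actual classes, columns are predicted ones, and the off-diagonal cells count the false positives and false negatives.

```python
# A sketch of a confusion matrix on made-up predictions.
from sklearn.metrics import confusion_matrix

y_actual    = [0, 0, 0, 0, 1, 1, 1, 1]
y_predicted = [0, 0, 1, 0, 1, 1, 0, 1]
cm = confusion_matrix(y_actual, y_predicted)
# cm[0, 1] counts false positives, cm[1, 0] false negatives
```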
CLASSIFICATION MODELS. PROS AND CONS.
CLUSTERING
Clustering is similar to classification, but the basis is different. In clustering you don't know what you are looking for, and you are trying to identify some segments or clusters in your data. When you use clustering algorithms on your dataset, unexpected things can suddenly pop up: structures, clusters, and groupings you would never have thought of otherwise.
K-MEANS CLUSTERING
STEP 1: Choose the number K of clusters
STEP 2: Select at random K points, the centroids (not necessarily from your dataset)
STEP 3: Assign each data point to the closest centroid -> That forms K clusters
STEP 4: Compute and place the new centroid of each cluster
STEP 5: Reassign each data point to the new closest centroid. If any reassignment took place, go to STEP 4, otherwise go to FIN.
Your Model is Ready
K-MEANS CLUSTERING RANDOM INITIALIZATION PROBLEM
K-MEANS SELECTING THE NUMBER OF CLUSTERS
DATASET PRESENTATION. MALL CUSTOMERS.
K-MEANS. TRAINING. OPTIMAL NUMBER OF CLUSTERS.
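Finding the optimal number of clusters is usually done with the elbow method: fit K-means for K = 1..6 and track WCSS (scikit-learn's `inertia_`); the "elbow" where the curve flattens suggests the optimal K. The three-blob data below stands in for the mall customers dataset.

```python
# A sketch of the elbow method plus a final K-means fit on made-up data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in ([0, 0], [5, 5], [0, 5])])   # three clear clusters

wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in range(1, 7)]                         # WCSS for K = 1..6
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
```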
K-MEANS. OPTIMAL NUMBER OF CLUSTERS RESULTS.
K-MEANS TRAINING
K-MEANS. RESULT
HIERARCHICAL CLUSTERING
HIERARCHICAL CLUSTERING AGGLOMERATIVE
STEP 1: Make each data point a single-point cluster -> That forms N clusters
STEP 2: Take the two closest data points and make them one cluster -> That forms N-1 clusters
STEP 3: Take the two closest clusters and make them one cluster -> That forms N-2 clusters
STEP 4: Repeat STEP 3 until there is only one cluster
FIN
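The agglomerative procedure can be sketched with scipy and scikit-learn; scipy's `linkage` records the merge history used to draw the dendrogram, and `AgglomerativeClustering` assigns the final labels. The two-blob data is made up for the sketch.

```python
# A sketch of agglomerative (Ward) clustering with its dendrogram linkage matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.2, size=(20, 2))
               for c in ([0, 0], [4, 4])])

Z = linkage(X, method='ward')            # merge history for the dendrogram
labels = AgglomerativeClustering(n_clusters=2, linkage='ward').fit_predict(X)
```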
HIERARCHICAL CLUSTERING DENDROGRAMS
4 clusters
DENDROGRAMS OPTIMAL NUMBER OF CLUSTERS
DENDROGRAM. FINDING THE OPTIMAL NUMBER OF CLUSTERS.
DENDROGRAM. RESULTS
HIERARCHICAL CLUSTERING. TRAINING.
HIERARCHICAL CLUSTERING RESULT
CLUSTERING MODELS. PROS AND CONS
REINFORCEMENT LEARNING
Reinforcement Learning is a branch of Machine Learning, also called Online Learning. It is used to solve interacting problems where the data observed up to time t is considered to decide which action to take at time t + 1.
It is also used in Artificial Intelligence when training machines to perform tasks such as walking. Desired outcomes provide the AI with a reward, undesired ones with a punishment. Machines learn through trial and error.
THE MULTI-ARMED BANDIT PROBLEM. How to bet to maximize your return.
UPPER CONFIDENCE BOUND ALGORITHM
RANDOM SELECTION
RANDOM SELECTION. RESULTS.
UPPER CONFIDENCE BOUND. TRAINING.
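UCB training can be sketched in pure Python on a simulated 3-arm bandit; the arm payout probabilities are made up for the simulation, and UCB should converge on the best arm (index 2).

```python
# A sketch of the Upper Confidence Bound algorithm on a simulated 3-arm bandit.
import math
import random

random.seed(0)
probs = [0.1, 0.3, 0.6]                 # hidden payout rate of each arm (simulation only)
counts = [0] * 3                        # times each arm was selected
sums = [0.0] * 3                        # total reward collected per arm

for n in range(1, 2001):
    # Upper confidence bound per arm: average reward + exploration bonus
    ucb = [(sums[i] / counts[i]) + math.sqrt(1.5 * math.log(n) / counts[i])
           if counts[i] > 0 else float('inf')
           for i in range(3)]
    arm = ucb.index(max(ucb))           # pick the arm with the highest bound
    reward = 1.0 if random.random() < probs[arm] else 0.0
    counts[arm] += 1
    sums[arm] += reward

best_arm = counts.index(max(counts))    # the most-played arm after 2000 rounds
```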
UPPER CONFIDENCE BOUND. RESULTS.
THOMPSON SAMPLING ALGORITHM
BAYESIAN INFERENCE
BAYESIAN INFERENCE. EXPLANATION.
CREATING DISTRIBUTIONS BASED ON INITIAL DATA
PULLING RANDOM VALUES FROM DISTRIBUTIONS
ADJUSTING THE PERCEPTION OF THE WORLD
THE FINAL MODEL
UCB VS THOMPSON SAMPLING
THOMPSON SAMPLING ALGORITHM. TRAINING.
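Thompson Sampling can be sketched on the same kind of simulated bandit: each arm keeps a Beta(successes + 1, failures + 1) posterior, and on every round the arm with the highest sampled value is played. The payout rates are invented for the simulation.

```python
# A sketch of Thompson Sampling: sample from each arm's Beta posterior, play the best draw.
import random

random.seed(0)
probs = [0.1, 0.3, 0.6]                 # hidden payout rates (simulation only)
wins = [0] * 3
losses = [0] * 3

for _ in range(2000):
    samples = [random.betavariate(wins[i] + 1, losses[i] + 1) for i in range(3)]
    arm = samples.index(max(samples))   # play the arm with the best posterior draw
    if random.random() < probs[arm]:
        wins[arm] += 1                  # desired outcome: update successes
    else:
        losses[arm] += 1                # undesired outcome: update failures

best_arm = max(range(3), key=lambda i: wins[i] + losses[i])
```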
THOMPSON SAMPLING ALGORITHM. RESULTS.
NATURAL LANGUAGE PROCESSING
Natural Language Processing (or NLP) is applying Machine Learning models to text and language. Teaching machines to understand what is said in spoken and written word is the focus of Natural Language Processing.
Whenever you dictate something into your iPhone / Android device that is then converted to text, that's an NLP algorithm in action.
You can use NLP on an article to predict the categories of the articles you are trying to segment. You can use NLP on a book to predict the genre of the book.
A very well-known model in NLP is the Bag of Words model. It is used to preprocess the texts to classify before fitting the classification algorithms on the observations containing the texts.
DATASET PRESENTATION. RESTAURANT REVIEWS.
NLP. TRAINING. IMPORTING THE DATASET AND CLEANING THE TEXTS.
NLP. TRAINING. CLEANING THE TEXTS. RESULTS.
NLP. TRAINING. CREATING THE BAG OF WORDS MODEL.
NLP. CREATING THE BAG OF WORDS MODEL.
NLP. TRAINING. SPLITTING THE DATASET INTO THE TRAINING SET AND TEST SET.
NLP. TRAINING. FITTING NAIVE BAYES TO THE TRAINING SET.
NLP. TRAINING. PREDICTING AND MAKING THE CONFUSION MATRIX.
NLP. CONFUSION MATRIX. RESULTS.
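The whole Bag of Words pipeline above can be sketched on a tiny stand-in for the restaurant reviews dataset: vectorize the texts into word counts, then fit a Naive Bayes classifier on them.

```python
# A sketch of the Bag of Words pipeline: word counts + Multinomial Naive Bayes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

reviews = ["loved the food", "great place and great food",
           "terrible service", "awful food never again"]
liked = [1, 1, 0, 0]                            # 1 = positive review, 0 = negative

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(reviews)           # sparse matrix of word counts
clf = MultinomialNB().fit(X, liked)
prediction = clf.predict(vectorizer.transform(["great food"]))[0]
```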
THE NEURON
HOW DO NEURAL NETWORKS LEARN?
NEURAL NETWORKS
TO BE CONTINUED