RUBYGARAGE 2017 | TECHNOLOGY MATTERS
Machine Learning: What It Is And How It Works
Volodymyr Vorobiov, Software Development Consultant at RubyGarage
Machine learning is a subset of artificial intelligence whose goal is to give computers the ability to teach themselves, whereas artificial intelligence is the general concept of smart machines. In other words, artificial intelligence is implemented through machine learning, or, more precisely, through machine learning algorithms.
Artificial Intelligence
TEACH YOUR COMPUTER
EXAMPLES OF HOW MACHINE LEARNING IS USED IN THE REAL WORLD
- Facial recognition
- Voice recognition
- Text recognition
- Diagnostics in medicine
- Self-driving cars
- Robot behavior adjustment
- Ad targeting
- Predictions in financial trading
- Virtual and augmented reality
- Astronomy and space
The 21st century is the age of data. It's literally everywhere. In fact, there has been exponential growth in the volume of data over the past decade; the total amount of data doubles every two years. Most of it, however, isn't used. Huge volumes of data can be tagged, structured, and analyzed, revealing a lot of valuable information. Only machine learning algorithms can easily cope with this task.
WHY THE FUTURE BELONGS TO MACHINE LEARNING
HOW MACHINE LEARNING WORKS
Preprocessing (putting raw data into the necessary shape) -> Learning (a learning algorithm creates a model from the training dataset) -> Evaluation (model assessment using the test dataset) -> Prediction (application of the final model to new data). Labels accompany the raw data through the training and evaluation stages.
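The four stages above can be sketched end to end with scikit-learn; the Iris dataset stands in here for the deck's own data.

```python
# A minimal sketch of the preprocessing -> learning -> evaluation -> prediction
# pipeline, using the bundled Iris dataset as a stand-in.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Preprocessing: split the data and put the features into the necessary shape
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Learning: create a model from the training dataset
model = LogisticRegression().fit(X_train, y_train)

# Evaluation: assess the model on the test dataset
accuracy = model.score(X_test, y_test)

# Prediction: apply the final model to new data
new_prediction = model.predict(X_test[:1])
```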
TOOLS
- Python
- Pandas: a powerful data analysis library for Python that provides flexible and fast data structures for processing "relational" or "labeled" data. This is a fundamental data analysis toolkit in Python.
- Scikit-learn: machine learning in Python; simple and effective open-source tools for data mining and data analysis.
- Statsmodels: a Python module providing functions and classes to estimate different statistical models as well as to conduct tests and explore statistical data. The Statsmodels module offers a comprehensive list of result statistics.
- Matplotlib: a Python 2D plotting library that produces publication-quality figures in multiple formats and interactive environments across platforms.
The quality of the data and the amount of useful information that it contains are key factors that determine how well a machine learning algorithm can learn. Therefore, it is absolutely critical that we make sure to examine and preprocess a dataset before we feed it to a learning algorithm:
- Removing and imputing missing values from the dataset
- Getting categorical data into shape for machine learning algorithms
- Selecting relevant features for the model construction
DATA PREPROCESSING
DATA PREPROCESSING. DATASET PRESENTATION.
Independent variables Dependent variables
IMPORTING THE DATASET
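Importing a dataset typically means reading a CSV with pandas and splitting it into independent and dependent variables. The deck's own CSV isn't included here, so a small inline stand-in is used; with a real file you would call `pd.read_csv('Data.csv')` instead.

```python
# A sketch of the import step with pandas, using an inline CSV as a stand-in.
import io
import pandas as pd

csv = io.StringIO(
    "Country,Age,Salary,Purchased\n"
    "France,44,72000,No\n"
    "Spain,27,48000,Yes\n"
    "Germany,30,54000,No\n"
)
dataset = pd.read_csv(csv)

# Independent variables: all columns but the last; dependent variable: the last column
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
```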
DEALING WITH MISSING DATA
Most computational tools are unable to handle missing values, or would produce unpredictable results if we simply ignored them. Therefore, it is crucial that we take care of those missing values before we proceed with further analysis.
- Eliminating samples or features with missing values. The easiest solution to this problem is simply to remove samples with missing values from a dataset. However, this seemingly handy approach has a number of drawbacks. For example, removing too many such samples is likely to compromise the quality of the analysis.
- Imputing missing values. The solution is to use various interpolation techniques that help to "guess" the missing values from other samples in a dataset.
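Mean imputation can be sketched with scikit-learn. Note: the 2017-era API was `sklearn.preprocessing.Imputer`; modern versions use `SimpleImputer` instead.

```python
# A sketch of mean imputation: each missing entry is replaced by its column mean.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[44.0, 72000.0],
              [27.0, np.nan],      # missing salary
              [np.nan, 54000.0],   # missing age
              [38.0, 61000.0]])

imputer = SimpleImputer(strategy='mean')
X_imputed = imputer.fit_transform(X)
```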
DEALING WITH MISSING DATA
IMPUTING MISSING VALUES
IMPUTING MISSING VALUES RESULTS
HANDLING CATEGORICAL DATA
ENCODE LABELS
ENCODE LABELS RESULTS
DUMMY VARIABLES
DUMMY VARIABLES RESULTS
DUMMY VARIABLE TRAP
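Dummy (one-hot) encoding and the trap can be sketched with pandas: `drop_first=True` removes one dummy column, because keeping all of them makes the columns perfectly multicollinear (the dummy variable trap).

```python
# A sketch of dummy encoding; dropping one column avoids the dummy variable trap.
import pandas as pd

df = pd.DataFrame({'Country': ['France', 'Spain', 'Germany']})
dummies = pd.get_dummies(df, columns=['Country'], drop_first=True)
# Only Country_Germany and Country_Spain remain; France is the baseline category
```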
PARTITIONING A DATASET INTO TRAINING AND TEST SETS
TRAINING AND TEST SETS RESULTS
BRINGING FEATURES ONTO THE SAME SCALE
SAME SCALE RESULTS
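Partitioning and scaling go together: the scaler must be fitted on the training set only and then reused on the test set, so no information leaks from test data.

```python
# A sketch of train/test partitioning followed by standardization.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)   # learn mean/std from training data only
X_test_s = scaler.transform(X_test)         # reuse the same transform on test data
```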
TRAINING AND SELECTING A PREDICTIVE MODEL
- Supervised Learning: Regression, Classification
- Unsupervised Learning: Clustering, Dimensionality Reduction
- Reinforcement Learning
- Association Rule Learning
- Natural Language Processing
- Deep Learning
- Model Selection
SUPERVISED LEARNING
For making predictions about the future:
- Regression: for predicting continuous outcomes
- Classification: for predicting class labels
REGRESSION
Regression models (both linear and non-linear) are used for predicting a real value, like salary, for example. If your independent variable is time, then you are forecasting future values; otherwise your model is predicting present but unknown values.
SIMPLE LINEAR REGRESSION
Simple linear regression fits y = b0 + b1*x1, where y is the dependent variable (DV), x1 is the independent variable (IV), b0 is the constant (intercept), and b1 is the coefficient (slope).
DATASET PRESENTATION. EXPERIENCE AND SALARY.
SIMPLE LINEAR REGRESSION TRAINING
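Training a simple linear regression can be sketched with scikit-learn on toy experience/salary numbers (made up for illustration):

```python
# A sketch of simple linear regression y = b0 + b1*x on toy experience/salary data.
import numpy as np
from sklearn.linear_model import LinearRegression

experience = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
salary = np.array([40000.0, 50000.0, 60000.0, 70000.0, 80000.0])

model = LinearRegression().fit(experience, salary)
b0, b1 = model.intercept_, model.coef_[0]    # constant and coefficient
prediction = model.predict([[6.0]])[0]       # expected salary at 6 years
```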
MULTIPLE LINEAR REGRESSION
Multiple linear regression fits y = b0 + b1*x1 + b2*x2 + ... + bn*xn, where y is the dependent variable (DV), x1...xn are the independent variables (IVs), b0 is the constant, and b1...bn are the coefficients.
DATASET PRESENTATION. INVESTMENT FUND STATISTICS.
MULTIPLE LINEAR REGRESSION TRAINING
EVALUATING REGRESSION MODELS PERFORMANCE
1. All-in
2. Backward Elimination
3. Forward Selection
4. Bidirectional Elimination
5. Score Comparison
Methods 2-4 are collectively known as Stepwise Regression.
BACKWARD ELIMINATION
STEP 1: Select a significance level to stay in the model (e.g. SL = 0.05)
STEP 2: Fit the full model with all possible predictors
STEP 3: Consider the predictor with the highest P-value. If P > SL, go to STEP 4, otherwise go to FIN
STEP 4: Remove the predictor
STEP 5: Fit the model without this variable, then return to STEP 3
BACKWARD ELIMINATION TRAINING
BACKWARD ELIMINATION TRAINING STEP 1
BACKWARD ELIMINATION TRAINING STEP 4
EVALUATING PERFORMANCE R-SQUARED
Ordinary least squares picks the line that minimizes SUM_i (y_i - ŷ_i)^2, the sum of squared vertical distances between each observed y_i and its fitted value ŷ_i (illustrated on the simple linear regression of Salary ($) against Experience).
EVALUATING PERFORMANCE R-SQUARED
SS_res = SUM_i (y_i - ŷ_i)^2 (residual sum of squares)
SS_tot = SUM_i (y_i - y_avg)^2 (total sum of squares)
R^2 = 1 - SS_res / SS_tot
(illustrated on the simple linear regression of Salary ($) against Experience)
EVALUATING PERFORMANCE ADJUSTED R-SQUARED
Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1), where p is the number of regressors and n is the sample size.
ADJUSTED R-SQUARED STEP 3
ADJUSTED R-SQUARED STEP 4
ADJUSTED R-SQUARED STEP 5
POLYNOMIAL REGRESSION
Polynomial regression fits y = b0 + b1*x1 + b2*x1^2 (with higher powers for higher degrees).
POLYNOMIAL REGRESSION. DATASET PRESENTATION.
BLUFFING DETECTOR
POLYNOMIAL REGRESSION. FITTING THE DATASET
POLYNOMIAL REGRESSION. TRAINING THE MODEL
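Fitting and training can be sketched by expanding x into polynomial features and then running an ordinary linear regression on them:

```python
# A sketch of polynomial regression: expand x into [1, x, x^2], fit a linear model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = 2.0 + 0.5 * x.ravel() + 3.0 * x.ravel() ** 2   # a known quadratic for illustration

poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(x)                      # columns: 1, x, x^2
model = LinearRegression().fit(X_poly, y)
prediction = model.predict(poly.transform([[11.0]]))[0]
```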
POLYNOMIAL REGRESSION RESULTS
SUPPORT VECTOR REGRESSION (BASED ON SUPPORT VECTOR MACHINES)
SUPPORT VECTOR REGRESSION. RESULTS.
WHAT IF
DECISION TREE REGRESSION
Figure: the (X1, X2) feature space is partitioned by Splits 1-4 (at X1 = 20, X1 = 40, X2 = 170, and X2 = 200); each resulting region predicts the average Y of its training points (here 300.5, 65.7, 1023, 0.7, and -64.1).
DECISION TREE REGRESSION
Figure: the equivalent decision tree. The root tests X1 < 20; deeper nodes test X2 < 200, X2 < 170, and X1 < 40; the leaves return the region averages (300.5, 65.7, 1023, -64.1, 0.7).
DECISION TREE REGRESSION TRAINING
DECISION TREE REGRESSION RESULT
ENSEMBLE LEARNING. RANDOM FOREST REGRESSION.
STEP 1: Pick at random K data points from the Training set.
STEP 2: Build the Decision Tree associated with these K data points.
STEP 3: Choose the number Ntree of trees you want to build and repeat STEPS 1 & 2.
STEP 4: For a new data point, make each one of your Ntree trees predict the value of Y for the data point in question, and assign the new data point the average across all of the predicted Y values.
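These steps can be sketched with scikit-learn's `RandomForestRegressor`, where `n_estimators` plays the role of Ntree and the per-tree sampling is handled internally; the noisy sine data is made up for the sketch.

```python
# A sketch of random forest regression: 100 trees, prediction = average of trees.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X.ravel()) + rng.normal(scale=0.1, size=200)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
prediction = forest.predict([[np.pi / 2]])[0]   # true value is about 1.0
```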
RANDOM FOREST REGRESSION TRAINING
RANDOM FOREST REGRESSION RESULT
REGRESSION MODELS. PROS AND CONS.
CLASSIFICATION
Unlike regression, where you predict a continuous number, you use classification to predict a category. There is a wide variety of classification applications, from medicine to marketing.
LOGISTIC REGRESSION
We know this: Salary ($) as a function of Experience, fitted by linear regression as y = b0 + b1*x. This is new: predicting a yes/no Action from Age, which calls for logistic regression.
LOGISTIC REGRESSION PREDICTION
DATASET PRESENTATION. SOCIAL NETWORK ADS.
LOGISTIC REGRESSION. PREPROCESSING
LOGISTIC REGRESSION. TRAINING
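Training can be sketched on a toy age/action dataset (made up here): the model passes b0 + b1*x through the sigmoid to get a probability, thresholded at 0.5 for the class.

```python
# A sketch of logistic regression: probability via sigmoid, class via threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression

age = np.array([[18], [22], [25], [30], [45], [50], [55], [60]])
action = np.array([0, 0, 0, 0, 1, 1, 1, 1])        # Y/N encoded as 1/0

clf = LogisticRegression().fit(age, action)
probability = clf.predict_proba([[58]])[0, 1]       # P(action | age = 58)
prediction = clf.predict([[58]])[0]
```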
LOGISTIC REGRESSION. TRAINING SET RESULTS
LOGISTIC REGRESSION. TEST SET RESULTS.
K-NEAREST NEIGHBORS
STEP 1: Choose the number K of neighbors
STEP 2: Take the K nearest neighbors of the new data point, according to the Euclidean distance
STEP 3: Among these K neighbors, count the number of data points in each category
STEP 4: Assign the new data point to the category where you counted the most neighbors
Your Model is Ready
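The steps above can be sketched with scikit-learn's `KNeighborsClassifier`, using K = 5 and the Euclidean distance on two made-up clusters:

```python
# A sketch of K-NN: classify a new point by the majority of its 5 nearest neighbors.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two clusters: category 0 near the origin, category 1 near (5, 5)
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [5, 5], [6, 5], [5, 6], [6, 6]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=5, metric='euclidean').fit(X, y)
prediction = knn.predict([[5.5, 5.5]])[0]   # 4 of the 5 neighbors are category 1
```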
Category 1: 3 neighbors
Category 2: 2 neighbors
K-NEAREST NEIGHBORS. TRAINING
K-NEAREST NEIGHBORS. TRAINING SET RESULTS
K-NEAREST NEIGHBORS. TEST SET RESULTS
SUPPORT VECTOR MACHINES
SUPPORT VECTOR MACHINES TRAINING
SUPPORT VECTOR MACHINES. TRAINING SET RESULTS.
SUPPORT VECTOR MACHINES. TEST SET RESULTS.
KERNEL SVM
KERNEL SVM TRAINING
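A kernel SVM can be sketched on data that is not linearly separable; the ring-around-a-blob dataset below is made up for the sketch, and the RBF kernel separates it where a linear boundary cannot.

```python
# A sketch of a kernel SVM: the RBF kernel separates a ring from an inner blob.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 200)
inner = rng.normal(scale=0.3, size=(200, 2))                       # class 0: blob
outer = np.column_stack([3 * np.cos(angles), 3 * np.sin(angles)])  # class 1: ring
X = np.vstack([inner, outer])
y = np.array([0] * 200 + [1] * 200)

clf = SVC(kernel='rbf').fit(X, y)
```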
KERNEL SVM. TRAINING SET RESULTS.
KERNEL SVM. TEST SET RESULTS.
NAIVE BAYES
Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
DRIVER OR WALKER.
NAIVE BAYES. BAYES THEOREM. WALKS.
NAIVE BAYES. BAYES THEOREM. DRIVES.
NAIVE BAYES. P(WALKS).
NAIVE BAYES. P(X).
NAIVE BAYES. P(X|WALKS).
NAIVE BAYES. P(WALKS|X).
NAIVE BAYES. P(DRIVES|X).
NAIVE BAYES. NEW WALKER.
NAIVE BAYES. TRAINING.
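The walker/driver example can be sketched with Gaussian Naive Bayes; the age/salary numbers below are invented for illustration, with class 1 = Walks and 0 = Drives.

```python
# A sketch of Gaussian Naive Bayes on a made-up walker/driver dataset.
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[25, 20000], [27, 25000], [30, 30000], [22, 18000],   # walkers
              [45, 80000], [50, 90000], [55, 95000], [48, 85000]])  # drivers
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])

nb = GaussianNB().fit(X, y)
prediction = nb.predict([[26, 22000]])[0]           # a new young, low-salary point
posterior = nb.predict_proba([[26, 22000]])[0, 1]   # P(Walks | X)
```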
NAIVE BAYES. TRAINING SET RESULTS.
NAIVE BAYES. TEST SET RESULTS.
DECISION TREE CLASSIFICATION
DECISION TREE CLASSIFICATION. TRAINING.
DECISION TREE CLASSIFICATION. TRAINING SET RESULTS.
DECISION TREE CLASSIFICATION. TEST SET RESULTS.
RANDOM FOREST CLASSIFICATION
STEP 1: Pick at random K data points from the Training set.
STEP 2: Build the Decision Tree associated with these K data points.
STEP 3: Choose the number Ntree of trees you want to build and repeat STEPS 1 & 2.
STEP 4: For a new data point, make each one of your Ntree trees predict the category to which the data point belongs, and assign the new data point to the category that wins the majority vote.
RANDOM FOREST CLASSIFICATION. TRAINING
RANDOM FOREST CLASSIFICATION. TRAINING SET RESULTS
RANDOM FOREST CLASSIFICATION. TEST SET RESULTS.
EVALUATING CLASSIFICATION MODELS PERFORMANCE. FALSE POSITIVES & FALSE NEGATIVES.
EVALUATING CLASSIFICATION MODELS PERFORMANCE. CONFUSION MATRIX.
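A confusion matrix can be sketched directly with scikit-learn: rows are actual classes, columns are predicted ones, and the off-diagonal cells count the false positives and false negatives.

```python
# A sketch of a confusion matrix on made-up predictions.
from sklearn.metrics import confusion_matrix

y_actual    = [0, 0, 0, 0, 1, 1, 1, 1]
y_predicted = [0, 0, 1, 0, 1, 1, 0, 1]
cm = confusion_matrix(y_actual, y_predicted)
# cm[0, 1] counts false positives, cm[1, 0] false negatives
```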
CLASSIFICATION MODELS. PROS AND CONS.
CLUSTERING
Clustering is similar to classification, but the basis is different. In clustering you don't know what you are looking for, and you are trying to identify some segments or clusters in your data. When you use clustering algorithms on your dataset, unexpected things can suddenly pop up: structures, clusters, and groupings you would never have thought of otherwise.
K-MEANS CLUSTERING
STEP 1: Choose the number K of clusters
STEP 2: Select at random K points, the centroids (not necessarily from your dataset)
STEP 3: Assign each data point to the closest centroid -> That forms K clusters
STEP 4: Compute and place the new centroid of each cluster
STEP 5: Reassign each data point to the new closest centroid. If any reassignment took place, go to STEP 4, otherwise go to FIN.
Your Model is Ready
K-MEANS CLUSTERING RANDOM INITIALIZATION PROBLEM
K-MEANS SELECTING THE NUMBER OF CLUSTERS
DATASET PRESENTATION. MALL CUSTOMERS.
K-MEANS. TRAINING. OPTIMAL NUMBER OF CLUSTERS.
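Finding the optimal number of clusters is usually done with the elbow method: fit K-means for K = 1..6 and track WCSS (scikit-learn's `inertia_`); the "elbow" where the curve flattens suggests the optimal K. The three-blob data below stands in for the mall customers dataset.

```python
# A sketch of the elbow method plus a final K-means fit on made-up data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in ([0, 0], [5, 5], [0, 5])])   # three clear clusters

wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in range(1, 7)]                         # WCSS for K = 1..6
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
```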
K-MEANS. OPTIMAL NUMBER OF CLUSTERS RESULTS.
K-MEANS TRAINING
K-MEANS. RESULT
HIERARCHICAL CLUSTERING
HIERARCHICAL CLUSTERING AGGLOMERATIVE
STEP 1: Make each data point a single-point cluster -> That forms N clusters
STEP 2: Take the two closest data points and make them one cluster -> That forms N-1 clusters
STEP 3: Take the two closest clusters and make them one cluster -> That forms N-2 clusters
STEP 4: Repeat STEP 3 until there is only one cluster
FIN
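The agglomerative procedure can be sketched with scipy and scikit-learn; scipy's `linkage` records the merge history used to draw the dendrogram, and `AgglomerativeClustering` assigns the final labels. The two-blob data is made up for the sketch.

```python
# A sketch of agglomerative (Ward) clustering with its dendrogram linkage matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.2, size=(20, 2))
               for c in ([0, 0], [4, 4])])

Z = linkage(X, method='ward')            # merge history for the dendrogram
labels = AgglomerativeClustering(n_clusters=2, linkage='ward').fit_predict(X)
```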
HIERARCHICAL CLUSTERING DENDROGRAMS
4 clusters
DENDROGRAMS OPTIMAL NUMBER OF CLUSTERS
DENDROGRAM. FINDING THE OPTIMAL NUMBER OF CLUSTERS.
DENDROGRAM. RESULTS
HIERARCHICAL CLUSTERING. TRAINING.
HIERARCHICAL CLUSTERING RESULT
CLUSTERING MODELS. PROS AND CONS
REINFORCEMENT LEARNING
Reinforcement Learning is a branch of Machine Learning, also called Online Learning. It is used to solve interacting problems where the data observed up to time t is considered to decide which action to take at time t + 1.
It is also used in Artificial Intelligence when training machines to perform tasks such as walking. Desired outcomes provide the AI with a reward, undesired ones with a punishment. Machines learn through trial and error.
THE MULTI-ARMED BANDIT PROBLEM. How to bet to maximize your return.
UPPER CONFIDENCE BOUND ALGORITHM
RANDOM SELECTION
RANDOM SELECTION. RESULTS.
UPPER CONFIDENCE BOUND. TRAINING.
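UCB training can be sketched in pure Python on a simulated 3-arm bandit; the arm payout probabilities are made up for the simulation, and UCB should converge on the best arm (index 2).

```python
# A sketch of the Upper Confidence Bound algorithm on a simulated 3-arm bandit.
import math
import random

random.seed(0)
probs = [0.1, 0.3, 0.6]                 # hidden payout rate of each arm (simulation only)
counts = [0] * 3                        # times each arm was selected
sums = [0.0] * 3                        # total reward collected per arm

for n in range(1, 2001):
    # Upper confidence bound per arm: average reward + exploration bonus
    ucb = [(sums[i] / counts[i]) + math.sqrt(1.5 * math.log(n) / counts[i])
           if counts[i] > 0 else float('inf')
           for i in range(3)]
    arm = ucb.index(max(ucb))           # pick the arm with the highest bound
    reward = 1.0 if random.random() < probs[arm] else 0.0
    counts[arm] += 1
    sums[arm] += reward

best_arm = counts.index(max(counts))    # the most-played arm after 2000 rounds
```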
UPPER CONFIDENCE BOUND. RESULTS.
THOMPSON SAMPLING ALGORITHM
BAYESIAN INFERENCE
BAYESIAN INFERENCE. EXPLANATION.
CREATING DISTRIBUTIONS BASED ON INITIAL DATA
PULLING RANDOM VALUES FROM DISTRIBUTIONS
ADJUSTING THE PERCEPTION OF THE WORLD
THE FINAL MODEL
UCB VS THOMPSON SAMPLING
THOMPSON SAMPLING ALGORITHM. TRAINING.
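Thompson Sampling can be sketched on the same kind of simulated bandit: each arm keeps a Beta(successes + 1, failures + 1) posterior, and on every round the arm with the highest sampled value is played. The payout rates are invented for the simulation.

```python
# A sketch of Thompson Sampling: sample from each arm's Beta posterior, play the best draw.
import random

random.seed(0)
probs = [0.1, 0.3, 0.6]                 # hidden payout rates (simulation only)
wins = [0] * 3
losses = [0] * 3

for _ in range(2000):
    samples = [random.betavariate(wins[i] + 1, losses[i] + 1) for i in range(3)]
    arm = samples.index(max(samples))   # play the arm with the best posterior draw
    if random.random() < probs[arm]:
        wins[arm] += 1                  # desired outcome: update successes
    else:
        losses[arm] += 1                # undesired outcome: update failures

best_arm = max(range(3), key=lambda i: wins[i] + losses[i])
```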
THOMPSON SAMPLING ALGORITHM. RESULTS.
NATURAL LANGUAGE PROCESSING
Natural Language Processing (or NLP) is applying Machine Learning models to text and language. Teaching machines to understand what is said in spoken and written word is the focus of Natural Language Processing.
Whenever you dictate something into your iPhone / Android device that is then converted to text, that's an NLP algorithm in action.
You can use NLP on an article to predict the categories of the articles you are trying to segment. You can use NLP on a book to predict the genre of the book.
A very well-known model in NLP is the Bag of Words model. It is used to preprocess the texts to classify before fitting the classification algorithms on the observations containing the texts.
DATASET PRESENTATION. RESTAURANT REVIEWS.
NLP. TRAINING. IMPORTING THE DATASET AND CLEANING THE TEXTS.
NLP. TRAINING. CLEANING THE TEXTS. RESULTS.
NLP. TRAINING. CREATING THE BAG OF WORDS MODEL.
NLP. CREATING THE BAG OF WORDS MODEL.
NLP. TRAINING. SPLITTING THE DATASET INTO THE TRAINING SET AND TEST SET.
NLP. TRAINING. FITTING NAIVE BAYES TO THE TRAINING SET.
NLP. TRAINING. PREDICTING AND MAKING THE CONFUSION MATRIX.
NLP. CONFUSION MATRIX. RESULTS.
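The whole Bag of Words pipeline above can be sketched on a tiny stand-in for the restaurant reviews dataset: vectorize the texts into word counts, then fit a Naive Bayes classifier on them.

```python
# A sketch of the Bag of Words pipeline: word counts + Multinomial Naive Bayes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

reviews = ["loved the food", "great place and great food",
           "terrible service", "awful food never again"]
liked = [1, 1, 0, 0]                            # 1 = positive review, 0 = negative

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(reviews)           # sparse matrix of word counts
clf = MultinomialNB().fit(X, liked)
prediction = clf.predict(vectorizer.transform(["great food"]))[0]
```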
THE NEURON
HOW DO NEURAL NETWORKS LEARN?
NEURAL NETWORKS
TO BE CONTINUED