Introduction to Machine Learning, Ryo Onozuka, 2016/11/30

A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)


Page 1: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Introduction to Machine Learning

Ryo Onozuka, 2016/11/30

Page 2: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Why Machine Learning?

Machine learning is increasingly used in business models.

Analyzing those business models requires an understanding of what machine learning does internally.

[Figure: Google Trends search interest from 2012 to 2016 for "Big Data", "Machine Learning", and "Data Analysis". Data from Google Trends.]

Page 3: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

What is Machine Learning?

Machine learning algorithms become wiser by incorporating experience (i.e., data).

They acquire knowledge and rules automatically from data.

They emerged when earlier AI research ran into the limits of having humans explicitly supply the AI with knowledge and rules.

[Figure: diagram with the labels "Training Data", "Estimate Boundary", and "Classification Model".]

Page 4: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Rule-Based Approach: Before Machine Learning

How can we classify the people on this campus into categories?

A human makes the classification rule.
When the amount of data is small or the data is difficult to quantify, human inference gives better results.
A machine cannot answer what it does not know.

[Figure: decision tree. Conditions: "Looks young?" (branches "Young" / "Not young") and "Looks tired?" (branches "Tired" / "Not tired"). Categories: "Teacher", "Undergraduate students", "Graduate students".]
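The rule-based approach can be sketched in a few lines of code. The branch-to-category mapping used here (not young gives Teacher, young and not tired gives Undergraduate students, young and tired gives Graduate students) is an assumption read off the order of the labels in the diagram, and the function name `classify_by_rule` is hypothetical.

```python
def classify_by_rule(looks_young: bool, looks_tired: bool) -> str:
    """Hand-written rule. The mapping is an assumed reading of the diagram."""
    if not looks_young:
        return "Teacher"
    if looks_tired:
        return "Graduate students"
    return "Undergraduate students"


# Every combination of the two conditions gets an answer from the rule.
for young in (True, False):
    for tired in (True, False):
        print(young, tired, classify_by_rule(young, tired))
```

The point of the sketch is that a person, not the data, decides every branch; changing the categories means rewriting the function by hand.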

Page 5: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

How Does Machine Learning Solve This?

There are several approaches.

Supervised Learning: Regression, Naïve Bayes, k-NN, Neural Network, etc.
Unsupervised Learning: k-means
Semi-supervised Learning
Reinforcement Learning

Important!

Main purposes: Classification, Prediction, Recommendation

Page 6: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

How Does Supervised Learning Solve This?

First, we need a LARGE amount of appropriate data on the people on this campus.
Second, we need to extract FEATURES from each person.
Third, we select one algorithm and train it.

↓ Training data for supervised learning (learn these correspondences between features and labels)

Id  Young      Hair   Fashion   Tired      Label
1   young      blond  t shirts  not tired  Undergrad.
2   young      black  suits     tired      Undergrad.
3   not young  white  suits     tired      Teacher
4   young      black  t shirts  not tired  Undergrad.

Page 7: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

How Does It Learn a Rule to Classify?

It works like a poll: every training example casts one vote for its own label on each feature value it has. Tallying the four training examples from Page 6 gives this score table:

Feature     Undergrad.  Teacher
Tired       +1          +1
Not tired   +2
Young       +3
Not young               +1
Blond hair  +1
Black hair  +2
White hair              +1
Suits       +1          +1
T shirts    +2

Page 8: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

How Will It Classify This Person?

When a new person comes along: tired, young, black hair, suits.

Add up the score-table rows for Tired, Young, Black hair, and Suits:

Undergrad.  1 + 3 + 2 + 1 = 7 points
Teacher     1 + 0 + 0 + 1 = 2 points

Undergrad.!!
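The poll-style scoring can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the names `TRAINING_DATA`, `train`, and `score` are hypothetical, and the rows are the four training examples from Page 6. Counting strictly from those rows gives black hair two Undergrad. votes (persons 2 and 4), so the person described above scores 7 points to 2.

```python
from collections import Counter, defaultdict

# The four training rows from Page 6: feature values and their label.
TRAINING_DATA = [
    (("young", "blond hair", "t shirts", "not tired"), "Undergrad."),
    (("young", "black hair", "suits", "tired"), "Undergrad."),
    (("not young", "white hair", "suits", "tired"), "Teacher"),
    (("young", "black hair", "t shirts", "not tired"), "Undergrad."),
]


def train(rows):
    """Tally one vote per (feature value, label) pair."""
    votes = defaultdict(Counter)
    for features, label in rows:
        for value in features:
            votes[value][label] += 1
    return votes


def score(votes, features):
    """Sum the votes of every feature value the new person has."""
    total = Counter()
    for value in features:
        total.update(votes.get(value, Counter()))
    return total


votes = train(TRAINING_DATA)
points = score(votes, ("tired", "young", "black hair", "suits"))
print(dict(points))                 # {'Undergrad.': 7, 'Teacher': 2}
print(points.most_common(1)[0][0])  # Undergrad.
```

A feature value that never appeared in training contributes no votes at all, which is one concrete way a model "cannot answer what it does not know".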

Page 9: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Question (10 min.)

New person: not tired, not young, black hair, suits.

1) How will it classify this person?
2) Do you have any ideas for improving its accuracy?
   Change the features that it extracts from people?
   Change the way the training data is gathered?
   Change the algorithm?

Page 10: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

How Does Unsupervised Learning Solve This?

Id  Young      Hair   Fashion   Tired      Label
1   young      blond  t shirts  not tired  Undergrad.
2   young      black  suits     tired      Undergrad.
3   not young  white  suits     tired      Teacher
4   young      black  t shirts  not tired  Undergrad.

Unsupervised learning doesn't need labels.

↓ Training data for unsupervised learning

Id  young  not young  blond  black  white  t shirts  suits  tired  not tired
1   1      0          1      0      0      1         0      0      1
2   1      0          0      1      0      0         1      1      0
3   0      1          0      0      1      0         1      1      0
4   1      0          0      1      0      1         0      0      1
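Turning each feature value into its own 0/1 column can be sketched like this. The names `COLUMNS`, `PEOPLE`, and `one_hot` are hypothetical, and the hair value for person 1 is written "blond", as on Page 6.

```python
# Column order follows the 0/1 table on this slide.
COLUMNS = ["young", "not young", "blond", "black", "white",
           "t shirts", "suits", "tired", "not tired"]

# Gathered data without labels: person id -> feature values.
PEOPLE = {
    1: ["young", "blond", "t shirts", "not tired"],
    2: ["young", "black", "suits", "tired"],
    3: ["not young", "white", "suits", "tired"],
    4: ["young", "black", "t shirts", "not tired"],
}


def one_hot(values):
    """Return a 0/1 list with a 1 in each column the person has."""
    return [1 if column in values else 0 for column in COLUMNS]


for person_id, values in PEOPLE.items():
    print(person_id, one_hot(values))
```

Every person ends up with exactly four 1s, one per original feature (Young, Hair, Fashion, Tired).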

Page 11: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Make a Distance Matrix

Starting from the 0/1 table on Page 10, calculate the distance between every pair of people.

There are many types of distance:
1) Euclidean distance
2) Correlation coefficient
3) Cosine similarity, etc.
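To make the three measures concrete, here is a small sketch that applies each of them to persons 1 and 4 from the 0/1 table. The function names are hypothetical. Note that the correlation coefficient and cosine similarity are similarities (larger means more alike), so they are often turned into a distance as 1 minus the value; that conversion is a common convention, not something the slides specify.

```python
import math

# Rows for persons 1 and 4 from the 0/1 table on Page 10.
p1 = [1, 0, 1, 0, 0, 1, 0, 0, 1]
p4 = [1, 0, 0, 1, 0, 1, 0, 0, 1]


def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


print(round(euclidean(p1, p4), 3))          # 1.414
print(round(cosine_similarity(p1, p4), 3))  # 0.75
print(round(correlation(p1, p4), 3))        # 0.55
```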

Page 12: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

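Calculate Euclidean Distance

For each pair of people, square the difference in every column (young, not young, blond, black, white, t shirts, suits, tired, not tired), add the squares, and take the square root.

Persons 1 and 2:
(1-1)² + (0-0)² + (1-0)² + (0-1)² + (0-0)² + (1-0)² + (0-1)² + (0-1)² + (1-0)² = 6, so the distance is √6

Persons 1 and 3:
(1-0)² + (0-1)² + (1-0)² + (0-0)² + (0-1)² + (1-0)² + (0-1)² + (0-1)² + (1-0)² = 8, so the distance is 2√2

Persons 1 and 4:
(1-1)² + (0-0)² + (1-0)² + (0-1)² + (0-0)² + (1-1)² + (0-0)² + (0-0)² + (1-1)² = 2, so the distance is √2

Persons 2 and 3:
(1-0)² + (0-1)² + (0-0)² + (1-0)² + (0-1)² + (0-0)² + (1-1)² + (1-1)² + (0-0)² = 4, so the distance is 2

Persons 2 and 4:
(1-1)² + (0-0)² + (0-0)² + (1-1)² + (0-0)² + (0-1)² + (1-0)² + (1-0)² + (0-1)² = 4, so the distance is 2

Persons 3 and 4:
(0-1)² + (1-0)² + (0-0)² + (0-1)² + (1-0)² + (0-1)² + (1-0)² + (1-0)² + (0-1)² = 8, so the distance is 2√2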
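The pairwise calculation above can be reproduced with a short script. This is a sketch under the assumption that the rows are exactly the 0/1 vectors from Page 10; the names `VECTORS`, `squared_distance`, and `squared` are hypothetical. It covers all six pairs, including persons 2 and 4.

```python
import math
from itertools import combinations

# 0/1 rows from Page 10, keyed by person id.
VECTORS = {
    1: [1, 0, 1, 0, 0, 1, 0, 0, 1],
    2: [1, 0, 0, 1, 0, 0, 1, 1, 0],
    3: [0, 1, 0, 0, 1, 0, 1, 1, 0],
    4: [1, 0, 0, 1, 0, 1, 0, 0, 1],
}


def squared_distance(a, b):
    """Sum of squared column-by-column differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Squared distance for every unordered pair of people.
squared = {pair: squared_distance(VECTORS[pair[0]], VECTORS[pair[1]])
           for pair in combinations(VECTORS, 2)}

for (i, j), value in squared.items():
    print(f"{i}-{j}: sqrt({value}) = {math.sqrt(value):.3f}")
```

Keeping the squared values around makes it easy to compare against the hand calculation before taking the square root.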

Page 13: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Find the Most Similar People

↓ Distance Matrix

     1     2     3     4
1    0     √6    2√2   √2
2    √6    0     2     2
3    2√2   2     0     2√2
4    √2    2     2√2   0

The nearest: persons 1 and 4 (distance √2).
2nd nearest: persons 2 and 3 (distance 2; persons 2 and 4 are equally far apart).

Page 14: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Show the Result as a Dendrogram

[Figure: dendrogram built from the distance matrix and the gathered data. Leaf order: 1, 4, 2, 3. Vertical axis: Distance. Persons 1 and 4 form Cluster A; persons 2 and 3 form Cluster B.]

Use the furthest pair of members as the distance between two clusters.
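The merging procedure behind the dendrogram can be sketched as below. Using the furthest pair as the distance between clusters is commonly called complete linkage. The names `DIST`, `cluster_distance`, and `agglomerate` are hypothetical, and the distances are the pairwise Euclidean values from the distance matrix; this is an illustrative sketch, not the author's code.

```python
import math

# Pairwise Euclidean distances between persons 1 to 4.
DIST = {
    frozenset({1, 2}): math.sqrt(6),
    frozenset({1, 3}): 2 * math.sqrt(2),
    frozenset({1, 4}): math.sqrt(2),
    frozenset({2, 3}): 2.0,
    frozenset({2, 4}): 2.0,
    frozenset({3, 4}): 2 * math.sqrt(2),
}


def cluster_distance(a, b):
    """Complete linkage: distance between the furthest pair of members."""
    return max(DIST[frozenset({i, j})] for i in a for j in b)


def agglomerate(ids):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(i,) for i in ids]
    merges = []
    while len(clusters) > 1:
        pairs = [(cluster_distance(a, b), a, b)
                 for k, a in enumerate(clusters) for b in clusters[k + 1:]]
        height, a, b = min(pairs)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, height))
    return merges


for a, b, height in agglomerate([1, 2, 3, 4]):
    print(a, b, round(height, 3))
# (1,) (4,) 1.414
# (2,) (3,) 2.0
# (1, 4) (2, 3) 2.828
```

Each printed height is where the corresponding branch joins in the dendrogram: persons 1 and 4 at √2, persons 2 and 3 at 2, and the two clusters at 2√2.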

Page 15: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Name Clusters

[Figure: the gathered data next to the dendrogram (leaf order 1, 4, 2, 3; vertical axis: Distance).]

Cluster A (persons 1 and 4): "not tired, young" cluster?
Cluster B (persons 2 and 3): "tired, suits" cluster?

→ Are there better names?

Page 16: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Question (15 min.)

New person: not tired, not young, black hair, suits.

Make a new distance matrix and a new dendrogram.
Name each cluster.
Which cluster will this person belong to?
Do you have any ideas for improving this result?

Page 17: A very easy explanation to understanding machine learning (Supervised & Unsupervised Learning)

Discussion (20 min.)

You are a data analyst at a smartphone game company. You want to predict whether a new customer will buy a premium membership.

1) Which algorithm would you use?
2) How would you gather data?
3) If you use supervised learning, how would you make the labels for training?
4) What features would you extract from user activities?