
INF3490 - Biologically inspired computing
Support Vector Machines, Ensemble Learning, and Dimensionality Reduction
Weria Khaksar
October 17, 2018

Support Vector Machines (SVM)

Support Vector Machines (SVM): Background

SVM is used for extreme classification cases.

[Figure: cat and dog images, with an unlabeled example marked "?" to classify.]

Support Vector Machines (SVM): Background


Remember the inefficiency of the Perceptron?

Support Vector Machines (SVM): Background

Linear Separability


Support Vector Machines (SVM): Background


A trick to solve it …

It is always possible to separate out two classes with a linear function, provided that you project the data into the correct set of dimensions.
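As a minimal sketch of that idea (the data and the map φ(x) = (x, x²) are illustrative assumptions, not from the slides):

```python
import numpy as np

# 1-D data that no single threshold can separate:
# class 1 lies on both sides of class 0.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([1, 1, 0, 0, 0, 1, 1])

# Project into 2-D with phi(x) = (x, x^2); the classes now sit on
# opposite sides of the horizontal line x^2 = 2.25.
phi = np.column_stack([x, x ** 2])
pred = (phi[:, 1] > 2.25).astype(int)
print(np.array_equal(pred, y))  # True: linearly separable after projection
```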


Support Vector Machines (SVM): The Margin


Which line is the best separator?

Support Vector Machines (SVM): The Margin

Why do we need the best line?

Support Vector Machines (SVM): The Margin

Which line is the best separator?

The one with the largest margin, where the margin is the distance from the separating line to the closest data points.

Support Vector Machines (SVM): Support Vectors

Which data points are important?

Support Vector Machines (SVM): Support Vectors

Which data points are important?

Support Vectors

The data points in each class that lie closest to the classification line are called Support Vectors.

Support Vector Machines (SVM): Optimal Separation

The margin should be as large as possible.

The best classifier is the one that goes through the middle of the margin.

We can throw away the other data points and use only the support vectors for classification.

Support Vector Machines (SVM): The Math.

$$\text{Maximize } |M| \quad \text{s.t.}\quad t_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1,\qquad i = 1,\dots,n$$
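Since, with the constraints scaled as above, the margin is proportional to $1/\|\mathbf{w}\|$, maximizing it is equivalent to the standard quadratic program (a reformulation given here for reference, not transcribed from the slide):

$$\min_{\mathbf{w},b}\ \tfrac{1}{2}\|\mathbf{w}\|^{2} \quad \text{s.t.}\quad t_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1,\qquad i = 1,\dots,n$$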


Support Vector Machines (SVM): Slack Variables for Non-Linearly Separable Problems:

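The slides present this with figures only; for reference, the usual soft-margin formulation (an assumption about what the figures depict) introduces slack variables $\xi_i \ge 0$ that let some points violate the margin, at a cost set by a parameter $C$:

$$\min_{\mathbf{w},b,\boldsymbol{\xi}}\ \tfrac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.}\quad t_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0$$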

Support Vector Machines (SVM): KERNELS

The trick is to modify the input features in some way, so that the data can be classified linearly.

The main idea is to replace the input feature vector, x, with some function of it, φ(x).

The main challenge is to make the algorithm find a proper function automatically, in the absence of suitable domain knowledge.
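As a runnable sketch (scikit-learn's SVC is an assumption here, not the course's code), the kernel is selected as a parameter, so φ(x) is never computed explicitly:

```python
import numpy as np
from sklearn.svm import SVC

# A noisy version of XOR: not linearly separable in the original 2-D space.
rng = np.random.default_rng(0)
X = np.tile(np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float), (10, 1))
X += 0.05 * rng.normal(size=X.shape)
y = np.tile(np.array([0, 1, 1, 0]), 10)

# The RBF kernel corresponds to an implicit high-dimensional phi(x).
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print(clf.score(X, y))           # training accuracy
print(len(clf.support_vectors_)) # the support vectors that were found
```

Swapping in kernel="linear" would fail on this dataset, which mirrors the point above: the gain comes entirely from the implicit feature map.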


Support Vector Machines (SVM): SVM Algorithm:

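The algorithm itself appears on the slide as a figure; the standard dual problem it solves (stated here from the usual formulation, not transcribed from the slide) is

$$\max_{\boldsymbol{\lambda}}\ \sum_{i=1}^{n}\lambda_i - \tfrac{1}{2}\sum_{i,j}\lambda_i\lambda_j t_i t_j K(\mathbf{x}_i,\mathbf{x}_j) \quad \text{s.t.}\quad \lambda_i \ge 0,\ \ \sum_{i}\lambda_i t_i = 0,$$

after which new points are classified with $\operatorname{sign}\big(\sum_i \lambda_i t_i K(\mathbf{x}_i,\mathbf{x}) + b\big)$. Only points with $\lambda_i > 0$, the support vectors, contribute to the sum.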

Support Vector Machines (SVM): SVM Examples:

[Figure: performing nonlinear classification via linear separation in higher-dimensional space.]

[Figure: the SVM learning a linearly separable dataset (top row) and a dataset that needs two straight lines to separate in 2D (bottom row), with the linear kernel (left), the polynomial kernel of degree 3 (middle), and the RBF kernel (right).]

[Figure: the effects of different kernels when learning a version of XOR.]

Ensemble Learning

Ensemble Learning: Background


Take lots of simple learners that each produce slightly different results.

Put them together in a proper way.

The combined results are significantly better.

Ensemble Learning: Background

The Basic Idea:

Ensemble Learning: Important Considerations

Which learners should we use?

How should we ensure that they learn different things?

How should we combine their results?


Ensemble Learning: Background


If we take a collection of very poor learners, each performing only just better than chance, then by putting them together it is possible to make an ensemble learner that can perform arbitrarily well.

We just need lots of low-quality learners, and a way to put them together usefully, and we can make a learner that will do very well.
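A quick back-of-the-envelope check of this claim (assuming, optimistically, that the learners' errors are independent):

```python
from math import comb

def majority_accuracy(n, p):
    """Probability that a majority of n independent learners,
    each correct with probability p, votes for the right class."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

for n in (1, 5, 25, 101):
    print(n, round(majority_accuracy(n, 0.6), 3))
# prints accuracies that climb from 0.6 toward 1.0 as n grows
```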


Ensemble Learning: How does it work?


Ensemble Learning: BOOSTING


As points are misclassified, their weights increase in boosting (shown by the data point getting larger), which makes the importance of those data points increase, making the classifiers pay more attention to them.

Ensemble Learning: BOOSTING

AdaBoost:
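The slide shows the algorithm as a figure; a minimal runnable sketch (scikit-learn's AdaBoostClassifier, an assumption rather than the course's code) looks like this:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=300, random_state=0)

# The default weak learner is a depth-1 decision tree (a "stump");
# each boosting round reweights the points the previous stumps missed.
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.score(X, y))
```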


Ensemble Learning: BOOSTING


AdaBoost: How does it work?
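In outline (the standard formulation, stated here since the slide's figure is not reproduced): at round $t$, train a weak learner $h_t$ on the weighted data, measure its weighted error $\epsilon_t$, and set

$$\alpha_t = \tfrac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}.$$

Each misclassified point has its weight multiplied by $e^{\alpha_t}$ and each correctly classified point by $e^{-\alpha_t}$ (the weights are then renormalized), and the final prediction is $H(\mathbf{x}) = \operatorname{sign}\big(\sum_t \alpha_t h_t(\mathbf{x})\big)$.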

Ensemble Learning: BOOSTING

[Figure: AdaBoost in action.]

Ensemble Learning: BAGGING

Bagging (Bootstrap Aggregating):

Ensemble Learning: BAGGING

Bagging (Bootstrap Aggregating): How does it work?
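In outline: each learner is trained on a bootstrap sample, i.e. a dataset of the same size drawn from the training set with replacement, and the predictions are combined by voting. A minimal sketch (scikit-learn's BaggingClassifier is an assumption, not the course's code):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=300, random_state=0)

# 25 decision trees (the default base learner), each fit on its own
# bootstrap sample; predictions are combined by majority vote.
bag = BaggingClassifier(n_estimators=25, random_state=0).fit(X, y)
print(bag.score(X, y))
```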

Ensemble Learning: BAGGING

Bagging (Bootstrap Aggregating): Examples:

Ensemble Learning: Summary

Dimensionality reduction


Dimensionality reduction: Why?


When looking at data and plotting results, we can never go beyond three dimensions.

The higher the number of dimensions we have, the more training data we need.

The dimensionality is an explicit factor in the computational cost of many algorithms.

Reducing the dimensionality can:

• Remove noise.
• Significantly improve the results of the learning algorithm.
• Make the dataset easier to work with.
• Make the results easier to understand.

Dimensionality reduction: How?

Feature Selection: Looking through the available features and seeing whether or not they are actually useful (a small example follows below).

Feature Derivation: Deriving new features from the old ones, generally by applying transforms to the dataset.

Clustering: Grouping together similar data points and seeing whether this allows fewer features to be used.
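As a small illustration of the first approach (scikit-learn and the iris data are assumptions made for the example):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)            # 150 samples, 4 features

# Keep the 2 features that score highest on a univariate ANOVA F-test.
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)        # (150, 4) -> (150, 2)
```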

Dimensionality reduction: Example

Dimensionality reduction: Principal Components Analysis (PCA)



The principal component is the direction in the data with the largest variance.
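A minimal NumPy sketch of exactly this idea (the generated data are an assumption for illustration): center the data, take the top eigenvector of the covariance matrix, and project onto it.

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: most of the variance lies along one direction.
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])

Xc = X - X.mean(axis=0)                 # center the data
cov = np.cov(Xc, rowvar=False)          # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
pc1 = eigvecs[:, -1]                    # direction of largest variance
projected = Xc @ pc1                    # the 1-D representation
print(projected.var(), eigvals[-1])     # projected variance ~ top eigenvalue
```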


Dimensionality reduction: Principal Components Analysis (PCA)

PCA is a linear transformation:

• It does not directly help with data that is not linearly separable.
• However, it may make learning easier because of the reduced complexity.

PCA removes some information from the data:

• That information might just be noise.
• But it might provide nuances that would be of help to some classifiers.

Dimensionality reduction: Principal Components Analysis (PCA) Example

[Figure: how to project samples into the variable space.]
