
INF3490 - Biologically inspired computing
Support Vector Machines, Ensemble Learning, and Dimensionality Reduction
Weria Khaksar
October 17, 2018

Support Vector Machines (SVM)

Support Vector Machines (SVM): Background

SVM is used for extreme classification cases.

[Figure: cat and dog images, with an unlabeled example marked "?" to classify.]

Support Vector Machines (SVM): Background


Remember the inefficiency of the Perceptron?

Support Vector Machines (SVM): Background

Linear Separability


Support Vector Machines (SVM): Background


A trick to solve it …

It is always possible to separate out two classes with a linear function, provided that you project the data into the correct set of dimensions.
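As a minimal sketch of that idea (the data and the map φ(x) = (x, x²) are illustrative assumptions, not from the slides):

```python
import numpy as np

# 1-D data that no single threshold can separate:
# class 1 lies on both sides of class 0.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([1, 1, 0, 0, 0, 1, 1])

# Project into 2-D with phi(x) = (x, x^2); the classes now sit on
# opposite sides of the horizontal line x^2 = 2.25.
phi = np.column_stack([x, x ** 2])
pred = (phi[:, 1] > 2.25).astype(int)
print(np.array_equal(pred, y))  # True: linearly separable after projection
```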


Support Vector Machines (SVM): The Margin


Which line is the best separator?

Support Vector Machines (SVM): The Margin

Why do we need the best line?

Support Vector Machines (SVM): The Margin

Which line is the best separator?

The one with the largest margin, where the margin is the distance from the separating line to the closest data points.

Support Vector Machines (SVM): Support Vectors

Which data points are important?

Support Vector Machines (SVM): Support Vectors

Which data points are important?

Support Vectors

The data points in each class that lie closest to the classification line are called Support Vectors.

Support Vector Machines (SVM): Optimal Separation

The margin should be as large as possible.

The best classifier is the one that goes through the middle of the margin.

We can throw away the other data points and use only the support vectors for classification.

Support Vector Machines (SVM): The Math.

$$\text{Maximize } |M| \quad \text{s.t.}\quad t_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1,\qquad i = 1,\dots,n$$
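Since, with the constraints scaled as above, the margin is proportional to $1/\|\mathbf{w}\|$, maximizing it is equivalent to the standard quadratic program (a reformulation given here for reference, not transcribed from the slide):

$$\min_{\mathbf{w},b}\ \tfrac{1}{2}\|\mathbf{w}\|^{2} \quad \text{s.t.}\quad t_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1,\qquad i = 1,\dots,n$$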


Support Vector Machines (SVM): Slack Variables for Non-Linearly Separable Problems:

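The slides present this with figures only; for reference, the usual soft-margin formulation (an assumption about what the figures depict) introduces slack variables $\xi_i \ge 0$ that let some points violate the margin, at a cost set by a parameter $C$:

$$\min_{\mathbf{w},b,\boldsymbol{\xi}}\ \tfrac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.}\quad t_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0$$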

Support Vector Machines (SVM): KERNELS

The trick is to modify the input features in some way, so that the data can be classified linearly.

The main idea is to replace the input feature vector, x, with some function of it, φ(x).

The main challenge is to make the algorithm find a proper function automatically, in the absence of suitable domain knowledge.
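As a runnable sketch (scikit-learn's SVC is an assumption here, not the course's code), the kernel is selected as a parameter, so φ(x) is never computed explicitly:

```python
import numpy as np
from sklearn.svm import SVC

# A noisy version of XOR: not linearly separable in the original 2-D space.
rng = np.random.default_rng(0)
X = np.tile(np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float), (10, 1))
X += 0.05 * rng.normal(size=X.shape)
y = np.tile(np.array([0, 1, 1, 0]), 10)

# The RBF kernel corresponds to an implicit high-dimensional phi(x).
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print(clf.score(X, y))           # training accuracy
print(len(clf.support_vectors_)) # the support vectors that were found
```

Swapping in kernel="linear" would fail on this dataset, which mirrors the point above: the gain comes entirely from the implicit feature map.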


Support Vector Machines (SVM): SVM Algorithm:

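The algorithm itself appears on the slide as a figure; the standard dual problem it solves (stated here from the usual formulation, not transcribed from the slide) is

$$\max_{\boldsymbol{\lambda}}\ \sum_{i=1}^{n}\lambda_i - \tfrac{1}{2}\sum_{i,j}\lambda_i\lambda_j t_i t_j K(\mathbf{x}_i,\mathbf{x}_j) \quad \text{s.t.}\quad \lambda_i \ge 0,\ \ \sum_{i}\lambda_i t_i = 0,$$

after which new points are classified with $\operatorname{sign}\big(\sum_i \lambda_i t_i K(\mathbf{x}_i,\mathbf{x}) + b\big)$. Only points with $\lambda_i > 0$, the support vectors, contribute to the sum.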

Support Vector Machines (SVM): SVM Examples:

[Figure: performing nonlinear classification via linear separation in higher-dimensional space.]

[Figure: the SVM learning a linearly separable dataset (top row) and a dataset that needs two straight lines to separate in 2D (bottom row), with the linear kernel (left), the polynomial kernel of degree 3 (middle), and the RBF kernel (right).]

[Figure: the effects of different kernels when learning a version of XOR.]

Ensemble Learning

Ensemble Learning: Background


Take lots of simple learners that each produce slightly different results.

Put them together in a proper way.

The combined results are significantly better.

Ensemble Learning: Background

The Basic Idea:

Ensemble Learning: Important Considerations

Which learners should we use?

How should we ensure that they learn different things?

How should we combine their results?


Ensemble Learning: Background


If we take a collection of very poor learners, each performing only just better than chance, then by putting them together it is possible to make an ensemble learner that can perform arbitrarily well.

We just need lots of low-quality learners, and a way to put them together usefully, and we can make a learner that will do very well.
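A quick back-of-the-envelope check of this claim (assuming, optimistically, that the learners' errors are independent):

```python
from math import comb

def majority_accuracy(n, p):
    """Probability that a majority of n independent learners,
    each correct with probability p, votes for the right class."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

for n in (1, 5, 25, 101):
    print(n, round(majority_accuracy(n, 0.6), 3))
# prints accuracies that climb from 0.6 toward 1.0 as n grows
```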


Ensemble Learning: How does it work?


Ensemble Learning: BOOSTING


As points are misclassified, their weights increase in boosting (shown by the data point getting larger), which makes the importance of those data points increase, making the classifiers pay more attention to them.

Ensemble Learning: BOOSTING

AdaBoost:
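The slide shows the algorithm as a figure; a minimal runnable sketch (scikit-learn's AdaBoostClassifier, an assumption rather than the course's code) looks like this:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=300, random_state=0)

# The default weak learner is a depth-1 decision tree (a "stump");
# each boosting round reweights the points the previous stumps missed.
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.score(X, y))
```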


Ensemble Learning: BOOSTING


AdaBoost: How does it work?
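In outline (the standard formulation, stated here since the slide's figure is not reproduced): at round $t$, train a weak learner $h_t$ on the weighted data, measure its weighted error $\epsilon_t$, and set

$$\alpha_t = \tfrac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}.$$

Each misclassified point has its weight multiplied by $e^{\alpha_t}$ and each correctly classified point by $e^{-\alpha_t}$ (the weights are then renormalized), and the final prediction is $H(\mathbf{x}) = \operatorname{sign}\big(\sum_t \alpha_t h_t(\mathbf{x})\big)$.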

Ensemble Learning: BOOSTING

[Figure: AdaBoost in action.]

Ensemble Learning: BAGGING

Bagging (Bootstrap Aggregating):

Ensemble Learning: BAGGING

Bagging (Bootstrap Aggregating): How does it work?
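In outline: each learner is trained on a bootstrap sample, i.e. a dataset of the same size drawn from the training set with replacement, and the predictions are combined by voting. A minimal sketch (scikit-learn's BaggingClassifier is an assumption, not the course's code):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=300, random_state=0)

# 25 decision trees (the default base learner), each fit on its own
# bootstrap sample; predictions are combined by majority vote.
bag = BaggingClassifier(n_estimators=25, random_state=0).fit(X, y)
print(bag.score(X, y))
```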

Ensemble Learning: BAGGING

Bagging (Bootstrap Aggregating): Examples:

Ensemble Learning: Summary

Dimensionality reduction


Dimensionality reduction: Why?


When looking at data and plotting results, we can never go beyond three dimensions.

The higher the number of dimensions we have, the more training data we need.

The dimensionality is an explicit factor in the computational cost of many algorithms.

Reducing the dimensionality can:

• Remove noise.
• Significantly improve the results of the learning algorithm.
• Make the dataset easier to work with.
• Make the results easier to understand.

Dimensionality reduction: How?

Feature Selection: Looking through the available features and seeing whether or not they are actually useful (a small example follows below).

Feature Derivation: Deriving new features from the old ones, generally by applying transforms to the dataset.

Clustering: Grouping together similar data points and seeing whether this allows fewer features to be used.
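As a small illustration of the first approach (scikit-learn and the iris data are assumptions made for the example):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)            # 150 samples, 4 features

# Keep the 2 features that score highest on a univariate ANOVA F-test.
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)        # (150, 4) -> (150, 2)
```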

Dimensionality reduction: Example

Dimensionality reduction: Principal Components Analysis (PCA)



The principal component is the direction in the data with the largest variance.
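A minimal NumPy sketch of exactly this idea (the generated data are an assumption for illustration): center the data, take the top eigenvector of the covariance matrix, and project onto it.

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: most of the variance lies along one direction.
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])

Xc = X - X.mean(axis=0)                 # center the data
cov = np.cov(Xc, rowvar=False)          # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
pc1 = eigvecs[:, -1]                    # direction of largest variance
projected = Xc @ pc1                    # the 1-D representation
print(projected.var(), eigvals[-1])     # projected variance ~ top eigenvalue
```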


Dimensionality reduction: Principal Components Analysis (PCA)

PCA is a linear transformation:

• It does not directly help with data that is not linearly separable.
• However, it may make learning easier because of the reduced complexity.

PCA removes some information from the data:

• That information might just be noise.
• But it might provide nuances that would be of help to some classifiers.

Dimensionality reduction: Principal Components Analysis (PCA) Example

[Figure: how to project samples into the variable space.]
