CS 360: Machine Learning Prof. Sara Mathieson Fall 2019


Page 1: CS 360: Machine Learning (cs.haverford.edu/faculty/smathieson/teaching/f19/lecs/lec21/lec21.pdf)

CS 360: Machine Learning

Prof. Sara Mathieson, Fall 2019

Page 2:

Admin

• Midterm 2 TODAY!

• No office hours Friday

• Final project presentations:
  – Wednesday Dec 18: 1-4pm (block out the entire time, but we may not need all of it)
  – Option to present last day of class (email me)

Page 3:

Outline for November 21

• Support Vector Machines Review

• Likelihood functions (Bernoulli and Logistic Regression)

• Finish practice problems (Q2-Q4)

Page 4: (repeat of the outline slide)

Pages 5-9: (figure-only slides)
Page 10:

Perceptron Recap

(figure: positive "+" and negative "-" training examples with the current weight vector)

Example modified from Achim J. Lilienthal & Thorsteinn Rögnvaldsson

Round 2: incorrect classification on a negative point, so we "push" w away from the negative point.

Pages 11-13: (incremental animation builds of the same slide)

Page 14:

Perceptron Recap

Round 2: What is the new weight vector?
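The "push" above is the standard perceptron update, w ← w + y·x on a misclassified point (a minimal sketch; the example point below is made up, not the one in the slide's figure):

```python
def perceptron_update(w, x, y):
    """One perceptron step: if (x, y) is misclassified, add y*x to w.
    For a negative point (y = -1) this "pushes" w away from x."""
    activation = sum(wi * xi for wi, xi in zip(w, x))
    if y * activation <= 0:                 # misclassified (or on the boundary)
        w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

w = [1.0, 1.0]
x_neg, y_neg = [2.0, 0.0], -1               # hypothetical misclassified "-" point
print(perceptron_update(w, x_neg, y_neg))   # [-1.0, 1.0]
```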

Pages 15-20: (incremental builds of the SVM flowchart; full version on the next slide)

Page 21:

SVM flowchart

Goal: maximize the separation between positive and negative examples

Create optimization problem: "maximizing the margin"
• Use relationship between functional & geometric margin
• Put arbitrary constraint on functional margin
• Take multiplicative inverse to make max into min

Convert this "primal" optimization problem into the "dual"
• Use Lagrange multipliers to incorporate constraints
• Minimize with respect to w and b

Training: solve dual optimization problem to find alpha value for each training example

Testing: use alpha values of support vectors to classify new points (don't explicitly calculate w)

Pages 22-25:

Functional and Geometric Margins

SVM classifier: (same as perceptron)

Functional Margin: (equation on slide)

Geometric Margin: (distance between example and hyperplane; equation on slide)

Note: (equation on slide)
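Using the standard definitions (a sketch following the usual SVM derivation, not necessarily the slides' exact notation): the functional margin is γ̂ = y(w·x + b), and the geometric margin divides it by ‖w‖ to get an actual distance:

```python
import math

def functional_margin(w, b, x, y):
    """gamma_hat = y * (w . x + b): positive iff x is classified correctly."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def geometric_margin(w, b, x, y):
    """gamma = gamma_hat / ||w||: signed distance from x to the hyperplane."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return functional_margin(w, b, x, y) / norm

w, b = [3.0, 4.0], 0.0                 # hypothetical hyperplane, ||w|| = 5
x, y = [1.0, 2.0], 1
print(functional_margin(w, b, x, y))   # 11.0
print(geometric_margin(w, b, x, y))    # 2.2
```

Note how scaling w and b by a constant changes the functional margin but not the geometric one, which is why an arbitrary constraint on the functional margin is harmless.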

Pages 26-27:

Optimization Problem: try 1

Goal: maximize the minimum distance between example and hyperplane

Formulation: optimize a function with respect to a constraint (force functional and geometric margin to be equal)

Page 28:

Optimization Problem: try 2

Idea: substitute functional margin divided by magnitude of weight vector (gets rid of non-convex constraint)

Page 29:

Optimization Problem: try 3

Idea: put arbitrary constraint on functional margin
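The three tries above can be written out as follows (standard SVM derivation in CS229-style notation; a sketch of what the slides' equations likely state, not necessarily their exact form):

```latex
% Try 1: maximize the geometric margin directly (non-convex constraint ||w|| = 1)
\max_{\gamma, w, b}\; \gamma
\quad \text{s.t.}\quad y^{(i)}(w \cdot x^{(i)} + b) \ge \gamma, \;\; \|w\| = 1

% Try 2: optimize functional margin over ||w|| (removes the ||w|| = 1 constraint)
\max_{\hat{\gamma}, w, b}\; \frac{\hat{\gamma}}{\|w\|}
\quad \text{s.t.}\quad y^{(i)}(w \cdot x^{(i)} + b) \ge \hat{\gamma}

% Try 3: fix \hat{\gamma} = 1 (arbitrary scaling) and invert the max into a min
\min_{w, b}\; \frac{1}{2}\|w\|^2
\quad \text{s.t.}\quad y^{(i)}(w \cdot x^{(i)} + b) \ge 1
```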

Page 30: (repeat of the previous slide)

Pages 31-33:

Lagrangian

• The alpha values are our Lagrange multipliers
• We don't care about our constraint if it is not active
• First minimize with respect to w & b; the result becomes W(alpha)
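Concretely (a hedged reconstruction following the standard derivation, since the slide's equations are not in the extracted text): the Lagrangian, and the dual objective W(α) obtained by minimizing it over w and b, are

```latex
\mathcal{L}(w, b, \alpha) = \frac{1}{2}\|w\|^2
  - \sum_{i=1}^{n} \alpha_i \left[ y^{(i)}(w \cdot x^{(i)} + b) - 1 \right]

% \partial\mathcal{L}/\partial w = 0 gives w = \sum_i \alpha_i y^{(i)} x^{(i)};
% \partial\mathcal{L}/\partial b = 0 gives \sum_i \alpha_i y^{(i)} = 0.
% Substituting back yields the dual objective:
W(\alpha) = \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j\, y^{(i)} y^{(j)}\, \big( x^{(i)} \cdot x^{(j)} \big)
```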

Page 34:

Kernel Trick

• Now we can replace dot products with any kernel!
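For example, the dot product x_i · x_j appearing in the dual can be swapped for an RBF kernel (a sketch; the gamma value below is an arbitrary illustration):

```python
import math

def linear_kernel(x, z):
    """Plain dot product: the kernel used implicitly so far."""
    return sum(xi * zi for xi, zi in zip(x, z))

def rbf_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2): corresponds to an implicit
    infinite-dimensional feature space, computed without ever building it."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(x, z))   # 2.0
print(rbf_kernel(x, z))      # exp(-0.5 * 5) ~= 0.0821
```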

Page 35:

Final Goal: classification

• After using the Kernel Trick with the dual optimization problem, we have:

• To classify, we use:

Pages 36-37: (repeats of the previous slide)
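That classification rule, in its standard dual form ŷ = sign(Σ_i α_i y_i K(x_i, x) + b), sums only over the support vectors. A sketch (the α values, b, and support vectors below are made up for illustration, not a real solved SVM):

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    return math.exp(-gamma * sum((xi - zi) ** 2 for xi, zi in zip(x, z)))

def svm_classify(x, support_vectors, alphas, labels, b, kernel=rbf_kernel):
    """sign(sum_i alpha_i * y_i * K(x_i, x) + b), over support vectors only --
    w is never computed explicitly."""
    score = b + sum(a * y * kernel(sv, x)
                    for sv, a, y in zip(support_vectors, alphas, labels))
    return 1 if score >= 0 else -1

svs    = [[0.0, 0.0], [2.0, 2.0]]   # hypothetical support vectors
alphas = [0.7, 0.7]
labels = [-1, +1]
b      = 0.0
print(svm_classify([1.8, 1.9], svs, alphas, labels, b))   # 1 (near the "+" SV)
```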

Page 38:

Outline for November 21

• Support Vector Machines Review

• Likelihood functions (Bernoulli and Logistic Regression)

• Finish practice problems (Q2-Q4)

Page 39: (figure-only slide)

Page 40: (repeat of the outline slide)

Page 41:

Follow ups on questions from class

• ANOVA: considered a special case of linear regression

• AdaBoost: why ½ in front of the score?
  – Comes out in the derivation
  – Main idea: solve for the classifier scores that minimize exponential loss

Page 42:

Handout 19, Question 2

• r = 1/3, probability of one classifier being wrong

• T = 5, number of classifiers

• R = number of votes for the wrong class

• If R=3,4,5 then we will vote for the wrong class overall
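So the ensemble's error is the probability of 3, 4, or 5 wrong votes out of T = 5, each classifier independently wrong with probability r = 1/3, which is a binomial tail sum:

```python
from math import comb

def majority_wrong(r, T):
    """P(R >= majority): sum of binomial terms C(T, k) * r^k * (1-r)^(T-k)
    over all k that form a wrong majority."""
    k_min = T // 2 + 1                       # smallest wrong majority (3 when T = 5)
    return sum(comb(T, k) * r**k * (1 - r)**(T - k) for k in range(k_min, T + 1))

p = majority_wrong(1 / 3, 5)
print(p)    # 51/243 ~= 0.2099, already better than a single classifier's r = 1/3
```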

Page 43: (figure-only slide)

Page 44:

Handout 19, Question 2

• This analysis assumed classifiers were independent!

• What if they are not? How did Random Forests help us decorrelate classifiers?

Page 45:

Handout 19, Question 2

• This analysis assumed classifiers were independent!

• What if they are not? How did Random Forests help us decorrelate classifiers?

• Note about Bagging: choosing n with resampling actually does produce a very different dataset
  – As n increases, roughly 0.37 not chosen each time
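That 0.37 is the limit of the probability that a fixed example is never drawn in n samples with replacement: (1 − 1/n)^n → e^(−1) ≈ 0.368 as n grows.

```python
import math

def not_chosen_prob(n):
    """P(a fixed example is missed by all n bootstrap draws) = (1 - 1/n)^n."""
    return (1 - 1 / n) ** n

for n in (10, 100, 10000):
    print(n, round(not_chosen_prob(n), 4))
print(round(math.exp(-1), 4))   # 0.3679, the limit as n -> infinity
```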

Page 46:

Q4