92
VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, Fei-Fei Li & Andrej Karpathy & Justin Johnson L Aykut Erdem

VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

VBM683

Machine Learning

Pinar Duygulu

Slides are adapted from

Dhruv Batra,

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Aykut Erdem

Page 2: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Python

• Python is a high-level, dynamically typed

multiparadigm programming language. Python code

is often said to be almost like pseudocode, since it

allows you to express very powerful ideas in very few

lines of code while being very readable.

• http://cs231n.github.io/python-numpy-tutorial/

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 3: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

NumPy

http://www.numpy.org/• NumPy is the fundamental package for scientific

computing with Python. It contains among other

things:

• a powerful N-dimensional array object

• sophisticated (broadcasting) functions

• tools for integrating C/C++ and Fortran code

• useful linear algebra, Fourier transform, and random

number capabilities

• Besides its obvious scientific uses, NumPy can also

be used as an efficient multi-dimensional container of

generic data. Arbitrary data-types can be defined.

This allows NumPy to seamlessly and speedily

integrate with a wide variety of databases.Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 4: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Plan for today

• Introduction to Supervised/Inductive Learning

• Your first classifier: k-Nearest Neighbour

(C) Dhruv Batra

Page 5: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Types of Learning

• Supervised learning

– Training data includes desired outputs

• Unsupervised learning

– Training data does not include desired outputs

• Weakly or Semi-supervised learning

– Training data includes a few desired outputs

• Reinforcement learning

– Rewards from sequence of actions

(C) Dhruv Batra

Page 6: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Supervised / Inductive Learning

• Given

– examples of a function (x, f(x))

• Predict function f(x) for new examples x

– Discrete f(x): Classification

– Continuous f(x): Regression

– f(x) = Probability(x): Probability estimation

(C) Dhruv Batra

Page 7: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Slide Credit: Pedro Domingos, Tom Mitchel, Tom Dietterich

Page 8: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Supervised Learning

• Input: x (images, text, emails…)

• Output: y (spam or non-spam…)

• (Unknown) Target Function– f: X Y (the “true” mapping / reality)

• Data – (x1,y1), (x2,y2), …, (xN,yN)

• Model / Hypothesis Class– g: X Y

– y = g(x) = sign(wTx)

• Learning = Search in hypothesis space– Find best g in model class.

(C) Dhruv Batra

Page 9: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

(C) Dhruv Batra Slide Credit: Yaser Abu-Mostapha

Page 10: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Basic Steps of Supervised Learning• Set up a supervised learning problem

• Data collection – Start with training data for which we know the correct outcome provided by a teacher or

oracle.

• Representation – Choose how to represent the data.

• Modeling– Choose a hypothesis class: H = {g: X Y}

• Learning/Estimation– Find best hypothesis you can in the chosen class.

• Model Selection– Try different models. Picks the best one. (More on this later)

• If happy stop– Else refine one or more of the above

(C) Dhruv Batra

Page 11: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Learning is hard!

• No assumptions = No learning

(C) Dhruv Batra

Page 12: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Klingon vs Mlingon Classification

• Training Data

– Klingon: klix, kour, koop

– Mlingon: moo, maa, mou

• Testing Data: kap

• Which language?

• Why?

(C) Dhruv Batra

Page 13: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Concept Learning

• Definition: Acquire an operational definition of a

general category of objects given positive and

negative training examples. • Also called binary

classification, binary supervised learning

Slide credit: Thorsten Joachims

Page 14: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Slide credit: Thorsten Joachims

Page 15: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

(C) Dhruv Batra

Page 16: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Loss/Error Functions

• How do we measure performance?

• Regression:

– L2 error

• Classification:

– #misclassifications

– Weighted misclassification via a cost matrix

– For 2-class classification:

• True Positive, False Positive, True Negative, False Negative

– For k-class classification:

• Confusion Matrix

(C) Dhruv Batra

Page 17: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Training vs Testing

• What do we want?

– Good performance (low loss) on training data?

– No, Good performance on unseen test data!

• Training Data:

– { (x1,y1), (x2,y2), …, (xN,yN) }

– Given to us for learning f

• Testing Data

– { x1, x2, …, xM }

– Used to see if we have learnt anything

(C) Dhruv Batra

Page 18: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Procedural View

• Training Stage:

– Raw Data x (Feature Extraction)

– Training Data { (x,y) } f (Learning)

• Testing Stage

– Raw Data x (Feature Extraction)

– Test Data x f(x) (Apply function, Evaluate error)

(C) Dhruv Batra

Page 19: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Statistical Estimation View

• Probabilities to rescue:

– x and y are random variables

– D = (x1,y1), (x2,y2), …, (xN,yN) ~ P(X,Y)

• IID: Independent Identically Distributed

– Both training & testing data sampled IID from P(X,Y)

– Learn on training set

– Have some hope of generalizing to test set

(C) Dhruv Batra

Page 20: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Concepts

• Capacity

– Measure how large hypothesis class H is.

– Are all functions allowed?

• Overfitting

– f works well on training data

– Works poorly on test data

• Generalization

– The ability to achieve low error on new test data

(C) Dhruv Batra

Page 21: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Guarantees

• 20 years of research in Learning Theory

oversimplified:

• If you have:

– Enough training data D

– and H is not too complex

– then probably we can generalize to unseen test data

(C) Dhruv Batra

Page 22: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 23: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 24: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 25: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 26: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 27: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 28: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 29: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

29Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 30: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 31: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Your First Classifier:

Nearest Neighbours

(C) Dhruv Batra Image Credit: Wikipedia

Page 32: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 33: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 34: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 35: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Synonyms

• Nearest Neighbours

• k-Nearest Neighbours

• Member of following families:

– Instance-based Learning

– Memory-based Learning

– Exemplar methods

– Non-parametric methods

(C) Dhruv Batra

Page 36: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Nearest Neighbor is an example of….

Instance-based learning

Has been around since

about 1910.

To make a prediction,

search database for

similar datapoints, and fit

with the local points.

Assumption: Nearby

points behavior similarly

wrt y

x1 y1

x2 y2

x3 y3

.

.

xn yn

(C) Dhruv Batra

Page 37: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Instance/Memory-based Learning

Four things make a memory based learner:

• A distance metric

• How many nearby neighbors to look at?

• A weighting function (optional)

• How to fit with the local points?

Slide Credit: Carlos Guestrin(C) Dhruv Batra

Page 38: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

1-Nearest Neighbour

Four things make a memory based learner:

• A distance metric

– Euclidean (and others)

• How many nearby neighbors to look at?

– 1

• A weighting function (optional)

– unused

• How to fit with the local points?

– Just predict the same output as the nearest neighbour.

Slide Credit: Carlos Guestrin(C) Dhruv Batra

Page 39: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 40: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 41: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 42: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 43: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 44: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 45: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 46: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

k-Nearest Neighbour

Four things make a memory based learner:

• A distance metric

– Euclidean (and others)

• How many nearby neighbors to look at?

– k

• A weighting function (optional)

– unused

• How to fit with the local points?

– Just predict the average output among the nearest

neighbours.

(C) Dhruv Batra Slide Credit: Carlos Guestrin

Page 47: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

1 vs k Nearest Neighbour

(C) Dhruv Batra Image Credit: Ying Wu

Page 48: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

1 vs k Nearest Neighbour

(C) Dhruv Batra Image Credit: Ying Wu

Page 49: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 50: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

(C) Dhruv Batra 50

Page 51: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 52: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 53: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Nearest Neighbour

• Demo 1

– http://cgm.cs.mcgill.ca/~soss/cs644/projects/perrier/Nearest.

html

• Demo 2

– http://www.cs.technion.ac.il/~rani/LocBoost/

(C) Dhruv Batra

Page 54: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

• Gender Classification from body proportions

– Igor Janjic & Daniel Friedman, Juniors

(C) Dhruv Batra

Page 55: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Scene Completion [Hayes & Efros, SIGGRAPH07]

(C) Dhruv Batra

Page 56: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Hays and Efros, SIGGRAPH 2007

Page 57: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

… 200 totalHays and Efros, SIGGRAPH 2007

Page 58: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Context Matching

Hays and Efros, SIGGRAPH 2007

Page 59: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Graph cut + Poisson blending Hays and Efros, SIGGRAPH 2007

Page 60: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Hays and Efros, SIGGRAPH 2007

Page 61: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Hays and Efros, SIGGRAPH 2007

Page 62: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Hays and Efros, SIGGRAPH 2007

Page 63: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Hays and Efros, SIGGRAPH 2007

Page 64: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Hays and Efros, SIGGRAPH 2007

Page 65: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Hays and Efros, SIGGRAPH 2007

Page 66: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 67: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 68: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 69: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 70: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 71: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 72: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient
Page 73: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 74: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 75: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 76: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Fei-Fei Li & Andrej Karpathy & Justin Johnson L

Page 77: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

(C) Dhruv Batra 77

Page 78: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

(C) Dhruv Batra 78

Page 79: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

1-NN for Regression

(C) Dhruv Batra

x

y

Here, this is the closest datapoint

Figure Credit: Carlos Guestrin

Page 80: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

1-NN for Regression

• Often bumpy (overfits)

(C) Dhruv Batra Figure Credit: Andrew Moore

Page 81: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

9-NN for Regression

• Often bumpy (overfits)

(C) Dhruv Batra Figure Credit: Andrew Moore

Page 82: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Multivariate distance metrics

Suppose the input vectors x1, x2, …xN are two dimensional:

x1 = ( x11 , x12 ) , x2 = ( x21 , x22 ) , …xN = ( xN1 , xN2 ).

One can draw the nearest-neighbor regions in input space.

Dist(xi,xj) =(xi1 – xj1)2+(3xi2 – 3xj2)

2

The relative scalings in the distance metric affect region shapes

Dist(xi,xj) = (xi1 – xj1)2 + (xi2 – xj2)

2

Slide Credit: Carlos Guestrin

Page 83: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Euclidean distance metric

where

Or equivalently,

A

Slide Credit: Carlos Guestrin

Page 84: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Scaled Euclidian (L2) Mahalanobis

(non-diagonal A)

Notable distance metrics

(and their level sets)

Slide Credit: Carlos Guestrin

Page 85: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Minkowski distance

(C) Dhruv Batra Image Credit: By Waldir (Based on File:MinkowskiCircles.svg)

[CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

Page 86: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

L1 norm (absolute)

Linf (max) norm

Scaled Euclidian (L2)

Mahalanobis

(non-diagonal A)

Notable distance metrics

(and their level sets)

Slide Credit: Carlos Guestrin

Page 87: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Kernel Regression/Classification

Four things make a memory based learner:

• A distance metric– Euclidean (and others)

• How many nearby neighbors to look at?– All of them

• A weighting function (optional)– wi = exp(-d(xi, query)2 / σ2)

– Nearby points to the query are weighted strongly, far points weakly. The σ parameter is the Kernel Width. Very important.

• How to fit with the local points?– Predict the weighted average of the outputs

predict = Σwiyi / Σwi

(C) Dhruv Batra Slide Credit: Carlos Guestrin

Page 88: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Weighting/Kernel functions

wi = exp(-d(xi, query)2 / σ2)

(Our examples use Gaussian)

(C) Dhruv Batra Slide Credit: Carlos Guestrin

Page 89: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Effect of Kernel Width

(C) Dhruv Batra

• What happens as σinf?

• What happens as σ0?

Image Credit: Ben Taskar

Page 90: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Problems with Instance-Based Learning

• Expensive

– No Learning: most real work done during testing

– For every test sample, must search through all dataset –

very slow!

– Must use tricks like approximate nearest neighbour search

• Doesn’t work well when large number of irrelevant

features

– Distances overwhelmed by noisy features

• Curse of Dimensionality

– Distances become meaningless in high dimensions

– (See proof in next lecture)

(C) Dhruv Batra

Page 91: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

• k-NN– Simplest learning algorithm

– With sufficient data, very hard to beat “strawman” approach

– Picking k?

• Kernel regression/classification– Set k to n (number of data points) and chose kernel width

– Smoother than k-NN

• Problems with k-NN– Curse of dimensionality

– Irrelevant features often killers for instance-basedapproaches

– Slow NN search: Must remember (very large) dataset for prediction

(C) Dhruv Batra

What you need to know

Page 92: VBM683 Machine Learning - Hacettepepinar/courses/VBM683/lectures/nearest... · VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, ... be used as an efficient

Parametric vs Non-Parametric Models

• Does the capacity (size of hypothesis class) grow

with size of training data?

– Yes = Non-Parametric Models

– No = Parametric Models

• Example

– http://www.theparticle.com/applets/ml/nearest_neighbor/

(C) Dhruv Batra