Machine Learning Intro iCAMP 2012, Max Welling, UC Irvine

Page 1: Machine Learning  Intro  iCAMP  2012

Machine Learning Intro iCAMP 2012

Max Welling, UC Irvine

Page 2: Machine Learning  Intro  iCAMP  2012

Machine Learning

• Algorithms that learn to make predictions from examples (data)

Page 3: Machine Learning  Intro  iCAMP  2012

Types of Machine Learning

• Supervised learning
  • Labels are provided; there is a strong learning signal.
  • e.g. classification, regression.

• Semi-supervised learning
  • Only part of the data have labels.
  • e.g. a child growing up.

• Reinforcement learning
  • The learning signal is a (scalar) reward and may come with a delay.
  • e.g. trying to learn to play chess, a mouse in a maze.

• Unsupervised learning
  • There is no direct learning signal. We are simply trying to find structure in data.
  • e.g. clustering, dimensionality reduction.

Page 4: Machine Learning  Intro  iCAMP  2012

Unsupervised Learning

[Figure: dimensionality reduction (LLE, Roweis & Saul) and clustering]

Page 5: Machine Learning  Intro  iCAMP  2012

Supervised Learning

Regression Classification

Page 6: Machine Learning  Intro  iCAMP  2012

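Collaborative Filtering (Netflix Dataset)

• Rating matrix: movies (+/- 17,770) by users (+/- 240,000)

• Total of +/- 400,000,000 nonzero entries (99% sparse)

[Figure: sparse rating matrix with a few known ratings (e.g. 4, 1) and missing entries (?) to be predicted]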

Page 7: Machine Learning  Intro  iCAMP  2012

Generalization

• Consider the following regression problem:
  • Predict the real value on the y-axis from the real value on the x-axis.
  • You are given 6 examples: {Xi, Yi}.
  • What is the y-value for a new query point X*?

[Figure: the examples plotted, with the query point X* marked]

Page 8: Machine Learning  Intro  iCAMP  2012

Generalization

Page 9: Machine Learning  Intro  iCAMP  2012

Generalization

Page 10: Machine Learning  Intro  iCAMP  2012

Generalization

Which curve is best?

Page 11: Machine Learning  Intro  iCAMP  2012

Generalization

• Ockham’s razor: prefer the simplest hypothesis consistent with the data.
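To make the "which curve is best?" question concrete, here is a minimal sketch that fits two hypotheses to 6 examples: a straight line, and the degree-5 polynomial that passes through every example. The data values, the held-out points, and all function names are made up for illustration; they are not taken from the slides.

```python
def fit_line(xs, ys):
    """Least-squares straight line; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return lambda x: offset + slope * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial through every training point (degree n - 1)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: roughly y = x plus a little noise.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.2, 0.8, 2.3, 2.7, 4.2, 4.9]
# Hypothetical "future" points drawn from the noise-free trend y = x.
test_x = [-0.5, 0.5, 2.5, 4.5, 5.5]
test_y = [-0.5, 0.5, 2.5, 4.5, 5.5]

line = fit_line(train_x, train_y)
wiggly = fit_interpolant(train_x, train_y)

for name, model in [("line", line), ("degree-5 polynomial", wiggly)]:
    print(name,
          "train error:", mean_squared_error(model, train_x, train_y),
          "test error:", mean_squared_error(model, test_x, test_y))
```

On this toy data the polynomial has zero training error but a much larger error on the held-out points than the line, which is the behaviour Ockham's razor guards against.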

Page 12: Machine Learning  Intro  iCAMP  2012

Generalization

Learning is concerned with accurate prediction of future data, not accurate prediction of training data.

Question: Design an algorithm that is perfect at predicting training data.
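One possible answer to the question is pure memorization: store every training pair in a lookup table. The sketch below assumes exact-match lookup and a hypothetical `default` value for unseen inputs; the class name and data are illustrative only.

```python
class Memorizer:
    """Predicts training data perfectly by storing it; learns nothing else."""

    def __init__(self, default=0.0):
        self.table = {}
        self.default = default  # returned for any input not seen in training

    def fit(self, xs, ys):
        self.table = dict(zip(xs, ys))
        return self

    def predict(self, x):
        return self.table.get(x, self.default)


# Hypothetical training data.
memo_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
memo_y = [0.2, 0.8, 2.3, 2.7, 4.2, 4.9]

memorizer = Memorizer().fit(memo_x, memo_y)
print([memorizer.predict(x) for x in memo_x])  # reproduces memo_y exactly
print(memorizer.predict(2.5))                   # unseen input: falls back to the default
```

The table is perfect on the training set and has nothing useful to say about a new query point, so zero training error by itself says little about learning.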

Page 13: Machine Learning  Intro  iCAMP  2012

Learning as Compression

• Imagine a game where Bob needs to send a dataset to Alice.

• They are allowed to meet once before they see the data.

• They agree on a precision level (quantization level).

• Bob learns a model (red line).

• Bob sends the model parameters (offset and slant) only once.

• For every datapoint, Bob sends:
  • the distance along the line (large number)
  • the orthogonal distance from the line (small number)
  (Small numbers are cheaper to encode than large numbers.)
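The game above can be sketched in code. This is a rough illustration under several assumptions that are not in the slides: a toy code length of one sign bit plus the binary length of the quantized integer, a fixed 64-bit cost for the two model parameters, and synthetic data scattered around the line y = 3 + 0.5x.

```python
import math

PARAMETER_BITS = 64  # assumed one-off cost of sending offset and slant


def bits_for_integer(n):
    """Toy code length: one sign bit plus the binary length of |n|."""
    return 1 + max(1, abs(n).bit_length())


def quantize(value, precision):
    return round(value / precision)


def raw_cost(points, precision):
    """Bits needed if Bob sends the quantized (x, y) of every point."""
    return sum(bits_for_integer(quantize(x, precision)) +
               bits_for_integer(quantize(y, precision))
               for x, y in points)


def model_cost(points, offset, slope, precision):
    """Bits needed if Bob sends (distance along line, orthogonal distance)."""
    norm = math.hypot(1.0, slope)
    ux, uy = 1.0 / norm, slope / norm  # unit vector along the line
    nx, ny = -uy, ux                   # unit vector orthogonal to the line
    total = 0
    for x, y in points:
        dx, dy = x, y - offset         # position relative to (0, offset)
        along = dx * ux + dy * uy
        orthogonal = dx * nx + dy * ny
        total += (bits_for_integer(quantize(along, precision)) +
                  bits_for_integer(quantize(orthogonal, precision)))
    return total


# Synthetic data: 40 points near y = 3 + 0.5 x with small deterministic noise.
points = [(2.5 * i, 3.0 + 0.5 * (2.5 * i) + 0.3 * ((i * 7) % 5 - 2))
          for i in range(40)]

raw_bits = raw_cost(points, precision=0.1)
model_bits = PARAMETER_BITS + model_cost(points, offset=3.0, slope=0.5, precision=0.1)
print("raw:", raw_bits, "bits; with model:", model_bits, "bits")
```

With these assumptions the model-based message is shorter, because the orthogonal distances are small integers while the raw y-values are not. A poorly fitting line would make the orthogonal distances large and erase the saving.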

Page 14: Machine Learning  Intro  iCAMP  2012

Generalization

learning = compression = abstraction

• The man who couldn’t forget …

Page 15: Machine Learning  Intro  iCAMP  2012

Classification: nearest neighbor

Example: Imagine you want to classify monkeys versus humans.

Data: 100 monkey images and 200 human images, each labeled as monkey or human.

Task: Here is a new image: monkey or human?

Page 16: Machine Learning  Intro  iCAMP  2012

1-nearest neighbor

Idea:

1. Find the picture in the database that is closest to your query image.

2. Check its label.

3. Declare the class of your query image to be the same as that of the closest picture.
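The three steps can be sketched as follows. The slides do not say how "closest" is measured or how images are represented, so this sketch assumes each image has already been turned into a short feature vector and uses Euclidean distance; the database entries are invented.

```python
import math


def nearest_neighbor_label(query, database):
    """database is a list of (feature_vector, label) pairs."""
    # Step 1: find the database entry closest to the query (Euclidean distance).
    closest_features, closest_label = min(
        database, key=lambda item: math.dist(query, item[0]))
    # Steps 2 and 3: read off its label and declare it the class of the query.
    return closest_label


# Hypothetical 2-D feature vectors standing in for images.
database = [
    ((0.0, 0.0), "monkey"),
    ((1.0, 1.0), "monkey"),
    ((5.0, 5.0), "human"),
    ((6.0, 5.0), "human"),
]

print(nearest_neighbor_label((0.5, 0.2), database))  # → monkey
print(nearest_neighbor_label((5.5, 4.8), database))  # → human
```

There is no training step beyond storing the database; all the work happens at query time.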

[Figure: query image and its closest image in the database]

Page 17: Machine Learning  Intro  iCAMP  2012

kNN Decision Surface

decision curve

Page 18: Machine Learning  Intro  iCAMP  2012

Bayes Rule(s)

Riddle: Joe goes to the doctor and tells the doctor he has a stiff neck and a rash. The doctor is worried about meningitis and performs a test that is 80% correct, that is, for 80% of the people that have meningitis it will turn out positive. If 1 in 100,000 people have meningitis in the population and 1 in 1000 people will test positive (sick or not sick), what is the probability that Joe has meningitis?

Answer: Bayes Rule.

P(meningitis | positive test) = P(positive test | meningitis) P(meningitis) / P(positive test)
                              = 0.8 * 0.00001 / 0.001 = 0.008 < 1%
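The same arithmetic as a short check in code, using only the three numbers given in the riddle. The function and variable names are illustrative.

```python
def posterior(likelihood, prior, evidence):
    """Bayes rule: P(A | B) = P(B | A) P(A) / P(B)."""
    return likelihood * prior / evidence


p_positive_given_meningitis = 0.8     # the test catches 80% of true cases
p_meningitis = 1 / 100_000            # prevalence in the population
p_positive = 1 / 1000                 # overall rate of positive tests

p_meningitis_given_positive = posterior(
    p_positive_given_meningitis, p_meningitis, p_positive)
print(p_meningitis_given_positive)    # about 0.008, i.e. under 1%
```

Even after a positive test the probability stays below 1%, because the disease is so much rarer than a positive result.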

Page 19: Machine Learning  Intro  iCAMP  2012

Naïve Bayes Classifier

[Figure: graphical model in which meningitis (Y) points to the test result and to the symptoms (stiff neck, rash), i.e. the features X1 and X2]

Conditional independence:

P(X1,X2|Y) = P(X1|Y) P(X2|Y)

Naïve Bayes classifier:

P(Y|X1,X2) = P(X1|Y) P(X2|Y) P(Y) / P(X1,X2)
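A minimal sketch of this classifier for two binary features. The denominator P(X1,X2) is obtained by summing the numerator over all values of Y. Only the prior of 1 in 100,000 and the 0.8 test sensitivity come from the riddle; every other probability below is invented for illustration, as are the function and label names.

```python
def naive_bayes_posterior(prior, likelihoods, observation):
    """
    prior:        {label: P(Y = label)}
    likelihoods:  {label: [P(X1 = 1 | Y), P(X2 = 1 | Y)]}
    observation:  [x1, x2], each 0 or 1
    Returns {label: P(Y = label | X1 = x1, X2 = x2)}.
    """
    joint = {}
    for label, p_y in prior.items():
        p = p_y
        # Conditional independence: multiply the per-feature likelihoods.
        for p_feature, x in zip(likelihoods[label], observation):
            p *= p_feature if x == 1 else 1.0 - p_feature
        joint[label] = p
    evidence = sum(joint.values())  # P(X1, X2)
    return {label: p / evidence for label, p in joint.items()}


# X1 = positive test result, X2 = stiff neck and rash.
prior = {"meningitis": 0.00001, "healthy": 0.99999}
likelihoods = {
    "meningitis": [0.8, 0.7],   # 0.7 is a made-up symptom rate
    "healthy": [0.001, 0.01],   # both values are made up
}

result = naive_bayes_posterior(prior, likelihoods, observation=[1, 1])
print(result)
```

With these made-up numbers, observing both the positive test and the symptoms raises the probability of meningitis well above the value obtained from the test alone, since each feature contributes its own likelihood factor.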

Page 20: Machine Learning  Intro  iCAMP  2012

Bayesian Networks & Graphical Models

• Main modeling tool for modern machine learning

• Reasoning over large collections of random variables with intricate relations