50
1 CS 4700: Foundations of Artificial Intelligence Prof. Bart Selman [email protected] Machine Learning: The Theory of Learning R&N 18.5

CS 4700: Foundations of Artificial Intelligence

  • Upload
    monet

  • View
    27

  • Download
    2

Embed Size (px)

DESCRIPTION

CS 4700: Foundations of Artificial Intelligence. Prof. Bart Selman [email protected] Machine Learning: The Theory of Learning R&N 18.5. Machine Learning Theory. Central question in machine learning: How can we be sure that the hypothesis produced by a - PowerPoint PPT Presentation

Citation preview

Page 1: CS 4700: Foundations of  Artificial Intelligence

1

CS 4700:Foundations of Artificial Intelligence

Prof. Bart [email protected]

Machine Learning:The Theory of Learning

R&N 18.5

Page 2: CS 4700: Foundations of  Artificial Intelligence

2

Machine Learning Theory

Central question in machine learning:

How can we be sure that the hypothesis produced by a learning alg. will produce the correct answer on previously unseen examples?

Question in some sense too vague… Leads to the general problem of inductive inference: How can we generalize

from data?

Major advance: Computational Learning Theory Valiant 1984; Turing Award 2010.

Page 3: CS 4700: Foundations of  Artificial Intelligence

3

Page 4: CS 4700: Foundations of  Artificial Intelligence

4

Page 5: CS 4700: Foundations of  Artificial Intelligence

5

Page 6: CS 4700: Foundations of  Artificial Intelligence

6

Probabilities to the rescue: certain “bad” events (e.g.learner learns the wrong hypothesis)become exponentially rare with enough data.

Page 7: CS 4700: Foundations of  Artificial Intelligence

7

Page 8: CS 4700: Foundations of  Artificial Intelligence

8

(or higher.)

Page 9: CS 4700: Foundations of  Artificial Intelligence

9

S = # of “heads”

Page 10: CS 4700: Foundations of  Artificial Intelligence

10

Wow!

S = # of “heads”

So, when flipping a fair coin 10,000 times, you’ll“never, ever” see more than 5,500 heads. And, no oneever will… I.e., with enough data, we can be verycertain that if coin is fair, we won’t see a “large” deviationin terms of #heads vs. #tails.

Page 11: CS 4700: Foundations of  Artificial Intelligence

11

Page 12: CS 4700: Foundations of  Artificial Intelligence

12

Page 13: CS 4700: Foundations of  Artificial Intelligence

13

Page 14: CS 4700: Foundations of  Artificial Intelligence

14

Also, makes interesting learning algorithms possible!

Page 15: CS 4700: Foundations of  Artificial Intelligence

15

Page 16: CS 4700: Foundations of  Artificial Intelligence

16

Two issues: (1) (Re: High prob.) May get a “bad” sequence oftraining examples (many “relatively short men). Smallrisk but still small risk of getting wrong hypothesis.(2) We only use a small set of all examples (otherwise notreal learning). So, we may not get hypothesis exactly correct.

Finally, want efficient learning --- polytime!!

Page 17: CS 4700: Foundations of  Artificial Intelligence

17

Valiant’s genius was to focus in on the “simplest” modelsthat still captured all the key aspects we want in a “learningmachine.” PAC role has been profound even though mainlyto shine a “theoretical” light.

Page 18: CS 4700: Foundations of  Artificial Intelligence

18

Page 19: CS 4700: Foundations of  Artificial Intelligence

19

Page 20: CS 4700: Foundations of  Artificial Intelligence

20

For our learning algorithm, we simply use a method thatkeeps the hypothesis consistent with all examples seenso far. Can start out with an hypothesis that says “No”to all examples. Then when first positive example comesin, minimally modify hypothesis to make it consistent with thatexample. Proceed doing that for every new pos example.

Page 21: CS 4700: Foundations of  Artificial Intelligence

21

So, h_b could mistakenly be learned!

Note: we want to make it likely that all consistenthypotheses are approximately correct. So, no “bad” consistenthypothesis occur at all.

Page 22: CS 4700: Foundations of  Artificial Intelligence

22

² and ± are assumed given(set as desired)Keep size hypothesis class H down.Another showing of Ockham’s razor!

Page 23: CS 4700: Foundations of  Artificial Intelligence

23

So, in this setting the requirements on our learningalgorithm are quite minimal. We do also want poly time though.

Page 24: CS 4700: Foundations of  Artificial Intelligence

24

Aside: Shannon already noted that the vast majorityof Boolean functions on N letters look “random” andcannot be compressed. (No structure!)

Page 25: CS 4700: Foundations of  Artificial Intelligence

25

Page 26: CS 4700: Foundations of  Artificial Intelligence

26

Page 27: CS 4700: Foundations of  Artificial Intelligence

27

Page 28: CS 4700: Foundations of  Artificial Intelligence

28

Page 29: CS 4700: Foundations of  Artificial Intelligence

29

Page 30: CS 4700: Foundations of  Artificial Intelligence

30

Page 31: CS 4700: Foundations of  Artificial Intelligence

31

Page 32: CS 4700: Foundations of  Artificial Intelligence

32

Still need polytime learning alg. to generatea consistent decision list with m examples.

Page 33: CS 4700: Foundations of  Artificial Intelligence

33

Page 34: CS 4700: Foundations of  Artificial Intelligence

34

Page 35: CS 4700: Foundations of  Artificial Intelligence

35

Page 36: CS 4700: Foundations of  Artificial Intelligence

36

Page 37: CS 4700: Foundations of  Artificial Intelligence

37

Page 38: CS 4700: Foundations of  Artificial Intelligence

38

Page 39: CS 4700: Foundations of  Artificial Intelligence

39

Page 40: CS 4700: Foundations of  Artificial Intelligence

40

Some more PAC learnability examples.

Page 41: CS 4700: Foundations of  Artificial Intelligence

41

Page 42: CS 4700: Foundations of  Artificial Intelligence

42

Page 43: CS 4700: Foundations of  Artificial Intelligence

43

Page 44: CS 4700: Foundations of  Artificial Intelligence

44

Page 45: CS 4700: Foundations of  Artificial Intelligence

45

Page 46: CS 4700: Foundations of  Artificial Intelligence

46

Page 47: CS 4700: Foundations of  Artificial Intelligence

47

Page 48: CS 4700: Foundations of  Artificial Intelligence

48

Page 49: CS 4700: Foundations of  Artificial Intelligence

49

Page 50: CS 4700: Foundations of  Artificial Intelligence

50