CSC 4510/9010: Applied Machine Learning

Reinforcement and Transfer Learning

Dr. Paula Matuszek
(610) 647-9789

CSC 4510/9010 Spring 2015. Paula Matuszek


What Is Machine Learning?

• “Learning denotes changes in a system that ... enable a system to do the same task more efficiently the next time.” –Herbert Simon
  – In other words, the end result is a changed model or representation of some kind; the focus is on the end product.
• “Learning is constructing or modifying representations of what is being experienced.” –Ryszard Michalski
  – The experiences perceived must be captured or represented in some way; learning modifies that representation. This definition focuses on the process, rather than the result.

So what is Machine Learning?

• We can consider that the “system” is a computer and its programs, or a statistical model with parameters.
• Another way of looking at machine learning is as a way to get a computer to do things without having to explicitly describe what steps to take, by giving it examples or feedback.
• The computer then looks for patterns which can explain or predict what happens.
• The computer is trained through the examples.

The Architecture of a ML System

• Every machine learning system has four parts:
  – a representation or model of what is being learned
  – an actor. This is the part that uses the representation and actually does something.
  – a critic. The part that provides feedback.
  – a learner. The part that modifies the representation or model, using the feedback.

Based on Russell and Norvig, Artificial Intelligence: A Modern Approach, Third Edition, Prentice-Hall, 2009. http://aima.cs.berkeley.edu/

Representation

• A learning system must have a representation or model of what is being learned.
• This is the component that changes based on experience.
• In a machine learning system this may be a mathematical model or formula, a set of rules, a decision tree, or some other form of information.

General Model of Learning Agent

[Figure: block diagram of a learning agent. The Agent interacts with the Environment through Sensors and Effectors and contains a Critic (which consults a Performance Standard), a Learning Element, a Problem Generator, and a Performer with KB. Arrows between the components are labeled feedback, learning goals, changes, and knowledge.]

Based on Russell and Norvig, Artificial Intelligence: A Modern Approach, Third Edition, Prentice-Hall, 2009. http://aima.cs.berkeley.edu/

Animals Behind the Scene

[Figure: the decision tree after several passes. The root asks "Is it a bird?". Its Yes branch leads to "Is it a penguin?". Its No branch leads to "Does it have fur?", whose Yes branch leads to "Is it a mouse?" and whose No branch leads to "Is it a frog?".]

Computer: Is it a bird?
Human: No
Computer: Is it a frog?
Human: No
Computer: What is it?
Human: A mouse
Computer: Tell me a question that distinguishes a frog from a mouse.
Human: Does it have fur?
Computer: What is the answer for a frog?
Human: No

Animals Guessing Game Architecture

• The Animals Guessing Game has all of the parts of a Machine Learning Architecture:
  – The Representation is a sequence of questions and pairs of yes/no answers (called a binary decision tree).
  – The Actor “walks” the tree, interacting with a human; at each question it chooses whether to follow the “yes” branch or the “no” branch.
  – The Critic is the human player telling the game whether it has guessed correctly.
  – The Learner elicits new questions and adds questions, guesses and branches to the tree.
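The four parts of the Animals Guessing Game can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the `Node` class, the `play` function, and the `ask` callback (which stands in for the human player) are names assumed here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    """Representation: a question node (two children) or a guess leaf."""
    text: str                      # a yes/no question, or an animal name for a leaf
    yes: Optional["Node"] = None
    no: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.yes is None and self.no is None


def play(root: Node, ask: Callable[[str], str]) -> Node:
    """Play one round and return the (possibly grown) tree.

    `ask` is a hypothetical callback that shows a prompt and returns the
    human's reply; the human answering it plays the role of the Critic.
    """
    parent, branch, node = None, "", root
    # Actor: walk the tree, following the "yes" or "no" branch at each question.
    while not node.is_leaf():
        parent = node
        branch = "yes" if ask(node.text).lower().startswith("y") else "no"
        node = getattr(node, branch)
    # Critic: the human says whether the guess was right.
    if ask(f"Is it a {node.text}?").lower().startswith("y"):
        return root
    # Learner: elicit a new animal and a distinguishing question, then grow the tree.
    animal = ask("What is it?")
    question = ask(f"Tell me a question that distinguishes a {node.text} from a {animal}.")
    old_is_yes = ask(f"What is the answer for a {node.text}?").lower().startswith("y")
    new_leaf = Node(animal)
    grown = Node(question,
                 yes=node if old_is_yes else new_leaf,
                 no=new_leaf if old_is_yes else node)
    if parent is None:
        return grown
    setattr(parent, branch, grown)
    return root
```

Feeding `play` a scripted list of replies (No, No, mouse, "Does it have fur?", no) against a tree that only knows penguin and frog adds the fur question under the root's "no" branch, with mouse on its "yes" side and frog on its "no" side.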

Reinforcement Learning

• The Animals Game is a simple form of Reinforcement Learning.
• Very early concept in Artificial Intelligence!
• www-03.ibm.com/ibm/history/ibm100/us/en/icons/ibm700series/impacts/

Based on http://www.csee.umbc.edu/courses/671/fall05/slides/c28_rl.ppt, taken in turn from Jean-Claude Latombe and Lise Getoor

Reinforcement Learning

Supervised (inductive) learning is the simplest and most studied type of learning.
How can an agent learn behaviors when it doesn’t have a teacher to tell it how to perform? What’s the critic?
One solution is unsupervised learning
For a more complex problem:
■ The agent has a task to perform
■ It takes some actions in the world
■ At some later point, it gets feedback telling it how well it did on performing the task
■ The agent performs the same task over and over again
This problem is called reinforcement learning:
■ The agent gets positive reinforcement for tasks done well
■ The agent gets negative reinforcement for tasks done poorly

Reinforcement Learning (cont.)

The goal is to get the agent to act in the world so as to maximize its rewards
The agent has to figure out what it did that made it get the reward/punishment
■ This is known as the credit assignment problem
Reinforcement learning approaches can be used to train computers to do many tasks
■ backgammon and chess playing
■ job shop scheduling
■ controlling robot limbs

Formalization

Given:
■ a state space S
■ a set of actions a1, …, ak
■ reward value at the end of each trial (may be positive or negative)
Output:
■ a mapping from states to actions

Example: Alvinn (driving agent)
■ state: configuration of the car
■ learn a steering action for each state

Reactive Agent Algorithm

Repeat:
⬥ s ← sensed state (assumes an accessible or observable state)
⬥ If s is terminal then exit
⬥ a ← choose action (given s)
⬥ Perform a

Policy (Reactive/Closed-Loop Strategy)

• A policy Π is a complete mapping from states to actions

[Figure: grid world with columns 1–4 and rows 1–3; two cells are marked +1 and -1.]

Reactive Agent Algorithm

Repeat:
⬥ s ← sensed state
⬥ If s is terminal then exit
⬥ a ← Π(s)
⬥ Perform a
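The loop above translates almost line for line into Python. The sketch below assumes the policy Π is stored as a dictionary and that the environment is reached through three hypothetical callbacks (`sense`, `is_terminal`, `perform`); the `max_steps` guard and the toy corridor world are also additions made only for illustration.

```python
from typing import Callable, Dict, Hashable, List

State = Hashable
Action = str


def run_reactive_agent(
    policy: Dict[State, Action],
    sense: Callable[[], State],
    is_terminal: Callable[[State], bool],
    perform: Callable[[Action], None],
    max_steps: int = 1000,
) -> List[Action]:
    """Follow a fixed policy until a terminal state (or max_steps) is reached."""
    taken: List[Action] = []
    for _ in range(max_steps):
        s = sense()                 # s <- sensed state
        if is_terminal(s):          # if s is terminal then exit
            break
        a = policy[s]               # a <- policy(s)
        perform(a)                  # perform a
        taken.append(a)
    return taken


# Toy corridor (an assumption for illustration): cells 0..3, cell 3 is terminal.
position = {"cell": 0}
policy = {0: "right", 1: "right", 2: "right"}
actions = run_reactive_agent(
    policy,
    sense=lambda: position["cell"],
    is_terminal=lambda s: s == 3,
    perform=lambda a: position.update(cell=position["cell"] + (1 if a == "right" else -1)),
)
print(actions)  # ['right', 'right', 'right']
```

Because the policy is a complete mapping, the agent never deliberates: every sensed state is looked up and acted on directly.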

Approaches

Learn policy directly – function mapping from states to actions
Learn utility values for states (i.e., the value function)

Value Function

The agent knows what state it is in
The agent has a number of actions it can perform in each state.
Initially, it doesn't know the value of any of the states
If the outcome of performing an action at a state is deterministic, then the agent can update the utility value U() of states:
■ U(oldstate) = reward + U(newstate)
The agent learns the utility values of states as it works its way through the state space
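A small worked example may help show how the update U(oldstate) = reward + U(newstate) spreads value backwards from the goal. The three-state chain, its rewards, and the choice of 0 as the starting value are all assumptions made for this sketch.

```python
from collections import defaultdict

# Hypothetical deterministic chain: each state has one successor and a reward
# received on leaving it. "goal" is terminal.
TRANSITIONS = {"A": ("B", 0.0), "B": ("C", 0.0), "C": ("goal", 1.0)}


def run_trial(utility, start="A"):
    """Walk the chain once, applying U(old) = reward + U(new) at each step."""
    state = start
    while state in TRANSITIONS:
        new_state, reward = TRANSITIONS[state]
        utility[state] = reward + utility[new_state]
        state = new_state


utility = defaultdict(float)   # values are unknown at first; 0 is used here
history = []
for _ in range(3):
    run_trial(utility)
    history.append((utility["A"], utility["B"], utility["C"]))
print(history)  # [(0.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0)]
```

Each trial moves the reward one state further back along the path, which is why the agent has to repeat the task many times before early states receive a useful value.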

Learning States and Actions

• A typical approach is:
• At state S choose some action A
• Taking us to new State S1.
  – If S1 has a positive value, increase value of A at S.
  – If S1 has a negative value, decrease value of A at S.
  – If S1 is new, its initial value is unknown. 0?
• Repeat until?
• One complete learning pass eventually gets to a deterministic state. (win or lose)
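One way to read the three cases above is as a small update rule on a table of action values. The sketch below is only an interpretation: the function name, the fixed `step` size, and the choice of 0 for an unseen state are assumptions, not part of the slides.

```python
from collections import defaultdict


def update_action_value(action_values, state_values, s, a, s1, step=0.1):
    """Nudge the value of action `a` at state `s` according to the sign of U(s1)."""
    v = state_values.get(s1, 0.0)      # a new state starts at 0 (one possible choice)
    if v > 0:
        action_values[(s, a)] += step  # S1 looks good: increase value of A at S
    elif v < 0:
        action_values[(s, a)] -= step  # S1 looks bad: decrease value of A at S
    return action_values[(s, a)]


action_values = defaultdict(float)
state_values = {"win": 1.0, "lose": -1.0}
print(update_action_value(action_values, state_values, "S", "left", "win"))    # 0.1
print(update_action_value(action_values, state_values, "S", "right", "lose"))  # -0.1
print(update_action_value(action_values, state_values, "S", "up", "unseen"))   # 0.0
```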

Exploration

The agent may occasionally choose to explore suboptimal moves in the hopes of finding better outcomes
■ Only by visiting all the states frequently enough can we guarantee learning the true values of all the states
A discount factor is often introduced to prevent utility values from diverging and to promote the use of shorter (more efficient) sequences of actions to attain rewards
The update equation using a discount factor γ is:
■ U(oldstate) = reward + γ * U(newstate)
Normally, γ is set between 0 and 1
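The effect of the discount factor can be seen on a tiny example. The sketch below applies U(oldstate) = reward + γ * U(newstate) to a made-up three-state chain; the chain, the value γ = 0.9, and the number of sweeps are illustrative assumptions.

```python
GAMMA = 0.9  # discount factor; 0.9 is an illustrative choice

# Hypothetical deterministic chain ending in a terminal "goal" state.
TRANSITIONS = {"A": ("B", 0.0), "B": ("C", 0.0), "C": ("goal", 1.0)}


def discounted_sweep(utility, gamma=GAMMA):
    """Apply U(old) = reward + gamma * U(new) once to every non-terminal state."""
    for state, (new_state, reward) in TRANSITIONS.items():
        utility[state] = reward + gamma * utility.get(new_state, 0.0)


utility = {}
for _ in range(3):          # three sweeps are enough for this three-step chain
    discounted_sweep(utility)
print({s: round(u, 2) for s, u in utility.items()})  # {'A': 0.81, 'B': 0.9, 'C': 1.0}
```

States further from the reward end up with lower utility, so an agent that prefers higher-valued states is pushed toward shorter action sequences.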

Selecting an Action

Simply choose action with highest (current) expected utility?
Problem: each action has two effects
■ yields a reward (or penalty) on current sequence
■ information is received and used in learning for future sequences
Trade-off: immediate good for long-term well-being
■ try a shortcut – you might get lost; you might learn a new, quicker route!

Exploration policy

Wacky approach (exploration): act randomly in hopes of eventually exploring entire environment
Greedy approach (exploitation): act to maximize utility using current estimate
Reasonable balance: act more wacky (exploratory) when agent has little idea of environment; more greedy when the model is close to correct
Example: n-armed bandits…
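The "reasonable balance" can be illustrated on an n-armed bandit with an epsilon-greedy rule, a common technique that the slides do not name explicitly. In this sketch the exploration rate decays as 1/sqrt(t), rewards are Gaussian with unit variance, and the arm means are made up; all of these are assumptions chosen for illustration.

```python
import random


def run_bandit(true_means, pulls=2000, seed=0):
    """Epsilon-greedy play on an n-armed bandit with a decaying exploration rate."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n          # how often each arm has been pulled
    estimates = [0.0] * n     # running mean reward per arm
    for t in range(1, pulls + 1):
        epsilon = 1.0 / t ** 0.5      # explore a lot early, less later (one possible schedule)
        if rng.random() < epsilon:
            arm = rng.randrange(n)    # "wacky": act randomly
        else:
            arm = max(range(n), key=lambda i: estimates[i])   # greedy: best current estimate
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]   # incremental mean
    return counts, estimates


counts, estimates = run_bandit([0.1, 0.5, 0.9])
print(counts, [round(e, 2) for e in estimates])
```

Early pulls are almost all random; as the estimates settle, most pulls go to whichever arm currently looks best.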

RL Summary

Active area of research
Approaches from both OR and AI
There are many more sophisticated algorithms that we have not discussed
Applicable to game-playing, robot controllers, others

Reinforcement Learning

• Reinforcement learning systems learn a series of actions or decisions, rather than a single decision, based on feedback given at the end of the series.
• A reinforcement learner has a goal, and carries out trial-and-error search to find the best paths toward that goal

Reinforcement Learning

• A typical reinforcement learning system is an active agent, interacting with its environment.
• It must balance
  – exploration: trying different actions and sequences of actions to discover which ones work best
  – achievement: using sequences which have worked well so far
• It must also learn successful sequences of actions in an uncertain environment
• Typical current applications are in artificial intelligence and in engineering.

Transfer Learning

• Slides based on presentation from Haitham Bou Ammar, Maastricht University

Transfer Learning

• Data used in training a classifier must be properly chosen to be representative
• If not? Accuracy will be worse than expected
• But suppose we want to apply a classifier to a new or shifting domain? Retrain!
  – But that’s expensive.
• Can we somehow use our existing classifier as a starting point to give us a shortcut?
• This is Transfer Learning.

Based on https://project.dke.maastrichtuniversity.nl/datamining/2013-Slides/transfer-01.ppt

Motivation

[Figure: training data (x, y) is used to build a Model; the Model is then applied to test data x to predict y.]

Assumptions:
1. Training and Test are from same distribution
2. Training and Test are in same feature space

Not true in many real-world applications
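A toy example can show what happens when the first assumption fails. The sketch below fits a one-dimensional threshold classifier and scores it on test data from the training distribution and on test data whose features have shifted; the classifier, the data, and the size of the shift are all made up for illustration.

```python
def fit_threshold(xs, ys):
    """Fit a 1-D classifier: predict 1 when x is above the midpoint of the class means."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2


def accuracy(threshold, xs, ys):
    """Fraction of points whose thresholded prediction matches the label."""
    predictions = [1 if x > threshold else 0 for x in xs]
    return sum(p == y for p, y in zip(predictions, ys)) / len(ys)


# Made-up training data: class 0 clusters near 1, class 1 near 5.
train_x, train_y = [0.0, 1.0, 2.0, 4.0, 5.0, 6.0], [0, 0, 0, 1, 1, 1]
threshold = fit_threshold(train_x, train_y)

same_x = [0.5, 1.5, 4.5, 5.5]            # same distribution as training
shifted_x = [x + 3.0 for x in same_x]    # same labels, but every feature shifted by +3
labels = [0, 0, 1, 1]

print(accuracy(threshold, same_x, labels))      # 1.0
print(accuracy(threshold, shifted_x, labels))   # 0.5
```

The model itself has not changed; only the test distribution has, and accuracy drops to chance level on this data.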

Examples: Web-document Classification

[Figure: a Model trained to classify web documents as Physics, Machine Learning, or Life Science is applied to new documents, with "?" marking the uncertain result.]

Content Change!
Assumption violated!
Learn a new model

Learn new Model:

1. Collect new labeled data
2. Build new model

Reuse & adapt the already learned model!
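One simple way to picture "reuse and adapt" is warm-starting: train a model on the source data, then continue training from those weights on a handful of target examples instead of starting from scratch. The sketch below is only an illustration under stated assumptions. The logistic-regression learner, the toy data sets `source` and `target`, and the function names are all hypothetical and do not come from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(data, weights=None, epochs=200, lr=0.5):
    """Stochastic gradient descent for logistic regression on (features, label) pairs.

    Passing `weights` warm-starts training from an existing model
    (weights[0] is the bias term).
    """
    n_features = len(data[0][0])
    w = list(weights) if weights is not None else [0.0] * (n_features + 1)
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
            err = y - p
            w[0] += lr * err
            for j, xj in enumerate(x):
                w[j + 1] += lr * err * xj
    return w

def predict(w, x):
    return 1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else 0

# Hypothetical source task with (relatively) plenty of labeled data.
source = [([0.0, 0.0], 0), ([0.2, 0.1], 0), ([0.1, 0.3], 0),
          ([0.9, 1.0], 1), ([1.0, 0.8], 1), ([0.8, 0.9], 1)]
# Hypothetical related target task with only two labeled examples.
target = [([0.3, 0.3], 0), ([1.3, 1.2], 1)]

source_model = train_logreg(source)
# Reuse and adapt: start from the source weights, take a few small steps on target data.
adapted_model = train_logreg(target, weights=source_model, epochs=20, lr=0.1)

print("source model :", [round(v, 2) for v in source_model])
print("adapted model:", [round(v, 2) for v in adapted_model])
```

The adapted model needs far fewer target labels than a model built from nothing, but this only helps when the two tasks are actually related.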

Examples: Image Classification

● Task One: Features Task One → Model One
● Task Two (Cars, Motorcycles): Features Task Two → Model Two
● Reuse: Features Task One are reused alongside Features Task Two to build Model Two

Traditional Machine Learning vs. Transfer

● Traditional Machine Learning: different tasks, each with its own learning system
● Transfer Learning: knowledge from the source task is passed to the learning system for the target task

Transfer Learning Definition

Given a source domain $\mathcal{D}_S$ and source learning task $\mathcal{T}_S$, a target domain $\mathcal{D}_T$ and a target learning task $\mathcal{T}_T$, transfer learning aims to help improve the learning of the target predictive function using the source knowledge, where $\mathcal{D}_S \neq \mathcal{D}_T$ or $\mathcal{T}_S \neq \mathcal{T}_T$.

Transfer Definition

● Therefore, transfer learning applies if either of the following holds:
  ○ Domain differences
  ○ Task differences
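The formulas for these two cases did not survive in this text. As a hedged reference, the formalization below follows the notation commonly used in Pan and Yang's survey on transfer learning (the source of the overview slide later in this deck). It is an assumption that the original slide used the same symbols.

```latex
\begin{align*}
\mathcal{D} &= \{\mathcal{X},\, P(X)\}
  && \text{domain: feature space and marginal distribution} \\
\mathcal{T} &= \{\mathcal{Y},\, P(Y \mid X)\}
  && \text{task: label space and conditional distribution} \\
\mathcal{D}_S \neq \mathcal{D}_T &\iff
  \mathcal{X}_S \neq \mathcal{X}_T \ \text{ or } \ P_S(X) \neq P_T(X)
  && \text{domain differences} \\
\mathcal{T}_S \neq \mathcal{T}_T &\iff
  \mathcal{Y}_S \neq \mathcal{Y}_T \ \text{ or } \ P_S(Y \mid X) \neq P_T(Y \mid X)
  && \text{task differences}
\end{align*}
```

Read this way, the web-document example is a domain difference (same bag-of-words feature space, different word distribution), while moving from one set of image classes to another is a task difference.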

Questions to answer when transferring

● What to transfer?
  ○ Instances?
  ○ Model?
  ○ Features?
● How to transfer?
  ○ Weight instances?
  ○ Map model?
  ○ Unify features?
● When to transfer?
  ○ In which situations?

Different Distributions
• Example: classify documents from the web into important or not important
  – Documents in different domains have the same feature space: bag of words (frequency of each term)
  – However, the words have different frequencies in the different domains
  – The distribution of features is different
• So modify instances
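To make "same feature space, different distribution" concrete, the sketch below builds one shared bag-of-words vocabulary for two tiny document collections and compares the relative term frequencies. The two collections and all names are made-up illustrations, not data from the course.

```python
from collections import Counter

def term_distribution(docs, vocabulary):
    """Relative frequency of each vocabulary term across a list of documents."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    total = sum(counts[t] for t in vocabulary)
    return {t: counts[t] / total for t in vocabulary}

# Hypothetical source and target domains.
physics_docs = ["the particle energy model", "energy of the quantum particle"]
biology_docs = ["the cell protein model", "protein in the cell membrane"]

# One shared feature space: every term that occurs in either domain.
vocabulary = sorted({w for d in physics_docs + biology_docs for w in d.split()})
p_source = term_distribution(physics_docs, vocabulary)
p_target = term_distribution(biology_docs, vocabulary)

for term in vocabulary:
    print(f"{term:10s} source={p_source[term]:.2f} target={p_target[term]:.2f}")
```

Both domains are described by the same features, yet terms such as "energy" and "protein" carry very different weight in each, which is why a model trained on one domain can misjudge documents from the other.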

Algorithms: TrAdaBoost

● Assumptions:
  ○ Source and target tasks have the same feature space: $\mathcal{X}_S = \mathcal{X}_T$
  ○ Marginal distributions are different: $P_S(X) \neq P_T(X)$
● Not all source data might be helpful!

Algorithm: TrAdaBoost

● Idea:
  ○ Iteratively reweight source samples so as to:
    – reduce the effect of "bad" source instances
    – encourage the effect of "good" source instances
● Requires:
  ○ Source task labeled data set
  ○ Very small target task labeled data set
  ○ Unlabeled target data set
  ○ Base learner
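The reweighting idea can be sketched as a single update step. This is a minimal sketch, assuming the update rule of the original TrAdaBoost paper (Dai et al., 2007) as commonly stated: misclassified source instances are scaled down by a fixed factor, misclassified target instances are scaled up as in AdaBoost. The function name, the clipping of the error, and the toy inputs are illustrative assumptions. The full algorithm also renormalizes the weights and retrains the base learner each round, which is omitted here.

```python
import math

def tradaboost_reweight(weights, errors, n_source, n_rounds):
    """One TrAdaBoost-style weight update (sketch, no normalization).

    weights: current instance weights, source instances first, then target.
    errors:  |h(x_i) - y_i| per instance (0 = classified correctly, 1 = wrong).
    """
    target_w = weights[n_source:]
    target_e = errors[n_source:]
    # Weighted error of the base learner, measured on the target data only.
    eps = sum(w * e for w, e in zip(target_w, target_e)) / sum(target_w)
    eps = min(max(eps, 1e-10), 0.499)  # assumption: clip so beta_t stays in (0, 1)
    beta_t = eps / (1.0 - eps)
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))
    updated = []
    for i, (w, e) in enumerate(zip(weights, errors)):
        if i < n_source:
            updated.append(w * beta ** e)       # "bad" source instance: weight shrinks
        else:
            updated.append(w * beta_t ** (-e))  # hard target instance: weight grows
    return updated

# Four source instances followed by three target instances, all starting at weight 1.
new_weights = tradaboost_reweight(
    weights=[1.0] * 7,
    errors=[0, 1, 0, 1, 0, 0, 1],
    n_source=4,
    n_rounds=10,
)
print([round(w, 3) for w in new_weights])
```

Source instances that keep disagreeing with the target-trained hypothesis lose influence round after round, while correctly classified instances keep their weight.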

Different Features
• Example: classify images into cars and motorcycles
• We already have a classifier that classifies images into trucks and buses
• Features won't be the same
  – but some of them will (driver enclosed?)
  – and some of them will be similar but on different dimensions (big or small?)
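A rough sketch of this kind of feature reuse: one feature from the trucks-and-buses classifier is kept unchanged, one is rescaled to the new domain, and one is new. Everything here is hypothetical, including the feature names, the scaling constants, the toy data, and the nearest-centroid classifier. Real image features would be extracted from pixels rather than given as attributes.

```python
def truck_bus_features(vehicle):
    """Hypothetical features of the existing trucks-vs-buses classifier."""
    return {
        "driver_enclosed": 1.0 if vehicle["driver_enclosed"] else 0.0,
        "size": vehicle["length_m"] / 12.0,  # scaled for large vehicles
    }

def car_motorcycle_features(vehicle):
    """Features for the new task, built on top of the old ones."""
    old = truck_bus_features(vehicle)
    return [
        old["driver_enclosed"],      # same feature, reused unchanged
        old["size"] * 12.0 / 5.0,    # similar feature, rescaled for small vehicles
        vehicle["wheels"] / 4.0,     # new feature for the new task
    ]

def train_centroids(examples):
    """Nearest-centroid model: the mean feature vector of each class."""
    sums, counts = {}, {}
    for vehicle, label in examples:
        f = car_motorcycle_features(vehicle)
        acc = sums.setdefault(label, [0.0] * len(f))
        for j, value in enumerate(f):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(centroids, vehicle):
    f = car_motorcycle_features(vehicle)
    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(f, c))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

train = [
    ({"driver_enclosed": True, "length_m": 4.5, "wheels": 4}, "car"),
    ({"driver_enclosed": True, "length_m": 4.0, "wheels": 4}, "car"),
    ({"driver_enclosed": False, "length_m": 2.1, "wheels": 2}, "motorcycle"),
    ({"driver_enclosed": False, "length_m": 2.3, "wheels": 2}, "motorcycle"),
]
model = train_centroids(train)
print(classify(model, {"driver_enclosed": False, "length_m": 2.0, "wheels": 2}))
```

The point of the sketch is only the middle function: part of the old representation carries over directly, part needs adapting, and part has to be built fresh.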

Transferring Features
• Many methods:
  – Supervised Feature Construction. Self-taught learning.
  – Unsupervised Feature Construction
  – TAMAR

An overview of various settings of transfer learning

Transfer Learning
● Inductive Transfer Learning: labeled data are available in a target domain
  ○ Case 1: no labeled data in a source domain → Self-taught Learning
  ○ Case 2: labeled data are available in a source domain; source and target tasks are learnt simultaneously → Multi-task Learning
● Transductive Transfer Learning: labeled data are available only in a source domain
  ○ Assumption: different domains but single task → Domain Adaptation
  ○ Assumption: single domain and single task → Sample Selection Bias / Covariate Shift
● Unsupervised Transfer Learning: no labeled data in both source and target domain

slide from http://www1.i2r.a-star.edu.sg/~jspan/publications/A%20Survey%20on%20Transfer%20Learning.ppt
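The top level of this overview is driven entirely by where labeled data are available, which can be written down as a small lookup. The function below is a hypothetical helper for illustration only, and it covers just the first split of the diagram, not the further assumptions beneath it.

```python
def transfer_setting(source_labeled, target_labeled):
    """Map label availability to the top-level transfer learning setting."""
    if target_labeled:
        if source_labeled:
            return "inductive transfer learning (case 2)"
        return "inductive transfer learning (case 1)"
    if source_labeled:
        return "transductive transfer learning"
    return "unsupervised transfer learning"

for source_labeled in (True, False):
    for target_labeled in (True, False):
        setting = transfer_setting(source_labeled, target_labeled)
        print(f"source labeled={source_labeled!s:5} target labeled={target_labeled!s:5} -> {setting}")
```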

Conclusions

● Transfer learning re-uses source knowledge to help a target learner
● Transfer learning is not generalization
● Self-taught learning transfers features learned from unlabeled data

Summary 1
• Goal of transfer learning: to reuse knowledge from a previous learner to help develop a new learner.
• A new learner can be required for
  – new instances
    • different features
    • different distribution of the same features
  – a new task

Summary 2
• We can transfer knowledge from
  – instances
  – features
  – model
• It's not always worth transferring. There must still be some relationship between the knowledge behind the two learners.
• Complex and growing field