CSC 4510/9010 Spring 2015. Paula Matuszek
CSC 4510/9010: Applied Machine Learning
Reinforcement and Transfer Learning
Dr. Paula Matuszek
(610) 647-9789
What Is Machine Learning?
• “Learning denotes changes in a system that ... enable a system to do the same task more efficiently the next time.” – Herbert Simon
  – In other words, the end result is a changed model of some kind; the focus is on the end product.
• “Learning is constructing or modifying representations of what is being experienced.” – Ryszard Michalski
  – The experiences perceived must be captured or represented in some way; learning modifies that representation. This definition focuses on the process, rather than the result.
So what is Machine Learning?
• We can consider that the “system” is a computer and its programs, or a statistical model with parameters.
• Another way of looking at machine learning is as a way to get a computer to do things without having to explicitly describe what steps to take, by giving it examples or feedback.
• The computer then looks for patterns which can explain or predict what happens.
• The computer is trained through the examples.
The Architecture of a ML System
• Every machine learning system has four parts:
  – a representation or model of what is being learned
  – an actor: the part that uses the representation and actually does something
  – a critic: the part that provides feedback
  – a learner: the part that modifies the representation or model, using the feedback
Based on Russell and Norvig, Artificial Intelligence: A Modern Approach, Third Edition, Prentice-Hall, 2009. http://aima.cs.berkeley.edu/
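The four parts above can be sketched as a minimal object skeleton (illustrative only; the class, method, and action names are invented for this sketch, not part of the lecture):

```python
class MLSystem:
    """Illustrative skeleton of the four parts of a machine learning system."""

    def __init__(self):
        self.model = {}  # the representation: here, just a dict of learned responses

    def act(self, situation):
        # the actor: uses the representation to actually do something
        return self.model.get(situation, "default-action")

    def critique(self, outcome, goal):
        # the critic: provides feedback by comparing the outcome to the goal
        return 1 if outcome == goal else -1

    def learn(self, situation, action, feedback):
        # the learner: modifies the representation using the feedback
        if feedback > 0:
            self.model[situation] = action
```

A single learning step ties the parts together: the actor acts, the critic scores the result, and the learner updates the representation.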
Representation
• A learning system must have a representation or model of what is being learned.
• This is the component that changes based on experience.
• In a machine learning system this may be a mathematical model or formula, a set of rules, a decision tree, or some other form of information.
General Model of Learning Agent
[Diagram: inside the Agent, the Performance Standard feeds the Critic, which sends feedback to the Learning Element; the Learning Element sends learning goals to the Problem Generator and changes/knowledge to the Performer with KB; the Agent senses the Environment through Sensors and acts on it through Effectors.]
Based on Russell and Norvig, Artificial Intelligence: A Modern Approach, Third Edition, Prentice-Hall, 2009. http://aima.cs.berkeley.edu/
copyright Paula Matuszek 2015
Animals Behind the Scene
[Decision tree after several passes:]
Is it a bird?
  Yes: Is it a penguin?
  No: Does it have fur?
    Yes: Is it a mouse?
    No: Is it a frog?

Computer: Is it a bird?
Human: No
Computer: Is it a frog?
Human: No
Computer: What is it?
Human: A mouse
Computer: Tell me a question that distinguishes a frog from a mouse.
Human: Does it have fur?
Computer: What is the answer for a frog?
Human: No
Animals Guessing Game Architecture
• The Animals Guessing Game has all of the parts of a Machine Learning Architecture:
  – The Representation is a sequence of questions and pairs of yes/no answers (called a binary decision tree).
  – The Actor “walks” the tree, interacting with a human; at each question it chooses whether to follow the “yes” branch or the “no” branch.
  – The Critic is the human player telling the game whether it has guessed correctly.
  – The Learner elicits new questions and adds questions, guesses and branches to the tree.
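The whole game fits in a few lines of code. The following is a hypothetical reconstruction for illustration, not the original program; the node structure and prompt wording are invented:

```python
# Binary decision tree for the Animals guessing game.
# A node is either a guess (leaf) or a question with yes/no branches.

class Node:
    def __init__(self, text, yes=None, no=None):
        self.text, self.yes, self.no = text, yes, no

    def is_leaf(self):
        return self.yes is None and self.no is None

def play(node, ask):
    """Walk the tree (the Actor). `ask` answers yes/no questions and
    supplies new knowledge (the Critic). Mutates the tree in place (the Learner)."""
    if node.is_leaf():
        if ask("Is it a %s?" % node.text) == "yes":
            return node  # guessed right; tree unchanged
        # The Learner: elicit a new animal and a distinguishing question.
        animal = ask("What is it?")
        question = ask("Tell me a question that distinguishes a %s from a %s."
                       % (node.text, animal))
        old_is_yes = ask("What is the answer for a %s?" % node.text) == "yes"
        old, new = Node(node.text), Node(animal)
        node.text = question
        node.yes, node.no = (old, new) if old_is_yes else (new, old)
        return node
    branch = node.yes if ask(node.text) == "yes" else node.no
    play(branch, ask)
    return node
```

Replaying the dialogue from the previous slide against a tree that only knows birds, penguins, and frogs grows exactly the fur branch shown there.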
Reinforcement Learning
• The Animals Game is a simple form of Reinforcement Learning.
• Very early concept in Artificial Intelligence!
• www-03.ibm.com/ibm/history/ibm100/us/en/icons/ibm700series/impacts/
based on http://www.csee.umbc.edu/courses/671/fall05/slides/c28_rl.ppt, taken in turn from Jean-Claude Latombe and Lise Getoor
Reinforcement Learning
• Supervised (inductive) learning is the simplest and most studied type of learning.
• How can an agent learn behaviors when it doesn’t have a teacher to tell it how to perform? What’s the critic?
• One solution is unsupervised learning.
• For a more complex problem:
  ■ The agent has a task to perform
  ■ It takes some actions in the world
  ■ At some later point, it gets feedback telling it how well it did on performing the task
  ■ The agent performs the same task over and over again
• This problem is called reinforcement learning:
  ■ The agent gets positive reinforcement for tasks done well
  ■ The agent gets negative reinforcement for tasks done poorly
Reinforcement Learning (cont.)
• The goal is to get the agent to act in the world so as to maximize its rewards.
• The agent has to figure out what it did that made it get the reward/punishment.
  ■ This is known as the credit assignment problem.
• Reinforcement learning approaches can be used to train computers to do many tasks:
  ■ backgammon and chess playing
  ■ job shop scheduling
  ■ controlling robot limbs
Formalization
Given:
  ■ a state space S
  ■ a set of actions a1, …, ak
  ■ a reward value at the end of each trial (may be positive or negative)
Output:
  ■ a mapping from states to actions
Example: ALVINN (driving agent)
  state: configuration of the car
  learn a steering action for each state
Reactive Agent Algorithm (accessible or observable state)
Repeat:
  ⬥ s ← sensed state
  ⬥ If s is terminal then exit
  ⬥ a ← choose action (given s)
  ⬥ Perform a
Policy (Reactive/Closed-Loop Strategy)
• A policy Π is a complete mapping from states to actions
[Figure: a 3×4 grid world with rows 1–3 and columns 1–4; one terminal cell is labeled +1 and another −1.]
Reactive Agent Algorithm
Repeat:
  ⬥ s ← sensed state
  ⬥ If s is terminal then exit
  ⬥ a ← Π(s)
  ⬥ Perform a
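The loop above, as code. The tiny corridor environment and policy here are invented for illustration:

```python
def run_reactive_agent(policy, step, start, terminal):
    """Repeat: sense the state; exit if terminal; act according to policy Pi."""
    s = start
    trajectory = [s]
    while s not in terminal:
        a = policy[s]        # a <- Pi(s)
        s = step(s, a)       # perform a, then sense the new state
        trajectory.append(s)
    return trajectory

# A 4-state corridor: moving "right" from state i goes to state i+1.
policy = {0: "right", 1: "right", 2: "right"}
step = lambda s, a: s + 1 if a == "right" else s - 1
```

Running the agent from state 0 with terminal state 3 walks the corridor end to end.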
Approaches
• Learn policy directly – a function mapping from states to actions
• Learn utility values for states (i.e., the value function)
Value Function
• The agent knows what state it is in.
• The agent has a number of actions it can perform in each state.
• Initially, it doesn't know the value of any of the states.
• If the outcome of performing an action at a state is deterministic, then the agent can update the utility value U() of states:
  ■ U(oldstate) = reward + U(newstate)
• The agent learns the utility values of states as it works its way through the state space.
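The deterministic update rule can be sketched directly; the toy state names and the convention that rewards attach to transitions are invented for this sketch:

```python
# Deterministic utility update: U(oldstate) = reward + U(newstate),
# applied backward along the path the agent actually took.

def update_utilities(U, path, rewards):
    """U maps states to utility estimates; `path` is the visited state
    sequence; rewards[i] is the reward on the transition path[i] -> path[i+1]."""
    U.setdefault(path[-1], 0)  # terminal state anchors the backup
    for i in reversed(range(len(path) - 1)):
        U[path[i]] = rewards[i] + U[path[i + 1]]
    return U
```

Walking backward means each state picks up the value of its successor, so the reward at the end of a trial propagates to every state along the way.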
Learning States and Actions
• A typical approach is:
  – At state S choose some action A, taking us to new state S1.
  – If S1 has a positive value, increase the value of A at S.
  – If S1 has a negative value, decrease the value of A at S.
  – If S1 is new, its initial value is unknown; perhaps 0.
• Repeat until? One complete learning pass eventually gets to a deterministic state (win or lose).
Exploration
• The agent may occasionally choose to explore suboptimal moves in the hope of finding better outcomes.
  ■ Only by visiting all the states frequently enough can we guarantee learning the true values of all the states.
• A discount factor is often introduced to prevent utility values from diverging and to promote the use of shorter (more efficient) sequences of actions to attain rewards.
• The update equation using a discount factor γ is:
  ■ U(oldstate) = reward + γ * U(newstate)
• Normally, γ is set between 0 and 1.
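As a sketch, the discounted update applied backward along one visited path; the state names and γ = 0.9 are arbitrary choices for illustration:

```python
GAMMA = 0.9  # discount factor, between 0 and 1

def discounted_update(U, path, rewards, gamma=GAMMA):
    """U(oldstate) = reward + gamma * U(newstate), applied backward along
    the visited path, so states closer to the reward get larger utilities."""
    U.setdefault(path[-1], 0)
    for i in reversed(range(len(path) - 1)):
        U[path[i]] = rewards[i] + gamma * U[path[i + 1]]
    return U
```

Because each extra step multiplies the backed-up value by γ < 1, shorter routes to the same reward end up with higher utilities, which is exactly the efficiency pressure the slide describes.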
Selecting an Action
• Simply choose the action with the highest (current) expected utility?
• Problem: each action has two effects
  ■ it yields a reward (or penalty) on the current sequence
  ■ information is received and used in learning for future sequences
• Trade-off: immediate good for long-term well-being
• Try a shortcut – you might get lost; you might learn a new, quicker route!
Exploration policy
• Wacky approach (exploration): act randomly in hopes of eventually exploring the entire environment.
• Greedy approach (exploitation): act to maximize utility using the current estimate.
• Reasonable balance: act more wacky (exploratory) when the agent has little idea of the environment; more greedy when the model is close to correct.
• Example: n-armed bandits…
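The n-armed bandit balance is often sketched with an ε-greedy rule: explore a random arm with small probability ε, otherwise exploit the best estimate so far. The arm payouts, noise level, and ε below are invented for illustration:

```python
import random

def epsilon_greedy_bandit(payouts, steps=1000, eps=0.1, seed=0):
    """Pull one of len(payouts) arms per step: explore with probability eps,
    otherwise exploit the arm with the best estimated value. Returns the
    index of the arm the agent ends up believing is best."""
    rng = random.Random(seed)
    n_arms = len(payouts)
    counts = [0] * n_arms
    values = [0.0] * n_arms   # running average reward per arm
    for _ in range(steps):
        if rng.random() < eps:
            arm = rng.randrange(n_arms)       # explore ("wacky")
        else:
            arm = values.index(max(values))   # exploit ("greedy")
        reward = payouts[arm] + rng.gauss(0, 0.1)  # noisy payout
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # update average
    return values.index(max(values))
```

With enough pulls, the occasional exploration visits every arm often enough for the running averages to single out the truly best one.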
RL Summary
• Active area of research
• Approaches from both OR and AI
• There are many more sophisticated algorithms that we have not discussed
• Applicable to game-playing, robot controllers, others
Reinforcement Learning
• Reinforcement learning systems learn a series of actions or decisions, rather than a single decision, based on feedback given at the end of the series.
• A reinforcement learner has a goal, and carries out trial-and-error search to find the best paths toward that goal.
Reinforcement Learning
• A typical reinforcement learning system is an active agent, interacting with its environment.
• It must balance
  – exploration: trying different actions and sequences of actions to discover which ones work best
  – achievement: using sequences which have worked well so far
• It must also learn successful sequences of actions in an uncertain environment.
• Typical current applications are in artificial intelligence and in engineering.
Transfer Learning
• Slides based on presentation from Haitham Bou Ammar, Maastricht University
Transfer Learning
• Data used in training a classifier must be properly chosen to be representative.
• If not? Accuracy will be worse than expected.
• But suppose we want to apply a classifier to a new or shifting domain? Retrain!
  – But that’s expensive.
• Can we somehow use our existing classifier as a starting point to give us a shortcut?
• This is Transfer Learning.
Based on https://project.dke.maastrichtuniversity.nl/datamining/2013-Slides/transfer-01.ppt
Motivation
• Training data: pairs (x, y) are used to build a Model.
• Test data: given a new x, the Model predicts y.
• Assumptions:
  1. Training and Test are from the same distribution
  2. Training and Test are in the same feature space
• Not true in many real-world applications!
Examples: Web-document Classification
• A Model is trained to classify web documents into categories such as Physics, Machine Learning, and Life Science.
• A new document arrives: which category?
• Content changes over time, so the trained Model may no longer fit: the assumption is violated!
• One option: learn a new model.
Learn new Model:
1. Collect new labeled data
2. Build new model
Alternative: Reuse & adapt the already learned model!
Examples: Image Classification
• Task One: from a set of training images, extract Features (Task One) and build Model One.
• Task Two: classify images of Cars vs. Motorcycles; extract Features (Task Two) and build Model Two.
• Transfer: Reuse the features learned for Task One when building Model Two.
Traditional Machine Learning vs. Transfer
• Traditional Machine Learning: different tasks each get their own separate Learning System.
• Transfer Learning: knowledge extracted from a Source Task is passed to the Learning System for the Target Task.
Transfer Learning Definition
• Given a source domain and source learning task, a target domain and a target learning task, transfer learning aims to help improve the learning of the target predictive function using the source knowledge, where the domains differ or the tasks differ.
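In the notation standard for this definition (following Pan & Yang's transfer learning survey), a domain is a feature space plus a marginal distribution and a task is a label space plus a predictive function, so the condition can be written as:

```latex
\begin{align*}
&\text{Domain: } \mathcal{D} = \{\mathcal{X}, P(X)\}, \qquad
 \text{Task: } \mathcal{T} = \{\mathcal{Y}, f(\cdot)\} \\
&\text{Transfer learning: improve the target } f_T(\cdot)
 \text{ using knowledge of } \mathcal{D}_S \text{ and } \mathcal{T}_S, \\
&\text{where } \mathcal{D}_S \neq \mathcal{D}_T
 \quad \text{or} \quad \mathcal{T}_S \neq \mathcal{T}_T .
\end{align*}
```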
Transfer Definition
● Therefore, transfer applies if either:
  ○ the domains differ (Domain Differences), or
  ○ the tasks differ (Task Differences)
Questions to answer when transferring
• What to Transfer?
  – Instances?
  – Features?
  – Model?
• How to Transfer?
  – Weight instances?
  – Unify features?
  – Map model?
• When to Transfer?
  – In which situations?
Different Distributions
• Example: classify documents from the web into important or not important.
  – Documents in different domains have the same feature space: bag of words (frequency of each term).
  – However, the words have different frequencies in the different domains.
  – The distribution of features is different.
• So modify instances.
Algorithms: TrAdaBoost
● Assumptions:
  ○ Source and Target task have the same feature space.
  ○ Marginal distributions are different.
● Not all source data might be helpful!
Algorithm: TrAdaBoost
● Idea:
  ○ Iteratively reweight source samples such that:
    ÷ the effect of “bad” source instances is reduced
    ÷ the effect of “good” source instances is encouraged
● Requires:
  ○ Source task labeled data set
  ○ Very small Target task labeled data set
  ○ Unlabeled Target data set
  ○ Base Learner
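A highly simplified sketch of the reweighting idea, for one boosting round on binary labels. This is not the full TrAdaBoost algorithm (which also fits the base learner each round and derives β from the error rate); the function name, toy data, and fixed β are invented for this sketch:

```python
def reweight_source(weights, predictions, labels, beta=0.5):
    """Down-weight source instances the current hypothesis gets wrong,
    so 'bad' source data loses influence in later boosting rounds."""
    new_w = []
    for w, pred, y in zip(weights, predictions, labels):
        err = abs(pred - y)            # 0 if correct, 1 if wrong (0/1 labels)
        new_w.append(w * beta ** err)  # misclassified weight shrinks by beta < 1
    total = sum(new_w)
    return [w / total for w in new_w]  # renormalize to a distribution
```

After a few rounds, source instances that keep disagreeing with the target concept have negligible weight, while the helpful ones keep contributing to the base learner.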
Different Features
• Example: classify images into cars and motorcycles.
• We already have a classifier that classifies images into trucks and buses.
• Features won’t be the same
  – but some of them will (driver enclosed?)
  – and some of them will be similar but on different dimensions (big or small?)
Transferring Features
• Many methods:
  – Supervised feature construction; self-taught learning
  – Unsupervised feature construction
  – TAMAR
CSC 4510/9010 Spring 2015. Paula Matuszek

An overview of various settings of transfer learning
(slide from http://www1.i2r.a-star.edu.sg/~jspan/publications/A%20Survey%20on%20Transfer%20Learning.ppt)

Transfer Learning
• Inductive Transfer Learning: labeled data are available in the target domain
  – Case 1: no labeled data in the source domain → Self-taught Learning
  – Case 2: labeled data are available in the source domain → Multi-task Learning (source and target tasks are learnt simultaneously)
• Transductive Transfer Learning: labeled data are available only in the source domain
  – Domain Adaptation (assumption: different domains but a single task)
  – Sample Selection Bias / Covariate Shift (assumption: single domain and single task)
• Unsupervised Transfer Learning: no labeled data in either the source or the target domain
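The branching in that taxonomy can be written down as a small decision function. This is a simplification of my own (the full taxonomy also distinguishes same-vs-different tasks and domains), offered only to make the label-availability distinctions concrete:

```python
def transfer_setting(target_labeled: bool, source_labeled: bool) -> str:
    """Map label availability to a transfer-learning setting, following
    the Pan & Yang taxonomy (simplified to label availability only)."""
    if target_labeled and not source_labeled:
        return "inductive: self-taught learning"       # Case 1
    if target_labeled and source_labeled:
        return "inductive: multi-task learning"        # Case 2
    if source_labeled:
        return "transductive: domain adaptation / sample selection bias"
    return "unsupervised transfer learning"

print(transfer_setting(True, False))   # inductive: self-taught learning
print(transfer_setting(False, True))   # transductive: domain adaptation / sample selection bias
```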
Based on https://project.dke.maastrichtuniversity.nl/datamining/2013-Slides/transfer-01.ppt
Conclusions
● Transfer learning re-uses knowledge from a source learner to help a target learner
● Transfer learning is not the same as generalization
● Self-taught learning transfers features constructed from unlabeled data
Summary 1
• Goal of transfer learning: to reuse knowledge from a previous learner to help develop a new learner.
• The new learner can be required for:
  – new instances
    • different features
    • a different distribution of the same features
  – a new task
44
Summary 2
• We can transfer knowledge from:
  – instances
  – features
  – models
• It's not always worth transferring; there must still be some relationship between the knowledge behind the two learners.
• Transfer learning is a complex and growing field.
45
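The first of those three options, instance transfer, has a very simple baseline: pool source and target examples but down-weight the source ones, trusting target data more. The sketch below uses invented data and a fixed weight; boosting-based methods such as TrAdaBoost adapt the weights automatically instead.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Many source instances from a related task, a few labeled target instances.
X_src = rng.normal(size=(100, 4))
y_src = (X_src[:, 0] > 0).astype(int)
X_tgt = rng.normal(size=(10, 4))
y_tgt = (X_tgt[:, 0] > 0).astype(int)

# Instance transfer: pool the data, but give source examples lower weight
# (0.2 is an arbitrary choice here) so the target data dominates the fit.
X = np.vstack([X_src, X_tgt])
y = np.concatenate([y_src, y_tgt])
w = np.concatenate([np.full(len(X_src), 0.2), np.ones(len(X_tgt))])

clf = LogisticRegression().fit(X, y, sample_weight=w)
print(clf.predict(X_tgt).shape)  # (10,)
```

Whether this helps depends on the relationship between the two tasks, which is exactly the caveat in Summary 2: pooling unrelated source instances can hurt rather than help.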