Parameter Learning
Announcements
• Midterm 24th 7-9pm, NVIDIA
• Midterm review in class next Tuesday
• Homework back
• Regrade process
• Looking into reflex agent on pacman
• Extra study material for midterm (after class).
• Some changes to the schedule
• Want to hear your song before class?
Happy with how you did
A lot of class left
Pac Man Grades
[Histogram of Pac-Man scores, binned: > 23/20, > 20/20, >= 17/20, >= 15/20, >= 12/20, >= 9/20, >= 2/20]
CS221 Grade Book (16% of the grade so far)
A lot of class left.
[Figure, shown across several builds: grade-book bins labeled Good, Good, Ok?, Talk, Talk, Good, Yay, with the captions "How we see it", "Good job!", "Alright", and "Rethink"]
Theory on Grades
Real World Problem →(Model the problem)→ Formal Problem →(Apply an Algorithm)→ Solution →(Evaluate)
Common Error: Formalize a problem

Modeling Discrete Search
• States: what makes a state
• Actions(s): possible actions from state s
• Succ(s, a): states that could result from taking action a from state s
• Reward(s, a): reward for taking action a from state s
• s_start: the starting state
• IsEnd(s): whether to stop
• Utility(s): the value of reaching a given stopping point
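To make the formalization concrete, here is a minimal Python sketch of a search problem as an interface; the class and method names are illustrative, not from the course code:

class SearchProblem:
    # States are just values this interface agrees to pass around;
    # what makes a state is up to the modeler.
    def startState(self):       # s_start: the starting state
        raise NotImplementedError
    def actions(self, s):       # Actions(s): possible actions from s
        raise NotImplementedError
    def succ(self, s, a):       # Succ(s, a): resulting state
        raise NotImplementedError
    def reward(self, s, a):     # Reward(s, a): reward for (s, a)
        raise NotImplementedError
    def isEnd(self, s):         # whether to stop
        raise NotImplementedError
    def utility(self, s):       # value of a given stopping point
        raise NotImplementedError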
Modeling Markov Decision Processes
• States: what makes a state
• Actions(s): possible actions from state s
• Succ(s, a): probability distribution of states that could result from taking action a from state s
• Reward(s, a): reward for taking action a from state s
• s_start: the starting state
• IsEnd(s): whether to stop
• Utility(s): the value of reaching a given stopping point
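The only structural change from the search sketch above is the successor function, which now returns a distribution over next states; a hedged illustration (succProb is a hypothetical name):

class MDPProblem:
    # Identical to the search interface except for the successor
    # function, which returns a distribution rather than one state.
    def succProb(self, s, a):
        # Return a list of (nextState, probability) pairs,
        # e.g. [('s1', 0.8), ('s2', 0.2)]  (illustrative values).
        raise NotImplementedError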
Definition: Bayes Net = DAG + CPDs. DAG: directed acyclic graph (the BN's structure)
• Nodes: random variables (typically discrete, but methods also exist to handle continuous variables)
• Arcs: indicate probabilistic dependencies between nodes. Go from cause to effect.
• CPDs: conditional probability distribution (BN’s parameters) Conditional probabilities at each node, usually stored as a table (conditional probability table, or CPT)
Each node stores P(x_i | π_i), where π_i is the set of all parent nodes of x_i. Root nodes are a special case: no parents, so just use priors in the CPD, so P(x_i | π_i) = P(x_i).
Modeling Bayes Net
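As a concrete illustration of a CPD stored as a table, here is one way a CPT might look in Python; the variables and numbers are invented for the example:

# A CPT for P(Wet | Rain, Sprinkler), keyed by the parents' values.
cptWet = {
    (True,  True):  0.99,
    (True,  False): 0.80,
    (False, True):  0.90,
    (False, False): 0.05,
}

# A root node has no parents, so its "table" is just a prior.
priorRain = 0.20  # P(Rain = true)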
Modeling Hidden Markov Model

Formally:
(1) State variables and their domains
(2) Evidence variables and their domains
(3) Probability of states at time 0
(4) Transition probability
(5) Emission probability
[HMM diagram: hidden states X1 X2 X3 X4 X5, each emitting evidence E1 E2 E3 E4 E5]
Formally, we want to get our model into Python.
Scary?
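It doesn't have to be. A minimal sketch of those five pieces as plain Python data, assuming small discrete domains; all names and numbers are illustrative:

# (1)-(2) Variables and their domains.
stateDomain    = ['sunny', 'rainy']
evidenceDomain = ['umbrella', 'noUmbrella']
# (3) Probability of states at time 0.
startProb = {'sunny': 0.5, 'rainy': 0.5}
# (4) Transition probability P(X_t | X_{t-1}).
transProb = {'sunny': {'sunny': 0.7, 'rainy': 0.3},
             'rainy': {'sunny': 0.3, 'rainy': 0.7}}
# (5) Emission probability P(E_t | X_t).
emitProb  = {'sunny': {'umbrella': 0.1, 'noUmbrella': 0.9},
             'rainy': {'umbrella': 0.8, 'noUmbrella': 0.2}}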
Theory on Grades
Previously on CS221
In Class Research
Hidden Markov Model

Formally:
(1) State variables and their domains
(2) Evidence variables and their domains
(3) Probability of states at time 0
(4) Transition probability
(5) Emission probability

[HMM diagram: hidden states X1-X5 emitting E1-E5; the filtering build then unrolls X1 → X2 with evidence E1]
Filtering
Tracking Other Cars
Track a Car!
[Diagram: hidden positions Pos1 → Pos2, with observed distances Dist1, Dist2]
Track a Robot!
[Diagram: hidden position Pos1 with observed distance Dist1]
[Plot, built up over three slides: probability density over the sensor reading d; a Gaussian with mean μ = true distance from x to your car and standard deviation σ = Const.SONAR_STD]
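A hedged sketch of that emission model in Python: the Gaussian density of an observed distance, with the mean and standard deviation taken from the slide (the function name is illustrative):

import math

def emissionProb(observedDist, trueDist, std):
    # Gaussian density of the sonar reading: the mean is the true
    # distance from x to your car; std would be Const.SONAR_STD.
    z = (observedDist - trueDist) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))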
Track a Robot!
[Diagram: hidden positions Pos1 → Pos2]
Particle Filters
• A particle is a hypothetical instantiation of a variable.
• Store a large number of particles.
• Elapse time by moving each particle given transition probabilities.
• When we get new evidence, we weight each particle and create a new generation.
• The density of particles for any given value is an approximation of the probability that our variable equals that value (see the sketch below).
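A minimal end-to-end sketch of that loop in Python, reusing the dict-of-dicts transition and emission tables sketched earlier; all names are illustrative:

import random

def sampleFrom(dist):
    # dist: dict mapping value -> probability.
    values = list(dist.keys())
    weights = [dist[v] for v in values]
    return random.choices(values, weights=weights, k=1)[0]

def particleFilter(particles, observations, transProb, emitProb):
    # particles: a list of hypothesized states (one entry per particle).
    for e in observations:
        # 1. Elapse time: move each particle via the transition model.
        particles = [sampleFrom(transProb[p]) for p in particles]
        # 2. Observe: weight each particle by how well it explains e.
        weights = [emitProb[p][e] for p in particles]
        # 3. Resample: draw a new generation, with replacement.
        particles = random.choices(particles, weights=weights, k=len(particles))
    return particles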
Sometimes |X| is too big to use exact inference:
– |X| may be too big to even store B(X)
– E.g. X is continuous
– E.g. X is a real world map

Solution: approximate inference
– Track samples of X, not all values
– Samples are called particles
– Time per step is linear in the number of samples
– But: the number needed may be large
– In memory: a list of particles, not states

This is how robot localization works in practice
Particle Filtering
[Figure: a grid of approximate beliefs, with cell values such as 0.0, 0.1, 0.2, 0.5, estimated from particle density]
Elapse Time
– Each particle is moved by sampling its next position from the transition model, reflecting the transition probs (sketch below)
– Here, most samples move clockwise, but some move in another direction or stay in place
– This captures the passage of time
– If we have enough samples, the result is close to the exact values before and after (consistent)
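A sketch of just this step, assuming each particle is a state and transProb[state] maps next states to probabilities (illustrative names):

import random

def elapseTime(particles, transProb):
    # Move each particle by sampling its next position from the
    # transition model, so the particle set reflects the transition
    # probabilities.
    moved = []
    for p in particles:
        nextStates = list(transProb[p].keys())
        probs = [transProb[p][s] for s in nextStates]
        moved.append(random.choices(nextStates, weights=probs, k=1)[0])
    return moved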
Observe Step
– Slightly trickier: we downweight our samples based on the evidence (sketch below)
– Note that, as before, the probabilities don't sum to one, since most have been downweighted (in fact they sum to an approximation of P(e))
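The corresponding sketch, assuming emitProb[state][evidence] gives the emission probability (illustrative names):

def observe(particles, evidence, emitProb):
    # Downweight each particle by the emission probability of the
    # evidence. The weights need not sum to one; their sum is an
    # approximation of P(e).
    return [(p, emitProb[p][evidence]) for p in particles]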
Resample
– Rather than tracking weighted samples, we resample
– N times, we choose from our weighted sample distribution, i.e. draw with replacement (sketch below)
– This is analogous to renormalizing the distribution
– Now the update is complete for this time step; continue with the next one

Old particles: (3,3) w=0.1, (2,1) w=0.9, (2,1) w=0.9, (3,1) w=0.4, (3,2) w=0.3, (2,2) w=0.4, (1,1) w=0.4, (3,1) w=0.4, (2,1) w=0.9, (3,2) w=0.3
New particles: (2,1) w=1, (2,1) w=1, (2,1) w=1, (3,2) w=1, (2,2) w=1, (2,1) w=1, (1,1) w=1, (3,1) w=1, (2,1) w=1, (1,1) w=1
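A sketch of the resampling step; random.choices draws with replacement, which is exactly the operation described:

import random

def resample(weightedParticles):
    # Draw N new particles with replacement from the weighted sample
    # distribution; this is the renormalization step, and every new
    # particle starts the next time step with weight 1.
    particles = [p for p, w in weightedParticles]
    weights = [w for p, w in weightedParticles]
    return random.choices(particles, weights=weights, k=len(particles))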
Track a Robot!
[Diagram: hidden positions Pos1 → Pos2 with observed wall readings Walls1, Walls2, plus a Start state]
Transition prob: sometimes motors don't work.
Emission prob: a laser sensor senses walls, and sometimes sensors are wrong.
[Animation sequence: Original Particles → Observation → Reweight → Resample + Pass Time → Observation → Reweight → Resample → Pass Time → Observation → Reweight → Resample → Pass Time → Observation → Reweight → Resample → Pass Time]
Awesome Pants!
Analogy One
Analogy Two
Particle Filters
• A particle is a hypothetical instantiation of a variable.
• Store a large number of particles.
• Elapse time by moving each particle given transition probabilities.
• When we get new evidence, we weight each particle and create a new generation.
• The density of particles for any given value is an approximation of the probability that our variable equals that value.
Previously on CS221
In Class Research
Particle Filter
Expectation Maximization
Motivating Example
Particle Filter
Why Care?
Live Research
Some Education Theory
Grit, Tenacity, Perseverance
Mindset
Gender, Ethnicity
[Later builds add the labels "Affect Grades" and "Does Not Affect Grades"]
Story
Research Project
[Bayes net diagram: grades g1, g2, g3 with variables t1-t3, e1-e3, hidden nodes b and i, and a node marked "?"]
Basic Idea
1. Choose the model that most accurately predicts our grades.
2. See where that model predicts a higher grade than was given.
What is the best model?

Leave one out cross validation:

for s in students:
    # Pretend s is not in the class.
    Learn parameters p given students minus s
    # Pretend we don't know the grade of s.
    Compute grade of s given p
    # See how well we did.
    TotalError += error in the predicted grade of s
Predict Misgrades

Leave one out cross validation:

for s in students:
    # Pretend s is not in the class.
    Learn parameters p given students minus s
    # Pretend we don't know the grade of s.
    Compute grade of s given p
    # Did we under-grade?
    Misgrade = predicted grade is higher than the given grade
Novel Science
On worst pset question + Contest
And?
[suspense]
[more suspense]
[how…]
[many…]
[slides…]
[can…]
[he…]
[go…]
Simple Model
Grades produce simple, observable features.
[Bayes net diagram: gi with ei, ti, si]

Description:
• gi: Grade on assn i
• si: Submit time
• ti: Time taken
• ei: Enjoyment
My Model
Students have grit, assignments have difficulty.
[Bayes net diagram: gi with di, ei, ti, and grit]

Description:
• gi: Grade on assn i
• di: Difficulty of assn i
• ti: Time taken
• ei: Enjoyment
• grit: Overcoming ability
Creative Model
We can predict grades based on the grades of collaborators
Collaborators?
[Diagram: arrows between collaborator grades gb and gg]
Error: cycle in the Bayes net!

Collaborators?
[Diagram: a shared node cbg pointing to both gb and gg]
Warning: large tables!
Creative Model
We can predict grades based on the grades of collaborators.
[Bayes net diagram: gi with ci, mi, ei, o, si, ti]

Description:
• gi: Grade on assn i
• si: First submit time
• ci: Partner score
• ei: Enjoyment
• ti: Time
• o: Did optional?
• mi: Motivation
Real World Problem →(Model the problem)→ Formal Problem (a Parameterized Model) →(Learning Algorithm: SMILE Solver, then Inference Algorithm)→ Solution →(Evaluate)
Simple Model
Grades produce simple, observable features.
[Bayes net diagram: gi with ei, ti, si]

Description:
• gi: Grade on assn i
• si: Submit time
• ti: Time taken
• ei: Enjoyment
My Model
Students have grit, assignments have difficulty.
[Bayes net diagram: gi with di, ei, ti, and grit]

Description:
• gi: Grade on assn i
• di: Difficulty of assn i
• ti: Time taken
• ei: Enjoyment
• grit: Overcoming ability
Creative Model
We can predict grades based on the grades of collaborators.
[Bayes net diagram: gi with ci, mi, ei, o, si, ti]

Description:
• gi: Grade on assn i
• si: First submit time
• ci: Partner score
• ei: Enjoyment
• ti: Time
• o: Did optional?
• mi: Motivation
Real World Problem →(Model the problem)→ Formal Problem (a Parameterized Model) →(Learning Algorithm: SMILE Solver, then Inference Algorithm)→ Solution →(Evaluate)
SMILE Solver
Structural Modeling, Inference, and Learning Engine
Real World Problem →(Model the problem)→ Formal Problem (a Parameterized Model) →(Learning Algorithm: SMILE Solver, then Inference Algorithm)→ Solution →(Evaluate)
Simple Model
Grades produce simple, observable features.
[Bayes net diagram: gi with ei, ti, si]

Description:
• gi: Grade on assn i
• si: Submit time
• ti: Time taken
• ei: Enjoyment
Let's Program!
Learn with Hidden Vars
[Diagram, repeated across four builds: a chicken-and-egg loop between the conditional probability tables and the values of the hidden variables]
Expectation Maximization

# Expectation Maximization
# ------------------------
# Solve the chicken and egg problem of guessing
# model params and assignments to unobserved vars
def expectationMaximization(observed):
    # Random Initialization: start with a random assignment.
    params = getRandomParams()
    # EM Loop: until your params stop changing (Convergence).
    while not hasConverged():
        # Expectation Step: estimate unobserved variables, given params.
        unobserved = getBestAssn(observed, params)
        # Maximization Step: estimate params, given unobserved.
        params = getParams(observed, unobserved)
    # Return: you have calculated unobserved and params.
    return params
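The deck's version leaves the helpers abstract; here is a self-contained toy instance you could actually run, the classic two-coin mixture rather than the grades model, with invented data:

# EM on a toy problem: two coins with unknown head probabilities;
# each data point is the number of heads in 10 flips of one coin,
# but we don't observe which coin was flipped.
import math, random

FLIPS = 10

def likelihood(heads, p):
    # Binomial likelihood of `heads` heads in FLIPS flips with bias p.
    return math.comb(FLIPS, heads) * p**heads * (1 - p)**(FLIPS - heads)

def em(data, iters=50):
    pA, pB = random.random(), random.random()  # random initialization
    for _ in range(iters):
        # E-step: posterior probability each trial came from coin A
        # (assuming each trial picks either coin with equal probability).
        weights = []
        for h in data:
            la, lb = likelihood(h, pA), likelihood(h, pB)
            weights.append(la / (la + lb))
        # M-step: re-estimate biases from the weighted counts.
        headsA = sum(w * h for w, h in zip(weights, data))
        flipsA = sum(w * FLIPS for w in weights)
        headsB = sum((1 - w) * h for w, h in zip(weights, data))
        flipsB = sum((1 - w) * FLIPS for w in weights)
        pA, pB = headsA / flipsA, headsB / flipsB
    return pA, pB

# Example: one coin near 0.8, one near 0.3 (invented data).
data = [8, 9, 7, 8, 3, 2, 4, 3, 8, 2]
print(em(data))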
Experimental Bias?
Blind grading. Other motivations.
Results

Algorithm      Grade Fixes
Random         0 / 10
Naïve Bayes    3 / 10
Mine           3 / 10
Creative       4 / 10

Note: The fixed grades were not mutually exclusive. Overall, 6 homeworks had grading errors alleviated!
Creative Model
We can predict grades based on the grades of collaborators.
[Bayes net diagram: gi with ci, mi, ei, o, si, ti]

Description:
• gi: Grade on assn i
• si: First submit time
• ci: Partner score
• ei: Enjoyment
• ti: Time
• o: Did optional?
• mi: Motivation
What does it mean?
Sharma Algorithm?
First Step
[Plot: axes labeled "Better models" and "Better features"]

Birth of a Research Problem
[Same plot, annotated with "Algorithms from part 3 of the class"]
Game Theoretic?
Ethical?
Where We Are
Machine Learning / Search / Variable Based
[Shown four times, highlighting each part of the course in turn]
Midpoint
What you should know
• CSP
  – Formalization
• Probability
  – Joint distribution
• Bayes Nets
  – Formalization
  – Exact Inference
• Temporal Models
  – Formalization
  – Particle Filters
• Parameter Learning
  – Fully observed
Clarify exactly what you need to know