Human-aware Robotics
CSE 591: Human-aware Robotics
Instructor: Dr. Yu (“Tony”) Zhang
Location & Times: CAVC 359, Tue/Thu, 9:00--10:15 AM
Office Hours: BYENG 558, Tue/Thu, 10:30--11:30 AM
Oct 6/Nov 1, 2016
This set of slides borrows from various online sources; it is used for educational purposes only.
Slides adapted from Pieter Abbeel (UC Berkeley)
Modeling of Humans
Behavior model
Modeling of Humans
Behavior model
➢ Goal and intent selection
[Figure: scene with candidate locations labeled "Goal"]
Modeling of Humans
Behavior model
• Goal and intent selection
➢ Plan selection
[Figure: scene with candidate locations labeled "Goal" and a "river"]
Modeling of Humans
Behavior model
• Goal and intent selection
• Plan selection (informed by the human's capabilities and influenced by mental states, etc.)
[Figure: scene with candidate locations labeled "Goal" and a "river"]
Modeling of Humans
Behavior model
• Goal and intent selection
• Plan selection (informed by the human's capabilities and influenced by mental states, etc.)
➢ Goal/plan recognition should be informed by the behavior model
➢ How should we learn a behavior model?
Outline
Behavior modeling
➢ Capability model
Goal preference
➢ Inverse RL
• Why IRL
• Inverse RL vs. behavioral cloning
• Mathematical formulation of IRL
• Applications
Modeling of Humans
[Diagram labels: Human teammate, Observations, Human modeling, Human models, Robot models, Human-aware planner, Plan generation]
1. No pre-specified goals/plans
2. Incomplete observations
Learning Challenges
Behavior model
[Figure labels: Complete observations; Actual observations; Observations (partial) with indefinite gaps]
1. No pre-specified goals/plans
2. Incomplete & noisy observations
Capability
We start with an incomplete representation
->: denotes an atomic state change
{has_water(AG), has_coffee_beans(AG)} -> {has_boiling_water(AG), has_coffee_beans(AG)} -> {has_boiling_water(AG), has_ground_coffee_beans(AG)} -> {has_coffee(AG)}
▪ DEFINITION (CAPABILITY). Given an agent, a capability is a mapping from a pair of partial states to a probability: an assertion about the probability that there exists a plan of at most T atomic state changes that connects the two states.
T: bound on the gaps between observations
When T = 2:
has_water(AG) => has_ground_coffee_beans(AG)
has_boiling_water(AG) => has_coffee(AG)
…
When T = 3:
… (including all capabilities when T = 2)
has_water(AG) => has_coffee(AG)
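The T = 2 and T = 3 examples can be reproduced with a small Python sketch. It only enumerates which capabilities are witnessed by the single plan trace on the slide; it does not estimate their probabilities. Restricting partial states to single fluents, and the names `TRACE` and `witnessed_capabilities`, are assumptions made for illustration.

```python
from itertools import product

# The plan trace from the slide; each state is a set of fluents.
TRACE = [
    {"has_water(AG)", "has_coffee_beans(AG)"},
    {"has_boiling_water(AG)", "has_coffee_beans(AG)"},
    {"has_boiling_water(AG)", "has_ground_coffee_beans(AG)"},
    {"has_coffee(AG)"},
]


def witnessed_capabilities(trace, T):
    """Return pairs (f_init, f_event) of single fluents such that f_event
    holds at most T atomic state changes after a state where f_init holds.

    Assumption: partial states are restricted to one fluent each.
    """
    caps = set()
    for i, s_init in enumerate(trace):
        last = min(i + T, len(trace) - 1)
        for j in range(i + 1, last + 1):
            for f_init, f_event in product(s_init, trace[j]):
                if f_init != f_event:
                    caps.add((f_init, f_event))
    return caps


caps_2 = witnessed_capabilities(TRACE, T=2)
caps_3 = witnessed_capabilities(TRACE, T=3)
```

Here `caps_3` contains every pair in `caps_2`, matching the note that T = 3 includes all capabilities for T = 2, and has_water(AG) => has_coffee(AG) appears only once T reaches 3.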
Capability Model
Capability model encodes all capabilities for a given T
T-gap capability model
[Figure labels: Synchronic links; Diachronic links]
Capability Model
Capability Model & Encoded Capabilities
A capability model encodes the following distributions:
T-gap capability model: joint distribution over T
A capability: s_I => s_E, a conditional probability (specified by a partial initial and eventual state)
Learning Capability Models
Learning from (gap-bounded) plan traces
▪ Learning model structure
  Causal relationships (diachronic links); variable correlations (synchronic links)
▪ Learning model parameters
  Conditional probabilities
Parameter Learning
Learning from incomplete traces
Parameter Learning
We assume that the maximum number of missing state observations between any two consecutive observations in the partial plan trace is upper bounded by T
DEFINITION (T-GAP PARTIAL PLAN TRACE). A T-gap partial plan trace is a partial plan trace in which all k_i (i = 1, 2, …) <= T
Learning samples
Apply Bayesian learning (assuming beta distributions):
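The slide's update equation did not survive extraction. As a minimal sketch of Bayesian learning with a beta distribution, the Python below uses the standard conjugate Beta-Bernoulli update for a single conditional probability. The class name `BetaEstimate`, the uniform Beta(1, 1) prior, and the three sample outcomes are assumptions for illustration, not values from the slides.

```python
from dataclasses import dataclass


@dataclass
class BetaEstimate:
    """Beta(alpha, beta) belief over one conditional probability, e.g.
    P(eventual fluent holds | initial fluent holds) within T steps."""

    alpha: float = 1.0  # assumed uniform prior: one pseudo-count of "holds"
    beta: float = 1.0   # and one pseudo-count of "does not hold"

    def update(self, holds: bool) -> None:
        # Conjugate update: each learning sample adds one count.
        if holds:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean, used as the point estimate of the parameter.
        return self.alpha / (self.alpha + self.beta)


# Hypothetical learning samples: did the eventual fluent hold?
estimate = BetaEstimate()
for outcome in (True, True, False):
    estimate.update(outcome)
```

With these three samples the posterior is Beta(3, 2), so `estimate.mean` is 0.6. A real capability model would keep one such estimate per entry of each conditional probability table.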
Using Capability Models
▪ Robot can predict the human's next action outcomes
  State prediction (goal recognition)
  Proactive assistance (to increase goal success probability)
▪ Robot can reason about how likely it is that a task can be achieved by the human
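Both uses can be sketched in Python once capability probabilities are available. This is a crude stand-in for goal recognition and proactive assistance, not the method from the lecture: the probability table, the fluent has_tea(AG), the function names, and the 0.5 threshold are all hypothetical.

```python
# Hypothetical capability probabilities P(eventual | initial) for some T.
# The numbers are made up for illustration; they are not from the slides.
CAPABILITIES = {
    ("has_water(AG)", "has_coffee(AG)"): 0.8,
    ("has_water(AG)", "has_tea(AG)"): 0.3,
}


def predict_eventual_state(initial, capabilities):
    """Return the eventual partial state with the highest capability
    probability from `initial`, or None if no capability starts there."""
    candidates = {e: p for (i, e), p in capabilities.items() if i == initial}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def should_assist(initial, goal, capabilities, threshold=0.5):
    """Offer proactive assistance when the probability that the human can
    reach `goal` from `initial` falls below `threshold` (an assumed cutoff)."""
    return capabilities.get((initial, goal), 0.0) < threshold


predicted = predict_eventual_state("has_water(AG)", CAPABILITIES)
assist_with_tea = should_assist("has_water(AG)", "has_tea(AG)", CAPABILITIES)
```

With this table the predicted eventual state is has_coffee(AG), and assistance is suggested for has_tea(AG) because its probability is below the assumed cutoff.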
Modeling of Humans
[Figure: scene with candidate locations labeled "Goal"]
Safety? Time? Comfort? Waiting time? Speed?
Markov Decision Process
Reward: R(s)
Decaying (discount) factor: γ
Policy: π
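How the three ingredients fit together can be illustrated with a short Python sketch: a state reward R(s), a discount factor, and a deterministic policy on a three-state chain. The chain, its state names, the reward values, and the discount of 0.5 are assumptions chosen so the numbers are easy to check by hand.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite sequence of rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def evaluate_policy(R, next_state, gamma, sweeps=200):
    """Iterative policy evaluation for a deterministic policy in a
    deterministic MDP: V(s) = R(s) + gamma * V(next_state[s])."""
    V = {s: 0.0 for s in R}
    for _ in range(sweeps):
        V = {s: R[s] + gamma * V[next_state[s]] for s in R}
    return V


# Hypothetical three-state chain: the policy walks toward "goal" and stays.
R = {"start": 0.0, "middle": 0.0, "goal": 1.0}
next_state = {"start": "middle", "middle": "goal", "goal": "goal"}

V = evaluate_policy(R, next_state, gamma=0.5)
first_three_steps = discounted_return([0.0, 0.0, 1.0], gamma=0.5)
```

Solving V(goal) = 1 + 0.5 V(goal) by hand gives V(goal) = 2, hence V(middle) = 1 and V(start) = 0.5, which the iteration converges to. Reinforcement learning looks for the policy that maximizes these values given R; inverse RL starts from observed behavior and looks for an R that explains it.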
Directly compute a policy!
[Abbeel & Ng, 2004]
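The equations on these slides did not survive extraction. As a hedged illustration of the kind of quantity used in [Abbeel & Ng, 2004], the Python sketch below estimates discounted feature expectations from demonstrations, assuming a reward that is linear in state features, R(s) = w · φ(s). The feature map, the two trajectories, the weights, and the discount of 0.5 are hypothetical.

```python
def feature_expectations(trajectories, phi, gamma):
    """Empirical discounted feature expectations:
    mu = (1/m) * sum_i sum_t gamma**t * phi(s_t^(i))."""
    m = len(trajectories)
    dim = len(phi(trajectories[0][0]))
    mu = [0.0] * dim
    for trajectory in trajectories:
        for t, state in enumerate(trajectory):
            for k, value in enumerate(phi(state)):
                mu[k] += (gamma ** t) * value / m
    return mu


def phi(state):
    # Hypothetical indicator features: [on the road, on the grass].
    return [1.0 if state == "road" else 0.0,
            1.0 if state == "grass" else 0.0]


# Hypothetical expert demonstrations (sequences of visited states).
demos = [["road", "road", "grass"], ["road", "road", "road"]]
mu_expert = feature_expectations(demos, phi, gamma=0.5)

# Under a linear reward, expected discounted reward is w . mu.
w = [1.0, -1.0]  # assumed weights: reward the road, penalize the grass
expert_value = sum(wk * mk for wk, mk in zip(w, mu_expert))
```

For these demonstrations `mu_expert` is [1.625, 0.125] and `expert_value` is 1.5. A learner whose policy matches the expert's feature expectations achieves the same expected reward for any weight vector w, which is why matching them is enough without knowing the expert's reward.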