Interactive Reinforcement Learning Human Generated Reward Presentation for Summer Camp 2015 May 25 2015

Source: korymathewson.com/assets/Summer-Camp-2015-Presentation.pdf


Interactive Reinforcement Learning

Human Generated Reward

Presentation for Summer Camp 2015 May 25 2015


Reinforcement Learning

• Trial and error learning

• Explore and exploit

• Represent, predict and control

• Connect actions with rewards

• Maximize future reward

Sutton and Barto 1988
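
The bullets above can be made concrete with a small sketch. The example below is an illustration, not taken from the slides: it assumes a hypothetical five-state corridor in which the agent starts at the left end and receives a reward of 1 only on reaching the right end. It uses tabular Q-learning with epsilon-greedy action selection, one common way to combine trial and error, exploration and exploitation, and maximization of discounted future reward. The names `run_q_learning` and `greedy_policy` and all parameter values are assumptions chosen for illustration.

```python
import random


def run_q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
                   epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1; actions are 0 (left) and 1 (right).
    Reaching the rightmost state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            # Explore with probability epsilon, otherwise exploit
            # the current estimates (ties broken at random).
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in (0, 1) if q[state][a] == best])
            # Deterministic move; the left wall keeps the agent at 0.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Connect the action with its reward plus the discounted
            # value of the best action available afterwards.
            if next_state == goal:
                target = reward
            else:
                target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == goal:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(run_q_learning())` should yield action 1 (right) for every non-terminal state, since moving right is the only way to reach the reward in this toy setting.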


Interactive Machine Learning

Fails and Olsen Jr. 2003


Human Generated Reward

• Humans know more!

• Shaping systems to adapt

• Effectively reward learning

• Transfer learning through collaboration

• How can RL harness human reward?

Knox and Stone 2012
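
One possible answer to the closing question is sketched below. It is a simplified illustration, not a reproduction of the approach in the cited work: the agent keeps a running estimate of the reward a trainer gives for each state-action pair and acts greedily on that estimate, so the only learning signal is human feedback. The human is replaced by a hypothetical `simulated_trainer` function that approves of moves toward the goal; the corridor environment, function names, and parameter values are all assumptions.

```python
import random


def simulated_trainer(state, action, goal):
    """Hypothetical stand-in for a human trainer.

    Gives +1 when the agent moves right (toward the goal), else -1.
    """
    return 1.0 if action == 1 and state < goal else -1.0


def learn_from_human_reward(n_states=5, steps=300, alpha=0.3,
                            epsilon=0.1, seed=0):
    """Learn a table h[state][action] that estimates trainer feedback."""
    rng = random.Random(seed)
    goal = n_states - 1
    h = [[0.0, 0.0] for _ in range(n_states)]
    state = 0
    for _ in range(steps):
        # Mostly pick the action the trainer is predicted to like best.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            best = max(h[state])
            action = rng.choice(
                [a for a in (0, 1) if h[state][a] == best])
        # The trainer's feedback is the only reward signal used.
        feedback = simulated_trainer(state, action, goal)
        h[state][action] += alpha * (feedback - h[state][action])
        # Move in the corridor; restart from the left on reaching the goal.
        state = max(0, state - 1) if action == 0 else state + 1
        if state == goal:
            state = 0
    return h
```

In this sketch the agent never sees an environment reward. A real trainer would be noisier and slower than the simulated one, which is part of what makes the question on the slide hard.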


Learning from Advice

Kuhlmann et al. 2004

Learning from Shaping

Blumberg et al. 2002

Thomaz et al. 2006

Learning from Demonstration

Left: Argall et al. 2010. Right: Koenemann et al. 2014.


Learning from Trial and Error

Levine et al. 2015

Learning from Refinement

Cakmak et al. 2012


Application

• Shared control

• Augmented representation

• Integrate human and non-human interaction

• Autonomous prosthetics