Learning in Heuristic Search-based Planning...Learning in Search-based Planning Carnegie Mellon...

Learning

Heuristic Search-based Planning

Maxim Likhachev

Associate Professor

Robotics Institute/NREC

Carnegie Mellon University

Search-based Planning Lab (SBPL)

Joint work with Ishani Chatterjee, Ben Cohen, Andrew Dornbush, Victor Hwang,

Venkatraman Narayanan, Michael Phillips, Kalyan Vasudev

Maxim LikhachevMaxim Likhachev

Going into the Real-world

• Robot models and simple world interactions can be pre-encoded

• Planning on those models enables the robots to operate under

benign/narrow conditions right away

• Real-world: real-time + going beyond what’s given

Carnegie Mellon University 2

Waseda/Mitsubishi robot

Learning in Search-based Planning

Speeding up

planning

Learning

cost function

Going beyond

the prior model

Waseda/

Mitsubishi

Re-use of previous results within search (Phillips et al.,’12; Islam et al.,‘18)

Learning heuristic functions (Bhardwaj et al.,’17; Paden & Frazzoli,’17; Thayer et al.,’11)

Learning order of expansions (Choudhary et al.,’17)

Speeding up

planning

Learning

cost function

Going beyond

the prior model

Learning a cost function from demonstrations (Ratliff et al.,’09; Wulfmeier et al.,’17)

Crusher (from Ratliff et a., ‘09 paper)

Speeding up

planning

Learning

cost function

Going beyond

the prior model

Learning additional dimensions to reason over (Phillips et al.,’13)

Combining learned skills and prior model (Vasudev et al., ongoing)

Speeding up

planning

Learning

cost function

Going beyond

the prior model

Waseda/

Mitsubishi

Re-use of previous results within search (Phillips et al.,’12; Islam et al.,’18)

Learning heuristic functions (Bhardwaj et al.,’17; Paden & Frazzoli,’17; Thayer et al.,’11)

Learning order of expansions (Choudhary et al.,’17)

Experience Graphs [Phillips et al., RSS’12]

• Many planning tasks are repetitive

- loading a dishwasher

- opening doors

- moving objects around a warehouse

• Can we re-use prior experience to

accelerate planning, in the context of

search-based planning?

• Especially useful for high-dimensional

problems such as mobile manipulation!

Given a set of previous paths (experiences)…

Put them together into an E-graph (Experience graph)

Given a new planning query…

…would like to re-use E-graph to speed up planning in similar situations

Re-use is via focusing search with a recomputed hε() heuristic function:

General idea:Instead of biasing the search towards the goal, heuristics hε(s) biases it towards a set of paths in Experience Graph

Can be computed via a single Dijkstra’s search on the Experience Graph

heuristics hε(s) is guaranteed to be ε-consistentheuristics hε(s) is guaranteed to be ε-consistent

Theorem 1: Algorithm is complete with respect to the original graph

Theorem 2: The cost of the solution is within a given bound on sub-optimality

Theorem 1: Algorithm is complete with respect to the original graph

Theorem 2: The cost of the solution is within a given bound on sub-optimality

Application of Experience Graphs

• Learning to plan faster from experience and demonstrations

Speeding up

planning

Learning

cost function

Going beyond

the prior model

Learning Additional Dimensions

• Learning Additional Dimensions in the Graph from Demonstrations

[Phillips et al., RSS’13]

Learning Additional Dimensions

• Learning Additional Dimensions in the Graph from Demonstrations

[Phillips et al., RSS’13]

Demonstrations provided in simulation; work by A. Dornbush

Speeding up

planning

Learning

cost function

Going beyond

the prior model

Integrating Learned Skills

• Suppose:– We have a graph G = {S, E} that describes how the robot can move its base/arms

– We have a set of k skills ψi…k that include skills for pushing/pulling doors/drawer

How skills ψi…k should be integrated with G, so that a planner can generate an overall plan?How skills ψi…k should be integrated with G, so that a planner can generate an overall plan?

skills connect disconnected

components in G

We assume ψi: X→{a,X’}, and each X maps onto unique S

A skill could potentially be available at each state S, but depending on data, at some states there is higher

confidence in its success than at others

A skill could potentially be available at each state S, but depending on data, at some states there is higher

confidence in its success than at others

• If confidence is estimated (e.g., via Dropouts [Gal & Ghahramani]), then:– Option 1: cost(s,a’,s’) is inflated proportionally to the estimated confidence

– Option 2: represent the planning problem as POMDP

- planning is exponential in (S, ψi) pairs

- however, there exists a clear preference on the outcomes: it is always preferred

for a skill to be successful at a given S

- planning is exponential in (S, ψi) pairs

- however, there exists a clear preference on the outcomes: it is always preferred

for a skill to be successful at a given S

Planning problem can be decomposed into a series of graph searches using PPCP (Likhachev & Stentz,’09):

- avoids planning in a belief state-space- scales to large-scale problems in real-time- provides rigorous theoretical guarantees

Planning problem can be decomposed into a series of graph searches using PPCP (Likhachev & Stentz,’09):

- avoids planning in a belief state-space- scales to large-scale problems in real-time- provides rigorous theoretical guarantees

Going Forward

• Explore option 2 (POMDP planning with uncertainty due to skills)

• Relax the assumption that each X maps onto unique S

• Apply the framework to few domains including navigation through

crowded areas

Maxim Likhachev 34Carnegie Mellon University

Thanks!

• Students & Staff:

– Ishani Chatterjee

– Ben Cohen

– Andrew Dornbush

– Victor Hwang

– Venkatraman Narayanan

– Michael Phillips

– Kalyan Vasudev

• Funding:

– ARL

– ONR

– Mitsubishi

Learning in Heuristic Search-based Planning...Learning in Search-based Planning Carnegie Mellon...

Documents

Robotic Motion Planning: A* and D* Search - cs.cmu.edu./motionplanning/lecture/AppH-astar-dstar_howie.pdfRobotic Motion Planning: A* and D* Search ... Algorithm The search requires

Artificial Intelligence Areas of AI and Some Dependencies Search Vision Planning Machine Learning Knowledge Representation Logic Expert Systems RoboticsDesigning

Machine Learning and Search -State of Search 2016

Automated (AI) Planning - cvut.czAutomated (AI) Planning Planning by state-space search Introduction Classi cation Progression Regression Search algorithms for planning Uninformed

2015 Motion Planning Graph Search

Academic Career Planning & Job Search Practicalitiescliao/academic_job_search.pdf · Academic Career Planning & Job Search Practicalities. Plan for Luck . Career Planning Mindset

Search and Planning for Inference and Learning in Computer Vision Iasonas Kokkinos, Sinisa Todorovic and Matt (Tianfu) Wu 1

E cient Bayes-Adaptive Reinforcement Learning using Sample ... · E cient Bayes-Adaptive Reinforcement Learning using Sample-Based Search ... Monte-carlo Tree Search, Planning 1

Planning and Search, Model Learning, Dyna Architecture ...ipvs.informatik.uni-stuttgart.de/mlr/wp-content/uploads/2017/07/10... · Reinforcement Learning Model-based RL and Integrated

Planning with Local Search

AUTOMATED PLANNING AND ACTING - Masaryk University · planning byforward search planning byminmax search symbolic model checking techniques determinization techniques Consider one

Planning your job search 2009

Wentworth Learning System Search

Job search Career Planning

1 Chapter 4 State-Space Planning. 2 Motivation Nearly all planning procedures are search procedures Different planning procedures have different search

SLI Learning Search Connect · 2019-01-18 · SLI Learning Search Connect For Magento 2 User Guide v1.3.1 The Learning Search Connect module integrates with SLI Systems’ Search

Planning - NTNUberlin.csie.ntnu.edu.tw/.../AI2003-Lecture08-Planning.pdf · 2003-12-23 · Planning with State-Space Search • Backward state-space search (Regression planning) –

Planning, Execution & Learning 1. Heuristic Search … Execution & Learning: Heuristic 2 Simmons, Veloso : Fall 2001 Heuristic Search Planning • Basic Idea – Automatically Analyze

Learning to Search

State Space Search and Planning