43
Perception, Learning and Control Jitendra Malik UC Berkeley

Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Perception, Learning and Control

Jitendra Malik

UC Berkeley

Page 2: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Bipedalism

Opposable thumb

Tool use

Language

Abstract thinking

Symbolic behavior

Phylogeny of Intelligence

Cambrian Explosion540 million years ago

Variety of life forms,almost all phyla emerge

Animals that couldsee and move

Hominid evolution, last 5 million yearsModern humans, last 50 K years

Anaxogaras: It is because of his being armed with hands that man is the most intelligent animal

Gibson: we see in order to move and we move in order to see

Page 3: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

The evolutionary progression

• Vision and Locomotion

• Manipulation

• Language

Successes in AI seem to follow the same order!

Page 4: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Animal Gaits

-Stephen Cunnane

Page 5: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Computational Gaits

Hildebrand (1965)

Page 6: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Computational Gaits — Central Pattern Generators

Matsuoka, Kiyotoshi.

"Mechanisms of

frequency and

pattern control in the

neural rhythm

generators." Biologic

al cybernetics (1987)

Page 7: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Computational Gaits —Raibert Controller

Raibert, Marc H., et al. Dynamically Stable Legged Locomotion. MIT ARTIFICIAL INTELLIGENCE LAB,

1983.

Page 8: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Spot - Boston Dynamics

Page 9: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Alternating Gait

Crawl Gait

Page 10: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Gaits in Real Life

Page 11: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Cole, Whitney G., Scott R.

Robinson, and Karen E.

Adolph. "Bouts of steps:

The organization of infant

exploration." Development

al psychobiology (2016)

Gaits in Real Life

Page 12: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Problem 1Gaits are not a complete model of all legged

movement

Page 13: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

“We see in order to

move, and we

move in order to see”

— J.J. Gibson

Coupled Perception and Action

Page 14: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Matthis, Jonathan Samir, Jacob L. Yates, and Mary M. Hayhoe. "Gaze and the control of foot placement when walking in natural

Coupled Perception and Action

Page 15: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Weakly Coupled Vision

Spot — Boston Dynamics Laikago — Unitree Robotics

Page 16: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Spot - Boston Dynamics

Problem 2Vision is missing, or only weakly coupled

Page 17: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Spot - Boston Dynamics

Our Axioms

1. Solve general legged locomotion (no explicit gait models).

2. Use vision from ground up.

3. Use learning as much as possible.

Page 18: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Visual Control of Legged Locomotion

Ashish KumarDeepak Pathak Stuart AndersonJitendra Malik

Page 19: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Uneven Terrain with Depth

• Trained with PPO, small CNN (~4ms on a GPU)

• Rewards contain energy penalties inspired from Biomechanics

Page 20: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Uneven Terrain with Egocentric Depth

• Testing on Medium Difficulty

Page 21: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Uneven Terrain with Egocentric Depth

• Testing on High Difficulty

Page 22: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Latency in Vision

1. High Level footstep (low freq) planner and low level stepping

function (high freq) — Train HL followed by LL / train LL

followed by HL

2. Train with updating the vision features with less frequency

Page 23: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Animal Navigation: Landmarks and Maps

Niko Tinbergen (1951)

Page 24: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Learn Skills that Enable a Robot to Move Around in Novel Environments

Robot w/camera In novel environment

Go around obstacles

Come out of cubicles Go down hallwayGo through door

Page 25: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

The classical robotics solution is SLAM(Simultaneous Localization and Mapping)

Video Credits: Mur-Artal et al., Palmieri et al.

Page 26: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Mapping

Observed Images3D Reconstruction

Path PlanningLocalization usingcurrent image

Classical Mapping and Planning

Execute Action

3D Reconstruction

Policy Execution

Page 27: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

▪ Unnecessary:

▪ Precise reconstruction of everything is not necessary.▪ Precise localization may also not be necessary.

▪ Insufficient: ▪ Only geometry, no semantics.▪ Nothing is known till it is explicitly observed, failure to exploit

experience with similar past environments.▪ Not robust to changes in the environment.▪ No way to encode semantic primitives (go down the hallway, go

through the doorway).

Observed Images

3D Reconstruction

Mapping

Classical Mapping and Planning

Path Planning

Localization using

current image

Execut

Action

Page 28: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Cognitive Mapping & Planning –Gupta et al (2017)

Page 29: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Visual Memory for Robust Path FollowingKumar, Gupta, Fouhey, Levine & Malik (2018)

Page 30: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

30

Perception and Interaction Language

Turing (1950)Computing MachineryAnd Intelligence

Page 31: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

The Development of Embodied Cognition: Six Lessons from BabiesLinda Smith & Michael Gasser

Page 32: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

The Six Lessons

• Be multi-modal

• Be incremental

• Be physical

• Explore

• Be social

• Use language

Page 33: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Model-free Reinforcement Learning has very high sample eomplexity

OpenAI et al, Learning Dexterous In-Hand Manipulation, arXiv 2018

OpenAI l l R b k’ b w R b H arXiv 2019

Page 34: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Learning by Imitating OthersReinforcement Learning of Skills from Videos : Peng, Kanazawa, Malik, Abbeel and Levine

Page 35: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Human-Object Interactions in the WildZhe Cao, Ilija Radosavovic, Angjoo Kanazawa, Jitendra Malik

Page 36: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Eye Movements while making a cup of tea

Land, Mennie and Rusted (1999)

Page 37: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed
Page 38: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed
Page 39: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed
Page 40: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

AI systems need to build “mental models”

If the organism carries a `small-scale model’ of external reality and of its own possible actions within its head, it is able to try out various alternatives, conclude which is the best of them, react to future situations before they arise, utilize the knowledge of past events in dealing with the present and the future, and in every way to react in a much fuller, safer, and more competent manner to the emergencies which face it (Craik, 1943,Ch. 5, p.61)

Commonsense is not just facts, it is a collection of models

Page 41: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Learning 3D Human Dynamics from Video

Angjoo Kanazawa*, Jason Zhang*, Panna Felsen*, Jitendra Malik

CVPR 2019

* Equal contribution

Page 42: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed

Auto-regressive prediction of 3D motion from video

Input

Video

Ground Truth

Video

Predicted Future

Different

Viewpoint

Predicting 3D Human Dynamics from Video, Zhang, Felsen, Kanazawa, Malik(ICCV 2019)

Page 43: Perception, Learning and Control...The classical robotics solution is SLAM (Simultaneous Localization and Mapping) Video Credits: Mur-Artal et al., Palmieri et al. Mapping Observed