Fordham University
DigitalResearch@Fordham
Faculty Publications, Robotics and Computer Vision Laboratory
5-2010
A Visual Imagination Approach to Cognitive Robotics
Damian M. Lyons, Fordham University, [email protected]
Sirhan Chaudhry, Fordham University
D. Paul Benjamin, Pace University
Follow this and additional works at: https://fordham.bepress.com/frcv_facultypubs
Part of the Robotics Commons
This Conference Proceeding is brought to you for free and open access by the Robotics and Computer Vision Laboratory at DigitalResearch@Fordham. It has been accepted for inclusion in Faculty Publications by an authorized administrator of [email protected]. For more information, please contact [email protected].
Recommended Citation
Lyons, Damian M.; Chaudhry, Sirhan; and Benjamin, D. Paul, "A Visual Imagination Approach to Cognitive Robotics" (2010). Faculty Publications. 8. https://fordham.bepress.com/frcv_facultypubs/8
A Visual Imagination Approach to Cognitive Robotics*
Damian M. Lyons, Sirhan Chaudhry
Robotics and Computer Vision Laboratory, Dept. of Computer & Information Science
Fordham University, NY 10458
D. Paul Benjamin
School of Computer Science
Pace University, NY
Symposium on Understanding the Mind and Brain
Tucson, Arizona, May 2010
*Supported in part by DOE grant DE-FG02-08CH11542
Fordham Robotics & Computer Vision
Overview of Talk
Introduction & Motivation
Approach: Visual simulation
Method: Match-mediated difference
Method: View & Object Synchronization
Experimental Results
Summary & Conclusions
Motivation: “Cognitive Robotics”
Build robot systems capable of reasoning about all the kinds of complex phenomena that occur in everyday, real-world interactions.
Fordham Urban Search and Rescue Team (FUSAR)
Application: Reconnaissance, Security, Search and Rescue
Unstable terrain – reason about what may happen if terrain is disturbed
Dynamic terrain events – reason about where to go to avoid damage
Implicit collaboration – reason about how to contribute to ongoing task
Explicit collaboration – reason about when help is needed
Specific Issue: Model objects with complex behaviors
Wall collapsing - bricks/debris go where?
Object rebounds off one or more surfaces - where does it end up?
Unstable surface begins to slip - where does it slide to?
Make a judgment about the ‘precariousness’ of an object
Reactive Approach
A behavior-based, reactive approach to visual tracking is possible.
But an interposing wall, another agent, or a collection of objects complicates this.
Need to model the potential behaviors of, and interactions with, other objects/agents to allow prediction.
Cognitive Approach
[Shanahan 2006] has proposed that cognitive functions such as anticipation and planning operate through a process of internal simulation of actions and environment.
Craik (via Péter Érdi): Mind constructs small-scale models of reality to predict events.
The Minimal Subscene
Itti & Arbib (2005) define the minimal subscene as a middle ground between visual attention and language.
Arguably a better place to start for robots than VISIONS STM structure (because it contains ‘verbs’)
Arbib’s ‘Schema Theory’ [Arbib 1981,1998]
Perceptual schemas:
Determine whether the environment will support (afford) the task
Continuously extract parameters for the task
Motor schemas:
Are the control systems that exploit such parameters
Can be coordinated to effect a wide variety of actions
The Minimal Subscene and ‘Visual Imagination’ module
[Diagram labels: Minimal Subscene (current motor and perceptual schemas, other related motor and perceptual schemas); Fusion of Visual Attention; Planning & Learning; Library of Perceptual and Motor Schemas; ‘Visual Imagination’ (Internal Simulation); Planned activities; Visual ‘output’]
Our approach to the Mirror Subsystem
Ongoing work with Benjamin at Pace since 2008
Use a 3D game engine (OGRE) to simulate physics/appearance.
Compare the graphical output of the 3D simulation with the actual video image from the robot camera
Acts as an ‘imagination’ sensor
Comparing Real and Synthetic Video imagery
PROS
Potentially fast (image comparisons)
Doesn’t require visual attention to know anything about the simulation
Interface between schemas and simulation grounded in visual semantics
CONS
Comparing a graphical image and a visual image is much harder than comparing two visual images
Working Example
Predicting, tracking and intercepting a target that undergoes multiple collisions with its environment.
The Minimal Subscene Schema Assemblage
[Diagram labels: Motor Schema InterceptRollingObject; Perceptual Schema Scene Background; Perceptual Schema Rolling Ball; navigation; target]
Arbib’s ‘Schema Theory’ [Arbib 1981,1998]
The Minimal Subscene & the Mirror System
[Diagram labels: Motor Schema InterceptRollingObject; Perceptual Schema Scene Background; Perceptual Schema Rolling Ball; navigation; target; Fusion of Visual Attention; 3D Simulation; 3D Rendering; Camera]
Scene Background Perceptual Schema
[Diagram labels: Camera; Simulation; Synchronization; Match Mediated Difference (MMD); New, Missing, or Unexpected Elements; Perceptual Schema Scene Background; Pr; Ps; Synthetic Image; Visual Image; He; Perceptual Schema New ‘object’; Fusion of Visual Attention]
Rolling Object Perceptual Schema
[Diagram labels: Camera; Simulation; Synchronization; Match Mediated Difference (MMD); Perceptual Schema Rolling Object; Pr; Ps; Synthetic Image; Visual Image; Prediction Request; Motion Correction]
Filling the Scene Background
Our Scene Background
Ogre ‘room’ with floor and walls
Texture map visual image onto surface
Represent robot by simulation camera
Mirror Aspect of Visual Imagination: Scene Background
Need to keep real camera and simulated view (virtual camera) synchronized, so that
Difference operation between views will only yield
unexpected objects or
unexpected motions of expected objects
Mirror Aspect of Visual Imagination: objects
Need to keep simulated objects in synchronization with their observed behavior, so that
Difference operation between views will only yield
unexpected objects or
unexpected motions of expected objects
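Comparing Real & Synthetic Images: The problem
[Figure: real image $I_r$ and synthetic image $I_s$, with the difference image $|I_s - I_r|$ and the warped difference image $|I_s - I'_r|$, where $I'_r = H_e I_r$]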
Match-mediated Difference Mask (MMDM)
$e(p) = |p - m(p)|$
• Place a normalized Gaussian at each point p’ in the set of match points P
$\frac{1}{|P|} \sum_{p' \in P} e^{-\frac{(p - p')^2}{2 v_{p'}}}$
• Define the normalized match quality q(p) to be the inverse of the match error
$q(p) = \frac{1/e(p)}{\sum_{p' \in P} 1/e(p')}$
MMDM (Cont.)
$I_m(p) = \frac{1}{|P|} \sum_{p' \in P} q(p')\, e^{-\frac{(p - p')^2}{2 v_{p'}}}$
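A minimal sketch of how the mask could be computed, assuming the match quality is the inverse match error normalized over all match points and that the mask is the average of quality-weighted Gaussians. The function names, the sample match points and the per-point variances are hypothetical and not taken from the slides.

```python
import math

def match_quality(errors):
    """Normalized inverse match error; every error must be non-zero."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]

def mmd_mask(width, height, points, qualities, variances):
    """Average of quality-weighted Gaussians centred on the match points."""
    mask = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for (px, py), q, v in zip(points, qualities, variances):
                dist_sq = (x - px) ** 2 + (y - py) ** 2
                acc += q * math.exp(-dist_sq / (2.0 * v))
            mask[y][x] = acc / len(points)
    return mask

# Hypothetical data: three match points (x, y) with match errors in pixels.
points = [(2, 2), (7, 2), (4, 6)]
errors = [0.5, 1.0, 2.0]
variances = [2.0, 2.0, 2.0]  # assumed spread of each Gaussian

q = match_quality(errors)
mask = mmd_mask(10, 8, points, q, variances)
```

The mask is largest near the best-matched point and falls towards zero far from any match, which is the property the difference step relies on.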
Match-mediated Difference Image
$I_d(p) = \frac{|I_s(p) - I'_r(p)|}{I_m(p)}$
$I_m(p) = \frac{1}{|P|} \sum_{p' \in P} q(p')\, e^{-\frac{(p - p')^2}{2 v_{p'}}}$
Summary of difference images
Difference image: $|I_s - I_r|$
Warped difference image: $|I_s - I'_r|$
MMDI: $I_d(p) = \frac{|I_s(p) - I'_r(p)|}{I_m(p)}$
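The first two difference images can be illustrated with a small sketch. It assumes grayscale images stored as nested lists and a warp that resamples the real image through the inverse of the error homography with nearest-neighbour lookup. The helper names and the toy images are hypothetical.

```python
def abs_diff(a, b):
    """Per-pixel absolute difference of two equally sized grayscale images."""
    return [[abs(x - y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def warp(image, h_inv):
    """Inverse-mapping warp: output pixel (x, y) takes the input value at
    h_inv * (x, y, 1), nearest neighbour, 0 outside the image."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            w = h_inv[2][0] * x + h_inv[2][1] * y + h_inv[2][2]
            sx = round((h_inv[0][0] * x + h_inv[0][1] * y + h_inv[0][2]) / w)
            sy = round((h_inv[1][0] * x + h_inv[1][1] * y + h_inv[1][2]) / w)
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out

# Toy synthetic image, and a real image showing the same block one pixel right.
synthetic = [[0, 0, 0, 0, 0],
             [0, 9, 9, 0, 0],
             [0, 9, 9, 0, 0],
             [0, 0, 0, 0, 0]]
real = [[0, 0, 0, 0, 0],
        [0, 0, 9, 9, 0],
        [0, 0, 9, 9, 0],
        [0, 0, 0, 0, 0]]

# Assumed inverse of the error homography: sample one pixel to the right.
h_e_inv = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]

plain_difference = abs_diff(synthetic, real)
warped_difference = abs_diff(synthetic, warp(real, h_e_inv))
```

In this toy case the plain difference is non-zero along the edges of the block, while the warped difference vanishes because the only disagreement between the views was the camera offset.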
Object Missing
New Object
Synchronization Method
Projection matrix P = K [R | τ] where
K is the intrinsic matrix, and
[R | τ] is the extrinsic matrix with rotation R and translation τ.
Synthetic matrix Ps = Ks [ I | 0 ]
Real matrix Pr = Kr [R | τ].
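As an illustration of the two projection matrices, the sketch below builds a projection matrix from an intrinsic matrix, a rotation and a translation, and projects one 3D point through both cameras. The intrinsic values, the translation and the test point are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def projection_matrix(k, r, translation):
    """3x4 projection matrix: intrinsic matrix times [rotation | translation]."""
    extrinsic = [list(r[i]) + [translation[i]] for i in range(3)]
    return matmul(k, extrinsic)

def project(p, point):
    """Project a 3D point to pixel coordinates using homogeneous division."""
    xh = list(point) + [1.0]
    u, v, w = (sum(p[i][j] * xh[j] for j in range(4)) for i in range(3))
    return u / w, v / w

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
k = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]

p_s = projection_matrix(k, identity, [0.0, 0.0, 0.0])  # synthetic camera
p_r = projection_matrix(k, identity, [0.5, 0.0, 0.0])  # real camera, offset

point = [1.0, 0.0, 5.0]
pixel_s = project(p_s, point)
pixel_r = project(p_r, point)
```

The same scene point lands on different pixels in the two views, and this disagreement is what synchronization has to remove.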
Synchronization method
Relationship between the error homography He produced by the MMD module and camera projection matrices:
$H_e = K_s \left( R - \frac{\tau\, n^T}{d} \right) K_r^{-1}$
(where τ is the translation, n is the normal to the image plane and d is the depth of the image plane)
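The relationship can be checked numerically with a short sketch. It assumes the standard plane-induced form He = Ks (R - τ nᵀ / d) Kr⁻¹ with zero-skew intrinsic matrices. The function names and all numeric values are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def intrinsic_inverse(k):
    """Closed-form inverse of a zero-skew intrinsic matrix."""
    fx, fy, cx, cy = k[0][0], k[1][1], k[0][2], k[1][2]
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]

def error_homography(k_s, k_r, r, tau, n, d):
    """K_s (R - tau n^T / d) K_r^{-1} for a plane with normal n at depth d."""
    middle = [[r[i][j] - tau[i] * n[j] / d for j in range(3)] for i in range(3)]
    return matmul(matmul(k_s, middle), intrinsic_inverse(k_r))

def rotation_z(theta):
    """Rotation about the optical axis (roll) by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical intrinsics shared by the real and synthetic cameras.
k = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
normal, depth = [0.0, 0.0, 1.0], 5.0

# No rotation and no translation: the homography is the identity.
h_identity = error_homography(k, k, rotation_z(0.0), [0.0] * 3, normal, depth)

# Pure roll of 3.5 degrees: the principal point stays fixed.
h_roll = error_homography(k, k, rotation_z(math.radians(3.5)),
                          [0.0] * 3, normal, depth)
mapped = [sum(h_roll[i][j] * v for j, v in enumerate([320.0, 240.0, 1.0]))
          for i in range(3)]
```

With zero translation the expression reduces to Ks R Kr⁻¹, which is why the rotation can be recovered from the homography when the translation is small.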
Synchronization loop
We use the following formula to iteratively synchronize, assuming that the translation is small*:
$R_s(t+1) = R_s(t)\, g(K_s^{-1} H_e K_r)$
* [Guerrero et al 2005] describe an algorithm to calculate both the rotation R and the translation
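The effect of the gain can be sketched for a single rotation axis. This is a simplification rather than the full update: it assumes a pure roll error, identical intrinsics (so that Ks⁻¹ He Kr is just the relative rotation), a perfect measurement at every step, and a gain that scales the measured angle. The 3.5 degree initial error and the gain of 0.2 are taken from the results slides; the function names are hypothetical.

```python
import math

def rotation_z(theta):
    """Rotation about the optical axis (roll) by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def synchronize_roll(real_roll, sim_roll, gain, steps):
    """Move the simulated roll towards the real roll by a fraction `gain`
    of the measured error at each step; returns the final roll and errors."""
    errors = []
    for _ in range(steps):
        # Relative rotation between the views (assumed measured exactly).
        relative = matmul(rotation_z(real_roll), transpose(rotation_z(sim_roll)))
        error = math.atan2(relative[1][0], relative[0][0])
        errors.append(abs(error))
        sim_roll += gain * error
    return sim_roll, errors

real_roll = math.radians(3.5)
final_roll, roll_errors = synchronize_roll(real_roll, 0.0, gain=0.2, steps=61)
```

Under these assumptions the error shrinks by a factor of (1 - gain) per step, so a larger gain converges in fewer iterations.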
Three points during synchronization
[Figure: rows t=5, t=10, t=20; columns A to E]
Three steps (t=5, 10, 20) during the 20-step synchronization of real and synthetic images. Column (A) real image, (B) synthetic image with corner points, (C) warped image, (D) MMD mask, and (E) (zero) MMD image.
Synchronization with non-zero MMDI
Mirror System:Object Motion
MMDI yields the difference between the actual and predicted position of the target
Given:
Camera angle is known
Frame rate is known
Ground plane assumption (if we don’t have stereo)
We can calculate the force difference
Corrective Force
• Real object at $p_r(t)$
• Simulated object at $p_s(t)$
• Correction:
• Slow down or speed up
• Back towards observed track
$f_{sr}(t) = k\,(p_r(t) - p_s(t))$
• Until $|f_{sr}(t)|$ falls below a threshold
[Diagram labels: $p_r(t)$, $p_s(t)$, $f_{sr}(t)$]
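A minimal sketch of the corrective force, assuming 2D ground-plane positions and a simulated object integrated as a unit point mass with a simple Euler step. The gain, time step, threshold and sample positions are hypothetical, and in the real system the force would be applied through the physics engine.

```python
import math

def corrective_force(p_real, p_sim, k):
    """Force proportional to the offset from simulated to observed position."""
    return tuple(k * (r - s) for r, s in zip(p_real, p_sim))

def needs_correction(force, threshold):
    """True while the force magnitude is still at or above the threshold."""
    return math.hypot(*force) >= threshold

def step_simulated_object(p_sim, v_sim, force, dt, mass=1.0):
    """Apply the force for one time step, then advance the position."""
    v_new = tuple(v + f / mass * dt for v, f in zip(v_sim, force))
    p_new = tuple(p + v * dt for p, v in zip(p_sim, v_new))
    return p_new, v_new

# Hypothetical state: the simulated ball trails the observed ball by 0.5 units.
p_real, v_real = (2.0, 1.0), (1.0, 0.0)
p_sim, v_sim = (1.5, 1.0), (1.0, 0.0)
k, dt, threshold = 4.0, 0.1, 0.05

force = corrective_force(p_real, p_sim, k)
if needs_correction(force, threshold):
    p_sim_next, v_sim_next = step_simulated_object(p_sim, v_sim, force, dt)
p_real_next = tuple(p + v * dt for p, v in zip(p_real, v_real))

error_before = math.dist(p_real, p_sim)
error_after = math.dist(p_real_next, p_sim_next)
```

Here the force speeds the simulated ball up along the observed track, so the gap to the real ball is smaller after one step.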
Synchronizing with object: roll
[Figure panels: Roll – no correction; Roll – with correction]
Synchronizing with object: bounce
[Figure panels: Bounce – no correction; Bounce – with correction]
Summary & Future work
Novel Cognitive Architecture for visual imagination to support prediction of complex behavior
Mirror system to synchronize world and objects: what is the relationship to the neural mechanisms that implement mirror neurons?
Next steps:
Use bounce prediction
Compare with tracking
Real-time implementation
Integration with mapping and navigation
[Diagram labels: Image; Pixels mapped to depth; Terrain Spatiogram (TSG)]
Landmark representation
• Combines appearance and terrain spatial information
• Fast comparison operation
• Robust to occlusion
• Integrate over multiple views
The End
Thank You
Experimental Results
Roll Error during Synchronization
[Chart: roll error (0 to 4) versus iteration step (1 to 61). Chart title lists gain=0.02, 0.05, 0.25, 0.2; legend lists g=0.01, g=0.05, g=0.15, g=0.2]
Graph of the roll error versus iteration step during synchronization for four values of gain, for an initial 3.5 degree error between synthetic and real images
Roll, Pitch and Yaw error
[Charts: three rows of roll, pitch and yaw error versus iteration. Roll error axis 0 to 25; pitch and yaw error axes -2.5 to 0; iteration axes run to 81, 52 and 37 steps]
Graphs of roll (A), pitch (B) and yaw (C) error during synchronization for gain values (top to bottom) of 0.05, 0.1, 0.15, with initial errors of roll 23, yaw 2.9, and pitch 2.9 degrees
Previous Work
Integrating simulation with robot programming or robot design [Albrecht et al 06, Diankov et al 08].
Polybot/Polyscheme [Cassimatis et al. 2004]: planning is viewed as a sequence of mental ‘simulations’ that include physical effects.
Overlaying simulation graphical output on visual imagery [Bejczy et al 95, Burkert et al 04].
Comparing real and synthetic video [Rushmeier et al 95].
Integrating OGRE & SOAR [Benjamin et al 2006]; Match-Mediated Difference operation [Lyons et al 2009].