
RIPS RoboCup@home 2013, The Netherlands


SitLog

SitLog** is a programming language and environment for the specification and interpretation of service robots' tasks. All the RoboCup@Home tests in Golem are programmed in SitLog. The computational mechanism consists of two interpreters working in tandem: one interprets the task's structure, which is represented as a Functional Recursive Transition Network (F-RTN); the other interprets the content and control information associated with the nodes of the F-RTN, which stand for the task's situations. Together, the two interpreters allow applications to be defined at a highly abstract level, in a declarative and compact form. Dialogue Models (DMs) are represented through this formalism. There are two main kinds of DMs: those representing the structure of the task (e.g. RoboCup@Home's tests) and those representing task-independent generic behaviors (see, grasp, follow, find, etc.).
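As a rough illustration of the two-interpreter idea, the sketch below walks a small recursive transition network in Python. It is not SitLog syntax and not the Golem implementation: the dictionary layout, the `run` function, the toy `find` and `scan` models, their wiring, and the `world` dictionary are all assumptions made up for this example. Only the situation types (neutral, recursive, final) and a few labels are borrowed from this poster.

```python
from typing import Dict, List

def look(world: dict) -> str:
    """Hypothetical content-level action: is the object at the current place?"""
    return "found" if world["here"] == world["object_at"] else "not_found"

def move_to_next(world: dict) -> str:
    """Hypothetical content-level action: go to the next unexplored place."""
    if world["pending"]:
        world["here"] = world["pending"].pop(0)
        return "move_success"
    return "move_error"

# Assumed layout: each model has an initial situation and a table of situations.
MODELS: Dict[str, dict] = {
    "find": {
        "initial": "scan",
        "situations": {
            # A recursive situation embeds another model; its final state picks the arc.
            "scan": {"type": "recursive", "embedded": "scan",
                     "arcs": {"fs_found": "found", "fs_error": "move"}},
            "move": {"type": "neutral", "action": move_to_next,
                     "arcs": {"move_success": "scan", "move_error": "not_found"}},
            "found": {"type": "final"},
            "not_found": {"type": "final"},
        },
    },
    "scan": {
        "initial": "search",
        "situations": {
            "search": {"type": "neutral", "action": look,
                       "arcs": {"found": "fs_found", "not_found": "fs_error"}},
            "fs_found": {"type": "final"},
            "fs_error": {"type": "final"},
        },
    },
}

def run(models: Dict[str, dict], name: str, world: dict, trace: List[str]) -> str:
    """Interpret model `name` and return the id of the final situation reached."""
    model = models[name]
    current = model["initial"]
    while True:
        situation = model["situations"][current]
        trace.append(f"{name}.{current}")
        if situation["type"] == "final":
            return current
        if situation["type"] == "recursive":
            # Structure level: descend into the embedded model.
            label = run(models, situation["embedded"], world, trace)
        else:
            # Content level: evaluate the situation's action against the world.
            label = situation["action"](world)
        current = situation["arcs"][label]

world = {"here": "door", "pending": ["table", "shelf"], "object_at": "shelf"}
trace: List[str] = []
result = run(MODELS, "find", world, trace)
print(result, trace)
```

The generic `scan` model is reused by `find` through a recursive situation, which mirrors the split between task-structure models and task-independent behaviors described above.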

[Figures: Dialogue Model; Hierarchy of Behaviors]

Golem-II+ is the latest service robot developed by the Golem Group. We design and construct domain-independent service robots. Our developments are based on a theory of Human-Robot Communication centered on the specification of protocols that represent the structure of service robots' tasks; these protocols are called Dialogue Models (DMs).

GOLEM-II+

Computer Science Department (Departamento de Ciencias de la Computación)
Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas
Universidad Nacional Autónoma de México

The Golem Group: Luis Pineda (team leader), Ivan Meza, Caleb Rascón, Gibran Fuentes, Mario Peña, Lisset Salinas, Arturo Rodríguez, Mauricio Reyes, Hernando Ortega, Joel Durán, Varinia Estrada

http://golem.iimas.unam.mx
[email protected]

INTERACTION-ORIENTED COGNITIVE ARCHITECTURE

The main communication cycle involves perceptual interpretation, DMs and intentional action, and subsumes autonomous systems, which deal with reactive behavior.

[Figure: IOCA diagram. Components: Environment, Recognition, Rendering, Perceptual Interpreter, Action Specification, Autonomous Reactive Systems, Coordinator, SitLog, Dialogue Models]

Within the present framework we have developed IOCA. This architecture has three main layers:

- TOP: Expectation / Action-Selection
- MIDDLE: Interpretation / Action-Specification
- BOTTOM: Recognition / Rendering
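One way to picture a single pass through the three layers is the toy pipeline below. It is only a sketch under stated assumptions: every function name, the `EXPECTATIONS` table, the keyword matching, and the action strings are hypothetical and do not come from IOCA itself; they simply give each layer named above one input-side or output-side step.

```python
from typing import Dict, List, Optional

# Hypothetical expectations of one situation: interpretation -> selected action.
EXPECTATIONS: Dict[str, str] = {
    "request(juice)": "fetch(juice)",
    "greeting": "greet",
}

def recognize(signal: str) -> str:
    """BOTTOM, input side (Recognition): normalize the raw external signal."""
    return signal.strip().lower()

def interpret(words: str, expected: Dict[str, str]) -> Optional[str]:
    """MIDDLE, input side (Interpretation): read the input against the expectations."""
    if "juice" in words and "request(juice)" in expected:
        return "request(juice)"
    if words.startswith("hi") and "greeting" in expected:
        return "greeting"
    return None  # nothing that was expected has been recognized

def select_action(interpretation: Optional[str], expected: Dict[str, str]) -> str:
    """TOP (Expectation / Action-Selection): pick the action tied to the met expectation."""
    if interpretation is None:
        return "ask_again"
    return expected[interpretation]

def specify(action: str) -> List[str]:
    """MIDDLE, output side (Action-Specification): expand into basic actions."""
    plans = {
        "fetch(juice)": ["say:ok", "navigate:kitchen", "grasp:juice"],
        "greet": ["say:hello"],
        "ask_again": ["say:sorry"],
    }
    return plans[action]

def render(basic_actions: List[str]) -> List[str]:
    """BOTTOM, output side (Rendering): hand each basic action to a device (here, a string)."""
    return [f"render<{a}>" for a in basic_actions]

def cycle(signal: str) -> List[str]:
    """One communication cycle: up through the layers, then back down."""
    interpretation = interpret(recognize(signal), EXPECTATIONS)
    return render(specify(select_action(interpretation, EXPECTATIONS)))

print(cycle("Bring me the juice"))
```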

VISUAL OBJECT RECOGNITION AND GRASPING

The Golem Project uses the Multiple Object Pose Estimation and Detection (MOPED) algorithm and framework*. Golem is equipped with two in-house developed arms with 4 degrees of freedom. The vision algorithm provides the parameters h, a and b, representing the distance, height and depth between the robot's eye and the object, and a triangle composed of b and the two segments of the arm (l1 and l2) is defined. The elbow's position lies at the intersection of two circles (c1 and c2) and is computed through diagrammatic reasoning, which determines the angles α and β at the shoulder and the wrist. This process corresponds to Golem's gross grasping plan.

Once the arm approaches the object, it searches for the object reactively, using three infrared sensors to refine the target's position. This strategy compensates dynamically for vision and mechanical errors.
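The triangle construction can be sketched numerically as follows. This is a planar simplification under explicit assumptions, not the robot's code: the shoulder is placed at the origin, the target at distance b along the x-axis, c1 is taken to be the circle of radius l1 around the shoulder, and c2 the circle of radius l2 around the target. The function name and the sample lengths are hypothetical.

```python
import math
from typing import Tuple

def gross_grasp_plan(b: float, l1: float, l2: float) -> Tuple[Tuple[float, float], float, float]:
    """Return the elbow position and the angles alpha and beta (radians).

    Assumed frame: shoulder at (0, 0), target at (b, 0).
    alpha is the angle between b and l1, beta the angle between b and l2.
    """
    if b <= 0 or not (abs(l1 - l2) <= b <= l1 + l2):
        raise ValueError("target out of reach: circles c1 and c2 do not intersect")
    # Subtracting the two circle equations gives the x-coordinate of the intersection.
    x = (b * b + l1 * l1 - l2 * l2) / (2.0 * b)
    # Take the upper of the two intersection points as the elbow.
    y = math.sqrt(max(l1 * l1 - x * x, 0.0))
    alpha = math.atan2(y, x)      # angle at the shoulder end of b
    beta = math.atan2(y, b - x)   # angle at the target end of b
    return (x, y), alpha, beta

elbow, alpha, beta = gross_grasp_plan(b=5.0, l1=3.0, l2=4.0)
print(elbow, math.degrees(alpha), math.degrees(beta))
```

With segment lengths 3 and 4 and a target 5 units away, the arm forms a right triangle, so α and β add up to 90 degrees; this makes a convenient sanity check for the formulas.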

[Figure: MOPED's pose estimation]

HARDWARE

- PeopleBot robotic base
- Dell Precision M4600
- LAIDETEC-IIMAS robotic arms x2
- QuickCam Pro 9000 Webcam
- Microsoft Kinect Camera
- Hokuyo UTM-30LX Laser
- Shure Base omnidirectional microphones x3
- RODE VideoMic directional microphone
- M-Audio Fast Track Ultra external audio interface
- Infinity 3.5-Inch Two-Way speakers x2

SOFTWARE

MODULE: SOFTWARE LIBRARIES

- Navigation: Player/Stage and Gearbox
- SitLog: Sicstus Prolog
- Vision: OpenCV, OpenNI, MOPED
- Voice Recognition and Audio Processing: PocketSphinx, JACK
- Voice Synthesizer: Festival TTS
- Object Manipulation: Roboplus
- Language Interpretation: WordSpotting, GF grammar

[Figure: F-RTN diagram of the find behavior. Situations: scan (recursive), scan (neutral), search (neutral), fs_found (final), fs_error (final). Arc labels: move_success, move_error, found, not_found, Pos = [ ], Pos ≠ [ ]]

[Figure: grasping. Panels: Diagrammatic Reasoning (gross planning); Reactive Behaviour (dynamic local adjustment). Labels: a, b, h, l1, l2, α, β, c1, c2]

* Collet, Alvaro; Martinez, Manuel; Srinivasa, Siddhartha S. "The MOPED Framework: Object Recognition and Pose Estimation for Manipulation". The International Journal of Robotics Research, April 2011.

** Pineda, Luis; Salinas, Lisset; Meza, Ivan; Rascón, Caleb; Fuentes, Gibrán. "SitLog: A Programming Language for Service Robots' Tasks". Submitted to International Journal of Advanced Robotic Systems, 2013.

MULTI-DIRECTION OF ARRIVAL

The M-DOA (Multi-Direction Of Arrival)*** algorithm is an autonomous system that allows Golem to locate several sound sources in its surroundings. This functionality can be used to react directly to such stimuli. It can also be embedded in language interaction, making the robot capable of handling single or multiple interruptions while it is engaged in a conversation with multiple partners.

[Figure: sample dialogue. Speech bubbles: "Hi, Golem!", "Bring me the juice", "Sorry! Did you say juice?", "Yes, please.", "Ok. Thanks!"]

[Figure: Multiple Conversational Partners. Speech bubbles: "I want water...", "I want orange juice...", "Could you please talk one at a time?", "I want water", "Ok, water and..."]

[Figure: Interruption Handling]

[Figure: M-DOA processing. Stages: Triangle Array-Redundancy, Single Source Detection Algorithm, DOA Clustering (source 1, source 2, source 3). Annotations: "Sample acceptable: will be processed", "Sample not acceptable: will NOT be processed"]
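The stage names in the M-DOA figure suggest a pipeline in which samples are first accepted or rejected and the accepted directions are then grouped by source. The sketch below illustrates that general idea only; it is not the algorithm of the cited paper. It assumes that each audio window yields one direction estimate per microphone pair, already expressed in degrees in a common frame, and the function names, the tolerance, and the cluster radius are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Sequence

def ang_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def accept(pair_doas: Sequence[float], tol: float = 10.0) -> Optional[float]:
    """Return one direction if all pairwise estimates agree within tol, else None."""
    if any(ang_diff(a, b) > tol for a, b in combinations(pair_doas, 2)):
        return None  # redundant estimates disagree: sample is not processed
    ref = pair_doas[0]
    # Average signed offsets from the first estimate to stay safe across 0/360.
    offsets = [((d - ref + 180.0) % 360.0) - 180.0 for d in pair_doas]
    return (ref + sum(offsets) / len(offsets)) % 360.0

def cluster(doas: List[float], radius: float = 15.0) -> List[List[float]]:
    """Group directions: join the first cluster whose seed is within radius."""
    clusters: List[List[float]] = []
    for doa in doas:
        for members in clusters:
            if ang_diff(doa, members[0]) <= radius:
                members.append(doa)
                break
        else:
            clusters.append([doa])  # no nearby cluster: a new source candidate
    return clusters

# Hypothetical per-window estimates from three microphone pairs.
windows = [(30, 32, 29), (31, 120, 200), (118, 121, 119),
           (28, 30, 31), (240, 238, 241), (119, 122, 120)]
accepted = [d for d in (accept(w) for w in windows) if d is not None]
sources = cluster(accepted)
print(len(sources), sources)
```

Rejecting windows whose redundant estimates disagree is a cheap way to keep only samples that are likely dominated by a single source, so that the clustering step sees cleaner data.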

[Figure: speech bubbles: "I'll be with you in a moment", "...you?", "I want orange juice"]

*** Rascón, Caleb; Pineda, Luis. "Lightweight Multi-direction-of-arrival Estimation on a Mobile Robotic Platform". In Lecture Notes in Engineering and Computer Science: Proceedings of the World Congress on Engineering and Computer Science 2012, Vol. I.