4th International Workshop on Practical Applications of Agents and Multi-Agent Systems
IWPAAMS 2005, 20th to 21st of October





4th International Workshop on Practical Applications of Agents and Multi-Agent Systems - IWPAAMS 2005

Proceedings IWPAAMS 2005

Acknowledgements

The IWPAAMS'2005 is organized in collaboration with:

• UNIVERSIDAD DE LEÓN
• DIPUTACIÓN PROVINCIAL DE LEÓN
• JUNTA DE CASTILLA Y LEÓN
• MINISTERIO DE EDUCACIÓN Y CIENCIA
• CAJA ESPAÑA
• TELEFÓNICA I+D

The organizing committee thanks all of them for their appreciated contribution.

© Universidad de León
Secretariado de Publicaciones

© Los autores

ISBN: 84-9773-222-1
Depósito Legal: LE-1867-2005

Printed by: Universidad de León, Servicio de Imprenta

TABLE OF CONTENTS

Preface ................

Architectures

Construyendo… Javier Bajo, J…
RT-Java para … Martí Navarro…
ComunA: He… Inteligentes. Juan J. Gonzá…
Data mining … Juan M. Herná…
Aplicación de … agente de filtr… María H. Meji…
Sistema de Se… José Luís Poz…

Agent Applications

Predicción de … rial. Fraile Nieto J…
Simulating co… José M. Galá… López-Parede…
Developing a… Daniel Gonzá…
Librería para … Xabiel G. Pa… Isabel Rodríg…
PalliaSys: age… A. Moreno, A…



Agente software para planificación de despliegue y auditorías de seguridad en redes inalámbricas. Manuel Vilas Paz, Isabel Rodríguez, José Ramón Villar Flecha ................ 229

New Tendencies in MAS

Towards a Multi-Agent Software Architecture for Human-Robot Integration. J.L. Blanco, A. Cruz, C. Galindo, J.A. Fernández-Madrigal, J. González ................ 235

Arquitectura Multiagente con Balanceo de Carga Basado en Conocimiento Distribuido. Diego Ceñal, Ángel Alonso Álvarez, José Ramón Villar Flecha ................ 245

Towards a framework for a service-oriented automated negotiation. Manuel Resinas, Pablo Fernández, Rafael Corchuelo

Multiclassifiers: Applications, methods and architectures. Saddys Segrera, María N. Moreno



Towards a Multi-Agent Software Architecture for Human-Robot Integration

Blanco J.L., Cruz A., Galindo C., Fernández-Madrigal J.A., and González J.

E.T.S.I. Informática, Universidad de Málaga, Campus Teatinos, 29071, Málaga, Spain

jlblanco@…uma.es, http://www.isa.uma.es

Abstract. A recent and promising area of research in robotics is aimed at developing robots that share the same physical environment as humans. In this area, the most relevant issue is not the autonomy of the robot but the intelligent interaction or cooperation with humans. Some particular instances are service robots (museum guides, surveillance, etc.) and assistant robots (elderly care, telesurgery, etc.), where the latter usually exhibit closer interaction with humans than the former. In this paper we identify and use multi-agent technology as a suitable approach for developing assistant robot architectures, since it enables a very close interaction -namely, integration- with humans (by considering the human as part of the robot architecture) and it is also capable of learning the best way of coordinating both robot and human skills (through reinforcement learning). We illustrate the approach with our robotic wheelchair SENA, a mobile robot intended for assistance to handicapped persons.

1 Introduction

Robotic applications that interact with humans have recently attracted great attention from the scientific community. One of the reasons is that the presence of a human that can interact with the robot may relax some of the requirements demanded from completely autonomous robots.

Among robots that interact with humans, maybe the most studied ones are service robots (which carry out limited tasks in human-populated scenarios such as museums, hospitals, etc.) and assistant robots (which cannot work without humans at all, for example surgery robots, assistant robots for elderly people, artificial limbs, etc.). Human-robot interaction becomes an important requirement in these applications. They also bear two important characteristics that make them distinctive from others: on the one hand, they impose some critical issues such as operating robustness, physical safety, and human-friendly interaction, which have been coped with recently by the Human Centered Robotics (HCR) paradigm ([11], [13]). On the other hand, the human becomes in fact so integrated into the assistant robot that it is no more merely a part of the environment.

In the literature, human participation in the robotic operation is not a new idea. Some terms have been coined to reflect this, such as cooperation [19], collaboration [5], or supervision ([5], [15]), but all of them deal with the human as being external to the robot. We propose here to take a step further: to consider the human as one of the components of the robotic application. This is specially evident in applications where the human can be considered physically as a part of the robot, such as robotic wheelchairs or artificial limbs. In these situations we should talk of human-robot integration rather



than human-robot interaction. We can obtain the following benefits from such integration:

- Augment the robot capabilities with new skills. Humans can perform actions beyond the robot capabilities, i.e., to open a door, to take an elevator.

- Improve the robot skills. Humans may carry out actions initially assigned to the robot in different and sometimes more dependable ways, or may complete robot actions that have failed.

Our approach uses a multi-agent software robotic architecture, that we call MA²CHRIN (the acronym for "Multi-Agent Architecture for Close Human-Robot Integration"), for such integration. This framework is more appropriate than a classical hybrid layered (deliberative-reactive) architecture [16], since a software agent is the closest we can be to representing the human within the architecture (as a part of it) under the current software technology.

The work presented in this paper is a first step towards a completely integrated human-robot system. At this point, the human is not yet an agent, but his/her capabilities can be distributed inside any agent that needs them for augmenting its skills. Our design has several immediate advantages concerning efficiency in human-robot coordination and at the same time enables a high degree of integration of the human into the architecture. In a future work we will analyze the advantages of considering the human as another, complete agent that communicates with the rest.

The following sections of the paper describe our framework in more detail. Section 2 gives an overview of the proposed multi-agent architecture. The semantics that enables inter-agent communication and intra-agent mental states is described in Section 3. Section 4 is devoted to the learning process that allows each agent to select which internal skill is most appropriate to solve a given action (which includes human skills). Section 5 illustrates our approach with a real assistant robotic application. Finally, some conclusions and future work are outlined.

2 Architecture Overview

MA²CHRIN, which is an evolution of a non-agent-based architecture presented elsewhere [9], is a Multi-Agent System (MAS) composed of a variety of agents. Its main feature is that human skills are integrated into those agents. For instance, the human can help the robot to perceive world information (as we have demonstrated previously in [4]), or provide a plan for achieving a goal (interacting with the planner agent, as we have shown in [8]), or even physically perform actions like opening a door or calling an elevator.

This is supported through the use of the Common Agent Structure (CAS) as the skeleton of each agent (see fig. 1). The abilities of the agent, both human and robotic, are the so-called skill units (see fig. 2). Robotic skill units represent abilities implemented by software algorithms, while human skill units permit the connection between humans and the architecture agents, enabling them to perform human abilities (i.e., open a door) through appropriate interfaces (i.e., voice communication).



Fig. 1. The Common Agent Structure (CAS). All the software agents are designed using this structure. The number of skill units contained in each agent is variable, and it depends on its functionality as well as on the human and robot capabilities that the CAS provides. See text for more detail.

Among the different possibilities for accomplishing its tasks, an agent should be able to learn over time the most convenient one given its past experience and the current conditions of the environment [14]. Within the CAS skeleton, the Smart Selector is in charge of the agent's learning process, and it is active every time the agent operates, in order to adapt and optimize the agent performance under changeable environmental conditions, as commented in section 4.

All skill units considered by the Smart Selector to accomplish a requested action pursue the same goal (i.e., to reach a destination in the case of navigating), but they may exhibit differences in the way they are invoked, especially in the case of human units.

The translation from a generic action requested by an external agent into the skill unit parameters is carried out by the Semantic Bridge. This component also permits the internals of the agent (Smart Selector and Skill Units) to request/provide data to other agents of the architecture.

The Semantic Bridge is also in charge of maintaining the Semantic Knowledge Base (SKB) that represents the internal mental state of the agent in terms of intentions and beliefs. Such an internal mental state is updated by the Semantic Bridge with incoming messages or requests from other agents. Communication between agents relies on the Inter-Agent Interface, which provides communication primitives to send/receive messages following a fixed semantics, as commented in more detail in section 3.
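As a rough illustration of how the CAS pieces just described could fit together in software, the following Python sketch models an agent with interchangeable robotic and human skill units, a Smart Selector that picks among them, and a Semantic Bridge that turns incoming requests into skill-unit invocations. All class and method names here are our own illustrative assumptions, not the actual implementation of the architecture.

# Minimal sketch of the Common Agent Structure; names are illustrative assumptions.
from abc import ABC, abstractmethod

class SkillUnit(ABC):
    """An ability of the agent; it may be robotic (software) or human (via an interface)."""
    @abstractmethod
    def execute(self, action: str, params: dict) -> bool:
        """Try to perform the action; return True on success."""

class RoboticSkillUnit(SkillUnit):
    def execute(self, action, params):
        print(f"[robot] executing {action} with {params}")
        return True        # placeholder for a software algorithm (e.g. a path tracker)

class HumanSkillUnit(SkillUnit):
    def execute(self, action, params):
        # A human unit would ask the user through a suitable interface (voice, joystick, ...)
        answer = input(f"[human] please perform '{action}' ({params}); done? [y/n] ")
        return answer.strip().lower() == "y"

class SmartSelector:
    """Chooses which skill unit should carry out a requested action (see section 4)."""
    def __init__(self, units):
        self.units = units
    def choose(self, action, state):
        return self.units[0]          # placeholder; the learned selection is sketched in section 4

class SemanticBridge:
    """Translates a generic requested action into a skill-unit invocation."""
    def __init__(self, selector):
        self.selector = selector
    def handle_request(self, action, params, state="NAV_OK"):
        unit = self.selector.choose(action, state)
        return unit.execute(action, params)

# Usage: a navigation-like agent with one robotic and one human skill unit.
bridge = SemanticBridge(SmartSelector([RoboticSkillUnit(), HumanSkillUnit()]))
bridge.handle_request("go_to", {"target": "room A"})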



Fig. 2. Internal scheme of two skill units. a) A human skill unit that permits a person to manually guide the robot through a dialogue using appropriate interfaces, i.e., via voice. b) A robotic skill unit that also performs navigation.

3 A Semantics for Agents

In contrast with traditional software architectures, where communication between different pieces of software usually makes intensive use of the client-server paradigm (just using procedural calls, RPC or any other function invocation mechanism), MAS imposes a much richer set of interactions between agents. Thus, if communications are expected to reflect and affect the agents' internal mental states, they should be more than mere raw data, but information about agent attitudes. Hence, the semantics of interactions must be well defined.

Communications in MA²CHRIN are currently based on the most commonly used scheme for MAS, which is message passing. In this approach we use a well-defined, standardized message format, the one proposed by FIPA, namely ACL [7]. Moreover, we have been highly inspired by the CAL specification [6] in order to define the underlying semantics of communicative acts.

Agent mental attitudes in our architecture are defined using these operators:

• Beliefs: the operator Bi φ means that agent i believes that fact φ holds true.

• Intentions: the operator Ii φ means that agent i intends φ to hold true. The semantic meaning of an intention [2] differs from the concept of goal usually used in robotics. Intentions are usually used for invoking the execution of actions.

The set of communicative acts or performatives that we are using in MA²CHRIN, and their associated feasibility preconditions (FP) and rational effects (RE), include:

• Inform: the intention of an agent i of letting another agent j know something that i currently believes. Its formal model results as: {<i, inform(j, φ)>; FP: Bi φ ∧ ¬Bi Bj φ; RE: Bj φ}.




• Request: the sender agent i requests the receiver j to execute a given action a. Its formal model is: {<i, request(j, a)>; FP: FP(a)[i\j] ∧ Bi Agent(j, a) ∧ ¬Bi Ij Done(a)¹; RE: Done(a)}, where we have used the operators: Agent(j, a), meaning that agent j is able to perform action a; Done(a), which means that action a has to be executed; and FP(a)[i\j], representing the part of FP(a) which are mental attitudes of i.

• Agree: the agreement of an agent i to perform a requested action for some other agent j: {<i, agree(j, a)> ≡ <i, inform(j, Ii Done(a))>; FP: Bi α ∧ ¬Bi Bj α; RE: Bj α}, where for clarity, α = Ii Done(a).

• Cancel: informs an agent j that agent i no longer has the intention that agent j performs the action a: {<i, cancel(j, a)> ≡ <i, inform(j, ¬Ii Done(a))>; FP: ¬Ii Done(a) ∧ Bi Bj Ii Done(a); RE: Bj ¬Ii Done(a)}.

• Failure: agent i intended to perform action a while its preconditions were fulfilled, but it was not possible to complete it and currently it intends no longer to try it: {<i, failure(j, a)> ≡ <i, inform(j, Done(e, Feasible(a) ∧ Ii Done(a)) ∧ ¬Done(a) ∧ ¬Ii Done(a))>; FP: Bi α ∧ ¬Bi Bj α; RE: Bj α}, where for clarity, α = Done(e, Feasible(a) ∧ Ii Done(a)) ∧ ¬Done(a) ∧ ¬Ii Done(a), and the operator Feasible(a) means that the preconditions for action a are met at that time.

• Refuse: agent i refuses the request of agent j for performing a, where φ is the reason of the rejection, or true if a reason is not supplied. Another possibility is φ being a textual message in natural language. Its model is: {<i, refuse(j, a, φ)> ≡ <i, inform(j, ¬Feasible(a))>; <i, inform(j, φ ∧ ¬Done(a) ∧ ¬Ii Done(a))>; FP: Bi ¬Feasible(a) ∧ Bi Bj Feasible(a) ∧ Bi α ∧ ¬Bi Bj α; RE: Bj ¬Feasible(a) ∧ Bj α}, where a1; a2 means that both actions are sequenced, and for clarity α = φ ∧ ¬Done(a) ∧ ¬Ii Done(a).

• Inform-ref: this performative contains an expression β whose meaning is not standardized, but it is assumed that the receiver agent should be able to evaluate it through the proper translation in its semantic bridge. Its formal model is: {<i, inform-ref(j, β)> ≡ <i, inform(j, Eval(β))>; FP: Bi Ij Done(<i, inform-ref(j, β)>); RE: Bj Eval(β)}, where Eval(β) represents the expression β evaluated at agent i.

¹ ¬Bi Ij Done(a) is a precondition stating that the requesting agent does not believe that agent j already has the intention of performing a; this condition becomes false after receiving an agree performative.

Please note that while we have made many simplifying assumptions in order to reduce the complexity of standard CAL, our semantics remains very close to the one proposed there and it is self-consistent. In order to transport the messages between agents, the performatives are sent inside an ACL-formatted string, which is coded into an ACL-compliant message in plain text format.
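To make the message-passing layer more concrete, the sketch below builds an ACL-style inform message as plain text and applies the feasibility precondition given above for inform (the sender believes φ and does not believe the receiver already believes it). The field names follow the FIPA ACL message structure, but the belief bookkeeping is a deliberately simplified assumption of ours, not the code of the architecture.

class AgentState:
    def __init__(self, name):
        self.name = name
        self.beliefs = set()        # facts this agent believes               (Bi)
        self.beliefs_about = {}     # what it believes other agents believe   (Bi Bj)

def acl_inform(sender, receiver, fact):
    # FP of inform: Bi fact  and  not Bi Bj fact
    if fact not in sender.beliefs:
        return None
    if fact in sender.beliefs_about.get(receiver.name, set()):
        return None
    msg = '(inform :sender {} :receiver {} :content "{}")'.format(
        sender.name, receiver.name, fact)
    # RE of inform: Bj fact (applied here directly upon delivery)
    receiver.beliefs.add(fact)
    sender.beliefs_about.setdefault(receiver.name, set()).add(fact)
    return msg

navigation, planner = AgentState("navigation"), AgentState("planner")
navigation.beliefs.add("Done(go_to(room A))")
print(acl_inform(navigation, planner, "Done(go_to(room A))"))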

4 Learning Intra-Agent Skill Selection

In our architecture, agents must learn how to select the appropriate skill unit (either human or robotic) in order to carry out the agent actions. We have chosen for that purpose Reinforcement Learning (RL) [10] to be implemented in our Smart Selector, since RL has a thoroughly studied formal foundation and it also seems to obtain good results in mobile robotics tasks ([12], [17]).

In short, in RL an agent in a state s executes some action a, turning its state into s' and getting a reinforcement signal or reward r. Those experience tuples (s, a, s', r) are used for finding a policy π that maximizes some long-run measure of reward.

There is a comprehensive bunch of methodologies for modelling and solving RL problems. A well-known solution are Markov Decision Processes (MDPs), which can be defined by: a set of states S, a set of actions A, a reward or reinforcement function R : S×A → ℝ, and a state transition function T : S×A → Π(S), where Π(S) is a probability distribution over the set S.

Upon these sets and functions, an optimal policy π* that maximizes the obtained rewards can be computed through the definition of the optimal value of a state V*(s), which is the expected reward that the agent will gain if it starts in that state and executes the optimal policy:

V*(s) = max_a [ R(s, a) + γ Σ_{s'∈S} T(s, a, s') V*(s') ],  ∀s ∈ S.        (1)

where parameter γ is a discount factor that represents how much attention is paid to future rewards. The optimal policy is then specified by:

π*(s) = arg max_a [ R(s, a) + γ Σ_{s'∈S} T(s, a, s') V*(s') ].        (2)
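When the model (T and R) is known, equations (1)-(2) can be solved by standard value iteration. The following generic Python sketch is our own illustration; the dictionaries T and R are hypothetical inputs supplied by the caller:

def value_iteration(states, actions, T, R, gamma=0.9, iters=100):
    # T[(s, a)] is a dict {s_next: probability}; R[(s, a)] is a scalar reward.
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        # Bellman backup, as in equation (1)
        V = {s: max(R[(s, a)] + gamma * sum(p * V[s2] for s2, p in T[(s, a)].items())
                    for a in actions)
             for s in states}
    # Greedy policy extraction, as in equation (2)
    policy = {s: max(actions, key=lambda a, s=s: R[(s, a)] + gamma *
                     sum(p * V[s2] for s2, p in T[(s, a)].items()))
              for s in states}
    return V, policy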

Both the reinforcement and state transition functions are named a model. However, such a model is not always known in advance; in fact, most robotics applications cannot provide that prior knowledge. For model-free problems as the one at hand, we can use a widely known approach called Q-learning. It uses the optimal value function Q* instead of V*(s), which can be recursively computed on line by means of:

Q(s, a) = Q(s, a) + α ( r + γ max_{a'} Q(s', a') − Q(s, a) ).        (3)

Parameter α is the learning rate, and it must slowly decrease in order to guarantee convergence of function Q to Q*. Once the optimal Q-function Q* is obtained, the optimal policy can be fixed as it was stated in (2).
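The update rule (3) translates directly into a small tabular Q-learning routine. The sketch below is a generic illustration under our own assumptions (epsilon-greedy exploration, a fixed learning rate, and a caller-supplied env_step function); it is not the exact procedure run inside the Smart Selector.

import random
from collections import defaultdict

def q_learning(env_step, states, actions, episodes=200,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    # env_step(s, a) -> (s_next, reward, done) must be supplied by the caller.
    Q = defaultdict(float)                    # Q[(state, action)], zero-initialised
    for _ in range(episodes):
        s, done = states[0], False            # assumption: states[0] is the start state
        while not done:
            # epsilon-greedy choice among the available skill units (the actions)
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda x: Q[(s, x)])
            s_next, r, done = env_step(s, a)
            # update rule (3)
            best_next = max(Q[(s_next, x)] for x in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q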

For illustrating Q-learning in our Smart Selectors, we focus our attention on the Navigation Agent. This agent is in charge of performing the motion actions of the robot




(in particular, of our robotic wheelchair). There are three skill units available in this agent: a human skill unit that puts the human in charge of movement through a joystick, a path tracker algorithm that follows a previously calculated path, and a robotic reactive skill unit that implements a reactive navigation algorithm [1].

The mathematical formulation of the Q-learning technique for the Navigation Agent is:

- S = {SUCCESS, FAILURE, NAV_OK, NAV_NOK}. SUCCESS means that the robot has reached its objective. FAILURE represents those situations where the vehicle stops outside its goal, due to some unexpected problem. If the robot is travelling normally, it is in the NAV_OK state. However, if difficulties arise during navigation, the robot turns to the NAV_NOK state. Transitions among states are displayed in fig. 4.

- A simply matches the set of skill units.

- The reinforcement function R is considered as a sum of relevant factors, namely: ability for obstacle avoidance, energetic consumption, path-tracking curvature changes, and distance to the goal (a toy sketch of this model is given after fig. 4 below).

Fig. 4. State transition diagram for the Navigation Agent.
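As a purely illustrative toy model (the transition probabilities and reward weights below are invented placeholders, not measured values), the Navigation Agent's learning problem sketched in the list above could be plugged into the earlier q_learning routine as follows:

import random

STATES  = ["NAV_OK", "NAV_NOK", "SUCCESS", "FAILURE"]   # NAV_OK is used as the start state
ACTIONS = ["HUMAN", "ROBOT_DEL", "ROBOT_REA"]           # joystick, path tracker, reactive navigator

def reward(state, clearance, energy, curvature_change, dist_to_goal):
    # R as a sum of the relevant factors named above; the weights are arbitrary assumptions.
    r = 2.0 * clearance - 0.5 * energy - 0.3 * curvature_change - 1.0 * dist_to_goal
    return r + (10.0 if state == "SUCCESS" else -10.0 if state == "FAILURE" else 0.0)

def env_step(state, action):
    # Toy simulator of one navigation step; the probabilities below are invented.
    if state in ("SUCCESS", "FAILURE"):
        return state, 0.0, True
    p_ok = {"HUMAN": 0.9, "ROBOT_DEL": 0.8, "ROBOT_REA": 0.7}[action]
    nxt = random.choices(STATES, weights=[p_ok, 1.0 - p_ok, 0.05, 0.02])[0]
    r = reward(nxt, clearance=random.random(), energy=random.random(),
               curvature_change=random.random(), dist_to_goal=random.random())
    return nxt, r, nxt in ("SUCCESS", "FAILURE")

# Q = q_learning(env_step, STATES, ACTIONS)   # reuses the routine sketched earlier in this section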

5 A Real Robotic Assistant Application

MA²CHRIN has been tested on an assistant robot called SENA (see fig. 5). It is a robotic wheelchair based on a commercial powered wheelchair that has been equipped with several sensors and an onboard computer to reliably perform high-level tasks in indoor environments. SENA accounts for Wi-Fi connection capabilities that improve to a great extent the possibilities of our robot, ranging from tele-operation of the vehicle to the human driver's access to the internet.

Our tests have been carried out within our lab and nearby corridors, in which the user can select a destination via voice. Since the main operation of SENA is navigation, we have focussed our experiences on the learning process of the agent devoted to this type of task. We have considered three different skill units within the navigation agent, as described in section 4.

For a navigation task, the Planner agent interacts with the user (as described in [8]) for constructing the best route to the goal. With the route available, the Navigation Agent executes each navigation step. All the communications that arise in this scheme



follow the semantics described in section 3. When each navigation step arrives at the Navigation Agent, its Semantic Bridge translates it to the Smart Selector for performing the action. The Smart Selector then runs the Q-learning algorithm.
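As a hypothetical tie-in of the earlier sketches (not SENA's actual software), once the Q-table has converged the Smart Selector's choice for a navigation step reduces to a greedy lookup:

def choose_learned(Q, state, actions):
    # Greedy skill-unit choice from the learned Q-table (ties broken by list order).
    return max(actions, key=lambda a: Q[(state, a)])

# e.g., once training has converged:
# unit_name = choose_learned(Q, "NAV_NOK", ACTIONS)   # expected to favour "HUMAN" in this state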

Fig. 5. Two views of the SENA robotic wheelchair in which we have evaluated our human-robot integration architecture.

Our findings indicate that the Q-function really converges to a certain value after an appropriate number of learning steps. Fig. 6 shows, for the states FAILURE, NAV_OK and NAV_NOK, how the Q-function of every action (HUMAN, ROBOT_DEL and HUMAN, respectively) finally converges.

Fig. 6. Q-function values for states NAV_OK and NAV_NOK. For each state, the best choices are Robotic Deliberative and Human, respectively.

It can also be noticed that, for a given state, the maximum value of the Q-function changes among different actions. This means that, though initially Q-learning chooses a certain action, as execution progresses another action can be discovered as the best suited option for navigation. To illustrate this, refer to fig. 7: until learning step 10, it is ROBOT_DEL that provides the best Q-function value; then, until approximately learning step 70, the best action turns to ROBOT_REA; finally, HUMAN becomes the most fitting value.



Fig. 7. Q-function values for the FAILURE state. In this case the best choice is Human.

Finally, we have also noticed that the human skill unit is mostly learned for those states corresponding to risky or difficult situations (the FAILURE and NAV_NOK states).

6 Conclusions and Future Work

Usually, human-robot interaction is considered to be a simple communication between the robot architecture and the human. However, we claim that in assistant robotics a much stronger interaction is needed: the human, in fact, must be integrated into the robotic architecture.

For that purpose, we have presented in this paper a multi-agent robotic architecture that enables such human-robot integration. We believe that a multi-agent system is more appropriate than conventional approaches since agents are closer to representing the human as a part of the system than other software constructions (modules, procedures, etc.). Agents, as well as humans, have intentions, mental states, and use some semantics in their communications. In addition, they possess some learning capabilities that allow them to adapt to environmental changes and to achieve good performances over time.

This paper is a first step towards the inclusion of the human as an agent within the architecture. At this stage, the human is not yet an agent, but his/her capabilities are distributed inside any agent that needs them for augmenting its skills.

When an agent is endowed with a set of skill units for performing some action, it must decide which skill unit (including both robotic and human ones) is the best in each situation. We have implemented a Q-learning procedure that learns this association over time, trying to optimize the behavior of the system in the long term.

We have also defined the semantics for the agents to communicate and maintain their internal mental states, as well as how this semantics is translated into practical requests for the algorithms of the architecture.

In the future, we will analyze the effects of including the human as one of the agents of the architecture. We will also continue working with assistant robots (such as our robotic wheelchair) in order to implement solutions for the real requirements of assistance applications.



References

1. Blanco J.L., González J., Fernández-Madrigal J.A. The PT-Space: A New Space Representation for Non-Holonomic Mobile Robot Reactive Navigation. Tech. Report, University of Málaga, 2005.
2. Cohen P.R. and Levesque H.J. Intention is Choice with Commitment. Artificial Intelligence, vol. 42, n. 3, 1990.
3. Fernández-Madrigal J.A. and González J. Multi-Hierarchical Representation of Large-Scale Space. International Series on Microprocessor-based and Intelligent Systems Engineering, vol. 24, Kluwer Academic Publishers, Netherlands, 2001.
4. Fernández-Madrigal J.A., Galindo C., and González J. Assistive Navigation of a Robotic Wheelchair using a Multihierarchical Model of the Environment. Integrated Computer-Aided Engineering, vol. 11, pp. 309-322, 2004.
5. Fong T.W. and Thorpe C. Robot as Partner: Vehicle Teleoperation with Collaborative Control. Proc. of the NRL Workshop on Multi-Robot Systems, 2002.
6. Foundation for Intelligent Physical Agents. ACL Communicative Act Library Specification, 2002.
7. Foundation for Intelligent Physical Agents. ACL Message Structure Specification, 2002.
8. Galindo C., Fernández-Madrigal J.A., and González J. Hierarchical Task Planning through World Abstraction. IEEE Trans. on Robotics, 20(4), pp. 667-690, 2004.
9. Galindo C., González J., and Fernández-Madrigal J.A. An Architecture for Close Human-Robot Interaction. Application to Rehabilitation Robotics. IEEE Int. Conf. on Mechatronics and Automation (ICMA'2005), Ontario (Canada), July-August 2005.
10. Kaelbling L.P., Littman M.L., Moore A.W. Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research 4 (1996) 237-285.
11. Khatib O. Human-Centered Robotics and Haptic Interaction: From Assistance to Surgery, the Emerging Applications. 3rd Int. Workshop on Robot Motion and Control, 2002.
12. Liu J., Wu J. Multi-Agent Robotic Systems. CRC Press (2001).
13. Morioka K., Lee J.H., and Hashimoto H. Human Centered Robotics in Intelligent Space. Proc. of the IEEE ICRA'02, Washington, DC.
14. Murch R. and Johnson R. Intelligent Software Agents. Prentice Hall, 1999.
15. Scholtz J. Theory and Evaluation of Human-Robot Interaction. Proc. of the 36th International Conference on System Sciences, Hawaii, 2003.
16. Simmons R., Goodwin R., Haigh K., Koenig S., and Sullivan J. A Layered Architecture for Office Delivery Robots. 1st Int. Conf. on Autonomous Agents, pp. 235-242, 1997.
17. Smart W.D., Kaelbling L.P. Effective Reinforcement Learning for Mobile Robots. International Conference on Robotics and Automation (2002).
18. Sutton R.S., Barto A.G. Reinforcement Learning: An Introduction. The MIT Press, Cambridge, MA (1998).
19. Tahboub K.A. A Semi-Autonomous Reactive Control Architecture. J. Intell. Robotics Syst., vol. 32, n. 4, pp. 445-459, 2001.