

Int. J. Human-Computer Studies (2000) 53, 915–968. doi:10.1006/ijhc.2000.0426. Available online at http://www.idealibrary.com

Cooperative requests and replies in a collaborative dialogue model†

CECILE BALKANSKI AND MARTINE HURAULT-PLANTET

LIMSI/CNRS, B.P. 133, 91403 Orsay, France. email: {balkanski,mhp}@limsi.fr

In this paper, we present a computational model of dialogue, and an underlying theory of action, which supports the representation of, reasoning about and execution of communicative and non-communicative actions. This model rests on a theory of collaborative discourse, and allows for cooperative human–machine communication in written dialogues. We show how cooperative behaviour, illustrated by the analysis of a dialogue corpus and formalized by an underlying theory of cooperation, is interpreted and produced in our model. We describe and illustrate in detail the main algorithms used to model the reasoning processes necessary for interpretation, planning and generation, as well as for determining which actions to perform and when. Finally, we present our implemented system.

Our data are drawn from a corpus of human–human dialogues, selected and transcribed from a day-long recording of phone calls at a phone desk in an industrial setting (Castaing, 1993). We present an analysis of this corpus, focusing on dialogues which require, in order to succeed, helpful behaviour on the part of both the caller and the operator.

The theoretical framework of our model rests on the theory of collaborative discourse developed by Grosz and Sidner (1986, 1990), Grosz and Kraus (1993, 1996), and further extended by Lochbaum (1994, 1995). An important objective guiding the design of our dialogue model was to allow the agent being modelled to interpret and manifest a type of cooperative behaviour which follows Grosz and Kraus's formalization of the commitment of each collaborative agent towards the actions of the other collaborative agents. The model we propose extends Lochbaum's approach to discourse processing by extending her interpretation algorithm to allow for the treatment of a wider range of dialogues, and by providing a task advancement algorithm which guides the generation process and allows for the interleaving of execution and planning, thereby facilitating cooperation among agents. The cooperative behaviour of the agent being modelled rests on the use of communicative actions allowing agents to share additional knowledge and assist each other in performing their actions.

© 2000 Academic Press

1. Introduction

In this paper, we present a computational model of dialogue, and an underlying theory of action, which supports the representation of, reasoning about and execution of communicative as well as non-communicative actions.

†This paper integrates, expands and substantially modifies previous conference papers (Balkanski & Hurault-Plantet, 1997; Hurault-Plantet & Balkanski, 1998).



This model rests on a theory of collaborative discourse, and allows for cooperative human–machine communication in written dialogues. We show how cooperative behaviour, illustrated by the analysis of a dialogue corpus and formalized by an underlying theory of cooperation, is interpreted and produced in our model. We describe and illustrate in detail the main algorithms used to model the reasoning processes necessary for interpretation, planning and generation, as well as for determining which actions to perform and when. We also describe our implementation, an application that integrates the dialogue model presented in this paper. The interpretation, planning and generation algorithms have been completely implemented, and the simulations of the system's processing of sample dialogues presented in this paper correspond to the way the implemented system actually processes the dialogues.

1.1. DOMAIN OF APPLICATION

The domain we will use to illustrate our model is that of a telephone desk. Our data are drawn from a corpus of human–human dialogues, selected and transcribed from a day-long recording of phone calls at a phone desk in an industrial setting (Castaing, 1993). The full recording consists of about 500 dialogues, varying in length from a minimum of three exchanges (two by the operator, and one by the caller) to almost 30; 353 of them have been transcribed. In these dialogues, the goal of the caller is to enter into communication with a particular person, which requires going through the phone operator. The goal of the operator is then to establish a communication link, using the telephone, between the calling person and the person with whom the caller wants to talk, a task which may lead to many unexpected (and interesting) difficulties.

The study of this corpus allowed us to identify the primary entities that are referred to by the speakers (people, extensions, groups, departments, etc.), and the characteristics of these entities. The definition of the knowledge required of the system, as well as a number of heuristic rules underlying its reasoning capabilities, was then guided by our analysis of the different actions, communicative and non-communicative, performed by the caller and the telephone operator, as well as by the different aspects of the callers' requests.

The dialogue in Figure 1 illustrates the type of discourse provided in this corpus and presents an insightful example of the cooperation which is needed in such a task. In the translated English version of this dialogue, C is the caller and O, the operator. The corpus adopts a number of transcription conventions. Words between backslashes (\) indicate an overlap between the two speakers' utterances. Proper names are abbreviated to upper-case letters, except in special contexts, as in utterance (7) where the name is spelled out in upper-case letters to indicate accented speech. All names cited are fictitious. Extensions are not, because the phone desk has been reorganized and the extensions no longer exist.

In this example, the task begins with two actions, given in lines (1) and (2) of Figure 1, that are part of a plan which allows the caller to establish communication with the operator. The first action is a manipulative action (dialing a phone number), while the second is a communicative action uttered in response to the caller's action. The caller then states his goal, speaking to Mr A, but the operator apparently cannot find that person in her directory. If she were not cooperative, she could have then just answered

(1)  L: <appel>                                     C: <telephone call>
(2)  S: CNRS                                        O: CNRS
(3)  L: bonjour Mme euh je voudrais le              C: hello hm I'd like extension Mr A
        Mr A svp                                        please
(4)  S: Mr A \oui\ quel est son poste?              O: Mr A \yes\ what's his extension?
(5)  L: c'est euh il travaille avec Mme Muzeu       C: it's hm he works with Mrs Muzeu
(6)  S: Mme?                                        O: Mrs?
(7)  L: MUZEU                                       C: MUZEU
(8)  S: A. j'connais pas Muzeu ça m'dit rien        O: I don't have a Muzeu I don't know
        vous savez où elle se trouve dans quel         do you know where she is in which
(9)  L: hou la la la la c'est au LEI mais           C: oh well it's at the LEI but
        \ah oui effectivement\                         \oh yes indeed\
(10) S: 22-26 j'vous la passe hein                  O: 22-26 I'll transfer you over to her

FIGURE 1. Dialogue No 2-A(4)N7.


"I don't know Mr A", suggesting an end to the conversation, and therefore to their brief collaboration. However, as illustrated in our corpus, even if this type of response is uttered, it is typically followed by a cooperative request for additional information. As will be shown in the corpus study (Section 2), these requests take different forms: asking for an extension, for the department where the person works, or for the name of an intermediary person, for instance. In our corpus, these alternatives are typically tried in the same order, apparently linked to the increasing complexity of the underlying plan. The caller's response in (5), "he works with Mrs Muzeu", shows that he is aware of the cooperative strategies of the operator and that he uses this knowledge in a cooperative manner. Although he does not provide the information explicitly requested (an extension), he does provide alternative information that will help the operator. It is, among other things, this cooperative behaviour, and the knowledge it implies, that we wanted to model in our system and which we will present in this paper.

1.2. THEORETICAL FRAMEWORK

Our computational model rests on the theory of collaborative discourse developed by Grosz and Sidner (1986, 1990), Grosz and Kraus (1993, 1996), and further extended by Lochbaum (1994, 1995). This work rests on the claim that discourses involve collaborative behaviour and is centred around the notion of a SharedPlan. SharedPlans were introduced to model the set of beliefs and intentions about actions to be performed that agents must hold for their collaboration to be successful. Grosz and Kraus (1993, 1996) proposed a revised and extended version of SharedPlans which, in particular, takes into account the commitment of each agent not only towards the success of his or her own actions, but also towards the success of the actions to be executed by the other agents. It is this commitment towards the other agents which leads each agent to adopt the helpful behaviour necessary for their cooperation.† Grosz and Kraus introduced several axioms

†This commitment also leads the agents to avoid conflict. This type of behaviour, however, lies beyond the scope of this paper.


formalizing this commitment in terms of beliefs and intentions. We will show how the adoption of the cooperative behaviour represented by these axioms is illustrated both in dialogues extracted from our corpus, and in a number of examples presented to demonstrate the functioning of our model. An important objective guiding the design of our dialogue model was indeed to allow the agent being modelled to interpret and manifest a type of cooperative behaviour which follows the Grosz and Kraus formalization of collaborative plans, and which is illustrated in Figure 1. This behaviour rests on the use of communicative actions allowing agents to share additional knowledge, in order to help each other in performing actions or to replan when a problem arises.

Lochbaum (1994) subsequently proposed a computational model, based on the SharedPlan formalism, for recognizing intentional structure and utilizing that structure in discourse processing. She introduced in particular a structure, named a recipe graph or Rgraph, which she uses in the algorithm modelling the reasoning process by which an agent explains the contribution of an utterance to the agents' current partial SharedPlan. Rgraphs represent the beliefs of the agent being modelled as to how all of the acts underlying the agents' discourse are related at a given point in the dialogue. These are acts that have been, or will have to be, performed by the collaborating agents to accomplish their individual and shared objectives.

The computational model we propose extends Lochbaum's approach to discourse processing in several significant ways, in particular (1) by modelling initial greetings, typical of dialogue openings but often omitted in actual models, (2) by extending her interpretation algorithm to allow for the treatment of a wider range of dialogues, and (3) by providing a task advancement algorithm which guides the generation process and allows for the integration of planning and acting, thereby facilitating cooperation among the agents. While Lochbaum provided the foundations for the main building blocks of this model, both in terms of representation and reasoning, she focused mainly on interpretation, and only briefly addressed generation. As a result, the implementation of the generation module of her system makes use of an "oracle" for making decisions among a set of possible tasks. Instead, our generation module calls upon a Task Advancement algorithm whose goal is to determine which actions the system can perform, and in which order, so as to allow the overall task to progress. It also determines when to plan further actions, when to replan an action that failed, and when to produce an utterance and what it should contain. It does so by manipulating the Rgraph, choosing among the various options by assigning them priorities oriented towards action execution.

1.3. PAPER OVERVIEW

In this paper, we begin by describing our corpus study, and in particular how the cooperative behaviour of the conversational partners manifests itself. In Section 3, we then describe aspects of the treatment of cooperation in the underlying theory, relating these aspects to sample dialogues drawn from our corpus. We continue, in Section 4, with a description of the main components of our model, namely the action, recipe† and recipe

†As will be shown later, a recipe is a set of constituent actions, and associated constraints, that are necessary for the performance of a complex action.


graph knowledge structures, as well as the Interpretation and Task Advancement algorithms. In Section 5, we present the different elements of the architecture of our system. Section 6 illustrates the functioning of our model on a number of sample dialogues inspired by our corpus. Section 7 describes our working system. Finally, in Section 8, we discuss the coverage of our model, as well as a number of related issues, and we compare our approach to other dialogue systems. We end with a few concluding remarks on the main aspects of our work and directions for future research.

2. Corpus study

The recording of the corpus was performed at a telephone desk consisting of three telephone posts, operated by three expert operators and one novice. The corpus contains 353 dialogues that were transcribed from this recording. These are separated into two groups: request dialogues (275 of them), and subdialogues resuming a previous request (78 of them, called "dialogues de reprise" in the original corpus). These subdialogues follow request dialogues, occurring after a certain lapse of time, when the line is busy or is not answered. The operator then engages in a new dialogue with the caller. In some cases, the caller maintains his initial request (waiting, for instance, until the line becomes free); in others, he modifies his request (asking for someone else, for instance). In the present analysis, we focus on the initial request dialogues, leaving these subdialogues for a future study.

In this section, we begin by briefly presenting these dialogues, and then provide an analysis of the cooperative strategies used by the conversational partners to attain their goal. The purpose of this study is two-fold: to reveal relevant aspects of the knowledge of the speakers, as well as the reasoning processes underlying the cooperative strategies these speakers use. The information concerning the speakers' knowledge was used as a basis for encoding the actions and recipes that make up the system's knowledge, whereas the information concerning the reasoning processes was used for the development of the system's reasoning capabilities, needed for interpreting an utterance and planning its response. In Section 3, we show how the cooperative strategies revealed by this analysis may be interpreted in the context of the theoretical model on which our model rests.

2.1. DESCRIPTION OF THE DIALOGUES

In our corpus, the type of request most frequently made by the caller is a request for communication based on an extension (e.g. "I'd like extension 33-22."). As shown in Table 1, these represent 131 out of the 275 request dialogues of the corpus (48%). Among these, 122 (94%) are immediately successful: the person requested is found, and no request for further information is needed. The remaining nine dialogues (6%) make use of clarification subdialogues. There are 92 dialogues (33%) involving requests based on the name of the person (e.g. "May I speak to Mr B.?"), 70 of which are realized without the need for clarification subdialogues (76%). The remaining dialogues essentially include requests for communication with a particular department (e.g. "I'd like the Personnel Department."), requests for information that lead to the operator transferring the caller to a particular department (e.g. "I'd like information about the Courrier du CNRS."), and combinations of the above. The five atypical dialogues involve two requests for the fax

TABLE 1
Different types of requests on the part of the caller

                                           Dialogues without    Dialogues with
Type of request on the                     clarification        clarification subdialogues    Total
part of the caller                         subdialogues         Success   Failure   Total     (275)
                                           (all successful)

Based on extension                               122                8         1        9       131
Based on name of a person                         70               18         4       22        92
Based on name of a department                     10                6         1        7        17
Based on name of person and extension              9                2         0        2        11
Request for information                            9                1         0        1        10
Based on name of person and department             7                1         0        1         8
Based on department and request for
  additional information                           1                0         0        0         1
Miscellaneous                                      0                4         1        5         5


number of the organization, two that make reference to previous calls, and one that involves a caller trying to answer a newspaper advertisement concerning a house.

Table 1 categorizes the dialogues that include clarification subdialogues as leading to success or failure. It is important to note in this respect that the corpus does not provide complete information concerning the final result of the caller's request. Once the caller is transferred over to someone (the requested person, an intermediary person, or a particular department), only those dialogues that are followed by subdialogues resuming a previous request provide complementary information about the success or failure of the transfer (for instance, that it failed because the line is busy or is not answered). In all other cases, the corpus does not let us know whether the caller was actually able to speak to the requested person. We therefore consider as "failed" only those requests that led to the operator transferring the caller over to the Personnel Department, as illustrated later in the dialogue of Figure 3, even if success in the other cases remains uncertain (as happens when the operator transfers the caller to a particular department or an intermediary person, for instance).

We focused on the set of request dialogues based on the name of the person, because this type of dialogue includes more interesting, and more numerous, instances of clarification subdialogues. In the 22 such dialogues, the operator initiates one or more clarification subdialogues because she misunderstood the name of the person, or because she cannot find it in her directory, or because the caller asked for several people or is unsure about the name of the person he is looking for. In these cases, requests for additional information include one or more of the following elements, recapitulated in Table 2 after their description.

• Request for repetition of the name: in six of these 22 dialogues, the operator asks the caller to repeat the name of the person requested, as illustrated in utterance (4) of Figure 2. In one case, the operator finds the name (and therefore the extension) after

(1)  L: <appel>                                     C: <telephone call>
(2)  S: CNRS bonjour                                O: CNRS hello
(3)  L: allo est-ce que j'pourrais parler           C: hello could I speak to Mrs S?
        à Mme S?
(4)  S: Mme ?                                       O: Mrs?
(5)  L: S                                           C: S
(6)  S: S?                                          O: S?
(7)  L: oui                                         C: yes
(8)  S: ne quittez pas -                            O: please hold on -
        vous savez où travaille cette personne?        do you know where this person works?
(9)  L: euh alors attendez -                        C: hm well wait -
        service du Courrier du CNRS                    department of the "Courrier du CNRS"
(10) S: d'accord ne quittez pas                     O: okay, hold on

FIGURE 2. Dialogue No 2-B(2)N73: request for repetition of name, confirmation of name and name of department.


repetition of the name by the caller. In two other cases, the operator finds the name after being given the name of the department to which the requested person belongs. In two other cases, the caller spontaneously provides the extension of the person requested after repeating the name. Finally, in the last case, the person requested is unknown to the operator, and the caller is unable to provide additional information.

• Request for confirmation of the name: in six dialogues, the operator asks the caller for confirmation of the name of the person requested, as illustrated in utterance (6) of Figure 2 and utterance (4) of Figure 3. In one case, the person is found after the caller repeated the name. In another case, the person is found, but only after the operator also asked where this person worked (as in Figure 2). In the four other cases, the person is not found, and the caller is directed to a department or intermediary person on the basis of additional information spontaneously provided by the caller or asked for by the operator, as in Figure 3.

• Request for the extension: in three dialogues, the operator asks for the extension of the person requested. In two cases, she did not first ask the caller to repeat or confirm the name, and she finds the people requested although the callers were not able to provide the extensions. In the other case, the request for the extension is preceded by a request for the repetition of the name, and in this case the operator does not find the person.

• Request for the name of a department: in five dialogues, the operator asks for the name of the department to which the person requested belongs. In one case, this request is not preceded by a request for repetition or confirmation of the name, and the person is not found. In the four other cases, the request for the name of the department is preceded by a request for more information on the name. In two of these cases the person is found; in the two others, the person is not found.

• Request for the name of an intermediary person: in two dialogues, the operator asks for the name of a person working with the person requested. The caller is unable to give this name, and the operator does not find the person.

• Correction of the name: in two dialogues, the operator spontaneously corrects the name of the person requested (the caller gave a wrong name, but one close to the correct one). In another similar case, the caller, unsure about the name, suggests two names; here the operator replies with the correct name.

(1)  L: <appel>                                     C: <telephone call>
(2)  S: CNRS                                        O: CNRS
        [...]                                          [...]
(3)  L: puis-je parler à Mme M svp?                 C: May I speak to Mrs M please?
(4)  S: M ?                                         O: M?
(5)  L: oui M                                       C: yes M
(6)  S: ne quittez pas -                            O: hold on -
        dans quel service travaille-t-elle?            in what department does she work?
(7)  L: je ne sais pas                              C: I don't know
(8)  S: ah mais son nom ne me dit rien euh          O: oh I don't know her name hm
(9)  L: je sais que c'est une biologiste            C: I know she's a biologist
(10) S: mais vous ne savez pas du tout avec qui     O: but you really don't know with whom
        elle peut travailler dans quel laboratoire     she could work in which laboratory
(11) L: non non pas du tout                         C: no no not at all
        [...]                                          [...]
(12) S: bon écoutez j'peux vous passer              O: well listen I can transfer you over
        éventuellement les services du personnel ...   to the Personnel Department ...

FIGURE 3. Dialogue No 1-B(2)N80: request for confirmation of name and for additional information.

TABLE 2
Different types of requests on the part of the operator

Types of requests on the             No. of dialogues including this type of request
part of the operator                 Success    Failure    Total (30)

Repetition of the name                  5          1          6 /22
Confirmation of the name                2          4          6 /22
Extension                               2          1          3 /22
Name of a department                    2          3          5 /22
Name of an intermediary person          0          2          2 /22
Correction of the name                  3          0          3 /22
Disambiguation                          5          0          5 /22


• Request for disambiguation: in three dialogues, the caller asks for two different people ("I'd like to talk to Mrs L or Mrs D") and the operator replies with a request for further information to be able to make a choice. In another case, the operator, who did not understand the name correctly, suggests two possible names and asks the caller to choose. Finally, in still another case, the operator asks for the first name of the person requested because two people had the same last name.

Table 2 recapitulates the types of requests for additional information, described above, that appear in the clarification subdialogues that we examined. The total number of requests (30) is greater than the number of dialogues that include clarification subdialogues (22) because the operator sometimes needs to make several requests for additional information, within the same dialogue, before being able to perform the necessary transfer. In particular, dialogues leading to failure include, on average, a greater number of requests for additional information than those leading to success. As shown in Table 2, of the four failed dialogues including clarification subdialogues based on the name of a person, one included a request for repetition of the name, all four contained a request for confirmation of the name, one included a request for the extension of the person and three contained a request for the name of the department of this person.

2.2. ANALYSIS

The different types of requests described in the previous section provide information on the actions and strategies used by the operator to transfer the caller over to a person that she does not find in her directory. These requests concern parameters of actions that are part of the recipes used by the operators. We integrated these actions and recipes into our knowledge base.

Our analysis of the corpus revealed constraints on the order in which these requests are generated. When the operator does not find the name of a person in her directory, she may begin by asking for confirmation of the name. If the same name is repeated, or if another unknown name is provided, she may then ask for the extension, then the name of the department, then the name of a co-worker. The operator formulates one of these requests, or two of them when the first one fails, or more rarely all three of them, but always in this order. For instance, the request for the name of the department may be followed by a request for the name of a co-worker, but there is never a request for the extension after a request for the name of the department. This order appears to correspond to the increasing complexity of the underlying plan. The modification of a name or extension will indeed allow the operator to perform the requested transfer directly. The name of an intermediary person, however, involves replanning the transfer: the operator will have to first transfer the call over to the intermediary person, who will then make the requested transfer.
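Read as a selection policy, this fixed ordering can be captured in a few lines. The following sketch is ours, not part of the authors' system; the strategy names are illustrative only.

```python
# Hypothetical encoding of the request ordering observed in the corpus:
# clarification strategies are tried in a fixed order of increasing plan
# complexity, and a strategy is never retried once it has failed.

CLARIFICATION_ORDER = [
    "confirm_name",     # repetition/confirmation of the name (plan kept)
    "ask_extension",    # request for the extension (plan kept)
    "ask_department",   # request for the department (plan kept)
    "ask_coworker",     # request for an intermediary person (replanning)
]

def next_request(already_tried):
    """Return the next clarification request to try, or None if none is left."""
    for request in CLARIFICATION_ORDER:
        if request not in already_tried:
            return request
    return None

# After a failed confirmation, the operator asks for the extension next.
assert next_request({"confirm_name"}) == "ask_extension"
```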

We also notice that callers provide spontaneous cooperative responses, as illustrated in utterance (5) of the dialogue in Figure 4. In these cases, callers give additional information without having been explicitly asked for it, but after having been asked for some other information, indicating that the operator is having trouble finding the person requested. In the 15 dialogues where clarification of the name was needed, two such spontaneous cooperative responses provided the extension of the person requested, two gave the name of the department to which the person belonged, and one gave the name of an intermediary person. We therefore find in the cooperative responses of the caller the same type of information as that asked for in the operator's requests. This shows that the caller shares at least some of the knowledge of the operator regarding domain actions and

(1) L: <appel>                                      C: <telephone call>
(2) S: CNRS bonjour                                 O: CNRS hello
(3) L: oui est-ce que je pourrais avoir             C: yes could I speak to Mr C please?
       Mr C svp
(4) S: Mr ?                                         O: Mr?
(5) L: C au poste 22-87                             C: C extension 22-87
(6) S: oui                                          O: yes

FIGURE 4. Dialogue No 1-B(2)N71: request for repetition of name followed by a cooperative response of the caller.

(1)  L: <appel>                                     C: <telephone call>
(2)  S: CNRS bonjour                                O: CNRS hello
(3)  L: bonjour \bonjour\ pourrais-je               C: hello \hello\ could I speak to - hm -
        avoir - euh - Mr P svp?                        Mr P please?
(4)  S: P?                                          O: P?
(5)  L: oui                                         C: yes
(6)  S: connais pas                                 O: I don't know him
(7)  L: ah                                          C: oh
(8)  S: ah                                          O: oh
(9)  L: c'est quelqu'un qui s'occupe de la          C: it's someone who takes care of
        comptabilité fonctionnement                    maintenance accounting
(10) S: oh j'vous passe le fonctionnement           O: oh I'll transfer you over to
        alors hein n'quittez pas                       maintenance then hold on
(11) L: merci                                       C: thank you

FIGURE 5. Dialogue No 3-A(6)N51: request for confirmation of name, followed by a cooperative initiative of the caller.


strategies; this knowledge allows him to be cooperative and to provide the information necessary for the operator to perform them. This also shows that our system should be able both to display cooperative behaviour and to recognize such behaviour on the part of its conversational partner.

Finally, we observe that although helpful behaviour of the agents is typical of the dialogues in our corpus, one dialogue does include a counter-example, shown in Figure 5. After confirmation of the name, which the operator does not appear to know, she does not make any further request for information [see in particular utterances (6) and (8)]. Instead, it is the caller who keeps the dialogue going. Castaing (1990) observed that operators display a different behaviour depending on how overloaded the phone desk is. We have no indication supporting this observation for this particular example, but it is reasonable to hypothesize that a temporary overload will lead to less cooperative behaviour on the part of an operator. We did not seek to reproduce this type of situation in our model.

The analysis of the different requests, and the order in which they occur, led us to distinguish two types of requests: those that initiate a clarification subdialogue while maintaining the original plan, and those that initiate a clarification subdialogue when replanning is necessary. Examples of the first type include requests for the repetition or confirmation of the name, or requests for an extension. In these cases, the original plan may be pursued, either by the instantiation of a missing parameter or by the modification of a parameter that has been incorrectly instantiated. Examples of the second type include requests for the name of an intermediary person. In these dialogues, the operator does not directly transfer the caller to the person requested, as initially planned, but to somebody else, a co-worker for instance, who may then be able to establish the connection with the requested person.

On the callers' side, we observed the two corresponding cooperative strategies. The first one consists in helping the operator by providing additional information that she needs for her current plan. In this case, the initial plan may therefore be maintained. The second one consists in helping the operator by suggesting a different way of connecting with the required person, for instance through an intermediary person. In this case, replanning is necessary.


We therefore notice that both the caller and the operator can initiate one of these strategies. It is also possible for the two speakers to alternate, each one of them being the initiator at different moments of the dialogue. Our model allows for these two types of strategies to be used by the conversational partners to attain their goal.

3. Cooperation in the underlying theory

As mentioned earlier, our computational model is based on the theory of collaborative discourse developed by Grosz and Sidner (1986, 1990), Grosz and Kraus (1993, 1996), and further extended by Lochbaum (1994, 1995). This work rests on the claim that discourses, like many other non-linguistic activities, involve collaborative behaviour. It is to provide a foundation for theories of collaboration that they introduced the notion of a SharedPlan. As mentioned earlier, SharedPlans are used to model the mental states of agents when they have a collaborative plan to do a group action. Collaborating agents must hold these beliefs and intentions for their collaboration to be successful. Figure 6 provides a high-level overview which Grosz and Kraus use for their formalization of collaborative plans. It presents the main components of the mental states of collaborative agents.

A SharedPlan is a formal model of collaborative plans. A SharedPlan for an action therefore includes a mutual belief concerning a way to perform this action [subactions, item (1)] and individual intentions that the action and subactions be performed [item (2)], and its structure is a hierarchical structure of individual plans and SharedPlans. Items (2a) and (2b) reflect the commitments that the collaborating agents must have to the actions of the group as a whole and to those of the other agents. These commitments, which engender the cooperative behaviour of the agents, are formalized in Grosz and Kraus's definition of a SharedPlan with the Int.Th operator (intend-that, used to represent an agent's intending that some proposition hold, as opposed to intend-to, used to represent an agent's intending to do some action). Figure 7 provides two axioms centered around this operator, representing the adoption of helpful, and thus cooperative, behaviour† (Grosz & Kraus, 1996). In these axioms, G is an agent, and A and B are actions.

The "rst axiom, A5, provides for direct help, while the second, A6, provides for moreindirect help. According to A5, if an agent intends that a proposition prop holds, whilenot believing that prop holds, and believes that his performing an action A would lead toprop becoming true, then that agent will consider doing A. This will lead the agent todeliberate about A, and, barring con#icts, to adopt an intention to do A. This situation isillustrated in the dialogue given in Figure 4. After utterance (4), the caller realizes that theoperator is not able to transfer him over to Mr C. The caller thus adopts the intention toperform the communicative action of informing her about Mr C's extension, an actionthat will allow the operator to subsequently execute the required transfer action.

The second axiom, A6, applies in situations where an agent G believes that his performing an action A would allow another agent to perform another action B that would lead to prop becoming true. G will then consider doing A, which will lead him to deliberate about A, and eventually to adopt an intention to do A.

†For presentation purposes, Figure 7 presents only the English paraphrasing of the formal definitions of the Grosz and Kraus axioms, definitions which are not central to the present work.

To have a collaborative plan for an action, a group of agents must have:
(1)  Mutual belief of a (partial) recipe for this action.
(2a) Individual intentions that the action be done.
(2b) Individual intentions that collaborators succeed in doing the (identified) constituent subactions.
(3)  Individual or collaborative plans for the subactions.

FIGURE 6. Key components of collaborative plans (Grosz & Kraus, 1996).

Axiom (A5)
  - G Int.Th some prop which G does not believe is true
  - and G believes it can do something (A) that will bring about prop's holding
  => G will consider doing A.

Axiom (A6)
  - G Int.Th some prop which G does not believe is true
  - and G believes it can do something (A) that will enable another agent to do something else (B) that will bring about prop's holding
  => G will consider doing A.

FIGURE 7. Simplified English description of axioms A5 and A6 for Int.Th (Grosz & Kraus, 1996).


This situation is illustrated in the dialogue given in Figure 3. In utterance (12), the operator realizes that she cannot satisfy the caller's request despite several attempts at obtaining additional information. She then decides to transfer the caller over to an intermediary person (action A in the axiom), here someone from the Personnel Department, who in turn might be able to transfer the caller to the desired person (action B in the axiom). Of course, the operator would not consider transferring the caller over to some other person if she did not believe that that other person could, in turn, transfer the caller to the desired person. This axiom, therefore, like the previous one, might be too strong as an axiom. However, it provides a useful framework in which to define the adoption of helpful behaviour.

Grosz and Kraus (1996) introduced these axioms, as well as the rest of their formalization of collaborative plans and of the mental attitudes involved, not with the intention of having them be directly implementable, but "to be used as a specification for agent design". It is precisely in this sense that they have been used in the context of our study. They served as part of a theoretical framework in which we analyse our sample dialogues, and they provided guidance in the design of our dialogue model, especially in the rules of our Task Advancement algorithm concerning actions that failed or actions whose execution conditions cannot be satisfied. As will be shown in the next section, when this is the case, the algorithm leads to plan repair or, if this is not possible, to the generation of the reasons for failure, which can help another agent (here the conversational partner) to repair the plan. The adoption of cooperative behaviour on the part of the agent being modelled, as represented in the axioms presented above, therefore emerges from the rules of the Task Advancement algorithm concerning the failure of, or the inability to perform, an action.
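To make this reading concrete, the sketch below shows one way axioms A5 and A6 could be operationalized as deliberation rules. It is our own illustrative rendering, under simplifying assumptions (a toy belief base, atomic acts and propositions), and not the authors' implementation or the formal SharedPlan definitions.

```python
class Agent:
    """Toy belief base standing in for G's mental state (illustrative only)."""
    def __init__(self, true_props, brings_about, enables, others_acts):
        self.true_props = true_props      # propositions G believes already hold
        self.brings_about = brings_about  # act -> proposition its performance achieves
        self.enables = enables            # act -> acts whose performance it enables
        self.others_acts = others_acts    # acts that other agents can perform

    def consider(self, prop, candidate_acts):
        """Acts G will consider doing, given Int.Th(prop), per axioms A5 and A6."""
        if prop in self.true_props:       # G already believes prop holds
            return []
        considered = []
        for a in candidate_acts:
            # A5: G believes its doing a would bring about prop's holding.
            if self.brings_about.get(a) == prop:
                considered.append(a)
            # A6: G believes its doing a would enable another agent to do some
            # act b that would bring about prop's holding.
            elif any(b in self.others_acts and self.brings_about.get(b) == prop
                     for b in self.enables.get(a, [])):
                considered.append(a)
        return considered

# Figure 3, schematically: transferring the caller to Personnel (action A)
# enables someone there to perform the transfer (action B) that achieves the
# caller's goal, so the operator considers the transfer to Personnel (A6).
operator = Agent(
    true_props=set(),
    brings_about={"TransferByPersonnel": "CallerConnected"},
    enables={"TransferToPersonnel": ["TransferByPersonnel"]},
    others_acts={"TransferByPersonnel"},
)
print(operator.consider("CallerConnected", ["TransferToPersonnel"]))
# -> ['TransferToPersonnel']
```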


4. Dialogue representation and algorithms

This section presents the representation of dialogue, as well as the algorithms, on which our dialogue system is based. Our representation of dialogue follows Grosz and Sidner's theory of discourse (1986) and Lochbaum's formalization of intentional structure (1994). It includes an Rgraph, a structure initially introduced by Lochbaum and presented in the introductory section of this paper. Since Rgraphs are made up of actions and recipes, we begin by describing the underlying theory of action. We then present the two main algorithms of our dialogue model, namely the Interpretation algorithm and the Task Advancement algorithm, showing in particular how they manipulate the Rgraph in the reasoning processes which they model.

4.1. ACTIONS AND RECIPES

Actions may be basic or complex. Basic actions are performable at will, under certain conditions, while complex actions have associated recipes, their performance requiring the performance of each action in their recipe, under certain conditions. Recipes represent what agents know when they know a way of doing something. They represent information about the abstract performance of an action, and are composed of a set of constituent acts and associated constraints.

The graphical representation in Figure 8 illustrates the structure of a recipe. As shown in this figure, we define recipes as two-level trees, the root being a complex action and the leaves being the constituent actions, basic or complex, necessary for the execution of the head action. Actions may be performed by a set of agents {agents}, and may manipulate a set of entities, {entities}. The recipe node, represented by an oval box, contains constraints on parameters which have to be satisfied to allow for coherence among the parameters (agents, time, entities) of the actions in the recipe, as well as temporal constraints on the execution of constituent actions and, when necessary, other contextual constraints. The performance of each action in a recipe contributes directly to the performance of the head action. We use the D-contributes relation (for directly contributes), introduced and defined by Lochbaum (1994), to model this relationship. The performance of the head action results from the performance of each action in its recipe, and results in the intended effect of the performance of these constituent acts. The performance of the head action of the recipe therefore constitutes the goal of the agent who holds the intention to achieve this effect.
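As a concrete illustration, actions and recipes might be encoded as follows. This is a minimal sketch under our own naming assumptions (the paper does not prescribe a particular encoding), and the recipe at the end is a hypothetical example in the spirit of the phone-desk domain.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Action:
    name: str
    agents: List[str]      # {agents}: the agents performing the action
    entities: List[str]    # {entities}: the entities manipulated by the action
    basic: bool = True     # basic acts are performable at will

@dataclass
class Recipe:
    head: Action                   # the complex action at the root
    constituents: List[Action]     # leaves: basic or complex constituent acts
    # constraints enforce parameter coherence, temporal order and context
    constraints: List[Callable[..., bool]] = field(default_factory=list)

# Hypothetical recipe: connecting a caller requires looking up the extension
# and then transferring the call; each constituent D-contributes to Connect.
connect = Action("Connect", ["System"], ["User", "PhNb"], basic=False)
lookup = Action("LookUpPhoneNb", ["System"], ["Name", "PhNb"])
transfer = Action("TransferCall", ["System"], ["User", "PhNb"])
connect_recipe = Recipe(head=connect, constituents=[lookup, transfer])
```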

4.2. RGRAPHS

As mentioned in the Introduction, an Rgraph represents the beliefs of the agent beingmodelled as to how all of the acts underlying the agents' discourse are related at a givenpoint in the dialogue. It constitutes a partial representation of the SharedPlan built bythe collaborating agents. As such, it is composed of the acts that the agent beingmodelled believes are acts that both collaborating agents believe are necessary for therealization of their joint goal, and which they will eventually attempt to execute. AnRgraph evolves as the dialogue proceeds, each utterance either providing informationleading to modi"cations of the Rgraph or resulting from reasoning processes whichmodi"ed the Rgraph.

FIGURE 8. Structure of a recipe.

FIGURE 9. Sample Rgraph.

928 C. BALKANSKI AND M. HURAULT-PLANTET

A sample Rgraph is given in Figure 9. It is a dynamic representation, resulting from instantiating and composing recipes as the dialogue, and therefore the underlying task, progresses. References to this Rgraph will help explain the more complex aspects of the algorithms presented in the next section. The Rgraph contains the actions, communicative as well as non-communicative, that constitute the steps of the underlying task. The root action defines the task to perform, that is, the goal of the collaborative plan of the agents. This structure, and therefore the overall sequence of steps to perform, is not predefined, but is determined dynamically as the dialogue progresses. Although parts of this sequence are individually predefined (the recipes that are included in the Rgraph), the way in which these parts are selected and assembled, and what additional actions may have to be added (for instance, to satisfy preconditions), has to be inferred and varies from situation to situation.

As illustrated in Figure 9, an Rgraph includes more information than just a concatenation of existing recipes. It includes, in particular, instantiations of the agents, times and other entities that make up the parameters of the actions,† thereby providing more specialized information than that which is included in the recipes of the knowledge base.

†Instantiated parameters are printed in italic font, whereas non-instantiated parameters are printed in regular font.

TABLE 3
Action statuses and their corresponding meanings

1  The agent being modelled believes that the act will be part of the agents' joint plan
2  The agent being modelled believes that the agents agree that it is an element of their joint plan
3  The act is basic and the agent being modelled believes that the agents would agree to its performance
4  The act is complex and the agent being modelled believes that the agents would agree to a particular recipe for the act
5  The act cannot be performed (repair is required)
6  The act has been performed, with or without success (if failure, no more repair of this act is possible)


Rgraphs may also include actions that are not part of the recipes, but that are added to the Rgraph because they are needed to allow the subsequent performance of other actions. This is the case for the LookUpPhoneNb action, a knowledge precondition action, which is connected by an enablement link to the action towards which it contributes, namely RequestCommunication(System, PhNb). Enablement is a relation that holds between two actions when the performance of the first brings about a condition, or a set of conditions, that is necessary for the subsequent performance of the second. The other type of link present in an Rgraph represents the D-contributes relation mentioned earlier, which connects an action from a particular recipe to the head action of that recipe. Finally, the contributes relation is the transitive closure of the D-contributes and enablement relations.† Every action A in an Rgraph therefore contributes to each one of the actions that are part of the path between A and the root of the Rgraph. More details will be given later about when and how actions are added to an Rgraph.
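The following sketch, again ours rather than the authors', shows one way to represent these links and to compute the contributes relation as a transitive closure, under the simplifying assumption that each act has a single parent in the graph. The small example at the end is a fragment in the spirit of Figure 9; its exact recipe structure is our assumption.

```python
D_CONTRIBUTES, ENABLES = "D-contributes", "enablement"

class Rgraph:
    """Toy recipe graph: acts are strings; each act records the act it is
    linked to and the type of link (D-contributes or enablement)."""
    def __init__(self, root=None):
        self.root = root
        self.links = {}                   # act -> (parent act, link label)

    def add(self, act, parent, label):
        self.links[act] = (parent, label)

    def parent(self, act):
        return self.links[act][0] if act in self.links else None

    def contributes(self, act, target):
        """Transitive closure of the D-contributes and enablement links."""
        while act in self.links:
            act = self.links[act][0]
            if act == target:
                return True
        return False

# LookUpPhoneNb enables RequestCommunication, which D-contributes to Connect;
# by transitive closure, LookUpPhoneNb therefore contributes to Connect.
g = Rgraph(root="Connect")
g.add("RequestCommunication", "Connect", D_CONTRIBUTES)
g.add("LookUpPhoneNb", "RequestCommunication", ENABLES)
print(g.contributes("LookUpPhoneNb", "Connect"))   # True
```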

Rgraphs also include status information, indicating, for each action in the Rgraph, the belief of the agent being modelled about its state in the agents' underlying plan. Action statuses are given in Table 3. For instance, in Figure 9, the action Connect(System, User, PhNb) has a status of 1, indicating that the agent being modelled believes that this action will be part of his and the other agent's joint plan, whereas EstablishCommunication(User, System) has a status of 6, indicating that this action has been performed.

Actions are generally added to the Rgraph with a status value of 1; this status is then gradually updated, as the dialogue proceeds, towards status 6.

-De"nitions of action relations are given in Lochbaum (1994) and Balkanski (1993). The contributes relationwas originally introduced and de"ned for plan recognition purposes, to allow a reasoning system to take intoaccount the fact that agents beliefs may be partial (Lochbaum, Grosz & Sidner, 1990). In particular, an agentmight not know the exact relation that holds between two actions. The contributes relation allows therepresentation of this level of knowledge.


The following sections will explain how the updating is performed. For each action in the Rgraph, these transitions correspond to the evolution from an intention to perform that action, to the reasoning necessary to perform it, and finally to its performance. These statuses can be described in the context of Bratman's theory of intention (Bratman, 1987). In this theory, intention plays three characteristic roles: (1) posing a problem and leading to means-end reasoning, (2) constraining the agent's other intentions and (3) guiding the agent's conduct. The first two roles are related to the reasoning-centred dimension of intention, while the third concerns the relation between intention and acting ("endeavouring" in Bratman's terms, or the volitional dimension of intention). The status of an action in the Rgraph corresponds first to the establishment of an individual intention (status 1), then of a mutual belief concerning this intention to perform that action (status 2). The agent then engages in means-end reasoning, leading to status 3 or 4 depending on whether the act is basic or complex. Finally, the agent will endeavour to perform the act, leading to status 6 if the act is successfully performed, and to status 5 otherwise. The second role of intention, namely that of a filter with respect to other intentions, has so far been left aside. In our corpus, we have not found any example of actions with unexpected side-effects that might lead the agent to have to replan. This is nevertheless an aspect of our model that will require further study.
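Read procedurally, this progression amounts to a small transition function. The sketch below is our reading of Table 3 and the paragraph above, not code from the paper; the keyword flags are hypothetical.

```python
def next_status(status, *, basic, agreed=False, performable=True, done=False):
    """One step of the status lifecycle of Table 3 (our reading of the text):
    1 (individual intention) -> 2 (mutual belief about the intention)
    -> 3 or 4 (means-end reasoning: basic vs. complex act)
    -> 6 (performed) or 5 (cannot be performed: repair required)."""
    if status == 1 and agreed:
        return 2
    if status == 2:
        return 3 if basic else 4
    if status in (3, 4) and not performable:
        return 5
    if status in (3, 4) and done:
        return 6
    return status

# A basic act typically evolves 1 -> 2 -> 3 -> 6 as the dialogue proceeds.
s = 1
s = next_status(s, basic=True, agreed=True)   # 2
s = next_status(s, basic=True)                # 3
s = next_status(s, basic=True, done=True)     # 6
```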

Another aspect relating to action statuses that we have not yet considered is the difference in status between SharedPlans and the individual plans of the agent being modelled. If an action is not basic, then the agents must indeed decide whether they will form a SharedPlan for the act (status 4), or whether one of them will form an individual plan for the act. Indeed, some recipes and/or actions may be items of a detailed nature that need not be part of any shared recipe. For instance, the caller in our examples need not know the details of how the operator finds an extension, except if the operator runs into problems and needs the caller's cooperation to succeed. Therefore, if the caller intervenes, then what seemed to be an individual plan becomes a shared plan. Integrating shared plans and individual plans is a difficult but important problem that remains to be investigated.

Lochbaum (1994) introduces statuses for distinguishing between individual and shared plans, as well as statuses that inspired our statuses 1–4. However, it is not clear how these statuses are used in her model, nor how they are updated and affected by the rest of her model. In Section 4.4, we will show how our Task Advancement algorithm affects, and is affected by, the information provided by action statuses.

4.3. DISCOURSE INTERPRETATION

The Interpretation algorithm models the process by which an agent determines how the goal underlying an utterance contributes to the goal and subgoals that are part of the current collaborative plan. In this paper, we focus on utterances that communicate information about a single action, and reason only about that action.† The role of the interpretation algorithm is therefore to determine if, and if so how, the performance of act A,

†As shown elsewhere (Balkanski, 1993), multi-action utterances may convey a wealth of information about a speaker's beliefs and intentions that should also be taken into account in interpreting the agent's utterances. The interpretation of these more complex types of utterances is an area of future research.


underlying the input utterance,† is relevant given the beliefs of the agent being modelled about recipes and the current Rgraph. Act A will be considered relevant if the algorithm can determine that it contributes in some way to the head action of the Rgraph, for instance by directly contributing to the act B currently in focus. Two cases are to be distinguished: the interpretation of the initial utterance, when the Rgraph is empty, and the interpretation of subsequent utterances, when the Rgraph is not empty.

The interpretation of the initial utterance depends essentially on the content of this utterance in terms of the goal that is expressed. More specifically, if the action underlying the utterance expresses this goal explicitly, then the Rgraph can be initialized with the act A underlying this utterance. Otherwise, the utterance is about an action that contributes to an implicit goal, in which case this goal has to be inferred in order to initialize the Rgraph appropriately. Many researchers assume that speakers begin by explicitly stating their goal (e.g. Lochbaum, 1994; Grosz & Sidner, 1990; Rich & Sidner, 1998). Allen (1983) does not make such an assumption, but limits the search space by allowing only two possible goals (taking and meeting a train). Our algorithm combines both approaches, selecting one or the other on the basis of whether or not the agent being modelled can perform (or participate in the performance of) the action ActInit underlying the initial utterance. It therefore distinguishes the following three cases (in what follows, the agent being modelled is referred to as the hearer).

• If the agent of ActInit is the hearer: the Rgraph is initialized with ActInit. (Since the hearer is the agent of ActInit, this agent can adopt an intention to perform ActInit.)

• If the agent of ActInit is the speaker:
  * If ActInit is complex: the Rgraph is initialized with ActInit. (Since the utterance is about a complex action in the performance of which the hearer can participate, the hearer can adopt an intention that ActInit be performed.)
  * If ActInit is basic: the Rgraph is initialized with ActInit and a recipe for an action Act containing ActInit. [Since the utterance is about a basic action performed (or to be performed) by the speaker alone, the hearer cannot participate in the performance of this action. ActInit is thus assumed to contribute to another action Act, in the performance of which the hearer can participate. The hearer can adopt an intention that Act be performed.]

In all three cases, the Interpretation algorithm will eventually add to the Rgraph either a recipe for ActInit or a recipe containing ActInit. If there are several possible recipes for ActInit, the algorithm selects one of them based on priority (recipes are ordered in the database, according to criteria defined on the basis of our corpus study). If there are several recipes containing ActInit, the choice is made on the basis of parameter constraints and of the temporal ordering of actions in a recipe. (An alternative that could be explored is to ask the speaker to choose among several possible recipes, as is done, for instance, in Lesh, Rich & Sidner, 1998.) As will be shown in the next section, the Task Advancement algorithm allows for recipes to be changed later if a problem arises.
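A compact way to render these three cases, reusing the Action, Recipe and Rgraph sketches above, is shown below. The lookup function find_recipe_containing stands for the priority-ordered recipe selection just described; it and the overall shape of the code are our own assumptions, not the paper's implementation.

```python
def initialize_rgraph(act_init, hearer, find_recipe_containing):
    """Initialize the Rgraph from act ActInit underlying the initial utterance."""
    if hearer in act_init.agents or not act_init.basic:
        # Case 1: the hearer is the agent of ActInit and can intend to do it.
        # Case 2: ActInit is the speaker's act but complex, so the hearer can
        # participate in its performance and intend that it be performed.
        return Rgraph(root=act_init.name)
    # Case 3: a basic act of the speaker alone. Assume ActInit contributes to
    # some act Act whose recipe contains it, and root the graph at that act.
    recipe = find_recipe_containing(act_init)
    graph = Rgraph(root=recipe.head.name)
    for constituent in recipe.constituents:
        graph.add(constituent.name, recipe.head.name, D_CONTRIBUTES)
    return graph

# With the hypothetical Connect recipe above, and the system as hearer and
# agent of Connect, case 1 applies and the graph is rooted at Connect.
print(initialize_rgraph(connect, "System", lambda a: connect_recipe).root)
```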

Looking now at the more general case, when the Rgraph is not empty, the Interpretation algorithm's role is to search for a link between the act A underlying the input utterance and the act B currently in focus, thereby showing a Contributes relation between act A and the head action of the Rgraph.

†The act A underlying the current utterance may be a communicative action, for instance when the utterance provides information concerning an action parameter [as in (9) of Figure 2], or a task-related action, for instance when the utterance consists in the goal specification of an act [as in (3) of Figure 2].


In our current model, the act in focus corresponds to the current action, namely the first action to be performed in the temporal order of the actions in the Rgraph; if all actions of the Rgraph have been executed, the act in focus is the root of the Rgraph. Links between actions can occur in a number of ways.

f If A can be identi"ed with B: then A is an instantiation of the same action as thatrepresented by B, and replaces B in the Rgraph.

f If A is part of a recipe for B, and the constraints associated with this recipe aresatis"able: then A can be added to the Rgraph, connected by a D-contributes relationto act B (downwards construction).

f If B is part of a recipe for A, and the constraints associated with this recipe aresatis"able: then A can be added to the Rgraph, B being connected by a D-contributesrelation to act A (upwards construction, when B is the root of the Rgraph).

• If A is such that its performance will enable the subsequent performance of act B, for example by instantiating one of its parameters: then A can be added to the Rgraph, connected to act B by an enablement relation.

Links can also occur across several recipes, action A being part of a recipe for C, and C being part of a recipe for B, for instance. These different types of links are illustrated in the sample Rgraph presented earlier in Figure 9, and will be further illustrated with sample dialogues in Section 6. If a link can be found between the input act A and the act in focus B, then act A is added to the Rgraph, along with the other acts (and associated recipes) that may be part of the path forming this link. If a link cannot be found, the Interpretation algorithm searches for a link between A and the act B′ to which B contributes, either by D-contributes (because B′ is in the recipe for B) or by enablement (because B′ enables B). If a link is still not found, it repeats the search procedure for a link between A and the act B″ to which B′ contributes, and so forth, up to the root of the Rgraph. If no link is found between the incoming act A and the root act, then the algorithm fails, and a message indicating so is returned to the interpretation module.
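This climbing search can be summarized as follows, reusing the illustrative structures from the previous sketch; find_path is a hypothetical stand-in for the recipe-base lookup that builds the chain of acts and recipes linking two acts:

```python
def find_path(a: Act, b: Act):
    """Hypothetical lookup: a chain of acts and recipes linking `a` to `b`
    by identification, D-contributes (in either direction) or enablement,
    possibly across several recipes; None if no such chain exists."""
    raise NotImplementedError

def contributes_parent(g: Rgraph, b: Act):
    """The act that `b` contributes to in `g`, or None if `b` is the root."""
    for child, _relation, parent in g.edges:
        if child is b:
            return parent
    return None

def interpret_act(g: Rgraph, a: Act, focus: Act):
    """Climb from the act in focus towards the root until `a` can be linked."""
    b = focus
    while b is not None:
        path = find_path(a, b)
        if path is not None:
            return path        # the acts and recipes to add to the Rgraph
        b = contributes_parent(g, b)
    return None                # failure, reported to the interpretation module
```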

The Rgraph may be built in an upward or downward manner, and the search for a link between actions may lead to the addition of more than one recipe. Heuristic rules are needed to constrain the search and to avoid the danger of becoming a computationally explosive plan recognition procedure. For the time being, our domain is restricted enough not to lead to such problems. Scaling up to larger domains will require further research [Lesh & Etzioni (1996), for instance, address this problem]. Interestingly, Lesh et al. (1998) have shown that in human–computer collaboration, properties of the collaborative setting may be exploited to make plan recognition tractable. These properties include the focus of attention, the use of partially elaborated hierarchical plans and the possibility of asking for clarification, which are all properties that our model exploits as well.

Acts that are added to the Rgraph during interpretation acquire statuses 2, 4 or 6. The initial action, ActInit, is always derived from an utterance produced by the speaker. It is assigned a status of 6 if it has been performed (in our application, it is the speaker "calling" the system), and a status of 2 otherwise. In both cases, this means that the agent being modelled comes to agree to the speaker's proposition to include A as an element of their joint plan. The agent may further believe that the act has been performed (status 6).


When ActInit is basic and its agent is the speaker, then, as mentioned above, the Rgraph is initialized with ActInit and a recipe for an action Act containing ActInit (see the third case for the interpretation of the initial utterance). In this case, Act is given a status of 4. This assignment shows that the agent being modelled makes the hypothesis that the speaker also agrees to the particular recipe that it has selected. If it did not make this assumption, it would have to ask for confirmation each time it selects a recipe, which would represent an unrealistic burden on the dialogue. This point will be further addressed in the discussion section.

When the act A being interpreted is not the initial action, then we also assume mutual agreement about the additional actions that are added to the Rgraph as a result of the interpretation of A. These actions are assigned status 4 if they are part of the path between the act A being interpreted and the act B in focus (because a recipe for those acts will have been selected for the purpose of that path), and status 2 otherwise (because the interpretation algorithm does not search for recipes for every action added to the Rgraph as a result of the path). For instance, if a path is found between the input act A and the act in focus B, such that B is part of a recipe for C and C is part of a recipe for A (upwards construction), then C and A will be assigned status 4, while the other actions of the recipes for C and A will be added with status 2. The hypothesis that the agent being modelled can assume speaker agreement for the path found between A and B is supported by the fact that act A was mentioned by the speaker. It is also coherent with our observations about the caller's behaviour in the corpus, showing that he shares at least some of the operator's domain knowledge. This hypothesis, however, will have to be weakened for applications where a priori shared knowledge about the task is more limited than in the context of a phone desk. During the planning phase, on the other hand, as will be explained in the next section, actions are added with a status of 1, i.e. without the assumption of speaker agreement, because the agent's reasoning is no longer guided by the need to integrate an act mentioned by the speaker into the current Rgraph.
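In terms of the earlier sketch, this assignment might look as follows (illustrative only; recipe_selected marks the acts on the path for which a recipe was chosen while building the link):

```python
def add_path_acts(g: Rgraph, path_acts: list[Act], recipe_selected: set[str]):
    """Acts whose recipe was selected for the link get status 4 (agreement on
    the recipe is assumed); acts merely pulled in by those recipes get
    status 2 (agreement on the act only)."""
    for act in path_acts:
        act.status = 4 if act.name in recipe_selected else 2
```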

So far, an act A underlying the speaker's utterance was found relevant, given the beliefs of the agent being modelled about recipes and the current Rgraph, if a link could be found between this act A and the act B currently in focus. Act A can also be relevant if its performance can be interpreted as establishing a belief concerning act B. This is the case for explicit agreements, for instance, when speakers indicate that they understand, and accept, the content of their partner's utterance concerning B. Since these actions do not contribute directly to the task, they are not added to the Rgraph. Instead, the Interpretation algorithm updates the status of act B on the basis of this belief, for instance by updating to 2 a status that is at 1. In other cases, utterances like "okay" or "yes" only serve to indicate understanding. Such acknowledgement acts can be referred to as grounding acts (Clark & Schaefer, 1987; Traum, 1994). The establishment of mutual belief is a topic that will be further discussed in the discussion section.

4.4. DISCOURSE GENERATION: THE TASK-ADVANCEMENT ALGORITHM

The generation algorithm leads to the generation of an utterance which contributes to further elaborating the collaborative plan or, when no further elaboration is necessary, suggests an end to the dialogue. A critical aspect of this algorithm, however, is to allow an agent to make progress in the realization of the common goal, hence the name chosen for this algorithm: the Task Advancement (TA) algorithm. To do so, this algorithm interleaves planning and execution, while being guided primarily by execution constraints. It determines which actions the agent being modelled can perform, and in which order; it executes actions when possible, and finds what information is possibly missing to perform them; it determines when to plan further actions, and when to produce an utterance and what it should contain. It performs these different tasks by examining and updating the Rgraph.

The overall structure of the Task Advancement algorithm is a loop. Its initialization consists in determining the current action, its stopping condition is verified when a communicative act needs to be produced, and its body consists in applying a number of rules to allow the task to progress. Details of these three parts of the algorithm are the following.

• Initialization. The algorithm begins by initializing action Ai to the current action. Currently, we make simplifying assumptions and let Ai be the next action to be executed, that is, the first leaf action in the Rgraph, by temporal order, whose status is different from 6. This assumption rests on the fact that, generally, the current action has a direct link with the discourse focus action. Ideally, the TA algorithm should, however, be able to define the current action both on the basis of execution constraints (what is the next action to be executed?) and discourse constraints (what is the action currently in focus?). We are in the process of investigating this issue.

• Stopping conditions. The algorithm returns (for later generation) a communicative act when Ai verifies one of the following three clauses.

(1) Ai is a communicative action whose agent is the agent being modelled; in this case, the status of Ai is updated to 6, and Ai is returned.

(2) Ai is an action whose agent is not the agent being modelled; in this case a communicative act is constructed and returned, signalling the success of the previous action performed by the agent being modelled.

(3) Ai is the root action; in this case a communicative act is constructed and returned, signalling the success or failure (and reason) of action Ai.

These stopping conditions correspond to situations in the dialogue when the agent being modelled has to interrupt his planning/execution process. He then either asks for or provides the other agent with information which will contribute to establishing mutual belief between the two agents about domain knowledge necessary for the execution of an action in the plan, or about the status of an action in the plan. The communicative actions returned by clause (1) are used mainly to satisfy a knowledge precondition associated with an action in the Rgraph, as illustrated in utterance (4) in the dialogue of Figure 1, when the operator asks the caller for Mr A's extension. In this type of example, mutual belief is needed concerning the value of a parameter; one of the agents is missing necessary information, and the communicative actions that are added to the Rgraph for obtaining this information form clarification subdialogues.

Communicative actions constructed and returned by clause (2) are used when the algorithm determines that the next action is to be performed by someone other than the agent being modelled. The agent being modelled then has to let that other person perform the next action. It can do so in a number of ways. We selected the strategy that is used most often in our corpus, namely for the agent to provide information about the last action performed, both because of its frequency of use and its applicability in a wide range of situations in our domain. This strategy is illustrated in utterance (10) of the dialogue of Figure 1, when the operator signals to the caller that she has performed the requested transfer. Another strategy would be to directly ask for Ai to be performed. The use of our model in other domains would require introducing a level of choice among strategies, depending on the context of Ai.

FIGURE 10. Status update for Ai by the Task Advancement algorithm.

Communicative actions constructed and returned by clause (3) are used when the algorithm determines that it has finished reasoning about the task. At this point, either all actions in the Rgraph will have been performed successfully, or some action will have failed, leading to the failure of the root action. In both cases, the communicative action returned will lead the agent to indicate to the user either the success or the failure of this action.

• Body of the loop. The algorithm applies one of a set of rules, described below, on the basis of the status of the current action (see Table 2). Each rule returns the (new) current action, which is either the same as the input action but with a different status, or a different action. If this action does not verify the stopping condition of the algorithm, it, in turn, becomes the input to the next iteration of the loop, leading to another application of a rule. Brief descriptions of these rules are the following; Figure 10 provides a schematic representation of the status updates resulting from the application of these rules, and detailed examples illustrating all of these rules will be analysed in Section 6.

RULE 1 (Ai's status = 1). Establishing agreement about the current action.

Ai is returned with a status of 2. (This rule rests on the hypothesis that the agent being modelled assumes that the other dialogue participant also agrees that Ai should be part of their mutual recipe. Assumptions about user agreement and mutual belief will be discussed later.)

RULE 2 (Ai's status = 2). Recipe selection.

(a) If Ai is basic: it is returned with a status of 3.
(b) Else, if Ai is complex: the TA algorithm looks for a recipe for Ai that is compatible with the Rgraph.

  * If it finds one,† it adds it to the Rgraph, and returns Ai with a status of 4.

  * If it cannot find a recipe, or if none of the recipes it knows are compatible with the Rgraph, then it adds to the Rgraph the ObtainRecipe action (and associated recipe), connected by an enablement link with Ai.

†As mentioned earlier, recipes are ordered in the database. Therefore, if it finds several, it selects the first one.

RULE 3 (Ai's status = 3 and agent of Ai = agent being modelled). Execution of a basic action.

The TA algorithm verifies that all of Ai's parameters are instantiated, and that all its constraints are satisfied. If so, it performs Ai:

(a) If the action was successfully executed: Ai is returned with a status of 6 and a result attribute indicating success.
(b) Else, Ai is returned with status 5 and a result attribute indicating failure.

If a parameter is not instantiated or a constraint not verified, then Ai cannot be performed:

(c) The TA algorithm adds to the Rgraph an action that will allow it to (try to) instantiate this parameter or satisfy the constraint. This new action is connected by an enablement link with Ai and returned with a status of 1. It is either a particular action associated with Ai or, if there is no such action, the AchieveHaveValue action.
(d) If all possible actions used to instantiate a parameter or satisfy a constraint fail, then Ai is returned with a status of 6 and a result attribute indicating failure (the reason for failure is stored for later generation).

RULE 4 (Ai's status = 4). Execution of a complex action.

The TA algorithm searches for the first action in the recipe for Ai that does not have a status of 6.

(a) If it finds one, this action is returned.
(b) If there is no such action (i.e. all actions have been performed), then Ai is returned with a status of 6 and with a result attribute indicating success.

RULE 5 (Ai's status = 5). Repair: reexecution of a basic action that failed, or replanning of a complex action that cannot be performed because an action in its recipe failed.

If Ai is basic:

(a) If the TA algorithm can determine the reason for failure and make the necessary changes so that the reexecution has a chance to succeed, then Ai is returned with a status of 3 to allow it to be reattempted.
(b) Else Ai is returned with a status of 6 and a result attribute indicating failure.

Else, if Ai is complex: two cases are to be distinguished, depending on whether Ai's recipe has been partially executed or not.

  * If there are no actions in Ai's recipe that have already been successfully executed: then the TA algorithm searches for another recipe for Ai.

(c) If it finds one, it is selected and Ai is returned with a status of 4. For example, in Figure 9, if TransferPerson had a status of 5, then a new recipe for this action could be tried, since no action "beneath" it has been performed.
(d) If all recipes have been tried, then Ai is returned with a status of 6 and a result attribute indicating failure.

  * If there are actions in Ai's recipe that have already been successfully executed: then trying a new recipe is not possible. The TA algorithm therefore looks for a different link between Ai and the head(s) of the executed actions.

(e) If such a link is found, the Rgraph is updated on the basis of this link, and Ai is returned with a status of 4. For example, in Figure 9, if EstablishCommunication(User, Mr B) had a status of 5, then a new link between this action and EstablishCommunication(User, System) would have to be found, since this last action has already been performed.†
(f) Else Ai is returned with a status of 6 and a result attribute indicating failure.

†In fact, this rule needs to be refined to take into account the fact that some actions may have effects that need to be undone.

RULE 6 (Ai's status = 6). Processing of Ai is finished, successfully or unsuccessfully, but no more repair is possible.

Two cases are to be distinguished, depending on how Ai is connected to the Rgraph.

(a) If Ai is connected by an enablement link to an action of the Rgraph: then this other action is returned with its status unchanged. For example, if the action LookUpPhoneNb in Figure 9 acquired status 6, then the next action to be considered would be the RequestCommunication action.
(b) Else, if Ai is connected by a D-contributes link to an action of the Rgraph: then this other action is returned, either with its status unchanged if the result attribute of Ai indicates success, or with a status of 5 if the result attribute of Ai indicates failure. For example, if the RequestCommunication(System, PhNb) action acquired status 6 and was performed successfully, then the next action to be considered would be TransferPerson; by virtue of Rule 4, the algorithm would then consider the Connect action. On the other hand, if RequestCommunication had failed, then TransferPerson would also fail, and hence would have to be replanned (by Rule 5).
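Putting the initialization, the stopping conditions and the six rules together, the loop can be summarized by the following sketch. The status transitions follow Figure 10; every helper assigned from _todo below is a hypothetical stand-in for a Plan Reasoner operation that is not spelled out here, and "system" names the agent being modelled:

```python
def _todo(*args, **kwargs):
    # Hypothetical Plan Reasoner operations, left unimplemented in this sketch.
    raise NotImplementedError

first_unexecuted_leaf = is_communicative = report_last_action = _todo
report_root_result = select_recipe = execute_basic = _todo
advance_complex = repair = pop_to_parent = _todo

def task_advancement(g: Rgraph):
    ai = first_unexecuted_leaf(g)          # initialization: the current action
    while True:
        # Stopping condition (1): a communicative act of the modelled agent.
        if is_communicative(ai) and ai.agent == "system":
            ai.status = 6
            return ai
        # Stopping condition (2): the next action belongs to the other agent.
        if ai.agent != "system":
            return report_last_action(g)
        # Stopping condition (3): the root act (reached only once its recipe
        # has been fully processed); report its success or failure.
        if ai is g.root:
            return report_root_result(ai)
        # Body: apply the rule matching the current action's status.
        if ai.status == 1:                 # Rule 1: assume agreement
            ai.status = 2
        elif ai.status == 2:               # Rule 2: recipe selection
            ai = select_recipe(g, ai)
        elif ai.status == 3:               # Rule 3: execute a basic action
            ai = execute_basic(g, ai)
        elif ai.status == 4:               # Rule 4: advance a complex action
            ai = advance_complex(g, ai)
        elif ai.status == 5:               # Rule 5: repair or replan
            ai = repair(g, ai)
        else:                              # Rule 6: climb to the linked action
            ai = pop_to_parent(g, ai)
```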

5. Dialogue system architecture

A schematic representation of the architecture of our system is given in Figure 11. In the phone desk domain, we have two agents: the user, who assumes the role of the caller, and the system, which assumes the role of the operator. We begin by describing the system's knowledge and then present its reasoning capabilities.


FIGURE 11. Schematic diagram of the system architecture. (The legend distinguishes: reception and sending of an utterance; process calls; data retrieval; data retrieval and update; processes; data.)


5.1. SYSTEM'S KNOWLEDGE

The system has both static and dynamic knowledge. The system's dynamic knowledge consists of knowledge about the dialogue context. It is updated as the dialogue progresses, and includes a recipe graph and a focus stack. In this paper, we focus on the Rgraph, described in the previous section.

The system's static knowledge consists of a knowledge base of actions, recipes and entities. This knowledge includes both communicative knowledge and domain knowledge. The first type is task independent, whereas the second depends on the application, and has to be built for each new domain. Looking first at the task-independent communicative actions and recipes, one important reason for engaging in communication is to obtain information. For this purpose, our system makes use of the complex AchieveHaveValue action, which involves obtaining or confirming a value for a parameter of an action to be performed.† The system's knowledge base currently contains three recipes for this action, given in Figure 12. For presentation purposes, we represent graphically equality constraints between agents by using the same variable names, and temporal constraints by presenting the actions of the recipes in their temporal order. The first, achieveHaveValueByRequest, is used when G1 does not yet have a value for the parameter P, and leads to the generation of a wh-question. The second, achieveHaveValueByConfirm, is used when G1 already has a value for the parameter, and leads to the generation of a yes/no question. The V1 value indicated for the parameter P in the CheckValue action is the value that G1 is seeking to verify. The V2 value indicated for the parameter P in the ConfirmValue action will be the same as V1 if G2 answers positively; otherwise, V2 is the new value provided by G2. The V value indicated for the parameter P in the AchieveHaveValue action will then be equal to V1 or V2 depending on G2's answer. ConfirmValue is therefore a complex action, with several possible recipes, to model, among other things, the fact that a speaker may or may not provide an alternative value in the case of a negative answer [see Bunt (1989) for a possible formalization of different sorts of answer functions]. The third recipe, achieveHaveValueByInform, is used when G1 is provided with the missing information without having previously asked for it explicitly.

†The AchieveHaveValue action is similar to Lochbaum's Achieve(has.sat.descr) action (Lochbaum, 1995). We introduce, however, three different recipes for this action, thereby modelling additional aspects of the action, such as the request for information.

FIGURE 12. AchieveHaveValue recipes.
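For illustration, the three recipes could be encoded as data along the following lines (a hypothetical encoding: the action names follow the text, list order stands for temporal order, shared variable names stand for the equality constraints, and the exact bodies in Figure 12 may differ):

```python
# G1 is the agent seeking a value for parameter P; G2 provides or confirms it.
ACHIEVE_HAVE_VALUE_RECIPES = [
    {   # G1 has no value yet: a wh-question, answered by the value itself.
        "name": "achieveHaveValueByRequest",
        "body": [("RequestValue", "G1", ("P",)),
                 ("InformValue",  "G2", ("P", "V"))],
    },
    {   # G1 already holds a candidate value V1: a yes/no question; V2 is V1
        # on a positive answer, or the new value supplied by G2 otherwise.
        "name": "achieveHaveValueByConfirm",
        "body": [("CheckValue",   "G1", ("P", "V1")),
                 ("ConfirmValue", "G2", ("P", "V2"))],
    },
    {   # G2 volunteers the value without having been asked explicitly.
        "name": "achieveHaveValueByInform",
        "body": [("InformValue", "G2", ("P", "V"))],
    },
]
```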

We therefore consider that the communicative actions in dialogues (here RequestValue, InformValue, CheckValue and ConfirmValue) contribute to the same goal, AchieveHaveValue, which is to obtain information. Similar considerations are found in other formalisms (e.g. Bunt, 1989; Sadek, 1991). The recipes and actions presented in Figure 12 constitute an initial set of communicative actions, sufficient for the purposes of our corpus. Extensions to other types of dialogues will require additional actions. Possible refinements include distinguishing between yes/no questions and checks [as done by Carletta, Isard, A., Isard, S., Kowtko, Doherty-Sneddon & Anderson (1997), for instance], introducing positive and negative CheckValue actions, depending on the speaker's beliefs about the hearer's beliefs concerning the value to be checked [as in Bunt (1989), for instance], and allowing for the generation of yes/no questions in other contexts than that of the CheckValue action (as needed for indirect speech acts, for instance).

The system's task-related actions and recipes were defined on the basis of our corpus analysis. Those that are needed for the examples presented later in this paper are given in Figures 13–16. In our domain, a number of actions are communicative. They are nevertheless categorized as task-related since they are closely related to the phone desk context.

The system's database contains two recipes for establishing communication. The first, establishCommunication, given in Figure 13, requires that the caller, G1, first request the communication, and then that the callee, G2, accept it. Constraints on this recipe include that the PhNb parameter be the number of G2. In our domain, establishing a communication is always done by telephone, but in other domains, communication may of course be established differently, for example by calling someone's name, or tapping someone on the back, the callee then accepting the communication by replying "yes?", or simply establishing eye contact.

The second recipe, establishCommByIntermediary, given in Figure 14, consists in G1 establishing communication with an intermediary person G3 (the telephone operator, for instance), G3 transferring the call to G2, and G2 accepting the communication. This recipe includes constraints encoding the fact that G3 has to know G2, i.e. G3 has to be the telephone operator or a person working with G2. According to our corpus, G3 could also be someone belonging to G2's department or a person from the Personnel Department. These alternatives are currently not part of the recipe, mostly because the knowledge concerning the recipes for establishing communication with a department has not yet been developed.

FIGURE 13. establishCommunication recipe.

FIGURE 14. establishCommByIntermediary recipe.

FIGURE 15. talkToPerson recipe.

FIGURE 16. transferPerson recipe.



The talkToPerson recipe reflects the fact that for a caller G1 to speak to a callee G2, he must first establish communication with G2, and then speak.

Finally, transferring a call consists simply in requesting the communication (that is, "dialing" the number) and then establishing the connection. In our domain, acknowledging the connection is not part of the transferPerson recipe, but part of the recipes for the EstablishCommunication action. Constraints on the transferPerson recipe include that PhNb is the number of G2, and that G3 has to be employed by the company. Indeed, only someone within the company, the operator or another employee, can transfer an outside call to someone within. Finding an appropriate instantiation for the PhNb parameter of the RequestCommunication action, for instance by looking it up, or by asking the caller for the number, is necessary prior to dialing a particular number. This action is treated as a knowledge precondition. We follow Lochbaum (1995) in treating knowledge preconditions not as actions that are part of a recipe (similar parameter definition actions would then, at least theoretically, be needed for all parameters), but as actions that are added, when needed, to the dialogue context (here an Rgraph, as will be explained later).
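In terms of our earlier sketches, the treatment of such a knowledge precondition might look as follows (illustrative only; specific_enabler_for is a hypothetical lookup returning, for example, a LookUpPhoneNb act for the PhNb parameter of RequestCommunication):

```python
def specific_enabler_for(act: Act, param: str):
    """Hypothetical lookup in the action base for an action dedicated to
    instantiating `param` of `act`; None if no such action exists."""
    return None

def satisfy_knowledge_precondition(g: Rgraph, act: Act, param: str) -> Act:
    """Rule 3(c): attach an enabling action for an uninstantiated parameter,
    rather than storing a parameter-definition action inside the recipe."""
    enabler = specific_enabler_for(act, param)            # e.g. LookUpPhoneNb
    if enabler is None:                                   # fall back to the
        enabler = Act("AchieveHaveValue", "system", True) # generic action
    enabler.status = 1
    g.edges.append((enabler, "enablement", act))
    return enabler                                        # new current action
```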

5.2. SYSTEM'S REASONING CAPABILITIES

The system's reasoning capabilities are built into four main components: a Dialogue Manager, an Interpretation module, a Generation module and a Plan Reasoner, as shown in Figure 11. The Dialogue Manager receives the input, a natural language utterance, sends it to the Interpretation module, then calls the Generation module to produce its response.

The Interpretation module's role is to update the discourse context so as to reflect the contribution of the current utterance to the dialogue. An utterance is thus viewed as an action which modifies the mental states of the agents involved. This module is divided into three submodules. The Parser and Semantic Interpreter parses sentences input by the user, and translates them into a semantic representation. The role of the Speech Act Identification module is to recognize from this semantic representation the speech act (e.g. suggest) as well as the act that is related to the task and which is to be reasoned about. Finally, the Pragmatic Interpretation module determines the meaning of the action underlying the utterance in the current discourse context, and updates the discourse context accordingly. To do so, this module calls upon the Plan Reasoner, whose role during interpretation is to find a way of integrating the act A underlying the current utterance into the Rgraph. These reasoning processes are modelled by the Interpretation algorithm described in the previous section.

The goal of the Generation module is to make as much progress as possible in the task underlying the dialogue, in reaction to the interpretation of an utterance, and then to generate the system's response. Different submodules are again introduced for this purpose, which mirror the three submodules of the Interpretation module. The Planning/Execution module, calling upon the Plan Reasoner, tries to further build the Rgraph, executing actions when necessary, and then plans the content of the system's response. These reasoning processes are modelled by the Task Advancement algorithm described earlier. The Speech Act Generator and Surface Generator modules then further structure the system's utterance.


Since the first and last two submodules are not central to our current research interests, we have so far left them aside. In designing and implementing our system, we focused on the Plan Reasoner, called upon by the Interpretation and Generation modules when the reasoning required involves retrieving information from the knowledge base, and/or manipulating the discourse context, in particular the Rgraph. Updating the discourse context may also involve pushing a new segment on, or popping a segment off, the focus stack, depending on whether the current utterance begins a new discourse segment, continues the current segment or ends it. Further research is needed in this area to determine precisely how to recognize the beginnings and ends of discourse segments without having to rely on predetermined forms (as is done in other implementations, e.g. Lochbaum, 1994; Rich & Sidner, 1998).
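Seen from the Dialogue Manager, one processing turn therefore reduces to the following skeleton (an illustrative sketch; the module names follow Figure 11, the method names are ours):

```python
class DialogueManager:
    """Routes each utterance through interpretation, then generation."""
    def __init__(self, interpreter, generator):
        self.interpreter = interpreter  # parser, speech-act id., pragmatics
        self.generator = generator      # planning/execution, surface generation

    def handle_turn(self, utterance: str) -> str:
        # Both modules call upon the shared Plan Reasoner to read and
        # update the discourse context (focus stack and Rgraph).
        self.interpreter.update_context(utterance)
        return self.generator.respond()
```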

6. Sample dialogues

In this section, we simulate the system's processing of three sample dialogues of increasing complexity. All three are motivated by our corpus. In the first dialogue, the person requested is immediately found and transferred over to the caller without the need for any clarification. The two following dialogues present alternative scenarios in which difficulties arise in the identification of the person requested, thereby illustrating the cooperative behaviour of the speakers. These examples will allow us to discuss some interesting aspects of the Interpretation and Task Advancement algorithms presented earlier in the paper. Other aspects will be discussed in the subsequent discussion section.

The presentation here remains informal and skips some details, but the simulation corresponds to the way the implemented system actually processes the dialogue. The only aspects of the trace that represent partially hand-crafted behaviour for the system have to do with the syntactic and semantic interpretation of the input utterances, as well as the surface generation of the output utterance from the system's internal representation. This will be detailed in Section 7, which presents the working system. There are interesting aspects of natural language interpretation and generation that our system is not yet capable of handling, such as reference, for instance. This explains utterance (4), which repeats "Mr B", rather than using a pronoun. A more natural response on the part of the system would indeed be "Here he is."

6.1. EXAMPLE 1: THE BASIC EXAMPLE

The first dialogue to be examined is given in Figure 17. For the system, the task starts out with the phone call made by the user, indicated in (1), and simulated by the user clicking a button in our interface. The system's Dialogue Manager receives the call. This manipulation act is directly translated into the RequestCommunication action, and sent to the Pragmatic Interpretation module, which in turn sends it to the Plan Reasoner to update the (initially empty) Rgraph.

The Plan Reasoner finds that the RequestCommunication action is part of both the establishCommunication and transferPerson recipes (see Figures 13 and 16). The transferPerson recipe is not used because of the constraint stating that G1 has to be employed

(1) ⟨telephone call⟩
(2) System: Hello.
(3) User: Hello. I would like to speak to Mr B.
(4) System: Here is Mr B.

FIGURE 17. Sample dialogue, version 1.

FIGURE 18. Recipe graph after the phone call in (1).


by the company. In this particular context, G1 would be the user (agent of the RequestCommunication action) and the constraint would therefore not be satisfied.

Thus, the Plan Reasoner initializes the Rgraph with the establishCommunication recipe and associated actions, with instantiated parameters, as shown in Figure 18. The Plan Reasoner may thus infer implicit goals, and the Rgraph represents the system's current beliefs about the user's intention. The RequestCommunication action is assigned a status of 6 (since it has already been performed), AcceptCommunication a status of 2, and EstablishCommunication a status of 4.

The Pragmatic Interpretation module returns a success signal to the Dialogue Manager, which then calls upon the Planning module. This module in turn calls the Plan Reasoner, which applies the Task Advancement (TA) algorithm to try to advance the task. The Plan Reasoner looks for the first action to be executed (initialization of the TA algorithm) by temporal order. It finds the AcceptCommunication action, a communicative action for which the system is the agent (stopping condition of the TA algorithm). The Planning module updates its status to 6 and sends it to the generation modules, which then produce the standard greeting: "Hello."

Although the above reasoning may seem tedious, it is important to include it. Too many systems avoid the opening utterances of a dialogue, either ignoring conventional greetings, or assuming that the user will begin the dialogue with an explicit expression of his intention. We wish to avoid these simplifications, and model the dialogue right from the beginning, as it occurs in our corpus.

The Dialogue Manager then receives the user's utterance:

(3) User: Hello. I would like to speak to Mr B.

The Dialogue Manager calls the Interpretation module, which interprets the conventional "Hello" greeting as an acknowledgement by the user of the EstablishCommunication action. Since the status of the EstablishCommunication action is 6, this acknowledgement confirms that this action has been successfully performed.


Utterances like this one, which repeat a previous utterance, may be interpreted as acknowledgements if they are followed by another utterance of the same speaker (as is the case here), but as a request for acknowledgement otherwise. These different cases will be discussed in the discussion section.

The first two interpretation submodules then build a syntactic/semantic representation of the second part of the utterance, recognize a request for an action, and the TalkToPerson action, instantiated as TalkToPerson(User, Mr B). The Plan Reasoner subsequently searches the database of recipes for a link between this action and the EstablishCommunication action, which is the action currently in focus in the Rgraph. It finds a D-contributes link through the talkToPerson recipe, but does not pursue this possibility because of incompatible parameter instantiations for the EstablishCommunication action. In the Rgraph, this action is instantiated as EstablishCommunication(User, System), whereas in the talkToPerson recipe, it would be instantiated as EstablishCommunication(User, Mr B).

The Plan Reasoner then finds a different link between the EstablishCommunication and TalkToPerson actions, making use of a second instantiation of the EstablishCommunication action, this time using the establishCommByIntermediary recipe. Since these actions are compatible with the current Rgraph, it adds them, with their recipes, to the Rgraph, as shown in Figure 19.

The Planning module then calls the Plan Reasoner to advance the task. It searches for the next action to be performed (initialization of the TA algorithm), and finds the TransferPerson action. It is a complex action, with a status of 2. Rule 2 is then applied. The Plan Reasoner finds a recipe for this action. After having verified the recipe's constraints, and found them compatible with the current Rgraph, it adds the recipe to the Rgraph, and returns the TransferPerson action with a status of 4. Figure 20 shows the instantiated version of the transferPerson recipe that is added to the Rgraph in Figure 19.

The Plan Reasoner then applies rule 4 and returns the basic-level action RequestCommunication(System, PhNb), with a status of 1, which becomes the current action. Its status is changed to 2 (rule 1), then to 3 (rule 2). The Plan Reasoner then applies rule 3 to the RequestCommunication action, but cannot yet execute this action because the PhNb parameter is not instantiated.

FIGURE 19. Rgraph after the interpretation of utterance (3).

FIGURE 20. Partial view of Figure 19 after the planning phase following the interpretation of utterance (3).

FIGURE 21. Updated Rgraph of Figure 20 after the addition of the LookUpPhoneNb action.


A knowledge precondition is therefore not met. As indicated in the description of rule 3, if a parameter of Ai is not instantiated, the Plan Reasoner adds to the Rgraph either a particular action associated with Ai, or the AchieveHaveValue action, either of which will allow the system to (try to) instantiate this parameter. Here the RequestCommunication action does have a link with the LookUpPhoneNb action, which serves this purpose. The LookUpPhoneNb and AchieveHaveValue actions correspond to the two alternatives that are possible for satisfying this precondition: the system could either search for the number in its database or ask the caller for the missing information. Priority being given to the more specific action, the Plan Reasoner thus adds the LookUpPhoneNb action to the Rgraph, connected to the RequestCommunication action by an enablement link, since the goal of this action is to satisfy a knowledge precondition of the RequestCommunication action. The LookUpPhoneNb action will therefore have to be executed before the RequestCommunication action.

The resulting Rgraph is shown in Figure 21. The LookUpPhoneNb action is returned by rule 3, and it therefore becomes the current action, with status 1, then status 2 (rule 1) and then status 3 (rule 2).

We note here that in Lochbaum's (1994) work, subtrees rooted at actions related by enablement relations are removed from the Rgraph once they are performed. In our model, they are maintained, because we believe that the fact that a constraint has been satisfied (here a parameter instantiation, necessary for the performance of an action), as well as the way in which it was satisfied, is important to the remainder of the dialogue. For instance, it will be needed later if one has to replan an action, produce explanations of the dialogue, or return to earlier segments.


The Plan Reasoner then applies rule 3 to the LookUpPhoneNb action, which it executes after having verified its constraints. The PhNb parameter is replaced with the number found, which is propagated upwards through the Rgraph. This action is returned with a status of 6 and a result attribute indicating success. Rule 6 is then applied, returning the RequestCommunication action, the action enabled by LookUpPhoneNb. Its status is 3. Rule 3 is therefore applied next, and the Plan Reasoner can now perform the RequestCommunication action, which is returned with a status of 6. Application of rule 6 returns the TransferPerson action, which becomes the current action, with status 4, and application of rule 4 then returns the basic-level action Connect. The status of this action is changed from 1 to 2 (rule 1), then 3 (rule 2). The Plan Reasoner then executes this action, and changes its status to 6 (rule 3). TransferPerson again becomes the current action (rule 6), and acquires status 6 as well (rule 4). The action it D-contributes to, EstablishCommunication(User, Mr B), then becomes the current action (rule 6) (see Figure 19).

The AcceptCommunication(Mr B) action then becomes the current action (rule 4). This current action satisfies the stopping condition of the algorithm since it is not to be performed by the system. The Plan Reasoner therefore signals the successful execution of the TransferPerson action to the Generation modules, which will produce the system's answer: "Here is Mr B." If the AcceptCommunication succeeds, then the dialogue ends (since the system has nothing more to do). If it fails (busy signal, for instance), then the system may initiate a subdialogue resuming the previous request, as described in the study of our corpus.

6.2. COOPERATION IN THE DIALOGUE MODEL

Most recipes in our application inherently involve cooperation between agents, in that some of the constituent acts are to be performed by the user while others are to be performed by the system. The cooperative behaviour of agents also manifests itself through communicative actions, which, for example, allow agents to provide information when the other agent makes a request, when required information is lacking, when misunderstanding occurs between the communicating agents, or when a plan initially adopted fails. These actions also allow agents to keep each other informed of the result of manipulation actions that have been performed, or simply to make suggestions. We focus here on communicative actions manifesting helpful behaviour, an important component of this cooperative behaviour.

In this section, we simulate the system's processing of two more examples, focusing on communicative actions and cooperative strategies, thereby illustrating the interpretation and generation of utterances reflecting the agents' helpful behaviour. The simulation again corresponds to the way the implemented system actually processes the dialogue. These examples will also illustrate how the two intention-that axioms presented earlier in the paper (see Figure 7) are at play in situations where agents seek to assist each other. The first dialogue, inspired by the one given in Figure 4, presents a situation in which the user (caller), realizing that the system (operator) will not be able to transfer him over to the person requested, performs the communicative action of informing the system of the extension of this person.†

(1) ⟨telephone call⟩
(2) System: Hello.
(3) User: Hello. I would like to speak to Mr B.
(4) System: Mr B?
(5a) User: Yes,
(5b) his extension is 2287.
(6) System: Here is Mr B.

FIGURE 22. Sample dialogue, version 2.


The user thereby allows the system to satisfy a knowledge precondition of the subsequent transfer action. The second dialogue, inspired by the one given in Figure 2, presents a situation in which the system, realizing that it will not be able to establish a communication link between the user and the person requested, decides to establish the communication through an intermediary person.

6.2.1. Example 2: interpretation of a cooperative response

The first three utterances of the dialogue in Figure 22 are identical to those of the previous example (Figure 17). At this point in the dialogue, the Rgraph is thus that given in Figures 19–21. We resume the simulation when the current action is LookUpPhoneNb, after the interpretation of utterance (3) (see Figure 21). In order to execute this action (rule 3), the system verifies that all input parameters are instantiated, and its constraints verified, namely that the person whose number is being searched for is part of the database. Here the system cannot find an entity corresponding to Mr B in its knowledge base. This problem could result from incomplete knowledge on the part of the system (a recently arrived temporary employee, for instance, may not yet have been added to the database) or from an error on the part of the user (he meant someone else).

As stated in rule 3, and illustrated in the previous example, an uninstantiated parameter or an unverified constraint leads to the addition of an enablement link between the basic action that cannot yet be performed, here LookUpPhoneNb, and an action that might allow this performance. Since the LookUpPhoneNb action does not have a link with another action that may allow the satisfaction of its constraint on the parameter Mr B, the system adds the action AchieveHaveValue(System, P: Mr B) to the Rgraph, connected by an enablement link to the LookUpPhoneNb action, as shown in Figure 23. It is a complex action for which it knows three recipes (rule 2). Since the parameter P is already instantiated, the Plan Reasoner selects the achieveHaveValueByConfirm recipe, and adds it to the Rgraph. The resulting Rgraph is given in Figure 23.

The current action becomes CheckValue (rule 4), a communicative action that is therefore sent to the generation module (stopping condition of the TA algorithm), and which will lead to the request for confirmation "Mr B?". In the Rgraph, its status is changed to 6.

†Recognizing that the number is a precondition for the transfer should be sufficient for the caller to provide the helpful information, thereby preventing the system from having to look it up. If this were the case, then the caller would have provided the number upfront, right after giving the name. However, as mentioned in our presentation of our corpus, this combination rarely occurs.

FIGURE 23. Partial view of the Rgraph, before the generation of utterance (4) (connected to the Rgraph of Figure 19).


We notice here that the first clarification request made by the system is a request for confirmation of the name, i.e. the system assumes that the name might be wrong. As noted in our corpus study, other responses would be appropriate here, such as asking for the extension or for the department name, or transferring the caller over to Personnel. They are not all systematically tried, but their use follows ordering constraints. In our model, these responses are tried in an order that depends on the temporal order of the actions in the Rgraph, as well as on the rules of the Task Advancement algorithm concerning the addition of enabling actions. For instance, if the user had not provided the extension in utterance (5b), the algorithm would have led the system to ask for that information next. This will be shown in the example of the next section. The behaviour of the system is coherent with the behaviour observed in the corpus in that it respects the same ordering constraints.

Returning to our example, the Dialogue Manager then receives the user's response

(5a) User: Yes,
(5b) his extension is 2287.

This utterance contains two parts, which are interpreted sequentially. The Speech Act Identification module recognizes a confirm act on the basis of the representation of the first part of the utterance, "Yes". The act corresponds to the communicative action ConfirmValue, which is then passed along to the Plan Reasoner by the Pragmatic Interpretation module. This action is already in the Rgraph and it is the act currently in focus. The ConfirmValue action of the Rgraph is therefore updated on the basis of the parameters of the incoming ConfirmValue action, and its status is changed to 6. The ConfirmValue action, and therefore also the AchieveHaveValue action, having been successfully executed, their status is updated to 6. The status of the next action, LookUpPhoneNb, is also updated to 6, but with a result attribute indicating failure, since the Mr B parameter has been confirmed, and, therefore, the constraint on this parameter (being part of the database) still cannot be satisfied. The current action is then the next action, RequestCommunication.

The Speech Act Identification module then recognizes an inform act on the basis of the representation of the second part of the utterance, "his extension is 2287".

FIGURE 24. Partial view of the Rgraph, after utterance (5).


The act corresponds to the communicative action InformValue, which is then passed along to the Plan Reasoner by the Pragmatic Interpretation module.

A link is found between the act currently in focus, RequestCommunication, and the incoming act InformValue, through the AchieveHaveValue action and the achieveHaveValueByInform recipe, as shown in Figure 24. Propagation of the value "2287" provided by the InformValue action allows for the instantiation of the PhNb parameter of the RequestCommunication action (as well as of the Connect action). During the planning/generation phase following the interpretation of (5b), the RequestCommunication action can therefore be executed, and the end of the dialogue proceeds as for the preceding example.

We notice that the InformValue action is part of two recipes for the AchieveHaveValue action, namely achieveHaveValueByRequest and achieveHaveValueByInform. The first recipe, however, requires that a RequestValue action be performed by the system beforehand, which is not the case. The second recipe was therefore selected. This recipe allows the system to interpret information spontaneously provided by the caller, and, therefore, to interpret a parameter instantiation made by the caller to help it. This situation thus reflects the use of Axiom 5 for the Int.Th operator defined by Grosz and Kraus (Figure 7). When the system generates utterance (4), "Mr B?", the proposition "System can bring about the RequestCommunication action" is not true. Indeed, for this proposition to be true, all of the action parameters have to be instantiated, and all of the action constraints have to be satisfied, which is not the case here. The caller (agent G in the axiom) then performs the InformValue action (action A in the axiom) to allow the proposition "System can bring about the RequestCommunication action" to become true.
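The upward propagation at work here, the value "2287" instantiating PhNb in every act that shares it along the path to the root, can be sketched as follows (illustrative only; a params dict attribute on acts is assumed, as is the contributes_parent helper from the earlier sketch):

```python
def propagate_value(g: Rgraph, start: Act, param: str, value) -> None:
    """Instantiate `param` on `start` and on every act above it that carries
    the same parameter (a `params` dict attribute on acts is assumed here)."""
    act = start
    while act is not None:
        if param in getattr(act, "params", {}):
            act.params[param] = value
        act = contributes_parent(g, act)  # helper from the earlier sketch
```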

6.2.2. Example 3: generation of a cooperative request

The last example to be examined is given in Figure 25. This dialogue is identical to the previous one up to, and including, utterance (5) [corresponding to (5a) in the previous example]. We resume the simulation when the Rgraph is that given in Figure 21 and utterance (5) has been interpreted. The status of the CheckValue, ConfirmValue and AchieveHaveValue actions is therefore 6.

(1) ⟨telephone call⟩
(2) System: Hello.
(3) User: Hello. I would like to speak to Mr B.
(4) System: Mr B?
(5) User: Yes.
(6a) System: I don't have a Mr B.
(6b) What is his extension?
(7) User: I don't know.
(8a) System: I don't know Mr B's extension.
(8b) Do you know with whom Mr B works?
(9) User: He works with Mr C.
(10) System: Here is Mr C.

FIGURE 25. Sample dialogue, version 3.

FIGURE 26. Rgraph during the planning phase following the interpretation of utterance (5) (connected to the Rgraph of Figure 19).


The Plan Reasoner is called to advance the task. LookUpPhoneNb becomes the current action (initialization of the TA algorithm). The Mr B parameter having been confirmed, this action cannot be executed. It is returned with a status of 6 and a result attribute indicating failure (rule 3). The reason for failure (Mr B not in database) is stored for later generation. The Plan Reasoner then considers the new current action, RequestCommunication (rule 6). The knowledge precondition concerning the parameter PhNb is still not satisfied (rule 3). This rule does not allow the same enablement link with an action that has already been tried (here LookUpPhoneNb) to be added a second time, in order to avoid an indefinite reiteration of the same question. The Plan Reasoner therefore tries to satisfy this precondition by another means, here by adding the AchieveHaveValue(System, PhNb) action to the Rgraph, connected by an enablement link to the RequestCommunication action, as shown in Figure 26.

The recipe selected for the AchieveHaveValue action is achieveHaveValueByRequest (because the value of the PhNb parameter is unknown). The first action of this recipe is RequestValue, a communicative action that stops the TA algorithm. The generation module first produces the reason for failure that has been stored [utterance (6a): "I don't have a Mr B."], then the question corresponding to the RequestValue action [utterance (6b): "What is his extension?"].

FIGURE 27. Rgraph during the planning phase following the interpretation of utterance (7).


Utterance (7) ("I don't know") is then interpreted as the failure of an AchieveHaveValue action on the part of the user, necessary for the subsequent performance of the InformValue action, and therefore as indicating the failure of this action as well. The status of InformValue, and therefore also of AchieveHaveValue, is updated to 6, with a result attribute indicating failure. The current action becomes RequestCommunication. The Plan Reasoner is called to advance the task. Since the knowledge precondition on the PhNb parameter cannot be satisfied, the status of RequestCommunication is changed to 6, and its result attribute indicates failure (rule 3). The reason for failure (the extension cannot be found) is stored for later generation.†

The new current action is then TransferPerson(System, User, Mr B), with a status of 5, the action to which RequestCommunication directly contributes (rule 6). No action in its recipe having yet been performed, the Plan Reasoner searches for a new recipe (rule 5). Since this action does not have any other recipe, it acquires a status of 6, and a result attribute indicating failure. The new current action (see Figure 19) is EstablishCommunication(User, Mr B), with a status of 5 (rule 6).

The Plan Reasoner then applies rule 5 to the EstablishCommunication(User, Mr B) action. The recipe for this action does contain an action that has already been performed, namely EstablishCommunication(User, System). It therefore searches for a new path between this action and the current action EstablishCommunication(User, Mr B). It finds a link which connects these two actions by an additional instantiation of the establishCommByIntermediary recipe, as shown in Figure 27. This path introduces an intermediary X, who will (maybe) be able to establish the communication between the user and Mr B. Finding this link amounts to the system replanning the way in which the user will be able to establish communication with Mr B.

†As illustrated in utterances (6a) and (8a), the system is explicit about the actions it fails to execute. This point will be further addressed in the Discussion section.

FIGURE 28. Partial view of Figure 27, after the generation of utterance (8).


The system's replanning allows it to display a helpful behaviour corresponding to the use of Axiom 6 of Grosz and Kraus (Figure 7). Indeed, the system (G1 in the axiom) will perform the action TransferPerson(System, User, X) (action A in the axiom), thus allowing the person X (G2 in the axiom) to perform the action TransferPerson(X, User, Mr B) (action B in the axiom), thereby allowing the proposition "User can bring about EstablishCommunication(User, Mr B)" to become true (which will be the case if all actions succeed).

Returning to the Task Advancement algorithm, the new current action returned by rule 5 is EstablishCommunication(User, Mr B), with a status equal to 4. By application of rule 4, first to EstablishCommunication(User, Mr B), then to EstablishCommunication(User, X), the current action becomes TransferPerson(System, User, X), with a status of 2. As in the preceding examples, the Plan Reasoner adds the transferPerson recipe to the Rgraph (rule 2), then the action LookUpPhoneNb(System, X, PhNb) to satisfy the knowledge precondition of RequestCommunication(System, PhNb) on the PhNb parameter (rule 3), as shown in Figure 28. The parameter X of the LookUpPhoneNb action being uninstantiated, the Plan Reasoner tries to satisfy this knowledge precondition by adding the AchieveHaveValue(System, X) action to the Rgraph (rule 3), along with the achieveHaveValueByRequest recipe (rules 1 and 2 for the AchieveHaveValue action). The next current action is the communicative action RequestValue(System, User, X) (rule 4). The person X has to belong to the system's database to satisfy the constraint of the LookUpPhoneNb action, and has to be a person working with Mr B to satisfy the constraint of the establishCommByIntermediary recipe of the EstablishCommunication(X, User, Mr B) action (constraints are propagated through recipes). The system then asks the user if he knows with whom Mr B works [utterance (8b)]. The generation of this utterance is preceded by the generation of the reason for failure of the RequestCommunication(System, PhNb(Mr B)) action that had been previously stored [utterance (8a)]. The resulting Rgraph is given in Figure 28.

Utterance (9), "He works with Mr C", is interpreted as providing the requested value. The Plan Reasoner then updates the Rgraph by replacing the parameter X with the value provided by the user, here Mr C.


This new value is propagated upwards through the Rgraph, in all actions participating in recipes along the path from the InformValue action up to the root. The system will then be able to perform the two actions of the transferPerson recipe, that is, RequestCommunication and Connect, thereby transferring the user to Mr C. The next current action is AcceptCommunication(Mr C) (see Figure 27), an action to be performed by an agent other than the system, thus verifying the stopping condition of the Task Advancement algorithm. The Plan Reasoner thus signals the successful execution of the TransferPerson action to the generation modules, which will produce the system's answer: "Here is Mr C."

7. Implementation

We have implemented, in Smalltalk, an application that simulates a receptionist for phone calls at a phone desk and that integrates the dialogue model presented in this paper. The implemented system includes the four main components introduced in Section 5: a Dialogue Manager, an Interpretation module, a Generation module and a Plan Reasoner, as shown in Figure 11. Given our current research interests, the component that has been completely implemented is the Plan Reasoner, with the associated algorithms of the Pragmatic Interpretation and Planning/Execution submodules. The other submodules have also been implemented, but with substantial simplifying assumptions, as described below. Our application currently contains 14 actions and 10 recipes, which is sufficient for handling the sample dialogues presented in Section 6. We begin by describing the general interface and then discuss the actual state of the implemented system.

7.1. THE INTERFACE

The interface window of our application is shown in Figure 29, displaying the dialogue corresponding to that given in Figure 25. The application is not connected to a real telephone system. The user "calls" the system by clicking a button (Appel Standard Téléphonique) in the upper left corner of the dialogue interface window. The dialogue then begins. It ends when the user clicks on another button (Arrêt de la communication), simulating the user hanging up. The caller types his "utterance" in a text window at the bottom of the screen (Entrée de l'intervention de l'utilisateur:). The text then appears in a dialogue window, in which the utterances of the system are also output. The user releases his turn by entering a carriage-return in the text window, while the system releases its turn after having displayed its utterance(s) in the dialogue window.

This interface allows the user to inspect the system's knowledge base. At any moment, the user can thus view the actions, recipes and entities that are part of the system's static knowledge. The user can also view the system's dynamic knowledge about the discourse context, i.e. the Rgraph representing the system's beliefs about the acts agreed to by the agents over the course of the dialogue. Since this structure is dynamic, the user can follow its evolution during the interpretation and generation phases, in a window that appears when the user clicks on the Rgraph button of the interface window. Figure 30 shows the display of the Rgraph at the end of the dialogue given in Figure 29. Enablement links are not displayed, but actions inserted into the Rgraph with these links appear in dotted-line boxes.

FIGURE 29. Interface of the phone desk simulator.

FIGURE 30. Rgraph of the dialogue given in Figure 29 at the end of the dialogue.



7.2. STATE OF THE IMPLEMENTATION

Since our application is implemented in an object-oriented language, all modules are implemented as classes of objects. These classes include the Dialogue Agent, Dialogue Interface, Interpreter, Plan Reasoner, Rgraph, Action and Recipe classes, briefly described below. The Dialogue Agent, Interpreter, Plan Reasoner and Rgraph classes are independent of the telephone switchboard application; so are the Action and Recipe classes corresponding to communicative actions. The Dialogue Interface class, as well as the task-related Action and Recipe classes, are dependent upon the domain, and would have to be built for each new application.

Our application currently has two instances of the Dialogue Agent class, namely the operator and the caller, that is, the user. Future extensions of the system may include implementing additional dialogue agents. The attributes of the user agent are currently empty (nothing is known about his or her knowledge). A further extension would consist in adding values to these attributes, representing both "a priori" knowledge and knowledge acquired during the dialogue (such as his name if he introduced himself), both of which would constitute aspects of a user model.

The two agents "talk" to each other by using the interface described in the preceding section. The interface is implemented as an instance of the Dialogue Interface class. This class simulates the functionalities of a telephone. It transfers the user's utterance to the operator instance of the Dialogue Agent class, and outputs to the user the utterance constructed by this agent. The Interface class also handles the various displays of information (actions, recipes, Rgraph).

The syntactic-semantic interpretation is handled by the Interpreter class. To perform its task, this class calls upon a parser and semantic interpreter, as well as a submodule responsible for determining the speech act. It returns a "message" containing the results of the interpretation to the dialogue agent that called upon it. This message is composed of the syntactic-semantic representation of the utterance, its speech act, as well as the action underlying the utterance and its representation. In our current implementation, however, very few utterances are interpreted in this way, because we have not yet extended the parser and semantic interpreter to cover the full range of utterances found in our domain. To allow our application to handle other types of utterances, we have hand-coded their corresponding messages. The parser and semantic interpreter, in these cases, are not called upon.
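The "message" can be pictured as a small record. The following Python sketch shows plausible fields suggested by the description above; the field names are our own, and the hand-coded example is hypothetical:

    from dataclasses import dataclass
    from typing import Any

    @dataclass
    class InterpretationMessage:
        semantics: Any     # syntactic-semantic representation (a conceptual graph)
        speech_act: str    # e.g. "request", "inform"
        action: str        # action underlying the utterance
        action_repr: Any   # the action's representation in the knowledge base

    # Hand-coded message for an utterance the parser does not yet cover
    # (the field values below are purely illustrative):
    msg = InterpretationMessage(
        semantics=None,
        speech_act="request",
        action="EstablishCommunication",
        action_repr={"agent": "User", "addressee": "Mr B"},
    )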

The parser and semantic interpreter that we use are based on an LFG grammar (Kaplan & Bresnan, 1982) and on conceptual graphs (Sowa, 1984), and have been developed by members of our research group (Vapillon, Briffault, Sabah & Chibout, 1997). Currently, for our application domain, they only handle utterances of the form "Je voudrais parler à Corinne Martin" (I would like to speak to Corinne Martin) and "Pourrais-je parler à Corinne Martin?" (Could I speak to Corinne Martin), or some other name. We are testing the syntactic-semantic interpreter on other types of utterances, entering the necessary lexical and semantic information as needed. The main difficulties arise in the syntactic constructions used.

The speech act interpretation is still very preliminary. Recognition of the speech act is based on a fixed number of verb forms, such as the French verb "vouloir", which is translated into a request, as well as on syntactic markers, such as the use of a question mark. Recognition of the action that is the object of the speech act is based on a pattern-matching algorithm between the conceptual graph built by the syntactic/semantic interpreter and the semantic representations of actions in the knowledge base.
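A rough Python rendering of this two-step procedure might look as follows; dictionary subsumption stands in for conceptual-graph matching, and the verb list, patterns and names are illustrative assumptions, not the implemented rules:

    REQUEST_VERBS = {"vouloir", "pouvoir"}   # e.g. "je voudrais", "pourrais-je"

    def classify_speech_act(main_verb, is_question):
        """Step 1: map surface cues (verb form, question mark) to a
        speech-act type."""
        if main_verb in REQUEST_VERBS or is_question:
            return "request"
        return "inform"

    def match_action(utterance_graph, knowledge_base):
        """Step 2: return the first action whose semantic pattern is
        subsumed by the utterance's representation."""
        for action, pattern in knowledge_base.items():
            if all(utterance_graph.get(k) == v for k, v in pattern.items()):
                return action
        return None

    kb = {"EstablishCommunication": {"type": "speak-to"}}
    print(classify_speech_act("vouloir", False))                       # request
    print(match_action({"type": "speak-to", "obj": "Corinne Martin"}, kb))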

The pragmatic interpretation, as well as the planning/execution phase, are performed by the Plan Reasoner class. This class contains the reasoning processes associated with the Rgraph, that is, the Interpretation and Task Advancement algorithms described in this paper. The Rgraph class contains the Rgraph and the algorithms necessary for updating the Rgraph, such as adding recipes and enabling actions to the Rgraph.

Actions and recipes are also implemented as classes. Each action is a class containing a semantic representation of the action (a conceptual graph), constraints associated with this action, a list of actions allowing the instantiation of its parameters, and a list of recipe(s) if it is a complex action or an execution method if it is a basic action. This class also contains algorithms for verifying parameter instantiations and constraints. Each recipe is a class which contains pointers towards the action for which this recipe is defined, as well as towards the actions that are part of the recipe. This class also contains algorithms verifying constraints (e.g. temporal constraints) and the propagation of constraints between the head action and the actions of the recipe.
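By way of illustration, the following Python sketch mirrors the structure of these classes; the actual classes are in Smalltalk, and the names, fields and sample constraint below are our own assumptions:

    class Action:
        """One class per action, as described above."""
        def __init__(self, name, semantics, constraints=(),
                     recipes=(), execute=None):
            self.name = name                # e.g. "TransferPerson"
            self.semantics = semantics      # conceptual-graph representation
            self.constraints = constraints  # predicates over parameter bindings
            self.recipes = list(recipes)    # non-empty for a complex action
            self.execute = execute          # execution method for a basic action

        def is_basic(self):
            return self.execute is not None

        def constraints_satisfied(self, bindings):
            return all(c(bindings) for c in self.constraints)

    class Recipe:
        """Points to its head action and to the constituent actions."""
        def __init__(self, head, constituents, temporal_constraints=()):
            self.head = head
            self.constituents = constituents
            self.temporal_constraints = temporal_constraints

    def in_directory(bindings):
        """Sample constraint: person X must be in the system's directory."""
        return bindings.get("X") in {"Mr B", "Mr C"}

    transfer = Action("TransferPerson", semantics=None,
                      constraints=(in_directory,))
    print(transfer.constraints_satisfied({"X": "Mr C"}))   # True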

The generation of the output utterance is to be performed by the Generator class. Currently, this class is not implemented. Instead, generation is handled by the Plan Reasoner, which simply fills in sentence templates when one of the stopping conditions of the Task Advancement algorithm is verified. For this to be possible, the execution method of each communicative action contains a sentence template to be instantiated with the parameters of the communicative action. Similarly, each action has in its representation a corresponding sentence template that will be used when necessary to express either its success or its failure. For instance, the TransferPerson(System, User, X) action contains the sentence template "Here is X", where X is to be replaced by the instantiation of the X parameter of the TransferPerson action to indicate the successful execution of this action. Sentence templates are a temporary solution allowing the system to produce "natural"-sounding utterances despite the lack of an adequate generator. For instance, they allow for hard-coded referential expressions, as in utterance (6b) of the dialogue given in Figure 25 ("What is his extension?"), and indirect questions, as in utterance (8) of this same dialogue ("Do you know with whom Mr B works?"). Notice that the French equivalent of utterance (6b), "Quel est son poste", can easily be hard-coded in our application because possessive pronouns in French agree with the object that is "possessed" (here the masculine word "poste") and not the possessor, as in English.
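Template-based generation of this kind reduces to a table lookup plus string substitution, as the following Python sketch shows. The success template is quoted from the text above; the failure template is invented for illustration:

    TEMPLATES = {
        ("TransferPerson", "success"): "Here is {X}",
        ("TransferPerson", "failure"): "I cannot transfer you to {X}",  # assumed
    }

    def realize(action, outcome, bindings):
        """Instantiate the action's sentence template with its parameters."""
        return TEMPLATES[(action, outcome)].format(**bindings)

    print(realize("TransferPerson", "success", {"X": "Mr C"}))  # Here is Mr C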

8. Discussion

In this section, we evaluate and discuss the coverage of our model, looking in particular at the different kinds of plan repair that the agents engage in when a problem arises. We further address the topic of mutual beliefs, and we compare our model with other dialogue models that also make use of plan and/or intention information.

8.1. COVERAGE OF THE MODEL

We consider both the theoretical model and the implemented system. We focus on the coverage of the Plan Reasoner and associated reasoning capacities, that is, the Interpretation and Task Advancement algorithms. We do not, however, take into consideration the syntactic-semantic interpretation submodules, which, as mentioned earlier, are still very limited in coverage, nor the surface generation of the natural language utterance, which we have not yet addressed.

We restrict the discussion to the type of dialogues studied in this paper, namely request dialogues, reviewing first the different types of requests found in the corpus, and then turning our attention to the different kinds of plan repair that are used following these requests when a problem arises in their realization.

As mentioned in the corpus study section, our corpus also contains dialogues in which a user resumes a previous request (for instance because the correspondent was previously unavailable). Extension of our model to handle dialogues from this second group would require modifying the third stopping condition of the Task Advancement algorithm, in order to allow the dialogue to resume, rather than to end, when the current action is the root action. The Plan Reasoner would also have to be given the capacity for maintaining, and reasoning about, the history of the dialogue. A structure similar to Rich and Sidner's "segmented interaction history" could prove to be useful for this purpose (Rich & Sidner, 1998).

8.1.1. User's requests
As mentioned in our corpus study, most requests are based on extensions (e.g. "I'd like extension 2022") and on names (e.g. "I'd like to speak to Mr B"). Our theoretical model, as well as our implemented system, has the capacity for interpreting, and responding to, these types of requests. Obviously, additional domain knowledge, such as more people and names of departments, would be necessary to allow for the interpretation of all the different uses of these types of requests found in our corpus.

Our model, however, does not yet cover the other types of requests found in the corpus, namely requests for communication with a department, requests for various types of information, and combinations of the above. While the Interpretation algorithm will probably not require extensive additions to be able to handle these other types of requests, modelling the necessary recipes is not a trivial task. Finally, the five atypical dialogues referred to in the corpus study section (request for a fax number, requests with reference to previous calls, and response to an advertisement about a house) are not covered by our current model either. While requests for fax numbers do not present major difficulties, the three other requests do, because of their heavy dependence on general knowledge that we are currently not able to integrate into the system's knowledge base. The utterance "Bonjour, c'est encore moi" ("Hello, it's me again"), for instance, presents a number of challenging interpretation problems.

8.1.2. Plan repair
As mentioned in the corpus study section, the operator and the caller adopt two different types of strategies when a problem arises in handling the caller's request, i.e. when an action in the Rgraph either fails or cannot be executed. The first consists in further elaborating the current plan, while the second involves replanning. This section examines how the system's knowledge and reasoning capacities (namely, the Interpretation and Task Advancement algorithms) allow the system to adopt these strategies.


• Clarification subdialogues: these subdialogues are initiated to determine (or confirm) the value of a parameter. In our system, it is the AchieveHaveValue action (Figure 12) which allows the Plan Reasoner to begin such a subdialogue or to respond in the context of such a subdialogue. The AchieveHaveValue action is connected to the action of which it is going to instantiate a parameter (see rule 3), and is therefore integrated into the Rgraph. As a result, the question and response actions that are part of the different recipes for the AchieveHaveValue action impose a temporal order on the sequencing of the actions in the Rgraph, as well as an obligation to perform these actions (a schematic rendering of this ordering follows the list below). Thus, if the system makes a request about the value of a parameter, the subsequent action, expected by the system, is an answer on the part of the user. On the other hand, if the user makes a request, then the next action in the Rgraph will be an answer to be produced by the system. These recipes, therefore, provide an alternative way of representing a number of discourse obligations related to communicative actions, which are represented, for instance, as a set of rules encoding discourse conventions in Traum and Allen's (1994) work, or as adjacency pairs in Zancanaro, Stock and Strapparava (1997).

• Replanning: when the operator realizes that she will not be able to execute the initial plan of transferring the caller directly to the requested person, she builds a new plan involving a transfer through an intermediary person. In our system, this behaviour arises through the application of rule 5 of the Task Advancement algorithm. When a complex action fails, the Plan Reasoner tries to perform it differently, either by selecting a new recipe, or by searching for a differential path between the actions that have already been executed and the action that requires replanning. Figure 29 shows an implemented example.

• Information spontaneously provided by the user: this information allows the agents, for instance, to satisfy a knowledge precondition on an action of the plan. Our Interpretation algorithm searches for possible enablement links between the action underlying the user's utterance and the action in focus, and it is therefore capable of interpreting this type of utterance. The fact that the user may "volunteer" information that is not requested shows that our model also allows for mixed-initiative interaction (Novick & Sutton, 1997).

• Multiple utterances: the corpus also revealed instances of the caller producing several utterances before releasing his turn.† This happens in particular when the caller spontaneously provides missing information, after having first answered the operator (see, for instance, Figure 4). In our system, such utterances are interpreted sequentially, and the Dialogue Manager calls upon the generation modules only when all these utterances have been interpreted, as illustrated in the discussion of the dialogue in Figure 22.

• Explanations: when the operator cannot execute an action that has been planned for, she usually provides the caller with an explanation of that failure [for instance, utterance (8) in Figure 1]. This behaviour has been integrated into our system, as illustrated in utterance (6a), Figure 25, so that when an action fails, the reason for this failure (knowledge precondition or constraint not satisfied) is saved in a list of messages that will be sent to the generation module when one of the stopping conditions of the Task Advancement algorithm is met.

†As mentioned in the preceding section, the user of our implemented system releases his turn by entering a carriage-return in the text window, while the system releases its turn after having displayed its utterance(s) in the dialogue window.

Our corpus study also showed that certain clarification subdialogues are initiated by the operators to remove an ambiguity concerning a parameter. This is the case when the operator corrects a name provided by the caller (wrong spelling, for instance), asks for the first name of a person whose last name occurs several times in her directory, or asks the caller to choose between several people belonging to the department that was requested (head of the group or secretary, for instance). These subdialogues are not yet handled by our system. The integration of a spelling checker into our model (e.g. Fournier, 1994) could allow the system to find names with closely related spellings. Allowing the system to suggest a choice between several possible values for a given parameter would require additional recipes for the AchieveHaveValue action. These extensions are part of our plans for future research.

The ability to replan is a critical feature of any dialogue system. As shown in our corpus, lack of information, miscommunication, and unexpected answers or requests are all dialogue situations which may lead to action (and therefore plan) failure. Failure and repair are therefore central issues of communication, and our model allows us to handle them. For instance, returning to the sample dialogue in Figure 25, utterance (6b), "What is his extension?", results from the (successful) execution of the communicative action RequestValue(System, User, PhNb) (see the Rgraph in Figure 26). As explained earlier, utterance (7) ("I don't know") is interpreted as the failure of the subsequent InformValue action, which in turn leads to the failure of the AchieveHaveValue(System, PhNb) and RequestCommunication(System, PhNb) actions. As a result, the Task Advancement algorithm will replan the TransferPerson(System, User, Mr B) action to which the RequestCommunication directly contributes.

As indicated above, it is rule 5 of the Task Advancement algorithm which allows our dialogue system to replan a complex action that cannot be performed. This rule is also designed to allow our system to reexecute a basic action that failed. Items (a) and (b) of rule 5 indeed lead to reexecution of a failed action if the reason for failure can be determined and the necessary changes made to allow for success. These conditions are necessary to avoid subsequent executions that would probably fail again. If this is not the case, the failed action leads to replanning.

As indicated in rule 3 of the Task Advancement algorithm, a basic action may fail either because it cannot be attempted (because of a parameter remaining uninstantiated or a constraint unsatisfied) or as a result of its execution. The first case corresponds to execution inability (which leads to replanning of the action towards which the failed action contributes), the second to execution failure (which leads, under certain conditions, to reexecution). The distinction between execution inability and execution failure corresponds to the distinction made by Traum and Allen (1994) between PRepairs, or plan repairs, which are performed when failure leads to replanning, and ERepairs, or execution repairs, which are performed when failure leads to reexecution. These authors present a theory and formalization of these different sorts of repair, but it is not clear what computational mechanisms are provided by this theory to determine, for instance, when to perform one or the other type of repair.
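Our reading of this failure-handling logic can be summarized in the following Python sketch; the names and control flow are ours and deliberately simplified. Reexecution is chosen only when the reason for an execution failure is known and has been repaired [items (a) and (b) of rule 5]; execution inability, or an unrepairable failure, leads to replanning of the action that the failed action contributes to.

    class Act:
        """Minimal stand-in for an Rgraph action node."""
        def __init__(self, name, attempted=False, parent=None):
            self.name, self.attempted, self.parent = name, attempted, parent

    def handle_failure(act, diagnose, repair):
        """diagnose(act) -> reason or None; repair(reason) -> bool."""
        if act.attempted:                         # execution failure
            reason = diagnose(act)
            if reason is not None and repair(reason):
                return ("reexecute", act.name)    # rule 5, items (a)-(b)
        # execution inability, or an unrepairable failure: replan
        return ("replan", act.parent.name)

    root = Act("TransferPerson")
    req = Act("RequestCommunication", attempted=False, parent=root)
    print(handle_failure(req, lambda a: None, lambda r: False))
    # -> ('replan', 'TransferPerson')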


Most examples from our corpus require replanning rather than attempting the same action again. For this reason, we have not yet fully tested, nor implemented, the first part of rule 5, designed for handling the reexecution of basic actions that have failed [items (a) and (b)]. One example for which reexecution could be preferable, however, involves the EstablishCommunication action. This action will fail if the requested person does not answer the phone, or is already on line. When this happens, and certain conditions are met (for instance, the caller can wait a few minutes), then attempting the action again is an option to be considered. As this example illustrates, though, there are constraints that the system cannot verify before execution, as well as unforeseeable obstacles. Item (a) of rule 5, stating that if the system can determine the reason for failure and make the necessary changes so that the reexecution has a chance to succeed, then A_i may be reattempted, is therefore too strong. Achieving belief that something in the context of this execution may have changed, thereby allowing the action to succeed, should be sufficient to allow an action to be attempted again.

8.2. MUTUAL BELIEFS

It is now widely accepted that successful communication requires not only conveying information to another agent, but also achieving some degree of mutual belief, on the part of the conversational agents, of the speaker's intention to convey that information. The process of adding information to this common ground is termed grounding by Clark & Schaefer (1987). According to their theory, conversation is composed of contributions, namely units containing both a presentation phase (when the speaker focuses on the content of the conversation) and an acceptance phase (when both the speaker and the hearer focus on grounding that content, i.e. on establishing the mutual belief that the hearer has understood what the speaker means). Once both phases have been completed, it will be common ground between the conversational partners that the hearer understands what the speaker meant.

The acceptance phase may take many forms; it may be implicit, when the hearer presupposes acceptance of the speaker's presentation by going on to the next contribution, it may consist in a single utterance, such as "Okay", or it may be composed of multiple conversational turns. The importance and diversity of the acceptance phase were illustrated in the preceding section, when discussing the way in which the agents in our domain acknowledge that communication has been successfully established between them. The presentation phase too, while it cannot be implicit, can take different forms, possibly including whole embedded contributions. The difficulty in integrating such a theory in a computational model is precisely the fact that the presentation and acceptance phases are so diversified. As noted by Traum (1994), it is often hard to tell whether a particular utterance is part of the presentation phase or the acceptance phase, as well as whether contributions are ever really complete; as for generation, the model gives little indication as to how to decide what to say next. Traum, attempting to solve these problems, therefore develops a computational theory of grounding acts, which, when implemented within a dialogue system, can be used to determine, for any given state of the dialogue, whether material has been grounded or what it would take to ground that material.

Although our model does not integrate a formal theory of grounding, it rests on a number of hypotheses concerning mutual beliefs that are compatible with Traum's theory. In particular, the Interpretation and Task Advancement algorithms assume that the user, by proceeding with a new utterance that naturally follows the current one (i.e. by being coherent with the current discourse context), implicitly acknowledges understanding of this current utterance. Similarly, when the system produces a new utterance that naturally follows the user's utterance, it is implicitly acknowledging understanding of the user's utterance. This hypothesis corresponds to Clark and Schaefer's (1987) weakest method for grounding an utterance. Traum also interprets a speaker's utterance that is coherent with the discourse context as an implicit signalling of understanding.

Task Advancement rules 2, 3 and 5 also involve assumptions about mutual belief. Here, by adding a recipe (rules 2 and 5) or an enabling action (rule 3) to the Rgraph, the system has to make the hypothesis that the user agrees to the particular recipe or action that it has just selected. As mentioned in the interpretation section, the lack of such an assumption would require asking for confirmation each time the system selects a recipe. Mutual belief is assumed as long as the interpretation of the user's utterance confirms or develops the current dialogue context. Therefore, the system has to inform the user when this is not the case, for instance, when there is disagreement concerning the instantiation of a parameter, or when replanning is necessary. This is, in part, the role of the stopping conditions of the Task Advancement algorithm.

A number of heuristics underlying our Interpretation and Task Advancement algorithms therefore rest on hypotheses concerning the mutual beliefs of the dialogue participants. As such, they play a causal role in the behaviour of the agent being modelled. However, a more explicit treatment of mutual belief would allow us to make a number of improvements to our model. It would, for instance, allow us to refine our treatment of the vocalization of failed actions. In particular, as illustrated in the dialogue of Figure 25, the system currently vocalizes the reason for failure of actions that lead to clarification subdialogues [see rule 3, item (b)]. Although this behaviour is desirable [see, for instance, utterance (8) in Figure 1], our rules tend to lead to a system that is overly verbose about its difficulties. There are situations in which the system should not express the reason for failure, because it should know that the user already knows it. For instance, if the system asks the user for some information X (e.g. the extension of some person), then it is reasonable to assume that the system believes that the user believes that the system does not know this information. Therefore, if the user subsequently cannot provide this information, then the system should not repeat that it is lacking this information before going on with the dialogue. A more complete treatment of the mutual beliefs acquired as a result of the execution of communicative actions would allow the system to determine when to express, and when not to express, a reason for failure.
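The refinement suggested here amounts to a simple suppression test. A minimal Python sketch, assuming the system keeps track of the values it has just requested from the user, could look like this:

    def should_report_failure(missing_param, just_requested_from_user):
        """Suppress the failure report when the user already knows that
        the system lacks this value (the system just asked for it)."""
        return missing_param not in just_requested_from_user

    print(should_report_failure("PhNb", {"PhNb"}))  # False: stay silent
    print(should_report_failure("X", {"PhNb"}))     # True: explain the failure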

A more explicit treatment of mutual belief would also allow our model to cover additional aspects of our corpus, such as the different types of acknowledgements present in dialogue openings. All dialogues indeed begin with the operator saying "CNRS bonjour" (identification of the organization name followed by hello), or "CNRS", "oui", "bonjour", "allo",† or combinations of these. This initial utterance by the operator serves to indicate to the caller that communication has been successfully established between them. Interestingly, in most cases, the user then does not immediately state his or her request but, instead, either acknowledges that the communication has indeed been established (saying "hello" or "yes" before asking for someone), or makes a further request for acknowledgement‡ (saying "hello" or "allo", to which the operator again answers "hello" or "yes"). The user's request for acknowledgement thereby initiates a subdialogue whose goal is to establish mutual belief of the successful establishment of the communication. Using Traum's (1994) terminology, we may refer to these acknowledgements as grounding acts, namely acts that deal with establishing mutual belief about the dialogue. These acts may also be implicit (which is the case in only 7% of the dialogues from our corpus), when the caller implicitly acknowledges that the communication has been successfully established by directly making his request.

†The French expression "Allo" provides a more informal way of acknowledging communication by telephone than "Bonjour" (the latter meaning "hello").

‡If the tapes from which our corpus was transcribed were available, intonational features of the dialogues would probably corroborate the distinction we suggest exists here between acknowledgements and requests for acknowledgement.

In our model, the initial action to be interpreted is the user's phone call, and the initial utterance to be generated is the system's greeting. Our decision to have the system say "hello" rather than some other expression among those found in our corpus is purely arbitrary, any of the alternatives being possible. What is more relevant is that this greeting is modelled as an action, namely AcceptCommunication, that is part of the establishCommunication recipe. In other words, it is necessary for an agent to accept a communication after another agent has requested it. The complex action of establishing communication will fail if an agent A1 calls another agent A2 but A2 does not answer in some way. The user's "hello", on the other hand, is treated by our Interpretation algorithm as an explicit (and optional) acknowledgement act. A more comprehensive treatment of grounding acts, such as that provided in Traum's (1994) work, for instance, would be helpful in treating these acknowledgements. We are currently investigating this issue. A detailed analysis of the dialogue openings found in our corpus, as well as an initial proposal for the treatment of acknowledgements occurring in these openings, is provided elsewhere (Hurault-Plantet & Balkanski, 1999).
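To make this recipe structure concrete, here is a hypothetical, declarative Python rendering of the establishCommunication recipe as described above; the exact constituents of the actual recipe are not given in the text, so this encoding is an assumption:

    ESTABLISH_COMMUNICATION = {
        "head": "EstablishCommunication(A1, A2)",
        "constituents": [
            "RequestCommunication(A1, A2)",  # A1 places the call
            "AcceptCommunication(A2)",       # A2 must answer, e.g. "CNRS bonjour"
        ],
        # optional grounding act, interpreted but not required:
        "optional": ["Acknowledge(A1)"],     # the caller's "hello"/"allo"
    }

    def can_succeed(performed):
        """The complex action fails if a required constituent is missing."""
        return all(c in performed for c in ESTABLISH_COMMUNICATION["constituents"])

    print(can_succeed({"RequestCommunication(A1, A2)"}))  # False: A2 never answered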

Finally, further development of rule 1 of the Task Advancement algorithm also rests on a more explicit treatment of grounding acts. We are investigating ways of allowing the transition from individual to mutual belief to be made on the basis of the likelihood of the user adopting the belief that the action being considered will be part of their mutual plan, instead of simply assuming this mutual belief. This likelihood is to be verified on the basis of the system's beliefs, including, for instance, a search for possible side-effects of this action. Rule 1 should also be able to determine when confirmation on the system's part is necessary. The user's explicit confirmation also has to be interpreted. These considerations may lead to adding an intermediate status between 1 and 2 for acts that have been mentioned but not yet agreed upon. Therefore, although this rule is currently vacuous and automatic, thereby making status 2 appear useless, the distinction between statuses 1 and 2 is kept to allow for future improvements.

8.3. COMPARISON WITH OTHER MODELS

Our model uses plan recognition and plan elaboration to understand and produce communicative actions in the context of a task-oriented dialogue. Many other systems have been developed in this perspective, some of which are close to ours in terms of the representation of communicative actions, the representation of the underlying plan or the associated reasoning processes.

The Trains system from Rochester (Allen et al., 1995) involves an interactive planning assistant that helps a user construct and monitor plans about a railroad freight system. It was implemented as part of a long-term research project in natural language processing and plan reasoning. The agent collaborates with other agents on plans that are incrementally formed as the two agents interact. The domain plan reasoner (Ferguson, 1995) uses an event-based temporal logic for representing and reasoning about actions, and a plan graph showing that a certain sequence of actions leads to the desired goal. Plan graphs are comparable to Rgraphs, with additional refinements such as the integration of nodes that describe both events and states. Interpretation of an event (or goal or fact) is performed by searching through a space of plan graphs. When this event is not already present in the graph, the plan reasoner expands the plan, by adding decompositions and by following precondition and effect links, much as in our interpretation. It does not appear to be possible, however, to expand the plan upwards, which implies that the user has to state his or her goal upfront. When the agents agree on a plan, the dialogue manager sends it to an execution planner. Execution and planning are therefore separated, which does not allow for their interleaving and raises the question of repair during execution. In our model, we have attempted to define defeasibility conditions for recipes and non-executed actions that allow the Plan Reasoner to replan in the midst of execution.

The Trains system is based on a sophisticated representation of mutual beliefs (Traum, 1994) which introduces several nestings of belief spaces, represented as distinct but related modalities. This system thus distinguishes a shared modality, which includes aspects of the plan assumed to be jointly intended by the system and the user; mutually believed proposal modalities, which include aspects of the plan proposed by one or the other party but not yet accepted; proposal modalities, which include aspects of the plan which have not yet been acknowledged; and, finally, the system's private modality, which includes aspects of the plan that the system has not yet communicated to the user. Grounding acts allow proposals to move from a private version of proposals (the proposal modalities) to the mutually believed proposal modalities, while acceptance acts allow proposals to move from either of these proposal modalities to the shared modality. Although these different belief spaces were defined with the goal of modelling negotiation, which is not the main focus of our research, our model would benefit from an explicit representation of the assumptions concerning mutual beliefs which currently implicitly underlie our Plan Reasoner.

The Circuit Fixit system from Duke (Smith & Hipp, 1994; Smith, Hipp & Biermann, 1995) is a voice dialogue system developed in the circuit repair domain. This system implements a theory whose central mechanism is a Prolog-style theorem prover. The system's knowledge is encoded in Prolog rules, which are called to prove a goal. When the proof fails, i.e. when the system is lacking necessary information, the system engages in a dialogue to acquire the "missing axioms". The theorem prover's halting condition is therefore similar to the stopping conditions of our Task Advancement algorithm. Our stopping conditions, and therefore our uses of language, cover a wider range of situations, since our Rgraph includes communicative actions as well as task-related actions, on the part of both the user and the system. The Circuit Fixit system, on the other hand, engages in dialogue only for the purpose of theorem proving, i.e. only when the system encounters a problem in achieving its goal. The authors show that their system is coherent with Grosz and Sidner's theory of discourse (1986), their proof tree corresponding to the intentional component, the set of predicates that are related by their rules to the currently active predicate corresponding to the attentional component, and each subdialogue being a new discourse segment. Their proof tree is similar to our Rgraph in structure, but the way in which these structures are built is very different. Furthermore, it is not clear whether their rules are defeasible and, as a result, whether replanning is an option, nor how communicative actions are reasoned about and integrated into this structure. On the other hand, their system includes capabilities for accounting for user knowledge and abilities, as well as for handling interruptions, capabilities that would be desirable extensions to our system.

The Artimis system from France Télécom (Sadek, Bretier & Panaget, 1997) is a rational agent based on the implementation of a formal theory of interaction (Bretier & Sadek, 1996). The domain of application is the vocal query, over the telephone, of the directory hosted by France Télécom. The user's utterance is translated into a logical expression representing the communicative action underlying the utterance and its content. In this model, as in ours, communicative actions and task-related actions have the same representation (Sadek, 1991). Rationality and cooperativity principles, encoded as axioms, are then used by a theorem prover to interpret this expression and plan the communicative action(s) that will be used for the system's response. The main advantage of this axiomatization is that the principles used are clearly represented and reasoned about. Artimis has a global representation of the context, namely the current "mental state", i.e. beliefs and intentions and the mental representation of the objects currently manipulated; its rational unit "produces a plan (e.g. a sequence) of dialogue acts, as a reaction to the 'understood' input" (Sadek & De Mori, 1998). In our model, the partial representation of the SharedPlan (namely the Rgraph) is the central structure of the dialogue context. It contains both domain actions and communicative actions, some of which are decided and performed by agents acting as a group, while others are individual actions, the execution of which contributes to these group actions. Group actions may be planned (or replanned) by one or more of the agents participating in these actions, their planning and execution leading to cooperation among the agents involved.

9. Conclusion

This paper demonstrated the feasibility of our dialogue model for interpreting and generating both communicative and non-communicative actions in the context of task-oriented dialogues. The Interpretation and Task Advancement algorithms allow our system to display a number of dialogue functionalities that are necessary in order for a computer to be able to play the role of a user-friendly, cooperative dialogue partner. Some of the functionalities that have been described as fundamental in human-computer dialogue systems (e.g. in Hayes & Reddy, 1983; Sadek & De Mori, 1998) and which are displayed by our system include: the ability to handle clarification subdialogues, contextual interpretation and response, mixed-initiative interaction, cooperative reactions, keeping track of the focus of attention and performing problem-solving reasoning. Our system was also shown to behave cooperatively, being able both to interpret helpful behaviour on the part of its dialogue partner and to manifest helpful behaviour towards its partner, on the basis of communicative actions and cooperative strategies.

The central component of our system is the Plan Reasoner, called upon by both the interpretation and planning/execution submodules to perform plan recognition, plan elaboration and repair, as well as plan execution tasks. It does so by manipulating the Rgraph, a structure representing the beliefs of the agent being modelled as to how all of the acts underlying the agents' discourse are related at a given point in the dialogue. We presented our Interpretation and Task Advancement algorithms, and illustrated them on three sample dialogues motivated by our corpus. We have an actual working system that empirically validates the proposed computational model.

The main component of the discourse context, namely the Rgraph, was borrowed from Lochbaum (1994). Our work differs from hers, however, in allowing the system to process dialogue openings, interleave execution and planning, and handle generation. Our model further substantially extends Lochbaum's model by providing an interpretation algorithm that allows for a more extensive search through the Rgraph than her Rgraph Augmentation algorithm, and by developing a Task Advancement algorithm that gives the system planning and generation capabilities that her system lacks. Our Interpretation algorithm, indeed, constructs the Rgraph in an upward or downward manner, and the search for a link between actions may lead to the addition of more than one recipe. Our Task Advancement algorithm is able to determine what to do next, whereas the implementation of the generation module of her system makes use of an "oracle" (i.e. the user) for selecting a task (e.g. execute an action or instantiate a parameter) among all tasks that are possible at a given point in the dialogue.

On the other hand, Lochbaum's model provides a more complete formalization of the process of recognizing the intentional structure of discourse and using that structure in discourse processing. She explains, in particular, how SharedPlans and relationships among them provide the basis for computing intentional structure. This is an issue that we have not yet addressed, but which will become particularly relevant when we develop the focus stack component of our model. This extension will require examining the nature of discourse segmentation, in order to define precisely the operations on the focus stack, and will lead to examining the role of the SharedPlans in the process of determining whether an utterance begins a new segment of the discourse, completes the current segment or contributes to it.

SharedPlans have been criticized as being difficult to apply in situations where planning and execution have to be integrated (Traum & Allen, 1994). Our approach allows us to overcome this problem by integrating both communicative actions and task-related actions into the Rgraph, and by the way our Task Advancement algorithm is capable of stopping its reasoning on the task to produce communicative actions when necessary. Other researchers who adopt SharedPlans as a basis for their dialogue model have also developed means of interleaving planning and execution (e.g. Zancanaro et al., 1997; Rich & Sidner, 1998). While Rich and Sidner have implemented a collaborative agent toolkit and used it to build a software agent that collaborates with the user of a graphical interface, our system provides more extensive interpretation and generation capabilities. Zancanaro et al. (1997) propose to extend the SharedPlan model to apply to multimodal interaction by adding adjacency pairs to account for local coherence, and capabilities for multimedia coordination, but their work does not present their actual computational model and associated interpretation and generation algorithms.

A number of important theoretical issues require further attention, the most important one being the explicit treatment of mutual beliefs and the related issue of the interpretation of grounding acts. Our model would also benefit from a more extensive set of communicative actions, as well as of task-related actions and recipes, that would allow our system to cover additional aspects of our corpus. Future research also involves further developing the Speech Act Identification module of our model, and dealing with more complex utterances, such as those described in previous work of one of the authors (Balkanski, 1993), which may convey additional information about a speaker's beliefs and intentions about the underlying actions. Finally, our implemented system needs to be extended, in particular with respect to the coverage of the syntactic and semantic interpreter and the modelling of additional recipes, to be able to test our model in situations where the knowledge base is larger.

We would like to thank the Special Issue editors and the three anonymous reviewers for their insightful comments.

References

ALLEN, J. F. (1983). Recognizing intention from natural language utterances. In M. BRADY & C. BERWICK, Eds. Computational Models of Discourse, pp. 107-166. Cambridge, MA: MIT Press.

ALLEN, J. F., SCHUBERT, L. K., FERGUSON, G., HEEMAN, P., HWANG, C. H., KATO, T., LIGHT, M., MARTIN, N., MILLER, B., POESIO, M. & TRAUM, D. R. (1995). The TRAINS project: a case study in building a conversational planning agent. Journal of Experimental and Theoretical Artificial Intelligence, 7, 7-48.

BALKANSKI, C. (1993). Actions, beliefs and intentions in multi-action utterances. TR-16-93, Ph.D. Dissertation. Harvard University, Cambridge, MA.

BALKANSKI, C. & HURAULT-PLANTET, M. (1997). Communicative actions in a dialogue model for cooperative discourse: an initial report. Proceedings of the AAAI Fall 1997 Symposium on Communicative Action in Humans and Machines. Cambridge, MA.

BRATMAN, M. (1987). Intention, Plans, and Practical Reason. Cambridge, MA: Harvard University Press.

BRETIER, P. & SADEK, D. (1996). A rational agent as the kernel of a Cooperative Spoken Dialogue System: implementing a logical theory of interaction. In J. P. MULLER, M. J. WOOLDRIDGE & N. R. JENNINGS, Eds. Intelligent Agents III, Proceedings of the 3rd International Workshop on Agent Theories, Architectures, and Languages, Lecture Notes in Artificial Intelligence, pp. 189-203. Heidelberg: Springer-Verlag.

BUNT, H. C. (1989). Information dialogues as communicative action in relation to partner modelling and information processing. In TAYLOR et al., Eds. The Structure of Multimodal Dialogue, pp. 47-73. Elsevier Science Publishers B.V. (North-Holland).

CARLETTA, J., ISARD, A., ISARD, S., KOWTKO, J., DOHERTY-SNEDDON, G. & ANDERSON, A. H. (1997). The reliability of a dialogue structure coding scheme. Computational Linguistics, 23, 13-31.

CASTAING, M. F. (1990). Etude des verbalisations engagées dans un standard téléphonique traditionnel entre standardistes et appelants. Notes et documents LIMSI 90-7. LIMSI/CNRS, Orsay, France.

CASTAING, M. F. (1993). Corpus de dialogues enregistrés dans un standard téléphonique. Notes et documents LIMSI 93-9. LIMSI/CNRS, Orsay, France.

CLARK, H. H. & SCHAEFER, E. F. (1987). Collaborating on contributions to conversations. Language and Cognitive Processes, 2, 19-41.


FERGUSON, G. (1995). Knowledge representation and reasoning for mixed-initiative planning. TR-562, Ph.D. Dissertation. Department of Computer Science, University of Rochester.

FOURNIER, J. P. (1994). Docile agents to process Natural Language. Proceedings of the 6th International Conference on Tools with Artificial Intelligence. New Orleans, LA, USA.

GROSZ, B. J. & SIDNER, C. L. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics, 12, 175-204.

GROSZ, B. J. & SIDNER, C. L. (1990). Plans for discourse. In P. R. COHEN, J. L. MORGAN & M. E. POLLACK, Eds. Intentions in Communication, pp. 417-444. Cambridge, MA: MIT Press.

GROSZ, B. J. & KRAUS, S. (1993). Collaborative plans for group activities. Proceedings of the 13th International Joint Conference on Artificial Intelligence. Chambéry, France.

GROSZ, B. J. & KRAUS, S. (1996). Collaborative plans for complex group action. Artificial Intelligence, 86, 269-357.

HAYES, P. J. & REDDY, D. R. (1983). Steps toward graceful interaction in spoken and written man-machine communication. International Journal of Man-Machine Studies, 19, 231-284.

HURAULT-PLANTET, M. & BALKANSKI, C. (1998). Communication and manipulation acts in a collaborative dialogue model. Proceedings of the 2nd International Conference on Cooperative Multimodal Communication. Tilburg, The Netherlands.

HURAULT-PLANTET, M. & BALKANSKI, C. (1999). Acknowledgement acts in dialogue openings. Amstelogue '99, Amsterdam Workshop on the Semantics and Pragmatics of Dialogue.

KAPLAN, R. & BRESNAN, J. (1982). Lexical-functional grammar: a formal system for grammatical representation. In J. BRESNAN, Ed. The Mental Representation of Grammatical Relations, pp. 173-281. Cambridge, MA: MIT Press.

LESH, N. & ETZIONI, O. (1996). Scaling up goal recognition. Proceedings of the International Conference on Principles of Knowledge Representation and Reasoning.

LESH, N., RICH, C. & SIDNER, C. (1998). Using plan recognition in human-computer collaboration. 7th International Conference on User Modeling. Banff, Canada, July 1999.

LOCHBAUM, K. E. (1994). Using collaborative plans to model the intentional structure of discourse. TR-25-94, Ph.D. Dissertation. Harvard University, Cambridge, MA.

LOCHBAUM, K. E. (1995). The use of knowledge preconditions in language processing. IJCAI '95, pp. 1260-1266. San Mateo, CA: Morgan Kaufmann Publishers, Inc.

LOCHBAUM, K. E., GROSZ, B. J. & SIDNER, C. L. (1990). Models of plans to support communication: an initial report. Proceedings of AAAI-90, pp. 485-490. Boston, MA.

NOVICK, D. & SUTTON, S. (1997). What is mixed-initiative interaction? Working Notes, AAAI Spring Symposium on Computational Models for Mixed-Initiative Interaction. Stanford University, CA.

RICH, C. & SIDNER, C. (1998). Collagen: a collaboration manager for software interface agents. User Modeling and User-Adapted Interaction, Special Issue on Computational Models for Mixed Initiative Interaction, 8, 315-350.

SADEK, D. (1991). Dialogue acts are rational plans. ESCA/ETRW Workshop on the Structure of Multimodal Dialogue. Maratea, Italy.

SADEK, D., BRETIER, P. & PANAGET, F. (1997). ARTIMIS: natural dialogue meets rational agency. Proceedings of the 15th International Joint Conference on Artificial Intelligence, IJCAI '97. Nagoya, Japan.

SADEK, D. & DE MORI, R. (1998). Dialogue systems. In R. DE MORI, Ed. Spoken Dialogues with Computers. New York: Academic Press.

SMITH, R. W. & HIPP, D. R. (1994). Spoken Natural Language Dialog Systems: A Practical Approach. Oxford: Oxford University Press.

SMITH, R. W., HIPP, D. R. & BIERMANN, A. W. (1995). An architecture for voice dialog systems based on Prolog-style theorem proving. Computational Linguistics, 21, 281-320.

SOWA, J. (1984). Conceptual Structures: Information Processing in Mind and Machine. Reading, MA: Addison-Wesley.

TRAUM, D. R. (1994). A computational theory of grounding in natural language conversation. TR-545, Ph.D. Dissertation. Department of Computer Science, University of Rochester.

TRAUM, D. R. & ALLEN, J. F. (1994). Towards a formal theory of repair in plan execution and plan recognition. Proceedings of the 13th Workshop of the UK Planning and Scheduling Special Interest Group.


VAPILLON, J., BRIFFAULT, X., SABAH, G. & CHIBOUT, K. (1997). An object-oriented linguistic engineering environment using LFG (Lexical Functional Grammar) and CG (Conceptual Graphs). Proceedings of Computational Environments for Grammar Development and Linguistic Engineering (ENVGRAM), ACL '97 Workshop. Madrid, Spain, pp. 99-106.

ZANCANARO, M., STOCK, O. & STRAPPARAVA, C. (1997). A discussion on augmenting and executing SharedPlans for multimodal communication. Proceedings of the AAAI Fall 1997 Symposium on Communicative Action in Humans and Machines. Cambridge, MA.