LONG PAPER

A cognitive framework for robot guides in art collections

Dimitrios Vogiatzis • Vangelis Karkaletsis

Published online: 15 July 2010

© Springer-Verlag 2010

Abstract A basic goal in human–robot interaction is to

establish such a communication mode between the two

parties that the humans perceive it as effective and natural;

effective in the sense of being responsive to the informa-

tion needs of the humans, and natural in the sense of

communicating information in modes familiar to humans.

This paper sets the framework for a robot guide to visitors

in art collections and other assistive environments, which

incorporates the principles of effectiveness and naturalness.

The human–robot interaction takes place in natural lan-

guage in the form of a dialogue session during which the

robot describes exhibits, but also recommends exhibits that

might be of interest to the visitors. It is also possible for the

robot to explain its reasoning to the visitors, with a view to

increasing transparency and consequently trust in the

robot’s suggestions. Furthermore, the robot leads the visi-

tors to the location of the desired exhibit. The framework is

general enough to be implemented in different hardware,

including portable computational devices. The framework

is based on a cognitive model comprised of four modules: a

reactive, a deliberative, a reflective and an affective one.

An initial implementation of a dialogue system realising

this cognitive model is presented.

Keywords HCI · Dialogue system · Cognitive architecture · Recommender system · Explanations

1 Introduction

The present work aims to set the specifications for the

dialogue system of intelligent mobile robots that interact

with humans, while they operate and provide services in

populated environments. In particular, the specifications

are focused on robots serving as guides in museums, art

collections or cultural foundations, but they can be exten-

ded to other domains also, since the specifications are quite

general. Robots have been deployed as guides in art col-

lections for a long time (at least since 1999), but their focus

has been mostly on navigation, image recognition and

keeping track of people [5, 7, 31]. In human–robot inter-

action, the support of natural language (at least up to a

certain degree) has been somewhat neglected; most exist-

ing systems offer only prerecorded information

about the exhibits. A museum visitor, after entering the

premises, is in the following situation: there are too many

exhibits to visit, some of which might be of interest,

whereas many others are of no interest. It seems that some

guidance is necessary in order to personalise the visit and

make it better fulfil the goals and preferences of the visitor.

This guidance role could be played by a mobile robot that

interacts with visitors, offering personalised information

about specific exhibits and advice about what exhibits to

visit. Interaction occurs

through a system that is equipped with a dialogue manager

enacting a dialogue session. The dialogue manager is based

on a modular cognitive model that comprises a reactive

module, a deliberative module and finally a reflective

module, starting from the bottom and moving up to the top. These modules

perform the ‘‘thinking’’ process of the system, each acting

at a different level of cognition. In addition, the architec-

ture is enriched with an affective module that is able to

detect user emotions and express robot emotions. Thus, the

D. Vogiatzis (✉) · V. Karkaletsis

Institute of Informatics and Telecommunications, NCSR

‘‘Demokritos’’, 153 10, Agia Paraskevi, Athens, Greece

e-mail: [email protected]

V. Karkaletsis

e-mail: [email protected]


Univ Access Inf Soc (2011) 10:179–193

DOI 10.1007/s10209-010-0199-3

aim is to address human–robot communication from two

sides: by enabling robots to correctly perceive and under-

stand natural human behaviour and by making them act in

ways that are familiar to humans. The proposed cognitive-

based model is a solid framework for an effective and

natural robot.

This paper is organised as follows. The next section

provides an overview of related work, as well as of this

paper’s contribution. Section 3 provides an overview of the

cognitive-based framework. The constituent parts are then

analysed. The reactive, deliberative and reflective modules

of the framework are elaborated in Sects. 4, 5, and 6,

respectively. An implementation of the framework as well

as some early evaluation results are exposed in Sect. 7.

Finally, Sect. 8 presents the conclusions and future

research directions.

2 Related work

The research presented here falls at the crossroads of dia-

logue, cognitive and recommender systems. While a robot

guide will integrate speech and gesture recognition as well

as text to speech generation, and it will be able to map the

surrounding space and navigate, these subjects are not

covered in the current work.

Among humans, dialogue represents perhaps the most

natural form of communication. A spoken dialogue sys-

tem (SDS) enables humans to communicate with com-

puters using speech as input, and the computer responds

with speech generation (see [17] for an overview of

spoken dialogue systems). Spoken dialogue systems are

divided into three major categories depending on who has

the control of the dialogue. In system-directed dialogue,

the system asks a sequence of questions to elicit the

required parameters of the task from the user. In user-

directed dialogue, the dialogue is controlled by the user,

who asks the system questions in order to obtain infor-

mation. Finally, in mixed-initiative dialogue, the dialogue

control alternates between the two participants. The user

can ask questions at any time, but the system can also

take control to elicit the required information or to clarify

ambiguous information. In all types of dialogue systems,

the dialogue has to be managed from the robot’s side, in

order to determine what questions the system should ask,

when and how it should respond to the requests of the

users.

Concerning dialogue management techniques, there are

three main types, i.e., state based, frame based and plan

based. The most common category is the state based which

is based on graphs. Dialogue is expressed as a network of

states connected by edges. The system at each state can

perform one of the following steps:

1. Ask the user for specific information, either providing

the expected answers or asking a specific question,

2. Generate a response to the user, or

3. Access an external application.

The assumption here is that the user should provide a

response among a limited number of particular answers,

thus the input at each state should be specific. This leads to

system-directed dialogue systems. The advantages of this

technique are faster development and more robust systems.

On the other hand, the usage of this technique results in

limited flexibility in the dialogue structure. CSLU,1

Dipper2 and TrindiKit3 are representative examples of

platforms that support the building of state-based dialogue

systems.
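As an illustration, a state-based manager reduces to a walk over a graph of states. The sketch below is a minimal stand-in; the states, prompts and expected answers are invented and are not taken from CSLU, Dipper or TrindiKit:

```python
# Minimal state-based dialogue manager: a graph of states, each with a
# prompt and a mapping from expected user answers to next states.
# States and prompts are illustrative only.

STATES = {
    "start":      {"prompt": "Do you want a tour? (yes/no)",
                   "edges": {"yes": "pick_theme", "no": "goodbye"}},
    "pick_theme": {"prompt": "Sculpture or pottery?",
                   "edges": {"sculpture": "goodbye", "pottery": "goodbye"}},
    "goodbye":    {"prompt": "Goodbye!", "edges": {}},
}

def run_dialogue(user_answers):
    """Drive the graph with a scripted list of user answers;
    return the sequence of visited states."""
    state, visited = "start", ["start"]
    answers = iter(user_answers)
    while STATES[state]["edges"]:
        reply = next(answers)
        # Unexpected input keeps the system in the same state (re-prompt),
        # the usual behaviour of system-directed dialogue.
        state = STATES[state]["edges"].get(reply, state)
        visited.append(state)
    return visited
```

Note that an unrecognised answer simply leaves the system in the same state, which reflects the restricted, system-directed character of this technique.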

In frame-based approaches, each frame represents a task

or subtask and it has slots representing the pieces of

information that the system needs in order to complete the

task. Frame-based dialogue managers ask the user ques-

tions in order to fill in the slots of the frame in a certain

order, but they also allow the user to guide the dialogue

by providing information that fills in slots according to

individual preference. Therefore, with this technique,

mixed-initiative systems can be built. This leads to shorter

dialogues compared to state-based approaches. However,

user utterances become less restricted and hence harder to

predict, which increases the time needed to develop a

robust system.
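The frame idea can be sketched as follows (the slot names are hypothetical): the manager asks about the first unfilled slot, but merges any slots the user volunteers, in any order, which is what allows mixed initiative and shorter dialogues.

```python
# Minimal frame-based manager: a frame with slots; the user may fill
# several slots in one utterance. Slot names are illustrative.

FRAME_SLOTS = ("period", "artefact_type")

def next_question(frame):
    """Ask about the first unfilled slot, or None when the frame is complete."""
    for slot in FRAME_SLOTS:
        if frame.get(slot) is None:
            return f"Which {slot.replace('_', ' ')} are you interested in?"
    return None

def update_frame(frame, utterance_slots):
    """Merge whatever slots the user volunteered, in any order."""
    for slot, value in utterance_slots.items():
        if slot in FRAME_SLOTS:
            frame[slot] = value
    return frame
```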

Plan-based techniques are used in building dialogue

systems, where the pieces of information or actions needed

to perform a task are hard to predict in advance. This type

of technique, in contrast with the other two types, does not

depend on task modelling (using graphs or frames). It

concentrates instead on identifying the user’s plan and

determining how the system can contribute towards the

execution of that plan. This is a dynamic process, where

new information from the user may either contribute to the

system’s perception of the user’s plan or force the system

to modify it. Consequently, more initiatives are allowed by

the user, at the cost of far more complex implementation

and maintenance compared to the previously mentioned

approaches.

Recent research has started to examine the efficient use

of domain knowledge in dialogue systems [18]. Domain

knowledge is knowledge about the environment in which

the system operates. It contains domain-specific concepts,

concept properties and relations, such as concept x is-part-

of concept y. The ontology in ontology-based dialogue

systems drives interpretation, generation and the interac-

tion of the dialogue system with the user. For instance, the

1 http://www.cslu.cse.ogi.edu/toolkit/
2 http://www.ltg.ed.ac.uk/dipper/
3 http://www.ling.gu.se/projekt/trindi/trindikit/


domain-specific lexicon and grammar of the automatic

speech recognition (ASR) component can be partially

derived from the ontology. Furthermore, the natural

language generation (NLG) component can generate descriptions of the ontology

objects. Dialogue systems with a domain knowledge

backbone benefit from reduced reliance on hand-crafted

linguistic components, more flexible dialogues and, in

general, they can be ported more easily to new application

domains.

Building artificial systems that interact intelligently and

naturally with humans is a difficult task. The field of

cognitive systems contributes in this effort by creating

systems that are based on the cognitive processes or

architectures in humans. Cognition can be understood as a

sequence of steps, each step feeding the next one. It begins

with a perception of the outside world and continues with

the representation of the perceived information. Then the

newly acquired information is processed along with older

pieces of information to produce actions towards the out-

side world, as well as to change the internal state of the

system. Thus, a cognitive system reacts upon events in the

environment, and deliberates anticipating future events.

Moreover, it can handle situations that have not been

preprogrammed and is robust upon unexpected events.

Additionally, it interacts intelligently with other social

agents, including humans. Cognitive systems are at the

crossroads of artificial intelligence, linguistics and psy-

chology. An overview of cognitive system architectures

can be found in [15] and [34].

There are two main stances for implementing a cogni-

tive system. The first, known as cognitivist approach,

involves symbols and rules. Symbols are related to entities

of the surrounding world, and rules tell how to process the

symbols. Ultimately, it is based on the physical symbol

system hypothesis that was stated by Newell and Simon

[21]. The second major approach is described as emergent

and incorporates techniques from connectionism and

dynamic systems. In such systems, knowledge is repre-

sented in a more opaque, subsymbolic manner. The sub-

symbolic representation is usually defined by the origin of

the information (e.g., sensors or database), rather than by

the content of the information. Moreover, in an emergent

system with multiple streams of information, knowledge is

usually stored in a distributed fashion. The need for com-

bination of cognitivist and emergent systems arises from

the need for both symbolic and subsymbolic processing.

This combination leads to hybrid systems. A key enabling

technology for next generation robots for the service,

domestic and entertainment market is human–robot inter-

action. A robot that collaborates with humans on a daily

basis (be this in care applications, in a professional or

private context) requires interactive skills that go beyond

keyboards, button clicks or even natural language. For this

class of robots, human-like interactivity is a fundamental

part of their functionality. In doing so, virtual emotion

expression and human emotion recognition must be inte-

grated in a cognitive system. Handling emotions in a

computational system has been advocated for a long time

[26]. Essentially, the argument for affective computers is

that they can exhibit a more natural and intelligent

behaviour in interaction with humans. The idea of incor-

porating emotions in dialogue systems is gaining momen-

tum [2].

The authors were involved in the projects Xenios4 and

Indigo,5 which both involved robots operating in museums

and serving as guides to the exhibits. In particular, in both

projects robots were deployed at the ‘‘Foundation of

the Hellenic World’’, which is a cultural institution for the

presentation of Hellenic history.6 In the Xenios project, the

robotic system involved a state-based spoken dialogue

system that could talk about specific exhibits of a museum,

upon the user’s request. In the Indigo project (a continua-

tion and extension of Xenios), the robotic platform enabled

a more sophisticated human–robot interaction. The dia-

logue form was more complex, and the platform included

an affective module that could express robotic emotions by

altering the robot’s facial features.

The present work, based on the experience from those

projects, advocates the introduction of a complex cog-

nitive architecture (inspired by the work of Minsky [19])

to support a more natural form of robot–human interac-

tion, which is enhanced with an affective module. There

is also a recommender module, which aims at making

suggestions to the visitor about exhibits that might be of

individual interest, based on analysing his past prefer-

ences and his similarity to other users. In addition, the

reasoning behind the suggestions can be explained to the

user.

Another line of research work is the conversational

recommender systems, which actively involve the user in

the formation of a recommendation of a product or service.

The user’s participation can take the form of a dialogue

session or simply of a selection of the most relevant items

from a list of n-items; thus, the user provides feedback that

can be extremely useful, especially in differentiating

between his short-term interests and his longer-term

interests. Conversational recommender systems were ini-

tially developed as an enhancement to content-based sys-

tems, but recently they have found a fertile ground in

collaborative systems [13, 27].

4 http://www.ics.forth.gr/xenios/
5 http://www.ics.forth.gr/indigo/contact.html
6 http://www.fhw.gr/fhw/en/home/index.html


2.1 Contribution

This paper sets a framework of intelligent interaction

between a machine agent and a human. The suggested

framework incorporates many modalities of interaction,

creating a whole, which is perceived as natural and intel-

ligent by humans. The modalities are separate modules

connected to the framework, and they include natural

language generation, natural language understanding and

robot navigation.

The framework is based on an enhanced variation of a

three-level cognitive architecture that has been proposed in

[30], and it allows the engineering of many possible

machine agents in different incarnations. In the current

work, the framework is implemented as a mobile robot, but

the strength of the framework is that it is flexible enough to

be implemented on different hardware, such as PDA

devices. Moreover, the framework reduces what is con-

sidered intelligent interaction to the reaction, deliber-

ation and reflection layers. The current form of the

framework clarifies the interplay of its different constitu-

ents, and the ways they are combined to create a natural

form of interaction. The framework models the user that

interacts with it. The modelling aims to predict the user's future

preferences. The model is quite complex, as it includes past

user interaction, user similarity to previous users and user

emotional state. In addition, the robot is able to express

emotions. It also differentiates between what is to be

expressed by the machine agent, and how it will be

expressed. Finally, the framework provides for explana-

tions of the robot’s utterances and is able to adapt to the

desires of the current user. This is achieved with the aid of

a recommendation engine that forms part of the

deliberative level of the framework.

The framework was implemented in a workable proto-

type, and its feasibility has been dem-

onstrated at the premises of an art collection.

3 Robot framework

The dialogue system according to the proposed framework

consists of resources and modules (see Fig. 1). The mod-

ules are the Dialogue System Manager (DSM), the Natural

[Fig. 1 Modules and resources of the robotic platform: ASR, NLI, NLG, TTS, robot face, gesture recognition, navigation, dialogue manager & communication server, recommendation engine, explanation engine, user models (PServer), domain ontology, robot personality, application databases]


Language Generation (NLG) engine, the Personalisation

Server (PServer), the Automatic Speech Recognition (ASR)

engine, the Text To Speech (TTS) synthesiser, a display

depicting a robot face and the Gesture Recogniser. Finally,

there is a communication server, which enables the inter-

module communication.

The DSM is the ‘‘actor’’ of the whole dialogue system,

in the sense that it is the module that invokes and coordi-

nates all the other modules. In order to decide the next

dialogue state and the text it will utter (through the TTS

unit), it takes into account the cognitive model, the inter-

action history of the user, as well as the location of the

robot. All the above contribute into creating more natural

dialogues. The DSM also communicates with the robot

navigation module, which controls the robotic movement,

and the gesture recogniser for understanding simple ges-

tures from the visitor. The DSM is the cognitive centre of

the entire system, and its constituent parts are the reactive,

deliberative, reflective and affective modules, which are

discussed separately in the next sections (see Fig. 2 for a

conceptual view of the cognitive structure). The cognitive

structure has been influenced by the work of Sloman in

[30]. The modules are not domain specific; consequently,

they can be easily transferred to other domains.

The resources of the dialogue system are as follows: the

robotic personality, which influences the expression of

robotic emotions; the resources of the NLG engine; user

models; data bases recording the interaction history of each

user; and some databases that hold canned text and other

information used during the dialogue.

4 Reactive module

The reactive module is the lowest in terms of cognitive

abilities. It involves no decision and no planning. It just

reacts to user’s requests without any deliberation. It has rules

of the form: ⟨if⟩ condition ⟨do⟩, or rules of the form if

⟨situation and goal⟩ then ⟨do⟩. For instance, the first type

of rules might capture the fact that when the user talks, the

dialogue system tries to understand. The second type of

rules refers to acting, provided that the action matches a user-

set goal. For instance, the user wants to be informed about an

exhibit that is located far away from his current location, and

places this query to the robot. As the robot traverses the

museum rooms, guiding the user to the exhibit of interest, it

encounters other exhibits that it completely ignores. Other

functions performed by the reactive module include:

– if there is an obstacle in the course of the robot, it tries

to avoid it

– if the robot is close to an exhibit of interest to the user,

it stops and talks about it

– quits the tour upon relevant request by the user.

In order to achieve the above, the robot needs to form a

mapping of the surrounding space and also to be able to

recognise human gestures. The ASR module is also part

of the reactive module.
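The two rule forms above can be sketched as a simple condition–action dispatcher; the percept names and action labels below are illustrative placeholders, not the robot's actual API:

```python
# Reactive rules: plain (condition -> do) and (situation & goal -> do).
# No planning or deliberation: the first matching rule fires.

def reactive_step(percepts, goal):
    """percepts: dict of boolean sensor facts; goal: current user-set goal."""
    # Type 1: plain condition -> do
    if percepts.get("obstacle_ahead"):
        return "avoid_obstacle"
    if percepts.get("user_speaking"):
        return "listen"
    if percepts.get("quit_requested"):
        return "end_tour"
    # Type 2: situation and goal -> do (fires only if the goal matches)
    if percepts.get("near_exhibit") and goal == "visit_exhibit":
        return "stop_and_describe"
    return "continue"
```

Note how an exhibit encountered en route is ignored unless it matches the user-set goal, as in the museum-traversal example above.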

Next, two examples of user requests that invoke the

reactive module are presented.

On the other hand, a request such as the following

cannot be handled by the reactive module:

U: I want something interesting.

5 Deliberative module

The deliberative module allows the robot to exhibit a

sophisticated and complex behaviour in a museum that

renders the interaction with a human more natural. Delib-

eration is placed at a ‘‘higher level’’ than reaction. The

point of focus here is the dialogue system of the robotic

platform. This is not to say that navigation, machine vision

[Fig. 2 Cognitive system architecture of the dialogue manager]


and speech recognition do not require sophisticated algo-

rithms and possibly planning, but they do not form an

integral part of the proposed framework.

In the current framework, the dialogue form is dynamic

in the sense that the next robot utterance depends on the

history of interaction of the user, his current location, his

stereotype, the museum’s ontology and some other factors

that are analysed below. That is, the dialogue form is not

preset; rather, the next dialogue state is generated dynam-

ically by a complex deliberation process. It should be

mentioned that the dialogue manager is in control of the

dialogue session. In particular, the deliberation takes

decisions regarding the following (see also Fig. 3):

WHAT to talk about: It concerns the exhibit or the

museum program to talk about next to a specific user. It

depends on the ontology of the museum, the user’s past

interaction with the robot, other users’ interaction with the

robot, the visitor assimilation factor related to exhibits and,

finally, the robot’s preference.

HOW to express what is to be uttered. This depends on

the user’s model and on the robot’s emotional state.

The functionality of these modules is described next.

5.1 WHAT to talk about

5.1.1 Ontology based

Based on the museum’s ontology suggestions for exhibits

that are spatially related to the current position of the user

can be provided. In particular, suggestions are provided for

exhibits that are in close physical proximity to the location

of the user (close-to, next-to, right-to, left-to, above, below

the user).

This mechanism also provides suggestions for items that

are semantically related to what the user has seen. Some

interesting semantic similarities that can be exploited include

the following: subclasses of the same classes and entities of

the same class are related. Similar time periods are also

related, even if they do not refer to entities of the same

class. Finally, exhibits by the same designer are also

semantically related.
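These relatedness criteria can be sketched over a toy exhibit collection; the data and matching logic below are invented for illustration (the actual system works over an OWL ontology):

```python
# Semantic relatedness over a toy exhibit collection: two exhibits are
# related if they share a class, a time period, or a (known) designer.
# All data is illustrative.

EXHIBITS = {
    "amphora_1": {"cls": "pottery",   "period": "archaic",   "designer": "unknown"},
    "amphora_2": {"cls": "pottery",   "period": "classical", "designer": "unknown"},
    "kouros_1":  {"cls": "sculpture", "period": "archaic",   "designer": "unknown"},
}

def related(a, b):
    """True if two exhibits share class, period or designer.
    An 'unknown' designer is not treated as a real match."""
    ea, eb = EXHIBITS[a], EXHIBITS[b]
    same_designer = ea["designer"] == eb["designer"] != "unknown"
    return ea["cls"] == eb["cls"] or ea["period"] == eb["period"] or same_designer

def suggest(seen):
    """Suggest unseen exhibits related to anything already seen."""
    return sorted(x for x in EXHIBITS if x not in seen
                  and any(related(x, s) for s in seen))
```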

Let us assume that the visitor has seen some pottery

samples from the archaic period. Then the robot makes the

following semantically related suggestion:

5.1.2 Recommendation

The DSM records the interaction history of each user, i.e.,

what each user has seen. The above information, along

with the user demographic data, constitute the user model.

The interaction history of all users is kept in a structured

form, which facilitates extraction of personal preferences

and the derivation of similarities among users. The

module that realises the above is the PServer. In partic-

ular, the PServer is equipped with a recommender module

that suggests items of possible interest to the current user

(see [1] for an overview of recommender systems). It can

provide content-based recommendations, that is sugges-

tions that are similar to past user preferences. Thus, the

suggestion of an unseen exhibit to a particular user is

estimated based on similar items he has seen in the past.

In doing so, the system tries to discover commonalities

between exhibits seen in the past, which have been pos-

itively rated.

In particular, some of the following features (fields) of

exhibits (entities) are used to detect commonalities: author,

historical period, type of artefact, etc.
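A minimal content-based scoring sketch along these lines follows; the feature names come from the text, but the scoring function itself is an assumption for illustration, not the PServer's actual algorithm:

```python
# Content-based scoring sketch: an unseen exhibit is scored by feature
# overlap (author, period, artefact type) with positively rated items.

FEATURES = ("author", "period", "artefact_type")

def content_score(candidate, liked_items):
    """Average fraction of features the candidate shares with liked items."""
    if not liked_items:
        return 0.0
    per_item = [sum(candidate[f] == item[f] for f in FEATURES) / len(FEATURES)
                for item in liked_items]
    return sum(per_item) / len(per_item)
```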

Also, a collaborative recommender system is in place.

Such a system can suggest interesting exhibits to a user

based on items judged as interesting by a group of ‘‘similar

minded’’ users. For instance, it might be discovered that

visitors interested in architecture are also interested in

sculpture. Thus, a new visitor, after having received

information about an architectural exhibit, might be offered

advice for seeing the collection of sculptures. A more

detailed exposition of the capabilities of the PServer is

provided in Sect. 7.3.
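The collaborative counterpart can be sketched with a set-overlap similarity between users; Jaccard similarity is used here for illustration and is not necessarily the measure the PServer implements:

```python
# Collaborative sketch: users are sets of liked exhibits; recommend what
# the most similar other user liked. Data passed in is illustrative.

def jaccard(a, b):
    """Set-overlap similarity in [0, 1]."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def collaborative_suggest(current, others):
    """others: {user: liked set}. Suggest items liked by the nearest user
    that the current user has not seen yet."""
    if not others:
        return set()
    nearest = max(others, key=lambda u: jaccard(current, others[u]))
    return set(others[nearest]) - set(current)
```

This captures the example from the text: a visitor who liked an architectural exhibit is pointed to sculptures because similar-minded visitors liked both.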

The system also takes into account acceptance of pre-

vious suggestions. For example, assuming that the user has

[Fig. 3 Functional relation of the modules of the dialogue manager]


seen in the past a similar programme to the one suggested,

the following is a case of content-based recommendation,

where the user expresses aversion to the suggestion.

In another example, the user has just seen a few exhibits

of the museum. A recommendation is produced by simi-

larly minded users:

5.1.3 Assimilation

Assimilation is a variable that controls whether text will be

generated and consequently uttered for an exhibit. As

already mentioned, the whole museum collection of

exhibits is represented by an OWL ontology. For every

entity of the ontology, the value of a parameter known as

assimilation variable can be set. This captures the degree

to which a particular entity has been comprehended by a

certain visitor stereotype. Thus, each time text is generated

for an exhibit, a counter is decreased by one, until it

reaches zero. After that, the exhibit is considered known to

a user and no more text can be generated even if the user

explicitly asks for it.
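The assimilation counter can be sketched as follows; the per-stereotype bookkeeping and the initial budget are illustrative assumptions:

```python
# Assimilation sketch: each (stereotype, exhibit) pair has a budget of
# text generations; at zero the exhibit counts as assimilated and no
# further text is produced. The initial budget is illustrative.

class Assimilation:
    def __init__(self, initial=2):
        self.initial = initial
        self.counters = {}

    def may_generate(self, stereotype, exhibit):
        """Decrement the counter and report whether text may be generated."""
        key = (stereotype, exhibit)
        left = self.counters.get(key, self.initial)
        if left == 0:
            return False          # assimilated: stay silent
        self.counters[key] = left - 1
        return True
```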

5.2 HOW to express

Assuming the existence of certain plausible user stereo-

types, such as children, adults and experts, different user

stereotypes have different needs concerning a museum

tour. For example, children might favour shorter descrip-

tions of museum items with simpler words, and experts

might favour more elaborate descriptions. The user ste-

reotype can be inferred as follows. Assuming that child-

and adult-specific tickets are tagged with RFID, a

robot equipped with an RFID reader could infer the user

stereotype.

The following is an example of a description of an exhibit

by the robot when addressing an adult and a child,

respectively. In the case of the child, the text is shorter and

uses simpler vocabulary.
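The HOW-level choice can be sketched as a stereotype-indexed style table; the style parameters and the fallback to the adult stereotype are invented for illustration:

```python
# Stereotype-dependent surface form: the same content is realised
# differently per stereotype. Parameters are illustrative.

STYLE = {
    "child":  {"max_sentences": 1, "vocabulary": "simple"},
    "adult":  {"max_sentences": 3, "vocabulary": "standard"},
    "expert": {"max_sentences": 5, "vocabulary": "technical"},
}

def realise(sentences, stereotype):
    """Trim a canned description to the stereotype's length preference;
    unknown stereotypes fall back to the adult style."""
    style = STYLE.get(stereotype, STYLE["adult"])
    return " ".join(sentences[: style["max_sentences"]])
```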

5.2.1 Affective module

In the cognitive architecture, the affective module is

the ‘‘emotional’’ centre of the system. It is capable of rec-

ognising certain types of emotions, but also of expressing

emotions.

Thus, the robot is able to recognise the emotional con-

tent in the user’s utterance as he interacts. Currently, the

emotional recognition is limited to recognising polite,

impolite or neutral requests by the user. Recognition is

based on shallow parsing of the user’s utterance. The emo-

tional content of the user’s utterance creates an emotional

impulse to the robot, which is represented according to the

OCC model [23].

The next step is the expression of robot emotion. The emotional impulse is processed together with the robot’s personality, represented as an OCEAN vector, and the robot’s previous mood, represented as an OCC vector. This results in a new robotic emotion, which can drive the robot’s facial features on an LCD display. It is also possible to influence the intonation of the robot’s text-to-speech system in order to differentiate between positive and negative emotions.
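The combination of impulse, personality and mood can be sketched as follows. The linear-blend rule, the decay parameter and the use of a single OCEAN trait (extraversion) as a gain are illustrative assumptions; the paper does not specify the combination function.

```python
def update_emotion(impulse, mood, personality_extraversion, decay=0.5):
    """Blend an OCC-style emotional impulse with the robot's previous mood.

    `impulse` and `mood` are dicts over OCC emotion categories (e.g. 'joy',
    'distress'); `personality_extraversion` is one OCEAN trait in [0, 1],
    used here as a simple gain on incoming impulses.  The linear blend and
    the single-trait gain are illustrative assumptions.
    """
    new_mood = {}
    for emotion in set(impulse) | set(mood):
        carried = decay * mood.get(emotion, 0.0)
        incoming = (1 - decay) * personality_extraversion * impulse.get(emotion, 0.0)
        new_mood[emotion] = min(1.0, carried + incoming)
    return new_mood


# A polite user utterance produces a positive impulse, which is blended
# with the robot's previous mood to yield its new emotional state.
mood = update_emotion({"joy": 0.8}, {"joy": 0.2, "distress": 0.3},
                      personality_extraversion=0.9)
```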

The robot’s facial features change to reflect high interest as an emotional response to a motivated user’s utterance.

The robot’s facial features can also change to reflect disappointment upon an unwelcome request from the user:

U: I am bored, I want to quit.

R: Ok. Let me take you to the exit.


6 Reflective module

The reflective module stands on top of the deliberative and reactive modules and provides a management mechanism for the cognitive system. There are many reasons for the existence of such a mechanism, in particular the need to revise the policies implemented by the two lower-level modules. There are cases when the deliberative module fails to provide suggestions about exhibits that are interesting to the users. As mentioned, the robotic suggestions are based on semantic similarity or on a recommender system that comes in two versions (a content-based and a collaborative one). Hence, there are three strategies for generating the next dialogue state with a suggestion, and it is necessary to identify which one is most adequate for a certain user. Whereas some a priori assumptions can be stated about when each strategy is more applicable, these assumptions will sooner or later turn out to be insufficient. This is where the reflective mechanism is needed: it revises or evolves the policy for selecting the most appropriate strategy to generate the next state of the dialogue, in other words, for deciding the content of an utterance. A simple and efficient way to set up the management mechanism is to assign weights to the different strategies and then to evolve the weights by means of a genetic algorithm. The algorithm revises the weights based on the approval or disapproval detected in the users’ utterances, which creates a positive or negative reinforcement signal.
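The weight-evolution idea can be sketched as follows. The population size, mutation rate, truncation selection and the simulated fitness function are illustrative assumptions; the paper specifies only that strategy weights are evolved by a genetic algorithm driven by a reinforcement signal.

```python
import random

# The three strategies for generating the next dialogue state.
STRATEGIES = ["semantic", "content-based", "collaborative"]


def normalise(w):
    s = sum(w)
    return [x / s for x in w]


def mutate(w, rate=0.1):
    # Perturb each weight slightly, then renormalise to keep a valid mix.
    return normalise([max(1e-6, x + random.uniform(-rate, rate)) for x in w])


def evolve(fitness, generations=30, pop_size=10, seed=0):
    """Evolve strategy weights with a small truncation-selection GA.

    `fitness` maps a weight vector to a reinforcement score accumulated
    from user approval/disapproval.  All GA parameters here are
    illustrative assumptions.
    """
    random.seed(seed)
    population = [normalise([random.random() + 0.1 for _ in STRATEGIES])
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        population = parents + [mutate(random.choice(parents)) for _ in parents]
    return max(population, key=fitness)


# Simulated reinforcement: this user responds best to collaborative
# suggestions and mildly dislikes semantic-similarity ones.
best = evolve(lambda w: w[2] - 0.2 * w[0])
```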

Another important function of the reflective module is to

provide explanations of its suggestions.

6.1 Explanations

6.1.1 Opaque explanations

There are cases in which the visitor will request an explanation of why the robot suggested an exhibit. The request might be made on a purely informative basis, or it might be an indirect hint to the robot that it has failed to provide adequate suggestions, in which case the robot needs to revise its policy. In any case, explainability increases the visitor’s trust in the system.

Two different kinds of explanation are offered. The first

is the neighbour style explanation, which essentially pro-

vides a histogram of the similar users’ preferences (see

Fig. 4 and [11] for reference). This type of explanation is

pertinent when the system employs collaborative-based recommendations.

Another type of explanation is based on the system’s past-credibility with respect to the current user.

The two aforementioned types of explanation are char-

acterised as opaque, since the reasoning is not based on the

features of the exhibit but rather on preferences of users.
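Both opaque styles can be sketched as simple computations over community data. The function names, input shapes and rounding below are illustrative assumptions; only the two explanation styles themselves come from the paper.

```python
def neighbour_style_explanation(ratings):
    """Summarise similar users' preferences as percentages, as in the
    histogram-style explanation of Fig. 4.  `ratings` is a list of
    'liked' / 'neutral' / 'disliked' labels from the user's community."""
    total = len(ratings)
    counts = {label: ratings.count(label)
              for label in ("liked", "neutral", "disliked")}
    return {label: round(100 * n / total) for label, n in counts.items()}


def past_credibility_explanation(accepted, total):
    """Explain a suggestion via the system's track record with this user."""
    return f"I have been correct {round(100 * accepted / total)}% of the time."


hist = neighbour_style_explanation(["liked"] * 6 + ["neutral"] * 3 + ["disliked"])
```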

6.1.2 Transparent explanations

In this case, the explanation is based on the features (fields) of the suggested exhibit, which are matched against the features of the user’s model (profile). For instance, if the user’s model is that of Table 1, then the features of the

Fig. 4 Similar users’ preferences (histogram of the percentages of similar users who liked, were neutral about or disliked the exhibit)

Table 1 Individual user model, comprising four attributes and eight features. Numbers denote the user’s ratings or preference for the corresponding features

Age                         23
Expertise                   Non-expert
Gender                      Female
Occupation                  Engineer
Dedicated-to.athena         0.2
Dedicated-to.zeus           0.2
Dedicated-to.athena         0.2
Located-in.ancient-agora    0.8
Located-in.acropolis        0.1
Located-in.keramikos        0.3
Constructed-by.phedias      0.9
Constructed-by.kallikrates  0.1


model (those below the occupation) will be matched

against the features of the suggested exhibit, and the fields

that provide a close match will be reported as an

explanation.
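The matching step can be sketched as follows. The match threshold and the data shapes are illustrative assumptions; the paper states only that closely matching fields are reported as the explanation.

```python
def transparent_explanation(user_features, exhibit_features, threshold=0.5):
    """Report user-model features that closely match the suggested exhibit.

    `user_features` maps feature names (e.g. 'located-in.ancient-agora') to
    preference scores in [0, 1]; `exhibit_features` is the set of features
    the exhibit actually has.  The threshold is an assumption.
    """
    matched = [(f, score) for f, score in user_features.items()
               if f in exhibit_features and score >= threshold]
    # Report the strongest matches first.
    matched.sort(key=lambda pair: pair[1], reverse=True)
    return matched


user = {"located-in.ancient-agora": 0.8,
        "constructed-by.phedias": 0.9,
        "located-in.acropolis": 0.1}
exhibit = {"located-in.ancient-agora", "constructed-by.phedias"}
reasons = transparent_explanation(user, exhibit)
```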

In case of a follow-up question, with the user requesting further explanation of how the values of the user-model fields (features) were acquired in the first place, the relevant exhibits that the user has visited can be cited to back up the current form of his model. This type of explanation is based on the work reported in [20].

6.1.3 Explanations and natural language generation

As mentioned, the text that is communicated to the user is generated by the NLG engine and is based on resources such as the domain ontology and the microplans (templates used by the NLG engine). Apart from the exhibits that are described by the domain ontology, the types of explanation that are offered are also included in the domain ontology. The opaque and the transparent types of explanation are subtypes of the explanation type and are represented in OWL as follows:

<owl:Class rdf:ID="Transparent">
  <rdfs:subClassOf rdf:resource="#Explanations"/>
</owl:Class>
<owl:Class rdf:ID="Opaque">
  <rdfs:subClassOf rdf:resource="#Explanations"/>
</owl:Class>

The opaque type has two instances, namely the past-

credibility and the neighbour style, whereas the transparent

type has the feature-based instance.

<a:owl:Opaque rdf:ID="past-credibility">
  <a:owl:number xml:lang="EN">singular</a:owl:number>
</a:owl:Opaque>
<a:owl:Opaque rdf:ID="neighbour-style">
  <a:owl:number xml:lang="EN">singular</a:owl:number>
</a:owl:Opaque>
<a:owl:Transparent rdf:ID="feature-based">
  <a:owl:number xml:lang="EN">singular</a:owl:number>
</a:owl:Transparent>

The following are examples of two fields of the opaque

data type, namely the number of similar neighbours and the

percentage of past successes.

<owl:DatatypeProperty rdf:ID="nSimilarNeighbors"/>
<owl:DatatypeProperty rdf:ID="perPastSuccess"/>

The following is a microplan that is employed by the NLG module for generating text for the field perPastSuccess. A microplan is a succession of slots. In this case, the name of the microplan is p1, and the microplan is made of the following slots: the pronoun I, the verb have been, the string correct, the field value (a number) and the string of the time.

<owlnl:MicroplanName>p1</owlnl:MicroplanName>
<owlnl:Pron>
  <owlnl:Val xml:lang="en">I</owlnl:Val>
</owlnl:Pron>
<owlnl:Verb>
  <owlnl:voice>active</owlnl:voice>
  <owlnl:tense>present perfect</owlnl:tense>
  <owlnl:Val xml:lang="en">have been</owlnl:Val>
</owlnl:Verb>
<owlnl:string>
  <owlnl:Val xml:lang="en">correct</owlnl:Val>
</owlnl:string>
<owlnl:Filler>
  <owlnl:case>nominative</owlnl:case>
</owlnl:Filler>
<owlnl:string>
  <owlnl:Val xml:lang="en">of the time</owlnl:Val>
</owlnl:string>
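To make the slot mechanism concrete, the following sketch shows how such a slot sequence could be realised as text. The function and the '<value>' placeholder convention are illustrative assumptions, not part of NaturalOWL's actual template machinery.

```python
def realise_microplan(slots, field_value):
    """Fill a slot-sequence microplan with a field value.

    A '<value>' slot is replaced by the value of the field being expressed
    (here perPastSuccess); all other slots are emitted verbatim.  This
    mimics, but is not, the NaturalOWL template mechanism.
    """
    words = [str(field_value) if slot == "<value>" else slot for slot in slots]
    return " ".join(words)


# Microplan p1: pronoun, verb, string, field value, trailing string.
p1 = ["I", "have been", "correct", "<value>", "of the time"]
sentence = realise_microplan(p1, "85%")
```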

7 Implementation and evaluation

A part of the proposed framework has been implemented

with a view to evaluating it in real conditions. This section


describes the modules and resources that have been

implemented and integrated in an early test case.

7.1 Design of the dialogue manager

This subsection describes how the dialogue manager implements the content module that was mentioned in a previous subsection. TrindiKit [33] was chosen as the starting point of a model for dialogue management. TrindiKit is based on two very simple notions: the current state of the dialogue and a set of update rules, each comprised of a condition part and an action part. Once a rule whose condition matches the current state is found, the state is updated according to the action part of the rule. The original TrindiKit had some limitations, as it did not allow for dynamic rules in the sense described below. Thus, TrindiKit was altered in order to integrate a recommendation module.

The system tries to suggest items, which in this case are

museum exhibits or documentaries that are relevant to the

user’s desires, interests, etc. The suggestion depends on the

factors mentioned in Sect. 5.1 regarding the WHAT module. Namely, the suggestion can be ontology-based or provided by a recommender system. Thus, the update rule in TrindiKit should include in its action part an invocation of the recommendation engine. In that sense, the update rules are dynamic: the same rule yields different responses based on the user’s interaction history.

The following is an example of information state (IS) of

the dialogue, as well as of an update rule, following the

formal representation of TrindiKit. The IS is split into three parts. The first part is called Objects and refers to the

objects, or exhibits (in the current setting), that the user has

already seen. Position refers to the current position of the

visitor with reference to a specific object. The second part

is the dialogue history (DH) and refers to what the user

(usr) and the robot (sys) have uttered in the past. Finally,

the third part represents the last utterance (LU), which

came from the user or the system. The user in this example

has already visited objects o2, o4 and o16. He stands now

close to o16; the user liked o16, and the last one to talk was

the user. The above information is represented in accor-

dance with the TrindiKit formalism as follows:

IS
  Objects  [Visited: o2, o4, o16]
           [Position: o16]
  DH       [... likes(usr, o16)]
  LU       Speaker: usr

At this point, an update rule is applied that matches the

current information state. In particular, if the last speaker

was the user, and the meaning of his utterance was that he

liked the visited object, then the recommender module is

called to suggest an item that might be of interest to the

user. This is the action part of the rule, realised in the

context of TrindiKit with the push command, which will

insert in a stack the result of the recommendation.
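The rule just described can be sketched as follows. The dictionary-based information state and the `recommend` callable are illustrative simplifications and not TrindiKit's actual API; only the condition/action structure and the push of a recommendation come from the text.

```python
def likes_rule(info_state, recommend):
    """A dynamic update rule in the TrindiKit style.

    The condition part inspects the information state; the action part,
    here, invokes a recommender and pushes its result.  The dict-based
    state and the `recommend` callable are illustrative assumptions.
    """
    last = info_state["LU"]
    history = info_state["DH"]
    # Condition: the user spoke last and liked the object in front of them.
    if last["speaker"] == "usr" and ("likes", "usr", info_state["position"]) in history:
        # Action: push the recommender's suggestion onto the agenda stack.
        info_state["agenda"].append(recommend(info_state))


state = {
    "visited": ["o2", "o4", "o16"],
    "position": "o16",
    "DH": [("likes", "usr", "o16")],
    "LU": {"speaker": "usr"},
    "agenda": [],
}
# The recommender here is a stub; "o17" is a hypothetical exhibit.
likes_rule(state, recommend=lambda s: "suggest(o17)")
```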

7.2 Ontology authoring and NLG

The ontology is represented in OWL,7 the Semantic Web

standard language for specifying ontologies, using the

ELEON8 ontology authoring and enrichment tool [6].

ELEON is an open-source project developed at NCSR

‘‘Demokritos’’.

In particular, ELEON allows the authoring of OWL

ontologies, and the enrichment of ontologies with linguistic

resources, such as a lexicon and microplans. Microplans

provide instruction to an NLG module about the production

of text. Microplans are templates, which are simpler than

grammar rules, but far easier to develop by non-experts. An

NLG engine does not form an integral part of ELEON.

ELEON exports the authored ontology in OWL format, and

the linguistic resources in the RDF format. Moreover,

ELEON is able to support linguistic resources for English,

Italian and Greek. Finally, ELEON can be connected to an inference engine for consistency checking of the authored ontology.

The NLG engine is NaturalOWL [10], which generates text based on an enriched ontology. The ontology was authored based on information provided by the archaeologists of the FHW. NaturalOWL is heavily based on ideas from ILEX [22] and M-PIRO [12]. Unlike

its predecessors, NaturalOWL is simpler (e.g., it is

entirely template-based), and it provides native support for

OWL ontologies. NaturalOWL adopts the typical pipeline

architecture of NLG systems [28]. It produces texts in

three sequential stages: document planning, microplanning

7 http://www.w3.org/TR/owl-features/
8 ELEON is available at http://www.iit.demokritos.gr/~eleon/


and surface realisation. In document planning, the system

first selects the logical facts (OWL triples), which will be

conveyed to the user and specifies the document structure.

In microplanning, it constructs abstract forms of sen-

tences, then it aggregates them into more complex peri-

ods, and finally selects appropriate referring expressions.

In surface realisation, the abstract forms of sentences are

transformed into real text, and appropriate syntactic and

semantic annotations can be added, for example to help

the TTS produce more natural prosody. The system is also

able to compare the described entity to other entities of

the same collection (e.g., unlike all the vessels that you

saw, which were decorated with the black-figure tech-

nique, this amphora was decorated with the red-figure

technique).

7.3 PServer and user models

This section provides an extensive description of PServer

because its functionality is crucial to the implementation of

the framework. It stores and handles user models, and it

supports user stereotypes, as well as groupings (commu-

nities) of users and exhibits. The recommendation engine,

as well as the explanation engine, are implemented as

clients to the PServer. In particular, the recommendation

engine relies on the existence of user and exhibit com-

munities. Also, the user stereotypes are essential to the

NLG engine, because the generated text is user stereotype

dependent. The PServer is a tool under development at

NCSR ‘‘Demokritos’’ [24].

In the context of PServer, it is necessary to define and

distinguish between attributes and features. Attributes are

independent of the current application domain (i.e.,

museum), and they primarily refer to user characteristics

such as: age, gender, occupation, level of expertise,

address, etc. Naturally, attribute values do not change in

the course of user interactions. Features are application-dependent and consequently refer to characteristics of the exhibits of the current art collection. Thus, they may refer to the archaeological period, the architect, the location, the type of the exhibit, etc. The value of each feature reflects the user’s preference for this feature, or how frequently the user has visited exhibits bearing this feature.

In particular, the PServer distinguishes between the

following entities: individual user models, user stereo-

types and user or feature communities. All of these entities

aim to capture and represent users, and to group users and features with a view to detecting similar users, with the ultimate purpose of offering recommendations. Let A be the set of all user attributes, and F the set of all features.

7.3.1 Individual user models

Each user is represented by an individual user model that is

comprised of attributes and features (see Table 1).

7.3.2 Stereotypes

User modelling via stereotypes was first introduced in [29]. In the context of PServer, a stereotype SC is defined over a subset of attributes SA, i.e., SC(SA), where SA ⊆ A (see Table 2). Stereotypes can be hand-crafted by experts and represent some plausible assumptions about users. Stereotypes are very useful for new users, i.e., users with little or no interaction history, for whom no exhibit suggestions can be made from past behaviour or similarity to other users. Stereotypes essentially link users with characteristics of exhibits. Thus, the first stereotype in Table 2 might be associated with shorter descriptions of exhibits, and the second might be associated with the omission of exhibit descriptions that are too evident for experts.

7.3.3 Communities

Two types of communities are supported, user and feature

types. Both types represent groupings of users or features

based exclusively on feature values. A user community

UC, in particular, represents a group of users who have similar values for certain features: UC = (f_k(1), f_k(2), …, f_k(n)). Thus, users who belong to the same community are deemed to be similar.

Table 2 A stereotype of child non-expert users (visitors) and a stereotype of expert users

Age        Child
Expertise  Non-expert

Age        Adult
Expertise  Expert

Table 3 Two feature communities: the first expresses the fact that users interested in the Acropolis are also interested in the Ancient Agora; the second expresses a common interest in the architects ‘‘Phedias’’ and ‘‘Kallikrates’’

Located-in.acropolis
Located-in.ancient-agora

Constructed-by.phedias
Constructed-by.kallikrates


On the other hand, a feature community FC represents a collective of features that have similar feature values: FC = (f_j(1), f_j(2), …, f_j(m)). Thus, features belonging to the same community are deemed to be similar and can be suggested in the sense of: ‘‘users who selected this item have also expressed interest in these items’’. Communities of both types are built with the cluster mining algorithm [25]. Since user communities group similar users, they can be used for listing similar users when explaining the suggestions (see Sect. 6.1). Feature communities are most useful for suggesting new items to users, once a user has expressed interest in an item of the community.
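As a rough illustration of how feature communities could be derived from usage data, the following sketch links features by Jaccard co-occurrence across user profiles and takes connected components. This is a simplified stand-in for the cluster mining algorithm of [25], whose details differ; the threshold and profile format are assumptions.

```python
from itertools import combinations

def feature_communities(profiles, threshold=0.5):
    """Group features that co-occur strongly across user profiles.

    `profiles` is a list of per-user feature sets.  Features are linked
    when their Jaccard co-occurrence exceeds `threshold`; communities are
    the connected components of that graph.
    """
    features = sorted(set().union(*profiles))
    users_of = {f: {i for i, p in enumerate(profiles) if f in p}
                for f in features}
    parent = {f: f for f in features}

    def find(f):
        # Follow parent links up to the representative of f's component.
        while parent[f] != f:
            f = parent[f]
        return f

    for a, b in combinations(features, 2):
        inter = len(users_of[a] & users_of[b])
        union = len(users_of[a] | users_of[b])
        if union and inter / union > threshold:
            parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for f in features:
        groups.setdefault(find(f), set()).add(f)
    return sorted(map(sorted, groups.values()))


# Toy profiles mirroring Table 3: two users visit Acropolis and Ancient
# Agora exhibits; a third favours Phedias and Kallikrates.
profiles = [
    {"located-in.acropolis", "located-in.ancient-agora"},
    {"located-in.acropolis", "located-in.ancient-agora"},
    {"constructed-by.phedias", "constructed-by.kallikrates"},
]
print(feature_communities(profiles))
```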

7.4 Other modules

The rest of the modules of the platform can be split into

user interface modules and platform-specific modules.

7.4.1 User interface modules

A TTS and an ASR system from Acapela9 have been used.

In particular, the ASR module, apart from the voice rec-

ognition, also includes natural language understanding for

both English and Greek. Moreover, the TTS is able to

produce speech in English and in Greek.

The emotion generation and expression module can be

implemented according to the work in [9] and in [14]. In

the future, it is planned to influence also the intonation of

the text to speech system.

7.4.2 Platform-specific modules

The communication server that transfers messages between

the different modules has been provided by the Computational Vision and Robotics Laboratory of the Institute of Computer Science (ICS) of the Foundation for Research and Technology–Hellas (FORTH).10 The same group has provided the hardware for the robot platform as well as the navigation module. In particular, the robot is the model MP-470 of GPS GmbH Neobotix11 and is depicted in Fig. 5.

Moreover, the mapping of the premises of the FHW has

been performed automatically by the robot according to the

work described in [32]. The gesture recognition module is

able to comprehend three gestures (yes, no and quit) and is

described extensively in [4].

Finally, the cognitive structure of the dialogue manager

has been developed by the authors for the purpose of the

evaluation.

7.5 Resources

The resources of the present implementation include the

domain ontology, the user stereotypes and the application

databases.

7.5.1 Domain ontology

The domain ontology describes the area to be visited

(buildings, rooms and programs). Thus, there are exhibit

classes (such as sculptures and paintings), subclasses

(such as sculptures from a certain archaeological site) and

instances (such as the statue of Zeus). For each instance,

there is additional information in the form of fields that describe the exhibit, state its author, the historical period to which it belongs, etc. The ontology can be updated by

adding visiting areas, new exhibits or by updating

information on already existing areas and exhibits. The

use of the ontology enables the DSM to describe the

newly added or updated objects without further

configuration.

Authors using the ELEON can record lexical informa-

tion in the form of nouns and verbs that form the domain-

specific dictionary. This is required by the natural language

generation process. In addition, authors can specify the

degree of appropriateness of each noun for each user ste-

reotype. For instance, some nouns would be marked as

appropriate for adults only, whereas others, simpler words

would be a substitute for children.

Authors, can add a new noun by providing an identity

name. Then, the author has to specify the forms that the

noun assumes in various languages (English, Italian and

Greek), which depend on the idiosyncrasy of each lan-

guage. For instance, the singular and plural form across

Fig. 5 Robot at the Foundation of the Hellenic World

9 http://www.acapela-group.com/index.asp
10 http://www.ics.forth.gr/~xmpalt/research/orca/
11 http://www.neobotix.de/en/


cases can be specified, in addition to the gender and

whether it is countable or uncountable.

7.5.2 User stereotypes

User stereotypes such as adult, child and expert can be defined with the aid of PServer (described in Sect. 7.3). User

stereotypes are quite useful since they permit extensive

personalisation of the information that users receive [3,

16]. Thus, user stereotypes determine the user’s interest in

the ontology entities (e.g., some facts about painting

techniques may be too elementary for experts and thus they

can be omitted). Also, the number of times a fact has to be

repeated before the system can assume that a user has

assimilated it depends on whether the user is an expert or a

layman (e.g., how many times the robot has to repeat a

description of the temple of Hephestos). In addition, user

stereotypes specify the appropriateness of certain linguistic

elements (e.g., it might employ simpler terms for children

than for adults), as well as parameters that control the

maximum desired length of exhibit description. Finally,

different speech synthesiser intonations can be chosen for

different user stereotypes.

7.5.3 Application databases

There is also a Canned Text Database, which contains

fixed text that will be spoken at the commencement, at the

end, or at an intermediate stage of the visitor’s interaction

with the dialogue system. Canned text also contains some

string variables that are instantiated during the dialogue

session. Moreover, there is a Domain-Specific Database

that in effect contains instances of the ontology, for

example, particular buildings, programs and rooms. This

information is extracted from the ontology that the NLG

module uses [10]. Finally, the robot personality is included

in the Domain-Specific Database and is represented

according to the OCEAN [8] model.

7.6 Evaluation

The first phase of the evaluation involved a usability study

of the robot guide at the premises of the Foundation of the Hellenic World (FHW). According to the experimental protocol, the robot was left wandering in the foyer of the FHW, with a view to attracting the attention of the visitors. Then,

upon the request of a visitor the experiment supervisor

would briefly explain the purpose of this robot, as well as

its capabilities and would let the visitor interact for about

15 minutes. After that the experiment supervisor filled in a

predefined questionnaire based on the visitor’s answers.

In this phase of the experiment, 76 visitors were

involved, 37 men and 39 women; 34 of them were up to

15 years old. The point was to evaluate the robot guide as a

platform, but also the individual modules, such as the voice

recognition module, the NLG engine, the TTS module, and

the dialogue system from a usability perspective.

According to the results, the robot platform urges the

visitors to visit the exhibits of the FHW. Another finding is

that visitors would like to find similar robots in other

museums. Considering the effect of different platform

modules on the robot–human interaction, it was discovered

that speech recognition had the most profound effect, fol-

lowed by the form of the robot itself and by the TTS

module. Considering the user-friendliness of the different modes of communication, speech recognition comes first, followed by the use of the robot’s touch screen (as was expected).

The produced text was deemed satisfactory. However, it should be noted that the users were unaware that the text was produced dynamically, based on an ontology, microplans and user stereotypes. The

produced text can be improved by extending the ontology

and the microplans.

Finally, the views about the sophistication of the dia-

logue system seem to be split between satisfactory and

mediocre.

8 Conclusions and future directions

This article aimed to introduce a framework for the dia-

logue system of intelligent robots. Such robots interact with

humans in a quite natural way as they describe or suggest

exhibits.

The framework proposes a new way of designing dia-

logue managers. In particular, it introduces a cognitive

architecture into dialogue management. The architecture is

based on a reactive, a deliberative and a reflective module.

The reactive module handles the simplest human–robot

interactions. The deliberative module is able to suggest

concepts of interest to the user. A domain ontology allows

the deliberative module to suggest items that are semanti-

cally related to what the user has just seen. In addition,

user’s past interaction as well as similar users’ interaction

can be used to suggest popular items. Finally, the delib-

erative module employs an affective module for emotion

recognition and emotion expression. At the topmost level is

the reflective module that acts as the management mech-

anism of the cognitive architecture. Necessary resources

for such a system include, a domain ontology, a robot

personality and the user types.

A full evaluation of the proposed framework, which is

based on a cognitive architecture, is a complex issue, which

will be the subject of future research. However, the eval-

uation process has started, by examining the effect of


certain modules as well as that of the robot as a whole on

the perceived usability. The evaluation was based on users’

perception of the interaction with the robot, and it was

captured through answers to a questionnaire.

Certain modules are still to be evaluated; in particular,

the recommendation engine and the explanation engine,

which form part of the PServer, have not been evaluated

yet. Another criterion that can be tested is the robustness

of the system. How can it respond to unforeseen situa-

tions; for instance how does the system behave if it

does not understand the user’s utterance? Or what if

it does not have enough knowledge to respond to a

user’s request? These situations must be handled when

striving for a system that operates in an open-ended

environment.

Acknowledgments This work was partially supported by the

research programmes XENIOS (Information Society, 3.3, Greek

national project) and INDIGO (FP6, IST-045388, EU project).

References

1. Adomavicius, G., Tuzhilin, A.: Toward the next generation of

recommender systems: a survey of the state-of-the-art and pos-

sible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749

(2005)

2. Andre, E., Dybkjær, L., Minker, W., Heisterkamp, P. (eds):

Affective Dialogue Systems. Springer, Berlin (2004)

3. Androutsopoulos, I., Oberlander, J., Karkaletsis, V.: Source

authoring for multilingual generation of personalised object

descriptions. Nat. Lang. Eng. 13(3), 191–233 (2007)

4. Baltzakis, H., Argyros, A., Lourakis, M., Trahanias, P.: Tracking

of human hands and faces through probabilistic fusion of multiple

visual cues. In: Proceedings of International Conference on

Computer Vision Systems (ICVS) (2008)

5. Bennewitz, M., Faber, F., Schreiber, M., Behnke, S.: Towards a

humanoid museum guide robot that interacts with multiple per-

sons. In: Proceedings of the 5th IEEE-RAS International Con-

ference on Humanoid Robots (2005)

6. Bilidas, D., Theologou, M., Karkaletsis, V.: Enriching OWL

ontologies with linguistic and user-related annotations: the ELEON

system. In: Proceeding of International Conference on Tools with

Artificial Intelligence (ICTAI). IEEE Computer Society Press

(2007)

7. Chiu, C.: The Bryn Mawr tour guide robot. PhD thesis, Bryn

Mawr College (2004)

8. Costa, P.T., McCrae, R.R.: Normal personality assessment in

clinical practice: The NEO personality inventory. Psychol. Assess.,

4(1), 5–13 (1992)

9. Egges, A., Zhang, X., Kshirsagar, S., Thalmann, N.M.: Emotional

communication with virtual humans. In: Proceedings of the 9th

International Conference on Multimedia Modelling (2003)

10. Galanis, D., Androutsopoulos, I.: Generating multilingual

descriptions from linguistically annotated OWL ontologies: the

NATURALOWL system. In: Proceedings of the 11th European

Workshop on Natural Language Generation, Schloss Dagstuhl,

Germany (2007)

11. Herlocker, J.L., Konstan, J.A., Riedl, J.: Explaining collaborative

filtering recommendations. In: CSCW: Proceedings of the ACM

Conference on Computer Supported Cooperative Work,

pp. 241–250. ACM, New York (2000)

12. Isard, A., Oberlander, J., Matheson, C., Androutsopoulos, I.:

Special issue advances in natural language processing: speaking

the users’ languages. IEEE Intell. Syst. Mag. 1(18), 40–45 (2003)

13. Kelly, J.P., Bridge, D.: Enhancing the diversity of conversational

collaborative recommendations: a comparison. Artif. Intell. Rev.

25(1–2), 79–95 (2003)

14. Kshirsagar, S., Garchery, S., Sannier, G., Magnenat-Thalmann,

N.: Synthetic faces: analysis and applications. Int. J. Imaging

Syst. Technol. 13(e1), 65–73 (2003)

15. Langley, P., Laird, J.E., Rogers, S.: Cognitive architectures:

research issues and challenges. Technical report, Computational

Learning Laboratory, Stanford University (2006)

16. McEleney, B., Hare, G.O.: Efficient dialogue using a probabilistic

nested user model. In: 19th International Joint Conference on

Artificial Intelligence, Edinburgh, Scotland (2005)

17. McTear, M.F.: Spoken Dialogue Technology. Towards the

Conversational User Interface. Springer, Berlin (2004)

18. Milward, D., Beveridge, M.: Ontology-based dialogue systems.

In: Proceedings of IJCAI Workshop on Knowledge and Rea-

soning in Practical Dialogue Systems (2003)

19. Minsky, M.: The Emotion Machine. Simon and Schuster, New

York City (2006)

20. Mooney, R.J., Roy, L.: Content-based book recommending using

learning for text categorization. In: DL: Proceedings of the 5th

ACM conference on Digital Libraries, pp. 195–204, ACM, New

York (2000)

21. Newell, A., Simon, H.A.: Computer science as empirical inquiry:

symbols and search. Commun. ACM 19(3), 113–126 (1976)

22. O’Donnell, M., Mellish, C., Oberlander, J., Knott, A.: ILEX: an

architecture for a dynamic hypertext generation system. Nat.

Lang. Eng. 7(3), 225–250 (2001)

23. Ortony, A., Clore, G., Collins, A.: The Cognitive Structure of

Emotions. Cambridge University Press, Cambridge (1988)

24. Paliouras, G., Mouzakidis, A., Ntoutsis, C., Alexopoulos, A.,

Skourlas, C.: PNS: personalized multi-source news delivery. In:

Proceedings of the 10th International Conference on Knowledge-

Based & Intelligent Information & Engineering Systems (KES),

UK (2006)

25. Paliouras, G., Papatheodorou, C., Karkaletsis, V., Spyropoulos,

C.D.: Clustering the users of large web sites into communities. In:

ICML : Proceedings of the 17th International Conference on

Machine Learning, pp. 719–726. Morgan Kaufmann, San Fran-

cisco (2000)

26. Picard, R.W.: Affective Computing. MIT Press, Cambridge

(1997)

27. Rafter, R., Smyth, B.: Conversational collaborative recommen-

dation—an experimental analysis. Artif. Intell. Rev. 24(3–4),

301–318 (2005)

28. Reiter, E., Dale, R.: Building Natural Language Generation

Systems. Cambridge University Press, Cambridge (2000)

29. Rich, E.: User modeling via stereotypes. Cogn. Sci. 3(4),

329–354 (1979)

30. Sloman, A.: Requirements for a fully-deliberative architecture (or

component of an architecture). http://www.cs.bham.ac.uk/

research/projects/cosy/papers/#dp0604, COSY-DP-0604 (HTML)

(2006)

31. Thrun, S., Bennewitz, M., Burgard, W., Cremers, A.B., Dellaert,

F., Fox, D., Haehnel, D., Rosenberg, C., Roy, N., Schulte, J.,

Schulz, D.: MINERVA: a second generation mobile tour-guide

robot. In: Proceedings of the IEEE International Conference on

Robotics and Automation (ICRA) (1999)

32. Trahanias, P., Burgard, W., Argyros, A., Hahnel, D., Baltzakis,

H., Pfaff, P., Stachniss, C.: TOURBOT and WebFAIR: web-

operated mobile robots for tele-presence in populated exhibitions.

IEEE Robot. Autom. Mag., Spec. Issue Robot. Autom. Eur.: Proj.

Funded Comm. Eur. Union 12(2), 77–89 (2005)


33. Traum, D., Larsson, S.: The information state approach to dia-

logue management. In: Juppenvelt, J., Smith, R. (eds) Current and

New Directions in Discourse and Dialogue. Kluwer, The Neth-

erlands (2003)

34. Vernon, D., Metta, G., Sandini, G.: A survey of artificial cogni-

tive systems: implications for the autonomous development of

mental capabilities in computational agents. IEEE Trans. Evol.

Comput., Spec. Issue Auton. Ment. Dev. 11(2), 151–180 (2007)
