
A Cognitive System for Detecting Emotions in Literary Texts and Transposing them into Drawings

Daniel Dichiu, Ana Lucia Pai�, Sunita Andreea Moga, Catalin Buiu Laboratory of Natural Computing and Robotics

Politehnica University of Bucharest Bucharest, Romania

[email protected]

Abstract-Emotions are key factors in the cognitive and social development of human beings. Robot artists are emotional robots that express themselves using various forms of art. Their goal must be to evoke an aesthetic experience in the audience. Emotion recognition from text is an active field of research which has produced several detection methods in recent years. This paper presents the development of a cognitive system that extracts the dominant emotion from a literary text and transposes it into a graphical form using a drawing robot. This work comes in the general context of a project aimed at developing robot artists capable of expressing artistic emotions.

Keywords-cognitive system, natural language processing, emotion detection, drawing robot

I. INTRODUCTION

Emotions are key factors in social interactions because they connect people and improve creativity, health and other aspects of life. Reason and emotion can be seen as two complementary systems for making decisions; emotions are therefore powerful tools which, if properly used, can become a resource at least as important as the intellect. Emotions affect human thinking, perception and behavior, and therefore play an important role in decision making and learning; under stress they can even overcome reason.

According to Marvin Minsky, "the question is not whether intelligent machines can have any emotions, but whether machines can be intelligent without any emotions" [1]. The design and implementation of emotional robots is desirable so that they can be used in complex environments where interaction with people is crucial to the success of their tasks, such as hospitals, schools or entertainment venues. Presently there are several robots that can express emotions, such as pet robots like iCat (the robot cat from Philips), Aibo (the robot dog from Sony) and Paro (a seal robot developed by AIST), or humanoid robots such as Actroid Der (developed by Osaka University) and Kobian (a robot that uses its whole body to express different emotions). All these mark an evolution in understanding and expressing emotions in robotics.

Yet, in the current development stage, robots are only able to mimic emotions to a certain degree allowed by their level of embodiment. This manner of experiencing emotions is different from what humans understand and is closer to that of believable agents [2] that only provide the illusion of life by reacting to situations in an emotional manner.

978-1-4244-6588-0/10/$25.00 ©2010 IEEE

Emotion detection from text, speech or song represents an emerging field of research. Various methods, techniques and implementations have been proposed; for example, [3] describes a method to infer a dialogist's emotion by applying a Bayesian network to prosodic features of the dialogist's voice. Methods for emotion detection from text are usually divided into three main categories: keyword-based, learning-based, and hybrid approaches, as considered in [4], where a solution based on keyword extraction with semantic analysis and ontology design grounded in the appraisal theory of emotion is proposed. Some approaches involve a bimodal analysis of speech and text [5].

Drawing may be considered the oldest way to express one's emotions, and it is one of the favorite application areas for robot artists. From simple to complex, there have been many implementations of drawing robots, like the "automated drawing with vibrating cups" presented in [6], or Drawbots, a three-year project started in 2005 by an international group of researchers with the aim of building a robot that could exhibit creative behavior through drawing [6]. The research had two frameworks: one concentrating on making an agent with the potential for manifesting autonomous creative behavior, while the second concerned methodologies for recognizing such behavior. Initial experiments were carried out in simulation using a model of a Khepera robot, with the intention of developing drawing behaviors on real robots. The viability of their approach was suggested by the simulation experiments [6].

A research project at the natuRO Laboratory [7] at Politehnica University of Bucharest is concerned with developing cognitive architectures for emotional robot artists. One of the first results was a theoretical study on the number and nature of fundamental emotions that have to be expressed by a robot artist [8]. Based on those results, this paper presents a cognitive architecture for a system that is able to identify basic emotions from a literary text, and to instruct a drawing robot to draw a picture expressing the dominant emotion.

The structure of this paper is as follows. Section II presents the general architecture of the proposed cognitive system, while the next two sections detail the main components of the system: emotion detection module (Section III) and drawing module (Section IV), respectively. The paper ends with conclusions and some ideas for further development.


II. FROM TEXT TO DRAWINGS VIA EMOTION DETECTION: A GENERAL ARCHITECTURE

A general architecture of the proposed cognitive system, which recognizes emotion from texts and draws a corresponding picture, is given in Fig. 1. The two main modules are the text-to-emotion module and the emotion-to-drawing module; they are presented in the current and following sections of the paper. The text-to-emotion module extracts different emotions from a literary text using a classification algorithm based on a constructed graph that gives information on the words expressing emotion in the text, offering a score for each emotion and the connections between them.

The emotion-to-drawing module uses the resulting graph and other input parameters to physically represent the dominant emotion found in the text by sketching a human face that illustrates the given emotion through widely accepted facial expressions. This is accomplished using a computer-controlled robot as a drawing agent.
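The two-module pipeline can be sketched as a pair of function stubs. This is a minimal illustration only: the function names, the keyword lists and the drawing command are hypothetical placeholders, not the paper's actual interfaces (which are detailed in Sections III and IV).

```python
def text_to_emotion(text):
    """Score each basic emotion in the text (stub).

    Placeholder scoring: count naive keyword hits per emotion;
    the real module uses trained synonym graphs (Section III).
    """
    keywords = {"mirth": ["glee", "merry"], "sorrow": ["grief", "weep"]}
    return {emo: sum(text.lower().count(k) for k in kws)
            for emo, kws in keywords.items()}

def emotion_to_drawing(scores):
    """Pick the dominant emotion and return a drawing command (stub)."""
    dominant = max(scores, key=scores.get)
    return f"draw_face({dominant})"

print(emotion_to_drawing(text_to_emotion("They wept with grief.")))  # draw_face(sorrow)
```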

Considering the application domain of this project, a distinction should be made between moral emotions and artistic emotions. The project addresses the latter category and is based on the Indian theory of Rasa, which identifies nine artistic emotions as described in [8]: pleasure, mirth, anger, energy, fear, disgust, astonish, serenity, sorrow.

Until now, despite various attempts, little progress has been made in integrating multiple forms of artistic expression into a robot's behavior and transforming this into a coherent form of human-robot communication.

Figure 1. The general architecture (inputs: user input and the emotion graph)

The proposed modules are described in detail in the next sections together with use cases, results, conclusions and further development.

III. INTELLIGENT SYSTEM FOR EMOTION DETECTION IN TEXT

A. Natural language processing

In the context of the presented application, language (written or spoken) can be considered a communication channel which expresses emotions. A presentation of the way people express emotions in text-based communication can be found in [9]. The nine basic emotions identified in [8] are considered when analyzing text. Linguistics deals with two major areas of language: meaning (semantics and pragmatics) and structure and rules (morphology and syntax). Semantics is the part of linguistics that studies the meaning of words. Pragmatics studies the meaning of words in a given context (based on context, acquired knowledge, and intention). As in this paper the acquired knowledge (words and their meanings) can be considered the same for writer and reader, it all comes down to setting the right context. The algorithm presented in this paper uses a preset constant value for the context size.

WordNet [10], a lexical database of the English language in which open word classes (nouns, verbs, adjectives, adverbs) are grouped into sets of synonyms (synsets), has been used for the emotion detection algorithm. Between synsets there are relations of hypernymy, synonymy and antonymy. As of WordNet 3.0 there are 155,287 words and 117,659 synsets in the database. SentiWordNet, a database of words annotated with positive or negative sentiment, was not used because, in its current form, it does not add value to the proposed algorithm.

1) Document retrieval

The algorithm of the detection system is similar to a document retrieval system and thus must be able to handle both synonymy (i.e., two different words have the same or closely related meanings and can be used interchangeably in a text without changing its meaning) and polysemy (i.e., the same word has different meanings). One of the earliest methods that tackled both was Latent Semantic Indexing (LSI) [11]. The LSI algorithm consists of generating a term-document matrix, performing singular value decomposition (SVD) and using the singular values thus obtained to determine the concepts in each document. A very interesting property of this method is that it returns relevant results even if the exact keywords are not used. Another advantage is the use of a mathematical representation, which means the algorithm is not language dependent. However, a big disadvantage is the necessity of rebuilding the term-document matrix each time a new document is added and, moreover, of reapplying the SVD, which is computation-intensive.

A related method was proposed by Ceglowski et al. [12] in the form of Contextual Network Graphs (CNG). Unlike LSI, it uses graphs to encode information about terms and documents. Each document is linked by weighted edges to every term it contains and vice versa. The weight of an edge represents the number of appearances of the term in the document to which it is connected, normalized to the [0, 1] interval. An example of a CNG (Fig. 2) is generated from the documents in Table I and the associated Table II. The energy (weight) of a node is set to 1 for each term in the query and, at each iteration, this energy is propagated to connected nodes. The fact that the weight is between 0 and 1 ensures that the energy will eventually fall below a preset threshold. Every document that has an energy value above this threshold is returned as relevant to the query. This method doesn't require the computation of an SVD, but on the other hand it will not return relevant results if the exact keywords are not used.
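The CNG energy-propagation idea can be sketched on the toy corpus of Table I. This is an illustrative assumption-laden sketch, not Ceglowski et al.'s implementation: the edge weights, the pruning threshold and the hop limit are chosen here for demonstration.

```python
from collections import defaultdict

# Toy term-document graph built from Table I/II. Edge weights are the
# term's appearance count in each document, normalized per term so that
# weights lie in (0, 1] and propagated energy decays.
edges = [
    ("Today", 1, 0.5), ("Today", 3, 0.5),
    ("Sky", 1, 1/3), ("Sky", 2, 1/3), ("Sky", 6, 1/3),
    ("Clear", 1, 1/3), ("Clear", 5, 1/3), ("Clear", 6, 1/3),
    ("Concert", 4, 0.5), ("Concert", 6, 0.5),
    ("Shop", 3, 1.0),
]

def neighbors(node):
    for term, doc, w in edges:
        if node == term:
            yield doc, w
        elif node == doc:
            yield term, w

def cng_query(query_terms, threshold=0.05, max_hops=3):
    """Spread energy from the query terms through the graph and return
    the documents whose accumulated energy exceeds the threshold."""
    energy = defaultdict(float)
    frontier = [(t, 1.0, 0) for t in query_terms]  # each query term starts at energy 1
    while frontier:
        node, e, hops = frontier.pop()
        energy[node] += e
        if hops < max_hops:
            for nb, w in neighbors(node):
                if e * w > threshold:  # prune once the energy has decayed
                    frontier.append((nb, e * w, hops + 1))
    return sorted(d for d in energy
                  if isinstance(d, int) and energy[d] > threshold)

print(cng_query(["Sky", "Clear"]))  # documents relevant to "clear sky"
```

Documents 1 and 6, which contain both query terms, accumulate the most energy; documents reachable only through shared terms receive a small, decayed share.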

TABLE I. AN EXAMPLE OF A CORPUS

Node | Document
1 | The sky is clear today.
2 | His level of enthusiasm was sky high.
3 | Today the shop is closed.
4 | You can go to the concert if you tidy up your room.
5 | It is clear that he didn't finish his homework.
6 | The concert will take place only if the sky is clear.


TABLE II. TERM-DOCUMENT TABLE ASSOCIATED WITH TABLE I.

Node | Term | Total number of appearances
a | Today | 2
b | Sky | 3
c | Clear | 3
d | Concert | 2
e | Shop | 1

Figure 2. An example of a CNG generated from Table I (term nodes a-e connected to document nodes 1-6)

New methods of document retrieval that make use of the hyperlink structure of the Internet appeared in the late 1990s [12]. Hypertext Induced Topic Search (HITS) generates two graphs for each query: an authorities graph (an authority is a document with several inlinks) and a hubs graph (a hub is a document with several outlinks). The related algorithm is based on the assumption that good hubs point to good authorities and good authorities are pointed to by good hubs. Thus, HITS computes two scores for each document (an authority score and a hub score). The advantage of the method is that it works with relatively small matrices (given the number of documents on the Internet). The disadvantage is that it has to generate new graphs each time a new document enters the query.

Although it uses the same authorities-graph concept for its query input, PageRank, one of the base algorithms used by the Google search engine, computes the importance of each document offline (once every two weeks) and thus becomes query independent. At query time, PageRank only has to generate the graph and retrieve the importance scores calculated beforehand.

2) Semantic distances

In order to construct an algorithm for emotion detection, the problem of polysemy (or word sense disambiguation) needs to be handled. For this purpose there are measures of similarity (or relatedness) which calculate the distance between the senses of two words. A first attempt was made by Lesk [13], who measured the semantic distance between two words as the number of overlapping words in their definitions. Other semantic distances rely on the hypernymy structure embedded in WordNet, while some rely both on the hypernymy structure and on information from large corpora. The Leacock-Chodorow distance is used for nouns in WordNet and has the following formula:

D_lch(n1, n2) = -log( shortest_path(n1, n2) / (2 * D) )   (1)

where shortest_path(n1, n2) is the length of the shortest path between the two synsets and D is the depth of the hypernymy tree (in WordNet's case, D is 19).

The Resnik measure introduces the information content of concepts [14], which is calculated by:

IC(c) = -log( freq(c) / freq(root) )   (2)

where IC(c) is the information content of concept c, root is the root node of the used taxonomy (in the case of WordNet's nouns, "entity"), and freq(c) and freq(root) are the numbers of appearances in the corpora used of the concept and of the root, respectively. The Resnik distance is then calculated as:

D_res(c1, c2) = IC( lcs(c1, c2) )   (3)

where lcs(c1, c2) is the lowest common subsuming concept of c1 and c2.
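Formulas (1)-(3) can be implemented directly. The sketch below uses made-up frequencies and path lengths purely for illustration (the real system reads them from WordNet and the SemCor corpus via NLTK):

```python
import math

D = 19  # depth of the WordNet noun hypernymy tree

def lch_distance(shortest_path_len, depth=D):
    """Leacock-Chodorow measure, Eq. (1)/(5): -log(path / (2 * depth))."""
    return -math.log(shortest_path_len / (2.0 * depth))

def information_content(freq_concept, freq_root):
    """Eq. (2): IC(c) = -log(freq(c) / freq(root))."""
    return -math.log(freq_concept / freq_root)

def resnik_distance(freq_lcs, freq_root):
    """Eq. (3): the IC of the lowest common subsumer of the two concepts."""
    return information_content(freq_lcs, freq_root)

# Toy numbers (illustrative only, not taken from WordNet or SemCor):
print(round(lch_distance(2), 2))            # path length 2 -> 2.94
print(round(resnik_distance(10, 1000), 2))  # lcs seen 10 times in a 1000-count corpus -> 4.61
```

A shorter path (closer senses) yields a larger Leacock-Chodorow value, and a rarer common subsumer yields a larger Resnik value.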

B. Construction algorithm

The Python scripting language has been used, together with the Natural Language Toolkit (NLTK) [15] for natural language processing, NetworkX for graph construction and Matplotlib for graph visualization, which allows a rapid development-testing cycle. The algorithm for constructing a synonym graph is as follows:

• 1 - a word and a maximum number of nodes (threshold) are given;

• 2 - for each synonym in the given word's synset, a node connected to the given word is added to the graph;

• 3 - for each new synonym added to the graph, add a new word from that synonym's synset;

• 4 - repeat step 3 until the given threshold is reached.
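The four steps above amount to a breadth-first expansion of synonyms, capped at a node budget. The sketch below is self-contained: `SYNSETS` is a small hand-made stand-in for WordNet synsets (the real system reads them via NLTK), and the synonym lists are hypothetical.

```python
from collections import deque

SYNSETS = {  # hypothetical synonym lists standing in for WordNet synsets
    "mirth": ["glee", "merriment", "hilarity"],
    "glee": ["gleefulness", "joy"],
    "merriment": ["fun", "gaiety"],
    "hilarity": ["gaiety"],
    "joy": ["delight"],
    "fun": ["playfulness"],
    "gaiety": ["playfulness"],
}

def build_synonym_graph(word, max_nodes=10):
    """Steps 1-4: breadth-first expansion of synonyms up to max_nodes."""
    graph = {word: set()}          # step 1: start from the given word
    queue = deque([word])
    while queue and len(graph) < max_nodes:
        current = queue.popleft()
        for syn in SYNSETS.get(current, []):  # steps 2-3: add synonyms of synonyms
            if len(graph) >= max_nodes:       # step 4: stop at the threshold
                break
            graph.setdefault(syn, set())
            graph[current].add(syn)
            graph[syn].add(current)
            queue.append(syn)
    return graph

g = build_synonym_graph("mirth", max_nodes=8)
print(sorted(g))
```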

A variation of this algorithm (in which words from the synset definition were added to the graph) was experimented with, but initial results showed that it was not reliable.

1) Initial Graph

Graphs were constructed for the nine basic emotions ("pleasure", "mirth", "anger", "energy", "fear", "disgust", "astonish", "serenity", "sorrow") [9], with a maximum of 200 nodes each (except for the "sorrow" graph, which has only 29 nodes). Fig. 3 presents an intermediate synonym graph, after passing through step 3 of the above algorithm for the first time. Due to the construction algorithm, the graphs are connected, with diameters varying between 5 and 8 edges. Fig. 4 shows an overview of an initial synonym graph; it can be seen that there are clusters of words centered on several concepts. The presence of these clusters means that the concept at the center of each cluster has multiple meanings, most likely not all of them related to the initial word (due to these clusters, we can end up with "sad" in the synonym graph of "happy").


Figure 3. Synonym graph for "mirth" after the first pass through step 3

Figure 4. Initial graph for 9 emotions

An ideal graph would not have such clusters, and the developed training algorithm tries to eliminate them. From the initial experiments on the Gutenberg corpus, which comes with NLTK, it was concluded that the further a node is from the initial word, the less relevant it is. An acceptable average would be a graph with a maximum diameter of five edges and a maximum node degree of three.

2) Trained Graph

To eliminate irrelevant nodes, a distance based on the shortest path between two synsets was tried first:

D(c1, c2) = 1 / (1 + shortest_path(c1, c2))   (4)

where shortest_path(c1, c2) is the shortest path length between the two concepts in the WordNet structure. For the concepts "pleasure" and "emotion", the distances are presented in Table III. As can be seen, for relatively similar concepts the distance varies considerably, given that it is confined to the [0, 1] interval. So the Leacock-Chodorow distance was tested:

D_lch(c1, c2) = -log( shortest_path(c1, c2) / (2 * D) )   (5)

where shortest_path(c1, c2) is the shortest path length between the two concepts in the WordNet structure and D is the depth of the WordNet taxonomy (19 for nouns). The results for the same concepts are shown in Table IV.

TABLE III. THE DISTANCE BETWEEN THE SYNSETS OF "PLEASURE" AND "EMOTION".

Concept 1 | Concept 2 | Distance between concepts
pleasure.n.01 (a fundamental feeling that is hard to define but that people desire to experience) | emotion.n.01 (any strong feeling) | 0.33
joy.n.02 (something or someone that provides a source of happiness) | emotion.n.01 (any strong feeling) | 0.09
pleasure.n.03 (a formal expression) | emotion.n.01 (any strong feeling) | 0.08
pleasure.n.04 (an activity that affords enjoyment) | emotion.n.01 (any strong feeling) | 0.1

The fact that the Leacock-Chodorow distance (D_lch) takes the depth of the graph structure into account can be seen in the smaller variation of the distance compared to the shortest-path case. The distance D_lch between the antonyms "pleasure" and "pain", and then between two different emotions ("pleasure" and "anger"), was calculated, with the results shown in Table V and Table VI.

The calculated distances are in the same range as those between "pleasure" and "emotion", so the Leacock-Chodorow distance cannot be used for removing irrelevant nodes from the synonym graph.

The Resnik distance was also calculated for the same synsets, using the information content of the SemCor corpus (Table VII). Because this distance requires a corpus from which to gather the information content, the algorithm was not able to calculate the distance between some concepts. As such, this distance is not recommended for removing irrelevant nodes from our synonym graph.

TABLE IV. LEACOCK-CHODOROW DISTANCE BETWEEN THE SYNSETS OF "PLEASURE" AND "EMOTION".

Concept 1 | Concept 2 | Leacock-Chodorow distance between concepts
pleasure.n.01 (a fundamental feeling that is hard to define but that people desire to experience) | emotion.n.01 (any strong feeling) | 2.53
joy.n.02 (something or someone that provides a source of happiness) | emotion.n.01 (any strong feeling) | 1.23
pleasure.n.03 (a formal expression) | emotion.n.01 (any strong feeling) | 1.15
pleasure.n.04 (an activity that affords enjoyment) | emotion.n.01 (any strong feeling) | 1.33


TABLE V. LEACOCK-CHODOROW DISTANCE BETWEEN THE ANTONYMS "PLEASURE" AND "PAIN".

Concept 1 | Concept 2 | Leacock-Chodorow distance between concepts
pleasure.n.01 | pain.n.01 | 1.24
pleasure.n.01 | pain.n.02 | 2.54
pleasure.n.01 | pain.n.03 | 1.15
pleasure.n.01 | pain.n.04 | 1.15

TABLE VI. LEACOCK-CHODOROW DISTANCE BETWEEN THE SYNSETS "PLEASURE" AND "ANGER".

Concept 1 | Concept 2 | Leacock-Chodorow distance between concepts
pleasure.n.01 | anger.n.01 | 2.25
pleasure.n.01 | anger.n.02 | 1.55
pleasure.n.01 | wrath.n.02 | 0.99

TABLE VII. RESNIK DISTANCE, WITH INFORMATION CONTENT FROM THE SEMCOR CORPUS, BETWEEN THE SYNSETS OF "PLEASURE" AND "EMOTION".

Concept 1 | Concept 2 | Resnik distance between concepts
pleasure.n.01 (a fundamental feeling that is hard to define but that people desire to experience) | emotion.n.01 (any strong feeling) | 4.63
joy.n.02 (something or someone that provides a source of happiness) | emotion.n.01 (any strong feeling) | 0.78
pleasure.n.03 (a formal expression) | emotion.n.01 (any strong feeling) | Error (one of the concepts was not found in the SemCor corpus)
pleasure.n.04 (an activity that affords enjoyment) | emotion.n.01 (any strong feeling) | Error (one of the concepts was not found in the SemCor corpus)

After experimenting with the graph and different documents in the Gutenberg corpus, it was observed that relevant words in the graph are at a maximum of three edges from the initial word. Thus, the training algorithm must eliminate words that are further away from the initial word, but not those that, even if far, are still relevant. For this latter part, the number of appearances of the words in the same context as the initial word was taken into consideration. In the given case, the size of the context is preset to a constant value of 1000 words (after eliminating stop words). The semantic distance for the training algorithm is calculated as:

D(c1, c2) = shortest_path(c1, c2) / (0.2 * D)   (6)

where D is the diameter of the graph.

The training algorithm is as follows:

• 1 - each node's weight is initialized with a value of 0;

• 2 - each node's weight is calculated as the number of appearances in the training document;

• 3 - for each node with a weight different from 0, the distance to the initial node (word) is calculated with (6);

• 4 - the general weight of each node is calculated as its weight times its distance from the initial node;

• 5 - the retention threshold is calculated as the weighted average of all general weights;

• 6 - each node with a general weight above the retention threshold is kept in the graph; all other nodes are eliminated;

• 7 - if eliminating a node leaves the graph disconnected, the component that contains the initial word is kept.
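The seven training steps can be sketched as follows. This is an interpretive sketch, not the authors' code: the toy graph and word counts are hypothetical, the "weighted average" of step 5 is read here as sum(w*w)/sum(w), nodes at or above the threshold are kept, and the connectivity repair of step 7 is reduced to always keeping the initial word.

```python
def shortest_path_len(graph, start, goal):
    """Breadth-first shortest path length between two nodes."""
    frontier, seen, dist = [start], {start}, 0
    while frontier:
        if goal in frontier:
            return dist
        nxt = [n for f in frontier for n in graph[f] if n not in seen]
        seen.update(nxt)
        frontier, dist = nxt, dist + 1
    return float("inf")

def train_graph(graph, initial_word, counts):
    """counts[node] = appearances in the training document (steps 1-2)."""
    diameter = 5  # assumed graph diameter
    general = {}
    for node, freq in counts.items():
        if freq == 0 or node not in graph:
            continue
        # step 3: distance via Eq. (6); step 4: general weight = freq * distance
        d = shortest_path_len(graph, node, initial_word) / (0.2 * diameter)
        general[node] = freq * d
    # step 5: weighted average of the general weights (assumed: sum(w*w)/sum(w))
    threshold = sum(w * w for w in general.values()) / sum(general.values())
    kept = {n for n, w in general.items() if w >= threshold}  # step 6
    kept.add(initial_word)  # step 7 (simplified): keep the initial word's component
    return kept

graph = {"mirth": {"glee", "fun"}, "glee": {"mirth", "joy"},
         "fun": {"mirth"}, "joy": {"glee"}}
counts = {"mirth": 4, "glee": 2, "joy": 1, "fun": 0}
print(sorted(train_graph(graph, "mirth", counts)))
```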

The training algorithm was applied on eight documents from the Gutenberg corpus and the obtained results are presented in Table VIII.

3) Testing algorithm

The testing of the graphs was done on eight documents of the Gutenberg corpus (different from the training documents). To establish whether a given context (preset to a constant size of 1000 words after removing stop words) is valid, the trained graph was used, with the weight of each node set to the number of appearances of that node in the context. Then the weight of the context was calculated with the following formula:

weight_context = Σ_{c_c ∈ syn_graph} freq(c_c) / shortest_path(c_i, c_c)   (7)

where shortest_path(c_i, c_c) is the shortest path length between a node c_c and the initial node c_i, and freq(c_c) is the number of appearances of node c_c in the context. Contexts that scored a weight above a threshold of 9 were regarded as containing the emotion.
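Eq. (7) can be sketched directly. The distances and frequencies below are hypothetical, and one assumption is made explicit: the initial word itself (self-distance 0) is counted with distance 1, since the paper does not say how the zero self-distance is handled.

```python
def context_weight(freq_in_context, dist_to_initial):
    """Eq. (7): sum of freq(c) / shortest_path(c_i, c) over trained-graph nodes."""
    total = 0.0
    for node, freq in freq_in_context.items():
        if node not in dist_to_initial:
            continue  # word not in the trained graph
        # assumption: the initial word (distance 0) is counted at distance 1
        dist = max(dist_to_initial[node], 1)
        total += freq / dist
    return total

# Hypothetical trained graph for "mirth": distances to the initial word.
dist = {"mirth": 0, "glee": 1, "merriment": 1, "joy": 2}
freq = {"mirth": 3, "glee": 2, "joy": 4}  # counts in a 1000-word context
w = context_weight(freq, dist)
print(w)      # 3/1 + 2/1 + 4/2 = 7.0
print(w > 9)  # False: below the threshold of 9, the emotion is not detected
```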

4) Results

Above the given threshold of 9, some contexts were marked as containing one of the nine emotions. Remarkably, none of the contexts were marked as containing two or more opposing emotions. Nevertheless, positive emotions were found in the same context ("mirth" and "astonish", "astonish" and "serenity"). The proposed algorithms for generating, training and testing a synonym graph have satisfactory results. In generating an oversized graph, the goal was to incorporate as many synonyms of the initial word as possible, thereby detecting emotions expressed through related concepts. The purpose of the training algorithm is to remove the nodes in the initial graph that are not relevant for the given emotion. The testing algorithm's results showed that the trained graphs can correctly detect the presence of emotion in some contexts. However, emotions expressed in the text through metaphors, or merely implied, are not detected by the proposed algorithm. The thresholds and parameter values for the different algorithms were set based on empirical observations.


TABLE VIII. RESULTS OF THE TRAINING ALGORITHM.

Word | Retention threshold | Number of removed nodes
Pleasure | 9.57 | 59
Mirth | 3.40 | 75
Anger | 4.81 | 71
Energy | 5.76 | 62
Fear | 10.06 | 97
Disgust | 1.48 | 61
Astonish | 5.11 | 89
Serenity | 6.92 | 72
Sorrow | 16.49 | 8

IV. TRANSPOSING EMOTIONS FROM TEXT TO IMAGES USING A ROBOTIC DRAWING SYSTEM

The second part in the development of this project was focused on finding an efficient way of transposing emotions extracted from literary texts into robot drawings. The recognition of emotions in texts was discussed in the previous section; this section details the emotion-to-drawing module. An essential step in the transposing is the expression of emotions, discussed below. A comparison between the logically occurring stages in emotion modeling [16] and the emotion transposing design proposed in this paper is shown in Table IX and discussed below, not in terms of existing models, but of practical ways of accomplishing them.

TABLE IX. EMOTION MODELING VERSUS EMOTION TRANSPOSING.

Emotion Modeling Stages | Applicable in Transposing
1. Recognition | Yes
2. Generation | No
3. Expression | Yes
4. Effects on Behavior | Partially

The emotion recognition stage is necessary in both approaches, but the way in which it is accomplished varies from application to application. In the design presented in this paper, the recognition stage is accomplished through the text-to-emotion module. Emotion generation is not necessary in transposing, because no new emotional state is needed; the recognized emotion is only transformed from one form into another. The expression stage generally uses audiovisual means: face, gestures, posture, voice intonation, breathing, noise. In our transposing approach this is done through drawn facial expressions. When modeling emotions, the effects of particular emotions on the subject's behavior are an important matter. In the proposed system, the effects on behavior cannot be directly noticed, unless the expression stage is influenced by external factors like sound or movement.

A schematic representation of the emotion to drawing module can be seen in Fig. 5. The input is comprised of user parameters and an emotion graph generated by the text to emotion module. The emotion graph presents the dominant emotion of the text that will be expressed in the drawing and several related emotions used to model the intensity and dynamics of the dominant emotion [17].
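One possible in-memory form of such an emotion graph is sketched below. The node names and weights are illustrative assumptions, not the paper's actual data format; they only show how a dominant emotion and its weighted related emotions could be represented and queried.

```python
def dominant_emotion(graph):
    """Return the emotion node with the highest weight."""
    return max(graph, key=graph.get)

# Hypothetical output of the text-to-emotion module for a fearful text:
emotion_graph = {
    "fear": 0.82,          # dominant emotion of the text
    "astonishment": 0.41,  # related emotions used to model the
    "sorrow": 0.27,        # intensity and dynamics of the dominant one
    "serenity": 0.05,
}

print(dominant_emotion(emotion_graph))  # -> fear
```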


Figure 5. Emotion to drawing module

An e-Puck robot [18] has been used as the drawing agent. The e-Puck is a small mobile robot with two motorized wheels that can draw with a soft pen attached to it, marking its trajectory. Because of this limitation, it can only draw single-colored, continuous-line images. Faces were chosen as the physical representation of emotions because facial expressions of emotions can be easily recognized and understood by a human viewer. In addition, the interaction between a person and a robot is more natural than with other types of devices.

The program parameters that can be set by the user are the following: a scaling factor that determines the output size of the drawing, and the drawing rate, which can be either predetermined (slow, medium or fast) or adjusted dynamically during the drawing process using input taken from an audio file. The program also maps emotions to colors: it determines a color range corresponding to the emotion to be represented and requires that the right drawing color be mounted on the robot. The robot can also mark the end of the drawing with a sound pattern specific to each emotion.
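A minimal sketch of how these user parameters could be bundled for a drawing controller is given below. The rate values and the emotion-to-color assignments are illustrative assumptions; the paper only states that rates can be slow, medium or fast and that each emotion maps to a color range.

```python
DRAWING_RATES = {"slow": 0.5, "medium": 1.0, "fast": 2.0}  # relative speeds (assumed)

EMOTION_COLORS = {            # hypothetical emotion -> pen-color mapping
    "pleasure": "yellow",
    "serenity": "blue",
    "fear": "black",
}

def drawing_config(emotion, scale=1.0, rate="medium"):
    """Bundle the user parameters a drawing controller would need."""
    return {
        "scale": scale,                                  # output size of the drawing
        "rate": DRAWING_RATES[rate],                     # predetermined drawing rate
        "color": EMOTION_COLORS.get(emotion, "black"),   # pen the user must mount
    }

config = drawing_config("fear", scale=1.5, rate="slow")
```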

Another input parameter is the type of drawing: automatically generated images or pictures from a file. The first approach takes into consideration the two-dimensional arousal-valence representation of emotions [19]. The appearance of the face elements is automatically adjusted according to the information in the graph, by specific functions that draw the eyes, lips, eyebrows and face contour. A representation of the way these elements vary and their mapping on the valence-arousal space can be seen in [20]. In Fig. 6, an example of such a generated image can be seen (in this case, the emotion represented is fear). The face representation is reduced to straight lines and circular portions. The points marked in Fig. 6 represent the connections between these segments and also the order in which they will be drawn by the robot.
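A toy sketch of generating such an ordered point sequence, with the face reduced to straight lines and circular portions, is shown below. The specific way valence and arousal bend the mouth and widen the eyes here is an illustrative assumption, not the paper's actual mapping functions.

```python
import math

def arc_points(cx, cy, r, start, end, n=8):
    """Sample n points along a circular arc (angles in radians)."""
    step = (end - start) / (n - 1)
    return [(cx + r * math.cos(start + i * step),
             cy + r * math.sin(start + i * step)) for i in range(n)]

def face_trajectory(valence, arousal, scale=1.0):
    """Ordered points for the robot: face contour, eyes, then a mouth
    whose curvature follows valence (negative valence -> frown)."""
    pts = arc_points(0, 0, 10 * scale, 0, 2 * math.pi)        # face contour
    eye_w = (1 + arousal) * scale                             # wider eyes when aroused
    pts += [(-3 * scale - eye_w, 3 * scale),
            (-3 * scale + eye_w, 3 * scale)]                  # left eye (line segment)
    pts += [(3 * scale - eye_w, 3 * scale),
            (3 * scale + eye_w, 3 * scale)]                   # right eye (line segment)
    bend = valence * 2 * scale                                # negative valence lowers
    pts += [(-3 * scale, -4 * scale),
            (0.0, -4 * scale + bend),
            (3 * scale, -4 * scale)]                          # mouth as three points
    return pts

trajectory = face_trajectory(valence=-0.8, arousal=0.9)  # fear-like expression
```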

/" /

P� P� /" I P8

Pl1� P12

P13 ------------

P15

Figure 6. Example of generated image for fear



[Nine generated faces, one per artistic emotion: Pleasure, Mirth, Sorrow, Anger, Energy, Fear, Disgust, Astonishment, Serenity]

Figure 7. Generated images for the nine artistic emotions

The resulting drawing is a continuous-line representation marking the robot's trajectory. A view of the nine artistic emotions represented with the automatically generated images can be seen in Fig. 7.

The second drawing type considered is the image-from-file case, in which an image representing the facial expression of the emotion is either provided by the user (useful when the text already has illustrations or the user wants to add appropriate images to the text) or taken from a database of previous experiments. Before actually being drawn by the robot, the image undergoes several processes: it is first converted to a continuous line drawing so that the robot can represent it (a special algorithm is used [21]), then the resulting drawing is simplified and facial features are extracted. The robot's trajectory is then generated based on these results. The advantage of this second type of drawing is that it allows more complex representations and adds new features to the drawing (like hair, wrinkles, etc.), making the final output more realistic. A special problem in this case is the mapping of the initial images to emotions. The emotion can be indicated by the user, but a facial expression recognition function is considered for future implementation.
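Reference [21] casts continuous-line drawing as a Traveling Salesman Problem over the image's points. The greedy nearest-neighbour ordering below is a deliberately simplified stand-in for that approach, and the feature points are hypothetical rather than real edge-detection output.

```python
import math

def nearest_neighbour_path(points, start=0):
    """Order points greedily so each step moves to the closest unvisited one.
    A rough heuristic for the TSP formulation used in continuous-line drawing."""
    remaining = list(points)
    path = [remaining.pop(start)]
    while remaining:
        last = path[-1]
        nxt = min(remaining, key=lambda p: math.dist(last, p))
        remaining.remove(nxt)
        path.append(nxt)
    return path

# Hypothetical feature points extracted from a facial image:
features = [(0, 0), (5, 1), (1, 0), (4, 1), (2, 0)]
trajectory = nearest_neighbour_path(features)
print(trajectory)  # -> [(0, 0), (1, 0), (2, 0), (4, 1), (5, 1)]
```

An optimal TSP solver would give shorter, cleaner trajectories, but even this greedy ordering yields a single continuous path the robot can follow.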

In order to get information on the system state and the actual progress of the drawing, two monitoring functions are provided. The first concentrates on the output provided by the robot and its functionality, and the second gives feedback on the drawing process by highlighting the already covered portion of the image and showing the instantaneous values of dynamically modified parameters.

V. CONCLUSIONS AND FURTHER WORK

This paper presented an original practical approach for transposing emotions from text to drawings. Several usage scenarios can be imagined for this application, including aid in accelerated learning, text analysis, etc. The efficiency of the drawing method comes from the optimization of the robot's trajectory for single, continuous-line drawings, resulting in simplified yet realistic representations of the face and the basic facial expressions.

Furthermore, the presence of emotions in a literary text was detected by using an innovative natural language processing approach. This included building a synonym graph for a given emotion and purging irrelevant words from that graph through a machine learning algorithm. Different distances were tested for this algorithm, based either on the WordNet taxonomy alone or on both the WordNet taxonomy and a training corpus. The given solution accounted for the fact that more distant synonyms in the graph are less likely to be encountered in the same context as the initial word.
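The pruning step described above can be sketched as filtering graph nodes against a per-emotion retention threshold. The synonym words and distance scores below are illustrative assumptions; only the threshold of 10.06 for Fear is taken from the retention-threshold table earlier in the paper.

```python
def prune_graph(graph, threshold):
    """Split a {word: distance-to-seed} synonym graph into the words kept
    and the words purged as irrelevant (distance above the threshold)."""
    kept = {w for w, d in graph.items() if d <= threshold}
    return kept, set(graph) - kept

# Hypothetical distances from the seed word "fear" to its synonym-graph nodes:
fear_graph = {"fear": 0.0, "dread": 3.2, "terror": 4.1,
              "concern": 11.5, "reverence": 14.8}

# 10.06 is the retention threshold reported for Fear in the table above.
kept, purged = prune_graph(fear_graph, threshold=10.06)
```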

Further work will include evaluating and simulating emotion dynamics, making the robot able to learn from past simulations and to anticipate new emotions and their dynamics. Another issue to be considered is a multi-agent implementation of the whole idea, in which multiple robots contribute to realizing a more complex, larger-scale or multiple-colored representation of the targeted emotion.

Another direction for development would be to fuse multiple forms of art with the goal of evoking emotions in the human audience, for example drawing while dancing. This approach would require modifying the drawing routines so that the desired image could be completed by multiple robots while also executing simple dance movements.

As for the natural language processing research, the algorithms used for training and testing relied on contexts of fixed length. Better results should be achievable with contexts of variable length. In this respect, methods for determining the best possible context length for a given emotion will be explored, such as probabilistic topic modeling using Latent Dirichlet Allocation (LDA) [22].

ACKNOWLEDGEMENT

This work was supported by CNCSIS - UEFISCSU, project number PNII - IDEI 1692/2008.

REFERENCES

[1] M. L. Minsky, "The society of mind", New York, NY: Simon and Schuster, 1986

[2] J. Bates, "The role of emotion in believable agents", Communications of the ACM, vol. 37, no. 7, ISSN 0001-0782, pp. 122-125, 1994

[3] S. Kato, Y. Sugino, and H. Itoh, "A Bayesian Approach to Emotion Detection in Dialogist's Voice for Human Robot Interaction", Lecture Notes in Computer Science, vol. 4252, Springer Berlin / Heidelberg, ISSN 0302-9743 (Print) 1611-3349 (Online), 2006

[4] E. Chao-Chun Kao, C. C. Liu, Ting-Hao Yang, C. T. Hsieh, and V. W. Soo, "Towards Text-based Emotion Detection", 2009 International Conference on Information Management and Engineering, pp. 70-74, 2009

[5] I. N. Milat, H. Seridi, and M. Sellami, "Towards an Intelligent Emotional Detection in an E-Learning Environment", Lecture Notes in Computer Science, vol. 5091, Springer Berlin / Heidelberg, ISSN 0302-9743 (Print) 1611-3349 (Online), 2008

[6] P. Brown, B. Bigge, J. Bird, P. Husbands, M. Perris, M. Stokes, "The Drawbots", Proceedings of Mutamorphosis, Prague, pp. 1-7, 2007

[7] http://natural.ics.pub.ro


[8] C. Buiu, N. Popescu, "On the Aesthetic Emotions in Human-Robot Interaction. Implications on Interaction Design of Robotic Artists", accepted for publication, International Journal of Innovative Computing, Information and Control, 2010

[9] J. Hancock, C. Landrigan, C. Silver, "Expressing emotion in text-based communication", Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, San Jose, California, USA, pp. 929-932, 2007

[10] G. A. Miller, "WordNet - About Us", WordNet, Princeton University, 2009

[11] S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman, "Indexing by latent semantic analysis", Journal of the American Society for Information Science, pp. 391-407, 1990

[12] M. Ceglowski, A. Coburn, and J. Cuadrado, "Semantic Search of Unstructured Data using Contextual Network Graphs", 2004

[13] M. Lesk, "Automatic Sense Disambiguation Using Machine Readable Dictionaries: How to Tell a Pine Cone from an Ice Cream Cone", Proceedings of SIGDOC '86, 1986

[14] P. Resnik, "Using information content to evaluate semantic similarity in a taxonomy", Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, 1995

[15] S. Bird, E. Klein, and E. Loper, "Natural Language Processing with Python", O'Reilly Media, ISBN-13: 978-0596516499, 2009

[16] E. Hudlicka, "What are we modeling when we model emotions", AAAI Spring Symposium on "Emotion, Personality and Social Behavior", March 2008

[17] C. Becker, S. Kopp, I. Wachsmuth, "Simulating the Emotion Dynamics of a Multimodal Conversational Agent", in E. Andre et al. (Eds.): ADS 2004, LNAI 3068, Springer-Verlag Berlin Heidelberg, pp. 154-165, 2004

[18] www.e-puck.org

[19] http://www.ai.mit.edu/projects/sociable/facial-expression.html

[20] M. Y. Lim, R. Aylett, "A New Approach to Emotion Generation and Expression", School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, Scotland, 2008

[21] R. Bosch, A. Herman, "Continuous Line Drawings via the Traveling Salesman Problem", Dept. of Mathematics, Oberlin College, Oberlin, Ohio, 2003

[22] D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation", Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003
