
Multisensory Integrated Expressive Environments: Toward a Paradigm for Multimodal and Distributed Environments for the Performing Arts and New Media

Gualtiero Volpe

University of Genoa, Italy

Abstract

This article introduces Multisensory Integrated Expressive Environments (MIEEs) as a framework for distributed active mixed reality environments for the performing arts, and discusses the structure of MIEEs and their global properties. Extended Multimodal Environments (EMEs) are introduced as basic components and described in detail, mainly with reference to what they contain – real and virtual objects, and real and virtual subjects. EMEs are then connected together into a network of spaces enabling geographically distributed performances. The concept of "active EMEs" is finally employed to introduce MIEEs as a hierarchical structure of metaspaces conceived as virtual subjects collaborating in achieving the overall narrative or aesthetic goal of the performance. Some examples of EMEs and MIEEs are also discussed.

1. Introduction

A main objective of the research activity carried out at the DIST-InfoMus Laboratory is to make a scientific and technological contribution to the development of novel forms of artistic performance, where the performing action takes place in a number of interactive physical as well as virtual connected spaces. In this scenario, spectators usually become participants, since they are able to generate and modify content directly through interaction. A performance can be organized on several levels of abstraction, with multiple narrative lines interleaving and interactively developing across the connected spaces. Technology interacts with art at the level of the language art employs to convey content and provide the audience with an aesthetic experience. New media are exploited to provide users/participants with a rich experience and sense of presence.

This article discusses Multisensory Integrated Expressive Environments (MIEEs) as a framework for distributed active mixed reality environments for the performing arts. Communication in MIEEs mainly takes place through the non-verbal conveyance of expressive, emotional content. Expressive gestures (Camurri et al., 2004a) are thus addressed as first-class conveyors of such expressive information. Research on MIEEs involves scientific and technical aspects. From a scientific point of view, important issues are: the definition of paradigms and metaphors for modelling such environments; understanding of the communication process of expressive content taking place inside them; and the definition of interaction strategies through which participants influence the performance. A main problem is the management of complex and indirect interaction strategies while, at the same time, preserving understanding and effectiveness. From a technical point of view, issues related to the design and development of such environments have to be faced: from low-level aspects such as the design and implementation of suitable hardware (e.g., sensor systems) and software (e.g., software for real-time generation and processing of audio and visual content) components, to higher-level issues such as the definition of a design methodology.

Correspondence: InfoMus Laboratory – Laboratorio di Informatica Musicale, DIST, University of Genoa, Viale Causa 13, Genoa 16145, Italy. E-mail: [email protected]

Journal of New Music Research, 2005, Vol. 34, No. 1, pp. 23–37. DOI: 10.1080/09298210500123911. © 2005 Taylor & Francis Group Ltd.

A first step in working with MIEEs consists of defining a model for them, which has to take into account two main aspects. First, the structure of a MIEE: its basic components, how they are connected in the environment, and the properties of both the basic components and the whole environment. And second, the communication process: how information flows in a MIEE with respect to both the interaction between environment and users, and between the basic components in the environment. This article will discuss the structure of a MIEE and its global properties. Extended Multimodal Environments (EMEs) will be introduced as basic components and discussed in detail, mainly with reference to their contents – real and virtual objects, and real and virtual subjects. Then EMEs will be connected together into a network of spaces enabling geographically distributed performances. The concept of "active EMEs" will be employed to introduce MIEEs as a hierarchical structure of metaspaces conceived as virtual subjects collaborating to achieve the overall narrative or aesthetic goal of the performance. The second aspect – the communication processes taking place in a MIEE, with a particular emphasis on expressive gesture – is discussed in Camurri et al. (2003a, 2004a).

MIEEs are discussed and developed in this article with respect to their employment in artistic performances. It should be noted that the paradigm can be applied to other scenarios such as museum applications. MIEEs can also be used for scientific research on multimodal interactive systems and for human aid programs such as education or rehabilitation (Camurri et al., 2003b).

2. The basic components: Extended multimodal environments

The interactive environments discussed in this article get inspiration from the Multimodal Environments (MEs) described in Camurri and Ferrentino (1999). MEs are conceived as "a population of physical and software agents capable of changing their reactions and their social interaction over time": the "living agents" observe the users and extract features related, for example, to motion and gesture. Features are then mapped onto real-time generation of music, sound, visual media. Agents can be software agents, ranging from invisible observers to "believable characters" (Bates, 1994), as well as physical agents – namely, robots moving onstage like the Theatrical Museal Machine (Camurri & Ferrentino, 1999). Agents are multimodal since multiple sensorial modalities (e.g., visual, auditory, haptic) are involved both with respect to perception by spectators/participants and analysis of inputs from spectators/participants.

This article introduces the novel concept of the Extended Multimodal Environment (EME) as the basic component of a Multisensory Integrated Expressive Environment (MIEE). EMEs are an extension of MEs since they explicitly include humans (usually, performers and spectators/participants) in the model, and they explicitly envisage contexts in which the performance is spread over a number of distributed physical and virtual spaces. EMEs are mixed reality spaces and can be classified in terms of the reality–virtuality continuum (Milgram & Kishino, 1994) – that is, they can be completely real (physical) environments (as in traditional theatre performances), completely virtual environments, or something in between (e.g., augmented reality spaces, augmented virtuality spaces or, ideally, spaces where the user cannot distinguish what is real from what is virtual). Consolidated mixed reality techniques can be used in the design and implementation of EMEs.

An EME contains four kinds of entities: real objects, virtual objects, real subjects and virtual subjects. Imagine, for example, a music and dance ensemble performing in a sensorized space, interacting with real-time music generation programs and possibly with other performers in other locations. Every performer in the ensemble is a real subject. Performers can interact with real instruments (real objects) as well as virtual instruments (virtual objects) created in the environment. Software agents (virtual subjects) observe performers and extract features of their performance. Altogether, the environment, its inhabitants (performers and software agents) and the objects in it (e.g., instruments) are an EME. Suppose the ensemble is connected to another EME (e.g., another ensemble in another location): extracted features from one ensemble can be employed to influence audio and visual content generation in the other location. The connected EMEs are a first simple example of a MIEE. Consider another example in the performing arts scenario: an actress plays onstage. Her body movements and, in particular, the movement of her lips are analyzed and employed to process her voice in real time and generate further audio material. Again, the environment, the actress and the software agents analyzing her movements and producing audio output are part of an EME.
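To make the four kinds of entities concrete, the sketch below shows one possible way of modelling an EME's contents. It is illustrative only: the class names and attributes are invented for this article's terminology and are not taken from any existing API.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    virtual: bool          # False: the entity has an actual objective existence

@dataclass
class Obj(Entity):
    """Objects can be used and moulded by subjects, but neither perceive nor act."""
    properties: dict = field(default_factory=dict)

@dataclass
class Subject(Entity):
    """Anything able to perceive the environment around it and act accordingly."""
    def perceive(self, eme: "EME") -> dict:
        return {}                      # e.g., features extracted from sensors
    def act(self, eme: "EME") -> None:
        pass                           # e.g., generate audio/visual output

@dataclass
class EME:
    """An Extended Multimodal Environment: a mixed reality space and its contents."""
    objects: list = field(default_factory=list)
    subjects: list = field(default_factory=list)

# A sensorized stage: a dancer (real subject), a drum (real object),
# a movement-analysis agent (virtual subject), a hyper-instrument (virtual object).
stage = EME(
    objects=[Obj("drum", virtual=False), Obj("hyper-instrument", virtual=True)],
    subjects=[Subject("dancer", virtual=False), Subject("movement analyser", virtual=True)],
)
```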

Concrete examples of EMEs are not restricted to the performing arts, but can also be found in many application scenarios. An interactive installation in a museum can be considered an EME inhabited by the visitors. The whole museum, conceived as a collection of connected EMEs, can be considered a MIEE. In therapy and rehabilitation, interactive multimedia applications can be designed for specific therapeutic tasks (e.g., improving motion fluency by encouraging – with positive audio feedback – such motion in patients). Patients are thus real subjects. They interact with virtual subjects that evaluate how patients accomplished their tasks and provide feedback. Patients may also need to employ real objects for therapeutic tasks, which might be replaced by virtual objects (e.g., simulation of real objects in order to reduce possible risks for patients). These examples will be further developed and discussed later in this article.

Many state-of-the-art multimedia interactive systems may be classified and described as EMEs, but few efforts can be found in the literature addressing an in-depth analysis of EMEs, their components, their properties, and the kinds of interaction and processing taking place inside them. Such analysis would be useful since it would provide some conceptual tools for describing, designing and implementing EMEs. In the following sections, the basic components of an EME and its properties are discussed in detail.

2.1 Real and virtual objects

Following the distinctions proposed by Milgram and Kishino (1994), real objects are defined as "any objects that have an actual objective existence". Thus, real objects are objects that effectively exist in an EME: for example, any piece of scenery can be considered a real object, and physical icons (Ishii & Ullmer, 1997) are real objects as well. Any subject in a given EME can directly observe real objects and, if possible, use them. Conversely, virtual objects are "objects that exist in essence or effect, but not formally or actually" (Milgram & Kishino, 1994). This definition could be further extended, since it is possible to consider virtual objects that do not correspond to any existing real object (i.e., do not exist in essence or effect), but are the results of the creative imagination of the designer of a performance.

Virtual objects can be dynamically created, destroyed, used and moulded (i.e., their properties can be dynamically changed over time) by subjects. Usually, they cannot be directly observed, but the effects of their use can be perceived. As an example, consider a scenario described in Camurri and Ferrentino (1999): a single agent observes and interprets movement and gestures by a user (e.g., a dancer). Depending on the identified "style of movement", a kind of "dynamic hyper-instrument" is generated and played. For example, nervous and rhythmic gestures evoking a percussionist produce a continuous transformation toward a set of virtual drums located where motion is detected. If movement evolves toward smoother gestures, a continuous change also takes place in the music output: for example, the virtual drums are transformed into a virtual string quartet. In the framework of the model proposed in this article, the dynamic hyper-instrument can be considered a virtual object: it can be created in a given location, used (i.e., played), destroyed, and its properties can be dynamically changed over time. Many such virtual dynamic hyper-instruments can be created in a given space. Each of them cannot be directly observed (since they are virtual), but the effects of their use (i.e., the sound produced while playing them) can be perceived. Virtual objects are thus able to implement traditional metaphors like "hyper-instruments" (Machover & Chung, 1989), but also to go partially beyond "hyper-instruments" by enabling the dynamic behavior previously described. Moreover, virtual objects can be employed in more complex scenarios: for example, they can implement Schaeffer's (1977) music objects.
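A minimal sketch of the dynamic hyper-instrument idea follows; the smoothness feature, the thresholds and the timbre labels are invented for illustration and are assumed to come from a movement-analysis agent such as the one described above.

```python
def update_hyper_instrument(instrument: dict, smoothness: float) -> dict:
    """Mould a virtual instrument according to the detected style of movement.

    smoothness in [0, 1]: low values ~ nervous, rhythmic gestures;
    high values ~ smooth, sustained gestures.
    """
    if smoothness < 0.3:
        instrument["timbre"] = "virtual drums"
    elif smoothness > 0.7:
        instrument["timbre"] = "virtual string quartet"
    # Intermediate values: cross-fade continuously between the two sound worlds.
    instrument["morph"] = smoothness
    return instrument

drums = update_hyper_instrument({}, smoothness=0.1)      # created where motion is detected
strings = update_hyper_instrument(drums, smoothness=0.9)  # moulded as gestures become smoother
```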

The same mechanisms described here for audio can also be employed for objects whose use is perceived visually. Referring again to the example above, when the agent detects nervous and rhythmic gestures, it is possible to create an object producing a continuous transformation toward an image in which some features (e.g., colors associated with energy, sharp edges) are emphasized.

2.2 Real and virtual subjects

"Subject" refers to anything able to perceive what is happening in the environment around it and act accordingly. In other words, "subject" is here synonymous with "agent" (see, e.g., the definition in Russell and Norvig, 1995). Nevertheless, the term "subject" is used since "agent" has often been abused in the literature in recent years.

"Real subjects" are defined as any subjects having an actual objective existence. Two kinds of subjects having an "objective existence" are usually found in EMEs: humans and robots. Notice that while humans are always considered real subjects, robots are considered real subjects only if they are able to perceive and act (i.e., they have a certain degree of expressive autonomy). "Expressive autonomy" is defined as "the amount of degrees of freedom that a director, a choreographer, a composer (or in general the author of an application including expressive content communication), leaves to the agent in order to take decisions about the appropriate expressive content in a given moment and about the way to convey it" (Camurri et al., 2000). This can be clarified with an example: a small robot in a performance is used to carry a video camera for obtaining images of the performers that are then deformed and projected onto large screens. If the robot moves strictly according to commands coming from the director, it does not have any expressive autonomy and, in fact, it is a real object (i.e., the director just uses it). Suppose instead that the robot is able to decide where to point the video camera, basing its decision on the performers' expressive gestures. Now the robot makes decisions according to its perceptions: it has a certain degree of expressive autonomy and can be considered a real subject.

Virtual subjects do not have an objective existence; they can be dynamically created and destroyed, and since they are subjects, they are able to perceive and act. From the point of view of perception, virtual subjects are able to observe the environment through sensors (real in EMEs – e.g., video cameras, microphones) and process information in order to get an internal representation (state) of the environment. From the point of view of action, they use actuators (real in EMEs) to generate outputs (e.g., music, sound, visual media) in the environment. Similar to what happens for virtual objects, virtual subjects cannot be directly observed, but the effects of their actions can be perceived.

Subjects (mainly virtual subjects) can be classified with respect to their properties. Two main criteria distinguish virtual subjects: the output channel they mainly use in their actions, and the predominant aspect of their behavior (i.e., whether they mainly observe, act or do both). With respect to the second criterion, an important role is again played by expressive autonomy. If, on the one hand, subjects must have a certain degree of expressive autonomy (otherwise they would be objects), on the other hand, the amount of expressive autonomy strongly influences what a subject can do. With respect to the first criterion, (virtual) subjects can be distinguished into audio subjects, visual subjects and multimodal subjects, depending on the main channel they use for their actions – auditory output (sound and music), visual output (images, lights) or both, respectively. Notice that although the classification has been restricted to audio and visual outputs here, it can be further extended to other modalities (e.g., haptics).

With respect to the second criterion, a distinction can be made between observers, actors and characters. Observers are subjects whose main role is observing a particular aspect of the EME. They extract features, interpret them, and provide other subjects with information (structured on several levels) about what they are observing. A particular kind of observer is the one associated with humans – that is, observers responsible for tracking and analyzing the actions a human is performing. Another subset consists of observers responsible for observing the environment from the point of view of a human with whom they are associated: in a sense, they are customized observers. Both virtual subjects and robots can play the role of observers.

Actors are subjects whose main role is acting (i.e., producing music, sound, visual media) depending on inputs received from other subjects (mainly observers). Avatars are an important kind of actor, usually conceived as a representation of a human in a virtual reality environment (see, e.g., Bahorsky, 1998). An avatar therefore acts according to what the human it represents is doing: its main role is representing the human through its actions. Both virtual subjects and robots can play the role of actors and avatars.

Characters are subjects that both perceive and act; they often are not associated with a given human, but can interact with humans. Characters therefore have a higher degree of expressive autonomy with respect to observers and actors. A lot of research has been carried out on characters to improve their behavior and believability in a huge variety of application fields (e.g., virtual tutors, virtual assistants, characters in game environments, characters for sign language, characters for television applications). Design of virtual characters is beyond the scope of this article, so characters will not be further discussed.

I would like to point out two issues of particular relevance in artistic contexts. First, anthropomorphism is not a strict requirement: it is possible (and sometimes preferable) to have cartoon-like characters or abstract shapes that are not anthropomorphic at all. And second, though most research on virtual characters actually focuses on verbal communication and related fields such as automatic and believable generation of co-verbal gestures (see, e.g., research on Emotional Conversational Agents), here communication mainly takes place through non-verbal channels. Characters thus have to demonstrate their believability through the audio and visual output they produce.

Clones replicate the actions of a human by translating them into auditory (audio clones) or visual (visual clones) form. The level of abstraction at which humans' actions are translated can vary considerably: for example, in a very simple scenario, some movements can be recognized and associated with generation of audio or visual output. In more complex cases, high-level information about expressive gesture can be involved in the translation process. Notice that a clone needs an observer to gather information about the human and an actor (an avatar) to generate output. If the clone is created in the same mixed reality space in which the human actually is (i.e., in the same geographical location), the two aspects can be merged and the clone is in fact a character. If instead generation of output takes place in another mixed reality space (i.e., in another location), an observer will be needed in the space where the human actually is, and an actor/avatar will be needed in the space where the output has to be generated. By combining the classifications according to the two criteria, two relevant cases emerge: audio clones and visual clones (i.e., avatars mainly acting through audio and visual output, respectively).
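The classification of virtual subjects sketched above can be summarised as a small class hierarchy. The skeleton below is hypothetical (names and signatures are invented for this article, not part of EyesWeb or any other platform) and only shows how observers, actors, characters and clones relate:

```python
class VirtualSubject:
    """Perceives through (real) sensors and acts through (real) actuators."""
    def perceive(self, sensor_data: dict) -> dict:
        return {}
    def act_on(self, features: dict) -> None:
        pass

class Observer(VirtualSubject):
    """Mainly observes: extracts features and passes them on to other subjects."""
    def perceive(self, sensor_data: dict) -> dict:
        return {"motion_energy": sensor_data.get("motion_energy", 0.0)}

class Actor(VirtualSubject):
    """Mainly acts: generates audio/visual output from the features it receives."""
    def act_on(self, features: dict) -> None:
        print("generating output for", features)

class Character(VirtualSubject):
    """Both perceives and acts, with a higher degree of expressive autonomy."""

def clone(observer: Observer, actor: Actor):
    """A clone couples an observer tracking a human with an actor rendering the result;
    when both live in the same space, the pair collapses into a character."""
    def step(sensor_data: dict) -> None:
        actor.act_on(observer.perceive(sensor_data))
    return step

audio_clone = clone(Observer(), Actor())
audio_clone({"motion_energy": 0.7})
```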

2.3 Interaction paradigms between subjects

How do real and virtual subjects socially interact in an EME? Only two approaches will be discussed briefly here, but further and more complex paradigms could be introduced and employed. Although the two approaches are quite traditional, they are easy to understand and implement, and can be employed to build prototypes of EMEs.

In collaborative models, subjects cooperate in the fulfilment of the goals of the performance. Collaborative models have been used in many application contexts – for example, in Artificial Intelligence and in Human-Computer Interaction in the field of conversational agents (see, e.g., Guinn & Biermann, 1993; Perez-Quinones & Sibert, 1996). In competitive models, subjects compete to obtain resources and to achieve a goal by getting the best performance or scoring. Competitive models are mostly used in games (and video games). Both these models have been extensively studied in several disciplines, ranging from Computer Science to Economics and the Social Sciences. Here I just highlight some aspects that are relevant for employing the models in EMEs.

In the literature, the term "collaborative" is often used with reference to Collaborative Virtual Environments (CVEs): systems that "use VR technology to visualize a space inhabited by multiple users, usually geographically remote in the real world" (Benford et al., 1997). Taking inspiration from Benford's definition, in the context of EMEs "collaborative" means that subjects cooperate in the common group "work" of generating the performance. In other words, while the performance may remain orchestrated and supervised by its designer (composer, choreographer, director) as in more traditional scenarios, it can evolve and be moulded on the basis of joint and coordinated actions of subjects that directly collaborate in generating and transforming the content. In the case of an artistic performance, the common goal driving the participants' actions can be identified, for example, in a communicative objective of the performance as a whole – that is, in the acquired consciousness and understanding of the message the designer wants to communicate through the shared experience. Supervision by the artist/director and evolution depending on subjects' actions can be mixed to varying extents: this issue is related again to the concept of autonomy (and expressive autonomy). The word "collaborative" is therefore used mainly with reference to its social meaning (i.e., bringing together people cooperating in the fulfilment of an artistic goal), rather than to its technological implications.

The term "competitive" is also intended to have a less specific meaning than that found in the specialised literature (e.g., in the field of genetic algorithms). Here "competitive" refers to the traditional game paradigm where players compete in achieving a goal by trying to obtain the best performance. A performance can thus be designed from a game-like perspective where subjects "fight" against each other to get the best score and win the game. The game paradigm can raise the interest and engagement of participants considerably. The use of well-acknowledged conventions, as in drama and games, has been demonstrated to be effective in introducing novel forms of interaction to the general public, even to novices in technology (Rinman, 2002).

The two paradigms can also be combined. For example, it is possible to have competitive environments where subjects grouped in teams collaborate in trying to win the game. A recent example of a MIEE in which the collaborative/competitive paradigms have been extensively employed is the game Ghost in the Cave (Rinman et al., 2004), designed and implemented within the framework of the EU-IST Project MEGA (Multisensory Expressive Gesture Applications, see www.megaproject.org). The game exploits non-verbal communication mechanisms based on expressive gesture. Two players control their avatars in a virtual environment through voice and full-body movements. Spectators, divided into two teams, can participate in the game by influencing it through their movements.
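As a toy illustration of the combined collaborative/competitive pattern, loosely modelled on the Ghost in the Cave setup (the scoring rule below is invented for the example and is not the game's actual logic):

```python
def team_influence(motion_quantities: list[float]) -> float:
    """Collaboration: spectators in a team jointly steer the game through
    the aggregated quantity of their movements."""
    return sum(motion_quantities) / max(len(motion_quantities), 1)

def round_winner(progress_a: float, progress_b: float) -> str:
    """Competition: the team whose avatar gets further toward the goal wins the round."""
    return "team A" if progress_a >= progress_b else "team B"

print(round_winner(team_influence([0.4, 0.9, 0.6]), team_influence([0.2, 0.5])))
```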

3. Connecting together more extended multimodal environments

Up to now, the discussion has focused on a single EME and on what it contains. An EME exists in a given geographical location. However, a central issue is the definition of a performance environment that is not limited to a specific physical and geographical location, but that can be spread over several different locations. This is nowadays allowed by broadband communication technologies. Evolution in technology will also remove the need to have dedicated installations (and dedicated places) for performance, thus enabling distribution in non-traditional environments (e.g., at home). Connecting together more EMEs raises some issues about how subjects populating a given environment are represented in the other ones, and how subjects in a given environment can use objects belonging to another one. Such a situation can be handled by using the observer and avatar virtual subjects described above. Consider, for example, two EMEs connected through a network (see Figure 1).

Two observers are associated with a real subject (a human) in EME 1. The first observes the human, trying to analyze his or her actions and behavior (e.g., what he or she is doing with an object). Notice that the information the observer can extract from the human ranges over multiple layers of abstraction: from simple detection of motion in given regions or of given body parts, to information about gestures the human is performing, to the possible emotion the human is trying to express, to his or her engagement with respect to the performance that is taking place. Examples of extraction of features on several levels from human full-body movement can be found in Camurri et al. (2004a). The second observer observes what is actually happening in the EME: for example, it observes what other subjects are doing with objects. Again, information over several levels of abstraction and complexity can be extracted. Notice that the second observer can be "customized" in order to observe the environment according to the preferences of the human subject with which it is associated. For example, if the human subject has a particular sensitivity toward light changes or a given musical genre, the observer can be programmed to attribute a particular relevance to light changes and that musical genre (for a more extensive discussion of this aspect, see Camurri et al., 2004b).

Information collected by the two observers is sent over the network to an avatar inhabiting EME 2. The avatar can thus act in EME 2 depending on what the human it represents is doing and observing in EME 1. Furthermore, the avatar can also observe what is happening in EME 2. The avatar's actions can therefore depend on: the actions of the human as observed by observer 1 in EME 1; what is happening in EME 1, filtered by observer 2 according to the human's preferences; and what is happening in EME 2, observed and filtered by the avatar according to the human's preferences. The avatar's actions can consist in the generation of audio and visual content or in the suitable use (and creation/destruction, if needed) of virtual objects. The avatar could also use real objects if they can be used without the need for physically interacting with them (e.g., objects that can be automatically controlled). Conversely, information gathered by the avatar in EME 2 can be sent back to EME 1, where it can be presented to the human in several ways with increasing complexity, ranging from displays showing what is happening in EME 2 to the visual and audio feedback generated by an actor in EME 1 on the basis of data coming from EME 2.
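A deliberately simplified sketch of this flow is shown below. The transport, the JSON encoding and the feature names are arbitrary assumptions made for the example, not a protocol prescribed by the model.

```python
import json
import socket

def send_features(features: dict, host: str, port: int) -> None:
    """Observer side (EME 1): ship the extracted features to the remote avatar."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(json.dumps(features).encode("utf-8"))

def avatar_step(human_features: dict, local_observations: dict) -> dict:
    """Avatar side (EME 2): choose an action from what the human is doing in EME 1
    and from what the avatar itself observes in EME 2."""
    return {
        "audio_intensity": human_features.get("motion_energy", 0.0),
        "visual_palette": "warm" if local_observations.get("tempo", 0) > 120 else "cold",
    }
```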

Fig. 1. Connecting two Extended Multimodal Environments. Cubes represent objects (solid lines for real objects and dashed lines for virtual objects) and human-like shapes represent subjects (again, solid lines for real subjects and dashed lines for virtual subjects).

The mechanisms described here can be replicated in order to connect together more EMEs: a network of EMEs can thus be obtained, enabling distributed performances. Of course, complexity increases: for example, a human physically inhabiting a given EME can have avatars in each connected EME, all receiving information from the two observers associated with the human. Conversely, the human can receive feedback from each of his or her avatars populating the network of EMEs. A network of EMEs can be represented as a graph where nodes represent EMEs and edges can be given different meanings at different levels of abstraction. They may just represent physical connections among EMEs (e.g., network connections), but they may also refer to higher-level relationships. These higher-level relationships may refer, for example, to the role of the connected EMEs in the overall structure of the performance (e.g., two EMEs are populated by different characters participating in the same storyboard; two EMEs have a complementary role in a shared location, such as two installations in the same exhibit).

Relationships can also have properties. For example, suppose an edge represents a network connection between two EMEs. Such a connection may have a different permeability with respect to different kinds of sensory information (e.g., the connection may allow transfer of auditory information only, of visual information only, or of both). A connection can also be symmetric (i.e., with exchange of information from both sides) or asymmetric (i.e., one EME only sends information and the other only receives it). It is thus possible to define a semantics for graph structures representing EMEs and their properties, the relationships among EMEs, and the properties of such relationships. From such a graph structure it is also possible to derive further properties. Consider, for example, a graph in which an edge means that the connected EMEs can influence each other (a property of such a relationship could specify whether the influence is unidirectional or bidirectional). From such a graph, a transitive property can be derived, on the basis of which an EME can indirectly influence what happens in any other EME for which a path connecting the two EMEs can be found in the graph. The definition of the syntax and semantics of a graphical language for describing EMEs and the relationships among them, as well as a detailed analysis of possible high-level relationships and of their role in connecting EMEs, are currently under investigation.
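The graph view lends itself directly to implementation. The sketch below uses the networkx library purely as a convenience (any adjacency-list representation would do); the node names and edge attributes are invented for illustration.

```python
import networkx as nx

g = nx.DiGraph()
# Edge attributes model the permeability of each connection.
g.add_edge("EME-Genoa", "EME-Venice", audio=True, video=False)       # asymmetric, audio only
g.add_edge("EME-Venice", "EME-Genoa", audio=True, video=True)
g.add_edge("EME-Venice", "EME-Stockholm", audio=False, video=True)

def can_influence(graph: nx.DiGraph, source: str, target: str) -> bool:
    """Transitive property: source can (indirectly) influence target
    if a directed path connects the two EMEs."""
    return nx.has_path(graph, source, target)

print(can_influence(g, "EME-Genoa", "EME-Stockholm"))   # True, via EME-Venice
```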

While techniques of augmentation such as those described by Milgram and Kishino (1994), or those related to the tangible bits approach (Ishii & Ullmer, 1997), can be used within each EME composing a network, mixed reality boundaries (Benford et al., 1998) can also be a good (but not the only) choice for connection between EMEs. Even if the complexity of an extended network of EMEs is more a theoretical condition than a practical one (in practice, usually only a few EMEs will be connected together), such complexity can make it difficult to design, organize and coordinate a performance: the cross-influences can make it impossible to develop a narration across the EMEs, and the juxtaposition of too many effects can generate situations that are not understandable by the spectators/participants. A further layer of coordination and supervision is therefore needed in MIEEs.

4. Active extended multimodal environments

An EME can be equipped with sensors and effectors. Environmental sensors can be used to get an overall picture of what is happening. Environmental audio and visual output can be generated. An EME can therefore be conceived as an active space – that is, it can be part of the performance, since its environmental properties can be moulded depending on the evolution of the performance. A simple example is given by a space in which elements (e.g., lights, scenery) are dynamically changed in real time by performers' actions. Consider, for example, a concert taking place in an EME. The EME can observe the performers and produce visual outputs (e.g., abstract shapes) depending on the played music. The same music could also be captured through microphones, processed and reproduced on the basis of what and how the performers play and how they move. More complex situations can also be conceived.

Fig. 2. Active EMEs can be represented along a continuum.

Active EMEs usually need to have a state: a corpus of information about what is actually happening and what has happened in the environment. Depending on their degree of activity, active EMEs can be classified along a continuum ranging from completely passive environments to highly dynamic active environments (see Figure 2). In completely passive environments, users (performers/spectators/participants) cannot influence the environment in any way. The environment constantly remains the same, or if it changes, changes are predefined. For example, this is what happens in traditional theatre scenarios, where any change in lights and scenery is decided before the performance and extensively tested during rehearsals.

On the other side of the continuum, highly dynamic active environments are equipped with environmental sensors and actuators, and implement complex strategies to analyze data from sensors and map them onto generation of multimedia output. Several degrees of complexity are possible, for example, with respect to how much memory of the past is kept and used in the mapping process, and how much the mapping strategies can dynamically evolve over time.

A relevant case that lies between completely passive environments and highly dynamic active EMEs is that of "reactive environments", in which a collection of fixed rules is used in the mapping process. As a concrete example, a reactive environment can be endowed with a static collection of condition-action rules. Depending on the input from environmental sensors, some rules will fire and an action will be selected from among the ones enabled by the firing rules. Dynamic active environments can have several sets of condition-action rules and can switch among sets depending on the history and goals of the performance. In other words, highly dynamic active environments can intervene in the narrative structure of the performance, make decisions, and influence both the evolution of the performance and the interaction among the subjects populating the environment.
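A reactive EME of this kind reduces to a small rule engine. The following sketch is illustrative only, with invented sensor features and placeholder actions standing in for real-time audio/visual generation.

```python
# Static collection of condition-action rules for a reactive EME (illustrative).
rules = [
    (lambda s: s["motion_energy"] > 0.8,  lambda: print("trigger percussive sound layer")),
    (lambda s: s["fluency"] > 0.7,        lambda: print("fade lights to a soft colour")),
    (lambda s: s["audience_noise"] > 0.5, lambda: print("increase projection brightness")),
]

def reactive_step(sensor_state: dict) -> None:
    """Fire every rule whose condition holds. A highly dynamic active EME could
    instead keep several rule sets and switch among them depending on the
    history and goals of the performance."""
    for condition, action in rules:
        if condition(sensor_state):
            action()

reactive_step({"motion_energy": 0.9, "fluency": 0.2, "audience_noise": 0.1})
```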

An active EME has sensors (i.e., it "perceives" what is happening inside itself through a number of environmental sensors) and "effectors" (i.e., it is able to generate suitable multimedia content depending on what it perceives), and usually has a state (i.e., it has an internal representation of what is happening). These are the same properties that define an agent: in fact, the definition by Russell and Norvig (1995) says that an agent is "anything that can be viewed as perceiving its environment through sensors and acting upon the environment through effectors". Therefore, an active EME can be considered an agent that is itself the environment and, according to the previous definitions, can be regarded as a subject. Is it a real subject or a virtual subject? A "real subject" is one having an "objective existence": an EME that physically exists in a given geographical location should therefore be considered a real subject. A completely virtual environment, on the other hand, should be considered a virtual subject because it does not have an "objective existence". In any event, as the discussion proceeds, the problem of understanding what in fact is real and what is virtual becomes more complex but, to a certain extent, less relevant.

Note that in this discussion only active EMEs are considered (i.e., from reactive environments to highly dynamic active environments). Completely passive environments cannot be considered subjects since they neither perceive nor act. Sometimes, however, it is possible to import completely passive environments into the model by considering them a special kind of object. In fact, if it is possible to externally control some aspects of the environment (e.g., lights), a subject could use these mechanisms to intervene in the environment. The environment neither "perceives" nor has an internal state, but subjects (even virtual subjects) can use it as an object by intervening through the mechanisms the passive environment provides.

5. Structure of multisensory integrated expressive environments

Let us now consider two active EMEs connected through a network (a situation like the one described in Figure 1). According to what has been noted above, each EME can be thought of as a subject communicating with other EMEs/subjects through the network connection. It is thus possible to define a kind of metaspace, one layer above the two EMEs, in which the two EMEs can be represented as communicating subjects (see Figure 3).

In a similar way, when more active EMEs are connected together, they can be modelled as a collection of subjects interacting in a metaspace one level above the network of EMEs (see Figure 4). Each subject/EME can have more or less knowledge about the other subjects/EMEs in the metaspace, and their interaction can be more or less strong and tight. According to this metaphor, the development of a narrative structure along the network of EMEs and the achievement of the performance's narrative and aesthetic goals can be thought of as the outcome of the interaction (e.g., either collaborative or competitive) of the subjects/EMEs in the metaspace representing the network. The EMEs can intervene and directly influence what is happening inside them with the aim of enriching the experience of spectators/participants by controlling the complexity of the interaction, thus helping spectators/participants to understand the contents of the performance and enhancing their enjoyment of it.

If, on the one hand, each EME can be thought of as having its own storyboard and "artistic goals", with subjects as "actors" collaborating or competing to achieve the "artistic goals" of the EME, on the other hand the metaspace at layer 1 will also have its own storyboard and its own "artistic goals", but in this case each EME is an actor in the layer 1 storyboard, and EMEs interact by collaborating or competing for the achievement of the "artistic goals" at layer 1. Suppose that two networks of EMEs generate two metaspaces in which the EMEs in the two networks are actors collaborating and/or competing in the context of the storyboard of each metaspace. The metaspace can observe what the subjects/EMEs are doing inside it, and intervene and influence their choices: in other words, the metaspace can be considered an active environment and therefore a subject "perceiving" what the subjects/EMEs are doing inside it and acting accordingly.¹ The two metaspaces can then be grouped as subjects in another metaspace one layer above. The two metaspaces will be "actors" in the storyboard of the new upper metaspace, and will contribute to the goals of the new metaspace.

This paradigm constitutes the basic structure of MIEEs. It can be replicated recursively by creating more levels of abstraction in which each active space or metaspace is considered as a subject in a metaspace one layer above. Each active space and metaspace has its own storyboard and subjects and, as a subject itself, is part of the storyboard of the metaspace one layer above it (see Figure 5). The paradigm allows organization of a performance on several levels with possible layered narrative structures. Simple interactive storyboards may be implemented in EMEs. The use either of reactive EMEs or of a suitable combination of reactive strategies and higher-level decision-making processing may help in designing interactive performances that provide spectators with a rich aesthetic experience without overwhelming them with too many complex and incomprehensible outputs.
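The recursive structure can be captured with a simple composite: each space, whether an EME or a metaspace, is a subject in the storyboard of the space one layer above. The sketch below is a hypothetical illustration of that hierarchy, not an implementation of an actual MIEE.

```python
class Space:
    """An active EME or a metaspace; either behaves as a subject one layer up."""
    def __init__(self, name: str, storyboard: str, members=()):
        self.name = name
        self.storyboard = storyboard   # the space's own narrative/artistic goals
        self.members = list(members)   # its subjects: humans, agents, or nested spaces

    def walk(self, depth: int = 0):
        """Yield (layer, name) pairs across the whole MIEE hierarchy."""
        yield depth, self.name
        for m in self.members:
            if isinstance(m, Space):
                yield from m.walk(depth + 1)

miee = Space("MIEE", "overall performance", [
    Space("metaspace A", "storyboard A", [
        Space("EME 1", "local storyboard 1"),
        Space("EME 2", "local storyboard 2"),
    ]),
    Space("metaspace B", "storyboard B", [Space("EME 3", "local storyboard 3")]),
])

for layer, name in miee.walk():
    print("  " * layer + name)
```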

The complexity of a distributed performance can instead be addressed in the metaspaces at the upper layers. In other words, the upper layers are responsible both for the flow of contents and for the interaction strategies among multiple EMEs. On the one hand, the MIEE paradigm allows the separation of different narrative structures at different levels of abstraction, thus enabling the definition of multiple-layered (possibly non-linear) narrative structures. On the other hand, the paradigm provides a unified conceptual framework allowing the orchestration of the narrative structures at the various layers in order to achieve the aesthetic goals of the whole performance. The author of the performance and the designer of the interactive systems can thus concentrate their efforts on every single component and then connect the components in a MIEE structure. This process may facilitate the authoring task, the system design and implementation task, and the correct transfer of the requirements stated by the author into the design and implementation of the interactive systems.

The modularity of the MIEE model is another key aspect. The development of the storyboard, both in the EMEs and in every higher-level metaspace, is governed by the same paradigm: the progress of the narration depends on the interactions between the subjects populating the space. In this perspective, the design of a performance ultimately consists of the identification of the subjects at the different layers and of the design of the interaction (i.e., of the interaction strategies) between the subjects. The identification of the subjects is strictly related to the identification of the connections among EMEs and among higher-level metaspaces. Given a collection of EMEs, several different possibilities of grouping them in metaspaces can be taken into account. How spaces are grouped and connected depends on many aspects, and decisions about this are made during the design of the performance. For example, EMEs belonging to the same geographical region can be connected in the same network. Grouping may also depend on properties of or relationships among EMEs (e.g., EMEs having similar properties may be grouped in the same metaspace). Usually, however, grouping is performed with a quite direct reference to the physical or logical structures underlying a performance or an installation – that is, spaces having a physical (e.g., in the same building, region, country) or logical (e.g., populated by similar characters, having similar objectives) contiguity are grouped in the same metaspace. An example will be given in the following section.

Fig. 3. Two active EMEs connected through the network can be represented as two subjects in a metaspace, one layer above the two EMEs.

Fig. 4. A network of EMEs can be represented as a group of subjects in a metaspace, one layer above the EMEs in the network.

Fig. 5. Structure of a Multisensory Integrated Expressive Environment (MIEE).

¹ Notice that at this point the metaspace will usually be considered as a virtual environment and a virtual subject, since it will not have an objective existence. Its "perceptions" and "actions" with respect to subjects/EMEs will not be physical (like, e.g., the generation of audio/visual content in EMEs). Rather, metaspaces will act as software agents interacting with other software agents (the subjects/EMEs). However, sometimes it is also possible to find a physical counterpart of metaspaces (this will be discussed in an example later in this article).

MIEEs are multisensory since they deal with information from different sensory channels (e.g., visual, auditory, haptic). They are also multilayered since they represent a performance with respect to narrative structures situated at several layers of abstraction. They are integrated since a number of particular aspects of the interactive performance – such as analysis of spectators/participants' behavior, real-time generation of multimedia output, identification and application of suitable mappings between analyzed behavior and generated output, and management of the whole performance at multiple layers – are all grouped and considered under the same conceptual framework. They are expressive since most of the interaction and communication processes taking place inside them (both at the level of "physical" EMEs and at the level of "virtual" metaspaces) aim to convey expressive, emotional, affective content (a discussion about what is considered to be "expressive content" and some of the mechanisms through which such expressive content is conveyed in MIEEs can be found in Camurri et al., 2004a).

6. Examples of multisensory integrated expressive environments

Several EMEs were developed for public performances and multimedia events and exhibits in the framework of the EU-IST project MEGA. A quick description of some of them has been provided in Section 2 above. All the examples have been completely or partially implemented using the EyesWeb open platform and the EyesWeb Expressive Gesture Processing Library (see www.eyesweb.org).

Consider again the example of the music and dance ensemble. It has been implemented in the framework of the New York University Music and Dance Program in Italy (2003 and 2004 editions), a three-week summer school held in Genoa (Program Director: Esther Lamneck; Choreographer: Douglas Dunn). Some real subjects (musicians and dancers) are performing onstage. A virtual subject is associated with each real subject and analyzes his or her performance (performed music or full-body movements). Multiple video cameras and microphones provide the virtual subjects with data. Simple rules are applied to map the output of the analysis onto the generation of audio and visual content. This EME can thus be classified as a "reactive environment" (see Section 4), even if more complex decision-making processes are sometimes applied during the performance (e.g., for deciding which combination of visual effects should be applied to the incoming images). Figure 6 shows an excerpt from the final concert of the 2003 program. Both the music played by the performers and the movements of the dancers are analyzed by virtual subjects. This information is used to interactively control (with EyesWeb) video generation and processing in real time.

Fig. 6. Picture from the final concert of the New York University Summer Program in Italy (Genoa, July 2003). Both the music played by the performers and the movements of the dancers are analyzed by virtual subjects in an EME. This information is used to interactively control (with EyesWeb) video generation and processing in real time. (Photo by Matteo Ricchetti)


In the EME created for the concert Allegoria dell'opinione verbale, the focus was on the interaction between a real subject (an actress) and her virtual audio clone. This piece by the composer Roberto Doati, based on a text by Gianni Revello, was conceived at the DIST-InfoMus Laboratory and then performed on stage for the 2001–2002 autumn season of Gran Teatro La Fenice at the Teatro Malibran, Venice, within Per voce preparata (a musical theatre production with works by Aperghis, Cage, Casale, Kagel, Pachini and Schnebel), and at the Opera House of Genoa, Teatro Carlo Felice, Genoa, Italy. During the concert, the actress (Francesca Faiella) is on stage and sits on a stool placed at the front of the stage near the left side. A video camera is placed (hidden) in the left part of the backstage, and is used both to get images of the face of the actress for projection onto a large screen and to capture her lip and face movements. A virtual subject extracts and processes the movements of her lips and face. It then uses this information to process her voice in real time and diffuse spatialized electroacoustic music over eight loudspeakers placed within the auditorium. Thus, in this example, the virtual subject acts as an audio clone – that is, it translates the actions of the actress from the visual/motion channel (lip and face movements) to the auditory channel (processed voice and music). (Further information about this example and results of audience evaluation can be found in the article by Lindstrom et al. in this special issue.)

EMEs and MIEEs have been discussed with respect tothe scenario of distributed artistic performances wherethe narration is structured on multiple layers. However,artistic applications are not the only field in which thesemodels can be employed. For example, some EMEs weredesigned as environments for performing therapeuticexercises for the rehabilitation of Parkinson’s patients.The work, carried out in the framework of the EU-ISTproject CARE-HERE in collaboration with Centro diBioingegneria at Ospedale La Colletta-ASL 3 Genovese,consisted in the design and implementation of acollection of therapeutic exercises aiming at creating‘‘pleasant’’ aural and visual feedback that encouragedimprovement of movement in patients (Camurri et al.,2003b). One of the developed exercises consisted of anEME containing a virtual subject observing the move-ment of a patient (real subject) and generating a paintingonto a large screen. The interaction was based on somemeasured movement cues. For example, the color of thepainting may depend on fluency or impulsiveness; theamount of detected motion may be associated tointensity of the color trace; pauses in movement aredetected and allow for restarting the process and re-assigning/adapting interaction mappings. In this case, thevirtual subject can be considered a visual clone of the realsubject (the patient). Applications are under develop-ment where an audio clone is associated to a patient.Figure 7 shows the output of the system (implemented as

Fig. 7. An EME for therapeutic exercises. A virtual subject observes a patient, extracts features from her movement, and generates visual outputs in real time. The figure shows the output of the system. The black shadow is the silhouette of the patient. The colored areas are generated by the patient through her own motion (a kind of body painting). Colors depend on the energy and fluency of the patient's movements.


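The mapping described above can be summarized by a small Python fragment - again an illustrative sketch rather than the actual EyesWeb application - in which fluency selects the hue, the quantity of motion controls the intensity of the color trace, and a sufficiently long pause resets the painting. The names (MovementCues, paint_params) and the thresholds are assumptions.

```python
from dataclasses import dataclass
import colorsys

@dataclass
class MovementCues:
    """Hypothetical per-frame cues computed from the patient's silhouette."""
    quantity_of_motion: float  # 0..1, how much of the body is moving
    fluency: float             # 0..1, smooth (1) vs. impulsive (0) movement

def paint_params(cues: MovementCues, pause_frames: int,
                 pause_threshold: int = 50) -> dict:
    """Map movement cues to color-trace parameters; a long pause resets
    the painting so the mapping can be re-assigned (illustrative only)."""
    if pause_frames > pause_threshold:
        return {"reset": True}
    # Fluent movement -> cool hues, impulsive movement -> warm hues.
    hue = 0.6 * cues.fluency
    # More detected motion -> a brighter, more present color trace.
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, cues.quantity_of_motion)
    return {"reset": False, "rgb": (r, g, b),
            "trace_alpha": 0.2 + 0.8 * cues.quantity_of_motion}

if __name__ == "__main__":
    print(paint_params(MovementCues(quantity_of_motion=0.8, fluency=0.9),
                       pause_frames=0))
    print(paint_params(MovementCues(quantity_of_motion=0.0, fluency=0.0),
                       pause_frames=120))
```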

As a final example, let us consider another application scenario: a museum exhibit in which visitors pass through several rooms and installations, again following a kind of narrative structure (the narrative structure of the exhibit), and where a main goal is enhancing fruition. Let us start by considering an installation in a room of the museum. Several degrees of complexity are possible, ranging from simply displaying movies and reproducing audio excerpts to interactive situations where visitors are observed, clones are generated, and audio and visual content is produced in real time depending on visitors' behavior. The installation can therefore be regarded as an EME, in which visitors (real subjects) are actively involved in discovering what the exhibit is intended to communicate to them. Real and virtual objects and other real (e.g., robots) and virtual (e.g., video and audio clones) subjects may be involved in the installation.
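To make the notion of such an interactive installation concrete, the following minimal Python sketch - an assumption for illustration, not software actually deployed in any museum - shows the observe/update/act loop that an installation of this kind could run: it senses visitors, keeps a simple internal state, and adapts its output accordingly.

```python
import random
import time

class ActiveInstallation:
    """Illustrative sketch of an interactive installation EME: it observes
    visitors, keeps an internal state, and adapts its output."""

    def __init__(self) -> None:
        self.visitors_seen = 0
        self.mode = "attract"   # internal state: attract vs. engage

    def observe(self) -> int:
        # Stand-in for real sensing (cameras, floor sensors, ...).
        return random.randint(0, 3)

    def update_state(self, visitors_now: int) -> None:
        self.visitors_seen += visitors_now
        self.mode = "engage" if visitors_now > 0 else "attract"

    def act(self) -> None:
        if self.mode == "engage":
            print("Playing interactive content driven by visitor movement")
        else:
            print("Playing ambient 'attract' loop")

    def run(self, steps: int = 3) -> None:
        for _ in range(steps):
            self.update_state(self.observe())
            self.act()
            time.sleep(0.1)

if __name__ == "__main__":
    ActiveInstallation().run()
```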

Museum installations that, considered individually, can be regarded as EMEs have been developed on several occasions. See, for example, the installations at "Città dei Bambini" - literally Children's City, a permanent science museum exhibit for children in Genoa, Italy - described in Camurri and Ferentino (1999), and, more recently, the installations at "Città della Scienza", a permanent science exhibit in Naples, Italy. A room in the museum can contain a number of installations connected together through a local area network. If each installation is considered an active EME, the room as a whole can therefore be considered a metaspace in which the subjects representing the installations in the room collaborate in the context of a higher-level communication objective (or a higher-level narrative structure) - namely, what visitors are supposed to learn by visiting that room.

Two aspects are worth noting at this point. First, the installations contained in the room should be active EMEs. In other words, they should be able to observe what visitors are doing, keep and update an internal state, and act accordingly by dynamically modifying parts of the installation. This means that a certain level of complexity is required in the installation and that the designer has to be careful to find a good trade-off between complexity and comprehensibility. Simpler and sometimes passive installations can be included as objects, if they provide control mechanisms.

Second, this is an example in which the metaspace has a physical correspondence in the museum room. The room can be abstracted as an active space inhabited by subjects (the installations) interacting and collaborating toward a common goal: enhancing the fruition of the exhibit. Consider now a further layer of abstraction: rooms in the museum can be grouped with respect to thematic areas (i.e., rooms concerning similar topics are grouped in the same thematic area). A thematic area can thus be considered another metaspace, located at layer 2 and inhabited by the rooms, which, as subjects, collaborate in the management of the visit through a narrative path across the thematic area. The museum as a whole can be regarded as a metaspace at layer 3, where all the thematic areas, considered as subjects, interact and collaborate in managing flows of visitors inside the museum. The whole structure is shown in Figure 8. More levels can be added if needed: for example, if the museum is spread over several buildings, each building can be regarded as another metaspace at an intermediate layer between the thematic areas and the whole museum.
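The layered structure of Figure 8 can be read as a recursive composite: each metaspace contains, as its subjects, either EMEs or lower-layer metaspaces. The following minimal Python sketch, with hypothetical class names (EME, Metaspace) introduced only for illustration, shows one way such a structure could be represented and traversed.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EME:
    """A leaf environment (e.g., a single museum installation)."""
    name: str

@dataclass
class Metaspace:
    """A higher-layer space whose subjects are EMEs or other metaspaces."""
    name: str
    layer: int
    subjects: List[object] = field(default_factory=list)

    def describe(self, indent: int = 0) -> None:
        print(" " * indent + f"{self.name} (metaspace, layer {self.layer})")
        for s in self.subjects:
            if isinstance(s, Metaspace):
                s.describe(indent + 2)
            else:
                print(" " * (indent + 2) + f"{s.name} (EME)")

if __name__ == "__main__":
    room_a = Metaspace("Room A", 1, [EME("Installation 1"), EME("Installation 2")])
    room_b = Metaspace("Room B", 1, [EME("Installation 3")])
    area = Metaspace("Thematic area", 2, [room_a, room_b])
    museum = Metaspace("Museum", 3, [area])
    museum.describe()
```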

In concrete applications, how EMEs and metaspaces should be grouped into higher-layer metaspaces is often quite easy to decide given the application scenario. For example, in the case of the museum, grouping is performed on the basis of location (e.g., all the installations in the same room are grouped in a metaspace representing the room) and on the basis of the theme of the exhibit (e.g., all the rooms belonging to the same thematic area are grouped in a metaspace representing the thematic area). Similar criteria can also be used in the case of artistic performances, where grouping may depend, for example, on geographical location (e.g., EMEs situated in the same region or country can be grouped together) or on content (e.g., grouping EMEs that are similar in terms of storyboard or of the role of participants). The museum example has been partially implemented in the above-mentioned exhibition at "Città della Scienza", Naples, where the installations employ simple rule-based strategies for collaborating in enhancing visitors' fruition of the exhibit. In a more recent implementation at the "Museo del Mare", Genoa, opened in July 2004, a MIEE was designed to manage the sonorization of the rooms in the museum.

7. Conclusions

This article introduced Multisensory Integrated Expressive Environments (MIEEs) as a framework for structuring, designing and implementing distributed active mixed reality environments for performing arts applications. MIEEs have been described in terms of their basic components, Extended Multimodal Environments (EMEs), and of how these components are recursively connected in the MIEE architecture. MIEEs are mainly employed for developing distributed real-time interactive multimedia systems and performances with the aim of enhancing experience and fruition of content. They embed dynamic models of interaction and cross-modal mappings, and provide a paradigm for designing and organizing complex narrative structures and dealing with such complexity.

The advantages of such a paradigm for the performing arts and new media mainly relate to the availability of a unified conceptual framework for the design and implementation of complex, distributed performances. On the one hand, the framework allows the author of a complex performance to structure it in a number of components whose properties and relationships can be specified through the model. On the other hand, the developer of multimodal interactive systems for the performing arts can also benefit from the framework: given the design of a performance, the framework can help the developer to specify the requirements of the needed systems (i.e., what each component is supposed to do), their architecture (i.e., how the system components are organized and connected), and the kind of processing for which each system is responsible. Moreover, application of the MIEE paradigm is not limited to the performing arts, but ranges over a number of application scenarios including museum and cultural applications, therapy and rehabilitation, didactics and edutainment, and entertainment. Further research on MIEEs will be needed to fully understand the potential and implications of the paradigm. For example, as discussed in Section 3 above, the definition of the syntax and semantics of a graphical language for describing EMEs and the relationships among them is still an open issue.

Evaluation of MIEEs is also an open and critical aspect: tools and experiments are needed to assess how much MIEEs can really improve artistic experience and fruition of content. This aspect has been addressed at the DIST-InfoMus Laboratory and in the MEGA project through the design of "spectator interfaces" (i.e., interfaces for "measuring" spectators' and users' reactions and feelings when exposed to rich artistic stimuli, as in a MIEE). A particular focus has been on the analysis of spectators' engagement (see, e.g., Camurri et al., 2004a).

For some of the MIEEs discussed in this article, spectators' evaluations have been collected using experimental psychology techniques. Results generally confirm high acceptance of MIEEs by spectators and statistically measurable improvements in acceptance and engagement (see the article by Lindstrom et al. in this special issue).

Software tools are currently under development and testing to support the MIEE paradigm. In particular, MIEEs are going to be fully supported by the EyesWeb open platform. This is achieved by adding three main capabilities to EyesWeb: an efficient and effective sub-patching mechanism enabling the development of the component EMEs (or of parts of them) as separate sub-patches (i.e., subsets of an EyesWeb application); mechanisms for automatic and transparent distribution of EyesWeb applications across different physical locations through broadband network connections; and a further layer of processing at a higher level of abstraction (called "META-EyesWeb") able to supervise and schedule the execution of possibly distributed EyesWeb applications according to adaptive and dynamic narrative structures. While EyesWeb will provide support for developing the component EMEs and for the analysis of real subjects' behavior, META-EyesWeb will provide tools for designing and implementing the MIEE metaspaces and the interaction paradigms for the virtual subjects inhabiting them.
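As a rough, hypothetical sketch of the kind of supervision such a higher layer is intended to provide - not a description of the actual EyesWeb or META-EyesWeb interfaces - the following Python fragment shows a supervisor stepping through an adaptive narrative structure and sending activation messages to EMEs assumed to run on remote hosts. All class names, hosts and messages are invented here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class RemoteEME:
    """A handle to an EME application assumed to run on a remote host."""
    name: str
    host: str

    def send(self, command: str) -> None:
        # Stand-in for a real network call (e.g., an OSC or TCP message).
        print(f"[{self.host}] {self.name} <- {command}")

@dataclass
class Scene:
    """One step of the narrative: which EMEs are active, and a predicate
    on observed behaviour that decides when to advance."""
    active: List[str]
    done: Callable[[Dict[str, float]], bool]

def supervise(emes: Dict[str, RemoteEME], scenes: List[Scene],
              observations: List[Dict[str, float]]) -> None:
    """Step through the narrative, activating/deactivating remote EMEs."""
    scene_idx = 0
    for obs in observations:
        scene = scenes[scene_idx]
        for name, eme in emes.items():
            eme.send("activate" if name in scene.active else "deactivate")
        if scene.done(obs) and scene_idx < len(scenes) - 1:
            scene_idx += 1   # adaptive advance based on observed behaviour

if __name__ == "__main__":
    emes = {"stage": RemoteEME("stage", "10.0.0.2"),
            "foyer": RemoteEME("foyer", "10.0.0.3")}
    scenes = [Scene(["stage"], lambda o: o.get("engagement", 0) > 0.7),
              Scene(["stage", "foyer"], lambda o: False)]
    supervise(emes, scenes, [{"engagement": 0.4}, {"engagement": 0.9},
                             {"engagement": 0.5}])
```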

Acknowledgments

I wish to thank the scientific director of the DIST-InfoMus Laboratory, Professor Antonio Camurri, and my colleagues Paolo Coletta, Alberto Massari, Barbara Mazzarino, Massimiliano Peri, Matteo Ricchetti, Andrea Ricci and Riccardo Trocca. I also thank colleagues from the partner institutions who worked in the EU-IST Project MEGA, which partially supported this research. Some more recent developments of this research have been partially supported by the EU-IST Project TAI-CHI (Tangible Acoustic Interfaces in Computer Human Interaction).

Fig. 8. A museum modelled as a MIEE.



References

Bahorsky, R. (Ed.) (1998). Official Internet Dictionary. Houston: Government Institutes.

Bates, J. (1994). The role of emotions in believable agents. Communications of the ACM, 37(3), 122-125.

Benford, S., Snowdon, D., Colebourne, A., O'Brien, J. & Rodden, T. (1997). Informing the design of collaborative virtual environments. In: S.C. Hayne & W. Prinz (Eds.), GROUP'97: Proceedings of the ACM SIGGROUP Conference on Supporting Group Work. New York: ACM Press.

Benford, S., Greenhalgh, C., Reynard, G., Brown, C. & Koleva, B. (1998). Understanding and constructing shared spaces with mixed reality boundaries. ACM Transactions on Computer-Human Interaction, 5(3), 185-223.

Camurri, A. & Ferentino, P. (1999). Interactive environments for music and multimedia. Multimedia Systems, 7, 32-47.

Camurri, A., Coletta, P., Ricchetti, M. & Volpe, G. (2000). Expressiveness and physicality in interaction. Journal of New Music Research, 29(3), 187-198.

Camurri, A., Lagerlof, I. & Volpe, G. (2003a). Recognizing emotion from dance movement: Comparison of spectator recognition and automated techniques. International Journal of Human-Computer Studies, 59(1-2), 213-225.

Camurri, A., Mazzarino, B., Volpe, G., Morasso, P., Priano, F. & Re, C. (2003b). Application of multimedia techniques in the physical rehabilitation of Parkinson's patients. Journal of Visualization and Computer Animation, 14(5), 269-278.

Camurri, A., Mazzarino, B., Ricchetti, M., Timmers, R. & Volpe, G. (2004a). Multimodal analysis of expressive gesture in music and dance performances. In: A. Camurri & G. Volpe (Eds.), Gesture-based communication in human-computer interaction. Berlin: Springer Verlag.

Camurri, A., Mazzarino, B. & Volpe, G. (2004b). Expressive interfaces. Cognition, Technology & Work, 6(1), 15-22.

Guinn, I.C. & Biermann, A. (1993). Conflict resolution in collaborative discourse. Paper presented at the Computational Models of Conflict Management in Cooperative Problem Solving Workshop, 13th International Joint Conference on Artificial Intelligence (IJCAI), Chambéry, France.

Ishii, H. & Ullmer, B. (1997). Tangible bits: Towards seamless interfaces between people, bits and atoms. Paper presented at CHI'97, 22-27 March, Atlanta. Published in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. New York: ACM Press.

Machover, T. & Chung, J. (1989). Hyperinstruments: Musically intelligent and interactive performance and creativity systems. Paper presented at the International Computer Music Conference (ICMC89), Columbus.

Milgram, P. & Kishino, F. (1994). A taxonomy of mixed reality visual displays. IEICE Transactions on Information Systems, E77-D(12), 1321-1329.

Perez-Quinones, M. & Sibert, J.L. (1996). A collaborative model of feedback in human-computer interaction. Paper presented at the Conference on Human Factors in Computing Systems (CHI'96), Vancouver. Published in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Common Ground. New York: ACM Press.

Rinman, M.L. (2002). Forms of interaction in mixed reality performance: A study of the artistic event Desert Rain. Licentiate thesis, Royal Institute of Technology (KTH), Stockholm.

Rinman, M.L., Friberg, A., Bendiksen, B., Cirotteau, D., Dahl, S., Kjellmo, I., Mazzarino, B. & Camurri, A. (2004). Ghost in the Cave: An interactive collaborative game using non-verbal communication. In: A. Camurri & G. Volpe (Eds.), Gesture-based communication in human-computer interaction. Berlin: Springer Verlag.

Russell, S. & Norvig, P. (1995). Artificial intelligence: A modern approach. Chatham, NJ: Prentice-Hall.

Schaeffer, P. (1977). Traité des objets musicaux, 2nd edn. Paris: Éditions du Seuil.
