

Framing Tension for Game Generation

Phil Lopes, Antonios Liapis, Georgios N. Yannakakis
Institute of Digital Games, University of Malta, Msida, Malta

{louis.p.lopes; antonios.liapis; georgios.yannakakis}@um.edu.mt

Abstract

Emotional progression in narratives is carefully structured by human authors to create unexpected and exciting situations, often culminating in a climactic moment. This paper explores how an autonomous computational designer can create frames of tension which guide the procedural creation of levels and their soundscapes in a digital horror game. Using narrative concepts, the autonomous designer can describe an intended experience that the automated level generator must adhere to. The level generator interprets this intent, bound by the possibilities and constraints of the game. The tension of the generated level guides the allocation of sounds in the level, using a crowdsourced model of tension.

Introduction

Several computationally creative systems have stood at the interplay of different multidisciplinary creative domains. It should not come as a surprise, therefore, that several projects in computational creativity tackle the transformation of data from one domain to another, e.g. images to soundscapes (Johnson and Ventura 2014), news articles to collages (Krzeczkowska et al. 2010), academic papers to songs and their lyrics (Scirea et al. 2015), and text descriptions to player abilities (Cook and Colton 2014), to name a few. Due to the dissimilarities between source and target creative domains, such computational systems must learn to creatively interpret the patterns of the input, and work towards making them apparent in the output while still obeying the constraints and the expressivity of the target creative domain (e.g. a limited color palette).

In this context, digital games are particularly relevant as a multi-faceted medium where visuals, audio, narrative and rule- and level-design come together in an interactive experience (Liapis, Yannakakis, and Togelius 2014). Not only must these creative domains go well together, but they must provide players with an enjoyable experience: depending on the genre, this experience can be, for instance, frantic in “bullet hell” action games, relaxing in exploration games, or tense in horror games (Ekman and Lankoski 2009).

When drawing inspiration from dissimilar creative domains, it is important to find the right patterns to replicate (or re-interpret) in the creative output of the system. While systems can look at structural similarities and associations (Grace, Gero, and Saunders 2012), a promising approach is to identify the intentions of the creator of one artefact and attempt to match those intentions in the artefact of the other domain. Towards that outcome, having access to a frame of reference for the intentions going into the creative act is ideal. Framing information, as suggested in the FACE model of Colton, Charnley, and Pease (2011), can be provided by the creative system itself as “a piece of natural language text that is comprehensible by people”. Such framing information can clarify the intentions of the system in its design choices and can make its creativity more easily perceptible (Colton 2008). Moreover, the framing information can act as a guide when transforming media generated by such a creative system into different media.

In the context of digital games, a human game designer’s primary concern and frame of reference is the intended player experience. In most games, the intended player experience affects all design decisions: from the color palette to the responsiveness of the controls and from the sound effects for rewards to the back-story presented in an introductory cut-scene. Taking a successful horror game such as Amnesia: The Dark Descent (Frictional Games 2010) as an example, the intended player experience is one of dread, of imminent tragedy, of confusion and constant second-guessing of players’ perception and actions. Towards this experience, the visuals include dark colors and dim lights, the audio focuses on ambient noises which foreshadow monsters, the level design has narrow corridors and low visibility, while the game rules preclude any way to combat monsters.

This paper extends the Sonancia creative system (Lopes, Liapis, and Yannakakis 2015a; 2015b) by providing the software with the capacity to choose and describe the intended player experience, which is then used to generate game levels and their soundscapes for a horror game. The ability of the computational designer to describe its intentions in clear text to a human audience is paramount in the perception of creativity. Moreover, the system can then create the frame (as the progression of tension) via evolutionary search driven by several fitnesses targeting specific narrative structures. The paper includes several examples of generated frames and their corresponding levels and soundscapes. As an additional contribution to earlier work, the current version of Sonancia uses a crowdsourced model of tension to allocate sounds to the level in a way that more closely matches the human perception (or ground truth) of tension.

Proceedings of the Seventh International Conference on Computational Creativity, June 2016

[Figure 1 panels: Framing Information (randomly selected, e.g. Cliffhanger) → Tension Curve (artificially evolved) → Horror Game Level (artificially evolved) → Level Sonification (deterministic).]

Figure 1: The creative process of the Sonancia system described here: a randomly selected tension frame is used to evaluate an evolving tension progression (the intended tension curve). Once complete, the final intended tension curve guides the evolution of a level generator which attempts to place monsters and items to match the tension curve. Finally, the evolved level and its derived tension curve are used to deterministically allocate sounds based on a crowdsourced model of aural tension. The core innovations of this paper are the framing information and the evolved tension curve (the first two modules); the level generator was first described in (Lopes, Liapis, and Yannakakis 2015a), while sonification has been improved via crowdsourced models.

Background

Sonancia attempts to blend different game facets (in this paper: level design, audio and narrative). This section covers related work on blending, focusing on the audio facet.

Blending Game Facets

Digital games are a medium combining different creative facets: visuals, audio, narrative, ludus, level architecture and game-play; these facets complement each other to create specific kinds of interactive experiences (Liapis, Yannakakis, and Togelius 2014). While designing content for each facet is a creative task, blending the different facets is of utmost challenge and promise within computational creativity (Lopes and Yannakakis 2014). Game generation systems like Angelina (Cook, Colton, and Pease 2012) and Game-o-matic (Treanor et al. 2012) extensively explore how different facets of games can be combined to create interesting and thought-provoking experiences. Commercial games (designed and fine-tuned by humans) tend to blend either their rules (ludus) or level design (architecture), in the case of e.g. action-RPGs. However, suggestions for automating such blends creatively have been put forth (Gow and Corneli 2015). Blends between audio and gameplay have been explored in AudioInSpace, where the shooting mechanics change according to the background music, which can be hand-authored (loaded from a music library) or artificially evolved (Hoover et al. 2015). Similar studies have focused on blending audio and narrative in order to foreshadow upcoming story events via sound (Scirea et al. 2014).

The current paper builds upon and extends earlier work on Sonancia (Lopes, Liapis, and Yannakakis 2015a; 2015b), by allowing it to autonomously decide on an emotional progression through framing inspired by narrative structures, and by applying a crowdsourcing methodology for the emotional evaluation of sounds in the sonification audio library.

Sound and User Experience

When effectively used, audio has the potential of enhancing the player experience by fully immersing the player within a virtual world (Collins 2013). This property is especially important within the genre of horror, in which particular audio patterns such as musical foreshadowing, the absence of noise, or even a rise of tempo, volume and pitch can elicit stressful experiences for players (Garner, Grimshaw, and Nabi 2010; Ekman and Lankoski 2009). These audio patterns are successful in eliciting intense affective responses if they are well interwoven with the design of the game levels. Earlier work of the authors explored how this could be achieved by sonifying levels based on a common progression of tension (Lopes, Liapis, and Yannakakis 2015a). In those studies each sound asset was given an empirical measure of how tense that particular sound was perceived to be, allowing the Sonancia system to effectively place sounds that accommodate the rise and fall of tension during play (Lopes, Liapis, and Yannakakis 2015b).

Inspired by earlier success of crowdsourcing for annotating highly subjective notions such as game aesthetics (Shaker, Yannakakis, and Togelius 2013), the previous Sonancia system (Lopes, Liapis, and Yannakakis 2015b) is extended via crowdsourcing of annotations on tension for sound samples. Such annotations can be used to derive more accurate data-driven computational models of tension in horror games, and offer Sonancia a human-verified, objective and more reliable way to select and place sounds to create spooky, tense soundscapes.

Methodology

Sonancia consists of several generative modules working as a pipeline (see Fig. 1): each generator restricts and guides the type of content which can be created in the next generative step, and with each step the content becomes more refined. The final result is a complete horror game, where players must reach a specific room within a haunted mansion while avoiding terrifying monsters along the way (see Fig. 2a). Players do not have weapons and must avoid direct confrontation with monsters; monsters thus act as an instigator of tension and fear, regardless of the player’s skill.

Levels in Sonancia are generated via evolutionary computation guided by intended tension frames, evolved previously. Every Sonancia level consists of rooms connected by doors; rooms can contain monsters to be avoided, and one room contains the objective which must be reached to complete the level (see Fig. 2a). The level is characterized by its critical path, which is the shortest sequence of rooms (i.e. shortest path) between the player’s starting room and the room with the objective.

The version of Sonancia presented in this paper consists of three different generative modules (see Fig. 1): the framing of tension to a randomly chosen narrative property, the game level generation, and the level sonification module. The details of each module are presented below.

Framing Tension

To clearly explain how levels are created in Sonancia, it is important to first define how the designer intention is represented within the system. The frame for the task of horror game generation is provided by an intended tension curve, which consists of a 2D representation of how tension rises and falls as the player progresses along the critical path (see Fig. 2b). In other words, the intended tension curve portrays the ideal player experience when going through the level.

This paper specifically explores how an autonomous creative system can provide a frame to the level generation process by creating different intended tension curves. For horror games, we focus on a frame of tension as an amalgam of the predominant emotions within the horror genre (Ekman and Lankoski 2009): fear, anxiety and stress.

Evolving Intended Tension Curves: The intended tension curves are created via a genetic algorithm (GA), driven by one or more aesthetics of narrative progression. The GA allows for flexibility and creativity when defining the curve, but pushes it towards specific shapes. The tension curve is represented as an array of values between 0 and 3 (in increments of 0.25), where the array index is the room in the order of the critical path, while each value of the array is the specific tension value. Evolution applies a roulette wheel selection mechanism with one-point crossover (Mitchell 1998). After recombination each offspring has a 20% chance of mutating, i.e. incrementing or decrementing a single value in the array by 0.25 (provided the result is within 0 and 3). The GA runs for 100 generations with a population of 100 individuals, each initialized with random tension values.
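As an illustration only, the GA described above could be sketched in Python as follows. The function names, the random seed, the single-offspring crossover and the absence of elitism are assumptions of this sketch and are not specified in the paper; the `fitness` argument stands for any of the narrative fitnesses and is assumed to return a non-negative score.

```python
import random

STEP, MAX_TENSION = 0.25, 3.0  # tension values lie in [0, 3] in steps of 0.25


def random_curve(n_rooms, rng):
    """One tension value per room on the critical path, chosen at random."""
    levels = int(MAX_TENSION / STEP)
    return [rng.randint(0, levels) * STEP for _ in range(n_rooms)]


def roulette_select(population, scores, rng):
    """Fitness-proportionate selection; epsilon avoids an all-zero wheel."""
    weights = [s + 1e-9 for s in scores]
    return rng.choices(population, weights=weights, k=1)[0]


def one_point_crossover(a, b, rng):
    """Head of parent a joined with tail of parent b (assumes len(a) >= 2)."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]


def mutate(curve, rng, rate=0.2):
    """With 20% chance, shift one value by +/-0.25 if it stays within [0, 3]."""
    child = list(curve)
    if rng.random() < rate:
        i = rng.randrange(len(child))
        candidate = child[i] + rng.choice((-STEP, STEP))
        if 0.0 <= candidate <= MAX_TENSION:
            child[i] = candidate
    return child


def evolve_curve(fitness, n_rooms=8, pop_size=100, generations=100, seed=0):
    """Evolve an intended tension curve and return the fittest individual."""
    rng = random.Random(seed)
    population = [random_curve(n_rooms, rng) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(c) for c in population]
        population = [
            mutate(
                one_point_crossover(
                    roulette_select(population, scores, rng),
                    roulette_select(population, scores, rng),
                    rng,
                ),
                rng,
            )
            for _ in range(pop_size)
        ]
    return max(population, key=fitness)
```

Because every value starts as a multiple of 0.25 and mutation only adds or subtracts 0.25, the evolved curve always stays on the same discrete grid.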

[Figure 2 panels: (a) Example 2D level; (b) Tension curves of Fig. 2a; (c) Playable Sonancia level in 3D (curated).]

Figure 2: Example of a Sonancia “haunted manor” level in 2D (Fig. 2a) and 3D (Fig. 2c). In Fig. 2a, the room with the diagonal lines is the starting room, red rectangles are doors, green triangles are monsters, the blue square is the objective and the black arrow is the critical path (the shortest path between the starting room and objective). The critical path creates a level tension curve (grey) in Fig. 2b which must closely match the intended tension curve (black).

Evaluating Intended Tension Curves: Eight different fitness functions are encoded into the system, inspired by narrative structures and normalized to [0, 1]. The Escalating and Decreasing Tension fitnesses reward individuals with rooms that have a higher or lower tension value than the previous room, respectively. The Resting Point fitness rewards individuals with the deepest tension ‘valley’, while the Surprising Moment fitness rewards the height of the highest ‘peak’. The Cliffhanger fitness rewards tension curves with at least one peak, where the last room’s tension is higher than any of the peaks. The Denouement fitness gives high values to individuals if the highest peak is close to the final room (but is not the final room). The Unresolved Tension fitness rewards consecutive rooms with the same tension. Finally, the Rising & Falling Tension fitness is proportionate to the number of peaks in the tension curve. Among these eight fitnesses, one is chosen randomly to generate the appropriate tension frame. To increase the expressivity of generated frames, the system can also choose two fitnesses and apply an “Or” or “And” operator which sums or multiplies, respectively, the individual fitness scores.
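The paper does not give formulas for these fitnesses, so the following Python sketch only illustrates the idea for two of them and for the “Or” and “And” operators. The all-or-nothing Cliffhanger score, the fraction-based Escalating score and all function names are assumptions of this sketch; a peak is taken to be a room whose tension exceeds that of both neighbours.

```python
def peaks(curve):
    """Indices of rooms whose tension is higher than both neighbours."""
    return [
        i
        for i in range(1, len(curve) - 1)
        if curve[i] > curve[i - 1] and curve[i] > curve[i + 1]
    ]


def cliffhanger(curve):
    """1.0 if there is a peak and the last room is tenser than every peak."""
    found = peaks(curve)
    if not found:
        return 0.0
    return 1.0 if curve[-1] > max(curve[i] for i in found) else 0.0


def escalating(curve):
    """Fraction of rooms whose tension is higher than the previous room."""
    rises = sum(1 for a, b in zip(curve, curve[1:]) if b > a)
    return rises / (len(curve) - 1)


def combine_or(f, g):
    """'Or' operator: sum of the two fitness scores."""
    return lambda curve: f(curve) + g(curve)


def combine_and(f, g):
    """'And' operator: product of the two fitness scores."""
    return lambda curve: f(curve) * g(curve)


# Hypothetical curve: starts high, peaks in room 6, ends even higher.
example = [3.0, 1.0, 0.5, 0.5, 1.0, 2.0, 1.5, 2.25]
print(peaks(example), cliffhanger(example))
```

Note that the first room can never register as a peak under this definition, since it has only one neighbour.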

The Level Tension Curve: Each level derives a tension curve from the distribution of monsters on the level’s critical path: this process generates the level tension curve. Going through each room on the critical path, the level tension curve increases tension by 1 if the room contains a monster; if the room has no monster the tension decreases by 0.5 (to a minimum of 0) to simulate the players relaxing after a stressful event. Figure 2b shows the level tension curve for the level of Fig. 2a.
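A minimal Python sketch of this rule is given below. It assumes that tension starts at 0 before the first room and that the input is a list of booleans, one per room on the critical path; both are assumptions of this sketch rather than details stated in the paper.

```python
def level_tension_curve(monsters_on_path):
    """Tension per room: +1 for a monster, otherwise -0.5 (never below 0)."""
    curve = []
    tension = 0.0  # assumed starting tension
    for has_monster in monsters_on_path:
        if has_monster:
            tension += 1.0
        else:
            tension = max(0.0, tension - 0.5)
        curve.append(tension)
    return curve


# Empty starting room, three monster rooms, then four empty rooms.
print(level_tension_curve([False, True, True, True, False, False, False, False]))
```

For the monster layout in the usage line, the rule yields a rise to a single peak in the fourth room followed by a gradual decay of 0.5 per room.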

Level Generation

To create levels that adhere to the frame of intended tension, a search-based PCG approach was chosen (Togelius et al. 2011). The level generation process has been described in (Lopes, Liapis, and Yannakakis 2015a), but a high level description is included in this paper for the sake of completeness. The level layout is represented as an array of integers (each integral value corresponding to a room’s identifier or ID), while the doors are represented by their connecting rooms’ IDs and monsters or objective by the ID of the room they are in and their type. Mutations allow a gene to change the level layout (pushing walls or splitting rooms), or to add, remove and move doors, monsters or the objective. Crossover is omitted due to its disruptive nature.

Levels construct their own version of the tension curve (i.e. the level tension curve), and the fitness function rewards levels whose curve more closely matches the intended tension curve. The fitness function is the average distance between level and intended tension curve (see Fig. 2b). If the level has fewer rooms on the critical path then the intended tension curve is scaled, while if it has more rooms then it receives a minimal fitness, as the intended tension curve acts as a constraint on room number. In addition, the fitness function also calculates the number of unique rooms visited between the start (room ID 0) and all “dead-end” rooms and subtracts the number of rooms with no doors (as those cannot be visited). More details on the evolutionary algorithm and objectives can be found in (Lopes, Liapis, and Yannakakis 2015a).
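The tension-matching part of this fitness can be sketched as below. This is a hedged illustration only: it covers the tension term and not the room-connectivity term, and the linear interpolation used to scale the intended curve, the mapping of distance to a “higher is better” score, and the names `resample` and `tension_fitness` are all assumptions of this sketch.

```python
def resample(curve, n):
    """Linearly interpolate a curve to n points (assumed scaling method)."""
    if n == 1:
        return [curve[0]]
    out = []
    for k in range(n):
        pos = k * (len(curve) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(curve) - 1)
        frac = pos - lo
        out.append(curve[lo] * (1 - frac) + curve[hi] * frac)
    return out


def tension_fitness(level_curve, intended_curve, max_tension=3.0):
    """Score in [0, 1]; 1 means the level curve equals the intended curve."""
    if not level_curve or len(level_curve) > len(intended_curve):
        return 0.0  # too many rooms on the critical path: minimal fitness
    target = intended_curve
    if len(level_curve) < len(intended_curve):
        target = resample(intended_curve, len(level_curve))
    mean_dist = sum(abs(a - b) for a, b in zip(level_curve, target)) / len(level_curve)
    return max(0.0, 1.0 - mean_dist / max_tension)
```

Treating an over-long critical path as minimal fitness mirrors the statement that the intended curve acts as a constraint on the number of rooms.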

Level Sonification

Level sonification in the Sonancia system consists of allocating specific audio pieces within the level, based on the level tension curve. The goal of sonification is to have sounds which match the tension of the room, i.e. rooms with monsters will have scarier associated music; this is different from (Lopes, Liapis, and Yannakakis 2015b), which used sonification for suspense as the reverse of tension. Sonancia includes a soundbank of human-authored recordings with an average length of 7 seconds. To accurately map sound assets to specific values of tension, a crowdsourcing experiment was conducted to obtain an approximation of how tense the different sounds are compared to each other.

The Sound Library: The Sonancia sound library currently contains 97 different sound assets, recorded by human authors via the FM8 (Native Instruments 2006) tool and the Reaper (Cuckos 2005) digital audio workstation. To maintain a large, yet feasible number of samples for crowdsourced annotation, we undersampled sounds from the library based on their “pitch” and “loudness”.

According to Garner, Grimshaw, and Nabi (2010), loud (i.e. power) and high-pitched sounds tend to trigger fearful emotions. Based on this finding, we plotted (see Fig. 3) each audio asset according to the ΔdB value (loudness) and the average power of frequencies above 5 kHz (high pitch). For the crowdsourcing experiment presented in this paper we selected the 40 sounds (out of the 97 available) with the highest average Euclidean distance between them along pitch and loudness (see Fig. 3).
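One plausible reading of this selection step is sketched below in Python: each sound is scored by its average Euclidean distance to every other sound in the (pitch, loudness) plane, and the top-scoring sounds are kept. The paper does not state the exact procedure, so this interpretation, the function name `select_diverse` and the lack of feature normalization are assumptions of this sketch.

```python
import math


def select_diverse(features, k):
    """Indices of the k points with the highest mean distance to the rest.

    features: list of (high_pitch_power, loudness_delta) pairs, one per sound.
    """
    def average_distance(i):
        others = [f for j, f in enumerate(features) if j != i]
        return sum(math.dist(features[i], f) for f in others) / len(others)

    ranked = sorted(range(len(features)), key=average_distance, reverse=True)
    return sorted(ranked[:k])


# Toy data: three clustered sounds and one outlier.
print(select_diverse([(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)], 2))
```

With the real library this would be called with 97 feature pairs and `k=40`; if the two axes have very different ranges, they would likely need rescaling first.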

Figure 3: Scatter plot of the entire Sonancia sound library. High pitch frequencies are between 5 and 22 kHz, while volume difference is between the maximum and minimum dB values of the sound. Triangles and circles are, respectively, selected and unselected audio samples.

Crowdsourcing Tension: A survey was conducted to derive an approximate value of reported tension for each sound asset in the library¹. For annotating the tension value of sounds we adopt a rank-based approach due to its evidenced effectiveness for highly subjective notions such as affect and emotion (Yannakakis and Hallam 2011). Human annotators were presented with pairs of sounds selected randomly and were asked to report which sound in each pair is more tense via a 4-alternative forced choice questionnaire (Yannakakis and Hallam 2011). Annotators could listen to the two selected sounds as many times as they desired. At the time of writing, 452 pairs of sounds have been ranked by tension. While this is a smaller number than the 780 possible pairings, the sound pairs were randomized and thus all sounds were annotated for at least half of the possible pairings; some insight on every sound’s tension properties can be gleaned even with the limited data.

¹ sonancia.institutedigitalgames.com

The 40 sounds are ranked based on the human-annotated tension preferences. The global order of sound tension is derived through the pairwise preference test statistic (Yannakakis and Hallam 2011), which is calculated as $P_i = (\sum_{j}^{N} z_{ij})/N$, where $z_{ij}$ is the tension preference score of $i$ in the pair of sounds $i$ and $j$ ($z_{ij}$ is $+1$ if sound $i$ is preferred, $-1$ if sound $j$ is preferred, and $0$ if no sound is preferred or there is no annotation); $N$ is the total number of sounds. The obtained tension preference scores $P$ define the global order (rank) of each sound with respect to tension.
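To make the statistic concrete, the following Python sketch computes the preference scores and the resulting global order from a list of pairwise annotations. The annotation format `(i, j, choice)`, the assumption of at most one annotation per pair, the symmetric treatment of the two sounds in a pair, and the function names are assumptions of this sketch.

```python
def preference_scores(n_sounds, annotations):
    """Return P_i = (sum_j z_ij) / N for every sound i.

    annotations: iterable of (i, j, choice), where choice is +1 if sound i
    was judged more tense, -1 if sound j was, and 0 for no preference.
    Pairs without an annotation contribute 0, as in the definition.
    """
    totals = [0.0] * n_sounds
    for i, j, choice in annotations:
        totals[i] += choice  # z_ij
        totals[j] -= choice  # z_ji, assumed to mirror z_ij
    return [t / n_sounds for t in totals]


def global_order(scores):
    """Sound indices sorted from most to least tense."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy example with three sounds: 0 beats 1 and 2, and 2 beats 1.
scores = preference_scores(3, [(0, 1, +1), (0, 2, +1), (1, 2, -1)])
print(global_order(scores))
```

Dividing by the total number of sounds rather than by the number of annotated pairs means that sparsely annotated sounds are pulled towards a score of 0.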

Audio Allocation and Mixing: Audio allocation consists of placing sound assets in each room of a level, based on the level tension curve and the tension preference score of each sound in the library. The system picks sounds equidistantly from the global order (in descending tension preference score) depending on the total number of rooms (not only those in the critical path). A sound is assigned to each room so that the room’s tension value matches the global order of sound tension. The process starts with the most tense sound, which is allocated to the room with the highest tension, and it continues until no more rooms (or sounds) are available and each room has a unique sound. Higher ranked sounds with respect to tension are prioritized for rooms on the critical path. For rooms with equal tension values, the first room in the critical path gets the more tense sound.

The mixing algorithm controls how the sounds are played in the game. Audio mixing uses the player’s distance from a neighbouring room to adjust the volume of the contribution from each neighbouring room’s sound. This mixing rule allows players to hear sounds from neighbouring rooms, offering a sense of foreshadowing.
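The audio allocation rule (the step before mixing) can be sketched in Python as follows. The integer-division step used to pick ranks “equidistantly”, serving all critical-path rooms before the remaining rooms, and the function names are assumptions of this sketch; the paper does not spell out these details.

```python
def pick_ranks(n_sounds, n_rooms):
    """Ranks (1 = most tense) spaced evenly through the global order."""
    step = max(1, n_sounds // n_rooms)  # assumed spacing rule
    return [1 + k * step for k in range(n_rooms) if 1 + k * step <= n_sounds]


def allocate(path_tension, other_tension, n_sounds=40):
    """Map each room to the rank of the sound it receives.

    path_tension: tension per room, in critical-path order.
    other_tension: tension per room for rooms off the critical path.
    Keys are ("path", index) or ("other", index).
    """
    ranks = pick_ranks(n_sounds, len(path_tension) + len(other_tension))
    # Highest tension first; ties go to the earlier room on the path.
    path_order = sorted(range(len(path_tension)), key=lambda i: (-path_tension[i], i))
    other_order = sorted(range(len(other_tension)), key=lambda i: (-other_tension[i], i))
    rooms = [("path", i) for i in path_order] + [("other", i) for i in other_order]
    return dict(zip(rooms, ranks))  # stops when rooms or sounds run out


# Eight critical-path rooms plus five hypothetical side rooms.
result = allocate([0.0, 1.0, 2.0, 3.0, 2.5, 2.0, 1.5, 1.0], [0.0] * 5)
print([result[("path", i)] for i in range(8)])
```

In the usage line, the five side rooms are a hypothetical choice that gives a spacing of three ranks between consecutive sounds; the tensest room receives the top-ranked sound and tied rooms are resolved in favour of the one visited first.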

Experiments

This section describes results obtained from Sonancia’s entire process, from creating a framing of tension to generating levels based on this frame and finally sonifying the level. The goal is to evaluate, in a qualitative way, how the different generators interpret (in a tension graph, level structure or sound sequence) a frame of increasing detail created by the previous generative step in the pipeline of Fig. 1. The discussion of results assesses how accurately, for instance, the levels match the tension curves and where the limitations of one generative domain lead to a creative transformation of the other domain’s data.

The system ran independently 40 times, where the framing fitnesses were selected (and often combined) by the system without human intervention. Once a framing fitness is selected, intended tension curves are evolved for 100 generations in 20 independent runs; the fittest one among these runs is selected to guide level generation. Level generation performed 20 independent runs for 100 generations using the intended tension curve found previously. For brevity, we discuss the fittest individuals (tension curves, levels and soundscapes) for four chosen generated frames; these provide the most varied and interesting results. The system provided the highlighted frames in text as follows:

1. “I want an experience with a denouement.”

2. “I want an experience with a cliffhanger.”

3. “I want an experience with both a surprising moment and a point of rest.”

4. “I want an experience with decreasing tension or a cliffhanger.”

The following sections describe (in the above order) the tension curves, levels and soundscapes created following these computer-generated frames of tension.

Framing Denouement

Denouement (or conclusion) is encoded aesthetically as a fitness function which rewards when the highest peak in the tension curve is near the last room (but not the last room, as that would not form a ‘peak’ per se). Observing the fittest intended tension curve in Figure 4c, the intended tension curve (in black) matches this specification as the highest peak (at 2.5) is on the 7th room out of 8 rooms on the critical path.

Level Generation: Figure 4a shows the fittest level for the intended tension curve discussed above; its level tension curve is shown in Fig. 4c, in grey. It is immediately obvious that the level tension curve does not match the intended one closely, although it does have a single peak at room 4 and a denouement of 4 rooms after that (rooms 5-8). The level cannot match the intended tension curve since e.g. monsters always add 1 to the tension and decay does not allow the ‘constant’ tension between rooms 3 and 4 or the quick drops of rooms 5 and 6. Instead, evolution attempts to balance the tradeoffs between monsters and tension decay, by adding or removing monsters in specific rooms. The result in Fig. 4a contains 3 monsters in the first three rooms after the starting one, and then no monsters until the objective room, allowing the player to relax. The biases and constraints of the level generation forced evolutionary search to interpret the intended tension curve to the best of its ability; the level tension curve does exhibit denouement, albeit lasting longer.

[Figure 4 panels: (a) Best level for Denouement; (b) Best level for Cliffhanger; (c) Tension for Denouement; (d) Tension for Cliffhanger.]

Figure 4: Haunted mansions and their intended and actual (level) tension curves for single aesthetics.

Room  1     2     3     4     5     6     7     8
Rank  22    16    7     1     4     10    13    19
P     0.04  0.13  0.44  0.79  0.67  0.24  0.19  0.05

(a) Denouement

Room  1     2     3     4     5     6     7     8
Rank  10    16    19    22    13    4     7     1
P     0.24  0.13  0.05  0.04  0.19  0.67  0.44  0.79

(b) Cliffhanger

Table 1: Sound selection for the Denouement (Table 1a) and Cliffhanger (Table 1b) levels. The sounds’ corresponding rank position with respect to tension (Rank) and tension preference score (P) are also presented.

Proceedings of the Seventh International Conference on Computational Creativity, June 2016

Level Sonification: Table 1a shows the distribution of different audio assets and their respective rank value within the level of Fig. 4a. It is important to note that sonification will always follow the level tension curve, to create a soundscape coherent with the current level. In this instance the algorithm places the highest ranked sound at the tension peak (i.e. room 4). Room 3 has a higher ranked sound than room 6 even though they have the same tension values, as it occurs first on the critical path. A game-play video of this level is available online (footnote 2).
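The allocation just described (most tense sound at the peak, ties resolved in favour of the earlier room) can be sketched as follows. The library size of 22 and the even spacing of ranks are assumptions inferred from the ranks listed in Table 1a, and the tension values are hypothetical numbers chosen to agree with the text (peak at room 4, rooms 3 and 6 equal); the function name is also hypothetical.

```python
from typing import List


def allocate_sound_ranks(tension: List[float], library_size: int = 22) -> List[int]:
    """Assign one sound rank (1 = most tense) to every room.

    Rooms are ordered from most to least tense; ties go to the room that
    comes first on the critical path. Ranks are then spread evenly over the
    library, an assumption matching the 1, 4, 7, ... pattern of Table 1a.
    """
    rooms = len(tension)
    step = (library_size - 1) // (rooms - 1) if rooms > 1 else 1
    # sorted() is stable, so rooms with equal tension keep critical-path order.
    order = sorted(range(rooms), key=lambda r: -tension[r])
    ranks = [0] * rooms
    for position, room in enumerate(order):
        ranks[room] = 1 + position * step
    return ranks


# Hypothetical level tension values for rooms 1 to 8 of the Denouement level.
level_tension = [0.0, 1.0, 2.0, 3.0, 2.5, 2.0, 1.5, 1.0]
print(allocate_sound_ranks(level_tension))
# → [22, 16, 7, 1, 4, 10, 13, 19]
```

Relying on a stable sort keeps the tie-breaking rule implicit: no explicit comparison of room positions is needed.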

Framing the Cliffhanger

The cliffhanger is encoded aesthetically as a fitness function which rewards tension curves with at least one peak, where the last room’s tension is higher than any of the peaks (acting, thus, as the cliffhanger). The fittest intended tension curve in Figure 4d (in black) matches this specification, as the tension peaks in the 6th room (with a value of 2.0) but the final room is even more tense (2.25). It should be noted that the curve starts at the highest value (3.0) in room 1; the first room does not register as a peak, since peaks compare tension values with both neighbours.
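A minimal sketch of this check is given below. It is a simplification: the binary score and the function names are assumptions, and only the values for rooms 1, 6 and 8 of the example curve (3.0, 2.0 and 2.25) come from the text, while the remaining values are hypothetical.

```python
from typing import List


def find_peaks(curve: List[float]) -> List[int]:
    """Rooms that are strictly more tense than both neighbours (ends excluded)."""
    return [
        i
        for i in range(1, len(curve) - 1)
        if curve[i - 1] < curve[i] > curve[i + 1]
    ]


def cliffhanger_fitness(curve: List[float]) -> float:
    """Hypothetical binary score: 1.0 if the curve has at least one peak and
    the last room is more tense than every peak, otherwise 0.0."""
    peaks = find_peaks(curve)
    if not peaks:
        return 0.0
    return 1.0 if curve[-1] > max(curve[i] for i in peaks) else 0.0


# Room 1 holds the highest value (3.0) but has no left neighbour, so it is
# not a peak; the only peak is room 6 (2.0), which room 8 (2.25) exceeds.
intended_curve = [3.0, 1.0, 0.75, 0.75, 1.5, 2.0, 1.75, 2.25]
print(find_peaks(intended_curve), cliffhanger_fitness(intended_curve))
```

The example shows why a curve may start at the maximum tension and still satisfy the cliffhanger frame.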

Level Generation: Figure 4b shows the fittest level for the intended tension curve discussed above; its level tension curve is shown in Fig. 4d, in grey. Unlike denouement, the level tension curve closely matches the intended one for the cliffhanger aesthetic. Both curves start at the maximum possible tension (for levels, this is 1 if there is a monster in the first room) and then drop the tension in the next rooms, only to increase it around rooms 5 and 6, culminating at the highest value (ignoring the first room in the intended curve) in room 8. Interestingly, the level curve drops to 0 in rooms 3 and 4 as it cannot maintain the near-identical tension of the intended curve (due to tension decay).

The result in Fig. 4b has 4 monsters on the critical path, distributed near the start and end of this path. This causes an initial tense moment for the players when they start the level, then lets them relax with 3 empty rooms, reach a climax after two monsters and release some tension in the next-to-last room, only to find a monster in the room with the objective.

Level Sonification: Table 1b contains the sounds allocated along the critical path of the level in Fig. 4b. As the level tension curve closely matches the intended curve, sonification largely matches the original frame as well. While rooms 2 to 5 have sounds that rank low in tension, this changes swiftly with tense sounds which culminate in the most tense sound in the last room. A game-play video of this level is available online (footnote 3).

Frame of Surprising Moments and Resting Points

When combining fitness functions, the “and” combination forces both fitnesses to have high scores: in this case, the surprising moment aesthetic rewards high ‘peaks’ while the resting point aesthetic rewards deep ‘valleys’. Indeed, both aesthetics are present in the intended tension curve of Fig. 5c as it exhibits the highest peak (height of 3) and the lowest possible valley (depth of 3, considering the tallest adjacent peak). The aggressive changes in tension were expected, as both fitnesses directly reward high peaks and deep valleys;

2 https://youtu.be/IJQFqxfHqY8
3 https://youtu.be/z5R12NPVVFA

Figure 5: Haunted mansions and their intended and actual (level) tension curves for combined aesthetics. Panels: (a) best level for Surprise and Resting Point; (b) best level for Decreasing or Cliffhanger; (c) tension for Surprise and Resting Point; (d) tension for Decreasing or Cliffhanger.

(a) Surprising Moments and Resting Points

Room   1     2     3     4     5     6     7     8
Rank   7     16    22    10    1     4     13    19
P      0.44  0.13  0.04  0.24  0.79  0.67  0.19  0.05

(b) Decreasing Tension or a Cliffhanger

Room   1     2     3     4     5     6     7     8
Rank   16    4     10    19    22    13    1     7
P      0.13  0.67  0.24  0.05  0.04  0.19  0.79  0.44

Table 2: Sound selection for the Surprising Moments and Resting Points (Table 2a) and Decreasing Tension or a Cliffhanger (Table 2b) levels. The sounds’ corresponding rank position with respect to tension (Rank) and tension preference score (P) are also presented.

their combination unsurprisingly causes tension to soar from a value of 0 to 3 within the span of two rooms. In many other runs, the fittest individuals contained adjacent rooms with tension values of 0 and 3 (or vice versa).

Level Generation: Figure 5a shows the fittest level for the intended tension curve discussed above; its level tension curve is shown in Fig. 5c, in grey. The level tension curve matches the intended tension curve as closely as possible given the constraints of the way it is computed. The fact that each room can have only one monster (which increases tension by 1) causes the peak of room 5 after the resting point in room 3 to have lower tension values than intended. The structure of the level tension curve retains both a resting point at room 3 and the surprising moment at room 5, and thus matches the provided frame. Of interest is the observation that, unlike the intended curve, the last room in the level does not contribute to an increase in tension, since adding a monster there (getting the tension value to 2) would cause more deviation from the intended value (1.25).

The level of Fig. 5a has 3 monsters on the critical path, placed primarily midway to the objective. This yields a high spike (surprising moment) in room 5 after encountering two monsters. The starting room also has a monster, in order to let players relax in the next two rooms and thus reach the resting point (before more stressful events) in room 3.

Level Sonification: Table 2a contains the sounds allocated along the critical path of the level in Fig. 5a. Obviously, the most tense sound is placed on the surprising moment (room 5) which matches both the intended and the level tension curve; similarly, the resting point has the least tense sound as per the provided frame. Due to the similarity of the level curve with that of Fig. 4c, tense sounds are allocated in a somewhat similar fashion with a slight change in the first rooms. A game-play video of this level is available online (footnote 4).

Framing Decreasing Tension or a Cliffhanger

Combining fitness functions with an “or” in this system adds the two fitness scores together. This will still reward the presence of both features but since it is less aggressive than multiplying the scores (as in “and”), it may reward either fitness equally. The fittest tension curve in Fig. 5d, for instance, does not have the cliffhanger pattern (although partially it does exhibit a peak in room 7) but has predominantly a decreasing tension. The cliffhanger and decreasing tension are conflicting objectives, as the former rewards an increase in tension both for the presence of a peak and for the final room. Therefore, the intended curve in Fig. 5d attempts to balance between the two by predominantly having a decreasing tension while also having a peak (which is rewarded, partially, by the cliffhanger fitness). Thus, the less aggressive search of the “or” operator is demonstrated.
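The difference between the two operators can be seen in a small sketch. Multiplying for “and” and adding for “or” follows the description above; the two stand-in fitness functions, their names and the example curve are hypothetical and much simpler than the aesthetics used by the system.

```python
from typing import Callable, List

Fitness = Callable[[List[float]], float]


def combine_and(f: Fitness, g: Fitness) -> Fitness:
    """'and': multiply the scores, so a low score on either aesthetic
    drags the combined score down."""
    return lambda curve: f(curve) * g(curve)


def combine_or(f: Fitness, g: Fitness) -> Fitness:
    """'or': add the scores, so one strong aesthetic can compensate
    for a weak one."""
    return lambda curve: f(curve) + g(curve)


def decreasing_share(curve: List[float]) -> float:
    """Stand-in fitness: share of room-to-room steps where tension drops."""
    steps = list(zip(curve, curve[1:]))
    return sum(b < a for a, b in steps) / len(steps)


def final_room_share(curve: List[float]) -> float:
    """Stand-in fitness: tension of the last room relative to the maximum of 3."""
    return curve[-1] / 3.0


curve = [3.0, 2.0, 1.0, 0.0]  # purely decreasing, calm final room
print(combine_and(decreasing_share, final_room_share)(curve))  # → 0.0
print(combine_or(decreasing_share, final_room_share)(curve))   # → 1.0
```

A curve that fully satisfies one objective and ignores the other scores nothing under “and” but still scores well under “or”, which is how one fitness can dominate the result.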

Level Generation: Figure 5b shows the fittest level for the intended tension curve discussed above; its level tension curve is shown in Fig. 5d, in grey. The level tension curve matches the intended tension curve except that the gradient of the tension decay is different: this causes evolution to use two rooms (6 and 7) to increase the tension in order to match the tension value of room 7 (2.25 in the intended curve and 2.5 in the actual one). Interestingly, despite the expected differences when the level generator interprets the intended frame (e.g. an increase in tension at room 2), the aesthetics match between intended and level tension curve. The level tension curve predominantly has decreasing tension, with no cliffhanger but at least one peak (thus fulfilling one of the requirements for a cliffhanger).

The level of Fig. 5b has 4 monsters on the critical path, placed at the start and towards the end of the critical path. The two monsters in the first and second rooms create a very tense experience for the player, but the decreasing tension aesthetic allows them to relax for the next 4 rooms before facing two more monsters in rooms 6 and 7. The room with the objective does not have a monster, affording some relaxation to the player.

4 https://youtu.be/P2HkGr719f0

Level Sonification: Table 2b shows how sounds were allocated within the level and their respective values. Compared to the other sonification results, this soundscape spreads highly tense sounds throughout the level rather than concentrating them in a specific section. Interestingly, the level tension curve is unique compared to the other cases as no room has a tension value of 0. For instance, room 2 has the second highest ranked sound, but is surrounded by less tense sounds, while the most tense sound is reserved for the climax (i.e. room 7). A game-play video of this level is available online (footnote 5).

Discussion

The results highlighted four example tension frames which were associated with one or multiple fitness dimensions. The results showed that the intended tension curves created by the system matched the patterns in the narrative structures they were based on. An exception was when conflicting fitnesses were combined with the “or” operator, where one fitness could dominate the other (earning the operator its name). The generated levels in many cases matched the intended curve (if not value-for-value) but the limitations of the level tension curve calculation could cause deviations (e.g. in the case of denouement). At a high level, all generated levels exhibited the intended aesthetics of each frame.

Observing results with other fitness dimensions of framing, we found that the Escalating, Decreasing and Unresolved Tension fitnesses created the least variability in the tension curves. This was expected, as these fitnesses reward small incremental changes in the tension or no changes (for Unresolved Tension). Both the Surprising Moment and Resting Point fitnesses created more variations in the tension curves but both showed similar patterns: a drastic change of tension (from 0 to 3 or vice versa) between two adjacent rooms (similar to Fig. 5c). This pattern is impossible to replicate in the levels, leading to more free-form interpretation of the intended curve by the level generator. An interesting emergent solution to attain less aggressive tension changes was when fitnesses were combined: for instance, combining any fitness with the Escalating or the Decreasing fitness yielded curves with smoother changes in tension. Due to a less strict evaluation formula, the Denouement, Cliffhanger and Rising & Falling Tension fitnesses created the most diverse curves. Peaks very often varied in tension, and in some cases the entire curve would have low values of tension, or only high values.

The additional modules of the Sonancia pipeline (highlighted in Fig. 1) contribute to the creativity of the system in two core ways: framing information and interpretation. Framing information (as desired narrative structures) allows the generator to describe in human language its intent; the fitness function associated with each narrative structure allows the system to appreciate whether it has achieved this intent. The fact that the initial frame is chosen randomly is a current limitation of the system; its creativity could be strengthened if the inspiration for a narrative structure comes from elsewhere (e.g. a newspaper article). Interpretation in Sonancia is strengthened by extending the pipeline to include generated tension curves which guide the level generator, which in turn guides the sonification process: as the level of detail of the creative artifact increases from an abstract frame to a playable game, the generators must creatively interpret the guidelines of the previous generative step in order to satisfy them while still obeying the limitations of their own level of detail (e.g. the structural requirements of a level). This requires a degree of imagination from each module in transforming high-level directives into higher-detail artifacts. Finally, as the final game levels are always playable and contain the necessary components for horror gameplay, Sonancia has the necessary skill and thus completes the creative tripod exhibited by creative systems (Colton 2008).

5 https://youtu.be/JnFli_F-r38

Conclusions

This paper presented a system capable of creating and combining different structures of tension influenced by narrative concepts, then transforming them into horror levels and their accompanying soundscapes. Several dimensions of tension framing structure were developed and tested; the four example generated frames highlighted the process from frame conceptualization to level generation and finally to sonification. Demonstrations of the sonification of all the example levels in this paper can be found online (footnote 6).

Acknowledgments

The authors would like to thank all participants of the crowdsourcing experiment. This work was supported, in part, by the FP7 Marie Curie CIG project AutoGameDesign (project no: 630665) and the Horizon 2020 project CrossCult (project no: 693150).

6 https://goo.gl/Qshvqv

References

Collins, K. 2013. Playing with sound: a theory of interacting with sound and music in video games. MIT Press.

Colton, S.; Charnley, J.; and Pease, A. 2011. Computational creativity theory: The FACE and IDEA descriptive models. In Proceedings of the Second International Conference on Computational Creativity.

Colton, S. 2008. Creativity vs. the perception of creativity in computational systems. In Papers from the AAAI Spring Symposium on Creative Intelligent Systems.

Cook, M., and Colton, S. 2014. A rogue dream: Automatically generating meaningful content for games. In Proceedings of the AIIDE workshop on Experimental AI & Games.

Cook, M.; Colton, S.; and Pease, A. 2012. Aesthetic considerations for automated platformer design. In Proceedings of the Artificial Intelligence and Interactive Digital Entertainment conference.

Ekman, I., and Lankoski, P. 2009. Hair-raising entertainment: Emotions, sound, and structure in Silent Hill 2 and Fatal Frame. Horror video games: Essays on the fusion of fear and play 181–199.

Garner, T.; Grimshaw, M.; and Nabi, D. A. 2010. A preliminary experiment to assess the fear value of preselected sound parameters in a survival horror game. In Proceedings of the 5th Audio Mostly Conference: A Conference on Interaction with Sound, 10. ACM.

Gow, J., and Corneli, J. 2015. Towards generating novel games using conceptual blending. In Eleventh Artificial Intelligence and Interactive Digital Entertainment Conference.

Grace, K.; Gero, J.; and Saunders, R. 2012. Representational affordances and creativity in association-based systems. In Proceedings of the International Conference on Computational Creativity.

Hoover, A. K.; Cachia, W.; Liapis, A.; and Yannakakis, G. N. 2015. AudioInSpace: Exploring the creative fusion of generative audio, visuals and gameplay. In Proceedings of the EvoMusArt conference. Springer. 101–112.

Johnson, D., and Ventura, D. 2014. Musical motif discovery in non-musical media. In Proceedings of the International Conference on Computational Creativity.

Krzeczkowska, A.; El-Hage, J.; Colton, S.; and Clark, S. 2010. Automated collage generation – with intent. In Proceedings of the International Conference on Computational Creativity.

Liapis, A.; Yannakakis, G. N.; and Togelius, J. 2014. Computational game creativity. In Proceedings of the International Conference on Computational Creativity, 285–292.

Lopes, P., and Yannakakis, G. N. 2014. Investigating collaborative creativity via machine-mediated game blending. In Proceedings of the Artificial Intelligence and Interactive Digital Entertainment conference.

Lopes, P.; Liapis, A.; and Yannakakis, G. N. 2015a. Sonancia: Sonification of procedurally generated game levels. In Proceedings of the 1st Computational Creativity and Games Workshop.

Lopes, P.; Liapis, A.; and Yannakakis, G. N. 2015b. Targeting horror via level and soundscape generation. In Eleventh Artificial Intelligence and Interactive Digital Entertainment Conference.

Mitchell, M. 1998. An introduction to genetic algorithms. MIT Press.

Scirea, M.; Cheong, Y.-G.; Nelson, M. J.; and Bae, B.-C. 2014. Evaluating musical foreshadowing of videogame narrative experiences. In Proceedings of the Audio Mostly: A Conference on Interaction With Sound. ACM.

Scirea, M.; Barros, G. A. B.; Shaker, N.; and Togelius, J. 2015. SMUG: Scientific music generator. In Proceedings of the International Conference on Computational Creativity.

Shaker, N.; Yannakakis, G.; and Togelius, J. 2013. Crowd-sourcing the aesthetics of platform games. IEEE Transactions on Computational Intelligence and AI in Games 5(3).

Togelius, J.; Yannakakis, G. N.; Stanley, K. O.; and Browne, C. 2011. Search-based procedural content generation: A taxonomy and survey. Proceedings of the Computational Intelligence and Games Conference 3(3):172–186.

Treanor, M.; Blackford, B.; Mateas, M.; and Bogost, I. 2012. Game-o-matic: Generating videogames that represent ideas. In Proceedings of the FDG workshop on Procedural Content Generation in Games, 11. ACM.

Yannakakis, G. N., and Hallam, J. 2011. Ranking vs. preference: a comparative study of self-reporting. In Affective computing and intelligent interaction. Springer. 437–446.
