G. Subsol (Ed.): VS 2005, LNCS 3805, pp. 135 – 147, 2005. © Springer-Verlag Berlin Heidelberg 2005

Toward Interactive Narrative

Ken Perlin

Media Research Laboratory, Department of Computer Science,

Courant Institute of Mathematical Sciences, New York University, USA

http://mrl.nyu.edu/~perlin/

What is the future of interactive entertainment? Can we tap deeper emotions? Can we go beyond game-like experiences to create powerful interactive literary narratives? Are these even meaningful questions? In this paper I will address one possible way to make these questions meaningful, and one possible path to their answer.

First let’s briefly discuss the significance of stories within culture. Stories are actually quite central to cultures. One could even say that a culture defines itself by the stories it tells.

But what if we want to interact with stories? This would raise several questions: (i) what would change? (ii) what would stay the same? (iii) how do we make such a thing? and finally, (iv) where is the artist/author then located, with respect to the observer/reader?

One way to approach these questions is to try to look at the act of reading or hearing a story as a peculiar form of game-play. To do this, we need to examine why stories are able to be so compelling.

A first observation is that the mainstream obsession of human culture is other people. We are all continually engaged by questions on the order of:

• "Why did she just say that?"

• "I wonder what they were thinking?"

• "Do you think he really likes her?"

These are not scientific questions. Rather, they call for shaded emotional judgments and intuitive understandings of human interaction, as well as the subliminal ability to "read" people. Such questions exist within a space of psychological awareness.

To relate this space to interaction, it is useful to compare our traditional notions of game play to our traditional notions of narrative.

Narrative is generally driven by character. The unities that generally bound the telling of narratives are quite strong, even when those narratives appear in widely varying sensory forms, such as novels, plays, movies, or short stories. Consider, for example, the Salinger novel “Catcher in the Rye” and the MGM film “Gone with the Wind” (not the novel):

On the sensory level, Holden Caulfield is experienced only as printed text on paper, whereas Rhett Butler and Scarlett O’Hara are experienced via the appearance and speech of skilled actors projected onto celluloid. Yet our reasons for following their stories are quite similar. Each character has a central character mystery, some set of desires and character traits that are gradually revealed as the narrative progresses. We turn the page or keep watching the screen mainly to find out what these particular powerful personalities will do next as the plot unfolds: how they will respond; what choices they will make.

In fact we can state that linear psychological narratives generally have in common (i) psychological buy-in by the audience and (ii) a requirement of willing suspension of disbelief.

The reader/viewer is continually being challenged by such questions as: Who are these people? What will they do next, and why?

It is not what happens that primarily holds our interest; it is how we believe the characters feel about it. Note that the reader/viewer has no agency over the plot or its characters. The only “action” that the audience takes is an ever-changing internal emotional empathic response as the characters experience and respond to their world.

One of my favourite sayings about the centrality of character in stories is the following: “Plot is the drugged meat that you throw over the fence to put the dog to sleep, so you can rob the house.”

Contrast this with games. In a game you have quite a bit of agency over what happens, through some sort of game mechanic (e.g., moving your avatar forward, jumping, shooting, opening doors, solving puzzles, picking up objects, ...), and yet you generally don’t believe in the characters in the deep way that you believe in the characters of a novel or film. When I play The Sims, everything in the game is telling me that the people I see are doll figures (albeit very entertaining ones), not deeply resonant characters like Holden Caulfield or Scarlett O’Hara.

And yet, we can actually cast the experience of reading a novel or viewing a movie as a peculiar sort of game mechanic. This game mechanic is one that I call “hack the characters”. As the story progresses, we continually find ourselves asking questions, such as: “who are they?” “why are they doing that?” “what will they do next?” In an important sense, we know that we are playing a sort of game, and that it is this peculiar game play that makes watching a movie quite different from experiences in real life.

Take, for example, the following scenario. A husband and wife are hanging around their kitchen on a relaxed Sunday afternoon, when the husband’s best friend walks in. The wife immediately becomes quiet, and pointedly doesn’t look at or talk to the friend. The husband doesn’t notice.

If this were to happen in real life, there could be a million different underlying causes for these events. But if it happens in a movie, the audience immediately knows that the wife and the best friend are sleeping together behind the husband’s back, and the audience is engaged in the underlying question of if or when the husband will find out.

One can describe many such scenarios. As our response to these scenarios makes clear, movies and other stories are full of artificial conventions. We are all so used to these conventions that we are often not even consciously aware of them. This graceful slipping into stylized convention is essential to the willing suspension of disbelief that allows the narrative to reach us and to exert power over us.

More specifically, when you pick up a novel or start to watch a movie, you know at some level that you are engaging in a contract with an author. There is a reason the author is leading you through this story, and that reason generally centers on creating within you an emotional bond with the characters in that story, and in giving you, by proxy, an emotionally dynamic experience through those characters and through the choices they make.

This is why psychological narrative is interesting, even though the reader or viewer has no control over what happens. Even though you can’t change anything, you are entertained by the characters’ choices. This process requires the author to carefully maintain the believability of characters. If Rhett Butler were to suddenly dance the hula with a chicken on his head, Gone with the Wind would lose a considerable amount of its hold on the audience.

If we wish to make interactive psychological narratives that allow their audience to engage in a similar process of “hack the character”, we need to respect this principle of believability.

For example, if you were to make an interactive version of Raiders of the Lost Ark, you would not want to allow the player to fly the camera into Indiana Jones’ underwear drawer to find out whether he wears boxers or briefs. Such an interaction, being on a completely inappropriate level, would destroy the essential mystery of Indiana Jones, and would completely pull you out of the interesting questions about his character that make him deep and compelling.

Similarly, you could not effectively engage in a process of “hack the character” if you were allowed to hack directly into his subconscious and tweak the knobs there, which is effectively the game mechanic of The Sims.

We can state such limitations on game-play as a principle: The player should only be able to interact with or influence things outside of characters, as those characters make their choices.

This whole idea of defining a genre of art or entertainment by defining its scope of interactivity is actually quite old. Take, for example, the difference between the traditional genres of painting and sculpture. Even if we look at a painting such as Leonardo da Vinci’s Mona Lisa in different ways, we essentially get the same viewing experience:

In contrast, viewing a sculpture such as the Venus de Milo is a different and unique experience for every variation of view angle and lighting:

It is important to note that this differing and unique experience for each viewer does not imply that each visitor to the Louvre is an artistic collaborator. It simply means that sculpture is one of many art forms that are meant to be experienced in an infinite variety of ways, with a different experience available for each recipient of the work. Other examples of this are architecture, musical instruments, and the creation of procedural art with software.

So what might be an interactive “story-like” game mechanic that preserves character believability? An example will help here: a bar scene (bar scenes are always fraught with possibility).

Let us say that our young protagonist Lisa is interested in two young men: Jim and Bobby. Jim is handsome, dresses well, is quick with a compliment or clever line, and is popular with the girls. Bobby, in contrast, is less good looking, dresses unfashionably, and isn’t nearly as quick or clever. But we already know enough at this point in the narrative to know that if Lisa were to end up with Jim tonight, he would just sleep with her and quickly go on to another conquest. She would likely become depressed, go off her diet, drop out of Medical School, and generally become a less interesting character.

Now at this point it is important to keep in mind that if we maintain believability, then interesting characters are always more rewarding for the player than uninteresting characters. The player wants Lisa to stay interesting and full of possibilities.

Note that if the player were able to directly hack into Lisa’s subconscious, as in The Sims, then it would be impossible to keep Lisa interesting in the traditional narrative sense, because Lisa could then be readily nudged into betraying or subverting her own goals. It is this uncanny sense of unbelievability that moves players of The Sims to set fire to their characters, drown them, starve them, and otherwise wreak havoc upon them the way that some small children experiment with ants and magnifying glasses on a sunny day (all of which was quite anticipated by the game’s designer Will Wright, incidentally). In contrast, a reader of Catcher in the Rye would never be moved to do such things to Holden Caulfield. He is much too valuable alive, as a continually surprising character - the reader very much wants to see what he is going to do or say next.

So what sorts of game mechanic are available to the player, outside of modifying the character herself? I posit that the proper way to maintain believability is to provide the player with a constrained ability to modify the world around the protagonists, and to follow the response of the protagonists to these modifications of their world.

For example, let’s say that we see Lisa in the bar with Jim and Bobby, and we happen to already know that Jim has a weakness for drink. We can’t make him do anything, but we can tempt him. Let’s say the player can “influence” the world to the limited extent of opening the bar, at a point in the conversation where Lisa is deciding between Jim and Bobby. The sequence below shows how this might play out: Jim moves to the now-open bar, thereby focusing less of his attention on Lisa. Lisa therefore has a chance to spend some time with Bobby, and the balance of power shifts. Lisa and Bobby bond, she ends up with him, stays on her diet, finishes Medical School, and is ready to move on to ever new and exciting adventures.
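The bar example can be sketched in code. The following is a minimal illustration, not an implementation from this paper: the class names, the `weakness` field, the attention numbers, and Lisa's decision rule are all hypothetical. It shows only the constraint being argued for: the player's one permitted action changes the world (the bar opens), and each character then decides for himself how to respond.

```python
from dataclasses import dataclass


@dataclass
class World:
    # The only state the player is allowed to touch (hypothetical model).
    bar_open: bool = False


@dataclass
class Character:
    name: str
    weakness: str = ""              # assumed trait, e.g. "drink"
    attention_on_lisa: float = 1.0  # illustrative scale from 0 to 1

    def respond(self, world: World) -> None:
        # The character makes his own choice based on his nature;
        # the player never sets attention_on_lisa directly.
        if world.bar_open and self.weakness == "drink":
            self.attention_on_lisa = 0.2  # illustrative value


def player_opens_bar(world: World) -> None:
    # The player's single permitted intervention: a change to the world.
    world.bar_open = True


def lisa_favors(suitors) -> str:
    # Hypothetical rule: Lisa gravitates to whoever attends to her most.
    return max(suitors, key=lambda c: c.attention_on_lisa).name


world = World()
jim = Character("Jim", weakness="drink", attention_on_lisa=1.0)
bobby = Character("Bobby", attention_on_lisa=0.6)

before = lisa_favors([jim, bobby])
player_opens_bar(world)
for character in (jim, bobby):
    character.respond(world)
after = lisa_favors([jim, bobby])

print(before, "->", after)  # prints "Jim -> Bobby"
```

Nothing in the sketch lets the player write to a character's fields; the shift toward Bobby is a consequence of Jim's own response to the changed world.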

But none of this works if the world around the characters is not believable. To go back to an earlier example, if the player can suddenly place a chicken atop Lisa’s head, the entire question of psychological believability becomes meaningless. In order to keep the whole enterprise from falling apart, we need some reasonable - yet non-intrusive - way to constrain the player’s choices.

One way to do this is by thinking in terms of parallel universes. As time goes on, possible universes appear at a more or less constant rate. Every time the player makes a choice or decision, this forest of potential universes is pruned. This sort of pruning was nicely illustrated in the 1998 Peter Howitt film “Sliding Doors”. In that film, the story split into two possible futures: one in which the heroine just barely caught a train, and the other in which she just barely missed it. Note that if this had been an interactive game, rather than a movie, and the player had chosen one of these futures, then in that moment, the number of possible outcomes for the game would have been halved.

Such a situation arises every time a player makes a choice in a game. In this sense, there is always some sort of equilibrium: As time moves forward, more possible universes appear and multiply. Meanwhile, as the player makes choices, possible universes disappear and divide. A world can be said to be “believable” if this equilibrium is reasonable. A world can be said to be “unbelievable” to the extent that the player is making choices that divide the character’s possibilities so severely as to make them uninteresting, such as by forcing Lisa to walk around with that chicken on her head.

We can even formalize this a bit. If one equal choice by the player places the narrative into a future having probability 1/2, then two successive equal choices by the player will place the narrative into a future having probability 1/4. In general, N successive equal choices will place the narrative into a future having probability $1/2^N$. For example, twenty successive equal choices would lead to a future having probability of about 1/1000000.
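Written out, with $P(N)$ (a symbol introduced here purely for illustration) denoting the probability of the future reached after $N$ successive equal choices:

```latex
P(N) = \left(\tfrac{1}{2}\right)^{N} = \frac{1}{2^{N}},
\qquad
P(20) = \frac{1}{2^{20}} = \frac{1}{1\,048\,576} \approx 10^{-6}.
```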

To enforce believability, we can maintain some sort of cost for making choices. For example, the player can be given a certain store of spendable energy. Making a choice costs a certain constant amount of this energy. This leads to the following property:

Energy ≡ -Log(probability)
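One way to read this relation is as a budget. The sketch below is an illustration under stated assumptions rather than a specification: it measures energy in bits (base-2 logarithm), charges each choice the negative log of its probability, and uses a hypothetical `EnergyBudget` class with an arbitrary starting store of 20 units. Under those assumptions an equal choice costs exactly one unit, and the energy spent always equals the negative log of the probability of the path taken.

```python
import math


class EnergyBudget:
    """Hypothetical store of spendable energy, measured in bits."""

    def __init__(self, store: float):
        self.store = store            # energy remaining
        self.path_probability = 1.0   # probability of the future reached so far

    def cost(self, probability: float) -> float:
        # Energy = -log2(probability); an equal (1/2) choice costs 1 unit.
        return -math.log2(probability)

    def choose(self, probability: float) -> bool:
        """Spend energy on a choice; refuse it if the store is too low."""
        price = self.cost(probability)
        if price > self.store:
            return False
        self.store -= price
        self.path_probability *= probability
        return True


budget = EnergyBudget(store=20.0)
made = 0
while budget.choose(0.5):
    made += 1

print(made, budget.path_probability)
```

With this accounting, a single highly improbable intervention (the chicken on Lisa's head, say) is not forbidden outright; it is simply priced so high that it would consume most or all of the player's store.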

So how do we author systems that allow artists to create a mutable narrative that can evolve in a way that reflects user choices?

We cannot use explicit branching narrative structures, because this leads to a combinatorial explosion:

The amount of work it takes to make anything interesting through the creation of branching narratives is exponential: In order to support a typical N-decision path through the narrative space, the author needs to have authored on the order of $2^N$ distinct story segments.
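The growth is easy to check by counting. The sketch below assumes a full binary tree in which every decision offers two options and each option leads to its own story segment; the function names are hypothetical. A single playthrough traverses only N + 1 segments, while the author must write the whole tree.

```python
def segments_authored(n_decisions: int) -> int:
    # A full binary tree of depth N has 2^(N+1) - 1 nodes (story segments),
    # which is on the order of 2^N.
    return 2 ** (n_decisions + 1) - 1


def segments_experienced(n_decisions: int) -> int:
    # One playthrough visits a single root-to-leaf path.
    return n_decisions + 1


for n in (5, 10, 20):
    print(n, segments_experienced(n), segments_authored(n))
```

For N = 20, the player sees 21 segments out of 2,097,151 authored ones.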

What is called for is multi-layered interaction. The author needs to be given tools to define desired properties of the interactive narrative at a coarse-grained level, as well as tools to define properties at successively finer levels of detail.

The response to an audience intervention must be mediated at all levels of detail. If the audience has just made such an intervention (such as opening the bar in the example above), then the question “what happens now” is not a linear one. Rather, it involves the contribution of style sheets at multiple levels of detail.

We can define this as the principle of layered contingent narrative. An engine that supports believable interactive narrative properly looks like a fractal, with influences and interreflections between different levels of detail.

Using Style-Sheets When Authoring Interactive Media

Consider the user interaction paradigm of a Web page built with some commercial layout package such as FrontPage:

user text • { layout software • style sheet } • { renderer • fonts/kerning } • document view

There are four different types of people involved in the use of such a WYSIWYG editor: (i) the author who is creating the content, (ii) the user who is browsing the page, (iii) a page layout expert who is making decisions about spacing, centering, etc., and (iv) a programmer who implements the connection between the first three people.

Generally speaking, the author of the document never actually needs to meet the stylist or the programmer. As far as she is concerned, there could be an entire team of page layout experts and programmers somewhere behind the scenes (and there generally is). Some of these people may even be deceased by the time the web content itself is written.

Yet the talent and ability of the page layout expert is available to the author of the interactive document, as long as the expert has encapsulated that talent and ability into style sheets which can be interpreted by the program and applied to the author’s document.

The key here is that creation has been divided into two processes, with two different languages: (i) a style description language, for specifying geometric relationships and behaviors for entities on the page, and (ii) a content language, which is mostly a natural language such as English.

Note that there must be another programmer/style-expert pair underneath the first one, simply to deal with decisions concerning character shape and font kerning. So we can see that the principle of content creation supported by programmers and style experts extends to layers of support. Every time the author adds, deletes or rearranges words, a team of experts is virtually in the room, making corresponding spatial layout decisions based on generic (i.e., not specific to this particular content) principles of page layout design.

Principles for a Narrative Generation Architecture

An interactive narrative that is played out in a software run-time engine can similarly be designed as an interaction between author, audience, and visual style expert, where the latter’s contribution is supported by software layers that convert style advice into run-time actions:

A user’s contingent script:

• { behavior system • behavior script } (produces discrete tokens of action and mood)

• { animation system • animation script } (produces joint movements, voice prosody)

• { renderer • appearance model } (produces polygons, sounds)

• animation view

In this case there is a programmer/style-expert pair at three different levels: (i) what happens next, (ii) how such decisions are acted out, and (iii) how it all appears visually.

The first (highest) of these levels is concerned with plot decisions, underlying psychological tone, and general scene blocking and camera placement. The second (middle) level is concerned with body language, gesture, facial expression, fine details of both scene blocking and camera placement. The third (bottom) layer is concerned with what we traditionally think of as graphics rendering.
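The three levels can be pictured as a chain of functions, each pairing a system with a script supplied by a style expert. The sketch below is only a schematic under stated assumptions: every name, token, and rule in it is hypothetical, and each "script" is reduced to a small lookup table. Its point is that one author-level event flows through the behavior, animation, and rendering layers in turn, and that a script at any one level can be swapped without touching the code of the others.

```python
from typing import Dict, List

# Style-expert contributions, one per layer (all entries are made up).
BEHAVIOR_SCRIPT: Dict[str, List[str]] = {
    "bar_opens": ["jim:walk_to_bar", "lisa:turn_to_bobby"],
}
ANIMATION_SCRIPT: Dict[str, List[str]] = {
    "walk_to_bar": ["shift_weight", "step", "step"],
    "turn_to_bobby": ["rotate_head", "rotate_torso"],
}
APPEARANCE_MODEL: Dict[str, str] = {"jim": "mesh_jim", "lisa": "mesh_lisa"}


def behavior_system(event: str) -> List[str]:
    # Level 1: what happens next (discrete tokens of action and mood).
    return BEHAVIOR_SCRIPT.get(event, [])


def animation_system(tokens: List[str]) -> List[str]:
    # Level 2: how the decisions are acted out (joint movements).
    moves: List[str] = []
    for token in tokens:
        actor, action = token.split(":")
        moves += [f"{actor}:{m}" for m in ANIMATION_SCRIPT.get(action, [])]
    return moves


def renderer(moves: List[str]) -> List[str]:
    # Level 3: how it all appears (a stand-in for polygons and sounds).
    frames: List[str] = []
    for move in moves:
        actor, motion = move.split(":")
        frames.append(f"draw {APPEARANCE_MODEL[actor]} doing {motion}")
    return frames


frames = renderer(animation_system(behavior_system("bar_opens")))
print(len(frames), frames[0])
```

A real engine would also need the feedback between levels that the fractal image suggests; this one-directional chain deliberately leaves that out.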

For each of these layers, a combination of talent and expertise is required from a style expert. Also required for each layer is a scripting language built around such talent and expertise, which allows the expert’s contribution to be applied to enhance the specific content that was created by the author.

Since such systems are in their infancy, it is difficult to be more specific at this stage. The purpose of this description is to focus future developments toward effective definition and implementation of such style specification languages for interactive narrative and interactive virtual acting.

The ability to author non-linear narrative decisions is half of the solution to the problem of creating interactive narrative experiences. The other half of the solution is presentation of those decisions to the audience. In a cinematic or game-like medium, this requires acting.

Principle of Believable Virtual Acting

Note that the middle layer of our style diagram is concerned with what is traditionally called acting: how a character turns his head or places her foot; whether a character turns fully to look at another, or only acknowledges the presence of the other peripherally. The ability to author and then play out non-linear narrative decisions requires believable presentation of those decisions to the audience. In a cinematic or game-like medium, this requires acting.

In the absence of believable actors, everything runs the risk of looking like Plan 9 from Outer Space:

Note that canned linear animation is not sufficient for purposes of implementing interactive narrative. The second time that an audience sees an actor perform the same movements, that actor effectively ceases to exist in the audience’s mind, since that character has now been identified by the audience as a mindless automaton, and the all-important sense of “what will the character do next” has been lost.

The following principles need to be adhered to:

• canned animation is a dead end.

• acting needs to be procedural, from the inside out.

Experiments in Believable Virtual Acting

In related work in support of non-linear narrative [4] we have done experiments with emotive facial expression:

Some of this work has made its way into commercial interactive games such as Half-Life 2, which does have an underlying narrative, although not the engine needed to maintain character believability. For example, the facial musculature of the character of Alyx in Half-Life 2 was influenced by our work on facial expression [1]:

Similarly we have done work on virtual actors that can convey body language, shown briefly below. This work is gradually finding its way into the game industry, but its real utility will come when applied to interactive narrative in support of believable characters.

Façade

There have been some attempts to build a framework such as the one we describe. For example, the groundbreaking work Façade by Michael Mateas and Andrew Stern is a working example of contingent narrative [2].

To implement this work they developed the ABL language, which allows contingent execution of narrative plans. For example, if a high level directive of the narrative requires a character to carry a bottle of wine across the room, but the actor already has a cigarette in one hand, then a lower layer of narrative execution directs the actor to carry the bottle with the other hand.
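The wine-bottle case can be illustrated as follows. This is not ABL code and does not reproduce Façade's implementation; it is a minimal Python sketch with hypothetical names, showing only the idea that a high-level directive names the goal while a lower layer resolves it against the actor's current state.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Actor:
    name: str
    # What each hand currently holds; None means the hand is free.
    hands: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"left": None, "right": None}
    )

    def free_hand(self, preferred: str = "right") -> Optional[str]:
        # Lower layer: pick the preferred hand if free, else any free hand.
        if self.hands[preferred] is None:
            return preferred
        for hand, item in self.hands.items():
            if item is None:
                return hand
        return None


def carry(actor: Actor, item: str) -> str:
    # High-level directive: "carry this item". It does not say which hand.
    hand = actor.free_hand()
    if hand is None:
        return f"{actor.name} cannot carry the {item}: both hands are full"
    actor.hands[hand] = item
    return f"{actor.name} carries the {item} in the {hand} hand"


host = Actor("host")
host.hands["right"] = "cigarette"
print(carry(host, "bottle of wine"))
# prints "host carries the bottle of wine in the left hand"
```

The directive `carry` stays the same whether or not the actor is smoking; only the lower-level resolution changes, which is the sense of "contingent execution" described above.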

Yet Façade is not quite what we want, going into the future. According to its authors, Façade does not break down the narrative choices in all levels from highest to lowest, but rather relies on several dozen carefully scripted interactive narrative scenelets, with techniques to steer the narrative toward these relatively linear set pieces. The resulting particular interactive narrative had to be heroically built through several years of intensive work. In a sense, one might say that Façade has hit an important half-way point between the traditional forking narrative, and the sort of fully procedural narrative generation system that we have outlined here.

To put the various elements into one place, here are some thoughts about what narrative constraints might be needed in the near term:

• The audience or "player" cannot have an avatar that engages in dialogue with characters, because currently existing techniques are not sufficient to sustain conversational believability between an actual human and non-player characters;

• There can be no direct puppeteering of the minds of characters, in the style of The Sims;

• The audience cannot freely fly the camera around to peer into the sock drawer of Indiana Jones, or otherwise break character believability;

• The audience should only have the power to manipulate the world, not characters themselves; the characters should be free to respond in a way that accords with their nature;

• The probability of events happening should be respected.

What About Realism?

Photographic realism is only of benefit for interactive narrative if it can be sustained. Inappropriate realism, brought in without proper support for acting and behavior, actually works against believability. Dr. Masahiro Mori laid out the basis for much of this in his principle of the “Uncanny Valley”, principally in the observation that stopping just short of the required realism can make things very unbelievable [3].

In fact it is often better for characters to remain decisively cartoon-like. A good recent example of a successful application of this principle is Brad Bird’s brilliant animated film The Incredibles.

Conclusion

In conclusion, we have laid out some principles describing how the tools to create interactive character-based non-linear narrative might be developed, and also what such a medium might look like in practice. If we are collectively successful in implementing such a thing, then we will be able to direct (not just animate) believable interactive actors, and interactive narrative will finally get its Steven Spielberg and its Virginia Woolf. We will attain true interactive character-driven narrative, and our work will touch people’s souls.

References

1. Birdwell, Ken. 2005. Personal communication.

2. Mateas, M. 2005. http://www.interactivestory.net

3. Mori, M. 1982. The Buddha in the Robot. Charles E. Tuttle Co.

4. Perlin, K. 1997. Layered Compositing of Facial Expression. In ACM SIGGRAPH 97 Visual Proceedings: The Art and Interdisciplinary Programs of SIGGRAPH ’97 (Los Angeles, California, United States, August 03 - 08, 1997). L. Pocock, R. Hopkins, D. Ebert, and J. Crow, Eds. SIGGRAPH ’97. ACM Press, New York, NY, 226-227.