
Open Research Online
The Open University’s repository of research publications and other research outputs

A model of suspense for narrative generation
Conference or Workshop Item

How to cite:

Doust, Richard and Piwek, Paul (2017). A model of suspense for narrative generation. In: Proceedings of the 10th International Conference on Natural Language Generation (Alonso, Jose M.; Bugarín, Alberto and Reiter, Ehud eds.), Association for Computational Linguistics, pp. 178–187.

For guidance on citations see FAQs.

© 2017 The Authors

Version: Version of Record

Link(s) to article on publisher’s website: https://aclanthology.coli.uni-saarland.de/papers/W17-3527/w17-3527

Copyright and Moral Rights for the articles on this site are retained by the individual authors and/or other copyright owners. For more information on Open Research Online’s data policy on reuse of materials please consult the policies page.

oro.open.ac.uk

http://oro.open.ac.uk/help/helpfaq.html


http://oro.open.ac.uk/policies.html

A model of suspense for narrative generation

Richard Doust
Open University, UK

[email protected]

Paul Piwek
Open University, UK

[email protected]

Abstract

Most work on automatic generation of narratives, and more specifically suspenseful narrative, has focused on detailed domain-specific modelling of character psychology and plot structure. Recent work on the automatic learning of narrative schemas suggests an alternative approach that exploits such schemas for modelling and measuring suspense. We propose a domain-independent model for tracking suspense in a story which can be used to predict the audience’s suspense response on a sentence-by-sentence basis at the content determination stage of narrative generation. The model lends itself as the theoretical foundation for a suspense module that is compatible with alternative narrative generation theories. The proposal is evaluated with human judges, whose normalised average scores correlate strongly with predicted values.

1 Introduction

Research on computational models of narrative has a long tradition, with important contributions from researchers in natural language generation, such as the AUTHOR system (Callaway and Lester, 2002), which provides the blueprint for a prose generation architecture including a narrative planner and organiser with traditional NLG pipeline components (sentence planner, realiser) and a revisor.

Two features are found in much of such research. Firstly, work on content determination and organisation has often centred on use of detailed representations for character goals and plans (see for example Cavazza et al. (2002) and Cavazza and Charles (2005)). According to such approaches, a good and, more specifically, a suspenseful story should arise out of the complex interaction of the system components. It is often hard to separate the different contributions of the system choices from the quality of the underlying domain-specific plans and story templates (Concepcion et al., 2016).

Secondly, existing approaches to suspense often interlock with the concept of a story protagonist under some kind of threat. For example, the suspense modelling in the SUSPENSER system (Cheong and Young, 2015), which aims to enable choices that can maximise suspense in narrative generation, is entirely based on Gerrig and Bernardo (1994)’s definition, according to which suspense varies inversely with the number of potential actions of the central protagonist which could allow him or her to escape a threat. Similarly, Zillmann’s definition (Zillmann, 1996) links suspense to the reader’s fearful apprehension of a story event that threatens a liked protagonist. Delatorre et al. (2016) proposed a computational model based on Zillmann’s definition from which they derived the use of emotional valence, empathy and arousal as the key components of suspense. Interestingly, their experimental results led the authors to question the usefulness of empathy for measuring suspense.

In this paper, we approach the problem of suspenseful story generation from a different angle. Our main contribution is a domain-independent model of suspense together with a method for measuring the suspensefulness of simple chronological narratives. Thus we identify a separately testable measure of story suspensefulness. By separating out emotional salience and character empathy considerations from informational and attentional processes at the heart of the suspense reaction, we construct a modular definition of suspense that could in theory encompass the definitions cited above. The method builds on the psychological model of narrative proposed by Brewer and Lichtenstein (1982), and was first developed in Doust (2015).

Inspired by Brewer and Lichtenstein’s informal model, we build a formal model in which the concept of a narrative thread plays a pivotal role. Narrative threads model the reader’s expectations about what might happen next in a given story. As a story is told, narrative threads are activated and deactivated. Different threads may point to conflicting events that are situated in the future. As more of the story is revealed, the moment of resolution of the conflict may appear more or less proximal in time. We capture this by formally defining the concept of Imminence. Imminence is based on the potential for upcoming storyworld events to conflict with one another and on the narrative proximity of these conflicts. It is the key factor in what we call conflict-based suspense. Additionally, as the story is told, conflicting interpretations about certain events in the story may prevail. This leads us to define a distinct second type of suspense which we call revelatory suspense.

Our approach has a number of advantages. Firstly, by disentangling suspense from story protagonist-based modelling, our approach can deal with scenes that have no discernible human-like protagonist. An example could be an ice floe slowly splitting up or a ball slowly rolling towards the edge of a table¹. Thus the empirical coverage of the theory is extended.

Secondly, the model lends itself as the theoretical foundation for a suspense module that is compatible with alternative narrative generation theories. Such a module could be used to evaluate story variants being considered in the search space of a narrative generation program. This is possible because the model operates at a level of abstraction where character motivations are subsumed by higher level properties of an unfolding story, such as imminence.

Thirdly, the underlying world knowledge that our model relies on makes use of a format, narrative threads, which meshes with recent work in computational linguistics on the automatic learning of narrative schemas as proposed in Chambers and Jurafsky (2009) and Chambers and Jurafsky (2010).

The remainder of this paper is organised as follows. In Section 2 we discuss previous, primarily computational, work on suspense. The section concludes with a description of Brewer and Lichtenstein’s psychological theory of suspense, which forms the basis for our model. Section 3.1 introduces our formal model. Section 3.2 presents our model of the reader’s response to a suspenseful narrative and the algorithm for calculating suspense. Section 3.3 presents an overview of revelatory suspense. Section 4 reports on the evaluation of the model by human judges. Finally, in Section 5 we present our conclusions and avenues for further research.

¹ One could postulate the existence of imaginary protagonists for such scenarios, but this seems unnecessarily complex.

2 Related work on computational models of narrative and suspense

In computational models of narrative, a common approach is to determine some basic element, which, when manipulated in certain ways, will produce a skeletal story-line or plot. For example, TALE-SPIN (Meehan, 1977) uses the characters’ goals, whereas MINSTREL (Turner, 1992) uses both authorial and character goals. MEXICA (Perez y Perez and Sharples, 2001) uses a tension curve to represent love, emotion and danger in order to drive the generation process. Riedl and Young (2010) and later the GLAIVE narrative planner (Ware and Young, 2014) introduce a novel refinement search planning algorithm that combines intentional and causal links and can reason about character intentionality.

The focus in the aforementioned work is on the global story-modelling task and the automatic generation of new narratives. Suspense is seen as one of a set of by-products of story generation. There is no re-usable model of what makes a suspenseful story.

Other approaches have been more specifically aimed at generating suspenseful stories. Cheong and Young (2015)’s SUSPENSER and O’Neill and Riedl (2014)’s DRAMATIS propose cognitively motivated heuristics for suspense using the planning paradigm. Characters have goals and corresponding plans, and suspense levels are calculated as a function of these. The measurement of suspense in these algorithms was evaluated using alternate versions of a story. In particular, O’Neill and Riedl (2014) asked human judges to make a decision about which story was more suspenseful and then compared these results to the story their system identified as most suspenseful.

Several psychological theories of narrative understanding have attempted to approach suspense modelling. For example, Kintsch (1980) considers the schemata and frames that readers call upon to actually learn from the text they are reading. That work examines the additional focus generated by expectation violations, i.e., surprise rather than suspense.

Our suspense measurement method is grounded in Brewer and Lichtenstein (1982)’s psychological theory of narrative understanding. They suggest that three major discourse structures account for the ‘enjoyment’ of a large number of stories: surprise, curiosity and suspense. This approach is based on the existence of Initiating Events (IE) and Outcome Events (OE) in a given narrative.

For suspense, an IE is presented which triggers the prediction of an OE which corresponds to a significant change in the state of the storyworld. The reader feels concern about this outcome event, and if this state is maintained over time, the feeling of suspense will arise. Such a change in the state of the storyworld can have a positive or negative valence for the reader, and may often be significant because it concerns the fate of a central character in the story. However, this link to a character’s fate is not a requirement of our model.

Also, as Brewer & Lichtenstein say: ‘often additional discourse material is placed between the initiating event and the outcome event, to encourage the build up of suspense’ (Brewer and Lichtenstein, 1982, p. 17). Thus, to produce suspense, the IE and OE are ordered chronologically and other events are placed between them.

We propose a computational extension of this model based on what we call narrative threads, a concept grounded in psychological research such as Zwaan et al. (1995), the constructionist and prediction-substantiation models of narrative comprehension (Graesser et al., 1994) and scripts (Lebowitz, 1985; Schank and Abelson, 1977). Our narrative threads include both causal and intentional links, as do for example Ware and Young (2014), and also include the concept of recency (see for example Jones et al. (2006)).

The research reported in this paper differs in two key respects from the computational approaches described above.

Firstly, it eschews a planning approach to story generation that makes use of detailed modelling of character intentions and goals. In our view, suspense is not dependent on the existence of characters’ goals: we can experience suspense about a piece of string breaking under the strain of a weight. Nor does our approach require the existence of a central protagonist and his or her predicament.

Secondly, our model tracks suspense throughout the telling of a story. Unlike much previous work, rather than evaluate only the predicted overall suspense level of a story, in our evaluation we compare predicted suspense levels with human judgements at multiple steps throughout the telling of the story. This provides us with a much more fine-grained evaluation of our suspense measurement method than has hitherto been used.

Brewer and Lichtenstein’s work has been the basis of further work such as Hoeken and van Vliet (2000) and Albuquerque et al. (2011). The former found that ‘suspense is evoked even when the reader knows how the story will end’ whereas the latter explored story-line variation to evoke suspense, surprise and curiosity. However, neither presents a model of how suspense fluctuates during the telling of a story nor any empirical evaluation of such a model.

In the following section, we present our computational model of suspense that extends that of Brewer and Lichtenstein (1982).

3 Formalism and Algorithm for Suspense Generation

3.1 Formalism

A story can be considered the work of a hypothetical author, who first chooses some events from a storyworld and orders them into a fabula (this includes all relevant events from the storyworld, not just those that get told). The author then chooses events and orderings of events from this fabula to create a story designed to trigger specific reactions from its readers.

3.1.1 A storyworld

A storyworld W = (E, N, D) is made up of the following elements:

• E, the set of possible events,

• N, the set of narrative threads. Each narrative thread Z ∈ N consists of a fixed sequence of distinct events chosen from the set E and an Importance value, Value(Z),

• D, the set of ordered pairs (a, b) of disallowing events where a, b ∈ E and a disallows b.
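The triple W = (E, N, D) maps naturally onto a small data structure. The following is a minimal sketch in Python rather than the authors' PROLOG implementation; the class names, field names and the toy "string under strain" events are all hypothetical and serve only to illustrate the definition.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class NarrativeThread:
    name: str                 # hypothetical label, not part of the formal definition
    events: tuple[str, ...]   # fixed sequence of distinct events chosen from E
    value: float              # Importance, Value(Z)

    def __post_init__(self):
        if len(set(self.events)) != len(self.events):
            raise ValueError("events in a narrative thread must be distinct")


@dataclass(frozen=True)
class Storyworld:
    events: frozenset[str]                   # E
    threads: tuple[NarrativeThread, ...]     # N
    disallows: frozenset[tuple[str, str]]    # D: (a, b) means "a disallows b"

    def __post_init__(self):
        for z in self.threads:
            if not set(z.events) <= self.events:
                raise ValueError(f"thread {z.name} uses events outside E")
        for a, b in self.disallows:
            if a not in self.events or b not in self.events:
                raise ValueError("disallowing pairs must be drawn from E")


# Toy storyworld (invented for illustration): a string under the strain of a weight.
world = Storyworld(
    events=frozenset({"weight_hung", "string_frays", "string_snaps",
                      "weight_removed", "string_holds"}),
    threads=(
        NarrativeThread("breaking", ("weight_hung", "string_frays", "string_snaps"), -6.0),
        NarrativeThread("rescue", ("weight_hung", "weight_removed", "string_holds"), 4.0),
    ),
    disallows=frozenset({("weight_removed", "string_snaps"),
                         ("string_snaps", "string_holds")}),
)
```

Making both classes immutable reflects the fact that the storyworld is fixed background knowledge; the changing state of each thread during the telling is a separate matter.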

We will be dealing in this research only with chronological stories. For a given set of narrative threads, a story will satisfy the chronological constraint if and only if:

For all pairs of events a and b where a precedes b in the story, if there are any narrative threads in which both a and b occur, then in at least one of these threads a precedes b.
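The constraint above can be checked mechanically. The sketch below is an illustration under simple assumptions: a story is a Python list of event names, each thread is a tuple of event names, and the function name `is_chronological` is hypothetical.

```python
from itertools import combinations


def is_chronological(story, threads):
    """Return True if the story satisfies the chronological constraint.

    story:   list of event names in telling order
    threads: iterable of tuples, each a fixed sequence of distinct events
    """
    threads = list(threads)
    # combinations() preserves list order, so a always precedes b in the story.
    for a, b in combinations(story, 2):
        shared = [t for t in threads if a in t and b in t]
        # If some thread contains both events, at least one must order a before b.
        if shared and not any(t.index(a) < t.index(b) for t in shared):
            return False
    return True


threads = [("a", "b", "c")]
ok = is_chronological(["a", "c"], threads)          # skipping "b" is allowed
reversed_pair = is_chronological(["c", "a"], threads)
```

Pairs of events that never co-occur in a thread are unconstrained, which is why an event unknown to every thread does not make a story fail this particular check.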

Using the chronological qualities of narrative threads, we can now define a fabula as a chronologically ordered list of n events chosen from E, the set of possible events in the storyworld W. We define a story as an ordered list of events chosen from a given fabula. In the general case, a story for a fabula can reorder, repeat or skip any of the fabula’s events. Because we are only dealing with chronological stories, in our current model, the only allowable difference between a fabula and a story is that some elements of the fabula can be skipped.

We now give two constraints on fabulas for a given storyworld W = (E, N, D). An (optional) completeness relation between the set of events, E, and the set of narrative threads, N, is a useful constraint to include in most storyworlds. It excludes the possibility that an event in a fabula has no narrative thread which contains it, thus avoiding the situation where an event is ‘uninterpretable’ in storyworld terms.

Concerning D, the set of disallowing event-pairs, (a, b) ∈ D means that if a is told, then b is predicted not to occur in storyworld W, or we can also say, b should not be one of the subsequent events to be told. We will therefore require that no event that is a member of a fabula disallows any other².

3.1.2 Telling the story

Telling a story is equivalent to going through an ordered list of events one by one. To ‘tell an event in the story’, we take the next event from a list of Untold events and add it to the tail of a list of Told events. Each new told event may have an effect on one or more narrative threads.

² This is in fact a transposition of the constraints used in the GLAIVE narrative planner (Ware and Young, 2014).

Each narrative thread also has a Conveyed and Unconveyed event list. Events in a thread become conveyed in two cases: when they are told in a story, or when they are presumed to have occurred in the storyworld because in some thread they precede an event which has been told (see also 3.1.3).

If the new story event matches a member of the Unconveyed list of any narrative thread, then we move it (and all the events before it) into the thread’s Conveyed list. Additionally, the thread becomes active (if it was not already).

Finally, certain threads may be deactivated by the new story event. Any active narrative thread with an event α in its Unconveyed list will be deactivated if an event γ is told in the story and (γ, α) ∈ D.

For each thread Z, we designate state(Z) which indicates whether Z is active or inactive, which events in Z have been conveyed, and which are as yet unconveyed. Before the story starts to be told, all narrative threads are inactive and all their events are in their respective Unconveyed lists. Inactive threads always have this form and have no effect on suspense calculations. When the last event in a narrative thread Z gets conveyed in the story, we can say: ‘Z succeeds’.
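The update rules of this subsection can be sketched as a single step function. This is an illustrative reading, not the authors' PROLOG code: the names `ThreadState` and `tell_event` are hypothetical, the Conveyed list is represented as a prefix length, and resetting a deactivated thread to the all-unconveyed form follows from the statement that inactive threads always have this form. Applying activation before deactivation is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadState:
    events: tuple                  # fixed event sequence of the thread
    conveyed: int = 0              # events[:conveyed] is the Conveyed list
    active: bool = False
    told: set = field(default_factory=set)   # conveyed events actually told

    @property
    def unconveyed(self):
        return self.events[self.conveyed:]

    def succeeded(self):
        return self.active and self.conveyed == len(self.events)


def tell_event(event, threads, disallows):
    """Apply one told story event to every thread state (mutates in place)."""
    for z in threads:
        if event in z.unconveyed:
            # Move the event and everything before it into the Conveyed list.
            z.conveyed = z.events.index(event) + 1
            z.active = True
        if z.active and event in z.events:
            z.told.add(event)
    for z in threads:
        # Deactivate any active thread with an unconveyed event the new event disallows.
        if z.active and any((event, alpha) in disallows for alpha in z.unconveyed):
            z.conveyed, z.active = 0, False
            z.told.clear()


# Toy example with invented events.
breaking = ThreadState(("weight_hung", "string_frays", "string_snaps"))
rescue = ThreadState(("weight_hung", "weight_removed", "string_holds"))
D = {("weight_removed", "string_snaps")}
for e in ["weight_hung", "string_frays", "weight_removed"]:
    tell_event(e, [breaking, rescue], D)
```

After the third event, the `rescue` thread is active with two conveyed events, while `breaking` has been deactivated because its unconveyed event `string_snaps` is disallowed.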

3.1.3 Implicated events

An implicated prior event is any event in the Conveyed list of some active narrative thread that has not been told in the story, but is ‘presumed to have occurred’. If α and γ are implicated prior events (in different active threads), and (γ, α) ∈ D, then α is a conflicted implicated prior event. In a similar way to implicated upcoming events, implicated prior events in different threads may remain in conflict with each other over several story steps. A conflicted thread is a thread whose Conveyed list contains at least one conflicted implicated prior event.

An implicated upcoming event is just any member of the Unconveyed list of an active thread. Such an event is predicted to be told in the current story with a confidence level that depends on the confidence we have in the narrative thread of which it is a member. It is conflicts between implicated upcoming events that create suspense.

3.1.4 Confirmed and unconfirmed threads

Active threads may be confirmed or unconfirmed. An active confirmed thread is any thread whose Conveyed list contains at least one told event. Active unconfirmed threads with no told events are important in our system because they allow for a degree of flexibility in the linking together of different narrative threads. Thus, an inactive thread which shares at least one event with some other active thread can become active but unconfirmed for the purposes of suspense calculation. For example, a set of threads which detail the different things that someone might do when they get home can under this rule be activated before the story narrates the moment when they open their front door. We can formalise this in the following way:

An inactive thread Z can become an active unconfirmed thread if any of its (unconveyed) events appears in the Unconveyed list of some other active confirmed thread (and as long as it has no event that is disallowed by some told event).

Thus, in such a case, an inactive (and thus unconfirmed) thread Z becomes active even though none of its events have yet been told in the story. We can say that ‘the confirmation of thread Z is predicted’.
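The activation rule for unconfirmed threads reduces to two set-style checks. The sketch below is one possible reading, with a hypothetical function name and invented "coming home" events; it takes the Unconveyed lists of the active confirmed threads as plain sequences.

```python
def can_activate_unconfirmed(z_events, confirmed_unconveyed_lists, told_events, disallows):
    """Check whether an inactive thread may become active but unconfirmed.

    z_events:                   events of the inactive thread Z (all unconveyed)
    confirmed_unconveyed_lists: Unconveyed lists of the other active confirmed threads
    told_events:                events told so far in the story
    disallows:                  set D of (a, b) pairs, "a disallows b"
    """
    shares_event = any(e in unconveyed
                       for unconveyed in confirmed_unconveyed_lists
                       for e in z_events)
    blocked = any((t, e) in disallows for t in told_events for e in z_events)
    return shares_event and not blocked


# Invented example: the story has told "drive_home", so the homecoming thread
# is active and confirmed, with "open_door" and "enter_house" still unconveyed.
homecoming_unconveyed = ("open_door", "enter_house")
cooking = ("open_door", "cook_dinner", "eat_dinner")

predicted = can_activate_unconfirmed(cooking, [homecoming_unconveyed],
                                     {"drive_home"}, set())
blocked = can_activate_unconfirmed(cooking, [homecoming_unconveyed],
                                   {"drive_home"}, {("drive_home", "cook_dinner")})
```

Here `predicted` is true because `open_door` is shared with the confirmed thread, and `blocked` is false because a told event disallows one of the thread's events.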

3.2 Modelling the reader’s predicted reactions: the suspense algorithm

For each narrative thread, we first determine the following intermediate values: Imminence, Importance, Foregroundedness, and Confidence. We then combine these values to calculate the suspense contribution from each individual narrative thread. Finally we propose a heuristic to combine all these individual narrative thread suspense values and produce the global suspense level for each moment in the story.

3.2.1 Imminence

Each active narrative thread Z generates two values for Imminence. Completion Imminence is related to the number of events in Z still to be conveyed for it to be completed or to ‘succeed’. Figure 1 shows a thread with a Completion Imminence number of 4.

Figure 1: Completion Imminence. Conveyed events are black, unconveyed grey.

Interruption Imminence is related to the smallest number of events still to be conveyed in some other thread Y before an event is told which can interrupt Z by disallowing one of its events. In the case where no thread can interrupt Z, the Interruption Imminence of Z is zero. In Figure 2, thread A has an Interruption Imminence number of 3 due to thread B.

Figure 2: Interruption Imminence

If a large number of events must be told for a thread to be completed, the Imminence is low, and vice versa. To model this behaviour, we adopted the ratio 1/x. Also, to enable exploration of the relative effects of Completion and Interruption Imminence on this measure, we used a factor ρ to vary the relative weighting of these two effects.

We can now give a first definition of the Total Imminence_n(Z) of a narrative thread Z after the nth event in the story.

$$\text{Total Imminence} = \rho \, \frac{1}{H} + (1 - \rho) \, \frac{1}{R} \qquad (1)$$

where H is the number of events to the completion of Z and R is the minimum number of events before an event in some other narrative thread could be told which would disallow some unconveyed event in Z.

Experimentation with the implementation of our model led us to choose ρ = 0.7, in effect boosting the relative effect of Completion Imminence.
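Equation (1) with ρ = 0.7 can be written as a short function. This is a sketch only: the function name is hypothetical, the "no interrupting thread" case is encoded as `r=None` (giving an interruption term of zero, as stated above), and rejecting `h < 1` is an assumption, since the text does not say how an already completed thread should be scored.

```python
RHO = 0.7  # weighting reported in the text


def total_imminence(h, r=None, rho=RHO):
    """Total Imminence of a thread, following equation (1).

    h: number of events still to be conveyed before the thread completes (H)
    r: minimum number of events before some other thread could interrupt (R),
       or None when no thread can interrupt (interruption term is zero)
    """
    if h < 1 or (r is not None and r < 1):
        raise ValueError("h and r must be at least 1 (assumption of this sketch)")
    completion = 1.0 / h
    interruption = 0.0 if r is None else 1.0 / r
    return rho * completion + (1.0 - rho) * interruption


# Purely illustrative: combine the counts shown in Figures 1 and 2 (H = 4, R = 3)
# as if they described the same thread.
combined = total_imminence(4, 3)      # 0.7 * 1/4 + 0.3 * 1/3, roughly 0.275
uninterruptible = total_imminence(4)  # 0.7 * 1/4, roughly 0.175
```

Because both terms use the ratio 1/x, a thread one event away from completion contributes the full weight ρ, and the contribution falls off quickly as more events remain.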

3.2.2 Foregroundedness

We use a parameter called Foregroundedness to represent how present a given narrative thread is in the reader’s mind. This is similar to the concept of recency in the psychological literature³. The Foregroundedness of each narrative thread changes with each new event in the story and varies between 0 and 1. Threads which contain the current story event are considered to be very present and get ascribed the maximum level of Foregroundedness, that is 1.

In our current model, the Foregroundedness of all other narrative threads is simply set to decrease at each story step according to the following decay function:

$$\text{decayFunction}(x) = \beta x, \quad \text{where } 0 < \beta < 1 \qquad (2)$$

Experimentation led us to use β = 0.88.
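The Foregroundedness update can be sketched as follows. The sketch assumes that the decay function is applied to the thread's current Foregroundedness at every story step in which the thread does not contain the told event, so that a thread last foregrounded k steps ago has the value β^k; the function name is hypothetical.

```python
BETA = 0.88  # decay factor reported in the text


def update_foregroundedness(current, thread_events, story_event, beta=BETA):
    """Return the thread's Foregroundedness after one story step.

    Threads containing the current story event are reset to the maximum, 1;
    all other threads decay by the factor beta (assumed reading of equation 2).
    """
    if story_event in thread_events:
        return 1.0
    return beta * current


# Invented example: a thread is mentioned once, then ignored for three steps.
thread = ("bomb_planted", "bomb_explodes")
f = update_foregroundedness(0.0, thread, "bomb_planted")   # reset to 1.0
for other_event in ["judge_drives", "radio_on", "traffic_light"]:
    f = update_foregroundedness(f, thread, other_event)    # 0.88 ** 3 after the loop
```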

3.2.3 Confidence

Depending on the number of its conflicted prior events, a thread will have varying degrees of Confidence as the story progresses. In Figure 3, we show narrative threads A and B that share an event, the event which has just been told in the story.

Figure 3: Threads with a shared event. Implicated prior events have a question mark. Bidirectional arrows show mutual disallowing relations.

We see that two implicated prior events in A are in conflict with implicated prior events in B. Overall then, thread A has two conflicted prior events and one confirmed event. Narrative threads with many conflicted prior events will have a low Confidence level, reducing their potential effect on suspense. We define the Confidence_n(Z) of a narrative thread Z after the nth event in the story as follows:

$$\text{Confidence} = \frac{1}{1 + \phi \, \frac{Q}{P}}, \quad \text{where } \phi = 1.5 \qquad (3)$$

where P is the (non-zero) number of told events and Q the number of conflicted implicated prior events in Z and 0 ≤ Confidence ≤ 1. Note that if the threads containing events conflicting with Z get deactivated, then Z may come to no longer have any conflicted prior events. In such a case, as long as Z has at least one confirmed event, its confidence level would reach the maximum value of 1. In other words, if P > 0, Q = 0, then Confidence = 1. Empirical work on our implementation led us to use a ‘conflicted-to-told ratio’ of φ = 1.5.

³ See for example, Jones et al. (2006).
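As a worked instance of equation (3), thread A in Figure 3 has P = 1 told event and Q = 2 conflicted implicated prior events, which gives Confidence = 1 / (1 + 1.5 × 2 / 1) = 0.25. The sketch below encodes the same calculation; the function name is hypothetical, and raising an error for P = 0 is an assumption, since this passage only defines Confidence for a non-zero number of told events.

```python
PHI = 1.5  # 'conflicted-to-told ratio' weight reported in the text


def confidence(told, conflicted, phi=PHI):
    """Confidence of a thread, following equation (3).

    told:       P, the (non-zero) number of told events in the thread
    conflicted: Q, the number of conflicted implicated prior events
    """
    if told < 1:
        raise ValueError("equation (3) is only defined for P > 0")
    if conflicted < 0:
        raise ValueError("Q cannot be negative")
    return 1.0 / (1.0 + phi * conflicted / told)


thread_a = confidence(told=1, conflicted=2)      # the Figure 3 case
unconflicted = confidence(told=1, conflicted=0)  # maximum Confidence
```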

3.2.4 Importance

We next define Value(Z) as the measure of the Importance of a narrative thread Z.

Value(Z) = the predicted degree of positive or negative appraisal of the storyworld situation that the reader would have, were Z to succeed.

In our model, we use the range (−10, +10) for this value, where −10 and +10 correspond to events about which the reader is very negative (sad, dissatisfied) and very positive (happy, satisfied) respectively.

3.2.5 Our suspense algorithm

After the telling of the nth story event, we calculate the Imminence_n(Z), Foregroundedness_n(Z), Confidence_n(Z) and Importance_n(Z) for each active narrative thread Z. For the general case, we assumed that all four variables were independent and chose multiplication to combine them and create a measure of the suspense contribution of each narrative thread after story event n:

$$\text{Suspense}_n(Z) = \text{Imminence}_n(Z) \times \text{Importance}_n(Z) \times \text{Foregroundedness}_n(Z) \times \text{Confidence}_n(Z) \qquad (4)$$

Once we have the suspense level of each active thread, we assume that the thread with the highest suspense value is the one that will be responsible for the story’s evoked suspense at that point. We therefore define this thread’s suspense value as equivalent to the suspense level of the narrative as a whole at that point in the story.
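Equation (4) and the maximum heuristic can be sketched in a few lines. Two points are assumptions of this sketch rather than statements from the text: since Importance may be negative, threads are ranked here by the magnitude of their contribution, and a story step with no active threads is given a suspense level of zero. The function names and the example numbers are hypothetical.

```python
def thread_suspense(imminence, importance, foregroundedness, confidence):
    """Suspense contribution of one thread, following equation (4)."""
    return imminence * importance * foregroundedness * confidence


def narrative_suspense(contributions):
    """Pick the dominant thread and the suspense level of the narrative.

    contributions: dict mapping thread name to its equation (4) value.
    Ranking by absolute value is an assumption of this sketch.
    """
    if not contributions:
        return None, 0.0
    name = max(contributions, key=lambda k: abs(contributions[k]))
    return name, abs(contributions[name])


# Invented values for two active threads at one story step.
contributions = {
    "bomb_explodes": thread_suspense(0.275, -9.0, 1.0, 1.0),
    "arrives_home": thread_suspense(0.5, 3.0, 0.88, 0.25),
}
dominant, level = narrative_suspense(contributions)
```

In this example the negatively valued `bomb_explodes` thread dominates; under a signed maximum the mildly positive `arrives_home` thread would be selected instead, which is why the ranking rule is flagged as an assumption.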

3.3 Revelatory suspense

As well as the general case of conflict-based suspense described above, our thread-based model allows us to deal with a type of suspense we call revelatory suspense, or curiosity-based suspense. This kind of suspense is linked to the potential disambiguation of a story event that belongs to several narrative threads. There is suspense about which thread will provide the ‘correct’ interpretation of the event.

To understand this, we can imagine that in a given storyworld, event δ is present in several different threads. When δ is told in the story, several threads become activated as candidates to uniquely ‘explain’ it. Subsequent story events may disallow some candidate threads. Exactly which thread turns out to be the correct ‘explanation’ of δ in the storyworld will be determined by the rest of the story. In the situation of Figure 3, there is therefore revelatory suspense about which of threads A and B will remain active to explain the shared event. Revelatory suspense is potentially present as soon as the storyworld has threads with shared events.

We can contrast this disambiguation process with Cheong and Young (2015)’s SUSPENSER system, where suspense varies inversely with the number of possible actions of a central protagonist. Similarly, because the decrease in conflicted events boosts the thread’s Confidence level, in our model, the suspenseful effect of a thread will go up as ambiguity is reduced. However, in the SUSPENSER system, the suspense depends on the protagonist’s options, whereas in our model, it depends on the reduced number of options for the reader.

4 Evaluation

We implemented our formal model and suspense metric computationally. To test our implementation, we designed and wrote a short suspenseful story, the Mafia story, where an important judge drives towards his home with a bomb ticking in his car. The story was inspired by the story used in Brewer and Lichtenstein (1982)’s experiment.⁴

4.1 Original story and storyworld calibration

First we used Zwaan’s protocol (Zwaan et al., 1995) to split the story into separate events each time there was a significant change in either time, space, interaction, subject, cause or goal.

⁴ Both the PROLOG implementation and story variants are available at https://doi.org/10.6084/m9.figshare.5208862.

The next step was to create the storyworld information. Our model is designed to rely on information in a form which could be generated automatically from real-world data or corpora. The actual generation of this information lay however outside the scope of this research. We therefore created the events, E, the narrative threads N, their importance values Value(Z) and the set of disallowing events D by hand, partly modelling our work on the event chains described in Chambers and Jurafsky (2009).

We then conducted an experiment to calibrate the importance values and check the validity of the events in our hand-made narrative threads. The online interface created for the experiment presented a warm-up story and then the Mafia story to the participants (N=40 for 33 story steps, 1320 individual judgements), recording step-by-step self-reported suspense ratings using magnitude estimation (see for example Bard et al. (1996)). The raw suspense ratings were converted to normalised z-scores.

Next, we used our suspense algorithm to produce suspense level predictions for all the steps in the Mafia story. Once we had obtained both predicted and experimental values for suspense levels in the Mafia story, we examined their degree of match and mismatch for different sections of the story. We then adjusted some importance values and made minor modifications to some narrative threads.

4.2 Variant of the original story

Next, we created the Mafia-late story variant, which differed from the original story (henceforth Mafia-early) only in that the vital information suggesting the presence of a bomb in the judge’s car is revealed at a later point in the story. Apart from this change in the event order, we strove to create as realistic a story as possible that used exactly the same events.

With the calibrated importance values from the first study, we then used our implementation to create new predictions for this story variant. For comparison, we show these together with the original Mafia-early predictions in Figure 4⁵.

We then collected human suspense level judgements (N=46 for 31 steps, 1426 individual judgements) for the Mafia-late story. We predicted that the suspense levels calculated by our calibrated model for the Mafia-late story variant would agree with the step-by-step averaged z-scores of ratings for this story given by a new set of participants.

⁵ The Mafia-late story has two events fewer than the Mafia-early story. To facilitate comparison, we have aligned the Mafia-late and Mafia-early results so that as far as possible the same events occur at the same point on the x-axis.

[Figure 4 plot: suspense level (y-axis) against story step (x-axis); series: Mafia-early, Mafia-late.]

Figure 4: Predicted suspense for the story variants. The Mafia-early and Mafia-late stories mention the bomb at steps 4 and 15 respectively.

4.3 Results and Statistical Analysis

The magnitude estimation ratings obtained for each participant were first converted to z-scores. For each story step, we then calculated the mean and standard deviation of the z-scores for all participants which, for comparison, we present together with the predicted values in Figure 5.
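The normalisation just described can be sketched as follows. The ratings in the example are invented, the function names are hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev


def z_scores(ratings):
    """Convert one participant's raw ratings (one per story step) to z-scores."""
    m, s = mean(ratings), pstdev(ratings)
    if s == 0:
        # A participant who gave the same rating everywhere carries no signal.
        return [0.0 for _ in ratings]
    return [(r - m) / s for r in ratings]


def per_step_summary(z_by_participant):
    """Mean and standard deviation of the z-scores at each story step."""
    return [(mean(step), pstdev(step)) for step in zip(*z_by_participant)]


# Invented magnitude-estimation ratings for two participants over three steps.
raw = {"p1": [10, 20, 30], "p2": [1, 5, 3]}
z = [z_scores(r) for r in raw.values()]
summary = per_step_summary(z)
```

Normalising within each participant removes differences in the personal scale that magnitude estimation allows, so that only the shape of each participant's suspense curve is compared across participants.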

[Figure 5 plot: suspense level in z-scores (y-axis) against story step (x-axis); series: predictions, experimental results.]

Figure 5: Experimental and predicted suspense for the Mafia-late story

For these two curves, the Pearson Correlation Coefficient is 0.8234 and the Spearman’s Rho Coefficient is 0.794, both values indicating a strong positive correlation. However, the vertical standard deviation values for the z-scores are large, suggesting large variations between participants’ responses. To check inter-coder reliability, we performed Fleiss’ Kappa test and achieved a value of 0.485. Landis and Koch (1977) interpret this value as signifying moderate inter-coder agreement. Levels of agreement as measured by Fisher’s test between predicted and experimental suspense levels for our story-variant show highly significant success in prediction (P=0.002).
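For readers who wish to reproduce this kind of comparison on their own data, the two correlation coefficients can be computed with the standard library alone. The sketch below uses hypothetical function names and invented numbers; it does not reproduce the values reported above, and it treats Spearman's rho as the Pearson correlation of average ranks, which is the usual definition in the presence of ties.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(xs), ranks(ys))


# Invented predicted and observed suspense values for five story steps.
predicted = [0.5, 1.0, 2.5, 2.0, 4.0]
observed = [-1.1, -0.6, 0.4, 0.2, 1.1]
r = pearson(predicted, observed)
rho = spearman(predicted, observed)
```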

5 Conclusions and further work

We believe that narrative and more specifically suspense is an important topic of study. As discussed in Delatorre et al. (2016), suspense is a pervasive narrative phenomenon that is associated with greater enjoyment and emotional engagement.

In this paper, we describe a formal model of suspense based on four variables: Imminence, Importance, Foregroundedness and Confidence together with a method for measuring suspense as a story unfolds. The model enabled us to predict step-by-step fluctuations in suspensefulness for a short story which correlate well with average self-reported human suspense judgements.

Our method for obtaining suspense judgements was intrusive and may have created some interference with the reading process. Ideally, future research should explore less intrusive methods that use direct physiological measurements of the participants. So far, the search for measurement methods that correspond with perceived suspense has been unsuccessful (Cheong and Young, 2015).

A key difference from previous work is that our model predicts and evaluates suspense at multiple stages within the same story. This is why we focussed on variations in one storyworld instead of a large corpus of stories. Indeed, our goal is to create the first model of the suspense evoked as a narrative is being received, and not just a single overall suspense rating. Also, instead of starting from character goals and plans, our basic construct is the narrative thread, which is akin to narrative schemas that can be harvested automatically (Chambers and Jurafsky, 2009). In future work, we aim to apply our method to such automatically harvested schemas and extend our model to different storyworlds and story variants.

References

Alexandre C Albuquerque, Cesar Tadeu Pozzer, and Angelo EM Ciarlini. 2011. The usage of the structural-affect theory of stories for narrative generation. In Games and Digital Entertainment (SBGAMES), 2011 Brazilian Symposium on Games and Digital Entertainment, pages 250–259. IEEE.

Ellen Gurman Bard, Dan Robertson, and Antonella Sorace. 1996. Magnitude estimation of linguistic acceptability. Language, 72(1):32–68.

W.F. Brewer and E.H. Lichtenstein. 1982. Stories are to entertain: A structural-affect theory of stories. Journal of Pragmatics, 6(5-6):473–486.

Charles B Callaway and James C Lester. 2002. Narrative prose generation. Artificial Intelligence, 139(2):213–252.

Marc Cavazza and Fred Charles. 2005. Dialogue Generation in Character-based Interactive Storytelling. AAAI, AIIDE2005, pages 21–26.

Marc Cavazza, Fred Charles, and Steven J Mead. 2002. Character-Based Interactive Storytelling. IEEE Intelligent Systems, 17(4):17–24.

Nathanael Chambers and Dan Jurafsky. 2009. Unsupervised learning of narrative schemas and their participants. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, volume 2, pages 602–610. Association for Computational Linguistics.

Nathanael Chambers and Daniel Jurafsky. 2010. A database of narrative schemas. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC’10), Valletta, Malta. European Language Resources Association (ELRA).

Yun-Gyung Cheong and R Michael Young. 2015. Suspenser: A story generation system for suspense. IEEE Transactions on Computational Intelligence and AI in Games, 7(1):39–52.

Eugenio Concepcion, Gonzalo Mendez, Pablo Gervas, and Carlos Leon. 2016. A Challenge Proposal for Narrative Generation Using CNLs. In 9th International Natural Language Generation Conference, INLG, pages 171–173, Edinburgh, Scotland, UK.

Pablo Delatorre, Barbara Arfe, Pablo Gervas, and Manuel Palomo-Duarte. 2016. A component-based architecture for suspense modelling. Proceedings of AISB, 3rd International on Computational Creativity (CC2016), pages 32–39.

Richard Doust. 2015. A domain-independent model of suspense in narrative. Ph.D. thesis, The Open University.

R.J. Gerrig and A.B.I. Bernardo. 1994. Readers as problem-solvers in the experience of suspense. Poetics, 22(6):459–472.

Arthur C Graesser, Murray Singer, and Tom Trabasso. 1994. Constructing inferences during narrative text comprehension. Psychological Review, 101(3):371.

H. Hoeken and M. van Vliet. 2000. Suspense, curiosity, and surprise: How discourse structure influences the affective and cognitive processing of a story. Poetics, 27(4):277–286.

Matt Jones, Bradley C Love, and W Todd Maddox. 2006. Recency effects as a window to generalization: separating decisional and perceptual sequential effects in category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32(2):316–332.

Walter Kintsch. 1980. Learning from text, levels of comprehension, or: Why anyone would read a story anyway. Poetics, 9(1):87–98.

J Richard Landis and Gary G Koch. 1977. The measurement of observer agreement for categorical data. Biometrics, 33(1):159–174.

Michael Lebowitz. 1985. Story-telling as planning and learning. Poetics, 14(6):483–502.

J.R. Meehan. 1977. Tale-spin, an interactive program that writes stories. In Proceedings of the Fifth International Joint Conference on Artificial Intelligence (IJCAI), Cambridge, MA, USA, volume 1, pages 91–98.

Brian O’Neill and Mark Riedl. 2014. Dramatis: A computational model of suspense. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, Quebec City, Quebec, Canada, pages 944–950.

Rafael Perez y Perez and Mike Sharples. 2001. Mexica: A computer model of a cognitive account of creative writing. Journal of Experimental and Theoretical Artificial Intelligence, 13(2):119–139.

Mark O Riedl and Robert Michael Young. 2010. Narrative planning: balancing plot and character. Journal of Artificial Intelligence Research, 39(1):217–268.

Roger C Schank and Robert P Abelson. 1977. Scripts, plans, goals and understanding: an inquiry into human knowledge structures. Lawrence Erlbaum Associates Publishers, Hillsdale, NJ.

S.R. Turner. 1992. MINSTREL: a computer model of creativity and storytelling. University of California at Los Angeles, CA, USA.

Stephen G Ware and R Michael Young. 2014. Glaive: A state-space narrative planner supporting intentionality and conflict. In Proceedings of the 10th International Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE2014, pages 80–86, North Carolina State University, Raleigh, NC USA.

Dolf Zillmann. 1996. The psychology of suspense in dramatic exposition, pages 199–231. Vorderer, P, Wulff, HJ and Friedrichsen, M.

Rolf A Zwaan, Mark C Langston, and Arthur C Graesser. 1995. The construction of situation models in narrative comprehension: An event-indexing model. Psychological Science, 6(5):292–297.
