
IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 6, NO. 2, JUNE 2014 203

A Supervised Learning Framework for Modeling Director Agent Strategies in Educational Interactive Narrative

Seung Y. Lee, Jonathan P. Rowe, Bradford W. Mott, and James C. Lester

Abstract—Computational models of interactive narrative offer significant potential for creating educational game experiences that are procedurally tailored to individual players and support learning. A key challenge posed by interactive narrative is devising effective director agent models that dynamically sequence story events according to players' actions and needs. In this paper, we describe a supervised machine-learning framework to model director agent strategies in an educational interactive narrative, Crystal Island. Findings from two studies with human participants are reported. The first study utilized a Wizard-of-Oz paradigm where human "wizards" directed participants through Crystal Island's mystery storyline by dynamically controlling narrative events in the game environment. Interaction logs yielded training data for machine learning the conditional probabilities of a dynamic Bayesian network (DBN) model of the human wizards' directorial actions. Results indicate that the DBN model achieved significantly higher precision and recall than naive Bayes and bigram model techniques. In the second study, the DBN director agent model was incorporated into the runtime version of Crystal Island, and its impact on students' narrative-centered learning experiences was investigated. Results indicate that machine-learning director agent strategies from human demonstrations yield models that positively shape players' narrative-centered learning and problem-solving experiences.

Index Terms—Bayesian networks, interactive drama, machine learning, narrative, serious games.

I. INTRODUCTION

RECENT years have witnessed substantial growth in research on computational models of interactive narrative in digital games [1]–[5]. Computational models of interactive narrative aim to procedurally adapt story experiences in response to players' actions, as well as tailor story elements to individual players' preferences and needs. A common metaphor for interactive narrative models is a director agent (drama manager), which is a centralized software agent that works behind the scenes to procedurally direct a cast of nonplayer characters and storyworld events [4], [6], [7]. The capacity to augment and revise narrative plans at runtime has shown promise for several applications, including entertainment [8]–[10], art [1], training [11], and education [6], [12], [13]. In education, computational models of interactive narrative have been embedded in narrative-centered learning environments for a range of subjects, including language and culture learning [14], social skills development [12], network security [15], and middle school science [16].

Manuscript received October 06, 2012; revised June 14, 2013 and October 07, 2013; accepted October 24, 2013. Date of publication November 20, 2013; date of current version June 12, 2014. This work was supported in part by the U.S. National Science Foundation under Grants REC-0632450 and DRL-0822200. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. This work was also supported by the Bill and Melinda Gates Foundation, the William and Flora Hewlett Foundation, and EDUCAUSE.

S. Y. Lee was with the Department of Computer Science, North Carolina State University, Raleigh, NC 27695-8206 USA. He is now with SAS Institute, Cary, NC 27513 USA (e-mail: [email protected]).

J. P. Rowe, B. W. Mott, and J. C. Lester are with the Department of Computer Science, North Carolina State University, Raleigh, NC 27695-8206 USA (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCIAIG.2013.2292010

Modeling interactive narrative director agents poses several computational challenges. Interactive narratives correspond to spaces of possible story experiences that grow exponentially in their number of events. Consequently, interactive narrative director agents are expected to effectively navigate large story spaces during runtime, and they are expected to recognize and react to players' subjective experiences. In educational applications of interactive narrative, personalizing story events to support student learning and engagement is a promising direction for improving educational outcomes, but there is a dearth of theoretical guidance, or even best practices, to guide the design of interactive narrative models. In recognition of these challenges, efforts have been undertaken to automatically induce computational models of interactive narrative from large data sets [7], [17], [18].

A promising approach for automating the creation of director agents is machine-learning models directly from human demonstrations. This approach requires humans to simulate director agents by controlling story events in an interactive narrative. Nondirector players, often adopting the role of the story's protagonist, simultaneously explore the narrative environment under the guidance of the human director's actions. Data from these human–human interactions yields a training corpus for automatically inducing models of director agent strategies, which can be obtained by applying supervised machine-learning techniques. The end result of this approach is a data-driven director agent model that can replace the human director in the interactive narrative environment. In educational interactive narratives, the director agent acts as a narrative-centered tutor, providing personalized guidance and feedback

1943-068X © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

within the narrative to enhance the learning environment's pedagogical effectiveness.

In this paper, we present a framework for machine-learning director agent strategies from observations of human–human interactions in an educational interactive narrative. The approach involves training dynamic Bayesian network (DBN) models of director agent strategies from a corpus of human directorial actions. In order to investigate the framework, we use a testbed interactive narrative called Crystal Island. Crystal Island features a science mystery where players investigate a spreading illness afflicting a team of scientists. Results from two empirical studies with human participants playing Crystal Island are reported. The first study employed a Wizard-of-Oz (WOZ) paradigm—in other words, it involved regular (i.e., nonwizard) users interacting with wizard users providing directorial guidance in Crystal Island—in order to generate a training corpus for inducing director agent models. The corpus was used to machine-learn conditional probability tables in a DBN model of the human wizards' directorial strategies. The second study involved a modified version of Crystal Island that replaced human wizards with the DBN director agent model. A comparison between the DBN model and a baseline system is described, including the systems' differential impacts on players' narrative-centered learning experiences. Empirical findings demonstrating our framework's impact on students' learning outcomes are presented.

II. RELATED WORK

Several families of algorithms have been employed for modeling interactive narrative director agents. Classical planning is one prevalent approach because Stanford Research Institute Problem Solver (STRIPS)-style plans align naturally with computational representations of stories. Plan-based director agents monitor and revise the executions of story plans in order to respond to players' actions and preserve desirable narrative properties [8]. In addition to classical planning, reactive planners have been investigated for dynamically responding to player actions under real-time performance constraints. Several of these systems incorporate special-purpose data structures inspired by narrative concepts, such as dilemmas or beats, in order to bundle story content for reactive delivery [1], [19].

Search-based approaches have been investigated for dynamically managing interactive narratives. Search-based approaches attempt to find plot sequences that optimize designer-specified evaluation functions [20]. These formalisms often use memoization or depth-bounded search techniques in order to constrain their computation times. However, they are sensitive to the narrative evaluation functions employed, which may be difficult to craft.

Case-based reasoning techniques have been used in several story-centric interactive narrative systems. The Open-ended Proppian Interactive Adaptive Tale Engine (OPIATE) story director dynamically responds to user actions by retrieving Proppian subplots rooted in particular story contexts using k-nearest neighbor techniques [21]. Work by Sharma et al. [9] modifies earlier search-based drama management approaches [20] by incorporating a case-based player model that approximates users' plot preferences.

Another important class of narrative adaptation techniques relies on decision-theoretic planning algorithms [3], [6], [7], [22]. A family of interactive narrative models, known as declarative optimization-based drama managers (DODM), employs Markov decision processes to encode director agent tasks [7], [22]. DODM models' parameters are automatically induced using online reinforcement learning techniques (such as temporal-difference learning) with large interactive narrative corpora generated from simulated users.

U-Director is an example of a decision-theoretic director agent that utilized dynamic decision networks in an early version of the Crystal Island interactive narrative [6]. The director agent triggers story-based hints that assist students while they investigate the interactive science mystery. THESPIAN is another example of a decision-theoretic interactive narrative system; it endows virtual characters with goal-oriented decision-making models that are loosely based on partially observable Markov decision processes (POMDPs) [3]. Each virtual character implements a recursive "theory of mind" model in order to reason about how its actions impact the beliefs and goals of other agents.

Recently, data-driven techniques have been employed to automatically devise interactive narrative models. Interactive narrative models have been machine learned from simulation data sets [7], [22], induced from large corpora of narrative blog entries [18], and distilled from crowd-sourced narrative descriptions [17]. In work that is perhaps most closely related to our own, Orkin [23] automatically induces models of social behavior and dialog from thousands of human–human interactions in The Restaurant Game. Orkin's approach, called collective artificial intelligence, differs from ours by focusing on models of characters' social behaviors rather than director agent strategies. Further, Orkin's computational framework combines crowd-sourcing, pattern discovery, and case-based planning techniques, whereas our approach leverages WOZ studies and DBNs to induce models of director agent strategies.

Work on computational models of interactive narrative for education has also examined several algorithmic techniques. Efforts to provide personalized, story-embedded support for learning have largely focused on rule-based techniques for delivering hints [13], decompositional partial-order planning techniques for scaffolding problem solving [15], hand-authored dynamic decision networks for narrative-centered tutorial planning [6], and POMDP-based multiagent simulations for interactive pedagogical dramas [14]. Other work has examined character-based models of interactive narrative generation for social skills learning [12], although this work did not investigate director agents. In our work, we induce educational interactive narrative models directly from data generated by human users serving as director agents.

III. DATA-DRIVEN DIRECTOR AGENT FRAMEWORK

Data-driven frameworks for learning models of director agent strategies hold considerable appeal for addressing the challenges inherent in creating director agent models. Our framework for inducing director agent models consists of four phases: corpus collection, model learning, predictive model evaluation, and runtime model evaluation. Fig. 1 depicts the first three stages of the approach for inducing director agent strategies from human–human interaction data using DBNs.

Fig. 1. Data-driven framework for modeling director agent strategies from human demonstrations.
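The four phases can be sketched as a minimal pipeline. This is an illustrative sketch, not the authors' implementation: the function names and event fields are assumptions, and a most-frequent-action lookup table stands in for the DBN model described in the following sections.

```python
from collections import Counter, defaultdict
from typing import Any, Callable, Dict, List

def collect_corpus(sessions: List[List[dict]]) -> List[dict]:
    # Phase 1: flatten per-session wizard/player interaction logs into one corpus.
    return [event for session in sessions for event in session]

def learn_model(corpus: List[dict]) -> Dict[str, str]:
    # Phase 2: induce a predictive model; here, the most frequent wizard
    # action per observed player state stands in for a trained DBN.
    counts: Dict[str, Counter] = defaultdict(Counter)
    for event in corpus:
        counts[event["state"]][event["action"]] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}

def evaluate_predictions(model: Dict[str, str], test_corpus: List[dict]) -> float:
    # Phase 3: offline accuracy of predicted actions against held-out wizard decisions.
    hits = sum(model.get(e["state"]) == e["action"] for e in test_corpus)
    return hits / len(test_corpus)

def evaluate_runtime(model: Dict[str, str], run_study: Callable[[Dict[str, str]], Any]) -> Any:
    # Phase 4: deploy the induced model in the running game and study its effect on players.
    return run_study(model)
```

In practice the first three phases are iterated offline until predictive performance is acceptable, and only then does the fourth, human-subjects phase begin.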

A. Corpus Collection

In order to acquire a corpus of interactive narrative training data that effectively represents canonical directorial sequences, the corpus collection process should address the following considerations. First, the corpus collection should be conducted using a data-collection version of the interactive narrative environment in which a human player, or wizard, controls the behavior of a simulated director agent. Other than the decision-making mechanism responsible for driving the director agent, the narrative environment should mirror the final "director-enhanced" version of the game; it is important for players' interactions to closely resemble the interactions that ultimately occur in the final narrative environment. This symmetry increases the chances that a model trained from corpus data will also perform well when integrated into the final interactive narrative.

Second, the human director agent should be provided with an easy-to-use narrative dashboard in order to control the progression of the interactive narrative. The narrative dashboard enables the human director agent to perform actions (e.g., directing a nonplayer character to perform a specific action, triggering an in-game object to appear) in an omnipotent manner, mimicking the capabilities of a computational director agent. In addition to controls, the narrative dashboard should report on the storyworld's state in order to inform the wizard's decisions. It is important to note that the dashboard must be designed carefully so as not to be too complex for the wizard to effectively use while an interactive narrative is underway.

Third, the wizard should make her directorial decisions based on observable story information brokered by the game environment (e.g., storyworld state, player behaviors), constraints imposed on the interactive story by the plot graph, and beliefs about actions that will lead to the most compelling story for the player.

Fourth, wizards should be given ample opportunity to familiarize themselves with the narrative dashboard and the interactive narrative's structure as encoded by the plot graph. Pilot data collections can be performed as part of training wizards prior to collecting actual corpus data. One should not necessarily expect that a human will be an effective director agent without prior training or opportunities to familiarize herself with the narrative environment.
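As a concrete sketch, a single logged decision point from such a collection might be represented as a record like the following. The field names are assumptions about what such a log could contain, not the schema used in the study.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class DirectorLogEntry:
    """One hypothetical logged decision point from a wizard session."""
    timestamp: float                 # seconds into the session
    narrative_state: str             # e.g., most recent completed plot point
    player_activity: str             # e.g., "tested_food", "updated_worksheet"
    dashboard_state: Dict[str, str] = field(default_factory=dict)  # info shown to the wizard
    wizard_action: str = "no_action" # directorial action taken (the class label)
```

A corpus is then a time-ordered list of such records per session, with `wizard_action` serving as the label that a supervised learner is trained to predict.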

B. Model Learning

Fig. 2. WOZ-enabled Crystal Island.

Models of director agent strategies are machine learned from narrative interaction logs acquired during a corpus collection study with human participants. Models are induced from training data sets using supervised machine-learning techniques specific to the models being devised (e.g., DBNs, naive Bayes). The wizard's directorial actions serve as class labels to be predicted by induced models. In choosing models to encode director agent strategies, several factors should be considered. First, the model should be capable of explicitly representing changes in the director agent's belief state over time, as well as temporal changes to the game's narrative and world states. Second, the director agent model should not only recommend what directorial action to perform at a given decision point, but also indicate the appropriate time to intervene. Third, the director agent should be capable of integrating observations from several sources to inform its directorial strategies. Sources of observations might include narrative history, storyworld state, player activity, and player beliefs. Finally, the model must be capable of addressing these requirements while operating at runtime, functioning efficiently enough for integration with game engine technologies.
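These requirements can be illustrated with a hand-rolled estimator for one conditional probability table of a two-slice temporal model, P(action_t | action_{t-1}, observation_t), learned by Laplace-smoothed counting over logged sequences. This is a simplified sketch under assumed variable names, not the paper's actual network structure; including a "no_action" label is one way to let the model decide when to intervene, not only which action to take.

```python
from collections import defaultdict

class TransitionCPT:
    """Maximum-likelihood CPT with Laplace smoothing for a two-slice model."""

    def __init__(self, actions, alpha=1.0):
        self.actions = list(actions)  # includes a "no_action" option
        self.alpha = alpha            # Laplace smoothing constant
        self.counts = defaultdict(lambda: defaultdict(float))

    def fit(self, sequences):
        # Each sequence is a list of (observation, action) pairs in time order.
        for seq in sequences:
            prev = "start"
            for obs, action in seq:
                self.counts[(prev, obs)][action] += 1
                prev = action

    def prob(self, prev, obs, action):
        # Smoothed estimate of P(action | prev action, current observation).
        row = self.counts[(prev, obs)]
        total = sum(row.values()) + self.alpha * len(self.actions)
        return (row[action] + self.alpha) / total

    def predict(self, prev, obs):
        # Most probable directorial action, which may be "no_action".
        return max(self.actions, key=lambda a: self.prob(prev, obs, a))
```

A full DBN would combine several such tables over narrative, storyworld, and player variables; this single-table version shows only the counting-based parameter estimation step.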

C. Predictive Model Evaluation

Once a director agent model is induced using a training corpus and supervised machine-learning algorithms, the learned model should be evaluated using test data sets to examine the director agent model's ability to predict a human director's narrative decisions. The performance of the learned model can be evaluated with respect to predictive accuracy, including metrics such as precision and recall. The model should be compared to a baseline approach in order to determine how effectively the learned model performs relative to alternate techniques.
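For instance, macro-averaged precision and recall over the set of directorial actions can be computed from held-out predictions as follows. The metric definitions are standard; the action labels are illustrative.

```python
def macro_precision_recall(y_true, y_pred):
    """Macro-averaged precision and recall over all action labels."""
    labels = set(y_true) | set(y_pred)
    precisions, recalls = [], []
    for label in labels:
        tp = sum(t == p == label for t, p in zip(y_true, y_pred))
        fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n
```

The same routine can score the learned model and baselines (e.g., naive Bayes, bigram) on identical test folds, making the comparison direct.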

D. Runtime Model Evaluation

Once a sufficient level of predictive accuracy is obtained, the director agent model can be integrated into the runtime interactive narrative system. This introduces the opportunity to empirically evaluate the induced director agent model through studies with human participants, examining how the system affects and engages players.

IV. CRYSTAL ISLAND INTERACTIVE NARRATIVE TESTBED

To investigate director agent strategies, a WOZ¹ data collection was conducted with a customized version of the Crystal Island interactive narrative [16].

A. Crystal Island

Crystal Island is an educational adventure game that features an interactive science mystery set on a recently discovered tropical island. The game is built using Valve Corporation's Source™ engine, the game technology behind the popular Half-Life® 2 series. Crystal Island has been designed to help middle school students learn microbiology concepts through an engaging and immersive story-centric experience [16]. Within the story, the player adopts the role of a protagonist attempting to discover the identity and source of an infectious disease plaguing a research station that is located on the island. Throughout the mystery, the player is free to explore the world and interact with other characters while forming questions, generating hypotheses, collecting data, and testing hypotheses. The player can pick up and store objects, view posters, operate lab equipment, and talk with nonplayer characters to gather clues about the source of the disease. During the course of solving the mystery, the player completes an in-game diagnosis worksheet to organize her findings, hypotheses, and conclusions. Upon completing the diagnosis worksheet, the player verifies its contents with the camp nurse and develops a treatment plan for the sickened Crystal Island team members.

¹A WOZ study is a study paradigm in which participants interact with what appears to be an autonomous computer system, but is actually a person (the "wizard") in another location.

B. Crystal Island: WOZ Version

For the corpus collection, a custom episode of Crystal Island (Fig. 2) was created that includes a companion agent who assists the player in solving the mystery. The player adopts the role of Alex Reid, visiting her father, Bryce, who serves as the research station's lead scientist. Crystal Island's narrative backstory is as follows: Alex has arrived at Crystal Island to visit her father, whom she has not seen for a while. As she approaches the dock, she hears from Al, the camp foreman, that her father has fallen ill. Al tells her that Audrey, Ford, and her father were out on an expedition gathering specimens. Their expedition was scheduled to last for two days; however, they failed to return to the camp on time. Al found this very unusual since they were known to adhere closely to schedule. Fearful for their safety, Al led a search team to locate them. After two days of searching, the search team discovered that the expedition team had fallen ill on the south side of the island. It appears that the group lost their way, became ill, and could not make it back to the camp. They are in the infirmary and are being attended to by the camp's nurse. Upon hearing the news, Alex hurries to the infirmary to see her father and his colleagues. Kim, the camp's nurse, informs her that their condition is poor. Her father seems to be doing much worse than the others. Kim is baffled by the illness and does not know what could have caused it. She asks Alex to help her identify the disease and its source.

C. WOZ Functionalities

To investigate director agent strategies, Crystal Island was extended to include new functionalities specific to a WOZ study design. In this WOZ-enabled version of Crystal Island, a human wizard provides narrative planning functionalities as well as spoken natural language dialog for the companion character. Playing the role of the camp nurse, the wizard works collaboratively with the player to solve the science mystery while also performing directorial actions to guide the interactive narrative experience. Together in the virtual environment, the wizard and player carry on conversations using voice chat and observe one another's actions while investigating the mystery scenario. In addition to directing the nurse character's navigation, spoken communication, and manipulation behaviors, the wizard guides the player's investigative activities and controls the narrative's pace and progression. To support these activities, the wizard's computer terminal includes a detailed dashboard that provides information about the player's activities in the environment (e.g., reading books, testing objects, updating the diagnosis worksheet) as well as controls to initiate key narrative events in the environment (e.g., introducing new patient symptoms, having a nonplayer character deliver additional items for testing). This narrative dashboard provides the wizard with sufficient capabilities to simulate a computational director agent.

In addition to the new wizard functionalities, the Crystal Island narrative environment was streamlined to increase focus on a reduced set of narrative interactions between the player and wizard, as well as reduce the amount of time spent navigating the environment. This was accomplished by confining the scenario to a single building that encompasses the camp's infirmary and laboratory. Within this environment, the player and wizard gain access to all of the materials needed to solve the science mystery (e.g., sickened researchers, background books and posters, potential sources of the disease, lab equipment). The scenario, controls, and narrative dashboard were refined throughout a series of pilot studies with college students that were conducted prior to the corpus collection described in this paper.

D. Example Scenario

To illustrate the behavior of the WOZ-enabled Crystal Island environment, consider the following scenario. A player has been collaborating with the nurse character, whose behaviors are planned and executed by a human wizard. The player has learned that an infectious disease is an illness that can be transmitted from one organism to another, often through food or water. Under the guidance of the nurse, the player has examined the patients' symptoms and run lab tests on food items. Through this exploration, the player has come to believe that the source of the illness is a waterborne disease and that it is likely cholera or shigellosis. Although she believes cholera is more likely, she is unable to arrive at a final diagnosis. Through her conversation with the nurse character, the wizard determines that the player is having difficulty ruling out shigellosis. The wizard decides that this is an opportune moment to provide a hint. The wizard uses the narrative dashboard to enable the "observe leg cramp symptom" plot point, which results in one of the patients moaning loudly in the infirmary. The player hurriedly examines the patient and informs the wizard, "He has leg cramps. That means it is cholera." The wizard asks the player to update her diagnosis worksheet with her new hypothesis and to explain the reasoning behind her recent finding. The player then provides a detailed explanation justifying her diagnosis, and the story concludes with the nurse treating the patients for cholera.

V. CORPUS COLLECTION PROCEDURE

A study with human participants was conducted in which more than 20 hours of trace data were collected using the WOZ-enabled version of the Crystal Island game environment. The trace data includes detailed logs of all player and wizard actions (e.g., navigation, manipulation, and decision making) in the interactive narrative environment, as well as audio and video recordings of their conversations.

A. Participants

The participants were 33 eighth-grade students (15 males and 18 females) from North Carolina, ranging in age from 13 to 15. Two wizards assisted with the corpus collection, one male and one female. Each session involved a single wizard and a single student. The student and the wizard were physically located in different rooms throughout the session.

B. Wizard Protocol

To improve the consistency of the wizards' interactive narrative decision-making and natural language dialog activities, a wizard protocol was iteratively developed and refined through a series of pilot studies. The resulting protocol included a high-level procedure for the wizards to follow (e.g., introduce yourself as the camp nurse, describe the patient situation to the player), a set of interaction guidelines (e.g., collaboratively work with the player to solve the mystery, encourage the player to explain her conclusions), and a set of narrative guidelines (e.g., descriptions of the overall story structure, suggestions about appropriate contexts for narrative decisions, explanations of event ordering constraints).

Prior to the corpus collection with the eighth-grade students, each wizard was trained on the Crystal Island microbiology curriculum and the materials that would be provided to students during the corpus collection. The wizard training also included information on key concepts from the Crystal Island curriculum and the protocol to follow. After carefully reviewing the materials over the course of a week and having their questions answered, the wizards participated in at least three training sessions with college students. After each training session, a researcher performed an "after action review" with the wizard to discuss his or her interactions with the students and adherence to the wizard protocol. Through these training sessions, wizards gained considerable experience performing directorial actions in Crystal Island, and they developed strategies for directing students toward successful learning and problem-solving outcomes. While the wizards were not professional tutors, they did have significant prior experience in educational technology. Consequently, the wizards were considered to possess above-novice pedagogical skills, as well as familiarity with the Crystal Island environment. These qualifications suggested that the wizards were capable of demonstrating effective directorial strategies for machine learning.


C. Participant Procedure

When participants arrived at the data collection room, they were greeted by a researcher and instructed to review a set of Crystal Island handouts, including information on the back story, task description, characters, and controls. Once participants completed their review of the handouts, the researcher provided further direction on the use of the keyboard and mouse controls. The researcher then informed the participants that they would be collaborating with another human-controlled character, the camp nurse, in the environment to solve the science mystery. Participants were asked to communicate with the camp nurse throughout their sessions. Finally, the researcher answered any questions from the participants, informed them that the sessions were being videotaped, instructed them to put on their headsets and position their microphones, and asked them to direct all future communication to the camp nurse. The researcher remained in the room with the participant for the duration of the session. The Crystal Island session concluded once the participant and wizard arrived at a treatment plan for the sickened virtual scientists. The participants' sessions lasted no more than 60 min. During data analysis, data from one of the participants were eliminated as an outlier (the data were more than three standard deviations from the mean), leaving 32 usable trace data logs.

VI. MODELING DIRECTOR AGENT STRATEGIES WITH DBNS

The WOZ study yielded a training corpus for machine-learning models of director agent strategies in the Crystal Island narrative environment. In order to emulate director agent strategies effectively, it is necessary to induce models that prescribe when to intervene with a directorial action, as well as which action to perform during an intervention. These two tasks correspond to two submodels for the director agent: an intervention model and an action model. Director agents utilize numerous storyworld observations that change over time to accurately determine the most appropriate time to intervene and the next director agent action to perform in an unfolding story.

In order to induce models to perform these tasks, we employed DBNs [24]. A DBN is a directed acyclic graph that encodes a joint probability distribution over states of a complex system over time. DBNs consist of nodes representing random variables, directed links that encode conditional dependencies among random variables, and conditional probability distributions annotating the nodes. A defining feature of DBNs is their use of time slices, which characterize the state of the underlying system at particular phases of time. By utilizing time slices, DBNs support probabilistic inference about events that change over time.

The DBN models in this work were implemented with the GeNIe/SMILE Bayesian modeling and inference library [25]. The DBNs' network structure was hand authored, but the conditional probability tables that annotate each node were automatically induced from the training corpus. The expectation–maximization algorithm from the SMILearn library was used to learn the DBNs' conditional probability tables. Expectation–maximization was used as a matter of convenience; there were no hidden variables or missing data (i.e., missing attributes). SMILearn's implementation of expectation–maximization served as an off-the-shelf method for parameter learning. More specialized parameter-training techniques, such as relative-frequency estimation, would have also been appropriate, but a single iteration of expectation–maximization can serve the same purpose for fully observable models such as the DBNs in this work. After the models' parameters were learned, the resulting networks were used to infer directorial decisions about when to intervene in the Crystal Island interactive narrative, as well as which action to perform during an intervention. It should be noted that while the set of conditional probability values induced as parameters for the DBN models are specific to the story arc, characters, and game environment of Crystal Island, we anticipate that the data-driven methodology advocated by this paper generalizes across narrative spaces and game environments.
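Because the corpus is fully observed, learning each node's CPT reduces to counting: the probability of a node value given its parents' values is its relative frequency in the training data. The following sketch illustrates this equivalence with hypothetical variable names and records; the actual models used GeNIe/SMILE's expectation–maximization implementation rather than this code.

```python
from collections import Counter, defaultdict

def learn_cpt(records, child, parents, child_values, alpha=0.0):
    """Estimate P(child | parents) by relative frequency from fully
    observed records (a list of dicts). alpha > 0 adds Laplace smoothing."""
    counts = defaultdict(Counter)
    for r in records:
        parent_cfg = tuple(r[p] for p in parents)
        counts[parent_cfg][r[child]] += 1
    cpt = {}
    for cfg, c in counts.items():
        total = sum(c.values()) + alpha * len(child_values)
        cpt[cfg] = {v: (c[v] + alpha) / total for v in child_values}
    return cpt

# Hypothetical corpus: each record is one observed time slice of a trace.
records = [
    {"phase": "exposition", "intervene": "no_action"},
    {"phase": "exposition", "intervene": "no_action"},
    {"phase": "climax", "intervene": "action"},
    {"phase": "climax", "intervene": "no_action"},
]
cpt = learn_cpt(records, "intervene", ["phase"], ["action", "no_action"])
# P(intervene = action | phase = climax) = 0.5 in this toy corpus
```

Setting `alpha` above zero implements the smoothing the authors mention as a possible, but unused, mitigation for data sparsity.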

A. Feature Selection

Feature selection is an important problem that has been studied for many years by the machine-learning community [26]. By selecting the most relevant predictor features from a corpus of training data, the performance of machine-learned models is often improved considerably. In our study, we utilized a two-step feature selection approach to induce accurate models of director agents' intervention and action decision strategies. First, we performed a post-study "after action review" with the wizards to discuss their narrative decision strategies while interacting with the players. During the corpus collection, we video recorded all of the actions taken by the wizards within the WOZ-enabled version of Crystal Island. After the corpus collection was over, we asked the wizards to watch the recorded videos and "think aloud" by retrospectively describing why they chose particular narrative decisions. Each wizard was periodically prompted to explain the factors that drove her directorial decisions, and was also asked what features she considered when making the narrative decisions. Afterward, we conducted a qualitative analysis of the wizards' responses to identify candidate features for the intervention and action models.

During the second stage of feature selection, we performed a brute force feature ranking method. Selected features were evaluated with all possible combinations of the input features. Subsets consisting of features that yielded the most efficient performance for each intervention and action model were chosen and implemented. The initial pool consisted of eight features identified from the retrospective commentaries provided by wizards. The automated brute force feature selection process reduced the set to four features in the intervention model and two features in the action model. No features were drawn from the spoken natural language dialogs between wizards and players; for the purpose of this investigation, all features were computed from the in-game trace data.
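A brute-force search of this kind enumerates every nonempty subset of the candidate pool and scores each one. The sketch below uses hypothetical feature names (only physical_state, narrative_progress, player_knowledge, and time_index correspond to variables described in this paper) and a toy scoring function standing in for cross-validated model performance:

```python
from itertools import combinations

def best_subset(features, score):
    """Exhaustively evaluate all nonempty feature subsets (2^n - 1 for n
    features; 255 for an 8-feature pool) and return the highest-scoring one."""
    best, best_score = None, float("-inf")
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            s = score(subset)
            if s > best_score:
                best, best_score = subset, s
    return best, best_score

# Hypothetical 8-feature pool; score() would be cross-validated accuracy
# of the intervention or action model trained on the subset.
pool = ["physical_state", "narrative_progress", "player_knowledge",
        "time_index", "goal_count", "dialog_turns", "visits", "idle_time"]
score = lambda s: len({"physical_state", "narrative_progress"} & set(s)) - 0.01 * len(s)
subset, _ = best_subset(pool, score)
# toy scorer selects ("physical_state", "narrative_progress")
```

The small penalty on subset size in the toy scorer mirrors the preference for compact feature sets that the search produced (four features for the intervention model, two for the action model).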

B. Intervention Strategies

Fig. 3. High-level illustration of the DBN model for director agent intervention strategies.

The high-level structure of the DBN model of director agent intervention strategies, i.e., when to perform a directorial action, is shown in Fig. 3. Three time slices are illustrated in the figure. In each slice, the intervention decision from the previous time slice (the intervention decision at time t-1) influences the current intervention decision (the intervention decision at time t). Within each time slice, observations from the story world, collectively known as the narrative state, also influence the intervention decision. These observations include the physical state of the storyworld, progression of the narrative, user knowledge of the story, and the overall story timeline. Each time slice encodes a probabilistic representation of the director agent's beliefs about the overall state of the narrative.

The DBN director agent intervention model consists of the following variables.

• Intervention decision. Intervention decision is a binary variable with two values: action and no action. Action indicates that a director agent should perform an action to intervene in the story. No action indicates that the director agent should remain inactive.

• Physical state. Physical state is a discrete variable, with nine possible values, that encodes the player's current location and the wizard's location in the story world. Locations are subdivided into discrete regions spanning the environment's virtual infirmary and laboratory. All user interactions occur within these locations.

• Narrative progress. Narrative progress is a discrete variable, with five possible values, that characterizes the narrative structure of Crystal Island's plot. To represent narrative progress, we modeled Crystal Island's plot structure in terms of a five-phase narrative arc framework: exposition, complication, escalation, climax, and resolution. Transitions between narrative phases were deterministically triggered by the occurrence of plot points within Crystal Island's narrative. For example, when the leg-cramp-reveal plot point occurred in the narrative, it marked the transition from the climax phase to the resolution phase, because at this point the student had all of the information needed to diagnose the illness with certainty. These triggers were manually specified by Crystal Island's designer. Both students and wizards experienced the same real-time progression of narrative phases; there were no differences in how wizards and students witnessed the plot advance, although wizards were able to control how and when certain plot points occurred through directorial actions.

• Player knowledge. Player knowledge nodes are discrete variables, with ten possible values, that encode the player's beliefs about salient facts of the story learned through interactions with the narrative environment and nonplayer characters. Player knowledge is measured on an ordinal scale. Within the Crystal Island environment, players complete a diagnosis worksheet while solving the science mystery, which provides details regarding players' current beliefs about the story. Player knowledge node values are determined from user performance on the diagnosis worksheet.

• Time index. Time index nodes are discrete variables that model the overall timeline of the storyworld, providing temporal evidence to guide intervention decisions. There are 84 possible time index values, each corresponding to a discrete time slice in the model.

During runtime, new observations are made available to the director agent, such as player locations and narrative phases, which trigger corresponding nodes in the DBN to update with observed values. These updates propagate throughout the network by changing the marginal probabilities of connected nodes. Propagating evidence enables the model to perform inferences about the most probable intervention decision at time slice t, which encodes whether a director agent should intervene at a given point in time.
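The runtime update can be sketched as follows. The CPT values, the single observation variable, and the variable names here are illustrative stand-ins, not the parameters learned from the Crystal Island corpus; the sketch shows only how evidence about the previous decision and the current observation yields the slice's intervention probability.

```python
# Hypothetical CPT for P(intervene_t = action | intervene_{t-1}, phase_t);
# values are illustrative, not learned parameters.
cpt = {
    ("no_action", "exposition"): 0.10,
    ("no_action", "climax"):     0.60,
    ("action",    "exposition"): 0.05,
    ("action",    "climax"):     0.30,
}

def p_intervene(prior_prev, phase):
    """Marginalize over a (possibly uncertain) previous decision:
    P(action_t | phase_t) = sum_prev P(prev) * P(action_t | prev, phase_t)."""
    return sum(p * cpt[(prev, phase)] for prev, p in prior_prev.items())

def decide(prior_prev, phase, threshold=0.5):
    """Choose the most probable intervention decision for the current slice."""
    return "action" if p_intervene(prior_prev, phase) >= threshold else "no_action"

# Observed: no intervention in the previous slice, climax phase now.
p = p_intervene({"no_action": 1.0}, "climax")    # 0.6
decision = decide({"no_action": 1.0}, "climax")  # "action"
```

When the previous decision is observed with certainty, the marginalization collapses to a single CPT lookup; the uncertain-prior case shows how evidence propagates between slices.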

C. Action Strategies

Fig. 4. High-level illustration of the DBN model for director agent action strategies.

The high-level structure of the DBN model created for director agent action strategies, i.e., which directorial action to perform, is shown in Fig. 4. The set of directorial actions was those controlled by the wizard's narrative dashboard; navigation, object manipulation, conversation, and nonplayer character behaviors were not part of the model. The figure illustrates three time slices and their corresponding narrative action decisions: the action decisions at times t-2, t-1, and t. The three time slices incorporate narrative observations that encode information about the physical state of the storyworld and the narrative's plot progression. The action model shares structural similarities with the intervention model, although different predictor features are used.

In the action model, action decision nodes encode current and prior actions performed by the director agent during interventions. The physical state and narrative progress nodes are identical to the equivalent nodes in the intervention strategies model described in Section VI-B. When selecting directorial actions, the model consults beliefs about the narrative environment's current physical state and plot progress, as well as its prior history of action decisions at times t-1 and t-2 (Fig. 4).

Given a DBN structure such as the one described above, conditional probability tables (CPTs) for each observation node in the network can be induced from a training corpus of wizard–player interaction data. After the model's network structure and conditional probability tables have been obtained, the model can be used to guide director agent strategies during runtime. When the model is used for runtime decision making, observed evidence is provided to the DBN model, which causes updates to marginal probability values in the network that in turn affect the computed probabilities of intervention and action decisions at each time slice.

D. Predictive Model Evaluation Results and Discussion

Models of director agent intervention and action strategies were machine learned from the training corpus. The predictive performance of these models was evaluated with respect to predictive accuracy, recall, and precision using cross-validation techniques. This section describes results from these two machine-learning analyses.

1) Directorial Intervention Model: The unrolled DBN intervention model contained a total of 84 time slices. We chose this number by considering the average length of a narrative episode in the training corpus (approximately 45 min), identifying the minimum time required to perform a runtime Bayesian update without inhibiting Crystal Island's graphical performance (approximately 32 s), and then computing the maximum number of Bayesian updates that could occur in an average episode (45 min × 60 s/min ÷ 32 s ≈ 84). It should be noted that the large number of conditional probabilities in each time slice raised potential data sparsity concerns for machine learning the DBN's parameters. While GeNIe requires the unrolled network be provided as an input for parameter learning, data sparsity issues were ameliorated by GeNIe's parameter learning process: the parameters for a given time slice in a DBN are learned using data from all time slices in the training corpus. In our case, 33 traces multiplied by 22 observations yielded approximately 726 training observations per DBN time slice. In effect, the unrolled DBN's parameter learning process mirrors training a DBN consisting of just two time slices. It is possible to further reduce data sparsity concerns by performing smoothing; however, we did not perform an explicit smoothing step in our analysis. The DBN's conditional probability tables were directly those computed by GeNIe/SMILE's implementation of expectation–maximization with the training corpus.

Statistical analyses were conducted to assess the effectiveness of DBNs for modeling director agent intervention strategies. To determine the relative effectiveness of the DBN model, an alternate naive Bayes model was developed as a baseline for comparison purposes. The naive Bayes model leveraged the same set of observation variables as the DBN model, but the observable variables were assumed to be conditionally independent of one another. Further, temporal relations between successive time slices were not included in the naive Bayes model. By contrast, the DBN model included conditional dependence relations between successive time slices. These relations are shown as directed links in Fig. 3. Both of the models were learned using the same training corpus: trace data collected from 32 player interactions with the Crystal Island game environment during the aforementioned WOZ study.

TABLE I
CLASSIFICATION RESULTS OF DIRECTOR AGENT INTERVENTION MODELS

TABLE II
CLASSIFICATION RESULTS OF DIRECTOR AGENT ACTION MODELS

A leave-one-out cross-validation method was employed in order to ensure that we had enough training data to induce accurate predictive models. In some cases, leave-one-out cross validation can be subject to overfitting (a potential tradeoff of this validation scheme), but in practice we find that it is sufficient for validating models that perform well in runtime settings. Recall, precision, and accuracy were computed for each

model. Table I displays classification results for the naive Bayes and DBN models. The DBN model outperformed the baseline naive Bayes model in all categories, achieving a more than 16% absolute improvement in accuracy over the baseline approach. The DBN model also achieved sizable absolute gains in recall and precision, 34% and 50%, respectively, as compared to the baseline approach.

The evaluation indicates that inducing DBN models from human demonstrations of director agent intervention strategies is an effective approach, and that the independence assumptions inherent in naive Bayes models may not be appropriate for interactive narrative environments.

2) Directorial Action Model: The unrolled DBN model of director agent action strategies contained a total of 22 time slices. We chose 22 time slices because that was the maximum number of interventions observed in the training episodes. Once again, the DBN used conditional probability table values induced by GeNIe/SMILE's implementation of expectation–maximization run on the training corpus. Statistical analyses were performed to investigate the effectiveness of DBNs for predicting human directors' action decisions during interactive narrative interventions. To examine the relative effectiveness of the DBN model compared to a baseline approach, a bi-gram model was developed in which only the previous action decision was used to predict the next action decision. This network structure was an appropriate baseline for comparing against the DBN model because it represents the most basic temporal probabilistic model for guiding director agent strategies. The models' predictive performances were evaluated in terms

of overall accuracy, macro-averaged recall, and macro-averaged precision. The action decision model solves a multiclass classification problem, which requires the use of macro-averaging techniques when using evaluation metrics such as recall and precision [27]. Macro-averaging is commonly used by the natural language processing community to evaluate linguistic classifiers. Macro-averaging computes recall and precision separately for each predicted class and then takes the arithmetic mean across classes, giving each class equal weight; the resulting values are the macro-averaged recall and macro-averaged precision, respectively.

Leave-one-out cross validation was employed to evaluate the DBN and bi-gram director agent models. Results from the evaluation are shown in Table II. The bi-gram model predicted human directorial actions with 71% accuracy, whereas the DBN model achieved an accuracy of 93.7%. These results correspond to a 23% absolute improvement by the DBN model over the baseline approach. Further, the DBN model achieved sizable gains in both recall and precision compared to the baseline approach. These results suggest that leveraging evidence about narrative structure, physical locations, and action decision history can substantially improve models of director agents' action decisions.
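Macro-averaged recall and precision can be computed per class and then averaged uniformly, so that frequent and rare directorial actions contribute equally to the reported score. A minimal sketch over hypothetical action-label sequences (the action names are illustrative, not the 15 dashboard actions):

```python
def macro_metrics(y_true, y_pred):
    """Macro-averaged recall and precision for multiclass labels:
    compute each metric per class, then take the unweighted mean."""
    classes = sorted(set(y_true) | set(y_pred))
    recalls, precisions = [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
    return sum(recalls) / len(classes), sum(precisions) / len(classes)

# Hypothetical gold vs. predicted directorial actions:
gold = ["hint", "hint", "advance_plot", "quiz"]
pred = ["hint", "advance_plot", "advance_plot", "quiz"]
recall, precision = macro_metrics(gold, pred)
# recall = mean(1, 1/2, 1) = 5/6; precision = mean(1/2, 1, 1) = 5/6
```

Contrast this with micro-averaging, which pools counts across classes and would let a single dominant action dominate the score.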

VII. EVALUATING DBN MODELS OF DIRECTOR AGENT STRATEGIES IN RUNTIME SYSTEMS

To examine the effectiveness of DBN models of director agent strategies in a runtime setting, the intervention model and the action model were both integrated into the Crystal Island interactive narrative environment. After the integration was completed, the updated version of Crystal Island was the subject of a second study involving human participants. In this study, the game environment was identical to the WOZ-enabled version of Crystal Island from the corpus collection with two exceptions: 1) the DBN director agent controlled narrative events and nonplayer character behaviors rather than a human wizard; and 2) the nurse character's dialog was delivered in the form of prerecorded speech and text rather than natural language conversation between the wizard and the participant. While the omission of natural language dialog is a limitation, we anticipated that it would not alter the interactive narrative's dynamics in a manner that would inhibit the DBN director agent's ability to shape players' narrative-centered learning experiences.

The director agent's actions were reified in the narrative environment by the camp nurse character. When a directorial action was chosen, the nurse automatically moved to an appropriate location in the story world and performed the directed action. For example, if the director agent model chose to advance the interactive narrative by helping the player gather patient symptoms, it would direct the camp nurse to approach the player and suggest that the player examine the patient. The camp nurse would also lead the player toward the infirmary where patients lay in medical cots. Whenever the nurse was in an idle state (i.e., not implementing a directorial action), she followed the user around the environment using built-in path finding, only responding to user requests for information. The camp nurse's dialog was presented through simultaneous speech and text; prerecorded speech was provided by a human voice actor, and the text of her dialog appeared at the bottom of the screen. Players interacted with the nurse and other nonplayer characters to receive environmental information (e.g., How does one operate the laboratory equipment? Where is the library?) and uncover clues about the science mystery (e.g., What is a waterborne disease? What are Bryce's symptoms?). Players selected their questions using dialog menus. All character dialog, including the nurse's dialog, was selected deterministically and was fully specified in fixed dialog trees. In the case of the nurse, the dialog tree was designed to approximate, at a simple level, the conversations that occurred during the WOZ corpus collection study. However, the DBN director agent model did not impact characters' dialog behavior, except in cases where triggering dialog was part of a directorial action.

A. Runtime Evaluation Experiment

An evaluation experiment was conducted to examine players' narrative learning experiences while interacting with the updated Crystal Island environment with integrated DBN director agent models. The between-group study design included two experimental conditions. One condition featured the DBN models of director agent strategies to guide narrative events in Crystal Island. The second condition used a simplified model that did not leverage machine-learned directorial strategies. Participants' narrative experiences and learning outcomes were compared between conditions to examine the relative effectiveness of director agent models induced from human demonstrations.

1) Machine-Learned Model: In the condition with machine-learned director agent models, players investigated the Crystal Island mystery under the guidance of the DBN director agent model. The director agent actively monitored players as they interacted with the storyworld, determined when to intervene, and selected appropriate directorial actions to guide participants through the intended narrative. The director agent had control over director intervention decisions (i.e., deciding when to intervene) and director action decisions (i.e., selecting what intervention to perform). The machine-learned director agent model had access to 15 potential directorial actions, listed in Table III, that were originally controlled through the narrative dashboard during the WOZ study. The machine-learned model did not directly control nonplayer characters' behavior, or the nurse's navigation and dialog behavior, except in cases where the behaviors were part of a directorial action listed in Table III.

2) Base Model: In the base condition, players investigated the Crystal Island mystery under the guidance of a minimal director agent model that did not use machine-learned directorial strategies. This director agent model controlled a subset of five directorial actions (Table III) that were required for solving the mystery (i.e., the player could not progress in the narrative without the director taking action). The director agent in this condition was a simple rule-based model, not a machine-learned model. Directorial decisions, such as introducing new patient symptoms and objects, were triggered whenever specific preconditions were satisfied in the environment.

This particular baseline was chosen for several reasons. First, we considered comparing the machine-learned model to a human wizard. However, it was logistically infeasible to recruit, train, and observe additional wizards (with no prior experience) to participate in the study due to the planned number of participants. Furthermore, human wizards would introduce a confounding variable, natural language dialog, to the experiment that would potentially impact the results. Therefore, we removed this study design from consideration. Second, we considered comparing the machine-learned model to a prior director agent model for the Crystal Island environment, U-Director [6]. However, U-Director was designed for an older version of Crystal Island, and it would have required a complete reimplementation to work in the updated environment. Consequently, we elected to pursue a third option, a rule-based model that provides minimal pedagogical guidance, to examine the DBN model's ability to effectively coach students as they solved the science mystery.

TABLE III
DIRECTOR AGENT DECISIONS

B. Study Method

A total of 123 participants played Crystal Island. Participants were middle school students ranging in age from 12 to 15. Six of the participants were eliminated due to hardware and software issues, and 15 participants were eliminated due to incomplete data on pre- and post-experiment questionnaires. Among the remaining participants, 45 were male and 57 were female.

Several days prior to playing Crystal Island, participants completed a short series of pre-experiment questionnaires. The questionnaires included a microbiology curriculum test to assess student understanding of science concepts that were important in Crystal Island's science mystery. The curriculum test consisted of 20 multiple-choice questions about microbiology concepts, including the scientific method, pathogens, disease transmission methods, and infectious diseases. During the experiment, participants were randomly assigned to a study condition, and they were given 45 min to complete Crystal Island's interactive narrative scenario. Immediately after solving the mystery, or after 45 min of interaction, whichever came first, participants exited the Crystal Island game environment and completed a series of post-experiment questionnaires. By terminating interactive narrative episodes after 45 min, it was ensured that the director agent models would not exceed the supported number of time slices. The post-experiment questionnaires included the microbiology curriculum test, which was identical to the pretest version. The post-experiment questionnaires took approximately 30 min to complete. In total, the study sessions lasted no more than 90 min.

C. Results

Statistical analyses were conducted to assess the relative effectiveness of the director agent models between participants. The analyses focused on two experiential outcomes: participants' science learning gains and participants' efficiency at solving the science mystery. The first outcome relates to how successfully the director agent model promotes Crystal Island's primary objective: science learning. The second outcome relates to how effectively the director agent model guides participants toward a desired narrative resolution.

Participants achieved significant positive learning gains from playing Crystal Island. A matched pairs t-test comparing pre- and post-experiment curriculum test scores indicated that participants' learning gains were statistically significant. Examining learning outcomes within each condition revealed that participants in the machine-learned model condition achieved significant learning gains, but participants in the base model condition did not achieve significant gains (Table IV). Furthermore, there was a significant difference between the conditions in terms of learning gains. An analysis of covariance (ANCOVA) comparing post-test scores between conditions while controlling for pretest scores revealed that learning gains from the machine-learned model were significantly greater than those from the base model.

TABLE IV
LEARNING GAINS BY CONDITION

TABLE V
MYSTERY-SOLVING EFFICIENCY BY CONDITION

A second analysis of the director agent model focused on participants' efficiency at solving Crystal Island's mystery. The investigation examined two metrics for each condition: whether participants solved the mystery, and the duration of time taken by participants to solve it. Table V shows means and standard deviations for the metrics in each condition.

To examine whether the two conditions differed in terms of participants who successfully solved the mystery, a chi-square test was performed. The association was significant by both likelihood ratio and Pearson chi-square tests, indicating that the number of participants who solved the mystery varied significantly between the conditions. Participants in the machine-learned model condition solved the mystery at a rate of 92.7%, whereas participants in the base model condition solved the mystery at a much lower rate of 70.2%.

We also examined differences in time taken by participants to solve Crystal Island's mystery narrative. An ANOVA revealed that the difference between the two conditions was statistically significant. These results indicate that participants in the machine-learned model condition solved the mystery significantly more efficiently than participants in the base model condition.

VIII. DISCUSSION

Empirical evaluation of the DBN-based director agent model supports the promise of data-driven approaches for creating interactive narrative systems. Findings from a controlled experiment with human participants revealed that machine-learned DBNs significantly outperformed baseline approaches for modeling director agent strategies in a runtime narrative-centered learning environment. Participants who interacted with machine-learned director agent models achieved significantly greater learning outcomes than participants in the baseline condition, and they also solved the science mystery scenario more frequently and efficiently. These experiential measures are highly relevant for educational applications of interactive narratives, which is a key focus of the Crystal Island game environment. A possible explanation for these results is that the DBN-based director agent delivered hints and prompts effectively to reduce obstacles to student learning, as well as maintained effective pacing to minimize student boredom and frustration. In contrast, the base model did not provide comparable story-embedded educational support, either in terms of timeliness or content. It should be noted that these results merit complementary analyses that focus on aesthetic dimensions of player experiences. Studies of players’ subjective experiences,


as well as empirical patterns in their aesthetic experiences, are important for understanding the aggregate impacts of director agent models.

Furthermore, statistical analyses involving leave-one-out

cross validation found that DBNs are capable of accurately modeling human demonstrations of directorial strategies. We believe that DBNs’ ability to accurately model human directorial strategies is a key reason for their effectiveness in the runtime Crystal Island environment. The findings are also notable because the DBN models did not incorporate any features from the spoken natural language dialogs that occurred in the WOZ study. While natural language dialog is widely believed to be a key channel in learning-related communication [28], in this case it was not necessary for inducing models of directorial strategies with high degrees of accuracy, or for inducing a director agent model that effectively enhanced student learning and problem-solving outcomes in a runtime environment.

While supervised machine-learning techniques based on

DBNs show considerable promise, they do have limitations. The computational complexity of probabilistic belief updates increases as growing numbers of narrative features or actions are introduced to a model, or as a network’s connectedness increases. Increasing the dimensionality of narrative representations can lead to increased predictive power, but these gains come at the cost of runtime performance. A careful balance must be maintained to ensure that director agent models can effectively reason about key aspects of an interactive narrative while still completing inferences within reasonable durations of time. These tradeoffs are particularly salient in graphically intensive games, which may have limited central processing unit (CPU) and memory resources available for AI-related computations.

Furthermore, the current framework’s requirement that a

corpus collection version of an interactive narrative mirror the final version of the interactive narrative may prove limiting for some categories of games. A growing number of games are updated continuously after release through downloadable content, expansion packs, and in-app purchases. These updates result in game dynamics that frequently change. This relatively recent advancement in game distribution poses notable challenges for data-driven approaches to devising interactive narrative models, but at the same time modern networking technologies introduce substantial opportunities for collecting training data at relatively low costs.
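The leave-one-out cross-validation scheme discussed earlier can be sketched as follows. The model here is a deliberately simple stand-in (a majority-label predictor) rather than the learned DBN, and the session and label structure is hypothetical.

```python
# Sketch of leave-one-out cross validation over wizard sessions: each
# session is held out in turn, a model is trained on the remainder, and
# predictions on the held-out session are scored. The majority-label
# "model" below is a hypothetical placeholder for a learned predictor.
from collections import Counter

def loocv_accuracy(sessions, fit, predict):
    """sessions: list of sessions, each a list of (features, label) pairs.
    fit/predict are the training and prediction callables under test."""
    correct = total = 0
    for i, held_out in enumerate(sessions):
        training = [ex for j, s in enumerate(sessions) if j != i for ex in s]
        model = fit(training)
        for features, label in held_out:
            correct += int(predict(model, features) == label)
            total += 1
    return correct / total

def fit_majority(training):
    # Most frequent directorial action in the training sessions.
    return Counter(label for _, label in training).most_common(1)[0][0]

def predict_majority(model, features):
    # Ignores features entirely; always predicts the majority action.
    return model
```

Swapping `fit_majority`/`predict_majority` for a DBN trainer and inference routine, and tallying per-class true/false positives instead of raw accuracy, would yield the precision and recall figures reported for each model.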
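The inference-cost tradeoff noted above can be made concrete with a toy network: brute-force belief updates sum over every joint configuration (2^n terms for n binary variables), while exploiting structure, here a simple chain, reduces the update to linear work. The chain and its probabilities are invented for illustration and bear no relation to the actual director agent DBN.

```python
# Toy chain network X1 -> X2 -> ... -> Xn of binary variables with shared
# (invented) parameters, contrasting exponential-cost enumeration with
# linear-cost forward propagation for the same marginal.
import itertools

P_X1 = 0.5                  # P(X1 = 1)
P_TRANS = {1: 0.8, 0: 0.3}  # P(Xi = 1 | X(i-1) = prev)

def marginal_brute_force(n):
    """P(Xn = 1) by summing over all 2^n joint configurations."""
    total = 0.0
    for assignment in itertools.product([0, 1], repeat=n):
        p = P_X1 if assignment[0] == 1 else 1 - P_X1
        for prev, cur in zip(assignment, assignment[1:]):
            p_cur = P_TRANS[prev]
            p *= p_cur if cur == 1 else 1 - p_cur
        if assignment[-1] == 1:
            total += p
    return total

def marginal_forward(n):
    """P(Xn = 1) by forward propagation: O(n) instead of O(2^n)."""
    p = P_X1
    for _ in range(n - 1):
        p = p * P_TRANS[1] + (1 - p) * P_TRANS[0]
    return p
```

Both routines agree on the marginal, but the brute-force version touches 2^n terms; adding narrative variables to a densely connected model pushes exact inference toward the first regime, which is why network connectedness and feature count must be budgeted against frame-time constraints.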

IX. CONCLUSION AND FUTURE WORK

Computational models of interactive narrative offer considerable potential for creating engaging story-centric game experiences that are dynamically tailored to individual players. Devising effective models of director agent strategies is critical for achieving this vision. We have presented a data-driven framework for machine-learning models of director agent strategies using DBNs. The framework enumerates key considerations of the corpus acquisition, model learning, predictive model evaluation, and runtime model evaluation phases of the approach, and its efficacy has been demonstrated through implementations

with the Crystal Island interactive narrative environment. Empirical studies involving human participants revealed that DBNs can accurately predict interactive narrative director agent strategies demonstrated by human players. Furthermore, statistical analyses of data from a controlled experiment indicated that machine-learned DBN director agent models have significant positive impacts on participants’ experiential outcomes, including learning and narrative progress. The results suggest that the proposed framework for modeling director agent strategies holds considerable promise for promoting effective and engaging narrative-centered learning experiences.

Several directions for future work are promising. First, user

modeling played a minor role in this work, but it could serve a much more prominent role in enhancing computational models of interactive narrative. By maintaining fine-grained models of players’ knowledge during interactive narratives, opportunities for new classes of directorial actions may be introduced. Another promising direction is exploring natural language dialog in interactive narrative. Endowing virtual characters with sophisticated natural language dialog capabilities offers a mechanism for guiding players through interactive stories using naturalistic and engaging interfaces. During the WOZ study described in this paper, wizards used natural language dialog to guide participants whenever unexpected behaviors were encountered. While this was emulated through voice-acted speech in the evaluation study, and the lack of natural language dialog did not prevent the machine-learned director agent model from effectively shaping students’ narrative-centered learning outcomes, devising adaptive models of interactive dialog represents a promising line of investigation for future enhancements to narrative-centered learning environments.

ACKNOWLEDGMENT

The authors would like to thank Valve Corporation for access to the Source™ SDK, as well as J. Grafsgaard, K. Lester, M. Mott, and J. Sabourin for assisting with studies.

REFERENCES

[1] M. Mateas and A. Stern, “Structuring content in the Façade interactive drama architecture,” in Proc. 1st Int. Conf. Artif. Intell. Interactive Digit. Entertain., 2005, pp. 93–98.

[2] J. McCoy, M. Treanor, B. Samuel, N. Wardrip-Fruin, and M. Mateas, “Comme il Faut: A system for authoring playable social models,” in Proc. 7th Int. Conf. Artif. Intell. Interactive Digit. Entertain., 2011, pp. 158–163.

[3] M. Si, S. Marsella, and D. Pynadath, “Directorial control in a decision-theoretic framework for interactive narrative,” in Proc. 2nd Int. Conf. Interactive Digit. Storytelling, 2009, pp. 221–233.

[4] H. Yu and M. O. Riedl, “A sequential recommendation approach for interactive personalized story generation,” in Proc. 11th Int. Conf. Autonom. Agents Multiagent Syst., 2012, pp. 71–78.

[5] J. Porteous, J. Teutenberg, F. Charles, and M. Cavazza, “Controlling narrative time in interactive storytelling,” in Proc. 10th Int. Conf. Autonom. Agents Multiagent Syst., 2011, pp. 449–456.

[6] B. W. Mott and J. C. Lester, “U-Director: A decision-theoretic narrative planning architecture for storytelling environments,” in Proc. 5th Int. Joint Conf. Autonom. Agents Multiagent Syst., 2006, pp. 977–984.

[7] D. L. Roberts, M. J. Nelson, C. L. Isbell, M. Mateas, and M. L. Littman, “Targeting specific distributions of trajectories in MDPs,” in Proc. 21st Nat. Conf. Artif. Intell., 2006, pp. 1213–1218.

[8] M. O. Riedl, C. J. Saretto, and R. M. Young, “Managing interaction between users and agents in a multi-agent storytelling environment,” in Proc. 2nd Int. Conf. Autonom. Agents Multiagent Syst., 2003, pp. 741–748.


[9] M. Sharma, S. Ontañón, C. Strong, M. Mehta, and A. Ram, “Towards player preference modeling for drama management in interactive stories,” in Proc. 20th Int. Florida Artif. Intell. Res. Soc. Conf., 2007, pp. 571–576.

[10] D. Thue, V. Bulitko, M. Spetch, and E. Wasylishen, “Interactive storytelling: A player modelling approach,” in Proc. 3rd Artif. Intell. Interactive Digit. Entertain. Conf., 2007, pp. 43–48.

[11] J. Niehaus, B. Li, and M. O. Riedl, “Automated scenario adaptation in support of intelligent tutoring systems,” in Proc. 24th Conf. Florida Artif. Intell. Res. Soc., 2011, pp. 531–536.

[12] N. Vannini, S. Enz, M. Sapouna, D. Wolke, S. Watson, S. Woods, K. Dautenhahn, L. Hall, A. Paiva, E. André, R. Aylett, and W. Schneider, “FearNot!: A computer-based anti-bullying-programme designed to foster peer intervention,” Eur. J. Psychol. Edu., vol. 26, no. 1, pp. 21–44, Jun. 2011.

[13] B. Nelson and D. Ketelhut, “Exploring embedded guidance and self-efficacy in educational multi-user virtual environments,” Int. J. Comput.-Supported Collabor. Learn., vol. 3, no. 4, pp. 413–427, 2008.

[14] W. L. Johnson, “Serious use of a serious game for language learning,” Int. J. Artif. Intell. Edu., vol. 20, no. 2, pp. 175–195, 2010.

[15] J. Thomas and R. M. Young, “Annie: Automated generation of adaptive learner guidance for fun serious games,” IEEE Trans. Learn. Technol., vol. 3, no. 4, pp. 329–343, Oct.–Dec. 2011.

[16] J. P. Rowe, L. R. Shores, B. W. Mott, and J. C. Lester, “Integrating learning, problem solving, and engagement in narrative-centered learning environments,” Int. J. Artif. Intell. Edu., vol. 21, no. 1–2, pp. 115–133, 2011.

[17] B. Li, S. Lee-Urban, and M. O. Riedl, “Toward autonomous crowd-powered creation of interactive narratives,” in Proc. 5th AAAI Workshop Intell. Narrative Technol., 2012, pp. 20–25.

[18] R. Swanson and A. Gordon, “Say anything: A massively collaborative open domain story writing companion,” in Proc. 1st Joint Int. Conf. Interactive Digit. Storytelling, 2008, pp. 32–40.

[19] H. Barber and D. Kudenko, “Dynamic generation of dilemma-based interactive narratives,” in Proc. 3rd Artif. Intell. Interactive Digit. Entertain. Conf., 2007, pp. 2–7.

[20] P. Weyhrauch, “Guiding interactive drama,” Ph.D. dissertation, Schl. Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA, 1997.

[21] C. R. Fairclough, “Story games and the OPIATE system: Using case-based planning for structuring plots with an expert story director agent and enacting them in a socially situated game world,” Ph.D. dissertation, Dept. Comput. Sci., Trinity College, Univ. Dublin, Dublin, Ireland, 2004.

[22] M. J. Nelson, D. L. Roberts, C. L. Isbell, and M. Mateas, “Reinforcement learning for declarative optimization-based drama management,” in Proc. 5th Int. Joint Conf. Autonom. Agents Multiagent Syst., 2006, pp. 775–782.

[23] J. Orkin, “Collective artificial intelligence: Simulated role-playing from crowdsourced data,” Ph.D. dissertation, MIT Media Lab, Massachusetts Inst. Technol., Cambridge, MA, USA, 2013.

[24] T. Dean and K. Kanazawa, “A model for reasoning about persistence and causation,” Comput. Intell., vol. 5, no. 3, pp. 142–150, 1990.

[25] M. J. Druzdzel, “SMILE: Structural modeling, inference and learning engine and GeNIe: A development environment for graphical decision-theoretic models,” in Proc. 16th Nat. Conf. Artif. Intell., 1999, pp. 342–343.

[26] I. Guyon and A. Elisseeff, “An introduction to variable and feature selection,” J. Mach. Learn. Res., vol. 3, pp. 1157–1182, Mar. 2003.

[27] Y. Yang and X. Liu, “A re-examination of text categorization methods,” in Proc. 22nd Annu. Int. ACM SIGIR Conf. Res. Develop. Information Retrieval, 1999, pp. 42–49.

[28] D. Litman and K. Forbes-Riley, “Predicting student emotions in computer-human tutoring dialogues,” in Proc. 42nd Annu. Meeting Assoc. Comput. Linguistics, 2004, DOI: 10.3115/1218955.1219000.

Seung Y. Lee received the Ph.D. degree in computer science from North Carolina State University, Raleigh, NC, USA, in 2012.

His research interests are in multimodal interaction, interactive narrative, statistical natural language processing, and affective computing. He is currently with the SAS Institute, Cary, NC, USA, where his work focuses on machine learning and sentiment analysis in natural language processing.

Jonathan P. Rowe received the B.S. degree in computer science from Lafayette College, Easton, PA, USA, in 2006, and the M.S. and Ph.D. degrees in computer science from North Carolina State University, Raleigh, NC, USA, in 2010 and 2013, respectively.

He is a Research Scientist in the Center for Educational Informatics, North Carolina State University. His research investigates artificial intelligence, data mining, and human–computer interaction in advanced learning technologies, with an emphasis on game-based learning environments. He is particularly interested in modeling how learners solve problems and experience interactive narratives in virtual worlds.

Bradford W. Mott received the Ph.D. degree in computer science from North Carolina State University, Raleigh, NC, USA, in 2006.

He is a Senior Research Scientist in the Center for Educational Informatics, North Carolina State University. His research interests include artificial intelligence and human–computer interaction, with applications in educational technology. In particular, his research focuses on game-based learning environments, intelligent tutoring systems, computer games, and computational models of interactive narrative. Prior to joining North Carolina State University, he worked in the video games industry developing cross-platform middleware solutions for PlayStation 3, Wii, and Xbox 360.

James C. Lester received the B.A. degree in history from Baylor University, Waco, TX, USA, in 1983, and the B.A. (highest honors, Phi Beta Kappa), M.S., and Ph.D. degrees in computer science from the University of Texas at Austin, Austin, TX, USA, in 1986, 1988, and 1994, respectively.

He is a Distinguished Professor of Computer Science at North Carolina State University, Raleigh, NC, USA, and the Founding Director of the Center for Educational Informatics, North Carolina State University. His research focuses on transforming education with technology-rich learning environments. Utilizing artificial intelligence, game technologies, and computational linguistics, he designs, develops, fields, and evaluates next-generation learning technologies for K-12 science, literacy, and computer science education. His work on personalized learning ranges from game-based learning environments and intelligent tutoring systems to affective computing, computational models of narrative, and natural language tutorial dialog. He is an author of over 150 publications in these areas.

Dr. Lester has served as Program Chair for the International Conference on Intelligent Tutoring Systems, the International Conference on Intelligent User Interfaces, and the International Conference on Foundations of Digital Games, and as Editor-in-Chief of the International Journal of Artificial Intelligence in Education. He has been recognized with a National Science Foundation CAREER Award and a number of Best Paper Awards.