Online Resource 1 for:
The Effect of Contextualized Conversational
Feedback in a Complex Open-Ended Learning
Environment
Published in: Educational Technology Research & Development
Corresponding Author:
James R. Segedy
Institute for Software Integrated Systems, Vanderbilt University
PMB351829
Nashville, TN 37235
Phone: +1-615-417-6551
Fax: +1-615-343-7440
Email: [email protected]
John S. Kinnebrew
Institute for Software Integrated Systems, Vanderbilt University
PMB351829
Nashville, TN 37235
Email: [email protected]
Gautam Biswas
Institute for Software Integrated Systems, Vanderbilt University
PMB351829
Nashville, TN 37235
Email: [email protected]
Analyzing Learning Behaviors and Strategies
Introduction
This document provides a comprehensive description of a data mining
methodology for assessing and comparing student learning behaviors. The
methodology includes three major components: (1) action abstraction while maintaining
context using a relevance metric; (2) generation and comparison of hidden
Markov models to analyze aggregated student learning behaviors; and (3)
combining sequential pattern mining and episode discovery/mining to identify
differences in specific learning behavior patterns. Additionally, this document
demonstrates the usefulness of these techniques by analyzing a set of data from
the Betty’s Brain computer-based learning environment (Leelawong and Biswas
2008).
The traditional approach to measuring students’ self-regulated learning
(SRL) has been through the use of self-report questionnaires (e.g., Pintrich,
Marx, and Boyle 1993; Weinstein, Schulte, and Palmer 1987; Zimmerman and Pons 1986). The
underlying assumption in these questionnaires is that self-regulation is an aptitude
that students possess. For example, the questionnaire items might attempt to
assess students’ inclination to elaborate as they read a passage, or to determine
their approach to managing available time resources (Perry and Winne 2006;
Zimmerman 2008). This approach has been useful, as the self-report
questionnaires have been shown to be good predictors of students’ standard
achievement test scores, and they correlate well with achievement levels (Pintrich
et al. 1993; Zimmerman and Pons 1986). However, Hadwin and others (Azevedo
and Witherspoon 2009; Hadwin, Nesbit, Jamieson-Noel, Code, and Winne 2007;
Hadwin, Winne, Stockley, Nesbit, and Woszczyna 2001; Perry and Winne 2006)
have argued that while the questionnaires provide valuable information about the
learners’ self-perceptions, they fail to capture the dynamic and adaptive nature of
SRL as students are involved in learning, knowledge-building, and problem-
solving tasks.
Recently, studies have illustrated the need to supplement self-report
questionnaires with measures of SRL as dynamic processes (Hadwin et al. 2007;
Winne and Jamieson-Noel 2002). For example, Hadwin et al. (2007) performed a
study that collected activity traces of eight students using the gStudy system
(Perry and Winne 2006). The activity traces were analyzed in four different ways:
(i) frequency of studying events, (ii) patterns of studying activity, (iii) timing and
sequencing of events, and (iv) content analyses of students’ notes and summaries.
The results of this analysis were compared against students’ self-reports on their
self-regulated learning. One of the important findings was that many participants’
self-reports of studying tactics were not well-calibrated with studying events
traced in the gStudy system. The researchers found that the best-matched item
showed only 40% agreement, and the average agreement was 27%. The authors
concluded from this study that direct analyses of student activity traces in e-
learning environments are important for furthering our understanding of both how
students apply SRL, and how that application affects their learning performance.
Increasingly, researchers have begun to utilize trace methodologies in
order to examine the complex temporal patterns of self-regulated learning
(Aleven, McLaren, Roll and Koedinger 2006; Azevedo and Witherspoon 2009;
Azevedo, Witherspoon, Chauncey, Burkett, and Fike 2009; Biswas et al. 2010;
Hadwin et al. 2007; Jeong and Biswas 2008; Zimmerman 2008). Underlying these
approaches is a move away from assessing self-regulation as an intrinsic aptitude
and, instead, assessing it as dynamic and adaptive event occurrences. By
identifying and analyzing temporal event sequences from trace data (e.g., student
actions in a computer-based learning environment), we hope to better understand
students’ self-regulated learning strategies and develop online measurement
schemes to provide more adaptive feedback and scaffolding.
Action Abstraction with Context
Like many other computer-based learning environments, Betty’s Brain
produces extensive logs of the actions performed by students using the system. To
effectively perform sequential data mining/analysis, raw logs of learning
interactions must first be transformed into an appropriate sequence of actions by
raising the level of abstraction from raw log events to a canonical set of distinct
actions. In this step, researcher-identified categories of actions define an initial
alphabet (set of action symbols) for the sequences. This also allows us to filter out
irrelevant information (e.g., cursor position) and to combine qualitatively similar
actions (e.g., querying an agent through different interfaces or about different
concepts in a given topic). Further, this categorization focuses the data mining
analyses on researcher-directed distinctions and features in the student learning
activities.
In Betty’s Brain, we categorize student activities into five qualitatively
distinct actions:
(1) READ: students access one of the pages in the resources;
(2) EDIT: students add, delete, or modify a concept or link in the map;
(3) QUER: students ask Betty a question;
(4) EXPL: students ask Betty to explain her answer to a query; and
(5) QUIZ: students tell Betty to take a quiz.
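As a concrete illustration, the abstraction step might be sketched as follows. The raw event names and the dictionary-based log format here are hypothetical, not the actual Betty’s Brain log schema:

```python
# Hypothetical sketch of the action-abstraction step: raw log events are mapped
# onto the five canonical action symbols, and irrelevant events are dropped.
# Event type names below are illustrative assumptions.
CANONICAL = {
    "resource_page_view": "READ",
    "map_add_concept": "EDIT",
    "map_add_link": "EDIT",
    "map_delete_link": "EDIT",
    "ask_question": "QUER",
    "request_explanation": "EXPL",
    "take_quiz": "QUIZ",
}

def abstract_actions(raw_events):
    """Convert a raw event log into a sequence over the 5-symbol alphabet."""
    sequence = []
    for event in raw_events:
        symbol = CANONICAL.get(event["type"])  # None for irrelevant events
        if symbol is not None:                 # filters out e.g. cursor moves
            sequence.append(symbol)
    return sequence

log = [
    {"type": "resource_page_view"},
    {"type": "cursor_move"},   # irrelevant: filtered out
    {"type": "map_add_link"},
    {"type": "ask_question"},
]
print(abstract_actions(log))  # ['READ', 'EDIT', 'QUER']
```

The dictionary also implements the merging of qualitatively similar actions noted above, since several distinct raw events map to the same EDIT symbol.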
Sometimes abstracting the raw log traces through action categorization
also strips important context from the actions in the trace sequence. For example,
while differentiating the addition of every possible link in a concept map would
result in an unwieldy set of distinct actions, knowing that the link added is the
same as one described in the last read action or in a following query can be
important context information. To maintain a balance between minimizing the
number of distinct actions in the sequences and keeping relevant context
information, we employ a metric indicating the relevance of each action to recent,
previous actions. Based on this relevance metric we split each categorized action
into two distinct actions: (1) high relevance and (2) low relevance to recent
actions. High relevance actions were those whose score exceeded the mean for
that category of action in the full set of student activity traces, while low
relevance actions had scores less than the mean (Biswas et al. 2010).
For the current study, each student action was assigned a relevance score
that depended on the number of relevant previous actions within a 3-action
window. For Betty’s Brain, a prior action is considered relevant to the current
action if it is related to, or operates on, one of the same map concepts or links.
This score provides a measure of informedness for map building/refinement
actions and, similarly, a measure of diagnosticity for map monitoring activities
(Biswas et al. 2010). Overall, the relevance scoring maintains some action context
by providing a rough measure of strategy focus or consistency over a sequence of
actions. For example, if a student edits a link that is part of a causal chain used to
answer a recent query, the query action is counted in the edit’s relevance score.
The increased relevance score suggests a more informed edit because it was
related to a recent query. Similarly, if a student asks a query where the answer
uses a recently-edited link, the edit action is counted toward the relevance score of
the query, implying the student’s use of the query was diagnostic of the edited
link.
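The relevance scoring described above might be sketched as follows. The `objects` field (the set of map concepts/links an action touches) and the tie-breaking behavior exactly at the mean are illustrative assumptions:

```python
# Sketch of the relevance metric: each action's score counts how many of the
# 3 preceding actions touch at least one of the same map concepts or links.
def relevance_scores(actions):
    scores = []
    for i, act in enumerate(actions):
        window = actions[max(0, i - 3):i]          # 3-action window
        score = sum(1 for prev in window
                    if act["objects"] & prev["objects"])
        scores.append(score)
    return scores

def split_by_relevance(actions, scores):
    """Split each categorized action into -H/-L around its category mean."""
    by_cat = {}
    for act, s in zip(actions, scores):
        by_cat.setdefault(act["type"], []).append(s)
    means = {cat: sum(v) / len(v) for cat, v in by_cat.items()}
    return [act["type"] + ("-H" if s > means[act["type"]] else "-L")
            for act, s in zip(actions, scores)]

trace = [
    {"type": "READ", "objects": {"deforestation"}},
    {"type": "EDIT", "objects": {"deforestation"}},  # relevant to the read
    {"type": "EDIT", "objects": {"rainfall"}},       # unrelated to recent actions
]
print(relevance_scores(trace))  # [0, 1, 0]
```

In a real analysis the means would be computed over the full set of student activity traces, as described above, rather than over a single trace.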
Modeling Learning Behaviors with HMMs
The student activities logged by a learning environment result from a
variety of internal cognitive/metacognitive states, strategies, and processes used
by the student. Analyses that do not take into account the sequential nature of
student interactions with the system (e.g., action frequency counts) capture limited,
indirect information about students’ learning strategies and cognitive states.
Employing a direct representation of these internal states and strategies with a
probabilistic automaton, such as a hidden Markov model (HMM) (Rabiner 1989),
has the potential for facilitating identification, interpretation, and comparison of
student learning behaviors (Biswas et al. 2010). In particular, HMMs can provide
a concise representation of student learning strategies and behaviors (e.g.,
strategies and their relationships, as opposed to simple action or sequence
frequencies) derived from learning activity sequences (Jeong and Biswas 2008).
HMMs include a set of states and probabilistic transitions between those states
(i.e., more likely transitions are assigned higher probabilities).
Like a student’s mental processes, the states of an HMM are hidden, meaning that
they cannot be directly observed, but they produce observable emissions (e.g.,
actions in a learning environment). Another important aspect of an HMM is the
likelihood of starting in a given state, which can be important information for
understanding learning behaviors and strategies. Together, three sets of
probabilities form a complete model: (1) transition probabilities, which determine
the likelihood of going from one state to another at each step; (2) state output
probabilities, which define the likelihood of observing different outputs in each
state; and (3) state initial probabilities, which define the likelihood that a state will
be the starting state for an output sequence.
In recent years, a number of researchers have had success in applying
HMM generation techniques to learning interaction data. For example, Pardos
and Heffernan (2011) have successfully applied HMMs (in conjunction with a
bagged decision tree classifier) to predict correctness of student answers in
computer math learning environments. In contrast to research using HMMs for
action prediction, our exploratory methodology uses HMM generation to identify
learning behaviors of student groups and to compare behaviors across those
groups. Boyer et al. (2009) have used HMMs for identifying strategies and high-
level flow in student dialogues with human tutors. Cohen and Beal (2009) have
used HMMs to analyze dialogues with an intelligent tutoring system to identify
different states of engagement. Some of the most closely related work on HMM-based
identification of student learning strategies is the HMM clustering
technique of Shih, Koedinger, and Scheines (2010), in which performance and
learning metrics direct the generation of HMM collections that predict
learning outcomes. In contrast, our technique generates HMMs from a set of
student activity sequences to build a model that represents an aggregate view of
each group’s learning behaviors. The technical details of our approach to
generating HMMs from a set of student activity sequences are presented in
Biswas et al. (2010) and Jeong and Biswas (2008).
The generated HMM for a set of student activity traces provides an
overview of aggregate learning behaviors and strategies in the group of students.
We compare the HMMs between groups based on presence or absence of
identified learning behaviors, as well as differences in their proportion of expected
behavior occurrences. In many cases, the learning behavior patterns and strategies
identified by comparison of HMMs can illustrate marked differences between
groups, such as experimental versus control conditions or high versus low
performers (e.g., Biswas et al. 2010; Jeong and Biswas 2008).
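One plausible way to compute a group model's expected proportion of state occurrences is the stationary distribution of its transition matrix; the sketch below uses illustrative numbers, and the actual comparison procedure in Biswas et al. (2010) may differ:

```python
import numpy as np

# Sketch: the stationary distribution of the transition matrix gives each
# state's long-run share of steps, one reading of "proportion of expected
# behavior occurrences". Matrix values are illustrative only.
transition = np.array([[0.7, 0.3],
                       [0.4, 0.6]])

def stationary_distribution(T, iters=200):
    """Power iteration: repeatedly apply T until the distribution settles."""
    dist = np.full(T.shape[0], 1.0 / T.shape[0])
    for _ in range(iters):
        dist = dist @ T
    return dist

print(stationary_distribution(transition))  # ~[0.571, 0.429]
```

Comparing these vectors between two group HMMs (alongside presence/absence of states) then quantifies how much more one group is expected to occupy a given behavior state.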
Comparative Sequence Mining
While the HMM analysis allows for effective interpretation of common
learning behaviors, states, and strategies in a student group, it may not be
sufficient for a detailed understanding of differences in student learning behavior.
Because this level of modeling aggregates less common behaviors and specific
variations of general strategies into a single state, it can overlook valuable
information about the order of actions within a state. For example, an HMM
state that includes map editing, querying, explanation, and quizzing actions (with
a high self-loop probability) would only indicate what proportion of each activity
was used, but not the order in which those actions occur. Thus, it is impossible to
determine the precise strategy (or strategies) indicated by the state. Perhaps a
student was using regular queries to check Betty’s knowledge during teaching and
periodically assessing her overall progress using a quiz. Another equally plausible
scenario involves the student periodically administering a quiz during teaching
and then using queries and explanations based on quiz questions to isolate
incorrect information in the map.
This issue becomes even more pronounced when trying to compare across
groups of students. When the HMMs generated for two groups contain similar
states, it is still possible that those states represent different, although often
related, learning strategies. To provide a finer-grained, complementary level of
analysis, our methodology employs a novel combination of sequence mining
techniques (Kinnebrew & Biswas 2011).
An important part of this extended analysis is the use of sequential pattern
mining (Agrawal and Srikant 1995) to identify frequent[1] patterns of actions within
a group. However, this often results in a very large number of frequent patterns
that can be difficult to analyze effectively. In general, a major challenge in
effective analysis of frequent patterns is limiting the often large set of results to
interesting or important patterns (i.e., the “effectiveness” of a mining technique as
opposed to its “efficiency” in finding frequent patterns (Agrawal and Imielinski
1993)).
To find these important patterns in a comparative analysis between student
groups, our methodology combines sequential pattern mining with episode
discovery/mining to identify differentially frequent patterns. In contrast to
sequential pattern mining, episode mining (Mannila, Toivonen, and Verkamo
1997) defines frequency as the number of times the pattern can be matched within
a single sequence of actions. Combining sequential pattern mining and episode
mining, our methodology focuses on learning activity patterns that are used
differentially between two groups of students.
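The sequence-mining frequency notion can be sketched in a few lines: s-support counts the traces that contain a pattern as a subsequence. For brevity this sketch allows arbitrary gaps; the max-gap constraint discussed later is omitted here:

```python
# Sketch of s-support: the number of student traces that contain a given
# pattern as an (ordered) subsequence.
def contains_subsequence(trace, pattern):
    it = iter(trace)
    # membership tests consume the iterator, enforcing order
    return all(action in it for action in pattern)

def s_support(traces, pattern):
    return sum(contains_subsequence(t, pattern) for t in traces)

traces = [
    ["READ", "EDIT", "QUIZ"],
    ["READ", "QUER", "EDIT"],
    ["QUIZ", "QUIZ"],
]
print(s_support(traces, ["READ", "EDIT"]))  # 2
```

A full sequential pattern miner enumerates candidate patterns rather than checking a single given one, but the support definition is the same.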
Although generally considered distinct subfields of sequence mining, there
has been some recent work in combining techniques and concepts from sequential
pattern mining and episode mining. For example, Ding, Lo, Han, and Khoo
(2009) employ episode mining over a set of sequences, rather than a single
sequence. However, our methodology separately employs sequential pattern
mining to first identify common patterns within each comparison group. Lo,
Khoo, and Liu (2008) employ multiple frequency thresholds and techniques (from
sequential pattern mining and episode mining) in a combined algorithm for the
discovery of recurrent rules in a single set of sequences.

[1] Pattern frequency in sequential pattern mining is defined as the number of different
sequences (in this case, the number of different student interaction traces) that include the
pattern.
In our analysis of differentially frequent patterns, the sequential pattern
mining frequency measure (i.e., how many students exhibit the given pattern) is
important for identifying patterns common to a group (Kinnebrew & Biswas
2011). We refer to this as the “sequence support” (s-support) of the pattern,
following the convention of Lo et al. (2008), and we call patterns meeting a given
s-support threshold s-frequent. The other frequency metric, employed for
comparing patterns across groups (Kinnebrew & Biswas 2011), is the episode
frequency (i.e., the frequency with which the pattern is repeated within an
interaction trace). For a given trace, we refer to this as the “instance support” (i-
support), following Lo et al. (2008), and we call patterns meeting a given i-
support threshold i-frequent. Further, we allow a maximum gap of one action
between consecutive actions in the patterns[2] to effectively identify frequent
learning interaction patterns despite the noisy nature of learning activities
(Kinnebrew & Biswas 2011).
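A sketch of i-support with the max-gap-of-one constraint follows; greedy, non-overlapping counting is one reasonable reading, and the original algorithm's exact matching semantics may differ:

```python
# Sketch of instance support (i-support): count non-overlapping matches of a
# pattern within a single trace, allowing at most `max_gap` extra actions
# between each consecutive pair of pattern actions.
def match_at(trace, pattern, start, max_gap=1):
    """Return end index (exclusive) if the pattern matches starting at `start`."""
    if trace[start] != pattern[0]:
        return None
    pos = start
    for sym in pattern[1:]:
        # next pattern action must appear within 1 + max_gap positions
        for nxt in range(pos + 1, min(pos + 2 + max_gap, len(trace))):
            if trace[nxt] == sym:
                pos = nxt
                break
        else:
            return None
    return pos + 1

def i_support(trace, pattern, max_gap=1):
    count, i = 0, 0
    while i < len(trace):
        end = match_at(trace, pattern, i, max_gap)
        if end is not None:   # count non-overlapping instances
            count += 1
            i = end
        else:
            i += 1
    return count

# "READ QUIZ EDIT" matches READ->EDIT with one intervening action (gap 1)
print(i_support(["READ", "EDIT", "READ", "QUIZ", "EDIT"], ["READ", "EDIT"]))  # 2
```

With a gap of two or more between the pattern actions, the match fails, which is what makes the metric robust to occasional noise actions without admitting arbitrarily diffuse patterns.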
To compare the identified frequent patterns across groups, we calculate the
difference in group i-support frequencies of each pattern. This focuses the
comparison on patterns that are employed markedly more often by one group,
and it produces four distinct categories of frequent patterns. The two categories
where the patterns are s-frequent in only one group illustrate patterns primarily
employed by that group but not the other, and the two categories where the
patterns are common to both groups are differentiated by which group uses the
pattern more often (i.e., higher i-frequency).
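The four-way categorization can be sketched directly; the s-frequency decisions are assumed to have been made upstream against the chosen thresholds, and the example values are drawn from Table 1:

```python
# Sketch of the four comparison categories: a pattern is classed by which
# group(s) it is s-frequent in and, when s-frequent in both, by the sign of
# its i-support difference (CC minus PA).
def categorize(s_frequent_cc, s_frequent_pa, i_support_diff):
    if not (s_frequent_cc or s_frequent_pa):
        return "infrequent in both groups"
    if s_frequent_cc and not s_frequent_pa:
        return "CC only"
    if s_frequent_pa and not s_frequent_cc:
        return "PA only"
    return "both, more by CC" if i_support_diff > 0 else "both, more by PA"

# Example values drawn from Table 1:
print(categorize(True, True, 3.76))    # EDIT-H READ   -> both, more by CC
print(categorize(False, True, -2.47))  # QUER-L EDIT-L -> PA only
```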
Example Analysis
To demonstrate the effectiveness of these techniques, this section presents
an analysis of data from a recent classroom experimental study that tested the
effect of contextualized conversational (CC) feedback in Betty’s Brain
(Leelawong and Biswas 2008) by comparing it to a baseline feedback approach
called prompt-based action-oriented (PA) feedback. The research hypothesis was
that the conversational and contextual nature of the CC feedback helps students
better understand the relevance and context of the feedback. Thus, students
receiving CC feedback would more often take actions in accordance with the
feedback, and therefore would better integrate reading, causal map editing, and
assessing activities.

[2] A maximum gap of one means that between each consecutive pair of actions in a frequent
pattern, the mining algorithm allows up to one additional action that is not part of the given
pattern.
To identify differences in learning behaviors between the two conditions,
we analyzed learning activity traces for students in the PA and CC groups by
employing the HMM and comparative sequence mining methods described
previously. The resulting group HMMs contain five distinct states that are
described by the state-action emission probabilities in Figure 1. The value shown
for an action in a given state is the probability of that action being produced in the
state. When an action ends in -L or -H, it indicates low or high relevance,
respectively, of that action to previous actions. Using the interpretation methods
detailed in Biswas et al. (2010), we interpreted these states as representing:
Reading - students are primarily reading the resources.
Informed editing - students are primarily making high-relevance edits,
suggesting a more focused map-building effort.
Uninformed editing - students are primarily making low-relevance edits,
possibly indicating the use of guessing behaviors.
Uninformed editing and checking - students are performing assessment
behaviors like querying and quizzing to check the correctness of their
causal maps, but are also making a significant number of low-relevance
edits to their maps. This implies that students are not reflecting on the
results of their assessments, and they may have resorted to trial-and-error
methods to correct their maps.
Informed editing and checking - students are making high-relevance
changes to their map while also using queries and quizzes to assess the
correctness of their causal maps. This state likely corresponds to more
effective attempts at employing monitoring strategies to identify and
correct erroneous (or missing) information in the map.
[Fig. 1 Learning Activity Emissions in HMM States]
The HMM results (Figure 2) illustrate a similar set of behaviors employed
by both the PA and CC groups, although all of the uninformed editing actions in
the PA group are combined in the uninformed editing and checking state rather
than being split between it and a separate uninformed editing state in the CC
group. Additionally, the proportions of expected state occurrences are also
relatively similar between the two groups. However, three interesting differences
in behavior patterns are worth noting: (1) the PA group’s HMM has a smaller
likelihood (39%) of transitioning directly from informed editing to informed
(editing and) checking activities compared to the CC group’s HMM (61%
transition probability); (2) the CC group’s HMM exhibits a higher likelihood of
following these informed editing and checking activities with more reading (29%
versus 14% for the PA group’s HMM); (3) even in the uninformed editing and
checking state, the proportion of diagnostic to non-diagnostic queries (QUER-H
to QUER-L) was higher in the CC group. Altogether, these results indicate that
the CC group (1) was more likely to intersperse diagnostic checking activities
between their reading and informed editing activities and (2) was more likely to
return to reading after informed editing and checking, which suggests that they
may have used monitoring to identify weak points for further exploration with the
resources, as well as to confirm their current understanding. Presumably, this
periodic, diagnostic checking behavior helped students in the CC group identify,
research, and correct mistakes in their concept maps in a more systematic fashion.
[Fig. 2 HMMs for the PA and CC Groups]
To provide a complementary, finer-grained level of analysis, we augment
the HMM results with a comparison of frequent learning activity patterns as
described previously. The initial comparative sequence mining analysis revealed
that many of the differentially frequent patterns differed only by the number of
consecutive reads or edits in the pattern. To improve this exploratory analysis, we
revised the log pre-processing to distinguish a single action from repeated actions,
which were condensed to one “action” with the “-MULT” identifier. Using the
transformed sequences, our comparative sequence mining technique allowed us to
identify additional differences in behavior between the two groups. Table 1
presents the top three differentially frequent patterns in four comparison
categories (i.e., categorized by whether the patterns were s-frequent in one or both
groups and in which group they were more i-frequent). In this analysis, we
employed an s-support threshold of 50% to analyze patterns that were evident in
the majority of either or both groups of students.
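The "-MULT" condensation described above can be sketched as follows:

```python
from itertools import groupby

# Sketch of the revised pre-processing: runs of the same action are condensed
# to a single "ACTION-MULT" symbol, while isolated actions keep their symbol.
def condense_repeats(sequence):
    out = []
    for symbol, run in groupby(sequence):
        n = len(list(run))
        out.append(symbol + "-MULT" if n > 1 else symbol)
    return out

print(condense_repeats(["READ", "READ", "READ", "EDIT-H", "READ"]))
# ['READ-MULT', 'EDIT-H', 'READ']
```

This keeps differentially frequent patterns from differing only in the length of a read or edit run, as noted above.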
Learning Activity Pattern                    i-Support (CC – PA)   s-Frequent Group
READ-MULT EDIT-H-MULT READ                          4.39                 CC
READ-MULT READ-MULT EDIT-H READ                     4.11                 CC
READ-MULT EDIT-H READ-MULT EDIT-H READ              3.11                 CC
EDIT-H READ                                         3.76                 Both
EDIT-H READ-MULT                                    2.14                 Both
READ-MULT EDIT-H READ                               2.08                 Both
EDIT-L QUIZ-L                                      -0.91                 Both
QUIZ-L EDIT-L                                      -1.29                 Both
EDIT-L QUER-H                                      -1.53                 Both
QUER-H EXPL-H                                      -2.29                 PA
EDIT-L EDIT-L                                      -2.35                 PA
QUER-L EDIT-L                                      -2.47                 PA

[Table 1 Top Differentially Frequent Patterns between CC and PA groups]
This analysis illustrates that many of the patterns that were differentially
frequent in the CC group (i.e., patterns with a positive value in the i-support
difference column defined as CC i-support minus PA i-support for the given
pattern) were repeated read-and-edit patterns. Many other read-and-edit
sequences were frequent in both groups, but the differentially frequent sequences
for the CC group often included single reads and informed edits, while the PA
group generally relied on multiple reads and their differentially frequent patterns
with single edits tended to employ uninformed edits. These results suggest that the
CC group’s students may have more frequently employed a knowledge
construction strategy of reading a small portion of the resources and adding a link
to their map each time they identified a causal relationship. Conversely, the PA
group students may have engaged in less systematic knowledge construction
strategies. In particular, they may have been less focused and encountered more
difficulties during extended editing series. This difference in learning behavior
suggests that the conversational, contextualized feedback in the CC group resulted
in more effective deployment of knowledge construction strategies.
Learning Activity Pattern                    i-Support (CC – PA)   s-Frequent Group
READ-MULT EDIT-H                                    2.21                 CC
READ-MULT EDIT-H READ-MULT QUER-H                   1.61                 CC
READ-MULT EDIT-H QUIZ-H READ-MULT                   1.61                 CC
QUIZ-H EDIT-H                                       1.33                 Both
EDIT-H QUER-H EDIT-H                                1.31                 Both
EDIT-H QUER-H READ-MULT                             1.31                 Both
EDIT-L QUIZ-L                                      -0.91                 Both
QUIZ-L EDIT-L                                      -1.29                 Both
EDIT-L QUER-H                                      -1.53                 Both
QUER-H QUER-H                                      -2.29                 PA
QUER-H EXPL-H                                      -2.29                 PA
QUER-L EDIT-L                                      -2.47                 PA

[Table 2 Top Differentially Frequent Patterns including Queries or Quizzes]
The top differentially frequent patterns in the PA group primarily involve
the use of queries or quizzes in conjunction with single, uninformed edits.
Therefore, we extended our comparative sequence mining analysis by using a
regular expression constraint to focus on monitoring patterns. In particular, we
identified interesting results for patterns including Query or Quiz actions, which
are important for effective monitoring of the knowledge construction and map
building process. Table 2 presents the top three differentially frequent
querying/quizzing patterns in each comparative category.
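The regular-expression constraint might be sketched as a simple post-filter on the mined patterns; the actual implementation may instead apply the constraint during mining:

```python
import re

# Sketch of the regular-expression constraint used to focus on monitoring
# behavior: keep only mined patterns containing a query or quiz action.
MONITORING = re.compile(r"\b(QUER|QUIZ)-(H|L)\b")

def monitoring_patterns(patterns):
    return [p for p in patterns if MONITORING.search(" ".join(p))]

mined = [
    ["READ-MULT", "EDIT-H"],
    ["READ-MULT", "EDIT-H", "READ-MULT", "QUER-H"],
    ["QUIZ-H", "EDIT-H"],
]
print(monitoring_patterns(mined))
# [['READ-MULT', 'EDIT-H', 'READ-MULT', 'QUER-H'], ['QUIZ-H', 'EDIT-H']]
```

Applying the constraint during mining rather than as a post-filter prunes the search space earlier, but the resulting pattern set is the same.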
This analysis illustrates that the CC group was more likely to use queries
and quizzes in between sequences of reading and single, informed edits. In
contrast, the PA group had a differential preference for using queries before and
after individual, uninformed edits. This provides further detail to the HMM
results that illustrated a difference in transitions to the “informed editing and
checking” state and from it to the “reading” state in these groups. These results
suggest that the CC group employed monitoring activities, such as queries and
quizzes, in a way that was more related to the shorter sequences of reading and
editing that they were currently involved in. Overall, the identified differences in
learning behavior indicate that the conversational, contextualized feedback in the
CC group resulted in a beneficial change in the students’ understanding and/or
employment of both knowledge construction and monitoring strategies.
References
Agrawal, R., & Imielinski, T. (1993). Mining association rules between sets of items in large
databases. ACM SIGMOD Record, 22(2), 207-216. New York: ACM Press.
Agrawal, R., & Srikant, R. (1995). Mining sequential patterns. In P.S. Yu & A.L.P. Chen (Eds.),
Proceedings of the eleventh international conference on data engineering (pp. 3-14).
Washington, DC: IEEE Computer Society.
Aleven, V., McLaren, B., Roll, I., & Koedinger, K. (2006). Toward meta-cognitive tutoring: A
model of help-seeking with a cognitive tutor. International Journal of Artificial Intelligence in
Education, 16(2), 101–128. IOS Press.
Azevedo, R. (2005). Using hypermedia as a metacognitive tool for enhancing student learning?
The role of self-regulated learning. Educational Psychologist, 40(4), 199-209.
doi:10.1207/s15326985ep4004_2
Azevedo, R., Witherspoon, A., Chauncey, A., Burkett, C., & Fike, A. (2009). MetaTutor: A
metacognitive tool for enhancing self-regulated learning. In Cognitive and metacognitive
educational systems: Papers from the AAAI Fall Symposium (pp. 14–19). Menlo Park, CA:
AAAI Press.
Biswas, G., Jeong, H., Kinnebrew, J. S., Sulcer, B., & Roscoe, R. (2010). Measuring self-regulated
learning skills through social interactions in a teachable agent environment. Research and
Practice in Technology Enhanced Learning, 5(2), 123-152.
Cohen, P., & Beal, C. (2009). Temporal data mining for educational applications. International
Journal of Software Informatics, 3(1), 31-46.
Ding, B., Lo, D., Han, J., & Khoo, S.C. (2009). Efficient mining of closed repetitive gapped
  subsequences from a sequence database. In Proceedings of the 2009 IEEE 25th International
  Conference on Data Engineering (pp. 1024-1035). IEEE.
Hadwin, A.F., Nesbit, J.C., Jamieson-Noel, D., Code, J., & Winne, P.H. (2007). Examining trace
data to explore self-regulated learning. Metacognition and Learning, 2(2-3), 107-124.
Hadwin, A.F., Winne, P.H., Stockley, D.B., Nesbit, J.C., & Woszczyna, C. (2001). Context
moderates students’ self-reports about how they study. Journal of Educational Psychology,
93(3), 477–487.
Jeong, H., & Biswas, G. (2008). Mining student behavior models in learning-by-teaching
  environments. In R.S.J.d. Baker, T. Barnes, & J.E. Beck (Eds.), Proceedings of the 1st
  International Conference on Educational Data Mining (pp. 127-136). Montreal, Quebec, Canada.
Kinnebrew, J.S. & Biswas, G. (2011). Modeling and measuring self-regulated learning in
teachable agents environments. Journal of e-Learning and Knowledge Society, 7(2), 19-35.
Leelawong, K., & Biswas, G. (2008). Designing learning by teaching agents: The Betty’s Brain
system. International Journal of Artificial Intelligence in Education, 18(3), 181–208.
Lo, D., Khoo, S.C., & Liu, C. (2008). Efficient mining of recurrent rules from a sequence
database. Proceedings of the 13th International Conference on Database Systems for
Advanced Applications (pp. 67–83). Springer-Verlag.
Mannila, H., Toivonen, H., & Verkamo, A. I. (1997). Discovery of frequent episodes in event
sequences. Data Mining and Knowledge Discovery, 1(3), 259-289.
doi:10.1023/A:1009748302351
Pardos, Z.A., & Heffernan, N.T. (2011). Using HMMs and bagged decision trees to leverage rich
features of user and skill from an intelligent tutoring system dataset. To appear in: The
Journal of Machine Learning Research.
Perry, N. E., & Winne, P. H. (2006). Learning from learning kits: gStudy traces of students’ self-
regulated engagements with computerized content. Educational Psychology Review, 18(3),
211-228. doi:10.1007/s10648-006-9014-3
Pintrich, P.R., Marx, R.W., & Boyle, R.A. (1993). Beyond cold conceptual change: The role of
motivational beliefs and classroom contextual factors in the process of conceptual change.
Review of Educational Research, 63(2), 167-199.
Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech
recognition. Proceedings of the IEEE, 77(2), 257–286. IEEE Computer Society.
Shih, B., Koedinger, K.R., & Scheines, R. (2010). Unsupervised discovery of student learning
tactics. Proceedings of the 3rd International Conference on Educational Data Mining, 201-
210.
Weinstein, C., Schulte, A., & Palmer, D. (1987). The Learning and Study Strategies Inventory.
Clearwater, FL: H&H Publishing.
Winne, P.H. & Jamieson-Noel, D. (2002). Exploring students’ calibration of self reports about
study tactics and achievement. Contemporary Educational Psychology, 27(4), 551-572.
Zimmerman, B.J. (2008). Investigating self-regulation and motivation: Historical background,
methodological developments, and future prospects. American Educational Research Journal,
45(1), 166-183.
Zimmerman, B.J. & Pons, M.M. (1986). Development of a structured interview for assessing
student use of self-regulated learning strategies. American Educational Research Journal,
23(4), 614-628.