A computational model for causal learning in cognitive agents

Usef Faghihi, Philippe Fournier-Viger, Roger Nkambou
Department of Computer Science, UQAM, 201, avenue du Président-Kennedy, Local PK 4150, Montréal (Québec), Canada

Knowledge-Based Systems 30 (2012) 48–56
© 2011 Published by Elsevier B.V. doi:10.1016/j.knosys.2011.09.005

Article history: Received 12 January 2011; Received in revised form 5 September 2011; Accepted 6 September 2011; Available online 6 November 2011

Keywords: Cognitive agents; Computational causal modeling and learning; Emotions; Consciousness; Episodic memory; Causal memory

Corresponding author e-mail addresses: [email protected] (U. Faghihi), [email protected] (P. Fournier-Viger), [email protected] (R. Nkambou)

Abstract

To mimic human tutors and provide optimal training, a cognitive tutoring agent should be able to continuously learn from its interactions with learners. An important element that helps a tutor better understand a learner's mistakes is finding their causes. In this paper, we explain how we have designed and integrated a causal learning mechanism in a cognitive agent named CELTS (Conscious Emotional Learning Tutoring System) that assists learners during learning activities. Unlike other work on cognitive agents that uses Bayesian networks to deal with causality, CELTS's causal learning mechanism is implemented using data mining algorithms that can handle large amounts of data. The integration of a causal learning mechanism within CELTS allows it to predict learners' mistakes. Experiments showed that the causal learning mechanism helps CELTS improve learners' performance.

1. Introduction

Causal learning is the process through which we come to infer and memorize an event's reasons or causes based on previous beliefs and current experience that either confirm or invalidate those beliefs [1,2]. Finding the causes of problems helps human beings better deal with everyday life: for instance, what are the causes of global warming? Different methods have been proposed for finding causal relations between events, such as scientific experiments, statistical relations, temporal order, and prior knowledge [2,3]. Researchers in cognitive computer science have been interested in causality and in how causal learning can be implemented in cognitive agents [4]. For example, researchers have simulated causal mechanisms in cognitive architectures such as ACT-R [5] and CLARION [6]. However, these works have one or more of the following limitations:

- Assumptions about the causal model, or values of variables, have to be specified by a programmer or domain expert.
- Causal mechanisms are implemented using techniques that do not scale to large amounts of data.
- The mechanisms do not include a role for emotions.

In this paper, we discuss our implementation of causal learning in the Conscious Emotional Learning Tutoring System (CELTS) [7], which addresses these limitations. CELTS is a general cognitive architecture designed to be put to work as a tutor for astronauts learning to manipulate the International Space Station's (ISS) robotic telemanipulator, Canadarm2. CELTS's architecture relies on a functional "consciousness" [8] mechanism for much of its operations. It also bears some functional similarities with the physiology of the nervous system. Its modules communicate with one another by contributing information to CELTS's Working Memory through information codelets¹ [9]. In this study, CELTS is integrated in CanadarmTutor [10,11] (Fig. 2), a simulation-based intelligent tutoring system for learning how to operate the Canadarm2 robotic arm (Fig. 1) installed on the ISS. The CanadarmTutor learning environment is a 3D reproduction of Canadarm2 on the space station and of its control panel (Fig. 2). Learning activities in CanadarmTutor mainly consist of operating Canadarm2 to perform various real-life tasks with the simulator, such as carrying loads with the robotic arm or inspecting the ISS. Operating Canadarm2 is a complex task because astronauts have to follow a strict security protocol. Furthermore, Canadarm2 has seven degrees of freedom (seven joints that can be rotated), and users only have a partial view of the environment through the cameras that they choose and adjust. CanadarmTutor was the subject of several research projects [10]. CELTS is the component of CanadarmTutor that acts as the core

¹ Based on Hofstadter et al.'s idea, a codelet is a very simple agent, "a small piece of code that is specialized for some comparatively simple task". Implementing the simple processors of Baars' theory, codelets do much of the processing in the architecture. In our case, each information codelet possesses an activation value and an emotional valence specific to each cognitive cycle.



Fig. 1. A 3D representation of Canadarm2.


tutor, which takes all the pedagogical decisions, generates dialogue, and performs the high-level assessment of the learner's performance.

The learners' manipulations of the virtual world simulator (Fig. 2) constitute the interactions between CELTS and its users. In particular, the virtual world simulator sends all manipulation data to CELTS, which in turn sends learners various types of advice to improve their performance (Fig. 3).

We have implemented a general emotional mechanism (emotions and emotional learning), episodic learning² (EPL) [7], and causal learning [12] in CELTS. In the context of CELTS, we refer to causal learning as the use of inductive reasoning to generalize causal rules from sets of experiences. CELTS observes astronauts' behavior without complete information regarding the reasons for their behavior.

In its current implementation, CELTS's causal learning is capable of finding the causes of user mistakes. As a real-world illustration, suppose one observes that each time one forgets to adjust a car's side and front mirrors (M), one tends to have poor control over the wheel (W) and to create a collision risk (C) with other cars. We can link these variables in the following ways:

M → W → C, (1)

W ← M → C. (2)

The first graph (1) is a causal chain: the collision risk C depends directly only on poor wheel control W, so that, given W, C is independent of the forgotten mirror adjustment M. The second graph (2) is a common-cause structure: poor wheel control W and collision risk C are both direct effects of forgetting the mirror adjustment M, so that, given M, W and C are independent of each other. To give CELTS the capability of learning complex causal relationships like those in the above example, in which events may be conditionally dependent or independent, we designed a causal learning mechanism [7]. However, our experiments showed that the number of learned rules can be very large for CELTS to handle: because CELTS is a tutor that interacts with many learners, it faces a huge amount of data. We observed that at any given time, only a small subset of the rules is relevant to the current situation.
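The independence claim behind graph (1) can be checked numerically. The following sketch (not from the paper; all probability values are invented for illustration) enumerates a toy joint distribution for the chain M → W → C and verifies that, once W is known, M carries no extra information about C:

```python
# Toy joint distribution for the chain M -> W -> C.
# All numeric probabilities below are invented, purely illustrative.
from itertools import product

P_M = 0.3                  # P(M=1): probability of forgetting the mirrors
P_W = {0: 0.1, 1: 0.9}     # P(W=1 | M=m): poor wheel control given mirrors
P_C = {0: 0.05, 1: 0.8}    # P(C=1 | W=w): collision risk given wheel control

def joint(m, w, c):
    """Joint probability factored along the chain: P(m) P(w|m) P(c|w)."""
    pm = P_M if m else 1 - P_M
    pw = P_W[m] if w else 1 - P_W[m]
    pc = P_C[w] if c else 1 - P_C[w]
    return pm * pw * pc

def prob(pred):
    """Probability of the set of outcomes satisfying `pred`."""
    return sum(joint(m, w, c)
               for m, w, c in product((0, 1), repeat=3) if pred(m, w, c))

# P(C=1 | W=1, M=0) and P(C=1 | W=1, M=1) coincide: C is independent
# of M given W in the chain structure.
p_c_w1_m0 = prob(lambda m, w, c: c and w and not m) / prob(lambda m, w, c: w and not m)
p_c_w1_m1 = prob(lambda m, w, c: c and w and m) / prob(lambda m, w, c: w and m)
```

Both conditional probabilities equal P(C=1 | W=1) = 0.8 here, regardless of M, which is exactly the screening-off property the chain encodes.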

To determine which rule best matches the current problem, we present in this paper a modified version of the causal learning mechanism described previously [12]. Our modified algorithm relies on a custom data mining algorithm designed to efficiently mine only the rules that are relevant to the current situation. To do so, we changed our previous algorithm to accept a time constraint and other constraints on the rules to be mined. As will be shown in the experimental evaluation, (1) using the constraints can reduce the number of rules found and the execution time of the causal learning algorithm by several orders of magnitude, and (2) the new causal learning algorithm is therefore scalable enough to handle a large amount of data (more than 20,000 sequences) very efficiently. This is an important contribution because

² CELTS's episodic learning helps it to learn and then remember logged information about the behavior of learners or other agents during training sessions. It also helps the agent remember learners' previous relevant mistakes and previously used methods, whether successful or unsuccessful.

in current cognitive agents, causal learning has up to now beenimplemented with techniques such as Bayesian networks thatare not scalable for handling a large amount of data.

The rest of this paper is organized as follows. In Section 2, we give a brief review of related work on causal learning in cognitive science. In Section 3, we briefly explain CELTS's emotional and episodic learning mechanisms. In Section 4, we propose an improved causal learning mechanism for CELTS. In Section 5, we present results from our experiments with CELTS. Finally, in Section 6, we draw a conclusion.

2. Related work

Up to now, scientists have mainly proposed to use causal Bayes nets (acyclic graphs) to establish causal relations between events. The key issue in the construction of a causal Bayes net is finding the conditional probabilities between events. Mathematics is used to describe conditional and unconditional probabilities between a graph's variables. The structure of a causal graph restricts the conditional and unconditional probabilities between the graph's variables. We can find the restrictions between variables by using the Causal Markov Assumption (CMA). The CMA states that every node in an acyclic graph is conditionally independent of its non-descendants, given the node's parents (direct causes). Knowing a graph's structure and the values of some of the variables, we can predict the conditional probabilities of the other variables. Causal Bayes nets are also capable of predicting the consequences of direct external interventions on their nodes: when, for instance, an external intervention occurs on a node N, it must solely change N's value and not affect other node values in the graph except through N's influences. In conclusion, one can generate a causal structure from sets of effects and, conversely, predict sets of effects from a causal structure [4].
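The intervention property described above can be made concrete with a minimal sketch (illustrative only, not taken from any cited system): on a toy chain A → B → C with invented conditional probability tables, forcing a value on B (an intervention, do(B=1)) severs B from its parent, so it changes predictions about B's effects but not beliefs about B's causes, whereas merely observing B=1 changes beliefs about the cause A as well.

```python
# Toy causal Bayes net: chain A -> B -> C over binary variables.
# CPT values are invented for illustration.
from itertools import product

P_A = 0.5                 # P(A=1)
P_B = {0: 0.2, 1: 0.9}    # P(B=1 | A=a)
P_C = {0: 0.1, 1: 0.7}    # P(C=1 | B=b)

def joint(a, b, c, do_b=None):
    """Joint probability; do_b, if given, models an external intervention
    on B: B is cut loose from A and fixed to the forced value."""
    pa = P_A if a else 1 - P_A
    if do_b is None:
        pb = P_B[a] if b else 1 - P_B[a]
    else:
        pb = 1.0 if b == do_b else 0.0
    pc = P_C[b] if c else 1 - P_C[b]
    return pa * pb * pc

def prob(pred, **kw):
    return sum(joint(a, b, c, **kw)
               for a, b, c in product((0, 1), repeat=3) if pred(a, b, c))

# Observing B=1 raises belief in its cause A; intervening on B does not.
p_a_given_obs_b = prob(lambda a, b, c: a and b) / prob(lambda a, b, c: b)
p_a_given_do_b = prob(lambda a, b, c: a, do_b=1)   # stays at the prior P(A=1)
```

Here p_a_given_obs_b ≈ 0.818 while p_a_given_do_b remains 0.5, the prior: the intervention affects other nodes only through B's downstream influence.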

Most of the proposed models use a Bayesian approach for the construction of causal knowledge. Bayesian networks work with hidden and non-hidden data and can learn from little data. However, Bayesian learning needs experts to assign predefined values to variables, which is often a very difficult and time-consuming task [13]. In the context of a tutoring agent like CELTS, this is a serious issue, because we wish CELTS to learn and adapt its knowledge of causes automatically, without any human intervention. Another problem with Bayesian learning, crucial in the present context, is the risk of combinatorial explosion in the case of large amounts of data. In the case of our agent, constant interaction with learners creates a large amount of data stored in CELTS's modules. For this reason, we believe that a data mining (DM) algorithm is more appropriate for implementing a causal learning mechanism in CELTS.

The other advantage of implementing causal learning using data mining (DM) algorithms is that CELTS can learn in a real-time, incremental manner; that is, the system can update its information by interacting with various users. However, it must be noted that although data mining algorithms learn faster than Bayesian networks when all data is available, they have problems with hidden data. Furthermore, as with Bayesian learning, the knowledge found by data mining algorithms needs to be validated by a domain expert [13]. In the next section, we give a brief overview of CELTS's architecture. We then describe our approach to causal learning in detail and put forward its advantages and limits.

3. An overview of CELTS’ architecture and learning mechanisms

3.1. CELTS’s architecture

CELTS (Fig. 4) is a cognitive agent architecture based on Baars’theory [14] of consciousness. It is constructed with simple agents


Fig. 2. CanadarmTutor interface.

Fig. 3. An example of CELTS message.

³ "Core affect is a term used to describe the mental representation of bodily changes that are sometimes experienced as feelings of hedonic pleasure and displeasure with some degree of arousal."


called "codelets" (which reproduce Baars' "simple processors"). The central point of the system is "access consciousness", which allows all resources to access centrally selected information that is "broadcast" to unconscious processes (guiding the agent to be stimulated only by the most relevant information). CELTS operates through cognitive cycles. A cognitive cycle in CELTS starts with perception and usually ends with the execution of an action (e.g., Fig. 3). CELTS uses its Behaviour Network (BN) for action selection (Fig. 5). The BN is implemented based on Maes' Behaviour Net [15]. It is a network of partial plans that analyses the context to decide what to do and which type of behaviour to set off (Fig. 5). Given that CELTS acts as a tutor, an expert can define different solutions in the BN to help learners. Thus, the BN's nodes are pedagogical supports for learning, such as scaffolding messages, hints, demonstrations, etc., to assist learners while they manipulate Canadarm2 in the virtual environment (Fig. 2). The learners' manipulations of Canadarm2 in the virtual environment (Fig. 2) constitute the information perceived by CELTS. These perceptions go through CELTS's cognitive cycles, and the process ends with an action (e.g., in Fig. 3). In the rest of this section we briefly explain how emotional and episodic learning are implemented in CELTS.

3.2. Emotional mechanism

We incorporated an emotional mechanism in CELTS inspired by research in neuroscience. It is based on the observation that emotions influence cognition, and vice versa [16–18]. In particular, emotions play an important role in the construction of causes and in causal learning in human beings [19]. In this section, we present our generic computational model of emotions and explain in detail how the "peripheral-central" model [17,20] and the appraisal theory of emotions [21,22] are implemented in CELTS. Appraisal theory posits that agent-environment interactions incite appraisal variables in the agent, which lead to the generation of affective states that occur with some intensity and which may set off behavioral and cognitive outcomes. CELTS's emotional states emerge through collaboration between perceptions (sensory input from the virtual world) and inputs coming from interactions between different modules and from what is broadcast through the system by CELTS's consciousness mechanism. For CELTS to make sense of its core affective³ state [23], it must be engaged in situated conceptualization that links the core affective state to an object or event. Conceptualization is thus necessary for CELTS to understand its core affective state. To that effect, memories and prior knowledge regarding an object or event are put to use to interpret current sensations. This view of the emergence of emotions is in accordance with the psychological constructionist model as defended by Lindquist and her colleagues [23].

CELTS can react in two ways when faced with a situation. Information coming from CELTS's Perceptual mechanism flows along a short and a long route (ESR and ELR in Fig. 4), which we describe now.

3.2.1. CELTS's long route

The long route can be viewed as CELTS's implementation of appraisal theory. CELTS's cognitive cycle starts with perception. When a percept enters Working Memory (WM) as a single coalition of codelets, emotional codelets inspect the coalition's informational



Fig. 4. CELTS’s architecture.

Fig. 5. Part of the behavior network in CELTS.


content (the assessment part of appraisal theory) and infuse it with a level of activation proportional to the assessed emotional valence of the perceived situation. This increases or decreases the likelihood that the coalition will draw attention to itself. CELTS's attention mechanism (AM, Fig. 4) then chooses the information that has been emotionally influenced by CELTS's emotional mechanism (Fig. 4, ESR & ELR) and sends it into the conscious competition. CELTS's Consciousness mechanism then broadcasts the information through the system for a

decision to be made (see the ELR rectangle in Fig. 4). In CELTS, the broadcasting of information causes the appraisal of emotions. CELTS's Attention mechanism in turn influences the emotion mechanism (EM) by providing information regarding the discrepancy between what was expected given an action and what effectively occurred. This information may alter the valence that EM assigns to the situation in the future, as well as the importance EM gives to the situation. After each cycle of interaction with the environment, CELTS's EM updates its knowledge of the environment (i.e., the emotional valence to be associated with events) for future cycles of interaction with it. This update may also alter the current emotional state (e.g., from fear to happiness). Thus, the importance given to any situation may increase or decrease when CELTS next encounters it. We can already see that, in our architecture, emotions are implicated in cognitive processing (they influence action selection), and vice versa (cognitive processing can modify the emotional appraisal of situations).

3.2.2. CELTS's short route

The short route (see the ESR rectangles in Fig. 4) starts with perception, just like the long route. Perception codelets connect in parallel both to the BN and to the emotional codelets. The activation sent directly by perception codelets to emotional codelets is the first stage of the short route. The emotional codelets in EM establish the positive or negative emotional valence of the event for the system. The valence assigned to the event may result from evolution (an innate valence accorded to evolutionarily important situations) or from learning.

In CELTS, some emotional codelets may correspond to innate (designed) sensitivities (e.g., to excessive speed of Canadarm2, or to an imminent collision between the arm and the ISS); others may have learned the valence of situations from experience. Either way, emotional codelets possess direct connections to behavior nodes in the BN, to which they send positive or negative activations. Some of these emotional codelets react more strongly than others and accordingly send stronger valence activations to the behavior nodes. If the valence activations exceed a behavior node's firing threshold, then the corresponding action will fire



automatically. This emotional intervention reflects a direct route by which the mechanisms responsible for emotional appraisal influence action selection.
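The short route just described can be sketched in a few lines. This is an illustrative toy (node names, thresholds, and valence values are invented, not from CELTS's implementation): emotional codelets send valence activations directly to a behavior node, which fires its action as soon as the accumulated activation magnitude crosses its threshold.

```python
# Minimal sketch of threshold-based firing in a behavior node.
# All names and numeric values are hypothetical.
class BehaviorNode:
    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold
        self.activation = 0.0
        self.fired = False

    def receive(self, valence_activation):
        """Accumulate a (positive or negative) valence activation sent by an
        emotional codelet; fire once the magnitude crosses the threshold."""
        self.activation += valence_activation
        if abs(self.activation) >= self.threshold:
            self.fired = True   # the corresponding action fires automatically

stop_arm = BehaviorNode("stop_arm", threshold=0.9)
# Two emotional codelets react to an imminent-collision percept:
stop_arm.receive(-0.5)    # innate (designed) sensitivity, strong negative valence
stop_arm.receive(-0.45)   # learned sensitivity; total magnitude now exceeds 0.9
```

After both activations arrive, the node has fired without any deliberation, mirroring the direct route from appraisal to action.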

3.3. Episodic learning

The second learning mechanism integrated in CELTS is the episodic mechanism (EPM) [7]. CELTS's episodic learning consists of building an episodic memory (a memory of past events) to answer questions such as what, where, and when [24,25]. For a tutoring agent, episodic memory is crucial for knowing which interactions with learners were successful and which were not, and for using this information to improve its behavior. It also helps the tutor remember learners' mistakes. We briefly explain CELTS's episodic memory and learning in the following. It is implemented in two phases.

3.3.1. Recording interactions with learners

The first phase consists of recording all the information that is broadcast in the system during a training session between CELTS and learners. In our context, CELTS learns during astronauts' training sessions for Canadarm2 manipulation in the virtual world simulator. A trace of what occurred in the system is recorded in CELTS's different memories during consciousness broadcasts. In our implementation, each sequence of interactions during a training session is recorded as a sequence of events in a sequence database. Each event in CELTS, denoted X = (ti, Ai), represents what happened during a cognitive cycle. The timestamp ti of an event indicates the cognitive cycle number, whereas the set of items Ai of an event contains an item that represents the coalition of information (e.g., collision risk with the ISS) that was broadcast during that cognitive cycle.

For example, Table 1 shows an example of a database produced by user manipulation of Canadarm2 in the virtual world, containing six short sequences. Consider the first sequence, S1. Its first event indicates that during cognitive cycle 0, due to arm manipulation by the learner, coalition c1 was broadcast and an emotional valence of −0.8 for emotion e1 (high threat) was associated with the broadcast. The second event of S1 indicates that at cognitive cycle 1, coalition c2 was broadcast with emotional valence −0.3 for emotion e2 (medium fear) and that behaviour b1 was executed.

3.3.2. Memory consolidation

The second phase is memory consolidation. CELTS's episodic learning mechanism periodically extracts frequent sub-sequences of events by using a custom sequential pattern mining algorithm (see Faghihi et al., 2011, for more details).

In the previous example, if a sequence like S1 appears several times during learners' interactions with CELTS, the following pattern can be discovered:

{Forget camera}, {Forget adjust Camera} => {Collision risk}

It indicates that when users forgot to adjust the cameras and consequently did not set the camera parameters correctly, this was

Table 1. A data set of six sequences.

ID   Event sequences
S1   <(0, c1 e1{−0.8}), (1, c2 e2{−0.3} b1), (2, c4 b5)>
S2   <(0, c1 e1{−0.8}), (1, c3), (2, c4 b4), (3, c5 b3)>
S3   <(0, c2 e2{−0.3}), (1, c3), (2, c4), (3, c5 b3)>
S4   <(0, c3), (1, c1 e1{−0.6} b4), (2, c3)>
S5   <(0, c4 b4), (1, c5), (2, c6)>
S6   <(1, c1 e1{−0.6} b4), (2, c4 b4), (3, c5)>

followed by a collision risk in the virtual world. CELTS uses the above information for the construction of its episodic memory and adapts its behavior to learners by reusing "positive" patterns (carrying positive emotions) and avoiding "negative" patterns. For example, CELTS reuses solutions (i.e., registered sequences of events carrying positive emotions) that successfully helped learners many times, while avoiding solutions (i.e., registered sequences of events carrying negative emotions) that negatively affected users. Consider the case of a collision risk with Canadarm2. In this situation, CELTS can use different types of interventions with learners, such as giving a hint, showing a demonstration, etc. To choose an intervention, CELTS selects the most positive patterns in its episodic memory that are related to the current situation. This is complementary to the causal learning mechanism described next, which is responsible for finding the causes of the learners' problems.
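The reuse policy just described reduces to a small selection rule. The following sketch is hypothetical (the pattern records, trigger names, and valence values are invented, and CELTS's real patterns are event sequences rather than flat records): among stored patterns matching the current situation, prefer the one carrying the most positive emotional valence.

```python
# Illustrative sketch of valence-based intervention selection.
# Pattern records and values are invented for this example.
def choose_intervention(patterns, situation):
    """Return the action of the most positively-valenced pattern whose
    trigger matches the current situation, or None if nothing matches."""
    candidates = [p for p in patterns if p["trigger"] == situation]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p["valence"])["action"]

patterns = [
    {"trigger": "collision_risk", "action": "show_demonstration", "valence": 0.6},
    {"trigger": "collision_risk", "action": "give_hint", "valence": 0.2},
    {"trigger": "wrong_joint", "action": "scaffolding_message", "valence": 0.4},
]
best = choose_intervention(patterns, "collision_risk")
```

With these invented valences, the demonstration (valence 0.6) is preferred over the hint (0.2) for a collision risk.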

4. Integrating causal learning in CELTS

We now describe how we have designed and integrated a causal learning mechanism into the CELTS architecture.

4.1. CELTS’s causal model

CELTS's causal learning takes place during its cognitive cycles. CELTS's WM is monitored by expectation codelets and other types of codelets (see CELTS's emotional mechanism for more details [7]). If expectation codelets observe information coming into the WM confirming that a behavior's expected result was not obtained, this failure brings CELTS's emotional and attention mechanisms back to that information. To deal with the failure, the emotional codelets that monitor the WM first send a portion of emotional valence sufficient to get CELTS's Attention mechanism to select the information about the failed result and bring it back to consciousness [7]. The influence of the emotional codelets at this point persists over the next cognitive cycles, until CELTS finds a solution or has no remedy for the failure. Since relevant resources need to be recruited to allow CELTS's modules to analyze the cause of the failure, and to allow deliberation to take place concerning supplementary and/or alternative actions, the consciousness mechanism broadcasts this information to all modules. Among the different modules inspecting the information broadcast by the consciousness mechanism, the episodic and causal learning mechanisms collaborate to find, in Long Term Memory (LTM), previous sequences of events that occurred before the failure of the action. These sequences of events are the interactions that took place between CELTS and users during Canadarm2 manipulation in the virtual world. They are saved to CELTS's different memories, respecting the temporal ordering of the events that occurred between users and CELTS. The retrieved sequences of events contain nodes; each node contains at least an event and an occurrence time (see CELTS's episodic learning [7] for more information). For instance, in Figs. 2 and 3, different interactions may occur between users and CELTS depending on whether the nodes' preconditions in the Behavior Network (BN) become true.
From this information, CELTS extracts useful knowledge using sequential pattern mining algorithms. For example, if a user forgets to adjust the camera before he or she moves Canadarm2 and consequently chooses an incorrect joint, then the user will cause a collision risk. This kind of information is obtained through deductive reasoning in CELTS. Given such information, we were interested in finding the causes of the problems produced by the users in the virtual world. The causal learning mechanism (CLM) constantly extracts sequential rules (e.g., X ⇒ Y) between sets of events, with their confidence and support, from all past events. These rules indicate sequential



relationships between events. After finding a candidate rule as the cause of the failure, CELTS's CLM re-executes it and waits for the user's feedback. If, after the execution of the candidate rule, it turns out that the rule did not help the user solve the problem, then CELTS's CLM writes a failure in the WM. The failure leads CELTS's causal learning to examine the other nodes related to the current failure with the highest support and confidence. Each time a new node is proposed by causal learning and executed by the BN, an expectation node brings back to the consciousness mechanism the confirmation from users, to make sure that the found rule is the real cause of the failure. Finally, if a new cause is found, it is integrated into CELTS's Causal Memory. In the end, if no solution can be found, the causal learning mechanism puts the following message in the WM: "I have no solution for this problem."
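The retry loop just described can be summarized as a small sketch. This is an interpretive illustration, not CELTS's code: candidate rules are assumed here to be tried in decreasing order of support × confidence until the user's feedback confirms one, after which the confirmed cause would be stored in Causal Memory.

```python
# Hypothetical sketch of the CLM's candidate-rule retry loop.
# Rule records, the ranking criterion, and the feedback callback are
# assumptions made for this illustration.
def diagnose(rules, user_confirms):
    """Try candidate causal rules from strongest to weakest; return the
    first one the user confirms, else the fallback message."""
    ranked = sorted(rules, key=lambda r: r["support"] * r["confidence"],
                    reverse=True)
    for rule in ranked:
        if user_confirms(rule):
            return rule   # would be integrated into Causal Memory
    return "I have no solution for this problem."

rules = [
    {"cause": "forgot_camera", "support": 0.5, "confidence": 0.9},
    {"cause": "wrong_joint", "support": 0.6, "confidence": 0.4},
]
# forgot_camera ranks first (0.45 > 0.24) and is confirmed by the user:
result = diagnose(rules, lambda r: r["cause"] == "forgot_camera")
```

When no candidate is confirmed, the loop falls through to the same "no solution" message the text describes.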

CELTS’s causal memory is built on its episodic memory. In the next sub-section, we explain the link between CELTS’s causal learning and episodic learning. The reader is referred to [7] for more details about the episodic learning mechanism.

4.2. Learning by extracting rules from what is broadcasted in CELTS

The second phase of causal learning deals with mining rules from the sequences of events recorded for all of CELTS’s executions. To find the causes of the problems produced by the users in the virtual world, we chose RuleGrowth4, a data mining algorithm that we developed in previous work [27]. The algorithm extracts sequential rules (e.g., NODEi ⇒ NODEf, where NODEi and NODEf are unordered sets of events) between sets of events, with their confidence and support5 [28], from all past events. The interpretation of a rule NODEi ⇒ NODEf is that if the events from NODEi occur, the events from NODEf are likely to follow.

The original RuleGrowth algorithm takes as input a database of event sequences (as previously defined) and two thresholds: a minimum support threshold and a minimum confidence threshold. It outputs the set of rules (R1, R2, . . . , Rn) of the form NODEi ⇒ NODEf that have a support and a confidence greater than or equal to these thresholds, and where NODEi ∩ NODEf = ∅. The support and confidence of a rule are defined respectively as sup(NODEi ⇒ NODEf)/s and sup(NODEi ⇒ NODEf)/sup(NODEi), where sup(NODEi ⇒ NODEf) denotes the number of sequences in which NODEi appears before NODEf, s is the total number of sequences, and sup(NODEi) is the number of sequences containing NODEi. The confidence can be interpreted as an estimate of the conditional probability P(NODEf | NODEi) [29–31].
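As a concrete illustration of these definitions, the support and confidence of a rule can be computed naively over a small sequence database. This sketch is for clarity only, not the RuleGrowth implementation, and the event names are invented:

```python
def occurs(seq, x, y):
    """True if all events of X occur in seq and all events of Y occur after them."""
    seen = set()
    for i, e in enumerate(seq):
        seen.add(e)
        if set(x) <= seen:                      # earliest point where X is complete
            return set(y) <= set(seq[i + 1:])   # Y must appear strictly afterwards
    return False

def support_confidence(db, x, y):
    sup_rule = sum(occurs(s, x, y) for s in db)   # sequences where X precedes Y
    sup_x = sum(set(x) <= set(s) for s in db)     # sequences containing X
    support = sup_rule / len(db)
    confidence = sup_rule / sup_x if sup_x else 0.0
    return support, confidence

db = [
    ["Forget_adjust_Camera", "Low_visibility", "Collision_risk"],
    ["Adjust_Camera", "Move_Canadarm2", "Goal_attained"],
    ["Forget_adjust_Camera", "Bad_joint", "Collision_risk"],
    ["Forget_adjust_Camera", "Move_Canadarm2", "Goal_attained"],
]
sup, conf = support_confidence(db, {"Forget_adjust_Camera"}, {"Collision_risk"})
# sup = 2/4 = 0.5, conf = 2/3: the rule holds in 2 of 4 sequences,
# and 3 sequences contain the antecedent
```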

In the context of CELTS, for each rule Ri: NODEi ⇒ NODEf, NODEi and NODEf respectively represent the causes and the effects of a failure. However, one problem that occurs when applying RuleGrowth in CELTS is that up to several thousand rules may be found when the sequence database is large. At any given moment, only a few rules are generally relevant. If too many rules are found, this degrades the performance of the data mining algorithm and also of CELTS, which has too many rules to consider for reasoning. To reduce the number of rules and to extract only the most precise and relevant ones, we adapted the RuleGrowth algorithm to add constraints on the events that a rule can contain, as well as a time constraint specifying the maximum time duration of a rule.

Fig. 6. The ConstraintMiner algorithm.

4 Note that in our first implementation of the causal learning mechanism we used the CMRules algorithm [26]. We replaced it with RuleGrowth because it is faster, uses less memory, and produces the same result.

5 Given a transaction database D, defined as a set of transactions T = {t1, t2, . . . , tn} and a set of items I = {i1, i2, . . . , im}, where each ti ⊆ I, the support of an itemset X ⊆ I is denoted sup(X) and is the number of transactions that contain X. The support of a rule X ⇒ Y is defined as sup(X ⇒ Y)/|T|, where sup(X ⇒ Y) is the number of transactions in which X occurs before Y. The confidence of a rule is defined as conf(X ⇒ Y) = sup(X ⇒ Y)/sup(X).

The event constraints are set when an expert creates, for a given problem, solutions in the BN. When the expert creates questions, he tags some of CELTS’s Behavior Network (BN) nodes so that they respect some specific event constraints. After the constraints are assigned, the RuleGrowth algorithm extracts only the rules that respect this set of constraints from the interactions between CELTS and learners, so as to help learners with only relevant hints, messages, demonstrations, and so forth. To assign the time constraint, CELTS uses the BN nodes tagged by the expert to respect event constraints. That is, when CELTS reaches a node tagged as C1 (defined below), the temporal-constraint timer for this solution starts automatically. The timer stops once the user makes a mistake. In what follows, we first present these constraints and then explain how they are useful for the reasoning of CELTS. The constraints are the following:

• C1: the set of events that the left part of a rule can contain,
• C2: the set of events that the right part of a rule can contain,
• C3: the set of events that the left part of a rule has to contain,
• C4: the set of events that the right part of a rule has to contain,
• C5: the maximum time duration of a rule.

We modified RuleGrowth to ignore events that are excluded according to event constraints C1 and C3, or C2 and C4, when searching for events that can extend the left or right parts of a rule. The parameter C5 allows CELTS to exclude rules between events that are separated by too much time. Although this modification is simple, it can dramatically decrease the execution time and reduce the number of rules found, as we will show in the experimental results of this paper. Fig. 6 shows the pseudo-code of this modified algorithm, which we name ConstraintMiner. It takes as parameters a set of sequences DB, the five constraints C1, C2, C3, C4, C5, and the minimum confidence and support thresholds.
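The semantics of the five constraints can be illustrated with a simple post-filter over already-mined rules. This is a hedged sketch only: ConstraintMiner applies the constraints inside the RuleGrowth search, which is what makes it fast, and the rule tuples and event names below are invented:

```python
# Each rule is a hypothetical tuple (left_events, right_events, duration),
# where duration is the time span (in seconds) between the rule's events.
def satisfies(rule, c1, c2, c3, c4, c5):
    left, right, duration = rule
    return (left <= c1 and      # C1: left part may contain only these events
            right <= c2 and     # C2: right part may contain only these events
            c3 <= left and      # C3: left part must contain these events
            c4 <= right and     # C4: right part must contain these events
            duration <= c5)     # C5: maximum time duration of the rule

rules = [
    ({"forget_camera"}, {"collision_risk"}, 30),
    ({"forget_camera", "low_visibility"}, {"collision_risk"}, 45),
    ({"move_arm"}, {"goal_attained"}, 200),
]
kept = [r for r in rules
        if satisfies(r,
                     {"forget_camera", "low_visibility"},  # C1
                     {"collision_risk"},                   # C2
                     set(),                                # C3 (no mandatory event)
                     {"collision_risk"},                   # C4
                     60)]                                  # C5
# the first two rules are kept; the third violates C1, C2, C4, and C5
```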

We now explain how the five constraints that we have defined are used to discover rules that are relevant for CELTS. The event constraints can be used to discover rules that answer the following four questions:

Q1: What may happen? Constraint C1 is used to force rules to contain only events that occurred in the left part of the rule. For instance, if the events {Forget_camera_adjustment} and {Low_visibility} occur, CELTS can put these two events in C1 and discover rules such as {Forget_camera_adjustment, Low_visibility} ⇒ {Collision_risk}, which indicates that this combination of events could result in a collision risk.

Q2: What was the cause of the event? Constraint C4 is used to force rules to contain only events that occurred in the right part of the rule. This allows CELTS to find explanations of why these events occurred. For instance, in one situation, CELTS recorded that a collision risk (CR) occurred. By searching the right part of the rules using constraint C4, CELTS was able to find that forgetting camera adjustments and choosing the wrong joint(s) are the most likely causes of the problem.

Q3: Given a specific situation, which events will take place? Using constraints C1 and/or C3, we make sure that the events that occurred are registered in the left part of the rules, and by using constraints C2 and/or C4 we make sure that the prediction is registered in the right part of the rules. With this strategy, CELTS can obtain rules that predict the occurrence of specific events in the future. For instance, if a learner moves Canadarm2 too close to the ISS, CELTS can put this event in constraint C3 and the collision event in constraint C4, to know whether a collision with the ISS is likely to occur. This information can then be used to take the appropriate action.

Q4: What may happen after a given action of CELTS? Given a situation, constraint C1 is assigned to all registered events that have happened, and constraint C3 is assigned to all registered action(s) that CELTS can take for the situation. By using these constraints, CELTS can discover rules that indicate the consequences of the possible actions to be taken. For instance, if, while manipulating Canadarm2, the learner makes a mistake by choosing a bad joint in a low-visibility situation, C1 is assigned to the situation and the possible remedial actions that CELTS can take are assigned to C3 (e.g., ‘‘Engage dialogue about joint selection’’). Thus, using constraints C1 and C3 allows discovering rules such as:

{Choosing_bad_Camera, Low_visibility, Hints} ⇒ {Learner_corrects_mistake}

Given that the situation described by the rule is similar to the current situation, CELTS can then use the rule to take the appropriate action.

4.2.1. Adding the time constraint

The time constraint is used in combination with the previous constraints to restrict the time between the events in a rule. The time constraint is used because if some events appear a long time after other events, they are less likely to be related by a causal relationship. Moreover, in the context of tutoring agents, it is generally agreed that virtual tutors should address the errors that learners made recently rather than those they made earlier. In CELTS, the time constraint is set dynamically, as previously explained.

4.2.2. Combining all constraints

The previous four questions and the time constraint can be combined to achieve more complex reasoning. For instance, CELTS can detect why some learners do not know which joint must be chosen to achieve a specific goal (‘‘don’t_know_right_joint’’). To do so, while CELTS interacts with learners, it seeks all the rules whose right part contains the following information: {don’t_know_right_joint}, for a given time duration C5. By applying the data mining algorithm to the sequences stored in its episodic memory, CELTS can discover the rule {Rotating_the_arm_more_than_it_has_to, Chose_bad_joint} ⇒ {don’t_know_right_joint}.

This means that the cause may be that the learner rotated Canadarm2 more than he had to and thus chose the wrong joint. By using constraint C3, CELTS can then search for the following information: ‘‘What is the best joint to choose?’’ and ‘‘How large should the arm movement be?’’ given the cause {don’t_know_right_joint}.

The aforementioned constraints and rules help CELTS to predict the results of learners’ actions and decide how to deal with them.

5. Testing causal learning in the New CELTS

To determine the extent to which the improved causal learning mechanism improved CELTS’s performance, we asked 8 users to test the new version of CELTS with the improved causal learning mechanism (version A) and the version of CELTS with the previous causal learning mechanism (version B). Learners were invited to manipulate Canadarm2 for approximately 2 h, using both versions A and B of the system. The first four students (group A) used version A and then version B. The other four learners (group B) first used version B and then version A. After its interactions with the users, CELTS categorized them into novices and intermediates. During the experiments with version A, CELTS automatically learned more than 2000 rules from its interactions with learners. A few examples of rules are given below:

• 42% of the time, when the learner was not aware of distances and Canadarm2 was close to the ISS, there was a collision risk: {Canadarm2_Near_ISS, Not_aware_of_distance} ⇒ {Collision_risk}.
• 51% of the time, when a user forgot to adjust the camera, he or she later chose an incorrect joint of Canadarm2: {Forget_adjust_Camera} ⇒ {Bad_joint}.
• 10% of the time, when the user moved Canadarm2 without adjusting the camera, he or she increased the risk of collisions: {Move_Canadarm2}, {Forget_adjust_Camera} ⇒ {Collision_risk}.
• 10% of the users who manipulated Canadarm2 close to the space station, being aware of the distance and having reached the goal, were classified as experts: {Canadarm2_Near_ISS, Aware_of_distance, Goal_attained} ⇒ {Expert}.

Such rules are then used by CELTS, as described in the Behavior Network (Fig. 5), to adapt its behavior. To do so, during each cognitive cycle, CELTS checks which rules match its current execution. If several rules match the current execution, the one having the highest strength is used for prediction. The strength of a rule is defined as: Strength(rule) = Confidence(rule) × Support(rule).
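A minimal sketch of this selection policy, assuming rules are stored as dictionaries with hypothetical field names (the rules below reuse the examples above):

```python
# Pick the matching rule with the highest strength = confidence * support.
# A rule "matches" when all events of its left part occurred in the current
# execution trace.
def best_rule(rules, current_events):
    matching = [r for r in rules if r["left"] <= current_events]
    return max(matching,
               key=lambda r: r["confidence"] * r["support"],
               default=None)

rules = [
    {"left": {"Forget_adjust_Camera"}, "right": {"Bad_joint"},
     "support": 0.20, "confidence": 0.51},
    {"left": {"Forget_adjust_Camera", "Move_Canadarm2"},
     "right": {"Collision_risk"},
     "support": 0.10, "confidence": 0.40},
]
chosen = best_rule(rules, {"Forget_adjust_Camera", "Move_Canadarm2"})
# both rules match; strengths are 0.102 vs 0.040, so the first rule wins
```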

To assess the quality of the rules found by CELTS, we asked a domain expert to evaluate them. Given that checking all rules one by one would be tedious, the expert examined 150 rules from the 2000 recorded rules. Overall, the expert confirmed the correctness of about 85% of the rules. Furthermore, among the found rules, many unexpected yet correct rules were discovered.

To evaluate to what extent the integration of the learning mechanisms impacted the performance of the learners, we measured four performance indicators during the usage of versions A and B of CanadarmTutor by groups A and B: (1) the percentage of questions that they answered correctly, (2) the average time that they took to complete each exercise in minutes, (3) the mean number of collision risks incurred by learners, and (4) the mean number of violations of the security protocol committed during the exercises. Fig. 7 illustrates the results: (1) group A correctly answered 50% of the questions, whereas group B correctly answered 30%; (2) group A took an average of 1.45 min to complete an exercise, whereas group B took an average of 2.13 min; (3) group A incurred an average of 3 collision risks, whereas group B incurred an average of 5; (4) group A had an average of 10 protocol violations, whereas group B had an average of 18. Although we did not use a very large number of learners in this experiment, these results show that the performance of the learners who used the new version of CELTS clearly improved.


Fig. 7. Learners’ performance comparison.


Furthermore, we analyzed the correctness of CELTS’s hints and messages to the learners during tasks. To determine whether an intervention was correct, we asked learners to rate each of CELTS’s interventions as appropriate or inappropriate. Because learners could rate the tutor’s interventions incorrectly, we also observed the training sessions and subsequently verified the ratings given by each learner one by one. The results show that the average number of appropriate interventions was about 83% using version A and 58% using version B. This is also a considerable improvement in CELTS’s performance over its previous version.

We also assessed user satisfaction by performing a 10-min post-experiment interview with each user. We asked each participant to tell us which version of CELTS they preferred, to explain why, and to tell us what should be improved. Users unanimously preferred the new version. Some comments given by users were: (1) the tutoring agent ‘‘exhibited a more intelligent and natural behavior’’; (2) ‘‘the interactions were more varied’’; (3) ‘‘the tutoring agent seems more flexible’’; (4) ‘‘in general, gives more appropriate feedback’’. There were also several comments on how CELTS could be improved. In particular, many users expressed the need to give CELTS a larger knowledge base for generating dialogues. We plan to address this issue in future work.

Lastly, we also measured the execution time of the customized data mining algorithms used in CELTS, to determine whether their performance was an issue during interactions with learners. The execution time was always very good in our experiments: on average, the algorithms extracted causal rules and sequential patterns from approximately 500 sequences in less than 50 ms.

To test the scalability of the algorithms, we automatically generated 20,000 sequences of events. The sequences were generated by randomly answering the questions asked by CELTS during Canadarm2 manipulation. The data mining algorithms were applied to the sequences in real time to extract useful information and the causes of events (causal rules). The performance was very good, the algorithms terminating in less than 100 s with the following parameters: a minimum support of 0.05 and a minimum confidence of 0.3. This demonstrates the scalability of the algorithms for much larger amounts of data than were recorded in CELTS during our experiments with users. This performance is comparable to that of other large-scale frequent pattern mining algorithms in the literature, which can often handle hundreds of thousands of sequences or transactions [27].

6. Conclusion

In this paper, we described how our improved data mining algorithm can help a cognitive agent find the causes of learners’ mistakes. The new and previous versions of CELTS’s causal learning algorithm were compared in four ways. First, CELTS’s performance was evaluated based on the number of correct interventions given to the learners during training sessions. Second, we measured the impact on learners’ performance. Third, we evaluated the satisfaction of users. Fourth, a domain expert examined the correctness of the causal rules learned by CELTS. Although the experiment was carried out with a limited number of participants, the results suggest a major improvement with the new causal learning mechanism.

In CELTS, causal learning depends on episodic learning, which in turn depends on emotions.

For future work, we plan to explore the following research problems. First, the causal learning algorithms used in this study are not incremental. Therefore, for each CELTS execution, the algorithms must read the whole database instead of taking advantage of previous results. Another limitation of our work is that, given the observed data and the confidence and support calculated by CELTS’s causal learning algorithm, the question remains as to how one could use this information to produce a probability distribution as exists in Bayesian networks.

References

[1] A. Maldonado, A. Catena, J.C. Perales, A. Cándido, Cognitive biases in human causal learning, 2007.

[2] D.M. Bartels, C.W. Bauman, L.J. Skitka, D.L. Medin (Eds.), Moral Judgment and Decision Making, Psychology of Learning and Motivation, Elsevier, 2009.

[3] D.A. Lagnado, M.R. Waldmann, Y. Hagmayer, S.A. Sloman, Beyond covariation: cues to causal structure, in: A. Gopnik, L. Schulz (Eds.), Causal Learning: Psychology, Philosophy and Computation, Oxford University Press, 2007, pp. 154–172.

[4] A. Gopnik, L. Schulz, Causal Learning: Psychology, Philosophy, and Computation, Oxford University Press, USA, 2007.

[5] W. Schoppek, Stochastic independence between recognition and completion of spatial patterns as a function of causal interpretation, in: Proceedings of the 24th Annual Conference of the Cognitive Science Society, Erlbaum, Mahwah, NJ, 2002, pp. 804–809.

[6] S. Hélie, Modélisation de l’apprentissage ascendant des connaissances explicites dans une architecture cognitive hybride, Ph.D. thesis, DIC, UQAM, Montréal, 2007.

[7] U. Faghihi, The use of emotions in the implementation of various types of learning in a cognitive agent, Ph.D. thesis, Computer Science, University of Quebec at Montreal (UQAM), Montreal, 2011.

[8] S. Franklin, F.G.J. Patterson, The LIDA architecture: adding new modes of learning to an intelligent, autonomous, software agent, in: Integrated Design and Process Technology, 2006.

[9] D.R. Hofstadter, M. Mitchell, The Copycat project: a model of mental fluidity and analogy-making, in: K.J. Holyoak, J.A. Barnden (Eds.), Advances in Connectionist and Neural Computation Theory, vol. 2, Logical Connections, Ablex, Norwood, NJ, 1994.

[10] R. Nkambou, K. Belghith, F. Kabanza, An approach to intelligent training on a robotic simulator using an innovative path-planner, in: Proceedings of the 8th International Conference on Intelligent Tutoring Systems (ITS), LNCS, 2006, pp. 645–654.

[11] R. Nkambou, P. Fournier-Viger, E. Mephu Nguifo, Learning task models in ill-defined domain using an hybrid knowledge discovery framework, Knowledge-Based Systems 24 (1) (2011).

[12] U. Faghihi, P. Fournier-Viger, R. Nkambou, P. Poirier, A generic causal learning model for cognitive agents, in: Proceedings of the Twenty-Third International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems (IEA-AIE 2010), 2010.

[13] M. Braun, W. Rosenstiel, K.-D. Schubert, Comparison of Bayesian networks and data mining for coverage directed verification category simulation-based verification, in: Eighth IEEE International High-Level Design Validation and Test Workshop, 2003, pp. 91–95.

[14] B.J. Baars, In the Theater of Consciousness: The Workspace of the Mind, Oxford University Press, Oxford, 1997.

[15] P. Maes, How to do the right thing, Connection Science 1 (1989) 291–323.

[16] E.A. Phelps, Emotion and cognition: insights from studies of the human amygdala, Annual Review of Psychology 57 (2006) 27–53.

[17] J.E. LeDoux, Emotion circuits in the brain, Annual Review of Neuroscience 23 (2000) 155–184.

[18] A. Damasio, The Feeling of What Happens: Body and Emotion in the Making of Consciousness, Harvest Books, 2000.

[19] A.C. Martínez, A. Maldonado López, J.C. Perales López, A. Cándido Ortiz, L. Guadarrama, R. Beltrán, D. Contreras Ros, A. Herrera, Causal effects of emotional induction in causal learning, Psicológica: Revista de metodología y psicología experimental 27 (2) (2006) 245–268. ISSN 0211-2159.

[20] W. Cannon, The James-Lange theory of emotion: a critical examination and an alternative theory, American Journal of Psychology 39 (1927) 106–124.

[21] R. Lazarus, Emotion and Adaptation, Oxford University Press, New York, 1991.

[22] S. Marsella, J. Gratch, P. Petta, Computational models of emotion, in: K.R. Scherer, T. Bänziger, E. Roesch (Eds.), A Blueprint for an Affectively Competent Agent: Cross-Fertilization between Emotion Psychology, Affective Neuroscience, and Affective Computing, Oxford University Press, 2010.

[23] K.A. Lindquist, T.D. Wager, H. Kober, E. Bliss-Moreau, L.F. Barrett, The brain basis of emotion: a meta-analytic review, Behavioral and Brain Sciences, Cambridge University Press, 2011.

[24] E. Tulving, Précis of Elements of Episodic Memory, Behavioral and Brain Sciences 7 (1984) 223–268.

[25] D. Purves, E. Brannon, R. Cabeza, S.A. Huettel, K. LaBar, M. Platt, M. Woldorff, Principles of Cognitive Neuroscience, Sinauer Associates, Sunderland, Massachusetts, 2008.

[26] P. Fournier-Viger, U. Faghihi, R. Nkambou, E. Mephu Nguifo, CMRules: mining sequential rules common to several sequences, Knowledge-Based Systems, Elsevier, 2012.

[27] P. Fournier-Viger, R. Nkambou, V.S. Tseng, RuleGrowth: mining sequential rules common to several sequences by pattern-growth, in: Proceedings of the 26th Symposium on Applied Computing (ACM SAC 2011), 2011, pp. 954–959.

[28] R. Agrawal, T. Imielinski, A. Swami, Mining association rules between sets of items in large databases, in: Proceedings of the ACM SIGMOD Conference, 1993, pp. 207–216.

[29] J. Hipp, U. Güntzer, G. Nakhaeizadeh, Data mining of association rules and the process of knowledge discovery in databases, in: Industrial Conference on Data Mining, 2001, pp. 15–36.

[30] J. Deogun, L. Jiang, Prediction mining - an approach to mining association rules for prediction, in: Proceedings of the RSFDGrC 2005 Conference, LNCS 3642, Springer, 2005, pp. 98–108.

[31] D. Li, J.S. Deogun, Discovering partial periodic sequential association rules with time lag in multiple sequences for prediction, in: Proceedings of ISMIS 2005, LNCS, vol. 34, Springer, 2005, pp. 332–341.