Dynamic and Temporal Bayesian Networks
Probabilistic Graphical Models
L. Enrique Sucar, INAOE
(INAOE) 1 / 41
Outline
1 Introduction
2 Dynamic Bayesian networks: Inference, Learning
3 Temporal Event Networks: Inference, Learning
4 Applications: Gesture Recognition, Predicting HIV Mutational Pathways
5 References
Introduction
• There are two basic types of Bayesian network models for dynamic processes: state based and event based
• State based models represent the state of each variable at discrete time intervals, so that the networks consist of a series of time slices, where each time slice indicates the value of each variable at time t – Dynamic Bayesian Networks
• Event based models represent the changes in state of each state variable; each temporal variable then corresponds to the time at which a state change occurs – event networks or temporal networks
Dynamic Bayesian networks
• Dynamic Bayesian networks (DBNs) are an extension of Bayesian networks to model dynamic processes
• A DBN consists of a series of time slices that represent the state of all the variables at a certain time, t
• For each temporal slice, a dependency structure between the variables at that time is defined, called the base network
• Additionally, there are edges between variables from different slices, with their directions following the direction of time, defining the transition network
DBN - example
Assumptions
• First order Markov model. The state variables at time t depend only on the state variables at time t − 1 (and other variables at time t).
• Stationary process. The structure and parameters of the model do not change over time.
Particular cases: MC, HMMs, KF
• DBNs can be seen as a generalization of Markov chains:

P(X1, X2, ..., XT) = P(X1) P(X2 | X1) ... P(XT | XT−1)   (1)

• And of HMMs:

P(S1:T, Y1:T) = P(S1) P(Y1 | S1) ∏t=2..T P(St | St−1) P(Yt | St)   (2)
• Kalman filters also have one state and one observation variable, but both variables are continuous. The basic Kalman filter assumes Gaussian distributions and linear functions for the transitions and observations
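As a minimal illustration of Eqs. (1) and (2), the sketch below evaluates both joint probabilities for a toy two-state model. All numbers in `pi`, `A`, and `B` are made-up assumptions for illustration, not values from the slides.

```python
import numpy as np

# Toy two-state model (all numbers are illustrative assumptions)
pi = np.array([0.6, 0.4])        # P(X1) / P(S1)
A = np.array([[0.7, 0.3],        # A[i, j] = P(X_t = j | X_{t-1} = i)
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],        # B[i, k] = P(Y_t = k | S_t = i)
              [0.3, 0.7]])

def markov_chain_prob(states):
    """Eq. (1): P(X1,...,XT) = P(X1) P(X2|X1) ... P(XT|XT-1)."""
    p = pi[states[0]]
    for s_prev, s in zip(states, states[1:]):
        p *= A[s_prev, s]
    return p

def hmm_joint_prob(states, obs):
    """Eq. (2): P(S1:T, Y1:T) = P(S1) P(Y1|S1) * prod_t P(St|St-1) P(Yt|St)."""
    p = pi[states[0]] * B[states[0], obs[0]]
    for t in range(1, len(states)):
        p *= A[states[t - 1], states[t]] * B[states[t], obs[t]]
    return p
```

The HMM joint adds one observation factor per slice on top of the Markov chain product, which is exactly the difference between the two equations.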
Dynamic Bayesian networks: Inference
Types of Inference
• Filtering. Predict the next state based on past observations: P(Xt+1 | Y1:t).
• Prediction. Predict future states based on past observations: P(Xt+n | Y1:t).
• Smoothing. Estimate the current state based on past and future observations (useful for learning): P(Xt | Y1:T).
• Decoding. Find the most likely sequence of hidden variables given the observations: ArgMax(X1:T) P(X1:T | Y1:T).
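For the HMM special case, filtering can be sketched with the forward recursion below; the belief after each correction is P(St | Y1:t), and one extra transition step gives the one-step prediction P(St+1 | Y1:t). Smoothing and decoding additionally need a backward pass, not shown here. This is a minimal sketch, not the slides' own code.

```python
import numpy as np

def forward_filtering(pi, A, B, ys):
    """HMM forward recursion.
    pi[i] = P(S1=i); A[i, j] = P(St=j | St-1=i); B[i, k] = P(Yt=k | St=i).
    Returns filtered beliefs P(St | Y1:t) and predictions P(St+1 | Y1:t)."""
    alpha = pi * B[:, ys[0]]
    alpha /= alpha.sum()                 # P(S1 | Y1)
    beliefs, predictions = [alpha], [A.T @ alpha]
    for y in ys[1:]:
        alpha = B[:, y] * (A.T @ alpha)  # predict, then correct with Yt
        alpha /= alpha.sum()
        beliefs.append(alpha)
        predictions.append(A.T @ alpha)  # P(St+1 | Y1:t)
    return beliefs, predictions
```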
Inference
• Efficient inference methods have been developed for particular types of models, such as HMMs
• For more complex models, inference becomes computationally intractable
• In these cases we can apply approximate methods based on sampling, such as Markov chain Monte Carlo and particle filters
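A generic bootstrap particle filter (one of the sampling-based approximations mentioned above) can be sketched as follows; the model functions `sample_init`, `sample_trans`, and `likelihood` are user-supplied placeholders, and the whole setup is an illustrative assumption rather than the slides' method.

```python
import numpy as np

def bootstrap_particle_filter(ys, n, sample_init, sample_trans, likelihood, seed=0):
    """Sequential importance resampling: propagate particles through the
    transition model, weight them by the observation likelihood, resample."""
    rng = np.random.default_rng(seed)
    particles = sample_init(rng, n)
    for y in ys:
        particles = sample_trans(rng, particles)             # propagate
        w = likelihood(y, particles)                         # weight
        w = w / w.sum()
        particles = particles[rng.choice(n, size=n, p=w)]    # resample
    return particles
```

With, say, a 1-D Gaussian random-walk transition and a Gaussian observation likelihood, the resampled particles approximate the filtering distribution P(Xt | Y1:t).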
Dynamic Bayesian networks: Learning
Learning DBNs
• There are 4 basic cases for learning DBNs
• For all the cases we can apply extensions of the methods for parameter and structure learning for BNs

Structure   Observability   Method
Known       Full            Maximum likelihood estimation
Known       Partial         Expectation–maximization (EM)
Unknown     Full            Search (global) or tests (local)
Unknown     Partial         EM and global or local
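For the simplest case in the table (known structure, full observability), maximum likelihood estimation reduces to normalized counts. The sketch below estimates a transition CPT from fully observed state sequences; it assumes integer-coded states, which is an illustrative convention.

```python
def mle_transition_cpt(sequences, n_states):
    """Maximum likelihood estimate of P(Xt | Xt-1) from fully observed
    sequences of integer-coded states: normalized transition counts."""
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        for s_prev, s in zip(seq, seq[1:]):
            counts[s_prev][s] += 1
    cpt = []
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row if a state was never visited
        cpt.append([c / total if total else 1.0 / n_states for c in row])
    return cpt
```

The partial-observability rows of the table replace these direct counts with expected counts computed by EM.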
Unknown Structure and Full Observability
• Assuming that the DBN is stationary (time invariant), we can consider that the model is defined by two structures: (i) the base structure, and (ii) the transition structure
• Divide the learning of a DBN into two parts: first learn the base structure, and then, given the base structure, learn the transition structure
Structure Learning: base and transition networks

• For learning the base structure we can use all the available data for each variable, ignoring the temporal information. This is equivalent to learning a BN
• For learning the transition network we consider the temporal information, in particular the data for all variables in two consecutive time slices, Xt and Xt+1
• Given the base structure, we can then learn the dependencies between the variables at time t and t + 1
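The data preparation implied by these two steps can be sketched as below: pool all slices (ignoring time) for the base network, and pair consecutive slices for the transition network. The representation of a slice as a generic record is an assumption for illustration.

```python
def base_and_transition_data(sequences):
    """Split temporal sequences (each a list of per-slice records) into:
    - base data: every slice pooled, ignoring temporal information
      (used to learn the base network as an ordinary BN)
    - transition data: pairs of consecutive slices (Xt, Xt+1)
      (used to learn the transition network)."""
    base = [x for seq in sequences for x in seq]
    transition = [(xt, xt1) for seq in sequences
                  for xt, xt1 in zip(seq, seq[1:])]
    return base, transition
```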
Temporal Event Networks
• Temporal event networks (TENs) are an alternative to DBNs for modeling dynamic processes
• In a temporal event network, a node represents the time of occurrence of an event or state change of a certain variable
• For some problems, in which there are few state changes in the temporal range of interest, event networks provide a simpler and more efficient representation
• We will focus on one variant known as Temporal Nodes Bayesian Networks
TNBNs: Definition
• A Temporal Nodes Bayesian Network (TNBN) is composed of a set of Temporal Nodes (TNs)
• TNs are connected by edges, where each edge represents a causal-temporal relationship
• There is at most one state change for each variable (TN) in the temporal range of interest
• The value taken by the variable represents the interval in which the event occurs – time is discretized into a finite number of intervals, allowing a different number and duration of intervals for each node
• Each interval defined for a child node represents the possible delays between the occurrence of one of its parent events (cause) and the corresponding child event
Car accident example
• Assume that at time t = 0, an automobile accident occurs, that is, a Collision. This kind of accident can be classified as severe, moderate, or mild. To simplify the model, we will consider only two immediate consequences for the person involved in the collision: Head Injury and Internal Bleeding. A Head Injury can bruise the brain, and chest injuries can lead to Internal Bleeding. These are all instantaneous events that may generate subsequent changes; for example, the Head Injury event might generate Dilated Pupils and unstable Vital Signs
Car accident example
• Additionally, a physician domain expert provided some important temporal information: if a head injury occurs, the brain will start to swell, and if left unchecked the swelling will cause the pupils to dilate within 0 to 60 minutes. If internal bleeding begins, the blood volume will start to fall, which will tend to destabilize vital signs. The time required to destabilize vital signs will depend on the severity of the bleeding: if the bleeding is gross, it will take from 0 to 15 minutes; if the bleeding is slight, it will take from 15 to 45 minutes. A head injury also tends to destabilize vital signs, taking from 0 to 15 minutes to make them unstable
TNBN: car accident example
Temporal Event Networks: Inference
TNBN: inference
• A TNBN allows reasoning about the probability of occurrence of certain events, for diagnosis (i.e., finding the most probable cause of a temporal event) or prediction (i.e., determining the probable future events that will occur given a certain event)
• Standard probability propagation techniques for BNs can be applied
• However, given that a TNBN represents relative times between events, the cases of prediction and diagnosis have to be differentiated
Prediction
• In the case where at least one of the root (instantaneous) nodes of the TNBN is part of the evidence, the time reference for the model is fixed and probability propagation can be performed directly, obtaining the posterior probability of the subsequent events
• For temporal nodes, the inference procedure determines the probability of occurrence for each temporal interval, or of the event not occurring
Diagnosis
• In the case where none of the instantaneous nodes are known, and the evidence is given only for temporal nodes, several scenarios need to be considered
• In this case, all the n possible intervals for the TN have to be considered, performing inference n times, once for each interval
• The results for each scenario have to be maintained until there is additional evidence, such as the occurrence of another event, that allows some scenarios to be discarded
Temporal Event Networks: Learning
TNBNs: learning
• Learning a TNBN involves 3 aspects: (i) learning the temporal intervals for the temporal nodes, (ii) learning the structure of the model, and (iii) learning the parameters of the model
• As these three components are interrelated, an iterative procedure is required that learns an initial estimate of one (or more) of these aspects and then improves these initial estimates iteratively
• Next we present the LIPS algorithm
Algorithm
• First, it performs an initial discretization of the temporal variables, for example using Equal-Width discretization – this gives an initial approximation of the intervals for all the Temporal Nodes
• Then it performs standard BN structure and parameter learning
• The interval learning algorithm refines the intervals for each temporal node (TN) by means of clustering, using the information in the configurations of the parent nodes
• It obtains different sets of intervals that are merged and combined; this process generates different interval sets that are evaluated in terms of predictive accuracy
• Finally, the parameters (CPTs) are updated according to the new set of intervals for each TN
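The first step above, Equal-Width discretization, can be sketched as follows; this is a minimal illustration of the generic technique, not the LIPS implementation itself.

```python
def equal_width_intervals(times, k):
    """Equal-width discretization of event times into k intervals:
    split [min, max] into k bins of equal width and map each time to
    the (low, high) bounds of its bin."""
    lo, hi = min(times), max(times)
    width = (hi - lo) / k
    edges = [lo + i * width for i in range(k + 1)]

    def interval_of(t):
        # clamp the maximum value into the last bin
        i = min(int((t - lo) / width), k - 1) if width > 0 else 0
        return (edges[i], edges[i + 1])

    return [interval_of(t) for t in times]
```

The refinement step of the algorithm then replaces these uniform bins with intervals derived from clustering the times under each parent configuration.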
Data to Learn the Accident TNBN
• Top: original data showing the time of occurrence of the temporal events.
• Bottom: temporal data after the initial discretization

Collision   Head Injury   Internal Bleeding   Dilated Pupils   Vital S.
severe      true          gross               14               20
moderate    true          gross               25               25
mild        false         false               -                -
...

Collision   Head Injury   Internal Bleeding   Dilated Pupils   Vital S.
severe      true          gross               [10-20]          [15-30]
moderate    true          gross               [20-30]          [15-30]
mild        false         false               -                -
...
Learning example – accident TNBN
• After the initial discretization, the initial structure and parameters are learned
• Then the intervals for each temporal node are refined
• Finally the parameters are re-estimated
• Intervals obtained for the node Dilated Pupils – the temporal data is divided into two partitions, one for each configuration of the parent node

Partition            Intervals
Head Injury=true     [11−35]; [11−27], [32−53]; [8−21], [25−32], [45−59]
Head Injury=false    [3−48]; [0−19], [39−62]; [0−14], [28−40], [47−65]
Applications
• Dynamic Bayesian networks are used for dynamic gesture recognition
• Temporal event networks are used for predicting HIV mutational pathways
Applications: Gesture Recognition
Gesture Recognition
• Dynamic Bayesian networks provide an alternative to HMMs for dynamic gesture recognition
• They offer greater flexibility in terms of the structure of the models
• We consider a particular type of DBN known as a dynamic Bayesian network classifier (DBNC)
• A DBNC has a hidden state variable for each time instant, St; however, the observation variable is decomposed into m attributes, A^1_t, ..., A^m_t, which are assumed to be conditionally independent given St
DBNC
• The joint probability of a DBNC can be factored as:
P(S1:T, A1:T) = P(S1) [ ∏m=1..M P(A^m_1 | S1) ] ∏t=2..T P(St | St−1) [ ∏m=1..M P(A^m_t | St) ]   (3)
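Eq. (3) can be evaluated directly in log space, as sketched below; the conditional independence of the attributes given St is what lets the observation term factor into a product over the m attribute CPTs. The array layout is an illustrative assumption.

```python
import numpy as np

def dbnc_log_joint(states, attrs, pi, A, Bs):
    """Log of Eq. (3).
    states[t] is the hidden state at slice t; attrs[t] is a list of m
    attribute values; Bs[m][s, a] = P(A^m = a | S = s)."""
    lp = np.log(pi[states[0]])
    # Observation term for t = 1: product over the m attributes
    lp += sum(np.log(B[states[0], a]) for B, a in zip(Bs, attrs[0]))
    for t in range(1, len(states)):
        lp += np.log(A[states[t - 1], states[t]])        # transition
        lp += sum(np.log(B[states[t], a])                # factored observation
                  for B, a in zip(Bs, attrs[t]))
    return lp
```

With m = 1 this reduces exactly to the HMM joint of Eq. (2), which is a useful sanity check.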
Gesture recognition with DBNCs
• 9 hand gestures for commanding a mobile robot
Features
• The features include motion and posture information, in total 7 attributes
• Motion features are ∆area (changes in the hand area) and ∆x and ∆y (changes in hand position on the XY-plane of the image)
• Posture features, named form, right, above, and torso, describe hand orientation and spatial relations between the hand and other body parts, such as the face and torso
Posture features
Experiments
• Three experiments were done to compare the classification and learning performance of DBNCs and HMMs
• An important difference between the DBNCs and the HMMs is the number of parameters required for each model
• The number of parameters needed to specify the state observation distributions of HMMs is 648 with posture-motion features and 27 with only motion data
• With DBNCs, it is 21 parameters in the former case and 12 in the latter. This significant reduction in the number of parameters for the DBNC has an important impact on training
Results

• In the experiments with the same person, using motion and posture attributes, both HMMs and DBNCs obtain very good performance, with recognition rates between 96 and 99%
• DBNCs obtain slightly better recognition results, with a significant reduction in training time – about ten times faster than HMMs
• For the experiments with multiple people, as expected, the performance of both HMMs and DBNCs decreases, to about 86% recognition with motion-posture attributes. If only motion attributes are used, the performance is on the order of 65%
• In the third experiment, variations in distance and orientation also have an impact on the recognition rates, with orientation having the greater effect
Applications: Predicting HIV Mutational Pathways
Predicting HIV Mutational Pathways
• The human immunodeficiency virus (HIV) is one of the fastest evolving organisms on the planet
• Antiretroviral therapy (ART) exerts a strong selective pressure on HIV that, under suboptimal conditions, readily selects for mutations that allow the virus to replicate even in the presence of highly potent antiretroviral drug combinations
• We address the problem of finding mutation-mutation and drug-mutation associations in individuals receiving antiretroviral therapy – focused on protease inhibitors (PIs)
• Historical data from HIV patients was used to learn a TNBN, and this model was then evaluated in terms of the discovered relationships
Data
• Clinical data from 2373 patients with HIV subtype B was retrieved from the HIV Stanford Database
• For each patient, data consisted of an initial treatment (a combination of drugs) administered to the patient and a list of the most frequent mutations in the viral population at a specific time after the initiation of treatment

Patient   Initial Treatment   List of Mutations   Time (Weeks)
Pat1      LPV, FPV, RTV       L63P, L10I          15
                              V77I                25
                              I62V                50
Pat2      NFV, RTV, SQV       L10I                25
                              V77I                45
Learning the model

• A TNBN was learned from a reduced HIVDB data set that focused on certain drugs and part of the virus
• Two modifications to the original algorithm were made: one to measure the strength of the temporal-probabilistic relations, and another to vary the variable order given to the structure learning algorithm (K2), so the results are not biased by a particular predefined order
• In order to evaluate the models and to measure the statistical significance of edge strengths, non-parametric bootstrapping was used (obtaining several models)
• A strong relation was defined as one that appeared in at least 90% of the graphs, and a suggestive relation as one that occurred with values between 70% and 90%
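The bootstrap edge-strength procedure described above can be sketched as follows; `learn_structure` stands in for the (unspecified here) structure learning step and is a hypothetical user-supplied function, while the 90%/70% thresholds come from the slides.

```python
from collections import Counter
import random

def edge_strengths(data, learn_structure, n_boot=100, seed=0):
    """Non-parametric bootstrap: resample the data with replacement,
    relearn the structure each time, and count how often each edge
    appears. learn_structure(sample) must return a set of edges."""
    rnd = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        sample = [rnd.choice(data) for _ in data]
        counts.update(learn_structure(sample))
    strong = {e for e, c in counts.items() if c / n_boot >= 0.9}
    suggestive = {e for e, c in counts.items() if 0.7 <= c / n_boot < 0.9}
    return strong, suggestive
```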
Learned TNBN
Analysis
• The model was able to predict clinically relevant associations between the chosen drugs and mutations
• A strong association between SQV, G48V, and I84V was predicted by the model
• Two possible mutational pathways for LPV resistance were predicted:
  • I54V → V32I → I47V
  • L90M → M46IL → I84V
• Also, the shared mutational pathway between IDV and LPV was observed, involving mutations L90M, M46IL, I54V, V82A and I84V
References
Book
Sucar, L.E.: Probabilistic Graphical Models. Springer (2015) – Chapter 9
Additional Reading (1)
Arroyo-Figueroa, G., Sucar, L.E.: A Temporal Bayesian Network for Diagnosis and Prediction. In: Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence (UAI), Morgan Kaufmann, San Mateo, pp. 13–20 (1999)

Aviles-Arriaga, H.H., Sucar, L.E., Mendoza-Duran, C.E., Pineda-Cortes, L.A.: Comparison of Dynamic Naive Bayesian Classifiers and Hidden Markov Models for Gesture Recognition. Journal of Applied Research and Technology. 9(1), 81–102 (2011)

Friedman, N., Murphy, K., Russell, S.: Learning the Structure of Dynamic Probabilistic Networks. In: Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI), Morgan Kaufmann Publishers Inc., pp. 139–147 (1998)
Additional Reading (2)
Galan, S.F., Arroyo-Figueroa, G., Díez, F.J., Sucar, L.E.: Comparison of Two Types of Event Bayesian Networks: A Case Study. Applied Artificial Intelligence. 21(3), 185–209 (2007)

Ghahramani, Z.: Learning Dynamic Bayesian Networks. Lecture Notes in Computer Science. 1387, 168–197 (1998)

Hernandez-Leal, P., Gonzalez, J.A., Morales, E.F., Sucar, L.E.: Learning Temporal Nodes Bayesian Networks. International Journal of Approximate Reasoning. 54(8), 956–977 (2013)
Additional Reading (3)
Hernandez-Leal, P., Rios-Flores, A., Avila-Rios, S., Reyes-Teran, G., Gonzalez, J.A., Fiedler-Cameras, L., Orihuela-Espina, F., Morales, E.F., Sucar, L.E.: Discovering HIV Mutational Pathways using Temporal Bayesian Networks. Artificial Intelligence in Medicine. 57(3), 185–195 (2013)

Murphy, K.: Dynamic Bayesian Networks: Representation, Inference and Learning. Dissertation, University of California, Berkeley (2002)