
BAYESIAN NETWORKS
Vibhav Gogate

Today – Bayesian networks

• One of the most exciting advances in statistical AI and machine learning in the last 10-15 years
• Generalizes the naïve Bayes and logistic regression classifiers
• A compact representation for exponentially large probability distributions
• Exploits conditional independencies

Judea Pearl: Turing Award in 2011 for his contributions to Bayesian networks

CAUSAL STRUCTURE

• Draw a directed acyclic (causal) graph for the following story
• Direct arrows from cause to effect
• Story:
  • There is a burglar Alarm that rings when there is a Burglary
  • However, it may sometimes also ring because of Winds that exceed 60 mph
  • When the Alarm rings, your neighbor Mary Calls
  • When the Alarm rings, your neighbor John Calls

CAUSAL STRUCTURE

• The DAG for the story above:

  Winds → Alarm ← Burglary
  Alarm → John Calls
  Alarm → Mary Calls

Representation of Joint Distribution

• P(W,B,A,J,M) = P(W) P(B) P(A|B,W) P(J|A) P(M|A)

[Figure: the alarm DAG, with each node annotated by its CPT: P(W), P(B), P(A|B,W), P(J|A), P(M|A)]

In general:

  P(x_1, …, x_n) = ∏_{i=1}^{n} P(x_i | pa(x_i))
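To make the factorization concrete, here is a minimal Python sketch for the alarm network. Every CPT value below is invented for illustration; the notes do not specify numbers.

```python
# Chain-rule factorization of the alarm network:
#   P(W,B,A,J,M) = P(W) P(B) P(A|B,W) P(J|A) P(M|A)
# NOTE: all probabilities below are made up for illustration.

p_w = 0.01    # P(Winds=True)
p_b = 0.001   # P(Burglary=True)
p_a = {       # P(Alarm=True | Burglary, Winds)
    (True, True): 0.97, (True, False): 0.94,
    (False, True): 0.30, (False, False): 0.001,
}
p_j = {True: 0.90, False: 0.05}   # P(JohnCalls=True | Alarm)
p_m = {True: 0.70, False: 0.01}   # P(MaryCalls=True | Alarm)

def bern(p, value):
    """Probability that a variable with P(True)=p takes `value`."""
    return p if value else 1.0 - p

def joint(w, b, a, j, m):
    """P(W=w, B=b, A=a, J=j, M=m): one CPT factor per variable."""
    return (bern(p_w, w) * bern(p_b, b) * bern(p_a[(b, w)], a) *
            bern(p_j[a], j) * bern(p_m[a], m))

# Probability of one complete assignment:
print(joint(w=False, b=True, a=True, j=True, m=False))
```

Five small tables determine all 2^5 = 32 joint probabilities; that is the compactness claim from the first slide.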

Possible Queries

• Inference
  • P(W=? | J=True)
• Most probable explanation (MPE)
  • The assignment of values to all other variables that has the highest probability given that J=True and M=False
• Maximum a posteriori (MAP) hypothesis
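A brute-force sketch of the first two queries, reusing the illustrative CPT values from the sketch above. Enumeration like this is exact but only feasible for tiny networks.

```python
from itertools import product

# Same made-up alarm-network CPTs as above, inlined to stay self-contained.
def joint(w, b, a, j, m):
    f = lambda p, v: p if v else 1.0 - p
    p_a = {(True, True): 0.97, (True, False): 0.94,
           (False, True): 0.30, (False, False): 0.001}
    return (f(0.01, w) * f(0.001, b) * f(p_a[(b, w)], a) *
            f({True: 0.90, False: 0.05}[a], j) *
            f({True: 0.70, False: 0.01}[a], m))

# Inference, P(W=True | J=True): sum the joint over every assignment
# consistent with the evidence, then normalize.
num = sum(joint(True, b, a, True, m)
          for b, a, m in product([True, False], repeat=3))
den = sum(joint(w, b, a, True, m)
          for w, b, a, m in product([True, False], repeat=4))
print("P(W=True | J=True) =", num / den)

# Most probable explanation: the completion (w, b, a) with the highest
# joint probability given the evidence J=True, M=False.
mpe = max(product([True, False], repeat=3),
          key=lambda wba: joint(*wba, True, False))
print("MPE (W, B, A):", mpe)
```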

Number of Parameters

• Assume binary variables
• How many parameters in each CPT P(x_i | x_1, …, x_j)?
  • (Domain size of x_i minus 1) × (number of possible value combinations of x_1, …, x_j)
• For the alarm network: P(W,B,A,J,M) = P(W) P(B) P(A|B,W) P(J|A) P(M|A)
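Checking the counting rule on the alarm network, where every variable is binary:

```python
# Independent parameters per CPT:
#   (domain size of x_i - 1) * (number of parent value combinations)
parents = {"W": [], "B": [], "A": ["B", "W"], "J": ["A"], "M": ["A"]}

bn_params = sum((2 - 1) * 2 ** len(ps) for ps in parents.values())
full_joint = 2 ** len(parents) - 1   # unstructured joint over 5 binary vars

print(bn_params)   # 1 + 1 + 4 + 2 + 2 = 10
print(full_joint)  # 31
```

Ten parameters instead of 31, and the gap grows exponentially as long as parent sets stay small.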

Graph represents conditional independence assumptions!

• Each variable is conditionally independent of its non-descendants given its parents; these are called the Markov assumptions.
• Notation: I(X, parents(X), non-descendants(X))
• For the alarm network:
  • I(W, {}, {B})
  • I(B, {}, {W})
  • I(A, {W,B}, {})
  • I(J, {A}, {M,W,B})
  • I(M, {A}, {J,W,B})
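Any of these statements can be checked numerically against the illustrative CPTs used earlier. For example, I(M, {A}, {J,W,B}) says that once A is fixed, also fixing J, W, and B must not change the distribution of M:

```python
from itertools import product

# Illustrative alarm-network joint, as in the earlier sketches.
def joint(w, b, a, j, m):
    f = lambda p, v: p if v else 1.0 - p
    p_a = {(True, True): 0.97, (True, False): 0.94,
           (False, True): 0.30, (False, False): 0.001}
    return (f(0.01, w) * f(0.001, b) * f(p_a[(b, w)], a) *
            f({True: 0.90, False: 0.05}[a], j) *
            f({True: 0.70, False: 0.01}[a], m))

def p_m_given(**given):
    """P(M=True | variables fixed in `given`), by enumeration."""
    free = [v for v in "wbaj" if v not in given]
    def total(m_val):
        return sum(joint(**dict(given, **dict(zip(free, vals))), m=m_val)
                   for vals in product([True, False], repeat=len(free)))
    return total(True) / (total(True) + total(False))

print(p_m_given(a=True))                            # 0.7
print(p_m_given(a=True, j=False, w=True, b=False))  # also 0.7
```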

Properties of Conditional Independence

Proof of Symmetry

Proof of Decomposition
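A standard sketch of both proofs, writing I(X,Z,Y) for "P(x,y|z) = P(x|z) P(y|z) for all values x, y, z" (here W is a generic variable, not the Winds node):

```latex
\paragraph{Symmetry.} $I(X,Z,Y) \Rightarrow I(Y,Z,X)$.
The defining equation $P(x,y \mid z) = P(x \mid z)\,P(y \mid z)$ is
symmetric in $x$ and $y$, so it holds with $X$ and $Y$ exchanged.

\paragraph{Decomposition.} $I(X,Z,\{Y,W\}) \Rightarrow I(X,Z,Y)$.
Marginalize the defining equation over $w$:
\[
P(x,y \mid z) \;=\; \sum_w P(x,y,w \mid z)
  \;=\; \sum_w P(x \mid z)\,P(y,w \mid z)
  \;=\; P(x \mid z)\,P(y \mid z).
\]
```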

Implications of Conditional Independence Properties

• Start with the conditional independence statements given by the Markov assumptions and derive new properties.
• For example, I(X, parents(X), non-descendants(X)) implies:
  • I(non-descendants(X), parents(X), X), by Symmetry
  • I(X, parents(X), Y) for any subset Y of the non-descendants of X, by Decomposition

Implications of Conditional Independence Properties (continued)

• Given a Bayesian network, suppose I ask you: is W independent of M given A?
• Derive all possible conditional independence statements using the five properties and the Markov assumptions, and check whether the statement above is among them.

Graphical Test of Conditional Independence Properties: D-separation

• D-sep(X,Z,Y) only if I(X,Z,Y), i.e., d-separation implies conditional independence
• To decide whether D-sep(X,Z,Y) holds, check whether X and Y are disconnected in a new DAG G' obtained from G using the following steps:
  • Delete any leaf node W from DAG G as long as W is not in X, Y, or Z. Repeat until no more nodes can be deleted.
  • Delete all edges outgoing from nodes in Z.

A more complicated algorithm for d-separation is given in your book.
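A minimal Python sketch of this pruning-based test, with the graph encoded as a map from each node to the set of its parents (the function and variable names here are my own):

```python
def d_separated(dag, X, Z, Y):
    """Test D-sep(X,Z,Y) via the two pruning steps above.
    `dag` maps each node to the set of its parents."""
    keep, nodes = set(X) | set(Y) | set(Z), set(dag)
    # Step 1: repeatedly delete leaf nodes not in X, Y, or Z.
    while True:
        non_leaves = {p for n in nodes for p in dag[n] & nodes}
        leaves = nodes - non_leaves - keep
        if not leaves:
            break
        nodes -= leaves
    # Step 2: drop edges outgoing from Z, then check whether X and Y
    # are disconnected in the remaining graph (directions ignored).
    adj = {n: set() for n in nodes}
    for n in nodes:
        for p in dag[n] & nodes:
            if p not in Z:          # p -> n is an edge outgoing from p
                adj[p].add(n)
                adj[n].add(p)
    frontier, seen = list(X), set(X)
    while frontier:
        for nbr in adj.get(frontier.pop(), ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(nbr)
    return not (seen & set(Y))

# The alarm network: is W d-separated from M given A?
alarm = {"W": set(), "B": set(), "A": {"W", "B"},
         "J": {"A"}, "M": {"A"}}
print(d_separated(alarm, {"W"}, {"A"}, {"M"}))  # True
```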


Recap: What is a Bayesian network?

• A compact representation of the joint distribution

• A compact representation of a set of conditional independence assumptions

Relationship to Naïve Bayes

• Markov assumptions?
• Distribution?

[Figure: naïve Bayes as a Bayesian network: a Class node with children X1, X2, …, Xn]

Markov assumption: a variable X is independent of its non-descendants given its parents.
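Applying the Markov assumption and the general factorization to this graph answers both questions: each X_i has the single parent Class, and given Class each X_i is independent of the remaining attributes (its non-descendants), so

  P(Class, X_1, …, X_n) = P(Class) ∏_{i=1}^{n} P(X_i | Class)

which is exactly the naïve Bayes model.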

Real Bayesian Network Applications

• Diagnosis of lymph-node disease
• Speech recognition
• Microsoft Office and Windows
  • http://www.research.microsoft.com/research/dtg/
• Studying the human genome
• Robot mapping
• Robots that identify meteorites to study
• Modeling fMRI data
• Anomaly detection
• Fault diagnosis
• Modeling sensor-network data

Now What?

• Given a Bayesian network, answer queries
  • Inference
• Learning Bayesian networks from data
  • Structure and weight learning
  • Partially and fully observed data
  • MLE and Bayesian approaches
  • Requires inference, i.e., computing P(X | evidence), where the evidence is an assignment to a subset of the variables
• Unfortunately, inference is NP-hard
  • Tractable classes based on treewidth
  • Approximate inference approaches

Recommended