Inference: Variable Elimination

Probabilistic Graphical Models

Johan Pensar


Problem Formulation

We are interested in the most common type of query in graphical models, the conditional probability query:

$$P(Y \mid E = e) = \frac{P(Y, e)}{P(e)}.$$

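Each instantiation $P(y, e)$ of the numerator can be computed from the joint distribution by summing out the remaining variables:

$$P(y, e) = \sum_{w} P(y, e, w),$$

where $W = X - Y - E$. The denominator is then obtained by summing the numerator over $y$:

$$P(e) = \sum_{y} P(y, e).$$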

NOTE: This is the same as taking the vector $\big(P(y, e)\big)_{y \in \mathrm{Val}(Y)}$ and renormalizing.
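The query above can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical joint over three binary variables (Y, E, W) stored as a dictionary; the probabilities are made up and are not taken from the slides.

```python
# Hypothetical joint P(Y, E, W) over binary variables; keys are (y, e, w).
# The numbers are invented for illustration and sum to 1.
joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.15,
    (0, 1, 0): 0.05, (0, 1, 1): 0.10,
    (1, 0, 0): 0.20, (1, 0, 1): 0.05,
    (1, 1, 0): 0.25, (1, 1, 1): 0.10,
}


def conditional_query(joint, e):
    """Return P(Y | E = e) by summing out W and renormalizing."""
    # Unnormalized vector (P(y, e)) for y in Val(Y): sum over w.
    unnormalized = {
        y: sum(joint[(y, e, w)] for w in (0, 1)) for y in (0, 1)
    }
    # P(e) is the sum of that vector over y.
    p_e = sum(unnormalized.values())
    return {y: p / p_e for y, p in unnormalized.items()}


posterior = conditional_query(joint, e=1)
```

Note that this sketch enumerates the full joint, which is exactly the approach that does not scale; it is only meant to make the definition concrete.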


Complexity

In principle, we can compute the required marginal distribution from the joint; however, this leads to the exponential blowup that we are trying to avoid with graphical models.

Unfortunately, it turns out that the problem of inference in graphical models is NP-hard, probably requiring exponential time in the worst case.

In addition, approximate inference is also NP-hard.

However, we are rarely interested in the worst case, and many real-world applications can be dealt with effectively using exact or approximate inference algorithms.


Basic Idea

Assume we have a BN with the structure A → B → C → D, and we want to compute:

$$P(D) = \sum_C \sum_B \sum_A P(A)\,P(B \mid A)\,P(C \mid B)\,P(D \mid C).$$

We push in the sums:

$$P(D) = \sum_C P(D \mid C) \sum_B P(C \mid B) \sum_A P(A)\,P(B \mid A).$$

In the first step, we compute the product $\psi_1(A, B) = P(A)\,P(B \mid A)$ and then sum out $A$ to obtain $\tau_1(B) = \sum_A \psi_1(A, B)$.


Basic Idea

We now have

$$P(D) = \sum_C P(D \mid C) \sum_B P(C \mid B)\,\tau_1(B).$$

In the second step, we compute the product $\psi_2(B, C) = \tau_1(B)\,P(C \mid B)$ and then sum out $B$ to obtain $\tau_2(C) = \sum_B \psi_2(B, C)$, which is used in the final step to obtain $P(D) = \sum_C P(D \mid C)\,\tau_2(C)$.

This procedure uses dynamic programming, performing the innermost summation first.

Basically, two ideas help us address the exponential blowup of the joint distribution:

Some subexpressions in the factorization depend only on a small number of variables.

We can compute such expressions once and cache the results.
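The elimination steps for the chain A → B → C → D can be sketched as follows. The CPD values are hypothetical (binary variables, made-up numbers); the brute-force sum over the full joint is included only to check that the cached intermediate results give the same answer.

```python
# Hypothetical CPDs for binary A, B, C, D (numbers invented for illustration).
p_a = [0.6, 0.4]                          # P(A), indexed [a]
p_b_given_a = [[0.7, 0.3], [0.2, 0.8]]    # P(B | A), indexed [a][b]
p_c_given_b = [[0.9, 0.1], [0.4, 0.6]]    # P(C | B), indexed [b][c]
p_d_given_c = [[0.5, 0.5], [0.1, 0.9]]    # P(D | C), indexed [c][d]

vals = (0, 1)

# Step 1: tau1(B) = sum_A P(A) P(B | A), computed once and cached.
tau1 = [sum(p_a[a] * p_b_given_a[a][b] for a in vals) for b in vals]

# Step 2: tau2(C) = sum_B tau1(B) P(C | B).
tau2 = [sum(tau1[b] * p_c_given_b[b][c] for b in vals) for c in vals]

# Final step: P(D) = sum_C tau2(C) P(D | C).
p_d = [sum(tau2[c] * p_d_given_c[c][d] for c in vals) for d in vals]

# Brute force over the full joint, for comparison only.
p_d_brute = [
    sum(
        p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c] * p_d_given_c[c][d]
        for a in vals for b in vals for c in vals
    )
    for d in vals
]
```

Each step touches a factor over at most two variables, whereas the brute-force version enumerates every joint assignment.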


Factor Marginalization

The main steps of the variable elimination algorithm can be viewed as manipulation of factors.

Using the factor-based view, the algorithm is defined in a general form that applies to both BNs and MNs.

Definition 9.3: Let $X$ be a set of variables, and $Y \notin X$ a variable. Let $\phi(X, Y)$ be a factor. We define the factor marginalization of $Y$ in $\phi$, denoted $\sum_Y \phi$, to be a factor $\psi$ over $X$ such that:

$$\psi(X) = \sum_Y \phi(X, Y).$$

This operation is also called summing out of Y in φ.


Example: Factor Marginalization

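Summing out $B$ from the factor $\phi(A, B, C)$:

A    B    C    φ(A, B, C)
a1   b1   c1   0.25
a1   b1   c2   0.35
a1   b2   c1   0.08
a1   b2   c2   0.16
a2   b1   c1   0.05
a2   b1   c2   0.07
a2   b2   c1   0
a2   b2   c2   0
a3   b1   c1   0.15
a3   b1   c2   0.21
a3   b2   c1   0.09
a3   b2   c2   0.18

gives the factor $\psi(A, C) = \sum_B \phi(A, B, C)$:

A    C    ψ(A, C)
a1   c1   0.33
a1   c2   0.51
a2   c1   0.05
a2   c2   0.07
a3   c1   0.24
a3   c2   0.39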
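The example can be reproduced with a small helper. This is a sketch under one assumption of our own: a factor is represented as a scope tuple plus a dictionary from assignments to values (the slides do not prescribe a data structure). The numbers are the ones from the example.

```python
def sum_out(scope, table, var):
    """Sum variable `var` out of a factor given as (scope, table)."""
    idx = scope.index(var)
    new_scope = scope[:idx] + scope[idx + 1:]
    new_table = {}
    for assignment, value in table.items():
        # Drop the position of `var` and accumulate entries that now coincide.
        reduced = assignment[:idx] + assignment[idx + 1:]
        new_table[reduced] = new_table.get(reduced, 0.0) + value
    return new_scope, new_table


scope = ("A", "B", "C")
phi = {
    ("a1", "b1", "c1"): 0.25, ("a1", "b1", "c2"): 0.35,
    ("a1", "b2", "c1"): 0.08, ("a1", "b2", "c2"): 0.16,
    ("a2", "b1", "c1"): 0.05, ("a2", "b1", "c2"): 0.07,
    ("a2", "b2", "c1"): 0.0,  ("a2", "b2", "c2"): 0.0,
    ("a3", "b1", "c1"): 0.15, ("a3", "b1", "c2"): 0.21,
    ("a3", "b2", "c1"): 0.09, ("a3", "b2", "c2"): 0.18,
}

psi_scope, psi = sum_out(scope, phi, "B")
```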


Properties of Factor Operations

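Factor product and summation behave much like product and summation over numbers:

Both operations are commutative:

$$\phi_1 \cdot \phi_2 = \phi_2 \cdot \phi_1 \quad \text{and} \quad \sum_X \sum_Y \phi = \sum_Y \sum_X \phi.$$

The factor product is associative:

$$(\phi_1 \cdot \phi_2) \cdot \phi_3 = \phi_1 \cdot (\phi_2 \cdot \phi_3).$$

We can exchange summation and product: if $X \notin \mathrm{Scope}[\phi_1]$, then

$$\sum_X (\phi_1 \cdot \phi_2) = \phi_1 \cdot \sum_X \phi_2.$$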
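As a worked illustration, the exchange rule is what justifies pushing in the sums in the chain example A → B → C → D. The derivation below applies it twice, with the scope condition used at each step noted on the right:

```latex
\begin{align*}
P(D) &= \sum_C \sum_B \sum_A P(A)\,P(B \mid A)\,P(C \mid B)\,P(D \mid C) \\
     &= \sum_C \sum_B P(C \mid B)\,P(D \mid C) \sum_A P(A)\,P(B \mid A)
        && A \notin \mathrm{Scope}[P(C \mid B)\,P(D \mid C)] \\
     &= \sum_C P(D \mid C) \sum_B P(C \mid B) \sum_A P(A)\,P(B \mid A)
        && B \notin \mathrm{Scope}[P(D \mid C)]
\end{align*}
```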


Sum-Product Inference Task

In our example, letting $P(A, B, C, D) = \phi_A \cdot \phi_B \cdot \phi_C \cdot \phi_D$, we are interested in computing:

$$P(D) = \sum_C \sum_B \sum_A \phi_A \cdot \phi_B \cdot \phi_C \cdot \phi_D.$$

More generally, we want to compute the value of an expression of the form

$$\sum_Z \prod_{\phi \in \Phi} \phi,$$

which is called the sum-product inference task.

The key insight for efficient computation of the above expression is that we can "push in" the summation operations due to the (typically) small scope of the factors.


The Sum-Product Variable Elimination Algorithm

Theorem 9.5: Let $X$ be some set of variables, and let $\Phi$ be a set of factors such that for each $\phi \in \Phi$, $\mathrm{Scope}[\phi] \subseteq X$. Let $Y \subset X$ be a set of query variables, and let $Z = X - Y$. Then for any ordering $\prec$ over $Z$, Sum-Product-VE$(\Phi, Z, \prec)$ returns a factor $\phi^*(Y)$ such that

$$\phi^*(Y) = \sum_Z \prod_{\phi \in \Phi} \phi.$$

The algorithm can be applied both to MNs, where the factors are clique potentials:

$$\Phi = \{\phi_k(C_k)\}_{k=1}^{m},$$

and to BNs, where the factors are CPDs:

$$\Phi = \{\phi_{X_i}\}_{i=1}^{n}, \quad \text{where } \phi_{X_i} = P(X_i \mid \mathrm{Pa}_{X_i}).$$
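One possible way to realize Sum-Product-VE is sketched below. The `Factor` class, the helper names, and the `domains` argument are our own assumptions rather than the slides' notation, and the chain CPDs are hypothetical numbers. For each variable in the ordering, the sketch multiplies the factors that mention it, sums it out, and puts the result back into the factor set.

```python
from itertools import product


class Factor:
    """A factor stored as a scope tuple and a table: assignment -> value."""

    def __init__(self, scope, table):
        self.scope = tuple(scope)
        self.table = dict(table)


def multiply(f, g, domains):
    """Factor product over the union of the two scopes."""
    scope = f.scope + tuple(v for v in g.scope if v not in f.scope)
    table = {}
    for assignment in product(*(domains[v] for v in scope)):
        env = dict(zip(scope, assignment))
        table[assignment] = (
            f.table[tuple(env[v] for v in f.scope)]
            * g.table[tuple(env[v] for v in g.scope)]
        )
    return Factor(scope, table)


def sum_out(f, var):
    """Factor marginalization of `var` in f."""
    idx = f.scope.index(var)
    table = {}
    for assignment, value in f.table.items():
        key = assignment[:idx] + assignment[idx + 1:]
        table[key] = table.get(key, 0.0) + value
    return Factor(f.scope[:idx] + f.scope[idx + 1:], table)


def sum_product_ve(factors, ordering, domains):
    """Eliminate the variables in `ordering`; return the remaining product."""
    factors = list(factors)
    for z in ordering:
        involved = [f for f in factors if z in f.scope]
        if not involved:
            continue
        rest = [f for f in factors if z not in f.scope]
        psi = involved[0]
        for f in involved[1:]:
            psi = multiply(psi, f, domains)
        factors = rest + [sum_out(psi, z)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, domains)
    return result


# Hypothetical CPDs for the chain A -> B -> C -> D with binary variables.
domains = {v: (0, 1) for v in "ABCD"}
phi_a = Factor(("A",), {(0,): 0.6, (1,): 0.4})
phi_b = Factor(("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
phi_c = Factor(("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6})
phi_d = Factor(("C", "D"), {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.1, (1, 1): 0.9})
all_factors = [phi_a, phi_b, phi_c, phi_d]

p_d = sum_product_ve(all_factors, ["A", "B", "C"], domains)
p_d_alt = sum_product_ve(all_factors, ["C", "B", "A"], domains)
```

Both orderings return the same factor over D, as the theorem states; they differ only in the size of the intermediate factors created along the way.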


Example: The Sum-Product VE Algorithm with Different Orderings

[Figure: Bayesian network over the variables Coherence, Difficulty, Intelligence, Grade, SAT, Letter, Job, and Happy.]
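To see why the ordering matters, it is enough to track the scopes of the intermediate factors, without any numbers. The sketch below is hedged in two ways: the edges are an assumption (the node names match the extended student network from Koller and Friedman, but the figure itself did not survive in this text), and the two orderings and the query variable Job are chosen by us for illustration.

```python
def intermediate_scope_sizes(scopes, ordering):
    """Return (variable, size of the product factor's scope) per elimination step."""
    scopes = [frozenset(s) for s in scopes]
    sizes = []
    for z in ordering:
        involved = [s for s in scopes if z in s]
        rest = [s for s in scopes if z not in s]
        # Scope of the product of all factors that mention z.
        psi = frozenset().union(*involved)
        sizes.append((z, len(psi)))
        # Summing out z leaves a factor over the remaining variables.
        scopes = rest + [psi - {z}]
    return sizes


# ASSUMED structure: C -> D, D -> G, I -> G, I -> S, G -> L, L -> J, S -> J,
# G -> H, J -> H. One CPD scope per variable (C=Coherence, D=Difficulty,
# I=Intelligence, G=Grade, S=SAT, L=Letter, J=Job, H=Happy).
cpd_scopes = [
    {"C"}, {"C", "D"}, {"I"}, {"D", "I", "G"}, {"I", "S"},
    {"G", "L"}, {"L", "S", "J"}, {"G", "J", "H"},
]

# Two hypothetical elimination orderings for the query P(J).
sizes_1 = intermediate_scope_sizes(cpd_scopes, ["C", "D", "I", "H", "G", "S", "L"])
sizes_2 = intermediate_scope_sizes(cpd_scopes, ["G", "I", "S", "L", "H", "C", "D"])

largest_1 = max(size for _, size in sizes_1)
largest_2 = max(size for _, size in sizes_2)
```

Under these assumptions the first ordering never builds a factor over more than four variables, while the second builds one over six in its very first step, since eliminating Grade first couples all of its neighbours.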


The Sum-Product VE Algorithm with Evidence

We can use the same algorithm also for computing $P(Y, e)$:

The algorithm is applied to the set of factors reduced by the evidence $E = e$, and the variables $X - E - Y$ are eliminated.

To obtain $P(Y \mid e)$, we renormalize the resulting factor by dividing by the sum of its entries, which equals $P(e)$.
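The two steps can be sketched on a small chain A → B → C with evidence C = 1 and query variable A. The helper `reduce_factor`, the factor representation, and the CPD values are assumptions made for this illustration.

```python
def reduce_factor(scope, table, evidence):
    """Keep entries consistent with `evidence` and drop the evidence variables."""
    keep = [i for i, v in enumerate(scope) if v not in evidence]
    new_table = {}
    for assignment, value in table.items():
        consistent = all(
            assignment[i] == evidence[v]
            for i, v in enumerate(scope)
            if v in evidence
        )
        if consistent:
            new_table[tuple(assignment[i] for i in keep)] = value
    return tuple(scope[i] for i in keep), new_table


# Hypothetical CPDs over binary variables (numbers invented for illustration).
p_a = {(0,): 0.6, (1,): 0.4}                                          # scope (A)
p_b_given_a = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}    # scope (A, B)
p_c_given_b = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}    # scope (B, C)

# Step 1: reduce the factors by the evidence C = 1 (only P(C | B) mentions C).
r_scope, r = reduce_factor(("B", "C"), p_c_given_b, {"C": 1})

# Step 2: eliminate X - E - Y = {B}, which yields the unnormalized P(A, C = 1).
unnormalized = {
    a: p_a[(a,)] * sum(p_b_given_a[(a, b)] * r[(b,)] for b in (0, 1))
    for a in (0, 1)
}

# Step 3: renormalize; the sum of the entries is P(C = 1).
p_e = sum(unnormalized.values())
posterior = {a: value / p_e for a, value in unnormalized.items()}
```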

