Undirected Graphical Models (Markov random fields) Seung-Hoon Na Chonbuk National University



Page 1:

Undirected Graphical Models (Markov random fields)

Seung-Hoon Na

Chonbuk National University

Page 2:

Hopfield networks

• Bidirectional associative memory

Page 3:

Bidirectional associative memory

• In general,

• After some iterations, the BAM reaches a fixed point (x, y):
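The update equations themselves did not survive the extraction; below is a minimal numpy sketch of the recall dynamics, assuming bipolar (±1) units, a sign threshold at 0, and synchronous layer updates (function and variable names are illustrative, not from the slides).

```python
import numpy as np

def sgn(v):
    """Bipolar threshold: +1 if the excitation is non-negative, -1 otherwise."""
    return np.where(v >= 0, 1, -1)

def bam_recall(W, x, max_iters=100):
    """Run BAM recall: alternate y <- sgn(W^T x) and x <- sgn(W y)
    until (x, y) no longer changes, i.e. until a fixed point is reached."""
    y = sgn(W.T @ x)
    for _ in range(max_iters):
        x_new = sgn(W @ y)
        y_new = sgn(W.T @ x_new)
        if np.array_equal(x_new, x) and np.array_equal(y_new, y):
            break  # (x, y) is a fixed point
        x, y = x_new, y_new
    return x, y
```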

Page 4:

Bidirectional associative memory

• Hebbian learning

– Then,

• For a set of m vector pairs
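A hedged sketch of the outer-product (Hebbian) construction of W for m stored vector pairs; the stacking into matrices X and Y is an illustrative choice, not from the slides.

```python
import numpy as np

def hebbian_weights(X, Y):
    """Hebbian rule for a BAM: W = sum_i x_i y_i^T.

    X : (m, n) rows are the stored x-patterns (bipolar +1/-1)
    Y : (m, k) rows are the corresponding y-patterns
    If the patterns are close to orthogonal, sgn(W^T x_i) recovers y_i.
    """
    return X.T @ Y  # (n, k): sum of the m outer products
```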

Page 5:

Bidirectional associative memory

– 𝒙0: initial vector

– the excitation vector:

– (𝒙0, 𝒚0): a stable state of the network if

– The energy function becomes smaller (because of the minus sign) if the vector 𝑾𝒚0^𝑇 lies closer to 𝒙0

Page 6:

Bidirectional associative memory

• The energy function E of BAM:

• Generally, using a threshold-unit for each node
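For reference, a one-line sketch of the (unthresholded) BAM energy, assuming the common convention E(x, y) = −x^T W y; some texts include an extra factor of 1/2.

```python
def bam_energy(W, x, y):
    """E(x, y) = -x^T W y; the BAM recall updates never increase this value."""
    return -float(x @ W @ y)
```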

Page 7:

Hopfield networks

• Asynchronous BAM

– Each unit computes its excitation at random times and changes its state to 1 or −1 independently of the others and according to the sign of its total excitation

– The energy function decreases with each update

• Hopfield network

– A special case of an asynchronous BAM: x=y

– Each unit is connected to all other units except itself

– With necessary conditions on the weight matrix 𝑾:

• 1) symmetry (𝑾 = 𝑾^𝑇) 2) zero diagonal
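A minimal sketch of the asynchronous update described above, assuming bipolar states and the energy E(x) = −(1/2) x^T W x; helper names are illustrative.

```python
import numpy as np

def hopfield_step(W, x, rng):
    """One asynchronous Hopfield update: a randomly chosen unit recomputes its
    excitation W[i] @ x and takes its sign.  With W symmetric and zero-diagonal,
    each such update can only lower E(x) = -1/2 x^T W x."""
    i = rng.integers(len(x))
    x = x.copy()
    x[i] = 1 if W[i] @ x >= 0 else -1
    return x
```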

Page 8:

Hopfield networks

• Network with a non-zero diagonal

• Network with asymmetric connections,

[figure: example networks with states (1, 1, 1) and (−1, −1, −1)]

Page 9:

Hopfield networks

• Energy function

Page 10:

Hopfield networks: Example

• A flip-flop network

– The only stable states are (1, −1) and (−1, 1).

Energy function of a flip-flop
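The flip-flop weights are not reproduced in this extraction, but the stated stable states are consistent with a two-unit network with w12 = w21 = −1; a small check, assuming E(x) = −(1/2) x^T W x:

```python
import numpy as np
from itertools import product

W = np.array([[0., -1.], [-1., 0.]])          # assumed flip-flop weights
energy = lambda x: -0.5 * x @ W @ x

for state in product([1, -1], repeat=2):
    x = np.array(state, dtype=float)
    print(state, energy(x))
# (1, -1) and (-1, 1) have energy -1; (1, 1) and (-1, -1) have energy +1,
# so only (1, -1) and (-1, 1) are stable.
```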

Page 11:

Hopfield networks: Example

Page 12:

Hopfield networks: Example

Page 13:

Hopfield networks

• A fully connected Ising model with a symmetric weight matrix 𝑾 = 𝑾^𝑇

• Exact inference is intractable

• Iterative conditional modes (ICM): just sets each node to its most likely (lowest energy) state, given all its neighbors
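A hedged sketch of ICM for the fully connected Ising model; the optional bias term b is an assumption for illustration, not from the slide.

```python
import numpy as np

def icm(W, x, b=None, n_sweeps=20):
    """Iterative conditional modes for an Ising model over x in {-1, +1}^n.

    Each node is set to the state that minimises the energy given its
    neighbours: x_i <- sign(sum_{j != i} W_ij x_j + b_i).
    Stops when a full sweep changes nothing (a local optimum)."""
    x = x.copy()
    b = np.zeros(len(x)) if b is None else b
    for _ in range(n_sweeps):
        changed = False
        for i in range(len(x)):
            new = 1 if W[i] @ x - W[i, i] * x[i] + b[i] >= 0 else -1
            if new != x[i]:
                x[i], changed = new, True
        if not changed:
            break
    return x
```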

Page 14:

Learning MRFs: Moment matching

• MRF

• Log-likelihood:

• Gradient for 𝜽𝑐 :

Moment matching
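The formulas on this slide are images that did not survive extraction; in the usual log-linear parameterisation they take the following standard form (a reconstruction, not copied from the slide):

```latex
p(\boldsymbol{y}\mid\boldsymbol{\theta}) = \frac{1}{Z(\boldsymbol{\theta})}
  \exp\Big(\sum_c \boldsymbol{\theta}_c^{T}\boldsymbol{\phi}_c(\boldsymbol{y})\Big),
\qquad
\ell(\boldsymbol{\theta}) = \frac{1}{N}\sum_i
  \Big(\sum_c \boldsymbol{\theta}_c^{T}\boldsymbol{\phi}_c(\boldsymbol{y}^{(i)}) - \log Z(\boldsymbol{\theta})\Big)

\frac{\partial \ell}{\partial \boldsymbol{\theta}_c}
  = \frac{1}{N}\sum_i \boldsymbol{\phi}_c(\boldsymbol{y}^{(i)})
    - \mathbb{E}_{p(\boldsymbol{y}\mid\boldsymbol{\theta})}\big[\boldsymbol{\phi}_c(\boldsymbol{y})\big]
```

Setting the gradient to zero forces the empirical feature expectations to equal the model expectations, hence the name moment matching.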

Page 15:

Learning MRFs: Latent variable models

• Log-likelihood:

• Gradient:

• Alternatively, we can use a generalized EM
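The equations are again missing from the extraction; for an MRF with hidden variables 𝒉 the standard form of the gradient is (a hedged reconstruction):

```latex
\ell(\boldsymbol{\theta}) = \frac{1}{N}\sum_i \log \sum_{\boldsymbol{h}}
   p(\boldsymbol{y}^{(i)}, \boldsymbol{h} \mid \boldsymbol{\theta})

\frac{\partial \ell}{\partial \boldsymbol{\theta}_c}
  = \frac{1}{N}\sum_i
    \mathbb{E}_{p(\boldsymbol{h}\mid\boldsymbol{y}^{(i)},\boldsymbol{\theta})}
      \big[\boldsymbol{\phi}_c(\boldsymbol{h},\boldsymbol{y}^{(i)})\big]
    \;-\;
    \mathbb{E}_{p(\boldsymbol{h},\boldsymbol{y}\mid\boldsymbol{\theta})}
      \big[\boldsymbol{\phi}_c(\boldsymbol{h},\boldsymbol{y})\big]
```

i.e. a clamped (posterior) expectation minus the unclamped model expectation.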

Page 16:

Approximate methods for Learning MRFs

• Pseudo-likelihood
• Stochastic maximum likelihood
• Iterative proportional fitting
• Generalized iterative scaling

Page 17:

Pseudo-likelihood

– $P(\boldsymbol{y}) \approx \prod_i P(y_i \mid MB(y_i))$

The representation used by pseudo-likelihood

– $MB(y_i)$: Markov blanket of $y_i$
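A minimal sketch of the pseudo log-likelihood for a pairwise Ising MRF, where the Markov blanket of node i is its set of neighbours; the particular parameterisation (coupling matrix W and bias b) is an assumption for illustration.

```python
import numpy as np

def ising_pseudo_loglik(W, b, Y):
    """Pseudo log-likelihood sum_n sum_i log p(y_i | MB(y_i)) for an Ising model.

    W : symmetric (d, d) coupling matrix, b : (d,) bias, Y : (N, d) in {-1, +1}.
    For a pairwise model, p(y_i = s | MB(y_i)) = sigmoid(2 s (W[i] @ y + b_i))."""
    total = 0.0
    for y in Y:
        field = W @ y - np.diag(W) * y + b      # excitation from each node's blanket
        total += np.sum(-np.log1p(np.exp(-2 * y * field)))
    return total
```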

Page 18:

Approximate methods for Learning MRFs:
Stochastic maximum likelihood

• Approximate the model expectation using Monte Carlo (MC) sampling
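A hedged sketch of one stochastic-maximum-likelihood step: the intractable model expectation in the gradient is replaced by an average of features over persistent MCMC chains (phi and gibbs_sweep are placeholder callables, not code from the slides).

```python
import numpy as np

def sml_step(theta, data_feats, chains, phi, gibbs_sweep, lr=0.01):
    """One SML / persistent-chain gradient step.

    data_feats  : average feature vector over the training set (fixed)
    chains      : list of current MCMC chain states
    phi         : feature function y -> feature vector
    gibbs_sweep : one sampling sweep under the current parameters theta"""
    chains = [gibbs_sweep(theta, y) for y in chains]            # advance the chains
    model_feats = np.mean([phi(y) for y in chains], axis=0)     # Monte Carlo estimate of E[phi]
    theta = theta + lr * (data_feats - model_feats)             # gradient ascent step
    return theta, chains
```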

Page 19:

Iterative proportional fitting

$$p(\boldsymbol{y}\mid\boldsymbol{\theta}) = \frac{1}{Z}\exp\Big(\sum_c \boldsymbol{\theta}_c^{T}\boldsymbol{\phi}_c(\boldsymbol{y})\Big) = \frac{1}{Z}\prod_c \psi_c(\boldsymbol{y}_c) = p(\boldsymbol{y}\mid\boldsymbol{\psi}), \qquad Z = \sum_{\boldsymbol{y}} \prod_c \psi_c(\boldsymbol{y}_c)$$

$$\log L = \sum_i \log p(\boldsymbol{y}^{(i)}\mid\boldsymbol{\theta}) = \sum_i \Big(\sum_c \log \psi_c(\boldsymbol{y}_c^{(i)}) - \log Z\Big)$$

$$\frac{\partial \log L}{\partial \psi_c(y_c)} = \frac{\sum_i I(\boldsymbol{y}_c^{(i)} = y_c)}{\psi_c(y_c)} \;-\; \frac{\sum_{\boldsymbol{y}\in Y(y_c)} \prod_{c'\neq c} \psi_{c'}(\boldsymbol{y}_{c'})}{Z} = \frac{\sum_i I(\boldsymbol{y}_c^{(i)} = y_c)}{\psi_c(y_c)} \;-\; \frac{1}{Z}\sum_{\boldsymbol{y}\in Y(y_c)} \frac{Z\,p(\boldsymbol{y}\mid\boldsymbol{\psi})}{\psi_c(y_c)}$$

$\psi_c(\boldsymbol{y}_c^{(i)})$: the potential function of clique $c$ evaluated on the $i$-th example
$Y(y_c)$: the set of all $\boldsymbol{y}$ in which clique $c$ takes the value $y_c$

Page 20:

Iterative proportional fitting

$$\frac{\partial \log L}{\partial \psi_c(y_c)} = \frac{\sum_i I(\boldsymbol{y}_c^{(i)} = y_c)}{\psi_c(y_c)} \;-\; \frac{1}{Z}\sum_{\boldsymbol{y}\in Y(y_c)} \frac{Z\,p(\boldsymbol{y}\mid\boldsymbol{\psi})}{\psi_c(y_c)}$$

$$\frac{\partial \log L}{\partial \psi_c(y_c)} = \frac{p_{emp}(y_c)}{\psi_c(y_c)} \;-\; \frac{p_{model}(y_c\mid\boldsymbol{\psi})}{\psi_c(y_c)}$$

Fixed-point algorithm obtained by setting the gradient to zero:

$$\psi_c(y_c) \leftarrow \psi_c(y_c)\,\frac{p_{emp}(y_c)}{p_{model}(y_c\mid\boldsymbol{\psi})}$$
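A toy sketch of the resulting IPF loop, estimating the model marginal p_model(y_c | ψ) by exact enumeration (so it only works for very small models; all names are illustrative, not slide code).

```python
import numpy as np
from itertools import product

def ipf(cliques, card, data, n_iters=50):
    """Iterative proportional fitting for a small discrete MRF.

    cliques : list of tuples of variable indices (one per clique)
    card    : number of states per variable
    data    : (N, n) integer array of observed configurations"""
    n = data.shape[1]
    psi = [np.ones((card,) * len(c)) for c in cliques]
    states = np.array(list(product(range(card), repeat=n)))

    def joint():
        """Normalised joint p(y | psi) over all enumerated states."""
        p = np.ones(len(states))
        for c, ps in zip(cliques, psi):
            p *= ps[tuple(states[:, list(c)].T)]
        return p / p.sum()

    for _ in range(n_iters):
        for k, c in enumerate(cliques):
            emp = np.zeros((card,) * len(c))                        # p_emp(y_c)
            np.add.at(emp, tuple(data[:, list(c)].T), 1.0 / len(data))
            model = np.zeros_like(emp)                              # p_model(y_c | psi)
            np.add.at(model, tuple(states[:, list(c)].T), joint())
            psi[k] = psi[k] * emp / np.maximum(model, 1e-12)        # psi_c <- psi_c * p_emp / p_model
    return psi
```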

Page 21:

Iterative proportional fitting

Page 22:

Learning MRFs:

Generalized iterative scaling

Page 23:

Markov Logic Networks

• A Markov logic network 𝐿 is a set of pairs (𝐹𝑖 , 𝑤𝑖), where 𝐹𝑖 is a formula in first-order logic and 𝑤𝑖 is a real number.

• Together with a finite set of constants 𝐶 = {𝑐1, 𝑐2, . . . , 𝑐|𝐶|}, 𝐿 defines a Markov network 𝑀𝐿,𝐶:

– 𝑀𝐿,𝐶 contains one binary node for each possible grounding of each predicate appearing in 𝐿. The value of the node is 1 if the ground atom is true, and 0 otherwise.

– 𝑀𝐿,𝐶 contains one feature for each possible grounding of each formula 𝐹𝑖 in 𝐿. The value of this feature is 1 if the ground formula is true, and 0 otherwise. The weight of the feature is the 𝑤𝑖 associated with 𝐹𝑖 in 𝐿.

Page 24:

Markov Logic Networks

• The probability distribution of a possible world 𝑥:

𝑛𝑖(𝑥): the number of true groundings of 𝐹𝑖 in 𝑥

𝑥{𝑖}: the state (truth values) of the atoms appearing in 𝐹𝑖
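A toy sketch that turns the definition above into code: each formula contributes its weight w_i times its number of true groundings n_i(x), and the world probabilities are normalised over all candidate worlds (count_fns and worlds are illustrative placeholders, not from the slides).

```python
import numpy as np

def mln_world_probs(weights, count_fns, worlds):
    """P(X = x) = exp(sum_i w_i n_i(x)) / Z, computed by explicit enumeration.

    weights   : list of formula weights w_i
    count_fns : list of callables n_i(world) -> number of true groundings of F_i
    worlds    : list of candidate worlds (truth assignments to all ground atoms)"""
    scores = np.array([sum(w * n(x) for w, n in zip(weights, count_fns))
                       for x in worlds])
    p = np.exp(scores - scores.max())   # subtract the max for numerical stability
    return p / p.sum()
```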

Page 25:

Markov Logic Networks: Example

Page 26:

Markov Logic Networks: Example

Page 27:

Markov Logic Networks

• Inference

Page 28:

Markov Logic Networks

• Learning

– Use pseudo-likelihood to approximate the model probability