Part II

White Parts from: Technical overview for machine-learning researcher – slides from UAI 1999 tutorial

ftp://ftp.research.microsoft.com/pub/mlas/david/uai-tut99.pdf

= Ct,h

Example: for (ht + htthh), we get p(d|m) = 3!2!/6!
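One possible reading of this example, stated here as an assumption, is that "ht" is the imaginary sample (one head and one tail, i.e. a Beta(1,1) prior) and "htthh" is the observed sequence. Under that reading, the marginal likelihood of the sequence is Γ(N')/Γ(N'+N) · ∏ Γ(N'_k+N_k)/Γ(N'_k), which the sketch below evaluates exactly. The function name and its parameters are illustrative, not taken from the slides.

```python
from fractions import Fraction
from math import factorial


def gamma_int(n):
    """Gamma(n) for a positive integer n, i.e. (n - 1)!."""
    return factorial(n - 1)


def marginal_likelihood(sequence, prior_heads=1, prior_tails=1):
    """Exact p(d | m) for one specific toss sequence under a Beta prior
    with integer hyperparameters (the imaginary head/tail counts)."""
    n_h = sequence.count("h")
    n_t = sequence.count("t")
    numerator = (gamma_int(prior_heads + prior_tails)
                 * gamma_int(prior_heads + n_h)
                 * gamma_int(prior_tails + n_t))
    denominator = (gamma_int(prior_heads + prior_tails + n_h + n_t)
                   * gamma_int(prior_heads)
                   * gamma_int(prior_tails))
    return Fraction(numerator, denominator)


# Assumed reading: imaginary sample "ht", observed data "htthh".
result = marginal_likelihood("htthh", prior_heads=1, prior_tails=1)
print(result)  # 1/60, the same value as 3!2!/6!
```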

Numerical example for the network X1 X2

Imaginary sample sizes denoted N’ijk

Data: (true, true) and (true, false)
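The arithmetic for this example can be sketched by multiplying one-step-ahead predictive probabilities, which gives the same value as the closed-form Gamma-function expression. Two things here are assumptions, because the slide's figures did not survive: the structure is taken to be X1 → X2, and every imaginary count N'ijk is set to 1.

```python
from collections import Counter
from fractions import Fraction


def sequential_marginal_likelihood(cases, imaginary=1):
    """p(D | m) for binary X1 -> X2, computed case by case.

    `imaginary` is the assumed value of every N'ijk (hypothetical).
    Each factor is (N'ijk + Nijk) / (N'ij + Nij) with the counts seen so far.
    """
    counts = Counter()
    likelihood = Fraction(1)
    for x1, x2 in cases:
        # Predictive probability of the observed X1 value.
        seen_x1 = counts[("X1", True)] + counts[("X1", False)]
        likelihood *= Fraction(imaginary + counts[("X1", x1)],
                               2 * imaginary + seen_x1)
        # Predictive probability of the observed X2 value given X1.
        seen_x2 = counts[("X2", x1, True)] + counts[("X2", x1, False)]
        likelihood *= Fraction(imaginary + counts[("X2", x1, x2)],
                               2 * imaginary + seen_x2)
        # Update the sufficient statistics.
        counts[("X1", x1)] += 1
        counts[("X2", x1, x2)] += 1
    return likelihood


data = [(True, True), (True, False)]
score = sequential_marginal_likelihood(data)
print(score)  # 1/18 = (1/2)(1/2)(2/3)(1/3)
```

With different imaginary sample sizes the factors change, but the procedure is the same.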

Used so far

Desired

How do we assign structure and parameter priors?

Structure priors: Uniform, partial order (allowed/prohibited edges), proportional to similarity to some a priori network.
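As a rough sketch of the last two options, one common way to make a prior "proportional to similarity" is to penalize each edge that differs from the a priori network by a constant factor. The penalty form κ^δ, the value of `kappa`, and all names below are assumptions for illustration; the slides do not specify them.

```python
from fractions import Fraction


def structure_weight(edges, prior_edges, kappa=Fraction(1, 2), prohibited=()):
    """Unnormalized structure prior.

    Returns 0 if a prohibited edge is present; otherwise kappa ** delta,
    where delta is the number of edges in which the candidate and the
    a priori network differ (symmetric difference of the edge sets).
    """
    edges = set(edges)
    if edges & set(prohibited):
        return Fraction(0)
    delta = len(edges ^ set(prior_edges))
    return kappa ** delta


def normalized_prior(candidates, prior_edges, **kwargs):
    """Normalize the weights over an explicit list of candidate structures."""
    weights = [structure_weight(c, prior_edges, **kwargs) for c in candidates]
    total = sum(weights)
    return [w / total for w in weights]


candidates = [set(), {("X1", "X2")}, {("X2", "X1")}]
prior_network = {("X1", "X2")}
probs = normalized_prior(candidates, prior_network)
print(probs)  # [Fraction(2, 7), Fraction(4, 7), Fraction(1, 7)]
```

Setting `kappa` to 1 recovers the uniform prior over the allowed structures.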

BDe and K2

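[Equation not recoverable from the slide extraction: it involves p(· | m^h) terms with x and y subscripts and factors of the form (1 − ·).]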

So how to generate parameter priors?

Example: Suppose the hyper distribution for (X1,X2) is Dir(a00, a01, a10, a11). This determines a Dirichlet distribution for the parameters of both directed models.

Summary: Suppose the parameters for (X1,X2) are distributed Dir(a00, a01, a10, a11). Then the parameters for X1 are distributed Dir(a00+a01, a10+a11). Similarly, the parameters for X2 are distributed Dir(a00+a10, a01+a11).
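The summary amounts to summing hyperparameters over the variable that is marginalized out. A minimal sketch follows, assuming a_ij is indexed as (value of X1, value of X2); the numeric values are hypothetical. It also returns the conditional hyperparameters for X2 given X1, which follow from the standard Dirichlet aggregation property and are not stated on the slide.

```python
def marginal_hyperparameters(a):
    """a maps (x1, x2) -> a_{x1 x2} for binary X1, X2.

    Returns the Dirichlet hyperparameters of the X1 marginal and the
    X2 marginal, obtained by summing over the other variable.
    """
    x1 = (a[0, 0] + a[0, 1], a[1, 0] + a[1, 1])
    x2 = (a[0, 0] + a[1, 0], a[0, 1] + a[1, 1])
    return x1, x2


def conditional_hyperparameters(a):
    """Hyperparameters for the parameters of X2 given X1 = 0 and X1 = 1
    (standard Dirichlet property, added here as background)."""
    return (a[0, 0], a[0, 1]), (a[1, 0], a[1, 1])


# Hypothetical values for a00, a01, a10, a11.
a = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
x1_params, x2_params = marginal_hyperparameters(a)
x2_given_x1 = conditional_hyperparameters(a)
print(x1_params, x2_params)  # (3, 7) (4, 6)
```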

BDe score:
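The score formula itself did not survive extraction. As a hedged illustration, the sketch below uses the standard Bayesian Dirichlet form, ∏ Γ(N'ij)/Γ(N'ij+Nij) · ∏ Γ(N'ijk+Nijk)/Γ(N'ijk), on the two cases (true, true) and (true, false). The imaginary counts are assumptions: for BDe they are derived from a joint Dir(1, 1, 1, 1), and for K2 every N'ijk is 1. The helper names are illustrative.

```python
from fractions import Fraction
from math import factorial


def gamma_int(n):
    """Gamma(n) for a positive integer n."""
    return factorial(n - 1)


def family_score(data, child, parent, prior_count):
    """BD factor for one binary variable.

    `parent` is a column index or None; `prior_count(j, k)` returns the
    assumed N'ijk for parent state j (None if no parent) and child state k.
    """
    parent_states = [None] if parent is None else [True, False]
    score = Fraction(1)
    for j in parent_states:
        rows = [r for r in data if parent is None or r[parent] == j]
        prior_total, data_total = 0, 0
        term = Fraction(1)
        for k in (True, False):
            n_prior = prior_count(j, k)
            n = sum(1 for r in rows if r[child] == k)
            term *= Fraction(gamma_int(n_prior + n), gamma_int(n_prior))
            prior_total += n_prior
            data_total += n
        term *= Fraction(gamma_int(prior_total),
                         gamma_int(prior_total + data_total))
        score *= term
    return score


def network_score(data, root, leaf, prior_count):
    """Score of the two-node network root -> leaf."""
    return (family_score(data, root, None, prior_count)
            * family_score(data, leaf, root, prior_count))


def bde_prior(j, k):
    # Joint Dir(1, 1, 1, 1): 2 per state for the root, 1 per state given a parent.
    return 2 if j is None else 1


def k2_prior(j, k):
    return 1


data = [(True, True), (True, False)]
bde_forward = network_score(data, 0, 1, bde_prior)   # X1 -> X2
bde_reverse = network_score(data, 1, 0, bde_prior)   # X2 -> X1
k2_forward = network_score(data, 0, 1, k2_prior)
k2_reverse = network_score(data, 1, 0, k2_prior)
print(bde_forward, bde_reverse, k2_forward, k2_reverse)  # 1/20 1/20 1/18 1/24
```

Under these assumed counts the BDe score is the same for both edge directions, while the K2 score is not.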

Functional Equations Example

Example: f(x+y) = f(x) f(y)

Assumptions: f is positive everywhere and differentiable.

Solution: (ln f)'(x+y) = (ln f)'(x), and so (ln f)'(x) = constant.

Hence (ln f)(x) is a linear function, hence f(x) = c e^{ax} (substituting back into the equation gives c = 1).
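The intermediate steps of this example can be written out as follows. This is a sketch of the standard argument; the constants a, b and c are names introduced here.

```latex
\begin{align*}
f(x+y) &= f(x)\,f(y), \qquad f > 0,\ f \text{ differentiable} \\
\ln f(x+y) &= \ln f(x) + \ln f(y) \\
(\ln f)'(x+y) &= (\ln f)'(x)
  && \text{differentiate with respect to } x \\
(\ln f)'(x) &= a
  && \text{since } y \text{ is arbitrary, the derivative is the same everywhere} \\
\ln f(x) &= a x + b \\
f(x) &= c\,e^{a x}, \qquad c = e^{b} \\
c\,e^{a(x+y)} &= c^{2}\,e^{a(x+y)} \;\Rightarrow\; c = 1
  && \text{substitute back into the original equation}
\end{align*}
```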

The bivariate discrete case
