Page 1: Multivariate Information Bottleneck


Multivariate Information Bottleneck

Nir Friedman, Ori Mosenzon, Noam Slonim, Naftali Tishby

Hebrew University

Page 2: Multivariate Information Bottleneck

Data Analysis

Population

Statistics

[Figure: horizontal axis Age, ticks 5, 15, 25, 35, 45, 55, 65, 75, 80]

Page 3: Multivariate Information Bottleneck

Information Bottleneck

Cluster “age”: are there age clusters that are predictive of education level?

[Figure: age (axis ticks 17, 19, 24, 29, 34, 39, 44, 49, 54, 59, 64, 69, 74) against education level (None, High school, Some college, Bachelor’s degree, PhD)]

Page 4: Multivariate Information Bottleneck

Information Bottleneck

Cluster “age”: are there age clusters that are predictive of education level?

Also cluster education attained to be predictive of age?

[Figure: age (axis ticks 17, 19, 24, 29, 34, 39, 44, 49, 54, 59, 64, 69, 74) against education level (None, High school, Some college, Bachelor’s degree, PhD)]

Page 5: Multivariate Information Bottleneck

Our contribution

Generalize Information Bottleneck:

Generic principle for specifying systems of interacting clusters

Characterization of the solutions for these specifications

General purpose methods for constructing solutions

Page 6: Multivariate Information Bottleneck

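Information Bottleneck [Tishby, Pereira & Bialek 99]

[Diagram: P(A,B) over A and B is compressed by the soft clustering P(T|A) into P(T,B) over T and B; a graph over A, B and T marks the quantities I(T;A) and I(T;B)]

Minimize: I(T;A) - β I(T;B)

I(T;A) - compression: information lost about A

I(T;B) - preserved information about B

β - tradeoff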
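The tradeoff can be evaluated numerically for any candidate soft clustering. The sketch below is only an illustration: the function names, the toy joint distribution and the value of beta are assumptions, not taken from the slides, and information is measured in nats.

```python
import numpy as np

def mutual_information(pxy):
    """I(X;Y) in nats for a joint distribution given as a 2-D array."""
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def ib_objective(p_ab, p_t_given_a, beta):
    """I(T;A) - beta * I(T;B) for a soft clustering P(T|A).

    p_ab:        shape (|A|, |B|), the joint P(A,B)
    p_t_given_a: shape (|A|, |T|), each row sums to 1
    """
    p_a = p_ab.sum(axis=1)
    p_ta = (p_t_given_a * p_a[:, None]).T   # P(T,A), shape (|T|, |A|)
    p_tb = p_t_given_a.T @ p_ab             # P(T,B) = sum_a P(T|a) P(a,B)
    return mutual_information(p_ta) - beta * mutual_information(p_tb)

# Toy joint over 3 values of A and 2 values of B (made-up numbers).
p_ab = np.array([[0.30, 0.05],
                 [0.25, 0.05],
                 [0.05, 0.30]])
identity = np.eye(3)        # T = A: nothing is compressed
single = np.ones((3, 1))    # one cluster: everything is compressed away

print(ib_objective(p_ab, identity, beta=2.0))
print(ib_objective(p_ab, single, beta=2.0))
```

With a single cluster both information terms vanish, so the objective is zero; with T = A the first term equals the entropy of A and the second equals I(A;B).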

Page 7: Multivariate Information Bottleneck

Information Bottleneck Reexamined

G in - actual distribution: P(A,B) P(T|A), with P(A,B) the input and P(T|A) the parameters

G out - desired independencies: Ind(A; B | T)

[Diagram: graphs over A, B and T for G in and G out]

Page 8: Multivariate Information Bottleneck

Example: Symmetric Bottleneck

Simultaneous clustering of both A and B, with parameters P(TA|A) and P(TB|B)

[Diagram: G in over A, TA, B, TB; G out over A, B, TA, TB]

So that TA captures the information A contains about B

TB captures the information B contains about A

Page 9: Multivariate Information Bottleneck

General Principle

Input: P(X1,…,Xn)

G in - Compression: Tj clusters the values of its parents paj

G out - Desired (conditional) independencies

Goal: Find P(Tj|paj) in G in to “match” G out

[Diagram: variables X1, X2, …, Xn and cluster variables T1, …, Tk]

Page 10: Multivariate Information Bottleneck

Multi-information

Information that the random variables jointly contain about each other

Generalizes mutual information

$$I(X_1,\dots,X_n) = E_P\left[\log \frac{P(X_1,\dots,X_n)}{P(X_1)\cdots P(X_n)}\right]$$
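The definition translates directly into a few lines of array code. This is a minimal sketch, assuming the joint distribution is stored as an n-dimensional NumPy array with one axis per variable; the function name is illustrative and the result is in nats.

```python
import numpy as np

def multi_information(p):
    """I(X1,...,Xn) in nats; axis i of the array p indexes the values of X_i."""
    n = p.ndim
    # Build the product of the single-variable marginals by broadcasting.
    product = np.ones_like(p)
    for i in range(n):
        other_axes = tuple(j for j in range(n) if j != i)
        product = product * p.sum(axis=other_axes, keepdims=True)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / product[mask])))

# Three perfectly correlated fair bits: X1 = X2 = X3.
copies = np.zeros((2, 2, 2))
copies[0, 0, 0] = copies[1, 1, 1] = 0.5

# Three independent fair bits.
independent = np.full((2, 2, 2), 1 / 8)

print(multi_information(copies))
print(multi_information(independent))
```

For two variables the function reduces to ordinary mutual information, which is the sense in which multi-information generalizes it.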

Page 11: Multivariate Information Bottleneck

Graph Projection

Let G be a DAG

Define:

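$$KL(P \,\|\, G) = \min_{Q \models G} KL(P \,\|\, Q)$$

[Diagram: P and its projection onto the set of distributions consistent with G, drawn inside the set of all possible distributions]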

Page 12: Multivariate Information Bottleneck

Graph Projection

Let G be a DAG

Define:

$$KL(P \,\|\, G) = \min_{Q \models G} KL(P \,\|\, Q)$$

Proposition:

$$KL(P \,\|\, G) = I(X_1,\dots,X_n) - I^G$$

I(X1,…,Xn) - real multi-info

I^G - multi-info as though P is consistent with G

Page 13: Multivariate Information Bottleneck

Multi-information & Bayesian Networks

Proposition:

If P is consistent with G, that is,

$$P(X_1,\dots,X_n) = \prod_i P(X_i \mid \mathbf{pa}_i)$$

Then

$$I(X_1,\dots,X_n) = \sum_i I(X_i; \mathbf{pa}_i)$$

(sum of local interactions)

Define

$$I^G = \sum_i I(X_i; \mathbf{pa}_i)$$
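The graph projection proposition, KL(P ‖ G) = I(X1,…,Xn) - I^G, can be checked numerically on a small example. The sketch below assumes the standard fact that the minimizing Q is the product of P's own conditionals P(Xi | pai); the helper names, the chain-shaped DAG and the random distribution are illustrative choices.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) in nats for two arrays of the same shape."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def marginal(p, axes):
    """Marginal of p over the given axes, kept broadcastable against p."""
    drop = tuple(i for i in range(p.ndim) if i not in axes)
    return p.sum(axis=drop, keepdims=True) if drop else p

def project(p, parents):
    """Q = prod_i P(X_i | pa_i), using the conditionals of P itself."""
    q = np.ones_like(p)
    for i, pa in enumerate(parents):
        q = q * marginal(p, set(pa) | {i}) / marginal(p, set(pa))
    return q

def local_information(p, parents):
    """I^G = sum_i I(X_i; pa_i), computed from the marginals of P."""
    total = 0.0
    for i, pa in enumerate(parents):
        joint = marginal(p, set(pa) | {i})
        total += kl(joint, marginal(p, {i}) * marginal(p, set(pa)))
    return total

# A random, strictly positive joint over three variables (illustrative only).
rng = np.random.default_rng(0)
p = rng.random((2, 3, 2))
p /= p.sum()

parents = [(), (0,), (1,)]   # the chain X0 -> X1 -> X2
multi_info = kl(p, marginal(p, {0}) * marginal(p, {1}) * marginal(p, {2}))

lhs = kl(p, project(p, parents))
rhs = multi_info - local_information(p, parents)
print(lhs, rhs)
```

The two printed numbers agree up to floating-point error, and both are zero exactly when P already factors along the chain.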

Page 14: Multivariate Information Bottleneck

Optimizing Criteria

Two goals:

Lose info wrt G in

Attain conditional independencies in G out

Optimization objective:

$$L = I^{in} + \gamma\, KL(P \,\|\, G_{out})$$

I^in - force clusters to compress

KL(P ‖ G out) - minimize violations of conditional indep. in G out

Page 15: Multivariate Information Bottleneck

Additional Interpretation

Using properties of KL(P ‖ G) we can rewrite

$$L = I^{in} + \gamma\, KL(P \,\|\, G_{out}) = I^{in} + \gamma\,(I^{in} - I^{out})$$

Thus, we can instead minimize

$$L = I^{in} - \beta\, I^{out}$$

Minimize information in G in

Maximize information in G out
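The rewriting step follows from the two propositions on the earlier slides. In the sketch below, γ (the weight on the KL term) and β (the resulting tradeoff) are symbol names assumed here, I denotes the multi-information of all variables under P, and P is taken to factor according to G in, as the slides describe.

$$
\begin{aligned}
KL(P \,\|\, G_{out}) &= I - I^{out} && \text{(graph projection proposition)} \\
I &= I^{in} && \text{(P is consistent with } G_{in}\text{)} \\
L = I^{in} + \gamma\, KL(P \,\|\, G_{out}) &= (1+\gamma)\, I^{in} - \gamma\, I^{out}
 = (1+\gamma)\left(I^{in} - \frac{\gamma}{1+\gamma}\, I^{out}\right)
\end{aligned}
$$

Since 1 + γ is positive for γ ≥ 0, minimizing L is equivalent to minimizing I^in - β I^out with β = γ / (1 + γ).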

Page 16: Multivariate Information Bottleneck

Minimization Objective - Example

Symmetric Bottleneck

[Diagram: G in over A, TA, B, TB; G out over A, B, TA, TB]

$$L = I(T_A; A) + I(T_B; B) - \beta\, I(T_A; T_B)$$

Recall

$$P(T_A, T_B) = \sum_{A,B} P(T_A \mid A)\, P(T_B \mid B)\, P(A, B)$$

P(TA|A), P(TB|B) - parameters we can control

P(A,B) - input (fixed)
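The symmetric objective depends only on the fixed input P(A,B) and the two controllable conditionals, so it can be evaluated with a few matrix products. This is a sketch under assumed names and a made-up toy joint; beta stands for the tradeoff weight on the I(TA;TB) term.

```python
import numpy as np

def mi(pxy):
    """I(X;Y) in nats for a joint distribution given as a 2-D array."""
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def symmetric_objective(p_ab, p_ta_a, p_tb_b, beta):
    """I(TA;A) + I(TB;B) - beta * I(TA;TB).

    p_ab:   shape (|A|, |B|), the fixed input P(A,B)
    p_ta_a: shape (|A|, |TA|), rows are P(TA|a)
    p_tb_b: shape (|B|, |TB|), rows are P(TB|b)
    """
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)
    # P(TA,TB) = sum_{a,b} P(TA|a) P(TB|b) P(a,b)
    p_ta_tb = p_ta_a.T @ p_ab @ p_tb_b
    i_ta_a = mi(p_ta_a * p_a[:, None])   # joint P(a,TA)
    i_tb_b = mi(p_tb_b * p_b[:, None])   # joint P(b,TB)
    return i_ta_a + i_tb_b - beta * mi(p_ta_tb)

# Toy joint (made-up numbers).
p_ab = np.array([[0.20, 0.05, 0.05],
                 [0.05, 0.20, 0.05],
                 [0.05, 0.05, 0.30]])

no_compression = symmetric_objective(p_ab, np.eye(3), np.eye(3), beta=1.0)
full_compression = symmetric_objective(p_ab, np.ones((3, 1)), np.ones((3, 1)), beta=1.0)
print(no_compression, full_compression)
```

With identity clusterings the three terms become H(A), H(B) and I(A;B); with one cluster per side all three vanish.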

Page 17: Multivariate Information Bottleneck

Characterization of Solutions

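Thm: Minimal point if and only if

$$P(t_j \mid \mathbf{pa}_j) = \frac{P(t_j)}{Z(\mathbf{pa}_j, \beta)} \exp\{-\beta\, d(t_j, \mathbf{pa}_j)\}$$

d(tj,paj) - measure of “distortion” between tj and paj

For example, in the symmetric bottleneck:

$$d(t_A, a) = KL\big(P(T_B \mid a) \,\|\, P(T_B \mid t_A)\big)$$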

Page 18: Multivariate Information Bottleneck

Finding Solutions

How can we find solutions?

Asynchronous update:

Pick an index j

Update P(Tj|paj):

$$P(t_j \mid \mathbf{pa}_j) = \frac{P(t_j)}{Z(\mathbf{pa}_j, \beta)} \exp\{-\beta\, d(t_j, \mathbf{pa}_j)\}$$

Theorem: Asynchronous updates converge to (local) minima
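For the symmetric bottleneck, the update rule can be sketched as an alternation between the two clusterings, using the distortion d(tA, a) = KL(P(TB|a) ‖ P(TB|tA)) and its mirror image for TB. This is an illustrative sketch, not the authors' implementation: the function names, the random initialization, the fixed number of sweeps, the small eps guard and the toy joint are all assumptions, and the joint is assumed to have no zero row or column marginals.

```python
import numpy as np

def normalize_rows(m):
    return m / m.sum(axis=1, keepdims=True)

def update_cluster(p_xy, q_x, q_y, beta, eps=1e-12):
    """One update of P(T_X|X), holding the other clustering P(T_Y|Y) fixed.

    p_xy: joint P(X,Y), shape (|X|, |Y|)
    q_x:  current P(T_X|X), shape (|X|, |T_X|)
    q_y:  current P(T_Y|Y), shape (|Y|, |T_Y|)
    """
    p_x = p_xy.sum(axis=1)
    p_tx = q_x.T @ p_x                                   # P(T_X)
    p_ty_given_x = (p_xy / p_x[:, None]) @ q_y           # P(T_Y | x)
    p_ty_given_tx = (q_x.T @ p_xy @ q_y) / (p_tx[:, None] + eps)  # P(T_Y | t_X)
    # d[x, t] = KL( P(T_Y|x) || P(T_Y|t) )
    log_ratio = (np.log(p_ty_given_x[:, None, :] + eps)
                 - np.log(p_ty_given_tx[None, :, :] + eps))
    d = np.sum(p_ty_given_x[:, None, :] * log_ratio, axis=2)
    # P(t|x) = P(t) exp(-beta d(t,x)) / Z(x, beta); Z is the row normalizer.
    return normalize_rows(p_tx[None, :] * np.exp(-beta * d))

def symmetric_bottleneck(p_ab, n_ta, n_tb, beta, sweeps=50, seed=0):
    rng = np.random.default_rng(seed)
    q_a = normalize_rows(rng.random((p_ab.shape[0], n_ta)))  # P(TA|A)
    q_b = normalize_rows(rng.random((p_ab.shape[1], n_tb)))  # P(TB|B)
    for _ in range(sweeps):
        q_a = update_cluster(p_ab, q_a, q_b, beta)      # j = TA
        q_b = update_cluster(p_ab.T, q_b, q_a, beta)    # j = TB
    return q_a, q_b

# Toy joint with two blocks (made-up numbers).
p_ab = np.array([[4, 4, 1, 1],
                 [4, 4, 1, 1],
                 [1, 1, 4, 4],
                 [1, 1, 4, 4]], dtype=float)
p_ab /= p_ab.sum()

q_a, q_b = symmetric_bottleneck(p_ab, n_ta=2, n_tb=2, beta=5.0)
print(np.round(q_a, 3))
print(np.round(q_b, 3))
```

Each call to `update_cluster` recomputes the marginals it needs from the current clusterings, so the two updates are applied one at a time rather than simultaneously. Since the theorem only promises a local minimum, the result depends on the initialization; trying several seeds is a reasonable precaution.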

Page 19: Multivariate Information Bottleneck

Example - 20 newsgroups

20,000 messages from 20 newsgroups [Lang 1995]

A - newsgroup of the message

B - word in the message

P(a,b) - probability that a randomly chosen position in the corpus holds word b in a message from newsgroup a

We applied the symmetric bottleneck to both attributes
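The joint P(a,b) described on this slide is a normalized newsgroup-by-word count matrix. A minimal sketch with made-up counts follows; the array and the function name are hypothetical and are not the actual corpus statistics.

```python
import numpy as np

# Hypothetical counts: rows are newsgroups, columns are words (made-up numbers).
counts = np.array([[12, 0, 3],
                   [1, 9, 5]], dtype=float)

def joint_from_counts(counts):
    """P(a,b): chance that a uniformly random word position in the corpus
    holds word b inside a message of newsgroup a."""
    return counts / counts.sum()

p_ab = joint_from_counts(counts)
p_a = p_ab.sum(axis=1)   # share of word positions in each newsgroup
p_b = p_ab.sum(axis=0)   # overall frequency of each word
print(p_ab)
```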

Page 20: Multivariate Information Bottleneck

20 Newsgroup: Symmetric Bottleneck

[Figure: matrix with axes Newsgroup and word]

Page 21: Multivariate Information Bottleneck

20 Newsgroup: Symmetric Bottleneck

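Newsgroup cluster: alt.atheism, rec.autos, rec.motorcycles, rec.sport.*, sci.med, sci.space, soc.religion.christian, talk.politics.*

Newsgroup cluster: comp.*, misc.forsale, sci.crypt, sci.electronics

Word cluster: car, turkish, game, team, jesus, gun, hockey, …

Word cluster: x, file, image, encryption, window, dos, mac, …

[Figure: P(TD,TW), axes Newsgroup and word]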

Page 22: Multivariate Information Bottleneck

20 Newsgroup: Symmetric Bottleneck

[Figure: P(TD,TW), axes Newsgroup and word]

Page 23: Multivariate Information Bottleneck

20 Newsgroup: Symmetric Bottleneck

[Figure: P(TD,TW), axes Newsgroup and word]

Page 24: Multivariate Information Bottleneck

20 Newsgroup: Symmetric Bottleneck

[Figure: P(TD,TW), axes Newsgroup and word]

Page 25: Multivariate Information Bottleneck

20 Newsgroup: Symmetric Bottleneck

Word cluster: atheists, christianity, jesus, bible, sin, faith, …

Newsgroup cluster: alt.atheism, soc.religion.christian, talk.religion.misc

[Figure: P(TD,TW), axes Newsgroup and word]

Page 26: Multivariate Information Bottleneck

Discussion

General framework:

Defines a new family of optimization problems … and solutions

Future directions:

Additional algorithms - agglomerative solutions

Relation to generative models

Parametric constraints in G out

Page 27: Multivariate Information Bottleneck

Example: Parallel Bottleneck

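[Diagram: G in and G out over A, B, T1, T2]

$$L = I(T_1; A) + I(T_2; A) + \gamma\,\big[\, I(T_1; T_2) - I(T_1, T_2; B) \,\big]$$

[Equation: distortion d(tA, a), involving the terms KL(P(B|a,TB) ‖ P(B|tA,TB)) and KL(P(TB|a) ‖ P(TB|tA))]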