The Relative Entropy Rate of Two Hidden Markov Processes Or Zuk Dept. of Phys. Of Comp. Systems...

The Relative Entropy Rate of Two Hidden Markov Processes

Or Zuk

Dept. of Phys. Of Comp. Systems

Weizmann Inst. Of Science

Rehovot, Israel

Overview

o Introduction

o Distance Measures and Relative Entropy Rate

o Results: Generalization from Entropy Rate

o Future Directions

Introduction

Hidden Markov Processes are relevant to:

o Error correction (Markovian source + noise)

o Signal processing, speech recognition

o Experimental physics: telegraph noise, TLS + noise, quantum jumps

o Bioinformatics: biological sequences, gene expression

[Figure: examples of HMPs. Panels: Transmission (Markov chain -> HMP, noise 10%); Quantum jumps; Mesoscopic wires. Axis label: time (ms).]

HMP - Definitions

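Markov Process:

X: Markov process

$M_\lambda$: transition matrix

$m_\lambda(i,j) = \Pr(X_{n+1} = j \mid X_n = i)$

Hidden Markov Process:

Y: noisy observation of X

$R_\lambda$: noise/emission matrix

$r_\lambda(i,j) = \Pr(Y_n = j \mid X_n = i)$

[Diagram: $X_n \to X_{n+1}$ along the chain; each $X_n$ emits $Y_n$ through $R_\lambda$.]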

Models are denoted by λ and µ.
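As a minimal sketch of these definitions, the following samples a hidden chain X and its noisy observation Y from a transition matrix M and an emission matrix R. The function name `sample_hmp` and the uniform choice of the initial state are assumptions made for illustration; they are not taken from the slides.

```python
import random


def sample_hmp(M, R, n, seed=0):
    """Sample (X_1..X_n, Y_1..Y_n) from an HMP.

    M[i][j] = Pr(X_{n+1} = j | X_n = i), R[i][j] = Pr(Y_n = j | X_n = i).
    The initial state is drawn uniformly (an assumption of this sketch).
    """
    rng = random.Random(seed)
    states = list(range(len(M)))
    symbols = list(range(len(R[0])))
    x = rng.choice(states)
    xs, ys = [], []
    for _ in range(n):
        xs.append(x)
        # Emit Y_n from row x of the emission matrix.
        ys.append(rng.choices(symbols, weights=R[x])[0])
        # Move the hidden chain one step using row x of the transition matrix.
        x = rng.choices(states, weights=M[x])[0]
    return xs, ys
```

With R equal to the identity matrix the observation is noiseless and Y coincides with X, which is a quick sanity check on the sampler.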

Example: Binary HMP

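[Diagram: two-state transition graph with edge probabilities p(0|0), p(1|0), p(0|1), p(1|1), and emission graph with probabilities q(0|0), q(1|0), q(0|1), q(1|1).]

Transition:

$$M = \begin{pmatrix} p(0|0) & p(1|0) \\ p(0|1) & p(1|1) \end{pmatrix}$$

Emission:

$$R = \begin{pmatrix} q(0|0) & q(1|0) \\ q(0|1) & q(1|1) \end{pmatrix}$$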

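Example: Binary HMP (Cont.)

A simple, symmetric binary HMP:

$$M = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}, \qquad R = \begin{pmatrix} 1-\epsilon & \epsilon \\ \epsilon & 1-\epsilon \end{pmatrix}$$

All properties of the process depend on two parameters, p (the flip probability of the chain) and ε (the noise). Assume w.l.o.g. p, ε < ½.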

Distance Measures for Two HMPs

Why important?

o Often, one learns an HMP from data. It is important to know how different the learned model is from the true model.

o Sometimes, many HMPs may represent different sources (e.g. different authors, different protein families, etc.), and we wish to know which sources are similar.

What distance measure to use?

o Look at the joint distributions of N consecutive Y symbols, $P_\lambda^{(N)}$ and $P_\mu^{(N)}$.

Relative Entropy (RE) Rate

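Notation: $[Y]_i^j = (Y_i, Y_{i+1}, \ldots, Y_j)$.

The relative entropy is first defined for finite (N-symbol) distributions, $D(P_\lambda^{(N)} \| P_\mu^{(N)})$.

Taking the limit of the per-symbol relative entropy as N -> ∞ gives the RE-rate, D(λ || µ).

An alternative definition uses the conditional relative entropy of the next symbol given the past.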
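The displayed formulas did not survive in this transcript. In standard notation, the quantities named above can be written as follows. This is a reconstruction from the usual textbook definitions, not a copy of the slide; the two expressions for the rate agree whenever the limits exist.

```latex
\begin{align*}
D\bigl(P_\lambda^{(N)} \,\|\, P_\mu^{(N)}\bigr)
  &= \sum_{[Y]_1^N} P_\lambda\bigl([Y]_1^N\bigr)
     \log \frac{P_\lambda\bigl([Y]_1^N\bigr)}{P_\mu\bigl([Y]_1^N\bigr)}, \\
D(\lambda \,\|\, \mu)
  &= \lim_{N \to \infty} \frac{1}{N}\,
     D\bigl(P_\lambda^{(N)} \,\|\, P_\mu^{(N)}\bigr), \\
D(\lambda \,\|\, \mu)
  &= \lim_{N \to \infty} \sum_{[Y]_1^N} P_\lambda\bigl([Y]_1^N\bigr)
     \log \frac{P_\lambda\bigl(Y_N \mid [Y]_1^{N-1}\bigr)}
               {P_\mu\bigl(Y_N \mid [Y]_1^{N-1}\bigr)}.
\end{align*}
```

The third line is the increment $D(P_\lambda^{(N)} \| P_\mu^{(N)}) - D(P_\lambda^{(N-1)} \| P_\mu^{(N-1)})$, by the chain rule for relative entropy.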

First proposed for HMPs by [Juang&Rabiner 85]. Not a metric (not symmetric, no triangle inequality). Still, it has several natural interpretations:

-If one generates data from λ and gives a likelihood score to µ, then D(λ || µ) is the average log-likelihood loss per symbol (compared to the optimal model λ).

-If one compresses data generated by λ, erroneously assuming it was generated by µ, then one ‘loses’ on average D(λ || µ) per symbol.

For Markov chains, D(λ || µ) is easily given by:
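The slide's formula is missing from this transcript. For two ergodic Markov chains the standard closed form is D(λ || µ) = Σ_i π_λ(i) Σ_j m_λ(i,j) log[m_λ(i,j) / m_µ(i,j)], where π_λ is the stationary distribution of M_λ. A minimal sketch of that computation follows; the function names are hypothetical, and the stationary distribution is obtained by plain power iteration, which assumes an ergodic, aperiodic chain.

```python
import math


def stationary_distribution(M, n_iter=10000):
    """Stationary row vector of M by power iteration (assumes ergodicity)."""
    n = len(M)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * M[i][j] for i in range(n)) for j in range(n)]
    return pi


def markov_re_rate(M_lam, M_mu):
    """Relative entropy rate (in nats) between two Markov chains.

    Assumes M_mu[i][j] > 0 wherever M_lam[i][j] > 0; otherwise the
    rate is infinite and this sketch raises ZeroDivisionError.
    """
    pi = stationary_distribution(M_lam)
    n = len(M_lam)
    return sum(
        pi[i] * M_lam[i][j] * math.log(M_lam[i][j] / M_mu[i][j])
        for i in range(n)
        for j in range(n)
        if M_lam[i][j] > 0
    )
```

For two symmetric binary chains with flip probabilities a and b, the stationary distribution is uniform and the result reduces to the binary relative entropy between a and b.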

o For HMPs, D(λ || µ) is difficult to compute. So far only bounds [Silva&Narayanan] or approximation algorithms [Li et al. 05, Do 03, Mohammad&Tranter 05] are known.

o D(λ || µ) generalizes the concept of the Shannon entropy rate, using:

H(λ) = log s – D(λ || u)

where u is the uniform model and s is the alphabet size of Y.

o The entropy rate H for an HMP is a Lyapunov exponent, which is hard to compute in general [Jacquet et al 04].

o What is known (for H)? Lyapunov exponent representation, analyticity, asymptotic expansions in different regimes.

o Generalize results and techniques to the RE-rate.
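The identity H(λ) = log s – D(λ || u) follows in one step from the definitions. A short derivation, assuming the uniform model u assigns probability $s^{-N}$ to every N-symbol sequence and writing $H(P_\lambda^{(N)})$ for the N-symbol block entropy (notation introduced here for illustration):

```latex
\begin{align*}
D\bigl(P_\lambda^{(N)} \,\|\, P_u^{(N)}\bigr)
  &= \sum_{[Y]_1^N} P_\lambda\bigl([Y]_1^N\bigr)
     \log \frac{P_\lambda\bigl([Y]_1^N\bigr)}{s^{-N}} \\
  &= N \log s + \sum_{[Y]_1^N} P_\lambda\bigl([Y]_1^N\bigr)
     \log P_\lambda\bigl([Y]_1^N\bigr) \\
  &= N \log s - H\bigl(P_\lambda^{(N)}\bigr).
\end{align*}
```

Dividing by N and letting N -> ∞ gives D(λ || u) = log s – H(λ).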

Why is calculating D(λ || µ) difficult?

Markov Chains:

-All sequences with the same number of flips have the same probability.

-Polynomial number of types (probs).

HMPs:

-Many Markov chain paths {X} contribute to the same Y.

-Different {Y}s have different probs.

-Exponential number of types (probs). The method of types does not work here.
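The contrast can be checked by brute force on a short symmetric binary example. The sketch below enumerates all length-n observation sequences, computes each probability with the forward equations in exact rational arithmetic, and counts how many distinct values occur. Function names and the specific parameter values are illustrative assumptions; setting the noise to zero recovers the plain Markov chain.

```python
from fractions import Fraction
from itertools import product


def sequence_probability(p, eps, ys):
    """P(Y_1..Y_n = ys) for the symmetric binary HMP, via the forward equations.

    p is the chain's flip probability, eps the noise; the chain starts
    from its stationary (uniform) distribution.
    """
    M = [[1 - p, p], [p, 1 - p]]
    R = [[1 - eps, eps], [eps, 1 - eps]]
    half = Fraction(1, 2)
    alpha = [half * R[0][ys[0]], half * R[1][ys[0]]]
    for y in ys[1:]:
        alpha = [(alpha[0] * M[0][j] + alpha[1] * M[1][j]) * R[j][y]
                 for j in (0, 1)]
    return alpha[0] + alpha[1]


def count_distinct_probabilities(p, eps, n):
    """Number of distinct probability values over all 2**n sequences."""
    return len({sequence_probability(p, eps, ys)
                for ys in product((0, 1), repeat=n)})
```

For example, `count_distinct_probabilities(Fraction(3, 10), Fraction(0), 8)` counts the Markov-chain types (one per number of flips), while a nonzero `eps` such as `Fraction(1, 10)` yields many more distinct values for the same n.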

RE-Rate and Lyapunov Exponents

What is a Lyapunov exponent?

o Arises in dynamical systems, control theory, statistical physics, etc. Measures the stability of the system.

o Take two (square) matrices A, B. At each step, choose at random A (with prob. p) or B (w.p. 1-p). Look at the norm:

(1/N) log ||ABBBAABAB…BA||

o The limit as N -> ∞:

-Exists a.s. [Furstenberg&Kesten 60]

-Called the top Lyapunov exponent.

-Independent of the matrix norm chosen.

o The HMP entropy rate is given as a Lyapunov exponent [Jacquet et al. 04].
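A minimal Monte Carlo sketch of the quantity (1/N) log ||ABBBAABAB…BA|| for 2x2 matrices. Instead of forming the full product (which overflows or underflows quickly), it tracks a vector, renormalizes at every step, and accumulates the log of each normalization factor. The function name, the starting vector, and the use of the L1 norm are assumptions of this sketch; for a generic starting vector the growth rate of the vector matches the top Lyapunov exponent, but this is an estimate from one random trajectory, not an exact value.

```python
import math
import random


def top_lyapunov(A, B, p, n_steps=20000, seed=0):
    """Estimate the top Lyapunov exponent of a random product of A and B.

    At each step A is chosen with probability p, otherwise B.
    """
    rng = random.Random(seed)
    v = [0.5, 0.5]  # arbitrary positive starting vector with L1 norm 1
    log_growth = 0.0
    for _ in range(n_steps):
        M = A if rng.random() < p else B
        v = [M[0][0] * v[0] + M[0][1] * v[1],
             M[1][0] * v[0] + M[1][1] * v[1]]
        norm = abs(v[0]) + abs(v[1])
        log_growth += math.log(norm)  # growth contributed by this step
        v = [v[0] / norm, v[1] / norm]  # renormalize to avoid overflow
    return log_growth / n_steps
```

When A and B are the same diagonal matrix diag(2, 1), the randomness plays no role and the estimate should approach log 2, which makes a convenient check.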

RE-Rate and Lyapunov Exponents

What about RE-rate? Given as the difference of two Lyapunov Exponents:

-The G’s are random matrices, which are simply obtained from M and R using the forward equations.

-Different matrices appear in the two Lyapunov exponents, but the probabilities selecting the matrices are the same.
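The two points above can be sketched numerically for the binary case: one observation sequence is sampled from λ (so the matrix-selection probabilities are the same in both terms), and the normalized forward equations are run once with λ's matrices and once with µ's. The per-symbol difference of the two log-likelihoods is a Monte Carlo estimate of D(λ || µ). All function names are hypothetical, and the uniform initial distribution assumes a symmetric transition matrix.

```python
import math
import random


def symmetric(a):
    """2x2 symmetric stochastic matrix with off-diagonal entry a."""
    return [[1.0 - a, a], [a, 1.0 - a]]


def sample_observations(M, R, n, rng):
    """Draw Y_1..Y_n from a binary HMP with transition M and emission R."""
    x = rng.randrange(2)  # uniform start: stationary when M is symmetric
    ys = []
    for _ in range(n):
        ys.append(0 if rng.random() < R[x][0] else 1)
        x = 0 if rng.random() < M[x][0] else 1
    return ys


def log_likelihood(M, R, ys):
    """log P(ys) under (M, R), using normalized forward equations."""
    pred = [0.5, 0.5]  # distribution of X_n given the past observations
    total = 0.0
    for y in ys:
        joint = [pred[i] * R[i][y] for i in (0, 1)]
        s = joint[0] + joint[1]  # P(Y_n = y | past)
        total += math.log(s)
        post = [joint[0] / s, joint[1] / s]
        pred = [post[0] * M[0][j] + post[1] * M[1][j] for j in (0, 1)]
    return total


def re_rate_estimate(lam, mu, n=20000, seed=0):
    """Estimate D(lam || mu) in nats from one sample path of length n.

    lam and mu are (M, R) pairs; the data are generated by lam.
    """
    rng = random.Random(seed)
    ys = sample_observations(lam[0], lam[1], n, rng)
    return (log_likelihood(lam[0], lam[1], ys)
            - log_likelihood(mu[0], mu[1], ys)) / n
```

A usage example: `re_rate_estimate((symmetric(0.2), symmetric(0.1)), (symmetric(0.4), symmetric(0.3)))`. The estimate fluctuates with the seed and with n; it is a sampling approximation, not the closed-form value the slides say is unavailable.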

Analyticity of the RE-Rate

Is the RE-rate continuous, ‘smooth’, or even analytic in the parameters governing the HMPs?

For Lyapunov exponents: analyticity is known in the matrix entries [Ruelle 79] and in their probabilities [Peres 90,91], separately.

For HMP entropy rate, analyticity was recently shown by [Han&Marcus 05].

Analyticity of the RE-Rate

Using both results, we are able to show:

Thm: The RE-rate is analytic in the HMP parameters.

Analyticity is shown only in the interior of the parameter domain (i.e. strictly positive probabilities).

Behavior on the boundaries is more complicated. Sometimes analyticity remains on the boundaries (and beyond). Sometimes we encounter singularities. A full characterization is still lacking [Marcus&Han 05].

RE-Rate Taylor Series Expansion

While in general the RE-rate is not known, there are specific parameter values for which it is easily given in closed form (e.g. for Markov chains). Perhaps we can ‘expand’ around these values and get asymptotic results near them.

A similar approach was used for Lyapunov exponents [Derrida], and for the HMP entropy rate [Jacquet et al. 04, Weissman&Ordentlich 04, Zuk et al. 05], giving first-order asymptotics in various regimes.

Different Regimes – Binary Case

Regimes in the transition parameter p and the noise parameter ε:

p -> 0, p -> ½ (ε fixed)

ε -> 0, ε -> ½ (p fixed)

We concentrate on the ‘High-SNR regime’ ε -> 0, and the ‘almost-memoryless regime’ p -> ½.

For High-SNR (η = λ, µ):

Solution can be given as a power series in ε:

In [Zuk,Domany,Kanter&Aizenman 06] we give a procedure for calculating the full Taylor series expansion for the HMP entropy rate, in the ‘High-SNR’ and ‘almost-memoryless’ regimes.

Main observation: Finite systems give the correct RE rate up to a given order:

Was discovered using computer experiments (symbolic computation in Maple).

Stronger result holds for the entropy rate (orders ‘settle’ for N ≥ (k+3)/2)

Does not hold in every regime. For some regimes (e.g. p -> 0), even the first order never settles.
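A numerical (not symbolic) sketch of what a "finite system" means here, for the entropy-rate case: the conditional entropy H(Y_N | Y_1, ..., Y_{N-1}) of the symmetric binary HMP is computed by exact enumeration of all length-N sequences. Comparing the values for consecutive N at small noise gives a feel for how quickly the finite-system value stabilizes; it does not reproduce the Maple computation or prove the order-by-order statement. Function names are hypothetical.

```python
import itertools
import math


def sequence_probability(p, eps, ys):
    """P(Y_1..Y_n = ys) for the symmetric binary HMP (forward equations)."""
    M = [[1.0 - p, p], [p, 1.0 - p]]
    R = [[1.0 - eps, eps], [eps, 1.0 - eps]]
    alpha = [0.5 * R[0][ys[0]], 0.5 * R[1][ys[0]]]
    for y in ys[1:]:
        alpha = [(alpha[0] * M[0][j] + alpha[1] * M[1][j]) * R[j][y]
                 for j in (0, 1)]
    return alpha[0] + alpha[1]


def block_entropy(p, eps, n):
    """Entropy (in nats) of the joint distribution of n consecutive symbols."""
    if n == 0:
        return 0.0
    h = 0.0
    for ys in itertools.product((0, 1), repeat=n):
        q = sequence_probability(p, eps, ys)
        if q > 0.0:
            h -= q * math.log(q)
    return h


def conditional_entropy(p, eps, n):
    """H(Y_n | Y_1..Y_{n-1}) as a difference of block entropies."""
    return block_entropy(p, eps, n) - block_entropy(p, eps, n - 1)
```

For instance, evaluating `conditional_entropy(0.3, 0.01, n)` for n = 2, 3, 4 and looking at the successive differences shows them shrinking rapidly; with `eps = 0.0` the process is a Markov chain and the value equals the binary entropy of p for every n ≥ 2.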

Proof Outline (with M. Aizenman)

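A finite system of size N ≥ (k+3)/2 gives H(p, ε) up to $O(\epsilon^k)$.

Two main ideas:

A. Distinguish between the noise at different sites: $\epsilon_1, \epsilon_2, \epsilon_3, \ldots, \epsilon_j$.

B. When $\epsilon_m = 0$, the observation $Y_m = X_m$, and conditioning back to the past is ‘blocked’.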

D(λ||µ)

First order :

o Higher orders were computed for the binary symmetric case.

o Similar results hold for the ‘almost-memoryless’ regime.

o The radius of convergence seems larger for the latter expansion, although no rigorous results are known.

Future Directions

o Study other regimes (e.g. two ‘close’ models).

o Behavior of the EM algorithm.

o Generalizations (e.g. different alphabet sizes, continuous case).

o Physical realization of HMPs (mesoscopic systems, quantum jumps).

o Domain of analyticity, radius of convergence.

Thanks

o Eytan Domany (Weizmann Inst.)

o Ido Kanter (Bar-Ilan Univ.)

o Michael Aizenman (Princeton Univ.)

o Libi Hertzberg (Weizmann Inst.)
