On the Convergence Speed of MDL Predictions for Bernoulli Sequences
or
Is MDL Really So Bad?
Jan Poland and Marcus Hutter
IDSIA, Lugano, Switzerland
2
Big Picture
[Diagram: MDL and Bayes, alongside other methods, e.g. PAC-Bayes]
3
Bernoulli Classes
θ      w      Code
0      1/4    00
1      1/4    01
1/2    1/4    10
1/4    1/16   1100
3/4    1/16   1101
1/8    1/64   111000
3/8    1/64   111001
5/8    1/64   111010
7/8    1/64   111011

Code $= \underbrace{111}_{1+\#\text{bits}}\ \underbrace{0}_{\text{stop}}\ \underbrace{10}_{\text{data}}$
• Set of parameters $\Theta = \{\theta_1, \theta_2, \dots\} \subset [0, 1]$
• Weights $w_\theta$ for each $\theta \in \Theta$
• Weights correspond to codes: $w_\theta = 2^{-\ell(\mathrm{Code}_\theta)}$, where $\ell$ is the code length (see the sketch below)
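As a minimal sketch (my addition, not from the slides), the table above can be written down in Python; the weights are read off the code lengths, and the Kraft inequality for prefix-free codes guarantees they sum to at most 1:

from fractions import Fraction

# (theta, code) pairs from the table above
CLASS = [
    (Fraction(0), "00"), (Fraction(1), "01"), (Fraction(1, 2), "10"),
    (Fraction(1, 4), "1100"), (Fraction(3, 4), "1101"),
    (Fraction(1, 8), "111000"), (Fraction(3, 8), "111001"),
    (Fraction(5, 8), "111010"), (Fraction(7, 8), "111011"),
]

# weight = 2^(-code length)
weights = {theta: Fraction(1, 2 ** len(code)) for theta, code in CLASS}

# prefix-free codes satisfy the Kraft inequality, so the weights sum to <= 1
assert sum(weights.values()) <= 1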
4
Estimators
• Given an observed sequence $x = x_1 x_2 \dots x_n$
• Probability of $x$ given $\theta$: $p_\theta(x) = \theta^{\#\mathrm{ones}(x)} (1 - \theta)^{n - \#\mathrm{ones}(x)}$
• Posterior weights: $w_\theta(x) = \dfrac{w_\theta\, p_\theta(x)}{\sum_\theta w_\theta\, p_\theta(x)}$
• Bayes mixture: $\xi(x) = \sum_\theta w_\theta(x)\, \theta$
• MDL/MAP: $\theta^*(x) = \arg\max_\theta w_\theta(x)$
• Maximum Likelihood (ML): same as MAP, but with the prior weights set to 1 (all three are sketched in code below)
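A hedged continuation of the earlier Python sketch (the helper `estimates` is mine, not the authors'): it computes all three predictions on the class above, reusing the `weights` dict.

def estimates(x, weights):
    """Return (Bayes mixture, ML, MAP/MDL) predictions after observing x."""
    n, ones = len(x), x.count("1")
    # likelihood p_theta(x) = theta^ones * (1 - theta)^(n - ones)
    lik = {th: float(th) ** ones * (1 - float(th)) ** (n - ones)
           for th in weights}
    post = {th: weights[th] * lik[th] for th in weights}   # w_theta * p_theta(x)
    z = sum(post.values())
    xi = sum(p * float(th) for th, p in post.items()) / z  # Bayes mixture
    mdl = max(post, key=post.get)                          # MAP = argmax posterior
    ml = max(lik, key=lik.get)                             # ML: prior set to 1
    return xi, float(ml), float(mdl)

print(estimates("0", weights))   # roughly (0.21, 0.0, 0.0)
print(estimates("01", weights))  # roughly (0.5, 0.5, 0.5)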
5
An Example Process
x (new bits)   Bayes mixture ξ   ML estimate   MAP (MDL) θ*
(start)        0.5               0             0
0              0.21              0             0
1              0.5               0.5           0.5
0              0.45              0.34          0.5
0000011        0.4               5/16          0.5
...(32)...     0.27              0.25          0.25
...(640)...    0.3               5/16          5/16

True parameter $\theta_0 = 5/16 = 0.3125$
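As a sketch (mine), the process can be replayed in Python by extending the class one more level so that $\theta_0 = 5/16$ is in it (the slides' code construction continues with odd numerators $m/2^k$ at weight $4^{-k}$); the exact numbers depend on the random draw, so they will only roughly match the table.

import random
from fractions import Fraction

# extend the class to denominator 16, so theta0 = 5/16 gets weight 4^-4
for k in (2, 3, 4):
    for m in range(1, 2 ** k, 2):
        weights.setdefault(Fraction(m, 2 ** k), Fraction(1, 4 ** k))

random.seed(0)
theta0 = 5 / 16
x = ""
for t in range(1, 641):
    x += "1" if random.random() < theta0 else "0"
    if t in (1, 2, 3, 10, 42, 640):              # arbitrary checkpoints
        xi, ml, mdl = estimates(x, weights)      # helper from the sketch above
        print(t, round(xi, 2), round(ml, 2), round(mdl, 4))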
6
What We Know
• Let $\theta_0 \in \Theta$ be the true parameter, with weight $w_0$
• $\xi$ converges to $\theta_0$ almost surely and fast; precisely, $\sum_{t=0}^{\infty} E(\xi - \theta_0)^2 \le \ln(w_0^{-1})$
• $\theta^*$ converges to $\theta_0$ almost surely and, in general, slowly; precisely, $\sum_{t=0}^{\infty} E(\theta^* - \theta_0)^2 \le O(w_0^{-1})$ (numeric sketch below)
• This even holds for arbitrary non-i.i.d. (semi)measures!
• The ML estimates converge to $\theta_0$ almost surely; no such assertion about the convergence speed is possible
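A rough Monte Carlo sketch (my addition) of the two cumulative errors, reusing `estimates` and the extended `weights`; run count and horizon are arbitrary, and it only illustrates the gap between the two bounds:

import math, random

def cumulative_errors(theta0, horizon=200, runs=20):
    se_xi = se_mdl = 0.0
    for r in range(runs):
        random.seed(r)
        x = ""
        for t in range(horizon):
            xi, _, mdl = estimates(x, weights)   # predict before seeing x_t
            se_xi += (xi - theta0) ** 2
            se_mdl += (mdl - theta0) ** 2
            x += "1" if random.random() < theta0 else "0"
    return se_xi / runs, se_mdl / runs

w0 = 1 / 256                                     # weight of theta0 = 5/16
print(cumulative_errors(5 / 16), "vs ln(1/w0) =", math.log(1 / w0))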
7
Is MDL Really So Bad?
• The Bayes mixture bound is $\mathrm{description\ length}(\theta_0)$
• The MDL bound is $\exp(\mathrm{description\ length}(\theta_0))$
• $\Rightarrow$ MDL is exponentially worse in general (spelled out below)
• This is also a loss bound!
• How about simple classes?
• Deterministic classes: one can show a bound of huge constant $\times\, (\mathrm{description\ length}(\theta_0))^3$
• Simple stochastic classes, e.g. Bernoulli?
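Making the gap explicit (my addition, directly from the coding interpretation of the weights): with $w_0 = 2^{-\ell(\theta_0)}$, where $\ell(\theta_0)$ is the description length,

\sum_t E(\xi - \theta_0)^2 \le \ln(w_0^{-1}) = \ell(\theta_0)\,\ln 2
\qquad\text{vs.}\qquad
\sum_t E(\theta^* - \theta_0)^2 \le O(w_0^{-1}) = O\bigl(2^{\ell(\theta_0)}\bigr).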
8
MDL Is Really So Bad!
• $\sum_t E(\theta^* - \theta_0)^2 = O(w_0^{-1})$ is attained in the following example:
• $N$ parameters, $w_\theta = \frac{1}{N}$ for all $\theta$, $\theta_0 = \frac12$, with the remaining parameters clustered at $\frac12 + \frac14,\ \frac12 + \frac18,\ \frac12 + \frac{1}{16},\ \dots$
• Each interval between consecutive cluster points contributes $O(1)$:
  $\sum_t E(\theta^* - \theta_0)^2\, \mathbf{1}\{\theta^* \in [\frac12 + \frac18,\ \frac12 + \frac14]\} = O(1)$,
  $\sum_t E(\theta^* - \theta_0)^2\, \mathbf{1}\{\theta^* \in [\frac12 + \frac{1}{16},\ \frac12 + \frac18]\} = O(1)$, etc.
• Summing over all $\approx N$ such intervals, the cumulative loss indeed grows like $N = w_0^{-1}$ (simulation sketch below)
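A simulation sketch (mine) of this example; note the $k$-th cluster point is only selected around time $t \approx 4^k$, so the horizon has to grow like $4^N$ before every interval has contributed:

import math, random

def mdl_cumloss(N, runs=10):
    horizon = 10 * 4 ** N
    thetas = [0.5] + [0.5 + 2.0 ** -k for k in range(2, N + 1)]
    total = 0.0
    for r in range(runs):
        random.seed(r)
        n = ones = 0
        for _ in range(horizon):
            n += 1
            ones += random.random() < 0.5        # theta0 = 1/2
            # uniform weights: MAP = ML = best log-likelihood fit
            mdl = max(thetas, key=lambda th: ones * math.log(th)
                                             + (n - ones) * math.log(1 - th))
            total += (mdl - 0.5) ** 2
    return total / runs

for N in (3, 4, 5, 6):                           # loss grows roughly like N
    print(N, round(mdl_cumloss(N), 2))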
9
MDL Is Not That Bad!
• The instantaneous loss bound is good; precisely, $E(\theta^* - \theta_0)^2 \le \frac{1}{n}\, O\bigl(\ln(w_0^{-1})\bigr)$
• This does not imply a finitely bounded cumulative loss! (see the sum spelled out below)
• The cumulative loss bound is good for certain nice classes (parameters + weights)
• Intuitively: the bound is good if parameters of equal weight are uniformly distributed
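Spelling out why the $1/n$ rate alone is not enough (my addition): the harmonic series diverges, so summing the instantaneous bound gives only

\sum_{n=1}^{T} \frac{1}{n}\, O\bigl(\ln w_0^{-1}\bigr)
  = O\bigl(\ln w_0^{-1}\bigr)\, H_T
  = O\bigl(\ln w_0^{-1} \cdot \ln T\bigr)
  \xrightarrow{\;T \to \infty\;} \infty .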
10
Prepare Sharper Upper Bound
• Define an interval construction $(I_k, J_k)$ which exponentially contracts to $\theta_0$
• Let $K(I_k)$ be the shortest description length of some $\theta \in I_k$

[Figure: the unit interval marked at $0, \frac18, \frac14, \frac38, \frac12, \frac58, \frac34, \frac78, 1$, with $\theta_0 = \frac14$; $J_0 = [0, \frac12)$, $I_0 = [\frac12, 1]$, and the contracted intervals $I_1, J_1$ one level below.]
11
Sharper Upper Bound
• Let $K(J_k)$ be the shortest description length of some $\theta \in J_k$
• Let $\Delta(k) = \max\{K(I_k) - K(J_k),\, 0\}$
• Theorem: $\sum_t E(\theta^* - \theta_0)^2 \le O\Bigl(\ln w_0^{-1} + \sum_{k=1}^{\infty} 2^{-\Delta(k)} \sqrt{\Delta(k)}\Bigr)$ (numeric sketch below)
• Corollaries: "uniformly distributed weights $\Rightarrow$ good bounds"
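A rough numeric sketch (mine; the paper's actual interval construction is more careful than this dyadic-sibling guess) of how $\Delta(k)$ and the correction term could be computed for the finite class built earlier:

import math

def bound_term(weights, theta0, depth=8):
    # K(interval) = shortest description length (-log2 weight) inside it
    def K(lo, hi):
        lens = [-math.log2(w) for th, w in weights.items() if lo <= th < hi]
        return min(lens, default=math.inf)
    total = 0.0
    for k in range(1, depth + 1):
        width = 2.0 ** -k
        j_lo = math.floor(theta0 / width) * width    # dyadic J_k containing theta0
        # sibling interval I_k next to J_k (stay inside [0, 1])
        i_lo = j_lo + width if j_lo + width < 1 else j_lo - width
        delta = max(K(i_lo, i_lo + width) - K(j_lo, j_lo + width), 0)
        if delta < math.inf:                          # skip empty I_k
            total += 2.0 ** -delta * math.sqrt(delta)
    return total

print(bound_term(weights, 5 / 16))  # uses the extended class from earlier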
12
The Universal Case
• $\Theta = \{\text{all computable } \theta \in [0, 1]\}$
• $w_\theta = 2^{-K(\theta)}$, where $K$ denotes the prefix Kolmogorov complexity
• $\sum_k 2^{-\Delta(k)} \sqrt{\Delta(k)} = \infty$, so the Theorem is not applicable
• Conjecture: $\sum_t E(\theta^* - \theta_0)^2 \le O\Bigl(\ln w_0^{-1} + \sum_{k=1}^{\infty} 2^{-\Delta(k)}\Bigr)$
• $\Rightarrow$ a bound of huge constant $\times$ polynomial holds for incompressible $\theta_0$
• Compare to the deterministic case
13
Conclusions
• Cumulative and instantaneous bounds are incompatible
• The main positive result generalizes to arbitrary i.i.d. classes
• Open problem: good bounds for more general classes?
• Thank you!